On 2025/8/12 14:10, Ido Schimmel wrote:
On Tue, Aug 12, 2025 at 05:18:18PM +0800, Wang Liang wrote:
When multicast_query_interval is set to a large value, the local variable
'time' in br_multicast_send_query() may overflow. If the time is smaller
than jiffies, the timer will expire immediately, and then call mod_timer()
again, which creates a loop and may trigger the following soft lockup
issue.
watchdog: BUG: soft lockup - CPU#1 stuck for 221s! [rb_consumer:66]
CPU: 1 UID: 0 PID: 66 Comm: rb_consumer Not tainted 6.16.0+ #259 PREEMPT(none)
Call Trace:
<IRQ>
__netdev_alloc_skb+0x2e/0x3a0
br_ip6_multicast_alloc_query+0x212/0x1b70
__br_multicast_send_query+0x376/0xac0
br_multicast_send_query+0x299/0x510
br_multicast_query_expired.constprop.0+0x16d/0x1b0
call_timer_fn+0x3b/0x2a0
__run_timers+0x619/0x950
run_timer_softirq+0x11c/0x220
handle_softirqs+0x18e/0x560
__irq_exit_rcu+0x158/0x1a0
sysvec_apic_timer_interrupt+0x76/0x90
</IRQ>
This issue can be reproduced with:
ip link add br0 type bridge
echo 1 > /sys/class/net/br0/bridge/multicast_querier
echo 0xffffffffffffffff > /sys/class/net/br0/bridge/multicast_query_interval
ip link set dev br0 up
The multicast_startup_query_interval can also cause this issue. Similar to
the commit 99b40610956a("net: bridge: mcast: add and enforce query interval
^ missing space
minimum"), add check for the query interval maximum to fix this issue.
Link: https://lore.kernel.org/netdev/20250806094941.1285944-1-wangliang74@xxxxxxxxxx/
Fixes: 7e4df51eb35d ("bridge: netlink: add support for igmp's intervals")
Probably doesn't matter in practice given how old both commits are, but
I think you should blame d902eee43f19 ("bridge: Add multicast
count/interval sysfs entries") instead. The commit message also uses the
sysfs path and not the netlink one.
Thanks for your suggestions!
The bug fix tag is really important. I will correct it and send a new
patch later.
Suggested-by: Nikolay Aleksandrov <razor@xxxxxxxxxxxxx>
Signed-off-by: Wang Liang <wangliang74@xxxxxxxxxx>
Code looks fine to me.