On Fri, 2023-12-15 at 18:21 +0800, Jiazi Li wrote:
> On Thu, Dec 14, 2023 at 02:13:38PM +0100, Johannes Berg wrote:
> > > >
> > > > Ok that's bad - so you hit the WARN_ON there? How that? We should fix
> > > > that too?
> > > >
> > > Yes, hit this WARN_ON in the test of direct connection between mobile
> > > phones and PC. Here is the log:
> > > [ 2741.982362] -----------[ cut here ]-----------
> > > [ 2741.982446] WARNING: CPU: 6 PID: 2175 at net/wireless/scan.c:1496 cfg80211_update_assoc_bss_entry+0x350/0x378 [cfg80211]
> >
> > Right, so you can reproduce that - can you find a fix for it?
> >
>
> I am responsible for kernel stability and I am not very familiar with wireless code.
> The colleague in charge of the WiFi module also couldn't find the root cause, so we
> used the workaround solution I mentioned earlier to address this issue.

So you're going to have to fix it after all ;-)

Syzbot ran into the same problem [1], but I'm pretty convinced that with
a well-behaved driver, this should be practically impossible unless the
AP is also doing something weird. Your (non-upstream!) driver is likely
messing around with the BSS / scan reporting in weird ways while the AP
is doing CSA.

[1] https://syzkaller.appspot.com/bug?extid=dc6f4dce0d707900cdea

Ultimately it comes down to the cfg80211 code tracking BSSes by the
(BSSID, channel) tuple rather than just BSSID, so if

 - you see a BSS (BSSID-x, 1)
 - you see a BSS (BSSID-x, 11)
 - you connect to e.g. the channel 11 one
 - then it channel switches to channel 1

you run into this issue.

I guess we must handle it _somehow_ if only to prevent attackers
triggering this, but right now I don't have a good idea.

However, in practice, it shouldn't happen since actual APs using the
same BSSID on different channels are extremely rare (if not never
happening) these days, chances are the driver is just reporting the
duplicate BSSes in a weird way.

johannes
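
For illustration, the four-step sequence above can be replayed in a toy
model. This is only a sketch under stated assumptions: it is Python, not
kernel C, and BssEntry, BssTable and their methods are hypothetical names
that do not exist in cfg80211. It shows nothing more than why keying
entries by (BSSID, channel) leaves a second entry sitting at the key the
connected BSS moves to after a channel switch.

```python
# Toy model only. None of these names are cfg80211 structures or functions.
from dataclasses import dataclass


@dataclass
class BssEntry:
    bssid: str
    channel: int
    connected: bool = False


class BssTable:
    """Hypothetical table that tracks BSS entries by (BSSID, channel)."""

    def __init__(self):
        self.entries = {}

    def report_scan_result(self, bssid, channel):
        # The same BSSID on a different channel becomes a separate entry.
        key = (bssid, channel)
        if key not in self.entries:
            self.entries[key] = BssEntry(bssid, channel)
        return self.entries[key]

    def connect(self, bssid, channel):
        entry = self.entries[(bssid, channel)]
        entry.connected = True
        return entry

    def channel_switch(self, entry, new_channel):
        """Re-key the connected entry and return any other entry that
        already occupied the new (BSSID, channel) key, or None."""
        old_key = (entry.bssid, entry.channel)
        new_key = (entry.bssid, new_channel)
        clash = self.entries.get(new_key)
        if clash is entry:
            clash = None
        del self.entries[old_key]
        entry.channel = new_channel
        # Assumption for this sketch: the connected entry simply replaces
        # the clashing one. What the real code should do is the open question.
        self.entries[new_key] = entry
        return clash


table = BssTable()
table.report_scan_result("BSSID-x", 1)    # you see a BSS (BSSID-x, 1)
table.report_scan_result("BSSID-x", 11)   # you see a BSS (BSSID-x, 11)
current = table.connect("BSSID-x", 11)    # you connect to the channel 11 one
clash = table.channel_switch(current, 1)  # then it switches to channel 1
print("duplicate entry at new key:", clash is not None)
```

With a table keyed by BSSID alone, the second scan result would have
updated the first entry instead of creating a new one, and the switch
would find nothing to clash with.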