Bitterblue Smith <rtl8821cerfe2@xxxxxxxxx> wrote:
> Hi,
>
> A few users with RTL8851BU who did not install wireless-regdb reported this:
>
> kernel: rtw89_8851bu 1-2:1.2: Firmware version 0.29.41.3 (65cefb31), cmd version 0, type 3
> ...
> kernel: rtw89_8851bu 1-2:1.2: rtw89_hw_scan_offload failed ret -110
> kernel: rtw89_8851bu 1-2:1.2: c2h reg timeout
> kernel: rtw89_8851bu 1-2:1.2: FW does not process h2c registers
> kernel: rtw89_8851bu 1-2:1.2: HW scan failed: -110
>
> The AP can't be pinged anymore, but the driver is still receiving beacons.
>
> It's the same with RTL8832BU.
>
> It can also be reproduced with RTL8852BE (PCI). The output is different:
>
> [ 628.015012] rtw89_8852be_git 0000:02:00.0: Firmware version 0.29.29.8 (39dbf50f), cmd version 0, type
> 3
> ...
> [ 698.619819] rtw89_8852be_git 0000:02:00.0: FW status = 0x67001220
> [ 698.619830] rtw89_8852be_git 0000:02:00.0: FW BADADDR = 0x0
> [ 698.619835] rtw89_8852be_git 0000:02:00.0: FW EPC/RA = 0xb89bacd3
> [ 698.619841] rtw89_8852be_git 0000:02:00.0: FW MISC = 0xb8900635
> [ 698.619845] rtw89_8852be_git 0000:02:00.0: R_AX_HALT_C2H = 0x30000008
> [ 698.619850] rtw89_8852be_git 0000:02:00.0: R_AX_SER_DBG_INFO = 0x0
> [ 698.619858] rtw89_8852be_git 0000:02:00.0: [ERR]fw PC = 0xb8987df9
> [ 698.619873] rtw89_8852be_git 0000:02:00.0: [ERR]fw PC = 0xb89784e7
> [ 698.619888] rtw89_8852be_git 0000:02:00.0: [ERR]fw PC = 0xb8935ea7
> [ 698.619903] rtw89_8852be_git 0000:02:00.0: [ERR]fw PC = 0xb8920565
> [ 698.619917] rtw89_8852be_git 0000:02:00.0: [ERR]fw PC = 0xb8935e9f
> [ 698.619932] rtw89_8852be_git 0000:02:00.0: [ERR]fw PC = 0xb8935ed1
> [ 698.619947] rtw89_8852be_git 0000:02:00.0: [ERR]fw PC = 0xb890cd1f
> [ 698.619961] rtw89_8852be_git 0000:02:00.0: [ERR]fw PC = 0xb893035c
> [ 698.619976] rtw89_8852be_git 0000:02:00.0: [ERR]fw PC = 0xb8934333
> [ 698.619990] rtw89_8852be_git 0000:02:00.0: [ERR]fw PC = 0xb8934345
> [ 698.620005] rtw89_8852be_git 0000:02:00.0: [ERR]fw PC = 0xb8935f1f
> [ 698.620019] rtw89_8852be_git 0000:02:00.0: [ERR]fw PC = 0xb89bac9d
> [ 698.620034] rtw89_8852be_git 0000:02:00.0: [ERR]fw PC = 0xb89afeb3
> [ 698.620048] rtw89_8852be_git 0000:02:00.0: [ERR]fw PC = 0xb8935e1b
> [ 698.620063] rtw89_8852be_git 0000:02:00.0: [ERR]fw PC = 0xb89bac2b
> [ 698.620076] rtw89_8852be_git 0000:02:00.0: SER catches error: 0x4000
> [ 698.620139] rtw89_8852be_git 0000:02:00.0: FW status = 0x68001220
> [ 698.620144] rtw89_8852be_git 0000:02:00.0: FW BADADDR = 0x0
> [ 698.620150] rtw89_8852be_git 0000:02:00.0: FW EPC/RA = 0xb89bacd3
> [ 698.620156] rtw89_8852be_git 0000:02:00.0: FW MISC = 0xb8900635
> [ 698.620161] rtw89_8852be_git 0000:02:00.0: R_AX_HALT_C2H = 0x30000008
> [ 698.620167] rtw89_8852be_git 0000:02:00.0: R_AX_SER_DBG_INFO = 0x0
> [ 698.620177] rtw89_8852be_git 0000:02:00.0: [ERR]fw PC = 0xb893037c
> [ 698.620214] rtw89_8852be_git 0000:02:00.0: [ERR]fw PC = 0xb8935df9
> [ 698.620231] rtw89_8852be_git 0000:02:00.0: [ERR]fw PC = 0xb8934345
> [ 698.620246] rtw89_8852be_git 0000:02:00.0: [ERR]fw PC = 0xb893435b
> [ 698.620261] rtw89_8852be_git 0000:02:00.0: [ERR]fw PC = 0xb893435b
> [ 698.620276] rtw89_8852be_git 0000:02:00.0: [ERR]fw PC = 0xb893435b
> [ 698.620291] rtw89_8852be_git 0000:02:00.0: [ERR]fw PC = 0xb8934379
> [ 698.620306] rtw89_8852be_git 0000:02:00.0: [ERR]fw PC = 0xb8934377
> [ 698.620321] rtw89_8852be_git 0000:02:00.0: [ERR]fw PC = 0xb893437f
> [ 698.620336] rtw89_8852be_git 0000:02:00.0: [ERR]fw PC = 0xb8934315
> [ 698.620351] rtw89_8852be_git 0000:02:00.0: [ERR]fw PC = 0xb89333b5
> [ 698.620366] rtw89_8852be_git 0000:02:00.0: [ERR]fw PC = 0xb893432d
> [ 698.620381] rtw89_8852be_git 0000:02:00.0: [ERR]fw PC = 0xb893432f
> [ 698.620396] rtw89_8852be_git 0000:02:00.0: [ERR]fw PC = 0xb893432f
> [ 698.620412] rtw89_8852be_git 0000:02:00.0: [ERR]fw PC = 0xb8934349
> [ 698.620424] rtw89_8852be_git 0000:02:00.0: SER catches error: 0x4000
> [ 698.621158] rtw89_8852be_git 0000:02:00.0: FW backtrace invalid size: 0x0
> [ 698.625076] ieee80211 phy1: Hardware restart was requested
> [ 698.625098] ------------[ cut here ]------------
> [ 698.625100] ieee80211_restart_work called with hardware scan in progress
> [ 698.625172] WARNING: CPU: 2 PID: 61 at net/mac80211/main.c:354 ieee80211_restart_work+0x13d/0x150
> [mac80211]
> [ 698.625313] Modules linked in: rtw89_8852be_git(OE) rtw89_8852b_git(OE) rtw89_8852b_common_git(OE)
> rtw89_pci_git(OE) rtw89_core_git(OE) cmac ccm vfat edac_mce_amd fat kvm_amd ccp kvm mac80211 irqbypass
> crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic gf128mul eeepc_wmi snd_hda_codec_hdmi
> libarc4 ghash_clmulni_intel asus_wmi sha512_ssse3 ledtrig_audio snd_hda_intel sha256_ssse3 cfg80211
> sparse_keymap snd_intel_dspcfg sha1_ssse3 platform_profile snd_intel_sdw_acpi aesni_intel i8042 serio
> snd_hda_codec crypto_simd rfkill acpi_cpufreq pcspkr wmi_bmof snd_hda_core cryptd k10temp sp5100_tco
> i2c_piix4 snd_hwdep snd_pcm snd_timer joydev snd soundcore mousedev mac_hid it87 hwmon_vid sg crypto_user
> dm_mod fuse loop nfnetlink bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 usbhid
> radeon drm_ttm_helper ttm video i2c_algo_bit crc32c_intel drm_suballoc_helper drm_display_helper xhci_pci
> cec xhci_pci_renesas wmi
> [ 698.625424] Unloaded tainted modules: rtw89_core_git(OE):1 rtw89_pci_git(OE):1
> rtw89_8852b_common_git(OE):1 rtw89_8852b_git(OE):1 rtw89_8852be_git(OE):1 [last unloaded:
> rtw89_core_git(OE)]
> [ 698.625439] CPU: 2 PID: 61 Comm: kworker/2:1 Tainted: G OE 6.6.88-1-lts66 #1
> 29602267a9340ebc551d246a9d0d242da9be9d82
> [ 698.625445] Hardware name: System manufacturer System Product Name/F2A85-M, BIOS 6508 07/11/2014
> [ 698.625448] Workqueue: events_freezable ieee80211_restart_work [mac80211]
> [ 698.625551] RIP: 0010:ieee80211_restart_work+0x13d/0x150 [mac80211]
> [ 698.625656] Code: bd f0 e9 ff ff e8 73 b6 da ff 5b 5d 41 5c 41 5d 41 5e e9 76 17 ff c1 48 c7 c6 50 ea
> cc c0 48 c7 c7 a8 d0 cd c0 e8 53 3d 51 c1 <0f> 0b e9 03 ff ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 90 90
> 90 90
> [ 698.625660] RSP: 0018:ffffc900002b7e40 EFLAGS: 00010282
> [ 698.625664] RAX: 0000000000000000 RBX: ffff88810129a900 RCX: 0000000000000027
> [ 698.625667] RDX: ffff888227521748 RSI: 0000000000000001 RDI: ffff888227521740
> [ 698.625669] RBP: ffff888192781f50 R08: 0000000000000000 R09: ffffc900002b7cb0
> [ 698.625672] R10: ffffffff840b26c8 R11: 0000000000000003 R12: ffff88810007ac00
> [ 698.625674] R13: ffff888192780900 R14: ffff88810007ac05 R15: 0000000000000000
> [ 698.625677] FS: 0000000000000000(0000) GS:ffff888227500000(0000) knlGS:0000000000000000
> [ 698.625680] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 698.625683] CR2: 00005eb01aafb000 CR3: 0000000108b16000 CR4: 00000000000406e0
> [ 698.625686] Call Trace:
> [ 698.625690] <TASK>
> [ 698.625695] process_one_work+0x190/0x3a0
> [ 698.625705] worker_thread+0x318/0x460
> [ 698.625711] ? __pfx_worker_thread+0x10/0x10
> [ 698.625716] kthread+0xe8/0x120
> [ 698.625719] ? __pfx_kthread+0x10/0x10
> [ 698.625723] ret_from_fork+0x34/0x50
> [ 698.625729] ? __pfx_kthread+0x10/0x10
> [ 698.625732] ret_from_fork_asm+0x1b/0x30
> [ 698.625739] </TASK>
> [ 698.625740] ---[ end trace 0000000000000000 ]---
> [ 698.625750] rtw89_8852be_git 0000:02:00.0: rtw89_hw_scan_offload failed ret 1
>
> I guess that's a firmware crash.
>
> Interestingly, this test firmware with version 1.29.29.9 works fine for
> RTL8852BE and RTL8832BU:
>
> https://lore.kernel.org/linux-wireless/42783d9a032143bfb67ea969ee0b805d@xxxxxxxxxxx/
>
> As far as I can tell, this problem happens when the hardware scan channel
> list is too long to fit in a single H2C message (see
> "if (list_len == RTW89_SCAN_LIST_LIMIT_AX)" in
> rtw89_hw_scan_add_chan_list_ax()) and the last channel in the first H2C
> message happens to be the operating channel (RTW89_CHAN_OPERATE).
>
> To reproduce this condition in a reliable way, pretend that every channel is
> a DFS channel:
>
> diff --git a/fw.c b/fw.c
> index 48575e4..bf5df41 100644
> --- a/fw.c
> +++ b/fw.c
> @@ -7121,6 +7121,7 @@ int rtw89_hw_scan_prep_chan_list_ax(struct rtw89_dev *rtwdev,
>  			type = RTW89_CHAN_DFS;
>  		else
>  			type = RTW89_CHAN_ACTIVE;
> +		type = RTW89_CHAN_DFS;
>  		rtw89_hw_scan_add_chan_ax(rtwdev, type, req->n_ssids, ch_info);
>
>  		if (scan_info->connected &&
>
> Then the driver will add RTW89_CHAN_OPERATE before every single channel and
> the list of channels will be longer than RTW89_SCAN_LIST_LIMIT_AX.
>
> One workaround is to not let the operating channel be the last one in the
> split list:
>
> diff --git a/fw.c b/fw.c
> index 27d84464347b..ef036e1585f3 100644
> --- a/fw.c
> +++ b/fw.c
> @@ -7381,6 +7381,13 @@ int rtw89_hw_scan_add_chan_list_ax(struct rtw89_dev *rtwdev,
>  	INIT_LIST_HEAD(&list);
>
>  	list_for_each_entry_safe(ch_info, tmp, &scan_info->chan_list, list) {
> +		/* The operating channel (tx_null == true) should
> +		 * not be last in the list, to avoid breaking
> +		 * RTL8851BU and RTL8832BU.
> +		 */
> +		if (list_len + 1 == RTW89_SCAN_LIST_LIMIT_AX && ch_info->tx_null)
> +			break;
> +
>  		list_move_tail(&ch_info->list, &list);
>
>  		list_len++;
>
> Another way is to keep tx_null false, then it doesn't matter how the list of
> channels is split:
>
> diff --git a/fw.c b/fw.c
> index 48575e4..420a665 100644
> --- a/fw.c
> +++ b/fw.c
> @@ -6906,7 +6906,7 @@ static void rtw89_hw_scan_add_chan_ax(struct rtw89_dev *rtwdev, int chan_type,
>  		ch_info->pri_ch = op->primary_channel;
>  		ch_info->ch_band = op->band_type;
>  		ch_info->bw = op->band_width;
> -		ch_info->tx_null = true;
> +		// ch_info->tx_null = true;
>  		ch_info->num_pkt = 0;
>  		break;
>  	case RTW89_CHAN_DFS:
>
> Hopefully this can help you find the problem in the firmware.

Thanks for the details. That's helpful for us to dig into the problem.
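
For anyone following along, the split condition described in the report can be modeled outside the driver. What follows is only a rough sketch under stated assumptions, not driver code: LIST_LIMIT, struct chan, and all helper names are hypothetical stand-ins, and the limit value is not the real RTW89_SCAN_LIST_LIMIT_AX. It splits a channel list into chunks of at most LIST_LIMIT entries, counts the chunks that are followed by more entries and end on an operating-channel entry (tx_null == true), and optionally applies the same early break as the first workaround.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for RTW89_SCAN_LIST_LIMIT_AX; not the driver's value. */
#define LIST_LIMIT 5

/* Simplified channel entry: only the flag that matters for the split. */
struct chan {
	int id;
	bool tx_null;	/* true for an operating-channel entry */
};

/* Model the test case from the report: an operating-channel entry is
 * placed before every scanned channel (even index = operating channel).
 */
static void fill_alternating(struct chan *list, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		list[i].id = (int)i;
		list[i].tx_null = (i % 2 == 0);
	}
}

/* Return how many entries, starting at pos, go into the next message.
 * With avoid_op_last set, stop one entry early when the entry that would
 * take the last slot is an operating channel (the first workaround).
 */
static size_t next_chunk(const struct chan *list, size_t n, size_t pos,
			 bool avoid_op_last)
{
	size_t len = 0;

	while (pos + len < n) {
		if (avoid_op_last && len + 1 == LIST_LIMIT &&
		    list[pos + len].tx_null)
			break;
		len++;
		if (len == LIST_LIMIT)
			break;
	}
	return len;
}

/* Count chunks that are followed by more entries and whose last entry
 * is an operating channel, i.e. the suspected trigger condition.
 */
static int count_bad_splits(const struct chan *list, size_t n,
			    bool avoid_op_last)
{
	size_t pos = 0;
	int bad = 0;

	while (pos < n) {
		size_t len = next_chunk(list, n, pos, avoid_op_last);

		assert(len > 0);	/* holds as long as LIST_LIMIT > 1 */
		if (pos + len < n && list[pos + len - 1].tx_null)
			bad++;
		pos += len;
	}
	return bad;
}
```

With 12 alternating entries and a limit of 5, the plain split puts an operating-channel entry in the last slot of the first chunk, while the early break moves it to the start of the next chunk. Whether this matches what the firmware actually trips over is an assumption based on the report.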