On Fri, Jul 25, 2025 at 5:25 AM Baochen Qiang <baochen.qiang@xxxxxxxxxxxxxxxx> wrote: > > > > On 6/12/2025 12:40 AM, balsam.chihi@xxxxxxxxxxx wrote: > > From: Balsam CHIHI <balsam.chihi@xxxxxxxxxxx> > > Hello, sorry for the delay. I was OoO. > > During system reboot or module removal (rmmod), a crash was observed > > due to a synchronous external abort caused by invalid register access > > in ath11k_pci_clear_dbg_registers(). This happens when the device is > > already powered off and the driver attempts to read debug registers. > > What does 'powered off' mean exactly here? is WLAN powered by an internal/external supply > that will be removed during rmmod/reboot? The "powered off" means that the wlan card is rmmod or reboot. It is not powered by external supply. > > > > > The crash log shows: > > root@OpenWrt:~# reboot > > [ 343.663492] Internal error: synchronous external abort: > > 0000000096000210 [#1] SMP > > [ 343.670992] Modules linked in: [...] > > [ 343.842432] CPU: 7 PID: 9435 Comm: procd Tainted: G O 6.6.86 #0 > > [ 343.849746] Hardware name: LS1088A > > [ 343.856969] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) > > [ 343.863933] pc : ath11k_pci_get_msi_irq+0x18a0/0x1900 [ath11k_pci] > > I don't see any relationship between ath11k_pci_clear_dbg_registers() and this PC pointer. > So how did you locate the problematic register access then? I tyied to look for the problem based the PC pointer but I was not abled to find it. So, I added prints in ath11k driver to find out where it stucks, classical debug method. Then I found out that any "ath11k_pcic_read32()" call fails on rmmod ath11k_pci or reboot, but they work fine when insmod/boot ath11k_pci. > > > [ 343.870125] lr : ath11k_pcic_init_msi_config+0x98/0xc4 [ath11k] > > [...] > > [ 343.950800] Call trace: > > [ 343.953241] ath11k_pci_get_msi_irq+0x18a0/0x1900 [ath11k_pci] > > [ 343.959080] ath11k_pcic_init_msi_config+0x98/0xc4 [ath11k] > > [ 343.964671] ath11k_pcic_read32+0x30/0xb4 [ath11k] > > [ 343.969477] ath11k_pci_get_msi_irq+0x528/0x1900 [ath11k_pci] > > [ 343.975230] ath11k_pci_get_msi_irq+0x1460/0x1900 [ath11k_pci] > > [ 343.981068] ath11k_pci_get_msi_irq+0x1750/0x1900 [ath11k_pci] > > [ 343.986906] pci_device_shutdown+0x34/0x44 > > [ 343.991004] device_shutdown+0x160/0x268 > > [ 343.994928] kernel_restart+0x40/0xc0 > > [ 343.998594] __do_sys_reboot+0x104/0x23c > > [ 344.002518] __arm64_sys_reboot+0x24/0x30 > > [ 344.006529] do_el0_svc+0x6c/0x100 > > [ 344.009931] el0_svc+0x28/0x9c > > [ 344.012986] el0t_64_sync_handler+0x120/0x12c > > [ 344.017344] el0t_64_sync+0x178/0x17c > > [ 344.021009] Code: f94e0a80 92404a73 91420273 8b130013 (b9400273) > > [ 344.027102] ---[ end trace 0000000000000000 ]--- > > > > This patch adds a `power_on` flag to conditionally skip debug register > > reads when the device is not powered, preventing invalid memory access. > > why is only 'read' concerned here? Good question, I can't explain why only the reading fails but not the writing. > > You are skipping several 'read' in this change, could you help narrow down which > individual read instance causes this? > All read calls fail on rmmod/reboot, when ath11k_pci_clear_dbg_registers() is called. But they all succeed when insmod/boot ath11k_pci. > > > > With this change, the system no longer crashes on reboot or rmmod, > > and the driver continues to function correctly when reloaded. > > > > Tested-on: QCN9074 hw2.0 PCIe on LS1088A > > Tested-on: OpenWrt 24.10.1 > > Fixes: crash on reboot/rmmod due to invalid register access > > Signed-off-by: Balsam CHIHI <balsam.chihi@xxxxxxxxxxx> > > --- > > drivers/net/wireless/ath/ath11k/pci.c | 36 ++++++++++++++++----------- > > 1 file changed, 21 insertions(+), 15 deletions(-) > > > > diff --git a/drivers/net/wireless/ath/ath11k/pci.c b/drivers/net/wireless/ath/ath11k/pci.c > > index 78444f8ea153..0c1b55957557 100644 > > --- a/drivers/net/wireless/ath/ath11k/pci.c > > +++ b/drivers/net/wireless/ath/ath11k/pci.c > > @@ -203,16 +203,18 @@ static void ath11k_pci_soc_global_reset(struct ath11k_base *ab) > > ath11k_warn(ab, "link down error during global reset\n"); > > } > > > > -static void ath11k_pci_clear_dbg_registers(struct ath11k_base *ab) > > +static void ath11k_pci_clear_dbg_registers(struct ath11k_base *ab, bool power_on) > > { > > - u32 val; > > + if (power_on) { > > + u32 val; > > > > - /* read cookie */ > > - val = ath11k_pcic_read32(ab, PCIE_Q6_COOKIE_ADDR); > > - ath11k_dbg(ab, ATH11K_DBG_PCI, "pcie_q6_cookie_addr 0x%x\n", val); > > + /* read cookie */ > > + val = ath11k_pcic_read32(ab, PCIE_Q6_COOKIE_ADDR); > > + ath11k_dbg(ab, ATH11K_DBG_PCI, "pcie_q6_cookie_addr 0x%x\n", val); > > > > - val = ath11k_pcic_read32(ab, WLAON_WARM_SW_ENTRY); > > - ath11k_dbg(ab, ATH11K_DBG_PCI, "wlaon_warm_sw_entry 0x%x\n", val); > > + val = ath11k_pcic_read32(ab, WLAON_WARM_SW_ENTRY); > > + ath11k_dbg(ab, ATH11K_DBG_PCI, "wlaon_warm_sw_entry 0x%x\n", val); > > + } > > > > /* TODO: exact time to sleep is uncertain */ > > mdelay(10); > > @@ -223,14 +225,18 @@ static void ath11k_pci_clear_dbg_registers(struct ath11k_base *ab) > > ath11k_pcic_write32(ab, WLAON_WARM_SW_ENTRY, 0); > > mdelay(10); > > > > - val = ath11k_pcic_read32(ab, WLAON_WARM_SW_ENTRY); > > - ath11k_dbg(ab, ATH11K_DBG_PCI, "wlaon_warm_sw_entry 0x%x\n", val); > > + if (power_on) { > > + u32 val; > > > > - /* A read clear register. clear the register to prevent > > - * Q6 from entering wrong code path. > > - */ > > - val = ath11k_pcic_read32(ab, WLAON_SOC_RESET_CAUSE_REG); > > - ath11k_dbg(ab, ATH11K_DBG_PCI, "soc reset cause %d\n", val); > > + val = ath11k_pcic_read32(ab, WLAON_WARM_SW_ENTRY); > > + ath11k_dbg(ab, ATH11K_DBG_PCI, "wlaon_warm_sw_entry 0x%x\n", val); > > + > > + /* A read clear register. clear the register to prevent > > + * Q6 from entering wrong code path. > > + */ > > + val = ath11k_pcic_read32(ab, WLAON_SOC_RESET_CAUSE_REG); > > + ath11k_dbg(ab, ATH11K_DBG_PCI, "soc reset cause %d\n", val); > > + } > > } > > > > static int ath11k_pci_set_link_reg(struct ath11k_base *ab, > > @@ -368,7 +374,7 @@ static void ath11k_pci_sw_reset(struct ath11k_base *ab, bool power_on) > > } > > > > ath11k_mhi_clear_vector(ab); > > - ath11k_pci_clear_dbg_registers(ab); > > + ath11k_pci_clear_dbg_registers(ab, power_on); > > ath11k_pci_soc_global_reset(ab); > > ath11k_mhi_set_mhictrl_reset(ab); > > } >