On Fri, Jul 18, 2025 at 10:05:02AM GMT, Baochen Qiang wrote: > > > On 7/17/2025 7:29 PM, Manivannan Sadhasivam wrote: > > On Thu, Jul 17, 2025 at 06:46:12PM GMT, Baochen Qiang wrote: > >> > >> > >> On 7/17/2025 6:31 PM, Manivannan Sadhasivam wrote: > >>> On Thu, Jul 17, 2025 at 05:24:13PM GMT, Baochen Qiang wrote: > >>> > >>> [...] > >>> > >>>>> @@ -16,6 +16,8 @@ > >>>>> #include "mhi.h" > >>>>> #include "debug.h" > >>>>> > >>>>> +#include "../ath.h" > >>>>> + > >>>>> #define ATH12K_PCI_BAR_NUM 0 > >>>>> #define ATH12K_PCI_DMA_MASK 36 > >>>>> > >>>>> @@ -928,8 +930,7 @@ static void ath12k_pci_aspm_disable(struct ath12k_pci *ab_pci) > >>>>> u16_get_bits(ab_pci->link_ctl, PCI_EXP_LNKCTL_ASPM_L1)); > >>>>> > >>>>> /* disable L0s and L1 */ > >>>>> - pcie_capability_clear_word(ab_pci->pdev, PCI_EXP_LNKCTL, > >>>>> - PCI_EXP_LNKCTL_ASPMC); > >>>>> + pci_disable_link_state(ab_pci->pdev, PCIE_LINK_STATE_L0S | PCIE_LINK_STATE_L1); > >>>> > >>>> Not always, but sometimes seems the 'disable' does not work: > >>>> > >>>> [ 279.920507] ath12k_pci_power_up 1475: link_ctl 0x43 //before disable > >>>> [ 279.920539] ath12k_pci_power_up 1482: link_ctl 0x43 //after disable > >>>> > >>>> > >>>>> > >>>>> set_bit(ATH12K_PCI_ASPM_RESTORE, &ab_pci->flags); > >>>>> } > >>>>> @@ -958,10 +959,7 @@ static void ath12k_pci_aspm_restore(struct ath12k_pci *ab_pci) > >>>>> { > >>>>> if (ab_pci->ab->hw_params->supports_aspm && > >>>>> test_and_clear_bit(ATH12K_PCI_ASPM_RESTORE, &ab_pci->flags)) > >>>>> - pcie_capability_clear_and_set_word(ab_pci->pdev, PCI_EXP_LNKCTL, > >>>>> - PCI_EXP_LNKCTL_ASPMC, > >>>>> - ab_pci->link_ctl & > >>>>> - PCI_EXP_LNKCTL_ASPMC); > >>>>> + pci_enable_link_state(ab_pci->pdev, ath_pci_aspm_state(ab_pci->link_ctl)); > >>>> > >>>> always, the 'enable' is not working: > >>>> > >>>> [ 280.561762] ath12k_pci_start 1180: link_ctl 0x43 //before restore > >>>> [ 280.561809] ath12k_pci_start 1185: link_ctl 0x42 //after restore > >>>> > >>> > >>> Interesting! I applied your diff and I never see this issue so far (across 10+ > >>> reboots): > >> > >> I was not testing reboot. Here is what I am doing: > >> > >> step1: rmmod ath12k > >> step2: force LinkCtrl using setpci (make sure it is 0x43, which seems more likely to see > >> the issue) > >> > >> sudo setpci -s 02:00.0 0x80.B=0x43 > >> > >> step3: insmod ath12k and check linkctrl > >> > > > > So I did the same and got: > > > > [ 3283.363569] ath12k_pci_power_up 1475: link_ctl 0x43 > > [ 3283.363769] ath12k_pci_power_up 1480: link_ctl 0x40 > > [ 3284.007661] ath12k_pci_start 1180: link_ctl 0x40 > > [ 3284.007826] ath12k_pci_start 1185: link_ctl 0x42 > > > > My host machine is Qcom based Thinkpad T14s and it doesn't support L0s. So > > that's why the lnkctl value once enabled becomes 0x42. This is exactly the > > reason why the drivers should not muck around LNKCTL register manually. > > Thanks, then the 0x43 -> 0x40 -> 0x40 -> 0x42 sequence should not be a concern. But still > the random 0x43 -> 0x43 -> 0x43 -> 0x42 sequence seems problematic. > > How many iterations have you done with above steps? From my side it seems random so better > to do some stress test. > So I ran the modprobe for about 50 times on the Intel NUC that has QCA6390, but didn't spot the disparity. This is the script I used: for i in {1..50} ;do echo "Loop $i"; sudo setpci -s 01:00.0 0x80.B=0x43;\ sudo modprobe -r ath11k_pci; sleep 1; sudo modprobe ath11k_pci; sleep 1;done And I always got: [ 5862.388083] ath11k_pci_aspm_disable: 609 lnkctrl: 0x43 [ 5862.388124] ath11k_pci_aspm_disable: 614 lnkctrl: 0x40 [ 5862.876291] ath11k_pci_start: 880 lnkctrl: 0x40 [ 5862.876346] ath11k_pci_start: 886 lnkctrl: 0x42 Also no AER messages. TBH, I'm not sure how you were able to see the random issues with these APIs. That looks like a race, which is scary. I do not want to ignore your scenario, but would like to reproduce and get to the bottom of it. - Mani -- மணிவண்ணன் சதாசிவம்