On Mon, Jul 21, 2025 at 11:32:44AM GMT, Johan Hovold wrote: > On Mon, Jul 14, 2025 at 11:31:04PM +0530, Manivannan Sadhasivam wrote: > > Commit 9f4f3dfad8cf ("PCI: qcom: Enable ASPM for platforms supporting 1.9.0 > > ops") allowed the Qcom controller driver to enable ASPM for all PCI devices > > enumerated at the time of the controller driver probe. It proved to be > > useful for devices already powered on by the bootloader as it allowed > > devices to enter ASPM without user intervention. > > > > However, it could not enable ASPM for the hotplug capable devices i.e., > > devices enumerated *after* the controller driver probe. This limitation > > mostly went unnoticed as the Qcom PCI controllers are not hotplug capable > > and also the bootloader has been enabling the PCI devices before Linux > > Kernel boots (mostly on the Qcom compute platforms which users use on a > > daily basis). > > > > But with the advent of the commit b458ff7e8176 ("PCI/pwrctl: Ensure that > > pwrctl drivers are probed before PCI client drivers"), the pwrctrl driver > > started to block the PCI device enumeration until it had been probed. > > Though, the intention of the commit was to avoid race between the pwrctrl > > driver and PCI client driver, it also meant that the pwrctrl controlled PCI > > devices may get probed after the controller driver and will no longer have > > ASPM enabled. So users started noticing high runtime power consumption with > > WLAN chipsets on Qcom compute platforms like Thinkpad X13s, and Thinkpad > > T14s, etc... > > Note the ASPM regression for ath11k/ath12k only happened in 6.15, so > commit b458ff7e8176 ("PCI/pwrctl: Ensure that pwrctl drivers are probed > before PCI client drivers") in 6.13 does not seem to be the immediate > culprit here. > This series was intented to fix the ASPM issue which exist even before the introduction of pwrctrl framework. But I also agree that the below commits made the issue more visible and caused regression on platforms where WLAN used to work. > Candidates from 6.15 include commits like > > 957f40d039a9 ("PCI/pwrctrl: Move creation of pwrctrl devices to pci_scan_device()") > 2489eeb777af ("PCI/pwrctrl: Skip scanning for the device further if pwrctrl device is created") > > This is probably related to the reports of these drivers sometimes > failing to probe with > > ath12k_pci 0004:01:00.0: of_irq_parse_pci: failed with rc=134 > > after pwrctrl was merged, and which since 6.15 should instead result in > the drivers not probing at all (as we've discussed off list). > We discussed about the ASPM issue IIRC. The above mentioned of_irq_parse_pci could also be related to the fact that we are turning off the supplies after pci_dev destruction. For this issue, I guess the patch from Brian could be the fix: https://lore.kernel.org/linux-pci/20250711174332.1.I623f788178c1e4c5b1a41dbfc8c7fa55966373c0@changeid/ > > Obviously, it is the pwrctrl change that caused regression, but it > > ultimately uncovered a flaw in the ASPM enablement logic of the controller > > driver. So to address the actual issue, switch to the bus notifier for > > enabling ASPM of the PCI devices. The notifier will notify the controller > > driver when a PCI device is attached to the bus, thereby allowing it to > > enable ASPM more reliably. It should be noted that the > > 'pci_dev::link_state', which is required for enabling ASPM by the > > pci_enable_link_state_locked() API, is only set by the time of > > BUS_NOTIFY_BIND_DRIVER stage of the notification. So we cannot enable ASPM > > during BUS_NOTIFY_ADD_DEVICE stage. > > > > So with this, we can also get rid of the controller driver specific > > 'qcom_pcie_ops::host_post_init' callback. > > > > Cc: stable@xxxxxxxxxxxxxxx # v6.7 > > Fixes: 9f4f3dfad8cf ("PCI: qcom: Enable ASPM for platforms supporting 1.9.0 ops") > > So whatever form this fix ends up taking it only needs to be backported > to 6.15. > > As you mention above these platforms do not support hotplug, but even if > they were, enabling ASPM for hotplugged devices is arguably more of a > new features than a bug fix. > FYI, I'm going to drop this series in favor this (with one yet-to-be-submitted patch on top): https://lore.kernel.org/linux-pci/20250720190140.2639200-1-david.e.box@xxxxxxxxxxxxxxx/ - Mani -- மணிவண்ணன் சதாசிவம்