On Wed, Aug 27, 2025 at 10:43:26PM +0530, Manivannan Sadhasivam wrote: > On Wed, Aug 27, 2025 at 10:47:39AM GMT, Bjorn Helgaas wrote: > > On Mon, Aug 25, 2025 at 01:52:57PM +0530, Manivannan Sadhasivam via B4 Relay wrote: > > > From: Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxxxxxxxx> > > > > > > Commit 9f4f3dfad8cf ("PCI: qcom: Enable ASPM for platforms > > > supporting 1.9.0 ops") allowed the Qcom controller driver to > > > enable ASPM for all PCI devices enumerated at the time of the > > > controller driver probe. It proved to be useful for devices > > > already powered on by the bootloader, as it allowed devices to > > > enter ASPM without user intervention. > > > > > > However, it could not enable ASPM for the hotplug capable > > > devices i.e., devices enumerated *after* the controller driver > > > probe. This limitation mostly went unnoticed as the Qcom PCI > > > controllers are not hotplug capable and also the bootloader has > > > been enabling the PCI devices before Linux Kernel boots (mostly > > > on the Qcom compute platforms which users use on a daily basis). > > > > > > But with the advent of the commit b458ff7e8176 ("PCI/pwrctl: > > > Ensure that pwrctl drivers are probed before PCI client > > > drivers"), the pwrctrl driver started to block the PCI device > > > enumeration until it had been probed. Though, the intention of > > > the commit was to avoid race between the pwrctrl driver and PCI > > > client driver, it also meant that the pwrctrl controlled PCI > > > devices may get probed after the controller driver and will no > > > longer have ASPM enabled. So users started noticing high runtime > > > power consumption with WLAN chipsets on Qcom compute platforms > > > like Thinkpad X13s, and Thinkpad T14s, etc... > > > > > > Obviously, it is the pwrctrl change that caused regression, but > > > it ultimately uncovered a flaw in the ASPM enablement logic of > > > the controller driver. So to address the actual issue, use the > > > newly introduced pci_host_set_default_pcie_link_state() API, > > > which allows the controller drivers to set the default ASPM > > > state for *all* devices. This default state will be honored by > > > the ASPM driver while enabling ASPM for all the devices. > > > > So I guess this fixes a power consumption regression that became > > visible with b458ff7e8176 ("PCI/pwrctl: Ensure that pwrctl drivers > > are probed before PCI client drivers"). Arguably we should have a > > Fixes: tag for that, and maybe even skip the tag for 9f4f3dfad8cf, > > since if you have 9f4f3dfad8cf but not b458ff7e8176, it doesn't > > sound like you would need to backport this change? > > 9f4f3dfad8cf is the culprit which added a half baked solution for > enabling ASPM and b458ff7e8176 made the issue more obvious since > instead of requiring people to connect a device after boot, it > allowed the device to enumerate after the probe of the controller > driver, thereby making it more visible. > > It would make sense to add a Fixes tag for b458ff7e8176 too, but > 9f4f3dfad8cf should also be present IMO. OK. IIUC we would only land on 9f4f3dfad8cf for hot-plugged devices, and you said these controllers don't support hotplug. Backporting something to fix a half-baked solution that doesn't cause an actual problem is of marginal value, but we can keep the Fixes: 9f4f3dfad8cf. > > > Also with this change, ASPM is now enabled by default on all > > > Qcom platforms as I haven't heard of any reported issues (apart > > > from the unsupported L0s on some platorms, which is still > > > handled separately). > > > > If ASPM hasn't been enabled by default, how would you hear about > > issues? Is ASPM commonly enabled in some other way? > > Mostly from the downstream drivers. They do enable ASPM by default > and I haven't heard of any issues with that. So I assumed that would > mean it will be safe for us to enable ASPM for all platforms in > upstream as well. > > > If you think the risk of ASPM issues is nil, I guess it's OK to do > > at the same time. From a maintenance perspective it might be nice > > to separate that change so if there happened to be a regression, > > we could identify and revert that part by itself if necessary. It > > looks like previously, ASPM was only enabled for one part: > > > > ops_2_7_0 # cfg_2_7_0 qcom,pcie-sdm845 > > No. Previously ASPM was enabled for ops_1_9_0 and ops_1_21_0 based > platforms. Oops, my mistake. Traversing all the ops_ and cfg_ structs was a little confusing. For posterity, I guess the corrected matrix is that ASPM was previously enabled for these: ops_1_9_0 # cfg_1_9_0 qcom,pcie-sc7280 qcom,pcie-sc8180x qcom,pcie-sdx55 qcom,pcie-sm8150 qcom,pcie-sm8250 qcom,pcie-sm8350 qcom,pcie-sm8450-pcie0 qcom,pcie-sm8450-pcie1 qcom,pcie-sm8550 # cfg_1_34_0 qcom,pcie-sa8775p ops_1_21_0 # cfg_sc8280xp qcom,pcie-sa8540p qcom,pcie-sc8280xp qcom,pcie-x1e80100 And will now be enabled for these: ops_2_1_0 # cfg_2_1_0 qcom,pcie-apq8064 qcom,pcie-ipq8064 qcom,pcie-ipq8064-v2 ops_1_0_0 # cfg_1_0_0 qcom,pcie-apq8084 ops_2_3_2 # cfg_2_3_2 qcom,pcie-msm8996 ops_2_4_0 # cfg_2_4_0 qcom,pcie-ipq4019 qcom,pcie-qcs404 ops_2_3_3 # cfg_2_3_3 qcom,pcie-ipq8074 ops_2_7_0 # cfg_2_7_0 qcom,pcie-sdm845 ops_2_9_0 # cfg_2_9_0 qcom,pcie-ipq5018 qcom,pcie-ipq6018 qcom,pcie-ipq8074-gen3 qcom,pcie-ipq9574 > If you insist, I can do that by calling > pci_host_set_default_pcie_link_state() from qcom_pcie_init_2_7_0() > and later move it to qcom_pcie_host_init() in a separate patch. I don't insist; that's why I said "if you think the risk of issues is nil," since I didn't know that you had any experience with ASPM being enabled on the others. But it sounds like you do, so I'm ok with this as-is. Bjorn