Re: [PATCH] PCI/ACPI: Fix runtime PM ref imbalance on hot-plug capable ports

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 23 Jun 2025, Lukas Wunner wrote:

> pcie_portdrv_probe() and pcie_portdrv_remove() both call
> pci_bridge_d3_possible() to determine whether to use runtime power
> management.  The underlying assumption is that pci_bridge_d3_possible()
> always returns the same value because otherwise a runtime PM reference
> imbalance occurs.
> 
> That assumption falls apart if the device is inaccessible on ->remove()
> due to hot-unplug:  pci_bridge_d3_possible() calls pciehp_is_native(),
> which accesses Config Space to determine whether the device is Hot-Plug
> Capable.   An inaccessible device returns "all ones", which is converted
> to "all zeroes" by pcie_capability_read_dword().  Hence the device no
> longer seems Hot-Plug Capable on ->remove() even though it was on
> ->probe().
> 
> The resulting runtime PM ref imbalance causes errors such as:
> 
>   pcieport 0000:02:04.0: Runtime PM usage count underflow!
> 
> The Hot-Plug Capable bit is cached in pci_dev->is_hotplug_bridge.
> pci_bridge_d3_possible() only calls pciehp_is_native() if that flag is
> set.  Re-checking the bit in pciehp_is_native() is thus unnecessary.
> 
> However pciehp_is_native() is also called from hotplug_is_native().  Move
> the Config Space access to that function.  The function is only invoked
> from acpiphp_glue.c, so move it there instead of keeping it in a publicly
> visible header.
> 
> Fixes: 5352a44a561d ("PCI: pciehp: Make pciehp_is_native() stricter")
> Reported-by: Laurent Bigonville <bigon@xxxxxxxx>
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220216
> Reported-by: Mario Limonciello <mario.limonciello@xxxxxxx>
> Closes: https://lore.kernel.org/r/20250609020223.269407-3-superm1@xxxxxxxxxx/
> Link: https://lore.kernel.org/all/20250620025535.3425049-3-superm1@xxxxxxxxxx/T/#u
> Signed-off-by: Lukas Wunner <lukas@xxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx # v4.18+

Hmm, I suspect this fixes the root cause for TB unplug while suspended 
which did have these RPM underflows. I saw them in logs while we were 
looking into LBMS interactions with the Target Speed quirk (that most 
visibly caused extra 60s delay, which was solved).

Back then, I briefly tried find solution to the underflow problem too but 
couldn't come up anything useful and as it looked relatively harmless I 
ended up postponing looking deeper into it.

Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxxxxxx>

-- 
 i.

> ---
>  drivers/pci/hotplug/acpiphp_glue.c | 15 +++++++++++++++
>  drivers/pci/pci-acpi.c             |  5 -----
>  include/linux/pci_hotplug.h        |  4 ----
>  3 files changed, 15 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
> index 5b1f271c6034..ae2bb8970f63 100644
> --- a/drivers/pci/hotplug/acpiphp_glue.c
> +++ b/drivers/pci/hotplug/acpiphp_glue.c
> @@ -50,6 +50,21 @@ static void acpiphp_sanitize_bus(struct pci_bus *bus);
>  static void hotplug_event(u32 type, struct acpiphp_context *context);
>  static void free_bridge(struct kref *kref);
>  
> +static bool hotplug_is_native(struct pci_dev *bridge)
> +{
> +	u32 slot_cap;
> +
> +	pcie_capability_read_dword(bridge, PCI_EXP_SLTCAP, &slot_cap);
> +
> +	if (slot_cap & PCI_EXP_SLTCAP_HPC && pciehp_is_native(bridge))
> +		return true;
> +
> +	if (shpchp_is_native(bridge))
> +		return true;
> +
> +	return false;
> +}
> +
>  /**
>   * acpiphp_init_context - Create hotplug context and grab a reference to it.
>   * @adev: ACPI device object to create the context for.
> diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
> index b78e0e417324..57bce9cc8a38 100644
> --- a/drivers/pci/pci-acpi.c
> +++ b/drivers/pci/pci-acpi.c
> @@ -816,15 +816,10 @@ int pci_acpi_program_hp_params(struct pci_dev *dev)
>  bool pciehp_is_native(struct pci_dev *bridge)
>  {
>  	const struct pci_host_bridge *host;
> -	u32 slot_cap;
>  
>  	if (!IS_ENABLED(CONFIG_HOTPLUG_PCI_PCIE))
>  		return false;
>  
> -	pcie_capability_read_dword(bridge, PCI_EXP_SLTCAP, &slot_cap);
> -	if (!(slot_cap & PCI_EXP_SLTCAP_HPC))
> -		return false;
> -
>  	if (pcie_ports_native)
>  		return true;
>  
> diff --git a/include/linux/pci_hotplug.h b/include/linux/pci_hotplug.h
> index ec77ccf1fc4d..02efeea62b25 100644
> --- a/include/linux/pci_hotplug.h
> +++ b/include/linux/pci_hotplug.h
> @@ -102,8 +102,4 @@ static inline bool pciehp_is_native(struct pci_dev *bridge) { return true; }
>  static inline bool shpchp_is_native(struct pci_dev *bridge) { return true; }
>  #endif
>  
> -static inline bool hotplug_is_native(struct pci_dev *bridge)
> -{
> -	return pciehp_is_native(bridge) || shpchp_is_native(bridge);
> -}
>  #endif
> 

[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux