[cc += Mika] On Mon, Jun 23, 2025 at 06:37:33AM -0500, Mario Limonciello wrote: > On 6/23/25 5:11 AM, Rafael J. Wysocki wrote: > > On Mon, Jun 23, 2025 at 12:05 PM Lukas Wunner <lukas@xxxxxxxxx> wrote: > > > pcie_portdrv_probe() and pcie_portdrv_remove() both call > > > pci_bridge_d3_possible() to determine whether to use runtime power > > > management. The underlying assumption is that pci_bridge_d3_possible() > > > always returns the same value because otherwise a runtime PM reference > > > imbalance occurs. > > > > > > That assumption falls apart if the device is inaccessible on ->remove() > > > due to hot-unplug: pci_bridge_d3_possible() calls pciehp_is_native(), > > > which accesses Config Space to determine whether the device is Hot-Plug > > > Capable. An inaccessible device generally returns "all ones" for such > > > Config Read Requests. Hence the device may seem Hot-Plug Capable on > > > ->remove() even though it wasn't on ->probe(). > > > > > > Use the cached copy of the Hot-Plug Capable bit to avoid the Config Space > > > access and the resulting runtime PM ref imbalance. > > > > > > Signed-off-by: Lukas Wunner <lukas@xxxxxxxxx> > > > > Reviewed-by: Rafael J. Wysocki <rafael@xxxxxxxxxx> > > Tested-by: Mario Limonciello <mario.limonciello@xxxxxxx>> I ended up changing the patch significantly, so I did not include Rafael's Reviewed-by and Mario's Tested-by in the final patch. My apologies for this! Looking at the commit which introduced the Config Space read, 5352a44a561d, I got the impression that Mika may have deliberately avoided using the is_hotplug_bridge flag. Notably, is_hotplug_bridge is also set by check_hotplug_bridge() in acpiphp_glue.c, and his intention was probably to avoid matching those bridges in pciehp_is_native(). So I decided to err on the side of caution and keep the Config Space read if pciehp_is_native() is called from hotplug_is_native(). Just to avoid any potential regressions since the fix is tagged for stable. I also searched lore for occurrences of the keywords... pcieport Runtime PM usage count underflow ...and did find quite a few reports, but this error message was just a side effect and the reports were about completely different issues. It does prove though that this bug has existed for a while! Thanks Laurent for the report and Mario for root-causing this! Lukas