Re: [PATCH] PCI/ACPI: Fix runtime PM ref imbalance on hot-plug capable ports

On Wed, Jun 25, 2025 at 09:37:38AM +0200, Lukas Wunner wrote:
> On Tue, Jun 24, 2025 at 09:24:07AM -0500, Bjorn Helgaas wrote:
> > On Mon, Jun 23, 2025 at 07:08:20PM +0200, Lukas Wunner wrote:
> > > pcie_portdrv_probe() and pcie_portdrv_remove() both call
> > > pci_bridge_d3_possible() to determine whether to use runtime power
> > > management.  The underlying assumption is that pci_bridge_d3_possible()
> > > always returns the same value because otherwise a runtime PM reference
> > > imbalance occurs.
> > > 
> > > That assumption falls apart if the device is inaccessible on ->remove()
> > > due to hot-unplug:  pci_bridge_d3_possible() calls pciehp_is_native(),
> > > which accesses Config Space to determine whether the device is Hot-Plug
> > > Capable.   An inaccessible device returns "all ones", which is converted
> > > to "all zeroes" by pcie_capability_read_dword().  Hence the device no
> > > longer seems Hot-Plug Capable on ->remove() even though it was on
> > > ->probe().
> > 
> > This is pretty subtle; thanks for chasing it down.
> > 
> > It doesn't look like anything in pci_bridge_d3_possible() should 
> > change over the life of the device, although acpi_pci_bridge_d3() is
> > non-trivial.
> > 
> > Should we consider calling pci_bridge_d3_possible() only once and
> > caching the result?  We already call it in pci_pm_init() and save the
> > result in dev->bridge_d3.  That member can be changed by
> > pci_bridge_d3_update(), but we could add another copy that we never
> > update after pci_pm_init().
> 
> If we did that, I think we'd still want to have a WARN_ON() like this in
> pcie_portdrv_remove():
> 
> +	WARN_ON(dev->bridge_d3_orig != pci_bridge_d3_possible(dev));
> +
> +	if (dev->bridge_d3_orig) {
> -	if (pci_bridge_d3_possible(dev)) {
> 
> Because without the WARN_ON(), such bugs would fly under the radar.
> 
> However currently we get the WARN_ON() for free because of the runtime PM
> refcount underflow.
> 
> So caching the original return value of pci_bridge_d3_possible(dev)
> wouldn't be a net positive.

Fair point.  pci_bridge_d3_possible() is mainly used by portdrv, and 
keeping another copy in the pci_dev does seem like overkill.

If the point is to ensure that the runtime PM setup done by
pcie_portdrv_probe() is undone by pcie_portdrv_remove() and
pcie_portdrv_shutdown(), maybe portdrv should remember what it did,
e.g., call pci_bridge_d3_possible() once in .probe() and save the
result for use in .remove() and .shutdown().

That's what I expect drivers to do in general for cleaning up things
in .remove(): it's the driver's problem to remember what needs to be
cleaned up.
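
To make that concrete, here is a rough userspace model, not actual
portdrv code.  Everything in it is hypothetical (struct fake_bridge,
did_rpm_setup, the fake_*() helpers, simulate()), and fake_d3_possible()
is a deliberately simplified stand-in for pci_bridge_d3_possible() that
only models "an inaccessible device no longer looks Hot-Plug Capable".
The usage count starts at 1 to represent a reference held outside the
driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct pci_dev; only what this sketch needs. */
struct fake_bridge {
	bool accessible;	/* false after surprise hot-unplug */
	bool hotplug_capable;	/* what Config Space reports while accessible */
	int pm_usage;		/* stand-in for the runtime PM usage count */
	bool did_rpm_setup;	/* hypothetical: what .probe() decided */
};

/* Simplified model: an inaccessible device reads as "not capable". */
static bool fake_d3_possible(const struct fake_bridge *b)
{
	return b->accessible && b->hotplug_capable;
}

/* .probe(): evaluate once, remember the decision, drop a reference. */
static void fake_probe(struct fake_bridge *b)
{
	b->did_rpm_setup = fake_d3_possible(b);
	if (b->did_rpm_setup)
		b->pm_usage--;	/* models pm_runtime_put_autosuspend() */
}

/* .remove() that re-evaluates: skips the get if the device is gone. */
static void fake_remove_reeval(struct fake_bridge *b)
{
	if (fake_d3_possible(b))
		b->pm_usage++;	/* models pm_runtime_get_noresume() */
}

/* .remove() that undoes exactly what .probe() recorded. */
static void fake_remove_cached(struct fake_bridge *b)
{
	if (b->did_rpm_setup)
		b->pm_usage++;
}

/* Probe, hot-unplug, remove; returns the final usage count (1 == balanced). */
int simulate(bool use_cached)
{
	struct fake_bridge b = {
		.accessible = true,
		.hotplug_capable = true,
		.pm_usage = 1,
	};

	fake_probe(&b);
	assert(b.pm_usage == 0);	/* .probe() dropped its reference */

	b.accessible = false;		/* surprise hot-unplug */

	if (use_cached)
		fake_remove_cached(&b);
	else
		fake_remove_reeval(&b);

	return b.pm_usage;
}
```

In this model simulate(false) ends with a usage count of 0, i.e. the
reference dropped in .probe() is never re-acquired, while simulate(true)
ends at 1.  Where the remembered state would actually live in portdrv is
left open here.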

But I feel like I'm missing your point about bugs flying under the
radar.  Having portdrv keep track of whether it did runtime PM setup
(i.e., the pci_bridge_d3_possible() state at .probe()-time) is
functionally the same as having struct pci_dev keep track of it, so
the bugs you're referring to could still fly under the radar.

Bjorn



