Re: [PATCH 2/2] PCI: pciehp: Ignore Link Down/Up caused by Secondary Bus Reset

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Apr 11, 2025 at 03:28:15PM -0700, Sathyanarayanan Kuppuswamy wrote:
> On 4/10/25 8:27 AM, Lukas Wunner wrote:
> > Introduce two helpers to annotate code sections which cause spurious link
> > changes:  pci_hp_ignore_link_change() and pci_hp_unignore_link_change()
> > Use those helpers in lieu of masking interrupts in the Slot Control
> > register.
> > 
> > Introduce a helper to check whether such a code section is executing
> > concurrently and if so, await it:  pci_hp_spurious_link_change()
> > Invoke the helper in the hotplug IRQ thread pciehp_ist().  Re-use the
> > IRQ thread's existing code which ignores DPC-induced link changes unless
> > the link is unexpectedly down after reset recovery or the device was
> > replaced during the bus reset.
> 
> Since most of the changes in this patch is related to adding framework to
> ignore spurious hotplug event, why not split it in to a separate patch?

The idea is to ease backporting.  The issue fixes VFIO passthrough on
v6.13-rc1 and newer, so the patch will have to be backported at least
to v6.13, v6.14, v6.15.


> Is this fix tested in any platform?

Yes, Joel Mathew Thomas successfully tested it:

https://bugzilla.kernel.org/show_bug.cgi?id=219765

That's an Asus TUF FA507NU dual-GPU laptop (AMD CPU + Nvidia discrete GPU).
The Nvidia GPU is passed through to a VM.


> > --- a/drivers/pci/pci.h
> > +++ b/drivers/pci/pci.h
> > @@ -227,6 +227,7 @@ static inline bool pcie_downstream_port(const struct pci_dev *dev)
> >   /* Functions for PCI Hotplug drivers to use */
> >   int pci_hp_add_bridge(struct pci_dev *dev);
> > +bool pci_hp_spurious_link_change(struct pci_dev *pdev);
> 
> Do you expect this function used outside hotplug code? If not why not leave
> the declaration in pciehp.h?

Any hotplug driver may call this.  In particular, there are two other drivers
implementing the ->reset_slot() callback: pnv_php.c and s390_pci_hpc.c

pnv_php.c does a similar dance as pciehp_hpc.c (before this patch):
It disables the interrupt, performs a Secondary Bus Reset, clears spurious
events and re-enables the interrupt.  I think it can be converted to use
the newly introduced API.  That should make it more robust against removal
or replacement of the device during the Secondary Bus Reset.

Also, to cope with runtime suspend to D3cold, there's an ignore_hotplug
bit in struct pci_dev.  The bit is set by drivers for discrete GPUs and
is honored by acpiphp and pciehp.  I'd like to remove the bit in favor
of the new mechanism introduced here and that means acpiphp will have to
be converted to call pci_hp_spurious_link_change().

One thing that's problematic about the current behavior is that hotplug
events are ignored wholesale for GPUs which runtime suspend to D3cold.
That works for discrete GPUs in laptops which are soldered down on the
mainboard, but doesn't work for GPUs which are plugged into an actual
hotplug port, e.g. data center GPUs.  The new API will allow detecting
and ignoring spurious events in a more surgical manner.  I envision
that __pci_set_power_state() will call pci_hp_ignore_link_change()
if the target power state is D3cold.  Also vga_switcheroo.c will have
to call that.  But none of the GPU drivers will have to call
pci_ignore_hotplug() anymore.

To summarize, there are at least two other hotplug drivers besides pciehp
which I expect will call pci_hp_spurious_link_change() in the foreseeable
future, acpiphp and pnv_php, hence the declaration is not in pciehp.h
but in drivers/pci/pci.h.


> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -1848,6 +1848,14 @@ static inline void pcie_no_aspm(void) { }
> >   static inline bool pcie_aspm_enabled(struct pci_dev *pdev) { return false; }
> >   #endif
> > +#ifdef CONFIG_HOTPLUG_PCI
> > +void pci_hp_ignore_link_change(struct pci_dev *pdev);
> > +void pci_hp_unignore_link_change(struct pci_dev *pdev);
> > +#else
> > +static inline void pci_hp_ignore_link_change(struct pci_dev *pdev) { }
> > +static inline void pci_hp_unignore_link_change(struct pci_dev *pdev) { }
> > +#endif
> > +
> 
> Generally we expose APIs when we really need it.  Since you have already
> identified some use cases where you will use it in other drivers, why not
> include one such change as an example?

Mostly because I wanted to get this fix out the door quickly before more
people come across the regression it addresses.

Converting the Mellanox Ethernet driver to use the API would require an ack
from its maintainers.  Since it's not urgent, I was planning to do that in
a subsequent cycle.  Same for the conversion of D3cold handling.

Thanks,

Lukas




[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux