On Thu, Apr 10, 2025 at 05:27:10PM +0200, Lukas Wunner wrote: > Trying to kill several birds with one stone here: > > First of all, PCIe hotplug is deliberately ignoring link events occurring > as a side effect of Downstream Port Containment. But it's not yet ignoring > Presence Detect Changed events. These can happen if a hotplug bridge uses > in-band presence detect. Reported by Keith Busch, patch [1/2] seeks to > fix it. > > Second, PCIe hotplug is deliberately ignoring link events and Presence > Detect Changed events occurring as a side effect of a Secondary Bus Reset. > But that's no longer working properly since the introduction of bandwidth > control in v6.13-rc1. Actually it never worked properly, but bandwidth > control is now mercilessly exposing the issue. VFIO is thus broken, > it resets the device on passthrough. Reported by Joel Mathew Thomas. > > Third, link or presence events can not only occur as a side effect of DPC > or SBR, but also because of suspend to D3cold, a firmware update or FPGA > reconfiguration. In particular, Altera engineers report that the link > goes down as a side effect of FPGA reconfiguration and the PCIe hotplug > driver responds by disabling slot power. Obviously not what you'd want > while the FPGA is being reconfigured! > > This leads me to believe that we need a generic mechanism to tell hotplug > drivers that spurious link changes are ongoing which need to be ignored. > Patch [2/2] introduces an API for it and the first user is SBR handling > in PCIe hotplug. This fixes the issue exposed by bandwidth control. > It also aligns DPC and SBR handling in the PCIe hotplug driver such that > they use the same code path. > > The API pci_hp_ignore_link_change() / pci_hp_unignore_link_change() is > initially not exported. It can be once the first modular user shows up. > > Although these are technically fixes, they're slightly intrusive, so it > would be good to let them simmer in linux-next for a while. One option > would be to apply for v6.16 and let Greg & Sasha do the backporting. > Another would be to apply to the for-linus branch for v6.15 but wait > maybe 4 weeks before a pull request is sent. > > Please review and test. Thanks! > > Lukas Wunner (2): > PCI: pciehp: Ignore Presence Detect Changed caused by DPC > PCI: pciehp: Ignore Link Down/Up caused by Secondary Bus Reset > > drivers/pci/hotplug/pci_hotplug_core.c | 69 ++++++++++++++++++++++++++++++ > drivers/pci/hotplug/pciehp.h | 1 + > drivers/pci/hotplug/pciehp_core.c | 29 ------------- > drivers/pci/hotplug/pciehp_hpc.c | 78 ++++++++++++++++++++++------------ > drivers/pci/pci.h | 3 ++ > include/linux/pci.h | 8 ++++ > 6 files changed, 132 insertions(+), 56 deletions(-) Applied to pci/hotplug for now, thanks! We can get it into -next and if all goes well easily move to for-linus for v6.15 if that's the right thing.