On Wed, Aug 27, 2025 at 03:45:14PM -0700, Keith Busch wrote: > Synchronize the interrupt to ensure the reset isn't going to disrupt a > previously pending handler from igoring the reset's link flap. Back to > back secondary bus resets create a window when the previous reset > proceeds with DLLLA, waking the pending pciehp interrupt thread, but the > subsequent reset tears it down while the irq thread tries to confirm the > link is active, triggering unexpected re-enumeration. Help me understand this: I think what you mean is that pciehp_reset_slot() runs and the Secondary Bus Reset causes a spurious link change. So pciehp_ist() runs, waits for the reset to finish with pci_hp_spurious_link_change(), then calls pciehp_ignore_link_change(), which tests whether the link is active again by calling pciehp_check_link_active(). And you're saying that at the same time, pciehp_reset_slot() runs, performs a Secondary Bus Reset, thus brings down the link, confusing the concurrent pciehp_check_link_active(). Did I understand that correctly? I don't quite see how this can happen, given pciehp_reset_slot() acquires ctrl->reset_lock for writing and the same lock is held for reading for the call to pciehp_check_link_active(). Moreover pciehp_ist() ignores the link flap in the first iteration (it clears the flags in the local variable "events") and if pciehp_check_link_active() would indeed fail, then the bits would only be set in ctrl->pending_events and a rerun of pciehp_ist() would be triggered. That second iteration of pciehp_ist() would then find the PCI_LINK_CHANGED flag set and ignore the link change (again). So this should all work fine. > +++ b/drivers/pci/hotplug/pciehp_hpc.c > @@ -946,6 +946,7 @@ int pciehp_reset_slot(struct hotplug_slot *hotplug_slot, bool probe) > > down_write_nested(&ctrl->reset_lock, ctrl->depth); > > + synchronize_irq(ctrl->pcie->irq); > pci_hp_ignore_link_change(pdev); > > rc = pci_bridge_secondary_bus_reset(ctrl->pcie->port); This will deadlock because it waits for pciehp_ist() to finish but pciehp_ist() may wait for ctrl->reset_lock, which is held here. Thanks, Lukas