On Wed, Jul 09, 2025 at 05:55:17PM -0600, Keith Busch wrote:
> On Wed, Jul 09, 2025 at 06:38:20PM -0500, Bjorn Helgaas wrote:
> > This relies on somebody (typically pciehp, I guess) calling
> > pci_dev_set_disconnected() when a surprise remove happens.
> >
> > Do you think it would be practical for the driver's .remove() method
> > to recognize that the device may stop responding at any point, even if
> > no hotplug driver is present to call pci_dev_set_disconnected()?
> >
> > Waiting forever for an interrupt seems kind of vulnerable in general.
> > Maybe "artificially adding timeouts" is alluding to *not* waiting
> > forever for interrupts? That doesn't seem artificial to me because
> > it's just a fact of life that devices can disappear at arbitrary
> > times.
>
> I totally agree here. Every driver's .remove() should be able to
> guarantee forward progress some way. I put some work in blk-mq and nvme
> to ensure that happens for those devices at least.
>
> That "forward progress" can come slow though, maybe minutes, so we do
> have opportunistic short cuts sprinkled about the driver. There are
> still gaps when waiting for interrupt driven IO that need the longer
> timeouts to trigger. It'd be cool if there was a mechanism to kick in
> quicker, but this is still an uncommon exceptional condition, right?

It's uncommon, yes.

--
MST