Lukas, Keith, Bjorn, others, would very much appreciate your comments on whether the pci core changes are acceptable. Parav, I expect this to be integrated into your work on fixing surprise removal. As such, I am not queueing these patches yet - please include with your other patches fixing these issues. ========== This is an attempt to fix the following race in virtio: when device removal is initiated by a user action, such as driver unbind, it in turn initiates driver cleanup and is then waiting for an interrupt from the device. If the device is now surprise-removed, that interrupt never arrives and the remove callback hangs forever. For example, this was reported for virtio-blk: 1. the graceful removal is ongoing in the remove() callback, where disk deletion del_gendisk() is ongoing, which waits for the requests +to complete, 2. Now few requests are yet to complete, and surprise removal started. At this point, virtio block driver will not get notified by the driver core layer, because it is likely serializing remove() happening by +user/driver unload and PCI hotplug driver-initiated device removal. So vblk driver doesn't know that device is removed, block layer is waiting for requests completions to arrive which it never gets. So del_gendisk() gets stuck. We could add timeouts to handle that, but given virtio blk devices are implemented in userspace, this makes the device flaky. Instead, this adds pci core infrastructure, and virtio core infrastructure, for drivers to be notified of device disconnect. Actual cleanup in virtio-blk is still TBD. Compile-tested only. ========== Notes on the design: Care was taken to avoid re-introducing the bug fixed by commit 74ff8864cc84 ("PCI: hotplug: Allow marking devices as disconnected during bind/unbind") To avoid taking locks on removal path, and to avoid invoking callback with unpredictable latency, the event is reported through a WQ. Adding APIs to enable/disable the reporting on probe/remove, helps make sure the driver won't go away in the middle of the handling, all without taking any locks. The benefit is that the resulting API is harder than a callback to misuse, adding unpredictable latencies to unplug. The WQ can simply do the cleanup directly, taking any locks it needs for that. The cost is several extra bytes per device, which seems modest. Previous discussion: https://lore.kernel.org/all/11cfcb55b5302999b0e58b94018f92a379196698.1751136072.git.mst@xxxxxxxxxx Changes from v4: fix !CONFIG_PCI, reported by Stephen Changes from v3: added documentation to address comments by Parav support in virtio core Changes from v2: v2 was corrupted, fat fingers :( Changes from v1: switched to a WQ, with APIs to enable/disable added motivation ========== Michael S. Tsirkin (5): pci: report surprise removal event virtio: fix comments, readability virtio: pack config changed flags virtio: allow transports to suppress config change virtio: support device disconnect drivers/pci/pci.h | 6 ++++ drivers/virtio/virtio.c | 23 +++++++++++++-- drivers/virtio/virtio_pci_common.c | 45 ++++++++++++++++++++++++++++++ drivers/virtio/virtio_pci_common.h | 3 ++ drivers/virtio/virtio_pci_legacy.c | 2 ++ drivers/virtio/virtio_pci_modern.c | 2 ++ include/linux/pci.h | 45 ++++++++++++++++++++++++++++++ include/linux/virtio.h | 11 ++++++-- include/linux/virtio_config.h | 32 +++++++++++++++++++++ 9 files changed, 163 insertions(+), 6 deletions(-) -- MST