Re: [PATCH v5] virtio_blk: Fix disk deletion hang on device surprise removal

On Thu, Jun 26, 2025 at 09:19:49AM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin <mst@xxxxxxxxxx>
> > Sent: 26 June 2025 12:04 PM
> > To: Parav Pandit <parav@xxxxxxxxxx>
> > Cc: Stefan Hajnoczi <stefanha@xxxxxxxxxx>; axboe@xxxxxxxxx;
> > virtualization@xxxxxxxxxxxxxxx; linux-block@xxxxxxxxxxxxxxx;
> > stable@xxxxxxxxxxxxxxx; NBU-Contact-Li Rongqing (EXTERNAL)
> > <lirongqing@xxxxxxxxx>; Chaitanya Kulkarni <chaitanyak@xxxxxxxxxx>;
> > xuanzhuo@xxxxxxxxxxxxxxxxx; pbonzini@xxxxxxxxxx;
> > jasowang@xxxxxxxxxx; alok.a.tiwari@xxxxxxxxxx; Max Gurtovoy
> > <mgurtovoy@xxxxxxxxxx>; Israel Rukshin <israelr@xxxxxxxxxx>
> > Subject: Re: [PATCH v5] virtio_blk: Fix disk deletion hang on device surprise
> > removal
> > 
> > On Thu, Jun 26, 2025 at 06:29:09AM +0000, Parav Pandit wrote:
> > > > > > > yes, however this is not at all different than hotunplug right after reset.
> > > > > >
> > > > > > For hotunplug after reset, we likely need a timeout handler,
> > > > > > because the block driver, waiting for the IO inside the remove()
> > > > > > callback, may not get notified by the driver core to synchronize
> > > > > > the ongoing remove().
> > > >
> > > >
> > > > Notified of what?
> > > Notification that surprise-removal occurred.
> > >
> > > > So is the scenario that graceful remove starts, and meanwhile a
> > > > surprise removal happens?
> > > >
> > > Right.
> > 
> > 
> > where is it stuck then? can you explain?
> 
> I am not sure I understood the question.
> 
> Let me try:
> The following scenario will hang even with the current fix:
> 
> Say,
> 1. Graceful removal is in progress in the remove() callback: del_gendisk() is running and waiting for the outstanding requests to complete.
> 
> 2. A few requests have yet to complete when surprise removal starts.
> 
> At this point, the virtio block driver will not get notified by the driver core layer, because the driver core is likely serializing the remove() triggered by user/driver unload against the device removal initiated by the PCI hotplug driver.
> So the vblk driver doesn't know that the device has been removed, and the block layer keeps waiting for request completions that never arrive.
> So del_gendisk() gets stuck.
> 
> This needs some kind of timeout handling to make removal more robust.
> 
> Did I answer, or did I misunderstand the question?

You did, thanks! How do other drivers handle this? The issue seems generic.
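
To make the scenario above concrete, here is a minimal userspace sketch of it.
This is not kernel code and not the patch under discussion: every name in it
(sim_device, sim_queue, sim_poll, sim_drain) is hypothetical, and a bounded
poll count stands in for a real timeout. It only models the idea that a waiter
which never sees completions needs some bound after which it fails the
remaining requests itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct sim_device {
    bool present;             /* false once surprise removal has happened */
    int polls_until_removal;  /* polls left before removal; negative = never */
};

struct sim_queue {
    int in_flight;            /* requests submitted but not yet completed */
    int failed;               /* requests failed by the driver on timeout */
};

/*
 * One poll step. A present device completes one request per step;
 * a removed device completes nothing, and nobody tells the waiter.
 */
static void sim_poll(struct sim_device *dev, struct sim_queue *q)
{
    if (dev->polls_until_removal == 0)
        dev->present = false;
    else if (dev->polls_until_removal > 0)
        dev->polls_until_removal--;

    if (dev->present && q->in_flight > 0)
        q->in_flight--;
}

/*
 * Wait for in-flight requests to drain, giving up after max_polls steps.
 * On timeout the remaining requests are failed so the waiter can proceed.
 * Returns true if the queue drained normally, false if the timeout fired.
 */
static bool sim_drain(struct sim_device *dev, struct sim_queue *q,
                      int max_polls)
{
    assert(max_polls >= 0);

    for (int i = 0; i < max_polls && q->in_flight > 0; i++)
        sim_poll(dev, q);

    if (q->in_flight == 0)
        return true;

    /* Timeout: the device never answered, so fail what is left. */
    q->failed += q->in_flight;
    q->in_flight = 0;
    return false;
}
```

With five requests in flight and a device that disappears after two polls,
sim_drain() returns false and fails the three remaining requests instead of
waiting forever. Without the max_polls bound, the same loop would never exit,
which is the del_gendisk() hang described above.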

-- 
MST




