On 6/24/25 2:12 AM, Bart Van Assche wrote:
> On 6/19/25 6:29 PM, Damien Le Moal wrote:
>> On 6/19/25 02:13, Bart Van Assche wrote:
>>>
>>> On 6/17/25 10:56 PM, Damien Le Moal wrote:
>>>> Can you check exactly the path that is being followed ? (your
>>>> backtrace does not seem to have everything)
>>>
>>> Hmm ... it is not clear to me why this information is required? My
>>> understanding is that the root cause is the same as for the deadlock
>>> fixed by Christoph:
>>> 1. A bio is queued onto zwplug->bio_list. Before this happens, the
>>>    queue reference count is increased by one.
>>> 2. A value is written into a block device sysfs attribute and queue
>>>    freezing starts. The queue freezing code waits for completion of
>>>    all bios on zwplug->bio_list because the reference count owned by
>>>    these bios is only released when these bios complete.
>>> 3. blk_zone_wplug_bio_work() dequeues a bio from zwplug->bio_list,
>>>    calls dm_submit_bio() through a function pointer, dm_submit_bio()
>>>    calls submit_bio_noacct() indirectly and submit_bio_noacct() calls
>>>    bio_queue_enter() indirectly. bio_queue_enter() sees that queue
>>>    freezing has started and waits until the queue is unfrozen.
>>> 4. A deadlock occurs because (2) and (3) wait for each other
>>>    indefinitely.
>>
>> Then we need to split DM BIOs immediately on submission, always.
>> So something like this totally untested patch should solve the issue.
>> Care to test ?
>
> (back in the office after four days off work)
>
> Hi Damien,
>
> Hmm ... it is not clear to me how a patch that modifies when bios are
> split could address the deadlock scenario described above? What am I
> missing? Additionally, hadn't Christoph requested not to split bios at
> the top of the device driver stack?

DM already calls bio split to limits at the top of its submission path. Not for
all BIOs though. I encourage you to look at the DM code more closely to
understand the issue here.

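
For reference, the four-step scenario quoted above boils down to a cycle of
waiters. Below is a minimal toy sketch in Python, not kernel code: the node
names are only labels for the quoted steps, and the assumption that each waiter
blocks on exactly one other thing holds only for this simplified picture.

```python
# Toy model only: node names are illustrative labels, not kernel symbols.
def find_cycle(waits_for):
    """Return the nodes forming a cycle in a single-edge wait-for map, or None."""
    for start in waits_for:
        path = [start]
        seen = {start}
        node = waits_for.get(start)
        while node is not None:
            if node == start:
                return path          # walked back to where we started
            if node in seen:
                break                # a cycle exists, but not through 'start'
            path.append(node)
            seen.add(node)
            node = waits_for.get(node)
    return None

# Who waits for whom in the quoted scenario (assumed simplification).
waits_for = {
    # Step 2: the freeze waits for the plugged bio to complete and drop its ref.
    "queue_freeze": "plugged_bio_completion",
    # Step 3: the plugged bio can only complete once it has been resubmitted.
    "plugged_bio_completion": "bio_queue_enter",
    # Step 3: resubmission blocks until the freeze/unfreeze sequence finishes.
    "bio_queue_enter": "queue_freeze",
}

cycle = find_cycle(waits_for)
print(cycle)
```

If the resubmission never has to go through queue enter again, the last two
edges disappear and find_cycle() returns None for the remaining map.
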
What is happening is that DM in general does *NOT* split write BIOs. But a DM
target driver is free to do so using dm_accept_partial_bio(), and that will
cause the remainder of a BIO to be issued again, but *NOT* necessarily from the
same context. Because of zone write plugging, this may happen from the zone
write plug BIO work, which then goes through queue enter and can deadlock with
a queue freeze when a BIO for the same zone is already plugged. Zone write
plugging heavily relies on the fact that once plugged, BIOs are *NOT* split
again, as otherwise we can deadlock. DM dm_accept_partial_bio() breaks that
contract.

> The patch that I posted one month ago is sufficient to fix this
> deadlock. See also
> https://lore.kernel.org/linux-block/20250522171405.3239141-1-bvanassche@xxxxxxx/

I do not like this. This is playing weird games with the queue enter/exit which
are very hard to understand. And I think Jens will not accept this as he does
not want to see zone stuff all over the place (and I agree).

For a nicer solution, which is mostly DM-based, combine what I sent you to
force write BIOs to be split early for zoned DM devices together with the patch
[1], which I sent already but needs more work. This combination was tested by
Shin'ichiro and he could not reproduce the hang with both patches applied.

[1] https://lore.kernel.org/dm-devel/20250611011340.92226-1-dlemoal@xxxxxxxxxx/

As far as I can tell, dm-crypt is the only DM target driver supporting zones
that splits write operations "under the hood". But I will check again.

-- 
Damien Le Moal
Western Digital Research