On 5/21/25 23:18, Bart Van Assche wrote:
> On 5/20/25 10:53 PM, Christoph Hellwig wrote:
>> On Tue, May 20, 2025 at 11:09:15AM -0700, Bart Van Assche wrote:
>>> If the sequential write bios are split by the device mapper, sorting
>>> bios in the block layer is not necessary. Christoph and Damien, do you
>>> agree to replace the bio sorting code in my previous email with the
>>> patch below?
>>
>> No. First please create a reproducer for your issue using null_blk
>> or scsi_debug, otherwise we have no way to understand what is going
>> on here, and will regress in the future.
>>
>> Second should very much be able to fix the splitting in dm to place
>> the bios in the right order. As mentioned before I have a theory
>> of how to do it, but we really need a proper reproducer to test this
>> and then to write it up to blktests first.
>
> Hi Christoph,
>
> The following pull request includes a test that triggers the deadlock
> fixed by patch 2/2 reliably:
>
> https://github.com/osandov/blktests/pull/171

+Shin'ichiro so that he is aware of the context.

Please share the blktest patch on this list so that we can see how you
recreate the issue. That makes it easier to see if a fix is appropriate.

> I do not yet have a reproducer for the bio reordering but I'm still
> working on this.

I am still very confused about how this is possible assuming a
well-behaved user that actually submits write BIOs in sequence for a
zone, that is, with a lock around submit_bio() calls. Assuming such a
user, a large write BIO that is split would have its fragments all
processed and added to the target zone plug in order. Another context
(or the same context) submitting the next write for that zone would have
the same happen, so BIO fragments should not be reordered...

So to clarify: are we talking about splits of the BIO that the DM device
receives? Or is it about splits of cloned BIOs that are used to process
the BIOs that the DM device received?
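
To make the ordering expectation above concrete, here is a minimal
user-space sketch in Python. It is a toy model, not kernel code: ZoneModel,
submit_write(), MAX_FRAGMENT_SECTORS and the helper functions are all
hypothetical names, and the model assumes that splitting and plugging both
complete inside the serialized submission. It does not model DM clones or
BIO flags at all.

```python
import threading

# Hypothetical split limit, in sectors. Real limits come from the device.
MAX_FRAGMENT_SECTORS = 8


class ZoneModel:
    """Toy model of one zone: a write position plus an ordered plug list."""

    def __init__(self):
        # Stands in for the lock a well-behaved user takes around submission.
        self.lock = threading.Lock()
        # Sector where the next sequential write must start.
        self.next_sector = 0
        # Fragments as (start_sector, nr_sectors), in the order they were plugged.
        self.plug = []

    def submit_write(self, nr_sectors):
        """Submit one sequential write, splitting it into fragments.

        Assumption of this model: the split and the plugging of every
        fragment happen before the submission lock is released.
        """
        with self.lock:
            start = self.next_sector
            self.next_sector += nr_sectors
            offset = 0
            while offset < nr_sectors:
                frag = min(MAX_FRAGMENT_SECTORS, nr_sectors - offset)
                self.plug.append((start + offset, frag))
                offset += frag


def plug_is_sequential(plug):
    """Return True if every fragment starts where the previous one ended."""
    expected = 0
    for start, length in plug:
        if start != expected:
            return False
        expected = start + length
    return True


def run(num_threads=4, writes_per_thread=50, nr_sectors=20):
    """Have several contexts submit writes to the same zone concurrently."""
    zone = ZoneModel()

    def worker():
        for _ in range(writes_per_thread):
            zone.submit_write(nr_sectors)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return zone
```

Under those assumptions, plug_is_sequential(run().plug) holds no matter how
the threads interleave. Any reordering would therefore have to come from
something the model leaves out, which is why the exact sequence matters.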
The clones are for the underlying device and should not have the zone
plugging flag set until the DM target driver submits them, even if the
original BIO is flagged with zone plugging. Looking at the bio clone
code, the bio flags do not seem to be copied from the source BIO to the
clone. So even if the source BIO (the BIO received by the DM device) is
flagged with zone write plugging, a clone should not have this flag set
until it is submitted.

Could you clarify the sequence and BIO flags you see that lead to the
issue?

-- 
Damien Le Moal
Western Digital Research