On 5/9/25 9:42 AM, Bart Van Assche wrote:
> If a bio is split, the bio fragments are added in reverse LBA order into
> the plug list. This triggers write errors with zoned storage and
> sequential writes. Fix this by preserving the LBA order when inserting in
> the plug list.

Preserving the order of the fragments would be a good thing for all block
devices. But what I fail to see here is how this lack of ordering affects
zoned block device writes, since large write BIOs are already split when
they go through zone write plugging. That happens before we have a request,
so we should never end up needing to split a zone write request.

> 
> This patch has been posted as an RFC because this patch changes the
> complexity of inserting in the plug list from O(1) into O(n).
> 
> Cc: Christoph Hellwig <hch@xxxxxx>
> Cc: Damien Le Moal <dlemoal@xxxxxxxxxx>
> Signed-off-by: Bart Van Assche <bvanassche@xxxxxxx>
> ---
>  block/blk-mq.c | 26 +++++++++++++++++++++++++-
>  1 file changed, 25 insertions(+), 1 deletion(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 796baeccd37b..e1311264a337 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1386,6 +1386,30 @@ static inline unsigned short blk_plug_max_rq_count(struct blk_plug *plug)
>          return BLK_MAX_REQUEST_COUNT;
>  }
>  
> +/*
> + * If a bio is split, the bio fragments are submitted in opposite order. Hence
> + * this function that inserts in LBA order in the plug list.
> + */
> +static inline void rq_list_insert_sorted(struct rq_list *rl, struct request *rq)
> +{
> +        sector_t rq_pos = rq->bio->bi_iter.bi_sector;
> +        struct request *next, *prev;
> +
> +        for (prev = NULL, next = rl->head; next;
> +             prev = next, next = next->rq_next)
> +                if (next->q == rq->q && rq_pos < next->bio->bi_iter.bi_sector)
> +                        break;
> +
> +        if (!prev) {
> +                rq_list_add_head(rl, rq);
> +        } else if (!next) {
> +                rq_list_add_tail(rl, rq);
> +        } else {
> +                prev->rq_next = rq;
> +                rq->rq_next = next;
> +        }
> +}
> +
>  static void blk_add_rq_to_plug(struct blk_plug *plug, struct request *rq)
>  {
>          struct request *last = rq_list_peek(&plug->mq_list);
> @@ -1408,7 +1432,7 @@ static void blk_add_rq_to_plug(struct blk_plug *plug, struct request *rq)
>           */
>          if (!plug->has_elevator && (rq->rq_flags & RQF_SCHED_TAGS))
>                  plug->has_elevator = true;
> -        rq_list_add_tail(&plug->mq_list, rq);
> +        rq_list_insert_sorted(&plug->mq_list, rq);
>          plug->rq_count++;
>  }

-- 
Damien Le Moal
Western Digital Research
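
For illustration only, below is a minimal standalone userspace sketch of the
sorted-insert idea from the quoted patch. The names (sketch_rq, sketch_rq_list,
sketch_insert_sorted) are made up for this sketch and are not the kernel's
struct request / struct rq_list API; the per-queue match done by the patch is
also omitted. It only shows the O(n) walk needed to keep a singly linked plug
list ordered by start LBA when split fragments arrive in reverse order.

/*
 * Standalone sketch of sorted insertion into a singly linked list keyed by
 * start sector. Hypothetical types; not the kernel rq_list implementation.
 */
#include <stdio.h>
#include <stdlib.h>

typedef unsigned long long sector_t;

struct sketch_rq {
        sector_t sector;                /* start LBA of the request */
        struct sketch_rq *next;         /* singly linked, like rq_next */
};

struct sketch_rq_list {
        struct sketch_rq *head;
        struct sketch_rq *tail;
};

/* O(n) insertion that keeps the list sorted by ascending start sector. */
static void sketch_insert_sorted(struct sketch_rq_list *rl, struct sketch_rq *rq)
{
        struct sketch_rq *prev = NULL, *next;

        /* Walk until we find the first element with a larger start sector. */
        for (next = rl->head; next; prev = next, next = next->next)
                if (rq->sector < next->sector)
                        break;

        rq->next = next;
        if (prev)
                prev->next = rq;
        else
                rl->head = rq;
        if (!next)
                rl->tail = rq;
}

int main(void)
{
        /* Fragments of a split bio arriving in reverse LBA order. */
        sector_t lbas[] = { 3072, 2048, 1024, 0 };
        struct sketch_rq_list rl = { NULL, NULL };

        for (size_t i = 0; i < sizeof(lbas) / sizeof(lbas[0]); i++) {
                struct sketch_rq *rq = calloc(1, sizeof(*rq));

                if (!rq)
                        return 1;
                rq->sector = lbas[i];
                sketch_insert_sorted(&rl, rq);
        }

        /* Prints the sectors in ascending order despite reversed submission. */
        for (struct sketch_rq *rq = rl.head; rq; rq = rq->next)
                printf("sector %llu\n", rq->sector);
        return 0;
}

Each insertion walks the list once, which is the O(1) to O(n) complexity change
called out in the RFC note above.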