Re: [PATCH RFC] Submit split bios in order

On 5/9/25 9:42 AM, Bart Van Assche wrote:
> If a bio is split, the bio fragments are added in reverse LBA order into
> the plug list. This triggers write errors with zoned storage and
> sequential writes. Fix this by preserving the LBA order when inserting in
> the plug list.

Preserving the order of the fragments would be a good thing for all block
devices. But what I fail to see here is how this lack of ordering affects zoned
block device writes, since zone write plugging will split large BIOs when a
write BIO goes through zone write plugging. That happens before we have a
request, so we should never end up needing to split a zone write request.

> 
> This patch has been posted as an RFC because this patch changes the
> complexity of inserting in the plug list from O(1) into O(n).
> 
> Cc: Christoph Hellwig <hch@xxxxxx>
> Cc: Damien Le Moal <dlemoal@xxxxxxxxxx>
> Signed-off-by: Bart Van Assche <bvanassche@xxxxxxx>
> ---
>  block/blk-mq.c | 26 +++++++++++++++++++++++++-
>  1 file changed, 25 insertions(+), 1 deletion(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 796baeccd37b..e1311264a337 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1386,6 +1386,30 @@ static inline unsigned short blk_plug_max_rq_count(struct blk_plug *plug)
>  	return BLK_MAX_REQUEST_COUNT;
>  }
>  
> +/*
> + * If a bio is split, the bio fragments are submitted in opposite order. Hence
> + * this function that inserts in LBA order in the plug list.
> + */
> +static inline void rq_list_insert_sorted(struct rq_list *rl, struct request *rq)
> +{
> +	sector_t rq_pos = rq->bio->bi_iter.bi_sector;
> +	struct request *next, *prev;
> +
> +	for (prev = NULL, next = rl->head; next;
> +	     prev = next, next = next->rq_next)
> +		if (next->q == rq->q && rq_pos < next->bio->bi_iter.bi_sector)
> +			break;
> +
> +	if (!prev) {
> +		rq_list_add_head(rl, rq);
> +	} else if (!next) {
> +		rq_list_add_tail(rl, rq);
> +	} else {
> +		prev->rq_next = rq;
> +		rq->rq_next = next;
> +	}
> +}
> +
>  static void blk_add_rq_to_plug(struct blk_plug *plug, struct request *rq)
>  {
>  	struct request *last = rq_list_peek(&plug->mq_list);
> @@ -1408,7 +1432,7 @@ static void blk_add_rq_to_plug(struct blk_plug *plug, struct request *rq)
>  	 */
>  	if (!plug->has_elevator && (rq->rq_flags & RQF_SCHED_TAGS))
>  		plug->has_elevator = true;
> -	rq_list_add_tail(&plug->mq_list, rq);
> +	rq_list_insert_sorted(&plug->mq_list, rq);
>  	plug->rq_count++;
>  }
>  
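
For reference, the insertion logic in the hunk above can be modelled in
userspace roughly as follows. This is only a sketch: struct model_rq,
struct model_list, model_insert_sorted() and model_demo() are hypothetical
stand-ins invented for illustration, not the kernel's struct request,
struct rq_list or their helpers, and queue_id merely stands in for the
rq->q comparison. It shows the list walk that turns each insertion from
O(1) into O(n).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins, not the kernel's request types. */
struct model_rq {
	unsigned long long sector;	/* start LBA of the request */
	int queue_id;			/* stands in for rq->q */
	struct model_rq *next;
};

struct model_list {
	struct model_rq *head;
	struct model_rq *tail;
};

/*
 * Walk the list until the first entry on the same queue with a higher
 * start sector, then link rq in front of it. The walk is what makes
 * each insertion O(n) in the number of plugged requests.
 */
static void model_insert_sorted(struct model_list *l, struct model_rq *rq)
{
	struct model_rq *prev = NULL, *next;

	for (next = l->head; next; prev = next, next = next->next)
		if (next->queue_id == rq->queue_id && rq->sector < next->sector)
			break;

	rq->next = next;
	if (prev)
		prev->next = rq;	/* middle or tail insertion */
	else
		l->head = rq;		/* new head (or first element) */
	if (!next)
		l->tail = rq;		/* appended at the end */
}

/*
 * Insert three fragments in reverse LBA order and return 1 if the
 * list ends up in ascending LBA order, 0 otherwise.
 */
static int model_demo(void)
{
	struct model_rq frag[3] = {
		{ .sector = 0, .queue_id = 1 },
		{ .sector = 8, .queue_id = 1 },
		{ .sector = 16, .queue_id = 1 },
	};
	struct model_list l = { NULL, NULL };
	struct model_rq *rq;
	int i;

	for (i = 2; i >= 0; i--)
		model_insert_sorted(&l, &frag[i]);

	assert(l.head == &frag[0] && l.tail == &frag[2]);
	for (rq = l.head; rq && rq->next; rq = rq->next)
		if (rq->sector > rq->next->sector)
			return 0;
	return 1;
}
```

Calling model_demo() from a small main() exercises the reverse-order case
described in the commit message. The model keeps an explicit tail pointer
so that the head, middle and tail cases can each be checked.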


-- 
Damien Le Moal
Western Digital Research



