Re: [PATCH V3] block: make segment size limit workable for > 4K PAGE_SIZE

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Feb 18, 2025 at 11:42:43AM +0100, Daniel Gomez wrote:
> On Thu, Feb 13, 2025 at 08:00:40PM +0100, Ming Lei wrote:
> > PAGE_SIZE is applied in validating block device queue limits, this way is
> > very fragile and is wrong:
> > 
> > - queue limits are read from hardware, which is often one readonly hardware
> > property
> > 
> > - PAGE_SIZE is one config option which can be changed during build time.
> > 
> > In RH lab, it has been found that max segment size of some mmc card is
> > less than 64K, then this kind of card can't be probed successfully when
> > same kernel is re-built with 64K PAGE_SIZE.
> > 
> > Fix this issue by using BLK_MIN_SEGMENT_SIZE in related code for dealing
> > with queue limits and checking if bio needn't split as a hint. Define
> > BLK_MIN_SEGMENT_SIZE as 4K(minimized PAGE_SIZE).
> > 
> > The following commits are depended for backporting:
> > 
> > commit 6aeb4f836480 ("block: remove bio_add_pc_page")
> > commit 02ee5d69e3ba ("block: remove blk_rq_bio_prep")
> > commit b7175e24d6ac ("block: add a dma mapping iterator")
> > 
> > Cc: Paul Bunyan <pbunyan@xxxxxxxxxx>
> > Cc: Yi Zhang <yi.zhang@xxxxxxxxxx>
> > Cc: Luis Chamberlain <mcgrof@xxxxxxxxxx>
> > Cc: John Garry <john.g.garry@xxxxxxxxxx>
> > Cc: Bart Van Assche <bvanassche@xxxxxxx>
> > Cc: Keith Busch <kbusch@xxxxxxxxxx>
> > Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
> > ---
> > V3:
> > 	- rephrase commit log & fix patch style(Christoph)
> > 	- more comment log(Christoph)
> > V2:
> > 	- cover bio_split_rw_at()
> > 	- add BLK_MIN_SEGMENT_SIZE
> > 
> >  block/blk-merge.c      | 2 +-
> >  block/blk-settings.c   | 6 +++---
> >  block/blk.h            | 8 ++++++--
> >  include/linux/blkdev.h | 2 ++
> >  4 files changed, 12 insertions(+), 6 deletions(-)
> > 
> > diff --git a/block/blk-merge.c b/block/blk-merge.c
> > index 15cd231d560c..b55c52a42303 100644
> > --- a/block/blk-merge.c
> > +++ b/block/blk-merge.c
> > @@ -329,7 +329,7 @@ int bio_split_rw_at(struct bio *bio, const struct queue_limits *lim,
> >  
> >  		if (nsegs < lim->max_segments &&
> >  		    bytes + bv.bv_len <= max_bytes &&
> > -		    bv.bv_offset + bv.bv_len <= PAGE_SIZE) {
> > +		    bv.bv_offset + bv.bv_len <= BLK_MIN_SEGMENT_SIZE) {
> 
> In which cases this "new" condition (for systems where PAGE_SIZE >
> BLK_MIN_SEGMENT_SIZE) is going to be true? In my test case below, is always

Yes.

> false, so it defaults to the else path. And I think that is going to be the
> "normal" case in these systems, is that right?

It depends on block size in your workload.

> 
> Doing a 'quick' test using next-20250213 on a 16k PAGE_SIZE system with the NVMe
> driver and a 4k lbs disk + ~1h fio sequential writes, I get results indicating
> a write performance degradation of ~0.8%. This is due to the new loop condition
> doing 4k steps rather than PS. I guess it's going to be slighly worse the larger
> the PAGE_SIZE the system is, and bio? So, why not decreasing the minimum segment

No, just one extra bvec_split_segs() is called once for any >4k page size 

Probably the opposite, effect on 64K page size could be smaller since
bio may have less bvec on same workload.

> size for the cases it's actually needed rather than making it now the default?
> 
> I've measured bio_split_rw_at latency in the above test with the following
> results:

Fine, I will add one min segment size for covering this case since you care the
little perf drop on >4k PAGE_SIZE. 


Thanks,
Ming





[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux