On Wed, Jul 16, 2025 at 03:26:02PM +0800, Yu Kuai wrote:
> Hi,
>
> On 2025/07/15 23:56, Coly Li wrote:
> > Then when my raid5 array sets its queue limits, because its io_opt is 64KiB*7,
> > and the raid component sata hard drive has io_opt with 32767 sectors, by
> > calculation in block/blk-settings.c:blk_stack_limits() at line 753,
> > 753         t->io_opt = lcm_not_zero(t->io_opt, b->io_opt);
> > the calculated opt_io_size of my raid5 array is more than 1GiB. It is too large.
>
> Perhaps we should at least provide a helper so that raid5 can prefer its
> own io_opt over the underlying disk's io_opt. Because of raid5's internal
> implementation, chunk_size * data disks is the best choice; there will be
> significant differences in performance if IO is not aligned with io_opt.
>
> Something like the following:
>

Yeah, this one also solves my issue. Thanks.

Coly Li

> diff --git a/block/blk-settings.c b/block/blk-settings.c
> index a000daafbfb4..04e7b4808e7a 100644
> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -700,6 +700,7 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
>  		t->features &= ~BLK_FEAT_POLL;
>
>  	t->flags |= (b->flags & BLK_FLAG_MISALIGNED);
> +	t->flags |= (b->flags & BLK_FLAG_STACK_IO_OPT);
>
>  	t->max_sectors = min_not_zero(t->max_sectors, b->max_sectors);
>  	t->max_user_sectors = min_not_zero(t->max_user_sectors,
> @@ -750,7 +751,10 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
>  			b->physical_block_size);
>
>  	t->io_min = max(t->io_min, b->io_min);
> -	t->io_opt = lcm_not_zero(t->io_opt, b->io_opt);
> +	if (!t->io_opt || !(t->flags & BLK_FLAG_STACK_IO_OPT) ||
> +	    (b->flags & BLK_FLAG_STACK_IO_OPT))
> +		t->io_opt = lcm_not_zero(t->io_opt, b->io_opt);
> +
>  	t->dma_alignment = max(t->dma_alignment, b->dma_alignment);
>
>  	/* Set non-power-of-2 compatible chunk_sectors boundary */
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 5b270d4ee99c..bb482ec40506 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -7733,6 +7733,7 @@ static int raid5_set_limits(struct mddev *mddev)
>  	lim.io_min = mddev->chunk_sectors << 9;
>  	lim.io_opt = lim.io_min * (conf->raid_disks - conf->max_degraded);
>  	lim.features |= BLK_FEAT_RAID_PARTIAL_STRIPES_EXPENSIVE;
> +	lim.flags |= BLK_FLAG_STACK_IO_OPT;
>  	lim.discard_granularity = stripe;
>  	lim.max_write_zeroes_sectors = 0;
>  	mddev_stack_rdev_limits(mddev, &lim, 0);
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 332b56f323d9..65317e93790e 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -360,6 +360,9 @@ typedef unsigned int __bitwise blk_flags_t;
>  /* passthrough command IO accounting */
>  #define BLK_FLAG_IOSTATS_PASSTHROUGH ((__force blk_flags_t)(1u << 2))
>
> +/* ignore underlying disks io_opt */
> +#define BLK_FLAG_STACK_IO_OPT ((__force blk_flags_t)(1u << 3))
> +
>  struct queue_limits {
>  	blk_features_t features;
>  	blk_flags_t flags;
>
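
For reference, the arithmetic behind the "more than 1GiB" figure and the effect of the proposed flag check can be modelled in userspace. This is only a rough sketch, not kernel code: struct limits, STACK_IO_OPT_FLAG, gcd_ull(), lcm_not_zero_ull() and the two stack_io_opt_*() helpers are hypothetical stand-ins for the kernel names in the patch, and it assumes 512-byte sectors with io_opt held in bytes.

```c
/* Hypothetical userspace model of the io_opt stacking discussed above. */

#define STACK_IO_OPT_FLAG (1u << 3)	/* stand-in for BLK_FLAG_STACK_IO_OPT */

struct limits {				/* stand-in for struct queue_limits */
	unsigned int flags;
	unsigned long long io_opt;	/* bytes */
};

static unsigned long long gcd_ull(unsigned long long a, unsigned long long b)
{
	while (b) {
		unsigned long long r = a % b;

		a = b;
		b = r;
	}
	return a;
}

/* LCM of a and b; if either value is zero, return the other one. */
static unsigned long long lcm_not_zero_ull(unsigned long long a,
					   unsigned long long b)
{
	if (a && b)
		return a / gcd_ull(a, b) * b;
	return b ? b : a;
}

/* Behaviour before the patch: always combine both io_opt values by LCM. */
static void stack_io_opt_old(struct limits *t, const struct limits *b)
{
	t->io_opt = lcm_not_zero_ull(t->io_opt, b->io_opt);
}

/*
 * Behaviour with the proposed check: a top device that has the flag set
 * and already has a non-zero io_opt keeps it, unless the bottom device
 * carries the flag as well.
 */
static void stack_io_opt_new(struct limits *t, const struct limits *b)
{
	t->flags |= b->flags & STACK_IO_OPT_FLAG;

	if (!t->io_opt || !(t->flags & STACK_IO_OPT_FLAG) ||
	    (b->flags & STACK_IO_OPT_FLAG))
		t->io_opt = lcm_not_zero_ull(t->io_opt, b->io_opt);
}
```

With the numbers from the report, t.io_opt = 64KiB*7 = 458752 bytes and b.io_opt = 32767 * 512 = 16776704 bytes. Their gcd is 3584, so stack_io_opt_old() gives 128 * 16776704 = 2147418112 bytes (2GiB minus 64KiB). stack_io_opt_new() with STACK_IO_OPT_FLAG set on t leaves io_opt at 458752.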