On 8/18/25 11:57 AM, Yu Kuai wrote:
>>> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
>>> index 023649fe2476..989acd8abd98 100644
>>> --- a/drivers/md/raid5.c
>>> +++ b/drivers/md/raid5.c
>>> @@ -7730,6 +7730,7 @@ static int raid5_set_limits(struct mddev *mddev)
>>>  	lim.io_min = mddev->chunk_sectors << 9;
>>>  	lim.io_opt = lim.io_min * (conf->raid_disks - conf->max_degraded);
>>
>> It seems to me that moving this *after* the call to mddev_stack_rdev_limits()
>> would simply overwrite the io_opt limit coming from stacking and get you the
>> same result as your patch, but without adding the new limit flags.
>
> This is not enough, we have the case array is build on the top of
> another array, we still need the lcm_not_zero() to not break this case.
> And I would expect this flag for all the arrays, not just raid5.

Nothing prevents you from doing that in the md code. The block layer limit
stacking provides a sensible default. If the block device driver does not
like that default, it is free to change it for whatever valid reason it has.
As I said, that is what the DM .io_hints target driver method is for.

As for "expecting this flag for all arrays", that is optimistic at best. For
SCSI hardware RAID, as discussed already, the optimal I/O size is *not* the
stripe size. And good luck with any AHCI-based hardware RAID...

-- 
Damien Le Moal
Western Digital Research
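
P.S. For illustration only: a minimal sketch of how a DM target can override
the io_opt value produced by limit stacking from its .io_hints method. The
example_ctx structure, its field, and the target name are invented for this
sketch; they are not taken from any existing target.

#include <linux/module.h>
#include <linux/blkdev.h>
#include <linux/device-mapper.h>

/* Hypothetical per-target context; the field name is made up for the sketch. */
struct example_ctx {
	unsigned int preferred_io_bytes;
};

/*
 * .io_hints is called after the core has stacked the limits of the
 * underlying devices, so a value set here replaces the stacked default.
 */
static void example_io_hints(struct dm_target *ti, struct queue_limits *limits)
{
	struct example_ctx *ctx = ti->private;

	limits->io_opt = ctx->preferred_io_bytes;
}

static struct target_type example_target = {
	.name     = "example",
	.version  = {1, 0, 0},
	.module   = THIS_MODULE,
	.io_hints = example_io_hints,
	/* .ctr/.dtr/.map omitted for brevity */
};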