Re: [PATCH] block: fix atomic write limits for stacked devices

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 6/4/25 12:59 PM, John Garry wrote:
> On 03/06/2025 16:16, Nilay Shroff wrote:
>>>> diff --git a/block/blk-settings.c b/block/blk-settings.c
>>>> index a000daafbfb4..35c1354dd5ae 100644
>>>> --- a/block/blk-settings.c
>>>> +++ b/block/blk-settings.c
>>>> @@ -598,8 +598,14 @@ static bool blk_stack_atomic_writes_head(struct queue_limits *t,
>>>>            !blk_stack_atomic_writes_boundary_head(t, b))
>>>>            return false;
>>>>    -    if (t->io_min <= SECTOR_SIZE) {
>>>> -        /* No chunk sectors, so use bottom device values directly */
>>>> +    if (t->io_min <= SECTOR_SIZE ||
>>>> +        (!(t->atomic_write_hw_unit_max % t->io_min) &&
>>>> +         !(t->atomic_write_hw_unit_min % t->io_min))) {
>>> So will this now break md raid0/10 or dm stripe when t->io_min is set (> SECTOR_SIZE)? I mean, for md raid0/10 or dm-stripe, we should be taking the chunk size into account there and we now don't seem to be doing so now.
>>>
>> Shouldn't it be work good if we ensure that a bottom device atomic write unit min/max are
>> aligned with the top device chunk sectors then top device could simply copy and use the
>> bottom device atomic write limits directly? 
> 
> You need to be more specific when you say "aligned".
> 
I meant to say bottom device atomic write unit min/max are multiples of top device chunk sectors .

> Consider chunk sectors for md raid0 is 16KB and b->atomic_write_hw_unit_max is 32KB. Then we must reduce t->atomic_write_hw_unit_max to 16KB (so cannot use the value in b->atomic_write_hw_unit_max directly).
Okay, I think I understood your concerns here.

> 
>> Or do we have a special case for raid0/10 and
>> dm-strip which can't handle atomic write if chunk size for stacked device is greater than
>> SECTOR_SIZE?
>>
>> BTW there's a typo in the above change, we should have the above if check written as below
>> (my bad):
>>      if (t->io_min <= SECTOR_SIZE ||
>>          (!(b->atomic_write_hw_unit_max % t->io_min) &&
>>           !(b->atomic_write_hw_unit_min % t->io_min))) {
>>      ...
>>      ...
>>
>>> What is the value of top device io_min and physical_block_size in your example?
>> The NVme disk which I am using has both t->io_min and t->physical_block_size set
>> to 4096.
> 
> I need to test further, but maybe we can change the check to this:
> 
> if (t->io_min <= SECTOR_SIZE || t->io_min == t->physical_block_size) {
>         /* No chunk sectors, so use bottom device values directly */
>         t->atomic_write_hw_unit_max = b->atomic_write_hw_unit_max;
>         t->atomic_write_hw_unit_min = b->atomic_write_hw_unit_min;
>         t->atomic_write_hw_max = b->atomic_write_hw_max;
>         return true;
> }

How about instead adding a new BLK_FEAT_STRIPED flag and then use it here while 
setting atomic limits as shown below:

diff --git a/block/blk-settings.c b/block/blk-settings.c
index a000daafbfb4..bf5d35282d42 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -598,8 +598,14 @@ static bool blk_stack_atomic_writes_head(struct queue_limits *t,
            !blk_stack_atomic_writes_boundary_head(t, b))
                return false;
 
-       if (t->io_min <= SECTOR_SIZE) {
-               /* No chunk sectors, so use bottom device values directly */
+       if (t->io_min <= SECTOR_SIZE || !(t->features & BLK_FEAT_STRIPED)) {
+               /*
+                * If there are no chunk sectors, or if the top device does not
+                * advertise the STRIPED feature (i.e., it's not a striped
+                * device like md-raid0 or dm-stripe), then we directly inherit
+                * the atomic write capabilities from the underlying (bottom)
+                * device.
+                */
                t->atomic_write_hw_unit_max = b->atomic_write_hw_unit_max;
                t->atomic_write_hw_unit_min = b->atomic_write_hw_unit_min;
                t->atomic_write_hw_max = b->atomic_write_hw_max;

I tested the above change with md-raid0 and dm-strip setup and seems to 
be working well. What do you think?

Thanks,
--Nilay






[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux