On 6/5/25 8:38 PM, John Garry wrote:
> The atomic write unit max is limited by any stack device stripe size.
>
> It is required that the atomic write unit is a power-of-2 factor of the
> stripe size.
>
> Currently we use io_min limit to hold the stripe size, and check for a
> io_min <= SECTOR_SIZE when deciding if we have a striped stacked device.
>
> Nilay reports that this causes a problem when the physical block size is
> greater than SECTOR_SIZE [0].
>
> Furthermore, io_min may be mutated when stacking devices, and this makes
> it a poor candidate to hold the stripe size. Such an example would be
> when the io_min is less than the physical block size.
>
> Use chunk_sectors to hold the stripe size, which is more appropriate.
>
> [0] https://lore.kernel.org/linux-block/888f3b1d-7817-4007-b3b3-1a2ea04df771@xxxxxxxxxxxxx/T/#mecca17129f72811137d3c2f1e477634e77f06781
>
> Signed-off-by: John Garry <john.g.garry@xxxxxxxxxx>
> ---
>  block/blk-settings.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/block/blk-settings.c b/block/blk-settings.c
> index a000daafbfb4..5b0f1a854e81 100644
> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -594,11 +594,13 @@ static bool blk_stack_atomic_writes_boundary_head(struct queue_limits *t,
>  static bool blk_stack_atomic_writes_head(struct queue_limits *t,
>  						struct queue_limits *b)
>  {
> +	unsigned int chunk_size = t->chunk_sectors << SECTOR_SHIFT;
> +
>  	if (b->atomic_write_hw_boundary &&
>  	    !blk_stack_atomic_writes_boundary_head(t, b))
>  		return false;
>
> -	if (t->io_min <= SECTOR_SIZE) {
> +	if (!t->chunk_sectors) {
>  		/* No chunk sectors, so use bottom device values directly */
>  		t->atomic_write_hw_unit_max = b->atomic_write_hw_unit_max;
>  		t->atomic_write_hw_unit_min = b->atomic_write_hw_unit_min;
> @@ -617,12 +619,12 @@ static bool blk_stack_atomic_writes_head(struct queue_limits *t,
>  	 * aligned with both limits, i.e. 8K in this example.
>  	 */
>  	t->atomic_write_hw_unit_max = b->atomic_write_hw_unit_max;
> -	while (t->io_min % t->atomic_write_hw_unit_max)
> +	while (chunk_size % t->atomic_write_hw_unit_max)
>  		t->atomic_write_hw_unit_max /= 2;
>
>  	t->atomic_write_hw_unit_min = min(b->atomic_write_hw_unit_min,
>  					  t->atomic_write_hw_unit_max);
> -	t->atomic_write_hw_max = min(b->atomic_write_hw_max, t->io_min);
> +	t->atomic_write_hw_max = min(b->atomic_write_hw_max, chunk_size);
>
>  	return true;
>  }

This works well with my NVMe disk, which supports atomic writes. My only
concern is the case where t->chunk_sectors is also set for an NVMe disk.
I see that nvme_set_chunk_sectors() initializes chunk_sectors for NVMe,
and the value it assigns to lim->chunk_sectors represents "noiob" (i.e.
Namespace Optimal I/O Boundary). My disk has "noiob" set to zero, but if
it were non-zero, would that break the above logic for NVMe atomic writes?

Thanks,
--Nilay
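
To make the question concrete, below is a minimal userspace sketch of the
arithmetic in the patched blk_stack_atomic_writes_head(). It is not kernel
code: struct limits_sketch and stack_atomic_sketch() are hypothetical names,
the boundary check is left out, the third assignment in the "no chunk
sectors" branch is an assumption (the hunk above shows only the first two),
and it assumes a non-zero "noiob" would end up in t->chunk_sectors of the
stacked device.

```c
#include <assert.h>
#include <stdbool.h>

#define SECTOR_SHIFT 9

/* Hypothetical, cut-down stand-in for struct queue_limits. */
struct limits_sketch {
	unsigned int chunk_sectors;
	unsigned int atomic_write_hw_unit_min;
	unsigned int atomic_write_hw_unit_max;
	unsigned int atomic_write_hw_max;
};

static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * Mirrors only the arithmetic of the patched function; the
 * atomic_write_hw_boundary check is intentionally omitted.
 */
static bool stack_atomic_sketch(struct limits_sketch *t,
				const struct limits_sketch *b)
{
	unsigned int chunk_size = t->chunk_sectors << SECTOR_SHIFT;
	unsigned int unit_max = b->atomic_write_hw_unit_max;

	if (!t->chunk_sectors) {
		/* No chunk sectors, so use bottom device values directly */
		t->atomic_write_hw_unit_max = b->atomic_write_hw_unit_max;
		t->atomic_write_hw_unit_min = b->atomic_write_hw_unit_min;
		/* Assumption: not visible in the quoted hunk */
		t->atomic_write_hw_max = b->atomic_write_hw_max;
		return true;
	}

	/* The halving loop only terminates safely for a non-zero power of 2 */
	assert(unit_max != 0 && (unit_max & (unit_max - 1)) == 0);

	/* Halve until the unit max evenly divides the chunk size */
	while (chunk_size % unit_max)
		unit_max /= 2;

	t->atomic_write_hw_unit_max = unit_max;
	t->atomic_write_hw_unit_min = min_uint(b->atomic_write_hw_unit_min,
					       unit_max);
	t->atomic_write_hw_max = min_uint(b->atomic_write_hw_max, chunk_size);
	return true;
}
```

With illustrative values of t->chunk_sectors = 48 (24K) and a bottom unit
max of 16K, the loop halves once and settles on 8K, while chunk_sectors = 0
copies the bottom values unchanged. So in this sketch a non-zero value would
lower the resulting unit max rather than disable atomic writes; whether that
is the intended outcome for "noiob" is the open question.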