Re: [RFC 10/12] ext4/062: Atomic writes test for bigalloc using fio crc verifier on multiple files

Ojaswin Mujoo <ojaswin@xxxxxxxxxxxxx> · Fri, 20 Jun 2025 22:19:19 +0530

On Fri, Jun 20, 2025 at 03:01:40PM +0100, John Garry wrote:
> On 13/06/2025 06:37, Ojaswin Mujoo wrote:
> > On Thu, Jun 12, 2025 at 11:26:17AM +0100, John Garry wrote:
> > > On 11/06/2025 10:34, Ojaswin Mujoo wrote:
> > > > From: "Ritesh Harjani (IBM)"<ritesh.list@xxxxxxxxx>
> > > > 
> > > > Brute force all possible blocksize clustersize combination on a bigalloc
> > > > filesystem for stressing atomic write using fio data crc verifier. We run
> > > > multiple threads in parallel with each job writing to its own file. The
> > > > parallel jobs running on a constrained filesystem size ensure that we stress
> > > > the ext4 allocator to allocate contiguous extents.
> > > > 
> > > > Signed-off-by: Ritesh Harjani (IBM)<ritesh.list@xxxxxxxxx>
> > > > Signed-off-by: Ojaswin Mujoo<ojaswin@xxxxxxxxxxxxx>
> > > 
> > > RWF_ATOMIC does not guarantee that racing atomic writes and reads are
> > > serialised. That is what you are testing here, right?
> > > 
> > > NVMe and SCSI do guarantee this (serialisation). However, reads in the block
> > > layer may be split into multiple requests, even though unlikely.
> > 
> > Hey John,
> > 
> > We are not really testing the serialization here
> > (verify_write_sequence=0) but rather that multiple threads atomically
> > writing to the same file should never tear the write.
> > 
> > In the test, for each job, multiple threads are doing the write on the
> > same file with the same iosize so they should always overwrite each
> > other completely.  The verifier then ensures that the whole iosize chunk
> > written matches the checksum, which will only happen if the write is not
> > torn. That way we are able to ensure that even with multiple threads
> > writing the same ranges, we don't break the writes (the sequence doesn't
> > matter as long as it is not breaking)
> 
> So the threads are overwriting the same data range, right?
> 
> If so, as an experiment, try setting /sys/block/DEV/queue/max_sectors_kb
> lower than the bsize and see what happens...

Okay so I tried this flow:

1. Mount an ext4 FS
2. Set /sys/block/DEV/queue/max_sectors_kb to 4
3. Do a fio atomic write of 64k (awu_max = 64k in this case)

And I do see checksum failures, which means the write is getting
torn. I'm not sure of the exact block layer code around max_sectors_kb, but
I do see max_sectors_kb -- sets --> max_user_sectors --> max_sectors.
However, get_max_io_size() ignores max_sectors for atomic writes and
uses atomic_write_max_sectors instead, which should be correctly set.
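
For reference, here is a toy Python model of the limit selection described
above. It is only a sketch of the expected behaviour, not the kernel code:
the function names and the dict layout are made up for illustration, and it
ignores boundary and alignment handling entirely. With the values from this
experiment, an atomic 64k write should stay as one request, while a regular
64k write would be cut into 4k pieces.

```python
SECTOR_SIZE = 512  # bytes per sector


def max_io_sectors(limits, is_atomic):
    """Pick the per-request cap (hypothetical helper, simplified model).

    Assumption taken from the discussion above: atomic writes use
    atomic_write_max_sectors and ignore max_sectors.
    """
    if is_atomic:
        return limits["atomic_write_max_sectors"]
    return limits["max_sectors"]


def count_requests(io_bytes, limits, is_atomic):
    """Number of requests an I/O of io_bytes would be cut into."""
    io_sectors = io_bytes // SECTOR_SIZE
    cap = max_io_sectors(limits, is_atomic)
    return -(-io_sectors // cap)  # ceiling division


# Values from the experiment: max_sectors_kb = 4, awu_max = 64k.
limits = {
    "max_sectors": 4 * 1024 // SECTOR_SIZE,
    "atomic_write_max_sectors": 64 * 1024 // SECTOR_SIZE,
}

atomic_requests = count_requests(64 * 1024, limits, is_atomic=True)
regular_requests = count_requests(64 * 1024, limits, is_atomic=False)
print("atomic 64k write  ->", atomic_requests, "request(s)")
print("regular 64k write ->", regular_requests, "request(s)")
```

If this model matched what actually happens, the atomic write could not be
torn by max_sectors_kb alone, so the split presumably comes from somewhere
else.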

Hmm, I must be missing something. What is splitting the bio?

Regards,
ojaswin

> 
> Thanks,
> John