On Mon, Apr 21, 2025 at 11:29:52AM -0500, Theodore Ts'o wrote:
> On Mon, Apr 21, 2025 at 08:54:33AM -0700, Darrick J. Wong wrote:
> >
> > I might be wading in deeper than I know, but it seems to me that
> > after a crash recovery it's not great to see 64k files with no blocks
> > allocated to them at all.
>
> Well, what ext4 in no dioread_nolock mode will do is to allocate
> blocks marked as uninitialized, and then write the data blocks, and
> then change them to be marked as initialized.  So it's not that there
> are no blocks allocated at all; but that there are blocks allocated
> but attempts to read from the file will return all zeros.

But that's not what I see -- on my system, I get files with i_size ==
65536, but no mappings at all:

--- /run/fstests/bin/tests/generic/044.out	2025-04-17 14:52:53.521658441 -0700
+++ /var/tmp/fstests/generic/044.out.bad	2025-04-21 08:46:15.328757541 -0700
@@ -1 +1,95 @@
 QA output created by 044
+corrupt file /opt/906 - non-zero size but no extents
+corrupt file /opt/907 - non-zero size but no extents

# mount /opt/
# ls /opt/906
-rw------- 1 root root 65536 Apr 21 08:45 /opt/906
# filefrag -v !$
filefrag -v /opt/906
Filesystem type is: ef53
File size of /opt/906 is 65536 (16 blocks of 4096 bytes)
/opt/906: 0 extents found

...unless ext4 is removing those unwritten blocks during recovery?

> This is non-ideal, but my main concern is a performance issue, not a
> correctness one.  We're modifying the metadata blocks twice, and while
> most of the time the two modifications happen within a single
> transaction (so the user won't actually see the zero blocks after the
> crash _most_ of the time), the extra journal handles means extra CPU
> and extra jbd2 spinlocks getting taken and released.
>
> So it's on my todo list to fix, in my copious spare time.....
>
> > (I don't care about the others whining about _exclude_fs -- if
> > you make the design decision that the current ext4 behavior is
> > good enough, then the test cannot ever be satisfied so let's
> > capture that in the test itself, not in everyone's scattered
> > exclusion lists.)
>
> Fair enough, I can try, and see if we get people attempting to NACK
> the changes this time around.  Support beating back the whiners would
> be appreciated.

Ok, I'll chime in whenever I see patches. :)

> I can also see if Luis's LBS changes might make it easier to deal with
> the bigalloc test bugs.  It will mean exposing the concept of cluster
> allocation size (as distinct from block size) to the core xfstests
> infrastructure, and again, we can see if people try to NACK the
> changes.  This will require a bit more work, however, as this is a big
> difference between XFS's LBS feature and ext4's bigalloc feature.

That shouldn't be a problem; _xfs_get_file_block_size has returned the
allocation unit size for XFS files for quite some time, despite being
badly named.

--D

>
> 					- Ted
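
P.S. For anyone who wants to flag the "non-zero size but no extents"
condition by hand, here is a minimal sketch that parses filefrag -v
output of the form pasted above.  It is not taken from generic/044; the
function names and the SAMPLE string are made up for illustration, and
it assumes the "File size of ..." and "N extents found" summary lines
look like the ones in my paste.

```python
import re


def parse_filefrag(output: str) -> tuple[int, int]:
    """Return (size_in_bytes, extent_count) from `filefrag -v` text.

    Hypothetical helper: it only looks at the two summary lines and
    ignores the per-extent table, if any.
    """
    size = re.search(r"^File size of .* is (\d+) \(", output, re.MULTILINE)
    extents = re.search(r": (\d+) extents? found", output)
    if size is None or extents is None:
        raise ValueError("unrecognized filefrag output")
    return int(size.group(1)), int(extents.group(1))


def has_size_but_no_extents(output: str) -> bool:
    """True when the file reports a non-zero size but zero extents."""
    size, extents = parse_filefrag(output)
    return size > 0 and extents == 0


# Sample copied from the filefrag run on /opt/906 earlier in this mail.
SAMPLE = """\
Filesystem type is: ef53
File size of /opt/906 is 65536 (16 blocks of 4096 bytes)
/opt/906: 0 extents found
"""

if __name__ == "__main__":
    print(has_size_but_no_extents(SAMPLE))
```

If ext4 really did leave unwritten extents behind, I would expect the
extent count to be non-zero and this check to come back false.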