On Wed, Apr 23, 2025 at 07:09:28PM +0200, Jan Kara wrote:
> On Wed 16-04-25 09:58:30, Luis Chamberlain wrote:
> > On Tue, Apr 15, 2025 at 06:28:55PM +0200, Jan Kara wrote:
> > > > So I tried:
> > > >
> > > > root@e1-ext4-2k /var/lib/xfstests # fsck /dev/loop5 -y 2>&1 > log
> > > > e2fsck 1.47.2 (1-Jan-2025)
> > > > root@e1-ext4-2k /var/lib/xfstests # wc -l log
> > > > 16411 log
> > >
> > > Can you share the log please?
> >
> > Sure, here you go:
> >
> > https://github.com/linux-kdevops/20250416-ext4-jbd2-bh-migrate-corruption
> >
> > The last trace-0004.txt is a fresh one with Davidlohr's patches
> > applied. It has trace-0004-fsck.txt.
>
> Thanks for the data! I was staring at them for some time and at this point
> I'm leaning towards a conclusion that this is actually not a case of
> metadata corruption but rather a bug in ext4 transaction credit computation
> that is completely independent of page migration.
>
> Based on the e2fsck log you've provided the only damage in the filesystem
> is from the aborted transaction handle in the middle of extent tree growth.
> So nothing points to a lost metadata write or anything like that. And the
> credit reservation for page writeback is indeed somewhat racy - we reserve
> number of transaction credits based on current tree depth. However by the
> time we get to ext4_ext_map_blocks() another process could have modified
> the extent tree so we may need to modify more blocks than we originally
> expected and reserved credits for.
>
> Can you give attached patch a try please?
>
> 								Honza
> --
> Jan Kara <jack@xxxxxxxx>
> SUSE Labs, CR

> From 4c53fb9f4b9b3eb4a579f69b7adcb6524d55629c Mon Sep 17 00:00:00 2001
> From: Jan Kara <jack@xxxxxxx>
> Date: Wed, 23 Apr 2025 18:10:54 +0200
> Subject: [PATCH] ext4: Fix calculation of credits for extent tree modification
>
> Luis and David are reporting that after running generic/750 test for 90+
> hours on 2k ext4 filesystem, they are able to trigger a warning in
> jbd2_journal_dirty_metadata() complaining that there are not enough
> credits in the running transaction started in ext4_do_writepages().
>
> Indeed the code in ext4_do_writepages() is racy and the extent tree can
> change between the time we compute credits necessary for extent tree
> computation and the time we actually modify the extent tree. Thus it may
> happen that the number of credits actually needed is higher. Modify
> ext4_ext_index_trans_blocks() to count with the worst case of maximum
> tree depth.
>
> Link: https://lore.kernel.org/all/20250415013641.f2ppw6wov4kn4wq2@offworld
> Reported-by: Davidlohr Bueso <dave@xxxxxxxxxx>
> Reported-by: Luis Chamberlain <mcgrof@xxxxxxxxxx>
> Signed-off-by: Jan Kara <jack@xxxxxxx>

I kicked off tests! Let's see after ~ 90 hours!

  Luis
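
To make the race described in the quoted commit message concrete, here is a toy model in C. It is only a sketch and not the kernel code or the attached patch: every name (toy_credits_current_depth, toy_credits_worst_case, toy_blocks_needed, TOY_MAX_DEPTH) is hypothetical, and both the "two blocks per tree level" cost and the maximum depth of 5 are assumptions picked for illustration. It shows why a reservation based on the depth seen at reservation time can fall short once another process has grown the tree, while a reservation based on the maximum depth cannot.

```c
/*
 * Toy model of the credit race. NOT kernel code: all names and
 * constants below are hypothetical and chosen for illustration only.
 */

/* Assumed upper bound on extent tree depth in this toy model. */
#define TOY_MAX_DEPTH 5

/* Assumed cost: number of blocks touched per tree level. */
#define TOY_BLOCKS_PER_LEVEL 2

/* Racy estimate: uses the tree depth observed when credits are reserved. */
int toy_credits_current_depth(int depth_at_reserve)
{
	return TOY_BLOCKS_PER_LEVEL * depth_at_reserve;
}

/* Worst-case estimate: independent of the depth observed at reserve time. */
int toy_credits_worst_case(void)
{
	return TOY_BLOCKS_PER_LEVEL * TOY_MAX_DEPTH;
}

/* Blocks that actually need modifying when the tree is finally changed. */
int toy_blocks_needed(int depth_at_modify)
{
	return TOY_BLOCKS_PER_LEVEL * depth_at_modify;
}

/* Returns 1 if the reservation covers the modification, 0 otherwise. */
int toy_reservation_sufficient(int reserved, int depth_at_modify)
{
	return reserved >= toy_blocks_needed(depth_at_modify);
}
```

In this model, reserving with toy_credits_current_depth(1) and then modifying a tree that has meanwhile grown to depth 2 leaves the handle short of credits, whereas toy_credits_worst_case() covers any depth up to TOY_MAX_DEPTH at the cost of reserving more than is usually needed.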