On Mon, May 19, 2025 at 12:24:32PM +0200, Jan Kara wrote: > On Fri 16-05-25 13:38:00, Brian Foster wrote: > > fstest generic/388 occasionally reproduces a crash that looks as > > follows: > > > > BUG: kernel NULL pointer dereference, address: 0000000000000000 > > ... > > Call Trace: > > <TASK> > > ext4_block_zero_page_range+0x30c/0x380 [ext4] > > ext4_truncate+0x436/0x440 [ext4] > > ext4_process_orphan+0x5d/0x110 [ext4] > > ext4_orphan_cleanup+0x124/0x4f0 [ext4] > > ext4_fill_super+0x262d/0x3110 [ext4] > > get_tree_bdev_flags+0x132/0x1d0 > > vfs_get_tree+0x26/0xd0 > > vfs_cmd_create+0x59/0xe0 > > __do_sys_fsconfig+0x4ed/0x6b0 > > do_syscall_64+0x82/0x170 > > ... > > > > This occurs when processing a symlink inode from the orphan list. The > > partial block zeroing code in the truncate path calls > > ext4_dirty_journalled_data() -> folio_mark_dirty(). The latter calls > > mapping->a_ops->dirty_folio(), but symlink inodes are not assigned an > > a_ops vector in ext4, hence the crash. > > > > To avoid this problem, update the ext4_dirty_journalled_data() helper to > > only mark the folio dirty on regular files (for which a_ops is > > assigned). This also matches the journaling logic in the ext4_symlink() > > creation path, where ext4_handle_dirty_metadata() is called directly. > > > > Fixes: d84c9ebdac1e ("ext4: Mark pages with journalled data dirty") > > Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx> > > Yeah, I forgot about this subtlety when writing d84c9ebdac1e. Good catch > and thanks for fixing this up! The fix looks good. Feel free to add: > > Reviewed-by: Jan Kara <jack@xxxxxxx> > > > --- > > > > Hi Jan, > > > > I'm not intimately familiar with the jbd machinery here so this may well > > be wrong, but it survives my testing so far. I initially hacked this to > > mark the buffer dirty instead of the folio, but discovered jbd2 doesn't > > seem to like that. I suspect that is because jbd2 wants to dirty/submit > > the buffer itself after it's logged..? > > > > Anyways, after that, this struck me as most consistent with behavior > > prior to d84c9ebdac1e and/or with the creation path, so I'm floating > > this as a first pass. Is my understanding of d84c9ebdac1e correct in > > that it is mainly an optimization to allow writeback to force the > > journaling mechanism vs. otherwise waiting for the other way around > > (i.e. a journal commit to mark folios dirty)? Thoughts appreciated.. > > Well, the motivation for d84c9ebdac1e was not so much an optimization but > rather to provide better visibility to the generic code what needs writing > out. Otherwise we had to special-case data journalling in a lot of places > that tried to do "clean the inode & purge the page cache" because simple > filemap_write_and_wait() was not enough to get the dirty pages in the inode > to disk. > Ah, I see. Thanks for the insight (and review). Brian > Honza > > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > > index 94c7d2d828a6..d3c138003ad3 100644 > > --- a/fs/ext4/inode.c > > +++ b/fs/ext4/inode.c > > @@ -1009,7 +1009,12 @@ int ext4_walk_page_buffers(handle_t *handle, struct inode *inode, > > */ > > static int ext4_dirty_journalled_data(handle_t *handle, struct buffer_head *bh) > > { > > - folio_mark_dirty(bh->b_folio); > > + struct folio *folio = bh->b_folio; > > + struct inode *inode = folio->mapping->host; > > + > > + /* only regular files have a_ops */ > > + if (S_ISREG(inode->i_mode)) > > + folio_mark_dirty(folio); > > return ext4_handle_dirty_metadata(handle, NULL, bh); > > } > > > > -- > > 2.49.0 > > > -- > Jan Kara <jack@xxxxxxxx> > SUSE Labs, CR >