On 2025/6/20 0:44, Jan Kara wrote: > On Wed 11-06-25 19:16:24, Zhang Yi wrote: >> From: Zhang Yi <yi.zhang@xxxxxxxxxx> >> >> This partially reverts commit d6bf294773a4 ("ext4/jbd2: convert >> jbd2_journal_blocks_per_page() to support large folio"). This >> jbd2_journal_blocks_per_folio() will lead to a significant >> overestimation of journal credits. Since we still reserve credits for >> one page and attempt to extend and restart handles during large folio >> writebacks, so we should convert this helper back to >> ext4_journal_blocks_per_page(). >> >> Signed-off-by: Zhang Yi <yi.zhang@xxxxxxxxxx> > > Here I'm not decided. Does it make any particular sense to reserve credits > for one *page* worth of blocks when pages don't have any particular meaning > in our writeback code anymore? We could reserve credits just for one > physical extent and that should be enough. Indeed, reserving credits for a single page is no longer suitable in the currently folio based context. It do seems more appropriate to allocate credits for a single extent. > For blocksize == pagesize (most > common configs) this would be actually equivalent. If blocksize < pagesize, > this could force us to do some more writeback retries and thus get somewhat > higher writeback CPU overhead but do we really care for these configs? It > is well possible I've overlooked something and someone will spot a > performance regression in practical setup with this in which case we'd have > to come up with something more clever but I think it's worth it to start > simple and complicate later. This can indeed be a problem if the file system is already fragmented enough. However, thanks to the credits extension logic in __ext4_journal_ensure_credits(), I suppose that on most file system images, it will not trigger excessive retry operations. Besides, although there might be some lock cost in jbd2_journal_extend(), I suppose it won't be a big deal. Perhaps we could reserve more credits through a complex formula at the outset, which would lower the cost of expanding the credits. But I don't think this will help much in reducing the number of retries, it may only be helpful in extreme cases (the running transaction stats to commit, we cannot extend it). So, I think we can implement it by reserving for an extent for the time being. Do you agree? Thanks, Yi. > > >> --- >> fs/ext4/ext4_jbd2.h | 7 +++++++ >> fs/ext4/inode.c | 6 +++--- >> fs/jbd2/journal.c | 6 ++++++ >> include/linux/jbd2.h | 1 + >> 4 files changed, 17 insertions(+), 3 deletions(-) >> >> diff --git a/fs/ext4/ext4_jbd2.h b/fs/ext4/ext4_jbd2.h >> index 63d17c5201b5..c0ee756cb34c 100644 >> --- a/fs/ext4/ext4_jbd2.h >> +++ b/fs/ext4/ext4_jbd2.h >> @@ -326,6 +326,13 @@ static inline int ext4_journal_blocks_per_folio(struct inode *inode) >> return 0; >> } >> >> +static inline int ext4_journal_blocks_per_page(struct inode *inode) >> +{ >> + if (EXT4_JOURNAL(inode) != NULL) >> + return jbd2_journal_blocks_per_page(inode); >> + return 0; >> +} >> + >> static inline int ext4_journal_force_commit(journal_t *journal) >> { >> if (journal) >> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c >> index 67e37dd546eb..9835145b1b27 100644 >> --- a/fs/ext4/inode.c >> +++ b/fs/ext4/inode.c >> @@ -2556,7 +2556,7 @@ static int mpage_map_and_submit_extent(handle_t *handle, >> */ >> static int ext4_da_writepages_trans_blocks(struct inode *inode) >> { >> - int bpp = ext4_journal_blocks_per_folio(inode); >> + int bpp = ext4_journal_blocks_per_page(inode); >> >> return ext4_meta_trans_blocks(inode, >> MAX_WRITEPAGES_EXTENT_LEN + bpp - 1, bpp); >> @@ -2634,7 +2634,7 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd) >> ext4_lblk_t lblk; >> struct buffer_head *head; >> handle_t *handle = NULL; >> - int bpp = ext4_journal_blocks_per_folio(mpd->inode); >> + int bpp = ext4_journal_blocks_per_page(mpd->inode); >> >> if (mpd->wbc->sync_mode == WB_SYNC_ALL || mpd->wbc->tagged_writepages) >> tag = PAGECACHE_TAG_TOWRITE; >> @@ -6255,7 +6255,7 @@ int ext4_meta_trans_blocks(struct inode *inode, int lblocks, int pextents) >> */ >> int ext4_writepage_trans_blocks(struct inode *inode) >> { >> - int bpp = ext4_journal_blocks_per_folio(inode); >> + int bpp = ext4_journal_blocks_per_page(inode); >> int ret; >> >> ret = ext4_meta_trans_blocks(inode, bpp, bpp); >> diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c >> index d480b94117cd..7fccb425907f 100644 >> --- a/fs/jbd2/journal.c >> +++ b/fs/jbd2/journal.c >> @@ -84,6 +84,7 @@ EXPORT_SYMBOL(jbd2_journal_start_commit); >> EXPORT_SYMBOL(jbd2_journal_force_commit_nested); >> EXPORT_SYMBOL(jbd2_journal_wipe); >> EXPORT_SYMBOL(jbd2_journal_blocks_per_folio); >> +EXPORT_SYMBOL(jbd2_journal_blocks_per_page); >> EXPORT_SYMBOL(jbd2_journal_invalidate_folio); >> EXPORT_SYMBOL(jbd2_journal_try_to_free_buffers); >> EXPORT_SYMBOL(jbd2_journal_force_commit); >> @@ -2661,6 +2662,11 @@ int jbd2_journal_blocks_per_folio(struct inode *inode) >> inode->i_sb->s_blocksize_bits); >> } >> >> +int jbd2_journal_blocks_per_page(struct inode *inode) >> +{ >> + return 1 << (PAGE_SHIFT - inode->i_sb->s_blocksize_bits); >> +} >> + >> /* >> * helper functions to deal with 32 or 64bit block numbers. >> */ >> diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h >> index 43b9297fe8a7..f35369c104ba 100644 >> --- a/include/linux/jbd2.h >> +++ b/include/linux/jbd2.h >> @@ -1724,6 +1724,7 @@ static inline int tid_geq(tid_t x, tid_t y) >> } >> >> extern int jbd2_journal_blocks_per_folio(struct inode *inode); >> +extern int jbd2_journal_blocks_per_page(struct inode *inode); >> extern size_t journal_tag_bytes(journal_t *journal); >> >> static inline int jbd2_journal_has_csum_v2or3(journal_t *journal) >> -- >> 2.46.1 >>