On 2025/7/4 19:17, Jan Kara wrote:
> On Thu 03-07-25 19:33:32, Zhang Yi wrote:
>> On 2025/7/3 15:26, Naresh Kamboju wrote:
>>> On Thu, 26 Jun 2025 at 19:23, Zhang Yi <yi.zhang@xxxxxxxxxxxxxxx> wrote:
>>>> On 2025/6/26 20:31, Naresh Kamboju wrote:
>>>>> Regressions noticed on arm64 devices while running the LTP syscalls
>>>>> mmap16 test case on Linux next-20250616..next-20250626 with the extra
>>>>> build config fragment CONFIG_ARM64_64K_PAGES=y; a kernel warning is
>>>>> triggered.
>>>>>
>>>>> Not reproducible with a 4K page size.
>>>>>
>>>>> Test environments:
>>>>> - Dragonboard-410c
>>>>> - Juno-r2
>>>>> - rk3399-rock-pi-4b
>>>>> - qemu-arm64
>>>>>
>>>>> Regression Analysis:
>>>>> - New regression? Yes
>>>>> - Reproducibility? Yes
>>>>>
>>>>> Test regression: next-20250626 LTP mmap16 WARNING fs jbd2
>>>>> transaction.c start_this_handle
>>>>>
>>>>> Reported-by: Linux Kernel Functional Testing <lkft@xxxxxxxxxx>
>>>>
>>>> Thank you for the report. The block size for this test is 1 KB, so I
>>>> suspect this is the issue with insufficient journal credits that we
>>>> are going to resolve.
>>>
>>> I have applied your patch set [1] and tested, but it did not fix the
>>> reported regressions.
>>> Am I missing anything?
>>>
>>> [1] https://lore.kernel.org/linux-ext4/20250611111625.1668035-1-yi.zhang@xxxxxxxxxxxxxxx/
>>>
>>
>> Hmm. It seems that my fix for the insufficient journal credits series
>> cannot handle cases with a 64K page size. The problem is that the folio
>> size can be up to 128M, and 'rsv_blocks' in ext4_do_writepages() can
>> reach 1577 on filesystems with a 1K block size, which is too large.
>
> Firstly, I think that 128M folios are too big for our current approaches
> (in ext4 at least) to sensibly work. Maybe we could limit max folio order
> in ext4 mappings to max 1024 blocks per folio or something like that? For
> realistic setups with 4k blocksize this means 4M folios which is not really
> limiting for x86. Arm64 or ppc64 could do bigger but the gain for even
> larger folios is diminishingly small anyway.

Yeah, I agree.

>
> Secondly, I'm wondering that even with 1577 reserved blocks we shouldn't
> really overflow the journal unless you make it really small. But maybe
> that's what the test does...

Yes, the test creates a filesystem image with a block size of 1 KB and a
journal consisting of 1024 blocks.

>
>> Therefore, at this time, I think we should disable the large folio
>> support for 64K page size. Then, we may need to reserve rsv_blocks
>> for one extent and implement the same journal extension logic for
>> reserved credits.
>>
>> Ted and Jan, what do you think?
>
> I wouldn't really disable it for 64K page size. I'd rather limit max folio
> order to 1024 blocks. That actually makes sense as a general limitation of
> our current implementation (linked lists of bhs in each folio don't really
> scale). We can use mapping_set_folio_order_range() for that instead of
> mapping_set_large_folios().
>

Indeed, after noticing that Btrfs also calls mapping_set_folio_order_range()
to set the maximum folio size, I thought this solution should work, so I
changed my mind and am going to try it. However, I think limiting the max
folio order to 1024 blocks is somewhat too small. I'd like to limit it to
2048 blocks instead, because that allows a filesystem with a 1KB block size
to reach PMD-sized folios on x86 with 4KB pages, which is useful for
increasing TLB efficiency and reducing page fault handling overhead.
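To make the arithmetic explicit: a cap of 2048 blocks corresponds to a
maximum folio order of log2(2048 * blocksize / PAGE_SIZE), i.e.
11 + i_blkbits - PAGE_SHIFT. With a 1KB block size (i_blkbits = 10) on
4KB pages (PAGE_SHIFT = 12) that is order 9, i.e. 512 pages = 2MB,
exactly the x86 PMD size; on 64KB pages (PAGE_SHIFT = 16) it is order 5,
i.e. 32 pages, again a 2MB cap.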
I defined a new macro, something like this:

/*
 * Limit the maximum folio order to 2048 blocks to prevent overestimation
 * of the reserve handle credits during folio writeback in environments
 * where PAGE_SIZE exceeds 4KB.
 */
#define EXT4_MAX_PAGECACHE_ORDER(i)	\
	min(MAX_PAGECACHE_ORDER, (11 + (i)->i_blkbits - PAGE_SHIFT))

What do you think?

Best regards,
Yi.
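As a rough sketch of how this could be wired up (the helper name below is
hypothetical; mapping_set_folio_order_range() is the existing pagemap API
Jan pointed at, used in place of mapping_set_large_folios()):

/* Hypothetical helper, called when setting up the inode's mapping. */
static void ext4_set_inode_mapping_order(struct inode *inode)
{
	/* Allow folio orders from 0 up to the 2048-block cap above. */
	mapping_set_folio_order_range(inode->i_mapping, 0,
				      EXT4_MAX_PAGECACHE_ORDER(inode));
}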