On 7 Jul 2025, at 10:23, Pankaj Raghav (Samsung) wrote:
> From: Pankaj Raghav <p.raghav@xxxxxxxxxxx>
>
> There are many places in the kernel where we need to zero out larger
> chunks, but the maximum segment we can zero out at a time via ZERO_PAGE
> is limited to PAGE_SIZE.
>
> This concern was raised during the review of adding Large Block Size
> support to XFS[1][2].
>
> This is especially annoying in block devices and filesystems where we
> attach multiple ZERO_PAGEs to the bio in different bvecs. With multipage
> bvec support in the block layer, it is much more efficient to send out
> larger zero pages as part of a single bvec.
>
> Some examples of places in the kernel where this could be useful:
> - blkdev_issue_zero_pages()
> - iomap_dio_zero()
> - vmalloc.c:zero_iter()
> - rxperf_process_call()
> - fscrypt_zeroout_range_inline_crypt()
> - bch2_checksum_update()
> ...
>
> We already have huge_zero_folio, which is allocated on demand and
> deallocated by the shrinker once it has no users left.
>
> At the moment, the huge_zero_folio refcount is tied to the lifetime of
> the process that created it. This might not work for the bio layer, as
> completions can be async and the process that created the
> huge_zero_folio might no longer be alive.
>
> Add a config option STATIC_PMD_ZERO_PAGE that will always allocate
> the huge_zero_folio via memblock, and it will never be freed.

Do the above users want a PMD-sized zero page or a 2MB zero page?
Because on systems with a non-4KB base page size, e.g., ARM64 with a
64KB base page, the PMD size is different: ARM64 with a 64KB base page
has 512MB PMD-sized pages. There, STATIC_PMD_ZERO_PAGE means losing
half a GB of memory. I am not sure that is acceptable.

Best Regards,
Yan, Zi
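[Editor's note: the PMD-size arithmetic behind the 512MB figure can be sketched as follows. This is a minimal illustration, assuming each page-table level packs PAGE_SIZE / 8 eight-byte descriptors, as on x86-64 and arm64 with a full translation level; the function name `pmd_size` is illustrative, not a kernel API.]

```python
def pmd_size(page_shift: int) -> int:
    """Size (in bytes) mapped by one PMD entry, assuming each
    page-table level holds PAGE_SIZE / 8 eight-byte descriptors."""
    ptrs_per_pte = (1 << page_shift) // 8          # PTEs per leaf table
    # A PMD entry covers PTRS_PER_PTE base pages:
    pmd_shift = page_shift + (ptrs_per_pte.bit_length() - 1)
    return 1 << pmd_shift

MiB = 1 << 20
print(pmd_size(12) // MiB)  # 4KB base pages  -> 2 MiB PMD
print(pmd_size(16) // MiB)  # 64KB base pages -> 512 MiB PMD
```

This is the crux of the objection: a statically allocated PMD-sized zero page costs 2MB on a 4KB-page kernel but 512MB on a 64KB-page arm64 kernel.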