Hi Pankaj,

On Thu, May 22, 2025 at 11:02:41AM +0200, Pankaj Raghav wrote:
> There are many places in the kernel where we need to zeroout larger
> chunks but the maximum segment we can zeroout at a time by ZERO_PAGE
> is limited by PAGE_SIZE.
>
> This concern was raised during the review of adding Large Block Size support
> to XFS[1][2].
>
> This is especially annoying in block devices and filesystems where we
> attach multiple ZERO_PAGEs to the bio in different bvecs. With multipage
> bvec support in block layer, it is much more efficient to send out
> larger zero pages as a part of a single bvec.
>
> Some examples of places in the kernel where this could be useful:
> - blkdev_issue_zero_pages()
> - iomap_dio_zero()
> - vmalloc.c:zero_iter()
> - rxperf_process_call()
> - fscrypt_zeroout_range_inline_crypt()
> - bch2_checksum_update()
> ...
>
> We already have huge_zero_folio that is allocated on demand, and it will be
> deallocated by the shrinker if there are no users of it left.
>
> But to use huge_zero_folio, we need to pass a mm struct and the
> put_folio needs to be called in the destructor. This makes sense for
> systems that have memory constraints but for bigger servers, it does not
> matter if the PMD size is reasonable (like x86).
>
> Add a config option THP_HUGE_ZERO_PAGE_ALWAYS that will always allocate
> the huge_zero_folio, and it will never be freed. This makes using the
> huge_zero_folio without having to pass any mm struct and a call to put_folio
> in the destructor.

I don't think this config option should be tied to THP. It's perfectly
sensible to have a configuration with HUGETLB and without THP.

> I have converted blkdev_issue_zero_pages() as an example as a part of
> this series.
>
> I will send patches to individual subsystems using the huge_zero_folio
> once this gets upstreamed.
>
> Looking forward to some feedback.
>
> [1] https://lore.kernel.org/linux-xfs/20231027051847.GA7885@xxxxxx/
> [2] https://lore.kernel.org/linux-xfs/ZitIK5OnR7ZNY0IG@xxxxxxxxxxxxx/
>
> Changes since v1:
> - Added the config option based on the feedback from David.
> - Removed iomap patches so that I don't clutter this series with too
>   many subsystems.
>
> Pankaj Raghav (2):
>   mm: add THP_HUGE_ZERO_PAGE_ALWAYS config option
>   block: use mm_huge_zero_folio in __blkdev_issue_zero_pages()
>
>  arch/x86/Kconfig |  1 +
>  block/blk-lib.c  | 15 +++++++++---
>  mm/Kconfig       | 12 +++++++++
>  mm/huge_memory.c | 63 ++++++++++++++++++++++++++++++++++++++----------
>  4 files changed, 74 insertions(+), 17 deletions(-)
>
>
> base-commit: f1f6aceb82a55f87d04e2896ac3782162e7859bd
> --
> 2.47.2
>

--
Sincerely yours,
Mike.