Re: [PATCH v4 06/12] khugepaged: introduce khugepaged_scan_bitmap for mTHP support


On 2025/4/28 22:47, Nico Pache wrote:
On Sat, Apr 26, 2025 at 8:52 PM Baolin Wang
<baolin.wang@xxxxxxxxxxxxxxxxx> wrote:



On 2025/4/17 08:02, Nico Pache wrote:
khugepaged scans PMD ranges for potential collapse to a hugepage. To add
mTHP support we use this scan to instead record chunks of utilized
sections of the PMD.

khugepaged_scan_bitmap uses a stack struct to recursively scan a bitmap
that represents chunks of utilized regions. We can then determine what
mTHP size fits best and in the following patch, we set this bitmap while
scanning the PMD.

max_ptes_none is used as a scale to determine how "full" an order must
be before being considered for collapse.

When attempting to collapse an order that is set to "always", let's
always collapse to that order in a greedy manner without considering
the number of bits set.

Signed-off-by: Nico Pache <npache@xxxxxxxxxx>
---
   include/linux/khugepaged.h |  4 ++
   mm/khugepaged.c            | 94 ++++++++++++++++++++++++++++++++++----
   2 files changed, 89 insertions(+), 9 deletions(-)

diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
index 1f46046080f5..18fe6eb5051d 100644
--- a/include/linux/khugepaged.h
+++ b/include/linux/khugepaged.h
@@ -1,6 +1,10 @@
   /* SPDX-License-Identifier: GPL-2.0 */
   #ifndef _LINUX_KHUGEPAGED_H
   #define _LINUX_KHUGEPAGED_H
+#define KHUGEPAGED_MIN_MTHP_ORDER    2

Why is the minimum mTHP order set to 2? IMO, file large folios can
support order 1, so do we not expect to collapse small exec file folios
to order 1 where possible?

I should have been more specific in the patch notes, but this affects
anonymous memory only. I'll go over my commit messages and make sure this
is reflected in the next version.

OK. I am looking into how to support shmem mTHP collapse based on your patch series.

(PS: I need more time to understand your logic in this patch, and any
additional explanation would be helpful:) )

We are currently scanning ptes in a PMD. The core principle/reasoning
behind the bitmap is to keep the PMD scan while saving its state. We
then use this bitmap to determine which chunks of the PMD are active
and are the best candidates for mTHP collapse. We start at the PMD
level and recursively break down the bitmap to find the appropriate
sizes for collapse.

Looking at a simplified example: we scan a PMD and get the following
bitmap, 1111101101101011 (in this case MIN_MTHP_ORDER == 5, so each bit
== 32 ptes; in the actual patch set each bit == 4 ptes).
We would first attempt a PMD collapse, while checking the number of
bits set vs. the max_ptes_none tunable. If those conditions aren't
met, we will try the next enabled mTHP order for each half of
the bitmap.

i.e., an order 8 attempt on 11111011 and an order 8 attempt on 01101011.

If a collapse succeeds, we don't keep recursing on that portion of the
bitmap. If not, we continue attempting lower orders.
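
The recursion described above can be sketched in userspace C. This is a minimal illustration under stated assumptions, not the code from the patch: the bitmap is a string of '0'/'1' characters, "collapse succeeds" is modeled purely as the fullness check passing, max_ptes_none is assumed to scale by a right shift per order below the PMD, and the names scan_bitmap, count_set and struct region are hypothetical. The greedy "always" behaviour from the commit message is not modeled.

```c
#include <assert.h>
#include <stddef.h>

#define PMD_ORDER 9 /* assumes 4 KiB base pages: 512 PTEs per PMD */
#define MIN_ORDER 5 /* value from the simplified example: one bit == 32 PTEs */
#define NBITS (1u << (PMD_ORDER - MIN_ORDER)) /* 16 bits cover one PMD */

/* Hypothetical result record: one region selected for collapse. */
struct region {
    unsigned order;     /* order of the selected collapse */
    unsigned first_pte; /* index of the first PTE within the PMD */
};

/* Count the '1' characters in bitmap[start, start + len). */
static unsigned count_set(const char *bitmap, unsigned start, unsigned len)
{
    unsigned n = 0;

    for (unsigned i = start; i < start + len; i++)
        n += (bitmap[i] == '1');
    return n;
}

/*
 * Try to select the region of the given order that starts at bit `start`.
 * If the order is disabled or the region is too empty, split the region in
 * half and retry each half at the next lower order. `enabled` is a mask with
 * bit N set when order N may be used. Selected regions are appended to `out`
 * (which must hold at least NBITS entries); the new count is returned.
 */
static size_t scan_bitmap(const char *bitmap, unsigned start, unsigned order,
                          unsigned enabled, unsigned max_ptes_none,
                          struct region *out, size_t n)
{
    assert(order >= MIN_ORDER && order <= PMD_ORDER);

    unsigned nbits = 1u << (order - MIN_ORDER);

    if (enabled & (1u << order)) {
        unsigned set = count_set(bitmap, start, nbits);
        /* Empty PTEs implied by the unset bits of this region. */
        unsigned none = (nbits - set) << MIN_ORDER;
        /* Assumed scaling of the PMD-wide tunable down to this order. */
        unsigned limit = max_ptes_none >> (PMD_ORDER - order);

        if (set > 0 && none <= limit) {
            out[n].order = order;
            out[n].first_pte = start << MIN_ORDER;
            return n + 1; /* selected: do not recurse into this region */
        }
    }

    if (order == MIN_ORDER)
        return n;

    /* Not selected: retry on the left half, then on the right half. */
    n = scan_bitmap(bitmap, start, order - 1, enabled, max_ptes_none, out, n);
    return scan_bitmap(bitmap, start + nbits / 2, order - 1, enabled,
                       max_ptes_none, out, n);
}
```

Calling scan_bitmap("1111101101101011", 0, PMD_ORDER, 0x3e0, 511, out, 0) with every order from 5 to 9 enabled selects the whole PMD, since 128 empty PTEs is within the limit of 511. With max_ptes_none set to 64 in this sketch, the PMD attempt fails, the left half 11111011 is selected at order 8, and the right half 01101011 is broken down further into smaller regions.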

Hopefully that helps you understand my logic here! Let me know if you
need more clarification.

Thanks for your explanation. That's pretty much how I understood it. :) I'll test your new version.


I gave a presentation on this that might help too:
https://docs.google.com/presentation/d/1w9NYLuC2kRcMAwhcashU1LWTfmI5TIZRaTWuZq-CHEg/edit?usp=sharing&resourcekey=0-nBAGld8cP1kW26XE6i0Bpg

Unfortunately, this link requires access permission.



