On 2025/5/30 10:04, Zi Yan wrote:
On 29 May 2025, at 21:51, Baolin Wang wrote:
On 2025/5/29 23:10, Zi Yan wrote:
On 29 May 2025, at 4:23, Baolin Wang wrote:
The MADV_COLLAPSE will ignore the system-wide Anon THP sysfs settings, which
means that even though we have disabled the Anon THP configuration, MADV_COLLAPSE
will still attempt to collapse into a Anon THP. This violates the rule we have
agreed upon: never means never.
To address this issue, should check whether the Anon THP configuration is disabled
in thp_vma_allowable_orders(), even when the TVA_ENFORCE_SYSFS flag is set.
Signed-off-by: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
---
include/linux/huge_mm.h | 23 +++++++++++++++++++----
1 file changed, 19 insertions(+), 4 deletions(-)
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 2f190c90192d..199ddc9f04a1 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -287,20 +287,35 @@ unsigned long thp_vma_allowable_orders(struct vm_area_struct *vma,
unsigned long orders)
{
/* Optimization to check if required orders are enabled early. */
- if ((tva_flags & TVA_ENFORCE_SYSFS) && vma_is_anonymous(vma)) {
- unsigned long mask = READ_ONCE(huge_anon_orders_always);
+ if (vma_is_anonymous(vma)) {
+ unsigned long always = READ_ONCE(huge_anon_orders_always);
+ unsigned long madvise = READ_ONCE(huge_anon_orders_madvise);
+ unsigned long inherit = READ_ONCE(huge_anon_orders_inherit);
+ unsigned long mask = always | madvise;
+
+ /*
+ * If the system-wide THP/mTHP sysfs settings are disabled,
+ * then we should never allow hugepages.
+ */
+ if (!(mask & orders) && !(hugepage_global_enabled() && (inherit & orders)))
Can you explain the logic here? Is it equivalent to:
1. if THP is set to always, always_mask & orders == 0, or
2. if THP if set to madvise, madvise_mask & order == 0, or
3. if THP is set to inherit, inherit_mask & order == 0?
I cannot figure out why (always | madvise) & orders does not check
THP enablement case, but inherit & orders checks hugepage_global_enabled().
Sorry for not being clear. Let me try again:
Now we can control per-sized mTHP through ‘huge_anon_orders_always’, so always does not need to rely on the check of hugepage_global_enabled().
For madvise, referring to David's suggestion: “allowing for collapsing in a VM without VM_HUGEPAGE in the "madvise" mode would be fine", so we can just check 'huge_anon_orders_madvise' without relying on hugepage_global_enabled().
Got it. Setting always or madvise knob in per-size mTHP means user wants to
enable that size, so their orders are not limited by the global config.
Setting inherit means user wants to follow the global config.
Now it makes sense to me. I wonder if renaming inherit to inherit_global
and huge_anon_orders_inherit to huge_anon_orders_inherit_global
could make code more clear (We cannot change sysfs names, but changing
kernel variable names should be fine?).
The 'huge_anon_orders_inherit' is also a per-size mTHP interface. See
the doc:
"
Alternatively it is possible to specify that a given hugepage size
will inherit the top-level "enabled" value::
echo inherit
>/sys/kernel/mm/transparent_hugepage/hugepages-<size>kB/enabled
"
The 'inherit' already implies that it is meant to inherit the top-level
'enabled' value, so I personally still prefer the variable name
'inherit', just as we use it elsewhere.
In the case where hugepage_global_enabled() is enabled, we need to check whether the 'inherit' has enabled the corresponding orders.
In summary, the current strategy is:
1. If always & orders == 0, and madvise & orders == 0, and hugepage_global_enabled() == false (global THP settings are not enabled), it means mTHP of the orders are prohibited from being used, then madvise_collapse() is forbidden.
2. If always & orders == 0, and madvise & orders == 0, and hugepage_global_enabled() == true (global THP settings are enabled), and inherit & orders == 0, it means mTHP of the orders are still prohibited from being used, and thus madvise_collapse() is not allowed.
Putting the summary in the comment might be very helpful. WDYT?
Sure. will do.
Otherwise, the patch looks good to me. Thanks.
Reviewed-by: Zi Yan <ziy@xxxxxxxxxx>
Thanks.