Re: [PATCH v7 mm-new 02/10] mm: thp: add support for BPF based THP order selection

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 2025/9/11 10:48, Yafang Shao wrote:
On Wed, Sep 10, 2025 at 9:57 PM Lance Yang <lance.yang@xxxxxxxxx> wrote:
[...]
+/**
+ * @thp_order_fn_t: Get the suggested THP orders from a BPF program for allocation
+ * @vma: vm_area_struct associated with the THP allocation
+ * @vma_type: The VMA type, such as BPF_THP_VM_HUGEPAGE if VM_HUGEPAGE is set
+ *            BPF_THP_VM_NOHUGEPAGE if VM_NOHUGEPAGE is set, or BPF_THP_VM_NONE if
+ *            neither is set.
+ * @tva_type: TVA type for current @vma
+ * @orders: Bitmask of requested THP orders for this allocation
+ *          - PMD-mapped allocation if PMD_ORDER is set
+ *          - mTHP allocation otherwise
+ *
+ * Return: The suggested THP order from the BPF program for allocation. It will
+ *         not exceed the highest requested order in @orders. Return -1 to
+ *         indicate that the original requested @orders should remain unchanged.

A minor documentation nit: the comment says "Return -1 to indicate that the
original requested @orders should remain unchanged". It might be slightly
clearer to say "Return a negative value to fall back to the original
behavior". This would cover all error codes as well ;)

will change it.

Please feel free to change it ;)



[...]

Also, for future extensions, it might be a good idea to add a reserved
flags argument to the thp_order_fn_t signature.

For example thp_order_fn_t(..., unsigned long flags).

This would give us aforward-compatible way to add new semantics later
without breaking the ABI and needing a v2. We could just require it to be
0 for now.

That makes sense. However, as Lorenzo mentioned previously, we should
keep the interface as minimal as possible.

Got it.

[...]
Forgot to add:

Noticed that if the hook returns 0, bpf_hook_thp_get_orders() falls
back to 'orders', preventing us from dynamically disabling mTHP
allocations.

Could you please clarify what you mean by that?

+       thp_order = bpf_hook_thp_get_order(vma, vma_type, tva_type, orders);
+       if (thp_order < 0)
+               goto out;

In my implementation, it only falls back to @orders if the return
value is negative. If the return value is 0, it uses BIT(0):

My bad, I completely misread the code last night ...

I see now that returning 0 forces a base page (order-0)


+       if (thp_order <= highest_order(orders))
+               thp_orders = BIT(thp_order);

Yes, this is exactly the behavior we need. It will allow us to dynamically
disable mTHP for low-priority containers when we need to, which is perfect
for our use case!



Honoring a return of 0 is critical for our use case, which is to
dynamically disable mTHP for low-priority containers when memory gets
low in mixed workloads.

And then re-enable it for them when memory is back above the low
watermark.

Thank you for detailing your use case; that context is very helpful.

Cheers,
Lance




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux