On Tue, Sep 9, 2025 at 7:46 PM Yafang Shao <laoar.shao@xxxxxxxxx> wrote: > > +/* Detecting whether a task can successfully allocate THP is unreliable because > + * it may be influenced by system memory pressure. Instead of making the result > + * dependent on unpredictable factors, we should simply check > + * bpf_hook_thp_get_orders()'s return value, which is deterministic. > + */ > +SEC("fexit/bpf_hook_thp_get_orders") > +int BPF_PROG(thp_run, struct vm_area_struct *vma, u64 vma_flags, enum tva_type tva_type, > + unsigned long orders, int retval) > +{ ... > +SEC("struct_ops/thp_get_order") > +int BPF_PROG(alloc_in_khugepaged, struct vm_area_struct *vma, enum bpf_thp_vma_type vma_type, > + enum tva_type tva_type, unsigned long orders) > +{ This is a bad idea to mix struct_ops logic with fentry/fexit style. struct_ops hook will not be affected by compiler optimizations, while fentry depends on a whim of compilers. struct_ops can be scoped, while fentry is always global. sched-ext already struggles with the later, since some scheds need tracing data from other parts of the kernel and they cannot be grouped together. All sorts of workarounds were proposed, but no good solution in sight. So don't go this route for THP. Make everything you need to be struct_ops based and/or pass whatever extra data into these ops. Also think of scoping for bpf-thp from the start. Currently st_ops/thp_get_order is only one and it's global. It's ok for prototypes and experiments, but not ok for landing upstream. I think cgroup would a natural scope and different cgroups might want their own bpf based THP hints. Once you do that, think through how delegation of suggested order will propagate through hierarchy. bpf-oom seems to be aligning toward the same design principles, so don't reinvent the wheel.