On 26.08.25 09:19, Yafang Shao wrote:
Background
==========

Our production servers consistently configure THP to "never" due to
historical incidents caused by its behavior. Key issues include:

- Increased Memory Consumption
  THP significantly raises overall memory usage, reducing available
  memory for workloads.

- Latency Spikes
  Random latency spikes occur due to frequent memory compaction
  triggered by THP.

- Lack of Fine-Grained Control
  THP tuning is globally configured, making it unsuitable for
  containerized environments. When multiple workloads share a host,
  enabling THP without per-workload control leads to unpredictable
  behavior.

Due to these issues, administrators avoid switching to madvise or always
modes unless per-workload THP control is implemented.

To address this, we propose BPF-based THP policy for flexible adjustment.
Additionally, as David mentioned [0], this mechanism can also serve as a
policy prototyping tool (test policies via BPF before upstreaming them).

There is a lot going on and most reviewers (including me) are fairly busy right now, so getting more detailed review could take a while.
This topic sounds like a good candidate for the bi-weekly MM alignment session.
Would you be interested in presenting the current BPF interface, how to use it, its drawbacks, open TODOs, ... in that forum?
David Rientjes, who organizes this meeting, is already on Cc.

-- 
Cheers

David / dhildenb