Re: [RFC PATCH v2 0/5] mm, bpf: BPF based THP adjustment

I totally agree with you that the key point here is how to define the
API. As I replied to David, I believe we have two fundamental
principles to adjust the THP policies:
1. Selective Benefit: Some tasks benefit from THP, while others do not.
2. Conditional Safety: THP allocation is safe under certain conditions
but not others.

Therefore, I believe we can define these APIs based on the established
principles - everything else constitutes implementation details, even
if core MM internals need to change.

But if we're looking to make the concept of THP go away, we really need to
go further than this.

Yeah. I might be wrong, but I also don't think that control at the per-process level (or similar) would be the right solution long-term.

In a world where we do stuff automatically ("auto" mode), we would be much smarter about where to place a (m)THP, and which size we would use.

One might use bpf to control the allocation policy. But I don't think this would be per-process or even per-VMA etc. Sure, we might give hints, but placement decisions should happen on another level (e.g., during page faults, during khugepaged etc).


The second we have 'bpf program that figures out whether THP should be
used' we are permanently tied to the idea of THP on/off being a thing.

I mean any future stuff that makes THP more automagic will probably involve
having new modes for the legacy THP
/sys/kernel/mm/transparent_hugepage/enabled and
/sys/kernel/mm/transparent_hugepage/hugepages-xxkB/enabled

Yeah, the plan is to have "auto" in /sys/kernel/mm/transparent_hugepage/enabled and just have all other sizes "inherit" that option. And have a Kconfig that just enables that as default. Once we're there, just phase out the interface long-term.

That's the plan. Now we "only" have to figure out how to make the placement actually better ;)
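
For reference, the sysfs files named above list the available modes on one line, with the active mode in square brackets (for example "always [madvise] never"). A minimal sketch of reading that from userspace could look like the following. The helper names are hypothetical, and the sketch does not assume an "auto" mode exists; it only reports whatever the kernel exposes.

```python
from pathlib import Path

# Global THP knob discussed in this thread.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_modes(text):
    """Return (active_mode, all_modes) for a line like 'always [madvise] never'.

    active_mode is None when no token is bracketed.
    """
    tokens = text.split()
    modes = [t.strip("[]") for t in tokens]
    active = next(
        (t[1:-1] for t in tokens if t.startswith("[") and t.endswith("]")),
        None,
    )
    return active, modes


def current_thp_mode(path=THP_ENABLED):
    """Read the active mode from sysfs; None if the file is unavailable."""
    try:
        return parse_thp_modes(path.read_text())[0]
    except OSError:
        return None
```

The same parser applies to the per-size hugepages-xxkB/enabled files, which use the same bracketed format.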


But if people are super reliant on this stuff it's potentially really
limiting.

I think you said in another post here that you were toying with the notion
of exposing somehow the madvise() interface and having that be the 'stable
API' of sorts?

That definitely sounds more sensible than something that very explicitly
interacts with THP.

Of course we have Usama's series and my proposed series for extending
process_madvise() along those lines also.
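
For context, the existing per-range opt-in through madvise() can be sketched from userspace as below. This is only an illustration under assumptions: it uses Python's mmap.madvise() wrapper (Python 3.8+, Linux only), the function name is hypothetical, and a successful call only means the kernel accepted the hint, not that a huge page was actually allocated.

```python
import mmap


def advise_hugepage(length=2 * 1024 * 1024):
    """Map private anonymous memory and ask for THP on it.

    Returns True if madvise(MADV_HUGEPAGE) was accepted, False if the
    flag is unavailable on this platform or the kernel rejected it.
    """
    flag = getattr(mmap, "MADV_HUGEPAGE", None)
    if flag is None:
        return False
    region = mmap.mmap(-1, length, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    try:
        region.madvise(flag)  # hint covers the whole mapping
        return True
    except OSError:
        return False
    finally:
        region.close()
```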

Yes.

--
Cheers,

David / dhildenb




