On Tue, Jun 10, 2025 at 05:00:47PM +0100, Usama Arif wrote:
> On 10/06/2025 16:46, Matthew Wilcox wrote:
> > On Tue, Jun 10, 2025 at 04:30:43PM +0100, Usama Arif wrote:
> >> If we have 2 workloads on the same server, For e.g. one is database where THPs
> >> just dont do well, but the other one is AI where THPs do really well. How
> >> will the kernel monitor that the database workload is performing worse
> >> and the AI one isnt?
> >
> > It can monitor the allocation/access patterns and see who's getting
> > the benefit. The two workloads are in competition for memory, and
> > we can tell which pages are hot and which cold.
> >
> > And I don't believe it's a binary anyway. I bet there are some
> > allocations where the database benefits from having THPs (I mean, I know
> > a database which invented the entire hugetlbfs subsystem so it could
> > use PMD entries and avoid one layer of TLB misses!)
> >
>
> Sure, but this is just an example. Workload owners are not going to spend time
> trying to see how each allocation works and if its hot, they put it in hugetlbfs.

No, they're not. It should be automatic. There are many deficiencies
in the kernel; this is one of them.

> Ofcourse hugetlbfs has its own drawbacks of reserving pages.

Drawback or advantage? It's a feature. You're being very strange about
this. First you want to reserve THPs for some workloads only, then when
given a way to do that you complain that ... you have to reserve hugetlb
pages. You can't possibly mean both of these things sincerely.