> > >
> > > The mm is a nice convenient place to stick an mm but there are other
> > > ways to keep an efficient refcount around. For instance, you could just
> > > bump a per-cpu refcount and then have the shrinker sum up all the
> > > refcounts to see if there are any outstanding on the system as a whole.
> > >
> > > I understand that the current refcounts are tied to an mm, but you could
> > > either replace the mm-specific ones or add something in parallel for
> > > when there's no mm.
> >
> > But the whole idea of allocating a static PMD page for sane
> > architectures like x86 started with the intent of avoiding the refcounts and
> > shrinker.
> >
> > This was the initial feedback I got[2]:
> >
> > I mean, the whole thing about dynamically allocating/freeing it was for
> > memory-constrained systems. For large systems, we just don't care.
>
> For non-mm usage we can just use the folio refcount. The per-mm refcounts
> are all combined into a single folio refcount. The way the global variable
> is managed based on per-mm refcounts is the weird thing.
>
> In some corner cases we might end up having multiple instances of huge zero
> folios right now. Just imagine:
>
> 1) Allocate huge zero folio during read fault
> 2) vmsplice() it
> 3) Unmap the huge zero folio
> 4) Shrinker runs and frees it
> 5) Repeat with 1)
>
> As long as the folio is vmspliced(), it will not get actually freed ...
>
> I would hope that we could remove the shrinker completely, and simply never
> free the huge zero folio once allocated. Or at least, only free it once it
> is actually no longer used.
>

Thanks for the explanation, David.

But I am still a bit confused on how to proceed with these patches.

So IIUC, our eventual goal is to get rid of the shrinker.

But do we still want to add a static PMD page in the .bss or do we take
an alternate approach here?

--
Pankaj
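
For reference, the corner case in steps 1) to 5) quoted above can be modelled in userspace roughly as follows. This is only a toy sketch and not kernel code: struct toy_folio, mapper_count, live_folios and all toy_*() helpers are hypothetical names invented for the illustration. It assumes, as a simplification, that the shrinker only looks at a counter of mappers, while a vmsplice()-style pin only raises the folio refcount.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a folio: just a reference count (hypothetical type). */
struct toy_folio {
	int refcount;
};

static struct toy_folio *huge_zero;	/* models the global huge zero folio pointer */
static int mapper_count;		/* models the counter the shrinker checks */
static int live_folios;			/* toy folios allocated and not yet freed */

/* Drop one reference; the folio is only freed when the last one goes away. */
static void toy_put(struct toy_folio *f)
{
	if (--f->refcount == 0) {
		free(f);
		live_folios--;
	}
}

/* Step 1: a read fault allocates the folio on demand and maps it. */
static struct toy_folio *toy_read_fault(void)
{
	if (!huge_zero) {
		huge_zero = calloc(1, sizeof(*huge_zero));
		if (!huge_zero)
			abort();
		huge_zero->refcount = 1;	/* reference held by the global pointer */
		live_folios++;
	}
	mapper_count++;
	huge_zero->refcount++;			/* reference held by the mapping */
	return huge_zero;
}

/* Step 3: unmapping drops the mapper and its folio reference. */
static void toy_unmap(struct toy_folio *f)
{
	mapper_count--;
	toy_put(f);
}

/* Step 4: the shrinker clears the global pointer once no mapper is left. */
static void toy_shrinker(void)
{
	struct toy_folio *f = huge_zero;

	if (f && mapper_count == 0) {
		huge_zero = NULL;
		toy_put(f);			/* a pinned folio survives this put */
	}
}

/* Runs steps 1) to 5) and returns how many instances are alive afterwards. */
static int toy_run_scenario(void)
{
	struct toy_folio *a = toy_read_fault();	/* 1) allocate during read fault */
	a->refcount++;				/* 2) vmsplice()-style pin */
	toy_unmap(a);				/* 3) unmap */
	toy_shrinker();				/* 4) shrinker "frees" it */
	struct toy_folio *b = toy_read_fault();	/* 5) repeat with 1) */
	int instances = live_folios;
	int distinct = (a != b);

	/* Clean up: unmap and shrink the second instance, then release the pin. */
	toy_unmap(b);
	toy_shrinker();
	toy_put(a);
	assert(live_folios == 0);

	return distinct ? instances : -1;
}
```

Calling toy_run_scenario() from a main() returns 2 in this model: the pinned first instance is still alive when the second read fault allocates a new one, which is the "multiple instances" situation described in the quoted text.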