Re: [PATCH 0/5] add STATIC_PMD_ZERO_PAGE config option

"Pankaj Raghav (Samsung)" <kernel@xxxxxxxxxxxxxxxx> · Mon, 16 Jun 2025 12:49:27 +0200

> > > 
> > > The mm is a nice convenient place to stick a refcount but there are other
> > > ways to keep an efficient refcount around. For instance, you could just
> > > bump a per-cpu refcount and then have the shrinker sum up all the
> > > refcounts to see if there are any outstanding on the system as a whole.
> > > 
> > > I understand that the current refcounts are tied to an mm, but you could
> > > either replace the mm-specific ones or add something in parallel for
> > > when there's no mm.
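
For illustration, the per-cpu idea quoted above can be modelled in
user-space C roughly as follows. This is only a sketch under stated
assumptions: NR_SIM_CPUS, zero_page_refs and the zero_page_*() helpers are
hypothetical names, not existing kernel symbols, and a plain array stands in
for real per-CPU data. It also ignores the race between summing and a
concurrent get, which a real implementation would have to close.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_SIM_CPUS 4	/* hypothetical CPU count for this model */

/* One counter per simulated CPU; a kernel version would use per-CPU data. */
static long zero_page_refs[NR_SIM_CPUS];

/* Fast path: take a reference by touching only the local CPU's slot. */
static void zero_page_get(unsigned int cpu)
{
	assert(cpu < NR_SIM_CPUS);
	zero_page_refs[cpu]++;
}

/*
 * Fast path: drop a reference. The put may run on a different CPU than
 * the get, so a single slot can go negative; only the sum is meaningful.
 */
static void zero_page_put(unsigned int cpu)
{
	assert(cpu < NR_SIM_CPUS);
	zero_page_refs[cpu]--;
}

/* Slow path, as a shrinker might do it: add up every slot. */
static long zero_page_refs_sum(void)
{
	long sum = 0;

	for (unsigned int cpu = 0; cpu < NR_SIM_CPUS; cpu++)
		sum += zero_page_refs[cpu];
	return sum;
}

/* The page is a candidate for freeing only if nothing is outstanding. */
static bool zero_page_can_free(void)
{
	return zero_page_refs_sum() == 0;
}
```

The point of the split is that get/put stay cheap and cacheline-local,
while the cost of summing is paid only when the shrinker actually runs.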
> > 
> > But the whole idea of allocating a static PMD page for sane
> > architectures like x86 started with the intent of avoiding the refcounts and
> > shrinker.
> > 
> > This was the initial feedback I got[2]:
> > 
> > I mean, the whole thing about dynamically allocating/freeing it was for
> > memory-constrained systems. For large systems, we just don't care.
> 
> For non-mm usage we can just use the folio refcount. The per-mm refcounts
> are all combined into a single folio refcount. The way the global variable
> is managed based on per-mm refcounts is the weird thing.
> 
> In some corner cases we might end up having multiple instances of huge zero
> folios right now. Just imagine:
> 
> 1) Allocate huge zero folio during read fault
> 2) vmsplice() it
> 3) Unmap the huge zero folio
> 4) Shrinker runs and frees it
> 5) Repeat with 1)
> 
> As long as the folio is vmspliced, it will not actually get freed ...
> 
> I would hope that we could remove the shrinker completely, and simply never
> free the huge zero folio once allocated. Or at least, only free it once it
> is actually no longer used.
> 
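
To check my understanding of the corner case quoted above, here is a small
user-space C model of steps 1) to 5). It is a simplification, not the
kernel code: struct sim_folio, sim_huge_zero and the sim_*() helpers are
hypothetical names, the global pointer is assumed to hold one folio
reference, and the shrinker is assumed to be invoked only after the unmap.

```c
#include <assert.h>
#include <stddef.h>

struct sim_folio {
	int refcount;	/* 0 means the folio has been freed */
};

#define SIM_MAX_FOLIOS 8
static struct sim_folio sim_pool[SIM_MAX_FOLIOS];
static size_t sim_next;			/* next unused pool slot */
static struct sim_folio *sim_huge_zero;	/* models the global pointer */

/* Step 1: read fault; allocate a folio if the global pointer is empty. */
static struct sim_folio *sim_read_fault(void)
{
	if (!sim_huge_zero) {
		assert(sim_next < SIM_MAX_FOLIOS);
		sim_huge_zero = &sim_pool[sim_next++];
		sim_huge_zero->refcount = 1;	/* held by the global pointer */
	}
	sim_huge_zero->refcount++;		/* held by the new mapping */
	return sim_huge_zero;
}

/* Step 2: vmsplice() takes its own reference on the folio. */
static void sim_vmsplice(struct sim_folio *folio)
{
	folio->refcount++;
}

/* Step 3: unmapping drops the mapping's reference. */
static void sim_unmap(struct sim_folio *folio)
{
	folio->refcount--;
}

/* Step 4: the shrinker forgets the folio and drops the global reference. */
static void sim_shrinker(void)
{
	if (sim_huge_zero) {
		sim_huge_zero->refcount--;
		sim_huge_zero = NULL;
	}
}

/* Count folios that are still alive, i.e. have a nonzero refcount. */
static int sim_live_folios(void)
{
	int live = 0;

	for (size_t i = 0; i < sim_next; i++)
		if (sim_pool[i].refcount > 0)
			live++;
	return live;
}
```

Running fault, vmsplice, unmap, shrinker and then a second fault leaves
the first folio pinned by the vmsplice reference while the global pointer
already refers to a second one, so two huge zero folios are live at once.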

Thanks for the explanation, David.

But I am still a bit confused about how to proceed with these patches.

So IIUC, our eventual goal is to get rid of the shrinker.

But do we still want to add a static PMD page in the .bss or do we take
an alternate approach here?

--
Pankaj