Re: [LSF/MM/BPF TOPIC] The future of anon_vma

On Sun, Feb 23, 2025 at 01:38:08PM +0530, Dev Jain wrote:
>
>
> On 09/01/25 3:53 am, Lorenzo Stoakes wrote:
> > Hi all,
> >
> > Since time immemorial the kernel has maintained two separate realms within
> > mm - that of file-backed mappings and that of anonymous mappings.
> >
> > Each of these requires a reverse mapping from folio to VMA, utilising
> > interval trees from an intermediate object referenced by folio->mapping
> > back to the VMAs which map it.
> >
> > In the case of a file-backed mapping, this 'intermediate object' is the
> > shared page cache entry, of type struct address_space. It is non-CoW which
> > keeps things simple(-ish) and the concept is straightforward - both the
> > folio and the VMAs which map the page cache object reference it.
> >
> > In the case of anonymous memory, things are not quite as simple, as a
> > result of CoW. This is further complicated by forking and the very many
> > different combinations of CoW'd and non-CoW'd folios that can exist within
> > a mapping.
> >
> > This kind of mapping utilises struct anon_vma objects which as a result of
> > this complexity are pretty well entirely concerned with maintaining the
> > notion of an anon_vma object rather than describing the underlying memory
> > in any way.
> >
> > Of course we can enter further realms of insan^W^W^W^W^Wcomplexity by
> > maintaining a MAP_PRIVATE file-backed mapping where we can experience both
> > at once!
> >
> > The fact that we can have both CoW'd and non-CoW'd folios referencing a VMA
> > means that we require -yet another- type, a struct anon_vma_chain,
> > maintained on a linked list, to abstract the link between anon_vma objects
> > and VMAs, and to provide a means by which one can manage and traverse
> > anon_vma objects from the VMA as well as looking them up from the reverse
> > mapping.
> >
> > Maintaining all of this correctly is very fragile, error-prone and
> > confusing, not to mention the concerns around maintaining correct locking
> > semantics, correctly propagating anonymous VMA state on fork, and trying to
> > reuse state to avoid allocating unnecessary memory to maintain all of this
> > infrastructure.
> >
> > An additional consequence of maintaining these two realms is that that
> > which straddles them - shmem - becomes something of an enigma -
> > file-backed, but existing on the anonymous LRU list and requiring a lot of
> > very specific handling.
> >
> > It is obvious that there is some isomorphism between the representation of
> > file systems and anonymous memory, less the CoW handling. However there is
> > a concept which exists within file systems which can somewhat bridge the gap
> >   - reflinks.
> >
> > A future where we unify anonymous and file-backed memory mappings would be
> > one in which reflinks were implemented at a general level rather than, as
> > they are now, implemented individually within file systems.
> >
> > I'd like to discuss how feasible doing so might be, whether this is a sane
> > line of thought at all, and how a roadmap for working towards the
> > elimination of anon_vma as it stands might look.
> >
> > As with my other proposal, I will gather more concrete information before
> > LSF to ensure the discussion is specific, and of course I would be
> > interested to discuss the topic in this thread also!
> >
> > Thanks!
> >
>
> Thanks for this, as a beginner I have tried understanding the rmap code a
> million times, after forgetting it a million times, thanks to the sheer
> complexity of the anon_vma and anon_vma_chain. Whenever I read it again, the
> first thought is "surely there has to be some better way, someone must
> figure it out" :)
>

No problem, this is something I am very interested in putting time into,
and _will_ be at least -attempting- patches for (I have ideas that likely
will land _before_ LSF).

Note the follow up mail - I am providing short, medium + long term
approaches, no longer JUST focusing on the 'how to remove' bit.

I'd be remiss given what you've said if I didn't mention that my book
covers this stuff in a great amount of detail, including anon_vma in a lot
of detail (I steal some diagrams from it for the LSF slides :) and that you
can pre-order and read full draft now... https://linuxmemory.org/ ;)

Also, the slides I have drafted start by going over (as quickly as I can,
so there is time for a good discussion) how file-backed rmap compares to
anon_vma: the complexities, the reasons for them and the pitfalls.

This is to provide a basis for the 'So what?' portion, that is - what can
we do, why, how.

Point is I'm attacking this one way or another whether I host this topic or
not, as I feel as you (and I am sure many others) do - can't we do better?
So it seems a nice courtesy and great opportunity for discussion to speak
about it in person.

I know that in VMA merging, for one, we almost certainly can. And since I
literally wrote all of that code (latest iteration anyway) that's very much
in my wheelhouse.

Cheers, Lorenzo



