On 9/9/25 12:53 AM, David Hildenbrand wrote:
On 08.09.25 23:14, Anthony Yznaga wrote:
On 9/8/25 1:59 PM, Matthew Wilcox wrote:
On Mon, Sep 08, 2025 at 10:32:22PM +0200, David Hildenbrand wrote:
In the context of this series, how do we handle VMA-modifying functions
like mprotect/some madvise/mlock/mempolicy/...? Are they currently
blocked when applied to a mshare VMA?
I haven't been following this series recently, so I'm not sure what
Anthony will say. My expectation is that the shared VMA is somewhat
transparent to these operations; that is, they fail if they span
the boundary of the mshare VMA, but otherwise they pass through and
affect the shared VMAs.
That does raise the interesting question of how mlockall() affects
an mshare VMA. I'm tempted to say that it should affect the shared
VMA, but reasonable people might well disagree with me and have
excellent arguments.
Right, I think there are (at least) two possible models.
(A) It's just a special file mapping.
How that special file is orchestrated is not controlled through VMA
change operations (mprotect etc) from one process but through dedicated
ioctl.
(B) It's something different.
VMA change operations will affect how that file is orchestrated but not
modify what the VMA in each process looks like.
I still believe that (A) is clean and (B) is asking for trouble. But in
any case, this is one of the most vital parts of mshare integration and
should be documented clearly.
And how are we handling other page table walkers that don't modify VMAs
like MADV_DONTNEED, smaps, migrate_pages, ... etc?
I'd expect those to walk into the shared region too.
I've received conflicting feedback in previous discussions that things
like protection changes should be done via ioctl. I do think some things
are appropriate for ioctl like map and unmap, but I also like the idea
of the existing APIs being transparent to mshare so long as they are
operating entirely within an mshare range and not crossing boundaries.
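
As a rough illustration of the "transparent unless it crosses a boundary"
rule described above, the range test could be sketched in user-space C as
follows. This is not code from the series: struct mshare_range,
enum range_verdict and classify_range() are hypothetical names invented
for this sketch, and a real implementation would work on kernel VMAs
rather than on a bare address pair.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one process's mshare mapping: [start, end). */
struct mshare_range {
	uintptr_t start;
	uintptr_t end;
};

/* Hypothetical verdicts for an operation on [addr, addr + len). */
enum range_verdict {
	RANGE_INVALID,   /* addr + len wraps around the address space */
	RANGE_OUTSIDE,   /* does not touch the mshare range: handle as usual */
	RANGE_INSIDE,    /* fully contained: could pass through to shared VMAs */
	RANGE_STRADDLES, /* crosses an mshare boundary: fail the operation */
};

static enum range_verdict classify_range(const struct mshare_range *m,
					 uintptr_t addr, size_t len)
{
	uintptr_t op_end = addr + len;

	if (op_end < addr)
		return RANGE_INVALID;
	/* No overlap at all with the mshare range. */
	if (op_end <= m->start || addr >= m->end)
		return RANGE_OUTSIDE;
	/* Entirely within the mshare range. */
	if (addr >= m->start && op_end <= m->end)
		return RANGE_INSIDE;
	/* Overlaps, but also extends past one or both boundaries. */
	return RANGE_STRADDLES;
}
```

Under this sketch, only RANGE_STRADDLES (and RANGE_INVALID) would make an
mprotect()-style call fail; RANGE_INSIDE is the case where the operation
would be applied to the shared VMAs.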
We have to be very careful here to not create a mess (this is all going
to be unchangeable API later), and getting the opinion from other VMA
handling folks (i.e., Lorenzo, Liam, Vlastimil, Pedro) will be crucial.
So can you answer the questions I raised in more detail? In
particular how it works with the current series or what the current
long-term plans are?
With respect to the current series there are some deficiencies. For
madvise(), some advice values like MADV_DONTNEED will operate
on the shared page table without taking the needed locks. Many will fail
for various reasons. I'll add a check to reject trying to apply advice
to msharefs VMAs. The plan is to add an ioctl for applying advice to the
memory in an mshare region. If it makes sense to make it more
transparent then I think that's something that could come later.
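
The interim "reject advice on msharefs VMAs" check mentioned above might
look roughly like the toy model below. Everything here is an assumption
made for illustration: struct toy_vma, its is_mshare field and
toy_madvise_check() are invented names, and -EINVAL is only a guess at
the error code, since the thread does not say which one would be
returned.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a kernel VMA; only the field this check needs. */
struct toy_vma {
	bool is_mshare; /* hypothetical marker for an msharefs mapping */
};

/*
 * Returns 0 if the advice may proceed, or a negative errno to reject it.
 * The advice value is ignored because the interim plan described in the
 * thread rejects all advice on msharefs VMAs alike.
 */
static int toy_madvise_check(const struct toy_vma *vma, int advice)
{
	(void)advice;
	if (vma->is_mshare)
		return -EINVAL; /* assumed error code, not from the series */
	return 0;
}
```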
Things like migrate_pages() that use the rmap to get to VMAs are in
better shape because they will naturally find the real VMA with its
vm_mm pointing to an mshare mm.
Another area I'm currently working on is ensuring mmu notifiers work.
There is some locking trickery to work out there.
Currently the plan is to add ioctls for protection changes, advice, and
whatever else makes sense. I'm definitely open to feedback on any aspect
of this.
Thanks!