Re: [PATCH v3 00/22] Add support for shared PTEs across processes


 





On 9/9/25 12:53 AM, David Hildenbrand wrote:
On 08.09.25 23:14, Anthony Yznaga wrote:


On 9/8/25 1:59 PM, Matthew Wilcox wrote:
On Mon, Sep 08, 2025 at 10:32:22PM +0200, David Hildenbrand wrote:
In the context of this series, how do we handle VMA-modifying functions like mprotect/some madvise/mlock/mempolicy/...? Are they currently blocked when
applied to an mshare VMA?

I haven't been following this series recently, so I'm not sure what
Anthony will say.  My expectation is that the shared VMA is somewhat
transparent to these operations; that is, they fail if they span
the boundary of the mshare VMA, but otherwise they pass through and
affect the shared VMAs.
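
[Editorial sketch, not part of the series] The expectation above can be modeled in user space as a three-way classification of an operation's address range against an mshare range. All names here (struct range, classify_op, the OP_* values) are hypothetical and exist only for illustration; the series may structure this check quite differently.

```c
#include <assert.h>

/* Hypothetical model: half-open address ranges [start, end). */
struct range {
    unsigned long start;
    unsigned long end;
};

enum mshare_verdict {
    OP_OUTSIDE,      /* no overlap: handled as an ordinary VMA operation */
    OP_PASS_THROUGH, /* entirely inside: would be applied to the shared VMAs */
    OP_REJECT,       /* straddles an mshare boundary: the operation fails */
};

/* Classify an operation range against a single mshare range. */
static enum mshare_verdict classify_op(struct range mshare, struct range op)
{
    /* Both ranges must be non-empty for the comparisons below to be meaningful. */
    assert(mshare.start < mshare.end && op.start < op.end);

    if (op.end <= mshare.start || op.start >= mshare.end)
        return OP_OUTSIDE;
    if (op.start >= mshare.start && op.end <= mshare.end)
        return OP_PASS_THROUGH;
    return OP_REJECT;
}
```

Under this model, an operation covering exactly the mshare range passes through, while one that starts inside it and ends beyond it is rejected.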

That does raise the interesting question of how mlockall() affects
an mshare VMA.  I'm tempted to say that it should affect the shared
VMA, but reasonable people might well disagree with me and have
excellent arguments.

Right, I think there are (at least) two possible models.

(A) It's just a special file mapping.

How that special file is orchestrated is not controlled through VMA change operations (mprotect etc.) from one process but through a dedicated ioctl.

(B) It's something different.

VMA change operations will affect how that file is orchestrated but will not modify what the VMA in each process looks like.


I still believe that (A) is clean and (B) is asking for trouble. But in any case, this is one of the most vital parts of mshare integration and should be documented clearly.


And how are we handling other page table walkers that don't modify VMAs like
MADV_DONTNEED, smaps, migrate_pages, ... etc?

I'd expect those to walk into the shared region too.

I've received conflicting feedback in previous discussions that things
like protection changes should be done via ioctl. I do think some things
are appropriate for ioctl, like map and unmap, but I also like the idea
of the existing APIs being transparent to mshare so long as they are
operating entirely within an mshare range and not crossing boundaries.

We have to be very careful here to not create a mess (this is all going to be unchangeable API later), and getting the opinion from other VMA handling folks (i.e., Lorenzo, Liam, Vlastimil, Pedro) will be crucial.

So can you answer the questions I raised in more detail? In particular, how does it work with the current series, and what are the current long-term plans?

With respect to the current series there are some deficiencies. For madvise(), there are some advice values like MADV_DONTNEED that will operate on the shared page table without taking the needed locks. Many will fail for various reasons. I'll add a check to reject attempts to apply advice to msharefs VMAs. The plan is to add an ioctl for applying advice to the memory in an mshare region. If it makes sense to make it more transparent then I think that's something that could come later.
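
[Editorial sketch, not part of the series] The interim behaviour described above, rejecting advice on msharefs VMAs, might look roughly like the following user-space model. struct model_vma, its is_mshare flag, model_madvise_check() and the choice of -EINVAL are all assumptions made for illustration; they are not the kernel's structures or the error code the series will necessarily return.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA; this is not struct vm_area_struct. */
struct model_vma {
    unsigned long start;  /* inclusive */
    unsigned long end;    /* exclusive */
    bool is_mshare;       /* assumed marker for an msharefs mapping */
};

/*
 * Return 0 if advice on [start, end) touches no mshare VMA, or -EINVAL
 * (an assumed error code) if any overlapping VMA is an mshare VMA.
 */
static int model_madvise_check(const struct model_vma *vmas, size_t n,
                               unsigned long start, unsigned long end)
{
    for (size_t i = 0; i < n; i++) {
        bool overlaps = vmas[i].start < end && start < vmas[i].end;

        if (overlaps && vmas[i].is_mshare)
            return -EINVAL;
    }
    return 0;
}
```

In this model the whole request is refused as soon as it overlaps an mshare VMA, leaving advice on mshare memory to the planned ioctl.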

Things like migrate_pages() that use the rmap to get to VMAs are in better shape because they will naturally find the real VMA with its vm_mm pointing to an mshare mm.

Another area I'm currently working on is ensuring mmu notifiers work. There is some locking trickery to work out there.

Currently the plan is to add ioctls for protection changes, advice, and whatever else makes sense. I'm definitely open to feedback on any aspect of this.




Thanks!





