On Mon, May 19, 2025 at 08:14:17PM +0100, Lorenzo Stoakes wrote: > On Mon, May 19, 2025 at 08:59:32PM +0200, David Hildenbrand wrote: > > > > > > > > I am not 100% sure why we bail out on special mappings: all we have to do is > > > > reliably identify anon pages, and we should be able to do that. > > > > > > But they map e.g. kernel memory (at least for VM_PFNMAP, purely and by > > > implication really VM_IO), it wouldn't make sense for KSM to be asked to > > > try to merge these right? > > > > > > And of course no underlying struct page to pin, no reference counting > > > either, so I think you'd end up in trouble potentially here wouldn't you? > > > And how would the CoW work? > > > > KSM only operates on anonymous pages. It cannot de-duplicate anything else. > > (therefore, only MAP_PRIVATE applies) > > Yes I had this realisation see my reply to your reply :) > > I mean is MAP_PRIVATE of a VM_PFNMAP really that common?... > > > > > Anything else (no struct page, not a CoW anon folio in such a mapping) is > > skipped. > > > > Take a look at scan_get_next_rmap_item() where we do > > > > folio = folio_walk_start(&fw, vma, ksm_scan.address, 0); > > if (folio) { > > if (!folio_is_zone_device(folio) && > > folio_test_anon(folio)) { > > folio_get(folio); > > tmp_page = fw.page; > > } > > folio_walk_end(&fw, vma) > > } > > > > > > Before I changed that code, we were using GUP. And GUP just always refuses > > VM_IO|VM_PFNMAP because it cannot handle it properly. > > OK so it boils down to doing KSM _on the already CoW'd_ MAP_PRIVATE mapping? > > But hang on, how do we discover this? vm_normal_page() will screw this up right? > As VM_SPECIAL will be set... > > OK now I'm not sure I understand how MAP_PRIVATE-mapped VM_PFNMAP mappings work > :))) Oh wait hang on... So here, MAP_PRIVATE, CoW'd would clear pte_special() presumably: if (IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL)) { if (likely(!pte_special(pte))) goto check_pfn; ... And in non-PTE special flag case: ... if (!is_cow_mapping(vma->vm_flags)) return NULL; ... And of course, this will be a CoW mapping... So actually we should be fine. Got it. > > > > > > > > > > > So, assuming we could remove the VM_PFNMAP | VM_IO | VM_DONTEXPAND | > > > > VM_MIXEDMAP constraint from vma_ksm_compatible(), could we simplify? > > > > > > Well I question removing this constraint for above reasons. > > > > > > At any rate, even if we _could_ this feels like a bigger change that we > > > should come later. > > > > "bigger" -- it might just be removing these 4 flags from the check ;) > > > > I'll dig a bit more. > > Right, but doing so would be out of scope here don't you think? > > I'd rather not delay fixing this bug on this basis ideally, esp. as easy to > adjust later. > > I suggest we put this in as-is (or close to it anyway) and then if we remove the > flags we can change this... > > As I said in other reply, .mmap() means the driver can do literally anything > they want (which is _hateful_), so we'd really want to have some confidence they > didn't do something so crazy, unless we were happy to just let that possibly > explode. > > > > > -- > > Cheers, > > > > David / dhildenb > >