On Tue, May 6, 2025 at 8:13 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Thu, Jan 09, 2025, James Houghton wrote:
> >   KVM: Add KVM_MEM_USERFAULT memslot flag and bitmap
> >   KVM: Add KVM_MEMORY_EXIT_FLAG_USERFAULT
> >   KVM: Allow late setting of KVM_MEM_USERFAULT on guest_memfd memslot
> >   KVM: Advertise KVM_CAP_USERFAULT in KVM_CHECK_EXTENSION
> >   KVM: x86/mmu: Add support for KVM_MEM_USERFAULT
> >   KVM: arm64: Add support for KVM_MEM_USERFAULT
> >   KVM: selftests: Fix vm_mem_region_set_flags docstring
> >   KVM: selftests: Fix prefault_mem logic
> >   KVM: selftests: Add va_start/end into uffd_desc
> >   KVM: selftests: Add KVM Userfault mode to demand_paging_test
> >   KVM: selftests: Inform set_memory_region_test of KVM_MEM_USERFAULT
> >   KVM: selftests: Add KVM_MEM_USERFAULT + guest_memfd toggle tests
> >   KVM: Documentation: Add KVM_CAP_USERFAULT and KVM_MEM_USERFAULT details
> >
> >  Documentation/virt/kvm/api.rst                |  33 +++-
> >  arch/arm64/kvm/Kconfig                        |   1 +
> >  arch/arm64/kvm/mmu.c                          |  26 +++-
> >  arch/x86/kvm/Kconfig                          |   1 +
> >  arch/x86/kvm/mmu/mmu.c                        |  27 +++-
> >  arch/x86/kvm/mmu/mmu_internal.h               |  20 ++-
> >  arch/x86/kvm/x86.c                            |  36 +++--
> >  include/linux/kvm_host.h                      |  19 ++-
> >  include/uapi/linux/kvm.h                      |   6 +-
> >  .../selftests/kvm/demand_paging_test.c        | 145 ++++++++++++++++--
> >  .../testing/selftests/kvm/include/kvm_util.h  |   5 +
> >  .../selftests/kvm/include/userfaultfd_util.h  |   2 +
> >  tools/testing/selftests/kvm/lib/kvm_util.c    |  42 ++++-
> >  .../selftests/kvm/lib/userfaultfd_util.c      |   2 +
> >  .../selftests/kvm/set_memory_region_test.c    |  33 ++++
> >  virt/kvm/Kconfig                              |   3 +
> >  virt/kvm/kvm_main.c                           |  54 ++++++-
> >  17 files changed, 419 insertions(+), 36 deletions(-)
>
> I didn't look at the selftests changes, but nothing in this series scares me.  We
> bikeshedded most of this to death in the "exit on missing" series, so for me at
> least, the only real question is whether or not we want to add the uAPI.
> AFAIK, this is the best proposal for post-copy guest_memfd support (and not just
> because it's the only proposal :-D).

The only thing that I want to call out again is that this UAPI works
great when we are going from userfault --> !userfault. That is, it
works well for postcopy (both for guest_memfd and for standard
memslots where userfaultfd scalability is a concern).

But there is another use case worth bringing up: unmapping pages that
the VMM is emulating as poisoned.

Normally this can be handled by mm (e.g. with UFFDIO_POISON), but for
4K poison within a HugeTLB-backed memslot (if the HugeTLB page remains
mapped in userspace), KVM Userfault is the only option (if we don't
want to punch holes in memslots). This leaves us with three problems:

1. If using KVM Userfault to emulate poison, we are stuck with small
   pages in stage 2 for the entire memslot.
2. We must unmap everything when toggling on KVM Userfault just to
   unmap a single page.
3. If KVM Userfault is already enabled, we have no choice but to
   toggle KVM Userfault off and on again to unmap the newly poisoned
   pages (i.e., there is no ioctl to scan the bitmap and unmap
   newly-userfault pages).

All of these are non-issues if we emulate poison by removing memslots,
and I think that's possible. But if that proves too slow, we'd need to
be a little bit more clever with hugepage recovery and with unmapping
newly-userfault pages, both of which I think can be solved by adding
some kind of bitmap re-scan ioctl. We can do that later if the need
arises.

> So... yes?

Thanks Sean!

> Attached is a variation on the series using the common "struct kvm_page_fault"
> idea.  The documentation change could be squashed with the final enablement patch.
>
> Compile tested only.  I would not be the least bit surprised if I completely
> butchered something.

Looks good! The new selftests work just fine.
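
P.S. To make problems 2 and 3 above concrete, here is a rough userspace-only
model of the cost of poisoning single 4K pages with the current design. This
is a sketch under stated assumptions, not code from the series: struct
slot_model, set_userfault_flag() and emulate_poison() are hypothetical names,
no KVM ioctl is issued, and the "unmap the whole slot when the flag is turned
on" behavior is simply counted rather than performed.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical VMM-side model. None of these names are KVM uAPI. */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

struct slot_model {
	uint64_t base_gfn;         /* first guest frame number in the slot */
	uint64_t npages;           /* slot size in 4K pages */
	bool userfault_enabled;    /* stands in for KVM_MEM_USERFAULT being set */
	unsigned long *bitmap;     /* one bit per 4K page; set = userfault */
	unsigned long full_unmaps; /* whole-slot stage-2 unmaps so far */
};

/* Allocate a zeroed bitmap; returns 0 on success, -1 on allocation failure. */
static int slot_init(struct slot_model *s, uint64_t base_gfn, uint64_t npages)
{
	size_t words = (npages + BITS_PER_WORD - 1) / BITS_PER_WORD;

	s->base_gfn = base_gfn;
	s->npages = npages;
	s->userfault_enabled = false;
	s->full_unmaps = 0;
	s->bitmap = calloc(words, sizeof(unsigned long));
	return s->bitmap ? 0 : -1;
}

static void slot_destroy(struct slot_model *s)
{
	free(s->bitmap);
	s->bitmap = NULL;
}

static bool page_is_userfault(const struct slot_model *s, uint64_t gfn)
{
	uint64_t idx = gfn - s->base_gfn;

	assert(gfn >= s->base_gfn && idx < s->npages);
	return (s->bitmap[idx / BITS_PER_WORD] >> (idx % BITS_PER_WORD)) & 1UL;
}

/*
 * Stand-in for a memslot flags update. Assumption taken from problem 2:
 * turning the flag on unmaps the entire slot, which is only counted here.
 */
static void set_userfault_flag(struct slot_model *s, bool on)
{
	if (on && !s->userfault_enabled)
		s->full_unmaps++;
	s->userfault_enabled = on;
}

/* Mark one 4K page as userfault, then force the bitmap to be honored. */
static void emulate_poison(struct slot_model *s, uint64_t gfn)
{
	uint64_t idx = gfn - s->base_gfn;

	assert(gfn >= s->base_gfn && idx < s->npages);
	s->bitmap[idx / BITS_PER_WORD] |= 1UL << (idx % BITS_PER_WORD);

	/* Problem 3: no re-scan ioctl, so an enabled slot is toggled off... */
	if (s->userfault_enabled)
		set_userfault_flag(s, false);
	/* ...and back on, paying a whole-slot unmap each time (problem 2). */
	set_userfault_flag(s, true);
}
```

In this model, poisoning two pages in a 512-page slot costs two whole-slot
unmaps. A bitmap re-scan ioctl of the kind mentioned above would presumably
let the second call unmap only the newly marked page.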