Ackerley Tng <ackerleytng@xxxxxxxxxx> writes: > Sean Christopherson <seanjc@xxxxxxxxxx> writes: > >> Paolo, >> >> The arm64 patches have been Reviewed-by Marc, and AFAICT the x86 side of >> things is a go. Barring a screwup on my end, this just needs your approval. >> >> Assuming everything looks good, it'd be helpful to get this into kvm/next >> shortly after rc1. The x86 Kconfig changes in particular create semantic >> conflicts with in-flight series. >> >> >> Add support for host userspace mapping of guest_memfd-backed memory for VM >> types that do NOT use support KVM_MEMORY_ATTRIBUTE_PRIVATE (which isn't >> precisely the same thing as CoCo VMs, since x86's SEV-MEM and SEV-ES have >> no way to detect private vs. shared). >> >> mmap() support paves the way for several evolving KVM use cases: >> >> * Allows VMMs like Firecracker to run guests entirely backed by >> guest_memfd [1]. This provides a unified memory management model for >> both confidential and non-confidential guests, simplifying VMM design. >> >> * Enhanced Security via direct map removal: When combined with Patrick's >> series for direct map removal [2], this provides additional hardening >> against Spectre-like transient execution attacks by eliminating the >> need for host kernel direct maps of guest memory. >> >> * Lays the groundwork for *restricted* mmap() support for guest_memfd-backed >> memory on CoCo platforms [3] that permit in-place >> sharing of guest memory with the host. >> >> Based on kvm/queue. >> >> [1] https://github.com/firecracker-microvm/firecracker/tree/feature/secret-hiding >> [2] https://lore.kernel.org/all/20250221160728.1584559-1-roypat@xxxxxxxxxxxx >> [3] https://lore.kernel.org/all/20250328153133.3504118-1-tabba@xxxxxxxxxx >> >> [...snip...] > > With this version, when guest_memfd memory is mmap-ed() and faulted to > userspace, when there's a memory failure, the process does not get a > SIGBUS. Specifically, this selftest fails with "MADV_HWPOISON should > have triggered SIGBUS." > > diff --git i/tools/testing/selftests/kvm/guest_memfd_test.c w/tools/testing/selftests/kvm/guest_memfd_test.c > index b86bf89a71e04..70ef75a23bb60 100644 > --- i/tools/testing/selftests/kvm/guest_memfd_test.c > +++ w/tools/testing/selftests/kvm/guest_memfd_test.c > @@ -70,6 +70,10 @@ static void test_mmap_supported(int fd, size_t page_size, size_t total_size) > > ret = munmap(mem, total_size); > TEST_ASSERT(!ret, "munmap() should succeed."); > + > + ret = fallocate(fd, FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE, 0, > + total_size); > + TEST_ASSERT(!ret, "Truncate the entire file (cleanup) should succeed."); > } > > static sigjmp_buf jmpbuf; > @@ -104,6 +108,47 @@ static void test_fault_overflow(int fd, size_t page_size, size_t total_size) > > ret = munmap(mem, map_size); > TEST_ASSERT(!ret, "munmap() should succeed."); > + > + ret = fallocate(fd, FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE, 0, > + total_size); > + TEST_ASSERT(!ret, "Truncate the entire file (cleanup) should succeed."); > +} > + > +static void test_memory_failure(int fd, size_t page_size, size_t total_size) > +{ > + struct sigaction sa_old, sa_new = { > + .sa_handler = fault_sigbus_handler, > + }; > + void *memory_failure_addr; > + char *mem; > + int ret; > + > + mem = mmap(NULL, total_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); > + TEST_ASSERT(mem != MAP_FAILED, "mmap() for guest_memfd should succeed."); > + > + memset(mem, 0xaa, page_size); > + My bad. If the above was changed from page_size to total_size, the page would have been faulted in, and then we get a SIGBUS. > + memory_failure_addr = mem + page_size; > + sigaction(SIGBUS, &sa_new, &sa_old); > + if (sigsetjmp(jmpbuf, 1) == 0) { > + madvise(memory_failure_addr, page_size, MADV_HWPOISON); > + TEST_ASSERT(false, "MADV_HWPOISON should have triggered SIGBUS."); > + } > + sigaction(SIGBUS, &sa_old, NULL); > + > + ret = munmap(mem, total_size); > + TEST_ASSERT(!ret, "munmap() should succeed."); > + > + ret = fallocate(fd, FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE, 0, > + total_size); > + TEST_ASSERT(!ret, "Truncate the entire file (cleanup) should succeed."); > } > > static void test_mmap_not_supported(int fd, size_t page_size, size_t total_size) > @@ -286,6 +331,7 @@ static void test_guest_memfd(unsigned long vm_type) > if (flags & GUEST_MEMFD_FLAG_MMAP) { > test_mmap_supported(fd, page_size, total_size); > test_fault_overflow(fd, page_size, total_size); > + test_memory_failure(fd, page_size, total_size); > } else { > test_mmap_not_supported(fd, page_size, total_size); > } > > Is this by design or should some new memory_failure handling be added?