On Thu, May 15, 2025, Rick P Edgecombe wrote:
> On Thu, 2025-05-15 at 11:42 -0700, Vishal Annapurve wrote:
> > On Thu, May 15, 2025 at 11:03 AM Edgecombe, Rick P
> > <rick.p.edgecombe@xxxxxxxxx> wrote:
> > >
> > > On Wed, 2025-05-14 at 16:41 -0700, Ackerley Tng wrote:
> > > > Hello,
> > > >
> > > > This patchset builds upon discussion at LPC 2024 and many guest_memfd
> > > > upstream calls to provide 1G page support for guest_memfd by taking
> > > > pages from HugeTLB.
> > >
> > > Do you have any more concrete numbers on benefits of 1GB huge pages for
> > > guestmemfd/coco VMs? I saw in the LPC talk it has the benefits as:
> > >  - Increase TLB hit rate and reduce page walks on TLB miss
> > >  - Improved IO performance
> > >  - Memory savings of ~1.6% from HugeTLB Vmemmap Optimization (HVO)
> > >  - Bring guest_memfd to parity with existing VMs that use HugeTLB pages
> > >    for backing memory
> > >
> > > Do you know how often the 1GB TDP mappings get shattered by shared pages?
> > >
> > > Thinking from the TDX perspective, we might have bigger fish to fry than
> > > 1.6% memory savings (for example dynamic PAMT), and the rest of the
> > > benefits don't have numbers. How much are we getting for all the
> > > complexity, over say buddy allocated 2MB pages?

TDX may have bigger fish to fry, but some of us have bigger fish to fry
than TDX :-)

> > This series should work for any page sizes backed by hugetlb memory.
> > Non-CoCo VMs, pKVM and Confidential VMs all need hugepages that are
> > essential for certain workloads and will emerge as guest_memfd users.
> > Features like KHO/memory persistence in addition also depend on
> > hugepage support in guest_memfd.
> >
> > This series takes strides towards making guest_memfd compatible with
> > usecases where 1G pages are essential and non-confidential VMs are
> > already exercising them.
> >
> > I think the main complexity here lies in supporting in-place
> > conversion which applies to any huge page size even for buddy
> > allocated 2MB pages or THP.
> >
> > This complexity arises because page structs work at a fixed
> > granularity, future roadmap towards not having page structs for guest
> > memory (at least private memory to begin with) should help towards
> > greatly reducing this complexity.
> >
> > That being said, DPAMT and huge page EPT mappings for TDX VMs remain
> > essential and complement this series well for better memory footprint
> > and overall performance of TDX VMs.
>
> Hmm, this didn't really answer my questions about the concrete benefits.
>
> I think it would help to include this kind of justification for the 1GB
> guestmemfd pages. "essential for certain workloads and will emerge" is a
> bit hard to review against...
>
> I think one of the challenges with coco is that it's almost like a sprint
> to reimplement virtualization. But enough things are changing at once that
> not all of the normal assumptions hold, so it can't copy all the same
> solutions. The recent example was that for TDX huge pages we found that
> normal promotion paths weren't actually yielding any benefit for
> surprising TDX specific reasons.
>
> On the TDX side we are also, at least currently, unmapping private pages
> while they are mapped shared, so any 1GB pages would get split to 2MB if
> there are any shared pages in them. I wonder how many 1GB pages there
> would be after all the shared pages are converted. At smaller TD sizes,
> it could be not much.

You're conflating two different things: guest_memfd allocating and
managing 1GiB physical pages, versus KVM mapping memory into the guest
at 1GiB/2MiB granularity. Allocating memory in 1GiB chunks is useful
even if KVM can only map memory into the guest using 4KiB pages.

> So for TDX in isolation, it seems like jumping out too far ahead to
> effectively consider the value.
> But presumably you guys are testing this on SEV or something? Have you
> measured any performance improvement? For what kind of applications? Or
> is the idea to basically to make guestmemfd work like however Google
> does guest memory?

The longer term goal of guest_memfd is to make it suitable for backing
all VMs, hence Vishal's "Non-CoCo VMs" comment. Yes, some of this is
useful for TDX, but we (and others) want to use guest_memfd for far more
than just CoCo VMs. And for non-CoCo VMs, 1GiB hugepages are mandatory
for various workloads.