On Fri, 2025-05-16 at 10:09 -0300, Jason Gunthorpe wrote:
> > You're conflating two different things. guest_memfd allocating and managing
> > 1GiB physical pages, and KVM mapping memory into the guest at 1GiB/2MiB
> > granularity. Allocating memory in 1GiB chunks is useful even if KVM can
> > only
> > map memory into the guest using 4KiB pages.
>
> Even if KVM is limited to 4K the IOMMU might not be - alot of these
> workloads have a heavy IO component and we need the iommu to perform
> well too.

Oh, interesting point.

> Frankly, I don't think there should be objection to making memory more
> contiguous.

No objections from me to anything except the lack of concrete
justification.

> There is alot of data that this always brings wins
> somewhere for someone.

For the direct map huge page benchmarking, they saw that sometimes 1GB
pages helped, but also sometimes 2MB pages helped. That 1GB pages will
help *some* workload doesn't seem surprising.

> > The longer term goal of guest_memfd is to make it suitable for backing all
> > VMs,
> > hence Vishal's "Non-CoCo VMs" comment. Yes, some of this is useful for TDX,
> > but
> > we (and others) want to use guest_memfd for far more than just CoCo VMs.
> > And
> > for non-CoCo VMs, 1GiB hugepages are mandatory for various workloads.
>
> Yes, even from an iommu perspective with 2D translation we need to
> have the 1G pages from the S2 resident in the IOTLB or performance
> falls off a cliff.

"falls off a cliff" is the level of detail and the direction of
hand-waving I have been hearing. But it also seems modern CPUs are quite
good at hiding the cost of walks with caches etc. Like how 5-level
paging was made unconditional.

I didn't think about the IOTLB though. Thanks for mentioning it.
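To put rough numbers on why 2D (nested) translation misses hurt so much, here is a back-of-envelope sketch. This is my own illustration, not from the thread: it assumes x86-style radix page tables where every guest-table pointer is a guest-physical address that itself needs a stage-2 walk, giving the well-known worst case of (g+1)*(h+1)-1 memory references for g guest levels and h host levels. Larger stage-2 mappings shorten the host walk, which is one way to read the "1G pages from the S2" point.

```python
# Worst-case memory references for a two-dimensional (nested) page walk.
# Assumes x86-style radix tables; levels are counted per stage.
# Illustrative arithmetic only, not a model of any specific IOMMU.

def nested_walk_refs(guest_levels: int, host_levels: int) -> int:
    """Each of the guest_levels table pointers needs a host_levels-deep
    stage-2 walk, plus the final guest-physical -> host-physical
    translation of the data address itself."""
    return (guest_levels + 1) * (host_levels + 1) - 1

# 4-level stage-1 x 4-level stage-2, all 4KiB mappings:
print(nested_walk_refs(4, 4))   # -> 24

# 1GiB stage-2 mappings cut the host walk to 2 levels:
print(nested_walk_refs(4, 2))   # -> 14
```

So on an IOTLB (or walk-cache) miss the hardware may chase on the order of dozens of pointers instead of zero, which is the mechanism behind "falls off a cliff" claims, even if the actual slope depends heavily on how well intermediate walk caches hold up.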