On Fri, 2025-05-16 at 10:09 -0300, Jason Gunthorpe wrote:
> > You're conflating two different things. guest_memfd allocating and managing
> > 1GiB physical pages, and KVM mapping memory into the guest at 1GiB/2MiB
> > granularity. Allocating memory in 1GiB chunks is useful even if KVM can
> > only
> > map memory into the guest using 4KiB pages.
>
> Even if KVM is limited to 4K the IOMMU might not be - alot of these
> workloads have a heavy IO component and we need the iommu to perform
> well too.

Oh, interesting point.

> Frankly, I don't think there should be objection to making memory more
> contiguous.

No objections from me to anything except the lack of concrete
justification.

> There is alot of data that this always brings wins
> somewhere for someone.

For the direct map huge page benchmarking, they saw that sometimes 1GB
pages helped, but also sometimes 2MB pages helped. That 1GB pages will
help *some* workload doesn't seem surprising.

> > The longer term goal of guest_memfd is to make it suitable for backing all
> > VMs,
> > hence Vishal's "Non-CoCo VMs" comment. Yes, some of this is useful for TDX,
> > but
> > we (and others) want to use guest_memfd for far more than just CoCo VMs.
> > And
> > for non-CoCo VMs, 1GiB hugepages are mandatory for various workloads.
>
> Yes, even from an iommu perspective with 2D translation we need to
> have the 1G pages from the S2 resident in the IOTLB or performance
> falls off a cliff.

"falls off a cliff" is the level of detail and the direction of
hand-waving I have been hearing. But it also seems modern CPUs are quite
good at hiding the cost of walks with caches etc. Like how 5-level
paging was made unconditional.

I didn't think about the IOTLB though. Thanks for mentioning it.
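To put rough numbers on why 2D (nested) translation misses hurt so much, here is a back-of-envelope sketch. This is my own illustration, not from the thread: it assumes x86-style radix page tables where every guest-table pointer is a guest-physical address that itself needs a stage-2 walk, giving the well-known worst case of (g+1)*(h+1)-1 memory references for g guest levels and h host levels. Larger stage-2 mappings shorten the host walk, which is one way to read the "1G pages from the S2" point.

```python
# Worst-case memory references for a two-dimensional (nested) page walk.
# Assumes x86-style radix tables; levels are counted per stage.
# Illustrative arithmetic only, not a model of any specific IOMMU.

def nested_walk_refs(guest_levels: int, host_levels: int) -> int:
    """Each of the guest_levels table pointers needs a host_levels-deep
    stage-2 walk, plus the final guest-physical -> host-physical
    translation of the data address itself."""
    return (guest_levels + 1) * (host_levels + 1) - 1

# 4-level stage-1 x 4-level stage-2, all 4KiB mappings:
print(nested_walk_refs(4, 4))   # -> 24

# 1GiB stage-2 mappings cut the host walk to 2 levels:
print(nested_walk_refs(4, 2))   # -> 14
```

So on an IOTLB (or walk-cache) miss the hardware may chase on the order of dozens of pointers instead of zero, which is the mechanism behind "falls off a cliff" claims, even if the actual slope depends heavily on how well intermediate walk caches hold up.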