On Tue, 22 Jul 2025 at 15:32, Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Tue, Jul 22, 2025, Fuad Tabba wrote:
> > On Tue, 22 Jul 2025 at 06:36, Xiaoyao Li <xiaoyao.li@xxxxxxxxx> wrote:
> > > - In 0010-KVM-x86-mmu-Rename-.private_max_mapping_level-to-.gm.patch,
> > > there is a double "gmem" in the name of vmx/vt's callback implementation:
> > >
> > >     vt_gmem_gmem_max_mapping_level
> > >     tdx_gmem_gmem_max_mapping_level
> > >     vt_op_tdx_only(gmem_gmem_max_mapping_level)
> >
> > Sean's patches do that, then he fixes it in a later patch. I'll fix
> > this at the source.
>
> Dagnabbit. I goofed a search+replace, caught it when re-reading things, and
> fixed up the wrong commit. Sorry :-(
>
> > > - In 0013-KVM-x86-mmu-Extend-guest_memfd-s-max-mapping-level-t.patch,
> > > kvm_x86_call(gmem_max_mapping_level)(...) returns 0 for the !private case.
> > > It's not correct, though it works without issue currently.
> > >
> > > Because current gmem doesn't support hugepages, the max_level gotten
> > > from gmem is always PG_LEVEL_4K and it returns early in
> > > kvm_gmem_max_mapping_level() on
> > >
> > >     if (max_level == PG_LEVEL_4K)
> > >         return max_level;
> > >
> > > But just look at the following case:
> > >
> > >     return min(max_level,
> > >                kvm_x86_call(gmem_max_mapping_level)(kvm, pfn, is_private));
> > >
> > > For the non-TDX and non-SNP cases, it will return 0, i.e.
> > > PG_LEVEL_NONE eventually.
> > >
> > > So either 1) return PG_LEVEL_NUM/PG_LEVEL_1G for the cases where the
> > > .gmem_max_mapping_level callback doesn't have a specific restriction,
> > >
> > > or 2)
> > >
> > >     tmp = kvm_x86_call(gmem_max_mapping_level)(kvm, pfn, is_private);
> > >     if (tmp)
> > >         return min(max_level, tmp);
> > >
> > >     return max_level;
> >
> > Sean? What do you think?
>
> #2, because KVM uses a "ret0" static call when TDX is disabled (and KVM should
> do the same when SEV is disabled, but the SEV #ifdefs are still a bit messy).
> Switching to any other value would require adding a VMX stub for the !TDX case.
>
> I think it makes sense to explicitly call that out as the "CoCo level", to help
> unfamiliar readers understand why vendor code has any say in the max
> mapping level.
>
> And I would say we adjust max_level instead of having an early return, e.g. to
> reduce the probability of future bugs due to adding code between the call to
> .gmem_max_mapping_level() and the final return.
>
> This as fixup?

Applied it to my tree. Builds and runs fine.

Thanks!
/fuad

> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index eead5dca6f72..a51013e0992a 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -3279,9 +3279,9 @@ static u8 kvm_gmem_max_mapping_level(struct kvm *kvm, struct kvm_page_fault *fau
>                                       const struct kvm_memory_slot *slot, gfn_t gfn,
>                                       bool is_private)
>  {
> +       u8 max_level, coco_level;
>         struct page *page;
>         kvm_pfn_t pfn;
> -       u8 max_level;
>
>         /* For faults, use the gmem information that was resolved earlier. */
>         if (fault) {
> @@ -3305,8 +3305,16 @@ static u8 kvm_gmem_max_mapping_level(struct kvm *kvm, struct kvm_page_fault *fau
>         if (max_level == PG_LEVEL_4K)
>                 return max_level;
>
> -       return min(max_level,
> -                  kvm_x86_call(gmem_max_mapping_level)(kvm, pfn, is_private));
> +       /*
> +        * CoCo may influence the max mapping level, e.g. due to RMP or S-EPT
> +        * restrictions. A return of '0' means "no additional restrictions",
> +        * to allow for using an optional "ret0" static call.
> +        */
> +       coco_level = kvm_x86_call(gmem_max_mapping_level)(kvm, pfn, is_private);
> +       if (coco_level)
> +               max_level = min(max_level, coco_level);
> +
> +       return max_level;
>  }
>
>  int kvm_mmu_max_mapping_level(struct kvm *kvm, struct kvm_page_fault *fault,
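[Editor's note: for anyone following the option #1 vs. option #2 discussion without
the tree handy, here is a minimal, stand-alone sketch of why the explicit check for
a zero return is needed. This is not kernel code: vendor_cb() is a hypothetical
stand-in for the "ret0" static call that kvm_x86_call(gmem_max_mapping_level)()
resolves to when TDX/SEV support is compiled out, and the PG_LEVEL_* values are
simplified constants.]

/* Stand-alone sketch, not kernel code. */
#include <stdio.h>

#define PG_LEVEL_NONE 0
#define PG_LEVEL_4K   1
#define PG_LEVEL_2M   2
#define PG_LEVEL_1G   3

/* Hypothetical stand-in for a "ret0" static call: no CoCo restriction. */
static int vendor_cb(void)
{
	return 0;
}

/* Pre-fixup behaviour: min(max_level, 0) collapses to PG_LEVEL_NONE. */
static int buggy_max_level(int max_level)
{
	int cb = vendor_cb();

	return max_level < cb ? max_level : cb;
}

/* Option #2: treat a return of 0 as "no additional restriction". */
static int fixed_max_level(int max_level)
{
	int coco_level = vendor_cb();

	if (coco_level)
		max_level = max_level < coco_level ? max_level : coco_level;

	return max_level;
}

int main(void)
{
	/* Prints "buggy: 0, fixed: 2", i.e. the old form loses the 2M level. */
	printf("buggy: %d, fixed: %d\n",
	       buggy_max_level(PG_LEVEL_2M), fixed_max_level(PG_LEVEL_2M));
	return 0;
}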