On Wed, May 14, 2025 at 04:15:14AM +0800, Edgecombe, Rick P wrote:
> On Thu, 2025-04-24 at 11:06 +0800, Yan Zhao wrote:
> > From: "Edgecombe, Rick P" <rick.p.edgecombe@xxxxxxxxx>
> >
> > Disallow page merging (huge page adjustment) for mirror root by leveraging
> > the disallowed_hugepage_adjust().
> >
> > [Yan: Passing is_mirror to disallowed_hugepage_adjust()]
> >
> > Signed-off-by: Edgecombe, Rick P <rick.p.edgecombe@xxxxxxxxx>
> > Signed-off-by: Yan Zhao <yan.y.zhao@xxxxxxxxx>
> > ---
> >  arch/x86/kvm/mmu/mmu.c          | 6 +++---
> >  arch/x86/kvm/mmu/mmu_internal.h | 2 +-
> >  arch/x86/kvm/mmu/paging_tmpl.h  | 2 +-
> >  arch/x86/kvm/mmu/tdp_mmu.c      | 7 ++++---
> >  4 files changed, 9 insertions(+), 8 deletions(-)
> >
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index a284dce227a0..b923deeeb62e 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -3326,13 +3326,13 @@ void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
> >  	fault->pfn &= ~mask;
> >  }
> >
> > -void disallowed_hugepage_adjust(struct kvm_page_fault *fault, u64 spte, int cur_level)
> > +void disallowed_hugepage_adjust(struct kvm_page_fault *fault, u64 spte, int cur_level, bool is_mirror)
> >  {
> >  	if (cur_level > PG_LEVEL_4K &&
> >  	    cur_level == fault->goal_level &&
> >  	    is_shadow_present_pte(spte) &&
> >  	    !is_large_pte(spte) &&
> > -	    spte_to_child_sp(spte)->nx_huge_page_disallowed) {
> > +	    (spte_to_child_sp(spte)->nx_huge_page_disallowed || is_mirror)) {
> >  		/*
> >  		 * A small SPTE exists for this pfn, but FNAME(fetch),
> >  		 * direct_map(), or kvm_tdp_mmu_map() would like to create a
> > @@ -3363,7 +3363,7 @@ static int direct_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
> >  		 * large page, as the leaf could be executable.
> >  		 */
> >  		if (fault->nx_huge_page_workaround_enabled)
> > -			disallowed_hugepage_adjust(fault, *it.sptep, it.level);
> > +			disallowed_hugepage_adjust(fault, *it.sptep, it.level, false);
> >
> >  		base_gfn = gfn_round_for_level(fault->gfn, it.level);
> >  		if (it.level == fault->goal_level)
> > diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
> > index db8f33e4de62..1c1764f46e66 100644
> > --- a/arch/x86/kvm/mmu/mmu_internal.h
> > +++ b/arch/x86/kvm/mmu/mmu_internal.h
> > @@ -411,7 +411,7 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
> >  int kvm_mmu_max_mapping_level(struct kvm *kvm,
> >  			      const struct kvm_memory_slot *slot, gfn_t gfn);
> >  void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
> > -void disallowed_hugepage_adjust(struct kvm_page_fault *fault, u64 spte, int cur_level);
> > +void disallowed_hugepage_adjust(struct kvm_page_fault *fault, u64 spte, int cur_level, bool is_mirror);
> >
> >  void track_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp);
> >  void untrack_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp);
> > diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
> > index 68e323568e95..1559182038e3 100644
> > --- a/arch/x86/kvm/mmu/paging_tmpl.h
> > +++ b/arch/x86/kvm/mmu/paging_tmpl.h
> > @@ -717,7 +717,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
> >  		 * large page, as the leaf could be executable.
> >  		 */
> >  		if (fault->nx_huge_page_workaround_enabled)
> > -			disallowed_hugepage_adjust(fault, *it.sptep, it.level);
> > +			disallowed_hugepage_adjust(fault, *it.sptep, it.level, false);
> >
> >  		base_gfn = gfn_round_for_level(fault->gfn, it.level);
> >  		if (it.level == fault->goal_level)
> > diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> > index 405874f4d088..8ee01277cc07 100644
> > --- a/arch/x86/kvm/mmu/tdp_mmu.c
> > +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> > @@ -1244,6 +1244,7 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
> >  	struct tdp_iter iter;
> >  	struct kvm_mmu_page *sp;
> >  	int ret = RET_PF_RETRY;
> > +	bool is_mirror = is_mirror_sp(root);
> >
> >  	kvm_mmu_hugepage_adjust(vcpu, fault);
> >
> > @@ -1254,8 +1255,8 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
> >  	for_each_tdp_pte(iter, kvm, root, fault->gfn, fault->gfn + 1) {
> >  		int r;
> >
> > -		if (fault->nx_huge_page_workaround_enabled)
> > -			disallowed_hugepage_adjust(fault, iter.old_spte, iter.level);
> > +		if (fault->nx_huge_page_workaround_enabled || is_mirror)
> Maybe we should rename nx_huge_page_workaround_enabled to something more generic
> and do the is_mirror logic in kvm_mmu_do_page_fault() when setting it. It should
> shrink the diff and centralize the logic.

Hmm, I'm reluctant to rename nx_huge_page_workaround_enabled, because

(1) Invoking disallowed_hugepage_adjust() for the mirror root is meant to
    disable page promotion for TDX private memory, so it applies only to the
    TDP MMU.

(2) nx_huge_page_workaround_enabled is used specifically for NX huge pages:

    fault->huge_page_disallowed = fault->exec && fault->nx_huge_page_workaround_enabled;

    if (fault->huge_page_disallowed)
        account_nx_huge_page(vcpu->kvm, sp, fault->req_level >= it.level);

    sp->nx_huge_page_disallowed = fault->huge_page_disallowed;

    Affecting fault->huge_page_disallowed would impact
    sp->nx_huge_page_disallowed as well and would disable huge pages entirely.

So, we still need to keep nx_huge_page_workaround_enabled.

If we introduced a new flag, fault->disable_hugepage_adjust, and set it in
kvm_mmu_do_page_fault(), we would also need to invoke
tdp_mmu_get_root_for_fault() there.

Checking for the mirror root is not necessary for non-TDX VMs, and that
invocation of tdp_mmu_get_root_for_fault() seems redundant with the one in
kvm_tdp_mmu_map().

> > +			disallowed_hugepage_adjust(fault, iter.old_spte, iter.level, is_mirror);
> >
> >  		/*
> >  		 * If SPTE has been frozen by another thread, just give up and
> > @@ -1278,7 +1279,7 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
> >  		 */
> >  		sp = tdp_mmu_alloc_sp(vcpu);
> >  		tdp_mmu_init_child_sp(sp, &iter);
> > -		if (is_mirror_sp(sp))
> > +		if (is_mirror)
> >  			kvm_mmu_alloc_external_spt(vcpu, sp);
> >
> >  		sp->nx_huge_page_disallowed = fault->huge_page_disallowed;
>
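
To make the point about keeping the two conditions separate more concrete, here
is a rough userspace sketch (not kernel code). All toy_* names are hypothetical
stand-ins invented for illustration. The sketch assumes the adjust helper only
decrements goal_level (the real function also fixes up fault->pfn, omitted
here) and models a single present, non-leaf 2M SPTE whose child is not
NX-disallowed.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the kernel definitions; hypothetical, for illustration only. */
enum { PG_LEVEL_4K = 1, PG_LEVEL_2M = 2 };

struct toy_sp { bool nx_huge_page_disallowed; };

struct toy_spte {
	bool present;          /* models is_shadow_present_pte() */
	bool large;            /* models is_large_pte() */
	struct toy_sp *child;  /* models spte_to_child_sp() */
};

struct toy_fault {
	int goal_level;
	bool exec;
	bool nx_huge_page_workaround_enabled;
	bool huge_page_disallowed;
};

/* Same condition as the patched helper; only the goal_level update is modeled. */
static void toy_disallowed_hugepage_adjust(struct toy_fault *fault,
					   const struct toy_spte *spte,
					   int cur_level, bool is_mirror)
{
	if (cur_level > PG_LEVEL_4K &&
	    cur_level == fault->goal_level &&
	    spte->present && !spte->large &&
	    (spte->child->nx_huge_page_disallowed || is_mirror))
		fault->goal_level--;
}

/* One step of the walk at the 2M level, gated as in the kvm_tdp_mmu_map() hunk. */
static struct toy_fault toy_walk(bool nx_workaround, bool is_mirror)
{
	struct toy_sp child = { .nx_huge_page_disallowed = false };
	struct toy_spte spte = { .present = true, .large = false, .child = &child };
	struct toy_fault fault = {
		.goal_level = PG_LEVEL_2M,
		.exec = true,
		.nx_huge_page_workaround_enabled = nx_workaround,
	};

	/* Derived only from the NX workaround, never from is_mirror. */
	fault.huge_page_disallowed = fault.exec &&
				     fault.nx_huge_page_workaround_enabled;

	if (fault.nx_huge_page_workaround_enabled || is_mirror)
		toy_disallowed_hugepage_adjust(&fault, &spte, PG_LEVEL_2M, is_mirror);

	return fault;
}
```

In this model, toy_walk(false, true) drops goal_level to 4K while leaving
huge_page_disallowed clear, so nothing would be propagated into
sp->nx_huge_page_disallowed. Folding is_mirror into
nx_huge_page_workaround_enabled would instead set huge_page_disallowed for
every exec fault on a mirror root.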