On Wed, Mar 26, 2025, Yosry Ahmed wrote: > Currently, INVPLGA interception handles it like INVLPG, which flushes > L1's TLB translations for the address. It was implemented in this way > because L1 and L2 shared an ASID. Now, L1 and L2 have separate ASIDs. It > is still harmless to flush L1's translations, but it's only correct > because all translations are flushed on nested transitions anyway. > > In preparation for stopping unconditional flushes on nested transitions, > handle INVPLGA interception properly. If L1 specified zero as the ASID, > this is equivalent to INVLPG, so handle it as such. Otherwise, use > INVPLGA to flush the translations of the appropriate ASID tracked by > KVM, if any. Sync the shadow MMU as well, as L1 invalidated L2's > mappings. > > Signed-off-by: Yosry Ahmed <yosry.ahmed@xxxxxxxxx> > --- > arch/x86/include/asm/kvm_host.h | 2 ++ > arch/x86/kvm/mmu/mmu.c | 5 +++-- > arch/x86/kvm/svm/svm.c | 36 +++++++++++++++++++++++++++++++-- > 3 files changed, 39 insertions(+), 4 deletions(-) > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > index d881e7d276b12..a158d324168a0 100644 > --- a/arch/x86/include/asm/kvm_host.h > +++ b/arch/x86/include/asm/kvm_host.h > @@ -2237,6 +2237,8 @@ int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 error_code, > void *insn, int insn_len); > void kvm_mmu_print_sptes(struct kvm_vcpu *vcpu, gpa_t gpa, const char *msg); > void kvm_mmu_invlpg(struct kvm_vcpu *vcpu, gva_t gva); > +void __kvm_mmu_invalidate_addr(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, > + u64 addr, unsigned long roots, bool gva_flush); > void kvm_mmu_invalidate_addr(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, > u64 addr, unsigned long roots); > void kvm_mmu_invpcid_gva(struct kvm_vcpu *vcpu, gva_t gva, unsigned long pcid); > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c > index e2b1994f12753..d3baa12df84e7 100644 > --- a/arch/x86/kvm/mmu/mmu.c > +++ b/arch/x86/kvm/mmu/mmu.c > @@ -6355,8 +6355,8 @@ static void kvm_mmu_invalidate_addr_in_root(struct kvm_vcpu *vcpu, > write_unlock(&vcpu->kvm->mmu_lock); > } > > -static void __kvm_mmu_invalidate_addr(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, > - u64 addr, unsigned long roots, bool gva_flush) > +void __kvm_mmu_invalidate_addr(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, > + u64 addr, unsigned long roots, bool gva_flush) I don't love passing a boolean to avoid a flush. I especially don't like it in this case because vmx_flush_tlb_gva() has similar logic. Unfortunately, I don't see a better option at this point. :-/ If we do keep the param, it needs to be something like @flush_gva, because I read @gva_flush as "this is a gva flush", and got all kinds of confused when reading the code. > { > int i; > > @@ -6382,6 +6382,7 @@ static void __kvm_mmu_invalidate_addr(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu > kvm_mmu_invalidate_addr_in_root(vcpu, mmu, addr, mmu->prev_roots[i].hpa); > } > } > +EXPORT_SYMBOL_GPL(__kvm_mmu_invalidate_addr); > > void kvm_mmu_invalidate_addr(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, > u64 addr, unsigned long roots) > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c > index 3649707c61d3e..4b95fd6b501e6 100644 > --- a/arch/x86/kvm/svm/svm.c > +++ b/arch/x86/kvm/svm/svm.c > @@ -2505,6 +2505,7 @@ static int clgi_interception(struct kvm_vcpu *vcpu) > > static int invlpga_interception(struct kvm_vcpu *vcpu) > { > + struct vcpu_svm *svm = to_svm(vcpu); > gva_t gva = kvm_rax_read(vcpu); > u32 asid = kvm_rcx_read(vcpu); > > @@ -2514,8 +2515,39 @@ static int invlpga_interception(struct kvm_vcpu *vcpu) > > trace_kvm_invlpga(to_svm(vcpu)->vmcb->save.rip, asid, gva); > > - /* Let's treat INVLPGA the same as INVLPG (can be optimized!) */ > - kvm_mmu_invlpg(vcpu, gva); This code needs to do a noncanonical check (assuming we can't figure out a way to shoehorn this into kvm_mmu_invlpg()). Consuming gva here for the asid != 0 case might be "fine", because INVLPGA won't fault, but it's still a bug, e.g. I don't know what will happen when KVM tries to synchronize MMUs. Another reason I don't love the @flush_gva param :-/ > + /* > + * APM is silent about using INVLPGA to flush the host ASID (i.e. 0). > + * Do the logical thing and handle it like INVLPG. > + */ > + if (asid == 0) { if (!asid) > + kvm_mmu_invlpg(vcpu, gva); > + return kvm_skip_emulated_instruction(vcpu); > + } > + > + /* > + * Check if L1 specified the L2 ASID we are currently tracking. If it > + * isn't, do nothing as we have to handle the TLB flush when switching > + * to the new ASID anyway. > + */ Please avoid pronoouns. And try not to allude to behavior; the above doesn't actually say what happens when switching to a new ASID, only that "we have to handle the TLB flush". E.g. /* * Flush hardware TLB entries only if L1 is flushing KVM's currently * tracked L2 ASID. KVM does a full TLB flush when L1 runs a VMCB with * a different L2 ASID. */ > + if (asid == svm->nested.last_asid) > + invlpga(gva, svm_nested_asid(vcpu->kvm)); > + > + /* > + * If NPT is disabled, sync the shadow page tables as L1 is invalidating > + * mappings for L2. Sync all roots as ASIDs are not tracked in the MMU > + * role. > + * > + * As we are not flushing the current context, skip the gva flush from > + * __kvm_mmu_invalidate_addr(), it would flush the wrong ASID anyway. > + * The correct TLB flush was done above (if needed). > + * > + * This always operates on root_mmu because L1 and L2 share an MMU when > + * NPT is disabled. This can be optimized by invalidating guest roots > + * only. Heh, I had a comment typed up about only need to sync guest roots, and then I read this literal comment. :-) > + */ > + if (!npt_enabled) > + __kvm_mmu_invalidate_addr(vcpu, &vcpu->arch.root_mmu, gva, > + KVM_MMU_ROOTS_ALL, false); > > return kvm_skip_emulated_instruction(vcpu); > } > -- > 2.49.0.395.g12beb8f557-goog >