Re: [PATCH v3 1/2] KVM: x86: Provide a capability to disable APERF/MPERF read intercepts

On Mon, Apr 28, 2025 at 3:58 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Fri, Mar 21, 2025, Jim Mattson wrote:
> > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> > index 2b52eb77e29c..6431cd33f06a 100644
> > --- a/Documentation/virt/kvm/api.rst
> > +++ b/Documentation/virt/kvm/api.rst
> > @@ -7684,6 +7684,7 @@ Valid bits in args[0] are::
> >    #define KVM_X86_DISABLE_EXITS_HLT              (1 << 1)
> >    #define KVM_X86_DISABLE_EXITS_PAUSE            (1 << 2)
> >    #define KVM_X86_DISABLE_EXITS_CSTATE           (1 << 3)
> > +  #define KVM_X86_DISABLE_EXITS_APERFMPERF       (1 << 4)
>
> Might be pre-existing with C-states, but I think the documentation needs to call
> out that userspace is responsible for enumerating APERFMPERF in guest CPUID.
>
> And more importantly, KVM either needs to honor APERFMPERF in each vCPU's CPUID,
> or the documentation needs to call out that KVM doesn't honor guest CPUID for
> APERF/MPERF MSRs.  I don't have a strong preference either way, but I'm leaning
> toward having KVM honor CPUID so that if someone copy+pastes the KVM selftest
> code for the host enabling, it'll do the right thing.  On the other hand, KVM
> doesn't (and shouldn't) fully emulate the MSRs, so I'm a-ok if we ignore CPUID
> entirely (but document it).
>
> Ignoring CPUID entirely would also make it easier to document that KVM doesn't
> support loading/saving C-state or APERF/MPERF MSRs via load/store lists on VM-Enter
> and VM-Exit.  E.g. we can simply say KVM doesn't emulate the MSRs in any capacity,
> and that the capability disables the exit/interception, no more, no less.
>
> Heh, I guess maybe I've talked myself into having KVM ignore guest CPUID :-)

I concur. I will add a note to that effect.
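
Roughly, I'm picturing something along these lines for the
KVM_CAP_X86_DISABLE_EXITS section of api.rst (exact wording TBD, just a
sketch of the intent):

  KVM_X86_DISABLE_EXITS_APERFMPERF only disables interception of guest
  reads of IA32_APERF and IA32_MPERF.  KVM does not otherwise emulate
  the MSRs, does not save/restore them across VM-Enter/VM-Exit, and
  does not consult guest CPUID.  Userspace is responsible for
  enumerating APERFMPERF (CPUID.6:ECX[0]) to the guest, and the values
  observed by the guest are only meaningful if vCPUs are pinned to
  physical CPUs.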

> > diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> > index ea44c1da5a7c..5b38d5c00788 100644
> > --- a/arch/x86/kvm/svm/svm.h
> > +++ b/arch/x86/kvm/svm/svm.h
> > @@ -44,7 +44,7 @@ static inline struct page *__sme_pa_to_page(unsigned long pa)
> >  #define      IOPM_SIZE PAGE_SIZE * 3
> >  #define      MSRPM_SIZE PAGE_SIZE * 2
> >
> > -#define MAX_DIRECT_ACCESS_MSRS       48
> > +#define MAX_DIRECT_ACCESS_MSRS       50
>
> Ugh, I really need to get the MSR interception cleanup series posted.
>
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 4b64ab350bcd..1b3cdca806b4 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -4535,6 +4535,9 @@ static u64 kvm_get_allowed_disable_exits(void)
> >  {
> >       u64 r = KVM_X86_DISABLE_EXITS_PAUSE;
> >
> > +     if (boot_cpu_has(X86_FEATURE_APERFMPERF))
> > +             r |= KVM_X86_DISABLE_EXITS_APERFMPERF;
> > +
> >       if (!mitigate_smt_rsb) {
> >               r |= KVM_X86_DISABLE_EXITS_HLT |
> >                       KVM_X86_DISABLE_EXITS_CSTATE;
> > @@ -6543,7 +6546,8 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
> >
> >               if (!mitigate_smt_rsb && boot_cpu_has_bug(X86_BUG_SMT_RSB) &&
> >                   cpu_smt_possible() &&
> > -                 (cap->args[0] & ~KVM_X86_DISABLE_EXITS_PAUSE))
> > +                 (cap->args[0] & ~(KVM_X86_DISABLE_EXITS_PAUSE |
> > +                                   KVM_X86_DISABLE_EXITS_APERFMPERF)))
> >                       pr_warn_once(SMT_RSB_MSG);
> >
> >               if (cap->args[0] & KVM_X86_DISABLE_EXITS_PAUSE)
> > @@ -6554,6 +6558,8 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
> >                       kvm->arch.hlt_in_guest = true;
> >               if (cap->args[0] & KVM_X86_DISABLE_EXITS_CSTATE)
> >                       kvm->arch.cstate_in_guest = true;
> > +             if (cap->args[0] & KVM_X86_DISABLE_EXITS_APERFMPERF)
> > +                     kvm->arch.aperfmperf_in_guest = true;
>
> Rather than an ever-growing stream of booleans, what about tracking the flags
> as a u64 and providing a builder macro to generate the helpers?  The latter is a
> bit gratuitous, but this seems like the type of boilerplate that would be
> embarrassingly easy to screw up without anyone noticing.
>
> Very lightly tested...
>
> --
> From: Sean Christopherson <seanjc@xxxxxxxxxx>
> Date: Mon, 28 Apr 2025 11:35:47 -0700
> Subject: [PATCH] KVM: x86: Consolidate DISABLE_EXITS_xxx handling into a
>  single kvm_arch field
>
> Replace the individual xxx_in_guest booleans with a single field to track
> exits that have been disabled for a VM.  To further cut down on the amount
> of boilerplate needed for each disabled exit, add a builder macro to
> generate the accessor.
>
> No functional change intended.
>
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> ---
>  arch/x86/include/asm/kvm_host.h |  5 +----
>  arch/x86/kvm/svm/svm.c          |  2 +-
>  arch/x86/kvm/vmx/vmx.c          |  2 +-
>  arch/x86/kvm/x86.c              | 25 ++++++++-----------------
>  arch/x86/kvm/x86.h              | 28 +++++++++-------------------
>  5 files changed, 20 insertions(+), 42 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 6c06f3d6e081..4b174499b29c 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1389,10 +1389,7 @@ struct kvm_arch {
>
>         gpa_t wall_clock;
>
> -       bool mwait_in_guest;
> -       bool hlt_in_guest;
> -       bool pause_in_guest;
> -       bool cstate_in_guest;
> +       u64 disabled_exits;
>
>         unsigned long irq_sources_bitmap;
>         s64 kvmclock_offset;
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index cc1c721ba067..0f0c06be85d6 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -5053,7 +5053,7 @@ static int svm_vm_init(struct kvm *kvm)
>         }
>
>         if (!pause_filter_count || !pause_filter_thresh)
> -               kvm->arch.pause_in_guest = true;
> +               kvm->arch.disabled_exits |= KVM_X86_DISABLE_EXITS_PAUSE;
>
>         if (enable_apicv) {
>                 int ret = avic_vm_init(kvm);
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index ef2d7208dd20..109ade8fc47b 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7613,7 +7613,7 @@ int vmx_vcpu_create(struct kvm_vcpu *vcpu)
>  int vmx_vm_init(struct kvm *kvm)
>  {
>         if (!ple_gap)
> -               kvm->arch.pause_in_guest = true;
> +               kvm->arch.disabled_exits |= KVM_X86_DISABLE_EXITS_PAUSE;
>
>         if (boot_cpu_has(X86_BUG_L1TF) && enable_ept) {
>                 switch (l1tf_mitigation) {
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index f6ce044b090a..3800d6cfecce 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -6591,27 +6591,18 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>                         break;
>
>                 mutex_lock(&kvm->lock);
> -               if (kvm->created_vcpus)
> -                       goto disable_exits_unlock;
> -
> +               if (!kvm->created_vcpus) {
>  #define SMT_RSB_MSG "This processor is affected by the Cross-Thread Return Predictions vulnerability. " \
>                     "KVM_CAP_X86_DISABLE_EXITS should only be used with SMT disabled or trusted guests."
>
> -               if (!mitigate_smt_rsb && boot_cpu_has_bug(X86_BUG_SMT_RSB) &&
> -                   cpu_smt_possible() &&
> -                   (cap->args[0] & ~KVM_X86_DISABLE_EXITS_PAUSE))
> -                       pr_warn_once(SMT_RSB_MSG);
> +                       if (!mitigate_smt_rsb && cpu_smt_possible() &&
> +                           boot_cpu_has_bug(X86_BUG_SMT_RSB) &&
> +                           (cap->args[0] & ~KVM_X86_DISABLE_EXITS_PAUSE))
> +                               pr_warn_once(SMT_RSB_MSG);
>
> -               if (cap->args[0] & KVM_X86_DISABLE_EXITS_PAUSE)
> -                       kvm->arch.pause_in_guest = true;
> -               if (cap->args[0] & KVM_X86_DISABLE_EXITS_MWAIT)
> -                       kvm->arch.mwait_in_guest = true;
> -               if (cap->args[0] & KVM_X86_DISABLE_EXITS_HLT)
> -                       kvm->arch.hlt_in_guest = true;
> -               if (cap->args[0] & KVM_X86_DISABLE_EXITS_CSTATE)
> -                       kvm->arch.cstate_in_guest = true;
> -               r = 0;
> -disable_exits_unlock:
> +                       kvm->arch.disabled_exits |= cap->args[0];
> +                       r = 0;
> +               }
>                 mutex_unlock(&kvm->lock);
>                 break;
>         case KVM_CAP_MSR_PLATFORM_INFO:
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index 88a9475899c8..1675017eea88 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -481,25 +481,15 @@ static inline u64 nsec_to_cycles(struct kvm_vcpu *vcpu, u64 nsec)
>             __rem;                                              \
>          })
>
> -static inline bool kvm_mwait_in_guest(struct kvm *kvm)
> -{
> -       return kvm->arch.mwait_in_guest;
> -}
> -
> -static inline bool kvm_hlt_in_guest(struct kvm *kvm)
> -{
> -       return kvm->arch.hlt_in_guest;
> -}
> -
> -static inline bool kvm_pause_in_guest(struct kvm *kvm)
> -{
> -       return kvm->arch.pause_in_guest;
> -}
> -
> -static inline bool kvm_cstate_in_guest(struct kvm *kvm)
> -{
> -       return kvm->arch.cstate_in_guest;
> -}
> +#define BUILD_DISABLED_EXITS_HELPER(lname, uname)                              \
> +static inline bool kvm_##lname##_in_guest(struct kvm *kvm)                     \
> +{                                                                              \
> +       return kvm->arch.disabled_exits & KVM_X86_DISABLE_EXITS_##uname;        \
> +}
> +BUILD_DISABLED_EXITS_HELPER(hlt, HLT);
> +BUILD_DISABLED_EXITS_HELPER(pause, PAUSE);
> +BUILD_DISABLED_EXITS_HELPER(mwait, MWAIT);
> +BUILD_DISABLED_EXITS_HELPER(cstate, CSTATE);

The boilerplate is bad, but that's abhorrent.
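
If we do go with the u64 field, I'd rather just spell the helpers out by
hand, e.g. (untested sketch, same semantics as the macro version):

static inline bool kvm_mwait_in_guest(struct kvm *kvm)
{
	return kvm->arch.disabled_exits & KVM_X86_DISABLE_EXITS_MWAIT;
}

static inline bool kvm_hlt_in_guest(struct kvm *kvm)
{
	return kvm->arch.disabled_exits & KVM_X86_DISABLE_EXITS_HLT;
}

...and likewise for PAUSE, CSTATE, and APERFMPERF.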

>  static inline bool kvm_notify_vmexit_enabled(struct kvm *kvm)
>  {
>
> base-commit: 45eb29140e68ffe8e93a5471006858a018480a45
> --
