On Mon, May 05, 2025 at 11:03:00AM -0700, Sean Christopherson wrote: > Set the magic BP_SPEC_REDUCE bit to mitigate SRSO when running VMs if and > only if KVM has at least one active VM. Leaving the bit set at all times > unfortunately degrades performance by a wee bit more than expected. > > Use a dedicated spinlock and counter instead of hooking virtualization > enablement, as changing the behavior of kvm.enable_virt_at_load based on > SRSO_BP_SPEC_REDUCE is painful, and has its own drawbacks, e.g. could > result in performance issues for flows that are sensitive to VM creation > latency. > > Defer setting BP_SPEC_REDUCE until VMRUN is imminent to avoid impacting > performance on CPUs that aren't running VMs, e.g. if a setup is using > housekeeping CPUs. Setting BP_SPEC_REDUCE in task context, i.e. without > blasting IPIs to all CPUs, also helps avoid serializing 1<=>N transitions > without incurring a gross amount of complexity (see the Link for details > on how ugly coordinating via IPIs gets). > > Link: https://lore.kernel.org/all/aBOnzNCngyS_pQIW@xxxxxxxxxx > Fixes: 8442df2b49ed ("x86/bugs: KVM: Add support for SRSO_MSR_FIX") > Reported-by: Michael Larabel <Michael@xxxxxxxxxxxxxxxxxx> > Closes: https://www.phoronix.com/review/linux-615-amd-regression > Cc: Borislav Petkov <bp@xxxxxxxxx> > Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx> > --- > > v2: Defer setting BP_SPEC_REDUCE until VMRUN is imminent, which in turn > allows for eliding the lock on 0<=>1 transitions as there is no race > with CPUs doing VMRUN before receiving the IPI to set the bit, and > having multiple tasks take the lock during svm_srso_vm_init() is a-ok. > > v1: https://lore.kernel.org/all/20250502223456.887618-1-seanjc@xxxxxxxxxx > > arch/x86/kvm/svm/svm.c | 71 ++++++++++++++++++++++++++++++++++++++---- > arch/x86/kvm/svm/svm.h | 2 ++ > 2 files changed, 67 insertions(+), 6 deletions(-) > > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c > index cc1c721ba067..15f7a0703c16 100644 > --- a/arch/x86/kvm/svm/svm.c > +++ b/arch/x86/kvm/svm/svm.c > @@ -607,9 +607,6 @@ static void svm_disable_virtualization_cpu(void) > kvm_cpu_svm_disable(); > > amd_pmu_disable_virt(); > - > - if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE)) > - msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT); > } > > static int svm_enable_virtualization_cpu(void) > @@ -687,9 +684,6 @@ static int svm_enable_virtualization_cpu(void) > rdmsr(MSR_TSC_AUX, sev_es_host_save_area(sd)->tsc_aux, msr_hi); > } > > - if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE)) > - msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT); > - > return 0; > } > > @@ -1518,6 +1512,63 @@ static void svm_vcpu_free(struct kvm_vcpu *vcpu) > __free_pages(virt_to_page(svm->msrpm), get_order(MSRPM_SIZE)); > } > > +#ifdef CONFIG_CPU_MITIGATIONS > +static DEFINE_SPINLOCK(srso_lock); > +static atomic_t srso_nr_vms; > + > +static void svm_srso_clear_bp_spec_reduce(void *ign) > +{ > + struct svm_cpu_data *sd = this_cpu_ptr(&svm_data); > + > + if (!sd->bp_spec_reduce_set) > + return; > + > + msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT); > + sd->bp_spec_reduce_set = false; > +} > + > +static void svm_srso_vm_destroy(void) > +{ > + if (!cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE)) > + return; > + > + if (atomic_dec_return(&srso_nr_vms)) > + return; > + > + guard(spinlock)(&srso_lock); > + > + /* > + * Verify a new VM didn't come along, acquire the lock, and increment > + * the count before this task acquired the lock. > + */ > + if (atomic_read(&srso_nr_vms)) > + return; > + > + on_each_cpu(svm_srso_clear_bp_spec_reduce, NULL, 1); Just a passing-by comment. I get worried about sending IPIs while holding a spinlock because if someone ever tries to hold that spinlock with IRQs disabled, it may cause a deadlock. This is not the case for this lock, but it's not obvious (at least to me) that holding it in a different code path that doesn't send IPIs with IRQs disabled could cause a problem. You could add a comment, convert it to a mutex to make this scenario impossible, or dismiss my comment as being too paranoid/ridiculous :)