Hi Sean,
I'd like to bring up an issue concerning MSR_IA32_PL0_SSP.
The FRED spec claims:
The FRED SSP MSRs are supported by any processor that enumerates
CPUID.(EAX=7,ECX=1):EAX.FRED[bit 17] as 1. If such a processor does not
support CET, FRED transitions will not use the MSRs (because shadow stacks
are not enabled), but the MSRs would still be accessible using MSR-access
instructions (e.g., RDMSR, WRMSR).
This means KVM needs to handle MSR_IA32_PL0_SSP even when FRED is supported
but CET is not, which breaks down into two subtasks:
1) Allow such a guest to access MSR_IA32_PL0_SSP without triggering #GP.
This behavior is already implemented in patch 8 of this series.
2) Save and restore MSR_IA32_PL0_SSP in both KVM and Qemu for such a guest.
What novel work needs to be done in KVM? For QEMU, I assume it's just adding an
"or FRED" somewhere. For KVM, I'm missing what additional work would be required
that wouldn't be naturally covered by patch 8 (assuming patch 8 is bug-free).
Extra patches:
1) A patch to save/restore the guest MSR_IA32_PL0_SSP (i.e., FRED SSP0), as
was done for RSP0. The following patch applies on top of the one
saving/restoring RSP0:
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 449a5e02c7de..0bf684342a71 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1294,8 +1294,13 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
wrmsrq(MSR_KERNEL_GS_BASE, vmx->msr_guest_kernel_gs_base);
- if (guest_cpu_cap_has(vcpu, X86_FEATURE_FRED))
+ if (guest_cpu_cap_has(vcpu, X86_FEATURE_FRED)) {
wrmsrns(MSR_IA32_FRED_RSP0, vmx->msr_guest_fred_rsp0);
+
+ /* XSAVES/XRSTORS do not cover SSP MSRs */
+ if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK))
+ wrmsrns(MSR_IA32_FRED_SSP0, vmx->msr_guest_fred_ssp0);
+ }
#else
savesegment(fs, fs_sel);
savesegment(gs, gs_sel);
@@ -1349,6 +1354,10 @@ static void vmx_prepare_switch_to_host(struct vcpu_vmx *vmx)
* CPU exits to userspace (RSP0 is a per-task value).
*/
fred_sync_rsp0(vmx->msr_guest_fred_rsp0);
+
+ /* XSAVES/XRSTORS do not cover SSP MSRs */
+ if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK))
+ vmx->msr_guest_fred_ssp0 = read_msr(MSR_IA32_FRED_SSP0);
}
#endif
load_fixmap_gdt(raw_smp_processor_id());
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 733fa2ef4bea..12c1cf827cb7 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -228,6 +228,7 @@ struct vcpu_vmx {
#ifdef CONFIG_X86_64
u64 msr_guest_kernel_gs_base;
u64 msr_guest_fred_rsp0;
+ u64 msr_guest_fred_ssp0;
#endif
u64 spec_ctrl;
We might also want to zero the host MSR_IA32_PL0_SSP when switching to the host.
2) Add vmx_read_guest_fred_ssp0()/vmx_write_guest_fred_ssp0(), and use
them to read/write MSR_IA32_PL0_SSP in patch 8:
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 99106750b1e3..cbdc67682d27 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1400,9 +1408,23 @@ static void vmx_write_guest_fred_rsp0(struct vcpu_vmx *vmx, u64 data)
vmx_write_guest_host_msr(vmx, MSR_IA32_FRED_RSP0, data,
&vmx->msr_guest_fred_rsp0);
}
+
+static u64 vmx_read_guest_fred_ssp0(struct vcpu_vmx *vmx)
+{
+ return vmx_read_guest_host_msr(vmx, MSR_IA32_FRED_SSP0,
+ &vmx->msr_guest_fred_ssp0);
+}
+
+static void vmx_write_guest_fred_ssp0(struct vcpu_vmx *vmx, u64 data)
+{
+ vmx_write_guest_host_msr(vmx, MSR_IA32_FRED_SSP0, data,
+ &vmx->msr_guest_fred_ssp0);
+}
#endif
static void grow_ple_window(struct kvm_vcpu *vcpu)
@@ -2189,6 +2211,18 @@ int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
case MSR_IA32_DEBUGCTLMSR:
msr_info->data = vmx_guest_debugctl_read();
break;
+ case MSR_IA32_PL0_SSP:
+ /*
+ * If kvm_cpu_cap_has(X86_FEATURE_SHSTK) but
+ * !guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK), XSAVES/XRSTORS
+ * cover SSP MSRs.
+ */
+ if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) &&
+ guest_cpu_cap_has(vcpu, X86_FEATURE_FRED)) {
+ msr_info->data = vmx_read_guest_fred_ssp0(vmx);
+ break;
+ }
+ fallthrough;
default:
find_uret_msr:
msr = vmx_find_uret_msr(vmx, msr_info->index);
@@ -2540,7 +2574,18 @@ int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
}
ret = kvm_set_msr_common(vcpu, msr_info);
break;
-
+ case MSR_IA32_PL0_SSP:
+ /*
+ * If kvm_cpu_cap_has(X86_FEATURE_SHSTK) but
+ * !guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK), XSAVES/XRSTORS
+ * cover SSP MSRs.
+ */
+ if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) &&
+ guest_cpu_cap_has(vcpu, X86_FEATURE_FRED)) {
+ vmx_write_guest_fred_ssp0(vmx, data);
+ break;
+ }
+ fallthrough;
default:
find_uret_msr:
msr = vmx_find_uret_msr(vmx, msr_index);
3) Another change I was discussing with Chao:
https://lore.kernel.org/lkml/2ed04dff-e778-46c6-bd5f-51295763af06@xxxxxxxxx/
I have the patches for 2), but they are not included in this series, because:
1) How much do we care about the value in MSR_IA32_PL0_SSP in such a guest?
Yes, Chao told me that you are the one saying that MSRs can be used as
clobber registers and that KVM should preserve their values. Does
MSR_IA32_PL0_SSP in such a guest count?
If the architecture says that MSR_IA32_PL0_SSP exists and is accessible, then
KVM needs to honor that.
2) Saving/restoring MSR_IA32_PL0_SSP adds complexity for an MSR that is
seldom used. Is it worth it?
Honoring the architecture is generally not optional. There are extreme cases
where KVM violates that rule and takes on an (often undocumented) erratum,
e.g. APIC base relocation would require an absurd amount of complexity for no
real-world benefit. But I would be very surprised if the complexity in KVM or
QEMU to support this scenario is at all meaningful, let alone enough to
justify diverging from the architectural spec.
Let me post v7, which includes all the required changes.
Thanks!
Xin