On Fri, Jun 27, 2025, Konrad Rzeszutek Wilk wrote:
> On Fri, Jun 27, 2025 at 08:23:52AM +0200, Alexandre Chartre wrote:
> >
> > On 6/27/25 07:41, Xiaoyao Li wrote:
> > > On 6/26/2025 10:02 PM, Sean Christopherson wrote:
> > > > +Jim
> > > >
> > > > For the scope, "KVM: x86:"
> > > >
> > > > On Thu, Jun 26, 2025, Alexandre Chartre wrote:
> > > > > KVM emulates the ARCH_CAPABILITIES on x86 for both vmx and svm.
> > > > > However the IA32_ARCH_CAPABILITIES MSR is an Intel-specific MSR
> > > > > so it makes no sense to emulate it on AMD.
> > > > >
> > > > > The AMD documentation specifies that this MSR is not defined on
> > > > > the AMD architecture. So emulating this MSR on AMD can even cause
> > > > > issues (like Windows BSOD) as the guest OS might not expect this
> > > > > MSR to exist on such architecture.
> > > > >
> > > > > Signed-off-by: Alexandre Chartre <alexandre.chartre@xxxxxxxxxx>
> > > > > ---
> > > > >
> > > > > A similar patch was submitted some years ago but it looks like it fell
> > > > > through the cracks:
> > > > > https://lore.kernel.org/kvm/20190307093143.77182-1-xiaoyao.li@xxxxxxxxxxxxxxx/
> > > > It didn't fall through the cracks, we deliberately elected to emulate the MSR in
> > > > common code so that KVM's advertised CPUID support would match KVM's emulation.
> > > >
> > > > On Thu, 2019-03-07 at 19:15 +0100, Paolo Bonzini wrote:
> > > > > On 07/03/19 18:37, Sean Christopherson wrote:
> > > > > > On Thu, Mar 07, 2019 at 05:31:43PM +0800, Xiaoyao Li wrote:
> > > > > > > At present, we report F(ARCH_CAPABILITIES) for x86 arch(both vmx and svm)
> > > > > > > unconditionally, but we only emulate this MSR in vmx. It will cause #GP
> > > > > > > while guest kernel rdmsr(MSR_IA32_ARCH_CAPABILITIES) in an AMD host.
> > > > > > >
> > > > > > > Since MSR IA32_ARCH_CAPABILITIES is an intel-specific MSR, it makes no
> > > > > > > sense to emulate it in svm. Thus this patch chooses to only emulate it
> > > > > > > for vmx, and moves the related handling to vmx related files.
> > > > > >
> > > > > > What about emulating the MSR on an AMD host for testing purposes? It
> > > > > > might be a useful way for someone without Intel hardware to test spectre
> > > > > > related flows.
> > > > > >
> > > > > > In other words, an alternative to restricting emulation of the MSR to
> > > > > > Intel CPUs would be to move MSR_IA32_ARCH_CAPABILITIES handling into
> > > > > > kvm_{get,set}_msr_common(). Guest access to MSR_IA32_ARCH_CAPABILITIES
> > > > > > is gated by X86_FEATURE_ARCH_CAPABILITIES in the guest's CPUID, e.g.
> > > > > > RDMSR will naturally #GP fault if userspace passes through the host's
> > > > > > CPUID on a non-Intel system.
> > > > >
> > > > > This is also better because it wouldn't change the guest ABI for AMD
> > > > > processors. Dropping CPUID flags is generally not a good idea.
> > > > >
> > > > > Paolo
> > > >
> > > > I don't necessarily disagree about emulating ARCH_CAPABILITIES being pointless,
> > > > but Paolo's point about not changing ABI for existing setups still stands. This
> > > > has been KVM's behavior for 6 years (since commit 0cf9135b773b ("KVM: x86: Emulate
> > > > MSR_IA32_ARCH_CAPABILITIES on AMD hosts")); 7 years, if we go back to when KVM
> > > > enumerated support without emulating the MSR (commit 1eaafe91a0df ("kvm: x86:
> > > > IA32_ARCH_CAPABILITIES is always supported")).
> > > >
> > > > And it's not like KVM is forcing userspace to enumerate support for
> > > > ARCH_CAPABILITIES, e.g. QEMU's named AMD configs don't enumerate support. So
> > > > while I completely agree KVM's behavior is odd and annoying for userspace to deal
> > > > with, this is probably something that should be addressed in userspace.
> > > >
> > > > > I am resurrecting this change because some recent Windows updates (like OS Build
> > > > > 26100.4351) crash on AMD KVM guests (BSOD with Stop code: UNSUPPORTED PROCESSOR)
> > > > > just because the ARCH_CAPABILITIES is available.
> > >
> > > Isn't this a Windows bug? I think it is incorrect to assume AMD will never
> > > implement ARCH_CAPABILITIES.
> > >
> >
> > Yes, although on one hand they are just following the current AMD specification which
> > says that ARCH_CAPABILITIES is not defined on AMD cpus; but on the other hand they are
> > breaking a 6+ years behavior. So it might be nice if we could prevent such an issue in
> > the future.
>
> Hi Sean,
>
> Part of the virtualization stack is to lie accurately and in this case
> KVM is doing it incorrectly.

No, KVM isn't doing anything "incorrectly". The ioctl in question,
KVM_GET_SUPPORTED_CPUID, advertises what *KVM* supports. The CPUID model that
is configured for and presented to the guest is fully controlled by userspace,
i.e. by QEMU.

And relative to what KVM is advertising, KVM's behavior is correct. Prior to
commit 0cf9135b773b, KVM was indeed buggy, because KVM didn't emulate a feature
that was advertised to userspace. But that hasn't been the case for 6+ years.

Even if KVM were explicitly setting guest CPUID, KVM's behavior _still_ wouldn't
be incorrect, because it wouldn't violate AMD's architecture. Per AMD's APM,
software cannot assume reserved CPUID bits are '0':

  All bit positions that are not defined as fields are reserved. The value of
  bits within reserved ranges cannot be relied upon to be zero. Software must
  mask off all reserved bits in the return value prior to making any value
  comparisons of represented information.

> Not fixing it b/c of it being for 7 years in and being part of an ABI but
> saying it should be fixed in QEMU sounds like you agree technically, but are
> constrained by a policy.

I'm not constrained by policy, I'm weighing the risk vs. reward of changing
KVM's ABI to remedy a problem that affects exactly one configuration in one VMM,
is relatively straightforward to address in said VMM, and has already been fixed
in the affected guest kernel (because as above, QEMU's behavior isn't a violation
of AMD's architecture).
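
To illustrate what "address in said VMM" could look like, here is a minimal
sketch of a userspace filter that clears the ARCH_CAPABILITIES bit,
CPUID.(EAX=7,ECX=0):EDX[29], from a CPUID table when the guest vendor string is
AMD. This is only an illustration, not QEMU's actual code: struct cpuid_entry is
a hypothetical stand-in for the relevant fields of struct kvm_cpuid_entry2, and
hide_arch_capabilities() and vendor_is_amd() are made-up helper names. A real
VMM would presumably apply something like this to the entries returned by
KVM_GET_SUPPORTED_CPUID before passing them to KVM_SET_CPUID2.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the fields of struct kvm_cpuid_entry2 used here. */
struct cpuid_entry {
	uint32_t function;
	uint32_t index;
	uint32_t eax, ebx, ecx, edx;
};

/* CPUID.(EAX=7,ECX=0):EDX[29] enumerates IA32_ARCH_CAPABILITIES. */
#define CPUID_7_0_EDX_ARCH_CAPABILITIES	(UINT32_C(1) << 29)

/* Returns nonzero if the 12-byte CPUID vendor string is "AuthenticAMD". */
int vendor_is_amd(const char *vendor)
{
	return memcmp(vendor, "AuthenticAMD", 12) == 0;
}

/*
 * Clear the ARCH_CAPABILITIES bit in leaf 7, subleaf 0 for AMD guests only.
 * Other leaves and other vendors are left untouched.  Returns the number of
 * entries that were modified.
 */
size_t hide_arch_capabilities(struct cpuid_entry *entries, size_t nr_entries,
			      const char *vendor)
{
	size_t i, cleared = 0;

	if (!vendor_is_amd(vendor))
		return 0;

	for (i = 0; i < nr_entries; i++) {
		struct cpuid_entry *e = &entries[i];

		if (e->function != 7 || e->index != 0)
			continue;

		if (e->edx & CPUID_7_0_EDX_ARCH_CAPABILITIES) {
			e->edx &= ~CPUID_7_0_EDX_ARCH_CAPABILITIES;
			cleared++;
		}
	}

	return cleared;
}
```

Keying the filter on the guest's vendor string rather than the host's is an
assumption made for the sketch; the point is only that the decision lives in
userspace, so KVM's advertised support and its MSR emulation stay in sync.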