On Tue, 2025-08-26 at 12:30 -0700, Sean Christopherson wrote:
> On Fri, Aug 22, 2025, Colin Percival wrote:
> > On 8/21/25 14:10, David Woodhouse wrote:
> > > On Thu, 2025-08-21 at 13:48 -0700, Sean Christopherson wrote:
> > > > > I think I'm a lot happier with the explicit CPUID leaf exposed by the
> > > > > hypervisor.
> > > >
> > > > Why? If the hypervisor is ultimately the one defining the state, why does it
> > > > matter which CPUID leaf its in?
> > > [...]
> > >
> > > If you tell me that 0x15 is *never* wrong when seen by a KVM guest, and
> > > that it's OK to extend the hardware CPUID support up to 0x15 even on
> > > older CPUs and there'll never be any adverse consequences from weird
> > > assumptions in guest operating systems if we do the latter... well, for
> > > a start, I won't believe you. And even if I do, I won't think it's
> > > worth the risk. Just use a hypervisor leaf :)
>
> But for CoCo VMs (TDX in particular), using a hypervisor leaf is objectively worse,
> because the hypervisor leaf is emulated by the untrusted world, whereas CPUID.0x15
> is emulated by the trusted world (TDX-Module).
>
> If the issue is one of trust, what if we carve out a KVM_FEATURE_xxx bit that
> userspace can set to pinky swear it isn't broken?
>
> > FreeBSD developer here. I'm with David on this, we'll consult the 0x15/0x16
> > CPUID leaves if we don't have anything better, but I'm not going to trust
> > those nearly as much as the 0x40000010 leaf.
> >
> > Also, the 0x40000010 leaf provides the lapic frequency, which AFAIK is not
> > exposed in any other way.
>
> On Intel CPUs, CPUID.0x15 defines the APIC timer frequency:
>
>   The APIC timer frequency will be the processor’s bus clock or core crystal clock
>   frequency (when TSC/core crystal clock ratio is enumerated in CPUID leaf 0x15)
>   divided by the value specified in the divide configuration register.
>
> Thanks to TDX (again), that is also now KVM's ABI.

And AMD's Secure TSC provides it in a GUEST_TSC_FREQ MSR, I believe.

For the non-CoCo cases, I do think we'd need at least that 'I pinky
swear that CPUID 0x15 is telling the truth' bit, because right now, on
today's hypervisors, I believe it might not be correct. So a guest
can't trust it without that bit.

But I'm also concerned about the side-effects of advertising to guests
that everything up to 0x15 is present, on older and AMD CPUs.

And I just don't see the point in that 'pinky swear' bit, when there's
an *existing* hypervisor leaf which just gives the information
directly, which is implemented in QEMU and EC2, as well as various
guests.
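
To make the ordering being argued about concrete, here is a rough
sketch of how a guest might choose between the two sources. It is an
illustration only, not code from any existing guest. The register
layouts are as I understand them: leaf 0x40000010 reports the TSC
frequency in EAX and the bus/LAPIC timer frequency in EBX, both in kHz,
and leaf 0x15 reports the TSC/crystal ratio in EBX/EAX and the crystal
frequency in Hz in ECX. The struct names, the function names and the
trust_0x15 flag (standing in for the proposed 'pinky swear' bit, which
does not exist today) are all made up for the example. The functions
take raw register values so the logic can be exercised without
executing CPUID.

```c
#include <stdbool.h>
#include <stdint.h>

/* Raw register values returned by one CPUID invocation. */
struct cpuid_regs {
	uint32_t eax, ebx, ecx, edx;
};

/* Hypothetical result type; both fields are in Hz. */
struct guest_freqs {
	uint64_t tsc_hz;
	uint64_t lapic_hz;	/* before the divide configuration register */
};

/* Hypervisor timing leaf 0x40000010: EAX = TSC kHz, EBX = LAPIC kHz. */
static bool freqs_from_hv_leaf(const struct cpuid_regs *r,
			       struct guest_freqs *out)
{
	if (r->eax == 0)
		return false;
	out->tsc_hz = (uint64_t)r->eax * 1000;
	out->lapic_hz = (uint64_t)r->ebx * 1000;
	return true;
}

/* Leaf 0x15: EAX = denominator, EBX = numerator, ECX = crystal Hz. */
static bool freqs_from_leaf_0x15(const struct cpuid_regs *r,
				 struct guest_freqs *out)
{
	if (r->eax == 0 || r->ebx == 0 || r->ecx == 0)
		return false;
	out->tsc_hz = (uint64_t)r->ecx * r->ebx / r->eax;
	out->lapic_hz = r->ecx;
	return true;
}

/*
 * Prefer the hypervisor leaf. Fall back to 0x15 only when the
 * (hypothetical) trust_0x15 flag says the hypervisor vouches for it.
 */
static bool pick_freqs(uint32_t max_hv_leaf, const struct cpuid_regs *hv,
		       uint32_t max_basic_leaf, const struct cpuid_regs *l15,
		       bool trust_0x15, struct guest_freqs *out)
{
	if (max_hv_leaf >= 0x40000010 && freqs_from_hv_leaf(hv, out))
		return true;
	if (trust_0x15 && max_basic_leaf >= 0x15 &&
	    freqs_from_leaf_0x15(l15, out))
		return true;
	return false;
}
```

Without the hypervisor leaf and without the trust flag, the sketch
returns false and the guest is left to calibrate the timers itself,
which is the situation described above for today's hypervisors.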