Re: [PATCH v5 3/5] KVM: nVMX: check vmcs12->guest_ia32_debugctl value given by L2

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



+Jim

On Thu, May 22, 2025, Sean Christopherson wrote:
> On Wed, May 21, 2025, Maxim Levitsky wrote:
> > Check the vmcs12 guest_ia32_debugctl value before loading it, to avoid L2
> > being able to load arbitrary values to hardware IA32_DEBUGCTL.
> > 
> > Reviewed-by: Chao Gao <chao.gao@xxxxxxxxx>
> > Signed-off-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx>
> > ---
> >  arch/x86/kvm/vmx/nested.c | 3 ++-
> >  arch/x86/kvm/vmx/vmx.c    | 2 +-
> >  arch/x86/kvm/vmx/vmx.h    | 1 +
> >  3 files changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > index e073e3008b16..00f2b762710c 100644
> > --- a/arch/x86/kvm/vmx/nested.c
> > +++ b/arch/x86/kvm/vmx/nested.c
> > @@ -3146,7 +3146,8 @@ static int nested_vmx_check_guest_state(struct kvm_vcpu *vcpu,
> >  		return -EINVAL;
> >  
> >  	if ((vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS) &&
> > -	    CC(!kvm_dr7_valid(vmcs12->guest_dr7)))
> > +	    (CC(!kvm_dr7_valid(vmcs12->guest_dr7)) ||
> > +	     CC(vmcs12->guest_ia32_debugctl & ~vmx_get_supported_debugctl(vcpu, false))))
> 
> This is a breaking change.  For better or worse (read: worse), KVM's ABI is to
> drop BTF and LBR if they're unsupported (the former is always unsupported).
> Failure to honor that ABI means L1 can't excplitly load what it think is its
> current value into L2.
> 
> I'll slot in a path to provide another helper for checking the validity of
> DEBUGCTL.  I think I've managed to cobble together something that isn't too
> horrific (options are a bit limited due to the existing ugliness).

And then Jim ruined my day.  :-)

As evidenced by this hilarious KUT testcase, it's entirely possible there are
existing KVM guests that are utilizing/abusing DEBUGCTL features.

	/*
	 * Optional RTM test for hardware that supports RTM, to
	 * demonstrate that the current volume 3 of the SDM
	 * (325384-067US), table 27-1 is incorrect. Bit 16 of the exit
	 * qualification for debug exceptions is not reserved. It is
	 * set to 1 if a debug exception (#DB) or a breakpoint
	 * exception (#BP) occurs inside an RTM region while advanced
	 * debugging of RTM transactional regions is enabled.
	 */
	if (this_cpu_has(X86_FEATURE_RTM)) {
		vmcs_write(ENT_CONTROLS,
			   vmcs_read(ENT_CONTROLS) | ENT_LOAD_DBGCTLS);
		/*
		 * Set DR7.RTM[bit 11] and IA32_DEBUGCTL.RTM[bit 15]
		 * in the guest to enable advanced debugging of RTM
		 * transactional regions.
		 */
		vmcs_write(GUEST_DR7, BIT(11));
		vmcs_write(GUEST_DEBUGCTL, BIT(15));
		single_step_guest("Hardware delivered single-step in "
				  "transactional region", starting_dr6, 0);
		check_db_exit(false, false, false, &xbegin, BIT(16),
			      starting_dr6);
	} else {
		vmcs_write(GUEST_RIP, (u64)&skip_rtm);
		enter_guest();
	}

For RTM specifically, disallowing DEBUGCTL.RTM but allowing DR7.RTM seems a bit
silly.  Unless there's a security concern, that can probably be fixed by adding
support for RTM.  Note, there's also a virtualization hole here, as KVM doesn't
vet DR7 beyond checking that bits 63:32 are zero, i.e. a guest could set DR7.RTM
even if KVM doesn't advertise support.  Of course, closing that hole would require
completely dropping support for disabling DR interception, since VMX doesn't
give per-DR controls.

For the other bits, I don't see a good solution.  The only viable options I see
are to silently drop all unsupported bits (maybe with a quirk?), or enforce all
bits and cross our fingers that no L1 VMM is running guests with those bits set
in GUEST_DEBUGCTL.




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux