Re: [PATCH 1/3] x86: KVM: VMX: Wrap GUEST_IA32_DEBUGCTL read/write with access functions

On Tue, 2025-04-22 at 16:33 -0700, Sean Christopherson wrote:
> On Tue, Apr 15, 2025, Maxim Levitsky wrote:
> > Instead of reading and writing GUEST_IA32_DEBUGCTL vmcs field directly,
> > wrap the logic with get/set functions.
> 
> Why?  I know why the "set" helper is being added, but it needs to be called out.
> 
> Please omit the getter entirely, it does nothing more than obfuscate a very
> simple line of code.

In this patch, yes. But in the next patch I switch to reading from 'vmx->msr_ia32_debugctl'.
Do you want me to open-code this access? I don't mind, if you insist.

> 
> > Also move the checks that the guest's supplied value is valid to the new
> > 'set' function.
> 
> Please do this in a separate patch.  There's no need to mix refactoring and
> functional changes.

I thought it was natural to do this in the same patch. This patch introduces
'vmx_set_guest_debugctl', which should be used any time we set the MSR from a
guest-supplied value, and VM entry is one of these cases.

I can split this if you want.

> 
> > In particular, the above change fixes a minor security issue in which L1
> 
> Bug, yes.  Not sure it constitutes a meaningful security issue though.

I also think so, but I wanted to mention this just in case.

> 
> > hypervisor could set the GUEST_IA32_DEBUGCTL, and eventually the host's
> > MSR_IA32_DEBUGCTL
> 
> No, the lack of a consistency check allows the guest to set the MSR in hardware,
> but that is not the host's value.

That's what I meant - the guest can set the real hardware MSR. Yes, after the
guest exits, the OS value is restored. I'll rephrase this in v2.

> 
> > to any value by performing a VM entry to L2 with VM_ENTRY_LOAD_DEBUG_CONTROLS
> > set.
> 
> Any *legal* value.  Setting completely unsupported bits will result in VM-Enter
> failing with a consistency check VM-Exit.

True.

> 
> > Signed-off-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx>
> > ---
> >  arch/x86/kvm/vmx/nested.c    | 15 +++++++---
> >  arch/x86/kvm/vmx/pmu_intel.c |  9 +++---
> >  arch/x86/kvm/vmx/vmx.c       | 58 +++++++++++++++++++++++-------------
> >  arch/x86/kvm/vmx/vmx.h       |  3 ++
> >  4 files changed, 57 insertions(+), 28 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > index e073e3008b16..b7686569ee09 100644
> > --- a/arch/x86/kvm/vmx/nested.c
> > +++ b/arch/x86/kvm/vmx/nested.c
> > @@ -2641,6 +2641,7 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
> >  	struct vcpu_vmx *vmx = to_vmx(vcpu);
> >  	struct hv_enlightened_vmcs *evmcs = nested_vmx_evmcs(vmx);
> >  	bool load_guest_pdptrs_vmcs12 = false;
> > +	u64 new_debugctl;
> >  
> >  	if (vmx->nested.dirty_vmcs12 || nested_vmx_is_evmptr12_valid(vmx)) {
> >  		prepare_vmcs02_rare(vmx, vmcs12);
> > @@ -2653,11 +2654,17 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
> >  	if (vmx->nested.nested_run_pending &&
> >  	    (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS)) {
> >  		kvm_set_dr(vcpu, 7, vmcs12->guest_dr7);
> > -		vmcs_write64(GUEST_IA32_DEBUGCTL, vmcs12->guest_ia32_debugctl);
> > +		new_debugctl = vmcs12->guest_ia32_debugctl;
> >  	} else {
> >  		kvm_set_dr(vcpu, 7, vcpu->arch.dr7);
> > -		vmcs_write64(GUEST_IA32_DEBUGCTL, vmx->nested.pre_vmenter_debugctl);
> > +		new_debugctl = vmx->nested.pre_vmenter_debugctl;
> >  	}
> > +
> > +	if (CC(!vmx_set_guest_debugctl(vcpu, new_debugctl, false))) {
> 
> The consistency check belongs in nested_vmx_check_guest_state(), only needs to
> check the VM_ENTRY_LOAD_DEBUG_CONTROLS case, and should be posted as a separate
> patch.

I can move it there. Can you explain why you want this, though? Is it because of the
order of checks specified in the PRM?

Currently GUEST_IA32_DEBUGCTL of the host is *written* in prepare_vmcs02. 
Should I also move this write to nested_vmx_check_guest_state?

Or should I write the value blindly in prepare_vmcs02, then check the value
of 'vmx->msr_ia32_debugctl' in nested_vmx_check_guest_state and fail if it
contains reserved bits?
I don't like that idea much.
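
To make the question concrete, here is a user-space sketch (not kernel code) of
the split being suggested: validate in a check phase, only when
VM_ENTRY_LOAD_DEBUG_CONTROLS is set, and keep the prepare phase as a plain
selection of the value to write. The 'model_*' names and the trimmed-down struct
are hypothetical, the 'supported' mask stands in for whatever
vmx_get_supported_debugctl() would return, and the nested_run_pending condition
from prepare_vmcs02 is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Value as defined in arch/x86/include/asm/vmx.h. */
#define VM_ENTRY_LOAD_DEBUG_CONTROLS 0x00000004u

/* Hypothetical stand-in for the two vmcs12 fields that matter here. */
struct model_vmcs12 {
	uint32_t vm_entry_controls;
	uint64_t guest_ia32_debugctl;
};

/*
 * Check phase: the vmcs12 value only needs validating when L1 asked for it
 * to be loaded. Returns true when the guest state is acceptable.
 */
static bool model_check_guest_debugctl(const struct model_vmcs12 *vmcs12,
				       uint64_t supported)
{
	if (!(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS))
		return true;

	return !(vmcs12->guest_ia32_debugctl & ~supported);
}

/*
 * Prepare phase: no validation, just pick which value goes into the
 * vmcs02 field.
 */
static uint64_t model_pick_debugctl(const struct model_vmcs12 *vmcs12,
				    uint64_t pre_vmenter_debugctl)
{
	if (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS)
		return vmcs12->guest_ia32_debugctl;

	return pre_vmenter_debugctl;
}
```

In this model the prepare phase can never fail, which is the property I am
asking about: whether that is the reason for keeping the check in
nested_vmx_check_guest_state.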


> 
> > +		*entry_failure_code = ENTRY_FAIL_DEFAULT;
> > +		return -EINVAL;
> > +	}
> > +
> > +static void __vmx_set_guest_debugctl(struct kvm_vcpu *vcpu, u64 data)
> > +{
> > +	vmcs_write64(GUEST_IA32_DEBUGCTL, data);
> > +}
> > +
> > +bool vmx_set_guest_debugctl(struct kvm_vcpu *vcpu, u64 data, bool host_initiated)
> > +{
> > +	u64 invalid = data & ~vmx_get_supported_debugctl(vcpu, host_initiated);
> > +
> > +	if (invalid & (DEBUGCTLMSR_BTF|DEBUGCTLMSR_LBR)) {
> > +		kvm_pr_unimpl_wrmsr(vcpu, MSR_IA32_DEBUGCTLMSR, data);
> > +		data &= ~(DEBUGCTLMSR_BTF|DEBUGCTLMSR_LBR);
> > +		invalid &= ~(DEBUGCTLMSR_BTF|DEBUGCTLMSR_LBR);
> > +	}
> > +
> > +	if (invalid)
> > +		return false;
> > +
> > +	if (is_guest_mode(vcpu) && (get_vmcs12(vcpu)->vm_exit_controls &
> > +					VM_EXIT_SAVE_DEBUG_CONTROLS))
> > +		get_vmcs12(vcpu)->guest_ia32_debugctl = data;
> > +
> > +	if (intel_pmu_lbr_is_enabled(vcpu) && !to_vmx(vcpu)->lbr_desc.event &&
> > +	    (data & DEBUGCTLMSR_LBR))
> > +		intel_pmu_create_guest_lbr_event(vcpu);
> > +
> > +	__vmx_set_guest_debugctl(vcpu, data);
> > +	return true;
> 
> Return 0/-errno, not true/false.

There are plenty of functions in this file and KVM that return boolean.

For example:

static bool nested_vmx_check_eptp(struct kvm_vcpu *vcpu, u64 new_eptp)
static inline bool vmx_control_verify(u32 control, u32 low, u32 high)
static bool nested_evmcs_handle_vmclear(struct kvm_vcpu *vcpu, gpa_t vmptr)

static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu,
						 struct vmcs12 *vmcs12)

static bool nested_get_vmcs12_pages(struct kvm_vcpu *vcpu)

...


I personally think that functions that emulate hardware should return boolean values
or some hardware-specific status code (e.g. a VMX failure code), because the real hardware
never returns -EINVAL and such.
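
For comparison, a user-space sketch (not kernel code) of the two conventions
side by side. The 'model_*' names are hypothetical, 'field' stands in for the
VMCS write, and 'supported' stands in for vmx_get_supported_debugctl(). The
masking mirrors the hunk quoted above, minus the vmcs12 and LBR event handling.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as in arch/x86/include/asm/msr-index.h. */
#define DEBUGCTLMSR_LBR (1ULL << 0)
#define DEBUGCTLMSR_BTF (1ULL << 1)

/* Boolean convention, as in the patch: true on success. */
static bool model_set_debugctl_bool(uint64_t *field, uint64_t data,
				    uint64_t supported)
{
	uint64_t invalid = data & ~supported;

	/* Unsupported BTF/LBR are silently dropped instead of rejected. */
	if (invalid & (DEBUGCTLMSR_BTF | DEBUGCTLMSR_LBR)) {
		data &= ~(DEBUGCTLMSR_BTF | DEBUGCTLMSR_LBR);
		invalid &= ~(DEBUGCTLMSR_BTF | DEBUGCTLMSR_LBR);
	}

	if (invalid)
		return false;

	*field = data;
	return true;
}

/* 0/-errno convention, as requested in the review: same logic, other type. */
static int model_set_debugctl_errno(uint64_t *field, uint64_t data,
				    uint64_t supported)
{
	return model_set_debugctl_bool(field, data, supported) ? 0 : -EINVAL;
}
```

The logic is identical either way; the difference is only in how callers read,
e.g. 'if (!set(...))' versus 'if (set(...))'.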


Best regards,
	Maxim Levitsky
