Re: [PATCH 4/5] KVM: RISC-V: reset VCPU state when becoming runnable

Radim Krčmář <rkrcmar@xxxxxxxxxxxxxxxx> · Wed, 30 Apr 2025 13:45:29 +0200

2025-04-30T15:47:13+05:30, Anup Patel <anup@xxxxxxxxxxxxxx>:
> On Wed, Apr 30, 2025 at 1:59 PM Radim Krčmář <rkrcmar@xxxxxxxxxxxxxxxx> wrote:
>> 2025-04-30T10:56:35+05:30, Anup Patel <anup@xxxxxxxxxxxxxx>:
>> > On Wed, Apr 30, 2025 at 9:52 AM Anup Patel <anup@xxxxxxxxxxxxxx> wrote:
>> >> On Tue, Apr 29, 2025 at 9:51 PM Radim Krčmář <rkrcmar@xxxxxxxxxxxxxxxx> wrote:
>> >> > The point of this patch is to reset the boot VCPU, so we reset the VCPU
>> >> > that is made runnable by the KVM_SET_MP_STATE IOCTL.
>> >>
>> >> Like I said before, we don't need to do this. The initiating VCPU
>> >> can be resetted just before exiting to user space for system reset
>> >> event exit.
>>
>> You assume initiating VCPU == boot VCPU.
>>
>> We should prevent KVM_SET_MP_STATE IOCTL for all non-initiating VCPUs if
>> we decide to accept the assumption.
>
> There is no such assumption.

You probably haven't intended it:

  1) VCPU 0 is "chilling" in userspace.
  2) VCPU 1 initiates SBI reset.
  3) VCPU 1 makes a reset request to VCPU 0.
  4) VCPU 1 returns to userspace.
  5) Userspace knows it should reset the VM.
  6) VCPU 0 still hasn't entered KVM.
  7) Userspace sets the initial state of VCPU 0 and enters KVM.
  8) VCPU 0 is reset in KVM, because of the pending request.
  9) The initial boot state from userspace is lost.

>> I'd rather choose a different design, though.
>>
>> How about a new userspace interface for IOCTL reset?
>> (Can be capability toggle for KVM_SET_MP_STATE or a straight new IOCTL.)
>>
>> That wouldn't "fix" current userspaces, but would significantly improve
>> the sanity of the KVM interface.
>
> I believe the current implementation needs a few improvements
> that's all. We certainly don't need to introduce any new IOCTL.

I do too.  The whole patch could have been a single line:

diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
index d3d957a9e5c4..b3e6ad87e1cd 100644
--- a/arch/riscv/kvm/vcpu.c
+++ b/arch/riscv/kvm/vcpu.c
@@ -511,6 +511,7 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
 
 	switch (mp_state->mp_state) {
 	case KVM_MP_STATE_RUNNABLE:
+		kvm_riscv_reset_vcpu(vcpu);
 		WRITE_ONCE(vcpu->arch.mp_state, *mp_state);
 		break;
 	case KVM_MP_STATE_STOPPED:

It is the backward compatibility and trying to fix current userspaces
that's making it ugly.  I already gave up on the latter, so we can have
a decently clean solution with the former.

> Also, keep in mind that so far we have avoided any RISC-V
> specific KVM IOCTLs and we should try to keep it that way
> as long as we can.

We can re-use KVM_SET_MP_STATE and add a KVM capability.
Userspace will opt-in to reset the VCPU through the existing IOCTL.

This design will also allow userspace to trigger a VCPU reset without
tearing down the whole VM.