On Mon, Aug 25, 2025 at 5:44 PM Jinyu Tang <tjytimi@xxxxxxx> wrote:
>
> The kvm_arch_vcpu_load() function is called in two cases for riscv:
> 1. When entering KVM_RUN from a userspace ioctl.
> 2. When a preempted VCPU is scheduled back in.
>
> In the second case, if no other KVM VCPU has run on this CPU since the
> current VCPU was preempted, the guest CSR values (including the AIA CSRs
> and HGATP) are still valid in the hardware and do not need to be
> restored.
>
> This patch skips the CSR write path when:
> 1. The VCPU was previously preempted
>    (vcpu->scheduled_out == 1).
> 2. It is being reloaded on the same physical CPU
>    (vcpu->arch.last_exit_cpu == cpu).
> 3. No other KVM VCPU has used this CPU in the meantime
>    (vcpu == __this_cpu_read(kvm_former_vcpu)).
>
> This avoids many CSR writes under frequent preemption on the same CPU.

Currently, I see the following issues with this patch:

1) It's making Guest usage of IMSIC VS-files on the QEMU virt machine
   very unstable and the Guest never boots. It could be a QEMU issue,
   but I don't want to increase instability on QEMU since it is our
   primary development vehicle.

2) We have CSRs like hedeleg which can be updated by KVM user space via
   the set_guest_debug() ioctl.

The direction of the patch is fine but it is very fragile at the moment.

Regards,
Anup

>
> Signed-off-by: Jinyu Tang <tjytimi@xxxxxxx>
> Reviewed-by: Nutty Liu <nutty.liu@xxxxxxxxxxx>
> ---
> v2 -> v3:
> v2 was missing a critical check because I generated the patch from the
> wrong (experimental) branch. This is fixed in v3. Sorry for the trouble.
>
> v1 -> v2:
> Apply the logic to the AIA CSR load as well. Thanks to Andrew Jones for
> the advice.
>
>  arch/riscv/kvm/vcpu.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
> index f001e5640..66bd3ddd5 100644
> --- a/arch/riscv/kvm/vcpu.c
> +++ b/arch/riscv/kvm/vcpu.c
> @@ -25,6 +25,8 @@
>  #define CREATE_TRACE_POINTS
>  #include "trace.h"
>
> +static DEFINE_PER_CPU(struct kvm_vcpu *, kvm_former_vcpu);
> +
>  const struct _kvm_stats_desc kvm_vcpu_stats_desc[] = {
>          KVM_GENERIC_VCPU_STATS(),
>          STATS_DESC_COUNTER(VCPU, ecall_exit_stat),
> @@ -581,6 +583,10 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>          struct kvm_vcpu_csr *csr = &vcpu->arch.guest_csr;
>          struct kvm_vcpu_config *cfg = &vcpu->arch.cfg;
>
> +        if (vcpu->scheduled_out && vcpu == __this_cpu_read(kvm_former_vcpu) &&
> +            vcpu->arch.last_exit_cpu == cpu)
> +                goto csr_restore_done;
> +
>          if (kvm_riscv_nacl_sync_csr_available()) {
>                  nsh = nacl_shmem();
>                  nacl_csr_write(nsh, CSR_VSSTATUS, csr->vsstatus);
> @@ -624,6 +630,9 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>
>          kvm_riscv_mmu_update_hgatp(vcpu);
>
> +        kvm_riscv_vcpu_aia_load(vcpu, cpu);
> +
> +csr_restore_done:
>          kvm_riscv_vcpu_timer_restore(vcpu);
>
>          kvm_riscv_vcpu_host_fp_save(&vcpu->arch.host_context);
> @@ -633,8 +642,6 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>                  kvm_riscv_vcpu_guest_vector_restore(&vcpu->arch.guest_context,
>                                                      vcpu->arch.isa);
>
> -        kvm_riscv_vcpu_aia_load(vcpu, cpu);
> -
>          kvm_make_request(KVM_REQ_STEAL_UPDATE, vcpu);
>
>          vcpu->cpu = cpu;
> @@ -645,6 +652,8 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>          void *nsh;
>          struct kvm_vcpu_csr *csr = &vcpu->arch.guest_csr;
>
> +        __this_cpu_write(kvm_former_vcpu, vcpu);
> +
>          vcpu->cpu = -1;
>
>          kvm_riscv_vcpu_aia_put(vcpu);
> --
> 2.43.0
>
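
A short sketch of one way the fragility raised in point 2) could be
addressed. This is purely an editor's illustration, not something
proposed in the thread; the field csr_image_dirty and the helper
kvm_riscv_vcpu_mark_csr_dirty() are hypothetical names. The idea is that
any path which changes the in-memory CSR image while the vCPU is not on
a CPU (for example set_guest_debug() updating the cached hedeleg value,
as the review notes) would mark the image dirty, and the fast path would
only skip the full restore when the image is still clean:

    /*
     * Hypothetical hardening sketch -- not the patch's code.
     * Assumes a new per-vCPU flag vcpu->arch.csr_image_dirty.
     */
    static inline void kvm_riscv_vcpu_mark_csr_dirty(struct kvm_vcpu *vcpu)
    {
            /* Called by anything that modifies the cached CSR/cfg values,
             * e.g. a set_guest_debug() path that touches hedeleg. */
            WRITE_ONCE(vcpu->arch.csr_image_dirty, true);
    }

    /* In kvm_arch_vcpu_load(), the skip would gain one more condition: */
    if (vcpu->scheduled_out && vcpu == __this_cpu_read(kvm_former_vcpu) &&
        vcpu->arch.last_exit_cpu == cpu &&
        !READ_ONCE(vcpu->arch.csr_image_dirty))
            goto csr_restore_done;

    /* ... full CSR restore path ... */
    WRITE_ONCE(vcpu->arch.csr_image_dirty, false);

With such a flag the decision to skip would depend on whether the cached
CSR state actually changed, not only on scheduling history, which is the
kind of robustness the review asks for before the optimization can be
applied safely.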