On 30/04/2025 13:11, Gavin Shan wrote:
> On 4/16/25 11:41 PM, Steven Price wrote:
>> The guest can request that a region of its protected address space is
>> switched between RIPAS_RAM and RIPAS_EMPTY (and back) using
>> RSI_IPA_STATE_SET. This causes a guest exit with the
>> RMI_EXIT_RIPAS_CHANGE code. We treat this as a request to convert a
>> protected region to unprotected (or back), exiting to the VMM to make
>> the necessary changes to the guest_memfd and memslot mappings. On the
>> next entry the RIPAS changes are committed by making RMI_RTT_SET_RIPAS
>> calls.
>>
>> The VMM may wish to reject the RIPAS change requested by the guest. For
>> now it can only do so by no longer scheduling the VCPU as we don't
>> currently have a usecase for returning that rejection to the guest, but
>> by postponing the RMI_RTT_SET_RIPAS changes to entry we leave the door
>> open for adding a new ioctl in the future for this purpose.
>>
>> Signed-off-by: Steven Price <steven.price@xxxxxxx>
>> ---
>> Changes since v7:
>>  * Rework the loop in realm_set_ipa_state() to make it clear when the
>>    'next' output value of rmi_rtt_set_ripas() is used.
>> New patch for v7: The code was previously split awkwardly between two
>> other patches.
>> ---
>>  arch/arm64/kvm/rme.c | 88 ++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 88 insertions(+)
>>
>
> One nitpick below, either way:
>
> Reviewed-by: Gavin Shan <gshan@xxxxxxxxxx>
>
>> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
>> index bee9dfe12e03..fe0d5b8703d2 100644
>> --- a/arch/arm64/kvm/rme.c
>> +++ b/arch/arm64/kvm/rme.c
>> @@ -624,6 +624,65 @@ void kvm_realm_unmap_range(struct kvm *kvm, unsigned long start,
>>  	realm_unmap_private_range(kvm, start, end);
>>  }
>> +static int realm_set_ipa_state(struct kvm_vcpu *vcpu,
>> +			       unsigned long start,
>> +			       unsigned long end,
>> +			       unsigned long ripas,
>> +			       unsigned long *top_ipa)
>> +{
>> +	struct kvm *kvm = vcpu->kvm;
>> +	struct realm *realm = &kvm->arch.realm;
>> +	struct realm_rec *rec = &vcpu->arch.rec;
>> +	phys_addr_t rd_phys = virt_to_phys(realm->rd);
>> +	phys_addr_t rec_phys = virt_to_phys(rec->rec_page);
>> +	struct kvm_mmu_memory_cache *memcache = &vcpu->arch.mmu_page_cache;
>> +	unsigned long ipa = start;
>> +	int ret = 0;
>> +
>> +	while (ipa < end) {
>> +		unsigned long next;
>> +
>> +		ret = rmi_rtt_set_ripas(rd_phys, rec_phys, ipa, end, &next);
>> +
>> +		if (RMI_RETURN_STATUS(ret) == RMI_SUCCESS) {
>> +			ipa = next;
>> +		} else if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
>
> --->
>
>> +			int walk_level = RMI_RETURN_INDEX(ret);
>> +			int level = find_map_level(realm, ipa, end);
>> +
>> +			/*
>> +			 * If the RMM walk ended early then more tables are
>> +			 * needed to reach the required depth to set the RIPAS.
>> +			 */
>> +			if (walk_level < level) {
>> +				ret = realm_create_rtt_levels(realm, ipa,
>> +							      walk_level,
>> +							      level,
>> +							      memcache);
>> +				/* Retry with RTTs created */
>> +				if (!ret)
>> +					continue;
>> +			} else {
>> +				ret = -EINVAL;
>> +			}
>> +
>
> <--- This block of code exists in multiple functions. I guess
> it would be worth introducing a helper for it, if you agree.
> Alternatively,
> it's definitely something to do in the future, after this series is
> merged :)

I believe it's just two functions: realm_set_ipa_state() and
realm_init_ipa_state(). Those two functions are doing basically the same
thing, just at different stages (realm_init before the guest has started,
and realm_set when it's running).

I've had a go at combining the two functions. It's a little clunky because
of the differences, but I think it's an improvement over the repeated
code.

Thanks,
Steve

>> +			break;
>> +		} else {
>> +			WARN(1, "Unexpected error in %s: %#x\n", __func__,
>> +			     ret);
>> +			ret = -ENXIO;
>> +			break;
>> +		}
>> +	}
>> +
>> +	*top_ipa = ipa;
>> +
>> +	if (ripas == RMI_EMPTY && ipa != start)
>> +		realm_unmap_private_range(kvm, start, ipa);
>> +
>> +	return ret;
>> +}
>> +
>>  static int realm_init_ipa_state(struct realm *realm,
>>  				unsigned long ipa,
>>  				unsigned long end)
>> @@ -863,6 +922,32 @@ void kvm_destroy_realm(struct kvm *kvm)
>>  	kvm_free_stage2_pgd(&kvm->arch.mmu);
>>  }
>> +static void kvm_complete_ripas_change(struct kvm_vcpu *vcpu)
>> +{
>> +	struct kvm *kvm = vcpu->kvm;
>> +	struct realm_rec *rec = &vcpu->arch.rec;
>> +	unsigned long base = rec->run->exit.ripas_base;
>> +	unsigned long top = rec->run->exit.ripas_top;
>> +	unsigned long ripas = rec->run->exit.ripas_value;
>> +	unsigned long top_ipa;
>> +	int ret;
>> +
>> +	do {
>> +		kvm_mmu_topup_memory_cache(&vcpu->arch.mmu_page_cache,
>> +					   kvm_mmu_cache_min_pages(vcpu->arch.hw_mmu));
>> +		write_lock(&kvm->mmu_lock);
>> +		ret = realm_set_ipa_state(vcpu, base, top, ripas, &top_ipa);
>> +		write_unlock(&kvm->mmu_lock);
>> +
>> +		if (WARN_RATELIMIT(ret && ret != -ENOMEM,
>> +				   "Unable to satisfy RIPAS_CHANGE for %#lx - %#lx, ripas: %#lx\n",
>> +				   base, top, ripas))
>> +			break;
>> +
>> +		base = top_ipa;
>> +	} while (top_ipa < top);
>> +}
>> +
>>  int kvm_rec_enter(struct kvm_vcpu *vcpu)
>>  {
>>  	struct realm_rec *rec = &vcpu->arch.rec;
>> @@ -873,6 +958,9 @@ int kvm_rec_enter(struct kvm_vcpu *vcpu)
>>  		for (int i = 0; i < REC_RUN_GPRS; i++)
>>  			rec->run->enter.gprs[i] = vcpu_get_reg(vcpu, i);
>>  		break;
>> +	case RMI_EXIT_RIPAS_CHANGE:
>> +		kvm_complete_ripas_change(vcpu);
>> +		break;
>>  	}
>>  	if (kvm_realm_state(vcpu->kvm) != REALM_STATE_ACTIVE)
>
> Thanks,
> Gavin
>
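
For anyone following the thread, the retry pattern in the quoted patch
(attempt the RIPAS change, build the missing table levels on an RTT error,
retry, and top up the cache again on -ENOMEM) can be modelled in user
space roughly as below. This is only an illustrative sketch under stated
assumptions: every sketch_* name, the struct fields, the fixed
needed_level and the min_tables parameter are invented for the example.
It is not the kernel or RMM API, and it is not the combined function
mentioned in the reply.

```c
#include <errno.h>

/* Hypothetical stand-ins for the RMI status codes (illustration only). */
enum sketch_status { SKETCH_SUCCESS, SKETCH_ERROR_RTT, SKETCH_ERROR_OTHER };

struct sketch_result {
	enum sketch_status status;
	int walk_level;		/* level where the walk stopped (RTT error) */
	unsigned long next;	/* first address not yet processed (success) */
};

/* Toy realm: assumed fields, not the real struct realm. */
struct sketch_realm {
	int built_level;	/* deepest table level currently present */
	int needed_level;	/* depth required before the state can be set */
	int tables_left;	/* models the per-vCPU memory cache */
	unsigned long step;	/* bytes handled per successful call */
};

/* Mock of the "set state" call: fails with an RTT error until tables exist. */
static struct sketch_result sketch_set_ripas(struct sketch_realm *r,
					     unsigned long ipa,
					     unsigned long end)
{
	struct sketch_result res = { SKETCH_SUCCESS, 0, 0 };

	if (r->built_level < r->needed_level) {
		res.status = SKETCH_ERROR_RTT;
		res.walk_level = r->built_level;
		return res;
	}
	res.next = (end - ipa > r->step) ? ipa + r->step : end;
	return res;
}

/* Build one level at a time so partial progress survives -ENOMEM. */
static int sketch_create_levels(struct sketch_realm *r, int from, int to)
{
	for (int level = from; level < to; level++) {
		if (r->tables_left == 0)
			return -ENOMEM;
		r->tables_left--;
		r->built_level = level + 1;
	}
	return 0;
}

/* Hypothetical shared helper for the block flagged in the review. */
int sketch_handle_rtt_fault(struct sketch_realm *r, int walk_level, int level)
{
	if (walk_level >= level)
		return -EINVAL;	/* walk was deep enough: not a missing table */
	return sketch_create_levels(r, walk_level, level);
}

int sketch_set_ipa_state(struct sketch_realm *r, unsigned long start,
			 unsigned long end, unsigned long *top_ipa)
{
	unsigned long ipa = start;
	int ret = 0;

	while (ipa < end) {
		struct sketch_result res = sketch_set_ripas(r, ipa, end);

		if (res.status == SKETCH_SUCCESS) {
			ipa = res.next;
		} else if (res.status == SKETCH_ERROR_RTT) {
			ret = sketch_handle_rtt_fault(r, res.walk_level,
						      r->needed_level);
			if (ret)
				break;
			/* Tables created: loop around and retry. */
		} else {
			ret = -ENXIO;
			break;
		}
	}
	*top_ipa = ipa;	/* report how far we got, even on error */
	return ret;
}

/* Outer loop: top up the cache, retry on -ENOMEM. Assumes min_tables >= 1. */
int sketch_complete_change(struct sketch_realm *r, unsigned long base,
			   unsigned long top, int min_tables)
{
	unsigned long top_ipa;
	int ret;

	do {
		if (r->tables_left < min_tables)
			r->tables_left = min_tables;
		ret = sketch_set_ipa_state(r, base, top, &top_ipa);
		if (ret && ret != -ENOMEM)
			break;
		base = top_ipa;
	} while (top_ipa < top);
	return ret;
}
```

In this model, calling sketch_complete_change() on a realm with an empty
cache still completes, because each pass through the outer loop tops up
the cache and the inner loop resumes from the table level it reached
last time. With min_tables set to 0 the model would spin forever, which
is why the assumption is stated in the comment.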