Re: [PATCH v9 15/43] arm64: RME: Allow VMM to set RIPAS

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Steve,

On 7/10/25 12:42 AM, Steven Price wrote:
On 02/07/2025 01:37, Gavin Shan wrote:
On 6/11/25 8:48 PM, Steven Price wrote:
Each page within the protected region of the realm guest can be marked
as either RAM or EMPTY. Allow the VMM to control this before the guest
has started and provide the equivalent functions to change this (with
the guest's approval) at runtime.

When transitioning from RIPAS RAM (1) to RIPAS EMPTY (0) the memory is
unmapped from the guest and undelegated allowing the memory to be reused
by the host. When transitioning to RIPAS RAM the actual population of
the leaf RTTs is done later on stage 2 fault, however it may be
necessary to allocate additional RTTs to allow the RMM track the RIPAS
for the requested range.

When freeing a block mapping it is necessary to temporarily unfold the
RTT which requires delegating an extra page to the RMM, this page can
then be recovered once the contents of the block mapping have been
freed.

Signed-off-by: Steven Price <steven.price@xxxxxxx>
---
Changes from v8:
   * Propagate the 'may_block' flag to allow conditional calls to
     cond_resched_rwlock_write().
   * Introduce alloc_rtt() to wrap alloc_delegated_granule() and
     kvm_account_pgtable_pages() and use when allocating RTTs.
   * Code reorganisation to allow init_ipa_state and set_ipa_state to
     share a common ripas_change() function,
   * Other minor changes following review.
Changes from v7:
   * Replace use of "only_shared" with the upstream "attr_filter" field
     of struct kvm_gfn_range.
   * Clean up the logic in alloc_delegated_granule() for when to call
     kvm_account_pgtable_pages().
   * Rename realm_destroy_protected_granule() to
     realm_destroy_private_granule() to match the naming elsewhere. Also
     fix the return codes in the function to be descriptive.
   * Several other minor changes to names/return codes.
Changes from v6:
   * Split the code dealing with the guest triggering a RIPAS change into
     a separate patch, so this patch is purely for the VMM setting up the
     RIPAS before the guest first runs.
   * Drop the useless flags argument from alloc_delegated_granule().
   * Account RTTs allocated for a guest using kvm_account_pgtable_pages().
   * Deal with the RMM granule size potentially being smaller than the
     host's PAGE_SIZE. Although note alloc_delegated_granule() currently
     still allocates an entire host page for every RMM granule (so wasting
     memory when PAGE_SIZE>4k).
Changes from v5:
   * Adapt to rebasing.
   * Introduce find_map_level()
   * Rename some functions to be clearer.
   * Drop the "spare page" functionality.
Changes from v2:
   * {alloc,free}_delegated_page() moved from previous patch to this one.
   * alloc_delegated_page() now takes a gfp_t flags parameter.
   * Fix the reference counting of guestmem pages to avoid leaking memory.
   * Several misc code improvements and extra comments.
---
   arch/arm64/include/asm/kvm_rme.h |   6 +
   arch/arm64/kvm/mmu.c             |   8 +-
   arch/arm64/kvm/rme.c             | 447 +++++++++++++++++++++++++++++++
   3 files changed, 458 insertions(+), 3 deletions(-)


With below nitpicks addressed. The changes looks good to me.

Reviewed-by: Gavin Shan <gshan@xxxxxxxxxx>

Thanks, most the nitpicks I agree - thanks for raising. Just one below I
wanted to comment on...

[...]

You're welcome.

+
+enum ripas_action {
+    RIPAS_INIT,
+    RIPAS_SET,
+};
+
+static int ripas_change(struct kvm *kvm,
+            struct kvm_vcpu *vcpu,
+            unsigned long ipa,
+            unsigned long end,
+            enum ripas_action action,
+            unsigned long *top_ipa)
+{

The 'enum ripas_action' is used in limited scope, I would replace it
with a 'bool'
parameter to ripas_change(), something like below. If we plan to support
more actions
in future, then the 'enum ripas_action' makes sense to me.

The v1.1 spec[1] adds RMI_RTT_SET_S2AP (set stage 2 access permission).
So that adds a third option to the enum. I agree the enum is a little
clunky but it allows extension and at least spells out the action which
is occurring.

The part I'm not especially happy with is the 'vcpu' argument which is
not applicable to RIPAS_INIT but otherwise required (and in those cases
could replace 'kvm'). But I couldn't come up with a better solution for
that.

[1] Available from:
https://developer.arm.com/documentation/den0137/latest (following the
small "here" link near the end).


Right, it's as I guessed. A enum looks good if we need to extend it
to cover the third case (RMI_RTT_SET_S2AP in RMMv1.1). Note that I just
started looking into RMMv1.1 implementation several days ago and didn't
have a good understanding on RMMv1.1 at present :-)

Thanks,
Gavin

Thanks,
Steve

static int ripas_change(struct kvm *kvm,
             struct kvm_vcpu *vcpu,
             unsigned long ipa,
             unsigned long end,
             bool set_ripas,
             unsigned long *top_ipa)

+    struct realm *realm = &kvm->arch.realm;
+    phys_addr_t rd_phys = virt_to_phys(realm->rd);
+    phys_addr_t rec_phys;
+    struct kvm_mmu_memory_cache *memcache = NULL;
+    int ret = 0;
+
+    if (vcpu) {
+        rec_phys = virt_to_phys(vcpu->arch.rec.rec_page);
+        memcache = &vcpu->arch.mmu_page_cache;
+
+        WARN_ON(action != RIPAS_SET);
+    } else {
+        WARN_ON(action != RIPAS_INIT);
+    }
+
+    while (ipa < end) {
+        unsigned long next;
+
+        switch (action) {
+        case RIPAS_INIT:
+            ret = rmi_rtt_init_ripas(rd_phys, ipa, end, &next);
+            break;
+        case RIPAS_SET:
+            ret = rmi_rtt_set_ripas(rd_phys, rec_phys, ipa, end,
+                        &next);
+            break;
+        }
+

if 'enum ripas_action' is replaced by 'bool set_ripas' as above, this needs
twist either.

+        switch (RMI_RETURN_STATUS(ret)) {
+        case RMI_SUCCESS:
+            ipa = next;
+            break;
+        case RMI_ERROR_RTT:
+            int err_level = RMI_RETURN_INDEX(ret);
+            int level = find_map_level(realm, ipa, end);
+
+            if (err_level >= level)
+                return -EINVAL;
+
+            ret = realm_create_rtt_levels(realm, ipa, err_level,
+                              level, memcache);
+            if (ret)
+                return ret;
+            /* Retry with the RTT levels in place */
+            break;
+        default:
+            WARN_ON(1);
+            return -ENXIO;
+        }
+    }
+
+    if (top_ipa)
+        *top_ipa = ipa;
+
+    return 0;
+}
+
+static int realm_init_ipa_state(struct kvm *kvm,
+                unsigned long ipa,
+                unsigned long end)
+{
+    return ripas_change(kvm, NULL, ipa, end, RIPAS_INIT, NULL);
+}
+
+static int kvm_init_ipa_range_realm(struct kvm *kvm,
+                    struct arm_rme_init_ripas *args)
+{
+    gpa_t addr, end;
+
+    addr = args->base;
+    end = addr + args->size;
+
+    if (end < addr)
+        return -EINVAL;
+
+    if (kvm_realm_state(kvm) != REALM_STATE_NEW)
+        return -EPERM;
+
+    return realm_init_ipa_state(kvm, addr, end);
+}
+
   /* Protects access to rme_vmid_bitmap */
   static DEFINE_SPINLOCK(rme_vmid_lock);
   static unsigned long *rme_vmid_bitmap;
@@ -441,6 +876,18 @@ int kvm_realm_enable_cap(struct kvm *kvm, struct
kvm_enable_cap *cap)
       case KVM_CAP_ARM_RME_CREATE_REALM:
           r = kvm_create_realm(kvm);
           break;
+    case KVM_CAP_ARM_RME_INIT_RIPAS_REALM: {
+        struct arm_rme_init_ripas args;
+        void __user *argp = u64_to_user_ptr(cap->args[1]);
+
+        if (copy_from_user(&args, argp, sizeof(args))) {
+            r = -EFAULT;
+            break;
+        }
+
+        r = kvm_init_ipa_range_realm(kvm, &args);
+        break;
+    }
       default:
           r = -EINVAL;
           break;

Thanks,
Gavin







[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux