On Thu, 22 May 2025 at 09:56, David Hildenbrand <david@xxxxxxxxxx> wrote:
>
> On 22.05.25 09:22, Fuad Tabba wrote:
> > Hi David,
> >
> > On Wed, 21 May 2025 at 09:01, David Hildenbrand <david@xxxxxxxxxx> wrote:
> >>
> >> On 13.05.25 18:34, Fuad Tabba wrote:
> >>> From: Ackerley Tng <ackerleytng@xxxxxxxxxx>
> >>>
> >>> This patch adds kvm_gmem_max_mapping_level(), which always returns
> >>> PG_LEVEL_4K since guest_memfd only supports 4K pages for now.
> >>>
> >>> When guest_memfd supports shared memory, max_mapping_level (especially
> >>> when recovering huge pages - see call to __kvm_mmu_max_mapping_level()
> >>> from recover_huge_pages_range()) should take input from
> >>> guest_memfd.
> >>>
> >>> Input from guest_memfd should be taken in these cases:
> >>>
> >>> + if the memslot supports shared memory (guest_memfd is used for
> >>>   shared memory, or in future both shared and private memory) or
> >>> + if the memslot is only used for private memory and that gfn is
> >>>   private.
> >>>
> >>> If the memslot doesn't use guest_memfd, figure out the
> >>> max_mapping_level using the host page tables like before.
> >>>
> >>> This patch also refactors and inlines the other call to
> >>> __kvm_mmu_max_mapping_level().
> >>>
> >>> In kvm_mmu_hugepage_adjust(), guest_memfd's input is already
> >>> provided (if applicable) in fault->max_level. Hence, there is no need
> >>> to query guest_memfd.
> >>>
> >>> lpage_info is queried like before, and then if the fault is not from
> >>> guest_memfd, adjust fault->req_level based on input from host page
> >>> tables.
> >>>
> >>> Signed-off-by: Ackerley Tng <ackerleytng@xxxxxxxxxx>
> >>> Signed-off-by: Fuad Tabba <tabba@xxxxxxxxxx>
> >>> ---
> >>>  arch/x86/kvm/mmu/mmu.c   | 92 ++++++++++++++++++++++++++--------------
> >>>  include/linux/kvm_host.h |  7 +++
> >>>  virt/kvm/guest_memfd.c   | 12 ++++++
> >>>  3 files changed, 79 insertions(+), 32 deletions(-)
> >>>
> >>> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> >>> index cfbb471f7c70..9e0bc8114859 100644
> >>> --- a/arch/x86/kvm/mmu/mmu.c
> >>> +++ b/arch/x86/kvm/mmu/mmu.c
> >>> @@ -3256,12 +3256,11 @@ static int host_pfn_mapping_level(struct kvm *kvm, gfn_t gfn,
> >>>  	return level;
> >>>  }
> >>
> >> [...]
> >>
> >>>  static u8 kvm_max_level_for_fault_and_order(struct kvm *kvm,
> >>>  					    struct kvm_page_fault *fault,
> >>>  					    int order)
> >>> @@ -4523,7 +4551,7 @@ static int __kvm_mmu_faultin_pfn(struct kvm_vcpu *vcpu,
> >>>  {
> >>>  	unsigned int foll = fault->write ? FOLL_WRITE : 0;
> >>>
> >>> -	if (fault->is_private || kvm_gmem_memslot_supports_shared(fault->slot))
> >>> +	if (fault_from_gmem(fault))
> >>
> >> Should this change rather have been done in the previous patch?
> >>
> >> (then only adjust fault_from_gmem() in this function as required)
> >>
> >>>  		return kvm_mmu_faultin_pfn_gmem(vcpu, fault);
> >>>
> >>>  	foll |= FOLL_NOWAIT;
> >>> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> >>> index de7b46ee1762..f9bb025327c3 100644
> >>> --- a/include/linux/kvm_host.h
> >>> +++ b/include/linux/kvm_host.h
> >>> @@ -2560,6 +2560,7 @@ static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
> >>>  int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
> >>>  		     gfn_t gfn, kvm_pfn_t *pfn, struct page **page,
> >>>  		     int *max_order);
> >>> +int kvm_gmem_mapping_order(const struct kvm_memory_slot *slot, gfn_t gfn);
> >>>  #else
> >>>  static inline int kvm_gmem_get_pfn(struct kvm *kvm,
> >>>  				    struct kvm_memory_slot *slot, gfn_t gfn,
> >>> @@ -2569,6 +2570,12 @@ static inline int kvm_gmem_get_pfn(struct kvm *kvm,
> >>>  	KVM_BUG_ON(1, kvm);
> >>>  	return -EIO;
> >>>  }
> >>> +static inline int kvm_gmem_mapping_order(const struct kvm_memory_slot *slot,
> >>> +					 gfn_t gfn)
> >>
> >> Probably should indent with two tabs here.
> >
> > (I'm fixing the patch before respinning, hence it's me asking)
> >
> > Not sure I understand. Indentation here matches the same style as that
> > for kvm_gmem_get_pfn() right above it in the alignment of the
> > parameters, i.e., the parameter `gfn_t gfn` is aligned with the
> > parameter `const struct kvm_memory_slot *slot` (four tabs and a
> > space).
>
> Yeah, that way of indenting is rather bad practice. Especially for new
> code we're adding or when we touch existing code, we should just use two
> tabs.
>
> That way, we can fit more stuff into a single line, and when doing
> simple changes, such as renaming the function or changing the return
> type, we won't have to touch all the parameters.
>
> Maybe KVM has its own rules on that ... that's why I said "probably" :)

:) I see, although I agree with you, I'd rather that indentation be
consistent within the same file.

Thanks,
/fuad

> --
> Cheers,
>
> David / dhildenb
>
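
For illustration only, here is a minimal, self-contained sketch contrasting the
two continuation-indent styles discussed above. The function, type, and
parameter names below are hypothetical stand-ins chosen to mirror the
kvm_gmem_mapping_order() declaration in the diff; they are not part of the
patch under review.

/*
 * Hypothetical standalone sketch, not kernel code: struct and typedef are
 * stand-ins so the snippet compiles outside the kernel tree.
 */
struct kvm_memory_slot;			/* stand-in forward declaration */
typedef unsigned long gfn_t;		/* stand-in for the kernel's gfn_t */

/* Style used in the patch: continuation aligned with the opening '('. */
static inline int mapping_order_aligned(const struct kvm_memory_slot *slot,
					gfn_t gfn)
{
	return 0;
}

/*
 * Style David suggests: continuation indented by two tabs, so a later
 * rename or return-type change does not force the parameter lines to be
 * re-aligned.
 */
static inline int mapping_order_two_tabs(const struct kvm_memory_slot *slot,
		gfn_t gfn)
{
	return 0;
}

Both forms compile identically; the difference is only the diff churn caused
by later signature changes, which is the maintenance point David raises above.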