On Thu, Jul 24, 2025 at 11:51:43PM +0000, Raghavendra Rao Ananta wrote:
> Split kvm_pgtable_stage2_destroy() into two:
> - kvm_pgtable_stage2_destroy_range(), that performs the
>   page-table walk and frees the entries over a range of addresses.
> - kvm_pgtable_stage2_destroy_pgd(), that frees the PGD.
>
> This refactoring enables subsequent patches to free large page-tables
> in chunks, calling cond_resched() between each chunk, to yield the CPU
> as necessary.
>
> Direct callers of kvm_pgtable_stage2_destroy() will continue to walk
> the entire range of the VM as before, ensuring no functional changes.
>
> Also, add equivalent pkvm_pgtable_stage2_*() stubs to maintain a 1:1
> mapping of the page-table functions.

Uhh... We can't stub these functions out for protected mode, we already
have a load-bearing implementation of pkvm_pgtable_stage2_destroy().

Just reuse what's already there and provide a NOP for
pkvm_pgtable_stage2_destroy_pgd().

> +void kvm_pgtable_stage2_destroy_pgd(struct kvm_pgtable *pgt)
> +{
> +	/*
> +	 * We aren't doing a pgtable walk here, but the walker struct is needed
> +	 * for kvm_dereference_pteref(), which only looks at the ->flags.
> +	 */
> +	struct kvm_pgtable_walker walker = {0};

This feels subtle and prone to error. I'd rather we have something that
boils down to rcu_dereference_raw() (with the appropriate n/hVHE
awareness) and add a comment on why it is safe.

> +void kvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt)
> +{
> +	kvm_pgtable_stage2_destroy_range(pgt, 0, BIT(pgt->ia_bits));
> +	kvm_pgtable_stage2_destroy_pgd(pgt);
> +}
> +

Move this to mmu.c as a static function and use KVM_PGT_FN().

Thanks,
Oliver
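
For illustration, one way the raw-dereference suggestion could look. This is
only a rough sketch: it reuses the kvm_pgd_pages() / mm_ops->free_pages_exact()
plumbing already present in kvm_pgtable_stage2_destroy(), and the
kvm_dereference_pteref_raw() name is made up, not an existing helper.

/*
 * Hypothetical raw variant of kvm_dereference_pteref() for teardown paths:
 * the nVHE/hVHE hyp build keeps the plain cast it uses today, while the
 * kernel build drops the walker and uses rcu_dereference_raw().
 */
#ifdef __KVM_NVHE_HYPERVISOR__
#define kvm_dereference_pteref_raw(pteref)	((kvm_pte_t *)(pteref))
#else
#define kvm_dereference_pteref_raw(pteref)	rcu_dereference_raw(pteref)
#endif

void kvm_pgtable_stage2_destroy_pgd(struct kvm_pgtable *pgt)
{
	size_t pgd_sz = kvm_pgd_pages(pgt->ia_bits, pgt->start_level) * PAGE_SIZE;

	/*
	 * No walkers can race with the teardown of this page-table, so a
	 * raw dereference of the PGD is safe here.
	 */
	pgt->mm_ops->free_pages_exact(kvm_dereference_pteref_raw(pgt->pgd), pgd_sz);
	pgt->pgd = NULL;
}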
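
And a minimal sketch of the mmu.c side, assuming the KVM_PGT_FN() dispatch
macro mmu.c already uses for the other stage-2 calls; kvm_stage2_destroy() is
an illustrative name, not necessarily what the series should use.

static void kvm_stage2_destroy(struct kvm_pgtable *pgt)
{
	/* Walk and free the table entries, then release the PGD. */
	KVM_PGT_FN(kvm_pgtable_stage2_destroy_range)(pgt, 0, BIT(pgt->ia_bits));
	KVM_PGT_FN(kvm_pgtable_stage2_destroy_pgd)(pgt);
}

With the dispatch kept in mmu.c, protected mode can wire
pkvm_pgtable_stage2_destroy_range() to the existing
pkvm_pgtable_stage2_destroy() logic and make
pkvm_pgtable_stage2_destroy_pgd() a NOP, as suggested above.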