Hi Oliver,

>
> Protected mode is affected by the same problem, potentially even worse
> due to the overheads of calling into EL2. Both protected and
> non-protected flows should use stage2_destroy_range().
>
I experimented with this (see diff below), and it looks like it takes
significantly longer to finish the destruction even for a very small
VM. For instance, it takes ~140 seconds on an Ampere Altra machine.
This is probably because we run cond_resched() for every breakup in
the entire sweep of the possible address range, 0 to ~(0ULL), even
though there are no actual mappings there, so we context-switch out
more often.

--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
+static void kvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt)
+{
+        u64 end = is_protected_kvm_enabled() ? ~(0ULL) : BIT(pgt->ia_bits);
+        u64 next, addr = 0;
+
+        do {
+                next = stage2_range_addr_end(addr, end);
+                KVM_PGT_FN(kvm_pgtable_stage2_destroy_range)(pgt, addr,
+                                                             next - addr);
+
+                if (next != end)
+                        cond_resched();
+        } while (addr = next, addr != end);
+
+        KVM_PGT_FN(kvm_pgtable_stage2_destroy_pgd)(pgt);
+}

--- a/arch/arm64/kvm/pkvm.c
+++ b/arch/arm64/kvm/pkvm.c
@@ -316,9 +316,13 @@ static int __pkvm_pgtable_stage2_unmap(struct kvm_pgtable *pgt, u64 start, u64 e
         return 0;
 }

-void pkvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt)
+void pkvm_pgtable_stage2_destroy_range(struct kvm_pgtable *pgt, u64 addr, u64 size)
+{
+        __pkvm_pgtable_stage2_unmap(pgt, addr, addr + size);
+}
+
+void pkvm_pgtable_stage2_destroy_pgd(struct kvm_pgtable *pgt)
+{
+}

Without cond_resched() in place, it takes about half the time.

I also tried moving cond_resched() to __pkvm_pgtable_stage2_unmap(), as
per the below diff, and calling pkvm_pgtable_stage2_destroy_range() for
the entire 0 to ~(0ULL) range (instead of breaking it up). Even for a
fully 4K-mapped 128G VM, I see it taking ~65 seconds, which is close to
the baseline (no cond_resched()).

--- a/arch/arm64/kvm/pkvm.c
+++ b/arch/arm64/kvm/pkvm.c
@@ -311,8 +311,11 @@ static int __pkvm_pgtable_stage2_unmap(struct kvm_pgtable *pgt, u64 start, u64 e
                         return ret;
                 pkvm_mapping_remove(mapping, &pgt->pkvm_mappings);
                 kfree(mapping);
+                cond_resched();
         }

Does it make sense to call cond_resched() only when we actually unmap
pages?

Thank you.
Raghavendra
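
P.S. In case it helps, the combined shape I have in mind looks roughly
like the below (untested sketch: it keeps the chunked cond_resched()
for the non-protected walk, and relies on the per-mapping
cond_resched() in __pkvm_pgtable_stage2_unmap() for the protected
case):

static void kvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt)
{
        u64 end = is_protected_kvm_enabled() ? ~(0ULL) : BIT(pgt->ia_bits);
        u64 next, addr = 0;

        do {
                next = stage2_range_addr_end(addr, end);
                KVM_PGT_FN(kvm_pgtable_stage2_destroy_range)(pgt, addr,
                                                             next - addr);
                /*
                 * Protected mode rescheds per removed mapping inside
                 * __pkvm_pgtable_stage2_unmap(), so only the
                 * non-protected walk needs the unconditional resched.
                 */
                if (!is_protected_kvm_enabled() && next != end)
                        cond_resched();
        } while (addr = next, addr != end);

        KVM_PGT_FN(kvm_pgtable_stage2_destroy_pgd)(pgt);
}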