Re: [PATCH] KVM: x86/mmu: Prevent installing hugepages when mem attributes are changing

Sean Christopherson <seanjc@xxxxxxxxxx> · Mon, 28 Apr 2025 07:50:21 -0700

On Mon, Apr 28, 2025, Yan Zhao wrote:
> On Fri, Apr 25, 2025 at 05:10:56PM -0700, Sean Christopherson wrote:
> > @@ -7686,6 +7707,37 @@ bool kvm_arch_pre_set_memory_attributes(struct kvm *kvm,
> >  	if (WARN_ON_ONCE(!kvm_arch_has_private_mem(kvm)))
> >  		return false;
> >  
> > +	if (WARN_ON_ONCE(range->end <= range->start))
> > +		return false;
> > +
> > +	/*
> > +	 * If the head and tail pages of the range currently allow a hugepage,
> > +	 * i.e. reside fully in the slot and don't have mixed attributes, then
> > +	 * add each corresponding hugepage range to the ongoing invalidation,
> > +	 * e.g. to prevent KVM from creating a hugepage in response to a fault
> > +	 * for a gfn whose attributes aren't changing.  Note, only the range
> > +	 * of gfns whose attributes are being modified needs to be explicitly
> > +	 * unmapped, as that will unmap any existing hugepages.
> > +	 */
> > +	for (level = PG_LEVEL_2M; level <= KVM_MAX_HUGEPAGE_LEVEL; level++) {
> > +		gfn_t start = gfn_round_for_level(range->start, level);
> > +		gfn_t end = gfn_round_for_level(range->end - 1, level);
> > +		gfn_t nr_pages = KVM_PAGES_PER_HPAGE(level);
> > +
> > +		if ((start != range->start || start + nr_pages > range->end) &&
> > +		    start >= slot->base_gfn &&
> > +		    start + nr_pages <= slot->base_gfn + slot->npages &&
> > +		    !hugepage_test_mixed(slot, start, level))
> Instead of checking mixed flag in disallow_lpage, could we check disallow_lpage
> directly?
> 
> So, if mixed flag is not set but disallow_lpage is 1, there's no need to update
> the invalidate range.
> 
> > +			kvm_mmu_invalidate_range_add(kvm, start, start + nr_pages);
> > +
> > +		if (end == start)
> > +			continue;
> > +
> > +		if ((end + nr_pages) <= (slot->base_gfn + slot->npages) &&
> > +		    !hugepage_test_mixed(slot, end, level))
> if ((end + nr_pages > range->end) &&
>     ((end + nr_pages) <= (slot->base_gfn + slot->npages)) &&
>     !lpage_info_slot(gfn, slot, level)->disallow_lpage)
> 
> ?

No, disallow_lpage is used by write-tracking and shadow paging to prevent creating
huge pages for a write-protected gfn.  mmu_lock is dropped after the pre_set_range
call to kvm_handle_gfn_range(), and so disallow_lpage could go to zero if the last
shadow page for the affected range is zapped, at which point KVM could create a
hugepage for a range that was never added to the invalidation.  In practice, KVM
isn't going to be doing write-tracking or shadow paging for CoCo VMs, so there's no
missed optimization on that front.

And if disallow_lpage is non-zero due to a misaligned memslot base/size, then the
start/end checks against the slot bounds will skip this level anyway.
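
To make the start/end checks concrete, here is a rough userspace sketch of the
head/tail computation from the quoted hunk.  It is only a model under stated
assumptions: struct slot_model, struct gfn_range and extra_invalidate_ranges()
are made-up names for illustration, the hugepage_test_mixed() check is left out
entirely, and the ranges are returned to the caller instead of being passed to
kvm_mmu_invalidate_range_add().

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t gfn_t;

/* x86 page-table levels, mirroring the kernel's PG_LEVEL_* values. */
enum { PG_LEVEL_4K = 1, PG_LEVEL_2M = 2, PG_LEVEL_1G = 3 };

/* Hypothetical stand-ins for the kernel structures; [start, end) in gfns. */
struct gfn_range { gfn_t start, end; };
struct slot_model { gfn_t base_gfn, npages; };

/* 4K pages per hugepage: 512 at 2M, 512 * 512 at 1G (9 bits per level). */
static gfn_t pages_per_hpage(int level)
{
	assert(level >= PG_LEVEL_4K && level <= PG_LEVEL_1G);
	return (gfn_t)1 << ((level - 1) * 9);
}

/* Round a gfn down to the start of its hugepage at the given level. */
static gfn_t gfn_round_for_level(gfn_t gfn, int level)
{
	return gfn & ~(pages_per_hpage(level) - 1);
}

/*
 * Model of the loop body in the quoted hunk, for one level.  Stores the
 * head and/or tail hugepage ranges that would be added to the invalidation
 * in out[] and returns how many were stored (0..2).  The mixed-attributes
 * check is intentionally omitted (assumption: nothing is mixed).
 */
static int extra_invalidate_ranges(const struct slot_model *slot,
				   const struct gfn_range *range, int level,
				   struct gfn_range out[2])
{
	gfn_t slot_end = slot->base_gfn + slot->npages;
	gfn_t nr_pages = pages_per_hpage(level);
	gfn_t start, end;
	int n = 0;

	if (range->end <= range->start)
		return 0;

	start = gfn_round_for_level(range->start, level);
	end = gfn_round_for_level(range->end - 1, level);

	/* Head: range only partially covers a hugepage that fits in the slot. */
	if ((start != range->start || start + nr_pages > range->end) &&
	    start >= slot->base_gfn && start + nr_pages <= slot_end)
		out[n++] = (struct gfn_range){ start, start + nr_pages };

	/* Head and tail are the same hugepage, nothing more to add. */
	if (end == start)
		return n;

	/* Tail: as in the hunk, only the slot-bounds check applies here. */
	if (end + nr_pages <= slot_end)
		out[n++] = (struct gfn_range){ end, end + nr_pages };

	return n;
}
```

With a 2048-page slot at base_gfn 0, changing attributes for gfns [100, 600)
yields the 2M ranges [0, 512) and [512, 1024), whereas the hugepage-aligned
range [512, 1024) yields nothing.  With a 512-page slot at base_gfn 256, i.e. a
misaligned base, gfns [300, 400) yield nothing because the rounded-down start (0)
is below base_gfn, which is the case the start/end checks are there to skip.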