On Mon, Apr 28, 2025 at 07:50:21AM -0700, Sean Christopherson wrote:
> On Mon, Apr 28, 2025, Yan Zhao wrote:
> > On Fri, Apr 25, 2025 at 05:10:56PM -0700, Sean Christopherson wrote:
> > > @@ -7686,6 +7707,37 @@ bool kvm_arch_pre_set_memory_attributes(struct kvm *kvm,
> > >  	if (WARN_ON_ONCE(!kvm_arch_has_private_mem(kvm)))
> > >  		return false;
> > >
> > > +	if (WARN_ON_ONCE(range->end <= range->start))
> > > +		return false;
> > > +
> > > +	/*
> > > +	 * If the head and tail pages of the range currently allow a hugepage,
> > > +	 * i.e. reside fully in the slot and don't have mixed attributes, then
> > > +	 * add each corresponding hugepage range to the ongoing invalidation,
> > > +	 * e.g. to prevent KVM from creating a hugepage in response to a fault
> > > +	 * for a gfn whose attributes aren't changing.  Note, only the range
> > > +	 * of gfns whose attributes are being modified needs to be explicitly
> > > +	 * unmapped, as that will unmap any existing hugepages.
> > > +	 */
> > > +	for (level = PG_LEVEL_2M; level <= KVM_MAX_HUGEPAGE_LEVEL; level++) {
> > > +		gfn_t start = gfn_round_for_level(range->start, level);
> > > +		gfn_t end = gfn_round_for_level(range->end - 1, level);
> > > +		gfn_t nr_pages = KVM_PAGES_PER_HPAGE(level);
> > > +
> > > +		if ((start != range->start || start + nr_pages > range->end) &&
> > > +		    start >= slot->base_gfn &&
> > > +		    start + nr_pages <= slot->base_gfn + slot->npages &&
> > > +		    !hugepage_test_mixed(slot, start, level))
> > Instead of checking mixed flag in disallow_lpage, could we check disallow_lpage
> > directly?
> >
> > So, if mixed flag is not set but disallow_lpage is 1, there's no need to update
> > the invalidate range.
> >
> > > +			kvm_mmu_invalidate_range_add(kvm, start, start + nr_pages);
> > > +
> > > +		if (end == start)
> > > +			continue;
> > > +
> > > +		if ((end + nr_pages) <= (slot->base_gfn + slot->npages) &&
> > > +		    !hugepage_test_mixed(slot, end, level))
> > if ((end + nr_pages > range->end) &&
> >     ((end + nr_pages) <= (slot->base_gfn + slot->npages)) &&
> >     !lpage_info_slot(gfn, slot, level)->disallow_lpage)
> >
> > ?
>
> No, disallow_lpage is used by write-tracking and shadow paging to prevent creating
> huge pages for a write-protected gfn.  mmu_lock is dropped after the pre_set_range
> call to kvm_handle_gfn_range(), and so disallow_lpage could go to zero if the last
> shadow page for the affected range is zapped.  In practice, KVM isn't going to be
That's a good point. I missed it.

> doing write-tracking or shadow paging for CoCo VMs, so there's no missed optimization
> on that front.
>
> And if disallow_lpage is non-zero due to a misaligned memslot base/size, then the
> start/end checks will skip this level anyways.
If the gfn and the userspace address are not aligned with respect to each other
at a certain level, disallow_lpage for that level is set to 1 for the entire
slot. This is often the case at the 1G level.

But since kvm_vm_set_mem_attributes() holds mmu_lock for write most of the time,
blocking faults over a larger range for another short period seems harmless.
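
To make the head/tail arithmetic discussed above concrete, here is a minimal
userspace sketch of how the hugepage-aligned spans fall out of a gfn range. It
is not the kernel code: pages_per_hpage(), round_for_level(), struct gfn_span,
head_span() and tail_span() are hypothetical stand-ins for KVM_PAGES_PER_HPAGE()
and gfn_round_for_level(), it assumes 4KiB base pages, it omits the
mixed-attribute and disallow_lpage checks entirely, and the tail check includes
the extra "end + nr_pages > range->end" condition suggested in the review.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t gfn_t;

/* Hypothetical level numbering, assuming 4KiB base pages. */
enum { LEVEL_2M = 2, LEVEL_1G = 3 };

/* Number of 4KiB pages per hugepage: 512 at 2M, 512 * 512 at 1G. */
static gfn_t pages_per_hpage(int level)
{
	return (gfn_t)1 << ((level - 1) * 9);
}

/* Round a gfn down to the first gfn of its hugepage at this level. */
static gfn_t round_for_level(gfn_t gfn, int level)
{
	return gfn & ~(pages_per_hpage(level) - 1);
}

struct gfn_span {
	gfn_t start;
	gfn_t end;	/* exclusive */
	bool valid;	/* true if the span would be added to the invalidation */
};

/*
 * Hugepage containing the first gfn of [rs, re).  It qualifies only if the
 * range covers it partially and it lies fully inside the slot.
 */
static struct gfn_span head_span(gfn_t rs, gfn_t re, gfn_t slot_base,
				 gfn_t slot_npages, int level)
{
	gfn_t nr = pages_per_hpage(level);
	gfn_t start = round_for_level(rs, level);
	struct gfn_span s = { start, start + nr, false };

	assert(re > rs);
	s.valid = (start != rs || start + nr > re) &&
		  start >= slot_base &&
		  start + nr <= slot_base + slot_npages;
	return s;
}

/*
 * Hugepage containing the last gfn of [rs, re).  Skipped when it is the same
 * hugepage as the head, which head_span() already handles.
 */
static struct gfn_span tail_span(gfn_t rs, gfn_t re, gfn_t slot_base,
				 gfn_t slot_npages, int level)
{
	gfn_t nr = pages_per_hpage(level);
	gfn_t start = round_for_level(rs, level);
	gfn_t end = round_for_level(re - 1, level);
	struct gfn_span s = { end, end + nr, false };

	assert(re > rs);
	if (end == start)
		return s;

	s.valid = end + nr > re &&
		  end + nr <= slot_base + slot_npages;
	return s;
}
```

For example, with a slot covering gfns [0, 2048) and a range [100, 600), the 2M
head span is [0, 512) and the 2M tail span is [512, 1024). At the 1G level
neither span qualifies: the head hugepage does not fit in the slot, and the tail
falls in the same hugepage as the head.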