On Tue, Aug 12, 2025 at 11:39 AM Edgecombe, Rick P <rick.p.edgecombe@xxxxxxxxx> wrote:
>
> On Tue, 2025-08-12 at 09:15 -0700, Sean Christopherson wrote:
> > > I actually went down this path too, but the problem I hit was that TDX
> > > module wants the PAMT page size to match the S-EPT page size.
> >
> > Right, but over-populating the PAMT would just result in "wasted" memory,
> > correct? I.e. KVM can always provide more PAMT entries than are needed. Or am
> > I misunderstanding how dynamic PAMT works?
>
> Demote needs DPAMT pages in order to split the DPAMT. But "needs" is what I was
> hoping to understand better.
>
> I do think though, that we should consider premature optimization vs re-
> architecting DPAMT only for the sake of a short term KVM design. As in, if fault
> path managed DPAMT is better for the whole lazy accept way of things, it
> probably makes more sense to just do it upfront with the existing architecture.
>
> BTW, I think I untangled the fault path DPAMT page allocation code in this
> series. I basically moved the existing external page cache allocation to
> kvm/vmx/tdx.c. So the details of the top up and external page table cache
> happens outside of x86 mmu code. The top up structure comes from arch/x86 side
> of tdx code, so the cache can just be passed into tdx_pamt_get(). And from the
> MMU code's perspective there is just one type "external page tables". It doesn't
> know about DPAMT at all.
>
> So if that ends up acceptable, I think the main problem left is just this global
> lock. And it seems we have a simple solution for it if needed.
>
> >
> > In other words, IMO, reclaiming PAMT pages on-demand is also a premature
> > optimization of sorts, as it's not obvious to me that the host would actually
> > be able to take advantage of the unused memory.
>
> I was imagining some guestmemfd callback to setup DPAMT backing for all the
> private memory. Just leave it when it's shared for simplicity. Then cleanup
> DPAMT when the pages are freed from guestmemfd. The control pages could have
> their own path like it does in this series. But it doesn't seem supported.

IMO, tying the lifetime of guest_memfd folios to KVM ownership beyond the
memslot lifetime leaks more state into guest_memfd than needed, e.g. it will
prevent use cases where guest_memfd needs to be reused while handling reboot
of a confidential VM [1].

IMO, if avoidable, it's better not to have DPAMT, or other KVM arch-specific
state tracking in general, hooked up to guest_memfd folios, especially with
hugepage support and the whole-folio splitting/merging that needs to happen.
If you still need it, guest_memfd should be as stateless as possible, just
like we are pushing for SNP preparation tracking [2] to happen within KVM's
SNP code, and IMO any such tracking should ideally be cleaned up on memslot
unbinding.

[1] https://lore.kernel.org/kvm/CAGtprH9NbCPSwZrQAUzFw=4rZPA60QBM2G8opYo9CZxRiYihzg@xxxxxxxxxxxxxx/
[2] https://lore.kernel.org/kvm/20250613005400.3694904-2-michael.roth@xxxxxxx/
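
For readers mapping the top-up scheme Rick describes above onto KVM's
generic kvm_mmu_memory_cache helpers, a minimal sketch of the pattern
follows. This is only illustrative: tdx_pamt_get() is named in the series
but its signature here is assumed, and pamt_page_cache,
tdx_topup_pamt_cache() and tdx_nr_pamt_pages() are made-up names, not the
actual code.

#include <linux/kvm_host.h>

/*
 * Sketch only: top up a per-vCPU cache in sleepable context, then consume
 * pre-allocated pages in the fault path where allocation isn't allowed.
 */

/* Called before taking mmu_lock, where GFP_KERNEL allocations are fine. */
int tdx_topup_pamt_cache(struct kvm_vcpu *vcpu)
{
	/* Reserve enough pages to back the worst-case DPAMT update. */
	return kvm_mmu_topup_memory_cache(&vcpu->arch.pamt_page_cache,
					  tdx_nr_pamt_pages());
}

/* Called from the fault path: only pops pages that were topped up above. */
int tdx_pamt_get(struct kvm *kvm, hpa_t hpa,
		 struct kvm_mmu_memory_cache *cache)
{
	void *pamt_page = kvm_mmu_memory_cache_alloc(cache);

	if (!pamt_page)
		return -ENOMEM;

	/* ... hand pamt_page to the PAMT.ADD SEAMCALL for this range ... */
	return 0;
}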