On Mon, 2025-05-12 at 12:55 +0300, Kirill A. Shutemov wrote: > On Fri, May 09, 2025 at 09:25:58AM +0800, Yan Zhao wrote: > > On Thu, May 08, 2025 at 04:23:56PM +0300, Kirill A. Shutemov wrote: > > > On Tue, May 06, 2025 at 07:55:17PM +0800, Yan Zhao wrote: > > > > On Fri, May 02, 2025 at 04:08:24PM +0300, Kirill A. Shutemov wrote: > > > > > The functions kvm_x86_ops::link_external_spt() and > > > > > kvm_x86_ops::set_external_spte() are used to assign new memory to a VM. > > > > > When using TDX with Dynamic PAMT enabled, the assigned memory must be > > > > > covered by PAMT. > > > > > > > > > > The new function kvm_x86_ops::phys_prepare() is called before > > > > > link_external_spt() and set_external_spte() to ensure that the memory is > > > > > ready to be assigned to the virtual machine. In the case of TDX, it > > > > > makes sure that the memory is covered by PAMT. > > > > > > > > > > kvm_x86_ops::phys_prepare() is called in a context where struct kvm_vcpu > > > > > is available, allowing the implementation to allocate memory from a > > > > > per-VCPU pool. > > > > > > > > > Why not invoke phys_prepare() and phys_cleanup() in set_external_spte_present()? > > > > Or in tdx_sept_set_private_spte()/tdx_sept_link_private_spt()? > > > > > > Because the memory pool we allocated from is per-vcpu and we lost access > > > to vcpu by then. And not all callers provide vcpu. > > Maybe we can get vcpu via kvm_get_running_vcpu(), as in [1]. > > Then for callers not providing vcpu (where vcpu is NULL), we can use per-KVM > > cache? > > Hm. I was not aware of kvm_get_running_vcpu(). Will play with it, thanks. I am not sure why per-vcpu cache matters. For non-leaf SEPT pages, AFAICT the "vcpu->arch.mmu_external_spt_cache" is just an empty cache, and eventually __get_free_page() is used to allocate in: sp->external_spt = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_external_spt_cache); So why not we actually create a kmem_cache for it with an actual 'ctor', and we can call tdx_alloc_page() in that. This makes sure when the "external_spt" is allocated, the underneath PAMT entry is there. For the last level guest memory page, similar to SEV, we can hook the kvm_arch_gmem_prepare() to call tdx_alloc_page() to make PAMT entry ready.