On 4/24/2025 11:09 AM, Yan Zhao wrote:
> Introduce a "prefetch" parameter to the private_max_mapping_level hook and
> enforce the max mapping level of a prefetch fault for private memory to be
> 4KB. This is a preparation to enable ignoring huge page splitting in the
> fault path.
>
> If a prefetch fault results in a 2MB huge leaf in the mirror page table,
> there may not be a vCPU available to accept the corresponding 2MB huge leaf
> in the S-EPT if the TD is not configured to receive #VE for page
> acceptance. Consequently, if a vCPU accepts the page at 4KB level, it will
> trigger an EPT violation to split the 2MB huge leaf generated by the
> prefetch fault.
>
> Since handling the BUSY error from SEAMCALLs for huge page splitting is
> more involved in the fault path, which runs with kvm->mmu_lock held for
> reading, force the max mapping level of a prefetch fault of private memory
> to be 4KB to prevent potential splitting.
>
> Since prefetch faults for private memory are uncommon after the TD's build
> time, enforcing a 4KB mapping level is unlikely to cause any performance
> degradation.
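
To check my understanding of the behavior described above, here is a minimal standalone sketch. It is not the actual patch: the sketch_* names and the enum are hypothetical stand-ins for the real hook and KVM's page levels. The second helper only illustrates how many leaf mappings a pre-fault of a given size needs at each level.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for KVM's page-table levels; not the kernel's enum. */
enum sketch_level {
	SKETCH_LEVEL_4K = 1,
	SKETCH_LEVEL_2M = 2,
	SKETCH_LEVEL_1G = 3,
};

/*
 * Hypothetical model of the hook with the new "prefetch" parameter:
 * a prefetch fault on private memory is clamped to 4KB, any other
 * fault keeps the level the caller would otherwise allow.
 */
static enum sketch_level sketch_private_max_mapping_level(enum sketch_level allowed,
							   bool prefetch)
{
	if (prefetch)
		return SKETCH_LEVEL_4K;

	return allowed;
}

/* Number of leaf mappings needed to cover "bytes" at the given level. */
static uint64_t sketch_leaf_count(uint64_t bytes, enum sketch_level level)
{
	/* Each level up multiplies the page size by 512 (9 address bits). */
	uint64_t page_size = UINT64_C(4096) << (9 * (level - 1));

	return (bytes + page_size - 1) / page_size;
}
```

Under these assumptions, pre-faulting 1GB of private memory takes 262144 leaf mappings at 4KB versus 512 at 2MB.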
I am wondering what the use cases for KVM_PRE_FAULT_MEMORY are. Is there any API usage guidance telling userspace not to use it to pre-fault a large amount of memory? If not, and userspace does use it to pre-fault a lot of memory, the claim that this is "unlikely to cause any performance degradation" might not hold.