On Mon, 2025-05-19 at 16:32 +0800, Yan Zhao wrote:
> > On the opposite, if other non-Linux TDs don't follow 1G->2M->4K accept
> > order, e.g., they always accept 4K, there could be *endless EPT
> > violation* if I understand your words correctly.
> >
> > Isn't this yet-another reason we should choose to return PG_LEVEL_4K
> > instead of 2M if no accept level is provided in the fault?
> As I said, returning PG_LEVEL_4K would disallow huge pages for non-Linux TDs.
> TD's accept operations at size > 4KB will get TDACCEPT_SIZE_MISMATCH.

TDX_PAGE_SIZE_MISMATCH is a valid error code that the guest should handle. The
docs say the VMM needs to demote *if* the mapping is large and the accept size
is small. But if we map at 4k size for non-accept EPT violations, we won't hit
this case. I also wonder what is preventing the TDX module from handling a 2MB
accept size at 4k mappings. Maybe it could be changed.

But I think Kai's question was: why are we complicating the code for the case
of non-Linux TDs that also use #VE for accept? It's not necessary to be
functional, and there aren't any known TDs like that which are expected to use
KVM today (err, except the MMU stress test). So in another form the question
is: should we optimize KVM for a case we don't even know if anyone will use?
The answer seems obviously no to me.

I think this connects to the question of whether we can pass the necessary
info into the fault via a synthetic error code. Consider this new design:

 - tdx_gmem_private_max_mapping_level() simply returns 4k for prefetch and
   pre-runnable, otherwise returns 2MB
 - if the fault has accept info with 2MB size, pass 2MB size into the fault.
   Otherwise pass 4k (i.e. VMs that are relying on #VE to do the accept won't
   get huge pages *yet*).

What goes wrong? Seems simpler and no more stuffing fault info on the vcpu.
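
To make the two rules above concrete, here is a rough userspace model of the
decision, not actual KVM code. Everything prefixed sketch_ is made up for
illustration, including the struct that stands in for whatever the synthetic
error code would carry. It also assumes the final mapping level is the smaller
of the level passed into the fault and the max mapping level, which is how I
read the proposal.

```c
#include <stdbool.h>

/* Local copy for the sketch; ordering is what matters (larger value = larger page). */
enum pg_level { PG_LEVEL_4K = 1, PG_LEVEL_2M = 2 };

/* Hypothetical stand-in for the info the fault would carry. */
struct sketch_fault {
	bool prefetch;          /* fault comes from a prefetch path */
	bool td_runnable;       /* false while the TD is still pre-runnable */
	bool has_accept_level;  /* EPT violation carried accept info */
	enum pg_level accept_level;
};

/* Rule 1: 4k for prefetch and pre-runnable, otherwise 2MB. */
static enum pg_level sketch_private_max_mapping_level(const struct sketch_fault *f)
{
	if (f->prefetch || !f->td_runnable)
		return PG_LEVEL_4K;
	return PG_LEVEL_2M;
}

/* Rule 2: pass 2MB into the fault only if the accept info says 2MB. */
static enum pg_level sketch_fault_req_level(const struct sketch_fault *f)
{
	if (f->has_accept_level && f->accept_level == PG_LEVEL_2M)
		return PG_LEVEL_2M;
	return PG_LEVEL_4K;
}

/* Assumed combination: the mapping is capped by both levels. */
static enum pg_level sketch_map_level(const struct sketch_fault *f)
{
	enum pg_level max = sketch_private_max_mapping_level(f);
	enum pg_level req = sketch_fault_req_level(f);

	return req < max ? req : max;
}
```

In this model a runnable TD that accepts at 2MB gets a 2MB mapping, while a
fault with no accept info (e.g. a TD relying on #VE to do the accept) always
lands at 4k, so the demote-on-small-accept case is never reached.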