On Thu, 2025-06-26 at 22:48 +1200, Kai Huang wrote:
> On TDX platforms, at hardware level dirty cachelines with and without
> TDX keyID can coexist, and CPU can flush them back to memory in random
> order.

Same comment as the previous patch.

> During kexec, the caches must be flushed before jumping to the
> new kernel to avoid silent memory corruption when a cacheline with a
> different encryption property is written back over whatever encryption
> properties the new kernel is using.
>
> A percpu boolean is used to mark whether the cache of a given CPU may be
> in an incoherent state, and the kexec performs WBINVD on the CPUs with
> that boolean turned on.
>
> For TDX, only the TDX module or the TDX guests can generate dirty
> cachelines of TDX private memory, i.e., they are only generated when the
> kernel does SEAMCALL.
              ^ a
>
> Turn on that boolean when the kernel does SEAMCALL so that kexec can

Nit: "Turn on" is a little ambiguous.  "Set"?

> correctly flush cache.
>
> SEAMCALL can be made from both task context and IRQ disabled context.

SEAMCALLs

> Given SEAMCALL is just a lengthy instruction (e.g., thousands of cycles)
> from kernel's point of view and preempt_{disable|enable}() is cheap
> compared to it, simply unconditionally disable preemption during setting
> the percpu boolean and making SEAMCALL.
>
> Signed-off-by: Kai Huang <kai.huang@xxxxxxxxx>
> Tested-by: Farrah Chen <farrah.chen@xxxxxxxxx>
> ---
>
> v2 -> v3:
>  - Change to use __always_inline for do_seamcall() to avoid indirect
>    call instructions of making SEAMCALL.

How did this come about?

>  - Remove the senstence "not all SEAMCALLs generate dirty cachelines of
>    TDX private memory but just treat all of them do." in changelog and
>    the code comment.

-- Dave

>
> ---
>  arch/x86/include/asm/tdx.h | 29 ++++++++++++++++++++++++++++-
>  1 file changed, 28 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
> index 7ddef3a69866..d4c624c69d7f 100644
> --- a/arch/x86/include/asm/tdx.h
> +++ b/arch/x86/include/asm/tdx.h
> @@ -102,10 +102,37 @@ u64 __seamcall_ret(u64 fn, struct tdx_module_args *args);
>  u64 __seamcall_saved_ret(u64 fn, struct tdx_module_args *args);
>  void tdx_init(void);
>
> +#include <linux/preempt.h>
>  #include <asm/archrandom.h>
> +#include <asm/processor.h>
>
>  typedef u64 (*sc_func_t)(u64 fn, struct tdx_module_args *args);
>
> +static __always_inline u64 do_seamcall(sc_func_t func, u64 fn,
> +                                       struct tdx_module_args *args)
> +{

So now we have:

  seamcall()
    sc_retry()
      do_seamcall()
        __seamcall()

do_seamcall() is only called from sc_retry(). Why add yet another helper
in the stack? You could just build it into sc_retry().

Oh, and __seamcall_*() variety is called directly too, so skips the
do_seamcall() per-cpu var logic in those cases. So, maybe do_seamcall()
is needed, but it needs a better name considering where it would get
called from.

These wrappers need an overhaul I think, but maybe for now just have
do_dirty_seamcall() which is called from tdh_vp_enter() and sc_retry().

Oh no, actually scratch that! The inline/flatten issue will happen again
if we add the per-cpu vars to tdh_vp_enter()... Which means we probably
need to set the per-cpu var in tdx_vcpu_enter_exit().

And the other __seamcall() caller is the machine check handler...

Am I missing something? It seems this patch is incomplete. If some of
these missed SEAMCALLs don't dirty a cacheline, then the justification
that it works by just covering all seamcalls needs to be updated.

Side topic.
Do all the SEAMCALL wrappers calling into the seamcall_*() variety of
wrappers need the entropy retry logic? I think no, and some callers
actually depend on it not happening.

> +        u64 ret;
> +
> +        preempt_disable();
> +
> +        /*
> +         * SEAMCALLs are made to the TDX module and can generate dirty
> +         * cachelines of TDX private memory. Mark cache state incoherent
> +         * so that the cache can be flushed during kexec.
> +         *
> +         * This needs to be done before actually making the SEAMCALL,
> +         * because kexec-ing CPU could send NMI to stop remote CPUs,
> +         * in which case even disabling IRQ won't help here.
> +         */
> +        this_cpu_write(cache_state_incoherent, true);
> +
> +        ret = func(fn, args);
> +
> +        preempt_enable();
> +
> +        return ret;
> +}
> +
>  static __always_inline u64 sc_retry(sc_func_t func, u64 fn,
>                                      struct tdx_module_args *args)
>  {
> @@ -113,7 +140,7 @@ static __always_inline u64 sc_retry(sc_func_t func, u64 fn,
>          u64 ret;
>
>          do {
> -                ret = func(fn, args);
> +                ret = do_seamcall(func, fn, args);
>          } while (ret == TDX_RND_NO_ENTROPY && --retry);
>
>          return ret;
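
Back on the wrapper-structure point above: something like the below is
roughly what I mean by "just build it into sc_retry()". Completely
untested, only to show the shape, and it obviously still doesn't cover
the direct __seamcall() callers (tdh_vp_enter(), the #MC handler). It
assumes cache_state_incoherent is the percpu variable from the earlier
patch and that it comes in via <asm/processor.h>, which seems to be why
this patch adds that include:

/*
 * Rough sketch only (untested): in arch/x86/include/asm/tdx.h, instead
 * of adding do_seamcall(), fold the "mark cache incoherent" step into
 * sc_retry() itself.  Relies on the existing sc_func_t typedef and on
 * the cache_state_incoherent percpu variable added earlier in the
 * series.
 */
#include <linux/preempt.h>
#include <asm/archrandom.h>     /* RDRAND_RETRY_LOOPS */
#include <asm/processor.h>      /* cache_state_incoherent (assumed) */

static __always_inline u64 sc_retry(sc_func_t func, u64 fn,
                                    struct tdx_module_args *args)
{
        int retry = RDRAND_RETRY_LOOPS;
        u64 ret;

        do {
                /*
                 * The SEAMCALL may dirty cachelines of TDX private
                 * memory, so mark this CPU's cache incoherent before
                 * making it; keep preemption disabled so the flag and
                 * the SEAMCALL happen on the same CPU.
                 */
                preempt_disable();
                this_cpu_write(cache_state_incoherent, true);
                ret = func(fn, args);
                preempt_enable();
        } while (ret == TDX_RND_NO_ENTROPY && --retry);

        return ret;
}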