On Thu, 2025-06-26 at 22:48 +1200, Kai Huang wrote:
> On TDX platforms, at hardware level dirty cachelines with and without
> TDX keyID can coexist, and CPU can flush them back to memory in random
> order.

Same comment as the previous patch.

> During kexec, the caches must be flushed before jumping to the
> new kernel to avoid silent memory corruption when a cacheline with a
> different encryption property is written back over whatever encryption
> properties the new kernel is using.
>
> A percpu boolean is used to mark whether the cache of a given CPU may be
> in an incoherent state, and the kexec performs WBINVD on the CPUs with
> that boolean turned on.
>
> For TDX, only the TDX module or the TDX guests can generate dirty
> cachelines of TDX private memory, i.e., they are only generated when the
> kernel does SEAMCALL.
              ^ a
>
> Turn on that boolean when the kernel does SEAMCALL so that kexec can

Nit: "Turn on" is a little ambiguous.  "Set"?

> correctly flush cache.
>
> SEAMCALL can be made from both task context and IRQ disabled context.

SEAMCALLs

> Given SEAMCALL is just a lengthy instruction (e.g., thousands of cycles)
> from kernel's point of view and preempt_{disable|enable}() is cheap
> compared to it, simply unconditionally disable preemption during setting
> the percpu boolean and making SEAMCALL.
>
> Signed-off-by: Kai Huang <kai.huang@xxxxxxxxx>
> Tested-by: Farrah Chen <farrah.chen@xxxxxxxxx>
> ---
>
> v2 -> v3:
>  - Change to use __always_inline for do_seamcall() to avoid indirect
>    call instructions of making SEAMCALL.

How did this come about?

>  - Remove the senstence "not all SEAMCALLs generate dirty cachelines of
>    TDX private memory but just treat all of them do." in changelog and
>    the code comment.

-- Dave

>
> ---
>  arch/x86/include/asm/tdx.h | 29 ++++++++++++++++++++++++++++-
>  1 file changed, 28 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
> index 7ddef3a69866..d4c624c69d7f 100644
> --- a/arch/x86/include/asm/tdx.h
> +++ b/arch/x86/include/asm/tdx.h
> @@ -102,10 +102,37 @@ u64 __seamcall_ret(u64 fn, struct tdx_module_args *args);
>  u64 __seamcall_saved_ret(u64 fn, struct tdx_module_args *args);
>  void tdx_init(void);
>
> +#include <linux/preempt.h>
>  #include <asm/archrandom.h>
> +#include <asm/processor.h>
>
>  typedef u64 (*sc_func_t)(u64 fn, struct tdx_module_args *args);
>
> +static __always_inline u64 do_seamcall(sc_func_t func, u64 fn,
> +                                       struct tdx_module_args *args)
> +{

So now we have:

  seamcall()
    sc_retry()
      do_seamcall()
        __seamcall()

do_seamcall() is only called from sc_retry(). Why add yet another helper
in the stack? You could just build it into sc_retry().

Oh, and __seamcall_*() variety is called directly too, so skips the
do_seamcall() per-cpu var logic in those cases. So, maybe do_seamcall()
is needed, but it needs a better name considering where it would get
called from.

These wrappers need an overhaul I think, but maybe for now just have
do_dirty_seamcall() which is called from tdh_vp_enter() and sc_retry().

Oh no, actually scratch that! The inline/flatten issue will happen again
if we add the per-cpu vars to tdh_vp_enter()... Which means we probably
need to set the per-cpu var in tdx_vcpu_enter_exit().

And the other __seamcall() caller is the machine check handler...

Am I missing something? It seems this patch is incomplete. If some of
these missed SEAMCALLs don't dirty a cacheline, then the justification
that it works by just covering all seamcalls needs to be updated.

Side topic.
Do all the SEAMCALL wrappers calling into the seamcall_*() variety of
wrappers need the entropy retry logic? I think no, and some callers
actually depend on it not happening.

> +        u64 ret;
> +
> +        preempt_disable();
> +
> +        /*
> +         * SEAMCALLs are made to the TDX module and can generate dirty
> +         * cachelines of TDX private memory. Mark cache state incoherent
> +         * so that the cache can be flushed during kexec.
> +         *
> +         * This needs to be done before actually making the SEAMCALL,
> +         * because kexec-ing CPU could send NMI to stop remote CPUs,
> +         * in which case even disabling IRQ won't help here.
> +         */
> +        this_cpu_write(cache_state_incoherent, true);
> +
> +        ret = func(fn, args);
> +
> +        preempt_enable();
> +
> +        return ret;
> +}
> +
>  static __always_inline u64 sc_retry(sc_func_t func, u64 fn,
>                                      struct tdx_module_args *args)
>  {
> @@ -113,7 +140,7 @@ static __always_inline u64 sc_retry(sc_func_t func, u64 fn,
>          u64 ret;
>
>          do {
> -                ret = func(fn, args);
> +                ret = do_seamcall(func, fn, args);
>          } while (ret == TDX_RND_NO_ENTROPY && --retry);
>
>          return ret;
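
Back on the wrapper-structure point above: something like the below is
roughly what I mean by "just build it into sc_retry()". Completely
untested, only to show the shape, and it obviously still doesn't cover
the direct __seamcall() callers (tdh_vp_enter(), the #MC handler). It
assumes cache_state_incoherent is the percpu variable from the earlier
patch and that it comes in via <asm/processor.h>, which seems to be why
this patch adds that include:

/*
 * Rough sketch only (untested): in arch/x86/include/asm/tdx.h, instead
 * of adding do_seamcall(), fold the "mark cache incoherent" step into
 * sc_retry() itself.  Relies on the existing sc_func_t typedef and on
 * the cache_state_incoherent percpu variable added earlier in the
 * series.
 */
#include <linux/preempt.h>
#include <asm/archrandom.h>     /* RDRAND_RETRY_LOOPS */
#include <asm/processor.h>      /* cache_state_incoherent (assumed) */

static __always_inline u64 sc_retry(sc_func_t func, u64 fn,
                                    struct tdx_module_args *args)
{
        int retry = RDRAND_RETRY_LOOPS;
        u64 ret;

        do {
                /*
                 * The SEAMCALL may dirty cachelines of TDX private
                 * memory, so mark this CPU's cache incoherent before
                 * making it; keep preemption disabled so the flag and
                 * the SEAMCALL happen on the same CPU.
                 */
                preempt_disable();
                this_cpu_write(cache_state_incoherent, true);
                ret = func(fn, args);
                preempt_enable();
        } while (ret == TDX_RND_NO_ENTROPY && --retry);

        return ret;
}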