On Fri, 2025-08-01 at 16:23 +0800, Chao Gao wrote:
> On Tue, Jul 29, 2025 at 12:28:37AM +1200, Kai Huang wrote:
> > On TDX platforms, dirty cacheline aliases with and without encryption
> > bits can coexist, and the CPU can flush them back to memory in random
> > order.  During kexec, the caches must be flushed before jumping to the
> > new kernel, otherwise the dirty cachelines could silently corrupt the
> > memory used by the new kernel due to the different encryption property.
> >
> > A percpu boolean is used to mark whether the cache of a given CPU may
> > be in an incoherent state, and kexec performs WBINVD on the CPUs with
> > that boolean turned on.
> >
> > For TDX, only the TDX module or TDX guests can generate dirty
> > cachelines of TDX private memory, i.e., they are only generated when
> > the kernel does a SEAMCALL.
> >
> > Set that boolean when the kernel does a SEAMCALL so that kexec can
> > flush the cache correctly.
> >
> > The kernel provides both the __seamcall*() assembly functions and the
> > seamcall*() wrappers, which additionally handle the "running out of
> > entropy" error in a loop.  Most SEAMCALLs are made via seamcall*();
> > the exceptions are TDH.VP.ENTER and TDH.PHYMEM.PAGE.RDMD, which are
> > called using the __seamcall*() variants directly.
> >
> > To cover the two special cases, add a new helper do_seamcall() which
> > sets the percpu boolean and then calls __seamcall*(), and change the
> > special cases to use do_seamcall().  To cover all other SEAMCALLs,
> > change seamcall*() to call do_seamcall().
> >
> > The SEAMCALLs invoked via seamcall*() can be made from both task
> > context and IRQ-disabled context.  Given that a SEAMCALL is just a
> > lengthy instruction (e.g., thousands of cycles) from the kernel's
> > point of view, and preempt_{disable|enable}() is cheap compared to
> > it, just unconditionally disable preemption while setting the boolean
> > and making the SEAMCALL.
> >
> > Signed-off-by: Kai Huang <kai.huang@xxxxxxxxx>
> > Tested-by: Farrah Chen <farrah.chen@xxxxxxxxx>
>
> Reviewed-by: Chao Gao <chao.gao@xxxxxxxxx>
>
> One small question below,
>
> <snip>
>
> > +static __always_inline u64 do_seamcall(sc_func_t func, u64 fn,
> > +				       struct tdx_module_args *args)
> > +{
> > +	lockdep_assert_preemption_disabled();
> > +
> > +	/*
> > +	 * SEAMCALLs are made to the TDX module and can generate dirty
> > +	 * cachelines of TDX private memory.  Mark cache state incoherent
> > +	 * so that the cache can be flushed during kexec.
> > +	 *
> > +	 * This needs to be done before actually making the SEAMCALL,
>
> ...
>
> > +	 * because the kexec-ing CPU could send NMI to stop remote CPUs,
> > +	 * in which case even disabling IRQ won't help here.
> > +	 */
> > +	this_cpu_write(cache_state_incoherent, true);
>
> I'm not sure this write is guaranteed to occur before the SEAMCALL, as
> I don't see any explicit barrier to prevent the compiler from reordering
> them.  Do you think this is an issue?

AFAICT this_cpu_write() eventually ends up with "volatile" being used, so
it already comes with a compiler barrier.

Plus, what comes right after this_cpu_write() is a function call whose
body is implemented in another file, so I don't think the compiler can do
any reordering here anyway.

>
> > +
> > +	return func(fn, args);
> > +}
> > +
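
For reference, a minimal standalone illustration of the two points above
(not kernel code; ext_call() and 'flag' are made-up stand-ins for
__seamcall*() and the percpu boolean, and AFAICT this_cpu_write()
likewise boils down to a volatile access on x86):

	/* Defined in another translation unit, like __seamcall*(). */
	extern void ext_call(void);

	/* Stand-in for the percpu boolean. */
	volatile _Bool flag;

	void ordered(void)
	{
		/*
		 * A volatile store is an observable side effect in C
		 * terms, so the compiler may not move it past the opaque
		 * call below, which could itself perform volatile
		 * accesses.
		 */
		flag = 1;

		/*
		 * A call into unknown code is also a compiler barrier
		 * for any object the callee could reach.
		 */
		ext_call();
	}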
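
And for completeness, since the seamcall*() hunk is snipped above: going
purely by the changelog and the existing sc_retry() entropy-retry
pattern, I'd expect the wrapper to end up looking roughly like this (a
sketch, not the actual diff):

	static inline u64 seamcall(u64 fn, struct tdx_module_args *args)
	{
		int retry = RDRAND_RETRY_LOOPS;	/* bounded retry, as in sc_retry() */
		u64 ret;

		do {
			/*
			 * Disable preemption so the percpu boolean set in
			 * do_seamcall() refers to the CPU that actually
			 * makes the SEAMCALL.
			 */
			preempt_disable();
			ret = do_seamcall(__seamcall, fn, args);
			preempt_enable();
		} while (ret == TDX_RND_NO_ENTROPY && --retry);

		return ret;
	}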