On Thu, Jun 26, 2025 at 10:48:47PM +1200, Kai Huang wrote:

...

> Doing WBINVD in stop_this_cpu() could potentially increase the chance to
> trigger the above "race" despite it's still rare to happen.

Oh the amount of text... Please run it and all your comments through AI to
simplify formulations etc. It is a lot to read.

> Signed-off-by: Kai Huang <kai.huang@xxxxxxxxx>
> ---
>  arch/x86/include/asm/kexec.h         |  2 +-
>  arch/x86/include/asm/processor.h     |  2 ++
>  arch/x86/kernel/cpu/amd.c            | 16 ++++++++++++++++
>  arch/x86/kernel/machine_kexec_64.c   | 15 ++++++++++-----
>  arch/x86/kernel/process.c            | 16 +++-------------
>  arch/x86/kernel/relocate_kernel_64.S | 15 +++++++++++----
>  6 files changed, 43 insertions(+), 23 deletions(-)
>
> diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
> index f2ad77929d6e..d7e93522b93d 100644
> --- a/arch/x86/include/asm/kexec.h
> +++ b/arch/x86/include/asm/kexec.h
> @@ -122,7 +122,7 @@ relocate_kernel_fn(unsigned long indirection_page,
>                     unsigned long pa_control_page,
>                     unsigned long start_address,
>                     unsigned int preserve_context,
> -                   unsigned int host_mem_enc_active);
> +                   unsigned int cache_incoherent);

So preserve_context and cache_incoherent are both a *single* bit of
information. And we use two u32s for that?!?!

How about flags please?

>  #endif
>  extern relocate_kernel_fn relocate_kernel;
>  #define ARCH_HAS_KIMAGE_ARCH
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index bde58f6510ac..a24c7805acdb 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -731,6 +731,8 @@ void __noreturn stop_this_cpu(void *dummy);
>  void microcode_check(struct cpuinfo_x86 *prev_info);
>  void store_cpu_caps(struct cpuinfo_x86 *info);
>

So much text above - not a single comment here explaining what this var is for.

> +DECLARE_PER_CPU(bool, cache_state_incoherent);
> +
>  enum l1tf_mitigations {
>          L1TF_MITIGATION_OFF,
>          L1TF_MITIGATION_AUTO,
> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> index f18f540db58c..4c7fde344216 100644
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -503,6 +503,22 @@ static void early_detect_mem_encrypt(struct cpuinfo_x86 *c)
>  {
>          u64 msr;
>
> +        /*
> +         * Mark using wbinvd is needed during kexec on processors that

For all text: write insns in caps pls - WBINVD.

> +         * support SME. This provides support for performing a successful
> +         * kexec when going from SME inactive to SME active (or vice-versa).
> +         *
> +         * The cache must be cleared so that if there are entries with the
> +         * same physical address, both with and without the encryption bit,
> +         * they don't race each other when flushed and potentially end up
> +         * with the wrong entry being committed to memory.
> +         *
> +         * Test the CPUID bit directly because the machine might've cleared
> +         * X86_FEATURE_SME due to cmdline options.

Where?

That same function does the clearing later...

--
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette