Introduction
============
NMI-source reporting with FRED [1] provides a new mechanism for identifying the
source of NMIs. As part of the FRED event delivery framework, a 16-bit vector
bitmap is provided that identifies one or more sources that caused the NMI.

Using the source bitmap, the kernel can precisely run the relevant NMI handlers
instead of polling the entire NMI handler list. Additionally, the source
information would be invaluable for debugging misbehaving handlers and unknown
NMIs.

Changes since v5
================
This series mainly implements PeterZ's suggestions in patches 4,5,6:
* Simplify NMI handling by always setting and using the source bitmap.
* Add runtime warnings for unexpected values.
* Fix a compile issue in apic.h with a specific config.
* Drop the tracepoint changes for now (include them with the DebugFS series).
* Pick up Sandipan's tested-by for the perf patch.

The previous posting included a major simplification compared to the series
posted last year [2]. Refer to the v5 cover letter for details.

v5: https://lore.kernel.org/lkml/20250507012145.2998143-1-sohil.mehta@xxxxxxxxx/

Overview of NMI-source usage
============================
Code snippets:

// Allocate a static source vector at compile time
#define NMIS_VECTOR_TEST	1

// Register an NMI handler with the vector
register_nmi_handler(NMI_LOCAL, test_handler, 0, "nmi_test", NMIS_VECTOR_TEST);

// Generate an NMI with the source vector using NMI encoded delivery
__apic_send_IPI_mask(cpumask, APIC_DM_NMI | NMIS_VECTOR_TEST);

// Handle an NMI with or without the source information (oversimplified)
source_bitmap = fred_event_data(regs);
if (!source_bitmap || (source_bitmap & BIT(NMIS_VECTOR_TEST)))
	test_handler();

// Unregister handler along with the vector
unregister_nmi_handler(NMI_LOCAL, "nmi_test");

Patch structure
===============
The patches are based on tip:x86/nmi because they depend on the NMI cleanup
series merged earlier [3].

Patch 1-2: Prepare FRED/KVM and enumerate NMI-source reporting
Patch 3-5: Register and handle NMI-source vectors
Patch 6-8: APIC changes to generate NMIs with vectors
Patch 9:   Improve debug print with NMI-source information

Many thanks to Sean Christopherson, Xin Li, H. Peter Anvin, Andi Kleen, Tony
Luck, Kan Liang, Jacob Pan Jun, Zeng Guang, Peter Zijlstra, Sandipan Das, Steven
Rostedt and others for their contributions, reviews and feedback.

Future work / Opens
===================
I am considering a few additional changes that would be valuable for enhancing
NMI handling support. Any feedback, preferences or suggestions on the following
would be helpful.

Assigning more NMI-source vectors
---------------------------------
The current patches assign NMI vectors to a limited number of sources. The
microcode rendezvous and crash reboot code use NMI but do not go through the
typical register_nmi_handler() path. Their handling is special-cased in
exc_nmi(). To isolate blame and improve debugging, it would be useful to assign
vectors to them, even if the vectors are ignored during handling.

Other NMI sources, such as GHES and Platform NMIs, can also be assigned vectors
to speed up their NMI handling and improve isolation. However, this would
require a software/hardware agreement on vector reservation and usage. Such an
endeavor is likely not worth the effort.

Explicitly enabling NMIs
------------------------
HPA brought up the idea [4] of explicitly enabling NMIs only when the kernel is
ready to take them. With FRED, if we enter the kernel with NMIs disabled, they
could remain disabled until returning to userspace. I am evaluating the request
and related code changes.

Debug support (Tracing and DebugFS)
-----------------------------------
NMI-source information can help identify issues when multiple NMIs occur
simultaneously or if certain NMI handlers consistently misbehave.

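As a rough illustration of the decoding that such debug output would rely on,
the sketch below expands a 16-bit source bitmap into the list of vectors it
reports. It is plain userspace C and not code from this series:
nmi_source_decode() and NMIS_BITMAP_WIDTH are hypothetical names, and the only
fact taken from the series is the 16-bit width of the bitmap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The source bitmap is 16 bits wide; the macro name itself is hypothetical. */
#define NMIS_BITMAP_WIDTH 16

/*
 * Hypothetical helper, not part of the series: store the index of every
 * set bit of @bitmap in @vectors (ascending order) and return how many
 * were stored. @vectors must have room for NMIS_BITMAP_WIDTH entries.
 */
static size_t nmi_source_decode(uint16_t bitmap, unsigned int *vectors)
{
	size_t count = 0;

	assert(vectors != NULL);

	for (unsigned int vec = 0; vec < NMIS_BITMAP_WIDTH; vec++) {
		if (bitmap & (1u << vec))
			vectors[count++] = vec;
	}

	return count;
}
```

With a bitmap of 0x0012 the helper reports vectors 1 and 4, which is how
simultaneous NMIs from two sources would show up. An empty bitmap yields no
vectors, corresponding to the "no source information" case in the overview
snippet where the handler is run regardless.
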
Based on feedback from Steven Rostedt [5], the plan is to move the trace event
to arch/x86 and then add source_bitmap to the nmi_handler() trace event.

Currently, the kernel has counters for unknown NMIs, swallowed NMIs and other
NMI handling data. However, there is no easy way to access them. To identify
issues that happen over a longer timeframe, it might be useful to add DebugFS
support for NMI statistics.

KVM support
-----------
The NMI-source feature can be useful for perf users and other NMI use cases in
guest VMs. Exposing NMI-source to guests once FRED support is in place should be
relatively easy. The prototype code for this is under evaluation.

Links
=====
[1]: Chapter 9, https://www.intel.com/content/www/us/en/content-details/819481/flexible-return-and-event-delivery-fred-specification.html
[2]: https://lore.kernel.org/lkml/20240709143906.1040477-1-jacob.jun.pan@xxxxxxxxxxxxxxx/
[3]: https://lore.kernel.org/lkml/20250327234629.3953536-1-sohil.mehta@xxxxxxxxx/
[4]: https://lore.kernel.org/lkml/F5D36889-A868-46D2-A678-8EE26E28556D@xxxxxxxxx/
[5]: https://lore.kernel.org/lkml/20250507174809.10cfc5ac@xxxxxxxxxxxxxxxxxx/

Jacob Pan (1):
  perf/x86: Enable NMI-source reporting for perfmon

Sohil Mehta (7):
  x86/cpufeatures: Add the CPUID feature bit for NMI-source reporting
  x86/nmi: Extend the registration interface to include the NMI-source vector
  x86/nmi: Assign and register NMI-source vectors
  x86/nmi: Add support to handle NMIs with source information
  x86/nmi: Prepare for the new NMI-source vector encoding
  x86/nmi: Enable NMI-source for IPIs delivered as NMIs
  x86/nmi: Print source information with the unknown NMI console message

Zeng Guang (1):
  x86/fred, KVM: VMX: Pass event data to the FRED entry point from KVM

 arch/x86/entry/entry_64_fred.S      |  2 +-
 arch/x86/events/amd/ibs.c           |  2 +-
 arch/x86/events/core.c              |  6 ++---
 arch/x86/events/intel/core.c        |  6 ++---
 arch/x86/include/asm/apic.h         | 39 +++++++++++++++++++++++++++++
 arch/x86/include/asm/apicdef.h      |  2 +-
 arch/x86/include/asm/cpufeatures.h  |  1 +
 arch/x86/include/asm/fred.h         |  9 ++++---
 arch/x86/include/asm/nmi.h          | 37 ++++++++++++++++++++++++++-
 arch/x86/kernel/apic/hw_nmi.c       |  5 ++--
 arch/x86/kernel/apic/ipi.c          |  4 +--
 arch/x86/kernel/apic/local.h        | 24 ++++++++++--------
 arch/x86/kernel/cpu/cpuid-deps.c    |  1 +
 arch/x86/kernel/cpu/mce/inject.c    |  4 +--
 arch/x86/kernel/cpu/mshyperv.c      |  3 +--
 arch/x86/kernel/kgdb.c              |  8 +++---
 arch/x86/kernel/kvm.c               |  9 +------
 arch/x86/kernel/nmi.c               | 37 +++++++++++++++++++++++++++
 arch/x86/kernel/nmi_selftest.c      |  9 +++----
 arch/x86/kernel/smp.c               |  6 ++---
 arch/x86/kvm/vmx/vmx.c              |  5 ++--
 arch/x86/platform/uv/uv_nmi.c       |  4 +--
 drivers/acpi/apei/ghes.c            |  2 +-
 drivers/char/ipmi/ipmi_watchdog.c   |  3 +--
 drivers/edac/igen6_edac.c           |  3 +--
 drivers/thermal/intel/therm_throt.c |  2 +-
 drivers/watchdog/hpwdt.c            |  6 ++---
 27 files changed, 171 insertions(+), 68 deletions(-)

base-commit: f2e01dcf6df2d12e86c363ea9c37d53994d89dd6
-- 
2.43.0