On 9/8/25 18:40, Yazen Ghannam wrote:
Scalable MCA systems have a per-CPU register that gives the APIC LVT
offset for the thresholding and deferred error interrupts.
Currently, this register is read once to set up the deferred error
interrupt and then read again for each thresholding block. Furthermore,
the APIC LVT registers are configured each time, but they only need to
be configured once per-CPU.
Move the APIC LVT setup to the early part of CPU init, so that the
registers are set up once. Also, this ensures that the kernel is ready
to service the interrupts before the individual error sources (each MCA
bank) are enabled.
Apply this change only to SMCA systems to avoid breaking any legacy
behavior. The deferred error interrupt is technically advertised by the
SUCCOR feature. However, this was first made available on SMCA systems.
Therefore, only set up the deferred error interrupt on SMCA systems and
simplify the code.
Guidance from hardware designers is that the LVT offsets provided from
the platform should be used. The kernel should not try to enforce
specific values. However, the kernel should check that an LVT offset is
not reused for multiple sources.
Therefore, remove the extra checking and value enforcement from the MCE
code. The "reuse/conflict" case is already handled in
setup_APIC_eilvt().
Tested-by: Tony Luck <tony.luck@xxxxxxxxx>
Reviewed-by: Tony Luck <tony.luck@xxxxxxxxx>
Signed-off-by: Yazen Ghannam <yazen.ghannam@xxxxxxx>
---
Notes:
Link: https://lore.kernel.org/r/20250825-wip-mca-updates-v5-15-865768a2eef8@xxxxxxx
v5->v6:
* Applied "bools to flags" and other fixups from Boris.
v4->v5:
* Added back to set.
* Updated commit message with more details.
v3->v4:
* Dropped from set.
v2->v3:
* Add tags from Tony.
v1->v2:
* Use new per-CPU struct.
* Don't set up interrupt vectors.
arch/x86/kernel/cpu/mce/amd.c | 121 ++++++++++++++++++------------------------
1 file changed, 53 insertions(+), 68 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 1b1b83b3aef9..a6f5c9339d7c 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -43,9 +43,6 @@
/* Deferred error settings */
#define MSR_CU_DEF_ERR 0xC0000410
nit: While touching this code, why not finally rename this to match the
APM (section 9.3.1.4), i.e. MCA_INTR_CFG?
Perhaps as a separate patch. I see that you already sent a patch
containing this rename:
https://lore.kernel.org/all/20231118193248.1296798-13-yazen.ghannam@xxxxxxx/
but I guess it didn't land.
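
To make the suggestion concrete, here is a rough, standalone sketch of what
the renamed definitions could look like. The MSR_AMD64_MCA_INTR_CFG and
MCA_INTR_CFG_* names and the helper functions are hypothetical (they are not
taken from the patch or from the earlier series); the field positions assume
the layout the existing masks in amd.c already use, i.e. the deferred error
LVT offset in bits 7:4 and the thresholding LVT offset in bits 15:12 of the
low 32 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical new name for MSR_CU_DEF_ERR, following the APM's MCA_INTR_CFG. */
#define MSR_AMD64_MCA_INTR_CFG		0xC0000410

/* Assumed field layout, matching the masks amd.c uses today. */
#define MCA_INTR_CFG_DFR_LVT_OFF	0x000000F0	/* bits 7:4   */
#define MCA_INTR_CFG_THR_LVT_OFF	0x0000F000	/* bits 15:12 */

/* Extract the deferred error LVT offset from the low half of the MSR. */
static inline unsigned int mca_dfr_lvt_offset(uint32_t cfg_lo)
{
	return (cfg_lo & MCA_INTR_CFG_DFR_LVT_OFF) >> 4;
}

/* Extract the thresholding LVT offset from the low half of the MSR. */
static inline unsigned int mca_thr_lvt_offset(uint32_t cfg_lo)
{
	return (cfg_lo & MCA_INTR_CFG_THR_LVT_OFF) >> 12;
}

/*
 * Illustration only: true if both interrupt sources were given the same
 * offset. In the kernel the reuse case is left to setup_APIC_eilvt().
 */
static inline bool mca_lvt_offsets_conflict(uint32_t cfg_lo)
{
	return mca_dfr_lvt_offset(cfg_lo) == mca_thr_lvt_offset(cfg_lo);
}
```

With the register read once per CPU, both offsets can be taken from the same
value, which fits the single early setup this patch introduces.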