On Tue, Sep 02, 2025 at 01:10:52PM +0200, Borislav Petkov wrote:
> On Mon, Aug 25, 2025 at 05:33:10PM +0000, Yazen Ghannam wrote:
> > +/*
> > + * Threshold interrupt handler will service THRESHOLD_APIC_VECTOR. The interrupt
> > + * goes off when error_count reaches threshold_limit.
> > + */
> > +static void amd_threshold_interrupt(void)
> > +{
> > +	machine_check_poll(MCP_TIMESTAMP, &this_cpu_ptr(&mce_amd_data)->thr_intr_banks);
> >  }
>
> So the thresholding interrupt will fire.
>
> It'll call machine_check_poll().
>
> That thing will do something and eventually call back into amd.c again:
>
> 	if (mce_flags.amd_threshold)
> 		amd_reset_thr_limit(i);

This resets only a bank with a valid error. Also, it resets the limit
*before* clearing MCi_STATUS which should be the last step.

>
> Why the back'n'forth?
>
> Why not:
>
> static void amd_threshold_interrupt(void)
> {
> 	machine_check_poll(MCP_TIMESTAMP, &this_cpu_ptr(&mce_amd_data)->thr_intr_banks);
> 	amd_reset_thr_limit();

This means we'd need to do another loop through the banks. Their
MCi_STATUS registers would be cleared. So they could log another error
before the limit is reset.

Overall, the goal is to loop through the banks one time and log/reset
banks as we go through them.

Thanks,
Yazen
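
Illustrative note on the ordering argument above: the difference between
the two approaches can be sketched with a toy user-space model. This is
only a sketch under stated assumptions, not the kernel's MCA code.
struct bank_model, the tick counter, and every function name below are
hypothetical. The model also assumes that the second loop in the two-pass
variant resets every bank, which the thread does not specify.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for one MCA bank. Hypothetical; not a kernel structure. */
struct bank_model {
	bool valid;       /* models MCi_STATUS holding a valid error */
	int  logged;      /* how many errors were logged from this bank */
	long clear_tick;  /* model time at which the status was cleared */
	long reset_tick;  /* model time at which the limit was reset */
};

/* Model clock: every bank operation advances it by one. */
static long tick;

static void log_error(struct bank_model *b)    { b->logged++; tick++; }
static void reset_limit(struct bank_model *b)  { b->reset_tick = ++tick; }
static void clear_status(struct bank_model *b) { b->valid = false; b->clear_tick = ++tick; }

/*
 * One loop: for each bank with a valid error, log it, reset the limit,
 * and clear the status as the last step.
 */
static void single_pass(struct bank_model *banks, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!banks[i].valid)
			continue;
		log_error(&banks[i]);
		reset_limit(&banks[i]);
		clear_status(&banks[i]);
	}
}

/*
 * Two loops: the first logs and clears every valid bank, the second
 * resets the limits afterwards (assumed here to cover all banks).
 */
static void two_pass(struct bank_model *banks, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!banks[i].valid)
			continue;
		log_error(&banks[i]);
		clear_status(&banks[i]);
	}
	for (size_t i = 0; i < n; i++)
		reset_limit(&banks[i]);
}

/*
 * Model ticks a bank spent with its status already cleared but its limit
 * not yet reset. Zero means the bank could not log a new error against a
 * stale limit in this model.
 */
static long exposed_window(const struct bank_model *b)
{
	if (b->clear_tick == 0 || b->reset_tick <= b->clear_tick)
		return 0;
	return b->reset_tick - b->clear_tick;
}
```

In this model, single_pass() leaves exposed_window() at zero for every
bank, while two_pass() leaves a non-zero window for each bank that had a
valid error, and the window is largest for the banks handled first.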