On Fri, Mar 21, 2025 at 05:01:15PM -0500, Bjorn Helgaas wrote:
> On Thu, Mar 20, 2025 at 06:58:03PM -0700, Jon Pan-Doh wrote:
> > Update name to reflect the broader definition of structs/variables that
> > are stored (e.g. ratelimits). This is a preparatory patch for adding rate
> > limit support.
> >
> > Signed-off-by: Karolina Stolarek <karolina.stolarek@xxxxxxxxxx>
> > Signed-off-by: Jon Pan-Doh <pandoh@xxxxxxxxxx>
> > Reported-by: Sargun Dhillon <sargun@xxxxxxxx>
>
> What did Sargun report? Is there a bug fix in here? Can we include a
> URL to whatever Sargun reported?

He reported RCU CPU stall warnings and CSD-lock warnings internally
within Meta, so sorry, no useful URL.

I did put together a series of hacks that fix the problem: (1) Disabling
__aer_print_error() entirely, (2) Disabling __aer_print_error() printk()
and sysfs, (3) Disabling only __aer_print_error() printk(), and finally
(4) Throttling __aer_print_error() printk().

Jon's patch looks to cover my #4 plus it looks to allow run-time control
of the throttling. So my patch is strictly a stop-the-bleeding measure
for Meta's fleet while this patch series makes its way upstream.

I do plan to look at Jon's patch in more detail when he posts the next
version.

Fair enough?

							Thanx, Paul