Re: [PATCH v4 5/7] PCI/AER: Introduce ratelimit for error logs


 



On Thu, Mar 20, 2025 at 1:29 PM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
>
> On Thu, Mar 20, 2025 at 12:53:53PM -0700, Jon Pan-Doh wrote:
> I think the struct aer_err_info is basically a per-interrupt thing, so
> maybe we could evaluate __ratelimit() once at the initial entry, save
> the result in aer_err_info, and use that saved value everywhere we
> print messages?

I like this approach. Another advantage is it removes the need for the 2x
ratelimit logic. Updated for v5.

>   - native AER: aer_isr_one_error() has RP pointer in rpc->rpd and
>     could save it (or pointer to the RP's ratelimit struct, or just
>     the result of __ratelimit()) in aer_err_info.

Similar to aer_err_info.dev[], I store the evaluated __ratelimit() in
aer_err_info.ratelimited[]. The main quirk is that with multiple errors,
you won't see the root port log if the first error is ratelimited but
subsequent errors are under the limit. I think this is fine, since the
log only prints the first error, but I can change aer_print_port_info()
to log if any of the errors is under the limit.
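
To make the quirk concrete, here is a rough userspace sketch of the idea.
It is not the actual patch: every sketch_* name, the toy burst counter
standing in for the kernel's ratelimit state, and the array size are made
up for illustration. It also assumes ratelimited[i] == true means
"suppress logs for dev[i]".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define SKETCH_MAX_DEVS 5	/* hypothetical size of the dev[] array */

/* Toy stand-in for ratelimit state: allow `burst` messages, then suppress. */
struct sketch_ratelimit {
	int burst;
	int printed;
};

/* Returns true when the caller may print (same sense as __ratelimit()). */
bool sketch_ratelimit_ok(struct sketch_ratelimit *rs)
{
	if (rs->printed >= rs->burst)
		return false;
	rs->printed++;
	return true;
}

struct sketch_dev {
	const char *name;
	struct sketch_ratelimit rs;
};

/* Per-interrupt error info; ratelimited[] parallels dev[]. */
struct sketch_err_info {
	struct sketch_dev *dev[SKETCH_MAX_DEVS];
	bool ratelimited[SKETCH_MAX_DEVS];	/* assumed: true == suppress */
	int error_dev_num;
};

/* Evaluate the ratelimit exactly once, when the device is recorded. */
void sketch_add_error_device(struct sketch_err_info *info,
			     struct sketch_dev *dev)
{
	int i = info->error_dev_num;

	assert(i < SKETCH_MAX_DEVS);
	info->dev[i] = dev;
	info->ratelimited[i] = !sketch_ratelimit_ok(&dev->rs);
	info->error_dev_num++;
}

/* Behavior described above: root port log follows the first error only. */
bool sketch_port_log_first(const struct sketch_err_info *info)
{
	return info->error_dev_num > 0 && !info->ratelimited[0];
}

/* Possible alternative: log if any recorded error is under the limit. */
bool sketch_port_log_any(const struct sketch_err_info *info)
{
	for (int i = 0; i < info->error_dev_num; i++)
		if (!info->ratelimited[i])
			return true;
	return false;
}

/* Every print site reuses the saved decision; returns lines printed. */
int sketch_print_error(const struct sketch_err_info *info, int i)
{
	if (info->ratelimited[i])
		return 0;
	printf("%s: AER error\n", info->dev[i]->name);
	return 1;
}
```

With a first device that is already over its limit and a second that is
not, sketch_port_log_first() says to skip the root port log while
sketch_port_log_any() says to print it. Because the decision is saved,
later print sites for the same interrupt do not consume the budget again.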

>   - GHES AER: I'm not sure struct cper_sec_pcie contains the RP, might
>     have to search upwards from the device we know about?
>
>   - native DPC: dpc_process_error() has DP pointer and could save it
>     in aer_err_info.
>
>   - EDR DPC: passes DP pointer to dpc_process_error().

These are largely unchanged:
- GHES/CXL are gated by aer_ratelimit() in pci_print_aer()
- DPC is not ratelimited, on the expectation that there won't be error
  storms

Thanks,
Jon




