On PREEMPT_RT, currently both aer_irq and aer_isr run in separate
threads, at the same FIFO priority. This can lead to the aer_isr thread
starving the aer_irq thread, particularly if multi_error_valid causes a
scan of all devices, and multiple errors are raised during the scan.

On !PREEMPT_RT, or if aer_irq runs at a higher priority than aer_isr,
these errors can be queued as single-error events as they happen. But
if aer_irq can't run until aer_isr finishes, by that time the multi
event bit will be set again, causing a new scan and an infinite loop.

Signed-off-by: Crystal Wood <crwood@xxxxxxxxxx>
---
I'm seeing this on a particular ARM server when using
/sys/bus/pci/rescan, though the internal reporter sometimes saw it
happen on boot as well. On !PREEMPT_RT, or with this patch, a finite
number of errors are emitted and the scan completes.
---
 drivers/pci/pcie/aer.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 15ed541d2fbe..6945a112a5cd 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -1671,7 +1671,8 @@ static int aer_probe(struct pcie_device *dev)
 	set_service_data(dev, rpc);
 
 	status = devm_request_threaded_irq(device, dev->irq, aer_irq, aer_isr,
-					   IRQF_SHARED, "aerdrv", dev);
+					   IRQF_NO_THREAD | IRQF_SHARED,
+					   "aerdrv", dev);
 	if (status) {
 		pci_err(port, "request AER IRQ %d failed\n", dev->irq);
 		return status;
-- 
2.47.1