Hi,

AER correctable errors are pretty rare. I only saw one once before and
came up with commit 78457cae24cb ("PCI: xilinx-nwl: Rate-limit misc
interrupt messages") in response. I saw another today and,
unfortunately, clearing the correctable AER bit in MSGF_MISC_STATUS is
not sufficient to handle the IRQ. It gets immediately re-raised,
preventing the system from making any other progress. I suspect that it
needs to be cleared in PCI_ERR_ROOT_STATUS. But since the AER IRQ never
gets delivered to aer_irq, those registers never get tickled.

The underlying problem is that pcieport thinks that the IRQ is going to
be one of the MSIs or a legacy interrupt, but it's actually a native
interrupt:

           CPU0       CPU1       CPU2       CPU3
 42:          0          0          0          0     GICv2 150 Level     nwl_pcie:misc
 45:          0          0          0          0  nwl_pcie:legacy   0 Level     PCIe PME, aerdrv
 46:         25          0          0          0  nwl_pcie:msi 524288 Edge      nvme0q0
 47:          0          0          0          0  nwl_pcie:msi 524289 Edge      nvme0q1
 48:          0          0          0          0  nwl_pcie:msi 524290 Edge      nvme0q2
 49:         46          0          0          0  nwl_pcie:msi 524291 Edge      nvme0q3
 50:          0          0          0          0  nwl_pcie:msi 524292 Edge      nvme0q4

In the above example, AER errors will trigger interrupt 42, not 45.
Actually, there are a bunch of different interrupts in MSGF_MISC_STATUS,
so maybe nwl_pcie_misc_handler should be an interrupt controller
instead? But even then pcie_port_enable_irq_vec() won't figure out the
correct IRQ. Any ideas on how to fix this?

Additionally, any tips on actually triggering AER/PME stuff in a
consistent way? Are there any off-the-shelf cards for sending weird PCIe
stuff over a link for testing?
Right now all I have

--Sean

# lspci -vv
00:00.0 PCI bridge: Xilinx Corporation Device d011 (prog-if 00 [Normal decode])
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 45
        Bus: primary=00, secondary=01, subordinate=0c, sec-latency=0
        I/O behind bridge: 00000000-00000fff [size=4K]
        Memory behind bridge: e0000000-e00fffff [size=1M]
        Prefetchable memory behind bridge: [disabled]
        Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
        BridgeCtl: Parity- SERR+ NoISA- VGA- VGA16- MAbort- >Reset- FastB2B-
                PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [60] Express (v2) Root Port (Slot-), MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0
                        ExtTag- RBE+
                DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 128 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend+
                LnkCap: Port #0, Speed 5GT/s, Width x2, ASPM not supported
                        ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 5GT/s (ok), Width x2 (ok)
                        TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt+
                RootCap: CRSVisible-
                RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna+ CRSVisible-
                RootSta: PME ReqID 0000, PMEStatus- PMEPending-
                DevCap2: Completion Timeout: Range B, TimeoutDis+ NROPrPrP- LTR-
                         10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
                         EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
                         FRS- LN System CLS Not Supported, TPHComp- ExtTPHComp- ARIFwd-
                         AtomicOpsCap: Routing- 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled, ARIFwd-
                         AtomicOpsCtl: ReqEn- EgressBlck-
                LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1-
                         EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
                         Retimer- 2Retimers- CrosslinkRes: unsupported
        Capabilities: [100 v1] Device Serial Number 00-00-00-00-00-00-00-00
        Capabilities: [10c v1] Virtual Channel
                Caps:   LPEVC=0 RefClk=100ns PATEntryBits=1
                Arb:    Fixed- WRR32- WRR64- WRR128-
                Ctrl:   ArbSelect=Fixed
                Status: InProgress-
                VC0:    Caps:   PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
                        Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
                        Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
                        Status: NegoPending- InProgress-
        Capabilities: [128 v1] Vendor Specific Information: ID=1234 Rev=1 Len=018 <?>
        Capabilities: [140 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
                AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
                        MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 00000000 00000000 00000000 00000000
                RootCmd: CERptEn+ NFERptEn+ FERptEn+
                RootSta: CERcvd- MultCERcvd- UERcvd- MultUERcvd-
                         FirstFatal- NonFatalMsg- FatalMsg- IntMsg 0
                ErrorSrc: ERR_COR: 0000 ERR_FATAL/NONFATAL: 0000
        Kernel driver in use: pcieport
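
For illustration only, here is a toy userspace model of the behaviour suspected above: a level-style misc status bit that is re-latched for as long as the root error status still reports a received correctable error. This is a sketch under stated assumptions, not a description of the actual hardware. PCI_ERR_ROOT_COR_RCV is the real mask from pci_regs.h; MODEL_MISC_CORR_AER, struct model_bridge and all model_* helpers are hypothetical names invented for this example, and the re-latching rule in model_propagate() is exactly the unconfirmed suspicion, not a documented fact.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Real mask from include/uapi/linux/pci_regs.h (ERR_COR received). */
#define PCI_ERR_ROOT_COR_RCV 0x00000001u

/* Hypothetical stand-in for the correctable AER bit in MSGF_MISC_STATUS. */
#define MODEL_MISC_CORR_AER (1u << 0)

/* Toy model: two write-1-to-clear status registers. */
struct model_bridge {
	uint32_t root_status; /* models PCI_ERR_ROOT_STATUS */
	uint32_t misc_status; /* models MSGF_MISC_STATUS */
};

/*
 * ASSUMPTION: the misc bit follows the root status, so it is set again
 * whenever the root status still has the correctable bit pending.
 */
static void model_propagate(struct model_bridge *b)
{
	if (b->root_status & PCI_ERR_ROOT_COR_RCV)
		b->misc_status |= MODEL_MISC_CORR_AER;
}

/* Simulate the arrival of one correctable error. */
static void model_inject_correctable(struct model_bridge *b)
{
	b->root_status |= PCI_ERR_ROOT_COR_RCV;
	model_propagate(b);
}

/* Write-1-to-clear on the misc status; the source may re-latch it. */
static void model_w1c_misc(struct model_bridge *b, uint32_t val)
{
	b->misc_status &= ~val;
	model_propagate(b);
}

/* Write-1-to-clear on the root error status. */
static void model_w1c_root(struct model_bridge *b, uint32_t val)
{
	b->root_status &= ~val;
}

/* True while the modelled misc interrupt line would stay asserted. */
static bool model_irq_asserted(const struct model_bridge *b)
{
	return (b->misc_status & MODEL_MISC_CORR_AER) != 0;
}

/* Walk through the sequence described in the mail. */
static void model_selfcheck(void)
{
	struct model_bridge b = { 0, 0 };

	model_inject_correctable(&b);
	model_w1c_misc(&b, MODEL_MISC_CORR_AER);
	assert(model_irq_asserted(&b));  /* clearing misc alone: re-raised */

	model_w1c_root(&b, PCI_ERR_ROOT_COR_RCV);
	model_w1c_misc(&b, MODEL_MISC_CORR_AER);
	assert(!model_irq_asserted(&b)); /* source cleared first: stays low */
}
```

In this model the ordering matters: the root status has to be cleared before (or together with) the misc bit, which is the work aer_irq would normally do if the interrupt were ever routed to it.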