On Thu, 26 Jun 2025 17:42:35 -0500 Terry Bowman <terry.bowman@xxxxxxx> wrote:

> This patchset updates CXL Protocol Error handling for CXL Ports and CXL
> Endpoints (EP). The reach of this patchset grew from CXL Ports to include
> EPs as well.
>
> This patchset is a continuation of v9 found here:
> https://lore.kernel.org/linux-cxl/20250603172239.159260-1-terry.bowman@xxxxxxx/
>
> The first patch is a small cleanup change to reduce amount of code.
>
> The next 2 patches introduce pci_dev::is_cxl, aer_info::is_cxl, and add
> bus string to AER log tracing. aer_info::is_cxl will be used to indicate a
> CXL or PCI error and will be used to direct the error handling flow in
> later patches.
>
> The next patch introduces a new driver file, pci/pcie/cxl_aer.c, to move
> the existing CXL AER logic into.
>
> The next 3 patches update the AER driver and CXL driver to use a kfifo.
> The kfifo is added to offload CXL-AER protocol error work to the CXL
> driver. These patches provide the kfifo work add and work remove.
>
> The next 5 patches prepare the CXL driver for adding the updated protocol
> error handlers. This includes adding CXL Port RAS mapping and updating
> interfaces for common support.
>
> The final 5 patches add the CXL error handlers for CXL EPs and CXL Ports.
> CXL EPs keep the PCIe error handler for cases the EP error is interpreted
> as a PCIe error. These patches also add logic to unmask CXL Protocol Errors
> during port probing, and mask CXL Protocol Errors during port device
> cleanup.

Hello Terry,

Thank you for this new version. I just wanted to add that I have been
testing this new version on a few machines, and it fixes an issue that I
was seeing on v8 of the patchset.

Previously, booting a kernel with the parameter pcie_ports=compat would
lead to a kernel crash caused by a NULL pointer dereference. After I
rebased the kernel to use v10 instead, this went away and I can use
pcie_ports=compat without any complications.
I tried to find the change that led to this fix, but couldn't find anything
specific. It seems like a use-after-free bug and happens specifically in
cxl_dport_init_ras_reporting.

Since this new version fixes this issue, please feel free to add my
Tested-by tag in future versions.

Thank you again for your work on this series! I hope you have a great day.
Joshua Hahn

Tested-by: Joshua Hahn <joshua.hahnjy@xxxxxxxxx>

Sent using hkml (https://github.com/sjp38/hackermail)