On Thu, 26 Jun 2025 17:42:42 -0500 Terry Bowman <terry.bowman@xxxxxxx> wrote: > Create cxl_do_recovery() to provide uncorrectable protocol error (UCE) > handling. Follow similar design as found in PCIe error driver, > pcie_do_recovery(). One difference is cxl_do_recovery() will treat all UCEs > as fatal with a kernel panic. This is to prevent corruption on CXL memory. > > Export the PCI error driver's merge_result() to CXL namespace. I think this may be a confusion from earlier review. Anyhow, it should be namespaced in the sense of not exporting something the vague name of merge_result but it's PCI code, not CXL code and we don't have the dangerous interface argument to justify putting it in the CXL namespace so I think a namespaced EXPORT makes little sense for this one. Jonathan > Introduce > PCI_ERS_RESULT_PANIC and add support in merge_result() routine. This will > be used by CXL to panic the system in the case of uncorrectable protocol > errors. PCI error handling is not currently expected to use the > PCI_ERS_RESULT_PANIC. > > Copy pci_walk_bridge() to cxl_walk_bridge(). Make a change to walk the > first device in all cases. > > Copy the PCI error driver's report_error_detected() to cxl_report_error_detected(). > Note, only CXL Endpoints and RCH Downstream Ports(RCH DSP) are currently > supported. Add locking for PCI device as done in PCI's report_error_detected(). > This is necessary to prevent the RAS registers from disappearing before > logging is completed. > > Call panic() to halt the system in the case of uncorrectable errors (UCE) > in cxl_do_recovery(). Export pci_aer_clear_fatal_status() for CXL to use > if a UCE is not found. In this case the AER status must be cleared and > uses pci_aer_clear_fatal_status(). > > Signed-off-by: Terry Bowman <terry.bowman@xxxxxxx> > diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c > index de6381c690f5..63fceb3e8613 100644 > --- a/drivers/pci/pcie/err.c > +++ b/drivers/pci/pcie/err.c > @@ -21,9 +21,12 @@ > #include "portdrv.h" > #include "../pci.h" > > -static pci_ers_result_t merge_result(enum pci_ers_result orig, > - enum pci_ers_result new) > +pci_ers_result_t merge_result(enum pci_ers_result orig, > + enum pci_ers_result new) > { > + if (new == PCI_ERS_RESULT_PANIC) > + return PCI_ERS_RESULT_PANIC; > + > if (new == PCI_ERS_RESULT_NO_AER_DRIVER) > return PCI_ERS_RESULT_NO_AER_DRIVER; > > @@ -45,6 +48,7 @@ static pci_ers_result_t merge_result(enum pci_ers_result orig, > > return orig; > } > +EXPORT_SYMBOL_NS_GPL(merge_result, "CXL"); Do we care about namespacing this? I think not given it is PCIe code and hardly destructive for other drivers to mess with it if they like. I would namespace it in the sense of renaming it to make it clear it's about pci errors though. pci_ers_merge_result() perhaps? Do that as a percursor patch. > > static int report_error_detected(struct pci_dev *dev, > pci_channel_state_t state,