On 6/27/2025 6:05 AM, Jonathan Cameron wrote: > On Thu, 26 Jun 2025 17:42:42 -0500 > Terry Bowman <terry.bowman@xxxxxxx> wrote: > >> Create cxl_do_recovery() to provide uncorrectable protocol error (UCE) >> handling. Follow similar design as found in PCIe error driver, >> pcie_do_recovery(). One difference is cxl_do_recovery() will treat all UCEs >> as fatal with a kernel panic. This is to prevent corruption on CXL memory. >> >> Export the PCI error driver's merge_result() to CXL namespace. > I think this may be a confusion from earlier review. Anyhow, it should > be namespaced in the sense of not exporting something the vague name of > merge_result but it's PCI code, not CXL code and we don't have the dangerous > interface argument to justify putting it in the CXL namespace so I think > a namespaced EXPORT makes little sense for this one. > > Jonathan > > >> Introduce >> PCI_ERS_RESULT_PANIC and add support in merge_result() routine. This will >> be used by CXL to panic the system in the case of uncorrectable protocol >> errors. PCI error handling is not currently expected to use the >> PCI_ERS_RESULT_PANIC. >> >> Copy pci_walk_bridge() to cxl_walk_bridge(). Make a change to walk the >> first device in all cases. >> >> Copy the PCI error driver's report_error_detected() to cxl_report_error_detected(). >> Note, only CXL Endpoints and RCH Downstream Ports(RCH DSP) are currently >> supported. Add locking for PCI device as done in PCI's report_error_detected(). >> This is necessary to prevent the RAS registers from disappearing before >> logging is completed. >> >> Call panic() to halt the system in the case of uncorrectable errors (UCE) >> in cxl_do_recovery(). Export pci_aer_clear_fatal_status() for CXL to use >> if a UCE is not found. In this case the AER status must be cleared and >> uses pci_aer_clear_fatal_status(). >> >> Signed-off-by: Terry Bowman <terry.bowman@xxxxxxx> > >> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c >> index de6381c690f5..63fceb3e8613 100644 >> --- a/drivers/pci/pcie/err.c >> +++ b/drivers/pci/pcie/err.c >> @@ -21,9 +21,12 @@ >> #include "portdrv.h" >> #include "../pci.h" >> >> -static pci_ers_result_t merge_result(enum pci_ers_result orig, >> - enum pci_ers_result new) >> +pci_ers_result_t merge_result(enum pci_ers_result orig, >> + enum pci_ers_result new) >> { >> + if (new == PCI_ERS_RESULT_PANIC) >> + return PCI_ERS_RESULT_PANIC; >> + >> if (new == PCI_ERS_RESULT_NO_AER_DRIVER) >> return PCI_ERS_RESULT_NO_AER_DRIVER; >> >> @@ -45,6 +48,7 @@ static pci_ers_result_t merge_result(enum pci_ers_result orig, >> >> return orig; >> } >> +EXPORT_SYMBOL_NS_GPL(merge_result, "CXL"); > Do we care about namespacing this? I think not given it is PCIe code > and hardly destructive for other drivers to mess with it if they like. > > I would namespace it in the sense of renaming it to make it clear > it's about pci errors though. > > pci_ers_merge_result() perhaps? > > Do that as a percursor patch. > Good idea. There is a lot of changes related to just exporting this and changing the name. I've changed the namespace export to be: EXPORT_SYMBOL(pci_ers_merge_result); I moved this and its related required changes into an earlier patch. -Terry >> >> static int report_error_detected(struct pci_dev *dev, >> pci_channel_state_t state,