On Thu, 26 Jun 2025 17:42:49 -0500
Terry Bowman <terry.bowman@xxxxxxx> wrote:

> CXL Endpoint protocol errors are currently handled using PCI error
> handlers. The CXL Endpoint requires CXL specific handling in the case of
> uncorrectable error (UCE) handling not provided by the PCI handlers.
>
> Add CXL specific handlers for CXL Endpoints. Rename the existing
> cxl_error_handlers to be pci_error_handlers to more correctly indicate
> the error type and follow naming consistency.
>
> The PCI handlers will be called if the CXL device is not trained for
> alternate protocol (CXL). Update the CXL Endpoint PCI handlers to call the
> CXL UCE handlers.
>
> The existing EP UCE handler includes checks for various results. These are
> no longer needed because CXL UCE recovery will not be attempted. Implement
> cxl_handle_ras() to return PCI_ERS_RESULT_NONE or PCI_ERS_RESULT_PANIC. The
> CXL UCE handler is called by cxl_do_recovery() that acts on the return
> value. In the case of the PCI handler path, call panic() if the result is
> PCI_ERS_RESULT_PANIC.
>
> Signed-off-by: Terry Bowman <terry.bowman@xxxxxxx>
> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx>

A few minor comments inline.

J

> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index 887b54cf3395..7209ffb5c2fe 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
>
> -	scoped_guard(device, dev) {
> -		if (!dev->driver) {
> +pci_ers_result_t cxl_error_detected(struct device *dev)
> +{
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	struct cxl_dev_state *cxlds = pci_get_drvdata(pdev);
> +	struct device *cxlmd_dev = &cxlds->cxlmd->dev;
> +	pci_ers_result_t ue;
> +
> +	scoped_guard(device, cxlmd_dev) {

I think there is nothing much happening after this (maybe introduced in later
patches in which case ignore this comment).  So can you just use a guard
and reduce the indent of the rest?

> +
> +		if (!cxlmd_dev->driver) {
>  			dev_warn(&pdev->dev,
>  				 "%s: memdev disabled, abort error handling\n",
>  				 dev_name(dev));
> -			return PCI_ERS_RESULT_DISCONNECT;
> +			return PCI_ERS_RESULT_PANIC;
>  		}
>
>  		if (cxlds->rcd)
> @@ -881,29 +888,23 @@ pci_ers_result_t cxl_error_detected(struct pci_dev *pdev,
>  		ue = cxl_handle_ras(&cxlds->cxlmd->dev, cxlds->serial, cxlds->regs.ras);

A little hard to tell from this code blob, but can you return here?

>  	}
>
> -
> -	switch (state) {
> -	case pci_channel_io_normal:
> -		if (ue) {
> -			device_release_driver(dev);
> -			return PCI_ERS_RESULT_NEED_RESET;
> -		}
> -		return PCI_ERS_RESULT_CAN_RECOVER;
> -	case pci_channel_io_frozen:
> -		dev_warn(&pdev->dev,
> -			 "%s: frozen state error detected, disable CXL.mem\n",
> -			 dev_name(dev));
> -		device_release_driver(dev);
> -		return PCI_ERS_RESULT_NEED_RESET;
> -	case pci_channel_io_perm_failure:
> -		dev_warn(&pdev->dev,
> -			 "failure state error detected, request disconnect\n");
> -		return PCI_ERS_RESULT_DISCONNECT;
> -	}
> -	return PCI_ERS_RESULT_NEED_RESET;
> +	return ue;
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_error_detected, "CXL");
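
For illustration only, the two inline suggestions (a function-scope guard
instead of scoped_guard(), and returning the cxl_handle_ras() result directly)
can be sketched in user space as below. This is not kernel code and is
untested against the series: the fake_* names, the pthread mutex standing in
for the device lock, and the FAKE_GUARD macro built on the GCC/Clang cleanup
attribute are all assumptions made up for the example, and the rcd branch is
left out.

```c
/*
 * User-space sketch only. Every fake_* / FAKE_* name is hypothetical and
 * stands in for kernel pieces (device lock, cxl_handle_ras(), PCI_ERS_*).
 * Needs GCC or Clang for __attribute__((cleanup)).
 */
#include <pthread.h>
#include <stdbool.h>

enum fake_result { FAKE_RESULT_NONE, FAKE_RESULT_PANIC };

struct fake_dev {
	pthread_mutex_t lock;		/* stand-in for the device lock */
	bool has_driver;		/* stand-in for dev->driver != NULL */
	unsigned int ras_status;	/* nonzero: uncorrectable error logged */
};

/* Cleanup callback: receives a pointer to the guard variable. */
static void fake_unlock(pthread_mutex_t **lockp)
{
	pthread_mutex_unlock(*lockp);
}

/* Lock now, unlock automatically when the enclosing scope is left. */
#define FAKE_GUARD(lockptr)						\
	pthread_mutex_t *fake_guard_					\
		__attribute__((cleanup(fake_unlock), unused)) =		\
		(pthread_mutex_lock(lockptr), (lockptr))

/* Stand-in for cxl_handle_ras(): only NONE or PANIC, no recovery states. */
static enum fake_result fake_handle_ras(const struct fake_dev *dev)
{
	return dev->ras_status ? FAKE_RESULT_PANIC : FAKE_RESULT_NONE;
}

static enum fake_result fake_error_detected(struct fake_dev *dev)
{
	FAKE_GUARD(&dev->lock);	/* held until the function returns */

	if (!dev->has_driver)
		return FAKE_RESULT_PANIC;

	/* Return directly: no local result variable, no extra indent. */
	return fake_handle_ras(dev);
}
```

With the lock scoped to the whole function there is one less nesting level,
and both exit paths drop the lock without an explicit unlock. In the kernel
the corresponding line would presumably be guard(device)(cxlmd_dev), assuming
nothing later in the series needs to run after the lock is released.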