RE: [PATCH v11 13/23] cxl/pci: Unify CXL trace logging for CXL Endpoints and CXL Ports

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



>-----Original Message-----
>From: Terry Bowman <terry.bowman@xxxxxxx>
>Sent: 27 August 2025 02:35
>To: dave@xxxxxxxxxxxx; Jonathan Cameron <jonathan.cameron@xxxxxxxxxx>;
>dave.jiang@xxxxxxxxx; alison.schofield@xxxxxxxxx; dan.j.williams@xxxxxxxxx;
>bhelgaas@xxxxxxxxxx; Shiju Jose <shiju.jose@xxxxxxxxxx>;
>ming.li@xxxxxxxxxxxx; Smita.KoralahalliChannabasappa@xxxxxxx;
>rrichter@xxxxxxx; dan.carpenter@xxxxxxxxxx;
>PradeepVineshReddy.Kodamati@xxxxxxx; lukas@xxxxxxxxx;
>Benjamin.Cheatham@xxxxxxx;
>sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx; linux-cxl@xxxxxxxxxxxxxxx;
>alucerop@xxxxxxx; ira.weiny@xxxxxxxxx
>Cc: linux-kernel@xxxxxxxxxxxxxxx; linux-pci@xxxxxxxxxxxxxxx
>Subject: [PATCH v11 13/23] cxl/pci: Unify CXL trace logging for CXL Endpoints
>and CXL Ports
>
>CXL currently has separate trace routines for CXL Port errors and CXL Endpoint
>errors. This is inconvenient for the user because they must enable
>2 sets of trace routines. Make updates to the trace logging such that a single
>trace routine logs both CXL Endpoint and CXL Port protocol errors.
>
>Keep the trace log fields 'memdev' and 'host'. While these are not accurate for
>non-Endpoints the fields will remain as-is to prevent breaking userspace RAS
>trace consumers.
>
>Add serial number parameter to the trace logging. This is used for EPs and 0 is
>provided for CXL port devices without a serial number.
>
>Leave the correctable and uncorrectable trace routines' TP_STRUCT__entry()
>unchanged with respect to member data types and order.
>
>Below is output of correctable and uncorrectable protocol error logging.
>CXL Root Port and CXL Endpoint examples are included below.
>
>Root Port:
>cxl_aer_correctable_error: memdev=0000:0c:00.0 host=pci0000:0c serial: 0
>status='CRC Threshold Hit'
>cxl_aer_uncorrectable_error: memdev=0000:0c:00.0 host=pci0000:0c serial: 0
>status: 'Cache Byte Enable Parity Error' first_error: 'Cache Byte Enable Parity
>Error'
>
>Endpoint:
>cxl_aer_correctable_error: memdev=mem3 host=0000:0f:00.0 serial=0
>status='CRC Threshold Hit'
>cxl_aer_uncorrectable_error: memdev=mem3 host=0000:0f:00.0 serial: 0 status:
>'Cache Byte Enable Parity Error' first_error: 'Cache Byte Enable Parity Error'
>
>Signed-off-by: Terry Bowman <terry.bowman@xxxxxxx>

Reviewed-by: Shiju Jose <shiju.jose@xxxxxxxxxx>,
apart from one error below.

>
>---
>Changes in v10->v11:
>- Updated CE and UCE trace routines to maintain consistent TP_Struct ABI and
>unchanged TP_printk() logging.
>---
> drivers/cxl/core/ras.c   | 35 +++++++++++----------
> drivers/cxl/core/trace.h | 68 +++++++---------------------------------
> 2 files changed, 30 insertions(+), 73 deletions(-)
>
>diff --git a/drivers/cxl/core/ras.c b/drivers/cxl/core/ras.c index
>3454cf1a118d..fda3b0a64dab 100644
>--- a/drivers/cxl/core/ras.c
>+++ b/drivers/cxl/core/ras.c
>@@ -13,7 +13,7 @@ static void cxl_cper_trace_corr_port_prot_err(struct
>pci_dev *pdev,  {
> 	u32 status = ras_cap.cor_status & ~ras_cap.cor_mask;
>
>-	trace_cxl_port_aer_correctable_error(&pdev->dev, status);
>+	trace_cxl_aer_correctable_error(&pdev->dev, status, 0);
> }
>
> static void cxl_cper_trace_uncorr_port_prot_err(struct pci_dev *pdev, @@ -
>28,8 +28,8 @@ static void cxl_cper_trace_uncorr_port_prot_err(struct pci_dev
>*pdev,
> 	else
> 		fe = status;
>
>-	trace_cxl_port_aer_uncorrectable_error(&pdev->dev, status, fe,
>-					       ras_cap.header_log);
>+	trace_cxl_aer_uncorrectable_error(&pdev->dev, status, fe,
>+					  ras_cap.header_log, 0);
> }
>
> static void cxl_cper_trace_corr_prot_err(struct cxl_memdev *cxlmd, @@ -37,7
>+37,8 @@ static void cxl_cper_trace_corr_prot_err(struct cxl_memdev *cxlmd,
>{
> 	u32 status = ras_cap.cor_status & ~ras_cap.cor_mask;
>
>-	trace_cxl_aer_correctable_error(cxlmd, status);
>+	trace_cxl_aer_correctable_error(&cxlmd->dev, cxlmd->cxlds->serial,
>+					status);
Please correct to 
             trace_cxl_aer_correctable_error(&cxlmd->dev,  status, 
					cxlmd->cxlds->serial);
> }
>
> static void
[...]
>-
> TRACE_EVENT(cxl_aer_correctable_error,
>-	TP_PROTO(const struct cxl_memdev *cxlmd, u32 status),
>-	TP_ARGS(cxlmd, status),
>+	TP_PROTO(const struct device *cxlmd, u32 status, u64 serial),
>+	TP_ARGS(cxlmd, status, serial),
> 	TP_STRUCT__entry(
>-		__string(memdev, dev_name(&cxlmd->dev))
>-		__string(host, dev_name(cxlmd->dev.parent))
>+		__string(memdev, dev_name(cxlmd))
>+		__string(host, dev_name(cxlmd->parent))
> 		__field(u64, serial)
> 		__field(u32, status)
> 	),
> 	TP_fast_assign(
> 		__assign_str(memdev);
> 		__assign_str(host);
>-		__entry->serial = cxlmd->cxlds->serial;
>+		__entry->serial = serial;
> 		__entry->status = status;
> 	),
> 	TP_printk("memdev=%s host=%s serial=%lld: status: '%s'",
>--
>2.51.0.rc2.21.ge5ab6b3e5a

Thanks,
Shiju






[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux