On 5/12/25 9:21 AM, Gregory Price wrote:
> Add basic ACPI table information needed to understand the CXL
> driver probe process.
>
> Signed-off-by: Gregory Price <gourry@xxxxxxxxxx>

Reviewed-by: Dave Jiang <dave.jiang@xxxxxxxxx>

> ---
>  Documentation/driver-api/cxl/index.rst        |  1 +
>  .../driver-api/cxl/platform/acpi.rst          | 76 +++++++++++++++++++
>  .../driver-api/cxl/platform/acpi/cedt.rst     | 62 +++++++++++++++
>  .../driver-api/cxl/platform/acpi/dsdt.rst     | 28 +++++++
>  .../driver-api/cxl/platform/acpi/hmat.rst     | 32 ++++++++
>  .../driver-api/cxl/platform/acpi/slit.rst     | 21 +++++
>  .../driver-api/cxl/platform/acpi/srat.rst     | 44 +++++++++++
>  7 files changed, 264 insertions(+)
>  create mode 100644 Documentation/driver-api/cxl/platform/acpi.rst
>  create mode 100644 Documentation/driver-api/cxl/platform/acpi/cedt.rst
>  create mode 100644 Documentation/driver-api/cxl/platform/acpi/dsdt.rst
>  create mode 100644 Documentation/driver-api/cxl/platform/acpi/hmat.rst
>  create mode 100644 Documentation/driver-api/cxl/platform/acpi/slit.rst
>  create mode 100644 Documentation/driver-api/cxl/platform/acpi/srat.rst
>
> diff --git a/Documentation/driver-api/cxl/index.rst b/Documentation/driver-api/cxl/index.rst
> index ffa0462ad950..336322dc35a0 100644
> --- a/Documentation/driver-api/cxl/index.rst
> +++ b/Documentation/driver-api/cxl/index.rst
> @@ -26,6 +26,7 @@ that have impacts on each other. The docs here break up configurations steps.
>     :caption: Platform Configuration
>
>     platform/bios-and-efi
> +   platform/acpi
>
>  .. toctree::
>     :maxdepth: 1
> diff --git a/Documentation/driver-api/cxl/platform/acpi.rst b/Documentation/driver-api/cxl/platform/acpi.rst
> new file mode 100644
> index 000000000000..ee7e6bd4c43d
> --- /dev/null
> +++ b/Documentation/driver-api/cxl/platform/acpi.rst
> @@ -0,0 +1,76 @@
> +.. 
SPDX-License-Identifier: GPL-2.0
> +
> +===========
> +ACPI Tables
> +===========
> +
> +ACPI is the "Advanced Configuration and Power Interface", which is a standard
> +that defines how platforms and OS manage power and configure computer hardware.
> +For the purpose of this theory of operation, when referring to "ACPI" we will
> +usually refer to "ACPI Tables" - which are the way a platform (BIOS/EFI)
> +communicates static configuration information to the operating system.
> +
> +The following ACPI tables contain *static* configuration and performance data
> +about CXL devices.
> +
> +.. toctree::
> +   :maxdepth: 1
> +
> +   acpi/cedt.rst
> +   acpi/srat.rst
> +   acpi/hmat.rst
> +   acpi/slit.rst
> +   acpi/dsdt.rst
> +
> +The SRAT table may also contain generic port/initiator content that is intended
> +to describe the generic port, but not information about the rest of the path to
> +the endpoint.
> +
> +Linux uses these tables to configure kernel resources for statically configured
> +(by BIOS/EFI) CXL devices, such as:
> +
> +- NUMA nodes
> +- Memory Tiers
> +- NUMA Abstract Distances
> +- SystemRAM Memory Regions
> +- Weighted Interleave Node Weights
> +
> +ACPI Debugging
> +==============
> +
> +The :code:`acpidump -b` command dumps the ACPI tables into binary format.
> +
> +The :code:`iasl -d` command disassembles the files into human readable format.
> +
> +Example :code:`acpidump -b && iasl -d cedt.dat` ::
> +
> +   [000h 0000 4] Signature : "CEDT" [CXL Early Discovery Table]
> +
> +Common Issues
> +-------------
> +Most failures described here result in a failure of the driver to surface
> +memory as a DAX device and/or kmem.
> +
> +* CEDT CFMWS targets list UIDs do not match CEDT CHBS UIDs.
> +* CEDT CFMWS targets list UIDs do not match DSDT CXL Host Bridge UIDs.
> +* CEDT CFMWS Restriction Bits are not correct.
> +* CEDT CFMWS Memory regions are poorly aligned.
> +* CEDT CFMWS Memory regions span a platform memory hole.
> +* CEDT CHBS UIDs do not match DSDT CXL Host Bridge UIDs.
> +* CEDT CHBS Specification version is incorrect.
> +* SRAT is missing regions described in CEDT CFMWS.
> +
> +  * Result: failure to create a NUMA node for the region, or
> +    region is placed in wrong node.
> +
> +* HMAT is missing data for regions described in CEDT CFMWS.
> +
> +  * Result: NUMA node being placed in the wrong memory tier.
> +
> +* SLIT has bad data.
> +
> +  * Result: Lots of performance mechanisms in the kernel will be very unhappy.
> +
> +All of these issues will appear to users as if the driver is failing to
> +support CXL - when in reality they are all the failure of a platform to
> +configure the ACPI tables correctly.
> diff --git a/Documentation/driver-api/cxl/platform/acpi/cedt.rst b/Documentation/driver-api/cxl/platform/acpi/cedt.rst
> new file mode 100644
> index 000000000000..1d9c9d3592dc
> --- /dev/null
> +++ b/Documentation/driver-api/cxl/platform/acpi/cedt.rst
> @@ -0,0 +1,62 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +================================
> +CEDT - CXL Early Discovery Table
> +================================
> +
> +The CXL Early Discovery Table is generated by BIOS to describe the CXL memory
> +regions configured at boot by the BIOS.
> +
> +CHBS
> +====
> +The CXL Host Bridge Structure describes CXL host bridges. Other than describing
> +device register information, it reports the specific host bridge UID for this
> +host bridge. These host bridge IDs will be referenced in other tables.
> +
> +Example ::
> +
> +   Subtable Type          : 00 [CXL Host Bridge Structure]
> +   Reserved               : 00
> +   Length                 : 0020
> +   Associated host bridge : 00000007         <- Host bridge _UID
> +   Specification version  : 00000001
> +   Reserved               : 00000000
> +   Register base          : 0000010370400000
> +   Register length        : 0000000000010000
> +
> +CFMWS
> +=====
> +The CXL Fixed Memory Window structure describes a memory region associated
> +with one or more CXL host bridges (as described by the CHBS).
It additionally
> +describes any inter-host-bridge interleave configuration that may have been
> +programmed by BIOS.
> +
> +Example ::
> +
> +   Subtable Type            : 01 [CXL Fixed Memory Window Structure]
> +   Reserved                 : 00
> +   Length                   : 002C
> +   Reserved                 : 00000000
> +   Window base address      : 000000C050000000   <- Memory Region
> +   Window size              : 0000003CA0000000
> +   Interleave Members (2^n) : 01                 <- Interleave configuration
> +   Interleave Arithmetic    : 00
> +   Reserved                 : 0000
> +   Granularity              : 00000000
> +   Restrictions             : 0006
> +   QtgId                    : 0001
> +   First Target             : 00000007           <- Host Bridge _UID
> +   Next Target              : 00000006           <- Host Bridge _UID
> +
> +The restriction field dictates what this SPA range may be used for (memory type,
> +volatile vs persistent, etc). One or more bits may be set. ::
> +
> +   Bit[0]: CXL Type 2 Memory
> +   Bit[1]: CXL Type 3 Memory
> +   Bit[2]: Volatile Memory
> +   Bit[3]: Persistent Memory
> +   Bit[4]: Fixed Config (HPA cannot be re-used)
> +
> +INTRA-host-bridge interleave (multiple devices on one host bridge) is NOT
> +reported in this structure, and is solely defined via CXL device decoder
> +programming (host bridge and endpoint decoders).
> diff --git a/Documentation/driver-api/cxl/platform/acpi/dsdt.rst b/Documentation/driver-api/cxl/platform/acpi/dsdt.rst
> new file mode 100644
> index 000000000000..b4583b01d67d
> --- /dev/null
> +++ b/Documentation/driver-api/cxl/platform/acpi/dsdt.rst
> @@ -0,0 +1,28 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +==============================================
> +DSDT - Differentiated System Description Table
> +==============================================
> +
> +This table describes what peripherals a machine has.
> +
> +This table's UIDs for CXL devices - specifically host bridges, must be
> +consistent with the contents of the CEDT, otherwise the CXL driver will
> +fail to probe correctly.
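
To make the UID requirement concrete, here is a minimal sketch (not part of the patch) of the cross-check one might do by hand on disassembled tables. It assumes the UIDs have already been extracted from the iasl output; `find_uid_mismatches` and its parameters are hypothetical names for illustration, not a kernel or ACPICA interface.

```python
def find_uid_mismatches(chbs_uids, cfmws_targets, dsdt_uids):
    """Return a list of problems; an empty list means the UIDs agree.

    chbs_uids     -- "Associated host bridge" values from CEDT CHBS entries
    cfmws_targets -- target values from CEDT CFMWS target lists
    dsdt_uids     -- _UID values of ACPI0016 devices in the DSDT
    (All three inputs are assumed to be pre-extracted integers.)
    """
    chbs, targets, dsdt = set(chbs_uids), set(cfmws_targets), set(dsdt_uids)
    problems = []
    # Every CFMWS target must name a host bridge described by a CHBS.
    for uid in sorted(targets - chbs):
        problems.append(f"CFMWS target {uid:#x} has no matching CHBS")
    # Every CFMWS target must also exist as a DSDT host bridge _UID.
    for uid in sorted(targets - dsdt):
        problems.append(f"CFMWS target {uid:#x} has no matching DSDT _UID")
    # CHBS and DSDT must describe the same set of host bridges.
    for uid in sorted(chbs ^ dsdt):
        problems.append(f"UID {uid:#x} appears in only one of CHBS/DSDT")
    return problems


if __name__ == "__main__":
    # Hypothetical platform where the CHBS for host bridge 6 is missing.
    for line in find_uid_mismatches([0x7], [0x7, 0x6], [0x7, 0x6]):
        print(line)
```

Any non-empty result corresponds to one of the UID mismatch cases listed under "Common Issues".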
> +
> +Example Compute Express Link Host Bridge ::
> +
> +    Scope (_SB)
> +    {
> +        Device (S0D0)
> +        {
> +            Name (_HID, "ACPI0016" /* Compute Express Link Host Bridge */)  // _HID: Hardware ID
> +            Name (_CID, Package (0x02)  // _CID: Compatible ID
> +            {
> +                EisaId ("PNP0A08") /* PCI Express Bus */,
> +                EisaId ("PNP0A03") /* PCI Bus */
> +            })
> +            ...
> +            Name (_UID, 0x05)  // _UID: Unique ID
> +            ...
> +    }
> diff --git a/Documentation/driver-api/cxl/platform/acpi/hmat.rst b/Documentation/driver-api/cxl/platform/acpi/hmat.rst
> new file mode 100644
> index 000000000000..095a26f02a37
> --- /dev/null
> +++ b/Documentation/driver-api/cxl/platform/acpi/hmat.rst
> @@ -0,0 +1,32 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +===========================================
> +HMAT - Heterogeneous Memory Attribute Table
> +===========================================
> +
> +The Heterogeneous Memory Attribute Table contains information such as cache
> +attributes and bandwidth and latency details for memory proximity domains.
> +For the purpose of this document, we will only discuss the SLLBI entry.
> +
> +SLLBI
> +=====
> +The System Locality Latency and Bandwidth Information records latency and
> +bandwidth information for proximity domains.
> +
> +This table is used by Linux to configure interleave weights and memory tiers.
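
As a rough illustration of how bandwidth data could turn into interleave weights, the sketch below reduces per-node bandwidth to the smallest whole-number ratio. This is an assumption made for illustration only and is not claimed to be the heuristic Linux applies; `bandwidth_to_weights` is a hypothetical helper, and the inputs reuse the DRAM and CXL bandwidth entries from the truncated SLLBI example in this patch.

```python
from functools import reduce
from math import gcd


def bandwidth_to_weights(bandwidth_by_node):
    """Reduce {node: bandwidth} to the smallest whole-number ratio.

    Illustrative only: a node with 9x the bandwidth of another gets
    9x the interleave weight.
    """
    if not bandwidth_by_node:
        raise ValueError("no bandwidth data supplied")
    divisor = reduce(gcd, bandwidth_by_node.values())
    if divisor == 0:
        raise ValueError("all bandwidth entries are zero")
    return {node: bw // divisor for node, bw in bandwidth_by_node.items()}


if __name__ == "__main__":
    # Proximity domain 0 = DRAM (0x1200), proximity domain 1 = CXL (0x0200).
    print(bandwidth_to_weights({0: 0x1200, 1: 0x0200}))
```

With those inputs the ratio works out to 9:1 in favour of DRAM, which gives a feel for why missing or wrong HMAT data skews placement.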
> +
> +Example (Heavily truncated for brevity) ::
> +
> +   Structure Type               : 0001 [SLLBI]
> +   Data Type                    : 00         <- Latency
> +   Target Proximity Domain List : 00000000
> +   Target Proximity Domain List : 00000001
> +   Entry                        : 0080       <- DRAM LTC
> +   Entry                        : 0100       <- CXL LTC
> +
> +   Structure Type               : 0001 [SLLBI]
> +   Data Type                    : 03         <- Bandwidth
> +   Target Proximity Domain List : 00000000
> +   Target Proximity Domain List : 00000001
> +   Entry                        : 1200       <- DRAM BW
> +   Entry                        : 0200       <- CXL BW
> diff --git a/Documentation/driver-api/cxl/platform/acpi/slit.rst b/Documentation/driver-api/cxl/platform/acpi/slit.rst
> new file mode 100644
> index 000000000000..a56768e8fe41
> --- /dev/null
> +++ b/Documentation/driver-api/cxl/platform/acpi/slit.rst
> @@ -0,0 +1,21 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +========================================
> +SLIT - System Locality Information Table
> +========================================
> +
> +The system locality information table provides "abstract distances" between
> +accessor and memory nodes. Nodes without initiators (CPUs) are infinitely (FF)
> +distant from all other nodes.
> +
> +The abstract distance described in this table does not describe any real
> +latency or bandwidth information.
> +
> +Example ::
> +
> +   Signature  : "SLIT" [System Locality Information Table]
> +   Localities : 0000000000000004
> +   Locality 0 : 10 20 20 30
> +   Locality 1 : 20 10 30 20
> +   Locality 2 : FF FF 0A FF
> +   Locality 3 : FF FF FF 0A
> diff --git a/Documentation/driver-api/cxl/platform/acpi/srat.rst b/Documentation/driver-api/cxl/platform/acpi/srat.rst
> new file mode 100644
> index 000000000000..56d7bbb18c3b
> --- /dev/null
> +++ b/Documentation/driver-api/cxl/platform/acpi/srat.rst
> @@ -0,0 +1,44 @@
> +.. 
SPDX-License-Identifier: GPL-2.0
> +
> +=====================================
> +SRAT - Static Resource Affinity Table
> +=====================================
> +
> +The System/Static Resource Affinity Table describes resource (CPU, Memory)
> +affinity to "Proximity Domains". This table is technically optional, but for
> +performance information (see "HMAT") to be enumerated by Linux it must be
> +present.
> +
> +There is a careful dance between the CEDT and SRAT tables and how NUMA nodes are
> +created. If things don't look quite the way you expect - check the SRAT Memory
> +Affinity entries and CEDT CFMWS to determine what your platform actually
> +supports in terms of flexible topologies.
> +
> +The SRAT may statically assign portions of a CFMWS SPA range to specific
> +proximity domains. See Linux NUMA creation for more information about how
> +this presents in the NUMA topology.
> +
> +Proximity Domain
> +================
> +A proximity domain is ROUGHLY equivalent to "NUMA Node" - though a 1-to-1
> +mapping is not guaranteed. There are scenarios where "Proximity Domain 4" may
> +map to "NUMA Node 3", for example. (See "NUMA Node Creation")
> +
> +Memory Affinity
> +===============
> +Generally speaking, if a host does any amount of CXL fabric (decoder)
> +programming in BIOS - an SRAT entry for that memory needs to be present.
> +
> +Example ::
> +
> +   Subtable Type         : 01 [Memory Affinity]
> +   Length                : 28
> +   Proximity Domain      : 00000001          <- NUMA Node 1
> +   Reserved1             : 0000
> +   Base Address          : 000000C050000000  <- Physical Memory Region
> +   Address Length        : 0000003CA0000000
> +   Reserved2             : 00000000
> +   Flags (decoded below) : 0000000B
> +      Enabled            : 1
> +      Hot Pluggable      : 1
> +      Non-Volatile       : 0
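
For readers cross-checking SRAT against CEDT by hand, the following is a small sketch (again not part of the patch) that decodes the three Memory Affinity flag bits shown in the example and tests whether an SRAT range sits inside a CFMWS window. The helper names are hypothetical; the bit positions follow the ACPI Memory Affinity flag layout (bit 0 Enabled, bit 1 Hot Pluggable, bit 2 Non-Volatile), and higher bits are deliberately ignored.

```python
SRAT_MEM_ENABLED = 1 << 0
SRAT_MEM_HOT_PLUGGABLE = 1 << 1
SRAT_MEM_NON_VOLATILE = 1 << 2


def decode_srat_mem_flags(flags):
    """Decode the low three bits of an SRAT Memory Affinity Flags field."""
    return {
        "enabled": bool(flags & SRAT_MEM_ENABLED),
        "hot_pluggable": bool(flags & SRAT_MEM_HOT_PLUGGABLE),
        "non_volatile": bool(flags & SRAT_MEM_NON_VOLATILE),
    }


def srat_range_in_window(srat_base, srat_len, window_base, window_size):
    """True if [srat_base, srat_base + srat_len) lies within the CFMWS window."""
    return (window_base <= srat_base
            and srat_base + srat_len <= window_base + window_size)


if __name__ == "__main__":
    # Values copied from the SRAT and CFMWS examples quoted in this patch.
    print(decode_srat_mem_flags(0x0000000B))
    print(srat_range_in_window(0xC050000000, 0x3CA0000000,
                               0xC050000000, 0x3CA0000000))
```

In the quoted examples the SRAT entry covers exactly the CFMWS window, so the range check passes.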