On 4/30/25 11:10 AM, Gregory Price wrote:
> Document __init time configurations that affect CXL driver probe
> process and memory region configuration.
>
> Signed-off-by: Gregory Price <gourry@xxxxxxxxxx>
> ---
>  Documentation/driver-api/cxl/index.rst          |   1 +
>  .../driver-api/cxl/linux/early-boot.rst         | 130 ++++++++++++++++++
>  2 files changed, 131 insertions(+)
>  create mode 100644 Documentation/driver-api/cxl/linux/early-boot.rst
>
> diff --git a/Documentation/driver-api/cxl/linux/early-boot.rst b/Documentation/driver-api/cxl/linux/early-boot.rst
> new file mode 100644
> index 000000000000..275174d5b0bb
> --- /dev/null
> +++ b/Documentation/driver-api/cxl/linux/early-boot.rst
> @@ -0,0 +1,130 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +=======================
> +Linux Init (Early Boot)
> +=======================
> +
> +Linux configuration is split into two major steps: Early-Boot and everything else.
> +
> +During early boot, Linux sets up immutable resources (such as numa nodes), while
> +later operations include things like driver probe and memory hotplug. Linux may
> +read EFI and ACPI information throughout this process to configure logical
> +representations of the devices.
> +
> +During Linux Early Boot stage (functions in the kernel that have the __init
> +decorator), the system takes the resources created by EFI/BIOS (ACPI tables)
> +and turns them into resources that the kernel can consume.
> +
> +
> +BIOS, Build and Boot Options
> +============================
> +
> +There are 4 pre-boot options that need to be considered during kernel build
> +which dictate how memory will be managed by Linux during early boot.
> +
> +* EFI_MEMORY_SP
> +
> +  * BIOS/EFI Option that dictates whether memory is SystemRAM or
> +    Specific Purpose. Specific Purpose memory will be deferred to
> +    drivers to manage - and not immediately exposed as system RAM.
> +
> +* CONFIG_EFI_SOFT_RESERVE
> +
> +  * Linux Build config option that dictates whether the kernel supports
> +    Specific Purpose memory.
> +
> +* CONFIG_MHP_DEFAULT_ONLINE_TYPE
> +
> +  * Linux Build config that dictates whether and how Specific Purpose memory
> +    converted to a dax device should be managed (left as DAX or onlined as
> +    SystemRAM in ZONE_NORMAL or ZONE_MOVABLE).
> +
> +* nosoftreserve
> +
> +  * Linux kernel boot option that dictates whether Soft Reserve should be
> +    supported. Similar to CONFIG_EFI_SOFT_RESERVE.
> +
> +Memory Map Creation
> +===================
> +
> +While the kernel parses the EFI memory map, if :code:`Specific Purpose` memory
> +is supported and detect, it will set this region aside as :code:`SOFT_RESERVED`.

                     detected,

> +
> +If :code:`EFI_MEMORY_SP=0`, :code:`CONFIG_EFI_SOFT_RESERVE=n`, or
> +:code:`nosoftreserve=y` - Linux will default a CXL device memory region to
> +SystemRAM. This will expose the memory to the kernel page allocator in
> +:code:`ZONE_NORMAL`, making it available for use for most allocations (including
> +:code:`struct page` and page tables).
> +
> +If `Specific Purpose` is set and supported, :code:`CONFIG_MHP_DEFAULT_ONLINE_TYPE_*`
> +dictates whether the memory is onlined by default (:code:`_OFFLINE` or
> +:code:`_ONLINE_*`), and if online which zone to online this memory to by default
> +(:code:`_NORMAL` or :code:`_MOVABLE`).
> +
> +If placed in :code:`ZONE_MOVABLE`, the memory will not be available for most
> +kernel allocations (such as :code:`struct page` or page tables). This may
> +significant impact performance depending on the memory capacity of the system.
> +
> +
> +NUMA Node Reservation
> +=====================
> +
> +Linux refers to the proximity domains (:code:`PXM`) defined in the SRAT to
> +create NUMA nodes in :code:`acpi_numa_init`. Typically, there is a 1:1 relation
> +between :code:`PXM` and NUMA node IDs.
> +
> +SRAT is the only ACPI defined way of defining Proximity Domains. Linux chooses
> +to, at most, map those 1:1 with NUMA nodes. CEDT adds a description of SPA
> +ranges which Linux may wish to map to one or more NUMA nodes

Add ending period above.

> +
> +If there are CXL ranges in the CFMWS but not in SRAT, then a fake :code:`PXM`
> +is created (as of v6.15). In the future, Linux may reject CFMWS not described
> +by SRAT due to the ambiguity of proximity domain association.

-- 
~Randy