Michal Clapinski wrote:
> Currently, the user has to specify each memory region to be used with
> nvdimm via the memmap parameter. Due to the character limit of the
> command line, this makes it impossible to have a lot of pmem devices.
> This new parameter solves this issue by allowing users to divide
> one e820 entry into many nvdimm regions.
>
> This change is needed for the hypervisor live update. VMs' memory will
> be backed by those emulated pmem devices. To support various VM shapes
> I want to create devdax devices at 1GB granularity similar to hugetlb.

This looks fairly straightforward, but if this moves forward I would
explicitly call the parameter something like "split" instead of "pmem"
to align it better with its usage.

However, while this is expedient, I wonder if you would be better served
by ACPI table injection, which gives you more control and configuration
options...

> It's also possible to expand this parameter in the future,
> e.g. to specify the type of the device (fsdax/devdax).

...for example, if you injected an ACPI NFIT table, or customized your
BIOS to supply one, you could get to deeper degrees of customization
without wrestling with command lines. Supply an ACPI NFIT that carves up
a large memory-type range into an arbitrary number of regions. In the
NFIT there is a natural place to specify whether the range gets sent to
PMEM: see the call to nvdimm_pmem_region_create() near NFIT_SPA_PM in
acpi_nfit_register_region(), and "simply" pick a new GUID to signify
direct routing to device-dax. I say simply, but that implies new ACPI
NFIT driver plumbing for the new mode.

Another overlooked detail about NFIT is that it offers an opportunity to
detect cases where the platform might have changed the physical address
map from one boot to the next. See the "nd_set cookie" concept in
acpi_nfit_init_interleave_set(). In other words, I cringe at the
fragility of memmap=, but I understand that it has the benefit of being
simple.
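
To make the "carve one large range into many regions" idea concrete, here is a minimal sketch of the arithmetic only. It is not the kernel's implementation and not the semantics of the proposed parameter: split_range() and its granule argument are hypothetical names, and dropping any unaligned head or tail of the range is an assumption made for illustration.

```python
GIB = 1 << 30  # 1 GiB, the granularity mentioned in the patch description


def split_range(base, size, granule=GIB):
    """Split [base, base + size) into granule-sized, granule-aligned regions.

    Hypothetical helper: any unaligned head or tail is dropped (an assumption).
    Returns a list of (start, length) tuples.
    """
    if granule <= 0 or granule & (granule - 1):
        raise ValueError("granule must be a positive power of two")
    start = (base + granule - 1) & ~(granule - 1)  # round base up
    end = (base + size) & ~(granule - 1)           # round end down
    return [(addr, granule) for addr in range(start, end, granule)]


# Example: a 4 GiB range starting at the 4 GiB boundary.
regions = split_range(0x1_0000_0000, 4 * GIB)
for start, length in regions:
    print(f"{start:#x}-{start + length - 1:#x}")
```

Whether the list comes from a command-line parameter or from an injected NFIT, the per-region bookkeeping is the same; the difference is where the description lives and how much metadata can ride along with each region.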