CXL is intended to be a programmable topology, and a single CXL Fixed
Memory Window (CFMWS) may back memory that a driver wants to split
across multiple NUMA nodes for tiering or isolation. Those nodes must
exist at __init time to be usable later.

Add CONFIG_ACPI_NUMA_ADD_CFMWS_NODES, the number of additional standby
NUMA nodes to reserve per CEDT CFMWS entry. acpi_parse_cfmws() records
the per-window count, which is folded into the standby request on
successful acpi_numa_init().

Signed-off-by: Gregory Price
---
 drivers/acpi/numa/Kconfig | 20 ++++++++++++++++++++
 drivers/acpi/numa/srat.c  | 13 +++++++++++--
 2 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/drivers/acpi/numa/Kconfig b/drivers/acpi/numa/Kconfig
index ecf27bf45e5b..65d7eb9a4022 100644
--- a/drivers/acpi/numa/Kconfig
+++ b/drivers/acpi/numa/Kconfig
@@ -14,6 +14,26 @@ config ACPI_HMAT
 	 performance attributes through the node's sysfs device if
 	 provided.
 
+config ACPI_NUMA_ADD_CFMWS_NODES
+	int "Additional standby NUMA nodes per CEDT CFMWS entry"
+	depends on ACPI_NUMA
+	range 0 4
+	default 0
+	help
+	  Number of additional standby NUMA nodes to reserve per CEDT
+	  CXL Fixed Memory Window Structure (CFMWS) entry.
+
+	  By default ACPI reserves 1 NUMA node per unique PXM entry in
+	  the SRAT, or 1 node for a CFMWS without SRAT mappings.
+
+	  Setting this > 0 reserves additional standby nodes per CFMWS
+	  that drivers can claim at runtime via
+	  numa_request_exclusive_node(). This is useful for CXL drivers
+	  that want to place memory on distinct NUMA nodes within the
+	  same CXL Fixed Memory Window.
+
+	  Set to 0 (default) to disable.
+
 config ACPI_NUMA_STANDBY_NODES
 	int "Additional standby NUMA nodes for runtime claiming"
 	depends on ACPI_NUMA
diff --git a/drivers/acpi/numa/srat.c b/drivers/acpi/numa/srat.c
index d7b0e4ece610..6c54d5f0cf0a 100644
--- a/drivers/acpi/numa/srat.c
+++ b/drivers/acpi/numa/srat.c
@@ -354,6 +354,7 @@ static int __init acpi_parse_slit(struct acpi_table_header *table)
 }
 
 static int parsed_numa_memblks __initdata;
+static int cfmws_standby_count __initdata;
 
 static int __init
 acpi_parse_memory_affinity(union acpi_subtable_headers *header,
@@ -454,7 +455,7 @@ static int __init acpi_parse_cfmws(union acpi_subtable_headers *header,
 	 * window.
 	 */
 	if (!numa_fill_memblks(start, end))
-		return 0;
+		goto standby_nodes;
 
 	/* No SRAT description. Create a new node. */
 	node = acpi_map_pxm_to_node(*fake_pxm);
@@ -473,6 +474,11 @@ static int __init acpi_parse_cfmws(union acpi_subtable_headers *header,
 
 	/* Set the next available fake_pxm value */
 	(*fake_pxm)++;
+
+standby_nodes:
+	/* Request any standby nodes (created after numa_emulation runs) */
+	cfmws_standby_count += CONFIG_ACPI_NUMA_ADD_CFMWS_NODES;
+
 	return 0;
 }
 
@@ -607,6 +613,8 @@ int __init acpi_numa_init(void)
 	if (acpi_disabled)
 		return -EINVAL;
 
+	cfmws_standby_count = 0;
+
 	/*
 	 * Should not limit number with cpu num that is from NR_CPUS or nr_cpus=
 	 * SRAT cpu entries could have different order with that in MADT.
@@ -666,7 +674,8 @@ int __init acpi_numa_init(void)
 		return -ENOENT;
 
 	/* Request any standby nodes (created after numa emulation) */
-	numa_request_standby_count(CONFIG_ACPI_NUMA_STANDBY_NODES);
+	numa_request_standby_count(CONFIG_ACPI_NUMA_STANDBY_NODES +
+				   cfmws_standby_count);
 
 	return 0;
 }
-- 
2.54.0