>-----Original Message-----
>From: Jonathan Cameron <jonathan.cameron@xxxxxxxxxx>
>Sent: 20 August 2025 09:54
>To: Mike Rapoport <rppt@xxxxxxxxxx>
>Cc: Shiju Jose <shiju.jose@xxxxxxxxxx>; rafael@xxxxxxxxxx; bp@xxxxxxxxx;
>akpm@xxxxxxxxxxxxxxxxxxxx; dferguson@xxxxxxxxxxxxxxxxxxx;
>linux-edac@xxxxxxxxxxxxxxx; linux-acpi@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx;
>linux-doc@xxxxxxxxxxxxxxx; tony.luck@xxxxxxxxx; lenb@xxxxxxxxxx;
>leo.duran@xxxxxxx; Yazen.Ghannam@xxxxxxx; mchehab@xxxxxxxxxx;
>Linuxarm <linuxarm@xxxxxxxxxx>; rientjes@xxxxxxxxxx;
>jiaqiyan@xxxxxxxxxx; Jon.Grimm@xxxxxxx; dave.hansen@xxxxxxxxxxxxxxx;
>naoya.horiguchi@xxxxxxx; james.morse@xxxxxxx; jthoughton@xxxxxxxxxx;
>somasundaram.a@xxxxxxx; erdemaktas@xxxxxxxxxx; pgonda@xxxxxxxxxx;
>duenwen@xxxxxxxxxx; gthelen@xxxxxxxxxx;
>wschwartz@xxxxxxxxxxxxxxxxxxx; wbs@xxxxxxxxxxxxxxxxxxxxxx;
>nifan.cxl@xxxxxxxxx; tanxiaofei <tanxiaofei@xxxxxxxxxx>; Zengtao (B)
><prime.zeng@xxxxxxxxxxxxx>; Roberto Sassu <roberto.sassu@xxxxxxxxxx>;
>kangkang.shen@xxxxxxxxxxxxx; wanghuiqiang <wanghuiqiang@xxxxxxxxxx>
>Subject: Re: [PATCH v11 1/3] mm: Add support to retrieve physical address
>range of memory from the node ID
>
>On Wed, 20 Aug 2025 10:34:13 +0300
>Mike Rapoport <rppt@xxxxxxxxxx> wrote:
>
>> On Tue, Aug 19, 2025 at 05:54:20PM +0100, Jonathan Cameron wrote:
>> > On Tue, 12 Aug 2025 15:26:13 +0100
>> > <shiju.jose@xxxxxxxxxx> wrote:
>> >
>> > > From: Shiju Jose <shiju.jose@xxxxxxxxxx>
>> > >
>> > > In the numa_memblks, a lookup facility is required to retrieve the
>> > > physical address range of memory in a NUMA node. ACPI RAS2 memory
>> > > features are among the use cases.
>> > >
>> > > Suggested-by: Jonathan Cameron <jonathan.cameron@xxxxxxxxxx>
>> > > Signed-off-by: Shiju Jose <shiju.jose@xxxxxxxxxx>
>> >
>> > Looks fine to me. Mike, what do you think?
>>
>> I still don't see why we can't use existing functions like
>> get_pfn_range_for_nid() or memblock_search_pfn_nid().
>>
>> Or even node_start_pfn() and node_spanned_pages().
>
>Good point. No reason anyone would scrub this on memory that hasn't been
>hotplugged yet, so no need to use numa-memblk to get the info.
>I guess I was thinking of the wrong hammer :)
>
>I'm not sure node_spanned_pages() works though as we need not to include
>ranges that might be on another node as we'd give a wrong impression of what
>was being scrubbed.
>
>Should be able to use some combination of node_start_pfn() and maybe
>memblock_search_pfn_nid() to get it though (that also gets the nid we already
>know but meh, no real harm in that.)

Thanks Mike and Jonathan.

The following approaches were tried as you suggested, instead of the newly
proposed nid_get_mem_physaddr_range().
Methods 1 to 3 give the same result as nid_get_mem_physaddr_range(), but
Method 4 gives a different value for the size.
Please advise which method should be used for RAS2.

Thanks,
Shiju

Method 1
start_pfn = node_start_pfn(ras2_ctx->sys_comp_nid);
end_pfn = node_end_pfn(ras2_ctx->sys_comp_nid);
start = __pfn_to_phys(start_pfn);
end = __pfn_to_phys(end_pfn);
ras2_ctx->mem_base_addr = start;
ras2_ctx->mem_size = end - start;
pr_info("mem_base_addr=0x%lx mem_size=0x%lx\n",
	ras2_ctx->mem_base_addr, ras2_ctx->mem_size);

Method 2
start_pfn = node_start_pfn(ras2_ctx->sys_comp_nid);
size_pfn = node_spanned_pages(ras2_ctx->sys_comp_nid);
ras2_ctx->mem_base_addr = __pfn_to_phys(start_pfn);
ras2_ctx->mem_size = __pfn_to_phys(size_pfn);
pr_info("mem_base_addr=0x%lx mem_size=0x%lx\n",
	ras2_ctx->mem_base_addr, ras2_ctx->mem_size);

Method 3
get_pfn_range_for_nid(ras2_ctx->sys_comp_nid, &start_pfn, &end_pfn);
start = __pfn_to_phys(start_pfn);
end = __pfn_to_phys(end_pfn);
ras2_ctx->mem_base_addr = start;
ras2_ctx->mem_size = end - start;
pr_info("mem_base_addr=0x%lx mem_size=0x%lx\n",
	ras2_ctx->mem_base_addr, ras2_ctx->mem_size);

Method 4
pfn = node_start_pfn(ras2_ctx->sys_comp_nid);
rc = memblock_search_pfn_nid(pfn, &start_pfn, &end_pfn);
if (rc == NUMA_NO_NODE) {
	pr_warn("Failed to find phy addr range for NUMA node(%u) rc=%d\n",
		ras2_ctx->sys_comp_nid, rc);
	goto ctx_free;
}
start = __pfn_to_phys(start_pfn);
end = __pfn_to_phys(end_pfn);
ras2_ctx->mem_base_addr = start;
ras2_ctx->mem_size = end - start;
pr_info("mem_base_addr=0x%lx mem_size=0x%lx\n",
	ras2_ctx->mem_base_addr, ras2_ctx->mem_size);

>
>Jonathan
>
>>
>> > One passing comment inline.
>> >
>> > Reviewed-by: Jonathan Cameron <jonathan.cameron@xxxxxxxxxx>
>> >
>> > > ---
>> > >  include/linux/numa.h         |  9 +++++++++
>> > >  include/linux/numa_memblks.h |  2 ++
>> > >  mm/numa.c                    | 10 ++++++++++
>> > >  mm/numa_memblks.c            | 37 ++++++++++++++++++++++++++++++++++++
>> > >  4 files changed, 58 insertions(+)
>> > >
>> > > diff --git a/include/linux/numa.h b/include/linux/numa.h
>> > > index e6baaf6051bc..1d1aabebd26b 100644
>> > > --- a/include/linux/numa.h
>> > > +++ b/include/linux/numa.h
>> > > @@ -41,6 +41,10 @@ int memory_add_physaddr_to_nid(u64 start);
>> > >  int phys_to_target_node(u64 start);
>> > >  #endif
>> > >
>> > > +#ifndef nid_get_mem_physaddr_range
>> > > +int nid_get_mem_physaddr_range(int nid, u64 *start, u64 *end);
>> > > +#endif
>> > > +
>> > >  int numa_fill_memblks(u64 start, u64 end);
>> > >
>> > >  #else /* !CONFIG_NUMA */
>> > > @@ -63,6 +67,11 @@ static inline int phys_to_target_node(u64 start)
>> > >  	return 0;
>> > >  }
>> > >
>> > > +static inline int nid_get_mem_physaddr_range(int nid, u64 *start, u64 *end)
>> > > +{
>> > > +	return 0;
>> > > +}
>> > > +
>> > >  static inline void alloc_offline_node_data(int nid) {}
>> > >  #endif
>> > >
>> > > diff --git a/include/linux/numa_memblks.h b/include/linux/numa_memblks.h
>> > > index 991076cba7c5..7b32d96d0134 100644
>> > > --- a/include/linux/numa_memblks.h
>> > > +++ b/include/linux/numa_memblks.h
>> > > @@ -55,6 +55,8 @@ extern int phys_to_target_node(u64 start);
>> > >  #define phys_to_target_node phys_to_target_node
>> > >  extern int memory_add_physaddr_to_nid(u64 start);
>> > >  #define memory_add_physaddr_to_nid memory_add_physaddr_to_nid
>> > > +extern int nid_get_mem_physaddr_range(int nid, u64 *start, u64 *end);
>> > > +#define nid_get_mem_physaddr_range nid_get_mem_physaddr_range
>> > >  #endif /* CONFIG_NUMA_KEEP_MEMINFO */
>> > >
>> > >  #endif /* CONFIG_NUMA_MEMBLKS */
>> > > diff --git a/mm/numa.c b/mm/numa.c
>> > > index 7d5e06fe5bd4..5335af1fefee 100644
>> > > --- a/mm/numa.c
>> > > +++ b/mm/numa.c
>> > > @@ -59,3 +59,13 @@ int phys_to_target_node(u64 start)
>> > >  }
>> > >  EXPORT_SYMBOL_GPL(phys_to_target_node);
>> > >  #endif
>> > > +
>> > > +#ifndef nid_get_mem_physaddr_range
>> > > +int nid_get_mem_physaddr_range(int nid, u64 *start, u64 *end)
>> > > +{
>> > > +	pr_info_once("Unknown target phys addr range for node=%d\n", nid);
>> > > +
>> > > +	return 0;
>> > > +}
>> > > +EXPORT_SYMBOL_GPL(nid_get_mem_physaddr_range);
>> > > +#endif
>> > > diff --git a/mm/numa_memblks.c b/mm/numa_memblks.c
>> > > index 541a99c4071a..e1e56b7a3499 100644
>> > > --- a/mm/numa_memblks.c
>> > > +++ b/mm/numa_memblks.c
>> > > @@ -590,4 +590,41 @@ int memory_add_physaddr_to_nid(u64 start)
>> > >  }
>> > >  EXPORT_SYMBOL_GPL(memory_add_physaddr_to_nid);
>> > >
>> > > +/**
>> > > + * nid_get_mem_physaddr_range - Get the physical address range
>> > > + *				   of the memblk in the NUMA node.
>> > > + * @nid: NUMA node ID of the memblk
>> > > + * @start: Start address of the memblk
>> > > + * @end: End address of the memblk
>> > > + *
>> > > + * Find the lowest contiguous physical memory address range of the memblk
>> > > + * in the NUMA node with the given nid and return the start and end
>> > > + * addresses.
>> > > + *
>> > > + * RETURNS:
>> > > + * 0 on success, -errno on failure.
>> > > + */
>> > > +int nid_get_mem_physaddr_range(int nid, u64 *start, u64 *end)
>> > > +{
>> > > +	struct numa_meminfo *mi = &numa_meminfo;
>> > > +	int i;
>> > > +
>> > > +	if (!numa_valid_node(nid))
>> > > +		return -EINVAL;
>> > > +
>> > > +	for (i = 0; i < mi->nr_blks; i++) {
>> > > +		if (mi->blk[i].nid == nid) {
>> > > +			*start = mi->blk[i].start;
>> > > +			/*
>> > > +			 * Assumption: mi->blk[i].end is the last address
>> > > +			 * in the range + 1.
>> >
>> > This was my fault for asking on internal review if this was
>> > documented anywhere. It's kind of implicitly obvious when reading
>> > numa_memblk.c because there are a bunch of end - 1 prints.
>> > So can probably drop this comment.
>> >
>> > > +			 */
>> > > +			*end = mi->blk[i].end;
>> > > +			return 0;
>> > > +		}
>> > > +	}
>> > > +
>> > > +	return -ENODEV;
>> > > +}
>> > > +EXPORT_SYMBOL_GPL(nid_get_mem_physaddr_range);
>> > >  #endif /* CONFIG_NUMA_KEEP_MEMINFO */
>> >
>>
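
Note on the Method 4 size difference: one possible explanation, stated here as
an assumption rather than something confirmed in this thread, is that
memblock_search_pfn_nid() reports the bounds of the single memblock region
containing the given PFN, while node_start_pfn()/node_end_pfn() and
get_pfn_range_for_nid() describe the span from the node's lowest PFN to its
highest. The user-space sketch below models that difference. It is not kernel
code: struct mem_region, the regions[] layout, node_span() and
region_containing() are all hypothetical names and made-up values used only
for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one memory region; not a kernel structure. */
struct mem_region {
	unsigned long start_pfn;	/* first PFN in the region */
	unsigned long end_pfn;		/* one past the last PFN */
	int nid;			/* owning NUMA node */
};

/* Made-up layout: node 0 owns two regions separated by a hole. */
static const struct mem_region regions[] = {
	{ 0x080000, 0x0a0000, 0 },
	{ 0x0c0000, 0x100000, 0 },
	{ 0x100000, 0x180000, 1 },
};

#define NR_REGIONS (sizeof(regions) / sizeof(regions[0]))

/* Whole-node span: lowest start to highest end over every region of nid. */
static int node_span(int nid, unsigned long *start, unsigned long *end)
{
	int found = 0;
	size_t i;

	assert(start && end);
	for (i = 0; i < NR_REGIONS; i++) {
		if (regions[i].nid != nid)
			continue;
		if (!found || regions[i].start_pfn < *start)
			*start = regions[i].start_pfn;
		if (!found || regions[i].end_pfn > *end)
			*end = regions[i].end_pfn;
		found = 1;
	}
	return found ? 0 : -1;
}

/* Single-region lookup: bounds of the one region that contains pfn. */
static int region_containing(unsigned long pfn, unsigned long *start,
			     unsigned long *end)
{
	size_t i;

	assert(start && end);
	for (i = 0; i < NR_REGIONS; i++) {
		if (pfn >= regions[i].start_pfn && pfn < regions[i].end_pfn) {
			*start = regions[i].start_pfn;
			*end = regions[i].end_pfn;
			return regions[i].nid;
		}
	}
	return -1;
}
```

With this made-up layout, node_span(0, ...) yields 0x80000..0x100000 (0x80000
pages, hole included), whereas region_containing(0x80000, ...) yields only
0x80000..0xa0000 (0x20000 pages). If the real system has more than one region
on the node, a gap of this kind would be consistent with Method 4 reporting a
smaller size. The span in the first case also covers the hole, which is the
concern raised above about node_spanned_pages().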