RE: [PATCH v11 1/3] mm: Add support to retrieve physical address range of memory from the node ID

>-----Original Message-----
>From: Jonathan Cameron <jonathan.cameron@xxxxxxxxxx>
>Sent: 20 August 2025 09:54
>To: Mike Rapoport <rppt@xxxxxxxxxx>
>Cc: Shiju Jose <shiju.jose@xxxxxxxxxx>; rafael@xxxxxxxxxx; bp@xxxxxxxxx;
>akpm@xxxxxxxxxxxxxxxxxxxx; dferguson@xxxxxxxxxxxxxxxxxxx; linux-
>edac@xxxxxxxxxxxxxxx; linux-acpi@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx; linux-
>doc@xxxxxxxxxxxxxxx; tony.luck@xxxxxxxxx; lenb@xxxxxxxxxx;
>leo.duran@xxxxxxx; Yazen.Ghannam@xxxxxxx; mchehab@xxxxxxxxxx;
>Linuxarm <linuxarm@xxxxxxxxxx>; rientjes@xxxxxxxxxx;
>jiaqiyan@xxxxxxxxxx; Jon.Grimm@xxxxxxx; dave.hansen@xxxxxxxxxxxxxxx;
>naoya.horiguchi@xxxxxxx; james.morse@xxxxxxx; jthoughton@xxxxxxxxxx;
>somasundaram.a@xxxxxxx; erdemaktas@xxxxxxxxxx; pgonda@xxxxxxxxxx;
>duenwen@xxxxxxxxxx; gthelen@xxxxxxxxxx;
>wschwartz@xxxxxxxxxxxxxxxxxxx; wbs@xxxxxxxxxxxxxxxxxxxxxx;
>nifan.cxl@xxxxxxxxx; tanxiaofei <tanxiaofei@xxxxxxxxxx>; Zengtao (B)
><prime.zeng@xxxxxxxxxxxxx>; Roberto Sassu <roberto.sassu@xxxxxxxxxx>;
>kangkang.shen@xxxxxxxxxxxxx; wanghuiqiang <wanghuiqiang@xxxxxxxxxx>
>Subject: Re: [PATCH v11 1/3] mm: Add support to retrieve physical address
>range of memory from the node ID
>
>On Wed, 20 Aug 2025 10:34:13 +0300
>Mike Rapoport <rppt@xxxxxxxxxx> wrote:
>
>> On Tue, Aug 19, 2025 at 05:54:20PM +0100, Jonathan Cameron wrote:
>> > On Tue, 12 Aug 2025 15:26:13 +0100
>> > <shiju.jose@xxxxxxxxxx> wrote:
>> >
>> > > From: Shiju Jose <shiju.jose@xxxxxxxxxx>
>> > >
>> > > In the numa_memblks, a lookup facility is required to retrieve the
>> > > physical address range of memory in a NUMA node. ACPI RAS2 memory
>> > > features are among the use cases.
>> > >
>> > > Suggested-by: Jonathan Cameron <jonathan.cameron@xxxxxxxxxx>
>> > > Signed-off-by: Shiju Jose <shiju.jose@xxxxxxxxxx>
>> >
>> > Looks fine to me.  Mike, what do you think?
>>
>> I still don't see why we can't use existing functions like
>> get_pfn_range_for_nid() or memblock_search_pfn_nid().
>>
>> Or even node_start_pfn() and node_spanned_pages().
>
>Good point.  No reason anyone would scrub this on memory that hasn't been
>hotplugged yet, so no need to use numa-memblk to get the info.
>I guess I was thinking of the wrong hammer :)
>
>I'm not sure node_spanned_pages() works though, as we must not include
>ranges that might be on another node; otherwise we'd give a wrong impression
>of what was being scrubbed.
>
>Should be able to use some combination of node_start_pfn() and maybe
>memblock_search_pfn_nid() to get it though (that also returns the nid we
>already know, but meh, no real harm in that).

Thanks Mike and Jonathan.

I tried the following approaches, as you suggested, instead of the newly
proposed nid_get_mem_physaddr_range(). Methods 1 to 3 give the same result as
nid_get_mem_physaddr_range(), but Method 4 gives a different value for the size.

Please advise which method should be used for RAS2.

Thanks,
Shiju

Method 1 
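 /* Spanned range of the node: node_end_pfn() is one past the last
  * spanned page, so the range may include holes. */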
 start_pfn = node_start_pfn(ras2_ctx->sys_comp_nid);
 end_pfn = node_end_pfn(ras2_ctx->sys_comp_nid);
 start = __pfn_to_phys(start_pfn);
 end = __pfn_to_phys(end_pfn);
 ras2_ctx->mem_base_addr = start;
 ras2_ctx->mem_size = end - start;
 pr_info("mem_base_addr=0x%lx mem_size=0x%lx\n", ras2_ctx->mem_base_addr, ras2_ctx->mem_size);

Method 2
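 /* Same start as Method 1; node_spanned_pages() is the spanned page
  * count (holes included), converted here to a size in bytes. */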
 start_pfn = node_start_pfn(ras2_ctx->sys_comp_nid);
 size_pfn = node_spanned_pages(ras2_ctx->sys_comp_nid);
 ras2_ctx->mem_base_addr = __pfn_to_phys(start_pfn);
 ras2_ctx->mem_size = __pfn_to_phys(size_pfn);
 pr_info("mem_base_addr=0x%lx mem_size=0x%lx\n", ras2_ctx->mem_base_addr, ras2_ctx->mem_size);

Method 3
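 /* get_pfn_range_for_nid() returns the lowest start and highest end
  * PFN across the node's memory ranges. */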
 get_pfn_range_for_nid(ras2_ctx->sys_comp_nid, &start_pfn, &end_pfn);
 start = __pfn_to_phys(start_pfn);               
 end = __pfn_to_phys(end_pfn);   
 ras2_ctx->mem_base_addr = start;
 ras2_ctx->mem_size = end - start;
 pr_info("mem_base_addr=0x%lx mem_size=0x%lx\n", ras2_ctx->mem_base_addr, ras2_ctx->mem_size);

Method 4
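 /* memblock_search_pfn_nid() returns only the bounds of the single
  * memblock region that contains the node's first PFN. */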
 pfn = node_start_pfn(ras2_ctx->sys_comp_nid);
 rc = memblock_search_pfn_nid(pfn, &start_pfn, &end_pfn);
 if (rc == NUMA_NO_NODE) {
     pr_warn("Failed to find phys addr range for NUMA node(%u) rc=%d\n",
             ras2_ctx->sys_comp_nid, rc);
     goto ctx_free;
 }
 start = __pfn_to_phys(start_pfn);
 end = __pfn_to_phys(end_pfn);
 ras2_ctx->mem_base_addr = start;
 ras2_ctx->mem_size = end - start;
 pr_info("mem_base_addr=0x%lx mem_size=0x%lx\n", ras2_ctx->mem_base_addr, ras2_ctx->mem_size);
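
FWIW, I assume the smaller size from Method 4 is because
memblock_search_pfn_nid() returns only the bounds of the single memblock
region containing the given PFN, while the other methods cover the whole
span of the node. If it helps, a rough debug sketch along the following
lines (purely illustrative, not tested, and assuming the memblock data is
still available at this point, e.g. CONFIG_ARCH_KEEP_MEMBLOCK) could dump
the memblock ranges of the node to confirm that:

 /* Walk every memblock memory range attached to the node and print it. */
 unsigned long range_start_pfn, range_end_pfn;
 int i;

 for_each_mem_pfn_range(i, ras2_ctx->sys_comp_nid, &range_start_pfn,
                        &range_end_pfn, NULL)
     pr_info("nid=%u memblock range: [0x%llx-0x%llx)\n",
             ras2_ctx->sys_comp_nid,
             (u64)__pfn_to_phys(range_start_pfn),
             (u64)__pfn_to_phys(range_end_pfn));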

>
>Jonathan
>
>
>
>
>>
>> > One passing comment inline.
>> >
>> > Reviewed-by: Jonathan Cameron <jonathan.cameron@xxxxxxxxxx>
>> >
>> > > ---
>> > >  include/linux/numa.h         |  9 +++++++++
>> > >  include/linux/numa_memblks.h |  2 ++
>> > >  mm/numa.c                    | 10 ++++++++++
>> > >  mm/numa_memblks.c            | 37 ++++++++++++++++++++++++++++++++++++
>> > >  4 files changed, 58 insertions(+)
>> > >
>> > > diff --git a/include/linux/numa.h b/include/linux/numa.h
>> > > index e6baaf6051bc..1d1aabebd26b 100644
>> > > --- a/include/linux/numa.h
>> > > +++ b/include/linux/numa.h
>> > > @@ -41,6 +41,10 @@ int memory_add_physaddr_to_nid(u64 start);
>> > >  int phys_to_target_node(u64 start);
>> > >  #endif
>> > >
>> > > +#ifndef nid_get_mem_physaddr_range
>> > > +int nid_get_mem_physaddr_range(int nid, u64 *start, u64 *end);
>> > > +#endif
>> > > +
>> > >  int numa_fill_memblks(u64 start, u64 end);
>> > >
>> > >  #else /* !CONFIG_NUMA */
>> > > @@ -63,6 +67,11 @@ static inline int phys_to_target_node(u64 start)
>> > >  	return 0;
>> > >  }
>> > >
>> > > +static inline int nid_get_mem_physaddr_range(int nid, u64 *start, u64 *end)
>> > > +{
>> > > +	return 0;
>> > > +}
>> > > +
>> > >  static inline void alloc_offline_node_data(int nid) {}
>> > >  #endif
>> > >
>> > > diff --git a/include/linux/numa_memblks.h b/include/linux/numa_memblks.h
>> > > index 991076cba7c5..7b32d96d0134 100644
>> > > --- a/include/linux/numa_memblks.h
>> > > +++ b/include/linux/numa_memblks.h
>> > > @@ -55,6 +55,8 @@ extern int phys_to_target_node(u64 start);
>> > >  #define phys_to_target_node phys_to_target_node
>> > >  extern int memory_add_physaddr_to_nid(u64 start);
>> > >  #define memory_add_physaddr_to_nid memory_add_physaddr_to_nid
>> > > +extern int nid_get_mem_physaddr_range(int nid, u64 *start, u64 *end);
>> > > +#define nid_get_mem_physaddr_range nid_get_mem_physaddr_range
>> > >  #endif /* CONFIG_NUMA_KEEP_MEMINFO */
>> > >
>> > >  #endif /* CONFIG_NUMA_MEMBLKS */
>> > > diff --git a/mm/numa.c b/mm/numa.c
>> > > index 7d5e06fe5bd4..5335af1fefee 100644
>> > > --- a/mm/numa.c
>> > > +++ b/mm/numa.c
>> > > @@ -59,3 +59,13 @@ int phys_to_target_node(u64 start)
>> > >  }
>> > >  EXPORT_SYMBOL_GPL(phys_to_target_node);
>> > >  #endif
>> > > +
>> > > +#ifndef nid_get_mem_physaddr_range
>> > > +int nid_get_mem_physaddr_range(int nid, u64 *start, u64 *end)
>> > > +{
>> > > +	pr_info_once("Unknown target phys addr range for node=%d\n", nid);
>> > > +
>> > > +	return 0;
>> > > +}
>> > > +EXPORT_SYMBOL_GPL(nid_get_mem_physaddr_range);
>> > > +#endif
>> > > diff --git a/mm/numa_memblks.c b/mm/numa_memblks.c
>> > > index 541a99c4071a..e1e56b7a3499 100644
>> > > --- a/mm/numa_memblks.c
>> > > +++ b/mm/numa_memblks.c
>> > > @@ -590,4 +590,41 @@ int memory_add_physaddr_to_nid(u64 start)
>> > >  }
>> > >  EXPORT_SYMBOL_GPL(memory_add_physaddr_to_nid);
>> > >
>> > > +/**
>> > > + * nid_get_mem_physaddr_range - Get the physical address range
>> > > + *				of the memblk in the NUMA node.
>> > > + * @nid: NUMA node ID of the memblk
>> > > + * @start: Start address of the memblk
>> > > + * @end: End address of the memblk
>> > > + *
>> > > + * Find the lowest contiguous physical memory address range of the memblk
>> > > + * in the NUMA node with the given nid and return the start and end
>> > > + * addresses.
>> > > + *
>> > > + * RETURNS:
>> > > + * 0 on success, -errno on failure.
>> > > + */
>> > > +int nid_get_mem_physaddr_range(int nid, u64 *start, u64 *end)
>> > > +{
>> > > +	struct numa_meminfo *mi = &numa_meminfo;
>> > > +	int i;
>> > > +
>> > > +	if (!numa_valid_node(nid))
>> > > +		return -EINVAL;
>> > > +
>> > > +	for (i = 0; i < mi->nr_blks; i++) {
>> > > +		if (mi->blk[i].nid == nid) {
>> > > +			*start = mi->blk[i].start;
>> > > +			/*
>> > > +			 * Assumption: mi->blk[i].end is the last address
>> > > +			 * in the range + 1.
>> >
>> > This was my fault for asking on internal review if this was
>> > documented anywhere. It's kind of implicitly obvious when reading
>> > numa_memblk.c because there are a bunch of end - 1 prints.
>> > So can probably drop this comment.
>> >
>> > > +			 */
>> > > +			*end = mi->blk[i].end;
>> > > +			return 0;
>> > > +		}
>> > > +	}
>> > > +
>> > > +	return -ENODEV;
>> > > +}
>> > > +EXPORT_SYMBOL_GPL(nid_get_mem_physaddr_range);
>> > >  #endif /* CONFIG_NUMA_KEEP_MEMINFO */
>> >
>>




