Re: [PATCH v11 1/3] mm: Add support to retrieve physical address range of memory from the node ID

On Wed, Aug 20, 2025 at 10:00:50AM +0000, Shiju Jose wrote:
> >-----Original Message-----
> >From: Jonathan Cameron <jonathan.cameron@xxxxxxxxxx>
> >Sent: 20 August 2025 09:54
> >To: Mike Rapoport <rppt@xxxxxxxxxx>
> >Cc: Shiju Jose <shiju.jose@xxxxxxxxxx>; rafael@xxxxxxxxxx; bp@xxxxxxxxx;
> >akpm@xxxxxxxxxxxxxxxxxxxx; dferguson@xxxxxxxxxxxxxxxxxxx; linux-
> >edac@xxxxxxxxxxxxxxx; linux-acpi@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx; linux-
> >doc@xxxxxxxxxxxxxxx; tony.luck@xxxxxxxxx; lenb@xxxxxxxxxx;
> >leo.duran@xxxxxxx; Yazen.Ghannam@xxxxxxx; mchehab@xxxxxxxxxx;
> >Linuxarm <linuxarm@xxxxxxxxxx>; rientjes@xxxxxxxxxx;
> >jiaqiyan@xxxxxxxxxx; Jon.Grimm@xxxxxxx; dave.hansen@xxxxxxxxxxxxxxx;
> >naoya.horiguchi@xxxxxxx; james.morse@xxxxxxx; jthoughton@xxxxxxxxxx;
> >somasundaram.a@xxxxxxx; erdemaktas@xxxxxxxxxx; pgonda@xxxxxxxxxx;
> >duenwen@xxxxxxxxxx; gthelen@xxxxxxxxxx;
> >wschwartz@xxxxxxxxxxxxxxxxxxx; wbs@xxxxxxxxxxxxxxxxxxxxxx;
> >nifan.cxl@xxxxxxxxx; tanxiaofei <tanxiaofei@xxxxxxxxxx>; Zengtao (B)
> ><prime.zeng@xxxxxxxxxxxxx>; Roberto Sassu <roberto.sassu@xxxxxxxxxx>;
> >kangkang.shen@xxxxxxxxxxxxx; wanghuiqiang <wanghuiqiang@xxxxxxxxxx>
> >Subject: Re: [PATCH v11 1/3] mm: Add support to retrieve physical address
> >range of memory from the node ID
> >
> >On Wed, 20 Aug 2025 10:34:13 +0300
> >Mike Rapoport <rppt@xxxxxxxxxx> wrote:
> >
> >> On Tue, Aug 19, 2025 at 05:54:20PM +0100, Jonathan Cameron wrote:
> >> > On Tue, 12 Aug 2025 15:26:13 +0100
> >> > <shiju.jose@xxxxxxxxxx> wrote:
> >> >
> >> > > From: Shiju Jose <shiju.jose@xxxxxxxxxx>
> >> > >
> >> > > In the numa_memblks, a lookup facility is required to retrieve the
> >> > > physical address range of memory in a NUMA node. ACPI RAS2 memory
> >> > > features are among the use cases.
> >> > >
> >> > > Suggested-by: Jonathan Cameron <jonathan.cameron@xxxxxxxxxx>
> >> > > Signed-off-by: Shiju Jose <shiju.jose@xxxxxxxxxx>
> >> >
> >> > Looks fine to me.  Mike, what do you think?
> >>
> >> I still don't see why we can't use existing functions like
> >> get_pfn_range_for_nid() or memblock_search_pfn_nid().
> >>
> >> Or even node_start_pfn() and node_spanned_pages().
> >
> >Good point.  No reason anyone would scrub this on memory that hasn't been
> >hotplugged yet, so no need to use numa-memblk to get the info.
> >I guess I was thinking of the wrong hammer :)
> >
> >I'm not sure node_spanned_pages() works though, as we must not include
> >ranges that might be on another node; otherwise we'd give a wrong
> >impression of what was being scrubbed.

If nodes are not interleaved, node_spanned_pages() would work even if there
are holes inside the node, e.g. e820-reserved memory.
So with non-interleaved nodes, node_start_pfn() and either
node_spanned_pages() or node_end_pfn() will give the node extents, and they
are faster than get_pfn_range_for_nid().

If the nodes are interleaved, though, a single (mem_base, mem_size) pair is
not enough to describe a node, as the node consists of several disjoint
contiguous ranges, e.g.

  0              4G              8G             12G            16G
  +-------------+ +-------------+ +-------------+ +-------------+
  |    node 0   | |    node 1   | |    node 0   | |    node 1   |
  +-------------+ +-------------+ +-------------+ +-------------+

I didn't look into the details of the RAS2 driver, but isn't this something
it should handle?
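
To make the interleaved case concrete, here is a small userspace sketch (not
kernel code) that models the layout from the diagram above. The names
struct mem_range, layout[], node_span() and node_present_bytes() are made up
for illustration; node_span() only mimics what a start-plus-spanned-size
lookup would report, under the assumption that each node owns exactly the
ranges drawn above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GiB (1ULL << 30)

/* Hypothetical model: one contiguous physical range owned by a node. */
struct mem_range {
	uint64_t start;	/* inclusive, in bytes */
	uint64_t end;	/* exclusive, in bytes */
	int nid;
};

/* The interleaved layout from the diagram: node 0, node 1, node 0, node 1. */
static const struct mem_range layout[] = {
	{  0 * GiB,  4 * GiB, 0 },
	{  4 * GiB,  8 * GiB, 1 },
	{  8 * GiB, 12 * GiB, 0 },
	{ 12 * GiB, 16 * GiB, 1 },
};

#define NR_RANGES (sizeof(layout) / sizeof(layout[0]))

/*
 * Report a single (base, size) pair for a node: lowest start to highest end.
 * This is the "span" view. Returns 0 on success, -1 if the node has no memory.
 */
int node_span(int nid, uint64_t *base, uint64_t *size)
{
	uint64_t lo = UINT64_MAX, hi = 0;

	assert(base && size);

	for (size_t i = 0; i < NR_RANGES; i++) {
		if (layout[i].nid != nid)
			continue;
		if (layout[i].start < lo)
			lo = layout[i].start;
		if (layout[i].end > hi)
			hi = layout[i].end;
	}

	if (lo >= hi)
		return -1;

	*base = lo;
	*size = hi - lo;
	return 0;
}

/* Sum only the bytes that really belong to the node. */
uint64_t node_present_bytes(int nid)
{
	uint64_t total = 0;

	for (size_t i = 0; i < NR_RANGES; i++)
		if (layout[i].nid == nid)
			total += layout[i].end - layout[i].start;

	return total;
}
```

In this model node_span(0, ...) reports base 0 and size 12G, while
node_present_bytes(0) is only 8G. The 4G difference is node 1's memory
sitting in the middle of node 0's span, which is exactly the range a single
base/size pair would wrongly claim for node 0.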

> >Should be able to use some combination of node_start_pfn() and maybe
> >memblock_search_pfn_nid() to get it though (that also gets the nid we already
> >know but meh, no real harm in that).
> 
> Thanks Mike and Jonathan.
> 
> The following approaches were tried as you suggested, instead of the newly
> proposed nid_get_mem_physaddr_range().
> Methods 1 to 3 give the same result as nid_get_mem_physaddr_range(), but
> Method 4 gives a different value for the size.

I believe that's because on x86 node 0 is really scrambled because of
e820/EFI reservations that never make it to memblock.
 
> Could you please advise which method should be used for RAS2?
> 
> Thanks,
> Shiju
> 

-- 
Sincerely yours,
Mike.



