On Tue, Aug 05, 2025 at 03:33:49PM +0200, David Hildenbrand wrote:
> > David, there is another alternative to prevent this, simple though a
> > bit wasteful: just allocate a bit bigger to ensure the allocation
> > doesn't end on an exact PAGE_SIZE boundary?
>
> :/ in particular doing that through the memblock in sparse_init_nid(), I am
> not so sure that's a good idea.

It would probably be some work to make larger allocations to avoid
padding :\

> I prefer Linus' proposal, and it avoids the one nth_page(), unless any
> other approach can help us get rid of more nth_page() usage -- and I
> don't think your proposal could, right?

If the above were solved - so the struct page allocations could be
larger than a section, arguably just the entire range being plugged -
then I think you also solve the nth_page() problem too.

Effectively the nth_page() problem is that we allocate the struct page
arrays on an arbitrary section-by-section basis, and then the arch sets
MAX_ORDER so that a folio can cross sections, effectively guaranteeing
to virtually fragment the page *'s inside folios.

Doing a giant vmalloc at the start, so you could also cheaply add some
padding, would effectively also prevent the nth_page() problem, as we
can reasonably say that no folio should extend past an entire memory
region.

Maybe there is some reason we can't do a giant vmalloc on these systems
that also can't do SPARSEMEM_VMEMMAP :\ But perhaps we could get up to
MAX_ORDER at least? Or perhaps we could have those systems reduce
MAX_ORDER?

So, I think they are lightly linked problems.

I suppose this is also a limitation with Linus's suggestion. It doesn't
give the optimal answer for 1G pages on these older systems:

	for (size_t nr = 1; nr < nr_pages; nr++) {
		if (*pages++ != ++page)
			break;
	}

Since that will exit at every section boundary.

At least for scatterlist-like cases the point of this function is just
to speed things up. If it returns short, the calling code should still
be directly checking phys_addr contiguity anyhow. Something for the
kdoc I suppose.

Jason
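
To make the section-fragmentation point concrete, here is a small
userspace model of the classic-SPARSEMEM memmap layout. This is an
illustration only: struct page is reduced to a stub, and SECTION_PAGES,
NR_SECTIONS, and section_memmap[] are invented for the demo. Only the
nth_page() definition mirrors the kernel's !SPARSEMEM_VMEMMAP macro.

#include <stdio.h>
#include <stdlib.h>

#define SECTION_PAGES 4	/* pages per memory section (tiny, for demo) */
#define NR_SECTIONS   4

struct page { unsigned long pfn; };

/*
 * Each section's struct page array is a separate allocation, so
 * &section_memmap[s][SECTION_PAGES - 1] + 1 does NOT reach the first
 * page of section s + 1 -- the memmap is not virtually contiguous.
 */
static struct page *section_memmap[NR_SECTIONS];

static struct page *pfn_to_page(unsigned long pfn)
{
	return &section_memmap[pfn / SECTION_PAGES][pfn % SECTION_PAGES];
}

static unsigned long page_to_pfn(struct page *page)
{
	return page->pfn;
}

/* The kernel's !SPARSEMEM_VMEMMAP definition: go via the pfn. */
#define nth_page(page, n) pfn_to_page(page_to_pfn((page)) + (n))

int main(void)
{
	for (int s = 0; s < NR_SECTIONS; s++) {
		section_memmap[s] = malloc(sizeof(struct page) * SECTION_PAGES);
		for (int i = 0; i < SECTION_PAGES; i++)
			section_memmap[s][i].pfn = s * SECTION_PAGES + i;
	}

	/* Last page of section 0, i.e. right before a section boundary. */
	struct page *page = pfn_to_page(SECTION_PAGES - 1);

	/*
	 * Pointer arithmetic runs off the end of section 0's array, while
	 * nth_page() follows the pfn into section 1's separate array.
	 */
	printf("page + 1      -> %p (past the end of section 0's memmap)\n",
	       (void *)(page + 1));
	printf("nth_page(.,1) -> pfn %lu (correct first page of section 1)\n",
	       page_to_pfn(nth_page(page, 1)));
	return 0;
}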
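
And a sketch of how Linus' loop might be packaged, with the kdoc caveat
above spelled out. The name num_pages_contiguous(), the exact signature,
and the kdoc wording are assumptions here, not a settled API:

/**
 * num_pages_contiguous() - count leading virtually contiguous struct pages
 * @pages:    array of page pointers, at least one entry
 * @nr_pages: number of entries in @pages
 *
 * Returns the length of the run of virtually contiguous struct pages
 * starting at pages[0]. Note that on !SPARSEMEM_VMEMMAP configurations
 * the memmap is allocated per section, so this can return short at a
 * section boundary even when the underlying physical pages continue.
 * This is an optimization helper only: callers that care (e.g.
 * scatterlist builders) must still verify phys_addr contiguity for the
 * remainder themselves.
 */
static inline size_t num_pages_contiguous(struct page **pages,
					  size_t nr_pages)
{
	struct page *page = *pages++;	/* assumes nr_pages >= 1 */
	size_t nr;

	for (nr = 1; nr < nr_pages; nr++) {
		if (*pages++ != ++page)
			break;
	}
	return nr;
}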