Re: [RFC PATCH v2 22/51] mm: hugetlb: Refactor hugetlb allocation functions

Ira Weiny <ira.weiny@xxxxxxxxx> · Sat, 31 May 2025 18:45:55 -0500

Ackerley Tng wrote:
> Refactor dequeue_hugetlb_folio() and alloc_surplus_hugetlb_folio() to
> take mpol, nid and nodemask. This decouples allocation of a folio from
> a vma.
> 
> Signed-off-by: Ackerley Tng <ackerleytng@xxxxxxxxxx>
> Change-Id: I890fb46fe8c6349383d8cf89befc68a4994eb416
> ---
>  mm/hugetlb.c | 64 ++++++++++++++++++++++++----------------------------
>  1 file changed, 30 insertions(+), 34 deletions(-)
> 

[snip]

>  
> @@ -2993,6 +2974,11 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
>  	int ret, idx;
>  	struct hugetlb_cgroup *h_cg = NULL;
>  	gfp_t gfp = htlb_alloc_mask(h) | __GFP_RETRY_MAYFAIL;
> +	struct mempolicy *mpol;
> +	nodemask_t *nodemask;
> +	gfp_t gfp_mask;
> +	pgoff_t ilx;
> +	int nid;
>  
>  	idx = hstate_index(h);
>  
> @@ -3032,7 +3018,6 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
>  
>  		subpool_reservation_exists = npages_req == 0;
>  	}
> -
>  	reservation_exists = vma_reservation_exists || subpool_reservation_exists;
>  
>  	/*
> @@ -3048,21 +3033,30 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
>  			goto out_subpool_put;
>  	}
>  
> +	mpol = get_vma_policy(vma, addr, h->order, &ilx);

Why does the memory policy need to be acquired here instead of after the
cgroup charge?  AFAICT this is not needed and would at least eliminate 1
of the error conditions puts.

> +
>  	ret = hugetlb_cgroup_charge_cgroup(idx, pages_per_huge_page(h), &h_cg);
> -	if (ret)
> +	if (ret) {
> +		mpol_cond_put(mpol);
                ^^^^
		here

All that said I think the use of some new cleanup macros could really help
a lot of this code.

What do folks in this area of the kernel think of those?

Ira

[snip]