On 9 Jun 2025, at 15:25, Chuck Lever wrote:

> On 6/9/25 1:21 PM, Benjamin Coddington wrote:
>> While investigating some reports of memory-constrained NUMA machines
>> failing to mount v3 and v4.0 NFS mounts, we found that svc_init_buffer()
>> was not attempting to retry allocations from the bulk page allocator.
>> Typically, this results in a single page allocation being returned, and
>> the mount attempt fails with -ENOMEM. A retry would have allowed the
>> mount to succeed.
>>
>> Additionally, it seems that the bulk allocation in svc_init_buffer() is
>> redundant because svc_alloc_arg() will perform the required allocation
>> and does the correct thing to retry the allocations.
>>
>> The call to allocate memory in svc_alloc_arg() drops the preferred node
>> argument, but I expect we'll still allocate on the preferred node because
>> the allocation call happens within the svc thread context, which chooses
>> the node with memory closest to the current thread's execution.
>
> IIUC this assumption might be incorrect. When a @node argument is
> passed in, the allocator tries to allocate memory on that node only.
> When the non-node API is used, the local node is tried first, but if
> that allocation fails, it looks on other nodes for free pages.

After checking this morning, I see that both calls end up in the same
place: alloc_pages_bulk_noprof(), where @preferred_nid comes either from
@node in one case or from numa_mem_id() in the other. So I stand by my
statement above: I don't see where alloc_pages_bulk_noprof() behaves
differently, in terms of how strictly the preferred node is used, based
on whether alloc_pages_bulk_node() or alloc_pages_bulk() was called.

Ben