Re: [PATCH v2] SUNRPC: Cleanup/fix initial rq_pages allocation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 9 Jun 2025, at 15:25, Chuck Lever wrote:

> On 6/9/25 1:21 PM, Benjamin Coddington wrote:
>> While investigating some reports of memory-constrained NUMA machines
>> failing to mount v3 and v4.0 nfs mounts, we found that svc_init_buffer()
>> was not attempting to retry allocations from the bulk page allocator.
>> Typically, this results in a single page allocation being returned and
>> the mount attempt fails with -ENOMEM.  A retry would have allowed the mount
>> to succeed.
>>
>> Additionally, it seems that the bulk allocation in svc_init_buffer() is
>> redundant because svc_alloc_arg() will perform the required allocation and
>> does the correct thing to retry the allocations.
>>
>> The call to allocate memory in svc_alloc_arg() drops the preferred node
>> argument, but I expect we'll still allocate on the preferred node because
>> the allocation call happens within the svc thread context, which chooses
>> the node with memory closest to the current thread's execution.
>
> IIUC this assumption might be incorrect. When a @node argument is
> passed in, the allocator tries to allocate memory on that node only.
> When the non-node API is used, the local node is tried first, but if
> that allocation fails, it looks on other nodes for free pages.
>
> This is how svc_alloc_arg has worked for a while. I tried to make this a
> node-specific allocation in 5f7fc5d69f6e ("SUNRPC: Resupply rq_pages
> from node-local memory"), but that had to be reverted because there
> are some cases where the svc_pool's sv_id is not valid to use as the
> node identifier.

It might be possible to know whether we're in per-pool mode in
svc_alloc_arg(), I can look into it.  In that case we could just fall back
to the non-node specific bulk allocation.  I'd rather not be inserting
hard-to-find performance regressions.

> But otherwise, IMO de-duplicating the code that fills rq_pages seems
> like a step forward. I can take v2 and drop the above paragraph if I get
> one or more additional R-b's. This is probably nfsd-fixes material.

Roger.

>> This patch cleans out the bulk allocation in svc_init_buffer() to allow
>> svc_alloc_arg() to handle the allocation/retry logic for rq_pages.
>
> Fixes: ed603bcf4fea ("sunrpc: Replace the rq_pages array with
> dynamically-allocated memory")

Thanks for the attention, Chuck.

Ben





[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux