Re: [PATCH] SUNRPC: Cleanup/fix initial rq_pages allocation

Benjamin Coddington <bcodding@xxxxxxxxxx> · Thu, 05 Jun 2025 14:30:45 -0400

On 5 Jun 2025, at 14:08, Chuck Lever wrote:

> On 6/5/25 12:54 PM, Benjamin Coddington wrote:
>> On 5 Jun 2025, at 10:26, Chuck Lever wrote:
>>
>>> This doesn't apply to v6.16-rc1 due to recent changes to use a
>>> dynamically-allocated rq_pages array. This array is allocated in
>>> svc_init_buffer(); the array allocation has to remain.
>>
>> Well, shucks.  I guess I should be paying better attention.
>>
>> Can we drop the bulk allocation in svc_init_buffer if we're just going to
>> try it more robustly in svc_alloc_arg?
>
> Maybe!

Ok, I'll send something.

> I would like to understand the failure a little better. Why is mount
> susceptible to this issue?

For v3, we're starting lockd, and on v4.0 it's the callback thread(s).  It's
pretty easy to reproduce if you bump the cb threads to something insane like
64k.

Customers have a really hard time handling this on autofs. It's not like
the system just booted; instead, the system will have been up for long
periods doing work, and then the automount fails, requiring manual
intervention.

I think the bulk allocator can be pretty sensitive to some conditions which
cause it to bail out and only return a single page.
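
To illustrate the short-return behavior, here is a rough userspace sketch,
not kernel code. mock_alloc_bulk() and fill_pages() are made-up names, and
the per-call "budget" is only an assumed stand-in for whatever makes the
real allocator stop early. The point is just that a caller which retries
while progress is being made copes with a one-page return, whereas a
single-shot caller does not.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for a bulk page allocator: fills at most
 * `budget` empty slots per call and returns the total number of
 * populated slots, so a short return is not necessarily a failure.
 */
static size_t mock_alloc_bulk(size_t budget, size_t nr, void **pages)
{
	size_t i, populated = 0;

	for (i = 0; i < nr; i++) {
		if (!pages[i] && budget > 0) {
			pages[i] = malloc(64);	/* stand-in for one page */
			if (pages[i])
				budget--;
		}
		if (pages[i])
			populated++;
	}
	return populated;
}

/*
 * Keep calling the allocator while it makes progress and stop when
 * a call adds nothing. Returns the number of populated slots.
 */
static size_t fill_pages(size_t budget_per_call, size_t nr, void **pages)
{
	size_t filled = 0, ret;

	while (filled < nr) {
		ret = mock_alloc_bulk(budget_per_call, nr, pages);
		if (ret <= filled)
			break;	/* no progress; real code would wait and retry */
		filled = ret;
	}
	return filled;
}
```

With a budget of one slot per call, a single mock_alloc_bulk() call on an
empty eight-entry array populates one entry, while fill_pages() keeps going
until all eight are populated.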

Ben