Re: Wanted: more fixups for client delegations test/free walk

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 17 Apr 2025, at 9:59, Trond Myklebust wrote:

> On Thu, 2025-04-17 at 09:28 -0400, Benjamin Coddington wrote:
>> Hey Trond, Anna, et al.
>>
>> I'm looking at working on nfs_server_reap_expired_delegations() because
>> the work to walk that list is order n(n+1)/2  and the list can grow very
>> large due to some servers doing SEQ4_STATUS_ADMIN_STATE_REVOKED these
>> days.  It can currently grow unlimited by 5k delegation watermark.
>
> Why would we be seeing an increase of SEQ4_STATUS_ADMIN_STATE_REVOKED
> cases? That should normally just be seen on network outages that last
> longer than a full lease period.

No idea - but its happening.  Maybe I've been too hasty to blame
ADMIN_STATE_REVOKED, but we do see in the results many
NFS_DELEGATION_REVOKED states. Perhaps the linux server's admin revocation
interfaces are gaining popular use for some reason.

>> First observation is that we don't remove revoked states from the list
>> even after doing FREE_STATEID, so we're still doing walks across
>> delegation state we'll never use again.  I think we can fix this by
>> plumbing in the error result from FREE_STATEID.. so that's a potential
>> bit of work.
>>
>> I'm tempted to just do:
>>
>> @@ -1342,7 +1346,7 @@ nfs_delegation_test_free_expired(struct inode
>> *inode,
>>                 return;
>>         status = ops->test_and_free_expired(server, stateid, cred);
>>         if (status == -NFS4ERR_EXPIRED || status == -
>> NFS4ERR_BAD_STATEID)
>> -               nfs_remove_bad_delegation(inode, stateid);
>> +               nfs_delegation_mark_returned(inode, stateid);
>>  }
>>
>> .. but I think that gets us up to not tracking state the server might
>> still be tracking.
>
> Just marking the delegation as being returned isn't helpful. That
> doesn't trigger any action to recover the state.

That's the only path to removing the delegation from the nfs_server's
delegations list, and seems to do all the right steps for this case.. but
calling it here can't currently account for cases where FREE_STATEID wasn't
successful.

>>
>> Other approaches might be to walk the list once moving the work to a
>> temporary list and then operate on that linearly.
>>
>> Advice, thoughts, or direction welcomed..  I'll probably work on
>> splitting out nfs41_free_stateid() from test_and_free_expired(), so the
>> delegation code can know for sure that we're done with that state.
>
> Setting up a separate list for state that needs recovery of some sort
> might be useful. However that doesn't solve the problem that you have
> to scan the entire list of stateids every time the server sets
> SEQ4_STATUS_ADMIN_STATE_REVOKED. The latter is a protocol requirement,
> which is why if we're seeing a lot of it happening, then we need to
> solve that problem first.

Well, we did have some misbehaving servers that are getting fixed, but its
still entirely possible to end up with client systems that keep old revoked
state on that list (that it has already done FREE_STATEID for), and once we
have to walk and test it things get hot.  We're seeing this on clients that
are /not/ on misbehaving servers, we've seen some clients with ~500k entries
on that list.

I'd really like a way to trim the list, thus the desire to remove state
after a successful FREE_STATEID.

Thanks very much for your look and discussion.

Ben





[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux