Re: Wanted: more fixups for client delegations test/free walk

Trond Myklebust <trondmy@xxxxxxxxxx> · Thu, 17 Apr 2025 09:59:40 -0400

On Thu, 2025-04-17 at 09:28 -0400, Benjamin Coddington wrote:
> Hey Trond, Anna, et al.
> 
> I'm looking at working on nfs_server_reap_expired_delegations()
> because the
> work to walk that list is order n(n+1)/2  and the list can grow very
> large
> due to some servers doing SEQ4_STATUS_ADMIN_STATE_REVOKED these
> days.  It
> can currently grow unlimited by 5k delegation watermark.

Why would we be seeing an increase of SEQ4_STATUS_ADMIN_STATE_REVOKED
cases? That should normally just be seen on network outages that last
longer than a full lease period.

> 
> First observation is that we don't remove revoked states from the
> list even
> after doing FREE_STATEID, so we're still doing walks across
> delegation
> state we'll never use again.  I think we can fix this by plumbing in
> the
> error result from FREE_STATEID.. so that's a potential bit of work.
> 
> I'm tempted to just do:
> 
> @@ -1342,7 +1346,7 @@ nfs_delegation_test_free_expired(struct inode
> *inode,
>                 return;
>         status = ops->test_and_free_expired(server, stateid, cred);
>         if (status == -NFS4ERR_EXPIRED || status == -
> NFS4ERR_BAD_STATEID)
> -               nfs_remove_bad_delegation(inode, stateid);
> +               nfs_delegation_mark_returned(inode, stateid);
>  }
> 
> .. but I think that gets us up to not tracking state the server might
> still
> be tracking.

Just marking the delegation as being returned isn't helpful. That
doesn't trigger any action to recover the state.

> 
> Other approaches might be to walk the list once moving the work to a
> temporary list and then operate on that linearly.
> 
> Advice, thoughts, or direction welcomed..  I'll probably work on
> splitting
> out nfs41_free_stateid() from test_and_free_expired(), so the
> delegation
> code can know for sure that we're done with that state.
> 

Setting up a separate list for state that needs recovery of some sort
might be useful. However that doesn't solve the problem that you have
to scan the entire list of stateids every time the server sets
SEQ4_STATUS_ADMIN_STATE_REVOKED. The latter is a protocol requirement,
which is why if we're seeing a lot of it happening, then we need to
solve that problem first.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx