Re: Wanted: more fixups for client delegations test/free walk

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 2025-04-17 at 10:33 -0400, Benjamin Coddington wrote:
> On 17 Apr 2025, at 9:59, Trond Myklebust wrote:
> 
> > On Thu, 2025-04-17 at 09:28 -0400, Benjamin Coddington wrote:
> > > Hey Trond, Anna, et al.
> > > 
> > > I'm looking at working on nfs_server_reap_expired_delegations()
> > > because
> > > the work to walk that list is order n(n+1)/2  and the list can
> > > grow very
> > > large due to some servers doing SEQ4_STATUS_ADMIN_STATE_REVOKED
> > > these
> > > days.  It can currently grow unlimited by 5k delegation
> > > watermark.
> > 
> > Why would we be seeing an increase of
> > SEQ4_STATUS_ADMIN_STATE_REVOKED
> > cases? That should normally just be seen on network outages that
> > last
> > longer than a full lease period.
> 
> No idea - but its happening.  Maybe I've been too hasty to blame
> ADMIN_STATE_REVOKED, but we do see in the results many
> NFS_DELEGATION_REVOKED states. Perhaps the linux server's admin
> revocation
> interfaces are gaining popular use for some reason.

Why should we be changing the client if the server implementation or
the admins are abusing the functionality? Revoking delegation state
should be an exceptional case because it breaks locks and therefore has
serious potential to corrupt data.

> 
> > > First observation is that we don't remove revoked states from the
> > > list
> > > even after doing FREE_STATEID, so we're still doing walks across
> > > delegation state we'll never use again.  I think we can fix this
> > > by
> > > plumbing in the error result from FREE_STATEID.. so that's a
> > > potential
> > > bit of work.
> > > 
> > > I'm tempted to just do:
> > > 
> > > @@ -1342,7 +1346,7 @@ nfs_delegation_test_free_expired(struct
> > > inode
> > > *inode,
> > >                 return;
> > >         status = ops->test_and_free_expired(server, stateid,
> > > cred);
> > >         if (status == -NFS4ERR_EXPIRED || status == -
> > > NFS4ERR_BAD_STATEID)
> > > -               nfs_remove_bad_delegation(inode, stateid);
> > > +               nfs_delegation_mark_returned(inode, stateid);
> > >  }
> > > 
> > > .. but I think that gets us up to not tracking state the server
> > > might
> > > still be tracking.
> > 
> > Just marking the delegation as being returned isn't helpful. That
> > doesn't trigger any action to recover the state.
> 
> That's the only path to removing the delegation from the nfs_server's
> delegations list, and seems to do all the right steps for this case..
> but
> calling it here can't currently account for cases where FREE_STATEID
> wasn't
> successful.
> 
> > > 
> > > Other approaches might be to walk the list once moving the work
> > > to a
> > > temporary list and then operate on that linearly.
> > > 
> > > Advice, thoughts, or direction welcomed..  I'll probably work on
> > > splitting out nfs41_free_stateid() from test_and_free_expired(),
> > > so the
> > > delegation code can know for sure that we're done with that
> > > state.
> > 
> > Setting up a separate list for state that needs recovery of some
> > sort
> > might be useful. However that doesn't solve the problem that you
> > have
> > to scan the entire list of stateids every time the server sets
> > SEQ4_STATUS_ADMIN_STATE_REVOKED. The latter is a protocol
> > requirement,
> > which is why if we're seeing a lot of it happening, then we need to
> > solve that problem first.
> 
> Well, we did have some misbehaving servers that are getting fixed,
> but its
> still entirely possible to end up with client systems that keep old
> revoked
> state on that list (that it has already done FREE_STATEID for), and
> once we
> have to walk and test it things get hot.  We're seeing this on
> clients that
> are /not/ on misbehaving servers, we've seen some clients with ~500k
> entries
> on that list.
> 
> I'd really like a way to trim the list, thus the desire to remove
> state
> after a successful FREE_STATEID.

We can definitely have a separate list on which we put the state that
fails the FREE_STATEID test.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx





[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux