Hi,
在 2025/9/2 18:21, Jeff Layton 写道:
On Tue, 2025-09-02 at 10:22 +0800, Li Lingfeng wrote:
When file access conflicts occur between clients, the server recalls
delegations. If the client holding delegation fails to return it after
a recall, nfs4_laundromat adds the delegation to cl_revoked list.
This causes subsequent SEQUENCE operations to set the
SEQ4_STATUS_RECALLABLE_STATE_REVOKED flag, forcing the client to
validate all delegations and return the revoked one.
However, if the client fails to return the delegation due to a timeout
after receiving the recall or a server bug, the delegation remains in the
server's cl_revoked list. The client marks it revoked and won't find it
upon detecting SEQ4_STATUS_RECALLABLE_STATE_REVOKED. This leads to a loop:
the server persistently sets SEQ4_STATUS_RECALLABLE_STATE_REVOKED, and the
client repeatedly tests all delegations, severely impacting performance
when numerous delegations exist.
It is a performance impact, but I don't get the "loop" here. Are you
saying that this problem compounds itself? That testing all delegations
causes others to be revoked?
The delegation will be removed from server->delegations in client after
NFSPROC4_CLNT_DELEGRETURN is performed.
nfs4_delegreturn_done
nfs_delegation_mark_returned
nfs_detach_delegation
nfs_detach_delegation_locked
list_del_rcu // remove delegation from server->delegations
From the client's perspective, the delegation has been returned, but on
the server side, it is left in the cl_revoked list.[1].
Subsequently, every sequence from the client will be flagged with
SEQ4_STATUS_RECALLABLE_STATE_REVOKED as long as cl_revoked remains
non-empty.
nfsd4_sequence
seq->status_flags |= SEQ4_STATUS_RECALLABLE_STATE_REVOKED
When the client detects SEQ4_STATUS_RECALLABLE_STATE_REVOKED while
processing a sequence result, it sets NFS_DELEGATION_TEST_EXPIRED for all
delegations and wakes up the state manager for handling.
nfs41_sequence_done
nfs41_sequence_process
nfs41_handle_sequence_flag_errors
nfs41_handle_recallable_state_revoked
nfs_test_expired_all_delegations
nfs_mark_test_expired_all_delegations
nfs_delegation_mark_test_expired_server
// set NFS_DELEGATION_TEST_EXPIRED for delegations in
server->delegations
nfs4_schedule_state_manager
The state manager tests all delegations except the one that was returned,
as it is no longer in server->delegations.
nfs4_state_manager
nfs4_begin_drain_session
nfs_reap_expired_delegations
nfs_server_reap_expired_delegations
// test delegations in server->delegations
There may be a loop:
1) send a sequence(client)
2) return SEQ4_STATUS_RECALLABLE_STATE_REVOKED(server)
3) set NFS_DELEGATION_TEST_EXPIRED for all delegations(client)
4) test all delegations by state manager(client)
5) send another sequence(client)
The state manager's traversal of delegations occurs between
nfs4_begin_drain_session and nfs4_end_drain_session. Non-privileged requests
will be blocked because the NFS4_SLOT_TBL_DRAINING flag is set. If there are
many delegations to traverse, this blocking time can be relatively long.
Since abnormal delegations are removed from flc_lease via nfs4_laundromat
--> revoke_delegation --> destroy_unhashed_deleg -->
nfs4_unlock_deleg_lease --> kernel_setlease, and do not block new open
requests indefinitely, retaining such a delegation on the server is
unnecessary.
Reported-by: Zhang Jian <zhangjian496@xxxxxxxxxx>
Fixes: 3bd64a5ba171 ("nfsd4: implement SEQ4_STATUS_RECALLABLE_STATE_REVOKED")
Closes: https://lore.kernel.org/all/ff8debe9-6877-4cf7-ba29-fc98eae0ffa0@xxxxxxxxxx/
Signed-off-by: Li Lingfeng <lilingfeng3@xxxxxxxxxx>
---
fs/nfsd/nfs4state.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 88c347957da5..aa65a685dbb9 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4326,6 +4326,8 @@ nfsd4_sequence(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
int buflen;
struct net *net = SVC_NET(rqstp);
struct nfsd_net *nn = net_generic(net, nfsd_net_id);
+ struct list_head *pos, *next;
+ struct nfs4_delegation *dp;
if (resp->opcnt != 1)
return nfserr_sequence_pos;
@@ -4470,6 +4472,15 @@ nfsd4_sequence(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
default:
seq->status_flags = 0;
}
+ if (!list_empty(&clp->cl_revoked)) {
+ list_for_each_safe(pos, next, &clp->cl_revoked) {
+ dp = list_entry(pos, struct nfs4_delegation, dl_recall_lru);
+ if (dp->dl_time < (ktime_get_boottime_seconds() - 2 * nn->nfsd4_lease)) {
+ list_del_init(&dp->dl_recall_lru);
+ nfs4_put_stid(&dp->dl_stid);
+ }
+ }
+ }
if (!list_empty(&clp->cl_revoked))
seq->status_flags |= SEQ4_STATUS_RECALLABLE_STATE_REVOKED;
if (atomic_read(&clp->cl_admin_revoked))
This seems like a violation of the spec. AIUI, the server is required
to hang onto a record of the delegation until the client does the
TEST_STATEID/FREE_STATEID dance to remove it. Just discarding them like
this seems wrong.
Our expected outcome was that the client would release the abnormal
delegation via TEST_STATEID/FREE_STATEID upon detecting its invalidity.
However, this problematic delegation is no longer present in the
client's server->delegations list—whether due to client-side timeouts or
the server-side bug [1].
Should we instead just administratively evict the client since it's
clearly not behaving right in this case?
Thanks for the suggestion. While administratively evicting the client would
certainly resolve the immediate delegation issue, I'm concerned that
approach
might be a bit heavy-handed.
The problematic behavior seems isolated to a single delegation. Meanwhile,
the client itself likely has numerous other open files and active state on
the server. Forcing a complete client reconnect would tear down all that
state, which could cause significant application disruption and be perceived
as a service outage from the client's perspective.
[1]
https://lore.kernel.org/all/de669327-c93a-49e5-a53b-bda9e67d34a2@xxxxxxxxxx/
Thanks,
Lingfeng