----- Original Message -----
> From: "Haihua Yang" <yanghh@xxxxxxxxx>
> To: "linux-nfs" <linux-nfs@xxxxxxxxxxxxxxx>
> Sent: Thursday, 7 August, 2025 18:14:57
> Subject: LAYOUTCOMMIT Failure After CB_LAYOUTRECALL in pNFS Filelayout Scenario

> I'm observing a consistent failure of LAYOUTCOMMIT when the NFS client
> accesses a pNFS share using filelayout. Below is the sequence of
> events:
> 1, The client opens a file for writing and successfully receives a
> layout (stateid with seqid = 1).
> 2, The client writes data to the data server (DS) successfully.
> 3, The NFS server sends a CB_LAYOUTRECALL (stateid with seqid = 2)
> due to some change on the server side.
> 4, The client sends a LAYOUTCOMMIT (still with seqid = 1), followed
> by a LAYOUTRETURN (with seqid = 2).
> 5, The server responds to LAYOUTCOMMIT with NFS4ERR_OLD_STATEID.
> 6, The server responds to LAYOUTRETURN with NFS4ERR_OK.
> 7, The client retries LAYOUTCOMMIT (still using seqid = 1).
> 8, The server replies with NFS4ERR_BAD_STATEID because the state was
> already removed when processing the LAYOUTRETURN.
>
> It seems there may be two issues with the Linux NFS client’s behavior:
> 1, The client should not send LAYOUTRETURN before receiving a
> non-retryable response to LAYOUTCOMMIT.
> 2, After receiving a CB_LAYOUTRECALL, the client should not continue
> using the old seqid.

I think this question should go to the NFSv4 IETF working group list.
Nonetheless, RFC 8881 says:

   For CB_LAYOUTRECALL arguments, the client MUST send a response to
   the recall before using the seqid.

So it seems that, as long as the client hasn't responded to
CB_LAYOUTRECALL, the 'valid' seqid is 1. Thus LAYOUTCOMMIT with seqid=1
followed by LAYOUTRETURN with seqid=2 looks correct.

See: https://datatracker.ietf.org/doc/html/rfc8881#layout_stateid

Best regards,
   Tigran.

>
> Would you consider this a bug in the client? Or is there something I
> may have misunderstood in the protocol behavior?
>
> Thanks,
> Haihua Yang
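
As a rough illustration of the reading above, here is a toy model of how a client might track which seqid it may use. This is only a sketch, not the Linux client code: the LayoutState class and its method names are hypothetical, seqid wraparound is ignored, and it assumes the reply to the recall goes out between the LAYOUTCOMMIT and the LAYOUTRETURN.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LayoutState:
    """Toy model of a client's view of one layout stateid (hypothetical)."""
    seqid: int                            # seqid the client may currently use
    recalled_seqid: Optional[int] = None  # seqid of a recall not yet answered

    def on_cb_layoutrecall(self, seqid: int) -> None:
        # Recall received but not answered yet: remember its seqid,
        # keep using the old one for outgoing operations.
        self.recalled_seqid = seqid

    def reply_to_recall(self) -> None:
        # Only once the reply has been sent does the client adopt
        # the seqid carried by the recall.
        if self.recalled_seqid is not None:
            self.seqid = max(self.seqid, self.recalled_seqid)
            self.recalled_seqid = None


state = LayoutState(seqid=1)
state.on_cb_layoutrecall(seqid=2)
commit_seqid = state.seqid   # LAYOUTCOMMIT sent before replying to the recall
state.reply_to_recall()
return_seqid = state.seqid   # LAYOUTRETURN sent after replying
print(commit_seqid, return_seqid)
```

Under these assumptions the model reproduces the seqids from the report: 1 for LAYOUTCOMMIT and 2 for LAYOUTRETURN.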