Re: LAYOUTCOMMIT Failure After CB_LAYOUTRECALL in pNFS Filelayout Scenario

Hi Tigran and Trond,

The Linux client calls nfs4_layout_refresh_old_stateid when the server
returns NFS4ERR_OLD_STATEID in response to a LAYOUTRETURN, but it
doesn’t do the same for LAYOUTCOMMIT. Is there a specific reason for
this difference?
Thanks
Haihua Yang
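
To make the question concrete, here is a minimal Python sketch of the two retry behaviours. It is a toy model, not the kernel's code: `server_check`, `layoutcommit_with_retry`, and the `refresh` flag are hypothetical names introduced for illustration, and the server check assumes the strict seqid comparison described in the report quoted below.

```python
NFS4_OK = "NFS4_OK"
NFS4ERR_OLD_STATEID = "NFS4ERR_OLD_STATEID"
NFS4ERR_BAD_STATEID = "NFS4ERR_BAD_STATEID"


def server_check(current_seqid: int, sent_seqid: int) -> str:
    """Toy server-side check (assumption): only the current seqid is accepted."""
    if sent_seqid == current_seqid:
        return NFS4_OK
    if sent_seqid < current_seqid:
        return NFS4ERR_OLD_STATEID
    return NFS4ERR_BAD_STATEID


def layoutcommit_with_retry(current_seqid, sent_seqid, latest_seen_seqid,
                            refresh, max_tries=3):
    """Return the (seqid, status) pair of every attempt.

    With refresh=False the arguments are resent unchanged, as described
    for LAYOUTCOMMIT. With refresh=True the seqid is replaced by the
    newest one the client has seen before retrying, which is the idea
    behind refreshing an old stateid.
    """
    attempts = []
    for _ in range(max_tries):
        status = server_check(current_seqid, sent_seqid)
        attempts.append((sent_seqid, status))
        if status != NFS4ERR_OLD_STATEID:
            break
        if refresh and sent_seqid < latest_seen_seqid:
            sent_seqid = latest_seen_seqid
    return attempts


# Server is at seqid 2 (after the recall); the client first sends seqid 1.
print(layoutcommit_with_retry(2, 1, 2, refresh=False))
print(layoutcommit_with_retry(2, 1, 2, refresh=True))
```

In this model the unrefreshed retry repeats NFS4ERR_OLD_STATEID until it gives up, while the refreshed retry succeeds on the second attempt. Whether the real client may safely do this for LAYOUTCOMMIT is exactly the open question.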

On Thu, Aug 7, 2025 at 10:22 AM Haihua Yang <yanghh@xxxxxxxxx> wrote:
>
> Tigran,
> I forgot to mention in my previous email that, after step 4, the client
> also sends a reply to the CB_LAYOUTRECALL. But when it retries the
> LAYOUTCOMMIT afterward, it still uses seqid 1.
> From what I observed in the Linux implementation, the retry logic
> doesn’t update the request arguments, so the client ends up resending
> the same LAYOUTCOMMIT with the old seqid.
>
> Regards,
> Haihua Yang
>
>
> On Thu, Aug 7, 2025 at 9:33 AM Mkrtchyan, Tigran
> <tigran.mkrtchyan@xxxxxxx> wrote:
> >
> >
> >
> > ----- Original Message -----
> > > From: "Haihua Yang" <yanghh@xxxxxxxxx>
> > > To: "linux-nfs" <linux-nfs@xxxxxxxxxxxxxxx>
> > > Sent: Thursday, 7 August, 2025 18:14:57
> > > Subject: LAYOUTCOMMIT Failure After CB_LAYOUTRECALL in pNFS Filelayout Scenario
> >
> > > I'm observing a consistent failure of LAYOUTCOMMIT when the NFS client
> > > accesses a pNFS share using filelayout. Below is the sequence of
> > > events:
> > >  1, The client opens a file for writing and successfully receives a
> > > layout (stateid with seqid = 1).
> > >  2, The client writes data to the data server (DS) successfully.
> > >  3, The NFS server sends a CB_LAYOUTRECALL (stateid with seqid = 2)
> > > due to some change on the server side.
> > >  4, The client sends a LAYOUTCOMMIT (still with seqid = 1), followed
> > > by a LAYOUTRETURN (with seqid = 2).
> > >  5, The server responds to LAYOUTCOMMIT with NFS4ERR_OLD_STATEID.
> > >  6, The server responds to LAYOUTRETURN with NFS4_OK.
> > >  7, The client retries LAYOUTCOMMIT (still using seqid = 1).
> > >  8, The server replies with NFS4ERR_BAD_STATEID because the state was
> > > already removed when processing the LAYOUTRETURN.
> > >
> > > It seems there may be two issues with the Linux NFS client’s behavior:
> > >  1, The client should not send LAYOUTRETURN before receiving a
> > > non-retryable response to LAYOUTCOMMIT.
> > >  2, After receiving a CB_LAYOUTRECALL, the client should not continue
> > > using the old seqid.
> >
> > I think this question should go to the NFSv4 IETF working group list.
> > Nonetheless, RFC 8881 says:
> >
> >    For CB_LAYOUTRECALL arguments, the client MUST send a response to the recall before using the seqid.
> >
> > So it sounds as though, as long as the client hasn't responded to the
> > CB_LAYOUTRECALL, the 'valid' seqid is 1. Thus, a LAYOUTCOMMIT with seqid=1
> > followed by a LAYOUTRETURN with seqid=2 looks correct.
> >
> > See: https://datatracker.ietf.org/doc/html/rfc8881#layout_stateid
> >
> > Best regards,
> >    Tigran.
> >
> > >
> > > Would you consider this a bug in the client? Or is there something I
> > > may have misunderstood in the protocol behavior?
> > >
> > > Thanks,
> > > Haihua Yang
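
The eight-step sequence in the original report can be replayed with a small Python sketch. This is a toy model under stated assumptions: `ToyLayoutServer` and `replay` are hypothetical names, the server tracks a single layout stateid by its seqid only, and it rejects the pre-recall seqid as soon as the recall has been sent, which is the behaviour reported in the thread rather than something the RFC is shown to require.

```python
from typing import List, Optional

NFS4_OK = "NFS4_OK"
NFS4ERR_OLD_STATEID = "NFS4ERR_OLD_STATEID"
NFS4ERR_BAD_STATEID = "NFS4ERR_BAD_STATEID"


class ToyLayoutServer:
    """Hypothetical server holding one layout stateid, modelled by its seqid."""

    def __init__(self) -> None:
        self.seqid: Optional[int] = None  # None: no layout state exists

    def layoutget(self) -> int:
        self.seqid = 1
        return self.seqid

    def cb_layoutrecall(self) -> int:
        # Assumption: the recall carries a seqid bumped by one.
        assert self.seqid is not None
        self.seqid += 1
        return self.seqid

    def _check(self, seqid: int) -> str:
        if self.seqid is None:
            return NFS4ERR_BAD_STATEID
        if seqid < self.seqid:
            return NFS4ERR_OLD_STATEID
        if seqid > self.seqid:
            return NFS4ERR_BAD_STATEID
        return NFS4_OK

    def layoutcommit(self, seqid: int) -> str:
        return self._check(seqid)

    def layoutreturn(self, seqid: int) -> str:
        status = self._check(seqid)
        if status == NFS4_OK:
            self.seqid = None  # the layout state is removed on return
        return status


def replay() -> List[str]:
    server = ToyLayoutServer()
    held = server.layoutget()            # step 1: layout with seqid 1
    recalled = server.cb_layoutrecall()  # step 3: recall with seqid 2
    return [
        server.layoutcommit(held),       # steps 4-5: LAYOUTCOMMIT, seqid 1
        server.layoutreturn(recalled),   # steps 4, 6: LAYOUTRETURN, seqid 2
        server.layoutcommit(held),       # steps 7-8: retry, still seqid 1
    ]


print(replay())
# ['NFS4ERR_OLD_STATEID', 'NFS4_OK', 'NFS4ERR_BAD_STATEID']
```

The model reproduces the reported statuses: once the LAYOUTRETURN succeeds, no retry of the LAYOUTCOMMIT can succeed, whatever seqid it carries.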




