Re: [nfsv4] Re: simple NFSv4.1/4.2 test of remove while holding a delegation

On Tue, 2025-06-10 at 05:42 -0700, Rick Macklem wrote:
> On Tue, Jun 10, 2025 at 4:51 AM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > 
> > On Mon, 2025-06-09 at 18:06 -0700, Rick Macklem wrote:
> > > On Mon, Jun 9, 2025 at 5:17 PM Dai Ngo <dai.ngo@xxxxxxxxxx> wrote:
> > > > 
> > > > On 6/9/25 4:35 PM, Rick Macklem wrote:
> > > > > Hi,
> > > > > 
> > > > > I hope you don't mind a cross-post, but I thought both groups
> > > > > might find this interesting...
> > > > > 
> > > > > I have been creating a compound RPC that does REMOVE and
> > > > > then tries to determine if the file object has been removed and
> > > > > I was surprised to see quite different results from the Linux knfsd
> > > > > and Solaris 11.4 NFSv4.1/4.2 servers. I think both these servers
> > > > > provide FH4_PERSISTENT file handles, although I suppose I
> > > > > should check that?
> > > > > 
> > > > > First, the test OPEN/CREATEs a regular file called "foo" (only one
> > > > > hard link) and acquires a write delegation for it.
> > > > > Then a compound does the following:
> > > > > ...
> > > > > REMOVE foo
> > > > > PUTFH fh for foo
> > > > > GETATTR
> > > > > 
> > > > > For the Solaris 11.4 server, the server CB_RECALLs the
> > > > > delegation and then replies NFS4ERR_STALE for the PUTFH above.
> > > > > (The FreeBSD server currently does the same.)
> > > > > 
> > > > > For a fairly recent Linux (6.12) knfsd, the compound above gets an
> > > > > NFS4_OK reply, with nlinks == 0 in the GETATTR reply.
> > > > > 
> > > > > Hmm. So I've looked in RFC8881 (I'm terrible at reading it so I
> > > > > probably missed something) and I cannot find anything that states
> > > > > either of the above behaviours is incorrect.
> > 
> > This seems outside the scope of the spec. What you're probably seeing
> > is just differences in the implementation details of the two servers.
> > 
> > > > > (NFS4ERR_STALE is listed as an error code for PUTFH, but the
> > > > > description of PUTFH only says that it sets the CFH to the fh arg.
> > > > > It does not say anything w.r.t. the fh arg. needing to be for a file
> > > > > that still exists.) Neither of these servers sets
> > > > > OPEN4_RESULT_PRESERVE_UNLINKED in the OPEN reply.
> > > > > 
> > > > > So, it looks like "file object no longer exists" is indicated either
> > > > > by an NFS4ERR_STALE reply to PUTFH or GETATTR
> > > > > OR
> > > > > by a successful reply, but with nlinks == 0 in the GETATTR reply.
> > > > > 
> > > > > To be honest, I kinda like the Linux knfsd version, but I am wondering
> > > > > if others think that both of these replies are correct?
> > > > > 
> > > > > Also, is the CB_RECALL needed when the delegation is held by
> > > > > the same client as the one doing the REMOVE?
> > > > 
> > > > The Linux NFSD detects the delegation belongs to the same client that
> > > > causes the conflict (due to REMOVE) and skips the CB_RECALL. This is
> > > > an optimization based on the assumption that the client would handle
> > > > the conflict locally.
> > > And then what does the server do with the delegation?
> > > - Does it just discard it, since the file object has been deleted?
> > > OR
> > > - Does it guarantee that a DELEGRETURN done after the REMOVE will
> > >   still work (which seems to be the case for the 6.12 server I am using for
> > >   testing).
> > > 
> > 
> > The latter. The file on the server is still being held open by virtue
> > of the fact that the client holds a delegation stateid on it.
> > 
> > The inode will still exist in core (with nlinks == 0) until its last
> > reference is released (here, when the client does the final
> > DELEGRETURN). Aside from the fact that it's now disconnected from the
> > filesystem namespace, it's still "alive", and reachable via filehandle.
> Thanks for the info. (I had a hunch it was held by the delegation.)
> I'll guess that implies that LINK could still be done, bumping nlink to 1
> before the DELEGRETURN? That means that nlink == 0 only guarantees
> that the file object will be deleted if the client holds a write delegation and
> ensures that LINK is not allowed before DELEGRETURN.
> 

I believe that LINK is actually prevented at that point. The VFS only
allows flink() to work when nlink == 0 on O_TMPFILE files, IIRC. IMO,
that's a Linux implementation detail rather than something the NFS
protocol or POSIX requires.

> Although trying to avoid the WRITE, WRITE,...COMMIT to the server
> just before a file is deleted seems worth the effort, it never seems to
> be as easy as you'd think.
> 

Definitely. The problem of course is that you can't really know whether
a REMOVE will actually delete the file. It'll remove the name, but
link() could have raced in, and at that point you sort of have to do
the writes.
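
A small local illustration of that race, as a sketch only: if a second hard
link is created before the name is removed, the file object survives the
unlink and the link count seen through the open descriptor is 1, not 0. The
helper name and file names are hypothetical, and this uses plain POSIX calls
on a local filesystem rather than NFS operations.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: create `name`, add a second link `alias` (standing
 * in for the racing link()), remove `name`, and report the link count
 * seen through the still-open fd. Returns -1 on any error.
 */
long nlink_after_unlink_with_alias(const char *name, const char *alias)
{
    struct stat st;
    long result = -1;
    int fd;

    fd = open(name, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd < 0)
        return -1;

    if (link(name, alias) == 0) {
        /* Removing one name does not delete the file object. */
        if (unlink(name) == 0 && fstat(fd, &st) == 0)
            result = (long)st.st_nlink;
        unlink(alias);                  /* clean up the second name */
    } else {
        unlink(name);
    }

    close(fd);
    return result;
}
```

In this situation any cached dirty data still matters, because the file
remains reachable by its other name.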

> > 
> > > > 
> > > > If the REMOVE was done by another client, the REMOVE will not complete
> > > > until the delegation is returned. If the PUTFH comes after the REMOVE
> > > > was completed, it'll fail with NFS4ERR_STALE since the file, specified
> > > > by the file handle, no longer exists.
> > > Assuming the statement w.r.t. "fail with NFS4ERR_STALE" only applies to
> > > "REMOVE done by another client" then that sounds fine.
> > > However, if "fail with NFS4ERR_STALE" is also supposed to happen after a
> > > REMOVE by the same client, then that is not what I am seeing.
> > > If you are curious, the packet trace is here. (Look at packet#58).
> > > https://people.freebsd.org/~rmacklem/linux-remove.pcap
> > > 
> > > Btw, in case you are curious why I am doing this testing, I am trying
> > > to figure out a good way for the FreeBSD client to handle temporary
> > > files. Typically on POSIX they are done via the syscalls:
> > > 
> > > fd = open("foo", O_CREAT ...);
> > > unlink("foo");
> > > write(fd,..), write(fd,..)...
> > > read(fd,...), read(fd,...)...
> > > close(fd);
> > > 
> > > If this happens quickly and is not too much writing, the writes
> > > copy data into buffers/pages, the reads read the data out of
> > > the pages and then it all gets deleted.
> > > 
> > 
> > Yep, common pattern.
> > 
> > > Unfortunately, the CB_RECALL forces the NFSv4.n client
> > > to do WRITE, WRITE,..COMMIT and then DELEGRETURN.
> > > Then the REMOVE throws all the data away on the NFSv4.n
> > > server.
> > > --> As such, I really like not doing the CB_RECALL for "same client".
> > > My concern is "what happens to the delegation after the file object ("foo")
> > > gets deleted?"
> > > It either needs to be thrown away by the NFSv4.n server or the
> > > PUTFH, DELEGRETURN needs to work after the REMOVE.
> > 
> > I think the latter. A REMOVE just removes the filename from the
> > namespace. What happens to the underlying inode/vnode/whathaveyou is
> > undefined by the protocol. The delegation is effectively holding the
> > file open, so it needs to continue to exist on the server, just as the
> > file "foo" in your example above must exist after the unlink().
> > 
> > > Otherwise, the NFSv4.n server may get constipated by the delegations,
> > > which might be called stale, since the file object has been deleted.
> > > 
> > > --> I can do PUTFH, GETATTR after REMOVE in the same compound,
> > >      to find out if the file object has been deleted. But then, if a
> > >      PUTFH, DELEGRETURN fails with NFS4ERR_STALE, can I get
> > >      away with saying "the server should just discard the delegation, as
> > >      the client already has done so"?
> > > 
> > > Thanks for your comments, rick
> > > 
> > 
> > If you still have an outstanding delegation after a REMOVE, then
> > returning ESTALE on the filehandle at that point seems wrong. The
> > delegation still exists, so the underlying filehandle should still
> > exist.
> > 
> > Linux doesn't generally throw back an NFS4ERR_STALE until it just can't
> > find the inode at all anymore. A dentry holds a reference to the inode,
> > and open files hold a reference to the dentry. The remove just
> > disconnects the dentry from the namespace and drops its refcount. When
> > the DELEGRETURN issues the last close, the inode gets cleaned up and at
> > that point you can't find it by filehandle anymore.
> > 
> > You probably want to aim for similar behavior in FreeBSD?
> I'm not sure. So long as the server guarantees that the file object has been
> deleted by the REMOVE, throwing NFS4ERR_STALE seems a reasonable alternative?
> 

At that point won't you have to start returning writeback errors back
to userland? What if you do this?

fd = open("foo", O_CREAT ...);
unlink("foo");
write(fd,..), write(fd,..)...
fsync(fd);

In the absence of a delegation, won't the fsync get back an error here
because the file is now stale?
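
For reference, a self-contained version of that sequence, as a sketch
assuming a local POSIX filesystem. The helper name, path argument and
payload are made up for illustration. The fsync() is the call where a
stale-handle error would surface to userland if the server had already
invalidated the filehandle; on a local filesystem it is expected to succeed
and fstat() to report a link count of 0.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: open/unlink/write/fsync/read on a scratch file.
 * Returns 0 if every step succeeded and the data read back matches,
 * storing the final link count in *nlink_out; returns -1 otherwise.
 */
int unlinked_tmpfile_roundtrip(const char *path, long *nlink_out)
{
    static const char msg[] = "scratch data";
    char buf[sizeof(msg)];
    struct stat st;
    int fd, rc = -1;

    fd = open(path, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd < 0)
        return -1;

    if (unlink(path) < 0)               /* name is gone, fd stays open */
        goto out;
    if (write(fd, msg, sizeof(msg)) != (ssize_t)sizeof(msg))
        goto out;
    if (fsync(fd) < 0)                  /* writeback errors show up here */
        goto out;
    if (pread(fd, buf, sizeof(buf), 0) != (ssize_t)sizeof(buf))
        goto out;
    if (memcmp(buf, msg, sizeof(msg)) != 0)
        goto out;
    if (fstat(fd, &st) < 0)
        goto out;

    *nlink_out = (long)st.st_nlink;
    rc = 0;
out:
    close(fd);                          /* last reference: inode is freed */
    return rc;
}
```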

> Note that the FreeBSD server does not handle NFSv4 OPENs and
> DELEGATIONs like a POSIX open(2), so the file handle is no longer
> valid once nlink == 0 on the underlying vnode/inode.
> (Again, I don't think there is anything in RFC8881 that specifies
> what is correct behaviour for this?)
> 
> It's a case where I'd like to be able to test against all extant servers,
> but none of the others show up at Bakeathons these days. Sigh.
> 
> Thanks for your comments, rick
> 


> > 
> > > > 
> > > > -Dai
> > > > 
> > > > > (I don't think it is, but there is a discussion in 18.25.4 which says
> > > > > "When the determination above cannot be made definitively because
> > > > > delegations are being held, they MUST be recalled.." but everything
> > > > > above that is a may/MAY, so it is not obvious to me if a server really
> > > > > needs to care?)
> > > > > 
> > > > > Any comments? Thanks, rick
> > > > > ps: I am amazed when I learn these things about NFSv4.n after all
> > > > >        these years.
> > > > > 
> > 
> > 
> > --
> > Jeff Layton <jlayton@xxxxxxxxxx>

-- 
Jeff Layton <jlayton@xxxxxxxxxx>
