Re: [RFC] a possible way of reducing the PITA of ->d_name audits

NeilBrown <neilb@xxxxxxxxxxx> · Mon, 15 Sep 2025 09:07:00 +1000

On Sun, 14 Sep 2025, Al Viro wrote:
> On Sun, Sep 14, 2025 at 02:37:30AM +0100, Al Viro wrote:
> 
> > AFAICS, it can happen if you are there from nfs4_file_open(), hit
> > _nfs4_opendata_to_nfs4_state(opendata), find ->rpc_done to be true
> > in there, hit nfs4_opendata_find_nfs4_state(), have it call
> > nfs4_opendata_get_inode() and run into a server without
> > NFS_CAP_ATOMIC_OPEN_V1.  Then you get ->o_arg.claim set to
> > NFS4_OPEN_CLAIM_NULL and hit this:
> >                 inode = nfs_fhget(data->dir->d_sb, &data->o_res.fh,
> >                                 &data->f_attr);
> > finding not the same inode as your dentry has attached to it.
> > 
> > So the test might end up not being true, at least from my reading of
> > that code.
> > 
> > What I don't understand is the reasons for not failing immediately
> > with EOPENSTALE in that case.
> > 
> > TBH, I would be a lot more comfortable if the "attach inode to dentry"
> > logics in there had been taken several levels up the call chains - analysis
> > would be much easier that way...
>  
> BTW, that's one place where your scheme with locking dentry might cause
> latency problems - two opens on the same cached dentry could be sent
> in parallel, but if you hold it against renames, etc., you might end up
> with those two roundtrips serialized against each other...
> 

That is certainly something we need to explore when the time comes,
though at the moment I'm mostly focuses on trying to land APIs to
centralise the locking so that we can easily see what locking is
actually being done and can change it all in one place.

Currently all calls to ->atomic_open are exclusive on the dentry, either
because O_CREATE means the directory is locked exclusively or because
d_alloc_parallel() locked the dentry with DCACHE_PAR_LOOKUP.  ...except
a cached-negative (which was accepted by d_revalidate) without O_CREATE.
I would rather we didn't call ->atomic_open in that case.  If the
filesystem wants us to (which NFS does) they can fail ->d_revalidate
for LOOKUP_OPEN.  I have a patch to do that for NFS so we can drop the
d_alloc_parallel() from nfs_atomic_open().  I think we need that case to
be exclusive anyway.  You achieved that for NFS by using
d_alloc_parallel().

All other atomic_open functions simply call finish_no_open() or return
-ENOENT when given a non-d_in_lookup() dentry and are not asked to
CREATE.  They could all be simplified if we avoided the atomic_open call
in that case.

The proposal is to add a case when O_CREATE isn't requested and the
dentry is no longer d_in_lookup().  Probably that needs exclusive access
to the dentry anyway?  Either way it wouldn't be a regression.

I wonder if maybe we should transition from offering ->atomic_open() to
offering ->lookup_open() which is called instead of the current
lookup_open() (though with locking moved inside the function).

So for a LAST_NORM open the filesystem can take complete control after
the parent has been found.  The VFS can export a bunch of appropriate
helpers which the filesystem can mix and match.  This seems cleaner than
having a ->lookup() which bypasses LOOKUP_OPEN requests and a
->d_revalidate which does the same.

NeilBrown