Re: [PATCH 2/6] VFS/ovl: add lookup_one_positive_killable()

"NeilBrown" <neilb@xxxxxxxxxxx> · Mon, 08 Sep 2025 12:07:37 +1000

On Sat, 06 Sep 2025, Amir Goldstein wrote:
> On Sat, Sep 6, 2025 at 7:00 AM NeilBrown <neilb@xxxxxxxxxxx> wrote:
> >
> > From: NeilBrown <neil@xxxxxxxxxx>
> >
> > ovl wants a lookup which won't block on a fatal signal.
> > It currently uses down_write_killable() and then repeated
> > calls to lookup_one()
> >
> > The lock may not be needed if the name is already in the dcache and it
> > aid proposed future changes if the locking is kept internal to namei.c
> >
> > So this patch adds lookup_one_positive_killable() which is like
> > lookup_one_positive() but will abort in the face of a fatal signal.
> > overlayfs is changed to use this.
> >
> > Signed-off-by: NeilBrown <neil@xxxxxxxxxx>
> 
> I think the commit should mention that this changes from
> inode_lock_killable() to inode_lock_shared_killable() on the
> underlying dir inode which is a good thing for this scope.
> 
> BTW I was reading the git history that led to down_write_killable()
> in this code and I had noticed that commit 3e32715496707
> ("vfs: get rid of old '->iterate' directory operation") has made
> the ovl directory iteration non-killable when promoting the read
> lock on the ovl directory to write lock.

hmmmm....

So the reason that this uses a killable lock is simply because it used
to happen under readdir and readdir uses a killable lock.  Is that
right?

So there is no particularly reason that "killable" is important here?
So I could simply change it to use lookup_one_positive() and you
wouldn't mind?

I'd actually like to make all directory/dentry locking killable - I
don't think there is any downside.  But I don't want to try pushing that
until my current exercise is finished.

> 
> In any case, you may add:
> Reviewed-by: Amir Goldstein <amir73il@xxxxxxxxx>

Thanks!

NeilBrown

> 
> > ---
> >  fs/namei.c             | 54 ++++++++++++++++++++++++++++++++++++++++++
> >  fs/overlayfs/readdir.c | 28 +++++++++++-----------
> >  include/linux/namei.h  |  3 +++
> >  3 files changed, 71 insertions(+), 14 deletions(-)
> >
> > diff --git a/fs/namei.c b/fs/namei.c
> > index cd43ff89fbaa..b1bc298b9d7c 100644
> > --- a/fs/namei.c
> > +++ b/fs/namei.c
> > @@ -1827,6 +1827,19 @@ static struct dentry *lookup_slow(const struct qstr *name,
> >         return res;
> >  }
> >
> > +static struct dentry *lookup_slow_killable(const struct qstr *name,
> > +                                          struct dentry *dir,
> > +                                          unsigned int flags)
> > +{
> > +       struct inode *inode = dir->d_inode;
> > +       struct dentry *res;
> > +       if (inode_lock_shared_killable(inode))
> > +               return ERR_PTR(-EINTR);
> > +       res = __lookup_slow(name, dir, flags);
> > +       inode_unlock_shared(inode);
> > +       return res;
> > +}
> > +
> >  static inline int may_lookup(struct mnt_idmap *idmap,
> >                              struct nameidata *restrict nd)
> >  {
> > @@ -3010,6 +3023,47 @@ struct dentry *lookup_one_unlocked(struct mnt_idmap *idmap, struct qstr *name,
> >  }
> >  EXPORT_SYMBOL(lookup_one_unlocked);
> >
> > +/**
> > + * lookup_one_positive_killable - lookup single pathname component
> > + * @idmap:     idmap of the mount the lookup is performed from
> > + * @name:      qstr olding pathname component to lookup
> > + * @base:      base directory to lookup from
> > + *
> > + * This helper will yield ERR_PTR(-ENOENT) on negatives. The helper returns
> > + * known positive or ERR_PTR(). This is what most of the users want.
> > + *
> > + * Note that pinned negative with unlocked parent _can_ become positive at any
> > + * time, so callers of lookup_one_unlocked() need to be very careful; pinned
> > + * positives have >d_inode stable, so this one avoids such problems.
> > + *
> > + * This can be used for in-kernel filesystem clients such as file servers.
> > + *
> > + * Ut should be called without the parent i_rwsem held, and will take
> 
> Typo: ^^ It
> 
> Thanks,
> Amir.
>