On Fri, 04 Jul 2025, Al Viro wrote: > On Fri, Jul 04, 2025 at 12:43:13AM +0100, Al Viro wrote: > > > I would rather *not* leave a dangling pointer there, and yes, it can > > end up being dangling. kfree_rcu() from inside the ->evict_inode() > > may very well happen earlier than (also RCU-delayed) freeing of struct > > inode itself. > > > > What we can do is WRITE_ONCE() to set it to NULL on the evict_inode > > side and READ_ONCE() in the proc_sys_compare(). > > > > The reason why the latter is memory-safe is that ->d_compare() for > > non-in-lookup dentries is called either under rcu_read_lock() (in which > > case observing non-NULL means that kfree_rcu() couldn't have gotten to > > freeing the sucker) *or* under ->d_lock, in which case the inode can't > > reach ->evict_inode() until we are done. > > > > So this predicate is very much relevant. Have that fucker called with > > neither rcu_read_lock() nor ->d_lock, and you might very well end up > > with dereferencing an already freed ctl_table_header. > > IOW, I would prefer to do this: Looks good - thanks, NeilBrown > > [PATCH] fix proc_sys_compare() handling of in-lookup dentries > > There's one case where ->d_compare() can be called for an in-lookup > dentry; usually that's nothing special from ->d_compare() point of > view, but... proc_sys_compare() is weird. > > The thing is, /proc/sys subdirectories can look differently for > different processes. Up to and including having the same name > resolve to different dentries - all of them hashed. > > The way it's done is ->d_compare() refusing to admit a match unless > this dentry is supposed to be visible to this caller. The information > needed to discriminate between them is stored in inode; it is set > during proc_sys_lookup() and until it's done d_splice_alias() we really > can't tell who should that dentry be visible for. > > Normally there's no negative dentries in /proc/sys; we can run into > a dying dentry in RCU dcache lookup, but those can be safely rejected. > > However, ->d_compare() is also called for in-lookup dentries, before > they get positive - or hashed, for that matter. In case of match > we will wait until dentry leaves in-lookup state and repeat ->d_compare() > afterwards. In other words, the right behaviour is to treat the > name match as sufficient for in-lookup dentries; if dentry is not > for us, we'll see that when we recheck once proc_sys_lookup() is > done with it. > > While we are at it, fix the misspelled READ_ONCE and WRITE_ONCE there. > > Fixes: d9171b934526 ("parallel lookups machinery, part 4 (and last)") > Reported-by: NeilBrown <neilb@xxxxxxxxxx> > Reviewed-by: Christian Brauner <brauner@xxxxxxxxxx> > Reviewed-by: NeilBrown <neil@xxxxxxxxxx> > Signed-off-by: Al Viro <viro@xxxxxxxxxxxxxxxxxx> > --- > diff --git a/fs/proc/inode.c b/fs/proc/inode.c > index a3eb3b740f76..3604b616311c 100644 > --- a/fs/proc/inode.c > +++ b/fs/proc/inode.c > @@ -42,7 +42,7 @@ static void proc_evict_inode(struct inode *inode) > > head = ei->sysctl; > if (head) { > - RCU_INIT_POINTER(ei->sysctl, NULL); > + WRITE_ONCE(ei->sysctl, NULL); > proc_sys_evict_inode(inode, head); > } > } > diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c > index cc9d74a06ff0..08b78150cdde 100644 > --- a/fs/proc/proc_sysctl.c > +++ b/fs/proc/proc_sysctl.c > @@ -918,17 +918,21 @@ static int proc_sys_compare(const struct dentry *dentry, > struct ctl_table_header *head; > struct inode *inode; > > - /* Although proc doesn't have negative dentries, rcu-walk means > - * that inode here can be NULL */ > - /* AV: can it, indeed? */ > - inode = d_inode_rcu(dentry); > - if (!inode) > - return 1; > if (name->len != len) > return 1; > if (memcmp(name->name, str, len)) > return 1; > - head = rcu_dereference(PROC_I(inode)->sysctl); > + > + // false positive is fine here - we'll recheck anyway > + if (d_in_lookup(dentry)) > + return 0; > + > + inode = d_inode_rcu(dentry); > + // we just might have run into dentry in the middle of __dentry_kill() > + if (!inode) > + return 1; > + > + head = READ_ONCE(PROC_I(inode)->sysctl); > return !head || !sysctl_is_seen(head); > } > >