I'm writing a long response to this series, in the meantime I noticed this bit landed in https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git/commit/?h=vfs-6.18.inode.refcount.preliminaries&id=3cba19f6a00675fbc2af0987dfc90e216e6cfb74 but with some whitespace issues in comments -- they are indented with spaces instead of tabs after the opening line. I verified the mail I sent does not have it, so I'm guessing this was copy-pasted? Tabing them by hand does the trick, below is my copy-paste as proof, please indent by hand in your editor ;) diff --git a/fs/inode.c b/fs/inode.c index 2db680a37235..fe4868e2a954 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -1915,10 +1915,10 @@ void iput(struct inode *inode) lockdep_assert_not_held(&inode->i_lock); VFS_BUG_ON_INODE(inode->i_state & I_CLEAR, inode); /* - * Note this assert is technically racy as if the count is bogusly - * equal to one, then two CPUs racing to further drop it can both - * conclude it's fine. - */ + * Note this assert is technically racy as if the count is bogusly + * equal to one, then two CPUs racing to further drop it can both + * conclude it's fine. + */ VFS_BUG_ON_INODE(atomic_read(&inode->i_count) < 1, inode); if (atomic_add_unless(&inode->i_count, -1, 1)) @@ -1942,9 +1942,9 @@ void iput(struct inode *inode) } /* - * iput_final() drops ->i_lock, we can't assert on it as the inode may - * be deallocated by the time the call returns. - */ + * iput_final() drops ->i_lock, we can't assert on it as the inode may + * be deallocated by the time the call returns. + */ iput_final(inode); } EXPORT_SYMBOL(iput); While here, vim told me about spaces instead of tabs in 2 more spots in the file. Again to show the lines: diff --git a/fs/inode.c b/fs/inode.c index 2db680a37235..833de5457a06 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -550,11 +550,11 @@ static void __inode_add_lru(struct inode *inode, bool rotate) struct wait_queue_head *inode_bit_waitqueue(struct wait_bit_queue_entry *wqe, struct inode *inode, u32 bit) { - void *bit_address; + void *bit_address; - bit_address = inode_state_wait_address(inode, bit); - init_wait_var_entry(wqe, bit_address, 0); - return __var_waitqueue(bit_address); + bit_address = inode_state_wait_address(inode, bit); + init_wait_var_entry(wqe, bit_address, 0); + return __var_waitqueue(bit_address); } EXPORT_SYMBOL(inode_bit_waitqueue); @@ -2938,7 +2938,7 @@ EXPORT_SYMBOL(mode_strip_sgid); */ void dump_inode(struct inode *inode, const char *reason) { - pr_warn("%s encountered for inode %px", reason, inode); + pr_warn("%s encountered for inode %px", reason, inode); } EXPORT_SYMBOL(dump_inode); Christian, I think it would be the most expedient if you just made changes on your own with whatever commit message you see fit. No need to mention I brought this up. If you insist I can send a patch. On Wed, Aug 27, 2025 at 6:24 PM Mateusz Guzik <mjguzik@xxxxxxxxx> wrote: > > The material change is I_DIRTY_TIME handling without a spurious ref > acquire/release cycle. > > While here a bunch of smaller changes: > 1. predict there is an inode -- bpftrace suggests one is passed vast > majority of the time > 2. convert BUG_ON into VFS_BUG_ON_INODE > 3. assert on ->i_count > 4. assert ->i_lock is not held > 5. flip the order of I_DIRTY_TIME and nlink count checks as the former > is less likely to be true > > I verified atomic_read(&inode->i_count) does not show up in asm if > debug is disabled. > > Signed-off-by: Mateusz Guzik <mjguzik@xxxxxxxxx> > --- > > The routine kept annoying me, so here is a further revised variant. > > I verified this compiles, but I still cannot runtime test. I'm sorry for > that. My signed-off is conditional on a good samaritan making sure it > works :) > > diff compared to the thing I sent "informally": > - if (unlikely(!inode)) > - asserts > - slightly reworded iput_final commentary > - unlikely() on the second I_DIRTY_TIME check > > Given the revamp I think it makes sense to attribute the change to me, > hence a "proper" mail. > > The thing surviving from the submission by Josef is: > + if (atomic_add_unless(&inode->i_count, -1, 1)) > + return; > > And of course he is the one who brought up the spurious refcount trip in > the first place. > > I'm happy with Reported-by, Co-developed-by or whatever other credit > as you guys see fit. > > That aside I think it would be nice if NULL inodes passed to iput > became illegal, but that's a different story for another day. > > fs/inode.c | 46 +++++++++++++++++++++++++++++++++++----------- > 1 file changed, 35 insertions(+), 11 deletions(-) > > diff --git a/fs/inode.c b/fs/inode.c > index 01ebdc40021e..01a554e11279 100644 > --- a/fs/inode.c > +++ b/fs/inode.c > @@ -1908,20 +1908,44 @@ static void iput_final(struct inode *inode) > */ > void iput(struct inode *inode) > { > - if (!inode) > + if (unlikely(!inode)) > return; > - BUG_ON(inode->i_state & I_CLEAR); > + > retry: > - if (atomic_dec_and_lock(&inode->i_count, &inode->i_lock)) { > - if (inode->i_nlink && (inode->i_state & I_DIRTY_TIME)) { > - atomic_inc(&inode->i_count); > - spin_unlock(&inode->i_lock); > - trace_writeback_lazytime_iput(inode); > - mark_inode_dirty_sync(inode); > - goto retry; > - } > - iput_final(inode); > + lockdep_assert_not_held(&inode->i_lock); > + VFS_BUG_ON_INODE(inode->i_state & I_CLEAR, inode); > + /* > + * Note this assert is technically racy as if the count is bogusly > + * equal to one, then two CPUs racing to further drop it can both > + * conclude it's fine. > + */ > + VFS_BUG_ON_INODE(atomic_read(&inode->i_count) < 1, inode); > + > + if (atomic_add_unless(&inode->i_count, -1, 1)) > + return; > + > + if ((inode->i_state & I_DIRTY_TIME) && inode->i_nlink) { > + trace_writeback_lazytime_iput(inode); > + mark_inode_dirty_sync(inode); > + goto retry; > } > + > + spin_lock(&inode->i_lock); > + if (unlikely((inode->i_state & I_DIRTY_TIME) && inode->i_nlink)) { > + spin_unlock(&inode->i_lock); > + goto retry; > + } > + > + if (!atomic_dec_and_test(&inode->i_count)) { > + spin_unlock(&inode->i_lock); > + return; > + } > + > + /* > + * iput_final() drops ->i_lock, we can't assert on it as the inode may > + * be deallocated by the time the call returns. > + */ > + iput_final(inode); > } > EXPORT_SYMBOL(iput); > > -- > 2.43.0 > -- Mateusz Guzik <mjguzik gmail.com>