On Thu, Aug 21, 2025 at 04:18:13PM -0400, Josef Bacik wrote: > Adjusting i_state flags always means updating the values manually. Bring > these forward into the 2020's and make a nice clean macro for defining > the i_state values as an enum, providing __ variants for the cases where > we need the bit position instead of the actual value, and leaving the > actual NAME as the 1U << bit value. > > Signed-off-by: Josef Bacik <josef@xxxxxxxxxxxxxx> > --- > include/linux/fs.h | 234 +++++++++++++++++++++++---------------------- > 1 file changed, 122 insertions(+), 112 deletions(-) > > diff --git a/include/linux/fs.h b/include/linux/fs.h > index 9a1ce67eed33..e741dc453c2c 100644 > --- a/include/linux/fs.h > +++ b/include/linux/fs.h > @@ -665,6 +665,127 @@ is_uncached_acl(struct posix_acl *acl) > #define IOP_MGTIME 0x0020 > #define IOP_CACHED_LINK 0x0040 > > +/* > + * Inode state bits. Protected by inode->i_lock > + * > + * Four bits determine the dirty state of the inode: I_DIRTY_SYNC, > + * I_DIRTY_DATASYNC, I_DIRTY_PAGES, and I_DIRTY_TIME. > + * > + * Four bits define the lifetime of an inode. Initially, inodes are I_NEW, > + * until that flag is cleared. I_WILL_FREE, I_FREEING and I_CLEAR are set at > + * various stages of removing an inode. > + * > + * Two bits are used for locking and completion notification, I_NEW and I_SYNC. > + * > + * I_DIRTY_SYNC Inode is dirty, but doesn't have to be written on > + * fdatasync() (unless I_DIRTY_DATASYNC is also set). > + * Timestamp updates are the usual cause. > + * I_DIRTY_DATASYNC Data-related inode changes pending. We keep track of > + * these changes separately from I_DIRTY_SYNC so that we > + * don't have to write inode on fdatasync() when only > + * e.g. the timestamps have changed. > + * I_DIRTY_PAGES Inode has dirty pages. Inode itself may be clean. > + * I_DIRTY_TIME The inode itself has dirty timestamps, and the > + * lazytime mount option is enabled. We keep track of this > + * separately from I_DIRTY_SYNC in order to implement > + * lazytime. This gets cleared if I_DIRTY_INODE > + * (I_DIRTY_SYNC and/or I_DIRTY_DATASYNC) gets set. But > + * I_DIRTY_TIME can still be set if I_DIRTY_SYNC is already > + * in place because writeback might already be in progress > + * and we don't want to lose the time update > + * I_NEW Serves as both a mutex and completion notification. > + * New inodes set I_NEW. If two processes both create > + * the same inode, one of them will release its inode and > + * wait for I_NEW to be released before returning. > + * Inodes in I_WILL_FREE, I_FREEING or I_CLEAR state can > + * also cause waiting on I_NEW, without I_NEW actually > + * being set. find_inode() uses this to prevent returning > + * nearly-dead inodes. > + * I_WILL_FREE Must be set when calling write_inode_now() if i_count > + * is zero. I_FREEING must be set when I_WILL_FREE is > + * cleared. > + * I_FREEING Set when inode is about to be freed but still has dirty > + * pages or buffers attached or the inode itself is still > + * dirty. > + * I_CLEAR Added by clear_inode(). In this state the inode is > + * clean and can be destroyed. Inode keeps I_FREEING. > + * > + * Inodes that are I_WILL_FREE, I_FREEING or I_CLEAR are > + * prohibited for many purposes. iget() must wait for > + * the inode to be completely released, then create it > + * anew. Other functions will just ignore such inodes, > + * if appropriate. I_NEW is used for waiting. > + * > + * I_SYNC Writeback of inode is running. The bit is set during > + * data writeback, and cleared with a wakeup on the bit > + * address once it is done. The bit is also used to pin > + * the inode in memory for flusher thread. > + * > + * I_REFERENCED Marks the inode as recently references on the LRU list. > + * > + * I_WB_SWITCH Cgroup bdi_writeback switching in progress. Used to > + * synchronize competing switching instances and to tell > + * wb stat updates to grab the i_pages lock. See > + * inode_switch_wbs_work_fn() for details. > + * > + * I_OVL_INUSE Used by overlayfs to get exclusive ownership on upper > + * and work dirs among overlayfs mounts. > + * > + * I_CREATING New object's inode in the middle of setting up. > + * > + * I_DONTCACHE Evict inode as soon as it is not used anymore. > + * > + * I_SYNC_QUEUED Inode is queued in b_io or b_more_io writeback lists. > + * Used to detect that mark_inode_dirty() should not move > + * inode between dirty lists. > + * > + * I_PINNING_FSCACHE_WB Inode is pinning an fscache object for writeback. > + * > + * I_LRU_ISOLATING Inode is pinned being isolated from LRU without holding > + * i_count. > + * > + * Q: What is the difference between I_WILL_FREE and I_FREEING? > + * > + * __I_{SYNC,NEW,LRU_ISOLATING} are used to derive unique addresses to wait > + * upon. There's one free address left. > + */ > + > +/* > + * As simple macro to define the inode state bits, __NAME will be the bit value > + * (0, 1, 2, ...), and NAME will be the bit mask (1U << __NAME). The __NAME_SEQ > + * is used to reset the sequence number so the next name gets the next bit value > + * in the sequence. > + */ > +#define INODE_BIT(name) \ > + __ ## name, \ > + name = (1U << __ ## name), \ > + __ ## name ## _SEQ = __ ## name I'm not sure if this is the future we want :D I think it's harder to parse than what we have now. > + > +enum inode_state_bits { > + INODE_BIT(I_NEW), > + INODE_BIT(I_SYNC), > + INODE_BIT(I_LRU_ISOLATING), > + INODE_BIT(I_DIRTY_SYNC), > + INODE_BIT(I_DIRTY_DATASYNC), > + INODE_BIT(I_DIRTY_PAGES), > + INODE_BIT(I_WILL_FREE), > + INODE_BIT(I_FREEING), > + INODE_BIT(I_CLEAR), > + INODE_BIT(I_REFERENCED), > + INODE_BIT(I_LINKABLE), > + INODE_BIT(I_DIRTY_TIME), > + INODE_BIT(I_WB_SWITCH), > + INODE_BIT(I_OVL_INUSE), > + INODE_BIT(I_CREATING), > + INODE_BIT(I_DONTCACHE), > + INODE_BIT(I_SYNC_QUEUED), > + INODE_BIT(I_PINNING_NETFS_WB), > +}; Good idea but I really dislike this macro indirection. Can't we just do the really boring? enum inode_state_bits { __I_BIT_NEW = 0U __I_BIT_SYNC = 1U __I_BIT_LRU_ISOLATING = 2U } enum inode_state_flags_t { I_NEW = (1U << __I_BIT_NEW) I_SYNC = (1U << __I_BIT_SYNC) I_LRU_ISOLATING = (1U << __I_BIT_LRU_ISOLATING) I_DIRTY_SYNC = (1U << 3) I_DIRTY_DATASYNC = (1U << 4) I_DIRTY_PAGES = (1U << 5) I_WILL_FREE = (1U << 6) I_FREEING = (1U << 7) I_CLEAR = (1U << 8) I_REFERENCED = (1U << 9) I_LINKABLE = (1U << 10) I_DIRTY_TIME = (1U << 11) I_WB_SWITCH = (1U << 12) I_OVL_INUSE = (1U << 13) I_CREATING = (1U << 14) I_DONTCACHE = (1U << 15) I_SYNC_QUEUED = (1U << 16) I_PINNING_NETFS_WB = (1U << 17) }; Note that inode_state_wait_address() and that only works on four bits so we can't really use higher bits anyway without switching back to a scheme where we have to use unsigned long and waste for bytes for nothing on 64 bit. With that out of the way, Reviewed-by: Christian Brauner <brauner@xxxxxxxxxx> > +#define I_DIRTY_INODE (I_DIRTY_SYNC | I_DIRTY_DATASYNC) > +#define I_DIRTY (I_DIRTY_INODE | I_DIRTY_PAGES) > +#define I_DIRTY_ALL (I_DIRTY | I_DIRTY_TIME) > + > /* > * Keep mostly read-only and often accessed (especially for > * the RCU path lookup and 'stat' data) fields at the beginning > @@ -723,7 +844,7 @@ struct inode { > #endif > > /* Misc */ > - u32 i_state; > + enum inode_state_bits i_state; > /* 32-bit hole */ > struct rw_semaphore i_rwsem; > > @@ -2484,117 +2605,6 @@ static inline void kiocb_clone(struct kiocb *kiocb, struct kiocb *kiocb_src, > }; > } > > -/* > - * Inode state bits. Protected by inode->i_lock > - * > - * Four bits determine the dirty state of the inode: I_DIRTY_SYNC, > - * I_DIRTY_DATASYNC, I_DIRTY_PAGES, and I_DIRTY_TIME. > - * > - * Four bits define the lifetime of an inode. Initially, inodes are I_NEW, > - * until that flag is cleared. I_WILL_FREE, I_FREEING and I_CLEAR are set at > - * various stages of removing an inode. > - * > - * Two bits are used for locking and completion notification, I_NEW and I_SYNC. > - * > - * I_DIRTY_SYNC Inode is dirty, but doesn't have to be written on > - * fdatasync() (unless I_DIRTY_DATASYNC is also set). > - * Timestamp updates are the usual cause. > - * I_DIRTY_DATASYNC Data-related inode changes pending. We keep track of > - * these changes separately from I_DIRTY_SYNC so that we > - * don't have to write inode on fdatasync() when only > - * e.g. the timestamps have changed. > - * I_DIRTY_PAGES Inode has dirty pages. Inode itself may be clean. > - * I_DIRTY_TIME The inode itself has dirty timestamps, and the > - * lazytime mount option is enabled. We keep track of this > - * separately from I_DIRTY_SYNC in order to implement > - * lazytime. This gets cleared if I_DIRTY_INODE > - * (I_DIRTY_SYNC and/or I_DIRTY_DATASYNC) gets set. But > - * I_DIRTY_TIME can still be set if I_DIRTY_SYNC is already > - * in place because writeback might already be in progress > - * and we don't want to lose the time update > - * I_NEW Serves as both a mutex and completion notification. > - * New inodes set I_NEW. If two processes both create > - * the same inode, one of them will release its inode and > - * wait for I_NEW to be released before returning. > - * Inodes in I_WILL_FREE, I_FREEING or I_CLEAR state can > - * also cause waiting on I_NEW, without I_NEW actually > - * being set. find_inode() uses this to prevent returning > - * nearly-dead inodes. > - * I_WILL_FREE Must be set when calling write_inode_now() if i_count > - * is zero. I_FREEING must be set when I_WILL_FREE is > - * cleared. > - * I_FREEING Set when inode is about to be freed but still has dirty > - * pages or buffers attached or the inode itself is still > - * dirty. > - * I_CLEAR Added by clear_inode(). In this state the inode is > - * clean and can be destroyed. Inode keeps I_FREEING. > - * > - * Inodes that are I_WILL_FREE, I_FREEING or I_CLEAR are > - * prohibited for many purposes. iget() must wait for > - * the inode to be completely released, then create it > - * anew. Other functions will just ignore such inodes, > - * if appropriate. I_NEW is used for waiting. > - * > - * I_SYNC Writeback of inode is running. The bit is set during > - * data writeback, and cleared with a wakeup on the bit > - * address once it is done. The bit is also used to pin > - * the inode in memory for flusher thread. > - * > - * I_REFERENCED Marks the inode as recently references on the LRU list. > - * > - * I_WB_SWITCH Cgroup bdi_writeback switching in progress. Used to > - * synchronize competing switching instances and to tell > - * wb stat updates to grab the i_pages lock. See > - * inode_switch_wbs_work_fn() for details. > - * > - * I_OVL_INUSE Used by overlayfs to get exclusive ownership on upper > - * and work dirs among overlayfs mounts. > - * > - * I_CREATING New object's inode in the middle of setting up. > - * > - * I_DONTCACHE Evict inode as soon as it is not used anymore. > - * > - * I_SYNC_QUEUED Inode is queued in b_io or b_more_io writeback lists. > - * Used to detect that mark_inode_dirty() should not move > - * inode between dirty lists. > - * > - * I_PINNING_FSCACHE_WB Inode is pinning an fscache object for writeback. > - * > - * I_LRU_ISOLATING Inode is pinned being isolated from LRU without holding > - * i_count. > - * > - * Q: What is the difference between I_WILL_FREE and I_FREEING? > - * > - * __I_{SYNC,NEW,LRU_ISOLATING} are used to derive unique addresses to wait > - * upon. There's one free address left. > - */ > -#define __I_NEW 0 > -#define I_NEW (1 << __I_NEW) > -#define __I_SYNC 1 > -#define I_SYNC (1 << __I_SYNC) > -#define __I_LRU_ISOLATING 2 > -#define I_LRU_ISOLATING (1 << __I_LRU_ISOLATING) > - > -#define I_DIRTY_SYNC (1 << 3) > -#define I_DIRTY_DATASYNC (1 << 4) > -#define I_DIRTY_PAGES (1 << 5) > -#define I_WILL_FREE (1 << 6) > -#define I_FREEING (1 << 7) > -#define I_CLEAR (1 << 8) > -#define I_REFERENCED (1 << 9) > -#define I_LINKABLE (1 << 10) > -#define I_DIRTY_TIME (1 << 11) > -#define I_WB_SWITCH (1 << 12) > -#define I_OVL_INUSE (1 << 13) > -#define I_CREATING (1 << 14) > -#define I_DONTCACHE (1 << 15) > -#define I_SYNC_QUEUED (1 << 16) > -#define I_PINNING_NETFS_WB (1 << 17) > - > -#define I_DIRTY_INODE (I_DIRTY_SYNC | I_DIRTY_DATASYNC) > -#define I_DIRTY (I_DIRTY_INODE | I_DIRTY_PAGES) > -#define I_DIRTY_ALL (I_DIRTY | I_DIRTY_TIME) > - > extern void __mark_inode_dirty(struct inode *, int); > static inline void mark_inode_dirty(struct inode *inode) > { > -- > 2.49.0 >