On 4/27/25 7:01 PM, trondmy@xxxxxxxxxx wrote: > From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> > > The Linux client assumes that all filehandles are non-volatile for > renames within the same directory (otherwise sillyrename cannot work). > However, the existence of the Linux 'subtree_check' export option has > meant that nfs_rename() has always assumed it needs to flush writes > before attempting to rename. > > Since NFSv4 does allow the client to query whether or not the server > exhibits this behaviour, and since knfsd does actually set the > appropriate flag when 'subtree_check' is enabled on an export, it > should be OK to optimise away the write flushing behaviour in the cases > where it is clearly not needed. No objection to the high level goal, seems like a sensible change. So NFSv3 still has the flushing requirement, but NFSv4 can disable that requirement as long as the server in question asserts FH4_VOLATILE_ANY and FH4_VOL_RENAME, correct? I'm wondering how confident we are that other server implementations behave reasonably. Yes, they are broken if they don't behave, but there is still risk of introducing a regression. > Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> > --- > fs/nfs/client.c | 2 ++ > fs/nfs/dir.c | 15 ++++++++++++++- > include/linux/nfs_fs_sb.h | 6 ++++++ > 3 files changed, 22 insertions(+), 1 deletion(-) > > diff --git a/fs/nfs/client.c b/fs/nfs/client.c > index 2115c1189c2d..6d63b958c4bb 100644 > --- a/fs/nfs/client.c > +++ b/fs/nfs/client.c > @@ -1105,6 +1105,8 @@ struct nfs_server *nfs_create_server(struct fs_context *fc) > if (server->namelen == 0 || server->namelen > NFS2_MAXNAMLEN) > server->namelen = NFS2_MAXNAMLEN; > } > + /* Linux 'subtree_check' borkenness mandates this setting */ Assuming you are going to respin this patch to deal with the build bot warnings, can you make this comment more useful? "borkenness" sounds like you are dealing with a server bug here, but that's not really the case. subtree_check is working as designed: it requires the extra flushing. The no_subtree_check case does not, IIUC. It would be better to explain this need strictly in terms of file handle volatility, no? > + server->fh_expire_type = NFS_FH_VOL_RENAME; > > if (!(fattr->valid & NFS_ATTR_FATTR)) { > error = ctx->nfs_mod->rpc_ops->getattr(server, ctx->mntfh, > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c > index bd23fc736b39..d0e0b435a843 100644 > --- a/fs/nfs/dir.c > +++ b/fs/nfs/dir.c > @@ -2676,6 +2676,18 @@ nfs_unblock_rename(struct rpc_task *task, struct nfs_renamedata *data) > unblock_revalidate(new_dentry); > } > > +static bool nfs_rename_is_unsafe_cross_dir(struct dentry *old_dentry, > + struct dentry *new_dentry) > +{ > + struct nfs_server *server = NFS_SB(old_dentry->d_sb); > + > + if (old_dentry->d_parent != new_dentry->d_parent) > + return false; > + if (server->fh_expire_type & NFS_FH_RENAME_UNSAFE) > + return !(server->fh_expire_type & NFS_FH_NOEXPIRE_WITH_OPEN); > + return true; > +} > + > /* > * RENAME > * FIXME: Some nfsds, like the Linux user space nfsd, may generate a > @@ -2763,7 +2775,8 @@ int nfs_rename(struct mnt_idmap *idmap, struct inode *old_dir, > > } > > - if (S_ISREG(old_inode->i_mode)) > + if (S_ISREG(old_inode->i_mode) && > + nfs_rename_is_unsafe_cross_dir(old_dentry, new_dentry)) > nfs_sync_inode(old_inode); > task = nfs_async_rename(old_dir, new_dir, old_dentry, new_dentry, > must_unblock ? nfs_unblock_rename : NULL); > diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h > index 71319637a84e..6732c7e04d19 100644 > --- a/include/linux/nfs_fs_sb.h > +++ b/include/linux/nfs_fs_sb.h > @@ -236,6 +236,12 @@ struct nfs_server { > u32 acl_bitmask; /* V4 bitmask representing the ACEs > that are supported on this > filesystem */ > + /* The following #defines numerically match the NFSv4 equivalents */ > +#define NFS_FH_NOEXPIRE_WITH_OPEN (0x1) > +#define NFS_FH_VOLATILE_ANY (0x2) > +#define NFS_FH_VOL_MIGRATION (0x4) > +#define NFS_FH_VOL_RENAME (0x8) > +#define NFS_FH_RENAME_UNSAFE (NFS_FH_VOLATILE_ANY | NFS_FH_VOL_RENAME) > u32 fh_expire_type; /* V4 bitmask representing file > handle volatility type for > this filesystem */ -- Chuck Lever