On Wed, Aug 20, 2025 at 4:57 AM Thomas Bertschinger <tahbertschinger@xxxxxxxxx> wrote: > > On Tue Aug 19, 2025 at 9:11 AM MDT, Jens Axboe wrote: > > I'll take a look at this, but wanted to mention that I dabbled in this > > too a while ago, here's what I had: > > > > https://git.kernel.dk/cgit/linux/log/?h=io_uring-handle > > Thanks! That is helpful. Right away I see something you included that I > missed: requiring CONFIG_FHANDLE. Missing that would explain the build > failure emails I got on this series. > > I'll include that in v2, when I get around to that--hopefully soon. > > > > > Probably pretty incomplete, but I did try and handle some of the > > cases that won't block to avoid spurious -EAGAIN and io-wq usage. > > So for the non-blocking case, what I am concerned about is code paths > like this: > > do_handle_to_path() > -> exportfs_decode_fh_raw() > -> fh_to_dentry() > -> xfs_fs_fh_to_dentry() > ... -> xfs_iget() > OR > -> ext4_fh_to_dentry() > ... -> ext4_iget() > > Where there doesn't seem to be any existing way to tell the FS > implementation to give up and return -EAGAIN when appropriate. I wasn't > sure how to do that without modifying the signature of fh_to_dentry() > (and fh_to_parent()) which seems awfully invasive for this. > > (Using a flag in task_struct to signify "don't block" was previously > discussed: > https://lore.kernel.org/io-uring/22630618-40fc-5668-078d-6cefcb2e4962@xxxxxxxxx/ > and that could allow not needing to pass a flag via function argument, > but I agree with the conclusion in that email chain that it's an ugly > solution.) > > Any thoughts on that? This seemed to me like there wasn't an obvious > easy solution, hence why I just didn't attempt it at all in v1. > Maybe I'm missing something, though. > Since FILEID_IS_CONNECTABLE, we started using the high 16 bits of fh_type for FILEID_USER_FLAGS, since fs is not likely expecting a fh_type beyond 0xff (Documentation/filesystems/nfs/exporting.rst): "A filehandle fragment consists of an array of 1 or more 4byte words, together with a one byte "type"." The name FILEID_USER_FLAGS may be a bit misleading - it was never the intention for users to manipulate those flags, although they certainly can and there is no real harm in that. These flags are used in the syscall interface only, but ->fh_to_{dentry,parent}() function signature also take an int fh_flags argument, so we can use that to express the non-blocking request. Untested patch follows (easier than explaining): diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c index d3e55de4a2a2a..a46c97af4dfb1 100644 --- a/fs/exportfs/expfs.c +++ b/fs/exportfs/expfs.c @@ -391,7 +391,7 @@ int exportfs_encode_inode_fh(struct inode *inode, struct fid *fid, else type = nop->encode_fh(inode, fid->raw, max_len, parent); - if (type > 0 && FILEID_USER_FLAGS(type)) { + if (type > 0 && type & ~FILEID_HANDLE_TYPE_MASK) { pr_warn_once("%s: unexpected fh type value 0x%x from fstype %s.\n", __func__, type, inode->i_sb->s_type->name); return -EINVAL; @@ -443,9 +443,12 @@ exportfs_decode_fh_raw(struct vfsmount *mnt, struct fid *fid, int fh_len, const struct export_operations *nop = mnt->mnt_sb->s_export_op; struct dentry *result, *alias; char nbuf[NAME_MAX+1]; + int fh_type = fileid_type | FILEID_FS_FLAGS(flags); int err; - if (fileid_type < 0 || FILEID_USER_FLAGS(fileid_type)) + BUILD_BUG_ON(FILEID_HANDLE_TYPE_MASK & FILEID_FS_FLAGS_MASK); + BUILD_BUG_ON(FILEID_USER_FLAGS_MASK & FILEID_FS_FLAGS_MASK); + if (fileid_type < 0 || fileid_type & ~FILEID_HANDLE_TYPE_MASK) return ERR_PTR(-EINVAL); /* @@ -453,7 +456,7 @@ exportfs_decode_fh_raw(struct vfsmount *mnt, struct fid *fid, int fh_len, */ if (!exportfs_can_decode_fh(nop)) return ERR_PTR(-ESTALE); - result = nop->fh_to_dentry(mnt->mnt_sb, fid, fh_len, fileid_type); + result = nop->fh_to_dentry(mnt->mnt_sb, fid, fh_len, fh_type); if (IS_ERR_OR_NULL(result)) return result; @@ -481,6 +484,10 @@ exportfs_decode_fh_raw(struct vfsmount *mnt, struct fid *fid, int fh_len, * filesystem root. */ if (result->d_flags & DCACHE_DISCONNECTED) { + err = -EAGAIN; + if (flags & EXPORT_FH_CACHED) + goto err_result; + err = reconnect_path(mnt, result, nbuf); if (err) goto err_result; @@ -511,6 +518,10 @@ exportfs_decode_fh_raw(struct vfsmount *mnt, struct fid *fid, int fh_len, if (alias) return alias; + err = -EAGAIN; + if (flags & EXPORT_FH_CACHED) + goto err_result; + /* * Try to extract a dentry for the parent directory from the * file handle. If this fails we'll have to give up. diff --git a/fs/fhandle.c b/fs/fhandle.c index 7c236f64cdeac..228512424ad65 100644 --- a/fs/fhandle.c +++ b/fs/fhandle.c @@ -339,7 +339,7 @@ static int handle_to_path(int mountdirfd, struct file_handle __user *ufh, (f_handle.handle_bytes == 0)) return -EINVAL; - if (f_handle.handle_type < 0 || + if (f_handle.handle_type < 0 || FILEID_FS_FLAGS(f_handle.handle_type) || FILEID_USER_FLAGS(f_handle.handle_type) & ~FILEID_VALID_USER_FLAGS) return -EINVAL; @@ -382,7 +382,10 @@ static int handle_to_path(int mountdirfd, struct file_handle __user *ufh, if (f_handle.handle_type & FILEID_IS_DIR) ctx.fh_flags |= EXPORT_FH_DIR_ONLY; /* Filesystem code should not be exposed to user flags */ - handle->handle_type &= ~FILEID_USER_FLAGS_MASK; + BUILD_BUG_ON(FILEID_HANDLE_TYPE_MASK & FILEID_USER_FLAGS_MASK); + handle->handle_type &= ~FILEID_HANDLE_TYPE_MASK; + if (o_flags & O_NONBLOCK) + ctx.fh_flags |= EXPORT_FH_CACHED; retval = do_handle_to_path(handle, path, &ctx); out_path: diff --git a/include/linux/exportfs.h b/include/linux/exportfs.h index cfb0dd1ea49c7..331d28093b568 100644 --- a/include/linux/exportfs.h +++ b/include/linux/exportfs.h @@ -174,6 +174,18 @@ struct handle_to_path_ctx { /* * Filesystems use only lower 8 bits of file_handle type for fid_type. + */ +#define FILEID_HANDLE_TYPE_MASK 0xff + +/* + * vfs uses bits 8..15 of @fh_type arg of fh_to_dentry/fh_to_parent methods + * as misc. flags to pass to filesystems. + */ +#define EXPORT_FH_CACHED 0x100 /* Non-blocking encode/decode */ +#define FILEID_FS_FLAGS_MASK 0xff00 +#define FILEID_FS_FLAGS(flags) ((flags) & FILEID_FS_FLAGS_MASK) + +/* * name_to_handle_at() uses upper 16 bits of type as user flags to be * interpreted by open_by_handle_at(). */ Thanks, Amir.