This is loosely based on the original documentation written by David
Howells and later maintained by Christian Brauner, but has been
rewritten to be more from a user perspective (as well as fixing a few
critical mistakes).

Co-authored-by: David Howells <dhowells@xxxxxxxxxx>
Signed-off-by: David Howells <dhowells@xxxxxxxxxx>
Co-authored-by: Christian Brauner <brauner@xxxxxxxxxx>
Signed-off-by: Christian Brauner <brauner@xxxxxxxxxx>
Signed-off-by: Aleksa Sarai <cyphar@xxxxxxxxxx>
---
 man/man2/fsconfig.2 | 670 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 670 insertions(+)

diff --git a/man/man2/fsconfig.2 b/man/man2/fsconfig.2
new file mode 100644
index 0000000000000000000000000000000000000000..97c9aff0e0c195e6028e1c7bd70e40905ba9f994
--- /dev/null
+++ b/man/man2/fsconfig.2
@@ -0,0 +1,670 @@
+.\" Copyright, the authors of the Linux man-pages project
+.\"
+.\" SPDX-License-Identifier: Linux-man-pages-copyleft
+.\"
+.TH fsconfig 2 (date) "Linux man-pages (unreleased)"
+.SH NAME
+fsconfig \- configure new or existing filesystem context
+.SH LIBRARY
+Standard C library
+.RI ( libc ,\~ \-lc )
+.SH SYNOPSIS
+.nf
+.B #include <sys/mount.h>
+.P
+.BI "int fsconfig(int " fd ", unsigned int " cmd ","
+.BI "             const char *" key ", const void *" value ", int " aux ");"
+.fi
+.SH DESCRIPTION
+The
+.BR fsconfig ()
+system call is part of
+the suite of file descriptor based mount facilities in Linux.
+.P
+.BR fsconfig ()
+is used to supply parameters to
+and issue commands against
+the filesystem configuration context
+associated with the file descriptor
+.IR fd .
+Filesystem configuration contexts can be created with
+.BR fsopen (2)
+or instantiated from an extant filesystem instance with
+.BR fspick (2).
+.P
+The
+.I cmd
+argument indicates the command to be issued.
+Some commands supply parameters to the context
+(equivalent to mount options specified with
+.BR mount (8)),
+while others are meta-operations on the filesystem context.
+The list of valid
+.I cmd
+values is:
+.RS
+.TP
+.B FSCONFIG_SET_FLAG
+Set the flag parameter named by
+.IR key .
+.I value
+must be NULL,
+and
+.I aux
+must be 0.
+.TP
+.B FSCONFIG_SET_STRING
+Set the string parameter named by
+.I key
+to the value specified by
+.IR value .
+.I value
+points to a null-terminated string,
+and
+.I aux
+must be 0.
+.TP
+.B FSCONFIG_SET_BINARY
+Set the blob parameter named by
+.I key
+to the contents of the binary blob
+specified by
+.IR value .
+.I value
+points to
+the start of a buffer
+that is
+.I aux
+bytes in length.
+.TP
+.B FSCONFIG_SET_FD
+Set the file parameter named by
+.I key
+to the open file description
+referenced by the file descriptor
+.IR aux .
+.I value
+must be NULL.
+.IP
+You may also use
+.B \%FSCONFIG_SET_STRING
+for file parameters,
+with
+.I value
+set to a null-terminated string
+containing a base-10 representation
+of the file descriptor number.
+This mechanism is primarily intended for compatibility
+with older
+.BR mount (2)-based
+programs,
+and only works for parameters
+that only accept file descriptor arguments.
+.TP
+.B FSCONFIG_SET_PATH
+Set the path parameter named by
+.I key
+to the object at a provided path,
+resolved in a similar manner to
+.BR openat (2).
+.I value
+points to a null-terminated pathname string,
+and
+.I aux
+is equivalent to the
+.I dirfd
+argument to
+.BR openat (2).
+.IP
+You may also use
+.B \%FSCONFIG_SET_STRING
+for path parameters,
+the behaviour of which is equivalent to
+.B \%FSCONFIG_SET_PATH
+with
+.I aux
+set to
+.BR \%AT_FDCWD .
+.TP
+.B FSCONFIG_SET_PATH_EMPTY
+As with
+.BR \%FSCONFIG_SET_PATH ,
+except that if
+.I value
+is an empty string,
+the file descriptor specified by
+.I aux
+is operated on directly
+and may be any type of file
+(not just a directory).
+This is equivalent to the behaviour of
+.B \%AT_EMPTY_PATH
+with most "*at()" system calls.
+If
+.I aux
+is
+.BR \%AT_FDCWD ,
+the call operates on
+the current working directory of the calling process.
+.TP
+.B FSCONFIG_CMD_CREATE
+This command instructs the filesystem driver
+to create an instance of the filesystem in the kernel
+with the parameters specified in the filesystem configuration context.
+.IR key " and " value
+must be NULL,
+and
+.I aux
+must be 0.
+.IP
+If this operation succeeds,
+the filesystem context
+associated with file descriptor
+.I fd
+now references the created filesystem instance,
+and is placed into a special "awaiting-mount" mode
+that allows you to use
+.BR fsmount (2)
+to create a mount object from the filesystem instance.
+.\" FS_CONTEXT_AWAITING_MOUNT is the term the kernel uses for this.
+.IP
+This is intended for use with filesystem configuration contexts created with
+.BR fsopen (2).
+In order to create a filesystem instance,
+the calling process must have the
+.B \%CAP_SYS_ADMIN
+capability.
+.IP
+Note that the Linux kernel reuses filesystem instances
+for many filesystems,
+so (depending on the filesystem being configured and parameters used)
+it is possible for the filesystem instance "created" by
+.B \%FSCONFIG_CMD_CREATE
+to, in fact, be a reference
+to an existing filesystem instance in the kernel.
+The kernel will attempt to merge the specified parameters
+of this filesystem configuration context
+with those of the filesystem instance being reused,
+but some parameters may be
+.IR "silently ignored" .
+.IP
+Programs that need to ensure
+that they create a new filesystem instance
+with specific parameters
+(notably, security-related parameters
+such as
+.I acl
+to enable POSIX ACLs as described in
+.BR acl (5))
+should use
+.B \%FSCONFIG_CMD_CREATE_EXCL
+instead.
+.TP
+.BR FSCONFIG_CMD_CREATE_EXCL " (since Linux 6.6)"
+.\" commit 22ed7ecdaefe0cac0c6e6295e83048af60435b13
+.\" commit 84ab1277ce5a90a8d1f377707d662ac43cc0918a
+As with
+.BR \%FSCONFIG_CMD_CREATE ,
+except that the kernel is instructed
+to create a new filesystem instance
+("superblock" in kernel-developer parlance)
+rather than reusing an existing one.
+.IP
+If this is not possible
+(such as with disk-backed filesystems
+where multiple filesystem instances
+using the same filesystem driver
+and writing to the same underlying device
+could result in data corruption),
+this operation will fail with
+an
+.B EBUSY
+error.
+.IP
+As a result (unlike
+.BR \%FSCONFIG_CMD_CREATE ),
+if this command succeeds
+then the calling process can be sure that
+all of the parameters successfully configured with
+.BR fsconfig ()
+will actually be applied
+to the created filesystem instance.
+.TP
+.B FSCONFIG_CMD_RECONFIGURE
+This command instructs the filesystem driver
+to apply the parameters specified in the filesystem configuration context
+to the extant filesystem instance
+referenced by the filesystem configuration context.
+.IP
+This is primarily intended for use with
+.BR fspick (2),
+but may also be used to modify the parameters of a filesystem instance after
+.B \%FSCONFIG_CMD_CREATE
+was used to create it
+and a mount object was created using
+.BR fsmount (2).
+In order to reconfigure an extant filesystem instance,
+the calling process must have the
+.B CAP_SYS_ADMIN
+capability.
+.IP
+Once this operation succeeds, the filesystem context is reset
+but remains in reconfiguration mode
+and thus can be used for subsequent
+.B \%FSCONFIG_CMD_RECONFIGURE
+commands.
+.RE
+.P
+Parameters specified with
+.BI FSCONFIG_SET_ *
+do not take effect
+until the corresponding
+.B \%FSCONFIG_CMD_CREATE
+or
+.B \%FSCONFIG_CMD_RECONFIGURE
+command is issued.
+.SH RETURN VALUE
+On success,
+.BR fsconfig ()
+returns 0.
+On error, \-1 is returned, and
+.I errno
+is set to indicate the error.
+.SH ERRORS
+If an error occurs, the filesystem driver may provide
+additional information about the error
+through the message retrieval interface for filesystem configuration contexts.
+This additional information can be retrieved at any time by calling
+.BR read (2)
+on the filesystem instance or filesystem configuration context
+referenced by the file descriptor
+.IR fd .
+(See the "Message retrieval interface" subsection in
+.BR fsopen (2)
+for more details on the message format.)
+.P
+Even after an error occurs,
+the filesystem configuration context is
+.I not
+invalidated,
+and thus can still be used with other
+.BR fsconfig ()
+commands.
+This means that users can probe support for filesystem parameters
+on a per-parameter basis,
+and adjust which parameters they wish to set.
+.P
+The error values given below result from
+filesystem type independent errors.
+Each filesystem type may have its own special errors
+and its own special behavior.
+See the Linux kernel source code for details.
+.TP
+.B EACCES
+A component of a path
+provided as a path parameter
+was not searchable.
+(See also
+.BR path_resolution (7).)
+.TP
+.B EACCES
+.B \%FSCONFIG_CMD_CREATE
+was attempted
+for a read-only filesystem
+without specifying the
+.RB ' ro '
+flag parameter.
+.TP
+.B EACCES
+A specified block device parameter
+is located on a filesystem
+mounted with the
+.B \%MS_NODEV
+option.
+.TP
+.B EBADF
+The file descriptor given by
+.I fd
+(or possibly by
+.IR aux ,
+depending on the command)
+is invalid.
+.TP
+.B EBUSY
+The filesystem context associated with
+.I fd
+is in the wrong state
+for the given command.
+.TP
+.B EBUSY
+The filesystem instance cannot be reconfigured as read-only
+with
+.B \%FSCONFIG_CMD_RECONFIGURE
+because some programs
+still hold files open for writing.
+.TP
+.B EBUSY
+A new filesystem instance was requested with
+.B \%FSCONFIG_CMD_CREATE_EXCL
+but a matching superblock already existed.
+.TP
+.B EFAULT
+One of the pointer arguments
+points to a location
+outside the calling process's accessible address space.
+.TP
+.B EINVAL
+.I fd
+does not refer to
+a filesystem configuration context
+or filesystem instance.
+.TP
+.B EINVAL
+One of the values of
+.IR key ,
+.IR value ,
+and/or
+.I aux
+was set to a non-zero value when
+.I cmd
+required that they be zero
+(or NULL).
+.TP
+.B EINVAL
+The parameter named by
+.I key
+cannot be set
+using the type specified with
+.IR cmd .
+.TP
+.B EINVAL
+One of the source parameters
+referred to
+an invalid superblock.
+.TP
+.B ELOOP
+Too many links encountered
+during pathname resolution
+of a path argument.
+.TP
+.B ENAMETOOLONG
+A path argument was longer than
+.BR PATH_MAX .
+.TP
+.B ENOENT
+A path argument had a non-existent component.
+.TP
+.B ENOENT
+A path argument is an empty string,
+but
+.I cmd
+is not
+.BR \%FSCONFIG_SET_PATH_EMPTY .
+.TP
+.B ENOMEM
+The kernel could not allocate sufficient memory to complete the operation.
+.TP
+.B ENOTBLK
+The parameter named by
+.I key
+must be a block device,
+but the provided parameter value was not a block device.
+.TP
+.B ENOTDIR
+A component of the path prefix
+of a path argument
+was not a directory.
+.TP
+.B ENXIO
+The major number
+of a block device parameter
+is out of range.
+.TP
+.B EOPNOTSUPP
+The command given by
+.I cmd
+is not valid.
+.TP
+.B EPERM
+The command given by
+.I cmd
+was
+.BR \%FSCONFIG_CMD_CREATE ,
+.BR \%FSCONFIG_CMD_CREATE_EXCL ,
+or
+.BR \%FSCONFIG_CMD_RECONFIGURE ,
+but the calling process does not have the required
+.B \%CAP_SYS_ADMIN
+capability.
+.SH STANDARDS
+Linux.
+.SH HISTORY
+Linux 5.2.
+.\" commit ecdab150fddb42fe6a739335257949220033b782
+.\" commit 400913252d09f9cfb8cce33daee43167921fc343
+glibc 2.36.
+.SH NOTES
+.SS Generic filesystem parameters
+Each filesystem driver is responsible for
+parsing most parameters specified with
+.BR fsconfig (),
+meaning that individual filesystems
+may have very different behaviour
+when encountering parameters with the same name.
+In general,
+you should not assume that the behaviour of
+.BR fsconfig ()
+when specifying a parameter to one filesystem type
+will match the behaviour of the same parameter
+with a different filesystem type.
+.P
+However,
+the following generic parameters
+apply to all filesystems and have unified behaviour.
+They are set using the listed
+.BI \%FSCONFIG_SET_ *
+command.
+.TP
+\fIro\fP and \fIrw\fP (\fB\%FSCONFIG_SET_FLAG\fP)
+Configure whether the filesystem instance is read-only.
+.TP
+\fIdirsync\fP (\fB\%FSCONFIG_SET_FLAG\fP)
+Make directory changes on this filesystem instance synchronous.
+.TP
+\fIsync\fP and \fIasync\fP (\fB\%FSCONFIG_SET_FLAG\fP)
+Configure whether writes on this filesystem instance
+will be made synchronous
+(as though the
+.B O_SYNC
+flag to
+.BR open (2)
+was specified for
+all file opens in this filesystem instance).
+.TP
+\fIlazytime\fP and \fInolazytime\fP (\fB\%FSCONFIG_SET_FLAG\fP)
+Configure whether to reduce on-disk updates
+of inode timestamps on this filesystem instance
+(as described in the
+.B \%MS_LAZYTIME
+section of
+.BR mount (2)).
+.TP
+\fImand\fP and \fInomand\fP (\fB\%FSCONFIG_SET_FLAG\fP)
+Configure whether the filesystem instance should permit mandatory locking.
+Since Linux 5.15,
+.\" commit f7e33bdbd6d1bdf9c3df8bba5abcf3399f957ac3
+mandatory locking has been deprecated
+and setting this flag is a no-op.
+.TP
+\fIsource\fP (\fB\%FSCONFIG_SET_STRING\fP)
+This parameter is equivalent to the
+.I source
+parameter passed to
+.BR mount (2)
+for the same filesystem type,
+and is usually the pathname of a block device
+containing the filesystem.
+This parameter may only be set once
+per filesystem configuration context transaction.
+.IP
+Note that individual filesystem drivers
+may choose to additionally permit other
+.BI FSCONFIG_SET_ *
+commands to be used with
+.IR source ,
+.\" (i.e., fc->ops->parse_param() parses "source")
+but setting this parameter with
+.B FSCONFIG_SET_STRING
+is always allowed.
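As a minimal sketch of how the generic parameters above might be set, the helper below sets source with FSCONFIG_SET_STRING and then either ro or rw with FSCONFIG_SET_FLAG. The names sys_fsconfig() and set_generic_params() are hypothetical and not part of any library. Calling the raw system call through syscall(2) is an assumption made so that the sketch also builds against a C library older than glibc 2.36; with a newer glibc, fsconfig() from <sys/mount.h> can be called directly.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <linux/mount.h>    /* FSCONFIG_* constants (kernel headers >= 5.2) */
#include <sys/syscall.h>    /* SYS_fsconfig */
#include <unistd.h>         /* syscall() */

/* Hypothetical thin wrapper around the raw fsconfig system call. */
static int sys_fsconfig(int fd, unsigned int cmd, const char *key,
                        const void *value, int aux)
{
    return (int) syscall(SYS_fsconfig, fd, cmd, key, value, aux);
}

/*
 * Hypothetical helper: configure the generic "source" parameter and
 * the generic read-only/read-write flag on a filesystem context.
 * Returns 0 on success, or -1 with errno set by the failing call.
 */
static int set_generic_params(int fsfd, const char *source, int read_only)
{
    /* "source" may always be set with FSCONFIG_SET_STRING; aux must be 0. */
    if (sys_fsconfig(fsfd, FSCONFIG_SET_STRING, "source", source, 0) < 0)
        return -1;

    /* Flag parameters take a NULL value and an aux of 0. */
    if (sys_fsconfig(fsfd, FSCONFIG_SET_FLAG,
                     read_only ? "ro" : "rw", NULL, 0) < 0)
        return -1;

    return 0;
}
```

Neither call takes effect until FSCONFIG_CMD_CREATE or FSCONFIG_CMD_RECONFIGURE is issued on the same context.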
+.P
+In addition,
+any filesystem parameters associated with
+Linux Security Modules (LSMs)
+are also generic with respect to the underlying filesystem.
+See the documentation for the LSM you wish to configure for more details.
+.SH CAVEATS
+.SS Filesystem parameter types
+As a result of
+each filesystem driver being responsible for
+parsing most parameters specified with
+.BR fsconfig (),
+some filesystem parameters
+may have unintuitive behaviour
+with regard to which
+.BI \%FSCONFIG_SET_ *
+commands are permitted
+to configure a given parameter.
+.P
+In order for
+filesystem parameters to be backwards compatible with
+.BR mount (2),
+they must be parseable as strings;
+this almost universally means that
+.B \%FSCONFIG_SET_STRING
+can also be used to configure them.
+.\" Aleksa Sarai
+.\" Theoretically, a filesystem could check fc->oldapi and refuse
+.\" FSCONFIG_SET_STRING if the operation is coming from the new API, but no
+.\" filesystems do this (and probably never will).
+However, other
+.BI \%FSCONFIG_SET_ *
+commands need to be opted into
+by each filesystem driver's parameter parser.
+.P
+One of the most user-visible instances of
+this inconsistency is that
+many filesystems do not support
+configuring path parameters with
+.B \%FSCONFIG_SET_PATH
+(despite the name),
+which can lead to somewhat confusing
+.B EINVAL
+errors.
+(For example, the generic
+.I source
+parameter\[em]which is usually a path\[em]can only be configured
+with
+.BR \%FSCONFIG_SET_STRING .)
+.P
+When writing programs that use
+.BR fsconfig ()
+to configure parameters
+with commands other than
+.BR \%FSCONFIG_SET_STRING ,
+users should verify
+that the
+.BI \%FSCONFIG_SET_ *
+commands used to configure each parameter
+are supported by the corresponding filesystem driver.
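One way to follow the advice above is to probe at run time. This works because a failed fsconfig() call does not invalidate the context. The sketch below tries FSCONFIG_SET_PATH first and retries with FSCONFIG_SET_STRING if the driver rejects the command type with EINVAL. The names sys_fsconfig() and set_path_param() are hypothetical, and the raw syscall(2) wrapper is an assumption made so that the sketch builds against a C library older than glibc 2.36. Note that EINVAL has other causes as well, so this fallback is a heuristic rather than a precise test.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>          /* AT_FDCWD */
#include <linux/mount.h>    /* FSCONFIG_* constants (kernel headers >= 5.2) */
#include <sys/syscall.h>    /* SYS_fsconfig */
#include <unistd.h>         /* syscall() */

/* Hypothetical thin wrapper around the raw fsconfig system call. */
static int sys_fsconfig(int fd, unsigned int cmd, const char *key,
                        const void *value, int aux)
{
    return (int) syscall(SYS_fsconfig, fd, cmd, key, value, aux);
}

/*
 * Hypothetical helper: set a path-like parameter, preferring
 * FSCONFIG_SET_PATH and falling back to FSCONFIG_SET_STRING when the
 * first attempt fails with EINVAL.
 * Returns 0 on success, or -1 with errno set by the last call made.
 */
static int set_path_param(int fsfd, const char *key, const char *path)
{
    /* aux plays the role of the dirfd argument to openat(2). */
    if (sys_fsconfig(fsfd, FSCONFIG_SET_PATH, key, path, AT_FDCWD) == 0)
        return 0;

    /* Any error other than EINVAL is reported to the caller as-is. */
    if (errno != EINVAL)
        return -1;

    /* String form: resolved as if aux were AT_FDCWD; aux must be 0. */
    return sys_fsconfig(fsfd, FSCONFIG_SET_STRING, key, path, 0);
}
```

A caller would typically invoke set_path_param() for each path parameter before issuing FSCONFIG_CMD_CREATE, and call read(2) on the context file descriptor after a failure to collect any message the driver logged.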
+.\" Aleksa Sarai
+.\" While this (quite confusing) inconsistency in behaviour is true today
+.\" (and has been true since this was merged), this appears to mostly be an
+.\" unintended consequence of filesystem drivers hand-coding fsparam parsing.
+.\" Path parameters are the most egregious causes of confusion. Hopefully we
+.\" can make this no longer the case in a future kernel.
+.SH EXAMPLES
+To illustrate the different kinds of parameters that can be configured with
+.BR fsconfig (),
+here are a few examples of some different filesystems being created:
+.P
+.in +4n
+.EX
+int fsfd, mntfd;
+\&
+fsfd = fsopen("tmpfs", FSOPEN_CLOEXEC);
+fsconfig(fsfd, FSCONFIG_SET_FLAG, "inode64", NULL, 0);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "uid", "1234", 0);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "huge", "never", 0);
+fsconfig(fsfd, FSCONFIG_SET_FLAG, "casefold", NULL, 0);
+fsconfig(fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
+mntfd = fsmount(fsfd, FSMOUNT_CLOEXEC, MOUNT_ATTR_NOEXEC);
+move_mount(mntfd, "", AT_FDCWD, "/tmp", MOVE_MOUNT_F_EMPTY_PATH);
+\&
+fsfd = fsopen("erofs", FSOPEN_CLOEXEC);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "source", "/dev/loop0", 0);
+fsconfig(fsfd, FSCONFIG_SET_FLAG, "acl", NULL, 0);
+fsconfig(fsfd, FSCONFIG_SET_FLAG, "user_xattr", NULL, 0);
+fsconfig(fsfd, FSCONFIG_CMD_CREATE_EXCL, NULL, NULL, 0);
+mntfd = fsmount(fsfd, FSMOUNT_CLOEXEC, MOUNT_ATTR_NOSUID);
+move_mount(mntfd, "", AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH);
+.EE
+.in
+.P
+Usually,
+specifying the same parameter named by
+.I key
+multiple times with
+.BR fsconfig ()
+causes the parameter value to be replaced.
+However, some filesystems can have unique behaviour:
+.P
+.in +4n
+.EX
+\&
+int fsfd, mntfd;
+int lowerdirfd = open("/o/ctr/lower1", O_DIRECTORY | O_CLOEXEC);
+\&
+fsfd = fsopen("overlay", FSOPEN_CLOEXEC);
+/* "lowerdir+" appends to the lower dir stack each time. */
+fsconfig(fsfd, FSCONFIG_SET_FD, "lowerdir+", NULL, lowerdirfd);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "lowerdir+", "/o/ctr/lower2", 0);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "lowerdir+", "/o/ctr/lower3", 0);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "lowerdir+", "/o/ctr/lower4", 0);
+.\" fsconfig(fsfd, FSCONFIG_SET_PATH, "lowerdir+", "/o/ctr/lower5", AT_FDCWD);
+.\" fsconfig(fsfd, FSCONFIG_SET_PATH_EMPTY, "lowerdir+", "", lowerdirfd);
+.\" Aleksa Sarai: Hopefully these will also be supported in the future.
+fsconfig(fsfd, FSCONFIG_SET_STRING, "xino", "auto", 0);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "nfs_export", "off", 0);
+fsconfig(fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
+mntfd = fsmount(fsfd, FSMOUNT_CLOEXEC, 0);
+move_mount(mntfd, "", AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH);
+.EE
+.in
+.P
+And here is an example of how
+.BR fspick (2)
+can be used with
+.BR fsconfig ()
+to reconfigure the parameters
+of an extant filesystem instance
+attached to
+.IR /proc :
+.P
+.in +4n
+.EX
+int fsfd = fspick(AT_FDCWD, "/proc", FSPICK_CLOEXEC);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "hidepid", "ptraceable", 0);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "subset", "pid", 0);
+fsconfig(fsfd, FSCONFIG_CMD_RECONFIGURE, NULL, NULL, 0);
+.EE
+.in
+.SH SEE ALSO
+.BR fsmount (2),
+.BR fsopen (2),
+.BR fspick (2),
+.BR mount (2),
+.BR mount_setattr (2),
+.BR move_mount (2),
+.BR open_tree (2),
+.BR mount_namespaces (7)
+
--
2.50.1