On Wed, Apr 9, 2025 at 1:36 PM Christian Brauner <brauner@xxxxxxxxxx> wrote: > > Ok, I see the bug. It's caused by pecularity in systemd that a specific > implementation detail of mount_setattr() papered over. > > Basically, I added a shortcut to mount_settatr(): > > /* Don't bother walking through the mounts if this is a nop. */ > if (attr.attr_set == 0 && > attr.attr_clr == 0 && > attr.propagation == 0) > return 0; > > So that we: > > * don't pointlessly do path lookup > * don't pointlessly walk the mount tree and hold the namespace semaphore etc. > > When I added copy_mount_setattr() this cycle this optimization got > broken because I moved it into this helper and we now do path lookup and > walk the mount tree even if there's no mount properties to change at > all. > > That's just a performance thing, not a correctness thing though. > > systemd has the following code: > > int make_fsmount( > int error_log_level, > const char *what, > const char *type, > unsigned long flags, > const char *options, > int userns_fd) { > > <snip> > > mnt_fd = fsmount(fs_fd, FSMOUNT_CLOEXEC, 0); > if (mnt_fd < 0) > return log_full_errno(error_log_level, errno, "Failed to create mount fd for \"%s\" (\"%s\"): %m", what, type); > > if (mount_setattr(mnt_fd, "", AT_EMPTY_PATH|AT_RECURSIVE, > &(struct mount_attr) { > .attr_set = ms_flags_to_mount_attr(f) | (userns_fd >= 0 ? MOUNT_ATTR_IDMAP : 0), > .userns_fd = userns_fd, > }, MOUNT_ATTR_SIZE_VER0) < 0) > > <snip> > > So if userns_fd is greater or equal than zero MOUNT_ATTR_IDMAP will be > raised otherwise not. > > Later in the code we find this function used in nspawn during: > > static int get_fuse_version(uint32_t *ret_major, uint32_t *ret_minor) { > > <snip> > /* Get a FUSE handle. */ > fuse_fd = open("/dev/fuse", O_CLOEXEC|O_RDWR); > <snip> > mnt_fd = make_fsmount(LOG_DEBUG, "nspawn-fuse", "fuse.nspawn", 0, opts, -EBADF); > > This will cause the aforementioned mount_setattr() call to be called > with: > > if (mount_setattr(mnt_fd, "", AT_EMPTY_PATH|AT_RECURSIVE, > &(struct mount_attr) { > .attr_set = 0, > .userns_fd = -EBADF, > }, MOUNT_ATTR_SIZE_VER0) < 0) > > This means: > > attr_set == 0 && attr_clear == 0 and propagation == 0 and we'd thus > never trigger a path lookup on older kernels. But now we do thus causing > the hang. > > I've restored the old behavior. Can you please test?: > > https://web.git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git/log/?h=work.mount.fixes Thanks, I can confirm systemd-nspawn working as expected on the kernel built from the branch work.mount.fixes. Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@xxxxxxxxx> -- Best Regards, Mike Gavrilov.
Attachment:
6.15.0-rc1-work.mount.fixes.zip
Description: Zip archive