Re: 6.15-rc1/regression/bisected - commit 474f7825d533 is broke systemd-nspawn on my system

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Apr 08, 2025 at 06:05:23PM +0500, Mikhail Gavrilov wrote:
> On Tue, Apr 8, 2025 at 3:22 PM Christian Brauner <brauner@xxxxxxxxxx> wrote:
> >
> >
> > Resolved it for you:
> >
> > https://web.git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git/log/?h=work.bisect
> 
> I can confirm that systemd-nspawn is working on the kernel built from
> branch work.bisect.
> It confirms the correctness of my bisect.
> 
> > sudo /usr/bin/systemd-nspawn -q --ephemeral -D /var/lib/mock/fedora-rawhide-x86_64/root
> [sudo] password for mikhail:
> [root@root-7a084a9cbe689c8a ~]# uname -r
> 6.15.0-rc1-work.bisect+
> 
> And I attached the full kernel log below.

Ok, I see the bug. It's caused by pecularity in systemd that a specific
implementation detail of mount_setattr() papered over.

Basically, I added a shortcut to mount_settatr():

        /* Don't bother walking through the mounts if this is a nop. */
        if (attr.attr_set == 0 &&
            attr.attr_clr == 0 &&
            attr.propagation == 0)
                return 0;

So that we:

* don't pointlessly do path lookup
* don't pointlessly walk the mount tree and hold the namespace semaphore etc.

When I added copy_mount_setattr() this cycle this optimization got
broken because I moved it into this helper and we now do path lookup and
walk the mount tree even if there's no mount properties to change at
all.

That's just a performance thing, not a correctness thing though.

systemd has the following code:

	int make_fsmount(
	                int error_log_level,
	                const char *what,
	                const char *type,
	                unsigned long flags,
	                const char *options,
	                int userns_fd) {
	
<snip>
	
	        mnt_fd = fsmount(fs_fd, FSMOUNT_CLOEXEC, 0);
	        if (mnt_fd < 0)
	                return log_full_errno(error_log_level, errno, "Failed to create mount fd for \"%s\" (\"%s\"): %m", what, type);
	
	        if (mount_setattr(mnt_fd, "", AT_EMPTY_PATH|AT_RECURSIVE,
	                          &(struct mount_attr) {
	                                  .attr_set = ms_flags_to_mount_attr(f) | (userns_fd >= 0 ? MOUNT_ATTR_IDMAP : 0),
	                                  .userns_fd = userns_fd,
	                          }, MOUNT_ATTR_SIZE_VER0) < 0)
	
<snip>

So if userns_fd is greater or equal than zero MOUNT_ATTR_IDMAP will be
raised otherwise not.

Later in the code we find this function used in nspawn during:

	static int get_fuse_version(uint32_t *ret_major, uint32_t *ret_minor) {

<snip>
	        /* Get a FUSE handle. */
	        fuse_fd = open("/dev/fuse", O_CLOEXEC|O_RDWR);
<snip>
	        mnt_fd = make_fsmount(LOG_DEBUG, "nspawn-fuse", "fuse.nspawn", 0, opts, -EBADF);

This will cause the aforementioned mount_setattr() call to be called
with:

	        if (mount_setattr(mnt_fd, "", AT_EMPTY_PATH|AT_RECURSIVE,
	                          &(struct mount_attr) {
	                                  .attr_set = 0,
	                                  .userns_fd = -EBADF,
	                          }, MOUNT_ATTR_SIZE_VER0) < 0)

This means:

attr_set == 0 && attr_clear == 0 and propagation == 0 and we'd thus
never trigger a path lookup on older kernels. But now we do thus causing
the hang.

I've restored the old behavior. Can you please test?:

https://web.git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git/log/?h=work.mount.fixes




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux