Re: [BUG] propagation graph breakage by MOVE_MOUNT_SET_GROUP move_mount(2)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 5/9/25 16:28, Al Viro wrote:
On Fri, May 09, 2025 at 09:26:28AM +0100, Al Viro wrote:
AFAICS, 9ffb14ef61ba "move_mount: allow to add a mount into an existing
group" breaks assertions on ->mnt_share/->mnt_slave.  For once, the data
structures in question are actually documented.

Documentation/filesystem/sharedsubtree.rst:
         All vfsmounts in a peer group have the same ->mnt_master.  If it is
	non-NULL, they form a contiguous (ordered) segment of slave list.

fs/pnode.c:
  * Note that peer groups form contiguous segments of slave lists.

fs/namespace.c:do_set_group():
         if (IS_MNT_SLAVE(from)) {
                 struct mount *m = from->mnt_master;

                 list_add(&to->mnt_slave, &m->mnt_slave_list);
                 to->mnt_master = m;
         }

         if (IS_MNT_SHARED(from)) {
                 to->mnt_group_id = from->mnt_group_id;
                 list_add(&to->mnt_share, &from->mnt_share);
                 lock_mount_hash();
                 set_mnt_shared(to);
                 unlock_mount_hash();
         }

Note that 'to' goes right after 'from' in ->mnt_share (i.e. peer group
list) and into the beginning of the slave list 'from' belongs to.  IOW,
contiguity gets broken if 'from' is both IS_MNT_SLAVE and IS_MNT_SHARED.
Which is what happens when the peer group 'from' is in gets propagation
from somewhere.

Agreed, list ordering consistency looks broken by my commit.


It's not hard to fix - something like

         if (IS_MNT_SHARED(from)) {
		to->mnt_group_id = from->mnt_group_id;
                 list_add(&to->mnt_share, &from->mnt_share);
		if (IS_MNT_SLAVE(from))
			list_add(&to->mnt_slave, &from->mnt_slave);
		to->mnt_master = from->mnt_master;
                 lock_mount_hash();
                 set_mnt_shared(to);
                 unlock_mount_hash();
         } else if (IS_MNT_SLAVE(from)) {
		to->mnt_master = from->mnt_master;
		list_add(&to->mnt_slave, &from->mnt_master->mnt_slave_list);
	}

ought to do it.

Yes it should work.

In case (IS_MNT_SLAVE(from) && !IS_MNT_SHARED(from)) we can probably also do:

list_add(&to->mnt_slave, &from->mnt_slave);

as next slave after "from" is definitely not from the same shared group with "from" (as it's not in a shared group) so we won't break list continuity.

That will allow to simplify code change to:

        if (IS_MNT_SLAVE(from)) {
                struct mount *m = from->mnt_master;

-                list_add(&to->mnt_slave, &m->mnt_slave_list);
+                list_add(&to->mnt_slave, &from->mnt_slave);
                to->mnt_master = m;
        }

        if (IS_MNT_SHARED(from)) {
                to->mnt_group_id = from->mnt_group_id;
                list_add(&to->mnt_share, &from->mnt_share);
                lock_mount_hash();
                set_mnt_shared(to);
                unlock_mount_hash();
        }

If I'm not missing something (didn't test yet).

I'm nowhere near sufficiently awake right now to put
together a regression test, but unless I'm missing something subtle, it
should be possible to get a fairly obvious breakage of propagate_mnt()
out of that...

I managed to see weird behavior like that:

# rmdir /tmp/{A,B,C,D,E,Z}
# unshare -m
mkdir /tmp/{A,B,C,D,E,Z}
mount --make-rprivate /
mount -t tmpfs tmpfs /tmp/A
mount --bind /tmp/A /tmp/Z
mount --make-shared /tmp/A
mount --bind /tmp/A /tmp/B
mount --make-slave /tmp/B
mount --make-shared /tmp/B
mount --bind /tmp/B /tmp/C
mount --bind /tmp/C /tmp/D
mount --bind /tmp/D /tmp/E
./setgroup-v2 /tmp/C /tmp/Z
mkdir /tmp/A/subdir
mount -t tmpfs tmpfs /tmp/A/subdir

This creates 16 subdir mounts instead of expected 6:

cat /proc/self/mountinfo | grep /tmp/
1071 1065 0:109 / /tmp/A rw,relatime shared:556 - tmpfs tmpfs rw,seclabel,inode64 1073 1065 0:109 / /tmp/Z rw,relatime shared:1040 master:556 - tmpfs tmpfs rw,seclabel,inode64 1076 1065 0:109 / /tmp/B rw,relatime shared:1040 master:556 - tmpfs tmpfs rw,seclabel,inode64 1077 1065 0:109 / /tmp/C rw,relatime shared:1040 master:556 - tmpfs tmpfs rw,seclabel,inode64 1078 1065 0:109 / /tmp/D rw,relatime shared:1040 master:556 - tmpfs tmpfs rw,seclabel,inode64 1079 1065 0:109 / /tmp/E rw,relatime shared:1040 master:556 - tmpfs tmpfs rw,seclabel,inode64 1080 1071 0:136 / /tmp/A/subdir rw,relatime shared:1041 - tmpfs tmpfs rw,seclabel,inode64 1081 1073 0:136 / /tmp/Z/subdir rw,relatime shared:1042 master:1041 - tmpfs tmpfs rw,seclabel,inode64 1082 1078 0:136 / /tmp/D/subdir rw,relatime shared:1042 master:1041 - tmpfs tmpfs rw,seclabel,inode64 1083 1079 0:136 / /tmp/E/subdir rw,relatime shared:1042 master:1041 - tmpfs tmpfs rw,seclabel,inode64 1084 1076 0:136 / /tmp/B/subdir rw,relatime shared:1042 master:1041 - tmpfs tmpfs rw,seclabel,inode64 1085 1077 0:136 / /tmp/C/subdir rw,relatime shared:1042 master:1041 - tmpfs tmpfs rw,seclabel,inode64 1086 1084 0:136 / /tmp/B/subdir rw,relatime shared:1042 master:1041 - tmpfs tmpfs rw,seclabel,inode64 1087 1085 0:136 / /tmp/C/subdir rw,relatime shared:1042 master:1041 - tmpfs tmpfs rw,seclabel,inode64 1088 1081 0:136 / /tmp/Z/subdir rw,relatime shared:1042 master:1041 - tmpfs tmpfs rw,seclabel,inode64 1089 1082 0:136 / /tmp/D/subdir rw,relatime shared:1042 master:1041 - tmpfs tmpfs rw,seclabel,inode64 1090 1083 0:136 / /tmp/E/subdir rw,relatime shared:1042 master:1041 - tmpfs tmpfs rw,seclabel,inode64 1142 1089 0:136 / /tmp/D/subdir rw,relatime shared:1042 master:1041 - tmpfs tmpfs rw,seclabel,inode64 1143 1090 0:136 / /tmp/E/subdir rw,relatime shared:1042 master:1041 - tmpfs tmpfs rw,seclabel,inode64 1144 1086 0:136 / /tmp/B/subdir rw,relatime shared:1042 master:1041 - tmpfs tmpfs rw,seclabel,inode64 1145 1087 0:136 / /tmp/C/subdir rw,relatime shared:1042 master:1041 - tmpfs tmpfs rw,seclabel,inode64 1146 1088 0:136 / /tmp/Z/subdir rw,relatime shared:1042 master:1041 - tmpfs tmpfs rw,seclabel,inode64

Maybe that can be converted to a regression test.


Not sufficiently awake is right - wrong address on Cc...  Anyway, bedtime
for me...



--
Best regards, Pavel Tikhomirov
Senior Software Developer, Virtuozzo.





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux