Re: [RFC][CFT][PATCH] Rewrite of propagate_umount() (was Re: [BUG] propagate_umount() breakage)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, May 19, 2025 at 11:11:10AM -0700, Linus Torvalds wrote:
> Another thing that is either purely syntactic, or shows that I
> *really* don't understand your patch. Why do you do this odd thing:
> 
>         // reduce the set until it's non-shifting
>         for (m = first_candidate(); m; m = trim_one(m))
>                 ;
> 
> which seems to just walk the candidates list in a very non-obvious
> manner (this is one of those "I had to go back and forth to see what
> first_candidate() did and what lists it used" cases).
> 
> It *seems* to be the same as
> 
>         list_for_each_entry_safe(m, tmp, &candidates, mnt_umounting)
>                 trim_one(m);
> 
> because if I read that code right, 'trim_one()' will just always
> return the next entry in that candidate list.

	list_for_each_entry_safe() is safe wrt removal of the _current_
element.  Suppose the list is <A, B, C, D> and trim_one(A) wants to
take out A, B and D.  What we want is to have it followed by trim_one(C),
but how could *anything* outside of trim_one() get to C?

	list_for_each_entry_safe() would pick B as the next entry
to consider, which will end up with looping in B ever after.

	What trim_one(A) does instead is
* note that A itself needs to be removed
* remove B and D
* remember who currently follows A (the list is down to <A, C>, so C it is)
* remove A
* return C.

	We could have a separate list and have trim_one(m) move m to in
case it still looks like a valid candidate.  Then the loop would turn
into
	while !empty(candidates)
		trim_one(first candidate)
	move the second list over to candidates (or just use it on the
next stage instead of candidates)
What's slightly confusing about that variant is that m might survive
trim_one(m), only to be taken out by subsequent call of trim_one() for
another mount.  So remove_candidate(p) would become "remove p from
candidates or from the set of seen candidates, whichever p currently
belongs to"...

	Another variant would be to steal one more bit from mnt_flags,
set it for all candidates when collecting them, have is_candidate() check
that instead of list_empty(&m->mnt_umounting) and clean it where this
variant removes from the list; trim_one() would immediately return if
bit is not set.  Then we could really do list_for_each_entry_safe(),
with another loop doing list removals afterwards.  Extra work that way,
though, and I still think it's more confusing...
	OTOH, "steal one more bit" would allow to get rid of
->mnt_umounting - we could use ->mnt_list for all these sets.  IIRC,
the reason for ->mnt_umounting was that we used to maintain ->mnt_list
for everything in a namespace, and having to restore it if we decide
that candidate has to stay alive would be painful.  These days we
use rbtree for iterators, so ->mnt_list on those suckers is fair
game...

	Re globals - the other bunch is part of propagate_mnt(), not
propagate_umount()...  IIRC, at some point I got fed up by arseloads
of arguments passed from propagate_umount() down to the helpers and
went "fuck it, we have a hard dependency on global serialization anyway,
might as well make all those arguments global" (and I remember trying
to go with per-namespace locks; stuff of nightmares, that)...

	I'll look into gathering that stuff into a single structure
passed by the callers, but the whole thing (especially on the umount
side) really depends upon not running into another instance working
at the same time.  Trying to cope with duelling umount propagations
from two threads... the data structures would be the least of headache
sources there.




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux