On Sat, May 31, 2025 at 09:57:00PM +0100, Al Viro wrote:
> One possibility is to wrap the use of __lookup_mnt() into a sample-and-recheck
> loop there; for the call of path_overmounted() in finish_automount() it'll
> give the right behaviour.

OK, that's definitely the right thing to do, whatever we end up doing with
the checks in do_move_mount().

So the rules become:

Mount hash lookup (__lookup_mnt()) requires mount_lock - either holding its
spinlock component, or running a seqretry loop on its seqcount component.
If we are not holding the spinlock side of mount_lock, we must be under
rcu_read_lock() at least for the duration of the lookup.

The result is safe to dereference as long as
	1) mount_lock is still held, or
	2) rcu_read_lock() is still held, or
	3) namespace_sem has been held since before the lookup *AND* the
	   parent's refcount remains positive.
That covers only the continued safety of access to the result of the lookup;
the rules above must still have been satisfied for the lookup itself.

Acquiring a reference to the result is safe in cases (1) and (3); in case (2)
it must be done with __legitimize_mnt(result, seq), with seq being a value of
the mount_lock seqcount component sampled *BEFORE* the lookup.

That's pretty close to the rules for the rest of mount tree walking...

Complications wrt namespace_sem come from the dissolving of lazy-umounted
trees; stuck children get detached when the parent's refcount drops to zero.
That happens outside of namespace_sem and I don't see any sane way to change
that.
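
For illustration, a minimal sketch of the sample-and-recheck loop for
path_overmounted(), assuming the usual read_seqbegin()/read_seqretry()
pair on mount_lock and an RCU read-side section around the hash walk;
the exact signature and body are assumptions, not a quote of any actual
patch:

/*
 * Sketch only: redo the hash lookup if the seqcount side of mount_lock
 * changed while we were walking the hash, so a lookup that raced with
 * a topology change is never trusted.
 */
static bool path_overmounted(const struct path *path)
{
	unsigned seq;
	bool overmounted;

	rcu_read_lock();
	do {
		/* sample mount_lock's seqcount before the lookup */
		seq = read_seqbegin(&mount_lock);
		overmounted = __lookup_mnt(path->mnt, path->dentry) != NULL;
		/* ... and retry if it changed under us */
	} while (read_seqretry(&mount_lock, seq));
	rcu_read_unlock();

	return overmounted;
}

The answer is computed inside the loop, so nothing from a discarded
lookup escapes; a yes/no result is all finish_automount() needs here.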
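
As for taking a reference in case (2), a sketch assuming the
legitimize_mnt() wrapper around __legitimize_mnt() and a seq sampled
before the hash walk; the function name below is made up for the
example, but the shape is essentially what lookup_mnt() already does:

/*
 * Sketch of case (2): an RCU-only lookup that wants to keep the result.
 * The reference is taken only if the seqcount sampled *before* the
 * lookup is still current; otherwise the whole thing is retried.
 */
static struct vfsmount *lookup_child_mnt(const struct path *path)
{
	struct mount *child;
	struct vfsmount *m;
	unsigned seq;

	rcu_read_lock();
	do {
		seq = read_seqbegin(&mount_lock);	/* sample first */
		child = __lookup_mnt(path->mnt, path->dentry);
		m = child ? &child->mnt : NULL;
	} while (!legitimize_mnt(m, seq));		/* grab ref or retry */
	rcu_read_unlock();

	return m;
}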