> > > On Wed, Apr 09, 2025 at 12:37:06PM +0200, Christian Brauner wrote:
> > > > On Tue, Apr 08, 2025 at 04:58:34PM -0400, Eric Chanudet wrote:
> > > > > Attempt to re-spin this series based on the feedback received in v3 that
> > > > > pointed out the need to wait the grace-period in namespace_unlock()
> > > > > before calling the deferred mntput().
> > > >
> > > > I still hate this with a passion because it adds another special-sauce
> > > > path into the unlock path. I've folded the following diff into it so it
> > > > at least doesn't start passing that pointless boolean and doesn't
> > > > introduce __namespace_unlock(). Just use a global variable and pick the
> > > > value off of it just as we do with the lists. Testing this now:

My apologies, I went with the feedback from v3[1] but failed to parse the
context surrounding it.

[1] https://lore.kernel.org/all/Znx-WGU5Wx6RaJyD@xxxxxxxxxxxxxxxxxxxx/

> > > > @@ -2094,7 +2088,7 @@ static int do_umount(struct mount *mnt, int flags)
> > > >  	}
> > > >  out:
> > > >  	unlock_mount_hash();
> > > > -	__namespace_unlock(flags & MNT_DETACH);
> > > > +	namespace_unlock();
> > > >  	return retval;
> > > >  }
> > > >

I believe you skipped setting unmounted_lazily in this hunk?

With that fixed, I have applied your patch for the discussion below and
down-thread. Happy to send a v5, should this patch be deemed worth pursuing.

On Wed, Apr 09, 2025 at 04:25:10PM +0200, Sebastian Andrzej Siewior wrote:
> On 2025-04-09 16:02:29 [+0200], Mateusz Guzik wrote:
> > On Wed, Apr 09, 2025 at 03:14:44PM +0200, Sebastian Andrzej Siewior wrote:
> > > One question: Do we need this lazy/ MNT_DETACH case? Couldn't we handle
> > > them all via queue_rcu_work()?
> > > If so, couldn't we have make deferred_free_mounts global and have two
> > > release_list, say release_list and release_list_next_gp? The first one
> > > will be used if queue_rcu_work() returns true, otherwise the second.
> > > Then once defer_free_mounts() is done and release_list_next_gp not
> > > empty, it would move release_list_next_gp -> release_list and invoke
> > > queue_rcu_work().
> > > This would avoid the kmalloc, synchronize_rcu_expedited() and the
> > > special-sauce.
> > >
> >
> > To my understanding it was preferred for non-lazy unmount consumers to
> > wait until the mntput before returning.

Unless I misunderstand the statement, this is a requirement of the user API,
per the previous thread[2].

[2] https://lore.kernel.org/all/Y8m+M%2FffIEEWbfmv@ZenIV/

> > On Wed, Apr 09, 2025 at 03:14:44PM +0200, Sebastian Andrzej Siewior wrote:
> > > On 2025-04-09 12:37:06 [+0200], Christian Brauner wrote:
> > > > diff --git a/fs/namespace.c b/fs/namespace.c
> > > > index e5b0b920dd97..25599428706c 100644
> > > > --- a/fs/namespace.c
> > > > +++ b/fs/namespace.c
> > > > @@ -1840,29 +1842,21 @@ static void __namespace_unlock(bool lazy)
> > > …
> > > > +	d = kmalloc(sizeof(struct deferred_free_mounts), GFP_KERNEL);
> > > > +	if (d) {
> > > > +		hlist_move_list(&head, &d->release_list);
> > > > +		INIT_RCU_WORK(&d->rwork, defer_free_mounts);
> > > > +		queue_rcu_work(system_wq, &d->rwork);
> > >
> > > Couldn't we do system_unbound_wq?

I think we can; AFAICT we don't need locality. I'll run some tests with
system_unbound_wq.
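
To make sure I am reading the folded diff correctly, here is the shape I
have in mind for the deferred path. Only the kmalloc()/queue_rcu_work()
lines are visible in the quote above; the struct layout, the worker body,
free_mounts(), and the synchronize_rcu_expedited() fallback are my
reconstruction, so please shout if it does not match:

struct deferred_free_mounts {
	struct rcu_work rwork;		/* runs after an RCU grace period */
	struct hlist_head release_list;	/* mounts detached under namespace_sem */
};

static void defer_free_mounts(struct work_struct *work)
{
	struct deferred_free_mounts *d;

	d = container_of(to_rcu_work(work), struct deferred_free_mounts,
			 rwork);
	/* A full grace period has elapsed; lockless walkers are done. */
	free_mounts(&d->release_list);	/* assumed helper: mntput() each entry */
	kfree(d);
}

/* Illustrative name: the tail of namespace_unlock() for MNT_DETACH. */
static void defer_release(struct hlist_head *head)
{
	struct deferred_free_mounts *d;

	d = kmalloc(sizeof(*d), GFP_KERNEL);
	if (d) {
		hlist_move_list(head, &d->release_list);
		INIT_RCU_WORK(&d->rwork, defer_free_mounts);
		queue_rcu_work(system_wq, &d->rwork);
		return;
	}
	/* No memory: wait out the grace period synchronously instead. */
	synchronize_rcu_expedited();
	free_mounts(head);
}

If that is right, the user-visible difference for MNT_DETACH is only that
the final mntput() moves off the umount(2) path, which is the point of the
series.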
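
And for the two-list idea, a rough and untested sketch of how I understand
it. I used list_head instead of the hlist so the splice-append is cheap,
and routed every batch through release_list_next_gp so entries always see
a full grace period even if the worker is already running; both choices
are mine, not from your mail:

static LIST_HEAD(release_list);		/* had a full grace period */
static LIST_HEAD(release_list_next_gp);	/* waiting for the next one */
static DEFINE_SPINLOCK(release_lock);
/* Set up once with INIT_RCU_WORK(&deferred_free_mounts, defer_free_mounts). */
static struct rcu_work deferred_free_mounts;

static void defer_free_mounts(struct work_struct *work)
{
	LIST_HEAD(done);

	spin_lock(&release_lock);
	list_splice_init(&release_list, &done);
	spin_unlock(&release_lock);

	free_mounts(&done);		/* assumed helper: mntput() each entry */

	spin_lock(&release_lock);
	if (!list_empty(&release_list_next_gp)) {
		/* Re-arm so these entries get their own grace period. */
		list_splice_init(&release_list_next_gp, &release_list);
		queue_rcu_work(system_unbound_wq, &deferred_free_mounts);
	}
	spin_unlock(&release_lock);
}

/* Called from namespace_unlock() instead of the kmalloc() path. */
static void defer_release(struct list_head *mounts)
{
	spin_lock(&release_lock);
	list_splice_init(mounts, &release_list_next_gp);
	queue_rcu_work(system_unbound_wq, &deferred_free_mounts);
	spin_unlock(&release_lock);
}

The first batch eats an extra grace period that way, but there is no
allocation and no special-sauce left on the unlock path.

Thanks,

-- 
Eric Chanudet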