On Tue, Apr 08, 2025 at 04:58:34PM -0400, Eric Chanudet wrote:
> Defer releasing the detached file-system when calling namespace_unlock()
> during a lazy umount to return faster.
>
> When requesting MNT_DETACH, the caller does not expect the file-system
> to be shut down upon returning from the syscall.

Not quite.  Sure, there might be another process pinning a filesystem;
in that case umount -l simply removes it from mount tree, drops the
reference and goes away.  However, we need to worry about the following
case:
	umount -l has succeeded
	<several minutes later>
	shutdown -r now
	<apparently clean shutdown, with all processes killed just fine>
	<reboot>
	WTF do we have a bunch of dirty local filesystems?  Where has the data gone?

Think what happens if you have e.g. a subtree with several local
filesystems mounted in it, along with an NFS on a slow server.  Or
a filesystem with shitloads of dirty data in cache, for that matter.

Your async helper is busy in the middle of shutting a filesystem down,
with several more still in the list of mounts to drop.  With no indication
for anyone and anything that something's going on.

umount -l MAY leave filesystem still active; you can't e.g. do it and pull
a USB stick out as soon as it finishes, etc.  After all, somebody might've
opened a file on it just as you called umount(2); that's expected behaviour.
It's not fully async, though - having unobservable fs shutdown going on
with no way to tell that it's not over yet is not a good thing.

Cost of synchronize_rcu_expedited() is an issue, all right, and it does
feel like an excessively blunt tool, but that's a separate story.  Your
test does not measure that, though - you have fs shutdown mixed with
the cost of synchronize_rcu_expedited(), with no way to tell how much
does each of those cost.

Could you do
	mount -t tmpfs tmpfs mnt; sleep 60 > mnt/foo &
followed by
	umount -l mnt
to see where the costs are?
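
Something along these lines might do for the measurement; it is only a rough bash sketch of the steps above, not a tested benchmark.  The function names (elapsed_ms, measure_lazy_umount) are made up for illustration, it assumes GNU date and sleep, and it has to be run as root.  The open file keeps the tmpfs alive past the umount, so the time reported should cover the detach path (including synchronize_rcu_expedited()) without any fs shutdown mixed in.

```shell
# Sketch only: time "umount -l" on a tmpfs that is pinned by an open file,
# so no filesystem shutdown can happen before umount returns.
# Function names are illustrative, not taken from any existing test suite.

# Convert two nanosecond timestamps (start, end) into elapsed milliseconds.
elapsed_ms() {
    echo $(( ($2 - $1) / 1000000 ))
}

# Mount a tmpfs on $1, pin it with a background writer, time the lazy umount.
measure_lazy_umount() {
    local mnt=$1 start end pid
    mkdir -p "$mnt" || return 1
    mount -t tmpfs tmpfs "$mnt" || return 1

    sleep 60 > "$mnt/foo" &          # the open file pins the filesystem
    pid=$!
    # Wait until the background job has actually opened (created) the file.
    while [ ! -e "$mnt/foo" ]; do sleep 0.1; done

    start=$(date +%s%N)              # GNU date: nanoseconds since the epoch
    umount -l "$mnt"
    end=$(date +%s%N)
    echo "umount -l took $(elapsed_ms "$start" "$end") ms"

    # Dropping the last reference lets the filesystem shut down afterwards.
    kill "$pid" 2>/dev/null
    wait "$pid" 2>/dev/null
    return 0
}
```

Sourcing the file only defines the functions; calling "measure_lazy_umount mnt" as root does the actual run.  Comparing the result with the same sequence minus the sleep would give a rough idea of how much of the total is fs shutdown.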