On Wed, Apr 09, 2025 at 03:04:43PM +0200, Mateusz Guzik wrote:
> On Tue, Apr 08, 2025 at 04:58:34PM -0400, Eric Chanudet wrote:
> > Defer releasing the detached file-system when calling namespace_unlock()
> > during a lazy umount to return faster.
> >
> > When requesting MNT_DETACH, the caller does not expect the file-system
> > to be shut down upon returning from the syscall. Calling
> > synchronize_rcu_expedited() has a significant cost on RT kernel that
> > defaults to rcupdate.rcu_normal_after_boot=1. Queue the detached struct
> > mount in a separate list and put it on a workqueue to run post RCU
> > grace-period.
> >
> > w/o patch, 6.15-rc1 PREEMPT_RT:
> > perf stat -r 10 --null --pre 'mount -t tmpfs tmpfs mnt' -- umount mnt
> >     0.02455 +- 0.00107 seconds time elapsed  ( +- 4.36% )
> > perf stat -r 10 --null --pre 'mount -t tmpfs tmpfs mnt' -- umount -l mnt
> >     0.02555 +- 0.00114 seconds time elapsed  ( +- 4.46% )
> >
> > w/ patch, 6.15-rc1 PREEMPT_RT:
> > perf stat -r 10 --null --pre 'mount -t tmpfs tmpfs mnt' -- umount mnt
> >     0.026311 +- 0.000869 seconds time elapsed  ( +- 3.30% )
> > perf stat -r 10 --null --pre 'mount -t tmpfs tmpfs mnt' -- umount -l mnt
> >     0.003194 +- 0.000160 seconds time elapsed  ( +- 5.01% )
> >
>
> Christian wants the patch done differently and posted his diff, so I'm
> not going to comment on it.
>
> I do have some feedback about the commit message though.
>
> In v1 it points out a real user which runs into it, while this one does
> not. So I would rewrite this and put in bench results from the actual
> consumer -- as it is one is left to wonder why patching up lazy unmount
> is of any significance.

Certainly.

Doing the test mentioned in v1 again with v4+Christian's suggested
changes:
- QEMU x86_64, 8cpus, PREEMPT_RT, w/o patch:
# perf stat -r 10 --table --null -- crun run test
    0.07584 +- 0.00440 seconds time elapsed  ( +- 5.80% )
- QEMU x86_64, 8cpus, PREEMPT_RT, w/ patch:
# perf stat -r 10 --table --null -- crun run test
    0.01421 +- 0.00387 seconds time elapsed  ( +- 27.26% )

I will add that to the commit message.

> I had to look up what rcupdate.rcu_normal_after_boot=1 is. Docs claim it
> makes everyone use normal grace-periods, which explains the difference.
> But without that one is left to wonder if perhaps there is a perf bug in
> RCU instead where this is taking longer than it should despite the
> option. Thus I would also denote how the delay shows up.

I tried the test above while trying to force expedited RCU on the
cmdline with:
    rcupdate.rcu_normal_after_boot=0 rcupdate.rcu_expedited=1

Unfortunately, rcupdate.rcu_normal_after_boot=0 has no effect and
rcupdate_announce_bootup_oddness() reports:
[    0.015251] No expedited grace period (rcu_normal_after_boot).

Which yielded similar results:
- QEMU x86_64, 8cpus, PREEMPT_RT, w/o patch:
# perf stat -r 10 --table --null -- crun run test
    0.07838 +- 0.00322 seconds time elapsed  ( +- 4.11% )
- QEMU x86_64, 8cpus, PREEMPT_RT, w/ patch:
# perf stat -r 10 --table --null -- crun run test
    0.01582 +- 0.00353 seconds time elapsed  ( +- 22.30% )

I don't think rcupdate.rcu_expedited=1 had an effect, but I have not
confirmed that yet.

> v1 for reference:
> > v1: https://lore.kernel.org/all/20230119205521.497401-1-echanude@xxxxxxxxxx/

-- 
Eric Chanudet