On Tue, Jun 03, 2025 at 04:48:08PM +0200, Michal Hocko wrote:
> On Tue 03-06-25 22:22:46, Baolin Wang wrote:
> > Let me try to clarify further.
> >
> > The 'mm->rss_stat' is updated via add_mm_counter() and
> > dec/inc_mm_counter(), which are all wrappers around
> > percpu_counter_add_batch(). percpu_counter_add_batch() uses per-CPU
> > batch caching to avoid 'fbc->lock' contention.
>
> OK, this is exactly the line of argument I was looking for. If _all_
> updates done in the kernel use batching, and the lock is therefore only
> taken every N (percpu_counter_batch) updates, then the risk of lock
> contention is reduced. This is worth a note in the changelog.
>
> > This patch changes task_mem() and task_statm() to read the accurate mm
> > counters under 'fbc->lock', but this will not exacerbate kernel
> > 'mm->rss_stat' lock contention, due to the percpu batch caching of the
> > mm counters.
> >
> > You might argue that my test cases cannot demonstrate an actual lock
> > contention, but they have already shown that there is no significant
> > 'fbc->lock' contention when the kernel updates 'mm->rss_stat'.
>
> I was arguing that `top -d 1' doesn't really represent a potentially
> adverse usage. These proc files are generally readable, so I would
> expect something like a busy-loop read while the process tries to
> update the counters, to see the worst-case scenario. If the impact is
> barely visible there, we can conclude that normal use wouldn't even
> notice.

Baolin, please run a stress-ng command that stresses minor anonymous page
faults in multiple threads, and then run multiple bash scripts that cat
/proc/$(pidof stress-ng)/status in a loop. That should show how much the
stress-ng process is impacted by the parallel status readers versus without
them.
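
For anyone following along, below is a rough userspace model of the batching
behaviour being relied on above. It is only an illustrative sketch:
counter_add(), BATCH, and the pthread mutex are stand-ins for
percpu_counter_add_batch(), percpu_counter_batch, and 'fbc->lock', not the
kernel implementation itself.

	/*
	 * Userspace model of per-CPU counter batching (illustration only,
	 * not the kernel's percpu_counter code): each thread accumulates
	 * updates locally and only takes the shared lock once the local
	 * delta reaches BATCH, so the lock is taken roughly once per
	 * BATCH updates instead of on every update.
	 */
	#include <pthread.h>
	#include <stdint.h>
	#include <stdio.h>

	#define BATCH 64			/* stand-in for percpu_counter_batch */

	static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER; /* models fbc->lock */
	static int64_t shared_count;		/* models fbc->count */
	static __thread int64_t local_count;	/* models the per-CPU counter */

	static void counter_add(int64_t amount)
	{
		local_count += amount;
		if (local_count >= BATCH || local_count <= -BATCH) {
			pthread_mutex_lock(&lock);
			shared_count += local_count;
			pthread_mutex_unlock(&lock);
			local_count = 0;
		}
	}

	/* An "accurate" read takes the lock, like the proposed task_mem() change. */
	static int64_t counter_read_accurate(void)
	{
		int64_t v;

		pthread_mutex_lock(&lock);
		v = shared_count;	/* the kernel would also sum per-CPU deltas */
		pthread_mutex_unlock(&lock);
		return v;
	}

	static void *updater(void *arg)
	{
		(void)arg;
		for (int i = 0; i < 1000000; i++)
			counter_add(1);	/* models add_mm_counter() on each fault */
		return NULL;
	}

	int main(void)
	{
		pthread_t threads[4];

		for (int i = 0; i < 4; i++)
			pthread_create(&threads[i], NULL, updater, NULL);
		for (int i = 0; i < 4; i++)
			pthread_join(threads[i], NULL);

		/* Unflushed per-thread deltas would normally be missing here. */
		printf("shared count: %lld\n", (long long)counter_read_accurate());
		return 0;
	}

With BATCH = 64, each updater takes the shared lock roughly once per 64
increments, which is why a locked reader in task_mem()/task_statm() is not
expected to add meaningful contention on the update path.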