On Tue, Jun 03, 2025 at 04:48:08PM +0200, Michal Hocko wrote:
> On Tue 03-06-25 22:22:46, Baolin Wang wrote:
> > Let me try to clarify further.
> >
> > The 'mm->rss_stat' is updated via add_mm_counter() and
> > dec/inc_mm_counter(), which are all wrappers around
> > percpu_counter_add_batch(). percpu_counter_add_batch() uses per-CPU
> > batch caching to avoid 'fbc->lock' contention.
>
> OK, this is exactly the line of argument I was looking for. If _all_
> updates done in the kernel use batching, and the lock is therefore only
> taken every N (percpu_counter_batch) updates, then the risk of lock
> contention is reduced. This is worth a note in the changelog.
>
> > This patch changes task_mem() and task_statm() to read the accurate mm
> > counters under 'fbc->lock', but this will not exacerbate kernel
> > 'mm->rss_stat' lock contention, due to the percpu batch caching of the
> > mm counters.
> >
> > You might argue that my test cases cannot demonstrate an actual lock
> > contention, but they have already shown that there is no significant
> > 'fbc->lock' contention when the kernel updates 'mm->rss_stat'.
>
> I was arguing that `top -d 1' doesn't really represent a potentially
> adverse usage. These proc files are generally readable, so I would
> expect something like a busy-loop read while the process tries to
> update the counters, to see the worst-case scenario. If the impact is
> barely visible there, we can conclude that normal use wouldn't even
> notice.

Baolin, please run a stress-ng command that stresses minor anonymous page
faults in multiple threads, and then run multiple bash scripts that cat
/proc/$(pidof stress-ng)/status in a loop. That should show how much the
stress-ng process is impacted by the parallel status readers versus without
them.
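
For anyone following along, below is a rough userspace model of the batching
behaviour being relied on above. It is only an illustrative sketch:
counter_add(), BATCH, and the pthread mutex are stand-ins for
percpu_counter_add_batch(), percpu_counter_batch, and 'fbc->lock', not the
kernel implementation itself.

	/*
	 * Userspace model of per-CPU counter batching (illustration only,
	 * not the kernel's percpu_counter code): each thread accumulates
	 * updates locally and only takes the shared lock once the local
	 * delta reaches BATCH, so the lock is taken roughly once per
	 * BATCH updates instead of on every update.
	 */
	#include <pthread.h>
	#include <stdint.h>
	#include <stdio.h>

	#define BATCH 64			/* stand-in for percpu_counter_batch */

	static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER; /* models fbc->lock */
	static int64_t shared_count;		/* models fbc->count */
	static __thread int64_t local_count;	/* models the per-CPU counter */

	static void counter_add(int64_t amount)
	{
		local_count += amount;
		if (local_count >= BATCH || local_count <= -BATCH) {
			pthread_mutex_lock(&lock);
			shared_count += local_count;
			pthread_mutex_unlock(&lock);
			local_count = 0;
		}
	}

	/* An "accurate" read takes the lock, like the proposed task_mem() change. */
	static int64_t counter_read_accurate(void)
	{
		int64_t v;

		pthread_mutex_lock(&lock);
		v = shared_count;	/* the kernel would also sum per-CPU deltas */
		pthread_mutex_unlock(&lock);
		return v;
	}

	static void *updater(void *arg)
	{
		(void)arg;
		for (int i = 0; i < 1000000; i++)
			counter_add(1);	/* models add_mm_counter() on each fault */
		return NULL;
	}

	int main(void)
	{
		pthread_t threads[4];

		for (int i = 0; i < 4; i++)
			pthread_create(&threads[i], NULL, updater, NULL);
		for (int i = 0; i < 4; i++)
			pthread_join(threads[i], NULL);

		/* Unflushed per-thread deltas would normally be missing here. */
		printf("shared count: %lld\n", (long long)counter_read_accurate());
		return 0;
	}

With BATCH = 64, each updater takes the shared lock roughly once per 64
increments, which is why a locked reader in task_mem()/task_statm() is not
expected to add meaningful contention on the update path.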