Re: [PATCH] mm: fix the inaccurate memory statistics issue for users

On 2025/6/4 01:29, Shakeel Butt wrote:
> On Tue, Jun 03, 2025 at 04:48:08PM +0200, Michal Hocko wrote:
>> On Tue 03-06-25 22:22:46, Baolin Wang wrote:
>>> Let me try to clarify further.
>>>
>>> The 'mm->rss_stat' is updated by using add_mm_counter(),
>>> dec/inc_mm_counter(), which are all wrappers around
>>> percpu_counter_add_batch(). In percpu_counter_add_batch(), there is percpu
>>> batch caching to avoid 'fbc->lock' contention.

>> OK, this is exactly the line of argument I was looking for. If _all_
>> updates done in the kernel are using batching and therefore the lock is
>> only held every N (percpu_counter_batch) updates then a risk of locking
>> contention would be decreased. This is worth having a note in the
>> changelog.

OK.
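
To make that note concrete: add_mm_counter()/inc/dec_mm_counter() end up in
percpu_counter_add_batch(), which only takes 'fbc->lock' once the accumulated
per-CPU delta reaches the batch size. A simplified sketch of the update path
(the exact code in lib/percpu_counter.c differs between kernel versions):

/* Simplified sketch, not the exact upstream implementation. */
void percpu_counter_add_batch(struct percpu_counter *fbc, s64 amount, s32 batch)
{
	s64 count;
	unsigned long flags;

	local_irq_save(flags);
	count = __this_cpu_read(*fbc->counters) + amount;
	if (abs(count) >= batch) {
		/* Slow path: fold the per-CPU delta into the shared count under the lock. */
		raw_spin_lock(&fbc->lock);
		fbc->count += count;
		__this_cpu_sub(*fbc->counters, count - amount);
		raw_spin_unlock(&fbc->lock);
	} else {
		/* Fast path: stay on the per-CPU counter, no lock taken. */
		this_cpu_add(*fbc->counters, amount);
	}
	local_irq_restore(flags);
}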

>>> This patch changes task_mem()
>>> and task_statm() to get the accurate mm counters under the 'fbc->lock', but
>>> this will not exacerbate kernel 'mm->rss_stat' lock contention due to the
>>> percpu batch caching of the mm counters.

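For reference, the accurate read under 'fbc->lock' is essentially what
__percpu_counter_sum() does: the lock is held only while the per-CPU deltas
are folded into the shared count. Again a simplified sketch, not the exact
upstream code:

/* Simplified sketch of __percpu_counter_sum(). */
s64 __percpu_counter_sum(struct percpu_counter *fbc)
{
	s64 ret;
	int cpu;
	unsigned long flags;

	raw_spin_lock_irqsave(&fbc->lock, flags);
	ret = fbc->count;
	/* The real code limits this walk to the relevant CPU mask. */
	for_each_possible_cpu(cpu) {
		s32 *pcount = per_cpu_ptr(fbc->counters, cpu);
		ret += *pcount;
	}
	raw_spin_unlock_irqrestore(&fbc->lock, flags);
	return ret;
}
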
>>> You might argue that my test cases cannot demonstrate an actual lock
>>> contention, but they have already shown that there is no significant
>>> 'fbc->lock' contention when the kernel updates 'mm->rss_stat'.

>> I was arguing that `top -d 1' doesn't really represent a potential
>> adverse usage. These proc files are generally readable so I would be
>> expecting something like busy loop read while process tries to update
>> counters to see the worst case scenario. If that is barely visible then
>> we can conclude a normal use wouldn't even notice.

OK.

> Baolin, please run stress-ng command that stresses minor anon page
> faults in multiple threads and then run multiple bash scripts which cat
> /proc/pidof(stress-ng)/status. That should be how much the stress-ng
> process is impacted by the parallel status readers versus without them.

Sure. Thanks Shakeel. I ran stress-ng with the 'stress-ng --fault 32 --perf -t 1m' command, while simultaneously running the following script to read /proc/$PID/status for each stress-ng process.

From the following data, I did not observe any obvious impact of this patch on the stress-ng results while the status files were being read repeatedly.

w/o patch
stress-ng: info: [6891] 3,993,235,331,584 CPU Cycles 59.767 B/sec
stress-ng: info: [6891] 1,472,101,565,760 Instructions 22.033 B/sec (0.369 instr. per cycle)
stress-ng: info: [6891] 36,287,456 Page Faults Total 0.543 M/sec
stress-ng: info: [6891] 36,287,456 Page Faults Minor 0.543 M/sec

w/ patch
stress-ng: info: [6872] 4,018,592,975,968 CPU Cycles 60.177 B/sec
stress-ng: info: [6872] 1,484,856,150,976 Instructions 22.235 B/sec (0.369 instr. per cycle)
stress-ng: info: [6872] 36,547,456 Page Faults Total 0.547 M/sec
stress-ng: info: [6872] 36,547,456 Page Faults Minor 0.547 M/sec

=========================
#!/bin/bash

# Get the PIDs of stress-ng processes
PIDS=$(pgrep stress-ng)

# Loop through each PID and monitor /proc/[pid]/status
for PID in $PIDS; do
    while true; do
        cat /proc/$PID/status
        usleep 100000
    done &
done
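
As a side note, the harsher busy-loop reader Michal described could look
roughly like the sketch below. This is only an illustration of that worst
case; it was not used for the numbers above.

/* Hypothetical busy-loop /proc/<pid>/status reader, illustration only. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	char path[64], buf[4096];
	int fd;

	if (argc < 2) {
		fprintf(stderr, "usage: %s <pid>\n", argv[0]);
		return 1;
	}
	snprintf(path, sizeof(path), "/proc/%s/status", argv[1]);

	for (;;) {
		fd = open(path, O_RDONLY);
		if (fd < 0)
			return 1;
		/* Drain the whole file, then immediately re-read it. */
		while (read(fd, buf, sizeof(buf)) > 0)
			;
		close(fd);
	}
	return 0;
}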



