On 2025-06-09 20:17, Andrew Morton wrote:
On Mon, 9 Jun 2025 10:56:46 +0200 Vlastimil Babka <vbabka@xxxxxxx> wrote:
On 6/9/25 10:52 AM, Vlastimil Babka wrote:
On 6/9/25 10:31 AM, Ritesh Harjani (IBM) wrote:
Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx> writes:
On 2025/6/9 15:35, Michal Hocko wrote:
On Mon 09-06-25 10:57:41, Ritesh Harjani wrote:
Any reason why we dropped the Fixes tag? I see there was a series of
discussions on v1 and it was concluded that the fix was correct, so why
drop the Fixes tag?
This seems more like an improvement than a bug fix.
Yes. I don't have a strong opinion on this, but we (Alibaba) will
backport it manually, because some user-space monitoring tools depend
on these statistics.
That sounds like a regression then, doesn't it?
Hm if counters were accurate before f1a7941243c1 and not afterwards, and
this is making them accurate again, and some userspace depends on it,
then Fixes: and stable is probably warranted then. If this was just a
perf improvement, then not. But AFAIU f1a7941243c1 was the perf
improvement...
Dang, should have re-read the commit log of f1a7941243c1 first. It seems
like the error margin due to batching also existed before f1a7941243c1.
"This patch converts the rss_stats into percpu_counter to convert the
error margin from (nr_threads * 64) to approximately (nr_cpus ^ 2)."
so if on some systems this means worse margin than before, the above
"if" chain of thought might still hold.
f1a7941243c1 seems like a good enough place to tell -stable
maintainers where to insert the patch (why does this sound rude).
The patch is simple enough. I'll add fixes:f1a7941243c1 and cc:stable
and, as the problem has been there for years, I'll leave the patch in
mm-unstable so it will eventually get into LTS, in a well tested state.
Andrew, are you considering submitting this patch for 6.16? I think
we should, it does look like a regression for larger systems built
with 64k base page size.
On comparing a very simple app which just allocates & touches some
memory against v6.1 (which doesn't have f1a7941243c1) and latest
Linus tree (4c06e63b9203) I can see that on latest Linus tree the
values for VmRSS, RssAnon and RssFile from /proc/self/status are
all zeroes while they do report values on v6.1 and a Linus tree
with this patch.
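
For anyone who wants to try a comparison like this, a minimal sketch
follows. It is not the app used above: the helper names and the 4 MiB
allocation size are assumptions made for illustration. It touches one
byte per page of an anonymous mapping and then reads the three fields
from /proc/self/status.

```python
import mmap

# The three fields compared in this thread.
FIELDS = ("VmRSS", "RssAnon", "RssFile")


def parse_rss_fields(status_text):
    """Return {field: value in kB} for the RSS lines of a status dump."""
    values = {}
    for line in status_text.splitlines():
        name, _, rest = line.partition(":")
        if name in FIELDS:
            # Lines look like "VmRSS:\t    5120 kB"; keep the number only.
            values[name] = int(rest.split()[0])
    return values


def touch_anonymous_memory(size_bytes):
    """Map anonymous memory and write one byte per page to fault it in."""
    buf = mmap.mmap(-1, size_bytes)
    for offset in range(0, size_bytes, mmap.PAGESIZE):
        buf[offset] = 1
    return buf


if __name__ == "__main__":
    # Hypothetical size; keep a reference so the mapping stays alive.
    buf = touch_anonymous_memory(4 * 1024 * 1024)
    with open("/proc/self/status") as f:
        print(parse_rss_fields(f.read()))
```

Note that the Python interpreter touches a fair amount of memory on its
own at startup, so a small C program is probably a better fit for
observing the all-zeroes case; the sketch only shows the shape of the
test.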
My test setup is an arm64 VM with 80 CPUs running a kernel with 64k
page size. The kernel only reports RSS values starting at 10MB
(which makes sense, since each per-CPU counter can cache up to two
times the number of CPUs, and the kernel accounts in pages:
2 * 80 * 64k = 10MB). The situation will be worse on larger systems,
of course.
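
The 10MB figure can be checked with a little arithmetic. The sketch
below assumes the per-CPU batch is max(32, 2 * nr_cpus), an assumption
based on lib/percpu_counter.c that should be verified against the tree
being tested. The function names are hypothetical, and the result is
only the approximate point at which one CPU's cached delta gets folded
into the global counter.

```python
def percpu_batch(nr_cpus):
    """Assumed batch size in pages: max(32, 2 * nr_cpus)."""
    return max(32, 2 * nr_cpus)


def rss_visibility_threshold(nr_cpus, page_size):
    """Approximate bytes one CPU can account before the global count moves."""
    return percpu_batch(nr_cpus) * page_size


if __name__ == "__main__":
    # The setup described above: 80 CPUs, 64k pages.
    threshold = rss_visibility_threshold(80, 64 * 1024)
    print(f"{threshold // (1024 * 1024)} MiB")  # prints "10 MiB"
```

With the same assumption, a 4k page size on the same 80 CPUs gives
640 KiB, which is why the effect is much easier to see with 64k pages.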