Hello.

On Monday, 11 August 2025 at 18:06:16 Central European Summer Time, David Rientjes wrote:
> On Mon, 11 Aug 2025, Oleksandr Natalenko wrote:
> > Hello Damien.
> >
> > I'm fairly confident that the following commit
> >
> >     459779d04ae8d block: Improve read ahead size for rotational devices
> >
> > caused a regression in my test bench.
> >
> > I'm running v6.17-rc1 in a small QEMU VM with a virtio-scsi disk. It has
> > 1 GiB of RAM, so I can saturate it easily, causing the reclaim mechanism
> > to kick in.
> >
> > If MGLRU is enabled:
> >
> >     $ echo 1000 | sudo tee /sys/kernel/mm/lru_gen/min_ttl_ms
> >
> > then, once the page cache builds up, an OOM happens without inactive
> > file pages being reclaimed: [1]. Note inactive_file:506952kB; I'd expect
> > these pages to be reclaimed instead, as happens with v6.16.
> >
> > If MGLRU is disabled:
> >
> >     $ echo 0 | sudo tee /sys/kernel/mm/lru_gen/min_ttl_ms
> >
> > then the OOM doesn't occur, and things seem to work as usual.
> >
> > If MGLRU is enabled and 459779d04ae8d is reverted on top of v6.17-rc1,
> > the OOM doesn't happen either.
> >
> > Could you please check this?
>
> This looks to be an MGLRU policy decision rather than a readahead
> regression, correct?
>
> Mem-Info:
> active_anon:388 inactive_anon:5382 isolated_anon:0
>  active_file:9638 inactive_file:126738 isolated_file:0
>
> Setting min_ttl_ms to 1000 is preserving the working set, and triggering
> the oom kill is the only alternative for freeing memory in that
> configuration. The oom kill is being triggered by kswapd for this purpose.
>
> So additional readahead would certainly increase that working set. This
> looks to be working as intended.

OK, this makes sense indeed, thanks for the explanation. But is the
inactive_file explosion expected and justified?

Without the revert:

$ echo 3 | sudo tee /proc/sys/vm/drop_caches; free -m; sudo journalctl -kb >/dev/null; free -m
3
               total        used        free      shared  buff/cache   available
Mem:             690         179         536           3          57         510
Swap:           1379          12        1367
/* OOM happens here */
               total        used        free      shared  buff/cache   available
Mem:             690         177          52           3         561         513
Swap:           1379          17        1362

With the revert:

$ echo 3 | sudo tee /proc/sys/vm/drop_caches; free -m; sudo journalctl -kb >/dev/null; free -m
3
               total        used        free      shared  buff/cache   available
Mem:             690         214         498           4          64         476
Swap:           1379           0        1379
/* no OOM */
               total        used        free      shared  buff/cache   available
Mem:             690         209         462           4         119         481
Swap:           1379           0        1379

The journal folder size is:

$ sudo du -hs /var/log/journal
575M	/var/log/journal

It looks like this readahead change causes far more data to be read than
is actually needed?

-- 
Oleksandr Natalenko, MSE
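P.S. A rough way to quantify the extra reads would be to sample the
sectors-read counter around the journalctl run (a minimal sketch, assuming
the virtio-scsi disk shows up as sda; adjust the device name as needed):

$ cat /sys/block/sda/queue/read_ahead_kb
$ awk '$3 == "sda" { print $6 * 512 / 1024 / 1024 " MiB read since boot" }' /proc/diskstats
$ sudo journalctl -kb >/dev/null
$ awk '$3 == "sda" { print $6 * 512 / 1024 / 1024 " MiB read since boot" }' /proc/diskstats

Comparing the delta against the 575M journal size, with and without the
revert, should show directly how much data the readahead change pulls in
beyond what journalctl actually touches.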