On Wed, Jun 18, 2025 at 03:00:12PM +0200, Daniel Gomez wrote:
> From: Daniel Gomez <da.gomez@xxxxxxxxxxx>
>
> Running generic/551 on a tmpfs filesystem with less than 10 GB (ish)
> of RAM can lead to the system running out of memory, triggering the
> kernel's OOM killer and terminating the aio-dio-write-v process.
>
> Fix generic/551 by substracting the amount of available memory allocated
                     ^^^^^^
                     subtracting
> for the tmpfs scratch device to the total available free memory.
>
> Reported-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
> Signed-off-by: Daniel Gomez <da.gomez@xxxxxxxxxxx>
> ---
> While integrating tmpfs support for xfstests in kdevops CI [1], we noticed
> that generic/551 could trigger the OOM killer when the scratch device
> is tmpfs, due to not properly accounting for available system memory.
> Fix the test for tmpfs by subtracting the memory allocated to the
> scratch tmpfs mount from the total available memory, ensuring the test
> runs within safe limits.
>
> [1]
> https://lore.kernel.org/all/20250615-ci-workflow-v1-0-53b267cd2f0a@xxxxxxxxxxx/
>
> These are the kernel oom-killer logs for generic/551 run on a system
> with less than 10G of RAM:
>
> run fstests generic/551 at 2025-06-18 11:42:44
> aio-dio-write-v invoked oom-killer:
> gfp_mask=0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), order=0,
> oom_score_adj=250
> CPU: 5 UID: 0 PID: 1717 Comm: aio-dio-write-v Not tainted
> 6.16.0-rc2-00049-g52da431bf03b #10 PREEMPT(full)
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 2025.02-6
> 04/08/2025
> {...}
> Tasks state (memory values in pages):
> [ pid ] uid tgid total_vm rss rss_anon rss_file rss_shmem
> pgtables_bytes swapents oom_score_adj name
> {...}
> [ 1717] 0 1717 876600 875978 875945 33 0
> 7065600 0 250 aio-dio-write-v
> oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowe
> d=0,global_oom,task_memcg=/system.slice/fstests-generic-551.scope,task=a
> io-dio-write-v,pid=1717,uid=0
> Out of memory: Killed process 1717 (aio-dio-write-v) total-vm:3506400kB,
> anon-rss:3503780kB, file-rss:132kB, shmem-rss:0kB, UID:0 pgtables:6900kB
> oom_score_adj:250
>
> Results collected with kdevops on the following tmpfs profiles
> (before/after the changes):
>
> diff --git a/workflows/fstests/results/last-run/
> 6.16.0-rc2-00049-g52da431bf03b/xunit_results.txt
> b/workflows/fstests/results/last-run/
> 6.16.0-rc2-00049-g52da431bf03b/xunit_results.txt
> index 02a1f09..0229294 100644
> --- a/workflows/fstests/results/last-run/
> 6.16.0-rc2-00049-g52da431bf03b/xunit_results.txt
> +++ b/workflows/fstests/results/last-run/
> 6.16.0-rc2-00049-g52da431bf03b/xunit_results.txt
> @@ -1,21 +1,21 @@
> KERNEL: 6.16.0-rc2-00049-g52da431bf03b
> CPUS: 8
>
> -tmpfs_noswap_huge_never: 1 tests, 1 failures, 9 seconds
> - generic/551 Failed 8s
> -tmpfs_default: 1 tests, 1 failures, 5 seconds
> - generic/551 Failed 4s
> -tmpfs_noswap_huge_within_size: 1 tests, 1 failures, 8 seconds
> - generic/551 Failed 8s
> -tmpfs_huge_always: 1 tests, 1 failures, 11 seconds
> - generic/551 Failed 11s
> -tmpfs_huge_within_size: 1 tests, 1 failures, 8 seconds
> - generic/551 Failed 8s
> -tmpfs_noswap_huge_always: 1 tests, 1 failures, 6 seconds
> - generic/551 Failed 6s
> -tmpfs_noswap_huge_advise: 1 tests, 1 failures, 8 seconds
> - generic/551 Failed 7s
> -tmpfs_huge_advise: 1 tests, 1 failures, 8 seconds
> - generic/551 Failed 7s
> -Totals: 8 tests, 0 skipped, 8 failures, 0 errors, 59s
> +tmpfs_noswap_huge_advise: 1 tests, 134 seconds
> + generic/551 Pass 134s
> +tmpfs_noswap_huge_never: 1 tests, 141 seconds
> + generic/551 Pass 141s
> +tmpfs_huge_advise: 1 tests, 142 seconds
> + generic/551 Pass 142s
> +tmpfs_default: 1 tests, 139 seconds
> + generic/551 Pass 139s
> +tmpfs_noswap_huge_always: 1 tests, 109 seconds
> + generic/551 Pass 108s
> +tmpfs_noswap_huge_within_size: 1 tests, 116 seconds
> + generic/551 Pass 115s
> +tmpfs_huge_within_size: 1 tests, 112 seconds
> + generic/551 Pass 111s
> +tmpfs_huge_always: 1 tests, 145 seconds
> + generic/551 Pass 145s
> +Totals: 8 tests, 0 skipped, 0 failures, 0 errors, 1035s
> ---
> tests/generic/551 | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/tests/generic/551 b/tests/generic/551
> index 4a7f0a638235e272ef55ffeb3b3e548707568379..6a7376d7a8e3580bee0a5c98eacdaf93c60c8d5c 100755
> --- a/tests/generic/551
> +++ b/tests/generic/551
> @@ -38,6 +38,7 @@ do_test()
>  	local truncsize
>  	local total_size=0
>  	local avail_mem=`_available_memory_bytes`
> +	[ "$FSTYP" = "tmpfs" ] && avail_mem=$((avail_mem - free_size_k * 1024))

This makes sense to me, but it would be better to have a comment for this.
If you don't have more suggestions, I'll add the comment below when I merge
this patch.

	# To avoid OOM on tmpfs, subtract the amount of available memory allocated
	# for the tmpfs

Reviewed-by: Zorro Lang <zlang@xxxxxxxxxx>

>
>  	# the number of AIO write operation
>  	num_oper=$((RANDOM % 64 + 1))
>
> ---
> base-commit: b7680adf9ff7bdc962fb95b5cbd304abd3137b69
> change-id: 20250618-fix-tmpfs-generic-551-4c74b15d4c25
>
> Best regards,
> --
> Daniel Gomez <da.gomez@xxxxxxxxxxx>
>
>
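
For reference, the arithmetic in the one-line change can be sketched as a standalone shell function. This is only an illustration: `adjusted_avail_mem` and its positional arguments are hypothetical and not part of fstests, and it assumes, as the patch's `* 1024` suggests, that `free_size_k` is the scratch mount's free space in KiB while `avail_mem` is in bytes.

```shell
#!/bin/sh
# Hypothetical helper (not part of fstests): return the memory budget in
# bytes that the test may use once the tmpfs scratch space is set aside.
#   $1: available memory in bytes
#   $2: free space on the scratch mount in KiB (assumed unit)
#   $3: filesystem type, e.g. the value of $FSTYP
adjusted_avail_mem() {
	avail_mem=$1
	free_size_k=$2
	fstyp=$3

	# tmpfs pages live in RAM, so space the test may still write to the
	# scratch mount cannot also be counted as memory for AIO buffers.
	if [ "$fstyp" = "tmpfs" ]; then
		avail_mem=$((avail_mem - free_size_k * 1024))
	fi

	echo "$avail_mem"
}

# 8 GiB available, 4 GiB (4194304 KiB) free on a tmpfs scratch mount
adjusted_avail_mem 8589934592 4194304 tmpfs
```

For any other filesystem type the function returns the first argument unchanged, which matches the patch leaving non-tmpfs runs untouched. The sketch does not clamp the result, so a scratch mount larger than the available memory would yield a negative value.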