Re: [PATCH] generic/764: fsstress + migrate_pages() test

On Wed, Mar 26, 2025 at 11:50:55AM -0700, Luis Chamberlain wrote:
> 0-day reported a page migration kernel warning with folios which happen
> to be buffer-heads [0]. I'm having a terribly hard time reproducing the bug
> and so I wrote this test to force page migration on filesystems.
> 
> It turns out we have no tests for page migration in fstests or LTP,
> and it's no surprise: other than compaction, covered by generic/750, there
> is no easy way to trigger page migration right now unless you have a
> NUMA system.
> 
> We should evaluate if we want to help stress test page migration
> artificially by later implementing a way to do page migration on simple
> systems to an artificial target.
> 
> So far, this doesn't trigger any kernel splats, not even warnings for me.
> 
> Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
> Link: https://lore.kernel.org/r/202503101536.27099c77-lkp@xxxxxxxxx # [0]
> Signed-off-by: Luis Chamberlain <mcgrof@xxxxxxxxxx>
> ---
>  common/config         |  2 +
>  common/rc             |  8 ++++
>  tests/generic/764     | 94 +++++++++++++++++++++++++++++++++++++++++++
>  tests/generic/764.out |  2 +
>  4 files changed, 106 insertions(+)
>  create mode 100755 tests/generic/764
>  create mode 100644 tests/generic/764.out
> 
> diff --git a/common/config b/common/config
> index 2afbda141746..93b50f113b44 100644
> --- a/common/config
> +++ b/common/config
> @@ -239,6 +239,8 @@ export BTRFS_MAP_LOGICAL_PROG=$(type -P btrfs-map-logical)
>  export PARTED_PROG="$(type -P parted)"
>  export XFS_PROPERTY_PROG="$(type -P xfs_property)"
>  export FSCRYPTCTL_PROG="$(type -P fscryptctl)"
> +export NUMACTL_PROG="$(type -P numactl)"
> +export MIGRATEPAGES_PROG="$(type -P migratepages)"
>  
>  # udev wait functions.
>  #
> diff --git a/common/rc b/common/rc
> index e51686389a78..ed9613a9bf28 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -281,6 +281,14 @@ _require_vm_compaction()
>  	fi
>  }
>  
> +_require_numa_nodes()
> +{
> +	readarray -t QUEUE < <($NUMACTL_PROG --show | awk '/^membind:/ {for (i=2; i<=NF; i++) print $i}')

sed makes this easier: remove the membind token, then remove all the
lines that have ":"s left in them. This leaves behind the membind
node string.

$ numactl --show | sed -e 's/membind://' -e '/:/d'
 0 1 2 3
$

Also should have:

	_require_command "$NUMACTL_PROG" "numactl"

built into it, rather than requiring the test to declare it first.
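
i.e. something like this (untested sketch, using the sed parsing
above; the local array name is just illustrative):

    _require_numa_nodes()
    {
        local nodes

        _require_command "$NUMACTL_PROG" "numactl"

        # Strip the "membind:" token, then drop every other line that
        # still contains a ":". What's left is the node number list.
        nodes=($($NUMACTL_PROG --show | sed -e 's/membind://' -e '/:/d'))
        if (( ${#nodes[@]} < 2 )); then
            _notrun "test requires at least two NUMA nodes"
        fi
    }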

> +	if (( ${#QUEUE[@]} < 2 )); then
> +		_notrun "You need a system with at least two numa nodes to run this test"
> +	fi
> +}



> +
>  # Requires CONFIG_DEBUGFS and truncation knobs
>  _require_split_huge_pages_knob()
>  {
> diff --git a/tests/generic/764 b/tests/generic/764
> new file mode 100755
> index 000000000000..91d9fb7e08da
> --- /dev/null
> +++ b/tests/generic/764
> @@ -0,0 +1,94 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2024 Luis Chamberlain.  All Rights Reserved.
> +#
> +# FS QA Test 764
> +#
> +# fsstress + migrate_pages() test
> +#
> +. ./common/preamble
> +_begin_fstest auto rw long_rw stress soak smoketest
> +
> +_cleanup()
> +{
> +	cd /
> +	rm -f $runfile
> +	rm -f $tmp.*
> +	kill -9 $run_migration_pid > /dev/null 2>&1
> +	kill -9 $stress_pid > /dev/null 2>&1
> +
> +	wait > /dev/null 2>&1
> +}

If you implement this using the fsstress wrappers as I mention
below, and stop running the main migration loop in the background,
this cleanup function can go away completely.

> +
> +_require_scratch
> +_require_command "$NUMACTL_PROG" "numactl"
> +_require_command "$MIGRATEPAGES_PROG" "migratepages"
> +_require_numa_nodes
> +
> +readarray -t QUEUE < <($NUMACTL_PROG --show | awk '/^membind:/ {for (i=2; i<=NF; i++) print $i}')
> +if (( ${#QUEUE[@]} < 2 )); then
> +	echo "Not enough NUMA nodes to pick two different ones."
> +	exit 1
> +fi

You've implemented this twice.

> +echo "Silence is golden"
> +
> +_scratch_mkfs > $seqres.full 2>&1
> +_scratch_mount >> $seqres.full 2>&1
> +
> +nr_cpus=$((LOAD_FACTOR * 4))
> +nr_ops=$((25000 * nr_cpus * TIME_FACTOR))

Don't scale ops with nr_cpus - you've already scaled processes
with nr_cpus.
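
i.e. something like:

    nr_cpus=$((LOAD_FACTOR * 4))
    nr_ops=$((25000 * TIME_FACTOR))

so the parallelism scales with LOAD_FACTOR and the per-process work
scales with TIME_FACTOR, independently of each other.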

> +fsstress_args=(-w -d $SCRATCH_MNT -n $nr_ops -p $nr_cpus)
> +test -n "$SOAK_DURATION" && fsstress_args+=(--duration="$SOAK_DURATION")
> +
> +runfile="$tmp.migratepages"
> +pidfile="$tmp.stress.pid"
> +
> +run_stress_fs()
> +{
> +	$FSSTRESS_PROG $FSSTRESS_AVOID "${fsstress_args[@]}" &
> +	stress_pid=$!
> +	echo $stress_pid > $pidfile
> +	wait $stress_pid
> +	rm -f $runfile
> +	rm -f $pidfile
> +}

Don't reimplement _run_fsstress(), call it instead.

> +
> +run_stress_fs &

Actually, you want _run_fsstress_bg() here, and then
_kill_fsstress() when you want it to die.
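
i.e. (untested):

    _run_fsstress_bg "${fsstress_args[@]}"

with a _kill_fsstress at the end of the test; no pidfile, no runfile,
no custom wrapper function needed.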

> +touch $runfile
> +stress_pid=$(cat $pidfile)

Don't need either of these.

> +
> +while [ -e $runfile ]; do

while [ -n "$_FSSTRESS_PID" ]; do


> +	readarray -t QUEUE < <(numactl --show | awk '/^membind:/ {for (i=2; i<=NF; i++) print $i}')

Third time this is implemented.

> +	# Proper Fisher–Yates shuffle
> +	for ((i=${#QUEUE[@]} - 1; i > 0; i--)); do
> +		j=$((RANDOM % (i + 1)))
> +		var=${QUEUE[i]}
> +		QUEUE[i]=${QUEUE[j]}
> +		QUEUE[j]=$var
> +	done
> +
> +	RANDOM_NODE_1=${QUEUE[0]}
> +	RANDOM_NODE_2=${QUEUE[1]}

If all you are doing is picking two random nodes, then you could
just use RANDOM for the array index and drop the whole shuffle
thing, yes?
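
Something like this (untested) picks two distinct nodes without the
shuffle:

    i=$((RANDOM % ${#QUEUE[@]}))
    j=$(( (i + 1 + RANDOM % (${#QUEUE[@]} - 1)) % ${#QUEUE[@]} ))
    RANDOM_NODE_1=${QUEUE[i]}
    RANDOM_NODE_2=${QUEUE[j]}

The second index is offset from the first by 1..N-1 modulo N, so the
two nodes can never be the same.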

> +	if [[ -f $pidfile ]]; then

no need for this if we gate the loop on _FSSTRESS_PID

> +		echo "migrating parent fsstress process:" >> $seqres.full
> +		echo -en "\t$MIGRATEPAGES_PROG $pid $RANDOM_NODE_1 $RANDOM_NODE_2 ..." >> $seqres.full
> +		$MIGRATEPAGES_PROG $stress_pid $RANDOM_NODE_1 $RANDOM_NODE_2
> +		echo " $?" >> $seqres.full
> +		echo "migrating child fsstress processes ..." >> $seqres.full
> +		for pid in $(ps --ppid "$stress_pid" -o pid=); do
> +			echo -en "\tmigratepages $pid $RANDOM_NODE_1 $RANDOM_NODE_2 ..." >> $seqres.full
> +			$MIGRATEPAGES_PROG $pid $RANDOM_NODE_1 $RANDOM_NODE_2
> +			echo " $?" >> $seqres.full
> +		done
> +	fi
> +	sleep 2
> +done &
> +run_migration_pid=$!

why is this put in the background, only to then wait on it to
complete? The loop will stop when fsstress finishes, yes?
Which means this doesn't need to be run in the background at all,
and then cleanup doesn't need to handle killing this, either.
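
i.e. an (untested) sketch of the overall structure, with fsstress
started via _run_fsstress_bg and QUEUE populated once before the loop
as above; the kill -0 liveness check is my addition so the loop also
ends when fsstress runs to completion on its own:

    while [ -n "$_FSSTRESS_PID" ] && kill -0 "$_FSSTRESS_PID" 2>/dev/null; do
        # pick two distinct nodes from QUEUE, as above
        i=$((RANDOM % ${#QUEUE[@]}))
        j=$(( (i + 1 + RANDOM % (${#QUEUE[@]} - 1)) % ${#QUEUE[@]} ))

        $MIGRATEPAGES_PROG $_FSSTRESS_PID ${QUEUE[i]} ${QUEUE[j]}
        for pid in $(ps --ppid "$_FSSTRESS_PID" -o pid=); do
            $MIGRATEPAGES_PROG $pid ${QUEUE[i]} ${QUEUE[j]}
        done
        sleep 2
    done

    # reap fsstress if it is still running
    _kill_fsstress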

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx



