From: Dave Chinner <dchinner@xxxxxxxxxx>

Concurrency is currently hard coded at 64 worker threads. This is
too many for small CPU count machines; the idea is to create a
sustained load of roughly one test per CPU as they are mostly single
threaded/single process tests. The number "64" was chosen because
I've been developing this functionality on a 64p VM.

Rather than hard coding the concurrency, probe the number of CPUs
available and create that many running contexts as the default
concurrency to use. Further, add a CLI option to specify the number
of threads to run so that we can over- or under-commit the CPU
resources to enable direct benchmarking of performance with
different levels of concurrency.

Let's use that capability to show how much check-parallel can
benefit small systems. Using a single check execution thread for all
tests inside a 4p control group to limit maximum CPU usage to the
equivalent of a small 4p machine:

$ time sudo numactl -C 4-7 ./check-parallel -D /mnt/xfs -t 1 -g quick -s xfs -x dump -X generic/531
Runner 0 Failures: generic/504
Tests run: 921
Tests _notrun: 272
Failure count: 2
.....
real	61m31.362s
user	0m0.029s
sys	0m0.059s

The quick group on XFS takes *over an hour* to run. If we use the
same 4p control group setup and run with 8 test execution threads to
ensure the 4 CPUs are fully utilised for most of the test run:

$ time sudo numactl -C 4-7 ./check-parallel -D /mnt/xfs -t 8 -g quick -s xfs -x dump -X generic/531
Runner 7 Failures: generic/504
Tests run: 921
Tests _notrun: 145
Failure count: 1
.....
real	17m33.124s
user	0m0.009s
sys	0m0.017s

The same test run takes only 17m33s. The same number of tests were
run, the same failures occurred.

[ Ignore the differences in notrun/failure count - the multi-file
aggregation currently doesn't work correctly for the single log file
case. ]

That's a reduction in test runtime of ~72% for a 4 CPU system.
Or, if we want to measure it the other way, we get a ~3.5x
improvement in runtime scalability, i.e. going from 1 -> 4 CPUs
being used for test execution (4x increase) we get a 3.5x
improvement in scalability when we go from check to check-parallel.

Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
---
 check-parallel | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/check-parallel b/check-parallel
index cb5d6aedf..0649a417f 100755
--- a/check-parallel
+++ b/check-parallel
@@ -10,7 +10,7 @@
 # the loop devices.
 
 basedir=""
-runners=64
+runners=$(getconf _NPROCESSORS_CONF)
 runner_list=()
 runtimes=()
 show_test_list=
@@ -30,6 +30,7 @@ usage()
 
 check options
     -D <dir>		Directory to run in
+    -t <n>		Number of concurrent tests to run
     -n			Output test list, do not run tests
     -r			randomize test order
     --exact-order	run tests in the exact order specified
@@ -81,6 +82,7 @@ while [ $# -gt 0 ]; do
 	-\? | -h | --help) usage ;;
 
 	-D)	basedir=$2; shift ;;
+	-t)	runners=$2; shift ;;
 	-g)	_tl_setup_group $2 ; shift ;;
 	-e)	_tl_setup_exclude_tests $2 ; shift ;;
 	-E)	_tl_setup_exclude_file $2 ; shift ;;
@@ -111,6 +113,11 @@ if [ ! -d "$basedir" ]; then
 	echo "Invalid basedir specification"
 	usage
 fi
+if [[ $runners -le 0 || $runners -gt 1024 ]]; then
+	echo "Invalid thread specification: $runners"
+	usage
+fi
+
 if [ -d "$basedir/runner-0/" ]; then
 	prev_results=`ls -tr $basedir/runner-0/ | grep results | tail -1`
 fi
-- 
2.45.2
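
A possible follow-up, sketched here as an assumption and not part of
this patch: the range check added above uses the arithmetic tests
inside [[ ]], which evaluate their operands as arithmetic
expressions. An argument to -t that is not a plain integer (for
example "4x") may therefore produce a shell error rather than the
usage message. The helper name valid_runner_count below is
hypothetical; it checks the string form first and then applies the
same 1..1024 limits as the patch.

```shell
#!/bin/bash
# Hypothetical helper, not part of check-parallel: succeed only if $1
# is a plain decimal integer in the range 1..1024.
valid_runner_count()
{
	local n=$1

	# Reject empty, non-numeric or overly long input before doing any
	# arithmetic on it. Four digits is enough to cover the 1024 limit.
	[[ $n =~ ^[0-9]{1,4}$ ]] || return 1

	# Force base 10 so a leading zero is not interpreted as octal.
	(( 10#$n >= 1 && 10#$n <= 1024 ))
}

# Default to the number of configured CPUs, as the patch does.
runners=$(getconf _NPROCESSORS_CONF)
if valid_runner_count "$runners"; then
	echo "would start $runners runners"
else
	echo "Invalid thread specification: $runners" >&2
fi
```

In check-parallel itself the else branch would presumably call
usage, as the existing basedir check does.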