[PATCH blktests 2/2] check: introduce ERR_EXIT flag

"Shin'ichiro Kawasaki" <shinichiro.kawasaki@xxxxxxx> · Fri, 6 Jun 2025 12:56:30 +0900

In bash script development, it is a good practice to enable strict
error-checking using "set -e" or "set -o errexit". However, implementing
this in blktests has two problems.

The first problem is script changes to ignore non-zero status. When the
error-checking is enabled, any non-zero status immediately terminates
the script. However, non-zero statuses are often intentional and
expected. Ignoring these statuses across all test cases would require
extensive code modifications and weird codes. Therefore, the error-
checking should be enabled for limited subset of test cases where non-
zero statuses are not expected.

The second problem is impact on following test cases. When the error-
checking is enabled for a test case, the test case script exits
immediately encountering an error. This skips the cleanup steps in the
test case, potentially leaving the test system in a dirty state, which
may affect subsequent test cases.

To address the two problems, introduce the new flag ERR_EXIT. When a
test case sets ERR_EXIT=1 at the beginning of its file, the blktests
check script calls "set -e" for the test case. The test case does not
need to call "set -e". This allows the strict error-checking to be
applied to test cases selectively.

If a test case with ERR_EXIT=1 exits due to a non-zero status, catch the
exit in _cleanup(). Then print an error message to the user and stop the
test script. This ensures the following test cases are not run.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@xxxxxxx>
---
 check             | 22 ++++++++++++++++++++--
 common/shellcheck |  4 +++-
 new               |  6 ++++++
 3 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/check b/check
index f11ce74..a690e47 100755
--- a/check
+++ b/check
@@ -4,6 +4,8 @@
 
 shopt -s extglob
 
+readonly ABORT_RUN=255
+
 _warning() {
 	echo "$0: $*" >&2
 }
@@ -332,6 +334,12 @@ _cleanup() {
 	fi
 
 	_exit_cgroup2
+
+	if ((ERR_EXIT)); then
+		echo "${TEST_NAME} should be error free but caused an error."
+		echo "It might have left dirty status. Abort the test run."
+		exit $ABORT_RUN
+	fi
 }
 
 _call_test() {
@@ -372,6 +380,7 @@ _call_test() {
 		fi
 
 		TIMEFORMAT="%Rs"
+		((ERR_EXIT)) && set -e
 		pushd . >/dev/null || return
 		{ time "$test_func" >"${seqres}.out" 2>&1; } 2>"${seqres}.runtime"
 		TEST_RUN["exit_status"]=$?
@@ -379,6 +388,12 @@ _call_test() {
 		TEST_RUN["runtime"]="$(cat "${seqres}.runtime")"
 		rm -f "${seqres}.runtime"
 
+		# The test case did not exit due to error for "set -e". Clear
+		# ERR_EXIT flag to not abort in _cleanup(). Also ensure to
+		# disable the error check.
+		unset ERR_EXIT
+		set +e
+
 		_cleanup
 
 		if [[ -v SKIP_REASONS ]]; then
@@ -523,6 +538,7 @@ _run_test() {
 	COND_DESC=""
 	FALLBACK_DEVICE=0
 	MODULES_TO_UNLOAD=()
+	ERR_EXIT=0
 
 	local nr_conds cond_i
 	local ret=0
@@ -635,13 +651,15 @@ _run_group() {
 		unset TEST_DEV
 	fi
 
-	local ret=0
+	local ret=0 run_test_ret
 	local test_name
 	for test_name in "${tests[@]}"; do
 		# Avoid "if" and "!" for errexit in test cases
 		( _run_test "$test_name" )
 		# shellcheck disable=SC2181
-		(($? != 0)) && ret=1
+		run_test_ret=$?
+		((run_test_ret != 0)) && ret=1
+		((run_test_ret == ABORT_RUN)) && break
 	done
 	return $ret
 }
diff --git a/common/shellcheck b/common/shellcheck
index ac0a51e..a1f8473 100644
--- a/common/shellcheck
+++ b/common/shellcheck
@@ -6,5 +6,7 @@
 
 # Suppress unused global variable warnings.
 _silence_sc2034() {
-	echo "$CAN_BE_ZONED $CGROUP2_DIR $CHECK_DMESG $COND_DESC $DESCRIPTION $DMESG_FILTER $FIO_PERF_FIELDS $FIO_PERF_PREFIX $QUICK $SKIP_REASONS ${TEST_RUN[*]} $TIMED" > /dev/null
+	echo "$CAN_BE_ZONED $CGROUP2_DIR $CHECK_DMESG $COND_DESC $DESCRIPTION \
+		$DMESG_FILTER $ERR_EXIT $FIO_PERF_FIELDS $FIO_PERF_PREFIX \
+		$QUICK $SKIP_REASONS ${TEST_RUN[*]} $TIMED" > /dev/null
 }
diff --git a/new b/new
index d84f01d..3664b1b 100755
--- a/new
+++ b/new
@@ -149,6 +149,12 @@ DESCRIPTION=""
 # Alternatively, you can filter out any unimportant messages in dmesg like so:
 # DMESG_FILTER="grep -v sysfs"
 
+# TODO: if you want to enable bash errexit "set -e" feature for strict error
+# check, uncomment the line below. If any command returns non-zero status, the
+# test case stops at that step. For detail, refer to the bash manual "SHELL
+# BUILTIN COMMAND", "set", "-e" section.
+# ERR_EXIT=1
+
 # TODO: if this test can be run for both regular block devices and zoned block
 # devices, uncomment the line below.
 # CAN_BE_ZONED=1
-- 
2.49.0