From: Darrick J. Wong <djwong@xxxxxxxxxx>

First, start with the premise that fstests is run with a nonzero limit
on the size of core dumps so that we can capture the state of
misbehaving fs utilities like fsck and scrub if they crash.

When fsstress is compiled with DEBUG defined (which is the default), it
will periodically call check_cwd to ensure that the current working
directory hasn't changed out from underneath it.  If the filesystem is
XFS and it shuts down, the stat64() calls will start returning EIO.  In
this case, we follow the out: label and call abort() to exit the
program.  Historically this did not produce any core dumps because $PWD
is on the dead filesystem and the write fails.

However, modern systems are often configured to capture coredumps using
some external mechanism, e.g. abrt/systemd-coredump.  In this case, the
capture tool will succeed in capturing every crashed process, which
fills the crash dump directory with a lot of useless junk.  Worse, if
the capture tool is configured to pass the dumps to fstests, it will
flag the test as failed because something dumped core.

This is really silly, because basic stat requests for the current
working directory can be satisfied from the inode cache without a disk
access.  In this narrow situation, EIO only happens when the fs has shut
down, so just exit the program.

We really should have a way to query if a filesystem is shut down that
isn't conflated with (possibly transient) EIO errors.  But for now this
is what we have to do. :(

Signed-off-by: "Darrick J. Wong" <djwong@xxxxxxxxxx>
---
 ltp/fsstress.c |   15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/ltp/fsstress.c b/ltp/fsstress.c
index 8dbfb81f95a538..d4abe561787f19 100644
--- a/ltp/fsstress.c
+++ b/ltp/fsstress.c
@@ -1049,8 +1049,21 @@ check_cwd(void)
 	ret = stat64(".", &statbuf);
 	if (ret != 0) {
+		int error = errno;
+
 		fprintf(stderr, "fsstress: check_cwd stat64() returned %d with errno: %d (%s)\n",
-			ret, errno, strerror(errno));
+			ret, error, strerror(error));
+
+		/*
+		 * The current working directory is pinned in memory, which
+		 * means that stat should not have had to do any disk accesses
+		 * to retrieve stat information.  Treat an EIO as an indication
+		 * that the filesystem shut down and exit instead of dumping
+		 * core like the abort() below does.
+		 */
+		if (error == EIO)
+			exit(1);
+
 		goto out;
 	}
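
For readers who want to poke at the decision outside of fsstress, here is a
minimal standalone sketch of the same idea. It is not part of the patch. The
names cwd_verdict, classify_cwd_stat and probe_cwd are hypothetical, and it
calls plain stat() rather than stat64() so that it builds without
_LARGEFILE64_SOURCE. It also shows why the patch copies errno into a local
before calling fprintf(): stdio calls are allowed to modify errno.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical names for illustration; none of these exist in fsstress. */
enum cwd_verdict {
	CWD_OK,			/* stat succeeded */
	CWD_FS_SHUTDOWN,	/* EIO: assume the fs shut down, exit quietly */
	CWD_UNEXPECTED,		/* anything else: still worth an abort() */
};

/* Pure decision step, split out so it can be checked without a dead fs. */
static enum cwd_verdict classify_cwd_stat(int ret, int saved_errno)
{
	if (ret == 0)
		return CWD_OK;
	if (saved_errno == EIO)
		return CWD_FS_SHUTDOWN;
	return CWD_UNEXPECTED;
}

/* Stat the cwd and classify the result; uses stat() instead of stat64(). */
static enum cwd_verdict probe_cwd(void)
{
	struct stat statbuf;
	int ret = stat(".", &statbuf);
	/* Copy errno right away: the fprintf() below is allowed to change it. */
	int error = errno;

	if (ret != 0)
		fprintf(stderr, "stat(\".\") returned %d with errno: %d (%s)\n",
			ret, error, strerror(error));

	return classify_cwd_stat(ret, error);
}
```

A caller following the patch would map CWD_FS_SHUTDOWN to exit(1) and
CWD_UNEXPECTED to abort(), so that only the unexpected failures leave a core
dump behind.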