Re: [PATCH v2] xfs: test case for handling io errors when reading extended attributes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Aug 29, 2025 at 02:24:19PM +1000, Donald Douwsma wrote:
> We've seen reports from the field panicing in xfs_trans_brelse after
> an io error for an attribute block.
> 
> sd 0:0:23:0: [sdx] tag#271 CDB: Read(16) 88 00 00 00 00 00 9b df 5e 78 00 00 00 08 00 00
> critical medium error, dev sdx, sector 2615107192 op 0x0:(READ) flags 0x1000 phys_seg 1 prio class 2
> XFS (sdx1): metadata I/O error in "xfs_da_read_buf+0xe1/0x140 [xfs]" at daddr 0x9bdf5678 len 8 error 61
> BUG: kernel NULL pointer dereference, address: 00000000000000e0
> ...
> RIP: 0010:xfs_trans_brelse+0xb/0xe0 [xfs]
> ...
> Call Trace:
>  <TASK>
>  ...
>  ? xfs_trans_brelse+0xb/0xe0 [xfs]
>  xfs_attr_leaf_get+0xb6/0xc0 [xfs]
>  xfs_attr_get+0xa0/0xd0 [xfs]
>  xfs_xattr_get+0x75/0xb0 [xfs]
>  __vfs_getxattr+0x53/0x70
>  inode_doinit_use_xattr+0x63/0x180
>  inode_doinit_with_dentry+0x196/0x510
>  security_d_instantiate+0x2f/0x50
>  d_splice_alias+0x46/0x2b0
>  xfs_vn_lookup+0x8b/0xb0 [xfs]
>  __lookup_slow+0x84/0x130
>  walk_component+0x158/0x1d0
>  path_lookupat+0x6e/0x1c0
>  filename_lookup+0xcf/0x1d0
>  vfs_statx+0x8d/0x170
>  vfs_fstatat+0x54/0x70
>  __do_sys_newfstatat+0x26/0x60
> 
> As this is specific to ENODATA test using scsi_debug instead of dmerror
> which returns EIO.
> 
> Signed-off-by: Donald Douwsma <ddouwsma@xxxxxxxxxx>
> ---
>  tests/xfs/999     | 193 ++++++++++++++++++++++++++++++++++++++++++++++
>  tests/xfs/999.out |   2 +
>  2 files changed, 195 insertions(+)
>  create mode 100755 tests/xfs/999
>  create mode 100644 tests/xfs/999.out
> 
> diff --git a/tests/xfs/999 b/tests/xfs/999
> new file mode 100755
> index 00000000..2a45eb7c
> --- /dev/null
> +++ b/tests/xfs/999
> @@ -0,0 +1,193 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 Red Hat Inc.  All Rights Reserved.
> +#
> +# FS QA Test 999
> +#
> +# Regression test for panic following IO error when reading extended attribute blocks
> +#
> +#   XFS (sda): metadata I/O error in "xfs_da_read_buf+0xe1/0x140 [xfs]" at daddr 0x78 len 8 error 61
> +#   BUG: kernel NULL pointer dereference, address: 00000000000000e0
> +#   ...
> +#   RIP: 0010:xfs_trans_brelse+0xb/0xe0 [xfs]
> +#
> +#   [53887.310123] Call Trace:
> +#    <TASK>
> +#    ? show_trace_log_lvl+0x1c4/0x2df
> +#    ? show_trace_log_lvl+0x1c4/0x2df
> +#    ? xfs_attr_leaf_get+0xb6/0xc0 [xfs]
> +#    ? __die_body.cold+0x8/0xd
> +#    ? page_fault_oops+0x134/0x170
> +#    ? xfs_trans_read_buf_map+0x133/0x300 [xfs]
> +#    ? exc_page_fault+0x62/0x150
> +#    ? asm_exc_page_fault+0x22/0x30
> +#    ? xfs_trans_brelse+0xb/0xe0 [xfs]
> +#    xfs_attr_leaf_get+0xb6/0xc0 [xfs]
> +#    xfs_attr_get+0xa0/0xd0 [xfs]
> +#    xfs_xattr_get+0x88/0xd0 [xfs]
> +#    __vfs_getxattr+0x7b/0xb0
> +#    inode_doinit_use_xattr+0x63/0x180
> +#    inode_doinit_with_dentry+0x196/0x510
> +#    security_d_instantiate+0x3a/0xb0
> +#    d_splice_alias+0x46/0x2b0
> +#    xfs_vn_lookup+0x8b/0xb0 [xfs]
> +#    __lookup_slow+0x81/0x130
> +#    walk_component+0x158/0x1d0
> +#    ? path_init+0x2c5/0x3f0
> +#    path_lookupat+0x6e/0x1c0
> +#    filename_lookup+0xcf/0x1d0
> +#    ? tlb_finish_mmu+0x65/0x150
> +#    vfs_statx+0x82/0x160
> +#    vfs_fstatat+0x54/0x70
> +#    __do_sys_newfstatat+0x26/0x60
> +#    do_syscall_64+0x5c/0xe0
> +#
> +# For SELinux enabled filesystems the attribute security.selinux will be
> +# created before any additional attributes are added. Any attempt to open the
> +# file will read this entry, leading to the panic being triggered as above, via
> +# security_d_instantiate, rather than fgetxattr(2), to test via fgetxattr mount
> +# with a fixed context=
> +#
> +# Kernels prior to v4.16 don't have medium_error_start, and only return errors
> +# for a specific location, making scsi_debug unsuitable for checking old kernels.
> +# See d9da891a892a scsi: scsi_debug: Add two new parameters to scsi_debug driver

Haha, I think we don't need to write such long comment, how about shorten it a bit,
for example remove the call trace details :)

> +#
> +
> +. ./common/preamble
> +_begin_fstest auto
                      ^^^^
                      attr

> +
> +_fixed_by_kernel_commit ae668cd567a6 "xfs: do not propagate ENODATA disk errors into xattr code"
> +
> +# Override the default cleanup function.
> +_cleanup()
> +{
> +	cd /
> +	rm -f $tmp.*
> +	_unmount $SCRATCH_MNT
> +	_put_scsi_debug_dev
> +}
> +
> +# Import common functions.
> +. ./common/scsi_debug
> +
> +_require_scsi_debug
> +modinfo scsi_debug | grep -wq medium_error_start || _notrun "test requires scsi_debug medium_error_start"

Oh, maybe we can improve _require_scsi_debug, help it to accept arguments, likes:

  _require_scsi_debug medium_error_start

> +scsi_debug_dev=$(_get_scsi_debug_dev)
> +test -b $scsi_debug_dev || _notrun "Failed to initialize scsi debug device"
> +echo "SCSI debug device $scsi_debug_dev" >>$seqres.full
> +
> +_mkfs_dev $scsi_debug_dev  >> $seqres.full
> +_mount $scsi_debug_dev $SCRATCH_MNT  >> $seqres.full

Hmm... if you use $SCRATCH_MNT, you might need _require_scratch. You can set
SCRATCH_DEV=$scsi_debug_dev, then use _scratch_mkfs and _scratch_mount and
other _scratch_* functions in this case.

> +
> +blocksize=$(_get_block_size $SCRATCH_MNT) >> $seqres.full
                                             ^^^^^^^^^^^^^^^
                                             what's this for?

Better to use _get_file_block_size.

> +echo Blocksize $blocksize >> $seqres.full
> +
> +# Use dry_run=1 when verifying the test to avoid panicking.
> +enable_error=1
> +[ ${dry_run:-0} -eq 1 ] && enable_error=0

I think the dry_run is useless for this case in xfstests, except you
hope to create a new global parameter named "dry_run" for xfstests.
I think we don't need that. Please use other method to decide enable_error.
Or always "enable error" if that's a necessary condition to reproduce the
bug.

> +
> +test_attr()
> +{
> +	test=$1
> +	testfile=$SCRATCH_MNT/$test
> +	attr_size=$2 # bytes
> +	error_at_block=${3:-0}

Better to use "local" for variables in a function.

> +	shift 3

But there might be not the 3rd argument, as you set it to 0 by default.

I didn't see you use any $4, $* or $@ below, so maybe we don't need this
line ?

> +
> +	value_size=$attr_size

Looks like the attr_size and value_size are duplicated, due to I didn't
see you change any of them below.

> +
> +	echo 0 > /sys/module/scsi_debug/parameters/opts
> +
> +	echo -e "\nTesting : $test" >> $seqres.full
> +	echo -e "attr size: $attr_size" >> $seqres.full
> +	echo -e "error at block: $error_at_block\n" >> $seqres.full
> +
> +	touch $testfile
> +
> +	inode=$(ls -i $testfile|awk '{print $1}')

inode=$(stat -c '%i' $testfile)

> +	printf "inode: %d Testfile: %s\n" $inode $testfile >> $seqres.full
> +	setfattr -n user.test_attr -v $(printf "%0*d" $value_size $value_size) $testfile

As this case is "attr" related, so you need these:

. common/attr

_require_attrs user

$SETFATTR_PROG ....

> +
> +	$XFS_IO_PROG -c "bmap -al" $testfile >> $seqres.full
> +	start_blocks=($($XFS_IO_PROG -c "bmap -al" $testfile | awk 'match($3, /[0-9]+/, a) {print a[0]}'))
> +	echo "Attribute fork extent(s) start at ${attrblocks[*]}" >> $seqres.full
> +
> +	_unmount $SCRATCH_MNT >>$seqres.full 2>&1
> +
> +	echo "Dump inode $inode details with xfs_db" >> $seqres.full
> +	$XFS_DB_PROG -c "inode $inode" -c "print core.aformat core.naextents a" $scsi_debug_dev >> $seqres.full
> +	inode_daddr=$($XFS_DB_PROG -c "inode $inode" -c "daddr" $scsi_debug_dev | awk '{print $4}')

_scratch_xfs_get_metadata_field ...

> +	echo inode daddr $inode_daddr >> $seqres.full
> +
> +
> +	_mount $scsi_debug_dev $SCRATCH_MNT  >> $seqres.full
> +
> +        if [[ start_blocks[0] -ne 0 ]]; then
        ^^^

> +                # Choose the block to error, currently only works with a single extent.
> +                error_daddr=$((start_blocks[0] + error_at_block*blocksize/512))
> +        else
> +                # Default to the inode daddr when no extents were found.
> +                # Fails reading the inode during stat, so arguably useless
> +                # even for testing the upper layers of getfattr.
> +                error_daddr=$inode_daddr

I'm curious, won't this affect the inode core ? If inode core can't be read, won't
this cause other problem which isn't as you expect?

> +        fi
        ^^^

Bad indentation ...

> +
> +	echo "Setup scsi_debug to error when reading attributes from block" \
> +	     "$error_at_block at daddr $error_daddr" >> $seqres.full
> +	echo $error_daddr > /sys/module/scsi_debug/parameters/medium_error_start
> +	echo $(($blocksize/512)) > /sys/module/scsi_debug/parameters/medium_error_count
> +
> +	if [ $enable_error -eq 1 ]; then
> +		echo Enabling error >> $seqres.full
> +        	echo 2 > /sys/module/scsi_debug/parameters/opts
> +	fi
> +
> +	grep ^ /sys/module/scsi_debug/parameters/{medium_error_start,medium_error_count,opts} >> $seqres.full
> +
> +	echo "Re-read the extended attribute on $testfile, panics on IO errors" >> $seqres.full
> +	sync # Let folk see where we failed in the results.
> +	getfattr -d -m - $testfile >> $seqres.full 2>&1 # Panic here on failure

If you don't care about the getfattr output, you don't need to call _getfattr, but
please use $GETFATTR_PROG at least.



> +}
> +
> +
> +# TODO: Avoid assumptions about inode size using _xfs_get_inode_size and
> +#       _xfs_get_inode_core_bytes, currently assuming 512 byte inodes.
> +
> +# aformat shortform
> +# Include shortform for completeness, we can only inject errors for attributes
> +# stored outside of the inode.
> +test_attr "attr_local" 1 0
> +
> +# aformat extents
> +# Single leaf block, known to panic
> +test_attr "attr_extent_one_block" 512 0
> +
> +# Other tests to exercise multi block extents and multi extent attribute forks.
> +# Before the panic was understood it seemed possible that failing on the second
> +# block of a multi block attribute fork was involved. Seems like it may be
> +# worth testing in the future.
> +
> +test_attr "attr_extent_two_blocks_1" 5000 0
> +test_attr "attr_extent_two_blocks_2" 5000 1
> +# test_attr "attr_extent_many_blocks" 16000 4
> +#
> +# When using a single name=value pair to fill the space these tend to push out
> +# to multiple extents, rather than a single long extent, so don't yet test
> +# failing subsequent blocks within the first extent.
> +#
> +# i.e.
> +# xfs_bmap -va $testfile
> +# /mnt/test/testfile:
> +# EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL
> +#   0: [0..7]:          96..103           0 (96..103)            8
> +#   1: [8..23]:         80..95            0 (80..95)            16
> +
> +# aformat btree
> +# test_attr "attr_extents" 50000 0
> +# test_attr "attr_extents" 50000 $blocksize
> +# test_attr "attr_extents" 50000 $((blocksize+1))

Can you really get a btree aformat through this ? I doubt that. You
might need lots of attribute entries, not only one attribute with huge
data.

> +
> +# success, all done
> +echo Silence is golden
> +status=0
> +exit
> diff --git a/tests/xfs/999.out b/tests/xfs/999.out
> new file mode 100644
> index 00000000..3b276ca8
> --- /dev/null
> +++ b/tests/xfs/999.out
> @@ -0,0 +1,2 @@
> +QA output created by 999
> +Silence is golden
> -- 
> 2.47.3
> 
> 





[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux