Re: [PATCH RFC] xfs: remap block layer ENODATA read errors to EIO

"Darrick J. Wong" <djwong@xxxxxxxxxx> · Mon, 18 Aug 2025 16:04:35 -0700

On Mon, Aug 18, 2025 at 04:11:41PM -0500, Eric Sandeen wrote:
> On 8/18/25 3:45 PM, Darrick J. Wong wrote:
> > On Mon, Aug 18, 2025 at 03:22:02PM -0500, Eric Sandeen wrote:
> >> We had a report that a failing scsi disk was oopsing XFS when an xattr
> >> read encountered a media error. This is because the media error returned
> >> -ENODATA, which we map in xattr code to -ENOATTR and treat specially.
> >>
> >> In this particular case, it looked like:
> >>
> >> xfs_attr_leaf_get()
> >> 	error = xfs_attr_leaf_hasname(args, &bp);
> >> 	// here bp is NULL, error == -ENODATA from disk failure
> >> 	// but we define ENOATTR as ENODATA, so ...
> >> 	if (error == -ENOATTR)  {
> >> 		// whoops, surprise! bp is NULL, OOPS here
> >> 		xfs_trans_brelse(args->trans, bp);
> >> 		return error;
> >> 	} ...
> >>
> >> To avoid whack-a-mole "test for null bp" or "which -ENODATA do we really
> >> mean in this function?" throughout the xattr code, my first thought is
> >> that we should simply map -ENODATA in lower level read functions back to
> >> -EIO, which is unambiguous, even if we lose the nuance of the underlying
> >> error code. (The block device probably already squawked.) Thoughts?
> > 
> > Uhhhh where does this ENODATA come from?  Is it the block layer?
> > 
> > $ git grep -w ENODATA block/
> > block/blk-core.c:146:   [BLK_STS_MEDIUM]        = { -ENODATA,   "critical medium" },
> 
> That, probably, though I don't speak block layer very well. As mentioned, it was a
> SCSI disk error, and it appeared in XFS as -ENODATA:
> 
> sd 0:0:23:0: [sdad] tag#937 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=2s
> sd 0:0:23:0: [sdad] tag#937 Sense Key : Medium Error [current] 
> sd 0:0:23:0: [sdad] tag#937 Add. Sense: Read retries exhausted
> sd 0:0:23:0: [sdad] tag#937 CDB: Read(16) 88 00 00 00 00 00 9b df 5e 78 00 00 00 08 00 00
> critical medium error, dev sdad, sector 2615107192 op 0x0:(READ) flags 0x1000 phys_seg 1 prio class 2

Ah, yup, critical medium error: the drive ran out of read retries.

> XFS (sdad1): metadata I/O error in "xfs_da_read_buf+0xe1/0x140 [xfs]" at daddr 0x9bdf5678 len 8 error 61 
> (see error 61, ENODATA)
> 
> > --D
> > 
> >> Signed-off-by: Eric Sandeen <sandeen@xxxxxxxxxx>
> >> ---
> >>
> >> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> >> index f9ef3b2a332a..6ba57ccaa25f 100644
> >> --- a/fs/xfs/xfs_buf.c
> >> +++ b/fs/xfs/xfs_buf.c
> >> @@ -747,6 +747,9 @@ xfs_buf_read_map(
> >>  		/* bad CRC means corrupted metadata */
> >>  		if (error == -EFSBADCRC)
> >>  			error = -EFSCORRUPTED;
> >> +		/* ENODATA == ENOATTR which confuses xattr layers */

Can this comment mention that ENODATA comes from the block layer?

		/*
		 * ENODATA from the block layer means critical medium
		 * error; don't let it get mixed up with the xattr usage
		 */

With that changed,
Reviewed-by: "Darrick J. Wong" <djwong@xxxxxxxxxx>

--D

> >> +		if (error == -ENODATA)
> >> +			error = -EIO;
> >>  		return error;
> >>  	}
> >>  
> >>
> >>
> > 
>
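
For anyone following along, here is a rough user-space sketch of the aliasing problem and the proposed remap. It is not the kernel code: remap_read_error() and looks_like_missing_attr() are hypothetical helpers, and the ENOATTR, EFSBADCRC and EFSCORRUPTED defines are stand-ins assumed for illustration on a Linux libc.

```c
#include <errno.h>

/*
 * Stand-in definitions for this sketch only (assumptions, not copied
 * from the kernel headers). The thread notes that ENOATTR is defined
 * as ENODATA, which is the root of the confusion.
 */
#ifndef ENOATTR
#define ENOATTR		ENODATA
#endif
#define EFSBADCRC	EBADMSG		/* assumed stand-in: bad CRC */
#define EFSCORRUPTED	EUCLEAN		/* assumed stand-in: corrupt metadata */

/*
 * Hypothetical helper mirroring the error fixups in the diff above:
 * a media error reported as -ENODATA becomes -EIO, so it can no longer
 * be mistaken for "attribute not found".
 */
int remap_read_error(int error)
{
	/* bad CRC means corrupted metadata */
	if (error == -EFSBADCRC)
		error = -EFSCORRUPTED;
	/* ENODATA here is a media error, not a missing xattr */
	if (error == -ENODATA)
		error = -EIO;
	return error;
}

/* Hypothetical caller-side check, like the one in the xattr code path. */
int looks_like_missing_attr(int error)
{
	return error == -ENOATTR;
}
```

Without the remap, looks_like_missing_attr(-ENODATA) is true even though no buffer was read, which matches the NULL bp case in the report. With the remap applied first, the caller sees -EIO and takes the ordinary error path.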