Re: [PATCH RFC] xfs: remap block layer ENODATA read errors to EIO

Dave Chinner <david@xxxxxxxxxxxxx> · Wed, 20 Aug 2025 07:45:16 +1000

On Tue, Aug 19, 2025 at 10:38:54AM -0500, Eric Sandeen wrote:
> On 8/19/25 10:23 AM, Christoph Hellwig wrote:
> 
> ...
> 
> > The one thing we had a discussion about was ENOSPC, which can happen
> > with some thin provisioning solutions (and apparently redhat cares
> > about dm-thin there).  For this we do want to retry metadata writes
> > based on that design, and special casing it would be good, because
> > an escaping ENOSPC would do the entirely wrong thing in all layers
> > above the buffer cache.
> > 
> > Another one is EAGAIN for non-blocking I/O.  That's mostly a data
> > path thing, and we can't really deal with it, but if we make full
> > use of it, it needs to be special cased.
> > 
> > And then EOPNOTSUPP if we want to try optional operations that we
> > can't query ahead of time.  SCSI WRITE_SAME is one of them, but
> > we fortunately hide that behind block layer helpers.
> > 
> > For file systems directly dealing with persistent reservations
> > BLK_STS_RESV_CONFLICT might be another one, but I hope we don't
> > get there :)
> > 
> > If the file system ever directly makes use of Command duration
> > limits, BLK_STS_DURATION_LIMIT might be another one.
> > 
> > As you see very little of that is actually relevant for XFS,
> > and even less for the buffer cache.
> 
> Ok, this is getting a little more complex. The ENODATA problem is
> very specific, and has (oddly) been reported by users/customers twice
> in recent days. Maybe I can send an acceptable fix for that specific,
> observed problem (also suitable for -stable etc), then another
> one that is more ambitious on top of that.

Right, the lowest risk, minimal targeted fix for the problem
reported is to remap the error in the attr layers. Nothing else is
then affected (global changes of behaviour have significant
potential for unexpected regressions), but the issue is solved for
the users that are tripping over it.
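
To make the scope of such a fix concrete, here is a minimal userspace
sketch of that kind of remap. The helper name attr_remap_read_error()
is made up for illustration and is not taken from the patch; the real
change would live in the kernel attr read paths and may look quite
different.

```c
#include <errno.h>

/*
 * Hypothetical helper (illustration only, not the actual patch):
 * translate an ENODATA read error coming up from the block layer
 * into EIO, and pass every other value through unchanged so that
 * no other error handling changes behaviour.
 */
static int attr_remap_read_error(int error)
{
	if (error == -ENODATA)
		return -EIO;
	return error;
}
```

Only the one observed error value is touched; success and all other
errors (ENOSPC, EAGAIN, and so on) are returned as they came in.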

Then, if someone really wants to completely rearchitect how we
handle IO errors in XFS, that can be done as a separate project,
with its own justification, design review, planning for
integration/deprecation/removal of existing error handling
infrastructure, etc.

We do not tie acceptance of trivial bug fixes with a requirement to
completely rearchitect fundamental filesystem behaviours that are
only vaguely related to the bug that needs to be fixed.

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx