On Sun, Jun 01, 2025 at 10:38:07PM -0700, Christoph Hellwig wrote:
> On Thu, May 29, 2025 at 02:36:30PM +1000, Dave Chinner wrote:
> > In these situations writeback could fail for several attempts before
> > the storage timed out and came back online. Then the next write
> > retry would succeed, and everything would be good. Linux never gave
> > us a specific IO error for this case, so we just had to retry on EIO
> > and hope that the storage came back eventually.
>
> Linux has had differentiated I/O error codes for quite a while. But
> more importantly dm-multipath doesn't just return errors to the upper
> layer during failover, but is instead expected to queue the I/O up
> until it either has a working path or an internal timeout has passed.
>
> In other words, write errors in Linux are in general expected to be
> persistent, modulo explicit failfast requests like REQ_NOWAIT.

Say what? The blk_errors array defines multiple block layer error
codes that are transient in nature - ENOSPC, ETIMEDOUT, EILSEQ,
ENOLINK, EBUSY - all of which indicate that a transient, retryable
error occurred somewhere in the block/storage layers.

What is permanent about dm-thinp returning ENOSPC to a write request?
Once the pool has been GC'd to free up space or expanded, the ENOSPC
error goes away.

What is permanent about an IO failing with EILSEQ because a T10
checksum failed due to a random bit error detected between the HBA
and the storage device? Retry the IO, and it goes through just fine
without any failures.

These transient error types typically only need a write retry after
some time period to resolve, and that's what XFS does by default.
What makes these sorts of errors persistent in the Linux block layer,
such that they require an immediate filesystem shutdown and a complete
denial of service to the storage?

I ask this seriously, because you are effectively saying the Linux
storage stack no longer behaves like the model we've been using for
decades. What has changed, and when did it change?

> Which also leaves me a bit puzzled what the XFS metadata retries are
> actually trying to solve, especially without even having a corresponding
> data I/O version.

They have always been about preventing an immediate filesystem
shutdown when spurious, transient IO errors occur below XFS. Data IO
errors don't cause filesystem shutdowns - errors get propagated to
the application - so there isn't a full-system DoS potential from
incorrect classification of data IO errors...

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
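
[ For reference, the mapping I'm talking about is the blk_errors[]
  table in block/blk-core.c, which blk_status_to_errno() uses to turn
  BLK_STS_* codes into errnos. An abridged sketch from memory - check
  the current tree for the exact entries and name strings:

    /* block/blk-core.c, abridged from memory */
    static const struct {
            int             errno;
            const char      *name;
    } blk_errors[] = {
            [BLK_STS_OK]            = { 0,           "" },
            [BLK_STS_TIMEOUT]       = { -ETIMEDOUT,  "timeout" },
            [BLK_STS_NOSPC]         = { -ENOSPC,     "critical space allocation" },
            [BLK_STS_TRANSPORT]     = { -ENOLINK,    "recoverable transport" },
            [BLK_STS_PROTECTION]    = { -EILSEQ,     "protection" },
            [BLK_STS_DEV_RESOURCE]  = { -EBUSY,      "device resource" },
            /* ... further entries elided ... */
            [BLK_STS_IOERR]         = { -EIO,        "I/O" },
    };

  i.e. the differentiated codes exist precisely to distinguish these
  transient conditions from a hard EIO. ]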