Re: IO error handling in xfs_repair

Carlos Maiolino <cem@xxxxxxxxxx> · Fri, 27 Jun 2025 21:24:29 +0200

On Thu, Jun 26, 2025 at 11:56:44PM -0700, Andi Kleen wrote:
> Hi,
> 
> I have a spinning disk with XFS that corrupted a sector containing some inodes.
> Reading it always gave a IO error (ENODATA).
> 
> xfs_repair unfortunately couldn't handle this at all, running into this
> gem:
> 
>          if (process_inode_chunk(mp, agno, num_inos, first_ino_rec,
>                                 ino_discovery, check_dups, extra_attr_check,
>                                 &bogus))  {
>                         /* XXX - i/o error, we've got a problem */
>                         abort();
>          }
> 
> TBH I was a bit shocked that XFS repair doesn't handle IO errors.
> Surely that's a common occurrence?

Hi Andi.

This behavior is well documented in the xfs_repair man page:

"
Disk Errors
	xfs_repair  aborts on most disk I/O errors. Therefore, if you are
	trying to repair a filesystem that was damaged due to a disk drive
	failure, steps should be taken to ensure that all blocks in the
	filesystem are readable and writable before attempting to use
	xfs_repair to repair the filesystem.
	A  possible  method  is  using dd(8) to copy the data onto a good disk.
"
I don't think IO errors could be classified as a common occurrence.

> 
> Anyways, what I ended up doing was to use strace to get the seek offset
> of the bad sector and then write a little python program to clear the block
> (which then likely got remapped, or simply rewritten on the medium),
> and apart from a few lost inodes everything was fine.
> 
> It seems that xfs_repair should have an option to clear erroring blocks that
> it encounters? I realize that this option could be dangerous, but in many cases
> it would seem like the only way to recover.

I believe one of the problems is that xfsprogs can't really pinpoint what
happened. It could be a transient failure due to a link problem, a bad
block on disk, or whatever else. So it has been designed to bail out and
let the admin handle it.

IMO adding an option to force 'clear errored blocks', which basically
means forcing a write() on the block so that it could possibly be
relocated by the disk's firmware, is not a good strategy.
Depending on how many bad sectors are on the disk, or the nature of the
IO error, this might end up damaging the filesystem beyond recovery,
as you mentioned yourself.
So, in some cases, you either gotta try to force the disk to relocate
the block manually or copy the still-readable data somewhere else, both
achievable with `dd` for example.
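
For illustration only, the "copy the still-readable data somewhere else"
route could look roughly like the C sketch below. It is not what dd or
xfs_repair do internally; rescue_copy() is a made-up name, and the chunk
size is something the caller would have to pick. Chunks that fail to read
are written as zeroes so that offsets in the copy still line up with the
original, which is similar in spirit to dd's conv=noerror,sync.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700	/* for pread(), pwrite(), mkstemp() */
#endif
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy `total` bytes from src_fd to dst_fd in
 * blksz-sized chunks. A chunk that cannot be read is written out as
 * zeroes instead of aborting the copy.
 *
 * Returns the number of chunks that were zero-filled, or -1 if the
 * buffer could not be allocated or a write to the destination failed.
 */
static long rescue_copy(int src_fd, int dst_fd, size_t blksz, off_t total)
{
	char *buf;
	long bad = 0;
	off_t off;

	assert(blksz > 0);
	buf = malloc(blksz);
	if (!buf)
		return -1;

	for (off = 0; off < total; off += (off_t)blksz) {
		size_t want = blksz;
		ssize_t got;

		/* the last chunk may be shorter than blksz */
		if ((off_t)want > total - off)
			want = (size_t)(total - off);

		/* retry reads that were merely interrupted by a signal */
		do {
			got = pread(src_fd, buf, want, off);
		} while (got < 0 && errno == EINTR);

		if (got < 0) {
			/* unreadable chunk: substitute zeroes and count it */
			memset(buf, 0, want);
			bad++;
		} else if ((size_t)got < want) {
			/* short read: zero the tail so offsets stay aligned */
			memset(buf + got, 0, want - (size_t)got);
		}

		if (pwrite(dst_fd, buf, want, off) != (ssize_t)want) {
			free(buf);
			return -1;
		}
	}

	free(buf);
	return bad;
}
```

A nonzero return value means the copy contains zero-filled holes, so
xfs_repair run against the copy would still have to deal with whatever
metadata lived in those chunks.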

> 
> Or at a minimum print the seek offset on an error so that it can be cleared manually.
> 

This seems weird. If xfs_repair bailed where you pointed, in the call to
process_inode_chunk(), it likely bailed from here:

if (error) {
	do_warn(_("cannot read inode %" PRIu64 ", disk block %" PRId64 ", cnt %d\n"),
		XFS_AGINO_TO_INO(mp, agno, first_irec->ino_startnum),
		XFS_AGB_TO_DADDR(mp, agno, agbno),
		XFS_FSB_TO_BB(mp,
			M_IGEO(mp)->blocks_per_cluster));
	while (bp_index > 0) {
		bp_index--;
		libxfs_buf_relse(bplist[bp_index]);
	}
	free(bplist);
	return(1);
}

process_inode_chunk() was supposed to log the inode and disk block.
Perhaps the abort() prevented the stderr buffer from being flushed. Do you
still have the whole xfs_repair output up to the point where it failed?
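
If that warning did make it into your output, the "disk block" it prints
is a daddr, i.e. it counts 512-byte basic blocks rather than bytes, and
"cnt" is in the same unit. A small sketch of turning those two numbers
into a byte offset and length for a manual seek follows. The helper names
daddr_to_bytes() and format_bad_range() are made up for illustration and
are not part of xfsprogs.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* XFS basic block size in bytes (daddr values count these units) */
#define BASIC_BLOCK_SIZE 512

/* Hypothetical helper: convert a daddr to a byte offset on the device. */
static int64_t daddr_to_bytes(int64_t daddr)
{
	return daddr * BASIC_BLOCK_SIZE;
}

/*
 * Hypothetical helper: describe the byte range covered by a
 * "disk block <daddr>, cnt <cnt>" warning. Returns what snprintf()
 * returns, i.e. the length the full message needs.
 */
static int format_bad_range(char *buf, size_t len, int64_t daddr, int cnt)
{
	return snprintf(buf, len,
			"daddr %" PRId64 " -> byte offset %" PRId64
			", length %" PRId64 " bytes",
			daddr, daddr_to_bytes(daddr),
			daddr_to_bytes((int64_t)cnt));
}
```

For example, disk block 8 with cnt 8 corresponds to byte offset 4096 and
a length of 4096 bytes.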