Re: XFS complains about data corruption after xfs_repair

On 5/25/25 5:39 AM, Roy Sigurd Karlsbakk wrote:
>> On 24 May 2025, at 03:18, Roy Sigurd Karlsbakk <roy@xxxxxxxxxxxxx> wrote:
>>
>>> On 4 May 2025, at 00:34, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>>>
>>> On Sat, May 03, 2025 at 04:01:48AM +0200, Roy Sigurd Karlsbakk wrote:
>>>> Hi all
>>>>
>>>> I have an XFS filesystem on an LVM LV which resides on an md RAID-10 with four Seagate Exos 16TB drives. This has worked well for a long time, but it just started complaining. The initial logs showed a lot of errors and I couldn't access the filesystem, so I gave it a reboot - that is, I had to force one. Anyway - it booted up again and looked normal, but still complained. I rebooted to single user mode and found the (non-root) filesystem already mounted and couldn't unmount it, so I commented it out of fstab and rebooted once more to single user mode. This allowed me to run xfs_repair, although I had to use -L. Regardless, it finished, I re-enabled the filesystem in fstab and rebooted once more. Starting up now, it seems to work, somehow, but XFS still throws some errors as shown below, that is, "XFS (dm-0): corrupt dinode 43609984, (btree extents)." It seems to be the same dinode each time.
>>>>
>>>> Isn't an xfs_repair supposed to fix this?
>>>>
>>>> I'm running Debian Bookworm 12.10, kernel 6.1.0-34-amd64 and xfsprogs 6.1.0 - everything just clean Debian.
>>>
>>> Can you pull a newer xfsprogs from debian/testing or /unstable or
>>> build the latest version from source and see if the problem
>>> persists?
>>
>> I just tried with xfsprogs 6.14 and also upgraded the kernel from 6.1.0-35 to 6.12.22+bpo. The new xfsprogs isn't installed properly, just lying in its own directory to be run from there. I took the system down again and ran a new repair. After the initial repair, I ran it another time, and another, just to check. After rebooting, it still throws the same error at me: "[lø. mai 3 03:28:14 2025] XFS (dm-0): Metadata corruption detected at xfs_iread_bmbt_block+0x271/0x2d0 [xfs], inode 0x2996f80 xfs_iread_bmbt_block"
>>
>>> It is complaining that it is trying to load more extents than the
>>> inode thinks it has allocated in ip->if_nextents.
>>>
>>> That means either the btree has too many extents in it, or the inode
>>> extent count is wrong. I can't tell which it might be from the
>>> dump output, so it would be useful to know if xfs-repair is actually
>>> detecting this issue, too.
>>>
>>> Can you post the output from xfs_repair? Could you also pull a newer
>>> xfs_repair from debian/testing or build 6.14 from source and see if
>>> the problem is detected and/or fixed?
>>
>> I couldn't find much relevant output, really. I can obviously run it again, but it takes some time, so if you have some magic options to try with it, please let me know first.

Generally best to just pass along the full output and let the requester
decide what is relevant ;) (you may be right, but often reporters filter
too much.)
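
If it helps, a no-modify pass captures everything without changing the
filesystem; something like the below (the device path is just a placeholder
for your actual LV):

  # -n = no-modify (dry run); substitute your real LV path
  xfs_repair -n /dev/mapper/yourvg-yourlv 2>&1 | tee xfs_repair.log

Then the whole xfs_repair.log can go along with the report.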

> 
> So, basically, I now get this error message in dmesg/the kernel log every five seconds:
> 
> [sø. mai 25 12:12:40 2025] XFS (dm-0): corrupt dinode 43609984, (btree extents).
> [sø. mai 25 12:12:40 2025] XFS (dm-0): Metadata corruption detected at xfs_iread_bmbt_block+0x2ad/0x320 [xfs], inode 0x2996f80 xfs_iread_bmbt_block
> [sø. mai 25 12:12:40 2025] XFS (dm-0): Unmount and run xfs_repair
> [sø. mai 25 12:12:40 2025] XFS (dm-0): First 72 bytes of corrupted metadata buffer:
> [sø. mai 25 12:12:40 2025] 00000000: 42 4d 41 33 00 00 00 f8 00 00 00 01 10 26 57 2a  BMA3.........&W*
> [sø. mai 25 12:12:40 2025] 00000010: ff ff ff ff ff ff ff ff 00 00 00 06 61 32 b9 58  ............a2.X
> [sø. mai 25 12:12:40 2025] 00000020: 00 00 01 27 00 13 83 80 a4 0c 52 99 b8 45 4b 5b  ...'......R..EK[
> [sø. mai 25 12:12:40 2025] 00000030: b6 3e 63 d8 b0 5e 20 5f 00 00 00 00 02 99 6f 80  .>c..^ _......o.
> [sø. mai 25 12:12:40 2025] 00000040: 7f fb a7 f6 00 00 00 00                          ........
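
If you want to poke at the flagged inode yourself in the meantime, xfs_db
can dump it read-only; roughly (again, substitute your actual device for
the placeholder path):

  # -r = read-only; 43609984 is the decimal form of inode 0x2996f80 from the log
  xfs_db -r -c "inode 43609984" -c "print" /dev/mapper/yourvg-yourlv

The print output should include core.nextents, the extent count the kernel
is comparing against when it trips over the bmap btree.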


You might consider sending a compressed xfs_metadump image off-list to me and/or Dave.

xfs_metadump obfuscates filenames by default and contains no data blocks, but sometimes
strings slip through, so I generally suggest not posting it to the list.

But with the metadump in hand, perhaps someone has time to do more investigation,
assuming the problem reproduces on the restored image.
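
For reference, the usual incantation is something along these lines (the file
names and device path are placeholders):

  # metadata-only dump (filenames obfuscated by default), compressed on the fly
  xfs_metadump /dev/mapper/yourvg-yourlv - | gzip > fs.metadump.gz

  # the receiver rebuilds a sparse image from the dump and tries to reproduce
  gunzip fs.metadump.gz
  xfs_mdrestore fs.metadump fs.img
  xfs_repair -n fs.img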

-Eric
 
> roy
> --



