For anyone interested in evaluating this problem, here is a cleanroom'd
reproducer metadump image that demonstrates it.

For a single-block extent format directory ...

xfs_db> inode 131
xfs_db> p
...
0:[0,15,1,0]
...
xfs_db> fsblock 15
xfs_db> type dir3

we corrupt the magic, the uuid, and the crc of this dir3 block:

xfs_db> write -c bhdr.hdr.uuid 0xdeadbeef
Allowing write of corrupted data and bad CRC
bhdr.hdr.uuid = 20000000-0000-0000-0000-0000deadbeef
xfs_db> write -c bhdr.hdr.magic 0xfeedface
Allowing write of corrupted data and bad CRC
bhdr.hdr.magic = 0xfeedface
xfs_db> quit

and then xfs_repair fails to fix things up enough to pass the verifier
when it tries to write out the buffer after "fixing" it:

# xfs_repair foo.img
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
Metadata CRC error detected at 0x5556997d5ca0, xfs_dir3_block block 0x78/0x1000
bad directory block magic # 0xfeedface in block 0 for directory inode 131
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 2
        - agno = 1
        - agno = 3
bad directory block magic # 0xfeedface in block 0 for directory inode 131
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
bad directory block magic # 0xfeedface for directory inode 131 block 0: fixing magic # to 0x58444233
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
Metadata corruption detected at 0x5556997d5990, xfs_dir3_block block 0x78/0x1000
libxfs_bwrite: write verifier failed on xfs_dir3_block bno 0x78/0x8
xfs_repair: Releasing dirty buffer to free list!
xfs_repair: Refusing to write a corrupt buffer to the data device!
xfs_repair: Lost a write to the data device!

fatal error -- File system metadata writeout failed, err=117.  Re-run xfs_repair.

This is because this sequence in longform_dir2_entry_check():

	/* check v5 metadata */
	d = bp->b_addr;
	if (be32_to_cpu(d->magic) == XFS_DIR3_BLOCK_MAGIC ||
	    be32_to_cpu(d->magic) == XFS_DIR3_DATA_MAGIC) {
		error = check_dir3_header(mp, bp, ino);
		if (error) {
			fixit++;
			if (fmt == XFS_DIR2_FMT_BLOCK)
				goto out_fix;

			libxfs_buf_relse(bp);
			bp = NULL;
			continue;
		}
	}

	longform_dir2_entry_check_data(mp, ip, num_illegal, need_dot,
			irec, ino_offset, bp, hashtab,
			&freetab, da_bno, fmt == XFS_DIR2_FMT_BLOCK);

would have fixed the UUID had check_dir3_header found the error, but the
magic was wrong so that never ran and fixit was never set.

longform_dir2_entry_check_data then fixes the magic and the crc, but does
not fix the UUID, so the verifier check fails on writeout.

When all 3 items are bad, I'm not exactly sure what we should do to get
the UUID fixed up here (or if it just should have been junked at that
point)

-Eric
Attachment:
repro.meta.bz2
Description: BZip2 compressed data
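
For readers who want to step through the control flow described in the
message, here is a toy model of it in standalone C. It is only a sketch:
every toy_* name is hypothetical and none of it is the real xfs_repair or
libxfs code. It assumes that the out_fix path ends up with a good UUID (the
message says the UUID would have been fixed on that path, without showing
how), that the CRC is simply recomputed when the buffer is written, and it
models only the block-format magic, not XFS_DIR3_DATA_MAGIC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these names are the real xfs_repair/libxfs API. */

/* The value repair reports writing in the log above ("fixing magic # to"). */
#define TOY_DIR3_BLOCK_MAGIC 0x58444233u

struct toy_dir_block {
	uint32_t magic;
	bool uuid_ok;	/* does the header UUID match the filesystem? */
	bool crc_ok;
};

/* Stand-in for check_dir3_header(): nonzero when the header is bad. */
static int toy_check_header(const struct toy_dir_block *b)
{
	return b->uuid_ok ? 0 : 1;
}

/*
 * Stand-in for longform_dir2_entry_check_data(): repairs the magic and
 * leaves the block with a good CRC, but never touches the UUID.
 */
static void toy_entry_check_data(struct toy_dir_block *b)
{
	b->magic = TOY_DIR3_BLOCK_MAGIC;
	b->crc_ok = true;
}

/* Stand-in for the write verifier: magic and UUID must both be good. */
static bool toy_write_verifier(const struct toy_dir_block *b)
{
	return b->magic == TOY_DIR3_BLOCK_MAGIC && b->uuid_ok;
}

/*
 * Mirrors the shape of the quoted sequence. Returns true if the block
 * would pass the write verifier afterwards; *fixit counts how often the
 * header check flagged the block.
 */
bool toy_repair_block(struct toy_dir_block *b, int *fixit)
{
	/* The header check is only reached when the magic is already valid. */
	if (b->magic == TOY_DIR3_BLOCK_MAGIC) {
		if (toy_check_header(b)) {
			(*fixit)++;
			/* "goto out_fix": assumed to leave a good UUID and CRC. */
			b->uuid_ok = true;
			b->crc_ok = true;
			return toy_write_verifier(b);
		}
	}

	/* Bad magic falls straight through to here with fixit untouched. */
	toy_entry_check_data(b);
	return toy_write_verifier(b);
}
```

Calling toy_repair_block() on a block with magic 0xfeedface and a bad UUID
leaves fixit at 0 and fails the toy verifier, matching the failure above.
The same call on a block with a good magic and a bad UUID bumps fixit and
passes.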