Re: [PATCH v3] xfs_repair: handling a block with bad crc, bad uuid, and bad magic number needs fixing

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 3/21/25 5:05 PM, bodonnel@xxxxxxxxxx wrote:
> From: Bill O'Donnell <bodonnel@xxxxxxxxxx>
> 
> In certain cases, if a block is so messed up that crc, uuid and magic
> number are all bad, we need to not only detect in phase3 but fix it
> properly in phase6. In the current code, the mechanism doesn't work
> in that it only pays attention to one of the parameters.
> 
> Note: in this case, the nlink inode link count drops to 1, but
> re-running xfs_repair fixes it back to 2. This is a side effect that
> should probably be handled in update_inode_nlinks() with separate patch.
> Regardless, running xfs_repair twice, with this patch applied
> fixes the issue. Recognize that this patch is a fix for xfs v5.

The reason this fix leaves the inode nlinks in an inconsistent state
is because the dir is (was) a longform directory (XFS_DIR2_FMT_BLOCK),
and we go down this path with your patch in place:

                /* check v5 metadata */
                if (xfs_has_crc(mp)) {
                        error = check_dir3_header(mp, bp, ino);
                        if (error) {
                                fixit++;
                                if (fmt == XFS_DIR2_FMT_BLOCK) { <==== true
                                        goto out_fix;	<=== goto
                                }

                                libxfs_buf_relse(bp);
                                bp = NULL;
                                continue;
                        }
                }

                longform_dir2_entry_check_data(mp, ip, num_illegal, need_dot,
                                irec, ino_offset, bp, hashtab,
                                &freetab, da_bno, fmt == XFS_DIR2_FMT_BLOCK);
...

out_fix:

        if (!no_modify && (fixit || dotdot_update)) {
                longform_dir2_rebuild(mp, ino, ip, irec, ino_offset, hashtab);


longform_dir2_rebuild tries to rebuild the directory from the entries found
via longform_dir2_entry_check_data() and placed in hashtab, but because we never
called longform_dir2_entry_check_data(), hashtab is empty. This is why all
entries in the problematic dir end up in lost+found.

That also means that longform_dir2_rebuild completes without adding any entries
at all, and so the directory is now shortform. Because shortform directories 
have no explicit "." entry, I think it would need an extra ref added at this
point.

But I wonder - why not call longform_dir2_entry_check_data() before we check
the header? That way it /will/ populate hashtab with any found entries in the
block, and when the header is found to be corrupt, it will rebuild it with all
entries intact, and leave nothing in lost+found.


Thoughts?
-Eric







[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux