On Mon, May 05, 2025 at 08:06:39AM +0000, Charalampos Mitrodimas wrote:
> The xfs_da3_node_verify() function checks the integrity of directory
> and attribute B-tree node blocks. However, it was missing a check to
> ensure that the hash values of the btree entries within the node are
> in non-decreasing order (equal values are allowed).
>
> Add a loop to iterate through the btree entries and verify that each
> entry's hash value is greater than or equal to the previous one. If
> an out-of-order hash value is detected, return failure to indicate
> corruption.

OK, the code is fine, but....

> This addresses the "XXX: hash order check?" comment and improves
> corruption detection for DA node blocks.

.... it doesn't address that comment. That comment was posed as a
question for good reasons.

Ask yourself this question and then do the math: what is the overhead
of doing this hash order check on a 64kB directory node block? How
many times does this loop iterate, and how much extra CPU does that
burn when you are processing tens of thousands of these blocks every
second?

IOWs, that comment is posed as a question because the hash order
check is trivial to implement, but we've assumed that it is -too
expensive to actually implement-. It has never been clear that the
additional runtime expense is worth the potential gain in corruption
detection coverage.

In terms of performance and scalability, we have to consider what
impact this has on directory lookup performance when there are
millions of entries in a directory. What about when there are
billions of directory entries in the filesystem? What impact does
this have on directory modification and writeback speed (verifiers
are also run prior to writeback, not just on read)? What impact does
it have on overall directory scalability? etc.

Now consider the other side of the coin: what is the risk of
undetected corruptions slipping through because we don't verify the
hash order? Do we have any other protections against out-of-order
hash entries in place? What is the severity of the failure scenarios
associated with an out-of-order hash entry - can it oops the machine,
cause a security issue, etc? Have we ever seen an out-of-order hash
entry in the wild?

Hence we also need to quantify the risk we are assuming by not
checking the hash order exhaustively, and how that risk changes by
adding such checking. What holes in the order checking still exist
even with the new checks added (e.g. do we check hash order across
sibling blocks?). Are there any other protections on node blocks that
already inform us of potential ordering issues without needing
expensive, exhaustive tests? If not, are there new, lower cost checks
we can add that will give us the same detection capability without
the IO-time verification overhead? (e.g. in the hash entry binary
search lookup path - a rough sketch of what I mean is appended at the
end of this mail.)

i.e. what is the risk profile associated with the status quo of the
past 30 years (i.e. no hash order verification at IO time), and how
much does that improve by adding some form of hash order
verification?

Hence, as a first step before we add such hash order checking, we
need performance and scalability regression testing (especially on
directories containing millions of entries) to determine the runtime
hit we will take from adding the check. Once the additional runtime
overhead has been measured, quantified and analysed, then we can
balance that against the risk profile improvement and make an
informed decision on this verification...

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
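
P.S. To make the "lower cost check in the lookup path" suggestion a
bit more concrete: back-of-the-envelope, the da node hash entries are
8 bytes each, so a 64kB node block carries on the order of 8000 of
them, and the proposed verifier loop is roughly that many extra
compares on every read and every writeback of such a block. An order
check confined to the binary search path is only ~log2(8000) ≈ 13
compares per lookup. Below is a rough, untested illustration of that
idea. It deliberately uses stand-in names and types (node_entry,
node_hash_lookup, ent/nents/want) rather than the real on-disk
structures; the real thing would be written against struct
xfs_da_node_entry in or around the node lookup path
(xfs_da3_node_lookup_int()), and would use the proper be32
conversions and corruption reporting.

#include <stdint.h>
#include <stdbool.h>

struct node_entry {
	uint32_t	hashval;	/* stand-in for the on-disk be32 hashval */
	uint32_t	before;		/* child block, unused in this sketch */
};

/*
 * Lower-bound binary search: returns the index of the first entry
 * with hashval >= want, or nents if there is none. Sets *corrupt if
 * the entries we actually probe are not in ascending hash order,
 * which costs ~log2(nents) extra compares per lookup instead of
 * nents - 1 compares on every read and writeback of the block.
 */
static int
node_hash_lookup(const struct node_entry *ent, int nents, uint32_t want,
		 bool *corrupt)
{
	int	lo = 0;
	int	hi = nents - 1;

	*corrupt = false;
	if (nents <= 0)
		return 0;

	/* The endpoints of the search window must be ordered. */
	if (ent[lo].hashval > ent[hi].hashval)
		*corrupt = true;

	while (lo <= hi) {
		int	mid = lo + (hi - lo) / 2;

		/* Every probe must lie within the current window. */
		if (ent[mid].hashval < ent[lo].hashval ||
		    ent[mid].hashval > ent[hi].hashval)
			*corrupt = true;

		if (ent[mid].hashval < want)
			lo = mid + 1;
		else
			hi = mid - 1;
	}
	return lo;
}

Note that this only checks the entries the search actually touches,
so it is not exhaustive - an inversion in entries the lookup never
probes still slips through - but it catches gross corruption on
exactly the entries the lookup is about to trust, for a handful of
compares rather than thousands. Whether that partial coverage is an
acceptable trade-off is precisely the risk-versus-cost question posed
above.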