On Fri, Jul 04, 2025 at 02:13:15PM -0700, Eduard Zingerman wrote:
> On Fri, 2025-07-04 at 10:26 -0700, Eduard Zingerman wrote:
> > On Fri, 2025-07-04 at 19:14 +0200, Paul Chaignon wrote:
> > > On Thu, Jul 03, 2025 at 11:54:27AM -0700, Eduard Zingerman wrote:
> >
> > [...]
> >
> > > > I think is_branch_taken() modification should not be too complicated.
> > > > For JSET it only checks tnum, but does not take ranges into account.
> > > > Reasoning about ranges is something along the lines:
> > > > - for unsigned range a = b & CONST -> a is in [b_min & CONST, b_max & CONST];
> > > > - for signed ranged same thing, but consider two unsigned sub-ranges;
> > > > - for non CONST cases, I think same reasoning can apply, but more
> > > >   min/max combinations need to be explored.
> > > > - then check if zero is a member or 'a' range.
> > > >
> > > > Wdyt?
> > >
> > > I might be missing something, but I'm not sure that works. For the
> > > unsigned range, if we have b & 0x2 with b in [2; 10], then we'd end up
> > > with a in [2; 2] and would conclude that the jump is never taken. But
> > > b=8 proves us wrong.
> >
> > I see, what is really needed is an 'or' joined mask of all 'b' values.
> > I need to think how that can be obtained (or approximated).
>
> I think the mask can be computed as in or_range() function at the
> bottom of the email. This gives the following algorithm, if only
> unsigned range is considered:
>
> - assume prediction is needed for "if a & b goto ..."
> - bits that may be set in 'a' are or_range(a_min, a_max)
> - bits that may be set in 'b' are or_range(b_min, b_max)
> - if computed bit masks intersect: both branches are possible
> - otherwise only false branch is possible.
>
> Wdyt?

This is really nice! I think we can extend it to detect some
always-true branches as well, and thus handle the initial case reported
by syzbot.

- if a_min == 0: we don't deduce anything
- bits that may be set in 'a' are: possible_a = or_range(a_min, a_max)
- bits that are always set in 'b' are: always_b = b_value & ~b_mask
- if (possible_a & always_b) == possible_a: only true branch is possible
- otherwise, we can't deduce anything

For the BPF_X case, we probably also want to check the reverse with
possible_b & always_a.

---

#include <stdint.h>
#include <stdio.h>
#include <stdbool.h>

/* Over-approximate the set of bits that may be set in any x in [lo, hi]. */
static uint64_t or_range(uint64_t lo, uint64_t hi)
{
	uint64_t m;
	uint32_t i;

	m = hi;
	i = 0;
	/* Every bit below the highest bit where lo and hi differ may be set. */
	while (lo != hi) {
		m |= 1ull << i;
		lo >>= 1;
		hi >>= 1;
		i++;
	}
	return m;
}

/* True if every bit that may be set in [lo, hi] is also set in mask. */
static bool always_matches(uint64_t lo, uint64_t hi, uint64_t mask)
{
	uint64_t possible_bits = or_range(lo, hi);

	/* Parentheses are required: '==' binds tighter than '&'. */
	return (possible_bits & mask) == possible_bits;
}

/* Reference: check v & mask != 0 for every v in [lo, hi]. */
static bool always_matches_naive(uint64_t lo, uint64_t hi, uint64_t mask)
{
	uint64_t v = 0;

	for (v = lo; v <= hi; v++) {
		if (!(v & mask)) {
			return false;
		}
	}
	return true;
}

int main(void)
{
	int max = 0x300;

	for (int mask = 0; mask < max; mask++) {
		/* lo starts at 1: nothing is deduced when a_min == 0. */
		for (int lo = 1; lo < max; lo++) {
			for (int hi = lo; hi < max; hi++) {
				bool expected = always_matches_naive(lo, hi, mask);
				bool result = always_matches(lo, hi, mask);

				/* Only a false "always true" answer is unsound. */
				if (result == true && expected == false) {
					printf("mismatch: %x..%x & %x -> expecting %d, result %d\n",
					       lo, hi, mask, expected, result);
					return 1;
				}
			}
		}
	}
	printf("all ok\n");
	return 0;
}
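
For reference, the two checks discussed in this thread (never taken, from
the quoted or_range() algorithm, and always taken, from the list above)
could be put side by side roughly as below. This is only a sketch for
unsigned ranges: struct tnum_sketch is a hypothetical stand-in for the
verifier's tnum (value/mask pair), and jset_always_taken() and
jset_never_taken() are made-up names, not existing kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a tnum: bits set in mask are unknown,
 * all other bits are known and given by value. */
struct tnum_sketch {
	uint64_t value;
	uint64_t mask;
};

/* Bits that may be set in some x with lo <= x <= hi. */
static uint64_t or_range(uint64_t lo, uint64_t hi)
{
	uint64_t m = hi;
	uint32_t i = 0;

	assert(lo <= hi);
	while (lo != hi) {
		m |= (uint64_t)1 << i;
		lo >>= 1;
		hi >>= 1;
		i++;
	}
	return m;
}

/* "if a & b goto ..." is always taken when a is non-zero and every bit
 * that may be set in a is known to be set in b. */
static bool jset_always_taken(uint64_t a_min, uint64_t a_max,
			      struct tnum_sketch b)
{
	uint64_t possible_a = or_range(a_min, a_max);
	uint64_t always_b = b.value & ~b.mask;

	if (a_min == 0)
		return false; /* a may be 0, so a & b may be 0 */
	return (possible_a & always_b) == possible_a;
}

/* The jump is never taken when no bit may be set in both a and b. */
static bool jset_never_taken(uint64_t a_min, uint64_t a_max,
			     uint64_t b_min, uint64_t b_max)
{
	return (or_range(a_min, a_max) & or_range(b_min, b_max)) == 0;
}
```

With a in [1; 3] and b known to have bits 0x3 set (only bit 0x4 unknown),
jset_always_taken() returns true. With the earlier counterexample, b in
[2; 10] tested against the constant 0x2, neither helper returns true, so
both branches stay possible. For BPF_X, jset_always_taken() would also be
called with the roles of a and b swapped.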