Re: [PATCH bpf-next 1/2] bpf: Use tnums for JEQ/JNE is_branch_taken logic

Shung-Hsi Yu <shung-hsi.yu@xxxxxxxx> · Thu, 14 Aug 2025 20:55:22 +0800

On Wed, Aug 13, 2025 at 05:34:08PM +0200, Paul Chaignon wrote:
> In the following toy program (reg states minimized for readability), R0
> and R1 always have different values at instruction 6. This is obvious
> when reading the program but cannot be guessed from ranges alone as
> they overlap (R0 in [0; 0xc0000000], R1 in [1024; 0xc0000400]).
> 
>   0: call bpf_get_prandom_u32#7  ; R0_w=scalar()
>   1: w0 = w0                     ; R0_w=scalar(var_off=(0x0; 0xffffffff))
>   2: r0 >>= 30                   ; R0_w=scalar(var_off=(0x0; 0x3))
>   3: r0 <<= 30                   ; R0_w=scalar(var_off=(0x0; 0xc0000000))
>   4: r1 = r0                     ; R1_w=scalar(var_off=(0x0; 0xc0000000))
>   5: r1 += 1024                  ; R1_w=scalar(var_off=(0x400; 0xc0000000))
>   6: if r1 != r0 goto pc+1
> 
> Looking at tnums however, we can deduce that R1 is always different from
> R0 because their tnums don't agree on known bits. This patch uses this
> logic to improve is_scalar_branch_taken in case of BPF_JEQ and BPF_JNE.
> 
> This change has a tiny impact on complexity, which was measured with
> the Cilium complexity CI test. That test covers 72 programs with
> various build and load time configurations for a total of 970 test
> cases. For 80% of test cases, the patch has no impact. On the other
> test cases, the patch decreases complexity by only 0.08% on average. In
> the best case, the verifier needs to walk 3% less instructions and, in
> the worst case, 1.5% more. Overall, the patch has a small positive
> impact, especially for our largest programs.
> 
> Signed-off-by: Paul Chaignon <paul.chaignon@xxxxxxxxx>
> ---
>  include/linux/tnum.h  | 3 +++
>  kernel/bpf/tnum.c     | 8 ++++++++
>  kernel/bpf/verifier.c | 4 ++++
>  3 files changed, 15 insertions(+)
> 
> diff --git a/include/linux/tnum.h b/include/linux/tnum.h
> index 57ed3035cc30..06a41d070e75 100644
> --- a/include/linux/tnum.h
> +++ b/include/linux/tnum.h
> @@ -51,6 +51,9 @@ struct tnum tnum_xor(struct tnum a, struct tnum b);
>  /* Multiply two tnums, return @a * @b */
>  struct tnum tnum_mul(struct tnum a, struct tnum b);
>  
> +/* Return true if the known bits of both tnums have the same value */
> +bool tnum_agree(struct tnum a, struct tnum b);
> +
>  /* Return a tnum representing numbers satisfying both @a and @b */
>  struct tnum tnum_intersect(struct tnum a, struct tnum b);
>  
> diff --git a/kernel/bpf/tnum.c b/kernel/bpf/tnum.c
> index fa353c5d550f..8cb73d35196e 100644
> --- a/kernel/bpf/tnum.c
> +++ b/kernel/bpf/tnum.c
> @@ -143,6 +143,14 @@ struct tnum tnum_mul(struct tnum a, struct tnum b)
>  	return tnum_add(TNUM(acc_v, 0), acc_m);
>  }
>  
> +bool tnum_agree(struct tnum a, struct tnum b)
> +{
> +	u64 mu;
> +
> +	mu = ~a.mask & ~b.mask;
> +	return (a.value & mu) == (b.value & mu);
> +}

Nit: I finding the naming a bit unconventional compared to other tnum
helpers we have, with are either usually named after a BPF instruction
or set operation. tnum_overlap() would be my choice for the name of such
new helper.

One more comment below.

>  /* Note that if a and b disagree - i.e. one has a 'known 1' where the other has
>   * a 'known 0' - this will return a 'known 1' for that bit.
>   */
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 3a3982fe20d4..fa86833254e3 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -15891,6 +15891,8 @@ static int is_scalar_branch_taken(struct bpf_reg_state *reg1, struct bpf_reg_sta
>  			return 0;
>  		if (smin1 > smax2 || smax1 < smin2)
>  			return 0;
> +		if (!tnum_agree(t1, t2))
> +			return 0;

Could we reuse tnum_xor() here instead?

If xor of two register cannot be 0, then the two can never hold the same
value. Also we can use the tnum_xor() result in place of tnum_is_const()
checks.

case BPF_JEQ:
    t = tnum_xor(t1, t2);
    if (!t.mask) /* Equvalent of tnum_is_const(t1) && tnum_is_const(t2) */
        return t.value == 0;
    if (umin1 > umax2 || umax1 < umin2)
        return 0;
    if (smin1 > smax2 || smax1 < smin2)
        return 0;
    if (!t.value) /* Equvalent of !tnum_agree(t1, t2) */
      return 0;
    ...
case BPF_JNE:
    t = tnum_xor(t1, t2);
    if (!t.mask) /* Equvalent of tnum_is_const(t1) && tnum_is_const(t2) */
        return t.value != 0;
    /* non-overlapping ranges */
    if (umin1 > umax2 || umax1 < umin2)
        return 1;
    if (smin1 > smax2 || smax1 < smin2)
        return 1;
    if (!t.value) /* Equvalent of !tnum_agree(t1, t2) */
        return 1;

Looks slighly less readable though.