On Tue, 2025-07-08 at 18:19 +0200, Paul Chaignon wrote: > On Mon, Jul 07, 2025 at 05:57:32PM -0700, Eduard Zingerman wrote: > > On Mon, 2025-07-07 at 17:51 -0700, Alexei Starovoitov wrote: > > > On Mon, Jul 7, 2025 at 5:37 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote: > > > > > > > > On Mon, 2025-07-07 at 16:29 -0700, Eduard Zingerman wrote: > > > > > On Tue, 2025-07-08 at 00:30 +0200, Paul Chaignon wrote: > > [...] > > > > > But I think the program below would still be problematic: > > > > > > > > SEC("socket") > > > > __success > > > > __retval(0) > > > > __naked void jset_bug1(void) > > > > { > > > > asm volatile (" \ > > > > call %[bpf_get_prandom_u32]; \ > > > > if r0 < 2 goto 1f; \ > > > > r0 |= 1; \ > > > > if r0 & -2 goto 1f; \ > > > > 1: r0 = 0; \ > > > > exit; \ > > > > " : > > > > : __imm(bpf_get_prandom_u32) > > > > : __clobber_all); > > > > } > > > > > > > > The possible_r0 would be changed by `if r0 & -2`, so new rule will not hit. > > > > And the problem remains unsolved. I think we need to reset min/max > > > > bounds in regs_refine_cond_op for JSET: > > > > - in some cases range is more precise than tnum > > > > - in these cases range cannot be compressed to a tnum > > > > - predictions in jset are done for a tnum > > > > - to avoid issues when narrowing tnum after prediction, forget the > > > > range. > > > > > > You're digging too deep. llvm doesn't generate JSET insn, > > > so this is syzbot only issue. Let's address it with minimal changes. > > > Do not introduce fancy branch taken analysis. > > > I would be fine with reverting this particular verifier_bug() hunk. > > Ok, if LLVM doesn't generate JSETs, I agree there's not much point > trying to reduce false positives. I like Eduard's solution below > because it handles the JSET case without removing the warning. Given > the number of crashes syzkaller is generating, I suspect this isn't > only about JSET, so it'd be good to keep some visibility into invariant > violations. I suspect similar problems might be found in any place where tnum operations are used to narrow the range. E.g. if a repro for JSET would be found, same repro might be applicable to BPF_AND. In general, it might be the case we should not treat out of sync bounds as an error. Assuming that tnum and bounds based ranges have different precision in different scale regions, situations when one bound is changed w/o changing another can be legit. E.g.: ____ bounds range ____ / \ 0 --------------------------------------------------------- MAX \___________________________________________________/ tnum range Narrowing only tnum: ____ bounds range ____ / \ 0 --------------------------------------------------------- MAX \___________________/ tnum range This does not highlight an error, but a difference in expressive power for specific values. > > My point is that the fix should look as below (but extract it as a > > utility function): > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > > index 53007182b46b..b2fe665901b7 100644 > > --- a/kernel/bpf/verifier.c > > +++ b/kernel/bpf/verifier.c > > @@ -16207,6 +16207,14 @@ static void regs_refine_cond_op(struct bpf_reg_state *reg1, struct bpf_reg_state > > swap(reg1, reg2); > > if (!is_reg_const(reg2, is_jmp32)) > > break; > > + reg1->u32_max_value = U32_MAX; > > + reg1->u32_min_value = 0; > > + reg1->s32_max_value = S32_MAX; > > + reg1->s32_min_value = S32_MIN; > > + reg1->umax_value = U64_MAX; > > + reg1->umin_value = 0; > > + reg1->smax_value = S64_MAX; > > + reg1->smin_value = S32_MIN; > > Looks like __mark_reg_unbounded :) I suspected there should be something already :) > I can send a test case + __mark_reg_unbounded for BPF_JSET | BPF_X in > regs_refine_cond_op. I suspect we may need the same for the BPF_JSET > case as well, but I'm unable to build a repro for that so far. Please go ahead. > > > val = reg_const_value(reg2, is_jmp32); > > if (is_jmp32) { > > t = tnum_and(tnum_subreg(reg1->var_off), tnum_const(~val)); > > > > ---- > > > > Because of irreconcilable differences in what can be represented as a > > tnum and what can be represented as a range.