Re: [syzbot] [bpf?] WARNING in reg_bounds_sanity_check

Paul Chaignon <paul.chaignon@xxxxxxxxx> · Fri, 4 Jul 2025 19:14:38 +0200

On Thu, Jul 03, 2025 at 11:54:27AM -0700, Eduard Zingerman wrote:
> On Thu, 2025-07-03 at 19:02 +0200, Paul Chaignon wrote:
> > On Tue, Jul 01, 2025 at 06:55:28PM -0700, syzbot wrote:
> > > Hello,
> > >
> > > syzbot found the following issue on:
> > >
> > > HEAD commit:    cce3fee729ee selftests/bpf: Enable dynptr/test_probe_read_..
> > > git tree:       bpf-next
> > > console+strace: https://syzkaller.appspot.com/x/log.txt?x=147793d4580000
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=79da270cec5ffd65
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=c711ce17dd78e5d4fdcf
> > > compiler:       Debian clang version 20.1.6 (++20250514063057+1e4d39e07757-1~exp1~20250514183223.118), Debian LLD 20.1.6
> > > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1594e48c580000
> > > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1159388c580000
> > >
> > > Downloadable assets:
> > > disk image: https://storage.googleapis.com/syzbot-assets/f286a7ef4940/disk-cce3fee7.raw.xz
> > > vmlinux: https://storage.googleapis.com/syzbot-assets/e2f2ebe1fdc3/vmlinux-cce3fee7.xz
> > > kernel image: https://storage.googleapis.com/syzbot-assets/6e3070663778/bzImage-cce3fee7.xz
> > >
> > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > Reported-by: syzbot+c711ce17dd78e5d4fdcf@xxxxxxxxxxxxxxxxxxxxxxxxx
> > >
> > > ------------[ cut here ]------------
> > > verifier bug: REG INVARIANTS VIOLATION (false_reg1): range bounds violation u64=[0x0, 0x0] s64=[0x0, 0x0] u32=[0x1, 0x0] s32=[0x0, 0x0] var_off=(0x0, 0x0)(1)
> > > WARNING: CPU: 1 PID: 5833 at kernel/bpf/verifier.c:2688 reg_bounds_sanity_check+0x6e6/0xc20 kernel/bpf/verifier.c:2682
> >
> > I'm unsure how to handle this one.
> >
> > One example repro is as follows.
> >
> >   0: call bpf_get_netns_cookie
> >   1: if r0 == 0 goto <exit>
> >   2: if r0 & Oxffffffff goto <exit>
> >
> > The issue is on the path where we fall through both jumps.
> >
> > That path is unreachable at runtime: after insn 1, we know r0 != 0, but
> > with the sign extension on the jset, we would only fallthrough insn 2
> > if r0 == 0. Unfortunately, is_branch_taken() isn't currently able to
> > figure this out, so the verifier walks all branches. As a result, we end
> > up with inconsistent register ranges on this unreachable path:
> >
> >   0: if r0 == 0 goto <exit>
> >     r0: u64=[0x1, 0xffffffffffffffff] var_off=(0, 0xffffffffffffffff)
> >   1: if r0 & 0xffffffff goto <exit>
> >     r0 before reg_bounds_sync: u64=[0x1, 0xffffffffffffffff] var_off=(0, 0)
> >     r0 after reg_bounds_sync:  u64=[0x1, 0] var_off=(0, 0)
> >
> > I suspect there isn't anything specific to these two conditions, and
> > anytime we start walking an unreachable path, we may end up with
> > inconsistent register ranges. The number of times syzkaller is currently
> > hitting this (180 in 1.5 days) suggests there are many different ways to
> > reproduce.
> >
> > We could teach is_branch_taken() about this case, but we probably won't
> > be able to cover all cases. We could stop warning on this, but then we
> > may also miss legitimate cases (i.e., invariants violations on reachable
> > paths). We could also teach reg_bounds_sync() to stop refining the
> > bounds before it gets inconsistent, but I'm unsure how useful that'd be.
> 
> Hi Paul,
> 
> In general, I think that reg_bounds_sync() can be used as a substitute
> for is_branch_taken() -> whenever an impossible range is produced,
> the branch should be deemed impossible at runtime and abandoned.
> If I recall correctly Andrii considered this too risky some time ago,
> so this warning is in place to catch bugs.

Hi Eduard,

Yeah, that feels risky enough that I didn't even dare mention it as a
possibility :)

> 
> Which leaves only the option to refine is_branch_taken().
> 
> I think is_branch_taken() modification should not be too complicated.
> For JSET it only checks tnum, but does not take ranges into account.
> Reasoning about ranges is something along the lines:
> - for unsigned range a = b & CONST -> a is in [b_min & CONST, b_max & CONST];
> - for signed ranged same thing, but consider two unsigned sub-ranges;
> - for non CONST cases, I think same reasoning can apply, but more
>   min/max combinations need to be explored.
> - then check if zero is a member or 'a' range.
> 
> Wdyt?

I might be missing something, but I'm not sure that works. For the
unsigned range, if we have b & 0x2 with b in [2; 10], then we'd end up
with a in [2; 2] and would conclude that the jump is never taken. But
b=8 proves us wrong.

> 
> > The number of times syzkaller is currently hitting this (180 in 1.5
> > days) suggests there are many different ways to reproduce.
> 
> It is a bit inconvenient to read syzbot BPF reports at the moment,
> because it us hard to figure out how the program looks like.
> Do you happen to know how complicated would it be to modify syzbot
> output to:
> - produce a comment with BPF program
> - generating reproducer with a flag, allowing to print level 2
>   verifier log
> ?

I have the same thought sometimes. Right now, I add verifier logs to a
syz or C reproducer to see the program. Producing the BPF program in a
comment would likely be tricky as we'd need to maintain a disassembler
in syzkaller. Adding verifier logs to reproducers that contain
bpf(PROG_LOAD) calls seems easier. Then I guess we'd get that output in
the strace or console logs of syzbot.

> 
> Thanks,
> Eduard