On Thu, Jul 03, 2025 at 11:54:27AM -0700, Eduard Zingerman wrote: > On Thu, 2025-07-03 at 19:02 +0200, Paul Chaignon wrote: > > On Tue, Jul 01, 2025 at 06:55:28PM -0700, syzbot wrote: > > > Hello, > > > > > > syzbot found the following issue on: > > > > > > HEAD commit: cce3fee729ee selftests/bpf: Enable dynptr/test_probe_read_.. > > > git tree: bpf-next > > > console+strace: https://syzkaller.appspot.com/x/log.txt?x=147793d4580000 > > > kernel config: https://syzkaller.appspot.com/x/.config?x=79da270cec5ffd65 > > > dashboard link: https://syzkaller.appspot.com/bug?extid=c711ce17dd78e5d4fdcf > > > compiler: Debian clang version 20.1.6 (++20250514063057+1e4d39e07757-1~exp1~20250514183223.118), Debian LLD 20.1.6 > > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1594e48c580000 > > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1159388c580000 > > > > > > Downloadable assets: > > > disk image: https://storage.googleapis.com/syzbot-assets/f286a7ef4940/disk-cce3fee7.raw.xz > > > vmlinux: https://storage.googleapis.com/syzbot-assets/e2f2ebe1fdc3/vmlinux-cce3fee7.xz > > > kernel image: https://storage.googleapis.com/syzbot-assets/6e3070663778/bzImage-cce3fee7.xz > > > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > > > Reported-by: syzbot+c711ce17dd78e5d4fdcf@xxxxxxxxxxxxxxxxxxxxxxxxx > > > > > > ------------[ cut here ]------------ > > > verifier bug: REG INVARIANTS VIOLATION (false_reg1): range bounds violation u64=[0x0, 0x0] s64=[0x0, 0x0] u32=[0x1, 0x0] s32=[0x0, 0x0] var_off=(0x0, 0x0)(1) > > > WARNING: CPU: 1 PID: 5833 at kernel/bpf/verifier.c:2688 reg_bounds_sanity_check+0x6e6/0xc20 kernel/bpf/verifier.c:2682 > > > > I'm unsure how to handle this one. > > > > One example repro is as follows. > > > > 0: call bpf_get_netns_cookie > > 1: if r0 == 0 goto <exit> > > 2: if r0 & Oxffffffff goto <exit> > > > > The issue is on the path where we fall through both jumps. > > > > That path is unreachable at runtime: after insn 1, we know r0 != 0, but > > with the sign extension on the jset, we would only fallthrough insn 2 > > if r0 == 0. Unfortunately, is_branch_taken() isn't currently able to > > figure this out, so the verifier walks all branches. As a result, we end > > up with inconsistent register ranges on this unreachable path: > > > > 0: if r0 == 0 goto <exit> > > r0: u64=[0x1, 0xffffffffffffffff] var_off=(0, 0xffffffffffffffff) > > 1: if r0 & 0xffffffff goto <exit> > > r0 before reg_bounds_sync: u64=[0x1, 0xffffffffffffffff] var_off=(0, 0) > > r0 after reg_bounds_sync: u64=[0x1, 0] var_off=(0, 0) > > > > I suspect there isn't anything specific to these two conditions, and > > anytime we start walking an unreachable path, we may end up with > > inconsistent register ranges. The number of times syzkaller is currently > > hitting this (180 in 1.5 days) suggests there are many different ways to > > reproduce. > > > > We could teach is_branch_taken() about this case, but we probably won't > > be able to cover all cases. We could stop warning on this, but then we > > may also miss legitimate cases (i.e., invariants violations on reachable > > paths). We could also teach reg_bounds_sync() to stop refining the > > bounds before it gets inconsistent, but I'm unsure how useful that'd be. > > Hi Paul, > > In general, I think that reg_bounds_sync() can be used as a substitute > for is_branch_taken() -> whenever an impossible range is produced, > the branch should be deemed impossible at runtime and abandoned. > If I recall correctly Andrii considered this too risky some time ago, > so this warning is in place to catch bugs. Hi Eduard, Yeah, that feels risky enough that I didn't even dare mention it as a possibility :) > > Which leaves only the option to refine is_branch_taken(). > > I think is_branch_taken() modification should not be too complicated. > For JSET it only checks tnum, but does not take ranges into account. > Reasoning about ranges is something along the lines: > - for unsigned range a = b & CONST -> a is in [b_min & CONST, b_max & CONST]; > - for signed ranged same thing, but consider two unsigned sub-ranges; > - for non CONST cases, I think same reasoning can apply, but more > min/max combinations need to be explored. > - then check if zero is a member or 'a' range. > > Wdyt? I might be missing something, but I'm not sure that works. For the unsigned range, if we have b & 0x2 with b in [2; 10], then we'd end up with a in [2; 2] and would conclude that the jump is never taken. But b=8 proves us wrong. > > > The number of times syzkaller is currently hitting this (180 in 1.5 > > days) suggests there are many different ways to reproduce. > > It is a bit inconvenient to read syzbot BPF reports at the moment, > because it us hard to figure out how the program looks like. > Do you happen to know how complicated would it be to modify syzbot > output to: > - produce a comment with BPF program > - generating reproducer with a flag, allowing to print level 2 > verifier log > ? I have the same thought sometimes. Right now, I add verifier logs to a syz or C reproducer to see the program. Producing the BPF program in a comment would likely be tricky as we'd need to maintain a disassembler in syzkaller. Adding verifier logs to reproducers that contain bpf(PROG_LOAD) calls seems easier. Then I guess we'd get that output in the strace or console logs of syzbot. > > Thanks, > Eduard