On Tue, 2025-06-17 at 19:17 -0400, Harishankar Vishwanathan wrote: > This patch improves the precison of the scalar(32)_min_max_add and > scalar(32)_min_max_sub functions, which update the u(32)min/u(32)_max > ranges for the BPF_ADD and BPF_SUB instructions. We discovered this more > precise operator using a technique we are developing for automatically > synthesizing functions for updating tnums and ranges. > > According to the BPF ISA [1], "Underflow and overflow are allowed during > arithmetic operations, meaning the 64-bit or 32-bit value will wrap". > Our patch leverages the wrap-around semantics of unsigned overflow and > underflow to improve precision. > > Below is an example of our patch for scalar_min_max_add; the idea is > analogous for all four functions. > > There are three cases to consider when adding two u64 ranges [dst_umin, > dst_umax] and [src_umin, src_umax]. Consider a value x in the range > [dst_umin, dst_umax] and another value y in the range [src_umin, > src_umax]. > > (a) No overflow: No addition x + y overflows. This occurs when even the > largest possible sum, i.e., dst_umax + src_umax <= U64_MAX. > > (b) Partial overflow: Some additions x + y overflow. This occurs when > the largest possible sum overflows (dst_umax + src_umax > U64_MAX), but > the smallest possible sum does not overflow (dst_umin + src_umin <= > U64_MAX). > > (c) Full overflow: All additions x + y overflow. This occurs when both > the smallest possible sum and the largest possible sum overflow, i.e., > both (dst_umin + src_umin) and (dst_umax + src_umax) are > U64_MAX. > > The current implementation conservatively sets the output bounds to > unbounded, i.e, [umin=0, umax=U64_MAX], whenever there is *any* > possibility of overflow, i.e, in cases (b) and (c). Otherwise it > computes tight bounds as [dst_umin + src_umin, dst_umax + src_umax]: > > if (check_add_overflow(*dst_umin, src_reg->umin_value, dst_umin) || > check_add_overflow(*dst_umax, src_reg->umax_value, dst_umax)) { > *dst_umin = 0; > *dst_umax = U64_MAX; > } > > Our synthesis-based technique discovered a more precise operator. > Particularly, in case (c), all possible additions x + y overflow and > wrap around according to eBPF semantics, and the computation of the > output range as [dst_umin + src_umin, dst_umax + src_umax] continues to > work. Only in case (b), do we need to set the output bounds to > unbounded, i.e., [0, U64_MAX]. > > Case (b) can be checked by seeing if the minimum possible sum does *not* > overflow and the maximum possible sum *does* overflow, and when that > happens, we set the output to unbounded: > > min_overflow = check_add_overflow(*dst_umin, src_reg->umin_value, dst_umin); > max_overflow = check_add_overflow(*dst_umax, src_reg->umax_value, dst_umax); > > if (!min_overflow && max_overflow) { > *dst_umin = 0; > *dst_umax = U64_MAX; > } > > Below is an example eBPF program and the corresponding log from the > verifier. At instruction 7: (0f) r5 += r3, due to conservative overflow > handling, the current implementation of scalar_min_max_add() sets r5's > bounds to [0, U64_MAX], which is then updated by reg_bounds_sync() to > [0x3d67e960f7d, U64_MAX]. > > 0: R1=ctx() R10=fp0 > 0: (85) call bpf_get_prandom_u32#7 ; R0_w=scalar() > 1: (bf) r3 = r0 ; R0_w=scalar(id=1) R3_w=scalar(id=1) > 2: (18) r4 = 0x950a43d67e960f7d ; R4_w=0x950a43d67e960f7d > 4: (4f) r3 |= r4 ; R3_w=scalar(smin=0x950a43d67e960f7d,smax=-1,umin=0x950a43d67e960f7d,smin32=0xfe960f7d,umin32=0x7e960f7d,var_off=(0x950a43d67e960f7d; 0x6af5bc298169f082)) R4_w=0x950a43d67e960f7d > 5: (18) r5 = 0xc014a00000000000 ; R5_w=0xc014a00000000000 > 7: (0f) r5 += r3 ; R3_w=scalar(smin=0x950a43d67e960f7d,smax=-1,umin=0x950a43d67e960f7d,smin32=0xfe960f7d,umin32=0x7e960f7d,var_off=(0x950a43d67e960f7d; 0x6af5bc298169f082)) R5_w=scalar(smin=0x800003d67e960f7d,umin=0x3d67e960f7d,smin32=0xfe960f7d,umin32=0x7e960f7d,var_off=(0x3d67e960f7d; 0xfffffc298169f082)) > 8: (b7) r0 = 0 ; R0_w=0 > 9: (95) exit > > With our patch, r5's bounds after instruction 7 are set to a much more > precise [0x551ee3d67e960f7d, 0xc0149fffffffffff] by > scalar_min_max_add(). > > ... > 7: (0f) r5 += r3 ; R3_w=scalar(smin=0x950a43d67e960f7d,smax=-1,umin=0x950a43d67e960f7d,smin32=0xfe960f7d,umin32=0x7e960f7d,var_off=(0x950a43d67e960f7d; 0x6af5bc298169f082)) R5_w=scalar(smin=0x800003d67e960f7d,umin=0x551ee3d67e960f7d,umax=0xc0149fffffffffff,smin32=0xfe960f7d,umin32=0x7e960f7d,var_off=(0x3d67e960f7d; 0xfffffc298169f082)) > 8: (b7) r0 = 0 ; R0_w=0 > 9: (95) exit > > The logic for scalar32_min_max_add is analogous. For the > scalar(32)_min_max_sub functions, the reasoning is similar but applied > to detecting underflow instead of overflow. > > We verified the correctness of the new implementations using Agni [3,4]. > > We since also discovered that a similar technique has been used to > calculate output ranges for unsigned interval addition and subtraction > in Hacker's Delight [2]. > > [1] https://docs.kernel.org/bpf/standardization/instruction-set.html > [2] Hacker's Delight Ch.4-2, Propagating Bounds through Add’s and Subtract’s > [3] https://github.com/bpfverif/agni > [4] https://people.cs.rutgers.edu/~sn349/papers/sas24-preprint.pdf > > Co-developed-by: Matan Shachnai <m.shachnai@xxxxxxxxxxx> > Signed-off-by: Matan Shachnai <m.shachnai@xxxxxxxxxxx> > Co-developed-by: Srinivas Narayana <srinivas.narayana@xxxxxxxxxxx> > Signed-off-by: Srinivas Narayana <srinivas.narayana@xxxxxxxxxxx> > Co-developed-by: Santosh Nagarakatte <santosh.nagarakatte@xxxxxxxxxxx> > Signed-off-by: Santosh Nagarakatte <santosh.nagarakatte@xxxxxxxxxxx> > Signed-off-by: Harishankar Vishwanathan <harishankar.vishwanathan@xxxxxxxxx> > --- Acked-by: Eduard Zingerman <eddyz87@xxxxxxxxx> [...]