Cc Dimitar (worked on coerce_reg_to_size_sx, point 1), Paul & Eduard (maybe point 1 will eventually lead to invariant violation?), and other BPF maintainers (point 2, 3, and 4). On Wed, Apr 16, 2025 at 01:52:09PM +0000, Tao Lyu wrote: > Hi, > > I found the following usability issues; kindly write them here to see if the community is willing to solve them. > If yes, I could write patches for them gradually. > > 1. Inaccurate tracking of arithmetic instruction results. > > There are many inaccurate arithmetic computation results. > For example, like below: > r0 should be 0 after `r0 = (s16)r3`. > However, due to the inaccurate range track in eBPF at (coerce_reg_to_size_sx and set_sext64_default_val), > the lower 16-bit of r0 becomes unknown, leading to false negatives when exit. > > func#0 @0 > 0: R1=ctx() R10=fp0 > 0: (b7) r6 = -657948387 ; R6_w=0xffffffffd8c8811d > 1: (94) w6 s%= 16 ; R6_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) > 2: (18) r8 = 0xff11000279981800 ; R8_w=map_ptr(ks=4,vs=8) > 4: (18) r9 = 0x19556057 ; R9_w=0x19556057 > 6: (bf) r3 = r10 ; R3_w=fp0 R10=fp0 > 7: (bf) r3 = r6 ; R3_w=scalar(id=1,smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R6_w=scalar(id=1,smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) > 8: (67) r3 <<= 38 ; R3_w=scalar(smax=0x7fffffc000000000,umax=0xffffffc000000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xffffffc000000000)) > 9: (bf) r0 = r6 ; R0_w=scalar(id=1,smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R6_w=scalar(id=1,smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) > 10: (bc) w0 = (s16)w3 ; R0_w=0 R3_w=scalar(smax=0x7fffffc000000000,umax=0xffffffc000000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xffffffc000000000)) > 11: (bf) r0 = (s16)r3 ; R0_w=scalar(smin=smin32=-32768,smax=smax32=32767) R3_w=scalar(smax=0x7fffffc000000000,umax=0xffffffc000000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xffffffc000000000)) I would say it is just not as precise, rather than inaccurate. But your point remain, the verifier was not able to come up with [0, 0] as the range after instruction 11, and we end up with [S16_MIN, S16_MAX] instead. Dimitar has a patchset[1] that make better use of tnum for sign extension, which should address this. 1: https://lore.kernel.org/all/20250130112342.69843-1-dimitar.kanaliev@xxxxxxxxxxxxxx/ Can't comment on point 2 and 3, and unsure about point 4. > 2. Unnecessary atomic instructions operating on private memory (e.g, stack). > > Since atomic instructions are only useful on the shared memory, > it is unnecessary to allow them on the private memory like stack, > which was discussed long time ago in this commit: > https://github.com/torvalds/linux/commit/ca36960211eb228bcbc7aaebfa0d027368a94c60 > > Moreover, allowing atomic instruction also introduce another usability bugs, > which was reported here: https://lore.kernel.org/bpf/20231020172941.155388-1-tao.lyu@xxxxxxx/ > > > 3. Inconsistent constraints on instructions involving pointer type transitions. > > Most bitwise instructions, such as and and xor are disallowed on pointers, > but negation and bitwise swap are allowed. > Moreover, negation and bitwise swap are permitted in atomic instructions, > such as atomic_and and atomic_xor. > > > 4. Coarse-grained pointer comparison. > > Pointers pointing to the same memory region can infer more pointer range information. > For example, comparing two stack pointers (one with a constant range, while the other with a variable range) can help infer the variable range, > as shown in the code below. > > 11: R0=map_value(map=array_map3,ks=4,vs=8) R9=ctx() R10=fp0 fp-8=mmmmmmmm > 11: (61) r6 = *(u32 *)(r0 +0) ; R0=map_value(map=array_map3,ks=4,vs=8) R6_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) > 12: (bf) r1 = r10 ; R1_w=fp0 R10=fp0 > 13: (0f) r1 += r6 > mark_precise: frame0: last_idx 13 first_idx 11 subseq_idx -1 > mark_precise: frame0: regs=r6 stack= before 12: (bf) r1 = r10 > mark_precise: frame0: regs=r6 stack= before 11: (61) r6 = *(u32 *)(r0 +0) > 14: R1_w=fp(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R6_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) > 14: (bf) r2 = r10 ; R2_w=fp0 R10=fp0 > 15: (07) r2 += -512 ; R2_w=fp-512 > 16: (ad) if r1 < r2 goto pc+2 ; R1_w=fp(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R2_w=fp-512 > 17: (3d) if r1 >= r10 goto pc+1 ; R1_w=fp(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R10=fp0 > 18: (71) r3 = *(u8 *)(r1 +0) > invalid unbounded variable-offset read from stack R1 I really have no idea whether compilers produce this kind of pattern, but if there is some chance, then it does seem reasonable to have this implemented. Seems like much existing code for scalar can be reused.