Re: [syzbot ci] Re: bpf: Use tnums for JEQ/JNE is_branch_taken logic

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Aug 20, 2025 at 12:37:46PM -0700, Eduard Zingerman wrote:
> On Wed, 2025-08-20 at 13:34 +0200, Paul Chaignon wrote:
> 
> [...]
> 
> > I have a patch to potentially fix this, but I'm still testing it and
> > would prefer to send it separately as it doesn't really relate to my
> > current patchset.
> 
> I'd like to bring this point again: this is a cat-and-mouse game.
> is_scalar_branch_taken() and regs_refine_cond_op() are essentially
> same operation and should be treated as such: produce register states
> for both branches and prune those that result in an impossible state.

I agree. I've been slowly convincing myself of the same :)
So far, the syzkaller invariant violation reports have been useful to
retrieve examples of where the two functions diverge and to deduce
improvements. But syzkaller now seems to be iterating between the same
three different cases of violations, so maybe it's time to look at a
more generic solution.

I assume you would call regs_refine_cond_op() then reg_bounds_sync()
and use the result in is_scalar_branch_taken()? Most of the impossible
states are only exposed once passed through reg_bounds_sync().

> There is nothing wrong with this logically and we haven't got a single
> real bug from the invariant violations check if I remember correctly.

That's correct. We also have pretty good coverage of this logic in
selftests and now it seems in syzkaller as well. Agni is also
continuously verifying reg_bounds_sync(). So I think the risk of relying
on regs_refine_cond_op() and reg_bounds_sync() in
is_scalar_branch_taken() would be low.

> 
> Comparing the two functions, it looks like tricky cases are BPF_JE/JNE
> and BPF_JSET/JSET|BPF_X. However, given that regs_refine_cond_op() is
> called for a false branch with opcode reversed it looks like there is
> no issues with these cases.
> 
> I'll give this a try.

I'd be happy to contribute selftests extracted from the syzkaller logs.
It's still hitting three different kinds of invariant violations: the
one reported here, the one addressed by my patch, and another one.

And I'm of course interested to run any patch through Cilium's CI to
check the complexity impact.

> 
> [...]




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux