This change addresses the cases described in [1]. These cases can be illustrated with the following diagram: .-> A --. Assume the states are visited in the order A, B, C. | | | Assume that state B reaches a state equivalent to state A. | v v At this point, state C is not processed yet, so state A has '-- B C not received read or precision marks from state C. As a result, these marks won't be propagated to B. If B has incomplete marks, it is unsafe to use it in states_equal() checks. Patch #1 introduces the function compute_scc(), which uses a slightly modified version of Tarjan's algorithm to identify loops in the program's control flow graph. The function was tested by generating random graphs and comparing its outputs with results of a simple, slower algorithm. See [2] for details. Patch #3 leverages computed SCC information to replace bpf_verifier_state->loop_entry based logic. Instead, for states within an iterator-based loop, all registers are marked as read and precise. Patch #4 provides examples of unsafe programs that would be accepted without this patch-set. The change has significant negative performance impact on a number of tests and sched_ext programs: ========= selftests: master vs patch ========= File Program Insns (A) Insns (B) Insns (DIFF) ---------------------------------- ---------------------------- --------- --------- ------------------- arena_list.bpf.o arena_list_add 374 406 +32 (+8.56%) iters.bpf.o checkpoint_states_deletion 1211 1216 +5 (+0.41%) iters.bpf.o clean_live_states 588 620 +32 (+5.44%) iters.bpf.o iter_subprog_check_stacksafe 128 112 -16 (-12.50%) iters.bpf.o iter_subprog_iters 664 571 -93 (-14.01%) pyperf600_iter.bpf.o on_event 2591 5929 +3338 (+128.83%) test_usdt.bpf.o usdt12 1803 1860 +57 (+3.16%) verifier_iterating_callbacks.bpf.o cond_break1 100 86633 +86533 (+86533.00%) verifier_iterating_callbacks.bpf.o cond_break2 90 110 +20 (+22.22%) Total progs: 3587 Old success: 2070 New success: 2070 States diff min: -15.38% States diff max: 78660.00% -20 .. -10 %: 1 -10 .. 0 %: 1 0 .. 5 %: 3581 5 .. 15 %: 1 30 .. 40 %: 1 120 .. 130 %: 1 78660 .. 78665 %: 1 ========= scx: master vs patch ========= File Program Insns (A) Insns (B) Insns (DIFF) -------------- --------------------- --------- --------- ---------------- bpf.bpf.o bpfland_init 975 1012 +37 (+3.79%) bpf.bpf.o chaos_dispatch 27025 27054 +29 (+0.11%) bpf.bpf.o chaos_init 6024 6180 +156 (+2.59%) bpf.bpf.o lavd_cpu_offline 1419 1627 +208 (+14.66%) bpf.bpf.o lavd_cpu_online 1419 1627 +208 (+14.66%) bpf.bpf.o lavd_dispatch 59090 96710 +37620 (+63.67%) bpf.bpf.o lavd_init 3066 3276 +210 (+6.85%) bpf.bpf.o layered_dispatch 9040 12450 +3410 (+37.72%) bpf.bpf.o layered_dump 1890 1615 -275 (-14.55%) bpf.bpf.o layered_enqueue 6443 6400 -43 (-0.67%) bpf.bpf.o layered_init 3874 4255 +381 (+9.83%) bpf.bpf.o layered_runnable 1706 1674 -32 (-1.88%) bpf.bpf.o p2dq_dispatch 1068 1102 +34 (+3.18%) bpf.bpf.o p2dq_init 5080 5371 +291 (+5.73%) bpf.bpf.o rusty_init 35707 35758 +51 (+0.14%) bpf.bpf.o tp_cgroup_attach_task 149 203 +54 (+36.24%) scx_pair.bpf.o pair_dispatch 891 659 -232 (-26.04%) scx_qmap.bpf.o qmap_dispatch 1703 3934 +2231 (+131.00%) scx_qmap.bpf.o qmap_dump 230 316 +86 (+37.39%) scx_qmap.bpf.o qmap_init 18548 23063 +4515 (+24.34%) Total progs: 247 Old success: 217 New success: 217 States diff min: -25.88% States diff max: 132.62% -30 .. -20 %: 1 -20 .. -10 %: 1 -5 .. 0 %: 2 0 .. 5 %: 232 5 .. 15 %: 3 15 .. 25 %: 3 25 .. 35 %: 2 35 .. 45 %: 1 50 .. 60 %: 1 130 .. 135 %: 1 [1] https://lore.kernel.org/bpf/3c6ac16b7578406e2ddd9ba889ce955748fe636b.camel@xxxxxxxxx/ [2] https://github.com/eddyz87/scc-test Eduard Zingerman (4): bpf: compute SCCs in program control flow graph bpf: frame_insn_idx() utility function bpf: use SCC info instead of loop_entry selftests/bpf: tests with a loop state missing read/precision mark include/linux/bpf_verifier.h | 46 +- kernel/bpf/verifier.c | 630 ++++++++++++++-------- tools/testing/selftests/bpf/progs/iters.c | 141 +++++ 3 files changed, 581 insertions(+), 236 deletions(-) -- 2.48.1