On Mon, May 12, 2025 at 5:07 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
>
> - From verification point of view:
>   this function is RET_VOID and is not in
>   find_in_skiplist(), patch_generator() would replace its call with a
>   dummy. However, a corresponding bpf_spin_unlock() would remain and thus
>   bpf_check() will exit with error.
>   So, you would need some special version of bpf_check, that collects
>   all resources needed for program translation (e.g. maps), but does
>   not perform semantic checks.
>   Or patch_generator() has to be called for a program that is already
>   verified.

No. Let's not parametrize bpf_check.

Here is what I proposed earlier in the thread:
the verifier should just remember all places where kfuncs and helpers
return _OR_NULL, then when the verification is complete, copy the prog,
replace 'call kfunc/helper' with 'call stub', run two JITs, and
compare JIT artifacts to make sure IPs match.

But thinking about it more...
I'm not sure any more that it's a good idea to fast execute
the program on one cpu and let it continue running as-is on all
other cpus, including future invocations on this cpu.
So far the reasons to terminate a bpf program are:
- timeout in rqspinlock
- fault in arena
- some future watchdog

In all cases the program is buggy, so it's safer from the kernel pov
and from the data integrity pov to stop all instances now and prevent
future invocations.
So I think we should patch the prog text at run-time without cloning.

The verifier should prepare an array of patches in
text_poke_bp_batch() format, and when a timeout/fault is detected
do one call to text_poke_bp_batch() to stub out the whole prog.

At least on x86 we always emit nop5 in the prologue,
so we can patch it with 'goto exit' as well.
Then the prog will be completely gutted.
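
To make the last two paragraphs concrete, here is a rough userspace model of
the idea, not kernel code. Everything in it (struct stub_patch, struct
prog_model, record_patch(), apply_all()) is a made-up name for illustration.
It does not use the real text_poke_bp_batch() interface or its entry format,
and it ignores the cross-cpu synchronization that the real mechanism provides.
It only shows patches being prepared up front and then applied in one pass:
the prologue nop5 becomes 'jmp exit' and a call site becomes 'call stub'.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INSN_LEN    5     /* x86-64: nop5, call rel32 and jmp rel32 are 5 bytes */
#define MAX_PATCHES 16
#define OP_CALL     0xE8  /* call rel32 */
#define OP_JMP      0xE9  /* jmp rel32 */

/* Hypothetical patch record; NOT the kernel's text_poke entry format. */
struct stub_patch {
	size_t  off;              /* offset of the instruction to overwrite */
	uint8_t insn[INSN_LEN];   /* replacement bytes, prepared in advance */
};

/* Hypothetical model of a JITed image plus its prepared patches. */
struct prog_model {
	uint8_t *text;
	size_t   len;
	struct stub_patch patches[MAX_PATCHES];
	size_t   nr;
};

/*
 * "Verification time": remember that the 5-byte instruction at 'off' should
 * later become 'opcode rel32' targeting 'target'. The image is not touched.
 */
static int record_patch(struct prog_model *p, uint8_t opcode,
			size_t off, size_t target)
{
	if (p->nr >= MAX_PATCHES || off > p->len || p->len - off < INSN_LEN)
		return -1;

	/* rel32 is relative to the end of the patched instruction. */
	uint32_t rel = (uint32_t)((int64_t)target - (int64_t)(off + INSN_LEN));
	struct stub_patch *sp = &p->patches[p->nr++];

	sp->off = off;
	sp->insn[0] = opcode;
	sp->insn[1] = (uint8_t)(rel & 0xff);          /* little-endian rel32 */
	sp->insn[2] = (uint8_t)((rel >> 8) & 0xff);
	sp->insn[3] = (uint8_t)((rel >> 16) & 0xff);
	sp->insn[4] = (uint8_t)((rel >> 24) & 0xff);
	return 0;
}

/*
 * "Timeout/fault time": apply every prepared patch in one pass. This stands
 * in for the single batched poke; a plain memcpy() is only safe here because
 * nothing is executing this buffer.
 */
static size_t apply_all(struct prog_model *p)
{
	assert(p->nr <= MAX_PATCHES);
	for (size_t i = 0; i < p->nr; i++)
		memcpy(p->text + p->patches[i].off, p->patches[i].insn, INSN_LEN);
	return p->nr;
}
```

Recording a jmp patch at offset 0 (the prologue nop5) that targets the exit
label, plus one call patch per kfunc/helper site, and then calling apply_all()
once leaves an image that jumps straight to exit on entry and whose remaining
call sites point at the stub.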