Re: [RFC bpf-next 3/4] bpf: Generating a stubbed version of BPF program for termination

Eduard Zingerman <eddyz87@xxxxxxxxx> · Mon, 12 May 2025 21:40:15 -0700

On Mon, 2025-05-12 at 17:20 -0700, Alexei Starovoitov wrote:
> On Mon, May 12, 2025 at 5:07 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
> > 
> > 
> > - From verification point of view:
> >   this function is RET_VOID and is not in
> >   find_in_skiplist(), patch_generator() would replace its call with a
> >   dummy. However, a corresponding bpf_spin_unlock() would remain and thus
> >   bpf_check() will exit with error.
> >   So, you would need some special version of bpf_check, that collects
> >   all resources needed for program translation (e.g. maps), but does
> >   not perform semantic checks.
> >   Or patch_generator() has to be called for a program that is already
> >   verified.
> 
> No. let's not parametrize bpf_check.
> 
> Here is what I proposed earlier in the thread:
> 
> the verifier should just remember all places where kfuncs
> and helpers return _OR_NULL,
> then when the verification is complete, copy the prog,
> replaces 'call kfunc/help' with 'call stub',
> run two JITs, and compare JIT artifacts
> to make sure IPs match.

Makes sense, much cleaner compared to special version of bpf_check().

> But thinking about it more...
> I'm not sure any more that it's a good idea to fast execute
> the program on one cpu and let it continue running as-is on
> all other cpus including future invocations on this cpu.
> So far the reasons to terminate bpf program:
> - timeout in rqspinlock
> - fault in arena
> - some future watchdog
> 
> In all cases the program is buggy, so it's safer
> from kernel pov and from data integrity pov to stop
> all instances now and prevent future invocations.
> So I think we should patch the prog text in run-time
> without cloning.

Stopping all program instances makes sense.

However, this is orthogonal to preparing dummy version of the program.
Atomically converting program to dummy might be useful from the same
data integrity pov.

Also, patching for dummy might be an effort to make program side
effect free, e.g. by masking any map access.

Still curious about PTR_TO_BTF_ID and bpf_spin_lock().

[...]