Re: [RFC bpf-next 0/4] bpf: Fast-Path approach for BPF program Termination

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Apr 20, 2025 at 3:56 AM Raj Sahu <rjsu26@xxxxxxxxx> wrote:
>
> From: Raj <rjsu26@xxxxxxxxx>
>
> Motivation:
> We propose an enhancement to the BPF infrastructure to address the
> critical issue of long-running BPF programs [1,2,3,4]. While the verifier
> ensures termination by restricting instruction count and backward edges, the
> total execution time of BPF programs is not bounded. Certain helper functions
> and iterators can result in extended runtimes, affecting system performance.
>
> The existing BPF infrastructure verifies that programs will not indefinitely
> loop or leak resources. However, helper functions such as `bpf_loop`,
> `bpf_for_each_map_elem`, and other iterative or expensive kernel interactions
> can cause runtimes that significantly degrade system performance [6]. Current
> detaching mechanisms do not immediately terminate running instances,
> monopolizing CPU. Therefore, a termination solution is necessary to swiftly
> terminate execution while safely releasing resources.
>
> Existing termination approach like the BPF Exception or Runtime hooks [5] have
> the issue of either lack of dynamism or having runtime overheads: BPF
> Exceptions: Introduces bpf_throw() and exception callbacks, requiring stack
> unwinding and exception state management. Cleanup can only be done for
> pre-defined cancellation points.  Runtime Hooks: Proposes watchdog timers, which
> requires resource tracking, though minimal but non-zero runtime overheads.
>
> Design:
> We introduce the Fast-Path termination mechanism, leveraging the
> verifier's guarantees regarding control flow and resource management. The
> approach dynamically patches running BPF programs with a stripped-down version
> that accelerates termination. This can be used to terminate any given instance
> of a BPF execution. Key elements include:
>
> - Implicit Lifetime Management: Utilizing the verifier’s inherent control flow
>   and resource cleanup paths encoded within the BPF program structure,
>   eliminating the need for explicit garbage collection or unwinding tables.
>
> - Dynamic Program Patching: At runtime, BPF programs are atomically patched,
>   replacing expensive helper calls with stubbed versions (fast fall-through
>   implementations). This ensures swift termination without compromising safety
>   or leaking resources.
>
> - Helper Function Adjustments: Certain helper functions (e.g., `bpf_loop`,
>   `bpf_for_each_map_elem`) include  mechanisms to facilitate early exits through
>   modified return values.
>

I understand that the motivation for this proposal is kernel safety, so perhaps
my concern simply doesn't matter in that context, but I'm concerned about the
possibility of data corruption, specifically data stored in maps.

There are many ways to write data into maps that (especially with JIT) do not
end up going through any helper functions. For example, once a pointer to a map
value has been obtained, it can simply be written to update the map. That means
there is no helper to be patched to prevent the write.

My understanding is that with the Fast-Path termination mechanism, those write
instructions will still be executed and will still update the map. But if the
values they are writing are dependent on the results of any patched function
calls, the values will not be the intended ones which will result in data
corruption. This corruption would not impact the safety of the kernel, but
could cause problems for userspace applications relying on the map data.

Is that a correct understanding? If so, is that a concern that should be
addressed/mitigated?

> TODOs:
> - Termination support for nested BPF programs.
> - Policy enforcements to control runtime of BPF programs in a system:
> - Timer based termination (watchdog)
>         - Userspace management to detect low-performing BPF program and
>           terminated them
>
> We haven’t added any selftests in the POC as this mail is mainly to get
> feedback on the design. Attaching link to sample BPF programs to
> validate the POC [7].  Styling will be taken care in next iteration.
>
> References:
> 1. https://lpc.events/event/17/contributions/1610/attachments/1229/2505/LPC_BPF_termination_Raj_Sahu.pdf
> 2. https://vtechworks.lib.vt.edu/server/api/core/bitstreams/f0749daa-4560-41c9-9f36-6aa618161665/content
> 3. https://lore.kernel.org/bpf/AM6PR03MB508011599420DB53480E8BF799F72@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/T/
> 4. https://vtechworks.lib.vt.edu/server/api/core/bitstreams/7fb70c04-0736-4e2d-b48b-2d8d012bacfc/content
> 5. https://lwn.net/ml/all/AM6PR03MB5080513BFAEB54A93CC70D4399FE2@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/#t
> 6. https://people.cs.vt.edu/djwillia/papers/ebpf23-runtime.pdf
> 7. https://github.com/sidchintamaneni/os-dev-env/tree/main/bpf-programs-catalog/research/termination/patch_gen_testing
>
>  arch/x86/kernel/smp.c          |   4 +-
>  include/linux/bpf.h            |  18 ++
>  include/linux/filter.h         |  16 ++
>  include/linux/smp.h            |   2 +-
>  include/uapi/linux/bpf.h       |  13 ++
>  kernel/bpf/bpf_iter.c          |  65 ++++++
>  kernel/bpf/core.c              |  45 ++++
>  kernel/bpf/helpers.c           |   8 +
>  kernel/bpf/syscall.c           | 375 +++++++++++++++++++++++++++++++++
>  kernel/bpf/verifier.c          |  16 +-
>  kernel/smp.c                   |  22 +-
>  tools/bpf/bpftool/prog.c       |  40 ++++
>  tools/include/uapi/linux/bpf.h |   5 +
>  tools/lib/bpf/bpf.c            |  15 ++
>  tools/lib/bpf/bpf.h            |  10 +
>  tools/lib/bpf/libbpf.map       |   1 +
>  16 files changed, 643 insertions(+), 12 deletions(-)
>
> --
> 2.43.0
>
>





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux