[no subject]

The bpf global trampoline has additional overhead compared with the bpf
trampoline:
1. We do more checks. In the global trampoline, we check whether the origin
   call is needed, whether the prog is sleepable, etc.
2. We do more memory reads and writes. We need to load the bpf progs from
   memory and save additional regs to the stack.
3. The function metadata lookup.

However, we also have some optimizations:
1. For fentry, we avoid 2 function calls: __bpf_prog_enter_recur and
   __bpf_prog_exit_recur, as we inline them in our case.
2. For fexit/fmodret, we avoid another 2 function calls: __bpf_tramp_enter
   and __bpf_tramp_exit, by inlining them.
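
To illustrate the kind of work that gets inlined, here is a minimal
user-space sketch of a recursion guard around a prog run. It is not the
kernel code: struct prog, prog_enter, prog_exit and run_prog are
hypothetical names, and the plain counter stands in for whatever per-CPU
state the real enter/exit helpers use.

```c
/* Hypothetical stand-in for a bpf prog; not a kernel structure. */
struct prog {
    int active;   /* nesting depth; assumed to be per-CPU in a real kernel */
    int misses;   /* runs skipped because the prog was already active */
    int runs;     /* runs that actually executed */
};

/* Returns 1 if the prog may run, 0 if it is already running. */
static inline int prog_enter(struct prog *p)
{
    if (++p->active != 1) {
        p->active--;
        p->misses++;
        return 0;
    }
    return 1;
}

static inline void prog_exit(struct prog *p)
{
    p->active--;
}

/* Runs the prog; depth > 0 simulates the prog being re-entered. */
static void run_prog(struct prog *p, int depth)
{
    if (!prog_enter(p))
        return;
    p->runs++;
    if (depth > 0)
        run_prog(p, depth - 1);
    prog_exit(p);
}
```

Because prog_enter and prog_exit are small inline functions, the compiler
can fold them into the caller instead of emitting two calls per run.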

The performance of fentry-multi is close to that of fentry-multi-all, which
means the hash table lookup is O(1) and fast enough.
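
As a rough illustration of why the lookup cost does not grow with the
number of attached functions, here is a minimal user-space sketch of a
chained hash table keyed by function address. It is fixed-size, unlike the
resizable table in this series, and struct func_md, md_hash, md_insert and
md_lookup are hypothetical names, not the kfunc_md API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical metadata record; field names are illustrative only. */
struct func_md {
    uintptr_t ip;            /* function entry address, used as the key */
    unsigned int nr_args;    /* example payload */
    struct func_md *next;    /* chain of entries sharing a bucket */
};

#define MD_HASH_BITS 6
#define MD_HASH_SIZE (1u << MD_HASH_BITS)

static struct func_md *md_table[MD_HASH_SIZE];

/* Multiplicative hash: keep the top MD_HASH_BITS bits of the product. */
static unsigned int md_hash(uintptr_t ip)
{
    uint64_t h = (uint64_t)ip * 0x9E3779B97F4A7C15ull;

    return (unsigned int)(h >> (64 - MD_HASH_BITS));
}

/* Insert at the head of the bucket chain. */
static void md_insert(struct func_md *md)
{
    unsigned int b = md_hash(md->ip);

    md->next = md_table[b];
    md_table[b] = md;
}

/* Walk one bucket only; returns NULL if the address is not registered. */
static struct func_md *md_lookup(uintptr_t ip)
{
    struct func_md *md;

    for (md = md_table[md_hash(ip)]; md; md = md->next)
        if (md->ip == ip)
            return md;
    return NULL;
}
```

With short chains, a lookup is one hash computation plus a few compares,
regardless of how many functions are in the table.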

Further work
------------
The performance of the global trampoline can be optimized further.

First, we can avoid some checks by generating more bpf_global_caller
variants, such as:

static __always_inline notrace int
bpf_global_caller_run(unsigned long *args, unsigned long *ip, int nr_args,
                      bool sleepable, bool do_origin)
{
    xxxxxx
}

static __always_used __no_stack_protector notrace int
bpf_global_caller_2_sleep_origin(unsigned long *args, unsigned long *ip)
{
    return bpf_global_caller_run(args, ip, 2, 1, 1);
}

The bpf global caller "bpf_global_caller_2_sleep_origin" can then be used
for functions that have 2 function args, have sleepable bpf progs, and
have fexit or modify_return progs. The sleepable and origin call checks
will be optimized away by the compiler, as they are constants.
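
The effect can be sketched in user space as follows. This is only an
illustration of the specialization technique, assuming GCC or Clang for the
always_inline attribute; caller_run, caller_2 and caller_2_sleep_origin are
hypothetical stand-ins, and the counters replace the real sleepable and
origin call handling.

```c
#include <stdbool.h>

/* Counters standing in for the real sleepable/origin-call work. */
static int sleepable_enters;
static int origin_calls;

/*
 * Generic body. When it is inlined with constant flags, the compiler can
 * drop the branches that can never be taken.
 */
static inline __attribute__((always_inline)) int
caller_run(unsigned long *args, int nr_args, bool sleepable, bool do_origin)
{
    int ret = 0;
    int i;

    if (sleepable)
        sleepable_enters++;
    if (do_origin) {
        origin_calls++;
        /* Stand-in for calling the origin function with its args. */
        for (i = 0; i < nr_args; i++)
            ret += (int)args[i];
    }
    return ret;
}

/* Variant for 2 args, sleepable progs, origin call needed. */
int caller_2_sleep_origin(unsigned long *args)
{
    return caller_run(args, 2, true, true);
}

/* Variant for 2 args, non-sleepable progs, no origin call. */
int caller_2(unsigned long *args)
{
    return caller_run(args, 2, false, false);
}
```

Each variant is compiled as if the flag checks were never written, at the
cost of one extra function body per combination of flags.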

Second, we can implement the function metadata with function padding.
The hash table lookup for metadata consumes ~15 instructions. With
function padding, it needs only 5 instructions and will be faster.
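
A simulated sketch of the idea: an index stored in the padding bytes right
before the function entry selects a slot in a metadata array, so the lookup
is one load plus one array access. The 4-byte layout and the names
md_array, md_pad_set and md_pad_get are assumptions for illustration only;
the encoding used in V1 is not described here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAD_BYTES 4   /* assumed size of the index stored in the padding */
#define MD_MAX    8   /* size of the simulated metadata array */

/* Hypothetical metadata record. */
struct func_md {
    unsigned int nr_args;
};

static struct func_md md_array[MD_MAX];

/* Store the metadata index in the bytes just before the entry point. */
static void md_pad_set(unsigned char *entry, uint32_t index)
{
    memcpy(entry - PAD_BYTES, &index, sizeof(index));
}

/* One load of the index, one array access; no hashing, no chain walk. */
static struct func_md *md_pad_get(const unsigned char *entry)
{
    uint32_t index;

    memcpy(&index, entry - PAD_BYTES, sizeof(index));
    if (index >= MD_MAX)
        return NULL;
    return &md_array[index];
}
```

In this sketch "entry" points into an ordinary byte buffer that plays the
role of function text, for example buf + 8 in a 16-byte array.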

Besides the performance, we also need to make the global trampoline
work together with the bpf trampoline. For now, attaching FENTRY_MULTI
to a target that already has FENTRY on it fails, and -EEXIST is returned.
So we need another series to make them work together.

Changes since V1:

* remove the function metadata that is based on function padding, and
  implement it with a resizable hash table.
* rewrite the bpf global trampoline in C.
* use the existing bpf bench framework for bench tests.
* remove the part that makes tracing-multi compatible with tracing.

Link: https://lore.kernel.org/all/20250303132837.498938-1-dongml2@xxxxxxxxxxxxxxx/ [1]
Link: https://lore.kernel.org/bpf/20240311093526.1010158-1-dongmenglong.8@xxxxxxxxxxxxx/ [2]
Link: https://lore.kernel.org/bpf/CAADnVQ+G+mQPJ+O1Oc9+UW=J17CGNC5B=usCmUDxBA-ze+gZGw@xxxxxxxxxxxxxx/ [3]

Menglong Dong (18):
  bpf: add function hash table for tracing-multi
  x86,bpf: add bpf_global_caller for global trampoline
  ftrace: factor out ftrace_direct_update from register_ftrace_direct
  ftrace: add reset_ftrace_direct_ips
  bpf: introduce bpf_gtramp_link
  bpf: tracing: add support to record and check the accessed args
  bpf: refactor the modules_array to ptr_array
  bpf: verifier: add btf to the function args of bpf_check_attach_target
  bpf: verifier: move btf_id_deny to bpf_check_attach_target
  x86,bpf: factor out arch_bpf_get_regs_nr
  bpf: tracing: add multi-link support
  libbpf: don't free btf if tracing_multi progs existing
  libbpf: support tracing_multi
  libbpf: add btf type hash lookup support
  libbpf: add skip_invalid and attach_tracing for tracing_multi
  selftests/bpf: move get_ksyms and get_addrs to trace_helpers.c
  selftests/bpf: add basic testcases for tracing_multi
  selftests/bpf: add bench tests for tracing_multi

 arch/x86/Kconfig                              |   4 +
 arch/x86/net/bpf_jit_comp.c                   | 290 ++++++++++++-
 include/linux/bpf.h                           |  59 +++
 include/linux/bpf_tramp.h                     |  72 ++++
 include/linux/bpf_types.h                     |   1 +
 include/linux/bpf_verifier.h                  |   1 +
 include/linux/btf.h                           |   3 +-
 include/linux/ftrace.h                        |   7 +
 include/linux/kfunc_md.h                      |  91 ++++
 include/uapi/linux/bpf.h                      |  10 +
 kernel/bpf/Makefile                           |   1 +
 kernel/bpf/btf.c                              | 113 ++++-
 kernel/bpf/kfunc_md.c                         | 352 ++++++++++++++++
 kernel/bpf/syscall.c                          | 395 +++++++++++++++++-
 kernel/bpf/trampoline.c                       | 220 +++++++++-
 kernel/bpf/verifier.c                         | 161 ++++---
 kernel/trace/bpf_trace.c                      |  48 +--
 kernel/trace/ftrace.c                         | 183 +++++---
 net/bpf/test_run.c                            |   3 +
 net/core/bpf_sk_storage.c                     |   2 +
 net/sched/bpf_qdisc.c                         |   2 +-
 tools/bpf/bpftool/common.c                    |   3 +
 tools/include/uapi/linux/bpf.h                |  10 +
 tools/lib/bpf/bpf.c                           |  10 +
 tools/lib/bpf/bpf.h                           |   6 +
 tools/lib/bpf/btf.c                           | 102 +++++
 tools/lib/bpf/btf.h                           |   6 +
 tools/lib/bpf/libbpf.c                        | 296 ++++++++++++-
 tools/lib/bpf/libbpf.h                        |  25 ++
 tools/lib/bpf/libbpf.map                      |   5 +
 tools/testing/selftests/bpf/Makefile          |   2 +-
 tools/testing/selftests/bpf/bench.c           |   8 +
 .../selftests/bpf/benchs/bench_trigger.c      |  72 ++++
 .../selftests/bpf/benchs/run_bench_trigger.sh |   1 +
 .../selftests/bpf/prog_tests/fentry_fexit.c   |  22 +-
 .../selftests/bpf/prog_tests/fentry_test.c    |  79 +++-
 .../selftests/bpf/prog_tests/fexit_test.c     |  79 +++-
 .../bpf/prog_tests/kprobe_multi_test.c        | 220 +---------
 .../selftests/bpf/prog_tests/modify_return.c  |  60 +++
 .../bpf/prog_tests/tracing_multi_link.c       | 210 ++++++++++
 .../selftests/bpf/progs/fentry_multi_empty.c  |  13 +
 .../selftests/bpf/progs/tracing_multi_test.c  | 181 ++++++++
 .../selftests/bpf/progs/trigger_bench.c       |  22 +
 .../selftests/bpf/test_kmods/bpf_testmod.c    |  24 ++
 tools/testing/selftests/bpf/test_progs.c      |  50 +++
 tools/testing/selftests/bpf/test_progs.h      |   3 +
 tools/testing/selftests/bpf/trace_helpers.c   | 283 +++++++++++++
 tools/testing/selftests/bpf/trace_helpers.h   |   3 +
 48 files changed, 3349 insertions(+), 464 deletions(-)
 create mode 100644 include/linux/bpf_tramp.h
 create mode 100644 include/linux/kfunc_md.h
 create mode 100644 kernel/bpf/kfunc_md.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/tracing_multi_link.c
 create mode 100644 tools/testing/selftests/bpf/progs/fentry_multi_empty.c
 create mode 100644 tools/testing/selftests/bpf/progs/tracing_multi_test.c

-- 
2.39.5