Re: [PATCH v1 bpf-next 00/11] BPF indirect jumps

On 8/16/25 11:06 AM, Anton Protopopov wrote:
This patchset implements a new map type, the instruction set, and
uses it to build support for indirect branches in BPF (on x86). (The
same map will later be used to support indirect calls and static
keys.) See [1], [2] for more context.

This patch set is a follow-up on the initial RFC [3], now converted
to a normal version to trigger CI. Note that GCC and non-x86
architectures are not expected to work.

Short table of contents:

   * Patches 1-6 implement the new map type,
     BPF_MAP_TYPE_INSN_SET, and corresponding selftests. This map can
     be used to track the "original -> xlated -> jitted" mapping for
     a given program (see the usage sketch after this overview).
     Patches 5,6 add support for the "blinded" variant.

   * Patches 7,8,9 implement support for indirect jumps.

   * Patches 10,11 add support for LLVM-compiled programs containing
     indirect jumps.

A special LLVM build is needed for that; see [4] for details and
some related discussions. For this reason, the selftests for
indirect jumps which directly use `goto *rX` are commented out
(so that CI can run).
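
For reference, the construct that compiles to a gotox is a GNU C
computed goto. A minimal example of the shape involved (whether a
given snippet actually produces a ".jumptables" entry and a gotox
depends on the LLVM from [4]):

static int pick(unsigned int x)
{
    /* &&label takes the address of a label (GNU C extension); the
     * LLVM from [4] emits the label table into ".jumptables" and
     * lowers the jump to a gotox instruction. */
    static void *targets[] = { &&one, &&two };

    goto *targets[x & 1];
one:
    return 1;
two:
    return 2;
}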

There is a list of TBDs (mostly more selftests, plus some
limitations such as the maximal map size); however, all the
selftests which compile to contain an indirect jump work with this
patchset.

See the individual patches for more implementation details.
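
For orientation, here is a rough userspace sketch of creating such a
map (BPF_MAP_TYPE_INSN_SET is the constant added by patch 3; the u32
key/value layout below is an assumption, the real UAPI is defined in
the patches):

#include <bpf/bpf.h>

/* Sketch only: assume one slot per tracked instruction, keyed by a
 * u32 index; the kernel rewrites each value as the instruction moves
 * through the original -> xlated -> jitted stages. */
static int create_insn_array(unsigned int nr_insns)
{
    return bpf_map_create(BPF_MAP_TYPE_INSN_SET, "insn_array",
                          sizeof(__u32), sizeof(__u32),
                          nr_insns, NULL);
}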

Changes since RFC:

   * I've tried to address all the comments provided by Alexei and
     Eduard in the RFC. The most important ones are listed below.

   * One big change: moved from the older LLVM version [5] to the
     newer one [4]. LLVM now generates jump tables as symbols in a
     new special section, ".jumptables". Another part of this change
     is that libbpf no longer tries to link a map load with a
     goto *rX, as 1) this is absolutely not reliable and 2) for some
     use cases it is impossible (namely, when more than one jump
     table can be used in the same gotox instruction).

   * Added insn_successors() support (Alexei, Eduard). This includes
     getting rid of the ugly bpf_insn_set_iter_xlated_offset()
     interface (Eduard).

   * Removed the hack for the unreachable instruction, as the new
     LLVM, thanks to Eduard, no longer generates it.

   * Set mem_size for direct map access properly instead of hacking
     around it. Removed the off>0 check. (Alexei)

   * Do not allocate new memory for min_index/max_index (Alexei, Eduard)

   * Information required during check_cfg is now cached to be reused
     later (Alexei), plus general logic for supporting multiple jump
     tables per jump

   * Properly compare registers in regsafe (Alexei, Eduard)

   * Remove support for JMP32 (Eduard)

   * Better checks in adjust_ptr_min_max_vals (Eduard)

   * More selftests which directly use gotox were added, although
     there is still room for more (Alexei)

   * More checks and verbose messages added

   * "unique pointers" are no more in the map

Links:
   1. https://lpc.events/event/18/contributions/1941/
   2. https://lwn.net/Articles/1017439/
   3. https://lore.kernel.org/bpf/20250615085943.3871208-1-a.s.protopopov@xxxxxxxxx/
   4. https://github.com/llvm/llvm-project/pull/149715

Anton Protopopov (11):
   bpf: fix the return value of push_stack
   bpf: save the start of functions in bpf_prog_aux
   bpf, x86: add new map type: instructions array
   selftests/bpf: add selftests for new insn_array map
   bpf: support instructions arrays with constants blinding
   selftests/bpf: test instructions arrays with blinding
   bpf, x86: allow indirect jumps to r8...r15
   bpf, x86: add support for indirect jumps
   bpf: disasm: add support for BPF_JMP|BPF_JA|BPF_X
   libbpf: support llvm-generated indirect jumps
   selftests/bpf: add selftests for indirect jumps

  arch/x86/net/bpf_jit_comp.c                   |  39 +-
  include/linux/bpf.h                           |  30 +
  include/linux/bpf_types.h                     |   1 +
  include/linux/bpf_verifier.h                  |  20 +-
  include/uapi/linux/bpf.h                      |  11 +
  kernel/bpf/Makefile                           |   2 +-
  kernel/bpf/bpf_insn_array.c                   | 350 ++++++++++
  kernel/bpf/core.c                             |  20 +
  kernel/bpf/disasm.c                           |   9 +
  kernel/bpf/syscall.c                          |  22 +
  kernel/bpf/verifier.c                         | 603 ++++++++++++++++--
  .../bpf/bpftool/Documentation/bpftool-map.rst |   2 +-
  tools/bpf/bpftool/map.c                       |   2 +-
  tools/include/uapi/linux/bpf.h                |  11 +
  tools/lib/bpf/libbpf.c                        | 159 ++++-
  tools/lib/bpf/libbpf_probes.c                 |   4 +
  tools/lib/bpf/linker.c                        |  12 +-
  tools/testing/selftests/bpf/Makefile          |   4 +-
  .../selftests/bpf/prog_tests/bpf_goto_x.c     | 132 ++++
  .../selftests/bpf/prog_tests/bpf_insn_array.c | 498 +++++++++++++++
  .../testing/selftests/bpf/progs/bpf_goto_x.c  | 384 +++++++++++
  21 files changed, 2230 insertions(+), 85 deletions(-)
  create mode 100644 kernel/bpf/bpf_insn_array.c
  create mode 100644 tools/testing/selftests/bpf/prog_tests/bpf_goto_x.c
  create mode 100644 tools/testing/selftests/bpf/prog_tests/bpf_insn_array.c
  create mode 100644 tools/testing/selftests/bpf/progs/bpf_goto_x.c

After indirect jumps, the next natural steps will be supporting
callx and static keys in BPF programs.

For static keys, LLVM currently supports the gotol_or_nop/nop_or_gotol
instructions
(https://github.com/llvm/llvm-project/compare/main...aspsk:llvm-project:static-keys),
and these instructions can only be used in inline asm.
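
Conceptually this mirrors the kernel's static branches: the
instruction is emitted as a nop (or an unconditional gotol) and is
patched when the key is toggled. A minimal sketch of what the
inline-asm usage might look like (the %l[...] operand form below is
an assumption; the LLVM branch above defines the actual syntax):

static inline int branch_unlikely(void)
{
    /* Emitted as a nop by default; patched to a gotol targeting
     * 'l_yes' when the static key is enabled (sketch only). */
    asm goto("nop_or_gotol %l[l_yes]" :::: l_yes);
    return 0;
l_yes:
    return 1;
}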

For callx, there are two patterns: one is calling a particular
function, tracked with flow-sensitive analysis; the other is calling
through a call table.

The following two examples call a particular function, with the
function-pointer variable tracked in a register.

Example 1:

typedef int (*op_t)(int, int);

static int add(int a, int b) { return a + b; }
static int mul(int a, int b) { return a * b; }

static int apply(op_t f, int a, int b) {
    // indirect call via function pointer
    return f(a, b);
}

int result(int i, int j) {
    op_t f;

    if (i + j)
        f = add;
    else
        f = mul;
    return apply(f, i, j);
}

The asm code:
result:                                 # @result
# %bb.0:
        w4 = w2
        w4 = -w4
        r3 = mul ll
        if w1 == w4 goto LBB0_2
# %bb.1:
        r3 = add ll
LBB0_2:
        callx r3
        exit

Example 2:

typedef int (*op_t)(int, int);

__attribute__((section("_add"))) static int add(int a, int b) { return a + b; }
__attribute__((section("_mul"))) static int mul(int a, int b) { return a * b; }

struct ctx {
  op_t f;
};

__attribute__((noinline)) static int apply(struct ctx *ctx, int a, int b) {
    // indirect call via function pointer
    return ctx->f(a, b);
}

int result(int i, int j) {
    int x = 2, y = 3;
    struct ctx ctx;

    if (i&2) ctx.f = add;
    else ctx.f = mul;
    int r1 = apply(&ctx, x, y);
    int r2 = apply(&ctx, x, y);

    return r1 + r2;
}

asm code:

result:                                 # @result
# %bb.0:
        w1 &= 2
        r2 = mul ll
        if w1 == 0 goto LBB0_2
# %bb.1:
        r2 = add ll
LBB0_2:
        *(u64 *)(r10 - 8) = r2
        r6 = r10
        r6 += -8
        r1 = r6
        call apply
        w7 = w0
        r1 = r6
        call apply
        w0 += w7
        exit
...
apply:                                  # @apply
# %bb.0:
        r3 = *(u64 *)(r1 + 0)
        w1 = 2
        w2 = 3
        callx r3
        exit

In the above two cases, the current verifier can be enhanced to
track function pointers so that eventually 'callx r3' can find its
proper targets.
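
For Example 1, that means r3 reaches 'callx r3' holding either the
address of add or the address of mul, and the verifier would have to
verify the call against both candidates. Purely as an illustration
of the idea (none of these names exist in the verifier):

/* Hypothetical: a per-register set of candidate subprogs, collected
 * from the 'r3 = add ll' / 'r3 = mul ll' loads on the paths that
 * merge before 'callx r3'. */
struct callx_candidates {
    int nr;             /* number of candidate targets */
    int subprog[4];     /* subprog indices the register may hold */
};

static int check_callx(struct callx_candidates *c)
{
    if (c->nr == 0)     /* unknown target: reject the program */
        return -EINVAL;
    for (int i = 0; i < c->nr; i++) {
        /* verify argument and return types against the prototype
         * of subprog[i], as for a direct call */
    }
    return 0;
}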

Another pattern is to have a call table (similar to a jump table),
where callx calls one of the functions based on the call table base
and an index. The example is below:

typedef int (*op_t)(int, int);

__attribute__((section("_add"))) static int add(int a, int b) { return a + b; }
__attribute__((section("_mul"))) static int mul(int a, int b) { return a * b; }

__attribute__((noinline)) static int apply(op_t *ops, int index, int a, int b) {
    // indirect call via function pointer
    return ops[index](a, b);
}

int result(int i, int j) {
    op_t ops[] = { add, mul, add, add, mul, mul };
    int x = 2, y = 3;

    int r1 = apply(ops, 0, x, y);
    int r2 = apply(ops, 4, x, y);

    return r1 + r2;
}

int result2(int i, int j) {
    op_t ops[] = { add, add, add, mul, mul };
    int x = 3, y = 2;

    int r1 = apply(ops, 1, x, y);
    int r2 = apply(ops, 2, x, y);

    return r1 + r2;
}


The related llvm IR:

@__const.result.ops = private unnamed_addr constant [6 x ptr] [ptr @add, ptr @mul, ptr @add, ptr @add, ptr @mul, ptr @mul], align 8
@__const.result2.ops = private unnamed_addr constant [5 x ptr] [ptr @add, ptr @add, ptr @add, ptr @mul, ptr @mul], align 8

; Function Attrs: nounwind
define dso_local i32 @result(i32 noundef %0, i32 noundef %1) local_unnamed_addr #0 {
  %3 = tail call fastcc i32 @apply(ptr noundef @__const.result.ops, i32 noundef 0, i32 noundef 2, i32 noundef 3)
  %4 = tail call fastcc i32 @apply(ptr noundef @__const.result.ops, i32 noundef 4, i32 noundef 2, i32 noundef 3)
  %5 = add nsw i32 %4, %3
  ret i32 %5
}
...
; Function Attrs: noinline nounwind
define internal fastcc i32 @apply(ptr noundef nonnull readonly captures(none) %0, i32 noundef range(i32 0, 5) %1, i32 noundef range(i32 2, 4) %2, i32 noundef range(i32 2, 4) %3) unnamed_addr #2 {
  %5 = zext nneg i32 %1 to i64
  %6 = getelementptr inbounds nuw ptr, ptr %0, i64 %5
  %7 = load ptr, ptr %6, align 8, !tbaa !3
  %8 = tail call i32 %7(i32 noundef %2, i32 noundef %3) #3
  ret i32 %8
}

; Function Attrs: nounwind
define dso_local i32 @result2(i32 noundef %0, i32 noundef %1) local_unnamed_addr #0 {
  %3 = tail call fastcc i32 @apply(ptr noundef @__const.result2.ops, i32 noundef 1, i32 noundef 3, i32 noundef 2)
  %4 = tail call fastcc i32 @apply(ptr noundef @__const.result2.ops, i32 noundef 2, i32 noundef 3, i32 noundef 2)
  %5 = add nsw i32 %4, %3
  ret i32 %5
}

To make
   @__const.result.ops = private unnamed_addr constant [6 x ptr] [ptr @add, ptr @mul, ptr @add, ptr @add, ptr @mul, ptr @mul], align 8
explicitly a call table, LLVM renames it (with an LLVM hack) to
something like
   BPF.__const.result.ops

The llvm hack on top of https://github.com/llvm/llvm-project/pull/149715:

commit 7b3d21831d2e2ccc77ec09553297e955e480eb96 (HEAD -> jumptable-v17-callx)
Author: Yonghong Song <yonghong.song@xxxxxxxxx>
Date:   Thu Sep 4 10:02:56 2025 -0700

    callx

diff --git a/llvm/lib/Target/BPF/BPFCheckAndAdjustIR.cpp b/llvm/lib/Target/BPF/BPFCheckAndAdjustIR.cpp
index b202b20291af..3d6a2c047d7b 100644
--- a/llvm/lib/Target/BPF/BPFCheckAndAdjustIR.cpp
+++ b/llvm/lib/Target/BPF/BPFCheckAndAdjustIR.cpp
@@ -55,6 +55,7 @@ private:
   bool sinkMinMax(Module &M);
   bool removeGEPBuiltins(Module &M);
   bool insertASpaceCasts(Module &M);
+  bool renameCallTableGlobal(Module &M);
 };
 } // End anonymous namespace
@@ -527,12 +528,39 @@ bool BPFCheckAndAdjustIR::insertASpaceCasts(Module &M) {
   return Changed;
 }
+bool BPFCheckAndAdjustIR::renameCallTableGlobal(Module &M) {
+  bool Changed = false;
+  for (GlobalVariable &Global : M.globals()) {
+    if (Global.getLinkage() != GlobalValue::PrivateLinkage)
+      continue;
+    if (!Global.isConstant() || !Global.hasInitializer())
+      continue;
+
+    Constant *CV = dyn_cast<Constant>(Global.getInitializer());
+    if (!CV)
+      continue;
+    ConstantArray *CA = dyn_cast<ConstantArray>(CV);
+    if (!CA)
+      continue;
+
+    bool AllFuncs = true;
+    for (unsigned i = 0, e = CA->getNumOperands(); i != e; ++i) {
+      if (!isa<Function>(CA->getOperand(i))) {
+        AllFuncs = false;
+        break;
+      }
+    }
+    if (!AllFuncs)
+      continue;
+    Global.setName("BPF." + Global.getName());
+    Global.setLinkage(GlobalValue::LinkageTypes::InternalLinkage);
+    Global.setSection(".calltables");
+    Changed = true;
+  }
+
+  return Changed;
+}
+
 bool BPFCheckAndAdjustIR::adjustIR(Module &M) {
   bool Changed = removePassThroughBuiltin(M);
   Changed = removeCompareBuiltin(M) || Changed;
   Changed = sinkMinMax(M) || Changed;
   Changed = removeGEPBuiltins(M) || Changed;
   Changed = insertASpaceCasts(M) || Changed;
+  Changed = renameCallTableGlobal(M) || Changed;
   return Changed;
 }


At the end, we will have the following asm code:

result:                                 # @result
# %bb.0:
        r1 = BPF.__const.result.ops ll
        w2 = 0
        w3 = 2
        w4 = 3
        call apply
        w6 = w0
        r1 = BPF.__const.result.ops ll
        w2 = 4
        w3 = 2
        w4 = 3
        call apply
        w0 += w6
        exit
.Lfunc_end0:
        .size   result, .Lfunc_end0-result
...
apply:                                  # @apply
# %bb.0:
        r2 = w2
        r2 <<= 3
        r1 += r2
        r5 = *(u64 *)(r1 + 0)
        w1 = w3
        w2 = w4
        callx r5
        exit
.Lfunc_end3:
        .size   apply, .Lfunc_end3-apply
...
        .section        .calltables,"a",@progbits
        .p2align        3, 0x0
BPF.__const.result.ops:
        .quad   add
        .quad   mul
        .quad   add
        .quad   add
        .quad   mul
        .quad   mul
        .size   BPF.__const.result.ops, 48

        .type   BPF.__const.result2.ops,@object # @BPF.__const.result2.ops
        .p2align        3, 0x0
BPF.__const.result2.ops:
        .quad   add
        .quad   add
        .quad   add
        .quad   mul
        .quad   mul
        .size   BPF.__const.result2.ops, 40

$ llvm-readelf -s t.o
...
     2: 0000000000000000    48 OBJECT  LOCAL  DEFAULT     6 BPF.__const.result.ops
     8: 0000000000000030    40 OBJECT  LOCAL  DEFAULT     6 BPF.__const.result2.ops
     9: 0000000000000000     0 SECTION LOCAL  DEFAULT     6 .calltables

0000000000000000 <result>:
       0:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0x0 ll
                0000000000000000:  R_BPF_64_64  .calltables
       2:       b4 02 00 00 00 00 00 00 w2 = 0x0
       3:       b4 03 00 00 02 00 00 00 w3 = 0x2
       4:       b4 04 00 00 03 00 00 00 w4 = 0x3
       5:       85 10 00 00 09 00 00 00 call 0x9
       6:       bc 06 00 00 00 00 00 00 w6 = w0
       7:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0x0 ll
                0000000000000038:  R_BPF_64_64  .calltables
       9:       b4 02 00 00 04 00 00 00 w2 = 0x4
      10:       b4 03 00 00 02 00 00 00 w3 = 0x2
      11:       b4 04 00 00 03 00 00 00 w4 = 0x3
      12:       85 10 00 00 02 00 00 00 call 0x2
      13:       0c 60 00 00 00 00 00 00 w0 += w6
      14:       95 00 00 00 00 00 00 00 exit
...
00000000000000b8 <result2>:
      23:       18 01 00 00 30 00 00 00 00 00 00 00 00 00 00 00 r1 = 0x30 ll
                00000000000000b8:  R_BPF_64_64  .calltables
      25:       b4 02 00 00 01 00 00 00 w2 = 0x1
      26:       b4 03 00 00 03 00 00 00 w3 = 0x3
      27:       b4 04 00 00 02 00 00 00 w4 = 0x2
      28:       85 10 00 00 f2 ff ff ff call -0xe
      29:       bc 06 00 00 00 00 00 00 w6 = w0
      30:       18 01 00 00 30 00 00 00 00 00 00 00 00 00 00 00 r1 = 0x30 ll
                00000000000000f0:  R_BPF_64_64  .calltables
      32:       b4 02 00 00 02 00 00 00 w2 = 0x2
      33:       b4 03 00 00 03 00 00 00 w3 = 0x3
      34:       b4 04 00 00 02 00 00 00 w4 = 0x2
      35:       85 10 00 00 eb ff ff ff call -0x15
      36:       0c 60 00 00 00 00 00 00 w0 += w6
      37:       95 00 00 00 00 00 00 00 exit


Relocation section '.rel.calltables' at offset 0x3d0 contains 11 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name
0000000000000000  0000000400000002 R_BPF_64_ABS64         0000000000000000 _add
0000000000000008  0000000600000002 R_BPF_64_ABS64         0000000000000000 _mul
0000000000000010  0000000400000002 R_BPF_64_ABS64         0000000000000000 _add
0000000000000018  0000000400000002 R_BPF_64_ABS64         0000000000000000 _add
0000000000000020  0000000600000002 R_BPF_64_ABS64         0000000000000000 _mul
0000000000000028  0000000600000002 R_BPF_64_ABS64         0000000000000000 _mul
0000000000000030  0000000400000002 R_BPF_64_ABS64         0000000000000000 _add
0000000000000038  0000000400000002 R_BPF_64_ABS64         0000000000000000 _add
0000000000000040  0000000400000002 R_BPF_64_ABS64         0000000000000000 _add
0000000000000048  0000000600000002 R_BPF_64_ABS64         0000000000000000 _mul
0000000000000050  0000000600000002 R_BPF_64_ABS64         0000000000000000 _mul

In the disasm code above, the first two relocations (offset 0x0 into
.calltables) clearly point to BPF.__const.result.ops, and the next
two (offset 0x30) point to BPF.__const.result2.ops.
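
In other words, a loader can map a relocated offset back to the call
table containing it by finding the .calltables symbol whose
[st_value, st_value + st_size) range covers that offset. A sketch
using standard libelf types:

#include <gelf.h>
#include <stdbool.h>

/* Does this symbol (e.g. BPF.__const.result.ops: value 0x0 size 48,
 * BPF.__const.result2.ops: value 0x30 size 40) contain the offset
 * targeted by an R_BPF_64_64 relocation against .calltables? */
static bool sym_covers_offset(const GElf_Sym *sym, int calltables_shndx,
                              GElf_Off off)
{
    return sym->st_shndx == calltables_shndx &&
           off >= sym->st_value &&
           off < sym->st_value + sym->st_size;
}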




