On Mon, May 12, 2025 at 10:49 PM Yonghong Song <yonghong.song@xxxxxxxxx> wrote: > > > > On 5/11/25 8:30 AM, Alexei Starovoitov wrote: > > On Sun, May 11, 2025 at 11:28 AM Yonghong Song <yonghong.song@xxxxxxxxx> wrote: > >> Marc Suñé (Isovalent, part of Cisco) reported an issue where an > >> uninitialized variable caused generating bpf prog binary code not > >> working as expected. The reproducer is in [1] where the flags > >> “-Wall -Werror” are enabled, but there is no warning and compiler > >> may take advantage of uninit variable to do aggressive optimization. > >> > >> In llvm internals, uninitialized variable usage may generate > >> 'unreachable' IR insn and these 'unreachable' IR insns may indicate > >> uninit var impact on code optimization. With clang21 patch [2], > >> those 'unreachable' IR insn are converted to func bpf_unreachable(). > >> > >> In kernel, a new kfunc bpf_unreachable() is added. If this kfunc > >> (generated by [2]) is the last insn in the main prog or a subprog, > >> the verifier will suggest the verification failure may be due to > >> uninitialized var, so user can check their source code to find the > >> root cause. > >> > >> Without this patch, the verifier will output > >> last insn is not an exit or jmp > >> and user will not know what is the potential root cause and > >> it will take more time to debug this verification failure. > >> > >> [1] https://github.com/msune/clang_bpf/blob/main/Makefile#L3 > >> [2] https://github.com/llvm/llvm-project/pull/131731 > >> > >> Signed-off-by: Yonghong Song <yonghong.song@xxxxxxxxx> > >> --- > >> kernel/bpf/helpers.c | 5 +++++ > >> kernel/bpf/verifier.c | 17 ++++++++++++++++- > >> 2 files changed, 21 insertions(+), 1 deletion(-) > >> > >> In order to compile kernel successfully with the above [2], the following > >> change is needed due to clang21 changes: > >> > >> --- a/Makefile > >> +++ b/Makefile > >> @@ -852,7 +852,7 @@ endif > >> endif # may-sync-config > >> endif # need-config > >> > >> -KBUILD_CFLAGS += -fno-delete-null-pointer-checks > >> +KBUILD_CFLAGS += -fno-delete-null-pointer-checks -Wno-default-const-init-field-unsafe > >> > >> --- a/scripts/Makefile.extrawarn > >> +++ b/scripts/Makefile.extrawarn > >> @@ -19,6 +19,7 @@ KBUILD_CFLAGS += $(call cc-disable-warning, frame-address) > >> KBUILD_CFLAGS += $(call cc-disable-warning, address-of-packed-member) > >> KBUILD_CFLAGS += -Wmissing-declarations > >> KBUILD_CFLAGS += -Wmissing-prototypes > >> +KBUILD_CFLAGS += -Wno-default-const-init-var-unsafe > >> > >> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c > >> index fed53da75025..6048d7e19d4c 100644 > >> --- a/kernel/bpf/helpers.c > >> +++ b/kernel/bpf/helpers.c > >> @@ -3283,6 +3283,10 @@ __bpf_kfunc void bpf_local_irq_restore(unsigned long *flags__irq_flag) > >> local_irq_restore(*flags__irq_flag); > >> } > >> > >> +__bpf_kfunc void bpf_unreachable(void) > >> +{ > >> +} > > Does it have to be an actual function with the body? > > Can it be a kfunc that doesn't consume any .text ? > > I tried to define bpf_unreachable as an extern function, but > it does not work as a __bpf_kfunc. I agree that we do not > need to consume any bytes in .text section for bpf_unreachable. > I have not found a solution for that yet. Have you tried marking it as 'naked' and empty body? > > > >> + > >> __bpf_kfunc_end_defs(); > >> > >> BTF_KFUNCS_START(generic_btf_ids) > >> @@ -3388,6 +3392,7 @@ BTF_ID_FLAGS(func, bpf_iter_kmem_cache_next, KF_ITER_NEXT | KF_RET_NULL | KF_SLE > >> BTF_ID_FLAGS(func, bpf_iter_kmem_cache_destroy, KF_ITER_DESTROY | KF_SLEEPABLE) > >> BTF_ID_FLAGS(func, bpf_local_irq_save) > >> BTF_ID_FLAGS(func, bpf_local_irq_restore) > >> +BTF_ID_FLAGS(func, bpf_unreachable) > >> BTF_KFUNCS_END(common_btf_ids) > >> > >> static const struct btf_kfunc_id_set common_kfunc_set = { > >> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > >> index 28f5a7899bd6..d26aec0a90d0 100644 > >> --- a/kernel/bpf/verifier.c > >> +++ b/kernel/bpf/verifier.c > >> @@ -206,6 +206,7 @@ static int ref_set_non_owning(struct bpf_verifier_env *env, > >> static void specialize_kfunc(struct bpf_verifier_env *env, > >> u32 func_id, u16 offset, unsigned long *addr); > >> static bool is_trusted_reg(const struct bpf_reg_state *reg); > >> +static void verbose_insn(struct bpf_verifier_env *env, struct bpf_insn *insn); > >> > >> static bool bpf_map_ptr_poisoned(const struct bpf_insn_aux_data *aux) > >> { > >> @@ -3398,7 +3399,10 @@ static int check_subprogs(struct bpf_verifier_env *env) > >> int i, subprog_start, subprog_end, off, cur_subprog = 0; > >> struct bpf_subprog_info *subprog = env->subprog_info; > >> struct bpf_insn *insn = env->prog->insnsi; > >> + bool is_bpf_unreachable = false; > >> int insn_cnt = env->prog->len; > >> + const struct btf_type *t; > >> + const char *tname; > >> > >> /* now check that all jumps are within the same subprog */ > >> subprog_start = subprog[cur_subprog].start; > >> @@ -3433,7 +3437,18 @@ static int check_subprogs(struct bpf_verifier_env *env) > >> if (code != (BPF_JMP | BPF_EXIT) && > >> code != (BPF_JMP32 | BPF_JA) && > >> code != (BPF_JMP | BPF_JA)) { > >> - verbose(env, "last insn is not an exit or jmp\n"); > >> + verbose_insn(env, &insn[i]); > >> + if (btf_vmlinux && insn[i].code == (BPF_CALL | BPF_JMP) && > >> + insn[i].src_reg == BPF_PSEUDO_KFUNC_CALL) { > >> + t = btf_type_by_id(btf_vmlinux, insn[i].imm); > >> + tname = btf_name_by_offset(btf_vmlinux, t->name_off); > >> + if (strcmp(tname, "bpf_unreachable") == 0) > > the check by name is not pretty. > > > >> + is_bpf_unreachable = true; > >> + } > >> + if (is_bpf_unreachable) > >> + verbose(env, "last insn is bpf_unreachable, due to uninitialized var?\n"); > >> + else > >> + verbose(env, "last insn is not an exit or jmp\n"); > > This is too specific imo. > > add_subprog_and_kfunc() -> add_kfunc_call() > > should probably handle it instead, > > and print that error. > > add_subprog_and_kfunc() -> add_kfunc_call() approach probably won't work. > The error should be emitted only if the verifier (through path sensitive > analysis) reaches bpf_unreachable(). > > if bpf_unreachable() exists in the bpf prog, but verifier cannot reach it > during verification process, error will not printed. you mean when bpf_unreachable() in the actual dead code then the verifier shouldn't error ? Makes sense. > > It doesn't matter that call bpf_unreachable is the last insn > > of a program or subprogram. > > I suspect llvm can emit it anywhere. > > It is totally possible that bpf_unreachable may appear in the middle of > code. But based on past examples, bpf_unreachable tends to be in the > last insn and it may be targetted from multiple sources. This also makes > code easier to understand. I can dig into llvm internal a little bit > more to find how llvm places 'unreachable' IR insns. I recall Anton hit some odd case of unreachable code with upcoming indirect goto/call work. If llmv starts emitting call bpf_unreachable that may be another case.