On Mon, May 19, 2025 at 1:34 PM Yonghong Song <yonghong.song@xxxxxxxxx> wrote:
>
> Marc Suñé (Isovalent, part of Cisco) reported an issue where an
> uninitialized variable caused the generated bpf prog binary code to
> not work as expected. The reproducer is in [1] where the flags
> "-Wall -Werror" are enabled, but there is no warning, as the compiler
> takes advantage of the uninitialized variable to do aggressive
> optimization. The optimized code looks like below:
>
>     ; {
>        0:  bf 16 00 00 00 00 00 00  r6 = r1
>     ; bpf_printk("Start");
>        1:  18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00  r1 = 0x0 ll
>            0000000000000008:  R_BPF_64_64  .rodata
>        3:  b4 02 00 00 06 00 00 00  w2 = 0x6
>        4:  85 00 00 00 06 00 00 00  call 0x6
>     ; DEFINE_FUNC_CTX_POINTER(data)
>        5:  61 61 4c 00 00 00 00 00  w1 = *(u32 *)(r6 + 0x4c)
>     ; bpf_printk("pre ipv6_hdrlen_offset");
>        6:  18 01 00 00 06 00 00 00 00 00 00 00 00 00 00 00  r1 = 0x6 ll
>            0000000000000030:  R_BPF_64_64  .rodata
>        8:  b4 02 00 00 17 00 00 00  w2 = 0x17
>        9:  85 00 00 00 06 00 00 00  call 0x6
>     <END>
>
> The verifier will report the following failure:
>
>     9: (85) call bpf_trace_printk#6
>     last insn is not an exit or jmp
>
> The above verifier log does not give a clear hint about how to fix
> the problem, and users may take quite some time to figure out that
> the issue is due to the compiler taking advantage of the
> uninitialized variable.
>
> In llvm internals, uninitialized variable usage may generate an
> 'unreachable' IR insn, and such 'unreachable' IR insns indicate where
> the uninitialized variable impacted code optimization. So far, the
> llvm BPF backend ignores 'unreachable' IR, hence the above code is
> generated. With the clang21 patch [2], those 'unreachable' IR insns
> are converted to calls to a func bpf_unreachable(). In order to
> maintain a proper control flow graph for bpf progs, [2] also adds an
> 'exit' insn after bpf_unreachable() if bpf_unreachable() is the last
> insn in the function. The new code looks like:
>
>     ; {
>        0:  bf 16 00 00 00 00 00 00  r6 = r1
>     ; bpf_printk("Start");
>        1:  18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00  r1 = 0x0 ll
>            0000000000000008:  R_BPF_64_64  .rodata
>        3:  b4 02 00 00 06 00 00 00  w2 = 0x6
>        4:  85 00 00 00 06 00 00 00  call 0x6
>     ; DEFINE_FUNC_CTX_POINTER(data)
>        5:  61 61 4c 00 00 00 00 00  w1 = *(u32 *)(r6 + 0x4c)
>     ; bpf_printk("pre ipv6_hdrlen_offset");
>        6:  18 01 00 00 06 00 00 00 00 00 00 00 00 00 00 00  r1 = 0x6 ll
>            0000000000000030:  R_BPF_64_64  .rodata
>        8:  b4 02 00 00 17 00 00 00  w2 = 0x17
>        9:  85 00 00 00 06 00 00 00  call 0x6
>       10:  85 10 00 00 ff ff ff ff  call -0x1
>            0000000000000050:  R_BPF_64_32  bpf_unreachable
>       11:  95 00 00 00 00 00 00 00  exit
>     <END>
>
> In the kernel, a new kfunc bpf_unreachable() is added. During insn
> verification, any call to bpf_unreachable() will result in a
> verification failure, and the kernel is able to provide a better log
> message for debugging.
>
> With llvm patch [2] and without this patch (i.e., on existing/old
> kernels with no bpf_unreachable() kfunc), the verifier outputs:
>
>     10: <invalid kfunc call>
>     kfunc 'bpf_unreachable' is referenced but wasn't resolved
>
> Basically, the kernel does not support the bpf_unreachable() kfunc.
> This still does not give a clear signal about the possible reason.
>
> With llvm patch [2] and with this patch, the verifier outputs:
>
>     10: (85) call bpf_unreachable#74479
>     unexpected bpf_unreachable() due to uninitialized variable?
>
> This gives a much better hint about the verification failure.
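As a side note for readers following along: below is a minimal
hypothetical sketch of the pattern being described (it is not the
exact reproducer from [1]; in this trivial form clang's
-Wuninitialized would likely still warn, whereas the real reproducer
hides the uninitialized use behind enough control flow that
"-Wall -Werror" stays silent). The program name and layout are
illustrative only.

	#include <linux/bpf.h>
	#include <bpf/bpf_helpers.h>

	SEC("tc")
	int uninit_demo(struct __sk_buff *skb)
	{
		void *data = (void *)(long)skb->data;
		int off; /* never assigned on any path */

		bpf_printk("Start");
		bpf_printk("pre ipv6_hdrlen_offset");
		/* Undefined behavior: 'off' is read but never written.
		 * The optimizer may lower everything from here on to an
		 * 'unreachable' IR insn, so the emitted code ends right
		 * after the second bpf_printk() with no exit insn,
		 * matching the disassembly above.
		 */
		return *(int *)(data + off);
	}

	char LICENSE[] SEC("license") = "GPL";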
>
> [1] https://github.com/msune/clang_bpf/blob/main/Makefile#L3
> [2] https://github.com/llvm/llvm-project/pull/131731
>
> Signed-off-by: Yonghong Song <yonghong.song@xxxxxxxxx>
> ---
>  kernel/bpf/helpers.c  | 5 +++++
>  kernel/bpf/verifier.c | 5 +++++
>  2 files changed, 10 insertions(+)
>
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index c1113b74e1e2..4852c36b1c51 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -3273,6 +3273,10 @@ __bpf_kfunc void bpf_local_irq_restore(unsigned long *flags__irq_flag)
>  	local_irq_restore(*flags__irq_flag);
>  }
>
> +__bpf_kfunc void bpf_unreachable(void)
> +{
> +}
> +
>  __bpf_kfunc_end_defs();
>
>  BTF_KFUNCS_START(generic_btf_ids)
> @@ -3386,6 +3390,7 @@ BTF_ID_FLAGS(func, bpf_copy_from_user_dynptr, KF_SLEEPABLE)
>  BTF_ID_FLAGS(func, bpf_copy_from_user_str_dynptr, KF_SLEEPABLE)
>  BTF_ID_FLAGS(func, bpf_copy_from_user_task_dynptr, KF_SLEEPABLE | KF_TRUSTED_ARGS)
>  BTF_ID_FLAGS(func, bpf_copy_from_user_task_str_dynptr, KF_SLEEPABLE | KF_TRUSTED_ARGS)
> +BTF_ID_FLAGS(func, bpf_unreachable)
>  BTF_KFUNCS_END(common_btf_ids)
>
>  static const struct btf_kfunc_id_set common_kfunc_set = {
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index d5807d2efc92..08013e2e1697 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -12105,6 +12105,7 @@ enum special_kfunc_type {
>  	KF_bpf_res_spin_unlock,
>  	KF_bpf_res_spin_lock_irqsave,
>  	KF_bpf_res_spin_unlock_irqrestore,
> +	KF_bpf_unreachable,
>  };
>
>  BTF_SET_START(special_kfunc_set)
> @@ -12208,6 +12209,7 @@ BTF_ID(func, bpf_res_spin_lock)
>  BTF_ID(func, bpf_res_spin_unlock)
>  BTF_ID(func, bpf_res_spin_lock_irqsave)
>  BTF_ID(func, bpf_res_spin_unlock_irqrestore)
> +BTF_ID(func, bpf_unreachable)
>
>  static bool is_kfunc_ret_null(struct bpf_kfunc_call_arg_meta *meta)
>  {
> @@ -13508,6 +13510,9 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
>  		return err;
>  	}
>  	__mark_btf_func_reg_size(env, regs, BPF_REG_0, sizeof(u32));
> +	} else if (!insn->off && insn->imm == special_kfunc_list[KF_bpf_unreachable]) {

Looks good, but let's not abuse special_kfunc_list[] for this case.

A special_kfunc_type is supposed to be in both set[] and list[].
This is not the case here. It was wrong to add KF_bpf_set_dentry_xattr,
bpf_iter_css_task_new, bpf_dynptr_from_skb, and many others.
Let's fix this tech debt that we accumulated.

special_kfunc_type should include only kfuncs that return a pointer,
so that this part is triggered:

	} else if (btf_type_is_ptr(t)) {
		ptr_type = btf_type_skip_modifiers(desc_btf, t->type, &ptr_type_id);

		if (meta.btf == btf_vmlinux &&
		    btf_id_set_contains(&special_kfunc_set, meta.func_id)) {

All other kfuncs shouldn't be there. They don't need to be in
special_kfunc_set.

Let's split enum special_kfunc_type into what it was meant to be
originally (both set and list), and move all list-only kfuncs into a
new array. Let's call it kfunc_ids. Then the check in this patch will
look like:

	insn->imm == kfunc_ids[KF_bpf_unreachable]

Digging through the code, it looks like we made a bit of a mess there,
like this part:

	} else if (btf_type_is_void(t)) {
		if (meta.btf == btf_vmlinux &&
		    btf_id_set_contains(&special_kfunc_set, meta.func_id)) {
			if (meta.func_id == special_kfunc_list[KF_bpf_obj_drop_impl] ||
			    meta.func_id == special_kfunc_list[KF_bpf_percpu_obj_drop_impl]) {

The *obj_drop kfuncs don't need to be in a set, and
btf_id_set_contains() doesn't need to be called. Both kfuncs should be
moved to the new kfunc_ids[].
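To make the direction concrete, here is a rough sketch of how the
split in kernel/bpf/verifier.c could look. The enum name
kfunc_id_type and the member lists are illustrative guesses, not a
final patch; only BTF_ID_LIST()/BTF_ID() are the existing kernel
macros.

	#include <linux/btf_ids.h>

	/* Pointer-returning kfuncs that need the special ptr_type
	 * handling in check_kfunc_call(); these remain in both
	 * special_kfunc_set and special_kfunc_list.
	 */
	enum special_kfunc_type {
		KF_bpf_obj_new_impl,
		KF_bpf_refcount_acquire_impl,
		/* ... other pointer-returning kfuncs only ... */
	};

	/* List-only kfuncs: the verifier just compares the resolved
	 * BTF id, so no special_kfunc_set membership and no
	 * btf_id_set_contains() call.
	 */
	enum kfunc_id_type {
		KF_bpf_obj_drop_impl,
		KF_bpf_percpu_obj_drop_impl,
		KF_bpf_unreachable,
		/* ... other list-only kfuncs ... */
	};

	/* Order must match enum kfunc_id_type; the ids are resolved
	 * at link time by resolve_btfids.
	 */
	BTF_ID_LIST(kfunc_ids)
	BTF_ID(func, bpf_obj_drop_impl)
	BTF_ID(func, bpf_percpu_obj_drop_impl)
	BTF_ID(func, bpf_unreachable)

With that in place, the check in this patch becomes
insn->imm == kfunc_ids[KF_bpf_unreachable], and the btf_type_is_void(t)
branch quoted above can compare meta.func_id against
kfunc_ids[KF_bpf_obj_drop_impl] / kfunc_ids[KF_bpf_percpu_obj_drop_impl]
directly, without consulting special_kfunc_set.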