On 5/19/25 6:48 AM, Alexei Starovoitov wrote:
On Mon, May 19, 2025 at 1:34 PM Yonghong Song <yonghong.song@xxxxxxxxx> wrote:
Marc Suñé (Isovalent, part of Cisco) reported an issue where an
uninitialized variable caused the compiler to generate bpf prog binary
code that does not work as expected. The reproducer is in [1], where the
flags "-Wall -Werror" are enabled, yet no warning is emitted because the
compiler exploits the uninitialized variable for aggressive optimization.
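For illustration only, a minimal sketch of such a pattern (this is not
the actual reproducer from [1]; the names, the section and the exact use
of the variable are made up) could look like:

  #include <linux/bpf.h>
  #include <linux/if_ether.h>
  #include <bpf/bpf_helpers.h>
  #include <bpf/bpf_endian.h>

  SEC("tc")
  int prog(struct __sk_buff *skb)
  {
          int ipv6_hdrlen_offset; /* not initialized on all paths */

          bpf_printk("Start");
          if (skb->protocol == bpf_htons(ETH_P_IPV6))
                  ipv6_hdrlen_offset = ETH_HLEN;
          bpf_printk("pre ipv6_hdrlen_offset");
          /* Branching on a possibly-uninitialized value is undefined
           * behavior; the optimizer may fold it into an 'unreachable'
           * IR insn and silently drop the rest of the function, with
           * no warning even under -Wall -Werror.
           */
          if (ipv6_hdrlen_offset > 0)
                  return 1;
          return 0;
  }

  char _license[] SEC("license") = "GPL";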
The optimized code looks like below:
; {
0: bf 16 00 00 00 00 00 00 r6 = r1
; bpf_printk("Start");
1: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0x0 ll
0000000000000008: R_BPF_64_64 .rodata
3: b4 02 00 00 06 00 00 00 w2 = 0x6
4: 85 00 00 00 06 00 00 00 call 0x6
; DEFINE_FUNC_CTX_POINTER(data)
5: 61 61 4c 00 00 00 00 00 w1 = *(u32 *)(r6 + 0x4c)
; bpf_printk("pre ipv6_hdrlen_offset");
6: 18 01 00 00 06 00 00 00 00 00 00 00 00 00 00 00 r1 = 0x6 ll
0000000000000030: R_BPF_64_64 .rodata
8: b4 02 00 00 17 00 00 00 w2 = 0x17
9: 85 00 00 00 06 00 00 00 call 0x6
<END>
The verifier will report the following failure:
9: (85) call bpf_trace_printk#6
last insn is not an exit or jmp
The above verifier log does not give a clear hint about how to fix the
problem, and the user may take quite some time to figure out that the
issue is due to the compiler exploiting the uninitialized variable.
In llvm internals, use of an uninitialized variable may generate an
'unreachable' IR insn, and such 'unreachable' IR insns indicate that the
uninitialized variable has affected code optimization. So far, the llvm
BPF backend has ignored 'unreachable' IR, hence the above code is
generated. With the clang21 patch [2], those 'unreachable' IR insns are
converted to calls to the func bpf_unreachable(). In order to maintain a
proper control flow graph for bpf progs, [2] also adds an 'exit' insn
after bpf_unreachable() if bpf_unreachable() is the last insn in the
function.
The new code looks like:
; {
0: bf 16 00 00 00 00 00 00 r6 = r1
; bpf_printk("Start");
1: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0x0 ll
0000000000000008: R_BPF_64_64 .rodata
3: b4 02 00 00 06 00 00 00 w2 = 0x6
4: 85 00 00 00 06 00 00 00 call 0x6
; DEFINE_FUNC_CTX_POINTER(data)
5: 61 61 4c 00 00 00 00 00 w1 = *(u32 *)(r6 + 0x4c)
; bpf_printk("pre ipv6_hdrlen_offset");
6: 18 01 00 00 06 00 00 00 00 00 00 00 00 00 00 00 r1 = 0x6 ll
0000000000000030: R_BPF_64_64 .rodata
8: b4 02 00 00 17 00 00 00 w2 = 0x17
9: 85 00 00 00 06 00 00 00 call 0x6
10: 85 10 00 00 ff ff ff ff call -0x1
0000000000000050: R_BPF_64_32 bpf_unreachable
11: 95 00 00 00 00 00 00 00 exit
<END>
In the kernel, a new kfunc bpf_unreachable() is added. During insn
verification, reaching a bpf_unreachable() call results in verification
failure, and the kernel can provide a better log message for debugging.
With llvm patch [2] but without this patch (i.e., on existing/old
kernels that have no bpf_unreachable() kfunc), the verifier outputs
10: <invalid kfunc call>
kfunc 'bpf_unreachable' is referenced but wasn't resolved
Basically, the kernel does not support the bpf_unreachable() kfunc.
This still does not give a clear signal about the possible cause.
With llvm patch [2] and with this patch, the verifier outputs
10: (85) call bpf_unreachable#74479
unexpected bpf_unreachable() due to uninitialized variable?
This gives a much better hint about the cause of the verification failure.
[1] https://github.com/msune/clang_bpf/blob/main/Makefile#L3
[2] https://github.com/llvm/llvm-project/pull/131731
Signed-off-by: Yonghong Song <yonghong.song@xxxxxxxxx>
---
kernel/bpf/helpers.c | 5 +++++
kernel/bpf/verifier.c | 5 +++++
2 files changed, 10 insertions(+)
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index c1113b74e1e2..4852c36b1c51 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -3273,6 +3273,10 @@ __bpf_kfunc void bpf_local_irq_restore(unsigned long *flags__irq_flag)
local_irq_restore(*flags__irq_flag);
}
+__bpf_kfunc void bpf_unreachable(void)
+{
+}
+
__bpf_kfunc_end_defs();
BTF_KFUNCS_START(generic_btf_ids)
@@ -3386,6 +3390,7 @@ BTF_ID_FLAGS(func, bpf_copy_from_user_dynptr, KF_SLEEPABLE)
BTF_ID_FLAGS(func, bpf_copy_from_user_str_dynptr, KF_SLEEPABLE)
BTF_ID_FLAGS(func, bpf_copy_from_user_task_dynptr, KF_SLEEPABLE | KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_copy_from_user_task_str_dynptr, KF_SLEEPABLE | KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_unreachable)
BTF_KFUNCS_END(common_btf_ids)
static const struct btf_kfunc_id_set common_kfunc_set = {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index d5807d2efc92..08013e2e1697 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -12105,6 +12105,7 @@ enum special_kfunc_type {
KF_bpf_res_spin_unlock,
KF_bpf_res_spin_lock_irqsave,
KF_bpf_res_spin_unlock_irqrestore,
+ KF_bpf_unreachable,
};
BTF_SET_START(special_kfunc_set)
@@ -12208,6 +12209,7 @@ BTF_ID(func, bpf_res_spin_lock)
BTF_ID(func, bpf_res_spin_unlock)
BTF_ID(func, bpf_res_spin_lock_irqsave)
BTF_ID(func, bpf_res_spin_unlock_irqrestore)
+BTF_ID(func, bpf_unreachable)
static bool is_kfunc_ret_null(struct bpf_kfunc_call_arg_meta *meta)
{
@@ -13508,6 +13510,9 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
return err;
}
__mark_btf_func_reg_size(env, regs, BPF_REG_0, sizeof(u32));
+	} else if (!insn->off && insn->imm == special_kfunc_list[KF_bpf_unreachable]) {
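+		/* the quoted hunk is trimmed here; a sketch of the remainder,
+		 * based on the verifier output shown above (the exact errno
+		 * is an assumption):
+		 */
+		verbose(env, "unexpected bpf_unreachable() due to uninitialized variable?\n");
+		return -EFAULT;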
Looks good, but let's not abuse special_kfunc_list[] for this case.
special_kfunc_type is supposed to be in both set[] and list[].
This is not the case here.
It was wrong to add KF_bpf_set_dentry_xattr, bpf_iter_css_task_new,
bpf_dynptr_from_skb, and many others.
Let's fix this tech debt that we accumulated.
special_kfunc_type should include only kfuncs that return
a pointer, so that this part is triggered:
	} else if (btf_type_is_ptr(t)) {
		ptr_type = btf_type_skip_modifiers(desc_btf, t->type,
						   &ptr_type_id);

		if (meta.btf == btf_vmlinux &&
		    btf_id_set_contains(&special_kfunc_set, meta.func_id)) {
All other kfuncs shouldn't be there. They don't need to be in
the special_kfunc_set.
Let's split enum special_kfunc_type into what it was meant to be
originally (both set and list), and move all list-only kfuncs
into a new array.
Let's call it kfunc_ids.
Then the check in this patch will look like:
insn->imm == kfunc_ids[KF_bpf_unreachable]
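As a rough sketch (the entries below are illustrative, based only on
the kfuncs discussed in this thread, not a complete list):

  /* list-only kfunc ids: compared against insn->imm directly,
   * never looked up via btf_id_set_contains()
   */
  BTF_ID_LIST(kfunc_ids)
  BTF_ID(func, bpf_obj_drop_impl)
  BTF_ID(func, bpf_percpu_obj_drop_impl)
  BTF_ID(func, bpf_throw)
  BTF_ID(func, bpf_unreachable)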
IIUC, the main goal is to remove some kfuncs from special_kfunc_set
since they are unnecessary.
I think we do not need an 'enum' type for special_kfunc_set since,
for all kfuncs in special_kfunc_set, btf_id_set_contains() is used
to check membership by btf_id. So the current 'enum special_kfunc_type'
is only used with special_kfunc_list to find the proper kfunc_id's.
I think the following change should achieve this:
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 08013e2e1697..2cf00b06ae66 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -12060,7 +12060,7 @@ enum kfunc_ptr_arg_type {
KF_ARG_PTR_TO_RES_SPIN_LOCK,
};
-enum special_kfunc_type {
+enum special_kfunc_list_type {
KF_bpf_obj_new_impl,
KF_bpf_obj_drop_impl,
KF_bpf_refcount_acquire_impl,
@@ -12126,24 +12126,10 @@ BTF_ID(func, bpf_rbtree_first)
BTF_ID(func, bpf_rbtree_root)
BTF_ID(func, bpf_rbtree_left)
BTF_ID(func, bpf_rbtree_right)
-#ifdef CONFIG_NET
-BTF_ID(func, bpf_dynptr_from_skb)
-BTF_ID(func, bpf_dynptr_from_xdp)
-#endif
BTF_ID(func, bpf_dynptr_slice)
BTF_ID(func, bpf_dynptr_slice_rdwr)
-BTF_ID(func, bpf_dynptr_clone)
BTF_ID(func, bpf_percpu_obj_new_impl)
BTF_ID(func, bpf_percpu_obj_drop_impl)
-BTF_ID(func, bpf_throw)
-BTF_ID(func, bpf_wq_set_callback_impl)
-#ifdef CONFIG_CGROUPS
-BTF_ID(func, bpf_iter_css_task_new)
-#endif
-#ifdef CONFIG_BPF_LSM
-BTF_ID(func, bpf_set_dentry_xattr)
-BTF_ID(func, bpf_remove_dentry_xattr)
-#endif
BTF_SET_END(special_kfunc_set)
BTF_ID_LIST(special_kfunc_list)
I renamed 'enum special_kfunc_type' to 'enum special_kfunc_list_type'
to imply that the enum values in special_kfunc_list_type have a 1:1
relation to the entries of special_kfunc_list.
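For example, a sketch of the intended 1:1 correspondence (only the
first two entries shown):

  enum special_kfunc_list_type {
          KF_bpf_obj_new_impl,    /* index 0 */
          KF_bpf_obj_drop_impl,   /* index 1 */
  };

  BTF_ID_LIST(special_kfunc_list)
  BTF_ID(func, bpf_obj_new_impl)  /* special_kfunc_list[KF_bpf_obj_new_impl] */
  BTF_ID(func, bpf_obj_drop_impl) /* special_kfunc_list[KF_bpf_obj_drop_impl] */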
WDYT?
Digging through the code, it looks like we made a bit of a mess there.
Like this part:
	} else if (btf_type_is_void(t)) {
		if (meta.btf == btf_vmlinux &&
		    btf_id_set_contains(&special_kfunc_set, meta.func_id)) {
			if (meta.func_id == special_kfunc_list[KF_bpf_obj_drop_impl] ||
			    meta.func_id == special_kfunc_list[KF_bpf_percpu_obj_drop_impl]) {
The *obj_drop kfuncs don't need to be in a set,
and btf_id_set_contains() doesn't need to be called.
Both kfuncs should be moved to the new kfunc_ids[].
As you mentioned, for this one, we can do the following:
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 2cf00b06ae66..a3ff57eaa5f4 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -12110,7 +12110,6 @@ enum special_kfunc_list_type {
BTF_SET_START(special_kfunc_set)
BTF_ID(func, bpf_obj_new_impl)
-BTF_ID(func, bpf_obj_drop_impl)
BTF_ID(func, bpf_refcount_acquire_impl)
BTF_ID(func, bpf_list_push_front_impl)
BTF_ID(func, bpf_list_push_back_impl)
@@ -12129,7 +12128,6 @@ BTF_ID(func, bpf_rbtree_right)
BTF_ID(func, bpf_dynptr_slice)
BTF_ID(func, bpf_dynptr_slice_rdwr)
BTF_ID(func, bpf_percpu_obj_new_impl)
-BTF_ID(func, bpf_percpu_obj_drop_impl)
BTF_SET_END(special_kfunc_set)
BTF_ID_LIST(special_kfunc_list)
@@ -13909,7 +13907,7 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
if (reg_may_point_to_spin_lock(&regs[BPF_REG_0]) && !regs[BPF_REG_0].id)
regs[BPF_REG_0].id = ++env->id_gen;
} else if (btf_type_is_void(t)) {
- if (meta.btf == btf_vmlinux && btf_id_set_contains(&special_kfunc_set, meta.func_id)) {
+ if (meta.btf == btf_vmlinux) {
if (meta.func_id == special_kfunc_list[KF_bpf_obj_drop_impl] ||
meta.func_id == special_kfunc_list[KF_bpf_percpu_obj_drop_impl]) {
insn_aux->kptr_struct_meta =