Am Sonntag, dem 08.06.2025 um 10:45 +0200 schrieb Bert Karwatzki: > Am Donnerstag, dem 05.06.2025 um 14:51 +0200 schrieb Sebastian Andrzej Siewior: > > On 2025-06-05 08:48:38 [-0400], Steven Rostedt wrote: > > > On Thu, 5 Jun 2025 11:19:03 +0200 > > > Bert Karwatzki <spasswolf@xxxxxx> wrote: > > > > > > > This patch seems to create so much output that the orginal error message and > > > > backtrace often get lost, so I needed several runs to get a meaningful message > > > > when running > > > > > > Are you familiar with preempt count tracing? > > > > I have an initial set of patches to tackle this problem, I'm going to > > send them after the merge window. > > > > Sebastian > > I've found the reason for the "mysterious" increase of preempt_count: > > [ 70.821750] [ T2746] bpf_link_settle calling fd_install() preemt_count = 0 > [ 70.821751] [ T2746] preempt_count_add 5898: preempt_count = 0x0 counter = 0x1b232c > [ 70.821752] [ T2746] preempt_count_add 5900: preempt_count = 0x1 counter = 0x1b232d > [ 70.821754] [ T2746] preempt_count_sub 5966: preempt_count = 0x1 counter = 0x1b232e > [ 70.821755] [ T2746] preempt_count_sub 5968: preempt_count = 0x0 counter = 0x1b232f > [ 70.821761] [ T2746] __bpf_trace_sys_enter 18: preempt_count = 0x0 > [ 70.821762] [ T2746] __bpf_trace_sys_enter 18: preempt_count = 0x1 > [ 70.821764] [ T2746] __bpf_trace_run: preempt_count = 1 > [ 70.821765] [ T2746] bpf_prog_run: preempt_count = 1 > [ 70.821766] [ T2746] __bpf_prog_run: preempt_count = 1 > > It's caused by this macro from include/trace/bpf_probe.h (with my pr_err()): > > #define __BPF_DECLARE_TRACE_SYSCALL(call, proto, args) \ > static notrace void \ > __bpf_trace_##call(void *__data, proto) \ > { \ > might_fault(); \ > if (!strcmp(get_current()->comm, "test_progs")) \ > pr_err("%s %d: preempt_count = 0x%x", __func__, __LINE__, preempt_count());\ > preempt_disable_notrace(); \ > if (!strcmp(get_current()->comm, "test_progs")) \ > pr_err("%s %d: preempt_count = 0x%x", __func__, __LINE__, preempt_count());\ > CONCATENATE(bpf_trace_run, COUNT_ARGS(args))(__data, CAST_TO_U64(args)); \ > preempt_enable_notrace(); \ > } > > The preempt_{en,dis}able_notrace were introduced in > commit 4aadde89d81f ("tracing/bpf: disable preemption in syscall probe") > This commit is present in v6.14 and v6.15, but the bug already appears in > v6.12 so in that case preemption is disable somewhere else. > > Bert Karwatzki After reading this https://lore.kernel.org/bpf/CAADnVQJf535hwud5XtQKStOge9=pYVYWSiq_8Q2YAvN5rba==A@xxxxxxxxxxxxxx/ I tried using migrate_{en,disable} like this (in v6.15) diff --git a/include/trace/bpf_probe.h b/include/trace/bpf_probe.h index 183fa2aa2935..49257cb90209 100644 --- a/include/trace/bpf_probe.h +++ b/include/trace/bpf_probe.h @@ -58,9 +58,9 @@ static notrace void \ __bpf_trace_##call(void *__data, proto) \ { \ might_fault(); \ - preempt_disable_notrace(); \ + migrate_disable(); \ CONCATENATE(bpf_trace_run, COUNT_ARGS(args))(__data, CAST_TO_U64(args)); \ - preempt_enable_notrace(); \ + migrate_enable(); \ } #undef DECLARE_EVENT_SYSCALL_CLASS diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 187dc37d61d4..ec0326405fc3 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -2350,7 +2350,7 @@ void __bpf_trace_run(struct bpf_raw_tp_link *link, u64 *args) struct bpf_run_ctx *old_run_ctx; struct bpf_trace_run_ctx run_ctx; - cant_sleep(); + cant_migrate(); if (unlikely(this_cpu_inc_return(*(prog->active)) != 1)) { bpf_prog_inc_misses_counter(prog); goto out; diff --git a/tools/testing/selftests/bpf/progs/dynptr_success.c b/tools/testing/selftests/bpf/progs/dynptr_success.c index e1fba28e4a86..7cfb9473a526 100644 --- a/tools/testing/selftests/bpf/progs/dynptr_success.c +++ b/tools/testing/selftests/bpf/progs/dynptr_success.c @@ -7,6 +7,7 @@ #include <bpf/bpf_helpers.h> #include <bpf/bpf_tracing.h> #include "bpf_misc.h" +#include "bpf_kfuncs.h" #include "errno.h" char _license[] SEC("license") = "GPL"; This fixes the warnings when running the bpf cgroup examples: ./test_progs -a "cgrp_local_storage/cgrp1*" but I still get a warning from another example (I don't know which, yet): [ T4696] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48 [ T4696] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 4696, name: test_progs [ T4696] preempt_count: 1, expected: 0 [ T4696] RCU nest depth: 0, expected: 0 [ T4696] 2 locks held by test_progs/4696: [ T4696] #0: ffffffff91d30860 (rcu_read_lock_trace){....}-{0:0}, at: bpf_prog_test_run_syscall+0xbb/0x250 [ T4696] #1: ffff9ca88e6741c8 ((&c->lock)){+.+.}-{3:3}, at: ___slab_alloc+0x68/0xde0 [ T4696] irq event stamp: 100 [ T4696] hardirqs last enabled at (99): [<ffffffff91199098>] do_syscall_64+0x38/0xfa0 [ T4696] hardirqs last disabled at (100): [<ffffffff907bc57b>] __bpf_async_init+0xdb/0x310 [ T4696] softirqs last enabled at (0): [<ffffffff9057b694>] copy_process+0xc84/0x3840 [ T4696] softirqs last disabled at (0): [<0000000000000000>] 0x0 [ T4696] CPU: 1 UID: 0 PID: 4696 Comm: test_progs Tainted: G O 6.15.0-bpf-00003-g5197b534e6ad #4 PREEMPT_{RT,(full)} [ T4696] Tainted: [O]=OOT_MODULE [ T4696] Hardware name: Micro-Star International Co., Ltd. Alpha 15 B5EEK/MS-158L, BIOS E158LAMS.10F 11/11/2024 [ T4696] Call Trace: [ T4696] <TASK> [ T4696] dump_stack_lvl+0x6d/0xb0 [ T4696] __might_resched.cold+0xe1/0xf3 [ T4696] rt_spin_lock+0x5f/0x190 [ T4696] ? ___slab_alloc+0x68/0xde0 [ T4696] ? srso_alias_return_thunk+0x5/0xfbef5 [ T4696] ___slab_alloc+0x68/0xde0 [ T4696] ? find_held_lock+0x2b/0x80 [ T4696] ? try_to_wake_up+0x47b/0xbb0 [ T4696] ? srso_alias_return_thunk+0x5/0xfbef5 [ T4696] ? bpf_map_kmalloc_node+0x72/0x220 [ T4696] ? srso_alias_return_thunk+0x5/0xfbef5 [ T4696] ? _raw_spin_unlock_irqrestore+0x40/0x80 [ T4696] ? srso_alias_return_thunk+0x5/0xfbef5 [ T4696] ? try_to_wake_up+0x47b/0xbb0 [ T4696] ? bpf_map_kmalloc_node+0x72/0x220 [ T4696] __kmalloc_node_noprof+0xee/0x490 [ T4696] bpf_map_kmalloc_node+0x72/0x220 [ T4696] __bpf_async_init+0x107/0x310 [ T4696] bpf_timer_init+0x33/0x40 [ T4696] bpf_prog_7e15f1bc7d1d26d0_start_cb+0x5d/0x91 [ T4696] bpf_prog_d85f43676fabf521_start_timer+0x65/0x8a [ T4696] bpf_prog_test_run_syscall+0x103/0x250 [ T4696] ? fput+0x3f/0x90 [ T4696] __sys_bpf+0xd33/0x26d0 [ T4696] __x64_sys_bpf+0x21/0x30 [ T4696] do_syscall_64+0x72/0xfa0 [ T4696] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ T4696] RIP: 0033:0x7f6e173fa779 [ T4696] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 4f 86 0d 00 f7 d8 64 89 01 48 [ T4696] RSP: 002b:00007f6e0f95a878 EFLAGS: 00000202 ORIG_RAX: 0000000000000141 [ T4696] RAX: ffffffffffffffda RBX: 00007f6e0f95bcdc RCX: 00007f6e173fa779 [ T4696] RDX: 0000000000000050 RSI: 00007f6e0f95a8b0 RDI: 000000000000000a [ T4696] RBP: 00007f6e0f95a890 R08: 0000000000000003 R09: 00007f6e0f95a8b0 [ T4696] R10: 00007ffcf6d23f80 R11: 0000000000000202 R12: 0000000000000020 [ T4696] R13: 000000000000005f R14: 00007ffcf6d23d70 R15: 00007f6e0f15b000 [ T4696] </TASK> Soon after that (1-2 min) while still running the bpf test_progs I got a kernel panic (flashing capslock and lockup) and needed to reboot. Bert Karwatzki