On Thu Aug 28, 2025 at 7:50 PM +08, Paul E. McKenney wrote:
> On Thu, Aug 28, 2025 at 10:40:47AM +0800, Leon Hwang wrote:
>> On 28/8/25 08:42, Alexei Starovoitov wrote:
>> > On Tue, Aug 26, 2025 at 7:58 PM Leon Hwang <leon.hwang@xxxxxxxxx> wrote:

[...]

>> >
>> > bpf infra is trying hard not to crash it, but debug kernel is a different
>> > category. rcu_read_lock_held() doesn't exist in production kernels.
>> > You can propose adding "notrace" for it, but in general that doesn't scale.
>> > Same with rcu_lockdep_current_cpu_online().
>> > It probably deserves "notrace" too.
>>
>> Indeed, it doesn't scale.
>>
>> When I run
>> ./bpfsnoop -k "htab_*_elem" --output-fgraph --fgraph-debug
>> --fgraph-exclude
>> 'rcu_read_lock_*held,rcu_lockdep_current_cpu_online,*raw_spin_*lock*,kvfree,show_stack,put_task_stack',
>> the kernel doesn’t panic, but the OS eventually stalls and becomes
>> unresponsive to key presses.
>>
>> It seems preferable to avoid running BPF programs continuously in such
>> cases.
>
> Agreed, when adding code to the Linux kernel, whether via a patch, via
> a BPF program, or by whatever other means, you are taking responsibility
> for the speed, scalability, and latency effects of that code.
>
> Nevertheless, I am happy to add a few "notrace" modifiers
> if needed. Do you guys need them for rcu_read_lock_held() and
> rcu_lockdep_current_cpu_online()?
>

I think it would be better to add "notrace" to the following functions:

./bpfsnoop -k 'rcu_read_*lock_*held*,rcu_lockdep_*' --show-func-proto
bool rcu_lockdep_current_cpu_online(); [traceable]
int rcu_read_lock_any_held(); [traceable]
int rcu_read_lock_bh_held(); [traceable]
int rcu_read_lock_held(); [traceable]
int rcu_read_lock_sched_held(); [traceable]

Thanks,
Leon