Re: [BUG] Deadlock triggered by bpfsnoop funcgraph feature

"Leon Hwang" <leon.hwang@xxxxxxxxx> · Thu, 28 Aug 2025 21:39:29 +0800

On Thu Aug 28, 2025 at 7:50 PM +08, Paul E. McKenney wrote:
> On Thu, Aug 28, 2025 at 10:40:47AM +0800, Leon Hwang wrote:
>> On 28/8/25 08:42, Alexei Starovoitov wrote:
>> > On Tue, Aug 26, 2025 at 7:58 PM Leon Hwang <leon.hwang@xxxxxxxxx> wrote:

[...]

>> >
>> > bpf infra is trying hard not to crash it, but debug kernel is a different
>> > category. rcu_read_lock_held() doesn't exist in production kernels.
>> > You can propose adding "notrace" for it, but in general that doesn't scale.
>> > Same with rcu_lockdep_current_cpu_online().
>> > It probably deserves "notrace" too.
>>
>> Indeed, it doesn't scale.
>>
>> When I run
>> ./bpfsnoop -k "htab_*_elem" --output-fgraph --fgraph-debug
>> --fgraph-exclude
>> 'rcu_read_lock_*held,rcu_lockdep_current_cpu_online,*raw_spin_*lock*,kvfree,show_stack,put_task_stack',
>> the kernel doesn’t panic, but the OS eventually stalls and becomes
>> unresponsive to key presses.
>>
>> It seems preferable to avoid running BPF programs continuously in such
>> cases.
>
> Agreed, when adding code to the Linux kernel, whether via a patch, via
> a BPF program, or by whatever other means, you are taking responsibility
> for the speed, scalability, and latency effects of that code.
>
> Nevertheless, I am happy to add a few "notrace" modifiers
> if needed.  Do you guys need them for rcu_read_lock_held() and
> rcu_lockdep_current_cpu_online()?
>

I think it would be better to add "notrace" to the following functions:

./bpfsnoop -k 'rcu_read_*lock_*held*,rcu_lockdep_*' --show-func-proto
bool rcu_lockdep_current_cpu_online(); [traceable]
int rcu_read_lock_any_held(); [traceable]
int rcu_read_lock_bh_held(); [traceable]
int rcu_read_lock_held(); [traceable]
int rcu_read_lock_sched_held(); [traceable]
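
For illustration only, here is a minimal userspace sketch of what the
annotation could look like on those five prototypes. It assumes the
fallback definition of "notrace" as the compiler's no_instrument_function
attribute (the kernel picks its definition depending on configuration),
and the function bodies are hypothetical placeholders, not the real RCU
implementations.

```c
#include <stdbool.h>

/* Assumption: mirrors one possible kernel definition of notrace. */
#ifndef notrace
#define notrace __attribute__((__no_instrument_function__))
#endif

/*
 * Hypothetical stand-ins using the prototypes from the listing above.
 * The bodies are placeholders; only the notrace placement matters here.
 */
notrace bool rcu_lockdep_current_cpu_online(void)
{
	return true;
}

notrace int rcu_read_lock_any_held(void)
{
	return 1;
}

notrace int rcu_read_lock_bh_held(void)
{
	return 1;
}

notrace int rcu_read_lock_held(void)
{
	return 1;
}

notrace int rcu_read_lock_sched_held(void)
{
	return 1;
}
```

With such an annotation, the functions would no longer get a tracing hook
at entry, so bpfsnoop should stop reporting them as [traceable].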

Thanks,
Leon