Re: [BUG] Deadlock triggered by bpfsnoop funcgraph feature

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Aug 28, 2025 at 09:43:00AM -0700, Alexei Starovoitov wrote:
> On Thu, Aug 28, 2025 at 6:39 AM Leon Hwang <leon.hwang@xxxxxxxxx> wrote:
> >
> > On Thu Aug 28, 2025 at 7:50 PM +08, Paul E. McKenney wrote:
> > > On Thu, Aug 28, 2025 at 10:40:47AM +0800, Leon Hwang wrote:
> > >> On 28/8/25 08:42, Alexei Starovoitov wrote:
> > >> > On Tue, Aug 26, 2025 at 7:58 PM Leon Hwang <leon.hwang@xxxxxxxxx> wrote:
> >
> > [...]
> >
> > >> >
> > >> > bpf infra is trying hard not to crash it, but debug kernel is a different
> > >> > category. rcu_read_lock_held() doesn't exist in production kernels.
> > >> > You can propose adding "notrace" for it, but in general that doesn't scale.
> > >> > Same with rcu_lockdep_current_cpu_online().
> > >> > It probably deserves "notrace" too.
> > >>
> > >> Indeed, it doesn't scale.
> > >>
> > >> When I run
> > >> ./bpfsnoop -k "htab_*_elem" --output-fgraph --fgraph-debug
> > >> --fgraph-exclude
> > >> 'rcu_read_lock_*held,rcu_lockdep_current_cpu_online,*raw_spin_*lock*,kvfree,show_stack,put_task_stack',
> > >> the kernel doesn’t panic, but the OS eventually stalls and becomes
> > >> unresponsive to key presses.
> > >>
> > >> It seems preferable to avoid running BPF programs continuously in such
> > >> cases.
> > >
> > > Agreed, when adding code to the Linux kernel, whether via a patch, via
> > > a BPF program, or by whatever other means, you are taking responsibility
> > > for the speed, scalability, and latency effects of that code.
> > >
> > > Nevertheless, I am happy to add a few "notrace" modifiers
> > > if needed.  Do you guys need them for rcu_read_lock_held() and
> > > rcu_lockdep_current_cpu_online()?
> > >
> >
> > I think it would be better to add "notrace" to following functions:
> >
> > ./bpfsnoop -k 'rcu_read_*lock_*held*,rcu_lockdep_*' --show-func-proto
> > bool rcu_lockdep_current_cpu_online(); [traceable]
> > int rcu_read_lock_any_held(); [traceable]
> > int rcu_read_lock_bh_held(); [traceable]
> > int rcu_read_lock_held(); [traceable]
> > int rcu_read_lock_sched_held(); [traceable]
> 
> Agree. Seems like an easy way to remove a footgun.

Very good, and please see below.  This might or might not make the next
merge window, but if not, it should be good for the one after that.

> Independently it would be good to make noinstr/notrace to include __cpuidle
> functions. I think right now it's allowed to attach to default_idle()
> which is causing issues.

Leon, would you be interested in putting together a patch for these?

							Thanx, Paul

------------------------------------------------------------------------

commit dada60c8851f19e54524cc1bcf8ab5938eb909c9
Author: Paul E. McKenney <paulmck@xxxxxxxxxx>
Date:   Thu Aug 28 10:17:10 2025 -0700

    rcu: Mark diagnostic functions as notrace
    
    The rcu_lockdep_current_cpu_online(), rcu_read_lock_sched_held(),
    rcu_read_lock_held(), rcu_read_lock_bh_held(), rcu_read_lock_any_held()
    are used by tracing-related code paths, so putting traces on them is
    unlikely to make anyone happy.  This commit therefore marks them all
    "notrace".
    
    Reported-by: Leon Hwang <leon.hwang@xxxxxxxxx>
    Reported-by: Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx>
    Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 1291e0761d70ab..2515ee9a82df4f 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4021,7 +4021,7 @@ bool rcu_cpu_online(int cpu)
  * RCU on an offline processor during initial boot, hence the check for
  * rcu_scheduler_fully_active.
  */
-bool rcu_lockdep_current_cpu_online(void)
+bool notrace rcu_lockdep_current_cpu_online(void)
 {
 	struct rcu_data *rdp;
 	bool ret = false;
diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index c912b594ba987f..dfeba9b3539508 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -117,7 +117,7 @@ static bool rcu_read_lock_held_common(bool *ret)
 	return false;
 }
 
-int rcu_read_lock_sched_held(void)
+int notrace rcu_read_lock_sched_held(void)
 {
 	bool ret;
 
@@ -342,7 +342,7 @@ EXPORT_SYMBOL_GPL(debug_lockdep_rcu_enabled);
  * Note that rcu_read_lock() is disallowed if the CPU is either idle or
  * offline from an RCU perspective, so check for those as well.
  */
-int rcu_read_lock_held(void)
+int notrace rcu_read_lock_held(void)
 {
 	bool ret;
 
@@ -367,7 +367,7 @@ EXPORT_SYMBOL_GPL(rcu_read_lock_held);
  * Note that rcu_read_lock_bh() is disallowed if the CPU is either idle or
  * offline from an RCU perspective, so check for those as well.
  */
-int rcu_read_lock_bh_held(void)
+int notrace rcu_read_lock_bh_held(void)
 {
 	bool ret;
 
@@ -377,7 +377,7 @@ int rcu_read_lock_bh_held(void)
 }
 EXPORT_SYMBOL_GPL(rcu_read_lock_bh_held);
 
-int rcu_read_lock_any_held(void)
+int notrace rcu_read_lock_any_held(void)
 {
 	bool ret;
 




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux