On 2025-06-16 17:10:33 [+0200], Marc Strämke wrote:
> Hi Sebastian,

Hi Marc,

> I am still trying to figure that puzzle out: Please see the following
> function trace snippet:
>
> ip-690530 [000] ..... 178636.460435: rtnl_is_locked <-__dev_change_flags
> ip-690530 [000] ..... 178636.460435: __local_bh_disable_ip <-__dev_change_flags
> ip-690530 [000] ..... 178636.460435: migrate_disable <-__local_bh_disable_ip
> ip-690530 [000] ...1. 178636.460435: preempt_disable: caller=__local_bh_disable_ip+0x76/0xe0 parent=__local_bh_disable_ip+0x76/0xe0
> ip-690530 [000] ...11 178636.460435: preempt_enable: caller=__local_bh_disable_ip+0x76/0xe0 parent=__local_bh_disable_ip+0x76/0xe0
> ip-690530 [000] ....1 178636.460435: rt_spin_lock <-__local_bh_disable_ip
> ip-690530 [000] ....1 178636.460436: __rcu_read_lock <-rt_spin_lock
> ip-690530 [000] ....1 178636.460436: migrate_disable <-__local_bh_disable_ip
> ip-690530 [000] ....2 178636.460436: __rcu_read_lock <-__local_bh_disable_ip
> ip-690530 [000] b...2 178636.460436: rt_spin_lock <-__dev_change_flags
> ip-690530 [000] b...2 178636.460436: __rcu_read_lock <-rt_spin_lock
> ip-690530 [000] b...2 178636.460436: migrate_disable <-__dev_change_flags
> ip-690530 [000] b...3 178636.460436: __dev_set_rx_mode <-__dev_change_flags
> ip-690530 [000] b...3 178636.460437: igb_set_rx_mode <-__dev_change_flags
>
> It is my understanding that __local_bh_disable_ip called from
> netif_addr_lock_bh seems to be the origin of my latency.

How so?

> What i do not understand is, even if the bottom halves are disabled.
> Shouldn't I see the interrupt arriving in the trace?

Yes.

> If i understood it correctly, your talk "Unblocking the softirq lock for
> PREEMPT_RT" during the plumbers conference 2023 is exactly about that case,
> right?

No.

> Probably fixing this issue is out of my abilities for now ;-) The wast
> variety of locking concepts inside the kernel is quite intimidating for a
> newcomer in kernel land...

If you had only the scheduler events enabled and saw that "ip-690530" is
doing something, then an interrupt fires, that interrupt wakes a thread, the
CPU switches to that thread, that thread does sched_pi_setprio() and
switches back to "ip-690530" until it is done, _then_ it would be what I
talked about at Plumbers in 2023. Your trace snippet above is short,
latency-wise: it covers just 2us.
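
The 2us figure is simply the difference between the first and the last
timestamp in the snippet. A minimal Python sketch of that arithmetic,
assuming the default ftrace text format with a "seconds.microseconds:"
field; the helper name span_us is made up for illustration and is not part
of any tracing tool:

```python
import re

# Assumption: each trace line carries a "<seconds>.<6-digit microseconds>:"
# timestamp, as in the default ftrace text output quoted above.
TIMESTAMP = re.compile(r"\s(\d+)\.(\d{6}):")


def span_us(lines):
    """Return the time covered by the given trace lines, in microseconds."""
    stamps = []
    for line in lines:
        match = TIMESTAMP.search(line)
        if match:
            seconds, micros = match.groups()
            # Integer arithmetic avoids floating-point rounding.
            stamps.append(int(seconds) * 1_000_000 + int(micros))
    return max(stamps) - min(stamps) if stamps else 0


# First and last line of the quoted snippet.
snippet = [
    "ip-690530 [000] ..... 178636.460435: rtnl_is_locked <-__dev_change_flags",
    "ip-690530 [000] b...3 178636.460437: igb_set_rx_mode <-__dev_change_flags",
]

print(span_us(snippet))  # prints 2
```

Running the same helper over a longer capture gives a quick idea of whether
the window is long enough to contain the latency in question at all.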