On 15.04.25 10:00, Sebastian Andrzej Siewior wrote:
> On 2025-04-15 08:54:01 [+0200], Jan Kiszka wrote:
>> On 15.04.25 08:23, Sebastian Andrzej Siewior wrote:
>>> On 2025-04-15 07:35:50 [+0200], Jan Kiszka wrote:
>>>>> On RT the read_lock() in the timer block, the write blocks, too. So
>>>>> every blocker on the lock is scheduled out until the reader is gone. On
>>>>> top of that, the reader gets RCU boosted with FIFO-1 by default to get
>>>>> out.
>>>>
>>>> There is no boosting of the active readers on RT as there is no
>>>> information recorded about who is currently holding a read lock. This is
>>>> the whole point why rwlocks are hairy with RT, I thought.
>>>
>>> Kind of, yes. PREEMPT_RT has by default RCU boosting enabled with
>>> SCHED_FIFO 1. If you acquire a readlock you start a RCU section. If you
>>> get stuck in a RCU section for too long then this boosting will take
>>> effect by making the task, within the RCU section, the owner of the
>>> boost-lock and the boosting task will try to acquire it. This is used to
>>> get SCHED_OTHER tasks out of the RCU section.
>>> But if a SCHED_FIFO task is on the CPU then this boosting will have
>>> no effect because the scheduler will not switch to a task with lower
>>> priority.
>>
>> Does that boosting happen to need ktimersd or ksoftirqd (which both are
>> stalling in our case)? I'm still looking for the reason why it does not
>> help in the observed stall scenarios.
>
> Your problem is that you likely have many readers which need to get out
> first. That spinlock replacement will help. I'm not sure about the CFS
> patch referenced in the thread here.

Nope, we only have two readers, one which is scheduled out by CFS and
another one - in soft IRQ context - that is getting stuck after the
writer promoted the held lock to a write lock.

>
> That boosting requires a RCU reader that starts the mechanism (on rcu
> unlock). But I don't think that it will help. You would also need to
> raise the priority above to the writer level (manually) and that will
> likely break other things. It is meant to unstuck SCHED_OTHER tasks and
> not boost stuck readers as a side effect. Also I am not sure how that
> works with multiple tasks.

Ok, that is likely why we don't see that coming in for helping us out.

Jan

-- 
Siemens AG, Foundational Technologies
Linux Expert Center