On Wed Aug 6, 2025 at 8:44 AM CEST, Maurizio Lombardi wrote:
> On Wed Aug 6, 2025 at 8:22 AM CEST, Maurizio Lombardi wrote:
>>
>> Oops, sorry, they are two read locks; the real problem then is that
>> something is holding the write lock.
>
> Ok, I think I get what happens now.
>
> The threads that call nvmet_tcp_data_ready() (which takes the read
> lock twice) and nvmet_tcp_release_queue_work() (which tries to take
> the write lock) are blocking each other.
> So I still think that deferring the call to queue->data_ready() by
> using a workqueue should fix it.
>

I reproduced the issue by creating a reader thread that tries to take
the lock twice and a writer thread that takes the write lock between
the two calls to read_lock() (a minimal sketch of such a reproducer
module is appended at the end of this mail):

[ 33.398311] [Reader] Thread started.
[ 33.398410] [Writer] Thread started, waiting for reader to get lock...
[ 33.398577] [Reader] Acquired read_lock successfully.
[ 33.399391] [Reader] Sleeping for a while to allow writer to block...
[ 33.418697] [Writer] Reader has the lock. Attempting to acquire write_lock... THIS SHOULD BLOCK.
[ 41.288105] [Reader] Attempting to acquire a second read_lock... THIS SHOULD BLOCK.
[ 93.388349] rcu: INFO: rcu_preempt self-detected stall on CPU
[ 93.388758] rcu: 7-....: (5999 ticks this GP) idle=9db4/1/0x4000000000000000 softirq=1846/1846 fqs=2444
[ 93.389390] rcu: (t=6001 jiffies g=1917 q=4319 ncpus=8)
[ 93.389745] CPU: 7 UID: 0 PID: 1784 Comm: reader_thread Kdump: loaded Tainted: G OEL ------- --- 6.12.0-116.el10.aarch64 #1 PREEMPT(voluntary)
[ 93.389749] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE, [L]=SOFTLOCKUP
[ 93.389749] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[ 93.389750] pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 93.389752] pc : queued_spin_lock_slowpath+0x78/0x460
[ 93.389757] lr : queued_read_lock_slowpath+0x21c/0x228
[ 93.389759] sp : ffff80008bd6bdd0
[ 93.389760] x29: ffff80008bd6bdd0 x28: 0000000000000000 x27: 0000000000000000
[ 93.389762] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
[ 93.389764] x23: ffffb1c374605008 x22: ffff0000ca9342c0 x21: ffff80008bafb960
[ 93.389766] x20: ffff0000c4735e40 x19: ffffb1c37460701c x18: 0000000000000006
[ 93.389767] x17: 444c554f48532053 x16: ffffb1c3ee73ab48 x15: 636f6c5f64616572
[ 93.389769] x14: 20646e6f63657320 x13: 2e4b434f4c422044 x12: ffffb1c3eff5ec10
[ 93.389771] x11: ffffb1c3efc9ec68 x10: ffffb1c3eff5ec68 x9 : ffffb1c3ee73b4c4
[ 93.389772] x8 : 0000000000000001 x7 : 00000000000bffe8 x6 : c0000000ffff7fff
[ 93.389774] x5 : ffff00112ebe05c8 x4 : 0000000000000000 x3 : 0000000000000000
[ 93.389776] x2 : 0000000000000001 x1 : 0000000000000000 x0 : 0000000000000001
[ 93.389778] Call trace:
[ 93.389779]  queued_spin_lock_slowpath+0x78/0x460 (P)
[ 93.389782]  queued_read_lock_slowpath+0x21c/0x228
[ 93.389785]  _raw_read_lock+0x60/0x80
[ 93.389787]  reader_thread_fn+0x7c/0xc0 [dead]
[ 93.389791]  kthread+0x110/0x130
[ 93.389794]  ret_from_fork+0x10/0x20

So apparently, in case of contention, writers take precedence: once a
writer is waiting for the lock, new readers are blocked, which is why
the second read_lock() in the same thread never succeeds.

Note that the same problem may also affect nvmet_tcp_write_space().

Maurizio
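
For reference, here is a minimal sketch of the reproducer module
described above. It only illustrates the locking pattern: the module,
thread and symbol names (rwlock_repro, reader_thread_fn,
writer_thread_fn, test_lock) are made up for this sketch and are not
taken from the actual out-of-tree module in the trace.

// SPDX-License-Identifier: GPL-2.0
/*
 * Sketch: a reader takes an rwlock read lock twice, with a writer
 * queueing on the write lock in between the two read_lock() calls.
 */
#include <linux/module.h>
#include <linux/kthread.h>
#include <linux/delay.h>
#include <linux/spinlock.h>

static DEFINE_RWLOCK(test_lock);
static struct task_struct *reader_task;
static struct task_struct *writer_task;

static int reader_thread_fn(void *data)
{
	int i;

	pr_info("[Reader] Thread started.\n");

	read_lock(&test_lock);
	pr_info("[Reader] Acquired read_lock successfully.\n");

	/*
	 * Busy-wait while holding the read lock (we must not sleep here),
	 * giving the writer time to queue up on the write lock.
	 */
	pr_info("[Reader] Sleeping for a while to allow writer to block...\n");
	for (i = 0; i < 80; i++)
		mdelay(100);

	/*
	 * With a writer already waiting, new readers are blocked, so this
	 * second (recursive) read_lock() never succeeds and the thread
	 * soft-lockups, as in the trace above.
	 */
	pr_info("[Reader] Attempting to acquire a second read_lock... THIS SHOULD BLOCK.\n");
	read_lock(&test_lock);

	read_unlock(&test_lock);
	read_unlock(&test_lock);
	return 0;
}

static int writer_thread_fn(void *data)
{
	pr_info("[Writer] Thread started, waiting for reader to get lock...\n");
	msleep(20);

	pr_info("[Writer] Reader has the lock. Attempting to acquire write_lock... THIS SHOULD BLOCK.\n");
	write_lock(&test_lock);
	write_unlock(&test_lock);
	return 0;
}

static int __init rwlock_repro_init(void)
{
	reader_task = kthread_run(reader_thread_fn, NULL, "reader_thread");
	if (IS_ERR(reader_task))
		return PTR_ERR(reader_task);

	writer_task = kthread_run(writer_thread_fn, NULL, "writer_thread");
	if (IS_ERR(writer_task)) {
		/* Reader is still busy-waiting here, so stopping it is safe. */
		kthread_stop(reader_task);
		return PTR_ERR(writer_task);
	}

	return 0;
}

static void __exit rwlock_repro_exit(void)
{
	/* Nothing to clean up: once the deadlock hits, both threads are stuck. */
}

module_init(rwlock_repro_init);
module_exit(rwlock_repro_exit);
MODULE_DESCRIPTION("rwlock recursive read vs. waiting writer reproducer sketch");
MODULE_LICENSE("GPL");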