On Thu, Aug 14, 2025 at 01:54:59PM +0530, Nilay Shroff wrote:
> A recent lockdep[1] splat observed while running blktest block/005
> reveals a potential deadlock caused by the cpu_hotplug_lock dependency
> on ->freeze_lock. This dependency was introduced by commit 033b667a823e
> ("block: blk-rq-qos: guard rq-qos helpers by static key").
>
> That change added a static key to avoid fetching q->rq_qos when
> neither blk-wbt nor blk-iolatency is configured. The static key
> dynamically patches kernel text to a NOP when disabled, eliminating
> overhead of fetching q->rq_qos in the I/O hot path. However, enabling
> a static key at runtime requires acquiring both cpu_hotplug_lock and
> jump_label_mutex. When this happens after the queue has already been
> frozen (i.e., while holding ->freeze_lock), it creates a locking
> dependency from cpu_hotplug_lock to ->freeze_lock, which leads to a
> potential deadlock reported by lockdep [1].
>
> To resolve this, replace the static key mechanism with q->queue_flags:
> QUEUE_FLAG_QOS_ENABLED. This flag is evaluated in the fast path before
> accessing q->rq_qos. If the flag is set, we proceed to fetch q->rq_qos;
> otherwise, the access is skipped.
>
> Since q->queue_flags is commonly accessed in IO hotpath and resides in
> the first cacheline of struct request_queue, checking it imposes minimal
> overhead while eliminating the deadlock risk.
>
> This change avoids the lockdep splat without introducing performance
> regressions.
>
> [1] https://lore.kernel.org/linux-block/4fdm37so3o4xricdgfosgmohn63aa7wj3ua4e5vpihoamwg3ui@fq42f5q5t5ic/
>
> Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@xxxxxxx>
> Closes: https://lore.kernel.org/linux-block/4fdm37so3o4xricdgfosgmohn63aa7wj3ua4e5vpihoamwg3ui@fq42f5q5t5ic/
> Fixes: 033b667a823e ("block: blk-rq-qos: guard rq-qos helpers by static key")
> Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@xxxxxxx>
> Signed-off-by: Nilay Shroff <nilay@xxxxxxxxxxxxx>

It is hard to use a static branch correctly in the current case from a
locking viewpoint, and most distributions should enable at least one
rqos, so the static branch won't optimize the typical case:

Reviewed-by: Ming Lei <ming.lei@xxxxxxxxxx>

Thanks,
Ming
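
For readers following the thread, the flag-before-pointer check described
in the quoted commit message can be modelled roughly as below. This is a
user-space sketch, not the patch itself: the struct and function names
(request_queue_model, rq_qos_model, model_*) and the bit position are
hypothetical stand-ins, and the plain bit operations stand in for whatever
atomic flag helpers the kernel uses. Memory ordering and queue freezing
are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a QoS policy; not the kernel's struct rq_qos. */
struct rq_qos_model {
	int throttle_calls;	/* counts how often the policy hook ran */
};

/* Hypothetical stand-in for struct request_queue. */
struct request_queue_model {
	unsigned long queue_flags;	/* hot field, tested first */
	struct rq_qos_model *rq_qos;	/* loaded only when the flag is set */
};

/* Bit position chosen for illustration only. */
#define MODEL_FLAG_QOS_ENABLED	0

static bool model_qos_enabled(const struct request_queue_model *q)
{
	assert(q != NULL);
	return (q->queue_flags & (1UL << MODEL_FLAG_QOS_ENABLED)) != 0;
}

/* Fast path: skip the q->rq_qos load entirely unless the flag is set. */
static bool model_rq_qos_throttle(struct request_queue_model *q)
{
	if (!model_qos_enabled(q))
		return false;
	if (q->rq_qos == NULL)
		return false;
	q->rq_qos->throttle_calls++;
	return true;
}

/*
 * Slow path: attach a policy, then set the flag. Setting a bit needs
 * neither text patching nor cpu_hotplug_lock, so doing it on a frozen
 * queue adds no new lock dependency in this model.
 */
static void model_rq_qos_add(struct request_queue_model *q,
			     struct rq_qos_model *rqos)
{
	assert(q != NULL);
	q->rq_qos = rqos;
	q->queue_flags |= 1UL << MODEL_FLAG_QOS_ENABLED;
}
```

With no policy attached, model_rq_qos_throttle() returns after a single
flag test; once model_rq_qos_add() has run, the same call reaches the
policy. The cost in the disabled case is one load and test of a field
that the commit message says is already hot, instead of a patched NOP.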