On 8/13/25 5:46 PM, Jens Axboe wrote:
> On 8/13/25 5:20 AM, Nilay Shroff wrote:
>> Hi Jens,
>>
>> On 8/6/25 7:14 AM, Yu Kuai wrote:
>>> Hi,
>>>
>>> On 2025/08/06 9:28, Jens Axboe wrote:
>>>> On 8/4/25 10:58 PM, Nilay Shroff wrote:
>>>>>
>>>>>
>>>>> On 8/4/25 7:12 PM, Ming Lei wrote:
>>>>>> On Mon, Aug 04, 2025 at 05:51:09PM +0530, Nilay Shroff wrote:
>>>>>>> This patchset replaces the use of a static key in the I/O path
>>>>>>> (rq_qos_xxx()) with an atomic queue flag (QUEUE_FLAG_QOS_ENABLED).
>>>>>>> This change eliminates a potential deadlock introduced by the use
>>>>>>> of static keys in the blk-rq-qos infrastructure, as reported by
>>>>>>> lockdep during blktests block/005 [1].
>>>>>>>
>>>>>>> The original static key approach was introduced to avoid unnecessary
>>>>>>> dereferencing of q->rq_qos when no blk-rq-qos module (e.g., blk-wbt
>>>>>>> or blk-iolatency) is configured. While efficient, enabling a static
>>>>>>> key at runtime requires taking cpu_hotplug_lock and jump_label_mutex,
>>>>>>> which becomes problematic if the queue is already frozen, causing a
>>>>>>> reverse dependency on ->freeze_lock. This results in a lockdep splat
>>>>>>> indicating a potential deadlock.
>>>>>>>
>>>>>>> To resolve this, we now gate q->rq_qos access with a q->queue_flags
>>>>>>> bitop (QUEUE_FLAG_QOS_ENABLED), avoiding the static key and the
>>>>>>> associated locking altogether.
>>>>>>>
>>>>>>> I compared the static key and atomic bitop implementations using the
>>>>>>> ftrace function graph tracer over ~50 invocations of rq_qos_issue(),
>>>>>>> while ensuring blk-wbt/blk-iolatency were disabled (i.e., no QoS
>>>>>>> functionality). For easy comparison, I made rq_qos_issue() noinline.
>>>>>>> The comparison was made on a PowerPC machine.
>>>>>>>
>>>>>>> Static key (disabled: QoS is not configured):
>>>>>>>   5d0: 00 00 00 60  nop   # patched in by static key framework (not taken)
>>>>>>>   5d4: 20 00 80 4e  blr   # return (branch to link register)
>>>>>>>
>>>>>>> Only a nop and a blr (branch to link register) are executed; very
>>>>>>> lightweight.
>>>>>>>
>>>>>>> Atomic bitop (QoS is not configured):
>>>>>>>   5d0: 20 00 23 e9  ld r9,32(r3)       # load q->queue_flags
>>>>>>>   5d4: 00 80 29 71  andi. r9,r9,32768  # check QUEUE_FLAG_QOS_ENABLED (bit 15)
>>>>>>>   5d8: 20 00 82 4d  beqlr              # return if bit not set
>>>>>>>
>>>>>>> This performs an ld and an andi. before returning. Slightly more
>>>>>>> work, but q->queue_flags is typically hot in cache during I/O
>>>>>>> submission.
>>>>>>>
>>>>>>> With static key (disabled):
>>>>>>>   Duration (us): min=0.668 max=0.816 avg≈0.750
>>>>>>>
>>>>>>> With atomic bitop QUEUE_FLAG_QOS_ENABLED (bit not set):
>>>>>>>   Duration (us): min=0.684 max=0.834 avg≈0.759
>>>>>>>
>>>>>>> As expected, both versions are nearly identical in cost. The added
>>>>>>> latency from the extra ld and andi. is in the range of ~9ns.
>>>>>>>
>>>>>>> There are two patches in the series. The first patch replaces the
>>>>>>> static key with QUEUE_FLAG_QOS_ENABLED. The second patch ensures
>>>>>>> that we clear QUEUE_FLAG_QOS_ENABLED when the queue no longer has
>>>>>>> any associated rq_qos policies.
>>>>>>>
>>>>>>> As usual, feedback and review comments are welcome!
>>>>>>>
>>>>>>> [1] https://lore.kernel.org/linux-block/4fdm37so3o4xricdgfosgmohn63aa7wj3ua4e5vpihoamwg3ui@fq42f5q5t5ic/
>>>>>>
>>>>>>
>>>>>> Another approach is to call memalloc_noio_save() in cpu hotplug code...
>>>>>>
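For context, the pattern Ming is pointing at is the memalloc_noio scope API;
a minimal, illustrative sketch is below (the hotplug callback and the
allocation helper are hypothetical placeholders, not existing functions):

    #include <linux/sched/mm.h>

    static int example_cpuhp_online(unsigned int cpu)
    {
            unsigned int noio_flags;
            int ret;

            /*
             * Every allocation inside this scope implicitly behaves as
             * GFP_NOIO, so it cannot recurse into the block layer.
             */
            noio_flags = memalloc_noio_save();
            ret = example_allocate_resources(cpu);  /* hypothetical helper */
            memalloc_noio_restore(noio_flags);

            return ret;
    }
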
>>>>> Yes, that would help fix this. However, per the general usage of the
>>>>> GFP_NOIO scope in the kernel, it is used when we're performing memory
>>>>> allocations in a context where I/O must not be initiated, because
>>>>> doing so could cause deadlocks or recursion.
>>>>>
>>>>> So we typically use GFP_NOIO in a code path that is already doing I/O,
>>>>> such as:
>>>>> - In block layer context: during request submission
>>>>> - Filesystem writeback or swap-out
>>>>> - Memory reclaim or writeback triggered by memory pressure
>>>>>
>>>>> The cpu hotplug code may not be running in any of the above contexts.
>>>>> So IMO, adding memalloc_noio_save() in the cpu hotplug code would not
>>>>> be a good idea, would it?
>>>>
>>>> Please heed Ming's advice; moving this from a static key to an atomic
>>>> queue flag op is pointless, may as well kill it at that point.
>>>
>>> Nilay already tested and replied that this is a dead end :(
>>>
>>> I don't quite understand why it's pointless; if rq_qos is never enabled,
>>> an atomic queue_flag is still a minor optimization, isn't it?
>>>
>>>>
>>>> I see v2 is out now with the exact same approach.
>>>>
>> As mentioned earlier, I tried Ming's original recommendation, but it
>> didn't resolve the issue. In a separate thread, Ming agreed that using
>> an atomic queue flag is a reasonable approach and would avoid the
>> lockdep problem while still keeping a minor fast-path optimization.
>>
>> That leaves us with two options:
>> - Use an atomic queue flag, or
>> - Remove the static key entirely.
>>
>> So before I send v3, do you prefer the atomic queue flag approach, or
>> would you rather see the static key removed altogether? My preference is
>> for the atomic queue flag, as it maintains a lightweight check without
>> the static key's locking concerns.
>
> Atomic test is still going to be better than pointless calls into
> rq-qos, so that's still a win. Hence retaining it is better than simply
> killing it off entirely.
>
> I wonder if it makes sense to combine with IS_ENABLED() as well. Though
> with how distros enable everything under the sun, probably not going to
> be that useful.
>
Yes, agreed: in my RHEL distro, CONFIG_BLK_WBT, CONFIG_BLK_CGROUP_IOCOST, and
CONFIG_BLK_CGROUP_IOLATENCY are all enabled by default, so IS_ENABLED() may
not be that helpful.

I'll send out v3 now with some minor changes (per the review comments), using
the atomic queue flag. For reference, a rough sketch of how the gated fast
path will look is appended below.

Thanks,
--Nilay
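The sketch is illustrative only; exact helper names and checks may differ
from the actual patches:

    static inline void rq_qos_issue(struct request *rq)
    {
            struct request_queue *q = rq->q;

            /*
             * The queue-flag test replaces the static key: when no QoS
             * policy is attached, this is a single q->queue_flags load
             * plus a conditional return, matching the PowerPC
             * disassembly quoted above.
             */
            if (test_bit(QUEUE_FLAG_QOS_ENABLED, &q->queue_flags) && q->rq_qos)
                    __rq_qos_issue(q->rq_qos, rq);
    }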