On 8/13/25 5:46 PM, Jens Axboe wrote:
> On 8/13/25 5:20 AM, Nilay Shroff wrote:
>> Hi Jens,
>>
>> On 8/6/25 7:14 AM, Yu Kuai wrote:
>>> Hi,
>>>
>>> On 2025/08/06 9:28, Jens Axboe wrote:
>>>> On 8/4/25 10:58 PM, Nilay Shroff wrote:
>>>>>
>>>>>
>>>>> On 8/4/25 7:12 PM, Ming Lei wrote:
>>>>>> On Mon, Aug 04, 2025 at 05:51:09PM +0530, Nilay Shroff wrote:
>>>>>>> This patchset replaces the use of a static key in the I/O path
>>>>>>> (rq_qos_xxx()) with an atomic queue flag (QUEUE_FLAG_QOS_ENABLED).
>>>>>>> This change eliminates a potential deadlock introduced by the use
>>>>>>> of static keys in the blk-rq-qos infrastructure, as reported by
>>>>>>> lockdep during blktests block/005 [1].
>>>>>>>
>>>>>>> The original static key approach was introduced to avoid unnecessary
>>>>>>> dereferencing of q->rq_qos when no blk-rq-qos module (e.g., blk-wbt
>>>>>>> or blk-iolatency) is configured. While efficient, enabling a static
>>>>>>> key at runtime requires taking cpu_hotplug_lock and jump_label_mutex,
>>>>>>> which becomes problematic if the queue is already frozen, causing a
>>>>>>> reverse dependency on ->freeze_lock. This results in a lockdep splat
>>>>>>> indicating a potential deadlock.
>>>>>>>
>>>>>>> To resolve this, we now gate q->rq_qos access with a q->queue_flags
>>>>>>> bitop (QUEUE_FLAG_QOS_ENABLED), avoiding the static key and the
>>>>>>> associated locking altogether.
>>>>>>>
>>>>>>> I compared the static key and atomic bitop implementations using the
>>>>>>> ftrace function graph tracer over ~50 invocations of rq_qos_issue(),
>>>>>>> while ensuring blk-wbt/blk-iolatency were disabled (i.e., no QoS
>>>>>>> functionality). For easy comparison, I made rq_qos_issue() noinline.
>>>>>>> The comparison was made on a PowerPC machine.
>>>>>>>
>>>>>>> Static key (disabled: QoS is not configured):
>>>>>>>   5d0: 00 00 00 60  nop   # patched in by static key framework (not taken)
>>>>>>>   5d4: 20 00 80 4e  blr   # return (branch to link register)
>>>>>>>
>>>>>>> Only a nop and a blr (branch to link register) are executed; very
>>>>>>> lightweight.
>>>>>>>
>>>>>>> Atomic bitop (QoS is not configured):
>>>>>>>   5d0: 20 00 23 e9  ld r9,32(r3)       # load q->queue_flags
>>>>>>>   5d4: 00 80 29 71  andi. r9,r9,32768  # check QUEUE_FLAG_QOS_ENABLED (bit 15)
>>>>>>>   5d8: 20 00 82 4d  beqlr              # return if bit not set
>>>>>>>
>>>>>>> This performs an ld and an andi. before returning. Slightly more
>>>>>>> work, but q->queue_flags is typically hot in cache during I/O
>>>>>>> submission.
>>>>>>>
>>>>>>> With static key (disabled):
>>>>>>>   Duration (us): min=0.668 max=0.816 avg≈0.750
>>>>>>>
>>>>>>> With atomic bitop QUEUE_FLAG_QOS_ENABLED (bit not set):
>>>>>>>   Duration (us): min=0.684 max=0.834 avg≈0.759
>>>>>>>
>>>>>>> As expected, both versions are nearly identical in cost. The added
>>>>>>> latency from the extra ld and andi. is in the range of ~9ns.
>>>>>>>
>>>>>>> There are two patches in the series. The first patch replaces the
>>>>>>> static key with QUEUE_FLAG_QOS_ENABLED. The second patch ensures
>>>>>>> that we clear QUEUE_FLAG_QOS_ENABLED when the queue no longer has
>>>>>>> any associated rq_qos policies.
>>>>>>>
>>>>>>> As usual, feedback and review comments are welcome!
>>>>>>>
>>>>>>> [1] https://lore.kernel.org/linux-block/4fdm37so3o4xricdgfosgmohn63aa7wj3ua4e5vpihoamwg3ui@fq42f5q5t5ic/
>>>>>>
>>>>>>
>>>>>> Another approach is to call memalloc_noio_save() in cpu hotplug code...
>>>>>>
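For context, the pattern Ming is pointing at is the memalloc_noio scope API;
a minimal, illustrative sketch is below (the hotplug callback and the
allocation helper are hypothetical placeholders, not existing functions):

    #include <linux/sched/mm.h>

    static int example_cpuhp_online(unsigned int cpu)
    {
            unsigned int noio_flags;
            int ret;

            /*
             * Every allocation inside this scope implicitly behaves as
             * GFP_NOIO, so it cannot recurse into the block layer.
             */
            noio_flags = memalloc_noio_save();
            ret = example_allocate_resources(cpu);  /* hypothetical helper */
            memalloc_noio_restore(noio_flags);

            return ret;
    }
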
>>>>> Yes, that would help fix this. However, per the general usage of the
>>>>> GFP_NOIO scope in the kernel, it is used when we're performing memory
>>>>> allocations in a context where I/O must not be initiated, because
>>>>> doing so could cause deadlocks or recursion.
>>>>>
>>>>> So we typically use GFP_NOIO in a code path that is already doing I/O,
>>>>> such as:
>>>>> - In block layer context: during request submission
>>>>> - Filesystem writeback or swap-out
>>>>> - Memory reclaim or writeback triggered by memory pressure
>>>>>
>>>>> The cpu hotplug code may not be running in any of the above contexts.
>>>>> So IMO, adding memalloc_noio_save() in the cpu hotplug code would not
>>>>> be a good idea, would it?
>>>>
>>>> Please heed Ming's advice; moving this from a static key to an atomic
>>>> queue flag op is pointless, may as well kill it at that point.
>>>
>>> Nilay already tested and replied that this is a dead end :(
>>>
>>> I don't quite understand why it's pointless; if rq_qos is never enabled,
>>> an atomic queue_flag is still a minor optimization, isn't it?
>>>
>>>>
>>>> I see v2 is out now with the exact same approach.
>>>>
>> As mentioned earlier, I tried Ming's original recommendation, but it
>> didn't resolve the issue. In a separate thread, Ming agreed that using
>> an atomic queue flag is a reasonable approach and would avoid the
>> lockdep problem while still keeping a minor fast-path optimization.
>>
>> That leaves us with two options:
>> - Use an atomic queue flag, or
>> - Remove the static key entirely.
>>
>> So before I send v3, do you prefer the atomic queue flag approach, or
>> would you rather see the static key removed altogether? My preference is
>> for the atomic queue flag, as it maintains a lightweight check without
>> the static key's locking concerns.
>
> Atomic test is still going to be better than pointless calls into
> rq-qos, so that's still a win. Hence retaining it is better than simply
> killing it off entirely.
>
> I wonder if it makes sense to combine with IS_ENABLED() as well. Though
> with how distros enable everything under the sun, probably not going to
> be that useful.
>
Yes, agreed: in my RHEL distro, CONFIG_BLK_WBT, CONFIG_BLK_CGROUP_IOCOST, and
CONFIG_BLK_CGROUP_IOLATENCY are all enabled by default, so IS_ENABLED() may
not be that helpful.

I'll send out v3 now with some minor changes (per the review comments), using
the atomic queue flag. For reference, a rough sketch of how the gated fast
path will look is appended below.

Thanks,
--Nilay
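The sketch is illustrative only; exact helper names and checks may differ
from the actual patches:

    static inline void rq_qos_issue(struct request *rq)
    {
            struct request_queue *q = rq->q;

            /*
             * The queue-flag test replaces the static key: when no QoS
             * policy is attached, this is a single q->queue_flags load
             * plus a conditional return, matching the PowerPC
             * disassembly quoted above.
             */
            if (test_bit(QUEUE_FLAG_QOS_ENABLED, &q->queue_flags) && q->rq_qos)
                    __rq_qos_issue(q->rq_qos, rq);
    }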