On 4/22/25 4:23 PM, Ming Lei wrote: > On Sat, Apr 19, 2025 at 07:25:31PM +0530, Nilay Shroff wrote: >> >> >> On 4/18/25 10:06 PM, Ming Lei wrote: >>> Move elv_register[unregister]_queue out of ->elevator_lock & queue freezing, >>> so we can kill many lockdep warnings. >>> >>> elv_register[unregister]_queue() is serialized, and just dealing with sysfs/ >>> debugfs things, no need to be done with queue frozen. >>> >>> With this change, elevator's ->exit() is called before calling >>> elv_unregister_queue, then user may call into ->show()/store() of elevator's >>> sysfs attributes, and we have covered this issue by adding `ELEVATOR_FLAG_DYNG`. >>> >>> For blk-mq debugfs, hctx->sched_tags is always checked with ->elevator_lock by >>> debugfs code, meantime hctx->sched_tags is updated with ->elevator_lock, so >>> there isn't such issue. >>> >>> Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx> >>> --- >>> block/blk-mq.c | 9 ++++---- >>> block/blk.h | 1 + >>> block/elevator.c | 58 ++++++++++++++++++++++++++++++++++-------------- >>> block/elevator.h | 5 +++++ >>> 4 files changed, 52 insertions(+), 21 deletions(-) >>> >> >> [...] >> >>> +static void elevator_exit(struct request_queue *q) >>> +{ >>> + __elevator_exit(q); >>> + kobject_put(&q->elevator->kobj); >>> } >> >> [...] >>> +int elevator_change_done(struct request_queue *q, struct elv_change_ctx *ctx) >>> +{ >>> + int ret = 0; >>> + >>> + if (ctx->old) { >>> + elv_unregister_queue(q, ctx->old); >>> + kobject_put(&ctx->old->kobj); >>> + } >>> + if (ctx->new) { >>> + ret = elv_register_queue(q, ctx->new, ctx->uevent); >>> + if (ret) { >>> + unsigned memflags = blk_mq_freeze_queue(q); >>> + >>> + mutex_lock(&q->elevator_lock); >>> + elevator_exit(q); >>> + mutex_unlock(&q->elevator_lock); >>> + blk_mq_unfreeze_queue(q, memflags); >>> + } >>> + } >>> + return 0; >>> +} >>> + >> It seems that we're still leaking ->elevator_lock in sysfs/kernfs with >> the above elevator_exit call. I think we probably want to move out >> kobject_put(&q->elevator->kobj) from elevator_exit and invoke it >> after we release ->elevator_lock in elevator_change_done. > > q->elevator_lock is owned by request queue instead of elevator queue, so it > shouldn't be one issue, or can you explain in details? > I meant that one of the objective of this patchset is to ensure that we don't grab any block/queue lock and enter the kernefs/sysfs/debugfs code. However here in elevator_change_done() we invoke elevator_exit() after grabbing q->elevator_lock. Now if we look at the definition of elevator_exit() then it appears that we invoke kobject_put(&ctx->old->kobj) from it. So that means that while holding q->elevator_lock we enter in kernfs/sysfs code. So my point was to move kobject_put() out of the elevator_exit() and call it after we release q->elevator_lock, maybe something like shown below: elevator_change_done() { ... ... if (ctx->new) { ret = elv_register_queue(q, ctx->new, ctx->uevent); if (ret) { unsigned memflags = blk_mq_freeze_queue(q); mutex_lock(&q->elevator_lock); __elevator_exit(q); mutex_unlock(&q->elevator_lock); kobject_put(&q->elevator->kobj); blk_mq_unfreeze_queue(q, memflags); } } } Thanks, --Nilay