Hi,

Ming Lei <ming.lei@xxxxxxxxxx> wrote on Sat, Aug 16, 2025 at 12:05:
>
> On Sat, Aug 16, 2025 at 10:57:23AM +0800, Yu Kuai wrote:
> > Hi,
> >
> > On 2025/8/16 3:30, Nilay Shroff wrote:
> > >
> > > On 8/15/25 1:32 PM, Yu Kuai wrote:
> > > > From: Yu Kuai <yukuai3@xxxxxxxxxx>
> > > >
> > > > In case the user triggers tags growth through the queue sysfs attribute
> > > > nr_requests, hctx->sched_tags will be freed directly and replaced with
> > > > newly allocated tags, see blk_mq_tag_update_depth().
> > > >
> > > > The problem is that hctx->sched_tags is from elevator->et->tags, while
> > > > et->tags still points to the freed tags, hence a later elevator exit will
> > > > try to free the tags again, causing a kernel panic.
> > > >
> > > > Fix this problem by using newly allocated elevator_tags, also convert
> > > > blk_mq_update_nr_requests to void since this helper will never fail now.
> > > >
> > > > Meanwhile, there is a long-term problem that can be fixed as well:
> > > >
> > > > If blk_mq_tag_update_depth() succeeds for a previous hctx, then the bitmap
> > > > depth is updated; however, if a following hctx fails, q->nr_requests is not
> > > > updated and the previous hctx->sched_tags ends up bigger than q->nr_requests.
> > > >
> > > > Fixes: f5a6604f7a44 ("block: fix lockdep warning caused by lock dependency in elv_iosched_store")
> > > > Fixes: e3a2b3f931f5 ("blk-mq: allow changing of queue depth through sysfs")
> > > > Signed-off-by: Yu Kuai <yukuai3@xxxxxxxxxx>
> > > > ---
> > > >  block/blk-mq.c    | 19 ++++++-------------
> > > >  block/blk-mq.h    |  4 +++-
> > > >  block/blk-sysfs.c | 21 ++++++++++++++-------
> > > >  3 files changed, 23 insertions(+), 21 deletions(-)
> > > >
> > > > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > > > index 11c8baebb9a0..e9f037a25fe3 100644
> > > > --- a/block/blk-mq.c
> > > > +++ b/block/blk-mq.c
> > > > @@ -4917,12 +4917,12 @@ void blk_mq_free_tag_set(struct blk_mq_tag_set *set)
> > > >  }
> > > >  EXPORT_SYMBOL(blk_mq_free_tag_set);
> > > > -int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr)
> > > > +void blk_mq_update_nr_requests(struct request_queue *q,
> > > > +			       struct elevator_tags *et, unsigned int nr)
> > > >  {
> > > >  	struct blk_mq_tag_set *set = q->tag_set;
> > > >  	struct blk_mq_hw_ctx *hctx;
> > > >  	unsigned long i;
> > > > -	int ret = 0;
> > > >  	blk_mq_quiesce_queue(q);
> > > > @@ -4946,24 +4946,17 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr)
> > > >  				nr - hctx->sched_tags->nr_reserved_tags);
> > > >  		}
> > > >  	} else {
> > > > -		queue_for_each_hw_ctx(q, hctx, i) {
> > > > -			if (!hctx->tags)
> > > > -				continue;
> > > > -			ret = blk_mq_tag_update_depth(hctx, &hctx->sched_tags,
> > > > -						      nr);
> > > > -			if (ret)
> > > > -				goto out;
> > > > -		}
> > > > +		blk_mq_free_sched_tags(q->elevator->et, set);
> > > I think you also need to ensure that elevator tags are freed after we unfreeze
> > > queue and release ->elevator_lock otherwise we may get into the lockdep splat
> > > for pcpu_lock dependency on ->freeze_lock and/or ->elevator_lock. Please note
> > > that blk_mq_free_sched_tags internally invokes sbitmap_free which invokes
> > > free_percpu which acquires pcpu_lock.
> >
> > Ok, thanks for the notice. However, as Ming suggested, we might fix this
> > problem in the next merge window.
>
> There are two issues involved:
>
> - blk_mq_tags double free, introduced recently
>
> - long-term lock issue in queue_requests_store()
>
> IMO, the former is a bit serious, because a kernel panic can be triggered,
> so I suggest getting it into v6.17. The latter looks less serious and has
> existed for a long time, but may need code refactoring to get a clean fix.
>
> > I'll send one patch to fix this regression by replacing et->tags with the
> > newly reallocated sched_tags as well.
>
> Patch 7 in this patchset and patch 8 in your 1st post look sufficient to
> fix this double free issue.
>

But without the previous refactoring, this looks hard. Can we consider the
following one-line patch for this merge window? It fixes only the first
issue, the double free, for now.

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index d880c50629d6..1e0ccf19295a 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -622,6 +622,7 @@ int blk_mq_tag_update_depth(struct blk_mq_hw_ctx *hctx,
 			return -ENOMEM;
 		blk_mq_free_map_and_rqs(set, *tagsptr, hctx->queue_num);
+		hctx->queue->elevator->et->tags[hctx->queue_num] = new;
 		*tagsptr = new;
 	} else {
 		/*

>
> Thanks,
> Ming
>
>