On 4/18/25 10:06 PM, Ming Lei wrote: > elevator change is one well-define behavior: > > - tear down current elevator if it exists > > - setup new elevator > > It is supposed to cover any case for changing elevator by single > internal API, typically the following cases: > > - setup default elevator in add_disk() > > - switch to none in del_disk() > > - reset elevator in blk_mq_update_nr_hw_queues() > > - switch elevator in sysfs `store` elevator attribute > > This patch uses elevator_change() to cover all above cases: > > - every elevator switch is serialized with each other: add_disk/del_disk/ > store elevator is serialized already, blk_mq_update_nr_hw_queues() uses > srcu for syncing with the other three cases > > - for both add_disk()/del_disk(), queue freeze works at atomic mode > or has been froze, so the freeze in elevator_change() won't add extra > delay > > - `struct elev_change_ctx` instance holds any info for changing elevator > > Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx> > --- > block/blk-sysfs.c | 18 ++++------- > block/blk.h | 5 ++- > block/elevator.c | 81 ++++++++++++++++++++++++++--------------------- > block/elevator.h | 1 + > block/genhd.c | 19 +---------- > 5 files changed, 55 insertions(+), 69 deletions(-) > > diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c > index a2882751f0d2..58c50709bc14 100644 > --- a/block/blk-sysfs.c > +++ b/block/blk-sysfs.c > @@ -869,14 +869,8 @@ int blk_register_queue(struct gendisk *disk) > if (ret) > goto out_unregister_ia_ranges; > > + elevator_set_default(q); > mutex_lock(&q->elevator_lock); > - if (q->elevator) { > - ret = elv_register_queue(q, false); > - if (ret) { > - mutex_unlock(&q->elevator_lock); > - goto out_crypto_sysfs_unregister; > - } > - } > wbt_enable_default(disk); > mutex_unlock(&q->elevator_lock); > [...] [...] > /* > diff --git a/block/genhd.c b/block/genhd.c > index 86c3db5b9305..de227aa923ed 100644 > --- a/block/genhd.c > +++ b/block/genhd.c > @@ -438,12 +438,6 @@ static int __add_disk_fwnode(struct gendisk_data *data) > */ > if (disk->fops->submit_bio || disk->fops->poll_bio) > return -EINVAL; > - > - /* > - * Initialize the I/O scheduler code and pick a default one if > - * needed. > - */ > - elevator_init_mq(disk->queue); > } else { > if (!disk->fops->submit_bio) > return -EINVAL; > @@ -587,11 +581,7 @@ static int __add_disk_fwnode(struct gendisk_data *data) > if (disk->major == BLOCK_EXT_MAJOR) > blk_free_ext_minor(disk->first_minor); > out_exit_elevator: > - if (disk->queue->elevator) { > - mutex_lock(&disk->queue->elevator_lock); > - elevator_exit(disk->queue); > - mutex_unlock(&disk->queue->elevator_lock); > - } > + elevator_set_none(disk->queue); > return ret; > } As we moved elevator init function (elevator_set_default) inside blk_register_queue, we probably now don't need label out_exit_elevator in __add_disk_fwnode. In fact, no code in __add_disk_fwnode shall jump to this label. I think, elevator_set_none may be called anytime after we see failure post blk_register_queue returns. Thanks, --Nilay