On Wed, Aug 06, 2025 at 05:21:32PM +0800, Yu Kuai wrote:
> Hi,
>
> On 2025/08/01 19:44, Ming Lei wrote:
> > Replace the spinlock in blk_mq_find_and_get_req() with an SRCU read lock
> > around the tag iterators.
> >
> > This is done by:
> >
> > - Holding the SRCU read lock in blk_mq_queue_tag_busy_iter(),
> >   blk_mq_tagset_busy_iter(), and blk_mq_hctx_has_requests().
> >
> > - Removing the now-redundant tags->lock from blk_mq_find_and_get_req().
> >
> > This change improves performance by replacing a spinlock with a more
> > scalable SRCU lock, and fixes a lockup issue in scsi_host_busy() in case
> > of shost->host_blocked.
> >
> > Meantime it becomes possible to use blk_mq_in_driver_rw() for io
> > accounting.
> >
> > Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
> > ---
> >  block/blk-mq-tag.c | 12 ++++++++----
> >  block/blk-mq.c     | 24 ++++--------------------
> >  2 files changed, 12 insertions(+), 24 deletions(-)
>
> I think it's not good to use blk_mq_in_driver_rw() for io accounting, we
> start io accounting from blk_account_io_start(), where such io is not in
> driver yet.

In blk_account_io_start(), the current IO is _not_ taken into account in
update_io_ticks() yet.

Also please look at 'man iostat':

    %util
        Percentage of elapsed time during which I/O requests were issued
        to the device (bandwidth utilization for the device). Device
        saturation occurs when this value is close to 100% for devices
        serving requests serially. But for devices serving requests in
        parallel, such as RAID arrays and modern SSDs, this number does
        not reflect their performance limits.

which shows that %util is measured at the device level, not from the time
an IO is queued until it completes at the device.

That said, the current approach of counting inflight IO via a percpu
counter starting from blk_account_io_start() is not correct from the
%util viewpoint for request-based drivers.

Thanks,
Ming
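
[Editor's note: the locking change the patch description walks through can
be sketched roughly as below. This is a hypothetical, non-compilable
kernel-style sketch, not the actual patch; in particular the srcu_struct
name `tags_srcu` and the body of blk_mq_find_and_get_req() are assumptions
for illustration only.]

```c
/*
 * Before: blk_mq_find_and_get_req() serializes each lookup with the
 * per-tagset tags->lock spinlock.
 */
static struct request *blk_mq_find_and_get_req(struct blk_mq_tags *tags,
					       unsigned int bitnr)
{
	struct request *rq;
	unsigned long flags;

	spin_lock_irqsave(&tags->lock, flags);	/* removed by the patch */
	rq = tags->rqs[bitnr];
	if (!rq || !req_ref_inc_not_zero(rq))
		rq = NULL;
	spin_unlock_irqrestore(&tags->lock, flags);
	return rq;
}

/*
 * After: the tag iterators hold an SRCU read lock around the whole
 * walk, so the per-lookup spinlock above becomes redundant. Readers
 * scale, and writers wait for them via synchronize_srcu().
 */
void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset,
			     busy_tag_iter_fn *fn, void *priv)
{
	/* "tags_srcu" is an assumed field name for this sketch */
	int idx = srcu_read_lock(&tagset->tags_srcu);

	/* ... iterate busy tags, calling blk_mq_find_and_get_req()
	 * without taking tags->lock ... */

	srcu_read_unlock(&tagset->tags_srcu, idx);
}
```

The point of the pattern is that srcu_read_lock()/srcu_read_unlock() are
cheap, non-contending operations on the reader side, unlike a shared
spinlock taken once per tag lookup.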