On Friday 06/20 at 14:47 +0800, Yu Kuai wrote:
> Hi,
>
> On 2025/06/20 12:10, Calvin Owens wrote:
> > I dumped all the similar WARNs I've seen here (blk-warn-%d.txt):
> >
> > https://github.com/jcalvinowens/lkml-debug-616/tree/master
>
> These reports also contain both request-based and bio-based disks. I
> think the following concurrent scenario may be possible:
>
> While bdev_count_inflight is iterating over all cpus, some IOs are issued
> from an already traversed cpu and then completed from a cpu that has not
> been traversed yet.
>
> cpu0
>                 cpu1
>                 bdev_count_inflight
>                  //for_each_possible_cpu
>                  // cpu0 is 0
>                  inflight += 0
> // issue an io
> blk_account_io_start
> // cpu0 inflight ++
>
>                                 cpu2
>                                 // the io is done
>                                 blk_account_io_done
>                                 // cpu2 inflight --
>                  // cpu 1 is 0
>                  inflight += 0
>                  // cpu2 is -1
>                  inflight += -1
>                  ...
>
> In this case, the total inflight will be -1.
>
> Yi and Calvin,
>
> Can you please help test the following patch? It adds a WARN_ON_ONCE()
> using atomic operations. If the new warning is not reproduced while
> the old warning is reproduced, I think that confirms the above
> analysis is correct, and I will send a revert for the WARN_ON_ONCE()
> change in bdev_count_inflight().

Hi Kuai,

I can confirm it's what you expected: I've reproduced the original
warning with your patch while not seeing any of the new ones.

If you like, for the revert:

Tested-By: Calvin Owens <calvin@xxxxxxxxxx>

Thanks,
Calvin

> Thanks,
> Kuai
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index b862c66018f2..2b033caa74e8 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -1035,6 +1035,8 @@ unsigned long bdev_start_io_acct(struct block_device *bdev, enum req_op op,
>  	part_stat_local_inc(bdev, in_flight[op_is_write(op)]);
>  	part_stat_unlock();
>
> +	atomic_inc(&bdev->inflight[op_is_write(op)]);
> +
>  	return start_time;
>  }
>  EXPORT_SYMBOL(bdev_start_io_acct);
> @@ -1065,6 +1067,8 @@ void bdev_end_io_acct(struct block_device *bdev, enum req_op op,
>  	part_stat_add(bdev, nsecs[sgrp], jiffies_to_nsecs(duration));
>  	part_stat_local_dec(bdev, in_flight[op_is_write(op)]);
>  	part_stat_unlock();
> +
> +	WARN_ON_ONCE(atomic_dec_return(&bdev->inflight[op_is_write(op)]) < 0);
>  }
>  EXPORT_SYMBOL(bdev_end_io_acct);
>
> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index 70d704615be5..ff15276d277f 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -658,6 +658,8 @@ static void blk_account_io_merge_request(struct request *req)
>  		part_stat_local_dec(req->part,
>  				    in_flight[op_is_write(req_op(req))]);
>  		part_stat_unlock();
> +
> +		WARN_ON_ONCE(atomic_dec_return(&req->part->inflight[op_is_write(req_op(req))]) < 0);
>  	}
>  }
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 4806b867e37d..94e728ff8bb6 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1056,6 +1056,8 @@ static inline void blk_account_io_done(struct request *req, u64 now)
>  		part_stat_local_dec(req->part,
>  				    in_flight[op_is_write(req_op(req))]);
>  		part_stat_unlock();
> +
> +		WARN_ON_ONCE(atomic_dec_return(&req->part->inflight[op_is_write(req_op(req))]) < 0);
>  	}
>  }
>
> @@ -1116,6 +1118,8 @@ static inline void blk_account_io_start(struct request *req)
>  	update_io_ticks(req->part, jiffies, false);
>  	part_stat_local_inc(req->part, in_flight[op_is_write(req_op(req))]);
>  	part_stat_unlock();
> +
> +	atomic_inc(&req->part->inflight[op_is_write(req_op(req))]);
>  }
>
>  static inline void __blk_mq_end_request_acct(struct request *rq, u64 now)
> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
> index 3d1577f07c1c..a81110c07426 100644
> --- a/include/linux/blk_types.h
> +++ b/include/linux/blk_types.h
> @@ -43,6 +43,7 @@ struct block_device {
>  	sector_t		bd_nr_sectors;
>  	struct gendisk *	bd_disk;
>  	struct request_queue *	bd_queue;
> +	atomic_t		inflight[2];
>  	struct disk_stats __percpu *bd_stats;
>  	unsigned long		bd_stamp;
>  	atomic_t		__bd_flags;	// partition number + flags
>