Re: [PATCH] block: don't use submit_bio_noacct_nocheck in blk_zone_wplug_bio_work

On 6/18/25 05:09, Bart Van Assche wrote:
> On 6/10/25 9:44 PM, Christoph Hellwig wrote:
>> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
>> index 8f15d1aa6eb8..45c91016cef3 100644
>> --- a/block/blk-zoned.c
>> +++ b/block/blk-zoned.c
>> @@ -1306,7 +1306,6 @@ static void blk_zone_wplug_bio_work(struct work_struct *work)
>>   	spin_unlock_irqrestore(&zwplug->lock, flags);
>>   
>>   	bdev = bio->bi_bdev;
>> -	submit_bio_noacct_nocheck(bio);
>>   
>>   	/*
>>   	 * blk-mq devices will reuse the extra reference on the request queue
>> @@ -1314,8 +1313,12 @@ static void blk_zone_wplug_bio_work(struct work_struct *work)
>>   	 * path for BIO-based devices will not do that. So drop this extra
>>   	 * reference here.
>>   	 */
>> -	if (bdev_test_flag(bdev, BD_HAS_SUBMIT_BIO))
>> +	if (bdev_test_flag(bdev, BD_HAS_SUBMIT_BIO)) {
>> +		bdev->bd_disk->fops->submit_bio(bio);
>>   		blk_queue_exit(bdev->bd_disk->queue);
>> +	} else {
>> +		blk_mq_submit_bio(bio);
>> +	}
>>   
>>   put_zwplug:
>>   	/* Drop the reference we took in disk_zone_wplug_schedule_bio_work(). */
> 
> This patch is necessary but not sufficient. With this patch applied, if
> I run the deadlock reproducer (tests/zbd/013) with Jens' for-next
> branch, the deadlock shown below is reported. The first call stack shows
> the familiar queue_ra_store() invocation. The second call stack is new
> and shows a dm_split_and_process_bio() invocation.

That function may call bio_split_to_limits(), which triggers the call chain:

__bio_split_to_limits() -> bio_split_xxx() -> bio_submit_split() ->
submit_bio_noacct() -> submit_bio_noacct_nocheck()

And we should then end up doing:

	if (current->bio_list)
		bio_list_add(&current->bio_list[0], bio);

Since this is a split from within the original submitter context... Well, I
think we should be. If we end up calling __submit_bio_noacct() again directly
here, we would be reentering the submission path, at the risk of a stack
overflow, which is what current->bio_list is meant to avoid.
So I am confused as to why you end up seeing this issue... Can you check exactly
which path is being followed? (Your backtrace does not seem to have everything.)
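
To make the expectation above concrete, here is a minimal user-space sketch of
the deferral idea behind current->bio_list. It is not kernel code: every toy_*
name, the fixed split size and the static pool are assumptions made up for
illustration. It only models the principle that a BIO submitted from inside a
->submit_bio() call is queued on the submitter's list and handled by the outer
loop instead of recursing.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; none of these are kernel types. */
#define TOY_MAX_SECTORS 8	/* assumed split limit */

struct toy_bio {
	int sectors;		/* size of this toy BIO */
	struct toy_bio *next;	/* link used by the pending list */
};

struct toy_list {
	struct toy_bio *head, *tail;
};

static struct toy_list *toy_current_list;	/* models current->bio_list */
static struct toy_bio toy_pool[16];		/* storage for split remainders */
static int toy_pool_used;
static int toy_depth, toy_max_depth;		/* nesting of the driver callback */
static int toy_handled;				/* BIOs seen by the driver */

static void toy_list_add(struct toy_list *l, struct toy_bio *bio)
{
	bio->next = NULL;
	if (l->tail)
		l->tail->next = bio;
	else
		l->head = bio;
	l->tail = bio;
}

static struct toy_bio *toy_list_pop(struct toy_list *l)
{
	struct toy_bio *bio = l->head;

	if (bio) {
		l->head = bio->next;
		if (!l->head)
			l->tail = NULL;
		bio->next = NULL;
	}
	return bio;
}

static void toy_submit(struct toy_bio *bio);

/* Models a ->submit_bio() method that splits a large BIO and resubmits
 * the remainder from inside the driver. */
static void toy_driver_submit(struct toy_bio *bio)
{
	toy_depth++;
	if (toy_depth > toy_max_depth)
		toy_max_depth = toy_depth;

	if (bio->sectors > TOY_MAX_SECTORS && toy_pool_used < 16) {
		struct toy_bio *rest = &toy_pool[toy_pool_used++];

		rest->sectors = bio->sectors - TOY_MAX_SECTORS;
		rest->next = NULL;
		bio->sectors = TOY_MAX_SECTORS;
		toy_submit(rest);	/* nested submission of the remainder */
	}

	toy_handled++;
	toy_depth--;
}

/* Models the top-level submission: queue if a submission is already active
 * in this context, otherwise run the loop that drains the pending list. */
static void toy_submit(struct toy_bio *bio)
{
	struct toy_list on_stack = { NULL, NULL };

	if (toy_current_list) {
		toy_list_add(toy_current_list, bio);
		return;
	}

	toy_current_list = &on_stack;
	do {
		toy_driver_submit(bio);
	} while ((bio = toy_list_pop(&on_stack)));
	toy_current_list = NULL;
}
```

In this toy model, submitting one 40-sector BIO results in five driver calls,
none of them nested, because each remainder is queued rather than submitted
recursively. If the real path behaved like this, the split BIO would not reach
__submit_bio_noacct() from within dm_split_and_process_bio().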

Depending on the BIO, dm_split_and_process_bio() may also trigger the path:

__split_and_process_bio() -> __map_bio() -> dm_submit_bio_remap() ->
submit_bio_noacct()

But that should be for the submission of cloned BIOs to the block device used
as the backing dev of the DM device. So that should not cause an issue, since
that is a different bdev. Or is this maybe confusing lockdep?

> 
> sysrq: Show Blocked State
> task:check           state:D stack:27208 pid:2728  tgid:2728  ppid:2697 
>   task_flags:0x480040 flags:0x00004002
> Call Trace:
>   __schedule+0x8be/0x1c10
>   schedule+0xdd/0x270
>   blk_mq_freeze_queue_wait+0xfd/0x140
>   blk_mq_freeze_queue_nomemsave+0x1e/0x30
>   queue_ra_store+0x155/0x2a0
>   queue_attr_store+0x24d/0x2d0
>   sysfs_kf_write+0xdc/0x120
>   kernfs_fop_write_iter+0x39f/0x5a0
>   vfs_write+0x4fa/0x1300
>   ksys_write+0x109/0x1f0
>   __x64_sys_write+0x76/0xb0
>   x64_sys_call+0x276/0x17d0
>   do_syscall_64+0x94/0x3a0
>   entry_SYSCALL_64_after_hwframe+0x4b/0x53
> task:kworker/52:2H   state:D stack:26528 pid:2873  tgid:2873  ppid:2 
>   task_flags:0x4208060 flags:0x00004000
> Workqueue: dm-0_zwplugs blk_zone_wplug_bio_work
> Call Trace:
>   __schedule+0x8be/0x1c10
>   schedule+0xdd/0x270
>   __bio_queue_enter+0x32d/0x7c0
>   __submit_bio+0x1dd/0x6c0
>   __submit_bio_noacct+0x147/0x580
>   submit_bio_noacct_nocheck+0x4de/0x620
>   submit_bio_noacct+0x8f4/0x1a50
>   dm_split_and_process_bio+0x8a1/0x1c00 [dm_mod 14a6a78a54cd51bfc1d6559d48b0c80b677774ec]
>   dm_submit_bio+0x137/0x490 [dm_mod 14a6a78a54cd51bfc1d6559d48b0c80b677774ec]
>   blk_zone_wplug_bio_work+0x455/0x630
>   process_one_work+0xe29/0x1420
>   worker_thread+0x5ed/0xff0
>   kthread+0x3cd/0x840
>   ret_from_fork+0x412/0x520
>   ret_from_fork_asm+0x11/0x20
> 
> Bart.
> 


-- 
Damien Le Moal
Western Digital Research



