Re: [PATCH] block: Fix a deadlock related freezing zoned storage devices

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 5/23/25 04:10, Ming Lei wrote:
> On Thu, May 22, 2025 at 11:32:58AM -0700, Bart Van Assche wrote:
>> On 5/22/25 10:38 AM, Jens Axboe wrote:
>>> On 5/22/25 11:14 AM, Bart Van Assche wrote:
>>>>   static void __submit_bio(struct bio *bio)
>>>>   {
>>>>   	/* If plug is not used, add new plug here to cache nsecs time. */
>>>> @@ -633,8 +640,12 @@ static void __submit_bio(struct bio *bio)
>>>>   	if (!bdev_test_flag(bio->bi_bdev, BD_HAS_SUBMIT_BIO)) {
>>>>   		blk_mq_submit_bio(bio);
>>>> -	} else if (likely(bio_queue_enter(bio) == 0)) {
>>>> +	} else {
>>>>   		struct gendisk *disk = bio->bi_bdev->bd_disk;
>>>> +		bool zwp = bio_zone_write_plugging(bio);
>>>> +
>>>> +		if (unlikely(!zwp && bio_queue_enter(bio) != 0))
>>>> +			goto finish_plug;
>>>>   	
>>>>   		if ((bio->bi_opf & REQ_POLLED) &&
>>>>   		    !(disk->queue->limits.features & BLK_FEAT_POLL)) {
>>>> @@ -643,9 +654,12 @@ static void __submit_bio(struct bio *bio)
>>>>   		} else {
>>>>   			disk->fops->submit_bio(bio);
>>>>   		}
>>>> -		blk_queue_exit(disk->queue);
>>>> +
>>>> +		if (!zwp)
>>>> +			blk_queue_exit(disk->queue);
>>>>   	}
>>>
>>> This is pretty ugly, and I honestly absolutely hate how there's quite a
>>> bit of zoned_whatever sprinkling throughout the core code. What's the
>>> reason for not unplugging here, unaligned writes? Because you should
>>> presumable have the exact same issues on non-zoned devices if they have
>>> IO stuck in a plug (and doesn't get unplugged) while someone is waiting
>>> on a freeze.
>>>
>>> A somewhat similar case was solved for IOPOLL and queue entering. That
>>> would be another thing to look at. Maybe a live enter could work if the
>>> plug itself pins it?
>>
>> Hi Jens,
>>
>> q->q_usage_counter is not increased for bios on current->plug_list.
>> q->q_usage_counter is increased before a bio is added to the zoned pluglist.
>> So these two cases are different.
>>
>> I think it is important to hold a q->q_usage_counter reference for bios
>> on the zoned plug list because bios are added to that list after bio
>> splitting happened. Hence, request queue limits must not change while
>> any bio is on the zoned plug list.
> 
> Hi Bart,
> 
> Can you share why request queue limit can't be changed after bio is added
> to zoned plug list?

Because BIOs on a zone write plug list have already been split according to the
current request queue limits. So until these BIOs are executed, we cannot change
the limits as that potentially would require again splitting and that would
completely messup the zone write pointer tracking of zone write plugging.

> If it is really true, we may have to drain zoned plug list when freezing
> queue.

Yes, that is what we need. But we currently endup deadlocking on a queue freeze
because the internal issuing of plugged zone write BIOs uses
submit_bio_noacct_nocheck() which calls __submit_bio() and that function will
call blk_queue_enter() again for DM device BIOs. We need to somehow cleanly
avoid calling that queue enter without sprinkling around lots of zone stuff in
this core block layer submission path.

> 
> 
> Thanks, 
> Ming
> 


-- 
Damien Le Moal
Western Digital Research




[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux