Re: [PATCH 2/2] block: Fix a deadlock related freezing zoned storage devices

On 5/15/25 9:51 PM, Christoph Hellwig wrote:
> On Wed, May 14, 2025 at 01:29:37PM -0700, Bart Van Assche wrote:
>> +/*
>> + * Do not call bio_queue_enter() if the BIO_ZONE_WRITE_PLUGGING flag has been
>> + * set because this causes blk_mq_freeze_queue() to deadlock if
>> + * blk_zone_wplug_bio_work() submits a bio. Calling bio_queue_enter() for bios
>> + * on the plug list is not necessary since a q_usage_counter reference is held
>> + * while a bio is on the plug list.
>> + */
>
> How do we end up with BIO_ZONE_WRITE_PLUGGING set here?  If that flag
> was set in a stacking driver we need to clear it before resubmitting
> the bio I think.

Hmm ... my understanding is that BIO_ZONE_WRITE_PLUGGING, if set, must
remain set until the bio has completed. Here is an example of code in
the bio completion path that tests this flag:

static inline void blk_zone_bio_endio(struct bio *bio)
{
	/*
	 * For write BIOs to zoned devices, signal the completion of the BIO so
	 * that the next write BIO can be submitted by zone write plugging.
	 */
	if (bio_zone_write_plugging(bio))
		blk_zone_write_plug_bio_endio(bio);
}

The bio_zone_write_plugging() definition is as follows:

static inline bool bio_zone_write_plugging(struct bio *bio)
{
	return bio_flagged(bio, BIO_ZONE_WRITE_PLUGGING);
}
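
To illustrate why the flag has to stay set until completion, here is a minimal userspace sketch, not kernel code. The model_* names, the flag bit position and the counters are assumptions made up for this illustration; only the shape of the flag test mirrors the two helpers quoted above.

```c
#include <stdbool.h>

/* Hypothetical bit position; the kernel's actual value is not assumed here. */
enum { MODEL_BIO_ZONE_WRITE_PLUGGING = 3 };

/* Hypothetical stand-in for struct bio: only the flags word is modeled. */
struct model_bio {
	unsigned int flags;
};

/* Hypothetical stand-in for a zone write plug. */
struct model_zone_wplug {
	int pending_bios;	/* writes still waiting on the plug list */
	int submitted_next;	/* how often completion kicked off the next write */
};

/* Same shape as a bit-flag test: true if the given bit is set in flags. */
static inline bool model_bio_flagged(const struct model_bio *bio,
				     unsigned int bit)
{
	return bio->flags & (1U << bit);
}

/* Models "signal the completion so that the next write can be submitted". */
static void model_zone_write_plug_bio_endio(struct model_zone_wplug *zwplug)
{
	if (zwplug->pending_bios > 0) {
		zwplug->pending_bios--;
		zwplug->submitted_next++;
	}
}

/* Completion path: the plug is only notified if the flag is still set. */
static void model_zone_bio_endio(struct model_bio *bio,
				 struct model_zone_wplug *zwplug)
{
	if (model_bio_flagged(bio, MODEL_BIO_ZONE_WRITE_PLUGGING))
		model_zone_write_plug_bio_endio(zwplug);
}
```

In this model, completing a bio whose flag was cleared leaves pending_bios untouched, so the next plugged write is never submitted; completing a bio with the flag still set advances the plug.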

> Can you provide a null_blk based reproducer for your testcase to see
> what happens here?

My attempts so far to build a reproducer for the blktests framework have
been unsuccessful. This test script runs fine in the VM that I use for
kernel testing:

https://github.com/bvanassche/blktests/blob/master/tests/zbd/013

> Either way we can't just simply skip taking q_usage_count.

Why not? If BIO_ZONE_WRITE_PLUGGING is set, that guarantees that a
queue reference is held and will remain held across the entire
disk->fops->submit_bio() invocation, doesn't it? From
blk_zone_wplug_bio_work(), below the submit_bio_noacct_nocheck(bio)
call:

	if (bdev_test_flag(bdev, BD_HAS_SUBMIT_BIO))
		blk_queue_exit(bdev->bd_disk->queue);
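
The reference-counting argument can be sketched with a small userspace model. This is a simplification under stated assumptions, not the kernel implementation: the model_* names are hypothetical, a plain integer stands in for q_usage_counter, and an enter attempt that would sleep until the queue is unfrozen is represented by returning false.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a request queue. */
struct model_queue {
	int usage_counter;	/* stands in for q_usage_counter */
	bool freezing;		/* set once a freeze has been started */
};

/* Returns false where a real enter call would sleep until unfreeze. */
static bool model_queue_try_enter(struct model_queue *q)
{
	if (q->freezing)
		return false;
	q->usage_counter++;
	return true;
}

/* Drops one reference taken by model_queue_try_enter(). */
static void model_queue_exit(struct model_queue *q)
{
	assert(q->usage_counter > 0);
	q->usage_counter--;
}

/* Starts a freeze: from now on new enter attempts would block. */
static void model_start_freeze(struct model_queue *q)
{
	q->freezing = true;
}

/* A freeze only completes once every reference has been dropped. */
static bool model_freeze_done(const struct model_queue *q)
{
	return q->freezing && q->usage_counter == 0;
}

/*
 * Models the plug work submitting one plugged bio. The caller already holds
 * one reference on behalf of that bio. Returns true if the bio was submitted
 * and the plug-list reference was dropped.
 */
static bool model_wplug_bio_work(struct model_queue *q, bool enter_again)
{
	if (enter_again) {
		/* Second enter during a freeze: would block forever. */
		if (!model_queue_try_enter(q))
			return false;
		model_queue_exit(q);	/* matching exit of the nested enter */
	}
	/* Bio handed over; drop the reference held for the plug list. */
	model_queue_exit(q);
	return true;
}
```

In this model, a plugged bio holds one reference when the freeze starts. With enter_again set, the work function cannot make progress, the reference is never dropped, and model_freeze_done() stays false. With enter_again cleared, the reference is dropped and the freeze can complete.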

Thanks,

Bart.



