[PATCH v3] dm: optimize REQ_PREFLUSH with data when using the linear target

On Thu, 11 Sep 2025, Mikulas Patocka wrote:

> 
> 
> On Wed, 10 Sep 2025, Bart Van Assche wrote:
> 
> > The dm core splits REQ_PREFLUSH bios that have data into two bios.
> > First, a REQ_PREFLUSH bio with no data is submitted to all underlying
> > dm devices. Next, the REQ_PREFLUSH flag is cleared and the same bio is
> > resubmitted. This approach is essential if there are multiple underlying
> > devices to provide correct REQ_PREFLUSH semantics.
> > 
> > Splitting a bio into an empty flush bio and a non-flush data bio is
> > not necessary if there is only a single underlying device. Hence this
> > patch that does not split REQ_PREFLUSH bios if there is only one
> > underlying device.
> > 
> > This patch preserves the order of REQ_PREFLUSH writes if there is only
> > one underlying device and if one or more write bios have been queued
> > past the REQ_PREFLUSH bio before the REQ_PREFLUSH bio is processed.
> > 
> > Cc: Mike Snitzer <snitzer@xxxxxxxxxx>
> > Cc: Damien Le Moal <dlemoal@xxxxxxxxxx>
> > Signed-off-by: Bart Van Assche <bvanassche@xxxxxxx>
> > ---
> > 
> > Changes compared to v1:
> >  - Made the patch description more detailed.
> >  - Removed the reference to write pipelining from the patch description.
> 
> Hi
> 
> I think that the problem here is that not all targets handle a PREFLUSH 
> bio with data (for example, dm-integrity doesn't handle it correctly; it 
> assumes that the PREFLUSH bio is empty).
> 
> I suggest that the logic should be changed to test that 
> "t->flush_bypasses_map == true" (that will rule out targets that don't 
> support flush optimization) and "dm_table_get_devices returns just one 
> device" - if both of these conditions are true, you can send the PREFLUSH 
> bio with data to the one device that dm_table_get_devices returned.
> 
> It will also optimize the case when you have multiple dm-linear targets 
> with just one underlying device.
> 
> Mikulas

Here I'm sending a patch that implements this logic. Please test it.
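
For anyone testing this, the single-device check in the patch compares
devices->next with devices->prev. Below is a minimal userspace sketch of
that comparison, not kernel code. The struct list_head here is a local
stand-in for the kernel type, and init_list_head(), list_add_tail_entry(),
can_send_preflush_with_data() and check_single_device_test() are
hypothetical names used only for illustration. The sketch shows that the
comparison holds for a list with zero or one entries and fails for two
or more.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's circular doubly linked list head. */
struct list_head {
	struct list_head *next, *prev;
};

/* An empty list points to itself in both directions. */
static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Append an entry just before the head, i.e. at the tail of the list. */
static void list_add_tail_entry(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/*
 * Hypothetical helper mirroring the condition in the patch: the targets
 * must support the flush optimization and the device list must have at
 * most one entry (next == prev is true for 0 or 1 entries).
 */
static bool can_send_preflush_with_data(bool flush_bypasses_map,
					const struct list_head *devices)
{
	return flush_bypasses_map && devices->next == devices->prev;
}

/* Walk through the 0, 1 and 2 device cases. */
static void check_single_device_test(void)
{
	struct list_head devices, a, b;

	init_list_head(&devices);
	assert(can_send_preflush_with_data(true, &devices));	/* 0 entries */

	list_add_tail_entry(&a, &devices);
	assert(can_send_preflush_with_data(true, &devices));	/* 1 entry */
	assert(!can_send_preflush_with_data(false, &devices));	/* no bypass */

	list_add_tail_entry(&b, &devices);
	assert(!can_send_preflush_with_data(true, &devices));	/* 2 entries */
}
```

Calling check_single_device_test() from a main() exercises all three
cases. Note that the comparison alone does not distinguish an empty list
from a list with one entry.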

Mikulas



From: Mikulas Patocka <mpatocka@xxxxxxxxxx>

If the table has only linear targets and there is just one underlying
device, we can optimize REQ_PREFLUSH with data: we don't have to split
it into two bios, a flush and a write. We can pass it to the linear
target directly.

Signed-off-by: Mikulas Patocka <mpatocka@xxxxxxxxxx>

---
 drivers/md/dm-core.h |    1 +
 drivers/md/dm.c      |   21 +++++++++++++--------
 2 files changed, 14 insertions(+), 8 deletions(-)

Index: linux-2.6/drivers/md/dm.c
===================================================================
--- linux-2.6.orig/drivers/md/dm.c	2025-08-15 17:28:23.000000000 +0200
+++ linux-2.6/drivers/md/dm.c	2025-09-12 15:29:08.000000000 +0200
@@ -490,18 +490,13 @@ u64 dm_start_time_ns_from_clone(struct b
 }
 EXPORT_SYMBOL_GPL(dm_start_time_ns_from_clone);
 
-static inline bool bio_is_flush_with_data(struct bio *bio)
-{
-	return ((bio->bi_opf & REQ_PREFLUSH) && bio->bi_iter.bi_size);
-}
-
 static inline unsigned int dm_io_sectors(struct dm_io *io, struct bio *bio)
 {
 	/*
 	 * If REQ_PREFLUSH set, don't account payload, it will be
 	 * submitted (and accounted) after this flush completes.
 	 */
-	if (bio_is_flush_with_data(bio))
+	if (io->requeue_flush_with_data)
 		return 0;
 	if (unlikely(dm_io_flagged(io, DM_IO_WAS_SPLIT)))
 		return io->sectors;
@@ -590,6 +585,7 @@ static struct dm_io *alloc_io(struct map
 	io = container_of(tio, struct dm_io, tio);
 	io->magic = DM_IO_MAGIC;
 	io->status = BLK_STS_OK;
+	io->requeue_flush_with_data = false;
 
 	/* one ref is for submission, the other is for completion */
 	atomic_set(&io->io_count, 2);
@@ -976,11 +972,12 @@ static void __dm_io_complete(struct dm_i
 	if (requeued)
 		return;
 
-	if (bio_is_flush_with_data(bio)) {
+	if (unlikely(io->requeue_flush_with_data)) {
 		/*
 		 * Preflush done for flush with data, reissue
 		 * without REQ_PREFLUSH.
 		 */
+		io->requeue_flush_with_data = false;
 		bio->bi_opf &= ~REQ_PREFLUSH;
 		queue_io(md, bio);
 	} else {
@@ -1996,11 +1993,19 @@ static void dm_split_and_process_bio(str
 	}
 	init_clone_info(&ci, io, map, bio, is_abnormal);
 
-	if (bio->bi_opf & REQ_PREFLUSH) {
+	if (unlikely((bio->bi_opf & REQ_PREFLUSH) != 0)) {
+		if (map->flush_bypasses_map) {
+			struct list_head *devices = dm_table_get_devices(map);
+			if (devices->next == devices->prev)
+				goto send_preflush_with_data;
+		}
+		if (bio->bi_iter.bi_size)
+			io->requeue_flush_with_data = true;
 		__send_empty_flush(&ci);
 		/* dm_io_complete submits any data associated with flush */
 		goto out;
 	}
+send_preflush_with_data:
 
 	if (static_branch_unlikely(&zoned_enabled) &&
 	    (bio_op(bio) == REQ_OP_ZONE_RESET_ALL)) {
Index: linux-2.6/drivers/md/dm-core.h
===================================================================
--- linux-2.6.orig/drivers/md/dm-core.h	2025-07-06 15:02:23.000000000 +0200
+++ linux-2.6/drivers/md/dm-core.h	2025-09-12 15:19:36.000000000 +0200
@@ -291,6 +291,7 @@ struct dm_io {
 	struct dm_io *next;
 	struct dm_stats_aux stats_aux;
 	blk_status_t status;
+	bool requeue_flush_with_data;
 	atomic_t io_count;
 	struct mapped_device *md;
 




