On Thu, 11 Sep 2025, Mikulas Patocka wrote:

>
>
> On Wed, 10 Sep 2025, Bart Van Assche wrote:
>
> > The dm core splits REQ_PREFLUSH bios that have data into two bios.
> > First, a REQ_PREFLUSH bio with no data is submitted to all underlying
> > dm devices. Next, the REQ_PREFLUSH flag is cleared and the same bio is
> > resubmitted. This approach is essential if there are multiple underlying
> > devices to provide correct REQ_PREFLUSH semantics.
> >
> > Splitting a bio into an empty flush bio and a non-flush data bio is
> > not necessary if there is only a single underlying device. Hence this
> > patch that does not split REQ_PREFLUSH bios if there is only one
> > underlying device.
> >
> > This patch preserves the order of REQ_PREFLUSH writes if there is only
> > one underlying device and if one or more write bios have been queued
> > past the REQ_PREFLUSH bio before the REQ_PREFLUSH bio is processed.
> >
> > Cc: Mike Snitzer <snitzer@xxxxxxxxxx>
> > Cc: Damien Le Moal <dlemoal@xxxxxxxxxx>
> > Signed-off-by: Bart Van Assche <bvanassche@xxxxxxx>
> > ---
> >
> > Changes compared to v1:
> > - Made the patch description more detailed.
> > - Removed the reference to write pipelining from the patch description.
>
> Hi
>
> I think that the problem here is that not all targets handle a PREFLUSH
> bio with data (for example, dm-integrity doesn't handle it correctly; it
> assumes that the PREFLUSH bio is empty).
>
> I suggest that the logic should be changed to test that
> "t->flush_bypasses_map == true" (that will rule out targets that don't
> support flush optimization) and "dm_table_get_devices returns just one
> device" - if both of these conditions are true, you can send the PREFLUSH
> bio with data to the one device that dm_table_get_devices returned.
>
> It will also optimize the case when you have multiple dm-linear targets
> with just one underlying device.
>
> Mikulas

Here I'm sending a patch that implements this logic. Please test it.

Mikulas


From: Mikulas Patocka <mpatocka@xxxxxxxxxx>

If the table has only linear targets and there is just one underlying
device, we can optimize REQ_PREFLUSH with data - we don't have to split
it into two bios, a flush and a write. We can pass it to the linear
target directly.

Signed-off-by: Mikulas Patocka <mpatocka@xxxxxxxxxx>

---
 drivers/md/dm-core.h |    1 +
 drivers/md/dm.c      |   21 +++++++++++++--------
 2 files changed, 14 insertions(+), 8 deletions(-)

Index: linux-2.6/drivers/md/dm.c
===================================================================
--- linux-2.6.orig/drivers/md/dm.c	2025-08-15 17:28:23.000000000 +0200
+++ linux-2.6/drivers/md/dm.c	2025-09-12 15:29:08.000000000 +0200
@@ -490,18 +490,13 @@ u64 dm_start_time_ns_from_clone(struct b
 }
 EXPORT_SYMBOL_GPL(dm_start_time_ns_from_clone);
 
-static inline bool bio_is_flush_with_data(struct bio *bio)
-{
-	return ((bio->bi_opf & REQ_PREFLUSH) && bio->bi_iter.bi_size);
-}
-
 static inline unsigned int dm_io_sectors(struct dm_io *io, struct bio *bio)
 {
 	/*
 	 * If REQ_PREFLUSH set, don't account payload, it will be
 	 * submitted (and accounted) after this flush completes.
 	 */
-	if (bio_is_flush_with_data(bio))
+	if (io->requeue_flush_with_data)
 		return 0;
 	if (unlikely(dm_io_flagged(io, DM_IO_WAS_SPLIT)))
 		return io->sectors;
@@ -590,6 +585,7 @@ static struct dm_io *alloc_io(struct map
 	io = container_of(tio, struct dm_io, tio);
 	io->magic = DM_IO_MAGIC;
 	io->status = BLK_STS_OK;
+	io->requeue_flush_with_data = false;
 
 	/* one ref is for submission, the other is for completion */
 	atomic_set(&io->io_count, 2);
@@ -976,11 +972,12 @@ static void __dm_io_complete(struct dm_i
 	if (requeued)
 		return;
 
-	if (bio_is_flush_with_data(bio)) {
+	if (unlikely(io->requeue_flush_with_data)) {
 		/*
 		 * Preflush done for flush with data, reissue
 		 * without REQ_PREFLUSH.
 		 */
+		io->requeue_flush_with_data = false;
 		bio->bi_opf &= ~REQ_PREFLUSH;
 		queue_io(md, bio);
 	} else {
@@ -1996,11 +1993,19 @@ static void dm_split_and_process_bio(str
 	}
 	init_clone_info(&ci, io, map, bio, is_abnormal);
 
-	if (bio->bi_opf & REQ_PREFLUSH) {
+	if (unlikely((bio->bi_opf & REQ_PREFLUSH) != 0)) {
+		if (map->flush_bypasses_map) {
+			struct list_head *devices = dm_table_get_devices(map);
+			if (devices->next == devices->prev)
+				goto send_preflush_with_data;
+		}
+		if (bio->bi_iter.bi_size)
+			io->requeue_flush_with_data = true;
		__send_empty_flush(&ci);
 		/* dm_io_complete submits any data associated with flush */
 		goto out;
 	}
+send_preflush_with_data:
 
 	if (static_branch_unlikely(&zoned_enabled) &&
 	    (bio_op(bio) == REQ_OP_ZONE_RESET_ALL)) {
Index: linux-2.6/drivers/md/dm-core.h
===================================================================
--- linux-2.6.orig/drivers/md/dm-core.h	2025-07-06 15:02:23.000000000 +0200
+++ linux-2.6/drivers/md/dm-core.h	2025-09-12 15:19:36.000000000 +0200
@@ -291,6 +291,7 @@ struct dm_io {
 	struct dm_io *next;
 	struct dm_stats_aux stats_aux;
 	blk_status_t status;
+	bool requeue_flush_with_data;
 	atomic_t io_count;
 	struct mapped_device *md;
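[Editorial note, not part of the patch:] For readers puzzled by the
"devices->next == devices->prev" test in dm_split_and_process_bio() above: on a
circular doubly-linked list such as the kernel's struct list_head, the head's
next and prev pointers coincide exactly when the list holds zero or one
entries, so the comparison means "at most one underlying device". The sketch
below is a minimal standalone userspace illustration of that pointer idiom; the
struct node type and the helper names (list_init, list_add_tail_node,
at_most_one_entry) are hypothetical stand-ins, not the kernel list API.

#include <stdbool.h>
#include <stdio.h>

/* Minimal circular doubly-linked list, mirroring the list_head layout. */
struct node {
	struct node *next, *prev;
};

static void list_init(struct node *head)
{
	head->next = head->prev = head;
}

static void list_add_tail_node(struct node *entry, struct node *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/*
 * True when the list holds zero or one entries: with no entries both
 * pointers point back at the head, with one entry both point at that
 * single entry, and only with two or more entries do they diverge.
 */
static bool at_most_one_entry(const struct node *head)
{
	return head->next == head->prev;
}

int main(void)
{
	struct node head, a, b;

	list_init(&head);
	printf("empty list:  %d\n", at_most_one_entry(&head));	/* prints 1 */

	list_add_tail_node(&a, &head);
	printf("one entry:   %d\n", at_most_one_entry(&head));	/* prints 1 */

	list_add_tail_node(&b, &head);
	printf("two entries: %d\n", at_most_one_entry(&head));	/* prints 0 */

	return 0;
}

This only illustrates the pointer comparison; it says nothing about the
locking or reference counting around dm_table_get_devices() in the real code.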