On Wed, 25 Jun 2025, Damien Le Moal wrote:

> Any zoned DM target that requires zone append emulation will use the
> block layer zone write plugging. In such case, DM target drivers must
> not split BIOs using dm_accept_partial_bio() as doing so can potentially
> lead to deadlocks with queue freeze operations. Regular write operations
> used to emulate zone append operations also cannot be split by the
> target driver as that would result in an invalid written sector value
> being returned using the BIO sector.
>
> In order for zoned DM target drivers to avoid such incorrect BIO
> splitting, we must ensure that large BIOs are split before being passed
> to the map() function of the target, thus guaranteeing that the
> limits for the mapped device are not exceeded.
>
> dm-crypt and dm-flakey are the only target drivers supporting zoned
> devices and using dm_accept_partial_bio().
>
> In the case of dm-crypt, this function is used to split BIOs to the
> internal max_write_size limit (which will be suppressed in a different
> patch). However, crypt_alloc_buffer() uses a bioset allowing only
> up to BIO_MAX_VECS (256) vectors in a BIO. The dm-crypt device
> max_segments limit, which is not set and so defaults to BLK_MAX_SEGMENTS
> (128), must thus be respected and write BIOs split accordingly.
>
> In the case of dm-flakey, since zone append emulation is not required,
> the block layer zone write plugging is not used and no splitting of BIOs
> is required.
>
> Modify the function dm_zone_bio_needs_split() to use the block layer
> helper function bio_needs_zone_write_plugging() to force a call to
> bio_split_to_limits() in dm_split_and_process_bio(). This allows DM
> target drivers to avoid using dm_accept_partial_bio() for write
> operations on zoned DM devices.
>
> Fixes: f211268ed1f9 ("dm: Use the block layer zone append emulation")
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Damien Le Moal <dlemoal@xxxxxxxxxx>

Reviewed-by: Mikulas Patocka <mpatocka@xxxxxxxxxx>

> ---
>  drivers/md/dm.c | 29 ++++++++++++++++++++++-------
>  1 file changed, 22 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index e477765cdd27..f1e63c1808b4 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1773,12 +1773,29 @@ static inline bool dm_zone_bio_needs_split(struct mapped_device *md,
>  					   struct bio *bio)
>  {
>  	/*
> -	 * For mapped device that need zone append emulation, we must
> -	 * split any large BIO that straddles zone boundaries.
> +	 * Special case the zone operations that cannot or should not be split.
>  	 */
> -	return dm_emulate_zone_append(md) && bio_straddles_zones(bio) &&
> -		!bio_flagged(bio, BIO_ZONE_WRITE_PLUGGING);
> +	switch (bio_op(bio)) {
> +	case REQ_OP_ZONE_APPEND:
> +	case REQ_OP_ZONE_FINISH:
> +	case REQ_OP_ZONE_RESET:
> +	case REQ_OP_ZONE_RESET_ALL:
> +		return false;
> +	default:
> +		break;
> +	}
> +
> +	/*
> +	 * Mapped devices that require zone append emulation will use the block
> +	 * layer zone write plugging. In such case, we must split any large BIO
> +	 * to the mapped device limits to avoid potential deadlocks with queue
> +	 * freeze operations.
> +	 */
> +	if (!dm_emulate_zone_append(md))
> +		return false;
> +	return bio_needs_zone_write_plugging(bio) || bio_straddles_zones(bio);
>  }
> +
>  static inline bool dm_zone_plug_bio(struct mapped_device *md, struct bio *bio)
>  {
>  	if (!bio_needs_zone_write_plugging(bio))
> @@ -1927,9 +1944,7 @@ static void dm_split_and_process_bio(struct mapped_device *md,
>
>  	is_abnormal = is_abnormal_io(bio);
>  	if (static_branch_unlikely(&zoned_enabled)) {
> -		/* Special case REQ_OP_ZONE_RESET_ALL as it cannot be split. */
> -		need_split = (bio_op(bio) != REQ_OP_ZONE_RESET_ALL) &&
> -			(is_abnormal || dm_zone_bio_needs_split(md, bio));
> +		need_split = is_abnormal || dm_zone_bio_needs_split(md, bio);
>  	} else {
>  		need_split = is_abnormal;
>  	}
> --
> 2.49.0
>
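
For anyone reading along, the decision made by the reworked dm_zone_bio_needs_split() can be modelled in userspace roughly as below. This is only a sketch and not kernel code: enum model_op, struct model_bio and model_zone_bio_needs_split() are hypothetical stand-ins, and the two boolean fields simply assume the results that bio_needs_zone_write_plugging() and bio_straddles_zones() would return for a given BIO.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the REQ_OP_* values; not the kernel enum. */
enum model_op {
	MODEL_OP_READ,
	MODEL_OP_WRITE,
	MODEL_OP_ZONE_APPEND,
	MODEL_OP_ZONE_FINISH,
	MODEL_OP_ZONE_RESET,
	MODEL_OP_ZONE_RESET_ALL,
};

/* Hypothetical flattened view of what the real helpers would report. */
struct model_bio {
	enum model_op op;
	bool needs_zone_write_plugging;	/* assumed bio_needs_zone_write_plugging() result */
	bool straddles_zones;		/* assumed bio_straddles_zones() result */
};

/* Mirrors the control flow of the patched function, nothing more. */
static bool model_zone_bio_needs_split(bool emulate_zone_append,
				       const struct model_bio *bio)
{
	/* Zone operations that cannot or should not be split. */
	switch (bio->op) {
	case MODEL_OP_ZONE_APPEND:
	case MODEL_OP_ZONE_FINISH:
	case MODEL_OP_ZONE_RESET:
	case MODEL_OP_ZONE_RESET_ALL:
		return false;
	default:
		break;
	}

	/* Without zone append emulation, zone write plugging is not used. */
	if (!emulate_zone_append)
		return false;

	return bio->needs_zone_write_plugging || bio->straddles_zones;
}
```

In this model, a plugged write on a device that emulates zone append is always split to the device limits, while the same write on a device without emulation (the dm-flakey case described above) is not.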