On Fri, Feb 21, 2025 at 12:12:42PM +0100, Luis Chamberlain wrote: > On Wed, Feb 19, 2025 at 10:44:09AM +0800, Ming Lei wrote: > > PAGE_SIZE is applied in validating block device queue limits, this way is > > very fragile and is wrong: > > > > - queue limits are read from hardware, which is often one readonly hardware > > property > > > > - PAGE_SIZE is one config option which can be changed during build time. > > > > In RH lab, it has been found that max segment size of some mmc card is > > less than 64K, then this kind of card can't be probed successfully when > > same kernel is re-built with 64K PAGE_SIZE. > > > > Fix this issue by adding BLK_MIN_SEGMENT_SIZE and lim->min_segment_size: > > > > - validate segment limits by BLK_MIN_SEGMENT_SIZE which is 4K(minimized PAGE_SIZE) > > > > - checking if one bvec can be one segment quickly by lim->min_segment_size > > > > commit 6aeb4f836480 ("block: remove bio_add_pc_page") > > commit 02ee5d69e3ba ("block: remove blk_rq_bio_prep") > > commit b7175e24d6ac ("block: add a dma mapping iterator") > > Let me try to help with this commit log message a bit, how about: > > Using PAGE_SIZE as a minimum expected DMA segment size in consideration > of devices which have a max DMA segment size of 4k when used on 64k > PAGE_SIZE systems leads to devices not being able to probe such as > eMMC and Exynos UFS controller [0] [1] you can end up with a probe failure > as follows: > > WARNING: CPU: 2 PID: 397 at block/blk-settings.c:339 blk_validate_limits+0x364/0x3c0 > Modules linked in: mmc_block(+) rpmb_core crct10dif_ce ghash_ce sha2_ce dw_mmc_bluefield sha256_arm64 dw_mmc_pltfm sha1_ce dw_mmc mmc_core nfit i2c_mlxbf sbsa_gwdt gpio_mlxbf2 > f_tmfifo dm_mirror dm_region_hash dm_log dm_mod > CPU: 2 UID: 0 PID: 397 Comm: (udev-worker) Not tainted 6.12.0-39.el10.aarch64+64k #1 > Hardware name: https://www.mellanox.com BlueField SoC/BlueField SoC, BIOS BlueField:3.5.1-1-g4078432 Jan 28 2021 > ng pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) > pc : blk_validate_limits+0x364/0x3c0 > p.service > lr : blk_set_default_limits+0x20/0x40 > Setup... > sp : ffff80008688f2d0 > x29: ffff80008688f2d0 x28: ffff000082acb600 x27: ffff80007bef02a8 > x26: ffff80007bef0000 x25: ffff80008688f58e x24: ffff80008688f450 > x23: ffff80008301b000 x22: 00000000ffffffff x21: ffff800082c39950 > x20: 0000000000000000 x19: ffff0000930169e0 x18: 0000000000000014 > x17: 00000000767472b1 x16: 0000000005a697e6 x15: 0000000002f42ca4 > x11: 00000000de7f0111 x10: 000000005285b53a x9 : ffff800080752908 > x8 : 0000000000000001 x7 : 0000000000000000 x6 : 0000000000000200 > x5 : 0000000000000000 x4 : 000000000000ffff x3 : 0000000000004000 > x2 : 0000000000000200 x1 : 0000000000001000 x0 : ffff80008688f450 > Call trace: > blk_validate_limits+0x364/0x3c0 > blk_set_default_limits+0x20/0x40 > blk_alloc_queue+0x84/0x240 > blk_mq_alloc_queue+0x80/0x118 > __blk_mq_alloc_disk+0x28/0x198 > mmc_alloc_disk+0xe0/0x260 [mmc_block] > ... > mmcblk mmc0:0001: probe with driver mmcblk failed with error -22 > > To fix this provide a block sanity check to ensure we use the min of the > block device's segment size and PAGE_SIZE. I think I would clarify this a bit more to something like: ...ensure we use the block device's max segment size as the new min segment size when max segment size is < PAGE_SIZE for 16k and 64k base page size systems. > > Link: https://lore.kernel.org/linux-block/20230612203314.17820-1-bvanassche@xxxxxxx/ # [0] > Link: https://lore.kernel.org/linux-block/1d55e942-5150-de4c-3a02-c3d066f87028@xxxxxxx/ # [1] > > > diff --git a/block/blk-merge.c b/block/blk-merge.c > > index 15cd231d560c..4fe2dfabfc9d 100644 > > --- a/block/blk-merge.c > > +++ b/block/blk-merge.c > > @@ -329,7 +329,7 @@ int bio_split_rw_at(struct bio *bio, const struct queue_limits *lim, > > > > if (nsegs < lim->max_segments && > > bytes + bv.bv_len <= max_bytes && > > - bv.bv_offset + bv.bv_len <= PAGE_SIZE) { > > + bv.bv_offset + bv.bv_len <= lim->min_segment_size) { > > nsegs++; > > bytes += bv.bv_len; > > } else { > > Now that's certainly more in line with what I expected. > > > diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h > > index 248416ecd01c..1f7d492975c1 100644 > > --- a/include/linux/blkdev.h > > +++ b/include/linux/blkdev.h > > @@ -367,6 +367,7 @@ struct queue_limits { > > unsigned int max_sectors; > > unsigned int max_user_sectors; > > unsigned int max_segment_size; > > + unsigned int min_segment_size; > > unsigned int physical_block_size; > > unsigned int logical_block_size; > > unsigned int alignment_offset; > > @@ -1163,6 +1164,8 @@ static inline bool bdev_is_partition(struct block_device *bdev) > > enum blk_default_limits { > > BLK_MAX_SEGMENTS = 128, > > BLK_SAFE_MAX_SECTORS = 255, > > + /* use minimized PAGE_SIZE as min segment size hint */ > > + BLK_MIN_SEGMENT_SIZE = 4096, > > BLK_MAX_SEGMENT_SIZE = 65536, > > BLK_SEG_BOUNDARY_MASK = 0xFFFFFFFFUL, > > I think Bart provided a more sensible comment: > > Use 4 KiB as the smallest default supported DMA segment size limit instead of > PAGE_SIZE. This is important if the page size is larger than 4 KiB since > the maximum DMA segment size for some storage controllers is only 4 KiB. > > With that: > > Reviewed-by: Luis Chamberlain <mcgrof@xxxxxxxxxx> > > Luis