Hi all,

I'm not really sure what guarantees the block layer makes regarding the
segments in a bio as part of a request submitted to a block driver. As
far as I can tell this is not documented anywhere. In particular,

- Is bv_len aligned to SECTOR_SIZE?
- To logical_sector_size?
- What if logical_sector_size > PAGE_SIZE?
- What about bv_offset?
- Is it possible to have a bio where the total length is a multiple of
  logical_sector_size, but the data is split across several segments
  where each segment is a multiple of SECTOR_SIZE?
- Is it possible to have segments not even aligned to SECTOR_SIZE?
- Can I somehow request to only get segments with bv_len aligned to
  logical_sector_size? Or do I need to do my own coalescing and bounce
  buffering for that?

I've been reading some drivers (as well as stuff in block/) to try and
figure things out, but it's hard to find all the places where
constraints are enforced. In particular, I've read several drivers that
make some big assumptions (which might be bugs?)

For example, in drivers/mtd/mtd_blkdevs.c, do_blktrans_request looks
like:

	block = blk_rq_pos(req) << 9 >> tr->blkshift;
	nsect = blk_rq_cur_bytes(req) >> tr->blkshift;

	switch (req_op(req)) {
	/* ... snip ... */
	case REQ_OP_READ:
		buf = kmap(bio_page(req->bio)) + bio_offset(req->bio);
		for (; nsect > 0; nsect--, block++, buf += tr->blksize) {
			if (tr->readsect(dev, block, buf)) {
				kunmap(bio_page(req->bio));
				return BLK_STS_IOERR;
			}
		}
		kunmap(bio_page(req->bio));

		rq_for_each_segment(bvec, req, iter)
			flush_dcache_page(bvec.bv_page);
		return BLK_STS_OK;

For context, tr->blksize is either 512 or 4096 (so tr->blkshift is 9 or
12), depending on the backend.

From what I can tell, this code assumes the following:

- There is only one bio in a request. This one is a bit of a soft
  assumption, since otherwise we should only flush the pages in the bio
  and not the whole request.
- There is only one segment in a bio. This one could be reasonable if
  max_segments was set to 1, but it's not as far as I can tell. So I
  guess we just go off the end of the bio if there's a second segment?
- The data is in lowmem OR bv_offset + bv_len <= PAGE_SIZE. kmap() only
  maps a single page, so if we go past one page we end up in adjacent
  kmapped pages.

Am I missing something here?

Handling highmem seems like a persistent issue. E.g.
drivers/mtd/ubi/block.c doesn't even bother doing a kmap. Should both of
these have BLK_FEAT_BOUNCE_HIGH?

--Sean
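
P.S. To make the questions above concrete, here is a small userspace model (not kernel code) of the checks I think a driver would have to do itself if the block layer guarantees none of this. Everything in it is made up for illustration: struct seg_model stands in for the bv_offset/bv_len fields of a bio_vec, MODEL_PAGE_SIZE assumes 4 KiB pages, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/* Userspace stand-in for the two bio_vec fields in question. */
struct seg_model {
	unsigned int offset; /* models bv_offset */
	unsigned int len;    /* models bv_len */
};

/* Sum of all segment lengths, i.e. the total transfer size. */
unsigned int segs_total_len(const struct seg_model *segs, size_t n)
{
	unsigned int total = 0;

	for (size_t i = 0; i < n; i++)
		total += segs[i].len;
	return total;
}

/* True if every segment's offset and length are multiples of blksize. */
bool segs_block_aligned(const struct seg_model *segs, size_t n,
			unsigned int blksize)
{
	assert(blksize != 0);
	for (size_t i = 0; i < n; i++) {
		if (segs[i].offset % blksize || segs[i].len % blksize)
			return false;
	}
	return true;
}

/*
 * True if mapping only the first page would cover the whole transfer:
 * exactly one segment, and it does not run past the end of its page.
 */
bool single_map_covers(const struct seg_model *segs, size_t n)
{
	return n == 1 && segs[0].offset + segs[0].len <= MODEL_PAGE_SIZE;
}
```

The case I'm worried about is something like two segments of 512 and 3584 bytes: the total is 4096, every segment is a multiple of 512, but neither is a multiple of 4096, and a single mapping of the first page does not cover the transfer.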