On Tue, Jul 29, 2025 at 12:13:42PM -0400, Tony Battersby wrote:
> Improve writeback performance to RAID-4/5/6 by aligning writes to stripe
> boundaries.  This relies on io_opt being set to the stripe size (or
> a multiple) when BLK_FEAT_RAID_PARTIAL_STRIPES_EXPENSIVE is set.

This is the wrong layer to be pulling filesystem write alignments
from.

Filesystems already have alignment information in their on-disk
formats. XFS has stripe unit and stripe width information in the
filesystem superblock that is set by mkfs.xfs.

This information comes from the block device io-opt/io-min values
exposed to userspace at mkfs time, so the filesystem already knows
what the optimal IO alignment parameters are for the storage stack
underneath it.

Indeed, we already align extent allocations to these parameters, so
aligning filesystem writeback to the same configured alignment makes
a lot more sense than pulling random stuff from block devices during
IO submission...

> @@ -1685,81 +1685,118 @@ static int iomap_add_to_ioend(struct iomap_writepage_ctx *wpc,
>  		struct inode *inode, loff_t pos, loff_t end_pos,
>  		unsigned len)
>  {
> -	struct iomap_folio_state *ifs = folio->private;
> -	size_t poff = offset_in_folio(folio, pos);
> -	unsigned int ioend_flags = 0;
> -	int error;
> -
> -	if (wpc->iomap.type == IOMAP_UNWRITTEN)
> -		ioend_flags |= IOMAP_IOEND_UNWRITTEN;
> -	if (wpc->iomap.flags & IOMAP_F_SHARED)
> -		ioend_flags |= IOMAP_IOEND_SHARED;
> -	if (folio_test_dropbehind(folio))
> -		ioend_flags |= IOMAP_IOEND_DONTCACHE;
> -	if (pos == wpc->iomap.offset && (wpc->iomap.flags & IOMAP_F_BOUNDARY))
> -		ioend_flags |= IOMAP_IOEND_BOUNDARY;
> +	struct queue_limits *lim = bdev_limits(wpc->iomap.bdev);
> +	unsigned int io_align =
> +		(lim->features & BLK_FEAT_RAID_PARTIAL_STRIPES_EXPENSIVE) ?
> +		lim->io_opt >> SECTOR_SHIFT : 0;

i.e. this alignment should come from the filesystem, not the block
device.

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
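
A minimal userspace sketch of the arithmetic behind "alignment should come from the filesystem", under stated assumptions. Both helpers (fs_writeback_align, align_io_end) are hypothetical names, not kernel or XFS APIs, and the stripe width is assumed to be recorded in filesystem blocks. The sketch derives a byte alignment from filesystem geometry and cuts a dirty range at the last alignment boundary it crosses. It uses modulo rather than a mask because a stripe width need not be a power of two.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel API: derive the writeback alignment
 * in bytes from filesystem geometry.  swidth_blocks stands in for the
 * stripe width recorded at mkfs time, assumed to be in filesystem blocks.
 * Returns 0 when no stripe geometry is configured.
 */
static uint64_t fs_writeback_align(uint32_t swidth_blocks, uint32_t block_size)
{
	return (uint64_t)swidth_blocks * block_size;
}

/*
 * Hypothetical helper: for a contiguous dirty range [pos, end), return
 * the offset at which the current I/O should stop so that it ends on an
 * alignment boundary.  If no alignment is configured, or the range does
 * not reach a boundary beyond pos, the whole range is returned.
 */
static uint64_t align_io_end(uint64_t pos, uint64_t end, uint64_t align)
{
	uint64_t cut;

	assert(pos <= end);
	if (align == 0)
		return end;

	/* Round end down to the nearest boundary; align may not be 2^n. */
	cut = end - (end % align);

	/* Only cut if the boundary lies strictly inside the range. */
	return cut > pos ? cut : end;
}
```

For example, with a 4096-byte block size and a stripe width of 48 blocks, the alignment is 196608 bytes, and a dirty range [0, 300000) would be cut at 196608, leaving the tail for a later I/O.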