On Mon, Jun 16, 2025 at 09:37:01PM -0700, Christoph Hellwig wrote: > On Mon, Jun 16, 2025 at 12:07:42PM -0400, Mike Snitzer wrote: > > But that's OK... my test bdev is a bad example (archaic VMware vSphere > > provided SCSI device): it doesn't reflect expected modern hardware. > > > > But I just slapped together a test pmem blockdevice (memory backed, > > using memmap=6G!18G) and it too has dma_alignment=511 > > That's the block layer default when not overriden by the driver, I guess > pmem folks didn't care enough. I suspect it should not have any > alignment requirements at all. Yeah, I hacked it with this just to quickly simulate NVMe's dma_alignment: diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c index 210fb77f51ba..0ab2826073f9 100644 --- a/drivers/nvdimm/pmem.c +++ b/drivers/nvdimm/pmem.c @@ -457,6 +457,7 @@ static int pmem_attach_disk(struct device *dev, .max_hw_sectors = UINT_MAX, .features = BLK_FEAT_WRITE_CACHE | BLK_FEAT_SYNCHRONOUS, + .dma_alignment = 3, }; int nid = dev_to_node(dev), fua; struct resource *res = &nsio->res; > > I'd like NFSD to be able to know if its bvec is dma-aligned, before > > issuing DIO writes to underlying XFS. AFAIK I can do that simply by > > checking the STATX_DIOALIGN provided dio_mem_align... > > Exactly. I'm finding that even with dma_alignment=3 the bvec, that nfsd_vfs_write()'s call to xdr_buf_to_bvec() produces from NFS's WRITE payload, still causes iov_iter_aligned_bvec() to return false. The reason is that iov_iter_aligned_bvec() inspects each member of the bio_vec in isolation (in its while() loop). So even though NFS WRITE payload's overall size is aligned on-disk (e.g. offset=0 len=512K) its first and last bvec members are _not_ aligned (due to 512K NFS WRITE payload being offset 148 bytes into the first page of the pages allocated for it by SUNRPC). So iov_iter_aligned_bvec() fails at this check: if (len & len_mask) return false; with tracing I added: nfsd-14027 [001] ..... 3734.668780: nfsd_vfs_write: iov_iter_aligned_bvec: addr_mask=3 len_mask=511 nfsd-14027 [001] ..... 3734.668781: nfsd_vfs_write: iov_iter_aligned_bvec: len=3948 & len_mask=511 failed Is this another case of the checks being too strict? The bvec does describe a contiguous 512K extent of on-disk LBA, just not if inspected piece-wise. BTW, XFS's directio code _will_ also check with iov_iter_aligned_bvec() via iov_iter_is_aligned(). Mike