On Fri, Jun 06, 2025 at 01:20:48PM +0200, Christian König wrote:
> > dmabuf acts as a driver and shouldn't be handled by VFS, so I made
> > dmabuf implement copy_file_range callbacks to support direct I/O
> > zero-copy. I'm open to both approaches. What's the preference of
> > VFS experts?
>
> That would probably be illegal. Using the sg_table in the DMA-buf
> implementation turned out to be a mistake.

Two things here that should not be directly conflated.

Using the sg_table was a huge mistake, and we should try to switch
dmabuf to a pure dma_addr_t/len array now that the new DMA API
supporting that has been merged.  Is there any chance the dma-buf
maintainers could start to kick this off?  I'm of course happy to
assist.

But that notwithstanding, dma-buf is THE buffer sharing mechanism in
the kernel, and we should promote it instead of reinventing it badly.
And there is a use case for having a fully DMA-mapped buffer in the
block layer and I/O path, especially on systems with an IOMMU.  So
having an iov_iter backed by a dma-buf would be extremely helpful.
That's mostly lib/iov_iter.c code, not VFS, though.

> The question Christoph raised was rather why is your CPU so slow
> that walking the page tables has a significant overhead compared to
> the actual I/O?

Yes, that's really puzzling and should be addressed first.