On Sat, Apr 26, 2025 at 03:46:30PM -0700, Luis Chamberlain wrote:
> On Wed, Apr 23, 2025 at 11:12:58AM +0300, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@xxxxxxxxxx>
> >
> > Introduce new DMA APIs to perform DMA linkage of buffers
> > in layers higher than DMA.
> >
> > In the proposed API, the callers will perform the following steps.
> > In the map path:
> > 	if (dma_can_use_iova(...))
> > 		dma_iova_alloc()
> > 		for (page in range)
> > 			dma_iova_link_next(...)
> > 		dma_iova_sync(...)
> > 	else
> > 		/* Fallback to legacy map pages */
> > 		for (all pages)
> > 			dma_map_page(...)
> >
> > In the unmap path:
> > 	if (dma_can_use_iova(...))
> > 		dma_iova_destroy()
> > 	else
> > 		for (all pages)
> > 			dma_unmap_page(...)
> >
> > Reviewed-by: Christoph Hellwig <hch@xxxxxx>
> > Tested-by: Jens Axboe <axboe@xxxxxxxxx>
> > Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxx>
> > ---
> >  drivers/iommu/dma-iommu.c   | 261 ++++++++++++++++++++++++++++++++++++
> >  include/linux/dma-mapping.h |  32 +++++
> >  2 files changed, 293 insertions(+)
> >
> > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> > index d2c298083e0a..2e014db5a244 100644
> > --- a/drivers/iommu/dma-iommu.c
> > +++ b/drivers/iommu/dma-iommu.c
> > @@ -1818,6 +1818,267 @@ void dma_iova_free(struct device *dev, struct dma_iova_state *state)
> >  }
> >  EXPORT_SYMBOL_GPL(dma_iova_free);
> >
> > +static int __dma_iova_link(struct device *dev, dma_addr_t addr,
> > +		phys_addr_t phys, size_t size, enum dma_data_direction dir,
> > +		unsigned long attrs)
> > +{
> > +	bool coherent = dev_is_dma_coherent(dev);
> > +
> > +	if (!coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
> > +		arch_sync_dma_for_device(phys, size, dir);
>
> So arch_sync_dma_for_device() is a no-op on some architectures, notably
> x86. Since you're doing this work and the above pattern is common in the
> non-iova case, we could save ourselves two branch checks on x86 in
> __dma_iova_link() and also generalize the savings for the non-iova case.
> For the non-iova case we have two use cases, one with the attrs on the
> initial mapping, and one without on subsequent sync ops. For the iova
> case the attrs are always used consistently.

I want to believe that the compiler will discard the
"if (!coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))" branch if its body
is empty.

> So we could just have something like this:
>
> #ifdef CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE
> static inline void arch_sync_dma_device(struct device *dev,
> 					phys_addr_t paddr, size_t size,
> 					enum dma_data_direction dir)
> {
> 	if (!dev_is_dma_coherent(dev))
> 		arch_sync_dma_for_device(paddr, size, dir);
> }
>
> static inline void arch_sync_dma_device_attrs(struct device *dev,
> 					      phys_addr_t paddr, size_t size,
> 					      enum dma_data_direction dir,
> 					      unsigned long attrs)
> {
> 	if (!dev_is_dma_coherent(dev) && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
> 		arch_sync_dma_for_device(paddr, size, dir);
> }
> #else
> static inline void arch_sync_dma_device(struct device *dev,
> 					phys_addr_t paddr, size_t size,
> 					enum dma_data_direction dir)
> {
> }
>
> static inline void arch_sync_dma_device_attrs(struct device *dev,
> 					      phys_addr_t paddr, size_t size,
> 					      enum dma_data_direction dir,
> 					      unsigned long attrs)
> {
> }
> #endif

The problem is that the dev_is_dma_coherent() and DMA_ATTR_SKIP_CPU_SYNC
checks are scattered all over dma-iommu.c in different combinations. While
we could add new static helpers for a small number of use cases, it would
be a half-solution.
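For reference, a minimal sketch of why I expect that: the stub pattern
below mirrors what include/linux/dma-map-ops.h does today (quoted from
memory, so treat it as an approximation rather than the exact upstream
text):

#ifdef CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE
void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
		enum dma_data_direction dir);
#else
static inline void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
		enum dma_data_direction dir)
{
	/* no-op: coherent architectures such as x86 need no CPU sync */
}
#endif

With the empty #else stub inlined, the branch body compiles to nothing,
and since dev_is_dma_coherent() is a side-effect-free read (on x86 it is
a constant "true"), the compiler is free to drop the whole condition.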
> > +/**
> > + * dma_iova_link - Link a range of IOVA space
> > + * @dev: DMA device
> > + * @state: IOVA state
> > + * @phys: physical address to link
> > + * @offset: offset into the IOVA state to map into
> > + * @size: size of the buffer
> > + * @dir: DMA direction
> > + * @attrs: attributes of mapping properties
> > + *
> > + * Link a range of IOVA space for the given IOVA state without IOTLB sync.
> > + * This function is used to link multiple physical addresses in contiguous
> > + * IOVA space without performing costly IOTLB sync.
> > + *
> > + * The caller is responsible for calling dma_iova_sync() to sync the IOTLB
> > + * at the end of linkage.
> > + */
> > +int dma_iova_link(struct device *dev, struct dma_iova_state *state,
> > +		phys_addr_t phys, size_t offset, size_t size,
> > +		enum dma_data_direction dir, unsigned long attrs)
> > +{
> > +	struct iommu_domain *domain = iommu_get_dma_domain(dev);
> > +	struct iommu_dma_cookie *cookie = domain->iova_cookie;
> > +	struct iova_domain *iovad = &cookie->iovad;
> > +	size_t iova_start_pad = iova_offset(iovad, phys);
> > +
> > +	if (WARN_ON_ONCE(iova_start_pad && offset > 0))
> > +		return -EIO;
> > +
> > +	if (dev_use_swiotlb(dev, size, dir) && iova_offset(iovad, phys | size))
>
> There is already a similar check for the non-iova case in
> iommu_dma_map_page(), with a nice comment about what is checked and why;
> this seems to be just screaming for a helper:
>
> /*
>  * Checks if a physical buffer has unaligned boundaries with respect to
>  * the IOMMU granule. Returns non-zero if either the start or end
>  * address is not aligned to the granule boundary.
>  */
> static inline size_t iova_unaligned(struct iova_domain *iovad,
> 				    phys_addr_t phys,
> 				    size_t size)
> {
> 	return iova_offset(iovad, phys | size);
> }

I added this function, thanks. A quick sketch of the resulting check is
at the end of this mail.

> Other than that, looks good.
>
> Reviewed-by: Luis Chamberlain <mcgrof@xxxxxxxxxx>
>
>   Luis
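For completeness, with the helper in place the check in dma_iova_link()
would read roughly as follows (a sketch against this patch, not the final
committed diff):

-	if (dev_use_swiotlb(dev, size, dir) && iova_offset(iovad, phys | size))
+	if (dev_use_swiotlb(dev, size, dir) &&
+	    iova_unaligned(iovad, phys, size))

The equivalent open-coded check in iommu_dma_map_page() converts the same
way, so the explanatory comment lives in one place above the helper.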