On Fri, May 23, 2025 at 05:45:10AM -0700, Christoph Hellwig wrote:
> On Wed, May 21, 2025 at 03:31:03PM -0700, Keith Busch wrote:
> > From: Keith Busch <kbusch@xxxxxxxxxx>
> >
> > Provide a basic block level api to copy a range of a block device's
> > sectors to a new destination on the same device. This just reads the
> > source data into host memory, then writes it back out to the device at
> > the requested destination.
>
> As someone who recently spent a lot of time on optimizing such loops:
> having a general API that allocates a buffer for each copy is a bad
> idea. You'll want some kind of caller provided longer living allocation
> if you do regularly do such copies.
>
> Maybe having common code is good to avoid copies, but I suspect most
> real users would want their own.

But the end goal is that no host memory buffer would be needed at all. A
buffer is allocated only in the fallback path. If we have the caller
provide their buffer for that fallback case, that kind of defeats the
benefit of reduced memory utilization.

So it sounds like you might be suggesting that I don't even bother
providing the host instrumented copy fallback and provide an API that
performs the copy only if it can be offloaded. Yes?
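
To make the two API shapes in question concrete, here is a rough userspace
sketch. It is a toy model and not the code from the patch: struct toy_bdev,
toy_copy_offload(), toy_copy() and the host_bytes_used counter are all
hypothetical names, and the "device" is just an array in memory. It only
shows the difference between an offload-only call that fails with
-EOPNOTSUPP and a call that falls back to a temporary host buffer.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SECTOR_SIZE 512u

/* Hypothetical stand-in for a block device: a flat array of sectors. */
struct toy_bdev {
	uint8_t *data;
	size_t nr_sectors;
	bool has_copy_offload;   /* assumption: device advertises a copy command */
	size_t host_bytes_used;  /* host memory allocated on behalf of copies */
};

/* Check that both the source and destination ranges fit on the device. */
static bool range_ok(const struct toy_bdev *b, size_t src, size_t dst, size_t n)
{
	return n <= b->nr_sectors &&
	       src <= b->nr_sectors - n &&
	       dst <= b->nr_sectors - n;
}

/*
 * Offload-only variant: the copy happens "inside the device", so no host
 * buffer is allocated. Returns -EOPNOTSUPP if the device cannot offload.
 */
int toy_copy_offload(struct toy_bdev *b, size_t src, size_t dst, size_t n)
{
	assert(b && b->data);
	if (!range_ok(b, src, dst, n))
		return -EINVAL;
	if (!b->has_copy_offload)
		return -EOPNOTSUPP;
	/* memmove models a device-side copy that tolerates overlapping ranges */
	memmove(b->data + dst * SECTOR_SIZE, b->data + src * SECTOR_SIZE,
		n * SECTOR_SIZE);
	return 0;
}

/*
 * Offload-with-fallback variant: try the offload first; only if the device
 * does not support it, read the source into a temporary host buffer and
 * write it back out at the destination.
 */
int toy_copy(struct toy_bdev *b, size_t src, size_t dst, size_t n)
{
	size_t len = n * SECTOR_SIZE;
	uint8_t *bounce;
	int ret;

	ret = toy_copy_offload(b, src, dst, n);
	if (ret != -EOPNOTSUPP)
		return ret;
	if (len == 0)
		return 0;

	bounce = malloc(len);            /* allocated only in the fallback path */
	if (!bounce)
		return -ENOMEM;
	b->host_bytes_used += len;

	memcpy(bounce, b->data + src * SECTOR_SIZE, len);  /* "read" */
	memcpy(b->data + dst * SECTOR_SIZE, bounce, len);  /* "write" */
	free(bounce);
	return 0;
}
```

In this model a caller of toy_copy_offload() has to handle -EOPNOTSUPP
itself, for example with its own long-lived buffer, while toy_copy() hides
that behind a per-call allocation that is only made when the offload is
unavailable.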