On Mon, May 12, 2025 at 10:46 PM Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
>
> On Mon, 12 May 2025 at 21:03, Joanne Koong <joannelkoong@xxxxxxxxx> wrote:
> >
> > On Wed, May 7, 2025 at 7:45 AM Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
> > >
> > > On Wed, 23 Apr 2025 at 01:56, Joanne Koong <joannelkoong@xxxxxxxxx> wrote:
> > > > For servers that do not need to access pages after answering the
> > > > request, splice gives a non-trivial improvement in performance.
> > > > Benchmarks show roughly a 40% speedup.
> > >
> > > Hmm, have you looked at where this speedup comes from?
> > >
> > > Is this a real zero-copy scenario where the server just forwards the
> > > pages to a driver which does DMA, so that the CPU never actually
> > > touches the page contents?
> >
> > I ran the benchmarks last month on the passthrough_ll server (from the
> > libfuse examples) with the actual copying out / buffer processing
> > removed (eg the .write_buf handler immediately returns
> > "fuse_reply_write(req, fuse_buf_size(in_buf));").
> Ah, ok.
>
> It would be good to see results in a more realistic scenario than that
> before deciding to do this.

The results vary depending on how IO-intensive the server-side
processing logic is (eg servers that do little processing show a
bigger relative speedup than ones where a lot of time is spent on
server-side processing). I can include the results from benchmarks on
our internal fuse server, which forwards the data in the write buffer
to a remote server over the network. For that server, we saw roughly a
5% improvement in throughput for 5 GB writes with 16 MB chunk sizes,
and a 2.45% improvement in throughput for 12 parallel writes of 16 GB
files with 64 MB chunk sizes.

Thanks,
Joanne

>
> Thanks,
> Miklos
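
For reference, a minimal sketch of the kind of no-op .write_buf handler
described in the quoted text above (the handler name and surrounding
boilerplate are illustrative; only the fuse_reply_write()/fuse_buf_size()
calls come from the thread, and the actual modified passthrough_ll code
may differ):

#define FUSE_USE_VERSION 34
#include <fuse_lowlevel.h>

/* Installed as the .write_buf member of struct fuse_lowlevel_ops. */
static void no_copy_write_buf(fuse_req_t req, fuse_ino_t ino,
                              struct fuse_bufvec *in_buf, off_t off,
                              struct fuse_file_info *fi)
{
        (void) ino;
        (void) off;
        (void) fi;

        /* Acknowledge the full write immediately, without copying the
         * data out of the FUSE buffer, so the server never touches the
         * page contents. */
        fuse_reply_write(req, fuse_buf_size(in_buf));
}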