On Fri, Jul 18, 2025 at 07:10:37PM +0200, Bernd Schubert wrote:
>
>
> On 7/18/25 01:27, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@xxxxxxxxxx>
> >
> > The fuse_request_{send,end} tracepoints capture the value of
> > req->in.h.unique in the trace output.  It would be really nice if we
> > could use this to match a request to its response for debugging and
> > latency analysis, but the call to trace_fuse_request_send occurs before
> > the unique id has been set:
> >
> > fuse_request_send: connection 8388608 req 0 opcode 1 (FUSE_LOOKUP) len 107
> > fuse_request_end: connection 8388608 req 6 len 16 error -2
> >
> > Move the callsites to trace_fuse_request_send to after the unique id has
> > been set, or right before we decide to cancel a request having not set
> > one.
>
> Sorry, my fault, I have a branch for that already. Just occupied and
> then just didn't send v4.
>
> https://lore.kernel.org/all/20250403-fuse-io-uring-trace-points-v3-0-35340aa31d9c@xxxxxxx/

(Aha, that was before I started paying attention to the fuse patches on
fsdevel.)

> The updated branch is here
>
> https://github.com/bsbernd/linux/commits/fuse-io-uring-trace-points/
>
> Objections if we go with that version, as it adds a few more tracepoints
> and removes the lock to get the unique ID.

Let me look through the branch --

 * fuse: Make the fuse unique value a per-cpu counter

Is there any reason you didn't use percpu_counter_init()?  It does the
same per-cpu batching that (I think) your version does.

 * fuse: Set request unique on allocation
 * fuse: {io-uring} Avoid _send code dup

Looks good,
Reviewed-by: "Darrick J. Wong" <djwong@xxxxxxxxxx>

 * fuse: fine-grained request ftraces

Are these three new tracepoints exactly identical except in name?  If
you declare an event class for them, that will save a lot of memory (~5K
per tracepoint according to rostedt) over defining them individually.

 * per cpu cntr fix

I think you can avoid this if you use the kernel struct percpu_counter.

--D
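
For illustration only: a minimal userspace sketch of the per-cpu batching
idea behind struct percpu_counter.  This is not code from the branch under
review and not the kernel implementation.  All model_* names and
MODEL_NR_SLOTS are hypothetical, array slots stand in for CPUs, and the
locking and preemption handling that the kernel needs are left out.  Each
slot accumulates a local delta and folds it into the shared value only once
the delta reaches the batch size, so the shared value is touched rarely.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_NR_SLOTS 4	/* hypothetical stand-in for the CPU count */

struct model_counter {
	int64_t count;			/* shared value, updated only on a fold */
	int32_t local[MODEL_NR_SLOTS];	/* per-slot deltas not yet folded in */
	int32_t batch;			/* fold threshold */
};

void model_counter_init(struct model_counter *c, int64_t start, int32_t batch)
{
	assert(batch > 0);
	c->count = start;
	c->batch = batch;
	for (int i = 0; i < MODEL_NR_SLOTS; i++)
		c->local[i] = 0;
}

/* Add to one slot; fold into the shared value once the batch is reached. */
void model_counter_add(struct model_counter *c, int slot, int32_t amount)
{
	int32_t pending = c->local[slot] + amount;

	if (abs(pending) >= c->batch) {
		/* The kernel takes a lock around this shared update. */
		c->count += pending;
		c->local[slot] = 0;
	} else {
		c->local[slot] = pending;
	}
}

/* Cheap read: may lag behind by the deltas still held in the slots. */
int64_t model_counter_read(const struct model_counter *c)
{
	return c->count;
}

/* Exact read: shared value plus every unfolded per-slot delta. */
int64_t model_counter_sum(const struct model_counter *c)
{
	int64_t total = c->count;

	for (int i = 0; i < MODEL_NR_SLOTS; i++)
		total += c->local[i];
	return total;
}
```

In this model the cheap read can trail the exact sum by up to (batch - 1)
per slot.  That slack matters for anything that needs a value which is
unique or exact at the time of the call.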