On Wed, Sep 03, 2025 at 04:05:06PM -0700, Joanne Koong wrote: > On Wed, Sep 3, 2025 at 11:47 AM Darrick J. Wong <djwong@xxxxxxxxxx> wrote: > > > > On Wed, Sep 03, 2025 at 08:54:05AM -0700, Darrick J. Wong wrote: > > > On Wed, Sep 03, 2025 at 05:48:46PM +0200, Miklos Szeredi wrote: > > > > On Tue, 26 Aug 2025 at 20:52, Darrick J. Wong <djwong@xxxxxxxxxx> wrote: > > > > > > Sorry for the late reply on this. > > > > > > Hrmm. I was thinking that it would be very nice to have > > > > > fuse_request_{send,end} bracket the start and end of a fuse request, > > > > > even if we kill it immediately. > > Oh interesting, I didn't realize there was a trace_fuse_request_end(). > I get now why you wanted the trace_fuse_request_send() for the > !fiq->connected case, for symmetry. I was thinking of it from the > client userspace side (one idea I have, which idk if it is actually > that useful or not, is building some sort of observability "wireshark > for fuse" tool that gives more visibility into the requests being sent > to/from the server like their associated kernel vs libfuse timestamps > to know where the latency is happening. this issue has come up in prod > a few times when debugging slow requests); from this perspective, it > seemed confusing to see requests show up that were never in good faith > attempted to be sent to the server. Well at first I was all "Ehhh, why are all the request ids zero??". Later when I started debugging corruption problems in fuse2fs I realized that it would be really helpful to be able to match the start and end of a particular fuse request to all the stuff that happened in between. *So far* I can mostly just watch the raw ftrace feed to see what's going on without going crazy, but yes, a wireshark plugin would be nice. Or someone writing fuseslow. ;) > If you want to preserve the symmetry, maybe one idea is only doing the > trace_fuse_request_end() if the req.in.h.unique code is valid? That > would skip doing the trace for the !fiq->connected case. That will actually cause some loss of trace data -- trace_fuse_request_end also captures the error code of the aborted commands, so you can tell from the -103 error code that something really bad happened. --D > Thanks, > Joanne > > > > > > > > I'm fine with that, and would possibly simplify some code that checks > > > > for an error and calls ->end manually. But that makes it a > > > > non-trivial change unfortunately. > > > > > > Yes, and then you have to poke the idr structure for a request id even > > > if that caller already knows that the connection's dead. That seems > > > like a waste of cycles, but OTOH maybe we just don't care? > > > > > > (Though I suppose seeing more than one request id of zero in the trace > > > output implies very strongly that the connection is really dead) > > > > Well.... given the fuse_iqueue::reqctr usage, the first request gets a > > unique id of 2 and increments by two thereafter. So it's a pretty safe > > bet that unique==0 means the request isn't actually being sent, or that > > your very lucky in that your fuse server has been running for a /very/ > > long time. > > > > I think I just won't call trace_fuse_request_send for requests that are > > immediately ended; and I'll refactor the req->in.h.unique assignment > > into a helper so that virtiofs and friends can call the helper and get > > the tracepoint automatically. > > > > For example, fuse_dev_queue_req now becomes: > > > > > > static inline void fuse_request_assign_unique_locked(struct fuse_iqueue *fiq, > > struct fuse_req *req) > > { > > if (req->in.h.opcode != FUSE_NOTIFY_REPLY) > > req->in.h.unique = fuse_get_unique_locked(fiq); > > > > /* tracepoint captures in.h.unique and in.h.len */ > > trace_fuse_request_send(req); > > } > > > > static void fuse_dev_queue_req(struct fuse_iqueue *fiq, struct fuse_req *req) > > { > > spin_lock(&fiq->lock); > > if (fiq->connected) { > > fuse_request_assign_unique_locked(fiq, req); > > list_add_tail(&req->list, &fiq->pending); > > fuse_dev_wake_and_unlock(fiq); > > } else { > > spin_unlock(&fiq->lock); > > req->out.h.error = -ENOTCONN; > > clear_bit(FR_PENDING, &req->flags); > > fuse_request_end(req); > > } > > } > > > > --D >