Re: [PATCH 3/7] fuse: capture the unique id of fuse commands being sent

Joanne Koong <joannelkoong@xxxxxxxxx> · Wed, 3 Sep 2025 16:05:06 -0700

On Wed, Sep 3, 2025 at 11:47 AM Darrick J. Wong <djwong@xxxxxxxxxx> wrote:
>
> On Wed, Sep 03, 2025 at 08:54:05AM -0700, Darrick J. Wong wrote:
> > On Wed, Sep 03, 2025 at 05:48:46PM +0200, Miklos Szeredi wrote:
> > > On Tue, 26 Aug 2025 at 20:52, Darrick J. Wong <djwong@xxxxxxxxxx> wrote:
> > >

Sorry for the late reply on this.

> > > > Hrmm.  I was thinking that it would be very nice to have
> > > > fuse_request_{send,end} bracket the start and end of a fuse request,
> > > > even if we kill it immediately.

Oh interesting, I didn't realize there was a trace_fuse_request_end().
I get now why you wanted the trace_fuse_request_send() for the
!fiq->connected case, for symmetry. I was thinking of it from the
client userspace side (one idea I have, which idk if it is actually
that useful or not, is building some sort of observability "wireshark
for fuse" tool that gives more visibility into the requests being sent
to/from the server like their associated kernel vs libfuse timestamps
to know where the latency is happening. this issue has come up in prod
a few times when debugging slow requests); from this perspective, it
seemed confusing to see requests show up that were never in good faith
attempted to be sent to the server.

If you want to preserve the symmetry, maybe one idea is only doing the
trace_fuse_request_end() if the req.in.h.unique code is valid? That
would skip doing the trace for the !fiq->connected case.

Thanks,
Joanne
> > >
> > > I'm fine with that, and would possibly simplify some code that checks
> > > for an error and calls ->end manually.  But that makes it a
> > > non-trivial change unfortunately.
> >
> > Yes, and then you have to poke the idr structure for a request id even
> > if that caller already knows that the connection's dead.  That seems
> > like a waste of cycles, but OTOH maybe we just don't care?
> >
> > (Though I suppose seeing more than one request id of zero in the trace
> > output implies very strongly that the connection is really dead)
>
> Well.... given the fuse_iqueue::reqctr usage, the first request gets a
> unique id of 2 and increments by two thereafter.  So it's a pretty safe
> bet that unique==0 means the request isn't actually being sent, or that
> your very lucky in that your fuse server has been running for a /very/
> long time.
>
> I think I just won't call trace_fuse_request_send for requests that are
> immediately ended; and I'll refactor the req->in.h.unique assignment
> into a helper so that virtiofs and friends can call the helper and get
> the tracepoint automatically.
>
> For example, fuse_dev_queue_req now becomes:
>
>
> static inline void fuse_request_assign_unique_locked(struct fuse_iqueue *fiq,
>                                                      struct fuse_req *req)
> {
>         if (req->in.h.opcode != FUSE_NOTIFY_REPLY)
>                 req->in.h.unique = fuse_get_unique_locked(fiq);
>
>         /* tracepoint captures in.h.unique and in.h.len */
>         trace_fuse_request_send(req);
> }
>
> static void fuse_dev_queue_req(struct fuse_iqueue *fiq, struct fuse_req *req)
> {
>         spin_lock(&fiq->lock);
>         if (fiq->connected) {
>                 fuse_request_assign_unique_locked(fiq, req);
>                 list_add_tail(&req->list, &fiq->pending);
>                 fuse_dev_wake_and_unlock(fiq);
>         } else {
>                 spin_unlock(&fiq->lock);
>                 req->out.h.error = -ENOTCONN;
>                 clear_bit(FR_PENDING, &req->flags);
>                 fuse_request_end(req);
>         }
> }
>
> --D