Re: [PATCH 3/7] fuse: capture the unique id of fuse commands being sent

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Sep 03, 2025 at 04:05:06PM -0700, Joanne Koong wrote:
> On Wed, Sep 3, 2025 at 11:47 AM Darrick J. Wong <djwong@xxxxxxxxxx> wrote:
> >
> > On Wed, Sep 03, 2025 at 08:54:05AM -0700, Darrick J. Wong wrote:
> > > On Wed, Sep 03, 2025 at 05:48:46PM +0200, Miklos Szeredi wrote:
> > > > On Tue, 26 Aug 2025 at 20:52, Darrick J. Wong <djwong@xxxxxxxxxx> wrote:
> > > >
> 
> Sorry for the late reply on this.
> 
> > > > > Hrmm.  I was thinking that it would be very nice to have
> > > > > fuse_request_{send,end} bracket the start and end of a fuse request,
> > > > > even if we kill it immediately.
> 
> Oh interesting, I didn't realize there was a trace_fuse_request_end().
> I get now why you wanted the trace_fuse_request_send() for the
> !fiq->connected case, for symmetry. I was thinking of it from the
> client userspace side (one idea I have, which idk if it is actually
> that useful or not, is building some sort of observability "wireshark
> for fuse" tool that gives more visibility into the requests being sent
> to/from the server like their associated kernel vs libfuse timestamps
> to know where the latency is happening. this issue has come up in prod
> a few times when debugging slow requests); from this perspective, it
> seemed confusing to see requests show up that were never in good faith
> attempted to be sent to the server.

Well at first I was all "Ehhh, why are all the request ids zero??".

Later when I started debugging corruption problems in fuse2fs I realized
that it would be really helpful to be able to match the start and end of
a particular fuse request to all the stuff that happened in between.

*So far* I can mostly just watch the raw ftrace feed to see what's going
on without going crazy, but yes, a wireshark plugin would be nice.  Or
someone writing fuseslow. ;)

> If you want to preserve the symmetry, maybe one idea is only doing the
> trace_fuse_request_end() if the req.in.h.unique code is valid? That
> would skip doing the trace for the !fiq->connected case.

That will actually cause some loss of trace data --
trace_fuse_request_end also captures the error code of the aborted
commands, so you can tell from the -103 error code that something really
bad happened.

--D

> Thanks,
> Joanne
> > > >
> > > > I'm fine with that, and would possibly simplify some code that checks
> > > > for an error and calls ->end manually.  But that makes it a
> > > > non-trivial change unfortunately.
> > >
> > > Yes, and then you have to poke the idr structure for a request id even
> > > if that caller already knows that the connection's dead.  That seems
> > > like a waste of cycles, but OTOH maybe we just don't care?
> > >
> > > (Though I suppose seeing more than one request id of zero in the trace
> > > output implies very strongly that the connection is really dead)
> >
> > Well.... given the fuse_iqueue::reqctr usage, the first request gets a
> > unique id of 2 and increments by two thereafter.  So it's a pretty safe
> > bet that unique==0 means the request isn't actually being sent, or that
> > your very lucky in that your fuse server has been running for a /very/
> > long time.
> >
> > I think I just won't call trace_fuse_request_send for requests that are
> > immediately ended; and I'll refactor the req->in.h.unique assignment
> > into a helper so that virtiofs and friends can call the helper and get
> > the tracepoint automatically.
> >
> > For example, fuse_dev_queue_req now becomes:
> >
> >
> > static inline void fuse_request_assign_unique_locked(struct fuse_iqueue *fiq,
> >                                                      struct fuse_req *req)
> > {
> >         if (req->in.h.opcode != FUSE_NOTIFY_REPLY)
> >                 req->in.h.unique = fuse_get_unique_locked(fiq);
> >
> >         /* tracepoint captures in.h.unique and in.h.len */
> >         trace_fuse_request_send(req);
> > }
> >
> > static void fuse_dev_queue_req(struct fuse_iqueue *fiq, struct fuse_req *req)
> > {
> >         spin_lock(&fiq->lock);
> >         if (fiq->connected) {
> >                 fuse_request_assign_unique_locked(fiq, req);
> >                 list_add_tail(&req->list, &fiq->pending);
> >                 fuse_dev_wake_and_unlock(fiq);
> >         } else {
> >                 spin_unlock(&fiq->lock);
> >                 req->out.h.error = -ENOTCONN;
> >                 clear_bit(FR_PENDING, &req->flags);
> >                 fuse_request_end(req);
> >         }
> > }
> >
> > --D
> 




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux