Re: [PATCH v12 06/14] unwind_user/deferred: Add deferred unwinding interface

Steven Rostedt <rostedt@xxxxxxxxxxx> · Wed, 2 Jul 2025 15:21:11 -0400

On Wed, 2 Jul 2025 15:12:45 -0400
Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> wrote:

> > But you are missing one more thing that the trace can use, and that's
> > the time sequence. As soon as the same thread has a new id you can
> > assume all the older user space traces are not applicable for any new
> > events for that thread, or any other thread with the same thread ID.  
> 
> In order for the scheme you describe to work, you need:
> 
> - instrumentation of task lifetime (exit/fork+clone),
> - be sure that the events related to that instrumentation were not
>    dropped.
> 
> I'm not sure about ftrace, but in LTTng enabling instrumentation of
> task lifetime is entirely up to the user.

It has nothing to do with task lifetime. If you see a deferred request
with an id of 1 from task 8888, and then later you see either a deferred
request or a stack trace with an id other than 1 for task 8888, you can
say that all events before that point are no longer eligible for new
deferred stack traces.
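
To make that rule concrete, here is a minimal sketch of the bookkeeping a
trace reader could do. The event tuples, the "request"/"stack" kinds and
the attach_deferred() name are made up for illustration; this is not the
ftrace or LTTng format, and it assumes events arrive in time order:

```python
# Hypothetical post-processing sketch: the event layout is an assumption,
# not an actual trace format.
from collections import defaultdict


def attach_deferred(events):
    """Pair deferred requests with their user stack trace.

    events: iterable of (kind, tid, unwind_id, payload) in time order,
    where kind is "request" (an event asking for a deferred user stack
    trace) or "stack" (the deferred user stack trace itself).
    Returns a list of (request_payload, stack_payload) pairs.
    """
    current_id = {}              # tid -> id last seen for that tid
    pending = defaultdict(list)  # tid -> requests still waiting for a stack
    matched = []

    for kind, tid, unwind_id, payload in events:
        if current_id.get(tid) != unwind_id:
            # A different id for this tid: everything recorded before now
            # is no longer eligible for a later deferred stack trace.
            pending[tid].clear()
            current_id[tid] = unwind_id
        if kind == "request":
            pending[tid].append(payload)
        elif kind == "stack":
            matched.extend((req, payload) for req in pending[tid])
            pending[tid].clear()
    return matched
```

Note that nothing in there looks at fork or exit events; the id change
alone retires the older requests.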

> 
> And even if it's enabled, events can be discarded (e.g. buffer full).

The only case where this fails is the following: you see a deferred
request with id 1 for task 8888, then you start dropping events. While
events are being dropped, task 8888 exits and a new task appears with
task id 8888, and it too has a deferred request with id 1. Then you
start picking up events again and see a deferred stack trace for the new
task 8888 whose id is 1. In that case, you lose.

But other than that exact scenario, it should not get confused.
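
As a rough illustration of that one losing sequence, here is a small
simulation. The event tuples and the match() helper are hypothetical, and
the sketch assumes (as in the scenario above) that the new task's first
id is also 1:

```python
def match(events):
    """Pair requests with deferred stacks using only (tid, id)."""
    pending = {}  # tid -> (id, [request names still waiting for a stack])
    pairs = []
    for kind, tid, uid, name in events:
        state = pending.get(tid)
        if state is None or state[0] != uid:
            # Different id for this tid: older requests are retired.
            state = pending[tid] = (uid, [])
        if kind == "request":
            state[1].append(name)
        else:  # "stack"
            pairs += [(req, name) for req in state[1]]
            state[1].clear()
    return pairs


# What really happened: the old task 8888 exits and a new task reuses
# the same tid, and its first id is 1 as well.
full = [
    ("request", 8888, 1, "old-req"),
    ("stack", 8888, 1, "old-stack"),
    ("request", 8888, 1, "new-req"),
    ("stack", 8888, 1, "new-stack"),
]
# What the reader sees if the two middle events are dropped.
lossy = [full[0], full[3]]

print(match(full))   # [('old-req', 'old-stack'), ('new-req', 'new-stack')]
print(match(lossy))  # [('old-req', 'new-stack')]
```

Only the lossy stream pairs the old request with the new task's stack,
and it takes the drop window, the tid reuse and the matching id all lining
up to get there.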

> 
> > 
> > Thus the only issue that can truly be a problem is if you have missed
> > events where thread id wraps around. I guess that could be possible if
> > a long running task finally exits and it's thread id is reused
> > immediately. Is that a common occurrence?  
> 
> You just need a combination of thread ID re-use and either no
> instrumentation of task lifetime or events discarded to trigger this.

Again, it's a matter of seeing a new request with another id for the
same task; once you do, you don't need to worry about it. You don't even
need to look at fork and exit events.

> Even if it's not so frequent, at large scale and in production, I
> suspect that this will happen quite often.

Really? Even though it takes the exact scenario I explained above?

-- Steve