Re: [PATCH v10 07/11] perf: Support deferred user callchains for per CPU events

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 13 Jun 2025 22:46:12 -0400
Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:

> +/*
> + * Deferred unwinding callback for per CPU events.
> + * Note, the request for the deferred unwinding may have happened
> + * on a different CPU.
> + */
> +static void perf_event_deferred_cpu(struct unwind_work *work,
> +				    struct unwind_stacktrace *trace, u64 timestamp)
> +{
> +	struct perf_unwind_deferred *defer =
> +		container_of(work, struct perf_unwind_deferred, unwind_work);
> +	struct perf_unwind_cpu *cpu_events, *cpu_unwind;
> +	struct perf_event *event;
> +	int cpu;
> +
> +	guard(rcu)();
> +	guard(preempt)();
> +
> +	cpu = smp_processor_id();
> +	cpu_events = rcu_dereference(defer->cpu_events);
> +	cpu_unwind = &cpu_events[cpu];
> +
> +	WRITE_ONCE(cpu_unwind->processing, 1);
> +	/*
> +	 * Make sure the above is seen before the event->unwind_deferred
> +	 * is checked. This matches the mb() in rcuwait_rcu_wait_event() in
> +	 * perf_remove_unwind_deferred().
> +	 */
> +	smp_mb();
> +
> +	list_for_each_entry_rcu(event, &cpu_unwind->list, unwind_list) {
> +		/* If unwind_deferred is NULL the event is going away */
> +		if (unlikely(!event->unwind_deferred))
> +			continue;
> +		perf_event_callchain_deferred(event, trace, timestamp);
> +		/* Only the first CPU event gets the trace */
> +		break;
> +	}
> +

Hmm, I think I need a smp_mb() here too.

> +	WRITE_ONCE(cpu_unwind->processing, 0);
> +	rcuwait_wake_up(&cpu_unwind->pending_unwind_wait);
> +}

The first smp_mb() is for synchronizing removing of the event from
perf_remove_unwind_deferred() that has:

	event->unwind_deferred = NULL;

	/*
	 * Make sure perf_event_deferred_cpu() is done with this event.
	 * That function will set cpu_unwind->processing and then
	 * call smp_mb() before iterating the list of its events.
	 * If the event's unwind_deferred is NULL, it will be skipped.
	 * The smp_mb() in that function matches the mb() in
	 * rcuwait_wait_event().
	 */
	rcuwait_wait_event(&cpu_unwind->pending_unwind_wait,
				   !cpu_unwind->processing, TASK_UNINTERRUPTIBLE);


So that the unwind_deferred setting to NULL is seen before the
cpu_unwind->processing is checked. But I think, in theory, without the
smp_mb() before the clearing of the cpu_unwind->procssing that it can
be seen before the unwind_deferred is read.

  CPU 0                                    CPU 1
  -----                                    -----
read event->unwind_deferred

                                        write NULL > event->unwind_deferre
                                        smp_mb() (in rcuwait)

CPU writes 0 > cpu_unwind->processing (re-ordered)

                                        reads cpu_unwind->processing == 0
                                        Starts to free event

Executes perf_event_callchain_deferred()


I'll add another smp_mb() to be safe in v11.

-- Steve








[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux