Re: [PATCH bpf-next v2 3/4] bpf: task work scheduling kfuncs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Aug 27, 2025 at 6:34 PM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Wed, Aug 20, 2025 at 11:33 AM Andrii Nakryiko
> <andrii.nakryiko@xxxxxxxxx> wrote:
> >
> >
> > I don't see why we can't consolidate internals of all these async
> > callbacks while maintaining a user-facing API that makes most sense
> > for each specific case. For task_work (and yes, I think for timers it
> > would make more sense as well), we are talking about a single
> > conceptual operation: just schedule a callback. So it makes sense to
> > have a single kfunc that expresses that.
> >
> > Having a split into init, set_callback, kick is unnecessary and
> > cumbersome.
>
> For timers the split api of init vs start_timer is mandatory,
> since it's common usage to start and stop the same timer before
> it fires. So the moving init operation (which allocates) and
> set_callback operation (that needs a pointer to callback)
> out of start/stop is a good thing, since the code that does start/stop
> may be far away and in a different file than init/set_callback.

Ok, that makes sense, thanks for filling in the context.

> That's how networking stack is written.
> Where I screwed things up is when I made bpf_timer_cancel() to also
> clear the callback to NULL. Not sure what I was thinking.
> The api wasn't modelled by existing kernel timer api, but
> by how the networking stack is using timers.
> Most of the time the started timer will be cancelled before firing.
> bpf_wq just followed bpf_timer api pattern.
> But, unlike timers, wq and task_work actually expect the callback
> to be called. It's rare to cancel wq/task_work.
> So for them the single 'just schedule' kfunc that allocates,
> sets callback, and schedules makes sense.
> For wq there is no bpf_wq_cancel. So there is a difference already.
> It's fine for bpf_timers, bpf_wq, bpf_task_work to have
> different set of kfuncs, since the usage pattern is different.
>

agreed

> Regarding state machine vs spinlock I think we should see
> through this approach with state machine.
> And if we convince ourselves that after all reviews it
> looks to be bug free, we should try to convert timer/wq
> to state machine as well to have a shared logic.

yep, completely agree

> Note irq_work from nmi is mandatory for timers too
> regardless of state machine or rqspinlock,
> since timers/wq take more locks inside,
> We cannot rqspinlock() + hrtimer_start() unconditionally.

Ack. Once we unify all this, we can invest a bit more effort trying to
optimize away the need for irq_work when not under NMI (and maybe not
in irq). For timers this seems more important than for task_work, as
timers can be used significantly more frequently.

But one step at a time, let's wait for Mykyta to come back from
vacation and update the patch set.





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux