Re: [RFC bpf-next v1 2/4] bpf: Support cookie for linked-based struct_ops attachment

Amery Hung <ameryhung@xxxxxxxxx> · Mon, 14 Jul 2025 14:02:19 -0700

On Mon, Jul 14, 2025 at 1:46 PM Andrii Nakryiko
<andrii.nakryiko@xxxxxxxxx> wrote:
>
> On Fri, Jul 11, 2025 at 2:38 PM Amery Hung <ameryhung@xxxxxxxxx> wrote:
> >
> > On Fri, Jul 11, 2025 at 1:21 PM Alexei Starovoitov
> > <alexei.starovoitov@xxxxxxxxx> wrote:
> > >
> > > On Fri, Jul 11, 2025 at 12:29 PM Amery Hung <ameryhung@xxxxxxxxx> wrote:
> > > >
> > > > On Fri, Jul 11, 2025 at 11:41 AM Andrii Nakryiko
> > > > <andrii.nakryiko@xxxxxxxxx> wrote:
> > > > >
> > > > > On Thu, Jul 10, 2025 at 2:00 PM Amery Hung <ameryhung@xxxxxxxxx> wrote:
> > > > > >
> > > > > > On Thu, Jul 10, 2025 at 12:47 PM Martin KaFai Lau <martin.lau@xxxxxxxxx> wrote:
> > > > > > >
> > > > > > > On 7/10/25 11:39 AM, Amery Hung wrote:
> > > > > > > >> On 7/8/25 4:08 PM, Amery Hung wrote:
> > > > > > > >>> @@ -906,6 +904,10 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
> > > > > > > >>>                goto unlock;
> > > > > > > >>>        }
> > > > > > > >>>
> > > > > > > >>> +     err = bpf_struct_ops_prepare_attach(st_map, 0);
> > > > > > > >> A follow-up on the "using the map->id as the cookie" comment in the cover
> > > > > > > >> letter. I meant to use the map->id here instead of 0. If the cookie is intended
> > > > > > > >> to identify a particular struct_ops instance (i.e., the struct_ops map), then
> > > > > > > >> map->id should be a good fit, and it is automatically generated by the kernel
> > > > > > > >> during the map creation. As a result, I suspect that most of the changes in
> > > > > > > >> patch 1 and patch 2 will not be needed.
> > > > > > > >>
> > > > > > > > Do you mean keep using cookie as the mechanism to associate programs,
> > > > > > > > but for struct_ops the cookie will be map->id (i.e.,
> > > > > > > > bpf_get_attah_cookie() in struct_ops will return map->id)?
> > > > > > >
> > > > > > > I meant to use the map->id as the bpf_cookie stored in the bpf_tramp_run_ctx.
> > > > > > > Then there is no need for user space to generate a unique cookie during
> > > > > > > link_create. The kernel has already generated a unique ID in the map->id. The
> > > > > > > map->id is available during the bpf_struct_ops_map_update_elem(). Then there is
> > > > > > > also no need to distinguish between SEC(".struct_ops") vs
> > > > > > > SEC(".struct_ops.link"). Most of the patch 1 and patch 2 will not be needed.
> > > > > > >
> > > > > > > A minor detail: note that the same struct ops program can be used in different
> > > > > > > trampolines. Thus, to be specific, the bpf cookie is stored in the trampoline.
> > > > > > >
> > > > > > > If the question is about bpf global variable vs bpf cookie, yeah, I think using
> > > > > > > a bpf global variable should also work. The global variable can be initialized
> > > > > > > before libbpf's bpf_map__attach_struct_ops(). At that time, the map->id should
> > > > > > > be known already. I don't have a strong opinion on reusing the bpf cookie in the
> > > > > > > struct ops trampoline. No one is using it now, so it is available to be used.
> > > > > > > Exposing BPF_FUNC_get_attach_cookie for struct ops programs is pretty cheap
> > > > > > > also. Using bpf cookie to allow the struct ops program to tell which struct_ops
> > > > > > > map is calling it seems to fit well also after sleeping on it a bit. bpf global
> > > > > > > variable will also break if a bpf_prog.o has more than one SEC(".struct_ops").
> > > > > > >
> > > > > >
> > > > > > While both of them work, using cookie instead of global variable is
> > > > > > one less thing for the user to take care of (i.e., slightly better
> > > > > > usability).
> > > > > >
> > > > > > With the approach you suggested, to not mix the existing semantics of
> > > > > > bpf cookie, I think a new struct_ops kfuncs is needed to retrieve the
> > > > >
> > > > > yes, if absolutely necessary, sure, let's reuse the spot that is
> > > > > reserved for cookie inside the trampoline, but let's not expose this
> > > > > as real BPF cookie (i.e., let's not allow bpf_get_attach_cookie()
> > > > > helper for struct_ops), because BPF cookie is meant to be fully user
> > > > > controllable and used for whatever they deem necessary. Not
> > > > > necessarily to just identify the struct_ops map. So it will be a huge
> > > > > violation to just pre-define what BPF cookie value is for struct_ops.
> > > > >
> > > >
> > > > We had some offline discussions and figured out this will not work well.
> > > >
> > > > sched_ext users already call scx kfuncs in global subprograms. If we
> > > > choose to add bpf_get_struct_ops_id() to get the id to be passed to
> > > > scx kfuncs, it will force the user to create two sets of the same
> > > > global subprog. The one called by struct_ops that calls
> > > > bpf_get_struct_ops_id() and tracing programs that calls
> > > > bpf_get_attach_cookie().
> > >
> > > Can we put cookie into map_extra during st_ops map creation time and
> > > later copy it into actual cookie place in a trampoline?
> >
> > It should work. Currently, it makes sense to have cookie in struct_ops
> > map as all struct_ops implementers (hid, tcp cc, qdisc, sched_ext) do
> > not allow multi-attachment of the same map. A struct_ops map is
> > effectively an unique attachment for now.
>
> Is there any conceptual reason why struct_ops map shouldn't be allowed
> to be attached to multiple places? If not, should we try to lift this
> restriction and not invent struct_ops-specific BPF cookie APIs?
>
> From libbpf POV, struct_ops map is created implicitly from the
> struct_ops variable. There is no map_flags field there. We'll need to
> add new special APIs and/or conventions just to be able to set
> map_flags.
>
> >
> > Additionally, we will be able to support cookie for non-link
> > struct_ops with this way.
> >
> > This approach will not block future effort to support link-specific
> > cookie if there is such a use case. We can revisit this patchset then.
>
> It will create two ways to specify BPF cookie for struct_ops: (legacy
> and special way) through map_flags and common one through the
> LINK_CREATE command (and I guess we'd need to reject LINK_CREATE if
> cookie was already set through map_flags, right?). Why confuse users
> like that?
>
> From what I understand, the problem is that currently struct_ops map's
> BPF trampoline(s) are created a bit too early, before the attachment
> step. How hard would it be to move trampoline creation to an actual
> attachment time? Should we seriously consider this before we invent
> new struct_ops-specific exceptions for BPF cookie?

Yeah...

I realized that while map_extra works, the cookie needs to be passed
to the kernel in map creation time and will create the confusion you
mentioned in libbpf.

I am fixing the patchset by moving trampoline and ksyms to
bpf_struct_ops_link. It shouldn't complicate struct_ops code too much
(finger cross).