Re: [PATCH bpf-next 1/4] bpf: Allow get_func_[arg|arg_cnt] helpers in raw tracepoint programs

Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> · Wed, 30 Apr 2025 09:53:15 -0700

On Wed, Apr 30, 2025 at 8:55 AM Leon Hwang <leon.hwang@xxxxxxxxx> wrote:
>
>
>
> On 2025/4/30 20:43, Kafai Wan wrote:
> > On Wed, Apr 30, 2025 at 10:46 AM Alexei Starovoitov
> > <alexei.starovoitov@xxxxxxxxx> wrote:
> >>
> >> On Sat, Apr 26, 2025 at 9:00 AM KaFai Wan <mannkafai@xxxxxxxxx> wrote:
> >>>
>
> [...]
>
> >>> @@ -2312,7 +2322,7 @@ void __bpf_trace_run(struct bpf_raw_tp_link *link, u64 *args)
> >>>  #define REPEAT(X, FN, DL, ...)         REPEAT_##X(FN, DL, __VA_ARGS__)
> >>>
> >>>  #define SARG(X)                u64 arg##X
> >>> -#define COPY(X)                args[X] = arg##X
> >>> +#define COPY(X)                args[X + 1] = arg##X
> >>>
> >>>  #define __DL_COM       (,)
> >>>  #define __DL_SEM       (;)
> >>> @@ -2323,9 +2333,10 @@ void __bpf_trace_run(struct bpf_raw_tp_link *link, u64 *args)
> >>>         void bpf_trace_run##x(struct bpf_raw_tp_link *link,             \
> >>>                               REPEAT(x, SARG, __DL_COM, __SEQ_0_11))    \
> >>>         {                                                               \
> >>> -               u64 args[x];                                            \
> >>> +               u64 args[x + 1];                                        \
> >>> +               args[0] = x;                                            \
> >>>                 REPEAT(x, COPY, __DL_SEM, __SEQ_0_11);                  \
> >>> -               __bpf_trace_run(link, args);                            \
> >>> +               __bpf_trace_run(link, args + 1);                        \
> >>
> >> This is neat, but what is this for?
> >> The program that attaches to a particular raw_tp knows what it is
> >> attaching to and how many arguments are there,
> >> so bpf_get_func_arg_cnt() is a 5th wheel.
> >>
> >> If the reason is "for completeness" then it's not a good reason
> >> to penalize performance. Though it's just an extra 8 bytes of stack
> >> and a single store of a constant.
> >>
> > If we try to capture all arguments of a specific raw_tp in tracing programs,
> > we first obtain the argument count from the format file in debugfs or BTF
> > and pass this count to the BPF program via .bss section or cookie (if
> > available).
> >
> > If we store the count in ctx and get it via get_func_arg_cnt helper in
> > the BPF program,
> > a) It's easier and more efficient to get the argument count in the BPF program.
> > b) A single BPF program could capture arguments for multiple raw_tps,
> > reducing the number of BPF programs needed for large-scale tracing.
> >
>
>
> bpf_get_func_arg() will be very helpful for bpfsnoop[1] when tracing tp_btf.
>
> In bpfsnoop, it can generate a small snippet of bpf instructions to use
> bpf_get_func_arg() for retrieving and filtering arguments. For example,
> with the netif_receive_skb tracepoint, bpfsnoop can use
> bpf_get_func_arg() to filter the skb argument using pcap-filter(7)[2] or
> a custom attribute-based filter. This will allow bpfsnoop to trace
> multiple tracepoints using a single bpf program code.

I doubt you thought it through end to end.
When a tracepoint prog attaches, we have this check:
        /*
         * check that program doesn't access arguments beyond what's
         * available in this tracepoint
         */
        if (prog->aux->max_ctx_offset > btp->num_args * sizeof(u64))
                return -EINVAL;

So you cannot have a single bpf prog attached to many tracepoints
to read many arguments as-is.
You can hack around that limit with probe_read,
but the values won't be trusted and you won't be able to pass
such untrusted pointers into skb and other helpers/kfuncs.
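
To make the limit concrete, here is a minimal userspace sketch of that
attach-time check. It is not kernel code: the struct names, the
sketch_* functions and the tracepoint names are hypothetical stand-ins,
and it assumes max_ctx_offset is the end offset of the highest u64
argument the program reads from ctx.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins holding only the fields the check uses. */
struct sketch_prog {
	uint32_t max_ctx_offset;	/* end of the furthest ctx access, in bytes */
};

struct sketch_raw_tp {
	const char *name;		/* illustrative tracepoint name */
	uint32_t num_args;		/* number of u64 arguments it passes */
};

/*
 * Assumption: a prog whose highest access is ctx[idx] ends up with
 * max_ctx_offset == (idx + 1) * sizeof(u64).
 */
static uint32_t sketch_max_ctx_offset(uint32_t highest_arg_idx)
{
	return (uint32_t)((highest_arg_idx + 1) * sizeof(uint64_t));
}

/* Mirrors the comparison quoted above: reject reads past the tp's args. */
static int sketch_attach(const struct sketch_prog *prog,
			 const struct sketch_raw_tp *btp)
{
	if (prog->max_ctx_offset > btp->num_args * sizeof(uint64_t))
		return -EINVAL;
	return 0;
}
```

With these assumptions, a prog that reads ctx[2] has a max_ctx_offset of
24 bytes. It attaches to a tracepoint with three arguments, but fails
with -EINVAL on one that has a single argument, which is why one prog
that reads many arguments cannot be shared across tracepoints with
fewer arguments.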