On Thu, Feb 20, 2025 at 10:55 AM Willem de Bruijn <willemdebruijn.kernel@xxxxxxxxx> wrote: > > Jason Xing wrote: > > This patch introduces a new callback in tcp_tx_timestamp() to correlate > > tcp_sendmsg timestamp with timestamps from other tx timestamping > > callbacks (e.g., SND/SW/ACK). > > > > Without this patch, BPF program wouldn't know which timestamps belong > > to which flow because of no socket lock protection. This new callback > > is inserted in tcp_tx_timestamp() to address this issue because > > tcp_tx_timestamp() still owns the same socket lock with > > tcp_sendmsg_locked() in the meanwhile tcp_tx_timestamp() initializes > > the timestamping related fields for the skb, especially tskey. The > > tskey is the bridge to do the correlation. > > > > For TCP, BPF program hooks the beginning of tcp_sendmsg_locked() and > > then stores the sendmsg timestamp at the bpf_sk_storage, correlating > > this timestamp with its tskey that are later used in other sending > > timestamping callbacks. > > > > Signed-off-by: Jason Xing <kerneljasonxing@xxxxxxxxx> > > --- > > include/uapi/linux/bpf.h | 5 +++++ > > net/ipv4/tcp.c | 4 ++++ > > tools/include/uapi/linux/bpf.h | 5 +++++ > > 3 files changed, 14 insertions(+) > > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > > index 9355d617767f..86fca729fbd8 100644 > > --- a/include/uapi/linux/bpf.h > > +++ b/include/uapi/linux/bpf.h > > @@ -7052,6 +7052,11 @@ enum { > > * when SK_BPF_CB_TX_TIMESTAMPING > > * feature is on. > > */ > > + BPF_SOCK_OPS_TS_SND_CB, /* Called when every sendmsg syscall > > + * is triggered. It's used to correlate > > + * sendmsg timestamp with corresponding > > + * tskey. > > + */ > > }; > > > > /* List of TCP states. There is a build check in net/ipv4/tcp.c to detect > > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c > > index 12b9c4f9c151..4b9739cd3bc5 100644 > > --- a/net/ipv4/tcp.c > > +++ b/net/ipv4/tcp.c > > @@ -492,6 +492,10 @@ static void tcp_tx_timestamp(struct sock *sk, struct sockcm_cookie *sockc) > > if (tsflags & SOF_TIMESTAMPING_TX_RECORD_MASK) > > shinfo->tskey = TCP_SKB_CB(skb)->seq + skb->len - 1; > > } > > + > > + if (cgroup_bpf_enabled(CGROUP_SOCK_OPS) && > > + SK_BPF_CB_FLAG_TEST(sk, SK_BPF_CB_TX_TIMESTAMPING) && skb) > > + bpf_skops_tx_timestamping(sk, skb, BPF_SOCK_OPS_TS_SND_CB); > > } > > > > static bool tcp_stream_is_readable(struct sock *sk, int target) > > diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h > > index d3e2988b3b4c..2739ee0154a0 100644 > > --- a/tools/include/uapi/linux/bpf.h > > +++ b/tools/include/uapi/linux/bpf.h > > @@ -7042,6 +7042,11 @@ enum { > > * when SK_BPF_CB_TX_TIMESTAMPING > > * feature is on. > > */ > > + BPF_SOCK_OPS_TS_SND_CB, /* Called when every sendmsg syscall > > + * is triggered. It's used to correlate > > + * sendmsg timestamp with corresponding > > + * tskey. > > + */ > > Feel free to decline this late in the review process, but a bit more > bikeshedding.. > > Can we spell out TSTAMP instead of TS in these definitions? Within > the context of this series it is self-explanatory, but when reading > kernel code the meaning of a two letter acronym is not that clear. Even though I feel reluctant to change across the whole series because if so, I will adjust in many places. Of course, you're right about the new name being clearer :) > > And instead of SND can we use SENDMSG or something like that? > SND here confused me as the software timestamp is SCM_TSTAMP_SND. I'm not sure about this. For TCP, it's not implemented in the tcp_sendmsg_locked but tcp_tx_timestamp. Well, I have no strong preference. You can make the final call :) Thanks, Jason > > For instance: > > BPF_SOCK_OPS_TSTAMP_SENDMSG_CB, > BPF_SOCK_OPS_TSTAMP_SCHED_CB, > BPF_SOCK_OPS_TSTAMP_SND_SW_CB, > BPF_SOCK_OPS_TSTAMP_SND_HW_CB, > (BPF_SOCK_OPS_TSTAMP_TX_COMPLETION_CB,) > BPF_SOCK_OPS_TSTAMP_ACK_CB, > > (not sure what the OPT in OPT_CB added). > >