Re: [PATCH bpf-next v12 01/12] bpf: add networking timestamping support to bpf_get/setsockopt()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> > > Now I wonder if I should use the u8 sk_bpf_cb_flags in V13 or just
> > > keep it as-is? Either way is fine with me :) bpf_sock_ops_cb_flags
> > > uses u8 as an example, thus I think we prefer the former?
> >
> > If it fits in a u8 and that in practice also results in less memory
> > and cache pressure (i.e., does not just add a 24b hole), then it is a
> > net improvement.
> 
> Probably I didn't state it clearly. I agree with you on saving memory:)
> 
> In the previous response, I was trying to keep the sk_bpf_cb_flags
> flag and use a u8 instead. I admit u32 is too large after you noticed
> this.
> 
> Would the following diff on top of this series be acceptable for you?
> And would it be a proper place to put the u8 sk_bpf_cb_flags in struct
> sock?
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 6f4d54faba92..e85d6fb3a2ba 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -447,7 +447,7 @@ struct sock {
>         int                     sk_forward_alloc;
>         u32                     sk_tsflags;
>  #define SK_BPF_CB_FLAG_TEST(SK, FLAG) ((SK)->sk_bpf_cb_flags & (FLAG))
> -       u32                     sk_bpf_cb_flags;
> +       u8                      sk_bpf_cb_flags;
>         __cacheline_group_end(sock_write_rxtx);
> 
>         __cacheline_group_begin(sock_write_tx);
> 
> The following output is the result of running 'pahole --hex -C sock vmlinux'.
> Before this series:
>         u32                        sk_tsflags;           /* 0x168   0x4 */
>         __u8
> __cacheline_group_end__sock_write_rxtx[0]; /* 0x16c     0 */
>         __u8
> __cacheline_group_begin__sock_write_tx[0]; /* 0x16c     0 */
>         int                        sk_write_pending;     /* 0x16c   0x4 */
>         atomic_t                   sk_omem_alloc;        /* 0x170   0x4 */
>         int                        sk_sndbuf;            /* 0x174   0x4 */
>         int                        sk_wmem_queued;       /* 0x178   0x4 */
>         refcount_t                 sk_wmem_alloc;        /* 0x17c   0x4 */
>         /* --- cacheline 6 boundary (384 bytes) --- */
>         long unsigned int          sk_tsq_flags;         /* 0x180   0x8 */
> ...
> /* sum members: 773, holes: 1, sum holes: 1 */
> 
> After this diff patch:
>         u32                        sk_tsflags;           /* 0x168   0x4 */
>         u8                         sk_bpf_cb_flags;      /* 0x16c   0x1 */
>         __u8
> __cacheline_group_end__sock_write_rxtx[0]; /* 0x16d     0 */
>         __u8
> __cacheline_group_begin__sock_write_tx[0]; /* 0x16d     0 */
> 
>         /* XXX 3 bytes hole, try to pack */
> 
>         int                        sk_write_pending;     /* 0x170   0x4 */
>         atomic_t                   sk_omem_alloc;        /* 0x174   0x4 */
>         int                        sk_sndbuf;            /* 0x178   0x4 */
>         int                        sk_wmem_queued;       /* 0x17c   0x4 */
>         /* --- cacheline 6 boundary (384 bytes) --- */
>         refcount_t                 sk_wmem_alloc;        /* 0x180   0x4 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         long unsigned int          sk_tsq_flags;         /* 0x188   0x8 */
> ...
> /* sum members: 774, holes: 3, sum holes: 8 */
> 
> It will introduce 7 extra sum holes if this series with this u8 change
> gets applied. I think it's a proper position because this new
> sk_bpf_cb_flags will be used in the tx and rx path just like
> sk_tsflags, aligned with rules introduced by the commit[1].

Reducing a u64 to u8 can leave 7b of holes, but that is not great,
of course.

Since this bitmap is only touched if a BPF program is loaded, arguably
it need not be in the hot path cacheline groups.

Can you find a hole further down to place this in, or at least a spot
that does not result in 7b of wasted space (in the hotpath cacheline
groups of all places).




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux