On Wed, Aug 27, 2025 at 4:48 PM Martin KaFai Lau <martin.lau@xxxxxxxxx> wrote:
>
> On 8/26/25 11:38 AM, Kuniyuki Iwashima wrote:
> > The main targets are BPF_CGROUP_INET_SOCK_CREATE and
> > BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB as demonstrated in the selftest.
> >
> > Note that we do not support modifying the flag once sk->sk_memcg is set
> > especially because UDP charges memory under sk->sk_receive_queue.lock
> > instead of lock_sock().
> >
> > [ ... ]
>
> > diff --git a/include/net/sock.h b/include/net/sock.h
> > index 63a6a48afb48..d41a2f8f8b30 100644
> > --- a/include/net/sock.h
> > +++ b/include/net/sock.h
> > @@ -2596,10 +2596,39 @@ static inline gfp_t gfp_memcg_charge(void)
> >  	return in_softirq() ? GFP_ATOMIC : GFP_KERNEL;
> >  }
> >
> > +#define SK_BPF_MEMCG_FLAG_MASK	(SK_BPF_MEMCG_FLAG_MAX - 1)
> > +#define SK_BPF_MEMCG_PTR_MASK	~SK_BPF_MEMCG_FLAG_MASK
> > +
> >  #ifdef CONFIG_MEMCG
> > +static inline void mem_cgroup_sk_set_flags(struct sock *sk, unsigned short flags)
> > +{
> > +	unsigned long val = (unsigned long)sk->sk_memcg;
> > +
> > +	val |= flags;
> > +	sk->sk_memcg = (struct mem_cgroup *)val;
> > +}
> > +
>
> [ ... ]
>
> > +static int sk_bpf_set_get_memcg_flags(struct sock *sk, int *optval, bool getopt)
> > +{
> > +	if (!sk_has_account(sk))
> > +		return -EOPNOTSUPP;
> > +
> > +	if (getopt) {
> > +		*optval = mem_cgroup_sk_get_flags(sk);
> > +		return 0;
> > +	}
> > +
> > +	if (sock_owned_by_user_nocheck(sk) && mem_cgroup_from_sk(sk))
>
> I still don't see how this can stop some of the bh bpf prog to set/clear the bit
> after mem_cgroup_sk_alloc() is done.  For example, in BPF_SOCK_OPS_RETRANS_CB,
> bh_lock_sock() is held but not owned by user, so the bpf prog can continue to
> change the SK_BPF_MEMCG_FLAGS.  What am I missing?

You are totally right.  I think I was just confused.

> Passing some more args to __bpf_setsockopt() for caller context purpose is quite
> ugly which I think you also mentioned/tried earlier.
Another option I was thinking of (but didn't try) was extending the
void *ctx from struct sock * to something like below and silently
casting ctx back to the original one in the dedicated setsockopt proto.

	struct {
		struct sock *sk;
		/* caller info here */
	}

> May be it can check "if
> (in_softirq() && mem_cgroup_from_sk(sk)) return -EBUSY" but looks quite tricky
> together with the sock_owned_by_user_nocheck().
>
> It seems this sk_memcg bit change is only needed for sock_ops hook and create
> hook?  sock_ops already has its own func_proto bpf_sock_ops_setsockopt().
> bpf_sock_ops_setsockopt can directly call sk_bpf_set_get_memcg_flags() but limit
> it to the BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB.  Is it also safe to allow it for
> BPF_SOCK_OPS_TCP_LISTEN_CB?  For the create hook, create a new func proto which
> can directly call sk_bpf_set_get_memcg_flags().  wdyt?

I prefer the new proto instead of a hacky/complicated approach.

>
> > +		return -EBUSY;
> > +
> > +	if (*optval <= 0 || *optval >= SK_BPF_MEMCG_FLAG_MAX)
>
> It seems the bit cannot be cleared (the *optval <= 0 check).  There could be
> multiple bpf progs running in a cgroup which want to clear this bit.  e.g. the
> parent cgroup's prog may want to ensure this bit is not set.

I didn't think about that situation, so the set helper will need to
reset all flags first (in case there could be multiple bit flags in
the future):

static inline void mem_cgroup_sk_set_flags(struct sock *sk, unsigned short flags)
{
	unsigned long val = (unsigned long)sk->sk_memcg;

	val &= SK_BPF_MEMCG_PTR_MASK;
	val |= flags;
	sk->sk_memcg = (struct mem_cgroup *)val;
}