Re: [PATCH net-next 0/2] net: xsk: add two sysctl knobs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jun 18, 2025 at 8:29 AM Jason Xing <kerneljasonxing@xxxxxxxxx> wrote:
>
> On Wed, Jun 18, 2025 at 1:46 AM Stanislav Fomichev <stfomichev@xxxxxxxxx> wrote:
> >
> > On 06/17, Jason Xing wrote:
> > > Hi Stanislav,
> > >
> > > On Tue, Jun 17, 2025 at 9:11 AM Stanislav Fomichev <stfomichev@xxxxxxxxx> wrote:
> > > >
> > > > On 06/17, Jason Xing wrote:
> > > > > From: Jason Xing <kernelxing@xxxxxxxxxxx>
> > > > >
> > > > > Introduce a control method in the xsk path to let users have the chance
> > > > > to tune it manually.
> > > >
> > > > Can you expand more on why the defaults don't work for you?
> > >
> > > We use a user-level tcp stack with xsk to transmit packets that have
> > > higher priorities than other normal kernel tcp flows. It turns out
> > > that enlarging the number can minimize times of triggering sendto
> > > sysctl, which contributes to faster transmission. it's very easy to
> > > hit the upper bound (namely, 32) if you log the return value of
> > > sendto. I mentioned a bit about this in the second patch, saying that
> > > we can have a similar knob already appearing in the qdisc layer.
> > > Furthermore, exposing important parameters can help applications
> > > complete their AI/auto-tuning to judge which one is the best fit in
> > > their production workload. That is also one of the promising
> > > tendencies :)
> > >
> > > >
> > > > Also, can we put these settings into the socket instead of (global/ns)
> > > > sysctl?
> > >
> > > As to MAX_PER_SOCKET_BUDGET, it seems not easy to get its
> > > corresponding netns? I have no strong opinion on this point for now.
>
> To add to that, after digging into this part, I realized that we're
> able to use sock_net(sk)->core.max_tx_budget directly to finish the
> namespace stuff because xsk_create() calls sk_alloc() which correlates
> its netns in the sk->sk_net. Sounds reasonable?

Updated patch here:
https://lore.kernel.org/all/20250618065553.96822-1-kerneljasonxing@xxxxxxxxx/

Please review :0

Thanks,
Jason

>
> >
> > I'm suggesting something along these lines (see below). And then add
> > some way to configure it (plus, obviously, set the default value
> > on init). There is also a question on whether you need separate
> > values for MAX_PER_SOCKET_BUDGET and TX_BATCH_SIZE, and if yes,
>
> For now, actually I don't see a specific reason to separate them, so
> let me use a single one in V2. My use case only expects to see the
> TX_BATCH_SIZE adjustment.
>
> > then why.
> >
> > diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> > index 72c000c0ae5f..fb2caec9914d 100644
> > --- a/net/xdp/xsk.c
> > +++ b/net/xdp/xsk.c
> > @@ -424,7 +424,7 @@ bool xsk_tx_peek_desc(struct xsk_buff_pool *pool, struct xdp_desc *desc)
> >         rcu_read_lock();
> >  again:
> >         list_for_each_entry_rcu(xs, &pool->xsk_tx_list, tx_list) {
> > -               if (xs->tx_budget_spent >= MAX_PER_SOCKET_BUDGET) {
> > +               if (xs->tx_budget_spent >= xs->max_tx_budget) {
>
> If we implement it like this, xs->max_tx_budget has to read a
> per-netns from somewhere and then initialize it. The core problem
> still remains: where to store the per netns value.
>
> Do you think using the aforementioned sock_net(sk)->core.max_tx_budget
> is possible?
>
> Thanks,
> Jason
>
> >                         budget_exhausted = true;
> >                         continue;
> >                 }
> > @@ -779,7 +779,6 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
> >  static int __xsk_generic_xmit(struct sock *sk)
> >  {
> >         struct xdp_sock *xs = xdp_sk(sk);
> > -       u32 max_batch = TX_BATCH_SIZE;
> >         bool sent_frame = false;
> >         struct xdp_desc desc;
> >         struct sk_buff *skb;
> > @@ -797,7 +796,7 @@ static int __xsk_generic_xmit(struct sock *sk)
> >                 goto out;
> >
> >         while (xskq_cons_peek_desc(xs->tx, &desc, xs->pool)) {
> > -               if (max_batch-- == 0) {
> > +               if (xs->max_tx_budget-- == 0) {
> >                         err = -EAGAIN;
> >                         goto out;
> >                 }





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux