Re: [PATCH net-next V6 2/2] veth: apply qdisc backpressure on full ptr_ring to reduce TX drops

On 24/04/2025 17.53, Jakub Kicinski wrote:
> On Thu, 24 Apr 2025 17:24:51 +0200 Jesper Dangaard Brouer wrote:
> Looks like I wrote a reply to v5 but didn't hit send. But I may have
> set v5 to Changes Requested because of it :S  Here is my comment:

>    I think this is missing a memory barrier. When drivers do this dance
>    there's usually a barrier between stop and recheck, to make sure the
>    stop is visible before we check. And vice versa veth_xdp_rcv() needs
>    to make sure other side sees the "empty" indication before it checks
>    if the queue is stopped.

>> The call netif_tx_stop_queue(txq); already contains a memory barrier
>> smp_mb__before_atomic() plus an atomic set_bit operation.  That should
>> be sufficient.

> That barrier is _before_ stopping the queue. I'm saying we need a
> barrier between stop and emptiness re-check. Note that:
>   - smp_mb__after_atomic() is enough, and it 'compiles' to nothing
>     on x86

I see, I will add a smp_mb__after_atomic() after netif_tx_stop_queue()
and send a V7.  I considered an atomic operation to be a full memory
barrier, which is correct for x86 (as you say, this compiles to
nothing), but other archs need the explicit barrier, so let's add it.

>   - all of this is the unlikely path :) You restart the qdisc
>     when the ptr ring is completely full so the stopping in absolute
>     worst case will happen once or twice per full ptr_ring ?


Yes, basically. It should only happen once per full-ptr_ring event.
As soon as the TXQ is stopped, the driver xmit code is no longer
called.  Do remember that the remote CPU running veth_poll() will
(re)start the TXQ again via the qdisc layer, which calls into the veth
driver again, e.g. racing to fill the ptr_ring and stop the TXQ again.
(Sysadmin hint: these full/TXQ-stop events are recorded in the
"requeues" counter in the qdisc stats.)  The remote CPU running NAPI
is in a fairly tight loop, so it will do its best to empty the queue,
with a total budget of 300.  The race is still very unlikely, but it
is a race that would stop the TXQ forever for the veth device (we
don't recover).


And the other side, veth_poll(), has a smp_store_mb() before reading
the ptr_ring.
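
To make the ordering requirement concrete, the stop/re-check dance on
both sides can be sketched as a userspace analog.  This is a hedged,
single-threaded sketch with hypothetical names, not the veth code: C11
seq_cst fences stand in for smp_mb__after_atomic()/smp_store_mb(), and
plain atomics stand in for the ptr_ring and netif_tx_stop_queue() /
netif_tx_wake_queue():

```c
#include <stdatomic.h>
#include <stdbool.h>

#define RING_SIZE 4

/* Userspace analog of the veth ptr_ring + TXQ state (hypothetical). */
struct ring {
	int buf[RING_SIZE];
	atomic_int head, tail;	/* producer advances head, consumer tail */
	atomic_bool txq_stopped;
};

static bool ring_full(struct ring *r)
{
	return atomic_load(&r->head) - atomic_load(&r->tail) == RING_SIZE;
}

static bool ring_empty(struct ring *r)
{
	return atomic_load(&r->head) == atomic_load(&r->tail);
}

/* Producer side, analogous to the xmit path. */
static bool xmit(struct ring *r, int pkt)
{
	if (ring_full(r)) {
		/* like netif_tx_stop_queue(txq) */
		atomic_store(&r->txq_stopped, true);
		/* like smp_mb__after_atomic(): stop must be visible
		 * before the emptiness re-check below. */
		atomic_thread_fence(memory_order_seq_cst);
		if (ring_empty(r))	/* consumer drained meanwhile */
			atomic_store(&r->txq_stopped, false);
		else
			return false;	/* backpressure: qdisc requeues */
	}
	r->buf[atomic_load(&r->head) % RING_SIZE] = pkt;
	atomic_fetch_add(&r->head, 1);
	return true;
}

/* Consumer side, analogous to veth_poll()/veth_xdp_rcv(). */
static int poll_drain(struct ring *r)
{
	int n = 0;

	while (!ring_empty(r)) {
		atomic_fetch_add(&r->tail, 1);
		n++;
	}
	/* like smp_store_mb(): publish "empty" before checking
	 * whether the producer stopped the queue. */
	atomic_thread_fence(memory_order_seq_cst);
	if (atomic_load(&r->txq_stopped))
		atomic_store(&r->txq_stopped, false); /* wake queue */
	return n;
}
```

Without the fence in xmit(), the emptiness re-check could be observed
before the stop flag, and both sides could each see the other's "old"
state: the producer parks the queue while the consumer concludes there
is nothing to wake, which is exactly the stuck-TXQ race above.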

--Jesper

p.s.
I actually had an alternative implementation of this that only calls
stop when it is needed.  See below; it kind of looks prettier, but it
adds an extra memory barrier in the likely path.  (And I'm not sure a
read memory barrier is strong enough.)

> Not sure that works either :S
