Re: [PATCH net-next V7 2/2] veth: apply qdisc backpressure on full ptr_ring to reduce TX drops

On 6/10/25 2:40 PM, Jesper Dangaard Brouer wrote:


On 10/06/2025 20.26, Ihor Solodrai wrote:
On 6/10/25 8:56 AM, Jesper Dangaard Brouer wrote:


On 10/06/2025 13.43, Jesper Dangaard Brouer wrote:

On 10/06/2025 00.09, Ihor Solodrai wrote:
[...]

Can you give me the output from below command (on your compiled kernel):

  ./scripts/faddr2line drivers/net/veth.o veth_xdp_rcv.constprop.0+0x6b


Still need above data/info please.

root@devvm7589:/ci/workspace# ./scripts/faddr2line ./kout.gcc/drivers/net/veth.o veth_xdp_rcv.constprop.0+0x6b
veth_xdp_rcv.constprop.0+0x6b/0x390:
netdev_get_tx_queue at /ci/workspace/kout.gcc/../include/linux/netdevice.h:2637
(inlined by) veth_xdp_rcv at /ci/workspace/kout.gcc/../drivers/net/veth.c:912

Which is:

veth.c:912
     struct veth_priv *priv = netdev_priv(rq->dev);
     int queue_idx = rq->xdp_rxq.queue_index;
     struct netdev_queue *peer_txq;
     struct net_device *peer_dev;
     int i, done = 0, n_xdpf = 0;
     void *xdpf[VETH_XDP_BATCH];

     /* NAPI functions as RCU section */
     peer_dev = rcu_dereference_check(priv->peer, rcu_read_lock_bh_held());
  --->    peer_txq = netdev_get_tx_queue(peer_dev, queue_idx);

netdevice.h:2637
     static inline
     struct netdev_queue *netdev_get_tx_queue(const struct net_device *dev,
                      unsigned int index)
     {
         DEBUG_NET_WARN_ON_ONCE(index >= dev->num_tx_queues);
  --->        return &dev->_tx[index];
     }

So the suspect is peer_dev (priv->peer)?

Yes, this is the problem!

So, it seems that peer_dev (priv->peer) can become a NULL pointer.

Managed to reproduce - via manually deleting the peer device:
  - ip link delete dev veth42
  - while overloading veth41 via XDP redirecting packets into it.

Managed to trigger concurrent crashes on two CPUs (C0 + C3)
  - so the output below is interleaved a bit:

[...]

A fix could look like this:

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index e58a0f1b5c5b..a3046142cb8e 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -909,7 +909,7 @@ static int veth_xdp_rcv(struct veth_rq *rq, int budget,

         /* NAPI functions as RCU section */
        peer_dev = rcu_dereference_check(priv->peer, rcu_read_lock_bh_held());
-       peer_txq = netdev_get_tx_queue(peer_dev, queue_idx);
+       peer_txq = peer_dev ? netdev_get_tx_queue(peer_dev, queue_idx) : NULL;

         for (i = 0; i < budget; i++) {
                 void *ptr = __ptr_ring_consume(&rq->xdp_ring);
@@ -959,7 +959,7 @@ static int veth_xdp_rcv(struct veth_rq *rq, int budget,
         rq->stats.vs.xdp_packets += done;
         u64_stats_update_end(&rq->stats.syncp);

-       if (unlikely(netif_tx_queue_stopped(peer_txq)))
+       if (peer_txq && unlikely(netif_tx_queue_stopped(peer_txq)))
                 netif_tx_wake_queue(peer_txq);
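
For illustration only, the guard pattern in the diff above can be modelled in plain userspace C. This is a minimal sketch, not kernel code: struct txq, struct dev, peer_txq_or_null() and wake_if_stopped() are hypothetical stand-ins for struct netdev_queue, struct net_device, the guarded netdev_get_tx_queue() lookup and the guarded wake-up, and no RCU or NAPI context is modelled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct netdev_queue. */
struct txq {
	bool stopped;
};

/* Hypothetical stand-in for struct net_device. */
struct dev {
	struct txq *tx;
	unsigned int num_tx_queues;
};

/*
 * Models the first hunk: only look up the TX queue when the peer
 * still exists, otherwise hand back NULL.
 */
static struct txq *peer_txq_or_null(struct dev *peer, unsigned int idx)
{
	return peer ? &peer->tx[idx] : NULL;
}

/*
 * Models the second hunk: skip the wake-up entirely when there is no
 * queue. Returns true only if a stopped queue was actually woken.
 */
static bool wake_if_stopped(struct txq *q)
{
	if (q && q->stopped) {
		q->stopped = false;
		return true;
	}
	return false;
}
```

With a NULL peer, peer_txq_or_null() yields NULL and wake_if_stopped() becomes a no-op, which is the behaviour the proposed fix relies on once the peer device has been deleted.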


Great! I presume you will send a patch separately?




--Jesper






