On Mon, Jun 30, 2025 at 10:50:55AM -0400, Michael S. Tsirkin wrote: > On Mon, Jun 30, 2025 at 05:51:09PM +0800, Zigit Zo wrote: > > This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while > > the virtio-net driver is still probing with rtnl_lock() hold, this will > > cause a recursive mutex in netdev_notify_peers(). > > > > Fix it by skip acking the annouce in virtnet_config_changed_work() when > > probing. The annouce will still get done when ndo_open() enables the > > virtio_config_driver_enable(). > > I am not so sure it will be - while driver is not loaded, device does > not have to send interrupts, and there's no rule I'm aware of that says > we'll get one after DRIVER_OK. > > How about, we instead just schedule the work to do it later? > > > Also, there is another bug here. > If ndo_open did not run, we actually should not send any announcements. > > Do we care if carrier on is set on probe or on open? > If not, let's just defer this to ndo_open? Hmm yes I think we do, device is visible to userspace is it not? Hmm. We can keep the announce bit set in vi->status and on open, check it and then schedule a work to do the announcement. > > > We've observed a softlockup with Ubuntu 24.04, and can be reproduced with > > QEMU sending the announce_self rapidly while booting. > > > > [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. > > [ 494.167667] Not tainted 6.8.0-57-generic #59-Ubuntu > > [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > > [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 > > [ 494.168260] Call Trace: > > [ 494.168329] <TASK> > > [ 494.168389] __schedule+0x27c/0x6b0 > > [ 494.168495] schedule+0x33/0x110 > > [ 494.168585] schedule_preempt_disabled+0x15/0x30 > > [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 > > [ 494.168835] __mutex_lock_slowpath+0x13/0x20 > > [ 494.168949] mutex_lock+0x3c/0x50 > > [ 494.169039] rtnl_lock+0x15/0x20 > > [ 494.169128] netdev_notify_peers+0x12/0x30 > > [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 > > [ 494.169377] virtnet_probe+0xa48/0xe00 > > [ 494.169484] ? vp_get+0x4d/0x100 > > [ 494.169574] virtio_dev_probe+0x1e9/0x310 > > [ 494.169682] really_probe+0x1c7/0x410 > > [ 494.169783] __driver_probe_device+0x8c/0x180 > > [ 494.169901] driver_probe_device+0x24/0xd0 > > [ 494.170011] __driver_attach+0x10b/0x210 > > [ 494.170117] ? __pfx___driver_attach+0x10/0x10 > > [ 494.170237] bus_for_each_dev+0x8d/0xf0 > > [ 494.170341] driver_attach+0x1e/0x30 > > [ 494.170440] bus_add_driver+0x14e/0x290 > > [ 494.170548] driver_register+0x5e/0x130 > > [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 > > [ 494.170788] register_virtio_driver+0x20/0x40 > > [ 494.170905] virtio_net_driver_init+0x97/0xb0 > > [ 494.171022] do_one_initcall+0x5e/0x340 > > [ 494.171128] do_initcalls+0x107/0x230 > > [ 494.171228] ? __pfx_kernel_init+0x10/0x10 > > [ 494.171340] kernel_init_freeable+0x134/0x210 > > [ 494.171462] kernel_init+0x1b/0x200 > > [ 494.171560] ret_from_fork+0x47/0x70 > > [ 494.171659] ? __pfx_kernel_init+0x10/0x10 > > [ 494.171769] ret_from_fork_asm+0x1b/0x30 > > [ 494.171875] </TASK> > > > > Fixes: df28de7b0050 ("virtio-net: synchronize operstate with admin state on up/down") > > Signed-off-by: Zigit Zo <zuozhijie@xxxxxxxxxxxxx> > > --- > > drivers/net/virtio_net.c | 15 ++++++++++++--- > > 1 file changed, 12 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > > index e53ba600605a..0290d289ebee 100644 > > --- a/drivers/net/virtio_net.c > > +++ b/drivers/net/virtio_net.c > > @@ -6211,7 +6211,8 @@ static const struct net_device_ops virtnet_netdev = { > > .ndo_tx_timeout = virtnet_tx_timeout, > > }; > > > > -static void virtnet_config_changed_work(struct work_struct *work) > > +static void __virtnet_config_changed_work(struct work_struct *work, > > + bool check_announce) > > { > > struct virtnet_info *vi = > > container_of(work, struct virtnet_info, config_work); > > So this will be schedule_announce instead of check_announce? > > > > > > @@ -6221,7 +6222,7 @@ static void virtnet_config_changed_work(struct work_struct *work) > > struct virtio_net_config, status, &v) < 0) > > return; > > > > - if (v & VIRTIO_NET_S_ANNOUNCE) { > > + if (check_announce && (v & VIRTIO_NET_S_ANNOUNCE)) { > > netdev_notify_peers(vi->dev); > > virtnet_ack_link_announce(vi); > > } > > @@ -6244,6 +6245,11 @@ static void virtnet_config_changed_work(struct work_struct *work) > > } > > } > > > > +static void virtnet_config_changed_work(struct work_struct *work) > > +{ > > + __virtnet_config_changed_work(work, true); > > +} > > + > > static void virtnet_config_changed(struct virtio_device *vdev) > > { > > struct virtnet_info *vi = vdev->priv; > > @@ -7030,7 +7036,10 @@ static int virtnet_probe(struct virtio_device *vdev) > > otherwise get link status from config. */ > > netif_carrier_off(dev); > > if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_STATUS)) { > > - virtnet_config_changed_work(&vi->config_work); > > + /* The check_annouce work will get scheduled when ndo_open() > > + * doing the virtio_config_driver_enable(). > > + */ > > + __virtnet_config_changed_work(&vi->config_work, false); > > } else { > > vi->status = VIRTIO_NET_S_LINK_UP; > > virtnet_update_settings(vi); > > > > base-commit: 2def09ead4ad5907988b655d1e1454003aaf8297 > > -- > > 2.49.0