Re: [External] Re: [PATCH net] virtio-net: fix a rtnl_lock() deadlock during probing

On Tue, Jul 01, 2025 at 03:48:41PM +0800, Zigit Zo wrote:
> On 6/30/25 10:54 PM, Michael S. Tsirkin wrote:
> > On Mon, Jun 30, 2025 at 10:50:55AM -0400, Michael S. Tsirkin wrote:
> >> On Mon, Jun 30, 2025 at 05:51:09PM +0800, Zigit Zo wrote:
> >>> This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while
> >>> the virtio-net driver is still probing with rtnl_lock() held; this
> >>> causes a recursive mutex acquisition in netdev_notify_peers().
> >>>
> >>> Fix it by skipping the announce ack in virtnet_config_changed_work()
> >>> while probing. The announce will still get done when ndo_open() calls
> >>> virtio_config_driver_enable().
> >>
> >> I am not so sure it will be - while driver is not loaded, device does
> >> not have to send interrupts, and there's no rule I'm aware of that says
> >> we'll get one after DRIVER_OK.
> 
> Yep, at first we thought that whenever the VIRTIO_NET_S_ANNOUNCE flag is
> set, we could rely on the VMM having fired an interrupt to notify the
> driver to do the announcement.
> 
> But later we realized that the S_ANNOUNCE flag can be set before the
> driver probes, and QEMU seems to set the status flag regardless of
> whether the driver is ready, so the problem you're describing may still
> happen.
> 
> >> How about, we instead just schedule the work to do it later?
> 
> I'm not sure if scheduling the work later will break df28de7b0050. The work
> was being scheduled before that commit, and we don't really know why that
> commit removed the schedule_work, so we just kept it as is to be safe...

Well, managing async things is always tricky. A direct call is safer.
If you reintroduce the scheduled work, you need to audit all call paths
for safety.


> Then, for plan A, we change the check_announce to schedule_announce, and if
> that's true, we do another schedule_work to call virtnet_config_changed_work
> again to finish the announcement, like
> 
> 	if (v & VIRTIO_NET_S_ANNOUNCE) {
> 		if (unlikely(schedule_announce))
> 			schedule_work(&vi->config_work);
> 		else {
> 			netdev_notify_peers(vi->dev);
> 			virtnet_ack_link_announce(vi);
> 		}
> 	}
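
Plan A above can be modeled in user space roughly as follows. This is a
minimal sketch, not driver code: struct plan_a_dev and the plan_a_*
helpers are hypothetical names, a bool stands in for "probe holds rtnl",
and another bool stands in for schedule_work(). It only shows the
intended ordering: the probe-time call queues the work instead of
announcing, and the announcement happens once the worker runs later.

```c
/* User-space model of plan A. Not kernel code; all names are illustrative. */
#include <assert.h>
#include <stdbool.h>

struct plan_a_dev {
	bool rtnl_held_by_probe;  /* probe path currently holds the lock */
	bool work_queued;         /* stands in for schedule_work() */
	unsigned int announces;   /* announcements actually sent */
};

/*
 * Body of the config-changed work. schedule_announce is true when the
 * caller is the probe path and must not announce directly.
 */
static void plan_a_config_work(struct plan_a_dev *d, bool announce_bit,
			       bool schedule_announce)
{
	if (!announce_bit)
		return;

	if (schedule_announce) {
		d->work_queued = true;  /* defer: run the work again later */
		return;
	}

	/*
	 * A real worker would block on the lock until probe releases it;
	 * the model simply requires that probe has finished by now.
	 */
	assert(!d->rtnl_held_by_probe);
	d->announces++;             /* notify peers and ack the announce */
}

/* Runs the deferred work if any was queued. */
static void plan_a_run_worker(struct plan_a_dev *d, bool announce_bit)
{
	if (!d->work_queued)
		return;
	d->work_queued = false;
	plan_a_config_work(d, announce_bit, false);
}
```

In the model, calling plan_a_config_work() with schedule_announce set
while rtnl_held_by_probe is true sends nothing and only queues the work;
clearing the flag and calling plan_a_run_worker() sends exactly one
announcement.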
> 
> >>
> >> Also, there is another bug here.
> >> If ndo_open did not run, we actually should not send any announcements.
> >>
> >> Do we care if carrier on is set on probe or on open?
> >> If not, let's just defer this to ndo_open?
> > 
> > Hmm yes I think we do, device is visible to userspace is it not?
> > 
> > Hmm.  We can keep the announce bit set in vi->status and on open, check
> > it and then schedule a work to do the announcement.
> 
> Okay, so there's a plan B, we save the bit and re-check it in ndo_open, like
> 
> 	/* __virtnet_config_changed_work() */
> 	if (v & VIRTIO_NET_S_ANNOUNCE) {
> 		vi->status |= VIRTIO_NET_S_ANNOUNCE;
> 		if (unlikely(!check_announce))
> 			goto check_link;
> 
> 		netdev_notify_peers(vi->dev);
> 		virtnet_ack_link_announce(vi);
> 		vi->status &= ~VIRTIO_NET_S_ANNOUNCE;
> 	}
> 
> 	/* virtnet_open() */
> 	if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_STATUS)) {
> 		if (vi->status & VIRTIO_NET_S_LINK_UP)
> 			netif_carrier_on(vi->dev);
> 		if (vi->status & VIRTIO_NET_S_ANNOUNCE)
> 			schedule_work(&vi->config_work);
> 		virtio_config_driver_enable(vi->vdev);
> 	}
> 
> This is a dirty demo, any ideas are welcome :)
> 
> (I think in virtnet_open() we could have the S_LINK_UP handling scheduled
> as well?)
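
The plan B flow quoted above can be modeled in user space roughly as
follows. This is a minimal sketch under stated assumptions, not the
driver: struct model_dev and the model_* helpers are hypothetical, the
S_* constants only mirror the VIRTIO_NET_S_* bits, and a bool stands in
for schedule_work(). It shows the announce bit being latched while
announcing is not allowed, then consumed by work scheduled from open.

```c
/* User-space model of plan B. Not kernel code; all names are illustrative. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define S_LINK_UP  0x1u  /* mirrors VIRTIO_NET_S_LINK_UP */
#define S_ANNOUNCE 0x2u  /* mirrors VIRTIO_NET_S_ANNOUNCE */

struct model_dev {
	uint16_t status;         /* cached status bits, like vi->status */
	bool opened;             /* true once the open path has run */
	bool work_pending;       /* stands in for schedule_work() */
	unsigned int announces;  /* announcements actually sent */
};

/* Config-change handler; v is the status word read from the device. */
static void model_config_changed(struct model_dev *d, uint16_t v,
				 bool may_announce)
{
	if (v & S_ANNOUNCE) {
		d->status |= S_ANNOUNCE;  /* latch the request */
		if (may_announce) {
			assert(d->opened);    /* never announce before open */
			d->announces++;       /* notify peers and ack */
			d->status &= (uint16_t)~S_ANNOUNCE;
		}
	}

	if (v & S_LINK_UP)
		d->status |= S_LINK_UP;
	else
		d->status &= (uint16_t)~S_LINK_UP;
}

/* Open path: if an announce was latched earlier, schedule the work. */
static void model_open(struct model_dev *d)
{
	d->opened = true;
	if (d->status & S_ANNOUNCE)
		d->work_pending = true;
}

/* Worker: re-runs the handler with announcing allowed. */
static void model_run_work(struct model_dev *d, uint16_t v)
{
	if (!d->work_pending)
		return;
	d->work_pending = false;
	model_config_changed(d, v, true);
}
```

In the model, a config change with S_ANNOUNCE set and may_announce false
sends nothing but leaves the bit latched in status; model_open() then
marks the work pending, and model_run_work() sends one announcement and
clears the bit. Whether latching the bit in the real vi->status
interferes with the existing link-status comparison is not covered here.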




