From: Luiz Augusto von Dentz <luiz.dentz@xxxxxxxxx> Date: Tue, 17 Jun 2025 10:52:11 -0400 > Hi Kuniyuki, > > On Mon, Jun 16, 2025 at 4:35 PM Kuniyuki Iwashima <kuni1840@xxxxxxxxx> wrote: > > > > From: Luiz Augusto von Dentz <luiz.dentz@xxxxxxxxx> > > Date: Mon, 16 Jun 2025 15:30:19 -0400 > > > Hi Kuniyuki, > > > > > > On Mon, Jun 16, 2025 at 3:20 PM Kuniyuki Iwashima <kuni1840@xxxxxxxxx> wrote: > > > > > > > > From: Luiz Augusto von Dentz <luiz.dentz@xxxxxxxxx> > > > > Date: Mon, 16 Jun 2025 14:56:26 -0400 > > > > > Hi Pauli, Kuniyuki, > > > > > > > > > > On Mon, Jun 16, 2025 at 2:12 PM Pauli Virtanen <pav@xxxxxx> wrote: > > > > > > > > > > > > Hi, > > > > > > > > > > > > ma, 2025-06-16 kello 10:37 -0700, Kuniyuki Iwashima kirjoitti: > > > > > > > From: Kuniyuki Iwashima <kuniyu@xxxxxxxxxx> > > > > > > > > > > > > > > syzbot reported use-after-free in vhci_flush() without repro. [0] > > > > > > > > > > > > > [clip] > > > > > > > diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c > > > > > > > index 07a8b4281a39..d648b514e2df 100644 > > > > > > > --- a/net/bluetooth/hci_core.c > > > > > > > +++ b/net/bluetooth/hci_core.c > > > > > > > @@ -64,9 +64,9 @@ static DEFINE_IDA(hci_index_ida); > > > > > > > > > > > > > > /* Get HCI device by index. > > > > > > > * Device is held on return. */ > > > > > > > -struct hci_dev *hci_dev_get(int index) > > > > > > > +static struct hci_dev *__hci_dev_get(int index, int *srcu_index) > > > > > > > { > > > > > > > - struct hci_dev *hdev = NULL, *d; > > > > > > > + struct hci_dev *hdev = NULL; > > > > > > > > > > > > > > BT_DBG("%d", index); > > > > > > > > > > > > > > @@ -74,9 +74,11 @@ struct hci_dev *hci_dev_get(int index) > > > > > > > return NULL; > > > > > > > > > > > > > > read_lock(&hci_dev_list_lock); > > > > > > > - list_for_each_entry(d, &hci_dev_list, list) { > > > > > > > - if (d->id == index) { > > > > > > > - hdev = hci_dev_hold(d); > > > > > > > + list_for_each_entry(hdev, &hci_dev_list, list) { > > > > > > > + if (hdev->id == index) { > > > > > > > + hci_dev_hold(hdev); > > > > > > > + if (srcu_index) > > > > > > > + *srcu_index = srcu_read_lock(&hdev->srcu); > > > > > > > break; > > > > > > > } > > > > > > > } > > > > > > > @@ -84,6 +86,22 @@ struct hci_dev *hci_dev_get(int index) > > > > > > > return hdev; > > > > > > > } > > > > > > > > > > > > If no list item has `hdev->id == index`, doesn't this now return the > > > > > > list head -> crash? > > > > > > > > > > Seems wrong to me as well, *d was used to hold the current entry but > > > > > that has been removed so hdev would be used instead which may return a > > > > > valid/non-NULL entry even when its index doesn't match. Btw, are there > > > > > any documentation regarding the usage of SRCU in such cases where > > > > > there are still references? > > > > > > > > I think that's the main use case of SRCU, and basically the use case > > > > is similar to the normal RCU. There are a bunch of doc under > > > > Documentations/ and I'm not sure what's the best one, but LWN article > > > > would be a nice one to start. > > > > > > > > https://lwn.net/Articles/202847/ > > > > > > > > Also, one good example would be rtnl_link_ops_get(), rtnl_link_ops_put(), > > > > and rtnl_link_unregister(). > > > > > > Oh, this indeed looks pretty similar. > > > > > > > > > > > > > > > > > Usually the hci_unregister_dev is called by the driver to inform the > > > > > hardware has been unplugged from the system, so we do want to be able > > > > > to abort any ongoing usage of the hci_dev so in this particular case > > > > > perhaps it is easier to just check if HCI_UNREGISTER has been set > > > > > before attempting to flush. > > > > > > > > Maybe like below ? > > > > > > > > I think the downside of this approach would be we need to apply this > > > > change to each driver that has the same issue. > > > > > > > > Note that we need to hold mutex here otherwise there is no guarantee > > > > that another thread does not complete kfree(). I'm not sure if touching > > > > unregister_dev in non-bluetooth-core code is something we should avoid. > > > > > > > > What do you think ? > > > > > > > > ---8<--- > > > > diff --git a/drivers/bluetooth/hci_vhci.c b/drivers/bluetooth/hci_vhci.c > > > > index 59f4d7bdffdc..3dea1292da2d 100644 > > > > --- a/drivers/bluetooth/hci_vhci.c > > > > +++ b/drivers/bluetooth/hci_vhci.c > > > > @@ -66,7 +66,10 @@ static int vhci_flush(struct hci_dev *hdev) > > > > { > > > > struct vhci_data *data = hci_get_drvdata(hdev); > > > > > > > > - skb_queue_purge(&data->readq); > > > > + mutex_lock(&hdev->unregister_lock); > > > > + if (hci_dev_test_flag(hdev, HCI_UNCONFIGURED)) > > > > + skb_queue_purge(&data->readq); > > > > + mutex_unlock(&hdev->unregister_lock); > > > > > > > > return 0; > > > > } > > > > ---8<--- > > > > > > Not quite, I was thinking more along this lines: > > > > > > diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c > > > index 07a8b4281a39..4c8105ad4885 100644 > > > --- a/net/bluetooth/hci_core.c > > > +++ b/net/bluetooth/hci_core.c > > > @@ -548,7 +548,7 @@ static int hci_dev_do_reset(struct hci_dev *hdev) > > > hci_conn_hash_flush(hdev); > > > hci_dev_unlock(hdev); > > > > > > - if (hdev->flush) > > > + if (!hci_dev_test_flag(hdev, HCI_UNREGISTER) && hdev->flush) > > > hdev->flush(hdev); > > > > I think technically there is still a small race window (and it > > could be large enough with _RT kernel). > > > > CPU1 CPU2 > > hci_unregister_dev > > hci_dev_get > > hci_dev_test_flag(hdev, HCI_UNREGISTER) > > hci_dev_set_flag(hdev, HCI_UNREGISTER) > > list_del(&hdev->list) > > kfree(vhci_data) > > hdev->flush(hdev) > > > > > > I'll keep the current form in v3 to fix the use-after-free > > with the mentioned list head bug fixed. > > Ok, and how the v3 helps with the situation above? It seems the > assumption is that synchronize_srcu would wait for the thread holding > hci_dev_get_srcu, but what if the order is the reversed? It does seem > like we still need to prevent hci_dev_get_srcu to return the hdev it > is has been unregistered already. It is never reversed as hci_dev_get_srcu() and hci_unregister_dev() hold hci_dev_list_lock. So, if hci_dev_get_srcu() finds a hdev, it's always after that that the hdev is un-listed by hci_unregister_dev(), and synchronize_srcu() follows it.