Fedor Pchelkin <pchelkin@xxxxxxxxx> wrote: > > [...] > > > > > > > drivers/net/wireless/realtek/rtw89/core.c | 15 ++++++++--- > > > drivers/net/wireless/realtek/rtw89/core.h | 32 > > > ++++++++++++++--------- drivers/net/wireless/realtek/rtw89/pci.c | > > > 6 +++-- > > > 3 files changed, 36 insertions(+), 17 deletions(-) > > > > > > diff --git a/drivers/net/wireless/realtek/rtw89/core.c > > > b/drivers/net/wireless/realtek/rtw89/core.c > > > index 57590f5577a3..826540319027 100644 > > > --- a/drivers/net/wireless/realtek/rtw89/core.c > > > +++ b/drivers/net/wireless/realtek/rtw89/core.c > > > @@ -1088,6 +1088,7 @@ int rtw89_core_tx_kick_off_and_wait(struct > > > rtw89_dev *rtwdev, struct sk_buff *sk > > > struct rtw89_tx_skb_data *skb_data = RTW89_TX_SKB_CB(skb); > > > struct rtw89_tx_wait_info *wait; > > > unsigned long time_left; > > > + bool free_wait = true; > > > int ret = 0; > > > > > > wait = kzalloc(sizeof(*wait), GFP_KERNEL); @@ -1097,7 > > > +1098,8 @@ int rtw89_core_tx_kick_off_and_wait(struct rtw89_dev *rtwdev, struct > sk_buff *sk > > > } > > > > > > init_completion(&wait->completion); > > > - rcu_assign_pointer(skb_data->wait, wait); > > > + spin_lock_init(&wait->owner_lock); > > > + skb_data->wait = wait; > > > > > > rtw89_core_tx_kick_off(rtwdev, qsel); > > > time_left = wait_for_completion_timeout(&wait->completion, > > > @@ -1107,8 +1109,15 @@ int rtw89_core_tx_kick_off_and_wait(struct > > > rtw89_dev *rtwdev, struct sk_buff *sk > > > else if (!wait->tx_done) > > > ret = -EAGAIN; > > > > > > - rcu_assign_pointer(skb_data->wait, NULL); > > > - kfree_rcu(wait, rcu_head); > > > > Please consider the following. > > (moving "rcu_assign_pointer(skb_data->wait, NULL)" to be under "if > > (time_left == 0)") > > > > There is still a tiny race window. Suppose wait_for_completion_timeout() exits with a timeout, > so time_left is 0. If completing side goes on in parallel just after that, it has a chance to > proceed and free skb_data before the below if (time_left == 0) fragment is executed. Okay, logically it sounds right. > > > if (time_left == 0) { > > rcu_assign_pointer(skb_data->wait, NULL); > > ret = -ETIMEDOUT; > > } else if (!wait->tx_done) { > > ret = -EAGAIN; > > } > > > > kfree_rcu(wait, rcu_head); > > > > If completing side does run as expected (potential racing mentioned in > > this patch), there is no real need to assign NULL back. > > Actually the race happens regardless of wait_for_completion_timeout() exit status, it's briefly > mentioned in the race diagram inside commit message (but the diagram can show only one > possible concurrency scenario). I agree this may be improved and described more explicitly > though. Will appreciate to see that in next version. Thanks. > > As for the patch itself, currently I can't see another way of fixing that other than introducing > locks on both waiting and completing side. I took some time on thinking this. The following is another idea. The skb, which are sent by tx_wait_complete, are owned by driver. They don't come from stack, so we don't need to do ieee80211_tx_status_ni. Based on above, some rough points of the new idea are listed below. 1. Let rtw89_core_tx_wait_complete return true/false to indicate whether tx_wait or not 2. Add some new field into rtw89_tx_wait_info e.g. list_head, skb, finished 3. Add a list_head to rtwdev Add a work func doing things as for each wait in rtwdev->XXX_list: if !wait->finished: wait_for_completion() free wait->skb free wait 4. [waiting side] [completing side] wait_for_completion_timeout() ... ... /* make complete the last step */ ... if (rtw89_core_tx_wait_complete) ... return; ... // not assign NULL back to skb_data->wait if time_left != 0: wait-> finished = true wait->skb = skb add wait to rtwdev->XXX_list queue above work Please help evaluate the new idea. Thank you.