On Sun Aug 10, 2025 at 7:00 PM CEST, Rameshkumar Sundaram wrote:
> During GTK rekey, mac80211 issues a clear key (if the old key exists)
> followed by an install key operation in the same context. This causes
> ath11k to send two WMI commands in quick succession: one to clear the
> old key and another to install the new key in the same slot.
>
> Under certain conditions—especially under high load or time sensitive
> scenarios, firmware may process these commands asynchronously in a way
> that firmware assumes the key is cleared whereas hardware has a valid key.
> This inconsistency between hardware and firmware leads to group addressed
> packet drops. Only setting the same key again can restore a valid key in
> firmware and allow packets to be transmitted.
>
> This issue remained latent because the host's clear key commands were
> not effective in firmware until commit 436a4e886598 ("ath11k: clear the
> keys properly via DISABLE_KEY"). That commit enabled the host to
> explicitly clear group keys, which inadvertently exposed the race.
>
> To mitigate this, restrict group key clearing across all modes (AP, STA,
> MESH). During rekey, the new key can simply be set on top of the previous
> one, avoiding the need for a clear followed by a set.
>
> However, in AP mode specifically, permit group key clearing when no
> stations are associated. This exception supports transitions from secure
> modes (e.g., WPA2/WPA3) to open mode, during which all associated peers
> are removed and the group key is cleared as part of the transition.
>
> Add a per-BSS station counter to track the presence of stations during
> set key operations. Also add a reset_group_keys flag to track the key
> re-installation state and avoid repeated installation of the same key
> when the number of connected stations transitions to non-zero within a
> rekey period.
>
> Additionally, for AP and Mesh modes, when the first station associates,
> reinstall the same group key that was last set. This ensures that the
> firmware recovers from any race that may have occurred during a previous
> key clear when no stations were associated.
>
> This change ensures that key clearing is permitted only when no clients
> are connected, avoiding packet loss while enabling dynamic security mode
> transitions.
>
> Tested-on: QCN9074 hw1.0 PCI WLAN.HK.2.9.0.1-02146-QCAHKSWPL_SILICONZ-1
> Tested-on: WCN6855 hw2.1 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.41
>
> Reported-by: Steffen Moser <lists@xxxxxxxxxxxxxxxx>
> Closes: https://lore.kernel.org/linux-wireless/c6366409-9928-4dd7-bf7b-ba7fcf20eabf@xxxxxxxxxxxxxxxx
> Fixes: 436a4e886598 ("ath11k: clear the keys properly via DISABLE_KEY")
> Signed-off-by: Rameshkumar Sundaram <rameshkumar.sundaram@xxxxxxxxxxxxxxxx>

Hello,

I've just confirmed that this works well for my devices.

For the record, or if someone else wants to check it: I run an AP with a
short rekey interval (wpa_group_rekey=10 in the hostapd config) and run a
continuous 'arping -b x.x.x.x'. Without the fix I lose broadcast
connectivity from the AP to the STA about once every 3-5 rekeys. With your
fix everything runs smoothly.

Tested-by: Nicolas Escande <nico.escande@xxxxxxxxx>
Tested-on: QCN9074 hw1.0 PCI WLAN.HK.2.9.0.1-01977-QCAHKSWPL_SILICONZ-1
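
For readers following the thread, the gating rules described in the quoted
commit message can be modelled roughly as below. This is a standalone
userspace sketch, not the ath11k code: the struct, the enum and every
model_* function name are hypothetical, and the exact meaning given to
reset_group_keys here (set on each key install, consumed by the first
station that associates afterwards) is one reading of the description, not
something the message spells out. Firmware commands are represented only by
counters.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical interface modes, mirroring "AP, STA, MESH" from the text. */
enum vif_mode { MODE_AP, MODE_STA, MODE_MESH };

/* Hypothetical per-BSS state; field names are illustrative only. */
struct bss_model {
    enum vif_mode mode;
    int num_stations;       /* per-BSS station counter */
    bool key_valid;         /* a group key is currently set */
    int last_key;           /* stand-in for the last key material */
    bool reset_group_keys;  /* assumed: reinstall pending for this rekey period */
    int fw_installs;        /* simulated install commands sent to firmware */
    int fw_clears;          /* simulated clear commands sent to firmware */
};

/* Installing a key always goes through; the new key overwrites the old one. */
static void model_set_group_key(struct bss_model *b, int key)
{
    b->last_key = key;
    b->key_valid = true;
    b->fw_installs++;
    b->reset_group_keys = true;  /* allow one reinstall on first association */
}

/* Clearing is only forwarded in AP mode with no stations associated. */
static bool model_clear_group_key(struct bss_model *b)
{
    if (b->mode != MODE_AP || b->num_stations > 0)
        return false;            /* clear request is skipped */
    b->key_valid = false;
    b->fw_clears++;
    return true;
}

/* First station in AP/Mesh mode: reinstall the last key once per rekey. */
static void model_station_added(struct bss_model *b)
{
    b->num_stations++;
    if (b->num_stations == 1 && b->mode != MODE_STA &&
        b->key_valid && b->reset_group_keys) {
        b->fw_installs++;        /* same key, set again */
        b->reset_group_keys = false;
    }
}

static void model_station_removed(struct bss_model *b)
{
    assert(b->num_stations > 0);
    b->num_stations--;
}
```

In this model a rekey with stations connected produces a single install and
no clear, and a station count that bounces between zero and one within one
rekey period triggers at most one reinstall of the same key.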