Association comeback delay behavior

James Prestwood <prestwoj@xxxxxxxxx> · Thu, 22 May 2025 10:45:48 -0700

Hi,

After noticing this log "rejected association temporarily; comeback 
duration 1000 TU (1024 ms)" I started looking more into how the kernel 
handles this and noticed a few things:

1. The kernel takes the delay from the association response frame and 
waits, but places no sane bound on how long that wait can be. An AP 
could send 0xffffffff and the kernel will just block for that entire 
duration.

2. The first issue would appear to be guarded against by run_again(), 
which only reschedules if the new timeout is earlier than the expiry of 
the current timer. But that check only applies if there is an existing 
timer set.
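
To make point 2 concrete, here is how I read the rescheduling rule, 
written as a small userspace model. This is a sketch of my 
understanding, not the kernel source: struct model_timer and the 
model_* functions are names I made up for illustration.

```c
#include <stdbool.h>

/* Simplified userspace model of a one-shot timer (hypothetical type,
 * not the kernel's struct timer_list). */
struct model_timer {
	bool pending;           /* is a timer currently armed? */
	unsigned long expires;  /* absolute expiry, in arbitrary ticks */
};

/* Wrap-safe "a is before b", in the spirit of the kernel's time_before(). */
static bool model_time_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/*
 * Model of the rule described above: arm the timer if none is pending,
 * otherwise only ever move the expiry earlier, never later.
 */
static void model_run_again(struct model_timer *t, unsigned long timeout)
{
	if (!t->pending || model_time_before(timeout, t->expires)) {
		t->pending = true;
		t->expires = timeout;
	}
}
```

In this model, a long comeback delay is accepted as-is when no timer is 
pending, and is ignored when a shorter timer is already armed.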

Looking at the code, the association timer gets set when we begin an 
association so it _should_ be set when we hit this comeback delay case. 
But through testing I found that it is not. Hacking hostapd to use 10000 
TUs as the comeback delay, I see this:

[    4.338185] wlan1: associate with 02:00:00:00:00:00 (try 1/3)
[    4.340023] wlan1: RX AssocResp from 02:00:00:00:00:00 (capab=0x411 
status=30 aid=0)
[    4.340409] wlan1: 02:00:00:00:00:00 rejected association 
temporarily; comeback duration 10000 TU (10240 ms)
[   14.654103] wlan1: associate with 02:00:00:00:00:00 (try 2/3)
[   14.657405] wlan1: RX AssocResp from 02:00:00:00:00:00 (capab=0x411 
status=30 aid=0)
[   14.658430] wlan1: 02:00:00:00:00:00 rejected association 
temporarily; comeback duration 10000 TU (10240 ms)
[   14.848706] wlan1: associate with 02:00:00:00:00:00 (try 3/3)
[   14.851596] wlan1: RX AssocResp from 02:00:00:00:00:00 (capab=0x411 
status=30 aid=0)
[   14.854269] wlan1: 02:00:00:00:00:00 rejected association 
temporarily; comeback duration 10000 TU (10240 ms)

So the first association attempt waited the full 10 seconds, then after 
that the timer was presumably set, and we only waited the default 200ms 
(ASSOC_TIMEOUT). So to me, this feels like either a bug or an oversight 
in how this should be handled:

 - If the timer should already be set, this is a bug as I see the 
kernel waiting excessively.

 - If the timer being unset is expected, the kernel should be limiting 
this wait to something reasonable.

I also realize that CMD_ASSOC_COMEBACK was added and userspace gets 
notified, but this feels excessive to handle in userspace when the 
kernel could instead enforce a sane timeout all on its own without 
requiring userspace to disconnect/reconnect when the AP sends an absurd 
timeout.
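
Something along these lines is what I have in mind for a kernel-side 
bound. This is only a sketch: MODEL_MAX_COMEBACK_TU is a made-up value 
and name, not an existing kernel constant, and the right limit is open 
for discussion.

```c
#include <stdint.h>

/* Hypothetical cap, chosen only for illustration; not an existing
 * kernel constant. */
#define MODEL_MAX_COMEBACK_TU 2000u

/* Clamp an AP-supplied comeback duration (in TUs) to the hypothetical cap. */
static uint32_t model_clamp_comeback_tu(uint32_t ap_tu)
{
	return ap_tu > MODEL_MAX_COMEBACK_TU ? MODEL_MAX_COMEBACK_TU : ap_tu;
}
```

With a clamp like this, the 1000 TU value from the original log would 
pass through unchanged, while 0xffffffff would be cut down to the cap.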

My main concern here is a rogue AP scenario that can then DoS all your 
clients that try to connect to it.
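
To put a number on the worst case, here is the TU arithmetic as a small 
helper. One TU is 1024 microseconds; the function and macro names are 
mine, for illustration only, and this is plain arithmetic on the raw 
field value rather than the kernel's own conversion path.

```c
#include <stdint.h>

/* One 802.11 time unit (TU) is 1024 microseconds. */
#define MODEL_TU_USEC 1024ull

/* Convert a comeback duration in TUs to whole milliseconds (truncating). */
static uint64_t model_tu_to_ms(uint32_t tu)
{
	return (uint64_t)tu * MODEL_TU_USEC / 1000;
}
```

This matches the log lines (1000 TU is 1024 ms, 10000 TU is 10240 ms), 
and for 0xffffffff it gives 4398046510 ms, which is roughly 50 days.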

Thanks,

James