On 06/18, Willem de Bruijn wrote:
> From: Willem de Bruijn <willemb@xxxxxxxxxx>
>
> BPF_MAP_TYPE_LRU_HASH can recycle its most recent elements well before
> the map is full, due to percpu reservations and force shrink before
> neighbor stealing. Once a CPU is unable to borrow from the global map,
> it steals one element from a neighbor once, and from then on each time
> flushes that one element to the global list and immediately recycles it.
>
> A batch value of LOCAL_FREE_TARGET (128) will exhaust a 10K-element
> map with 79 CPUs. CPU 79 will observe this behavior even while its
> neighbors hold 78 * 127 + 1 * 15 == 9921 free elements (99%).
>
> The CPUs need not be active concurrently. The issue can appear with
> affinity migration, e.g., irqbalance. Each CPU can reserve and then
> hold onto its 128 elements indefinitely.
>
> Avoid global list exhaustion by limiting the aggregate percpu caches
> to half of the map size, by adjusting LOCAL_FREE_TARGET based on the
> CPU count. This change has no effect on sufficiently large tables.
>
> Similar to LOCAL_NR_SCANS and lru->nr_scans, introduce a map variable
> lru->free_target. The extra field fits in a hole in struct bpf_lru.
> The cacheline is already warm where read in the hot path. The field is
> only accessed with the lru lock held.
>
> Tested-by: Anton Protopopov <a.s.protopopov@xxxxxxxxx>
> Signed-off-by: Willem de Bruijn <willemb@xxxxxxxxxx>

Acked-by: Stanislav Fomichev <sdf@xxxxxxxxxxx>