On 6/4/2025 10:34 PM, Johan Hovold wrote:
> Add the missing memory barrier to make sure that destination ring
> descriptors are read after the head pointers to avoid using stale data
> on weakly ordered architectures like aarch64.
>
> The barrier is added to the ath11k_hal_srng_access_begin() helper for
> symmetry with follow-on fixes for source ring buffer corruption which
> will add barriers to ath11k_hal_srng_access_end().
>
> Tested-on: WCN6855 hw2.1 WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.41
>
> Fixes: d5c65159f289 ("ath11k: driver for Qualcomm IEEE 802.11ax devices")
> Cc: stable@xxxxxxxxxxxxxxx # 5.6
> Signed-off-by: Johan Hovold <johan+linaro@xxxxxxxxxx>
> ---
>  drivers/net/wireless/ath/ath11k/ce.c    |  3 ---
>  drivers/net/wireless/ath/ath11k/dp_rx.c |  3 ---
>  drivers/net/wireless/ath/ath11k/hal.c   | 12 +++++++++++-
>  3 files changed, 11 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/net/wireless/ath/ath11k/ce.c b/drivers/net/wireless/ath/ath11k/ce.c
> index 9d8efec46508..39d9aad33bc6 100644
> --- a/drivers/net/wireless/ath/ath11k/ce.c
> +++ b/drivers/net/wireless/ath/ath11k/ce.c
> @@ -393,9 +393,6 @@ static int ath11k_ce_completed_recv_next(struct ath11k_ce_pipe *pipe,
>  		goto err;
>  	}
>
> -	/* Make sure descriptor is read after the head pointer. */
> -	dma_rmb();
> -
>  	*nbytes = ath11k_hal_ce_dst_status_get_length(desc);
>
>  	*skb = pipe->dest_ring->skb[sw_index];
> diff --git a/drivers/net/wireless/ath/ath11k/dp_rx.c b/drivers/net/wireless/ath/ath11k/dp_rx.c
> index ea2959305dec..d8dab182a9af 100644
> --- a/drivers/net/wireless/ath/ath11k/dp_rx.c
> +++ b/drivers/net/wireless/ath/ath11k/dp_rx.c
> @@ -2650,9 +2650,6 @@ int ath11k_dp_process_rx(struct ath11k_base *ab, int ring_id,
>  try_again:
>  	ath11k_hal_srng_access_begin(ab, srng);
>
> -	/* Make sure descriptor is read after the head pointer. */
> -	dma_rmb();
> -
>  	while (likely(desc =
>  	      (struct hal_reo_dest_ring *)ath11k_hal_srng_dst_get_next_entry(ab,
>  									     srng))) {
> diff --git a/drivers/net/wireless/ath/ath11k/hal.c b/drivers/net/wireless/ath/ath11k/hal.c
> index 8cb1505a5a0c..921114686ba3 100644
> --- a/drivers/net/wireless/ath/ath11k/hal.c
> +++ b/drivers/net/wireless/ath/ath11k/hal.c
> @@ -823,13 +823,23 @@ u32 *ath11k_hal_srng_src_peek(struct ath11k_base *ab, struct hal_srng *srng)
>
>  void ath11k_hal_srng_access_begin(struct ath11k_base *ab, struct hal_srng *srng)
>  {
> +	u32 hp;
> +
>  	lockdep_assert_held(&srng->lock);
>
>  	if (srng->ring_dir == HAL_SRNG_DIR_SRC) {
>  		srng->u.src_ring.cached_tp =
>  			*(volatile u32 *)srng->u.src_ring.tp_addr;
>  	} else {
> -		srng->u.dst_ring.cached_hp = READ_ONCE(*srng->u.dst_ring.hp_addr);
> +		hp = READ_ONCE(*srng->u.dst_ring.hp_addr);
> +
> +		if (hp != srng->u.dst_ring.cached_hp) {

My ath12k comments apply here: this consumes more CPU cycles.

> +			srng->u.dst_ring.cached_hp = hp;
> +			/* Make sure descriptor is read after the head
> +			 * pointer.
> +			 */
> +			dma_rmb();
> +		}
>
>  		/* Try to prefetch the next descriptor in the ring */
>  		if (srng->flags & HAL_SRNG_FLAGS_CACHED)