On 6/4/2025 2:59 PM, Johan Hovold wrote: > On Wed, Jun 04, 2025 at 10:16:23AM +0800, Baochen Qiang wrote: >> On 6/3/2025 7:51 PM, Johan Hovold wrote: >>> On Tue, Jun 03, 2025 at 06:52:37PM +0800, Baochen Qiang wrote: >>>> On 6/2/2025 4:03 PM, Johan Hovold wrote: >>> >>>>> No, the barrier is needed between reading the head pointer and accessing >>>>> descriptor fields, that's what matters. >>>>> >>>>> You can still end up with reading stale descriptor data even when >>>>> ath11k_hal_srng_dst_get_next_entry() returns non-NULL due to speculation >>>>> (that's what happens on the X13s). >>>> >>>> The fact is that a dma_rmb() does not even prevent speculation, no matter where it is >>>> placed, right? >>> >>> It prevents the speculated load from being used. >> >> Sorry, still not get it. To my knowledge whether the speculated load (steps 3 and 4) would >> get used depends on whether the condition check pass in step 2. How does a dma_rmb() make >> any difference in this process? > > It orders the two loads from the device so that the descriptor is not > (speculatively) loaded before the head pointer. I was thinking we need barrier_nospec() here to prevent speculatively load, instead of (or in addition to) dma_rmb(). But, seems I was wrong. Even the kernel doc [1] talks about the ordering brought by dma_rmb() if (desc->status != DEVICE_OWN) { /* do not read data until we own descriptor */ dma_rmb(); /* read/modify data */ read_data = desc->data; [...] } The dma_rmb() allows us to guarantee that the device has released ownership before we read the data from the descriptor So a single dma_rmb() should be enough. > > When the CPU sees the updated head pointer it may otherwise proceed with > using stale descriptor data. The barrier prevents this. > > Johan [1] https://www.kernel.org/doc/Documentation/memory-barriers.txt