On 6/4/2025 2:59 PM, Johan Hovold wrote:
On Wed, Jun 04, 2025 at 10:16:23AM +0800, Baochen Qiang wrote:
On 6/3/2025 7:51 PM, Johan Hovold wrote:
On Tue, Jun 03, 2025 at 06:52:37PM +0800, Baochen Qiang wrote:
On 6/2/2025 4:03 PM, Johan Hovold wrote:
No, the barrier is needed between reading the head pointer and accessing descriptor fields, that's what matters.
You can still end up with reading stale descriptor data even when ath11k_hal_srng_dst_get_next_entry() returns non-NULL due to speculation (that's what happens on the X13s).
The fact is that a dma_rmb() does not even prevent speculation, no matter where it is placed, right?
It prevents the speculated load from being used.
Sorry, still not get it. To my knowledge whether the speculated load (steps 3 and 4) would get used depends on whether the condition check pass in step 2. How does a dma_rmb() make any difference in this process?
It orders the two loads from the device so that the descriptor is not (speculatively) loaded before the head pointer.
I was thinking we need barrier_nospec() here to prevent speculatively load, instead of (or in addition to) dma_rmb().
But, seems I was wrong. Even the kernel doc [1] talks about the ordering brought by dma_rmb()
if (desc->status != DEVICE_OWN) { /* do not read data until we own descriptor */ dma_rmb();
/* read/modify data */ read_data = desc->data; [...] }
The dma_rmb() allows us to guarantee that the device has released ownership before we read the data from the descriptor
So a single dma_rmb() should be enough.
When the CPU sees the updated head pointer it may otherwise proceed with using stale descriptor data. The barrier prevents this.
Johan
[1] https://www.kernel.org/doc/Documentation/memory-barriers.txt