Re: [PATCH 1/3] wifi: ath11k: fix dest ring-buffer corruption

3 Jun 2025


On 6/2/2025 4:03 PM, Johan Hovold wrote:
...
On Thu, May 29, 2025 at 03:03:38PM +0800, Miaoqing Pan wrote:
...
On 5/26/2025 7:48 PM, Johan Hovold wrote:
...
Add the missing memory barriers to make sure that destination ring
descriptors are read after the head pointers to avoid using stale data
on weakly ordered architectures like aarch64.
...
...
@@ -3851,6 +3851,9 @@ int ath11k_dp_process_rx_err(struct ath11k_base *ab, struct napi_struct *napi,
  
 	ath11k_hal_srng_access_begin(ab, srng);

+	/* Make sure descriptor is read after the head pointer. */
+	dma_rmb();
+

Thanks, Johan, for continuing to follow up on this issue. I have a
different view.

This change deviates somewhat from the fix approach described in
https://lore.kernel.org/all/20250321095219.19369-1-johan+linaro@kernel.org/.
In this case, the descriptor might be accessed before it is updated or
while it is still being updated. Therefore, a dma_rmb() should be added
after the call to ath11k_hal_srng_dst_get_next_entry() and before
calling ath11k_hal_ce_dst_status_get_length(), to ensure that the DMA
has completed before the descriptor is read.

However, in this patch, the memory barrier is used to protect the head
pointer (HP). I don't think a memory barrier is necessary for HP,
because even if an outdated HP is fetched,
ath11k_hal_srng_dst_get_next_entry() will return NULL and exit safely.
No, the barrier is needed between reading the head pointer and accessing
the descriptor fields; that is what matters.
You can still end up reading stale descriptor data even when
ath11k_hal_srng_dst_get_next_entry() returns non-NULL, due to speculation
(that's what happens on the X13s).
The fact is that a dma_rmb() does not even prevent speculation, no matter where it is
placed, right? If so, the whole point of dma_rmb() is to prevent compiler or CPU
reordering, but is such reordering really possible here?
The sequence is:
1# reading HP
	srng->u.dst_ring.cached_hp = READ_ONCE(*srng->u.dst_ring.hp_addr);
2# validate HP
	if (srng->u.dst_ring.tp == srng->u.dst_ring.cached_hp)
		return NULL;
3# get desc
	desc = srng->ring_base_vaddr + srng->u.dst_ring.tp;
4# accessing desc
	ath11k_hal_desc_reo_parse_err(... desc, ...)
Clearly, each step depends on the result of the previous one. In that case, shouldn't
the compiler and CPU be expected not to reorder them?
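
For illustration, the four steps above can be sketched in userspace C11.
This is only a model under stated assumptions, not the driver code:
struct demo_dst_ring and the demo_* helpers are hypothetical names, and
atomic_thread_fence(memory_order_acquire) stands in for dma_rmb(). Note
that the descriptor address in step 3 is computed from tp, not from the
loaded HP value, so the only link from the HP load to the descriptor load
is the branch in step 2. The kernel's memory-barriers documentation
describes control dependencies as ordering prior loads against later
stores only, not against later loads.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_RING_ENTRIES 8	/* hypothetical ring size, in descriptors */

/* Hypothetical userspace stand-in for a destination ring. */
struct demo_dst_ring {
	uint32_t desc[DEMO_RING_ENTRIES];	/* written by the "device" */
	_Atomic uint32_t hp;			/* head pointer, written by the "device" */
	uint32_t cached_hp;			/* consumer's snapshot of hp */
	uint32_t tp;				/* tail pointer, owned by the consumer */
};

/* Producer model: write the descriptor first, then publish it via hp. */
static void demo_device_post(struct demo_dst_ring *r, uint32_t val)
{
	uint32_t hp = atomic_load_explicit(&r->hp, memory_order_relaxed);

	r->desc[hp] = val;
	atomic_store_explicit(&r->hp, (hp + 1) % DEMO_RING_ENTRIES,
			      memory_order_release);
}

/* Consumer model: returns 1 and fills *out, or 0 if the ring is empty. */
static int demo_consume_one(struct demo_dst_ring *r, uint32_t *out)
{
	const uint32_t *desc;

	/* 1# reading HP */
	r->cached_hp = atomic_load_explicit(&r->hp, memory_order_relaxed);

	/* Stand-in for dma_rmb(): order the HP load before descriptor loads. */
	atomic_thread_fence(memory_order_acquire);

	/* 2# validate HP (a control dependency on the HP load, nothing more) */
	if (r->tp == r->cached_hp)
		return 0;

	/* 3# get desc (address comes from tp, not from the HP value) */
	desc = &r->desc[r->tp];

	/* 4# accessing desc */
	*out = *desc;
	r->tp = (r->tp + 1) % DEMO_RING_ENTRIES;
	return 1;
}
```

In this model, dropping the fence would leave nothing that orders the load
in step 4 after the load in step 1 on a weakly ordered CPU.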
...
Whether to place it before or after (or inside)
ath11k_hal_srng_dst_get_next_entry() is a trade-off between readability,
maintainability and whether we want to avoid unnecessary barriers in
cases like the above where we strictly only need one barrier before the
loop (or if we want to avoid the barrier in case the ring is ever
empty).
...
So, placing the memory barrier inside 
ath11k_hal_srng_dst_get_next_entry() would be more appropriate.
@@ -678,6 +678,8 @@ u32 *ath11k_hal_srng_dst_get_next_entry(struct ath11k_base *ab,
 	if (srng->flags & HAL_SRNG_FLAGS_CACHED)
 		ath11k_hal_srng_prefetch_desc(ab, srng);

+	dma_rmb();
+
 	return desc;
 }


So this will add a barrier in each iteration of the loop, but we only
need a single one after reading the head pointer.
[ Also note that ath11k_hal_srng_dst_peek() would similarly need a
barrier if we were to move them into those helpers. ]
Johan
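
To illustrate the single-barrier placement discussed above, here is a
minimal userspace C11 sketch of a drain loop. It is an assumption-laden
model rather than the ath11k implementation: struct sketch_ring and the
sketch_* helpers are hypothetical, and
atomic_thread_fence(memory_order_acquire) stands in for dma_rmb(). The
head pointer is snapshotted once, one barrier follows, and the loop then
walks every entry up to that snapshot without further barriers.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_RING_ENTRIES 8	/* hypothetical ring size, in descriptors */

/* Hypothetical userspace stand-in for a destination ring. */
struct sketch_ring {
	uint32_t desc[SKETCH_RING_ENTRIES];	/* written by the "device" */
	_Atomic uint32_t hp;			/* head pointer, written by the "device" */
	uint32_t cached_hp;			/* snapshot taken at access begin */
	uint32_t tp;				/* tail pointer, owned by the consumer */
};

/* Producer model: write the descriptor first, then publish it via hp. */
static void sketch_post(struct sketch_ring *r, uint32_t val)
{
	uint32_t hp = atomic_load_explicit(&r->hp, memory_order_relaxed);

	r->desc[hp] = val;
	atomic_store_explicit(&r->hp, (hp + 1) % SKETCH_RING_ENTRIES,
			      memory_order_release);
}

/* Snapshot the head pointer once per processing pass. */
static void sketch_access_begin(struct sketch_ring *r)
{
	r->cached_hp = atomic_load_explicit(&r->hp, memory_order_relaxed);
}

/* Return the next entry below the cached head pointer; no barrier here. */
static const uint32_t *sketch_get_next_entry(struct sketch_ring *r)
{
	const uint32_t *desc;

	if (r->tp == r->cached_hp)
		return NULL;

	desc = &r->desc[r->tp];
	r->tp = (r->tp + 1) % SKETCH_RING_ENTRIES;
	return desc;
}

/* Drain up to max entries into out[]; returns the number copied. */
static unsigned int sketch_drain(struct sketch_ring *r, uint32_t *out,
				 unsigned int max)
{
	const uint32_t *desc;
	unsigned int n = 0;

	sketch_access_begin(r);

	/* One barrier for the whole pass, right after the HP read. */
	atomic_thread_fence(memory_order_acquire);

	while (n < max && (desc = sketch_get_next_entry(r)))
		out[n++] = *desc;

	return n;
}
```

Moving the fence into sketch_get_next_entry() would also be correct in this
model, but it would then run once per entry instead of once per pass.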

    
