Users of the Lenovo ThinkPad X13s have reported that Wi-Fi sometimes breaks and the log fills up with errors like:
ath11k_pci 0006:01:00.0: HTC Rx: insufficient length, got 1484, expected 1492 ath11k_pci 0006:01:00.0: HTC Rx: insufficient length, got 1460, expected 1484
which based on a quick look at the driver seemed to indicate some kind of ring-buffer corruption.
Miaoqing Pan tracked it down to the host seeing the updated destination ring head pointer before the updated descriptor, and the error handling for that in turn leaves the ring buffer in an inconsistent state.
Add the missing memory barrier to make sure that the descriptor is read after the head pointer to address the root cause of the corruption while fixing up the error handling in case there are ever any (ordering) bugs on the device side.
Note that the READ_ONCE() are only needed to avoid compiler mischief in case the ring-buffer helpers are ever inlined.
Tested-on: WCN6855 hw2.1 WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.41
Fixes: d5c65159f289 ("ath11k: driver for Qualcomm IEEE 802.11ax devices") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218623 Link: https://lore.kernel.org/20250310010217.3845141-3-quic_miaoqing@quicinc.com Cc: Miaoqing Pan quic_miaoqing@quicinc.com Cc: stable@vger.kernel.org # 5.6 Signed-off-by: Johan Hovold johan+linaro@kernel.org --- drivers/net/wireless/ath/ath11k/ce.c | 11 +++++------ drivers/net/wireless/ath/ath11k/hal.c | 4 ++-- 2 files changed, 7 insertions(+), 8 deletions(-)
diff --git a/drivers/net/wireless/ath/ath11k/ce.c b/drivers/net/wireless/ath/ath11k/ce.c index e66e86bdec20..9d8efec46508 100644 --- a/drivers/net/wireless/ath/ath11k/ce.c +++ b/drivers/net/wireless/ath/ath11k/ce.c @@ -393,11 +393,10 @@ static int ath11k_ce_completed_recv_next(struct ath11k_ce_pipe *pipe, goto err; }
+ /* Make sure descriptor is read after the head pointer. */ + dma_rmb(); + *nbytes = ath11k_hal_ce_dst_status_get_length(desc); - if (*nbytes == 0) { - ret = -EIO; - goto err; - }
*skb = pipe->dest_ring->skb[sw_index]; pipe->dest_ring->skb[sw_index] = NULL; @@ -430,8 +429,8 @@ static void ath11k_ce_recv_process_cb(struct ath11k_ce_pipe *pipe) dma_unmap_single(ab->dev, ATH11K_SKB_RXCB(skb)->paddr, max_nbytes, DMA_FROM_DEVICE);
- if (unlikely(max_nbytes < nbytes)) { - ath11k_warn(ab, "rxed more than expected (nbytes %d, max %d)", + if (unlikely(max_nbytes < nbytes || nbytes == 0)) { + ath11k_warn(ab, "unexpected rx length (nbytes %d, max %d)", nbytes, max_nbytes); dev_kfree_skb_any(skb); continue; diff --git a/drivers/net/wireless/ath/ath11k/hal.c b/drivers/net/wireless/ath/ath11k/hal.c index 61f4b6dd5380..8cb1505a5a0c 100644 --- a/drivers/net/wireless/ath/ath11k/hal.c +++ b/drivers/net/wireless/ath/ath11k/hal.c @@ -599,7 +599,7 @@ u32 ath11k_hal_ce_dst_status_get_length(void *buf) struct hal_ce_srng_dst_status_desc *desc = buf; u32 len;
- len = FIELD_GET(HAL_CE_DST_STATUS_DESC_FLAGS_LEN, desc->flags); + len = FIELD_GET(HAL_CE_DST_STATUS_DESC_FLAGS_LEN, READ_ONCE(desc->flags)); desc->flags &= ~HAL_CE_DST_STATUS_DESC_FLAGS_LEN;
return len; @@ -829,7 +829,7 @@ void ath11k_hal_srng_access_begin(struct ath11k_base *ab, struct hal_srng *srng) srng->u.src_ring.cached_tp = *(volatile u32 *)srng->u.src_ring.tp_addr; } else { - srng->u.dst_ring.cached_hp = *srng->u.dst_ring.hp_addr; + srng->u.dst_ring.cached_hp = READ_ONCE(*srng->u.dst_ring.hp_addr);
/* Try to prefetch the next descriptor in the ring */ if (srng->flags & HAL_SRNG_FLAGS_CACHED)
On 3/21/25 02:49, Johan Hovold wrote:
Users of the Lenovo ThinkPad X13s have reported that Wi-Fi sometimes breaks and the log fills up with errors like:
ath11k_pci 0006:01:00.0: HTC Rx: insufficient length, got 1484, expected 1492 ath11k_pci 0006:01:00.0: HTC Rx: insufficient length, got 1460, expected 1484
which based on a quick look at the driver seemed to indicate some kind of ring-buffer corruption.
Miaoqing Pan tracked it down to the host seeing the updated destination ring head pointer before the updated descriptor, and the error handling for that in turn leaves the ring buffer in an inconsistent state.
Add the missing memory barrier to make sure that the descriptor is read after the head pointer to address the root cause of the corruption while fixing up the error handling in case there are ever any (ordering) bugs on the device side.
Note that the READ_ONCE() are only needed to avoid compiler mischief in case the ring-buffer helpers are ever inlined.
Tested-on: WCN6855 hw2.1 WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.41
Fixes: d5c65159f289 ("ath11k: driver for Qualcomm IEEE 802.11ax devices") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218623 Link: https://lore.kernel.org/20250310010217.3845141-3-quic_miaoqing@quicinc.com Cc: Miaoqing Pan quic_miaoqing@quicinc.com Cc: stable@vger.kernel.org # 5.6 Signed-off-by: Johan Hovold johan+linaro@kernel.org
drivers/net/wireless/ath/ath11k/ce.c | 11 +++++------ drivers/net/wireless/ath/ath11k/hal.c | 4 ++-- 2 files changed, 7 insertions(+), 8 deletions(-)
diff --git a/drivers/net/wireless/ath/ath11k/ce.c b/drivers/net/wireless/ath/ath11k/ce.c index e66e86bdec20..9d8efec46508 100644 --- a/drivers/net/wireless/ath/ath11k/ce.c +++ b/drivers/net/wireless/ath/ath11k/ce.c @@ -393,11 +393,10 @@ static int ath11k_ce_completed_recv_next(struct ath11k_ce_pipe *pipe, goto err; }
- /* Make sure descriptor is read after the head pointer. */
- dma_rmb();
- *nbytes = ath11k_hal_ce_dst_status_get_length(desc);
- if (*nbytes == 0) {
ret = -EIO;
goto err;
- }
*skb = pipe->dest_ring->skb[sw_index]; pipe->dest_ring->skb[sw_index] = NULL; @@ -430,8 +429,8 @@ static void ath11k_ce_recv_process_cb(struct ath11k_ce_pipe *pipe) dma_unmap_single(ab->dev, ATH11K_SKB_RXCB(skb)->paddr, max_nbytes, DMA_FROM_DEVICE);
if (unlikely(max_nbytes < nbytes)) {
ath11k_warn(ab, "rxed more than expected (nbytes %d, max %d)",
if (unlikely(max_nbytes < nbytes || nbytes == 0)) {
ath11k_warn(ab, "unexpected rx length (nbytes %d, max %d)", nbytes, max_nbytes); dev_kfree_skb_any(skb); continue;
diff --git a/drivers/net/wireless/ath/ath11k/hal.c b/drivers/net/wireless/ath/ath11k/hal.c index 61f4b6dd5380..8cb1505a5a0c 100644 --- a/drivers/net/wireless/ath/ath11k/hal.c +++ b/drivers/net/wireless/ath/ath11k/hal.c @@ -599,7 +599,7 @@ u32 ath11k_hal_ce_dst_status_get_length(void *buf) struct hal_ce_srng_dst_status_desc *desc = buf; u32 len;
- len = FIELD_GET(HAL_CE_DST_STATUS_DESC_FLAGS_LEN, desc->flags);
- len = FIELD_GET(HAL_CE_DST_STATUS_DESC_FLAGS_LEN, READ_ONCE(desc->flags)); desc->flags &= ~HAL_CE_DST_STATUS_DESC_FLAGS_LEN;
return len; @@ -829,7 +829,7 @@ void ath11k_hal_srng_access_begin(struct ath11k_base *ab, struct hal_srng *srng) srng->u.src_ring.cached_tp = *(volatile u32 *)srng->u.src_ring.tp_addr; } else {
srng->u.dst_ring.cached_hp = *srng->u.dst_ring.hp_addr;
srng->u.dst_ring.cached_hp = READ_ONCE(*srng->u.dst_ring.hp_addr);
/* Try to prefetch the next descriptor in the ring */ if (srng->flags & HAL_SRNG_FLAGS_CACHED)
I was experiencing this issue several times per day, but haven't hit it once in the last few days while testing this patch.
Tested-by: Clayton Craft clayton@craftyguy.net
On 3/21/25 10:49, Johan Hovold wrote:
Users of the Lenovo ThinkPad X13s have reported that Wi-Fi sometimes breaks and the log fills up with errors like:
ath11k_pci 0006:01:00.0: HTC Rx: insufficient length, got 1484, expected 1492 ath11k_pci 0006:01:00.0: HTC Rx: insufficient length, got 1460, expected 1484
which based on a quick look at the driver seemed to indicate some kind of ring-buffer corruption.
I had this issue on the Windows Dev Kit 2023 and X13s, interestingly never on the HP Omnibook X14. With this patch there have been no occurrences.
Tested-by: Jens Glathe jens.glathe@oldschoolsolutions.biz
with best regards
Jens
Hi Johan,
On Fri, Mar 21, 2025 at 4:51 AM Johan Hovold johan+linaro@kernel.org wrote:
Users of the Lenovo ThinkPad X13s have reported that Wi-Fi sometimes breaks and the log fills up with errors like:
ath11k_pci 0006:01:00.0: HTC Rx: insufficient length, got 1484, expected 1492 ath11k_pci 0006:01:00.0: HTC Rx: insufficient length, got 1460, expected 1484
which based on a quick look at the driver seemed to indicate some kind of ring-buffer corruption.
Miaoqing Pan tracked it down to the host seeing the updated destination ring head pointer before the updated descriptor, and the error handling for that in turn leaves the ring buffer in an inconsistent state.
Add the missing memory barrier to make sure that the descriptor is read after the head pointer to address the root cause of the corruption while fixing up the error handling in case there are ever any (ordering) bugs on the device side.
Note that the READ_ONCE() are only needed to avoid compiler mischief in case the ring-buffer helpers are ever inlined.
Tested-on: WCN6855 hw2.1 WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.41
Fixes: d5c65159f289 ("ath11k: driver for Qualcomm IEEE 802.11ax devices") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218623 Link: https://lore.kernel.org/20250310010217.3845141-3-quic_miaoqing@quicinc.com Cc: Miaoqing Pan quic_miaoqing@quicinc.com Cc: stable@vger.kernel.org # 5.6 Signed-off-by: Johan Hovold johan+linaro@kernel.org
drivers/net/wireless/ath/ath11k/ce.c | 11 +++++------ drivers/net/wireless/ath/ath11k/hal.c | 4 ++-- 2 files changed, 7 insertions(+), 8 deletions(-)
diff --git a/drivers/net/wireless/ath/ath11k/ce.c b/drivers/net/wireless/ath/ath11k/ce.c index e66e86bdec20..9d8efec46508 100644 --- a/drivers/net/wireless/ath/ath11k/ce.c +++ b/drivers/net/wireless/ath/ath11k/ce.c @@ -393,11 +393,10 @@ static int ath11k_ce_completed_recv_next(struct ath11k_ce_pipe *pipe, goto err; }
/* Make sure descriptor is read after the head pointer. */
dma_rmb();
*nbytes = ath11k_hal_ce_dst_status_get_length(desc);
if (*nbytes == 0) {
ret = -EIO;
goto err;
} *skb = pipe->dest_ring->skb[sw_index]; pipe->dest_ring->skb[sw_index] = NULL;
@@ -430,8 +429,8 @@ static void ath11k_ce_recv_process_cb(struct ath11k_ce_pipe *pipe) dma_unmap_single(ab->dev, ATH11K_SKB_RXCB(skb)->paddr, max_nbytes, DMA_FROM_DEVICE);
if (unlikely(max_nbytes < nbytes)) {
ath11k_warn(ab, "rxed more than expected (nbytes %d, max %d)",
if (unlikely(max_nbytes < nbytes || nbytes == 0)) {
ath11k_warn(ab, "unexpected rx length (nbytes %d, max %d)", nbytes, max_nbytes); dev_kfree_skb_any(skb); continue;
diff --git a/drivers/net/wireless/ath/ath11k/hal.c b/drivers/net/wireless/ath/ath11k/hal.c index 61f4b6dd5380..8cb1505a5a0c 100644 --- a/drivers/net/wireless/ath/ath11k/hal.c +++ b/drivers/net/wireless/ath/ath11k/hal.c @@ -599,7 +599,7 @@ u32 ath11k_hal_ce_dst_status_get_length(void *buf) struct hal_ce_srng_dst_status_desc *desc = buf; u32 len;
len = FIELD_GET(HAL_CE_DST_STATUS_DESC_FLAGS_LEN, desc->flags);
len = FIELD_GET(HAL_CE_DST_STATUS_DESC_FLAGS_LEN, READ_ONCE(desc->flags)); desc->flags &= ~HAL_CE_DST_STATUS_DESC_FLAGS_LEN; return len;
@@ -829,7 +829,7 @@ void ath11k_hal_srng_access_begin(struct ath11k_base *ab, struct hal_srng *srng) srng->u.src_ring.cached_tp = *(volatile u32 *)srng->u.src_ring.tp_addr; } else {
srng->u.dst_ring.cached_hp = *srng->u.dst_ring.hp_addr;
srng->u.dst_ring.cached_hp = READ_ONCE(*srng->u.dst_ring.hp_addr); /* Try to prefetch the next descriptor in the ring */ if (srng->flags & HAL_SRNG_FLAGS_CACHED)
-- 2.48.1
Like the others, I was seeing this message multiple times a day, and on a not so rare occasion it would also take the wifi device offline, with this patch applied, I do not see that at all here.
Tested-by: Steev Klimaszewski steev@kali.org
Hi Jeff,
On Fri, Mar 21, 2025 at 10:49:16AM +0100, Johan Hovold wrote:
Users of the Lenovo ThinkPad X13s have reported that Wi-Fi sometimes breaks and the log fills up with errors like:
ath11k_pci 0006:01:00.0: HTC Rx: insufficient length, got 1484, expected 1492 ath11k_pci 0006:01:00.0: HTC Rx: insufficient length, got 1460, expected 1484
which based on a quick look at the driver seemed to indicate some kind of ring-buffer corruption.
Miaoqing Pan tracked it down to the host seeing the updated destination ring head pointer before the updated descriptor, and the error handling for that in turn leaves the ring buffer in an inconsistent state.
Add the missing memory barrier to make sure that the descriptor is read after the head pointer to address the root cause of the corruption while fixing up the error handling in case there are ever any (ordering) bugs on the device side.
Note that the READ_ONCE() are only needed to avoid compiler mischief in case the ring-buffer helpers are ever inlined.
Tested-on: WCN6855 hw2.1 WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.41
Fixes: d5c65159f289 ("ath11k: driver for Qualcomm IEEE 802.11ax devices") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218623 Link: https://lore.kernel.org/20250310010217.3845141-3-quic_miaoqing@quicinc.com Cc: Miaoqing Pan quic_miaoqing@quicinc.com Cc: stable@vger.kernel.org # 5.6 Signed-off-by: Johan Hovold johan+linaro@kernel.org
This patch fixes a long-standing issue that is hurting user of machines like the X13s, but as far as I can tell it has not been picked up yet (perhaps due to the temporary MAINTAINERS glitch that caused the wireless list not to be CCed?).
Would be good to get this fixed in 6.15 and backported to stable.
The following related patches are also not yet in linux-next:
https://lore.kernel.org/lkml/20250321145302.4775-1-johan+linaro@kernel.org/ https://lore.kernel.org/lkml/20250321095219.19369-1-johan+linaro@kernel.org/
Note that I still intend to sending follow-on fixes for the other missing barriers, but I'm a bit short on time at the moment.
Johan
On 3/21/2025 5:49 PM, Johan Hovold wrote:
Users of the Lenovo ThinkPad X13s have reported that Wi-Fi sometimes breaks and the log fills up with errors like:
ath11k_pci 0006:01:00.0: HTC Rx: insufficient length, got 1484, expected 1492 ath11k_pci 0006:01:00.0: HTC Rx: insufficient length, got 1460, expected 1484
which based on a quick look at the driver seemed to indicate some kind of ring-buffer corruption.
Miaoqing Pan tracked it down to the host seeing the updated destination ring head pointer before the updated descriptor, and the error handling for that in turn leaves the ring buffer in an inconsistent state.
Add the missing memory barrier to make sure that the descriptor is read after the head pointer to address the root cause of the corruption while fixing up the error handling in case there are ever any (ordering) bugs on the device side.
Note that the READ_ONCE() are only needed to avoid compiler mischief in case the ring-buffer helpers are ever inlined.
Tested-on: WCN6855 hw2.1 WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.41
Fixes: d5c65159f289 ("ath11k: driver for Qualcomm IEEE 802.11ax devices") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218623 Link: https://lore.kernel.org/20250310010217.3845141-3-quic_miaoqing@quicinc.com Cc: Miaoqing Pan quic_miaoqing@quicinc.com Cc: stable@vger.kernel.org # 5.6 Signed-off-by: Johan Hovold johan+linaro@kernel.org
drivers/net/wireless/ath/ath11k/ce.c | 11 +++++------ drivers/net/wireless/ath/ath11k/hal.c | 4 ++-- 2 files changed, 7 insertions(+), 8 deletions(-)
diff --git a/drivers/net/wireless/ath/ath11k/ce.c b/drivers/net/wireless/ath/ath11k/ce.c index e66e86bdec20..9d8efec46508 100644 --- a/drivers/net/wireless/ath/ath11k/ce.c +++ b/drivers/net/wireless/ath/ath11k/ce.c @@ -393,11 +393,10 @@ static int ath11k_ce_completed_recv_next(struct ath11k_ce_pipe *pipe, goto err; }
- /* Make sure descriptor is read after the head pointer. */
- dma_rmb();
- *nbytes = ath11k_hal_ce_dst_status_get_length(desc);
- if (*nbytes == 0) {
ret = -EIO;
goto err;
- }
*skb = pipe->dest_ring->skb[sw_index]; pipe->dest_ring->skb[sw_index] = NULL; @@ -430,8 +429,8 @@ static void ath11k_ce_recv_process_cb(struct ath11k_ce_pipe *pipe) dma_unmap_single(ab->dev, ATH11K_SKB_RXCB(skb)->paddr, max_nbytes, DMA_FROM_DEVICE);
if (unlikely(max_nbytes < nbytes)) {
ath11k_warn(ab, "rxed more than expected (nbytes %d, max %d)",
if (unlikely(max_nbytes < nbytes || nbytes == 0)) {
ath11k_warn(ab, "unexpected rx length (nbytes %d, max %d)", nbytes, max_nbytes); dev_kfree_skb_any(skb); continue;
diff --git a/drivers/net/wireless/ath/ath11k/hal.c b/drivers/net/wireless/ath/ath11k/hal.c index 61f4b6dd5380..8cb1505a5a0c 100644 --- a/drivers/net/wireless/ath/ath11k/hal.c +++ b/drivers/net/wireless/ath/ath11k/hal.c @@ -599,7 +599,7 @@ u32 ath11k_hal_ce_dst_status_get_length(void *buf) struct hal_ce_srng_dst_status_desc *desc = buf; u32 len;
- len = FIELD_GET(HAL_CE_DST_STATUS_DESC_FLAGS_LEN, desc->flags);
- len = FIELD_GET(HAL_CE_DST_STATUS_DESC_FLAGS_LEN, READ_ONCE(desc->flags)); desc->flags &= ~HAL_CE_DST_STATUS_DESC_FLAGS_LEN;
return len; @@ -829,7 +829,7 @@ void ath11k_hal_srng_access_begin(struct ath11k_base *ab, struct hal_srng *srng) srng->u.src_ring.cached_tp = *(volatile u32 *)srng->u.src_ring.tp_addr; } else {
srng->u.dst_ring.cached_hp = *srng->u.dst_ring.hp_addr;
srng->u.dst_ring.cached_hp = READ_ONCE(*srng->u.dst_ring.hp_addr);
/* Try to prefetch the next descriptor in the ring */ if (srng->flags & HAL_SRNG_FLAGS_CACHED)
Reviewed-by: Miaoqing Pan quic_miaoqing@quicinc.com
On Fri, 21 Mar 2025 10:49:16 +0100, Johan Hovold wrote:
Users of the Lenovo ThinkPad X13s have reported that Wi-Fi sometimes breaks and the log fills up with errors like:
ath11k_pci 0006:01:00.0: HTC Rx: insufficient length, got 1484, expected 1492 ath11k_pci 0006:01:00.0: HTC Rx: insufficient length, got 1460, expected 1484
which based on a quick look at the driver seemed to indicate some kind of ring-buffer corruption.
[...]
Applied, thanks!
[1/1] wifi: ath11k: fix ring-buffer corruption commit: 6d037a372f817e9fcb56482f37917545596bd776
Best regards,
linux-stable-mirror@lists.linaro.org