6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christoph Paasch cpaasch@openai.com
[ Upstream commit 531d0d32de3e1b6b77a87bd37de0c2c6e17b496a ]
gso_size is expected by the networking stack to be the size of the payload (thus, not including ethernet/IP/TCP-headers). However, cqe_bcnt is the full sized frame (including the headers). Dividing cqe_bcnt by lro_num_seg will then give incorrect results.
For example, running a bpftrace higher up in the TCP-stack (tcp_event_data_recv), we commonly have gso_size set to 1450 or 1451 even though in reality the payload was only 1448 bytes.
This can have unintended consequences: - In tcp_measure_rcv_mss() len will be for example 1450, but. rcv_mss will be 1448 (because tp->advmss is 1448). Thus, we will always recompute scaling_ratio each time an LRO-packet is received. - In tcp_gro_receive(), it will interfere with the decision whether or not to flush and thus potentially result in less gro'ed packets.
So, we need to discount the protocol headers from cqe_bcnt so we can actually divide the payload by lro_num_seg to get the real gso_size.
v2: - Use "(unsigned char *)tcp + tcp->doff * 4 - skb->data)" to compute header-len (Tariq Toukan tariqt@nvidia.com) - Improve commit-message (Gal Pressman gal@nvidia.com)
Fixes: e586b3b0baee ("net/mlx5: Ethernet Datapath files") Signed-off-by: Christoph Paasch cpaasch@openai.com Reviewed-by: Tariq Toukan tariqt@nvidia.com Reviewed-by: Gal Pressman gal@nvidia.com Link: https://patch.msgid.link/20250715-cpaasch-pf-925-investigate-incorrect-gso_s... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c index 57b0e26696e30..d5731f7be04fd 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c @@ -1159,8 +1159,9 @@ static void mlx5e_lro_update_tcp_hdr(struct mlx5_cqe64 *cqe, struct tcphdr *tcp) } }
-static void mlx5e_lro_update_hdr(struct sk_buff *skb, struct mlx5_cqe64 *cqe, - u32 cqe_bcnt) +static unsigned int mlx5e_lro_update_hdr(struct sk_buff *skb, + struct mlx5_cqe64 *cqe, + u32 cqe_bcnt) { struct ethhdr *eth = (struct ethhdr *)(skb->data); struct tcphdr *tcp; @@ -1211,6 +1212,8 @@ static void mlx5e_lro_update_hdr(struct sk_buff *skb, struct mlx5_cqe64 *cqe, tcp->check = csum_ipv6_magic(&ipv6->saddr, &ipv6->daddr, payload_len, IPPROTO_TCP, check); } + + return (unsigned int)((unsigned char *)tcp + tcp->doff * 4 - skb->data); }
static void *mlx5e_shampo_get_packet_hd(struct mlx5e_rq *rq, u16 header_index) @@ -1567,8 +1570,9 @@ static inline void mlx5e_build_rx_skb(struct mlx5_cqe64 *cqe, mlx5e_macsec_offload_handle_rx_skb(netdev, skb, cqe);
if (lro_num_seg > 1) { - mlx5e_lro_update_hdr(skb, cqe, cqe_bcnt); - skb_shinfo(skb)->gso_size = DIV_ROUND_UP(cqe_bcnt, lro_num_seg); + unsigned int hdrlen = mlx5e_lro_update_hdr(skb, cqe, cqe_bcnt); + + skb_shinfo(skb)->gso_size = DIV_ROUND_UP(cqe_bcnt - hdrlen, lro_num_seg); /* Subtract one since we already counted this as one * "regular" packet in mlx5e_complete_rx_cqe() */