If:
1) the user requested USO, but
2) there is not enough payload for GSO to kick in, and
3) the egress device doesn't offer checksum offload, then
we want to compute the L4 checksum in software early on.
In the case when we are not taking the GSO path, but it has been requested, the software checksum fallback in skb_segment doesn't get a chance to compute the full checksum, if the egress device can't do it. As a result we end up sending UDP datagrams with only a partial checksum filled in, which the peer will discard.
Fixes: 10154dbded6d ("udp: Allow GSO transmit from devices with no checksum offload")
Reported-by: Ivan Babrou ivan@cloudflare.com
Signed-off-by: Jakub Sitnicki jakub@cloudflare.com
Acked-by: Willem de Bruijn willemdebruijn.kernel@gmail.com
Cc: stable@vger.kernel.org
---
Changes in v2:
- Fix typo in patch description
- Link to v1: https://lore.kernel.org/r/20241010-uso-swcsum-fixup-v1-1-a63fbd0a414c@cloudf...
---
 net/ipv4/udp.c | 4 +++-
 net/ipv6/udp.c | 4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 8accbf4cb295..2849b273b131 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -951,8 +951,10 @@ static int udp_send_skb(struct sk_buff *skb, struct flowi4 *fl4,
 			skb_shinfo(skb)->gso_type = SKB_GSO_UDP_L4;
 			skb_shinfo(skb)->gso_segs = DIV_ROUND_UP(datalen,
 								 cork->gso_size);
+
+			/* Don't checksum the payload, skb will get segmented */
+			goto csum_partial;
 		}
-		goto csum_partial;
 	}
 
 	if (is_udplite)		/* UDP-Lite */
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 52dfbb2ff1a8..0cef8ae5d1ea 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1266,8 +1266,10 @@ static int udp_v6_send_skb(struct sk_buff *skb, struct flowi6 *fl6,
 			skb_shinfo(skb)->gso_type = SKB_GSO_UDP_L4;
 			skb_shinfo(skb)->gso_segs = DIV_ROUND_UP(datalen,
 								 cork->gso_size);
+
+			/* Don't checksum the payload, skb will get segmented */
+			goto csum_partial;
 		}
-		goto csum_partial;
 	}
 
 	if (is_udplite)
Jakub Sitnicki wrote:
If:
- the user requested USO, but
- there is not enough payload for GSO to kick in, and
- the egress device doesn't offer checksum offload, then
we want to compute the L4 checksum in software early on.
In the case when we are not taking the GSO path, but it has been requested, the software checksum fallback in skb_segment doesn't get a chance to compute the full checksum, if the egress device can't do it. As a result we end up sending UDP datagrams with only a partial checksum filled in, which the peer will discard.
Fixes: 10154dbded6d ("udp: Allow GSO transmit from devices with no checksum offload")
Reported-by: Ivan Babrou ivan@cloudflare.com
Signed-off-by: Jakub Sitnicki jakub@cloudflare.com
Acked-by: Willem de Bruijn willemdebruijn.kernel@gmail.com
Cc: stable@vger.kernel.org
You already included my Acked-by, but just to confirm: LGTM.
Hello:
This patch was applied to netdev/net.git (main) by Jakub Kicinski kuba@kernel.org:
On Fri, 11 Oct 2024 14:17:30 +0200 you wrote:
If:
- the user requested USO, but
- there is not enough payload for GSO to kick in, and
- the egress device doesn't offer checksum offload, then
we want to compute the L4 checksum in software early on.
[...]
Here is the summary with links:
  - [net,v2] udp: Compute L4 checksum as usual when not segmenting the skb
    https://git.kernel.org/netdev/net/c/d96016a764f6
You are awesome, thank you!
Hi Willem,
On Fri, Oct 11, 2024 at 02:17 PM +02, Jakub Sitnicki wrote:
If:
- the user requested USO, but
- there is not enough payload for GSO to kick in, and
- the egress device doesn't offer checksum offload, then
we want to compute the L4 checksum in software early on.
In the case when we are not taking the GSO path, but it has been requested, the software checksum fallback in skb_segment doesn't get a chance to compute the full checksum, if the egress device can't do it. As a result we end up sending UDP datagrams with only a partial checksum filled in, which the peer will discard.
Fixes: 10154dbded6d ("udp: Allow GSO transmit from devices with no checksum offload")
Reported-by: Ivan Babrou ivan@cloudflare.com
Signed-off-by: Jakub Sitnicki jakub@cloudflare.com
Acked-by: Willem de Bruijn willemdebruijn.kernel@gmail.com
Cc: stable@vger.kernel.org
I'm finally circling back to add a regression test for the above fix.
Instead of extending the selftests/net/udpgso.sh test case, I want to propose a different approach. I would like to check if the UDP packets are handed over to the netdevice with the expected checksum (complete or partial, depending on the device features), instead of testing for side effects (packet dropped due to bad checksum).
For that we could use packetdrill. We would need to extend it a bit to allow specifying a UDP checksum in the script, but otherwise it would make writing such tests rather easy. For instance, the regression test for this fix could be as simple as:
---8<---
// Check if sent datagrams with length below GSO size get checksummed correctly

--ip_version=ipv4
--local_ip=192.168.0.1

`
ethtool -K tun0 tx-checksumming off >/dev/null
`

0   socket(..., SOCK_DGRAM, IPPROTO_UDP) = 3
+0  bind(3, ..., ...) = 0
+0  connect(3, ..., ...) = 0

+0  write(3, ..., 1000) = 1000
+0  > udp sum 0x3643 (1000) // expect complete checksum

+0  setsockopt(3, IPPROTO_UDP, UDP_SEGMENT, [1280], 4) = 0
+0  write(3, ..., 1000) = 1000
+0  > udp sum 0x3643 (1000) // expect complete checksum
--->8---
(I'd actually like to have a bit more syntax sugar there, so we can simply specify "sum complete" and have packetdrill figure out the expected checksum value. Then IP address pinning wouldn't be needed.)
If we ever regress, the failure will be straightforward to understand. Here's what I got when running the above test with the fix reverted:
~ # packetdrill dgram_below_gso_size.pkt
dgram_below_gso_size.pkt:19: error handling packet: live packet field l4_csum: expected: 13891 (0x3643) vs actual: 34476 (0x86ac)
script packet:  0.000168 udp sum 0x3643 (1000)
actual packet:  0.000166 udp sum 0x86ac (1000)
~ #
My patched packetdrill PoC is at:
https://github.com/jsitnicki/packetdrill/commits/udp-segment/rfc1/
If we want to go with the packetdrill-based test, that raises the question of where to keep it. In the packetdrill repo? Or with the rest of the selftests/net?
Using the packetdrill repo would make it easier to synchronize the development of packetdrill features with the tests that use them. But we would also have to hook it up to netdev CI.
WDYT?
-jkbs
Jakub Sitnicki wrote:
Hi Willem,
On Fri, Oct 11, 2024 at 02:17 PM +02, Jakub Sitnicki wrote:
If:
- the user requested USO, but
- there is not enough payload for GSO to kick in, and
- the egress device doesn't offer checksum offload, then
we want to compute the L4 checksum in software early on.
In the case when we are not taking the GSO path, but it has been requested, the software checksum fallback in skb_segment doesn't get a chance to compute the full checksum, if the egress device can't do it. As a result we end up sending UDP datagrams with only a partial checksum filled in, which the peer will discard.
Fixes: 10154dbded6d ("udp: Allow GSO transmit from devices with no checksum offload")
Reported-by: Ivan Babrou ivan@cloudflare.com
Signed-off-by: Jakub Sitnicki jakub@cloudflare.com
Acked-by: Willem de Bruijn willemdebruijn.kernel@gmail.com
Cc: stable@vger.kernel.org
I'm finally circling back to add a regression test for the above fix.
Instead of extending the selftests/net/udpgso.sh test case, I want to propose a different approach. I would like to check if the UDP packets are handed over to the netdevice with the expected checksum (complete or partial, depending on the device features), instead of testing for side effects (packet dropped due to bad checksum).
For that we could use packetdrill. We would need to extend it a bit to allow specifying a UDP checksum in the script, but otherwise it would make writing such tests rather easy. For instance, the regression test for this fix could be as simple as:
---8<---
// Check if sent datagrams with length below GSO size get checksummed correctly

--ip_version=ipv4
--local_ip=192.168.0.1

`
ethtool -K tun0 tx-checksumming off >/dev/null
`

0   socket(..., SOCK_DGRAM, IPPROTO_UDP) = 3
+0  bind(3, ..., ...) = 0
+0  connect(3, ..., ...) = 0

+0  write(3, ..., 1000) = 1000
+0  > udp sum 0x3643 (1000) // expect complete checksum

+0  setsockopt(3, IPPROTO_UDP, UDP_SEGMENT, [1280], 4) = 0
+0  write(3, ..., 1000) = 1000
+0  > udp sum 0x3643 (1000) // expect complete checksum
--->8---
(I'd actually like to have a bit more syntax sugar there, so we can simply specify "sum complete" and have packetdrill figure out the expected checksum value. Then IP address pinning wouldn't be needed.)
If we ever regress, the failure will be straightforward to understand. Here's what I got when running the above test with the fix reverted:
~ # packetdrill dgram_below_gso_size.pkt
dgram_below_gso_size.pkt:19: error handling packet: live packet field l4_csum: expected: 13891 (0x3643) vs actual: 34476 (0x86ac)
script packet:  0.000168 udp sum 0x3643 (1000)
actual packet:  0.000166 udp sum 0x86ac (1000)
~ #
My patched packetdrill PoC is at:
https://github.com/jsitnicki/packetdrill/commits/udp-segment/rfc1/
If we want to go with the packetdrill-based test, that raises the question of where to keep it. In the packetdrill repo? Or with the rest of the selftests/net?
Using the packetdrill repo would make it easier to synchronize the development of packetdrill features with the tests that use them. But we would also have to hook it up to netdev CI.
WDYT?
+1. Packetdrill is a great environment for such tests. Packetdrill tests are also concise and easy to read.
I recently imported an initial batch of packetdrill tests to ksft and with that the netdev CI. We have a patch series with the remaining .pkt files on github ready for when the merge window opens.
Any changes to packetdrill itself need to go to github, as that does not ship with the kernel.
I encourage .pkt files to go there too. But I suspect that that won't be enforceable, and we do want the review on netdev@ first.
One issue with testing optional features may be that packetdrill runs by default on a tun device (though it can also run across two NICs as a two machine test).
Tun supports NETIF_F_GSO_UDP_L4, so that should be no concern in this case.
One small request: to avoid confusion with CHECKSUM_COMPLETE, please use something else to mean fully computed checksums on the egress path (which, somewhat non-obviously, would be CHECKSUM_NONE). Perhaps SUM_PSEUDO and SUM_FULL?
[ Sasha's backport helper bot ]
Hi,
Found matching upstream commit: d96016a764f6aa5c7528c3d3f9cb472ef7266951
Status in newer kernel trees:
6.12.y | Present (exact SHA1)
Note: The patch differs from the upstream commit:
---
--- -	2024-11-23 15:36:10.452516394 -0500
+++ /tmp/tmp.YUZFWQktZJ	2024-11-23 15:36:10.442794360 -0500
@@ -1,5 +1,3 @@
-If:
-
 1) the user requested USO, but
 2) there is not enough payload for GSO to kick in, and
 3) the egress device doesn't offer checksum offload, then
@@ -17,15 +15,17 @@
 Signed-off-by: Jakub Sitnicki jakub@cloudflare.com
 Acked-by: Willem de Bruijn willemdebruijn.kernel@gmail.com
 Cc: stable@vger.kernel.org
-Link: https://patch.msgid.link/20241011-uso-swcsum-fixup-v2-1-6e1ddc199af9@cloudfl...
-Signed-off-by: Jakub Kicinski kuba@kernel.org
+---
+Changes in v2:
+- Fix typo in patch description
+- Link to v1: https://lore.kernel.org/r/20241010-uso-swcsum-fixup-v1-1-a63fbd0a414c@cloudf...
 ---
  net/ipv4/udp.c | 4 +++-
  net/ipv6/udp.c | 4 +++-
  2 files changed, 6 insertions(+), 2 deletions(-)
 
 diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
-index 8accbf4cb2956..2849b273b1310 100644
+index 8accbf4cb295..2849b273b131 100644
 --- a/net/ipv4/udp.c
 +++ b/net/ipv4/udp.c
 @@ -951,8 +951,10 @@ static int udp_send_skb(struct sk_buff *skb, struct flowi4 *fl4,
@@ -41,7 +41,7 @@
 
  	if (is_udplite)		/* UDP-Lite */
 diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
-index 52dfbb2ff1a80..0cef8ae5d1ea1 100644
+index 52dfbb2ff1a8..0cef8ae5d1ea 100644
 --- a/net/ipv6/udp.c
 +++ b/net/ipv6/udp.c
 @@ -1266,8 +1266,10 @@ static int udp_v6_send_skb(struct sk_buff *skb, struct flowi6 *fl6,
@@ -56,3 +56,4 @@
  	}
 
  	if (is_udplite)
+
---
Results of testing on various branches:
| Branch                    | Patch Apply               | Build Test |
|---------------------------|---------------------------|------------|
| stable/linux-6.12.y       | Failed (branch not found) | N/A        |
| stable/linux-6.11.y       | Failed (branch not found) | N/A        |
| stable/linux-6.6.y        | Failed (branch not found) | N/A        |
| stable/linux-6.1.y        | Failed (branch not found) | N/A        |
| stable/linux-5.15.y       | Failed (branch not found) | N/A        |
| stable/linux-5.10.y       | Failed (branch not found) | N/A        |
| stable/linux-5.4.y        | Failed (branch not found) | N/A        |
| stable/linux-4.19.y       | Failed (branch not found) | N/A        |
linux-stable-mirror@lists.linaro.org