Here are various fixes:
- Patch 1: Better handling SKB extension allocation failures. A fix for v5.7.
- Patches 2, 3: Avoid resetting MPTCP limits when flushing MPTCP endpoints. With a validation in the selftests. Fixes for v5.7.
- Patches 4, 5, 6: Disallow '0' as ADD_ADDR retransmission timeout. With a preparation patch, and a validation in the selftests. Fixes for v5.11.
- Patches 8, 9: Fix C23 extension warnings in the selftests, spotted by GCC. Fixes for v6.16.
Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org --- Christoph Paasch (1): mptcp: drop skb if MPTCP skb extension allocation fails
Geliang Tang (3): mptcp: remove duplicate sk_reset_timer call mptcp: disable add_addr retransmission when timeout is 0 selftests: mptcp: disable add_addr retrans in endpoint_tests
Matthieu Baerts (NGI0) (4): mptcp: pm: kernel: flush: do not reset ADD_ADDR limit selftests: mptcp: pm: check flush doesn't reset limits selftests: mptcp: connect: fix C23 extension warning selftests: mptcp: sockopt: fix C23 extension warning
Documentation/networking/mptcp-sysctl.rst | 2 ++ net/mptcp/options.c | 6 ++++-- net/mptcp/pm.c | 18 ++++++++++++------ net/mptcp/pm_kernel.c | 1 - tools/testing/selftests/net/mptcp/mptcp_connect.c | 5 +++-- tools/testing/selftests/net/mptcp/mptcp_inq.c | 5 +++-- tools/testing/selftests/net/mptcp/mptcp_join.sh | 1 + tools/testing/selftests/net/mptcp/mptcp_sockopt.c | 5 +++-- tools/testing/selftests/net/mptcp/pm_netlink.sh | 1 + 9 files changed, 29 insertions(+), 15 deletions(-) --- base-commit: 065c31f2c6915b38f45b1c817b31f41f62eaa774 change-id: 20250815-net-mptcp-misc-fixes-6-17-rc2-d18b2437e8d4
Best regards,
From: Christoph Paasch cpaasch@openai.com
When skb_ext_add(skb, SKB_EXT_MPTCP) fails in mptcp_incoming_options(), we used to return true, letting the segment proceed through the TCP receive path without a DSS mapping. Such segments can leave inconsistent mapping state and trigger a mid-stream fallback to TCP, which in testing collapsed (by artificially forcing failures in skb_ext_add) throughput to zero.
Return false instead so the TCP input path drops the skb (see tcp_data_queue() and step-7 processing). This is the safer choice under memory pressure: it preserves MPTCP correctness and provides backpressure to the sender.
Control packets remain unaffected: ACK updates and DATA_FIN handling happen before attempting the extension allocation, and tcp_reset() continues to ignore the return value.
With this change, MPTCP continues to work at high throughput if we artificially inject failures into skb_ext_add.
Fixes: 6787b7e350d3 ("mptcp: avoid processing packet if a subflow reset") Cc: stable@vger.kernel.org Signed-off-by: Christoph Paasch cpaasch@openai.com Reviewed-by: Matthieu Baerts (NGI0) matttbe@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org --- net/mptcp/options.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/net/mptcp/options.c b/net/mptcp/options.c index 70c0ab0ecf905d282e5dc6c1b21ffc6476c8d71b..2a8ea28442b271675411d190604073f779bee7fa 100644 --- a/net/mptcp/options.c +++ b/net/mptcp/options.c @@ -1118,7 +1118,9 @@ static bool add_addr_hmac_valid(struct mptcp_sock *msk, return hmac == mp_opt->ahmac; }
-/* Return false if a subflow has been reset, else return true */ +/* Return false in case of error (or subflow has been reset), + * else return true. + */ bool mptcp_incoming_options(struct sock *sk, struct sk_buff *skb) { struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk); @@ -1222,7 +1224,7 @@ bool mptcp_incoming_options(struct sock *sk, struct sk_buff *skb)
mpext = skb_ext_add(skb, SKB_EXT_MPTCP); if (!mpext) - return true; + return false;
memset(mpext, 0, sizeof(*mpext));
A flush of the MPTCP endpoints should not affect the MPTCP limits. In other words, 'ip mptcp endpoint flush' should not change 'ip mptcp limits'.
But it was the case: the MPTCP_PM_ATTR_RCV_ADD_ADDRS (add_addr_accepted) limit was reset by accident. Removing the reset of this counter during a flush fixes this issue.
Fixes: 01cacb00b35c ("mptcp: add netlink-based PM") Cc: stable@vger.kernel.org Reported-by: Thomas Dreibholz dreibh@simula.no Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/579 Reviewed-by: Mat Martineau martineau@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org --- net/mptcp/pm_kernel.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/net/mptcp/pm_kernel.c b/net/mptcp/pm_kernel.c index d39e7c1784608db290b8a2c1bc4fc24ed800cbb4..667803d72b643a0bb98365003b136c53f2a9a975 100644 --- a/net/mptcp/pm_kernel.c +++ b/net/mptcp/pm_kernel.c @@ -1085,7 +1085,6 @@ static void __flush_addrs(struct list_head *list) static void __reset_counters(struct pm_nl_pernet *pernet) { WRITE_ONCE(pernet->add_addr_signal_max, 0); - WRITE_ONCE(pernet->add_addr_accept_max, 0); WRITE_ONCE(pernet->local_addr_max, 0); pernet->addrs = 0; }
This modification is linked to the parent commit where the received ADD_ADDR limit was accidentally reset when the endpoints were flushed.
To validate that, the test is now flushing endpoints after having set new limits, and before checking them.
The 'Fixes' tag here below is the same as the one from the previous commit: this patch here is not fixing anything wrong in the selftests, but it validates the previous fix for an issue introduced by this commit ID.
Fixes: 01cacb00b35c ("mptcp: add netlink-based PM") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau martineau@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org --- tools/testing/selftests/net/mptcp/pm_netlink.sh | 1 + 1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/net/mptcp/pm_netlink.sh b/tools/testing/selftests/net/mptcp/pm_netlink.sh index 2e6648a2b2c0c6ea0e9f030c942077a0f483092f..ac7ec6f9402376a34602ef1ca6c4822e8dde0ded 100755 --- a/tools/testing/selftests/net/mptcp/pm_netlink.sh +++ b/tools/testing/selftests/net/mptcp/pm_netlink.sh @@ -198,6 +198,7 @@ set_limits 1 9 2>/dev/null check "get_limits" "${default_limits}" "subflows above hard limit"
set_limits 8 8 +flush_endpoint ## to make sure it doesn't affect the limits check "get_limits" "$(format_limits 8 8)" "set limits"
flush_endpoint
From: Geliang Tang tanggeliang@kylinos.cn
sk_reset_timer() was called twice in mptcp_pm_alloc_anno_list.
Simplify the code by using a 'goto' statement to eliminate the duplication.
Note that this is not a fix, but it will help backporting the following patch. The same "Fixes" tag has been added for this reason.
Fixes: 93f323b9cccc ("mptcp: add a new sysctl add_addr_timeout") Cc: stable@vger.kernel.org Signed-off-by: Geliang Tang tanggeliang@kylinos.cn Reviewed-by: Matthieu Baerts (NGI0) matttbe@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org --- net/mptcp/pm.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c index 420d416e2603de2e54f017216c56daa80f356e87..c5f6a53ce5f1e10dada52865e41b685d888a71fa 100644 --- a/net/mptcp/pm.c +++ b/net/mptcp/pm.c @@ -353,9 +353,7 @@ bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk, if (WARN_ON_ONCE(mptcp_pm_is_kernel(msk))) return false;
- sk_reset_timer(sk, &add_entry->add_timer, - jiffies + mptcp_get_add_addr_timeout(net)); - return true; + goto reset_timer; }
add_entry = kmalloc(sizeof(*add_entry), GFP_ATOMIC); @@ -369,6 +367,7 @@ bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk, add_entry->retrans_times = 0;
timer_setup(&add_entry->add_timer, mptcp_pm_add_timer, 0); +reset_timer: sk_reset_timer(sk, &add_entry->add_timer, jiffies + mptcp_get_add_addr_timeout(net));
From: Geliang Tang tanggeliang@kylinos.cn
When add_addr_timeout was set to 0, this caused the ADD_ADDR to be retransmitted immediately, which looks like a buggy behaviour. Instead, interpret 0 as "no retransmissions needed".
The documentation is updated to explicitly state that setting the timeout to 0 disables retransmission.
Fixes: 93f323b9cccc ("mptcp: add a new sysctl add_addr_timeout") Cc: stable@vger.kernel.org Suggested-by: Matthieu Baerts matttbe@kernel.org Signed-off-by: Geliang Tang tanggeliang@kylinos.cn Reviewed-by: Matthieu Baerts (NGI0) matttbe@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org --- Documentation/networking/mptcp-sysctl.rst | 2 ++ net/mptcp/pm.c | 13 ++++++++++--- 2 files changed, 12 insertions(+), 3 deletions(-)
diff --git a/Documentation/networking/mptcp-sysctl.rst b/Documentation/networking/mptcp-sysctl.rst index 5bfab01eff5a9db89e1484787953241c16e147cf..1683c139821e3ba6d9eaa3c59330a523d29f1164 100644 --- a/Documentation/networking/mptcp-sysctl.rst +++ b/Documentation/networking/mptcp-sysctl.rst @@ -12,6 +12,8 @@ add_addr_timeout - INTEGER (seconds) resent to an MPTCP peer that has not acknowledged a previous ADD_ADDR message.
+ Do not retransmit if set to 0. + The default value matches TCP_RTO_MAX. This is a per-namespace sysctl.
diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c index c5f6a53ce5f1e10dada52865e41b685d888a71fa..136a380602cae872b76560649c924330e5f42533 100644 --- a/net/mptcp/pm.c +++ b/net/mptcp/pm.c @@ -274,6 +274,7 @@ static void mptcp_pm_add_timer(struct timer_list *timer) add_timer); struct mptcp_sock *msk = entry->sock; struct sock *sk = (struct sock *)msk; + unsigned int timeout;
pr_debug("msk=%p\n", msk);
@@ -291,6 +292,10 @@ static void mptcp_pm_add_timer(struct timer_list *timer) goto out; }
+ timeout = mptcp_get_add_addr_timeout(sock_net(sk)); + if (!timeout) + goto out; + spin_lock_bh(&msk->pm.lock);
if (!mptcp_pm_should_add_signal_addr(msk)) { @@ -302,7 +307,7 @@ static void mptcp_pm_add_timer(struct timer_list *timer)
if (entry->retrans_times < ADD_ADDR_RETRANS_MAX) sk_reset_timer(sk, timer, - jiffies + mptcp_get_add_addr_timeout(sock_net(sk))); + jiffies + timeout);
spin_unlock_bh(&msk->pm.lock);
@@ -344,6 +349,7 @@ bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk, struct mptcp_pm_add_entry *add_entry = NULL; struct sock *sk = (struct sock *)msk; struct net *net = sock_net(sk); + unsigned int timeout;
lockdep_assert_held(&msk->pm.lock);
@@ -368,8 +374,9 @@ bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk,
timer_setup(&add_entry->add_timer, mptcp_pm_add_timer, 0); reset_timer: - sk_reset_timer(sk, &add_entry->add_timer, - jiffies + mptcp_get_add_addr_timeout(net)); + timeout = mptcp_get_add_addr_timeout(net); + if (timeout) + sk_reset_timer(sk, &add_entry->add_timer, jiffies + timeout);
return true; }
From: Geliang Tang tanggeliang@kylinos.cn
To prevent test instability in the "delete re-add signal" test caused by ADD_ADDR retransmissions, disable retransmissions for this test by setting net.mptcp.add_addr_timeout to 0.
Suggested-by: Matthieu Baerts matttbe@kernel.org Signed-off-by: Geliang Tang tanggeliang@kylinos.cn Reviewed-by: Matthieu Baerts (NGI0) matttbe@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org --- tools/testing/selftests/net/mptcp/mptcp_join.sh | 1 + 1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh index b8af65373b3ada96472347171924ad3a6cf14777..82cae37d9c2026cc55466636d53a76f929a03452 100755 --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh @@ -3842,6 +3842,7 @@ endpoint_tests() # remove and re-add if reset_with_events "delete re-add signal" && mptcp_lib_kallsyms_has "subflow_rebuild_header$"; then + ip netns exec $ns1 sysctl -q net.mptcp.add_addr_timeout=0 pm_nl_set_limits $ns1 0 3 pm_nl_set_limits $ns2 3 3 pm_nl_add_endpoint $ns1 10.0.2.1 id 1 flags signal
GCC was complaining about the new label:
mptcp_connect.c:187:2: warning: label followed by a declaration is a C23 extension [-Wc23-extensions] 187 | int err = getaddrinfo(node, service, hints, res); | ^
Simply declare 'err' before the label to avoid this warning.
Fixes: a862771d1aa4 ("selftests: mptcp: use IPPROTO_MPTCP for getaddrinfo") Cc: stable@vger.kernel.org Reviewed-by: Geliang Tang geliang@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org --- tools/testing/selftests/net/mptcp/mptcp_connect.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.c b/tools/testing/selftests/net/mptcp/mptcp_connect.c index ac1349c4b9e5404c95935eb38b08a15d774eb1d9..4f07ac9fa207cb08a934582b98d688d0b9512f97 100644 --- a/tools/testing/selftests/net/mptcp/mptcp_connect.c +++ b/tools/testing/selftests/net/mptcp/mptcp_connect.c @@ -183,9 +183,10 @@ static void xgetaddrinfo(const char *node, const char *service, struct addrinfo *hints, struct addrinfo **res) { -again: - int err = getaddrinfo(node, service, hints, res); + int err;
+again: + err = getaddrinfo(node, service, hints, res); if (err) { const char *errstr;
GCC was complaining about the new label:
mptcp_inq.c:79:2: warning: label followed by a declaration is a C23 extension [-Wc23-extensions] 79 | int err = getaddrinfo(node, service, hints, res); | ^
mptcp_sockopt.c:166:2: warning: label followed by a declaration is a C23 extension [-Wc23-extensions] 166 | int err = getaddrinfo(node, service, hints, res); | ^
Simply declare 'err' before the label to avoid this warning.
Fixes: dd367e81b79a ("selftests: mptcp: sockopt: use IPPROTO_MPTCP for getaddrinfo") Cc: stable@vger.kernel.org Reviewed-by: Geliang Tang geliang@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org --- tools/testing/selftests/net/mptcp/mptcp_inq.c | 5 +++-- tools/testing/selftests/net/mptcp/mptcp_sockopt.c | 5 +++-- 2 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/net/mptcp/mptcp_inq.c b/tools/testing/selftests/net/mptcp/mptcp_inq.c index 3cf1e2a612cef911028f46569563d16dd5d32129..f3bcaa48df8f22e8f4833fcc3b919d21764bf7fb 100644 --- a/tools/testing/selftests/net/mptcp/mptcp_inq.c +++ b/tools/testing/selftests/net/mptcp/mptcp_inq.c @@ -75,9 +75,10 @@ static void xgetaddrinfo(const char *node, const char *service, struct addrinfo *hints, struct addrinfo **res) { -again: - int err = getaddrinfo(node, service, hints, res); + int err;
+again: + err = getaddrinfo(node, service, hints, res); if (err) { const char *errstr;
diff --git a/tools/testing/selftests/net/mptcp/mptcp_sockopt.c b/tools/testing/selftests/net/mptcp/mptcp_sockopt.c index 9934a68df23708ecb413c4ab26523989e3b9f158..e934dd26a59d9b50445d61e8b8013ce3c8d2a8a0 100644 --- a/tools/testing/selftests/net/mptcp/mptcp_sockopt.c +++ b/tools/testing/selftests/net/mptcp/mptcp_sockopt.c @@ -162,9 +162,10 @@ static void xgetaddrinfo(const char *node, const char *service, struct addrinfo *hints, struct addrinfo **res) { -again: - int err = getaddrinfo(node, service, hints, res); + int err;
+again: + err = getaddrinfo(node, service, hints, res); if (err) { const char *errstr;
On Fri, 15 Aug 2025 19:28:18 +0200 Matthieu Baerts (NGI0) wrote:
Here are various fixes:
Patch 1: Better handling SKB extension allocation failures. A fix for v5.7.
Patches 2, 3: Avoid resetting MPTCP limits when flushing MPTCP endpoints. With a validation in the selftests. Fixes for v5.7.
Patches 4, 5, 6: Disallow '0' as ADD_ADDR retransmission timeout. With a preparation patch, and a validation in the selftests. Fixes for v5.11.
Patches 8, 9: Fix C23 extension warnings in the selftests, spotted by GCC. Fixes for v6.16.
userspace_pm.sh which hasn't flaked in 1000 runs has flaked last night, with this series applied: https://netdev-3.bots.linux.dev/vmksft-mptcp/results/255941/8-userspace-pm-s... Looks unrelated but also quite strange?
Hi Jakub,
On 16/08/2025 20:27, Jakub Kicinski wrote:
On Fri, 15 Aug 2025 19:28:18 +0200 Matthieu Baerts (NGI0) wrote:
Here are various fixes:
Patch 1: Better handling SKB extension allocation failures. A fix for v5.7.
Patches 2, 3: Avoid resetting MPTCP limits when flushing MPTCP endpoints. With a validation in the selftests. Fixes for v5.7.
Patches 4, 5, 6: Disallow '0' as ADD_ADDR retransmission timeout. With a preparation patch, and a validation in the selftests. Fixes for v5.11.
Patches 8, 9: Fix C23 extension warnings in the selftests, spotted by GCC. Fixes for v6.16.
userspace_pm.sh which hasn't flaked in 1000 runs has flaked last night, with this series applied: https://netdev-3.bots.linux.dev/vmksft-mptcp/results/255941/8-userspace-pm-s...
Thank you for the notification!
Looks unrelated but also quite strange?
Indeed: the whole series should not affect the behaviour validated by the selftests. The error messages are not very useful, and the test was not really slower than usual. Our CI never complained about that either. I will monitor that!
Cheers, Matt
Hello:
This series was applied to netdev/net.git (main) by Jakub Kicinski kuba@kernel.org:
On Fri, 15 Aug 2025 19:28:18 +0200 you wrote:
Here are various fixes:
Patch 1: Better handling SKB extension allocation failures. A fix for v5.7.
Patches 2, 3: Avoid resetting MPTCP limits when flushing MPTCP endpoints. With a validation in the selftests. Fixes for v5.7.
[...]
Here is the summary with links: - [net,1/8] mptcp: drop skb if MPTCP skb extension allocation fails https://git.kernel.org/netdev/net/c/ccab04469798 - [net,2/8] mptcp: pm: kernel: flush: do not reset ADD_ADDR limit https://git.kernel.org/netdev/net/c/68fc0f4b0d25 - [net,3/8] selftests: mptcp: pm: check flush doesn't reset limits https://git.kernel.org/netdev/net/c/452690be7de2 - [net,4/8] mptcp: remove duplicate sk_reset_timer call https://git.kernel.org/netdev/net/c/5d13349472ac - [net,5/8] mptcp: disable add_addr retransmission when timeout is 0 https://git.kernel.org/netdev/net/c/f5ce0714623c - [net,6/8] selftests: mptcp: disable add_addr retrans in endpoint_tests https://git.kernel.org/netdev/net/c/f92199f551e6 - [net,7/8] selftests: mptcp: connect: fix C23 extension warning https://git.kernel.org/netdev/net/c/2eefbed30d46 - [net,8/8] selftests: mptcp: sockopt: fix C23 extension warning https://git.kernel.org/netdev/net/c/3259889fd3c0
You are awesome, thank you!
linux-kselftest-mirror@lists.linaro.org