As is described in the "How to use MPTCP?" section in MPTCP wiki [1]:
"Your app should create sockets with IPPROTO_MPTCP as the proto: ( socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP); ). Legacy apps can be forced to create and use MPTCP sockets instead of TCP ones via the mptcpize command bundled with the mptcpd daemon."
But the mptcpize (LD_PRELOAD technique) command has some limitations [2]:
- it doesn't work if the application is not using libc (e.g. GoLang apps) - in some envs, it might not be easy to set env vars / change the way apps are launched, e.g. on Android - mptcpize needs to be launched with all apps that want MPTCP: we could have more control from BPF to enable MPTCP only for some apps or all the ones of a netns or a cgroup, etc. - it is not in BPF, we cannot talk about it at netdev conf.
So this patchset attempts to use BPF to implement functions similer to mptcpize.
The main idea is to add a hook in sys_socket() to change the protocol id from IPPROTO_TCP (or 0) to IPPROTO_MPTCP.
[1] https://github.com/multipath-tcp/mptcp_net-next/wiki [2] https://github.com/multipath-tcp/mptcp_net-next/issues/79
v10: - drop "#ifdef CONFIG_BPF_JIT". - include vmlinux.h and bpf_tracing_net.h to avoid defining some macros. - drop unneeded checks for mptcp.
v9: - update comment for 'update_socket_protocol'.
v8: - drop the additional checks on the 'protocol' value after the 'update_socket_protocol()' call.
v7: - add __weak and __diag_* for update_socket_protocol.
v6: - add update_socket_protocol.
v5: - add bpf_mptcpify helper.
v4: - use lsm_cgroup/socket_create
v3: - patch 8: char cmd[128]; -> char cmd[256];
v2: - Fix build selftests errors reported by CI
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/79
Geliang Tang (5): bpf: Add update_socket_protocol hook selftests/bpf: Use random netns name for mptcp selftests/bpf: Add two mptcp netns helpers selftests/bpf: Drop unneeded checks for mptcp selftests/bpf: Add mptcpify test
net/mptcp/bpf.c | 15 ++ net/socket.c | 24 ++++ .../testing/selftests/bpf/prog_tests/mptcp.c | 129 +++++++++++++++--- tools/testing/selftests/bpf/progs/mptcpify.c | 20 +++ 4 files changed, 169 insertions(+), 19 deletions(-) create mode 100644 tools/testing/selftests/bpf/progs/mptcpify.c
Add a hook named update_socket_protocol in __sys_socket(), for bpf progs to attach to and update socket protocol. One user case is to force legacy TCP apps to create and use MPTCP sockets instead of TCP ones.
Define a mod_ret set named bpf_mptcp_fmodret_ids, add the hook update_socket_protocol into this set, and register it in bpf_mptcp_kfunc_init().
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/79 Acked-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: Geliang Tang geliang.tang@suse.com --- net/mptcp/bpf.c | 15 +++++++++++++++ net/socket.c | 24 ++++++++++++++++++++++++ 2 files changed, 39 insertions(+)
diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c index 5a0a84ad94af..8a16672b94e2 100644 --- a/net/mptcp/bpf.c +++ b/net/mptcp/bpf.c @@ -19,3 +19,18 @@ struct mptcp_sock *bpf_mptcp_sock_from_subflow(struct sock *sk)
return NULL; } + +BTF_SET8_START(bpf_mptcp_fmodret_ids) +BTF_ID_FLAGS(func, update_socket_protocol) +BTF_SET8_END(bpf_mptcp_fmodret_ids) + +static const struct btf_kfunc_id_set bpf_mptcp_fmodret_set = { + .owner = THIS_MODULE, + .set = &bpf_mptcp_fmodret_ids, +}; + +static int __init bpf_mptcp_kfunc_init(void) +{ + return register_btf_fmodret_id_set(&bpf_mptcp_fmodret_set); +} +late_initcall(bpf_mptcp_kfunc_init); diff --git a/net/socket.c b/net/socket.c index 2b0e54b2405c..9f98ced88ac5 100644 --- a/net/socket.c +++ b/net/socket.c @@ -1644,11 +1644,35 @@ struct file *__sys_socket_file(int family, int type, int protocol) return sock_alloc_file(sock, flags, NULL); }
+/* A hook for bpf progs to attach to and update socket protocol. + * + * A static noinline declaration here could cause the compiler to + * optimize away the function. A global noinline declaration will + * keep the definition, but may optimize away the callsite. + * Therefore, __weak is needed to ensure that the call is still + * emitted, by telling the compiler that we don't know what the + * function might eventually be. + * + * __diag_* below are needed to dismiss the missing prototype warning. + */ + +__diag_push(); +__diag_ignore_all("-Wmissing-prototypes", + "kfuncs which will be used in BPF programs"); + +__weak noinline int update_socket_protocol(int family, int type, int protocol) +{ + return protocol; +} + +__diag_pop(); + int __sys_socket(int family, int type, int protocol) { struct socket *sock; int flags;
+ protocol = update_socket_protocol(family, type, protocol); sock = __sys_socket_create(family, type, protocol); if (IS_ERR(sock)) return PTR_ERR(sock);
On 8/3/23 6:55 PM, Geliang Tang wrote:
Add a hook named update_socket_protocol in __sys_socket(), for bpf progs to attach to and update socket protocol. One user case is to force legacy TCP apps to create and use MPTCP sockets instead of TCP ones.
Define a mod_ret set named bpf_mptcp_fmodret_ids, add the hook update_socket_protocol into this set, and register it in bpf_mptcp_kfunc_init().
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/79 Acked-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: Geliang Tang geliang.tang@suse.com
Acked-by: Yonghong Song yonghong.song@linux.dev
Use rand() to generate a random netns name instead of using the fixed name "mptcp_ns" for every test.
By doing that, we can re-launch the test even if there was an issue removing the previous netns or if by accident, a netns with this generic name already existed on the system.
Note that using a different name each will also help adding more subtests in future commits.
Acked-by: Yonghong Song yonghong.song@linux.dev Reviewed-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: Geliang Tang geliang.tang@suse.com --- tools/testing/selftests/bpf/prog_tests/mptcp.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/mptcp.c b/tools/testing/selftests/bpf/prog_tests/mptcp.c index cd0c42fff7c0..4ccca3d39a8f 100644 --- a/tools/testing/selftests/bpf/prog_tests/mptcp.c +++ b/tools/testing/selftests/bpf/prog_tests/mptcp.c @@ -7,7 +7,7 @@ #include "network_helpers.h" #include "mptcp_sock.skel.h"
-#define NS_TEST "mptcp_ns" +char NS_TEST[32];
#ifndef TCP_CA_NAME_MAX #define TCP_CA_NAME_MAX 16 @@ -147,6 +147,8 @@ static void test_base(void) if (!ASSERT_GE(cgroup_fd, 0, "test__join_cgroup")) return;
+ srand(time(NULL)); + snprintf(NS_TEST, sizeof(NS_TEST), "mptcp_ns_%d", rand()); SYS(fail, "ip netns add %s", NS_TEST); SYS(fail, "ip -net %s link set dev lo up", NS_TEST);
Add two netns helpers for mptcp tests: create_netns() and cleanup_netns(). Use them in test_base().
These new helpers will be re-used in the following commits introducing new tests.
Acked-by: Yonghong Song yonghong.song@linux.dev Reviewed-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: Geliang Tang geliang.tang@suse.com --- .../testing/selftests/bpf/prog_tests/mptcp.c | 35 ++++++++++++------- 1 file changed, 23 insertions(+), 12 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/mptcp.c b/tools/testing/selftests/bpf/prog_tests/mptcp.c index 4ccca3d39a8f..4407bd5c9e9a 100644 --- a/tools/testing/selftests/bpf/prog_tests/mptcp.c +++ b/tools/testing/selftests/bpf/prog_tests/mptcp.c @@ -22,6 +22,26 @@ struct mptcp_storage { char ca_name[TCP_CA_NAME_MAX]; };
+static struct nstoken *create_netns(void) +{ + srand(time(NULL)); + snprintf(NS_TEST, sizeof(NS_TEST), "mptcp_ns_%d", rand()); + SYS(fail, "ip netns add %s", NS_TEST); + SYS(fail, "ip -net %s link set dev lo up", NS_TEST); + + return open_netns(NS_TEST); +fail: + return NULL; +} + +static void cleanup_netns(struct nstoken *nstoken) +{ + if (nstoken) + close_netns(nstoken); + + SYS_NOFAIL("ip netns del %s &> /dev/null", NS_TEST); +} + static int verify_tsk(int map_fd, int client_fd) { int err, cfd = client_fd; @@ -147,13 +167,8 @@ static void test_base(void) if (!ASSERT_GE(cgroup_fd, 0, "test__join_cgroup")) return;
- srand(time(NULL)); - snprintf(NS_TEST, sizeof(NS_TEST), "mptcp_ns_%d", rand()); - SYS(fail, "ip netns add %s", NS_TEST); - SYS(fail, "ip -net %s link set dev lo up", NS_TEST); - - nstoken = open_netns(NS_TEST); - if (!ASSERT_OK_PTR(nstoken, "open_netns")) + nstoken = create_netns(); + if (!ASSERT_OK_PTR(nstoken, "create_netns")) goto fail;
/* without MPTCP */ @@ -176,11 +191,7 @@ static void test_base(void) close(server_fd);
fail: - if (nstoken) - close_netns(nstoken); - - SYS_NOFAIL("ip netns del " NS_TEST " &> /dev/null"); - + cleanup_netns(nstoken); close(cgroup_fd); }
Load success means prog_fd and map_fd are always valid. So drop these unneeded ASSERT_GE checks for them in mptcp run_test().
Signed-off-by: Geliang Tang geliang.tang@suse.com --- tools/testing/selftests/bpf/prog_tests/mptcp.c | 10 ---------- 1 file changed, 10 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/mptcp.c b/tools/testing/selftests/bpf/prog_tests/mptcp.c index 4407bd5c9e9a..3dc0ba2e7590 100644 --- a/tools/testing/selftests/bpf/prog_tests/mptcp.c +++ b/tools/testing/selftests/bpf/prog_tests/mptcp.c @@ -127,17 +127,7 @@ static int run_test(int cgroup_fd, int server_fd, bool is_mptcp) goto out;
prog_fd = bpf_program__fd(sock_skel->progs._sockops); - if (!ASSERT_GE(prog_fd, 0, "bpf_program__fd")) { - err = -EIO; - goto out; - } - map_fd = bpf_map__fd(sock_skel->maps.socket_storage_map); - if (!ASSERT_GE(map_fd, 0, "bpf_map__fd")) { - err = -EIO; - goto out; - } - err = bpf_prog_attach(prog_fd, cgroup_fd, BPF_CGROUP_SOCK_OPS, 0); if (!ASSERT_OK(err, "bpf_prog_attach")) goto out;
On 8/3/23 6:55 PM, Geliang Tang wrote:
Load success means prog_fd and map_fd are always valid. So drop these unneeded ASSERT_GE checks for them in mptcp run_test().
Signed-off-by: Geliang Tang geliang.tang@suse.com
Acked-by: Yonghong Song yonghong.song@linux.dev
Implement a new test program mptcpify: if the family is AF_INET or AF_INET6, the type is SOCK_STREAM, and the protocol ID is 0 or IPPROTO_TCP, set it to IPPROTO_MPTCP. It will be hooked in update_socket_protocol().
Extend the MPTCP test base, add a selftest test_mptcpify() for the mptcpify case. Open and load the mptcpify test prog to mptcpify the TCP sockets dynamically, then use start_server() and connect_to_fd() to create a TCP socket, but actually what's created is an MPTCP socket, which can be verified through the outputs of 'ss' and 'nstat' commands.
Reviewed-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: Geliang Tang geliang.tang@suse.com --- .../testing/selftests/bpf/prog_tests/mptcp.c | 88 +++++++++++++++++++ tools/testing/selftests/bpf/progs/mptcpify.c | 20 +++++ 2 files changed, 108 insertions(+) create mode 100644 tools/testing/selftests/bpf/progs/mptcpify.c
diff --git a/tools/testing/selftests/bpf/prog_tests/mptcp.c b/tools/testing/selftests/bpf/prog_tests/mptcp.c index 3dc0ba2e7590..e5ac2c3aab7d 100644 --- a/tools/testing/selftests/bpf/prog_tests/mptcp.c +++ b/tools/testing/selftests/bpf/prog_tests/mptcp.c @@ -6,6 +6,7 @@ #include "cgroup_helpers.h" #include "network_helpers.h" #include "mptcp_sock.skel.h" +#include "mptcpify.skel.h"
char NS_TEST[32];
@@ -185,8 +186,95 @@ static void test_base(void) close(cgroup_fd); }
+static void send_byte(int fd) +{ + char b = 0x55; + + ASSERT_EQ(write(fd, &b, sizeof(b)), 1, "send single byte"); +} + +static int verify_mptcpify(void) +{ + char cmd[256]; + int err = 0; + + snprintf(cmd, sizeof(cmd), + "ip netns exec %s ss -tOni | grep -q '%s'", + NS_TEST, "tcp-ulp-mptcp"); + if (!ASSERT_OK(system(cmd), "No tcp-ulp-mptcp found!")) + err++; + + snprintf(cmd, sizeof(cmd), + "ip netns exec %s nstat -asz %s | awk '%s' | grep -q '%s'", + NS_TEST, "MPTcpExtMPCapableSYNACKRX", + "NR==1 {next} {print $2}", "1"); + if (!ASSERT_OK(system(cmd), "No MPTcpExtMPCapableSYNACKRX found!")) + err++; + + return err; +} + +static int run_mptcpify(int cgroup_fd) +{ + int server_fd, client_fd, err = 0; + struct mptcpify *mptcpify_skel; + + mptcpify_skel = mptcpify__open_and_load(); + if (!ASSERT_OK_PTR(mptcpify_skel, "skel_open_load")) + return -EIO; + + err = mptcpify__attach(mptcpify_skel); + if (!ASSERT_OK(err, "skel_attach")) + goto out; + + /* without MPTCP */ + server_fd = start_server(AF_INET, SOCK_STREAM, NULL, 0, 0); + if (!ASSERT_GE(server_fd, 0, "start_server")) { + err = -EIO; + goto out; + } + + client_fd = connect_to_fd(server_fd, 0); + if (!ASSERT_GE(client_fd, 0, "connect to fd")) { + err = -EIO; + goto close_server; + } + + send_byte(client_fd); + err += verify_mptcpify(); + + close(client_fd); +close_server: + close(server_fd); +out: + mptcpify__destroy(mptcpify_skel); + return err; +} + +static void test_mptcpify(void) +{ + struct nstoken *nstoken = NULL; + int cgroup_fd; + + cgroup_fd = test__join_cgroup("/mptcpify"); + if (!ASSERT_GE(cgroup_fd, 0, "test__join_cgroup")) + return; + + nstoken = create_netns(); + if (!ASSERT_OK_PTR(nstoken, "create_netns")) + goto fail; + + ASSERT_OK(run_mptcpify(cgroup_fd), "run_mptcpify"); + +fail: + cleanup_netns(nstoken); + close(cgroup_fd); +} + void test_mptcp(void) { if (test__start_subtest("base")) test_base(); + if (test__start_subtest("mptcpify")) + test_mptcpify(); } diff --git a/tools/testing/selftests/bpf/progs/mptcpify.c b/tools/testing/selftests/bpf/progs/mptcpify.c new file mode 100644 index 000000000000..53301ae8a8f7 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/mptcpify.c @@ -0,0 +1,20 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2023, SUSE. */ + +#include "vmlinux.h" +#include <bpf/bpf_tracing.h> +#include "bpf_tracing_net.h" + +char _license[] SEC("license") = "GPL"; + +SEC("fmod_ret/update_socket_protocol") +int BPF_PROG(mptcpify, int family, int type, int protocol) +{ + if ((family == AF_INET || family == AF_INET6) && + type == SOCK_STREAM && + (!protocol || protocol == IPPROTO_TCP)) { + return IPPROTO_MPTCP; + } + + return protocol; +}
On 8/3/23 6:55 PM, Geliang Tang wrote:
Implement a new test program mptcpify: if the family is AF_INET or AF_INET6, the type is SOCK_STREAM, and the protocol ID is 0 or IPPROTO_TCP, set it to IPPROTO_MPTCP. It will be hooked in update_socket_protocol().
Extend the MPTCP test base, add a selftest test_mptcpify() for the mptcpify case. Open and load the mptcpify test prog to mptcpify the TCP sockets dynamically, then use start_server() and connect_to_fd() to create a TCP socket, but actually what's created is an MPTCP socket, which can be verified through the outputs of 'ss' and 'nstat' commands.
Reviewed-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: Geliang Tang geliang.tang@suse.com
Ack with a minor nit below.
Acked-by: Yonghong Song yonghong.song@linux.dev
.../testing/selftests/bpf/prog_tests/mptcp.c | 88 +++++++++++++++++++ tools/testing/selftests/bpf/progs/mptcpify.c | 20 +++++ 2 files changed, 108 insertions(+) create mode 100644 tools/testing/selftests/bpf/progs/mptcpify.c
diff --git a/tools/testing/selftests/bpf/prog_tests/mptcp.c b/tools/testing/selftests/bpf/prog_tests/mptcp.c index 3dc0ba2e7590..e5ac2c3aab7d 100644 --- a/tools/testing/selftests/bpf/prog_tests/mptcp.c +++ b/tools/testing/selftests/bpf/prog_tests/mptcp.c @@ -6,6 +6,7 @@ #include "cgroup_helpers.h" #include "network_helpers.h" #include "mptcp_sock.skel.h" +#include "mptcpify.skel.h" char NS_TEST[32]; @@ -185,8 +186,95 @@ static void test_base(void) close(cgroup_fd); } +static void send_byte(int fd) +{
- char b = 0x55;
- ASSERT_EQ(write(fd, &b, sizeof(b)), 1, "send single byte");
+}
+static int verify_mptcpify(void) +{
- char cmd[256];
- int err = 0;
- snprintf(cmd, sizeof(cmd),
"ip netns exec %s ss -tOni | grep -q '%s'",
NS_TEST, "tcp-ulp-mptcp");
- if (!ASSERT_OK(system(cmd), "No tcp-ulp-mptcp found!"))
err++;
- snprintf(cmd, sizeof(cmd),
"ip netns exec %s nstat -asz %s | awk '%s' | grep -q '%s'",
NS_TEST, "MPTcpExtMPCapableSYNACKRX",
"NR==1 {next} {print $2}", "1");
- if (!ASSERT_OK(system(cmd), "No MPTcpExtMPCapableSYNACKRX found!"))
err++;
- return err;
+}
+static int run_mptcpify(int cgroup_fd) +{
- int server_fd, client_fd, err = 0;
- struct mptcpify *mptcpify_skel;
- mptcpify_skel = mptcpify__open_and_load();
- if (!ASSERT_OK_PTR(mptcpify_skel, "skel_open_load"))
return -EIO;
- err = mptcpify__attach(mptcpify_skel);
- if (!ASSERT_OK(err, "skel_attach"))
goto out;
- /* without MPTCP */
- server_fd = start_server(AF_INET, SOCK_STREAM, NULL, 0, 0);
- if (!ASSERT_GE(server_fd, 0, "start_server")) {
err = -EIO;
goto out;
- }
- client_fd = connect_to_fd(server_fd, 0);
- if (!ASSERT_GE(client_fd, 0, "connect to fd")) {
err = -EIO;
goto close_server;
- }
- send_byte(client_fd);
- err += verify_mptcpify();
The above code essentially equals to err = verify_mptcpify() since err must be 0 before the above code. I think it is worthwhile to change the above to err = verify_mptcpify();
Otherwise, people may confuse that maybe err could be non-zero before send_byte(client_fd)? If this is the case, why we did not return earlier? The code err = verify_mptcpify() will make it clear that all previous error conditions have been handled properly.
- close(client_fd);
+close_server:
- close(server_fd);
+out:
- mptcpify__destroy(mptcpify_skel);
- return err;
+}
[...]
linux-kselftest-mirror@lists.linaro.org