Re: [PATCH bpf-next v4 2/2] selftests/bpf: Add mptcp subflow subtest

15 Aug 2024


      Hi Martin,
Thank you for your reply!
On 15/08/2024 00:37, Martin KaFai Lau wrote:
...
On 8/14/24 3:04 AM, Matthieu Baerts wrote:
...
Hi Martin,
Thank you for your reply!
On 14/08/2024 03:12, Martin KaFai Lau wrote:
...
On 8/5/24 2:52 AM, Matthieu Baerts (NGI0) wrote:
...
+static int endpoint_init(char *flags)
+{
+    SYS(fail, "ip -net %s link add veth1 type veth peer name veth2",
NS_TEST);
+    SYS(fail, "ip -net %s addr add %s/24 dev veth1", NS_TEST, ADDR_1);
+    SYS(fail, "ip -net %s link set dev veth1 up", NS_TEST);
+    SYS(fail, "ip -net %s addr add %s/24 dev veth2", NS_TEST, ADDR_2);
+    SYS(fail, "ip -net %s link set dev veth2 up", NS_TEST);
+    if (SYS_NOFAIL("ip -net %s mptcp endpoint add %s %s", NS_TEST,
ADDR_2, flags)) {
+        printf("'ip mptcp' not supported, skip this test.\n");
+        test__skip();
It is always a skip now in bpf CI:
#171/3   mptcp/subflow:SKIP
This test is a useful addition for the bpf CI selftest.
It can't catch regression if it is always a skip in bpf CI though.
Indeed, for the moment, this test is skipped in bpf CI.
The MPTCP CI checks the MPTCP BPF selftests that are on top of net and
net-next at least once a day. It is always running with the last stable
version of iproute2, so this test is not skipped:
#169/3   mptcp/subflow:OK
https://github.com/multipath-tcp/mptcp_net-next/actions/
runs/10384566794/job/28751869426#step:7:11080
...
iproute2 needs to be updated (cc: Daniel Xu and Manu, the outdated
iproute2 is something that came up multiple times).
Not sure when the iproute2 can be updated. In the mean time, your v3 is
pretty close to getting pm_nl_ctl compiled. Is there other blocker on
this?
I will try to find some time to check the modifications I suggested in
the v3, but I don't know how long it will take to have them ready, as
they might require some adaptations of the CI side as well, I need to
check. On the other hand, I understood adding a duplicated version of
the mptcp.h UAPI header is not an option either.
So not to block this (already old) series, I thought it would help to
first focus on this version using 'ip mptcp', while I'm looking at the
selftests modifications. If these modifications are successful, I can
always resend the patch 2/3 from the v3 later, and using 'pm_nl_ctl'
instead of 'ip mptcp', to be able to work with IPRoute2 5.5.
Do you think that could work like that?
If there is CI started covering it, staying with the 'ip mptcp' is fine.
The bpf CI has to start testing it asap also. The iproute2 package will
need to be updated on the bpf CI side. I think this has to be done
regardless.
It will be useful to avoid the uapi header dup on its own. The last one
you have seems pretty close.
Thank you. Yes I will try to find time to look at that.
...
...
...
...
+        goto fail;
+    }


+    return 0;
+fail:
+    return -1;
+}


+static int _ss_search(char *src, char *dst, char *port, char *keyword)
+{
+    return SYS_NOFAIL("ip netns exec %s ss -enita src %s dst %s %s %d
| grep -q '%s'",
+              NS_TEST, src, dst, port, PORT_1, keyword);
+}


+static int ss_search(char *src, char *keyword)
+{
+    return _ss_search(src, ADDR_1, "dport", keyword);
+}


+static void run_subflow(char *new)
+{
+    int server_fd, client_fd, err;
+    char cc[TCP_CA_NAME_MAX];
+    socklen_t len = sizeof(cc);


+    server_fd = start_mptcp_server(AF_INET, ADDR_1, PORT_1, 0);
+    if (!ASSERT_GE(server_fd, 0, "start_mptcp_server"))
+        return;


+    client_fd = connect_to_fd(server_fd, 0);
+    if (!ASSERT_GE(client_fd, 0, "connect to fd"))
+        goto fail;


+    err = getsockopt(server_fd, SOL_TCP, TCP_CONGESTION, cc, &len);
+    if (!ASSERT_OK(err, "getsockopt(srv_fd, TCP_CONGESTION)"))
+        goto fail;


+    send_byte(client_fd);


+    ASSERT_OK(ss_search(ADDR_1, "fwmark:0x1"), "ss_search
fwmark:0x1");
+    ASSERT_OK(ss_search(ADDR_2, "fwmark:0x2"), "ss_search
fwmark:0x2");
+    ASSERT_OK(ss_search(ADDR_1, new), "ss_search new cc");
+    ASSERT_OK(ss_search(ADDR_2, cc), "ss_search default cc");
Is there a getsockopt way instead of ss + grep?
No there isn't: from the userspace, the app communicates with the MPTCP
socket, which can have multiple paths (subflows, a TCP socket). To keep
the compatibility with TCP, [gs]etsockopt() will look at/modify the
whole MPTCP connection. For example, in some cases, a setsockopt() will
propagate the option to all the subflows. Depending on the option, the
modification might only apply to the first subflow, or to the
user-facing socket.
For advanced users who want to have different options set to the
different subflows of an MPTCP connection, they can use BPF: that's what
is being validated here. In other words, doing a 'getsockopt()' from the
userspace program here will not show all the different marks and TCP CC
that can be set per subflow with BPF. We can see that in the test: a
getsockopt() is done on the MPTCP socket to retrieve the default TCP CC
('cc' which is certainly 'cubic'), but we expect to find another one
('new' which is 'reno'), set by the BPF program from patch 1/2. I guess
we could use bpf to do a getsockopt() per subflow, but that's seems a
bit cheated to have the BPF test program setting something and checking
if it is set. Here, it is an external way. Because it is done from a
I think the result is valid by having a bpf prog to inspect the value of
a sock. Inspecting socket is an existing use case. There are many
existing bpf tests covering this inspection use case to ensure the
result is legit. A separate cgroup/getsockopt program should help here
(more on this below).
I didn't consider a separate program. Indeed, should work.
...
...
dedicated netns, it sounds OK to do that, no?
Thanks for the explanation. I was hoping there is a way to get to the
underlying subflow fd. It seems impossible.
In the netns does help here. It is not only about the ss iterating a lot
of connections or not. My preference is not depending on external tool/
shell-ing if possible, e.g. to avoid the package update discussion like
the iproute2 here. The uapi from the testing kernel is always up-to-
date. ss is another binary but arguably in the same iproute2 package.
There is now another extra "grep" and pipe here. We had been bitten by
different shell behaviors and some arch has different shells ...etc.
OK, I thought it was fine to use 'ss | grep' because it is used in other
BPF selftests: test_tc_tunnel.sh & test_xdp_features.sh.
...
I think it is ok to take this set as is if you (and Gelang?) are ok to
followup a "cgroup/getsockopt" way to inspect the subflow as the very
next patch to the mptcp selftest. It seems inspecting subflow will be a
common test going forward for mptcp, so it will be beneficial to have a
"cgroup/getsockopt" way to inspect the subflow directly.
Take a look at a recent example [0]. The mptcp test is under a cgroup
already and has the cgroup setup. An extra "cgroup/getsockopt" prog
should be enough. That prog can walk the msk->conn_list and use
bpf_rdonly_cast (or the bpf_core_cast macro in libbpf) to cast a pointer
to tcp_sock for readonly. It will allow to inspect all the fields in a
tcp_sock.
It looks interesting to be able to inspect all the fields in a tcp_sock!
I will check with Geliang what can be done.
...
Something needs to a fix in patch 2(replied separately), so a re-spin is
needed.
Thank you. Will do in the next version!
...
pw-bot: cr
alan.maguire@oracle.com/
Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

Re: [PATCH bpf-next v4 2/2] selftests/bpf: Add mptcp subflow subtest