Hi Stanislav,
On 28/07/2023 18:51, Stanislav Fomichev wrote:
On 07/28, Matthieu Baerts wrote:
Hi Stanislav,
On 27/07/2023 20:01, Stanislav Fomichev wrote:
On 07/27, Matthieu Baerts wrote:
Hi Paul, Stanislav,
On 18/07/2023 18:14, Paul Moore wrote:
On Tue, Jul 18, 2023 at 11:21 AM Geliang Tang <geliang.tang@suse.com> wrote:
As is described in the "How to use MPTCP?" section in MPTCP wiki [1]:
"Your app can create sockets with IPPROTO_MPTCP as the proto: ( socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP); ). Legacy apps can be forced to create and use MPTCP sockets instead of TCP ones via the mptcpize command bundled with the mptcpd daemon."
But the mptcpize (LD_PRELOAD technique) command has some limitations [2]:
- it doesn't work if the application is not using libc (e.g. GoLang apps)
- in some envs, it might not be easy to set env vars / change the way apps are launched, e.g. on Android
- mptcpize needs to be launched with all apps that want MPTCP: we could have more control from BPF to enable MPTCP only for some apps or all the ones of a netns or a cgroup, etc.
- it is not in BPF, we cannot talk about it at netdev conf.
So this patchset attempts to use BPF to implement functions similar to mptcpize.
The main idea is to add a hook in sys_socket() to change the protocol id from IPPROTO_TCP (or 0) to IPPROTO_MPTCP.
[1] https://github.com/multipath-tcp/mptcp_net-next/wiki
[2] https://github.com/multipath-tcp/mptcp_net-next/issues/79
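The explicit opt-in quoted from the wiki can be sketched in C as follows. This is a minimal illustration, not part of the patchset: the helper name open_stream_socket is hypothetical, the fallback to plain TCP is an assumption about what an application might want, and the #define only covers libc headers that predate IPPROTO_MPTCP (262 is the value in the Linux UAPI headers).

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Older libc headers may not define IPPROTO_MPTCP; 262 is the Linux UAPI value. */
#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262
#endif

/*
 * Hypothetical helper: ask for an MPTCP socket first and, if the kernel
 * refuses it for any reason, retry with plain TCP.
 * Returns a file descriptor, or -1 if both attempts fail.
 */
int open_stream_socket(int family)
{
	int fd = socket(family, SOCK_STREAM, IPPROTO_MPTCP);

	if (fd < 0)
		fd = socket(family, SOCK_STREAM, IPPROTO_TCP);

	return fd;
}
```

An application would call open_stream_socket(AF_INET) wherever it used to call socket(AF_INET, SOCK_STREAM, 0). This is exactly the source change that legacy apps cannot make, which is why the thread looks for a way to do the switch from outside the application.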
v5:
- add bpf_mptcpify helper.
v4:
- use lsm_cgroup/socket_create
v3:
- patch 8: char cmd[128]; -> char cmd[256];
v2:
- Fix build selftests errors reported by CI
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/79
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
 include/linux/bpf.h                          |   1 +
 include/linux/lsm_hook_defs.h                |   2 +-
 include/linux/security.h                     |   6 +-
 include/uapi/linux/bpf.h                     |   7 +
 kernel/bpf/bpf_lsm.c                         |   2 +
 net/mptcp/bpf.c                              |  20 +++
 net/socket.c                                 |   4 +-
 security/apparmor/lsm.c                      |   8 +-
 security/security.c                          |   2 +-
 security/selinux/hooks.c                     |   6 +-
 tools/include/uapi/linux/bpf.h               |   7 +
 .../testing/selftests/bpf/prog_tests/mptcp.c | 128 ++++++++++++++++--
 tools/testing/selftests/bpf/progs/mptcpify.c |  17 +++
 13 files changed, 187 insertions(+), 23 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/mptcpify.c
...
diff --git a/security/security.c b/security/security.c
index b720424ca37d..bbebcddce420 100644
--- a/security/security.c
+++ b/security/security.c
@@ -4078,7 +4078,7 @@ EXPORT_SYMBOL(security_unix_may_send);
  * Return: Returns 0 if permission is granted.
  */
-int security_socket_create(int family, int type, int protocol, int kern)
+int security_socket_create(int *family, int *type, int *protocol, int kern)
 {
 	return call_int_hook(socket_create, 0, family, type, protocol, kern);
 }
Using the LSM to change the protocol family is not something we want to allow. I'm sorry, but you will need to take a different approach.
@Paul: Thank you for your feedback. It makes sense and I understand.
@Stanislav: Even though the implementation was smaller and reused more code, it looks like we cannot go in the direction you suggested. Do you think what Geliang suggested before in his v3 [1] can be accepted?
(Note that the v3 is the same as the v1, only some fixes in the selftests.)
We have too many hooks in networking, so something that doesn't add a new one is preferable :-(
Thank you for your reply and the explanation, I understand.
Moreover, we already have a 'socket init' hook, but it runs a bit late.
Indeed. And we cannot move it before the creation of the socket.
Is existing cgroup/sock completely unworkable? Is it possible to expose some new bpf_upgrade_socket_to(IPPROTO_MPTCP) kfunc which would call some new net_proto_family->upgrade_to(IPPROTO_MPTCP) to do the surgery? Or is it too hacky?
I cannot judge if it is too hacky or not but if you think it would be OK, please tell us :)
Maybe try and see how it goes? Doing the surgery to convert from tcp to mptcp is probably hard, but it seems that we should be able to do something like:
int upgrade_to(sock, sk)
{
	if (sk is not a tcp one)
		return -EINVAL;

	sk_common_release(sk);
	return inet6_create(net, sock, IPPROTO_MPTCP, false);
}
?
The only thing I'm not sure about is whether you can call inet6_create on a socket that has already been sk_common_release'd...
Oh sorry, now I better understand your suggestion and the fact it is hacky. Good workaround, we can keep this in mind if there are no other solutions to avoid these create-release-create operations.
Another option Alexei suggested is to add some fentry-like thing:
noinline int update_socket_protocol(int protocol)
{
	return protocol;
}
/* TODO: ^^^ add the above to mod_ret set */

int __sys_socket(int family, int type, int protocol)
{
	...

	protocol = update_socket_protocol(protocol);

	...
}
But it's also too problem specific it seems? And it's not cgroup-aware.
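To make the fentry-like idea concrete, here is a userspace model of the decision a BPF program attached to such a hook might make. It is only a sketch under assumptions: pick_protocol is a hypothetical name, and it takes family and type as extra inputs, which the one-argument update_socket_protocol() shown above does not provide. Masking SOCK_NONBLOCK and SOCK_CLOEXEC out of type is also an assumption about how the flags would reach the hook.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Older libc headers may not define IPPROTO_MPTCP; 262 is the Linux UAPI value. */
#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262
#endif

/*
 * Userspace model only, not kernel or BPF code.
 * Returns the protocol the socket should be created with: IPPROTO_MPTCP
 * for IPv4/IPv6 stream sockets that asked for TCP (explicitly or via 0),
 * and the caller's original protocol in every other case.
 */
int pick_protocol(int family, int type, int protocol)
{
	/* Only IPv4 and IPv6 sockets are candidates. */
	if (family != AF_INET && family != AF_INET6)
		return protocol;

	/* Ignore the creation flags that can be OR'ed into the type. */
	if ((type & ~(SOCK_NONBLOCK | SOCK_CLOEXEC)) != SOCK_STREAM)
		return protocol;

	/* 0 means "default protocol", which is TCP for stream sockets. */
	if (protocol != 0 && protocol != IPPROTO_TCP)
		return protocol;

	return IPPROTO_MPTCP;
}
```

Returning the input unchanged in every non-matching case mirrors the default body of the hook (return protocol;), so sockets the program does not care about behave exactly as before.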
It looks like it is what Geliang did in his v6. If it is the only acceptable solution, I guess we can do without cgroup support. We can continue the discussions in his v6 if that's easier.
Ack, that works too, let's see how other people feel about it. I'm assuming in the bpf program we can always do bpf_get_current_cgroup_id() to filter by cgroup.
Good point, that works too and looks sufficient!
Cheers, Matt