From: Menglong Dong <imagedong@tencent.com>

The return value of BPF_CGROUP_RUN_PROG_INET{4,6}_POST_BIND() in __inet_bind() is not handled properly. When the return value is non-zero, __inet_bind() only sets inet_saddr and inet_rcv_saddr to 0 and exits:

	err = BPF_CGROUP_RUN_PROG_INET4_POST_BIND(sk);
	if (err) {
		inet->inet_saddr = inet->inet_rcv_saddr = 0;
		goto out_release_sock;
	}

Take UDP as an example. A UDP socket is added to 'udp_prot.h.udp_table->hash' and 'udp_prot.h.udp_table->hash2' once sk->sk_prot->get_port() succeeds. If 'inet->inet_rcv_saddr' was specified, 'sk' is left in an 'hslot2' of 'hash2' that it does not belong to (because the bound address was reset to 0), and received UDP packets will not be delivered to this socket. If 'inet->inet_rcv_saddr' was not specified, the socket keeps working and can still receive packets, which is weird, as bind() has already failed.
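
The two cases above can be illustrated with a small userspace model. This is only a sketch, not kernel code: 'struct h2_sock', h2_hash() and the two-pass h2_lookup() are hypothetical stand-ins, and the hash function is a toy that has nothing to do with the kernel's keyed hash. The model assumes the receive path first checks the slot for the exact destination address and then the slot for the wildcard address.

```c
#include <assert.h>
#include <stdint.h>

#define H2_SLOTS 16u

/* Hypothetical stand-in for a UDP socket; not the kernel's struct sock. */
struct h2_sock {
	uint32_t rcv_saddr;	/* bound local address, 0 = wildcard */
	uint16_t port;		/* bound local port */
	unsigned int slot;	/* secondary-hash slot the socket sits in */
};

/* Toy (address, port) hash; the kernel uses a different, keyed hash. */
static unsigned int h2_hash(uint32_t addr, uint16_t port)
{
	return (addr ^ (uint32_t)port) % H2_SLOTS;
}

/* Models a successful get_port(): the slot is derived from the bound address. */
static void h2_get_port(struct h2_sock *sk, uint32_t addr, uint16_t port)
{
	sk->rcv_saddr = addr;
	sk->port = port;
	sk->slot = h2_hash(addr, port);
}

/* Models the old error path: the address is cleared, the slot is not. */
static void h2_post_bind_fail_old(struct h2_sock *sk)
{
	sk->rcv_saddr = 0;
}

/* Assumed two-pass receive lookup: exact-address slot, then wildcard slot. */
static int h2_lookup(const struct h2_sock *sk, uint32_t daddr, uint16_t dport)
{
	if (sk->port != dport)
		return 0;
	if (sk->slot == h2_hash(daddr, dport) && sk->rcv_saddr == daddr)
		return 1;
	if (sk->slot == h2_hash(0, dport) && sk->rcv_saddr == 0)
		return 1;
	return 0;
}

/* Walks through both cases from the text. */
void h2_demo(void)
{
	const uint32_t lo = 0x7f000001u;	/* 127.0.0.1 */
	struct h2_sock a = {0}, b = {0};

	h2_get_port(&a, lo, 8080);		/* bind to a specific address */
	assert(h2_lookup(&a, lo, 8080));
	h2_post_bind_fail_old(&a);
	assert(!h2_lookup(&a, lo, 8080));	/* stranded in the wrong slot */

	h2_get_port(&b, 0, 8080);		/* bind to the wildcard address */
	h2_post_bind_fail_old(&b);
	assert(h2_lookup(&b, lo, 8080));	/* still receives after failed bind */
}
```

In this model socket 'a' becomes unreachable because its slot was computed from 127.0.0.1 while its address now reads 0, whereas socket 'b' keeps matching because its slot and address were the wildcard all along.
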
To undo the get_port() operation, introduce a 'put_port' field in 'struct proto'. For TCP it is inet_put_port(); for UDP it is udp_lib_unhash(); for ICMP (ping) it is ping_unhash().
With this change, when sys_bind() fails because of BPF_CGROUP_RUN_PROG_INET4_POST_BIND(), the socket is unbound again, which means it can then be bound to another port.
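
The rollback can be sketched in userspace as follows. Everything here is a hypothetical miniature rather than the patch itself: 'struct toy_proto' carries only the two hooks under discussion, the 'hook_err' argument stands in for the return value of the post-bind BPF hook, and the -EINVAL returned for an already-bound socket is an assumption of the model.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct toy_sock;

/* Hypothetical miniature of 'struct proto' with only the two hooks discussed. */
struct toy_proto {
	int (*get_port)(struct toy_sock *sk, uint16_t port);
	void (*put_port)(struct toy_sock *sk);	/* may be NULL */
};

struct toy_sock {
	const struct toy_proto *prot;
	uint32_t saddr, rcv_saddr;
	uint16_t num;	/* bound port, 0 = unbound */
};

static bool toy_port_busy[65536];

/* Reserve the port; a non-zero return means failure. */
static int toy_get_port(struct toy_sock *sk, uint16_t port)
{
	if (toy_port_busy[port])
		return 1;
	toy_port_busy[port] = true;
	sk->num = port;
	return 0;
}

/* Undo toy_get_port(). */
static void toy_put_port(struct toy_sock *sk)
{
	toy_port_busy[sk->num] = false;
	sk->num = 0;
}

static const struct toy_proto toy_with_put = { toy_get_port, toy_put_port };
static const struct toy_proto toy_without_put = { toy_get_port, NULL };

/* Model of the bind path; hook_err stands in for the post-bind hook result. */
int toy_bind(struct toy_sock *sk, uint32_t addr, uint16_t port, int hook_err)
{
	if (sk->num)
		return -EINVAL;		/* assumed: already bound */
	sk->saddr = sk->rcv_saddr = addr;
	if (sk->prot->get_port(sk, port)) {
		sk->saddr = sk->rcv_saddr = 0;
		return -EADDRINUSE;
	}
	if (hook_err) {
		sk->saddr = sk->rcv_saddr = 0;
		if (sk->prot->put_port)		/* NULL check before the call */
			sk->prot->put_port(sk);
		return hook_err;
	}
	return 0;
}

/* Failed bind followed by a retry, with and without put_port. */
void toy_demo(void)
{
	struct toy_sock a = { .prot = &toy_with_put };
	struct toy_sock b = { .prot = &toy_without_put };

	assert(toy_bind(&a, 0x7f000001u, 8080, -EPERM) == -EPERM);
	assert(a.num == 0 && !toy_port_busy[8080]);		/* rolled back */
	assert(toy_bind(&a, 0x7f000001u, 8081, 0) == 0);	/* retry works */

	assert(toy_bind(&b, 0x7f000001u, 9090, -EPERM) == -EPERM);
	assert(toy_bind(&b, 0x7f000001u, 9091, 0) == -EINVAL);	/* still bound */
}
```

With 'put_port' set, the failed bind leaves no trace and the retry on another port succeeds; without it, the socket in this model stays bound to the first port even though bind reported an error.
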
The second patch adds selftests for this change.

Changes since v2:
- NULL check for sk->sk_prot->put_port

Changes since v1:
- introduce 'put_port' field for 'struct proto'
- add selftests for it

Menglong Dong (2):
  net: bpf: handle return value of BPF_CGROUP_RUN_PROG_INET{4,6}_POST_BIND()
  bpf: selftests: add bind retry for post_bind{4, 6}

 include/net/sock.h                      |   1 +
 net/ipv4/af_inet.c                      |   2 +
 net/ipv4/ping.c                         |   1 +
 net/ipv4/tcp_ipv4.c                     |   1 +
 net/ipv4/udp.c                          |   1 +
 net/ipv6/af_inet6.c                     |   2 +
 net/ipv6/ping.c                         |   1 +
 net/ipv6/tcp_ipv6.c                     |   1 +
 net/ipv6/udp.c                          |   1 +
 tools/testing/selftests/bpf/test_sock.c | 166 +++++++++++++++++++++--
 10 files changed, 157 insertions(+), 20 deletions(-)