This patchset adds two kfunc helpers, bpf_xdp_get_xfrm_state() and bpf_xdp_xfrm_state_release() that wrap xfrm_state_lookup() and xfrm_state_put(). The intent is to support software RSS (via XDP) for the ongoing/upcoming ipsec pcpu work [0]. Recent experiments performed on (hopefully) reproducible AWS testbeds indicate that single tunnel pcpu ipsec can reach line rate on 100G ENA nics.
Note this patchset only tests/shows generic xfrm_state access. The "secret sauce" (if you can really even call it that) involves accessing a soon-to-be-upstreamed pcpu_num field in xfrm_state. Early example is available here [1].
[0]: https://datatracker.ietf.org/doc/draft-ietf-ipsecme-multi-sa-performance/03/ [1]: https://github.com/danobi/xdp-tools/blob/e89a1c617aba3b50d990f779357d6ce2863...
Changes from v4: * Fixup commit message for selftest * Set opts->error -ENOENT for !x * Revert single file xfrm + bpf
Changes from v3: * Place all xfrm bpf integrations in xfrm_bpf.c * Avoid using nval as a temporary * Rebase to bpf-next * Remove extraneous __failure_unpriv annotation for verifier tests
Changes from v2: * Fix/simplify BPF_CORE_WRITE_BITFIELD() algorithm * Added verifier tests for bitfield writes * Fix state leakage across test_tunnel subtests
Changes from v1: * Move xfrm tunnel tests to test_progs * Fix writing to opts->error when opts is invalid * Use __bpf_kfunc_start_defs() * Remove unused vxlanhdr definition * Add and use BPF_CORE_WRITE_BITFIELD() macro * Make series bisect clean
Changes from RFCv2: * Rebased to ipsec-next * Fix netns leak
Changes from RFCv1: * Add Antony's commit tags * Add KF_ACQUIRE and KF_RELEASE semantics
Daniel Xu (9): bpf: xfrm: Add bpf_xdp_get_xfrm_state() kfunc bpf: xfrm: Add bpf_xdp_xfrm_state_release() kfunc libbpf: Add BPF_CORE_WRITE_BITFIELD() macro bpf: selftests: test_loader: Support __btf_path() annotation bpf: selftests: Add verifier tests for CO-RE bitfield writes bpf: selftests: test_tunnel: Setup fresh topology for each subtest bpf: selftests: test_tunnel: Use vmlinux.h declarations bpf: selftests: Move xfrm tunnel test to test_progs bpf: xfrm: Add selftest for bpf_xdp_get_xfrm_state()
include/net/xfrm.h | 9 + net/xfrm/Makefile | 1 + net/xfrm/xfrm_policy.c | 2 + net/xfrm/xfrm_state_bpf.c | 130 ++++++++++++++ tools/lib/bpf/bpf_core_read.h | 32 ++++ .../selftests/bpf/prog_tests/test_tunnel.c | 162 +++++++++++++++++- .../selftests/bpf/prog_tests/verifier.c | 2 + tools/testing/selftests/bpf/progs/bpf_misc.h | 1 + .../selftests/bpf/progs/bpf_tracing_net.h | 1 + .../selftests/bpf/progs/test_tunnel_kern.c | 138 ++++++++------- .../bpf/progs/verifier_bitfield_write.c | 100 +++++++++++ tools/testing/selftests/bpf/test_loader.c | 7 + tools/testing/selftests/bpf/test_tunnel.sh | 92 ---------- 13 files changed, 522 insertions(+), 155 deletions(-) create mode 100644 net/xfrm/xfrm_state_bpf.c create mode 100644 tools/testing/selftests/bpf/progs/verifier_bitfield_write.c
This commit adds support for per-prog btf_custom_path. This is necessary for testing CO-RE relocations on non-vmlinux types using test_loader infrastructure.
Acked-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Xu dxu@dxuuu.xyz --- tools/testing/selftests/bpf/progs/bpf_misc.h | 1 + tools/testing/selftests/bpf/test_loader.c | 7 +++++++ 2 files changed, 8 insertions(+)
diff --git a/tools/testing/selftests/bpf/progs/bpf_misc.h b/tools/testing/selftests/bpf/progs/bpf_misc.h index 799fff4995d8..2fd59970c43a 100644 --- a/tools/testing/selftests/bpf/progs/bpf_misc.h +++ b/tools/testing/selftests/bpf/progs/bpf_misc.h @@ -71,6 +71,7 @@ #define __retval_unpriv(val) __attribute__((btf_decl_tag("comment:test_retval_unpriv="#val))) #define __auxiliary __attribute__((btf_decl_tag("comment:test_auxiliary"))) #define __auxiliary_unpriv __attribute__((btf_decl_tag("comment:test_auxiliary_unpriv"))) +#define __btf_path(path) __attribute__((btf_decl_tag("comment:test_btf_path=" path)))
/* Convenience macro for use with 'asm volatile' blocks */ #define __naked __attribute__((naked)) diff --git a/tools/testing/selftests/bpf/test_loader.c b/tools/testing/selftests/bpf/test_loader.c index a350ecdfba4a..74ceb7877ae2 100644 --- a/tools/testing/selftests/bpf/test_loader.c +++ b/tools/testing/selftests/bpf/test_loader.c @@ -27,6 +27,7 @@ #define TEST_TAG_RETVAL_PFX_UNPRIV "comment:test_retval_unpriv=" #define TEST_TAG_AUXILIARY "comment:test_auxiliary" #define TEST_TAG_AUXILIARY_UNPRIV "comment:test_auxiliary_unpriv" +#define TEST_BTF_PATH "comment:test_btf_path="
/* Warning: duplicated in bpf_misc.h */ #define POINTER_VALUE 0xcafe4all @@ -58,6 +59,7 @@ struct test_spec { const char *prog_name; struct test_subspec priv; struct test_subspec unpriv; + const char *btf_custom_path; int log_level; int prog_flags; int mode_mask; @@ -288,6 +290,8 @@ static int parse_test_spec(struct test_loader *tester, goto cleanup; update_flags(&spec->prog_flags, flags, clear); } + } else if (str_has_pfx(s, TEST_BTF_PATH)) { + spec->btf_custom_path = s + sizeof(TEST_BTF_PATH) - 1; } }
@@ -578,6 +582,9 @@ void run_subtest(struct test_loader *tester, } }
+ /* Implicitly reset to NULL if next test case doesn't specify */ + open_opts->btf_custom_path = spec->btf_custom_path; + tobj = bpf_object__open_mem(obj_bytes, obj_byte_cnt, open_opts); if (!ASSERT_OK_PTR(tobj, "obj_open_mem")) /* shouldn't happen */ goto subtest_cleanup;
Add some tests that exercise BPF_CORE_WRITE_BITFIELD() macro. Since some non-trivial bit fiddling is going on, make sure various edge cases (such as adjacent bitfields and bitfields at the edge of structs) are exercised.
Acked-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Xu dxu@dxuuu.xyz --- .../selftests/bpf/prog_tests/verifier.c | 2 + .../bpf/progs/verifier_bitfield_write.c | 100 ++++++++++++++++++ 2 files changed, 102 insertions(+) create mode 100644 tools/testing/selftests/bpf/progs/verifier_bitfield_write.c
diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c index 8d746642cbd7..ac49ec25211d 100644 --- a/tools/testing/selftests/bpf/prog_tests/verifier.c +++ b/tools/testing/selftests/bpf/prog_tests/verifier.c @@ -6,6 +6,7 @@ #include "verifier_and.skel.h" #include "verifier_array_access.skel.h" #include "verifier_basic_stack.skel.h" +#include "verifier_bitfield_write.skel.h" #include "verifier_bounds.skel.h" #include "verifier_bounds_deduction.skel.h" #include "verifier_bounds_deduction_non_const.skel.h" @@ -116,6 +117,7 @@ static void run_tests_aux(const char *skel_name,
void test_verifier_and(void) { RUN(verifier_and); } void test_verifier_basic_stack(void) { RUN(verifier_basic_stack); } +void test_verifier_bitfield_write(void) { RUN(verifier_bitfield_write); } void test_verifier_bounds(void) { RUN(verifier_bounds); } void test_verifier_bounds_deduction(void) { RUN(verifier_bounds_deduction); } void test_verifier_bounds_deduction_non_const(void) { RUN(verifier_bounds_deduction_non_const); } diff --git a/tools/testing/selftests/bpf/progs/verifier_bitfield_write.c b/tools/testing/selftests/bpf/progs/verifier_bitfield_write.c new file mode 100644 index 000000000000..623f130a3198 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/verifier_bitfield_write.c @@ -0,0 +1,100 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include <linux/bpf.h> +#include <stdint.h> + +#include <bpf/bpf_helpers.h> +#include <bpf/bpf_core_read.h> + +#include "bpf_misc.h" + +struct core_reloc_bitfields { + /* unsigned bitfields */ + uint8_t ub1: 1; + uint8_t ub2: 2; + uint32_t ub7: 7; + /* signed bitfields */ + int8_t sb4: 4; + int32_t sb20: 20; + /* non-bitfields */ + uint32_t u32; + int32_t s32; +} __attribute__((preserve_access_index)); + +SEC("tc") +__description("single CO-RE bitfield roundtrip") +__btf_path("btf__core_reloc_bitfields.bpf.o") +__success +__retval(3) +int single_field_roundtrip(struct __sk_buff *ctx) +{ + struct core_reloc_bitfields bitfields; + + __builtin_memset(&bitfields, 0, sizeof(bitfields)); + BPF_CORE_WRITE_BITFIELD(&bitfields, ub2, 3); + return BPF_CORE_READ_BITFIELD(&bitfields, ub2); +} + +SEC("tc") +__description("multiple CO-RE bitfield roundtrip") +__btf_path("btf__core_reloc_bitfields.bpf.o") +__success +__retval(0x3FD) +int multiple_field_roundtrip(struct __sk_buff *ctx) +{ + struct core_reloc_bitfields bitfields; + uint8_t ub2; + int8_t sb4; + + __builtin_memset(&bitfields, 0, sizeof(bitfields)); + BPF_CORE_WRITE_BITFIELD(&bitfields, ub2, 1); + BPF_CORE_WRITE_BITFIELD(&bitfields, sb4, -1); + + ub2 = BPF_CORE_READ_BITFIELD(&bitfields, ub2); + sb4 = BPF_CORE_READ_BITFIELD(&bitfields, sb4); + + return (((uint8_t)sb4) << 2) | ub2; +} + +SEC("tc") +__description("adjacent CO-RE bitfield roundtrip") +__btf_path("btf__core_reloc_bitfields.bpf.o") +__success +__retval(7) +int adjacent_field_roundtrip(struct __sk_buff *ctx) +{ + struct core_reloc_bitfields bitfields; + uint8_t ub1, ub2; + + __builtin_memset(&bitfields, 0, sizeof(bitfields)); + BPF_CORE_WRITE_BITFIELD(&bitfields, ub1, 1); + BPF_CORE_WRITE_BITFIELD(&bitfields, ub2, 3); + + ub1 = BPF_CORE_READ_BITFIELD(&bitfields, ub1); + ub2 = BPF_CORE_READ_BITFIELD(&bitfields, ub2); + + return (ub2 << 1) | ub1; +} + +SEC("tc") +__description("multibyte CO-RE bitfield roundtrip") +__btf_path("btf__core_reloc_bitfields.bpf.o") +__success +__retval(0x21) +int multibyte_field_roundtrip(struct __sk_buff *ctx) +{ + struct core_reloc_bitfields bitfields; + uint32_t ub7; + uint8_t ub1; + + __builtin_memset(&bitfields, 0, sizeof(bitfields)); + BPF_CORE_WRITE_BITFIELD(&bitfields, ub1, 1); + BPF_CORE_WRITE_BITFIELD(&bitfields, ub7, 16); + + ub1 = BPF_CORE_READ_BITFIELD(&bitfields, ub1); + ub7 = BPF_CORE_READ_BITFIELD(&bitfields, ub7); + + return (ub7 << 1) | ub1; +} + +char _license[] SEC("license") = "GPL";
On 12/11/23 12:20 PM, Daniel Xu wrote:
Add some tests that exercise BPF_CORE_WRITE_BITFIELD() macro. Since some non-trivial bit fiddling is going on, make sure various edge cases (such as adjacent bitfields and bitfields at the edge of structs) are exercised.
Hi DanielXu, I have pushed the libbpf changes (adding BPF_CORE_WRITE_BITFIELD) and verifier test in patch 3-5 which is useful by itself. e.g. Another patchset can start using it also: https://lore.kernel.org/bpf/8fccb066-6d17-4fa8-ba67-287042046ea4@linux.dev/
Thanks.
On Wed, Dec 13, 2023 at 03:58:39PM -0800, Martin KaFai Lau wrote:
On 12/11/23 12:20 PM, Daniel Xu wrote:
Add some tests that exercise BPF_CORE_WRITE_BITFIELD() macro. Since some non-trivial bit fiddling is going on, make sure various edge cases (such as adjacent bitfields and bitfields at the edge of structs) are exercised.
Hi DanielXu, I have pushed the libbpf changes (adding BPF_CORE_WRITE_BITFIELD) and verifier test in patch 3-5 which is useful by itself. e.g. Another patchset can start using it also: https://lore.kernel.org/bpf/8fccb066-6d17-4fa8-ba67-287042046ea4@linux.dev/
Thanks.
Sounds good. I'll rebase my patchset on top of bpf-next.
Thanks, Daniel
This helps with determinism b/c individual setup/teardown prevents leaking state between different subtests.
Signed-off-by: Daniel Xu dxu@dxuuu.xyz --- tools/testing/selftests/bpf/prog_tests/test_tunnel.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/test_tunnel.c b/tools/testing/selftests/bpf/prog_tests/test_tunnel.c index d149ab98798d..b57d48219d0b 100644 --- a/tools/testing/selftests/bpf/prog_tests/test_tunnel.c +++ b/tools/testing/selftests/bpf/prog_tests/test_tunnel.c @@ -535,23 +535,20 @@ static void test_ipip_tunnel(enum ipip_encap encap) #define RUN_TEST(name, ...) \ ({ \ if (test__start_subtest(#name)) { \ + config_device(); \ test_ ## name(__VA_ARGS__); \ + cleanup(); \ } \ })
static void *test_tunnel_run_tests(void *arg) { - cleanup(); - config_device(); - RUN_TEST(vxlan_tunnel); RUN_TEST(ip6vxlan_tunnel); RUN_TEST(ipip_tunnel, NONE); RUN_TEST(ipip_tunnel, FOU); RUN_TEST(ipip_tunnel, GUE);
- cleanup(); - return NULL; }
vmlinux.h declarations are more ergnomic, especially when working with kfuncs. The uapi headers are often incomplete for kfunc definitions.
This commit also switches bitfield accesses to use CO-RE helpers. Switching to vmlinux.h definitions makes the verifier very unhappy with raw bitfield accesses. The error is:
; md.u.md2.dir = direction; 33: (69) r1 = *(u16 *)(r2 +11) misaligned stack access off (0x0; 0x0)+-64+11 size 2
Fix by using CO-RE-aware bitfield reads and writes.
Co-developed-by: Antony Antony antony.antony@secunet.com Signed-off-by: Antony Antony antony.antony@secunet.com Signed-off-by: Daniel Xu dxu@dxuuu.xyz --- .../selftests/bpf/progs/bpf_tracing_net.h | 1 + .../selftests/bpf/progs/test_tunnel_kern.c | 76 +++++-------------- 2 files changed, 22 insertions(+), 55 deletions(-)
diff --git a/tools/testing/selftests/bpf/progs/bpf_tracing_net.h b/tools/testing/selftests/bpf/progs/bpf_tracing_net.h index 0b793a102791..1bdc680b0e0e 100644 --- a/tools/testing/selftests/bpf/progs/bpf_tracing_net.h +++ b/tools/testing/selftests/bpf/progs/bpf_tracing_net.h @@ -26,6 +26,7 @@ #define IPV6_AUTOFLOWLABEL 70
#define TC_ACT_UNSPEC (-1) +#define TC_ACT_OK 0 #define TC_ACT_SHOT 2
#define SOL_TCP 6 diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c index f66af753bbbb..b320fb7bb080 100644 --- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c +++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c @@ -6,62 +6,26 @@ * modify it under the terms of version 2 of the GNU General Public * License as published by the Free Software Foundation. */ -#include <stddef.h> -#include <string.h> -#include <arpa/inet.h> -#include <linux/bpf.h> -#include <linux/if_ether.h> -#include <linux/if_packet.h> -#include <linux/if_tunnel.h> -#include <linux/ip.h> -#include <linux/ipv6.h> -#include <linux/icmp.h> -#include <linux/types.h> -#include <linux/socket.h> -#include <linux/pkt_cls.h> -#include <linux/erspan.h> -#include <linux/udp.h> +#include "vmlinux.h" +#include <bpf/bpf_core_read.h> #include <bpf/bpf_helpers.h> #include <bpf/bpf_endian.h> +#include "bpf_kfuncs.h" +#include "bpf_tracing_net.h"
#define log_err(__ret) bpf_printk("ERROR line:%d ret:%d\n", __LINE__, __ret)
-#define VXLAN_UDP_PORT 4789 +#define VXLAN_UDP_PORT 4789 +#define ETH_P_IP 0x0800 +#define PACKET_HOST 0 +#define TUNNEL_CSUM bpf_htons(0x01) +#define TUNNEL_KEY bpf_htons(0x04)
/* Only IPv4 address assigned to veth1. * 172.16.1.200 */ #define ASSIGNED_ADDR_VETH1 0xac1001c8
-struct geneve_opt { - __be16 opt_class; - __u8 type; - __u8 length:5; - __u8 r3:1; - __u8 r2:1; - __u8 r1:1; - __u8 opt_data[8]; /* hard-coded to 8 byte */ -}; - -struct vxlanhdr { - __be32 vx_flags; - __be32 vx_vni; -} __attribute__((packed)); - -struct vxlan_metadata { - __u32 gbp; -}; - -struct bpf_fou_encap { - __be16 sport; - __be16 dport; -}; - -enum bpf_fou_encap_type { - FOU_BPF_ENCAP_FOU, - FOU_BPF_ENCAP_GUE, -}; - int bpf_skb_set_fou_encap(struct __sk_buff *skb_ctx, struct bpf_fou_encap *encap, int type) __ksym; int bpf_skb_get_fou_encap(struct __sk_buff *skb_ctx, @@ -205,9 +169,9 @@ int erspan_set_tunnel(struct __sk_buff *skb) __u8 hwid = 7;
md.version = 2; - md.u.md2.dir = direction; - md.u.md2.hwid = hwid & 0xf; - md.u.md2.hwid_upper = (hwid >> 4) & 0x3; + BPF_CORE_WRITE_BITFIELD(&md.u.md2, dir, direction); + BPF_CORE_WRITE_BITFIELD(&md.u.md2, hwid, (hwid & 0xf)); + BPF_CORE_WRITE_BITFIELD(&md.u.md2, hwid_upper, (hwid >> 4) & 0x3); #endif
ret = bpf_skb_set_tunnel_opt(skb, &md, sizeof(md)); @@ -246,8 +210,9 @@ int erspan_get_tunnel(struct __sk_buff *skb) bpf_printk("\tindex %x\n", index); #else bpf_printk("\tdirection %d hwid %x timestamp %u\n", - md.u.md2.dir, - (md.u.md2.hwid_upper << 4) + md.u.md2.hwid, + BPF_CORE_READ_BITFIELD(&md.u.md2, dir), + (BPF_CORE_READ_BITFIELD(&md.u.md2, hwid_upper) << 4) + + BPF_CORE_READ_BITFIELD(&md.u.md2, hwid), bpf_ntohl(md.u.md2.timestamp)); #endif
@@ -284,9 +249,9 @@ int ip4ip6erspan_set_tunnel(struct __sk_buff *skb) __u8 hwid = 17;
md.version = 2; - md.u.md2.dir = direction; - md.u.md2.hwid = hwid & 0xf; - md.u.md2.hwid_upper = (hwid >> 4) & 0x3; + BPF_CORE_WRITE_BITFIELD(&md.u.md2, dir, direction); + BPF_CORE_WRITE_BITFIELD(&md.u.md2, hwid, (hwid & 0xf)); + BPF_CORE_WRITE_BITFIELD(&md.u.md2, hwid_upper, (hwid >> 4) & 0x3); #endif
ret = bpf_skb_set_tunnel_opt(skb, &md, sizeof(md)); @@ -326,8 +291,9 @@ int ip4ip6erspan_get_tunnel(struct __sk_buff *skb) bpf_printk("\tindex %x\n", index); #else bpf_printk("\tdirection %d hwid %x timestamp %u\n", - md.u.md2.dir, - (md.u.md2.hwid_upper << 4) + md.u.md2.hwid, + BPF_CORE_READ_BITFIELD(&md.u.md2, dir), + (BPF_CORE_READ_BITFIELD(&md.u.md2, hwid_upper) << 4) + + BPF_CORE_READ_BITFIELD(&md.u.md2, hwid), bpf_ntohl(md.u.md2.timestamp)); #endif
test_progs is better than a shell script b/c C is a bit easier to maintain than shell. Also it's easier to use new infra like memory mapped global variables from C via bpf skeleton.
Co-developed-by: Antony Antony antony.antony@secunet.com Signed-off-by: Antony Antony antony.antony@secunet.com Signed-off-by: Daniel Xu dxu@dxuuu.xyz --- .../selftests/bpf/prog_tests/test_tunnel.c | 143 ++++++++++++++++++ .../selftests/bpf/progs/test_tunnel_kern.c | 11 +- tools/testing/selftests/bpf/test_tunnel.sh | 92 ----------- 3 files changed, 151 insertions(+), 95 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/test_tunnel.c b/tools/testing/selftests/bpf/prog_tests/test_tunnel.c index b57d48219d0b..2d7f8fa82ebd 100644 --- a/tools/testing/selftests/bpf/prog_tests/test_tunnel.c +++ b/tools/testing/selftests/bpf/prog_tests/test_tunnel.c @@ -50,6 +50,7 @@ */
#include <arpa/inet.h> +#include <linux/if_link.h> #include <linux/if_tun.h> #include <linux/limits.h> #include <linux/sysctl.h> @@ -92,6 +93,11 @@ #define IPIP_TUNL_DEV0 "ipip00" #define IPIP_TUNL_DEV1 "ipip11"
+#define XFRM_AUTH "0x1111111111111111111111111111111111111111" +#define XFRM_ENC "0x22222222222222222222222222222222" +#define XFRM_SPI_IN_TO_OUT 0x1 +#define XFRM_SPI_OUT_TO_IN 0x2 + #define PING_ARGS "-i 0.01 -c 3 -w 10 -q"
static int config_device(void) @@ -264,6 +270,92 @@ static void delete_ipip_tunnel(void) SYS_NOFAIL("ip fou del port 5555 2> /dev/null"); }
+static int add_xfrm_tunnel(void) +{ + /* at_ns0 namespace + * at_ns0 -> root + */ + SYS(fail, + "ip netns exec at_ns0 " + "ip xfrm state add src %s dst %s proto esp " + "spi %d reqid 1 mode tunnel " + "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", + IP4_ADDR_VETH0, IP4_ADDR1_VETH1, XFRM_SPI_IN_TO_OUT, XFRM_AUTH, XFRM_ENC); + SYS(fail, + "ip netns exec at_ns0 " + "ip xfrm policy add src %s/32 dst %s/32 dir out " + "tmpl src %s dst %s proto esp reqid 1 " + "mode tunnel", + IP4_ADDR_TUNL_DEV0, IP4_ADDR_TUNL_DEV1, IP4_ADDR_VETH0, IP4_ADDR1_VETH1); + + /* root -> at_ns0 */ + SYS(fail, + "ip netns exec at_ns0 " + "ip xfrm state add src %s dst %s proto esp " + "spi %d reqid 2 mode tunnel " + "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", + IP4_ADDR1_VETH1, IP4_ADDR_VETH0, XFRM_SPI_OUT_TO_IN, XFRM_AUTH, XFRM_ENC); + SYS(fail, + "ip netns exec at_ns0 " + "ip xfrm policy add src %s/32 dst %s/32 dir in " + "tmpl src %s dst %s proto esp reqid 2 " + "mode tunnel", + IP4_ADDR_TUNL_DEV1, IP4_ADDR_TUNL_DEV0, IP4_ADDR1_VETH1, IP4_ADDR_VETH0); + + /* address & route */ + SYS(fail, "ip netns exec at_ns0 ip addr add dev veth0 %s/32", + IP4_ADDR_TUNL_DEV0); + SYS(fail, "ip netns exec at_ns0 ip route add %s dev veth0 via %s src %s", + IP4_ADDR_TUNL_DEV1, IP4_ADDR1_VETH1, IP4_ADDR_TUNL_DEV0); + + /* root namespace + * at_ns0 -> root + */ + SYS(fail, + "ip xfrm state add src %s dst %s proto esp " + "spi %d reqid 1 mode tunnel " + "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", + IP4_ADDR_VETH0, IP4_ADDR1_VETH1, XFRM_SPI_IN_TO_OUT, XFRM_AUTH, XFRM_ENC); + SYS(fail, + "ip xfrm policy add src %s/32 dst %s/32 dir in " + "tmpl src %s dst %s proto esp reqid 1 " + "mode tunnel", + IP4_ADDR_TUNL_DEV0, IP4_ADDR_TUNL_DEV1, IP4_ADDR_VETH0, IP4_ADDR1_VETH1); + + /* root -> at_ns0 */ + SYS(fail, + "ip xfrm state add src %s dst %s proto esp " + "spi %d reqid 2 mode tunnel " + "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", + IP4_ADDR1_VETH1, IP4_ADDR_VETH0, XFRM_SPI_OUT_TO_IN, XFRM_AUTH, XFRM_ENC); + SYS(fail, + "ip xfrm policy add src %s/32 dst %s/32 dir out " + "tmpl src %s dst %s proto esp reqid 2 " + "mode tunnel", + IP4_ADDR_TUNL_DEV1, IP4_ADDR_TUNL_DEV0, IP4_ADDR1_VETH1, IP4_ADDR_VETH0); + + /* address & route */ + SYS(fail, "ip addr add dev veth1 %s/32", IP4_ADDR_TUNL_DEV1); + SYS(fail, "ip route add %s dev veth1 via %s src %s", + IP4_ADDR_TUNL_DEV0, IP4_ADDR_VETH0, IP4_ADDR_TUNL_DEV1); + + return 0; +fail: + return -1; +} + +static void delete_xfrm_tunnel(void) +{ + SYS_NOFAIL("ip xfrm policy delete dir out src %s/32 dst %s/32 2> /dev/null", + IP4_ADDR_TUNL_DEV1, IP4_ADDR_TUNL_DEV0); + SYS_NOFAIL("ip xfrm policy delete dir in src %s/32 dst %s/32 2> /dev/null", + IP4_ADDR_TUNL_DEV0, IP4_ADDR_TUNL_DEV1); + SYS_NOFAIL("ip xfrm state delete src %s dst %s proto esp spi %d 2> /dev/null", + IP4_ADDR_VETH0, IP4_ADDR1_VETH1, XFRM_SPI_IN_TO_OUT); + SYS_NOFAIL("ip xfrm state delete src %s dst %s proto esp spi %d 2> /dev/null", + IP4_ADDR1_VETH1, IP4_ADDR_VETH0, XFRM_SPI_OUT_TO_IN); +} + static int test_ping(int family, const char *addr) { SYS(fail, "%s %s %s > /dev/null", ping_command(family), PING_ARGS, addr); @@ -532,6 +624,56 @@ static void test_ipip_tunnel(enum ipip_encap encap) test_tunnel_kern__destroy(skel); }
+static void test_xfrm_tunnel(void) +{ + DECLARE_LIBBPF_OPTS(bpf_tc_hook, tc_hook, + .attach_point = BPF_TC_INGRESS); + struct test_tunnel_kern *skel = NULL; + struct nstoken *nstoken; + int tc_prog_fd; + int ifindex; + int err; + + err = add_xfrm_tunnel(); + if (!ASSERT_OK(err, "add_xfrm_tunnel")) + return; + + skel = test_tunnel_kern__open_and_load(); + if (!ASSERT_OK_PTR(skel, "test_tunnel_kern__open_and_load")) + goto done; + + ifindex = if_nametoindex("veth1"); + if (!ASSERT_NEQ(ifindex, 0, "veth1 ifindex")) + goto done; + + /* attach tc prog to tunnel dev */ + tc_hook.ifindex = ifindex; + tc_prog_fd = bpf_program__fd(skel->progs.xfrm_get_state); + if (!ASSERT_GE(tc_prog_fd, 0, "bpf_program__fd")) + goto done; + if (attach_tc_prog(&tc_hook, tc_prog_fd, -1)) + goto done; + + /* ping from at_ns0 namespace test */ + nstoken = open_netns("at_ns0"); + err = test_ping(AF_INET, IP4_ADDR_TUNL_DEV1); + close_netns(nstoken); + if (!ASSERT_OK(err, "test_ping")) + goto done; + + if (!ASSERT_EQ(skel->bss->xfrm_reqid, 1, "req_id")) + goto done; + if (!ASSERT_EQ(skel->bss->xfrm_spi, XFRM_SPI_IN_TO_OUT, "spi")) + goto done; + if (!ASSERT_EQ(skel->bss->xfrm_remote_ip, 0xac100164, "remote_ip")) + goto done; + +done: + delete_xfrm_tunnel(); + if (skel) + test_tunnel_kern__destroy(skel); +} + #define RUN_TEST(name, ...) \ ({ \ if (test__start_subtest(#name)) { \ @@ -548,6 +690,7 @@ static void *test_tunnel_run_tests(void *arg) RUN_TEST(ipip_tunnel, NONE); RUN_TEST(ipip_tunnel, FOU); RUN_TEST(ipip_tunnel, GUE); + RUN_TEST(xfrm_tunnel);
return NULL; } diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c index b320fb7bb080..3a59eb9c34de 100644 --- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c +++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c @@ -929,6 +929,10 @@ int ip6ip6_get_tunnel(struct __sk_buff *skb) return TC_ACT_OK; }
+volatile int xfrm_reqid = 0; +volatile int xfrm_spi = 0; +volatile int xfrm_remote_ip = 0; + SEC("tc") int xfrm_get_state(struct __sk_buff *skb) { @@ -939,9 +943,10 @@ int xfrm_get_state(struct __sk_buff *skb) if (ret < 0) return TC_ACT_OK;
- bpf_printk("reqid %d spi 0x%x remote ip 0x%x\n", - x.reqid, bpf_ntohl(x.spi), - bpf_ntohl(x.remote_ipv4)); + xfrm_reqid = x.reqid; + xfrm_spi = bpf_ntohl(x.spi); + xfrm_remote_ip = bpf_ntohl(x.remote_ipv4); + return TC_ACT_OK; }
diff --git a/tools/testing/selftests/bpf/test_tunnel.sh b/tools/testing/selftests/bpf/test_tunnel.sh index 2dec7dbf29a2..d9661b9988ba 100755 --- a/tools/testing/selftests/bpf/test_tunnel.sh +++ b/tools/testing/selftests/bpf/test_tunnel.sh @@ -517,90 +517,6 @@ test_ip6ip6() echo -e ${GREEN}"PASS: ip6$TYPE"${NC} }
-setup_xfrm_tunnel() -{ - auth=0x$(printf '1%.0s' {1..40}) - enc=0x$(printf '2%.0s' {1..32}) - spi_in_to_out=0x1 - spi_out_to_in=0x2 - # at_ns0 namespace - # at_ns0 -> root - ip netns exec at_ns0 \ - ip xfrm state add src 172.16.1.100 dst 172.16.1.200 proto esp \ - spi $spi_in_to_out reqid 1 mode tunnel \ - auth-trunc 'hmac(sha1)' $auth 96 enc 'cbc(aes)' $enc - ip netns exec at_ns0 \ - ip xfrm policy add src 10.1.1.100/32 dst 10.1.1.200/32 dir out \ - tmpl src 172.16.1.100 dst 172.16.1.200 proto esp reqid 1 \ - mode tunnel - # root -> at_ns0 - ip netns exec at_ns0 \ - ip xfrm state add src 172.16.1.200 dst 172.16.1.100 proto esp \ - spi $spi_out_to_in reqid 2 mode tunnel \ - auth-trunc 'hmac(sha1)' $auth 96 enc 'cbc(aes)' $enc - ip netns exec at_ns0 \ - ip xfrm policy add src 10.1.1.200/32 dst 10.1.1.100/32 dir in \ - tmpl src 172.16.1.200 dst 172.16.1.100 proto esp reqid 2 \ - mode tunnel - # address & route - ip netns exec at_ns0 \ - ip addr add dev veth0 10.1.1.100/32 - ip netns exec at_ns0 \ - ip route add 10.1.1.200 dev veth0 via 172.16.1.200 \ - src 10.1.1.100 - - # root namespace - # at_ns0 -> root - ip xfrm state add src 172.16.1.100 dst 172.16.1.200 proto esp \ - spi $spi_in_to_out reqid 1 mode tunnel \ - auth-trunc 'hmac(sha1)' $auth 96 enc 'cbc(aes)' $enc - ip xfrm policy add src 10.1.1.100/32 dst 10.1.1.200/32 dir in \ - tmpl src 172.16.1.100 dst 172.16.1.200 proto esp reqid 1 \ - mode tunnel - # root -> at_ns0 - ip xfrm state add src 172.16.1.200 dst 172.16.1.100 proto esp \ - spi $spi_out_to_in reqid 2 mode tunnel \ - auth-trunc 'hmac(sha1)' $auth 96 enc 'cbc(aes)' $enc - ip xfrm policy add src 10.1.1.200/32 dst 10.1.1.100/32 dir out \ - tmpl src 172.16.1.200 dst 172.16.1.100 proto esp reqid 2 \ - mode tunnel - # address & route - ip addr add dev veth1 10.1.1.200/32 - ip route add 10.1.1.100 dev veth1 via 172.16.1.100 src 10.1.1.200 -} - -test_xfrm_tunnel() -{ - if [[ -e /sys/kernel/tracing/trace ]]; then - TRACE=/sys/kernel/tracing/trace - else - TRACE=/sys/kernel/debug/tracing/trace - fi - config_device - > ${TRACE} - setup_xfrm_tunnel - mkdir -p ${BPF_PIN_TUNNEL_DIR} - bpftool prog loadall ${BPF_FILE} ${BPF_PIN_TUNNEL_DIR} - tc qdisc add dev veth1 clsact - tc filter add dev veth1 proto ip ingress bpf da object-pinned \ - ${BPF_PIN_TUNNEL_DIR}/xfrm_get_state - ip netns exec at_ns0 ping $PING_ARG 10.1.1.200 - sleep 1 - grep "reqid 1" ${TRACE} - check_err $? - grep "spi 0x1" ${TRACE} - check_err $? - grep "remote ip 0xac100164" ${TRACE} - check_err $? - cleanup - - if [ $ret -ne 0 ]; then - echo -e ${RED}"FAIL: xfrm tunnel"${NC} - return 1 - fi - echo -e ${GREEN}"PASS: xfrm tunnel"${NC} -} - attach_bpf() { DEV=$1 @@ -630,10 +546,6 @@ cleanup() ip link del ip6geneve11 2> /dev/null ip link del erspan11 2> /dev/null ip link del ip6erspan11 2> /dev/null - ip xfrm policy delete dir out src 10.1.1.200/32 dst 10.1.1.100/32 2> /dev/null - ip xfrm policy delete dir in src 10.1.1.100/32 dst 10.1.1.200/32 2> /dev/null - ip xfrm state delete src 172.16.1.100 dst 172.16.1.200 proto esp spi 0x1 2> /dev/null - ip xfrm state delete src 172.16.1.200 dst 172.16.1.100 proto esp spi 0x2 2> /dev/null }
cleanup_exit() @@ -716,10 +628,6 @@ bpf_tunnel_test() test_ip6ip6 errors=$(( $errors + $? ))
- echo "Testing IPSec tunnel..." - test_xfrm_tunnel - errors=$(( $errors + $? )) - return $errors }
This commit extends test_tunnel selftest to test the new XDP xfrm state lookup kfunc.
Co-developed-by: Antony Antony antony.antony@secunet.com Signed-off-by: Antony Antony antony.antony@secunet.com Signed-off-by: Daniel Xu dxu@dxuuu.xyz --- .../selftests/bpf/prog_tests/test_tunnel.c | 20 ++++++-- .../selftests/bpf/progs/test_tunnel_kern.c | 51 +++++++++++++++++++ 2 files changed, 67 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/test_tunnel.c b/tools/testing/selftests/bpf/prog_tests/test_tunnel.c index 2d7f8fa82ebd..fc804095d578 100644 --- a/tools/testing/selftests/bpf/prog_tests/test_tunnel.c +++ b/tools/testing/selftests/bpf/prog_tests/test_tunnel.c @@ -278,7 +278,7 @@ static int add_xfrm_tunnel(void) SYS(fail, "ip netns exec at_ns0 " "ip xfrm state add src %s dst %s proto esp " - "spi %d reqid 1 mode tunnel " + "spi %d reqid 1 mode tunnel replay-window 42 " "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", IP4_ADDR_VETH0, IP4_ADDR1_VETH1, XFRM_SPI_IN_TO_OUT, XFRM_AUTH, XFRM_ENC); SYS(fail, @@ -292,7 +292,7 @@ static int add_xfrm_tunnel(void) SYS(fail, "ip netns exec at_ns0 " "ip xfrm state add src %s dst %s proto esp " - "spi %d reqid 2 mode tunnel " + "spi %d reqid 2 mode tunnel replay-window 42 " "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", IP4_ADDR1_VETH1, IP4_ADDR_VETH0, XFRM_SPI_OUT_TO_IN, XFRM_AUTH, XFRM_ENC); SYS(fail, @@ -313,7 +313,7 @@ static int add_xfrm_tunnel(void) */ SYS(fail, "ip xfrm state add src %s dst %s proto esp " - "spi %d reqid 1 mode tunnel " + "spi %d reqid 1 mode tunnel replay-window 42 " "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", IP4_ADDR_VETH0, IP4_ADDR1_VETH1, XFRM_SPI_IN_TO_OUT, XFRM_AUTH, XFRM_ENC); SYS(fail, @@ -325,7 +325,7 @@ static int add_xfrm_tunnel(void) /* root -> at_ns0 */ SYS(fail, "ip xfrm state add src %s dst %s proto esp " - "spi %d reqid 2 mode tunnel " + "spi %d reqid 2 mode tunnel replay-window 42 " "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", IP4_ADDR1_VETH1, IP4_ADDR_VETH0, XFRM_SPI_OUT_TO_IN, XFRM_AUTH, XFRM_ENC); SYS(fail, @@ -628,8 +628,10 @@ static void test_xfrm_tunnel(void) { DECLARE_LIBBPF_OPTS(bpf_tc_hook, tc_hook, .attach_point = BPF_TC_INGRESS); + LIBBPF_OPTS(bpf_xdp_attach_opts, opts); struct test_tunnel_kern *skel = NULL; struct nstoken *nstoken; + int xdp_prog_fd; int tc_prog_fd; int ifindex; int err; @@ -654,6 +656,14 @@ static void test_xfrm_tunnel(void) if (attach_tc_prog(&tc_hook, tc_prog_fd, -1)) goto done;
+ /* attach xdp prog to tunnel dev */ + xdp_prog_fd = bpf_program__fd(skel->progs.xfrm_get_state_xdp); + if (!ASSERT_GE(xdp_prog_fd, 0, "bpf_program__fd")) + goto done; + err = bpf_xdp_attach(ifindex, xdp_prog_fd, XDP_FLAGS_REPLACE, &opts); + if (!ASSERT_OK(err, "bpf_xdp_attach")) + goto done; + /* ping from at_ns0 namespace test */ nstoken = open_netns("at_ns0"); err = test_ping(AF_INET, IP4_ADDR_TUNL_DEV1); @@ -667,6 +677,8 @@ static void test_xfrm_tunnel(void) goto done; if (!ASSERT_EQ(skel->bss->xfrm_remote_ip, 0xac100164, "remote_ip")) goto done; + if (!ASSERT_EQ(skel->bss->xfrm_replay_window, 42, "replay_window")) + goto done;
done: delete_xfrm_tunnel(); diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c index 3a59eb9c34de..c0dd38616562 100644 --- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c +++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c @@ -30,6 +30,10 @@ int bpf_skb_set_fou_encap(struct __sk_buff *skb_ctx, struct bpf_fou_encap *encap, int type) __ksym; int bpf_skb_get_fou_encap(struct __sk_buff *skb_ctx, struct bpf_fou_encap *encap) __ksym; +struct xfrm_state * +bpf_xdp_get_xfrm_state(struct xdp_md *ctx, struct bpf_xfrm_state_opts *opts, + u32 opts__sz) __ksym; +void bpf_xdp_xfrm_state_release(struct xfrm_state *x) __ksym;
struct { __uint(type, BPF_MAP_TYPE_ARRAY); @@ -950,4 +954,51 @@ int xfrm_get_state(struct __sk_buff *skb) return TC_ACT_OK; }
+volatile int xfrm_replay_window = 0; + +SEC("xdp") +int xfrm_get_state_xdp(struct xdp_md *xdp) +{ + struct bpf_xfrm_state_opts opts = {}; + struct xfrm_state *x = NULL; + struct ip_esp_hdr *esph; + struct bpf_dynptr ptr; + u8 esph_buf[8] = {}; + u8 iph_buf[20] = {}; + struct iphdr *iph; + u32 off; + + if (bpf_dynptr_from_xdp(xdp, 0, &ptr)) + goto out; + + off = sizeof(struct ethhdr); + iph = bpf_dynptr_slice(&ptr, off, iph_buf, sizeof(iph_buf)); + if (!iph || iph->protocol != IPPROTO_ESP) + goto out; + + off += sizeof(struct iphdr); + esph = bpf_dynptr_slice(&ptr, off, esph_buf, sizeof(esph_buf)); + if (!esph) + goto out; + + opts.netns_id = BPF_F_CURRENT_NETNS; + opts.daddr.a4 = iph->daddr; + opts.spi = esph->spi; + opts.proto = IPPROTO_ESP; + opts.family = AF_INET; + + x = bpf_xdp_get_xfrm_state(xdp, &opts, sizeof(opts)); + if (!x || opts.error) + goto out; + + if (!x->replay_esn) + goto out; + + xfrm_replay_window = x->replay_esn->replay_window; +out: + if (x) + bpf_xdp_xfrm_state_release(x); + return XDP_PASS; +} + char _license[] SEC("license") = "GPL";
Hi Daniel,
Tiny nits below in case you respin this for other reasons:
On Mon, Dec 11, 2023 at 12:20 PM Daniel Xu dxu@dxuuu.xyz wrote:
This commit extends test_tunnel selftest to test the new XDP xfrm state lookup kfunc.
Co-developed-by: Antony Antony antony.antony@secunet.com Signed-off-by: Antony Antony antony.antony@secunet.com Signed-off-by: Daniel Xu dxu@dxuuu.xyz
.../selftests/bpf/prog_tests/test_tunnel.c | 20 ++++++-- .../selftests/bpf/progs/test_tunnel_kern.c | 51 +++++++++++++++++++ 2 files changed, 67 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/test_tunnel.c b/tools/testing/selftests/bpf/prog_tests/test_tunnel.c index 2d7f8fa82ebd..fc804095d578 100644 --- a/tools/testing/selftests/bpf/prog_tests/test_tunnel.c +++ b/tools/testing/selftests/bpf/prog_tests/test_tunnel.c @@ -278,7 +278,7 @@ static int add_xfrm_tunnel(void) SYS(fail, "ip netns exec at_ns0 " "ip xfrm state add src %s dst %s proto esp "
"spi %d reqid 1 mode tunnel "
"spi %d reqid 1 mode tunnel replay-window 42 " "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", IP4_ADDR_VETH0, IP4_ADDR1_VETH1, XFRM_SPI_IN_TO_OUT, XFRM_AUTH, XFRM_ENC); SYS(fail,
@@ -292,7 +292,7 @@ static int add_xfrm_tunnel(void) SYS(fail, "ip netns exec at_ns0 " "ip xfrm state add src %s dst %s proto esp "
"spi %d reqid 2 mode tunnel "
"spi %d reqid 2 mode tunnel replay-window 42 "
nit: why do you need to set the replay-window in both directions?
"auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", IP4_ADDR1_VETH1, IP4_ADDR_VETH0, XFRM_SPI_OUT_TO_IN, XFRM_AUTH, XFRM_ENC); SYS(fail,
@@ -313,7 +313,7 @@ static int add_xfrm_tunnel(void) */ SYS(fail, "ip xfrm state add src %s dst %s proto esp "
"spi %d reqid 1 mode tunnel "
"spi %d reqid 1 mode tunnel replay-window 42 " "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", IP4_ADDR_VETH0, IP4_ADDR1_VETH1, XFRM_SPI_IN_TO_OUT, XFRM_AUTH, XFRM_ENC); SYS(fail,
@@ -325,7 +325,7 @@ static int add_xfrm_tunnel(void) /* root -> at_ns0 */ SYS(fail, "ip xfrm state add src %s dst %s proto esp "
"spi %d reqid 2 mode tunnel "
"spi %d reqid 2 mode tunnel replay-window 42 " "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", IP4_ADDR1_VETH1, IP4_ADDR_VETH0, XFRM_SPI_OUT_TO_IN, XFRM_AUTH, XFRM_ENC); SYS(fail,
@@ -628,8 +628,10 @@ static void test_xfrm_tunnel(void) { DECLARE_LIBBPF_OPTS(bpf_tc_hook, tc_hook, .attach_point = BPF_TC_INGRESS);
LIBBPF_OPTS(bpf_xdp_attach_opts, opts); struct test_tunnel_kern *skel = NULL; struct nstoken *nstoken;
int xdp_prog_fd; int tc_prog_fd; int ifindex; int err;
@@ -654,6 +656,14 @@ static void test_xfrm_tunnel(void) if (attach_tc_prog(&tc_hook, tc_prog_fd, -1)) goto done;
/* attach xdp prog to tunnel dev */
xdp_prog_fd = bpf_program__fd(skel->progs.xfrm_get_state_xdp);
if (!ASSERT_GE(xdp_prog_fd, 0, "bpf_program__fd"))
goto done;
err = bpf_xdp_attach(ifindex, xdp_prog_fd, XDP_FLAGS_REPLACE, &opts);
if (!ASSERT_OK(err, "bpf_xdp_attach"))
goto done;
/* ping from at_ns0 namespace test */ nstoken = open_netns("at_ns0"); err = test_ping(AF_INET, IP4_ADDR_TUNL_DEV1);
@@ -667,6 +677,8 @@ static void test_xfrm_tunnel(void) goto done; if (!ASSERT_EQ(skel->bss->xfrm_remote_ip, 0xac100164, "remote_ip")) goto done;
if (!ASSERT_EQ(skel->bss->xfrm_replay_window, 42, "replay_window"))
goto done;
done: delete_xfrm_tunnel(); diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c index 3a59eb9c34de..c0dd38616562 100644 --- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c +++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c @@ -30,6 +30,10 @@ int bpf_skb_set_fou_encap(struct __sk_buff *skb_ctx, struct bpf_fou_encap *encap, int type) __ksym; int bpf_skb_get_fou_encap(struct __sk_buff *skb_ctx, struct bpf_fou_encap *encap) __ksym; +struct xfrm_state * +bpf_xdp_get_xfrm_state(struct xdp_md *ctx, struct bpf_xfrm_state_opts *opts,
u32 opts__sz) __ksym;
+void bpf_xdp_xfrm_state_release(struct xfrm_state *x) __ksym;
struct { __uint(type, BPF_MAP_TYPE_ARRAY); @@ -950,4 +954,51 @@ int xfrm_get_state(struct __sk_buff *skb) return TC_ACT_OK; }
+volatile int xfrm_replay_window = 0;
+SEC("xdp") +int xfrm_get_state_xdp(struct xdp_md *xdp) +{
struct bpf_xfrm_state_opts opts = {};
struct xfrm_state *x = NULL;
struct ip_esp_hdr *esph;
struct bpf_dynptr ptr;
u8 esph_buf[8] = {};
u8 iph_buf[20] = {};
struct iphdr *iph;
u32 off;
if (bpf_dynptr_from_xdp(xdp, 0, &ptr))
goto out;
off = sizeof(struct ethhdr);
iph = bpf_dynptr_slice(&ptr, off, iph_buf, sizeof(iph_buf));
if (!iph || iph->protocol != IPPROTO_ESP)
goto out;
off += sizeof(struct iphdr);
esph = bpf_dynptr_slice(&ptr, off, esph_buf, sizeof(esph_buf));
if (!esph)
goto out;
opts.netns_id = BPF_F_CURRENT_NETNS;
opts.daddr.a4 = iph->daddr;
opts.spi = esph->spi;
opts.proto = IPPROTO_ESP;
opts.family = AF_INET;
x = bpf_xdp_get_xfrm_state(xdp, &opts, sizeof(opts));
if (!x || opts.error)
nit: how can opts.error be non zero if x == NULL?
Eyal.
goto out;
if (!x->replay_esn)
goto out;
xfrm_replay_window = x->replay_esn->replay_window;
+out:
if (x)
bpf_xdp_xfrm_state_release(x);
return XDP_PASS;
+}
char _license[] SEC("license") = "GPL";
2.42.1
On Mon, Dec 11, 2023 at 01:39:25PM -0800, Eyal Birger wrote:
Hi Daniel,
Tiny nits below in case you respin this for other reasons:
On Mon, Dec 11, 2023 at 12:20 PM Daniel Xu dxu@dxuuu.xyz wrote:
This commit extends test_tunnel selftest to test the new XDP xfrm state lookup kfunc.
Co-developed-by: Antony Antony antony.antony@secunet.com Signed-off-by: Antony Antony antony.antony@secunet.com Signed-off-by: Daniel Xu dxu@dxuuu.xyz
.../selftests/bpf/prog_tests/test_tunnel.c | 20 ++++++-- .../selftests/bpf/progs/test_tunnel_kern.c | 51 +++++++++++++++++++ 2 files changed, 67 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/test_tunnel.c b/tools/testing/selftests/bpf/prog_tests/test_tunnel.c index 2d7f8fa82ebd..fc804095d578 100644 --- a/tools/testing/selftests/bpf/prog_tests/test_tunnel.c +++ b/tools/testing/selftests/bpf/prog_tests/test_tunnel.c @@ -278,7 +278,7 @@ static int add_xfrm_tunnel(void) SYS(fail, "ip netns exec at_ns0 " "ip xfrm state add src %s dst %s proto esp "
"spi %d reqid 1 mode tunnel "
"spi %d reqid 1 mode tunnel replay-window 42 " "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", IP4_ADDR_VETH0, IP4_ADDR1_VETH1, XFRM_SPI_IN_TO_OUT, XFRM_AUTH, XFRM_ENC); SYS(fail,
@@ -292,7 +292,7 @@ static int add_xfrm_tunnel(void) SYS(fail, "ip netns exec at_ns0 " "ip xfrm state add src %s dst %s proto esp "
"spi %d reqid 2 mode tunnel "
"spi %d reqid 2 mode tunnel replay-window 42 "
nit: why do you need to set the replay-window in both directions?
No reason - probably just careless here.
"auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", IP4_ADDR1_VETH1, IP4_ADDR_VETH0, XFRM_SPI_OUT_TO_IN, XFRM_AUTH, XFRM_ENC); SYS(fail,
@@ -313,7 +313,7 @@ static int add_xfrm_tunnel(void) */ SYS(fail, "ip xfrm state add src %s dst %s proto esp "
"spi %d reqid 1 mode tunnel "
"spi %d reqid 1 mode tunnel replay-window 42 " "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", IP4_ADDR_VETH0, IP4_ADDR1_VETH1, XFRM_SPI_IN_TO_OUT, XFRM_AUTH, XFRM_ENC); SYS(fail,
@@ -325,7 +325,7 @@ static int add_xfrm_tunnel(void) /* root -> at_ns0 */ SYS(fail, "ip xfrm state add src %s dst %s proto esp "
"spi %d reqid 2 mode tunnel "
"spi %d reqid 2 mode tunnel replay-window 42 " "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", IP4_ADDR1_VETH1, IP4_ADDR_VETH0, XFRM_SPI_OUT_TO_IN, XFRM_AUTH, XFRM_ENC); SYS(fail,
@@ -628,8 +628,10 @@ static void test_xfrm_tunnel(void) { DECLARE_LIBBPF_OPTS(bpf_tc_hook, tc_hook, .attach_point = BPF_TC_INGRESS);
LIBBPF_OPTS(bpf_xdp_attach_opts, opts); struct test_tunnel_kern *skel = NULL; struct nstoken *nstoken;
int xdp_prog_fd; int tc_prog_fd; int ifindex; int err;
@@ -654,6 +656,14 @@ static void test_xfrm_tunnel(void) if (attach_tc_prog(&tc_hook, tc_prog_fd, -1)) goto done;
/* attach xdp prog to tunnel dev */
xdp_prog_fd = bpf_program__fd(skel->progs.xfrm_get_state_xdp);
if (!ASSERT_GE(xdp_prog_fd, 0, "bpf_program__fd"))
goto done;
err = bpf_xdp_attach(ifindex, xdp_prog_fd, XDP_FLAGS_REPLACE, &opts);
if (!ASSERT_OK(err, "bpf_xdp_attach"))
goto done;
/* ping from at_ns0 namespace test */ nstoken = open_netns("at_ns0"); err = test_ping(AF_INET, IP4_ADDR_TUNL_DEV1);
@@ -667,6 +677,8 @@ static void test_xfrm_tunnel(void) goto done; if (!ASSERT_EQ(skel->bss->xfrm_remote_ip, 0xac100164, "remote_ip")) goto done;
if (!ASSERT_EQ(skel->bss->xfrm_replay_window, 42, "replay_window"))
goto done;
done: delete_xfrm_tunnel(); diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c index 3a59eb9c34de..c0dd38616562 100644 --- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c +++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c @@ -30,6 +30,10 @@ int bpf_skb_set_fou_encap(struct __sk_buff *skb_ctx, struct bpf_fou_encap *encap, int type) __ksym; int bpf_skb_get_fou_encap(struct __sk_buff *skb_ctx, struct bpf_fou_encap *encap) __ksym; +struct xfrm_state * +bpf_xdp_get_xfrm_state(struct xdp_md *ctx, struct bpf_xfrm_state_opts *opts,
u32 opts__sz) __ksym;
+void bpf_xdp_xfrm_state_release(struct xfrm_state *x) __ksym;
struct { __uint(type, BPF_MAP_TYPE_ARRAY); @@ -950,4 +954,51 @@ int xfrm_get_state(struct __sk_buff *skb) return TC_ACT_OK; }
+volatile int xfrm_replay_window = 0;
+SEC("xdp") +int xfrm_get_state_xdp(struct xdp_md *xdp) +{
struct bpf_xfrm_state_opts opts = {};
struct xfrm_state *x = NULL;
struct ip_esp_hdr *esph;
struct bpf_dynptr ptr;
u8 esph_buf[8] = {};
u8 iph_buf[20] = {};
struct iphdr *iph;
u32 off;
if (bpf_dynptr_from_xdp(xdp, 0, &ptr))
goto out;
off = sizeof(struct ethhdr);
iph = bpf_dynptr_slice(&ptr, off, iph_buf, sizeof(iph_buf));
if (!iph || iph->protocol != IPPROTO_ESP)
goto out;
off += sizeof(struct iphdr);
esph = bpf_dynptr_slice(&ptr, off, esph_buf, sizeof(esph_buf));
if (!esph)
goto out;
opts.netns_id = BPF_F_CURRENT_NETNS;
opts.daddr.a4 = iph->daddr;
opts.spi = esph->spi;
opts.proto = IPPROTO_ESP;
opts.family = AF_INET;
x = bpf_xdp_get_xfrm_state(xdp, &opts, sizeof(opts));
if (!x || opts.error)
nit: how can opts.error be non zero if x == NULL?
Ignoring the new -ENOENT case, it can't. Which is why I'm testing that behavior here.
[...]
Thanks, Daniel
On Mon, Dec 11, 2023 at 2:31 PM Daniel Xu dxu@dxuuu.xyz wrote:
On Mon, Dec 11, 2023 at 01:39:25PM -0800, Eyal Birger wrote:
Hi Daniel,
Tiny nits below in case you respin this for other reasons:
On Mon, Dec 11, 2023 at 12:20 PM Daniel Xu dxu@dxuuu.xyz wrote:
This commit extends test_tunnel selftest to test the new XDP xfrm state lookup kfunc.
Co-developed-by: Antony Antony antony.antony@secunet.com Signed-off-by: Antony Antony antony.antony@secunet.com Signed-off-by: Daniel Xu dxu@dxuuu.xyz
.../selftests/bpf/prog_tests/test_tunnel.c | 20 ++++++-- .../selftests/bpf/progs/test_tunnel_kern.c | 51 +++++++++++++++++++ 2 files changed, 67 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/test_tunnel.c b/tools/testing/selftests/bpf/prog_tests/test_tunnel.c index 2d7f8fa82ebd..fc804095d578 100644 --- a/tools/testing/selftests/bpf/prog_tests/test_tunnel.c +++ b/tools/testing/selftests/bpf/prog_tests/test_tunnel.c @@ -278,7 +278,7 @@ static int add_xfrm_tunnel(void) SYS(fail, "ip netns exec at_ns0 " "ip xfrm state add src %s dst %s proto esp "
"spi %d reqid 1 mode tunnel "
"spi %d reqid 1 mode tunnel replay-window 42 " "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", IP4_ADDR_VETH0, IP4_ADDR1_VETH1, XFRM_SPI_IN_TO_OUT, XFRM_AUTH, XFRM_ENC); SYS(fail,
@@ -292,7 +292,7 @@ static int add_xfrm_tunnel(void) SYS(fail, "ip netns exec at_ns0 " "ip xfrm state add src %s dst %s proto esp "
"spi %d reqid 2 mode tunnel "
"spi %d reqid 2 mode tunnel replay-window 42 "
nit: why do you need to set the replay-window in both directions?
No reason - probably just careless here.
"auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", IP4_ADDR1_VETH1, IP4_ADDR_VETH0, XFRM_SPI_OUT_TO_IN, XFRM_AUTH, XFRM_ENC); SYS(fail,
@@ -313,7 +313,7 @@ static int add_xfrm_tunnel(void) */ SYS(fail, "ip xfrm state add src %s dst %s proto esp "
"spi %d reqid 1 mode tunnel "
"spi %d reqid 1 mode tunnel replay-window 42 " "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", IP4_ADDR_VETH0, IP4_ADDR1_VETH1, XFRM_SPI_IN_TO_OUT, XFRM_AUTH, XFRM_ENC); SYS(fail,
@@ -325,7 +325,7 @@ static int add_xfrm_tunnel(void) /* root -> at_ns0 */ SYS(fail, "ip xfrm state add src %s dst %s proto esp "
"spi %d reqid 2 mode tunnel "
"spi %d reqid 2 mode tunnel replay-window 42 " "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", IP4_ADDR1_VETH1, IP4_ADDR_VETH0, XFRM_SPI_OUT_TO_IN, XFRM_AUTH, XFRM_ENC); SYS(fail,
@@ -628,8 +628,10 @@ static void test_xfrm_tunnel(void) { DECLARE_LIBBPF_OPTS(bpf_tc_hook, tc_hook, .attach_point = BPF_TC_INGRESS);
LIBBPF_OPTS(bpf_xdp_attach_opts, opts); struct test_tunnel_kern *skel = NULL; struct nstoken *nstoken;
int xdp_prog_fd; int tc_prog_fd; int ifindex; int err;
@@ -654,6 +656,14 @@ static void test_xfrm_tunnel(void) if (attach_tc_prog(&tc_hook, tc_prog_fd, -1)) goto done;
/* attach xdp prog to tunnel dev */
xdp_prog_fd = bpf_program__fd(skel->progs.xfrm_get_state_xdp);
if (!ASSERT_GE(xdp_prog_fd, 0, "bpf_program__fd"))
goto done;
err = bpf_xdp_attach(ifindex, xdp_prog_fd, XDP_FLAGS_REPLACE, &opts);
if (!ASSERT_OK(err, "bpf_xdp_attach"))
goto done;
/* ping from at_ns0 namespace test */ nstoken = open_netns("at_ns0"); err = test_ping(AF_INET, IP4_ADDR_TUNL_DEV1);
@@ -667,6 +677,8 @@ static void test_xfrm_tunnel(void) goto done; if (!ASSERT_EQ(skel->bss->xfrm_remote_ip, 0xac100164, "remote_ip")) goto done;
if (!ASSERT_EQ(skel->bss->xfrm_replay_window, 42, "replay_window"))
goto done;
done: delete_xfrm_tunnel(); diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c index 3a59eb9c34de..c0dd38616562 100644 --- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c +++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c @@ -30,6 +30,10 @@ int bpf_skb_set_fou_encap(struct __sk_buff *skb_ctx, struct bpf_fou_encap *encap, int type) __ksym; int bpf_skb_get_fou_encap(struct __sk_buff *skb_ctx, struct bpf_fou_encap *encap) __ksym; +struct xfrm_state * +bpf_xdp_get_xfrm_state(struct xdp_md *ctx, struct bpf_xfrm_state_opts *opts,
u32 opts__sz) __ksym;
+void bpf_xdp_xfrm_state_release(struct xfrm_state *x) __ksym;
struct { __uint(type, BPF_MAP_TYPE_ARRAY); @@ -950,4 +954,51 @@ int xfrm_get_state(struct __sk_buff *skb) return TC_ACT_OK; }
+volatile int xfrm_replay_window = 0;
+SEC("xdp") +int xfrm_get_state_xdp(struct xdp_md *xdp) +{
struct bpf_xfrm_state_opts opts = {};
struct xfrm_state *x = NULL;
struct ip_esp_hdr *esph;
struct bpf_dynptr ptr;
u8 esph_buf[8] = {};
u8 iph_buf[20] = {};
struct iphdr *iph;
u32 off;
if (bpf_dynptr_from_xdp(xdp, 0, &ptr))
goto out;
off = sizeof(struct ethhdr);
iph = bpf_dynptr_slice(&ptr, off, iph_buf, sizeof(iph_buf));
if (!iph || iph->protocol != IPPROTO_ESP)
goto out;
off += sizeof(struct iphdr);
esph = bpf_dynptr_slice(&ptr, off, esph_buf, sizeof(esph_buf));
if (!esph)
goto out;
opts.netns_id = BPF_F_CURRENT_NETNS;
opts.daddr.a4 = iph->daddr;
opts.spi = esph->spi;
opts.proto = IPPROTO_ESP;
opts.family = AF_INET;
x = bpf_xdp_get_xfrm_state(xdp, &opts, sizeof(opts));
if (!x || opts.error)
nit: how can opts.error be non zero if x == NULL?
Ignoring the new -ENOENT case, it can't. Which is why I'm testing that behavior here.
I'm sorry, I don't understand.
AFAICT, regardless of the -ENOENT change, I don't see how (!x) is false and (opt.error) is true, and so "if (!x || opts.error)" is always equivalent to "if (!x)".
What am I missing? Eyal.
On Mon, Dec 11, 2023 at 03:13:07PM -0800, Eyal Birger wrote:
On Mon, Dec 11, 2023 at 2:31 PM Daniel Xu dxu@dxuuu.xyz wrote:
On Mon, Dec 11, 2023 at 01:39:25PM -0800, Eyal Birger wrote:
Hi Daniel,
Tiny nits below in case you respin this for other reasons:
On Mon, Dec 11, 2023 at 12:20 PM Daniel Xu dxu@dxuuu.xyz wrote:
This commit extends test_tunnel selftest to test the new XDP xfrm state lookup kfunc.
Co-developed-by: Antony Antony antony.antony@secunet.com Signed-off-by: Antony Antony antony.antony@secunet.com Signed-off-by: Daniel Xu dxu@dxuuu.xyz
.../selftests/bpf/prog_tests/test_tunnel.c | 20 ++++++-- .../selftests/bpf/progs/test_tunnel_kern.c | 51 +++++++++++++++++++ 2 files changed, 67 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/test_tunnel.c b/tools/testing/selftests/bpf/prog_tests/test_tunnel.c index 2d7f8fa82ebd..fc804095d578 100644 --- a/tools/testing/selftests/bpf/prog_tests/test_tunnel.c +++ b/tools/testing/selftests/bpf/prog_tests/test_tunnel.c @@ -278,7 +278,7 @@ static int add_xfrm_tunnel(void) SYS(fail, "ip netns exec at_ns0 " "ip xfrm state add src %s dst %s proto esp "
"spi %d reqid 1 mode tunnel "
"spi %d reqid 1 mode tunnel replay-window 42 " "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", IP4_ADDR_VETH0, IP4_ADDR1_VETH1, XFRM_SPI_IN_TO_OUT, XFRM_AUTH, XFRM_ENC); SYS(fail,
@@ -292,7 +292,7 @@ static int add_xfrm_tunnel(void) SYS(fail, "ip netns exec at_ns0 " "ip xfrm state add src %s dst %s proto esp "
"spi %d reqid 2 mode tunnel "
"spi %d reqid 2 mode tunnel replay-window 42 "
nit: why do you need to set the replay-window in both directions?
No reason - probably just careless here.
"auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", IP4_ADDR1_VETH1, IP4_ADDR_VETH0, XFRM_SPI_OUT_TO_IN, XFRM_AUTH, XFRM_ENC); SYS(fail,
@@ -313,7 +313,7 @@ static int add_xfrm_tunnel(void) */ SYS(fail, "ip xfrm state add src %s dst %s proto esp "
"spi %d reqid 1 mode tunnel "
"spi %d reqid 1 mode tunnel replay-window 42 " "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", IP4_ADDR_VETH0, IP4_ADDR1_VETH1, XFRM_SPI_IN_TO_OUT, XFRM_AUTH, XFRM_ENC); SYS(fail,
@@ -325,7 +325,7 @@ static int add_xfrm_tunnel(void) /* root -> at_ns0 */ SYS(fail, "ip xfrm state add src %s dst %s proto esp "
"spi %d reqid 2 mode tunnel "
"spi %d reqid 2 mode tunnel replay-window 42 " "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", IP4_ADDR1_VETH1, IP4_ADDR_VETH0, XFRM_SPI_OUT_TO_IN, XFRM_AUTH, XFRM_ENC); SYS(fail,
@@ -628,8 +628,10 @@ static void test_xfrm_tunnel(void) { DECLARE_LIBBPF_OPTS(bpf_tc_hook, tc_hook, .attach_point = BPF_TC_INGRESS);
LIBBPF_OPTS(bpf_xdp_attach_opts, opts); struct test_tunnel_kern *skel = NULL; struct nstoken *nstoken;
int xdp_prog_fd; int tc_prog_fd; int ifindex; int err;
@@ -654,6 +656,14 @@ static void test_xfrm_tunnel(void) if (attach_tc_prog(&tc_hook, tc_prog_fd, -1)) goto done;
/* attach xdp prog to tunnel dev */
xdp_prog_fd = bpf_program__fd(skel->progs.xfrm_get_state_xdp);
if (!ASSERT_GE(xdp_prog_fd, 0, "bpf_program__fd"))
goto done;
err = bpf_xdp_attach(ifindex, xdp_prog_fd, XDP_FLAGS_REPLACE, &opts);
if (!ASSERT_OK(err, "bpf_xdp_attach"))
goto done;
/* ping from at_ns0 namespace test */ nstoken = open_netns("at_ns0"); err = test_ping(AF_INET, IP4_ADDR_TUNL_DEV1);
@@ -667,6 +677,8 @@ static void test_xfrm_tunnel(void) goto done; if (!ASSERT_EQ(skel->bss->xfrm_remote_ip, 0xac100164, "remote_ip")) goto done;
if (!ASSERT_EQ(skel->bss->xfrm_replay_window, 42, "replay_window"))
goto done;
done: delete_xfrm_tunnel(); diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c index 3a59eb9c34de..c0dd38616562 100644 --- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c +++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c @@ -30,6 +30,10 @@ int bpf_skb_set_fou_encap(struct __sk_buff *skb_ctx, struct bpf_fou_encap *encap, int type) __ksym; int bpf_skb_get_fou_encap(struct __sk_buff *skb_ctx, struct bpf_fou_encap *encap) __ksym; +struct xfrm_state * +bpf_xdp_get_xfrm_state(struct xdp_md *ctx, struct bpf_xfrm_state_opts *opts,
u32 opts__sz) __ksym;
+void bpf_xdp_xfrm_state_release(struct xfrm_state *x) __ksym;
struct { __uint(type, BPF_MAP_TYPE_ARRAY); @@ -950,4 +954,51 @@ int xfrm_get_state(struct __sk_buff *skb) return TC_ACT_OK; }
+volatile int xfrm_replay_window = 0;
+SEC("xdp") +int xfrm_get_state_xdp(struct xdp_md *xdp) +{
struct bpf_xfrm_state_opts opts = {};
struct xfrm_state *x = NULL;
struct ip_esp_hdr *esph;
struct bpf_dynptr ptr;
u8 esph_buf[8] = {};
u8 iph_buf[20] = {};
struct iphdr *iph;
u32 off;
if (bpf_dynptr_from_xdp(xdp, 0, &ptr))
goto out;
off = sizeof(struct ethhdr);
iph = bpf_dynptr_slice(&ptr, off, iph_buf, sizeof(iph_buf));
if (!iph || iph->protocol != IPPROTO_ESP)
goto out;
off += sizeof(struct iphdr);
esph = bpf_dynptr_slice(&ptr, off, esph_buf, sizeof(esph_buf));
if (!esph)
goto out;
opts.netns_id = BPF_F_CURRENT_NETNS;
opts.daddr.a4 = iph->daddr;
opts.spi = esph->spi;
opts.proto = IPPROTO_ESP;
opts.family = AF_INET;
x = bpf_xdp_get_xfrm_state(xdp, &opts, sizeof(opts));
if (!x || opts.error)
nit: how can opts.error be non zero if x == NULL?
Ignoring the new -ENOENT case, it can't. Which is why I'm testing that behavior here.
I'm sorry, I don't understand.
AFAICT, regardless of the -ENOENT change, I don't see how (!x) is false and (opt.error) is true, and so "if (!x || opts.error)" is always equivalent to "if (!x)".
What am I missing? Eyal.
The selftests are tests so my intention was to check edge cases here. In normal operation it shouldn't be possible that bpf_xdp_get_xfrm_state() returns non-NULL and also an error. Maybe another way of writing this would be:
if (!x) goto out; assert(opts.error == 0);
If I'm trying to be too clever (or maybe just wrong) or it's pointless, I can remove the `opts.error` condition.
Thanks, Daniel
On Mon, Dec 11, 2023 at 3:49 PM Daniel Xu dxu@dxuuu.xyz wrote:
On Mon, Dec 11, 2023 at 03:13:07PM -0800, Eyal Birger wrote:
On Mon, Dec 11, 2023 at 2:31 PM Daniel Xu dxu@dxuuu.xyz wrote:
On Mon, Dec 11, 2023 at 01:39:25PM -0800, Eyal Birger wrote:
Hi Daniel,
Tiny nits below in case you respin this for other reasons:
On Mon, Dec 11, 2023 at 12:20 PM Daniel Xu dxu@dxuuu.xyz wrote:
This commit extends test_tunnel selftest to test the new XDP xfrm state lookup kfunc.
Co-developed-by: Antony Antony antony.antony@secunet.com Signed-off-by: Antony Antony antony.antony@secunet.com Signed-off-by: Daniel Xu dxu@dxuuu.xyz
.../selftests/bpf/prog_tests/test_tunnel.c | 20 ++++++-- .../selftests/bpf/progs/test_tunnel_kern.c | 51 +++++++++++++++++++ 2 files changed, 67 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/test_tunnel.c b/tools/testing/selftests/bpf/prog_tests/test_tunnel.c index 2d7f8fa82ebd..fc804095d578 100644 --- a/tools/testing/selftests/bpf/prog_tests/test_tunnel.c +++ b/tools/testing/selftests/bpf/prog_tests/test_tunnel.c @@ -278,7 +278,7 @@ static int add_xfrm_tunnel(void) SYS(fail, "ip netns exec at_ns0 " "ip xfrm state add src %s dst %s proto esp "
"spi %d reqid 1 mode tunnel "
"spi %d reqid 1 mode tunnel replay-window 42 " "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", IP4_ADDR_VETH0, IP4_ADDR1_VETH1, XFRM_SPI_IN_TO_OUT, XFRM_AUTH, XFRM_ENC); SYS(fail,
@@ -292,7 +292,7 @@ static int add_xfrm_tunnel(void) SYS(fail, "ip netns exec at_ns0 " "ip xfrm state add src %s dst %s proto esp "
"spi %d reqid 2 mode tunnel "
"spi %d reqid 2 mode tunnel replay-window 42 "
nit: why do you need to set the replay-window in both directions?
No reason - probably just careless here.
"auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", IP4_ADDR1_VETH1, IP4_ADDR_VETH0, XFRM_SPI_OUT_TO_IN, XFRM_AUTH, XFRM_ENC); SYS(fail,
@@ -313,7 +313,7 @@ static int add_xfrm_tunnel(void) */ SYS(fail, "ip xfrm state add src %s dst %s proto esp "
"spi %d reqid 1 mode tunnel "
"spi %d reqid 1 mode tunnel replay-window 42 " "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", IP4_ADDR_VETH0, IP4_ADDR1_VETH1, XFRM_SPI_IN_TO_OUT, XFRM_AUTH, XFRM_ENC); SYS(fail,
@@ -325,7 +325,7 @@ static int add_xfrm_tunnel(void) /* root -> at_ns0 */ SYS(fail, "ip xfrm state add src %s dst %s proto esp "
"spi %d reqid 2 mode tunnel "
"spi %d reqid 2 mode tunnel replay-window 42 " "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", IP4_ADDR1_VETH1, IP4_ADDR_VETH0, XFRM_SPI_OUT_TO_IN, XFRM_AUTH, XFRM_ENC); SYS(fail,
@@ -628,8 +628,10 @@ static void test_xfrm_tunnel(void) { DECLARE_LIBBPF_OPTS(bpf_tc_hook, tc_hook, .attach_point = BPF_TC_INGRESS);
LIBBPF_OPTS(bpf_xdp_attach_opts, opts); struct test_tunnel_kern *skel = NULL; struct nstoken *nstoken;
int xdp_prog_fd; int tc_prog_fd; int ifindex; int err;
@@ -654,6 +656,14 @@ static void test_xfrm_tunnel(void) if (attach_tc_prog(&tc_hook, tc_prog_fd, -1)) goto done;
/* attach xdp prog to tunnel dev */
xdp_prog_fd = bpf_program__fd(skel->progs.xfrm_get_state_xdp);
if (!ASSERT_GE(xdp_prog_fd, 0, "bpf_program__fd"))
goto done;
err = bpf_xdp_attach(ifindex, xdp_prog_fd, XDP_FLAGS_REPLACE, &opts);
if (!ASSERT_OK(err, "bpf_xdp_attach"))
goto done;
/* ping from at_ns0 namespace test */ nstoken = open_netns("at_ns0"); err = test_ping(AF_INET, IP4_ADDR_TUNL_DEV1);
@@ -667,6 +677,8 @@ static void test_xfrm_tunnel(void) goto done; if (!ASSERT_EQ(skel->bss->xfrm_remote_ip, 0xac100164, "remote_ip")) goto done;
if (!ASSERT_EQ(skel->bss->xfrm_replay_window, 42, "replay_window"))
goto done;
done: delete_xfrm_tunnel(); diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c index 3a59eb9c34de..c0dd38616562 100644 --- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c +++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c @@ -30,6 +30,10 @@ int bpf_skb_set_fou_encap(struct __sk_buff *skb_ctx, struct bpf_fou_encap *encap, int type) __ksym; int bpf_skb_get_fou_encap(struct __sk_buff *skb_ctx, struct bpf_fou_encap *encap) __ksym; +struct xfrm_state * +bpf_xdp_get_xfrm_state(struct xdp_md *ctx, struct bpf_xfrm_state_opts *opts,
u32 opts__sz) __ksym;
+void bpf_xdp_xfrm_state_release(struct xfrm_state *x) __ksym;
struct { __uint(type, BPF_MAP_TYPE_ARRAY); @@ -950,4 +954,51 @@ int xfrm_get_state(struct __sk_buff *skb) return TC_ACT_OK; }
+volatile int xfrm_replay_window = 0;
+SEC("xdp") +int xfrm_get_state_xdp(struct xdp_md *xdp) +{
struct bpf_xfrm_state_opts opts = {};
struct xfrm_state *x = NULL;
struct ip_esp_hdr *esph;
struct bpf_dynptr ptr;
u8 esph_buf[8] = {};
u8 iph_buf[20] = {};
struct iphdr *iph;
u32 off;
if (bpf_dynptr_from_xdp(xdp, 0, &ptr))
goto out;
off = sizeof(struct ethhdr);
iph = bpf_dynptr_slice(&ptr, off, iph_buf, sizeof(iph_buf));
if (!iph || iph->protocol != IPPROTO_ESP)
goto out;
off += sizeof(struct iphdr);
esph = bpf_dynptr_slice(&ptr, off, esph_buf, sizeof(esph_buf));
if (!esph)
goto out;
opts.netns_id = BPF_F_CURRENT_NETNS;
opts.daddr.a4 = iph->daddr;
opts.spi = esph->spi;
opts.proto = IPPROTO_ESP;
opts.family = AF_INET;
x = bpf_xdp_get_xfrm_state(xdp, &opts, sizeof(opts));
if (!x || opts.error)
nit: how can opts.error be non zero if x == NULL?
Ignoring the new -ENOENT case, it can't. Which is why I'm testing that behavior here.
I'm sorry, I don't understand.
AFAICT, regardless of the -ENOENT change, I don't see how (!x) is false and (opt.error) is true, and so "if (!x || opts.error)" is always equivalent to "if (!x)".
What am I missing? Eyal.
The selftests are tests so my intention was to check edge cases here. In normal operation it shouldn't be possible that bpf_xdp_get_xfrm_state() returns non-NULL and also an error. Maybe another way of writing this would be:
if (!x) goto out; assert(opts.error == 0);
I think this would convey the "edge case testing" notion better.
If I'm trying to be too clever (or maybe just wrong) or it's pointless, I can remove the `opts.error` condition.
At least for me the tests also serve as references as to how the API is expected to be used, so I think it'd be clearer without signaling that opts.error could potentially be nonzero on success.
An assertion would indeed make that clear.
Thanks for the explanation, Eyal.
On Mon, Dec 11, 2023 at 04:25:06PM -0800, Eyal Birger wrote:
On Mon, Dec 11, 2023 at 3:49 PM Daniel Xu dxu@dxuuu.xyz wrote:
On Mon, Dec 11, 2023 at 03:13:07PM -0800, Eyal Birger wrote:
On Mon, Dec 11, 2023 at 2:31 PM Daniel Xu dxu@dxuuu.xyz wrote:
On Mon, Dec 11, 2023 at 01:39:25PM -0800, Eyal Birger wrote:
Hi Daniel,
Tiny nits below in case you respin this for other reasons:
On Mon, Dec 11, 2023 at 12:20 PM Daniel Xu dxu@dxuuu.xyz wrote:
This commit extends test_tunnel selftest to test the new XDP xfrm state lookup kfunc.
Co-developed-by: Antony Antony antony.antony@secunet.com Signed-off-by: Antony Antony antony.antony@secunet.com Signed-off-by: Daniel Xu dxu@dxuuu.xyz
.../selftests/bpf/prog_tests/test_tunnel.c | 20 ++++++-- .../selftests/bpf/progs/test_tunnel_kern.c | 51 +++++++++++++++++++ 2 files changed, 67 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/test_tunnel.c b/tools/testing/selftests/bpf/prog_tests/test_tunnel.c index 2d7f8fa82ebd..fc804095d578 100644 --- a/tools/testing/selftests/bpf/prog_tests/test_tunnel.c +++ b/tools/testing/selftests/bpf/prog_tests/test_tunnel.c @@ -278,7 +278,7 @@ static int add_xfrm_tunnel(void) SYS(fail, "ip netns exec at_ns0 " "ip xfrm state add src %s dst %s proto esp "
"spi %d reqid 1 mode tunnel "
"spi %d reqid 1 mode tunnel replay-window 42 " "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", IP4_ADDR_VETH0, IP4_ADDR1_VETH1, XFRM_SPI_IN_TO_OUT, XFRM_AUTH, XFRM_ENC); SYS(fail,
@@ -292,7 +292,7 @@ static int add_xfrm_tunnel(void) SYS(fail, "ip netns exec at_ns0 " "ip xfrm state add src %s dst %s proto esp "
"spi %d reqid 2 mode tunnel "
"spi %d reqid 2 mode tunnel replay-window 42 "
nit: why do you need to set the replay-window in both directions?
No reason - probably just careless here.
"auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", IP4_ADDR1_VETH1, IP4_ADDR_VETH0, XFRM_SPI_OUT_TO_IN, XFRM_AUTH, XFRM_ENC); SYS(fail,
@@ -313,7 +313,7 @@ static int add_xfrm_tunnel(void) */ SYS(fail, "ip xfrm state add src %s dst %s proto esp "
"spi %d reqid 1 mode tunnel "
"spi %d reqid 1 mode tunnel replay-window 42 " "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", IP4_ADDR_VETH0, IP4_ADDR1_VETH1, XFRM_SPI_IN_TO_OUT, XFRM_AUTH, XFRM_ENC); SYS(fail,
@@ -325,7 +325,7 @@ static int add_xfrm_tunnel(void) /* root -> at_ns0 */ SYS(fail, "ip xfrm state add src %s dst %s proto esp "
"spi %d reqid 2 mode tunnel "
"spi %d reqid 2 mode tunnel replay-window 42 " "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", IP4_ADDR1_VETH1, IP4_ADDR_VETH0, XFRM_SPI_OUT_TO_IN, XFRM_AUTH, XFRM_ENC); SYS(fail,
@@ -628,8 +628,10 @@ static void test_xfrm_tunnel(void) { DECLARE_LIBBPF_OPTS(bpf_tc_hook, tc_hook, .attach_point = BPF_TC_INGRESS);
LIBBPF_OPTS(bpf_xdp_attach_opts, opts); struct test_tunnel_kern *skel = NULL; struct nstoken *nstoken;
int xdp_prog_fd; int tc_prog_fd; int ifindex; int err;
@@ -654,6 +656,14 @@ static void test_xfrm_tunnel(void) if (attach_tc_prog(&tc_hook, tc_prog_fd, -1)) goto done;
/* attach xdp prog to tunnel dev */
xdp_prog_fd = bpf_program__fd(skel->progs.xfrm_get_state_xdp);
if (!ASSERT_GE(xdp_prog_fd, 0, "bpf_program__fd"))
goto done;
err = bpf_xdp_attach(ifindex, xdp_prog_fd, XDP_FLAGS_REPLACE, &opts);
if (!ASSERT_OK(err, "bpf_xdp_attach"))
goto done;
/* ping from at_ns0 namespace test */ nstoken = open_netns("at_ns0"); err = test_ping(AF_INET, IP4_ADDR_TUNL_DEV1);
@@ -667,6 +677,8 @@ static void test_xfrm_tunnel(void) goto done; if (!ASSERT_EQ(skel->bss->xfrm_remote_ip, 0xac100164, "remote_ip")) goto done;
if (!ASSERT_EQ(skel->bss->xfrm_replay_window, 42, "replay_window"))
goto done;
done: delete_xfrm_tunnel(); diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c index 3a59eb9c34de..c0dd38616562 100644 --- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c +++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c @@ -30,6 +30,10 @@ int bpf_skb_set_fou_encap(struct __sk_buff *skb_ctx, struct bpf_fou_encap *encap, int type) __ksym; int bpf_skb_get_fou_encap(struct __sk_buff *skb_ctx, struct bpf_fou_encap *encap) __ksym; +struct xfrm_state * +bpf_xdp_get_xfrm_state(struct xdp_md *ctx, struct bpf_xfrm_state_opts *opts,
u32 opts__sz) __ksym;
+void bpf_xdp_xfrm_state_release(struct xfrm_state *x) __ksym;
struct { __uint(type, BPF_MAP_TYPE_ARRAY); @@ -950,4 +954,51 @@ int xfrm_get_state(struct __sk_buff *skb) return TC_ACT_OK; }
+volatile int xfrm_replay_window = 0;
+SEC("xdp") +int xfrm_get_state_xdp(struct xdp_md *xdp) +{
struct bpf_xfrm_state_opts opts = {};
struct xfrm_state *x = NULL;
struct ip_esp_hdr *esph;
struct bpf_dynptr ptr;
u8 esph_buf[8] = {};
u8 iph_buf[20] = {};
struct iphdr *iph;
u32 off;
if (bpf_dynptr_from_xdp(xdp, 0, &ptr))
goto out;
off = sizeof(struct ethhdr);
iph = bpf_dynptr_slice(&ptr, off, iph_buf, sizeof(iph_buf));
if (!iph || iph->protocol != IPPROTO_ESP)
goto out;
off += sizeof(struct iphdr);
esph = bpf_dynptr_slice(&ptr, off, esph_buf, sizeof(esph_buf));
if (!esph)
goto out;
opts.netns_id = BPF_F_CURRENT_NETNS;
opts.daddr.a4 = iph->daddr;
opts.spi = esph->spi;
opts.proto = IPPROTO_ESP;
opts.family = AF_INET;
x = bpf_xdp_get_xfrm_state(xdp, &opts, sizeof(opts));
if (!x || opts.error)
nit: how can opts.error be non zero if x == NULL?
Ignoring the new -ENOENT case, it can't. Which is why I'm testing that behavior here.
I'm sorry, I don't understand.
AFAICT, regardless of the -ENOENT change, I don't see how (!x) is false and (opt.error) is true, and so "if (!x || opts.error)" is always equivalent to "if (!x)".
What am I missing? Eyal.
The selftests are tests so my intention was to check edge cases here. In normal operation it shouldn't be possible that bpf_xdp_get_xfrm_state() returns non-NULL and also an error. Maybe another way of writing this would be:
if (!x) goto out; assert(opts.error == 0);
I think this would convey the "edge case testing" notion better.
If I'm trying to be too clever (or maybe just wrong) or it's pointless, I can remove the `opts.error` condition.
At least for me the tests also serve as references as to how the API is expected to be used, so I think it'd be clearer without signaling that opts.error could potentially be nonzero on success.
An assertion would indeed make that clear.
Sure, sounds good. I will check on the new bpf assert infra.
Thanks for the explanation, Eyal.
Np!
If you don't mind (and there no more comments), I would prefer to send a follow up fixing the nits in this revision. So that I stop blasting the list (as well as people who may not be as concerned with these details).
Thanks, Daniel
On Tue, Dec 12, 2023 at 8:17 AM Daniel Xu dxu@dxuuu.xyz wrote:
If you don't mind (and there no more comments), I would prefer to send a follow up fixing the nits in this revision. So that I stop blasting the list (as well as people who may not be as concerned with these details).
Resending patches is little effort while follow up patches double the commits, more code churn, increase in code reviews, etc. Always address feedback by resending.
On Tue, Dec 12, 2023 at 08:44:42AM -0800, Alexei Starovoitov wrote:
On Tue, Dec 12, 2023 at 8:17 AM Daniel Xu dxu@dxuuu.xyz wrote:
If you don't mind (and there no more comments), I would prefer to send a follow up fixing the nits in this revision. So that I stop blasting the list (as well as people who may not be as concerned with these details).
Resending patches is little effort while follow up patches double the commits, more code churn, increase in code reviews, etc. Always address feedback by resending.
Got it; will keep that in mind.
Thanks, Daniel
cc Kumar
On Tue, Dec 12, 2023 at 09:17:02AM -0700, Daniel Xu wrote:
On Mon, Dec 11, 2023 at 04:25:06PM -0800, Eyal Birger wrote:
On Mon, Dec 11, 2023 at 3:49 PM Daniel Xu dxu@dxuuu.xyz wrote:
On Mon, Dec 11, 2023 at 03:13:07PM -0800, Eyal Birger wrote:
On Mon, Dec 11, 2023 at 2:31 PM Daniel Xu dxu@dxuuu.xyz wrote:
On Mon, Dec 11, 2023 at 01:39:25PM -0800, Eyal Birger wrote:
Hi Daniel,
Tiny nits below in case you respin this for other reasons:
On Mon, Dec 11, 2023 at 12:20 PM Daniel Xu dxu@dxuuu.xyz wrote: > > This commit extends test_tunnel selftest to test the new XDP xfrm state > lookup kfunc. > > Co-developed-by: Antony Antony antony.antony@secunet.com > Signed-off-by: Antony Antony antony.antony@secunet.com > Signed-off-by: Daniel Xu dxu@dxuuu.xyz > --- > .../selftests/bpf/prog_tests/test_tunnel.c | 20 ++++++-- > .../selftests/bpf/progs/test_tunnel_kern.c | 51 +++++++++++++++++++ > 2 files changed, 67 insertions(+), 4 deletions(-) > > diff --git a/tools/testing/selftests/bpf/prog_tests/test_tunnel.c b/tools/testing/selftests/bpf/prog_tests/test_tunnel.c > index 2d7f8fa82ebd..fc804095d578 100644 > --- a/tools/testing/selftests/bpf/prog_tests/test_tunnel.c > +++ b/tools/testing/selftests/bpf/prog_tests/test_tunnel.c > @@ -278,7 +278,7 @@ static int add_xfrm_tunnel(void) > SYS(fail, > "ip netns exec at_ns0 " > "ip xfrm state add src %s dst %s proto esp " > - "spi %d reqid 1 mode tunnel " > + "spi %d reqid 1 mode tunnel replay-window 42 " > "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", > IP4_ADDR_VETH0, IP4_ADDR1_VETH1, XFRM_SPI_IN_TO_OUT, XFRM_AUTH, XFRM_ENC); > SYS(fail, > @@ -292,7 +292,7 @@ static int add_xfrm_tunnel(void) > SYS(fail, > "ip netns exec at_ns0 " > "ip xfrm state add src %s dst %s proto esp " > - "spi %d reqid 2 mode tunnel " > + "spi %d reqid 2 mode tunnel replay-window 42 "
nit: why do you need to set the replay-window in both directions?
No reason - probably just careless here.
> "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", > IP4_ADDR1_VETH1, IP4_ADDR_VETH0, XFRM_SPI_OUT_TO_IN, XFRM_AUTH, XFRM_ENC); > SYS(fail, > @@ -313,7 +313,7 @@ static int add_xfrm_tunnel(void) > */ > SYS(fail, > "ip xfrm state add src %s dst %s proto esp " > - "spi %d reqid 1 mode tunnel " > + "spi %d reqid 1 mode tunnel replay-window 42 " > "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", > IP4_ADDR_VETH0, IP4_ADDR1_VETH1, XFRM_SPI_IN_TO_OUT, XFRM_AUTH, XFRM_ENC); > SYS(fail, > @@ -325,7 +325,7 @@ static int add_xfrm_tunnel(void) > /* root -> at_ns0 */ > SYS(fail, > "ip xfrm state add src %s dst %s proto esp " > - "spi %d reqid 2 mode tunnel " > + "spi %d reqid 2 mode tunnel replay-window 42 " > "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", > IP4_ADDR1_VETH1, IP4_ADDR_VETH0, XFRM_SPI_OUT_TO_IN, XFRM_AUTH, XFRM_ENC); > SYS(fail, > @@ -628,8 +628,10 @@ static void test_xfrm_tunnel(void) > { > DECLARE_LIBBPF_OPTS(bpf_tc_hook, tc_hook, > .attach_point = BPF_TC_INGRESS); > + LIBBPF_OPTS(bpf_xdp_attach_opts, opts); > struct test_tunnel_kern *skel = NULL; > struct nstoken *nstoken; > + int xdp_prog_fd; > int tc_prog_fd; > int ifindex; > int err; > @@ -654,6 +656,14 @@ static void test_xfrm_tunnel(void) > if (attach_tc_prog(&tc_hook, tc_prog_fd, -1)) > goto done; > > + /* attach xdp prog to tunnel dev */ > + xdp_prog_fd = bpf_program__fd(skel->progs.xfrm_get_state_xdp); > + if (!ASSERT_GE(xdp_prog_fd, 0, "bpf_program__fd")) > + goto done; > + err = bpf_xdp_attach(ifindex, xdp_prog_fd, XDP_FLAGS_REPLACE, &opts); > + if (!ASSERT_OK(err, "bpf_xdp_attach")) > + goto done; > + > /* ping from at_ns0 namespace test */ > nstoken = open_netns("at_ns0"); > err = test_ping(AF_INET, IP4_ADDR_TUNL_DEV1); > @@ -667,6 +677,8 @@ static void test_xfrm_tunnel(void) > goto done; > if (!ASSERT_EQ(skel->bss->xfrm_remote_ip, 0xac100164, "remote_ip")) > goto done; > + if (!ASSERT_EQ(skel->bss->xfrm_replay_window, 42, "replay_window")) > + goto done; > > done: > delete_xfrm_tunnel(); > diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c > index 3a59eb9c34de..c0dd38616562 100644 > --- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c > +++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c > @@ -30,6 +30,10 @@ int bpf_skb_set_fou_encap(struct __sk_buff *skb_ctx, > struct bpf_fou_encap *encap, int type) __ksym; > int bpf_skb_get_fou_encap(struct __sk_buff *skb_ctx, > struct bpf_fou_encap *encap) __ksym; > +struct xfrm_state * > +bpf_xdp_get_xfrm_state(struct xdp_md *ctx, struct bpf_xfrm_state_opts *opts, > + u32 opts__sz) __ksym; > +void bpf_xdp_xfrm_state_release(struct xfrm_state *x) __ksym; > > struct { > __uint(type, BPF_MAP_TYPE_ARRAY); > @@ -950,4 +954,51 @@ int xfrm_get_state(struct __sk_buff *skb) > return TC_ACT_OK; > } > > +volatile int xfrm_replay_window = 0; > + > +SEC("xdp") > +int xfrm_get_state_xdp(struct xdp_md *xdp) > +{ > + struct bpf_xfrm_state_opts opts = {}; > + struct xfrm_state *x = NULL; > + struct ip_esp_hdr *esph; > + struct bpf_dynptr ptr; > + u8 esph_buf[8] = {}; > + u8 iph_buf[20] = {}; > + struct iphdr *iph; > + u32 off; > + > + if (bpf_dynptr_from_xdp(xdp, 0, &ptr)) > + goto out; > + > + off = sizeof(struct ethhdr); > + iph = bpf_dynptr_slice(&ptr, off, iph_buf, sizeof(iph_buf)); > + if (!iph || iph->protocol != IPPROTO_ESP) > + goto out; > + > + off += sizeof(struct iphdr); > + esph = bpf_dynptr_slice(&ptr, off, esph_buf, sizeof(esph_buf)); > + if (!esph) > + goto out; > + > + opts.netns_id = BPF_F_CURRENT_NETNS; > + opts.daddr.a4 = iph->daddr; > + opts.spi = esph->spi; > + opts.proto = IPPROTO_ESP; > + opts.family = AF_INET; > + > + x = bpf_xdp_get_xfrm_state(xdp, &opts, sizeof(opts)); > + if (!x || opts.error)
nit: how can opts.error be non zero if x == NULL?
Ignoring the new -ENOENT case, it can't. Which is why I'm testing that behavior here.
I'm sorry, I don't understand.
AFAICT, regardless of the -ENOENT change, I don't see how (!x) is false and (opt.error) is true, and so "if (!x || opts.error)" is always equivalent to "if (!x)".
What am I missing? Eyal.
The selftests are tests so my intention was to check edge cases here. In normal operation it shouldn't be possible that bpf_xdp_get_xfrm_state() returns non-NULL and also an error. Maybe another way of writing this would be:
if (!x) goto out; assert(opts.error == 0);
I think this would convey the "edge case testing" notion better.
If I'm trying to be too clever (or maybe just wrong) or it's pointless, I can remove the `opts.error` condition.
At least for me the tests also serve as references as to how the API is expected to be used, so I think it'd be clearer without signaling that opts.error could potentially be nonzero on success.
An assertion would indeed make that clear.
Sure, sounds good. I will check on the new bpf assert infra.
Couldn't quite get bpf_assert() working. The following diff:
diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c index c0dd38616562..f00dba85ac5d 100644 --- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c +++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c @@ -8,8 +8,9 @@ */ #include "vmlinux.h" #include <bpf/bpf_core_read.h> -#include <bpf/bpf_helpers.h> #include <bpf/bpf_endian.h> +#include <bpf/bpf_helpers.h> +#include "bpf_experimental.h" #include "bpf_kfuncs.h" #include "bpf_tracing_net.h"
@@ -988,8 +989,9 @@ int xfrm_get_state_xdp(struct xdp_md *xdp) opts.family = AF_INET;
x = bpf_xdp_get_xfrm_state(xdp, &opts, sizeof(opts)); - if (!x || opts.error) + if (!x) goto out; + bpf_assert_with(opts.error == 0, XDP_PASS);
if (!x->replay_esn) goto out;
results in:
57: (b7) r1 = 2 ; R1_w=2 refs=5 58: (85) call bpf_throw#115436 calling kernel function bpf_throw is not allowed
It looks like the above error comes from verifier.c:fetch_kfunc_meta, but I can run the exceptions selftests just fine with the same bzImage. So I'm thinking it's not a kfunc registration or BTF issue.
Maybe it's cuz I'm holding onto KFUNC_ACQUIRE'd `x`? Not sure.
So for now I think I'll drop checking opts.error.
[...]
On Tue, 12 Dec 2023 at 20:52, Daniel Xu dxu@dxuuu.xyz wrote:
cc Kumar
On Tue, Dec 12, 2023 at 09:17:02AM -0700, Daniel Xu wrote:
On Mon, Dec 11, 2023 at 04:25:06PM -0800, Eyal Birger wrote:
On Mon, Dec 11, 2023 at 3:49 PM Daniel Xu dxu@dxuuu.xyz wrote:
On Mon, Dec 11, 2023 at 03:13:07PM -0800, Eyal Birger wrote:
On Mon, Dec 11, 2023 at 2:31 PM Daniel Xu dxu@dxuuu.xyz wrote:
On Mon, Dec 11, 2023 at 01:39:25PM -0800, Eyal Birger wrote: > Hi Daniel, > > Tiny nits below in case you respin this for other reasons: > > On Mon, Dec 11, 2023 at 12:20 PM Daniel Xu dxu@dxuuu.xyz wrote: > > > > This commit extends test_tunnel selftest to test the new XDP xfrm state > > lookup kfunc. > > > > Co-developed-by: Antony Antony antony.antony@secunet.com > > Signed-off-by: Antony Antony antony.antony@secunet.com > > Signed-off-by: Daniel Xu dxu@dxuuu.xyz > > --- > > .../selftests/bpf/prog_tests/test_tunnel.c | 20 ++++++-- > > .../selftests/bpf/progs/test_tunnel_kern.c | 51 +++++++++++++++++++ > > 2 files changed, 67 insertions(+), 4 deletions(-) > > > > diff --git a/tools/testing/selftests/bpf/prog_tests/test_tunnel.c b/tools/testing/selftests/bpf/prog_tests/test_tunnel.c > > index 2d7f8fa82ebd..fc804095d578 100644 > > --- a/tools/testing/selftests/bpf/prog_tests/test_tunnel.c > > +++ b/tools/testing/selftests/bpf/prog_tests/test_tunnel.c > > @@ -278,7 +278,7 @@ static int add_xfrm_tunnel(void) > > SYS(fail, > > "ip netns exec at_ns0 " > > "ip xfrm state add src %s dst %s proto esp " > > - "spi %d reqid 1 mode tunnel " > > + "spi %d reqid 1 mode tunnel replay-window 42 " > > "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", > > IP4_ADDR_VETH0, IP4_ADDR1_VETH1, XFRM_SPI_IN_TO_OUT, XFRM_AUTH, XFRM_ENC); > > SYS(fail, > > @@ -292,7 +292,7 @@ static int add_xfrm_tunnel(void) > > SYS(fail, > > "ip netns exec at_ns0 " > > "ip xfrm state add src %s dst %s proto esp " > > - "spi %d reqid 2 mode tunnel " > > + "spi %d reqid 2 mode tunnel replay-window 42 " > > nit: why do you need to set the replay-window in both directions?
No reason - probably just careless here.
> > > "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", > > IP4_ADDR1_VETH1, IP4_ADDR_VETH0, XFRM_SPI_OUT_TO_IN, XFRM_AUTH, XFRM_ENC); > > SYS(fail, > > @@ -313,7 +313,7 @@ static int add_xfrm_tunnel(void) > > */ > > SYS(fail, > > "ip xfrm state add src %s dst %s proto esp " > > - "spi %d reqid 1 mode tunnel " > > + "spi %d reqid 1 mode tunnel replay-window 42 " > > "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", > > IP4_ADDR_VETH0, IP4_ADDR1_VETH1, XFRM_SPI_IN_TO_OUT, XFRM_AUTH, XFRM_ENC); > > SYS(fail, > > @@ -325,7 +325,7 @@ static int add_xfrm_tunnel(void) > > /* root -> at_ns0 */ > > SYS(fail, > > "ip xfrm state add src %s dst %s proto esp " > > - "spi %d reqid 2 mode tunnel " > > + "spi %d reqid 2 mode tunnel replay-window 42 " > > "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", > > IP4_ADDR1_VETH1, IP4_ADDR_VETH0, XFRM_SPI_OUT_TO_IN, XFRM_AUTH, XFRM_ENC); > > SYS(fail, > > @@ -628,8 +628,10 @@ static void test_xfrm_tunnel(void) > > { > > DECLARE_LIBBPF_OPTS(bpf_tc_hook, tc_hook, > > .attach_point = BPF_TC_INGRESS); > > + LIBBPF_OPTS(bpf_xdp_attach_opts, opts); > > struct test_tunnel_kern *skel = NULL; > > struct nstoken *nstoken; > > + int xdp_prog_fd; > > int tc_prog_fd; > > int ifindex; > > int err; > > @@ -654,6 +656,14 @@ static void test_xfrm_tunnel(void) > > if (attach_tc_prog(&tc_hook, tc_prog_fd, -1)) > > goto done; > > > > + /* attach xdp prog to tunnel dev */ > > + xdp_prog_fd = bpf_program__fd(skel->progs.xfrm_get_state_xdp); > > + if (!ASSERT_GE(xdp_prog_fd, 0, "bpf_program__fd")) > > + goto done; > > + err = bpf_xdp_attach(ifindex, xdp_prog_fd, XDP_FLAGS_REPLACE, &opts); > > + if (!ASSERT_OK(err, "bpf_xdp_attach")) > > + goto done; > > + > > /* ping from at_ns0 namespace test */ > > nstoken = open_netns("at_ns0"); > > err = test_ping(AF_INET, IP4_ADDR_TUNL_DEV1); > > @@ -667,6 +677,8 @@ static void test_xfrm_tunnel(void) > > goto done; > > if (!ASSERT_EQ(skel->bss->xfrm_remote_ip, 0xac100164, "remote_ip")) > > goto done; > > + if (!ASSERT_EQ(skel->bss->xfrm_replay_window, 42, "replay_window")) > > + goto done; > > > > done: > > delete_xfrm_tunnel(); > > diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c > > index 3a59eb9c34de..c0dd38616562 100644 > > --- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c > > +++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c > > @@ -30,6 +30,10 @@ int bpf_skb_set_fou_encap(struct __sk_buff *skb_ctx, > > struct bpf_fou_encap *encap, int type) __ksym; > > int bpf_skb_get_fou_encap(struct __sk_buff *skb_ctx, > > struct bpf_fou_encap *encap) __ksym; > > +struct xfrm_state * > > +bpf_xdp_get_xfrm_state(struct xdp_md *ctx, struct bpf_xfrm_state_opts *opts, > > + u32 opts__sz) __ksym; > > +void bpf_xdp_xfrm_state_release(struct xfrm_state *x) __ksym; > > > > struct { > > __uint(type, BPF_MAP_TYPE_ARRAY); > > @@ -950,4 +954,51 @@ int xfrm_get_state(struct __sk_buff *skb) > > return TC_ACT_OK; > > } > > > > +volatile int xfrm_replay_window = 0; > > + > > +SEC("xdp") > > +int xfrm_get_state_xdp(struct xdp_md *xdp) > > +{ > > + struct bpf_xfrm_state_opts opts = {}; > > + struct xfrm_state *x = NULL; > > + struct ip_esp_hdr *esph; > > + struct bpf_dynptr ptr; > > + u8 esph_buf[8] = {}; > > + u8 iph_buf[20] = {}; > > + struct iphdr *iph; > > + u32 off; > > + > > + if (bpf_dynptr_from_xdp(xdp, 0, &ptr)) > > + goto out; > > + > > + off = sizeof(struct ethhdr); > > + iph = bpf_dynptr_slice(&ptr, off, iph_buf, sizeof(iph_buf)); > > + if (!iph || iph->protocol != IPPROTO_ESP) > > + goto out; > > + > > + off += sizeof(struct iphdr); > > + esph = bpf_dynptr_slice(&ptr, off, esph_buf, sizeof(esph_buf)); > > + if (!esph) > > + goto out; > > + > > + opts.netns_id = BPF_F_CURRENT_NETNS; > > + opts.daddr.a4 = iph->daddr; > > + opts.spi = esph->spi; > > + opts.proto = IPPROTO_ESP; > > + opts.family = AF_INET; > > + > > + x = bpf_xdp_get_xfrm_state(xdp, &opts, sizeof(opts)); > > + if (!x || opts.error) > > nit: how can opts.error be non zero if x == NULL?
Ignoring the new -ENOENT case, it can't. Which is why I'm testing that behavior here.
I'm sorry, I don't understand.
AFAICT, regardless of the -ENOENT change, I don't see how (!x) is false and (opt.error) is true, and so "if (!x || opts.error)" is always equivalent to "if (!x)".
What am I missing? Eyal.
The selftests are tests so my intention was to check edge cases here. In normal operation it shouldn't be possible that bpf_xdp_get_xfrm_state() returns non-NULL and also an error. Maybe another way of writing this would be:
if (!x) goto out; assert(opts.error == 0);
I think this would convey the "edge case testing" notion better.
If I'm trying to be too clever (or maybe just wrong) or it's pointless, I can remove the `opts.error` condition.
At least for me the tests also serve as references as to how the API is expected to be used, so I think it'd be clearer without signaling that opts.error could potentially be nonzero on success.
An assertion would indeed make that clear.
Sure, sounds good. I will check on the new bpf assert infra.
Couldn't quite get bpf_assert() working. The following diff:
diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c index c0dd38616562..f00dba85ac5d 100644 --- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c +++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c @@ -8,8 +8,9 @@ */ #include "vmlinux.h" #include <bpf/bpf_core_read.h> -#include <bpf/bpf_helpers.h> #include <bpf/bpf_endian.h> +#include <bpf/bpf_helpers.h> +#include "bpf_experimental.h" #include "bpf_kfuncs.h" #include "bpf_tracing_net.h"
@@ -988,8 +989,9 @@ int xfrm_get_state_xdp(struct xdp_md *xdp) opts.family = AF_INET;
x = bpf_xdp_get_xfrm_state(xdp, &opts, sizeof(opts));
if (!x || opts.error)
if (!x) goto out;
bpf_assert_with(opts.error == 0, XDP_PASS); if (!x->replay_esn) goto out;
results in:
57: (b7) r1 = 2 ; R1_w=2 refs=5 58: (85) call bpf_throw#115436 calling kernel function bpf_throw is not allowed
I think this might be because bpf_throw is not registered for use by BPF_PROG_TYPE_XDP. I would simply register the generic_kfunc_set for this program type as well, since it's already done for TC.
It looks like the above error comes from verifier.c:fetch_kfunc_meta, but I can run the exceptions selftests just fine with the same bzImage. So I'm thinking it's not a kfunc registration or BTF issue.
Maybe it's cuz I'm holding onto KFUNC_ACQUIRE'd `x`? Not sure.
Yes, even once you enable this, this will fail for now. I am sending out a series later this week that enables bpf_throw with acquired references, but until then may I suggest the following:
#define bpf_assert_if(cond) for (int ___i = 0, ___j = (cond); !(___j) \ && !___j; bpf_throw(), ___i++)
This will allow you to insert some cleanup code with an assertion. Then in my series, I will convert this temporary bpf_assert_if back to the normal bpf_assert.
It would look like: bpf_assert_if(opts.error == 0) { // Execute if assertion failed bpf_xdp_xfrm_state_release(x); }
Likewise for bpf_assert_with_if, you get the idea.
So for now I think I'll drop checking opts.error.
[...]
On Wed, Dec 13, 2023 at 12:13:51AM +0100, Kumar Kartikeya Dwivedi wrote:
On Tue, 12 Dec 2023 at 20:52, Daniel Xu dxu@dxuuu.xyz wrote:
cc Kumar
On Tue, Dec 12, 2023 at 09:17:02AM -0700, Daniel Xu wrote:
On Mon, Dec 11, 2023 at 04:25:06PM -0800, Eyal Birger wrote:
On Mon, Dec 11, 2023 at 3:49 PM Daniel Xu dxu@dxuuu.xyz wrote:
On Mon, Dec 11, 2023 at 03:13:07PM -0800, Eyal Birger wrote:
On Mon, Dec 11, 2023 at 2:31 PM Daniel Xu dxu@dxuuu.xyz wrote: > > On Mon, Dec 11, 2023 at 01:39:25PM -0800, Eyal Birger wrote: > > Hi Daniel, > > > > Tiny nits below in case you respin this for other reasons: > > > > On Mon, Dec 11, 2023 at 12:20 PM Daniel Xu dxu@dxuuu.xyz wrote: > > > > > > This commit extends test_tunnel selftest to test the new XDP xfrm state > > > lookup kfunc. > > > > > > Co-developed-by: Antony Antony antony.antony@secunet.com > > > Signed-off-by: Antony Antony antony.antony@secunet.com > > > Signed-off-by: Daniel Xu dxu@dxuuu.xyz > > > --- > > > .../selftests/bpf/prog_tests/test_tunnel.c | 20 ++++++-- > > > .../selftests/bpf/progs/test_tunnel_kern.c | 51 +++++++++++++++++++ > > > 2 files changed, 67 insertions(+), 4 deletions(-) > > > > > > diff --git a/tools/testing/selftests/bpf/prog_tests/test_tunnel.c b/tools/testing/selftests/bpf/prog_tests/test_tunnel.c > > > index 2d7f8fa82ebd..fc804095d578 100644 > > > --- a/tools/testing/selftests/bpf/prog_tests/test_tunnel.c > > > +++ b/tools/testing/selftests/bpf/prog_tests/test_tunnel.c > > > @@ -278,7 +278,7 @@ static int add_xfrm_tunnel(void) > > > SYS(fail, > > > "ip netns exec at_ns0 " > > > "ip xfrm state add src %s dst %s proto esp " > > > - "spi %d reqid 1 mode tunnel " > > > + "spi %d reqid 1 mode tunnel replay-window 42 " > > > "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", > > > IP4_ADDR_VETH0, IP4_ADDR1_VETH1, XFRM_SPI_IN_TO_OUT, XFRM_AUTH, XFRM_ENC); > > > SYS(fail, > > > @@ -292,7 +292,7 @@ static int add_xfrm_tunnel(void) > > > SYS(fail, > > > "ip netns exec at_ns0 " > > > "ip xfrm state add src %s dst %s proto esp " > > > - "spi %d reqid 2 mode tunnel " > > > + "spi %d reqid 2 mode tunnel replay-window 42 " > > > > nit: why do you need to set the replay-window in both directions? > > No reason - probably just careless here. > > > > > > "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", > > > IP4_ADDR1_VETH1, IP4_ADDR_VETH0, XFRM_SPI_OUT_TO_IN, XFRM_AUTH, XFRM_ENC); > > > SYS(fail, > > > @@ -313,7 +313,7 @@ static int add_xfrm_tunnel(void) > > > */ > > > SYS(fail, > > > "ip xfrm state add src %s dst %s proto esp " > > > - "spi %d reqid 1 mode tunnel " > > > + "spi %d reqid 1 mode tunnel replay-window 42 " > > > "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", > > > IP4_ADDR_VETH0, IP4_ADDR1_VETH1, XFRM_SPI_IN_TO_OUT, XFRM_AUTH, XFRM_ENC); > > > SYS(fail, > > > @@ -325,7 +325,7 @@ static int add_xfrm_tunnel(void) > > > /* root -> at_ns0 */ > > > SYS(fail, > > > "ip xfrm state add src %s dst %s proto esp " > > > - "spi %d reqid 2 mode tunnel " > > > + "spi %d reqid 2 mode tunnel replay-window 42 " > > > "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", > > > IP4_ADDR1_VETH1, IP4_ADDR_VETH0, XFRM_SPI_OUT_TO_IN, XFRM_AUTH, XFRM_ENC); > > > SYS(fail, > > > @@ -628,8 +628,10 @@ static void test_xfrm_tunnel(void) > > > { > > > DECLARE_LIBBPF_OPTS(bpf_tc_hook, tc_hook, > > > .attach_point = BPF_TC_INGRESS); > > > + LIBBPF_OPTS(bpf_xdp_attach_opts, opts); > > > struct test_tunnel_kern *skel = NULL; > > > struct nstoken *nstoken; > > > + int xdp_prog_fd; > > > int tc_prog_fd; > > > int ifindex; > > > int err; > > > @@ -654,6 +656,14 @@ static void test_xfrm_tunnel(void) > > > if (attach_tc_prog(&tc_hook, tc_prog_fd, -1)) > > > goto done; > > > > > > + /* attach xdp prog to tunnel dev */ > > > + xdp_prog_fd = bpf_program__fd(skel->progs.xfrm_get_state_xdp); > > > + if (!ASSERT_GE(xdp_prog_fd, 0, "bpf_program__fd")) > > > + goto done; > > > + err = bpf_xdp_attach(ifindex, xdp_prog_fd, XDP_FLAGS_REPLACE, &opts); > > > + if (!ASSERT_OK(err, "bpf_xdp_attach")) > > > + goto done; > > > + > > > /* ping from at_ns0 namespace test */ > > > nstoken = open_netns("at_ns0"); > > > err = test_ping(AF_INET, IP4_ADDR_TUNL_DEV1); > > > @@ -667,6 +677,8 @@ static void test_xfrm_tunnel(void) > > > goto done; > > > if (!ASSERT_EQ(skel->bss->xfrm_remote_ip, 0xac100164, "remote_ip")) > > > goto done; > > > + if (!ASSERT_EQ(skel->bss->xfrm_replay_window, 42, "replay_window")) > > > + goto done; > > > > > > done: > > > delete_xfrm_tunnel(); > > > diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c > > > index 3a59eb9c34de..c0dd38616562 100644 > > > --- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c > > > +++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c > > > @@ -30,6 +30,10 @@ int bpf_skb_set_fou_encap(struct __sk_buff *skb_ctx, > > > struct bpf_fou_encap *encap, int type) __ksym; > > > int bpf_skb_get_fou_encap(struct __sk_buff *skb_ctx, > > > struct bpf_fou_encap *encap) __ksym; > > > +struct xfrm_state * > > > +bpf_xdp_get_xfrm_state(struct xdp_md *ctx, struct bpf_xfrm_state_opts *opts, > > > + u32 opts__sz) __ksym; > > > +void bpf_xdp_xfrm_state_release(struct xfrm_state *x) __ksym; > > > > > > struct { > > > __uint(type, BPF_MAP_TYPE_ARRAY); > > > @@ -950,4 +954,51 @@ int xfrm_get_state(struct __sk_buff *skb) > > > return TC_ACT_OK; > > > } > > > > > > +volatile int xfrm_replay_window = 0; > > > + > > > +SEC("xdp") > > > +int xfrm_get_state_xdp(struct xdp_md *xdp) > > > +{ > > > + struct bpf_xfrm_state_opts opts = {}; > > > + struct xfrm_state *x = NULL; > > > + struct ip_esp_hdr *esph; > > > + struct bpf_dynptr ptr; > > > + u8 esph_buf[8] = {}; > > > + u8 iph_buf[20] = {}; > > > + struct iphdr *iph; > > > + u32 off; > > > + > > > + if (bpf_dynptr_from_xdp(xdp, 0, &ptr)) > > > + goto out; > > > + > > > + off = sizeof(struct ethhdr); > > > + iph = bpf_dynptr_slice(&ptr, off, iph_buf, sizeof(iph_buf)); > > > + if (!iph || iph->protocol != IPPROTO_ESP) > > > + goto out; > > > + > > > + off += sizeof(struct iphdr); > > > + esph = bpf_dynptr_slice(&ptr, off, esph_buf, sizeof(esph_buf)); > > > + if (!esph) > > > + goto out; > > > + > > > + opts.netns_id = BPF_F_CURRENT_NETNS; > > > + opts.daddr.a4 = iph->daddr; > > > + opts.spi = esph->spi; > > > + opts.proto = IPPROTO_ESP; > > > + opts.family = AF_INET; > > > + > > > + x = bpf_xdp_get_xfrm_state(xdp, &opts, sizeof(opts)); > > > + if (!x || opts.error) > > > > nit: how can opts.error be non zero if x == NULL? > > Ignoring the new -ENOENT case, it can't. Which is why I'm testing that > behavior here.
I'm sorry, I don't understand.
AFAICT, regardless of the -ENOENT change, I don't see how (!x) is false and (opt.error) is true, and so "if (!x || opts.error)" is always equivalent to "if (!x)".
What am I missing? Eyal.
The selftests are tests so my intention was to check edge cases here. In normal operation it shouldn't be possible that bpf_xdp_get_xfrm_state() returns non-NULL and also an error. Maybe another way of writing this would be:
if (!x) goto out; assert(opts.error == 0);
I think this would convey the "edge case testing" notion better.
If I'm trying to be too clever (or maybe just wrong) or it's pointless, I can remove the `opts.error` condition.
At least for me the tests also serve as references as to how the API is expected to be used, so I think it'd be clearer without signaling that opts.error could potentially be nonzero on success.
An assertion would indeed make that clear.
Sure, sounds good. I will check on the new bpf assert infra.
Couldn't quite get bpf_assert() working. The following diff:
diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c index c0dd38616562..f00dba85ac5d 100644 --- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c +++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c @@ -8,8 +8,9 @@ */ #include "vmlinux.h" #include <bpf/bpf_core_read.h> -#include <bpf/bpf_helpers.h> #include <bpf/bpf_endian.h> +#include <bpf/bpf_helpers.h> +#include "bpf_experimental.h" #include "bpf_kfuncs.h" #include "bpf_tracing_net.h"
@@ -988,8 +989,9 @@ int xfrm_get_state_xdp(struct xdp_md *xdp) opts.family = AF_INET;
x = bpf_xdp_get_xfrm_state(xdp, &opts, sizeof(opts));
if (!x || opts.error)
if (!x) goto out;
bpf_assert_with(opts.error == 0, XDP_PASS); if (!x->replay_esn) goto out;
results in:
57: (b7) r1 = 2 ; R1_w=2 refs=5 58: (85) call bpf_throw#115436 calling kernel function bpf_throw is not allowed
I think this might be because bpf_throw is not registered for use by BPF_PROG_TYPE_XDP. I would simply register the generic_kfunc_set for this program type as well, since it's already done for TC.
Ah yeah, that was it.
It looks like the above error comes from verifier.c:fetch_kfunc_meta, but I can run the exceptions selftests just fine with the same bzImage. So I'm thinking it's not a kfunc registration or BTF issue.
Maybe it's cuz I'm holding onto KFUNC_ACQUIRE'd `x`? Not sure.
Yes, even once you enable this, this will fail for now. I am sending out a series later this week that enables bpf_throw with acquired references, but until then may I suggest the following:
#define bpf_assert_if(cond) for (int ___i = 0, ___j = (cond); !(___j) \ && !___j; bpf_throw(), ___i++)
This will allow you to insert some cleanup code with an assertion. Then in my series, I will convert this temporary bpf_assert_if back to the normal bpf_assert.
It would look like: bpf_assert_if(opts.error == 0) { // Execute if assertion failed bpf_xdp_xfrm_state_release(x); }
Likewise for bpf_assert_with_if, you get the idea.
I gave it a try and I'm getting this compile error:
progs/test_tunnel_kern.c:996:2: error: variable '___j' used in loop condition not modified in loop body [-Werror,-Wfor-loop-analysis] bpf_assert_with_if(opts.error == 0, XDP_PASS) { ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /home/dxu/dev/linux/tools/testing/selftests/bpf/bpf_experimental.h:295:38: note: expanded from macro 'bpf_assert_with_if' for (int ___i = 0, ___j = (cond); !(___j) && !___j; bpf_throw(value), ___i++) ^~~~ ~~~~ 1 error generated. make: *** [Makefile:618: /home/dxu/dev/linux/tools/testing/selftests/bpf/test_tunnel_kern.bpf.o] Error 1
Seems like the compiler is being clever.
Thanks, Daniel
On Wed, Dec 13, 2023 at 3:15 PM Daniel Xu dxu@dxuuu.xyz wrote:
On Wed, Dec 13, 2023 at 12:13:51AM +0100, Kumar Kartikeya Dwivedi wrote:
On Tue, 12 Dec 2023 at 20:52, Daniel Xu dxu@dxuuu.xyz wrote:
cc Kumar
On Tue, Dec 12, 2023 at 09:17:02AM -0700, Daniel Xu wrote:
On Mon, Dec 11, 2023 at 04:25:06PM -0800, Eyal Birger wrote:
On Mon, Dec 11, 2023 at 3:49 PM Daniel Xu dxu@dxuuu.xyz wrote:
On Mon, Dec 11, 2023 at 03:13:07PM -0800, Eyal Birger wrote: > On Mon, Dec 11, 2023 at 2:31 PM Daniel Xu dxu@dxuuu.xyz wrote: > > > > On Mon, Dec 11, 2023 at 01:39:25PM -0800, Eyal Birger wrote: > > > Hi Daniel, > > > > > > Tiny nits below in case you respin this for other reasons: > > > > > > On Mon, Dec 11, 2023 at 12:20 PM Daniel Xu dxu@dxuuu.xyz wrote: > > > > > > > > This commit extends test_tunnel selftest to test the new XDP xfrm state > > > > lookup kfunc. > > > > > > > > Co-developed-by: Antony Antony antony.antony@secunet.com > > > > Signed-off-by: Antony Antony antony.antony@secunet.com > > > > Signed-off-by: Daniel Xu dxu@dxuuu.xyz > > > > --- > > > > .../selftests/bpf/prog_tests/test_tunnel.c | 20 ++++++-- > > > > .../selftests/bpf/progs/test_tunnel_kern.c | 51 +++++++++++++++++++ > > > > 2 files changed, 67 insertions(+), 4 deletions(-) > > > > > > > > diff --git a/tools/testing/selftests/bpf/prog_tests/test_tunnel.c b/tools/testing/selftests/bpf/prog_tests/test_tunnel.c > > > > index 2d7f8fa82ebd..fc804095d578 100644 > > > > --- a/tools/testing/selftests/bpf/prog_tests/test_tunnel.c > > > > +++ b/tools/testing/selftests/bpf/prog_tests/test_tunnel.c > > > > @@ -278,7 +278,7 @@ static int add_xfrm_tunnel(void) > > > > SYS(fail, > > > > "ip netns exec at_ns0 " > > > > "ip xfrm state add src %s dst %s proto esp " > > > > - "spi %d reqid 1 mode tunnel " > > > > + "spi %d reqid 1 mode tunnel replay-window 42 " > > > > "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", > > > > IP4_ADDR_VETH0, IP4_ADDR1_VETH1, XFRM_SPI_IN_TO_OUT, XFRM_AUTH, XFRM_ENC); > > > > SYS(fail, > > > > @@ -292,7 +292,7 @@ static int add_xfrm_tunnel(void) > > > > SYS(fail, > > > > "ip netns exec at_ns0 " > > > > "ip xfrm state add src %s dst %s proto esp " > > > > - "spi %d reqid 2 mode tunnel " > > > > + "spi %d reqid 2 mode tunnel replay-window 42 " > > > > > > nit: why do you need to set the replay-window in both directions? > > > > No reason - probably just careless here. > > > > > > > > > "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", > > > > IP4_ADDR1_VETH1, IP4_ADDR_VETH0, XFRM_SPI_OUT_TO_IN, XFRM_AUTH, XFRM_ENC); > > > > SYS(fail, > > > > @@ -313,7 +313,7 @@ static int add_xfrm_tunnel(void) > > > > */ > > > > SYS(fail, > > > > "ip xfrm state add src %s dst %s proto esp " > > > > - "spi %d reqid 1 mode tunnel " > > > > + "spi %d reqid 1 mode tunnel replay-window 42 " > > > > "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", > > > > IP4_ADDR_VETH0, IP4_ADDR1_VETH1, XFRM_SPI_IN_TO_OUT, XFRM_AUTH, XFRM_ENC); > > > > SYS(fail, > > > > @@ -325,7 +325,7 @@ static int add_xfrm_tunnel(void) > > > > /* root -> at_ns0 */ > > > > SYS(fail, > > > > "ip xfrm state add src %s dst %s proto esp " > > > > - "spi %d reqid 2 mode tunnel " > > > > + "spi %d reqid 2 mode tunnel replay-window 42 " > > > > "auth-trunc 'hmac(sha1)' %s 96 enc 'cbc(aes)' %s", > > > > IP4_ADDR1_VETH1, IP4_ADDR_VETH0, XFRM_SPI_OUT_TO_IN, XFRM_AUTH, XFRM_ENC); > > > > SYS(fail, > > > > @@ -628,8 +628,10 @@ static void test_xfrm_tunnel(void) > > > > { > > > > DECLARE_LIBBPF_OPTS(bpf_tc_hook, tc_hook, > > > > .attach_point = BPF_TC_INGRESS); > > > > + LIBBPF_OPTS(bpf_xdp_attach_opts, opts); > > > > struct test_tunnel_kern *skel = NULL; > > > > struct nstoken *nstoken; > > > > + int xdp_prog_fd; > > > > int tc_prog_fd; > > > > int ifindex; > > > > int err; > > > > @@ -654,6 +656,14 @@ static void test_xfrm_tunnel(void) > > > > if (attach_tc_prog(&tc_hook, tc_prog_fd, -1)) > > > > goto done; > > > > > > > > + /* attach xdp prog to tunnel dev */ > > > > + xdp_prog_fd = bpf_program__fd(skel->progs.xfrm_get_state_xdp); > > > > + if (!ASSERT_GE(xdp_prog_fd, 0, "bpf_program__fd")) > > > > + goto done; > > > > + err = bpf_xdp_attach(ifindex, xdp_prog_fd, XDP_FLAGS_REPLACE, &opts); > > > > + if (!ASSERT_OK(err, "bpf_xdp_attach")) > > > > + goto done; > > > > + > > > > /* ping from at_ns0 namespace test */ > > > > nstoken = open_netns("at_ns0"); > > > > err = test_ping(AF_INET, IP4_ADDR_TUNL_DEV1); > > > > @@ -667,6 +677,8 @@ static void test_xfrm_tunnel(void) > > > > goto done; > > > > if (!ASSERT_EQ(skel->bss->xfrm_remote_ip, 0xac100164, "remote_ip")) > > > > goto done; > > > > + if (!ASSERT_EQ(skel->bss->xfrm_replay_window, 42, "replay_window")) > > > > + goto done; > > > > > > > > done: > > > > delete_xfrm_tunnel(); > > > > diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c > > > > index 3a59eb9c34de..c0dd38616562 100644 > > > > --- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c > > > > +++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c > > > > @@ -30,6 +30,10 @@ int bpf_skb_set_fou_encap(struct __sk_buff *skb_ctx, > > > > struct bpf_fou_encap *encap, int type) __ksym; > > > > int bpf_skb_get_fou_encap(struct __sk_buff *skb_ctx, > > > > struct bpf_fou_encap *encap) __ksym; > > > > +struct xfrm_state * > > > > +bpf_xdp_get_xfrm_state(struct xdp_md *ctx, struct bpf_xfrm_state_opts *opts, > > > > + u32 opts__sz) __ksym; > > > > +void bpf_xdp_xfrm_state_release(struct xfrm_state *x) __ksym; > > > > > > > > struct { > > > > __uint(type, BPF_MAP_TYPE_ARRAY); > > > > @@ -950,4 +954,51 @@ int xfrm_get_state(struct __sk_buff *skb) > > > > return TC_ACT_OK; > > > > } > > > > > > > > +volatile int xfrm_replay_window = 0; > > > > + > > > > +SEC("xdp") > > > > +int xfrm_get_state_xdp(struct xdp_md *xdp) > > > > +{ > > > > + struct bpf_xfrm_state_opts opts = {}; > > > > + struct xfrm_state *x = NULL; > > > > + struct ip_esp_hdr *esph; > > > > + struct bpf_dynptr ptr; > > > > + u8 esph_buf[8] = {}; > > > > + u8 iph_buf[20] = {}; > > > > + struct iphdr *iph; > > > > + u32 off; > > > > + > > > > + if (bpf_dynptr_from_xdp(xdp, 0, &ptr)) > > > > + goto out; > > > > + > > > > + off = sizeof(struct ethhdr); > > > > + iph = bpf_dynptr_slice(&ptr, off, iph_buf, sizeof(iph_buf)); > > > > + if (!iph || iph->protocol != IPPROTO_ESP) > > > > + goto out; > > > > + > > > > + off += sizeof(struct iphdr); > > > > + esph = bpf_dynptr_slice(&ptr, off, esph_buf, sizeof(esph_buf)); > > > > + if (!esph) > > > > + goto out; > > > > + > > > > + opts.netns_id = BPF_F_CURRENT_NETNS; > > > > + opts.daddr.a4 = iph->daddr; > > > > + opts.spi = esph->spi; > > > > + opts.proto = IPPROTO_ESP; > > > > + opts.family = AF_INET; > > > > + > > > > + x = bpf_xdp_get_xfrm_state(xdp, &opts, sizeof(opts)); > > > > + if (!x || opts.error) > > > > > > nit: how can opts.error be non zero if x == NULL? > > > > Ignoring the new -ENOENT case, it can't. Which is why I'm testing that > > behavior here. > > I'm sorry, I don't understand. > > AFAICT, regardless of the -ENOENT change, I don't see > how (!x) is false and (opt.error) is true, and so > "if (!x || opts.error)" is always equivalent to "if (!x)". > > What am I missing? > Eyal.
The selftests are tests so my intention was to check edge cases here. In normal operation it shouldn't be possible that bpf_xdp_get_xfrm_state() returns non-NULL and also an error. Maybe another way of writing this would be:
if (!x) goto out; assert(opts.error == 0);
I think this would convey the "edge case testing" notion better.
If I'm trying to be too clever (or maybe just wrong) or it's pointless, I can remove the `opts.error` condition.
At least for me the tests also serve as references as to how the API is expected to be used, so I think it'd be clearer without signaling that opts.error could potentially be nonzero on success.
An assertion would indeed make that clear.
Sure, sounds good. I will check on the new bpf assert infra.
Couldn't quite get bpf_assert() working. The following diff:
diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c index c0dd38616562..f00dba85ac5d 100644 --- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c +++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c @@ -8,8 +8,9 @@ */ #include "vmlinux.h" #include <bpf/bpf_core_read.h> -#include <bpf/bpf_helpers.h> #include <bpf/bpf_endian.h> +#include <bpf/bpf_helpers.h> +#include "bpf_experimental.h" #include "bpf_kfuncs.h" #include "bpf_tracing_net.h"
@@ -988,8 +989,9 @@ int xfrm_get_state_xdp(struct xdp_md *xdp) opts.family = AF_INET;
x = bpf_xdp_get_xfrm_state(xdp, &opts, sizeof(opts));
if (!x || opts.error)
if (!x) goto out;
bpf_assert_with(opts.error == 0, XDP_PASS); if (!x->replay_esn) goto out;
results in:
57: (b7) r1 = 2 ; R1_w=2 refs=5 58: (85) call bpf_throw#115436 calling kernel function bpf_throw is not allowed
I think this might be because bpf_throw is not registered for use by BPF_PROG_TYPE_XDP. I would simply register the generic_kfunc_set for this program type as well, since it's already done for TC.
Ah yeah, that was it.
It looks like the above error comes from verifier.c:fetch_kfunc_meta, but I can run the exceptions selftests just fine with the same bzImage. So I'm thinking it's not a kfunc registration or BTF issue.
Maybe it's cuz I'm holding onto KFUNC_ACQUIRE'd `x`? Not sure.
Yes, even once you enable this, this will fail for now. I am sending out a series later this week that enables bpf_throw with acquired references, but until then may I suggest the following:
#define bpf_assert_if(cond) for (int ___i = 0, ___j = (cond); !(___j) \ && !___j; bpf_throw(), ___i++)
This will allow you to insert some cleanup code with an assertion. Then in my series, I will convert this temporary bpf_assert_if back to the normal bpf_assert.
It would look like: bpf_assert_if(opts.error == 0) { // Execute if assertion failed bpf_xdp_xfrm_state_release(x); }
Likewise for bpf_assert_with_if, you get the idea.
I gave it a try and I'm getting this compile error:
progs/test_tunnel_kern.c:996:2: error: variable '___j' used in loop condition not modified in loop body [-Werror,-Wfor-loop-analysis] bpf_assert_with_if(opts.error == 0, XDP_PASS) { ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /home/dxu/dev/linux/tools/testing/selftests/bpf/bpf_experimental.h:295:38: note: expanded from macro 'bpf_assert_with_if' for (int ___i = 0, ___j = (cond); !(___j) && !___j; bpf_throw(value), ___i++) ^~~~ ~~~~ 1 error generated. make: *** [Makefile:618: /home/dxu/dev/linux/tools/testing/selftests/bpf/test_tunnel_kern.bpf.o] Error 1
Seems like the compiler is being clever.
It looks like ___j is used twice - maybe it was meant to be ___i? i.e.:
for (int ___i = 0, ___j = (cond); !(___j) && !___i; bpf_throw(value), ___i++)
Eyal.
On Thu, 14 Dec 2023 at 00:49, Eyal Birger eyal.birger@gmail.com wrote:
On Wed, Dec 13, 2023 at 3:15 PM Daniel Xu dxu@dxuuu.xyz wrote:
[...]
diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c index c0dd38616562..f00dba85ac5d 100644 --- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c +++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c @@ -8,8 +8,9 @@ */ #include "vmlinux.h" #include <bpf/bpf_core_read.h> -#include <bpf/bpf_helpers.h> #include <bpf/bpf_endian.h> +#include <bpf/bpf_helpers.h> +#include "bpf_experimental.h" #include "bpf_kfuncs.h" #include "bpf_tracing_net.h"
@@ -988,8 +989,9 @@ int xfrm_get_state_xdp(struct xdp_md *xdp) opts.family = AF_INET;
x = bpf_xdp_get_xfrm_state(xdp, &opts, sizeof(opts));
if (!x || opts.error)
if (!x) goto out;
bpf_assert_with(opts.error == 0, XDP_PASS); if (!x->replay_esn) goto out;
results in:
57: (b7) r1 = 2 ; R1_w=2 refs=5 58: (85) call bpf_throw#115436 calling kernel function bpf_throw is not allowed
I think this might be because bpf_throw is not registered for use by BPF_PROG_TYPE_XDP. I would simply register the generic_kfunc_set for this program type as well, since it's already done for TC.
Ah yeah, that was it.
It looks like the above error comes from verifier.c:fetch_kfunc_meta, but I can run the exceptions selftests just fine with the same bzImage. So I'm thinking it's not a kfunc registration or BTF issue.
Maybe it's cuz I'm holding onto KFUNC_ACQUIRE'd `x`? Not sure.
Yes, even once you enable this, this will fail for now. I am sending out a series later this week that enables bpf_throw with acquired references, but until then may I suggest the following:
#define bpf_assert_if(cond) for (int ___i = 0, ___j = (cond); !(___j) \ && !___j; bpf_throw(), ___i++)
This will allow you to insert some cleanup code with an assertion. Then in my series, I will convert this temporary bpf_assert_if back to the normal bpf_assert.
It would look like: bpf_assert_if(opts.error == 0) { // Execute if assertion failed bpf_xdp_xfrm_state_release(x); }
Likewise for bpf_assert_with_if, you get the idea.
I gave it a try and I'm getting this compile error:
progs/test_tunnel_kern.c:996:2: error: variable '___j' used in loop condition not modified in loop body [-Werror,-Wfor-loop-analysis] bpf_assert_with_if(opts.error == 0, XDP_PASS) { ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /home/dxu/dev/linux/tools/testing/selftests/bpf/bpf_experimental.h:295:38: note: expanded from macro 'bpf_assert_with_if' for (int ___i = 0, ___j = (cond); !(___j) && !___j; bpf_throw(value), ___i++) ^~~~ ~~~~ 1 error generated. make: *** [Makefile:618: /home/dxu/dev/linux/tools/testing/selftests/bpf/test_tunnel_kern.bpf.o] Error 1
Seems like the compiler is being clever.
It looks like ___j is used twice - maybe it was meant to be ___i? i.e.:
for (int ___i = 0, ___j = (cond); !(___j) && !___i; bpf_throw(value), ___i++)
Ah, yes, that's a typo. Eyal is right, it should be ___i.
On Thu, 14 Dec 2023 at 17:08, Kumar Kartikeya Dwivedi memxor@gmail.com wrote:
On Thu, 14 Dec 2023 at 00:49, Eyal Birger eyal.birger@gmail.com wrote:
On Wed, Dec 13, 2023 at 3:15 PM Daniel Xu dxu@dxuuu.xyz wrote:
[...]
diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c index c0dd38616562..f00dba85ac5d 100644 --- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c +++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c @@ -8,8 +8,9 @@ */ #include "vmlinux.h" #include <bpf/bpf_core_read.h> -#include <bpf/bpf_helpers.h> #include <bpf/bpf_endian.h> +#include <bpf/bpf_helpers.h> +#include "bpf_experimental.h" #include "bpf_kfuncs.h" #include "bpf_tracing_net.h"
@@ -988,8 +989,9 @@ int xfrm_get_state_xdp(struct xdp_md *xdp) opts.family = AF_INET;
x = bpf_xdp_get_xfrm_state(xdp, &opts, sizeof(opts));
if (!x || opts.error)
if (!x) goto out;
bpf_assert_with(opts.error == 0, XDP_PASS); if (!x->replay_esn) goto out;
results in:
57: (b7) r1 = 2 ; R1_w=2 refs=5 58: (85) call bpf_throw#115436 calling kernel function bpf_throw is not allowed
I think this might be because bpf_throw is not registered for use by BPF_PROG_TYPE_XDP. I would simply register the generic_kfunc_set for this program type as well, since it's already done for TC.
Ah yeah, that was it.
It looks like the above error comes from verifier.c:fetch_kfunc_meta, but I can run the exceptions selftests just fine with the same bzImage. So I'm thinking it's not a kfunc registration or BTF issue.
Maybe it's cuz I'm holding onto KFUNC_ACQUIRE'd `x`? Not sure.
Yes, even once you enable this, this will fail for now. I am sending out a series later this week that enables bpf_throw with acquired references, but until then may I suggest the following:
#define bpf_assert_if(cond) for (int ___i = 0, ___j = (cond); !(___j) \ && !___j; bpf_throw(), ___i++)
This will allow you to insert some cleanup code with an assertion. Then in my series, I will convert this temporary bpf_assert_if back to the normal bpf_assert.
It would look like: bpf_assert_if(opts.error == 0) { // Execute if assertion failed bpf_xdp_xfrm_state_release(x); }
Likewise for bpf_assert_with_if, you get the idea.
I gave it a try and I'm getting this compile error:
progs/test_tunnel_kern.c:996:2: error: variable '___j' used in loop condition not modified in loop body [-Werror,-Wfor-loop-analysis] bpf_assert_with_if(opts.error == 0, XDP_PASS) { ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /home/dxu/dev/linux/tools/testing/selftests/bpf/bpf_experimental.h:295:38: note: expanded from macro 'bpf_assert_with_if' for (int ___i = 0, ___j = (cond); !(___j) && !___j; bpf_throw(value), ___i++) ^~~~ ~~~~ 1 error generated. make: *** [Makefile:618: /home/dxu/dev/linux/tools/testing/selftests/bpf/test_tunnel_kern.bpf.o] Error 1
Seems like the compiler is being clever.
It looks like ___j is used twice - maybe it was meant to be ___i? i.e.:
for (int ___i = 0, ___j = (cond); !(___j) && !___i; bpf_throw(value), ___i++)
Ah, yes, that's a typo. Eyal is right, it should be ___i.
Additionally, I would modify the macro to do ___j = !!(cond).
On Thu, Dec 14, 2023 at 05:16:08PM +0100, Kumar Kartikeya Dwivedi wrote:
On Thu, 14 Dec 2023 at 17:08, Kumar Kartikeya Dwivedi memxor@gmail.com wrote:
On Thu, 14 Dec 2023 at 00:49, Eyal Birger eyal.birger@gmail.com wrote:
On Wed, Dec 13, 2023 at 3:15 PM Daniel Xu dxu@dxuuu.xyz wrote:
[...]
diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c index c0dd38616562..f00dba85ac5d 100644 --- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c +++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c @@ -8,8 +8,9 @@ */ #include "vmlinux.h" #include <bpf/bpf_core_read.h> -#include <bpf/bpf_helpers.h> #include <bpf/bpf_endian.h> +#include <bpf/bpf_helpers.h> +#include "bpf_experimental.h" #include "bpf_kfuncs.h" #include "bpf_tracing_net.h"
@@ -988,8 +989,9 @@ int xfrm_get_state_xdp(struct xdp_md *xdp) opts.family = AF_INET;
x = bpf_xdp_get_xfrm_state(xdp, &opts, sizeof(opts));
if (!x || opts.error)
if (!x) goto out;
bpf_assert_with(opts.error == 0, XDP_PASS); if (!x->replay_esn) goto out;
results in:
57: (b7) r1 = 2 ; R1_w=2 refs=5 58: (85) call bpf_throw#115436 calling kernel function bpf_throw is not allowed
I think this might be because bpf_throw is not registered for use by BPF_PROG_TYPE_XDP. I would simply register the generic_kfunc_set for this program type as well, since it's already done for TC.
Ah yeah, that was it.
It looks like the above error comes from verifier.c:fetch_kfunc_meta, but I can run the exceptions selftests just fine with the same bzImage. So I'm thinking it's not a kfunc registration or BTF issue.
Maybe it's cuz I'm holding onto KFUNC_ACQUIRE'd `x`? Not sure.
Yes, even once you enable this, this will fail for now. I am sending out a series later this week that enables bpf_throw with acquired references, but until then may I suggest the following:
#define bpf_assert_if(cond) for (int ___i = 0, ___j = (cond); !(___j) \ && !___j; bpf_throw(), ___i++)
This will allow you to insert some cleanup code with an assertion. Then in my series, I will convert this temporary bpf_assert_if back to the normal bpf_assert.
It would look like: bpf_assert_if(opts.error == 0) { // Execute if assertion failed bpf_xdp_xfrm_state_release(x); }
Likewise for bpf_assert_with_if, you get the idea.
I gave it a try and I'm getting this compile error:
progs/test_tunnel_kern.c:996:2: error: variable '___j' used in loop condition not modified in loop body [-Werror,-Wfor-loop-analysis] bpf_assert_with_if(opts.error == 0, XDP_PASS) { ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /home/dxu/dev/linux/tools/testing/selftests/bpf/bpf_experimental.h:295:38: note: expanded from macro 'bpf_assert_with_if' for (int ___i = 0, ___j = (cond); !(___j) && !___j; bpf_throw(value), ___i++) ^~~~ ~~~~ 1 error generated. make: *** [Makefile:618: /home/dxu/dev/linux/tools/testing/selftests/bpf/test_tunnel_kern.bpf.o] Error 1
Seems like the compiler is being clever.
It looks like ___j is used twice - maybe it was meant to be ___i? i.e.:
for (int ___i = 0, ___j = (cond); !(___j) && !___i; bpf_throw(value), ___i++)
Ah, yes, that's a typo. Eyal is right, it should be ___i.
Additionally, I would modify the macro to do ___j = !!(cond).
Makes sense. Will send out v6 with these fixes today.
On Thu, Dec 14, 2023 at 11:23:02AM -0700, Daniel Xu wrote:
On Thu, Dec 14, 2023 at 05:16:08PM +0100, Kumar Kartikeya Dwivedi wrote:
On Thu, 14 Dec 2023 at 17:08, Kumar Kartikeya Dwivedi memxor@gmail.com wrote:
On Thu, 14 Dec 2023 at 00:49, Eyal Birger eyal.birger@gmail.com wrote:
On Wed, Dec 13, 2023 at 3:15 PM Daniel Xu dxu@dxuuu.xyz wrote:
> [...] > > diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c > index c0dd38616562..f00dba85ac5d 100644 > --- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c > +++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c > @@ -8,8 +8,9 @@ > */ > #include "vmlinux.h" > #include <bpf/bpf_core_read.h> > -#include <bpf/bpf_helpers.h> > #include <bpf/bpf_endian.h> > +#include <bpf/bpf_helpers.h> > +#include "bpf_experimental.h" > #include "bpf_kfuncs.h" > #include "bpf_tracing_net.h" > > @@ -988,8 +989,9 @@ int xfrm_get_state_xdp(struct xdp_md *xdp) > opts.family = AF_INET; > > x = bpf_xdp_get_xfrm_state(xdp, &opts, sizeof(opts)); > - if (!x || opts.error) > + if (!x) > goto out; > + bpf_assert_with(opts.error == 0, XDP_PASS); > > if (!x->replay_esn) > goto out; > > results in: > > 57: (b7) r1 = 2 ; R1_w=2 refs=5 > 58: (85) call bpf_throw#115436 > calling kernel function bpf_throw is not allowed >
I think this might be because bpf_throw is not registered for use by BPF_PROG_TYPE_XDP. I would simply register the generic_kfunc_set for this program type as well, since it's already done for TC.
Ah yeah, that was it.
> It looks like the above error comes from verifier.c:fetch_kfunc_meta, > but I can run the exceptions selftests just fine with the same bzImage. > So I'm thinking it's not a kfunc registration or BTF issue. > > Maybe it's cuz I'm holding onto KFUNC_ACQUIRE'd `x`? Not sure. >
Yes, even once you enable this, this will fail for now. I am sending out a series later this week that enables bpf_throw with acquired references, but until then may I suggest the following:
#define bpf_assert_if(cond) for (int ___i = 0, ___j = (cond); !(___j) \ && !___j; bpf_throw(), ___i++)
This will allow you to insert some cleanup code with an assertion. Then in my series, I will convert this temporary bpf_assert_if back to the normal bpf_assert.
It would look like: bpf_assert_if(opts.error == 0) { // Execute if assertion failed bpf_xdp_xfrm_state_release(x); }
Likewise for bpf_assert_with_if, you get the idea.
I gave it a try and I'm getting this compile error:
progs/test_tunnel_kern.c:996:2: error: variable '___j' used in loop condition not modified in loop body [-Werror,-Wfor-loop-analysis] bpf_assert_with_if(opts.error == 0, XDP_PASS) { ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /home/dxu/dev/linux/tools/testing/selftests/bpf/bpf_experimental.h:295:38: note: expanded from macro 'bpf_assert_with_if' for (int ___i = 0, ___j = (cond); !(___j) && !___j; bpf_throw(value), ___i++) ^~~~ ~~~~ 1 error generated. make: *** [Makefile:618: /home/dxu/dev/linux/tools/testing/selftests/bpf/test_tunnel_kern.bpf.o] Error 1
Seems like the compiler is being clever.
It looks like ___j is used twice - maybe it was meant to be ___i? i.e.:
for (int ___i = 0, ___j = (cond); !(___j) && !___i; bpf_throw(value), ___i++)
Ah, yes, that's a typo. Eyal is right, it should be ___i.
Additionally, I would modify the macro to do ___j = !!(cond).
Makes sense. Will send out v6 with these fixes today.
Looks like only x86 supports exceptions (looking at bpf_jit_supports_exceptions()).
This causes selftests in this patchset to fail on !x86, which is unfortunate. We probably want to be running these tests on all the major archs, so I will drop the assertion patches from this patchset.
But since they're generally useful and I've already written the selftests for it, I could put them up in another patchset? Or maybe not cuz you're gonna fix it later anyways. WDYT?
Thanks, Daniel
On Thu, Dec 14, 2023 at 12:24 PM Daniel Xu dxu@dxuuu.xyz wrote:
Looks like only x86 supports exceptions (looking at bpf_jit_supports_exceptions()).
This causes selftests in this patchset to fail on !x86, which is unfortunate. We probably want to be running these tests on all the major archs, so I will drop the assertion patches from this patchset.
But since they're generally useful and I've already written the selftests for it, I could put them up in another patchset? Or maybe not cuz you're gonna fix it later anyways. WDYT?
Yeah. don't use bpf_assert in generic tests yet. Only tests that test bpf_assert should use it.
Pls send the ones you wrote separately, so they stay in email archives and we can pick them up later.
linux-kselftest-mirror@lists.linaro.org