Cong reported an issue where running 'test_sockmap' in the current bpf-next tree results in an error [1].
The specific test case that triggered the error is a combined test involving ktls and bpf_msg_pop_data().
Root Cause: When sending plaintext data, we initially calculated the corresponding ciphertext length. However, if we later reduced the plaintext data length via socket policy, we failed to recalculate the ciphertext length.
This results in transmitting buffers containing uninitialized data during ciphertext transmission.
This causes uninitialized bytes to be appended after a complete "Application Data" packet, leading to errors on the receiving end when parsing TLS record.
This issue has existed for a long time but was only exposed after the following test code was merged. commit 47eae080410b ("selftests/bpf: Add more tests for test_txmsg_push_pop in test_sockmap")
Although we already had tests for pop data before this commit, the pop data length was insufficient (less than 5 bytes). This meant that the corrupted TLS records with data length <5 bytes were cached without being parsed, resulting in no error being triggered.
After this fix, all tests pass.
1/ 6 sockmap::txmsg test passthrough:OK 2/ 6 sockmap::txmsg test redirect:OK 3/ 2 sockmap::txmsg test redirect wait send mem:OK 4/ 6 sockmap::txmsg test drop:OK 5/ 6 sockmap::txmsg test ingress redirect:OK 6/ 7 sockmap::txmsg test skb:OK 7/12 sockmap::txmsg test apply:OK 8/12 sockmap::txmsg test cork:OK 9/ 3 sockmap::txmsg test hanging corks:OK 10/11 sockmap::txmsg test push_data:OK 11/17 sockmap::txmsg test pull-data:OK 12/ 9 sockmap::txmsg test pop-data:OK 13/ 6 sockmap::txmsg test push/pop data:OK 14/ 1 sockmap::txmsg test ingress parser:OK 15/ 1 sockmap::txmsg test ingress parser2:OK 16/ 6 sockhash::txmsg test passthrough:OK 17/ 6 sockhash::txmsg test redirect:OK 18/ 2 sockhash::txmsg test redirect wait send mem:OK 19/ 6 sockhash::txmsg test drop:OK 20/ 6 sockhash::txmsg test ingress redirect:OK 21/ 7 sockhash::txmsg test skb:OK 22/12 sockhash::txmsg test apply:OK 23/12 sockhash::txmsg test cork:OK 24/ 3 sockhash::txmsg test hanging corks:OK 25/11 sockhash::txmsg test push_data:OK 26/17 sockhash::txmsg test pull-data:OK 27/ 9 sockhash::txmsg test pop-data:OK 28/ 6 sockhash::txmsg test push/pop data:OK 29/ 1 sockhash::txmsg test ingress parser:OK 30/ 1 sockhash::txmsg test ingress parser2:OK 31/ 6 sockhash:ktls:txmsg test passthrough:OK 32/ 6 sockhash:ktls:txmsg test redirect:OK 33/ 2 sockhash:ktls:txmsg test redirect wait send mem:OK 34/ 6 sockhash:ktls:txmsg test drop:OK 35/ 6 sockhash:ktls:txmsg test ingress redirect:OK 36/ 7 sockhash:ktls:txmsg test skb:OK 37/12 sockhash:ktls:txmsg test apply:OK 38/12 sockhash:ktls:txmsg test cork:OK 39/ 3 sockhash:ktls:txmsg test hanging corks:OK 40/11 sockhash:ktls:txmsg test push_data:OK 41/17 sockhash:ktls:txmsg test pull-data:OK 42/ 9 sockhash:ktls:txmsg test pop-data:OK 43/ 6 sockhash:ktls:txmsg test push/pop data:OK 44/ 1 sockhash:ktls:txmsg test ingress parser:OK 45/ 0 sockhash:ktls:txmsg test ingress parser2:OK Pass: 45 Fail: 0
[1]: https://lore.kernel.org/bpf/CAM_iQpU7=4xjbefZoxndKoX9gFFMOe7FcWMq5tHBsymbrnM...
Jiayuan Chen (2): bpf,ktls: Fix data corruption when using bpf_msg_pop_data() in ktls selftests/bpf: Add test to cover ktls with bpf_msg_pop_data
net/tls/tls_sw.c | 15 +++ .../selftests/bpf/prog_tests/sockmap_ktls.c | 91 +++++++++++++++++++ .../selftests/bpf/progs/test_sockmap_ktls.c | 4 + 3 files changed, 110 insertions(+)
When sending plaintext data, we initially calculated the corresponding ciphertext length. However, if we later reduced the plaintext data length via socket policy, we failed to recalculate the ciphertext length.
This results in transmitting buffers containing uninitialized data during ciphertext transmission.
This causes uninitialized bytes to be appended after a complete "Application Data" packet, leading to errors on the receiving end when parsing TLS record.
Fixes: d3b18ad31f93 ("tls: add bpf support to sk_msg handling") Reported-by: Cong Wang xiyou.wangcong@gmail.com Signed-off-by: Jiayuan Chen jiayuan.chen@linux.dev --- net/tls/tls_sw.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+)
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c index fc88e34b7f33..b23a4655be6a 100644 --- a/net/tls/tls_sw.c +++ b/net/tls/tls_sw.c @@ -872,6 +872,21 @@ static int bpf_exec_tx_verdict(struct sk_msg *msg, struct sock *sk, delta = msg->sg.size; psock->eval = sk_psock_msg_verdict(sk, psock, msg); delta -= msg->sg.size; + + if ((s32)delta > 0) { + /* It indicates that we executed bpf_msg_pop_data(), + * causing the plaintext data size to decrease. + * Therefore the encrypted data size also needs to + * correspondingly decrease. We only need to subtract + * delta to calculate the new ciphertext length since + * ktls does not support block encryption. + */ + if (!WARN_ON_ONCE(!ctx->open_rec)) { + struct sk_msg *enc = &ctx->open_rec->msg_encrypted; + + sk_msg_trim(sk, enc, enc->sg.size - delta); + } + } } if (msg->cork_bytes && msg->cork_bytes > msg->sg.size && !enospc && !full_record) {
On 2025-05-23 21:18:58, Jiayuan Chen wrote:
When sending plaintext data, we initially calculated the corresponding ciphertext length. However, if we later reduced the plaintext data length via socket policy, we failed to recalculate the ciphertext length.
This results in transmitting buffers containing uninitialized data during ciphertext transmission.
This causes uninitialized bytes to be appended after a complete "Application Data" packet, leading to errors on the receiving end when parsing TLS record.
Fixes: d3b18ad31f93 ("tls: add bpf support to sk_msg handling") Reported-by: Cong Wang xiyou.wangcong@gmail.com Signed-off-by: Jiayuan Chen jiayuan.chen@linux.dev
LGTM thanks.
Reviewed-by: John Fastabend john.fastabend@gmail.com
On Fri, May 23, 2025 at 09:18:58PM +0800, Jiayuan Chen wrote:
When sending plaintext data, we initially calculated the corresponding ciphertext length. However, if we later reduced the plaintext data length via socket policy, we failed to recalculate the ciphertext length.
This results in transmitting buffers containing uninitialized data during ciphertext transmission.
This causes uninitialized bytes to be appended after a complete "Application Data" packet, leading to errors on the receiving end when parsing TLS record.
Fixes: d3b18ad31f93 ("tls: add bpf support to sk_msg handling") Reported-by: Cong Wang xiyou.wangcong@gmail.com Signed-off-by: Jiayuan Chen jiayuan.chen@linux.dev
net/tls/tls_sw.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+)
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c index fc88e34b7f33..b23a4655be6a 100644 --- a/net/tls/tls_sw.c +++ b/net/tls/tls_sw.c @@ -872,6 +872,21 @@ static int bpf_exec_tx_verdict(struct sk_msg *msg, struct sock *sk, delta = msg->sg.size; psock->eval = sk_psock_msg_verdict(sk, psock, msg); delta -= msg->sg.size;
if ((s32)delta > 0) {
/* It indicates that we executed bpf_msg_pop_data(),
* causing the plaintext data size to decrease.
* Therefore the encrypted data size also needs to
* correspondingly decrease. We only need to subtract
* delta to calculate the new ciphertext length since
* ktls does not support block encryption.
*/
if (!WARN_ON_ONCE(!ctx->open_rec)) {
I am wondering if we need to WARN here? Because the code below this handles it gracefully:
931 bool reset_eval = !ctx->open_rec; 932 933 rec = ctx->open_rec; 934 if (rec) { 935 msg = &rec->msg_plaintext; 936 if (!msg->apply_bytes) 937 reset_eval = true; 938 } 939 if (reset_eval) { 940 psock->eval = __SK_NONE; 941 if (psock->sk_redir) { 942 sock_put(psock->sk_redir); 943 psock->sk_redir = NULL; 944 } 945 }
Thanks for fixing it! Cong
2025/5/30 02:16, "Cong Wang" xiyou.wangcong@gmail.com 写到:
On Fri, May 23, 2025 at 09:18:58PM +0800, Jiayuan Chen wrote:
When sending plaintext data, we initially calculated the corresponding
ciphertext length. However, if we later reduced the plaintext data length
via socket policy, we failed to recalculate the ciphertext length.
This results in transmitting buffers containing uninitialized data during
ciphertext transmission.
This causes uninitialized bytes to be appended after a complete
"Application Data" packet, leading to errors on the receiving end when
parsing TLS record.
Fixes: d3b18ad31f93 ("tls: add bpf support to sk_msg handling")
Reported-by: Cong Wang xiyou.wangcong@gmail.com
Signed-off-by: Jiayuan Chen jiayuan.chen@linux.dev
net/tls/tls_sw.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index fc88e34b7f33..b23a4655be6a 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -872,6 +872,21 @@ static int bpf_exec_tx_verdict(struct sk_msg *msg, struct sock *sk,
delta = msg->sg.size;
psock->eval = sk_psock_msg_verdict(sk, psock, msg);
delta -= msg->sg.size;
if ((s32)delta > 0) {
/* It indicates that we executed bpf_msg_pop_data(),
- causing the plaintext data size to decrease.
- Therefore the encrypted data size also needs to
- correspondingly decrease. We only need to subtract
- delta to calculate the new ciphertext length since
- ktls does not support block encryption.
*/
if (!WARN_ON_ONCE(!ctx->open_rec)) {
I am wondering if we need to WARN here? Because the code below this
handles it gracefully:
Hi Cong
The ctx->open_rec is freed after a TLS record is processed (regardless of whether the redirect check passes or triggers a redirect). The 'if (rec)' check in the subsequent code you print is indeed designed to handle the expected lifecycle state of open_rec.
But the code path I modified should never see a NULL open_rec under normal operation As this is a bug fix, I need to ensure the fix itself doesn't create new issues.
Thanks.
931 bool reset_eval = !ctx->open_rec;
932
933 rec = ctx->open_rec;
934 if (rec) {
935 msg = &rec->msg_plaintext;
936 if (!msg->apply_bytes)
937 reset_eval = true;
938 }
939 if (reset_eval) {
940 psock->eval = __SK_NONE;
941 if (psock->sk_redir) {
942 sock_put(psock->sk_redir);
943 psock->sk_redir = NULL;
944 }
945 }
Thanks for fixing it!
Cong
The selftest can reproduce an issue where using bpf_msg_pop_data() in ktls causes errors on the receiving end.
Signed-off-by: Jiayuan Chen jiayuan.chen@linux.dev --- .../selftests/bpf/prog_tests/sockmap_ktls.c | 91 +++++++++++++++++++ .../selftests/bpf/progs/test_sockmap_ktls.c | 4 + 2 files changed, 95 insertions(+)
diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_ktls.c b/tools/testing/selftests/bpf/prog_tests/sockmap_ktls.c index b6c471da5c28..b87e7f39e15a 100644 --- a/tools/testing/selftests/bpf/prog_tests/sockmap_ktls.c +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_ktls.c @@ -314,6 +314,95 @@ static void test_sockmap_ktls_tx_no_buf(int family, int sotype, bool push) test_sockmap_ktls__destroy(skel); }
+static void test_sockmap_ktls_tx_pop(int family, int sotype) +{ + char msg[37] = "0123456789abcdefghijklmnopqrstuvwxyz\0"; + int c = 0, p = 0, one = 1, sent, recvd; + struct test_sockmap_ktls *skel; + int prog_fd, map_fd; + char rcv[50] = {0}; + int err; + int i, m, r; + + skel = test_sockmap_ktls__open_and_load(); + if (!ASSERT_TRUE(skel, "open ktls skel")) + return; + + err = create_pair(family, sotype, &c, &p); + if (!ASSERT_OK(err, "create_pair()")) + goto out; + + prog_fd = bpf_program__fd(skel->progs.prog_sk_policy); + map_fd = bpf_map__fd(skel->maps.sock_map); + + err = bpf_prog_attach(prog_fd, map_fd, BPF_SK_MSG_VERDICT, 0); + if (!ASSERT_OK(err, "bpf_prog_attach sk msg")) + goto out; + + err = bpf_map_update_elem(map_fd, &one, &c, BPF_NOEXIST); + if (!ASSERT_OK(err, "bpf_map_update_elem(c)")) + goto out; + + err = init_ktls_pairs(c, p); + if (!ASSERT_OK(err, "init_ktls_pairs(c, p)")) + goto out; + + struct { + int pop_start; + int pop_len; + } pop_policy[] = { + /* trim the start */ + {0, 2}, + {0, 10}, + {1, 2}, + {1, 10}, + /* trim the end */ + {35, 2}, + /* New entries should be added before this line */ + {-1, -1}, + }; + + i = 0; + while (pop_policy[i].pop_start >= 0) { + skel->bss->pop_start = pop_policy[i].pop_start; + skel->bss->pop_end = pop_policy[i].pop_len; + + sent = send(c, msg, sizeof(msg), 0); + if (!ASSERT_EQ(sent, sizeof(msg), "send(msg)")) + goto out; + + recvd = recv_timeout(p, rcv, sizeof(rcv), MSG_DONTWAIT, 1); + if (!ASSERT_EQ(recvd, sizeof(msg) - pop_policy[i].pop_len, "pop len mismatch")) + goto out; + + /* verify the data + * msg: 0123456789a bcdefghij klmnopqrstuvwxyz + * | | + * popped data + */ + for (m = 0, r = 0; m < sizeof(msg);) { + /* skip checking the data that has been popped */ + if (m >= pop_policy[i].pop_start && + m <= pop_policy[i].pop_start + pop_policy[i].pop_len - 1) { + m++; + continue; + } + + if (!ASSERT_EQ(msg[m], rcv[r], "data mismatch")) + goto out; + m++; + r++; + } + i++; + } +out: + if (c) + close(c); + if (p) + close(p); + test_sockmap_ktls__destroy(skel); +} + static void run_tests(int family, enum bpf_map_type map_type) { int map; @@ -338,6 +427,8 @@ static void run_ktls_test(int family, int sotype) test_sockmap_ktls_tx_cork(family, sotype, true); if (test__start_subtest("tls tx egress with no buf")) test_sockmap_ktls_tx_no_buf(family, sotype, true); + if (test__start_subtest("tls tx with pop")) + test_sockmap_ktls_tx_pop(family, sotype); }
void test_sockmap_ktls(void) diff --git a/tools/testing/selftests/bpf/progs/test_sockmap_ktls.c b/tools/testing/selftests/bpf/progs/test_sockmap_ktls.c index 8bdb9987c0c7..83df4919c224 100644 --- a/tools/testing/selftests/bpf/progs/test_sockmap_ktls.c +++ b/tools/testing/selftests/bpf/progs/test_sockmap_ktls.c @@ -7,6 +7,8 @@ int cork_byte; int push_start; int push_end; int apply_bytes; +int pop_start; +int pop_end;
struct { __uint(type, BPF_MAP_TYPE_SOCKMAP); @@ -22,6 +24,8 @@ int prog_sk_policy(struct sk_msg_md *msg) bpf_msg_cork_bytes(msg, cork_byte); if (push_start > 0 && push_end > 0) bpf_msg_push_data(msg, push_start, push_end, 0); + if (pop_start >= 0 && pop_end > 0) + bpf_msg_pop_data(msg, pop_start, pop_end, 0);
return SK_PASS; }
On 2025-05-23 21:18:59, Jiayuan Chen wrote:
The selftest can reproduce an issue where using bpf_msg_pop_data() in ktls causes errors on the receiving end.
Signed-off-by: Jiayuan Chen jiayuan.chen@linux.dev
Yep LGTM thanks.
Reviewed-by: John Fastabend john.fastabend@gmail.com
linux-kselftest-mirror@lists.linaro.org