In test_sockmap.c, the testcase sets the socket to non-blocking first, and then calls select() and recvmsg() to receive data. If an error occurs, the non-blocking setting makes recvmsg() return immediately rather than block forever.
However, fcntl() is called incorrectly when setting non-blocking mode. To make the socket non-blocking, we need to use:
fcntl(fd, F_SETFL, O_NONBLOCK);
rather than:
fcntl(fd, O_NONBLOCK);
Signed-off-by: Qiao Ma mqaio@linux.alibaba.com
---
 tools/testing/selftests/bpf/test_sockmap.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/test_sockmap.c b/tools/testing/selftests/bpf/test_sockmap.c
index 0fbaccdc8861..abb4102f33b0 100644
--- a/tools/testing/selftests/bpf/test_sockmap.c
+++ b/tools/testing/selftests/bpf/test_sockmap.c
@@ -598,7 +598,12 @@ static int msg_loop(int fd, int iov_count, int iov_length, int cnt,
 	struct timeval timeout;
 	fd_set w;
 
-	fcntl(fd, fd_flags);
+	err = fcntl(fd, F_SETFL, fd_flags);
+	if (err < 0) {
+		perror("fcntl failed");
+		goto out_errno;
+	}
+
 	/* Account for pop bytes noting each iteration of apply will
 	 * call msg_pop_data helper so we need to account for this
 	 * by calculating the number of apply iterations. Note user
On Wed, Aug 24, 2022 at 7:11 PM Qiao Ma mqaio@linux.alibaba.com wrote:
[...]
John, Jakub,
Please review this.
Unfortunately test_sockmap (and the sockmap kernel code) is broken before and after this patch, so I'm hesitant to apply it, so as not to make things harder to debug.
Here is what I see:
# ./test_sockmap
recv failed(): Bad address
rx thread exited with err 1.
recv failed(): Bad address
rx thread exited with err 1.
recv failed(): Bad address
rx thread exited with err 1.
# 4/ 6 sockmap::txmsg test ingress redirect:FAIL
detected skb data error with skb ingress update @iov[0]:0 "00 00 00 00" != "PASS"
data verify msg failed: Unknown error -5
rx thread exited with err 1.
[ 16.735850] ------------[ cut here ]------------
[ 16.736195] WARNING: CPU: 3 PID: 1480 at net/core/stream.c:205 sk_stream_kill_queues+0x18a/0x1a0
[ 16.736799] Modules linked in:
[ 16.737007] CPU: 3 PID: 1480 Comm: test_sockmap Not tainted 5.19.0-14050-g343949e10798 #4212
[ 16.737543] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
[ 16.738522] RIP: 0010:sk_stream_kill_queues+0x18a/0x1a0
[ 16.738883] Code: 41 5f c3 89 ee 48 89 df e8 13 69 fe ff 4c 89 e7 e8 db 01 4f ff 8b ab 28 02 00 00 eb c1 0f 0b e9 62 ff ff ff 0f 0b 85 ed 74 ce <0f> 0b 5b 5d 41 5c 41 5d 41 5e 41 5f c3 66 0f 1f 84 00 00 00 00 00
[ 16.740082] RSP: 0018:ffff888118007ab8 EFLAGS: 00010206
[ 16.740435] RAX: 0000000000000000 RBX: ffff8881174b2400 RCX: ffffffff81f5d47a
[ 16.740915] RDX: 0000000000000003 RSI: dffffc0000000000 RDI: ffff8881174b2670
[ 16.741374] RBP: 0000000000000b00 R08: fffffbfff08cfbdf R09: fffffbfff08cfbdf
[ 16.741841] R10: ffffffff8467def7 R11: fffffbfff08cfbde R12: ffff8881174b2628
[ 16.742298] R13: ffffffff85c13c00 R14: ffff88810611d824 R15: ffff8881174b25c8
[ 16.742768] FS: 00007f937403c400(0000) GS:ffff888628f80000(0000) knlGS:0000000000000000
[ 16.743283] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 16.743657] CR2: 00007f937403c378 CR3: 0000000125983005 CR4: 00000000003706e0
[ 16.744124] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 16.744584] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 16.745065] Call Trace:
[ 16.745241] <TASK>
[ 16.745401] inet_csk_destroy_sock+0x9f/0x1c0
[ 16.745707] tcp_rcv_state_process+0x1a84/0x2120
[ 16.746013] ? lock_release+0x3b0/0x3b0
[ 16.746271] ? tcp_established_options+0x189/0x300
[ 16.746592] ? tcp_finish_connect+0x240/0x240
[ 16.746894] ? rcu_read_lock_bh_held+0xa0/0xa0
[ 16.747190] ? rcu_read_lock_held_common+0x1a/0x50
[ 16.747505] ? rcu_read_lock_sched_held+0x56/0xc0
[ 16.747823] ? rcu_read_lock_held_common+0x1a/0x50
[ 16.748139] ? lock_release+0xad/0x3b0
[ 16.748390] ? __release_sock+0x83/0x150
[ 16.748646] ? lock_downgrade+0x360/0x360
[ 16.748932] ? lock_acquire+0xab/0x380
[ 16.749187] ? lock_downgrade+0x360/0x360
[ 16.749440] ? tcp_v4_do_rcv+0x195/0x4f0
[ 16.749696] tcp_v4_do_rcv+0x195/0x4f0
[ 16.749933] ? trace_hardirqs_on+0x2d/0xe0
[ 16.750194] __release_sock+0xb9/0x150
...
[ 213.529908] ------------[ cut here ]------------
[ 213.530232] page_counter underflow: -7 nr_pages=31
[ 213.530571] WARNING: CPU: 1 PID: 1925 at mm/page_counter.c:57 page_counter_cancel+0x56/0x80
[ 213.531125] Modules linked in:
[ 213.531332] CPU: 1 PID: 1925 Comm: test_sockmap Tainted: G W 5.19.0-14050-g343949e10798 #4212
[ 213.531989] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
[ 213.532734] RIP: 0010:page_counter_cancel+0x56/0x80
[ 213.533070] Code: 0a 48 89 df 5b 5d e9 89 fe ff ff 80 3d d1 b6 1e 03 00 75 18 48 89 ea 48 c7 c7 60 6a b3 82 c6 05 be b6 1e 03 01 e8 aa aa 0b 01 <0f> 0b be 08 00 00 00 48 89 df e8 bb 22 fe ff 48 89 df e8 13 1a fe
[ 213.534259] RSP: 0018:ffff88811ae2f8b0 EFLAGS: 00010086
[ 213.534607] RAX: 0000000000000000 RBX: ffff888100c08120 RCX: 0000000000000000
[ 213.535071] RDX: 0000000000000027 RSI: dffffc0000000000 RDI: ffffed10235c5f0c
[ 213.535535] RBP: 000000000000001f R08: ffffed10c51d4ee6 R09: ffffed10c51d4ee6
[ 213.536000] R10: ffff888628ea772b R11: ffffed10c51d4ee5 R12: ffff888628eb5d18
[ 213.536479] R13: ffff888628eb5d10 R14: ffff888628eb5d18 R15: 000000000000001f
[ 213.536953] FS: 00007fe0f0478400(0000) GS:ffff888628e80000(0000) knlGS:0000000000000000
[ 213.537463] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 213.537858] CR2: 00007fe0f04786d0 CR3: 000000011ab33004 CR4: 00000000003706e0
[ 213.538325] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 213.538838] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 213.539318] Call Trace:
[ 213.539488] <TASK>
[ 213.539635] page_counter_uncharge+0x1d/0x40
[ 213.539927] drain_stock.isra.54+0x5d/0xb0
[ 213.540200] __refill_stock+0x42/0xb0
[ 213.540451] refill_stock+0xc1/0x1c0
[ 213.540688] try_charge_memcg+0xb2e/0xb50
and test_sockmap 'hangs' (or is doing something for a long time) after:
#31/ 6 sockhash:ktls:txmsg test drop:OK
Alexei Starovoitov wrote:
[...]
Thanks for spotting, I'll take a look.
On Fri, Aug 26, 2022 at 9:24 AM John Fastabend john.fastabend@gmail.com wrote:
[...]
Thanks for spotting, I'll take a look.
Friendly ping. John, did you get a chance to look at this? This patch is still marked as "Needs ACK" in Patchwork.
Andrii Nakryiko wrote:
[...]
Friendly ping. John, did you get a chance to look at this? This patch is still marked as "Needs ACK" in Patchwork.
Yep thanks. We are tracking a couple fixes internally around this so should have something pop out soon. I think we want the fix and test to go in at the same time.
.John
On 9/22/22 8:46 PM, John Fastabend wrote:
[...]
Yep thanks. We are tracking a couple fixes internally around this so should have something pop out soon. I think we want the fix and test to go in at the same time.
Ok, I'll mark it as 'awaiting upstream' assuming that you carry this fix forward together with your series then.
Thanks, Daniel