On 12/7/20 1:55 PM, Weqaar Janjua wrote:
On Sat, 28 Nov 2020 at 03:13, Yonghong Song <yhs@fb.com> wrote:
On 11/27/20 9:54 AM, Weqaar Janjua wrote:
On Fri, 27 Nov 2020 at 04:19, Yonghong Song <yhs@fb.com> wrote:
On 11/26/20 1:22 PM, Weqaar Janjua wrote:
On Thu, 26 Nov 2020 at 09:01, Björn Töpel <bjorn.topel@intel.com> wrote:
On 2020-11-26 07:44, Yonghong Song wrote:
> [...]
>
> What other configures I am missing?
>
> BTW, I cherry-picked the following pick from bpf tree in this experiment.
> commit e7f4a5919bf66e530e08ff352d9b78ed89574e6b (HEAD -> xsk)
> Author: Björn Töpel <bjorn.topel@intel.com>
> Date: Mon Nov 23 18:56:00 2020 +0100
>
> net, xsk: Avoid taking multiple skbuff references
>
Hmm, I'm getting an oops, unless I cherry-pick:
36ccdf85829a ("net, xsk: Avoid taking multiple skbuff references")
*AND*
537cf4e3cc2f ("xsk: Fix umem cleanup bug at socket destruct")
from bpf/master.
Same as Björn's findings ^^^; additionally applying the second patch 537cf4e3cc2f gives [PASS] on all tests for me:
PREREQUISITES: [ PASS ]
SKB NOPOLL: [ PASS ]
SKB POLL: [ PASS ]
DRV NOPOLL: [ PASS ]
DRV POLL: [ PASS ]
SKB SOCKET TEARDOWN: [ PASS ]
DRV SOCKET TEARDOWN: [ PASS ]
SKB BIDIRECTIONAL SOCKETS: [ PASS ]
DRV BIDIRECTIONAL SOCKETS: [ PASS ]
With the first patch alone, the kernel panics as soon as we enter DRV/Native NOPOLL mode, whereas in your case the NOPOLL tests were failing with packets being *lost*, as indicated by the seqnum mismatch.
Can you please test this out with both patches and let us know?
I applied both of the above patches to bpf-next, as well as this patch set, and I still see failures. I am attaching my config file. Maybe you can take a look at what the issue is.
Thanks for the config, can you please confirm the compiler version, and resource limits i.e. stack size, memory, etc.?
root@arch-fb-vm1:~/net-next/net-next/tools/testing/selftests/bpf ulimit -a
core file size (blocks, -c) unlimited
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 15587
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 15587
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
compiler: gcc 8.2
As I see it, only the NOPOLL tests are failing for you. Do the same tests fail every time?
In my case, with the above two bpf patches applied as well, I got:
$ ./test_xsk.sh
setting up ve9127: root: 192.168.222.1/30
setting up ve4520: af_xdp4520: 192.168.222.2/30
Spec file created: veth.spec
PREREQUISITES: [ PASS ]
# Interface found: ve9127
# Interface found: ve4520
# NS switched: af_xdp4520
1..1
# Interface [ve4520] vector [Rx]
# Interface [ve9127] vector [Tx]
# Sending 10000 packets on interface ve9127
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [59], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
SKB NOPOLL: [ FAIL ]
# Interface found: ve9127
# Interface found: ve4520
# NS switched: af_xdp4520
# NS switched: af_xdp4520
1..1
# Interface [ve4520] vector [Rx]
# Interface [ve9127] vector [Tx]
# Sending 10000 packets on interface ve9127
# End-of-tranmission frame received: PASS
# Received 10000 packets on interface ve4520
ok 1 PASS: SKB POLL
# Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
SKB POLL: [ PASS ]
# Interface found: ve9127
# Interface found: ve4520
# NS switched: af_xdp4520
1..1
# Interface [ve4520] vector [Rx]
# Interface [ve9127] vector [Tx]
# Sending 10000 packets on interface ve9127
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [153], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
DRV NOPOLL: [ FAIL ]
# Interface found: ve9127
# Interface found: ve4520
# NS switched: af_xdp4520
1..1
# Interface [ve4520] vector [Rx]
# Interface [ve9127] vector [Tx]
# Sending 10000 packets on interface ve9127
# End-of-tranmission frame received: PASS
# Received 10000 packets on interface ve4520
ok 1 PASS: DRV POLL
# Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
DRV POLL: [ PASS ]
# Interface found: ve9127
# Interface found: ve4520
# NS switched: af_xdp4520
1..1
# Creating socket
# Interface [ve4520] vector [Rx]
# Interface [ve9127] vector [Tx]
# Sending 10000 packets on interface ve9127
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [54], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
SKB SOCKET TEARDOWN: [ FAIL ]
# Interface found: ve9127
# Interface found: ve4520
# NS switched: af_xdp4520
1..1
# Creating socket
# Interface [ve4520] vector [Rx]
# Interface [ve9127] vector [Tx]
# Sending 10000 packets on interface ve9127
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [0], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
DRV SOCKET TEARDOWN: [ FAIL ]
# Interface found: ve9127
# Interface found: ve4520
# NS switched: af_xdp4520
1..1
# Creating socket
# Interface [ve4520] vector [Rx]
# Interface [ve9127] vector [Tx]
# Sending 10000 packets on interface ve9127
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [64], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
SKB BIDIRECTIONAL SOCKETS: [ FAIL ]
# Interface found: ve9127
# Interface found: ve4520
# NS switched: af_xdp4520
1..1
# Creating socket
# Interface [ve4520] vector [Rx]
# Interface [ve9127] vector [Tx]
# Sending 10000 packets on interface ve9127
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [83], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
DRV BIDIRECTIONAL SOCKETS: [ FAIL ]
cleaning up...
removing link ve4520
removing ns af_xdp4520
removing spec file: veth.spec
On the second run, one test that previously passed (DRV POLL) now fails.
$ ./test_xsk.sh
setting up ve2458: root: 192.168.222.1/30
setting up ve4468: af_xdp4468: 192.168.222.2/30
[ 286.597111] IPv6: ADDRCONF(NETDEV_CHANGE): ve4468: link becomes ready
Spec file created: veth.spec
PREREQUISITES: [ PASS ]
# Interface found: ve2458
# Interface found: ve4468
# NS switched: af_xdp4468
1..1
# Interface [ve4468] vector [Rx]
# Interface [ve2458] vector [Tx]
# Sending 10000 packets on interface ve2458
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [67], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
SKB NOPOLL: [ FAIL ]
# Interface found: ve2458
# Interface found: ve4468
# NS switched: af_xdp4468
1..1
# Interface [ve4468] vector [Rx]
# Interface [ve2458] vector [Tx]
# Sending 10000 packets on interface ve2458
# End-of-tranmission frame received: PASS
# Received 10000 packets on interface ve4468
ok 1 PASS: SKB POLL
# Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
SKB POLL: [ PASS ]
# Interface found: ve2458
# Interface found: ve4468
# NS switched: af_xdp4468
1..1
# Interface [ve4468] vector [Rx]
# Interface [ve2458] vector [Tx]
# Sending 10000 packets on interface ve2458
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [191], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
DRV NOPOLL: [ FAIL ]
# Interface found: ve2458
# Interface found: ve4468
# NS switched: af_xdp4468
1..1
# Interface [ve4468] vector [Rx]
# Interface [ve2458] vector [Tx]
# Sending 10000 packets on interface ve2458
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [0], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
DRV POLL: [ FAIL ]
# Interface found: ve2458
# Interface found: ve4468
# NS switched: af_xdp4468
1..1
# Creating socket
# Interface [ve4468] vector [Rx]
# Interface [ve2458] vector [Tx]
# Sending 10000 packets on interface ve2458
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [0], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
SKB SOCKET TEARDOWN: [ FAIL ]
# Interface found: ve2458
# Interface found: ve4468
# NS switched: af_xdp4468
1..1
# Creating socket
# Interface [ve4468] vector [Rx]
# Interface [ve2458] vector [Tx]
# Sending 10000 packets on interface ve2458
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [171], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
DRV SOCKET TEARDOWN: [ FAIL ]
# Interface found: ve2458
# Interface found: ve4468
# NS switched: af_xdp4468
1..1
# Creating socket
# Interface [ve4468] vector [Rx]
# Interface [ve2458] vector [Tx]
# Sending 10000 packets on interface ve2458
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [124], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
SKB BIDIRECTIONAL SOCKETS: [ FAIL ]
# Interface found: ve2458
# Interface found: ve4468
# NS switched: af_xdp4468
1..1
# Creating socket
# Interface [ve4468] vector [Rx]
# Interface [ve2458] vector [Tx]
# Sending 10000 packets on interface ve2458
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [195], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
DRV BIDIRECTIONAL SOCKETS: [ FAIL ]
cleaning up...
removing link ve4468
removing ns af_xdp4468
removing spec file: veth.spec
I will need to spend some time debugging this to have a fix.
Thanks.
Thanks, /Weqaar
Can I just run test_xsk.sh in the tools/testing/selftests/bpf/ directory? This will be easier than the above for bpf developers. If it does not work, I would recommend making it work.
Yes, test_xsk.sh is self-contained; I will update the instructions in there with v4.
That will be great. Thanks!
v4 is out on the list, incorporating most if not all your suggestions to the best of my memory.
I was able to reproduce the issue you were seeing (from your logs) -> veth interfaces were receiving packets from the IPv6 neighboring system (thanks @Björn Töpel for mentioning this).
The packet validation algo in *xdpxceiver* *assumed* all packets would be IPv4 and intended for Rx. Rx now validates packets on both ip->tos = 0x9 (id for xsk tests) and ip->version = 0x4, and ignores the rest.
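
For readers following along, the Rx-side check described above can be sketched roughly as follows. This is an illustration only and not the actual xdpxceiver code: the function name is_xsk_test_packet, the macro names, and the raw byte-offset parsing of an untagged Ethernet frame are assumptions made here. Only the two values (version 0x4, tos 0x9) come from the message above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values taken from the thread; the macro names are hypothetical. */
#define XSK_TEST_IP_VERSION 0x4
#define XSK_TEST_IP_TOS     0x9

/* Assumes a plain (untagged) Ethernet header and a minimal IPv4 header. */
#define ETH_HDR_LEN      14
#define IPV4_MIN_HDR_LEN 20

/*
 * Return true only for frames that look like xsk test traffic:
 * IP version 4 and TOS 0x9. Anything else (for example IPv6
 * traffic arriving on the veth pair) is reported as "not ours",
 * so the caller can skip it instead of treating it as a
 * sequence-number mismatch.
 */
static bool is_xsk_test_packet(const uint8_t *frame, size_t len)
{
	const uint8_t *ip;

	if (len < ETH_HDR_LEN + IPV4_MIN_HDR_LEN)
		return false;

	ip = frame + ETH_HDR_LEN;

	/* High nibble of the first IP byte is the version field. */
	if ((ip[0] >> 4) != XSK_TEST_IP_VERSION)
		return false;

	/* Second byte of the IPv4 header is the TOS field. */
	return ip[1] == XSK_TEST_IP_TOS;
}
```

With a filter like this in front of the sequence check, a stray IPv6 frame would simply be skipped rather than being read as payloadseqnum [0].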
Hoping the tests now work -> PASS in your environment.
Yes, now all tests pass in my environment. I will reply to v4 with a Tested-by tag. Now I think xsk people can really look at the details.
Thanks, /Weqaar
Thanks, /Weqaar
Björn