On 02/09/2018 06:14 AM, Li Zhijian wrote:
> Hi
>
> INTEL 0-Day noticed that bpf/test_maps gives different results on different
> platforms. When it fails, the details look like this:

Sorry for the late reply and thanks for reporting! More below:

> 880 Failed to create hashmap key=16 value=131072 'Cannot allocate memory'
> 881 Failed to create hashmap key=8 value=32768 'Cannot allocate memory'
> 882 Failed to create hashmap key=8 value=131072 'Cannot allocate memory'
> 883 Failed to create hashmap key=16 value=32768 'Cannot allocate memory'
> 884 Failed to create hashmap key=8 value=16384 'Cannot allocate memory'
> 885 Failed to create hashmap key=16 value=16384 'Cannot allocate memory'
> 886 Failed to create hashmap key=8 value=65536 'Cannot allocate memory'
> 887 Failed to create hashmap key=16 value=131072 'Cannot allocate memory'
> 888 Failed to create hashmap key=16 value=32768 'Cannot allocate memory'
> 889 Failed to create hashmap key=16 value=65536 'Cannot allocate memory'
> 890 Failed to create hashmap key=8 value=65536 'Cannot allocate memory'
> 891 Failed to create hashmap key=8 value=131072 'Cannot allocate memory'
> 892 Failed to create hashmap key=8 value=131072 'Cannot allocate memory'
> 893 Failed to create hashmap key=16 value=32768 'Cannot allocate memory'
> 894 Failed to create hashmap key=8 value=16384 'Cannot allocate memory'
> 895 Failed to create hashmap key=8 value=131072 'Cannot allocate memory'
> 896 Failed to create hashmap key=16 value=8192 'Cannot allocate memory'
> 897 Failed to create hashmap key=8 value=32768 'Cannot allocate memory'
> 898 Failed to create hashmap key=16 value=8192 'Cannot allocate memory'
> 899 Failed to create hashmap key=8 value=262144 'Cannot allocate memory'
> 900 Failed to create hashmap key=8 value=262144 'Cannot allocate memory'
> 901 Failed to create hashmap key=8 value=262144 'Cannot allocate memory'
> 902 Failed to create hashmap key=16 value=262144 'Cannot allocate memory'
> 903 Failed to create hashmap key=8 value=262144 'Cannot allocate memory'
> 904 Failed to create hashmap key=8 value=262144 'Cannot allocate memory'
> 905 test_maps: test_maps.c:955: run_parallel: Assertion `status == 0' failed.
> 906 Aborted
> 907 not ok 1..3 selftests: test_maps [FAIL]
>
> After a quick look at the code, it seems to be related to the CPU count and
> the amount of system memory. Below are the results on different platforms:
>
> - Good
>   model: Sandy Bridge
>   nr_node: 1
>   nr_cpu: 4
>   memory: 6G
>
> - Good
>   model: qemu-system-x86_64 -enable-kvm
>   nr_cpu: 2
>   memory: 4G
>
> - Bad
>   model: Ivytown Ivy Bridge-EP
>   nr_cpu: 48
>   memory: 64G
>
> - Bad
>   model: Skylake
>   nr_cpu: 104
>   memory: 64G
>
> I tried changing the process count from 100 to 10, and with that change the
> test passes on the Skylake (4) machine above.
>
> lizhijian@haswell-OptiPlex-9020:~/lkp/linux/tools/testing/selftests/bpf$ git diff
> diff --git a/tools/testing/selftests/bpf/test_maps.c b/tools/testing/selftests/bpf/test_maps.c
> index 040356e..b788ca1 100644
> --- a/tools/testing/selftests/bpf/test_maps.c
> +++ b/tools/testing/selftests/bpf/test_maps.c
> @@ -960,7 +960,7 @@ static void test_map_stress(void)
>  {
>  	run_parallel(100, test_hashmap, NULL);
>  	run_parallel(100, test_hashmap_percpu, NULL);
> -	run_parallel(100, test_hashmap_sizes, NULL);
> +	run_parallel(10, test_hashmap_sizes, NULL);
>  	run_parallel(100, test_hashmap_walk, NULL);
>  	run_parallel(100, test_arraymap, NULL);

Unless Alexei has a better idea, I think that if the bpf_create_map() error in
the stress test is ENOMEM, we shouldn't fail hard via exit(); in all other
cases we still should. So it probably makes sense to check for
errno == ENOMEM when fd < 0 in test_hashmap_sizes() and then keep trying
under stress. Feel free to send a patch, Li.

Thanks again,
Daniel
--
To unsubscribe from this list: send the line "unsubscribe linux-kselftest" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

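For illustration, the ENOMEM handling suggested above could be sketched in C
roughly as follows. This is only a sketch under stated assumptions, not an
actual patch against test_maps.c: map_create_fn and map_release_fn are
hypothetical stand-ins for bpf_create_map() and close(), the struct and helper
names are made up for this example, and the size bounds are parameters rather
than the values used by the real test. The fake creators exist only so the
logic can be exercised without touching the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for bpf_create_map() and close(): create returns
 * an fd >= 0 on success, or -1 with errno set on failure. */
typedef int (*map_create_fn)(int key_size, int value_size);
typedef void (*map_release_fn)(int fd);

struct sizes_result {
	int created;  /* maps created (and released) successfully */
	int skipped;  /* creations that failed with ENOMEM and were tolerated */
	int hard_err; /* errno of the first non-ENOMEM failure, 0 if none */
};

/* Walk power-of-two key/value sizes. ENOMEM is counted and skipped so the
 * walk keeps going under memory pressure; any other error stops the walk
 * and is handed back so the caller can fail hard (e.g. via exit(1)). */
static struct sizes_result try_hashmap_sizes(map_create_fn create,
					     map_release_fn release,
					     int max_key, int max_value)
{
	struct sizes_result res = { 0, 0, 0 };

	assert(create != NULL && release != NULL);

	for (int k = 1; k <= max_key; k <<= 1) {
		for (int v = 1; v <= max_value; v <<= 1) {
			int fd = create(k, v);

			if (fd >= 0) {
				release(fd);
				res.created++;
			} else if (errno == ENOMEM) {
				res.skipped++;
			} else {
				res.hard_err = errno;
				return res;
			}
		}
	}
	return res;
}

/* Fake creators for illustration only; they do not touch the kernel. */
static int fake_create_nomem(int key_size, int value_size)
{
	(void)key_size;
	if (value_size >= 8192) {
		errno = ENOMEM; /* pretend large values cannot be allocated */
		return -1;
	}
	return 3; /* arbitrary fake fd */
}

static int fake_create_einval(int key_size, int value_size)
{
	(void)key_size;
	(void)value_size;
	errno = EINVAL; /* a failure that should not be tolerated */
	return -1;
}

static void fake_release(int fd)
{
	(void)fd;
}
```

With fake_create_nomem the walk tolerates every ENOMEM failure and reports no
hard error. With fake_create_einval it stops at the first call and returns
EINVAL in hard_err, at which point a real test would print the error and
exit(1) as it does today.
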
On Fri, Feb 09, 2018 at 03:01:57PM +0100, Daniel Borkmann wrote:
> Unless Alexei has a better idea, I think that if the bpf_create_map() error in
> the stress test is ENOMEM, we shouldn't fail hard via exit(); in all other
> cases we still should. So it probably makes sense to check for
> errno == ENOMEM when fd < 0 in test_hashmap_sizes() and then keep trying
> under stress. Feel free to send a patch, Li.

That's probably a good path for now. I also see test_maps fail with this
assert on a freshly booted kernel, but restarting test_maps then works, and
repeated runs succeed too. I suspect there is a deeper issue here related to
memory allocation: either the slab or the percpu allocator is behaving
strangely. It needs further debugging.
linux-kselftest-mirror@lists.linaro.org