While running kselftest net tls test case on Linux next and mainline kernels the test case fails at following sub test cases and hangs every time. Please investigate this hang issue.
kconfigs are generated from kselftest-merge config.
metadata: git_ref: master git_repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next git_sha: b61581ae229d8eb9f21f8753be3f4011f7692384 git_describe: next-20220323 kernel_version: 5.17.0 kernel-config: https://builds.tuxbuild.com/26mKij4yB5Q6WUpOyHHEoHLstVJ/config
Test log link, -------------- # selftests: net: tls # TAP version 13 # 1..502 # # Starting 502 tests from 14 test cases. # # RUN global.non_established ... # # OK global.non_established # ok 1 global.non_established # # RUN global.keysizes ... # # OK global.keysizes <trim>
# # RUN tls.12_aes_gcm.splice_cmsg_to_pipe ... # # tls.c:688:splice_cmsg_to_pipe:Expected splice(self->cfd, NULL, p[1], NULL, send_len, 0) (10) == -1 (-1) # # tls.c:689:splice_cmsg_to_pipe:Expected errno (2) == EINVAL (22) # # splice_cmsg_to_pipe: Test terminated by timeout # # FAIL tls.12_aes_gcm.splice_cmsg_to_pipe # not ok 21 tls.12_aes_gcm.splice_cmsg_to_pipe # # RUN tls.12_aes_gcm.splice_dec_cmsg_to_pipe ... # # tls.c:708:splice_dec_cmsg_to_pipe:Expected recv(self->cfd, buf, send_len, 0) (10) == -1 (-1) # # tls.c:709:splice_dec_cmsg_to_pipe:Expected errno (2) == EIO (5) # # splice_dec_cmsg_to_pipe: Test terminated by timeout # # FAIL tls.12_aes_gcm.splice_dec_cmsg_to_pipe # not ok 22 tls.12_aes_gcm.splice_dec_cmsg_to_pipe # # RUN tls.12_aes_gcm.recv_and_splice ... # # OK tls.12_aes_gcm.recv_and_splice
<trim>
# # RUN tls.13_aes_gcm.splice_cmsg_to_pipe ... # # tls.c:688:splice_cmsg_to_pipe:Expected splice(self->cfd, NULL, p[1], NULL, send_len, 0) (10) == -1 (-1) # # tls.c:689:splice_cmsg_to_pipe:Expected errno (2) == EINVAL (22) # # splice_cmsg_to_pipe: Test terminated by timeout # # FAIL tls.13_aes_gcm.splice_cmsg_to_pipe # not ok 70 tls.13_aes_gcm.splice_cmsg_to_pipe # # RUN tls.13_aes_gcm.splice_dec_cmsg_to_pipe ... # # tls.c:708:splice_dec_cmsg_to_pipe:Expected recv(self->cfd, buf, send_len, 0) (10) == -1 (-1) # # tls.c:709:splice_dec_cmsg_to_pipe:Expected errno (2) == EIO (5) # # splice_dec_cmsg_to_pipe: Test terminated by timeout # # FAIL tls.13_aes_gcm.splice_dec_cmsg_to_pipe # not ok 71 tls.13_aes_gcm.splice_dec_cmsg_to_pipe
<trim>
# # RUN tls.12_chacha.splice_cmsg_to_pipe ... # # tls.c:688:splice_cmsg_to_pipe:Expected splice(self->cfd, NULL, p[1], NULL, send_len, 0) (10) == -1 (-1) # # tls.c:689:splice_cmsg_to_pipe:Expected errno (2) == EINVAL (22) # # splice_cmsg_to_pipe: Test terminated by timeout # # FAIL tls.12_chacha.splice_cmsg_to_pipe # not ok 119 tls.12_chacha.splice_cmsg_to_pipe # # RUN tls.12_chacha.splice_dec_cmsg_to_pipe ... # # tls.c:708:splice_dec_cmsg_to_pipe:Expected recv(self->cfd, buf, send_len, 0) (10) == -1 (-1) # # tls.c:709:splice_dec_cmsg_to_pipe:Expected errno (2) == EIO (5) # # splice_dec_cmsg_to_pipe: Test terminated by timeout # # FAIL tls.12_chacha.splice_dec_cmsg_to_pipe # not ok 120 tls.12_chacha.splice_dec_cmsg_to_pipe
<trim>
# # RUN tls.13_chacha.splice_cmsg_to_pipe ... # # tls.c:688:splice_cmsg_to_pipe:Expected splice(self->cfd, NULL, p[1], NULL, send_len, 0) (10) == -1 (-1) # # tls.c:689:splice_cmsg_to_pipe:Expected errno (2) == EINVAL (22) # # splice_cmsg_to_pipe: Test terminated by timeout # # FAIL tls.13_chacha.splice_cmsg_to_pipe # not ok 168 tls.13_chacha.splice_cmsg_to_pipe # # RUN tls.13_chacha.splice_dec_cmsg_to_pipe ... # # tls.c:708:splice_dec_cmsg_to_pipe:Expected recv(self->cfd, buf, send_len, 0) (10) == -1 (-1) # # tls.c:709:splice_dec_cmsg_to_pipe:Expected errno (2) == EIO (5) # # splice_dec_cmsg_to_pipe: Test terminated by timeout # # FAIL tls.13_chacha.splice_dec_cmsg_to_pipe # not ok 169 tls.13_chacha.splice_dec_cmsg_to_pipe
<trim>
# # RUN tls.13_sm4_gcm.splice_cmsg_to_pipe ... # # tls.c:688:splice_cmsg_to_pipe:Expected splice(self->cfd, NULL, p[1], NULL, send_len, 0) (10) == -1 (-1) # # tls.c:689:splice_cmsg_to_pipe:Expected errno (2) == EINVAL (22) # # splice_cmsg_to_pipe: Test terminated by timeout # # FAIL tls.13_sm4_gcm.splice_cmsg_to_pipe # not ok 217 tls.13_sm4_gcm.splice_cmsg_to_pipe # # RUN tls.13_sm4_gcm.splice_dec_cmsg_to_pipe ... # # tls.c:708:splice_dec_cmsg_to_pipe:Expected recv(self->cfd, buf, send_len, 0) (10) == -1 (-1) # # tls.c:709:splice_dec_cmsg_to_pipe:Expected errno (2) == EIO (5) [ 661.901558] kworker/dying (49) used greatest stack depth: 10576 bytes left
Test case HANG here.
Full test log links [1] including boot log and test run log.
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
-- Linaro LKFT https://lkft.linaro.org
https://lkft.validation.linaro.org/scheduler/job/4770773#L2700
On 3/25/22 1:40 AM, Naresh Kamboju wrote:
While running kselftest net tls test case on Linux next and mainline kernels the test case fails at following sub test cases and hangs every time. Please investigate this hang issue.
kconfigs are generated from kselftest-merge config.
metadata: git_ref: master git_repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next git_sha: b61581ae229d8eb9f21f8753be3f4011f7692384 git_describe: next-20220323 kernel_version: 5.17.0 kernel-config: https://builds.tuxbuild.com/26mKij4yB5Q6WUpOyHHEoHLstVJ/config
Test log link,
# selftests: net: tls # TAP version 13 # 1..502 # # Starting 502 tests from 14 test cases. # # RUN global.non_established ... # # OK global.non_established # ok 1 global.non_established # # RUN global.keysizes ... # # OK global.keysizes
<trim>
# # RUN tls.12_aes_gcm.splice_cmsg_to_pipe ... # # tls.c:688:splice_cmsg_to_pipe:Expected splice(self->cfd, NULL, p[1], NULL, send_len, 0) (10) == -1 (-1) # # tls.c:689:splice_cmsg_to_pipe:Expected errno (2) == EINVAL (22) # # splice_cmsg_to_pipe: Test terminated by timeout # # FAIL tls.12_aes_gcm.splice_cmsg_to_pipe # not ok 21 tls.12_aes_gcm.splice_cmsg_to_pipe # # RUN tls.12_aes_gcm.splice_dec_cmsg_to_pipe ... # # tls.c:708:splice_dec_cmsg_to_pipe:Expected recv(self->cfd, buf, send_len, 0) (10) == -1 (-1) # # tls.c:709:splice_dec_cmsg_to_pipe:Expected errno (2) == EIO (5) # # splice_dec_cmsg_to_pipe: Test terminated by timeout # # FAIL tls.12_aes_gcm.splice_dec_cmsg_to_pipe # not ok 22 tls.12_aes_gcm.splice_dec_cmsg_to_pipe # # RUN tls.12_aes_gcm.recv_and_splice ... # # OK tls.12_aes_gcm.recv_and_splice
<trim>
# # RUN tls.13_aes_gcm.splice_cmsg_to_pipe ... # # tls.c:688:splice_cmsg_to_pipe:Expected splice(self->cfd, NULL, p[1], NULL, send_len, 0) (10) == -1 (-1) # # tls.c:689:splice_cmsg_to_pipe:Expected errno (2) == EINVAL (22) # # splice_cmsg_to_pipe: Test terminated by timeout # # FAIL tls.13_aes_gcm.splice_cmsg_to_pipe # not ok 70 tls.13_aes_gcm.splice_cmsg_to_pipe # # RUN tls.13_aes_gcm.splice_dec_cmsg_to_pipe ... # # tls.c:708:splice_dec_cmsg_to_pipe:Expected recv(self->cfd, buf, send_len, 0) (10) == -1 (-1) # # tls.c:709:splice_dec_cmsg_to_pipe:Expected errno (2) == EIO (5) # # splice_dec_cmsg_to_pipe: Test terminated by timeout # # FAIL tls.13_aes_gcm.splice_dec_cmsg_to_pipe # not ok 71 tls.13_aes_gcm.splice_dec_cmsg_to_pipe
<trim>
# # RUN tls.12_chacha.splice_cmsg_to_pipe ... # # tls.c:688:splice_cmsg_to_pipe:Expected splice(self->cfd, NULL, p[1], NULL, send_len, 0) (10) == -1 (-1) # # tls.c:689:splice_cmsg_to_pipe:Expected errno (2) == EINVAL (22) # # splice_cmsg_to_pipe: Test terminated by timeout # # FAIL tls.12_chacha.splice_cmsg_to_pipe # not ok 119 tls.12_chacha.splice_cmsg_to_pipe # # RUN tls.12_chacha.splice_dec_cmsg_to_pipe ... # # tls.c:708:splice_dec_cmsg_to_pipe:Expected recv(self->cfd, buf, send_len, 0) (10) == -1 (-1) # # tls.c:709:splice_dec_cmsg_to_pipe:Expected errno (2) == EIO (5) # # splice_dec_cmsg_to_pipe: Test terminated by timeout # # FAIL tls.12_chacha.splice_dec_cmsg_to_pipe # not ok 120 tls.12_chacha.splice_dec_cmsg_to_pipe
<trim>
# # RUN tls.13_chacha.splice_cmsg_to_pipe ... # # tls.c:688:splice_cmsg_to_pipe:Expected splice(self->cfd, NULL, p[1], NULL, send_len, 0) (10) == -1 (-1) # # tls.c:689:splice_cmsg_to_pipe:Expected errno (2) == EINVAL (22) # # splice_cmsg_to_pipe: Test terminated by timeout # # FAIL tls.13_chacha.splice_cmsg_to_pipe # not ok 168 tls.13_chacha.splice_cmsg_to_pipe # # RUN tls.13_chacha.splice_dec_cmsg_to_pipe ... # # tls.c:708:splice_dec_cmsg_to_pipe:Expected recv(self->cfd, buf, send_len, 0) (10) == -1 (-1) # # tls.c:709:splice_dec_cmsg_to_pipe:Expected errno (2) == EIO (5) # # splice_dec_cmsg_to_pipe: Test terminated by timeout # # FAIL tls.13_chacha.splice_dec_cmsg_to_pipe # not ok 169 tls.13_chacha.splice_dec_cmsg_to_pipe
<trim>
# # RUN tls.13_sm4_gcm.splice_cmsg_to_pipe ... # # tls.c:688:splice_cmsg_to_pipe:Expected splice(self->cfd, NULL, p[1], NULL, send_len, 0) (10) == -1 (-1) # # tls.c:689:splice_cmsg_to_pipe:Expected errno (2) == EINVAL (22) # # splice_cmsg_to_pipe: Test terminated by timeout # # FAIL tls.13_sm4_gcm.splice_cmsg_to_pipe # not ok 217 tls.13_sm4_gcm.splice_cmsg_to_pipe # # RUN tls.13_sm4_gcm.splice_dec_cmsg_to_pipe ... # # tls.c:708:splice_dec_cmsg_to_pipe:Expected recv(self->cfd, buf, send_len, 0) (10) == -1 (-1) # # tls.c:709:splice_dec_cmsg_to_pipe:Expected errno (2) == EIO (5) [ 661.901558] kworker/dying (49) used greatest stack depth: 10576 bytes left
This seems to be the problem perhaps.
Jakub, any thoughts. The last change to tls.c was a while back.
Test case HANG here.
Full test log links [1] including boot log and test run log.
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
-- Linaro LKFT https://lkft.linaro.org
https://lkft.validation.linaro.org/scheduler/job/4770773#L2700
thanks, -- Shuah thanks, -- Shuah
On Fri, 25 Mar 2022 16:13:33 -0600 Shuah Khan wrote:
# # RUN tls.13_sm4_gcm.splice_cmsg_to_pipe ... # # tls.c:688:splice_cmsg_to_pipe:Expected splice(self->cfd, NULL, p[1], NULL, send_len, 0) (10) == -1 (-1) # # tls.c:689:splice_cmsg_to_pipe:Expected errno (2) == EINVAL (22) # # splice_cmsg_to_pipe: Test terminated by timeout # # FAIL tls.13_sm4_gcm.splice_cmsg_to_pipe # not ok 217 tls.13_sm4_gcm.splice_cmsg_to_pipe # # RUN tls.13_sm4_gcm.splice_dec_cmsg_to_pipe ... # # tls.c:708:splice_dec_cmsg_to_pipe:Expected recv(self->cfd, buf, send_len, 0) (10) == -1 (-1) # # tls.c:709:splice_dec_cmsg_to_pipe:Expected errno (2) == EIO (5) [ 661.901558] kworker/dying (49) used greatest stack depth: 10576 bytes left
This seems to be the problem perhaps.
Jakub, any thoughts. The last change to tls.c was a while back.
Yes, sorry, kicked off a build and got distracted.
I can repro the failures, TLS=n in the config I must have not tested that in the new cases.
But I can't repro the hung, and we have a timer at the hardness level IIUC so IDK how this could "hang"?
Naresh, is there any stack trace in the logs? Can you repro on Linus's tree?
On 3/25/22 5:12 PM, Jakub Kicinski wrote:
On Fri, 25 Mar 2022 16:13:33 -0600 Shuah Khan wrote:
# # RUN tls.13_sm4_gcm.splice_cmsg_to_pipe ... # # tls.c:688:splice_cmsg_to_pipe:Expected splice(self->cfd, NULL, p[1], NULL, send_len, 0) (10) == -1 (-1) # # tls.c:689:splice_cmsg_to_pipe:Expected errno (2) == EINVAL (22) # # splice_cmsg_to_pipe: Test terminated by timeout # # FAIL tls.13_sm4_gcm.splice_cmsg_to_pipe # not ok 217 tls.13_sm4_gcm.splice_cmsg_to_pipe # # RUN tls.13_sm4_gcm.splice_dec_cmsg_to_pipe ... # # tls.c:708:splice_dec_cmsg_to_pipe:Expected recv(self->cfd, buf, send_len, 0) (10) == -1 (-1) # # tls.c:709:splice_dec_cmsg_to_pipe:Expected errno (2) == EIO (5) [ 661.901558] kworker/dying (49) used greatest stack depth: 10576 bytes left
This seems to be the problem perhaps.
Jakub, any thoughts. The last change to tls.c was a while back.
Yes, sorry, kicked off a build and got distracted.
I can repro the failures, TLS=n in the config I must have not tested that in the new cases.
But I can't repro the hung, and we have a timer at the hardness level IIUC so IDK how this could "hang"?
Naresh, is there any stack trace in the logs? Can you repro on Linus's tree?
I couldn't reproduce this either - I have TLS=m
thanks, -- Shuah
Hi Jakub,
This seems to be the problem perhaps.
Jakub, any thoughts. The last change to tls.c was a while back.
Yes, sorry, kicked off a build and got distracted.
I can repro the failures, TLS=n in the config I must have not tested that in the new cases.
Thanks for investigation and reproducing the test failures.
But I can't repro the hung, and we have a timer at the hardness level IIUC so IDK how this could "hang"?
Did you get a chance to build and test with the config file that I have provided ? I am using kselftest-merge configs for my daily CI system [1] on gitlab.
If you think the Kconfig is a test
Naresh, is there any stack trace in the logs? Can you repro on Linus's tree?
There is no stack trace print, it was only hung.
I couldn't reproduce this either - I have TLS=m
If you want to reproduce the same build or re-use the bzImage / vmlinux / System.map you may check the provided link [1].
Step to reproduce [1]:
# TuxMake supports the concept of runtimes. # See https://docs.tuxmake.org/runtimes/, for that to work it requires # that you install podman or docker on your system. # # To install tuxmake on your system globally: # sudo pip3 install -U tuxmake # # See https://docs.tuxmake.org/ for complete documentation. # Original tuxmake command with fragments listed below.
tuxmake --runtime podman --target-arch x86_64 --toolchain gcc-11 --kconfig defconfig --kconfig-add https://raw.githubusercontent.com/Linaro/meta-lkft/sumo/recipes-kernel/linux... --kconfig-add https://raw.githubusercontent.com/Linaro/meta-lkft/sumo/recipes-kernel/linux... --kconfig-add https://raw.githubusercontent.com/Linaro/meta-lkft/sumo/recipes-kernel/linux... --kconfig-add https://raw.githubusercontent.com/Linaro/meta-lkft/sumo/recipes-kernel/linux... --kconfig-add https://raw.githubusercontent.com/Linaro/meta-lkft/sumo/recipes-kernel/linux... --kconfig-add CONFIG_IGB=y --kconfig-add CONFIG_UNWINDER_FRAME_POINTER=y cpupower headers kernel kselftest kselftest-merge modules
[1] https://builds.tuxbuild.com/26mKij4yB5Q6WUpOyHHEoHLstVJ/
- Naresh
On Tue, 29 Mar 2022 20:19:10 +0530 Naresh Kamboju wrote:
But I can't repro the hung, and we have a timer at the hardness level IIUC so IDK how this could "hang"?
Did you get a chance to build and test with the config file that I have provided ? I am using kselftest-merge configs for my daily CI system [1] on gitlab.
If you think the Kconfig is a test
I have not used the exact config, just a config with TLS=n. I also have a different GCC version (11.1). Unfortunately I don't have access to docker or podman on the machines I'm using.
Naresh, is there any stack trace in the logs? Can you repro on Linus's tree?
There is no stack trace print, it was only hung.
Can you check where the process is stuck and it's state? /proc/$pid/stack and run that thru scripts/decode_stacktrace
I couldn't reproduce this either - I have TLS=m
If you want to reproduce the same build or re-use the bzImage / vmlinux / System.map you may check the provided link [1].
Step to reproduce [1]:
# TuxMake supports the concept of runtimes. # See https://docs.tuxmake.org/runtimes/, for that to work it requires # that you install podman or docker on your system. # # To install tuxmake on your system globally: # sudo pip3 install -U tuxmake # # See https://docs.tuxmake.org/ for complete documentation. # Original tuxmake command with fragments listed below.
tuxmake --runtime podman --target-arch x86_64 --toolchain gcc-11 --kconfig defconfig --kconfig-add https://raw.githubusercontent.com/Linaro/meta-lkft/sumo/recipes-kernel/linux... --kconfig-add https://raw.githubusercontent.com/Linaro/meta-lkft/sumo/recipes-kernel/linux... --kconfig-add https://raw.githubusercontent.com/Linaro/meta-lkft/sumo/recipes-kernel/linux... --kconfig-add https://raw.githubusercontent.com/Linaro/meta-lkft/sumo/recipes-kernel/linux... --kconfig-add https://raw.githubusercontent.com/Linaro/meta-lkft/sumo/recipes-kernel/linux... --kconfig-add CONFIG_IGB=y --kconfig-add CONFIG_UNWINDER_FRAME_POINTER=y cpupower headers kernel kselftest kselftest-merge modules
[1] https://builds.tuxbuild.com/26mKij4yB5Q6WUpOyHHEoHLstVJ/
I only see build logs here, are there logs for the run?