LKFT CI found that with the latest mainline kernel (6.1) on
some QEMU emulators and FVP, the following tests take longer
than the kselftest framework default timeout (45 seconds) to
run and are therefore terminated with a TIMEOUT error:
* fp-stress - took about 11m30s
* sve-ptrace - took about 8m50s
* check_gcr_el1_cswitch - took about 6m
* check_user_mem - took about 3m
* syscall-abi - took about 5m
Current test timeouts:
not ok 29 selftests: arm64: sve-ptrace # TIMEOUT 45 seconds
not ok 36 selftests: arm64: check_gcr_el1_cswitch # TIMEOUT 45 seconds
not ok 41 selftests: arm64: check_user_mem # TIMEOUT 45 seconds
not ok 46 selftests: arm64: syscall-abi # TIMEOUT 45 seconds
Signed-off-by: Naresh Kamboju <naresh.kamboju(a)linaro.org>
---
tools/testing/selftests/arm64/settings | 1 +
1 file changed, 1 insertion(+)
create mode 100644 tools/testing/selftests/arm64/settings
diff --git a/tools/testing/selftests/arm64/settings b/tools/testing/selftests/arm64/settings
new file mode 100644
index 000000000000..8959a5dd8ace
--- /dev/null
+++ b/tools/testing/selftests/arm64/settings
@@ -0,0 +1 @@
+timeout=900
\ No newline at end of file
--
2.30.2
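As a rough sanity check on the value chosen in the patch above, the
reported run times can be converted to seconds and compared with both
the 45-second default and the proposed 900-second limit. This is only a
sketch: the to_seconds helper and the reported table are illustrative
names, and the durations are the approximate figures quoted in the
commit message.

```python
import re

# Approximate run times quoted in the commit message.
reported = {
    "fp-stress": "11m30s",
    "sve-ptrace": "8m50s",
    "check_gcr_el1_cswitch": "6m",
    "check_user_mem": "3m",
    "syscall-abi": "5m",
}

DEFAULT_TIMEOUT = 45   # kselftest default, in seconds
NEW_TIMEOUT = 900      # value written to the settings file


def to_seconds(text):
    """Convert a duration such as '11m30s' or '6m' to seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(?:(\d+)s)?", text)
    if not match or not text:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return minutes * 60 + seconds


for name, duration in reported.items():
    secs = to_seconds(duration)
    print(f"{name}: {secs}s, over default: {secs > DEFAULT_TIMEOUT}, "
          f"fits new limit: {secs <= NEW_TIMEOUT}")
```

The slowest reported test, fp-stress, comes to about 690 seconds, so
900 leaves some headroom on these platforms.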
On Wed, Jan 11, 2023, at 07:16, Naresh Kamboju wrote:
> On Tue, 10 Jan 2023 at 23:36, Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> wrote:
>>
>
> Results from Linaro’s test farm.
> Regressions on arm64 Raspberry Pi 4 Model B.
>
> Reported-by: Linux Kernel Functional Testing <lkft(a)linaro.org>
>
> While running the LTP controllers cgroup_fj_stress_blkio test cases,
> "Insufficient stack space to handle exception!" was reported,
> followed by a kernel panic, on an arm64 Raspberry Pi 4 Model B with
> a clang-15 built kernel Image.
>
> The full boot and test log is attached to this email, and the build
> and Kconfig links are provided at the bottom of this email.
>
> I will try to reproduce this reported issue and get back to you.
I looked at the log between 6.0.18 and 6.0.19-rc1, but don't see
any arm64 or memory management patches that could result in this.
Do you know if 6.0.18 ran successfully?
> [ 2893.044339] Insufficient stack space to handle exception!
> [ 2893.044351] ESR: 0x0000000096000047 -- DABT (current EL)
> [ 2893.044360] FAR: 0xffff8000128180d0
> [ 2893.044364] Task stack: [0xffff800012a18000..0xffff800012a1c000]
> [ 2893.044370] IRQ stack: [0xffff80000a798000..0xffff80000a79c000]
> [ 2893.044375] Overflow stack: [0xffff0000f77c4310..0xffff0000f77c5310]
...
> [ 2893.044413] pc : el1h_64_sync+0x0/0x68
> [ 2893.044430] lr : wp_page_copy+0xf8/0x90c
> [ 2893.044445] sp : ffff8000128180d0
...
> [ 2893.044692] el1h_64_sync+0x0/0x68
> [ 2893.044700] do_wp_page+0x4a0/0x5c8
> [ 2893.044708] handle_mm_fault+0x7fc/0x14dc
> [ 2893.044718] do_page_fault+0x29c/0x450
> [ 2893.044727] do_mem_abort+0x4c/0xf8
> [ 2893.044741] el0_da+0x48/0xa8
> [ 2893.044750] el0t_64_sync_handler+0xcc/0xf0
> [ 2893.044759] el0t_64_sync+0x18c/0x190
It claims that the stack overflow happened in do_wp_page(),
but that has a really short call chain. It would be good
to have the source line for do_wp_page+0x4a0/0x5c8 and
wp_page_copy+0xf8/0x90c to see where exactly it was.
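To make that lookup easier to script, the symbol+offset/size entries
and the stack ranges can be pulled out of the log text first. The
sketch below is an illustration only: the function names are made up
for this example, and it just parses lines quoted above. It does not
resolve source lines, which still needs the matching vmlinux.

```python
import re

LOG = """\
[ 2893.044364] Task stack: [0xffff800012a18000..0xffff800012a1c000]
[ 2893.044370] IRQ stack: [0xffff80000a798000..0xffff80000a79c000]
[ 2893.044375] Overflow stack: [0xffff0000f77c4310..0xffff0000f77c5310]
[ 2893.044430] lr : wp_page_copy+0xf8/0x90c
[ 2893.044445] sp : ffff8000128180d0
[ 2893.044700] do_wp_page+0x4a0/0x5c8
"""


def parse_stacks(log):
    """Return {name: (low, high)} for each 'X stack: [lo..hi]' line."""
    pattern = r"(\w+) stack:\s*\[(0x[0-9a-f]+)\.\.(0x[0-9a-f]+)\]"
    return {name: (int(lo, 16), int(hi, 16))
            for name, lo, hi in re.findall(pattern, log)}


def parse_symbols(log):
    """Return (symbol, offset, size) for each 'sym+0xoff/0xsize' entry."""
    pattern = r"([A-Za-z_]\w*)\+(0x[0-9a-f]+)/(0x[0-9a-f]+)"
    return [(sym, int(off, 16), int(size, 16))
            for sym, off, size in re.findall(pattern, log)]


def parse_sp(log):
    """Return the stack pointer value from the 'sp : ...' line."""
    return int(re.search(r"sp : ([0-9a-f]+)", log).group(1), 16)


stacks = parse_stacks(LOG)
sp = parse_sp(LOG)
for name, (low, high) in stacks.items():
    print(f"{name} stack contains sp: {low <= sp < high}")
for sym, off, size in parse_symbols(LOG):
    print(f"{sym}: offset {off:#x} of {size:#x}")
```

For the values quoted here, sp falls outside all three ranges and sits
0x1fff30 bytes below the base of the task stack.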
> [ 2893.285975] WARNING: CPU: 2 PID: 315758 at kernel/sched/core.c:3119
> set_task_cpu+0x14c/0x208
....
> [ 2893.286117] CPU: 2 PID: 315758 Comm: cgroup_fj_stres Not tainted
> [ 2893.286416] arch_timer_handler_phys+0x44/0x54
> [ 2893.286427] handle_percpu_devid_irq+0x90/0x220
> [ 2893.286439] generic_handle_domain_irq+0x38/0x50
> [ 2893.286447] gic_handle_irq+0x68/0xe8
> [ 2893.286455] el1_interrupt+0x88/0xc8
> [ 2893.286464] el1h_64_irq_handler+0x18/0x24
> [ 2893.286474] el1h_64_irq+0x64/0x68
> [ 2893.286482] panic+0x2d8/0x374
This looks like a second, unrelated bug: the kernel still processes
timer interrupts after calling panic(), and that presumably fails
because the system is already unusable.
> artifact-location:
> https://storage.tuxsuite.com/public/linaro/lkft/builds/2K9JDtix2mHMoYRjNkBe…
File not found. I tried to get the vmlinux file to look at the
disassembly, but the artifacts appear to be gone already.
Arnd
Total jobs: 60
Total errors: 33 (55.00%)
LAVA errors: 0 (0.00%)
Test errors: 18 (30.00%)
Job errors: 4 (6.67%)
Infra errors: 11 (18.33%)
Canceled jobs: 0 (0.00%)
Device type: x15
Total jobs: 2
Total errors: 2 (100.00%)
Error type: Test
Error count: 2 (100.00%)
Error: No match for error type 'Test', message 'lava-docker-test-shell timed out after 585 seconds'
Count: 1 (50.00%)
IDs:
x15-10:
6045621
Error: No match for error type 'Test', message 'tradefed - adb device lost[0400b0076ca68922]'
Count: 1 (50.00%)
IDs:
x15-01:
6045619
Device type: dragonboard-845c
Total jobs: 37
Total errors: 18 (48.65%)
Error type: Infrastructure
Error count: 10 (27.03%)
Error: Connection closed
Count: 10 (27.03%)
IDs:
db845c-01:
6038823
db845c-03:
6040735 6044923 6044932 6044977 6045013
6045070 6045400
db845c-07:
6038798 6040303
Error type: Test
Error count: 4 (10.81%)
Error: No match for error type 'Test', message 'lava-docker-test-shell timed out after 277 seconds'
Count: 1 (2.70%)
IDs:
db845c-03:
6045055
Error: No match for error type 'Test', message 'lava-docker-test-shell timed out after 587 seconds'
Count: 1 (2.70%)
IDs:
db845c-07:
6044924
Error: No match for error type 'Test', message 'tradefed - adb device lost[b6b742b9]'
Count: 1 (2.70%)
IDs:
db845c-01:
6044921
Error: No match for error type 'Test', message 'lava-docker-test-shell timed out after 285 seconds'
Count: 1 (2.70%)
IDs:
db845c-03:
6044918
Error type: Job
Error count: 4 (10.81%)
Error: wait for prompt timed out
Count: 3 (8.11%)
IDs:
db845c-03:
6040595 6040786
db845c-07:
6038827
Error: No match for error type 'Job', message 'login-action timed out after 870 seconds'
Count: 1 (2.70%)
IDs:
db845c-03:
6038828
Device type: hi6220-hikey-r2
Total jobs: 21
Total errors: 13 (61.90%)
Error type: Test
Error count: 12 (57.14%)
Error: No match for error type 'Test', message 'lava-docker-test-shell timed out after 585 seconds'
Count: 1 (4.76%)
IDs:
hikey-6220-r2-16:
6045193
Error: No match for error type 'Test', message 'lava-docker-test-shell timed out after 584 seconds'
Count: 2 (9.52%)
IDs:
hikey-6220-r2-15:
6045192
hikey-6220-r2-17:
6045188
Error: No match for error type 'Test', message 'lava-docker-test-shell timed out after 582 seconds'
Count: 1 (4.76%)
IDs:
hikey-6220-r2-14:
6045190
Error: No match for error type 'Test', message 'lava-docker-test-shell timed out after 583 seconds'
Count: 3 (14.29%)
IDs:
hikey-6220-r2-01:
6045018
hikey-6220-r2-14:
6045032
hikey-6220-r2-15:
6045011
Error: No match for error type 'Test', message 'tradefed - adb device lost[35FE0622003F362E]'
Count: 1 (4.76%)
IDs:
hikey-6220-r2-10:
6044942
Error: No match for error type 'Test', message 'tradefed - adb device lost[6EDCB009004013EF]'
Count: 2 (9.52%)
IDs:
hikey-6220-r2-08:
6038902 6040464
Error: No match for error type 'Test', message 'tradefed - adb device lost[235D989C003B0752]'
Count: 1 (4.76%)
IDs:
hikey-6220-r2-03:
6040177
Error: No match for error type 'Test', message 'tradefed - adb device lost[5C9DE2B700341A1B]'
Count: 1 (4.76%)
IDs:
hikey-6220-r2-04:
6039875
Error type: Infrastructure
Error count: 1 (4.76%)
Error: No match for error type 'Infrastructure', message 'wait-device-boardid timed out after 571 seconds'
Count: 1 (4.76%)
IDs:
hikey-6220-r2-04:
6044941
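The percentages in the report above appear to be each count divided by
the number of jobs in scope (all jobs for the summary, the device
type's jobs for the per-device sections). A minimal sketch that
recomputes the headline figures under that assumption follows; the pct
helper and the dictionary layout are illustrative, not part of the
reporting tool.

```python
def pct(count, total):
    """Format count as a percentage of total with two decimals."""
    return f"{100 * count / total:.2f}%"


# Counts copied from the summary and per-device sections of the report.
total_jobs = 60
summary = {"Test": 18, "Job": 4, "Infra": 11}
devices = {
    "x15": {"jobs": 2, "errors": 2},
    "dragonboard-845c": {"jobs": 37, "errors": 18},
    "hi6220-hikey-r2": {"jobs": 21, "errors": 13},
}

total_errors = sum(summary.values())
print(f"Total errors: {total_errors} ({pct(total_errors, total_jobs)})")
for kind, count in summary.items():
    print(f"{kind} errors: {count} ({pct(count, total_jobs)})")
for name, d in devices.items():
    print(f"{name}: {d['errors']}/{d['jobs']} ({pct(d['errors'], d['jobs'])})")
```

The per-device job and error counts add up to the summary totals (60
jobs, 33 errors), which is a quick way to check that no device type is
missing from the report.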