The following kernel crash was noticed on arm64 Rock-pi 4b, Qcomm dragonboard 410c and 845c devices while booting / loading kernel modules on today's Linux next-20240822.
First seen on next-20240822. Good: next-20240821 BAD: next-20240822
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
We are bisecting this problem.
Crash log: -------- [ 0.000000] Linux version 6.11.0-rc4-next-20240822 (tuxmake@tuxmake) (Debian clang version 18.1.8 (++20240731024919+3b5b5c1ec4a3-1~exp1~20240731145017.142), Debian LLD 18.1.8) #1 SMP PREEMPT @1724309008 [ 0.000000] KASLR disabled due to lack of seed [ 0.000000] Machine model: Radxa ROCK Pi 4B ... [ 16.923808] coresight-cpu-debug fe430000.debug: Coresight debug-CPU0 initialized [ 16.933919] coresight-cpu-debug fe432000.debug: Coresight debug-CPU1 initialized [ 16.935653] coresight-cpu-debug fe434000.debug: Coresight debug-CPU2 initialized [ 16.937846] coresight-cpu-debug fe436000.debug: Coresight debug-CPU3 initialized [ 16.939744] coresight-cpu-debug fe610000.debug: Coresight debug-CPU4 initialized [ 16.942132] coresight-cpu-debug fe710000.debug: Coresight debug-CPU5 initialized [ 37.987966] rcu: INFO: rcu_preempt self-detected stall on CPU [ 37.989037] rcu: 2-...!: (5248 ticks this GP) idle=d1f4/1/0x4000000000000000 softirq=1552/1553 fqs=0 [ 37.990715] rcu: (t=5250 jiffies g=1269 q=784 ncpus=6) [ 37.991667] rcu: rcu_preempt kthread timer wakeup didn't happen for 5249 jiffies! g1269 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 [ 37.993718] rcu: Possible timer handling issue on cpu=4 timer-softirq=296 [ 37.994966] rcu: rcu_preempt kthread starved for 5250 jiffies! g1269 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=4 [ 37.996844] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior. [ 37.998499] rcu: RCU grace-period kthread stack dump: [ 37.999419] task:rcu_preempt state:I stack:0 pid:18 tgid:18 ppid:2 flags:0x00000008 [ 38.001117] Call trace: [ 38.001565] __switch_to+0x15c/0x208 [ 38.002229] __schedule+0x54c/0x9d0 [ 38.002870] schedule+0x12c/0x1e0 [ 38.003479] schedule_timeout+0x9c/0x1b0 [ 38.004202] rcu_gp_fqs_loop+0x1e0/0x880 [ 38.004926] rcu_gp_kthread+0x70/0x230 [ 38.005616] kthread+0x104/0x198 [ 38.006209] ret_from_fork+0x10/0x20 [ 38.006877] CPU: 2 UID: 0 PID: 237 Comm: (udev-worker) Not tainted 6.11.0-rc4-next-20240822 #1 [ 38.008442] Hardware name: Radxa ROCK Pi 4B (DT) [ 38.009284] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 38.010551] pc : smp_call_function_many_cond+0x468/0x6b8 [ 38.011521] lr : smp_call_function_many_cond+0x428/0x6b8 [ 38.012490] sp : ffff800083d1bb50 [ 38.013094] x29: ffff800083d1bb90 x28: 0000000000000006 x27: 0000000000000000 [ 38.014399] x26: ffff80008269e000 x25: 0000000000000000 x24: 0000000000000001 [ 38.015702] x23: ffff8000801bc470 x22: ffff800074e75000 x21: ffff0000f74d8c20 [ 38.017005] x20: ffff0000f746cb80 x19: ffff8000826a3470 x18: ffff80007afc2000 [ 38.018308] x17: 000000000909008e x16: 0000000000000000 x15: ffff8000816092a0 [ 38.019611] x14: 060a090000000000 x13: 0000000000000003 x12: 000000000000003f [ 38.020914] x11: 0000000000000001 x10: 0000000000000000 x9 : 0000000000000011 [ 38.022218] x8 : ffff0000f74b6c28 x7 : 0000000000000000 x6 : 0000000000000080 [ 38.023520] x5 : 000000000000003f x4 : 0000000000000000 x3 : 0000000000000006 [ 38.024823] x2 : 0000000000000004 x1 : 0000000000000004 x0 : 0000000000000004 [ 38.026126] Call trace: [ 38.026574] smp_call_function_many_cond+0x468/0x6b8 [ 38.027481] kick_all_cpus_sync+0x44/0x78 [ 38.028216] flush_module_icache+0x38/0xe0 [ 38.028965] load_module+0x1374/0x1778 [ 38.029654] __arm64_sys_finit_module+0x238/0x348 [ 38.030516] invoke_syscall+0x4c/0x110 [ 38.031207] el0_svc_common+0x8c/0xf0 [ 38.031880] do_el0_svc+0x28/0x40 [ 38.032490] el0_svc+0x40/0x88 [ 38.033054] el0t_64_sync_handler+0x90/0x100 [ 38.033837] el0t_64_sync+0x190/0x198 [ 38.034509] Sending NMI from CPU 2 to CPUs 4: [ 48.036555] Sending NMI from CPU 2 to CPUs 5: [ 78.247966] sched: DL replenish lagged too much [ 121.047963] rcu: INFO: rcu_preempt self-detected stall on CPU [ 121.049009] rcu: 2-...!: (21002 ticks this GP) idle=d1f4/1/0x4000000000000000 softirq=1552/1553 fqs=0 [ 121.050702] rcu: (t=26015 jiffies g=1269 q=784 ncpus=6) [ 121.051669] rcu: rcu_preempt kthread timer wakeup didn't happen for 26014 jiffies! g1269 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x200 [ 121.053735] rcu: Possible timer handling issue on cpu=4 timer-softirq=296 [ 121.054982] rcu: rcu_preempt kthread starved for 26015 jiffies! g1269 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x200 ->cpu=4 [ 121.056876] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior. [ 121.058530] rcu: RCU grace-period kthread stack dump: [ 121.059447] task:rcu_preempt state:R stack:0 pid:18 tgid:18 ppid:2 flags:0x00000008 [ 121.061140] Call trace: [ 121.061587] __switch_to+0x15c/0x208 [ 121.062245] __schedule+0x54c/0x9d0 [ 121.062885] schedule+0x12c/0x1e0 [ 121.063494] schedule_timeout+0x9c/0x1b0 [ 121.064214] rcu_gp_fqs_loop+0x1e0/0x880 [ 121.064936] rcu_gp_kthread+0x70/0x230 [ 121.065625] kthread+0x104/0x198 [ 121.066217] ret_from_fork+0x10/0x20 [ 121.066874] CPU: 2 UID: 0 PID: 237 Comm: (udev-worker) Not tainted 6.11.0-rc4-next-20240822 #1 ... [ 785.590550] Sending NMI from CPU 2 to CPUs 4: [ 795.592594] Sending NMI from CPU 2 to CPUs 5:
boot Log links, -------- - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20240822/tes... - https://lkft.validation.linaro.org/scheduler/job/7816099#L523 - https://lkft.validation.linaro.org/scheduler/job/7815910#L2448
Boot failed comparison: - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20240822/tes...
metadata: ---- git describe: next-20240822 git repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next git sha: 6a7917c89f219f09b1d88d09f376000914a52763 kernel config: https://storage.tuxsuite.com/public/linaro/lkft/builds/2l0EZh4ZCNgUovHichD9y... kernel version: 6.11.0-rc4 artifact location: https://storage.tuxsuite.com/public/linaro/lkft/builds/2l0EZh4ZCNgUovHichD9y... build url: https://storage.tuxsuite.com/public/linaro/lkft/builds/2l0EZh4ZCNgUovHichD9y... toolchain: clang-18 and gcc-13
Steps to reproduce: --------- - https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/2l0EbxmBUpG... - https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/2l0EbxmBUpG...
-- Linaro LKFT https://lkft.linaro.org
Hi,
On 22/08/2024 13:16, Naresh Kamboju wrote:
The following kernel crash was noticed on arm64 Rock-pi 4b, Qcomm dragonboard 410c and 845c devices while booting / loading kernel modules on today's Linux next-20240822.
I see the same crash log on the following devices built with next-20240822 and ARM64 defconfig: - Libre Computer AML-S805X-AC (Amlogic S805X) - BananaPi M2S (Amlogic A311D) - Snapdragon DB410c - Qcom SM8250-HDK
I do not see the same crash log, but the following board fails to complete boot: - Banana Pi BPI-M5 (Amlogic S905X3) - Libre Computer AML-S905X-CC (Amlogic S905X) - Qcom SM8150-HDK - Qcom DB845c - Qcom SM8350-HDK - Qcom SM8450-HDK - Qcom SM8550-HDK - Qcom SM8550-QRD - Qcom SM8650-HDK - Qcom SM8650-QRD
Neil
First seen on next-20240822. Good: next-20240821 BAD: next-20240822
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
We are bisecting this problem.
Crash log:
[ 0.000000] Linux version 6.11.0-rc4-next-20240822 (tuxmake@tuxmake) (Debian clang version 18.1.8 (++20240731024919+3b5b5c1ec4a3-1~exp1~20240731145017.142), Debian LLD 18.1.8) #1 SMP PREEMPT @1724309008 [ 0.000000] KASLR disabled due to lack of seed [ 0.000000] Machine model: Radxa ROCK Pi 4B ... [ 16.923808] coresight-cpu-debug fe430000.debug: Coresight debug-CPU0 initialized [ 16.933919] coresight-cpu-debug fe432000.debug: Coresight debug-CPU1 initialized [ 16.935653] coresight-cpu-debug fe434000.debug: Coresight debug-CPU2 initialized [ 16.937846] coresight-cpu-debug fe436000.debug: Coresight debug-CPU3 initialized [ 16.939744] coresight-cpu-debug fe610000.debug: Coresight debug-CPU4 initialized [ 16.942132] coresight-cpu-debug fe710000.debug: Coresight debug-CPU5 initialized [ 37.987966] rcu: INFO: rcu_preempt self-detected stall on CPU [ 37.989037] rcu: 2-...!: (5248 ticks this GP) idle=d1f4/1/0x4000000000000000 softirq=1552/1553 fqs=0 [ 37.990715] rcu: (t=5250 jiffies g=1269 q=784 ncpus=6) [ 37.991667] rcu: rcu_preempt kthread timer wakeup didn't happen for 5249 jiffies! g1269 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 [ 37.993718] rcu: Possible timer handling issue on cpu=4 timer-softirq=296 [ 37.994966] rcu: rcu_preempt kthread starved for 5250 jiffies! g1269 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=4 [ 37.996844] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior. [ 37.998499] rcu: RCU grace-period kthread stack dump: [ 37.999419] task:rcu_preempt state:I stack:0 pid:18 tgid:18 ppid:2 flags:0x00000008 [ 38.001117] Call trace: [ 38.001565] __switch_to+0x15c/0x208 [ 38.002229] __schedule+0x54c/0x9d0 [ 38.002870] schedule+0x12c/0x1e0 [ 38.003479] schedule_timeout+0x9c/0x1b0 [ 38.004202] rcu_gp_fqs_loop+0x1e0/0x880 [ 38.004926] rcu_gp_kthread+0x70/0x230 [ 38.005616] kthread+0x104/0x198 [ 38.006209] ret_from_fork+0x10/0x20 [ 38.006877] CPU: 2 UID: 0 PID: 237 Comm: (udev-worker) Not tainted 6.11.0-rc4-next-20240822 #1 [ 38.008442] Hardware name: Radxa ROCK Pi 4B (DT) [ 38.009284] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 38.010551] pc : smp_call_function_many_cond+0x468/0x6b8 [ 38.011521] lr : smp_call_function_many_cond+0x428/0x6b8 [ 38.012490] sp : ffff800083d1bb50 [ 38.013094] x29: ffff800083d1bb90 x28: 0000000000000006 x27: 0000000000000000 [ 38.014399] x26: ffff80008269e000 x25: 0000000000000000 x24: 0000000000000001 [ 38.015702] x23: ffff8000801bc470 x22: ffff800074e75000 x21: ffff0000f74d8c20 [ 38.017005] x20: ffff0000f746cb80 x19: ffff8000826a3470 x18: ffff80007afc2000 [ 38.018308] x17: 000000000909008e x16: 0000000000000000 x15: ffff8000816092a0 [ 38.019611] x14: 060a090000000000 x13: 0000000000000003 x12: 000000000000003f [ 38.020914] x11: 0000000000000001 x10: 0000000000000000 x9 : 0000000000000011 [ 38.022218] x8 : ffff0000f74b6c28 x7 : 0000000000000000 x6 : 0000000000000080 [ 38.023520] x5 : 000000000000003f x4 : 0000000000000000 x3 : 0000000000000006 [ 38.024823] x2 : 0000000000000004 x1 : 0000000000000004 x0 : 0000000000000004 [ 38.026126] Call trace: [ 38.026574] smp_call_function_many_cond+0x468/0x6b8 [ 38.027481] kick_all_cpus_sync+0x44/0x78 [ 38.028216] flush_module_icache+0x38/0xe0 [ 38.028965] load_module+0x1374/0x1778 [ 38.029654] __arm64_sys_finit_module+0x238/0x348 [ 38.030516] invoke_syscall+0x4c/0x110 [ 38.031207] el0_svc_common+0x8c/0xf0 [ 38.031880] do_el0_svc+0x28/0x40 [ 38.032490] el0_svc+0x40/0x88 [ 38.033054] el0t_64_sync_handler+0x90/0x100 [ 38.033837] el0t_64_sync+0x190/0x198 [ 38.034509] Sending NMI from CPU 2 to CPUs 4: [ 48.036555] Sending NMI from CPU 2 to CPUs 5: [ 78.247966] sched: DL replenish lagged too much [ 121.047963] rcu: INFO: rcu_preempt self-detected stall on CPU [ 121.049009] rcu: 2-...!: (21002 ticks this GP) idle=d1f4/1/0x4000000000000000 softirq=1552/1553 fqs=0 [ 121.050702] rcu: (t=26015 jiffies g=1269 q=784 ncpus=6) [ 121.051669] rcu: rcu_preempt kthread timer wakeup didn't happen for 26014 jiffies! g1269 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x200 [ 121.053735] rcu: Possible timer handling issue on cpu=4 timer-softirq=296 [ 121.054982] rcu: rcu_preempt kthread starved for 26015 jiffies! g1269 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x200 ->cpu=4 [ 121.056876] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior. [ 121.058530] rcu: RCU grace-period kthread stack dump: [ 121.059447] task:rcu_preempt state:R stack:0 pid:18 tgid:18 ppid:2 flags:0x00000008 [ 121.061140] Call trace: [ 121.061587] __switch_to+0x15c/0x208 [ 121.062245] __schedule+0x54c/0x9d0 [ 121.062885] schedule+0x12c/0x1e0 [ 121.063494] schedule_timeout+0x9c/0x1b0 [ 121.064214] rcu_gp_fqs_loop+0x1e0/0x880 [ 121.064936] rcu_gp_kthread+0x70/0x230 [ 121.065625] kthread+0x104/0x198 [ 121.066217] ret_from_fork+0x10/0x20 [ 121.066874] CPU: 2 UID: 0 PID: 237 Comm: (udev-worker) Not tainted 6.11.0-rc4-next-20240822 #1 ... [ 785.590550] Sending NMI from CPU 2 to CPUs 4: [ 795.592594] Sending NMI from CPU 2 to CPUs 5:
boot Log links,
- https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20240822/tes...
- https://lkft.validation.linaro.org/scheduler/job/7816099#L523
- https://lkft.validation.linaro.org/scheduler/job/7815910#L2448
Boot failed comparison:
metadata:
git describe: next-20240822 git repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next git sha: 6a7917c89f219f09b1d88d09f376000914a52763 kernel config: https://storage.tuxsuite.com/public/linaro/lkft/builds/2l0EZh4ZCNgUovHichD9y... kernel version: 6.11.0-rc4 artifact location: https://storage.tuxsuite.com/public/linaro/lkft/builds/2l0EZh4ZCNgUovHichD9y... build url: https://storage.tuxsuite.com/public/linaro/lkft/builds/2l0EZh4ZCNgUovHichD9y... toolchain: clang-18 and gcc-13
Steps to reproduce:
- https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/2l0EbxmBUpG...
- https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/2l0EbxmBUpG...
-- Linaro LKFT https://lkft.linaro.org