Following boot warnings and crashes noticed on arm64 Rpi4 device running Linux next-20230621 kernel.
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
boot log:
[ 22.331748] Kernel text patching generated an invalid instruction at 0xffff8000835d6580!
[ 22.340579] Unexpected kernel BRK exception at EL1
[ 22.346141] Internal error: BRK handler: 00000000f2000100 [#1] PREEMPT SMP
[ 22.353814] Modules linked in: hci_uart(+) brcmfmac(+) btqca brcmutil btbcm vc4(+) cfg80211 bluetooth reset_raspberrypi clk_raspberrypi crct10dif_ce raspberrypi_hwmon snd_soc_hdmi_codec cec v3d bcm2711_thermal rfkill drm_display_helper drm_shmem_helper pcie_brcmstb drm_dma_helper pwm_bcm2835 i2c_bcm2835 gpu_sched drm_kms_helper fuse drm
[ 22.376947] vc4-drm gpu: bound fe004000.txp (ops vc4_txp_ops [vc4])
[ 22.384754] CPU: 3 PID: 159 Comm: systemd-udevd Not tainted 6.4.0-rc7-next-20230621 #1
[ 22.384769] Hardware name: Raspberry Pi 4 Model B (DT)
[ 22.384776] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 22.384789] pc : 0xffff8000835d6580
[ 22.384798] lr : sk_filter_trim_cap+0x80/0x2a0
[ 22.384825] sp : ffff800083cdb9d0
[ 22.384831] x29: ffff800083cdb9d0 x28: 0000000000000000 x27: 0000000000000001
[ 22.384853] x26: ffff000041ece000 x25: ffff0000423ac800 x24: ffff800083468e00
[ 22.384872] x23: 0000000000000000 x22: ffff000044c41300 x21: 0000000000000001
[ 22.384891] x20: ffff800083bed000 x19: ffff000044c41300 x18: 0000000000000000
[ 22.384909] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000015798650
[ 22.384928] x14: 414d003633373832 x13: 3332323d44455a49 x12: 4c414954494e495f
[ 22.384946] x11: 4345535500343331 x10: 343d4d554e514553 x9 : ffff80008124f500
[ 22.384965] x8 : ffff800083cdb7d8 x7 : 0000000000000000 x6 : 0000000000000001
[ 22.384983] x5 : ffff800082def000 x4 : ffff800082def2e8 x3 : 0000000000000000
[ 22.385001] x2 : ffff8000835d657c x1 : ffff800083bed048 x0 : ffff000044c41300
[ 22.385020] Call trace:
[ 22.385025] 0xffff8000835d6580
[ 22.385033] netlink_broadcast+0x1f0/0x4e8
[ 22.385047] netlink_sendmsg+0x318/0x420
[ 22.385056] ____sys_sendmsg+0x1cc/0x2c8
[ 22.385075] ___sys_sendmsg+0x88/0xf0
[ 22.385084] __sys_sendmsg+0x70/0xd8
[ 22.385093] __arm64_sys_sendmsg+0x2c/0x40
[ 22.385102] invoke_syscall+0x50/0x128
[ 22.385120] el0_svc_common.constprop.0+0xf4/0x120
[ 22.385136] do_el0_svc+0x44/0xb8
[ 22.385152] el0_svc+0x30/0x98
[ 22.385163] el0t_64_sync_handler+0x13c/0x158
[ 22.385174] el0t_64_sync+0x190/0x198
[ 22.385190] Code: d4202000 d4202000 d4202000 910003c9 (d503201f)
[ 22.385199] ---[ end trace 0000000000000000 ]---
[ 22.385206] note: systemd-udevd[159] exited with irqs disabled
[ 22.385378] note: systemd-udevd[159] exited with preempt_count 1
[ 22.395083] uart-pl011 fe201000.serial: no DMA platform data
[ 22.427073] vc4-drm gpu: bound fe206000.pixelvalve (ops vc4_crtc_ops [vc4])
[ 22.434295] ------------[ cut here ]------------
[ 22.474787] Bluetooth: HCI UART protocol Marvell registered
[ 22.475105] Voluntary context switch within RCU read-side critical section!
[ 22.475129] WARNING: CPU: 3 PID: 159 at kernel/rcu/tree_plugin.h:320 rcu_note_context_switch+0x458/0x530
[ 22.553049] vc4-drm gpu: bound fe207000.pixelvalve (ops vc4_crtc_ops [vc4])
[ 22.555403] Modules linked in: hci_uart(+) brcmfmac(+) btqca brcmutil btbcm vc4(+) cfg80211 bluetooth reset_raspberrypi clk_raspberrypi crct10dif_ce
[ 22.583877] vc4-drm gpu: bound fe20a000.pixelvalve (ops vc4_crtc_ops [vc4])
[ 22.584920] raspberrypi_hwmon snd_soc_hdmi_codec
[ 22.614889] vc4-drm gpu: bound fe216000.pixelvalve (ops vc4_crtc_ops [vc4])
[ 22.618972] cec v3d bcm2711_thermal rfkill drm_display_helper drm_shmem_helper pcie_brcmstb drm_dma_helper pwm_bcm2835 i2c_bcm2835 gpu_sched drm_kms_helper fuse drm
[ 22.666409] CPU: 3 PID: 159 Comm: systemd-udevd Tainted: G D 6.4.0-rc7-next-20230621 #1
[ 22.666424] Hardware name: Raspberry Pi 4 Model B (DT)
[ 22.666431] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 22.666443] pc : rcu_note_context_switch+0x458/0x530
[ 22.666463] lr : rcu_note_context_switch+0x458/0x530
[ 22.666478] sp : ffff800083cdb3a0
[ 22.666483] x29: ffff800083cdb3a0 x28: ffff0000423acc08 x27: 0000000000000000
[ 22.666506] x26: ffff00004205a080 x25: ffff80008146d7c8 x24: 0000000000000000
[ 22.666525] x23: 0000000000000000 x22: ffff00004205a080 x21: ffff80008335dd18
[ 22.666544] x20: ffff80008237d2c0 x19: ffff0000fb5f1140 x18: ffffffffffffffff
[ 22.666562] x17: 3035663432313830 x16: 3030386666666620 x15: 3a20397820333535
[ 22.666581] x14: 3431356534353564 x13: 216e6f6974636573 x12: 206c616369746972
[ 22.666600] x11: 6320656469732d64 x10: 6165722055435220 x9 : ffff80008012b740
[ 22.752425] x8 : 6863746977732074 x7 : 7865746e6f632079 x6 : 7261746e756c6f56
[ 22.759681] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000027
[ 22.766935] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff00004205a080
[ 22.774190] Call trace:
[ 22.776668] rcu_note_context_switch+0x458/0x530
[ 22.781364] __schedule+0xb8/0xd78
[ 22.784824] schedule+0x60/0x100
[ 22.788105] netlink_table_grab.part.0+0x8c/0xf8
[ 22.792801] netlink_release+0x5dc/0x6d8
[ 22.796782] __sock_release+0x4c/0xc8
[ 22.800505] sock_close+0x20/0x38
[ 22.803872] __fput+0xbc/0x280
[ 22.806979] ____fput+0x18/0x30
[ 22.810172] task_work_run+0x78/0xd8
[ 22.813802] do_exit+0x2f8/0x9a8
[ 22.817085] make_task_dead+0xa4/0x1a8
[ 22.820896] die+0x254/0x260
[ 22.823819] arm64_notify_die+0xbc/0xe0
[ 22.827712] do_debug_exception+0xe0/0x118
[ 22.831876] el1_dbg+0x70/0x90
[ 22.834975] el1h_64_sync_handler+0xc8/0xe8
[ 22.839222] el1h_64_sync+0x64/0x68
[ 22.842761] 0xffff8000835d6580
[ 22.845949] netlink_broadcast+0x1f0/0x4e8
[ 22.850105] netlink_sendmsg+0x318/0x420
[ 22.854084] ____sys_sendmsg+0x1cc/0x2c8
[ 22.858072] ___sys_sendmsg+0x88/0xf0
[ 22.861788] __sys_sendmsg+0x70/0xd8
[ 22.865414] __arm64_sys_sendmsg+0x2c/0x40
[ 22.869571] invoke_syscall+0x50/0x128
[ 22.873383] el0_svc_common.constprop.0+0xf4/0x120
[ 22.878251] do_el0_svc+0x44/0xb8
[ 22.881621] el0_svc+0x30/0x98
[ 22.884721] el0t_64_sync_handler+0x13c/0x158
[ 22.889144] el0t_64_sync+0x190/0x198
[ 22.892860] ---[ end trace 0000000000000000 ]---
Links:
- https://lkft.validation.linaro.org/scheduler/job/6531518#L886
- https://qa-reports.linaro.org/lkft/linux-next-master-sanity/build/next-20230...
- https://qa-reports.linaro.org/lkft/linux-next-master-sanity/build/next-20230...
metadata:
  git_ref: master
  git_repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next
  git_sha: 15e71592dbae49a674429c618a10401d7f992ac3
  git_describe: next-20230621
  kernel_version: 6.4.0-rc7
  kernel-config: https://storage.tuxsuite.com/public/linaro/lkft/builds/2RVA7srTtdxlVq1QVEgrz...
  artifact-location: https://storage.tuxsuite.com/public/linaro/lkft/builds/2RVA7srTtdxlVq1QVEgrz...
  toolchain: gcc-11
-- Linaro LKFT https://lkft.linaro.org
On Wed, Jun 21, 2023 at 06:06:51PM +0530, Naresh Kamboju wrote:
Following boot warnings and crashes noticed on arm64 Rpi4 device running Linux next-20230621 kernel.
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
boot log:
[ 22.331748] Kernel text patching generated an invalid instruction at 0xffff8000835d6580! [ 22.340579] Unexpected kernel BRK exception at EL1 [ 22.346141] Internal error: BRK handler: 00000000f2000100 [#1] PREEMPT SMP
This indicates execution of AARCH64_BREAK_FAULT.
That could be from dodgy arguments to aarch64_insn_gen_*(), or elsewhere, and given this is in the networking code I suspect this'll be related to BPF.
Looking at next-20230621 I see commit:
49703aa2adfaff28 ("bpf, arm64: use bpf_jit_binary_pack_alloc")
... which changed the way BPF allocates memory, and has code that pads memory with a bunch of AARCH64_BREAK_FAULT, so it looks like that *might* be related.
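As a quick sanity check of the reasoning above, the following Python sketch shows how the padding word and the reported exception value relate. The constant names mirror what I believe the arm64 kernel headers use (FAULT_BRK_IMM, AARCH64_BREAK_MON), but treat them and the helper functions as illustrative assumptions rather than a copy of the kernel source. The field layout (EC in bits [31:26], IL in bit 25, ISS in bits [24:0]) follows the Arm architecture's ESR_ELx description.

```python
# Illustrative constants; names are assumed to match the arm64 kernel headers.
FAULT_BRK_IMM = 0x100           # immediate reserved for "fault" BRK instructions
AARCH64_BREAK_MON = 0xD4200000  # base encoding of BRK #0


def encode_brk(imm16: int) -> int:
    """Encode `BRK #imm16`; the 16-bit immediate sits in bits [20:5]."""
    if not 0 <= imm16 <= 0xFFFF:
        raise ValueError("BRK immediate must fit in 16 bits")
    return AARCH64_BREAK_MON | (imm16 << 5)


def decode_esr(esr: int) -> dict:
    """Split an ESR_ELx value into exception class, IL bit and ISS."""
    return {
        "ec": (esr >> 26) & 0x3F,   # 0x3C means BRK executed in AArch64 state
        "il": (esr >> 25) & 0x1,    # 1 for a 32-bit instruction
        "iss": esr & 0x1FFFFFF,     # for BRK, the low 16 bits are the immediate
    }


AARCH64_BREAK_FAULT = encode_brk(FAULT_BRK_IMM)
esr = decode_esr(0xF2000100)  # value printed after "BRK handler:" in the log

print(f"padding word: {AARCH64_BREAK_FAULT:08x}")
print(f"EC={esr['ec']:#x} IL={esr['il']} BRK imm={esr['iss'] & 0xFFFF:#x}")
```

The padding word computed here is the same d4202000 value that appears repeatedly in the "Code:" line of the boot log, and the immediate recovered from the exception value is 0x100, which is what points at AARCH64_BREAK_FAULT.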
Are you able to bisect this?
In the meantime, I've Cc'd the relevant BPF people to give them a heads-up.
Thanks, Mark.
On Wed, 21 Jun 2023 at 18:27, Mark Rutland mark.rutland@arm.com wrote:
On Wed, Jun 21, 2023 at 06:06:51PM +0530, Naresh Kamboju wrote:
Following boot warnings and crashes noticed on arm64 Rpi4 device running Linux next-20230621 kernel.
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
boot log:
[ 22.331748] Kernel text patching generated an invalid instruction at 0xffff8000835d6580! [ 22.340579] Unexpected kernel BRK exception at EL1 [ 22.346141] Internal error: BRK handler: 00000000f2000100 [#1] PREEMPT SMP
This indicates execution of AARCH64_BREAK_FAULT.
I see kernel panic with kselftest merge configs on Juno-r2 and Rpi4.
That could be from dodgy arguments to aarch64_insn_gen_*(), or elsewhere, and given this is in the networking code I suspect this'll be related to BPF.
Looking at next-20230621 I see commit:
49703aa2adfaff28 ("bpf, arm64: use bpf_jit_binary_pack_alloc")
... which changed the way BPF allocates memory, and has code that pads memory with a bunch of AARCH64_BREAK_FAULT, so it looks like that *might* be related.
Are you able to bisect this?
I have not started bisection on this issue yet. Let me give it a try.
In the meantime, I've Cc'd the relevant BPF people to give them a heads-up.
Thanks.
Extra information from boot failures. This is always reproducible on Juno-r2 and Rpi4 devices.
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
Boot crash log:
[ 3.605232] Kernel text patching generated an invalid instruction at bpf_prog_99a0cd861b84ee07___loader.prog+0x0/0x728!
[ 3.616052] Unexpected kernel BRK exception at EL1
[ 3.620849] Internal error: BRK handler: 00000000f2000100 [#1] PREEMPT SMP
[ 3.627736] Modules linked in:
[ 3.630796] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.4.0-rc7-next-20230621 #1
[ 3.638140] hub 1-1:1.0: USB hub found
[ 3.638206] Hardware name: ARM Juno development board (r2) (DT)
[ 3.638210] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 3.642431] hub 1-1:1.0: 4 ports detected
[ 3.647879] pc : bpf_prog_99a0cd861b84ee07___loader.prog+0x0/0x728
[ 3.647891] lr : kern_sys_bpf+0x130/0x218
[ 3.669061] sp : ffff80008391bc10
[ 3.672376] x29: ffff80008391bc10 x28: ffff8000826e70d8 x27: ffff800082450110
[ 3.679533] x26: ffff8000820ed948 x25: ffff800082427b10 x24: 0000000000000289
[ 3.686687] x23: ffff000800acfa00 x22: ffff8000837f8000 x21: ffff000823dbc240
[ 3.693841] x20: ffff8000839b1000 x19: ffff80008391bca8 x18: 000000001d03406d
[ 3.700995] x17: ffff800080464204 x16: ffff8000804640b4 x15: ffff8000803f8af0
[ 3.708149] x14: ffff8000803f88f8 x13: ffff800081717720 x12: ffff8000824514b4
[ 3.715302] x11: ffff800080015788 x10: ffff800082470304 x9 : ffff8000800f3338
[ 3.722456] x8 : ffff80008391bcf8 x7 : 0000000000000000 x6 : 0000000000000001
[ 3.729609] x5 : 0000000000000001 x4 : ffff8000831f0000 x3 : ffff8008fc63d000
[ 3.736763] x2 : ffff800083b6d88c x1 : ffff8000839b1048 x0 : ffff000823dbc240
[ 3.743917] Call trace:
[ 3.746362] bpf_prog_99a0cd861b84ee07___loader.prog+0x0/0x728
[ 3.752210] bpf_load_and_run.constprop.0+0x120/0x1d8
[ 3.757270] load+0xf4/0x278
[ 3.760159] do_one_initcall+0x50/0x2f0
[ 3.764001] kernel_init_freeable+0x224/0x438
[ 3.768368] kernel_init+0x30/0x200
[ 3.771862] ret_from_fork+0x10/0x20
[ 3.775447] Code: d4202000 00000780 d4202000 d4202000 (910003c9)
[ 3.781550] ---[ end trace 0000000000000000 ]---
[ 3.786172] note: swapper/0[1] exited with irqs disabled
[ 3.791526] note: swapper/0[1] exited with preempt_count 1
[ 3.797043] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 3.804711] SMP: stopping secondary CPUs
[ 3.808843] Kernel Offset: disabled
[ 3.812331] CPU features: 0x40000106,1e010000,0000421b
[ 3.817476] Memory Limit: none
[ 3.820536] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---
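In the Juno log above, the faulting pc is symbolized as bpf_prog_99a0cd861b84ee07___loader.prog+0x0/0x728. In the kernel's usual symbol+offset/size notation that reads as offset 0 into a 0x728-byte function, i.e. the very first instruction of the JITed program. The sketch below is a small, hypothetical helper (parse_frame and Frame are names made up for this illustration) that splits such entries so the offset can be checked mechanically.

```python
import re
from typing import NamedTuple, Optional


class Frame(NamedTuple):
    symbol: str
    offset: int  # byte offset of the address within the symbol
    size: int    # total size of the symbol in bytes


# Matches entries such as "netlink_sendmsg+0x318/0x420"; a trailing "!" as in
# the "invalid instruction at ...!" message is tolerated.
_FRAME_RE = re.compile(
    r"^(?P<sym>\S+)\+0x(?P<off>[0-9a-fA-F]+)/0x(?P<size>[0-9a-fA-F]+)!?$"
)


def parse_frame(text: str) -> Optional[Frame]:
    """Parse one symbol+0xOFF/0xSIZE entry; return None for raw addresses."""
    match = _FRAME_RE.match(text.strip())
    if match is None:
        return None
    return Frame(match["sym"], int(match["off"], 16), int(match["size"], 16))


juno_pc = parse_frame("bpf_prog_99a0cd861b84ee07___loader.prog+0x0/0x728")
rpi4_pc = parse_frame("0xffff8000835d6580")  # unsymbolized pc from the Rpi4 log

print(juno_pc)
print(rpi4_pc)
```

The Rpi4 log only shows a raw address for the pc, so the helper returns None there; the Juno log is the one that makes it visible that the exception was taken at the program's entry point.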
Links:
- https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20230621/tes...
- https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20230621/tes...
- https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20230621/tes...
metadata:
  git_ref: master
  git_repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next
  git_sha: 15e71592dbae49a674429c618a10401d7f992ac3
  git_describe: next-20230621
  kernel_version: 6.4.0-rc7
  kernel-config: https://storage.tuxsuite.com/public/linaro/lkft/builds/2RVAA4lj35ia3YDkqaoV6...
  artifact-location: https://storage.tuxsuite.com/public/linaro/lkft/builds/2RVAA4lj35ia3YDkqaoV6...
  toolchain: gcc-11
  build_name: gcc-11-lkftconfig-kselftest
-- Linaro LKFT https://lkft.linaro.org
Hi,
On Wed, Jun 21, 2023 at 3:39 PM Naresh Kamboju naresh.kamboju@linaro.org wrote:
On Wed, 21 Jun 2023 at 18:27, Mark Rutland mark.rutland@arm.com wrote:
On Wed, Jun 21, 2023 at 06:06:51PM +0530, Naresh Kamboju wrote:
Following boot warnings and crashes noticed on arm64 Rpi4 device running Linux next-20230621 kernel.
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
boot log:
[ 22.331748] Kernel text patching generated an invalid instruction at 0xffff8000835d6580! [ 22.340579] Unexpected kernel BRK exception at EL1 [ 22.346141] Internal error: BRK handler: 00000000f2000100 [#1] PREEMPT SMP
This indicates execution of AARCH64_BREAK_FAULT.
I see kernel panic with kselftest merge configs on Juno-r2 and Rpi4.
Is there a way to reproduce this setup on Qemu?
I am able to build the linux-next kernel with the config given below. But the bug doesn't reproduce in Qemu with debian rootfs.
I guess I would need the Rootfs that is being used here to reproduce it. Can you point me to the rootfs for this?
Thanks, Puranjay
On Wed, 21 Jun 2023 at 19:46, Puranjay Mohan puranjay12@gmail.com wrote:
Hi,
On Wed, Jun 21, 2023 at 3:39 PM Naresh Kamboju naresh.kamboju@linaro.org wrote:
On Wed, 21 Jun 2023 at 18:27, Mark Rutland mark.rutland@arm.com wrote:
On Wed, Jun 21, 2023 at 06:06:51PM +0530, Naresh Kamboju wrote:
Following boot warnings and crashes noticed on arm64 Rpi4 device running Linux next-20230621 kernel.
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
boot log:
[ 22.331748] Kernel text patching generated an invalid instruction at 0xffff8000835d6580! [ 22.340579] Unexpected kernel BRK exception at EL1 [ 22.346141] Internal error: BRK handler: 00000000f2000100 [#1] PREEMPT SMP
This indicates execution of AARCH64_BREAK_FAULT.
I see kernel panic with kselftest merge configs on Juno-r2 and Rpi4.
Is there a way to reproduce this setup on Qemu?
It is not reproducible on Qemu-arm64. I see it only on the arm64 devices Juno-r2 and Rpi4.
I am able to build the linux-next kernel with the config given below. But the bug doesn't reproduce in Qemu with debian rootfs.
I guess I would need the Rootfs that is being used here to reproduce it. Can you point me to the rootfs for this?
Here is the link for rootfs - OE one. https://storage.tuxsuite.com/public/linaro/lkft/oebuilds/2RVA7dHPf73agY0gDJD...
Thanks, Puranjay
- Naresh
Hi,
On Wed, Jun 21, 2023 at 4:41 PM Naresh Kamboju naresh.kamboju@linaro.org wrote:
On Wed, 21 Jun 2023 at 19:46, Puranjay Mohan puranjay12@gmail.com wrote:
Hi,
On Wed, Jun 21, 2023 at 3:39 PM Naresh Kamboju naresh.kamboju@linaro.org wrote:
On Wed, 21 Jun 2023 at 18:27, Mark Rutland mark.rutland@arm.com wrote:
On Wed, Jun 21, 2023 at 06:06:51PM +0530, Naresh Kamboju wrote:
Following boot warnings and crashes noticed on arm64 Rpi4 device running Linux next-20230621 kernel.
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
boot log:
[ 22.331748] Kernel text patching generated an invalid instruction at 0xffff8000835d6580! [ 22.340579] Unexpected kernel BRK exception at EL1 [ 22.346141] Internal error: BRK handler: 00000000f2000100 [#1] PREEMPT SMP
This indicates execution of AARCH64_BREAK_FAULT.
I see kernel panic with kselftest merge configs on Juno-r2 and Rpi4.
Is there a way to reproduce this setup on Qemu?
It is not reproducible on Qemu-arm64. I see it only on the arm64 devices Juno-r2 and Rpi4.
I am able to build the linux-next kernel with the config given below. But the bug doesn't reproduce in Qemu with debian rootfs.
I guess I would need the Rootfs that is being used here to reproduce it. Can you point me to the rootfs for this?
Here is the link for rootfs - OE one. https://storage.tuxsuite.com/public/linaro/lkft/oebuilds/2RVA7dHPf73agY0gDJD...
I tested this rootfs and couldn't reproduce on Qemu. Now, I will try to use my raspberry pi and try to reproduce this.
Thanks.
Thanks, Puranjay
- Naresh
-- Thanks and Regards
Yours Truly,
Puranjay Mohan
On Wed, Jun 21, 2023 at 01:57:21PM +0100, Mark Rutland wrote:
On Wed, Jun 21, 2023 at 06:06:51PM +0530, Naresh Kamboju wrote:
Following boot warnings and crashes noticed on arm64 Rpi4 device running Linux next-20230621 kernel.
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
boot log:
[ 22.331748] Kernel text patching generated an invalid instruction at 0xffff8000835d6580! [ 22.340579] Unexpected kernel BRK exception at EL1 [ 22.346141] Internal error: BRK handler: 00000000f2000100 [#1] PREEMPT SMP
This indicates execution of AARCH64_BREAK_FAULT.
That could be from dodgy arguments to aarch64_insn_gen_*(), or elsewhere, and given this is in the networking code I suspect this'll be related to BPF.
Looking at next-20230621 I see commit:
49703aa2adfaff28 ("bpf, arm64: use bpf_jit_binary_pack_alloc")
... which changed the way BPF allocates memory, and has code that pads memory with a bunch of AARCH64_BREAK_FAULT, so it looks like that *might* be related.
For the benefit of those just looking at this thread, there has been some discussion in the original thread for this commit. Summary and links below.
We identified a potential issue with missing cache maintenance:
https://lore.kernel.org/linux-arm-kernel/ZJMXqTffB22LSOkd@FVFF77S0Q05N/
Puranjay verified that was causing the problem seen here:
https://lore.kernel.org/linux-arm-kernel/CANk7y0h5ucxmMz4K8sGx7qogFyx6PRxYxm...
Alexei has dropped this commit for now:
https://lore.kernel.org/linux-arm-kernel/CAADnVQJqDOMABEx8JuU6r_Dehyf=SkDfRN...
Thanks, Mark.
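One way to see why missing cache maintenance is a plausible explanation is to look at the "Code:" line of the first oops. As far as I understand, the kernel prints the words before the pc followed by the word at the pc in parentheses, and it reads them as ordinary data. The hypothetical Python helper below (parse_code_line is a name invented for this sketch) extracts those words. In the Rpi4 log the word at the pc reads back as a NOP, not as the BRK padding word, even though the CPU reported a BRK exception; that would be consistent with the instruction side still seeing stale padding, though this reading is an assumption and not something the log proves on its own.

```python
from typing import List, Optional, Tuple

AARCH64_BREAK_FAULT = 0xD4202000  # BRK #0x100, the padding word seen in the log
A64_NOP = 0xD503201F              # A64 NOP encoding


def parse_code_line(line: str) -> Tuple[List[int], Optional[int]]:
    """Return (words before pc, word at pc) from an oops "Code:" line."""
    body = line.split("Code:", 1)[1]
    before: List[int] = []
    at_pc: Optional[int] = None
    for token in body.split():
        value = int(token.strip("()"), 16)
        if token.startswith("("):
            at_pc = value  # the parenthesized word is the one at the pc
        else:
            before.append(value)
    return before, at_pc


before, at_pc = parse_code_line(
    "[ 22.385190] Code: d4202000 d4202000 d4202000 910003c9 (d503201f)"
)

padding_words = before.count(AARCH64_BREAK_FAULT)
print(f"padding words before pc: {padding_words}")
print(f"word at pc: {at_pc:08x} (BRK padding: {at_pc == AARCH64_BREAK_FAULT})")
```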
Hi Mark,
On Thu, 22 Jun 2023 at 15:12, Mark Rutland mark.rutland@arm.com wrote:
On Wed, Jun 21, 2023 at 01:57:21PM +0100, Mark Rutland wrote:
On Wed, Jun 21, 2023 at 06:06:51PM +0530, Naresh Kamboju wrote:
Following boot warnings and crashes noticed on arm64 Rpi4 device running Linux next-20230621 kernel.
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
boot log:
[ 22.331748] Kernel text patching generated an invalid instruction at 0xffff8000835d6580! [ 22.340579] Unexpected kernel BRK exception at EL1 [ 22.346141] Internal error: BRK handler: 00000000f2000100 [#1] PREEMPT SMP
This indicates execution of AARCH64_BREAK_FAULT.
That could be from dodgy arguments to aarch64_insn_gen_*(), or elsewhere, and given this is in the networking code I suspect this'll be related to BPF.
Looking at next-20230621 I see commit:
49703aa2adfaff28 ("bpf, arm64: use bpf_jit_binary_pack_alloc")
... which changed the way BPF allocates memory, and has code that pads memory with a bunch of AARCH64_BREAK_FAULT, so it looks like that *might* be related.
For the benefit of those just looking at this thread, there has been some discussion in the original thread for this commit. Summary and links below.
We identified a potential issue with missing cache maintenance:
https://lore.kernel.org/linux-arm-kernel/ZJMXqTffB22LSOkd@FVFF77S0Q05N/
Puranjay verified that was causing the problem seen here:
https://lore.kernel.org/linux-arm-kernel/CANk7y0h5ucxmMz4K8sGx7qogFyx6PRxYxm...
Alexei has dropped this commit for now:
https://lore.kernel.org/linux-arm-kernel/CAADnVQJqDOMABEx8JuU6r_Dehyf=SkDfRN...
Thanks for the detailed information. I am happy to test any proposed fix patches.
Thanks, Mark.
- Naresh
Hi Naresh,
On Thu, Jun 22, 2023 at 2:35 PM Naresh Kamboju naresh.kamboju@linaro.org wrote:
Hi Mark,
On Thu, 22 Jun 2023 at 15:12, Mark Rutland mark.rutland@arm.com wrote:
On Wed, Jun 21, 2023 at 01:57:21PM +0100, Mark Rutland wrote:
On Wed, Jun 21, 2023 at 06:06:51PM +0530, Naresh Kamboju wrote:
Following boot warnings and crashes noticed on arm64 Rpi4 device running Linux next-20230621 kernel.
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
boot log:
[ 22.331748] Kernel text patching generated an invalid instruction at 0xffff8000835d6580! [ 22.340579] Unexpected kernel BRK exception at EL1 [ 22.346141] Internal error: BRK handler: 00000000f2000100 [#1] PREEMPT SMP
This indicates execution of AARCH64_BREAK_FAULT.
That could be from dodgy arguments to aarch64_insn_gen_*(), or elsewhere, and given this is in the networking code I suspect this'll be related to BPF.
Looking at next-20230621 I see commit:
49703aa2adfaff28 ("bpf, arm64: use bpf_jit_binary_pack_alloc")
... which changed the way BPF allocates memory, and has code that pads memory with a bunch of AARCH64_BREAK_FAULT, so it looks like that *might* be related.
For the benefit of those just looking at this thread, there has been some discussion in the original thread for this commit. Summary and links below.
We identified a potential issue with missing cache maintenance:
https://lore.kernel.org/linux-arm-kernel/ZJMXqTffB22LSOkd@FVFF77S0Q05N/
Puranjay verified that was causing the problem seen here:
https://lore.kernel.org/linux-arm-kernel/CANk7y0h5ucxmMz4K8sGx7qogFyx6PRxYxm...
Alexei has dropped this commit for now:
https://lore.kernel.org/linux-arm-kernel/CAADnVQJqDOMABEx8JuU6r_Dehyf=SkDfRN...
Thanks for the detailed information. I am happy to test any proposed fix patches.
I have sent v4 of the patch series:
https://lore.kernel.org/bpf/20230626085811.3192402-1-puranjay12@gmail.com/T/...
This works on my Raspberry Pi 4 setup. If possible, can you test this on a similar setup where it was failing earlier?
Thanks, Mark.
- Naresh
Thanks, Puranjay
Hi Naresh,
On Mon, Jun 26, 2023 at 11:04 AM Puranjay Mohan puranjay12@gmail.com wrote:
Hi Naresh,
On Thu, Jun 22, 2023 at 2:35 PM Naresh Kamboju naresh.kamboju@linaro.org wrote:
Hi Mark,
On Thu, 22 Jun 2023 at 15:12, Mark Rutland mark.rutland@arm.com wrote:
On Wed, Jun 21, 2023 at 01:57:21PM +0100, Mark Rutland wrote:
On Wed, Jun 21, 2023 at 06:06:51PM +0530, Naresh Kamboju wrote:
Following boot warnings and crashes noticed on arm64 Rpi4 device running Linux next-20230621 kernel.
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
boot log:
[ 22.331748] Kernel text patching generated an invalid instruction at 0xffff8000835d6580! [ 22.340579] Unexpected kernel BRK exception at EL1 [ 22.346141] Internal error: BRK handler: 00000000f2000100 [#1] PREEMPT SMP
This indicates execution of AARCH64_BREAK_FAULT.
That could be from dodgy arguments to aarch64_insn_gen_*(), or elsewhere, and given this is in the networking code I suspect this'll be related to BPF.
Looking at next-20230621 I see commit:
49703aa2adfaff28 ("bpf, arm64: use bpf_jit_binary_pack_alloc")
... which changed the way BPF allocates memory, and has code that pads memory with a bunch of AARCH64_BREAK_FAULT, so it looks like that *might* be related.
For the benefit of those just looking at this thread, there has been some discussion in the original thread for this commit. Summary and links below.
We identified a potential issue with missing cache maintenance:
https://lore.kernel.org/linux-arm-kernel/ZJMXqTffB22LSOkd@FVFF77S0Q05N/
Puranjay verified that was causing the problem seen here:
https://lore.kernel.org/linux-arm-kernel/CANk7y0h5ucxmMz4K8sGx7qogFyx6PRxYxm...
Alexei has dropped this commit for now:
https://lore.kernel.org/linux-arm-kernel/CAADnVQJqDOMABEx8JuU6r_Dehyf=SkDfRN...
Thanks for the detailed information. I am happy to test any proposed fix patches.
I have sent v4 of the patch series:
https://lore.kernel.org/bpf/20230626085811.3192402-1-puranjay12@gmail.com/T/...
This works on my Raspberry Pi 4 setup. If possible, can you test this on a similar setup where it was failing earlier?
I think my previous email was missed. Can you test the V4 series in the same setup? This is still not applied to the bpf-next tree.
Thanks, Puranjay.
Hi Puranjay,
On Mon, 3 Jul 2023 at 15:07, Puranjay Mohan puranjay12@gmail.com wrote:
Hi Naresh,
On Mon, Jun 26, 2023 at 11:04 AM Puranjay Mohan puranjay12@gmail.com wrote:
Hi Naresh,
On Thu, Jun 22, 2023 at 2:35 PM Naresh Kamboju naresh.kamboju@linaro.org wrote:
Hi Mark,
On Thu, 22 Jun 2023 at 15:12, Mark Rutland mark.rutland@arm.com wrote:
On Wed, Jun 21, 2023 at 01:57:21PM +0100, Mark Rutland wrote:
On Wed, Jun 21, 2023 at 06:06:51PM +0530, Naresh Kamboju wrote:
Following boot warnings and crashes noticed on arm64 Rpi4 device running Linux next-20230621 kernel.
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
boot log:
[ 22.331748] Kernel text patching generated an invalid instruction at 0xffff8000835d6580! [ 22.340579] Unexpected kernel BRK exception at EL1 [ 22.346141] Internal error: BRK handler: 00000000f2000100 [#1] PREEMPT SMP
This indicates execution of AARCH64_BREAK_FAULT.
That could be from dodgy arguments to aarch64_insn_gen_*(), or elsewhere, and given this is in the networking code I suspect this'll be related to BPF.
Looking at next-20230621 I see commit:
49703aa2adfaff28 ("bpf, arm64: use bpf_jit_binary_pack_alloc")
... which changed the way BPF allocates memory, and has code that pads memory with a bunch of AARCH64_BREAK_FAULT, so it looks like that *might* be related.
For the benefit of those just looking at this thread, there has been some discussion in the original thread for this commit. Summary and links below.
We identified a potential issue with missing cache maintenance:
https://lore.kernel.org/linux-arm-kernel/ZJMXqTffB22LSOkd@FVFF77S0Q05N/
Puranjay verified that was causing the problem seen here:
https://lore.kernel.org/linux-arm-kernel/CANk7y0h5ucxmMz4K8sGx7qogFyx6PRxYxm...
Alexei has dropped this commit for now:
https://lore.kernel.org/linux-arm-kernel/CAADnVQJqDOMABEx8JuU6r_Dehyf=SkDfRN...
Thanks for the detailed information. I am happy to test any proposed fix patches.
I have sent v4 of the patch series:
https://lore.kernel.org/bpf/20230626085811.3192402-1-puranjay12@gmail.com/T/...
This works on my Raspberry Pi 4 setup. If possible, can you test this on a similar setup where it was failing earlier?
I think my previous email was missed. Can you test the V4 series in the same setup?
I have tested the V4 series, and the reported issues are fixed.
Tested-by: Naresh Kamboju naresh.kamboju@linaro.org
Thank you !
This is still not applied to the bpf-next tree.
Thanks, Puranjay.
- Naresh