On Tue, Oct 17, 2023 at 01:34:18PM +0530, Naresh Kamboju wrote:
Following kernel crash noticed while running selftests: ftrace: ftracetest-ktap on FVP models running stable-rc 6.5.8-rc2.
This is not an easy to reproduce issue and not seen on mainline and next. We are investigating this report.
To confirm, have you seen this on other stable branches as well, or is this only v6.5? For how long have you been seeing this?
[ 764.987161] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 765.074221] Call trace:
[ 765.075045] sve_save_state+0x4/0xf0
[ 765.076138] fpsimd_thread_switch+0x2c/0xe8
[ 765.077305] __switch_to+0x20/0x158
[ 765.078384] __schedule+0x2cc/0xb38
[ 765.079464] preempt_schedule_irq+0x44/0xa8
[ 765.080633] el1_interrupt+0x4c/0x68
[ 765.081691] el1h_64_irq_handler+0x18/0x28
[ 765.082829] el1h_64_irq+0x64/0x68
[ 765.083874] ftrace_return_to_handler+0x98/0x158
[ 765.085090] return_to_handler+0x20/0x48
[ 765.086205] do_sve_acc+0x64/0x128
[ 765.087272] el0_sve_acc+0x3c/0xa0
[ 765.088356] el0t_64_sync_handler+0x114/0x130
[ 765.089524] el0t_64_sync+0x190/0x198
So something managed to get flagged as having SVE state without having the backing storage allocated. We *were* preempted in the SVE access handler, which does the allocation, but I can't see the path that would trigger that since we allocate the state before setting TIF_SVE. It's possible the compiler did something funky; a decode of the backtrace might help show that?
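
To spell out the ordering I mean, here is a rough user-space model, not the actual kernel code. The struct and function names (task_model, model_sve_access, model_switch_save_is_safe) are made up for illustration, and the real paths involve more state than this. The only point is that if storage is allocated before the flag is set, a context switch arriving at any point in the handler should never see the flag without the storage.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-task SVE bookkeeping; not the kernel's struct. */
struct task_model {
	void *sve_state;	/* backing storage, NULL until allocated */
	bool tif_sve;		/* models the TIF_SVE thread flag */
};

/* Models the access-trap path: allocate the storage first, then set the flag. */
static int model_sve_access(struct task_model *t, size_t state_size)
{
	if (!t->sve_state) {
		t->sve_state = calloc(1, state_size);
		if (!t->sve_state)
			return -1;	/* allocation failed, flag stays clear */
	}
	t->tif_sve = true;
	return 0;
}

/*
 * Models the check implied by the context-switch save: saving SVE state is
 * only safe if the flag is clear or the storage exists. The crash above
 * corresponds to this returning false.
 */
static bool model_switch_save_is_safe(const struct task_model *t)
{
	return !t->tif_sve || t->sve_state != NULL;
}
```

In this model a preemption between the allocation and the flag update is harmless, since the flag is still clear at that point. The unsafe combination (flag set, storage NULL) would need some other path or a reordering, which is why the decode would be interesting.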