On Fri, Dec 29, 2017 at 1:50 PM, Alexander Tsoy <alexander@tsoy.me> wrote:
>> Ho humm. What happens if you change the "-march=core2" to "-mtune=core2"? Does it still boot?
> That's interesting. Compiled with -mtune=core2, the kernel fails to boot.
[ Insert "twilight zone" theme music ]
Damn. I was hoping that "-march=core2" would enable something specific that causes the failure, and that "-mtune=core2" would just schedule for core2 but not fail, and then we could compare the two and see what triggers things.
But no such luck. It's apparently just fundamentally the instruction scheduling and selection for core2 that causes problems, so -mtune ends up failing the same way as -march.
It could be something entirely random, and some instruction scheduling detail just ends up showing it by happenstance.
And sadly, we have almost nothing to go by.
The fact that double faults seem to be implicated does make me want to try to disable that ESPFIX64 code in the #DF handler.
What happens if you take a failing kernel, and then in arch/x86/kernel/traps.c do_double_fault(), you change the
    #ifdef CONFIG_X86_ESPFIX64
to just a
    #if 0
do you then get an actual double-fault oops report instead of the stall (and NMI oops)?
But honestly, I'm just throwing random ideas out now.
Hopefully somebody else has a better idea than I do. Andy?
Linus