Hello,
4.14.9 fails to boot if CONFIG_MCORE2 is enabled and the kernel is compiled with gcc 6+. More details are in the following bug reports:
https://bugzilla.kernel.org/show_bug.cgi?id=198263
https://bugs.gentoo.org/642268
I bisected it to the commit below:
$ git bisect good
2bc9fa0beaf10206a778f02e9e5cb62f50345b1a is the first bad commit
commit 2bc9fa0beaf10206a778f02e9e5cb62f50345b1a
Author: Andy Lutomirski <luto@kernel.org>
Date:   Mon Dec 4 15:07:23 2017 +0100

    x86/entry/64: Use a per-CPU trampoline stack for IDT entries

    commit 7f2590a110b837af5679d08fc25c6227c5a8c497 upstream.

    Historically, IDT entries from usermode have always gone directly to the running task's kernel stack. Rearrange it so that we enter on a per-CPU trampoline stack and then manually switch to the task's stack. This touches a couple of extra cachelines, but it gives us a chance to run some code before we touch the kernel stack.

    The asm isn't exactly beautiful, but I think that fully refactoring it can wait.

    Signed-off-by: Andy Lutomirski <luto@kernel.org>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Borislav Petkov <bp@suse.de>
    Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: Borislav Petkov <bpetkov@suse.de>
    Cc: Brian Gerst <brgerst@gmail.com>
    Cc: Dave Hansen <dave.hansen@intel.com>
    Cc: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: David Laight <David.Laight@aculab.com>
    Cc: Denys Vlasenko <dvlasenk@redhat.com>
    Cc: Eduardo Valentin <eduval@amazon.com>
    Cc: Greg KH <gregkh@linuxfoundation.org>
    Cc: H. Peter Anvin <hpa@zytor.com>
    Cc: Josh Poimboeuf <jpoimboe@redhat.com>
    Cc: Juergen Gross <jgross@suse.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Will Deacon <will.deacon@arm.com>
    Cc: aliguori@amazon.com
    Cc: daniel.gruss@iaik.tugraz.at
    Cc: hughd@google.com
    Cc: keescook@google.com
    Link: https://lkml.kernel.org/r/20171204150606.225330557@linutronix.de
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

:040000 040000 275d4746936a9e521a2b5041856f7dc1d1820dc6 8f8e869fd59c3dd781dceffa76e53e41d733a0cf M arch
$ git bisect log
git bisect start
# bad: [dad5c1402c570cd07a80113784bc20a7f930c8ae] Linux 4.14.9
git bisect bad dad5c1402c570cd07a80113784bc20a7f930c8ae
# good: [7b3775017f4e6b87dfd2c7f63d1eaf057948f31d] Linux 4.14.8
git bisect good 7b3775017f4e6b87dfd2c7f63d1eaf057948f31d
# good: [d120cd749ef9770ee98b708a83b49547dcf1c0e1] x86/entry/64: Separate cpu_current_top_of_stack from TSS.sp0
git bisect good d120cd749ef9770ee98b708a83b49547dcf1c0e1
# bad: [97f41b41c432e5a80c91445d92c2f4b729984d36] powerpc/xmon: Avoid tripping SMP hardlockup watchdog
git bisect bad 97f41b41c432e5a80c91445d92c2f4b729984d36
# bad: [bfd66a406fe7e590055c1d6714adc697f18664c8] PCI: Avoid bus reset if bridge itself is broken
git bisect bad bfd66a406fe7e590055c1d6714adc697f18664c8
# bad: [8388d287e361a2fd0a39bece30a736d692d5c3d8] x86/cpufeatures: Make CPU bugs sticky
git bisect bad 8388d287e361a2fd0a39bece30a736d692d5c3d8
# bad: [bb568391775d4a840992e2d2493f39d6e86401e3] x86/entry/64: Move the IST stacks into struct cpu_entry_area
git bisect bad bb568391775d4a840992e2d2493f39d6e86401e3
# bad: [2bc9fa0beaf10206a778f02e9e5cb62f50345b1a] x86/entry/64: Use a per-CPU trampoline stack for IDT entries
git bisect bad 2bc9fa0beaf10206a778f02e9e5cb62f50345b1a
# good: [c3dbef1bd0f7eb09daf49409ea533aa1b0eeb82e] x86/espfix/64: Stop assuming that pt_regs is on the entry stack
git bisect good c3dbef1bd0f7eb09daf49409ea533aa1b0eeb82e
# first bad commit: [2bc9fa0beaf10206a778f02e9e5cb62f50345b1a] x86/entry/64: Use a per-CPU trampoline stack for IDT entries
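For anyone retracing this bisection, here is a minimal shell sketch, not part of the original report. It saves the log above to a file (the name bisect.log is an arbitrary choice), counts the recorded verdicts, and pulls out the last commit that was marked bad. The commit IDs are copied from the log; everything else is illustrative.

```sh
#!/bin/sh
# Save the bisect log from the report; commit IDs are copied verbatim.
# The file name bisect.log is an arbitrary choice for this sketch.
cat > bisect.log <<'EOF'
git bisect start
# bad: [dad5c1402c570cd07a80113784bc20a7f930c8ae] Linux 4.14.9
git bisect bad dad5c1402c570cd07a80113784bc20a7f930c8ae
# good: [7b3775017f4e6b87dfd2c7f63d1eaf057948f31d] Linux 4.14.8
git bisect good 7b3775017f4e6b87dfd2c7f63d1eaf057948f31d
# good: [d120cd749ef9770ee98b708a83b49547dcf1c0e1] x86/entry/64: Separate cpu_current_top_of_stack from TSS.sp0
git bisect good d120cd749ef9770ee98b708a83b49547dcf1c0e1
# bad: [97f41b41c432e5a80c91445d92c2f4b729984d36] powerpc/xmon: Avoid tripping SMP hardlockup watchdog
git bisect bad 97f41b41c432e5a80c91445d92c2f4b729984d36
# bad: [bfd66a406fe7e590055c1d6714adc697f18664c8] PCI: Avoid bus reset if bridge itself is broken
git bisect bad bfd66a406fe7e590055c1d6714adc697f18664c8
# bad: [8388d287e361a2fd0a39bece30a736d692d5c3d8] x86/cpufeatures: Make CPU bugs sticky
git bisect bad 8388d287e361a2fd0a39bece30a736d692d5c3d8
# bad: [bb568391775d4a840992e2d2493f39d6e86401e3] x86/entry/64: Move the IST stacks into struct cpu_entry_area
git bisect bad bb568391775d4a840992e2d2493f39d6e86401e3
# bad: [2bc9fa0beaf10206a778f02e9e5cb62f50345b1a] x86/entry/64: Use a per-CPU trampoline stack for IDT entries
git bisect bad 2bc9fa0beaf10206a778f02e9e5cb62f50345b1a
# good: [c3dbef1bd0f7eb09daf49409ea533aa1b0eeb82e] x86/espfix/64: Stop assuming that pt_regs is on the entry stack
git bisect good c3dbef1bd0f7eb09daf49409ea533aa1b0eeb82e
EOF

# Each "git bisect good|bad <sha>" line is one recorded verdict
# (the first two set the initial 4.14.8..4.14.9 range).
steps=$(grep -cE '^git bisect (good|bad) ' bisect.log)

# Bisection only tests commits inside the current range, so once the search
# has converged the last commit marked bad is the first bad commit.
first_bad=$(awk '$1 == "git" && $2 == "bisect" && $3 == "bad" { sha = $4 } END { print sha }' bisect.log)

echo "verdicts recorded: $steps"
echo "first bad commit:  $first_bad"

# Inside a linux-stable checkout, the session could be restored with:
#   git bisect replay bisect.log
```

The extracted ID agrees with the "first bad commit" line that git printed in the report.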
On Thu, Dec 28, 2017 at 12:33:22PM +0300, Alexander Tsoy wrote:
...
Thanks for letting us know. Does Linus's current tree also have this same problem for you?
greg k-h
On Fri, 29/12/2017 at 10:17 +0100, Greg KH wrote:
Thanks for letting us know. Does Linus's current tree also have this same problem for you?
Just tested Linus's master branch and it has the same problem. All I can catch with a serial console is the following:
[ 0.000000] ACPI BIOS Warning[ 0.498898] Expanded resource conflict with PCI Bus 0000:00
On Fri, 29/12/2017 at 17:31 +0300, Alexander Tsoy wrote:
Just tested Linus's master branch and it has the same problem. All I can catch with a serial console is the following:
[ 0.000000] ACPI BIOS Warning[ 0.498898] Expanded resource conflict with PCI Bus 0000:00
Ooops. This one is correct:
[    0.000000] ACPI BIOS Warning (bug): 32/64X length mismatch in FADT/Gpe0Block: 128/64 (20170831/tbfadt-603)
[    0.000000] ACPI BIOS Warning (bug): Incorrect checksum in table [TCPA] - 0x00, should be 0x7F (20170831/tbprint-211)
[    0.499627] Expanded resource Reserved due to conflict with PCI Bus 0000:00
[    0.506002] Expanded resource Reserved due to conflict with PCI Bus 0000:00
[   21.776011] INFO: rcu_preempt detected stalls on CPUs/tasks:
[   21.777008] 0-...!: (0 ticks this GP) idle=c56/140000000000000/0 softirq=73/73 fqs=0
[   21.777008] (detected by 1, t=21002 jiffies, g=-255, c=-256, q=4)
[    0.775461] NMI backtrace for cpu 0
[    0.775461] CPU: 0 PID: 114 Comm: modprobe Not tainted 4.15.0-rc5+ #1
[    0.775461] Hardware name: Dell Inc. OptiPlex 760 /0M858N, BIOS A16 08/06/2013
[    0.775461] RIP: 0010:paranoid_entry+0x58/0x70
[    0.775461] RSP: 0000:fffffe8000007f50 EFLAGS: 00000083
[    0.775461] RAX: 0000000077c00000 RBX: 0000000000000001 RCX: 00000000c0000101
[    0.775461] RDX: 00000000ffffa035 RSI: 0000000000000000 RDI: fffffe8000007f58
[    0.775461] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[    0.775461] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffaecb5b36
[    0.775461] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[    0.775461] FS: 0000000000000000(0000) GS:ffffa03577c00000(0000) knlGS:0000000000000000
[    0.775461] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.775461] CR2: fffffe8000006f08 CR3: 000000022952c000 CR4: 00000000000406f0
[    0.775461] Call Trace:
[    0.775461] <#DF>
[    0.775461] ? double_fault+0xc/0x30
[    0.775461] ? page_fault+0x36/0x60
[    0.775461] do_double_fault+0xb/0x130
[    0.775461] </#DF>
[    0.775461] Code: 78 4c 89 7c 24 08 4c 89 74 24 10 4c 89 6c 24 18 4c 89 64 24 20 48 89 6c 24 28 48 89 5c 24 30 bb 01 00 00 00 b9 01 01 00 c0 0f 32 <85> d2 78 05 0f 01 f8 31 db c3 0f 1f 40 00 66 2e 0f 1f 84 00 00
[   21.777008] rcu_preempt kthread starved for 21002 jiffies! g18446744073709551361 c18446744073709551360 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=0
[   21.777008] Call Trace:
[   21.777008] ? __schedule+0x37f/0x7b0
[   21.777008] ? preempt_count_add+0x64/0xa0
[   21.777008] schedule+0x4a/0xa0
[   21.777008] schedule_timeout+0x179/0x380
[   21.777008] ? __next_timer_interrupt+0xd0/0xd0
[   21.777008] rcu_gp_kthread+0x96b/0x1050
[   21.777008] ? calc_global_load_tick+0x61/0x70
[   21.777008] kthread+0xff/0x130
[   21.777008] ? force_qs_rnp+0x1d0/0x1d0
[   21.777008] ? kthread_create_worker_on_cpu+0x70/0x70
[   21.777008] ret_from_fork+0x1f/0x30
On Fri, 29 Dec 2017, Alexander Tsoy wrote:
Just tested Linus's master branch and it has the same problem. All I can catch with a serial console is the following:
So, for completeness' sake:
         MCORE2=y   MCORE2=n
GCC5.x   works      works
GCC6.x   fail       works
GCC7.x   works      works
Is that correct?
Thanks,
tglx
On Fri, 29/12/2017 at 17:11 +0100, Thomas Gleixner wrote:
So, for completeness' sake:
         MCORE2=y   MCORE2=n
GCC5.x   works      works
GCC6.x   fail       works
GCC7.x   works      works
Is that correct?
I haven't tested with GCC7.x, but another user reported [1] that it also fails. So I guess the table should be:
         MCORE2=y   MCORE2=n
GCC5.x   works      works
GCC6.x   fail       works
GCC7.x   fail       works
Does anyone have the results of a build that they can share (vmlinux, vmlinuz/bzImage, System.map, .config)? That, plus a corresponding serial log with an oops, would be helpful.
I tried just adding MCORE2=y to my normal config, but it didn't reproduce this.
If you can't send the entire build like that, just running scripts/faddr2line on __schedule+0x37f/0x7b0 would be very enlightening.
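A minimal sketch of that faddr2line step, under two assumptions the thread does not state: the command is run from the top of the tree that produced the failing kernel, and vmlinux there was built with debug info. The symbol is the one from the RCU stall trace; the variable names are only for illustration.

```sh
#!/bin/sh
# Symbol+offset/size taken from the "rcu_preempt kthread starved" call trace.
sym='__schedule+0x37f/0x7b0'

# Assumption: current directory is the top of the build tree of the failing kernel.
if [ -x scripts/faddr2line ] && [ -f vmlinux ]; then
    # Prints the source file and line for the given function offset.
    scripts/faddr2line vmlinux "$sym"
    status=attempted
else
    echo "scripts/faddr2line or vmlinux not found; run from the top of the build tree" >&2
    status=skipped
fi
```

The offset is only meaningful against the exact vmlinux that produced the trace, so a rebuilt kernel with a different compiler or config would need a fresh trace.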
On 12/29/2017 06:41 AM, Alexander Tsoy wrote:
[    0.775461] NMI backtrace for cpu 0
[    0.775461] CPU: 0 PID: 114 Comm: modprobe Not tainted 4.15.0-rc5+
...
[    0.775461] Call Trace:
[    0.775461] <#DF>
[    0.775461] ? double_fault+0xc/0x30
[    0.775461] ? page_fault+0x36/0x60
[    0.775461] do_double_fault+0xb/0x130
[    0.775461] </#DF>
[    0.775461] Code: 78 4c 89 7c 24 08 4c 89 74 24 10 4c 89 6c 24 18 4c 89 64 24 20 48 89 6c 24 28 48 89 5c 24 30 bb 01 00 00 00 b9 01 01 00 c0 0f 32 <85> d2 78 05 0f 01 f8 31 db c3 0f 1f 40 00 66 2e 0f 1f 84 00 00
From the various oopses, it looks like this happens when getting a double fault while trying to go idle. The CPU is probably trying to return from the double fault, but it didn't do anything useful in the fault handler, so it just continues faulting; the NMI watchdog can still get an oops out of it.
It doesn't appear to be recursing *too* far, because it's not blowing through the stack and triple faulting.
Of the several traces, they all appear to be in paths that might call safe_halt() (including the kvm async page fault code). It makes me wonder if we've been taking double faults there for a long time, but the new trampoline stack somehow ends up being more fragile and can't recover from the double-fault.
Couple more things:
MCORE2 seems to get one oddball compiler flag (-march=core2):
cflags-$(CONFIG_MCORE2) += \
        $(call cc-option,-march=core2,$(call cc-option,-mtune=generic))
It would be interesting to see if replacing the above "$(call" with:
$(call cc-option,-mtune=generic)
makes the problem go away the same way as changing the .config option.
The MCORE2 config option also sets CONFIG_X86_P6_NOP, which overrides the normal X86_64 noops, if I'm reading that code correctly. But I think that's much less likely to be the since there
On Fri, 29/12/2017 at 09:32 -0800, Dave Hansen wrote:
Does anyone have the results of a build that they can share (vmlinux, vmlinuz/bzImage, System.map, .config)? That, plus a corresponding serial log with an oops, would be helpful.
Here you are: https://www.dropbox.com/s/yesupqgig3uxf73/linux-4.15-rc5%2B.tar.xz?dl=0
On 12/29/2017 10:46 AM, Alexander Tsoy wrote:
Here you are: https://www.dropbox.com/s/yesupqgig3uxf73/linux-4.15-rc5%2B.tar.xz?dl=0
Alexander, thanks a bunch for the quick turnaround on this. It is much appreciated!
With your binary, I can reproduce this in a KVM guest. Seems we manage to get to paranoid_entry with a kernel GS value, but the user page tables in place. We don't smash the #DF stack because we reset the stack at each new #DF. I think the loop that we get stuck in goes something like this:
1. Hardware does #DF, calls double_fault
2. call paranoid_entry
3. check MSR for GSBASE, see it has kernel value, skip SWAPGS and switch to kernel page tables
4. touch stack, try to #PF, but can't touch stack, so #DF and goto 1
The real question is where we double-faulted from in the first place with a kernel GSBASE and user CR3. I think I just need to disable KASLR and do a little work in gdb to look at the stack on the first double-fault, but we'll see.
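Step 3 of the loop above can be cross-checked against the register dump Alexander posted. The sketch below (plain POSIX shell, values copied from that oops) reassembles the MSR value that rdmsr returned in EDX:EAX and applies the same sign test that the Code: bytes show right after rdmsr (85 d2 is test %edx,%edx, 78 05 is js, 0f 01 f8 is swapgs). The variable names are illustrative, and reading RCX=c0000101 as the GS base MSR relies on the architectural MSR number rather than on anything stated in the thread.

```sh
#!/bin/sh
# Register values copied from the NMI backtrace in the oops.
ecx=0xc0000101   # RCX: MSR index loaded before rdmsr (GS base MSR)
eax=0x77c00000   # RAX: low 32 bits returned by rdmsr
edx=0xffffa035   # RDX: high 32 bits returned by rdmsr

# rdmsr returns the 64-bit MSR as EDX:EAX; print the two halves side by side.
gsbase=$(printf '0x%08x%08x' "$edx" "$eax")
echo "MSR $ecx = $gsbase"

# "test %edx,%edx; js" branches when bit 31 of EDX is set, i.e. when the
# upper half looks like a kernel address; the branch jumps over swapgs.
if [ $(( $edx & 0x80000000 )) -ne 0 ]; then
    verdict="kernel GSBASE, SWAPGS skipped"
else
    verdict="user GSBASE, SWAPGS executed"
fi
echo "$verdict"
```

The reassembled value matches the GS: field of the same dump, which is consistent with the "kernel GSBASE" half of the observation; it says nothing about which page tables were active.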
On Fri, 29/12/2017 at 17:04 -0800, Dave Hansen wrote:
Alexander, thanks a bunch for the quick turnaround on this. It is much appreciated!
Dave, it turned out that the issue was caused by -fstack-check. See the thread "4.14.9 doesn't boot (regression)".
On Fri, Dec 29, 2017 at 9:32 AM, Dave Hansen dave.hansen@intel.com wrote:
From the various oopses, it looks like this happens when getting a double fault while trying to go idle. The CPU is probably trying to return from the double fault, but it didn't do anything useful in the fault handler, so it just continues faulting; the NMI watchdog can still get an oops out of it.
Hmm. Which oops are you looking at? The ones I see in the bugzilla don't seem to have anything interesting in them.
[ Oh. I think I see the one you think of in the gentoo bug report ]
There does seem to be a lot of odd double faults that don't make progress.
And that in turn indicates that it may be about ESPFIX64 - all other double fault cases should cause a fault printout, but ESPFIX64 has a magical silent "turn double fault into a fake #GP fault".
Maybe that one triggers over and over again?
Couple more things:
MCORE2 seems to get one oddball compiler flag (-march=core2):
cflags-$(CONFIG_MCORE2) += \
        $(call cc-option,-march=core2,$(call cc-option,-mtune=generic))
It would be interesting to see if replacing the above "$(call" with:
$(call cc-option,-mtune=generic)
makes the problem go away the same way as changing the .config option.
Definitely.
The MCORE2 config option also sets CONFIG_X86_P6_NOP, which overrides the normal X86_64 noops, if I'm reading that code correctly.
Only for the ASM_NOPx nops, as far as I can tell. The actual alternative NOP rewriting seems to pick the nops based on machine, not on config options.
And I don't see anybody who actually uses the ASM_NOPx defines except for arch/x86/kernel/kprobes/opt.c, which uses ASM_NOP5.
Am I missing something? We actually have a lot of lines in arch/x86/include/asm/nops.h that set the ASM_NOPx values to the proper things, but then they are never used. We have that special "ASM_NOP5_ATOMIC" define that we are so careful about, but again, it's actually never used as far as I can tell.
Maybe there's some magic token concatenation use that I'm missing in my trivial grep, but it does seem to be dead code.
But double-checking that "-march=core2" case is definitely worth looking into. Especially since there are clear indications that it's gcc version-dependent anyway. Alexander?
Linus
On Fri, 29/12/2017 at 11:31 -0800, Linus Torvalds wrote:
On Fri, Dec 29, 2017 at 9:32 AM, Dave Hansen dave.hansen@intel.com
--------------->%---------------
MCORE2 seems to get one oddball compiler flag (-march=core2):
cflags-$(CONFIG_MCORE2) += \
        $(call cc-option,-march=core2,$(call cc-option,-mtune=generic))
It would be interesting to see if replacing the above "$(call" with:
$(call cc-option,-mtune=generic)
makes the problem go away the same way as changing the .config option.
Definitely.
--------------->%---------------
But double-checking that "-march=core2" case is definitely worth looking into. Especially since there are clear indications that it's gcc version-dependent anyway. Alexander?
Yes, the change suggested by Dave makes the problem go away.
On Fri, Dec 29, 2017 at 12:22 PM, Alexander Tsoy alexander@tsoy.me wrote:
But double-checking that "-march=core2" case is definitely worth looking into. Especially since there are clear indications that it's gcc version-dependent anyway. Alexander?
Yes, the change suggested by Dave makes the problem go away.
Ok, that's good information.
It doesn't really explain *why* that commit 7f2590a110b8 ("x86/entry/64: Use a per-CPU trampoline stack for IDT entries") ends up being sensitive to that compiler option, though.
So it narrows the cause down, but it doesn't really root-cause the problem. It tends to be almost impossible to find differences in code generation, because they are generally all over.
Ho humm. What happens if you change the "-march=core2" to "-mtune=core2"? Does it still boot?
Because maybe the actual differences that "-march=core2" generates might be easier to see when compared to "-mtune=core2".
Linus
On Fri, 29/12/2017 at 12:34 -0800, Linus Torvalds wrote:
Ho humm. What happens if you change the "-march=core2" to "-mtune=core2"? Does it still boot?
Because maybe the actual differences that "-march=core2" generates might be easier to see when compared to "-mtune=core2".
That's interesting. Compiled with -mtune=core2, the kernel fails to boot.
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 3e73bc255e4e..f4d8f9497666 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -127,8 +127,7 @@ else
         cflags-$(CONFIG_MK8) += $(call cc-option,-march=k8)
         cflags-$(CONFIG_MPSC) += $(call cc-option,-march=nocona)
 
-        cflags-$(CONFIG_MCORE2) += \
-                $(call cc-option,-march=core2,$(call cc-option,-mtune=generic))
+        cflags-$(CONFIG_MCORE2) += $(call cc-option,-mtune=core2)
         cflags-$(CONFIG_MATOM) += $(call cc-option,-march=atom) \
                 $(call cc-option,-mtune=atom,$(call cc-option,-mtune=generic))
         cflags-$(CONFIG_GENERIC_CPU) += $(call cc-option,-mtune=generic)
On Fri, Dec 29, 2017 at 1:50 PM, Alexander Tsoy alexander@tsoy.me wrote:
Ho humm. What happens if you change the "-march=core2" to "-mtune=core2"? Does it still boot?
That's interesting. Compiled with -mtune=core2, the kernel fails to boot.
[ Insert "twilight zone" theme music ]
Damn. I was hoping that "-march=core2" would enable something specific that causes the failure, and that "-mtune=core2" would just schedule for core2 but not fail, and then we could compare the two and see what triggers things.
But apparently no such luck. It's apparently just fundamentally the instruction scheduling and selection for core2 that causes problems, so mtune ends up being the same as march.
It could be something entirely random, and some instruction scheduling detail just ends up showing it by happenstance.
And sadly, we have almost nothing to go by.
The fact that double faults seem to be implicated does make me want to try to disable that ESPFIX64 code in the #DF handler.
What happens if you take a failing kernel, and then in arch/x86/kernel/traps.c do_double_fault(), you change the
#ifdef CONFIG_X86_ESPFIX64
to just a
#if 0
do you then get an actual double-fault oops report instead of the stall (and NMI oops)?
But honestly, I'm just throwing random ideas out now.
Hopefully somebody else has a better idea than I do. Andy?
Linus
On Fri, 29/12/2017 at 14:09 -0800, Linus Torvalds wrote:
...
The fact that double faults seem to be implicated does make me want to try to disable that ESPFIX64 code in the #DF handler.
What happens if you take a failing kernel, and then in arch/x86/kernel/traps.c do_double_fault(), you change the
#ifdef CONFIG_X86_ESPFIX64
to just a
#if 0
do you then get an actual double-fault oops report instead of the stall (and NMI oops)?
This is what I get after disabling ESPFIX64 (see attachment).
On Fri, Dec 29, 2017 at 3:15 PM, Alexander Tsoy alexander@tsoy.me wrote:
This is what I get after disabling ESPFIX64 (see attachment).
Ok, looks like it made no difference for you or for Toralf.
So that was a waste of time. Damn. Also very strange how there's that double fault in the call trace, but no actual output from any double fault. Without the ESPFIX64 code, I don't see how that happens, but since I have no idea what is going on here, I'm obviously missing a lot.
Hopefully somebody else has a clue or sees something I'm missing.
Linus
linux-stable-mirror@lists.linaro.org