On Fri, 29/12/2017 at 17:31 +0300, Alexander Tsoy wrote:
On Fri, 29/12/2017 at 10:17 +0100, Greg KH wrote:
On Thu, Dec 28, 2017 at 12:33:22PM +0300, Alexander Tsoy wrote:
Hello,
4.14.9 fails to boot when CONFIG_MCORE2 is enabled and the kernel is compiled with gcc 6+. More details are in the following bug reports:
https://bugzilla.kernel.org/show_bug.cgi?id=198263
https://bugs.gentoo.org/642268
I bisected it to the commit below:
$ git bisect good
2bc9fa0beaf10206a778f02e9e5cb62f50345b1a is the first bad commit
commit 2bc9fa0beaf10206a778f02e9e5cb62f50345b1a
Author: Andy Lutomirski <luto@kernel.org>
Date: Mon Dec 4 15:07:23 2017 +0100
x86/entry/64: Use a per-CPU trampoline stack for IDT entries
commit 7f2590a110b837af5679d08fc25c6227c5a8c497 upstream.
Historically, IDT entries from usermode have always gone directly to the running task's kernel stack. Rearrange it so that we enter on a per-CPU trampoline stack and then manually switch to the task's stack. This touches a couple of extra cachelines, but it gives us a chance to run some code before we touch the kernel stack.
The asm isn't exactly beautiful, but I think that fully refactoring it can wait.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150606.225330557@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
:040000 040000 275d4746936a9e521a2b5041856f7dc1d1820dc6 8f8e869fd59c3dd781dceffa76e53e41d733a0cf M arch
$ git bisect log
git bisect start
# bad: [dad5c1402c570cd07a80113784bc20a7f930c8ae] Linux 4.14.9
git bisect bad dad5c1402c570cd07a80113784bc20a7f930c8ae
# good: [7b3775017f4e6b87dfd2c7f63d1eaf057948f31d] Linux 4.14.8
git bisect good 7b3775017f4e6b87dfd2c7f63d1eaf057948f31d
# good: [d120cd749ef9770ee98b708a83b49547dcf1c0e1] x86/entry/64: Separate cpu_current_top_of_stack from TSS.sp0
git bisect good d120cd749ef9770ee98b708a83b49547dcf1c0e1
# bad: [97f41b41c432e5a80c91445d92c2f4b729984d36] powerpc/xmon: Avoid tripping SMP hardlockup watchdog
git bisect bad 97f41b41c432e5a80c91445d92c2f4b729984d36
# bad: [bfd66a406fe7e590055c1d6714adc697f18664c8] PCI: Avoid bus reset if bridge itself is broken
git bisect bad bfd66a406fe7e590055c1d6714adc697f18664c8
# bad: [8388d287e361a2fd0a39bece30a736d692d5c3d8] x86/cpufeatures: Make CPU bugs sticky
git bisect bad 8388d287e361a2fd0a39bece30a736d692d5c3d8
# bad: [bb568391775d4a840992e2d2493f39d6e86401e3] x86/entry/64: Move the IST stacks into struct cpu_entry_area
git bisect bad bb568391775d4a840992e2d2493f39d6e86401e3
# bad: [2bc9fa0beaf10206a778f02e9e5cb62f50345b1a] x86/entry/64: Use a per-CPU trampoline stack for IDT entries
git bisect bad 2bc9fa0beaf10206a778f02e9e5cb62f50345b1a
# good: [c3dbef1bd0f7eb09daf49409ea533aa1b0eeb82e] x86/espfix/64: Stop assuming that pt_regs is on the entry stack
git bisect good c3dbef1bd0f7eb09daf49409ea533aa1b0eeb82e
# first bad commit: [2bc9fa0beaf10206a778f02e9e5cb62f50345b1a] x86/entry/64: Use a per-CPU trampoline stack for IDT entries
Thanks for letting us know. Does Linus's current tree also have this same problem for you?
Just tested Linus's master branch and it has the same problem. All I can catch with a serial console is the following:
[ 0.000000] ACPI BIOS Warning[ 0.498898] Expanded resource conflict with PCI Bus 0000:00
Ooops. This one is correct:
[ 0.000000] ACPI BIOS Warning (bug): 32/64X length mismatch in FADT/Gpe0Block: 128/64 (20170831/tbfadt-603)
[ 0.000000] ACPI BIOS Warning (bug): Incorrect checksum in table [TCPA] - 0x00, should be 0x7F (20170831/tbprint-211)
[ 0.499627] Expanded resource Reserved due to conflict with PCI Bus 0000:00
[ 0.506002] Expanded resource Reserved due to conflict with PCI Bus 0000:00
[ 21.776011] INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 21.777008] 0-...!: (0 ticks this GP) idle=c56/140000000000000/0 softirq=73/73 fqs=0
[ 21.777008] (detected by 1, t=21002 jiffies, g=-255, c=-256, q=4)
[ 0.775461] NMI backtrace for cpu 0
[ 0.775461] CPU: 0 PID: 114 Comm: modprobe Not tainted 4.15.0-rc5+ #1
[ 0.775461] Hardware name: Dell Inc. OptiPlex 760 /0M858N, BIOS A16 08/06/2013
[ 0.775461] RIP: 0010:paranoid_entry+0x58/0x70
[ 0.775461] RSP: 0000:fffffe8000007f50 EFLAGS: 00000083
[ 0.775461] RAX: 0000000077c00000 RBX: 0000000000000001 RCX: 00000000c0000101
[ 0.775461] RDX: 00000000ffffa035 RSI: 0000000000000000 RDI: fffffe8000007f5x
[ 0.775461] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 0.775461] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffaecb5b36
[ 0.775461] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 0.775461] FS: 0000000000000000(0000) GS:ffffa03577c00000(0000) knlGS:0000000000000000
[ 0.775461] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.775461] CR2: fffffe8000006f08 CR3: 000000022952c000 CR4: 00000000000406f0
[ 0.775461] Call Trace:
[ 0.775461] <#DF>
[ 0.775461] ? double_fault+0xc/0x30
[ 0.775461] ? page_fault+0x36/0x60
[ 0.775461] do_double_fault+0xb/0x130
[ 0.775461] </#DF>
[ 0.775461] Code: 78 4c 89 7c 24 08 4c 89 74 24 10 4c 89 6c 24 18 4c 89 64 24 20 48 89 6c 24 28 48 89 5c 24 30 bb 01 00 00 00 b9 01 01 00 c0 0f 32 <85> d2 78 05 0f 01 f8 31 db c3 0f 1f 40 00 66 2e 0f 1f 84 00 00
[ 21.777008] rcu_preempt kthread starved for 21002 jiffies! g18446744073709551361 c18446744073709551360 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=0
[ 21.777008] Call Trace:
[ 21.777008] ? __schedule+0x37f/0x7b0
[ 21.777008] ? preempt_count_add+0x64/0xa0
[ 21.777008] schedule+0x4a/0xa0
[ 21.777008] schedule_timeout+0x179/0x380
[ 21.777008] ? __next_timer_interrupt+0xd0/0xd0
[ 21.777008] rcu_gp_kthread+0x96b/0x1050
[ 21.777008] ? calc_global_load_tick+0x61/0x70
[ 21.777008] kthread+0xff/0x130
[ 21.777008] ? force_qs_rnp+0x1d0/0x1d0
[ 21.777008] ? kthread_create_worker_on_cpu+0x70/0x70
[ 21.777008] ret_from_fork+0x1f/0x30
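
A couple of numbers in this trace can be cross-checked with simple arithmetic. The sketch below is not from the original report, and its helper names (as_u64, split_msr) are hypothetical. It assumes that the `0f 32` bytes just before the marked instruction in the Code: line are an RDMSR of the MSR selected by RCX (0xc0000101, MSR_GS_BASE), and that the GS: field of the dump holds the same value. Under those assumptions, the RDX value should equal the upper half of the GS base, and the g=-255/c=-256 counters in the stall message should be the signed view of the large unsigned values printed in the "kthread starved" line.

```python
MASK64 = (1 << 64) - 1


def as_u64(value):
    """Reinterpret a (possibly negative) integer as an unsigned 64-bit value."""
    return value & MASK64


def split_msr(value):
    """Split a 64-bit MSR value into (EDX, EAX), the way RDMSR returns it."""
    return (value >> 32) & 0xFFFFFFFF, value & 0xFFFFFFFF


# Values copied from the trace above.
gs_base = 0xFFFFA03577C00000   # GS: field of the NMI backtrace
rdx_in_dump = 0xFFFFA035       # RDX: field of the NMI backtrace

edx, eax = split_msr(gs_base)
print(hex(edx), hex(eax))          # 0xffffa035 0x77c00000
print(edx == rdx_in_dump)          # True

# RCU grace-period counters: signed view vs. unsigned 64-bit view.
print(as_u64(-255), as_u64(-256))  # 18446744073709551361 18446744073709551360
```

If the assumptions hold, this is consistent with CPU 0 having just read the GS base in paranoid_entry when the NMI backtrace was taken; it does not by itself explain the double fault.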