On Thu, 21 Nov 2013, Dave P Martin wrote:
On Thu, Nov 21, 2013 at 06:48:32PM +0000, Lorenzo Pieralisi wrote:
On Thu, Nov 21, 2013 at 03:10:58PM +0000, Jon Medhurst (Tixy) wrote:
On Thu, 2013-11-21 at 00:09 -0500, Nicolas Pitre wrote:
I've been banging my head on this one for quite a while now. The problem is that there is very little debug output available. Could you see if you get the same?
I get the same, with the same kernel version. If I disable cpufreq configs, then I get another (possibly different) crash, which has a backtrace.
Same here. Reverting this commit does the trick for me, but that's more a symptom than a cure; I still can't pinpoint what the problem is, though it's pretty easy to trigger:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/kerne...
This doesn't fix it for me, but my kernel tree has a few extra patches so there might be some other inconsistency somewhere. Mind you, except for the TC2 mcpm power_down_finish() patch I don't _think_ I have anything relevant to these symptoms...
What I'm seeing is a stall in kernel/cpu.c:cpu_down()->synchronize_sched()->wait_rcu_gp() with cpu_hotplug.lock held.
This doesn't kill the system, but any thread that tries to perform CPU hotplug gets stuck in uninterruptible sleep trying to take cpu_hotplug.lock:
kworker/1:2     D 80424dfc     0  2366      2 0x00000000
Workqueue: events cpuset_hotplug_workfn
[<80424dfc>] (__schedule+0x218/0x5e0) from [<80425588>] (schedule_preempt_disabled+0xc/0x10)
[<80425588>] (schedule_preempt_disabled+0xc/0x10) from [<804278b4>] (mutex_lock_nested+0x1a0/0x38c)
[<804278b4>] (mutex_lock_nested+0x1a0/0x38c) from [<80023a40>] (get_online_cpus+0x30/0x4c)
[<80023a40>] (get_online_cpus+0x30/0x4c) from [<8008b030>] (rebuild_sched_domains_locked+0x1c/0x458)
[<8008b030>] (rebuild_sched_domains_locked+0x1c/0x458) from [<8008cba4>] (rebuild_sched_domains+0x1c/0x28)
[<8008cba4>] (rebuild_sched_domains+0x1c/0x28) from [<8008cdbc>] (cpuset_hotplug_workfn+0x20c/0x534)
[<8008cdbc>] (cpuset_hotplug_workfn+0x20c/0x534) from [<8003b434>] (process_one_work+0x1b0/0x4d0)
[<8003b434>] (process_one_work+0x1b0/0x4d0) from [<8003bb50>] (worker_thread+0x138/0x3c0)
[<8003bb50>] (worker_thread+0x138/0x3c0) from [<80041dac>] (kthread+0xc4/0xe0)
[<80041dac>] (kthread+0xc4/0xe0) from [<8000e2e8>] (ret_from_fork+0x14/0x2c)
bash            D 80424dfc     0  2386   2385 0x00000000
[<80424dfc>] (__schedule+0x218/0x5e0) from [<804246b8>] (schedule_timeout+0x120/0x1bc)
[<804246b8>] (schedule_timeout+0x120/0x1bc) from [<80425a0c>] (wait_for_common+0xa8/0x14c)
[<80425a0c>] (wait_for_common+0xa8/0x14c) from [<8006bef8>] (wait_rcu_gp+0x44/0x4c)
[<8006bef8>] (wait_rcu_gp+0x44/0x4c) from [<8041f068>] (_cpu_down+0x88/0x230)
[<8041f068>] (_cpu_down+0x88/0x230) from [<8041f238>] (cpu_down+0x28/0x3c)
[<8041f238>] (cpu_down+0x28/0x3c) from [<80285de0>] (device_offline+0x8c/0xb4)
[<80285de0>] (device_offline+0x8c/0xb4) from [<80285ed8>] (online_store+0x44/0x6c)
[<80285ed8>] (online_store+0x44/0x6c) from [<80283ec0>] (dev_attr_store+0x18/0x24)
[<80283ec0>] (dev_attr_store+0x18/0x24) from [<80149d5c>] (sysfs_write_file+0x1a4/0x1d0)
[<80149d5c>] (sysfs_write_file+0x1a4/0x1d0) from [<800f02a0>] (vfs_write+0xb4/0x17c)
[<800f02a0>] (vfs_write+0xb4/0x17c) from [<800f0628>] (SyS_write+0x40/0x68)
[<800f0628>] (SyS_write+0x40/0x68) from [<8000e220>] (ret_fast_syscall+0x0/0x48)
This almost always happens when hotplugging CPUs off in the third inner loop of Nico's test, on the first iteration of the outer loop.
In my tests so far, the hang happens completely randomly with regards to the test loop.
I'm not sure exactly what this code is trying to do, yet. (RCU, RC-who?)
Right now I'm seeing this as a tendency but not always:
Unable to handle kernel NULL pointer dereference at virtual address 0000000c
PC is at set_cpu_sd_state_idle+0x28/0x50
LR is at tick_nohz_idle_enter+0x15/0x50
Disassembling the code and backtracking it, I can see that one of the per-CPU variables provides a completely bogus pointer value which looks more like an instruction opcode than a real memory location. But the code dereferences that pointer nevertheless, until havoc ensues. Disabling the ARM-optimized __my_cpu_offset doesn't change anything.
The crashing CPU is always the last one to have been booted.
Changing the code to get better debugging output tends to move the crash elsewhere. In some cases it is the oops dump code that is unable to complete, as it recursively oopses itself.
Nicolas