On Fri, Jun 09, 2017 at 07:18:21PM +0800, Alex Shi wrote:
Hi Daniel & Vincent,
The latest RT kernel, linux-4.9.y-rt-rebase, fails to boot on an ARM board, while it works on ARMv8 qemu and x86.
The bug is also present on ARMv8, but it is very hard to reproduce in an SMP emulated qemu environment.
The bug is not present on x86 because cpu_pm_enter() is not used on this architecture.
This bug is caused by a read_lock in the arm64 cpu idle chain. RT turns the read_lock into a mutex, which may sleep during idle. That breaks 2 rules in the kernel: 1, "BUG: scheduling while atomic"; 2, "bad: scheduling from the idle thread!". The idle chain of arm64 is problematic in RT; I am trying to figure out a way to fix it or work around it.
Do we want to go into a deep idle state with an RT kernel?
The RT boot bug call chain:
    cpu_startup_entry #kernel/sched/idle.c:
    cpu_idle_loop
    local_irq_disable()
    cpuidle_idle_call
    call_cpuidle
    cpuidle_enter
    cpuidle_enter_state
    ->enter :arm_enter_idle_state
    cpu_pm_enter/exit
    CPU_PM_CPU_IDLE_ENTER
    read_lock(&cpu_pm_notifier_lock); <-- # bug:
Try replacing read_[un]lock() with raw_read_[un]lock().
This is the API that prevents the RT kernel from converting spinlocks into sleeping locks.
    __rt_spin_lock();
    schedule();
#define CPU_PM_CPU_IDLE_ENTER(low_level_idle_enter, idx) \
({ \
        int __ret; \
 \
        if (!idx) { \
                cpu_do_idle(); \
                return idx; \
        } \
 \
        __ret = cpu_pm_enter(); \
        if (!__ret) { \
                __ret = low_level_idle_enter(idx); \
                cpu_pm_exit(); \
        } \
 \
        __ret ? -1 : idx; \
})
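For readers following the flow, the macro's behaviour can be modelled in plain userspace C. This is only a sketch: the *_stub functions, the pm_enter_result knob and idle_enter_model() are made up for illustration and are not kernel APIs. The real macro is a statement expression whose idx 0 branch returns from the calling function. Note that only the idx != 0 path reaches cpu_pm_enter(), and therefore the read_lock.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel functions, so the flow can be
 * followed in userspace. None of these are the real implementations. */
static int pm_enter_result;   /* 0 means every notifier succeeded */
static int shallow_calls, deep_calls, pm_exit_calls;

static void cpu_do_idle_stub(void) { shallow_calls++; }
static int cpu_pm_enter_stub(void) { return pm_enter_result; }
static void cpu_pm_exit_stub(void) { pm_exit_calls++; }

static int low_level_idle_enter_stub(int idx)
{
	(void)idx;
	deep_calls++;
	return 0;
}

/* Function-shaped model of CPU_PM_CPU_IDLE_ENTER(low_level_idle_enter, idx). */
static int idle_enter_model(int idx)
{
	int ret;

	assert(idx >= 0);

	if (!idx) {
		/* Shallowest state: no notifiers, no lock taken. */
		cpu_do_idle_stub();
		return idx;
	}

	/* Deep state: notify first, and stay out if anything failed. */
	ret = cpu_pm_enter_stub();
	if (!ret) {
		ret = low_level_idle_enter_stub(idx);
		cpu_pm_exit_stub();
	}

	return ret ? -1 : idx;
}
```

With pm_enter_result left at 0, idle_enter_model(2) returns 2 after one deep entry and one exit notification; with pm_enter_result set to a non-zero value it returns -1 without touching the deep state.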
Hi Vincent & Daniel,
Q1, cpu_pm_enter is just a cpu_pm_notifier call. Is it right to stay awake if the notification failed?
Yes, because one element in the notifier chain may have failed to save its context, so we must not enter the deep idle state; otherwise we can crash when exiting idle.
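To illustrate the point, here is a small userspace model of a notifier chain in which one element fails to save its context. It is a sketch under assumptions: the event names, the two "drivers", the b_should_fail knob and pm_enter_model() are invented here, and the rollback of already-notified elements is modelled on what cpu_pm_enter() appears to do rather than copied from it.

```c
#include <assert.h>
#include <stddef.h>

enum pm_event { PM_ENTER, PM_ENTER_FAILED };   /* hypothetical event names */

typedef int (*pm_notifier_fn)(enum pm_event ev);

static int saved_a, saved_b;   /* 1 while that driver's context is saved */
static int b_should_fail;      /* knob: make driver B fail to save */

static int driver_a(enum pm_event ev)
{
	saved_a = (ev == PM_ENTER);
	return 0;
}

static int driver_b(enum pm_event ev)
{
	if (ev == PM_ENTER && b_should_fail)
		return -1;             /* could not save its context */
	saved_b = (ev == PM_ENTER);
	return 0;
}

static pm_notifier_fn chain[] = { driver_a, driver_b };
#define CHAIN_LEN (sizeof(chain) / sizeof(chain[0]))

/* Notify every element. On failure, undo the elements that were already
 * notified and report the error so the caller skips the deep idle state. */
static int pm_enter_model(void)
{
	size_t i, j;

	for (i = 0; i < CHAIN_LEN; i++) {
		if (chain[i](PM_ENTER) != 0) {
			for (j = 0; j < i; j++)
				chain[j](PM_ENTER_FAILED);
			return -1;
		}
	}
	return 0;
}
```

If driver B fails, pm_enter_model() returns -1 and driver A's saved state is dropped again. Entering the deep state at that point would power down a device whose context was never saved, which is the crash on idle exit described above.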
Q2, Is it possible to use a per-cpu notifier, since going idle is a per-cpu action?
Sounds reasonable.
Q3, Having a rwlock in the deep idle chain doesn't look good. Could we remove it?
Likely possible if the notifier call chain is per cpu and everything is bound to this cpu.
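A rough userspace sketch of what a per-cpu chain could look like follows. Everything in it is hypothetical (NR_CPUS_MODEL, struct percpu_chain, percpu_register(), percpu_notify(), count_hit()), and it assumes that a chain is only ever registered on and walked by its owning cpu, which is the condition that would make the rwlock unnecessary. It is not a proposal for the actual kernel interface.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS_MODEL 4   /* hypothetical cpu count */
#define MAX_NOTIFIERS 8   /* hypothetical per-cpu chain capacity */

typedef int (*pm_notifier_fn)(unsigned int cpu);

/* One private chain per cpu. Assumption: only the owning cpu touches its
 * chain, so no lock is taken around registration or the walk. */
struct percpu_chain {
	pm_notifier_fn fn[MAX_NOTIFIERS];
	size_t count;
};

static struct percpu_chain chains[NR_CPUS_MODEL];

static int percpu_register(unsigned int cpu, pm_notifier_fn fn)
{
	struct percpu_chain *c;

	assert(cpu < NR_CPUS_MODEL);
	c = &chains[cpu];
	if (c->count == MAX_NOTIFIERS)
		return -1;
	c->fn[c->count++] = fn;
	return 0;
}

/* Walk only this cpu's chain; stop at the first failing notifier. */
static int percpu_notify(unsigned int cpu)
{
	struct percpu_chain *c;
	size_t i;
	int ret;

	assert(cpu < NR_CPUS_MODEL);
	c = &chains[cpu];
	for (i = 0; i < c->count; i++) {
		ret = c->fn[i](cpu);
		if (ret)
			return ret;
	}
	return 0;
}

/* Sample notifier: counts how often each cpu was notified. */
static unsigned int hits[NR_CPUS_MODEL];

static int count_hit(unsigned int cpu)
{
	hits[cpu]++;
	return 0;
}
```

Registering count_hit on cpu 1 and then notifying cpu 1 bumps hits[1] only; notifying cpu 0 walks an empty chain. The open question for a real implementation is how registration from other contexts would be handled, which this model deliberately ignores.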