Hi Daniel & Vincent,
The latest RT kernel, linux-4.9.y-rt-rebase, fails to boot on an ARM board, although it works on ARMv8 qemu and on x86. The bug is caused by a read_lock in the arm64 cpu idle chain. RT turns the read_lock into a mutex, which may sleep during idle. That breaks 2 rules in the kernel:
1, BUG: scheduling while atomic
2, bad: scheduling from the idle thread!
The arm64 idle chain is problematic on RT. I am trying to figure out a way to fix it or work around it.
The RT boot bug call chain:

cpu_startup_entry    # kernel/sched/idle.c
  cpu_idle_loop
    local_irq_disable()
    cpuidle_idle_call
      call_cpuidle
        cpuidle_enter
          cpuidle_enter_state
            ->enter: arm_enter_idle_state
              CPU_PM_CPU_IDLE_ENTER
                cpu_pm_enter/exit
                  read_lock(&cpu_pm_notifier_lock);  <-- # bug:
                    __rt_spin_lock();
                      schedule();

#define CPU_PM_CPU_IDLE_ENTER(low_level_idle_enter, idx)	\
({								\
	int __ret;						\
								\
	if (!idx) {						\
		cpu_do_idle();					\
		return idx;					\
	}							\
								\
	__ret = cpu_pm_enter();					\
	if (!__ret) {						\
		__ret = low_level_idle_enter(idx);		\
		cpu_pm_exit();					\
	}							\
								\
	__ret ? -1 : idx;					\
})

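To make the control flow of the macro above easier to follow, here is a rough user-space model of it written as a plain function. It is only a sketch: every stub_* function, the counters, and model_idle_enter are hypothetical stand-ins, not kernel code, and notifier_result simply plays the role of whatever cpu_pm_enter() would return from the notifier chain.

```c
#include <assert.h>

/* Hypothetical user-space stand-ins; none of these are kernel functions. */
static int notifier_result;  /* value the stubbed cpu_pm_enter() returns */
static int wfi_calls;        /* calls to the stubbed cpu_do_idle() */
static int low_level_calls;  /* calls to the stubbed low-level idle entry */
static int pm_exit_calls;    /* calls to the stubbed cpu_pm_exit() */

static void stub_cpu_do_idle(void) { wfi_calls++; }
static int stub_cpu_pm_enter(void) { return notifier_result; }
static void stub_cpu_pm_exit(void) { pm_exit_calls++; }

static int stub_low_level_idle_enter(int idx)
{
	(void)idx;
	low_level_calls++;
	return 0;	/* assume the low-level entry itself succeeds */
}

/* Function form of the macro's control flow. */
static int model_idle_enter(int idx)
{
	int ret;

	/* State 0 is the shallow state: no notifier chain is run. */
	if (!idx) {
		stub_cpu_do_idle();
		return idx;
	}

	/* Deeper states run the notifier chain first. */
	ret = stub_cpu_pm_enter();
	if (!ret) {
		ret = stub_low_level_idle_enter(idx);
		stub_cpu_pm_exit();
	}

	/* Any failure is reported as -1 and the CPU stays awake. */
	return ret ? -1 : idx;
}
```

In this model, setting notifier_result to a non-zero value makes model_idle_enter(2) return -1 without calling the low-level entry at all, which is the "stay awake if the notification failed" behaviour that Q1 below asks about.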
Hi Vincent & Daniel,
Q1, cpu_pm_enter is just a cpu_pm notifier call. Is it right to stay awake if the notification fails?
Q2, Is it possible to use a per-cpu notifier, since entering idle is a per-cpu action?
Q3, Having a rwlock in the deep idle chain doesn't look good. Could we remove it?

Regards,
Alex