On Wed, Oct 11, 2023 at 05:05:04AM +0000, Joel Fernandes wrote:
On Tue, Oct 10, 2023 at 06:34:35PM -0700, Paul E. McKenney wrote: [...]
> > It's also worth noting that the bug this fixes wasn't exposed until the > > maple tree (added in v6.1) was used for the IRQ descriptors (added in > > v6.5). > > Lots of latent bugs, to be sure, even with rcutorture. :-/
The Right Thing is to fix the bug all the way back to the introduction, but what fallout makes the backport less desirable than living with the unexposed bug?
You are quite right that it is possible for the risk of a backport to exceed the risk of the original bug.
I defer to Joel (CCed) on how best to resolve this in -stable.
Maybe I am missing something but this issue should also be happening in mainline right?
Even though mainline has 897ba84dc5aa ("rcu-tasks: Handle idle tasks for recently offlined CPUs") , the warning should still be happening due to Liam's "kernel/sched: Modify initial boot task idle setup" because the warning is just rearranged a bit but essentially the same.
IMHO, the right thing to do then is to drop Liam's patch from 5.15 and fix it in mainline (using the ideas described in this thread), then backport both that new fix and Liam's patch to 5.15.
Or is there a reason this warning does not show up on the mainline?
There is not a whole lot of commonality between the v5.15.134 version of RCU Tasks Trace and that of mainline. In theory, in mainline, CPU hotplug is supposed to be disabled across all calls to trc_inspect_reader(), which means that there would not be any CPU coming or going.
But there could potentially be some time between when a CPU was marked as online and its idle task was marked PF_IDLE. And in fact x86 start_secondary() invokes set_cpu_online() before it calls cpu_startup_entry(), and it is the latter than sets PF_IDLE.
The same is true of alpha, arc, arm, arm64, csky, ia64, loongarch, mips, openrisc, parisc, powerpc, riscv, s390, sh, sparc32, sparc64, x86 xen, and xtensa, which is everybody.
One reason why my testing did not reproduce this is because I was running against v6.6-rc1, and cff9b2332ab7 ("kernel/sched: Modify initial boot task idle setup") went into v6.6-rc3. An initial run merging in current mainline also failed to reproduce this, but I am running overnight. If that doesn't reproduce, I will try inserting delays between the set_cpu_online() and the cpu_startup_entry().
I thought the warning happens before set_cpu_online() is even called, because under such situation, ofl == true and the task is not set to PF_IDLE yet:
WARN_ON_ONCE(ofl && task_curr(t) && !is_idle_task(t));
That case is supposed to be excluded by the cpus_read_lock() calls. Yes, key phrase "supposed to be". ;-)
If this problem is real, fixes include:
o Revert Liam's patch and make Tiny RCU's call_rcu() deal with the problem. This is overhead and non-tinyness, but to Joel's point, it might be best.
o Go back to something more like Liam's original patch, which cleared PF_IDLE only for the boot CPU.
o Set PF_IDLE before calling set_cpu_online(). This would work, but it would also be rather ugly, reaching into each and every architecture.
o Move the call to set_cpu_online() into cpu_startup_entry(). This would require some serious inspection to prove that it is safe, assuming that it is in fact safe.
o Drop the WARN_ON_ONCE() from trc_inspect_reader(). Not all that excited by losing this diagnostic, but then again it has been awhile since it has caught anything.
o Make the WARN_ON_ONCE() condition in trc_inspect_reader() instead to a "return false" to retry later. Ditto, also not liking the possibility of indefinite deferral with no warning.
Just for completeness,
o Since it just a warning, checking for task_struct::pid == 0 instead of is_idle_task()? Though PF_IDLE is also set in play_idle_precise().
o Change warning to: WARN_ON_ONCE(ofl && task_curr(t) && (!is_idle_task(t) && t->pid != 0));
This change does look promising, thank you!
Thanx, Paul