On Sun, Oct 8, 2023 at 9:20 PM Paul E. McKenney <paulmck@kernel.org> wrote: [...]
How frequently is this function called? We could check something for early boot... or track down where the CPU is put online and restore idle before that happens?
Once per RCU Tasks Trace grace period per reader seen to be blocking that grace period. Its performance is an issue, but not to anywhere near the same extent as (say) rcu_read_lock_trace().
It's also worth noting that the bug this fixes wasn't exposed until the maple tree (added in v6.1) was used for the IRQ descriptors (added in v6.5).
Lots of latent bugs, to be sure, even with rcutorture. :-/
The Right Thing is to fix the bug all the way back to the introduction, but what fallout makes the backport less desirable than living with the unexposed bug?
You are quite right that it is possible for the risk of a backport to exceed the risk of the original bug.
I defer to Joel (CCed) on how best to resolve this in -stable.
Maybe I am missing something, but this issue should also be happening in mainline, right?
Even though mainline has 897ba84dc5aa ("rcu-tasks: Handle idle tasks for recently offlined CPUs"), the warning should still be happening due to Liam's "kernel/sched: Modify initial boot task idle setup", because the warning is just rearranged a bit but is essentially the same.
IMHO, the right thing to do then is to drop Liam's patch from 5.15 and fix it in mainline (using the ideas described in this thread), then backport both that new fix and Liam's patch to 5.15.
Or is there a reason this warning does not show up in mainline?
My impression is that dropping Liam's patch for the stable release and revisiting it later is a better approach since tiny RCU is used way less in the wild than tree/tasks RCU. Thoughts?
I think that this one is strange enough that we need to write down the situation in detail, make sure we have all the corner cases covered in both mainline and -stable, and decide what to do from there.
Yes, I know, this email thread contains much of this information, but a little organizing of it would be good.
Would you like to put that together, or should I? If me, I will get a draft out by the end of this coming Tuesday, Pacific Time.
I apologize, I haven't been able to do any real work as I was OOO for the most part due to dental issues. I am about 25% back now. I will review your other email writeup and thanks for putting it together!
- Joel