On Tue, Oct 27 2020 at 00:07, Ira Weiny wrote:
On Fri, Oct 23, 2020 at 11:50:11PM +0200, Thomas Gleixner wrote:
#ifndef irqentry_state typedef struct irqentry_state {
- bool exit_rcu;
- union {
bool exit_rcu;
bool lockdep;
- };
} irqentry_state_t; #endif
-E_NO_KERNELDOC
Adding: Paul McKenney
I'm happy to write something but I'm very unfamiliar with this code. So I'm getting confused what exactly exit_rcu is flagging.
I can see that exit_rcu is a bad name for the state used in irqentry_nmi_[enter|exit](). Furthermore, I see why 'lockdep' is a better name. But similar lockdep handling is used in irqentry_exit() if exit_rcu is true...
No, it's not similar at all. Lockdep state vs. interrupts and regular exceptions is always consistent.
In the NMI case, that's not guaranteed because of
local_irq_disable() arch_local_irq_disable() <- NMI race window trace_hardirqs_off()
same the other way round
local_irq_enable() trace_hardirqs_on() <- NMI race window arch_local_irq_enable()
IOW, the hardware state and the lockdep state are not consistent.
/**
- struct irqentry_state - Opaque object for exception state storage
- @exit_rcu: Used exclusively in the irqentry_*() calls; tracks if the
exception hit the idle task which requires special handling,
including calling rcu_irq_exit(), when the exception
exits.
calls; signals whether the exit path has to invoke rcu_irq_exit().
- @lockdep: Used exclusively in the irqentry_nmi_*() calls; ensures lockdep
tracking is maintained if hardirqs were already enabled
ensures that lockdep state is restored correctly on exit from nmi.
- This opaque object is filled in by the irqentry_*_enter() functions and
- should be passed back into the corresponding irqentry_*_exit()
functions
s/should/must/
- when the exception is complete.
- Callers of irqentry_*_[enter|exit]() should consider this structure
opaque
s/should/must/
- and all members private. Descriptions of the members are provided to aid in
- the maintenance of the irqentry_*() functions.
*/
Perhaps Paul can enlighten me on how exit_rcu is used beyond just flagging a call to rcu_irq_exit()?
I can do that as well :) The only purpose is to invoke rcu_irq_exit() conditionally.
Why do we call lockdep_hardirqs_off() only when in the idle task? That implies that regs_irqs_disabled() can only be false if we were in the idle task to match up the lockdep on/off calls.
You're reading the code slightly wrong.
This does not make sense to me because why do we need the extra check for exit_rcu? I'm still trying to understand when regs_irqs_disabled() is false.
It's false when the interrupted context had interrupts enabled.
So we have the following scenarios:
Usermode Idletask irqs enabled RCU entry RCU exit Y N Y Y Y
N N Y N N N N N N N N Y Y Y Y N Y N Y Y
Now you might wonder about irqs enabled/disabled. This code is not only used for interrupts (device, ipi, local timer...) where interrupts are obviously enabled, it's also used for exception entry/exit. You can have e.g. pagefaults in interrupt disabled regions.
Also, the comment in irqentry_enter() refers to irq_enter_from_user_mode() which does not seem to exist anymore. So I'm not sure what careful sequence it is referring to.
That was renamed to irqentry_enter_from_user_mode() and the comment was not updated. Sorry for leaving this hard to solve puzzle around.
Thanks,
tglx