Re: [PATCH 06/10] x86/entry: Move nmi entry/exit into common code

27 Oct 2020

      On Tue, Oct 27 2020 at 00:07, Ira Weiny wrote:
...
On Fri, Oct 23, 2020 at 11:50:11PM +0200, Thomas Gleixner wrote:
...
...
 #ifndef irqentry_state
 typedef struct irqentry_state {
-	bool	exit_rcu;
+	union {
+		bool	exit_rcu;
+		bool	lockdep;
+	};
 } irqentry_state_t;
 #endif
-E_NO_KERNELDOC
Adding: Paul McKenney
I'm happy to write something but I'm very unfamiliar with this code.  So I'm
getting confused about what exactly exit_rcu is flagging.
I can see that exit_rcu is a bad name for the state used in
irqentry_nmi_[enter|exit]().  Furthermore, I see why 'lockdep' is a better
name.  But similar lockdep handling is used in irqentry_exit() if exit_rcu is
true...
No, it's not similar at all. Lockdep state vs. interrupts and regular
exceptions is always consistent.
In the NMI case, that's not guaranteed because of:
local_irq_disable()
         arch_local_irq_disable()
                                        <- NMI race window
         trace_hardirqs_off()
The same applies the other way round:
local_irq_enable()
         trace_hardirqs_on()
                                        <- NMI race window
         arch_local_irq_enable()
IOW, the hardware state and the lockdep state are not consistent.
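
To make the race window concrete, here is a toy userspace model, not kernel code. All names (toy_nmi_enter(), toy_nmi_exit(), the two global flags) are hypothetical stand-ins. It only illustrates the idea that NMI entry records whatever lockdep currently believes and NMI exit puts that back, so an NMI landing in the window leaves the lockdep state as it found it.

```c
#include <stdbool.h>

/* Toy userspace model, not kernel code: two flags stand in for the
 * CPU interrupt flag and for lockdep's software view of it. */
bool hw_irqs_on = true;
bool lockdep_irqs_on = true;

/* Hypothetical miniature of irqentry_state_t, NMI member only. */
struct toy_nmi_state {
	bool lockdep;
};

/* Record what lockdep believed on entry, then mark irqs off for the
 * duration of the NMI handler. */
struct toy_nmi_state toy_nmi_enter(void)
{
	struct toy_nmi_state st = { .lockdep = lockdep_irqs_on };

	lockdep_irqs_on = false;
	return st;
}

/* Put lockdep's view back exactly as it was found. */
void toy_nmi_exit(struct toy_nmi_state st)
{
	lockdep_irqs_on = st.lockdep;
}

/* Replays local_irq_disable() with an NMI landing in the race window.
 * Sets *mismatch_seen if hardware and lockdep disagreed at NMI entry
 * and returns lockdep's view right after the NMI has returned. */
bool toy_irq_disable_hit_by_nmi(bool *mismatch_seen)
{
	struct toy_nmi_state st;
	bool after_nmi;

	hw_irqs_on = true;
	lockdep_irqs_on = true;

	hw_irqs_on = false;		/* arch_local_irq_disable() */

	/* <- NMI race window: hardware says off, lockdep still says on */
	*mismatch_seen = (hw_irqs_on != lockdep_irqs_on);
	st = toy_nmi_enter();
	toy_nmi_exit(st);
	after_nmi = lockdep_irqs_on;

	lockdep_irqs_on = false;	/* trace_hardirqs_off() */
	return after_nmi;
}
```

In this model the NMI sees the mismatch, yet the interrupted sequence still finds lockdep in the state it expects and completes normally.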
...
/**
 * struct irqentry_state - Opaque object for exception state storage
 * @exit_rcu: Used exclusively in the irqentry_*() calls; tracks if the
 *            exception hit the idle task which requires special handling,
 *            including calling rcu_irq_exit(), when the exception
 *            exits.

calls; signals whether the exit path has to invoke rcu_irq_exit().
...

 * @lockdep: Used exclusively in the irqentry_nmi_*() calls; ensures lockdep
 *           tracking is maintained if hardirqs were already enabled

ensures that lockdep state is restored correctly on exit from nmi.
...

 * This opaque object is filled in by the irqentry_*_enter() functions and
 * should be passed back into the corresponding irqentry_*_exit()
 * functions

s/should/must/
...

 * when the exception is complete.
 *
 * Callers of irqentry_*_[enter|exit]() should consider this structure
 * opaque

s/should/must/
...

 * and all members private.  Descriptions of the members are provided to aid in
 * the maintenance of the irqentry_*() functions.
 */
Perhaps Paul can enlighten me on how exit_rcu is used beyond just flagging a
call to rcu_irq_exit()?
I can do that as well :) The only purpose is to invoke rcu_irq_exit()
conditionally.
...
Why do we call lockdep_hardirqs_off() only when in the idle task?  That implies
that regs_irqs_disabled() can only be false if we were in the idle task to
match up the lockdep on/off calls.
You're reading the code slightly wrong.
...
This does not make sense to me because why do we need the extra check
for exit_rcu?  I'm still trying to understand when regs_irqs_disabled() is false.
It's false when the interrupted context had interrupts enabled.
So we have the following scenarios:
Usermode   Idletask   irqs enabled  RCU entry  RCU exit
   Y           N           Y            Y          Y
   N           N           Y            N          N
   N           N           N            N          N
   N           Y           Y            Y          Y
   N           Y           N            Y          Y
Now you might wonder about irqs enabled/disabled. This code is not only
used for interrupts (device, ipi, local timer...) where interrupts are
obviously enabled; it's also used for exception entry/exit. You can have
e.g. pagefaults in interrupt disabled regions.
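
Read as a predicate, the table above says that the RCU columns depend only on where the exception came from, not on the interrupted context's irq state. A minimal sketch of that reading follows. The function name and parameters are hypothetical and the code restates the table only; it is not the logic of irqentry_enter() itself.

```c
#include <stdbool.h>

/* Restates the scenario table: RCU entry (and the matching RCU exit) is
 * needed when the exception interrupted user mode or the idle task.
 * Whether the interrupted context had irqs enabled does not change the
 * answer, so that argument is deliberately ignored. */
bool toy_needs_rcu(bool user_mode, bool idle_task, bool irqs_enabled)
{
	(void)irqs_enabled;
	return user_mode || idle_task;
}
```

For example, toy_needs_rcu(false, true, false) is true: a pagefault taken in the idle task with interrupts disabled still needs the RCU entry/exit pair, which matches the last row of the table.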
...
Also, the comment in irqentry_enter() refers to irq_enter_from_user_mode() which
does not seem to exist anymore.  So I'm not sure what careful sequence it is
referring to.
That was renamed to irqentry_enter_from_user_mode() and the comment was
not updated. Sorry for leaving this hard to solve puzzle around.
Thanks,
tglx
