On Thu, Dec 17, 2020 at 02:07:01PM +0100, Thomas Gleixner wrote:
On Fri, Dec 11 2020 at 14:14, Andy Lutomirski wrote:
On Mon, Nov 23, 2020 at 10:10 PM ira.weiny@intel.com wrote: After contemplating this for a bit, I think this isn't really the right approach. It *works*, but we've mostly just created a bit of an unfortunate situation. Our stack, on a (possibly nested) entry looks like:
previous frame (or empty if we came from usermode)
SS RSP FLAGS CS RIP rest of pt_regs
C frame
irqentry_state_t (maybe -- the compiler is within its rights to play almost arbitrary games here)
more C stuff
So what we've accomplished is having two distinct arch register regions, one called pt_regs and the other stuck in irqentry_state_t. This is annoying because it means that, if we want to access this thing without passing a pointer around or access it at all from outer frames, we need to do something terrible with the unwinder, and we don't want to go there.
So I propose a somewhat different solution: lay out the stack like this.
SS RSP FLAGS CS RIP rest of pt_regs PKS ^^^^^^^^ extended_pt_regs points here
C frame more C stuff ...
IOW we have:
struct extended_pt_regs { bool rcu_whatever; other generic fields here; struct arch_extended_pt_regs arch_regs; struct pt_regs regs; };
and arch_extended_pt_regs has unsigned long pks;
and instead of passing a pointer to irqentry_state_t to the generic entry/exit code, we just pass a pt_regs pointer.
While I agree vs. PKS which is architecture specific state and needed in other places e.g. #PF, I'm not convinced that sticking the existing state into the same area buys us anything more than an indirect access.
Peter?
Agreed; that immediately solves the confusion Ira had as well. While extending pt_regs sounds scary, I think we've isolated our pt_regs implementation from actual ABI pretty well, but of course, that would need an audit. We don't want to leak this into signals for example.