On Tue, Jul 14, 2020 at 10:27:01AM +0200, Peter Zijlstra wrote:
> On Tue, Jul 14, 2020 at 12:02:09AM -0700, ira.weiny@intel.com wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > The PKRS MSR is defined as a per-core register. This isolates memory access by CPU. Unfortunately, the MSR is not preserved by XSAVE. Therefore, we must preserve the protections for individual tasks even if they are context switched out and placed on another CPU later.
> This is a contradiction and utter trainwreck.
I don't understand where the contradiction is. Perhaps I should have said the MSR is not 'XSAVE managed' rather than not 'preserved'?
> We're not going to do more per-core MSRs and pretend they make sense per-task.
I don't understand how this does not make sense. The PKRS register controls the task's access to kernel memory and is designed to be restricted to that task. Put another way, this is similar to CR3, which ultimately controls a task's memory access. The per-process mm is inherent to memory access control and is per-task. So how is this any different? Many MSRs are like this.
I suppose an alternative might be to disallow a context switch while the PKRS value is not the default, but I don't see this being very desirable at all.
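To make the per-task handling above concrete, here is a rough userspace sketch of saving and restoring a per-CPU register value across a context switch. This is not the patch code: the array standing in for the MSR, the names (cpu_pkrs, struct task, saved_pkrs, switch_to, read_pkrs, write_pkrs) and the PKRS_DEFAULT value are all made up for illustration, and real code would use rdmsr/wrmsr rather than an array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS      2
#define PKRS_DEFAULT 0x0u   /* hypothetical default value, for illustration only */

/* Stand-in for the per-CPU MSR; a real kernel would use rdmsr/wrmsr. */
static uint32_t cpu_pkrs[NR_CPUS];

/* Hypothetical per-task state holding the task's PKRS value while it is off-CPU. */
struct task {
	uint32_t saved_pkrs;
};

static uint32_t read_pkrs(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return cpu_pkrs[cpu];
}

static void write_pkrs(int cpu, uint32_t val)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	cpu_pkrs[cpu] = val;
}

/*
 * Switch 'cpu' from 'prev' (may be NULL) to 'next': stash the outgoing
 * task's value, then load the incoming task's value into the register.
 */
static void switch_to(int cpu, struct task *prev, struct task *next)
{
	if (prev)
		prev->saved_pkrs = read_pkrs(cpu);
	write_pkrs(cpu, next->saved_pkrs);
}
```

With this, a task that is switched out on CPU 0 and later scheduled on CPU 1 sees the same value in the register on CPU 1, while whatever ran on CPU 0 in between sees only its own value.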
Ira