On 5/26/21 11:06 AM, Dave Hansen wrote:
On 5/26/21 8:25 AM, Babu Moger wrote:
On 5/25/21 7:20 PM, Dave Hansen wrote:
On 5/25/21 5:03 PM, Babu Moger wrote:
What values do PKRU and the shadow have when the test fails? Is PKRU 0?
It goes back to default value 0x55555554. The test is expecting it to be 0. Printed them below.
test_ptrace_of_child()::1346, pkey_reg: 0x0000000055555554 shadow: 0000000000000000 protection_keys_64: pkey-helpers.h:127: _read_pkey_reg: Assertion `pkey_reg == shadow_pkey_reg' failed.
That's backwards (shadow vs pkru) from what I was expecting.
Can you turn on all the debuging?
Just compile with -DDEBUG_LEVEL=5
Copied the logs at https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpastebin.c...
Well, it's a bit backwards from what I'm expecting. The PKRU=0 value *WAS* legitimate because all of the pkeys got allocated and their disable bits cleared.
I think Andy was close when he was blaming:
static inline void write_pkru(u32 pkru) {
...
pk = get_xsave_addr(¤t->thread.fpu.state.xsave, XFEATURE_PKRU);
...
if (pk) pk->pkru = pkru; __write_pkru(pkru);
}
But that can't be it because PKRU ended up with 0x55555554. Something must have been writing 'init_pkru_value'.
switch_fpu_finish() does that:
Yes, I have noticed switch_fpu_finish writing init_pkru_value sometimes. But, I was not sure why that was happening..
static inline void switch_fpu_finish(struct fpu *new_fpu) { u32 pkru_val = init_pkru_value;
...
if (current->mm) { pk = get_xsave_addr(&new_fpu->state.xsave, XFEATURE_PKRU); if (pk) pkru_val = pk->pkru; } __write_pkru(pkru_val);
...
}
If 'new_fpu' had XSTATE_BV[PKRU]=0 then we'd have pk=NULL and 'pkru_val' would still have 'init_pkru_value'. *Then*, we'd have a shadow=0x0 and pkru=0x55555554. It would also only trigger if the hardware has an init tracker that fires when wrpkru(0). Intel doesn't do that. AMD must.
Ok. I will check with hardware guys here about this behavior.
Anyway, I need to think about this a bit more. But, an entirely guaranteed to be 100% untested patch is attached. I'm *NOT* confident this is the right fix.
I don't have much AMD hardware laying around, so testing would be appreciated.
Yes. Patch fixes problem on AMD. Also tested on Intel box to make sure it does not cause any regression there. It does work fine there as well. Thanks for the patch.