On May 31, 2019, at 3:37 AM, Pavel Machek <pavel@denx.de> wrote:
Hi!
[ Upstream commit f2c65fb3221adc6b73b0549fc7ba892022db9797 ]
When modules and BPF filters are loaded, there is a time window in which some memory is both writable and executable. An attacker that has already found another vulnerability (e.g., a dangling pointer) might be able to exploit this behavior to overwrite kernel code. Prevent having writable executable PTEs in this stage.
In addition, avoiding having W+X mappings can also slightly simplify the patching of modules code on initialization (e.g., by alternatives and static-key), as would be done in the next patch. This was actually the main motivation for this patch.
To avoid having W+X mappings, set them initially as RW (NX) and after they are set as RO set them as X as well. Setting them as executable is done as a separate step to avoid one core in which the old PTE is cached (hence writable), and another which sees the updated PTE (executable), which would break the W^X protection.
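For readers who want to see the ordering concretely, here is a rough userspace analogy, not the kernel code from this patch. It assumes a Linux/POSIX system and uses mmap()/mprotect() in place of the kernel's page-attribute helpers; load_code_wx_safe() is a hypothetical name made up for this sketch. The mapping goes RW (NX), then RO, then RO+X, so it is never writable and executable at once. Note that mprotect() does its own TLB maintenance, so the sketch shows only the ordering, not the stale-PTE race across cores that motivates the separate step in the kernel.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical userspace helper (not kernel code): copy `len` bytes of code
 * into a fresh anonymous mapping without the mapping ever being writable and
 * executable at the same time. Returns the mapping, or NULL on failure.
 */
void *load_code_wx_safe(const void *code, size_t len)
{
	assert(code != NULL && len > 0);

	/* Step 1: map as RW, not executable (the "RW (NX)" state). */
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;

	memcpy(p, code, len);

	/* Step 2: drop write permission first (RO, still NX). */
	if (mprotect(p, len, PROT_READ) != 0)
		goto fail;

	/* Step 3: only now add execute permission (RO + X). */
	if (mprotect(p, len, PROT_READ | PROT_EXEC) != 0)
		goto fail;

	return p;

fail:
	munmap(p, len);
	return NULL;
}
```

The caller gets back a read-only, executable copy; the sketch deliberately never jumps into it, since the bytes to execute would be architecture-specific.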
First, is this stable material? Yes, it changes something.
But if you assume an attacker can write into kernel memory during module load, what prevents him from changing the module as he sees fit while it is not executable, and then simply waiting for the system to execute it?
I don't see security benefit here.
I agree that at the moment the benefit is limited. I think the benefit would come later, if the module signature check is performed after the module has been write-protected, but before it is actually executed.
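To make that ordering concrete, here is a minimal userspace sketch, assuming Linux/POSIX mmap()/mprotect(). It is not how the kernel module loader works today. The names load_verified(), verify_fn and toy_checksum_ok() are hypothetical, and the byte-sum check is only a stand-in for a real signature check. The point is that the check runs on the copy that is already write-protected, and the copy becomes executable only if the check passes.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical verifier type: returns nonzero if the image is acceptable. */
typedef int (*verify_fn)(const void *image, size_t len);

/* Toy stand-in for a signature check: accept if the byte sum is 0 mod 256. */
int toy_checksum_ok(const void *image, size_t len)
{
	const unsigned char *b = image;
	unsigned char sum = 0;

	for (size_t i = 0; i < len; i++)
		sum = (unsigned char)(sum + b[i]);
	return sum == 0;
}

/* Copy, write-protect, verify the protected copy, then make it executable. */
void *load_verified(const void *image, size_t len, verify_fn verify)
{
	assert(image != NULL && len > 0 && verify != NULL);

	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;

	memcpy(p, image, len);

	/* Write-protect before verifying, so the checked bytes cannot be
	 * modified through this mapping afterwards. */
	if (mprotect(p, len, PROT_READ) != 0)
		goto fail;

	/* Verify the protected copy, not the caller's buffer. */
	if (!verify(p, len))
		goto fail;

	/* Only a verified, read-only image becomes executable. */
	if (mprotect(p, len, PROT_READ | PROT_EXEC) != 0)
		goto fail;

	return p;

fail:
	munmap(p, len);
	return NULL;
}
```

In userspace another thread could still call mprotect() to make the mapping writable again, so this models only the order of the steps, not a complete defense.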
+++ b/arch/x86/kernel/alternative.c
@@ -662,15 +662,29 @@ void __init alternative_instructions(void)
  * handlers seeing an inconsistent instruction while you patch.
  */
 void *__init_or_module text_poke_early(void *addr, const void *opcode,
-					      size_t len)
+				       size_t len)
 {
 	unsigned long flags;
-	local_irq_save(flags);
-	memcpy(addr, opcode, len);
-	local_irq_restore(flags);
-	sync_core();
-	/* Could also do a CLFLUSH here to speed up CPU recovery; but
-	   that causes hangs on some VIA CPUs. */
+	if (boot_cpu_has(X86_FEATURE_NX) &&
+	    is_module_text_address((unsigned long)addr)) {
+		/*
+		 * Modules text is marked initially as non-executable, so the
+		 * code cannot be running and speculative code-fetches are
+		 * prevented. Just change the code.
+		 */
+		memcpy(addr, opcode, len);
+	} else {
+		local_irq_save(flags);
+		memcpy(addr, opcode, len);
+		local_irq_restore(flags);
+		sync_core();
+		/*
+		 * Could also do a CLFLUSH here to speed up CPU recovery; but
+		 * that causes hangs on some VIA CPUs.
+		 */
I don't get it. If code cannot be running here, it cannot be running in the !NX case, either, and we are free to just change it. Speculative execution should not be a problem, either, as CPUs are supposed to mask it, and there are no known bugs in that area. (Plus, I'd not be surprised if speculative execution ignored NX... just saying :-) )
Yes, the module code should not run, but speculative execution might cause it to be cached in the instruction cache (unlikely as that might be, we need to consider malicious users who play with branch predictors).
I am unfamiliar with any bug that might cause the CPU to speculatively ignore the NX bit. Without underestimating Intel’s ability to create terrible bugs, I would assume, for now, that it is safe.