On 11/17/22 05:58, Marco Elver wrote:
[ 0.663761] WARNING: CPU: 0 PID: 0 at arch/x86/include/asm/kfence.h:46 kfence_protect+0x7b/0x120
[ 0.664033] WARNING: CPU: 0 PID: 0 at mm/kfence/core.c:234 kfence_protect+0x7d/0x120
[ 0.664465] kfence: kfence_init failed
Any chance you could add some debugging and figure out what actually made kfence fall over? Was it the pte or the level?
	if (WARN_ON(!pte || level != PG_LEVEL_4K))
		return false;
I can see how the thing you bisected to might lead to a page table not being split, which could mess with the 'level' check.
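To tell the two cases apart, the combined check could be split so that each condition reports on its own. Here is a rough user-space model of that idea, not kernel code: the enum, the helper name classify_failure() and the messages are all made up for illustration. In the kernel, the equivalent would presumably be two separate warnings right after the lookup_address() call.

```c
/*
 * User-space model only. The types and names below are hypothetical
 * stand-ins, not the kernel's definitions.
 */
#include <stddef.h>
#include <stdio.h>

enum pg_level_model { LVL_NONE, LVL_4K, LVL_2M, LVL_1G };

enum protect_failure { FAIL_NONE, FAIL_NO_PTE, FAIL_BAD_LEVEL };

/*
 * Check each condition separately so the report names the one that
 * actually fired, instead of one warning covering both.
 */
static enum protect_failure classify_failure(const void *pte,
					     enum pg_level_model level)
{
	if (!pte) {
		fprintf(stderr, "kfence-model: no pte found\n");
		return FAIL_NO_PTE;
	}
	if (level != LVL_4K) {
		fprintf(stderr, "kfence-model: unexpected level %d\n",
			(int)level);
		return FAIL_BAD_LEVEL;
	}
	return FAIL_NONE;
}
```

If the level turns out to be the culprit, printing its value would also show whether the address is still covered by a 2M or 1G mapping.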
Also, is there a reason this code is mucking with the page tables directly? It seems, uh, rather wonky. This, for instance:
	if (protect)
		set_pte(pte, __pte(pte_val(*pte) & ~_PAGE_PRESENT));
	else
		set_pte(pte, __pte(pte_val(*pte) | _PAGE_PRESENT));

	/*
	 * Flush this CPU's TLB, assuming whoever did the allocation/free is
	 * likely to continue running on this CPU.
	 */
	preempt_disable();
	flush_tlb_one_kernel(addr);
	preempt_enable();
Seems rather broken. I assume the preempt_disable() is there to get rid of some warnings. But, there is nothing I can see to *keep* the CPU that did the free from being different from the one where the TLB flush is performed until the preempt_disable(). That makes the flush_tlb_one_kernel() mostly useless.
Is there a reason this code isn't using the existing page table manipulation functions and tries to code its own? What prevents it from using something like the attached patch?