On Thu, Nov 17, 2022 at 06:34AM -0800, Dave Hansen wrote:
> On 11/17/22 05:58, Marco Elver wrote:
> > [    0.663761] WARNING: CPU: 0 PID: 0 at arch/x86/include/asm/kfence.h:46 kfence_protect+0x7b/0x120
> > [    0.664033] WARNING: CPU: 0 PID: 0 at mm/kfence/core.c:234 kfence_protect+0x7d/0x120
> > [    0.664465] kfence: kfence_init failed
> Any chance you could add some debugging and figure out what actually made kfence fall over? Was it the pte or the level?
> 	if (WARN_ON(!pte || level != PG_LEVEL_4K))
> 		return false;
> I can see how the thing you bisected to might lead to a page table not being split, which could mess with the 'level' check.
Yes - it's the 'level != PG_LEVEL_4K'.
We do actually try to split the pages in arch_kfence_init_pool() (above this function) - so with "x86/mm: Inhibit _PAGE_NX changes from cpa_process_alias()" this somehow fails...
> Also, is there a reason this code is mucking with the page tables directly? It seems, uh, rather wonky. This, for instance:
> 	if (protect)
> 		set_pte(pte, __pte(pte_val(*pte) & ~_PAGE_PRESENT));
> 	else
> 		set_pte(pte, __pte(pte_val(*pte) | _PAGE_PRESENT));
> 
> 	/*
> 	 * Flush this CPU's TLB, assuming whoever did the allocation/free is
> 	 * likely to continue running on this CPU.
> 	 */
> 	preempt_disable();
> 	flush_tlb_one_kernel(addr);
> 	preempt_enable();
> Seems rather broken. I assume the preempt_disable() is there to get rid of some warnings. But there is nothing I can see to *keep* the CPU that did the free from being different from the one where the TLB flush is performed until the preempt_disable(). That makes the flush_tlb_one_kernel() mostly useless.
> Is there a reason this code isn't using the existing page table manipulation functions and tries to code its own? What prevents it from using something like the attached patch?
Yes, see the comment below - it's to avoid the IPIs and TLB shoot-downs, because KFENCE _can_ tolerate the inaccuracy even if we hit the wrong TLB or other CPUs' TLBs aren't immediately flushed - we trade a few false negatives for minimizing performance impact.
> diff --git a/arch/x86/include/asm/kfence.h b/arch/x86/include/asm/kfence.h
> index ff5c7134a37a..5cdb3a1f3995 100644
> --- a/arch/x86/include/asm/kfence.h
> +++ b/arch/x86/include/asm/kfence.h
> @@ -37,34 +37,13 @@ static inline bool arch_kfence_init_pool(void)
>  	return true;
>  }
>  
> -/* Protect the given page and flush TLB. */
>  static inline bool kfence_protect_page(unsigned long addr, bool protect)
>  {
> -	unsigned int level;
> -	pte_t *pte = lookup_address(addr, &level);
> -
> -	if (WARN_ON(!pte || level != PG_LEVEL_4K))
> -		return false;
> -
> -	/*
> -	 * We need to avoid IPIs, as we may get KFENCE allocations or faults
> -	 * with interrupts disabled. Therefore, the below is best-effort, and
> -	 * does not flush TLBs on all CPUs. We can tolerate some inaccuracy;
> -	 * lazy fault handling takes care of faults after the page is PRESENT.
> -	 */
^^ See this comment. Additionally there's a real performance concern, and the inaccuracy is something that we deliberately accept.
>  	if (protect)
> -		set_pte(pte, __pte(pte_val(*pte) & ~_PAGE_PRESENT));
> +		set_memory_np(addr, addr + PAGE_SIZE);
>  	else
> -		set_pte(pte, __pte(pte_val(*pte) | _PAGE_PRESENT));
> +		set_memory_p(addr, addr + PAGE_SIZE);
Isn't this going to do tons of IPIs and shoot down other CPUs' TLBs? KFENCE shouldn't incur this overhead on large machines with >100 CPUs if we can avoid it.
What does "x86/mm: Inhibit _PAGE_NX changes from cpa_process_alias()" do that suddenly makes all this fail?
What solution do you prefer that both fixes the issue and avoids the IPIs?
Thanks,
-- Marco