On Mon, Mar 23, 2020 at 07:40:31AM -0700, Sean Christopherson wrote:
On Sun, Mar 22, 2020 at 07:54:32PM -0700, Mike Kravetz wrote:
On 3/22/20 7:03 PM, Longpeng (Mike, Cloud Infrastructure Service Product Dept.) wrote:
On 2020/3/22 7:38, Mike Kravetz wrote:
On 2/21/20 7:33 PM, Longpeng(Mike) wrote:
From: Longpeng <longpeng2@huawei.com>
I have not looked closely at the generated code for lookup_address_in_pgd. It appears that it would dereference p4d, pud and pmd multiple times. Sean seemed to think there was something about the calling context that would make issues like those seen with huge_pte_offset less likely to happen. I do not know if this is accurate or not.
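To illustrate the concern, the pmd step of lookup_address_in_pgd() looks roughly like the following (a simplified sketch, not the exact upstream code): each level is tested by dereferencing the same pointer more than once, so a concurrent split/collapse could make the second read observe a different entry than the first.

	pmd = pmd_offset(pud, address);
	if (pmd_none(*pmd))				// 1st read of *pmd
		return NULL;
	if (pmd_large(*pmd) || !pmd_present(*pmd)) {	// 2nd read may see a different value
		*level = PG_LEVEL_2M;
		return (pte_t *)pmd;
	}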
Only for KVM's calls to lookup_address_in_mm(); I can't speak to other calls that funnel into lookup_address_in_pgd().
KVM uses a combination of tracking and blocking mmu_notifier calls to ensure PTE changes/invalidations between gup() and lookup_address_in_pgd() cause a restart of the faulting instruction, and that pending changes/invalidations are blocked until installation of the pfn in KVM's secondary MMU completes.
kvm_mmu_page_fault():

	mmu_seq = kvm->mmu_notifier_seq;
	smp_rmb();

	pfn = gup(hva);

	spin_lock(&kvm->mmu_lock);

	smp_rmb();
	if (kvm->mmu_notifier_seq != mmu_seq)
		goto out_unlock;	// Restart guest, i.e. retry the fault

	lookup_address_in_mm(hva, ...);
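For reference, the notifier side this pairs with looks roughly like the following (a simplified sketch of the kvm_mmu_notifier_invalidate_range_start()/end() handlers in virt/kvm/kvm_main.c; the actual spte zapping and TLB flushing is elided):

kvm_mmu_notifier_invalidate_range_start():

	spin_lock(&kvm->mmu_lock);
	kvm->mmu_notifier_count++;	// Blocks pfn installation until ..._end()
	// zap/unmap the sptes covering the invalidated range
	spin_unlock(&kvm->mmu_lock);

kvm_mmu_notifier_invalidate_range_end():

	spin_lock(&kvm->mmu_lock);
	kvm->mmu_notifier_seq++;	// Invalidates any previously sampled mmu_seq
	kvm->mmu_notifier_count--;
	spin_unlock(&kvm->mmu_lock);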
It works because the mmu_lock spinlock is taken before and after any change to the page tables, via the invalidate_range_start()/end() callbacks.
So if you are holding the spinlock and mmu_notifier_count == 0, then nobody can be writing to the page tables.
It is effectively a full page table lock, so any page table read under that lock does not need to worry about data races.
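Concretely, the retry check done with mmu_lock held looks at both pieces of state, along the lines of mmu_notifier_retry() in include/linux/kvm_host.h (abridged sketch, comments mine):

	if (unlikely(kvm->mmu_notifier_count))
		return 1;	// An invalidation is in flight, retry the fault
	smp_rmb();		// Order the count read before the seq read
	if (kvm->mmu_notifier_seq != mmu_seq)
		return 1;	// Something changed since gup(), retry
	return 0;		// Safe to install the pfn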
Jason