On Sun, Mar 22, 2020 at 07:54:32PM -0700, Mike Kravetz wrote:
On 3/22/20 7:03 PM, Longpeng (Mike, Cloud Infrastructure Service Product Dept.) wrote:
On 2020/3/22 7:38, Mike Kravetz wrote:
On 2/21/20 7:33 PM, Longpeng(Mike) wrote:
From: Longpeng <longpeng2@huawei.com>
I have not looked closely at the generated code for lookup_address_in_pgd. It appears that it would dereference p4d, pud and pmd multiple times. Sean seemed to think there was something about the calling context that would make issues like those seen with huge_pte_offset less likely to happen. I do not know if this is accurate or not.
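For illustration only, here is a minimal sketch of the hazard being discussed; it is not the actual lookup_address_in_pgd()/huge_pte_offset() code, and the two helper names are made up. It simply contrasts re-dereferencing the pud entry at every check with taking a single READ_ONCE() snapshot of it:

    /*
     * Hedged sketch only -- hypothetical helpers, not code that exists in
     * the kernel today.  Requires kernel context (pud_t, pmd_offset(), ...).
     */
    #include <linux/mm.h>

    static pmd_t *pmd_lookup_racy(pud_t *pudp, unsigned long addr)
    {
            /*
             * *pudp is read twice: once by pud_present() and again inside
             * pmd_offset().  A concurrent clear of the entry between the
             * two reads lets the second dereference see a different value.
             */
            if (!pud_present(*pudp))
                    return NULL;
            return pmd_offset(pudp, addr);
    }

    static pmd_t *pmd_lookup_snapshot(pud_t *pudp, unsigned long addr)
    {
            /* One READ_ONCE() snapshot; all later checks use the same value. */
            pud_t pud = READ_ONCE(*pudp);

            if (pud_none(pud) || !pud_present(pud))
                    return NULL;
            return pmd_offset(&pud, addr);
    }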
Only for KVM's calls to lookup_address_in_mm(); I can't speak to other calls that funnel into lookup_address_in_pgd().
KVM uses a combination of tracking and blocking mmu_notifier calls to ensure PTE changes/invalidations between gup() and lookup_address_in_pgd() cause a restart of the faulting instruction, and that pending changes/invalidations are blocked until installation of the pfn in KVM's secondary MMU completes.
kvm_mmu_page_fault():
    mmu_seq = kvm->mmu_notifier_seq;
    smp_rmb();

    pfn = gup(hva);

    spin_lock(&kvm->mmu_lock);

    smp_rmb();
    if (kvm->mmu_notifier_seq != mmu_seq)
            goto out_unlock;        // Restart guest, i.e. retry the fault

    lookup_address_in_mm(hva, ...);

    ...

out_unlock:
    spin_unlock(&kvm->mmu_lock);
kvm_mmu_notifier_change_pte() / kvm_mmu_notifier_invalidate_range_end():
    spin_lock(&kvm->mmu_lock);
    kvm->mmu_notifier_seq++;
    smp_wmb();
    spin_unlock(&kvm->mmu_lock);
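In case a standalone illustration of that handshake helps, below is a rough userspace-only sketch (hypothetical names, a pthread mutex instead of kvm->mmu_lock, C11 atomics instead of the smp_rmb()/smp_wmb() pairs); it is not KVM code, just the same check-then-recheck pattern:

    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdbool.h>

    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static atomic_ulong seq;        /* stands in for kvm->mmu_notifier_seq */

    /* Invalidation side: analogue of change_pte()/invalidate_range_end(). */
    static void invalidate(void)
    {
            pthread_mutex_lock(&lock);
            atomic_fetch_add(&seq, 1);      /* any in-flight fault must retry */
            pthread_mutex_unlock(&lock);
    }

    /* Fault side: analogue of kvm_mmu_page_fault().  Returns false when it
     * raced with invalidate() and the caller should redo the fault. */
    static bool fault(void)
    {
            unsigned long snap = atomic_load(&seq);

            /* ... gup() and the page table walk would run here, with the
             * lock dropped, so invalidate() is free to slip in ... */

            pthread_mutex_lock(&lock);
            if (atomic_load(&seq) != snap) {
                    pthread_mutex_unlock(&lock);
                    return false;           /* stale lookup, restart */
            }
            /* ... commit the translation while still holding the lock ... */
            pthread_mutex_unlock(&lock);
            return true;
    }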
Let's remove the two READ_ONCE calls and move this patch forward. We can look closer at lookup_address_in_pgd and generate another patch if that needs to be fixed as well.
Thanks
Mike Kravetz