On Fri, Jul 15, 2022 at 01:36:15PM +0200, Gupta, Pankaj wrote:
Currently in mmu_notifier validate path, hva range is recorded and then checked in the mmu_notifier_retry_hva() from page fault path. However for the to be introduced private memory, a page fault may not have a hva
As this patch appeared in v7, just wondering did you see an actual bug because of it? And not having corresponding 'hva' occurs only with private memory because its not mapped to host userspace?
The addressed problem is not new in this version, previous versions I also had code to handle it (just in different way). But the problem is: mmu_notifier/memfile_notifier may be in the progress of invalidating a pfn that obtained earlier in the page fault handler, when happens, we should retry the fault. In v6 I used global mmu_notifier_retry() for memfile_notifier but that can block unrelated mmu_notifer invalidation which has hva range specified.
Sean gave a comment at https://lkml.org/lkml/2022/6/17/1001 to separate memfile_notifier from mmu_notifier but during the implementation I realized we actually can reuse the same code for shared and private memory if both using gpa range and that can simplify the code handling in kvm_zap_gfn_range and some other code (e.g. we don't need two versions for memfile_notifier/mmu_notifier).
Adding gpa range for private memory invalidation also relieves the above blocking issue between private memory page fault and mmu_notifier.
Chao
Thanks, Pankaj
associated, checking gfn(gpa) makes more sense. For existing non private memory case, gfn is expected to continue to work.
The patch also fixes a potential bug in kvm_zap_gfn_range() which has already been using gfn when calling kvm_inc/dec_notifier_count() in current code.
Signed-off-by: Chao Peng chao.p.peng@linux.intel.com
arch/x86/kvm/mmu/mmu.c | 2 +- include/linux/kvm_host.h | 18 ++++++++---------- virt/kvm/kvm_main.c | 6 +++--- 3 files changed, 12 insertions(+), 14 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index f7fa4c31b7c5..0d882fad4bc1 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -4182,7 +4182,7 @@ static bool is_page_fault_stale(struct kvm_vcpu *vcpu, return true; return fault->slot &&
mmu_notifier_retry_hva(vcpu->kvm, mmu_seq, fault->hva);
} static int direct_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)mmu_notifier_retry_gfn(vcpu->kvm, mmu_seq, fault->gfn);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 0bdb6044e316..e9153b54e2a4 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -767,8 +767,8 @@ struct kvm { struct mmu_notifier mmu_notifier; unsigned long mmu_notifier_seq; long mmu_notifier_count;
- unsigned long mmu_notifier_range_start;
- unsigned long mmu_notifier_range_end;
- gfn_t mmu_notifier_range_start;
- gfn_t mmu_notifier_range_end; #endif struct list_head devices; u64 manual_dirty_log_protect;
@@ -1362,10 +1362,8 @@ void kvm_mmu_free_memory_cache(struct kvm_mmu_memory_cache *mc); void *kvm_mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc); #endif -void kvm_inc_notifier_count(struct kvm *kvm, unsigned long start,
unsigned long end);
-void kvm_dec_notifier_count(struct kvm *kvm, unsigned long start,
unsigned long end);
+void kvm_inc_notifier_count(struct kvm *kvm, gfn_t start, gfn_t end); +void kvm_dec_notifier_count(struct kvm *kvm, gfn_t start, gfn_t end); long kvm_arch_dev_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg); @@ -1923,9 +1921,9 @@ static inline int mmu_notifier_retry(struct kvm *kvm, unsigned long mmu_seq) return 0; } -static inline int mmu_notifier_retry_hva(struct kvm *kvm, +static inline int mmu_notifier_retry_gfn(struct kvm *kvm, unsigned long mmu_seq,
unsigned long hva)
{ lockdep_assert_held(&kvm->mmu_lock); /*gfn_t gfn)
@@ -1935,8 +1933,8 @@ static inline int mmu_notifier_retry_hva(struct kvm *kvm, * positives, due to shortcuts when handing concurrent invalidations. */ if (unlikely(kvm->mmu_notifier_count) &&
hva >= kvm->mmu_notifier_range_start &&
hva < kvm->mmu_notifier_range_end)
gfn >= kvm->mmu_notifier_range_start &&
return 1; if (kvm->mmu_notifier_seq != mmu_seq) return 1;gfn < kvm->mmu_notifier_range_end)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index da263c370d00..4d7f0e72366f 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -536,8 +536,7 @@ static void kvm_mmu_notifier_invalidate_range(struct mmu_notifier *mn, typedef bool (*hva_handler_t)(struct kvm *kvm, struct kvm_gfn_range *range); -typedef void (*on_lock_fn_t)(struct kvm *kvm, unsigned long start,
unsigned long end);
+typedef void (*on_lock_fn_t)(struct kvm *kvm, gfn_t start, gfn_t end); typedef void (*on_unlock_fn_t)(struct kvm *kvm); @@ -624,7 +623,8 @@ static __always_inline int __kvm_handle_hva_range(struct kvm *kvm, locked = true; KVM_MMU_LOCK(kvm); if (!IS_KVM_NULL_FN(range->on_lock))
range->on_lock(kvm, range->start, range->end);
range->on_lock(kvm, gfn_range.start,
gfn_range.end); if (IS_KVM_NULL_FN(range->handler)) break; }