From: LinFeng <linfeng23@huawei.com>
We found that the !is_zero_pfn() check in kvm_is_mmio_pfn() was submitted in commit 90cff5a8cc ("KVM: check for !is_zero_pfn() in kvm_is_mmio_pfn()"), but was reverted in commit 0ef2459983 ("kvm: fix kvm_is_mmio_pfn() and rename to kvm_is_reserved_pfn()").
Simply adding !is_zero_pfn() to kvm_is_reserved_pfn() may be too crude. As commit e433e83bc3 ("KVM: MMU: Do not treat ZONE_DEVICE pages as being reserved") shows, the zero page would also need special handling in several other flows if it were no longer treated as reserved.
While fixing up all the functions that reference kvm_is_reserved_pfn() in this patch, we found that only kvm_release_pfn_clean() and kvm_get_pfn() do not need special handling.
So we wondered: why not check is_zero_pfn() only right before getting and putting the page, and revert our previous commit 31e813f38f ("KVM: fix overflow of zero page refcount with ksm running") in master? Instead of adding !is_zero_pfn() to kvm_is_reserved_pfn(), the new idea is as follows:
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 7f9ee2929cfe..f9a1f9cf188e 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1695,7 +1695,8 @@ EXPORT_SYMBOL_GPL(kvm_release_page_clean);
 
 void kvm_release_pfn_clean(kvm_pfn_t pfn)
 {
-	if (!is_error_noslot_pfn(pfn) && !kvm_is_reserved_pfn(pfn))
+	if (!is_error_noslot_pfn(pfn) &&
+	    (!kvm_is_reserved_pfn(pfn) || is_zero_pfn(pfn)))
 		put_page(pfn_to_page(pfn));
 }
 EXPORT_SYMBOL_GPL(kvm_release_pfn_clean);
@@ -1734,7 +1735,7 @@ EXPORT_SYMBOL_GPL(kvm_set_pfn_accessed);
 
 void kvm_get_pfn(kvm_pfn_t pfn)
 {
-	if (!kvm_is_reserved_pfn(pfn))
+	if (!kvm_is_reserved_pfn(pfn) || is_zero_pfn(pfn))
 		get_page(pfn_to_page(pfn));
 }
 EXPORT_SYMBOL_GPL(kvm_get_pfn);
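To illustrate why the predicate matters, here is a rough userspace sketch of the refcount imbalance. It is a toy model, not kernel code: struct toy_page and every toy_* or *_should_put name are hypothetical. It also assumes that the lookup side always takes a reference on the zero page. That assumption is inferred from the overflow named in the commit title and is not stated elsewhere in this letter.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct page: only the fields this sketch needs. */
struct toy_page {
	int refcount;
	bool reserved;	/* models PageReserved() */
	bool is_zero;	/* models is_zero_pfn() */
};

/* Old release predicate: every reserved page is skipped. */
static bool old_should_put(const struct toy_page *p)
{
	return !p->reserved;
}

/* Proposed release predicate: reserved pages are skipped, except the zero page. */
static bool new_should_put(const struct toy_page *p)
{
	return !p->reserved || p->is_zero;
}

/*
 * One map/release cycle. ASSUMPTION: the lookup side always takes a
 * reference; the release side drops it only if should_put() agrees.
 */
static void toy_map_and_release(struct toy_page *p,
				bool (*should_put)(const struct toy_page *))
{
	p->refcount++;
	if (should_put(p)) {
		assert(p->refcount > 0);
		p->refcount--;
	}
}

/* Refcount of a toy zero page (starting at 1) after `cycles` cycles. */
int toy_refcount_after(int cycles, bool use_new_predicate)
{
	struct toy_page zero = { .refcount = 1, .reserved = true, .is_zero = true };

	for (int i = 0; i < cycles; i++)
		toy_map_and_release(&zero, use_new_predicate ? new_should_put
							      : old_should_put);
	return zero.refcount;
}
```

In this model, toy_refcount_after(1000, false) returns 1001 because the old predicate never drops the reference, while toy_refcount_after(1000, true) returns 1.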
We are confused about why the ZONE_DEVICE change did not take this approach and instead treats those pages as not reserved. Would it be racy if we applied only the patch in this cover letter, rather than the patches in this series?
We also checked the code of v4.9.y, v4.10.y, v4.11.y and v4.12.y: this bug exists in v4.11.y and later, but not in v4.10.y or earlier. Since commit e86c59b1b1 ("mm/ksm: improve deduplication of zero pages with colouring"), KSM uses zero pages with colouring. That feature was added in v4.11.y, so we wonder how v4.9.y could have this bug.
We attached the crash tool to /proc/kcore to check the refcount of the zero page, then created and destroyed a VM. The refcount stays at 1 on v4.9.y, while it increases only on v4.11.y and later. Are you sure this is the same bug you ran into? Is there something we are missing?
LinFeng (1):
  KVM: special handling of zero_page in some flows

Zhuang Yanying (1):
  KVM: fix overflow of zero page refcount with ksm running

 arch/x86/kvm/mmu.c  | 2 ++
 virt/kvm/kvm_main.c | 9 +++++---
 2 files changed, 7 insertions(+), 4 deletions(-)