Hi,
This test is added to serve as a performance tester and a bug reproducer for kvm page table code (GPA->HPA mappings), it gives guidance for the people trying to make some improvement for kvm.
The following explains what we can exactly do through this test. And a RFC is sent for comments, thanks.
The function guest_code() is designed to cover conditions where a single vcpu or multiple vcpus access guest pages within the same memory range, in three VM stages(before dirty-logging, during dirty-logging, after dirty-logging). Besides, the backing source memory type(ANONYMOUS/THP/HUGETLB) of the tested memory region can be specified by users, which means normal page mappings or block mappings can be chosen by users to be created in the test.
If use of ANONYMOUS memory is specified, kvm will create page mappings for the tested memory region before dirty-logging, and update attributes of the page mappings from RO to RW during dirty-logging. If use of THP/HUGETLB memory is specified, kvm will create block mappings for the tested memory region before dirty-logging, and split the blcok mappings into page mappings during dirty-logging, and coalesce the page mappings back into block mappings after dirty-logging is stopped.
So in summary, as a performance tester, this test can present the performance of kvm creating/updating normal page mappings, or the performance of kvm creating/splitting/recovering block mappings, through execution time.
When we need to coalesce the page mappings back to block mappings after dirty logging is stopped, we have to firstly invalidate *all* the TLB entries for the page mappings right before installation of the block entry, because a TLB conflict abort error could occur if we can't invalidate the TLB entries fully. We have hit this TLB conflict twice on aarch64 software implementation and fixed it. As this test can imulate process from dirty-logging enabled to dirty-logging stopped of a VM with block mappings, so it can also reproduce this TLB conflict abort due to inadequate TLB invalidation when coalescing tables.
Links about the TLB conflict abort: https://lore.kernel.org/lkml/20201201201034.116760-3-wangyanan55@huawei.com/
---
Here are some test examples of this test: platform: HiSilicon Kunpeng920 (aarch64, FWB not supported) host kernel: Linux mainline
(1) Based on v5.11-rc6
cmdline: ./kvm_page_table_test -m 4 -t 0 -g 4K -s 1G -v 1 (1 vcpu, 1G memory, page mappings(granule 4K)) KVM_CREATE_MAPPINGS: 0.8196s 0.8260s 0.8258s 0.8169s 0.8190s KVM_UPDATE_MAPPINGS: 1.1930s 1.1949s 1.1940s 1.1934s 1.1946s
cmdline: ./kvm_page_table_test -m 4 -t 0 -g 4K -s 1G -v 20 (20 vcpus, 1G memory, page mappings(granule 4K)) KVM_CREATE_MAPPINGS: 23.4028s 23.8015s 23.6702s 23.9437s 22.1646s KVM_UPDATE_MAPPINGS: 16.9550s 16.4734s 16.8300s 16.9621s 16.9402s
cmdline: ./kvm_page_table_test -m 4 -t 2 -g 1G -s 20G -v 1 (1 vcpu, 20G memory, block mappings(granule 1G)) KVM_CREATE_MAPPINGS: 3.7040s 3.7053s 3.7047s 3.7061s 3.7068s KVM_ADJUST_MAPPINGS: 2.8264s 2.8266s 2.8272s 2.8259s 2.8283s
cmdline: ./kvm_page_table_test -m 4 -t 2 -g 1G -s 20G -v 20 (20 vcpus, 20G memory, block mappings(granule 1G)) KVM_CREATE_MAPPINGS: 52.8338s 52.8327s 52.8336s 52.8255s 52.8303s KVM_ADJUST_MAPPINGS: 52.0466s 52.0473s 52.0550s 52.0518s 52.0467s
(2) I have post a patch series to improve efficiency of stage2 page table code, so test the performance changes.
cmdline: ./kvm_page_table_test -m 4 -t 2 -g 1G -s 20G -v 20 (20 vcpus, 20G memory, block mappings(granule 1G)) Before patch: KVM_CREATE_MAPPINGS: 52.8338s 52.8327s 52.8336s 52.8255s 52.8303s After patch: KVM_CREATE_MAPPINGS: 3.7022s 3.7031s 3.7028s 3.7012s 3.7024s
Before patch: KVM_ADJUST_MAPPINGS: 52.0466s 52.0473s 52.0550s 52.0518s 52.0467s After patch: KVM_ADJUST_MAPPINGS: 0.3008s 0.3004s 0.2974s 0.2917s 0.2900s
cmdline: ./kvm_page_table_test -m 4 -t 2 -g 1G -s 20G -v 40 (40 vcpus, 20G memory, block mappings(granule 1G)) Before patch: KVM_CREATE_MAPPINGS: 104.560s 104.556s 104.554s 104.556s 104.550s After patch: KVM_CREATE_MAPPINGS: 3.7011s 3.7103s 3.7005s 3.7024s 3.7106s
Before patch: KVM_ADJUST_MAPPINGS: 103.931s 103.936s 103.927s 103.942s 103.927s After patch: KVM_ADJUST_MAPPINGS: 0.3541s 0.3694s 0.3656s 0.3693s 0.3687s
---
Yanan Wang (2): KVM: selftests: Add a macro to get string of vm_mem_backing_src_type KVM: selftests: Add a test for kvm page table code
tools/testing/selftests/kvm/Makefile | 3 + .../testing/selftests/kvm/include/kvm_util.h | 3 + .../selftests/kvm/kvm_page_table_test.c | 518 ++++++++++++++++++ tools/testing/selftests/kvm/lib/kvm_util.c | 8 + 4 files changed, 532 insertions(+) create mode 100644 tools/testing/selftests/kvm/kvm_page_table_test.c