This patchset provides a file descriptor for every VM and VCPU to read KVM statistics data in binary format. It is meant to provide a lightweight, flexible, scalable and efficient lock-free solution for user space telemetry applications to pull the statistics data periodically for large scale systems. The pulling frequency could be as high as a few times per second.

In this patchset, every statistic is treated as having the following attributes:
  * architecture dependent or common
  * VM statistics data or VCPU statistics data
  * type: cumulative, instantaneous
  * unit: none for simple counter, nanosecond, microsecond, millisecond, second, Byte, KiByte, MiByte, GiByte, clock cycles

Since no lock/synchronization is used, consistency between all the statistics data is not guaranteed. That is, not all statistics data are read out at exactly the same time, since the statistics data are still being updated by KVM subsystems while they are read out.
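The difference between the two statistic types matters to a collector that polls periodically. The following is a minimal sketch of how a user space telemetry tool might treat them; the names `stat_type` and `stat_interval_value` are hypothetical and are not part of this patchset.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the "type" attribute described above. */
enum stat_type {
	STAT_CUMULATIVE,	/* monotonically increasing counter */
	STAT_INSTANT,		/* current value, e.g. a page count */
};

/*
 * Value a collector would report for one polling interval.
 * For a cumulative statistic this is the increase since the previous
 * sample; unsigned subtraction stays correct across a single wraparound.
 * For an instantaneous statistic the latest sample is reported as-is.
 */
static uint64_t stat_interval_value(enum stat_type type,
				    uint64_t prev, uint64_t cur)
{
	if (type == STAT_CUMULATIVE)
		return cur - prev;
	return cur;
}
```

Because the values are read without locking, two statistics sampled in the same pass may reflect slightly different moments, so a collector should not assume invariants between them hold exactly.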
---
* v3 -> v4
  - Rebase to kvm/queue, commit 9f242010c3b4 ("KVM: avoid "deadlock" between install_new_memslots and MMU notifier")
  - Use C-style comments in the whole patch
  - Fix wrong count for x86 VCPU stats descriptors
  - Fix KVM stats data size counting and validity check in selftest

* v2 -> v3
  - Rebase to kvm/queue, commit edf408f5257b ("KVM: avoid "deadlock" between install_new_memslots and MMU notifier")
  - Resolve some nitpicks about format

* v1 -> v2
  - Use ARRAY_SIZE to count the number of stats descriptors
  - Fix missing `size` field initialization in macro STATS_DESC

[1] https://lore.kernel.org/kvm/20210402224359.2297157-1-jingzhangos@google.com
[2] https://lore.kernel.org/kvm/20210415151741.1607806-1-jingzhangos@google.com
[3] https://lore.kernel.org/kvm/20210423181727.596466-1-jingzhangos@google.com
---
Jing Zhang (4):
  KVM: stats: Separate common stats from architecture specific ones
  KVM: stats: Add fd-based API to read binary stats data
  KVM: stats: Add documentation for statistics data binary interface
  KVM: selftests: Add selftest for KVM statistics data binary interface

 Documentation/virt/kvm/api.rst                | 171 ++++++++
 arch/arm64/include/asm/kvm_host.h             |   9 +-
 arch/arm64/kvm/guest.c                        |  42 +-
 arch/mips/include/asm/kvm_host.h              |   9 +-
 arch/mips/kvm/mips.c                          |  67 ++-
 arch/powerpc/include/asm/kvm_host.h           |   9 +-
 arch/powerpc/kvm/book3s.c                     |  68 +++-
 arch/powerpc/kvm/book3s_hv.c                  |  12 +-
 arch/powerpc/kvm/book3s_pr.c                  |   2 +-
 arch/powerpc/kvm/book3s_pr_papr.c             |   2 +-
 arch/powerpc/kvm/booke.c                      |  63 ++-
 arch/s390/include/asm/kvm_host.h              |   9 +-
 arch/s390/kvm/kvm-s390.c                      | 133 +++++-
 arch/x86/include/asm/kvm_host.h               |   9 +-
 arch/x86/kvm/x86.c                            |  71 +++-
 include/linux/kvm_host.h                      | 132 +++++-
 include/linux/kvm_types.h                     |  12 +
 include/uapi/linux/kvm.h                      |  50 +++
 tools/testing/selftests/kvm/.gitignore        |   1 +
 tools/testing/selftests/kvm/Makefile          |   3 +
 .../testing/selftests/kvm/include/kvm_util.h  |   3 +
 .../selftests/kvm/kvm_bin_form_stats.c        | 380 ++++++++++++++++++
 tools/testing/selftests/kvm/lib/kvm_util.c    |  11 +
 virt/kvm/kvm_main.c                           | 237 ++++++++++-
 24 files changed, 1415 insertions(+), 90 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/kvm_bin_form_stats.c
base-commit: 9f242010c3b46e63bc62f08fff42cef992d3801b
Put all common statistics in a separate structure to ease statistics handling for the incoming new statistics API.
No functional change intended.
Signed-off-by: Jing Zhang <jingzhangos@google.com>
---
 arch/arm64/include/asm/kvm_host.h   |  9 ++-------
 arch/arm64/kvm/guest.c              | 12 ++++++------
 arch/mips/include/asm/kvm_host.h    |  9 ++-------
 arch/mips/kvm/mips.c                | 12 ++++++------
 arch/powerpc/include/asm/kvm_host.h |  9 ++-------
 arch/powerpc/kvm/book3s.c           | 12 ++++++------
 arch/powerpc/kvm/book3s_hv.c        | 12 ++++++------
 arch/powerpc/kvm/book3s_pr.c        |  2 +-
 arch/powerpc/kvm/book3s_pr_papr.c   |  2 +-
 arch/powerpc/kvm/booke.c            | 14 +++++++-------
 arch/s390/include/asm/kvm_host.h    |  9 ++-------
 arch/s390/kvm/kvm-s390.c            | 12 ++++++------
 arch/x86/include/asm/kvm_host.h     |  9 ++-------
 arch/x86/kvm/x86.c                  | 14 +++++++-------
 include/linux/kvm_host.h            |  5 +++++
 include/linux/kvm_types.h           | 12 ++++++++++++
 virt/kvm/kvm_main.c                 | 14 +++++++-------
 17 files changed, 80 insertions(+), 88 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 7cd7d5c8c4bc..f3ad7a20b0af 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -556,16 +556,11 @@ static inline bool __vcpu_write_sys_reg_to_cpu(u64 val, int reg)
 }

 struct kvm_vm_stat {
-	ulong remote_tlb_flush;
+	struct kvm_vm_stat_common common;
 };

 struct kvm_vcpu_stat {
-	u64 halt_successful_poll;
-	u64 halt_attempted_poll;
-	u64 halt_poll_success_ns;
-	u64 halt_poll_fail_ns;
-	u64 halt_poll_invalid;
-	u64 halt_wakeup;
+	struct kvm_vcpu_stat_common common;
 	u64 hvc_exit_stat;
 	u64 wfe_exit_stat;
 	u64 wfi_exit_stat;
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 5cb4a1cd5603..0e41331b0911 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -29,18 +29,18 @@
 #include "trace.h"

 struct kvm_stats_debugfs_item debugfs_entries[] = {
-	VCPU_STAT("halt_successful_poll", halt_successful_poll),
-	VCPU_STAT("halt_attempted_poll", halt_attempted_poll),
-	VCPU_STAT("halt_poll_invalid", halt_poll_invalid),
-	VCPU_STAT("halt_wakeup", halt_wakeup),
+	VCPU_STAT_COM("halt_successful_poll", halt_successful_poll),
+	VCPU_STAT_COM("halt_attempted_poll", halt_attempted_poll),
+	VCPU_STAT_COM("halt_poll_invalid", halt_poll_invalid),
+	VCPU_STAT_COM("halt_wakeup", halt_wakeup),
 	VCPU_STAT("hvc_exit_stat", hvc_exit_stat),
 	VCPU_STAT("wfe_exit_stat", wfe_exit_stat),
 	VCPU_STAT("wfi_exit_stat", wfi_exit_stat),
 	VCPU_STAT("mmio_exit_user", mmio_exit_user),
 	VCPU_STAT("mmio_exit_kernel", mmio_exit_kernel),
 	VCPU_STAT("exits", exits),
-	VCPU_STAT("halt_poll_success_ns", halt_poll_success_ns),
-	VCPU_STAT("halt_poll_fail_ns", halt_poll_fail_ns),
+	VCPU_STAT_COM("halt_poll_success_ns", halt_poll_success_ns),
+	VCPU_STAT_COM("halt_poll_fail_ns", halt_poll_fail_ns),
 	{ NULL }
 };

diff --git a/arch/mips/include/asm/kvm_host.h b/arch/mips/include/asm/kvm_host.h index d0944a75fc8d..1673e5ff42d3 100644 --- a/arch/mips/include/asm/kvm_host.h +++ b/arch/mips/include/asm/kvm_host.h @@ -143,10 +143,11 @@ static inline bool kvm_is_error_hva(unsigned long addr) }
struct kvm_vm_stat { - ulong remote_tlb_flush; + struct kvm_vm_stat_common common; };
struct kvm_vcpu_stat { + struct kvm_vcpu_stat_common common; u64 wait_exits; u64 cache_exits; u64 signal_exits; @@ -178,12 +179,6 @@ struct kvm_vcpu_stat { u64 vz_cpucfg_exits; #endif #endif - u64 halt_successful_poll; - u64 halt_attempted_poll; - u64 halt_poll_success_ns; - u64 halt_poll_fail_ns; - u64 halt_poll_invalid; - u64 halt_wakeup; };
struct kvm_arch_memory_slot { diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c index 4a22ba70c943..011c59acd606 100644 --- a/arch/mips/kvm/mips.c +++ b/arch/mips/kvm/mips.c @@ -71,12 +71,12 @@ struct kvm_stats_debugfs_item debugfs_entries[] = { VCPU_STAT("vz_cpucfg", vz_cpucfg_exits), #endif #endif - VCPU_STAT("halt_successful_poll", halt_successful_poll), - VCPU_STAT("halt_attempted_poll", halt_attempted_poll), - VCPU_STAT("halt_poll_invalid", halt_poll_invalid), - VCPU_STAT("halt_wakeup", halt_wakeup), - VCPU_STAT("halt_poll_success_ns", halt_poll_success_ns), - VCPU_STAT("halt_poll_fail_ns", halt_poll_fail_ns), + VCPU_STAT_COM("halt_successful_poll", halt_successful_poll), + VCPU_STAT_COM("halt_attempted_poll", halt_attempted_poll), + VCPU_STAT_COM("halt_poll_invalid", halt_poll_invalid), + VCPU_STAT_COM("halt_wakeup", halt_wakeup), + VCPU_STAT_COM("halt_poll_success_ns", halt_poll_success_ns), + VCPU_STAT_COM("halt_poll_fail_ns", halt_poll_fail_ns), {NULL} };
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index 1e83359f286b..473d9d0804ff 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -80,12 +80,13 @@ struct kvmppc_book3s_shadow_vcpu; struct kvm_nested_guest;
struct kvm_vm_stat { - ulong remote_tlb_flush; + struct kvm_vm_stat_common common; ulong num_2M_pages; ulong num_1G_pages; };
struct kvm_vcpu_stat { + struct kvm_vcpu_stat_common common; u64 sum_exits; u64 mmio_exits; u64 signal_exits; @@ -101,14 +102,8 @@ struct kvm_vcpu_stat { u64 emulated_inst_exits; u64 dec_exits; u64 ext_intr_exits; - u64 halt_poll_success_ns; - u64 halt_poll_fail_ns; u64 halt_wait_ns; - u64 halt_successful_poll; - u64 halt_attempted_poll; u64 halt_successful_wait; - u64 halt_poll_invalid; - u64 halt_wakeup; u64 dbell_exits; u64 gdbell_exits; u64 ld; diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c index 2b691f4d1f26..bd3a10e1fdaf 100644 --- a/arch/powerpc/kvm/book3s.c +++ b/arch/powerpc/kvm/book3s.c @@ -47,14 +47,14 @@ struct kvm_stats_debugfs_item debugfs_entries[] = { VCPU_STAT("dec", dec_exits), VCPU_STAT("ext_intr", ext_intr_exits), VCPU_STAT("queue_intr", queue_intr), - VCPU_STAT("halt_poll_success_ns", halt_poll_success_ns), - VCPU_STAT("halt_poll_fail_ns", halt_poll_fail_ns), + VCPU_STAT_COM("halt_poll_success_ns", halt_poll_success_ns), + VCPU_STAT_COM("halt_poll_fail_ns", halt_poll_fail_ns), VCPU_STAT("halt_wait_ns", halt_wait_ns), - VCPU_STAT("halt_successful_poll", halt_successful_poll), - VCPU_STAT("halt_attempted_poll", halt_attempted_poll), + VCPU_STAT_COM("halt_successful_poll", halt_successful_poll), + VCPU_STAT_COM("halt_attempted_poll", halt_attempted_poll), VCPU_STAT("halt_successful_wait", halt_successful_wait), - VCPU_STAT("halt_poll_invalid", halt_poll_invalid), - VCPU_STAT("halt_wakeup", halt_wakeup), + VCPU_STAT_COM("halt_poll_invalid", halt_poll_invalid), + VCPU_STAT_COM("halt_wakeup", halt_wakeup), VCPU_STAT("pf_storage", pf_storage), VCPU_STAT("sp_storage", sp_storage), VCPU_STAT("pf_instruc", pf_instruc), diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 07682ad4110e..584f214a4a3c 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -236,7 +236,7 @@ static void kvmppc_fast_vcpu_kick_hv(struct kvm_vcpu *vcpu)
waitp = kvm_arch_vcpu_get_wait(vcpu); if (rcuwait_wake_up(waitp)) - ++vcpu->stat.halt_wakeup; + ++vcpu->stat.common.halt_wakeup;
cpu = READ_ONCE(vcpu->arch.thread_cpu); if (cpu >= 0 && kvmppc_ipi_thread(cpu)) @@ -3885,7 +3885,7 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc) cur = start_poll = ktime_get(); if (vc->halt_poll_ns) { ktime_t stop = ktime_add_ns(start_poll, vc->halt_poll_ns); - ++vc->runner->stat.halt_attempted_poll; + ++vc->runner->stat.common.halt_attempted_poll;
vc->vcore_state = VCORE_POLLING; spin_unlock(&vc->lock); @@ -3902,7 +3902,7 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc) vc->vcore_state = VCORE_INACTIVE;
if (!do_sleep) { - ++vc->runner->stat.halt_successful_poll; + ++vc->runner->stat.common.halt_successful_poll; goto out; } } @@ -3914,7 +3914,7 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc) do_sleep = 0; /* If we polled, count this as a successful poll */ if (vc->halt_poll_ns) - ++vc->runner->stat.halt_successful_poll; + ++vc->runner->stat.common.halt_successful_poll; goto out; }
@@ -3941,13 +3941,13 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc) ktime_to_ns(cur) - ktime_to_ns(start_wait); /* Attribute failed poll time */ if (vc->halt_poll_ns) - vc->runner->stat.halt_poll_fail_ns += + vc->runner->stat.common.halt_poll_fail_ns += ktime_to_ns(start_wait) - ktime_to_ns(start_poll); } else { /* Attribute successful poll time */ if (vc->halt_poll_ns) - vc->runner->stat.halt_poll_success_ns += + vc->runner->stat.common.halt_poll_success_ns += ktime_to_ns(cur) - ktime_to_ns(start_poll); } diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c index d7733b07f489..214caa9d9675 100644 --- a/arch/powerpc/kvm/book3s_pr.c +++ b/arch/powerpc/kvm/book3s_pr.c @@ -493,7 +493,7 @@ static void kvmppc_set_msr_pr(struct kvm_vcpu *vcpu, u64 msr) if (!vcpu->arch.pending_exceptions) { kvm_vcpu_block(vcpu); kvm_clear_request(KVM_REQ_UNHALT, vcpu); - vcpu->stat.halt_wakeup++; + vcpu->stat.common.halt_wakeup++;
/* Unset POW bit after we woke up */ msr &= ~MSR_POW; diff --git a/arch/powerpc/kvm/book3s_pr_papr.c b/arch/powerpc/kvm/book3s_pr_papr.c index 031c8015864a..9384625c8051 100644 --- a/arch/powerpc/kvm/book3s_pr_papr.c +++ b/arch/powerpc/kvm/book3s_pr_papr.c @@ -378,7 +378,7 @@ int kvmppc_h_pr(struct kvm_vcpu *vcpu, unsigned long cmd) kvmppc_set_msr_fast(vcpu, kvmppc_get_msr(vcpu) | MSR_EE); kvm_vcpu_block(vcpu); kvm_clear_request(KVM_REQ_UNHALT, vcpu); - vcpu->stat.halt_wakeup++; + vcpu->stat.common.halt_wakeup++; return EMULATE_DONE; case H_LOGICAL_CI_LOAD: return kvmppc_h_pr_logical_ci_load(vcpu); diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 7d5fe43f85c4..07fdd7a1254a 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -49,15 +49,15 @@ struct kvm_stats_debugfs_item debugfs_entries[] = { VCPU_STAT("inst_emu", emulated_inst_exits), VCPU_STAT("dec", dec_exits), VCPU_STAT("ext_intr", ext_intr_exits), - VCPU_STAT("halt_successful_poll", halt_successful_poll), - VCPU_STAT("halt_attempted_poll", halt_attempted_poll), - VCPU_STAT("halt_poll_invalid", halt_poll_invalid), - VCPU_STAT("halt_wakeup", halt_wakeup), + VCPU_STAT_COM("halt_successful_poll", halt_successful_poll), + VCPU_STAT_COM("halt_attempted_poll", halt_attempted_poll), + VCPU_STAT_COM("halt_poll_invalid", halt_poll_invalid), + VCPU_STAT_COM("halt_wakeup", halt_wakeup), VCPU_STAT("doorbell", dbell_exits), VCPU_STAT("guest doorbell", gdbell_exits), - VCPU_STAT("halt_poll_success_ns", halt_poll_success_ns), - VCPU_STAT("halt_poll_fail_ns", halt_poll_fail_ns), - VM_STAT("remote_tlb_flush", remote_tlb_flush), + VCPU_STAT_COM("halt_poll_success_ns", halt_poll_success_ns), + VCPU_STAT_COM("halt_poll_fail_ns", halt_poll_fail_ns), + VM_STAT_COM("remote_tlb_flush", remote_tlb_flush), { NULL } };
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h index 8925f3969478..57a20897f3db 100644 --- a/arch/s390/include/asm/kvm_host.h +++ b/arch/s390/include/asm/kvm_host.h @@ -361,6 +361,7 @@ struct sie_page { };
struct kvm_vcpu_stat { + struct kvm_vcpu_stat_common common; u64 exit_userspace; u64 exit_null; u64 exit_external_request; @@ -370,13 +371,7 @@ struct kvm_vcpu_stat { u64 exit_validity; u64 exit_instruction; u64 exit_pei; - u64 halt_successful_poll; - u64 halt_attempted_poll; - u64 halt_poll_invalid; u64 halt_no_poll_steal; - u64 halt_wakeup; - u64 halt_poll_success_ns; - u64 halt_poll_fail_ns; u64 instruction_lctl; u64 instruction_lctlg; u64 instruction_stctl; @@ -755,12 +750,12 @@ struct kvm_vcpu_arch { };
struct kvm_vm_stat { + struct kvm_vm_stat_common common; u64 inject_io; u64 inject_float_mchk; u64 inject_pfault_done; u64 inject_service_signal; u64 inject_virtio; - u64 remote_tlb_flush; };
struct kvm_arch_memory_slot { diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index 1296fc10f80c..d6bf3372bb10 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -72,13 +72,13 @@ struct kvm_stats_debugfs_item debugfs_entries[] = { VCPU_STAT("exit_program_interruption", exit_program_interruption), VCPU_STAT("exit_instr_and_program_int", exit_instr_and_program), VCPU_STAT("exit_operation_exception", exit_operation_exception), - VCPU_STAT("halt_successful_poll", halt_successful_poll), - VCPU_STAT("halt_attempted_poll", halt_attempted_poll), - VCPU_STAT("halt_poll_invalid", halt_poll_invalid), + VCPU_STAT_COM("halt_successful_poll", halt_successful_poll), + VCPU_STAT_COM("halt_attempted_poll", halt_attempted_poll), + VCPU_STAT_COM("halt_poll_invalid", halt_poll_invalid), VCPU_STAT("halt_no_poll_steal", halt_no_poll_steal), - VCPU_STAT("halt_wakeup", halt_wakeup), - VCPU_STAT("halt_poll_success_ns", halt_poll_success_ns), - VCPU_STAT("halt_poll_fail_ns", halt_poll_fail_ns), + VCPU_STAT_COM("halt_wakeup", halt_wakeup), + VCPU_STAT_COM("halt_poll_success_ns", halt_poll_success_ns), + VCPU_STAT_COM("halt_poll_fail_ns", halt_poll_fail_ns), VCPU_STAT("instruction_lctlg", instruction_lctlg), VCPU_STAT("instruction_lctl", instruction_lctl), VCPU_STAT("instruction_stctl", instruction_stctl), diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 3e5fc80a35c8..911fb56b5806 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1127,6 +1127,7 @@ struct kvm_arch { };
struct kvm_vm_stat { + struct kvm_vm_stat_common common; ulong mmu_shadow_zapped; ulong mmu_pte_write; ulong mmu_pde_zapped; @@ -1134,13 +1135,13 @@ struct kvm_vm_stat { ulong mmu_recycled; ulong mmu_cache_miss; ulong mmu_unsync; - ulong remote_tlb_flush; ulong lpages; ulong nx_lpage_splits; ulong max_mmu_page_hash_collisions; };
struct kvm_vcpu_stat { + struct kvm_vcpu_stat_common common; u64 pf_fixed; u64 pf_guest; u64 tlb_flush; @@ -1154,10 +1155,6 @@ struct kvm_vcpu_stat { u64 nmi_window_exits; u64 l1d_flush; u64 halt_exits; - u64 halt_successful_poll; - u64 halt_attempted_poll; - u64 halt_poll_invalid; - u64 halt_wakeup; u64 request_irq_exits; u64 irq_exits; u64 host_state_reload; @@ -1168,8 +1165,6 @@ struct kvm_vcpu_stat { u64 irq_injections; u64 nmi_injections; u64 req_event; - u64 halt_poll_success_ns; - u64 halt_poll_fail_ns; u64 nested_run; u64 directed_yield_attempted; u64 directed_yield_successful; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 3bf52ba5f2bb..e1207fd8b40d 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -229,10 +229,10 @@ struct kvm_stats_debugfs_item debugfs_entries[] = { VCPU_STAT("irq_window", irq_window_exits), VCPU_STAT("nmi_window", nmi_window_exits), VCPU_STAT("halt_exits", halt_exits), - VCPU_STAT("halt_successful_poll", halt_successful_poll), - VCPU_STAT("halt_attempted_poll", halt_attempted_poll), - VCPU_STAT("halt_poll_invalid", halt_poll_invalid), - VCPU_STAT("halt_wakeup", halt_wakeup), + VCPU_STAT_COM("halt_successful_poll", halt_successful_poll), + VCPU_STAT_COM("halt_attempted_poll", halt_attempted_poll), + VCPU_STAT_COM("halt_poll_invalid", halt_poll_invalid), + VCPU_STAT_COM("halt_wakeup", halt_wakeup), VCPU_STAT("hypercalls", hypercalls), VCPU_STAT("request_irq", request_irq_exits), VCPU_STAT("irq_exits", irq_exits), @@ -244,8 +244,8 @@ struct kvm_stats_debugfs_item debugfs_entries[] = { VCPU_STAT("nmi_injections", nmi_injections), VCPU_STAT("req_event", req_event), VCPU_STAT("l1d_flush", l1d_flush), - VCPU_STAT("halt_poll_success_ns", halt_poll_success_ns), - VCPU_STAT("halt_poll_fail_ns", halt_poll_fail_ns), + VCPU_STAT_COM("halt_poll_success_ns", halt_poll_success_ns), + VCPU_STAT_COM("halt_poll_fail_ns", halt_poll_fail_ns), VCPU_STAT("nested_run", nested_run), VCPU_STAT("directed_yield_attempted", 
directed_yield_attempted), VCPU_STAT("directed_yield_successful", directed_yield_successful), @@ -256,7 +256,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = { VM_STAT("mmu_recycled", mmu_recycled), VM_STAT("mmu_cache_miss", mmu_cache_miss), VM_STAT("mmu_unsync", mmu_unsync), - VM_STAT("remote_tlb_flush", remote_tlb_flush), + VM_STAT_COM("remote_tlb_flush", remote_tlb_flush), VM_STAT("largepages", lpages, .mode = 0444), VM_STAT("nx_largepages_splitted", nx_lpage_splits, .mode = 0444), VM_STAT("max_mmu_page_hash_collisions", max_mmu_page_hash_collisions), diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index a9a7bcf6ebee..9286516094e3 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1208,6 +1208,11 @@ struct kvm_stats_debugfs_item { { n, offsetof(struct kvm, stat.x), KVM_STAT_VM, ## __VA_ARGS__ } #define VCPU_STAT(n, x, ...) \ { n, offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU, ## __VA_ARGS__ } +#define VM_STAT_COM(n, x, ...) \ + { n, offsetof(struct kvm, stat.common.x), KVM_STAT_VM, ## __VA_ARGS__ } +#define VCPU_STAT_COM(n, x, ...) \ + { n, offsetof(struct kvm_vcpu, stat.common.x), \ + KVM_STAT_VCPU, ## __VA_ARGS__ }
extern struct kvm_stats_debugfs_item debugfs_entries[]; extern struct dentry *kvm_debugfs_dir; diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h index a7580f69dda0..87eb05ad678b 100644 --- a/include/linux/kvm_types.h +++ b/include/linux/kvm_types.h @@ -76,5 +76,17 @@ struct kvm_mmu_memory_cache { }; #endif
+struct kvm_vm_stat_common {
+	ulong remote_tlb_flush;
+};
+
+struct kvm_vcpu_stat_common {
+	u64 halt_successful_poll;
+	u64 halt_attempted_poll;
+	u64 halt_poll_invalid;
+	u64 halt_wakeup;
+	u64 halt_poll_success_ns;
+	u64 halt_poll_fail_ns;
+};
#endif /* __KVM_TYPES_H__ */ diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 9ac70594d133..cdf53fb75ca1 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -330,7 +330,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm) */ if (!kvm_arch_flush_remote_tlb(kvm) || kvm_make_all_cpus_request(kvm, KVM_REQ_TLB_FLUSH)) - ++kvm->stat.remote_tlb_flush; + ++kvm->stat.common.remote_tlb_flush; cmpxchg(&kvm->tlbs_dirty, dirty_count, 0); } EXPORT_SYMBOL_GPL(kvm_flush_remote_tlbs); @@ -2990,9 +2990,9 @@ static inline void update_halt_poll_stats(struct kvm_vcpu *vcpu, u64 poll_ns, bool waited) { if (waited) - vcpu->stat.halt_poll_fail_ns += poll_ns; + vcpu->stat.common.halt_poll_fail_ns += poll_ns; else - vcpu->stat.halt_poll_success_ns += poll_ns; + vcpu->stat.common.halt_poll_success_ns += poll_ns; }
/* @@ -3010,16 +3010,16 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu) if (vcpu->halt_poll_ns && !kvm_arch_no_poll(vcpu)) { ktime_t stop = ktime_add_ns(ktime_get(), vcpu->halt_poll_ns);
- ++vcpu->stat.halt_attempted_poll; + ++vcpu->stat.common.halt_attempted_poll; do { /* * This sets KVM_REQ_UNHALT if an interrupt * arrives. */ if (kvm_vcpu_check_block(vcpu) < 0) { - ++vcpu->stat.halt_successful_poll; + ++vcpu->stat.common.halt_successful_poll; if (!vcpu_valid_wakeup(vcpu)) - ++vcpu->stat.halt_poll_invalid; + ++vcpu->stat.common.halt_poll_invalid; goto out; } poll_end = cur = ktime_get(); @@ -3076,7 +3076,7 @@ bool kvm_vcpu_wake_up(struct kvm_vcpu *vcpu) waitp = kvm_arch_vcpu_get_wait(vcpu); if (rcuwait_wake_up(waitp)) { WRITE_ONCE(vcpu->ready, true); - ++vcpu->stat.halt_wakeup; + ++vcpu->stat.common.halt_wakeup; return true; }
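The new VM_STAT_COM and VCPU_STAT_COM macros in this patch rely on offsetof() accepting a nested member designator (stat.common.x). The following is a minimal user space sketch of that mechanism; the struct layouts are simplified stand-ins, and `stat_item` and `stat_read` are hypothetical names used only for illustration, not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel structures touched by this patch. */
struct vcpu_stat_common {
	uint64_t halt_successful_poll;
	uint64_t halt_wakeup;
};

struct vcpu_stat {
	struct vcpu_stat_common common;	/* shared by all architectures */
	uint64_t exits;			/* architecture specific */
};

struct vcpu {
	int id;
	struct vcpu_stat stat;
};

/* Hypothetical, reduced form of a debugfs item: name plus byte offset. */
struct stat_item {
	const char *name;
	size_t offset;
};

#define VCPU_STAT(n, x)     { n, offsetof(struct vcpu, stat.x) }
#define VCPU_STAT_COM(n, x) { n, offsetof(struct vcpu, stat.common.x) }

static const struct stat_item items[] = {
	VCPU_STAT_COM("halt_wakeup", halt_wakeup),
	VCPU_STAT("exits", exits),
};

/* Generic reader: needs only the offset, not the field name. */
static uint64_t stat_read(const struct vcpu *v, const struct stat_item *it)
{
	return *(const uint64_t *)((const char *)v + it->offset);
}
```

With this layout, generic code can walk the table without knowing whether a field lives in the common part or in the architecture-specific part.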
On Thu, 29 Apr 2021 21:37:37 +0100, Jing Zhang <jingzhangos@google.com> wrote:

> +struct kvm_vm_stat_common {
> +	ulong remote_tlb_flush;
> +};
> +
> +struct kvm_vcpu_stat_common {
> +	u64 halt_successful_poll;
> +	u64 halt_attempted_poll;
> +	u64 halt_poll_invalid;
> +	u64 halt_wakeup;
> +	u64 halt_poll_success_ns;
> +	u64 halt_poll_fail_ns;
> +};

Why can't we make everything a u64? Is there anything that really needs to be a ulong? On most architectures, they are the same anyway, so we might as well bite the bullet.
M.
Hi Marc,
On Fri, Apr 30, 2021 at 7:07 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On Thu, 29 Apr 2021 21:37:37 +0100, Jing Zhang <jingzhangos@google.com> wrote:
>
> > +struct kvm_vm_stat_common {
> > +	ulong remote_tlb_flush;
> > +};
> > +
> > +struct kvm_vcpu_stat_common {
> > +	u64 halt_successful_poll;
> > +	u64 halt_attempted_poll;
> > +	u64 halt_poll_invalid;
> > +	u64 halt_wakeup;
> > +	u64 halt_poll_success_ns;
> > +	u64 halt_poll_fail_ns;
> > +};
>
> Why can't we make everything a u64? Is there anything that really needs to be a ulong? On most architectures, they are the same anyway, so we might as well bite the bullet.

That's a question I have asked myself many times. It is a little bit annoying to handle different types for VM and VCPU stats. This divergence was from the commit 8a7e75d47b681933, which says "However vm statistics could potentially be updated by multiple vcpus from that vm at a time. To avoid the overhead of atomics make all vm statistics ulong such that they are 64-bit on 64-bit systems where they can be atomically incremented and are 32-bit on 32-bit systems which may not be able to atomically increment 64-bit numbers."
I would be very happy if there is a lock-free way to use u64 for VM stats. Please let me know if anyone has any idea about this.
> M.
>
> --
> Without deviation from the norm, progress is not possible.

Thanks, Jing
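
The width question discussed above can be checked with a small user space sketch. It assumes an LP64 or ILP32 data model, as used by Linux, where unsigned long is as wide as a pointer; the helper names are hypothetical and this is not kernel code.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* uint64_t is exactly 64 bits wherever it exists. */
static_assert(sizeof(uint64_t) * CHAR_BIT == 64, "uint64_t must be 64 bits");

/*
 * True when unsigned long matches the pointer width. This holds on
 * LP64 and ILP32 targets; it is an assumption, not a C guarantee.
 */
static inline bool ulong_is_pointer_sized(void)
{
	return sizeof(unsigned long) == sizeof(void *);
}

/*
 * True when replacing ulong with u64 would not change the counter
 * width, i.e. the "they are the same anyway" case.
 */
static inline bool ulong_matches_u64(void)
{
	return sizeof(unsigned long) * CHAR_BIT == 64;
}
```

On a 64-bit build both helpers return true, so the two types only diverge on 32-bit targets, which is the case the quoted commit message is concerned with.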
Provides a file descriptor per VM to read VM stats info/data. Provides a file descriptor per vCPU to read vCPU stats info/data.
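
In the diff for this patch, each architecture fills in a header whose desc_offset is the header size and whose data_offset is the header size plus the size of the descriptor array. The following is a minimal sketch of that layout arithmetic. The struct definitions, the value of NAME_LEN and the assumption that every value is a 64-bit word are hypothetical stand-ins, since the real definitions live in headers not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NAME_LEN 48	/* hypothetical; stands in for KVM_STATS_NAME_LEN */

/* Hypothetical header, using the field names seen in the diff. */
struct stats_header {
	uint32_t name_size;
	uint32_t count;
	uint32_t desc_offset;
	uint32_t data_offset;
};

/* Hypothetical descriptor: attribute flags plus a fixed-size name. */
struct stats_desc {
	uint32_t flags;
	char name[NAME_LEN];
};

/* Lay out: header, then 'count' descriptors, then the data block. */
static struct stats_header make_header(uint32_t count)
{
	struct stats_header h = {
		.name_size = NAME_LEN,
		.count = count,
		.desc_offset = sizeof(struct stats_header),
		.data_offset = sizeof(struct stats_header) +
			       count * sizeof(struct stats_desc),
	};
	return h;
}

/* Byte offset of the i-th value, assuming one 64-bit word per value. */
static size_t stat_value_offset(const struct stats_header *h, uint32_t i)
{
	assert(i < h->count);
	return h->data_offset + (size_t)i * sizeof(uint64_t);
}
```

Keeping the offsets in the header means a reader does not need to hard-code the descriptor count or size; it can seek straight to the data block on every poll.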
Signed-off-by: Jing Zhang <jingzhangos@google.com>
---
 arch/arm64/kvm/guest.c    |  30 +++++
 arch/mips/kvm/mips.c      |  55 ++++++++++
 arch/powerpc/kvm/book3s.c |  56 ++++++++++
 arch/powerpc/kvm/booke.c  |  49 +++++++++
 arch/s390/kvm/kvm-s390.c  | 121 +++++++++++++++++++++
 arch/x86/kvm/x86.c        |  57 ++++++++++
 include/linux/kvm_host.h  | 127 +++++++++++++++++++++-
 include/uapi/linux/kvm.h  |  50 +++++++++
 virt/kvm/kvm_main.c       | 223 ++++++++++++++++++++++++++++++++++++++
 9 files changed, 766 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 0e41331b0911..cf04c2a3a1ce 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -28,6 +28,36 @@

 #include "trace.h"

+struct _kvm_stats_desc kvm_vm_stats_desc[] = {
+	STATS_VM_COMMON,
+};
+
+struct _kvm_stats_header kvm_vm_stats_header = {
+	.name_size = KVM_STATS_NAME_LEN,
+	.count = ARRAY_SIZE(kvm_vm_stats_desc),
+	.desc_offset = sizeof(struct kvm_stats_header),
+	.data_offset = sizeof(struct kvm_stats_header) +
+		       sizeof(kvm_vm_stats_desc),
+};
+
+struct _kvm_stats_desc kvm_vcpu_stats_desc[] = {
+	STATS_VCPU_COMMON,
+	STATS_DESC_COUNTER("hvc_exit_stat"),
+	STATS_DESC_COUNTER("wfe_exit_stat"),
+	STATS_DESC_COUNTER("wfi_exit_stat"),
+	STATS_DESC_COUNTER("mmio_exit_user"),
+	STATS_DESC_COUNTER("mmio_exit_kernel"),
+	STATS_DESC_COUNTER("exits"),
+};
+
+struct _kvm_stats_header kvm_vcpu_stats_header = {
+	.name_size = KVM_STATS_NAME_LEN,
+	.count = ARRAY_SIZE(kvm_vcpu_stats_desc),
+	.desc_offset = sizeof(struct kvm_stats_header),
+	.data_offset = sizeof(struct kvm_stats_header) +
+		       sizeof(kvm_vcpu_stats_desc),
+};
+
 struct kvm_stats_debugfs_item debugfs_entries[] = {
 	VCPU_STAT_COM("halt_successful_poll", halt_successful_poll),
 	VCPU_STAT_COM("halt_attempted_poll", halt_attempted_poll),
diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index 011c59acd606..ced50e8c1bb2 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -39,6 +39,61 @@
 #define VECTORSPACING 0x100 /* for EI/VI mode */
 #endif

+struct _kvm_stats_desc kvm_vm_stats_desc[] = { + STATS_VM_COMMON, +}; + +struct _kvm_stats_header kvm_vm_stats_header = { + .name_size = KVM_STATS_NAME_LEN, + .count = ARRAY_SIZE(kvm_vm_stats_desc), + .desc_offset = sizeof(struct kvm_stats_header), + .data_offset = sizeof(struct kvm_stats_header) + + sizeof(kvm_vm_stats_desc), +}; + +struct _kvm_stats_desc kvm_vcpu_stats_desc[] = { + STATS_VCPU_COMMON, + STATS_DESC_COUNTER("wait_exits"), + STATS_DESC_COUNTER("cache_exits"), + STATS_DESC_COUNTER("signal_exits"), + STATS_DESC_COUNTER("int_exits"), + STATS_DESC_COUNTER("cop_unusable_exits"), + STATS_DESC_COUNTER("tlbmod_exits"), + STATS_DESC_COUNTER("tlbmiss_ld_exits"), + STATS_DESC_COUNTER("tlbmiss_st_exits"), + STATS_DESC_COUNTER("addrerr_st_exits"), + STATS_DESC_COUNTER("addrerr_ld_exits"), + STATS_DESC_COUNTER("syscall_exits"), + STATS_DESC_COUNTER("resvd_inst_exits"), + STATS_DESC_COUNTER("break_inst_exits"), + STATS_DESC_COUNTER("trap_inst_exits"), + STATS_DESC_COUNTER("msa_fpe_exits"), + STATS_DESC_COUNTER("fpe_exits"), + STATS_DESC_COUNTER("msa_disabled_exits"), + STATS_DESC_COUNTER("flush_dcache_exits"), +#ifdef CONFIG_KVM_MIPS_VZ + STATS_DESC_COUNTER("vz_gpsi_exits"), + STATS_DESC_COUNTER("vz_gsfc_exits"), + STATS_DESC_COUNTER("vz_hc_exits"), + STATS_DESC_COUNTER("vz_grr_exits"), + STATS_DESC_COUNTER("vz_gva_exits"), + STATS_DESC_COUNTER("vz_ghfc_exits"), + STATS_DESC_COUNTER("vz_gpa_exits"), + STATS_DESC_COUNTER("vz_resvd_exits"), +#ifdef CONFIG_CPU_LOONGSON64 + STATS_DESC_COUNTER("vz_cpucfg_exits"), +#endif +#endif +}; + +struct _kvm_stats_header kvm_vcpu_stats_header = { + .name_size = KVM_STATS_NAME_LEN, + .count = ARRAY_SIZE(kvm_vcpu_stats_desc), + .desc_offset = sizeof(struct kvm_stats_header), + .data_offset = sizeof(struct kvm_stats_header) + + sizeof(kvm_vcpu_stats_desc), +}; + struct kvm_stats_debugfs_item debugfs_entries[] = { VCPU_STAT("wait", wait_exits), VCPU_STAT("cache", cache_exits), diff --git a/arch/powerpc/kvm/book3s.c 
b/arch/powerpc/kvm/book3s.c index bd3a10e1fdaf..9dc2510537ce 100644 --- a/arch/powerpc/kvm/book3s.c +++ b/arch/powerpc/kvm/book3s.c @@ -38,6 +38,62 @@
/* #define EXIT_DEBUG */
+struct _kvm_stats_desc kvm_vm_stats_desc[] = { + STATS_VM_COMMON, + STATS_DESC_ICOUNTER("num_2M_pages"), + STATS_DESC_ICOUNTER("num_1G_pages"), +}; + +struct _kvm_stats_header kvm_vm_stats_header = { + .name_size = KVM_STATS_NAME_LEN, + .count = ARRAY_SIZE(kvm_vm_stats_desc), + .desc_offset = sizeof(struct kvm_stats_header), + .data_offset = sizeof(struct kvm_stats_header) + + sizeof(kvm_vm_stats_desc), +}; + +struct _kvm_stats_desc kvm_vcpu_stats_desc[] = { + STATS_VCPU_COMMON, + STATS_DESC_COUNTER("sum_exits"), + STATS_DESC_COUNTER("mmio_exits"), + STATS_DESC_COUNTER("signal_exits"), + STATS_DESC_COUNTER("light_exits"), + STATS_DESC_COUNTER("itlb_real_miss_exits"), + STATS_DESC_COUNTER("itlb_virt_miss_exits"), + STATS_DESC_COUNTER("dtlb_real_miss_exits"), + STATS_DESC_COUNTER("dtlb_virt_miss_exits"), + STATS_DESC_COUNTER("syscall_exits"), + STATS_DESC_COUNTER("isi_exits"), + STATS_DESC_COUNTER("dsi_exits"), + STATS_DESC_COUNTER("emulated_inst_exits"), + STATS_DESC_COUNTER("dec_exits"), + STATS_DESC_COUNTER("ext_intr_exits"), + STATS_DESC_TIME_NSEC("halt_wait_ns"), + STATS_DESC_COUNTER("halt_successful_wait"), + STATS_DESC_COUNTER("dbell_exits"), + STATS_DESC_COUNTER("gdbell_exits"), + STATS_DESC_COUNTER("ld"), + STATS_DESC_COUNTER("st"), + STATS_DESC_COUNTER("pf_storage"), + STATS_DESC_COUNTER("pf_instruc"), + STATS_DESC_COUNTER("sp_storage"), + STATS_DESC_COUNTER("sp_instruc"), + STATS_DESC_COUNTER("queue_intr"), + STATS_DESC_COUNTER("ld_slow"), + STATS_DESC_COUNTER("st_slow"), + STATS_DESC_COUNTER("pthru_all"), + STATS_DESC_COUNTER("pthru_host"), + STATS_DESC_COUNTER("pthru_bad_aff"), +}; + +struct _kvm_stats_header kvm_vcpu_stats_header = { + .name_size = KVM_STATS_NAME_LEN, + .count = ARRAY_SIZE(kvm_vcpu_stats_desc), + .desc_offset = sizeof(struct kvm_stats_header), + .data_offset = sizeof(struct kvm_stats_header) + + sizeof(kvm_vcpu_stats_desc), +}; + struct kvm_stats_debugfs_item debugfs_entries[] = { VCPU_STAT("exits", sum_exits), VCPU_STAT("mmio", 
mmio_exits), diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 07fdd7a1254a..e9ffcf0f022d 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -36,6 +36,55 @@
unsigned long kvmppc_booke_handlers;
+struct _kvm_stats_desc kvm_vm_stats_desc[] = {
+	STATS_VM_COMMON,
+	STATS_DESC_ICOUNTER("num_2M_pages"),
+	STATS_DESC_ICOUNTER("num_1G_pages"),
+};
+
+struct _kvm_stats_header kvm_vm_stats_header = {
+	.name_size = KVM_STATS_NAME_LEN,
+	.count = ARRAY_SIZE(kvm_vm_stats_desc),
+	.desc_offset = sizeof(struct kvm_stats_header),
+	.data_offset = sizeof(struct kvm_stats_header) +
+			sizeof(kvm_vm_stats_desc),
+};
+
+struct _kvm_stats_desc kvm_vcpu_stats_desc[] = {
+	STATS_VCPU_COMMON,
+	STATS_DESC_COUNTER("sum_exits"),
+	STATS_DESC_COUNTER("mmio_exits"),
+	STATS_DESC_COUNTER("signal_exits"),
+	STATS_DESC_COUNTER("light_exits"),
+	STATS_DESC_COUNTER("itlb_real_miss_exits"),
+	STATS_DESC_COUNTER("itlb_virt_miss_exits"),
+	STATS_DESC_COUNTER("dtlb_real_miss_exits"),
+	STATS_DESC_COUNTER("dtlb_virt_miss_exits"),
+	STATS_DESC_COUNTER("syscall_exits"),
+	STATS_DESC_COUNTER("isi_exits"),
+	STATS_DESC_COUNTER("dsi_exits"),
+	STATS_DESC_COUNTER("emulated_inst_exits"),
+	STATS_DESC_COUNTER("dec_exits"),
+	STATS_DESC_COUNTER("ext_intr_exits"),
+	STATS_DESC_TIME_NSEC("halt_wait_ns"),
+	STATS_DESC_COUNTER("halt_successful_wait"),
+	STATS_DESC_COUNTER("dbell_exits"),
+	STATS_DESC_COUNTER("gdbell_exits"),
+	STATS_DESC_COUNTER("ld"),
+	STATS_DESC_COUNTER("st"),
+	STATS_DESC_COUNTER("pthru_all"),
+	STATS_DESC_COUNTER("pthru_host"),
+	STATS_DESC_COUNTER("pthru_bad_aff"),
+};
+
+struct _kvm_stats_header kvm_vcpu_stats_header = {
+	.name_size = KVM_STATS_NAME_LEN,
+	.count = ARRAY_SIZE(kvm_vcpu_stats_desc),
+	.desc_offset = sizeof(struct kvm_stats_header),
+	.data_offset = sizeof(struct kvm_stats_header) +
+			sizeof(kvm_vcpu_stats_desc),
+};
+
 struct kvm_stats_debugfs_item debugfs_entries[] = {
 	VCPU_STAT("mmio", mmio_exits),
 	VCPU_STAT("sig", signal_exits),
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index d6bf3372bb10..2c91d70754a9 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -58,6 +58,127 @@
 #define VCPU_IRQS_MAX_BUF (sizeof(struct kvm_s390_irq) * \
 			   (KVM_MAX_VCPUS + LOCAL_IRQS))
+struct _kvm_stats_desc kvm_vm_stats_desc[] = { + STATS_VM_COMMON, + STATS_DESC_COUNTER("inject_io"), + STATS_DESC_COUNTER("inject_float_mchk"), + STATS_DESC_COUNTER("inject_pfault_done"), + STATS_DESC_COUNTER("inject_service_signal"), + STATS_DESC_COUNTER("inject_virtio"), +}; + +struct _kvm_stats_header kvm_vm_stats_header = { + .name_size = KVM_STATS_NAME_LEN, + .count = ARRAY_SIZE(kvm_vm_stats_desc), + .desc_offset = sizeof(struct kvm_stats_header), + .data_offset = sizeof(struct kvm_stats_header) + + sizeof(kvm_vm_stats_desc), +}; + +struct _kvm_stats_desc kvm_vcpu_stats_desc[] = { + STATS_VCPU_COMMON, + STATS_DESC_COUNTER("exit_userspace"), + STATS_DESC_COUNTER("exit_null"), + STATS_DESC_COUNTER("exit_external_request"), + STATS_DESC_COUNTER("exit_io_request"), + STATS_DESC_COUNTER("exit_external_interrupt"), + STATS_DESC_COUNTER("exit_stop_request"), + STATS_DESC_COUNTER("exit_validity"), + STATS_DESC_COUNTER("exit_instruction"), + STATS_DESC_COUNTER("exit_pei"), + STATS_DESC_COUNTER("halt_no_poll_steal"), + STATS_DESC_COUNTER("instruction_lctl"), + STATS_DESC_COUNTER("instruction_lctlg"), + STATS_DESC_COUNTER("instruction_stctl"), + STATS_DESC_COUNTER("instruction_stctg"), + STATS_DESC_COUNTER("exit_program_interruption"), + STATS_DESC_COUNTER("exit_instr_and_program"), + STATS_DESC_COUNTER("exit_operation_exception"), + STATS_DESC_COUNTER("deliver_ckc"), + STATS_DESC_COUNTER("deliver_cputm"), + STATS_DESC_COUNTER("deliver_external_call"), + STATS_DESC_COUNTER("deliver_emergency_signal"), + STATS_DESC_COUNTER("deliver_service_signal"), + STATS_DESC_COUNTER("deliver_virtio"), + STATS_DESC_COUNTER("deliver_stop_signal"), + STATS_DESC_COUNTER("deliver_prefix_signal"), + STATS_DESC_COUNTER("deliver_restart_signal"), + STATS_DESC_COUNTER("deliver_program"), + STATS_DESC_COUNTER("deliver_io"), + STATS_DESC_COUNTER("deliver_machine_check"), + STATS_DESC_COUNTER("exit_wait_state"), + STATS_DESC_COUNTER("inject_ckc"), + STATS_DESC_COUNTER("inject_cputm"), + 
STATS_DESC_COUNTER("inject_external_call"), + STATS_DESC_COUNTER("inject_emergency_signal"), + STATS_DESC_COUNTER("inject_mchk"), + STATS_DESC_COUNTER("inject_pfault_init"), + STATS_DESC_COUNTER("inject_program"), + STATS_DESC_COUNTER("inject_restart"), + STATS_DESC_COUNTER("inject_set_prefix"), + STATS_DESC_COUNTER("inject_stop_signal"), + STATS_DESC_COUNTER("instruction_epsw"), + STATS_DESC_COUNTER("instruction_gs"), + STATS_DESC_COUNTER("instruction_io_other"), + STATS_DESC_COUNTER("instruction_lpsw"), + STATS_DESC_COUNTER("instruction_lpswe"), + STATS_DESC_COUNTER("instruction_pfmf"), + STATS_DESC_COUNTER("instruction_ptff"), + STATS_DESC_COUNTER("instruction_sck"), + STATS_DESC_COUNTER("instruction_sckpf"), + STATS_DESC_COUNTER("instruction_stidp"), + STATS_DESC_COUNTER("instruction_spx"), + STATS_DESC_COUNTER("instruction_stpx"), + STATS_DESC_COUNTER("instruction_stap"), + STATS_DESC_COUNTER("instruction_iske"), + STATS_DESC_COUNTER("instruction_ri"), + STATS_DESC_COUNTER("instruction_rrbe"), + STATS_DESC_COUNTER("instruction_sske"), + STATS_DESC_COUNTER("instruction_ipte_interlock"), + STATS_DESC_COUNTER("instruction_stsi"), + STATS_DESC_COUNTER("instruction_stfl"), + STATS_DESC_COUNTER("instruction_tb"), + STATS_DESC_COUNTER("instruction_tpi"), + STATS_DESC_COUNTER("instruction_tprot"), + STATS_DESC_COUNTER("instruction_tsch"), + STATS_DESC_COUNTER("instruction_sie"), + STATS_DESC_COUNTER("instruction_essa"), + STATS_DESC_COUNTER("instruction_sthyi"), + STATS_DESC_COUNTER("instruction_sigp_sense"), + STATS_DESC_COUNTER("instruction_sigp_sense_running"), + STATS_DESC_COUNTER("instruction_sigp_external_call"), + STATS_DESC_COUNTER("instruction_sigp_emergency"), + STATS_DESC_COUNTER("instruction_sigp_cond_emergency"), + STATS_DESC_COUNTER("instruction_sigp_start"), + STATS_DESC_COUNTER("instruction_sigp_stop"), + STATS_DESC_COUNTER("instruction_sigp_stop_store_status"), + STATS_DESC_COUNTER("instruction_sigp_store_status"), + 
STATS_DESC_COUNTER("instruction_sigp_store_adtl_status"), + STATS_DESC_COUNTER("instruction_sigp_arch"), + STATS_DESC_COUNTER("instruction_sigp_prefix"), + STATS_DESC_COUNTER("instruction_sigp_restart"), + STATS_DESC_COUNTER("instruction_sigp_init_cpu_reset"), + STATS_DESC_COUNTER("instruction_sigp_cpu_reset"), + STATS_DESC_COUNTER("instruction_sigp_unknown"), + STATS_DESC_COUNTER("diagnose_10"), + STATS_DESC_COUNTER("diagnose_44"), + STATS_DESC_COUNTER("diagnose_9c"), + STATS_DESC_COUNTER("diagnose_9c_ignored"), + STATS_DESC_COUNTER("diagnose_258"), + STATS_DESC_COUNTER("diagnose_308"), + STATS_DESC_COUNTER("diagnose_500"), + STATS_DESC_COUNTER("diagnose_other"), + STATS_DESC_COUNTER("pfault_sync"), +}; + +struct _kvm_stats_header kvm_vcpu_stats_header = { + .name_size = KVM_STATS_NAME_LEN, + .count = ARRAY_SIZE(kvm_vcpu_stats_desc), + .desc_offset = sizeof(struct kvm_stats_header), + .data_offset = sizeof(struct kvm_stats_header) + + sizeof(kvm_vcpu_stats_desc), +}; + struct kvm_stats_debugfs_item debugfs_entries[] = { VCPU_STAT("userspace_handled", exit_userspace), VCPU_STAT("exit_null", exit_null), diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index e1207fd8b40d..dc55de1f958f 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -217,6 +217,63 @@ EXPORT_SYMBOL_GPL(host_xss); u64 __read_mostly supported_xss; EXPORT_SYMBOL_GPL(supported_xss);
+struct _kvm_stats_desc kvm_vm_stats_desc[] = { + STATS_VM_COMMON, + STATS_DESC_COUNTER("mmu_shadow_zapped"), + STATS_DESC_COUNTER("mmu_pte_write"), + STATS_DESC_COUNTER("mmu_pde_zapped"), + STATS_DESC_COUNTER("mmu_flooded"), + STATS_DESC_COUNTER("mmu_recycled"), + STATS_DESC_COUNTER("mmu_cache_miss"), + STATS_DESC_ICOUNTER("mmu_unsync"), + STATS_DESC_ICOUNTER("largepages"), + STATS_DESC_ICOUNTER("nx_largepages_splits"), + STATS_DESC_ICOUNTER("max_mmu_page_hash_collisions"), +}; + +struct _kvm_stats_header kvm_vm_stats_header = { + .name_size = KVM_STATS_NAME_LEN, + .count = ARRAY_SIZE(kvm_vm_stats_desc), + .desc_offset = sizeof(struct kvm_stats_header), + .data_offset = sizeof(struct kvm_stats_header) + + sizeof(kvm_vm_stats_desc), +}; + +struct _kvm_stats_desc kvm_vcpu_stats_desc[] = { + STATS_VCPU_COMMON, + STATS_DESC_COUNTER("pf_fixed"), + STATS_DESC_COUNTER("pf_guest"), + STATS_DESC_COUNTER("tlb_flush"), + STATS_DESC_COUNTER("invlpg"), + STATS_DESC_COUNTER("exits"), + STATS_DESC_COUNTER("io_exits"), + STATS_DESC_COUNTER("mmio_exits"), + STATS_DESC_COUNTER("signal_exits"), + STATS_DESC_COUNTER("irq_window_exits"), + STATS_DESC_COUNTER("nmi_window_exits"), + STATS_DESC_COUNTER("l1d_flush"), + STATS_DESC_COUNTER("halt_exits"), + STATS_DESC_COUNTER("request_irq_exits"), + STATS_DESC_COUNTER("irq_exits"), + STATS_DESC_COUNTER("host_state_reload"), + STATS_DESC_COUNTER("fpu_reload"), + STATS_DESC_COUNTER("insn_emulation"), + STATS_DESC_COUNTER("insn_emulation_fail"), + STATS_DESC_COUNTER("hypercalls"), + STATS_DESC_COUNTER("irq_injections"), + STATS_DESC_COUNTER("nmi_injections"), + STATS_DESC_COUNTER("req_event"), + STATS_DESC_COUNTER("nested_run"), +}; + +struct _kvm_stats_header kvm_vcpu_stats_header = { + .name_size = KVM_STATS_NAME_LEN, + .count = ARRAY_SIZE(kvm_vcpu_stats_desc), + .desc_offset = sizeof(struct kvm_stats_header), + .data_offset = sizeof(struct kvm_stats_header) + + sizeof(kvm_vcpu_stats_desc), +}; + struct kvm_stats_debugfs_item 
debugfs_entries[] = { VCPU_STAT("pf_fixed", pf_fixed), VCPU_STAT("pf_guest", pf_guest), diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 9286516094e3..796d97c8bbf0 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1201,12 +1201,25 @@ struct kvm_stats_debugfs_item { int mode; };
+struct _kvm_stats_header {
+	__u32 name_size;
+	__u32 count;
+	__u32 desc_offset;
+	__u32 data_offset;
+};
+
+#define KVM_STATS_NAME_LEN	32
+struct _kvm_stats_desc {
+	struct kvm_stats_desc desc;
+	char name[KVM_STATS_NAME_LEN];
+};
+
 #define KVM_DBGFS_GET_MODE(dbgfs_item) \
 	((dbgfs_item)->mode ? (dbgfs_item)->mode : 0644)
-#define VM_STAT(n, x, ...) \
+#define VM_STAT(n, x, ...) \
 	{ n, offsetof(struct kvm, stat.x), KVM_STAT_VM, ## __VA_ARGS__ }
-#define VCPU_STAT(n, x, ...) \
+#define VCPU_STAT(n, x, ...) \
 	{ n, offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU, ## __VA_ARGS__ }
 #define VM_STAT_COM(n, x, ...) \
 	{ n, offsetof(struct kvm, stat.common.x), KVM_STAT_VM, ## __VA_ARGS__ }
@@ -1214,8 +1227,118 @@ struct kvm_stats_debugfs_item {
 	{ n, offsetof(struct kvm_vcpu, stat.common.x), \
 	  KVM_STAT_VCPU, ## __VA_ARGS__ }
+#define STATS_DESC(name, type, unit, scale, exponent) \ + { \ + {type | unit | scale, exponent, 1}, name, \ + } +#define STATS_DESC_CUMULATIVE(name, unit, scale, exponent) \ + STATS_DESC(name, KVM_STATS_TYPE_CUMULATIVE, unit, scale, exponent) +#define STATS_DESC_INSTANT(name, unit, scale, exponent) \ + STATS_DESC(name, KVM_STATS_TYPE_INSTANT, unit, scale, exponent) + +/* Cumulative counter */ +#define STATS_DESC_COUNTER(name) \ + STATS_DESC_CUMULATIVE(name, KVM_STATS_UNIT_NONE, \ + KVM_STATS_SCALE_POW10, 0) +/* Instantaneous counter */ +#define STATS_DESC_ICOUNTER(name) \ + STATS_DESC_INSTANT(name, KVM_STATS_UNIT_NONE, \ + KVM_STATS_SCALE_POW10, 0) + +/* Cumulative clock cycles */ +#define STATS_DESC_CYCLE(name) \ + STATS_DESC_CUMULATIVE(name, KVM_STATS_UNIT_CYCLES, \ + KVM_STATS_SCALE_POW10, 0) +/* Instantaneous clock cycles */ +#define STATS_DESC_ICYCLE(name) \ + STATS_DESC_INSTANT(name, KVM_STATS_UNIT_CYCLES, \ + KVM_STATS_SCALE_POW10, 0) + +/* Cumulative memory size in Byte */ +#define STATS_DESC_SIZE_BYTE(name) \ + STATS_DESC_CUMULATIVE(name, KVM_STATS_UNIT_BYTES, \ + KVM_STATS_SCALE_POW2, 0) +/* Cumulative memory size in KiByte */ +#define STATS_DESC_SIZE_KBYTE(name) \ + STATS_DESC_CUMULATIVE(name, KVM_STATS_UNIT_BYTES, \ + KVM_STATS_SCALE_POW2, 10) +/* Cumulative memory size in MiByte */ +#define STATS_DESC_SIZE_MBYTE(name) \ + STATS_DESC_CUMULATIVE(name, KVM_STATS_UNIT_BYTES, \ + KVM_STATS_SCALE_POW2, 20) +/* Cumulative memory size in GiByte */ +#define STATS_DESC_SIZE_GBYTE(name) \ + STATS_DESC_CUMULATIVE(name, KVM_STATS_UNIT_BYTES, \ + KVM_STATS_SCALE_POW2, 30) + +/* Instantaneous memory size in Byte */ +#define STATS_DESC_ISIZE_BYTE(name) \ + STATS_DESC_INSTANT(name, KVM_STATS_UNIT_BYTES, \ + KVM_STATS_SCALE_POW2, 0) +/* Instantaneous memory size in KiByte */ +#define STATS_DESC_ISIZE_KBYTE(name) \ + STATS_DESC_INSTANT(name, KVM_STATS_UNIT_BYTES, \ + KVM_STATS_SCALE_POW2, 10) +/* Instantaneous memory size in MiByte */ +#define 
STATS_DESC_ISIZE_MBYTE(name) \ + STATS_DESC_INSTANT(name, KVM_STATS_UNIT_BYTES, \ + KVM_STATS_SCALE_POW2, 20) +/* Instantaneous memory size in GiByte */ +#define STATS_DESC_ISIZE_GBYTE(name) \ + STATS_DESC_INSTANT(name, KVM_STATS_UNIT_BYTES, \ + KVM_STATS_SCALE_POW2, 30) + +/* Cumulative time in second */ +#define STATS_DESC_TIME_SEC(name) \ + STATS_DESC_CUMULATIVE(name, KVM_STATS_UNIT_SECONDS, \ + KVM_STATS_SCALE_POW10, 0) +/* Cumulative time in millisecond */ +#define STATS_DESC_TIME_MSEC(name) \ + STATS_DESC_CUMULATIVE(name, KVM_STATS_UNIT_SECONDS, \ + KVM_STATS_SCALE_POW10, -3) +/* Cumulative time in microsecond */ +#define STATS_DESC_TIME_USEC(name) \ + STATS_DESC_CUMULATIVE(name, KVM_STATS_UNIT_SECONDS, \ + KVM_STATS_SCALE_POW10, -6) +/* Cumulative time in nanosecond */ +#define STATS_DESC_TIME_NSEC(name) \ + STATS_DESC_CUMULATIVE(name, KVM_STATS_UNIT_SECONDS, \ + KVM_STATS_SCALE_POW10, -9) + +/* Instantaneous time in second */ +#define STATS_DESC_ITIME_SEC(name) \ + STATS_DESC_INSTANT(name, KVM_STATS_UNIT_SECONDS, \ + KVM_STATS_SCALE_POW10, 0) +/* Instantaneous time in millisecond */ +#define STATS_DESC_ITIME_MSEC(name) \ + STATS_DESC_INSTANT(name, KVM_STATS_UNIT_SECONDS, \ + KVM_STATS_SCALE_POW10, -3) +/* Instantaneous time in microsecond */ +#define STATS_DESC_ITIME_USEC(name) \ + STATS_DESC_INSTANT(name, KVM_STATS_UNIT_SECONDS, \ + KVM_STATS_SCALE_POW10, -6) +/* Instantaneous time in nanosecond */ +#define STATS_DESC_ITIME_NSEC(name) \ + STATS_DESC_INSTANT(name, KVM_STATS_UNIT_SECONDS, \ + KVM_STATS_SCALE_POW10, -9) + +#define STATS_VM_COMMON \ + STATS_DESC_COUNTER("remote_tlb_flush") + +#define STATS_VCPU_COMMON \ + STATS_DESC_COUNTER("halt_successful_poll"), \ + STATS_DESC_COUNTER("halt_attempted_poll"), \ + STATS_DESC_COUNTER("halt_poll_invalid"), \ + STATS_DESC_COUNTER("halt_wakeup"), \ + STATS_DESC_TIME_NSEC("halt_poll_success_ns"), \ + STATS_DESC_TIME_NSEC("halt_poll_fail_ns") + extern struct kvm_stats_debugfs_item debugfs_entries[]; extern struct 
dentry *kvm_debugfs_dir;
+extern struct _kvm_stats_header kvm_vm_stats_header;
+extern struct _kvm_stats_header kvm_vcpu_stats_header;
+extern struct _kvm_stats_desc kvm_vm_stats_desc[];
+extern struct _kvm_stats_desc kvm_vcpu_stats_desc[];

 #if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER)
 static inline int mmu_notifier_retry(struct kvm *kvm, unsigned long mmu_seq)
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 1fb4fd863324..e7b8dc8fe7a4 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1083,6 +1083,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_VM_COPY_ENC_CONTEXT_FROM 197
 #define KVM_CAP_PTP_KVM 198
 #define KVM_CAP_EXIT_HYPERCALL 199
+#define KVM_CAP_STATS_BINARY_FD 200

 #ifdef KVM_CAP_IRQ_ROUTING

@@ -1899,4 +1900,53 @@ struct kvm_dirty_gfn {
 #define KVM_BUS_LOCK_DETECTION_OFF             (1 << 0)
 #define KVM_BUS_LOCK_DETECTION_EXIT            (1 << 1)

+#define KVM_STATS_ID_MAXLEN		64
+
+struct kvm_stats_header {
+	char id[KVM_STATS_ID_MAXLEN];
+	__u32 name_size;
+	__u32 count;
+	__u32 desc_offset;
+	__u32 data_offset;
+};
+
+#define KVM_STATS_TYPE_SHIFT		0
+#define KVM_STATS_TYPE_MASK		(0xF << KVM_STATS_TYPE_SHIFT)
+#define KVM_STATS_TYPE_CUMULATIVE	(0x0 << KVM_STATS_TYPE_SHIFT)
+#define KVM_STATS_TYPE_INSTANT		(0x1 << KVM_STATS_TYPE_SHIFT)
+#define KVM_STATS_TYPE_MAX		KVM_STATS_TYPE_INSTANT
+
+#define KVM_STATS_UNIT_SHIFT		4
+#define KVM_STATS_UNIT_MASK		(0xF << KVM_STATS_UNIT_SHIFT)
+#define KVM_STATS_UNIT_NONE		(0x0 << KVM_STATS_UNIT_SHIFT)
+#define KVM_STATS_UNIT_BYTES		(0x1 << KVM_STATS_UNIT_SHIFT)
+#define KVM_STATS_UNIT_SECONDS		(0x2 << KVM_STATS_UNIT_SHIFT)
+#define KVM_STATS_UNIT_CYCLES		(0x3 << KVM_STATS_UNIT_SHIFT)
+#define KVM_STATS_UNIT_MAX		KVM_STATS_UNIT_CYCLES
+
+#define KVM_STATS_SCALE_SHIFT		8
+#define KVM_STATS_SCALE_MASK		(0xF << KVM_STATS_SCALE_SHIFT)
+#define KVM_STATS_SCALE_POW10		(0x0 << KVM_STATS_SCALE_SHIFT)
+#define KVM_STATS_SCALE_POW2		(0x1 << KVM_STATS_SCALE_SHIFT)
+#define KVM_STATS_SCALE_MAX		KVM_STATS_SCALE_POW2
+
+struct kvm_stats_desc {
+	__u32 flags;
+	__s16 exponent;
+	__u16 size;
+	__u32 unused1;
+	__u32 unused2;
+	char name[0];
+};
+
+struct kvm_vm_stats_data {
+	unsigned long value[0];
+};
+
+struct kvm_vcpu_stats_data {
+	__u64 value[0];
+};
+
+#define KVM_STATS_GETFD _IOR(KVMIO, 0xcc, struct kvm_stats_header)
+
 #endif /* __LINUX_KVM_H */
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index cdf53fb75ca1..c48089ab366c 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3458,6 +3458,115 @@ static int kvm_vcpu_ioctl_set_sigmask(struct kvm_vcpu *vcpu, sigset_t *sigset)
 	return 0;
 }
+static ssize_t kvm_vcpu_stats_read(struct file *file, char __user *user_buffer, + size_t size, loff_t *offset) +{ + char id[KVM_STATS_ID_MAXLEN]; + struct kvm_vcpu *vcpu = file->private_data; + ssize_t copylen, len, remain = size; + size_t size_header, size_desc, size_stats; + loff_t pos = *offset; + char __user *dest = user_buffer; + void *src; + + snprintf(id, sizeof(id), "kvm-%d/vcpu-%d", + task_pid_nr(current), vcpu->vcpu_id); + size_header = sizeof(kvm_vcpu_stats_header); + size_desc = + kvm_vcpu_stats_header.count * sizeof(struct _kvm_stats_desc); + size_stats = sizeof(vcpu->stat); + + len = sizeof(id) + size_header + size_desc + size_stats - pos; + len = min(len, remain); + if (len <= 0) + return 0; + remain = len; + + /* Copy kvm vcpu stats header id string */ + copylen = sizeof(id) - pos; + copylen = min(copylen, remain); + if (copylen > 0) { + src = (void *)id + pos; + if (copy_to_user(dest, src, copylen)) + return -EFAULT; + remain -= copylen; + pos += copylen; + dest += copylen; + } + /* Copy kvm vcpu stats header */ + copylen = sizeof(id) + size_header - pos; + copylen = min(copylen, remain); + if (copylen > 0) { + src = (void *)&kvm_vcpu_stats_header; + src += pos - sizeof(id); + if (copy_to_user(dest, src, copylen)) + return -EFAULT; + remain -= copylen; + pos += copylen; + dest += copylen; + } + /* Copy kvm vcpu stats descriptors */ + copylen = kvm_vcpu_stats_header.desc_offset + size_desc - pos; + copylen = min(copylen, remain); + if (copylen > 0) { + src = (void *)&kvm_vcpu_stats_desc; + src += pos - kvm_vcpu_stats_header.desc_offset; + if (copy_to_user(dest, src, copylen)) + return -EFAULT; + remain -= copylen; + pos += copylen; + dest += copylen; + } + /* Copy kvm vcpu stats values */ + copylen = kvm_vcpu_stats_header.data_offset + size_stats - pos; + copylen = min(copylen, remain); + if (copylen > 0) { + src = (void *)&vcpu->stat; + src += pos - kvm_vcpu_stats_header.data_offset; + if (copy_to_user(dest, src, copylen)) + return -EFAULT; + 
remain -= copylen; + pos += copylen; + dest += copylen; + } + + *offset = pos; + return len; +} + +static struct file_operations kvm_vcpu_stats_fops = { + .read = kvm_vcpu_stats_read, + .llseek = noop_llseek, +}; + +static int kvm_vcpu_ioctl_get_statsfd(struct kvm_vcpu *vcpu) +{ + int error, fd; + struct file *file; + char name[15 + ITOA_MAX_LEN + 1]; + + snprintf(name, sizeof(name), "kvm-vcpu-stats:%d", vcpu->vcpu_id); + + error = get_unused_fd_flags(O_CLOEXEC); + if (error < 0) + return error; + fd = error; + + file = anon_inode_getfile(name, &kvm_vcpu_stats_fops, vcpu, O_RDONLY); + if (IS_ERR(file)) { + error = PTR_ERR(file); + goto err_put_unused_fd; + } + file->f_mode |= FMODE_PREAD; + fd_install(fd, file); + + return fd; + +err_put_unused_fd: + put_unused_fd(fd); + return error; +} + static long kvm_vcpu_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg) { @@ -3655,6 +3764,10 @@ static long kvm_vcpu_ioctl(struct file *filp, r = kvm_arch_vcpu_ioctl_set_fpu(vcpu, fpu); break; } + case KVM_STATS_GETFD: { + r = kvm_vcpu_ioctl_get_statsfd(vcpu); + break; + } default: r = kvm_arch_vcpu_ioctl(filp, ioctl, arg); } @@ -3913,6 +4026,8 @@ static long kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg) #else return 0; #endif + case KVM_CAP_STATS_BINARY_FD: + return 1; default: break; } @@ -4016,6 +4131,111 @@ static int kvm_vm_ioctl_enable_cap_generic(struct kvm *kvm, } }
+static ssize_t kvm_vm_stats_read(struct file *file, char __user *user_buffer, + size_t size, loff_t *offset) +{ + char id[KVM_STATS_ID_MAXLEN]; + struct kvm *kvm = file->private_data; + ssize_t copylen, len, remain = size; + size_t size_header, size_desc, size_stats; + loff_t pos = *offset; + char __user *dest = user_buffer; + void *src; + + snprintf(id, sizeof(id), "kvm-%d", task_pid_nr(current)); + size_header = sizeof(kvm_vm_stats_header); + size_desc = kvm_vm_stats_header.count * sizeof(struct _kvm_stats_desc); + size_stats = sizeof(kvm->stat); + + len = sizeof(id) + size_header + size_desc + size_stats - pos; + len = min(len, remain); + if (len <= 0) + return 0; + remain = len; + + /* Copy kvm vm stats header id string */ + copylen = sizeof(id) - pos; + copylen = min(copylen, remain); + if (copylen > 0) { + src = (void *)id + pos; + if (copy_to_user(dest, src, copylen)) + return -EFAULT; + remain -= copylen; + pos += copylen; + dest += copylen; + } + /* Copy kvm vm stats header */ + copylen = sizeof(id) + size_header - pos; + copylen = min(copylen, remain); + if (copylen > 0) { + src = (void *)&kvm_vm_stats_header; + src += pos - sizeof(id); + if (copy_to_user(dest, src, copylen)) + return -EFAULT; + remain -= copylen; + pos += copylen; + dest += copylen; + } + /* Copy kvm vm stats descriptors */ + copylen = kvm_vm_stats_header.desc_offset + size_desc - pos; + copylen = min(copylen, remain); + if (copylen > 0) { + src = (void *)&kvm_vm_stats_desc; + src += pos - kvm_vm_stats_header.desc_offset; + if (copy_to_user(dest, src, copylen)) + return -EFAULT; + remain -= copylen; + pos += copylen; + dest += copylen; + } + /* Copy kvm vm stats values */ + copylen = kvm_vm_stats_header.data_offset + size_stats - pos; + copylen = min(copylen, remain); + if (copylen > 0) { + src = (void *)&kvm->stat; + src += pos - kvm_vm_stats_header.data_offset; + if (copy_to_user(dest, src, copylen)) + return -EFAULT; + remain -= copylen; + pos += copylen; + dest += copylen; + } + + 
*offset = pos; + return len; +} + +static struct file_operations kvm_vm_stats_fops = { + .read = kvm_vm_stats_read, + .llseek = noop_llseek, +}; + +static int kvm_vm_ioctl_get_statsfd(struct kvm *kvm) +{ + int error, fd; + struct file *file; + + error = get_unused_fd_flags(O_CLOEXEC); + if (error < 0) + return error; + fd = error; + + file = anon_inode_getfile("kvm-vm-stats", + &kvm_vm_stats_fops, kvm, O_RDONLY); + if (IS_ERR(file)) { + error = PTR_ERR(file); + goto err_put_unused_fd; + } + file->f_mode |= FMODE_PREAD; + fd_install(fd, file); + + return fd; + +err_put_unused_fd: + put_unused_fd(fd); + return error; +} + static long kvm_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg) { @@ -4198,6 +4418,9 @@ static long kvm_vm_ioctl(struct file *filp, case KVM_RESET_DIRTY_RINGS: r = kvm_vm_ioctl_reset_dirty_pages(kvm); break; + case KVM_STATS_GETFD: + r = kvm_vm_ioctl_get_statsfd(kvm); + break; default: r = kvm_arch_vm_ioctl(filp, ioctl, arg); }
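For anyone consuming the descriptors from userspace, the flags/exponent encoding in the uapi hunk can be decoded roughly as below. This is only a sketch: the constants are copied from this posting (so they may differ in later revisions), and stat_type(), stat_unit(), stat_scale() and stat_scaled_value() are made-up helper names, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Flag layout copied from the include/uapi/linux/kvm.h hunk of this patch. */
#define KVM_STATS_TYPE_SHIFT		0
#define KVM_STATS_TYPE_MASK		(0xF << KVM_STATS_TYPE_SHIFT)
#define KVM_STATS_TYPE_CUMULATIVE	(0x0 << KVM_STATS_TYPE_SHIFT)
#define KVM_STATS_TYPE_INSTANT		(0x1 << KVM_STATS_TYPE_SHIFT)

#define KVM_STATS_UNIT_SHIFT		4
#define KVM_STATS_UNIT_MASK		(0xF << KVM_STATS_UNIT_SHIFT)
#define KVM_STATS_UNIT_BYTES		(0x1 << KVM_STATS_UNIT_SHIFT)
#define KVM_STATS_UNIT_SECONDS		(0x2 << KVM_STATS_UNIT_SHIFT)

#define KVM_STATS_SCALE_SHIFT		8
#define KVM_STATS_SCALE_MASK		(0xF << KVM_STATS_SCALE_SHIFT)
#define KVM_STATS_SCALE_POW10		(0x0 << KVM_STATS_SCALE_SHIFT)
#define KVM_STATS_SCALE_POW2		(0x1 << KVM_STATS_SCALE_SHIFT)

/* The three fields share one 32-bit word, so their masks must be disjoint. */
static_assert((KVM_STATS_TYPE_MASK & KVM_STATS_UNIT_MASK) == 0 &&
	      (KVM_STATS_UNIT_MASK & KVM_STATS_SCALE_MASK) == 0,
	      "flag fields must not overlap");

/* Hypothetical helpers: extract one field from kvm_stats_desc.flags. */
static inline uint32_t stat_type(uint32_t flags)
{
	return flags & KVM_STATS_TYPE_MASK;
}

static inline uint32_t stat_unit(uint32_t flags)
{
	return flags & KVM_STATS_UNIT_MASK;
}

static inline uint32_t stat_scale(uint32_t flags)
{
	return flags & KVM_STATS_SCALE_MASK;
}

/*
 * Hypothetical helper: convert a raw value to the base unit,
 * i.e. raw * base^exponent, with base 2 or 10 taken from the flags.
 */
static double stat_scaled_value(uint64_t raw, uint32_t flags, int16_t exponent)
{
	double base = (stat_scale(flags) == KVM_STATS_SCALE_POW2) ? 2.0 : 10.0;
	double v = (double)raw;
	int i;

	for (i = 0; i < exponent; i++)
		v *= base;
	for (i = 0; i > exponent; i--)
		v /= base;
	return v;
}
```

With this encoding a STATS_DESC_TIME_NSEC entry carries unit seconds, scale POW10 and exponent -9, so a raw value of 2000000000 corresponds to 2 seconds.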
Hi Jing,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on 9f242010c3b46e63bc62f08fff42cef992d3801b]
url:    https://github.com/0day-ci/linux/commits/Jing-Zhang/KVM-statistics-data-fd-b...
base:   9f242010c3b46e63bc62f08fff42cef992d3801b
config: s390-randconfig-r024-20210429 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 8f5a2a5836cc8e4c1def2bdeb022e7b496623439)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install s390 cross compiling tool for clang build
        # apt-get install binutils-s390x-linux-gnu
        # https://github.com/0day-ci/linux/commit/434cb14317623e9908098fc1c3925f2a6dca...
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Jing-Zhang/KVM-statistics-data-fd-based-binary-interface/20210430-043830
        git checkout 434cb14317623e9908098fc1c3925f2a6dcaa556
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 ARCH=s390
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All warnings (new ones prefixed by >>):
In file included from arch/s390/kvm/kvm-s390.c:23: In file included from include/linux/kvm_host.h:33: In file included from include/linux/kvm_para.h:5: In file included from include/uapi/linux/kvm_para.h:37: In file included from arch/s390/include/asm/kvm_para.h:25: In file included from arch/s390/include/asm/diag.h:12: In file included from include/linux/if_ether.h:19: In file included from include/linux/skbuff.h:31: In file included from include/linux/dma-mapping.h:10: In file included from include/linux/scatterlist.h:9: In file included from arch/s390/include/asm/io.h:80: include/asm-generic/io.h:464:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] val = __raw_readb(PCI_IOBASE + addr); ~~~~~~~~~~ ^ include/asm-generic/io.h:477:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr)); ~~~~~~~~~~ ^ include/uapi/linux/byteorder/big_endian.h:36:59: note: expanded from macro '__le16_to_cpu' #define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x)) ^ include/uapi/linux/swab.h:102:54: note: expanded from macro '__swab16' #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x)) ^ In file included from arch/s390/kvm/kvm-s390.c:23: In file included from include/linux/kvm_host.h:33: In file included from include/linux/kvm_para.h:5: In file included from include/uapi/linux/kvm_para.h:37: In file included from arch/s390/include/asm/kvm_para.h:25: In file included from arch/s390/include/asm/diag.h:12: In file included from include/linux/if_ether.h:19: In file included from include/linux/skbuff.h:31: In file included from include/linux/dma-mapping.h:10: In file included from include/linux/scatterlist.h:9: In file included from arch/s390/include/asm/io.h:80: include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a null pointer has undefined behavior 
[-Wnull-pointer-arithmetic] val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr)); ~~~~~~~~~~ ^ include/uapi/linux/byteorder/big_endian.h:34:59: note: expanded from macro '__le32_to_cpu' #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x)) ^ include/uapi/linux/swab.h:115:54: note: expanded from macro '__swab32' #define __swab32(x) (__u32)__builtin_bswap32((__u32)(x)) ^ In file included from arch/s390/kvm/kvm-s390.c:23: In file included from include/linux/kvm_host.h:33: In file included from include/linux/kvm_para.h:5: In file included from include/uapi/linux/kvm_para.h:37: In file included from arch/s390/include/asm/kvm_para.h:25: In file included from arch/s390/include/asm/diag.h:12: In file included from include/linux/if_ether.h:19: In file included from include/linux/skbuff.h:31: In file included from include/linux/dma-mapping.h:10: In file included from include/linux/scatterlist.h:9: In file included from arch/s390/include/asm/io.h:80: include/asm-generic/io.h:501:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] __raw_writeb(value, PCI_IOBASE + addr); ~~~~~~~~~~ ^ include/asm-generic/io.h:511:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr); ~~~~~~~~~~ ^ include/asm-generic/io.h:521:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr); ~~~~~~~~~~ ^ include/asm-generic/io.h:609:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] readsb(PCI_IOBASE + addr, buffer, count); ~~~~~~~~~~ ^ include/asm-generic/io.h:617:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] readsw(PCI_IOBASE + addr, buffer, count); 
~~~~~~~~~~ ^ include/asm-generic/io.h:625:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] readsl(PCI_IOBASE + addr, buffer, count); ~~~~~~~~~~ ^ include/asm-generic/io.h:634:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] writesb(PCI_IOBASE + addr, buffer, count); ~~~~~~~~~~ ^ include/asm-generic/io.h:643:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] writesw(PCI_IOBASE + addr, buffer, count); ~~~~~~~~~~ ^ include/asm-generic/io.h:652:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] writesl(PCI_IOBASE + addr, buffer, count); ~~~~~~~~~~ ^
arch/s390/kvm/kvm-s390.c:154:21: warning: initializer-string for char array is too long [-Wexcess-initializers]
        STATS_DESC_COUNTER("instruction_sigp_stop_store_status"),
                           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/kvm_host.h:1241:24: note: expanded from macro 'STATS_DESC_COUNTER'
        STATS_DESC_CUMULATIVE(name, KVM_STATS_UNIT_NONE, \
                              ^~~~
include/linux/kvm_host.h:1235:13: note: expanded from macro 'STATS_DESC_CUMULATIVE'
        STATS_DESC(name, KVM_STATS_TYPE_CUMULATIVE, unit, scale, exponent)
                   ^~~~
include/linux/kvm_host.h:1232:39: note: expanded from macro 'STATS_DESC'
                {type | unit | scale, exponent, 1}, name, \
                                                    ^~~~
arch/s390/kvm/kvm-s390.c:156:21: warning: initializer-string for char array is too long [-Wexcess-initializers]
        STATS_DESC_COUNTER("instruction_sigp_store_adtl_status"),
                           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/kvm_host.h:1241:24: note: expanded from macro 'STATS_DESC_COUNTER'
        STATS_DESC_CUMULATIVE(name, KVM_STATS_UNIT_NONE, \
                              ^~~~
include/linux/kvm_host.h:1235:13: note: expanded from macro 'STATS_DESC_CUMULATIVE'
        STATS_DESC(name, KVM_STATS_TYPE_CUMULATIVE, unit, scale, exponent)
                   ^~~~
include/linux/kvm_host.h:1232:39: note: expanded from macro 'STATS_DESC'
                {type | unit | scale, exponent, 1}, name, \
                                                    ^~~~
14 warnings generated.
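The two -Wexcess-initializers warnings come from names that are 34 characters long, which with the terminating NUL do not fit in char name[KVM_STATS_NAME_LEN] (32). One possible way to turn this into a hard build error is sketched below. STATS_NAME_CHECK() and stats_name_fits() are hypothetical names and not part of the patch; only KVM_STATS_NAME_LEN and the stat names are taken from it.

```c
#include <assert.h>
#include <string.h>

#define KVM_STATS_NAME_LEN	32	/* value used in this patch */

/*
 * Hypothetical compile-time guard. sizeof() of a string literal counts the
 * terminating NUL, so the literal fits only if sizeof(name) <= the array size.
 */
#define STATS_NAME_CHECK(name)						\
	static_assert(sizeof(name) <= KVM_STATS_NAME_LEN,		\
		      "stat name does not fit in KVM_STATS_NAME_LEN")

/* Hypothetical runtime equivalent, e.g. for use in a selftest. */
static int stats_name_fits(const char *name)
{
	return strlen(name) < KVM_STATS_NAME_LEN;
}

/* 31 characters plus NUL: fits exactly. */
STATS_NAME_CHECK("instruction_sigp_cond_emergency");

/*
 * 34 characters plus NUL: enabling the next line would break the build.
 * STATS_NAME_CHECK("instruction_sigp_stop_store_status");
 */
```

Whether the better fix is to shorten the two s390 names or to enlarge KVM_STATS_NAME_LEN is a design decision for the series; the check only makes the overflow visible on every architecture.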
vim +154 arch/s390/kvm/kvm-s390.c
    77
    78  struct _kvm_stats_desc kvm_vcpu_stats_desc[] = {
    79          STATS_VCPU_COMMON,
    80          STATS_DESC_COUNTER("exit_userspace"),
    81          STATS_DESC_COUNTER("exit_null"),
    82          STATS_DESC_COUNTER("exit_external_request"),
    83          STATS_DESC_COUNTER("exit_io_request"),
    84          STATS_DESC_COUNTER("exit_external_interrupt"),
    85          STATS_DESC_COUNTER("exit_stop_request"),
    86          STATS_DESC_COUNTER("exit_validity"),
    87          STATS_DESC_COUNTER("exit_instruction"),
    88          STATS_DESC_COUNTER("exit_pei"),
    89          STATS_DESC_COUNTER("halt_no_poll_steal"),
    90          STATS_DESC_COUNTER("instruction_lctl"),
    91          STATS_DESC_COUNTER("instruction_lctlg"),
    92          STATS_DESC_COUNTER("instruction_stctl"),
    93          STATS_DESC_COUNTER("instruction_stctg"),
    94          STATS_DESC_COUNTER("exit_program_interruption"),
    95          STATS_DESC_COUNTER("exit_instr_and_program"),
    96          STATS_DESC_COUNTER("exit_operation_exception"),
    97          STATS_DESC_COUNTER("deliver_ckc"),
    98          STATS_DESC_COUNTER("deliver_cputm"),
    99          STATS_DESC_COUNTER("deliver_external_call"),
   100          STATS_DESC_COUNTER("deliver_emergency_signal"),
   101          STATS_DESC_COUNTER("deliver_service_signal"),
   102          STATS_DESC_COUNTER("deliver_virtio"),
   103          STATS_DESC_COUNTER("deliver_stop_signal"),
   104          STATS_DESC_COUNTER("deliver_prefix_signal"),
   105          STATS_DESC_COUNTER("deliver_restart_signal"),
   106          STATS_DESC_COUNTER("deliver_program"),
   107          STATS_DESC_COUNTER("deliver_io"),
   108          STATS_DESC_COUNTER("deliver_machine_check"),
   109          STATS_DESC_COUNTER("exit_wait_state"),
   110          STATS_DESC_COUNTER("inject_ckc"),
   111          STATS_DESC_COUNTER("inject_cputm"),
   112          STATS_DESC_COUNTER("inject_external_call"),
   113          STATS_DESC_COUNTER("inject_emergency_signal"),
   114          STATS_DESC_COUNTER("inject_mchk"),
   115          STATS_DESC_COUNTER("inject_pfault_init"),
   116          STATS_DESC_COUNTER("inject_program"),
   117          STATS_DESC_COUNTER("inject_restart"),
   118          STATS_DESC_COUNTER("inject_set_prefix"),
   119          STATS_DESC_COUNTER("inject_stop_signal"),
   120          STATS_DESC_COUNTER("instruction_epsw"),
   121          STATS_DESC_COUNTER("instruction_gs"),
   122          STATS_DESC_COUNTER("instruction_io_other"),
   123          STATS_DESC_COUNTER("instruction_lpsw"),
   124          STATS_DESC_COUNTER("instruction_lpswe"),
   125          STATS_DESC_COUNTER("instruction_pfmf"),
   126          STATS_DESC_COUNTER("instruction_ptff"),
   127          STATS_DESC_COUNTER("instruction_sck"),
   128          STATS_DESC_COUNTER("instruction_sckpf"),
   129          STATS_DESC_COUNTER("instruction_stidp"),
   130          STATS_DESC_COUNTER("instruction_spx"),
   131          STATS_DESC_COUNTER("instruction_stpx"),
   132          STATS_DESC_COUNTER("instruction_stap"),
   133          STATS_DESC_COUNTER("instruction_iske"),
   134          STATS_DESC_COUNTER("instruction_ri"),
   135          STATS_DESC_COUNTER("instruction_rrbe"),
   136          STATS_DESC_COUNTER("instruction_sske"),
   137          STATS_DESC_COUNTER("instruction_ipte_interlock"),
   138          STATS_DESC_COUNTER("instruction_stsi"),
   139          STATS_DESC_COUNTER("instruction_stfl"),
   140          STATS_DESC_COUNTER("instruction_tb"),
   141          STATS_DESC_COUNTER("instruction_tpi"),
   142          STATS_DESC_COUNTER("instruction_tprot"),
   143          STATS_DESC_COUNTER("instruction_tsch"),
   144          STATS_DESC_COUNTER("instruction_sie"),
   145          STATS_DESC_COUNTER("instruction_essa"),
   146          STATS_DESC_COUNTER("instruction_sthyi"),
   147          STATS_DESC_COUNTER("instruction_sigp_sense"),
   148          STATS_DESC_COUNTER("instruction_sigp_sense_running"),
   149          STATS_DESC_COUNTER("instruction_sigp_external_call"),
   150          STATS_DESC_COUNTER("instruction_sigp_emergency"),
   151          STATS_DESC_COUNTER("instruction_sigp_cond_emergency"),
   152          STATS_DESC_COUNTER("instruction_sigp_start"),
   153          STATS_DESC_COUNTER("instruction_sigp_stop"),
   154          STATS_DESC_COUNTER("instruction_sigp_stop_store_status"),
   155          STATS_DESC_COUNTER("instruction_sigp_store_status"),
   156          STATS_DESC_COUNTER("instruction_sigp_store_adtl_status"),
   157          STATS_DESC_COUNTER("instruction_sigp_arch"),
   158          STATS_DESC_COUNTER("instruction_sigp_prefix"),
   159          STATS_DESC_COUNTER("instruction_sigp_restart"),
   160          STATS_DESC_COUNTER("instruction_sigp_init_cpu_reset"),
   161          STATS_DESC_COUNTER("instruction_sigp_cpu_reset"),
   162          STATS_DESC_COUNTER("instruction_sigp_unknown"),
   163          STATS_DESC_COUNTER("diagnose_10"),
   164          STATS_DESC_COUNTER("diagnose_44"),
   165          STATS_DESC_COUNTER("diagnose_9c"),
   166          STATS_DESC_COUNTER("diagnose_9c_ignored"),
   167          STATS_DESC_COUNTER("diagnose_258"),
   168          STATS_DESC_COUNTER("diagnose_308"),
   169          STATS_DESC_COUNTER("diagnose_500"),
   170          STATS_DESC_COUNTER("diagnose_other"),
   171          STATS_DESC_COUNTER("pfault_sync"),
   172  };
   173
--- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
On Thu, Apr 29, 2021 at 3:37 PM Jing Zhang jingzhangos@google.com wrote:
Provides a file descriptor per VM to read VM stats info/data. Provides a file descriptor per vCPU to read vCPU stats info/data.
Signed-off-by: Jing Zhang jingzhangos@google.com
 arch/arm64/kvm/guest.c    |  30 +++++
 arch/mips/kvm/mips.c      |  55 ++++++++++
 arch/powerpc/kvm/book3s.c |  56 ++++++++++
 arch/powerpc/kvm/booke.c  |  49 +++++++++
 arch/s390/kvm/kvm-s390.c  | 121 +++++++++++++++++++++
 arch/x86/kvm/x86.c        |  57 ++++++++++
 include/linux/kvm_host.h  | 127 +++++++++++++++++++++-
 include/uapi/linux/kvm.h  |  50 +++++++++
 virt/kvm/kvm_main.c       | 223 ++++++++++++++++++++++++++++++++++++++
 9 files changed, 766 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 0e41331b0911..cf04c2a3a1ce 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -28,6 +28,36 @@
#include "trace.h"
+struct _kvm_stats_desc kvm_vm_stats_desc[] = {
STATS_VM_COMMON,
+};
+struct _kvm_stats_header kvm_vm_stats_header = {
.name_size = KVM_STATS_NAME_LEN,
.count = ARRAY_SIZE(kvm_vm_stats_desc),
.desc_offset = sizeof(struct kvm_stats_header),
.data_offset = sizeof(struct kvm_stats_header) +
sizeof(kvm_vm_stats_desc),
+};
+struct _kvm_stats_desc kvm_vcpu_stats_desc[] = {
STATS_VCPU_COMMON,
STATS_DESC_COUNTER("hvc_exit_stat"),
STATS_DESC_COUNTER("wfe_exit_stat"),
STATS_DESC_COUNTER("wfi_exit_stat"),
STATS_DESC_COUNTER("mmio_exit_user"),
STATS_DESC_COUNTER("mmio_exit_kernel"),
STATS_DESC_COUNTER("exits"),
+};
+struct _kvm_stats_header kvm_vcpu_stats_header = {
.name_size = KVM_STATS_NAME_LEN,
.count = ARRAY_SIZE(kvm_vcpu_stats_desc),
.desc_offset = sizeof(struct kvm_stats_header),
.data_offset = sizeof(struct kvm_stats_header) +
sizeof(kvm_vcpu_stats_desc),
+};
 struct kvm_stats_debugfs_item debugfs_entries[] = {
 	VCPU_STAT_COM("halt_successful_poll", halt_successful_poll),
 	VCPU_STAT_COM("halt_attempted_poll", halt_attempted_poll),
diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index 011c59acd606..ced50e8c1bb2 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -39,6 +39,61 @@
 #define VECTORSPACING 0x100	/* for EI/VI mode */
 #endif
+struct _kvm_stats_desc kvm_vm_stats_desc[] = {
STATS_VM_COMMON,
+};
+struct _kvm_stats_header kvm_vm_stats_header = {
.name_size = KVM_STATS_NAME_LEN,
.count = ARRAY_SIZE(kvm_vm_stats_desc),
.desc_offset = sizeof(struct kvm_stats_header),
.data_offset = sizeof(struct kvm_stats_header) +
sizeof(kvm_vm_stats_desc),
+};
+struct _kvm_stats_desc kvm_vcpu_stats_desc[] = {
STATS_VCPU_COMMON,
STATS_DESC_COUNTER("wait_exits"),
STATS_DESC_COUNTER("cache_exits"),
STATS_DESC_COUNTER("signal_exits"),
STATS_DESC_COUNTER("int_exits"),
STATS_DESC_COUNTER("cop_unusable_exits"),
STATS_DESC_COUNTER("tlbmod_exits"),
STATS_DESC_COUNTER("tlbmiss_ld_exits"),
STATS_DESC_COUNTER("tlbmiss_st_exits"),
STATS_DESC_COUNTER("addrerr_st_exits"),
STATS_DESC_COUNTER("addrerr_ld_exits"),
STATS_DESC_COUNTER("syscall_exits"),
STATS_DESC_COUNTER("resvd_inst_exits"),
STATS_DESC_COUNTER("break_inst_exits"),
STATS_DESC_COUNTER("trap_inst_exits"),
STATS_DESC_COUNTER("msa_fpe_exits"),
STATS_DESC_COUNTER("fpe_exits"),
STATS_DESC_COUNTER("msa_disabled_exits"),
STATS_DESC_COUNTER("flush_dcache_exits"),
+#ifdef CONFIG_KVM_MIPS_VZ
STATS_DESC_COUNTER("vz_gpsi_exits"),
STATS_DESC_COUNTER("vz_gsfc_exits"),
STATS_DESC_COUNTER("vz_hc_exits"),
STATS_DESC_COUNTER("vz_grr_exits"),
STATS_DESC_COUNTER("vz_gva_exits"),
STATS_DESC_COUNTER("vz_ghfc_exits"),
STATS_DESC_COUNTER("vz_gpa_exits"),
STATS_DESC_COUNTER("vz_resvd_exits"),
+#ifdef CONFIG_CPU_LOONGSON64
STATS_DESC_COUNTER("vz_cpucfg_exits"),
+#endif
+#endif
+};
+struct _kvm_stats_header kvm_vcpu_stats_header = {
.name_size = KVM_STATS_NAME_LEN,
.count = ARRAY_SIZE(kvm_vcpu_stats_desc),
.desc_offset = sizeof(struct kvm_stats_header),
.data_offset = sizeof(struct kvm_stats_header) +
sizeof(kvm_vcpu_stats_desc),
+};
 struct kvm_stats_debugfs_item debugfs_entries[] = {
 	VCPU_STAT("wait", wait_exits),
 	VCPU_STAT("cache", cache_exits),
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index bd3a10e1fdaf..9dc2510537ce 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -38,6 +38,62 @@
/* #define EXIT_DEBUG */
+struct _kvm_stats_desc kvm_vm_stats_desc[] = {
STATS_VM_COMMON,
STATS_DESC_ICOUNTER("num_2M_pages"),
STATS_DESC_ICOUNTER("num_1G_pages"),
+};
+struct _kvm_stats_header kvm_vm_stats_header = {
.name_size = KVM_STATS_NAME_LEN,
.count = ARRAY_SIZE(kvm_vm_stats_desc),
.desc_offset = sizeof(struct kvm_stats_header),
.data_offset = sizeof(struct kvm_stats_header) +
sizeof(kvm_vm_stats_desc),
+};
+struct _kvm_stats_desc kvm_vcpu_stats_desc[] = {
STATS_VCPU_COMMON,
STATS_DESC_COUNTER("sum_exits"),
STATS_DESC_COUNTER("mmio_exits"),
STATS_DESC_COUNTER("signal_exits"),
STATS_DESC_COUNTER("light_exits"),
STATS_DESC_COUNTER("itlb_real_miss_exits"),
STATS_DESC_COUNTER("itlb_virt_miss_exits"),
STATS_DESC_COUNTER("dtlb_real_miss_exits"),
STATS_DESC_COUNTER("dtlb_virt_miss_exits"),
STATS_DESC_COUNTER("syscall_exits"),
STATS_DESC_COUNTER("isi_exits"),
STATS_DESC_COUNTER("dsi_exits"),
STATS_DESC_COUNTER("emulated_inst_exits"),
STATS_DESC_COUNTER("dec_exits"),
STATS_DESC_COUNTER("ext_intr_exits"),
STATS_DESC_TIME_NSEC("halt_wait_ns"),
STATS_DESC_COUNTER("halt_successful_wait"),
STATS_DESC_COUNTER("dbell_exits"),
STATS_DESC_COUNTER("gdbell_exits"),
STATS_DESC_COUNTER("ld"),
STATS_DESC_COUNTER("st"),
STATS_DESC_COUNTER("pf_storage"),
STATS_DESC_COUNTER("pf_instruc"),
STATS_DESC_COUNTER("sp_storage"),
STATS_DESC_COUNTER("sp_instruc"),
STATS_DESC_COUNTER("queue_intr"),
STATS_DESC_COUNTER("ld_slow"),
STATS_DESC_COUNTER("st_slow"),
STATS_DESC_COUNTER("pthru_all"),
STATS_DESC_COUNTER("pthru_host"),
STATS_DESC_COUNTER("pthru_bad_aff"),
+};
+struct _kvm_stats_header kvm_vcpu_stats_header = {
.name_size = KVM_STATS_NAME_LEN,
.count = ARRAY_SIZE(kvm_vcpu_stats_desc),
.desc_offset = sizeof(struct kvm_stats_header),
.data_offset = sizeof(struct kvm_stats_header) +
sizeof(kvm_vcpu_stats_desc),
+};
 struct kvm_stats_debugfs_item debugfs_entries[] = {
 	VCPU_STAT("exits", sum_exits),
 	VCPU_STAT("mmio", mmio_exits),
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 07fdd7a1254a..e9ffcf0f022d 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -36,6 +36,55 @@
unsigned long kvmppc_booke_handlers;
+struct _kvm_stats_desc kvm_vm_stats_desc[] = {
STATS_VM_COMMON,
 STATS_DESC_ICOUNTER("num_2M_pages"),
 STATS_DESC_ICOUNTER("num_1G_pages"),
+};
+struct _kvm_stats_header kvm_vm_stats_header = {
.name_size = KVM_STATS_NAME_LEN,
.count = ARRAY_SIZE(kvm_vm_stats_desc),
.desc_offset = sizeof(struct kvm_stats_header),
.data_offset = sizeof(struct kvm_stats_header) +
sizeof(kvm_vm_stats_desc),
+};
+struct _kvm_stats_desc kvm_vcpu_stats_desc[] = {
STATS_VCPU_COMMON,
STATS_DESC_COUNTER("sum_exits"),
STATS_DESC_COUNTER("mmio_exits"),
STATS_DESC_COUNTER("signal_exits"),
STATS_DESC_COUNTER("light_exits"),
STATS_DESC_COUNTER("itlb_real_miss_exits"),
STATS_DESC_COUNTER("itlb_virt_miss_exits"),
STATS_DESC_COUNTER("dtlb_real_miss_exits"),
STATS_DESC_COUNTER("dtlb_virt_miss_exits"),
STATS_DESC_COUNTER("syscall_exits"),
STATS_DESC_COUNTER("isi_exits"),
STATS_DESC_COUNTER("dsi_exits"),
STATS_DESC_COUNTER("emulated_inst_exits"),
STATS_DESC_COUNTER("dec_exits"),
STATS_DESC_COUNTER("ext_intr_exits"),
STATS_DESC_TIME_NSEC("halt_wait_ns"),
STATS_DESC_COUNTER("halt_successful_wait"),
STATS_DESC_COUNTER("dbell_exits"),
STATS_DESC_COUNTER("gdbell_exits"),
STATS_DESC_COUNTER("ld"),
STATS_DESC_COUNTER("st"),
STATS_DESC_COUNTER("pthru_all"),
STATS_DESC_COUNTER("pthru_host"),
STATS_DESC_COUNTER("pthru_bad_aff"),
+};
+struct _kvm_stats_header kvm_vcpu_stats_header = {
.name_size = KVM_STATS_NAME_LEN,
.count = ARRAY_SIZE(kvm_vcpu_stats_desc),
.desc_offset = sizeof(struct kvm_stats_header),
.data_offset = sizeof(struct kvm_stats_header) +
sizeof(kvm_vcpu_stats_desc),
+};
 struct kvm_stats_debugfs_item debugfs_entries[] = {
 	VCPU_STAT("mmio", mmio_exits),
 	VCPU_STAT("sig", signal_exits),
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index d6bf3372bb10..2c91d70754a9 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -58,6 +58,127 @@
 #define VCPU_IRQS_MAX_BUF (sizeof(struct kvm_s390_irq) * \
 			   (KVM_MAX_VCPUS + LOCAL_IRQS))
+struct _kvm_stats_desc kvm_vm_stats_desc[] = {
STATS_VM_COMMON,
STATS_DESC_COUNTER("inject_io"),
STATS_DESC_COUNTER("inject_float_mchk"),
STATS_DESC_COUNTER("inject_pfault_done"),
STATS_DESC_COUNTER("inject_service_signal"),
STATS_DESC_COUNTER("inject_virtio"),
+};
+struct _kvm_stats_header kvm_vm_stats_header = {
.name_size = KVM_STATS_NAME_LEN,
.count = ARRAY_SIZE(kvm_vm_stats_desc),
.desc_offset = sizeof(struct kvm_stats_header),
.data_offset = sizeof(struct kvm_stats_header) +
sizeof(kvm_vm_stats_desc),
+};
+struct _kvm_stats_desc kvm_vcpu_stats_desc[] = {
STATS_VCPU_COMMON,
STATS_DESC_COUNTER("exit_userspace"),
STATS_DESC_COUNTER("exit_null"),
STATS_DESC_COUNTER("exit_external_request"),
STATS_DESC_COUNTER("exit_io_request"),
STATS_DESC_COUNTER("exit_external_interrupt"),
STATS_DESC_COUNTER("exit_stop_request"),
STATS_DESC_COUNTER("exit_validity"),
STATS_DESC_COUNTER("exit_instruction"),
STATS_DESC_COUNTER("exit_pei"),
STATS_DESC_COUNTER("halt_no_poll_steal"),
STATS_DESC_COUNTER("instruction_lctl"),
STATS_DESC_COUNTER("instruction_lctlg"),
STATS_DESC_COUNTER("instruction_stctl"),
STATS_DESC_COUNTER("instruction_stctg"),
STATS_DESC_COUNTER("exit_program_interruption"),
STATS_DESC_COUNTER("exit_instr_and_program"),
STATS_DESC_COUNTER("exit_operation_exception"),
STATS_DESC_COUNTER("deliver_ckc"),
STATS_DESC_COUNTER("deliver_cputm"),
STATS_DESC_COUNTER("deliver_external_call"),
STATS_DESC_COUNTER("deliver_emergency_signal"),
STATS_DESC_COUNTER("deliver_service_signal"),
STATS_DESC_COUNTER("deliver_virtio"),
STATS_DESC_COUNTER("deliver_stop_signal"),
STATS_DESC_COUNTER("deliver_prefix_signal"),
STATS_DESC_COUNTER("deliver_restart_signal"),
STATS_DESC_COUNTER("deliver_program"),
STATS_DESC_COUNTER("deliver_io"),
STATS_DESC_COUNTER("deliver_machine_check"),
STATS_DESC_COUNTER("exit_wait_state"),
STATS_DESC_COUNTER("inject_ckc"),
STATS_DESC_COUNTER("inject_cputm"),
STATS_DESC_COUNTER("inject_external_call"),
STATS_DESC_COUNTER("inject_emergency_signal"),
STATS_DESC_COUNTER("inject_mchk"),
STATS_DESC_COUNTER("inject_pfault_init"),
STATS_DESC_COUNTER("inject_program"),
STATS_DESC_COUNTER("inject_restart"),
STATS_DESC_COUNTER("inject_set_prefix"),
STATS_DESC_COUNTER("inject_stop_signal"),
STATS_DESC_COUNTER("instruction_epsw"),
STATS_DESC_COUNTER("instruction_gs"),
STATS_DESC_COUNTER("instruction_io_other"),
STATS_DESC_COUNTER("instruction_lpsw"),
STATS_DESC_COUNTER("instruction_lpswe"),
STATS_DESC_COUNTER("instruction_pfmf"),
STATS_DESC_COUNTER("instruction_ptff"),
STATS_DESC_COUNTER("instruction_sck"),
STATS_DESC_COUNTER("instruction_sckpf"),
STATS_DESC_COUNTER("instruction_stidp"),
STATS_DESC_COUNTER("instruction_spx"),
STATS_DESC_COUNTER("instruction_stpx"),
STATS_DESC_COUNTER("instruction_stap"),
STATS_DESC_COUNTER("instruction_iske"),
STATS_DESC_COUNTER("instruction_ri"),
STATS_DESC_COUNTER("instruction_rrbe"),
STATS_DESC_COUNTER("instruction_sske"),
STATS_DESC_COUNTER("instruction_ipte_interlock"),
STATS_DESC_COUNTER("instruction_stsi"),
STATS_DESC_COUNTER("instruction_stfl"),
STATS_DESC_COUNTER("instruction_tb"),
STATS_DESC_COUNTER("instruction_tpi"),
STATS_DESC_COUNTER("instruction_tprot"),
STATS_DESC_COUNTER("instruction_tsch"),
STATS_DESC_COUNTER("instruction_sie"),
STATS_DESC_COUNTER("instruction_essa"),
STATS_DESC_COUNTER("instruction_sthyi"),
STATS_DESC_COUNTER("instruction_sigp_sense"),
STATS_DESC_COUNTER("instruction_sigp_sense_running"),
STATS_DESC_COUNTER("instruction_sigp_external_call"),
STATS_DESC_COUNTER("instruction_sigp_emergency"),
STATS_DESC_COUNTER("instruction_sigp_cond_emergency"),
STATS_DESC_COUNTER("instruction_sigp_start"),
STATS_DESC_COUNTER("instruction_sigp_stop"),
STATS_DESC_COUNTER("instruction_sigp_stop_store_status"),
STATS_DESC_COUNTER("instruction_sigp_store_status"),
STATS_DESC_COUNTER("instruction_sigp_store_adtl_status"),
STATS_DESC_COUNTER("instruction_sigp_arch"),
STATS_DESC_COUNTER("instruction_sigp_prefix"),
STATS_DESC_COUNTER("instruction_sigp_restart"),
STATS_DESC_COUNTER("instruction_sigp_init_cpu_reset"),
STATS_DESC_COUNTER("instruction_sigp_cpu_reset"),
STATS_DESC_COUNTER("instruction_sigp_unknown"),
STATS_DESC_COUNTER("diagnose_10"),
STATS_DESC_COUNTER("diagnose_44"),
STATS_DESC_COUNTER("diagnose_9c"),
STATS_DESC_COUNTER("diagnose_9c_ignored"),
STATS_DESC_COUNTER("diagnose_258"),
STATS_DESC_COUNTER("diagnose_308"),
STATS_DESC_COUNTER("diagnose_500"),
STATS_DESC_COUNTER("diagnose_other"),
STATS_DESC_COUNTER("pfault_sync"),
+};
+struct _kvm_stats_header kvm_vcpu_stats_header = {
.name_size = KVM_STATS_NAME_LEN,
.count = ARRAY_SIZE(kvm_vcpu_stats_desc),
.desc_offset = sizeof(struct kvm_stats_header),
.data_offset = sizeof(struct kvm_stats_header) +
sizeof(kvm_vcpu_stats_desc),
+};
 struct kvm_stats_debugfs_item debugfs_entries[] = {
 	VCPU_STAT("userspace_handled", exit_userspace),
 	VCPU_STAT("exit_null", exit_null),
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e1207fd8b40d..dc55de1f958f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -217,6 +217,63 @@ EXPORT_SYMBOL_GPL(host_xss);
 u64 __read_mostly supported_xss;
 EXPORT_SYMBOL_GPL(supported_xss);
+struct _kvm_stats_desc kvm_vm_stats_desc[] = {
STATS_VM_COMMON,
STATS_DESC_COUNTER("mmu_shadow_zapped"),
STATS_DESC_COUNTER("mmu_pte_write"),
STATS_DESC_COUNTER("mmu_pde_zapped"),
STATS_DESC_COUNTER("mmu_flooded"),
STATS_DESC_COUNTER("mmu_recycled"),
STATS_DESC_COUNTER("mmu_cache_miss"),
STATS_DESC_ICOUNTER("mmu_unsync"),
STATS_DESC_ICOUNTER("largepages"),
STATS_DESC_ICOUNTER("nx_largepages_splits"),
STATS_DESC_ICOUNTER("max_mmu_page_hash_collisions"),
+};
+struct _kvm_stats_header kvm_vm_stats_header = {
.name_size = KVM_STATS_NAME_LEN,
.count = ARRAY_SIZE(kvm_vm_stats_desc),
.desc_offset = sizeof(struct kvm_stats_header),
.data_offset = sizeof(struct kvm_stats_header) +
sizeof(kvm_vm_stats_desc),
+};
+struct _kvm_stats_desc kvm_vcpu_stats_desc[] = {
STATS_VCPU_COMMON,
STATS_DESC_COUNTER("pf_fixed"),
STATS_DESC_COUNTER("pf_guest"),
STATS_DESC_COUNTER("tlb_flush"),
STATS_DESC_COUNTER("invlpg"),
STATS_DESC_COUNTER("exits"),
STATS_DESC_COUNTER("io_exits"),
STATS_DESC_COUNTER("mmio_exits"),
STATS_DESC_COUNTER("signal_exits"),
STATS_DESC_COUNTER("irq_window_exits"),
STATS_DESC_COUNTER("nmi_window_exits"),
STATS_DESC_COUNTER("l1d_flush"),
STATS_DESC_COUNTER("halt_exits"),
STATS_DESC_COUNTER("request_irq_exits"),
STATS_DESC_COUNTER("irq_exits"),
STATS_DESC_COUNTER("host_state_reload"),
STATS_DESC_COUNTER("fpu_reload"),
STATS_DESC_COUNTER("insn_emulation"),
STATS_DESC_COUNTER("insn_emulation_fail"),
STATS_DESC_COUNTER("hypercalls"),
STATS_DESC_COUNTER("irq_injections"),
STATS_DESC_COUNTER("nmi_injections"),
STATS_DESC_COUNTER("req_event"),
STATS_DESC_COUNTER("nested_run"),
+};
+struct _kvm_stats_header kvm_vcpu_stats_header = {
.name_size = KVM_STATS_NAME_LEN,
.count = ARRAY_SIZE(kvm_vcpu_stats_desc),
.desc_offset = sizeof(struct kvm_stats_header),
.data_offset = sizeof(struct kvm_stats_header) +
sizeof(kvm_vcpu_stats_desc),
+};
 struct kvm_stats_debugfs_item debugfs_entries[] = {
 	VCPU_STAT("pf_fixed", pf_fixed),
 	VCPU_STAT("pf_guest", pf_guest),
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 9286516094e3..796d97c8bbf0 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1201,12 +1201,25 @@ struct kvm_stats_debugfs_item {
 	int mode;
 };
+struct _kvm_stats_header {
__u32 name_size;
__u32 count;
__u32 desc_offset;
__u32 data_offset;
+};
+#define KVM_STATS_NAME_LEN 32
Following the warning reported by the kernel test robot (lkp@intel.com), I will change the maximum length to 48 to accommodate some long stats names on s390.
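To illustrate why 32 bytes is too short, here is a minimal sketch of the length check involved. The helper name `stats_name_fits` is hypothetical and not part of the patch; it only assumes that a name must fit, together with its trailing '\0', in a buffer of `KVM_STATS_NAME_LEN` bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not in the patch): returns 1 if `name` plus its
 * trailing '\0' fits in a descriptor name buffer of `name_len` bytes,
 * 0 otherwise.
 */
static int stats_name_fits(const char *name, size_t name_len)
{
	assert(name != NULL);
	return strlen(name) + 1 <= name_len;
}
```

With this check, "instruction_sigp_stop_store_status" (34 characters) does not fit in 32 bytes but does fit in 48.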
+struct _kvm_stats_desc {
struct kvm_stats_desc desc;
char name[KVM_STATS_NAME_LEN];
+};
 #define KVM_DBGFS_GET_MODE(dbgfs_item) \
 	((dbgfs_item)->mode ? (dbgfs_item)->mode : 0644)

-#define VM_STAT(n, x, ...) \
+#define VM_STAT(n, x, ...) \
 	{ n, offsetof(struct kvm, stat.x), KVM_STAT_VM, ## __VA_ARGS__ }
-#define VCPU_STAT(n, x, ...) \
+#define VCPU_STAT(n, x, ...) \
 	{ n, offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU, ## __VA_ARGS__ }
 #define VM_STAT_COM(n, x, ...) \
 	{ n, offsetof(struct kvm, stat.common.x), KVM_STAT_VM, ## __VA_ARGS__ }
@@ -1214,8 +1227,118 @@ struct kvm_stats_debugfs_item {
 	{ n, offsetof(struct kvm_vcpu, stat.common.x), \
 	  KVM_STAT_VCPU, ## __VA_ARGS__ }
+#define STATS_DESC(name, type, unit, scale, exponent) \
+	{ \
+		{type | unit | scale, exponent, 1}, name, \
+	}
+#define STATS_DESC_CUMULATIVE(name, unit, scale, exponent) \
+	STATS_DESC(name, KVM_STATS_TYPE_CUMULATIVE, unit, scale, exponent)
+#define STATS_DESC_INSTANT(name, unit, scale, exponent) \
+	STATS_DESC(name, KVM_STATS_TYPE_INSTANT, unit, scale, exponent)
+/* Cumulative counter */
+#define STATS_DESC_COUNTER(name) \
+	STATS_DESC_CUMULATIVE(name, KVM_STATS_UNIT_NONE, \
+		KVM_STATS_SCALE_POW10, 0)
+/* Instantaneous counter */
+#define STATS_DESC_ICOUNTER(name) \
+	STATS_DESC_INSTANT(name, KVM_STATS_UNIT_NONE, \
+		KVM_STATS_SCALE_POW10, 0)
+/* Cumulative clock cycles */
+#define STATS_DESC_CYCLE(name) \
+	STATS_DESC_CUMULATIVE(name, KVM_STATS_UNIT_CYCLES, \
+		KVM_STATS_SCALE_POW10, 0)
+/* Instantaneous clock cycles */
+#define STATS_DESC_ICYCLE(name) \
+	STATS_DESC_INSTANT(name, KVM_STATS_UNIT_CYCLES, \
+		KVM_STATS_SCALE_POW10, 0)
+/* Cumulative memory size in Byte */
+#define STATS_DESC_SIZE_BYTE(name) \
+	STATS_DESC_CUMULATIVE(name, KVM_STATS_UNIT_BYTES, \
+		KVM_STATS_SCALE_POW2, 0)
+/* Cumulative memory size in KiByte */
+#define STATS_DESC_SIZE_KBYTE(name) \
+	STATS_DESC_CUMULATIVE(name, KVM_STATS_UNIT_BYTES, \
+		KVM_STATS_SCALE_POW2, 10)
+/* Cumulative memory size in MiByte */
+#define STATS_DESC_SIZE_MBYTE(name) \
+	STATS_DESC_CUMULATIVE(name, KVM_STATS_UNIT_BYTES, \
+		KVM_STATS_SCALE_POW2, 20)
+/* Cumulative memory size in GiByte */
+#define STATS_DESC_SIZE_GBYTE(name) \
+	STATS_DESC_CUMULATIVE(name, KVM_STATS_UNIT_BYTES, \
+		KVM_STATS_SCALE_POW2, 30)
+/* Instantaneous memory size in Byte */
+#define STATS_DESC_ISIZE_BYTE(name) \
+	STATS_DESC_INSTANT(name, KVM_STATS_UNIT_BYTES, \
+		KVM_STATS_SCALE_POW2, 0)
+/* Instantaneous memory size in KiByte */
+#define STATS_DESC_ISIZE_KBYTE(name) \
+	STATS_DESC_INSTANT(name, KVM_STATS_UNIT_BYTES, \
+		KVM_STATS_SCALE_POW2, 10)
+/* Instantaneous memory size in MiByte */
+#define STATS_DESC_ISIZE_MBYTE(name) \
+	STATS_DESC_INSTANT(name, KVM_STATS_UNIT_BYTES, \
+		KVM_STATS_SCALE_POW2, 20)
+/* Instantaneous memory size in GiByte */
+#define STATS_DESC_ISIZE_GBYTE(name) \
+	STATS_DESC_INSTANT(name, KVM_STATS_UNIT_BYTES, \
+		KVM_STATS_SCALE_POW2, 30)
+/* Cumulative time in second */
+#define STATS_DESC_TIME_SEC(name) \
+	STATS_DESC_CUMULATIVE(name, KVM_STATS_UNIT_SECONDS, \
+		KVM_STATS_SCALE_POW10, 0)
+/* Cumulative time in millisecond */
+#define STATS_DESC_TIME_MSEC(name) \
+	STATS_DESC_CUMULATIVE(name, KVM_STATS_UNIT_SECONDS, \
+		KVM_STATS_SCALE_POW10, -3)
+/* Cumulative time in microsecond */
+#define STATS_DESC_TIME_USEC(name) \
+	STATS_DESC_CUMULATIVE(name, KVM_STATS_UNIT_SECONDS, \
+		KVM_STATS_SCALE_POW10, -6)
+/* Cumulative time in nanosecond */
+#define STATS_DESC_TIME_NSEC(name) \
+	STATS_DESC_CUMULATIVE(name, KVM_STATS_UNIT_SECONDS, \
+		KVM_STATS_SCALE_POW10, -9)
+/* Instantaneous time in second */
+#define STATS_DESC_ITIME_SEC(name) \
+	STATS_DESC_INSTANT(name, KVM_STATS_UNIT_SECONDS, \
+		KVM_STATS_SCALE_POW10, 0)
+/* Instantaneous time in millisecond */
+#define STATS_DESC_ITIME_MSEC(name) \
+	STATS_DESC_INSTANT(name, KVM_STATS_UNIT_SECONDS, \
+		KVM_STATS_SCALE_POW10, -3)
+/* Instantaneous time in microsecond */
+#define STATS_DESC_ITIME_USEC(name) \
+	STATS_DESC_INSTANT(name, KVM_STATS_UNIT_SECONDS, \
+		KVM_STATS_SCALE_POW10, -6)
+/* Instantaneous time in nanosecond */
+#define STATS_DESC_ITIME_NSEC(name) \
+	STATS_DESC_INSTANT(name, KVM_STATS_UNIT_SECONDS, \
+		KVM_STATS_SCALE_POW10, -9)
+#define STATS_VM_COMMON \
+	STATS_DESC_COUNTER("remote_tlb_flush")
+#define STATS_VCPU_COMMON \
+	STATS_DESC_COUNTER("halt_successful_poll"), \
+	STATS_DESC_COUNTER("halt_attempted_poll"), \
+	STATS_DESC_COUNTER("halt_poll_invalid"), \
+	STATS_DESC_COUNTER("halt_wakeup"), \
+	STATS_DESC_TIME_NSEC("halt_poll_success_ns"), \
+	STATS_DESC_TIME_NSEC("halt_poll_fail_ns")
 extern struct kvm_stats_debugfs_item debugfs_entries[];
 extern struct dentry *kvm_debugfs_dir;
+extern struct _kvm_stats_header kvm_vm_stats_header;
+extern struct _kvm_stats_header kvm_vcpu_stats_header;
+extern struct _kvm_stats_desc kvm_vm_stats_desc[];
+extern struct _kvm_stats_desc kvm_vcpu_stats_desc[];

 #if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER)
 static inline int mmu_notifier_retry(struct kvm *kvm, unsigned long mmu_seq)
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 1fb4fd863324..e7b8dc8fe7a4 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1083,6 +1083,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_VM_COPY_ENC_CONTEXT_FROM 197
 #define KVM_CAP_PTP_KVM 198
 #define KVM_CAP_EXIT_HYPERCALL 199
+#define KVM_CAP_STATS_BINARY_FD 200
#ifdef KVM_CAP_IRQ_ROUTING
@@ -1899,4 +1900,53 @@ struct kvm_dirty_gfn {
 #define KVM_BUS_LOCK_DETECTION_OFF             (1 << 0)
 #define KVM_BUS_LOCK_DETECTION_EXIT            (1 << 1)
+#define KVM_STATS_ID_MAXLEN 64
+struct kvm_stats_header {
char id[KVM_STATS_ID_MAXLEN];
__u32 name_size;
__u32 count;
__u32 desc_offset;
__u32 data_offset;
+};
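As an aside for readers of the layout, the following is a minimal sketch of how a user-space reader might locate descriptors from the header fields. The structures are local copies of the ones proposed in this patch (fixed-width types from `<stdint.h>` stand in for `__u32`/`__s16`/`__u16`), and the helper names are hypothetical. It assumes each descriptor is immediately followed by `name_size` bytes of name with no extra padding, which matches `struct _kvm_stats_desc` when `name_size` is a multiple of 4.

```c
#include <assert.h>
#include <stdint.h>

#define KVM_STATS_ID_MAXLEN 64

/* Local copy of the proposed uapi header (not from an installed header). */
struct kvm_stats_header {
	char id[KVM_STATS_ID_MAXLEN];
	uint32_t name_size;
	uint32_t count;
	uint32_t desc_offset;
	uint32_t data_offset;
};

/* Local copy of the proposed uapi descriptor; the name follows it. */
struct kvm_stats_desc {
	uint32_t flags;
	int16_t exponent;
	uint16_t size;
	uint32_t unused1;
	uint32_t unused2;
	char name[];
};

/* Bytes occupied by one descriptor including its name (assumed unpadded). */
static uint32_t stats_desc_stride(const struct kvm_stats_header *h)
{
	return (uint32_t)sizeof(struct kvm_stats_desc) + h->name_size;
}

/* File offset of descriptor number `i`. */
static uint32_t stats_desc_pos(const struct kvm_stats_header *h, uint32_t i)
{
	assert(i < h->count);
	return h->desc_offset + i * stats_desc_stride(h);
}

/* Where the data block starts if it directly follows the descriptors. */
static uint32_t stats_expected_data_offset(const struct kvm_stats_header *h)
{
	return h->desc_offset + h->count * stats_desc_stride(h);
}
```

A reader should still trust `data_offset` from the header rather than the computed value; the last helper is only a consistency check against the layout used in this version of the patch.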
+#define KVM_STATS_TYPE_SHIFT		0
+#define KVM_STATS_TYPE_MASK		(0xF << KVM_STATS_TYPE_SHIFT)
+#define KVM_STATS_TYPE_CUMULATIVE	(0x0 << KVM_STATS_TYPE_SHIFT)
+#define KVM_STATS_TYPE_INSTANT		(0x1 << KVM_STATS_TYPE_SHIFT)
+#define KVM_STATS_TYPE_MAX		KVM_STATS_TYPE_INSTANT
+
+#define KVM_STATS_UNIT_SHIFT		4
+#define KVM_STATS_UNIT_MASK		(0xF << KVM_STATS_UNIT_SHIFT)
+#define KVM_STATS_UNIT_NONE		(0x0 << KVM_STATS_UNIT_SHIFT)
+#define KVM_STATS_UNIT_BYTES		(0x1 << KVM_STATS_UNIT_SHIFT)
+#define KVM_STATS_UNIT_SECONDS		(0x2 << KVM_STATS_UNIT_SHIFT)
+#define KVM_STATS_UNIT_CYCLES		(0x3 << KVM_STATS_UNIT_SHIFT)
+#define KVM_STATS_UNIT_MAX		KVM_STATS_UNIT_CYCLES
+
+#define KVM_STATS_SCALE_SHIFT		8
+#define KVM_STATS_SCALE_MASK		(0xF << KVM_STATS_SCALE_SHIFT)
+#define KVM_STATS_SCALE_POW10		(0x0 << KVM_STATS_SCALE_SHIFT)
+#define KVM_STATS_SCALE_POW2		(0x1 << KVM_STATS_SCALE_SHIFT)
+#define KVM_STATS_SCALE_MAX		KVM_STATS_SCALE_POW2
+struct kvm_stats_desc {
__u32 flags;
__s16 exponent;
__u16 size;
__u32 unused1;
__u32 unused2;
char name[0];
+};
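For illustration, a minimal sketch of how the `flags` word is composed and decoded. The macro values are copied from the hunk above; the helper functions are hypothetical names introduced here, mirroring the `type | unit | scale` expression in `STATS_DESC`.

```c
#include <assert.h>
#include <stdint.h>

/* Flag layout copied from the proposed uapi header. */
#define KVM_STATS_TYPE_SHIFT		0
#define KVM_STATS_TYPE_MASK		(0xF << KVM_STATS_TYPE_SHIFT)
#define KVM_STATS_TYPE_CUMULATIVE	(0x0 << KVM_STATS_TYPE_SHIFT)
#define KVM_STATS_TYPE_INSTANT		(0x1 << KVM_STATS_TYPE_SHIFT)

#define KVM_STATS_UNIT_SHIFT		4
#define KVM_STATS_UNIT_MASK		(0xF << KVM_STATS_UNIT_SHIFT)
#define KVM_STATS_UNIT_NONE		(0x0 << KVM_STATS_UNIT_SHIFT)
#define KVM_STATS_UNIT_BYTES		(0x1 << KVM_STATS_UNIT_SHIFT)
#define KVM_STATS_UNIT_SECONDS		(0x2 << KVM_STATS_UNIT_SHIFT)
#define KVM_STATS_UNIT_CYCLES		(0x3 << KVM_STATS_UNIT_SHIFT)

#define KVM_STATS_SCALE_SHIFT		8
#define KVM_STATS_SCALE_MASK		(0xF << KVM_STATS_SCALE_SHIFT)
#define KVM_STATS_SCALE_POW10		(0x0 << KVM_STATS_SCALE_SHIFT)
#define KVM_STATS_SCALE_POW2		(0x1 << KVM_STATS_SCALE_SHIFT)

/* Hypothetical helper: combine the three fields, as STATS_DESC does. */
static uint32_t stats_make_flags(uint32_t type, uint32_t unit, uint32_t scale)
{
	/* Each component must stay inside its own 4-bit field. */
	assert((type & ~(uint32_t)KVM_STATS_TYPE_MASK) == 0);
	assert((unit & ~(uint32_t)KVM_STATS_UNIT_MASK) == 0);
	assert((scale & ~(uint32_t)KVM_STATS_SCALE_MASK) == 0);
	return type | unit | scale;
}

/* Hypothetical helper: 1 for instantaneous values, 0 for cumulative ones. */
static int stats_is_instant(uint32_t flags)
{
	return (flags & KVM_STATS_TYPE_MASK) == KVM_STATS_TYPE_INSTANT;
}

/* Hypothetical helper: extract the unit field, still shifted in place. */
static uint32_t stats_unit(uint32_t flags)
{
	return flags & KVM_STATS_UNIT_MASK;
}

/* Hypothetical helper: base of the scale, or 0 for an unknown encoding. */
static unsigned int stats_scale_base(uint32_t flags)
{
	switch (flags & KVM_STATS_SCALE_MASK) {
	case KVM_STATS_SCALE_POW10:
		return 10;
	case KVM_STATS_SCALE_POW2:
		return 2;
	default:
		return 0;
	}
}
```

Keeping the unit value shifted in place lets callers compare it directly against the `KVM_STATS_UNIT_*` constants.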
+struct kvm_vm_stats_data {
unsigned long value[0];
+};
+struct kvm_vcpu_stats_data {
__u64 value[0];
+};
+#define KVM_STATS_GETFD _IOR(KVMIO, 0xcc, struct kvm_stats_header)
 #endif /* __LINUX_KVM_H */
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index cdf53fb75ca1..c48089ab366c 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3458,6 +3458,115 @@ static int kvm_vcpu_ioctl_set_sigmask(struct kvm_vcpu *vcpu, sigset_t *sigset)
 	return 0;
 }
+static ssize_t kvm_vcpu_stats_read(struct file *file, char __user *user_buffer,
size_t size, loff_t *offset)
+{
char id[KVM_STATS_ID_MAXLEN];
struct kvm_vcpu *vcpu = file->private_data;
ssize_t copylen, len, remain = size;
size_t size_header, size_desc, size_stats;
loff_t pos = *offset;
char __user *dest = user_buffer;
void *src;
snprintf(id, sizeof(id), "kvm-%d/vcpu-%d",
task_pid_nr(current), vcpu->vcpu_id);
size_header = sizeof(kvm_vcpu_stats_header);
size_desc =
kvm_vcpu_stats_header.count * sizeof(struct _kvm_stats_desc);
size_stats = sizeof(vcpu->stat);
len = sizeof(id) + size_header + size_desc + size_stats - pos;
len = min(len, remain);
if (len <= 0)
return 0;
remain = len;
/* Copy kvm vcpu stats header id string */
copylen = sizeof(id) - pos;
copylen = min(copylen, remain);
if (copylen > 0) {
src = (void *)id + pos;
if (copy_to_user(dest, src, copylen))
return -EFAULT;
remain -= copylen;
pos += copylen;
dest += copylen;
}
/* Copy kvm vcpu stats header */
copylen = sizeof(id) + size_header - pos;
copylen = min(copylen, remain);
if (copylen > 0) {
src = (void *)&kvm_vcpu_stats_header;
src += pos - sizeof(id);
if (copy_to_user(dest, src, copylen))
return -EFAULT;
remain -= copylen;
pos += copylen;
dest += copylen;
}
/* Copy kvm vcpu stats descriptors */
copylen = kvm_vcpu_stats_header.desc_offset + size_desc - pos;
copylen = min(copylen, remain);
if (copylen > 0) {
src = (void *)&kvm_vcpu_stats_desc;
src += pos - kvm_vcpu_stats_header.desc_offset;
if (copy_to_user(dest, src, copylen))
return -EFAULT;
remain -= copylen;
pos += copylen;
dest += copylen;
}
/* Copy kvm vcpu stats values */
copylen = kvm_vcpu_stats_header.data_offset + size_stats - pos;
copylen = min(copylen, remain);
if (copylen > 0) {
src = (void *)&vcpu->stat;
src += pos - kvm_vcpu_stats_header.data_offset;
if (copy_to_user(dest, src, copylen))
return -EFAULT;
remain -= copylen;
pos += copylen;
dest += copylen;
}
*offset = pos;
return len;
+}
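The read handler above serves one logical file made of four consecutive regions (id string, header, descriptors, values), copying whichever parts overlap the requested range. A user-space analogue of that idea, written as a loop over a region table rather than four repeated stanzas, might look like the sketch below. This is an illustration only, not the patch's code: `struct region` and `copy_regions` are hypothetical names, and `memcpy` stands in for `copy_to_user`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One contiguous piece of the logical file. */
struct region {
	const void *base;
	size_t len;
};

/*
 * Copy up to `size` bytes, starting at logical offset `pos` of the
 * concatenation of `n` regions, into `dst`. Returns the number of bytes
 * copied, which is 0 when `pos` is at or past the end.
 */
static size_t copy_regions(const struct region *r, size_t n, size_t pos,
			   void *dst, size_t size)
{
	size_t start = 0, copied = 0;

	assert(r != NULL && dst != NULL);
	for (size_t i = 0; i < n && copied < size; i++) {
		size_t end = start + r[i].len;

		if (pos < end) {
			/* pos >= start holds here, so pos - start is valid. */
			size_t chunk = end - pos;

			if (chunk > size - copied)
				chunk = size - copied;
			memcpy((char *)dst + copied,
			       (const char *)r[i].base + (pos - start), chunk);
			copied += chunk;
			pos += chunk;
		}
		start = end;
	}
	return copied;
}
```

A table-driven form like this would also let the VM and vCPU handlers share one routine, although whether that is worthwhile in the kernel is a separate question.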
+static struct file_operations kvm_vcpu_stats_fops = {
.read = kvm_vcpu_stats_read,
.llseek = noop_llseek,
+};
+static int kvm_vcpu_ioctl_get_statsfd(struct kvm_vcpu *vcpu)
+{
int error, fd;
struct file *file;
char name[15 + ITOA_MAX_LEN + 1];
snprintf(name, sizeof(name), "kvm-vcpu-stats:%d", vcpu->vcpu_id);
error = get_unused_fd_flags(O_CLOEXEC);
if (error < 0)
return error;
fd = error;
file = anon_inode_getfile(name, &kvm_vcpu_stats_fops, vcpu, O_RDONLY);
if (IS_ERR(file)) {
error = PTR_ERR(file);
goto err_put_unused_fd;
}
file->f_mode |= FMODE_PREAD;
fd_install(fd, file);
return fd;
+err_put_unused_fd:
put_unused_fd(fd);
return error;
+}
 static long kvm_vcpu_ioctl(struct file *filp,
 			   unsigned int ioctl, unsigned long arg)
 {
@@ -3655,6 +3764,10 @@ static long kvm_vcpu_ioctl(struct file *filp,
 		r = kvm_arch_vcpu_ioctl_set_fpu(vcpu, fpu);
 		break;
 	}
case KVM_STATS_GETFD: {
r = kvm_vcpu_ioctl_get_statsfd(vcpu);
break;
+	}
 	default:
 		r = kvm_arch_vcpu_ioctl(filp, ioctl, arg);
 	}
@@ -3913,6 +4026,8 @@ static long kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg)
 #else
 		return 0;
 #endif
case KVM_CAP_STATS_BINARY_FD:
+		return 1;
 	default:
 		break;
 	}
@@ -4016,6 +4131,111 @@ static int kvm_vm_ioctl_enable_cap_generic(struct kvm *kvm,
 	}
 }
+static ssize_t kvm_vm_stats_read(struct file *file, char __user *user_buffer,
size_t size, loff_t *offset)
+{
char id[KVM_STATS_ID_MAXLEN];
struct kvm *kvm = file->private_data;
ssize_t copylen, len, remain = size;
size_t size_header, size_desc, size_stats;
loff_t pos = *offset;
char __user *dest = user_buffer;
void *src;
snprintf(id, sizeof(id), "kvm-%d", task_pid_nr(current));
size_header = sizeof(kvm_vm_stats_header);
size_desc = kvm_vm_stats_header.count * sizeof(struct _kvm_stats_desc);
size_stats = sizeof(kvm->stat);
len = sizeof(id) + size_header + size_desc + size_stats - pos;
len = min(len, remain);
if (len <= 0)
return 0;
remain = len;
/* Copy kvm vm stats header id string */
copylen = sizeof(id) - pos;
copylen = min(copylen, remain);
if (copylen > 0) {
src = (void *)id + pos;
if (copy_to_user(dest, src, copylen))
return -EFAULT;
remain -= copylen;
pos += copylen;
dest += copylen;
}
/* Copy kvm vm stats header */
copylen = sizeof(id) + size_header - pos;
copylen = min(copylen, remain);
if (copylen > 0) {
src = (void *)&kvm_vm_stats_header;
src += pos - sizeof(id);
if (copy_to_user(dest, src, copylen))
return -EFAULT;
remain -= copylen;
pos += copylen;
dest += copylen;
}
/* Copy kvm vm stats descriptors */
copylen = kvm_vm_stats_header.desc_offset + size_desc - pos;
copylen = min(copylen, remain);
if (copylen > 0) {
src = (void *)&kvm_vm_stats_desc;
src += pos - kvm_vm_stats_header.desc_offset;
if (copy_to_user(dest, src, copylen))
return -EFAULT;
remain -= copylen;
pos += copylen;
dest += copylen;
}
/* Copy kvm vm stats values */
copylen = kvm_vm_stats_header.data_offset + size_stats - pos;
copylen = min(copylen, remain);
if (copylen > 0) {
src = (void *)&kvm->stat;
src += pos - kvm_vm_stats_header.data_offset;
if (copy_to_user(dest, src, copylen))
return -EFAULT;
remain -= copylen;
pos += copylen;
dest += copylen;
}
*offset = pos;
return len;
+}
+static struct file_operations kvm_vm_stats_fops = {
.read = kvm_vm_stats_read,
.llseek = noop_llseek,
+};
+static int kvm_vm_ioctl_get_statsfd(struct kvm *kvm)
+{
int error, fd;
struct file *file;
error = get_unused_fd_flags(O_CLOEXEC);
if (error < 0)
return error;
fd = error;
file = anon_inode_getfile("kvm-vm-stats",
&kvm_vm_stats_fops, kvm, O_RDONLY);
if (IS_ERR(file)) {
error = PTR_ERR(file);
goto err_put_unused_fd;
}
file->f_mode |= FMODE_PREAD;
fd_install(fd, file);
return fd;
+err_put_unused_fd:
put_unused_fd(fd);
return error;
+}
 static long kvm_vm_ioctl(struct file *filp,
 			   unsigned int ioctl, unsigned long arg)
 {
@@ -4198,6 +4418,9 @@ static long kvm_vm_ioctl(struct file *filp,
 	case KVM_RESET_DIRTY_RINGS:
 		r = kvm_vm_ioctl_reset_dirty_pages(kvm);
 		break;
case KVM_STATS_GETFD:
r = kvm_vm_ioctl_get_statsfd(kvm);
+		break;
 	default:
 		r = kvm_arch_vm_ioctl(filp, ioctl, arg);
 	}
-- 2.31.1.527.g47e6f16901-goog
Signed-off-by: Jing Zhang <jingzhangos@google.com>
---
 Documentation/virt/kvm/api.rst | 171 +++++++++++++++++++++++++++++++++
 1 file changed, 171 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 6cb940b8db50..2d21c70ecb53 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -5034,6 +5034,169 @@ see KVM_XEN_VCPU_SET_ATTR above.
 The KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADJUST type may not be used with the KVM_XEN_VCPU_GET_ATTR ioctl.
+4.130 KVM_STATS_GETFD
+---------------------
+
+:Capability: KVM_CAP_STATS_BINARY_FD
+:Architectures: all
+:Type: vm ioctl, vcpu ioctl
+:Parameters: none
+:Returns: statistics file descriptor on success, < 0 on error
+
+Errors:
+
+  ======     ======================================================
+  ENOMEM     if the fd could not be created due to lack of memory
+  EMFILE     if the number of opened files exceeds the limit
+  ======     ======================================================
+
+The file descriptor can be used to read VM/vCPU statistics data in binary
+format. The file data is organized into three blocks as below:
+
++-------------+
+|   Header    |
++-------------+
+| Descriptors |
++-------------+
+| Stats Data  |
++-------------+
+
+The Header block is always at the start of the file. It only needs to be
+read once after a system boot.
+It is in the form of ``struct kvm_stats_header`` as below::
+
+	#define KVM_STATS_ID_MAXLEN		64
+
+	struct kvm_stats_header {
+		char id[KVM_STATS_ID_MAXLEN];
+		__u32 name_size;
+		__u32 count;
+		__u32 desc_offset;
+		__u32 data_offset;
+	};
+
+The ``id`` field identifies the corresponding KVM statistics. For
+VM statistics, it is in the form of "kvm-{kvm pid}", like "kvm-12345". For
+VCPU statistics, it is in the form of "kvm-{kvm pid}/vcpu-{vcpu id}", like
+"kvm-12345/vcpu-12".
+
+The ``name_size`` field is the size (in bytes) of the statistics name string
+(including trailing '\0') appended to the end of every statistics descriptor.
+
+The ``count`` field is the number of statistics.
+
+The ``desc_offset`` field is the offset of the Descriptors block from the start
+of the file indicated by the file descriptor.
+
+The ``data_offset`` field is the offset of the Stats Data block from the start
+of the file indicated by the file descriptor.
+
+The Descriptors block only needs to be read once after a system boot. It is
+an array of ``struct kvm_stats_desc`` as below::
+
+	#define KVM_STATS_TYPE_SHIFT		0
+	#define KVM_STATS_TYPE_MASK		(0xF << KVM_STATS_TYPE_SHIFT)
+	#define KVM_STATS_TYPE_CUMULATIVE	(0x0 << KVM_STATS_TYPE_SHIFT)
+	#define KVM_STATS_TYPE_INSTANT		(0x1 << KVM_STATS_TYPE_SHIFT)
+	#define KVM_STATS_TYPE_MAX		KVM_STATS_TYPE_INSTANT
+
+	#define KVM_STATS_UNIT_SHIFT		4
+	#define KVM_STATS_UNIT_MASK		(0xF << KVM_STATS_UNIT_SHIFT)
+	#define KVM_STATS_UNIT_NONE		(0x0 << KVM_STATS_UNIT_SHIFT)
+	#define KVM_STATS_UNIT_BYTES		(0x1 << KVM_STATS_UNIT_SHIFT)
+	#define KVM_STATS_UNIT_SECONDS		(0x2 << KVM_STATS_UNIT_SHIFT)
+	#define KVM_STATS_UNIT_CYCLES		(0x3 << KVM_STATS_UNIT_SHIFT)
+	#define KVM_STATS_UNIT_MAX		KVM_STATS_UNIT_CYCLES
+
+	#define KVM_STATS_SCALE_SHIFT		8
+	#define KVM_STATS_SCALE_MASK		(0xF << KVM_STATS_SCALE_SHIFT)
+	#define KVM_STATS_SCALE_POW10		(0x0 << KVM_STATS_SCALE_SHIFT)
+	#define KVM_STATS_SCALE_POW2		(0x1 << KVM_STATS_SCALE_SHIFT)
+	#define KVM_STATS_SCALE_MAX		KVM_STATS_SCALE_POW2
+
+	struct kvm_stats_desc {
+		__u32 flags;
+		__s16 exponent;
+		__u16 size;
+		__u32 unused1;
+		__u32 unused2;
+		char name[0];
+	};
+
+The ``flags`` field contains the type, unit and scale base of the statistics
+data described by this descriptor. The following type and unit flags are
+supported:
+
+  * ``KVM_STATS_TYPE_CUMULATIVE``
+    The statistics data is cumulative. The value of data can only be increased.
+    Most of the counters used in KVM are of this type.
+    The corresponding ``size`` field for this type is always 1.
+  * ``KVM_STATS_TYPE_INSTANT``
+    The statistics data is instantaneous. Its value can be increased or
+    decreased. This type is usually used as a measurement of some resources,
+    like the number of dirty pages, the number of large pages, etc.
+    The corresponding ``size`` field for this type is always 1.
+  * ``KVM_STATS_UNIT_NONE``
+    There is no unit for the value of statistics data. This usually means that
+    the value is a simple counter of an event.
+  * ``KVM_STATS_UNIT_BYTES``
+    It indicates that the statistics data is used to measure memory size, in the
+    unit of Byte, KiByte, MiByte, GiByte, etc. The ``KVM_STATS_SCALE_POW2``
+    flag is valid in this case, and the unit of the data is determined by
+    ``pow(2, exponent)``. For example, if value is 10 and ``exponent`` is 20,
+    the unit of statistics data is MiByte, and we can get the statistics data
+    in the unit of Byte by
+    ``value * pow(2, exponent) = 10 * pow(2, 20) = 10 MiByte`` which is
+    10 * 1024 * 1024 Bytes.
+  * ``KVM_STATS_UNIT_SECONDS``
+    It indicates that the statistics data is used to measure time/latency, in
+    the unit of nanosecond, microsecond, millisecond and second. The
+    ``KVM_STATS_SCALE_POW10`` flag is valid in this case, and the unit of the
+    data is determined by ``pow(10, exponent)``. For example, if value is
+    2000000 and ``exponent`` is -6, the unit of statistics data is microsecond,
+    and we can get the statistics data in the unit of second by
+    ``value * pow(10, exponent) = 2000000 * pow(10, -6) = 2 seconds``.
+  * ``KVM_STATS_UNIT_CYCLES``
+    It indicates that the statistics data is used to measure CPU clock cycles.
+    The ``KVM_STATS_SCALE_POW10`` flag is valid in this case. For example, if
+    value is 200 and ``exponent`` is 4, we can get the number of CPU clock
+    cycles by ``value * pow(10, exponent) = 200 * pow(10, 4) = 2000000``.
+
+The ``exponent`` field is the exponent of the scale applied to the
+corresponding statistics data. The base of the scale is selected by one of the
+following flags in the ``flags`` field:
+
+  * ``KVM_STATS_SCALE_POW10``
+    The scale is based on power of 10. It is used for measurement of time and
+    CPU clock cycles.
+  * ``KVM_STATS_SCALE_POW2``
+    The scale is based on power of 2. It is used for measurement of memory size.
+
+The ``size`` field is the number of values of this statistics data.
It is in the +unit of ``unsigned long`` for VCPU or ``__u64`` for VM. + +The ``unused1`` and ``unused2`` fields are reserved for future +support for other types of statistics data, like log/linear histogram. + +The ``name`` field points to the name string of the statistics data. The name +string starts at the end of ``struct kvm_stats_desc``. +The maximum length (including trailing '\0') is indicated by ``name_size`` +in ``struct kvm_stats_header``. + +The Stats Data block contains an array of data values of type ``struct +kvm_vm_stats_data`` or ``struct kvm_vcpu_stats_data``. It would be read by +user space periodically to pull statistics data. +The order of data value in Stats Data block is the same as the order of +descriptors in Descriptors block. + * Statistics data for VM:: + + struct kvm_vm_stats_data { + unsigned long value[0]; + }; + + * Statistics data for VCPU:: + + struct kvm_vcpu_stats_data { + __u64 value[0]; + }; + 5. The kvm_run structure ========================
@@ -6911,3 +7074,11 @@
 Right now, the only bit that can be set is bit 12, corresponding to
 KVM_HC_PAGE_ENC_STATUS. The hypercall returns ENOSYS if KVM_CAP_EXIT_HYPERCALL
 is not enabled or if bit 12 is not set when enabling it.
+
+8.34 KVM_CAP_STATS_BINARY_FD
+----------------------------
+
+:Architectures: all
+
+This capability indicates the feature that user space can get a file
+descriptor for every VM and VCPU to read statistics data in binary format.
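
As an illustration of the scaling rules documented in the api.rst hunk above, the following is a minimal sketch of how user space might convert a raw value into its base unit. The flag macros are copied from the definitions proposed in this patch; the helper name `stats_to_base_unit` is hypothetical and not part of the patchset. The loop is used instead of `pow()` only to keep the sketch free of a libm dependency.

```c
#include <stdint.h>

/* Flag layout copied from the uapi definitions proposed in this patch. */
#define KVM_STATS_UNIT_SHIFT	4
#define KVM_STATS_UNIT_BYTES	(0x1 << KVM_STATS_UNIT_SHIFT)
#define KVM_STATS_UNIT_SECONDS	(0x2 << KVM_STATS_UNIT_SHIFT)
#define KVM_STATS_UNIT_CYCLES	(0x3 << KVM_STATS_UNIT_SHIFT)

#define KVM_STATS_SCALE_SHIFT	8
#define KVM_STATS_SCALE_MASK	(0xF << KVM_STATS_SCALE_SHIFT)
#define KVM_STATS_SCALE_POW10	(0x0 << KVM_STATS_SCALE_SHIFT)
#define KVM_STATS_SCALE_POW2	(0x1 << KVM_STATS_SCALE_SHIFT)

/*
 * Hypothetical helper: compute raw * base^exponent, where the base (2 or 10)
 * is selected by the scale bits in the descriptor flags.
 */
static double stats_to_base_unit(uint32_t flags, int16_t exponent, uint64_t raw)
{
	double base = (flags & KVM_STATS_SCALE_MASK) == KVM_STATS_SCALE_POW2 ?
		      2.0 : 10.0;
	double value = (double)raw;
	int steps = exponent < 0 ? -exponent : exponent;

	/* Multiply for positive exponents, divide for negative ones. */
	while (steps-- > 0)
		value = exponent < 0 ? value / base : value * base;

	return value;
}
```

With the documented examples, a bytes value of 10 with exponent 20 yields 10 * 1024 * 1024 bytes, and a seconds value of 2000000 with exponent -6 yields 2 seconds.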
Signed-off-by: Jing Zhang <jingzhangos@google.com>
---
 tools/testing/selftests/kvm/.gitignore        |   1 +
 tools/testing/selftests/kvm/Makefile          |   3 +
 .../testing/selftests/kvm/include/kvm_util.h  |   3 +
 .../selftests/kvm/kvm_bin_form_stats.c        | 380 ++++++++++++++++++
 tools/testing/selftests/kvm/lib/kvm_util.c    |  11 +
 5 files changed, 398 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/kvm_bin_form_stats.c
diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore
index bd83158e0e0b..35796667c944 100644
--- a/tools/testing/selftests/kvm/.gitignore
+++ b/tools/testing/selftests/kvm/.gitignore
@@ -43,3 +43,4 @@
 /memslot_modification_stress_test
 /set_memory_region_test
 /steal_time
+/kvm_bin_form_stats
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index ea5c42841307..e06ced4f99a2 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -76,6 +76,7 @@ TEST_GEN_PROGS_x86_64 += kvm_page_table_test
 TEST_GEN_PROGS_x86_64 += memslot_modification_stress_test
 TEST_GEN_PROGS_x86_64 += set_memory_region_test
 TEST_GEN_PROGS_x86_64 += steal_time
+TEST_GEN_PROGS_x86_64 += kvm_bin_form_stats

 TEST_GEN_PROGS_aarch64 += aarch64/get-reg-list
 TEST_GEN_PROGS_aarch64 += aarch64/get-reg-list-sve
@@ -87,6 +88,7 @@ TEST_GEN_PROGS_aarch64 += kvm_create_max_vcpus
 TEST_GEN_PROGS_aarch64 += kvm_page_table_test
 TEST_GEN_PROGS_aarch64 += set_memory_region_test
 TEST_GEN_PROGS_aarch64 += steal_time
+TEST_GEN_PROGS_aarch64 += kvm_bin_form_stats

 TEST_GEN_PROGS_s390x = s390x/memop
 TEST_GEN_PROGS_s390x += s390x/resets
@@ -96,6 +98,7 @@ TEST_GEN_PROGS_s390x += dirty_log_test
 TEST_GEN_PROGS_s390x += kvm_create_max_vcpus
 TEST_GEN_PROGS_s390x += kvm_page_table_test
 TEST_GEN_PROGS_s390x += set_memory_region_test
+TEST_GEN_PROGS_s390x += kvm_bin_form_stats

 TEST_GEN_PROGS += $(TEST_GEN_PROGS_$(UNAME_M))
 LIBKVM += $(LIBKVM_$(UNAME_M))
diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h
index a8f022794ce3..ee01a67022d9 100644
--- a/tools/testing/selftests/kvm/include/kvm_util.h
+++ b/tools/testing/selftests/kvm/include/kvm_util.h
@@ -387,4 +387,7 @@ uint64_t get_ucall(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc);
 #define GUEST_ASSERT_4(_condition, arg1, arg2, arg3, arg4) \
 	__GUEST_ASSERT((_condition), 4, (arg1), (arg2), (arg3), (arg4))

+int vm_get_statsfd(struct kvm_vm *vm);
+int vcpu_get_statsfd(struct kvm_vm *vm, uint32_t vcpuid);
+
 #endif /* SELFTEST_KVM_UTIL_H */
diff --git a/tools/testing/selftests/kvm/kvm_bin_form_stats.c b/tools/testing/selftests/kvm/kvm_bin_form_stats.c
new file mode 100644
index 000000000000..486e54ff2e1e
--- /dev/null
+++ b/tools/testing/selftests/kvm/kvm_bin_form_stats.c
@@ -0,0 +1,380 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * kvm_bin_form_stats
+ *
+ * Copyright (C) 2021, Google LLC.
+ *
+ * Test the fd-based interface for KVM statistics.
+ */
+
+#define _GNU_SOURCE /* for program_invocation_short_name */
+#include <fcntl.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <errno.h>
+
+#include "test_util.h"
+
+#include "kvm_util.h"
+#include "asm/kvm.h"
+#include "linux/kvm.h"
+
+int vm_stats_test(struct kvm_vm *vm)
+{
+	ssize_t ret;
+	int i, stats_fd, err = -1;
+	size_t size_desc, size_data = 0;
+	struct kvm_stats_header header;
+	struct kvm_stats_desc *stats_desc, *pdesc;
+	struct kvm_vm_stats_data *stats_data;
+
+	/* Get fd for VM stats */
+	stats_fd = vm_get_statsfd(vm);
+	if (stats_fd < 0) {
+		perror("Get VM stats fd");
+		return err;
+	}
+	/* Read kvm vm stats header */
+	ret = read(stats_fd, &header, sizeof(header));
+	if (ret != sizeof(header)) {
+		perror("Read VM stats header");
+		goto out_close_fd;
+	}
+	size_desc = sizeof(*stats_desc) + header.name_size;
+	/* Check id string in header, that should start with "kvm" */
+	if (strncmp(header.id, "kvm", 3) ||
+			strlen(header.id) >= KVM_STATS_ID_MAXLEN) {
+		printf("Invalid KVM VM stats type!\n");
+		goto out_close_fd;
+	}
+	/* Sanity check for other fields in header */
+	if (header.count == 0) {
+		err = 0;
+		goto out_close_fd;
+	}
+	/* Check overlap */
+	if (header.desc_offset == 0 || header.data_offset == 0 ||
+			header.desc_offset < sizeof(header) ||
+			header.data_offset < sizeof(header)) {
+		printf("Invalid offset fields in header!\n");
+		goto out_close_fd;
+	}
+	if (header.desc_offset < header.data_offset &&
+			(header.desc_offset + size_desc * header.count >
+			header.data_offset)) {
+		printf("VM Descriptor block is overlapped with data block!\n");
+		goto out_close_fd;
+	}
+
+	/* Allocate memory for stats descriptors */
+	stats_desc = calloc(header.count, size_desc);
+	if (!stats_desc) {
+		perror("Allocate memory for VM stats descriptors");
+		goto out_close_fd;
+	}
+	/* Read kvm vm stats descriptors */
+	ret = pread(stats_fd, stats_desc,
+			size_desc * header.count, header.desc_offset);
+	if (ret != size_desc * header.count) {
+		perror("Read KVM VM stats descriptors");
+		goto out_free_desc;
+	}
+	/* Sanity check for fields in descriptors */
+	for (i = 0; i < header.count; ++i) {
+		pdesc = (void *)stats_desc + i * size_desc;
+		/* Check type,unit,scale boundaries */
+		if ((pdesc->flags & KVM_STATS_TYPE_MASK) > KVM_STATS_TYPE_MAX) {
+			printf("Unknown KVM stats type!\n");
+			goto out_free_desc;
+		}
+		if ((pdesc->flags & KVM_STATS_UNIT_MASK) > KVM_STATS_UNIT_MAX) {
+			printf("Unknown KVM stats unit!\n");
+			goto out_free_desc;
+		}
+		if ((pdesc->flags & KVM_STATS_SCALE_MASK) >
+				KVM_STATS_SCALE_MAX) {
+			printf("Unknown KVM stats scale!\n");
+			goto out_free_desc;
+		}
+		/* Check exponent for stats unit
+		 * Exponent for counter should be greater than or equal to 0
+		 * Exponent for unit bytes should be greater than or equal to 0
+		 * Exponent for unit seconds should be less than or equal to 0
+		 * Exponent for unit clock cycles should be greater than or
+		 * equal to 0
+		 */
+		switch (pdesc->flags & KVM_STATS_UNIT_MASK) {
+		case KVM_STATS_UNIT_NONE:
+		case KVM_STATS_UNIT_BYTES:
+		case KVM_STATS_UNIT_CYCLES:
+			if (pdesc->exponent < 0) {
+				printf("Unsupported KVM stats unit!\n");
+				goto out_free_desc;
+			}
+			break;
+		case KVM_STATS_UNIT_SECONDS:
+			if (pdesc->exponent > 0) {
+				printf("Unsupported KVM stats unit!\n");
+				goto out_free_desc;
+			}
+			break;
+		}
+		/* Check name string */
+		if (strlen(pdesc->name) >= header.name_size) {
+			printf("KVM stats name(%s) too long!\n", pdesc->name);
+			goto out_free_desc;
+		}
+		/* Check size field, which should not be zero */
+		if (pdesc->size == 0) {
+			printf("KVM descriptor(%s) with size of 0!\n",
+					pdesc->name);
+			goto out_free_desc;
+		}
+		size_data += pdesc->size * sizeof(stats_data->value[0]);
+	}
+	/* Check overlap */
+	if (header.data_offset < header.desc_offset &&
+		header.data_offset + size_data > header.desc_offset) {
+		printf("Data block is overlapped with Descriptor block!\n");
+		goto out_free_desc;
+	}
+	/* Check validity of all stats data size */
+	if (size_data < header.count * sizeof(stats_data->value[0])) {
+		printf("Data size is not correct!\n");
+		goto out_free_desc;
+	}
+
+	/* Allocate memory for stats data */
+	stats_data = malloc(size_data);
+	if (!stats_data) {
+		perror("Allocate memory for VM stats data");
+		goto out_free_desc;
+	}
+	/* Read kvm vm stats data */
+	ret = pread(stats_fd, stats_data, size_data, header.data_offset);
+	if (ret != size_data) {
+		perror("Read KVM VM stats data");
+		goto out_free_data;
+	}
+
+	err = 0;
+out_free_data:
+	free(stats_data);
+out_free_desc:
+	free(stats_desc);
+out_close_fd:
+	close(stats_fd);
+	return err;
+}
+
+int vcpu_stats_test(struct kvm_vm *vm, int vcpu_id)
+{
+	ssize_t ret;
+	int i, stats_fd, err = -1;
+	size_t size_desc, size_data = 0;
+	struct kvm_stats_header header;
+	struct kvm_stats_desc *stats_desc, *pdesc;
+	struct kvm_vcpu_stats_data *stats_data;
+
+	/* Get fd for VCPU stats */
+	stats_fd = vcpu_get_statsfd(vm, vcpu_id);
+	if (stats_fd < 0) {
+		perror("Get VCPU stats fd");
+		return err;
+	}
+	/* Read kvm vcpu stats header */
+	ret = read(stats_fd, &header, sizeof(header));
+	if (ret != sizeof(header)) {
+		perror("Read VCPU stats header");
+		goto out_close_fd;
+	}
+	size_desc = sizeof(*stats_desc) + header.name_size;
+	/* Check id string in header, that should start with "kvm" */
+	if (strncmp(header.id, "kvm", 3) ||
+			strlen(header.id) >= KVM_STATS_ID_MAXLEN) {
+		printf("Invalid KVM VCPU stats type!\n");
+		goto out_close_fd;
+	}
+	/* Sanity check for other fields in header */
+	if (header.count == 0) {
+		err = 0;
+		goto out_close_fd;
+	}
+	/* Check overlap */
+	if (header.desc_offset == 0 || header.data_offset == 0 ||
+			header.desc_offset < sizeof(header) ||
+			header.data_offset < sizeof(header)) {
+		printf("Invalid offset fields in header!\n");
+		goto out_close_fd;
+	}
+	if (header.desc_offset < header.data_offset &&
+			(header.desc_offset + size_desc * header.count >
+			header.data_offset)) {
+		printf("VCPU Descriptor block is overlapped with data block!\n");
+		goto out_close_fd;
+	}
+
+	/* Allocate memory for stats descriptors */
+	stats_desc = calloc(header.count, size_desc);
+	if (!stats_desc) {
+		perror("Allocate memory for VCPU stats descriptors");
+		goto out_close_fd;
+	}
+	/* Read kvm vcpu stats descriptors */
+	ret = pread(stats_fd, stats_desc,
+			size_desc * header.count, header.desc_offset);
+	if (ret != size_desc * header.count) {
+		perror("Read KVM VCPU stats descriptors");
+		goto out_free_desc;
+	}
+	/* Sanity check for fields in descriptors */
+	for (i = 0; i < header.count; ++i) {
+		pdesc = (void *)stats_desc + i * size_desc;
+		/* Check boundaries */
+		if ((pdesc->flags & KVM_STATS_TYPE_MASK) > KVM_STATS_TYPE_MAX) {
+			printf("Unknown KVM stats type!\n");
+			goto out_free_desc;
+		}
+		if ((pdesc->flags & KVM_STATS_UNIT_MASK) > KVM_STATS_UNIT_MAX) {
+			printf("Unknown KVM stats unit!\n");
+			goto out_free_desc;
+		}
+		if ((pdesc->flags & KVM_STATS_SCALE_MASK) >
+				KVM_STATS_SCALE_MAX) {
+			printf("Unknown KVM stats scale!\n");
+			goto out_free_desc;
+		}
+		/* Check exponent for stats unit
+		 * Exponent for counter should be greater than or equal to 0
+		 * Exponent for unit bytes should be greater than or equal to 0
+		 * Exponent for unit seconds should be less than or equal to 0
+		 * Exponent for unit clock cycles should be greater than or
+		 * equal to 0
+		 */
+		switch (pdesc->flags & KVM_STATS_UNIT_MASK) {
+		case KVM_STATS_UNIT_NONE:
+		case KVM_STATS_UNIT_BYTES:
+		case KVM_STATS_UNIT_CYCLES:
+			if (pdesc->exponent < 0) {
+				printf("Unsupported KVM stats unit!\n");
+				goto out_free_desc;
+			}
+			break;
+		case KVM_STATS_UNIT_SECONDS:
+			if (pdesc->exponent > 0) {
+				printf("Unsupported KVM stats unit!\n");
+				goto out_free_desc;
+			}
+			break;
+		}
+		/* Check name string */
+		if (strlen(pdesc->name) >= header.name_size) {
+			printf("KVM stats name(%s) too long!\n", pdesc->name);
+			goto out_free_desc;
+		}
+		/* Check size field, which should not be zero */
+		if (pdesc->size == 0) {
+			printf("KVM descriptor(%s) with size of 0!\n",
+					pdesc->name);
+			goto out_free_desc;
+		}
+		size_data += pdesc->size * sizeof(stats_data->value[0]);
+	}
+	/* Check overlap */
+	if (header.data_offset < header.desc_offset &&
+		header.data_offset + size_data > header.desc_offset) {
+		printf("Data block is overlapped with Descriptor block!\n");
+		goto out_free_desc;
+	}
+	/* Check validity of all stats data size */
+	if (size_data < header.count * sizeof(stats_data->value[0])) {
+		printf("Data size is not correct!\n");
+		goto out_free_desc;
+	}
+
+	/* Allocate memory for stats data */
+	stats_data = malloc(size_data);
+	if (!stats_data) {
+		perror("Allocate memory for VCPU stats data");
+		goto out_free_desc;
+	}
+	/* Read kvm vcpu stats data */
+	ret = pread(stats_fd, stats_data, size_data, header.data_offset);
+	if (ret != size_data) {
+		perror("Read KVM VCPU stats data");
+		goto out_free_data;
+	}
+
+	err = 0;
+out_free_data:
+	free(stats_data);
+out_free_desc:
+	free(stats_desc);
+out_close_fd:
+	close(stats_fd);
+	return err;
+}
+
+/*
+ * Usage: kvm_bin_form_stats [#vm] [#vcpu]
+ * The first parameter #vm sets the number of VMs being created.
+ * The second parameter #vcpu sets the number of VCPUs being created.
+ * By default, 1 VM and 1 VCPU for the VM would be created for testing.
+ */
+
+int main(int argc, char *argv[])
+{
+	int max_vm = 1, max_vcpu = 1, ret, i, j, err = -1;
+	struct kvm_vm **vms;
+
+	/* Get the number of VMs and VCPUs that would be created for testing. */
+	if (argc > 1) {
+		max_vm = strtol(argv[1], NULL, 0);
+		if (max_vm <= 0)
+			max_vm = 1;
+	}
+	if (argc > 2) {
+		max_vcpu = strtol(argv[2], NULL, 0);
+		if (max_vcpu <= 0)
+			max_vcpu = 1;
+	}
+
+	/* Check the extension for binary stats (0 means not supported) */
+	ret = kvm_check_cap(KVM_CAP_STATS_BINARY_FD);
+	if (ret <= 0) {
+		printf("Binary form statistics interface is not supported!\n");
+		return err;
+	}
+
+	/* Create VMs and VCPUs */
+	vms = malloc(sizeof(vms[0]) * max_vm);
+	if (!vms) {
+		perror("Allocate memory for storing VM pointers");
+		return err;
+	}
+	for (i = 0; i < max_vm; ++i) {
+		vms[i] = vm_create(VM_MODE_DEFAULT,
+				DEFAULT_GUEST_PHY_PAGES, O_RDWR);
+		for (j = 0; j < max_vcpu; ++j) {
+			vm_vcpu_add(vms[i], j);
+		}
+	}
+
+	/* Check stats read for every VM and VCPU */
+	for (i = 0; i < max_vm; ++i) {
+		if (vm_stats_test(vms[i]))
+			goto out_free_vm;
+		for (j = 0; j < max_vcpu; ++j) {
+			if (vcpu_stats_test(vms[i], j))
+				goto out_free_vm;
+		}
+	}
+
+	err = 0;
+out_free_vm:
+	for (i = 0; i < max_vm; ++i)
+		kvm_vm_free(vms[i]);
+	free(vms);
+	return err;
+}
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index fc83f6c5902d..9b5768f893f9 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -2090,3 +2090,14 @@ unsigned int vm_calc_num_guest_pages(enum vm_guest_mode mode, size_t size)
 	n = DIV_ROUND_UP(size, vm_guest_mode_params[mode].page_size);
 	return vm_adjust_num_guest_pages(mode, n);
 }
+
+int vm_get_statsfd(struct kvm_vm *vm)
+{
+	return ioctl(vm->fd, KVM_STATS_GETFD, NULL);
+}
+
+int vcpu_get_statsfd(struct kvm_vm *vm, uint32_t vcpuid)
+{
+	struct vcpu *vcpu = vcpu_find(vm, vcpuid);
+	return ioctl(vcpu->fd, KVM_STATS_GETFD, NULL);
+}
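
The descriptor walk in the selftest above relies on each descriptor occupying `sizeof(struct kvm_stats_desc) + name_size` bytes. The following is a minimal, self-contained sketch of that stride arithmetic, assuming the layout proposed in this patch. `struct stats_desc` is a local mirror of `struct kvm_stats_desc` for illustration, and both helper names are hypothetical. Unlike the selftest, which calls `strlen()` on the name, the sketch bounds the terminator search with `memchr()` so that an unterminated name cannot be read past `name_size`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of struct kvm_stats_desc from this patch (illustration only). */
struct stats_desc {
	uint32_t flags;
	int16_t exponent;
	uint16_t size;
	uint32_t unused1;
	uint32_t unused2;
	char name[];	/* name_size bytes follow each descriptor */
};

/* Hypothetical helper: locate descriptor i inside a Descriptors block. */
static const struct stats_desc *stats_desc_at(const void *block, uint32_t i,
					       uint32_t name_size)
{
	size_t stride = sizeof(struct stats_desc) + name_size;

	return (const struct stats_desc *)((const char *)block +
					   (size_t)i * stride);
}

/*
 * Hypothetical helper: number of values expected in the Stats Data block.
 * Returns 0 if any descriptor has size 0 or a name that is not
 * NUL-terminated within name_size bytes.
 */
static size_t stats_total_values(const void *block, uint32_t count,
				 uint32_t name_size)
{
	size_t total = 0;
	uint32_t i;

	for (i = 0; i < count; i++) {
		const struct stats_desc *d = stats_desc_at(block, i, name_size);

		if (d->size == 0 || !memchr(d->name, '\0', name_size))
			return 0;
		total += d->size;
	}
	return total;
}
```

Multiplying the result by the per-value width (`unsigned long` for VM, `__u64` for VCPU in this version of the patch) gives the number of bytes to `pread()` from `data_offset`.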
Hi Paolo,
On Thu, Apr 29, 2021 at 3:37 PM Jing Zhang jingzhangos@google.com wrote:
This patchset provides a file descriptor for every VM and VCPU to read KVM statistics data in binary format. It is meant to provide a lightweight, flexible, scalable and efficient lock-free solution for user space telemetry applications to pull the statistics data periodically for large scale systems. The pulling frequency could be as high as a few times per second. In this patchset, each statistic is described by the following attributes:
- architecture dependent or common
- VM statistics data or VCPU statistics data
- type: cumulative, instantaneous
- unit: none for simple counter, nanosecond, microsecond, millisecond, second, Byte, KiByte, MiByte, GiByte, clock cycles
Since no lock/synchronization is used, consistency across the statistics data is not guaranteed. That is, the values are not all read out at exactly the same time, since KVM subsystems keep updating them while they are being read.
v3 -> v4
- Rebase to kvm/queue, commit 9f242010c3b4 ("KVM: avoid "deadlock" between install_new_memslots and MMU notifier")
- Use C-style comments in the whole patch
- Fix wrong count for x86 VCPU stats descriptors
- Fix KVM stats data size counting and validity check in selftest
v2 -> v3
- Rebase to kvm/queue, commit edf408f5257b ("KVM: avoid "deadlock" between install_new_memslots and MMU notifier")
- Resolve some nitpicks about format
v1 -> v2
- Use ARRAY_SIZE to count the number of stats descriptors
- Fix missing `size` field initialization in macro STATS_DESC
[1] https://lore.kernel.org/kvm/20210402224359.2297157-1-jingzhangos@google.com
[2] https://lore.kernel.org/kvm/20210415151741.1607806-1-jingzhangos@google.com
[3] https://lore.kernel.org/kvm/20210423181727.596466-1-jingzhangos@google.com
Jing Zhang (4):
  KVM: stats: Separate common stats from architecture specific ones
  KVM: stats: Add fd-based API to read binary stats data
  KVM: stats: Add documentation for statistics data binary interface
  KVM: selftests: Add selftest for KVM statistics data binary interface
 Documentation/virt/kvm/api.rst                | 171 ++++++++
 arch/arm64/include/asm/kvm_host.h             |   9 +-
 arch/arm64/kvm/guest.c                        |  42 +-
 arch/mips/include/asm/kvm_host.h              |   9 +-
 arch/mips/kvm/mips.c                          |  67 ++-
 arch/powerpc/include/asm/kvm_host.h           |   9 +-
 arch/powerpc/kvm/book3s.c                     |  68 +++-
 arch/powerpc/kvm/book3s_hv.c                  |  12 +-
 arch/powerpc/kvm/book3s_pr.c                  |   2 +-
 arch/powerpc/kvm/book3s_pr_papr.c             |   2 +-
 arch/powerpc/kvm/booke.c                      |  63 ++-
 arch/s390/include/asm/kvm_host.h              |   9 +-
 arch/s390/kvm/kvm-s390.c                      | 133 +++++-
 arch/x86/include/asm/kvm_host.h               |   9 +-
 arch/x86/kvm/x86.c                            |  71 +++-
 include/linux/kvm_host.h                      | 132 +++++-
 include/linux/kvm_types.h                     |  12 +
 include/uapi/linux/kvm.h                      |  50 +++
 tools/testing/selftests/kvm/.gitignore        |   1 +
 tools/testing/selftests/kvm/Makefile          |   3 +
 .../testing/selftests/kvm/include/kvm_util.h  |   3 +
 .../selftests/kvm/kvm_bin_form_stats.c        | 380 ++++++++++++++++++
 tools/testing/selftests/kvm/lib/kvm_util.c    |  11 +
 virt/kvm/kvm_main.c                           | 237 ++++++++++-
 24 files changed, 1415 insertions(+), 90 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/kvm_bin_form_stats.c
base-commit: 9f242010c3b46e63bc62f08fff42cef992d3801b
2.31.1.527.g47e6f16901-goog
Do I need to send another version for this?
Thanks, Jing
On 10/05/21 20:57, Jing Zhang wrote:
Do I need to send another version for this?
No, the merge window has just finished and I wanted to flush the dozens of bugfix patches that I had. I'll get to it shortly.
Paolo