New subject: [PATCH v4 1/4] x86/cpufeatures: Add CPUID feature bit for Idle HLT intercept

22 Oct 2024


The upcoming Idle HLT Intercept feature allows the hypervisor to
intercept a vCPU's HLT instruction only if there are no pending V_INTR
and V_NMI events for that vCPU. When the vCPU is expected to service
pending V_INTR and V_NMI events, the Idle HLT intercept won't trigger.
The feature lets the hypervisor determine whether the vCPU is actually
idle and reduces wasteful VMEXITs.

The Idle HLT intercept feature is used for enlightened guests that wish
to securely handle the events. When an enlightened guest executes HLT
while an interrupt is pending, the hypervisor has no way to figure out
whether the guest needs to be re-entered. With the Idle HLT intercept
feature, HLT is intercepted only if there are no pending V_INTR and
V_NMI events.
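
The intercept condition described above can be written down as a small
truth-table model. This is only a conceptual sketch, not the hardware
logic or the KVM code from this series: struct vcpu_events and
idle_hlt_causes_vmexit() are hypothetical names, and the model assumes
that either a pending V_INTR or a pending V_NMI is enough to suppress
the exit.

```c
#include <stdbool.h>

/* Hypothetical per-vCPU view of pending virtual events (illustration only). */
struct vcpu_events {
	bool v_intr_pending;	/* a virtual interrupt is pending */
	bool v_nmi_pending;	/* a virtual NMI is pending */
};

/*
 * Model of the Idle HLT condition: HLT exits to the hypervisor only
 * when the vCPU has nothing to service (assumption: either pending
 * event suppresses the exit).
 */
static inline bool idle_hlt_causes_vmexit(const struct vcpu_events *ev)
{
	return !ev->v_intr_pending && !ev->v_nmi_pending;
}
```

In this model a vCPU that halts with an event pending keeps running in
guest mode to service it, which is the VMEXIT the feature is meant to
avoid.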
Presence of the Idle HLT Intercept feature is indicated via CPUID
function Fn8000_000A_EDX[30].

Documentation for the Idle HLT intercept feature is available at [1].

This series is based on kvm-next/next (64dbb3a771a1) + [2].
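
For reference, the CPUID bit named above (Fn8000_000A_EDX[30]) can be
probed from user space roughly as follows. This is a minimal sketch,
not code from this series: it assumes GCC or Clang's <cpuid.h>, and the
macro and function names are made up for illustration. A complete check
would also confirm that SVM itself is supported before trusting this
leaf.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>	/* GCC/Clang helper providing __get_cpuid() */
#endif

#define SVM_FEATURE_LEAF	0x8000000Au	/* CPUID Fn8000_000A */
#define IDLE_HLT_EDX_BIT	30		/* EDX[30], per the text above */

/* Decode the Idle HLT bit from a raw EDX value (hypothetical helper). */
static inline bool edx_has_idle_hlt(uint32_t edx)
{
	return (edx >> IDLE_HLT_EDX_BIT) & 1u;
}

/*
 * Query the running CPU. Returns false when the leaf is not
 * implemented or when built for a non-x86 target.
 */
static inline bool cpu_has_idle_hlt(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid(SVM_FEATURE_LEAF, &eax, &ebx, &ecx, &edx))
		return false;
	return edx_has_idle_hlt(edx);
#else
	return false;
#endif
}
```

Splitting the bit decode from the CPUID query keeps the decode testable
on any machine, including ones without the feature.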

Experiments done:
----------------

kvm_amd.avic is set to '0' for this experiment.
The numbers below represent the average of 10 runs.

Normal guest (L1)
The netperf command below was run on the guest with smp = 1 (pinned).

netperf -H <host ip> -t TCP_RR -l 60

----------------------------------------------------------------
|with Idle HLT(transactions/Sec)|w/o Idle HLT(transactions/Sec)|
----------------------------------------------------------------
|         25645.7136            |        25773.2796            |
----------------------------------------------------------------

The number of transactions/sec with and without the Idle HLT intercept
feature is almost the same (about 0.5% lower with Idle HLT).

Nested guest (L2)
The netperf command below was run on the L2 guest with smp = 1 (pinned).

netperf -H <host ip> -t TCP_RR -l 60

----------------------------------------------------------------
|with Idle HLT(transactions/Sec)|w/o Idle HLT(transactions/Sec)|
----------------------------------------------------------------
|          5655.4468            |          5755.2189           |
----------------------------------------------------------------

The number of transactions/sec with and without the Idle HLT intercept
feature is almost the same (about 1.7% lower with Idle HLT).
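
The comparison of the netperf figures above can be checked with a few
lines of arithmetic. The helpers below are a hypothetical sketch and
not part of the series; they only compute the relative change of the
"with Idle HLT" average against the "w/o Idle HLT" average.

```c
#include <stdio.h>

/* Relative change of `with` versus the baseline `without`, in percent. */
static inline double relative_change_pct(double with, double without)
{
	return (with - without) / without * 100.0;
}

/* Print one comparison line for a labelled pair of averages. */
static inline void print_delta(const char *label, double with, double without)
{
	printf("%s: %+.2f%%\n", label, relative_change_pct(with, without));
}
```

Calling print_delta("L1", 25645.7136, 25773.2796) and
print_delta("L2", 5655.4468, 5755.2189) reports the two differences as
negative percentages, i.e. slightly fewer transactions/sec with the
feature enabled.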
Testing Done:
- Tested the functionality for the Idle HLT intercept feature
  using selftest svm_idle_hlt_test.
- Tested SEV and SEV-ES guests for the Idle HLT intercept functionality.
- Tested the Idle HLT intercept functionality on a nested guest.
v3 -> v4
- Moved the patches that add vcpu_get_stat() into a separate series [2].
- Added nested Idle HLT intercept support.
v2 -> v3
- Incorporated Andrew's suggestion to structure vcpu_stat_types in
  a way that each architecture can share the generic types and also
  provide its own.
v1 -> v2
- Made changes to svm_idle_hlt_test based on the review comments from Sean.
- Added an enum-based approach to get binary stats in vcpu_get_stat(), which
  avoids string lookups for stat data, based on the comments from Sean.
- Added self_halt() and cli() helpers based on the comments from Sean.
[1]: AMD64 Architecture Programmer's Manual Pub. 24593, April 2024,
     Vol 2, 15.9 Instruction Intercepts (Table 15-7: IDLE_HLT).
     https://bugzilla.kernel.org/attachment.cgi?id=306250
[2]: https://lore.kernel.org/kvm/20241021062226.108657-1-manali.shukla@amd.com/T/...
Manali Shukla (4):
  x86/cpufeatures: Add CPUID feature bit for Idle HLT intercept
  KVM: SVM: Add Idle HLT intercept support
  KVM: nSVM: implement the nested idle halt intercept
  KVM: selftests: KVM: SVM: Add Idle HLT intercept test
 arch/x86/include/asm/cpufeatures.h            |  1 +
 arch/x86/include/asm/svm.h                    |  1 +
 arch/x86/include/uapi/asm/svm.h               |  2 +
 arch/x86/kvm/governed_features.h              |  1 +
 arch/x86/kvm/svm/nested.c                     |  7 ++
 arch/x86/kvm/svm/svm.c                        | 15 +++-
 tools/testing/selftests/kvm/Makefile          |  1 +
 .../selftests/kvm/include/x86_64/processor.h  |  1 +
 .../selftests/kvm/x86_64/svm_idle_hlt_test.c  | 89 +++++++++++++++++++
 9 files changed, 115 insertions(+), 3 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/x86_64/svm_idle_hlt_test.c
base-commit: c8d430db8eec7d4fd13a6bea27b7086a54eda6da
prerequisite-patch-id: ca912571db5c004f77b70843b8dd35517ff1267f
prerequisite-patch-id: 164ea3b4346f9e04bc69819278d20f5e1b5df5ed
prerequisite-patch-id: 90d870f426ebc2cec43c0dd89b701ee998385455
prerequisite-patch-id: 45812b799c517a4521782a1fdbcda881237e1eda
-- 
2.34.1

[PATCH v4 0/4] Add support for the Idle HLT intercept feature
