Since commit e87412e621f1 ("integrate Zaamo and Zalrsc text (#1304)"),
the A extension has been described as a set of instructions provided by
Zaamo and Zalrsc. Add these two extensions.
This series is based on the Zc one [1].
Link: https://lore.kernel.org/linux-riscv/20240619113529.676940-1-cleger@rivosinc…
---
Clément Léger (5):
dt-bindings: riscv: add Zaamo and Zalrsc ISA extension description
riscv: add parsing for Zaamo and Zalrsc extensions
riscv: hwprobe: export Zaamo and Zalrsc extensions
RISC-V: KVM: Allow Zaamo/Zalrsc extensions for Guest/VM
KVM: riscv: selftests: Add Zaamo/Zalrsc extensions to get-reg-list
test
Documentation/arch/riscv/hwprobe.rst | 8 ++++++++
.../devicetree/bindings/riscv/extensions.yaml | 19 +++++++++++++++++++
arch/riscv/include/asm/hwcap.h | 2 ++
arch/riscv/include/uapi/asm/hwprobe.h | 2 ++
arch/riscv/include/uapi/asm/kvm.h | 2 ++
arch/riscv/kernel/cpufeature.c | 9 ++++++++-
arch/riscv/kernel/sys_hwprobe.c | 2 ++
arch/riscv/kvm/vcpu_onereg.c | 4 ++++
.../selftests/kvm/riscv/get-reg-list.c | 8 ++++++++
9 files changed, 55 insertions(+), 1 deletion(-)
--
2.45.2
`MFD_NOEXEC_SEAL` should remove the executable bits and set `F_SEAL_EXEC`
to prevent further modifications to the executable bits as per the comment
in the uapi header file:
not executable and sealed to prevent changing to executable
However, commit 105ff5339f498a ("mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC")
that introduced this feature made it so that `MFD_NOEXEC_SEAL` unsets
`F_SEAL_SEAL`, essentially acting as a superset of `MFD_ALLOW_SEALING`.
Nothing implies that it should be so, and indeed up until the second version
of the of the patchset[0] that introduced `MFD_EXEC` and `MFD_NOEXEC_SEAL`,
`F_SEAL_SEAL` was not removed, however, it was changed in the third revision
of the patchset[1] without a clear explanation.
This behaviour is surprising for application developers, there is no
documentation that would reveal that `MFD_NOEXEC_SEAL` has the additional
effect of `MFD_ALLOW_SEALING`. Additionally, combined with `vm.memfd_noexec=2`
it has the effect of making all memfds initially sealable.
So do not remove `F_SEAL_SEAL` when `MFD_NOEXEC_SEAL` is requested,
thereby returning to the pre-Linux 6.3 behaviour of only allowing
sealing when `MFD_ALLOW_SEALING` is specified.
Now, this is technically a uapi break. However, the damage is expected
to be minimal. To trigger user visible change, a program has to do the
following steps:
- create memfd:
- with `MFD_NOEXEC_SEAL`,
- without `MFD_ALLOW_SEALING`;
- try to add seals / check the seals.
But that seems unlikely to happen intentionally since this change
essentially reverts the kernel's behaviour to that of Linux <6.3,
so if a program worked correctly on those older kernels, it will
likely work correctly after this change.
I have used Debian Code Search and GitHub to try to find potential
breakages, and I could only find a single one. dbus-broker's
memfd_create() wrapper is aware of this implicit `MFD_ALLOW_SEALING`
behaviour, and tries to work around it[2]. This workaround will
break. Luckily, this only affects the test suite, it does not affect
the normal operations of dbus-broker. There is a PR with a fix[3].
I also carried out a smoke test by building a kernel with this change
and booting an Arch Linux system into GNOME and Plasma sessions.
There was also a previous attempt to address this peculiarity by
introducing a new flag[4].
[0]: https://lore.kernel.org/lkml/20220805222126.142525-3-jeffxu@google.com/
[1]: https://lore.kernel.org/lkml/20221202013404.163143-3-jeffxu@google.com/
[2]: https://github.com/bus1/dbus-broker/blob/9eb0b7e5826fc76cad7b025bc46f267d4a…
[3]: https://github.com/bus1/dbus-broker/pull/366
[4]: https://lore.kernel.org/lkml/20230714114753.170814-1-david@readahead.eu/
Cc: stable(a)vger.kernel.org
Signed-off-by: Barnabás Pőcze <pobrn(a)protonmail.com>
---
* v3: https://lore.kernel.org/linux-mm/20240611231409.3899809-1-jeffxu@chromium.o…
* v2: https://lore.kernel.org/linux-mm/20240524033933.135049-1-jeffxu@google.com/
* v1: https://lore.kernel.org/linux-mm/20240513191544.94754-1-pobrn@protonmail.co…
This fourth version returns to removing the inconsistency as opposed to documenting
its existence, with the same code change as v1 but with a somewhat extended commit
message. This is sent because I believe it is worth at least a try; it can be easily
reverted if bigger application breakages are discovered than initially imagined.
---
mm/memfd.c | 9 ++++-----
tools/testing/selftests/memfd/memfd_test.c | 2 +-
2 files changed, 5 insertions(+), 6 deletions(-)
diff --git a/mm/memfd.c b/mm/memfd.c
index 7d8d3ab3fa37..8b7f6afee21d 100644
--- a/mm/memfd.c
+++ b/mm/memfd.c
@@ -356,12 +356,11 @@ SYSCALL_DEFINE2(memfd_create,
inode->i_mode &= ~0111;
file_seals = memfd_file_seals_ptr(file);
- if (file_seals) {
- *file_seals &= ~F_SEAL_SEAL;
+ if (file_seals)
*file_seals |= F_SEAL_EXEC;
- }
- } else if (flags & MFD_ALLOW_SEALING) {
- /* MFD_EXEC and MFD_ALLOW_SEALING are set */
+ }
+
+ if (flags & MFD_ALLOW_SEALING) {
file_seals = memfd_file_seals_ptr(file);
if (file_seals)
*file_seals &= ~F_SEAL_SEAL;
diff --git a/tools/testing/selftests/memfd/memfd_test.c b/tools/testing/selftests/memfd/memfd_test.c
index 95af2d78fd31..7b78329f65b6 100644
--- a/tools/testing/selftests/memfd/memfd_test.c
+++ b/tools/testing/selftests/memfd/memfd_test.c
@@ -1151,7 +1151,7 @@ static void test_noexec_seal(void)
mfd_def_size,
MFD_CLOEXEC | MFD_NOEXEC_SEAL);
mfd_assert_mode(fd, 0666);
- mfd_assert_has_seals(fd, F_SEAL_EXEC);
+ mfd_assert_has_seals(fd, F_SEAL_SEAL | F_SEAL_EXEC);
mfd_fail_chmod(fd, 0777);
close(fd);
}
--
2.45.2
The upcoming new Idle HLT Intercept feature allows for the HLT
instruction execution by a vCPU to be intercepted by the hypervisor
only if there are no pending V_INTR and V_NMI events for the vCPU.
When the vCPU is expected to service the pending V_INTR and V_NMI
events, the Idle HLT intercept won’t trigger. The feature allows the
hypervisor to determine if the vCPU is actually idle and reduces
wasteful VMEXITs.
Presence of the Idle HLT Intercept feature is indicated via CPUID
function Fn8000_000A_EDX[30].
Document for the Idle HLT intercept feature is available at [1].
[1]: AMD64 Architecture Programmer's Manual Pub. 24593, April 2024,
Vol 2, 15.9 Instruction Intercepts (Table 15-7: IDLE_HLT).
https://bugzilla.kernel.org/attachment.cgi?id=306250
Testing Done:
- Added a selftest to test the Idle HLT intercept functionality.
- Compile and functionality testing for the Idle HLT intercept selftest
are only done for x86_64.
- Tested SEV and SEV-ES guest for the Idle HLT intercept functionality.
v2 -> v3
- Incorporated Andrew's suggestion to structure vcpu_stat_types in
a way that each architecture can share the generic types and also
provide its own.
v1 -> v2
- Done changes in svm_idle_hlt_test based on the review comments from Sean.
- Added an enum based approach to get binary stats in vcpu_get_stat() which
doesn't use string to get stat data based on the comments from Sean.
- Added self_halt() and cli() helpers based on the comments from Sean.
Manali Shukla (5):
x86/cpufeatures: Add CPUID feature bit for Idle HLT intercept
KVM: SVM: Add Idle HLT intercept support
KVM: selftests: Add safe_halt() and cli() helpers to common code
KVM: selftests: Add an interface to read the data of named vcpu stat
KVM: selftests: KVM: SVM: Add Idle HLT intercept test
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/svm.h | 1 +
arch/x86/include/uapi/asm/svm.h | 2 +
arch/x86/kvm/svm/svm.c | 11 ++-
tools/testing/selftests/kvm/Makefile | 1 +
.../testing/selftests/kvm/include/kvm_util.h | 44 +++++++++
.../kvm/include/x86_64/kvm_util_arch.h | 40 +++++++++
.../selftests/kvm/include/x86_64/processor.h | 18 ++++
tools/testing/selftests/kvm/lib/kvm_util.c | 32 +++++++
.../selftests/kvm/x86_64/svm_idle_hlt_test.c | 89 +++++++++++++++++++
10 files changed, 236 insertions(+), 3 deletions(-)
create mode 100644 tools/testing/selftests/kvm/x86_64/svm_idle_hlt_test.c
base-commit: d91a9cc16417b8247213a0144a1f0fd61dc855dd
--
2.34.1
1. In order to make rtctest more explicit and robust, we propose to use
RTC_PARAM_GET ioctl interface to check rtc alarm feature state before
running alarm related tests.
2. The rtctest requires the read permission on /dev/rtc0. The rtctest will
be skipped if the /dev/rtc0 is not readable.
Joseph Jang (2):
selftest: rtc: Add to check rtc alarm status for alarm related test
selftest: rtc: Check if could access /dev/rtc0 before testing
tools/testing/selftests/rtc/Makefile | 2 +-
tools/testing/selftests/rtc/rtctest.c | 71 ++++++++++++++++++++++++++-
2 files changed, 71 insertions(+), 2 deletions(-)
--
2.34.1
Series takes care of few bugs and missing features with the aim to improve
the test coverage of sockmap/sockhash.
Last patch is a create_pair() rewrite making use of
__attribute__((cleanup)) to handle socket fd lifetime.
Signed-off-by: Michal Luczaj <mhal(a)rbox.co>
---
Changes in v2:
- Rebase on bpf-next (Jakub)
- Use cleanup helpers from kernel's cleanup.h (Jakub)
- Fix subject of patch 3, rephrase patch 4, use correct prefix
- Link to v1: https://lore.kernel.org/r/20240724-sockmap-selftest-fixes-v1-0-46165d224712…
Changes in v1:
- No declarations in function body (Jakub)
- Don't touch output arguments until function succeeds (Jakub)
- Link to v0: https://lore.kernel.org/netdev/027fdb41-ee11-4be0-a493-22f28a1abd7c@rbox.co/
---
Michal Luczaj (6):
selftests/bpf: Support more socket types in create_pair()
selftests/bpf: Socket pair creation, cleanups
selftests/bpf: Simplify inet_socketpair() and vsock_socketpair_connectible()
selftests/bpf: Honour the sotype of af_unix redir tests
selftests/bpf: Exercise SOCK_STREAM unix_inet_redir_to_connected()
selftests/bpf: Introduce __attribute__((cleanup)) in create_pair()
.../selftests/bpf/prog_tests/sockmap_basic.c | 28 ++--
.../selftests/bpf/prog_tests/sockmap_helpers.h | 149 ++++++++++++++-------
.../selftests/bpf/prog_tests/sockmap_listen.c | 117 ++--------------
3 files changed, 124 insertions(+), 170 deletions(-)
---
base-commit: 92cc2456e9775dc4333fb4aa430763ae4ac2f2d9
change-id: 20240729-selftest-sockmap-fixes-bcca996e143b
Best regards,
--
Michal Luczaj <mhal(a)rbox.co>
This patch series adds unit tests for the clk fixed rate basic type and
the clk registration functions that use struct clk_parent_data. To get
there, we add support for loading device tree overlays onto the live DTB
along with probing platform drivers to bind to device nodes in the
overlays. With this series, we're able to exercise some of the code in
the common clk framework that uses devicetree lookups to find parents
and the fixed rate clk code that scans device tree directly and creates
clks. Please review.
I Cced everyone to all the patches so they get the full context. I'm
hoping I can take the whole pile through the clk tree as they all build
upon each other. Or the DT part can be merged through the DT tree to
reduce the dependencies.
Changes from v7: https://lore.kernel.org/r/20240710201246.1802189-1-sboyd@kernel.org
* Support modular builds properly by compiling overlay with tests into
one .ko
* Fold in thinko fix from Geert to DT overlay application patch
* Export device_is_bound() to fix module build
* Add more module license and description
Changes from v6: https://lore.kernel.org/r/20240706045454.215701-1-sboyd@kernel.org
* Fix kasan error in platform test by fixing the condition to check for
correct free callback
* Add module descriptions to new modules
Changes from v5: https://lore.kernel.org/r/20240603223811.3815762-1-sboyd@kernel.org
* Pick up reviewed-by tags
* Drop test vendor prefix bindings as dtschema allows anything now
* Use of_node_put_kunit() more to plug some reference leaks
* Select DTC config to avoid compile fails because of missing dtc
* Don't skip for OF_OVERLAY in overlay tests because they depend on it
Changes from v4: https://lore.kernel.org/r/20240422232404.213174-1-sboyd@kernel.org
* Picked up reviewed-by tags
* Check for non-NULL device pointers before calling put_device()
* Fix CFI issues with kunit actions
* Introduce platform_device_prepare_wait_for_probe() helper to wait for
a platform device to probe
* Move platform code to lib/kunit and rename functions to have kunit
prefix
* Fix issue with platform wrappers messing up reference counting
because they used kunit actions
* New patch to populate overlay devices on root node for powerpc
* Make fixed-rate binding generic single clk consumer binding
Changes from v3: https://lore.kernel.org/r/20230327222159.3509818-1-sboyd@kernel.org
* No longer depend on Frank's series[1] because it was merged upstream[2]
* Use kunit_add_action_or_reset() to shorten code
* Skip tests properly when CONFIG_OF_OVERLAY isn't set
Changes from v2: https://lore.kernel.org/r/20230315183729.2376178-1-sboyd@kernel.org
* Overlays don't depend on __symbols__ node
* Depend on Frank's always create root node if CONFIG_OF series[1]
* Added kernel-doc to KUnit API doc
* Fixed some kernel-doc on functions
* More test cases for fixed rate clk
Changes from v1: https://lore.kernel.org/r/20230302013822.1808711-1-sboyd@kernel.org
* Don't depend on UML, use unittest data approach to attach nodes
* Introduce overlay loading API for KUnit
* Move platform_device KUnit code to drivers/base/test
* Use #define macros for constants shared between unit tests and
overlays
* Settle on "test" as a vendor prefix
* Make KUnit wrappers have "_kunit" postfix
[1] https://lore.kernel.org/r/20230317053415.2254616-1-frowand.list@gmail.com
[2] https://lore.kernel.org/r/20240308195737.GA1174908-robh@kernel.org
Stephen Boyd (8):
of/platform: Allow overlays to create platform devices from the root
node
of: Add test managed wrappers for of_overlay_apply()/of_node_put()
dt-bindings: vendor-prefixes: Add "test" vendor for KUnit and friends
of: Add a KUnit test for overlays and test managed APIs
platform: Add test managed platform_device/driver APIs
clk: Add test managed clk provider/consumer APIs
clk: Add KUnit tests for clk fixed rate basic type
clk: Add KUnit tests for clks registered with struct clk_parent_data
Documentation/dev-tools/kunit/api/clk.rst | 10 +
Documentation/dev-tools/kunit/api/index.rst | 21 +
Documentation/dev-tools/kunit/api/of.rst | 13 +
.../dev-tools/kunit/api/platformdevice.rst | 10 +
.../devicetree/bindings/vendor-prefixes.yaml | 2 +
drivers/base/dd.c | 1 +
drivers/clk/.kunitconfig | 2 +
drivers/clk/Kconfig | 11 +
drivers/clk/Makefile | 11 +-
drivers/clk/clk-fixed-rate_test.c | 380 +++++++++++++++
drivers/clk/clk-fixed-rate_test.h | 8 +
drivers/clk/clk_kunit_helpers.c | 207 ++++++++
drivers/clk/clk_parent_data_test.h | 10 +
drivers/clk/clk_test.c | 453 +++++++++++++++++-
drivers/clk/kunit_clk_fixed_rate_test.dtso | 19 +
drivers/clk/kunit_clk_parent_data_test.dtso | 28 ++
drivers/of/.kunitconfig | 1 +
drivers/of/Kconfig | 10 +
drivers/of/Makefile | 3 +
drivers/of/kunit_overlay_test.dtso | 9 +
drivers/of/of_kunit_helpers.c | 77 +++
drivers/of/overlay_test.c | 115 +++++
drivers/of/platform.c | 9 +-
include/kunit/clk.h | 28 ++
include/kunit/of.h | 115 +++++
include/kunit/platform_device.h | 20 +
lib/kunit/Makefile | 4 +-
lib/kunit/platform-test.c | 224 +++++++++
lib/kunit/platform.c | 302 ++++++++++++
29 files changed, 2097 insertions(+), 6 deletions(-)
create mode 100644 Documentation/dev-tools/kunit/api/clk.rst
create mode 100644 Documentation/dev-tools/kunit/api/of.rst
create mode 100644 Documentation/dev-tools/kunit/api/platformdevice.rst
create mode 100644 drivers/clk/clk-fixed-rate_test.c
create mode 100644 drivers/clk/clk-fixed-rate_test.h
create mode 100644 drivers/clk/clk_kunit_helpers.c
create mode 100644 drivers/clk/clk_parent_data_test.h
create mode 100644 drivers/clk/kunit_clk_fixed_rate_test.dtso
create mode 100644 drivers/clk/kunit_clk_parent_data_test.dtso
create mode 100644 drivers/of/kunit_overlay_test.dtso
create mode 100644 drivers/of/of_kunit_helpers.c
create mode 100644 drivers/of/overlay_test.c
create mode 100644 include/kunit/clk.h
create mode 100644 include/kunit/of.h
create mode 100644 include/kunit/platform_device.h
create mode 100644 lib/kunit/platform-test.c
create mode 100644 lib/kunit/platform.c
base-commit: 1613e604df0cd359cf2a7fbd9be7a0bcfacfabd0
--
https://git.kernel.org/pub/scm/linux/kernel/git/clk/linux.git/https://git.kernel.org/pub/scm/linux/kernel/git/sboyd/spmi.git
This patch set enables the Intel flexible return and event delivery
(FRED) architecture with KVM VMX to allow guests to utilize FRED.
The FRED architecture defines simple new transitions that change
privilege level (ring transitions). The FRED architecture was
designed with the following goals:
1) Improve overall performance and response time by replacing event
delivery through the interrupt descriptor table (IDT event
delivery) and event return by the IRET instruction with lower
latency transitions.
2) Improve software robustness by ensuring that event delivery
establishes the full supervisor context and that event return
establishes the full user context.
The new transitions defined by the FRED architecture are FRED event
delivery and, for returning from events, two FRED return instructions.
FRED event delivery can effect a transition from ring 3 to ring 0, but
it is used also to deliver events incident to ring 0. One FRED
instruction (ERETU) effects a return from ring 0 to ring 3, while the
other (ERETS) returns while remaining in ring 0. Collectively, FRED
event delivery and the FRED return instructions are FRED transitions.
Intel VMX architecture is extended to run FRED guests, and the major
changes are:
1) New VMCS fields for FRED context management, which includes two new
event data VMCS fields, eight new guest FRED context VMCS fields and
eight new host FRED context VMCS fields.
2) VMX nested-exception support for proper virtualization of stack
levels introduced with FRED architecture.
Search for the latest FRED spec in most search engines with this search
pattern:
site:intel.com FRED (flexible return and event delivery) specification
As the native FRED patches are committed in the tip tree "x86/fred"
branch:
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/log/?h=x86/fred,
and we have received a good amount of review comments for v1, it's time
to send out v2 based on this branch for further help from the community.
Patch 1-2 are cleanups to VMX basic and misc MSRs, which were sent
out earlier as a preparation for FRED changes:
https://lore.kernel.org/kvm/20240206182032.1596-1-xin3.li@intel.com/T/#u
Patch 3-15 add FRED support to VMX.
Patch 16-21 add FRED support to nested VMX.
Patch 22 exposes FRED and its baseline features to KVM guests.
Patch 23-25 add FRED selftests.
There is also a counterpart qemu patch set for FRED at:
https://lore.kernel.org/qemu-devel/20231109072012.8078-1-xin3.li@intel.com/…,
which works with this patch set to allow KVM to run FRED guests.
Changes since v1:
* Always load the secondary VM exit controls (Sean Christopherson).
* Remove FRED VM entry/exit controls consistency checks in
setup_vmcs_config() (Sean Christopherson).
* Clear FRED VM entry/exit controls if FRED is not enumerated (Chao Gao).
* Use guest_can_use() to trace FRED enumeration in a vcpu (Chao Gao).
* Enable FRED MSRs intercept if FRED is no longer enumerated in CPUID
(Chao Gao).
* Move guest FRED states init into __vmx_vcpu_reset() (Chao Gao).
* Don't use guest_cpuid_has() in vmx_prepare_switch_to_{host,guest}(),
which are called from IRQ-disabled context (Chao Gao).
* Reset msr_guest_fred_rsp0 in __vmx_vcpu_reset() (Chao Gao).
* Fail host requested FRED MSRs access if KVM cannot virtualize FRED
(Chao Gao).
* Handle the case FRED MSRs are valid but KVM cannot virtualize FRED
(Chao Gao).
* Add sanity checks when writing to FRED MSRs.
* Explain why it is ok to only check CR4.FRED in kvm_is_fred_enabled()
(Chao Gao).
* Document event data should be equal to CR2/DR6/IA32_XFD_ERR instead
of using WARN_ON() (Chao Gao).
* Zero event data if a #NM was not caused by extended feature disable
(Chao Gao).
* Set the nested flag when there is an original interrupt (Chao Gao).
* Dump guest FRED states only if guest has FRED enabled (Nikolay Borisov).
* Add a prerequisite to SHADOW_FIELD_R[OW] macros
* Remove hyperv TLFS related changes (Jeremi Piotrowski).
* Use kvm_cpu_cap_has() instead of cpu_feature_enabled() to decouple
KVM's capability to virtualize a feature and host's enabling of a
feature (Chao Gao).
Xin Li (25):
KVM: VMX: Cleanup VMX basic information defines and usages
KVM: VMX: Cleanup VMX misc information defines and usages
KVM: VMX: Add support for the secondary VM exit controls
KVM: x86: Mark CR4.FRED as not reserved
KVM: VMX: Initialize FRED VM entry/exit controls in vmcs_config
KVM: VMX: Defer enabling FRED MSRs save/load until after set CPUID
KVM: VMX: Set intercept for FRED MSRs
KVM: VMX: Initialize VMCS FRED fields
KVM: VMX: Switch FRED RSP0 between host and guest
KVM: VMX: Add support for FRED context save/restore
KVM: x86: Add kvm_is_fred_enabled()
KVM: VMX: Handle FRED event data
KVM: VMX: Handle VMX nested exception for FRED
KVM: VMX: Disable FRED if FRED consistency checks fail
KVM: VMX: Dump FRED context in dump_vmcs()
KVM: VMX: Invoke vmx_set_cpu_caps() before nested setup
KVM: nVMX: Add support for the secondary VM exit controls
KVM: nVMX: Add a prerequisite to SHADOW_FIELD_R[OW] macros
KVM: nVMX: Add FRED VMCS fields
KVM: nVMX: Add support for VMX FRED controls
KVM: nVMX: Add VMCS FRED states checking
KVM: x86: Allow FRED/LKGS/WRMSRNS to be exposed to guests
KVM: selftests: Run debug_regs test with FRED enabled
KVM: selftests: Add a new VM guest mode to run user level code
KVM: selftests: Add fred exception tests
Documentation/virt/kvm/x86/nested-vmx.rst | 19 +
arch/x86/include/asm/kvm_host.h | 8 +-
arch/x86/include/asm/msr-index.h | 15 +-
arch/x86/include/asm/vmx.h | 59 ++-
arch/x86/kvm/cpuid.c | 4 +-
arch/x86/kvm/governed_features.h | 1 +
arch/x86/kvm/kvm_cache_regs.h | 17 +
arch/x86/kvm/svm/svm.c | 4 +-
arch/x86/kvm/vmx/capabilities.h | 30 +-
arch/x86/kvm/vmx/nested.c | 329 ++++++++++++---
arch/x86/kvm/vmx/nested.h | 2 +-
arch/x86/kvm/vmx/vmcs.h | 1 +
arch/x86/kvm/vmx/vmcs12.c | 19 +
arch/x86/kvm/vmx/vmcs12.h | 38 ++
arch/x86/kvm/vmx/vmcs_shadow_fields.h | 80 ++--
arch/x86/kvm/vmx/vmx.c | 385 +++++++++++++++---
arch/x86/kvm/vmx/vmx.h | 15 +-
arch/x86/kvm/x86.c | 103 ++++-
arch/x86/kvm/x86.h | 5 +-
tools/testing/selftests/kvm/Makefile | 1 +
.../selftests/kvm/include/kvm_util_base.h | 1 +
.../selftests/kvm/include/x86_64/processor.h | 36 ++
tools/testing/selftests/kvm/lib/kvm_util.c | 5 +-
.../selftests/kvm/lib/x86_64/processor.c | 15 +-
tools/testing/selftests/kvm/lib/x86_64/vmx.c | 4 +-
.../testing/selftests/kvm/x86_64/debug_regs.c | 50 ++-
.../testing/selftests/kvm/x86_64/fred_test.c | 297 ++++++++++++++
27 files changed, 1320 insertions(+), 223 deletions(-)
create mode 100644 tools/testing/selftests/kvm/x86_64/fred_test.c
base-commit: e13841907b8fda0ae0ce1ec03684665f578416a8
--
2.43.0
Malicious guests can cause bus locks to degrade the performance of a
system. Non-WB (write-back) and misaligned locked RMW
(read-modify-write) instructions are referred to as "bus locks" and
require system wide synchronization among all processors to guarantee
the atomicity. The bus locks can impose notable performance penalties
for all processors within the system.
Support for the Bus Lock Threshold is indicated by CPUID
Fn8000_000A_EDX[29] BusLockThreshold=1, the VMCB provides a Bus Lock
Threshold enable bit and an unsigned 16-bit Bus Lock Threshold count.
VMCB intercept bit
VMCB Offset Bits Function
14h 5 Intercept bus lock operations
Bus lock threshold count
VMCB Offset Bits Function
120h 15:0 Bus lock counter
During VMRUN, the bus lock threshold count is fetched and stored in an
internal count register. Prior to executing a bus lock within the
guest, the processor verifies the count in the bus lock register. If
the count is greater than zero, the processor executes the bus lock,
reducing the count. However, if the count is zero, the bus lock
operation is not performed, and instead, a Bus Lock Threshold #VMEXIT
is triggered to transfer control to the Virtual Machine Monitor (VMM).
A Bus Lock Threshold #VMEXIT is reported to the VMM with VMEXIT code
0xA5h, VMEXIT_BUSLOCK. EXITINFO1 and EXITINFO2 are set to 0 on
a VMEXIT_BUSLOCK. On a #VMEXIT, the processor writes the current
value of the Bus Lock Threshold Counter to the VMCB.
More details about the Bus Lock Threshold feature can be found in AMD
APM [1].
Patches are prepared on kvm-x86/svm (704ec48fc2fb)
Testing done:
- Added a selftest for the Bus Lock Threadshold functionality.
- Tested the Bus Lock Threshold functionality on SEV and SEV-ES guests.
- Tested the Bus Lock Threshold functionality on nested guests.
Qemu changes can be found on:
Repo: https://github.com/AMDESE/qemu.git
Branch: buslock_threshold
Qemu commandline to use the bus lock threshold functionality:
qemu-system-x86_64 -enable-kvm -cpu EPYC-Turin,+svm -M q35,bus-lock-ratelimit=10 \ ..
[1]: AMD64 Architecture Programmer's Manual Pub. 24593, April 2024,
Vol 2, 15.14.5 Bus Lock Threshold.
https://bugzilla.kernel.org/attachment.cgi?id=306250
Manali Shukla (2):
x86/cpufeatures: Add CPUID feature bit for the Bus Lock Threshold
KVM: x86: nSVM: Implement support for nested Bus Lock Threshold
Nikunj A Dadhania (2):
KVM: SVM: Enable Bus lock threshold exit
KVM: selftests: Add bus lock exit test
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/svm.h | 5 +-
arch/x86/include/uapi/asm/svm.h | 2 +
arch/x86/kvm/governed_features.h | 1 +
arch/x86/kvm/svm/nested.c | 25 ++++
arch/x86/kvm/svm/svm.c | 48 ++++++++
arch/x86/kvm/svm/svm.h | 1 +
arch/x86/kvm/x86.h | 1 +
tools/testing/selftests/kvm/Makefile | 1 +
.../selftests/kvm/x86_64/svm_buslock_test.c | 114 ++++++++++++++++++
10 files changed, 198 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/kvm/x86_64/svm_buslock_test.c
base-commit: 704ec48fc2fbd4e41ec982662ad5bf1eee33eeb2
--
2.34.1
Changes v4:
- Printing SNC warnings at the start of every test.
- Printing SNC warnings at the end of every relevant test.
- Remove global snc_mode variable, consolidate snc detection functions
into one.
- Correct minor mistakes.
Changes v3:
- Reworked patch 2.
- Changed minor things in patch 1 like function name and made
corrections to the patch message.
Changes v2:
- Removed patches 2 and 3 since now this part will be supported by the
kernel.
Sub-Numa Clustering (SNC) allows splitting CPU cores, caches and memory
into multiple NUMA nodes. When enabled, NUMA-aware applications can
achieve better performance on bigger server platforms.
SNC support in the kernel was merged into x86/cache [1]. With SNC enabled
and kernel support in place all the tests will function normally (aside
from effective cache size). There might be a problem when SNC is enabled
but the system is still using an older kernel version without SNC
support. Currently the only message displayed in that situation is a
guess that SNC might be enabled and is causing issues. That message also
is displayed whenever the test fails on an Intel platform.
Add a mechanism to discover kernel support for SNC which will add more
meaning and certainty to the error message.
Add runtime SNC mode detection and verify how reliable that information
is.
Series was tested on Ice Lake server platforms with SNC disabled, SNC-2
and SNC-4. The tests were also ran with and without kernel support for
SNC.
Series applies cleanly on kselftest/next.
[1] https://lore.kernel.org/all/20240628215619.76401-1-tony.luck@intel.com/
Previous versions of this series:
[v1] https://lore.kernel.org/all/cover.1709721159.git.maciej.wieczor-retman@inte…
[v2] https://lore.kernel.org/all/cover.1715769576.git.maciej.wieczor-retman@inte…
[v3] https://lore.kernel.org/all/cover.1719842207.git.maciej.wieczor-retman@inte…
Maciej Wieczor-Retman (2):
selftests/resctrl: Adjust effective L3 cache size with SNC enabled
selftests/resctrl: Adjust SNC support messages
tools/testing/selftests/resctrl/cat_test.c | 8 ++
tools/testing/selftests/resctrl/cmt_test.c | 10 +-
tools/testing/selftests/resctrl/mba_test.c | 7 +
tools/testing/selftests/resctrl/mbm_test.c | 9 +-
tools/testing/selftests/resctrl/resctrl.h | 7 +
.../testing/selftests/resctrl/resctrl_tests.c | 8 +-
tools/testing/selftests/resctrl/resctrlfs.c | 130 ++++++++++++++++++
7 files changed, 174 insertions(+), 5 deletions(-)
--
2.45.2