This series aims to clarify the behavior of the KVM_GET_EMULATED_CPUID
ioctl, and fix a corner case where -E2BIG is returned when
the nent field of struct kvm_cpuid2 is matching the amount of
emulated entries that kvm returns.
Patch 1 proposes the nent field fix to cpuid.c,
patch 2 updates the ioctl documentation accordingly and
patches 3 and 4 extend the x86_64/get_cpuid_test.c selftest to check
the intended behavior of KVM_GET_EMULATED_CPUID.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit(a)redhat.com>
---
v3:
- clearer commit message and problem explanation
- pre-initialize the stack variable 'entry' in __do_cpuid_func_emulated
so that the various eax/ebx/ecx are initialized if not set by func.
Emanuele Giuseppe Esposito (4):
KVM: x86: Fix a spurious -E2BIG in KVM_GET_EMULATED_CPUID
Documentation: KVM: update KVM_GET_EMULATED_CPUID ioctl description
selftests: add kvm_get_emulated_cpuid to processor.h
selftests: KVM: extend get_cpuid_test to include
KVM_GET_EMULATED_CPUID
Documentation/virt/kvm/api.rst | 10 +--
arch/x86/kvm/cpuid.c | 33 ++++---
.../selftests/kvm/include/x86_64/processor.h | 1 +
.../selftests/kvm/lib/x86_64/processor.c | 33 +++++++
.../selftests/kvm/x86_64/get_cpuid_test.c | 90 ++++++++++++++++++-
5 files changed, 142 insertions(+), 25 deletions(-)
--
2.30.2
This patchset introduces batched operations for the per-cpu variant of
the array map.
It also introduces a standard way to define per-cpu values via the
'BPF_PERCPU_TYPE()' macro, which handles the alignment transparently.
This was already implemented in the selftests and was merely refactored
out to libbpf, with some simplifications for reuse.
The tests were updated to reflect all the new changes.
Pedro Tammela (3):
bpf: add batched ops support for percpu array
libbpf: selftests: refactor 'BPF_PERCPU_TYPE()' and 'bpf_percpu()'
macros
bpf: selftests: update array map tests for per-cpu batched ops
kernel/bpf/arraymap.c | 2 +
tools/lib/bpf/bpf.h | 10 ++
tools/testing/selftests/bpf/bpf_util.h | 7 --
.../bpf/map_tests/array_map_batch_ops.c | 114 +++++++++++++-----
.../bpf/map_tests/htab_map_batch_ops.c | 48 ++++----
.../selftests/bpf/prog_tests/map_init.c | 5 +-
tools/testing/selftests/bpf/test_maps.c | 16 +--
7 files changed, 133 insertions(+), 69 deletions(-)
--
2.25.1
Changelog
RFC v1-->v2
The timer based test produces run to run variance on some intel based
systems that sport a mechansim of "C-state pre-wake" which can
pre-wake a CPU from an idle state when timers are armed.
Hence invoking the timer tests is now parameterized for systems and
architectures that don't support pre-wakeup logic and need granular
timer measurements along with IPI results.
This RFC does not yet support treating of CPU 0s idle states differently
especially as reported on Intel systems. More understanding is needed
on systems to determine if only CPU 0 is treated differently of if they
are more CPUs that cannot have its idle state properties changed.
RFC v1: https://lkml.org/lkml/2021/3/15/492
---
A kernel module + userspace driver to estimate the wakeup latency
caused by going into stop states. The motivation behind this program is
to find significant deviations behind advertised latency and residency
values.
The patchset measures latencies for two kinds of events. IPIs and Timers
As this is a software-only mechanism, there will additional latencies of
the kernel-firmware-hardware interactions. To account for that, the
program also measures a baseline latency on a 100 percent loaded CPU
and the latencies achieved must be in view relative to that.
To achieve this, we introduce a kernel module and expose its control
knobs through the debugfs interface that the selftests can engage with.
The kernel module provides the following interfaces within
/sys/kernel/debug/latency_test/ for,
IPI test:
ipi_cpu_dest = Destination CPU for the IPI
ipi_cpu_src = Origin of the IPI
ipi_latency_ns = Measured latency time in ns
Timeout test:
timeout_cpu_src = CPU on which the timer to be queued
timeout_expected_ns = Timer duration
timeout_diff_ns = Difference of actual duration vs expected timer
Sample output on a POWER9 system is as follows:
# --IPI Latency Test---
# Baseline Average IPI latency(ns): 3114
# Observed Average IPI latency(ns) - State0: 3265
# Observed Average IPI latency(ns) - State1: 3507
# Observed Average IPI latency(ns) - State2: 3739
# Observed Average IPI latency(ns) - State3: 3807
# Observed Average IPI latency(ns) - State4: 17070
# Observed Average IPI latency(ns) - State5: 1038174
# Observed Average IPI latency(ns) - State6: 1068784
#
# --Timeout Latency Test--
# Baseline Average timeout diff(ns): 1420
# Observed Average timeout diff(ns) - State0: 1640
# Observed Average timeout diff(ns) - State1: 1764
# Observed Average timeout diff(ns) - State2: 1715
# Observed Average timeout diff(ns) - State3: 1845
# Observed Average timeout diff(ns) - State4: 16581
# Observed Average timeout diff(ns) - State5: 939977
# Observed Average timeout diff(ns) - State6: 1073024
Things to keep in mind:
1. This kernel module + bash driver does not guarantee idleness on a
core when the IPI and the Timer is armed. It only invokes sleep and
hopes that the core is idle once the IPI/Timer is invoked onto it.
Hence this program must be run on a completely idle system for best
results
2. Even on a completely idle system, there maybe book-keeping tasks or
jitter tasks that can run on the core we want idle. This can create
outliers in the latency measurement. Thankfully, these outliers
should be large enough to easily weed them out.
3. A userspace only selftest variant was also sent out as RFC based on
suggestions over the previous patchset to simply the kernel
complexeity. However, a userspace only approach had more noise in
the latency measurement due to userspace-kernel interactions
which led to run to run variance and a lesser accurate test.
Another downside of the nature of a userspace program is that it
takes orders of magnitude longer to complete a full system test
compared to the kernel framework.
RFC patch: https://lkml.org/lkml/2020/9/2/356
4. For Intel Systems, the Timer based latencies don't exactly give out
the measure of idle latencies. This is because of a hardware
optimization mechanism that pre-arms a CPU when a timer is set to
wakeup. That doesn't make this metric useless for Intel systems,
it just means that is measuring IPI/Timer responding latency rather
than idle wakeup latencies.
(Source: https://lkml.org/lkml/2020/9/2/610)
For solution to this problem, a hardware based latency analyzer is
devised by Artem Bityutskiy from Intel.
https://youtu.be/Opk92aQyvt0?t=8266https://intel.github.io/wult/
Pratik Rajesh Sampat (2):
cpuidle: Extract IPI based and timer based wakeup latency from idle
states
selftest/cpuidle: Add support for cpuidle latency measurement
drivers/cpuidle/Makefile | 1 +
drivers/cpuidle/test-cpuidle_latency.c | 157 ++++++++++
lib/Kconfig.debug | 10 +
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/cpuidle/Makefile | 6 +
tools/testing/selftests/cpuidle/cpuidle.sh | 323 +++++++++++++++++++++
tools/testing/selftests/cpuidle/settings | 2 +
7 files changed, 500 insertions(+)
create mode 100644 drivers/cpuidle/test-cpuidle_latency.c
create mode 100644 tools/testing/selftests/cpuidle/Makefile
create mode 100755 tools/testing/selftests/cpuidle/cpuidle.sh
create mode 100644 tools/testing/selftests/cpuidle/settings
--
2.17.1
This patchset provides a file descriptor for every VM and VCPU to read
KVM statistics data in binary format.
It is meant to provide a lightweight, flexible, scalable and efficient
lock-free solution for user space telemetry applications to pull the
statistics data periodically for large scale systems. The pulling
frequency could be as high as a few times per second.
In this patchset, every statistics data are treated to have some
attributes as below:
* architecture dependent or common
* VM statistics data or VCPU statistics data
* type: cumulative, instantaneous,
* unit: none for simple counter, nanosecond, microsecond,
millisecond, second, Byte, KiByte, MiByte, GiByte. Clock Cycles
Since no lock/synchronization is used, the consistency between all
the statistics data is not guaranteed. That means not all statistics
data are read out at the exact same time, since the statistics date
are still being updated by KVM subsystems while they are read out.
Jing Zhang (4):
KVM: stats: Separate common stats from architecture specific ones
KVM: stats: Add fd-based API to read binary stats data
KVM: stats: Add documentation for statistics data binary interface
KVM: selftests: Add selftest for KVM statistics data binary interface
Documentation/virt/kvm/api.rst | 169 ++++++++
arch/arm64/include/asm/kvm_host.h | 9 +-
arch/arm64/kvm/guest.c | 42 +-
arch/mips/include/asm/kvm_host.h | 9 +-
arch/mips/kvm/mips.c | 67 +++-
arch/powerpc/include/asm/kvm_host.h | 9 +-
arch/powerpc/kvm/book3s.c | 68 +++-
arch/powerpc/kvm/book3s_hv.c | 12 +-
arch/powerpc/kvm/book3s_pr.c | 2 +-
arch/powerpc/kvm/book3s_pr_papr.c | 2 +-
arch/powerpc/kvm/booke.c | 63 ++-
arch/s390/include/asm/kvm_host.h | 9 +-
arch/s390/kvm/kvm-s390.c | 133 ++++++-
arch/x86/include/asm/kvm_host.h | 9 +-
arch/x86/kvm/x86.c | 71 +++-
include/linux/kvm_host.h | 132 ++++++-
include/linux/kvm_types.h | 12 +
include/uapi/linux/kvm.h | 48 +++
tools/testing/selftests/kvm/.gitignore | 1 +
tools/testing/selftests/kvm/Makefile | 3 +
.../testing/selftests/kvm/include/kvm_util.h | 3 +
.../selftests/kvm/kvm_bin_form_stats.c | 370 ++++++++++++++++++
tools/testing/selftests/kvm/lib/kvm_util.c | 11 +
virt/kvm/kvm_main.c | 237 ++++++++++-
24 files changed, 1401 insertions(+), 90 deletions(-)
create mode 100644 tools/testing/selftests/kvm/kvm_bin_form_stats.c
base-commit: f96be2deac9bca3ef5a2b0b66b71fcef8bad586d
--
2.31.0.208.g409f899ff0-goog
The current way to provide a no-op flag to 'bpf_ringbuf_submit()',
'bpf_ringbuf_discard()' and 'bpf_ringbuf_output()' is to provide a '0'
value.
A '0' value might notify the consumer if it already caught up in processing,
so let's provide a more descriptive notation for this value.
Signed-off-by: Pedro Tammela <pctammela(a)mojatatu.com>
---
include/uapi/linux/bpf.h | 8 ++++++++
tools/include/uapi/linux/bpf.h | 8 ++++++++
tools/testing/selftests/bpf/progs/ima.c | 2 +-
tools/testing/selftests/bpf/progs/ringbuf_bench.c | 2 +-
tools/testing/selftests/bpf/progs/test_ringbuf.c | 2 +-
tools/testing/selftests/bpf/progs/test_ringbuf_multi.c | 2 +-
6 files changed, 20 insertions(+), 4 deletions(-)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 598716742593..100cb2e4c104 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -4058,6 +4058,8 @@ union bpf_attr {
* Copy *size* bytes from *data* into a ring buffer *ringbuf*.
* If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
* of new data availability is sent.
+ * If **BPF_RB_MAY_WAKEUP** is specified in *flags*, notification
+ * of new data availability is sent if needed.
* If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
* of new data availability is sent unconditionally.
* Return
@@ -4066,6 +4068,7 @@ union bpf_attr {
* void *bpf_ringbuf_reserve(void *ringbuf, u64 size, u64 flags)
* Description
* Reserve *size* bytes of payload in a ring buffer *ringbuf*.
+ * *flags* must be 0.
* Return
* Valid pointer with *size* bytes of memory available; NULL,
* otherwise.
@@ -4075,6 +4078,8 @@ union bpf_attr {
* Submit reserved ring buffer sample, pointed to by *data*.
* If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
* of new data availability is sent.
+ * If **BPF_RB_MAY_WAKEUP** is specified in *flags*, notification
+ * of new data availability is sent if needed.
* If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
* of new data availability is sent unconditionally.
* Return
@@ -4085,6 +4090,8 @@ union bpf_attr {
* Discard reserved ring buffer sample, pointed to by *data*.
* If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
* of new data availability is sent.
+ * If **BPF_RB_MAY_WAKEUP** is specified in *flags*, notification
+ * of new data availability is sent if needed.
* If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
* of new data availability is sent unconditionally.
* Return
@@ -4965,6 +4972,7 @@ enum {
* BPF_FUNC_bpf_ringbuf_output flags.
*/
enum {
+ BPF_RB_MAY_WAKEUP = 0,
BPF_RB_NO_WAKEUP = (1ULL << 0),
BPF_RB_FORCE_WAKEUP = (1ULL << 1),
};
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index ab9f2233607c..3d6d324184c0 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -4058,6 +4058,8 @@ union bpf_attr {
* Copy *size* bytes from *data* into a ring buffer *ringbuf*.
* If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
* of new data availability is sent.
+ * If **BPF_RB_MAY_WAKEUP** is specified in *flags*, notification
+ * of new data availability is sent if needed.
* If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
* of new data availability is sent unconditionally.
* Return
@@ -4066,6 +4068,7 @@ union bpf_attr {
* void *bpf_ringbuf_reserve(void *ringbuf, u64 size, u64 flags)
* Description
* Reserve *size* bytes of payload in a ring buffer *ringbuf*.
+ * *flags* must be 0.
* Return
* Valid pointer with *size* bytes of memory available; NULL,
* otherwise.
@@ -4075,6 +4078,8 @@ union bpf_attr {
* Submit reserved ring buffer sample, pointed to by *data*.
* If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
* of new data availability is sent.
+ * If **BPF_RB_MAY_WAKEUP** is specified in *flags*, notification
+ * of new data availability is sent if needed.
* If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
* of new data availability is sent unconditionally.
* Return
@@ -4085,6 +4090,8 @@ union bpf_attr {
* Discard reserved ring buffer sample, pointed to by *data*.
* If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
* of new data availability is sent.
+ * If **BPF_RB_MAY_WAKEUP** is specified in *flags*, notification
+ * of new data availability is sent if needed.
* If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
* of new data availability is sent unconditionally.
* Return
@@ -4959,6 +4966,7 @@ enum {
* BPF_FUNC_bpf_ringbuf_output flags.
*/
enum {
+ BPF_RB_MAY_WAKEUP = 0,
BPF_RB_NO_WAKEUP = (1ULL << 0),
BPF_RB_FORCE_WAKEUP = (1ULL << 1),
};
diff --git a/tools/testing/selftests/bpf/progs/ima.c b/tools/testing/selftests/bpf/progs/ima.c
index 96060ff4ffc6..0f4daced6aad 100644
--- a/tools/testing/selftests/bpf/progs/ima.c
+++ b/tools/testing/selftests/bpf/progs/ima.c
@@ -38,7 +38,7 @@ void BPF_PROG(ima, struct linux_binprm *bprm)
return;
*sample = ima_hash;
- bpf_ringbuf_submit(sample, 0);
+ bpf_ringbuf_submit(sample, BPF_RB_MAY_WAKEUP);
}
return;
diff --git a/tools/testing/selftests/bpf/progs/ringbuf_bench.c b/tools/testing/selftests/bpf/progs/ringbuf_bench.c
index 123607d314d6..808e2e0e3d64 100644
--- a/tools/testing/selftests/bpf/progs/ringbuf_bench.c
+++ b/tools/testing/selftests/bpf/progs/ringbuf_bench.c
@@ -24,7 +24,7 @@ static __always_inline long get_flags()
long sz;
if (!wakeup_data_size)
- return 0;
+ return BPF_RB_MAY_WAKEUP;
sz = bpf_ringbuf_query(&ringbuf, BPF_RB_AVAIL_DATA);
return sz >= wakeup_data_size ? BPF_RB_FORCE_WAKEUP : BPF_RB_NO_WAKEUP;
diff --git a/tools/testing/selftests/bpf/progs/test_ringbuf.c b/tools/testing/selftests/bpf/progs/test_ringbuf.c
index 8ba9959b036b..03a5cbd21356 100644
--- a/tools/testing/selftests/bpf/progs/test_ringbuf.c
+++ b/tools/testing/selftests/bpf/progs/test_ringbuf.c
@@ -21,7 +21,7 @@ struct {
/* inputs */
int pid = 0;
long value = 0;
-long flags = 0;
+long flags = BPF_RB_MAY_WAKEUP;
/* outputs */
long total = 0;
diff --git a/tools/testing/selftests/bpf/progs/test_ringbuf_multi.c b/tools/testing/selftests/bpf/progs/test_ringbuf_multi.c
index edf3b6953533..f33c3fdfb1d6 100644
--- a/tools/testing/selftests/bpf/progs/test_ringbuf_multi.c
+++ b/tools/testing/selftests/bpf/progs/test_ringbuf_multi.c
@@ -71,7 +71,7 @@ int test_ringbuf(void *ctx)
sample->seq = total;
total += 1;
- bpf_ringbuf_submit(sample, 0);
+ bpf_ringbuf_submit(sample, BPF_RB_MAY_WAKEUP);
return 0;
}
--
2.25.1
v1 by Uriel is here: [1].
Since it's been a while, I've dropped the Reviewed-By's.
It depended on commit 83c4e7a0363b ("KUnit: KASAN Integration") which
hadn't been merged yet, so that caused some kerfuffle with applying them
previously and the series was reverted.
This revives the series but makes the kunit_fail_current_test() function
take a format string and logs the file and line number of the failing
code, addressing Alan Maguire's comments on the previous version.
As a result, the patch that makes UBSAN errors was tweaked slightly to
include an error message.
v2 -> v3:
Try and fail to make kunit_fail_current_test() work on CONFIG_KUNIT=m
s/_/__ on the helper func to match others in test.c
v3 -> v4:
Revert to only enabling kunit_fail_current_test() for CONFIG_KUNIT=y
[1] https://lore.kernel.org/linux-kselftest/20200806174326.3577537-1-urielguaja…
Uriel Guajardo (2):
kunit: support failure from dynamic analysis tools
kunit: ubsan integration
include/kunit/test-bug.h | 30 ++++++++++++++++++++++++++++++
lib/kunit/test.c | 39 +++++++++++++++++++++++++++++++++++----
lib/ubsan.c | 3 +++
3 files changed, 68 insertions(+), 4 deletions(-)
create mode 100644 include/kunit/test-bug.h
base-commit: a74e6a014c9d4d4161061f770c9b4f98372ac778
--
2.31.0.rc2.261.g7f71774620-goog
This series improves the defensive posture of sysfs's use of seq_file
to gain the vmap guard pages at the end of vmalloc buffers to stop a
class of recurring flaw[1]. The long-term goal is to switch sysfs from
a buffer to using seq_file directly, but this will take time to refactor.
Included is also a Clang fix for NULL arithmetic and an LKDTM test to
validate vmalloc guard pages.
v4:
- fix NULL arithmetic (Arnd)
- add lkdtm test
- reword commit message
v3: https://lore.kernel.org/lkml/20210401022145.2019422-1-keescook@chromium.org/
v2: https://lore.kernel.org/lkml/20210315174851.622228-1-keescook@chromium.org/
v1: https://lore.kernel.org/lkml/20210312205558.2947488-1-keescook@chromium.org/
Thanks!
-Kees
Arnd Bergmann (1):
seq_file: Fix clang warning for NULL pointer arithmetic
Kees Cook (2):
lkdtm/heap: Add vmalloc linear overflow test
sysfs: Unconditionally use vmalloc for buffer
drivers/misc/lkdtm/core.c | 3 ++-
drivers/misc/lkdtm/heap.c | 21 +++++++++++++++++-
drivers/misc/lkdtm/lkdtm.h | 3 ++-
fs/kernfs/file.c | 9 +++++---
fs/seq_file.c | 5 ++++-
fs/sysfs/file.c | 29 +++++++++++++++++++++++++
include/linux/seq_file.h | 6 +++++
tools/testing/selftests/lkdtm/tests.txt | 3 ++-
8 files changed, 71 insertions(+), 8 deletions(-)
--
2.25.1
v1 by Uriel is here: [1].
Since it's been a while, I've dropped the Reviewed-By's.
It depended on commit 83c4e7a0363b ("KUnit: KASAN Integration") which
hadn't been merged yet, so that caused some kerfuffle with applying them
previously and the series was reverted.
This revives the series but makes the kunit_fail_current_test() function
take a format string and logs the file and line number of the failing
code, addressing Alan Maguire's comments on the previous version.
As a result, the patch that makes UBSAN errors was tweaked slightly to
include an error message.
v2 -> v3:
Try and fail to make kunit_fail_current_test() work on CONFIG_KUNIT=m
s/_/__ on the helper func to match others in test.c
v3 -> v4:
Revert to only enabling kunit_fail_current_test() for CONFIG_KUNIT=y
v4 -> v5:
Delete blank line to make checkpatch.pl --strict happy
[1] https://lore.kernel.org/linux-kselftest/20200806174326.3577537-1-urielguaja…
Uriel Guajardo (2):
kunit: support failure from dynamic analysis tools
kunit: ubsan integration
include/kunit/test-bug.h | 29 +++++++++++++++++++++++++++++
lib/kunit/test.c | 39 +++++++++++++++++++++++++++++++++++----
lib/ubsan.c | 3 +++
3 files changed, 67 insertions(+), 4 deletions(-)
create mode 100644 include/kunit/test-bug.h
base-commit: 1678e493d530e7977cce34e59a86bb86f3c5631e
--
2.31.0.208.g409f899ff0-goog