Make the default .kunitconfig (specified in
arch/um/configs/kunit_defconfig) enable CONFIG_KUNIT_ALL_TESTS.
KUNIT_ALL_TESTS runs all tests whose dependencies are satisfied in the
current .config (which would be the architecture defconfig).
Currently, the default .kunitconfig enables only the example tests and
KUnit's own tests. While this does provide a good example of what a
.kunitconfig for running a few individual tests should look like, it
does mean that kunit_tool runs a pretty paltry collection of tests by
default.
A default run of ./tools/testing/kunit/kunit.py run now runs 70 tests
instead of 14.
Signed-off-by: David Gow <davidgow(a)google.com>
---
arch/um/configs/kunit_defconfig | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/arch/um/configs/kunit_defconfig b/arch/um/configs/kunit_defconfig
index 9235b7d42d38..becf3432a375 100644
--- a/arch/um/configs/kunit_defconfig
+++ b/arch/um/configs/kunit_defconfig
@@ -1,3 +1,2 @@
CONFIG_KUNIT=y
-CONFIG_KUNIT_TEST=y
-CONFIG_KUNIT_EXAMPLE_TEST=y
+CONFIG_KUNIT_ALL_TESTS=y
--
2.31.1.751.gd2f1c929bd-goog
Base
====
These patches are based upon Andrew Morton's v5.13-rc1-mmots-2021-05-10-22-15
tag. This is because this series depends on:
- UFFD minor fault support for hugetlbfs (in v5.13-rc1) [1]
- UFFD minor fault support for shmem (in Andrew's tree) [2]
[1] https://lore.kernel.org/linux-fsdevel/20210301222728.176417-1-axelrasmussen…
[2] https://lore.kernel.org/patchwork/cover/1420967/
Overview
========
Minor fault handling is a new userfaultfd feature whose goal is generally to
improve performance. In particular, it is intended for use with demand paging.
There are more details in the cover letters for this new feature (linked above),
but at a high level the idea is that we think of these three phases of live
migration of a VM:
1. Precopy, where we copy "some" pages from the source to the target, while the
VM is still running on the source machine.
2. Blackout, where execution stops on the source, and begins on the target.
3. Postcopy, where the VM is running on the target, some pages are already up
to date, and others are not (because they weren't copied, or were modified
after being copied).
During postcopy, the first time the guest touches memory, we intercept a minor
fault. Userspace checks whether or not the page is already up to date. If
needed, we copy the final version of the page from the source machine. This
could be done with RDMA for example, to do it truly in place / with no copying.
At this point, all that's left is to set up PTEs for the guest: so we issue
UFFDIO_CONTINUE. No copying or page allocation needed.
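As a rough illustration of that last step, the sketch below wraps the
UFFDIO_CONTINUE ioctl. It assumes uapi headers recent enough to define
UFFDIO_CONTINUE; the helper name uffd_continue() is made up for this example,
and the #ifdef fallback exists only so the snippet still builds against older
headers.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/userfaultfd.h>

/*
 * Hypothetical helper: ask the kernel to install PTEs for pages that are
 * already up to date in the page cache, with no copy and no allocation.
 * Returns 0 on success, -1 with errno set on failure.
 */
int uffd_continue(int uffd, uint64_t start, uint64_t len)
{
#ifdef UFFDIO_CONTINUE
	struct uffdio_continue cont = {
		.range = { .start = start, .len = len },
		.mode = 0,	/* default mode: wake the faulting thread */
	};

	return ioctl(uffd, UFFDIO_CONTINUE, &cont);
#else
	/* Assumption: headers predate minor fault support. */
	(void)uffd;
	(void)start;
	(void)len;
	errno = ENOSYS;
	return -1;
#endif
}
```

A fault-handling thread would call this with the userfaultfd descriptor and the
page-aligned range that faulted, after making sure the page contents are final.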
Because of this use case, it's useful to exercise this as part of the demand
paging test. It lets us ensure the use case works correctly end-to-end, and also
gives us an in-tree way to profile the end-to-end flow for future performance
improvements.
Axel Rasmussen (5):
KVM: selftests: allow different backing memory types for demand paging
KVM: selftests: add shmem backing source type
KVM: selftests: create alias mappings when using shared memory
KVM: selftests: allow using UFFD minor faults for demand paging
KVM: selftests: add shared hugetlbfs backing source type
.../selftests/kvm/demand_paging_test.c | 146 +++++++++++++-----
.../testing/selftests/kvm/include/kvm_util.h | 1 +
.../testing/selftests/kvm/include/test_util.h | 11 ++
tools/testing/selftests/kvm/lib/kvm_util.c | 79 +++++++++-
.../selftests/kvm/lib/kvm_util_internal.h | 2 +
tools/testing/selftests/kvm/lib/test_util.c | 46 ++++--
6 files changed, 222 insertions(+), 63 deletions(-)
--
2.31.1.607.g51e8a6a459-goog
One of the mmap tests maps a single page, then tries to extend the
mapping with mremap, which should fail. Right after that, it unmaps
the extended area, which may end up unmapping other valid mapped areas,
thus causing a segfault.
Only unmap the area that is expected to be mapped.
Fixes: b2fb299c9aa4 ("selftests/bpf: test ringbuf mmap read-only and read-write restrictions")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo(a)canonical.com>
---
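Not part of the patch: a standalone sketch of why the unmap length matters. It
uses a plain anonymous mapping rather than a BPF ring buffer, and the function
name is made up for illustration. Three adjacent pages are mapped, only the
first is treated as "ours", and the check confirms that its neighbours survive
when munmap() is given exactly one page.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Returns 0 if the two neighbouring pages are still usable after
 * unmapping only the first page, -1 on any failure.
 */
int unmap_only_first_page(void)
{
	long page_size = sysconf(_SC_PAGESIZE);
	char *base;

	if (page_size <= 0)
		return -1;

	base = mmap(NULL, 3 * (size_t)page_size, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (base == MAP_FAILED)
		return -1;

	/* Stand-ins for the "other valid mapped areas". */
	base[page_size] = 1;
	base[2 * page_size] = 2;

	/* Unmap exactly the page we own, not 3 * page_size. */
	if (munmap(base, (size_t)page_size) != 0)
		return -1;

	/* The neighbours must still be mapped and hold their data. */
	if (base[page_size] != 1 || base[2 * page_size] != 2)
		return -1;

	return munmap(base + page_size, 2 * (size_t)page_size);
}
```

Passing 3 * page_size to the first munmap() instead would tear down all three
pages, and the reads that follow would fault.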
tools/testing/selftests/bpf/prog_tests/ringbuf.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/ringbuf.c b/tools/testing/selftests/bpf/prog_tests/ringbuf.c
index 197e30b83298..f9a8ae331963 100644
--- a/tools/testing/selftests/bpf/prog_tests/ringbuf.c
+++ b/tools/testing/selftests/bpf/prog_tests/ringbuf.c
@@ -146,7 +146,7 @@ void test_ringbuf(void)
ASSERT_ERR(mprotect(mmap_ptr, page_size, PROT_WRITE), "write_protect");
ASSERT_ERR(mprotect(mmap_ptr, page_size, PROT_EXEC), "exec_protect");
ASSERT_ERR_PTR(mremap(mmap_ptr, 0, 3 * page_size, MREMAP_MAYMOVE), "ro_remap");
- ASSERT_OK(munmap(mmap_ptr, 3 * page_size), "unmap_ro");
+ ASSERT_OK(munmap(mmap_ptr, page_size), "unmap_ro");
/* only trigger BPF program for current process */
skel->bss->pid = getpid();
--
2.30.2
From: Colin Ian King <colin.king(a)canonical.com>
There is a spelling mistake in a message. Fix it.
Signed-off-by: Colin Ian King <colin.king(a)canonical.com>
---
tools/testing/selftests/sched/cs_prctl_test.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/sched/cs_prctl_test.c b/tools/testing/selftests/sched/cs_prctl_test.c
index 63fe6521c56d..cf9ca10b876c 100644
--- a/tools/testing/selftests/sched/cs_prctl_test.c
+++ b/tools/testing/selftests/sched/cs_prctl_test.c
@@ -262,7 +262,7 @@ int main(int argc, char *argv[])
if (setpgid(0, 0) != 0)
handle_error("process group");
- printf("\n## Create a thread/process/process group hiearchy\n");
+ printf("\n## Create a thread/process/process group hierarchy\n");
create_processes(num_processes, num_threads, procs);
disp_processes(num_processes, procs);
validate(get_cs_cookie(0) == 0);
--
2.30.2
This patchset provides a file descriptor for every VM and VCPU to read
KVM statistics data in binary format.
It is meant to provide a lightweight, flexible, scalable and efficient
lock-free solution for user space telemetry applications to pull the
statistics data periodically for large scale systems. The pulling
frequency could be as high as a few times per second.
In this patchset, every statistic is treated as having the following
attributes:
* architecture dependent or common
* VM statistics data or VCPU statistics data
* type: cumulative or instantaneous
* unit: none for simple counter, nanosecond, microsecond,
millisecond, second, Byte, KiByte, MiByte, GiByte, clock cycles
Since no lock/synchronization is used, the consistency between all
the statistics data is not guaranteed. That means not all statistics
data are read out at the exact same time, since the statistics data
are still being updated by KVM subsystems while they are read out.
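To make the attribute list above more concrete, here is a minimal sketch of a
descriptor table carrying those attributes. Every name in it (the sketch_*
enums, the struct and the example statistics) is hypothetical and is not taken
from the series, which defines its own layout. The local ARRAY_SIZE macro only
mirrors the usual kernel idiom for counting table entries.

```c
#include <stddef.h>

#ifndef ARRAY_SIZE
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))
#endif

/* Hypothetical attribute encodings, for illustration only. */
enum sketch_scope { SKETCH_VM, SKETCH_VCPU };
enum sketch_type { SKETCH_CUMULATIVE, SKETCH_INSTANTANEOUS };
enum sketch_unit {
	SKETCH_UNIT_NONE,
	SKETCH_UNIT_NANOSECOND,
	SKETCH_UNIT_BYTE,
	SKETCH_UNIT_CYCLE,
};

struct sketch_stat_desc {
	const char *name;
	enum sketch_scope scope;
	enum sketch_type type;
	enum sketch_unit unit;
};

/* Made-up statistics; real names come from KVM itself. */
static const struct sketch_stat_desc sketch_descs[] = {
	{ "example_exits",       SKETCH_VCPU, SKETCH_CUMULATIVE,    SKETCH_UNIT_NONE },
	{ "example_wait_time",   SKETCH_VCPU, SKETCH_CUMULATIVE,    SKETCH_UNIT_NANOSECOND },
	{ "example_mapped_size", SKETCH_VM,   SKETCH_INSTANTANEOUS, SKETCH_UNIT_BYTE },
};

/* Number of descriptors, derived from the table rather than hard-coded. */
size_t sketch_num_descs(void)
{
	return ARRAY_SIZE(sketch_descs);
}

/* Count the descriptors of a given type. */
size_t sketch_count_type(enum sketch_type type)
{
	size_t i, n = 0;

	for (i = 0; i < ARRAY_SIZE(sketch_descs); i++)
		if (sketch_descs[i].type == type)
			n++;
	return n;
}
```

Deriving the count from the table keeps it in step with the entries when a
statistic is added or removed.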
---
* v2 -> v3
- Rebase to kvm/queue, commit edf408f5257b ("KVM: avoid "deadlock" between
install_new_memslots and MMU notifier")
- Resolve some nitpicks about format
* v1 -> v2
- Use ARRAY_SIZE to count the number of stats descriptors
- Fix missing `size` field initialization in macro STATS_DESC
[1] https://lore.kernel.org/kvm/20210402224359.2297157-1-jingzhangos@google.com
[2] https://lore.kernel.org/kvm/20210415151741.1607806-1-jingzhangos@google.com
---
Jing Zhang (4):
KVM: stats: Separate common stats from architecture specific ones
KVM: stats: Add fd-based API to read binary stats data
KVM: stats: Add documentation for statistics data binary interface
KVM: selftests: Add selftest for KVM statistics data binary interface
Documentation/virt/kvm/api.rst | 171 ++++++++
arch/arm64/include/asm/kvm_host.h | 9 +-
arch/arm64/kvm/guest.c | 42 +-
arch/mips/include/asm/kvm_host.h | 9 +-
arch/mips/kvm/mips.c | 67 +++-
arch/powerpc/include/asm/kvm_host.h | 9 +-
arch/powerpc/kvm/book3s.c | 68 +++-
arch/powerpc/kvm/book3s_hv.c | 12 +-
arch/powerpc/kvm/book3s_pr.c | 2 +-
arch/powerpc/kvm/book3s_pr_papr.c | 2 +-
arch/powerpc/kvm/booke.c | 63 ++-
arch/s390/include/asm/kvm_host.h | 9 +-
arch/s390/kvm/kvm-s390.c | 133 ++++++-
arch/x86/include/asm/kvm_host.h | 9 +-
arch/x86/kvm/x86.c | 71 +++-
include/linux/kvm_host.h | 132 ++++++-
include/linux/kvm_types.h | 12 +
include/uapi/linux/kvm.h | 50 +++
tools/testing/selftests/kvm/.gitignore | 1 +
tools/testing/selftests/kvm/Makefile | 3 +
.../testing/selftests/kvm/include/kvm_util.h | 3 +
.../selftests/kvm/kvm_bin_form_stats.c | 370 ++++++++++++++++++
tools/testing/selftests/kvm/lib/kvm_util.c | 11 +
virt/kvm/kvm_main.c | 237 ++++++++++-
24 files changed, 1405 insertions(+), 90 deletions(-)
create mode 100644 tools/testing/selftests/kvm/kvm_bin_form_stats.c
base-commit: edf408f5257ba39e63781b820528e1ce1ec0f543
--
2.31.1.498.g6c1eba8ee3d-goog
Changelog RFC v4 --> PATCH v5:
1. Added a CPU online check prior to parsing the CPU topology to avoid
parsing topologies for CPUs unavailable for the latency test
2. Added comment describing the selftest in cpuidle.sh
As I have changed how cpuidle.sh works, I am dropping the
"Reviewed-by" from Doug Smythies for the second patch, while retaining
it for the first patch.
RFC v4: https://lkml.org/lkml/2021/4/12/99
---
A kernel module + userspace driver to estimate the wakeup latency
caused by going into stop states. The motivation behind this program is
to find significant deviations from the advertised latency and residency
values.
The patchset measures latencies for two kinds of events: IPIs and timers.
As this is a software-only mechanism, there will be additional latencies
from the kernel-firmware-hardware interactions. To account for that, the
program also measures a baseline latency on a 100 percent loaded CPU,
and the latencies achieved must be viewed relative to that.
To achieve this, we introduce a kernel module and expose its control
knobs through the debugfs interface that the selftests can engage with.
The kernel module provides the following interfaces within
/sys/kernel/debug/latency_test/ for,
IPI test:
ipi_cpu_dest = Destination CPU for the IPI
ipi_cpu_src = Origin of the IPI
ipi_latency_ns = Measured latency time in ns
Timeout test:
timeout_cpu_src = CPU on which the timer is to be queued
timeout_expected_ns = Timer duration
timeout_diff_ns = Difference of actual duration vs expected timer
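For readers who would rather poke at these files from C than from the bash
driver, a minimal sketch follows. It assumes each result file holds a single
unsigned decimal number; the helper name read_latency_knob() is made up for
this example and is not part of the patchset.

```c
#include <stdio.h>

/*
 * Read one unsigned decimal value from a debugfs-style file.
 * Returns 0 on success, -1 if the file cannot be opened or parsed.
 */
int read_latency_knob(const char *path, unsigned long long *value)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;

	ok = fscanf(f, "%llu", value) == 1;
	fclose(f);

	return ok ? 0 : -1;
}
```

With the module loaded and debugfs mounted, a call such as
read_latency_knob("/sys/kernel/debug/latency_test/ipi_latency_ns", &ns)
would typically need root privileges.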
Sample output on a POWER9 system is as follows:
# --IPI Latency Test---
# Baseline Average IPI latency(ns): 3114
# Observed Average IPI latency(ns) - State0: 3265
# Observed Average IPI latency(ns) - State1: 3507
# Observed Average IPI latency(ns) - State2: 3739
# Observed Average IPI latency(ns) - State3: 3807
# Observed Average IPI latency(ns) - State4: 17070
# Observed Average IPI latency(ns) - State5: 1038174
# Observed Average IPI latency(ns) - State6: 1068784
#
# --Timeout Latency Test--
# Baseline Average timeout diff(ns): 1420
# Observed Average timeout diff(ns) - State0: 1640
# Observed Average timeout diff(ns) - State1: 1764
# Observed Average timeout diff(ns) - State2: 1715
# Observed Average timeout diff(ns) - State3: 1845
# Observed Average timeout diff(ns) - State4: 16581
# Observed Average timeout diff(ns) - State5: 939977
# Observed Average timeout diff(ns) - State6: 1073024
Things to keep in mind:
1. This kernel module + bash driver does not guarantee idleness on a
core when the IPI and the timer are armed. It only invokes sleep and
hopes that the core is idle once the IPI/timer is invoked onto it.
Hence this program must be run on a completely idle system for best
results.
2. Even on a completely idle system, there may be book-keeping tasks or
jitter tasks that can run on the core we want idle. This can create
outliers in the latency measurement. Thankfully, these outliers
should be large enough to be weeded out easily.
3. A userspace-only selftest variant was also sent out as an RFC, based on
suggestions on the previous patchset to simplify the kernel
complexity. However, a userspace-only approach had more noise in
the latency measurement due to userspace-kernel interactions,
which led to run-to-run variance and a less accurate test.
Another downside of a userspace program is that it
takes orders of magnitude longer to complete a full system test
compared to the kernel framework.
RFC patch: https://lkml.org/lkml/2020/9/2/356
4. For Intel Systems, the Timer based latencies don't exactly give out
the measure of idle latencies. This is because of a hardware
optimization mechanism that pre-arms a CPU when a timer is set to
wakeup. That doesn't make this metric useless for Intel systems,
it just means that it is measuring IPI/Timer response latency rather
than idle wakeup latencies.
(Source: https://lkml.org/lkml/2020/9/2/610)
As a solution to this problem, a hardware-based latency analyzer has
been devised by Artem Bityutskiy from Intel.
https://youtu.be/Opk92aQyvt0?t=8266
https://intel.github.io/wult/
Pratik R. Sampat (2):
cpuidle: Extract IPI based and timer based wakeup latency from idle
states
selftest/cpuidle: Add support for cpuidle latency measurement
drivers/cpuidle/Makefile | 1 +
drivers/cpuidle/test-cpuidle_latency.c | 157 ++++++++
lib/Kconfig.debug | 10 +
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/cpuidle/Makefile | 6 +
tools/testing/selftests/cpuidle/cpuidle.sh | 414 +++++++++++++++++++++
tools/testing/selftests/cpuidle/settings | 2 +
7 files changed, 591 insertions(+)
create mode 100644 drivers/cpuidle/test-cpuidle_latency.c
create mode 100644 tools/testing/selftests/cpuidle/Makefile
create mode 100755 tools/testing/selftests/cpuidle/cpuidle.sh
create mode 100644 tools/testing/selftests/cpuidle/settings
--
2.17.1
If the xfrm_policy.sh script gets terminated for any reason, the netns
namespace files created by the test will be left behind.
In this case a second attempt to run this test will fail with:
# Cannot create namespace file "/run/netns/ns1": File exists
Move the netns cleanup code into an exit trap so that we can ensure
these files will be removed in the end.
Changes in V2:
- Update commit message and patch title.
Signed-off-by: Po-Hsu Lin <po-hsu.lin(a)canonical.com>
---
tools/testing/selftests/net/xfrm_policy.sh | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/net/xfrm_policy.sh b/tools/testing/selftests/net/xfrm_policy.sh
index bdf450e..bb4632b 100755
--- a/tools/testing/selftests/net/xfrm_policy.sh
+++ b/tools/testing/selftests/net/xfrm_policy.sh
@@ -28,6 +28,11 @@ KEY_AES=0x0123456789abcdef0123456789012345
SPI1=0x1
SPI2=0x2
+cleanup() {
+ for i in 1 2 3 4;do ip netns del ns$i 2>/dev/null ;done
+}
+trap cleanup EXIT
+
do_esp_policy() {
local ns=$1
local me=$2
@@ -481,6 +486,4 @@ check_hthresh_repeat "policies with repeated htresh change"
check_random_order ns3 "policies inserted in random order"
-for i in 1 2 3 4;do ip netns del ns$i;done
-
exit $ret
--
2.7.4
This patchset provides a file descriptor for every VM and VCPU to read
KVM statistics data in binary format.
It is meant to provide a lightweight, flexible, scalable and efficient
lock-free solution for user space telemetry applications to pull the
statistics data periodically for large scale systems. The pulling
frequency could be as high as a few times per second.
In this patchset, every statistic is treated as having the following
attributes:
* architecture dependent or common
* VM statistics data or VCPU statistics data
* type: cumulative or instantaneous
* unit: none for simple counter, nanosecond, microsecond,
millisecond, second, Byte, KiByte, MiByte, GiByte, clock cycles
Since no lock/synchronization is used, the consistency between all
the statistics data is not guaranteed. That means not all statistics
data are read out at the exact same time, since the statistics data
are still being updated by KVM subsystems while they are read out.
---
* v3 -> v4
- Rebase to kvm/queue, commit 9f242010c3b4 ("KVM: avoid "deadlock"
between install_new_memslots and MMU notifier")
- Use C-style comments in the whole patch
- Fix wrong count for x86 VCPU stats descriptors
- Fix KVM stats data size counting and validity check in selftest
* v2 -> v3
- Rebase to kvm/queue, commit edf408f5257b ("KVM: avoid "deadlock"
between install_new_memslots and MMU notifier")
- Resolve some nitpicks about format
* v1 -> v2
- Use ARRAY_SIZE to count the number of stats descriptors
- Fix missing `size` field initialization in macro STATS_DESC
[1] https://lore.kernel.org/kvm/20210402224359.2297157-1-jingzhangos@google.com
[2] https://lore.kernel.org/kvm/20210415151741.1607806-1-jingzhangos@google.com
[3] https://lore.kernel.org/kvm/20210423181727.596466-1-jingzhangos@google.com
---
Jing Zhang (4):
KVM: stats: Separate common stats from architecture specific ones
KVM: stats: Add fd-based API to read binary stats data
KVM: stats: Add documentation for statistics data binary interface
KVM: selftests: Add selftest for KVM statistics data binary interface
Documentation/virt/kvm/api.rst | 171 ++++++++
arch/arm64/include/asm/kvm_host.h | 9 +-
arch/arm64/kvm/guest.c | 42 +-
arch/mips/include/asm/kvm_host.h | 9 +-
arch/mips/kvm/mips.c | 67 ++-
arch/powerpc/include/asm/kvm_host.h | 9 +-
arch/powerpc/kvm/book3s.c | 68 +++-
arch/powerpc/kvm/book3s_hv.c | 12 +-
arch/powerpc/kvm/book3s_pr.c | 2 +-
arch/powerpc/kvm/book3s_pr_papr.c | 2 +-
arch/powerpc/kvm/booke.c | 63 ++-
arch/s390/include/asm/kvm_host.h | 9 +-
arch/s390/kvm/kvm-s390.c | 133 +++++-
arch/x86/include/asm/kvm_host.h | 9 +-
arch/x86/kvm/x86.c | 71 +++-
include/linux/kvm_host.h | 132 +++++-
include/linux/kvm_types.h | 12 +
include/uapi/linux/kvm.h | 50 +++
tools/testing/selftests/kvm/.gitignore | 1 +
tools/testing/selftests/kvm/Makefile | 3 +
.../testing/selftests/kvm/include/kvm_util.h | 3 +
.../selftests/kvm/kvm_bin_form_stats.c | 380 ++++++++++++++++++
tools/testing/selftests/kvm/lib/kvm_util.c | 11 +
virt/kvm/kvm_main.c | 237 ++++++++++-
24 files changed, 1415 insertions(+), 90 deletions(-)
create mode 100644 tools/testing/selftests/kvm/kvm_bin_form_stats.c
base-commit: 9f242010c3b46e63bc62f08fff42cef992d3801b
--
2.31.1.527.g47e6f16901-goog
From: Mike Rapoport <rppt(a)linux.ibm.com>
Hi,
@Andrew, this is based on v5.12-rc1, I can rebase whatever way you prefer.
This is an implementation of "secret" mappings backed by a file descriptor.
The file descriptor backing secret memory mappings is created using a
dedicated memfd_secret system call. The desired protection mode for the
memory is configured using the flags parameter of the system call. The mmap()
of the file descriptor created with memfd_secret() will create a "secret"
memory mapping. The pages in that mapping will be marked as not present in
the direct map and will be present only in the page table of the owning mm.
Although normally Linux userspace mappings are protected from other users,
such secret mappings are useful for environments where a hostile tenant is
trying to trick the kernel into giving them access to other tenants'
mappings.
Additionally, in the future the secret mappings may be used as a means to
protect guest memory in a virtual machine host.
For demonstration of secret memory usage we've created a userspace library
https://git.kernel.org/pub/scm/linux/kernel/git/jejb/secret-memory-preloade…
that does two things. First, it acts as a preloader for openssl that
redirects all the OPENSSL_malloc calls to secret memory, meaning any secret
keys get protected automatically. Second, it exposes the API to the
user who needs it. We anticipate that a lot of the
use cases would be like the openssl one: many toolkits that deal with
secret keys already have special handling for the memory to try to give
them greater protection, so this would simply be pluggable into the
toolkits without any need for user application modification.
Hiding secret memory mappings behind an anonymous file allows usage of
the page cache for tracking pages allocated for the "secret" mappings as
well as using address_space_operations for e.g. page migration callbacks.
The anonymous file may be also used implicitly, like hugetlb files, to
implement mmap(MAP_SECRET) and use the secret memory areas with "native" mm
ABIs in the future.
Removing pages from the direct map may cause its fragmentation on
architectures that use large pages to map the physical memory, which affects
the system performance. However, the original Kconfig text for
CONFIG_DIRECT_GBPAGES said that gigabyte pages in the direct map "... can
improve the kernel's performance a tiny bit ..." (commit 00d1c5e05736
("x86: add gbpages switches")) and the recent report [1] showed that "...
although 1G mappings are a good default choice, there is no compelling
evidence that it must be the only choice". Hence, it is sufficient to have
secretmem disabled by default with the ability of a system administrator to
enable it at boot time.
In addition, there is also a long term goal to improve management of the
direct map.
[1] https://lore.kernel.org/linux-mm/213b4567-46ce-f116-9cdf-bbd0c884eb3c@linux…
v18:
* rebase on v5.12-rc1
* merge kfence fix into the original patch
* massage commit message of the patch introducing the memfd_secret syscall
v17: https://lore.kernel.org/lkml/20210208084920.2884-1-rppt@kernel.org
* Remove pool of large pages backing secretmem allocations, per Michal Hocko
* Add secretmem pages to unevictable LRU, per Michal Hocko
* Use GFP_HIGHUSER as secretmem mapping mask, per Michal Hocko
* Make secretmem an opt-in feature that is disabled by default
v16: https://lore.kernel.org/lkml/20210121122723.3446-1-rppt@kernel.org
* Fix memory leak introduced in v15
* Clean the data left from previous page user before handing the page to
the userspace
v15: https://lore.kernel.org/lkml/20210120180612.1058-1-rppt@kernel.org
* Add riscv/Kconfig update to disable set_memory operations for nommu
builds (patch 3)
* Update the code around add_to_page_cache() per Matthew's comments
(patches 6,7)
* Add fixups for build/checkpatch errors discovered by CI systems
v14: https://lore.kernel.org/lkml/20201203062949.5484-1-rppt@kernel.org
* Finally s/mod_node_page_state/mod_lruvec_page_state/
v13: https://lore.kernel.org/lkml/20201201074559.27742-1-rppt@kernel.org
* Added Reviewed-by, thanks Catalin and David
* s/mod_node_page_state/mod_lruvec_page_state/ as Shakeel suggested
Older history:
v12: https://lore.kernel.org/lkml/20201125092208.12544-1-rppt@kernel.org
v11: https://lore.kernel.org/lkml/20201124092556.12009-1-rppt@kernel.org
v10: https://lore.kernel.org/lkml/20201123095432.5860-1-rppt@kernel.org
v9: https://lore.kernel.org/lkml/20201117162932.13649-1-rppt@kernel.org
v8: https://lore.kernel.org/lkml/20201110151444.20662-1-rppt@kernel.org
v7: https://lore.kernel.org/lkml/20201026083752.13267-1-rppt@kernel.org
v6: https://lore.kernel.org/lkml/20200924132904.1391-1-rppt@kernel.org
v5: https://lore.kernel.org/lkml/20200916073539.3552-1-rppt@kernel.org
v4: https://lore.kernel.org/lkml/20200818141554.13945-1-rppt@kernel.org
v3: https://lore.kernel.org/lkml/20200804095035.18778-1-rppt@kernel.org
v2: https://lore.kernel.org/lkml/20200727162935.31714-1-rppt@kernel.org
v1: https://lore.kernel.org/lkml/20200720092435.17469-1-rppt@kernel.org
rfc-v2: https://lore.kernel.org/lkml/20200706172051.19465-1-rppt@kernel.org/
rfc-v1: https://lore.kernel.org/lkml/20200130162340.GA14232@rapoport-lnx/
rfc-v0: https://lore.kernel.org/lkml/1572171452-7958-1-git-send-email-rppt@kernel.o…
Mike Rapoport (9):
mm: add definition of PMD_PAGE_ORDER
mmap: make mlock_future_check() global
riscv/Kconfig: make direct map manipulation options depend on MMU
set_memory: allow set_direct_map_*_noflush() for multiple pages
set_memory: allow querying whether set_direct_map_*() is actually enabled
mm: introduce memfd_secret system call to create "secret" memory areas
PM: hibernate: disable when there are active secretmem users
arch, mm: wire up memfd_secret system call where relevant
secretmem: test: add basic selftest for memfd_secret(2)
arch/arm64/include/asm/Kbuild | 1 -
arch/arm64/include/asm/cacheflush.h | 6 -
arch/arm64/include/asm/kfence.h | 2 +-
arch/arm64/include/asm/set_memory.h | 17 ++
arch/arm64/include/uapi/asm/unistd.h | 1 +
arch/arm64/kernel/machine_kexec.c | 1 +
arch/arm64/mm/mmu.c | 6 +-
arch/arm64/mm/pageattr.c | 23 +-
arch/riscv/Kconfig | 4 +-
arch/riscv/include/asm/set_memory.h | 4 +-
arch/riscv/include/asm/unistd.h | 1 +
arch/riscv/mm/pageattr.c | 8 +-
arch/x86/entry/syscalls/syscall_32.tbl | 1 +
arch/x86/entry/syscalls/syscall_64.tbl | 1 +
arch/x86/include/asm/set_memory.h | 4 +-
arch/x86/mm/pat/set_memory.c | 8 +-
fs/dax.c | 11 +-
include/linux/pgtable.h | 3 +
include/linux/secretmem.h | 30 +++
include/linux/set_memory.h | 16 +-
include/linux/syscalls.h | 1 +
include/uapi/asm-generic/unistd.h | 6 +-
include/uapi/linux/magic.h | 1 +
kernel/power/hibernate.c | 5 +-
kernel/power/snapshot.c | 4 +-
kernel/sys_ni.c | 2 +
mm/Kconfig | 3 +
mm/Makefile | 1 +
mm/gup.c | 10 +
mm/internal.h | 3 +
mm/mlock.c | 3 +-
mm/mmap.c | 5 +-
mm/secretmem.c | 261 +++++++++++++++++++
mm/vmalloc.c | 5 +-
scripts/checksyscalls.sh | 4 +
tools/testing/selftests/vm/.gitignore | 1 +
tools/testing/selftests/vm/Makefile | 3 +-
tools/testing/selftests/vm/memfd_secret.c | 296 ++++++++++++++++++++++
tools/testing/selftests/vm/run_vmtests.sh | 17 ++
39 files changed, 726 insertions(+), 53 deletions(-)
create mode 100644 arch/arm64/include/asm/set_memory.h
create mode 100644 include/linux/secretmem.h
create mode 100644 mm/secretmem.c
create mode 100644 tools/testing/selftests/vm/memfd_secret.c
--
2.28.0
Explicitly include stddef.h when building the BTI tests so that we have
a definition of NULL. With at least some toolchains this is not done
implicitly by anything else:
test.c: In function ‘start’:
test.c:214:25: error: ‘NULL’ undeclared (first use in this function)
214 | sigaction(SIGILL, &sa, NULL);
| ^~~~
test.c:20:1: note: ‘NULL’ is defined in header ‘<stddef.h>’; did you forget to ‘#include <stddef.h>’?
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
tools/testing/selftests/arm64/bti/test.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/arm64/bti/test.c b/tools/testing/selftests/arm64/bti/test.c
index 656b04976ccc..67b77ab83c20 100644
--- a/tools/testing/selftests/arm64/bti/test.c
+++ b/tools/testing/selftests/arm64/bti/test.c
@@ -6,6 +6,7 @@
#include "system.h"
+#include <stddef.h>
#include <linux/errno.h>
#include <linux/auxvec.h>
#include <linux/signal.h>
--
2.20.1
The result of an expression consisting of a single relational operator is
already 0 or 1 and converts to bool implicitly, so the ternary is redundant.
No functional change.
Signed-off-by: Zhen Lei <thunder.leizhen(a)huawei.com>
---
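Not part of the patch: a small standalone sketch of the equivalence the change
relies on. The function names are made up for illustration. Both forms return
the same value for every wait status, because in C the == operator yields 0 or
1, which converts to false or true.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Old form: the ternary only restates what == already produced. */
bool exited_ok_ternary(int status)
{
	return WEXITSTATUS(status) == EXIT_SUCCESS ? true : false;
}

/* New form: the comparison result converts to bool on return. */
bool exited_ok_plain(int status)
{
	return WEXITSTATUS(status) == EXIT_SUCCESS;
}
```

As in the test being patched, callers are expected to check WIFEXITED() before
looking at the exit status.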
tools/testing/selftests/mount/unprivileged-remount-test.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/mount/unprivileged-remount-test.c b/tools/testing/selftests/mount/unprivileged-remount-test.c
index 584dc6bc3b06679..d2917054fe3ae56 100644
--- a/tools/testing/selftests/mount/unprivileged-remount-test.c
+++ b/tools/testing/selftests/mount/unprivileged-remount-test.c
@@ -204,7 +204,7 @@ bool test_unpriv_remount(const char *fstype, const char *mount_options,
if (!WIFEXITED(status)) {
die("child did not terminate cleanly\n");
}
- return WEXITSTATUS(status) == EXIT_SUCCESS ? true : false;
+ return WEXITSTATUS(status) == EXIT_SUCCESS;
}
create_and_enter_userns();
@@ -282,7 +282,7 @@ static bool test_priv_mount_unpriv_remount(void)
if (!WIFEXITED(status)) {
die("child did not terminate cleanly\n");
}
- return WEXITSTATUS(status) == EXIT_SUCCESS ? true : false;
+ return WEXITSTATUS(status) == EXIT_SUCCESS;
}
orig_mnt_flags = read_mnt_flags(orig_path);
--
2.26.0.106.g9fadedd
The use of typecheck() in KUNIT_EXPECT_EQ() and friends is causing more
problems than I think it's worth. Things like enums need to have their
values explicitly cast, and literals all need to be very precisely typed
for the code to compile.
While typechecking does have its uses, the additional overhead of having
lots of needless casts -- combined with the awkward error messages which
don't mention which types are involved -- makes tests less readable and
more difficult to write.
By removing the typecheck() call, the two arguments still need to be of
compatible types, but don't need to be of exactly the same type, which
seems a less confusing and more useful compromise.
Signed-off-by: David Gow <davidgow(a)google.com>
---
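For readers unfamiliar with the macro, a simplified userspace sketch of the
behaviour after this change. SKETCH_EXPECT_EQ is a made-up stand-in, not the
KUnit macro: it keeps the typeof() temporaries shown in the diff below but does
no exact-type check, so a long can be compared against a plain int literal. It
relies on the GNU typeof and statement-expression extensions (gcc or clang).

```c
#include <stdbool.h>

/*
 * Hypothetical, simplified stand-in for a binary expectation macro.
 * Each operand is evaluated once into a temporary of its own type;
 * the comparison then follows the usual arithmetic conversions.
 */
#define SKETCH_EXPECT_EQ(left, right) ({		\
	__typeof__(left) left_ = (left);		\
	__typeof__(right) right_ = (right);		\
	left_ == right_;				\
})

/* Compares a long against an int literal, with no cast needed. */
bool long_matches_int_literal(void)
{
	long n = 3;

	return SKETCH_EXPECT_EQ(n, 3);
}
```

With an exact-type check in place, the same call would presumably have needed
the literal written as 3L or an explicit cast on one side.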
I appreciate that this is probably a bit controversial (and, indeed, I
was a bit hesitant about sending it out myself), but after sitting on it
for a few days, I still think this is probably an improvement overall.
The second patch does fix what I think is an actual bug, though, so even
if this isn't determined to be a good idea, it (or some equivalent)
should probably go through.
Cheers,
-- David
include/kunit/test.h | 1 -
1 file changed, 1 deletion(-)
diff --git a/include/kunit/test.h b/include/kunit/test.h
index 49601c4b98b8..4c56ffcb7403 100644
--- a/include/kunit/test.h
+++ b/include/kunit/test.h
@@ -775,7 +775,6 @@ void kunit_do_assertion(struct kunit *test,
do { \
typeof(left) __left = (left); \
typeof(right) __right = (right); \
- ((void)__typecheck(__left, __right)); \
\
KUNIT_ASSERTION(test, \
__left op __right, \
--
2.31.1.607.g51e8a6a459-goog
From: Mike Rapoport <rppt(a)linux.ibm.com>
Hi,
This is an updated version of page_is_secretmem() changes.
This is based on v5.12-rc7-mmots-2021-04-15-16-28.
@Andrew, please let me know if you'd like me to rebase it differently or
resend the entire set.
v3:
* add missing put_compound_head() if we are to return NULL from
gup_page_range(), thanks David.
* add unlikely() to test for page_is_secretmem.
v2:
* move the check for secretmem page in gup_pte_range after we get a
reference to the page, per Matthew.
Mike Rapoport (2):
secretmem/gup: don't check if page is secretmem without reference
secretmem: optimize page_is_secretmem()
include/linux/secretmem.h | 26 +++++++++++++++++++++++++-
mm/gup.c | 6 +++---
mm/secretmem.c | 12 +-----------
3 files changed, 29 insertions(+), 15 deletions(-)
--
2.28.0
Mike Rapoport (2):
secretmem/gup: don't check if page is secretmem without reference
secretmem: optimize page_is_secretmem()
include/linux/secretmem.h | 26 +++++++++++++++++++++++++-
mm/gup.c | 8 +++++---
mm/secretmem.c | 12 +-----------
3 files changed, 31 insertions(+), 15 deletions(-)
--
2.28.0
Base
====
This series is based on (and therefore should apply cleanly to) the tag
"v5.12-rc8-mmots-2021-04-21-23-08", with the following applied first:
1. Peter's selftest cleanup series:
https://lore.kernel.org/patchwork/cover/1412450/
2. My patch to fix a pre-existing BUG_ON in an edge case:
https://lore.kernel.org/patchwork/patch/1419758/
Changelog
=========
v5->v6:
- Picked up {Reviewed,Acked}-by's.
- Rebased onto v5.12-rc8-mmots-2021-04-21-23-08.
- Put mistakenly removed delete_from_page_cache() back in the error path in
shmem_mfill_atomic_pte(). [Hugh]
- Keep shmem_mfill_atomic_pte() naming, instead of shmem_mcopy_... Likewise,
rename our new helper to mfill_atomic_install_pte(). [Hugh]
- Return directly instead of "goto out" in shmem_mfill_atomic_pte(), saving a
couple of lines. [Peter]
v4->v5:
- Picked up {Reviewed,Acked}-by's.
- Fix cleanup in error path in shmem_mcopy_atomic_pte(). [Hugh, Peter]
- Mention switching to lru_cache_add() in the commit message of 9/10. [Hugh]
- Split + reorder commits, so now we 1) implement the faulting path, 2)
implement the CONTINUE ioctl, and 3) advertise the feature. Squash the
documentation update into step (3). [Hugh, Peter]
- Reorder install_pte() cleanup to come before selftest changes. [Hugh]
v3->v4:
- Fix handling of the shmem private mcopy case. Previously, I had (incorrectly)
assumed that !vma_is_anonymous() was equivalent to "the page will be in the
page cache". But, in this case we have an optimization where we allocate a new
*anonymous* page. So, use a new "bool page_in_cache" instead, which checks if
page->mapping is set. Correct several places with this new check. [Hugh]
- Fix calling mm_counter() before page_add_..._rmap(). [Hugh]
- When modifying shmem_mcopy_atomic_pte() to use the new install_pte() helper,
just use lru_cache_add_inactive_or_unevictable(), no need to branch and maybe
use lru_cache_add(). [Hugh]
- De-pluralize mcopy_atomic_install_pte(s). [Hugh]
- Make "writable" a bool, and initialize consistently. [Hugh]
v2->v3:
- Picked up {Reviewed,Acked}-by's.
- Reorder commits: introduce CONTINUE before MINOR registration. [Hugh, Peter]
- Don't try to {unlock,put}_page an xarray value in shmem_getpage_gfp. [Hugh]
- Move enum mcopy_atomic_mode forward declare out of CONFIG_HUGETLB_PAGE. [Hugh]
- Keep mistakenly removed UFFD_USER_MODE_ONLY in selftest. [Peter]
- Cleanup context management in self test (make clear implicit, remove unneeded
return values now that we have err()). [Peter]
- Correct dst_pte argument to dst_pmd in shmem_mcopy_atomic_pte macro. [Hugh]
- Mention the new shmem support feature in documentation. [Hugh]
v1->v2:
- Pick up Reviewed-by's.
- Don't swap in the page when a minor fault occurs. Instead, notice that it
needs to be swapped in, and just immediately fire the minor fault. Let a future
CONTINUE deal with swapping in the page. [Peter]
- Clarify comment about i_size checks in mm/userfaultfd.c. [Peter]
- Only forward declare once (out of #ifdef) in hugetlb.h. [Peter]
Changes since [2]:
- Squash the fixes ([2]) in with the original series ([1]). This makes reviewing
easier, as we no longer have to sift through deltas undoing what we had done
before. [Hugh, Peter]
- Modify shmem_mcopy_atomic_pte() to use the new mcopy_atomic_install_ptes()
helper, reducing code duplication. [Hugh]
- Properly trigger handle_userfault() in the shmem_swapin_page() case. [Hugh]
- Use shmem_getpage() instead of find_lock_page() to look up the existing page
for CONTINUE. This properly deals with swapped-out pages. [Hugh]
- Unconditionally pte_mkdirty() for anon memory (as before). [Peter]
- Don't include userfaultfd_k.h in either hugetlb.h or shmem_fs.h. [Hugh]
- Add comment for UFFD_FEATURE_MINOR_SHMEM (to match _HUGETLBFS). [Hugh]
- Fix some small cleanup issues (parens, reworded conditionals, reduced plumbing
of some parameters, simplify labels/gotos, ...). [Hugh, Peter]
Overview
========
See the series which added minor faults for hugetlbfs [3] for a detailed
overview of minor fault handling in general. This series adds the same support
for shmem-backed areas.
This series is structured as follows:
- Commits 1 and 2 are cleanups.
- Commits 3 and 4 implement the new feature (minor fault handling for shmem).
- Commit 5 advertises that the feature is now available since at this point it's
fully implemented.
- Commit 6 is a final cleanup, modifying an existing code path to re-use a new
helper we've introduced.
- Commits 7, 8, 9, 10 update the userfaultfd selftest to exercise the feature.
Use Case
========
In some cases it is useful to have VM memory backed by tmpfs instead of
hugetlbfs. So, this feature will be used to support the same VM live migration
use case described in my original series.
Additionally, Android folks (Lokesh Gidra <lokeshgidra(a)google.com>) hope to
optimize the Android Runtime garbage collector using this feature:
"The plan is to use userfaultfd for concurrently compacting the heap. With
this feature, the heap can be shared-mapped at another location where the
GC-thread(s) could continue the compaction operation without the need to
invoke userfault ioctl(UFFDIO_COPY) each time. OTOH, if and when Java threads
get faults on the heap, UFFDIO_CONTINUE can be used to resume execution.
Furthermore, this feature enables updating references in the 'non-moving'
portion of the heap efficiently. Without this feature, unnecessary page
copying (ioctl(UFFDIO_COPY)) would be required."
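As a rough illustration of the UFFDIO_CONTINUE step mentioned above, here is a
minimal sketch. It assumes a uapi header that already defines UFFDIO_CONTINUE
(i.e. one that includes the hugetlbfs minor fault series [3]), a userfaultfd
that has already been registered for minor faults on the range, and page
contents that were already populated through a second mapping. The helper names
make_continue() and resolve_minor_fault() are made up for this example and are
not part of the series.

```c
#include <assert.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/userfaultfd.h>

/* Build the UFFDIO_CONTINUE argument covering [start, start + len). */
struct uffdio_continue make_continue(unsigned long start, unsigned long len)
{
	struct uffdio_continue cont;

	assert(len > 0);
	memset(&cont, 0, sizeof(cont));
	cont.range.start = start;
	cont.range.len = len;
	cont.mode = 0;	/* default mode: wake the faulting thread(s) */
	return cont;
}

/*
 * Ask the kernel to install page table entries for pages that are already
 * in the page cache. Returns 0 on success, -1 with errno set on failure.
 */
int resolve_minor_fault(int uffd, unsigned long start, unsigned long len)
{
	struct uffdio_continue cont = make_continue(start, len);

	return ioctl(uffd, UFFDIO_CONTINUE, &cont);
}
```

A fault-handling thread would typically call resolve_minor_fault() with the
page-aligned fault address taken from the message it read from the userfaultfd.
No page contents are copied, which is the difference from UFFDIO_COPY.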
[1] https://lore.kernel.org/patchwork/cover/1388144/
[2] https://lore.kernel.org/patchwork/patch/1408161/
[3] https://lore.kernel.org/linux-fsdevel/20210301222728.176417-1-axelrasmussen…
Axel Rasmussen (10):
userfaultfd/hugetlbfs: avoid including userfaultfd_k.h in hugetlb.h
userfaultfd/shmem: combine shmem_{mcopy_atomic,mfill_zeropage}_pte
userfaultfd/shmem: support minor fault registration for shmem
userfaultfd/shmem: support UFFDIO_CONTINUE for shmem
userfaultfd/shmem: advertise shmem minor fault support
userfaultfd/shmem: modify shmem_mfill_atomic_pte to use install_pte()
userfaultfd/selftests: use memfd_create for shmem test type
userfaultfd/selftests: create alias mappings in the shmem test
userfaultfd/selftests: reinitialize test context in each test
userfaultfd/selftests: exercise minor fault handling shmem support
Documentation/admin-guide/mm/userfaultfd.rst | 3 +-
fs/userfaultfd.c | 6 +-
include/linux/hugetlb.h | 2 +-
include/linux/shmem_fs.h | 19 +-
include/linux/userfaultfd_k.h | 5 +
include/uapi/linux/userfaultfd.h | 7 +-
mm/hugetlb.c | 1 +
mm/memory.c | 8 +-
mm/shmem.c | 120 +++-----
mm/userfaultfd.c | 175 ++++++++----
tools/testing/selftests/vm/userfaultfd.c | 274 ++++++++++++-------
11 files changed, 364 insertions(+), 256 deletions(-)
--
2.31.1.527.g47e6f16901-goog
Hi,
This patch series introduces the futex2 syscalls.
* What happened to the current futex()?
For some years now, developers have been trying to add new features to
futex, but maintainers have been reluctant to accept them, given that the
multiplexed interface is full of legacy features and makes big changes
tricky. Some problems that people tried to address with patchsets are:
NUMA-awareness[0], smaller sized futexes[1], wait on multiple futexes[2].
NUMA, for instance, just doesn't fit the current API in a reasonable
way. Considering that, it's not possible to merge new features into the
current futex.
** The NUMA problem
In the current implementation, all futex kernel-side infrastructure is
stored on a single node. Given that, all futex() calls issued by
processors that aren't located on that node incur a memory access
penalty.
** The 32bit sized futex problem
Embedded systems or anything with memory constraints would benefit from
using smaller sizes for the futex userspace integer. Also, a mutex
implementation can be done using just three values, so 8 bits is enough
for various scenarios.
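To illustrate the three-value claim, here is a sketch of a classic three-state
mutex built on the existing futex() syscall. The names (mutex3_lock,
mutex3_unlock, futex_op) are made up for this example, it assumes atomic_uint
has the same size and representation as the 32-bit futex word, and it is not
code from this patchset. The state only ever holds 0, 1 or 2, yet the current
futex() interface still requires a full 32-bit word.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

enum { UNLOCKED = 0, LOCKED = 1, CONTENDED = 2 };

/* Thin wrapper around the multiplexed futex() syscall. */
static long futex_op(atomic_uint *uaddr, int op, uint32_t val)
{
	return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

void mutex3_lock(atomic_uint *m)
{
	unsigned int c = UNLOCKED;

	/* Fast path: UNLOCKED -> LOCKED without entering the kernel. */
	if (atomic_compare_exchange_strong(m, &c, LOCKED))
		return;

	/* Slow path: mark the lock CONTENDED and sleep until we own it. */
	if (c != CONTENDED)
		c = atomic_exchange(m, CONTENDED);
	while (c != UNLOCKED) {
		/* Sleeps only if the word still reads CONTENDED. */
		futex_op(m, FUTEX_WAIT_PRIVATE, CONTENDED);
		c = atomic_exchange(m, CONTENDED);
	}
}

void mutex3_unlock(atomic_uint *m)
{
	unsigned int prev = atomic_exchange(m, UNLOCKED);

	assert(prev != UNLOCKED);	/* unlocking an unlocked mutex is a bug */
	if (prev == CONTENDED)
		futex_op(m, FUTEX_WAKE_PRIVATE, 1);	/* wake one waiter */
}
```

In the uncontended case neither function enters the kernel; the syscall is
only issued once a second thread has marked the lock as contended.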
** The wait on multiple problem
The use case lies in the Wine implementation of the Windows NT interface
WaitForMultipleObjects. This Windows API function allows a thread to sleep
waiting on the first of a set of event sources (mutexes, timers, signals,
console input, etc.) to signal. Considering this is a primitive
synchronization operation for Windows applications, being able to quickly
signal events on the producer side and quickly go to sleep on the
consumer side is essential for good performance of applications running over
Wine.
[0] https://lore.kernel.org/lkml/20160505204230.932454245@linutronix.de/
[1] https://lore.kernel.org/lkml/20191221155659.3159-2-malteskarupke@web.de/
[2] https://lore.kernel.org/lkml/20200213214525.183689-1-andrealmeid@collabora.…
* The solution
As proposed by Peter Zijlstra and Florian Weimer[3], a new interface
is required to solve this, which must be designed with those features in
mind. futex2() is that interface. As opposed to the current multiplexed
interface, the new one should have one syscall per operation. This will
keep the API maintainable if it gets extended, and will help
users with type checking of arguments.
In particular, the new interface is extended to support the ability to
wait on any of a list of futexes at a time, which could be seen as a
vectored extension of the FUTEX_WAIT semantics.
[3] https://lore.kernel.org/lkml/20200303120050.GC2596@hirez.programming.kicks-…
* The interface
The new interface can be seen in detail in the following patches, but
this is a high-level summary of what the interface can do:
- Supports wake/wait semantics, as in futex()
- Supports requeue operations, similar to FUTEX_CMP_REQUEUE, but with
individual flags for each address
- Supports waiting for a vector of futexes, using a new syscall named
futex_waitv()
- Supports variable sized futexes (8, 16, 32 and 64 bits)
- Supports NUMA-aware operations, where the user can specify which
memory node to operate on
* Implementation
The internal implementation follows a similar design to the original futex.
Given that we want to replicate the same external behavior of current
futex, this should be somewhat expected. For some functions, like the
init and the code to get a shared key, I literally copied code and
comments from kernel/futex.c. I decided to do so instead of exposing the
original function as a public function since in that way we can freely
modify our implementation if required, without any impact on old futex.
Also, the comments precisely describe the details and corner cases of
the implementation.
Each patch contains a brief description of implementation, but patch 6
"docs: locking: futex2: Add documentation" adds a more complete document
about it.
* The patchset
This patchset can be also found at my git tree:
https://gitlab.collabora.com/tonyk/linux/-/tree/futex2-dev
- Patch 1: Implements wait/wake, and the basic foundations of futex2
- Patches 2-4: Implement the remaining features (shared, waitv, requeue).
- Patch 5: Adds the x86_x32 ABI handling. I kept it in a separate
patch since I'm not sure if x86_x32 is still a thing, or if it should
return -ENOSYS.
- Patch 6: Adds a documentation file which details the interface and
the internal implementation.
- Patches 7-12: Selftests for all operations along with perf
support for futex2.
- Patch 13: While working on porting glibc for futex2, I found out
that there's a futex_wake() call at the user thread exit path, if
that thread was created with clone(..., CLONE_CHILD_CLEARTID, ...). In
order to make pthreads work with futex2, it was required to add
this patch. Note that this is more a proof-of-concept of what we
will need to do in future, rather than part of the interface, and
shouldn't be merged as it is.
* Testing:
This patchset provides selftests for each operation and their flags.
Along with that, the following work was done:
** Stability
To stress the interface in "real world scenarios":
- glibc[4]: nptl's low level locking was modified to use futex2 API
(except for robust and PI things). All relevant nptl/ tests passed.
- Wine[5]: Proton/Wine was modified in order to use futex2() for the
emulation of Windows NT sync mechanisms based on futex, called "fsync".
Triple-A games with huge CPU loads and tons of parallel jobs worked
as expected when compared with the previous FUTEX_WAIT_MULTIPLE
implementation in futex(). Some games issue 42k futex2() calls
per second.
- Full GNU/Linux distro: I installed the modified glibc on my host
machine, so all pthread programs would use futex2(). After tweaking
systemd[6] to allow futex2() calls in seccomp, everything worked as
expected (web browsers do some syscall sandboxing and need some
configuration as well).
- perf: The perf benchmark tests can also be used to stress the
interface, and they can be found in this patchset.
** Performance
- For comparing futex() and futex2() performance, I used the artificial
benchmarks implemented in perf (wake, wake-parallel, hash and
requeue). The setup was 200 runs for each test, using 8, 80, 800 and
8000 threads. Note that for this test, I'm not using
patch 13 ("kernel: Enable waitpid() for futex2"), for reasons explained
in "The patchset" section.
- For the first three, I measured an average 4% gain in
performance. This is not a big step, but it shows that the new
interface is at least comparable in performance with the current one.
- For requeue, I measured an average 21% decrease in performance
compared to the original futex implementation. This is expected given
the new design with individual flags. The performance trade-offs are
explained in patch 4 ("futex2: Implement requeue operation").
[4] https://gitlab.collabora.com/tonyk/glibc/-/tree/futex2
[5] https://gitlab.collabora.com/tonyk/wine/-/tree/proton_5.13
[6] https://gitlab.collabora.com/tonyk/systemd
* FAQ
** "Where's the code for NUMA and FUTEX_8/16/64?"
The current code is already complex enough to take some time for
review, so I believe it's better to split that work out to a future
iteration of this patchset. Besides that, this RFC is the core part of the
infrastructure, and the following features will not pose big design
changes to it; the work will be more about wiring up the flags and
modifying some functions.
** "Where's the PI/robust stuff?"
As said by Peter Zijlstra in [3], all those new features are related to
the "simple" futex interface, which doesn't use PI or robust. Do we want
to have this complexity in futex2() and if so, should it be part of
this patchset or can it be future work?
Thanks,
André
* Changelog
Changes from v2:
- API now supports 64bit futexes, in addition to 8, 16 and 32.
- This API change will break the glibc[4] and Proton[5] ports for now.
- Refactored futex2_wait and futex2_waitv selftests
v2: https://lore.kernel.org/lkml/20210304004219.134051-1-andrealmeid@collabora.…
Changes from v1:
- Unified futex_set_timer_and_wait and __futex_wait code
- Dropped _carefull from linked list function calls
- Fixed typos on docs patch
- uAPI flags are now added as features are introduced, instead of all flags
in patch 1
- Removed struct futex_single_waiter in favor of an anon struct
v1: https://lore.kernel.org/lkml/20210215152404.250281-1-andrealmeid@collabora.…
André Almeida (13):
futex2: Implement wait and wake functions
futex2: Add support for shared futexes
futex2: Implement vectorized wait
futex2: Implement requeue operation
futex2: Add compatibility entry point for x86_x32 ABI
docs: locking: futex2: Add documentation
selftests: futex2: Add wake/wait test
selftests: futex2: Add timeout test
selftests: futex2: Add wouldblock test
selftests: futex2: Add waitv test
selftests: futex2: Add requeue test
perf bench: Add futex2 benchmark tests
kernel: Enable waitpid() for futex2
Documentation/locking/futex2.rst | 198 +++
Documentation/locking/index.rst | 1 +
MAINTAINERS | 2 +-
arch/arm/tools/syscall.tbl | 4 +
arch/arm64/include/asm/unistd.h | 2 +-
arch/arm64/include/asm/unistd32.h | 8 +
arch/x86/entry/syscalls/syscall_32.tbl | 4 +
arch/x86/entry/syscalls/syscall_64.tbl | 4 +
fs/inode.c | 1 +
include/linux/compat.h | 26 +
include/linux/fs.h | 1 +
include/linux/syscalls.h | 17 +
include/uapi/asm-generic/unistd.h | 14 +-
include/uapi/linux/futex.h | 31 +
init/Kconfig | 7 +
kernel/Makefile | 1 +
kernel/fork.c | 2 +
kernel/futex2.c | 1252 +++++++++++++++++
kernel/sys_ni.c | 9 +
tools/arch/x86/include/asm/unistd_64.h | 12 +
tools/include/uapi/asm-generic/unistd.h | 11 +-
.../arch/x86/entry/syscalls/syscall_64.tbl | 4 +
tools/perf/bench/bench.h | 4 +
tools/perf/bench/futex-hash.c | 24 +-
tools/perf/bench/futex-requeue.c | 57 +-
tools/perf/bench/futex-wake-parallel.c | 41 +-
tools/perf/bench/futex-wake.c | 37 +-
tools/perf/bench/futex.h | 47 +
tools/perf/builtin-bench.c | 18 +-
.../selftests/futex/functional/.gitignore | 3 +
.../selftests/futex/functional/Makefile | 6 +-
.../futex/functional/futex2_requeue.c | 164 +++
.../selftests/futex/functional/futex2_wait.c | 195 +++
.../selftests/futex/functional/futex2_waitv.c | 154 ++
.../futex/functional/futex_wait_timeout.c | 58 +-
.../futex/functional/futex_wait_wouldblock.c | 33 +-
.../testing/selftests/futex/functional/run.sh | 6 +
.../selftests/futex/include/futex2test.h | 112 ++
38 files changed, 2518 insertions(+), 52 deletions(-)
create mode 100644 Documentation/locking/futex2.rst
create mode 100644 kernel/futex2.c
create mode 100644 tools/testing/selftests/futex/functional/futex2_requeue.c
create mode 100644 tools/testing/selftests/futex/functional/futex2_wait.c
create mode 100644 tools/testing/selftests/futex/functional/futex2_waitv.c
create mode 100644 tools/testing/selftests/futex/include/futex2test.h
--
2.31.1
From: Yonghong Song <yhs(a)fb.com>
[ Upstream commit 26e6dd1072763cd5696b75994c03982dde952ad9 ]
selftests/bpf/Makefile includes lib.mk. With the following command
make -j60 LLVM=1 LLVM_IAS=1 <=== compile kernel
make -j60 -C tools/testing/selftests/bpf LLVM=1 LLVM_IAS=1 V=1
some files are still compiled with gcc. This patch
fixes the lib.mk issue of CC being set to gcc in all cases.
Signed-off-by: Yonghong Song <yhs(a)fb.com>
Signed-off-by: Alexei Starovoitov <ast(a)kernel.org>
Acked-by: Andrii Nakryiko <andrii(a)kernel.org>
Link: https://lore.kernel.org/bpf/20210413153413.3027426-1-yhs@fb.com
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/lib.mk | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/tools/testing/selftests/lib.mk b/tools/testing/selftests/lib.mk
index 50a93f5f13d6..d8fa6c72b7ca 100644
--- a/tools/testing/selftests/lib.mk
+++ b/tools/testing/selftests/lib.mk
@@ -1,6 +1,10 @@
# This mimics the top-level Makefile. We do it explicitly here so that this
# Makefile can operate with or without the kbuild infrastructure.
+ifneq ($(LLVM),)
+CC := clang
+else
CC := $(CROSS_COMPILE)gcc
+endif
define RUN_TESTS
@for TEST in $(TEST_PROGS); do \
--
2.30.2
From: Yonghong Song <yhs(a)fb.com>
[ Upstream commit 26e6dd1072763cd5696b75994c03982dde952ad9 ]
selftests/bpf/Makefile includes lib.mk. With the following command
make -j60 LLVM=1 LLVM_IAS=1 <=== compile kernel
make -j60 -C tools/testing/selftests/bpf LLVM=1 LLVM_IAS=1 V=1
some files are still compiled with gcc. This patch
fixes the lib.mk issue of CC being set to gcc in all cases.
Signed-off-by: Yonghong Song <yhs(a)fb.com>
Signed-off-by: Alexei Starovoitov <ast(a)kernel.org>
Acked-by: Andrii Nakryiko <andrii(a)kernel.org>
Link: https://lore.kernel.org/bpf/20210413153413.3027426-1-yhs@fb.com
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/lib.mk | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/tools/testing/selftests/lib.mk b/tools/testing/selftests/lib.mk
index c9be64dc681d..cd3034602ea5 100644
--- a/tools/testing/selftests/lib.mk
+++ b/tools/testing/selftests/lib.mk
@@ -1,6 +1,10 @@
# This mimics the top-level Makefile. We do it explicitly here so that this
# Makefile can operate with or without the kbuild infrastructure.
+ifneq ($(LLVM),)
+CC := clang
+else
CC := $(CROSS_COMPILE)gcc
+endif
ifeq (0,$(MAKELEVEL))
OUTPUT := $(shell pwd)
--
2.30.2
From: Yonghong Song <yhs(a)fb.com>
[ Upstream commit 26e6dd1072763cd5696b75994c03982dde952ad9 ]
selftests/bpf/Makefile includes lib.mk. With the following command
make -j60 LLVM=1 LLVM_IAS=1 <=== compile kernel
make -j60 -C tools/testing/selftests/bpf LLVM=1 LLVM_IAS=1 V=1
some files are still compiled with gcc. This patch
fixes the lib.mk issue of CC being set to gcc in all cases.
Signed-off-by: Yonghong Song <yhs(a)fb.com>
Signed-off-by: Alexei Starovoitov <ast(a)kernel.org>
Acked-by: Andrii Nakryiko <andrii(a)kernel.org>
Link: https://lore.kernel.org/bpf/20210413153413.3027426-1-yhs@fb.com
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/lib.mk | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/tools/testing/selftests/lib.mk b/tools/testing/selftests/lib.mk
index 0ef203ec59fd..a5d40653a921 100644
--- a/tools/testing/selftests/lib.mk
+++ b/tools/testing/selftests/lib.mk
@@ -1,6 +1,10 @@
# This mimics the top-level Makefile. We do it explicitly here so that this
# Makefile can operate with or without the kbuild infrastructure.
+ifneq ($(LLVM),)
+CC := clang
+else
CC := $(CROSS_COMPILE)gcc
+endif
ifeq (0,$(MAKELEVEL))
OUTPUT := $(shell pwd)
--
2.30.2
From: Yonghong Song <yhs(a)fb.com>
[ Upstream commit 26e6dd1072763cd5696b75994c03982dde952ad9 ]
selftests/bpf/Makefile includes lib.mk. With the following command
make -j60 LLVM=1 LLVM_IAS=1 <=== compile kernel
make -j60 -C tools/testing/selftests/bpf LLVM=1 LLVM_IAS=1 V=1
some files are still compiled with gcc. This patch
fixes the lib.mk issue of CC being set to gcc in all cases.
Signed-off-by: Yonghong Song <yhs(a)fb.com>
Signed-off-by: Alexei Starovoitov <ast(a)kernel.org>
Acked-by: Andrii Nakryiko <andrii(a)kernel.org>
Link: https://lore.kernel.org/bpf/20210413153413.3027426-1-yhs@fb.com
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/lib.mk | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/tools/testing/selftests/lib.mk b/tools/testing/selftests/lib.mk
index 3ed0134a764d..67386aa3f31d 100644
--- a/tools/testing/selftests/lib.mk
+++ b/tools/testing/selftests/lib.mk
@@ -1,6 +1,10 @@
# This mimics the top-level Makefile. We do it explicitly here so that this
# Makefile can operate with or without the kbuild infrastructure.
+ifneq ($(LLVM),)
+CC := clang
+else
CC := $(CROSS_COMPILE)gcc
+endif
ifeq (0,$(MAKELEVEL))
ifeq ($(OUTPUT),)
--
2.30.2
From: Petr Machata <petrm(a)nvidia.com>
[ Upstream commit 1233898ab758cbcf5f6fea10b8dd16a0b2c24fab ]
The mirror_gre_scale test creates as many ERSPAN sessions as the underlying
chip supports, and tests that they all work. In order to determine that, it
issues a stream of ICMP packets and checks if they are mirrored as
expected.
However, the mausezahn invocation missed the -6 flag to identify the use of
IPv6 protocol, and was sending ICMP messages over IPv6, as opposed to
ICMP6. It also didn't pass an explicit source IP address, which apparently
worked at some point in the past, but does not anymore.
To fix these issues, extend the function mirror_test() in mirror_lib by
detecting the IPv6 protocol addresses, and using a different ICMP scheme.
Fix __mirror_gre_test() in the selftest itself to pass a source IP address.
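The address detection described above is a plain character-class test rather
than a full address parser. For illustration only, the same idea can be
sketched in C; looks_like_ipv6() is a hypothetical name, and the actual
selftest implements the check in bash.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true if addr consists only of hex digits and colons. An IPv4
 * address in dotted notation fails the test because of its dots. Like the
 * shell version, this also accepts the empty string.
 */
bool looks_like_ipv6(const char *addr)
{
	assert(addr != NULL);
	return strspn(addr, "0123456789abcdefABCDEF:") == strlen(addr);
}
```

This is deliberately loose: it is enough to tell the test's IPv6 addresses
apart from dotted IPv4 ones, but it does not validate the address. Where
validation matters, inet_pton() with AF_INET6 would be a stricter choice.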
Signed-off-by: Petr Machata <petrm(a)nvidia.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
.../drivers/net/mlxsw/mirror_gre_scale.sh | 3 ++-
.../selftests/net/forwarding/mirror_lib.sh | 19 +++++++++++++++++--
2 files changed, 19 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/drivers/net/mlxsw/mirror_gre_scale.sh b/tools/testing/selftests/drivers/net/mlxsw/mirror_gre_scale.sh
index 6f3a70df63bc..e00435753008 100644
--- a/tools/testing/selftests/drivers/net/mlxsw/mirror_gre_scale.sh
+++ b/tools/testing/selftests/drivers/net/mlxsw/mirror_gre_scale.sh
@@ -120,12 +120,13 @@ __mirror_gre_test()
sleep 5
for ((i = 0; i < count; ++i)); do
+ local sip=$(mirror_gre_ipv6_addr 1 $i)::1
local dip=$(mirror_gre_ipv6_addr 1 $i)::2
local htun=h3-gt6-$i
local message
icmp6_capture_install $htun
- mirror_test v$h1 "" $dip $htun 100 10
+ mirror_test v$h1 $sip $dip $htun 100 10
icmp6_capture_uninstall $htun
done
}
diff --git a/tools/testing/selftests/net/forwarding/mirror_lib.sh b/tools/testing/selftests/net/forwarding/mirror_lib.sh
index 13db1cb50e57..6406cd76a19d 100644
--- a/tools/testing/selftests/net/forwarding/mirror_lib.sh
+++ b/tools/testing/selftests/net/forwarding/mirror_lib.sh
@@ -20,6 +20,13 @@ mirror_uninstall()
tc filter del dev $swp1 $direction pref 1000
}
+is_ipv6()
+{
+ local addr=$1; shift
+
+ [[ -z ${addr//[0-9a-fA-F:]/} ]]
+}
+
mirror_test()
{
local vrf_name=$1; shift
@@ -29,9 +36,17 @@ mirror_test()
local pref=$1; shift
local expect=$1; shift
+ if is_ipv6 $dip; then
+ local proto=-6
+ local type="icmp6 type=128" # Echo request.
+ else
+ local proto=
+ local type="icmp echoreq"
+ fi
+
local t0=$(tc_rule_stats_get $dev $pref)
- $MZ $vrf_name ${sip:+-A $sip} -B $dip -a own -b bc -q \
- -c 10 -d 100msec -t icmp type=8
+ $MZ $proto $vrf_name ${sip:+-A $sip} -B $dip -a own -b bc -q \
+ -c 10 -d 100msec -t $type
sleep 0.5
local t1=$(tc_rule_stats_get $dev $pref)
local delta=$((t1 - t0))
--
2.30.2
From: Petr Machata <petrm(a)nvidia.com>
[ Upstream commit dda7f4fa55839baeb72ae040aeaf9ccf89d3e416 ]
The intention behind this test is to make sure that qdisc limit is
correctly projected to the HW. However, first, due to rounding in the
qdisc, and then in the driver, the number cannot actually be accurate. And
second, the approach to testing this is to oversubscribe the port with
traffic generated on the same switch. The actual backlog size therefore
fluctuates.
In practice, this test proved to be noisier than the rest, and spuriously
fails every now and then. Increase the tolerance to 10% to avoid these
issues.
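The tolerance check itself is simple integer arithmetic. A sketch of the same
computation in C follows; backlog_within_tolerance() and the sample numbers
are made up for illustration, and the real check lives in the shell test.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Return true if the observed backlog is at most max_pct percent below the
 * configured limit. A backlog well above the limit yields a negative
 * percentage and fails the check.
 */
bool backlog_within_tolerance(long long limit, long long backlog, int max_pct)
{
	long long pct;

	assert(limit > 0);
	pct = 100 * (limit - backlog) / limit;	/* integer division */
	return pct >= 0 && pct <= max_pct;
}
```

For example, with a limit of 1000000 and an observed backlog of 930000 the
distance is 7%, which the old 5% threshold rejects and the new 10% threshold
accepts.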
Signed-off-by: Petr Machata <petrm(a)nvidia.com>
Acked-by: Jiri Pirko <jiri(a)nvidia.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/drivers/net/mlxsw/sch_red_core.sh | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/drivers/net/mlxsw/sch_red_core.sh b/tools/testing/selftests/drivers/net/mlxsw/sch_red_core.sh
index b0cb1aaffdda..33ddd01689be 100644
--- a/tools/testing/selftests/drivers/net/mlxsw/sch_red_core.sh
+++ b/tools/testing/selftests/drivers/net/mlxsw/sch_red_core.sh
@@ -507,8 +507,8 @@ do_red_test()
check_err $? "backlog $backlog / $limit Got $pct% marked packets, expected == 0."
local diff=$((limit - backlog))
pct=$((100 * diff / limit))
- ((0 <= pct && pct <= 5))
- check_err $? "backlog $backlog / $limit expected <= 5% distance"
+ ((0 <= pct && pct <= 10))
+ check_err $? "backlog $backlog / $limit expected <= 10% distance"
log_test "TC $((vlan - 10)): RED backlog > limit"
stop_traffic
--
2.30.2
From: Yonghong Song <yhs(a)fb.com>
[ Upstream commit 26e6dd1072763cd5696b75994c03982dde952ad9 ]
selftests/bpf/Makefile includes lib.mk. With the following command
make -j60 LLVM=1 LLVM_IAS=1 <=== compile kernel
make -j60 -C tools/testing/selftests/bpf LLVM=1 LLVM_IAS=1 V=1
some files are still compiled with gcc. This patch
fixes the lib.mk issue of CC being set to gcc in all cases.
Signed-off-by: Yonghong Song <yhs(a)fb.com>
Signed-off-by: Alexei Starovoitov <ast(a)kernel.org>
Acked-by: Andrii Nakryiko <andrii(a)kernel.org>
Link: https://lore.kernel.org/bpf/20210413153413.3027426-1-yhs@fb.com
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/lib.mk | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/tools/testing/selftests/lib.mk b/tools/testing/selftests/lib.mk
index a5ce26d548e4..9a41d8bb9ff1 100644
--- a/tools/testing/selftests/lib.mk
+++ b/tools/testing/selftests/lib.mk
@@ -1,6 +1,10 @@
# This mimics the top-level Makefile. We do it explicitly here so that this
# Makefile can operate with or without the kbuild infrastructure.
+ifneq ($(LLVM),)
+CC := clang
+else
CC := $(CROSS_COMPILE)gcc
+endif
ifeq (0,$(MAKELEVEL))
ifeq ($(OUTPUT),)
--
2.30.2
From: Russell Currey <ruscur(a)russell.cc>
[ Upstream commit 3a72c94ebfb1f171eba0715998010678a09ec796 ]
The rfi_flush and entry_flush selftests work by using the PM_LD_MISS_L1
perf event to count L1D misses. The value of this event has changed
over time:
- Power7 uses 0x400f0
- Power8 and Power9 use both 0x400f0 and 0x3e054
- Power10 uses only 0x3e054
Rather than relying on raw values, configure perf to count L1D read
misses in the most explicit way available.
This fixes the selftests to work on systems without 0x400f0 as
PM_LD_MISS_L1, and should change no behaviour for systems that the tests
already worked on.
The only potential downside is that referring to a specific perf event
requires PMU support implemented in the kernel for that platform.
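As a standalone illustration of the generic cache event encoding, here is a
minimal sketch that fills in a perf_event_attr for L1D read misses and opens
a counter with the raw syscall. The function names are made up for this
example, and the selftests' own perf_event_open_counter() helper may set
additional fields that are not shown here.

```c
#include <assert.h>
#include <string.h>
#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Describe "L1D read miss" using the generic hardware cache event type. */
void l1d_read_miss_attr(struct perf_event_attr *attr)
{
	assert(attr != NULL);
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_HW_CACHE;
	/* config = cache id | (operation << 8) | (result << 16) */
	attr->config = PERF_COUNT_HW_CACHE_L1D |
		       (PERF_COUNT_HW_CACHE_OP_READ << 8) |
		       (PERF_COUNT_HW_CACHE_RESULT_MISS << 16);
	attr->disabled = 1;	/* start stopped; enable via ioctl later */
}

/* Open the counter for the calling thread on any CPU. Returns fd or -1. */
int open_l1d_read_miss_counter(void)
{
	struct perf_event_attr attr;

	l1d_read_miss_attr(&attr);
	return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
}
```

With the generic encoding the kernel's PMU driver maps the request to whatever
raw event the running CPU uses, which is why no per-CPU event code appears
in the caller.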
Signed-off-by: Russell Currey <ruscur(a)russell.cc>
Acked-by: Daniel Axtens <dja(a)axtens.net>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/20210223070227.2916871-1-ruscur@russell.cc
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/powerpc/security/entry_flush.c | 2 +-
tools/testing/selftests/powerpc/security/flush_utils.h | 4 ++++
tools/testing/selftests/powerpc/security/rfi_flush.c | 2 +-
3 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/powerpc/security/entry_flush.c b/tools/testing/selftests/powerpc/security/entry_flush.c
index 78cf914fa321..68ce377b205e 100644
--- a/tools/testing/selftests/powerpc/security/entry_flush.c
+++ b/tools/testing/selftests/powerpc/security/entry_flush.c
@@ -53,7 +53,7 @@ int entry_flush_test(void)
entry_flush = entry_flush_orig;
- fd = perf_event_open_counter(PERF_TYPE_RAW, /* L1d miss */ 0x400f0, -1);
+ fd = perf_event_open_counter(PERF_TYPE_HW_CACHE, PERF_L1D_READ_MISS_CONFIG, -1);
FAIL_IF(fd < 0);
p = (char *)memalign(zero_size, CACHELINE_SIZE);
diff --git a/tools/testing/selftests/powerpc/security/flush_utils.h b/tools/testing/selftests/powerpc/security/flush_utils.h
index 07a5eb301466..7a3d60292916 100644
--- a/tools/testing/selftests/powerpc/security/flush_utils.h
+++ b/tools/testing/selftests/powerpc/security/flush_utils.h
@@ -9,6 +9,10 @@
#define CACHELINE_SIZE 128
+#define PERF_L1D_READ_MISS_CONFIG ((PERF_COUNT_HW_CACHE_L1D) | \
+ (PERF_COUNT_HW_CACHE_OP_READ << 8) | \
+ (PERF_COUNT_HW_CACHE_RESULT_MISS << 16))
+
void syscall_loop(char *p, unsigned long iterations,
unsigned long zero_size);
diff --git a/tools/testing/selftests/powerpc/security/rfi_flush.c b/tools/testing/selftests/powerpc/security/rfi_flush.c
index 7565fd786640..f73484a6470f 100644
--- a/tools/testing/selftests/powerpc/security/rfi_flush.c
+++ b/tools/testing/selftests/powerpc/security/rfi_flush.c
@@ -54,7 +54,7 @@ int rfi_flush_test(void)
rfi_flush = rfi_flush_orig;
- fd = perf_event_open_counter(PERF_TYPE_RAW, /* L1d miss */ 0x400f0, -1);
+ fd = perf_event_open_counter(PERF_TYPE_HW_CACHE, PERF_L1D_READ_MISS_CONFIG, -1);
FAIL_IF(fd < 0);
p = (char *)memalign(zero_size, CACHELINE_SIZE);
--
2.30.2
From: Petr Machata <petrm(a)nvidia.com>
[ Upstream commit dda7f4fa55839baeb72ae040aeaf9ccf89d3e416 ]
The intention behind this test is to make sure that qdisc limit is
correctly projected to the HW. However, first, due to rounding in the
qdisc, and then in the driver, the number cannot actually be accurate. And
second, the approach to testing this is to oversubscribe the port with
traffic generated on the same switch. The actual backlog size therefore
fluctuates.
In practice, this test proved to be noisier than the rest, and spuriously
fails every now and then. Increase the tolerance to 10 % to avoid these
issues.
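As a rough illustration of what the wider tolerance means, the check in
do_red_test() can be mirrored in Python. The limit and backlog values used
below are invented for the example; only the percentage arithmetic follows
the script.

```python
# Sketch only: mirrors ((0 <= pct && pct <= 10)) from do_red_test().
# Shell arithmetic truncates toward zero, so that is emulated explicitly
# (Python's // operator floors instead).
def trunc_div(num, den):
    """Integer division truncating toward zero, as shell arithmetic does."""
    q = abs(num) // abs(den)
    return q if (num >= 0) == (den > 0) else -q


def backlog_within_tolerance(limit, backlog, tolerance_pct=10):
    """True when the backlog is within tolerance_pct percent below limit."""
    diff = limit - backlog
    pct = trunc_div(100 * diff, limit)
    return 0 <= pct <= tolerance_pct


if __name__ == "__main__":
    # Hypothetical numbers: a backlog 7% short of the limit.
    print(backlog_within_tolerance(1000000, 930000))                   # new 10% bound
    print(backlog_within_tolerance(1000000, 930000, tolerance_pct=5))  # old 5% bound
```

With these made-up numbers the run would have failed under the old 5 %
bound and passes under the new one.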
Signed-off-by: Petr Machata <petrm(a)nvidia.com>
Acked-by: Jiri Pirko <jiri(a)nvidia.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/drivers/net/mlxsw/sch_red_core.sh | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/drivers/net/mlxsw/sch_red_core.sh b/tools/testing/selftests/drivers/net/mlxsw/sch_red_core.sh
index b0cb1aaffdda..33ddd01689be 100644
--- a/tools/testing/selftests/drivers/net/mlxsw/sch_red_core.sh
+++ b/tools/testing/selftests/drivers/net/mlxsw/sch_red_core.sh
@@ -507,8 +507,8 @@ do_red_test()
check_err $? "backlog $backlog / $limit Got $pct% marked packets, expected == 0."
local diff=$((limit - backlog))
pct=$((100 * diff / limit))
- ((0 <= pct && pct <= 5))
- check_err $? "backlog $backlog / $limit expected <= 5% distance"
+ ((0 <= pct && pct <= 10))
+ check_err $? "backlog $backlog / $limit expected <= 10% distance"
log_test "TC $((vlan - 10)): RED backlog > limit"
stop_traffic
--
2.30.2
From: Yonghong Song <yhs(a)fb.com>
[ Upstream commit 26e6dd1072763cd5696b75994c03982dde952ad9 ]
selftests/bpf/Makefile includes lib.mk. With the following command
make -j60 LLVM=1 LLVM_IAS=1 <=== compile kernel
make -j60 -C tools/testing/selftests/bpf LLVM=1 LLVM_IAS=1 V=1
some files are still compiled with gcc. This patch
fixes the lib.mk issue, which set CC to gcc in all cases.
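The selection logic that the patch adds to lib.mk can be modelled in a few
lines of Python. This is only a sketch of the make conditional, with
select_cc() being a hypothetical name; the real logic lives in the
Makefile fragment shown in the diff.

```python
# Sketch only: models
#   ifneq ($(LLVM),)
#   CC := clang
#   else
#   CC := $(CROSS_COMPILE)gcc
#   endif
# The function name and sample values are illustrative.
def select_cc(llvm="", cross_compile=""):
    """Return the compiler lib.mk would pick for the given make variables."""
    if llvm != "":  # ifneq ($(LLVM),): any non-empty value counts
        return "clang"
    return cross_compile + "gcc"


if __name__ == "__main__":
    print(select_cc(llvm="1"))
    print(select_cc(cross_compile="aarch64-linux-gnu-"))
```

One consequence of testing for a non-empty value is that, as written,
LLVM=0 would also select clang.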
Signed-off-by: Yonghong Song <yhs(a)fb.com>
Signed-off-by: Alexei Starovoitov <ast(a)kernel.org>
Acked-by: Andrii Nakryiko <andrii(a)kernel.org>
Link: https://lore.kernel.org/bpf/20210413153413.3027426-1-yhs@fb.com
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/lib.mk | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/tools/testing/selftests/lib.mk b/tools/testing/selftests/lib.mk
index a5ce26d548e4..9a41d8bb9ff1 100644
--- a/tools/testing/selftests/lib.mk
+++ b/tools/testing/selftests/lib.mk
@@ -1,6 +1,10 @@
# This mimics the top-level Makefile. We do it explicitly here so that this
# Makefile can operate with or without the kbuild infrastructure.
+ifneq ($(LLVM),)
+CC := clang
+else
CC := $(CROSS_COMPILE)gcc
+endif
ifeq (0,$(MAKELEVEL))
ifeq ($(OUTPUT),)
--
2.30.2
From: Russell Currey <ruscur(a)russell.cc>
[ Upstream commit 3a72c94ebfb1f171eba0715998010678a09ec796 ]
The rfi_flush and entry_flush selftests work by using the PM_LD_MISS_L1
perf event to count L1D misses. The value of this event has changed
over time:
- Power7 uses 0x400f0
- Power8 and Power9 use both 0x400f0 and 0x3e054
- Power10 uses only 0x3e054
Rather than relying on raw values, configure perf to count L1D read
misses in the most explicit way available.
This fixes the selftests to work on systems without 0x400f0 as
PM_LD_MISS_L1, and should change no behaviour for systems that the tests
already worked on.
The only potential downside is that referring to a specific perf event
requires PMU support implemented in the kernel for that platform.
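For reference, the config value built by the new PERF_L1D_READ_MISS_CONFIG
macro can be worked out with a short Python sketch. The three enum values
below are written out by hand, as I understand them from the perf_event
uapi header; treat them as an assumption and check the header if in doubt.

```python
# Sketch only: reproduces the bit layout used by PERF_TYPE_HW_CACHE events,
#   config = cache_id | (op_id << 8) | (result_id << 16)
# Enum values are assumed from include/uapi/linux/perf_event.h.
PERF_COUNT_HW_CACHE_L1D = 0
PERF_COUNT_HW_CACHE_OP_READ = 0
PERF_COUNT_HW_CACHE_RESULT_MISS = 1


def hw_cache_config(cache_id, op_id, result_id):
    """Pack cache, operation and result IDs into one config value."""
    return cache_id | (op_id << 8) | (result_id << 16)


PERF_L1D_READ_MISS_CONFIG = hw_cache_config(
    PERF_COUNT_HW_CACHE_L1D,
    PERF_COUNT_HW_CACHE_OP_READ,
    PERF_COUNT_HW_CACHE_RESULT_MISS,
)

if __name__ == "__main__":
    print(hex(PERF_L1D_READ_MISS_CONFIG))
```

Unlike the raw event codes 0x400f0 and 0x3e054, this value is the same on
every platform; the kernel's PMU driver maps it to the right hardware
event.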
Signed-off-by: Russell Currey <ruscur(a)russell.cc>
Acked-by: Daniel Axtens <dja(a)axtens.net>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/20210223070227.2916871-1-ruscur@russell.cc
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/powerpc/security/entry_flush.c | 2 +-
tools/testing/selftests/powerpc/security/flush_utils.h | 4 ++++
tools/testing/selftests/powerpc/security/rfi_flush.c | 2 +-
3 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/powerpc/security/entry_flush.c b/tools/testing/selftests/powerpc/security/entry_flush.c
index 78cf914fa321..68ce377b205e 100644
--- a/tools/testing/selftests/powerpc/security/entry_flush.c
+++ b/tools/testing/selftests/powerpc/security/entry_flush.c
@@ -53,7 +53,7 @@ int entry_flush_test(void)
entry_flush = entry_flush_orig;
- fd = perf_event_open_counter(PERF_TYPE_RAW, /* L1d miss */ 0x400f0, -1);
+ fd = perf_event_open_counter(PERF_TYPE_HW_CACHE, PERF_L1D_READ_MISS_CONFIG, -1);
FAIL_IF(fd < 0);
p = (char *)memalign(zero_size, CACHELINE_SIZE);
diff --git a/tools/testing/selftests/powerpc/security/flush_utils.h b/tools/testing/selftests/powerpc/security/flush_utils.h
index 07a5eb301466..7a3d60292916 100644
--- a/tools/testing/selftests/powerpc/security/flush_utils.h
+++ b/tools/testing/selftests/powerpc/security/flush_utils.h
@@ -9,6 +9,10 @@
#define CACHELINE_SIZE 128
+#define PERF_L1D_READ_MISS_CONFIG ((PERF_COUNT_HW_CACHE_L1D) | \
+ (PERF_COUNT_HW_CACHE_OP_READ << 8) | \
+ (PERF_COUNT_HW_CACHE_RESULT_MISS << 16))
+
void syscall_loop(char *p, unsigned long iterations,
unsigned long zero_size);
diff --git a/tools/testing/selftests/powerpc/security/rfi_flush.c b/tools/testing/selftests/powerpc/security/rfi_flush.c
index 7565fd786640..f73484a6470f 100644
--- a/tools/testing/selftests/powerpc/security/rfi_flush.c
+++ b/tools/testing/selftests/powerpc/security/rfi_flush.c
@@ -54,7 +54,7 @@ int rfi_flush_test(void)
rfi_flush = rfi_flush_orig;
- fd = perf_event_open_counter(PERF_TYPE_RAW, /* L1d miss */ 0x400f0, -1);
+ fd = perf_event_open_counter(PERF_TYPE_HW_CACHE, PERF_L1D_READ_MISS_CONFIG, -1);
FAIL_IF(fd < 0);
p = (char *)memalign(zero_size, CACHELINE_SIZE);
--
2.30.2
TL;DR: Add support to kunit_tool to dispatch tests via QEMU. Also add
support to immediately shut down a kernel after running KUnit tests.
Background
----------
KUnit has supported running on all architectures for quite some time;
however, kunit_tool - the script commonly used to invoke KUnit tests -
has only fully supported running KUnit on UML. Its functionality has been
broken up for some time to separate the configure, build, run, and parse
phases making it possible to be used in part on other architectures to a
small extent. Nevertheless, kunit_tool has not supported running tests
on other architectures.
What this patchset does
-----------------------
This patchset introduces first class support to kunit_tool for KUnit to
be run on many popular architectures via QEMU. It does this by adding
two new flags: `--arch` and `--cross_compile`.
`--arch` allows an architecture to be specified by the name the
architecture is given in `arch/`. It uses the specified architecture to
select a minimal amount of Kconfigs and QEMU configs needed for the
architecture to run in QEMU and provide a console from which KTAP
results can be scraped.
`--cross_compile` allows a toolchain prefix to be passed to make,
similar to how `CROSS_COMPILE` is used.
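As an illustration of how the two flags fit together, here is a small
Python sketch that only assembles a kunit.py command line; it does not run
anything. The helper name kunit_run_command() is hypothetical, and the
arm64 architecture and aarch64-linux-gnu- prefix are example values that
assume such a toolchain is installed.

```python
# Sketch only: builds an argv list for kunit_tool using the --arch and
# --cross_compile flags described above. Nothing is executed here.
def kunit_run_command(arch=None, cross_compile=None):
    """Return the argument list for a kunit.py run invocation."""
    cmd = ["./tools/testing/kunit/kunit.py", "run"]
    if arch:
        cmd.append("--arch=" + arch)  # name as found under arch/
    if cross_compile:
        cmd.append("--cross_compile=" + cross_compile)  # toolchain prefix
    return cmd


if __name__ == "__main__":
    # Example values; adjust to the architecture and toolchain at hand.
    print(" ".join(kunit_run_command("arm64", "aarch64-linux-gnu-")))
```

Leaving both arguments out gives the plain invocation, which keeps the
existing UML behaviour.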
Additionally, this patchset revives the previously considered "kunit:
tool: add support for QEMU"[1] patches. The motivation for the new
kernel command line flag, `kunit_shutdown`, is to better support
running KUnit tests inside of QEMU. For most popular architectures, QEMU
can be made to terminate when the Linux kernel that is being run is
rebooted, halted, or powered off. As Kees pointed out in a previous
discussion[2], it is possible to make a kernel initrd that can reboot
the kernel immediately; however, doing this for every architecture would
likely be infeasible. Instead, just having an option for the kernel to
shut down when it is done with testing seems a lot simpler, especially
since it is an option which would only be available in testing
configurations of the kernel anyway.
What discussion remains for this patchset?
------------------------------------------
The first and most obvious thing is settling the debate about
`kunit_shutdown`. If I recall correctly, Kees suggested that it might be
better to just add a new initrd; however, as I mentioned above, now that
we want to support many new architectures, it may be substantially
easier to support this option. So I am hoping that with this new use
case, the argument for `kunit_shutdown` will be more compelling.
The second and likely harder issue is figuring out the best way to
configure and provide configs for running KUnit tests via QEMU. I
provide a pretty primitive way in this patchset which is not super
flexible; for example, for our PPC support we have it set to build big
endian, and POWER8 - we currently don't support a way to change that.
Nevertheless, having sensible defaults is handy too, so we will probably
want to have some support for overriding defaults, while still being
able to have defaults.
[1] http://patches.linaro.org/patch/208336/
[2] https://lkml.org/lkml/2020/6/26/988
Brendan Higgins (3):
Documentation: Add kunit_shutdown to kernel-parameters.txt
kunit: tool: add support for QEMU
Documentation: kunit: document support for QEMU in kunit_tool
David Gow (1):
kunit: Add 'kunit_shutdown' option
.../admin-guide/kernel-parameters.txt | 8 +
Documentation/dev-tools/kunit/usage.rst | 37 +++-
lib/kunit/executor.c | 20 ++
tools/testing/kunit/kunit.py | 33 ++-
tools/testing/kunit/kunit_config.py | 2 +-
tools/testing/kunit/kunit_kernel.py | 209 +++++++++++++++---
tools/testing/kunit/kunit_parser.py | 2 +-
tools/testing/kunit/kunit_tool_test.py | 15 +-
8 files changed, 278 insertions(+), 48 deletions(-)
base-commit: 7af08140979a6e7e12b78c93b8625c8d25b084e2
--
2.31.1.498.g6c1eba8ee3d-goog
The KVM_GET_CPUID2 ioctl is not very well documented, but its
implementation in kvm_vcpu_ioctl_get_cpuid2 suggests that even on the
error path it tries to return the number of entries to the caller.
However, the vcpu ioctl dispatcher code in kvm_arch_vcpu_ioctl ignores
any output from this function if it sees an error return code.
The code makes it clear that the function was designed to accept a
too-small number of entries and return E2BIG along with the corrected
number.
This lost logic in the dispatcher code has been restored by relaxing the
return-code check so that the result is still copied back to userspace
when the error is E2BIG. With this change, the ioctl caller sees both
the number of entries and the correct error.
In the selftests, the relevant function vcpu_get_cpuid has also been
modified to use the number of cpuid entries returned along with errno
E2BIG.
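The two-call sequence that this behaviour enables can be sketched with a
small Python simulation. FakeVcpu is a hypothetical stand-in that only
models the return convention described above (rc of -1 with errno E2BIG
and a corrected nent); it is not the real KVM interface, and the entry
names are invented.

```python
# Sketch only: simulates the size-probing pattern for KVM_GET_CPUID2.
# FakeVcpu and its method are hypothetical; no real ioctl is issued.
import errno


class FakeVcpu:
    """Models a vcpu that holds a fixed list of cpuid entries."""

    def __init__(self, entries):
        self._entries = list(entries)

    def get_cpuid2(self, nent):
        """Return (rc, err, nent, entries) following the patched semantics."""
        if nent < len(self._entries):
            # Too small: report E2BIG and the actual count, no entries.
            return -1, errno.E2BIG, len(self._entries), []
        return 0, 0, len(self._entries), list(self._entries)


def fetch_all_entries(vcpu):
    """First call with nent=0 to learn the size, second call to fetch."""
    rc, err, nent, entries = vcpu.get_cpuid2(0)
    if rc == 0:
        return entries  # nothing to fetch
    if err != errno.E2BIG:
        raise OSError(err, "unexpected KVM_GET_CPUID2 failure")
    rc, err, nent, entries = vcpu.get_cpuid2(nent)
    if rc != 0:
        raise OSError(err, "KVM_GET_CPUID2 failed on second call")
    return entries


if __name__ == "__main__":
    print(fetch_all_entries(FakeVcpu(["leaf0", "leaf1", "leaf7"])))
```

The selftest change in this patch follows the same shape: one call with
nent set to 0 that is expected to fail with E2BIG, then a second call with
the reported count.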
Signed-off-by: Valeriy Vdovin <valeriy.vdovin(a)virtuozzo.com>
---
v4:
- Added description to documentation of KVM_GET_CPUID2.
- Copy back nent only if E2BIG is returned.
- Fixed error code sign.
- Corrected version message
Documentation/virt/kvm/api.rst | 81 ++++++++++++-------
arch/x86/kvm/x86.c | 11 ++-
.../selftests/kvm/lib/x86_64/processor.c | 20 +++--
3 files changed, 73 insertions(+), 39 deletions(-)
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 245d80581f15..c7cfe4b9614e 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -711,7 +711,34 @@ resulting CPUID configuration through KVM_GET_CPUID2 in case.
};
-4.21 KVM_SET_SIGNAL_MASK
+4.21 KVM_GET_CPUID2
+------------------
+
+:Capability: basic
+:Architectures: x86
+:Type: vcpu ioctl
+:Parameters: struct kvm_cpuid (in/out)
+:Returns: 0 on success, -1 on error
+
+Returns a full list of cpuid entries that are supported by this vcpu and were
+previously set by KVM_SET_CPUID/KVM_SET_CPUID2.
+
+The userspace must specify the number of cpuid entries it is ready to accept
+from the kernel in the 'nent' field of 'struct kvm_cpuid'.
+
+The kernel will try to return all the cpuid entries it has in the response.
+If the userspace nent value is too small for the full response, the kernel will
+set the error code to -E2BIG, set the same 'nent' field to the actual number of
+cpuid_entries and return without writing back any entries to the userspace.
+The userspace can thus implement a two-call sequence, where the first call is
+made with nent set to 0 to read the number of entries from the kernel and
+use this response to allocate enough memory for a full response for the second
+call.
+
+The call can also return with error code -EFAULT in case of other errors.
+
+
+4.22 KVM_SET_SIGNAL_MASK
------------------------
:Capability: basic
@@ -737,7 +764,7 @@ signal mask.
};
-4.22 KVM_GET_FPU
+4.23 KVM_GET_FPU
----------------
:Capability: basic
@@ -766,7 +793,7 @@ Reads the floating point state from the vcpu.
};
-4.23 KVM_SET_FPU
+4.24 KVM_SET_FPU
----------------
:Capability: basic
@@ -795,7 +822,7 @@ Writes the floating point state to the vcpu.
};
-4.24 KVM_CREATE_IRQCHIP
+4.25 KVM_CREATE_IRQCHIP
-----------------------
:Capability: KVM_CAP_IRQCHIP, KVM_CAP_S390_IRQCHIP (s390)
@@ -817,7 +844,7 @@ Note that on s390 the KVM_CAP_S390_IRQCHIP vm capability needs to be enabled
before KVM_CREATE_IRQCHIP can be used.
-4.25 KVM_IRQ_LINE
+4.26 KVM_IRQ_LINE
-----------------
:Capability: KVM_CAP_IRQCHIP
@@ -886,7 +913,7 @@ be used for a userspace interrupt controller.
};
-4.26 KVM_GET_IRQCHIP
+4.27 KVM_GET_IRQCHIP
--------------------
:Capability: KVM_CAP_IRQCHIP
@@ -911,7 +938,7 @@ KVM_CREATE_IRQCHIP into a buffer provided by the caller.
};
-4.27 KVM_SET_IRQCHIP
+4.28 KVM_SET_IRQCHIP
--------------------
:Capability: KVM_CAP_IRQCHIP
@@ -936,7 +963,7 @@ KVM_CREATE_IRQCHIP from a buffer provided by the caller.
};
-4.28 KVM_XEN_HVM_CONFIG
+4.29 KVM_XEN_HVM_CONFIG
-----------------------
:Capability: KVM_CAP_XEN_HVM
@@ -972,7 +999,7 @@ fields must be zero.
No other flags are currently valid in the struct kvm_xen_hvm_config.
-4.29 KVM_GET_CLOCK
+4.30 KVM_GET_CLOCK
------------------
:Capability: KVM_CAP_ADJUST_CLOCK
@@ -1005,7 +1032,7 @@ TSC is not stable.
};
-4.30 KVM_SET_CLOCK
+4.31 KVM_SET_CLOCK
------------------
:Capability: KVM_CAP_ADJUST_CLOCK
@@ -1027,7 +1054,7 @@ such as migration.
};
-4.31 KVM_GET_VCPU_EVENTS
+4.32 KVM_GET_VCPU_EVENTS
------------------------
:Capability: KVM_CAP_VCPU_EVENTS
@@ -1146,7 +1173,7 @@ directly to the virtual CPU).
__u32 reserved[12];
};
-4.32 KVM_SET_VCPU_EVENTS
+4.33 KVM_SET_VCPU_EVENTS
------------------------
:Capability: KVM_CAP_VCPU_EVENTS
@@ -1209,7 +1236,7 @@ exceptions by manipulating individual registers using the KVM_SET_ONE_REG API.
See KVM_GET_VCPU_EVENTS for the data structure.
-4.33 KVM_GET_DEBUGREGS
+4.34 KVM_GET_DEBUGREGS
----------------------
:Capability: KVM_CAP_DEBUGREGS
@@ -1231,7 +1258,7 @@ Reads debug registers from the vcpu.
};
-4.34 KVM_SET_DEBUGREGS
+4.35 KVM_SET_DEBUGREGS
----------------------
:Capability: KVM_CAP_DEBUGREGS
@@ -1246,7 +1273,7 @@ See KVM_GET_DEBUGREGS for the data structure. The flags field is unused
yet and must be cleared on entry.
-4.35 KVM_SET_USER_MEMORY_REGION
+4.36 KVM_SET_USER_MEMORY_REGION
-------------------------------
:Capability: KVM_CAP_USER_MEMORY
@@ -1315,7 +1342,7 @@ The KVM_SET_MEMORY_REGION does not allow fine grained control over memory
allocation and is deprecated.
-4.36 KVM_SET_TSS_ADDR
+4.37 KVM_SET_TSS_ADDR
---------------------
:Capability: KVM_CAP_SET_TSS_ADDR
@@ -1335,7 +1362,7 @@ because of a quirk in the virtualization implementation (see the internals
documentation when it pops into existence).
-4.37 KVM_ENABLE_CAP
+4.38 KVM_ENABLE_CAP
-------------------
:Capability: KVM_CAP_ENABLE_CAP
@@ -1390,7 +1417,7 @@ function properly, this is the place to put them.
The vcpu ioctl should be used for vcpu-specific capabilities, the vm ioctl
for vm-wide capabilities.
-4.38 KVM_GET_MP_STATE
+4.39 KVM_GET_MP_STATE
---------------------
:Capability: KVM_CAP_MP_STATE
@@ -1438,7 +1465,7 @@ For arm/arm64:
The only states that are valid are KVM_MP_STATE_STOPPED and
KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not.
-4.39 KVM_SET_MP_STATE
+4.40 KVM_SET_MP_STATE
---------------------
:Capability: KVM_CAP_MP_STATE
@@ -1460,7 +1487,7 @@ For arm/arm64:
The only states that are valid are KVM_MP_STATE_STOPPED and
KVM_MP_STATE_RUNNABLE which reflect if the vcpu should be paused or not.
-4.40 KVM_SET_IDENTITY_MAP_ADDR
+4.41 KVM_SET_IDENTITY_MAP_ADDR
------------------------------
:Capability: KVM_CAP_SET_IDENTITY_MAP_ADDR
@@ -1484,7 +1511,7 @@ documentation when it pops into existence).
Fails if any VCPU has already been created.
-4.41 KVM_SET_BOOT_CPU_ID
+4.42 KVM_SET_BOOT_CPU_ID
------------------------
:Capability: KVM_CAP_SET_BOOT_CPU_ID
@@ -1499,7 +1526,7 @@ is vcpu 0. This ioctl has to be called before vcpu creation,
otherwise it will return EBUSY error.
-4.42 KVM_GET_XSAVE
+4.43 KVM_GET_XSAVE
------------------
:Capability: KVM_CAP_XSAVE
@@ -1518,7 +1545,7 @@ otherwise it will return EBUSY error.
This ioctl would copy current vcpu's xsave struct to the userspace.
-4.43 KVM_SET_XSAVE
+4.44 KVM_SET_XSAVE
------------------
:Capability: KVM_CAP_XSAVE
@@ -1537,7 +1564,7 @@ This ioctl would copy current vcpu's xsave struct to the userspace.
This ioctl would copy userspace's xsave struct to the kernel.
-4.44 KVM_GET_XCRS
+4.45 KVM_GET_XCRS
-----------------
:Capability: KVM_CAP_XCRS
@@ -1564,7 +1591,7 @@ This ioctl would copy userspace's xsave struct to the kernel.
This ioctl would copy current vcpu's xcrs to the userspace.
-4.45 KVM_SET_XCRS
+4.46 KVM_SET_XCRS
-----------------
:Capability: KVM_CAP_XCRS
@@ -1591,7 +1618,7 @@ This ioctl would copy current vcpu's xcrs to the userspace.
This ioctl would set vcpu's xcr to the value userspace specified.
-4.46 KVM_GET_SUPPORTED_CPUID
+4.47 KVM_GET_SUPPORTED_CPUID
----------------------------
:Capability: KVM_CAP_EXT_CPUID
@@ -1676,7 +1703,7 @@ if that returns true and you use KVM_CREATE_IRQCHIP, or if you emulate the
feature in userspace, then you can enable the feature for KVM_SET_CPUID2.
-4.47 KVM_PPC_GET_PVINFO
+4.48 KVM_PPC_GET_PVINFO
-----------------------
:Capability: KVM_CAP_PPC_GET_PVINFO
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index efc7a82ab140..3f941b1f4e78 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4773,14 +4773,17 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
r = -EFAULT;
if (copy_from_user(&cpuid, cpuid_arg, sizeof(cpuid)))
goto out;
+
r = kvm_vcpu_ioctl_get_cpuid2(vcpu, &cpuid,
cpuid_arg->entries);
- if (r)
+
+ if (r && r != -E2BIG)
goto out;
- r = -EFAULT;
- if (copy_to_user(cpuid_arg, &cpuid, sizeof(cpuid)))
+
+ if (copy_to_user(cpuid_arg, &cpuid, sizeof(cpuid))) {
+ r = -EFAULT;
goto out;
- r = 0;
+ }
break;
}
case KVM_GET_MSRS: {
diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c
index a8906e60a108..a412b39ad791 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/processor.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c
@@ -727,17 +727,21 @@ struct kvm_cpuid2 *vcpu_get_cpuid(struct kvm_vm *vm, uint32_t vcpuid)
cpuid = allocate_kvm_cpuid2();
max_ent = cpuid->nent;
+ cpuid->nent = 0;
- for (cpuid->nent = 1; cpuid->nent <= max_ent; cpuid->nent++) {
- rc = ioctl(vcpu->fd, KVM_GET_CPUID2, cpuid);
- if (!rc)
- break;
+ rc = ioctl(vcpu->fd, KVM_GET_CPUID2, cpuid);
+ TEST_ASSERT(rc == -1 && errno == E2BIG,
+ "KVM_GET_CPUID2 should return E2BIG: %d %d",
+ rc, errno);
- TEST_ASSERT(rc == -1 && errno == E2BIG,
- "KVM_GET_CPUID2 should either succeed or give E2BIG: %d %d",
- rc, errno);
- }
+ TEST_ASSERT(cpuid->nent,
+ "KVM_GET_CPUID2 failed to set cpuid->nent with E2BIG");
+
+ TEST_ASSERT(cpuid->nent < max_ent,
+ "KVM_GET_CPUID2 has %d entries, expected maximum: %d",
+ cpuid->nent, max_ent);
+ rc = ioctl(vcpu->fd, KVM_GET_CPUID2, cpuid);
TEST_ASSERT(rc == 0, "KVM_GET_CPUID2 failed, rc: %i errno: %i",
rc, errno);
--
2.17.1
Clang's integrated assembler does not allow symbols with non-absolute
values to be reassigned. Modify the interrupt entry loop macro to be
compatible with IAS by using a label and an offset.
Cc: Jian Cai <caij2003(a)gmail.com>
Signed-off-by: Bill Wendling <morbo(a)google.com>
References: https://lore.kernel.org/lkml/20200714233024.1789985-1-caij2003@gmail.com/
---
tools/testing/selftests/kvm/lib/x86_64/handlers.S | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/kvm/lib/x86_64/handlers.S b/tools/testing/selftests/kvm/lib/x86_64/handlers.S
index aaf7bc7d2ce1..3f9181e9a0a7 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/handlers.S
+++ b/tools/testing/selftests/kvm/lib/x86_64/handlers.S
@@ -54,9 +54,9 @@ idt_handlers:
.align 8
/* Fetch current address and append it to idt_handlers. */
- current_handler = .
+0 :
.pushsection .rodata
-.quad current_handler
+ .quad 0b
.popsection
.if ! \has_error
--
2.29.2.576.ga3fc446d84-goog
I'm just starting my learning curve on SGX, so I don't know if I've missed
some setup for the SGX device entries. After looking at arch/x86/kernel/cpu/sgx/driver.c
I see that there is no mode value for either sgx_dev_enclave or sgx_dev_provision.
With this patch I can get the SGX self test to complete:
sudo ./test_sgx
Warning: no execute permissions on device file /dev/sgx_enclave
0x0000000000000000 0x0000000000002000 0x03
0x0000000000002000 0x0000000000001000 0x05
0x0000000000003000 0x0000000000003000 0x03
SUCCESS
Is the warning even necessary?
Tim
Functionally, this just means that the test output will be slightly
changed and it'll now depend on CONFIG_KUNIT=y/m.
It'll still run at boot time and can still be built as a loadable
module.
There was a pre-existing patch to convert this test that I found later,
here [1]. Compared to [1], this patch doesn't rename files and uses
KUnit features more heavily (i.e. does more than converting pr_err()
calls to KUNIT_FAIL()).
What this conversion gives us:
* a shorter test thanks to KUnit's macros
* a way to run this a bit more easily via kunit.py (and
CONFIG_KUNIT_ALL_TESTS=y) [2]
* a structured way of reporting pass/fail
* uses kunit-managed allocations to avoid the risk of memory leaks
* more descriptive error messages:
* i.e. it prints out which fields are invalid, what the expected
values are, etc.
What this conversion does not do:
* change the name of the file (and thus the name of the module)
* change the name of the config option
Leaving these as-is for now to minimize the impact to people wanting to
run this test. IMO, that concern trumps following KUnit's style guide
for both names, at least for now.
[1] https://lore.kernel.org/linux-kselftest/20201015014616.309000-1-vitor@massa…
[2] Can be run via
$ ./tools/testing/kunit/kunit.py run --kunitconfig /dev/stdin <<EOF
CONFIG_KUNIT=y
CONFIG_TEST_LIST_SORT=y
EOF
[16:55:56] Configuring KUnit Kernel ...
[16:55:56] Building KUnit Kernel ...
[16:56:29] Starting KUnit Kernel ...
[16:56:32] ============================================================
[16:56:32] ======== [PASSED] list_sort ========
[16:56:32] [PASSED] list_sort_test
[16:56:32] ============================================================
[16:56:32] Testing complete. 1 tests run. 0 failed. 0 crashed.
[16:56:32] Elapsed time: 35.668s total, 0.001s configuring, 32.725s building, 0.000s running
Note: the build time shown is for a build right after a `make mrproper`.
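For readers who want to see what the converted test checks without booting a kernel, here is a minimal userspace sketch of the same two properties: the result is ordered, and elements with equal keys keep their original relative order. It is not the kernel test. `struct elem`, `stable_sort()` and `check_sorted_stable()` are hypothetical names, the data is an array rather than a `list_head` chain, and a plain insertion sort stands in for list_sort().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the test's struct debug_el. */
struct elem {
	int value;            /* sort key; duplicates are expected */
	unsigned int serial;  /* original position, used to detect instability */
};

/* Stable insertion sort; stands in for list_sort() in this sketch. */
void stable_sort(struct elem *a, size_t n)
{
	assert(a != NULL || n == 0);
	for (size_t i = 1; i < n; i++) {
		struct elem tmp = a[i];
		size_t j = i;

		/* Strict '>' keeps equal keys in their original order. */
		while (j > 0 && a[j - 1].value > tmp.value) {
			a[j] = a[j - 1];
			j--;
		}
		a[j] = tmp;
	}
}

/*
 * Returns 0 if the array is sorted and stable, 1 if two neighbours are
 * out of order, 2 if two equal keys appear in swapped serial order.
 */
int check_sorted_stable(const struct elem *a, size_t n)
{
	for (size_t i = 0; i + 1 < n; i++) {
		if (a[i].value > a[i + 1].value)
			return 1;
		if (a[i].value == a[i + 1].value &&
		    a[i].serial >= a[i + 1].serial)
			return 2;
	}
	return 0;
}
```

In the KUnit version each of these conditions is a separate assertion, so a failure reports which comparison failed along with the values involved instead of a single return code.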
Signed-off-by: Daniel Latypov <dlatypov(a)google.com>
---
lib/Kconfig.debug | 5 +-
lib/test_list_sort.c | 128 +++++++++++++++++--------------------------
2 files changed, 54 insertions(+), 79 deletions(-)
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 417c3d3e521b..09a0cc8a55cc 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1999,8 +1999,9 @@ config LKDTM
Documentation/fault-injection/provoke-crashes.rst
config TEST_LIST_SORT
- tristate "Linked list sorting test"
- depends on DEBUG_KERNEL || m
+ tristate "Linked list sorting test" if !KUNIT_ALL_TESTS
+ depends on KUNIT
+ default KUNIT_ALL_TESTS
help
Enable this to turn on 'list_sort()' function test. This test is
executed only once during system boot (so affects only boot time),
diff --git a/lib/test_list_sort.c b/lib/test_list_sort.c
index 1f017d3b610e..ccfd98dbf57c 100644
--- a/lib/test_list_sort.c
+++ b/lib/test_list_sort.c
@@ -1,5 +1,5 @@
// SPDX-License-Identifier: GPL-2.0-only
-#define pr_fmt(fmt) "list_sort_test: " fmt
+#include <kunit/test.h>
#include <linux/kernel.h>
#include <linux/list_sort.h>
@@ -23,67 +23,52 @@ struct debug_el {
struct list_head list;
unsigned int poison2;
int value;
- unsigned serial;
+ unsigned int serial;
};
-/* Array, containing pointers to all elements in the test list */
-static struct debug_el **elts __initdata;
-
-static int __init check(struct debug_el *ela, struct debug_el *elb)
+static void check(struct kunit *test, struct debug_el *ela, struct debug_el *elb)
{
- if (ela->serial >= TEST_LIST_LEN) {
- pr_err("error: incorrect serial %d\n", ela->serial);
- return -EINVAL;
- }
- if (elb->serial >= TEST_LIST_LEN) {
- pr_err("error: incorrect serial %d\n", elb->serial);
- return -EINVAL;
- }
- if (elts[ela->serial] != ela || elts[elb->serial] != elb) {
- pr_err("error: phantom element\n");
- return -EINVAL;
- }
- if (ela->poison1 != TEST_POISON1 || ela->poison2 != TEST_POISON2) {
- pr_err("error: bad poison: %#x/%#x\n",
- ela->poison1, ela->poison2);
- return -EINVAL;
- }
- if (elb->poison1 != TEST_POISON1 || elb->poison2 != TEST_POISON2) {
- pr_err("error: bad poison: %#x/%#x\n",
- elb->poison1, elb->poison2);
- return -EINVAL;
- }
- return 0;
+ struct debug_el **elts = test->priv;
+
+ KUNIT_EXPECT_LT_MSG(test, ela->serial, (unsigned int)TEST_LIST_LEN, "incorrect serial");
+ KUNIT_EXPECT_LT_MSG(test, elb->serial, (unsigned int)TEST_LIST_LEN, "incorrect serial");
+
+ KUNIT_EXPECT_PTR_EQ_MSG(test, elts[ela->serial], ela, "phantom element");
+ KUNIT_EXPECT_PTR_EQ_MSG(test, elts[elb->serial], elb, "phantom element");
+
+ KUNIT_EXPECT_EQ_MSG(test, ela->poison1, TEST_POISON1, "bad poison");
+ KUNIT_EXPECT_EQ_MSG(test, ela->poison2, TEST_POISON2, "bad poison");
+
+ KUNIT_EXPECT_EQ_MSG(test, elb->poison1, TEST_POISON1, "bad poison");
+ KUNIT_EXPECT_EQ_MSG(test, elb->poison2, TEST_POISON2, "bad poison");
}
-static int __init cmp(void *priv, struct list_head *a, struct list_head *b)
+/* `priv` is the test pointer so check() can fail the test if the list is invalid. */
+static int cmp(void *priv, struct list_head *a, struct list_head *b)
{
struct debug_el *ela, *elb;
ela = container_of(a, struct debug_el, list);
elb = container_of(b, struct debug_el, list);
- check(ela, elb);
+ check(priv, ela, elb);
return ela->value - elb->value;
}
-static int __init list_sort_test(void)
+static void list_sort_test(struct kunit *test)
{
- int i, count = 1, err = -ENOMEM;
- struct debug_el *el;
+ int i, count = 1;
+ struct debug_el *el, **elts;
struct list_head *cur;
LIST_HEAD(head);
- pr_debug("start testing list_sort()\n");
-
- elts = kcalloc(TEST_LIST_LEN, sizeof(*elts), GFP_KERNEL);
- if (!elts)
- return err;
+ elts = kunit_kcalloc(test, TEST_LIST_LEN, sizeof(*elts), GFP_KERNEL);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, elts);
+ test->priv = elts;
for (i = 0; i < TEST_LIST_LEN; i++) {
- el = kmalloc(sizeof(*el), GFP_KERNEL);
- if (!el)
- goto exit;
+ el = kunit_kmalloc(test, sizeof(*el), GFP_KERNEL);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, el);
/* force some equivalencies */
el->value = prandom_u32() % (TEST_LIST_LEN / 3);
@@ -94,55 +79,44 @@ static int __init list_sort_test(void)
list_add_tail(&el->list, &head);
}
- list_sort(NULL, &head, cmp);
+ list_sort(test, &head, cmp);
- err = -EINVAL;
for (cur = head.next; cur->next != &head; cur = cur->next) {
struct debug_el *el1;
int cmp_result;
- if (cur->next->prev != cur) {
- pr_err("error: list is corrupted\n");
- goto exit;
- }
+ KUNIT_ASSERT_PTR_EQ_MSG(test, cur->next->prev, cur,
+ "list is corrupted");
- cmp_result = cmp(NULL, cur, cur->next);
- if (cmp_result > 0) {
- pr_err("error: list is not sorted\n");
- goto exit;
- }
+ cmp_result = cmp(test, cur, cur->next);
+ KUNIT_ASSERT_LE_MSG(test, cmp_result, 0, "list is not sorted");
el = container_of(cur, struct debug_el, list);
el1 = container_of(cur->next, struct debug_el, list);
- if (cmp_result == 0 && el->serial >= el1->serial) {
- pr_err("error: order of equivalent elements not "
- "preserved\n");
- goto exit;
+ if (cmp_result == 0) {
+ KUNIT_ASSERT_LE_MSG(test, el->serial, el1->serial,
+ "order of equivalent elements not preserved");
}
- if (check(el, el1)) {
- pr_err("error: element check failed\n");
- goto exit;
- }
+ check(test, el, el1);
count++;
}
- if (head.prev != cur) {
- pr_err("error: list is corrupted\n");
- goto exit;
- }
+ KUNIT_EXPECT_PTR_EQ_MSG(test, head.prev, cur, "list is corrupted");
+ KUNIT_EXPECT_EQ_MSG(test, count, TEST_LIST_LEN,
+ "list length changed after sorting!");
+}
- if (count != TEST_LIST_LEN) {
- pr_err("error: bad list length %d", count);
- goto exit;
- }
+static struct kunit_case list_sort_cases[] = {
+ KUNIT_CASE(list_sort_test),
+ {}
+};
+
+static struct kunit_suite list_sort_suite = {
+ .name = "list_sort",
+ .test_cases = list_sort_cases,
+};
+
+kunit_test_suites(&list_sort_suite);
- err = 0;
-exit:
- for (i = 0; i < TEST_LIST_LEN; i++)
- kfree(elts[i]);
- kfree(elts);
- return err;
-}
-module_init(list_sort_test);
MODULE_LICENSE("GPL");
--
2.31.1.498.g6c1eba8ee3d-goog
Add in:
* kunit_kmalloc_array() and wire up kunit_kmalloc() to be a special
case of it.
* kunit_kcalloc() for symmetry with kunit_kzalloc()
This should make using KUnit more natural by making it more similar to
the existing *alloc() APIs.
And while we shouldn't necessarily be writing unit tests where overflow
should be a concern, it can't hurt to be safe.
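To illustrate the overflow concern mentioned above, here is a rough userspace analogue of an array allocator that refuses a request when n * size would wrap around. This is only a sketch: `alloc_array()` and `alloc_array_zeroed()` are hypothetical names, malloc() stands in for the kernel allocator, and `__builtin_mul_overflow()` is a GCC/Clang builtin rather than anything provided by KUnit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical analogue of an array allocator: returns NULL instead of a
 * too-small buffer when n * size does not fit in size_t.
 */
void *alloc_array(size_t n, size_t size)
{
	size_t bytes;

	/* The builtin returns true if the multiplication overflowed. */
	if (__builtin_mul_overflow(n, size, &bytes))
		return NULL;
	return malloc(bytes);
}

/* Zeroing variant, in the same spirit as a calloc-style wrapper. */
void *alloc_array_zeroed(size_t n, size_t size)
{
	void *p = alloc_array(n, size);

	/* n * size cannot overflow here, alloc_array() already checked it. */
	if (p)
		memset(p, 0, n * size);
	return p;
}

/* Example use; the asserts document the expected behaviour. */
void alloc_array_example(void)
{
	int *v = alloc_array_zeroed(4, sizeof(*v));

	assert(v != NULL && v[0] == 0 && v[3] == 0);
	free(v);

	/* SIZE_MAX * 2 wraps around, so the request must be rejected. */
	assert(alloc_array(SIZE_MAX, 2) == NULL);
}
```

A naive `malloc(n * size)` would silently allocate a short buffer in the second case, which is the mistake the array-style helpers are meant to rule out.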
Signed-off-by: Daniel Latypov <dlatypov(a)google.com>
Reviewed-by: David Gow <davidgow(a)google.com>
Reviewed-by: Brendan Higgins <brendanhiggins(a)google.com>
---
v1 -> v2: s/kzalloc/kcalloc in doc comment.
---
include/kunit/test.h | 36 ++++++++++++++++++++++++++++++++----
lib/kunit/test.c | 22 ++++++++++++----------
2 files changed, 44 insertions(+), 14 deletions(-)
diff --git a/include/kunit/test.h b/include/kunit/test.h
index 49601c4b98b8..e8ecb69dd567 100644
--- a/include/kunit/test.h
+++ b/include/kunit/test.h
@@ -577,16 +577,30 @@ static inline int kunit_destroy_named_resource(struct kunit *test,
void kunit_remove_resource(struct kunit *test, struct kunit_resource *res);
/**
- * kunit_kmalloc() - Like kmalloc() except the allocation is *test managed*.
+ * kunit_kmalloc_array() - Like kmalloc_array() except the allocation is *test managed*.
* @test: The test context object.
+ * @n: number of elements.
* @size: The size in bytes of the desired memory.
* @gfp: flags passed to underlying kmalloc().
*
- * Just like `kmalloc(...)`, except the allocation is managed by the test case
+ * Just like `kmalloc_array(...)`, except the allocation is managed by the test case
* and is automatically cleaned up after the test case concludes. See &struct
* kunit_resource for more information.
*/
-void *kunit_kmalloc(struct kunit *test, size_t size, gfp_t gfp);
+void *kunit_kmalloc_array(struct kunit *test, size_t n, size_t size, gfp_t flags);
+
+/**
+ * kunit_kmalloc() - Like kmalloc() except the allocation is *test managed*.
+ * @test: The test context object.
+ * @size: The size in bytes of the desired memory.
+ * @gfp: flags passed to underlying kmalloc().
+ *
+ * See kmalloc() and kunit_kmalloc_array() for more information.
+ */
+static inline void *kunit_kmalloc(struct kunit *test, size_t size, gfp_t gfp)
+{
+ return kunit_kmalloc_array(test, 1, size, gfp);
+}
/**
* kunit_kfree() - Like kfree except for allocations managed by KUnit.
@@ -601,13 +615,27 @@ void kunit_kfree(struct kunit *test, const void *ptr);
* @size: The size in bytes of the desired memory.
* @gfp: flags passed to underlying kmalloc().
*
- * See kzalloc() and kunit_kmalloc() for more information.
+ * See kzalloc() and kunit_kmalloc_array() for more information.
*/
static inline void *kunit_kzalloc(struct kunit *test, size_t size, gfp_t gfp)
{
return kunit_kmalloc(test, size, gfp | __GFP_ZERO);
}
+/**
+ * kunit_kcalloc() - Just like kunit_kmalloc_array(), but zeroes the allocation.
+ * @test: The test context object.
+ * @n: number of elements.
+ * @size: The size in bytes of the desired memory.
+ * @gfp: flags passed to underlying kmalloc().
+ *
+ * See kcalloc() and kunit_kmalloc_array() for more information.
+ */
+static inline void *kunit_kcalloc(struct kunit *test, size_t n, size_t size, gfp_t flags)
+{
+ return kunit_kmalloc_array(test, n, size, flags | __GFP_ZERO);
+}
+
void kunit_cleanup(struct kunit *test);
void kunit_log_append(char *log, const char *fmt, ...);
diff --git a/lib/kunit/test.c b/lib/kunit/test.c
index 2f6cc0123232..41fa46b14c3b 100644
--- a/lib/kunit/test.c
+++ b/lib/kunit/test.c
@@ -573,41 +573,43 @@ int kunit_destroy_resource(struct kunit *test, kunit_resource_match_t match,
}
EXPORT_SYMBOL_GPL(kunit_destroy_resource);
-struct kunit_kmalloc_params {
+struct kunit_kmalloc_array_params {
+ size_t n;
size_t size;
gfp_t gfp;
};
-static int kunit_kmalloc_init(struct kunit_resource *res, void *context)
+static int kunit_kmalloc_array_init(struct kunit_resource *res, void *context)
{
- struct kunit_kmalloc_params *params = context;
+ struct kunit_kmalloc_array_params *params = context;
- res->data = kmalloc(params->size, params->gfp);
+ res->data = kmalloc_array(params->n, params->size, params->gfp);
if (!res->data)
return -ENOMEM;
return 0;
}
-static void kunit_kmalloc_free(struct kunit_resource *res)
+static void kunit_kmalloc_array_free(struct kunit_resource *res)
{
kfree(res->data);
}
-void *kunit_kmalloc(struct kunit *test, size_t size, gfp_t gfp)
+void *kunit_kmalloc_array(struct kunit *test, size_t n, size_t size, gfp_t gfp)
{
- struct kunit_kmalloc_params params = {
+ struct kunit_kmalloc_array_params params = {
.size = size,
+ .n = n,
.gfp = gfp
};
return kunit_alloc_resource(test,
- kunit_kmalloc_init,
- kunit_kmalloc_free,
+ kunit_kmalloc_array_init,
+ kunit_kmalloc_array_free,
gfp,
&params);
}
-EXPORT_SYMBOL_GPL(kunit_kmalloc);
+EXPORT_SYMBOL_GPL(kunit_kmalloc_array);
void kunit_kfree(struct kunit *test, const void *ptr)
{
base-commit: cda689f8708b6bef0b921c3a17fcdecbe959a079
--
2.31.1.527.g47e6f16901-goog
The readahead size used to be 2MB, so it was reasonable to hardcode the
file size as 4MB in check_file_mmap().
However since commit c2e4cd57cfa1 ("block: lift setting the readahead
size into the block layer"), readahead size could be as large as twice
the io_opt, and thus the hardcoded file size no longer works.
check_file_mmap() may report "Read-ahead pages reached the end of the
file" when the readahead size actually exceeds the file size in this
case.
To fix this issue, read the exact readahead window size via the BLKRAGET
ioctl. Now that we have the readahead window size, make the check more
fine-grained. It is worth noting that this fine-grained check may break
if the kernel's sync readahead algorithm changes. That should be
acceptable, since the readahead ranging algorithm should be quite
stable, and we can tune the test case accordingly if the algorithm
does change.
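The window arithmetic used by the new check can be sketched on its own, without any ioctl or mmap. The helper below mirrors the computation in the patch (sectors to bytes, a file twice the readahead size, and a window centred on the faulting page in the middle of the file). `struct ra_window` and `compute_ra_window()` are hypothetical names introduced only for this illustration.

```c
#include <assert.h>

struct ra_window {
	long file_size;  /* bytes; twice the readahead size */
	long start;      /* first page expected in core (inclusive) */
	long end;        /* one past the last page expected in core */
};

/*
 * ra_sectors: readahead size in 512-byte sectors, as BLKRAGET reports it.
 * page_size:  page size in bytes, e.g. from sysconf(_SC_PAGESIZE).
 */
struct ra_window compute_ra_window(long ra_sectors, long page_size)
{
	struct ra_window w;
	long ra_bytes = ra_sectors * 512;
	long ra_pages = ra_bytes / page_size;

	w.file_size = ra_bytes * 2;
	/* The page touched in the middle of the file is the midpoint. */
	w.start = w.file_size / 2 / page_size - ra_pages / 2;
	w.end = w.start + ra_pages;
	return w;
}

/* Example: 256 sectors (128 KiB of readahead) with 4 KiB pages. */
void ra_window_example(void)
{
	struct ra_window w = compute_ra_window(256, 4096);

	assert(w.file_size == 262144);  /* 64 pages in the file */
	assert(w.start == 16);          /* touched page is page 32 */
	assert(w.end == 48);
}
```

With the old 2MB readahead (4096 sectors) the same helper yields the previously hardcoded 4MB file and a window covering pages 256 through 767.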
Reported-by: James Wang <jnwang(a)linux.alibaba.com>
Acked-by: Ricardo Cañuelo <ricardo.canuelo(a)collabora.com>
Signed-off-by: Jeffle Xu <jefflexu(a)linux.alibaba.com>
---
changes since v3:
- make the check more fine-grained since we have the exact readahead
window size now, as suggested by Ricardo Cañuelo
changes since v2:
- add 'Reported-by'
changes since v1:
- add the test name "mincore" in the subject line
- add the error message in commit message
- rename @filesize to @file_size to keep a more consistent naming
convention
---
.../selftests/mincore/mincore_selftest.c | 96 +++++++++++++------
1 file changed, 68 insertions(+), 28 deletions(-)
diff --git a/tools/testing/selftests/mincore/mincore_selftest.c b/tools/testing/selftests/mincore/mincore_selftest.c
index 5a1e85ff5d32..369b35af4b4f 100644
--- a/tools/testing/selftests/mincore/mincore_selftest.c
+++ b/tools/testing/selftests/mincore/mincore_selftest.c
@@ -15,6 +15,11 @@
#include <string.h>
#include <fcntl.h>
#include <string.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/ioctl.h>
+#include <sys/sysmacros.h>
+#include <sys/mount.h>
#include "../kselftest.h"
#include "../kselftest_harness.h"
@@ -193,12 +198,44 @@ TEST(check_file_mmap)
int retval;
int page_size;
int fd;
- int i;
+ int i, start, end;
int ra_pages = 0;
+ long ra_size, file_size;
+ struct stat stats;
+ dev_t devt;
+ unsigned int major, minor;
+ char devpath[32];
+
+ retval = stat(".", &stats);
+ ASSERT_EQ(0, retval) {
+ TH_LOG("Can't stat pwd: %s", strerror(errno));
+ }
+
+ devt = stats.st_dev;
+ major = major(devt);
+ minor = minor(devt);
+ snprintf(devpath, sizeof(devpath), "/dev/block/%u:%u", major, minor);
+
+ fd = open(devpath, O_RDONLY);
+ ASSERT_NE(-1, fd) {
+ TH_LOG("Can't open underlying disk %s", strerror(errno));
+ }
+
+ retval = ioctl(fd, BLKRAGET, &ra_size);
+ ASSERT_EQ(0, retval) {
+ TH_LOG("Error ioctl with the underlying disk: %s", strerror(errno));
+ }
+
+ /*
+ * BLKRAGET ioctl returns the readahead size in sectors (512 bytes).
+ * Make file_size large enough to contain the readahead window.
+ */
+ ra_size *= 512;
+ file_size = ra_size * 2;
page_size = sysconf(_SC_PAGESIZE);
- vec_size = FILE_SIZE / page_size;
- if (FILE_SIZE % page_size)
+ vec_size = file_size / page_size;
+ if (file_size % page_size)
vec_size++;
vec = calloc(vec_size, sizeof(unsigned char));
@@ -213,7 +250,7 @@ TEST(check_file_mmap)
strerror(errno));
}
errno = 0;
- retval = fallocate(fd, 0, 0, FILE_SIZE);
+ retval = fallocate(fd, 0, 0, file_size);
ASSERT_EQ(0, retval) {
TH_LOG("Error allocating space for the temporary file: %s",
strerror(errno));
@@ -223,12 +260,12 @@ TEST(check_file_mmap)
* Map the whole file, the pages shouldn't be fetched yet.
*/
errno = 0;
- addr = mmap(NULL, FILE_SIZE, PROT_READ | PROT_WRITE,
+ addr = mmap(NULL, file_size, PROT_READ | PROT_WRITE,
MAP_SHARED, fd, 0);
ASSERT_NE(MAP_FAILED, addr) {
TH_LOG("mmap error: %s", strerror(errno));
}
- retval = mincore(addr, FILE_SIZE, vec);
+ retval = mincore(addr, file_size, vec);
ASSERT_EQ(0, retval);
for (i = 0; i < vec_size; i++) {
ASSERT_EQ(0, vec[i]) {
@@ -240,38 +277,41 @@ TEST(check_file_mmap)
* Touch a page in the middle of the mapping. We expect the next
* few pages (the readahead window) to be populated too.
*/
- addr[FILE_SIZE / 2] = 1;
- retval = mincore(addr, FILE_SIZE, vec);
+ addr[file_size / 2] = 1;
+ retval = mincore(addr, file_size, vec);
ASSERT_EQ(0, retval);
- ASSERT_EQ(1, vec[FILE_SIZE / 2 / page_size]) {
- TH_LOG("Page not found in memory after use");
- }
- i = FILE_SIZE / 2 / page_size + 1;
- while (i < vec_size && vec[i]) {
- ra_pages++;
- i++;
- }
- EXPECT_GT(ra_pages, 0) {
- TH_LOG("No read-ahead pages found in memory");
- }
+ /*
+ * Readahead window is [start, end). So far the sync readahead
+ * algorithm takes the page that triggers the page fault as the
+ * midpoint.
+ */
+ ra_pages = ra_size / page_size;
+ start = file_size / 2 / page_size - ra_pages / 2;
+ end = start + ra_pages;
- EXPECT_LT(i, vec_size) {
- TH_LOG("Read-ahead pages reached the end of the file");
+ /*
+ * Check there's no hole in the readahead window.
+ */
+ for (i = start; i < end; i++) {
+ ASSERT_EQ(1, vec[i]) {
+ TH_LOG("Hole found in read-ahead window");
+ }
}
+
/*
- * End of the readahead window. The rest of the pages shouldn't
- * be in memory.
+ * Check there's no page beyond the readahead window.
*/
- if (i < vec_size) {
- while (i < vec_size && !vec[i])
- i++;
- EXPECT_EQ(vec_size, i) {
+ for (i = 0; i < vec_size; i++) {
+ if (i == start)
+ i = end;
+
+ EXPECT_EQ(0, vec[i]) {
TH_LOG("Unexpected page in memory beyond readahead window");
}
}
- munmap(addr, FILE_SIZE);
+ munmap(addr, file_size);
close(fd);
free(vec);
}
--
2.27.0
It is documented in Documentation/admin-guide/hw-vuln/spectre.rst that
disabling indirect branch speculation for a user-space process creates
more overhead and causes it to run slower. The performance hit varies by
CPU, but on the AMD A4-9120C and A6-9220C CPUs, a simple ping-pong using
pipes between two processes runs ~10x slower when disabling IB
speculation.
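For readers unfamiliar with the pattern, a pipe ping-pong between two processes can be sketched as follows. This is not the benchmark from this series: `pipe_pingpong()` is a hypothetical helper, it does no timing, and it does not touch the speculation controls. It only shows the write/read round trip that forces a switch to the other process and back when both share one CPU.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Bounce one byte between a parent and a child process `iters` times.
 * Returns 0 if every round trip completed, -1 otherwise.
 */
int pipe_pingpong(int iters)
{
	int to_child[2], to_parent[2];
	int status, rc = 0;
	char byte = 'x';
	pid_t pid;

	if (pipe(to_child))
		return -1;
	if (pipe(to_parent)) {
		close(to_child[0]);
		close(to_child[1]);
		return -1;
	}

	pid = fork();
	if (pid < 0) {
		close(to_child[0]);
		close(to_child[1]);
		close(to_parent[0]);
		close(to_parent[1]);
		return -1;
	}

	if (pid == 0) {
		/* Child: echo each byte back until the parent closes its end. */
		close(to_child[1]);
		close(to_parent[0]);
		while (read(to_child[0], &byte, 1) == 1)
			if (write(to_parent[1], &byte, 1) != 1)
				_exit(1);
		_exit(0);
	}

	/* Parent: one write plus one read is one round trip. */
	close(to_child[0]);
	close(to_parent[1]);
	for (int i = 0; i < iters && rc == 0; i++)
		if (write(to_child[1], &byte, 1) != 1 ||
		    read(to_parent[0], &byte, 1) != 1)
			rc = -1;

	close(to_child[1]);   /* EOF makes the child's read() return 0 */
	close(to_parent[0]);
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
		return -1;
	return rc;
}

/* Example use: 1000 round trips should all complete. */
void pipe_pingpong_example(void)
{
	assert(pipe_pingpong(1000) == 0);
}
```

A real measurement would time the parent's loop and pin both processes to one CPU, as the `taskset 1` invocations in this mail do.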
Patch 2, included in this RFC but not intended for commit, is a simple
program that demonstrates this issue. Running on an A4-9120C without IB
speculation disabled, each process ping-pong takes ~7us:
localhost ~ # taskset 1 /usr/local/bin/test
...
iters: 262144, t: 1936300, iter/sec: 135383, us/iter: 7
But when IB speculation is disabled, that number increases
significantly:
localhost ~ # taskset 1 /usr/local/bin/test d
...
iters: 16384, t: 1500518, iter/sec: 10918, us/iter: 91
Although this test is a worst-case scenario, we can also consider a real
situation: an audio server (e.g. pulse). If we imagine a low-latency
capture, with 10ms packets and a concurrent task on the same CPU (e.g.
video encoding, for a video call), the audio server will preempt the
CPU at a rate of 100Hz. At 91us overhead per preemption (switching to
and from the audio process), that's 0.9% overhead for one process doing
preemption. In real-world testing (on an A4-9120C), I've seen 9% of CPU
used by IBPB when doing a 2-person video call.
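The 0.9% estimate is plain arithmetic: 100 preemptions per second at 91us each is 9.1ms out of every second. A small sketch of that calculation follows; `preempt_overhead_percent()` is a hypothetical helper name, and the inputs are the per-iteration costs quoted in this mail.

```c
#include <assert.h>

/* Share of one CPU-second spent on switch overhead, as a percentage. */
double preempt_overhead_percent(double preemptions_per_sec, double cost_us)
{
	return preemptions_per_sec * cost_us / 1e6 * 100.0;
}

/* Example: 10ms audio packets give 100 preemptions per second. */
void preempt_overhead_example(void)
{
	double disabled = preempt_overhead_percent(100.0, 91.0);
	double enabled = preempt_overhead_percent(100.0, 7.0);

	/* 100 * 91us = 9100us per second, i.e. about 0.91%. */
	assert(disabled > 0.909 && disabled < 0.911);
	/* 100 * 7us = 700us per second, i.e. about 0.07%. */
	assert(enabled > 0.069 && enabled < 0.071);
}
```

The estimate scales linearly with the number of processes that preempt at this rate, so several such tasks quickly add up to a few percent of one CPU.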
With this patch, the number of IBPBs issued can be reduced to the
minimum necessary, only when there's a potential attacker->victim
process switch.
Running on the same A4-9120C device, this patch reduces the performance
hit of IBPB by ~half, as expected:
localhost ~ # taskset 1 /usr/local/bin/test ds
...
iters: 32768, t: 1824043, iter/sec: 17964, us/iter: 55
It should be noted that CPUs from multiple vendors experience a performance
hit due to IBPB. I also tested an Intel i3-8130U, which sees a noticeable
(~2x) increase in process switch time due to IBPB.
IB spec enabled:
localhost ~ # taskset 1 /usr/local/bin/test
...
iters: 262144, t: 1210821us, iter/sec: 216501, us/iter: 4
IB spec disabled:
localhost ~ # taskset 1 /usr/local/bin/test d
...
iters: 131072, t: 1257583us, iter/sec: 104225, us/iter: 9
Open questions:
- There are a significant number of task flags, which now reach the
limit of a 'long' on 32-bit systems. Should the 'mode' flags be
stored somewhere else?
- Having x86-specific flags in linux/sched.h feels wrong. However, this
is the mechanism for doing atomic flag updates. Is there an alternate
approach?
Open tasks:
- Documentation
- Naming
Changes in v2:
- Make flag per-process using prctl().
Anand K Mistry (2):
x86/speculation: Allow per-process control of when to issue IBPB
selftests: Benchmark for the cost of disabling IB speculation
arch/x86/include/asm/thread_info.h | 4 +
arch/x86/kernel/cpu/bugs.c | 56 +++++++++
arch/x86/kernel/process.c | 10 ++
arch/x86/mm/tlb.c | 51 ++++++--
include/linux/sched.h | 10 ++
include/uapi/linux/prctl.h | 5 +
.../testing/selftests/ib_spec/ib_spec_bench.c | 109 ++++++++++++++++++
7 files changed, 236 insertions(+), 9 deletions(-)
create mode 100644 tools/testing/selftests/ib_spec/ib_spec_bench.c
--
2.31.1.498.g6c1eba8ee3d-goog
Hi, a friend and I were chasing bug 205219 [1] listed in Bugzilla.
We ran into something a little different when trying to reproduce
the buggy behavior. In our attempt, compilation failed with a message from
make asking us to clean the source tree. We couldn't run kunit_tool
after compiling the kernel for x86, as described by Ted in the
discussion pointed out by the bug report.
Steps to reproduce:
0) Run kunit_tool
$ ./tools/testing/kunit/kunit.py run
Works fine with a clean tree.
1) Compile the kernel for some architecture (we did it for x86_64).
2) Run kunit_tool again
$ ./tools/testing/kunit/kunit.py run
Fails with a message from make asking us to clean the source tree.
Removing the clean source tree check from the top-level Makefile gives
us a similar error to what was described in the bug report. We see that
after running `git clean -fdx` kunit_tool runs nicely again. However,
this is not a real solution since some kernel binaries are erased by git.
We also had a look into the commit messages of Masahiro Yamada but
couldn't quite grasp why the check for the tree to be clean was added.
We could invest more time in this issue, but we don't know how to
proceed. We'd be glad to receive any comments about it. We could also try
something else if this issue is too hard for beginners.
[1]: https://bugzilla.kernel.org/show_bug.cgi?id=205219
Best Regards,
Marcelo
The KVM_GET_CPUID2 ioctl is not very well documented, but the way it is
implemented in kvm_vcpu_ioctl_get_cpuid2() suggests that even on the
error path it tries to return the number of entries to the caller.
However, the vcpu ioctl dispatcher code in kvm_arch_vcpu_ioctl()
ignores any output from this function if it sees an error return code.
The code makes it clear that the function was designed to accept a too
small number of entries and return E2BIG along with the corrected number.
This lost logic in the dispatcher code has been restored by removing the
lines that check the function's return code and skip the copy-back on error.
With that check removed, the ioctl caller sees both the number of entries
and the correct error.
In the selftests, the relevant function vcpu_get_cpuid() has also been
modified to use the number of cpuid entries returned along with errno E2BIG.
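The calling convention this enables is a two-call sequence: probe with nent set to 0, read back the required count from the E2BIG failure, then retry with a buffer of that size. The sketch below demonstrates the sequence against a mock, since a real call needs a vcpu file descriptor. `mock_get_cpuid2()`, `struct mock_cpuid`, `struct mock_entry` and `MOCK_KERNEL_NENT` are all hypothetical stand-ins, not KVM's types or API.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define MOCK_KERNEL_NENT 5  /* hypothetical number of entries held */

struct mock_entry {
	unsigned int function;
};

struct mock_cpuid {
	unsigned int nent;            /* in: capacity; out: actual count */
	struct mock_entry entries[];  /* flexible array member */
};

/*
 * Hypothetical stand-in for ioctl(fd, KVM_GET_CPUID2, cpuid). It only
 * mimics the return convention described in this patch.
 */
int mock_get_cpuid2(struct mock_cpuid *c)
{
	if (c->nent < MOCK_KERNEL_NENT) {
		c->nent = MOCK_KERNEL_NENT;  /* report the required count */
		errno = E2BIG;
		return -1;                   /* no entries are written */
	}
	for (unsigned int i = 0; i < MOCK_KERNEL_NENT; i++)
		c->entries[i].function = i;
	c->nent = MOCK_KERNEL_NENT;
	return 0;
}

/* Probe with nent = 0, then fetch with a right-sized buffer. */
struct mock_cpuid *fetch_all_entries(void)
{
	struct mock_cpuid probe = { .nent = 0 };
	struct mock_cpuid *full;

	if (mock_get_cpuid2(&probe) != -1 || errno != E2BIG)
		return NULL;

	full = malloc(sizeof(*full) + probe.nent * sizeof(full->entries[0]));
	if (!full)
		return NULL;
	full->nent = probe.nent;
	if (mock_get_cpuid2(full) != 0) {
		free(full);
		return NULL;
	}
	return full;
}

/* Example use; the caller owns the returned buffer. */
void fetch_all_entries_example(void)
{
	struct mock_cpuid *c = fetch_all_entries();

	assert(c != NULL && c->nent == MOCK_KERNEL_NENT);
	free(c);
}
```

Compared with growing nent one step at a time until the call succeeds, this needs exactly two calls regardless of how many entries there are.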
Signed-off-by: Valeriy Vdovin <valeriy.vdovin(a)virtuozzo.com>
---
v2:
- Added description to documentation of KVM_GET_CPUID2.
- Copy back nent only if E2BIG is returned.
- Fixed error code sign.
Documentation/virt/kvm/api.rst | 81 ++++++++++++-------
arch/x86/kvm/x86.c | 11 ++-
.../selftests/kvm/lib/x86_64/processor.c | 20 +++--
3 files changed, 73 insertions(+), 39 deletions(-)
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 245d80581f15..c7cfe4b9614e 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -711,7 +711,34 @@ resulting CPUID configuration through KVM_GET_CPUID2 in case.
};
-4.21 KVM_SET_SIGNAL_MASK
+4.21 KVM_GET_CPUID2
+------------------
+
+:Capability: basic
+:Architectures: x86
+:Type: vcpu ioctl
+:Parameters: struct kvm_cpuid (in/out)
+:Returns: 0 on success, -1 on error
+
+Returns a full list of cpuid entries that are supported by this vcpu and were
+previously set by KVM_SET_CPUID/KVM_SET_CPUID2.
+
+The userspace must specify the number of cpuid entries it is ready to accept
+from the kernel in the 'nent' field of 'struct kvm_cpuid'.
+
+The kernel will try to return all the cpuid entries it has in the response.
+If the userspace nent value is too small for the full response, the kernel will
+set the error code to -E2BIG, set the same 'nent' field to the actual number of
+cpuid_entries and return without writing back any entries to the userspace.
+The userspace can thus implement a two-call sequence, where the first call is
+made with nent set to 0 to read the number of entries from the kernel and
+use this response to allocate enough memory for a full response for the second
+call.
+
+The call can also return with error code -EFAULT in case of other errors.
+
+
+4.22 KVM_SET_SIGNAL_MASK
------------------------
:Capability: basic
@@ -737,7 +764,7 @@ signal mask.
};
-4.22 KVM_GET_FPU
+4.23 KVM_GET_FPU
----------------
:Capability: basic
@@ -766,7 +793,7 @@ Reads the floating point state from the vcpu.
};
-4.23 KVM_SET_FPU
+4.24 KVM_SET_FPU
----------------
:Capability: basic
@@ -795,7 +822,7 @@ Writes the floating point state to the vcpu.
};
-4.24 KVM_CREATE_IRQCHIP
+4.25 KVM_CREATE_IRQCHIP
-----------------------
:Capability: KVM_CAP_IRQCHIP, KVM_CAP_S390_IRQCHIP (s390)
@@ -817,7 +844,7 @@ Note that on s390 the KVM_CAP_S390_IRQCHIP vm capability needs to be enabled
before KVM_CREATE_IRQCHIP can be used.
-4.25 KVM_IRQ_LINE
+4.26 KVM_IRQ_LINE
-----------------
:Capability: KVM_CAP_IRQCHIP
@@ -886,7 +913,7 @@ be used for a userspace interrupt controller.
};
-4.26 KVM_GET_IRQCHIP
+4.27 KVM_GET_IRQCHIP
--------------------
:Capability: KVM_CAP_IRQCHIP
@@ -911,7 +938,7 @@ KVM_CREATE_IRQCHIP into a buffer provided by the caller.
};
-4.27 KVM_SET_IRQCHIP
+4.28 KVM_SET_IRQCHIP
--------------------
:Capability: KVM_CAP_IRQCHIP
@@ -936,7 +963,7 @@ KVM_CREATE_IRQCHIP from a buffer provided by the caller.
};
-4.28 KVM_XEN_HVM_CONFIG
+4.29 KVM_XEN_HVM_CONFIG
-----------------------
:Capability: KVM_CAP_XEN_HVM
@@ -972,7 +999,7 @@ fields must be zero.
No other flags are currently valid in the struct kvm_xen_hvm_config.
-4.29 KVM_GET_CLOCK
+4.30 KVM_GET_CLOCK
------------------
:Capability: KVM_CAP_ADJUST_CLOCK
@@ -1005,7 +1032,7 @@ TSC is not stable.
};
-4.30 KVM_SET_CLOCK
+4.31 KVM_SET_CLOCK
------------------
:Capability: KVM_CAP_ADJUST_CLOCK
@@ -1027,7 +1054,7 @@ such as migration.
};
-4.31 KVM_GET_VCPU_EVENTS
+4.32 KVM_GET_VCPU_EVENTS
------------------------
:Capability: KVM_CAP_VCPU_EVENTS
@@ -1146,7 +1173,7 @@ directly to the virtual CPU).
__u32 reserved[12];
};
-4.32 KVM_SET_VCPU_EVENTS
+4.33 KVM_SET_VCPU_EVENTS
------------------------
:Capability: KVM_CAP_VCPU_EVENTS
@@ -1209,7 +1236,7 @@ exceptions by manipulating individual registers using the KVM_SET_ONE_REG API.
See KVM_GET_VCPU_EVENTS for the data structure.
-4.33 KVM_GET_DEBUGREGS
+4.34 KVM_GET_DEBUGREGS
----------------------
:Capability: KVM_CAP_DEBUGREGS
@@ -1231,7 +1258,7 @@ Reads debug registers from the vcpu.
};
-4.34 KVM_SET_DEBUGREGS
+4.35 KVM_SET_DEBUGREGS
----------------------
:Capability: KVM_CAP_DEBUGREGS
@@ -1246,7 +1273,7 @@ See KVM_GET_DEBUGREGS for the data structure. The flags field is unused
yet and must be cleared on entry.
-4.35 KVM_SET_USER_MEMORY_REGION
+4.36 KVM_SET_USER_MEMORY_REGION
-------------------------------
:Capability: KVM_CAP_USER_MEMORY
@@ -1315,7 +1342,7 @@ The KVM_SET_MEMORY_REGION does not allow fine grained control over memory
allocation and is deprecated.
-4.36 KVM_SET_TSS_ADDR
+4.37 KVM_SET_TSS_ADDR
---------------------
:Capability: KVM_CAP_SET_TSS_ADDR
@@ -1335,7 +1362,7 @@ because of a quirk in the virtualization implementation (see the internals
documentation when it pops into existence).
-4.37 KVM_ENABLE_CAP
+4.38 KVM_ENABLE_CAP
-------------------
:Capability: KVM_CAP_ENABLE_CAP
@@ -1390,7 +1417,7 @@ function properly, this is the place to put them.
The vcpu ioctl should be used for vcpu-specific capabilities, the vm ioctl
for vm-wide capabilities.
-4.38 KVM_GET_MP_STATE
+4.39 KVM_GET_MP_STATE
---------------------
:Capability: KVM_CAP_MP_STATE
@@ -1438,7 +1465,7 @@ For arm/arm64:
The only states that are valid are KVM_MP_STATE_STOPPED and
KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not.
-4.39 KVM_SET_MP_STATE
+4.40 KVM_SET_MP_STATE
---------------------
:Capability: KVM_CAP_MP_STATE
@@ -1460,7 +1487,7 @@ For arm/arm64:
The only states that are valid are KVM_MP_STATE_STOPPED and
KVM_MP_STATE_RUNNABLE which reflect if the vcpu should be paused or not.
-4.40 KVM_SET_IDENTITY_MAP_ADDR
+4.41 KVM_SET_IDENTITY_MAP_ADDR
------------------------------
:Capability: KVM_CAP_SET_IDENTITY_MAP_ADDR
@@ -1484,7 +1511,7 @@ documentation when it pops into existence).
Fails if any VCPU has already been created.
-4.41 KVM_SET_BOOT_CPU_ID
+4.42 KVM_SET_BOOT_CPU_ID
------------------------
:Capability: KVM_CAP_SET_BOOT_CPU_ID
@@ -1499,7 +1526,7 @@ is vcpu 0. This ioctl has to be called before vcpu creation,
otherwise it will return EBUSY error.
-4.42 KVM_GET_XSAVE
+4.43 KVM_GET_XSAVE
------------------
:Capability: KVM_CAP_XSAVE
@@ -1518,7 +1545,7 @@ otherwise it will return EBUSY error.
This ioctl would copy current vcpu's xsave struct to the userspace.
-4.43 KVM_SET_XSAVE
+4.44 KVM_SET_XSAVE
------------------
:Capability: KVM_CAP_XSAVE
@@ -1537,7 +1564,7 @@ This ioctl would copy current vcpu's xsave struct to the userspace.
This ioctl would copy userspace's xsave struct to the kernel.
-4.44 KVM_GET_XCRS
+4.45 KVM_GET_XCRS
-----------------
:Capability: KVM_CAP_XCRS
@@ -1564,7 +1591,7 @@ This ioctl would copy userspace's xsave struct to the kernel.
This ioctl would copy current vcpu's xcrs to the userspace.
-4.45 KVM_SET_XCRS
+4.46 KVM_SET_XCRS
-----------------
:Capability: KVM_CAP_XCRS
@@ -1591,7 +1618,7 @@ This ioctl would copy current vcpu's xcrs to the userspace.
This ioctl would set vcpu's xcr to the value userspace specified.
-4.46 KVM_GET_SUPPORTED_CPUID
+4.47 KVM_GET_SUPPORTED_CPUID
----------------------------
:Capability: KVM_CAP_EXT_CPUID
@@ -1676,7 +1703,7 @@ if that returns true and you use KVM_CREATE_IRQCHIP, or if you emulate the
feature in userspace, then you can enable the feature for KVM_SET_CPUID2.
-4.47 KVM_PPC_GET_PVINFO
+4.48 KVM_PPC_GET_PVINFO
-----------------------
:Capability: KVM_CAP_PPC_GET_PVINFO
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index efc7a82ab140..3f941b1f4e78 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4773,14 +4773,17 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
r = -EFAULT;
if (copy_from_user(&cpuid, cpuid_arg, sizeof(cpuid)))
goto out;
+
r = kvm_vcpu_ioctl_get_cpuid2(vcpu, &cpuid,
cpuid_arg->entries);
- if (r)
+
+ if (r && r != -E2BIG)
goto out;
- r = -EFAULT;
- if (copy_to_user(cpuid_arg, &cpuid, sizeof(cpuid)))
+
+ if (copy_to_user(cpuid_arg, &cpuid, sizeof(cpuid))) {
+ r = -EFAULT;
goto out;
- r = 0;
+ }
break;
}
case KVM_GET_MSRS: {
diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c
index a8906e60a108..a412b39ad791 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/processor.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c
@@ -727,17 +727,21 @@ struct kvm_cpuid2 *vcpu_get_cpuid(struct kvm_vm *vm, uint32_t vcpuid)
cpuid = allocate_kvm_cpuid2();
max_ent = cpuid->nent;
+ cpuid->nent = 0;
- for (cpuid->nent = 1; cpuid->nent <= max_ent; cpuid->nent++) {
- rc = ioctl(vcpu->fd, KVM_GET_CPUID2, cpuid);
- if (!rc)
- break;
+ rc = ioctl(vcpu->fd, KVM_GET_CPUID2, cpuid);
+ TEST_ASSERT(rc == -1 && errno == E2BIG,
+ "KVM_GET_CPUID2 should return E2BIG: %d %d",
+ rc, errno);
- TEST_ASSERT(rc == -1 && errno == E2BIG,
- "KVM_GET_CPUID2 should either succeed or give E2BIG: %d %d",
- rc, errno);
- }
+ TEST_ASSERT(cpuid->nent,
+ "KVM_GET_CPUID2 failed to set cpuid->nent with E2BIG");
+
+ TEST_ASSERT(cpuid->nent < max_ent,
+ "KVM_GET_CPUID2 has %d entries, expected maximum: %d",
+ cpuid->nent, max_ent);
+ rc = ioctl(vcpu->fd, KVM_GET_CPUID2, cpuid);
TEST_ASSERT(rc == 0, "KVM_GET_CPUID2 failed, rc: %i errno: %i",
rc, errno);
--
2.17.1
The KVM_GET_CPUID2 ioctl is not very well documented, but the way it is
implemented in kvm_vcpu_ioctl_get_cpuid2() suggests that even on the
error path it tries to return the number of entries to the caller.
However, the vcpu ioctl dispatcher code in kvm_arch_vcpu_ioctl() ignores
any output from this function if it sees an error return code.
The code makes it clear that the function was designed to accept a
too-small number of entries and return E2BIG along with the corrected
number.
This lost logic in the dispatcher code has been restored by removing the
lines that check the function's return code and skip the copy-out if an
error is found.
With this change, the ioctl caller sees both the number of entries and
the correct error.
In the selftests, the relevant function vcpu_get_cpuid() has also been
modified to use the number of cpuid entries returned along with errno
E2BIG.
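As a rough illustration of the calling pattern this enables, here is a
userspace sketch in plain C. It does not call the real ioctl:
mock_get_cpuid2(), struct mock_cpuid and KERNEL_NENT are hypothetical
stand-ins that only mimic the "E2BIG plus corrected nent" behaviour
described above.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct kvm_cpuid2: a count plus entries. */
struct mock_cpuid {
	uint32_t nent;
	uint32_t entries[];
};

/* Assumed number of entries held by the "kernel" side of this sketch. */
#define KERNEL_NENT 5

/*
 * Mock of the semantics described above, NOT the real KVM call: if the
 * caller's nent is too small, store the required count in nent and fail
 * with errno = E2BIG without writing any entries.
 */
static int mock_get_cpuid2(struct mock_cpuid *c)
{
	if (c->nent < KERNEL_NENT) {
		c->nent = KERNEL_NENT;
		errno = E2BIG;
		return -1;
	}
	for (uint32_t i = 0; i < KERNEL_NENT; i++)
		c->entries[i] = 0x100 + i;
	c->nent = KERNEL_NENT;
	return 0;
}

/* Two-call sequence: probe with nent = 0, then allocate and fetch. */
static struct mock_cpuid *get_all_entries(void)
{
	struct mock_cpuid probe = { .nent = 0 };
	struct mock_cpuid *full;

	/* The probe is expected to fail with E2BIG and a corrected nent. */
	if (mock_get_cpuid2(&probe) == 0 || errno != E2BIG)
		return NULL;
	assert(probe.nent > 0);

	full = calloc(1, sizeof(*full) + probe.nent * sizeof(full->entries[0]));
	if (!full)
		return NULL;
	full->nent = probe.nent;
	if (mock_get_cpuid2(full) != 0) {
		free(full);
		return NULL;
	}
	return full;
}
```

The caller owns the returned buffer and releases it with free(). With
the real ioctl the same shape applies, with the fd and request number
passed to ioctl() in place of the mock.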
Signed-off-by: Valeriy Vdovin <valeriy.vdovin(a)virtuozzo.com>
---
v2:
- Added description to documentation of KVM_GET_CPUID2.
- Copy back nent only if E2BIG is returned.
Documentation/virt/kvm/api.rst | 81 ++++++++++++-------
arch/x86/kvm/x86.c | 11 ++-
.../selftests/kvm/lib/x86_64/processor.c | 20 +++--
3 files changed, 73 insertions(+), 39 deletions(-)
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 245d80581f15..c7cfe4b9614e 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -711,7 +711,34 @@ resulting CPUID configuration through KVM_GET_CPUID2 in case.
};
-4.21 KVM_SET_SIGNAL_MASK
+4.21 KVM_GET_CPUID2
+------------------
+
+:Capability: basic
+:Architectures: x86
+:Type: vcpu ioctl
+:Parameters: struct kvm_cpuid2 (in/out)
+:Returns: 0 on success, -1 on error
+
+Returns a full list of cpuid entries that are supported by this vcpu and were
+previously set by KVM_SET_CPUID/KVM_SET_CPUID2.
+
+The userspace must specify the number of cpuid entries it is ready to accept
+from the kernel in the 'nent' field of 'struct kvm_cpuid2'.
+
+The kernel will try to return all the cpuid entries it has in the response.
+If the userspace nent value is too small for the full response, the kernel will
+set the error code to -E2BIG, set the same 'nent' field to the actual number of
+cpuid_entries and return without writing back any entries to the userspace.
+The userspace can thus implement a two-call sequence, where the first call is
+made with nent set to 0 to read the number of entries from the kernel and
+use this response to allocate enough memory for a full response for the second
+call.
+
+The call can also return with error code -EFAULT in case of other errors.
+
+
+4.22 KVM_SET_SIGNAL_MASK
------------------------
:Capability: basic
@@ -737,7 +764,7 @@ signal mask.
};
-4.22 KVM_GET_FPU
+4.23 KVM_GET_FPU
----------------
:Capability: basic
@@ -766,7 +793,7 @@ Reads the floating point state from the vcpu.
};
-4.23 KVM_SET_FPU
+4.24 KVM_SET_FPU
----------------
:Capability: basic
@@ -795,7 +822,7 @@ Writes the floating point state to the vcpu.
};
-4.24 KVM_CREATE_IRQCHIP
+4.25 KVM_CREATE_IRQCHIP
-----------------------
:Capability: KVM_CAP_IRQCHIP, KVM_CAP_S390_IRQCHIP (s390)
@@ -817,7 +844,7 @@ Note that on s390 the KVM_CAP_S390_IRQCHIP vm capability needs to be enabled
before KVM_CREATE_IRQCHIP can be used.
-4.25 KVM_IRQ_LINE
+4.26 KVM_IRQ_LINE
-----------------
:Capability: KVM_CAP_IRQCHIP
@@ -886,7 +913,7 @@ be used for a userspace interrupt controller.
};
-4.26 KVM_GET_IRQCHIP
+4.27 KVM_GET_IRQCHIP
--------------------
:Capability: KVM_CAP_IRQCHIP
@@ -911,7 +938,7 @@ KVM_CREATE_IRQCHIP into a buffer provided by the caller.
};
-4.27 KVM_SET_IRQCHIP
+4.28 KVM_SET_IRQCHIP
--------------------
:Capability: KVM_CAP_IRQCHIP
@@ -936,7 +963,7 @@ KVM_CREATE_IRQCHIP from a buffer provided by the caller.
};
-4.28 KVM_XEN_HVM_CONFIG
+4.29 KVM_XEN_HVM_CONFIG
-----------------------
:Capability: KVM_CAP_XEN_HVM
@@ -972,7 +999,7 @@ fields must be zero.
No other flags are currently valid in the struct kvm_xen_hvm_config.
-4.29 KVM_GET_CLOCK
+4.30 KVM_GET_CLOCK
------------------
:Capability: KVM_CAP_ADJUST_CLOCK
@@ -1005,7 +1032,7 @@ TSC is not stable.
};
-4.30 KVM_SET_CLOCK
+4.31 KVM_SET_CLOCK
------------------
:Capability: KVM_CAP_ADJUST_CLOCK
@@ -1027,7 +1054,7 @@ such as migration.
};
-4.31 KVM_GET_VCPU_EVENTS
+4.32 KVM_GET_VCPU_EVENTS
------------------------
:Capability: KVM_CAP_VCPU_EVENTS
@@ -1146,7 +1173,7 @@ directly to the virtual CPU).
__u32 reserved[12];
};
-4.32 KVM_SET_VCPU_EVENTS
+4.33 KVM_SET_VCPU_EVENTS
------------------------
:Capability: KVM_CAP_VCPU_EVENTS
@@ -1209,7 +1236,7 @@ exceptions by manipulating individual registers using the KVM_SET_ONE_REG API.
See KVM_GET_VCPU_EVENTS for the data structure.
-4.33 KVM_GET_DEBUGREGS
+4.34 KVM_GET_DEBUGREGS
----------------------
:Capability: KVM_CAP_DEBUGREGS
@@ -1231,7 +1258,7 @@ Reads debug registers from the vcpu.
};
-4.34 KVM_SET_DEBUGREGS
+4.35 KVM_SET_DEBUGREGS
----------------------
:Capability: KVM_CAP_DEBUGREGS
@@ -1246,7 +1273,7 @@ See KVM_GET_DEBUGREGS for the data structure. The flags field is unused
yet and must be cleared on entry.
-4.35 KVM_SET_USER_MEMORY_REGION
+4.36 KVM_SET_USER_MEMORY_REGION
-------------------------------
:Capability: KVM_CAP_USER_MEMORY
@@ -1315,7 +1342,7 @@ The KVM_SET_MEMORY_REGION does not allow fine grained control over memory
allocation and is deprecated.
-4.36 KVM_SET_TSS_ADDR
+4.37 KVM_SET_TSS_ADDR
---------------------
:Capability: KVM_CAP_SET_TSS_ADDR
@@ -1335,7 +1362,7 @@ because of a quirk in the virtualization implementation (see the internals
documentation when it pops into existence).
-4.37 KVM_ENABLE_CAP
+4.38 KVM_ENABLE_CAP
-------------------
:Capability: KVM_CAP_ENABLE_CAP
@@ -1390,7 +1417,7 @@ function properly, this is the place to put them.
The vcpu ioctl should be used for vcpu-specific capabilities, the vm ioctl
for vm-wide capabilities.
-4.38 KVM_GET_MP_STATE
+4.39 KVM_GET_MP_STATE
---------------------
:Capability: KVM_CAP_MP_STATE
@@ -1438,7 +1465,7 @@ For arm/arm64:
The only states that are valid are KVM_MP_STATE_STOPPED and
KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not.
-4.39 KVM_SET_MP_STATE
+4.40 KVM_SET_MP_STATE
---------------------
:Capability: KVM_CAP_MP_STATE
@@ -1460,7 +1487,7 @@ For arm/arm64:
The only states that are valid are KVM_MP_STATE_STOPPED and
KVM_MP_STATE_RUNNABLE which reflect if the vcpu should be paused or not.
-4.40 KVM_SET_IDENTITY_MAP_ADDR
+4.41 KVM_SET_IDENTITY_MAP_ADDR
------------------------------
:Capability: KVM_CAP_SET_IDENTITY_MAP_ADDR
@@ -1484,7 +1511,7 @@ documentation when it pops into existence).
Fails if any VCPU has already been created.
-4.41 KVM_SET_BOOT_CPU_ID
+4.42 KVM_SET_BOOT_CPU_ID
------------------------
:Capability: KVM_CAP_SET_BOOT_CPU_ID
@@ -1499,7 +1526,7 @@ is vcpu 0. This ioctl has to be called before vcpu creation,
otherwise it will return EBUSY error.
-4.42 KVM_GET_XSAVE
+4.43 KVM_GET_XSAVE
------------------
:Capability: KVM_CAP_XSAVE
@@ -1518,7 +1545,7 @@ otherwise it will return EBUSY error.
This ioctl would copy current vcpu's xsave struct to the userspace.
-4.43 KVM_SET_XSAVE
+4.44 KVM_SET_XSAVE
------------------
:Capability: KVM_CAP_XSAVE
@@ -1537,7 +1564,7 @@ This ioctl would copy current vcpu's xsave struct to the userspace.
This ioctl would copy userspace's xsave struct to the kernel.
-4.44 KVM_GET_XCRS
+4.45 KVM_GET_XCRS
-----------------
:Capability: KVM_CAP_XCRS
@@ -1564,7 +1591,7 @@ This ioctl would copy userspace's xsave struct to the kernel.
This ioctl would copy current vcpu's xcrs to the userspace.
-4.45 KVM_SET_XCRS
+4.46 KVM_SET_XCRS
-----------------
:Capability: KVM_CAP_XCRS
@@ -1591,7 +1618,7 @@ This ioctl would copy current vcpu's xcrs to the userspace.
This ioctl would set vcpu's xcr to the value userspace specified.
-4.46 KVM_GET_SUPPORTED_CPUID
+4.47 KVM_GET_SUPPORTED_CPUID
----------------------------
:Capability: KVM_CAP_EXT_CPUID
@@ -1676,7 +1703,7 @@ if that returns true and you use KVM_CREATE_IRQCHIP, or if you emulate the
feature in userspace, then you can enable the feature for KVM_SET_CPUID2.
-4.47 KVM_PPC_GET_PVINFO
+4.48 KVM_PPC_GET_PVINFO
-----------------------
:Capability: KVM_CAP_PPC_GET_PVINFO
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index efc7a82ab140..fa9bb6b751c6 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4773,14 +4773,17 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
r = -EFAULT;
if (copy_from_user(&cpuid, cpuid_arg, sizeof(cpuid)))
goto out;
+
r = kvm_vcpu_ioctl_get_cpuid2(vcpu, &cpuid,
cpuid_arg->entries);
- if (r)
+
+ if (r && r != -E2BIG)
goto out;
- r = -EFAULT;
- if (copy_to_user(cpuid_arg, &cpuid, sizeof(cpuid)))
+
+ if (copy_to_user(cpuid_arg, &cpuid, sizeof(cpuid))) {
+ r = -EFAULT;
goto out;
- r = 0;
+ }
break;
}
case KVM_GET_MSRS: {
diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c
index a8906e60a108..a412b39ad791 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/processor.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c
@@ -727,17 +727,21 @@ struct kvm_cpuid2 *vcpu_get_cpuid(struct kvm_vm *vm, uint32_t vcpuid)
cpuid = allocate_kvm_cpuid2();
max_ent = cpuid->nent;
+ cpuid->nent = 0;
- for (cpuid->nent = 1; cpuid->nent <= max_ent; cpuid->nent++) {
- rc = ioctl(vcpu->fd, KVM_GET_CPUID2, cpuid);
- if (!rc)
- break;
+ rc = ioctl(vcpu->fd, KVM_GET_CPUID2, cpuid);
+ TEST_ASSERT(rc == -1 && errno == E2BIG,
+ "KVM_GET_CPUID2 should return E2BIG: %d %d",
+ rc, errno);
- TEST_ASSERT(rc == -1 && errno == E2BIG,
- "KVM_GET_CPUID2 should either succeed or give E2BIG: %d %d",
- rc, errno);
- }
+ TEST_ASSERT(cpuid->nent,
+ "KVM_GET_CPUID2 failed to set cpuid->nent with E2BIG");
+
+ TEST_ASSERT(cpuid->nent < max_ent,
+ "KVM_GET_CPUID2 has %d entries, expected maximum: %d",
+ cpuid->nent, max_ent);
+ rc = ioctl(vcpu->fd, KVM_GET_CPUID2, cpuid);
TEST_ASSERT(rc == 0, "KVM_GET_CPUID2 failed, rc: %i errno: %i",
rc, errno);
--
2.17.1
The KVM_GET_CPUID2 ioctl is not very well documented, but the way it is
implemented in kvm_vcpu_ioctl_get_cpuid2() suggests that even on the
error path it tries to return the number of entries to the caller.
However, the vcpu ioctl dispatcher code in kvm_arch_vcpu_ioctl() ignores
any output from this function if it sees an error return code.
The code makes it clear that the function was designed to accept a
too-small number of entries and return E2BIG along with the corrected
number.
This lost logic in the dispatcher code has been restored by removing the
lines that check the function's return code and skip the copy-out if an
error is found.
With this change, the ioctl caller sees both the number of entries and
the correct error.
In the selftests, the relevant function vcpu_get_cpuid() has also been
modified to use the number of cpuid entries returned along with errno
E2BIG.
Signed-off-by: Valeriy Vdovin <valeriy.vdovin(a)virtuozzo.com>
---
arch/x86/kvm/x86.c | 10 +++++-----
.../selftests/kvm/lib/x86_64/processor.c | 20 +++++++++++--------
2 files changed, 17 insertions(+), 13 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index efc7a82ab140..df8a3e44e722 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4773,14 +4773,14 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
r = -EFAULT;
if (copy_from_user(&cpuid, cpuid_arg, sizeof(cpuid)))
goto out;
+
r = kvm_vcpu_ioctl_get_cpuid2(vcpu, &cpuid,
cpuid_arg->entries);
- if (r)
- goto out;
- r = -EFAULT;
- if (copy_to_user(cpuid_arg, &cpuid, sizeof(cpuid)))
+
+ if (copy_to_user(cpuid_arg, &cpuid, sizeof(cpuid))) {
+ r = -EFAULT;
goto out;
- r = 0;
+ }
break;
}
case KVM_GET_MSRS: {
diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c
index a8906e60a108..a412b39ad791 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/processor.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c
@@ -727,17 +727,21 @@ struct kvm_cpuid2 *vcpu_get_cpuid(struct kvm_vm *vm, uint32_t vcpuid)
cpuid = allocate_kvm_cpuid2();
max_ent = cpuid->nent;
+ cpuid->nent = 0;
- for (cpuid->nent = 1; cpuid->nent <= max_ent; cpuid->nent++) {
- rc = ioctl(vcpu->fd, KVM_GET_CPUID2, cpuid);
- if (!rc)
- break;
+ rc = ioctl(vcpu->fd, KVM_GET_CPUID2, cpuid);
+ TEST_ASSERT(rc == -1 && errno == E2BIG,
+ "KVM_GET_CPUID2 should return E2BIG: %d %d",
+ rc, errno);
- TEST_ASSERT(rc == -1 && errno == E2BIG,
- "KVM_GET_CPUID2 should either succeed or give E2BIG: %d %d",
- rc, errno);
- }
+ TEST_ASSERT(cpuid->nent,
+ "KVM_GET_CPUID2 failed to set cpuid->nent with E2BIG");
+
+ TEST_ASSERT(cpuid->nent < max_ent,
+ "KVM_GET_CPUID2 has %d entries, expected maximum: %d",
+ cpuid->nent, max_ent);
+ rc = ioctl(vcpu->fd, KVM_GET_CPUID2, cpuid);
TEST_ASSERT(rc == 0, "KVM_GET_CPUID2 failed, rc: %i errno: %i",
rc, errno);
--
2.17.1
Hi Linus,
Please pull the following KUnit update for Linux 5.13-rc1.
This KUnit update for Linux 5.13-rc1 consists of several fixes and a
new feature to support failure from dynamic analysis tools such as
UBSAN, and fake ops for testing.
- a fake ops struct for testing, where a "free" function can complain
if it was called with an invalid argument or caught a double-free.
Most such functions return void and have no normal means of signalling
failure (e.g. super_operations, iommu_ops, etc.).
diff is attached.
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit a38fd8748464831584a19438cbb3082b5a2dab15:
Linux 5.12-rc2 (2021-03-05 17:33:41 -0800)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
tags/linux-kselftest-kunit-5.13-rc1
for you to fetch changes up to de2fcb3e62013738f22bbb42cbd757d9a242574e:
Documentation: kunit: add tips for using current->kunit_test
(2021-04-07 16:40:37 -0600)
----------------------------------------------------------------
linux-kselftest-kunit-5.13-rc1
This KUnit update for Linux 5.13-rc1 consists of several fixes and a
new feature to support failure from dynamic analysis tools such as
UBSAN, and fake ops for testing.
- a fake ops struct for testing, where a "free" function can complain
if it was called with an invalid argument or caught a double-free.
Most such functions return void and have no normal means of signalling
failure (e.g. super_operations, iommu_ops, etc.).
----------------------------------------------------------------
Daniel Latypov (4):
kunit: make KUNIT_EXPECT_STREQ() quote values, don't print literals
kunit: tool: make --kunitconfig accept dirs, add lib/kunit fragment
kunit: fix -Wunused-function warning for __kunit_fail_current_test
Documentation: kunit: add tips for using current->kunit_test
Lucas Stankus (1):
kunit: Match parenthesis alignment to improve code readability
Uriel Guajardo (1):
kunit: support failure from dynamic analysis tools
Documentation/dev-tools/kunit/tips.rst | 78 +++++++++++++++++++++++++++++++++-
include/kunit/test-bug.h | 29 +++++++++++++
lib/kunit/.kunitconfig | 3 ++
lib/kunit/assert.c | 61 ++++++++++++++++++--------
lib/kunit/test.c | 39 +++++++++++++++--
tools/testing/kunit/kunit.py | 4 +-
tools/testing/kunit/kunit_kernel.py | 2 +
tools/testing/kunit/kunit_tool_test.py | 6 +++
8 files changed, 198 insertions(+), 24 deletions(-)
create mode 100644 include/kunit/test-bug.h
create mode 100644 lib/kunit/.kunitconfig
----------------------------------------------------------------
Hi Linus,
Please pull the following Kselftest update for Linux 5.13-rc1.
This Kselftest update for Linux 5.13-rc1 consists of:
- fixes and updates to resctrl test from Fenghua Yu and Reinette Chatre
- fixes to Kselftest documentation, framework
- minor spelling correction in timers test
diff is attached.
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit a38fd8748464831584a19438cbb3082b5a2dab15:
Linux 5.12-rc2 (2021-03-05 17:33:41 -0800)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
tags/linux-kselftest-next-5.13-rc1
for you to fetch changes up to e75074781f1735c1976bc551e29ccf2ba9a4b17f:
selftests/resctrl: Change a few printed messages (2021-04-07 16:37:49 -0600)
----------------------------------------------------------------
linux-kselftest-next-5.13-rc1
This Kselftest update for Linux 5.13-rc1 consists of:
- fixes and updates to resctrl test from Fenghua Yu and Reinette Chatre
- fixes to Kselftest documentation, framework
- minor spelling correction in timers test
----------------------------------------------------------------
Antonio Terceiro (1):
Documentation: kselftest: fix path to test module files
Colin Ian King (1):
selftests/timers: Fix spelling mistake "clocksourc" -> "clocksource"
Fenghua Yu (20):
selftests/resctrl: Enable gcc checks to detect buffer overflows
selftests/resctrl: Fix compilation issues for global variables
selftests/resctrl: Fix compilation issues for other global variables
selftests/resctrl: Clean up resctrl features check
selftests/resctrl: Fix missing options "-n" and "-p"
selftests/resctrl: Rename CQM test as CMT test
selftests/resctrl: Call kselftest APIs to log test results
selftests/resctrl: Share show_cache_info() by CAT and CMT tests
selftests/resctrl: Add config dependencies
selftests/resctrl: Check for resctrl mount point only if resctrl
FS is supported
selftests/resctrl: Use resctrl/info for feature detection
selftests/resctrl: Fix MBA/MBM results reporting format
selftests/resctrl: Don't hard code value of "no_of_bits" variable
selftests/resctrl: Modularize resctrl test suite main() function
selftests/resctrl: Skip the test if requested resctrl feature is
not supported
selftests/resctrl: Fix unmount resctrl FS
selftests/resctrl: Fix incorrect parsing of iMC counters
selftests/resctrl: Fix checking for < 0 for unsigned values
selftests/resctrl: Create .gitignore to include resctrl_tests
selftests/resctrl: Change a few printed messages
Ilya Leoshkevich (1):
selftests: fix prepending $(OUTPUT) to $(TEST_PROGS)
Reinette Chatre (2):
selftests/resctrl: Ensure sibling CPU is not same as original CPU
selftests/resctrl: Fix a printed message
Documentation/dev-tools/kselftest.rst | 4 +-
tools/testing/selftests/lib.mk | 3 +-
tools/testing/selftests/resctrl/.gitignore | 2 +
tools/testing/selftests/resctrl/Makefile | 2 +-
tools/testing/selftests/resctrl/README | 4 +-
tools/testing/selftests/resctrl/cache.c | 52 ++++++-
tools/testing/selftests/resctrl/cat_test.c | 57 +++----
.../selftests/resctrl/{cqm_test.c => cmt_test.c} | 75 +++-------
tools/testing/selftests/resctrl/config | 2 +
tools/testing/selftests/resctrl/fill_buf.c | 4 +-
tools/testing/selftests/resctrl/mba_test.c | 43 +++---
tools/testing/selftests/resctrl/mbm_test.c | 42 +++---
tools/testing/selftests/resctrl/resctrl.h | 29 +++-
tools/testing/selftests/resctrl/resctrl_tests.c | 163 ++++++++++++++-------
tools/testing/selftests/resctrl/resctrl_val.c | 95 +++++++-----
tools/testing/selftests/resctrl/resctrlfs.c | 134 ++++++++++-------
.../testing/selftests/timers/clocksource-switch.c | 2 +-
17 files changed, 413 insertions(+), 300 deletions(-)
create mode 100644 tools/testing/selftests/resctrl/.gitignore
rename tools/testing/selftests/resctrl/{cqm_test.c => cmt_test.c} (56%)
create mode 100644 tools/testing/selftests/resctrl/config
----------------------------------------------------------------
This patchset introduces batched operations for the per-cpu variant of
the array map.
Also updates the batch ops test for arrays.
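For orientation, a batched lookup on a per-cpu map hands userspace one
flat value buffer: each entry carries one slot per possible CPU, and each
slot is the value size rounded up to 8 bytes. The sketch below only does
that index arithmetic in plain C. percpu_slot_offset() and its parameter
names are illustrative, not taken from the patches, and no libbpf call is
made.

```c
#include <assert.h>
#include <stddef.h>

/* Round x up to the next multiple of 8, as per-cpu map values are padded. */
static size_t round_up8(size_t x)
{
	return (x + 7) & ~(size_t)7;
}

/*
 * Byte offset of the slot for (entry, cpu) inside a flat batch value
 * buffer. Layout assumed here: entry-major, then CPU, each slot being
 * round_up8(value_size) bytes.
 */
static size_t percpu_slot_offset(size_t entry, size_t cpu,
				 size_t nr_cpus, size_t value_size)
{
	size_t slot = round_up8(value_size);

	assert(cpu < nr_cpus);
	return (entry * nr_cpus + cpu) * slot;
}

/* Total buffer size needed for a batch of `count` entries. */
static size_t percpu_batch_bytes(size_t count, size_t nr_cpus,
				 size_t value_size)
{
	return count * nr_cpus * round_up8(value_size);
}
```

A test would typically calloc() percpu_batch_bytes() bytes once and then
walk it with percpu_slot_offset() after each batch call.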
v4 -> v5:
- Revert removal of percpu macros
v3 -> v4:
- Prefer 'calloc()' over 'malloc()' on batch ops tests
- Add missing static keyword in a couple of test functions
- 'offset' to 'cpu_offset' as suggested by Martin
v2 -> v3:
- Remove percpu macros as suggested by Andrii
- Update tests that used the per cpu macros
v1 -> v2:
- Amended the commit message to be more descriptive
Pedro Tammela (2):
bpf: add batched ops support for percpu array
bpf: selftests: update array map tests for per-cpu batched ops
kernel/bpf/arraymap.c | 2 +
.../bpf/map_tests/array_map_batch_ops.c | 104 +++++++++++++-----
2 files changed, 77 insertions(+), 29 deletions(-)
--
2.25.1
Base
====
This series is based on (and therefore should apply cleanly to) the tag
"v5.12-rc7-mmots-2021-04-11-20-49", additionally with Peter's selftest cleanup
series applied first:
https://lore.kernel.org/patchwork/cover/1412450/
Changelog
=========
v3->v4:
- Fix handling of the shmem private mcopy case. Previously, I had (incorrectly)
assumed that !vma_is_anonymous() was equivalent to "the page will be in the
page cache". But, in this case we have an optimization where we allocate a new
*anonymous* page. So, use a new "bool page_in_cache" instead, which checks if
page->mapping is set. Correct several places with this new check. [Hugh]
- Fix calling mm_counter() before page_add_..._rmap(). [Hugh]
- When modifying shmem_mcopy_atomic_pte() to use the new install_pte() helper,
just use lru_cache_add_inactive_or_unevictable(), no need to branch and maybe
use lru_cache_add(). [Hugh]
- De-pluralize mcopy_atomic_install_pte(s). [Hugh]
- Make "writable" a bool, and initialize consistently. [Hugh]
v2->v3:
- Picked up {Reviewed,Acked}-by's.
- Reorder commits: introduce CONTINUE before MINOR registration. [Hugh, Peter]
- Don't try to {unlock,put}_page an xarray value in shmem_getpage_gfp. [Hugh]
- Move enum mcopy_atomic_mode forward declare out of CONFIG_HUGETLB_PAGE. [Hugh]
- Keep mistakenly removed UFFD_USER_MODE_ONLY in selftest. [Peter]
- Cleanup context management in self test (make clear implicit, remove unneeded
return values now that we have err()). [Peter]
- Correct dst_pte argument to dst_pmd in shmem_mcopy_atomic_pte macro. [Hugh]
- Mention the new shmem support feature in documentation. [Hugh]
v1->v2:
- Pick up Reviewed-by's.
- Don't swapin page when a minor fault occurs. Notice that it needs to be
swapped in, and just immediately fire the minor fault. Let a future CONTINUE
deal with swapping in the page. [Peter]
- Clarify comment about i_size checks in mm/userfaultfd.c. [Peter]
- Only forward declare once (out of #ifdef) in hugetlb.h. [Peter]
Changes since [2]:
- Squash the fixes ([2]) in with the original series ([1]). This makes reviewing
easier, as we no longer have to sift through deltas undoing what we had done
before. [Hugh, Peter]
- Modify shmem_mcopy_atomic_pte() to use the new mcopy_atomic_install_ptes()
helper, reducing code duplication. [Hugh]
- Properly trigger handle_userfault() in the shmem_swapin_page() case. [Hugh]
- Use shmem_getpage() instead of find_lock_page() to look up the existing page
for continue. This properly deals with swapped-out pages. [Hugh]
- Unconditionally pte_mkdirty() for anon memory (as before). [Peter]
- Don't include userfaultfd_k.h in either hugetlb.h or shmem_fs.h. [Hugh]
- Add comment for UFFD_FEATURE_MINOR_SHMEM (to match _HUGETLBFS). [Hugh]
- Fix some small cleanup issues (parens, reworded conditionals, reduced plumbing
of some parameters, simplify labels/gotos, ...). [Hugh, Peter]
Overview
========
See the series which added minor faults for hugetlbfs [3] for a detailed
overview of minor fault handling in general. This series adds the same support
for shmem-backed areas.
This series is structured as follows:
- Commits 1 and 2 are cleanups.
- Commits 3 and 4 implement the new feature (minor fault handling for shmem).
- Commits 5, 6, 7, 8 update the userfaultfd selftest to exercise the feature.
- Commit 9 is one final cleanup, modifying an existing code path to re-use a new
helper we've introduced. We rely on the selftest to show that this change
doesn't break anything.
- Commit 10 is a small documentation update to reflect the new changes.
Use Case
========
In some cases it is useful to have VM memory backed by tmpfs instead of
hugetlbfs. So, this feature will be used to support the same VM live migration
use case described in my original series.
Additionally, Android folks (Lokesh Gidra <lokeshgidra(a)google.com>) hope to
optimize the Android Runtime garbage collector using this feature:
"The plan is to use userfaultfd for concurrently compacting the heap. With
this feature, the heap can be shared-mapped at another location where the
GC-thread(s) could continue the compaction operation without the need to
invoke userfault ioctl(UFFDIO_COPY) each time. OTOH, if and when Java threads
get faults on the heap, UFFDIO_CONTINUE can be used to resume execution.
Furthermore, this feature enables updating references in the 'non-moving'
portion of the heap efficiently. Without this feature, unnecessary page
copying (ioctl(UFFDIO_COPY)) would be required."
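To make the "shared-mapped at another location" idea concrete, here is a
minimal sketch of an alias mapping over a memfd-backed (shmem) region. It
leaves userfaultfd out entirely and only shows that a write through one
mapping is visible through the other. The function name and the memfd
name are made up for the example, and the raw syscall is used instead of
the memfd_create() wrapper only to avoid feature-test macros.

```c
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Map the same memfd-backed region twice and check that data written
 * through the first mapping can be read through the second.
 * Returns 0 on success, -1 on any failure.
 */
static int alias_mapping_demo(void)
{
	const size_t len = (size_t)sysconf(_SC_PAGESIZE);
	char *a = MAP_FAILED, *b = MAP_FAILED;
	int ret = -1;
	/* Equivalent to memfd_create("alias-demo", 0); the name is arbitrary. */
	int fd = (int)syscall(SYS_memfd_create, "alias-demo", 0u);

	if (fd < 0)
		return -1;
	if (ftruncate(fd, (off_t)len) != 0)
		goto out;

	a = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	b = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (a == MAP_FAILED || b == MAP_FAILED)
		goto out;
	assert(a != b);		/* two distinct virtual addresses */

	strcpy(a, "hello");	/* write via the first mapping */
	ret = strcmp(b, "hello") == 0 ? 0 : -1;	/* read via the alias */
out:
	if (a != MAP_FAILED)
		munmap(a, len);
	if (b != MAP_FAILED)
		munmap(b, len);
	close(fd);
	return ret;
}
```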
[1] https://lore.kernel.org/patchwork/cover/1388144/
[2] https://lore.kernel.org/patchwork/patch/1408161/
[3] https://lore.kernel.org/linux-fsdevel/20210301222728.176417-1-axelrasmussen…
Axel Rasmussen (10):
userfaultfd/hugetlbfs: avoid including userfaultfd_k.h in hugetlb.h
userfaultfd/shmem: combine shmem_{mcopy_atomic,mfill_zeropage}_pte
userfaultfd/shmem: support UFFDIO_CONTINUE for shmem
userfaultfd/shmem: support minor fault registration for shmem
userfaultfd/selftests: use memfd_create for shmem test type
userfaultfd/selftests: create alias mappings in the shmem test
userfaultfd/selftests: reinitialize test context in each test
userfaultfd/selftests: exercise minor fault handling shmem support
userfaultfd/shmem: modify shmem_mcopy_atomic_pte to use install_pte()
userfaultfd: update documentation to mention shmem minor faults
Documentation/admin-guide/mm/userfaultfd.rst | 3 +-
fs/userfaultfd.c | 6 +-
include/linux/hugetlb.h | 4 +-
include/linux/shmem_fs.h | 17 +-
include/linux/userfaultfd_k.h | 5 +
include/uapi/linux/userfaultfd.h | 7 +-
mm/hugetlb.c | 1 +
mm/memory.c | 8 +-
mm/shmem.c | 115 +++-----
mm/userfaultfd.c | 175 ++++++++----
tools/testing/selftests/vm/userfaultfd.c | 274 ++++++++++++-------
11 files changed, 364 insertions(+), 251 deletions(-)
--
2.31.1.368.gbe11c130af-goog
This small series expands the futex timeout selftests by checking whether
all operations that allow timeouts work as expected. When some version of
Thomas' series "futex: Bugfixes and FUTEX_LOCK_PI2"[0] gets merged, I'll
add the new rules to the timeout test. This test should be used to check
for regressions when modifying the timeout path or changing the
interface.
Additionally, fix a bug in the Makefile that can be found when compiling
selftests with new operations, like the one defined at [0] or from the
futex2 patchset.
[0] https://lore.kernel.org/lkml/20210422194417.866740847@linutronix.de/
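For readers unfamiliar with what such a check looks like, here is a
minimal sketch of the basic case: a FUTEX_WAIT with a relative timeout on
a word that nobody wakes, which is expected to fail with ETIMEDOUT. It is
not code from the series, and futex_wait_expect_timeout() is a name made
up for the example.

```c
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/*
 * Wait on a futex word that is never woken, with a relative timeout.
 * Returns the errno reported by FUTEX_WAIT (ETIMEDOUT is the expected
 * one), or 0 if the wait returned successfully.
 */
static int futex_wait_expect_timeout(long timeout_ns)
{
	uint32_t word = 0;
	struct timespec to = { .tv_sec = 0, .tv_nsec = timeout_ns };
	long rc;

	assert(timeout_ns >= 0 && timeout_ns < 1000000000L);

	/* FUTEX_WAIT blocks only while word == 0; the timeout is relative. */
	rc = syscall(SYS_futex, &word, FUTEX_WAIT, 0, &to, NULL, 0);
	return rc == -1 ? errno : 0;
}
```

A stray signal could make the call return EINTR instead, so a real test
would want to account for that case.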
André Almeida (2):
selftests: futex: Correctly include headers dirs
selftests: futex: Expand timeout test
.../selftests/futex/functional/Makefile | 3 +-
.../futex/functional/futex_wait_timeout.c | 126 +++++++++++++++---
2 files changed, 112 insertions(+), 17 deletions(-)
--
2.31.1
Changelog v3-->v4
Based on review comments by Doug Smythies,
1. Parse thread_siblings_list for CPU topology information to
correctly identify the cores the test should run on in
default (quick) mode.
2. A source-CPU-to-source-CPU interaction in the IPI test always
results in a lower latency and biases the average, so the latency
of same-CPU IPIs is no longer added to the average. The
latency is still displayed in the detailed logs.
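For reference, thread_siblings_list uses the usual sysfs cpulist syntax,
e.g. "0-3" or "0,64". The selftest in this series is a bash script, so
the C below is only a sketch of the parsing step under that assumption.
parse_cpulist() is a name made up for the example.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Parse a cpulist string such as "0-3" or "0,64" into cpus[].
 * Returns the number of CPUs stored, or -1 on malformed input or if
 * more than `max` CPUs would be needed.
 */
static int parse_cpulist(const char *s, int *cpus, int max)
{
	int n = 0;

	assert(s && cpus && max > 0);

	while (*s && *s != '\n') {
		char *end;
		long lo = strtol(s, &end, 10), hi;

		if (end == s || lo < 0)
			return -1;
		hi = lo;
		if (*end == '-') {		/* range "lo-hi" */
			s = end + 1;
			hi = strtol(s, &end, 10);
			if (end == s || hi < lo)
				return -1;
		}
		for (long c = lo; c <= hi; c++) {
			if (n >= max)
				return -1;
			cpus[n++] = (int)c;
		}
		if (*end == ',')
			end++;			/* next list element */
		else if (*end && *end != '\n')
			return -1;		/* unexpected character */
		s = end;
	}
	return n;
}
```

Picking the first CPU of each distinct sibling list is one way to get a
single thread per core for a quick run.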
RFC v3: https://lkml.org/lkml/2021/4/4/31
---
A kernel module + userspace driver to estimate the wakeup latency
caused by going into stop states. The motivation behind this program is
to find significant deviations from the advertised latency and residency
values.
The patchset measures latencies for two kinds of events: IPIs and timers.
As this is a software-only mechanism, there will be additional latencies
from the kernel-firmware-hardware interactions. To account for that, the
program also measures a baseline latency on a 100 percent loaded CPU,
and the latencies achieved must be viewed relative to that.
To achieve this, we introduce a kernel module and expose its control
knobs through the debugfs interface that the selftests can engage with.
The kernel module provides the following interfaces within
/sys/kernel/debug/latency_test/:
IPI test:
ipi_cpu_dest = Destination CPU for the IPI
ipi_cpu_src = Origin of the IPI
ipi_latency_ns = Measured latency time in ns
Timeout test:
timeout_cpu_src = CPU on which the timer is to be queued
timeout_expected_ns = Timer duration
timeout_diff_ns = Difference of actual duration vs expected timer
Sample output on a POWER9 system is as follows:
# --IPI Latency Test---
# Baseline Average IPI latency(ns): 3114
# Observed Average IPI latency(ns) - State0: 3265
# Observed Average IPI latency(ns) - State1: 3507
# Observed Average IPI latency(ns) - State2: 3739
# Observed Average IPI latency(ns) - State3: 3807
# Observed Average IPI latency(ns) - State4: 17070
# Observed Average IPI latency(ns) - State5: 1038174
# Observed Average IPI latency(ns) - State6: 1068784
#
# --Timeout Latency Test--
# Baseline Average timeout diff(ns): 1420
# Observed Average timeout diff(ns) - State0: 1640
# Observed Average timeout diff(ns) - State1: 1764
# Observed Average timeout diff(ns) - State2: 1715
# Observed Average timeout diff(ns) - State3: 1845
# Observed Average timeout diff(ns) - State4: 16581
# Observed Average timeout diff(ns) - State5: 939977
# Observed Average timeout diff(ns) - State6: 1073024
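One simple way to read these numbers relative to the baseline is to
subtract the baseline average from each state's observed average. That
is an interpretation rather than something the patchset computes, and
extra_latency_ns() is a name made up for the example.

```c
#include <stdint.h>

/*
 * Extra latency attributed to an idle state: observed average minus the
 * busy-CPU baseline. Clamped at 0 in case noise makes the observed value
 * dip below the baseline.
 */
static uint64_t extra_latency_ns(uint64_t observed_ns, uint64_t baseline_ns)
{
	return observed_ns > baseline_ns ? observed_ns - baseline_ns : 0;
}
```

For the IPI numbers above, State4 gives 17070 - 3114 = 13956 ns over the
baseline.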
Things to keep in mind:
1. This kernel module + bash driver does not guarantee idleness on a
core when the IPI and the timer are armed. It only invokes sleep and
hopes that the core is idle once the IPI/timer is invoked onto it.
Hence this program must be run on a completely idle system for best
results.
2. Even on a completely idle system, there may be book-keeping tasks or
jitter tasks that can run on the core we want idle. This can create
outliers in the latency measurement. Thankfully, these outliers
should be large enough to be weeded out easily.
3. A userspace-only selftest variant was also sent out as an RFC based on
suggestions over the previous patchset to simplify the kernel
complexity. However, a userspace-only approach had more noise in
the latency measurement due to userspace-kernel interactions,
which led to run-to-run variance and a less accurate test.
Another downside of a userspace program is that it
takes orders of magnitude longer to complete a full system test
compared to the kernel framework.
RFC patch: https://lkml.org/lkml/2020/9/2/356
4. For Intel systems, the timer-based latencies don't exactly give
the measure of idle latencies. This is because of a hardware
optimization mechanism that pre-arms a CPU when a timer is set to
wake it up. That doesn't make this metric useless for Intel systems,
it just means that it is measuring IPI/timer response latency rather
than idle wakeup latencies.
(Source: https://lkml.org/lkml/2020/9/2/610)
As a solution to this problem, a hardware-based latency analyzer has
been devised by Artem Bityutskiy from Intel.
https://youtu.be/Opk92aQyvt0?t=8266
https://intel.github.io/wult/
Pratik Rajesh Sampat (2):
cpuidle: Extract IPI based and timer based wakeup latency from idle
states
selftest/cpuidle: Add support for cpuidle latency measurement
drivers/cpuidle/Makefile | 1 +
drivers/cpuidle/test-cpuidle_latency.c | 157 ++++++++
lib/Kconfig.debug | 10 +
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/cpuidle/Makefile | 6 +
tools/testing/selftests/cpuidle/cpuidle.sh | 402 +++++++++++++++++++++
tools/testing/selftests/cpuidle/settings | 2 +
7 files changed, 579 insertions(+)
create mode 100644 drivers/cpuidle/test-cpuidle_latency.c
create mode 100644 tools/testing/selftests/cpuidle/Makefile
create mode 100755 tools/testing/selftests/cpuidle/cpuidle.sh
create mode 100644 tools/testing/selftests/cpuidle/settings
--
2.17.1
We found that with the latest mainline kernel (5.12.0-051200rc8) on
some KVM instances / bare-metal systems, the following tests take
longer than the kselftest framework default timeout (45 seconds) to
run and thus get terminated with a TIMEOUT error:
* xfrm_policy.sh - took about 2m20s
* pmtu.sh - took about 3m5s
* udpgso_bench.sh - took about 60s
Bump the timeout setting to 5 minutes to give them a chance to
finish.
https://bugs.launchpad.net/bugs/1856010
Signed-off-by: Po-Hsu Lin <po-hsu.lin(a)canonical.com>
---
tools/testing/selftests/net/Makefile | 2 ++
tools/testing/selftests/net/settings | 1 +
2 files changed, 3 insertions(+)
create mode 100644 tools/testing/selftests/net/settings
diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index 25f198b..2be4670 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -37,6 +37,8 @@ TEST_GEN_FILES += ipsec
TEST_GEN_PROGS = reuseport_bpf reuseport_bpf_cpu reuseport_bpf_numa
TEST_GEN_PROGS += reuseport_dualstack reuseaddr_conflict tls
+TEST_FILES := settings
+
KSFT_KHDR_INSTALL := 1
include ../lib.mk
diff --git a/tools/testing/selftests/net/settings b/tools/testing/selftests/net/settings
new file mode 100644
index 0000000..694d707
--- /dev/null
+++ b/tools/testing/selftests/net/settings
@@ -0,0 +1 @@
+timeout=300
--
2.7.4
Add in:
* kunit_kmalloc_array() and wire up kunit_kmalloc() to be a special
case of it.
* kunit_kcalloc() for symmetry with kunit_kzalloc()
This should make using KUnit more natural by making it more similar to the
existing *alloc() APIs.
And while we shouldn't necessarily be writing unit tests where overflow
should be a concern, it can't hurt to be safe.
Signed-off-by: Daniel Latypov <dlatypov(a)google.com>
---
include/kunit/test.h | 36 ++++++++++++++++++++++++++++++++----
lib/kunit/test.c | 22 ++++++++++++----------
2 files changed, 44 insertions(+), 14 deletions(-)
diff --git a/include/kunit/test.h b/include/kunit/test.h
index 49601c4b98b8..7fa0de4af977 100644
--- a/include/kunit/test.h
+++ b/include/kunit/test.h
@@ -577,16 +577,30 @@ static inline int kunit_destroy_named_resource(struct kunit *test,
void kunit_remove_resource(struct kunit *test, struct kunit_resource *res);
/**
- * kunit_kmalloc() - Like kmalloc() except the allocation is *test managed*.
+ * kunit_kmalloc_array() - Like kmalloc_array() except the allocation is *test managed*.
* @test: The test context object.
+ * @n: number of elements.
* @size: The size in bytes of the desired memory.
* @gfp: flags passed to underlying kmalloc().
*
- * Just like `kmalloc(...)`, except the allocation is managed by the test case
+ * Just like `kmalloc_array(...)`, except the allocation is managed by the test case
* and is automatically cleaned up after the test case concludes. See &struct
* kunit_resource for more information.
*/
-void *kunit_kmalloc(struct kunit *test, size_t size, gfp_t gfp);
+void *kunit_kmalloc_array(struct kunit *test, size_t n, size_t size, gfp_t flags);
+
+/**
+ * kunit_kmalloc() - Like kmalloc() except the allocation is *test managed*.
+ * @test: The test context object.
+ * @size: The size in bytes of the desired memory.
+ * @gfp: flags passed to underlying kmalloc().
+ *
+ * See kmalloc() and kunit_kmalloc_array() for more information.
+ */
+static inline void *kunit_kmalloc(struct kunit *test, size_t size, gfp_t gfp)
+{
+ return kunit_kmalloc_array(test, 1, size, gfp);
+}
/**
* kunit_kfree() - Like kfree except for allocations managed by KUnit.
@@ -601,13 +615,27 @@ void kunit_kfree(struct kunit *test, const void *ptr);
* @size: The size in bytes of the desired memory.
* @gfp: flags passed to underlying kmalloc().
*
- * See kzalloc() and kunit_kmalloc() for more information.
+ * See kzalloc() and kunit_kmalloc_array() for more information.
*/
static inline void *kunit_kzalloc(struct kunit *test, size_t size, gfp_t gfp)
{
return kunit_kmalloc(test, size, gfp | __GFP_ZERO);
}
+/**
+ * kunit_kcalloc() - Just like kunit_kmalloc_array(), but zeroes the allocation.
+ * @test: The test context object.
+ * @n: number of elements.
+ * @size: The size in bytes of the desired memory.
+ * @gfp: flags passed to underlying kmalloc().
+ *
+ * See kcalloc() and kunit_kmalloc_array() for more information.
+ */
+static inline void *kunit_kcalloc(struct kunit *test, size_t n, size_t size, gfp_t flags)
+{
+ return kunit_kmalloc_array(test, n, size, flags | __GFP_ZERO);
+}
+
void kunit_cleanup(struct kunit *test);
void kunit_log_append(char *log, const char *fmt, ...);
diff --git a/lib/kunit/test.c b/lib/kunit/test.c
index ec9494e914ef..052fccf69eef 100644
--- a/lib/kunit/test.c
+++ b/lib/kunit/test.c
@@ -540,41 +540,43 @@ int kunit_destroy_resource(struct kunit *test, kunit_resource_match_t match,
}
EXPORT_SYMBOL_GPL(kunit_destroy_resource);
-struct kunit_kmalloc_params {
+struct kunit_kmalloc_array_params {
+ size_t n;
size_t size;
gfp_t gfp;
};
-static int kunit_kmalloc_init(struct kunit_resource *res, void *context)
+static int kunit_kmalloc_array_init(struct kunit_resource *res, void *context)
{
- struct kunit_kmalloc_params *params = context;
+ struct kunit_kmalloc_array_params *params = context;
- res->data = kmalloc(params->size, params->gfp);
+ res->data = kmalloc_array(params->n, params->size, params->gfp);
if (!res->data)
return -ENOMEM;
return 0;
}
-static void kunit_kmalloc_free(struct kunit_resource *res)
+static void kunit_kmalloc_array_free(struct kunit_resource *res)
{
kfree(res->data);
}
-void *kunit_kmalloc(struct kunit *test, size_t size, gfp_t gfp)
+void *kunit_kmalloc_array(struct kunit *test, size_t n, size_t size, gfp_t gfp)
{
- struct kunit_kmalloc_params params = {
+ struct kunit_kmalloc_array_params params = {
.size = size,
+ .n = n,
.gfp = gfp
};
return kunit_alloc_resource(test,
- kunit_kmalloc_init,
- kunit_kmalloc_free,
+ kunit_kmalloc_array_init,
+ kunit_kmalloc_array_free,
gfp,
&params);
}
-EXPORT_SYMBOL_GPL(kunit_kmalloc);
+EXPORT_SYMBOL_GPL(kunit_kmalloc_array);
void kunit_kfree(struct kunit *test, const void *ptr)
{
base-commit: 16fc44d6387e260f4932e9248b985837324705d8
--
2.31.1.498.g6c1eba8ee3d-goog
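The overflow concern mentioned in the kunit_kmalloc_array() commit message
above can be illustrated outside the kernel. What follows is only a minimal
userspace sketch, not the kernel implementation: checked_malloc_array(),
checked_malloc() and demo_checked_alloc() are hypothetical names, and plain
malloc() stands in for the kernel allocator. It shows an array allocator that
refuses a request whose n * size product would wrap around, with the
single-object allocator defined as the n == 1 special case, mirroring how the
patch wires up kunit_kmalloc().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical userspace stand-in for an overflow-checked array
 * allocation: return NULL when n * size does not fit in size_t
 * instead of silently allocating a too-small buffer.
 */
static void *checked_malloc_array(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;	/* n * size would overflow */
	return malloc(n * size);
}

/* The single-object case is an array of exactly one element. */
static void *checked_malloc(size_t size)
{
	return checked_malloc_array(1, size);
}

/* Small usage demo (assumption: small allocations succeed). */
static void demo_checked_alloc(void)
{
	int *v = checked_malloc_array(4, sizeof(*v));
	char *b = checked_malloc(16);

	assert(v != NULL);
	assert(b != NULL);
	free(v);
	free(b);

	/* SIZE_MAX * 2 wraps around, so the request must be refused. */
	assert(checked_malloc_array(SIZE_MAX, 2) == NULL);
}
```

Expressing the scalar allocator in terms of the array one keeps a single
code path for the overflow check, which is the same design choice the patch
makes.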
The kernel now has a number of testing and debugging tools, and we've
seen a bit of confusion about what the differences between them are.
Add basic documentation outlining the testing tools, when to use each,
and how they interact.
This is a pretty quick overview rather than the idealised "kernel
testing guide" that'd probably be optimal, but given the number of times
questions like "When do you use KUnit and when do you use Kselftest?"
are being asked, it seemed worth at least having something. Hopefully
this can form the basis for more detailed documentation later.
Signed-off-by: David Gow <davidgow(a)google.com>
Reviewed-by: Marco Elver <elver(a)google.com>
Reviewed-by: Daniel Latypov <dlatypov(a)google.com>
---
Thanks again. Assuming no-one has any objections, I think this is good
to go.
-- David
Changes since v2:
https://lore.kernel.org/linux-kselftest/20210414081428.337494-1-davidgow@go…
- A few typo fixes (Thanks Daniel)
- Reworded description of dynamic analysis tools.
- Updated dev-tools index page to not use ':doc:' syntax, but to provide
a path instead.
- Added Marco and Daniel's Reviewed-by tags.
Changes since v1:
https://lore.kernel.org/linux-kselftest/20210410070529.4113432-1-davidgow@g…
- Note KUnit's speed and that one should provide selftests for syscalls
- Mention lockdep as a Dynamic Analysis Tool
- Refer to "Dynamic Analysis Tools" instead of "Sanitizers"
- A number of minor formatting tweaks and rewordings for clarity
Documentation/dev-tools/index.rst | 4 +
Documentation/dev-tools/testing-overview.rst | 117 +++++++++++++++++++
2 files changed, 121 insertions(+)
create mode 100644 Documentation/dev-tools/testing-overview.rst
diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
index 1b1cf4f5c9d9..929d916ffd4c 100644
--- a/Documentation/dev-tools/index.rst
+++ b/Documentation/dev-tools/index.rst
@@ -7,6 +7,9 @@ be used to work on the kernel. For now, the documents have been pulled
together without any significant effort to integrate them into a coherent
whole; patches welcome!
+A brief overview of testing-specific tools can be found in
+Documentation/dev-tools/testing-overview.rst
+
.. class:: toc-title
Table of contents
@@ -14,6 +17,7 @@ whole; patches welcome!
.. toctree::
:maxdepth: 2
+ testing-overview
coccinelle
sparse
kcov
diff --git a/Documentation/dev-tools/testing-overview.rst b/Documentation/dev-tools/testing-overview.rst
new file mode 100644
index 000000000000..b5b46709969c
--- /dev/null
+++ b/Documentation/dev-tools/testing-overview.rst
@@ -0,0 +1,117 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====================
+Kernel Testing Guide
+====================
+
+
+There are a number of different tools for testing the Linux kernel, so knowing
+when to use each of them can be a challenge. This document provides a rough
+overview of their differences, and how they fit together.
+
+
+Writing and Running Tests
+=========================
+
+The bulk of kernel tests are written using either the kselftest or KUnit
+frameworks. These both provide infrastructure to help make running tests and
+groups of tests easier, as well as providing helpers to aid in writing new
+tests.
+
+If you're looking to verify the behaviour of the Kernel — particularly specific
+parts of the kernel — then you'll want to use KUnit or kselftest.
+
+
+The Difference Between KUnit and kselftest
+------------------------------------------
+
+KUnit (Documentation/dev-tools/kunit/index.rst) is an entirely in-kernel system
+for "white box" testing: because test code is part of the kernel, it can access
+internal structures and functions which aren't exposed to userspace.
+
+KUnit tests therefore are best written against small, self-contained parts
+of the kernel, which can be tested in isolation. This aligns well with the
+concept of 'unit' testing.
+
+For example, a KUnit test might test an individual kernel function (or even a
+single codepath through a function, such as an error handling case), rather
+than a feature as a whole.
+
+This also makes KUnit tests very fast to build and run, allowing them to be
+run frequently as part of the development process.
+
+There is a KUnit test style guide which may give further pointers in
+Documentation/dev-tools/kunit/style.rst
+
+
+kselftest (Documentation/dev-tools/kselftest.rst), on the other hand, is
+largely implemented in userspace, and tests are normal userspace scripts or
+programs.
+
+This makes it easier to write more complicated tests, or tests which need to
+manipulate the overall system state more (e.g., spawning processes, etc.).
+However, it's not possible to call kernel functions directly from kselftest.
+This means that only kernel functionality which is exposed to userspace somehow
+(e.g. by a syscall, device, filesystem, etc.) can be tested with kselftest. To
+work around this, some tests include a companion kernel module which exposes
+more information or functionality. If a test runs mostly or entirely within the
+kernel, however, KUnit may be the more appropriate tool.
+
+kselftest is therefore suited well to tests of whole features, as these will
+expose an interface to userspace, which can be tested, but not implementation
+details. This aligns well with 'system' or 'end-to-end' testing.
+
+For example, all new system calls should be accompanied by kselftest tests.
+
+Code Coverage Tools
+===================
+
+The Linux Kernel supports two different code coverage measurement tools. These
+can be used to verify that a test is executing particular functions or lines
+of code. This is useful for determining how much of the kernel is being tested,
+and for finding corner-cases which are not covered by the appropriate test.
+
+:doc:`gcov` is GCC's coverage testing tool, which can be used with the kernel
+to get global or per-module coverage. Unlike KCOV, it does not record per-task
+coverage. Coverage data can be read from debugfs, and interpreted using the
+usual gcov tooling.
+
+:doc:`kcov` is a feature which can be built in to the kernel to allow
+capturing coverage on a per-task level. It's therefore useful for fuzzing and
+other situations where information about code executed during, for example, a
+single syscall is useful.
+
+
+Dynamic Analysis Tools
+======================
+
+The kernel also supports a number of dynamic analysis tools, which attempt to
+detect classes of issues when they occur in a running kernel. These typically
+each look for a different class of bugs, such as invalid memory accesses,
+concurrency issues such as data races, or other undefined behaviour like
+integer overflows.
+
+Some of these tools are listed below:
+
+* kmemleak detects possible memory leaks. See
+ Documentation/dev-tools/kmemleak.rst
+* KASAN detects invalid memory accesses such as out-of-bounds and
+ use-after-free errors. See Documentation/dev-tools/kasan.rst
+* UBSAN detects behaviour that is undefined by the C standard, like integer
+ overflows. See Documentation/dev-tools/ubsan.rst
+* KCSAN detects data races. See Documentation/dev-tools/kcsan.rst
+* KFENCE is a low-overhead detector of memory issues, which is much faster than
+ KASAN and can be used in production. See Documentation/dev-tools/kfence.rst
+* lockdep is a locking correctness validator. See
+ Documentation/locking/lockdep-design.rst
+* There are several other pieces of debug instrumentation in the kernel, many
+ of which can be found in lib/Kconfig.debug
+
+These tools tend to test the kernel as a whole, and do not "pass" like
+kselftest or KUnit tests. They can be combined with KUnit or kselftest by
+running tests on a kernel with these tools enabled: you can then be sure
+that none of these errors are occurring during the test.
+
+Some of these tools integrate with KUnit or kselftest and will
+automatically fail tests if an issue is detected.
+
--
2.31.1.295.g9ea45b61b8-goog
From: Colin Ian King <colin.king(a)canonical.com>
There are a few function prototypes that are missing a void parameter,
fix this by adding it in.
Signed-off-by: Colin Ian King <colin.king(a)canonical.com>
---
tools/testing/selftests/vm/mremap_dontunmap.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/vm/mremap_dontunmap.c b/tools/testing/selftests/vm/mremap_dontunmap.c
index f01dc4a85b0b..78baaf0e85d9 100644
--- a/tools/testing/selftests/vm/mremap_dontunmap.c
+++ b/tools/testing/selftests/vm/mremap_dontunmap.c
@@ -42,7 +42,7 @@ static void dump_maps(void)
// Try a simple operation for to "test" for kernel support this prevents
// reporting tests as failed when it's run on an older kernel.
-static int kernel_support_for_mremap_dontunmap()
+static int kernel_support_for_mremap_dontunmap(void)
{
int ret = 0;
unsigned long num_pages = 1;
@@ -95,7 +95,7 @@ static int check_region_contains_byte(void *addr, unsigned long size, char byte)
// this test validates that MREMAP_DONTUNMAP moves the pagetables while leaving
// the source mapping mapped.
-static void mremap_dontunmap_simple()
+static void mremap_dontunmap_simple(void)
{
unsigned long num_pages = 5;
@@ -128,7 +128,7 @@ static void mremap_dontunmap_simple()
}
// This test validates that MREMAP_DONTUNMAP on a shared mapping works as expected.
-static void mremap_dontunmap_simple_shmem()
+static void mremap_dontunmap_simple_shmem(void)
{
unsigned long num_pages = 5;
@@ -181,7 +181,7 @@ static void mremap_dontunmap_simple_shmem()
// This test validates MREMAP_DONTUNMAP will move page tables to a specific
// destination using MREMAP_FIXED, also while validating that the source
// remains intact.
-static void mremap_dontunmap_simple_fixed()
+static void mremap_dontunmap_simple_fixed(void)
{
unsigned long num_pages = 5;
@@ -226,7 +226,7 @@ static void mremap_dontunmap_simple_fixed()
// This test validates that we can MREMAP_DONTUNMAP for a portion of an
// existing mapping.
-static void mremap_dontunmap_partial_mapping()
+static void mremap_dontunmap_partial_mapping(void)
{
/*
* source mapping:
--
2.30.2
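As background for the prototype fix above: in C before C23, a function
declared with empty parentheses has no prototype, so the compiler is not
required to diagnose a call that passes arguments to it. Writing (void)
makes the declaration a full prototype. The sketch below uses hypothetical
names (count_pages(), demo_prototype()) that are not part of the selftest.

```c
#include <assert.h>

/*
 * With (void) this is a full prototype: a call such as
 * count_pages(3) must be diagnosed by the compiler. Had the
 * parameter list been left empty, a pre-C23 compiler would not
 * be required to reject that call.
 */
static int count_pages(void)
{
	return 5;
}

/* Small usage demo: the only valid call passes no arguments. */
static void demo_prototype(void)
{
	assert(count_pages() == 5);
}
```

Compilers can flag the empty-parentheses form, for example with GCC's
-Wstrict-prototypes warning.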
Base
====
This series is based on (and therefore should apply cleanly to) the tag
"v5.12-rc7-mmots-2021-04-11-20-49", additionally with Peter's selftest cleanup
series applied first:
https://lore.kernel.org/patchwork/cover/1412450/
Changelog
=========
v2->v3:
- Picked up {Reviewed,Acked}-by's.
- Reorder commits: introduce CONTINUE before MINOR registration. [Hugh, Peter]
- Don't try to {unlock,put}_page an xarray value in shmem_getpage_gfp. [Hugh]
- Move enum mcopy_atomic_mode forward declare out of CONFIG_HUGETLB_PAGE. [Hugh]
- Keep mistakenly removed UFFD_USER_MODE_ONLY in selftest. [Peter]
- Cleanup context management in self test (make clear implicit, remove unneeded
return values now that we have err()). [Peter]
- Correct dst_pte argument to dst_pmd in shmem_mcopy_atomic_pte macro. [Hugh]
- Mention the new shmem support feature in documentation. [Hugh]
v1->v2:
- Pick up Reviewed-by's.
- Don't swapin page when a minor fault occurs. Notice that it needs to be
swapped in, and just immediately fire the minor fault. Let a future CONTINUE
deal with swapping in the page. [Peter]
- Clarify comment about i_size checks in mm/userfaultfd.c. [Peter]
- Only forward declare once (out of #ifdef) in hugetlb.h. [Peter]
Changes since [2]:
- Squash the fixes ([2]) in with the original series ([1]). This makes reviewing
easier, as we no longer have to sift through deltas undoing what we had done
before. [Hugh, Peter]
- Modify shmem_mcopy_atomic_pte() to use the new mcopy_atomic_install_ptes()
helper, reducing code duplication. [Hugh]
- Properly trigger handle_userfault() in the shmem_swapin_page() case. [Hugh]
- Use shmem_getpage() instead of find_lock_page() to look up the existing page
for continue. This properly deals with swapped-out pages. [Hugh]
- Unconditionally pte_mkdirty() for anon memory (as before). [Peter]
- Don't include userfaultfd_k.h in either hugetlb.h or shmem_fs.h. [Hugh]
- Add comment for UFFD_FEATURE_MINOR_SHMEM (to match _HUGETLBFS). [Hugh]
- Fix some small cleanup issues (parens, reworded conditionals, reduced plumbing
of some parameters, simplify labels/gotos, ...). [Hugh, Peter]
Overview
========
See the series which added minor faults for hugetlbfs [3] for a detailed
overview of minor fault handling in general. This series adds the same support
for shmem-backed areas.
This series is structured as follows:
- Commits 1 and 2 are cleanups.
- Commits 3 and 4 implement the new feature (minor fault handling for shmem).
- Commits 5, 6, 7, 8 update the userfaultfd selftest to exercise the feature.
- Commit 9 is one final cleanup, modifying an existing code path to re-use a new
helper we've introduced. We rely on the selftest to show that this change
doesn't break anything.
Use Case
========
In some cases it is useful to have VM memory backed by tmpfs instead of
hugetlbfs. So, this feature will be used to support the same VM live migration
use case described in my original series.
Additionally, Android folks (Lokesh Gidra <lokeshgidra(a)google.com>) hope to
optimize the Android Runtime garbage collector using this feature:
"The plan is to use userfaultfd for concurrently compacting the heap. With
this feature, the heap can be shared-mapped at another location where the
GC-thread(s) could continue the compaction operation without the need to
invoke userfault ioctl(UFFDIO_COPY) each time. OTOH, if and when Java threads
get faults on the heap, UFFDIO_CONTINUE can be used to resume execution.
Furthermore, this feature enables updating references in the 'non-moving'
portion of the heap efficiently. Without this feature, unnecessary page
copying (ioctl(UFFDIO_COPY)) would be required."
[1] https://lore.kernel.org/patchwork/cover/1388144/
[2] https://lore.kernel.org/patchwork/patch/1408161/
[3] https://lore.kernel.org/linux-fsdevel/20210301222728.176417-1-axelrasmussen…
Axel Rasmussen (10):
userfaultfd/hugetlbfs: avoid including userfaultfd_k.h in hugetlb.h
userfaultfd/shmem: combine shmem_{mcopy_atomic,mfill_zeropage}_pte
userfaultfd/shmem: support UFFDIO_CONTINUE for shmem
userfaultfd/shmem: support minor fault registration for shmem
userfaultfd/selftests: use memfd_create for shmem test type
userfaultfd/selftests: create alias mappings in the shmem test
userfaultfd/selftests: reinitialize test context in each test
userfaultfd/selftests: exercise minor fault handling shmem support
userfaultfd/shmem: modify shmem_mcopy_atomic_pte to use install_ptes
userfaultfd: update documentation to mention shmem minor faults
Documentation/admin-guide/mm/userfaultfd.rst | 3 +-
fs/userfaultfd.c | 6 +-
include/linux/hugetlb.h | 4 +-
include/linux/shmem_fs.h | 17 +-
include/linux/userfaultfd_k.h | 5 +
include/uapi/linux/userfaultfd.h | 7 +-
mm/hugetlb.c | 1 +
mm/memory.c | 8 +-
mm/shmem.c | 114 +++-----
mm/userfaultfd.c | 183 +++++++++----
tools/testing/selftests/vm/userfaultfd.c | 274 ++++++++++++-------
11 files changed, 371 insertions(+), 251 deletions(-)
--
2.31.1.368.gbe11c130af-goog
Base
====
This series is based on (and therefore should apply cleanly to) the tag
"v5.12-rc7-mmots-2021-04-11-20-49", additionally with Peter's selftest cleanup
series applied *first*:
https://lore.kernel.org/patchwork/cover/1412450/
Changelog
=========
v1->v2:
- Pick up Reviewed-by's.
- Don't swapin page when a minor fault occurs. Notice that it needs to be
swapped in, and just immediately fire the minor fault. Let a future CONTINUE
deal with swapping in the page. [Peter]
- Clarify comment about i_size checks in mm/userfaultfd.c. [Peter]
- Only forward declare once (out of #ifdef) in hugetlb.h. [Peter]
Changes since [2]:
- Squash the fixes ([2]) in with the original series ([1]). This makes reviewing
easier, as we no longer have to sift through deltas undoing what we had done
before. [Hugh, Peter]
- Modify shmem_mcopy_atomic_pte() to use the new mcopy_atomic_install_ptes()
helper, reducing code duplication. [Hugh]
- Properly trigger handle_userfault() in the shmem_swapin_page() case. [Hugh]
- Use shmem_getpage() instead of find_lock_page() to look up the existing page
for continue. This properly deals with swapped-out pages. [Hugh]
- Unconditionally pte_mkdirty() for anon memory (as before). [Peter]
- Don't include userfaultfd_k.h in either hugetlb.h or shmem_fs.h. [Hugh]
- Add comment for UFFD_FEATURE_MINOR_SHMEM (to match _HUGETLBFS). [Hugh]
- Fix some small cleanup issues (parens, reworded conditionals, reduced plumbing
of some parameters, simplify labels/gotos, ...). [Hugh, Peter]
Overview
========
See the series which added minor faults for hugetlbfs [3] for a detailed
overview of minor fault handling in general. This series adds the same support
for shmem-backed areas.
This series is structured as follows:
- Commits 1 and 2 are cleanups.
- Commits 3 and 4 implement the new feature (minor fault handling for shmem).
- Commits 5, 6, 7, 8 update the userfaultfd selftest to exercise the feature.
- Commit 9 is one final cleanup, modifying an existing code path to re-use a new
helper we've introduced. We rely on the selftest to show that this change
doesn't break anything.
Use Case
========
In some cases it is useful to have VM memory backed by tmpfs instead of
hugetlbfs. So, this feature will be used to support the same VM live migration
use case described in my original series.
Additionally, Android folks (Lokesh Gidra <lokeshgidra(a)google.com>) hope to
optimize the Android Runtime garbage collector using this feature:
"The plan is to use userfaultfd for concurrently compacting the heap. With
this feature, the heap can be shared-mapped at another location where the
GC-thread(s) could continue the compaction operation without the need to
invoke userfault ioctl(UFFDIO_COPY) each time. OTOH, if and when Java threads
get faults on the heap, UFFDIO_CONTINUE can be used to resume execution.
Furthermore, this feature enables updating references in the 'non-moving'
portion of the heap efficiently. Without this feature, unnecessary page
copying (ioctl(UFFDIO_COPY)) would be required."
[1] https://lore.kernel.org/patchwork/cover/1388144/
[2] https://lore.kernel.org/patchwork/patch/1408161/
[3] https://lore.kernel.org/linux-fsdevel/20210301222728.176417-1-axelrasmussen…
Axel Rasmussen (9):
userfaultfd/hugetlbfs: avoid including userfaultfd_k.h in hugetlb.h
userfaultfd/shmem: combine shmem_{mcopy_atomic,mfill_zeropage}_pte
userfaultfd/shmem: support minor fault registration for shmem
userfaultfd/shmem: support UFFDIO_CONTINUE for shmem
userfaultfd/selftests: use memfd_create for shmem test type
userfaultfd/selftests: create alias mappings in the shmem test
userfaultfd/selftests: reinitialize test context in each test
userfaultfd/selftests: exercise minor fault handling shmem support
userfaultfd/shmem: modify shmem_mcopy_atomic_pte to use install_ptes
fs/userfaultfd.c | 6 +-
include/linux/hugetlb.h | 4 +-
include/linux/shmem_fs.h | 15 +-
include/linux/userfaultfd_k.h | 5 +
include/uapi/linux/userfaultfd.h | 7 +-
mm/hugetlb.c | 1 +
mm/memory.c | 8 +-
mm/shmem.c | 112 +++------
mm/userfaultfd.c | 183 ++++++++++-----
tools/testing/selftests/vm/userfaultfd.c | 280 +++++++++++++++--------
10 files changed, 377 insertions(+), 244 deletions(-)
--
2.31.1.295.g9ea45b61b8-goog
From: Mike Rapoport <rppt(a)linux.ibm.com>
Hi,
This is an updated version of page_is_secretmem() changes.
This is based on v5.12-rc7-mmots-2021-04-15-16-28.
@Andrew, please let me know if you'd like me to rebase it differently or
resend the entire set.
v2:
* move the check for secretmem page in gup_pte_range after we get a
reference to the page, per Matthew.
Mike Rapoport (2):
secretmem/gup: don't check if page is secretmem without reference
secretmem: optimize page_is_secretmem()
include/linux/secretmem.h | 26 +++++++++++++++++++++++++-
mm/gup.c | 6 +++---
mm/secretmem.c | 12 +-----------
3 files changed, 29 insertions(+), 15 deletions(-)
--
2.28.0
From: Mike Rapoport <rppt(a)linux.ibm.com>
Kernel test robot reported -4.2% regression of will-it-scale.per_thread_ops
due to commit "mm: introduce memfd_secret system call to create "secret"
memory areas".
The perf profile of the test indicated that the regression is caused by
page_is_secretmem() called from gup_pte_range() (inlined by gup_pgd_range):
27.76 +2.5 30.23 perf-profile.children.cycles-pp.gup_pgd_range
0.00 +3.2 3.19 ± 2% perf-profile.children.cycles-pp.page_mapping
0.00 +3.7 3.66 ± 2% perf-profile.children.cycles-pp.page_is_secretmem
Further analysis showed that the slowdown happens because neither
page_is_secretmem() nor page_mapping() is inline and, moreover, the
multiple page flag checks in page_mapping() involve calling
compound_head() several times for the same page.
Make page_is_secretmem() inline and replace page_mapping() with page flag
checks that do not imply page-to-head conversion.
Reported-by: kernel test robot <oliver.sang(a)intel.com>
Signed-off-by: Mike Rapoport <rppt(a)linux.ibm.com>
---
@Andrew,
The patch is vs v5.12-rc7-mmots-2021-04-15-16-28, I'd appreciate if it would
be added as a fixup to the memfd_secret series.
include/linux/secretmem.h | 26 +++++++++++++++++++++++++-
mm/secretmem.c | 12 +-----------
2 files changed, 26 insertions(+), 12 deletions(-)
diff --git a/include/linux/secretmem.h b/include/linux/secretmem.h
index 907a6734059c..b842b38cbeb1 100644
--- a/include/linux/secretmem.h
+++ b/include/linux/secretmem.h
@@ -4,8 +4,32 @@
#ifdef CONFIG_SECRETMEM
+extern const struct address_space_operations secretmem_aops;
+
+static inline bool page_is_secretmem(struct page *page)
+{
+ struct address_space *mapping;
+
+ /*
+ * Using page_mapping() is quite slow because of the actual call
+ * instruction and repeated compound_head(page) inside the
+ * page_mapping() function.
+ * We know that secretmem pages are not compound and LRU so we can
+ * save a couple of cycles here.
+ */
+ if (PageCompound(page) || !PageLRU(page))
+ return false;
+
+ mapping = (struct address_space *)
+ ((unsigned long)page->mapping & ~PAGE_MAPPING_FLAGS);
+
+ if (mapping != page->mapping)
+ return false;
+
+ return page->mapping->a_ops == &secretmem_aops;
+}
+
bool vma_is_secretmem(struct vm_area_struct *vma);
-bool page_is_secretmem(struct page *page);
bool secretmem_active(void);
#else
diff --git a/mm/secretmem.c b/mm/secretmem.c
index 3b1ba3991964..0bcd15e1b549 100644
--- a/mm/secretmem.c
+++ b/mm/secretmem.c
@@ -151,22 +151,12 @@ static void secretmem_freepage(struct page *page)
clear_highpage(page);
}
-static const struct address_space_operations secretmem_aops = {
+const struct address_space_operations secretmem_aops = {
.freepage = secretmem_freepage,
.migratepage = secretmem_migratepage,
.isolate_page = secretmem_isolate_page,
};
-bool page_is_secretmem(struct page *page)
-{
- struct address_space *mapping = page_mapping(page);
-
- if (!mapping)
- return false;
-
- return mapping->a_ops == &secretmem_aops;
-}
-
static struct vfsmount *secretmem_mnt;
static struct file *secretmem_file_create(unsigned long flags)
--
2.28.0
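The inline check in the patch above relies on the low bits of page->mapping
doubling as type tags: if masking those bits changes the pointer, the page
does not have a plain address_space mapping and cannot be secretmem. The
following is only a userspace sketch of that idea under stated assumptions.
MAPPING_FLAGS, struct ops, struct space, struct fake_page and
fake_page_is_secret() are hypothetical stand-ins, not the kernel
definitions, and the NULL check is an extra guard added for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the kernel's real definitions differ. */
#define MAPPING_FLAGS 0x3UL	/* low pointer bits reused as type tags */

struct ops { int id; };
struct space { const struct ops *a_ops; };
struct fake_page { void *mapping; };

static const struct ops secret_ops = { 1 };

static bool fake_page_is_secret(const struct fake_page *page)
{
	uintptr_t raw = (uintptr_t)page->mapping;
	const struct space *mapping;

	/* Any tag bit set: not a plain mapping pointer. */
	if (raw & MAPPING_FLAGS)
		return false;
	/* Extra guard for this sketch: no mapping at all. */
	if (!raw)
		return false;

	mapping = (const struct space *)raw;
	return mapping->a_ops == &secret_ops;
}

/* Small usage demo covering the three interesting cases. */
static void demo_secret_check(void)
{
	struct space secret = { &secret_ops };
	struct space other = { NULL };
	struct fake_page p;

	p.mapping = &secret;		/* untagged, matching ops */
	assert(fake_page_is_secret(&p));

	p.mapping = &other;		/* untagged, different ops */
	assert(!fake_page_is_secret(&p));

	/* Tagged pointer: rejected without being dereferenced. */
	p.mapping = (void *)((uintptr_t)&secret | 0x1UL);
	assert(!fake_page_is_secret(&p));
}
```

The point of the design is that the common "not secretmem" cases are decided
by a couple of bit tests on data already at hand, with no function call.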
This patchset introduces batched operations for the per-cpu variant of
the array map.
It also removes the percpu macros from 'bpf_util.h'. This change was
suggested by Andrii in an earlier iteration of this patchset.
The tests were updated to reflect all the new changes.
v3 -> v4:
- Prefer 'calloc()' over 'malloc()' on batch ops tests
- Add missing static keyword in a couple of test functions
- 'offset' to 'cpu_offset' as suggested by Martin
v2 -> v3:
- Remove percpu macros as suggested by Andrii
- Update tests that used the per cpu macros
v1 -> v2:
- Amended with a more descriptive commit message
Pedro Tammela (3):
bpf: add batched ops support for percpu array
bpf: selftests: remove percpu macros from bpf_util.h
bpf: selftests: update array map tests for per-cpu batched ops
kernel/bpf/arraymap.c | 2 +
tools/testing/selftests/bpf/bpf_util.h | 7 --
.../bpf/map_tests/array_map_batch_ops.c | 104 +++++++++++++-----
.../bpf/map_tests/htab_map_batch_ops.c | 87 +++++++--------
.../selftests/bpf/prog_tests/map_init.c | 9 +-
tools/testing/selftests/bpf/test_maps.c | 84 ++++++++------
6 files changed, 173 insertions(+), 120 deletions(-)
--
2.25.1
This patchset provides a file descriptor for every VM and VCPU to read
KVM statistics data in binary format.
It is meant to provide a lightweight, flexible, scalable and efficient
lock-free solution for user space telemetry applications to pull the
statistics data periodically for large scale systems. The pulling
frequency could be as high as a few times per second.
In this patchset, each statistic is treated as having the following
attributes:
* architecture dependent or common
* VM statistics data or VCPU statistics data
* type: cumulative, instantaneous
* unit: none for simple counter, nanosecond, microsecond,
millisecond, second, Byte, KiByte, MiByte, GiByte, clock cycles
Since no lock/synchronization is used, consistency between all
the statistics data is not guaranteed. That means not all statistics
data are read out at the exact same time, since the statistics data
are still being updated by KVM subsystems while they are read out.
---
* v1 -> v2
- Use ARRAY_SIZE to count the number of stats descriptors
- Fix missing `size` field initialization in macro STATS_DESC
[1] https://lore.kernel.org/kvm/20210402224359.2297157-1-jingzhangos@google.com
---
Jing Zhang (4):
KVM: stats: Separate common stats from architecture specific ones
KVM: stats: Add fd-based API to read binary stats data
KVM: stats: Add documentation for statistics data binary interface
KVM: selftests: Add selftest for KVM statistics data binary interface
Documentation/virt/kvm/api.rst | 169 ++++++++
arch/arm64/include/asm/kvm_host.h | 9 +-
arch/arm64/kvm/guest.c | 42 +-
arch/mips/include/asm/kvm_host.h | 9 +-
arch/mips/kvm/mips.c | 67 +++-
arch/powerpc/include/asm/kvm_host.h | 9 +-
arch/powerpc/kvm/book3s.c | 68 +++-
arch/powerpc/kvm/book3s_hv.c | 12 +-
arch/powerpc/kvm/book3s_pr.c | 2 +-
arch/powerpc/kvm/book3s_pr_papr.c | 2 +-
arch/powerpc/kvm/booke.c | 63 ++-
arch/s390/include/asm/kvm_host.h | 9 +-
arch/s390/kvm/kvm-s390.c | 133 ++++++-
arch/x86/include/asm/kvm_host.h | 9 +-
arch/x86/kvm/x86.c | 71 +++-
include/linux/kvm_host.h | 132 ++++++-
include/linux/kvm_types.h | 12 +
include/uapi/linux/kvm.h | 48 +++
tools/testing/selftests/kvm/.gitignore | 1 +
tools/testing/selftests/kvm/Makefile | 3 +
.../testing/selftests/kvm/include/kvm_util.h | 3 +
.../selftests/kvm/kvm_bin_form_stats.c | 370 ++++++++++++++++++
tools/testing/selftests/kvm/lib/kvm_util.c | 11 +
virt/kvm/kvm_main.c | 237 ++++++++++-
24 files changed, 1401 insertions(+), 90 deletions(-)
create mode 100644 tools/testing/selftests/kvm/kvm_bin_form_stats.c
base-commit: f96be2deac9bca3ef5a2b0b66b71fcef8bad586d
--
2.31.1.295.g9ea45b61b8-goog
Hi,
This v6 series consists of two main parts.
Rebased on kvm queue branch: https://git.kernel.org/pub/scm/virt/kvm/kvm.git/log/?h=queue
In the first part, all the known hugetlb backing src types, specified
with different hugepage sizes, are listed, so that we can request a
hugetlb source of the exact granularity we want instead of the system
default. As all the known hugetlb page sizes are listed, this is
appropriate for all architectures. In addition, a helper that returns
the granularity of the different backing src types (anonymous/thp/hugetlb)
is added, so that we can use the accurate backing src granularity for
alignment and for vcpu accesses to guest memory.
In the second part, a new test is added:
This test serves as a performance tester and a bug reproducer for kvm
page table code (GPA->HPA mappings), and gives guidance to people trying
to improve kvm. The following explains exactly what can be done with
this test.
The function guest_code() can cover the conditions where a single vcpu or
multiple vcpus access guest pages within the same memory region, in three
VM stages(before dirty logging, during dirty logging, after dirty logging).
Besides, the backing src memory type(ANONYMOUS/THP/HUGETLB) of the tested
memory region can be specified by users, which means normal page mappings
or block mappings can be chosen by users to be created in the test.
If ANONYMOUS memory is specified, kvm will create normal page mappings
for the tested memory region before dirty logging, and update attributes
of the page mappings from RO to RW during dirty logging. If THP/HUGETLB
memory is specified, kvm will create block mappings for the tested memory
region before dirty logging, and split the block mappings into normal page
mappings during dirty logging, and coalesce the page mappings back into
block mappings after dirty logging is stopped.
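As a rough feel for the difference in the number of mappings involved, the arithmetic can be sketched as follows. The region size (1 GiB), page size (4 KiB) and block size (2 MiB) are assumptions chosen for illustration; the test itself lets the user choose the backing source and the sizes depend on the architecture and configuration.

```python
KIB = 1 << 10
MIB = 1 << 20
GIB = 1 << 30


def mapping_count(region_bytes: int, granule_bytes: int) -> int:
    """Number of mappings needed to cover a region at a given granularity."""
    if region_bytes % granule_bytes:
        raise ValueError("region must be a multiple of the granule size")
    return region_bytes // granule_bytes


region = 1 * GIB   # assumed size of the tested memory region
page = 4 * KIB     # assumed normal page size
block = 2 * MIB    # assumed block (huge page) size

print("page mappings:  ", mapping_count(region, page))    # 262144
print("block mappings: ", mapping_count(region, block))   # 512
print("pages per block:", mapping_count(block, page))     # 512
```

Under these assumptions, splitting during dirty logging turns each block mapping into 512 page mappings, and coalescing afterwards reverses that.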
So in summary, as a performance tester, this test can present the
performance of kvm creating/updating normal page mappings, or the
performance of kvm creating/splitting/recovering block mappings,
through execution time.
When we need to coalesce the page mappings back into block mappings after
dirty logging is stopped, we first have to invalidate *all* the TLB
entries for the page mappings right before installing the block entry,
because a TLB conflict abort could occur if the TLB entries are not fully
invalidated. We have hit this TLB conflict twice in the aarch64 software
implementation and fixed it. As this test can simulate the transition of
a VM with block mappings from dirty logging enabled to dirty logging
stopped, it can also reproduce this TLB conflict abort caused by
inadequate TLB invalidation when coalescing tables.
Links about the TLB conflict abort:
https://lore.kernel.org/lkml/20201201201034.116760-3-wangyanan55@huawei.com/
---
Change logs:
v5->v6:
- Address Andrew Jones's comments for v5 series
- Add Andrew Jones's R-b tags in some patches
- Rebased on newest kvm/queue tree
- v5: https://lore.kernel.org/lkml/20210323135231.24948-1-wangyanan55@huawei.com/
v4->v5:
- Use synchronization(sem_wait) for time measurement
- Add a new patch about TEST_ASSERT(patch 4)
- Address Andrew Jones's comments for v4 series
- Add Andrew Jones's R-b tags in some patches
- v4: https://lore.kernel.org/lkml/20210302125751.19080-1-wangyanan55@huawei.com/
v3->v4:
- Add a helper to get system default hugetlb page size
- Add tags of Reviewed-by of Ben in the patches
- v3: https://lore.kernel.org/lkml/20210301065916.11484-1-wangyanan55@huawei.com/
v2->v3:
- Add tags of Suggested-by, Reviewed-by in the patches
- Add a generic macro to get hugetlb page sizes
- Some changes for suggestions about v2 series
- v2: https://lore.kernel.org/lkml/20210225055940.18748-1-wangyanan55@huawei.com/
v1->v2:
- Add a patch to sync header files
- Add helpers to get granularity of different backing src types
- Some changes for suggestions about v1 series
- v1: https://lore.kernel.org/lkml/20210208090841.333724-1-wangyanan55@huawei.com/
---
Yanan Wang (10):
tools headers: sync headers of asm-generic/hugetlb_encode.h
mm/hugetlb: Add a macro to get HUGETLB page sizes for mmap
KVM: selftests: Use flag CLOCK_MONOTONIC_RAW for timing
KVM: selftests: Print the errno besides error-string in TEST_ASSERT
KVM: selftests: Make a generic helper to get vm guest mode strings
KVM: selftests: Add a helper to get system configured THP page size
KVM: selftests: Add a helper to get system default hugetlb page size
KVM: selftests: List all hugetlb src types specified with page sizes
KVM: selftests: Adapt vm_userspace_mem_region_add to new helpers
KVM: selftests: Add a test for kvm page table code
include/uapi/linux/mman.h | 2 +
tools/include/asm-generic/hugetlb_encode.h | 3 +
tools/include/uapi/linux/mman.h | 2 +
tools/testing/selftests/kvm/.gitignore | 1 +
tools/testing/selftests/kvm/Makefile | 3 +
.../selftests/kvm/demand_paging_test.c | 8 +-
.../selftests/kvm/dirty_log_perf_test.c | 14 +-
.../testing/selftests/kvm/include/kvm_util.h | 4 +-
.../testing/selftests/kvm/include/test_util.h | 21 +-
.../selftests/kvm/kvm_page_table_test.c | 506 ++++++++++++++++++
tools/testing/selftests/kvm/lib/assert.c | 4 +-
tools/testing/selftests/kvm/lib/kvm_util.c | 59 +-
tools/testing/selftests/kvm/lib/test_util.c | 163 +++++-
tools/testing/selftests/kvm/steal_time.c | 4 +-
14 files changed, 733 insertions(+), 61 deletions(-)
create mode 100644 tools/testing/selftests/kvm/kvm_page_table_test.c
--
2.23.0
Since commit d9f4ff50d2aa ("kbuild: spilt cc-option and friends to
scripts/Makefile.compiler"), some kselftests fail to build.
The tools/ directory opted out of Kbuild, and went in a different
direction. Files of all kinds are copied into the tools/ directory
so that the tools can do whatever they want in their own world.
tools/build/Build.include mimics scripts/Kbuild.include, but some
tool Makefiles included the Kbuild one to import a feature that is
missing in tools/build/Build.include:
- Commit ec04aa3ae87b ("tools/thermal: tmon: use "-fstack-protector"
only if supported") included scripts/Kbuild.include from
tools/thermal/tmon/Makefile to import the cc-option macro.
- Commit c2390f16fc5b ("selftests: kvm: fix for compilers that do
not support -no-pie") included scripts/Kbuild.include from
tools/testing/selftests/kvm/Makefile to import the try-run macro.
- Commit 9cae4ace80ef ("selftests/bpf: do not ignore clang
failures") included scripts/Kbuild.include from
tools/testing/selftests/bpf/Makefile to import the .DELETE_ON_ERROR
target.
- Commit 0695f8bca93e ("selftests/powerpc: Handle Makefile for
unrecognized option") included scripts/Kbuild.include from
tools/testing/selftests/powerpc/pmu/ebb/Makefile to import the
try-run macro.
Copy what they need into tools/build/Build.include, and make them
include it instead of scripts/Kbuild.include.
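For readers less familiar with Make, the logic of the try-run and cc-option macros being copied can be sketched in Python. This is only an analogue written for illustration: the function names mirror the macros, but the callable-based interface and the use of a temporary directory are assumptions of this sketch, not part of the patch.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def try_run(make_cmd, option_ok, otherwise):
    """Run a command; return option_ok on exit status 0, else otherwise.

    make_cmd receives the path of a scratch output file and returns the
    argv list to execute. The scratch directory is removed afterwards,
    much like the trap-based cleanup in the Make macro.
    """
    with tempfile.TemporaryDirectory(prefix=".tmp_") as tmpdir:
        tmp = str(Path(tmpdir) / "tmp")
        try:
            result = subprocess.run(make_cmd(tmp),
                                    stdout=subprocess.DEVNULL,
                                    stderr=subprocess.DEVNULL)
        except OSError:
            # The command could not be started at all: treat as failure.
            return otherwise
        return option_ok if result.returncode == 0 else otherwise


def cc_option(cc, flag, fallback=""):
    """Return flag if the compiler accepts it, otherwise fallback."""
    return try_run(
        lambda tmp: [cc, "-Werror", flag, "-c", "-x", "c", "/dev/null", "-o", tmp],
        flag,
        fallback,
    )


# Example (depends on the local toolchain, so no output is shown):
#   cflags = cc_option("cc", "-fstack-protector")
```

The point of the pattern is that the exit status of a throwaway compile decides which of two strings ends up in the build flags.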
Link: https://lore.kernel.org/lkml/86dadf33-70f7-a5ac-cb8c-64966d2f45a1@linux.ibm…
Fixes: d9f4ff50d2aa ("kbuild: spilt cc-option and friends to scripts/Makefile.compiler")
Reported-by: Janosch Frank <frankja(a)linux.ibm.com>
Reported-by: Christian Borntraeger <borntraeger(a)de.ibm.com>
Signed-off-by: Masahiro Yamada <masahiroy(a)kernel.org>
---
Changes in v2:
- copy macros to tools/build/Build.include
tools/build/Build.include | 24 +++++++++++++++++++
tools/testing/selftests/bpf/Makefile | 2 +-
tools/testing/selftests/kvm/Makefile | 2 +-
.../selftests/powerpc/pmu/ebb/Makefile | 2 +-
tools/thermal/tmon/Makefile | 2 +-
5 files changed, 28 insertions(+), 4 deletions(-)
diff --git a/tools/build/Build.include b/tools/build/Build.include
index 585486e40995..2cf3b1bde86e 100644
--- a/tools/build/Build.include
+++ b/tools/build/Build.include
@@ -100,3 +100,27 @@ cxx_flags = -Wp,-MD,$(depfile) -Wp,-MT,$@ $(CXXFLAGS) -D"BUILD_STR(s)=\#s" $(CXX
## HOSTCC C flags
host_c_flags = -Wp,-MD,$(depfile) -Wp,-MT,$@ $(KBUILD_HOSTCFLAGS) -D"BUILD_STR(s)=\#s" $(HOSTCFLAGS_$(basetarget).o) $(HOSTCFLAGS_$(obj))
+
+# output directory for tests below
+TMPOUT = .tmp_$$$$
+
+# try-run
+# Usage: option = $(call try-run, $(CC)...-o "$$TMP",option-ok,otherwise)
+# Exit code chooses option. "$$TMP" serves as a temporary file and is
+# automatically cleaned up.
+try-run = $(shell set -e; \
+ TMP=$(TMPOUT)/tmp; \
+ mkdir -p $(TMPOUT); \
+ trap "rm -rf $(TMPOUT)" EXIT; \
+ if ($(1)) >/dev/null 2>&1; \
+ then echo "$(2)"; \
+ else echo "$(3)"; \
+ fi)
+
+# cc-option
+# Usage: cflags-y += $(call cc-option,-march=winchip-c6,-march=i586)
+cc-option = $(call try-run, \
+ $(CC) -Werror $(1) -c -x c /dev/null -o "$$TMP",$(1),$(2))
+
+# delete partially updated (i.e. corrupted) files on error
+.DELETE_ON_ERROR:
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 044bfdcf5b74..17a5cdf48d37 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -1,5 +1,5 @@
# SPDX-License-Identifier: GPL-2.0
-include ../../../../scripts/Kbuild.include
+include ../../../build/Build.include
include ../../../scripts/Makefile.arch
include ../../../scripts/Makefile.include
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index a6d61f451f88..5ef141f265bd 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -1,5 +1,5 @@
# SPDX-License-Identifier: GPL-2.0-only
-include ../../../../scripts/Kbuild.include
+include ../../../build/Build.include
all:
diff --git a/tools/testing/selftests/powerpc/pmu/ebb/Makefile b/tools/testing/selftests/powerpc/pmu/ebb/Makefile
index af3df79d8163..c5ecb4634094 100644
--- a/tools/testing/selftests/powerpc/pmu/ebb/Makefile
+++ b/tools/testing/selftests/powerpc/pmu/ebb/Makefile
@@ -1,5 +1,5 @@
# SPDX-License-Identifier: GPL-2.0
-include ../../../../../../scripts/Kbuild.include
+include ../../../../../build/Build.include
noarg:
$(MAKE) -C ../../
diff --git a/tools/thermal/tmon/Makefile b/tools/thermal/tmon/Makefile
index 59e417ec3e13..9db867df7679 100644
--- a/tools/thermal/tmon/Makefile
+++ b/tools/thermal/tmon/Makefile
@@ -1,6 +1,6 @@
# SPDX-License-Identifier: GPL-2.0
# We need this for the "cc-option" macro.
-include ../../../scripts/Kbuild.include
+include ../../build/Build.include
VERSION = 1.0
--
2.27.0
The kunit_tool documentation page was pretty minimal, and a bit
outdated. Update it and flesh it out a bit.
In particular,
- Mention that .kunitconfig is now in the build directory
- Describe the use of --kunitconfig to specify a different config
fragment
- Mention the split functionality (i.e., commands other than 'run')
- Describe --raw_output and kunit.py parse
- Mention the globbing support
- Provide a quick overview of other options, including --build_dir and
--alltests
Note that this does overlap a little with the new running_tips page. I
don't think it's a problem having both: this page is supposed to be a
bit more of a reference, rather than a list of useful tips, so the fact
that they both describe the same features isn't a problem.
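To give an idea of what turning raw TAP output into a summary involves, here is a deliberately minimal sketch. It is not the parser used by kunit_tool: it only counts "ok" and "not ok" result lines, counts suite-level and test-level results alike, and assumes that any kernel log timestamps have already been stripped. The sample lines are invented for the example.

```python
import re

# Matches TAP result lines such as "ok 1 - name" or "    not ok 2 - name".
RESULT_LINE = re.compile(r"^\s*(ok|not ok) \d+\b")


def summarize(lines):
    """Return (passed, failed) counts over all TAP result lines."""
    passed = failed = 0
    for line in lines:
        match = RESULT_LINE.match(line)
        if not match:
            continue  # plan lines, diagnostics, unrelated kernel log output
        if match.group(1) == "ok":
            passed += 1
        else:
            failed += 1
    return passed, failed


sample = [
    "TAP version 14",
    "1..1",
    "    # Subtest: example",
    "    1..2",
    "    ok 1 - example_simple_test",
    "    not ok 2 - example_failing_test",
    "not ok 1 - example",
    "some unrelated kernel log line",
]
print(summarize(sample))
```

Lines that are not result lines are simply ignored, which is why raw output containing other kernel messages can still be summarized.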
Signed-off-by: David Gow <davidgow(a)google.com>
Reviewed-by: Daniel Latypov <dlatypov(a)google.com>
---
Adopted the changes from Daniel.
Changes since v1:
https://lore.kernel.org/linux-kselftest/20210416034036.797727-1-davidgow@go…
- Mention that the default build directory is '.kunit' when discussing
'.kunitconfig' files.
- Reword the discussion of 'CONFIG_KUNIT_ALL_TESTS' under '--alltests'
Documentation/dev-tools/kunit/kunit-tool.rst | 140 +++++++++++++++++--
1 file changed, 132 insertions(+), 8 deletions(-)
diff --git a/Documentation/dev-tools/kunit/kunit-tool.rst b/Documentation/dev-tools/kunit/kunit-tool.rst
index 29ae2fee8123..4247b7420e3b 100644
--- a/Documentation/dev-tools/kunit/kunit-tool.rst
+++ b/Documentation/dev-tools/kunit/kunit-tool.rst
@@ -22,14 +22,19 @@ not require any virtualization support: it is just a regular program.
What is a .kunitconfig?
=======================
-It's just a defconfig that kunit_tool looks for in the base directory.
-kunit_tool uses it to generate a .config as you might expect. In addition, it
-verifies that the generated .config contains the CONFIG options in the
-.kunitconfig; the reason it does this is so that it is easy to be sure that a
-CONFIG that enables a test actually ends up in the .config.
+It's just a defconfig that kunit_tool looks for in the build directory
+(``.kunit`` by default). kunit_tool uses it to generate a .config as you might
+expect. In addition, it verifies that the generated .config contains the CONFIG
+options in the .kunitconfig; the reason it does this is so that it is easy to
+be sure that a CONFIG that enables a test actually ends up in the .config.
-How do I use kunit_tool?
-========================
+It's also possible to pass a separate .kunitconfig fragment to kunit_tool,
+which is useful if you have several different groups of tests you wish
+to run independently, or if you want to use pre-defined test configs for
+certain subsystems.
+
+Getting Started with kunit_tool
+===============================
If a kunitconfig is present at the root directory, all you have to do is:
@@ -48,10 +53,129 @@ However, you most likely want to use it with the following options:
.. note::
This command will work even without a .kunitconfig file: if no
- .kunitconfig is present, a default one will be used instead.
+ .kunitconfig is present, a default one will be used instead.
+
+If you wish to use a different .kunitconfig file (such as one provided for
+testing a particular subsystem), you can pass it as an option.
+
+.. code-block:: bash
+
+ ./tools/testing/kunit/kunit.py run --kunitconfig=fs/ext4/.kunitconfig
For a list of all the flags supported by kunit_tool, you can run:
.. code-block:: bash
./tools/testing/kunit/kunit.py run --help
+
+Configuring, Building, and Running Tests
+========================================
+
+It's also possible to run just parts of the KUnit build process independently,
+which is useful if you want to make manual changes to part of the process.
+
+A .config can be generated from a .kunitconfig by using the ``config`` argument
+when running kunit_tool:
+
+.. code-block:: bash
+
+ ./tools/testing/kunit/kunit.py config
+
+Similarly, if you just want to build a KUnit kernel from the current .config,
+you can use the ``build`` argument:
+
+.. code-block:: bash
+
+ ./tools/testing/kunit/kunit.py build
+
+And, if you already have a built UML kernel with built-in KUnit tests, you can
+run the kernel and display the test results with the ``exec`` argument:
+
+.. code-block:: bash
+
+ ./tools/testing/kunit/kunit.py exec
+
+The ``run`` command which is discussed above is equivalent to running all three
+of these in sequence.
+
+All of these commands accept a number of optional command-line arguments. The
+``--help`` flag will give a complete list of these, or keep reading this page
+for a guide to some of the more useful ones.
+
+Parsing Test Results
+====================
+
+KUnit tests output their results in TAP (Test Anything Protocol) format.
+kunit_tool will, when running tests, parse this output and print a summary
+which is much more pleasant to read. If you wish to look at the raw test
+results in TAP format, you can pass the ``--raw_output`` argument.
+
+.. code-block:: bash
+
+ ./tools/testing/kunit/kunit.py run --raw_output
+
+.. note::
+ The raw output from test runs may contain other, non-KUnit kernel log
+ lines.
+
+If you have KUnit results in their raw TAP format, you can parse them and print
+the human-readable summary with the ``parse`` command for kunit_tool. This
+accepts a filename for an argument, or will read from standard input.
+
+.. code-block:: bash
+
+ # Reading from a file
+ ./tools/testing/kunit/kunit.py parse /var/log/dmesg
+ # Reading from stdin
+ dmesg | ./tools/testing/kunit/kunit.py parse
+
+This is very useful if you wish to run tests in a configuration not supported
+by kunit_tool (such as on real hardware, or an unsupported architecture).
+
+Filtering Tests
+===============
+
+It's possible to run only a subset of the tests built into a kernel by passing
+a filter to the ``exec`` or ``run`` commands. For example, if you only wanted
+to run KUnit resource tests, you could use:
+
+.. code-block:: bash
+
+ ./tools/testing/kunit/kunit.py run 'kunit-resource*'
+
+This uses the standard glob format for wildcards.
+
+Other Useful Options
+====================
+
+kunit_tool has a number of other command-line arguments which can be useful
+when adapting it to fit your environment or needs.
+
+Some of the more useful ones are:
+
+``--help``
+ Lists all of the available options. Note that different commands
+ (``config``, ``build``, ``run``, etc) will have different supported
+ options. Place ``--help`` before the command to list common options,
+ and after the command for options specific to that command.
+
+``--build_dir``
+ Specifies the build directory that kunit_tool will use. This is where
+ the .kunitconfig file is located, as well as where the .config and
+ compiled kernel will be placed. Defaults to ``.kunit``.
+
+``--make_options``
+ Specifies additional options to pass to ``make`` when compiling a
+ kernel (with the ``build`` or ``run`` commands). For example, to enable
+ compiler warnings, you can pass ``--make_options W=1``.
+
+``--alltests``
+ Builds a UML kernel with all config options enabled using ``make
+ allyesconfig``. This allows you to run as many tests as is possible,
+ but is very slow and prone to breakage as new options are added or
+ modified. In most cases, enabling all tests which have satisfied
+ dependencies by adding ``CONFIG_KUNIT_ALL_TESTS=y`` to your
+ .kunitconfig is preferable.
+
+There are several other options (and new ones are often added), so do check
+``--help`` if you're looking for something not mentioned here.
--
2.31.1.368.gbe11c130af-goog
The kunit_tool documentation page was pretty minimal, and a bit
outdated. Update it and flesh it out a bit.
In particular,
- Mention that .kunitconfig is now in the build directory
- Describe the use of --kunitconfig to specify a different config
fragment
- Mention the split functionality (i.e., commands other than 'run')
- Describe --raw_output and kunit.py parse
- Mention the globbing support
- Provide a quick overview of other options, including --build_dir and
--alltests
Note that this does overlap a little with the new running_tips page. I
don't think it's a problem having both: this page is supposed to be a
bit more of a reference, rather than a list of useful tips, so the fact
that they both describe the same features isn't a problem.
Signed-off-by: David Gow <davidgow(a)google.com>
---
Documentation/dev-tools/kunit/kunit-tool.rst | 132 ++++++++++++++++++-
1 file changed, 128 insertions(+), 4 deletions(-)
diff --git a/Documentation/dev-tools/kunit/kunit-tool.rst b/Documentation/dev-tools/kunit/kunit-tool.rst
index 29ae2fee8123..0b45affcd65c 100644
--- a/Documentation/dev-tools/kunit/kunit-tool.rst
+++ b/Documentation/dev-tools/kunit/kunit-tool.rst
@@ -22,14 +22,19 @@ not require any virtualization support: it is just a regular program.
What is a .kunitconfig?
=======================
-It's just a defconfig that kunit_tool looks for in the base directory.
+It's just a defconfig that kunit_tool looks for in the build directory.
kunit_tool uses it to generate a .config as you might expect. In addition, it
verifies that the generated .config contains the CONFIG options in the
.kunitconfig; the reason it does this is so that it is easy to be sure that a
CONFIG that enables a test actually ends up in the .config.
-How do I use kunit_tool?
-========================
+It's also possible to pass a separate .kunitconfig fragment to kunit_tool,
+which is useful if you have several different groups of tests you wish
+to run independently, or if you want to use pre-defined test configs for
+certain subsystems.
+
+Getting Started with kunit_tool
+===============================
If a kunitconfig is present at the root directory, all you have to do is:
@@ -48,10 +53,129 @@ However, you most likely want to use it with the following options:
.. note::
This command will work even without a .kunitconfig file: if no
- .kunitconfig is present, a default one will be used instead.
+ .kunitconfig is present, a default one will be used instead.
+
+If you wish to use a different .kunitconfig file (such as one provided for
+testing a particular subsystem), you can pass it as an option.
+
+.. code-block:: bash
+
+ ./tools/testing/kunit/kunit.py run --kunitconfig=fs/ext4/.kunitconfig
For a list of all the flags supported by kunit_tool, you can run:
.. code-block:: bash
./tools/testing/kunit/kunit.py run --help
+
+Configuring, Building, and Running Tests
+========================================
+
+It's also possible to run just parts of the KUnit build process independently,
+which is useful if you want to make manual changes to part of the process.
+
+A .config can be generated from a .kunitconfig by using the ``config`` argument
+when running kunit_tool:
+
+.. code-block:: bash
+
+ ./tools/testing/kunit/kunit.py config
+
+Similarly, if you just want to build a KUnit kernel from the current .config,
+you can use the ``build`` argument:
+
+.. code-block:: bash
+
+ ./tools/testing/kunit/kunit.py build
+
+And, if you already have a built UML kernel with built-in KUnit tests, you can
+run the kernel and display the test results with the ``exec`` argument:
+
+.. code-block:: bash
+
+ ./tools/testing/kunit/kunit.py exec
+
+The ``run`` command which is discussed above is equivalent to running all three
+of these in sequence.
+
+All of these commands accept a number of optional command-line arguments. The
+``--help`` flag will give a complete list of these, or keep reading this page
+for a guide to some of the more useful ones.
+
+Parsing Test Results
+====================
+
+KUnit tests output their results in TAP (Test Anything Protocol) format.
+kunit_tool will, when running tests, parse this output and print a summary
+which is much more pleasant to read. If you wish to look at the raw test
+results in TAP format, you can pass the ``--raw_output`` argument.
+
+.. code-block:: bash
+
+ ./tools/testing/kunit/kunit.py run --raw_output
+
+.. note::
+ The raw output from test runs may contain other, non-KUnit kernel log
+ lines.
+
+If you have KUnit results in their raw TAP format, you can parse them and print
+the human-readable summary with the ``parse`` command for kunit_tool. This
+accepts a filename for an argument, or will read from standard input.
+
+.. code-block:: bash
+
+ # Reading from a file
+ ./tools/testing/kunit/kunit.py parse /var/log/dmesg
+ # Reading from stdin
+ dmesg | ./tools/testing/kunit/kunit.py parse
+
+This is very useful if you wish to run tests in a configuration not supported
+by kunit_tool (such as on real hardware, or an unsupported architecture).
+
+Filtering Tests
+===============
+
+It's possible to run only a subset of the tests built into a kernel by passing
+a filter to the ``exec`` or ``run`` commands. For example, if you only wanted
+to run KUnit resource tests, you could use:
+
+.. code-block:: bash
+
+ ./tools/testing/kunit/kunit.py run 'kunit-resource*'
+
+This uses the standard glob format for wildcards.
+
+Other Useful Options
+====================
+
+kunit_tool has a number of other command-line arguments which can be useful
+when adapting it to fit your environment or needs.
+
+Some of the more useful ones are:
+
+``--help``
+ Lists all of the available options. Note that different commands
+ (``config``, ``build``, ``run``, etc) will have different supported
+ options. Place ``--help`` before the command to list common options,
+ and after the command for options specific to that command.
+
+``--build_dir``
+ Specifies the build directory that kunit_tool will use. This is where
+ the .kunitconfig file is located, as well as where the .config and
+ compiled kernel will be placed. Defaults to ``.kunit``.
+
+``--make_options``
+ Specifies additional options to pass to ``make`` when compiling a
+ kernel (with the ``build`` or ``run`` commands). For example, to enable
+ compiler warnings, you can pass ``--make_options W=1``.
+
+``--alltests``
+ Builds a UML kernel with all config options enabled using
+ ``make allyesconfig``. This allows you to run as many tests as is
+ possible, but is very slow and prone to breakage as new options are
+ added or modified. Most people should add ``CONFIG_KUNIT_ALL_TESTS=y``
+ to their .kunitconfig instead if they wish to run "all tests".
+
+
+There are several other options (and new ones are often added), so do check
+``--help`` if you're looking for something not mentioned here.
--
2.31.1.368.gbe11c130af-goog
From: Ira Weiny <ira.weiny(a)intel.com>
Introduce a new page protection mechanism for supervisor pages, Protection Key
Supervisor (PKS).
Generally PKS enables protections on 'domains' of supervisor pages to limit
supervisor mode access to pages beyond the normal paging protections. PKS
works in a similar fashion to user space pkeys, PKU. As with PKU, supervisor
pkeys are checked in addition to normal paging protections, and access or writes
can be disabled via an MSR update without TLB flushes when permissions change.
Also like PKU, a page mapping is assigned to a domain by setting pkey bits in
the page table entry for that mapping.
Access is controlled through a PKRS register which is updated via WRMSR/RDMSR.
XSAVE is not supported for the PKRS MSR. Therefore the implementation
saves/restores the MSR across context switches and during exceptions. Nested
exceptions are supported by each exception getting a new PKS state.
For consistent behavior with current paging protections, pkey 0 is reserved and
configured to allow full access via the pkey mechanism, thus preserving the
default paging protections on mappings with the default pkey value of 0.
Other keys (1-15) are allocated by an allocator which prepares us for key
contention from day one. Kernel users should be prepared for the allocator to
fail either because of key exhaustion or due to PKS not being supported on the
CPU instance.
The following are key attributes of PKS.
1) Fast switching of permissions
1a) Prevents access without page table manipulations
1b) No TLB flushes required
2) Works on a per thread basis
PKS is available with 4 and 5 level paging. Like PKRU it consumes 4 bits from
the PTE to store the pkey within the entry.
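The register and PTE layout can be illustrated with a little bit arithmetic. The sketch below assumes that PKRS mirrors the PKRU layout (two bits per key, access-disable in the low bit and write-disable in the high bit) and that the 4-bit pkey sits in PTE bits 62:59; both are stated here as assumptions about the x86 architecture rather than something this cover letter spells out, and the helper names are hypothetical.

```python
PKS_NUM_KEYS = 16       # 4 pkey bits in the PTE give 16 keys
PKR_BITS_PER_PKEY = 2   # assumed: two permission bits per key, as in PKRU
PKR_AD_BIT = 0x1        # access disable
PKR_WD_BIT = 0x2        # write disable
PTE_PKEY_SHIFT = 59     # assumed position of the pkey field in the PTE
PTE_PKEY_MASK = 0xF


def pkrs_update(pkrs: int, pkey: int, access_disable: bool, write_disable: bool) -> int:
    """Return a new 32-bit PKRS value with the permissions of one key replaced."""
    if not 0 <= pkey < PKS_NUM_KEYS:
        raise ValueError("pkey out of range")
    shift = pkey * PKR_BITS_PER_PKEY
    bits = (PKR_AD_BIT if access_disable else 0) | (PKR_WD_BIT if write_disable else 0)
    pkrs &= ~(0x3 << shift)                    # clear the old permission bits
    return (pkrs | (bits << shift)) & 0xFFFFFFFF


def pte_pkey(pte: int) -> int:
    """Extract the protection key stored in a page table entry."""
    return (pte >> PTE_PKEY_SHIFT) & PTE_PKEY_MASK


print(hex(pkrs_update(0, 1, access_disable=True, write_disable=False)))  # 0x4
print(pte_pkey(5 << PTE_PKEY_SHIFT))                                     # 5
```

Changing the permissions of a whole domain is therefore a single register write, which is why no page table manipulation or TLB flush is needed.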
All code to support PKS is configured via ARCH_ENABLE_SUPERVISOR_PKEYS, which
is designed to be turned on only when at least one PKS user is configured in
the kernel. Those users must depend on ARCH_HAS_SUPERVISOR_PKEYS to work
properly with architectures which do not yet support PKS.
Originally this series was submitted as part of a large patch set which
converted the kmap call sites.[1]
Many follow-on discussions revealed a few problems. The first was that
some callers leak a kmap mapping across threads rather than containing it
to a critical section. Attempts were made to see if these 'global kmaps' could
be supported.[2] However, supporting global kmaps had many problems. Work is
being done in parallel to convert as many kmap calls as possible to the new
kmap_local_page().[3]
Changes from V5 [6]
From Dave Hansen
Remove 'we' from comments
Changes from V4 [5]
From kernel test robot <lkp(a)intel.com>
Fix i386 build: pks_init_task not found
Move MSR_IA32_PKRS and INIT_PKRS_VALUE into patch 5 where they are
first 'used'. (Technically nothing is 'used' until the final
test patch. But review wise this is much cleaner.)
From Sean Christopherson
Add documentation details on what happens if the pkey is violated
Change cpu_feature_enabled to be in WARN_ON check
Clean up commit message of patch 6
[1] https://lore.kernel.org/lkml/20201009195033.3208459-1-ira.weiny@intel.com/
[2] https://lore.kernel.org/lkml/87mtycqcjf.fsf@nanos.tec.linutronix.de/
[3] https://lore.kernel.org/lkml/20210128061503.1496847-1-ira.weiny@intel.com/
    https://lore.kernel.org/lkml/20210210062221.3023586-1-ira.weiny@intel.com/
    https://lore.kernel.org/lkml/20210205170030.856723-1-ira.weiny@intel.com/
    https://lore.kernel.org/lkml/20210217024826.3466046-1-ira.weiny@intel.com/
[4] https://lore.kernel.org/lkml/20201106232908.364581-1-ira.weiny@intel.com/
[5] https://lore.kernel.org/lkml/20210322053020.2287058-1-ira.weiny@intel.com/
[6] https://lore.kernel.org/lkml/20210331191405.341999-1-ira.weiny@intel.com/
Fenghua Yu (1):
x86/pks: Add PKS kernel API
Ira Weiny (9):
x86/pkeys: Create pkeys_common.h
x86/fpu: Refactor arch_set_user_pkey_access() for PKS support
x86/pks: Add additional PKEY helper macros
x86/pks: Add PKS defines and Kconfig options
x86/pks: Add PKS setup code
x86/fault: Adjust WARN_ON for PKey fault
x86/pks: Preserve the PKRS MSR on context switch
x86/entry: Preserve PKRS MSR across exceptions
x86/pks: Add PKS test code
Documentation/core-api/protection-keys.rst | 112 +++-
arch/x86/Kconfig | 1 +
arch/x86/entry/calling.h | 26 +
arch/x86/entry/common.c | 57 ++
arch/x86/entry/entry_64.S | 22 +-
arch/x86/entry/entry_64_compat.S | 6 +-
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/disabled-features.h | 8 +-
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/include/asm/pgtable.h | 15 +-
arch/x86/include/asm/pgtable_types.h | 12 +
arch/x86/include/asm/pkeys.h | 4 +
arch/x86/include/asm/pkeys_common.h | 34 +
arch/x86/include/asm/pks.h | 54 ++
arch/x86/include/asm/processor-flags.h | 2 +
arch/x86/include/asm/processor.h | 47 +-
arch/x86/include/uapi/asm/processor-flags.h | 2 +
arch/x86/kernel/cpu/common.c | 2 +
arch/x86/kernel/fpu/xstate.c | 22 +-
arch/x86/kernel/head_64.S | 7 +-
arch/x86/kernel/process.c | 3 +
arch/x86/kernel/process_64.c | 2 +
arch/x86/mm/fault.c | 30 +-
arch/x86/mm/pkeys.c | 218 +++++-
include/linux/pgtable.h | 4 +
include/linux/pkeys.h | 34 +
kernel/entry/common.c | 14 +-
lib/Kconfig.debug | 11 +
lib/Makefile | 3 +
lib/pks/Makefile | 3 +
lib/pks/pks_test.c | 694 ++++++++++++++++++++
mm/Kconfig | 5 +
tools/testing/selftests/x86/Makefile | 3 +-
tools/testing/selftests/x86/test_pks.c | 149 +++++
34 files changed, 1527 insertions(+), 81 deletions(-)
create mode 100644 arch/x86/include/asm/pkeys_common.h
create mode 100644 arch/x86/include/asm/pks.h
create mode 100644 lib/pks/Makefile
create mode 100644 lib/pks/pks_test.c
create mode 100644 tools/testing/selftests/x86/test_pks.c
--
2.28.0.rc0.12.gb6a658bd00c9
This series adds support to libbpf for attaching SCHED_CLS and SCHED_ACT bpf
programs to their respective tc attach points.
Currently, a user needs to shell out to the tc command line for add, change,
replace, and del operations, which is not ideal.
Some of the features that have been omitted for the CLS API:
* TCA_BPF_POLICE
Support for adding police actions to filter has been omitted for now.
* TCA_RATE
Support for packet rate estimator has been omitted for now.
* Attaching actions directly to the classifier
This allows the attached actions to be bound to classifier and get auto detached
when it is deleted. It translates to 'bind' refcount in the kernel internally.
They run after a successful classification from the SCHED_CLS prog.
Support for this can be added later, but has been omitted for now, primarily
because direct-action mode provides a better alternative.
A high level TC-BPF API is also provided, and currently only supports attach and
destroy operations. These functions return a pointer to a bpf_link object. When
falling back to the low level API, the link must be disconnected to take over
its ownership. It can be released using bpf_link__destroy, which will also cause
the filter/action to be detached if not disconnected.
The individual commits contain a general API summary and examples.
Kumar Kartikeya Dwivedi (5):
tools pkt_cls.h: sync with kernel sources
libbpf: add helpers for preparing netlink attributes
libbpf: add low level TC-BPF API
libbpf: add high level TC-BPF API
libbpf: add selftests for TC-BPF API
tools/include/uapi/linux/pkt_cls.h | 174 +++-
tools/lib/bpf/libbpf.c | 110 ++-
tools/lib/bpf/libbpf.h | 133 ++++
tools/lib/bpf/libbpf.map | 17 +
tools/lib/bpf/netlink.c | 752 +++++++++++++++++-
tools/lib/bpf/nlattr.h | 43 +
.../selftests/bpf/prog_tests/test_tc_bpf.c | 261 ++++++
.../selftests/bpf/progs/test_tc_bpf_kern.c | 18 +
8 files changed, 1476 insertions(+), 32 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/test_tc_bpf.c
create mode 100644 tools/testing/selftests/bpf/progs/test_tc_bpf_kern.c
--
2.30.2
This is long overdue.
There are several things that aren't nailed down (in-tree
.kunitconfig's), or partially broken (GCOV on UML), but having them
documented, warts and all, is better than having nothing.
This covers a bunch of the more recent features
* kunit_filter_glob
* kunit.py run --kunitconfig
* slightly more detail on building tests as modules
* CONFIG_KUNIT_DEBUGFS
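For a rough feel of what filtering suites by glob means, here is a minimal
sketch that uses the shell's own pattern matching. The `filter_suites` helper
and the list of suite names are assumptions made for illustration; the kernel
does its own matching, and this is not how kunit.py implements the filter.

```shell
# filter_suites GLOB NAME...: print each NAME that matches GLOB.
# Hypothetical helper; shell `case` patterns stand in for the kernel's glob matching.
filter_suites() {
    glob=$1
    shift
    for suite in "$@"; do
        # $glob is deliberately left unquoted so it is treated as a pattern.
        case $suite in
            $glob) printf '%s\n' "$suite" ;;
        esac
    done
}

# Illustrative suite names; only the first one matches 'sysctl*'.
filter_suites 'sysctl*' sysctl_test example list-kunit-test
```

The pattern is quoted at the call site so that the shell passes it through
unexpanded, the same reason the glob is quoted when handing it to kunit.py.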
By my count, the only headline features now not mentioned are the KASAN
integration and KernelCI json output support (kunit.py run --json).
And then it also discusses how to get code coverage reports under UML
and non-UML since this is a question people have repeatedly asked.
Non-UML coverage collection is no different from normal, but we should
probably explicitly call this out.
As for UML, I was able to get it working again with two small hacks.*
E.g. with CONFIG_KUNIT=y && CONFIG_KUNIT_ALL_TESTS=y
Overall coverage rate:
lines......: 15.1% (18294 of 120776 lines)
functions..: 16.8% (1860 of 11050 functions)
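As a sanity check, the rates above can be recomputed from the raw counts. A
minimal sketch, assuming only a POSIX awk; the `pct` helper is a hypothetical
name used for illustration and is not part of lcov or kunit.py.

```shell
# pct HIT TOTAL: print HIT/TOTAL as a percentage with one decimal place.
# Hypothetical helper for illustration; not part of lcov or kunit.py.
pct() {
    awk -v hit="$1" -v total="$2" 'BEGIN { printf "%.1f%%\n", 100 * hit / total }'
}

pct 18294 120776   # line counts from the summary above
pct 1860 11050     # function counts from the summary above
```

The two calls print 15.1% and 16.8%, matching the lcov summary.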
Note: this doesn't document --alltests since this is not stable yet.
Hopefully being run more frequently as part of KernelCI will help...
*Using gcc/gcov-6 and not using uml_abort() in os_dump_core().
I've documented these hacks in "Notes" but left TODOs for
brendanhiggins(a)google.com who tracked down the runtime issue in GCC.
To be clear: these are not issues specific to KUnit, but rather to UML.
Signed-off-by: Daniel Latypov <dlatypov(a)google.com>
Reviewed-by: David Gow <davidgow(a)google.com>
---
v3 -> v4:
* update instructions on how to remove uml_abort() call
v2 -> v3:
* Suggest --make_options=CC=/usr/bin/gcc-6 instead of manually editing
kunit_kernel.py
v1 -> v2:
Fix typos, drop --alltests, change wording on config fragments.
---
Documentation/dev-tools/kunit/index.rst | 1 +
.../dev-tools/kunit/running_tips.rst | 259 ++++++++++++++++++
Documentation/dev-tools/kunit/start.rst | 2 +
3 files changed, 262 insertions(+)
create mode 100644 Documentation/dev-tools/kunit/running_tips.rst
diff --git a/Documentation/dev-tools/kunit/index.rst b/Documentation/dev-tools/kunit/index.rst
index 848478838347..7f7cf8d2ab20 100644
--- a/Documentation/dev-tools/kunit/index.rst
+++ b/Documentation/dev-tools/kunit/index.rst
@@ -14,6 +14,7 @@ KUnit - Unit Testing for the Linux Kernel
style
faq
tips
+ running_tips
What is KUnit?
==============
diff --git a/Documentation/dev-tools/kunit/running_tips.rst b/Documentation/dev-tools/kunit/running_tips.rst
new file mode 100644
index 000000000000..7d99386cf94a
--- /dev/null
+++ b/Documentation/dev-tools/kunit/running_tips.rst
@@ -0,0 +1,259 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+============================
+Tips For Running KUnit Tests
+============================
+
+Using ``kunit.py run`` ("kunit tool")
+=====================================
+
+Running from any directory
+--------------------------
+
+It can be handy to create a bash function like:
+
+.. code-block:: bash
+
+ function run_kunit() {
+ ( cd "$(git rev-parse --show-toplevel)" && ./tools/testing/kunit/kunit.py run "$@" )
+ }
+
+.. note::
+ Early versions of ``kunit.py`` (before 5.6) didn't work unless run from
+ the kernel root, hence the use of a subshell and ``cd``.
+
+Running a subset of tests
+-------------------------
+
+``kunit.py run`` accepts an optional glob argument to filter tests. Currently
+this only matches against suite names, but this may change in the future.
+
+If we wanted to run the sysctl tests, we could do so via:
+
+.. code-block:: bash
+
+ $ echo -e 'CONFIG_KUNIT=y\nCONFIG_KUNIT_ALL_TESTS=y' > .kunit/.kunitconfig
+ $ ./tools/testing/kunit/kunit.py run 'sysctl*'
+
+We're paying the cost of building more tests than we need this way, but it's
+easier than fiddling with ``.kunitconfig`` files or commenting out
+``kunit_suite``'s.
+
+However, if we wanted to define a set of tests in a less ad hoc way, the next
+tip is useful.
+
+Defining a set of tests
+-----------------------
+
+``kunit.py run`` (along with ``build``, and ``config``) supports a
+``--kunitconfig`` flag. So if you have a set of tests that you want to run on a
+regular basis (especially if they have other dependencies), you can create a
+specific ``.kunitconfig`` for them.
+
+E.g. kunit has one for its tests:
+
+.. code-block:: bash
+
+ $ ./tools/testing/kunit/kunit.py run --kunitconfig=lib/kunit/.kunitconfig
+
+Alternatively, if you're following the convention of naming your
+file ``.kunitconfig``, you can just pass in the dir, e.g.
+
+.. code-block:: bash
+
+ $ ./tools/testing/kunit/kunit.py run --kunitconfig=lib/kunit
+
+.. note::
+ This is a relatively new feature (5.12+) so we don't have any
+ conventions yet about which files should be checked in versus just
+ kept around locally. It's up to you and your maintainer to decide if a
+ config is useful enough to submit (and therefore have to maintain).
+
+.. note::
+ Having ``.kunitconfig`` fragments in a parent and child directory is
+ iffy. There's discussion about adding an "import" statement in these
+ files to make it possible to have a top-level config run tests from all
+ child directories. But that would mean ``.kunitconfig`` files are no
+ longer just simple .config fragments.
+
+ One alternative would be to have kunit tool recursively combine configs
+ automagically, but tests could theoretically depend on incompatible
+ options, so handling that would be tricky.
+
+Generating code coverage reports under UML
+------------------------------------------
+
+.. note::
+ TODO(brendanhiggins(a)google.com): There are various issues with UML and
+ versions of gcc 7 and up. You're likely to run into missing ``.gcda``
+ files or compile errors. We know one `faulty GCC commit
+ <https://github.com/gcc-mirror/gcc/commit/8c9434c2f9358b8b8bad2c1990edf10a21…>`_
+ but not how we'd go about getting this fixed. The compile errors still
+ need some investigation.
+
+.. note::
+ TODO(brendanhiggins(a)google.com): for recent versions of Linux
+ (5.10-5.12, maybe earlier), there's a bug with gcov counters not being
+ flushed in UML. This translates to very low (<1%) reported coverage. This is
+ related to the above issue and can be worked around by replacing the
+ one call to ``uml_abort()`` (it's in ``os_dump_core()``) with a plain
+ ``exit()``.
+
+
+This is different from the "normal" way of getting coverage information that is
+documented in Documentation/dev-tools/gcov.rst.
+
+Instead of enabling ``CONFIG_GCOV_KERNEL=y``, we can set these options:
+
+.. code-block:: none
+
+ CONFIG_DEBUG_KERNEL=y
+ CONFIG_DEBUG_INFO=y
+ CONFIG_GCOV=y
+
+
+Putting it together into a copy-pastable sequence of commands:
+
+.. code-block:: bash
+
+ # Append coverage options to the current config
+ $ echo -e "CONFIG_DEBUG_KERNEL=y\nCONFIG_DEBUG_INFO=y\nCONFIG_GCOV=y" >> .kunit/.kunitconfig
+ $ ./tools/testing/kunit/kunit.py run
+ # Extract the coverage information from the build dir (.kunit/)
+ $ lcov -t "my_kunit_tests" -o coverage.info -c -d .kunit/
+
+ # From here on, it's the same process as with CONFIG_GCOV_KERNEL=y
+ # E.g. can generate an HTML report in a tmp dir like so:
+ $ genhtml -o /tmp/coverage_html coverage.info
+
+
+If your installed version of gcc doesn't work, you can tweak the steps:
+
+.. code-block:: bash
+
+ $ ./tools/testing/kunit/kunit.py run --make_options=CC=/usr/bin/gcc-6
+ $ lcov -t "my_kunit_tests" -o coverage.info -c -d .kunit/ --gcov-tool=/usr/bin/gcov-6
+
+
+Running tests manually
+======================
+
+Running tests without using ``kunit.py run`` is also an important use case.
+Currently it's your only option if you want to test on architectures other than
+UML.
+
+As running the tests under UML is fairly straightforward (configure and compile
+the kernel, run the ``./linux`` binary), this section will focus on testing
+non-UML architectures.
+
+
+Running built-in tests
+----------------------
+
+When setting tests to ``=y``, the tests will run as part of boot and print
+results to dmesg in TAP format. So you just need to add your tests to your
+``.config``, build and boot your kernel as normal.
+
+So if we compiled our kernel with:
+
+.. code-block:: none
+
+ CONFIG_KUNIT=y
+ CONFIG_KUNIT_EXAMPLE_TEST=y
+
+Then we'd see output like this in dmesg signaling the test ran and passed:
+
+.. code-block:: none
+
+ TAP version 14
+ 1..1
+ # Subtest: example
+ 1..1
+ # example_simple_test: initializing
+ ok 1 - example_simple_test
+ ok 1 - example
+
+Running tests as modules
+------------------------
+
+Depending on the tests, you can build them as loadable modules.
+
+For example, we'd change the config options from before to
+
+.. code-block:: none
+
+ CONFIG_KUNIT=y
+ CONFIG_KUNIT_EXAMPLE_TEST=m
+
+Then after booting into our kernel, we can run the test via
+
+.. code-block:: none
+
+ $ modprobe kunit-example-test
+
+This will then cause it to print TAP output to stdout.
+
+.. note::
+ The ``modprobe`` will *not* have a non-zero exit code if any test
+ failed (as of 5.13). But ``kunit.py parse`` would, see below.
+
+.. note::
+ You can set ``CONFIG_KUNIT=m`` as well, however, some features will not
+ work and thus some tests might break. Ideally tests would specify they
+ depend on ``KUNIT=y`` in their ``Kconfig``'s, but this is an edge case
+ most test authors won't think about.
+ As of 5.13, the only difference is that ``current->kunit_test`` will
+ not exist.
+
+Pretty-printing results
+-----------------------
+
+You can use ``kunit.py parse`` to parse dmesg for test output and print out
+results in the same familiar format that ``kunit.py run`` does.
+
+.. code-block:: bash
+
+ $ ./tools/testing/kunit/kunit.py parse /var/log/dmesg
+
+
+Retrieving per suite results
+----------------------------
+
+Regardless of how you're running your tests, you can enable
+``CONFIG_KUNIT_DEBUGFS`` to expose per-suite TAP-formatted results:
+
+.. code-block:: none
+
+ CONFIG_KUNIT=y
+ CONFIG_KUNIT_EXAMPLE_TEST=m
+ CONFIG_KUNIT_DEBUGFS=y
+
+The results for each suite will be exposed under
+``/sys/kernel/debug/kunit/<suite>/results``.
+So using our example config:
+
+.. code-block:: bash
+
+ $ modprobe kunit-example-test > /dev/null
+ $ cat /sys/kernel/debug/kunit/example/results
+ ... <TAP output> ...
+
+ # After removing the module, the corresponding files will go away
+ $ modprobe -r kunit-example-test
+ $ cat /sys/kernel/debug/kunit/example/results
+ /sys/kernel/debug/kunit/example/results: No such file or directory
+
+Generating code coverage reports
+--------------------------------
+
+See Documentation/dev-tools/gcov.rst for details on how to do this.
+
+The only vaguely KUnit-specific advice here is that you probably want to build
+your tests as modules. That way you can isolate the coverage from tests from
+other code executed during boot, e.g.
+
+.. code-block:: bash
+
+ # Reset coverage counters before running the test.
+ $ echo 0 > /sys/kernel/debug/gcov/reset
+ $ modprobe kunit-example-test
diff --git a/Documentation/dev-tools/kunit/start.rst b/Documentation/dev-tools/kunit/start.rst
index 0e65cabe08eb..aa56d7ca6bfb 100644
--- a/Documentation/dev-tools/kunit/start.rst
+++ b/Documentation/dev-tools/kunit/start.rst
@@ -236,5 +236,7 @@ Next Steps
==========
* Check out the :doc:`tips` page for tips on
writing idiomatic KUnit tests.
+* Check out the :doc:`running_tips` page for tips on
+ how to make running KUnit tests easier.
* Optional: see the :doc:`usage` page for a more
in-depth explanation of KUnit.
base-commit: de2fcb3e62013738f22bbb42cbd757d9a242574e
--
2.31.1.295.g9ea45b61b8-goog
Base
====
Since the original series [1] was merged into Andrew's tree, some issues were
noticed. Up to this point, we had been working on fixing what's in Andrew's
tree [2], but at this point we've changed direction enough that a lot of the
fix's delta is undoing what was done in the original series, thereby making it
hard to review.
As suggested by Hugh Dickins and Peter Xu, this series takes a step back. It can
be considered a v3 of the original series [1] - it combines those patches with
the fixes, reordered / broken up to allow for easier review.
The idea is that it will apply cleanly to akpm's tree, *replacing* the following
patches (i.e., drop these first, and then apply this series):
userfaultfd-support-minor-fault-handling-for-shmem.patch
userfaultfd-support-minor-fault-handling-for-shmem-fix.patch
userfaultfd-support-minor-fault-handling-for-shmem-fix-2.patch
userfaultfd-support-minor-fault-handling-for-shmem-fix-3.patch
userfaultfd-support-minor-fault-handling-for-shmem-fix-4.patch
userfaultfd-selftests-use-memfd_create-for-shmem-test-type.patch
userfaultfd-selftests-create-alias-mappings-in-the-shmem-test.patch
userfaultfd-selftests-reinitialize-test-context-in-each-test.patch
userfaultfd-selftests-exercise-minor-fault-handling-shmem-support.patch
Changelog
=========
Changes since the most recent fixup patch [2]:
- Squash the fixes ([2]) in with the original series ([1]). This makes reviewing
easier, as we no longer have to sift through deltas undoing what we had done
before. [Hugh, Peter]
- Modify shmem_mcopy_atomic_pte() to use the new mcopy_atomic_install_ptes()
helper, reducing code duplication. [Hugh]
- Properly trigger handle_userfault() in the shmem_swapin_page() case. [Hugh]
- Use shmem_getpage() instead of find_lock_page() to look up the existing page
for UFFDIO_CONTINUE. This properly deals with swapped-out pages. [Hugh]
- Unconditionally pte_mkdirty() for anon memory (as before). [Peter]
- Don't include userfaultfd_k.h in either hugetlb.h or shmem_fs.h. [Hugh]
- Add comment for UFFD_FEATURE_MINOR_SHMEM (to match _HUGETLBFS). [Hugh]
- Fix some small cleanup issues (parens, reworded conditionals, reduced plumbing
of some parameters, simplify labels/gotos, ...). [Hugh, Peter]
Overview
========
See the series which added minor faults for hugetlbfs [3] for a detailed
overview of minor fault handling in general. This series adds the same support
for shmem-backed areas.
This series is structured as follows:
- Commits 1 and 2 are cleanups.
- Commits 3 and 4 implement the new feature (minor fault handling for shmem).
- Commits 5, 6, 7, 8 update the userfaultfd selftest to exercise the feature.
- Commit 9 is one final cleanup, modifying an existing code path to re-use a new
helper we've introduced. We rely on the selftest to show that this change
doesn't break anything.
Use Case
========
In some cases it is useful to have VM memory backed by tmpfs instead of
hugetlbfs. So, this feature will be used to support the same VM live migration
use case described in my original series.
Additionally, Android folks (Lokesh Gidra <lokeshgidra(a)google.com>) hope to
optimize the Android Runtime garbage collector using this feature:
"The plan is to use userfaultfd for concurrently compacting the heap. With
this feature, the heap can be shared-mapped at another location where the
GC-thread(s) could continue the compaction operation without the need to
invoke userfault ioctl(UFFDIO_COPY) each time. OTOH, if and when Java threads
get faults on the heap, UFFDIO_CONTINUE can be used to resume execution.
Furthermore, this feature enables updating references in the 'non-moving'
portion of the heap efficiently. Without this feature, unnecessary page
copying (ioctl(UFFDIO_COPY)) would be required."
[1] https://lore.kernel.org/patchwork/cover/1388144/
[2] https://lore.kernel.org/patchwork/patch/1408161/
[3] https://lore.kernel.org/linux-fsdevel/20210301222728.176417-1-axelrasmussen…
Axel Rasmussen (9):
userfaultfd/hugetlbfs: avoid including userfaultfd_k.h in hugetlb.h
userfaultfd/shmem: combine shmem_{mcopy_atomic,mfill_zeropage}_pte
userfaultfd/shmem: support minor fault registration for shmem
userfaultfd/shmem: support UFFDIO_CONTINUE for shmem
userfaultfd/selftests: use memfd_create for shmem test type
userfaultfd/selftests: create alias mappings in the shmem test
userfaultfd/selftests: reinitialize test context in each test
userfaultfd/selftests: exercise minor fault handling shmem support
userfaultfd/shmem: modify shmem_mcopy_atomic_pte to use install_ptes
fs/userfaultfd.c | 6 +-
include/linux/hugetlb.h | 5 +-
include/linux/shmem_fs.h | 15 +-
include/linux/userfaultfd_k.h | 5 +
include/uapi/linux/userfaultfd.h | 7 +-
mm/hugetlb.c | 1 +
mm/memory.c | 8 +-
mm/shmem.c | 122 ++++------
mm/userfaultfd.c | 183 ++++++++++-----
tools/testing/selftests/vm/userfaultfd.c | 280 +++++++++++++++--------
10 files changed, 387 insertions(+), 245 deletions(-)
--
2.31.1.295.g9ea45b61b8-goog
The rp_filter testcase checks whether local packets redirected from
dummy1 to lo pass the rp_filter check.
In fact, the packets do pass the check, but the test process cannot
receive any reply packets, so the test fails. The reason is that
dummy1 has no IP address, which causes the reply packets to be routed
incorrectly.
This patch adds an IP address to the dummy1 device.
Signed-off-by: Qiao Ma <mqaio(a)linux.alibaba.com>
---
tools/testing/selftests/net/fib_tests.sh | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/net/fib_tests.sh b/tools/testing/selftests/net/fib_tests.sh
index 2b5707738609..9a843ca0b913 100755
--- a/tools/testing/selftests/net/fib_tests.sh
+++ b/tools/testing/selftests/net/fib_tests.sh
@@ -448,6 +448,7 @@ fib_rp_filter_test()
$IP link set dummy0 address 52:54:00:6a:c7:5e
$IP link add dummy1 type dummy
$IP link set dummy1 address 52:54:00:6a:c7:5e
+ $IP address add 198.51.101.1/24 dev dummy1
$IP link set dev dummy1 up
$NS_EXEC sysctl -qw net.ipv4.conf.all.rp_filter=1
$NS_EXEC sysctl -qw net.ipv4.conf.all.accept_local=1
--
2.18.2
The kernel now has a number of testing and debugging tools, and we've
seen a bit of confusion about what the differences between them are.
Add basic documentation outlining the testing tools, when to use each,
and how they interact.
This is a pretty quick overview rather than the idealised "kernel
testing guide" that'd probably be optimal, but given the number of times
questions like "When do you use KUnit and when do you use Kselftest?"
are being asked, it seemed worth at least having something. Hopefully
this can form the basis for more detailed documentation later.
Signed-off-by: David Gow <davidgow(a)google.com>
---
Thanks, everyone, for the comments on the doc. I've made a few of the
suggested changes. Please let me know what you think!
-- David
Changes since v1:
https://lore.kernel.org/linux-kselftest/20210410070529.4113432-1-davidgow@g…
- Note KUnit's speed and that one should provide selftests for syscalls
- Mention lockdep as a Dynamic Analysis Tool
- Refer to "Dynamic Analysis Tools" instead of "Sanitizers"
- A number of minor formatting tweaks and rewordings for clarity.
Not changed:
- I haven't included an exhaustive list of differences, advantages, etc,
between KUnit and kselftest: for now, the doc continues to focus on
the difference between 'in-kernel' and 'userspace' testing here.
- Similarly, I'm not linking out to docs defining and describing "Unit"
tests versus "End-to-end" tests. None of the existing documentation
elsewhere quite matches what we do in the kernel perfectly, so it
seems less confusing to focus on the 'in-kernel'/'userspace'
distinction, and leave other definitions as a passing mention for
those who are already familiar with the concepts.
- I haven't linked to any talk videos here: a few of them are linked on
(e.g.) the KUnit webpage, but I wanted to keep the Kernel documentation
more self-contained for now. No objection to adding them in a follow-up
patch if people feel strongly about it, though.
- The link from index.rst to this doc is unchanged. I personally think
that the link is prominent enough there: it's the first link, and
shows up a few times. One possibility if people disagreed would be to
merge this page with the index, but given not all dev-tools are going
to be testing-related, it seemed a bit arrogant. :-)
Documentation/dev-tools/index.rst | 3 +
Documentation/dev-tools/testing-overview.rst | 117 +++++++++++++++++++
2 files changed, 120 insertions(+)
create mode 100644 Documentation/dev-tools/testing-overview.rst
diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
index 1b1cf4f5c9d9..f590e5860794 100644
--- a/Documentation/dev-tools/index.rst
+++ b/Documentation/dev-tools/index.rst
@@ -7,6 +7,8 @@ be used to work on the kernel. For now, the documents have been pulled
together without any significant effort to integrate them into a coherent
whole; patches welcome!
+A brief overview of testing-specific tools can be found in :doc:`testing-overview`.
+
.. class:: toc-title
Table of contents
@@ -14,6 +16,7 @@ whole; patches welcome!
.. toctree::
:maxdepth: 2
+ testing-overview
coccinelle
sparse
kcov
diff --git a/Documentation/dev-tools/testing-overview.rst b/Documentation/dev-tools/testing-overview.rst
new file mode 100644
index 000000000000..ce36a8cdf6b5
--- /dev/null
+++ b/Documentation/dev-tools/testing-overview.rst
@@ -0,0 +1,117 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====================
+Kernel Testing Guide
+====================
+
+
+There are a number of different tools for testing the Linux kernel, so knowing
+when to use each of them can be a challenge. This document provides a rough
+overview of their differences, and how they fit together.
+
+
+Writing and Running Tests
+=========================
+
+The bulk of kernel tests are written using either the kselftest or KUnit
+frameworks. These both provide infrastructure to help make running tests and
+groups of tests easier, as well as providing helpers to aid in writing new
+tests.
+
+If you're looking to verify the behaviour of the Kernel — particularly specific
+parts of the kernel — then you'll want to use KUnit or kselftest.
+
+
+The Difference Between KUnit and kselftest
+------------------------------------------
+
+KUnit (Documentation/dev-tools/kunit/index.rst) is an entirely in-kernel system
+for "white box" testing: because test code is part of the kernel, it can access
+internal structures and functions which aren't exposed to userspace.
+
+KUnit tests therefore are best written against small, self-contained parts
+of the kernel, which can be tested in isolation. This aligns well with the
+concept of 'unit' testing.
+
+For example, a KUnit test might test an individual kernel function (or even a
+single codepath through a function, such as an error handling case), rather
+than a feature as a whole.
+
+This also makes KUnit tests very fast to build and run, allowing them to be
+run frequently as part of the development process.
+
+There is a KUnit test style guide which may give further pointers in
+Documentation/dev-tools/kunit/style.rst
+
+
+kselftest (Documentation/dev-tools/kselftest.rst), on the other hand, is
+largely implemented in userspace, and tests are normal userspace scripts or
+programs.
+
+This makes it easier to write more complicated tests, or tests which need to
+manipulate the overall system state more (e.g., spawning processes, etc.).
+However, it's not possible to call kernel functions directly from kselftest.
+This means that only kernel functionality which is exposed to userspace somehow
+(e.g. by a syscall, device, filesystem, etc.) can be tested with kselftest. To
+work around this, some tests include a companion kernel module which exposes
+more information or functionality. If a test runs mostly or entirely within the
+kernel, however, KUnit may be the more appropriate tool.
+
+kselftest is therefore well suited to tests of whole features, as these will
+expose an interface to userspace, which can be tested, but not implementation
+details. This aligns well with 'system' or 'end-to-end' testing.
+
+For example, all new system calls should be accompanied by kselftest tests.
+
+Code Coverage Tools
+===================
+
+The Linux Kernel supports two different code coverage measurement tools. These
+can be used to verify that a test is executing particular functions or lines
+of code. This is useful for determining how much of the kernel is being tested,
+and for finding corner-cases which are not covered by the appropriate test.
+
+:doc:`gcov` is GCC's coverage testing tool, which can be used with the kernel
+to get global or per-module coverage. Unlike KCOV, it does not record per-task
+coverage. Coverage data can be read from debugfs, and interpreted using the
+usual gcov tooling.
+
+:doc:`kcov` is a feature which can be built in to the kernel to allow
+capturing coverage on a per-task level. It's therefore useful for fuzzing and
+other situations where information about code executed during, for example, a
+single syscall is useful.
+
+
+Dynamic Analysis Tools
+======================
+
+The kernel also supports a number of dynamic analysis tools, which attempt to
+detect classes of issues when they occur in a running kernel. These typically
+look for undefined behaviour of some kind, such as invalid memory accesses,
+concurrency issues such as data races, or other undefined behaviour like
+integer overflows.
+
+Some of these tools are listed below:
+
+* kmemleak detects possible memory leaks. See
+ Documentation/dev-tools/kmemleak.rst
+* KASAN detects invalid memory accesses such as out-of-bounds and
+ use-after-free errors. See Documentation/dev-tools/kasan.rst
+* UBSAN detects behaviour that is undefined by the C standard, like integer
+ overflows. See Documentation/dev-tools/ubsan.rst
+* KCSAN detects data races. See Documentation/dev-tools/kcsan.rst
+* KFENCE is a low-overhead detector of memory issues, which is much faster than
+ KASAN and can be used in production. See Documentation/dev-tools/kfence.rst
+* lockdep is a locking correctness validator. See
+ Documentation/locking/lockdep-design.rst
+* There are several other pieces of debug instrumentation in the kernel, many
+ of which can be found in lib/Kconfig.debug
+
+These tools tend to test the kernel as a whole, and do not "pass" like
+kselftest or KUnit tests. They can be combined with KUnit or kselftest by
+running tests on a kernel with a sanitizer enabled: you can then be sure
+that none of these errors are occurring during the test.
+
+Some of these tools integrate with KUnit or kselftest and will
+automatically fail tests if an issue is detected.
+
--
2.31.1.295.g9ea45b61b8-goog
This is long overdue.
There are several things that aren't nailed down (in-tree
.kunitconfig's), or partially broken (GCOV on UML), but having them
documented, warts and all, is better than having nothing.
This covers a bunch of the more recent features
* kunit_filter_glob
* kunit.py run --kunitconfig
* slightly more detail on building tests as modules
* CONFIG_KUNIT_DEBUGFS
By my count, the only headline features now not mentioned are the KASAN
integration and KernelCI json output support (kunit.py run --json).
And then it also discusses how to get code coverage reports under UML
and non-UML since this is a question people have repeatedly asked.
Non-UML coverage collection is no different from normal, but we should
probably explicitly call this out.
As for UML, I was able to get it working again with two small hacks.*
E.g. with CONFIG_KUNIT=y && CONFIG_KUNIT_ALL_TESTS=y
Overall coverage rate:
lines......: 15.1% (18294 of 120776 lines)
functions..: 16.8% (1860 of 11050 functions)
Note: this doesn't document --alltests since this is not stable yet.
Hopefully being run more frequently as part of KernelCI will help...
*Using gcc/gcov-6 and not using uml_abort() in os_dump_core().
I've documented these hacks in "Notes" but left TODOs for
brendanhiggins(a)google.com who tracked down the runtime issue in GCC.
To be clear: these are not issues specific to KUnit, but rather to UML.
Signed-off-by: Daniel Latypov <dlatypov(a)google.com>
---
v2 -> v3:
* Suggest --make_options=CC=/usr/bin/gcc-6 instead of manually editing
kunit_kernel.py
* update instructions on how to remove uml_abort() call
v1 -> v2:
Fix typos, drop --alltests, change wording on config fragments.
---
Documentation/dev-tools/kunit/index.rst | 1 +
.../dev-tools/kunit/running_tips.rst | 258 ++++++++++++++++++
Documentation/dev-tools/kunit/start.rst | 2 +
3 files changed, 261 insertions(+)
create mode 100644 Documentation/dev-tools/kunit/running_tips.rst
diff --git a/Documentation/dev-tools/kunit/index.rst b/Documentation/dev-tools/kunit/index.rst
index 848478838347..7f7cf8d2ab20 100644
--- a/Documentation/dev-tools/kunit/index.rst
+++ b/Documentation/dev-tools/kunit/index.rst
@@ -14,6 +14,7 @@ KUnit - Unit Testing for the Linux Kernel
style
faq
tips
+ running_tips
What is KUnit?
==============
diff --git a/Documentation/dev-tools/kunit/running_tips.rst b/Documentation/dev-tools/kunit/running_tips.rst
new file mode 100644
index 000000000000..e2e9af711d1b
--- /dev/null
+++ b/Documentation/dev-tools/kunit/running_tips.rst
@@ -0,0 +1,258 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+============================
+Tips For Running KUnit Tests
+============================
+
+Using ``kunit.py run`` ("kunit tool")
+=====================================
+
+Running from any directory
+--------------------------
+
+It can be handy to create a bash function like:
+
+.. code-block:: bash
+
+ function run_kunit() {
+ ( cd "$(git rev-parse --show-toplevel)" && ./tools/testing/kunit/kunit.py run "$@" )
+ }
+
+.. note::
+ Early versions of ``kunit.py`` (before 5.6) didn't work unless run from
+ the kernel root, hence the use of a subshell and ``cd``.
+
+Running a subset of tests
+-------------------------
+
+``kunit.py run`` accepts an optional glob argument to filter tests. Currently
+this only matches against suite names, but this may change in the future.
+
+If we wanted to run the sysctl tests, we could do so via:
+
+.. code-block:: bash
+
+ $ echo -e 'CONFIG_KUNIT=y\nCONFIG_KUNIT_ALL_TESTS=y' > .kunit/.kunitconfig
+ $ ./tools/testing/kunit/kunit.py run 'sysctl*'
+
+We're paying the cost of building more tests than we need this way, but it's
+easier than fiddling with ``.kunitconfig`` files or commenting out
+``kunit_suite``'s.
+
+However, if we wanted to define a set of tests in a less ad hoc way, the next
+tip is useful.
+
+Defining a set of tests
+-----------------------
+
+``kunit.py run`` (along with ``build``, and ``config``) supports a
+``--kunitconfig`` flag. So if you have a set of tests that you want to run on a
+regular basis (especially if they have other dependencies), you can create a
+specific ``.kunitconfig`` for them.
+
+E.g. kunit has one for its tests:
+
+.. code-block:: bash
+
+ $ ./tools/testing/kunit/kunit.py run --kunitconfig=lib/kunit/.kunitconfig
+
+Alternatively, if you're following the convention of naming your
+file ``.kunitconfig``, you can just pass in the dir, e.g.
+
+.. code-block:: bash
+
+ $ ./tools/testing/kunit/kunit.py run --kunitconfig=lib/kunit
+
+.. note::
+ This is a relatively new feature (5.12+) so we don't have any
+ conventions yet about what files should be checked in versus just
+ kept around locally. It's up to you and your maintainer to decide if a
+ config is useful enough to submit (and therefore have to maintain).
+
+.. note::
+ Having ``.kunitconfig`` fragments in a parent and child directory is
+ iffy. There's discussion about adding an "import" statement in these
+ files to make it possible to have a top-level config run tests from all
+ child directories. But that would mean ``.kunitconfig`` files are no
+ longer just simple .config fragments.
+
+ One alternative would be to have kunit tool recursively combine configs
+ automagically, but tests could theoretically depend on incompatible
+ options, so handling that would be tricky.
+
+Generating code coverage reports under UML
+------------------------------------------
+
+.. note::
+ TODO(brendanhiggins(a)google.com): There are various issues with UML and
+ versions of gcc 7 and up. You're likely to run into missing ``.gcda``
+ files or compile errors. We know one `faulty GCC commit
+ <https://github.com/gcc-mirror/gcc/commit/8c9434c2f9358b8b8bad2c1990edf10a21…>`_
+ but not how we'd go about getting this fixed. The compile errors still
+ need some investigation.
+
+.. note::
+ TODO(brendanhiggins(a)google.com): for recent versions of Linux
+ (5.10-5.12, maybe earlier), there's a bug with gcov counters not being
+ flushed in UML. This translates to very low (<1%) reported coverage. This is
+ related to the above issue and can be worked around by replacing the
+ one call to ``uml_abort()`` with a plain ``exit()``.
+
+
+This is different from the "normal" way of getting coverage information that is
+documented in Documentation/dev-tools/gcov.rst.
+
+Instead of enabling ``CONFIG_GCOV_KERNEL=y``, we can set these options:
+
+.. code-block:: none
+
+ CONFIG_DEBUG_KERNEL=y
+ CONFIG_DEBUG_INFO=y
+ CONFIG_GCOV=y
+
+
+Putting it together into a copy-pastable sequence of commands:
+
+.. code-block:: bash
+
+ # Append coverage options to the current config
+ $ echo -e "CONFIG_DEBUG_KERNEL=y\nCONFIG_DEBUG_INFO=y\nCONFIG_GCOV=y" >> .kunit/.kunitconfig
+ $ ./tools/testing/kunit/kunit.py run
+ # Extract the coverage information from the build dir (.kunit/)
+ $ lcov -t "my_kunit_tests" -o coverage.info -c -d .kunit/
+
+ # From here on, it's the same process as with CONFIG_GCOV_KERNEL=y
+ # E.g. can generate an HTML report in a tmp dir like so:
+ $ genhtml -o /tmp/coverage_html coverage.info
+
+
+If your installed version of gcc doesn't work, you can tweak the steps:
+
+.. code-block:: bash
+
+ $ ./tools/testing/kunit/kunit.py run --make_options=CC=/usr/bin/gcc-6
+ $ lcov -t "my_kunit_tests" -o coverage.info -c -d .kunit/ --gcov-tool=/usr/bin/gcov-6
+
+
+Running tests manually
+======================
+
+Running tests without using ``kunit.py run`` is also an important use case.
+Currently it's your only option if you want to test on architectures other than
+UML.
+
+As running the tests under UML is fairly straightforward (configure and compile
+the kernel, run the ``./linux`` binary), this section will focus on testing
+non-UML architectures.
+
+
+Running built-in tests
+----------------------
+
+When setting tests to ``=y``, the tests will run as part of boot and print
+results to dmesg in TAP format. So you just need to add your tests to your
+``.config``, build and boot your kernel as normal.
+
+So if we compiled our kernel with:
+
+.. code-block:: none
+
+ CONFIG_KUNIT=y
+ CONFIG_KUNIT_EXAMPLE_TEST=y
+
+Then we'd see output like this in dmesg signaling the test ran and passed:
+
+.. code-block:: none
+
+ TAP version 14
+ 1..1
+ # Subtest: example
+ 1..1
+ # example_simple_test: initializing
+ ok 1 - example_simple_test
+ ok 1 - example
+
+Running tests as modules
+------------------------
+
+Depending on the tests, you can build them as loadable modules.
+
+For example, we'd change the config options from before to
+
+.. code-block:: none
+
+ CONFIG_KUNIT=y
+ CONFIG_KUNIT_EXAMPLE_TEST=m
+
+Then after booting into our kernel, we can run the test via
+
+.. code-block:: none
+
+ $ modprobe kunit-example-test
+
+This will then cause it to print TAP output to stdout.
+
+.. note::
+ The ``modprobe`` will *not* have a non-zero exit code if any test
+ failed (as of 5.13). But ``kunit.py parse`` would, see below.
+
+.. note::
+ You can set ``CONFIG_KUNIT=m`` as well, however, some features will not
+ work and thus some tests might break. Ideally tests would specify they
+ depend on ``KUNIT=y`` in their ``Kconfig``'s, but this is an edge case
+ most test authors won't think about.
+ As of 5.13, the only difference is that ``current->kunit_test`` will
+ not exist.
+
+Pretty-printing results
+-----------------------
+
+You can use ``kunit.py parse`` to parse dmesg for test output and print out
+results in the same familiar format that ``kunit.py run`` does.
+
+.. code-block:: bash
+
+ $ ./tools/testing/kunit/kunit.py parse /var/log/dmesg
+
+
+Retrieving per suite results
+----------------------------
+
+Regardless of how you're running your tests, you can enable
+``CONFIG_KUNIT_DEBUGFS`` to expose per-suite TAP-formatted results:
+
+.. code-block:: none
+
+ CONFIG_KUNIT=y
+ CONFIG_KUNIT_EXAMPLE_TEST=m
+ CONFIG_KUNIT_DEBUGFS=y
+
+The results for each suite will be exposed under
+``/sys/kernel/debug/kunit/<suite>/results``.
+So using our example config:
+
+.. code-block:: bash
+
+ $ modprobe kunit-example-test > /dev/null
+ $ cat /sys/kernel/debug/kunit/example/results
+ ... <TAP output> ...
+
+ # After removing the module, the corresponding files will go away
+ $ modprobe -r kunit-example-test
+ $ cat /sys/kernel/debug/kunit/example/results
+ /sys/kernel/debug/kunit/example/results: No such file or directory
+
+Generating code coverage reports
+--------------------------------
+
+See Documentation/dev-tools/gcov.rst for details on how to do this.
+
+The only vaguely KUnit-specific advice here is that you probably want to build
+your tests as modules. That way you can isolate the coverage from tests from
+other code executed during boot, e.g.
+
+.. code-block:: bash
+
+ # Reset coverage counters before running the test.
+ $ echo 0 > /sys/kernel/debug/gcov/reset
+ $ modprobe kunit-example-test
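
Since ``modprobe`` does not report test failures through its exit code (see
the note under "Running tests as modules" above), a script that loads test
modules has to look at the TAP output itself. Below is a minimal sketch of
that check, assuming the TAP lines arrive on stdin. ``tap_has_failures`` is a
hypothetical helper, not part of kunit tool, and ``kunit.py parse`` remains
the proper way to interpret results.

```shell
# Hypothetical helper (not part of kunit tool).
# Exits 0 if stdin contains at least one failing TAP result line.
# A result line starts with "not ok", optionally preceded by indentation
# (nested subtests) and/or a dmesg-style timestamp such as "[   12.345678]".
tap_has_failures() {
    grep -Eq '^(\[[ 0-9.]+\])?[[:space:]]*not ok '
}
```

It could be used as ``dmesg | tap_has_failures && echo "KUnit failures found"``.
Diagnostic lines (those starting with ``#``) are deliberately not matched, so
a test that merely logs the words "not ok" is not counted as a failure.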
diff --git a/Documentation/dev-tools/kunit/start.rst b/Documentation/dev-tools/kunit/start.rst
index 0e65cabe08eb..aa56d7ca6bfb 100644
--- a/Documentation/dev-tools/kunit/start.rst
+++ b/Documentation/dev-tools/kunit/start.rst
@@ -236,5 +236,7 @@ Next Steps
==========
* Check out the :doc:`tips` page for tips on
writing idiomatic KUnit tests.
+* Check out the :doc:`running_tips` page for tips on
+ how to make running KUnit tests easier.
* Optional: see the :doc:`usage` page for a more
in-depth explanation of KUnit.
base-commit: de2fcb3e62013738f22bbb42cbd757d9a242574e
--
2.31.1.295.g9ea45b61b8-goog
This is long overdue.
There are several things that aren't nailed down (in-tree
.kunitconfig's), or partially broken (GCOV on UML), but having them
documented, warts and all, is better than having nothing.
This covers a bunch of the more recent features
* kunit_filter_glob
* kunit.py run --kunitconfig
* slightly more detail on building tests as modules
* CONFIG_KUNIT_DEBUGFS
By my count, the only headline features now not mentioned are the KASAN
integration and KernelCI json output support (kunit.py run --json).
And then it also discusses how to get code coverage reports under UML
and non-UML since this is a question people have repeatedly asked.
Non-UML coverage collection is no different from normal, but we should
probably explicitly call this out.
As for UML, I was able to get it working again with two small hacks.*
E.g. with CONFIG_KUNIT=y && CONFIG_KUNIT_ALL_TESTS=y
Overall coverage rate:
lines......: 15.1% (18294 of 120776 lines)
functions..: 16.8% (1860 of 11050 functions)
Note: this doesn't document --alltests since this is not stable yet.
Hopefully being run more frequently as part of KernelCI will help...
*Using gcc/gcov-6 and not using uml_abort() in os_dump_core().
I've documented these hacks in "Notes" but left TODOs for
brendanhiggins(a)google.com who tracked down the runtime issue in GCC.
To be clear: these are not issues specific to KUnit, but rather to UML.
Signed-off-by: Daniel Latypov <dlatypov(a)google.com>
---
Documentation/dev-tools/kunit/index.rst | 1 +
.../dev-tools/kunit/running_tips.rst | 260 ++++++++++++++++++
Documentation/dev-tools/kunit/start.rst | 2 +
3 files changed, 263 insertions(+)
create mode 100644 Documentation/dev-tools/kunit/running_tips.rst
diff --git a/Documentation/dev-tools/kunit/index.rst b/Documentation/dev-tools/kunit/index.rst
index 848478838347..7f7cf8d2ab20 100644
--- a/Documentation/dev-tools/kunit/index.rst
+++ b/Documentation/dev-tools/kunit/index.rst
@@ -14,6 +14,7 @@ KUnit - Unit Testing for the Linux Kernel
style
faq
tips
+ running_tips
What is KUnit?
==============
diff --git a/Documentation/dev-tools/kunit/running_tips.rst b/Documentation/dev-tools/kunit/running_tips.rst
new file mode 100644
index 000000000000..52cc62d1c83b
--- /dev/null
+++ b/Documentation/dev-tools/kunit/running_tips.rst
@@ -0,0 +1,260 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+============================
+Tips For Running KUnit Tests
+============================
+
+Using ``kunit.py run`` ("kunit tool")
+=====================================
+
+Running from any directory
+--------------------------
+
+It can be handy to create a bash function like:
+
+.. code-block:: bash
+
+ function run_kunit() {
+ ( cd "$(git rev-parse --show-toplevel)" && ./tools/testing/kunit/kunit.py run "$@" )
+ }
+
+.. note::
+ Early versions of ``kunit.py`` (before 5.6) didn't work unless run from
+ the kernel root, hence the use of a subshell and ``cd``.
+
+Running a subset of tests
+-------------------------
+
+``kunit.py run`` accepts an optional glob argument to filter tests. Currently
+this only matches against suite names, but this may change in the future.
+
+Say that we wanted to run the sysctl tests, we could do so via:
+
+.. code-block:: bash
+
+ $ echo -e 'CONFIG_KUNIT=y\nCONFIG_KUNIT_ALL_TESTS=y' > .kunit/.kunitconfig
+ $ ./tools/testing/kunit/kunit.py run 'sysctl*'
+
+We're paying the cost of building more tests than we need this way, but it's
+easier than fiddling with ``.kunitconfig`` files or commenting out
+``kunit_suite``'s.
+
+However, if we wanted to define a set of tests in a less ad hoc way, the next
+tip is useful.
+
+Defining a set of tests
+-----------------------
+
+``kunit.py run`` (along with ``build``, and ``config``) supports a
+``--kunitconfig`` flag. So if you have a set of tests that you want to run on a
+regular basis (especially if they have other dependencies), you can create a
+specific ``.kunitconfig`` for them.
+
+E.g. kunit has one for its tests:
+
+.. code-block:: bash
+
+ $ ./tools/testing/kunit/kunit.py run --kunitconfig=lib/kunit/.kunitconfig
+
+Alternatively, if you're following the convention of naming your
+file ``.kunitconfig``, you can just pass in the dir, e.g.
+
+.. code-block:: bash
+
+ $ ./tools/testing/kunit/kunit.py run --kunitconfig=lib/kunit
+
+.. note::
+ This is a relatively new feature (5.12+) so we don't have any
+ conventions yet about what files should be checked in versus just
+ kept around locally. It's up to you and your maintainer to decide if a
+ config is useful enough to submit (and therefore have to maintain).
+
+.. note::
+ Having ``.kunitconfig`` fragments in a parent and child directory is
+ iffy. There's discussion about adding an "import" statement in these
+ files to make it possible to have a top-level config run tests from all
+ child directories. But that would mean ``.kunitconfig`` files are no
+ longer just simple .config fragments.
+
+ One alternative would be to have kunit tool recursively combine configs
+ automagically, but tests could theoretically depend on incompatible
+ options, so handling that would be tricky.
+
+Generating code coverage reports under UML
+------------------------------------------
+
+.. note::
+ TODO(brendanhiggins(a)google.com): There are various issues with UML and
+ versions of gcc 7 and up. You're likely to run into missing ``.gcda``
+ files or compile errors. We know one `faulty GCC commit
+ <https://github.com/gcc-mirror/gcc/commit/8c9434c2f9358b8b8bad2c1990edf10a21…>`_
+ but not how we'd go about getting this fixed. The compile errors still
+ need some investigation.
+
+.. note::
+ TODO(brendanhiggins(a)google.com): for recent versions of Linux
+ (5.10-5.12, maybe earlier), there's a bug with gcov counters not being
+ flushed in UML. This translates to very low (<1%) reported coverage. This is
+ related to the above issue and can be worked around by replacing the
+ one call to ``uml_abort()`` with a plain ``exit()``.
+
+
+This is different from the "normal" way of getting coverage information that is
+documented in Documentation/dev-tools/gcov.rst.
+
+Instead of enabling ``CONFIG_GCOV_KERNEL=y``, we can set these options:
+
+.. code-block:: none
+
+ CONFIG_DEBUG_KERNEL=y
+ CONFIG_DEBUG_INFO=y
+ CONFIG_GCOV=y
+
+
+Putting it together into a copy-pastable sequence of commands:
+
+.. code-block:: bash
+
+ # Append coverage options to the current config
+ $ echo -e "CONFIG_DEBUG_KERNEL=y\nCONFIG_DEBUG_INFO=y\nCONFIG_GCOV=y" >> .kunit/.kunitconfig
+ $ ./tools/testing/kunit/kunit.py run
+ # Extract the coverage information from the build dir (.kunit/)
+ $ lcov -t "my_kunit_tests" -o coverage.info -c -d .kunit/
+
+ # From here on, it's the same process as with CONFIG_GCOV_KERNEL=y
+ # E.g. can generate an HTML report in a tmp dir like so:
+ $ genhtml -o /tmp/coverage_html coverage.info
+
+
+If your installed version of gcc doesn't work, you can tweak the steps:
+
+.. code-block:: bash
+
+ # need to edit tools/testing/kunit/kunit_kernel.py to call make with 'CC=/usr/bin/gcc-6'
+ $ $EDITOR tools/testing/kunit/kunit_kernel.py
+
+ $ lcov -t "my_kunit_tests" -o coverage.info -c -d .kunit/ --gcov-tool=/usr/bin/gcov-6
+
+
+Running tests manually
+======================
+
+Running tests without using ``kunit.py run`` is also an important use case.
+Currently it's your only option if you want to test on architectures other than
+UML.
+
+As running the tests under UML is fairly straightforward (configure and compile
+the kernel, run the ``./linux`` binary), this section will focus on testing
+non-UML architectures.
+
+
+Running built-in tests
+----------------------
+
+When setting tests to ``=y``, the tests will run as part of boot and print
+results to dmesg in TAP format. So you just need to add your tests to your
+``.config``, build and boot your kernel as normal.
+
+So if we compiled our kernel with:
+
+.. code-block:: none
+
+ CONFIG_KUNIT=y
+ CONFIG_KUNIT_EXAMPLE_TEST=y
+
+Then we'd see output like this in dmesg signaling the test ran and passed:
+
+.. code-block:: none
+
+ TAP version 14
+ 1..1
+ # Subtest: example
+ 1..1
+ # example_simple_test: initializing
+ ok 1 - example_simple_test
+ ok 1 - example
+
+Running tests as modules
+------------------------
+
+Depending on the tests, you can build them as loadable modules.
+
+For example, we'd change the config options from before to
+
+.. code-block:: none
+
+ CONFIG_KUNIT=y
+ CONFIG_KUNIT_EXAMPLE_TEST=m
+
+Then after booting into our kernel, we can run the test via
+
+.. code-block:: none
+
+ $ modprobe kunit-example-test
+
+This will then cause it to print TAP output to stdout.
+
+.. note::
+ The ``modprobe`` will *not* have a non-zero exit code if any test
+ failed (as of 5.13). But ``kunit.py parse`` would, see below.
+
+.. note::
+ You can set ``CONFIG_KUNIT=m`` as well, however, some features will not
+ work and thus some tests might break. Ideally tests would specify they
+ depend on ``KUNIT=y`` in their ``Kconfig``'s, but this is an edge case
+ most test authors won't think about.
+ As of 5.13, the only difference is that ``current->kunit_test`` will
+ not exist.
+
+Pretty-printing results
+-----------------------
+
+You can use ``kunit.py parse`` to parse dmesg for test output and print out
+results in the same familiar format that ``kunit.py run`` does.
+
+.. code-block:: bash
+
+ $ ./tools/testing/kunit/kunit.py parse /var/log/dmesg
+
+
+Retrieving per suite results
+----------------------------
+
+Regardless of how you're running your tests, you can enable
+``CONFIG_KUNIT_DEBUGFS`` to expose per-suite TAP-formatted results:
+
+.. code-block:: none
+
+ CONFIG_KUNIT=y
+ CONFIG_KUNIT_EXAMPLE_TEST=m
+ CONFIG_KUNIT_DEBUGFS=y
+
+The results for each suite will be exposed under
+``/sys/kernel/debug/kunit/<suite>/results``.
+So using our example config:
+
+.. code-block:: bash
+
+ $ modprobe kunit-example-test > /dev/null
+ $ cat /sys/kernel/debug/kunit/example/results
+ ... <TAP output> ...
+
+ # After removing the module, the corresponding files will go away
+ $ modprobe -r kunit-example-test
+ $ cat /sys/kernel/debug/kunit/example/results
+ /sys/kernel/debug/kunit/example/results: No such file or directory
+
+Generating code coverage reports
+--------------------------------
+
+See Documentation/dev-tools/gcov.rst for details on how to do this.
+
+The only vaguely KUnit-specific advice here is that you probably want to build
+your tests as modules. That way you can isolate the coverage from tests from
+other code executed during boot, e.g.
+
+.. code-block:: bash
+
+ # Reset coverage counters before running the test.
+ $ echo 0 > /sys/kernel/debug/gcov/reset
+ $ modprobe kunit-example-test
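
The "Defining a set of tests" section above says that ``--kunitconfig``
accepts either a file or a directory holding a file named ``.kunitconfig``.
The sketch below spells out that convention, e.g. for a wrapper script that
wants to check a fragment exists before starting a long build.
``resolve_kunitconfig`` is a hypothetical helper written for illustration, not
code taken from kunit tool.

```shell
# Hypothetical helper illustrating the convention described above:
# a directory argument means "<dir>/.kunitconfig"; anything else is
# treated as the path to the fragment itself.
resolve_kunitconfig() {
    if [ -d "$1" ]; then
        # Drop one trailing slash so we don't print "dir//.kunitconfig".
        printf '%s\n' "${1%/}/.kunitconfig"
    else
        printf '%s\n' "$1"
    fi
}
```

For example, ``[ -f "$(resolve_kunitconfig lib/kunit)" ] || echo "no fragment"``
would flag a missing fragment up front.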
diff --git a/Documentation/dev-tools/kunit/start.rst b/Documentation/dev-tools/kunit/start.rst
index 0e65cabe08eb..aa56d7ca6bfb 100644
--- a/Documentation/dev-tools/kunit/start.rst
+++ b/Documentation/dev-tools/kunit/start.rst
@@ -236,5 +236,7 @@ Next Steps
==========
* Check out the :doc:`tips` page for tips on
writing idiomatic KUnit tests.
+* Check out the :doc:`running_tips` page for tips on
+ how to make running KUnit tests easier.
* Optional: see the :doc:`usage` page for a more
in-depth explanation of KUnit.
base-commit: de2fcb3e62013738f22bbb42cbd757d9a242574e
--
2.31.1.295.g9ea45b61b8-goog
This is long overdue.
There are several things that aren't nailed down (in-tree
.kunitconfig's), or partially broken (GCOV on UML), but having them
documented, warts and all, is better than having nothing.
This covers a bunch of the more recent features
* kunit_filter_glob
* kunit.py run --kunitconfig
* kunit.py run --alltests
* slightly more detail on building tests as modules
* CONFIG_KUNIT_DEBUGFS
By my count, the only headline features now not mentioned are the KASAN
integration and KernelCI json output support (kunit.py run --json).
And then it also discusses how to get code coverage reports under UML
and non-UML since this is a question people have repeatedly asked.
Non-UML coverage collection is no different from normal, but we should
probably explicitly call this out.
As for UML, I was able to get it working again with two small hacks.*
E.g. with CONFIG_KUNIT=y && CONFIG_KUNIT_ALL_TESTS=y
Overall coverage rate:
lines......: 15.1% (18294 of 120776 lines)
functions..: 16.8% (1860 of 11050 functions)
*Switching to use gcc/gcov-6 and not using uml_abort().
I've documented these hacks in "Notes" but left TODOs for
brendanhiggins(a)google.com who tracked down the runtime issue in GCC.
To be clear: these are not issues specific to KUnit, but rather to UML.
Signed-off-by: Daniel Latypov <dlatypov(a)google.com>
---
Documentation/dev-tools/kunit/index.rst | 1 +
.../dev-tools/kunit/running_tips.rst | 278 ++++++++++++++++++
Documentation/dev-tools/kunit/start.rst | 2 +
3 files changed, 281 insertions(+)
create mode 100644 Documentation/dev-tools/kunit/running_tips.rst
diff --git a/Documentation/dev-tools/kunit/index.rst b/Documentation/dev-tools/kunit/index.rst
index 848478838347..7f7cf8d2ab20 100644
--- a/Documentation/dev-tools/kunit/index.rst
+++ b/Documentation/dev-tools/kunit/index.rst
@@ -14,6 +14,7 @@ KUnit - Unit Testing for the Linux Kernel
style
faq
tips
+ running_tips
What is KUnit?
==============
diff --git a/Documentation/dev-tools/kunit/running_tips.rst b/Documentation/dev-tools/kunit/running_tips.rst
new file mode 100644
index 000000000000..d38e665e530f
--- /dev/null
+++ b/Documentation/dev-tools/kunit/running_tips.rst
@@ -0,0 +1,278 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+============================
+Tips For Running KUnit Tests
+============================
+
+Using ``kunit.py run`` ("kunit tool")
+=====================================
+
+Running from any directory
+--------------------------
+
+It can be handy to create a bash function like:
+
+.. code-block:: bash
+
+ function run_kunit() {
+ ( cd "$(git rev-parse --show-toplevel)" && ./tools/testing/kunit/kunit.py run "$@" )
+ }
+
+.. note::
+ Early versions of ``kunit.py`` (before 5.6) didn't work unless run from
+ the kernel root, hence the use of a subshell and ``cd``.
+
+Running a subset of tests
+-------------------------
+
+``kunit.py run`` accepts an optional glob argument to filter tests. Currently
+this only matches against suite names, but this may change in the future.
+
+Say that we wanted to run the sysctl tests, we could do so via:
+
+.. code-block:: bash
+
+ $ echo -e 'CONFIG_KUNIT=y\nCONFIG_KUNIT_ALL_TESTS=y' > .kunit/.kunitconfig
+ $ ./tools/testing/kunit/kunit.py run 'sysctl*'
+
+We're paying the cost of building more tests than we need this way, but it's
+easier than fiddling with ``.kunitconfig`` files or commenting out
+``kunit_suite``'s.
+
+However, if we wanted to define a set of tests in a less ad hoc way, the next
+tip is useful.
+
+Defining a set of tests
+-----------------------
+
+``kunit.py run`` (along with ``build``, and ``config``) supports a
+``--kunitconfig`` flag. So if you have a set of tests that you want to run on a
+regular basis (especially if they have other dependencies), you can create a
+specific ``.kunitconfig`` for them.
+
+E.g. kunit has one for its tests:
+
+.. code-block:: bash
+
+ $ ./tools/testing/kunit/kunit.py run --kunitconfig=lib/kunit/.kunitconfig
+
+Alternatively, if you're following the convention of naming your
+file ``.kunitconfig``, you can just pass in the dir, e.g.
+
+.. code-block:: bash
+
+ $ ./tools/testing/kunit/kunit.py run --kunitconfig=lib/kunit
+
+.. note::
+ This is a relatively new feature (5.12+) so we don't have any
+ conventions yet about what files should be checked in versus just
+ kept around locally. But if the tests don't have any dependencies
+ (beyond ``CONFIG_KUNIT``), it's probably not worth writing and
+ maintaining a ``.kunitconfig`` fragment. Running with
+ ``CONFIG_KUNIT_ALL_TESTS=y`` is probably easier.
+
+.. note::
+ Having ``.kunitconfig`` fragments in a parent and child directory is
+ iffy. There's discussion about adding an "import" statement in these
+ files to make it possible to have a top-level config run tests from all
+ child directories. But that would mean ``.kunitconfig`` files are no
+ longer just simple .config fragments.
+
+ One alternative would be to have kunit tool recursively combine configs
+ automagically, but tests could theoretically depend on incompatible
+ options, so handling that would be tricky.
+
+Running with ``allyesconfig``
+-----------------------------
+
+.. code-block:: bash
+
+ $ ./tools/testing/kunit/kunit.py run --alltests
+
+This will try to use ``allyesconfig``, or rather ``allyesconfig`` with a list
+of UML-incompatible configs turned off. That list is maintained in
+``tools/testing/kunit/configs/broken_on_uml.config``.
+
+.. note::
+ This will take a *lot* longer to run and might be broken from time to
+ time, especially on -next. It's not recommended to use this unless you
+ need to or are morbidly curious.
+
+Generating code coverage reports under UML
+------------------------------------------
+
+.. note::
+ TODO(brendanhiggins(a)google.com): There are various issues with UML and
+ versions of gcc 7 and up. You're likely to run into missing ``.gcda``
+ files or compile errors. We know one `faulty GCC commit
+ <https://github.com/gcc-mirror/gcc/commit/8c9434c2f9358b8b8bad2c1990edf10a21…>`_
+ but not how we'd go about getting this fixed. The compile errors still
+ need some investigation.
+
+.. note::
+ TODO(brendanhiggins(a)google.com): for recent versions of Linux
+ (5.10-5.12, maybe earlier), there's a bug with gcov counters not being
+ flushed in UML. This translates to very low (<1%) reported coverage. This is
+ related to the above issue and can be worked around by replacing the
+ one call to ``uml_abort()`` with a plain ``exit()``.
+
+
+This is different from the "normal" way of getting coverage information that is
+documented in Documentation/dev-tools/gcov.rst.
+
+Instead of enabling ``CONFIG_GCOV_KERNEL=y``, we can set these options:
+
+.. code-block:: none
+
+ CONFIG_DEBUG_KERNEL=y
+ CONFIG_DEBUG_INFO=y
+ CONFIG_GCOV=y
+
+
+Putting it together into a copy-pastable sequence of commands:
+
+.. code-block:: bash
+
+ # Append coverage options to the current config
+ $ echo -e "CONFIG_DEBUG_KERNEL=y\nCONFIG_DEBUG_INFO=y\nCONFIG_GCOV=y" >> .kunit/.kunitconfig
+ $ ./tools/testing/kunit/kunit.py run
+ # Extract the coverage information from the build dir (.kunit/)
+ $ lcov -t "my_kunit_tests" -o coverage.info -c -d .kunit/
+
+ # From here on, it's the same process as with CONFIG_GCOV_KERNEL=y
+ # E.g. can generate an HTML report in a tmp dir like so:
+ $ genhtml -o /tmp/coverage_html coverage.info
+
+
+If your installed version of gcc doesn't work, you can tweak the steps:
+
+.. code-block:: bash
+
+ # need to edit tools/testing/kunit/kunit_kernel.py to call make with 'CC=/usr/bin/gcc-6'
+ $ $EDITOR tools/testing/kunit/kunit_kernel.py
+
+ $ lcov -t "my_kunit_tests" -o coverage.info -c -d .kunit/ --gcov-tool=/usr/bin/gcov-6
+
+
+Running tests manually
+======================
+
+Running tests without using ``kunit.py run`` is also an important use case.
+Currently it's your only option if you want to test on architectures other than
+UML.
+
+As running the tests under UML is fairly straightforward (configure and compile
+the kernel, run the ``./linux`` binary), this section will focus on testing
+non-UML architectures.
+
+
+Running built-in tests
+----------------------
+
+When setting tests to ``=y``, the tests will run as part of boot and print
+results to dmesg in TAP format. So you just need to add your tests to your
+``.config``, build and boot your kernel as normal.
+
+So if we compiled our kernel with:
+
+.. code-block:: none
+
+ CONFIG_KUNIT=y
+ CONFIG_KUNIT_EXAMPLE_TEST=y
+
+Then we'd see output like this in dmesg signaling the test ran and passed:
+
+.. code-block:: none
+
+ TAP version 14
+ 1..1
+ # Subtest: example
+ 1..1
+ # example_simple_test: initializing
+ ok 1 - example_simple_test
+ ok 1 - example
+
+Running tests as modules
+------------------------
+
+Depending on the tests, you can build them as loadable modules.
+
+For example, we'd change the config options from before to
+
+.. code-block:: none
+
+ CONFIG_KUNIT=y
+ CONFIG_KUNIT_EXAMPLE_TEST=m
+
+Then after booting into our kernel, we can run the test via
+
+.. code-block:: none
+
+ $ modprobe kunit-example-test
+
+This will then cause it to print TAP output to stdout.
+
+.. note::
+ The ``modprobe`` will *not* have a non-zero exit code if any test
+ failed (as of 5.13). But ``kunit.py parse`` would, see below.
+
+.. note::
+ You can set ``CONFIG_KUNIT=m`` as well, however, some features will not
+ work and thus some tests might break. Ideally tests would specify they
+ depend on ``KUNIT=y`` in their ``Kconfig``'s, but this is an edge case
+ most test authors won't think about.
+ As of 5.13, the only difference is that ``current->kunit_test`` will
+ not exist.
+
+Pretty-printing results
+-----------------------
+
+You can use ``kunit.py parse`` to parse dmesg for test output and print out
+results in the same familiar format that ``kunit.py run`` does.
+
+.. code-block:: bash
+
+ $ ./tools/testing/kunit/kunit.py parse /var/log/dmesg
+
+
+Retrieving per suite results
+----------------------------
+
+Regardless of how you're running your tests, you can enable
+``CONFIG_KUNIT_DEBUGFS`` to expose per-suite TAP-formatted results:
+
+.. code-block:: none
+
+ CONFIG_KUNIT=y
+ CONFIG_KUNIT_EXAMPLE_TEST=m
+ CONFIG_KUNIT_DEBUGFS=y
+
+The results for each suite will be exposed under
+``/sys/kernel/debug/kunit/<suite>/results``.
+So using our example config:
+
+.. code-block:: bash
+
+ $ modprobe kunit-example-test > /dev/null
+ $ cat /sys/kernel/debug/kunit/example/results
+ ... <TAP output> ...
+
+ # After removing the module, the corresponding files will go away
+ $ modprobe -r kunit-example-test
+ $ cat /sys/kernel/debug/kunit/example/results
+ /sys/kernel/debug/kunit/example/results: No such file or directory
+
+Generating code coverage reports
+--------------------------------
+
+See Documentation/dev-tools/gcov.rst for details on how to do this.
+
+The only vaguely KUnit-specific advice here is that you probably want to build
+your tests as modules. That way you can isolate the coverage from tests from
+other code executed during boot, e.g.
+
+.. code-block:: bash
+
+ # Reset coverage counters before running the test.
+ $ echo 0 > /sys/kernel/debug/gcov/reset
+ $ modprobe kunit-example-test
diff --git a/Documentation/dev-tools/kunit/start.rst b/Documentation/dev-tools/kunit/start.rst
index 0e65cabe08eb..aa56d7ca6bfb 100644
--- a/Documentation/dev-tools/kunit/start.rst
+++ b/Documentation/dev-tools/kunit/start.rst
@@ -236,5 +236,7 @@ Next Steps
==========
* Check out the :doc:`tips` page for tips on
writing idiomatic KUnit tests.
+* Check out the :doc:`running_tips` page for tips on
+ how to make running KUnit tests easier.
* Optional: see the :doc:`usage` page for a more
in-depth explanation of KUnit.
base-commit: de2fcb3e62013738f22bbb42cbd757d9a242574e
--
2.31.1.295.g9ea45b61b8-goog
This patchset introduces batched operations for the per-cpu variant of
the array map.
It also removes the percpu macros from 'bpf_util.h'. This change was
suggested by Andrii in an earlier iteration of this patchset.
The tests were updated to reflect all the new changes.
v2 -> v3:
- Remove percpu macros as suggested by Andrii
- Update tests that used the per cpu macros
v1 -> v2:
- Amended a more descriptive commit message
Pedro Tammela (3):
bpf: add batched ops support for percpu array
bpf: selftests: remove percpu macros from bpf_util.h
bpf: selftests: update array map tests for per-cpu batched ops
kernel/bpf/arraymap.c | 2 +
tools/testing/selftests/bpf/bpf_util.h | 7 --
.../bpf/map_tests/array_map_batch_ops.c | 110 +++++++++++++-----
.../bpf/map_tests/htab_map_batch_ops.c | 71 ++++++-----
.../selftests/bpf/prog_tests/map_init.c | 9 +-
tools/testing/selftests/bpf/test_maps.c | 84 +++++++------
6 files changed, 171 insertions(+), 112 deletions(-)
--
2.25.1
The kernel now has a number of testing and debugging tools, and we've
seen a bit of confusion about what the differences between them are.
Add basic documentation outlining the testing tools, when to use each,
and how they interact.
This is a pretty quick overview rather than the idealised "kernel
testing guide" that'd probably be optimal, but given the number of times
questions like "When do you use KUnit and when do you use Kselftest?"
are being asked, it seemed worth at least having something. Hopefully
this can form the basis for more detailed documentation later.
Signed-off-by: David Gow <davidgow(a)google.com>
---
Documentation/dev-tools/index.rst | 3 +
Documentation/dev-tools/testing-overview.rst | 102 +++++++++++++++++++
2 files changed, 105 insertions(+)
create mode 100644 Documentation/dev-tools/testing-overview.rst
diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
index 1b1cf4f5c9d9..f590e5860794 100644
--- a/Documentation/dev-tools/index.rst
+++ b/Documentation/dev-tools/index.rst
@@ -7,6 +7,8 @@ be used to work on the kernel. For now, the documents have been pulled
together without any significant effort to integrate them into a coherent
whole; patches welcome!
+A brief overview of testing-specific tools can be found in :doc:`testing-overview`.
+
.. class:: toc-title
Table of contents
@@ -14,6 +16,7 @@ whole; patches welcome!
.. toctree::
:maxdepth: 2
+ testing-overview
coccinelle
sparse
kcov
diff --git a/Documentation/dev-tools/testing-overview.rst b/Documentation/dev-tools/testing-overview.rst
new file mode 100644
index 000000000000..8452adcb8608
--- /dev/null
+++ b/Documentation/dev-tools/testing-overview.rst
@@ -0,0 +1,102 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====================
+Kernel Testing Guide
+====================
+
+
+There are a number of different tools for testing the Linux kernel, so knowing
+when to use each of them can be a challenge. This document provides a rough
+overview of their differences, and how they fit together.
+
+
+Writing and Running Tests
+=========================
+
+The bulk of kernel tests are written using either the :doc:`kselftest
+<kselftest>` or :doc:`KUnit <kunit/index>` frameworks. These both provide
+infrastructure to help make running tests and groups of tests easier, as well
+as providing helpers to aid in writing new tests.
+
+If you're looking to verify the behaviour of the Kernel — particularly specific
+parts of the kernel — then you'll want to use `KUnit` or `kselftest`.
+
+
+The Difference Between KUnit and kselftest
+------------------------------------------
+
+:doc:`KUnit <kunit/index>` is an entirely in-kernel system for "white box"
+testing: because test code is part of the kernel, it can access internal
+structures and functions which aren't exposed to userspace.
+
+`KUnit` tests therefore are best written against small, self-contained parts
+of the kernel, which can be tested in isolation. This aligns well with the
+concept of Unit testing.
+
+For example, a KUnit test might test an individual kernel function (or even a
+single codepath through a function, such as an error handling case), rather
+than a feature as a whole.
+
+There is a KUnit test style guide which may give further pointers.
+
+
+:doc:`kselftest <kselftest>`, on the other hand, is largely implemented in
+userspace, and tests are normal userspace scripts or programs.
+
+This makes it easier to write more complicated tests, or tests which need to
+manipulate the overall system state more (e.g., spawning processes, etc.).
+However, it's not possible to call kernel functions directly unless they're
+exposed to userspace (by a syscall, device, filesystem, etc.). Some tests do
+also provide a kernel module which is loaded by the test, though for tests
+which run mostly or entirely within the kernel, `KUnit` may be the better tool.
+
+`kselftest` is therefore suited well to tests of whole features, as these will
+expose an interface to userspace, which can be tested, but not implementation
+details. This aligns well with 'system' or 'end-to-end' testing.
+
+
+Code Coverage Tools
+===================
+
+The Linux Kernel supports two different code coverage measurement tools. These
+can be used to verify that a test is executing particular functions or lines
+of code. This is useful for determining how much of the kernel is being tested,
+and for finding corner-cases which are not covered by the appropriate test.
+
+:doc:`kcov` is a feature which can be built in to the kernel to allow
+capturing coverage on a per-task level. It's therefore useful for fuzzing and
+other situations where information about code executed during, for example, a
+single syscall is useful.
+
+:doc:`gcov` is GCC's coverage testing tool, which can be used with the kernel
+to get global or per-module coverage. Unlike KCOV, it does not record per-task
+coverage. Coverage data can be read from debugfs, and interpreted using the
+usual gcov tooling.
+
+
+Sanitizers
+==========
+
+The kernel also supports a number of sanitizers, which attempt to detect
+classes of issues when they occur in a running kernel. These typically
+look for undefined behaviour of some kind, such as invalid memory accesses,
+concurrency issues such as data races, or other undefined behaviour like
+integer overflows.
+
+* :doc:`kmemleak` (Kmemleak) detects possible memory leaks.
+* :doc:`kasan` detects invalid memory accesses such as out-of-bounds and
+ use-after-free errors.
+* :doc:`ubsan` detects behaviour that is undefined by the C standard, like
+ integer overflows.
+* :doc:`kcsan` detects data races.
+* :doc:`kfence` is a low-overhead detector of memory issues, which is much
+ faster than KASAN and can be used in production.
+
+These tools tend to test the kernel as a whole, and do not "pass" like
+kselftest or KUnit tests. They can be combined with KUnit or kselftest by
+running tests on a kernel with a sanitizer enabled: you can then be sure
+that none of these errors are occurring during the test.
+
+Some of these sanitizers integrate with KUnit or kselftest and will
+automatically fail tests if an issue is detected by a sanitizer.
+
--
2.31.1.295.g9ea45b61b8-goog
Changelog
RFC v2-->v3
Based on comments by Doug Smythies,
1. Changed commit log to reflect the test must be run as super user.
2. Added a comment specifying a method to run the test bash script
without recompiling.
3. Enable all the idle states after the experiments are completed so
that the system is in a coherent state after the tests have run
4. Correct the return status of a CPU that cannot be off-lined.
RFC v2: https://lkml.org/lkml/2021/4/1/615
---
A kernel module + userspace driver to estimate the wakeup latency
caused by going into stop states. The motivation behind this program is
to find significant deviations from advertised latency and residency
values.
The patchset measures latencies for two kinds of events: IPIs and timers.
As this is a software-only mechanism, there will be additional latencies from
the kernel-firmware-hardware interactions. To account for that, the
program also measures a baseline latency on a 100 percent loaded CPU,
and the latencies achieved must be viewed relative to that.
To achieve this, we introduce a kernel module and expose its control
knobs through the debugfs interface that the selftests can engage with.
The kernel module provides the following interfaces within
/sys/kernel/debug/latency_test/ for,
IPI test:
ipi_cpu_dest = Destination CPU for the IPI
ipi_cpu_src = Origin of the IPI
ipi_latency_ns = Measured latency time in ns
Timeout test:
timeout_cpu_src = CPU on which the timer is to be queued
timeout_expected_ns = Timer duration
timeout_diff_ns = Difference of actual duration vs expected timer
Sample output on a POWER9 system is as follows:
# --IPI Latency Test---
# Baseline Average IPI latency(ns): 3114
# Observed Average IPI latency(ns) - State0: 3265
# Observed Average IPI latency(ns) - State1: 3507
# Observed Average IPI latency(ns) - State2: 3739
# Observed Average IPI latency(ns) - State3: 3807
# Observed Average IPI latency(ns) - State4: 17070
# Observed Average IPI latency(ns) - State5: 1038174
# Observed Average IPI latency(ns) - State6: 1068784
#
# --Timeout Latency Test--
# Baseline Average timeout diff(ns): 1420
# Observed Average timeout diff(ns) - State0: 1640
# Observed Average timeout diff(ns) - State1: 1764
# Observed Average timeout diff(ns) - State2: 1715
# Observed Average timeout diff(ns) - State3: 1845
# Observed Average timeout diff(ns) - State4: 16581
# Observed Average timeout diff(ns) - State5: 939977
# Observed Average timeout diff(ns) - State6: 1073024
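Because the observed figures include kernel-firmware-hardware overhead, each
one is read against the baseline. The following is a minimal C sketch of that
subtraction, using the IPI numbers from the sample output above. The helper
name wakeup_overhead_ns() is hypothetical and is not part of the patchset.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patchset): latency attributable to
 * leaving an idle state, i.e. the observed average minus the baseline
 * measured on a 100 percent loaded CPU.
 */
static int64_t wakeup_overhead_ns(int64_t observed_ns, int64_t baseline_ns)
{
	assert(observed_ns >= 0 && baseline_ns >= 0);
	return observed_ns - baseline_ns;
}

/* IPI figures copied from the POWER9 sample output. */
static const int64_t ipi_baseline_ns = 3114;
static const int64_t ipi_observed_ns[] = {
	3265, 3507, 3739, 3807, 17070, 1038174, 1068784,
};
```

For State4, for example, this gives 17070 - 3114 = 13956 ns above the
baseline, while State0 adds only 151 ns.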
Things to keep in mind:
1. This kernel module + bash driver does not guarantee idleness on a
core when the IPI and the Timer are armed. It only invokes sleep and
hopes that the core is idle once the IPI/Timer is invoked onto it.
Hence this program must be run on a completely idle system for best
results.
2. Even on a completely idle system, there may be book-keeping tasks or
jitter tasks that can run on the core we want idle. This can create
outliers in the latency measurement. Thankfully, these outliers
should be large enough to be easily weeded out.
3. A userspace only selftest variant was also sent out as RFC based on
suggestions over the previous patchset to simplify the kernel
complexity. However, a userspace only approach had more noise in
the latency measurement due to userspace-kernel interactions
which led to run to run variance and a less accurate test.
Another downside of the nature of a userspace program is that it
takes orders of magnitude longer to complete a full system test
compared to the kernel framework.
RFC patch: https://lkml.org/lkml/2020/9/2/356
4. For Intel Systems, the Timer based latencies don't exactly give out
the measure of idle latencies. This is because of a hardware
optimization mechanism that pre-arms a CPU when a timer is set to
wakeup. That doesn't make this metric useless for Intel systems,
it just means that it is measuring IPI/Timer responding latency rather
than idle wakeup latencies.
(Source: https://lkml.org/lkml/2020/9/2/610)
As a solution to this problem, a hardware based latency analyzer has
been devised by Artem Bityutskiy from Intel.
https://youtu.be/Opk92aQyvt0?t=8266
https://intel.github.io/wult/
Pratik Rajesh Sampat (2):
cpuidle: Extract IPI based and timer based wakeup latency from idle
states
selftest/cpuidle: Add support for cpuidle latency measurement
drivers/cpuidle/Makefile | 1 +
drivers/cpuidle/test-cpuidle_latency.c | 157 ++++++++++
lib/Kconfig.debug | 10 +
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/cpuidle/Makefile | 6 +
tools/testing/selftests/cpuidle/cpuidle.sh | 326 +++++++++++++++++++++
tools/testing/selftests/cpuidle/settings | 2 +
7 files changed, 503 insertions(+)
create mode 100644 drivers/cpuidle/test-cpuidle_latency.c
create mode 100644 tools/testing/selftests/cpuidle/Makefile
create mode 100755 tools/testing/selftests/cpuidle/cpuidle.sh
create mode 100644 tools/testing/selftests/cpuidle/settings
--
2.17.1
This series aims to clarify the behavior of the KVM_GET_EMULATED_CPUID
ioctl, and fix a corner case where -E2BIG is returned when
the nent field of struct kvm_cpuid2 matches the number of
emulated entries that KVM returns.
Patch 1 proposes the nent field fix to cpuid.c,
patch 2 updates the ioctl documentation accordingly and
patches 3 and 4 extend the x86_64/get_cpuid_test.c selftest to check
the intended behavior of KVM_GET_EMULATED_CPUID.
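The corner case can be pictured with a small userspace model. This is only a
sketch under an assumption: the cover letter does not show the kernel logic,
so the model supposes that the spurious error comes from checking the
remaining capacity before it is known whether the visited function emits an
entry. All names below are hypothetical and none of this is the code in
cpuid.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an output array; not the kernel's data structure. */
struct entry_array {
	size_t nent;     /* entries written so far */
	size_t maxnent;  /* capacity supplied by the caller */
};

/*
 * Assumed behaviour: reject as soon as the array is full, even when the
 * function being visited would not add an entry.
 */
static int visit_check_first(struct entry_array *a, bool emits_entry)
{
	if (a->nent >= a->maxnent)
		return -E2BIG;
	if (emits_entry)
		a->nent++;
	return 0;
}

/* Alternative: only an entry that is actually emitted needs a free slot. */
static int visit_check_on_emit(struct entry_array *a, bool emits_entry)
{
	if (!emits_entry)
		return 0;
	if (a->nent >= a->maxnent)
		return -E2BIG;
	a->nent++;
	return 0;
}

/* Walk a list of functions, stopping at the first error. */
static int walk(struct entry_array *a, const bool *emits, size_t n,
		int (*visit)(struct entry_array *, bool))
{
	assert(a != NULL && emits != NULL && visit != NULL);
	for (size_t i = 0; i < n; i++) {
		int ret = visit(a, emits[i]);

		if (ret)
			return ret;
	}
	return 0;
}
```

With a capacity of exactly three and three emitting functions followed by a
non-emitting one, the first variant fails with -E2BIG although every entry
fits, while the second succeeds and still rejects a capacity of two.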
Signed-off-by: Emanuele Giuseppe Esposito <eesposit(a)redhat.com>
---
v5:
- Better comment in cpuid.c (patch 1)
Emanuele Giuseppe Esposito (4):
KVM: x86: Fix a spurious -E2BIG in KVM_GET_EMULATED_CPUID
Documentation: KVM: update KVM_GET_EMULATED_CPUID ioctl description
selftests: add kvm_get_emulated_cpuid to processor.h
selftests: KVM: extend get_cpuid_test to include
KVM_GET_EMULATED_CPUID
Documentation/virt/kvm/api.rst | 10 +--
arch/x86/kvm/cpuid.c | 33 ++++---
.../selftests/kvm/include/x86_64/processor.h | 1 +
.../selftests/kvm/lib/x86_64/processor.c | 33 +++++++
.../selftests/kvm/x86_64/get_cpuid_test.c | 90 ++++++++++++++++++-
5 files changed, 142 insertions(+), 25 deletions(-)
--
2.30.2
This series aims to clarify the behavior of the KVM_GET_EMULATED_CPUID
ioctl, and fix a corner case where -E2BIG is returned when
the nent field of struct kvm_cpuid2 matches the number of
emulated entries that KVM returns.
Patch 1 proposes the nent field fix to cpuid.c,
patch 2 updates the ioctl documentation accordingly and
patches 3 and 4 extend the x86_64/get_cpuid_test.c selftest to check
the intended behavior of KVM_GET_EMULATED_CPUID.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit(a)redhat.com>
---
v4:
- Address nitpicks given in the mailing list
Emanuele Giuseppe Esposito (4):
KVM: x86: Fix a spurious -E2BIG in KVM_GET_EMULATED_CPUID
Documentation: KVM: update KVM_GET_EMULATED_CPUID ioctl description
selftests: add kvm_get_emulated_cpuid to processor.h
selftests: KVM: extend get_cpuid_test to include
KVM_GET_EMULATED_CPUID
Documentation/virt/kvm/api.rst | 10 +--
arch/x86/kvm/cpuid.c | 33 ++++---
.../selftests/kvm/include/x86_64/processor.h | 1 +
.../selftests/kvm/lib/x86_64/processor.c | 33 +++++++
.../selftests/kvm/x86_64/get_cpuid_test.c | 90 ++++++++++++++++++-
5 files changed, 142 insertions(+), 25 deletions(-)
--
2.30.2
As of commit 359a376081d4 ("kunit: support failure from dynamic analysis
tools"), we can use current->kunit_test to find the current kunit test.
Mention this in tips.rst and give an example of how this can be used in
conjunction with `test->priv` to pass around state and specifically
implement something like mocking.
There's a lot more we could go into on that topic, but given that
example is already longer than every other "tip" on this page, we just
point to the API docs and leave filling in the blanks as an exercise to
the reader.
Also give an example of kunit_fail_current_test().
Signed-off-by: Daniel Latypov <dlatypov(a)google.com>
---
Documentation/dev-tools/kunit/tips.rst | 78 +++++++++++++++++++++++++-
1 file changed, 76 insertions(+), 2 deletions(-)
diff --git a/Documentation/dev-tools/kunit/tips.rst b/Documentation/dev-tools/kunit/tips.rst
index a6ca0af14098..8d8c238f7f79 100644
--- a/Documentation/dev-tools/kunit/tips.rst
+++ b/Documentation/dev-tools/kunit/tips.rst
@@ -78,8 +78,82 @@ Similarly to the above, it can be useful to add test-specific logic.
void test_only_hook(void) { }
#endif
-TODO(dlatypov(a)google.com): add an example of using ``current->kunit_test`` in
-such a hook when it's not only updated for ``CONFIG_KASAN=y``.
+This test-only code can be made more useful by accessing the current kunit
+test, see below.
+
+Accessing the current test
+--------------------------
+
+In some cases, you need to call test-only code from outside the test file, e.g.
+like in the example above or if you're providing a fake implementation of an
+ops struct.
+There is a ``kunit_test`` field in ``task_struct``, so you can access it via
+``current->kunit_test``.
+
+Here's a slightly in-depth example of how one could implement "mocking":
+
+.. code-block:: c
+
+ #include <linux/sched.h> /* for current */
+
+ struct test_data {
+ int foo_result;
+ int want_foo_called_with;
+ };
+
+ static int fake_foo(int arg)
+ {
+ struct kunit *test = current->kunit_test;
+ struct test_data *test_data = test->priv;
+
+ KUNIT_EXPECT_EQ(test, test_data->want_foo_called_with, arg);
+ return test_data->foo_result;
+ }
+
+ static void example_simple_test(struct kunit *test)
+ {
+ /* Assume priv is allocated in the suite's .init */
+ struct test_data *test_data = test->priv;
+
+ test_data->foo_result = 42;
+ test_data->want_foo_called_with = 1;
+
+ /* In a real test, we'd probably pass a pointer to fake_foo somewhere
+ * like an ops struct, etc. instead of calling it directly. */
+ KUNIT_EXPECT_EQ(test, fake_foo(1), 42);
+ }
+
+
+Note: here we're able to get away with using ``test->priv``, but if you wanted
+something more flexible you could use a named ``kunit_resource``, see :doc:`api/test`.
+
+Failing the current test
+------------------------
+
+But sometimes, you might just want to fail the current test. In that case, we
+have ``kunit_fail_current_test(fmt, args...)`` which is defined in ``<kunit/test-bug.h>`` and
+doesn't require pulling in ``<kunit/test.h>``.
+
+E.g. say we had an option to enable some extra debug checks on some data structure:
+
+.. code-block:: c
+
+ #include <kunit/test-bug.h>
+
+ #ifdef CONFIG_EXTRA_DEBUG_CHECKS
+ static void validate_my_data(struct data *data)
+ {
+ if (is_valid(data))
+ return;
+
+ kunit_fail_current_test("data %p is invalid", data);
+
+ /* Normal, non-KUnit, error reporting code here. */
+ }
+ #else
+ static void validate_my_data(struct data *data) { }
+ #endif
+
Customizing error messages
--------------------------
base-commit: 0a50438c84363bd37fe18fe432888ae9a074dcab
--
2.31.0.208.g409f899ff0-goog
This patchset introduces batched operations for the per-cpu variant of
the array map.
It also introduces a standard way to define per-cpu values via the
'BPF_PERCPU_TYPE()' macro, which handles the alignment transparently.
This was already implemented in the selftests and was merely refactored
out to libbpf, with some simplifications for reuse.
The tests were updated to reflect all the new changes.
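As an illustration of what "handles the alignment transparently" can mean,
here is a minimal C sketch. It assumes the usual per-CPU map convention that
each CPU's value occupies a slot rounded up to 8 bytes. The macro and function
names are hypothetical stand-ins; the definition of BPF_PERCPU_TYPE() in this
series may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; the series' own BPF_PERCPU_TYPE() may differ. */
#define EXAMPLE_PERCPU_ALIGN 8

/* Wrap a value type so each per-CPU slot is padded to an 8-byte multiple. */
#define EXAMPLE_PERCPU_TYPE(type) \
	struct { type v; } __attribute__((aligned(EXAMPLE_PERCPU_ALIGN)))

/* Access the slot that belongs to a given CPU. */
#define example_percpu(name, cpu) ((name)[(cpu)].v)

typedef EXAMPLE_PERCPU_TYPE(uint32_t) example_u32_slot;

/* Sum a per-CPU buffer laid out as one padded slot per CPU. */
static uint64_t example_sum(const example_u32_slot *slots, size_t ncpus)
{
	uint64_t sum = 0;

	assert(slots != NULL);
	for (size_t cpu = 0; cpu < ncpus; cpu++)
		sum += example_percpu(slots, cpu);
	return sum;
}
```

A 4-byte value wrapped this way takes 8 bytes per CPU, so an array of slots
can be passed as the value buffer without manual padding arithmetic.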
v1 -> v2:
- Amended the commit message to be more descriptive
Pedro Tammela (3):
bpf: add batched ops support for percpu array
libbpf: selftests: refactor 'BPF_PERCPU_TYPE()' and 'bpf_percpu()'
macros
bpf: selftests: update array map tests for per-cpu batched ops
kernel/bpf/arraymap.c | 2 +
tools/lib/bpf/bpf.h | 10 ++
tools/testing/selftests/bpf/bpf_util.h | 7 --
.../bpf/map_tests/array_map_batch_ops.c | 114 +++++++++++++-----
.../bpf/map_tests/htab_map_batch_ops.c | 48 ++++----
.../selftests/bpf/prog_tests/map_init.c | 5 +-
tools/testing/selftests/bpf/test_maps.c | 16 +--
7 files changed, 133 insertions(+), 69 deletions(-)
--
2.25.1
This series aims to clarify the behavior of the KVM_GET_EMULATED_CPUID
ioctl, and fix a corner case where -E2BIG is returned when
the nent field of struct kvm_cpuid2 matches the number of
emulated entries that KVM returns.
Patch 1 proposes the nent field fix to cpuid.c,
patch 2 updates the ioctl documentation accordingly and
patches 3 and 4 extend the x86_64/get_cpuid_test.c selftest to check
the intended behavior of KVM_GET_EMULATED_CPUID.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit(a)redhat.com>
---
v3:
- clearer commit message and problem explanation
- pre-initialize the stack variable 'entry' in __do_cpuid_func_emulated
so that the various eax/ebx/ecx are initialized if not set by func.
Emanuele Giuseppe Esposito (4):
KVM: x86: Fix a spurious -E2BIG in KVM_GET_EMULATED_CPUID
Documentation: KVM: update KVM_GET_EMULATED_CPUID ioctl description
selftests: add kvm_get_emulated_cpuid to processor.h
selftests: KVM: extend get_cpuid_test to include
KVM_GET_EMULATED_CPUID
Documentation/virt/kvm/api.rst | 10 +--
arch/x86/kvm/cpuid.c | 33 ++++---
.../selftests/kvm/include/x86_64/processor.h | 1 +
.../selftests/kvm/lib/x86_64/processor.c | 33 +++++++
.../selftests/kvm/x86_64/get_cpuid_test.c | 90 ++++++++++++++++++-
5 files changed, 142 insertions(+), 25 deletions(-)
--
2.30.2
This patchset introduces batched operations for the per-cpu variant of
the array map.
It also introduces a standard way to define per-cpu values via the
'BPF_PERCPU_TYPE()' macro, which handles the alignment transparently.
This was already implemented in the selftests and was merely refactored
out to libbpf, with some simplifications for reuse.
The tests were updated to reflect all the new changes.
Pedro Tammela (3):
bpf: add batched ops support for percpu array
libbpf: selftests: refactor 'BPF_PERCPU_TYPE()' and 'bpf_percpu()'
macros
bpf: selftests: update array map tests for per-cpu batched ops
kernel/bpf/arraymap.c | 2 +
tools/lib/bpf/bpf.h | 10 ++
tools/testing/selftests/bpf/bpf_util.h | 7 --
.../bpf/map_tests/array_map_batch_ops.c | 114 +++++++++++++-----
.../bpf/map_tests/htab_map_batch_ops.c | 48 ++++----
.../selftests/bpf/prog_tests/map_init.c | 5 +-
tools/testing/selftests/bpf/test_maps.c | 16 +--
7 files changed, 133 insertions(+), 69 deletions(-)
--
2.25.1
Changelog
RFC v1-->v2
The timer based test produces run to run variance on some Intel based
systems that sport a mechanism of "C-state pre-wake" which can
pre-wake a CPU from an idle state when timers are armed.
Hence invoking the timer tests is now parameterized for systems and
architectures that don't support pre-wakeup logic and need granular
timer measurements along with IPI results.
This RFC does not yet support treating CPU 0's idle states differently,
especially as reported on Intel systems. More understanding is needed
on systems to determine if only CPU 0 is treated differently or if there
are more CPUs that cannot have their idle state properties changed.
RFC v1: https://lkml.org/lkml/2021/3/15/492
---
A kernel module + userspace driver to estimate the wakeup latency
caused by going into stop states. The motivation behind this program is
to find significant deviations from advertised latency and residency
values.
The patchset measures latencies for two kinds of events: IPIs and timers.
As this is a software-only mechanism, there will be additional latencies from
the kernel-firmware-hardware interactions. To account for that, the
program also measures a baseline latency on a 100 percent loaded CPU,
and the latencies achieved must be viewed relative to that.
To achieve this, we introduce a kernel module and expose its control
knobs through the debugfs interface that the selftests can engage with.
The kernel module provides the following interfaces within
/sys/kernel/debug/latency_test/ for,
IPI test:
ipi_cpu_dest = Destination CPU for the IPI
ipi_cpu_src = Origin of the IPI
ipi_latency_ns = Measured latency time in ns
Timeout test:
timeout_cpu_src = CPU on which the timer is to be queued
timeout_expected_ns = Timer duration
timeout_diff_ns = Difference of actual duration vs expected timer
Sample output on a POWER9 system is as follows:
# --IPI Latency Test---
# Baseline Average IPI latency(ns): 3114
# Observed Average IPI latency(ns) - State0: 3265
# Observed Average IPI latency(ns) - State1: 3507
# Observed Average IPI latency(ns) - State2: 3739
# Observed Average IPI latency(ns) - State3: 3807
# Observed Average IPI latency(ns) - State4: 17070
# Observed Average IPI latency(ns) - State5: 1038174
# Observed Average IPI latency(ns) - State6: 1068784
#
# --Timeout Latency Test--
# Baseline Average timeout diff(ns): 1420
# Observed Average timeout diff(ns) - State0: 1640
# Observed Average timeout diff(ns) - State1: 1764
# Observed Average timeout diff(ns) - State2: 1715
# Observed Average timeout diff(ns) - State3: 1845
# Observed Average timeout diff(ns) - State4: 16581
# Observed Average timeout diff(ns) - State5: 939977
# Observed Average timeout diff(ns) - State6: 1073024
Things to keep in mind:
1. This kernel module + bash driver does not guarantee idleness on a
core when the IPI and the Timer are armed. It only invokes sleep and
hopes that the core is idle once the IPI/Timer is invoked onto it.
Hence this program must be run on a completely idle system for best
results.
2. Even on a completely idle system, there may be book-keeping tasks or
jitter tasks that can run on the core we want idle. This can create
outliers in the latency measurement. Thankfully, these outliers
should be large enough to be easily weeded out.
3. A userspace only selftest variant was also sent out as RFC based on
suggestions over the previous patchset to simplify the kernel
complexity. However, a userspace only approach had more noise in
the latency measurement due to userspace-kernel interactions
which led to run to run variance and a less accurate test.
Another downside of the nature of a userspace program is that it
takes orders of magnitude longer to complete a full system test
compared to the kernel framework.
RFC patch: https://lkml.org/lkml/2020/9/2/356
4. For Intel Systems, the Timer based latencies don't exactly give out
the measure of idle latencies. This is because of a hardware
optimization mechanism that pre-arms a CPU when a timer is set to
wakeup. That doesn't make this metric useless for Intel systems,
it just means that it is measuring IPI/Timer responding latency rather
than idle wakeup latencies.
(Source: https://lkml.org/lkml/2020/9/2/610)
As a solution to this problem, a hardware based latency analyzer has
been devised by Artem Bityutskiy from Intel.
https://youtu.be/Opk92aQyvt0?t=8266
https://intel.github.io/wult/
Pratik Rajesh Sampat (2):
cpuidle: Extract IPI based and timer based wakeup latency from idle
states
selftest/cpuidle: Add support for cpuidle latency measurement
drivers/cpuidle/Makefile | 1 +
drivers/cpuidle/test-cpuidle_latency.c | 157 ++++++++++
lib/Kconfig.debug | 10 +
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/cpuidle/Makefile | 6 +
tools/testing/selftests/cpuidle/cpuidle.sh | 323 +++++++++++++++++++++
tools/testing/selftests/cpuidle/settings | 2 +
7 files changed, 500 insertions(+)
create mode 100644 drivers/cpuidle/test-cpuidle_latency.c
create mode 100644 tools/testing/selftests/cpuidle/Makefile
create mode 100755 tools/testing/selftests/cpuidle/cpuidle.sh
create mode 100644 tools/testing/selftests/cpuidle/settings
--
2.17.1
This patchset provides a file descriptor for every VM and VCPU to read
KVM statistics data in binary format.
It is meant to provide a lightweight, flexible, scalable and efficient
lock-free solution for user space telemetry applications to pull the
statistics data periodically for large scale systems. The pulling
frequency could be as high as a few times per second.
In this patchset, each statistic is treated as having the following
attributes:
* architecture dependent or common
* VM statistics data or VCPU statistics data
* type: cumulative, instantaneous
* unit: none for simple counter, nanosecond, microsecond,
millisecond, second, Byte, KiByte, MiByte, GiByte, clock cycles
Since no lock/synchronization is used, the consistency between all
the statistics data is not guaranteed. That means not all statistics
data are read out at the exact same time, since the statistics data
are still being updated by KVM subsystems while they are read out.
Jing Zhang (4):
KVM: stats: Separate common stats from architecture specific ones
KVM: stats: Add fd-based API to read binary stats data
KVM: stats: Add documentation for statistics data binary interface
KVM: selftests: Add selftest for KVM statistics data binary interface
Documentation/virt/kvm/api.rst | 169 ++++++++
arch/arm64/include/asm/kvm_host.h | 9 +-
arch/arm64/kvm/guest.c | 42 +-
arch/mips/include/asm/kvm_host.h | 9 +-
arch/mips/kvm/mips.c | 67 +++-
arch/powerpc/include/asm/kvm_host.h | 9 +-
arch/powerpc/kvm/book3s.c | 68 +++-
arch/powerpc/kvm/book3s_hv.c | 12 +-
arch/powerpc/kvm/book3s_pr.c | 2 +-
arch/powerpc/kvm/book3s_pr_papr.c | 2 +-
arch/powerpc/kvm/booke.c | 63 ++-
arch/s390/include/asm/kvm_host.h | 9 +-
arch/s390/kvm/kvm-s390.c | 133 ++++++-
arch/x86/include/asm/kvm_host.h | 9 +-
arch/x86/kvm/x86.c | 71 +++-
include/linux/kvm_host.h | 132 ++++++-
include/linux/kvm_types.h | 12 +
include/uapi/linux/kvm.h | 48 +++
tools/testing/selftests/kvm/.gitignore | 1 +
tools/testing/selftests/kvm/Makefile | 3 +
.../testing/selftests/kvm/include/kvm_util.h | 3 +
.../selftests/kvm/kvm_bin_form_stats.c | 370 ++++++++++++++++++
tools/testing/selftests/kvm/lib/kvm_util.c | 11 +
virt/kvm/kvm_main.c | 237 ++++++++++-
24 files changed, 1401 insertions(+), 90 deletions(-)
create mode 100644 tools/testing/selftests/kvm/kvm_bin_form_stats.c
base-commit: f96be2deac9bca3ef5a2b0b66b71fcef8bad586d
--
2.31.0.208.g409f899ff0-goog
The current way to provide a no-op flag to 'bpf_ringbuf_submit()',
'bpf_ringbuf_discard()' and 'bpf_ringbuf_output()' is to provide a '0'
value.
A '0' value might notify the consumer if it already caught up in processing,
so let's provide a more descriptive notation for this value.
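The three flag values can be compared with a small userspace model. This is a
sketch only: the enum repeats the values proposed in this patch, while
model_should_notify() is a hypothetical stand-in for the kernel's decision and
assumes "if needed" means the consumer has already caught up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values as proposed in this patch. */
enum {
	BPF_RB_MAY_WAKEUP   = 0,
	BPF_RB_NO_WAKEUP    = (1ULL << 0),
	BPF_RB_FORCE_WAKEUP = (1ULL << 1),
};

/*
 * Hypothetical userspace model of the notification decision; the real
 * logic lives in the kernel's ring buffer code and is not reproduced here.
 */
static bool model_should_notify(uint64_t flags, bool consumer_caught_up)
{
	/* Only the two wakeup bits are modelled. */
	assert((flags & ~(uint64_t)(BPF_RB_NO_WAKEUP |
				    BPF_RB_FORCE_WAKEUP)) == 0);

	if (flags & BPF_RB_FORCE_WAKEUP)
		return true;   /* always notify */
	if (flags & BPF_RB_NO_WAKEUP)
		return false;  /* never notify */
	/* BPF_RB_MAY_WAKEUP (0): notify only if needed. */
	return consumer_caught_up;
}
```

In this model, passing BPF_RB_MAY_WAKEUP behaves exactly like passing a
literal 0, which is the point of the new name: the call site documents that a
wakeup may still happen.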
Signed-off-by: Pedro Tammela <pctammela(a)mojatatu.com>
---
include/uapi/linux/bpf.h | 8 ++++++++
tools/include/uapi/linux/bpf.h | 8 ++++++++
tools/testing/selftests/bpf/progs/ima.c | 2 +-
tools/testing/selftests/bpf/progs/ringbuf_bench.c | 2 +-
tools/testing/selftests/bpf/progs/test_ringbuf.c | 2 +-
tools/testing/selftests/bpf/progs/test_ringbuf_multi.c | 2 +-
6 files changed, 20 insertions(+), 4 deletions(-)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 598716742593..100cb2e4c104 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -4058,6 +4058,8 @@ union bpf_attr {
* Copy *size* bytes from *data* into a ring buffer *ringbuf*.
* If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
* of new data availability is sent.
+ * If **BPF_RB_MAY_WAKEUP** is specified in *flags*, notification
+ * of new data availability is sent if needed.
* If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
* of new data availability is sent unconditionally.
* Return
@@ -4066,6 +4068,7 @@ union bpf_attr {
* void *bpf_ringbuf_reserve(void *ringbuf, u64 size, u64 flags)
* Description
* Reserve *size* bytes of payload in a ring buffer *ringbuf*.
+ * *flags* must be 0.
* Return
* Valid pointer with *size* bytes of memory available; NULL,
* otherwise.
@@ -4075,6 +4078,8 @@ union bpf_attr {
* Submit reserved ring buffer sample, pointed to by *data*.
* If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
* of new data availability is sent.
+ * If **BPF_RB_MAY_WAKEUP** is specified in *flags*, notification
+ * of new data availability is sent if needed.
* If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
* of new data availability is sent unconditionally.
* Return
@@ -4085,6 +4090,8 @@ union bpf_attr {
* Discard reserved ring buffer sample, pointed to by *data*.
* If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
* of new data availability is sent.
+ * If **BPF_RB_MAY_WAKEUP** is specified in *flags*, notification
+ * of new data availability is sent if needed.
* If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
* of new data availability is sent unconditionally.
* Return
@@ -4965,6 +4972,7 @@ enum {
* BPF_FUNC_bpf_ringbuf_output flags.
*/
enum {
+ BPF_RB_MAY_WAKEUP = 0,
BPF_RB_NO_WAKEUP = (1ULL << 0),
BPF_RB_FORCE_WAKEUP = (1ULL << 1),
};
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index ab9f2233607c..3d6d324184c0 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -4058,6 +4058,8 @@ union bpf_attr {
* Copy *size* bytes from *data* into a ring buffer *ringbuf*.
* If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
* of new data availability is sent.
+ * If **BPF_RB_MAY_WAKEUP** is specified in *flags*, notification
+ * of new data availability is sent if needed.
* If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
* of new data availability is sent unconditionally.
* Return
@@ -4066,6 +4068,7 @@ union bpf_attr {
* void *bpf_ringbuf_reserve(void *ringbuf, u64 size, u64 flags)
* Description
* Reserve *size* bytes of payload in a ring buffer *ringbuf*.
+ * *flags* must be 0.
* Return
* Valid pointer with *size* bytes of memory available; NULL,
* otherwise.
@@ -4075,6 +4078,8 @@ union bpf_attr {
* Submit reserved ring buffer sample, pointed to by *data*.
* If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
* of new data availability is sent.
+ * If **BPF_RB_MAY_WAKEUP** is specified in *flags*, notification
+ * of new data availability is sent if needed.
* If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
* of new data availability is sent unconditionally.
* Return
@@ -4085,6 +4090,8 @@ union bpf_attr {
* Discard reserved ring buffer sample, pointed to by *data*.
* If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
* of new data availability is sent.
+ * If **BPF_RB_MAY_WAKEUP** is specified in *flags*, notification
+ * of new data availability is sent if needed.
* If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
* of new data availability is sent unconditionally.
* Return
@@ -4959,6 +4966,7 @@ enum {
* BPF_FUNC_bpf_ringbuf_output flags.
*/
enum {
+ BPF_RB_MAY_WAKEUP = 0,
BPF_RB_NO_WAKEUP = (1ULL << 0),
BPF_RB_FORCE_WAKEUP = (1ULL << 1),
};
diff --git a/tools/testing/selftests/bpf/progs/ima.c b/tools/testing/selftests/bpf/progs/ima.c
index 96060ff4ffc6..0f4daced6aad 100644
--- a/tools/testing/selftests/bpf/progs/ima.c
+++ b/tools/testing/selftests/bpf/progs/ima.c
@@ -38,7 +38,7 @@ void BPF_PROG(ima, struct linux_binprm *bprm)
return;
*sample = ima_hash;
- bpf_ringbuf_submit(sample, 0);
+ bpf_ringbuf_submit(sample, BPF_RB_MAY_WAKEUP);
}
return;
diff --git a/tools/testing/selftests/bpf/progs/ringbuf_bench.c b/tools/testing/selftests/bpf/progs/ringbuf_bench.c
index 123607d314d6..808e2e0e3d64 100644
--- a/tools/testing/selftests/bpf/progs/ringbuf_bench.c
+++ b/tools/testing/selftests/bpf/progs/ringbuf_bench.c
@@ -24,7 +24,7 @@ static __always_inline long get_flags()
long sz;
if (!wakeup_data_size)
- return 0;
+ return BPF_RB_MAY_WAKEUP;
sz = bpf_ringbuf_query(&ringbuf, BPF_RB_AVAIL_DATA);
return sz >= wakeup_data_size ? BPF_RB_FORCE_WAKEUP : BPF_RB_NO_WAKEUP;
diff --git a/tools/testing/selftests/bpf/progs/test_ringbuf.c b/tools/testing/selftests/bpf/progs/test_ringbuf.c
index 8ba9959b036b..03a5cbd21356 100644
--- a/tools/testing/selftests/bpf/progs/test_ringbuf.c
+++ b/tools/testing/selftests/bpf/progs/test_ringbuf.c
@@ -21,7 +21,7 @@ struct {
/* inputs */
int pid = 0;
long value = 0;
-long flags = 0;
+long flags = BPF_RB_MAY_WAKEUP;
/* outputs */
long total = 0;
diff --git a/tools/testing/selftests/bpf/progs/test_ringbuf_multi.c b/tools/testing/selftests/bpf/progs/test_ringbuf_multi.c
index edf3b6953533..f33c3fdfb1d6 100644
--- a/tools/testing/selftests/bpf/progs/test_ringbuf_multi.c
+++ b/tools/testing/selftests/bpf/progs/test_ringbuf_multi.c
@@ -71,7 +71,7 @@ int test_ringbuf(void *ctx)
sample->seq = total;
total += 1;
- bpf_ringbuf_submit(sample, 0);
+ bpf_ringbuf_submit(sample, BPF_RB_MAY_WAKEUP);
return 0;
}
--
2.25.1
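As a rough illustration of the flag values above, here is a small Python
model of the selection logic in ringbuf_bench.c. It is only a sketch and not
BPF code: the constants mirror the enum in this patch, and the avail_data
argument is an assumption standing in for the result of
bpf_ringbuf_query(&ringbuf, BPF_RB_AVAIL_DATA).

```python
# Python model of the ring buffer wakeup flags; values mirror the enum in the patch.
BPF_RB_MAY_WAKEUP = 0         # notify the consumer only if needed
BPF_RB_NO_WAKEUP = 1 << 0     # never notify
BPF_RB_FORCE_WAKEUP = 1 << 1  # always notify


def get_flags(wakeup_data_size: int, avail_data: int) -> int:
    """Mirror get_flags() from ringbuf_bench.c.

    avail_data is a stand-in for bpf_ringbuf_query(&ringbuf, BPF_RB_AVAIL_DATA).
    """
    if not wakeup_data_size:
        # No batching threshold configured: use the default behaviour.
        return BPF_RB_MAY_WAKEUP
    if avail_data >= wakeup_data_size:
        return BPF_RB_FORCE_WAKEUP
    return BPF_RB_NO_WAKEUP
```

Since BPF_RB_MAY_WAKEUP is 0, callers that previously passed a literal 0
keep the same behaviour; the named constant only makes the intent explicit.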
v1 by Uriel is here: [1].
Since it's been a while, I've dropped the Reviewed-By's.
It depended on commit 83c4e7a0363b ("KUnit: KASAN Integration") which
hadn't been merged yet, so that caused some kerfuffle with applying them
previously and the series was reverted.
This revives the series but makes the kunit_fail_current_test() function
take a format string and logs the file and line number of the failing
code, addressing Alan Maguire's comments on the previous version.
As a result, the patch that makes UBSAN errors fail the current test was
tweaked slightly to include an error message.
v2 -> v3:
Try and fail to make kunit_fail_current_test() work on CONFIG_KUNIT=m
s/_/__ on the helper func to match others in test.c
v3 -> v4:
Revert to only enabling kunit_fail_current_test() for CONFIG_KUNIT=y
[1] https://lore.kernel.org/linux-kselftest/20200806174326.3577537-1-urielguaja…
Uriel Guajardo (2):
kunit: support failure from dynamic analysis tools
kunit: ubsan integration
include/kunit/test-bug.h | 30 ++++++++++++++++++++++++++++++
lib/kunit/test.c | 39 +++++++++++++++++++++++++++++++++++----
lib/ubsan.c | 3 +++
3 files changed, 68 insertions(+), 4 deletions(-)
create mode 100644 include/kunit/test-bug.h
base-commit: a74e6a014c9d4d4161061f770c9b4f98372ac778
--
2.31.0.rc2.261.g7f71774620-goog
This series improves the defensive posture of sysfs's use of seq_file
to gain the vmap guard pages at the end of vmalloc buffers to stop a
class of recurring flaw[1]. The long-term goal is to switch sysfs from
a buffer to using seq_file directly, but this will take time to refactor.
Included is also a Clang fix for NULL arithmetic and an LKDTM test to
validate vmalloc guard pages.
v4:
- fix NULL arithmetic (Arnd)
- add lkdtm test
- reword commit message
v3: https://lore.kernel.org/lkml/20210401022145.2019422-1-keescook@chromium.org/
v2: https://lore.kernel.org/lkml/20210315174851.622228-1-keescook@chromium.org/
v1: https://lore.kernel.org/lkml/20210312205558.2947488-1-keescook@chromium.org/
Thanks!
-Kees
Arnd Bergmann (1):
seq_file: Fix clang warning for NULL pointer arithmetic
Kees Cook (2):
lkdtm/heap: Add vmalloc linear overflow test
sysfs: Unconditionally use vmalloc for buffer
drivers/misc/lkdtm/core.c | 3 ++-
drivers/misc/lkdtm/heap.c | 21 +++++++++++++++++-
drivers/misc/lkdtm/lkdtm.h | 3 ++-
fs/kernfs/file.c | 9 +++++---
fs/seq_file.c | 5 ++++-
fs/sysfs/file.c | 29 +++++++++++++++++++++++++
include/linux/seq_file.h | 6 +++++
tools/testing/selftests/lkdtm/tests.txt | 3 ++-
8 files changed, 71 insertions(+), 8 deletions(-)
--
2.25.1
v1 by Uriel is here: [1].
Since it's been a while, I've dropped the Reviewed-By's.
It depended on commit 83c4e7a0363b ("KUnit: KASAN Integration") which
hadn't been merged yet, so that caused some kerfuffle with applying them
previously and the series was reverted.
This revives the series but makes the kunit_fail_current_test() function
take a format string and logs the file and line number of the failing
code, addressing Alan Maguire's comments on the previous version.
As a result, the patch that makes UBSAN errors fail the current test was
tweaked slightly to include an error message.
v2 -> v3:
Try and fail to make kunit_fail_current_test() work on CONFIG_KUNIT=m
s/_/__ on the helper func to match others in test.c
v3 -> v4:
Revert to only enabling kunit_fail_current_test() for CONFIG_KUNIT=y
v4 -> v5:
Delete blank line to make checkpatch.pl --strict happy
[1] https://lore.kernel.org/linux-kselftest/20200806174326.3577537-1-urielguaja…
Uriel Guajardo (2):
kunit: support failure from dynamic analysis tools
kunit: ubsan integration
include/kunit/test-bug.h | 29 +++++++++++++++++++++++++++++
lib/kunit/test.c | 39 +++++++++++++++++++++++++++++++++++----
lib/ubsan.c | 3 +++
3 files changed, 67 insertions(+), 4 deletions(-)
create mode 100644 include/kunit/test-bug.h
base-commit: 1678e493d530e7977cce34e59a86bb86f3c5631e
--
2.31.0.208.g409f899ff0-goog
This patch set has several miscellaneous fixes to the resctrl selftest tool
that are easily visible to users. V1 had fixes to the CAT test and CMT test,
but they were dropped in V2 because including them made the patch set
humongous. Changes to the CAT test and CMT test will therefore be posted in
a separate patch set.
Change Log:
v6:
- Add Tested-by: Babu Moger <babu.moger(a)amd.com>.
- Replace "cat" by CAT_STR etc (Babu).
- Capitalize the first letter of printed message (Babu).
v5:
- Address various comments from Shuah Khan:
1. Move a few fixing patches before cleaning patches.
2. Call kselftest APIs to log test results instead of printf().
3. Add .gitignore to ignore resctrl_tests.
4. Share show_cache_info() in CAT and CMT tests.
5. Define long_mask, cbm_mask, count_of_bits etc as static variables.
v4:
- Address various comments from Shuah Khan:
1. Combine a few patches e.g. a couple of fixing typos patches into one
and a couple of unmounting patches into one etc.
2. Add config file.
3. Remove "Fixes" tags.
4. Change strcmp() to strncmp().
5. Move the global variable fixing patch to the patch 1 so that the
compilation issue is fixed first.
Please note:
- I didn't move the patch renaming CQM to CMT to the end of the series
because code and commit messages in a few other patches depend on the
new term "CMT". If the renaming patch were moved to the end, the earlier
patches would use the old "CQM" term and code that would then be changed
at the end of the series, causing more churn and explanation.
[v3: https://lkml.org/lkml/2020/10/28/137]
v3:
Address various comments (commit messages, return value on test failure,
print failure info on test failure etc) from Reinette and Tony.
[v2: https://lore.kernel.org/linux-kselftest/cover.1589835155.git.sai.praneeth.p…]
v2:
1. Dropped changes to CAT test and CMT test as they will be posted in a later
series.
2. Added several other fixes
[v1: https://lore.kernel.org/linux-kselftest/cover.1583657204.git.sai.praneeth.p…]
Fenghua Yu (19):
selftests/resctrl: Enable gcc checks to detect buffer overflows
selftests/resctrl: Fix compilation issues for global variables
selftests/resctrl: Fix compilation issues for other global variables
selftests/resctrl: Clean up resctrl features check
selftests/resctrl: Fix missing options "-n" and "-p"
selftests/resctrl: Rename CQM test as CMT test
selftests/resctrl: Call kselftest APIs to log test results
selftests/resctrl: Share show_cache_info() by CAT and CMT tests
selftests/resctrl: Add config dependencies
selftests/resctrl: Check for resctrl mount point only if resctrl FS is
supported
selftests/resctrl: Use resctrl/info for feature detection
selftests/resctrl: Fix MBA/MBM results reporting format
selftests/resctrl: Don't hard code value of "no_of_bits" variable
selftests/resctrl: Modularize resctrl test suite main() function
selftests/resctrl: Skip the test if requested resctrl feature is not
supported
selftests/resctrl: Fix unmount resctrl FS
selftests/resctrl: Fix incorrect parsing of iMC counters
selftests/resctrl: Fix checking for < 0 for unsigned values
selftests/resctrl: Create .gitignore to include resctrl_tests
Reinette Chatre (2):
selftests/resctrl: Ensure sibling CPU is not same as original CPU
selftests/resctrl: Fix a printed message
tools/testing/selftests/resctrl/.gitignore | 2 +
tools/testing/selftests/resctrl/Makefile | 2 +-
tools/testing/selftests/resctrl/README | 4 +-
tools/testing/selftests/resctrl/cache.c | 52 +++++-
tools/testing/selftests/resctrl/cat_test.c | 57 ++----
.../resctrl/{cqm_test.c => cmt_test.c} | 75 +++-----
tools/testing/selftests/resctrl/config | 2 +
tools/testing/selftests/resctrl/fill_buf.c | 4 +-
tools/testing/selftests/resctrl/mba_test.c | 43 ++---
tools/testing/selftests/resctrl/mbm_test.c | 42 ++---
tools/testing/selftests/resctrl/resctrl.h | 29 +++-
.../testing/selftests/resctrl/resctrl_tests.c | 163 ++++++++++++------
tools/testing/selftests/resctrl/resctrl_val.c | 95 ++++++----
tools/testing/selftests/resctrl/resctrlfs.c | 134 ++++++++------
14 files changed, 408 insertions(+), 296 deletions(-)
create mode 100644 tools/testing/selftests/resctrl/.gitignore
rename tools/testing/selftests/resctrl/{cqm_test.c => cmt_test.c} (56%)
create mode 100644 tools/testing/selftests/resctrl/config
--
2.31.0
The 'single_cpu_test' parameter was odd and no longer exists; it has been
replaced by 'nr_threads'. If 'nr_threads' is not set, the test behaves as
it did with the former parameter.
Update the "stress mode" accordingly by specifying a number of workers
equal to the number of CPUs.
Also update the help message output to match the new interface.
CC: linux-kselftest(a)vger.kernel.org
CC: Shuah Khan <shuah(a)kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki(a)gmail.com>
---
tools/testing/selftests/vm/test_vmalloc.sh | 21 +++++++++++----------
1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/tools/testing/selftests/vm/test_vmalloc.sh b/tools/testing/selftests/vm/test_vmalloc.sh
index 06d2bb109f06..d73b846736f1 100755
--- a/tools/testing/selftests/vm/test_vmalloc.sh
+++ b/tools/testing/selftests/vm/test_vmalloc.sh
@@ -11,6 +11,7 @@
TEST_NAME="vmalloc"
DRIVER="test_${TEST_NAME}"
+NUM_CPUS=`grep -c ^processor /proc/cpuinfo`
# 1 if fails
exitcode=1
@@ -22,9 +23,9 @@ ksft_skip=4
# Static templates for performance, stressing and smoke tests.
# Also it is possible to pass any supported parameters manualy.
#
-PERF_PARAM="single_cpu_test=1 sequential_test_order=1 test_repeat_count=3"
-SMOKE_PARAM="single_cpu_test=1 test_loop_count=10000 test_repeat_count=10"
-STRESS_PARAM="test_repeat_count=20"
+PERF_PARAM="sequential_test_order=1 test_repeat_count=3"
+SMOKE_PARAM="test_loop_count=10000 test_repeat_count=10"
+STRESS_PARAM="nr_threads=$NUM_CPUS test_repeat_count=20"
check_test_requirements()
{
@@ -58,8 +59,8 @@ run_perfformance_check()
run_stability_check()
{
- echo "Run stability tests. In order to stress vmalloc subsystem we run"
- echo "all available test cases on all available CPUs simultaneously."
+ echo "Run stability tests. In order to stress vmalloc subsystem all"
+ echo "available test cases are run by NUM_CPUS workers simultaneously."
echo "It will take time, so be patient."
modprobe $DRIVER $STRESS_PARAM > /dev/null 2>&1
@@ -92,17 +93,17 @@ usage()
echo "# Shows help message"
echo "./${DRIVER}.sh"
echo
- echo "# Runs 1 test(id_1), repeats it 5 times on all online CPUs"
- echo "./${DRIVER}.sh run_test_mask=1 test_repeat_count=5"
+ echo "# Runs 1 test(id_1), repeats it 5 times by NUM_CPUS workers"
+ echo "./${DRIVER}.sh nr_threads=$NUM_CPUS run_test_mask=1 test_repeat_count=5"
echo
echo -n "# Runs 4 tests(id_1|id_2|id_4|id_16) on one CPU with "
echo "sequential order"
- echo -n "./${DRIVER}.sh single_cpu_test=1 sequential_test_order=1 "
+ echo -n "./${DRIVER}.sh sequential_test_order=1 "
echo "run_test_mask=23"
echo
- echo -n "# Runs all tests on all online CPUs, shuffled order, repeats "
+ echo -n "# Runs all tests by NUM_CPUS workers, shuffled order, repeats "
echo "20 times"
- echo "./${DRIVER}.sh test_repeat_count=20"
+ echo "./${DRIVER}.sh nr_threads=$NUM_CPUS test_repeat_count=20"
echo
echo "# Performance analysis"
echo "./${DRIVER}.sh performance"
--
2.20.1
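To make the new stress template concrete, the following Python sketch
mirrors what the script does: count the "processor" lines the way
`grep -c ^processor /proc/cpuinfo` would, then build the module parameter
string. The helper names and the sample text are assumptions made for
illustration; the script itself stays in shell.

```python
def count_cpus(cpuinfo_text: str) -> int:
    """Count lines beginning with 'processor', like `grep -c ^processor`."""
    return sum(1 for line in cpuinfo_text.splitlines()
               if line.startswith("processor"))


def stress_param(num_cpus: int, repeat_count: int = 20) -> str:
    """Build the parameter string used by the stress template."""
    return f"nr_threads={num_cpus} test_repeat_count={repeat_count}"


# Hypothetical two-CPU excerpt in the format of /proc/cpuinfo.
sample = "processor\t: 0\nmodel name\t: x\nprocessor\t: 1\nmodel name\t: x\n"
print(stress_param(count_cpus(sample)))
```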
TL;DR
$ ./tools/testing/kunit/kunit.py run --kunitconfig=lib/kunit
Per suggestion from Ted [1], we can reduce the amount of typing by
assuming a convention that these files are named '.kunitconfig'.
In the case of [1], we now have
$ ./tools/testing/kunit/kunit.py run --kunitconfig=fs/ext4
Also add such a fragment for kunit itself so we can give that as an
example closer to home (and thus less likely to be accidentally
broken).
[1] https://lore.kernel.org/linux-ext4/YCNF4yP1dB97zzwD@mit.edu/
Signed-off-by: Daniel Latypov <dlatypov(a)google.com>
---
lib/kunit/.kunitconfig | 3 +++
tools/testing/kunit/kunit.py | 4 +++-
tools/testing/kunit/kunit_kernel.py | 2 ++
tools/testing/kunit/kunit_tool_test.py | 6 ++++++
4 files changed, 14 insertions(+), 1 deletion(-)
create mode 100644 lib/kunit/.kunitconfig
diff --git a/lib/kunit/.kunitconfig b/lib/kunit/.kunitconfig
new file mode 100644
index 000000000000..9235b7d42d38
--- /dev/null
+++ b/lib/kunit/.kunitconfig
@@ -0,0 +1,3 @@
+CONFIG_KUNIT=y
+CONFIG_KUNIT_TEST=y
+CONFIG_KUNIT_EXAMPLE_TEST=y
diff --git a/tools/testing/kunit/kunit.py b/tools/testing/kunit/kunit.py
index d5144fcb03ac..5da8fb3762f9 100755
--- a/tools/testing/kunit/kunit.py
+++ b/tools/testing/kunit/kunit.py
@@ -184,7 +184,9 @@ def add_common_opts(parser) -> None:
help='Run all KUnit tests through allyesconfig',
action='store_true')
parser.add_argument('--kunitconfig',
- help='Path to Kconfig fragment that enables KUnit tests',
+ help='Path to Kconfig fragment that enables KUnit tests.'
+ ' If given a directory, (e.g. lib/kunit), "/.kunitconfig" '
+ 'will get automatically appended.',
metavar='kunitconfig')
def add_build_opts(parser) -> None:
diff --git a/tools/testing/kunit/kunit_kernel.py b/tools/testing/kunit/kunit_kernel.py
index f309a33256cd..89a7d4024e87 100644
--- a/tools/testing/kunit/kunit_kernel.py
+++ b/tools/testing/kunit/kunit_kernel.py
@@ -132,6 +132,8 @@ class LinuxSourceTree(object):
return
if kunitconfig_path:
+ if os.path.isdir(kunitconfig_path):
+ kunitconfig_path = os.path.join(kunitconfig_path, KUNITCONFIG_PATH)
if not os.path.exists(kunitconfig_path):
raise ConfigError(f'Specified kunitconfig ({kunitconfig_path}) does not exist')
else:
diff --git a/tools/testing/kunit/kunit_tool_test.py b/tools/testing/kunit/kunit_tool_test.py
index 1ad3049e9069..2e809dd956a7 100755
--- a/tools/testing/kunit/kunit_tool_test.py
+++ b/tools/testing/kunit/kunit_tool_test.py
@@ -251,6 +251,12 @@ class LinuxSourceTreeTest(unittest.TestCase):
with tempfile.NamedTemporaryFile('wt') as kunitconfig:
tree = kunit_kernel.LinuxSourceTree('', kunitconfig_path=kunitconfig.name)
+ def test_dir_kunitconfig(self):
+ with tempfile.TemporaryDirectory('') as dir:
+ with open(os.path.join(dir, '.kunitconfig'), 'w') as f:
+ pass
+ tree = kunit_kernel.LinuxSourceTree('', kunitconfig_path=dir)
+
# TODO: add more test cases.
base-commit: b12b47249688915e987a9a2a393b522f86f6b7ab
--
2.30.0.617.g56c4b15f3c-goog
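The directory handling added to kunit_kernel.py can be sketched in isolation
as follows. This is a minimal stand-alone version, not the tool's code:
resolve_kunitconfig() and demo() are hypothetical names, and ConfigError is
a local stand-in for the tool's own exception class.

```python
import os
import tempfile

KUNITCONFIG_PATH = '.kunitconfig'  # conventional file name, as in the patch


class ConfigError(Exception):
    """Local stand-in for the tool's ConfigError."""


def resolve_kunitconfig(path: str) -> str:
    """Append '.kunitconfig' when given a directory, then check existence."""
    if os.path.isdir(path):
        path = os.path.join(path, KUNITCONFIG_PATH)
    if not os.path.exists(path):
        raise ConfigError(f'Specified kunitconfig ({path}) does not exist')
    return path


def demo() -> str:
    """Resolve a throwaway directory that holds an empty .kunitconfig."""
    with tempfile.TemporaryDirectory() as d:
        open(os.path.join(d, KUNITCONFIG_PATH), 'w').close()
        # Report the resolved path relative to the directory itself.
        return os.path.relpath(resolve_kunitconfig(d), d)
```

Passing a file path still works unchanged, since the join only happens when
the argument is a directory.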
This patch set has several miscellaneous fixes to the resctrl selftest tool
that are easily visible to users. V1 had fixes to the CAT test and CMT test,
but they were dropped in V2 because including them made the patch set
humongous. Changes to the CAT test and CMT test will therefore be posted in
a separate patch set.
Change Log:
v5:
- Address various comments from Shuah Khan:
1. Move a few fixing patches before cleaning patches.
2. Call kselftest APIs to log test results instead of printf().
3. Add .gitignore to ignore resctrl_tests.
4. Share show_cache_info() in CAT and CMT tests.
5. Define long_mask, cbm_mask, count_of_bits etc as static variables.
v4:
- Address various comments from Shuah Khan:
1. Combine a few patches e.g. a couple of fixing typos patches into one
and a couple of unmounting patches into one etc.
2. Add config file.
3. Remove "Fixes" tags.
4. Change strcmp() to strncmp().
5. Move the global variable fixing patch to the patch 1 so that the
compilation issue is fixed first.
Please note:
- I didn't move the patch renaming CQM to CMT to the end of the series
because code and commit messages in a few other patches depend on the
new term "CMT". If the renaming patch were moved to the end, the earlier
patches would use the old "CQM" term and code that would then be changed
at the end of the series, causing more churn and explanation.
[v3: https://lkml.org/lkml/2020/10/28/137]
v3:
Address various comments (commit messages, return value on test failure,
print failure info on test failure etc) from Reinette and Tony.
[v2: https://lore.kernel.org/linux-kselftest/cover.1589835155.git.sai.praneeth.p…]
v2:
1. Dropped changes to CAT test and CMT test as they will be posted in a later
series.
2. Added several other fixes
[v1: https://lore.kernel.org/linux-kselftest/cover.1583657204.git.sai.praneeth.p…]
Fenghua Yu (19):
selftests/resctrl: Enable gcc checks to detect buffer overflows
selftests/resctrl: Fix compilation issues for global variables
selftests/resctrl: Fix compilation issues for other global variables
selftests/resctrl: Clean up resctrl features check
selftests/resctrl: Fix missing options "-n" and "-p"
selftests/resctrl: Rename CQM test as CMT test
selftests/resctrl: Call kselftest APIs to log test results
selftests/resctrl: Share show_cache_info() by CAT and CMT tests
selftests/resctrl: Add config dependencies
selftests/resctrl: Check for resctrl mount point only if resctrl FS is
supported
selftests/resctrl: Use resctrl/info for feature detection
selftests/resctrl: Fix MBA/MBM results reporting format
selftests/resctrl: Don't hard code value of "no_of_bits" variable
selftests/resctrl: Modularize resctrl test suite main() function
selftests/resctrl: Skip the test if requested resctrl feature is not
supported
selftests/resctrl: Fix unmount resctrl FS
selftests/resctrl: Fix incorrect parsing of iMC counters
selftests/resctrl: Fix checking for < 0 for unsigned values
selftests/resctrl: Create .gitignore to include resctrl_tests
Reinette Chatre (2):
selftests/resctrl: Ensure sibling CPU is not same as original CPU
selftests/resctrl: Fix a printed message
tools/testing/selftests/resctrl/.gitignore | 2 +
tools/testing/selftests/resctrl/Makefile | 2 +-
tools/testing/selftests/resctrl/README | 4 +-
tools/testing/selftests/resctrl/cache.c | 52 +++++-
tools/testing/selftests/resctrl/cat_test.c | 57 ++----
.../resctrl/{cqm_test.c => cmt_test.c} | 75 +++-----
tools/testing/selftests/resctrl/config | 2 +
tools/testing/selftests/resctrl/fill_buf.c | 4 +-
tools/testing/selftests/resctrl/mba_test.c | 43 ++---
tools/testing/selftests/resctrl/mbm_test.c | 42 ++---
tools/testing/selftests/resctrl/resctrl.h | 29 +++-
.../testing/selftests/resctrl/resctrl_tests.c | 163 ++++++++++++------
tools/testing/selftests/resctrl/resctrl_val.c | 95 ++++++----
tools/testing/selftests/resctrl/resctrlfs.c | 134 ++++++++------
14 files changed, 408 insertions(+), 296 deletions(-)
create mode 100644 tools/testing/selftests/resctrl/.gitignore
rename tools/testing/selftests/resctrl/{cqm_test.c => cmt_test.c} (56%)
create mode 100644 tools/testing/selftests/resctrl/config
--
2.30.1
This series aims to clarify the behavior of the
KVM_GET_EMULATED_CPUID and KVM_GET_SUPPORTED_CPUID
ioctls, and fix a corner case where the nent field of
struct kvm_cpuid2 matches the number of entries that KVM returns.
Patch 1 proposes the nent field fix to cpuid.c,
patch 2 updates the ioctl documentation accordingly and
patches 3 and 4 provide a selftest to check KVM_GET_EMULATED_CPUID
accordingly.
Emanuele Giuseppe Esposito (4):
kvm: cpuid: adjust the returned nent field of kvm_cpuid2 for
KVM_GET_SUPPORTED_CPUID and KVM_GET_EMULATED_CPUID
Documentation: kvm: update KVM_GET_EMULATED_CPUID ioctl description
selftests: add kvm_get_emulated_cpuid
selftests: kvm: add get_emulated_cpuid test
Documentation/virt/kvm/api.rst | 10 +-
arch/x86/kvm/cpuid.c | 6 +
tools/testing/selftests/kvm/.gitignore | 1 +
tools/testing/selftests/kvm/Makefile | 1 +
.../selftests/kvm/include/x86_64/processor.h | 1 +
.../selftests/kvm/lib/x86_64/processor.c | 33 ++++
.../selftests/kvm/x86_64/get_emulated_cpuid.c | 183 ++++++++++++++++++
7 files changed, 229 insertions(+), 6 deletions(-)
create mode 100644 tools/testing/selftests/kvm/x86_64/get_emulated_cpuid.c
--
2.30.2
This series aims to clarify the behavior of the
KVM_GET_EMULATED_CPUID and KVM_GET_SUPPORTED_CPUID
ioctls, and fix a corner case where the nent field of
struct kvm_cpuid2 matches the number of entries that KVM returns.
Patch 1 proposes the nent field fix to cpuid.c,
patch 2 updates the ioctl documentation accordingly and
patches 3 and 4 provide a selftest to check KVM_GET_EMULATED_CPUID
accordingly.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit(a)redhat.com>
---
v2:
- better fix in cpuid.c, perform the nent check after the switch statement
- fix bug in get_emulated_cpuid.c selftest: each entry needs to have at least
the padding zeroed, otherwise the test fails.
Emanuele Giuseppe Esposito (4):
kvm: cpuid: adjust the returned nent field of kvm_cpuid2 for
KVM_GET_SUPPORTED_CPUID and KVM_GET_EMULATED_CPUID
Documentation: kvm: update KVM_GET_EMULATED_CPUID ioctl description
selftests: add kvm_get_emulated_cpuid
selftests: kvm: add get_emulated_cpuid test
Documentation/virt/kvm/api.rst | 10 +-
arch/x86/kvm/cpuid.c | 35 ++--
tools/testing/selftests/kvm/.gitignore | 1 +
tools/testing/selftests/kvm/Makefile | 1 +
.../selftests/kvm/include/x86_64/processor.h | 1 +
.../selftests/kvm/lib/x86_64/processor.c | 33 +++
.../selftests/kvm/x86_64/get_emulated_cpuid.c | 198 ++++++++++++++++++
7 files changed, 256 insertions(+), 23 deletions(-)
create mode 100644 tools/testing/selftests/kvm/x86_64/get_emulated_cpuid.c
--
2.30.2
From: Ira Weiny <ira.weiny(a)intel.com>
Introduce a new page protection mechanism for supervisor pages, Protection Key
Supervisor (PKS).
Generally PKS enables protections on 'domains' of supervisor pages to limit
supervisor mode access to pages beyond the normal paging protections. PKS
works in a similar fashion to user space pkeys, PKU. As with PKU, supervisor
pkeys are checked in addition to normal paging protections, and access or writes
can be disabled via an MSR update without TLB flushes when permissions change.
Also like PKU, a page mapping is assigned to a domain by setting pkey bits in
the page table entry for that mapping.
Access is controlled through a PKRS register which is updated via WRMSR/RDMSR.
XSAVE is not supported for the PKRS MSR. Therefore the implementation
saves/restores the MSR across context switches and during exceptions. Nested
exceptions are supported by each exception getting a new PKS state.
For consistent behavior with current paging protections, pkey 0 is reserved and
configured to allow full access via the pkey mechanism, thus preserving the
default paging protections on mappings with the default pkey value of 0.
Other keys (1-15) are allocated by an allocator, which prepares us for key
contention from day one. Kernel users should be prepared for the allocator to
fail, either because of key exhaustion or because PKS is not supported on the
CPU instance.
The following are key attributes of PKS.
1) Fast switching of permissions
1a) Prevents access without page table manipulations
1b) No TLB flushes required
2) Works on a per thread basis
PKS is available with 4 and 5 level paging. Like PKRU it consumes 4 bits from
the PTE to store the pkey within the entry.
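As a rough model of how a per-key permission register behaves, the Python
sketch below assumes PKRS uses the same layout as the user space PKRU
register: two bits per key, an access-disable bit and a write-disable bit.
All names are hypothetical, and the default value is only illustrative of
"pkey 0 allows full access"; it is not taken from the patches.

```python
# Illustrative model only; the names below are hypothetical, not the kernel's API.
PKR_BITS_PER_PKEY = 2  # assumption: same two-bit layout as user space PKRU
PKR_AD_BIT = 0x1       # access disable
PKR_WD_BIT = 0x2       # write disable
NR_PKEYS = 16          # 4 pkey bits in the PTE give keys 0..15


def set_perms(pkrs: int, pkey: int, bits: int) -> int:
    """Return a new register value with the two bits for pkey replaced."""
    shift = pkey * PKR_BITS_PER_PKEY
    mask = 0x3 << shift
    return (pkrs & ~mask) | ((bits & 0x3) << shift)


def access_allowed(pkrs: int, pkey: int, write: bool) -> bool:
    """Check the pkey bits only; normal paging protections apply on top."""
    bits = (pkrs >> (pkey * PKR_BITS_PER_PKEY)) & 0x3
    if bits & PKR_AD_BIT:
        return False
    return not (write and bits & PKR_WD_BIT)


def default_pkrs() -> int:
    """Illustrative default: key 0 fully open, keys 1-15 access-disabled."""
    value = 0
    for pkey in range(1, NR_PKEYS):
        value = set_perms(value, pkey, PKR_AD_BIT)
    return value
```

Changing permissions for a domain is then a single register update, which
matches the "no page table manipulation, no TLB flush" property listed
above.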
All code to support PKS is configured via ARCH_ENABLE_SUPERVISOR_PKEYS, which
is designed to be turned on only when a user of PKS is configured in the kernel.
Those users must depend on ARCH_HAS_SUPERVISOR_PKEYS to work properly with
other architectures which do not yet support PKS.
Originally this series was submitted as part of a large patch set which
converted the kmap call sites.[1]
Many follow-on discussions revealed a few problems. The first was
that some callers leak a kmap mapping across threads rather than containing it
to a critical section. Attempts were made to see if these 'global kmaps' could
be supported.[2] However, supporting global kmaps had many problems. Work is
being done in parallel on converting as many kmap calls as possible to the new
kmap_local_page().[3]
Changes from V4 [5]
From kernel test robot <lkp(a)intel.com>
Fix i386 build: pks_init_task not found
Move MSR_IA32_PKRS and INIT_PKRS_VALUE into patch 5 where they are
first 'used'. (Technically nothing is 'used' until the final
test patch. But review wise this is much cleaner.)
From Sean Christopherson
Add documentation details on what happens if the pkey is violated
Change cpu_feature_enabled to be in WARN_ON check
Clean up commit message of patch 6
Fix some checkpatch errors
[1] https://lore.kernel.org/lkml/20201009195033.3208459-1-ira.weiny@intel.com/
[2] https://lore.kernel.org/lkml/87mtycqcjf.fsf@nanos.tec.linutronix.de/
[3] https://lore.kernel.org/lkml/20210128061503.1496847-1-ira.weiny@intel.com/
    https://lore.kernel.org/lkml/20210210062221.3023586-1-ira.weiny@intel.com/
    https://lore.kernel.org/lkml/20210205170030.856723-1-ira.weiny@intel.com/
    https://lore.kernel.org/lkml/20210217024826.3466046-1-ira.weiny@intel.com/
[4] https://lore.kernel.org/lkml/20201106232908.364581-1-ira.weiny@intel.com/
[5] https://lore.kernel.org/lkml/20210322053020.2287058-1-ira.weiny@intel.com/
Fenghua Yu (1):
x86/pks: Add PKS kernel API
Ira Weiny (9):
x86/pkeys: Create pkeys_common.h
x86/fpu: Refactor arch_set_user_pkey_access() for PKS support
x86/pks: Add additional PKEY helper macros
x86/pks: Add PKS defines and Kconfig options
x86/pks: Add PKS setup code
x86/fault: Adjust WARN_ON for PKey fault
x86/pks: Preserve the PKRS MSR on context switch
x86/entry: Preserve PKRS MSR across exceptions
x86/pks: Add PKS test code
Documentation/core-api/protection-keys.rst | 112 +++-
arch/x86/Kconfig | 1 +
arch/x86/entry/calling.h | 26 +
arch/x86/entry/common.c | 58 ++
arch/x86/entry/entry_64.S | 22 +-
arch/x86/entry/entry_64_compat.S | 6 +-
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/disabled-features.h | 8 +-
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/include/asm/pgtable.h | 10 +-
arch/x86/include/asm/pgtable_types.h | 12 +
arch/x86/include/asm/pkeys.h | 4 +
arch/x86/include/asm/pkeys_common.h | 34 +
arch/x86/include/asm/pks.h | 54 ++
arch/x86/include/asm/processor-flags.h | 2 +
arch/x86/include/asm/processor.h | 47 +-
arch/x86/include/uapi/asm/processor-flags.h | 2 +
arch/x86/kernel/cpu/common.c | 2 +
arch/x86/kernel/fpu/xstate.c | 22 +-
arch/x86/kernel/head_64.S | 7 +-
arch/x86/kernel/process.c | 3 +
arch/x86/kernel/process_64.c | 2 +
arch/x86/mm/fault.c | 31 +-
arch/x86/mm/pkeys.c | 218 +++++-
include/linux/pgtable.h | 4 +
include/linux/pkeys.h | 34 +
kernel/entry/common.c | 14 +-
lib/Kconfig.debug | 11 +
lib/Makefile | 3 +
lib/pks/Makefile | 3 +
lib/pks/pks_test.c | 693 ++++++++++++++++++++
mm/Kconfig | 5 +
tools/testing/selftests/x86/Makefile | 3 +-
tools/testing/selftests/x86/test_pks.c | 150 +++++
34 files changed, 1528 insertions(+), 77 deletions(-)
create mode 100644 arch/x86/include/asm/pkeys_common.h
create mode 100644 arch/x86/include/asm/pks.h
create mode 100644 lib/pks/Makefile
create mode 100644 lib/pks/pks_test.c
create mode 100644 tools/testing/selftests/x86/test_pks.c
--
2.28.0.rc0.12.gb6a658bd00c9
From: Mike Rapoport <rppt(a)linux.ibm.com>
Yuri Norov says:
If parameter size is the same for native and compat ABIs, we may
wire a syscall made by compat client to native handler. This is
true for unsigned int, but not true for unsigned long or pointer.
That's why I suggest using unsigned int and so avoid creating compat
entry point.
Use unsigned int as the type of the flags parameter in memfd_secret()
system call.
Signed-off-by: Mike Rapoport <rppt(a)linux.ibm.com>
---
@Andrew,
The patch is against v5.12-rc5-mmots-2021-03-30-23; I'd appreciate it if it
were added as a fixup to the memfd_secret series.
include/linux/syscalls.h | 2 +-
mm/secretmem.c | 2 +-
tools/testing/selftests/vm/memfd_secret.c | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 49c93c906893..1a1b5d724497 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -1050,7 +1050,7 @@ asmlinkage long sys_landlock_create_ruleset(const struct landlock_ruleset_attr _
asmlinkage long sys_landlock_add_rule(int ruleset_fd, enum landlock_rule_type rule_type,
const void __user *rule_attr, __u32 flags);
asmlinkage long sys_landlock_restrict_self(int ruleset_fd, __u32 flags);
-asmlinkage long sys_memfd_secret(unsigned long flags);
+asmlinkage long sys_memfd_secret(unsigned int flags);
/*
* Architecture-specific system calls
diff --git a/mm/secretmem.c b/mm/secretmem.c
index f2ae3f32a193..3b1ba3991964 100644
--- a/mm/secretmem.c
+++ b/mm/secretmem.c
@@ -199,7 +199,7 @@ static struct file *secretmem_file_create(unsigned long flags)
return file;
}
-SYSCALL_DEFINE1(memfd_secret, unsigned long, flags)
+SYSCALL_DEFINE1(memfd_secret, unsigned int, flags)
{
struct file *file;
int fd, err;
diff --git a/tools/testing/selftests/vm/memfd_secret.c b/tools/testing/selftests/vm/memfd_secret.c
index c878c2b841fc..2462f52e9c96 100644
--- a/tools/testing/selftests/vm/memfd_secret.c
+++ b/tools/testing/selftests/vm/memfd_secret.c
@@ -38,7 +38,7 @@ static unsigned long page_size;
static unsigned long mlock_limit_cur;
static unsigned long mlock_limit_max;
-static int memfd_secret(unsigned long flags)
+static int memfd_secret(unsigned int flags)
{
return syscall(__NR_memfd_secret, flags);
}
--
2.28.0
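The size argument quoted in the commit message can be spelled out with a
small table. The sketch below is only an illustration: it assumes the usual
ILP32 data model for 32-bit compat tasks and LP64 for the 64-bit native
kernel, and the helper name is hypothetical.

```python
# Type sizes in bytes under the two C data models involved:
# ILP32 (32-bit compat tasks) and LP64 (64-bit native kernel).
SIZES = {
    "unsigned int":  {"ILP32": 4, "LP64": 4},
    "unsigned long": {"ILP32": 4, "LP64": 8},
    "pointer":       {"ILP32": 4, "LP64": 8},
}


def same_size_in_both_abis(ctype: str) -> bool:
    """True when a parameter of this type has the same size in both ABIs."""
    sizes = SIZES[ctype]
    return sizes["ILP32"] == sizes["LP64"]


for name in SIZES:
    print(f"{name}: same size in both ABIs = {same_size_in_both_abis(name)}")
```

Only "unsigned int" has the same size in both models, which is why the
flags parameter is switched to that type.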
The perf subsystem today unifies various tracing and monitoring
features, from both software and hardware. One benefit of the perf
subsystem is automatically inheriting events to child tasks, which
enables process-wide events monitoring with low overheads. By default
perf events are non-intrusive, not affecting behaviour of the tasks
being monitored.
For certain use-cases, however, it makes sense to leverage the
generality of the perf events subsystem and optionally allow the tasks
being monitored to receive signals on events they are interested in.
This patch series adds the option to synchronously signal user space on
events.
To better support process-wide synchronous self-monitoring, without
events propagating to children that do not share the current process's
shared environment, two pre-requisite patches are added to optionally
restrict inheritance to CLONE_THREAD, and remove events on exec (without
affecting the parent).
Examples of how to use these features can be found in the tests added at
the end of the series. In addition to the tests added, the series has
also been subjected to syzkaller fuzzing (with a focus on 'kernel/events/'
coverage).
Motivation and Example Uses
---------------------------
1. Our immediate motivation is low-overhead sampling-based race
detection for user space [1]. By using perf_event_open() at
process initialization, we can create hardware
breakpoint/watchpoint events that are propagated automatically
to all threads in a process. As far as we are aware, today no
existing kernel facility (such as ptrace) allows us to set up
process-wide watchpoints with minimal overheads (that are
comparable to mprotect() of whole pages).
2. Other low-overhead error detectors that rely on detecting
accesses to certain memory locations or code, process-wide and
also only in a specific set of subtasks or threads.
[1] https://llvm.org/devmtg/2020-09/slides/Morehouse-GWP-Tsan.pdf
Other use-case ideas that we found interesting; these are only meant to
illustrate the range of potential and further motivate the utility (we're
sure there are more):
3. Code hot patching without full stop-the-world. Specifically, set
a code breakpoint at the entry to the patched routine, then
send signals to threads and check that they are not in the
routine, but without stopping them further. If any of the
threads enters the routine, it will receive SIGTRAP and
pause.
4. Safepoints without mprotect(). Some Java implementations use
"load from a known memory location" as a safepoint. When threads
need to be stopped, the page containing the location is
mprotect()ed and threads get a signal. This could be replaced with
a watchpoint, which does not require a whole page nor DTLB
shootdowns.
5. Threads receiving signals on performance events to
throttle/unthrottle themselves.
6. Tracking data flow globally.
Changelog
---------
v3:
* Add patch "perf: Rework perf_event_exit_event()" to beginning of
series, courtesy of Peter Zijlstra.
* Rework "perf: Add support for event removal on exec" based on
the added "perf: Rework perf_event_exit_event()".
* Fix kselftests to work with more recent libc, due to the way it forces
using the kernel's own siginfo_t.
* Add basic perf-tool built-in test.
v2/RFC: https://lkml.kernel.org/r/20210310104139.679618-1-elver@google.com
* Patch "Support only inheriting events if cloned with CLONE_THREAD"
added to series.
* Patch "Add support for event removal on exec" added to series.
* Patch "Add kselftest for process-wide sigtrap handling" added to
series.
* Patch "Add kselftest for remove_on_exec" added to series.
* Implicitly restrict inheriting events if sigtrap, but the child was
cloned with CLONE_CLEAR_SIGHAND, because it is not generally safe if
the child cleared all signal handlers to continue sending SIGTRAP.
* Various minor fixes (see details in patches).
v1/RFC: https://lkml.kernel.org/r/20210223143426.2412737-1-elver@google.com
Pre-series: The discussion at [2] led to the changes in this series. The
approach taken in "Add support for SIGTRAP on perf events" to trigger
the signal was suggested by Peter Zijlstra in [3].
[2] https://lore.kernel.org/lkml/CACT4Y+YPrXGw+AtESxAgPyZ84TYkNZdP0xpocX2jwVAbZ…
[3] https://lore.kernel.org/lkml/YBv3rAT566k+6zjg@hirez.programming.kicks-ass.n…
Marco Elver (10):
perf: Apply PERF_EVENT_IOC_MODIFY_ATTRIBUTES to children
perf: Support only inheriting events if cloned with CLONE_THREAD
perf: Add support for event removal on exec
signal: Introduce TRAP_PERF si_code and si_perf to siginfo
perf: Add support for SIGTRAP on perf events
perf: Add breakpoint information to siginfo on SIGTRAP
selftests/perf_events: Add kselftest for process-wide sigtrap handling
selftests/perf_events: Add kselftest for remove_on_exec
tools headers uapi: Sync tools/include/uapi/linux/perf_event.h
perf test: Add basic stress test for sigtrap handling
Peter Zijlstra (1):
perf: Rework perf_event_exit_event()
arch/m68k/kernel/signal.c | 3 +
arch/x86/kernel/signal_compat.c | 5 +-
fs/signalfd.c | 4 +
include/linux/compat.h | 2 +
include/linux/perf_event.h | 6 +-
include/linux/signal.h | 1 +
include/uapi/asm-generic/siginfo.h | 6 +-
include/uapi/linux/perf_event.h | 5 +-
include/uapi/linux/signalfd.h | 4 +-
kernel/events/core.c | 297 +++++++++++++-----
kernel/fork.c | 2 +-
kernel/signal.c | 11 +
tools/include/uapi/linux/perf_event.h | 5 +-
tools/perf/tests/Build | 1 +
tools/perf/tests/builtin-test.c | 5 +
tools/perf/tests/sigtrap.c | 148 +++++++++
tools/perf/tests/tests.h | 1 +
.../testing/selftests/perf_events/.gitignore | 3 +
tools/testing/selftests/perf_events/Makefile | 6 +
tools/testing/selftests/perf_events/config | 1 +
.../selftests/perf_events/remove_on_exec.c | 260 +++++++++++++++
tools/testing/selftests/perf_events/settings | 1 +
.../selftests/perf_events/sigtrap_threads.c | 206 ++++++++++++
23 files changed, 896 insertions(+), 87 deletions(-)
create mode 100644 tools/perf/tests/sigtrap.c
create mode 100644 tools/testing/selftests/perf_events/.gitignore
create mode 100644 tools/testing/selftests/perf_events/Makefile
create mode 100644 tools/testing/selftests/perf_events/config
create mode 100644 tools/testing/selftests/perf_events/remove_on_exec.c
create mode 100644 tools/testing/selftests/perf_events/settings
create mode 100644 tools/testing/selftests/perf_events/sigtrap_threads.c
--
2.31.0.291.g576ba9dcdaf-goog
Previously, we shared too much of the code with COPY and ZEROPAGE, so we
manipulated things in various invalid ways:
- Previously, we unconditionally called shmem_inode_acct_block. In the
continue case, we're looking up an existing page which would have been
accounted for properly when it was allocated. So doing it twice
results in double-counting, and eventually leaking.
- Previously, we made the pte writable whenever the VMA was writable.
However, for continue, consider this case:
1. A tmpfs file was created
2. The non-UFFD-registered side mmap()-s with MAP_SHARED
3. The UFFD-registered side mmap()-s with MAP_PRIVATE
In this case, even though the UFFD-registered VMA may be writable, we
still want CoW behavior. So, check for this case and don't make the
pte writable.
- The initial pgoff / max_off check isn't necessary, so we can skip past
it. The second one seems likely to be unnecessary too, but keep it
just in case. Modify both checks to use pgoff, as offset is equivalent
and not needed.
- Previously, we unconditionally called ClearPageDirty() in the error
path. In the continue case though, since this is an existing page, it
might have already been dirty before we started touching it. It's very
problematic to clear the bit incorrectly, but not a problem to leave
it - so, just omit the ClearPageDirty() entirely.
- Previously, we unconditionally removed the page from the page cache in
the error path. But in the continue case, we didn't add it - it was
already there because the page is present in some second
(non-UFFD-registered) mapping. So, removing it is invalid.
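The pte writability rule from the list above can be sketched in isolation as follows. This is a standalone illustration, not kernel code: the SKETCH_VM_* values and the function names are hypothetical stand-ins for the kernel's vm_flags bits and for the logic in shmem_mcopy_atomic_pte.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag values only; the kernel defines its own VM_* bits. */
#define SKETCH_VM_WRITE  0x2UL
#define SKETCH_VM_SHARED 0x8UL

/*
 * Decide whether the installed pte may be writable. For CONTINUE on a
 * non-shared (MAP_PRIVATE) VMA the pte stays read-only so that a later
 * write still goes through copy-on-write.
 */
bool sketch_pte_should_be_writable(unsigned long vm_flags, bool is_continue)
{
	if (is_continue && !(vm_flags & SKETCH_VM_SHARED))
		return false;
	return (vm_flags & SKETCH_VM_WRITE) != 0;
}

/* Walk through the tmpfs scenario described in the list above. */
void sketch_check_scenario(void)
{
	/* UFFD-registered side: MAP_PRIVATE and writable, CONTINUE. */
	assert(!sketch_pte_should_be_writable(SKETCH_VM_WRITE, true));
	/* The same VMA handled by COPY or ZEROPAGE keeps the old rule. */
	assert(sketch_pte_should_be_writable(SKETCH_VM_WRITE, false));
	/* A MAP_SHARED writable VMA may get a writable pte on CONTINUE. */
	assert(sketch_pte_should_be_writable(SKETCH_VM_WRITE | SKETCH_VM_SHARED,
					     true));
}
```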
Because the error handling issues are easy to exercise in the selftest,
make a small modification there to do so.
Finally, refactor shmem_mcopy_atomic_pte a bit. By this point, we've
added a lot of "if (!is_continue)"-s everywhere. It's cleaner to just
check for that mode first thing, and then "goto" down to where the parts
we actually want are. This leaves the code in between cleaner.
Changes since v2:
- Drop the ClearPageDirty() entirely, instead of trying to remember the
old value.
- Modify both pgoff / max_off checks to use pgoff. It's equivalent to
offset, but offset wasn't initialized until the first check (which
we're skipping).
- Keep the second pgoff / max_off check in the continue case.
Changes since v1:
- Refactor to skip ahead with goto, instead of adding several more
"if (!is_continue)".
- Fix unconditional ClearPageDirty().
- Don't pte_mkwrite() when is_continue && !VM_SHARED.
Fixes: 00da60b9d0a0 ("userfaultfd: support minor fault handling for shmem")
Signed-off-by: Axel Rasmussen <axelrasmussen(a)google.com>
---
mm/shmem.c | 60 +++++++++++++-----------
tools/testing/selftests/vm/userfaultfd.c | 12 +++++
2 files changed, 44 insertions(+), 28 deletions(-)
diff --git a/mm/shmem.c b/mm/shmem.c
index d2e0e81b7d2e..fbcce850a16e 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2377,18 +2377,22 @@ int shmem_mcopy_atomic_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
struct page *page;
pte_t _dst_pte, *dst_pte;
int ret;
- pgoff_t offset, max_off;
-
- ret = -ENOMEM;
- if (!shmem_inode_acct_block(inode, 1))
- goto out;
+ pgoff_t max_off;
+ int writable;
if (is_continue) {
ret = -EFAULT;
page = find_lock_page(mapping, pgoff);
if (!page)
- goto out_unacct_blocks;
- } else if (!*pagep) {
+ goto out;
+ goto install_ptes;
+ }
+
+ ret = -ENOMEM;
+ if (!shmem_inode_acct_block(inode, 1))
+ goto out;
+
+ if (!*pagep) {
page = shmem_alloc_page(gfp, info, pgoff);
if (!page)
goto out_unacct_blocks;
@@ -2415,30 +2419,29 @@ int shmem_mcopy_atomic_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
*pagep = NULL;
}
- if (!is_continue) {
- VM_BUG_ON(PageSwapBacked(page));
- VM_BUG_ON(PageLocked(page));
- __SetPageLocked(page);
- __SetPageSwapBacked(page);
- __SetPageUptodate(page);
- }
+ VM_BUG_ON(PageSwapBacked(page));
+ VM_BUG_ON(PageLocked(page));
+ __SetPageLocked(page);
+ __SetPageSwapBacked(page);
+ __SetPageUptodate(page);
ret = -EFAULT;
- offset = linear_page_index(dst_vma, dst_addr);
max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
- if (unlikely(offset >= max_off))
+ if (unlikely(pgoff >= max_off))
goto out_release;
- /* If page wasn't already in the page cache, add it. */
- if (!is_continue) {
- ret = shmem_add_to_page_cache(page, mapping, pgoff, NULL,
- gfp & GFP_RECLAIM_MASK, dst_mm);
- if (ret)
- goto out_release;
- }
+ ret = shmem_add_to_page_cache(page, mapping, pgoff, NULL,
+ gfp & GFP_RECLAIM_MASK, dst_mm);
+ if (ret)
+ goto out_release;
+install_ptes:
_dst_pte = mk_pte(page, dst_vma->vm_page_prot);
- if (dst_vma->vm_flags & VM_WRITE)
+ /* For CONTINUE on a non-shared VMA, don't pte_mkwrite for CoW. */
+ writable = is_continue && !(dst_vma->vm_flags & VM_SHARED)
+ ? 0
+ : dst_vma->vm_flags & VM_WRITE;
+ if (writable)
_dst_pte = pte_mkwrite(pte_mkdirty(_dst_pte));
else {
/*
@@ -2455,7 +2458,7 @@ int shmem_mcopy_atomic_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
ret = -EFAULT;
max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
- if (unlikely(offset >= max_off))
+ if (unlikely(pgoff >= max_off))
goto out_release_unlock;
ret = -EEXIST;
@@ -2485,13 +2488,14 @@ int shmem_mcopy_atomic_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
return ret;
out_release_unlock:
pte_unmap_unlock(dst_pte, ptl);
- ClearPageDirty(page);
- delete_from_page_cache(page);
+ if (!is_continue)
+ delete_from_page_cache(page);
out_release:
unlock_page(page);
put_page(page);
out_unacct_blocks:
- shmem_inode_unacct_blocks(inode, 1);
+ if (!is_continue)
+ shmem_inode_unacct_blocks(inode, 1);
goto out;
}
#endif /* CONFIG_USERFAULTFD */
diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
index f6c86b036d0f..d8541a59dae5 100644
--- a/tools/testing/selftests/vm/userfaultfd.c
+++ b/tools/testing/selftests/vm/userfaultfd.c
@@ -485,6 +485,7 @@ static void wp_range(int ufd, __u64 start, __u64 len, bool wp)
static void continue_range(int ufd, __u64 start, __u64 len)
{
struct uffdio_continue req;
+ int ret;
req.range.start = start;
req.range.len = len;
@@ -493,6 +494,17 @@ static void continue_range(int ufd, __u64 start, __u64 len)
if (ioctl(ufd, UFFDIO_CONTINUE, &req))
err("UFFDIO_CONTINUE failed for address 0x%" PRIx64,
(uint64_t)start);
+
+ /*
+ * Error handling within the kernel for continue is subtly different
+ * from copy or zeropage, so it may be a source of bugs. Trigger an
+ * error (-EEXIST) on purpose, to verify doing so doesn't cause a BUG.
+ */
+ req.mapped = 0;
+ ret = ioctl(ufd, UFFDIO_CONTINUE, &req);
+ if (ret >= 0 || req.mapped != -EEXIST)
+ err("failed to exercise UFFDIO_CONTINUE error handling, ret=%d, mapped=%" PRId64,
+ ret, req.mapped);
}
static void *locking_thread(void *arg)
--
2.31.0.291.g576ba9dcdaf-goog
From: Ira Weiny <ira.weiny(a)intel.com>
Introduce a new page protection mechanism for supervisor pages, Protection Key
Supervisor (PKS).
Generally PKS enables protections on 'domains' of supervisor pages to limit
supervisor mode access to pages beyond the normal paging protections. PKS
works in a similar fashion to user space pkeys, PKU. As with PKU, supervisor
pkeys are checked in addition to normal paging protections, and access or writes
can be disabled via an MSR update without TLB flushes when permissions change.
Also like PKU, a page mapping is assigned to a domain by setting pkey bits in
the page table entry for that mapping.
Access is controlled through a PKRS register which is updated via WRMSR/RDMSR.
XSAVE is not supported for the PKRS MSR. Therefore the implementation
saves/restores the MSR across context switches and during exceptions. Nested
exceptions are supported by each exception getting a new PKS state.
For consistent behavior with current paging protections, pkey 0 is reserved and
configured to allow full access via the pkey mechanism, thus preserving the
default paging protections on mappings with the default pkey value of 0.
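To make the register description concrete, here is a small sketch of how a PKRS value could be computed. It assumes the PKRS layout mirrors PKRU, with two bits per key (access-disable in the low bit, write-disable in the high bit); the macro and function names are hypothetical and are not the API added by this series.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PKR_AD_BIT        0x1u  /* access disable */
#define SKETCH_PKR_WD_BIT        0x2u  /* write disable */
#define SKETCH_PKR_BITS_PER_PKEY 2
#define SKETCH_NUM_PKEYS         16

/*
 * Return `pkrs` with the permission bits of `pkey` replaced by `perm`
 * (0, AD, WD, or AD | WD). All other keys are left untouched.
 */
uint32_t sketch_pkrs_update(uint32_t pkrs, int pkey, uint32_t perm)
{
	const uint32_t both = SKETCH_PKR_AD_BIT | SKETCH_PKR_WD_BIT;
	unsigned int shift;

	assert(pkey >= 0 && pkey < SKETCH_NUM_PKEYS);
	assert((perm & ~both) == 0);

	shift = (unsigned int)pkey * SKETCH_PKR_BITS_PER_PKEY;
	pkrs &= ~(both << shift);   /* clear the old permissions */
	pkrs |= perm << shift;      /* install the new ones */
	return pkrs;
}
```

With this layout, a value whose two lowest bits are clear leaves pkey 0 fully accessible, which matches the reserved default described above.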
Other keys (1-15) are allocated by an allocator, which prepares us for key
contention from day one. Kernel users should be prepared for the allocator to
fail either because of key exhaustion or due to PKS not being supported on the
CPU instance.
The following are key attributes of PKS.
1) Fast switching of permissions
1a) Prevents access without page table manipulations
1b) No TLB flushes required
2) Works on a per thread basis
PKS is available with 4 and 5 level paging. Like PKRU it consumes 4 bits from
the PTE to store the pkey within the entry.
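The 4-bit pkey field can be pictured with the sketch below. It assumes the x86-64 convention of keeping the protection key in PTE bits 59 to 62; the macro and function names are hypothetical, and a raw 64-bit integer stands in for the kernel's pte types.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PTE_PKEY_SHIFT 59
#define SKETCH_PTE_PKEY_MASK  (UINT64_C(0xf) << SKETCH_PTE_PKEY_SHIFT)

/* Return `pte` with its protection-key field set to `pkey` (0..15). */
uint64_t sketch_pte_set_pkey(uint64_t pte, unsigned int pkey)
{
	assert(pkey < 16);
	return (pte & ~SKETCH_PTE_PKEY_MASK) |
	       ((uint64_t)pkey << SKETCH_PTE_PKEY_SHIFT);
}

/* Extract the protection key stored in `pte`. */
unsigned int sketch_pte_get_pkey(uint64_t pte)
{
	return (unsigned int)((pte & SKETCH_PTE_PKEY_MASK) >>
			      SKETCH_PTE_PKEY_SHIFT);
}
```

Changing a mapping's domain this way rewrites the page table entry, whereas changing the permissions of a domain only needs the register update described earlier.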
All code to support PKS is configured via ARCH_ENABLE_SUPERVISOR_PKEYS, which
is designed to be turned on only when a PKS user is configured in the kernel.
Those users must depend on ARCH_HAS_SUPERVISOR_PKEYS to properly work with
other architectures which do not yet support PKS.
Originally this series was submitted as part of a large patch set which
converted the kmap call sites.[1]
Many follow-on discussions revealed a few problems, the first of which was
that some callers leak a kmap mapping across threads rather than containing it
to a critical section. Attempts were made to see if these 'global kmaps' could
be supported.[2] However, supporting global kmaps had many problems. Work is
being done in parallel on converting as many kmap calls as possible to the new
kmap_local_page().[3]
Changes from V3 [4]
Add ARCH_ENABLE_SUPERVISOR_PKEYS config which is selected by kernel
users to add the functionality to the core. However, they should only
select this if ARCH_HAS_SUPERVISOR_PKEYS is available.
Clean up test code for context switching
Adjust for extended_pt_regs
Reduce output unless --debug is specified
Address internal review comments from Dan Williams and Dave Hansen
Help with macros and assembly coding
Change names of various functions
Clean up documentation
Move all #ifdefery into header files.
Clean up cover letter.
Make extended_pt_regs handling a macro rather than coding
around every call to C
Add macros for PKS shift/mask
New patch : x86/pks: Add additional PKEY helper macros
Preserve pkrs_cache as static when PKS_TEST is not configured
Remove unnecessary pr_* prints
Clarify pks_key_alloc flags parameter
Change CONFIG_PKS_TESTING to CONFIG_PKS_TEST
Clean up test code separation from main code in fault.c
Remove module boilerplate from test code
Clean up all commit messages
Address comments from Thomas Gleixner
Provide a warning and fallback to no protection if a global
mapping is requested.
Fix context switch. Fix where pks_sched_in() is called.
Fix test to actually do a context switch
Remove unnecessary noinstr's
From Andy Lutomirski
Use extended_pt_regs idea to stash pks values on the stack
Drop patches 5/10 and 7/10
And use extended_pt_regs to print pkey info on fault
Adjust tests
Comments from Randy Dunlap:
Fix grammatical errors in doc
Clean up kernel docs
Rebase to 5.12
[1] https://lore.kernel.org/lkml/20201009195033.3208459-1-ira.weiny@intel.com/
[2] https://lore.kernel.org/lkml/87mtycqcjf.fsf@nanos.tec.linutronix.de/
[3] https://lore.kernel.org/lkml/20210128061503.1496847-1-ira.weiny@intel.com/
    https://lore.kernel.org/lkml/20210210062221.3023586-1-ira.weiny@intel.com/
    https://lore.kernel.org/lkml/20210205170030.856723-1-ira.weiny@intel.com/
    https://lore.kernel.org/lkml/20210217024826.3466046-1-ira.weiny@intel.com/
[4] https://lore.kernel.org/lkml/20201106232908.364581-1-ira.weiny@intel.com/
</proposed cover letter>
Fenghua Yu (1):
x86/pks: Add PKS kernel API
Ira Weiny (9):
x86/pkeys: Create pkeys_common.h
x86/fpu: Refactor arch_set_user_pkey_access() for PKS support
x86/pks: Add additional PKEY helper macros
x86/pks: Add PKS defines and Kconfig options
x86/pks: Add PKS setup code
x86/fault: Adjust WARN_ON for PKey fault
x86/pks: Preserve the PKRS MSR on context switch
x86/entry: Preserve PKRS MSR across exceptions
x86/pks: Add PKS test code
Documentation/core-api/protection-keys.rst | 111 +++-
arch/x86/Kconfig | 1 +
arch/x86/entry/calling.h | 26 +
arch/x86/entry/common.c | 58 ++
arch/x86/entry/entry_64.S | 22 +-
arch/x86/entry/entry_64_compat.S | 6 +-
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/disabled-features.h | 8 +-
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/include/asm/pgtable.h | 10 +-
arch/x86/include/asm/pgtable_types.h | 12 +
arch/x86/include/asm/pkeys.h | 4 +
arch/x86/include/asm/pkeys_common.h | 34 +
arch/x86/include/asm/pks.h | 54 ++
arch/x86/include/asm/processor-flags.h | 2 +
arch/x86/include/asm/processor.h | 43 +-
arch/x86/include/uapi/asm/processor-flags.h | 2 +
arch/x86/kernel/cpu/common.c | 2 +
arch/x86/kernel/fpu/xstate.c | 22 +-
arch/x86/kernel/head_64.S | 7 +-
arch/x86/kernel/process.c | 3 +
arch/x86/kernel/process_64.c | 2 +
arch/x86/mm/fault.c | 27 +-
arch/x86/mm/pkeys.c | 218 +++++-
include/linux/pgtable.h | 4 +
include/linux/pkeys.h | 34 +
kernel/entry/common.c | 14 +-
lib/Kconfig.debug | 11 +
lib/Makefile | 3 +
lib/pks/Makefile | 3 +
lib/pks/pks_test.c | 693 ++++++++++++++++++++
mm/Kconfig | 5 +
tools/testing/selftests/x86/Makefile | 3 +-
tools/testing/selftests/x86/test_pks.c | 150 +++++
34 files changed, 1519 insertions(+), 77 deletions(-)
create mode 100644 arch/x86/include/asm/pkeys_common.h
create mode 100644 arch/x86/include/asm/pks.h
create mode 100644 lib/pks/Makefile
create mode 100644 lib/pks/pks_test.c
create mode 100644 tools/testing/selftests/x86/test_pks.c
--
2.28.0.rc0.12.gb6a658bd00c9
If a signed number field starts with a '-' the field width must be > 1,
or unlimited, to allow at least one digit after the '-'.
This patch adds a check for this. If a signed field starts with '-'
and field_width == 1 the scanf will quit.
It is ok for a signed number field to have a field width of 1 if it
starts with a digit. In that case the single digit can be converted.
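The rule can be illustrated with a simplified, standalone scanner. This is only a sketch of the check described above, not the kernel's vsscanf(): sketch_scan_signed() is a hypothetical helper that handles a single decimal field, uses -1 for an unlimited width, and ignores overflow.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Convert one signed decimal field of at most `field_width` characters
 * (-1 means unlimited). Returns false if no number can be converted.
 */
bool sketch_scan_signed(const char *str, int field_width, long *out)
{
	bool negative = false;
	long val = 0;
	int used = 0;

	assert(str != NULL && out != NULL);
	assert(field_width == -1 || field_width > 0);

	if (*str == '-') {
		/* A width of 1 leaves no room for a digit after the '-'. */
		if (field_width == 1)
			return false;
		negative = true;
		str++;
		used++;
	}

	/* At least one digit is required. */
	if (!isdigit((unsigned char)*str))
		return false;

	while (isdigit((unsigned char)*str) &&
	       (field_width < 0 || used < field_width)) {
		val = val * 10 + (*str - '0');
		str++;
		used++;
	}

	*out = negative ? -val : val;
	return true;
}
```

For example, scanning "-5" with a width of 1 fails, while "5" with a width of 1 and "-5" with a width of 2 both convert.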
Signed-off-by: Richard Fitzgerald <rf(a)opensource.cirrus.com>
Reviewed-by: Petr Mladek <pmladek(a)suse.com>
Acked-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
---
lib/vsprintf.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 41ddc353ebb8..f78651e9b030 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -3466,8 +3466,12 @@ int vsscanf(const char *buf, const char *fmt, va_list args)
str = skip_spaces(str);
digit = *str;
- if (is_sign && digit == '-')
+ if (is_sign && digit == '-') {
+ if (field_width == 1)
+ break;
+
digit = *(str + 1);
+ }
if (!digit
|| (base == 16 && !isxdigit(digit))
--
2.20.1
Previously, we shared too much of the code with COPY and ZEROPAGE, so we
manipulated things in various invalid ways:
- Previously, we unconditionally called shmem_inode_acct_block. In the
continue case, we're looking up an existing page which would have been
accounted for properly when it was allocated. So doing it twice
results in double-counting, and eventually leaking.
- Previously, we made the pte writable whenever the VMA was writable.
However, for continue, consider this case:
1. A tmpfs file was created
2. The non-UFFD-registered side mmap()-s with MAP_SHARED
3. The UFFD-registered side mmap()-s with MAP_PRIVATE
In this case, even though the UFFD-registered VMA may be writable, we
still want CoW behavior. So, check for this case and don't make the
pte writable.
- The offset / max_off checking doesn't necessarily hurt anything, but
it's not needed in the CONTINUE case, so skip it.
- Previously, we unconditionally called ClearPageDirty() in the error
path. In the continue case though, since this is an existing page, it
might have already been dirty before we started touching it. So,
remember whether or not it was dirty before we set_page_dirty(), and
only clear the bit if it wasn't dirty before.
- Previously, we unconditionally removed the page from the page cache in
the error path. But in the continue case, we didn't add it - it was
already there because the page is present in some second
(non-UFFD-registered) mapping. So, removing it is invalid.
Because the error handling issues are easy to exercise in the selftest,
make a small modification there to do so.
Finally, refactor shmem_mcopy_atomic_pte a bit. By this point, we've
added a lot of "if (!is_continue)"-s everywhere. It's cleaner to just
check for that mode first thing, and then "goto" down to where the parts
we actually want are. This leaves the code in between cleaner.
Changes since v1:
- Refactor to skip ahead with goto, instead of adding several more
"if (!is_continue)".
- Fix unconditional ClearPageDirty().
- Don't pte_mkwrite() when is_continue && !VM_SHARED.
Fixes: 00da60b9d0a0 ("userfaultfd: support minor fault handling for shmem")
Signed-off-by: Axel Rasmussen <axelrasmussen(a)google.com>
---
mm/shmem.c | 67 ++++++++++++++----------
tools/testing/selftests/vm/userfaultfd.c | 12 +++++
2 files changed, 51 insertions(+), 28 deletions(-)
diff --git a/mm/shmem.c b/mm/shmem.c
index d2e0e81b7d2e..8ab1f1f29987 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2378,17 +2378,22 @@ int shmem_mcopy_atomic_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
pte_t _dst_pte, *dst_pte;
int ret;
pgoff_t offset, max_off;
-
- ret = -ENOMEM;
- if (!shmem_inode_acct_block(inode, 1))
- goto out;
+ int writable;
+ bool was_dirty;
if (is_continue) {
ret = -EFAULT;
page = find_lock_page(mapping, pgoff);
if (!page)
- goto out_unacct_blocks;
- } else if (!*pagep) {
+ goto out;
+ goto install_ptes;
+ }
+
+ ret = -ENOMEM;
+ if (!shmem_inode_acct_block(inode, 1))
+ goto out;
+
+ if (!*pagep) {
page = shmem_alloc_page(gfp, info, pgoff);
if (!page)
goto out_unacct_blocks;
@@ -2415,13 +2420,11 @@ int shmem_mcopy_atomic_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
*pagep = NULL;
}
- if (!is_continue) {
- VM_BUG_ON(PageSwapBacked(page));
- VM_BUG_ON(PageLocked(page));
- __SetPageLocked(page);
- __SetPageSwapBacked(page);
- __SetPageUptodate(page);
- }
+ VM_BUG_ON(PageSwapBacked(page));
+ VM_BUG_ON(PageLocked(page));
+ __SetPageLocked(page);
+ __SetPageSwapBacked(page);
+ __SetPageUptodate(page);
ret = -EFAULT;
offset = linear_page_index(dst_vma, dst_addr);
@@ -2429,16 +2432,18 @@ int shmem_mcopy_atomic_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
if (unlikely(offset >= max_off))
goto out_release;
- /* If page wasn't already in the page cache, add it. */
- if (!is_continue) {
- ret = shmem_add_to_page_cache(page, mapping, pgoff, NULL,
- gfp & GFP_RECLAIM_MASK, dst_mm);
- if (ret)
- goto out_release;
- }
+ ret = shmem_add_to_page_cache(page, mapping, pgoff, NULL,
+ gfp & GFP_RECLAIM_MASK, dst_mm);
+ if (ret)
+ goto out_release;
+install_ptes:
_dst_pte = mk_pte(page, dst_vma->vm_page_prot);
- if (dst_vma->vm_flags & VM_WRITE)
+ /* For CONTINUE on a non-shared VMA, don't pte_mkwrite for CoW. */
+ writable = is_continue && !(dst_vma->vm_flags & VM_SHARED)
+ ? 0
+ : dst_vma->vm_flags & VM_WRITE;
+ if (writable)
_dst_pte = pte_mkwrite(pte_mkdirty(_dst_pte));
else {
/*
@@ -2448,15 +2453,18 @@ int shmem_mcopy_atomic_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
* unconditionally before unlock_page(), but doing it
* only if VM_WRITE is not set is faster.
*/
+ was_dirty = PageDirty(page);
set_page_dirty(page);
}
dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl);
- ret = -EFAULT;
- max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
- if (unlikely(offset >= max_off))
- goto out_release_unlock;
+ if (!is_continue) {
+ ret = -EFAULT;
+ max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
+ if (unlikely(offset >= max_off))
+ goto out_release_unlock;
+ }
ret = -EEXIST;
if (!pte_none(*dst_pte))
@@ -2485,13 +2493,16 @@ int shmem_mcopy_atomic_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
return ret;
out_release_unlock:
pte_unmap_unlock(dst_pte, ptl);
- ClearPageDirty(page);
- delete_from_page_cache(page);
+ if (!was_dirty)
+ ClearPageDirty(page);
+ if (!is_continue)
+ delete_from_page_cache(page);
out_release:
unlock_page(page);
put_page(page);
out_unacct_blocks:
- shmem_inode_unacct_blocks(inode, 1);
+ if (!is_continue)
+ shmem_inode_unacct_blocks(inode, 1);
goto out;
}
#endif /* CONFIG_USERFAULTFD */
diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
index f6c86b036d0f..d8541a59dae5 100644
--- a/tools/testing/selftests/vm/userfaultfd.c
+++ b/tools/testing/selftests/vm/userfaultfd.c
@@ -485,6 +485,7 @@ static void wp_range(int ufd, __u64 start, __u64 len, bool wp)
static void continue_range(int ufd, __u64 start, __u64 len)
{
struct uffdio_continue req;
+ int ret;
req.range.start = start;
req.range.len = len;
@@ -493,6 +494,17 @@ static void continue_range(int ufd, __u64 start, __u64 len)
if (ioctl(ufd, UFFDIO_CONTINUE, &req))
err("UFFDIO_CONTINUE failed for address 0x%" PRIx64,
(uint64_t)start);
+
+ /*
+ * Error handling within the kernel for continue is subtly different
+ * from copy or zeropage, so it may be a source of bugs. Trigger an
+ * error (-EEXIST) on purpose, to verify doing so doesn't cause a BUG.
+ */
+ req.mapped = 0;
+ ret = ioctl(ufd, UFFDIO_CONTINUE, &req);
+ if (ret >= 0 || req.mapped != -EEXIST)
+ err("failed to exercise UFFDIO_CONTINUE error handling, ret=%d, mapped=%" PRId64,
+ ret, req.mapped);
}
static void *locking_thread(void *arg)
--
2.31.0.291.g576ba9dcdaf-goog
From: Andre Przywara <andre.przywara(a)arm.com>
[ Upstream commit 7011d72588d16a9e5f5d85acbc8b10019809599c ]
The "First Fault Register" (FFR) is an SVE register that mimics a
predicate register, but clears bits when a load or store fails to handle
an element of a vector. The supposed usage scenario is to initialise
this register (using SETFFR), then *read* it later on to learn about
elements that failed to load or store. Explicit writes to this register
using the WRFFR instruction are only supposed to *restore* values
previously read from the register (for context-switching only).
As the manual describes, this register holds only certain values, it:
"... contains a monotonic predicate value, in which starting from bit 0
there are zero or more 1 bits, followed only by 0 bits in any remaining
bit positions."
Any other value is UNPREDICTABLE and is not supposed to be "restored"
into the register.
The SVE test currently tries to write a signature pattern into the
register, which is *not* a canonical FFR value. Apparently the existing
setups treat UNPREDICTABLE as "read-as-written", but a new
implementation actually only stores canonical values. As a consequence,
the sve-test fails immediately when comparing the FFR value:
-----------
# ./sve-test
Vector length: 128 bits
PID: 207
Mismatch: PID=207, iteration=0, reg=48
Expected [cf00]
Got [0f00]
Aborted
-----------
Fix this by only populating the FFR with proper canonical values.
Effectively the requirement described above limits us to 17 unique
values over 16 bits worth of FFR, so we condense our signature down to 4
bits (2 bits from the PID, 2 bits from the generation) and generate the
canonical pattern from it. Any bits describing elements above the
minimum 128 bit are set to 0.
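The construction of the canonical pattern can be sketched in C as follows. This mirrors the arithmetic described above for the lowest 16 bits of the FFR; the function names are hypothetical, and the real test does this in assembly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A canonical value is a run of low 1 bits followed only by 0 bits. */
bool sketch_ffr_is_canonical(uint16_t v)
{
	return (v & (v + 1u)) == 0;
}

/*
 * Build the low 16 bits of the FFR test pattern: a 4-bit signature (2 bits
 * of PID, 2 bits of generation) gives the number of low 1 bits.
 */
uint16_t sketch_ffr_pattern(unsigned int pid, unsigned int generation)
{
	unsigned int sig = (pid & 0x3u) | ((generation & 0x3u) << 2);
	uint16_t pattern = (uint16_t)((1u << sig) - 1u);

	assert(sketch_ffr_is_canonical(pattern));
	return pattern;
}
```

A 4-bit signature yields 16 of the 17 possible canonical values, since the all-ones pattern would need a count of 16.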
This aligns the FFR usage to the architecture and fixes the test on
microarchitectures implementing FFR in a more restricted way.
Signed-off-by: Andre Przywara <andre.przywara(a)arm.com>
Reviwed-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20210319120128.29452-1-andre.przywara@arm.com
Signed-off-by: Will Deacon <will(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/arm64/fp/sve-test.S | 22 ++++++++++++++++-----
1 file changed, 17 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/arm64/fp/sve-test.S b/tools/testing/selftests/arm64/fp/sve-test.S
index f95074c9b48b..07f14e279a90 100644
--- a/tools/testing/selftests/arm64/fp/sve-test.S
+++ b/tools/testing/selftests/arm64/fp/sve-test.S
@@ -284,16 +284,28 @@ endfunction
// Set up test pattern in the FFR
// x0: pid
// x2: generation
+//
+// We need to generate a canonical FFR value, which consists of a number of
+// low "1" bits, followed by a number of zeros. This gives us 17 unique values
+// per 16 bits of FFR, so we create a 4 bit signature out of the PID and
+// generation, and use that as the initial number of ones in the pattern.
+// We fill the upper lanes of FFR with zeros.
// Beware: corrupts P0.
function setup_ffr
mov x4, x30
- bl pattern
+ and w0, w0, #0x3
+ bfi w0, w2, #2, #2
+ mov w1, #1
+ lsl w1, w1, w0
+ sub w1, w1, #1
+
ldr x0, =ffrref
- ldr x1, =scratch
- rdvl x2, #1
- lsr x2, x2, #3
- bl memcpy
+ strh w1, [x0], 2
+ rdvl x1, #1
+ lsr x1, x1, #3
+ sub x1, x1, #2
+ bl memclr
mov x0, #0
ldr x1, =ffrref
--
2.30.1
From: David Gow <davidgow(a)google.com>
[ Upstream commit 7421b1a4d10c633ca5f14c8236d3e2c1de07e52b ]
The first argument to namedtuple() should match the name of the type,
which wasn't the case for KconfigEntryBase.
Fixing this is enough to make mypy show no python typing errors again.
Fixes 97752c39bd ("kunit: kunit_tool: Allow .kunitconfig to disable config items")
Signed-off-by: David Gow <davidgow(a)google.com>
Reviewed-by: Daniel Latypov <dlatypov(a)google.com>
Acked-by: Brendan Higgins <brendanhiggins(a)google.com>
Signed-off-by: Shuah Khan <skhan(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/kunit/kunit_config.py | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/kunit/kunit_config.py b/tools/testing/kunit/kunit_config.py
index 02ffc3a3e5dc..b30e9d6db6b4 100644
--- a/tools/testing/kunit/kunit_config.py
+++ b/tools/testing/kunit/kunit_config.py
@@ -12,7 +12,7 @@ import re
CONFIG_IS_NOT_SET_PATTERN = r'^# CONFIG_(\w+) is not set$'
CONFIG_PATTERN = r'^CONFIG_(\w+)=(\S+|".*")$'
-KconfigEntryBase = collections.namedtuple('KconfigEntry', ['name', 'value'])
+KconfigEntryBase = collections.namedtuple('KconfigEntryBase', ['name', 'value'])
class KconfigEntry(KconfigEntryBase):
--
2.30.1
From: Andre Przywara <andre.przywara(a)arm.com>
[ Upstream commit 7011d72588d16a9e5f5d85acbc8b10019809599c ]
The "First Fault Register" (FFR) is an SVE register that mimics a
predicate register, but clears bits when a load or store fails to handle
an element of a vector. The supposed usage scenario is to initialise
this register (using SETFFR), then *read* it later on to learn about
elements that failed to load or store. Explicit writes to this register
using the WRFFR instruction are only supposed to *restore* values
previously read from the register (for context-switching only).
As the manual describes, this register holds only certain values, it:
"... contains a monotonic predicate value, in which starting from bit 0
there are zero or more 1 bits, followed only by 0 bits in any remaining
bit positions."
Any other value is UNPREDICTABLE and is not supposed to be "restored"
into the register.
The SVE test currently tries to write a signature pattern into the
register, which is *not* a canonical FFR value. Apparently the existing
setups treat UNPREDICTABLE as "read-as-written", but a new
implementation actually only stores canonical values. As a consequence,
the sve-test fails immediately when comparing the FFR value:
-----------
# ./sve-test
Vector length: 128 bits
PID: 207
Mismatch: PID=207, iteration=0, reg=48
Expected [cf00]
Got [0f00]
Aborted
-----------
Fix this by only populating the FFR with proper canonical values.
Effectively the requirement described above limits us to 17 unique
values over 16 bits worth of FFR, so we condense our signature down to 4
bits (2 bits from the PID, 2 bits from the generation) and generate the
canonical pattern from it. Any bits describing elements above the
minimum 128 bit are set to 0.
This aligns the FFR usage to the architecture and fixes the test on
microarchitectures implementing FFR in a more restricted way.
Signed-off-by: Andre Przywara <andre.przywara(a)arm.com>
Reviewed-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20210319120128.29452-1-andre.przywara@arm.com
Signed-off-by: Will Deacon <will(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/arm64/fp/sve-test.S | 22 ++++++++++++++++-----
1 file changed, 17 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/arm64/fp/sve-test.S b/tools/testing/selftests/arm64/fp/sve-test.S
index 9210691aa998..e3e08d9c7020 100644
--- a/tools/testing/selftests/arm64/fp/sve-test.S
+++ b/tools/testing/selftests/arm64/fp/sve-test.S
@@ -284,16 +284,28 @@ endfunction
// Set up test pattern in the FFR
// x0: pid
// x2: generation
+//
+// We need to generate a canonical FFR value, which consists of a number of
+// low "1" bits, followed by a number of zeros. This gives us 17 unique values
+// per 16 bits of FFR, so we create a 4 bit signature out of the PID and
+// generation, and use that as the initial number of ones in the pattern.
+// We fill the upper lanes of FFR with zeros.
// Beware: corrupts P0.
function setup_ffr
mov x4, x30
- bl pattern
+ and w0, w0, #0x3
+ bfi w0, w2, #2, #2
+ mov w1, #1
+ lsl w1, w1, w0
+ sub w1, w1, #1
+
ldr x0, =ffrref
- ldr x1, =scratch
- rdvl x2, #1
- lsr x2, x2, #3
- bl memcpy
+ strh w1, [x0], 2
+ rdvl x1, #1
+ lsr x1, x1, #3
+ sub x1, x1, #2
+ bl memclr
mov x0, #0
ldr x1, =ffrref
--
2.30.1
From: David Gow <davidgow(a)google.com>
[ Upstream commit 7421b1a4d10c633ca5f14c8236d3e2c1de07e52b ]
The first argument to namedtuple() should match the name of the type,
which wasn't the case for KconfigEntryBase.
Fixing this is enough to make mypy show no python typing errors again.
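As a small illustration of the rule (not part of the patch), the snippet
below shows what the first argument to namedtuple() controls. The
__str__ body is only a stand-in for this note; the names Mismatched and
the example entry are hypothetical.

```python
import collections

# The first argument becomes the generated class's __name__. Type checkers
# such as mypy expect it to match the variable the class is assigned to.
Mismatched = collections.namedtuple('KconfigEntry', ['name', 'value'])

# Matching form, as in the fix: variable name and type name agree.
KconfigEntryBase = collections.namedtuple('KconfigEntryBase', ['name', 'value'])


class KconfigEntry(KconfigEntryBase):
    # Illustrative formatting only; assumed for this sketch.
    def __str__(self) -> str:
        if self.value == 'n':
            return '# CONFIG_%s is not set' % self.name
        return 'CONFIG_%s=%s' % (self.name, self.value)


if __name__ == '__main__':
    print(Mismatched.__name__)        # name differs from the variable
    print(KconfigEntryBase.__name__)  # name matches the variable
    print(KconfigEntry('KUNIT', 'y'))
```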
Fixes: 97752c39bd ("kunit: kunit_tool: Allow .kunitconfig to disable config items")
Signed-off-by: David Gow <davidgow(a)google.com>
Reviewed-by: Daniel Latypov <dlatypov(a)google.com>
Acked-by: Brendan Higgins <brendanhiggins(a)google.com>
Signed-off-by: Shuah Khan <skhan(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/kunit/kunit_config.py | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/kunit/kunit_config.py b/tools/testing/kunit/kunit_config.py
index bdd60230764b..27fe086d2d0d 100644
--- a/tools/testing/kunit/kunit_config.py
+++ b/tools/testing/kunit/kunit_config.py
@@ -13,7 +13,7 @@ from typing import List, Set
CONFIG_IS_NOT_SET_PATTERN = r'^# CONFIG_(\w+) is not set$'
CONFIG_PATTERN = r'^CONFIG_(\w+)=(\S+|".*")$'
-KconfigEntryBase = collections.namedtuple('KconfigEntry', ['name', 'value'])
+KconfigEntryBase = collections.namedtuple('KconfigEntryBase', ['name', 'value'])
class KconfigEntry(KconfigEntryBase):
--
2.30.1
If a signed number field starts with a '-' the field width must be > 1,
or unlimited, to allow at least one digit after the '-'.
This patch adds a check for this. If a signed field starts with '-'
and field_width == 1 the scanf will quit.
It is ok for a signed number field to have a field width of 1 if it
starts with a digit. In that case the single digit can be converted.
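The rule can be modelled outside the kernel with a few lines of Python.
This is a simplified sketch, not the vsscanf() code: the helper name is
invented, only decimal digits are handled, and -1 is used here to mean
"no field width given".

```python
def can_convert_signed(field: str, field_width: int = -1) -> bool:
    """Return True if a signed decimal field has room for at least one digit.

    Hypothetical model of the check; field_width == -1 means unlimited.
    """
    if not field:
        return False
    digit = field[0]
    if digit == '-':
        if field_width == 1:
            # The whole width is consumed by the sign; no digit can follow.
            return False
        digit = field[1] if len(field) > 1 else ''
    # Exactly one character, and it must be a decimal digit.
    return len(digit) == 1 and digit in '0123456789'


if __name__ == '__main__':
    for text, width in [('-5', 1), ('-5', 2), ('7', 1), ('-5', -1)]:
        print(repr(text), width, can_convert_signed(text, width))
```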
Signed-off-by: Richard Fitzgerald <rf(a)opensource.cirrus.com>
Reviewed-by: Petr Mladek <pmladek(a)suse.com>
Acked-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
---
lib/vsprintf.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 41ddc353ebb8..f78651e9b030 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -3466,8 +3466,12 @@ int vsscanf(const char *buf, const char *fmt, va_list args)
str = skip_spaces(str);
digit = *str;
- if (is_sign && digit == '-')
+ if (is_sign && digit == '-') {
+ if (field_width == 1)
+ break;
+
digit = *(str + 1);
+ }
if (!digit
|| (base == 16 && !isxdigit(digit))
--
2.20.1
Hi,
This v5 series mainly consists of two parts.
Based on kvm queue branch: https://git.kernel.org/pub/scm/virt/kvm/kvm.git/log/?h=queue
In the first part, all the known hugetlb backing src types, specified
with different hugepage sizes, are listed, so that we can request a
hugetlb source of the exact granularity we want instead of the system
default. As all the known hugetlb page sizes are listed, this is
appropriate for all architectures. In addition, a helper that returns
the granularity of the different backing src types (anonymous/thp/hugetlb)
is added, so that we can use the accurate backing src granularity for
alignment and for guest memory accesses by vcpus.
In the second part, a new test is added:
This test serves as a performance tester and a bug reproducer for the
kvm page table code (GPA->HPA mappings), and gives guidance to people
trying to improve kvm. The following explains exactly what we can do
with this test.
The function guest_code() covers the cases where a single vcpu or
multiple vcpus access guest pages within the same memory region, in three
VM stages (before dirty logging, during dirty logging, after dirty logging).
In addition, the backing src memory type (ANONYMOUS/THP/HUGETLB) of the
tested memory region can be specified by users, which means users can choose
whether normal page mappings or block mappings are created in the test.
If ANONYMOUS memory is specified, kvm will create normal page mappings
for the tested memory region before dirty logging, and update attributes
of the page mappings from RO to RW during dirty logging. If THP/HUGETLB
memory is specified, kvm will create block mappings for the tested memory
region before dirty logging, and split the block mappings into normal page
mappings during dirty logging, and coalesce the page mappings back into
block mappings after dirty logging is stopped.
So in summary, as a performance tester, this test can present the
performance of kvm creating/updating normal page mappings, or the
performance of kvm creating/splitting/recovering block mappings,
through execution time.
When we need to coalesce the page mappings back into block mappings after
dirty logging is stopped, we must first invalidate *all* the TLB entries
for the page mappings right before installing the block entry, because a
TLB conflict abort could occur if the TLB entries are not fully
invalidated. We have hit this TLB conflict twice in the aarch64 software
implementation and fixed it. As this test can simulate a VM with block
mappings going from dirty logging enabled to dirty logging stopped, it
can also reproduce this TLB conflict abort caused by inadequate TLB
invalidation when coalescing tables.
Links about the TLB conflict abort:
https://lore.kernel.org/lkml/20201201201034.116760-3-wangyanan55@huawei.com/
---
Change logs:
v4->v5:
- Use synchronization(sem_wait) for time measurement
- Add a new patch about TEST_ASSERT(patch 4)
- Address Andrew Jones's comments for v4 series
- Add Andrew Jones's R-b tags in some patches
- v4: https://lore.kernel.org/lkml/20210302125751.19080-1-wangyanan55@huawei.com/
v3->v4:
- Add a helper to get system default hugetlb page size
- Add tags of Reviewed-by of Ben in the patches
- v3: https://lore.kernel.org/lkml/20210301065916.11484-1-wangyanan55@huawei.com/
v2->v3:
- Add tags of Suggested-by, Reviewed-by in the patches
- Add a generic macro to get hugetlb page sizes
- Some changes for suggestions about v2 series
- v2: https://lore.kernel.org/lkml/20210225055940.18748-1-wangyanan55@huawei.com/
v1->v2:
- Add a patch to sync header files
- Add helpers to get granularity of different backing src types
- Some changes for suggestions about v1 series
- v1: https://lore.kernel.org/lkml/20210208090841.333724-1-wangyanan55@huawei.com/
---
Yanan Wang (10):
tools headers: sync headers of asm-generic/hugetlb_encode.h
tools headers: Add a macro to get HUGETLB page sizes for mmap
KVM: selftests: Use flag CLOCK_MONOTONIC_RAW for timing
KVM: selftests: Print the errno besides error-string in TEST_ASSERT
KVM: selftests: Make a generic helper to get vm guest mode strings
KVM: selftests: Add a helper to get system configured THP page size
KVM: selftests: Add a helper to get system default hugetlb page size
KVM: selftests: List all hugetlb src types specified with page sizes
KVM: selftests: Adapt vm_userspace_mem_region_add to new helpers
KVM: selftests: Add a test for kvm page table code
include/uapi/linux/mman.h | 2 +
tools/include/asm-generic/hugetlb_encode.h | 3 +
tools/include/uapi/linux/mman.h | 2 +
tools/testing/selftests/kvm/.gitignore | 1 +
tools/testing/selftests/kvm/Makefile | 3 +
.../selftests/kvm/demand_paging_test.c | 8 +-
.../selftests/kvm/dirty_log_perf_test.c | 14 +-
.../testing/selftests/kvm/include/kvm_util.h | 4 +-
.../testing/selftests/kvm/include/test_util.h | 21 +-
.../selftests/kvm/kvm_page_table_test.c | 512 ++++++++++++++++++
tools/testing/selftests/kvm/lib/assert.c | 4 +-
tools/testing/selftests/kvm/lib/kvm_util.c | 59 +-
tools/testing/selftests/kvm/lib/test_util.c | 163 +++++-
tools/testing/selftests/kvm/steal_time.c | 4 +-
14 files changed, 739 insertions(+), 61 deletions(-)
create mode 100644 tools/testing/selftests/kvm/kvm_page_table_test.c
--
2.23.0
Previously, in the error path, we unconditionally removed the page from
the page cache. But in the continue case, we didn't add it - it was
already there because the page is used by a second (non-UFFD-registered)
mapping. So, in that case, it's incorrect to remove it as the other
mapping may still use it normally.
Exercise this error path with a trivial check in the userfaultfd
selftest, to detect this kind of bug in the future.
Also, we previously were unconditionally calling shmem_inode_acct_block.
In the continue case, however, this is incorrect, because we would have
already accounted for the RAM usage when the page was originally
allocated (since at this point it's already in the page cache). So,
doing it in the continue case causes us to double-count.
Fixes: 00da60b9d0a0 ("userfaultfd: support minor fault handling for shmem")
Signed-off-by: Axel Rasmussen <axelrasmussen(a)google.com>
---
mm/shmem.c | 15 ++++++++++-----
tools/testing/selftests/vm/userfaultfd.c | 12 ++++++++++++
2 files changed, 22 insertions(+), 5 deletions(-)
diff --git a/mm/shmem.c b/mm/shmem.c
index d2e0e81b7d2e..5ac8ea737004 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2379,9 +2379,11 @@ int shmem_mcopy_atomic_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
int ret;
pgoff_t offset, max_off;
- ret = -ENOMEM;
- if (!shmem_inode_acct_block(inode, 1))
- goto out;
+ if (!is_continue) {
+ ret = -ENOMEM;
+ if (!shmem_inode_acct_block(inode, 1))
+ goto out;
+ }
if (is_continue) {
ret = -EFAULT;
@@ -2389,6 +2391,7 @@ int shmem_mcopy_atomic_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
if (!page)
goto out_unacct_blocks;
} else if (!*pagep) {
+ ret = -ENOMEM;
page = shmem_alloc_page(gfp, info, pgoff);
if (!page)
goto out_unacct_blocks;
@@ -2486,12 +2489,14 @@ int shmem_mcopy_atomic_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
out_release_unlock:
pte_unmap_unlock(dst_pte, ptl);
ClearPageDirty(page);
- delete_from_page_cache(page);
+ if (!is_continue)
+ delete_from_page_cache(page);
out_release:
unlock_page(page);
put_page(page);
out_unacct_blocks:
- shmem_inode_unacct_blocks(inode, 1);
+ if (!is_continue)
+ shmem_inode_unacct_blocks(inode, 1);
goto out;
}
#endif /* CONFIG_USERFAULTFD */
diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
index f6c86b036d0f..d8541a59dae5 100644
--- a/tools/testing/selftests/vm/userfaultfd.c
+++ b/tools/testing/selftests/vm/userfaultfd.c
@@ -485,6 +485,7 @@ static void wp_range(int ufd, __u64 start, __u64 len, bool wp)
static void continue_range(int ufd, __u64 start, __u64 len)
{
struct uffdio_continue req;
+ int ret;
req.range.start = start;
req.range.len = len;
@@ -493,6 +494,17 @@ static void continue_range(int ufd, __u64 start, __u64 len)
if (ioctl(ufd, UFFDIO_CONTINUE, &req))
err("UFFDIO_CONTINUE failed for address 0x%" PRIx64,
(uint64_t)start);
+
+ /*
+ * Error handling within the kernel for continue is subtly different
+ * from copy or zeropage, so it may be a source of bugs. Trigger an
+ * error (-EEXIST) on purpose, to verify doing so doesn't cause a BUG.
+ */
+ req.mapped = 0;
+ ret = ioctl(ufd, UFFDIO_CONTINUE, &req);
+ if (ret >= 0 || req.mapped != -EEXIST)
+ err("failed to exercise UFFDIO_CONTINUE error handling, ret=%d, mapped=%" PRId64,
+ ret, req.mapped);
}
static void *locking_thread(void *arg)
--
2.31.0.291.g576ba9dcdaf-goog
From: Colin Ian King <colin.king(a)canonical.com>
There is a spelling mistake in a comment. Fix it.
Signed-off-by: Colin Ian King <colin.king(a)canonical.com>
---
tools/testing/selftests/timers/clocksource-switch.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/timers/clocksource-switch.c b/tools/testing/selftests/timers/clocksource-switch.c
index bfc974b4572d..2d66abd877e6 100644
--- a/tools/testing/selftests/timers/clocksource-switch.c
+++ b/tools/testing/selftests/timers/clocksource-switch.c
@@ -3,7 +3,7 @@
* (C) Copyright IBM 2012
* Licensed under the GPLv2
*
- * NOTE: This is a meta-test which quickly changes the clocksourc and
+ * NOTE: This is a meta-test which quickly changes the clocksource and
* then uses other tests to detect problems. Thus this test requires
* that the inconsistency-check and nanosleep tests be present in the
* same directory it is run from.
--
2.30.2
Currently the following command produces an error message:
linux# make kselftest TARGETS=bpf O=/mnt/linux-build
# selftests: bpf: test_libbpf.sh
# ./test_libbpf.sh: line 23: ./test_libbpf_open: No such file or directory
# test_libbpf: failed at file test_l4lb.o
# selftests: test_libbpf [FAILED]
The error message might not affect the return code of make, therefore
one needs to grep make output in order to detect it.
This is not the only instance of the same underlying problem; any test
with more than one element in $(TEST_PROGS) fails the same way. Another
example:
linux# make O=/mnt/linux-build TARGETS=splice kselftest
[...]
# ./short_splice_read.sh: 15: ./splice_read: not found
# FAIL: /sys/module/test_module/sections/.init.text 2
not ok 2 selftests: splice: short_splice_read.sh # exit=1
The current logic prepends $(OUTPUT) only to the first member of
$(TEST_PROGS). After that, run_one() does
cd `dirname $TEST`
For all tests except the first one, `dirname $TEST` is ., which means
they cannot access the files generated in $(OUTPUT).
Fix by using $(addprefix) to prepend $(OUTPUT)/ to each member of
$(TEST_PROGS).
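To make the difference concrete, here is a small Python model of the two
expansions. It is only a sketch: the function names and the example
paths are invented for this note, and it imitates make's text expansion
rather than invoking make.

```python
from typing import List


def prefix_first_only(output: str, test_progs: str) -> List[str]:
    """Model of $(OUTPUT)/$(TEST_PROGS): plain text concatenation.

    The prefix is glued to the start of the whole string, so only the
    first whitespace-separated word ends up with it.
    """
    return (output + '/' + test_progs).split()


def addprefix(prefix: str, words: str) -> List[str]:
    """Model of $(addprefix prefix,words): prefix applied to every word."""
    return [prefix + word for word in words.split()]


if __name__ == '__main__':
    out = '/mnt/linux-build/example'  # hypothetical $(OUTPUT)
    progs = 'first.sh second.sh'      # hypothetical $(TEST_PROGS)
    print(prefix_first_only(out, progs))
    print(addprefix(out + '/', progs))
```

With two scripts in the list, the first form leaves the second script
as a bare relative name, which is the situation described above.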
Fixes: 1a940687e424 ("selftests: lib.mk: copy test scripts and test files for make O=dir run")
Signed-off-by: Ilya Leoshkevich <iii(a)linux.ibm.com>
---
v1->v2:
- Append / to $(OUTPUT).
- Use $(addprefix) instead of $(foreach).
v2->v3:
- Split the patch in two.
- Improve the commit message.
v3: https://lore.kernel.org/linux-kselftest/20191024121347.22189-1-iii@linux.ib…
v3->v4:
- Drop the first patch.
- Add a note regarding make return code to the commit message.
v4: https://lore.kernel.org/linux-kselftest/20191115150428.61131-1-iii@linux.ib…
v4->v5:
- Add another reproducer to the commit message.
tools/testing/selftests/lib.mk | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/lib.mk b/tools/testing/selftests/lib.mk
index a5ce26d548e4..be17462fe146 100644
--- a/tools/testing/selftests/lib.mk
+++ b/tools/testing/selftests/lib.mk
@@ -74,7 +74,8 @@ ifdef building_out_of_srctree
rsync -aq $(TEST_PROGS) $(TEST_PROGS_EXTENDED) $(TEST_FILES) $(OUTPUT); \
fi
@if [ "X$(TEST_PROGS)" != "X" ]; then \
- $(call RUN_TESTS, $(TEST_GEN_PROGS) $(TEST_CUSTOM_PROGS) $(OUTPUT)/$(TEST_PROGS)) ; \
+ $(call RUN_TESTS, $(TEST_GEN_PROGS) $(TEST_CUSTOM_PROGS) \
+ $(addprefix $(OUTPUT)/,$(TEST_PROGS))) ; \
else \
$(call RUN_TESTS, $(TEST_GEN_PROGS) $(TEST_CUSTOM_PROGS)); \
fi
--
2.29.2
Attacks against vulnerable userspace applications that aim to break
ASLR or bypass canaries traditionally use some level of brute force with
the help of the fork system call. This is possible because a new process
created with fork has the same memory contents as the parent process
(the process that called the fork system call). So, the attacker can
probe the memory an unlimited number of times to find the correct memory
values or the correct memory addresses without worrying about crashing the
application.
Based on the above scenario it would be nice to have this detected and
mitigated, and this is the goal of this patch series. Specifically the
following attacks are expected to be detected:
1.- Launching (fork()/exec()) a setuid/setgid process repeatedly until a
desirable memory layout is obtained (e.g. Stack Clash).
2.- Connecting to an exec()ing network daemon (e.g. xinetd) repeatedly
until a desirable memory layout is obtained (e.g. what CTFs do for simple
network service).
3.- Launching processes without exec() (e.g. Android Zygote) and exposing
state to attack a sibling.
4.- Connecting to a fork()ing network daemon (e.g. apache) repeatedly until
the previously shared memory layout of all the other children is
exposed (e.g. kind of related to HeartBleed).
In each case, a privilege boundary has been crossed:
Case 1: setuid/setgid process
Case 2: network to local
Case 3: privilege changes
Case 4: network to local
So, what will really be detected are fork/exec brute force attacks that
cross any of the boundaries listed above.
The implementation details and comparison against other existing
implementations can be found in the "Documentation" patch.
With this background, the individual patches are as follows:
The 1/8 patch defines a new LSM hook to get the fatal signal of a task.
This will be useful during the attack detection phase.
The 2/8 patch defines a new LSM and manages the statistical data shared by
all the fork hierarchy processes.
The 3/8 patch detects a fork/exec brute force attack.
The 4/8 patch narrows the detection by taking into account the privilege
boundary crossing.
The 5/8 patch mitigates a brute force attack.
The 6/8 patch adds self-tests to validate the Brute LSM expectations.
The 7/8 patch adds the documentation to explain this implementation.
The 8/8 patch updates the maintainers file.
This patch series is a task of the KSPP [1] and can also be accessed from my
github tree [2] in the "brute_v6" branch.
[1] https://github.com/KSPP/linux/issues/39
[2] https://github.com/johwood/linux/
The previous versions can be found in:
RFC
https://lore.kernel.org/kernel-hardening/20200910202107.3799376-1-keescook@…
Version 2
https://lore.kernel.org/kernel-hardening/20201025134540.3770-1-john.wood@gm…
Version 3
https://lore.kernel.org/lkml/20210221154919.68050-1-john.wood@gmx.com/
Version 4
https://lore.kernel.org/lkml/20210227150956.6022-1-john.wood@gmx.com/
Version 5
https://lore.kernel.org/kernel-hardening/20210227153013.6747-1-john.wood@gm…
Changelog RFC -> v2
-------------------
- Rename this feature with a more suitable name (Jann Horn, Kees Cook).
- Convert the code to an LSM (Kees Cook).
- Add locking to avoid data races (Jann Horn).
- Add a new LSM hook to get the fatal signal of a task (Jann Horn, Kees
Cook).
- Add the last crashes timestamps list to avoid false positives in the
attack detection (Jann Horn).
- Use "period" instead of "rate" (Jann Horn).
- Other minor changes suggested (Jann Horn, Kees Cook).
Changelog v2 -> v3
------------------
- Compute the application crash period on an on-going basis (Kees Cook).
- Detect a brute force attack through the execve system call (Kees Cook).
- Detect a slow brute force attack (Randy Dunlap).
- Fine tune the detection taking into account privilege boundary crossing
(Kees Cook).
- Take into account only fatal signals delivered by the kernel (Kees
Cook).
- Remove the sysctl attributes used to fine tune the detection (Kees Cook).
- Remove the prctls to allow per process enabling/disabling (Kees Cook).
- Improve the documentation (Kees Cook).
- Fix some typos in the documentation (Randy Dunlap).
- Add self-test to validate the expectations (Kees Cook).
Changelog v3 -> v4
------------------
- Fix all the warnings shown by the tool "scripts/kernel-doc" (Randy
Dunlap).
Changelog v4 -> v5
------------------
- Fix some typos (Randy Dunlap).
Changelog v5 -> v6
------------------
- Fix a reported deadlock (kernel test robot).
- Add high level details to the documentation (Andi Kleen).
Any constructive comments are welcome.
Thanks.
John Wood (8):
security: Add LSM hook at the point where a task gets a fatal signal
security/brute: Define a LSM and manage statistical data
securtiy/brute: Detect a brute force attack
security/brute: Fine tuning the attack detection
security/brute: Mitigate a brute force attack
selftests/brute: Add tests for the Brute LSM
Documentation: Add documentation for the Brute LSM
MAINTAINERS: Add a new entry for the Brute LSM
Documentation/admin-guide/LSM/Brute.rst | 278 ++++++
Documentation/admin-guide/LSM/index.rst | 1 +
MAINTAINERS | 7 +
include/linux/lsm_hook_defs.h | 1 +
include/linux/lsm_hooks.h | 4 +
include/linux/security.h | 4 +
kernel/signal.c | 1 +
security/Kconfig | 11 +-
security/Makefile | 4 +
security/brute/Kconfig | 13 +
security/brute/Makefile | 2 +
security/brute/brute.c | 1107 ++++++++++++++++++++++
security/security.c | 5 +
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/brute/.gitignore | 2 +
tools/testing/selftests/brute/Makefile | 5 +
tools/testing/selftests/brute/config | 1 +
tools/testing/selftests/brute/exec.c | 44 +
tools/testing/selftests/brute/test.c | 507 ++++++++++
tools/testing/selftests/brute/test.sh | 226 +++++
20 files changed, 2219 insertions(+), 5 deletions(-)
create mode 100644 Documentation/admin-guide/LSM/Brute.rst
create mode 100644 security/brute/Kconfig
create mode 100644 security/brute/Makefile
create mode 100644 security/brute/brute.c
create mode 100644 tools/testing/selftests/brute/.gitignore
create mode 100644 tools/testing/selftests/brute/Makefile
create mode 100644 tools/testing/selftests/brute/config
create mode 100644 tools/testing/selftests/brute/exec.c
create mode 100644 tools/testing/selftests/brute/test.c
create mode 100755 tools/testing/selftests/brute/test.sh
--
2.25.1
sched.h has been included at line 33, so remove the
duplicate one at line 36.
inttypes.h has been included at line 19, so remove the
duplicate one at line 23.
pthread.h has been included at line 17, so remove the
duplicate one at line 20.
Signed-off-by: Wan Jiabing <wanjiabing(a)vivo.com>
---
tools/testing/selftests/powerpc/mm/tlbie_test.c | 1 -
tools/testing/selftests/powerpc/tm/tm-poison.c | 1 -
tools/testing/selftests/powerpc/tm/tm-vmx-unavail.c | 1 -
3 files changed, 3 deletions(-)
diff --git a/tools/testing/selftests/powerpc/mm/tlbie_test.c b/tools/testing/selftests/powerpc/mm/tlbie_test.c
index f85a0938ab25..48344a74b212 100644
--- a/tools/testing/selftests/powerpc/mm/tlbie_test.c
+++ b/tools/testing/selftests/powerpc/mm/tlbie_test.c
@@ -33,7 +33,6 @@
#include <sched.h>
#include <time.h>
#include <stdarg.h>
-#include <sched.h>
#include <pthread.h>
#include <signal.h>
#include <sys/prctl.h>
diff --git a/tools/testing/selftests/powerpc/tm/tm-poison.c b/tools/testing/selftests/powerpc/tm/tm-poison.c
index 29e5f26af7b9..27c083a03d1f 100644
--- a/tools/testing/selftests/powerpc/tm/tm-poison.c
+++ b/tools/testing/selftests/powerpc/tm/tm-poison.c
@@ -20,7 +20,6 @@
#include <sched.h>
#include <sys/types.h>
#include <signal.h>
-#include <inttypes.h>
#include "tm.h"
diff --git a/tools/testing/selftests/powerpc/tm/tm-vmx-unavail.c b/tools/testing/selftests/powerpc/tm/tm-vmx-unavail.c
index e2a0c07e8362..9ef37a9836ac 100644
--- a/tools/testing/selftests/powerpc/tm/tm-vmx-unavail.c
+++ b/tools/testing/selftests/powerpc/tm/tm-vmx-unavail.c
@@ -17,7 +17,6 @@
#include <pthread.h>
#include <sys/mman.h>
#include <unistd.h>
-#include <pthread.h>
#include "tm.h"
#include "utils.h"
--
2.25.1
This fix is analogous to Peter Xu's fix for hugetlb [0]. If we don't
put_page() after getting the page out of the page cache, we leak the
reference.
The fix can be verified by checking /proc/meminfo and running the
userfaultfd selftest in shmem mode. Without the fix, we see MemFree /
MemAvailable steadily decreasing with each run of the test. With the
fix, memory is correctly freed after the test program exits.
Fixes: 00da60b9d0a0 ("userfaultfd: support minor fault handling for shmem")
Signed-off-by: Axel Rasmussen <axelrasmussen(a)google.com>
---
mm/shmem.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/mm/shmem.c b/mm/shmem.c
index ef8c9f5e92fc..d2e0e81b7d2e 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1831,6 +1831,7 @@ static int shmem_getpage_gfp(struct inode *inode, pgoff_t index,
if (page && vma && userfaultfd_minor(vma)) {
unlock_page(page);
+ put_page(page);
*fault_type = handle_userfault(vmf, VM_UFFD_MINOR);
return 0;
}
--
2.31.0.rc2.261.g7f71774620-goog
string.h has been included at line 15, so remove the
duplicate one at line 17.
Signed-off-by: Wan Jiabing <wanjiabing(a)vivo.com>
---
tools/testing/selftests/mincore/mincore_selftest.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/tools/testing/selftests/mincore/mincore_selftest.c b/tools/testing/selftests/mincore/mincore_selftest.c
index 5a1e85ff5d32..e54106643337 100644
--- a/tools/testing/selftests/mincore/mincore_selftest.c
+++ b/tools/testing/selftests/mincore/mincore_selftest.c
@@ -14,7 +14,6 @@
#include <sys/mman.h>
#include <string.h>
#include <fcntl.h>
-#include <string.h>
#include "../kselftest.h"
#include "../kselftest_harness.h"
--
2.25.1
The perf subsystem today unifies various tracing and monitoring
features, from both software and hardware. One benefit of the perf
subsystem is automatically inheriting events to child tasks, which
enables process-wide events monitoring with low overheads. By default
perf events are non-intrusive, not affecting behaviour of the tasks
being monitored.
For certain use-cases, however, it makes sense to leverage the
generality of the perf events subsystem and optionally allow the tasks
being monitored to receive signals on events they are interested in.
This patch series adds the option to synchronously signal user space on
events.
To better support process-wide synchronous self-monitoring, without
events propagating to children that do not share the current process's
shared environment, two pre-requisite patches are added to optionally
restrict inheritance to CLONE_THREAD, and remove events on exec (without
affecting the parent).
Examples of how to use these features can be found in the two kselftests at
the end of the series. The kselftests verify and stress test the basic
functionality.
The discussion at [1] led to the changes proposed in this series. The
approach taken in patch "Add support for SIGTRAP on perf events" to use
'event_limit' to trigger the signal was kindly suggested by Peter
Zijlstra in [2].
[1] https://lore.kernel.org/lkml/CACT4Y+YPrXGw+AtESxAgPyZ84TYkNZdP0xpocX2jwVAbZ…
[2] https://lore.kernel.org/lkml/YBv3rAT566k+6zjg@hirez.programming.kicks-ass.n…
Motivation and example uses:
1. Our immediate motivation is low-overhead sampling-based race
detection for user space [3]. By using perf_event_open() at
process initialization, we can create hardware
breakpoint/watchpoint events that are propagated automatically
to all threads in a process. As far as we are aware, today no
existing kernel facility (such as ptrace) allows us to set up
process-wide watchpoints with minimal overheads (that are
comparable to mprotect() of whole pages).
[3] https://llvm.org/devmtg/2020-09/slides/Morehouse-GWP-Tsan.pdf
2. Other low-overhead error detectors that rely on detecting
accesses to certain memory locations or code, process-wide and
also only in a specific set of subtasks or threads.
Other use-cases we found interesting are listed below; they are only meant
to illustrate the range of possibilities and further motivate the feature
(we're sure there are more):
3. Code hot patching without full stop-the-world. Specifically, by
setting a code breakpoint at the entry to the patched routine, then
sending signals to threads and checking that they are not in the
routine, but without stopping them further. If any of the
threads enters the routine, it will receive SIGTRAP and
pause.
4. Safepoints without mprotect(). Some Java implementations use
"load from a known memory location" as a safepoint. When threads
need to be stopped, the page containing the location is
mprotect()ed and threads get a signal. This could be replaced with
a watchpoint, which does not require a whole page nor DTLB
shootdowns.
5. Threads receiving signals on performance events to
throttle/unthrottle themselves.
6. Tracking data flow globally.
---
v2:
* Patch "Support only inheriting events if cloned with CLONE_THREAD"
added to series.
* Patch "Add support for event removal on exec" added to series.
* Patch "Add kselftest for process-wide sigtrap handling" added to
series.
* Patch "Add kselftest for remove_on_exec" added to series.
* Implicitly restrict inheriting events if sigtrap, but the child was
cloned with CLONE_CLEAR_SIGHAND, because it is not generally safe if
the child cleared all signal handlers to continue sending SIGTRAP.
* Various minor fixes (see details in patches).
v1: https://lkml.kernel.org/r/20210223143426.2412737-1-elver@google.com
Marco Elver (8):
perf/core: Apply PERF_EVENT_IOC_MODIFY_ATTRIBUTES to children
perf/core: Support only inheriting events if cloned with CLONE_THREAD
perf/core: Add support for event removal on exec
signal: Introduce TRAP_PERF si_code and si_perf to siginfo
perf/core: Add support for SIGTRAP on perf events
perf/core: Add breakpoint information to siginfo on SIGTRAP
selftests/perf: Add kselftest for process-wide sigtrap handling
selftests/perf: Add kselftest for remove_on_exec
arch/m68k/kernel/signal.c | 3 +
arch/x86/kernel/signal_compat.c | 5 +-
fs/signalfd.c | 4 +
include/linux/compat.h | 2 +
include/linux/perf_event.h | 5 +-
include/linux/signal.h | 1 +
include/uapi/asm-generic/siginfo.h | 6 +-
include/uapi/linux/perf_event.h | 5 +-
include/uapi/linux/signalfd.h | 4 +-
kernel/events/core.c | 130 ++++++++-
kernel/fork.c | 2 +-
kernel/signal.c | 11 +
.../testing/selftests/perf_events/.gitignore | 3 +
tools/testing/selftests/perf_events/Makefile | 6 +
tools/testing/selftests/perf_events/config | 1 +
.../selftests/perf_events/remove_on_exec.c | 256 ++++++++++++++++++
tools/testing/selftests/perf_events/settings | 1 +
.../selftests/perf_events/sigtrap_threads.c | 202 ++++++++++++++
18 files changed, 632 insertions(+), 15 deletions(-)
create mode 100644 tools/testing/selftests/perf_events/.gitignore
create mode 100644 tools/testing/selftests/perf_events/Makefile
create mode 100644 tools/testing/selftests/perf_events/config
create mode 100644 tools/testing/selftests/perf_events/remove_on_exec.c
create mode 100644 tools/testing/selftests/perf_events/settings
create mode 100644 tools/testing/selftests/perf_events/sigtrap_threads.c
--
2.30.1.766.gb4fecdf3b7-goog
When trying to run the arm64 MTE (Memory Tagging Extension) selftests
on a model with the new FEAT_MTE3 capability, the MTE feature detection
failed, because it was overzealously checking for one exact feature
version only (0b0010). Trying to fix that (patch 06/11) led me into the
rabbit hole of userland tool compilation, which triggered patches
01-05/11, to let me actually compile the selftests on an arm64
machine running Ubuntu 20.04. Before I actually fixed that, I tried some
other compiler and distro; patches 07 and 08 are my witnesses.
Then I got brave and tried clang: entering patches 09/11 and 10/11.
Eventually I tried to run the whole thing on that model again, and,
you guessed it, patch 11/11 concludes this apparent "2 minute job".
Eventually I can now compile the mte selftests on Ubuntu 20.04 with both
the native gcc and clang without warnings, also with some custom made
cross compiler. And they even run now!
Please have a look. You may also try to compile it on your setup, if you
feel adventurous:
$ make -C tools/testing/selftests TARGETS=arm64 ARM64_SUBTARGETS=mte
Cheers,
Andre
Andre Przywara (11):
kselftest/arm64: mte: Fix compilation with native compiler
kselftest/arm64: mte: Fix pthread linking
kselftest/arm64: mte: ksm_options: Fix fscanf warning
kselftest/arm64: mte: user_mem: Fix write() warning
kselftest/arm64: mte: common: Fix write() warnings
kselftest/arm64: mte: Fix MTE feature detection
kselftest/arm64: mte: Use cross-compiler if specified
kselftest/arm64: mte: Output warning about failing compiler
kselftest/arm64: mte: Makefile: Fix clang compilation
kselftest/arm64: mte: Fix clang warning
kselftest/arm64: mte: Report filename on failing temp file creation
tools/testing/selftests/arm64/mte/Makefile | 15 +++++--
.../selftests/arm64/mte/check_ksm_options.c | 5 ++-
.../selftests/arm64/mte/check_user_mem.c | 3 +-
.../selftests/arm64/mte/mte_common_util.c | 39 +++++++++++--------
4 files changed, 39 insertions(+), 23 deletions(-)
--
2.17.5
Hi Linus,
Please pull the following KUnit fixes update for Linux 5.12-rc5.
This KUnit update for Linux 5.12-rc5 consists of two fixes to kunit
tool from David Gow.
The diff is attached.
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit a38fd8748464831584a19438cbb3082b5a2dab15:
Linux 5.12-rc2 (2021-03-05 17:33:41 -0800)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
tags/linux-kselftest-kunit-fixes-5.12-rc5.1
for you to fetch changes up to 7fd53f41f771d250eb08db08650940f017e37c26:
kunit: tool: Disable PAGE_POISONING under --alltests (2021-03-11 14:37:37 -0700)
----------------------------------------------------------------
linux-kselftest-kunit-fixes-5.12-rc5.1
This KUnit update for Linux 5.12-rc5 consists of two fixes to kunit
tool from David Gow.
----------------------------------------------------------------
David Gow (2):
kunit: tool: Fix a python tuple typing error
kunit: tool: Disable PAGE_POISONING under --alltests
tools/testing/kunit/configs/broken_on_uml.config | 2 ++
tools/testing/kunit/kunit_config.py | 2 +-
2 files changed, 3 insertions(+), 1 deletion(-)
----------------------------------------------------------------
sched.h is already included at line 33, so remove the duplicate
include at line 36.
Signed-off-by: Wan Jiabing <wanjiabing(a)vivo.com>
---
tools/testing/selftests/powerpc/mm/tlbie_test.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/tools/testing/selftests/powerpc/mm/tlbie_test.c b/tools/testing/selftests/powerpc/mm/tlbie_test.c
index f85a0938ab25..48344a74b212 100644
--- a/tools/testing/selftests/powerpc/mm/tlbie_test.c
+++ b/tools/testing/selftests/powerpc/mm/tlbie_test.c
@@ -33,7 +33,6 @@
#include <sched.h>
#include <time.h>
#include <stdarg.h>
-#include <sched.h>
#include <pthread.h>
#include <signal.h>
#include <sys/prctl.h>
--
2.25.1
inttypes.h is already included at line 19, so remove the duplicate
include at line 23.
Signed-off-by: Wan Jiabing <wanjiabing(a)vivo.com>
---
tools/testing/selftests/powerpc/tm/tm-poison.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/tools/testing/selftests/powerpc/tm/tm-poison.c b/tools/testing/selftests/powerpc/tm/tm-poison.c
index 29e5f26af7b9..27c083a03d1f 100644
--- a/tools/testing/selftests/powerpc/tm/tm-poison.c
+++ b/tools/testing/selftests/powerpc/tm/tm-poison.c
@@ -20,7 +20,6 @@
#include <sched.h>
#include <sys/types.h>
#include <signal.h>
-#include <inttypes.h>
#include "tm.h"
--
2.25.1
pthread.h is already included at line 17, so remove the duplicate
include at line 20.
Signed-off-by: Wan Jiabing <wanjiabing(a)vivo.com>
---
tools/testing/selftests/powerpc/tm/tm-vmx-unavail.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/tools/testing/selftests/powerpc/tm/tm-vmx-unavail.c b/tools/testing/selftests/powerpc/tm/tm-vmx-unavail.c
index e2a0c07e8362..9ef37a9836ac 100644
--- a/tools/testing/selftests/powerpc/tm/tm-vmx-unavail.c
+++ b/tools/testing/selftests/powerpc/tm/tm-vmx-unavail.c
@@ -17,7 +17,6 @@
#include <pthread.h>
#include <sys/mman.h>
#include <unistd.h>
-#include <pthread.h>
#include "tm.h"
#include "utils.h"
--
2.25.1
The patch itself is straightforward thanks to the infrastructure that is
already in place.
The tests follow the other '*_map_batch_ops' tests with minor tweaks.
v1 -> v2:
Fixes for checkpatch warnings
Pedro Tammela (2):
bpf: add support for batched operations in LPM trie maps
bpf: selftests: add tests for batched ops in LPM trie maps
kernel/bpf/lpm_trie.c | 3 +
.../map_tests/lpm_trie_map_batch_ops.c (new) | 158 ++++++++++++++++++
2 files changed, 161 insertions(+)
create mode 100644 tools/testing/selftests/bpf/map_tests/lpm_trie_map_batch_ops.c
--
2.25.1