On Tue, Mar 16, 2021 at 09:42:50PM +0100, Mickaël Salaün wrote:
> From: Mickaël Salaün <mic(a)linux.microsoft.com>
>
> Test all Landlock system calls, ptrace hooks semantic and filesystem
> access-control with multiple layouts.
>
> Test coverage for security/landlock/ is 93.6% of lines. The code not
> covered only deals with internal kernel errors (e.g. memory allocation)
> and race conditions.
>
> Cc: James Morris <jmorris(a)namei.org>
> Cc: Jann Horn <jannh(a)google.com>
> Cc: Kees Cook <keescook(a)chromium.org>
> Cc: Serge E. Hallyn <serge(a)hallyn.com>
> Cc: Shuah Khan <shuah(a)kernel.org>
> Signed-off-by: Mickaël Salaün <mic(a)linux.microsoft.com>
> Reviewed-by: Vincent Dagonneau <vincent.dagonneau(a)ssi.gouv.fr>
> Link: https://lore.kernel.org/r/20210316204252.427806-11-mic@digikod.net
This is terrific. I love the coverage. How did you measure this, BTW?
To increase it into memory allocation failures, have you tried
allocation fault injection:
https://www.kernel.org/doc/html/latest/fault-injection/fault-injection.html
> [...]
> +TEST(inconsistent_attr) {
> + const long page_size = sysconf(_SC_PAGESIZE);
> + char *const buf = malloc(page_size + 1);
> + struct landlock_ruleset_attr *const ruleset_attr = (void *)buf;
> +
> + ASSERT_NE(NULL, buf);
> +
> + /* Checks copy_from_user(). */
> + ASSERT_EQ(-1, landlock_create_ruleset(ruleset_attr, 0, 0));
> + /* The size if less than sizeof(struct landlock_attr_enforce). */
> + ASSERT_EQ(EINVAL, errno);
> + ASSERT_EQ(-1, landlock_create_ruleset(ruleset_attr, 1, 0));
> + ASSERT_EQ(EINVAL, errno);
Almost everywhere you're using ASSERT instead of EXPECT. Is this correct
(in the sense than as soon as an ASSERT fails the rest of the test is
skipped)? I do see you using EXPECT is some places, but I figured I'd
ask about the intention here.
> +/*
> + * TEST_F_FORK() is useful when a test drop privileges but the corresponding
> + * FIXTURE_TEARDOWN() requires them (e.g. to remove files from a directory
> + * where write actions are denied). For convenience, FIXTURE_TEARDOWN() is
> + * also called when the test failed, but not when FIXTURE_SETUP() failed. For
> + * this to be possible, we must not call abort() but instead exit smoothly
> + * (hence the step print).
> + */
Hm, interesting. I think this should be extracted into a separate patch
and added to the test harness proper.
Could this be solved with TEARDOWN being called on SETUP failure?
> +#define TEST_F_FORK(fixture_name, test_name) \
> + static void fixture_name##_##test_name##_child( \
> + struct __test_metadata *_metadata, \
> + FIXTURE_DATA(fixture_name) *self, \
> + const FIXTURE_VARIANT(fixture_name) *variant); \
> + TEST_F(fixture_name, test_name) \
> + { \
> + int status; \
> + const pid_t child = fork(); \
> + if (child < 0) \
> + abort(); \
> + if (child == 0) { \
> + _metadata->no_print = 1; \
> + fixture_name##_##test_name##_child(_metadata, self, variant); \
> + if (_metadata->skip) \
> + _exit(255); \
> + if (_metadata->passed) \
> + _exit(0); \
> + _exit(_metadata->step); \
> + } \
> + if (child != waitpid(child, &status, 0)) \
> + abort(); \
> + if (WIFSIGNALED(status) || !WIFEXITED(status)) { \
> + _metadata->passed = 0; \
> + _metadata->step = 1; \
> + return; \
> + } \
> + switch (WEXITSTATUS(status)) { \
> + case 0: \
> + _metadata->passed = 1; \
> + break; \
> + case 255: \
> + _metadata->passed = 1; \
> + _metadata->skip = 1; \
> + break; \
> + default: \
> + _metadata->passed = 0; \
> + _metadata->step = WEXITSTATUS(status); \
> + break; \
> + } \
> + } \
This looks like a subset of __wait_for_test()? Could __TEST_F_IMPL() be
updated instead to do this? (Though the fork overhead might not be great
for everyone.)
--
Kees Cook
From: Dave Hansen <dave.hansen(a)linux.intel.com>
The SGX device file (/dev/sgx_enclave) is unusual in that it requires
execute permissions. It has to be both "chmod +x" *and* be on a
filesystem without 'noexec'.
In the future, udev and systemd should get updates to set up systems
automatically. But, for now, nobody's systems do this automatically,
and everybody gets error messages like this when running ./test_sgx:
0x0000000000000000 0x0000000000002000 0x03
0x0000000000002000 0x0000000000001000 0x05
0x0000000000003000 0x0000000000003000 0x03
mmap() failed, errno=1.
That isn't very user friendly, even for forgetful kernel developers.
Further, the test case is rather haphazard about its use of fprintf()
versus perror().
Improve the error messages. Use perror() where possible. Lastly,
do some sanity checks on opening and mmap()ing the device file so
that we can get a decent error message out to the user.
Now, if your user doesn't have permission, you'll get the following:
$ ls -l /dev/sgx_enclave
crw------- 1 root root 10, 126 Mar 18 11:29 /dev/sgx_enclave
$ ./test_sgx
Unable to open /dev/sgx_enclave: Permission denied
If you then 'chown dave:dave /dev/sgx_enclave' (or whatever), but
you leave execute permissions off, you'll get:
$ ls -l /dev/sgx_enclave
crw------- 1 dave dave 10, 126 Mar 18 11:29 /dev/sgx_enclave
$ ./test_sgx
no execute permissions on device file
If you fix that with "chmod ug+x /dev/sgx" but you leave /dev as
noexec, you'll get this:
$ mount | grep "/dev .*noexec"
udev on /dev type devtmpfs (rw,nosuid,noexec,...)
$ ./test_sgx
ERROR: mmap for exec: Operation not permitted
mmap() succeeded for PROT_READ, but failed for PROT_EXEC
check that user has execute permissions on /dev/sgx_enclave and
that /dev does not have noexec set: 'mount | grep "/dev .*noexec"'
That can be fixed with:
mount -o remount,noexec /devESC
Hopefully, the combination of better error messages and the search
engines indexing this message will help people fix their systems
until we do this properly.
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Jarkko Sakkinen <jarkko(a)kernel.org>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: x86(a)kernel.org
Cc: linux-sgx(a)vger.kernel.org
Cc: linux-kselftest(a)vger.kernel.org
---
b/tools/testing/selftests/sgx/load.c | 66 +++++++++++++++++++++++++++--------
b/tools/testing/selftests/sgx/main.c | 2 -
2 files changed, 53 insertions(+), 15 deletions(-)
diff -puN tools/testing/selftests/sgx/load.c~sgx-selftest-err-rework tools/testing/selftests/sgx/load.c
--- a/tools/testing/selftests/sgx/load.c~sgx-selftest-err-rework 2021-03-18 12:18:38.649828215 -0700
+++ b/tools/testing/selftests/sgx/load.c 2021-03-18 12:40:46.388824904 -0700
@@ -45,19 +45,19 @@ static bool encl_map_bin(const char *pat
fd = open(path, O_RDONLY);
if (fd == -1) {
- perror("open()");
+ perror("enclave executable open()");
return false;
}
ret = stat(path, &sb);
if (ret) {
- perror("stat()");
+ perror("enclave executable stat()");
goto err;
}
bin = mmap(NULL, sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
if (bin == MAP_FAILED) {
- perror("mmap()");
+ perror("enclave executable mmap()");
goto err;
}
@@ -90,8 +90,7 @@ static bool encl_ioc_create(struct encl
ioc.src = (unsigned long)secs;
rc = ioctl(encl->fd, SGX_IOC_ENCLAVE_CREATE, &ioc);
if (rc) {
- fprintf(stderr, "SGX_IOC_ENCLAVE_CREATE failed: errno=%d\n",
- errno);
+ perror("SGX_IOC_ENCLAVE_CREATE failed");
munmap((void *)secs->base, encl->encl_size);
return false;
}
@@ -116,31 +115,69 @@ static bool encl_ioc_add_pages(struct en
rc = ioctl(encl->fd, SGX_IOC_ENCLAVE_ADD_PAGES, &ioc);
if (rc < 0) {
- fprintf(stderr, "SGX_IOC_ENCLAVE_ADD_PAGES failed: errno=%d.\n",
- errno);
+ perror("SGX_IOC_ENCLAVE_ADD_PAGES failed");
return false;
}
return true;
}
+
+
bool encl_load(const char *path, struct encl *encl)
{
+ const char device_path[] = "/dev/sgx_enclave";
Elf64_Phdr *phdr_tbl;
off_t src_offset;
Elf64_Ehdr *ehdr;
+ struct stat sb;
+ void *ptr;
int i, j;
int ret;
+ int fd = -1;
memset(encl, 0, sizeof(*encl));
- ret = open("/dev/sgx_enclave", O_RDWR);
- if (ret < 0) {
- fprintf(stderr, "Unable to open /dev/sgx_enclave\n");
+ fd = open(device_path, O_RDWR);
+ if (fd < 0) {
+ perror("Unable to open /dev/sgx_enclave");
+ goto err;
+ }
+
+ ret = stat(device_path, &sb);
+ if (ret) {
+ perror("device file stat()");
+ goto err;
+ }
+
+ /*
+ * This just checks if the /dev file has these permission
+ * bits set. It does not check that the current user is
+ * the owner or in the owning group.
+ */
+ if (!(sb.st_mode & (S_IXUSR | S_IXGRP | S_IXOTH))) {
+ fprintf(stderr, "no execute permissions on device file\n");
+ goto err;
+ }
+
+ ptr = mmap(NULL, PAGE_SIZE, PROT_READ, MAP_SHARED, fd, 0);
+ if (ptr == (void *)-1) {
+ perror("mmap for read");
+ goto err;
+ }
+ munmap(ptr, PAGE_SIZE);
+
+ ptr = mmap(NULL, PAGE_SIZE, PROT_EXEC, MAP_SHARED, fd, 0);
+ if (ptr == (void *)-1) {
+ perror("ERROR: mmap for exec");
+ fprintf(stderr, "mmap() succeeded for PROT_READ, but failed for PROT_EXEC\n");
+ fprintf(stderr, "check that user has execute permissions on %s and\n", device_path);
+ fprintf(stderr, "that /dev does not have noexec set: 'mount | grep \"/dev .*noexec\"'\n");
goto err;
}
+ munmap(ptr, PAGE_SIZE);
- encl->fd = ret;
+ encl->fd = fd;
if (!encl_map_bin(path, encl))
goto err;
@@ -217,6 +254,8 @@ bool encl_load(const char *path, struct
return true;
err:
+ if (fd != -1)
+ close(fd);
encl_delete(encl);
return false;
}
@@ -229,7 +268,7 @@ static bool encl_map_area(struct encl *e
area = mmap(NULL, encl_size * 2, PROT_NONE,
MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
if (area == MAP_FAILED) {
- perror("mmap");
+ perror("reservation mmap()");
return false;
}
@@ -268,8 +307,7 @@ bool encl_build(struct encl *encl)
ioc.sigstruct = (uint64_t)&encl->sigstruct;
ret = ioctl(encl->fd, SGX_IOC_ENCLAVE_INIT, &ioc);
if (ret) {
- fprintf(stderr, "SGX_IOC_ENCLAVE_INIT failed: errno=%d\n",
- errno);
+ perror("SGX_IOC_ENCLAVE_INIT failed");
return false;
}
diff -puN tools/testing/selftests/sgx/main.c~sgx-selftest-err-rework tools/testing/selftests/sgx/main.c
--- a/tools/testing/selftests/sgx/main.c~sgx-selftest-err-rework 2021-03-18 12:18:38.652828215 -0700
+++ b/tools/testing/selftests/sgx/main.c 2021-03-18 12:18:38.657828215 -0700
@@ -195,7 +195,7 @@ int main(int argc, char *argv[], char *e
addr = mmap((void *)encl.encl_base + seg->offset, seg->size,
seg->prot, MAP_SHARED | MAP_FIXED, encl.fd, 0);
if (addr == MAP_FAILED) {
- fprintf(stderr, "mmap() failed, errno=%d.\n", errno);
+ perror("mmap() segment failed");
exit(KSFT_FAIL);
}
}
_
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit 07e644885bf6727a48db109fad053cb43f3c9859 ]
We track if sve-ptrace encountered a failure in a variable but don't
actually use that value when we exit the program, do so.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20210309190304.39169-1-broonie@kernel.org
Signed-off-by: Will Deacon <will(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/arm64/fp/sve-ptrace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/arm64/fp/sve-ptrace.c b/tools/testing/selftests/arm64/fp/sve-ptrace.c
index b2282be6f938..612d3899614a 100644
--- a/tools/testing/selftests/arm64/fp/sve-ptrace.c
+++ b/tools/testing/selftests/arm64/fp/sve-ptrace.c
@@ -332,5 +332,5 @@ int main(void)
ksft_print_cnts();
- return 0;
+ return ret;
}
--
2.30.1
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit 07e644885bf6727a48db109fad053cb43f3c9859 ]
We track if sve-ptrace encountered a failure in a variable but don't
actually use that value when we exit the program, do so.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20210309190304.39169-1-broonie@kernel.org
Signed-off-by: Will Deacon <will(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/arm64/fp/sve-ptrace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/arm64/fp/sve-ptrace.c b/tools/testing/selftests/arm64/fp/sve-ptrace.c
index b2282be6f938..612d3899614a 100644
--- a/tools/testing/selftests/arm64/fp/sve-ptrace.c
+++ b/tools/testing/selftests/arm64/fp/sve-ptrace.c
@@ -332,5 +332,5 @@ int main(void)
ksft_print_cnts();
- return 0;
+ return ret;
}
--
2.30.1
This patchset extends IOCTL interface to retrieve KVM statistics data
in aggregated binary format.
It is meant to provide a lightweight, flexible, scalable and efficient
lock-free solution for userspace telemetry applications to pull the
statistics data periodically for large scale systems.
The capability is indicated by KVM_CAP_STATS_BINARY_FORM.
Ioctl KVM_STATS_GET_INFO is used to get the information about VM or
vCPU statistics data (The number of supported statistics data which is
used for buffer allocation).
Ioctl KVM_STATS_GET_NAMES is used to get the list of name strings of
all supported statistics data.
Ioctl KVM_STATS_GET_DATA is used to get the aggregated statistics data
per VM or vCPU in the same order as the list of name strings. This is
the ioctl which would be called periodically to retrieve statistics
data per VM or vCPU.
Jing Zhang (4):
KVM: stats: Separate statistics name strings from debugfs code
KVM: stats: Define APIs for aggregated stats retrieval in binary
format
KVM: stats: Add ioctl commands to pull statistics in binary format
KVM: selftests: Add selftest for KVM binary form statistics interface
Documentation/virt/kvm/api.rst | 79 +++++
arch/arm64/kvm/guest.c | 47 ++-
arch/mips/kvm/mips.c | 114 +++++--
arch/powerpc/kvm/book3s.c | 107 ++++--
arch/powerpc/kvm/booke.c | 84 +++--
arch/s390/kvm/kvm-s390.c | 320 ++++++++++++------
arch/x86/kvm/x86.c | 127 ++++---
include/linux/kvm_host.h | 30 +-
include/uapi/linux/kvm.h | 60 ++++
tools/testing/selftests/kvm/.gitignore | 1 +
tools/testing/selftests/kvm/Makefile | 3 +
.../selftests/kvm/kvm_bin_form_stats.c | 89 +++++
virt/kvm/kvm_main.c | 115 +++++++
13 files changed, 935 insertions(+), 241 deletions(-)
create mode 100644 tools/testing/selftests/kvm/kvm_bin_form_stats.c
base-commit: 357ad203d45c0f9d76a8feadbd5a1c5d460c638b
--
2.30.1.766.gb4fecdf3b7-goog
Attacks against vulnerable userspace applications with the purpose to break
ASLR or bypass canaries traditionally use some level of brute force with
the help of the fork system call. This is possible since when creating a
new process using fork its memory contents are the same as those of the
parent process (the process that called the fork system call). So, the
attacker can test the memory infinite times to find the correct memory
values or the correct memory addresses without worrying about crashing the
application.
Based on the above scenario it would be nice to have this detected and
mitigated, and this is the goal of this patch serie. Specifically the
following attacks are expected to be detected:
1.- Launching (fork()/exec()) a setuid/setgid process repeatedly until a
desirable memory layout is got (e.g. Stack Clash).
2.- Connecting to an exec()ing network daemon (e.g. xinetd) repeatedly
until a desirable memory layout is got (e.g. what CTFs do for simple
network service).
3.- Launching processes without exec() (e.g. Android Zygote) and exposing
state to attack a sibling.
4.- Connecting to a fork()ing network daemon (e.g. apache) repeatedly until
the previously shared memory layout of all the other children is
exposed (e.g. kind of related to HeartBleed).
In each case, a privilege boundary has been crossed:
Case 1: setuid/setgid process
Case 2: network to local
Case 3: privilege changes
Case 4: network to local
So, what will really be detected are fork/exec brute force attacks that
cross any of the commented bounds.
The implementation details and comparison against other existing
implementations can be found in the "Documentation" patch.
This v5 version has changed a lot from the v2. Basically the application
crash period is now compute on an on-going basis using an exponential
moving average (EMA), a detection of a brute force attack through the
"execve" system call has been added and the crossing of the commented
privilege bounds are taken into account. Also, the fine tune has also been
removed and now, all this kind of attacks are detected without
administrator intervention.
In the v2 version Kees Cook suggested to study if the statistical data
shared by all the fork hierarchy processes can be tracked in some other
way. Specifically the question was if this info can be hold by the family
hierarchy of the mm struct. After studying this hierarchy I think it is not
suitable for the Brute LSM since they are totally copied on fork() and in
this case we want that they are shared. So I leave this road.
So, knowing all this information I will explain now the different patches:
The 1/8 patch defines a new LSM hook to get the fatal signal of a task.
This will be useful during the attack detection phase.
The 2/8 patch defines a new LSM and manages the statistical data shared by
all the fork hierarchy processes.
The 3/8 patch detects a fork/exec brute force attack.
The 4/8 patch narrows the detection taken into account the privilege
boundary crossing.
The 5/8 patch mitigates a brute force attack.
The 6/8 patch adds self-tests to validate the Brute LSM expectations.
The 7/8 patch adds the documentation to explain this implementation.
The 8/8 patch updates the maintainers file.
This patch serie is a task of the KSPP [1] and can also be accessed from my
github tree [2] in the "brute_v4" branch.
[1] https://github.com/KSPP/linux/issues/39
[2] https://github.com/johwood/linux/
The previous versions can be found in:
RFC
https://lore.kernel.org/kernel-hardening/20200910202107.3799376-1-keescook@…
Version 2
https://lore.kernel.org/kernel-hardening/20201025134540.3770-1-john.wood@gm…
Version 3
https://lore.kernel.org/lkml/20210221154919.68050-1-john.wood@gmx.com/
Version 4
https://lore.kernel.org/lkml/20210227150956.6022-1-john.wood@gmx.com/
Changelog RFC -> v2
-------------------
- Rename this feature with a more suitable name (Jann Horn, Kees Cook).
- Convert the code to an LSM (Kees Cook).
- Add locking to avoid data races (Jann Horn).
- Add a new LSM hook to get the fatal signal of a task (Jann Horn, Kees
Cook).
- Add the last crashes timestamps list to avoid false positives in the
attack detection (Jann Horn).
- Use "period" instead of "rate" (Jann Horn).
- Other minor changes suggested (Jann Horn, Kees Cook).
Changelog v2 -> v3
------------------
- Compute the application crash period on an on-going basis (Kees Cook).
- Detect a brute force attack through the execve system call (Kees Cook).
- Detect an slow brute force attack (Randy Dunlap).
- Fine tuning the detection taken into account privilege boundary crossing
(Kees Cook).
- Taken into account only fatal signals delivered by the kernel (Kees
Cook).
- Remove the sysctl attributes to fine tuning the detection (Kees Cook).
- Remove the prctls to allow per process enabling/disabling (Kees Cook).
- Improve the documentation (Kees Cook).
- Fix some typos in the documentation (Randy Dunlap).
- Add self-test to validate the expectations (Kees Cook).
Changelog v3 -> v4
------------------
- Fix all the warnings shown by the tool "scripts/kernel-doc" (Randy
Dunlap).
Changelog v4 -> v5
------------------
- Fix some typos (Randy Dunlap).
Any constructive comments are welcome.
Thanks.
John Wood (8):
security: Add LSM hook at the point where a task gets a fatal signal
security/brute: Define a LSM and manage statistical data
securtiy/brute: Detect a brute force attack
security/brute: Fine tuning the attack detection
security/brute: Mitigate a brute force attack
selftests/brute: Add tests for the Brute LSM
Documentation: Add documentation for the Brute LSM
MAINTAINERS: Add a new entry for the Brute LSM
Documentation/admin-guide/LSM/Brute.rst | 224 +++++
Documentation/admin-guide/LSM/index.rst | 1 +
MAINTAINERS | 7 +
include/linux/lsm_hook_defs.h | 1 +
include/linux/lsm_hooks.h | 4 +
include/linux/security.h | 4 +
kernel/signal.c | 1 +
security/Kconfig | 11 +-
security/Makefile | 4 +
security/brute/Kconfig | 13 +
security/brute/Makefile | 2 +
security/brute/brute.c | 1102 ++++++++++++++++++++++
security/security.c | 5 +
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/brute/.gitignore | 2 +
tools/testing/selftests/brute/Makefile | 5 +
tools/testing/selftests/brute/config | 1 +
tools/testing/selftests/brute/exec.c | 44 +
tools/testing/selftests/brute/test.c | 507 ++++++++++
tools/testing/selftests/brute/test.sh | 226 +++++
20 files changed, 2160 insertions(+), 5 deletions(-)
create mode 100644 Documentation/admin-guide/LSM/Brute.rst
create mode 100644 security/brute/Kconfig
create mode 100644 security/brute/Makefile
create mode 100644 security/brute/brute.c
create mode 100644 tools/testing/selftests/brute/.gitignore
create mode 100644 tools/testing/selftests/brute/Makefile
create mode 100644 tools/testing/selftests/brute/config
create mode 100644 tools/testing/selftests/brute/exec.c
create mode 100644 tools/testing/selftests/brute/test.c
create mode 100755 tools/testing/selftests/brute/test.sh
--
2.25.1
We track if sve-ptrace encountered a failure in a variable but don't
actually use that value when we exit the program, do so.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
tools/testing/selftests/arm64/fp/sve-ptrace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/arm64/fp/sve-ptrace.c b/tools/testing/selftests/arm64/fp/sve-ptrace.c
index b2282be6f938..612d3899614a 100644
--- a/tools/testing/selftests/arm64/fp/sve-ptrace.c
+++ b/tools/testing/selftests/arm64/fp/sve-ptrace.c
@@ -332,5 +332,5 @@ int main(void)
ksft_print_cnts();
- return 0;
+ return ret;
}
--
2.20.1
Base
====
This series is based on top of my series which adds minor fault handling for
hugetlbfs [1]. (And, therefore, it is based on 5.12-rc1 and Peter Xu's series
for disabling huge pmd sharing as well.)
[1] https://lore.kernel.org/linux-fsdevel/20210301222728.176417-1-axelrasmussen…
Changelog
=========
v1->v2:
- For UFFDIO_CONTINUE, don't mess with page flags. Just use find_lock_page to
get a locked page from the page cache, instead of doing __SetPageLocked.
This fixes a VM_BUG_ON v1 hit when handling minor faults for THP-backed
shmem (a tmpfs mounted with huge=always).
Overview
========
See my original series linked above for a detailed overview of minor fault
handling in general. The feature in this series works exactly like the
hugetblfs version (from userspace's perspective).
I'm sending this as a separate series because:
- The original minor fault handling series has a full set of R-Bs, and seems
close to being merged. So, it seems reasonable to start looking at this next
step, which extends the basic functionality.
- shmem is different enough that this series may require some additional work
before it's ready, and I don't want to delay the original series
unnecessarily by bundling them together.
Use Case
========
In some cases it is useful to have VM memory backed by tmpfs instead of
hugetlbfs. So, this feature will be used to support the same VM live migration
use case described in my original series.
Additionally, Android folks (Lokesh Gidra <lokeshgidra(a)google.com>) hope to
optimize the Android Runtime garbage collector using this feature:
"The plan is to use userfaultfd for concurrently compacting the heap. With
this feature, the heap can be shared-mapped at another location where the
GC-thread(s) could continue the compaction operation without the need to
invoke userfault ioctl(UFFDIO_COPY) each time. OTOH, if and when Java threads
get faults on the heap, UFFDIO_CONTINUE can be used to resume execution.
Furthermore, this feature enables updating references in the 'non-moving'
portion of the heap efficiently. Without this feature, uneccessary page
copying (ioctl(UFFDIO_COPY)) would be required."
Axel Rasmussen (5):
userfaultfd: support minor fault handling for shmem
userfaultfd/selftests: use memfd_create for shmem test type
userfaultfd/selftests: create alias mappings in the shmem test
userfaultfd/selftests: reinitialize test context in each test
userfaultfd/selftests: exercise minor fault handling shmem support
fs/userfaultfd.c | 6 +-
include/linux/shmem_fs.h | 26 +-
include/uapi/linux/userfaultfd.h | 4 +-
mm/memory.c | 8 +-
mm/shmem.c | 92 +++----
mm/userfaultfd.c | 27 +-
tools/testing/selftests/vm/userfaultfd.c | 322 +++++++++++++++--------
7 files changed, 295 insertions(+), 190 deletions(-)
--
2.30.1.766.gb4fecdf3b7-goog
Hi,
This patch series introduces the futex2 syscalls.
* What happened to the current futex()?
For some years now, developers have been trying to add new features to
futex, but maintainers have been reluctant to accept then, given the
multiplexed interface full of legacy features and tricky to do big
changes. Some problems that people tried to address with patchsets are:
NUMA-awareness[0], smaller sized futexes[1], wait on multiple futexes[2].
NUMA, for instance, just doesn't fit the current API in a reasonable
way. Considering that, it's not possible to merge new features into the
current futex.
** The NUMA problem
At the current implementation, all futex kernel side infrastructure is
stored on a single node. Given that, all futex() calls issued by
processors that aren't located on that node will have a memory access
penalty when doing it.
** The 32bit sized futex problem
Embedded systems or anything with memory constrains would benefit of
using smaller sizes for the futex userspace integer. Also, a mutex
implementation can be done using just three values, so 8 bits is enough
for various scenarios.
** The wait on multiple problem
The use case lies in the Wine implementation of the Windows NT interface
WaitMultipleObjects. This Windows API function allows a thread to sleep
waiting on the first of a set of event sources (mutexes, timers, signal,
console input, etc) to signal. Considering this is a primitive
synchronization operation for Windows applications, being able to quickly
signal events on the producer side, and quickly go to sleep on the
consumer side is essential for good performance of those running over Wine.
[0] https://lore.kernel.org/lkml/20160505204230.932454245@linutronix.de/
[1] https://lore.kernel.org/lkml/20191221155659.3159-2-malteskarupke@web.de/
[2] https://lore.kernel.org/lkml/20200213214525.183689-1-andrealmeid@collabora.…
* The solution
As proposed by Peter Zijlstra and Florian Weimer[3], a new interface
is required to solve this, which must be designed with those features in
mind. futex2() is that interface. As opposed to the current multiplexed
interface, the new one should have one syscall per operation. This will
allow the maintainability of the API if it gets extended, and will help
users with type checking of arguments.
In particular, the new interface is extended to support the ability to
wait on any of a list of futexes at a time, which could be seen as a
vectored extension of the FUTEX_WAIT semantics.
[3] https://lore.kernel.org/lkml/20200303120050.GC2596@hirez.programming.kicks-…
* The interface
The new interface can be seen in details in the following patches, but
this is a high level summary of what the interface can do:
- Supports wake/wait semantics, as in futex()
- Supports requeue operations, similarly as FUTEX_CMP_REQUEUE, but with
individual flags for each address
- Supports waiting for a vector of futexes, using a new syscall named
futex_waitv()
- Supports variable sized futexes (8bits, 16bits and 32bits)
- Supports NUMA-awareness operations, where the user can specify on
which memory node would like to operate
* Implementation
The internal implementation follows a similar design to the original futex.
Given that we want to replicate the same external behavior of current
futex, this should be somewhat expected. For some functions, like the
init and the code to get a shared key, I literally copied code and
comments from kernel/futex.c. I decided to do so instead of exposing the
original function as a public function since in that way we can freely
modify our implementation if required, without any impact on old futex.
Also, the comments precisely describes the details and corner cases of
the implementation.
Each patch contains a brief description of implementation, but patch 6
"docs: locking: futex2: Add documentation" adds a more complete document
about it.
* The patchset
This patchset can be also found at my git tree:
https://gitlab.collabora.com/tonyk/linux/-/tree/futex2-dev
- Patch 1: Implements wait/wake, and the basics foundations of futex2
- Patches 2-4: Implement the remaining features (shared, waitv, requeue).
- Patch 5: Adds the x86_x32 ABI handling. I kept it in a separated
patch since I'm not sure if x86_x32 is still a thing, or if it should
return -ENOSYS.
- Patch 6: Add a documentation file which details the interface and
the internal implementation.
- Patches 7-13: Selftests for all operations along with perf
support for futex2.
- Patch 14: While working on porting glibc for futex2, I found out
that there's a futex_wake() call at the user thread exit path, if
that thread was created with clone(..., CLONE_CHILD_SETTID, ...). In
order to make pthreads work with futex2, it was required to add
this patch. Note that this is more a proof-of-concept of what we
will need to do in future, rather than part of the interface and
shouldn't be merged as it is.
* Testing:
This patchset provides selftests for each operation and their flags.
Along with that, the following work was done:
** Stability
To stress the interface in "real world scenarios":
- glibc[4]: nptl's low level locking was modified to use futex2 API
(except for robust and PI things). All relevant nptl/ tests passed.
- Wine[5]: Proton/Wine was modified in order to use futex2() for the
emulation of Windows NT sync mechanisms based on futex, called "fsync".
Triple-A games with huge CPU's loads and tons of parallel jobs worked
as expected when compared with the previous FUTEX_WAIT_MULTIPLE
implementation at futex(). Some games issue 42k futex2() calls
per second.
- Full GNU/Linux distro: I installed the modified glibc in my host
machine, so all pthread's programs would use futex2(). After tweaking
systemd[6] to allow futex2() calls at seccomp, everything worked as
expected (web browsers do some syscall sandboxing and need some
configuration as well).
- perf: The perf benchmarks tests can also be used to stress the
interface, and they can be found in this patchset.
** Performance
- For comparing futex() and futex2() performance, I used the artificial
benchmarks implemented at perf (wake, wake-parallel, hash and
requeue). The setup was 200 runs for each test and using 8, 80, 800,
8000 for the number of threads, Note that for this test, I'm not using
patch 14 ("kernel: Enable waitpid() for futex2") , for reasons explained
at "The patchset" section.
- For the first three ones, I measured an average of 4% gain in
performance. This is not a big step, but it shows that the new
interface is at least comparable in performance with the current one.
- For requeue, I measured an average of 21% decrease in performance
compared to the original futex implementation. This is expected given
the new design with individual flags. The performance trade-offs are
explained at patch 4 ("futex2: Implement requeue operation").
[4] https://gitlab.collabora.com/tonyk/glibc/-/tree/futex2
[5] https://gitlab.collabora.com/tonyk/wine/-/tree/proton_5.13
[6] https://gitlab.collabora.com/tonyk/systemd
* FAQ
** "Where's the code for NUMA and FUTEX_8/16?"
The current code is already complex enough to take some time for
review, so I believe it's better to split that work out to a future
iteration of this patchset. Besides that, this RFC is the core part of the
infrastructure, and the following features will not pose big design
changes to it, the work will be more about wiring up the flags and
modifying some functions.
** "And what's about FUTEX_64?"
By supporting 64 bit futexes, the kernel structure for futex would
need to have a 64 bit field for the value, and that could defeat one of
the purposes of having different sized futexes in the first place:
supporting smaller ones to decrease memory usage. This might be
something that could be disabled for 32bit archs (and even for
CONFIG_BASE_SMALL).
Which use case would benefit for FUTEX_64? Does it worth the trade-offs?
** "Where's the PI/robust stuff?"
As said by Peter Zijlstra at [3], all those new features are related to
the "simple" futex interface, that doesn't use PI or robust. Do we want
to have this complexity at futex2() and if so, should it be part of
this patchset or can it be future work?
Thanks,
André
* Changelog
Changes from v1:
- Unified futex_set_timer_and_wait and __futex_wait code
- Dropped _carefull from linked list function calls
- Fixed typos on docs patch
- uAPI flags are now added as features are introduced, instead of all flags
in patch 1
- Removed struct futex_single_waiter in favor of an anon struct
v1: https://lore.kernel.org/lkml/20210215152404.250281-1-andrealmeid@collabora.…
André Almeida (13):
futex2: Implement wait and wake functions
futex2: Add support for shared futexes
futex2: Implement vectorized wait
futex2: Implement requeue operation
futex2: Add compatibility entry point for x86_x32 ABI
docs: locking: futex2: Add documentation
selftests: futex2: Add wake/wait test
selftests: futex2: Add timeout test
selftests: futex2: Add wouldblock test
selftests: futex2: Add waitv test
selftests: futex2: Add requeue test
perf bench: Add futex2 benchmark tests
kernel: Enable waitpid() for futex2
Documentation/locking/futex2.rst | 198 +++
Documentation/locking/index.rst | 1 +
MAINTAINERS | 2 +-
arch/arm/tools/syscall.tbl | 4 +
arch/arm64/include/asm/unistd.h | 2 +-
arch/arm64/include/asm/unistd32.h | 8 +
arch/x86/entry/syscalls/syscall_32.tbl | 4 +
arch/x86/entry/syscalls/syscall_64.tbl | 4 +
fs/inode.c | 1 +
include/linux/compat.h | 23 +
include/linux/fs.h | 1 +
include/linux/syscalls.h | 18 +
include/uapi/asm-generic/unistd.h | 14 +-
include/uapi/linux/futex.h | 31 +
init/Kconfig | 7 +
kernel/Makefile | 1 +
kernel/fork.c | 2 +
kernel/futex2.c | 1239 +++++++++++++++++
kernel/sys_ni.c | 6 +
tools/arch/x86/include/asm/unistd_64.h | 12 +
tools/include/uapi/asm-generic/unistd.h | 11 +-
.../arch/x86/entry/syscalls/syscall_64.tbl | 4 +
tools/perf/bench/bench.h | 4 +
tools/perf/bench/futex-hash.c | 24 +-
tools/perf/bench/futex-requeue.c | 57 +-
tools/perf/bench/futex-wake-parallel.c | 41 +-
tools/perf/bench/futex-wake.c | 37 +-
tools/perf/bench/futex.h | 47 +
tools/perf/builtin-bench.c | 18 +-
.../selftests/futex/functional/.gitignore | 3 +
.../selftests/futex/functional/Makefile | 8 +-
.../futex/functional/futex2_requeue.c | 164 +++
.../selftests/futex/functional/futex2_wait.c | 209 +++
.../selftests/futex/functional/futex2_waitv.c | 157 +++
.../futex/functional/futex_wait_timeout.c | 58 +-
.../futex/functional/futex_wait_wouldblock.c | 33 +-
.../testing/selftests/futex/functional/run.sh | 6 +
.../selftests/futex/include/futex2test.h | 121 ++
38 files changed, 2527 insertions(+), 53 deletions(-)
create mode 100644 Documentation/locking/futex2.rst
create mode 100644 kernel/futex2.c
create mode 100644 tools/testing/selftests/futex/functional/futex2_requeue.c
create mode 100644 tools/testing/selftests/futex/functional/futex2_wait.c
create mode 100644 tools/testing/selftests/futex/functional/futex2_waitv.c
create mode 100644 tools/testing/selftests/futex/include/futex2test.h
--
2.30.1
From: Aaron Lewis <aaronlewis(a)google.com>
[ Upstream commit 6528fc0a11de3d16339cf17639e2f69a68fcaf4d ]
The vcpu mmap area may consist of more than just the kvm_run struct.
Allocate enough space for the entire vcpu mmap area. Without this, on
x86, the PIO page, for example, will be missing. This is problematic
when dealing with an unhandled exception from the guest as the exception
vector will be incorrectly reported as 0x0.
Message-Id: <20210210165035.3712489-1-aaronlewis(a)google.com>
Reviewed-by: Andrew Jones <drjones(a)redhat.com>
Co-developed-by: Steve Rutherford <srutherford(a)google.com>
Signed-off-by: Aaron Lewis <aaronlewis(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/kvm/lib/kvm_util.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index fa5a90e6c6f0..859a0b57c683 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -21,6 +21,8 @@
#define KVM_UTIL_PGS_PER_HUGEPG 512
#define KVM_UTIL_MIN_PFN 2
+static int vcpu_mmap_sz(void);
+
/* Aligns x up to the next multiple of size. Size must be a power of 2. */
static void *align(void *x, size_t size)
{
@@ -509,7 +511,7 @@ static void vm_vcpu_rm(struct kvm_vm *vm, struct vcpu *vcpu)
vcpu->dirty_gfns = NULL;
}
- ret = munmap(vcpu->state, sizeof(*vcpu->state));
+ ret = munmap(vcpu->state, vcpu_mmap_sz());
TEST_ASSERT(ret == 0, "munmap of VCPU fd failed, rc: %i "
"errno: %i", ret, errno);
close(vcpu->fd);
@@ -978,7 +980,7 @@ void vm_vcpu_add(struct kvm_vm *vm, uint32_t vcpuid)
TEST_ASSERT(vcpu_mmap_sz() >= sizeof(*vcpu->state), "vcpu mmap size "
"smaller than expected, vcpu_mmap_sz: %i expected_min: %zi",
vcpu_mmap_sz(), sizeof(*vcpu->state));
- vcpu->state = (struct kvm_run *) mmap(NULL, sizeof(*vcpu->state),
+ vcpu->state = (struct kvm_run *) mmap(NULL, vcpu_mmap_sz(),
PROT_READ | PROT_WRITE, MAP_SHARED, vcpu->fd, 0);
TEST_ASSERT(vcpu->state != MAP_FAILED, "mmap vcpu_state failed, "
"vcpu id: %u errno: %i", vcpuid, errno);
--
2.30.1
Fix the following coccicheck warnings:
./tools/testing/selftests/bpf/test_sockmap.c:735:35-37: WARNING !A || A
&& B is equivalent to !A || B.
Reported-by: Abaci Robot <abaci(a)linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong(a)linux.alibaba.com>
---
tools/testing/selftests/bpf/test_sockmap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/test_sockmap.c b/tools/testing/selftests/bpf/test_sockmap.c
index 427ca00..eefd445 100644
--- a/tools/testing/selftests/bpf/test_sockmap.c
+++ b/tools/testing/selftests/bpf/test_sockmap.c
@@ -732,7 +732,7 @@ static int sendmsg_test(struct sockmap_options *opt)
* socket is not a valid test. So in this case lets not
* enable kTLS but still run the test.
*/
- if (!txmsg_redir || (txmsg_redir && txmsg_ingress)) {
+ if (!txmsg_redir || txmsg_ingress) {
err = sockmap_init_ktls(opt->verbose, rx_fd);
if (err)
return err;
--
1.8.3.1
On Tue, Mar 02, 2021 at 01:49:41PM +0800, kernel test robot wrote:
>
>
> Greeting,
>
> FYI, we noticed the following commit (built with gcc-9):
>
> commit: cfe92ab6a3ea700c08ba673b46822d51f38d6b40 ("[PATCH v5 2/8] security/brute: Define a LSM and manage statistical data")
> url: https://github.com/0day-ci/linux/commits/John-Wood/Fork-brute-force-attack-…
> base: https://git.kernel.org/cgit/linux/kernel/git/shuah/linux-kselftest.git next
>
> in testcase: trinity
> version: trinity-static-i386-x86_64-f93256fb_2019-08-28
> with following parameters:
>
> group: ["group-00", "group-01", "group-02", "group-03", "group-04"]
>
> test-description: Trinity is a linux system call fuzz tester.
> test-url: http://codemonkey.org.uk/projects/trinity/
>
>
> on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 8G
>
> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
>
>
> +-------------------------------------------------------------------------+------------+------------+
> | | 1d53b7aac6 | cfe92ab6a3 |
> +-------------------------------------------------------------------------+------------+------------+
> | WARNING:inconsistent_lock_state | 0 | 6 |
> | inconsistent{IN-SOFTIRQ-W}->{SOFTIRQ-ON-W}usage | 0 | 6 |
> +-------------------------------------------------------------------------+------------+------------+
>
>
> If you fix the issue, kindly add following tag
> Reported-by: kernel test robot <oliver.sang(a)intel.com>
>
>
> [ 116.852721] ================================
> [ 116.853120] WARNING: inconsistent lock state
> [ 116.853120] 5.11.0-rc7-00013-gcfe92ab6a3ea #1 Tainted: G S
> [ 116.853120] --------------------------------
>
> [...]
Thanks for the report. I will work on this for the next version.
> Thanks,
> Oliver Sang
Thanks,
John Wood
Hi,
This v3 series can mainly include two parts.
Based on kvm queue branch: https://git.kernel.org/pub/scm/virt/kvm/kvm.git/log/?h=queue
Links of v1: https://lore.kernel.org/lkml/20210208090841.333724-1-wangyanan55@huawei.com/
Links of v2: https://lore.kernel.org/lkml/20210225055940.18748-1-wangyanan55@huawei.com/
In the first part, all the known hugetlb backing src types specified
with different hugepage sizes are listed, so that we can specify use
of hugetlb source of the exact granularity that we want, instead of
the system default ones. And as all the known hugetlb page sizes are
listed, it's appropriate for all architectures. Besides, a helper that
can get granularity of different backing src types(anonumous/thp/hugetlb)
is added, so that we can use the accurate backing src granularity for
kinds of alignment or guest memory accessing of vcpus.
In the second part, a new test is added:
This test is added to serve as a performance tester and a bug reproducer
for kvm page table code (GPA->HPA mappings), it gives guidance for the
people trying to make some improvement for kvm. And the following explains
what we can exactly do through this test.
The function guest_code() can cover the conditions where a single vcpu or
multiple vcpus access guest pages within the same memory region, in three
VM stages(before dirty logging, during dirty logging, after dirty logging).
Besides, the backing src memory type(ANONYMOUS/THP/HUGETLB) of the tested
memory region can be specified by users, which means normal page mappings
or block mappings can be chosen by users to be created in the test.
If ANONYMOUS memory is specified, kvm will create normal page mappings
for the tested memory region before dirty logging, and update attributes
of the page mappings from RO to RW during dirty logging. If THP/HUGETLB
memory is specified, kvm will create block mappings for the tested memory
region before dirty logging, and split the blcok mappings into normal page
mappings during dirty logging, and coalesce the page mappings back into
block mappings after dirty logging is stopped.
So in summary, as a performance tester, this test can present the
performance of kvm creating/updating normal page mappings, or the
performance of kvm creating/splitting/recovering block mappings,
through execution time.
When we need to coalesce the page mappings back to block mappings after
dirty logging is stopped, we have to firstly invalidate *all* the TLB
entries for the page mappings right before installation of the block entry,
because a TLB conflict abort error could occur if we can't invalidate the
TLB entries fully. We have hit this TLB conflict twice on aarch64 software
implementation and fixed it. As this test can imulate process from dirty
logging enabled to dirty logging stopped of a VM with block mappings,
so it can also reproduce this TLB conflict abort due to inadequate TLB
invalidation when coalescing tables.
Links about the TLB conflict abort:
https://lore.kernel.org/lkml/20201201201034.116760-3-wangyanan55@huawei.com/
---
Change logs:
v2->v3:
- Add tags of Suggested-by, Reviewed-by in the patches
- Add a generic micro to get hugetlb page sizes
- Some changes for suggestions about v2 series
v1->v2:
- Add a patch to sync header files
- Add helpers to get granularity of different backing src types
- Some changes for suggestions about v1 series
---
Yanan Wang (7):
tools headers: sync headers of asm-generic/hugetlb_encode.h
tools headers: Add a macro to get HUGETLB page sizes for mmap
KVM: selftests: Use flag CLOCK_MONOTONIC_RAW for timing
KVM: selftests: Make a generic helper to get vm guest mode strings
KVM: selftests: Add a helper to get system configured THP page size
KVM: selftests: List all hugetlb src types specified with page sizes
KVM: selftests: Adapt vm_userspace_mem_region_add to new helpers
KVM: selftests: Add a test for kvm page table code
include/uapi/linux/mman.h | 2 +
tools/include/asm-generic/hugetlb_encode.h | 3 +
tools/include/uapi/linux/mman.h | 2 +
tools/testing/selftests/kvm/Makefile | 3 +
.../selftests/kvm/demand_paging_test.c | 8 +-
.../selftests/kvm/dirty_log_perf_test.c | 14 +-
.../testing/selftests/kvm/include/kvm_util.h | 4 +-
.../testing/selftests/kvm/include/test_util.h | 21 +-
.../selftests/kvm/kvm_page_table_test.c | 476 ++++++++++++++++++
tools/testing/selftests/kvm/lib/kvm_util.c | 59 ++-
tools/testing/selftests/kvm/lib/test_util.c | 92 +++-
tools/testing/selftests/kvm/steal_time.c | 4 +-
12 files changed, 628 insertions(+), 60 deletions(-)
create mode 100644 tools/testing/selftests/kvm/kvm_page_table_test.c
--
2.19.1
From: John Stultz <john.stultz(a)linaro.org>
[ Upstream commit 64ba3d591c9d2be2a9c09e99b00732afe002ad0d ]
Copied in from somewhere else, the makefile was including
the kerne's usr/include dir, which caused the asm/ioctl.h file
to be used.
Unfortunately, that file has different values for _IOC_SIZEBITS
and _IOC_WRITE than include/uapi/asm-generic/ioctl.h which then
causes the _IOCW macros to give the wrong ioctl numbers,
specifically for DMA_BUF_IOCTL_SYNC.
This patch simply removes the extra include from the Makefile
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Brian Starkey <brian.starkey(a)arm.com>
Cc: Sumit Semwal <sumit.semwal(a)linaro.org>
Cc: Laura Abbott <labbott(a)kernel.org>
Cc: Hridya Valsaraju <hridya(a)google.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Sandeep Patil <sspatil(a)google.com>
Cc: Daniel Mentz <danielmentz(a)google.com>
Cc: linux-media(a)vger.kernel.org
Cc: dri-devel(a)lists.freedesktop.org
Cc: linux-kselftest(a)vger.kernel.org
Fixes: a8779927fd86c ("kselftests: Add dma-heap test")
Signed-off-by: John Stultz <john.stultz(a)linaro.org>
Signed-off-by: Shuah Khan <skhan(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/dmabuf-heaps/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/dmabuf-heaps/Makefile b/tools/testing/selftests/dmabuf-heaps/Makefile
index 607c2acd20829..604b43ece15f5 100644
--- a/tools/testing/selftests/dmabuf-heaps/Makefile
+++ b/tools/testing/selftests/dmabuf-heaps/Makefile
@@ -1,5 +1,5 @@
# SPDX-License-Identifier: GPL-2.0
-CFLAGS += -static -O3 -Wl,-no-as-needed -Wall -I../../../../usr/include
+CFLAGS += -static -O3 -Wl,-no-as-needed -Wall
TEST_GEN_PROGS = dmabuf-heap
--
2.27.0
From: John Stultz <john.stultz(a)linaro.org>
[ Upstream commit 64ba3d591c9d2be2a9c09e99b00732afe002ad0d ]
Copied in from somewhere else, the makefile was including
the kerne's usr/include dir, which caused the asm/ioctl.h file
to be used.
Unfortunately, that file has different values for _IOC_SIZEBITS
and _IOC_WRITE than include/uapi/asm-generic/ioctl.h which then
causes the _IOCW macros to give the wrong ioctl numbers,
specifically for DMA_BUF_IOCTL_SYNC.
This patch simply removes the extra include from the Makefile
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Brian Starkey <brian.starkey(a)arm.com>
Cc: Sumit Semwal <sumit.semwal(a)linaro.org>
Cc: Laura Abbott <labbott(a)kernel.org>
Cc: Hridya Valsaraju <hridya(a)google.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Sandeep Patil <sspatil(a)google.com>
Cc: Daniel Mentz <danielmentz(a)google.com>
Cc: linux-media(a)vger.kernel.org
Cc: dri-devel(a)lists.freedesktop.org
Cc: linux-kselftest(a)vger.kernel.org
Fixes: a8779927fd86c ("kselftests: Add dma-heap test")
Signed-off-by: John Stultz <john.stultz(a)linaro.org>
Signed-off-by: Shuah Khan <skhan(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/dmabuf-heaps/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/dmabuf-heaps/Makefile b/tools/testing/selftests/dmabuf-heaps/Makefile
index 607c2acd20829..604b43ece15f5 100644
--- a/tools/testing/selftests/dmabuf-heaps/Makefile
+++ b/tools/testing/selftests/dmabuf-heaps/Makefile
@@ -1,5 +1,5 @@
# SPDX-License-Identifier: GPL-2.0
-CFLAGS += -static -O3 -Wl,-no-as-needed -Wall -I../../../../usr/include
+CFLAGS += -static -O3 -Wl,-no-as-needed -Wall
TEST_GEN_PROGS = dmabuf-heap
--
2.27.0
Attacks against vulnerable userspace applications with the purpose to break
ASLR or bypass canaries traditionally use some level of brute force with
the help of the fork system call. This is possible since when creating a
new process using fork its memory contents are the same as those of the
parent process (the process that called the fork system call). So, the
attacker can test the memory infinite times to find the correct memory
values or the correct memory addresses without worrying about crashing the
application.
Based on the above scenario it would be nice to have this detected and
mitigated, and this is the goal of this patch serie. Specifically the
following attacks are expected to be detected:
1.- Launching (fork()/exec()) a setuid/setgid process repeatedly until a
desirable memory layout is got (e.g. Stack Clash).
2.- Connecting to an exec()ing network daemon (e.g. xinetd) repeatedly
until a desirable memory layout is got (e.g. what CTFs do for simple
network service).
3.- Launching processes without exec() (e.g. Android Zygote) and exposing
state to attack a sibling.
4.- Connecting to a fork()ing network daemon (e.g. apache) repeatedly until
the previously shared memory layout of all the other children is
exposed (e.g. kind of related to HeartBleed).
In each case, a privilege boundary has been crossed:
Case 1: setuid/setgid process
Case 2: network to local
Case 3: privilege changes
Case 4: network to local
So, what will really be detected are fork/exec brute force attacks that
cross any of the commented bounds.
The implementation details and comparison against other existing
implementations can be found in the "Documentation" patch.
This v4 version has changed a lot from the v2. Basically the application
crash period is now compute on an on-going basis using an exponential
moving average (EMA), a detection of a brute force attack through the
"execve" system call has been added and the crossing of the commented
privilege bounds are taken into account. Also, the fine tune has also been
removed and now, all this kind of attacks are detected without
administrator intervention.
In the v2 version Kees Cook suggested to study if the statistical data
shared by all the fork hierarchy processes can be tracked in some other
way. Specifically the question was if this info can be hold by the family
hierarchy of the mm struct. After studying this hierarchy I think it is not
suitable for the Brute LSM since they are totally copied on fork() and in
this case we want that they are shared. So I leave this road.
So, knowing all this information I will explain now the different patches:
The 1/8 patch defines a new LSM hook to get the fatal signal of a task.
This will be useful during the attack detection phase.
The 2/8 patch defines a new LSM and manages the statistical data shared by
all the fork hierarchy processes.
The 3/8 patch detects a fork/exec brute force attack.
The 4/8 patch narrows the detection taken into account the privilege
boundary crossing.
The 5/8 patch mitigates a brute force attack.
The 6/8 patch adds self-tests to validate the Brute LSM expectations.
The 7/8 patch adds the documentation to explain this implementation.
The 8/8 patch updates the maintainers file.
This patch serie is a task of the KSPP [1] and can also be accessed from my
github tree [2] in the "brute_v4" branch.
[1] https://github.com/KSPP/linux/issues/39
[2] https://github.com/johwood/linux/
The previous versions can be found in:
RFC
https://lore.kernel.org/kernel-hardening/20200910202107.3799376-1-keescook@…
Version 2
https://lore.kernel.org/kernel-hardening/20201025134540.3770-1-john.wood@gm…
Version 3
https://lore.kernel.org/lkml/20210221154919.68050-1-john.wood@gmx.com/
Changelog RFC -> v2
-------------------
- Rename this feature with a more suitable name (Jann Horn, Kees Cook).
- Convert the code to an LSM (Kees Cook).
- Add locking to avoid data races (Jann Horn).
- Add a new LSM hook to get the fatal signal of a task (Jann Horn, Kees
Cook).
- Add the last crashes timestamps list to avoid false positives in the
attack detection (Jann Horn).
- Use "period" instead of "rate" (Jann Horn).
- Other minor changes suggested (Jann Horn, Kees Cook).
Changelog v2 -> v3
------------------
- Compute the application crash period on an on-going basis (Kees Cook).
- Detect a brute force attack through the execve system call (Kees Cook).
- Detect an slow brute force attack (Randy Dunlap).
- Fine tuning the detection taken into account privilege boundary crossing
(Kees Cook).
- Taken into account only fatal signals delivered by the kernel (Kees
Cook).
- Remove the sysctl attributes to fine tuning the detection (Kees Cook).
- Remove the prctls to allow per process enabling/disabling (Kees Cook).
- Improve the documentation (Kees Cook).
- Fix some typos in the documentation (Randy Dunlap).
- Add self-test to validate the expectations (Kees Cook).
Changelog v3 -> v4
------------------
- Fix all the warnings shown by the tool "scripts/kernel-doc" (Randy
Dunlap).
Any constructive comments are welcome.
Thanks.
John Wood (8):
security: Add LSM hook at the point where a task gets a fatal signal
security/brute: Define a LSM and manage statistical data
securtiy/brute: Detect a brute force attack
security/brute: Fine tuning the attack detection
security/brute: Mitigate a brute force attack
selftests/brute: Add tests for the Brute LSM
Documentation: Add documentation for the Brute LSM
MAINTAINERS: Add a new entry for the Brute LSM
Documentation/admin-guide/LSM/Brute.rst | 224 +++++
Documentation/admin-guide/LSM/index.rst | 1 +
MAINTAINERS | 7 +
include/linux/lsm_hook_defs.h | 1 +
include/linux/lsm_hooks.h | 4 +
include/linux/security.h | 4 +
kernel/signal.c | 1 +
security/Kconfig | 11 +-
security/Makefile | 4 +
security/brute/Kconfig | 13 +
security/brute/Makefile | 2 +
security/brute/brute.c | 1102 ++++++++++++++++++++++
security/security.c | 5 +
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/brute/.gitignore | 2 +
tools/testing/selftests/brute/Makefile | 5 +
tools/testing/selftests/brute/config | 1 +
tools/testing/selftests/brute/exec.c | 44 +
tools/testing/selftests/brute/test.c | 507 ++++++++++
tools/testing/selftests/brute/test.sh | 226 +++++
20 files changed, 2160 insertions(+), 5 deletions(-)
create mode 100644 Documentation/admin-guide/LSM/Brute.rst
create mode 100644 security/brute/Kconfig
create mode 100644 security/brute/Makefile
create mode 100644 security/brute/brute.c
create mode 100644 tools/testing/selftests/brute/.gitignore
create mode 100644 tools/testing/selftests/brute/Makefile
create mode 100644 tools/testing/selftests/brute/config
create mode 100644 tools/testing/selftests/brute/exec.c
create mode 100644 tools/testing/selftests/brute/test.c
create mode 100755 tools/testing/selftests/brute/test.sh
--
2.25.1
The first argument to namedtuple() should match the name of the type,
which wasn't the case for KconfigEntryBase.
Fixing this is enough to make mypy show no python typing errors again.
Fixes 97752c39bd ("kunit: kunit_tool: Allow .kunitconfig to disable config items")
Signed-off-by: David Gow <davidgow(a)google.com>
---
tools/testing/kunit/kunit_config.py | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/kunit/kunit_config.py b/tools/testing/kunit/kunit_config.py
index 0b550cbd667d..1e2683dcc0e7 100644
--- a/tools/testing/kunit/kunit_config.py
+++ b/tools/testing/kunit/kunit_config.py
@@ -13,7 +13,7 @@ from typing import List, Set
CONFIG_IS_NOT_SET_PATTERN = r'^# CONFIG_(\w+) is not set$'
CONFIG_PATTERN = r'^CONFIG_(\w+)=(\S+|".*")$'
-KconfigEntryBase = collections.namedtuple('KconfigEntry', ['name', 'value'])
+KconfigEntryBase = collections.namedtuple('KconfigEntryBase', ['name', 'value'])
class KconfigEntry(KconfigEntryBase):
--
2.30.0.617.g56c4b15f3c-goog
kunit_tool maintains a list of config options which are broken under
UML, which we exclude from an otherwise 'make ARCH=um allyesconfig'
build used to run all tests with the --alltests option.
Something in UML allyesconfig is causing segfaults when page poisining
is enabled (and is poisoning with a non-zero value). Previously, this
didn't occur, as allyesconfig enabled the CONFIG_PAGE_POISONING_ZERO
option, which worked around the problem by zeroing memory. This option
has since been removed, and memory is now poisoned with 0xAA, which
triggers segfaults in many different codepaths, preventing UML from
booting.
Note that we have to disable both CONFIG_PAGE_POISONING and
CONFIG_DEBUG_PAGEALLOC, as the latter will 'select' the former on
architectures (such as UML) which don't implement __kernel_map_pages().
Ideally, we'd fix this properly by tracking down the real root cause,
but since this is breaking KUnit's --alltests feature, it's worth
disabling there in the meantime so the kernel can boot to the point
where tests can actually run.
Fixes: f289041ed4 ("mm, page_poison: remove CONFIG_PAGE_POISONING_ZERO")
Signed-off-by: David Gow <davidgow(a)google.com>
---
As described above, 'make ARCH=um allyesconfig' is broken. KUnit has
been maintaining a list of configs to force-disable for this in
tools/testing/kunit/configs/broken_on_uml.config. The kernels we've
built with this have broken since CONFIG_PAGE_POISONING_ZERO was
removed, panic-ing on startup with:
<0>[ 0.100000][ T11] Kernel panic - not syncing: Segfault with no mm
<4>[ 0.100000][ T11] CPU: 0 PID: 11 Comm: kdevtmpfs Not tainted 5.11.0-rc7-00003-g63381dc6f5f1-dirty #4
<4>[ 0.100000][ T11] Stack:
<4>[ 0.100000][ T11] 677d3d40 677d3f10 0000000e 600c0bc0
<4>[ 0.100000][ T11] 677d3d90 603c99be 677d3d90 62529b93
<4>[ 0.100000][ T11] 603c9ac0 677d3f10 62529b00 603c98a0
<4>[ 0.100000][ T11] Call Trace:
<4>[ 0.100000][ T11] [<600c0bc0>] ? set_signals+0x0/0x60
<4>[ 0.100000][ T11] [<603c99be>] lookup_mnt+0x11e/0x220
<4>[ 0.100000][ T11] [<62529b93>] ? down_write+0x93/0x180
<4>[ 0.100000][ T11] [<603c9ac0>] ? lock_mount+0x0/0x160
<4>[ 0.100000][ T11] [<62529b00>] ? down_write+0x0/0x180
<4>[ 0.100000][ T11] [<603c98a0>] ? lookup_mnt+0x0/0x220
<4>[ 0.100000][ T11] [<603c8160>] ? namespace_unlock+0x0/0x1a0
<4>[ 0.100000][ T11] [<603c9b25>] lock_mount+0x65/0x160
<4>[ 0.100000][ T11] [<6012f360>] ? up_write+0x0/0x40
<4>[ 0.100000][ T11] [<603cbbd2>] do_new_mount_fc+0xd2/0x220
<4>[ 0.100000][ T11] [<603eb560>] ? vfs_parse_fs_string+0x0/0xa0
<4>[ 0.100000][ T11] [<603cbf04>] do_new_mount+0x1e4/0x260
<4>[ 0.100000][ T11] [<603ccae9>] path_mount+0x1c9/0x6e0
<4>[ 0.100000][ T11] [<603a9f4f>] ? getname_kernel+0xaf/0x1a0
<4>[ 0.100000][ T11] [<603ab280>] ? kern_path+0x0/0x60
<4>[ 0.100000][ T11] [<600238ee>] 0x600238ee
<4>[ 0.100000][ T11] [<62523baa>] devtmpfsd+0x52/0xb8
<4>[ 0.100000][ T11] [<62523b58>] ? devtmpfsd+0x0/0xb8
<4>[ 0.100000][ T11] [<600fffd8>] kthread+0x1d8/0x200
<4>[ 0.100000][ T11] [<600a4ea6>] new_thread_handler+0x86/0xc0
Disabling PAGE_POISONING fixes this. The issue can't be repoduced with
just PAGE_POISONING, there's clearly something (or several things) also
enabled by allyesconfig which contribute. Ideally, we'd track these down
and fix this at its root cause, but in the meantime it'd be nice to
disable PAGE_POISONING so we can at least get the kernel to boot. One
way would be to add a 'depends on !UML' or similar, but since
PAGE_POISONING does seem to work in the non-allyesconfig case, adding it
to our list of broken configs seemed the better choice.
Thoughts?
(Note that to reproduce this, you'll want to run
./tools/testing/kunit/kunit.py run --alltests --raw_output
It also depends on a couple of other fixes which are not upstream yet:
https://www.spinics.net/lists/linux-rtc/msg08294.htmlhttps://lore.kernel.org/linux-i3c/20210127040636.1535722-1-davidgow@google.…
Cheers,
-- David
tools/testing/kunit/configs/broken_on_uml.config | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tools/testing/kunit/configs/broken_on_uml.config b/tools/testing/kunit/configs/broken_on_uml.config
index a7f0603d33f6..690870043ac0 100644
--- a/tools/testing/kunit/configs/broken_on_uml.config
+++ b/tools/testing/kunit/configs/broken_on_uml.config
@@ -40,3 +40,5 @@
# CONFIG_RESET_BRCMSTB_RESCAL is not set
# CONFIG_RESET_INTEL_GW is not set
# CONFIG_ADI_AXI_ADC is not set
+# CONFIG_DEBUG_PAGEALLOC is not set
+# CONFIG_PAGE_POISONING is not set
--
2.30.0.478.g8a0d178c01-goog
Hi,
This v2 series can mainly include two parts.
Based on kvm queue branch: https://git.kernel.org/pub/scm/virt/kvm/kvm.git/log/?h=queue
Links of v1: https://lore.kernel.org/lkml/20210208090841.333724-1-wangyanan55@huawei.com/
In the first part, all the known hugetlb backing src types specified
with different hugepage sizes are listed, so that we can specify use
of hugetlb source of the exact granularity that we want, instead of
the system default ones. And as all the known hugetlb page sizes are
listed, it's appropriate for all architectures. Besides, a helper that
can get granularity of different backing src types(anonumous/thp/hugetlb)
is added, so that we can use the accurate backing src granularity for
kinds of alignment or guest memory accessing of vcpus.
In the second part, a new test is added:
This test is added to serve as a performance tester and a bug reproducer
for kvm page table code (GPA->HPA mappings), it gives guidance for the
people trying to make some improvement for kvm. And the following explains
what we can exactly do through this test.
The function guest_code() can cover the conditions where a single vcpu or
multiple vcpus access guest pages within the same memory region, in three
VM stages(before dirty logging, during dirty logging, after dirty logging).
Besides, the backing src memory type(ANONYMOUS/THP/HUGETLB) of the tested
memory region can be specified by users, which means normal page mappings
or block mappings can be chosen by users to be created in the test.
If ANONYMOUS memory is specified, kvm will create normal page mappings
for the tested memory region before dirty logging, and update attributes
of the page mappings from RO to RW during dirty logging. If THP/HUGETLB
memory is specified, kvm will create block mappings for the tested memory
region before dirty logging, and split the blcok mappings into normal page
mappings during dirty logging, and coalesce the page mappings back into
block mappings after dirty logging is stopped.
So in summary, as a performance tester, this test can present the
performance of kvm creating/updating normal page mappings, or the
performance of kvm creating/splitting/recovering block mappings,
through execution time.
When we need to coalesce the page mappings back to block mappings after
dirty logging is stopped, we have to firstly invalidate *all* the TLB
entries for the page mappings right before installation of the block entry,
because a TLB conflict abort error could occur if we can't invalidate the
TLB entries fully. We have hit this TLB conflict twice on aarch64 software
implementation and fixed it. As this test can imulate process from dirty
logging enabled to dirty logging stopped of a VM with block mappings,
so it can also reproduce this TLB conflict abort due to inadequate TLB
invalidation when coalescing tables.
Links about the TLB conflict abort:
https://lore.kernel.org/lkml/20201201201034.116760-3-wangyanan55@huawei.com/
Yanan Wang (7):
tools include: sync head files of mmap flag encodings about hugetlb
KVM: selftests: Use flag CLOCK_MONOTONIC_RAW for timing
KVM: selftests: Make a generic helper to get vm guest mode strings
KVM: selftests: Add a helper to get system configured THP page size
KVM: selftests: List all hugetlb src types specified with page sizes
KVM: selftests: Adapt vm_userspace_mem_region_add to new helpers
KVM: selftests: Add a test for kvm page table code
tools/include/asm-generic/hugetlb_encode.h | 3 +
tools/testing/selftests/kvm/Makefile | 3 +
.../selftests/kvm/demand_paging_test.c | 8 +-
.../selftests/kvm/dirty_log_perf_test.c | 14 +-
.../testing/selftests/kvm/include/kvm_util.h | 4 +-
.../testing/selftests/kvm/include/test_util.h | 21 +-
.../selftests/kvm/kvm_page_table_test.c | 476 ++++++++++++++++++
tools/testing/selftests/kvm/lib/kvm_util.c | 58 +--
tools/testing/selftests/kvm/lib/test_util.c | 92 +++-
tools/testing/selftests/kvm/steal_time.c | 4 +-
10 files changed, 623 insertions(+), 60 deletions(-)
create mode 100644 tools/testing/selftests/kvm/kvm_page_table_test.c
--
2.19.1
Base
====
This series is based on top of my series which adds minor fault handling for
hugetlbfs [1]. (And, therefore, it is based on linux-next/akpm and Peter Xu's
series for disabling huge pmd sharing as well.)
[1] https://lore.kernel.org/patchwork/cover/1384095/
Overview
========
See my original series linked above for a detailed overview of minor fault
handling in general. The feature in this series works exactly like the
hugetblfs version (from userspace's perspective).
I'm sending this as a separate series because:
- The original minor fault handling series has been through several rounds of
review and seems close to being merged, so it seems reasonable to start
looking at this next step.
- shmem is different enough that this series may require some additional work
before it's ready, and I don't want to delay the original series
unnecessarily by bundling them together.
Use Case
========
In some cases it is useful to have VM memory backed by tmpfs instead of
hugetlbfs. So, this feature will be used to support the same VM live migration
use case described in my original series.
Additionally, Android folks (Lokesh Gidra <lokeshgidra(a)google.com>) hope to
optimize the Android JVM garbage collector using this feature (a paper
describing a somewhat similar approach: https://arxiv.org/pdf/1902.04738.pdf).
Axel Rasmussen (5):
userfaultfd: support minor fault handling for shmem
userfaultfd/selftests: use memfd_create for shmem test type
userfaultfd/selftests: create alias mappings in the shmem test
userfaultfd/selftests: reinitialize test context in each test
userfaultfd/selftests: exercise minor fault handling shmem support
fs/userfaultfd.c | 6 +-
include/linux/shmem_fs.h | 26 +-
include/uapi/linux/userfaultfd.h | 4 +-
mm/memory.c | 8 +-
mm/shmem.c | 88 +++----
mm/userfaultfd.c | 27 +-
tools/testing/selftests/vm/userfaultfd.c | 322 +++++++++++++++--------
7 files changed, 293 insertions(+), 188 deletions(-)
--
2.30.0.617.g56c4b15f3c-goog
Hi,
This patch series mainly fixes race condition issues, explains specific
lock rules and improves code related to concurrent calls of
hook_sb_delete() and release_inode(). I exploited these races to
validate the fixes. Userspace tests are also improved along with some
commit messages and comments. Serge Hallyn's review is taken into
account and his Acked-by are added to the corresponding patches.
The SLOC count is 1328 for security/landlock/ and 2539 for
tools/testing/selftest/landlock/ . Test coverage for security/landlock/
is 93.6% of lines. The code not covered only deals with internal kernel
errors (e.g. memory allocation) and race conditions. This series is
being fuzzed by syzkaller (which may cover internal kernel errors), and
patches are on their way: https://github.com/google/syzkaller/pull/2380
The compiled documentation is available here:
https://landlock.io/linux-doc/landlock-v29/userspace-api/landlock.html
This series can be applied on top of v5.11-7592-g1a3a9ffb27bb (Linus's
master branch from Sunday). This can be tested with
CONFIG_SECURITY_LANDLOCK, CONFIG_SAMPLE_LANDLOCK and by prepending
"landlock," to CONFIG_LSM. This patch series can be found in a Git
repository here:
https://github.com/landlock-lsm/linux/commits/landlock-v29
This patch series seems ready for upstream and I would really appreciate
final reviews.
# Landlock LSM
The goal of Landlock is to enable to restrict ambient rights (e.g.
global filesystem access) for a set of processes. Because Landlock is a
stackable LSM [1], it makes possible to create safe security sandboxes
as new security layers in addition to the existing system-wide
access-controls. This kind of sandbox is expected to help mitigate the
security impact of bugs or unexpected/malicious behaviors in user-space
applications. Landlock empowers any process, including unprivileged
ones, to securely restrict themselves.
Landlock is inspired by seccomp-bpf but instead of filtering syscalls
and their raw arguments, a Landlock rule can restrict the use of kernel
objects like file hierarchies, according to the kernel semantic.
Landlock also takes inspiration from other OS sandbox mechanisms: XNU
Sandbox, FreeBSD Capsicum or OpenBSD Pledge/Unveil.
In this current form, Landlock misses some access-control features.
This enables to minimize this patch series and ease review. This series
still addresses multiple use cases, especially with the combined use of
seccomp-bpf: applications with built-in sandboxing, init systems,
security sandbox tools and security-oriented APIs [2].
Previous version:
https://lore.kernel.org/lkml/20210202162710.657398-1-mic@digikod.net/
[1] https://lore.kernel.org/lkml/50db058a-7dde-441b-a7f9-f6837fe8b69f@schaufler…
[2] https://lore.kernel.org/lkml/f646e1c7-33cf-333f-070c-0a40ad0468cd@digikod.n…
Casey Schaufler (1):
LSM: Infrastructure management of the superblock
Mickaël Salaün (11):
landlock: Add object management
landlock: Add ruleset and domain management
landlock: Set up the security framework and manage credentials
landlock: Add ptrace restrictions
fs,security: Add sb_delete hook
landlock: Support filesystem access-control
landlock: Add syscall implementations
arch: Wire up Landlock syscalls
selftests/landlock: Add user space tests
samples/landlock: Add a sandbox manager example
landlock: Add user and kernel documentation
Documentation/security/index.rst | 1 +
Documentation/security/landlock.rst | 79 +
Documentation/userspace-api/index.rst | 1 +
Documentation/userspace-api/landlock.rst | 307 ++
MAINTAINERS | 15 +
arch/Kconfig | 7 +
arch/alpha/kernel/syscalls/syscall.tbl | 3 +
arch/arm/tools/syscall.tbl | 3 +
arch/arm64/include/asm/unistd.h | 2 +-
arch/arm64/include/asm/unistd32.h | 6 +
arch/ia64/kernel/syscalls/syscall.tbl | 3 +
arch/m68k/kernel/syscalls/syscall.tbl | 3 +
arch/microblaze/kernel/syscalls/syscall.tbl | 3 +
arch/mips/kernel/syscalls/syscall_n32.tbl | 3 +
arch/mips/kernel/syscalls/syscall_n64.tbl | 3 +
arch/mips/kernel/syscalls/syscall_o32.tbl | 3 +
arch/parisc/kernel/syscalls/syscall.tbl | 3 +
arch/powerpc/kernel/syscalls/syscall.tbl | 3 +
arch/s390/kernel/syscalls/syscall.tbl | 3 +
arch/sh/kernel/syscalls/syscall.tbl | 3 +
arch/sparc/kernel/syscalls/syscall.tbl | 3 +
arch/um/Kconfig | 1 +
arch/x86/entry/syscalls/syscall_32.tbl | 3 +
arch/x86/entry/syscalls/syscall_64.tbl | 3 +
arch/xtensa/kernel/syscalls/syscall.tbl | 3 +
fs/super.c | 1 +
include/linux/lsm_hook_defs.h | 1 +
include/linux/lsm_hooks.h | 4 +
include/linux/security.h | 4 +
include/linux/syscalls.h | 7 +
include/uapi/asm-generic/unistd.h | 8 +-
include/uapi/linux/landlock.h | 128 +
kernel/sys_ni.c | 5 +
samples/Kconfig | 7 +
samples/Makefile | 1 +
samples/landlock/.gitignore | 1 +
samples/landlock/Makefile | 13 +
samples/landlock/sandboxer.c | 238 ++
security/Kconfig | 11 +-
security/Makefile | 2 +
security/landlock/Kconfig | 21 +
security/landlock/Makefile | 4 +
security/landlock/common.h | 20 +
security/landlock/cred.c | 46 +
security/landlock/cred.h | 58 +
security/landlock/fs.c | 686 +++++
security/landlock/fs.h | 56 +
security/landlock/limits.h | 21 +
security/landlock/object.c | 67 +
security/landlock/object.h | 91 +
security/landlock/ptrace.c | 120 +
security/landlock/ptrace.h | 14 +
security/landlock/ruleset.c | 473 +++
security/landlock/ruleset.h | 165 +
security/landlock/setup.c | 40 +
security/landlock/setup.h | 18 +
security/landlock/syscalls.c | 445 +++
security/security.c | 51 +-
security/selinux/hooks.c | 58 +-
security/selinux/include/objsec.h | 6 +
security/selinux/ss/services.c | 3 +-
security/smack/smack.h | 6 +
security/smack/smack_lsm.c | 35 +-
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/landlock/.gitignore | 2 +
tools/testing/selftests/landlock/Makefile | 24 +
tools/testing/selftests/landlock/base_test.c | 219 ++
tools/testing/selftests/landlock/common.h | 183 ++
tools/testing/selftests/landlock/config | 7 +
tools/testing/selftests/landlock/fs_test.c | 2724 +++++++++++++++++
.../testing/selftests/landlock/ptrace_test.c | 337 ++
tools/testing/selftests/landlock/true.c | 5 +
72 files changed, 6827 insertions(+), 77 deletions(-)
create mode 100644 Documentation/security/landlock.rst
create mode 100644 Documentation/userspace-api/landlock.rst
create mode 100644 include/uapi/linux/landlock.h
create mode 100644 samples/landlock/.gitignore
create mode 100644 samples/landlock/Makefile
create mode 100644 samples/landlock/sandboxer.c
create mode 100644 security/landlock/Kconfig
create mode 100644 security/landlock/Makefile
create mode 100644 security/landlock/common.h
create mode 100644 security/landlock/cred.c
create mode 100644 security/landlock/cred.h
create mode 100644 security/landlock/fs.c
create mode 100644 security/landlock/fs.h
create mode 100644 security/landlock/limits.h
create mode 100644 security/landlock/object.c
create mode 100644 security/landlock/object.h
create mode 100644 security/landlock/ptrace.c
create mode 100644 security/landlock/ptrace.h
create mode 100644 security/landlock/ruleset.c
create mode 100644 security/landlock/ruleset.h
create mode 100644 security/landlock/setup.c
create mode 100644 security/landlock/setup.h
create mode 100644 security/landlock/syscalls.c
create mode 100644 tools/testing/selftests/landlock/.gitignore
create mode 100644 tools/testing/selftests/landlock/Makefile
create mode 100644 tools/testing/selftests/landlock/base_test.c
create mode 100644 tools/testing/selftests/landlock/common.h
create mode 100644 tools/testing/selftests/landlock/config
create mode 100644 tools/testing/selftests/landlock/fs_test.c
create mode 100644 tools/testing/selftests/landlock/ptrace_test.c
create mode 100644 tools/testing/selftests/landlock/true.c
base-commit: 31caf8b2a847214be856f843e251fc2ed2cd1075
--
2.30.0
The top-level kselftest directory is not called kselftest, but
selftests.
Signed-off-by: Antonio Terceiro <antonio.terceiro(a)linaro.org>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Jonathan Corbet <corbet(a)lwn.net>
---
Documentation/dev-tools/kselftest.rst | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/Documentation/dev-tools/kselftest.rst b/Documentation/dev-tools/kselftest.rst
index a901def730d9..dcefee707ccd 100644
--- a/Documentation/dev-tools/kselftest.rst
+++ b/Documentation/dev-tools/kselftest.rst
@@ -239,8 +239,8 @@ using a shell script test runner. ``kselftest/module.sh`` is designed
to facilitate this process. There is also a header file provided to
assist writing kernel modules that are for use with kselftest:
-- ``tools/testing/kselftest/kselftest_module.h``
-- ``tools/testing/kselftest/kselftest/module.sh``
+- ``tools/testing/selftests/kselftest_module.h``
+- ``tools/testing/selftests/kselftest/module.sh``
How to use
----------
--
2.30.1
This is an optimization of a set of sgx-related codes, each of which
is independent of the patch. Because the second and third patches have
conflicting dependencies, these patches are put together.
---
v5 changes:
* Remove the two patches with no actual value
* Typo fix in commit message
v4 changes:
* Improvements suggested by review
v3 changes:
* split free_cnt count and spin lock optimization into two patches
v2 changes:
* review suggested changes
Tianjia Zhang (3):
selftests/x86: Use getauxval() to simplify the code in sgx
x86/sgx: Allows ioctl PROVISION to execute before CREATE
x86/sgx: Remove redundant if conditions in sgx_encl_create
arch/x86/kernel/cpu/sgx/driver.c | 1 +
arch/x86/kernel/cpu/sgx/ioctl.c | 8 ++++----
tools/testing/selftests/sgx/main.c | 24 ++++--------------------
3 files changed, 9 insertions(+), 24 deletions(-)
--
2.19.1.3.ge56e4f7
Attacks against vulnerable userspace applications with the purpose to break
ASLR or bypass canaries traditionally use some level of brute force with
the help of the fork system call. This is possible since when creating a
new process using fork its memory contents are the same as those of the
parent process (the process that called the fork system call). So, the
attacker can test the memory infinite times to find the correct memory
values or the correct memory addresses without worrying about crashing the
application.
Based on the above scenario it would be nice to have this detected and
mitigated, and this is the goal of this patch serie. Specifically the
following attacks are expected to be detected:
1.- Launching (fork()/exec()) a setuid/setgid process repeatedly until a
desirable memory layout is got (e.g. Stack Clash).
2.- Connecting to an exec()ing network daemon (e.g. xinetd) repeatedly
until a desirable memory layout is got (e.g. what CTFs do for simple
network service).
3.- Launching processes without exec() (e.g. Android Zygote) and exposing
state to attack a sibling.
4.- Connecting to a fork()ing network daemon (e.g. apache) repeatedly until
the previously shared memory layout of all the other children is
exposed (e.g. kind of related to HeartBleed).
In each case, a privilege boundary has been crossed:
Case 1: setuid/setgid process
Case 2: network to local
Case 3: privilege changes
Case 4: network to local
So, what will really be detected are fork/exec brute force attacks that
cross any of the commented bounds.
The implementation details and comparison against other existing
implementations can be found in the "Documentation" patch.
This v3 version has changed a lot from the v2. Basically the application
crash period is now compute on an on-going basis using an exponential
moving average (EMA), a detection of a brute force attack through the
"execve" system call has been added and the crossing of the commented
privilege bounds are taken into account. Also, the fine tune has also been
removed and now, all this kind of attacks are detected without
administrator intervention.
In the v2 version Kees Cook suggested to study if the statistical data
shared by all the fork hierarchy processes can be tracked in some other
way. Specifically the question was if this info can be hold by the family
hierarchy of the mm struct. After studying this hierarchy I think it is not
suitable for the Brute LSM since they are totally copied on fork() and in
this case we want that they are shared. So I leave this road.
So, knowing all this information I will explain now the different patches:
The 1/8 patch defines a new LSM hook to get the fatal signal of a task.
This will be useful during the attack detection phase.
The 2/8 patch defines a new LSM and manages the statistical data shared by
all the fork hierarchy processes.
The 3/8 patch detects a fork/exec brute force attack.
The 4/8 patch narrows the detection taken into account the privilege
boundary crossing.
The 5/8 patch mitigates a brute force attack.
The 6/8 patch adds self-tests to validate the Brute LSM expectations.
The 7/8 patch adds the documentation to explain this implementation.
The 8/8 patch updates the maintainers file.
This patch serie is a task of the KSPP [1] and can also be accessed from my
github tree [2] in the "brute_v3" branch.
[1] https://github.com/KSPP/linux/issues/39
[2] https://github.com/johwood/linux/
The previous versions can be found in:
https://lore.kernel.org/kernel-hardening/20200910202107.3799376-1-keescook@…https://lore.kernel.org/kernel-hardening/20201025134540.3770-1-john.wood@gm…
Changelog RFC -> v2
-------------------
- Rename this feature with a more suitable name (Jann Horn, Kees Cook).
- Convert the code to an LSM (Kees Cook).
- Add locking to avoid data races (Jann Horn).
- Add a new LSM hook to get the fatal signal of a task (Jann Horn, Kees
Cook).
- Add the last crashes timestamps list to avoid false positives in the
attack detection (Jann Horn).
- Use "period" instead of "rate" (Jann Horn).
- Other minor changes suggested (Jann Horn, Kees Cook).
Changelog v2 -> v3
------------------
- Compute the application crash period on an on-going basis (Kees Cook).
- Detect a brute force attack through the execve system call (Kees Cook).
- Detect an slow brute force attack (Randy Dunlap).
- Fine tuning the detection taken into account privilege boundary crossing
(Kees Cook).
- Taken into account only fatal signals delivered by the kernel (Kees
Cook).
- Remove the sysctl attributes to fine tuning the detection (Kees Cook).
- Remove the prctls to allow per process enabling/disabling (Kees Cook).
- Improve the documentation (Kees Cook).
- Fix some typos in the documentation (Randy Dunlap).
- Add self-test to validate the expectations (Kees Cook).
John Wood (8):
security: Add LSM hook at the point where a task gets a fatal signal
security/brute: Define a LSM and manage statistical data
securtiy/brute: Detect a brute force attack
security/brute: Fine tuning the attack detection
security/brute: Mitigate a brute force attack
selftests/brute: Add tests for the Brute LSM
Documentation: Add documentation for the Brute LSM
MAINTAINERS: Add a new entry for the Brute LSM
Documentation/admin-guide/LSM/Brute.rst | 224 +++++
Documentation/admin-guide/LSM/index.rst | 1 +
MAINTAINERS | 7 +
include/linux/lsm_hook_defs.h | 1 +
include/linux/lsm_hooks.h | 4 +
include/linux/security.h | 4 +
kernel/signal.c | 1 +
security/Kconfig | 11 +-
security/Makefile | 4 +
security/brute/Kconfig | 13 +
security/brute/Makefile | 2 +
security/brute/brute.c | 1102 ++++++++++++++++++++++
security/security.c | 5 +
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/brute/.gitignore | 2 +
tools/testing/selftests/brute/Makefile | 5 +
tools/testing/selftests/brute/config | 1 +
tools/testing/selftests/brute/exec.c | 44 +
tools/testing/selftests/brute/test.c | 507 ++++++++++
tools/testing/selftests/brute/test.sh | 226 +++++
20 files changed, 2160 insertions(+), 5 deletions(-)
create mode 100644 Documentation/admin-guide/LSM/Brute.rst
create mode 100644 security/brute/Kconfig
create mode 100644 security/brute/Makefile
create mode 100644 security/brute/brute.c
create mode 100644 tools/testing/selftests/brute/.gitignore
create mode 100644 tools/testing/selftests/brute/Makefile
create mode 100644 tools/testing/selftests/brute/config
create mode 100644 tools/testing/selftests/brute/exec.c
create mode 100644 tools/testing/selftests/brute/test.c
create mode 100755 tools/testing/selftests/brute/test.sh
--
2.25.1
From: Mike Rapoport <rppt(a)linux.ibm.com>
Hi,
@Andrew, this is based on v5.11-rc5-mmotm-2021-01-27-23-30, with secretmem
and related patches dropped from there, I can rebase whatever way you
prefer.
This is an implementation of "secret" mappings backed by a file descriptor.
The file descriptor backing secret memory mappings is created using a
dedicated memfd_secret system call The desired protection mode for the
memory is configured using flags parameter of the system call. The mmap()
of the file descriptor created with memfd_secret() will create a "secret"
memory mapping. The pages in that mapping will be marked as not present in
the direct map and will be present only in the page table of the owning mm.
Although normally Linux userspace mappings are protected from other users,
such secret mappings are useful for environments where a hostile tenant is
trying to trick the kernel into giving them access to other tenants
mappings.
Additionally, in the future the secret mappings may be used as a mean to
protect guest memory in a virtual machine host.
For demonstration of secret memory usage we've created a userspace library
https://git.kernel.org/pub/scm/linux/kernel/git/jejb/secret-memory-preloade…
that does two things: the first is act as a preloader for openssl to
redirect all the OPENSSL_malloc calls to secret memory meaning any secret
keys get automatically protected this way and the other thing it does is
expose the API to the user who needs it. We anticipate that a lot of the
use cases would be like the openssl one: many toolkits that deal with
secret keys already have special handling for the memory to try to give
them greater protection, so this would simply be pluggable into the
toolkits without any need for user application modification.
Hiding secret memory mappings behind an anonymous file allows usage of
the page cache for tracking pages allocated for the "secret" mappings as
well as using address_space_operations for e.g. page migration callbacks.
The anonymous file may be also used implicitly, like hugetlb files, to
implement mmap(MAP_SECRET) and use the secret memory areas with "native" mm
ABIs in the future.
Removing of the pages from the direct map may cause its fragmentation on
architectures that use large pages to map the physical memory which affects
the system performance. However, the original Kconfig text for
CONFIG_DIRECT_GBPAGES said that gigabyte pages in the direct map "... can
improve the kernel's performance a tiny bit ..." (commit 00d1c5e05736
("x86: add gbpages switches")) and the recent report [1] showed that "...
although 1G mappings are a good default choice, there is no compelling
evidence that it must be the only choice". Hence, it is sufficient to have
secretmem disabled by default with the ability of a system administrator to
enable it at boot time.
In addition, there is also a long term goal to improve management of the
direct map.
[1] https://lore.kernel.org/linux-mm/213b4567-46ce-f116-9cdf-bbd0c884eb3c@linux…
v17:
* Remove pool of large pages backing secretmem allocations, per Michal Hocko
* Add secretmem pages to unevictable LRU, per Michal Hocko
* Use GFP_HIGHUSER as secretmem mapping mask, per Michal Hocko
* Make secretmem an opt-in feature that is disabled by default
v16:
* Fix memory leak intorduced in v15
* Clean the data left from previous page user before handing the page to
the userspace
v15: https://lore.kernel.org/lkml/20210120180612.1058-1-rppt@kernel.org
* Add riscv/Kconfig update to disable set_memory operations for nommu
builds (patch 3)
* Update the code around add_to_page_cache() per Matthew's comments
(patches 6,7)
* Add fixups for build/checkpatch errors discovered by CI systems
v14: https://lore.kernel.org/lkml/20201203062949.5484-1-rppt@kernel.org
* Finally s/mod_node_page_state/mod_lruvec_page_state/
v13: https://lore.kernel.org/lkml/20201201074559.27742-1-rppt@kernel.org
* Added Reviewed-by, thanks Catalin and David
* s/mod_node_page_state/mod_lruvec_page_state/ as Shakeel suggested
Older history:
v12: https://lore.kernel.org/lkml/20201125092208.12544-1-rppt@kernel.org
v11: https://lore.kernel.org/lkml/20201124092556.12009-1-rppt@kernel.org
v10: https://lore.kernel.org/lkml/20201123095432.5860-1-rppt@kernel.org
v9: https://lore.kernel.org/lkml/20201117162932.13649-1-rppt@kernel.org
v8: https://lore.kernel.org/lkml/20201110151444.20662-1-rppt@kernel.org
v7: https://lore.kernel.org/lkml/20201026083752.13267-1-rppt@kernel.org
v6: https://lore.kernel.org/lkml/20200924132904.1391-1-rppt@kernel.org
v5: https://lore.kernel.org/lkml/20200916073539.3552-1-rppt@kernel.org
v4: https://lore.kernel.org/lkml/20200818141554.13945-1-rppt@kernel.org
v3: https://lore.kernel.org/lkml/20200804095035.18778-1-rppt@kernel.org
v2: https://lore.kernel.org/lkml/20200727162935.31714-1-rppt@kernel.org
v1: https://lore.kernel.org/lkml/20200720092435.17469-1-rppt@kernel.org
rfc-v2: https://lore.kernel.org/lkml/20200706172051.19465-1-rppt@kernel.org/
rfc-v1: https://lore.kernel.org/lkml/20200130162340.GA14232@rapoport-lnx/
rfc-v0: https://lore.kernel.org/lkml/1572171452-7958-1-git-send-email-rppt@kernel.o…
Arnd Bergmann (1):
arm64: kfence: fix header inclusion
Mike Rapoport (9):
mm: add definition of PMD_PAGE_ORDER
mmap: make mlock_future_check() global
riscv/Kconfig: make direct map manipulation options depend on MMU
set_memory: allow set_direct_map_*_noflush() for multiple pages
set_memory: allow querying whether set_direct_map_*() is actually enabled
mm: introduce memfd_secret system call to create "secret" memory areas
PM: hibernate: disable when there are active secretmem users
arch, mm: wire up memfd_secret system call where relevant
secretmem: test: add basic selftest for memfd_secret(2)
arch/arm64/include/asm/Kbuild | 1 -
arch/arm64/include/asm/cacheflush.h | 6 -
arch/arm64/include/asm/kfence.h | 2 +-
arch/arm64/include/asm/set_memory.h | 17 ++
arch/arm64/include/uapi/asm/unistd.h | 1 +
arch/arm64/kernel/machine_kexec.c | 1 +
arch/arm64/mm/mmu.c | 6 +-
arch/arm64/mm/pageattr.c | 23 +-
arch/riscv/Kconfig | 4 +-
arch/riscv/include/asm/set_memory.h | 4 +-
arch/riscv/include/asm/unistd.h | 1 +
arch/riscv/mm/pageattr.c | 8 +-
arch/x86/entry/syscalls/syscall_32.tbl | 1 +
arch/x86/entry/syscalls/syscall_64.tbl | 1 +
arch/x86/include/asm/set_memory.h | 4 +-
arch/x86/mm/pat/set_memory.c | 8 +-
fs/dax.c | 11 +-
include/linux/pgtable.h | 3 +
include/linux/secretmem.h | 30 +++
include/linux/set_memory.h | 16 +-
include/linux/syscalls.h | 1 +
include/uapi/asm-generic/unistd.h | 6 +-
include/uapi/linux/magic.h | 1 +
kernel/power/hibernate.c | 5 +-
kernel/power/snapshot.c | 4 +-
kernel/sys_ni.c | 2 +
mm/Kconfig | 3 +
mm/Makefile | 1 +
mm/gup.c | 10 +
mm/internal.h | 3 +
mm/mlock.c | 3 +-
mm/mmap.c | 5 +-
mm/secretmem.c | 261 +++++++++++++++++++
mm/vmalloc.c | 5 +-
scripts/checksyscalls.sh | 4 +
tools/testing/selftests/vm/.gitignore | 1 +
tools/testing/selftests/vm/Makefile | 3 +-
tools/testing/selftests/vm/memfd_secret.c | 296 ++++++++++++++++++++++
tools/testing/selftests/vm/run_vmtests | 17 ++
39 files changed, 726 insertions(+), 53 deletions(-)
create mode 100644 arch/arm64/include/asm/set_memory.h
create mode 100644 include/linux/secretmem.h
create mode 100644 mm/secretmem.c
create mode 100644 tools/testing/selftests/vm/memfd_secret.c
--
2.28.0
Hi Linus,
Please pull the following KUnit update for Linux 5.12-rc1.
This KUnit update for Linux 5.12-rc1 consists of consists of:
-- support for filtering test suites using glob from Daniel Latypov.
"kunit_filter.glob" command line option is passed to the UML
kernel, which currently only supports filtering by suite name.
This support allows running different subsets of tests, e.g.
$ ./tools/testing/kunit/kunit.py build
$ ./tools/testing/kunit/kunit.py exec 'list*'
$ ./tools/testing/kunit/kunit.py exec 'kunit*'
-- several fixes and cleanups also from Daniel Latypov.
diff is attached.
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit 92bf22614b21a2706f4993b278017e437f7785b3:
Linux 5.11-rc7 (2021-02-07 13:57:38 -0800)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
tags/linux-kselftest-kunit-5.12-rc1
for you to fetch changes up to 7af29141a31a2a2350589471c8979ff5f22fb9b7:
kunit: tool: fix unintentional statefulness in run_kernel()
(2021-02-08 16:10:22 -0700)
----------------------------------------------------------------
linux-kselftest-kunit-5.12-rc1
This KUnit update for Linux 5.12-rc1 consists of consists of:
-- support for filtering test suites using glob from Daniel Latypov.
"kunit_filter.glob" command line option is passed to the UML
kernel, which currently only supports filtering by suite name.
This support allows running different subsets of tests, e.g.
$ ./tools/testing/kunit/kunit.py build
$ ./tools/testing/kunit/kunit.py exec 'list*'
$ ./tools/testing/kunit/kunit.py exec 'kunit*'
-- several fixes and cleanups also from Daniel Latypov.
----------------------------------------------------------------
Daniel Latypov (12):
kunit: tool: fix unit test cleanup handling
kunit: tool: stop using bare asserts in unit test
kunit: tool: use `with open()` in unit test
minor: kunit: tool: fix unit test so it can run from non-root dir
kunit: tool: simplify kconfig is_subset_of() logic
KUnit: Docs: make start.rst example Kconfig follow style.rst
Documentation: kunit: add tips.rst for small examples
kunit: make kunit_tool accept optional path to .kunitconfig fragment
kunit: don't show `1 == 1` in failed assertion messages
kunit: add kunit.filter_glob cmdline option to filter suites
kunit: tool: add support for filtering suites by glob
kunit: tool: fix unintentional statefulness in run_kernel()
Documentation/dev-tools/kunit/index.rst | 2 +
Documentation/dev-tools/kunit/start.rst | 7 +-
Documentation/dev-tools/kunit/tips.rst | 115 ++++++++++++++++++
lib/kunit/Kconfig | 1 +
lib/kunit/assert.c | 39 +++++-
lib/kunit/executor.c | 93 +++++++++++++--
tools/testing/kunit/kunit.py | 30 +++--
tools/testing/kunit/kunit_config.py | 13 +-
tools/testing/kunit/kunit_kernel.py | 18 ++-
tools/testing/kunit/kunit_tool_test.py | 204
+++++++++++++++++---------------
10 files changed, 390 insertions(+), 132 deletions(-)
create mode 100644 Documentation/dev-tools/kunit/tips.rst
----------------------------------------------------------------
Hi Linus,
Please pull the following Kselftest update for Linux 5.12-rc1.
This Kselftest update for Linux 5.12-rc1 consists of:
- dmabuf-heaps test fixes and cleanups from John Stultz.
- seccomp test fix to accept any valid fd in user_notification_addfd.
- Minor fixes to breakpoints and vDSO tests.
- Minor code cleanups to ipc and x86 tests.
diff is attached.
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit 92bf22614b21a2706f4993b278017e437f7785b3:
Linux 5.11-rc7 (2021-02-07 13:57:38 -0800)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
tags/linux-kselftest-next-5.12-rc1
for you to fetch changes up to e0c0840a46db9d50ba7391082d665d74f320c39f:
selftests/seccomp: Accept any valid fd in user_notification_addfd
(2021-02-09 17:39:01 -0700)
----------------------------------------------------------------
linux-kselftest-next-5.12-rc1
This Kselftest update for Linux 5.12-rc1 consists of:
- dmabuf-heaps test fixes and cleanups from John Stultz.
- seccomp test fix to accept any valid fd in user_notification_addfd.
- Minor fixes to breakpoints and vDSO tests.
- Minor code cleanups to ipc and x86 tests.
----------------------------------------------------------------
John Stultz (5):
kselftests: dmabuf-heaps: Fix Makefile's inclusion of the
kernel's usr/include dir
kselftests: dmabuf-heaps: Add clearer checks on DMABUF_BEGIN/END_SYNC
kselftests: dmabuf-heaps: Softly fail if don't find a vgem device
kselftests: dmabuf-heaps: Cleanup test output
kselftests: dmabuf-heaps: Add extra checking that allocated
buffers are zeroed
Seth Forshee (1):
selftests/seccomp: Accept any valid fd in user_notification_addfd
Tiezhu Yang (1):
selftests: breakpoints: Use correct error messages in
breakpoint_test_arm64.c
Tobias Klauser (2):
selftests/vDSO: fix ABI selftest on riscv
selftests/timens: add futex binary to .gitignore
Yang Li (2):
selftests/ipc: remove unneeded semicolon
selftests/x86/ldt_gdt: remove unneeded semicolon
.../selftests/breakpoints/breakpoint_test_arm64.c | 4 +-
tools/testing/selftests/dmabuf-heaps/Makefile | 2 +-
tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c | 149
++++++++++++++++-----
tools/testing/selftests/ipc/msgque.c | 6 +-
tools/testing/selftests/seccomp/seccomp_bpf.c | 8 +-
tools/testing/selftests/timens/.gitignore | 1 +
tools/testing/selftests/vDSO/vdso_config.h | 4 +-
tools/testing/selftests/x86/ldt_gdt.c | 2 +-
8 files changed, 132 insertions(+), 44 deletions(-)
----------------------------------------------------------------
Hi,
This patch series fixes a corner-case with non-overlapping access rights
coming from different layers. This is now handled in a generic way and
verified with new tests. A stricter check is enforced for
landlock_add_rule(2) to forbid useless rules. Finally, the previous
landlock_enforce_ruleset_self(2) is renamed to
landlock_restrict_self(2), which is more consistent.
The SLOC count is 1314 for security/landlock/ and 2484 for
tools/testing/selftest/landlock/ . Test coverage for security/landlock/
is 94.7% of lines. The code not covered only deals with internal kernel
errors (e.g. memory allocation) and race conditions. This series is
being fuzzed by syzkaller, and patches are on their way:
https://github.com/google/syzkaller/pull/2380
The compiled documentation is available here:
https://landlock.io/linux-doc/landlock-v28/userspace-api/landlock.html
This series can be applied on top of v5.11-rc6 . This can be tested
with CONFIG_SECURITY_LANDLOCK, CONFIG_SAMPLE_LANDLOCK and by prepending
"landlock," to CONFIG_LSM. This patch series can be found in a Git
repository here:
https://github.com/landlock-lsm/linux/commits/landlock-v28
This patch series seems ready for upstream and I would really appreciate
final reviews.
# Landlock LSM
The goal of Landlock is to enable to restrict ambient rights (e.g.
global filesystem access) for a set of processes. Because Landlock is a
stackable LSM [1], it makes possible to create safe security sandboxes
as new security layers in addition to the existing system-wide
access-controls. This kind of sandbox is expected to help mitigate the
security impact of bugs or unexpected/malicious behaviors in user-space
applications. Landlock empowers any process, including unprivileged
ones, to securely restrict themselves.
Landlock is inspired by seccomp-bpf but instead of filtering syscalls
and their raw arguments, a Landlock rule can restrict the use of kernel
objects like file hierarchies, according to the kernel semantic.
Landlock also takes inspiration from other OS sandbox mechanisms: XNU
Sandbox, FreeBSD Capsicum or OpenBSD Pledge/Unveil.
In this current form, Landlock misses some access-control features.
This enables to minimize this patch series and ease review. This series
still addresses multiple use cases, especially with the combined use of
seccomp-bpf: applications with built-in sandboxing, init systems,
security sandbox tools and security-oriented APIs [2].
Previous version:
https://lore.kernel.org/lkml/20210121205119.793296-1-mic@digikod.net/
[1] https://lore.kernel.org/lkml/50db058a-7dde-441b-a7f9-f6837fe8b69f@schaufler…
[2] https://lore.kernel.org/lkml/f646e1c7-33cf-333f-070c-0a40ad0468cd@digikod.n…
Casey Schaufler (1):
LSM: Infrastructure management of the superblock
Mickaël Salaün (11):
landlock: Add object management
landlock: Add ruleset and domain management
landlock: Set up the security framework and manage credentials
landlock: Add ptrace restrictions
fs,security: Add sb_delete hook
landlock: Support filesystem access-control
landlock: Add syscall implementations
arch: Wire up Landlock syscalls
selftests/landlock: Add user space tests
samples/landlock: Add a sandbox manager example
landlock: Add user and kernel documentation
Documentation/security/index.rst | 1 +
Documentation/security/landlock.rst | 79 +
Documentation/userspace-api/index.rst | 1 +
Documentation/userspace-api/landlock.rst | 307 ++
MAINTAINERS | 15 +
arch/Kconfig | 7 +
arch/alpha/kernel/syscalls/syscall.tbl | 3 +
arch/arm/tools/syscall.tbl | 3 +
arch/arm64/include/asm/unistd.h | 2 +-
arch/arm64/include/asm/unistd32.h | 6 +
arch/ia64/kernel/syscalls/syscall.tbl | 3 +
arch/m68k/kernel/syscalls/syscall.tbl | 3 +
arch/microblaze/kernel/syscalls/syscall.tbl | 3 +
arch/mips/kernel/syscalls/syscall_n32.tbl | 3 +
arch/mips/kernel/syscalls/syscall_n64.tbl | 3 +
arch/mips/kernel/syscalls/syscall_o32.tbl | 3 +
arch/parisc/kernel/syscalls/syscall.tbl | 3 +
arch/powerpc/kernel/syscalls/syscall.tbl | 3 +
arch/s390/kernel/syscalls/syscall.tbl | 3 +
arch/sh/kernel/syscalls/syscall.tbl | 3 +
arch/sparc/kernel/syscalls/syscall.tbl | 3 +
arch/um/Kconfig | 1 +
arch/x86/entry/syscalls/syscall_32.tbl | 3 +
arch/x86/entry/syscalls/syscall_64.tbl | 3 +
arch/xtensa/kernel/syscalls/syscall.tbl | 3 +
fs/super.c | 1 +
include/linux/lsm_hook_defs.h | 1 +
include/linux/lsm_hooks.h | 3 +
include/linux/security.h | 4 +
include/linux/syscalls.h | 7 +
include/uapi/asm-generic/unistd.h | 8 +-
include/uapi/linux/landlock.h | 128 +
kernel/sys_ni.c | 5 +
samples/Kconfig | 7 +
samples/Makefile | 1 +
samples/landlock/.gitignore | 1 +
samples/landlock/Makefile | 13 +
samples/landlock/sandboxer.c | 238 ++
security/Kconfig | 11 +-
security/Makefile | 2 +
security/landlock/Kconfig | 21 +
security/landlock/Makefile | 4 +
security/landlock/common.h | 20 +
security/landlock/cred.c | 46 +
security/landlock/cred.h | 58 +
security/landlock/fs.c | 627 ++++
security/landlock/fs.h | 56 +
security/landlock/limits.h | 21 +
security/landlock/object.c | 67 +
security/landlock/object.h | 91 +
security/landlock/ptrace.c | 120 +
security/landlock/ptrace.h | 14 +
security/landlock/ruleset.c | 473 +++
security/landlock/ruleset.h | 165 +
security/landlock/setup.c | 40 +
security/landlock/setup.h | 18 +
security/landlock/syscalls.c | 444 +++
security/security.c | 51 +-
security/selinux/hooks.c | 58 +-
security/selinux/include/objsec.h | 6 +
security/selinux/ss/services.c | 3 +-
security/smack/smack.h | 6 +
security/smack/smack_lsm.c | 35 +-
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/landlock/.gitignore | 2 +
tools/testing/selftests/landlock/Makefile | 24 +
tools/testing/selftests/landlock/base_test.c | 219 ++
tools/testing/selftests/landlock/common.h | 169 ++
tools/testing/selftests/landlock/config | 6 +
tools/testing/selftests/landlock/fs_test.c | 2664 +++++++++++++++++
.../testing/selftests/landlock/ptrace_test.c | 314 ++
tools/testing/selftests/landlock/true.c | 5 +
72 files changed, 6668 insertions(+), 77 deletions(-)
create mode 100644 Documentation/security/landlock.rst
create mode 100644 Documentation/userspace-api/landlock.rst
create mode 100644 include/uapi/linux/landlock.h
create mode 100644 samples/landlock/.gitignore
create mode 100644 samples/landlock/Makefile
create mode 100644 samples/landlock/sandboxer.c
create mode 100644 security/landlock/Kconfig
create mode 100644 security/landlock/Makefile
create mode 100644 security/landlock/common.h
create mode 100644 security/landlock/cred.c
create mode 100644 security/landlock/cred.h
create mode 100644 security/landlock/fs.c
create mode 100644 security/landlock/fs.h
create mode 100644 security/landlock/limits.h
create mode 100644 security/landlock/object.c
create mode 100644 security/landlock/object.h
create mode 100644 security/landlock/ptrace.c
create mode 100644 security/landlock/ptrace.h
create mode 100644 security/landlock/ruleset.c
create mode 100644 security/landlock/ruleset.h
create mode 100644 security/landlock/setup.c
create mode 100644 security/landlock/setup.h
create mode 100644 security/landlock/syscalls.c
create mode 100644 tools/testing/selftests/landlock/.gitignore
create mode 100644 tools/testing/selftests/landlock/Makefile
create mode 100644 tools/testing/selftests/landlock/base_test.c
create mode 100644 tools/testing/selftests/landlock/common.h
create mode 100644 tools/testing/selftests/landlock/config
create mode 100644 tools/testing/selftests/landlock/fs_test.c
create mode 100644 tools/testing/selftests/landlock/ptrace_test.c
create mode 100644 tools/testing/selftests/landlock/true.c
base-commit: 1048ba83fb1c00cd24172e23e8263972f6b5d9ac
--
2.30.0
Hi,
This patch series introduces the futex2 syscalls.
* What happened to the current futex()?
For some years now, developers have been trying to add new features to
futex, but maintainers have been reluctant to accept then, given the
multiplexed interface full of legacy features and tricky to do big
changes. Some problems that people tried to address with patchsets are:
NUMA-awareness[0], smaller sized futexes[1], wait on multiple futexes[2].
NUMA, for instance, just doesn't fit the current API in a reasonable
way. Considering that, it's not possible to merge new features into the
current futex.
** The NUMA problem
At the current implementation, all futex kernel side infrastructure is
stored on a single node. Given that, all futex() calls issued by
processors that aren't located on that node will have a memory access
penalty when doing it.
** The 32bit sized futex problem
Embedded systems or anything with memory constrains would benefit of
using smaller sizes for the futex userspace integer. Also, a mutex
implementation can be done using just three values, so 8 bits is enough
for various scenarios.
** The wait on multiple problem
The use case lies in the Wine implementation of the Windows NT interface
WaitMultipleObjects. This Windows API function allows a thread to sleep
waiting on the first of a set of event sources (mutexes, timers, signal,
console input, etc) to signal. Considering this is a primitive
synchronization operation for Windows applications, being able to quickly
signal events on the producer side, and quickly go to sleep on the
consumer side is essential for good performance of those running
over Wine.
[0] https://lore.kernel.org/lkml/20160505204230.932454245@linutronix.de/
[1] https://lore.kernel.org/lkml/20191221155659.3159-2-malteskarupke@web.de/
[2] https://lore.kernel.org/lkml/20200213214525.183689-1-andrealmeid@collabora.…
* The solution
As proposed by Peter Zijlstra and Florian Weimer[3], a new interface
is required to solve this, which must be designed with those features in
mind. futex2() is that interface. As opposed to the current multiplexed
interface, the new one should have one syscall per operation. This will
allow the maintainability of the API if it gets extended, and will help
users with type checking of arguments.
In particular, the new interface is extended to support the ability to
wait on any of a list of futexes at a time, which could be seen as a
vectored extension of the FUTEX_WAIT semantics.
[3] https://lore.kernel.org/lkml/20200303120050.GC2596@hirez.programming.kicks-…
* The interface
The new interface can be seen in details in the following patches, but
this is a high level summary of what the interface can do:
- Supports wake/wait semantics, as in futex()
- Supports requeue operations, similarly as FUTEX_CMP_REQUEUE, but with
individual flags for each address
- Supports waiting for a vector of futexes, using a new syscall named
futex_waitv()
- Supports variable sized futexes (8bits, 16bits and 32bits)
- Supports NUMA-awareness operations, where the user can specify on
which memory node would like to operate
* Implementation
The internal implementation follows a similar design to the original futex.
Given that we want to replicate the same external behavior of current
futex, this should be somewhat expected. For some functions, like the
init and the code to get a shared key, I literally copied code and
comments from kernel/futex.c. I decided to do so instead of exposing the
original function as a public function since in that way we can freely
modify our implementation if required, without any impact on old futex.
Also, the comments precisely describes the details and corner cases of
the implementation.
Each patch contains a brief description of implementation, but patch 6
"docs: locking: futex2: Add documentation" adds a more complete document
about it.
* The patchset
This patchset can be also found at my git tree:
https://gitlab.collabora.com/tonyk/linux/-/tree/futex2
- Patch 1: Implements wait/wake, and the basics foundations of futex2
- Patches 2-4: Implement the remaining features (shared, waitv, requeue).
- Patch 5: Adds the x86_x32 ABI handling. I kept it in a separated
patch since I'm not sure if x86_x32 is still a thing, or if it should
return -ENOSYS.
- Patch 6: Add a documentation file which details the interface and
the internal implementation.
- Patches 7-13: Selftests for all operations along with perf
support for futex2.
- Patch 14: While working on porting glibc for futex2, I found out
that there's a futex_wake() call at the user thread exit path, if
that thread was created with clone(..., CLONE_CHILD_SETTID, ...). In
order to make pthreads work with futex2, it was required to add
this patch. Note that this is more a proof-of-concept of what we
will need to do in future, rather than part of the interface and
shouldn't be merged as it is.
* Testing:
This patchset provides selftests for each operation and their flags.
Along with that, the following work was done:
** Stability
To stress the interface in "real world scenarios":
- glibc[4]: nptl's low level locking was modified to use futex2 API
(except for robust and PI things). All relevant nptl/ tests passed.
- Wine[5]: Proton/Wine was modified in order to use futex2() for the
emulation of Windows NT sync mechanisms based on futex, called "fsync".
Triple-A games with huge CPU's loads and tons of parallel jobs worked
as expected when compared with the previous FUTEX_WAIT_MULTIPLE
implementation at futex(). Some games issue 42k futex2() calls
per second.
- Full GNU/Linux distro: I installed the modified glibc in my host
machine, so all pthread's programs would use futex2(). After tweaking
systemd[6] to allow futex2() calls at seccomp, everything worked as
expected (web browsers do some syscall sandboxing and need some
configuration as well).
- perf: The perf benchmarks tests can also be used to stress the
interface, and they can be found in this patchset.
** Performance
- For comparing futex() and futex2() performance, I used the artificial
benchmarks implemented at perf (wake, wake-parallel, hash and
requeue). The setup was 200 runs for each test and using 8, 80, 800,
8000 for the number of threads, Note that for this test, I'm not using
patch 14 ("kernel: Enable waitpid() for futex2") , for reasons explained
at "The patchset" section.
- For the first three ones, I measured an average of 4% gain in
performance. This is not a big step, but it shows that the new
interface is at least comparable in performance with the current one.
- For requeue, I measured an average of 21% decrease in performance
compared to the original futex implementation. This is expected given
the new design with individual flags. The performance trade-offs are
explained at patch 4 ("futex2: Implement requeue operation").
[4] https://gitlab.collabora.com/tonyk/glibc/-/tree/futex2
[5] https://gitlab.collabora.com/tonyk/wine/-/tree/proton_5.13
[6] https://gitlab.collabora.com/tonyk/systemd
* FAQ
** "Where's the code for NUMA and FUTEX_8/16?"
The current code is already complex enough to take some time for
review, so I believe it's better to split that work out to a future
iteration of this patchset. Besides that, this RFC is the core part of the
infrastructure, and the following features will not pose big design
changes to it, the work will be more about wiring up the flags and
modifying some functions.
** "And what's about FUTEX_64?"
By supporting 64 bit futexes, the kernel structure for futex would
need to have a 64 bit field for the value, and that could defeat one of
the purposes of having different sized futexes in the first place:
supporting smaller ones to decrease memory usage. This might be
something that could be disabled for 32bit archs (and even for
CONFIG_BASE_SMALL).
Which use case would benefit for FUTEX_64? Does it worth the trade-offs?
** "Where's the PI/robust stuff?"
As said by Peter Zijlstra at [3], all those new features are related to
the "simple" futex interface, that doesn't use PI or robust. Do we want
to have this complexity at futex2() and if so, should it be part of
this patchset or can it be future work?
Thanks,
André
André Almeida (13):
futex2: Implement wait and wake functions
futex2: Add support for shared futexes
futex2: Implement vectorized wait
futex2: Implement requeue operation
futex2: Add compatibility entry point for x86_x32 ABI
docs: locking: futex2: Add documentation
selftests: futex2: Add wake/wait test
selftests: futex2: Add timeout test
selftests: futex2: Add wouldblock test
selftests: futex2: Add waitv test
selftests: futex2: Add requeue test
perf bench: Add futex2 benchmark tests
kernel: Enable waitpid() for futex2
Documentation/locking/futex2.rst | 198 +++
Documentation/locking/index.rst | 1 +
MAINTAINERS | 2 +-
arch/arm/tools/syscall.tbl | 4 +
arch/arm64/include/asm/unistd.h | 2 +-
arch/arm64/include/asm/unistd32.h | 4 +
arch/x86/entry/syscalls/syscall_32.tbl | 4 +
arch/x86/entry/syscalls/syscall_64.tbl | 4 +
fs/inode.c | 1 +
include/linux/compat.h | 23 +
include/linux/fs.h | 1 +
include/linux/syscalls.h | 18 +
include/uapi/asm-generic/unistd.h | 14 +-
include/uapi/linux/futex.h | 56 +
init/Kconfig | 7 +
kernel/Makefile | 1 +
kernel/fork.c | 2 +
kernel/futex2.c | 1255 +++++++++++++++++
kernel/sys_ni.c | 6 +
tools/arch/x86/include/asm/unistd_64.h | 12 +
tools/include/uapi/asm-generic/unistd.h | 11 +-
.../arch/x86/entry/syscalls/syscall_64.tbl | 3 +
tools/perf/bench/bench.h | 4 +
tools/perf/bench/futex-hash.c | 24 +-
tools/perf/bench/futex-requeue.c | 57 +-
tools/perf/bench/futex-wake-parallel.c | 41 +-
tools/perf/bench/futex-wake.c | 37 +-
tools/perf/bench/futex.h | 47 +
tools/perf/builtin-bench.c | 18 +-
.../selftests/futex/functional/.gitignore | 3 +
.../selftests/futex/functional/Makefile | 8 +-
.../futex/functional/futex2_requeue.c | 164 +++
.../selftests/futex/functional/futex2_wait.c | 209 +++
.../selftests/futex/functional/futex2_waitv.c | 157 +++
.../futex/functional/futex_wait_timeout.c | 58 +-
.../futex/functional/futex_wait_wouldblock.c | 33 +-
.../testing/selftests/futex/functional/run.sh | 6 +
.../selftests/futex/include/futex2test.h | 121 ++
38 files changed, 2563 insertions(+), 53 deletions(-)
create mode 100644 Documentation/locking/futex2.rst
create mode 100644 kernel/futex2.c
create mode 100644 tools/testing/selftests/futex/functional/futex2_requeue.c
create mode 100644 tools/testing/selftests/futex/functional/futex2_wait.c
create mode 100644 tools/testing/selftests/futex/functional/futex2_waitv.c
create mode 100644 tools/testing/selftests/futex/include/futex2test.h
--
2.30.1
v1 by Uriel is here: [1].
Since it's been a while, I've dropped the Reviewed-By's.
It depended on commit 83c4e7a0363b ("KUnit: KASAN Integration") which
hadn't been merged yet, so that caused some kerfuffle with applying them
previously and the series was reverted.
This revives the series but makes the kunit_fail_current_test() function
take a format string and logs the file and line number of the failing
code, addressing Alan Maguire's comments on the previous version.
As a result, the patch that makes UBSAN errors was tweaked slightly to
include an error message.
v2 -> v3:
Fix kunit_fail_current_test() so it works w/ CONFIG_KUNIT=m
s/_/__ on the helper func to match others in test.c
[1] https://lore.kernel.org/linux-kselftest/20200806174326.3577537-1-urielguaja…
Uriel Guajardo (2):
kunit: support failure from dynamic analysis tools
kunit: ubsan integration
include/kunit/test-bug.h | 30 ++++++++++++++++++++++++++++++
lib/kunit/test.c | 37 +++++++++++++++++++++++++++++++++----
lib/ubsan.c | 3 +++
3 files changed, 66 insertions(+), 4 deletions(-)
create mode 100644 include/kunit/test-bug.h
base-commit: 1e0d27fce010b0a4a9e595506b6ede75934c31be
--
2.30.0.478.g8a0d178c01-goog
pipe() and socketpair() have different behavior wrt edge-triggered
read epoll, in that no event is generated when data is written into
a non-empty pipe, but an event is generated if socketpair() is used
instead.
This simple modification of the epoll2 testlet from
tools/testing/selftests/filesystems/epoll/epoll_wakeup_test.c
(it just adds a second write) shows the different behavior.
The testlet passes with pipe() but fails with socketpair() with 5.10.
They both fail with 4.19.
Is it fair to assume that 5.10 pipe's behavior is the correct one?
Thanks,
Francesco Ruggeri
/*
* t0
* | (ew)
* e0
* | (et)
* s0
*/
TEST(epoll2)
{
int efd;
int sfd[2];
struct epoll_event e;
ASSERT_EQ(socketpair(AF_UNIX, SOCK_STREAM, 0, sfd), 0);
//ASSERT_EQ(pipe(sfd), 0);
efd = epoll_create(1);
ASSERT_GE(efd, 0);
e.events = EPOLLIN | EPOLLET;
ASSERT_EQ(epoll_ctl(efd, EPOLL_CTL_ADD, sfd[0], &e), 0);
ASSERT_EQ(write(sfd[1], "w", 1), 1);
EXPECT_EQ(epoll_wait(efd, &e, 1, 0), 1);
ASSERT_EQ(write(sfd[1], "w", 1), 1);
EXPECT_EQ(epoll_wait(efd, &e, 1, 0), 0);
close(efd);
close(sfd[0]);
close(sfd[1]);
}
This is an optimization of a set of sgx-related codes, each of which
is independent of the patch. Because the second and third patches have
conflicting dependencies, these patches are put together.
---
v3 changes:
* split free_cnt count and spin lock optimization into two patches
v2 changes:
* review suggested changes
Tianjia Zhang (5):
selftests/x86: Simplify the code to get vdso base address in sgx
x86/sgx: Optimize the locking range in sgx_sanitize_section()
x86/sgx: Optimize the free_cnt count in sgx_epc_section
x86/sgx: Allows ioctl PROVISION to execute before CREATE
x86/sgx: Remove redundant if conditions in sgx_encl_create
arch/x86/kernel/cpu/sgx/driver.c | 1 +
arch/x86/kernel/cpu/sgx/ioctl.c | 9 +++++----
arch/x86/kernel/cpu/sgx/main.c | 13 +++++--------
tools/testing/selftests/sgx/main.c | 24 ++++--------------------
4 files changed, 15 insertions(+), 32 deletions(-)
--
2.19.1.3.ge56e4f7
Changelog
---------
v11
- Another build fix reported by robot on i386: moved is_pinnable_page()
below set_page_section() in linux/mm.h
v10
- Fixed !CONFIG_MMU compiler issues by adding is_zero_pfn() stub.
v9
- Renamed gpf_to_alloc_flags() to gfp_to_alloc_flags_cma(); thanks Lecopzer
Chen for noticing.
- Fixed warning reported scripts/checkpatch.pl:
"Logical continuations should be on the previous line"
v8
- Added reviewed by's from John Hubbard
- Fixed subjects for selftests patches
- Moved zero page check inside is_pinnable_page() as requested by Jason
Gunthorpe.
v7
- Added reviewed-by's
- Fixed a compile bug on non-mmu builds reported by robot
v6
Small update, but I wanted to send it out quicker, as it removes a
controversial patch and replaces it with something sane.
- Removed forcing FOLL_WRITE for longterm gup, instead added a patch to
skip zero pages during migration.
- Added reviewed-by's and minor log changes.
v5
- Added the following patches to the beginning of series, which are fixes
to the other existing problems with CMA migration code:
mm/gup: check every subpage of a compound page during isolation
mm/gup: return an error on migration failure
mm/gup: check for isolation errors also at the beginning of series
mm/gup: do not allow zero page for pinned pages
- remove .gfp_mask/.reclaim_idx changes from mm/vmscan.c
- update movable zone header comment in patch 8 instead of patch 3, fix
the comment
- Added acked, sign-offs
- Updated commit logs based on feedback
- Addressed issues reported by Michal and Jason.
- Remove:
#define PINNABLE_MIGRATE_MAX 10
#define PINNABLE_ISOLATE_MAX 100
Instead: fail on the first migration failure, and retry isolation
forever as their failures are transient.
- In self-set addressed some of the comments from John Hubbard, updated
commit logs, and added comments. Renamed gup->flags with gup->test_flags.
v4
- Address page migration comments. New patch:
mm/gup: limit number of gup migration failures, honor failures
Implements the limiting number of retries for migration failures, and
also check for isolation failures.
Added a test case into gup_test to verify that pages never long-term
pinned in a movable zone, and also added tests to fault both in kernel
and in userland.
v3
- Merged with linux-next, which contains clean-up patch from Jason,
therefore this series is reduced by two patches which did the same
thing.
v2
- Addressed all review comments
- Added Reviewed-by's.
- Renamed PF_MEMALLOC_NOMOVABLE to PF_MEMALLOC_PIN
- Added is_pinnable_page() to check if page can be longterm pinned
- Fixed gup fast path by checking is_in_pinnable_zone()
- rename cma_page_list to movable_page_list
- add a admin-guide note about handling pinned pages in ZONE_MOVABLE,
updated caveat about pinned pages from linux/mmzone.h
- Move current_gfp_context() to fast-path
---------
When page is pinned it cannot be moved and its physical address stays
the same until pages is unpinned.
This is useful functionality to allows userland to implementation DMA
access. For example, it is used by vfio in vfio_pin_pages().
However, this functionality breaks memory hotplug/hotremove assumptions
that pages in ZONE_MOVABLE can always be migrated.
This patch series fixes this issue by forcing new allocations during
page pinning to omit ZONE_MOVABLE, and also to migrate any existing
pages from ZONE_MOVABLE during pinning.
It uses the same scheme logic that is currently used by CMA, and extends
the functionality for all allocations.
For more information read the discussion [1] about this problem.
[1] https://lore.kernel.org/lkml/CA+CK2bBffHBxjmb9jmSKacm0fJMinyt3Nhk8Nx6iudcQS…
Previous versions:
v1
https://lore.kernel.org/lkml/20201202052330.474592-1-pasha.tatashin@soleen.…
v2
https://lore.kernel.org/lkml/20201210004335.64634-1-pasha.tatashin@soleen.c…
v3
https://lore.kernel.org/lkml/20201211202140.396852-1-pasha.tatashin@soleen.…
v4
https://lore.kernel.org/lkml/20201217185243.3288048-1-pasha.tatashin@soleen…
v5
https://lore.kernel.org/lkml/20210119043920.155044-1-pasha.tatashin@soleen.…
v6
https://lore.kernel.org/lkml/20210120014333.222547-1-pasha.tatashin@soleen.…
v7
https://lore.kernel.org/lkml/20210122033748.924330-1-pasha.tatashin@soleen.…
v8
https://lore.kernel.org/lkml/20210125194751.1275316-1-pasha.tatashin@soleen…
v9
https://lore.kernel.org/lkml/20210201153827.444374-1-pasha.tatashin@soleen.…
v10
https://lore.kernel.org/lkml/20210211162427.618913-1-pasha.tatashin@soleen.…
Pavel Tatashin (14):
mm/gup: don't pin migrated cma pages in movable zone
mm/gup: check every subpage of a compound page during isolation
mm/gup: return an error on migration failure
mm/gup: check for isolation errors
mm cma: rename PF_MEMALLOC_NOCMA to PF_MEMALLOC_PIN
mm: apply per-task gfp constraints in fast path
mm: honor PF_MEMALLOC_PIN for all movable pages
mm/gup: do not migrate zero page
mm/gup: migrate pinned pages out of movable zone
memory-hotplug.rst: add a note about ZONE_MOVABLE and page pinning
mm/gup: change index type to long as it counts pages
mm/gup: longterm pin migration cleanup
selftests/vm: gup_test: fix test flag
selftests/vm: gup_test: test faulting in kernel, and verify pinnable
pages
.../admin-guide/mm/memory-hotplug.rst | 9 +
include/linux/migrate.h | 1 +
include/linux/mm.h | 19 ++
include/linux/mmzone.h | 13 +-
include/linux/pgtable.h | 12 ++
include/linux/sched.h | 2 +-
include/linux/sched/mm.h | 27 +--
include/trace/events/migrate.h | 3 +-
mm/gup.c | 174 ++++++++----------
mm/gup_test.c | 29 +--
mm/gup_test.h | 3 +-
mm/hugetlb.c | 4 +-
mm/page_alloc.c | 33 ++--
tools/testing/selftests/vm/gup_test.c | 36 +++-
14 files changed, 208 insertions(+), 157 deletions(-)
--
2.25.1
Musl libc does not define the glibc specific macro __GLIBC_PREREQ(), but
it has the clock_adjtime() function. Assume that a libc implementation
which does not define __GLIBC_PREREQ at all still implements
clock_adjtime().
This fixes a build problem with musl libc because the __GLIBC_PREREQ()
macro is missing.
Fixes: 42e1358e103d ("ptp: In the testptp utility, use clock_adjtime from glibc when available")
Signed-off-by: Hauke Mehrtens <hauke(a)hauke-m.de>
---
tools/testing/selftests/ptp/testptp.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tools/testing/selftests/ptp/testptp.c b/tools/testing/selftests/ptp/testptp.c
index f7911aaeb007..ecffe2c78543 100644
--- a/tools/testing/selftests/ptp/testptp.c
+++ b/tools/testing/selftests/ptp/testptp.c
@@ -38,6 +38,7 @@
#define NSEC_PER_SEC 1000000000LL
/* clock_adjtime is not available in GLIBC < 2.14 */
+#ifdef __GLIBC_PREREQ
#if !__GLIBC_PREREQ(2, 14)
#include <sys/syscall.h>
static int clock_adjtime(clockid_t id, struct timex *tx)
@@ -45,6 +46,7 @@ static int clock_adjtime(clockid_t id, struct timex *tx)
return syscall(__NR_clock_adjtime, id, tx);
}
#endif
+#endif /* __GLIBC_PREREQ */
static void show_flag_test(int rq_index, unsigned int flags, int err)
{
--
2.20.1
This is an optimization of a set of sgx-related codes, each of which
is independent of the patch. Because the second and third patches have
conflicting dependencies, these patches are put together.
---
v4 changes:
* Improvements suggested by review
v3 changes:
* split free_cnt count and spin lock optimization into two patches
v2 changes:
* review suggested changes
Tianjia Zhang (5):
selftests/x86: Use getauxval() to simplify the code in sgx
x86/sgx: Reduce the locking range in sgx_sanitize_section()
x86/sgx: Optimize the free_cnt count in sgx_epc_section
x86/sgx: Allows ioctl PROVISION to execute before CREATE
x86/sgx: Remove redundant if conditions in sgx_encl_create
arch/x86/kernel/cpu/sgx/driver.c | 1 +
arch/x86/kernel/cpu/sgx/ioctl.c | 8 ++++----
arch/x86/kernel/cpu/sgx/main.c | 13 +++++--------
tools/testing/selftests/sgx/main.c | 24 ++++--------------------
4 files changed, 14 insertions(+), 32 deletions(-)
--
2.19.1.3.ge56e4f7
Changelog
---------
v10
- Fixed !CONFIG_MMU compiler issues by adding is_zero_pfn() stub.
v9
- Renamed gpf_to_alloc_flags() to gfp_to_alloc_flags_cma(); thanks Lecopzer
Chen for noticing.
- Fixed warning reported scripts/checkpatch.pl:
"Logical continuations should be on the previous line"
v8
- Added reviewed by's from John Hubbard
- Fixed subjects for selftests patches
- Moved zero page check inside is_pinnable_page() as requested by Jason
Gunthorpe.
v7
- Added reviewed-by's
- Fixed a compile bug on non-mmu builds reported by robot
v6
Small update, but I wanted to send it out quicker, as it removes a
controversial patch and replaces it with something sane.
- Removed forcing FOLL_WRITE for longterm gup, instead added a patch to
skip zero pages during migration.
- Added reviewed-by's and minor log changes.
v5
- Added the following patches to the beginning of series, which are fixes
to the other existing problems with CMA migration code:
mm/gup: check every subpage of a compound page during isolation
mm/gup: return an error on migration failure
mm/gup: check for isolation errors also at the beginning of series
mm/gup: do not allow zero page for pinned pages
- remove .gfp_mask/.reclaim_idx changes from mm/vmscan.c
- update movable zone header comment in patch 8 instead of patch 3, fix
the comment
- Added acked, sign-offs
- Updated commit logs based on feedback
- Addressed issues reported by Michal and Jason.
- Remove:
#define PINNABLE_MIGRATE_MAX 10
#define PINNABLE_ISOLATE_MAX 100
Instead: fail on the first migration failure, and retry isolation
forever as their failures are transient.
- In self-set addressed some of the comments from John Hubbard, updated
commit logs, and added comments. Renamed gup->flags with gup->test_flags.
v4
- Address page migration comments. New patch:
mm/gup: limit number of gup migration failures, honor failures
Implements the limiting number of retries for migration failures, and
also check for isolation failures.
Added a test case into gup_test to verify that pages never long-term
pinned in a movable zone, and also added tests to fault both in kernel
and in userland.
v3
- Merged with linux-next, which contains clean-up patch from Jason,
therefore this series is reduced by two patches which did the same
thing.
v2
- Addressed all review comments
- Added Reviewed-by's.
- Renamed PF_MEMALLOC_NOMOVABLE to PF_MEMALLOC_PIN
- Added is_pinnable_page() to check if page can be longterm pinned
- Fixed gup fast path by checking is_in_pinnable_zone()
- rename cma_page_list to movable_page_list
- add a admin-guide note about handling pinned pages in ZONE_MOVABLE,
updated caveat about pinned pages from linux/mmzone.h
- Move current_gfp_context() to fast-path
---------
When page is pinned it cannot be moved and its physical address stays
the same until pages is unpinned.
This is useful functionality to allows userland to implementation DMA
access. For example, it is used by vfio in vfio_pin_pages().
However, this functionality breaks memory hotplug/hotremove assumptions
that pages in ZONE_MOVABLE can always be migrated.
This patch series fixes this issue by forcing new allocations during
page pinning to omit ZONE_MOVABLE, and also to migrate any existing
pages from ZONE_MOVABLE during pinning.
It uses the same scheme logic that is currently used by CMA, and extends
the functionality for all allocations.
For more information read the discussion [1] about this problem.
[1] https://lore.kernel.org/lkml/CA+CK2bBffHBxjmb9jmSKacm0fJMinyt3Nhk8Nx6iudcQS…
Previous versions:
v1
https://lore.kernel.org/lkml/20201202052330.474592-1-pasha.tatashin@soleen.…
v2
https://lore.kernel.org/lkml/20201210004335.64634-1-pasha.tatashin@soleen.c…
v3
https://lore.kernel.org/lkml/20201211202140.396852-1-pasha.tatashin@soleen.…
v4
https://lore.kernel.org/lkml/20201217185243.3288048-1-pasha.tatashin@soleen…
v5
https://lore.kernel.org/lkml/20210119043920.155044-1-pasha.tatashin@soleen.…
v6
https://lore.kernel.org/lkml/20210120014333.222547-1-pasha.tatashin@soleen.…
v7
https://lore.kernel.org/lkml/20210122033748.924330-1-pasha.tatashin@soleen.…
v8
https://lore.kernel.org/lkml/20210125194751.1275316-1-pasha.tatashin@soleen…
v9
https://lore.kernel.org/lkml/20210201153827.444374-1-pasha.tatashin@soleen.…
Pavel Tatashin (14):
mm/gup: don't pin migrated cma pages in movable zone
mm/gup: check every subpage of a compound page during isolation
mm/gup: return an error on migration failure
mm/gup: check for isolation errors
mm cma: rename PF_MEMALLOC_NOCMA to PF_MEMALLOC_PIN
mm: apply per-task gfp constraints in fast path
mm: honor PF_MEMALLOC_PIN for all movable pages
mm/gup: do not migrate zero page
mm/gup: migrate pinned pages out of movable zone
memory-hotplug.rst: add a note about ZONE_MOVABLE and page pinning
mm/gup: change index type to long as it counts pages
mm/gup: longterm pin migration cleanup
selftests/vm: gup_test: fix test flag
selftests/vm: gup_test: test faulting in kernel, and verify pinnable
pages
.../admin-guide/mm/memory-hotplug.rst | 9 +
include/linux/migrate.h | 1 +
include/linux/mm.h | 19 ++
include/linux/mmzone.h | 13 +-
include/linux/pgtable.h | 12 ++
include/linux/sched.h | 2 +-
include/linux/sched/mm.h | 27 +--
include/trace/events/migrate.h | 3 +-
mm/gup.c | 174 ++++++++----------
mm/gup_test.c | 29 +--
mm/gup_test.h | 3 +-
mm/hugetlb.c | 4 +-
mm/page_alloc.c | 33 ++--
tools/testing/selftests/vm/gup_test.c | 36 +++-
14 files changed, 208 insertions(+), 157 deletions(-)
--
2.25.1
If a signed number field starts with a '-' the field width must be > 1,
or unlimited, to allow at least one digit after the '-'.
This patch adds a check for this. If a signed field starts with '-'
and field_width == 1 the scanf will quit.
It is ok for a signed number field to have a field width of 1 if it
starts with a digit. In that case the single digit can be converted.
Signed-off-by: Richard Fitzgerald <rf(a)opensource.cirrus.com>
Reviewed-by: Petr Mladek <pmladek(a)suse.com>
Acked-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
---
lib/vsprintf.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 3b53c73580c5..28bb26cd1f67 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -3434,8 +3434,12 @@ int vsscanf(const char *buf, const char *fmt, va_list args)
str = skip_spaces(str);
digit = *str;
- if (is_sign && digit == '-')
+ if (is_sign && digit == '-') {
+ if (field_width == 1)
+ break;
+
digit = *(str + 1);
+ }
if (!digit
|| (base == 16 && !isxdigit(digit))
--
2.20.1
If a signed number field starts with a '-' the field width must be > 1,
or unlimited, to allow at least one digit after the '-'.
This patch adds a check for this. If a signed field starts with '-'
and field_width == 1 the scanf will quit.
It is ok for a signed number field to have a field width of 1 if it
starts with a digit. In that case the single digit can be converted.
Signed-off-by: Richard Fitzgerald <rf(a)opensource.cirrus.com>
Reviewed-by: Petr Mladek <pmladek(a)suse.com>
Acked-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
---
lib/vsprintf.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 3b53c73580c5..28bb26cd1f67 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -3434,8 +3434,12 @@ int vsscanf(const char *buf, const char *fmt, va_list args)
str = skip_spaces(str);
digit = *str;
- if (is_sign && digit == '-')
+ if (is_sign && digit == '-') {
+ if (field_width == 1)
+ break;
+
digit = *(str + 1);
+ }
if (!digit
|| (base == 16 && !isxdigit(digit))
--
2.20.1
From: Willem de Bruijn <willemb(a)google.com>
FIXTURE_VARIANT data is passed to FIXTURE_SETUP and TEST_F as variant.
In some cases, the variant will change the setup, such that expections
also change on teardown. Also pass variant to FIXTURE_TEARDOWN.
The new FIXTURE_TEARDOWN logic is identical to that in FIXTURE_SETUP,
right above.
Signed-off-by: Willem de Bruijn <willemb(a)google.com>
---
For one use of this see tentative
tools/testing/selftests/filesystems/selectpoll.c kselftest at
https://github.com/wdebruij/linux-next-mirror/commit/12b4d183ac9140c1360637…
---
tools/testing/selftests/kselftest_harness.h | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/kselftest_harness.h b/tools/testing/selftests/kselftest_harness.h
index f19804df244c..6a27e79278e8 100644
--- a/tools/testing/selftests/kselftest_harness.h
+++ b/tools/testing/selftests/kselftest_harness.h
@@ -283,7 +283,9 @@
#define FIXTURE_TEARDOWN(fixture_name) \
void fixture_name##_teardown( \
struct __test_metadata __attribute__((unused)) *_metadata, \
- FIXTURE_DATA(fixture_name) __attribute__((unused)) *self)
+ FIXTURE_DATA(fixture_name) __attribute__((unused)) *self, \
+ const FIXTURE_VARIANT(fixture_name) \
+ __attribute__((unused)) *variant)
/**
* FIXTURE_VARIANT(fixture_name) - Optionally called once per fixture
@@ -298,9 +300,9 @@
* ...
* };
*
- * Defines type of constant parameters provided to FIXTURE_SETUP() and TEST_F()
- * as *variant*. Variants allow the same tests to be run with different
- * arguments.
+ * Defines type of constant parameters provided to FIXTURE_SETUP(), TEST_F() and
+ * FIXTURE_TEARDOWN as *variant*. Variants allow the same tests to be run with
+ * different arguments.
*/
#define FIXTURE_VARIANT(fixture_name) struct _fixture_variant_##fixture_name
@@ -382,7 +384,7 @@
if (!_metadata->passed) \
return; \
fixture_name##_##test_name(_metadata, &self, variant->data); \
- fixture_name##_teardown(_metadata, &self); \
+ fixture_name##_teardown(_metadata, &self, variant->data); \
} \
static struct __test_metadata \
_##fixture_name##_##test_name##_object = { \
--
2.29.2.576.ga3fc446d84-goog
We're working on a user space control plane for the BPF sk_lookup
hook [1]. The hook attaches to a network namespace and allows
control over which socket receives a new connection / packet.
Roughly, applications can give a socket to our user space component
to participate in custom bind semantics. This creates an edge case
where an application can provide us with a socket that lives in
a different network namespace than our BPF sk_lookup program.
We'd like to return an error in this case.
Additionally, we have some user space state that is tied to the
network namespace. We currently use the inode of the nsfs entry
in a directory name, but this is suffers from inode reuse.
I'm proposing to fix both of these issues by adding a new
SO_NETNS_COOKIE socket option as well as a NS_GET_COOKIE ioctl.
Using these we get a stable, unique identifier for a network
namespace and check whether a socket belongs to the "correct"
namespace.
NS_GET_COOKIE could be renamed to NS_GET_NET_COOKIE. I kept the
name generic because it seems like other namespace types could
benefit from a cookie as well.
I'm trying to land this via the bpf tree since this is where the
netns cookie originated, please let me know if this isn't
appropriate.
1: https://www.kernel.org/doc/html/latest/bpf/prog_sk_lookup.html
Cc: bpf(a)vger.kernel.org
Cc: linux-alpha(a)vger.kernel.org
Cc: linux-api(a)vger.kernel.org
Cc: linux-arch(a)vger.kernel.org
Cc: linux-fsdevel(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Cc: linux-kselftest(a)vger.kernel.org
Cc: linux-mips(a)vger.kernel.org
Cc: linux-parisc(a)vger.kernel.org
Cc: netdev(a)vger.kernel.org
Cc: sparclinux(a)vger.kernel.org
Lorenz Bauer (4):
net: add SO_NETNS_COOKIE socket option
nsfs: add an ioctl to discover the network namespace cookie
tools/testing: add test for NS_GET_COOKIE
tools/testing: add a selftest for SO_NETNS_COOKIE
arch/alpha/include/uapi/asm/socket.h | 2 +
arch/mips/include/uapi/asm/socket.h | 2 +
arch/parisc/include/uapi/asm/socket.h | 2 +
arch/sparc/include/uapi/asm/socket.h | 2 +
fs/nsfs.c | 9 +++
include/linux/sock_diag.h | 20 ++++++
include/net/net_namespace.h | 11 ++++
include/uapi/asm-generic/socket.h | 2 +
include/uapi/linux/nsfs.h | 2 +
net/core/filter.c | 9 ++-
net/core/sock.c | 7 +++
tools/testing/selftests/net/.gitignore | 1 +
tools/testing/selftests/net/Makefile | 2 +-
tools/testing/selftests/net/so_netns_cookie.c | 61 +++++++++++++++++++
tools/testing/selftests/nsfs/.gitignore | 1 +
tools/testing/selftests/nsfs/Makefile | 2 +-
tools/testing/selftests/nsfs/netns.c | 57 +++++++++++++++++
17 files changed, 185 insertions(+), 7 deletions(-)
create mode 100644 tools/testing/selftests/net/so_netns_cookie.c
create mode 100644 tools/testing/selftests/nsfs/netns.c
--
2.27.0
Hi,
This test is added to serve as a performance tester and a bug reproducer
for kvm page table code (GPA->HPA mappings), it gives guidance for the
people trying to make some improvement for kvm.
The following explains what we can exactly do through this test.
And a RFC is sent for comments, thanks.
The function guest_code() is designed to cover conditions where a single vcpu
or multiple vcpus access guest pages within the same memory range, in three
VM stages(before dirty-logging, during dirty-logging, after dirty-logging).
Besides, the backing source memory type(ANONYMOUS/THP/HUGETLB) of the tested
memory region can be specified by users, which means normal page mappings or
block mappings can be chosen by users to be created in the test.
If use of ANONYMOUS memory is specified, kvm will create page mappings for the
tested memory region before dirty-logging, and update attributes of the page
mappings from RO to RW during dirty-logging. If use of THP/HUGETLB memory is
specified, kvm will create block mappings for the tested memory region before
dirty-logging, and split the blcok mappings into page mappings during
dirty-logging, and coalesce the page mappings back into block mappings after
dirty-logging is stopped.
So in summary, as a performance tester, this test can present the performance
of kvm creating/updating normal page mappings, or the performance of kvm
creating/splitting/recovering block mappings, through execution time.
When we need to coalesce the page mappings back to block mappings after dirty
logging is stopped, we have to firstly invalidate *all* the TLB entries for the
page mappings right before installation of the block entry, because a TLB conflict
abort error could occur if we can't invalidate the TLB entries fully. We have
hit this TLB conflict twice on aarch64 software implementation and fixed it.
As this test can imulate process from dirty-logging enabled to dirty-logging
stopped of a VM with block mappings, so it can also reproduce this TLB conflict
abort due to inadequate TLB invalidation when coalescing tables.
Links about the TLB conflict abort:
https://lore.kernel.org/lkml/20201201201034.116760-3-wangyanan55@huawei.com/
---
Here are some test examples of this test:
platform: HiSilicon Kunpeng920 (aarch64, FWB not supported)
host kernel: Linux mainline
(1) Based on v5.11-rc6
cmdline: ./kvm_page_table_test -m 4 -t 0 -g 4K -s 1G -v 1
(1 vcpu, 1G memory, page mappings(granule 4K))
KVM_CREATE_MAPPINGS: 0.8196s 0.8260s 0.8258s 0.8169s 0.8190s
KVM_UPDATE_MAPPINGS: 1.1930s 1.1949s 1.1940s 1.1934s 1.1946s
cmdline: ./kvm_page_table_test -m 4 -t 0 -g 4K -s 1G -v 20
(20 vcpus, 1G memory, page mappings(granule 4K))
KVM_CREATE_MAPPINGS: 23.4028s 23.8015s 23.6702s 23.9437s 22.1646s
KVM_UPDATE_MAPPINGS: 16.9550s 16.4734s 16.8300s 16.9621s 16.9402s
cmdline: ./kvm_page_table_test -m 4 -t 2 -g 1G -s 20G -v 1
(1 vcpu, 20G memory, block mappings(granule 1G))
KVM_CREATE_MAPPINGS: 3.7040s 3.7053s 3.7047s 3.7061s 3.7068s
KVM_ADJUST_MAPPINGS: 2.8264s 2.8266s 2.8272s 2.8259s 2.8283s
cmdline: ./kvm_page_table_test -m 4 -t 2 -g 1G -s 20G -v 20
(20 vcpus, 20G memory, block mappings(granule 1G))
KVM_CREATE_MAPPINGS: 52.8338s 52.8327s 52.8336s 52.8255s 52.8303s
KVM_ADJUST_MAPPINGS: 52.0466s 52.0473s 52.0550s 52.0518s 52.0467s
(2) I have post a patch series to improve efficiency of stage2 page table code,
so test the performance changes.
cmdline: ./kvm_page_table_test -m 4 -t 2 -g 1G -s 20G -v 20
(20 vcpus, 20G memory, block mappings(granule 1G))
Before patch: KVM_CREATE_MAPPINGS: 52.8338s 52.8327s 52.8336s 52.8255s 52.8303s
After patch: KVM_CREATE_MAPPINGS: 3.7022s 3.7031s 3.7028s 3.7012s 3.7024s
Before patch: KVM_ADJUST_MAPPINGS: 52.0466s 52.0473s 52.0550s 52.0518s 52.0467s
After patch: KVM_ADJUST_MAPPINGS: 0.3008s 0.3004s 0.2974s 0.2917s 0.2900s
cmdline: ./kvm_page_table_test -m 4 -t 2 -g 1G -s 20G -v 40
(40 vcpus, 20G memory, block mappings(granule 1G))
Before patch: KVM_CREATE_MAPPINGS: 104.560s 104.556s 104.554s 104.556s 104.550s
After patch: KVM_CREATE_MAPPINGS: 3.7011s 3.7103s 3.7005s 3.7024s 3.7106s
Before patch: KVM_ADJUST_MAPPINGS: 103.931s 103.936s 103.927s 103.942s 103.927s
After patch: KVM_ADJUST_MAPPINGS: 0.3541s 0.3694s 0.3656s 0.3693s 0.3687s
---
Yanan Wang (2):
KVM: selftests: Add a macro to get string of vm_mem_backing_src_type
KVM: selftests: Add a test for kvm page table code
tools/testing/selftests/kvm/Makefile | 3 +
.../testing/selftests/kvm/include/kvm_util.h | 3 +
.../selftests/kvm/kvm_page_table_test.c | 518 ++++++++++++++++++
tools/testing/selftests/kvm/lib/kvm_util.c | 8 +
4 files changed, 532 insertions(+)
create mode 100644 tools/testing/selftests/kvm/kvm_page_table_test.c
--
2.23.0
Only older versions of the RISC-V GCC toolchain define __riscv__. Check
for __riscv as well, which is used by newer GCC toolchains. Also set
VDSO_32BIT based on __riscv_xlen.
Before (on riscv64):
$ ./vdso_test_abi
[vDSO kselftest] VDSO_VERSION: LINUX_4
Could not find __vdso_gettimeofday
Could not find __vdso_clock_gettime
Could not find __vdso_clock_getres
clock_id: CLOCK_REALTIME [PASS]
Could not find __vdso_clock_gettime
Could not find __vdso_clock_getres
clock_id: CLOCK_BOOTTIME [PASS]
Could not find __vdso_clock_gettime
Could not find __vdso_clock_getres
clock_id: CLOCK_TAI [PASS]
Could not find __vdso_clock_gettime
Could not find __vdso_clock_getres
clock_id: CLOCK_REALTIME_COARSE [PASS]
Could not find __vdso_clock_gettime
Could not find __vdso_clock_getres
clock_id: CLOCK_MONOTONIC [PASS]
Could not find __vdso_clock_gettime
Could not find __vdso_clock_getres
clock_id: CLOCK_MONOTONIC_RAW [PASS]
Could not find __vdso_clock_gettime
Could not find __vdso_clock_getres
clock_id: CLOCK_MONOTONIC_COARSE [PASS]
Could not find __vdso_time
After (on riscv32):
$ ./vdso_test_abi
[vDSO kselftest] VDSO_VERSION: LINUX_4.15
The time is 1612449376.015086
The time is 1612449376.18340784
The resolution is 0 1
clock_id: CLOCK_REALTIME [PASS]
The time is 774.842586182
The resolution is 0 1
clock_id: CLOCK_BOOTTIME [PASS]
The time is 1612449376.22536565
The resolution is 0 1
clock_id: CLOCK_TAI [PASS]
The time is 1612449376.20885172
The resolution is 0 4000000
clock_id: CLOCK_REALTIME_COARSE [PASS]
The time is 774.845491269
The resolution is 0 1
clock_id: CLOCK_MONOTONIC [PASS]
The time is 774.849534200
The resolution is 0 1
clock_id: CLOCK_MONOTONIC_RAW [PASS]
The time is 774.842139684
The resolution is 0 4000000
clock_id: CLOCK_MONOTONIC_COARSE [PASS]
Could not find __vdso_time
Signed-off-by: Tobias Klauser <tklauser(a)distanz.ch>
---
tools/testing/selftests/vDSO/vdso_config.h | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/vDSO/vdso_config.h b/tools/testing/selftests/vDSO/vdso_config.h
index 6a6fe8d4ff55..6188b16827d1 100644
--- a/tools/testing/selftests/vDSO/vdso_config.h
+++ b/tools/testing/selftests/vDSO/vdso_config.h
@@ -47,10 +47,12 @@
#elif defined(__x86_64__)
#define VDSO_VERSION 0
#define VDSO_NAMES 1
-#elif defined(__riscv__)
+#elif defined(__riscv__) || defined(__riscv)
#define VDSO_VERSION 5
#define VDSO_NAMES 1
+#if __riscv_xlen == 32
#define VDSO_32BIT 1
+#endif
#else /* nds32 */
#define VDSO_VERSION 4
#define VDSO_NAMES 1
--
2.30.0
This test expects fds to have specific values, which works fine
when the test is run standalone. However, the kselftest runner
consumes a couple of extra fds for redirection when running
tests, so the test fails when run via kselftest.
Change the test to pass on any valid fd number.
Signed-off-by: Seth Forshee <seth.forshee(a)canonical.com>
---
tools/testing/selftests/seccomp/seccomp_bpf.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 26c72f2b61b1..9338df6f4ca8 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -4019,18 +4019,14 @@ TEST(user_notification_addfd)
/* Verify we can set an arbitrary remote fd */
fd = ioctl(listener, SECCOMP_IOCTL_NOTIF_ADDFD, &addfd);
- /*
- * The child has fds 0(stdin), 1(stdout), 2(stderr), 3(memfd),
- * 4(listener), so the newly allocated fd should be 5.
- */
- EXPECT_EQ(fd, 5);
+ EXPECT_GE(fd, 0);
EXPECT_EQ(filecmp(getpid(), pid, memfd, fd), 0);
/* Verify we can set an arbitrary remote fd with large size */
memset(&big, 0x0, sizeof(big));
big.addfd = addfd;
fd = ioctl(listener, SECCOMP_IOCTL_NOTIF_ADDFD_BIG, &big);
- EXPECT_EQ(fd, 6);
+ EXPECT_GE(fd, 0);
/* Verify we can set a specific remote fd */
addfd.newfd = 42;
--
2.29.2
v1 by Uriel is here: [1].
Since it's been a while, I've dropped the Reviewed-By's.
It depended on commit 83c4e7a0363b ("KUnit: KASAN Integration") which
hadn't been merged yet, so that caused some kerfuffle with applying them
previously and the series was reverted.
This revives the series but makes the kunit_fail_current_test() function
take a format string and logs the file and line number of the failing
code, addressing Alan Maguire's comments on the previous version.
As a result, the patch that makes UBSAN errors was tweaked slightly to
include an error message.
[1] https://lore.kernel.org/linux-kselftest/20200806174326.3577537-1-urielguaja…
Uriel Guajardo (2):
kunit: support failure from dynamic analysis tools
kunit: ubsan integration
include/kunit/test-bug.h | 30 ++++++++++++++++++++++++++++++
lib/kunit/test.c | 36 ++++++++++++++++++++++++++++++++----
lib/ubsan.c | 3 +++
3 files changed, 65 insertions(+), 4 deletions(-)
create mode 100644 include/kunit/test-bug.h
base-commit: 1e0d27fce010b0a4a9e595506b6ede75934c31be
--
2.30.0.478.g8a0d178c01-goog
Sincerely hope I did this right; first time contributor, learning the
patch workflow. I took the opportunity to fix two .gitignore files that
were leaving stale worktree/index outputs after running `make kselftest`
against recent mainline (76c057c84d286140c6c416c3b4ba832cd1d8984e).
Thanks for your time!
XFAIL was removed in commit 9847d24af95c ("selftests/harness: Refactor
XFAIL into SKIP") and its use in close_range_test was already replaced
by commit 1d44d0dd61b6 ("selftests: core: use SKIP instead of XFAIL in
close_range_test.c"). However, commit 23afeaeff3d9 ("selftests: core:
add tests for CLOSE_RANGE_CLOEXEC") introduced usage of XFAIL in
TEST(close_range_cloexec). Use SKIP there as well.
Cc: Giuseppe Scrivano <gscrivan(a)redhat.com>
Fixes: 23afeaeff3d9 ("selftests: core: add tests for CLOSE_RANGE_CLOEXEC")
Signed-off-by: Tobias Klauser <tklauser(a)distanz.ch>
---
tools/testing/selftests/core/close_range_test.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/core/close_range_test.c b/tools/testing/selftests/core/close_range_test.c
index 87e16d65d9d7..670fb30d62f6 100644
--- a/tools/testing/selftests/core/close_range_test.c
+++ b/tools/testing/selftests/core/close_range_test.c
@@ -241,7 +241,7 @@ TEST(close_range_cloexec)
fd = open("/dev/null", O_RDONLY);
ASSERT_GE(fd, 0) {
if (errno == ENOENT)
- XFAIL(return, "Skipping test since /dev/null does not exist");
+ SKIP(return, "Skipping test since /dev/null does not exist");
}
open_fds[i] = fd;
@@ -250,9 +250,9 @@ TEST(close_range_cloexec)
ret = sys_close_range(1000, 1000, CLOSE_RANGE_CLOEXEC);
if (ret < 0) {
if (errno == ENOSYS)
- XFAIL(return, "close_range() syscall not supported");
+ SKIP(return, "close_range() syscall not supported");
if (errno == EINVAL)
- XFAIL(return, "close_range() doesn't support CLOSE_RANGE_CLOEXEC");
+ SKIP(return, "close_range() doesn't support CLOSE_RANGE_CLOEXEC");
}
/* Ensure the FD_CLOEXEC bit is set also with a resource limit in place. */
--
2.29.0
Copied in from somewhere else, the makefile was including
the kerne's usr/include dir, which caused the asm/ioctl.h file
to be used.
Unfortunately, that file has different values for _IOC_SIZEBITS
and _IOC_WRITE than include/uapi/asm-generic/ioctl.h which then
causes the _IOCW macros to give the wrong ioctl numbers,
specifically for DMA_BUF_IOCTL_SYNC.
This patch simply removes the extra include from the Makefile
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Brian Starkey <brian.starkey(a)arm.com>
Cc: Sumit Semwal <sumit.semwal(a)linaro.org>
Cc: Laura Abbott <labbott(a)kernel.org>
Cc: Hridya Valsaraju <hridya(a)google.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Sandeep Patil <sspatil(a)google.com>
Cc: Daniel Mentz <danielmentz(a)google.com>
Cc: linux-media(a)vger.kernel.org
Cc: dri-devel(a)lists.freedesktop.org
Cc: linux-kselftest(a)vger.kernel.org
Fixes: a8779927fd86c ("kselftests: Add dma-heap test")
Signed-off-by: John Stultz <john.stultz(a)linaro.org>
---
tools/testing/selftests/dmabuf-heaps/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/dmabuf-heaps/Makefile b/tools/testing/selftests/dmabuf-heaps/Makefile
index 607c2acd2082..604b43ece15f 100644
--- a/tools/testing/selftests/dmabuf-heaps/Makefile
+++ b/tools/testing/selftests/dmabuf-heaps/Makefile
@@ -1,5 +1,5 @@
# SPDX-License-Identifier: GPL-2.0
-CFLAGS += -static -O3 -Wl,-no-as-needed -Wall -I../../../../usr/include
+CFLAGS += -static -O3 -Wl,-no-as-needed -Wall
TEST_GEN_PROGS = dmabuf-heap
--
2.25.1
When using `kunit.py run` to run tests, users must populate a
`kunitconfig` file to select the options the tests are hidden behind and
all their dependencies.
The patch [1] to allow specifying a path to kunitconfig promises to make
this nicer as we can have checked in files corresponding to different
sets of tests.
But it's still annoying
1) when trying to run a subet of tests
2) when you want to run tests that don't have such a pre-existing
kunitconfig and selecting all the necessary options is tricky.
This patch series aims to alleviate both:
1) `kunit.py run 'my-suite-*'`
I.e. use my current kunitconfig, but just run suites that match this glob
2) `kunit.py run --alltests 'my-suite-*'`
I.e. use allyesconfig so I don't have to worry about writing a
kunitconfig at all.
See the first commit message for more details and discussion about
future work.
This patch series also includes a bugfix for a latent bug that can't be
triggered right now but has worse consequences as a result of the
changes needed to plumb in this suite name glob.
[1] https://lore.kernel.org/linux-kselftest/20210201205514.3943096-1-dlatypov@g…
---
v1 -> v2:
Fix free of `suites` subarray in suite_set.
Found by Dan Carpenter and kernel test robot.
v2 -> v3:
Add MODULE_PARM_DESC() for kunit.filter_glob.
v3 -> v4:
Rebase on top of kunit_tool_test.py and typing fixes for merging.
Daniel Latypov (3):
kunit: add kunit.filter_glob cmdline option to filter suites
kunit: tool: add support for filtering suites by glob
kunit: tool: fix unintentional statefulness in run_kernel()
lib/kunit/Kconfig | 1 +
lib/kunit/executor.c | 93 +++++++++++++++++++++++---
tools/testing/kunit/kunit.py | 21 ++++--
tools/testing/kunit/kunit_kernel.py | 6 +-
tools/testing/kunit/kunit_tool_test.py | 15 +++--
5 files changed, 115 insertions(+), 21 deletions(-)
base-commit: aa919f3b019d0e10e0c035598546b30cca7bcb19
--
2.30.0.478.g8a0d178c01-goog
If a signed number field starts with a '-' the field width must be > 1,
or unlimited, to allow at least one digit after the '-'.
This patch adds a check for this. If a signed field starts with '-'
and field_width == 1 the scanf will quit.
It is ok for a signed number field to have a field width of 1 if it
starts with a digit. In that case the single digit can be converted.
Signed-off-by: Richard Fitzgerald <rf(a)opensource.cirrus.com>
Reviewed-by: Petr Mladek <pmladek(a)suse.com>
---
lib/vsprintf.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 3b53c73580c5..28bb26cd1f67 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -3434,8 +3434,12 @@ int vsscanf(const char *buf, const char *fmt, va_list args)
str = skip_spaces(str);
digit = *str;
- if (is_sign && digit == '-')
+ if (is_sign && digit == '-') {
+ if (field_width == 1)
+ break;
+
digit = *(str + 1);
+ }
if (!digit
|| (base == 16 && !isxdigit(digit))
--
2.20.1
When using `kunit.py run` to run tests, users must populate a
`kunitconfig` file to select the options the tests are hidden behind and
all their dependencies.
The patch [1] to allow specifying a path to kunitconfig promises to make
this nicer as we can have checked in files corresponding to different
sets of tests.
But it's still annoying
1) when trying to run a subet of tests
2) when you want to run tests that don't have such a pre-existing
kunitconfig and selecting all the necessary options is tricky.
This patch series aims to alleviate both:
1) `kunit.py run 'my-suite-*'`
I.e. use my current kunitconfig, but just run suites that match this glob
2) `kunit.py run --alltests 'my-suite-*'`
I.e. use allyesconfig so I don't have to worry about writing a
kunitconfig at all (this is a bit overkill, but it works!)
See the first commit message for more details and discussion about
future work.
This patch series also includes a bugfix for a latent bug that can't be
triggered right now but has worse consequences as a result of the
changes needed to plumb in this suite name glob.
[1] https://lore.kernel.org/linux-kselftest/20210201205514.3943096-1-dlatypov@g…
Daniel Latypov (3):
kunit: add kunit.filter_glob cmdline option to filter suites
kunit: tool: add support for filtering suites by glob
kunit: tool: fix unintentional statefulness in run_kernel()
lib/kunit/Kconfig | 1 +
lib/kunit/executor.c | 85 ++++++++++++++++++++++++++---
tools/testing/kunit/kunit.py | 21 +++++--
tools/testing/kunit/kunit_kernel.py | 6 +-
4 files changed, 99 insertions(+), 14 deletions(-)
base-commit: 88bb507a74ea7d75fa49edd421eaa710a7d80598
--
2.30.0.365.g02bc693789-goog
Sequence Number api provides interfaces for unsigned atomic up counters.
There are a number of atomic_t usages in the kernel where atomic_t api
is used for counting sequence numbers and other statistical counters.
Several of these usages, convert atomic_read() and atomic_inc_return()
return values to unsigned. Introducing sequence number ops supports
these use-cases with a standard core-api.
Sequence Number ops provide interfaces to initialize, increment and get
the sequence number. These ops also check for overflow and log message to
indicate when overflow occurs. This check is intended to help catch cases
where overflow could lead to problems.
Since v2:
- Uses atomic_inc_return() for incrementing the sequence number.
- No longer uses atomic_read()
Shuah Khan (7):
seqnum_ops: Introduce Sequence Number Ops
selftests: lib:test_seqnum_ops: add new test for seqnum_ops
drivers/acpi: convert seqno to use seqnum_ops
drivers/acpi/apei: convert seqno to seqnum_ops
drivers/staging/rtl8723bs: convert event_seq to use seqnum_ops
drivers/staging/rtl8188eu: convert event_seq to use seqnum_ops
kobject: convert uevent_seqnum to seqnum_ops
Documentation/core-api/index.rst | 1 +
Documentation/core-api/seqnum_ops.rst | 62 ++++++++
MAINTAINERS | 8 ++
drivers/acpi/acpi_extlog.c | 8 +-
drivers/acpi/apei/ghes.c | 8 +-
drivers/staging/rtl8188eu/core/rtw_mlme_ext.c | 23 ++-
.../staging/rtl8188eu/include/rtw_mlme_ext.h | 3 +-
drivers/staging/rtl8723bs/core/rtw_cmd.c | 3 +-
drivers/staging/rtl8723bs/core/rtw_mlme_ext.c | 33 +++--
drivers/staging/rtl8723bs/include/rtw_cmd.h | 3 +-
.../staging/rtl8723bs/include/rtw_mlme_ext.h | 3 +-
include/linux/kobject.h | 3 +-
include/linux/seqnum_ops.h | 131 +++++++++++++++++
kernel/ksysfs.c | 3 +-
lib/Kconfig | 9 ++
lib/Makefile | 1 +
lib/kobject_uevent.c | 9 +-
lib/test_seqnum_ops.c | 133 ++++++++++++++++++
tools/testing/selftests/lib/Makefile | 1 +
tools/testing/selftests/lib/config | 1 +
.../testing/selftests/lib/test_seqnum_ops.sh | 10 ++
21 files changed, 423 insertions(+), 33 deletions(-)
create mode 100644 Documentation/core-api/seqnum_ops.rst
create mode 100644 include/linux/seqnum_ops.h
create mode 100644 lib/test_seqnum_ops.c
create mode 100755 tools/testing/selftests/lib/test_seqnum_ops.sh
--
2.27.0
When using `kunit.py run` to run tests, users must populate a
`kunitconfig` file to select the options the tests are hidden behind and
all their dependencies.
The patch [1] to allow specifying a path to kunitconfig promises to make
this nicer as we can have checked in files corresponding to different
sets of tests.
But it's still annoying
1) when trying to run a subet of tests
2) when you want to run tests that don't have such a pre-existing
kunitconfig and selecting all the necessary options is tricky.
This patch series aims to alleviate both:
1) `kunit.py run 'my-suite-*'`
I.e. use my current kunitconfig, but just run suites that match this glob
2) `kunit.py run --alltests 'my-suite-*'`
I.e. use allyesconfig so I don't have to worry about writing a
kunitconfig at all.
See the first commit message for more details and discussion about
future work.
This patch series also includes a bugfix for a latent bug that can't be
triggered right now but has worse consequences as a result of the
changes needed to plumb in this suite name glob.
[1] https://lore.kernel.org/linux-kselftest/20210201205514.3943096-1-dlatypov@g…
---
v1 -> v2:
Fix free of `suites` subarray in suite_set.
Found by Dan Carpenter and kernel test robot.
v2 -> v3:
Add MODULE_PARM_DESC() for kunit.filter_glob.
Daniel Latypov (3):
kunit: add kunit.filter_glob cmdline option to filter suites
kunit: tool: add support for filtering suites by glob
kunit: tool: fix unintentional statefulness in run_kernel()
lib/kunit/Kconfig | 1 +
lib/kunit/executor.c | 93 ++++++++++++++++++++++++++---
tools/testing/kunit/kunit.py | 21 +++++--
tools/testing/kunit/kunit_kernel.py | 6 +-
4 files changed, 106 insertions(+), 15 deletions(-)
base-commit: 88bb507a74ea7d75fa49edd421eaa710a7d80598
--
2.30.0.478.g8a0d178c01-goog
Currently running tests via KUnit tool means tweaking a .kunitconfig
file, which you'd keep around locally and never commit.
This changes makes it so users can pass in a path to a kunitconfig.
One of the imagined use cases is having kunitconfig fragments in-tree
to formalize interesting sets of tests for features/subsystems, e.g.
$ ./tools/testing/kunit/kunit.py run --kunticonfig=fs/ext4/kunitconfig
For now, this hypothetical fs/ext4/kunitconfig would contain
CONFIG_KUNIT=y
CONFIG_EXT4_FS=y
CONFIG_EXT4_KUNIT_TESTS=y
At the moment, it's not hard to manually whip up this file, but as more
and more tests get added, this will get tedious.
It also opens the door to documenting how to run all the tests relevant
to a specific subsystem or feature as a simple one-liner.
This can be seen as an analogue to tools/testing/selftests/*/config
But in the case of KUnit, the tests live in the same directory as the
code-under-test, so it feels more natural to allow the kunitconfig
fragments to live anywhere. (Though, people could create a separate
directory if wanted; this patch imposes no restrictions on the path).
Signed-off-by: Daniel Latypov <dlatypov(a)google.com>
---
Changes since v1: change from a positional arg to a flag --kunitconfig.
Ensure that it gets added for `kunit.py config` and all other commands.
---
tools/testing/kunit/kunit.py | 9 +++++---
tools/testing/kunit/kunit_kernel.py | 12 ++++++----
tools/testing/kunit/kunit_tool_test.py | 32 ++++++++++++++++++++++++++
3 files changed, 46 insertions(+), 7 deletions(-)
diff --git a/tools/testing/kunit/kunit.py b/tools/testing/kunit/kunit.py
index e808a47c839b..02871a363f76 100755
--- a/tools/testing/kunit/kunit.py
+++ b/tools/testing/kunit/kunit.py
@@ -182,6 +182,9 @@ def add_common_opts(parser) -> None:
parser.add_argument('--alltests',
help='Run all KUnit tests through allyesconfig',
action='store_true')
+ parser.add_argument('--kunitconfig',
+ help='Path to Kconfig fragment that enables KUnit tests',
+ metavar='kunitconfig')
def add_build_opts(parser) -> None:
parser.add_argument('--jobs',
@@ -256,7 +259,7 @@ def main(argv, linux=None):
os.mkdir(cli_args.build_dir)
if not linux:
- linux = kunit_kernel.LinuxSourceTree(cli_args.build_dir)
+ linux = kunit_kernel.LinuxSourceTree(cli_args.build_dir, kunitconfig_path=cli_args.kunitconfig)
request = KunitRequest(cli_args.raw_output,
cli_args.timeout,
@@ -274,7 +277,7 @@ def main(argv, linux=None):
os.mkdir(cli_args.build_dir)
if not linux:
- linux = kunit_kernel.LinuxSourceTree(cli_args.build_dir)
+ linux = kunit_kernel.LinuxSourceTree(cli_args.build_dir, kunitconfig_path=cli_args.kunitconfig)
request = KunitConfigRequest(cli_args.build_dir,
cli_args.make_options)
@@ -286,7 +289,7 @@ def main(argv, linux=None):
sys.exit(1)
elif cli_args.subcommand == 'build':
if not linux:
- linux = kunit_kernel.LinuxSourceTree(cli_args.build_dir)
+ linux = kunit_kernel.LinuxSourceTree(cli_args.build_dir, kunitconfig_path=cli_args.kunitconfig)
request = KunitBuildRequest(cli_args.jobs,
cli_args.build_dir,
diff --git a/tools/testing/kunit/kunit_kernel.py b/tools/testing/kunit/kunit_kernel.py
index 2076a5a2d060..0b461663e7d9 100644
--- a/tools/testing/kunit/kunit_kernel.py
+++ b/tools/testing/kunit/kunit_kernel.py
@@ -123,7 +123,7 @@ def get_outfile_path(build_dir) -> str:
class LinuxSourceTree(object):
"""Represents a Linux kernel source tree with KUnit tests."""
- def __init__(self, build_dir: str, load_config=True, defconfig=DEFAULT_KUNITCONFIG_PATH) -> None:
+ def __init__(self, build_dir: str, load_config=True, kunitconfig_path='') -> None:
signal.signal(signal.SIGINT, self.signal_handler)
self._ops = LinuxSourceTreeOperations()
@@ -131,9 +131,13 @@ class LinuxSourceTree(object):
if not load_config:
return
- kunitconfig_path = get_kunitconfig_path(build_dir)
- if not os.path.exists(kunitconfig_path):
- shutil.copyfile(defconfig, kunitconfig_path)
+ if kunitconfig_path:
+ if not os.path.exists(kunitconfig_path):
+ raise ConfigError(f'Specified kunitconfig ({kunitconfig_path}) does not exist')
+ else:
+ kunitconfig_path = get_kunitconfig_path(build_dir)
+ if not os.path.exists(kunitconfig_path):
+ shutil.copyfile(DEFAULT_KUNITCONFIG_PATH, kunitconfig_path)
self._kconfig = kunit_config.Kconfig()
self._kconfig.read_from_file(kunitconfig_path)
diff --git a/tools/testing/kunit/kunit_tool_test.py b/tools/testing/kunit/kunit_tool_test.py
index b593f4448e83..22f50b931138 100755
--- a/tools/testing/kunit/kunit_tool_test.py
+++ b/tools/testing/kunit/kunit_tool_test.py
@@ -12,6 +12,7 @@ from unittest import mock
import tempfile, shutil # Handling test_tmpdir
import json
+import signal
import os
import kunit_config
@@ -250,6 +251,23 @@ class KUnitParserTest(unittest.TestCase):
result.status)
self.assertEqual('kunit-resource-test', result.suites[0].name)
+class LinuxSourceTreeTest(unittest.TestCase):
+
+ def setUp(self):
+ mock.patch.object(signal, 'signal').start()
+ self.addCleanup(mock.patch.stopall)
+
+ def test_invalid_kunitconfig(self):
+ with self.assertRaisesRegex(kunit_kernel.ConfigError, 'nonexistent.* does not exist'):
+ kunit_kernel.LinuxSourceTree('', kunitconfig_path='/nonexistent_file')
+
+ def test_valid_kunitconfig(self):
+ with tempfile.NamedTemporaryFile('wt') as kunitconfig:
+ tree = kunit_kernel.LinuxSourceTree('', kunitconfig_path=kunitconfig.name)
+
+ # TODO: add more test cases.
+
+
class KUnitJsonTest(unittest.TestCase):
def _json_for(self, log_file):
@@ -399,5 +417,19 @@ class KUnitMainTest(unittest.TestCase):
self.linux_source_mock.run_kernel.assert_called_once_with(build_dir=build_dir, timeout=300)
self.print_mock.assert_any_call(StrContains('Testing complete.'))
+ @mock.patch.object(kunit_kernel, 'LinuxSourceTree')
+ def test_run_kunitconfig(self, mock_linux_init):
+ mock_linux_init.return_value = self.linux_source_mock
+ kunit.main(['run', '--kunitconfig=mykunitconfig'])
+ # Just verify that we parsed and initialized it correctly here.
+ mock_linux_init.assert_called_once_with('.kunit', kunitconfig_path='mykunitconfig')
+
+ @mock.patch.object(kunit_kernel, 'LinuxSourceTree')
+ def test_config_kunitconfig(self, mock_linux_init):
+ mock_linux_init.return_value = self.linux_source_mock
+ kunit.main(['config', '--kunitconfig=mykunitconfig'])
+ # Just verify that we parsed and initialized it correctly here.
+ mock_linux_init.assert_called_once_with('.kunit', kunitconfig_path='mykunitconfig')
+
if __name__ == '__main__':
unittest.main()
base-commit: 88bb507a74ea7d75fa49edd421eaa710a7d80598
--
2.30.0.365.g02bc693789-goog
./usage.rst contains fairly long examples and explanations of things
like how to fake a class and how to use parameterized tests (and how you
could do table-driven tests yourself).
It's not exactly necessary information, so we add a new page with more
digestible tips like "use kunit_kzalloc() instead of kzalloc() so you
don't have to worry about calling kfree() yourself" and the like.
Change start.rst to point users to this new page first and let them know
that usage.rst is more of optional further reading.
Signed-off-by: Daniel Latypov <dlatypov(a)google.com>
---
Documentation/dev-tools/kunit/index.rst | 2 +
Documentation/dev-tools/kunit/start.rst | 4 +-
Documentation/dev-tools/kunit/tips.rst | 115 ++++++++++++++++++++++++
3 files changed, 120 insertions(+), 1 deletion(-)
create mode 100644 Documentation/dev-tools/kunit/tips.rst
diff --git a/Documentation/dev-tools/kunit/index.rst b/Documentation/dev-tools/kunit/index.rst
index c234a3ab3c34..848478838347 100644
--- a/Documentation/dev-tools/kunit/index.rst
+++ b/Documentation/dev-tools/kunit/index.rst
@@ -13,6 +13,7 @@ KUnit - Unit Testing for the Linux Kernel
api/index
style
faq
+ tips
What is KUnit?
==============
@@ -88,6 +89,7 @@ How do I use it?
================
* :doc:`start` - for new users of KUnit
+* :doc:`tips` - for short examples of best practices
* :doc:`usage` - for a more detailed explanation of KUnit features
* :doc:`api/index` - for the list of KUnit APIs used for testing
* :doc:`kunit-tool` - for more information on the kunit_tool helper script
diff --git a/Documentation/dev-tools/kunit/start.rst b/Documentation/dev-tools/kunit/start.rst
index 454f307813ea..c09e2747c958 100644
--- a/Documentation/dev-tools/kunit/start.rst
+++ b/Documentation/dev-tools/kunit/start.rst
@@ -233,5 +233,7 @@ Congrats! You just wrote your first KUnit test!
Next Steps
==========
-* Check out the :doc:`usage` page for a more
+* Check out the :doc:`tips` page for tips on
+ writing idiomatic KUnit tests.
+* Optional: see the :doc:`usage` page for a more
in-depth explanation of KUnit.
diff --git a/Documentation/dev-tools/kunit/tips.rst b/Documentation/dev-tools/kunit/tips.rst
new file mode 100644
index 000000000000..a6ca0af14098
--- /dev/null
+++ b/Documentation/dev-tools/kunit/tips.rst
@@ -0,0 +1,115 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+============================
+Tips For Writing KUnit Tests
+============================
+
+Exiting early on failed expectations
+------------------------------------
+
+``KUNIT_EXPECT_EQ`` and friends will mark the test as failed and continue
+execution. In some cases, it's unsafe to continue and you can use the
+``KUNIT_ASSERT`` variant to exit on failure.
+
+.. code-block:: c
+
+ void example_test_user_alloc_function(struct kunit *test)
+ {
+ void *object = alloc_some_object_for_me();
+
+ /* Make sure we got a valid pointer back. */
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, object);
+ do_something_with_object(object);
+ }
+
+Allocating memory
+-----------------
+
+Where you would use ``kzalloc``, you should prefer ``kunit_kzalloc`` instead.
+KUnit will ensure the memory is freed once the test completes.
+
+This is particularly useful since it lets you use the ``KUNIT_ASSERT_EQ``
+macros to exit early from a test without having to worry about remembering to
+call ``kfree``.
+
+Example:
+
+.. code-block:: c
+
+ void example_test_allocation(struct kunit *test)
+ {
+ char *buffer = kunit_kzalloc(test, 16, GFP_KERNEL);
+ /* Ensure allocation succeeded. */
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, buffer);
+
+ KUNIT_ASSERT_STREQ(test, buffer, "");
+ }
+
+
+Testing static functions
+------------------------
+
+If you don't want to expose functions or variables just for testing, one option
+is to conditionally ``#include`` the test file at the end of your .c file, e.g.
+
+.. code-block:: c
+
+ /* In my_file.c */
+
+ static int do_interesting_thing();
+
+ #ifdef CONFIG_MY_KUNIT_TEST
+ #include "my_kunit_test.c"
+ #endif
+
+Injecting test-only code
+------------------------
+
+Similarly to the above, it can be useful to add test-specific logic.
+
+.. code-block:: c
+
+ /* In my_file.h */
+
+ #ifdef CONFIG_MY_KUNIT_TEST
+ /* Defined in my_kunit_test.c */
+ void test_only_hook(void);
+ #else
+ void test_only_hook(void) { }
+ #endif
+
+TODO(dlatypov(a)google.com): add an example of using ``current->kunit_test`` in
+such a hook when it's not only updated for ``CONFIG_KASAN=y``.
+
+Customizing error messages
+--------------------------
+
+Each of the ``KUNIT_EXPECT`` and ``KUNIT_ASSERT`` macros have a ``_MSG`` variant.
+These take a format string and arguments to provide additional context to the automatically generated error messages.
+
+.. code-block:: c
+
+ char some_str[41];
+ generate_sha1_hex_string(some_str);
+
+ /* Before. Not easy to tell why the test failed. */
+ KUNIT_EXPECT_EQ(test, strlen(some_str), 40);
+
+ /* After. Now we see the offending string. */
+ KUNIT_EXPECT_EQ_MSG(test, strlen(some_str), 40, "some_str='%s'", some_str);
+
+Alternatively, one can take full control over the error message by using ``KUNIT_FAIL()``, e.g.
+
+.. code-block:: c
+
+ /* Before */
+ KUNIT_EXPECT_EQ(test, some_setup_function(), 0);
+
+ /* After: full control over the failure message. */
+ if (some_setup_function())
+ KUNIT_FAIL(test, "Failed to setup thing for testing");
+
+Next Steps
+==========
+* Optional: see the :doc:`usage` page for a more
+ in-depth explanation of KUnit.
base-commit: 6ee1d745b7c9fd573fba142a2efdad76a9f1cb04
--
2.30.0.280.ga3ce27912f-goog
The primary change is that we want to encourage people to respect
KUNIT_ALL_TESTS to make it easy to run all the relevant tests for a
given config.
Signed-off-by: Daniel Latypov <dlatypov(a)google.com>
---
Documentation/dev-tools/kunit/start.rst | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/Documentation/dev-tools/kunit/start.rst b/Documentation/dev-tools/kunit/start.rst
index 454f307813ea..560f27af4619 100644
--- a/Documentation/dev-tools/kunit/start.rst
+++ b/Documentation/dev-tools/kunit/start.rst
@@ -196,8 +196,9 @@ Now add the following to ``drivers/misc/Kconfig``:
.. code-block:: kconfig
config MISC_EXAMPLE_TEST
- bool "Test for my example"
+ tristate "Test for my example" if !KUNIT_ALL_TESTS
depends on MISC_EXAMPLE && KUNIT=y
+ default KUNIT_ALL_TESTS
and the following to ``drivers/misc/Makefile``:
base-commit: 146620506274bd24d52fb1c589110a30eed8240b
--
2.30.0.296.g2bfb1c46d8-goog
When a number of tests fail, it can be useful to get higher-level
statistics of how many tests are failing (or how many parameters are
failing in parameterised tests), and in what cases or suites. This is
already done by some non-KUnit tests, so add support for automatically
generating these for KUnit tests.
This change adds a 'kunit_stats_enabled' switch which has three values:
- 0: No stats are printed (current behaviour)
- 1: Stats are printed only for tests/suites with more than one
subtests, and at least one failure (new default)
- 2: Always print test statistics
For parameterised tests, the summary line looks as follows:
" # inode_test_xtimestamp_decoding: 0 / 16 test parameters failed"
For test suites, it looks like this:
"# ext4_inode_test: (0 / 1) tests failed (0 / 16 test parameters)"
kunit_tool is also updated to correctly ignore diagnostic lines, so that
these statistics do not prevent the result from parsing.
Signed-off-by: David Gow <davidgow(a)google.com>
---
This is largely a follow-up to the discussion here:
https://lore.kernel.org/linux-kselftest/CABVgOSmy4n_LGwDS7yWfoLftcQzxv6S+iX…
Does this seem like a sensible addition?
Cheers,
-- David
lib/kunit/test.c | 71 +++++++++++++++++++++++++++++
tools/testing/kunit/kunit_parser.py | 2 +-
2 files changed, 72 insertions(+), 1 deletion(-)
diff --git a/lib/kunit/test.c b/lib/kunit/test.c
index ec9494e914ef..711e269366a7 100644
--- a/lib/kunit/test.c
+++ b/lib/kunit/test.c
@@ -9,6 +9,7 @@
#include <kunit/test.h>
#include <linux/kernel.h>
#include <linux/kref.h>
+#include <linux/moduleparam.h>
#include <linux/sched/debug.h>
#include <linux/sched.h>
@@ -16,6 +17,40 @@
#include "string-stream.h"
#include "try-catch-impl.h"
+/*
+ * KUnit statistic mode:
+ * 0 - disabled
+ * 1 - only when there is at least one failure, and more than one subtest
+ * 2 - enabled
+ */
+static int kunit_stats_enabled = 1;
+core_param(kunit_stats_enabled, kunit_stats_enabled, int, 0644);
+
+static bool kunit_should_print_stats(int num_failures, int num_subtests)
+{
+ if (kunit_stats_enabled == 0)
+ return false;
+
+ if (kunit_stats_enabled == 2)
+ return true;
+
+ return (num_failures > 0 && num_subtests > 1);
+}
+
+static void kunit_print_test_stats(struct kunit *test,
+ size_t num_failures, size_t num_subtests)
+{
+ if (!kunit_should_print_stats(num_failures, num_subtests))
+ return;
+
+ kunit_log(KERN_INFO, test,
+ KUNIT_SUBTEST_INDENT
+ "# %s: %lu / %lu test parameters failed",
+ test->name,
+ num_failures,
+ num_subtests);
+}
+
/*
* Append formatted message to log, size of which is limited to
* KUNIT_LOG_SIZE bytes (including null terminating byte).
@@ -346,15 +381,37 @@ static void kunit_run_case_catch_errors(struct kunit_suite *suite,
test_case->success = test->success;
}
+static void kunit_print_suite_stats(struct kunit_suite *suite,
+ size_t num_failures,
+ size_t total_param_failures,
+ size_t total_params)
+{
+ size_t num_cases = kunit_suite_num_test_cases(suite);
+
+ if (!kunit_should_print_stats(num_failures, num_cases))
+ return;
+
+ kunit_log(KERN_INFO, suite,
+ "# %s: (%lu / %lu) tests failed (%lu / %lu test parameters)",
+ suite->name,
+ num_failures,
+ num_cases,
+ total_param_failures,
+ total_params);
+}
+
int kunit_run_tests(struct kunit_suite *suite)
{
char param_desc[KUNIT_PARAM_DESC_SIZE];
struct kunit_case *test_case;
+ size_t num_suite_failures = 0;
+ size_t total_param_failures = 0, total_params = 0;
kunit_print_subtest_start(suite);
kunit_suite_for_each_test_case(suite, test_case) {
struct kunit test = { .param_value = NULL, .param_index = 0 };
+ size_t num_params = 0, num_failures = 0;
bool test_success = true;
if (test_case->generate_params) {
@@ -385,13 +442,27 @@ int kunit_run_tests(struct kunit_suite *suite)
test.param_value = test_case->generate_params(test.param_value, param_desc);
test.param_index++;
}
+
+ if (!test.success)
+ num_failures++;
+ num_params++;
+
} while (test.param_value);
+ kunit_print_test_stats(&test, num_failures, num_params);
+
kunit_print_ok_not_ok(&test, true, test_success,
kunit_test_case_num(suite, test_case),
test_case->name);
+
+ if (!test_success)
+ num_suite_failures++;
+ total_params += num_params;
+ total_param_failures += num_failures;
}
+ kunit_print_suite_stats(suite, num_suite_failures,
+ total_param_failures, total_params);
kunit_print_subtest_end(suite);
return 0;
diff --git a/tools/testing/kunit/kunit_parser.py b/tools/testing/kunit/kunit_parser.py
index 6614ec4d0898..88ee2b2668ad 100644
--- a/tools/testing/kunit/kunit_parser.py
+++ b/tools/testing/kunit/kunit_parser.py
@@ -95,7 +95,7 @@ def print_log(log):
for m in log:
print_with_timestamp(m)
-TAP_ENTRIES = re.compile(r'^(TAP|[\s]*ok|[\s]*not ok|[\s]*[0-9]+\.\.[0-9]+|[\s]*#).*$')
+TAP_ENTRIES = re.compile(r'^(TAP|[\s]*ok|[\s]*not ok|[\s]*[0-9]+\.\.[0-9]+|[\s]*# Subtest:).*$')
def consume_non_diagnositic(lines: List[str]) -> None:
while lines and not TAP_ENTRIES.match(lines[0]):
base-commit: 5f6b99d0287de2c2d0b5e7abcb0092d553ad804a
--
2.29.2.576.ga3fc446d84-goog
Don't use an O(nm) algorithm* and make it more readable by using a dict.
*Most obviously, it does a nested for-loop over the entire other config.
A bit more subtle, it calls .entries(), which constructs a set from the
list for _every_ outer iteration.
Signed-off-by: Daniel Latypov <dlatypov(a)google.com>
---
tools/testing/kunit/kunit_config.py | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/tools/testing/kunit/kunit_config.py b/tools/testing/kunit/kunit_config.py
index 02ffc3a3e5dc..f1101075d458 100644
--- a/tools/testing/kunit/kunit_config.py
+++ b/tools/testing/kunit/kunit_config.py
@@ -40,15 +40,14 @@ class Kconfig(object):
self._entries.append(entry)
def is_subset_of(self, other: 'Kconfig') -> bool:
+ other_dict = {e.name: e.value for e in other.entries()}
for a in self.entries():
- found = False
- for b in other.entries():
- if a.name != b.name:
+ b = other_dict.get(a.name)
+ if b is None:
+ if a.value == 'n':
continue
- if a.value != b.value:
- return False
- found = True
- if a.value != 'n' and found == False:
+ return False
+ elif a.value != b:
return False
return True
base-commit: c6f7e1510b872c281ff603a3108c084b6548d35c
--
2.29.2.576.ga3fc446d84-goog
Currently, given something (fairly dystopian) like
> KUNIT_EXPECT_EQ(test, 2 + 2, 5)
KUnit will prints a failure message like this.
> Expected 2 + 2 == 5, but
> 2 + 2 == 4
> 5 == 5
With this patch, the output just becomes
> Expected 2 + 2 == 5, but
> 2 + 2 == 4
This patch is slightly hacky, but it's quite common* to compare an
expression to a literal integer value, so this can make KUnit less
chatty in many cases. (This patch also fixes variants like
KUNIT_EXPECT_GT, LE, et al.).
It also allocates an additional string briefly, but given this only
happens on test failures, it doesn't seem too bad a tradeoff.
Also, in most cases it'll realize the lengths are unequal and bail out
before the allocation.
We could save the result of the formatted string to avoid wasting this
extra work, but it felt cleaner to leave it as-is.
Edge case: for something silly and unrealistic like
> KUNIT_EXPECT_EQ(test, 4, 5);
It'll generate this message with a trailing "but"
> Expected 4 == 5, but
> <next line of normal output>
It didn't feel worth adding a check up-front to see if both sides are
literals to handle this better.
*A quick grep suggests 100+ comparisons to an integer literal as the
right hand side.
Signed-off-by: Daniel Latypov <dlatypov(a)google.com>
Tested-by: David Gow <davidgow(a)google.com>
Reviewed-by: Brendan Higgins <brendanhiggins(a)google.com>
---
lib/kunit/assert.c | 39 +++++++++++++++++++++++++++++++++------
1 file changed, 33 insertions(+), 6 deletions(-)
diff --git a/lib/kunit/assert.c b/lib/kunit/assert.c
index 33acdaa28a7d..e0ec7d6fed6f 100644
--- a/lib/kunit/assert.c
+++ b/lib/kunit/assert.c
@@ -85,6 +85,29 @@ void kunit_ptr_not_err_assert_format(const struct kunit_assert *assert,
}
EXPORT_SYMBOL_GPL(kunit_ptr_not_err_assert_format);
+/* Checks if `text` is a literal representing `value`, e.g. "5" and 5 */
+static bool is_literal(struct kunit *test, const char *text, long long value,
+ gfp_t gfp)
+{
+ char *buffer;
+ int len;
+ bool ret;
+
+ len = snprintf(NULL, 0, "%lld", value);
+ if (strlen(text) != len)
+ return false;
+
+ buffer = kunit_kmalloc(test, len+1, gfp);
+ if (!buffer)
+ return false;
+
+ snprintf(buffer, len+1, "%lld", value);
+ ret = strncmp(buffer, text, len) == 0;
+
+ kunit_kfree(test, buffer);
+ return ret;
+}
+
void kunit_binary_assert_format(const struct kunit_assert *assert,
struct string_stream *stream)
{
@@ -97,12 +120,16 @@ void kunit_binary_assert_format(const struct kunit_assert *assert,
binary_assert->left_text,
binary_assert->operation,
binary_assert->right_text);
- string_stream_add(stream, KUNIT_SUBSUBTEST_INDENT "%s == %lld\n",
- binary_assert->left_text,
- binary_assert->left_value);
- string_stream_add(stream, KUNIT_SUBSUBTEST_INDENT "%s == %lld",
- binary_assert->right_text,
- binary_assert->right_value);
+ if (!is_literal(stream->test, binary_assert->left_text,
+ binary_assert->left_value, stream->gfp))
+ string_stream_add(stream, KUNIT_SUBSUBTEST_INDENT "%s == %lld\n",
+ binary_assert->left_text,
+ binary_assert->left_value);
+ if (!is_literal(stream->test, binary_assert->right_text,
+ binary_assert->right_value, stream->gfp))
+ string_stream_add(stream, KUNIT_SUBSUBTEST_INDENT "%s == %lld",
+ binary_assert->right_text,
+ binary_assert->right_value);
kunit_assert_print_msg(assert, stream);
}
EXPORT_SYMBOL_GPL(kunit_binary_assert_format);
base-commit: 1e0d27fce010b0a4a9e595506b6ede75934c31be
--
2.30.0.478.g8a0d178c01-goog
Commit c2aa8afc36fa has renamed run_vmtests in Makefile,
but the file still uses the old name.
The kernel test robot reported the following issue:
# selftests: vm: run_vmtests.sh
# Warning: file run_vmtests.sh is missing!
not ok 1 selftests: vm: run_vmtests.sh
Reported-by: kernel test robot <lkp(a)intel.com>
Fixes: c2aa8afc36fa (selftests/vm: rename run_vmtests --> run_vmtests.sh)
Signed-off-by: Rong Chen <rong.a.chen(a)intel.com>
---
tools/testing/selftests/vm/{run_vmtests => run_vmtests.sh} | 0
1 file changed, 0 insertions(+), 0 deletions(-)
rename tools/testing/selftests/vm/{run_vmtests => run_vmtests.sh} (100%)
diff --git a/tools/testing/selftests/vm/run_vmtests b/tools/testing/selftests/vm/run_vmtests.sh
similarity index 100%
rename from tools/testing/selftests/vm/run_vmtests
rename to tools/testing/selftests/vm/run_vmtests.sh
--
2.20.1
When using `kunit.py run` to run tests, users must populate a
`kunitconfig` file to select the options the tests are hidden behind and
all their dependencies.
The patch [1] to allow specifying a path to kunitconfig promises to make
this nicer as we can have checked in files corresponding to different
sets of tests.
But it's still annoying
1) when trying to run a subet of tests
2) when you want to run tests that don't have such a pre-existing
kunitconfig and selecting all the necessary options is tricky.
This patch series aims to alleviate both:
1) `kunit.py run 'my-suite-*'`
I.e. use my current kunitconfig, but just run suites that match this glob
2) `kunit.py run --alltests 'my-suite-*'`
I.e. use allyesconfig so I don't have to worry about writing a
kunitconfig at all.
See the first commit message for more details and discussion about
future work.
This patch series also includes a bugfix for a latent bug that can't be
triggered right now but has worse consequences as a result of the
changes needed to plumb in this suite name glob.
[1] https://lore.kernel.org/linux-kselftest/20210201205514.3943096-1-dlatypov@g…
---
v1 -> v2:
Fix free of `suites` subarray in suite_set.
Found by Dan Carpenter and kernel test robot.
Daniel Latypov (3):
kunit: add kunit.filter_glob cmdline option to filter suites
kunit: tool: add support for filtering suites by glob
kunit: tool: fix unintentional statefulness in run_kernel()
lib/kunit/Kconfig | 1 +
lib/kunit/executor.c | 91 ++++++++++++++++++++++++++---
tools/testing/kunit/kunit.py | 21 +++++--
tools/testing/kunit/kunit_kernel.py | 6 +-
4 files changed, 104 insertions(+), 15 deletions(-)
base-commit: 88bb507a74ea7d75fa49edd421eaa710a7d80598
--
2.30.0.365.g02bc693789-goog
From: Mike Rapoport <rppt(a)linux.ibm.com>
Hi,
@Andrew, this is based on v5.11-rc4-mmots-2021-01-19-13-54 with secretmem
patches dropped from there, I can rebase whatever way you prefer.
This is an implementation of "secret" mappings backed by a file descriptor.
The file descriptor backing secret memory mappings is created using a
dedicated memfd_secret system call The desired protection mode for the
memory is configured using flags parameter of the system call. The mmap()
of the file descriptor created with memfd_secret() will create a "secret"
memory mapping. The pages in that mapping will be marked as not present in
the direct map and will be present only in the page table of the owning mm.
Although normally Linux userspace mappings are protected from other users,
such secret mappings are useful for environments where a hostile tenant is
trying to trick the kernel into giving them access to other tenants
mappings.
Additionally, in the future the secret mappings may be used as a mean to
protect guest memory in a virtual machine host.
For demonstration of secret memory usage we've created a userspace library
https://git.kernel.org/pub/scm/linux/kernel/git/jejb/secret-memory-preloade…
that does two things: the first is act as a preloader for openssl to
redirect all the OPENSSL_malloc calls to secret memory meaning any secret
keys get automatically protected this way and the other thing it does is
expose the API to the user who needs it. We anticipate that a lot of the
use cases would be like the openssl one: many toolkits that deal with
secret keys already have special handling for the memory to try to give
them greater protection, so this would simply be pluggable into the
toolkits without any need for user application modification.
Hiding secret memory mappings behind an anonymous file allows (ab)use of
the page cache for tracking pages allocated for the "secret" mappings as
well as using address_space_operations for e.g. page migration callbacks.
The anonymous file may be also used implicitly, like hugetlb files, to
implement mmap(MAP_SECRET) and use the secret memory areas with "native" mm
ABIs in the future.
To limit fragmentation of the direct map to splitting only PUD-size pages,
I've added an amortizing cache of PMD-size pages to each file descriptor
that is used as an allocation pool for the secret memory areas.
As the memory allocated by secretmem becomes unmovable, we use CMA to back
large page caches so that page allocator won't be surprised by failing attempt
to migrate these pages.
v16:
* Fix memory leak intorduced in v15
* Clean the data left from previous page user before handing the page to
the userspace
v15: https://lore.kernel.org/lkml/20210120180612.1058-1-rppt@kernel.org
* Add riscv/Kconfig update to disable set_memory operations for nommu
builds (patch 3)
* Update the code around add_to_page_cache() per Matthew's comments
(patches 6,7)
* Add fixups for build/checkpatch errors discovered by CI systems
v14: https://lore.kernel.org/lkml/20201203062949.5484-1-rppt@kernel.org
* Finally s/mod_node_page_state/mod_lruvec_page_state/
v13: https://lore.kernel.org/lkml/20201201074559.27742-1-rppt@kernel.org
* Added Reviewed-by, thanks Catalin and David
* s/mod_node_page_state/mod_lruvec_page_state/ as Shakeel suggested
v12: https://lore.kernel.org/lkml/20201125092208.12544-1-rppt@kernel.org
* Add detection of whether set_direct_map has actual effect on arm64 and bail
out of CMA allocation for secretmem and the memfd_secret() syscall if pages
would not be removed from the direct map
Older history:
v11: https://lore.kernel.org/lkml/20201124092556.12009-1-rppt@kernel.org
v10: https://lore.kernel.org/lkml/20201123095432.5860-1-rppt@kernel.org
v9: https://lore.kernel.org/lkml/20201117162932.13649-1-rppt@kernel.org
v8: https://lore.kernel.org/lkml/20201110151444.20662-1-rppt@kernel.org
v7: https://lore.kernel.org/lkml/20201026083752.13267-1-rppt@kernel.org
v6: https://lore.kernel.org/lkml/20200924132904.1391-1-rppt@kernel.org
v5: https://lore.kernel.org/lkml/20200916073539.3552-1-rppt@kernel.org
v4: https://lore.kernel.org/lkml/20200818141554.13945-1-rppt@kernel.org
v3: https://lore.kernel.org/lkml/20200804095035.18778-1-rppt@kernel.org
v2: https://lore.kernel.org/lkml/20200727162935.31714-1-rppt@kernel.org
v1: https://lore.kernel.org/lkml/20200720092435.17469-1-rppt@kernel.org
Mike Rapoport (11):
mm: add definition of PMD_PAGE_ORDER
mmap: make mlock_future_check() global
riscv/Kconfig: make direct map manipulation options depend on MMU
set_memory: allow set_direct_map_*_noflush() for multiple pages
set_memory: allow querying whether set_direct_map_*() is actually enabled
mm: introduce memfd_secret system call to create "secret" memory areas
secretmem: use PMD-size pages to amortize direct map fragmentation
secretmem: add memcg accounting
PM: hibernate: disable when there are active secretmem users
arch, mm: wire up memfd_secret system call where relevant
secretmem: test: add basic selftest for memfd_secret(2)
arch/arm64/include/asm/Kbuild | 1 -
arch/arm64/include/asm/cacheflush.h | 6 -
arch/arm64/include/asm/set_memory.h | 17 +
arch/arm64/include/uapi/asm/unistd.h | 1 +
arch/arm64/kernel/machine_kexec.c | 1 +
arch/arm64/mm/mmu.c | 6 +-
arch/arm64/mm/pageattr.c | 23 +-
arch/riscv/Kconfig | 4 +-
arch/riscv/include/asm/set_memory.h | 4 +-
arch/riscv/include/asm/unistd.h | 1 +
arch/riscv/mm/pageattr.c | 8 +-
arch/x86/entry/syscalls/syscall_32.tbl | 1 +
arch/x86/entry/syscalls/syscall_64.tbl | 1 +
arch/x86/include/asm/set_memory.h | 4 +-
arch/x86/mm/pat/set_memory.c | 8 +-
fs/dax.c | 11 +-
include/linux/pgtable.h | 3 +
include/linux/secretmem.h | 30 ++
include/linux/set_memory.h | 16 +-
include/linux/syscalls.h | 1 +
include/uapi/asm-generic/unistd.h | 6 +-
include/uapi/linux/magic.h | 1 +
kernel/power/hibernate.c | 5 +-
kernel/power/snapshot.c | 4 +-
kernel/sys_ni.c | 2 +
mm/Kconfig | 5 +
mm/Makefile | 1 +
mm/filemap.c | 3 +-
mm/gup.c | 10 +
mm/internal.h | 3 +
mm/mmap.c | 5 +-
mm/secretmem.c | 451 ++++++++++++++++++++++
mm/vmalloc.c | 5 +-
scripts/checksyscalls.sh | 4 +
tools/testing/selftests/vm/.gitignore | 1 +
tools/testing/selftests/vm/Makefile | 3 +-
tools/testing/selftests/vm/memfd_secret.c | 296 ++++++++++++++
tools/testing/selftests/vm/run_vmtests | 17 +
38 files changed, 917 insertions(+), 52 deletions(-)
create mode 100644 arch/arm64/include/asm/set_memory.h
create mode 100644 include/linux/secretmem.h
create mode 100644 mm/secretmem.c
create mode 100644 tools/testing/selftests/vm/memfd_secret.c
--
2.28.0
According to the error message, the first argument of ptrace() should be
PTRACE_SINGLESTEP instead of PTRACE_CONT when ptrace single step.
Fixes: f43365ee17f8 ("selftests: arm64: add test for unaligned/inexact watchpoint handling")
Signed-off-by: Tiezhu Yang <yangtiezhu(a)loongson.cn>
---
tools/testing/selftests/breakpoints/breakpoint_test_arm64.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/breakpoints/breakpoint_test_arm64.c b/tools/testing/selftests/breakpoints/breakpoint_test_arm64.c
index ad41ea6..2f4d4d6 100644
--- a/tools/testing/selftests/breakpoints/breakpoint_test_arm64.c
+++ b/tools/testing/selftests/breakpoints/breakpoint_test_arm64.c
@@ -143,7 +143,7 @@ static bool run_test(int wr_size, int wp_size, int wr, int wp)
if (!set_watchpoint(pid, wp_size, wp))
return false;
- if (ptrace(PTRACE_CONT, pid, NULL, NULL) < 0) {
+ if (ptrace(PTRACE_SINGLESTEP, pid, NULL, NULL) < 0) {
ksft_print_msg(
"ptrace(PTRACE_SINGLESTEP) failed: %s\n",
strerror(errno));
--
2.1.0
Changelog
---------
v8
- Added reviewed by's from John Hubbard
- Fixed subjects for selftests patches
- Moved zero page check inside is_pinnable_page() as requested by Jason
Gunthorpe.
v7
- Added reviewed-by's
- Fixed a compile bug on non-mmu builds reported by robot
v6
Small update, but I wanted to send it out quicker, as it removes a
controversial patch and replaces it with something sane.
- Removed forcing FOLL_WRITE for longterm gup, instead added a patch to
skip zero pages during migration.
- Added reviewed-by's and minor log changes.
v5
- Added the following patches to the beginning of series, which are fixes
to the other existing problems with CMA migration code:
mm/gup: check every subpage of a compound page during isolation
mm/gup: return an error on migration failure
mm/gup: check for isolation errors also at the beginning of series
mm/gup: do not allow zero page for pinned pages
- remove .gfp_mask/.reclaim_idx changes from mm/vmscan.c
- update movable zone header comment in patch 8 instead of patch 3, fix
the comment
- Added acked, sign-offs
- Updated commit logs based on feedback
- Addressed issues reported by Michal and Jason.
- Remove:
#define PINNABLE_MIGRATE_MAX 10
#define PINNABLE_ISOLATE_MAX 100
Instead: fail on the first migration failure, and retry isolation
forever as their failures are transient.
- In self-set addressed some of the comments from John Hubbard, updated
commit logs, and added comments. Renamed gup->flags with gup->test_flags.
v4
- Address page migration comments. New patch:
mm/gup: limit number of gup migration failures, honor failures
Implements the limiting number of retries for migration failures, and
also check for isolation failures.
Added a test case into gup_test to verify that pages never long-term
pinned in a movable zone, and also added tests to fault both in kernel
and in userland.
v3
- Merged with linux-next, which contains clean-up patch from Jason,
therefore this series is reduced by two patches which did the same
thing.
v2
- Addressed all review comments
- Added Reviewed-by's.
- Renamed PF_MEMALLOC_NOMOVABLE to PF_MEMALLOC_PIN
- Added is_pinnable_page() to check if page can be longterm pinned
- Fixed gup fast path by checking is_in_pinnable_zone()
- rename cma_page_list to movable_page_list
- add a admin-guide note about handling pinned pages in ZONE_MOVABLE,
updated caveat about pinned pages from linux/mmzone.h
- Move current_gfp_context() to fast-path
---------
When page is pinned it cannot be moved and its physical address stays
the same until pages is unpinned.
This is useful functionality to allows userland to implementation DMA
access. For example, it is used by vfio in vfio_pin_pages().
However, this functionality breaks memory hotplug/hotremove assumptions
that pages in ZONE_MOVABLE can always be migrated.
This patch series fixes this issue by forcing new allocations during
page pinning to omit ZONE_MOVABLE, and also to migrate any existing
pages from ZONE_MOVABLE during pinning.
It uses the same scheme logic that is currently used by CMA, and extends
the functionality for all allocations.
For more information read the discussion [1] about this problem.
[1] https://lore.kernel.org/lkml/CA+CK2bBffHBxjmb9jmSKacm0fJMinyt3Nhk8Nx6iudcQS…
Previous versions:
v1
https://lore.kernel.org/lkml/20201202052330.474592-1-pasha.tatashin@soleen.…
v2
https://lore.kernel.org/lkml/20201210004335.64634-1-pasha.tatashin@soleen.c…
v3
https://lore.kernel.org/lkml/20201211202140.396852-1-pasha.tatashin@soleen.…
v4
https://lore.kernel.org/lkml/20201217185243.3288048-1-pasha.tatashin@soleen…
v5
https://lore.kernel.org/lkml/20210119043920.155044-1-pasha.tatashin@soleen.…
v6
https://lore.kernel.org/lkml/20210120014333.222547-1-pasha.tatashin@soleen.…
v7
https://lore.kernel.org/lkml/20210122033748.924330-1-pasha.tatashin@soleen.…
Pavel Tatashin (14):
mm/gup: don't pin migrated cma pages in movable zone
mm/gup: check every subpage of a compound page during isolation
mm/gup: return an error on migration failure
mm/gup: check for isolation errors
mm cma: rename PF_MEMALLOC_NOCMA to PF_MEMALLOC_PIN
mm: apply per-task gfp constraints in fast path
mm: honor PF_MEMALLOC_PIN for all movable pages
mm/gup: do not migrate zero page
mm/gup: migrate pinned pages out of movable zone
memory-hotplug.rst: add a note about ZONE_MOVABLE and page pinning
mm/gup: change index type to long as it counts pages
mm/gup: longterm pin migration cleanup
selftests/vm: gup_test: fix test flag
selftests/vm: gup_test: test faulting in kernel, and verify pinnable
pages
.../admin-guide/mm/memory-hotplug.rst | 9 +
include/linux/migrate.h | 1 +
include/linux/mm.h | 12 ++
include/linux/mmzone.h | 13 +-
include/linux/pgtable.h | 3 +-
include/linux/sched.h | 2 +-
include/linux/sched/mm.h | 27 +--
include/trace/events/migrate.h | 3 +-
mm/gup.c | 174 ++++++++----------
mm/gup_test.c | 29 +--
mm/gup_test.h | 3 +-
mm/hugetlb.c | 4 +-
mm/page_alloc.c | 33 ++--
tools/testing/selftests/vm/gup_test.c | 36 +++-
14 files changed, 190 insertions(+), 159 deletions(-)
--
2.25.1
Changelog
---------
v9
- Renamed gpf_to_alloc_flags() to gfp_to_alloc_flags_cma(); thanks Lecopzer
Chen for noticing.
- Fixed warning reported scripts/checkpatch.pl:
"Logical continuations should be on the previous line"
v8
- Added reviewed by's from John Hubbard
- Fixed subjects for selftests patches
- Moved zero page check inside is_pinnable_page() as requested by Jason
Gunthorpe.
v7
- Added reviewed-by's
- Fixed a compile bug on non-mmu builds reported by robot
v6
Small update, but I wanted to send it out quicker, as it removes a
controversial patch and replaces it with something sane.
- Removed forcing FOLL_WRITE for longterm gup, instead added a patch to
skip zero pages during migration.
- Added reviewed-by's and minor log changes.
v5
- Added the following patches to the beginning of series, which are fixes
to the other existing problems with CMA migration code:
mm/gup: check every subpage of a compound page during isolation
mm/gup: return an error on migration failure
mm/gup: check for isolation errors also at the beginning of series
mm/gup: do not allow zero page for pinned pages
- remove .gfp_mask/.reclaim_idx changes from mm/vmscan.c
- update movable zone header comment in patch 8 instead of patch 3, fix
the comment
- Added acked, sign-offs
- Updated commit logs based on feedback
- Addressed issues reported by Michal and Jason.
- Remove:
#define PINNABLE_MIGRATE_MAX 10
#define PINNABLE_ISOLATE_MAX 100
Instead: fail on the first migration failure, and retry isolation
forever as their failures are transient.
- In self-set addressed some of the comments from John Hubbard, updated
commit logs, and added comments. Renamed gup->flags with gup->test_flags.
v4
- Address page migration comments. New patch:
mm/gup: limit number of gup migration failures, honor failures
Implements the limiting number of retries for migration failures, and
also check for isolation failures.
Added a test case into gup_test to verify that pages never long-term
pinned in a movable zone, and also added tests to fault both in kernel
and in userland.
v3
- Merged with linux-next, which contains clean-up patch from Jason,
therefore this series is reduced by two patches which did the same
thing.
v2
- Addressed all review comments
- Added Reviewed-by's.
- Renamed PF_MEMALLOC_NOMOVABLE to PF_MEMALLOC_PIN
- Added is_pinnable_page() to check if page can be longterm pinned
- Fixed gup fast path by checking is_in_pinnable_zone()
- rename cma_page_list to movable_page_list
- add a admin-guide note about handling pinned pages in ZONE_MOVABLE,
updated caveat about pinned pages from linux/mmzone.h
- Move current_gfp_context() to fast-path
---------
When page is pinned it cannot be moved and its physical address stays
the same until pages is unpinned.
This is useful functionality to allows userland to implementation DMA
access. For example, it is used by vfio in vfio_pin_pages().
However, this functionality breaks memory hotplug/hotremove assumptions
that pages in ZONE_MOVABLE can always be migrated.
This patch series fixes this issue by forcing new allocations during
page pinning to omit ZONE_MOVABLE, and also to migrate any existing
pages from ZONE_MOVABLE during pinning.
It uses the same scheme logic that is currently used by CMA, and extends
the functionality for all allocations.
For more information read the discussion [1] about this problem.
[1] https://lore.kernel.org/lkml/CA+CK2bBffHBxjmb9jmSKacm0fJMinyt3Nhk8Nx6iudcQS…
Previous versions:
v1
https://lore.kernel.org/lkml/20201202052330.474592-1-pasha.tatashin@soleen.…
v2
https://lore.kernel.org/lkml/20201210004335.64634-1-pasha.tatashin@soleen.c…
v3
https://lore.kernel.org/lkml/20201211202140.396852-1-pasha.tatashin@soleen.…
v4
https://lore.kernel.org/lkml/20201217185243.3288048-1-pasha.tatashin@soleen…
v5
https://lore.kernel.org/lkml/20210119043920.155044-1-pasha.tatashin@soleen.…
v6
https://lore.kernel.org/lkml/20210120014333.222547-1-pasha.tatashin@soleen.…
v7
https://lore.kernel.org/lkml/20210122033748.924330-1-pasha.tatashin@soleen.…
v8
https://lore.kernel.org/lkml/20210125194751.1275316-1-pasha.tatashin@soleen…
Pavel Tatashin (14):
mm/gup: don't pin migrated cma pages in movable zone
mm/gup: check every subpage of a compound page during isolation
mm/gup: return an error on migration failure
mm/gup: check for isolation errors
mm cma: rename PF_MEMALLOC_NOCMA to PF_MEMALLOC_PIN
mm: apply per-task gfp constraints in fast path
mm: honor PF_MEMALLOC_PIN for all movable pages
mm/gup: do not migrate zero page
mm/gup: migrate pinned pages out of movable zone
memory-hotplug.rst: add a note about ZONE_MOVABLE and page pinning
mm/gup: change index type to long as it counts pages
mm/gup: longterm pin migration cleanup
selftests/vm: gup_test: fix test flag
selftests/vm: gup_test: test faulting in kernel, and verify pinnable
pages
.../admin-guide/mm/memory-hotplug.rst | 9 +
include/linux/migrate.h | 1 +
include/linux/mm.h | 12 ++
include/linux/mmzone.h | 13 +-
include/linux/pgtable.h | 3 +-
include/linux/sched.h | 2 +-
include/linux/sched/mm.h | 27 +--
include/trace/events/migrate.h | 3 +-
mm/gup.c | 174 ++++++++----------
mm/gup_test.c | 29 +--
mm/gup_test.h | 3 +-
mm/hugetlb.c | 4 +-
mm/page_alloc.c | 33 ++--
tools/testing/selftests/vm/gup_test.c | 36 +++-
14 files changed, 190 insertions(+), 159 deletions(-)
--
2.25.1
What the short summary is saying now, is that this commit would make the
existing code to use vDSO base address. It's already doing that.
You could instead just "Use getauxval() to simplify the code".
Also, I'd prefer to properly use upper and lower case letter, e.g. vDSO
instead of vdso.
Reply-To:
In-Reply-To: <20210124062907.88229-2-tianjia.zhang(a)linux.alibaba.com>
On Sun, Jan 24, 2021 at 02:29:03PM +0800, Tianjia Zhang wrote:
> This patch uses the library function `getauxval(AT_SYSINFO_EHDR)`
> instead of the custom function `vdso_get_base_addr` to obtain the
Use either double or single quotation mark instead of hyphen.
> base address of vDSO, which will simplify the code implementation.
>
> Signed-off-by: Tianjia Zhang <tianjia.zhang(a)linux.alibaba.com>
This needs to be imperative form, e.g. "Simplify the code implemntation
by using getauxval() instead of a custom function."
> ---
> tools/testing/selftests/sgx/main.c | 24 ++++--------------------
> 1 file changed, 4 insertions(+), 20 deletions(-)
>
> diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c
> index 724cec700926..365d01dea67b 100644
> --- a/tools/testing/selftests/sgx/main.c
> +++ b/tools/testing/selftests/sgx/main.c
> @@ -15,6 +15,7 @@
> #include <sys/stat.h>
> #include <sys/time.h>
> #include <sys/types.h>
> +#include <sys/auxv.h>
> #include "defines.h"
> #include "main.h"
> #include "../kselftest.h"
> @@ -28,24 +29,6 @@ struct vdso_symtab {
> Elf64_Word *elf_hashtab;
> };
>
> -static void *vdso_get_base_addr(char *envp[])
> -{
> - Elf64_auxv_t *auxv;
> - int i;
> -
> - for (i = 0; envp[i]; i++)
> - ;
> -
> - auxv = (Elf64_auxv_t *)&envp[i + 1];
> -
> - for (i = 0; auxv[i].a_type != AT_NULL; i++) {
> - if (auxv[i].a_type == AT_SYSINFO_EHDR)
> - return (void *)auxv[i].a_un.a_val;
> - }
> -
> - return NULL;
> -}
> -
> static Elf64_Dyn *vdso_get_dyntab(void *addr)
> {
> Elf64_Ehdr *ehdr = addr;
> @@ -162,7 +145,7 @@ static int user_handler(long rdi, long rsi, long rdx, long ursp, long r8, long r
> return 0;
> }
>
> -int main(int argc, char *argv[], char *envp[])
> +int main(int argc, char *argv[])
> {
> struct sgx_enclave_run run;
> struct vdso_symtab symtab;
> @@ -203,7 +186,8 @@ int main(int argc, char *argv[], char *envp[])
> memset(&run, 0, sizeof(run));
> run.tcs = encl.encl_base;
>
> - addr = vdso_get_base_addr(envp);
> + /* Get vDSO base address */
> + addr = (void *)(uintptr_t)getauxval(AT_SYSINFO_EHDR);
You could just case the result the result directly to void *.
> if (!addr)
> goto err;
>
> --
> 2.19.1.3.ge56e4f7
>
>
/Jarkko
From: Arnd Bergmann <arnd(a)arndb.de>
I ran into a couple of problems with kunit tests taking too much stack
space, sometimes dangerously so. These the the three instances that
cause an increase over the warning limit of some architectures:
lib/bitfield_kunit.c:93:1: error: the frame size of 7440 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
drivers/base/test/property-entry-test.c:481:1: error: the frame size of 2640 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
drivers/thunderbolt/test.c:1529:1: error: the frame size of 1176 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
Ideally there should be a way to rewrite the kunit infrastructure
that avoids the explosion of stack data when the structleak plugin
is used.
A rather drastic measure would be to use Kconfig logic to make
the two options mutually exclusive. This would clearly work, but
is probably not needed.
As a simpler workaround, this disables the plugin for the three
files in which the excessive stack usage was observed.
Arnd
Arnd Bergmann (3):
bitfield: build kunit tests without structleak plugin
drivers/base: build kunit tests without structleak plugin
thunderbolt: build kunit tests without structleak plugin
drivers/base/test/Makefile | 1 +
drivers/thunderbolt/Makefile | 1 +
lib/Makefile | 1 +
3 files changed, 3 insertions(+)
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Brendan Higgins <brendanhiggins(a)google.com>
Cc: Shuah Khan <skhan(a)linuxfoundation.org>
Cc: Geert Uytterhoeven <geert+renesas(a)glider.be>
Cc: Alan Maguire <alan.maguire(a)oracle.com>
Cc: Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
Cc: Mika Westerberg <mika.westerberg(a)linux.intel.com>
Cc: Vitor Massaru Iha <vitor(a)massaru.org>
Cc: linux-hardening(a)vger.kernel.org
Cc: linux-kselftest(a)vger.kernel.org
Cc: kunit-dev(a)googlegroups.com
Cc: linux-kernel(a)vger.kernel.org
--
2.29.2
Currently, given something (fairly dystopian) like
> KUNIT_EXPECT_EQ(test, 2 + 2, 5)
KUnit will prints a failure message like this.
> Expected 2 + 2 == 5, but
> 2 + 2 == 4
> 5 == 5
With this patch, the output just becomes
> Expected 2 + 2 == 5, but
> 2 + 2 == 4
This patch is slightly hacky, but it's quite common* to compare an
expression to a literal integer value, so this can make KUnit less
chatty in many cases. (This patch also fixes variants like
KUNIT_EXPECT_GT, LE, et al.).
It also allocates an additional string briefly, but given this only
happens on test failures, it doesn't seem too bad a tradeoff.
Also, in most cases it'll realize the lengths are unequal and bail out
before the allocation.
We could save the result of the formatted string to avoid wasting this
extra work, but it felt cleaner to leave it as-is.
Edge case: for something silly and unrealistic like
> KUNIT_EXPECT_EQ(test, 4, 5);
It'll generate this message with a trailing "but"
> Expected 2 + 2 == 5, but
> <next line of normal output>
It didn't feel worth adding a check up-front to see if both sides are
literals to handle this better.
*A quick grep suggests 100+ comparisons to an integer literal as the
right hand side.
Signed-off-by: Daniel Latypov <dlatypov(a)google.com>
---
lib/kunit/assert.c | 39 +++++++++++++++++++++++++++++++++------
1 file changed, 33 insertions(+), 6 deletions(-)
diff --git a/lib/kunit/assert.c b/lib/kunit/assert.c
index 33acdaa28a7d..e0ec7d6fed6f 100644
--- a/lib/kunit/assert.c
+++ b/lib/kunit/assert.c
@@ -85,6 +85,29 @@ void kunit_ptr_not_err_assert_format(const struct kunit_assert *assert,
}
EXPORT_SYMBOL_GPL(kunit_ptr_not_err_assert_format);
+/* Checks if `text` is a literal representing `value`, e.g. "5" and 5 */
+static bool is_literal(struct kunit *test, const char *text, long long value,
+ gfp_t gfp)
+{
+ char *buffer;
+ int len;
+ bool ret;
+
+ len = snprintf(NULL, 0, "%lld", value);
+ if (strlen(text) != len)
+ return false;
+
+ buffer = kunit_kmalloc(test, len+1, gfp);
+ if (!buffer)
+ return false;
+
+ snprintf(buffer, len+1, "%lld", value);
+ ret = strncmp(buffer, text, len) == 0;
+
+ kunit_kfree(test, buffer);
+ return ret;
+}
+
void kunit_binary_assert_format(const struct kunit_assert *assert,
struct string_stream *stream)
{
@@ -97,12 +120,16 @@ void kunit_binary_assert_format(const struct kunit_assert *assert,
binary_assert->left_text,
binary_assert->operation,
binary_assert->right_text);
- string_stream_add(stream, KUNIT_SUBSUBTEST_INDENT "%s == %lld\n",
- binary_assert->left_text,
- binary_assert->left_value);
- string_stream_add(stream, KUNIT_SUBSUBTEST_INDENT "%s == %lld",
- binary_assert->right_text,
- binary_assert->right_value);
+ if (!is_literal(stream->test, binary_assert->left_text,
+ binary_assert->left_value, stream->gfp))
+ string_stream_add(stream, KUNIT_SUBSUBTEST_INDENT "%s == %lld\n",
+ binary_assert->left_text,
+ binary_assert->left_value);
+ if (!is_literal(stream->test, binary_assert->right_text,
+ binary_assert->right_value, stream->gfp))
+ string_stream_add(stream, KUNIT_SUBSUBTEST_INDENT "%s == %lld",
+ binary_assert->right_text,
+ binary_assert->right_value);
kunit_assert_print_msg(assert, stream);
}
EXPORT_SYMBOL_GPL(kunit_binary_assert_format);
base-commit: e5ff2cb9cf67a542f2ec7fb87e24934c88b32678
--
2.30.0.365.g02bc693789-goog
Currently running tests via KUnit tool means tweaking a .kunitconfig
file, which you'd keep around locally and never commit.
This changes makes it so users can pass in a path to a kunitconfig.
One of the imagined use cases is having kunitconfig fragments in-tree
to formalize interesting sets of tests for features/subsystems, e.g.
$ ./tools/testing/kunit/kunit.py run fs/ext4/kunitconfig
For now, this hypothetical fs/ext4/kunitconfig would contain
CONFIG_KUNIT=y
CONFIG_EXT4_FS=y
CONFIG_EXT4_KUNIT_TESTS=y
At the moment, it's not hard to manually whip up this file, but as more
and more tests get added, this will get tedious.
It also opens the door to documenting how to run all the tests relevant
to a specific subsystem or feature as a simple one-liner.
This can be seen as an analogue to tools/testing/selftests/*/config
But in the case of KUnit, the tests live in the same directory as the
code-under-test, so it feels more natural to allow the kunitconfig
fragments to live anywhere. (Though, people could create a separate
directory if wanted; this patch imposes no restrictions on the path).
Signed-off-by: Daniel Latypov <dlatypov(a)google.com>
---
tools/testing/kunit/kunit.py | 9 ++++++---
tools/testing/kunit/kunit_kernel.py | 12 ++++++++----
tools/testing/kunit/kunit_tool_test.py | 25 +++++++++++++++++++++++++
3 files changed, 39 insertions(+), 7 deletions(-)
diff --git a/tools/testing/kunit/kunit.py b/tools/testing/kunit/kunit.py
index e808a47c839b..3204a23bd16e 100755
--- a/tools/testing/kunit/kunit.py
+++ b/tools/testing/kunit/kunit.py
@@ -188,6 +188,9 @@ def add_build_opts(parser) -> None:
help='As in the make command, "Specifies the number of '
'jobs (commands) to run simultaneously."',
type=int, default=8, metavar='jobs')
+ parser.add_argument('kunitconfig',
+ help='Path to Kconfig fragment that enables KUnit tests',
+ type=str, nargs='?', metavar='kunitconfig')
def add_exec_opts(parser) -> None:
parser.add_argument('--timeout',
@@ -256,7 +259,7 @@ def main(argv, linux=None):
os.mkdir(cli_args.build_dir)
if not linux:
- linux = kunit_kernel.LinuxSourceTree(cli_args.build_dir)
+ linux = kunit_kernel.LinuxSourceTree(cli_args.build_dir, kunitconfig_path=cli_args.kunitconfig)
request = KunitRequest(cli_args.raw_output,
cli_args.timeout,
@@ -274,7 +277,7 @@ def main(argv, linux=None):
os.mkdir(cli_args.build_dir)
if not linux:
- linux = kunit_kernel.LinuxSourceTree(cli_args.build_dir)
+ linux = kunit_kernel.LinuxSourceTree(cli_args.build_dir, kunitconfig_path=cli_args.kunitconfig)
request = KunitConfigRequest(cli_args.build_dir,
cli_args.make_options)
@@ -286,7 +289,7 @@ def main(argv, linux=None):
sys.exit(1)
elif cli_args.subcommand == 'build':
if not linux:
- linux = kunit_kernel.LinuxSourceTree(cli_args.build_dir)
+ linux = kunit_kernel.LinuxSourceTree(cli_args.build_dir, kunitconfig_path=cli_args.kunitconfig)
request = KunitBuildRequest(cli_args.jobs,
cli_args.build_dir,
diff --git a/tools/testing/kunit/kunit_kernel.py b/tools/testing/kunit/kunit_kernel.py
index 2076a5a2d060..0b461663e7d9 100644
--- a/tools/testing/kunit/kunit_kernel.py
+++ b/tools/testing/kunit/kunit_kernel.py
@@ -123,7 +123,7 @@ def get_outfile_path(build_dir) -> str:
class LinuxSourceTree(object):
"""Represents a Linux kernel source tree with KUnit tests."""
- def __init__(self, build_dir: str, load_config=True, defconfig=DEFAULT_KUNITCONFIG_PATH) -> None:
+ def __init__(self, build_dir: str, load_config=True, kunitconfig_path='') -> None:
signal.signal(signal.SIGINT, self.signal_handler)
self._ops = LinuxSourceTreeOperations()
@@ -131,9 +131,13 @@ class LinuxSourceTree(object):
if not load_config:
return
- kunitconfig_path = get_kunitconfig_path(build_dir)
- if not os.path.exists(kunitconfig_path):
- shutil.copyfile(defconfig, kunitconfig_path)
+ if kunitconfig_path:
+ if not os.path.exists(kunitconfig_path):
+ raise ConfigError(f'Specified kunitconfig ({kunitconfig_path}) does not exist')
+ else:
+ kunitconfig_path = get_kunitconfig_path(build_dir)
+ if not os.path.exists(kunitconfig_path):
+ shutil.copyfile(DEFAULT_KUNITCONFIG_PATH, kunitconfig_path)
self._kconfig = kunit_config.Kconfig()
self._kconfig.read_from_file(kunitconfig_path)
diff --git a/tools/testing/kunit/kunit_tool_test.py b/tools/testing/kunit/kunit_tool_test.py
index b593f4448e83..533fe41b5123 100755
--- a/tools/testing/kunit/kunit_tool_test.py
+++ b/tools/testing/kunit/kunit_tool_test.py
@@ -12,6 +12,7 @@ from unittest import mock
import tempfile, shutil # Handling test_tmpdir
import json
+import signal
import os
import kunit_config
@@ -250,6 +251,23 @@ class KUnitParserTest(unittest.TestCase):
result.status)
self.assertEqual('kunit-resource-test', result.suites[0].name)
+class LinuxSourceTreeTest(unittest.TestCase):
+
+ def setUp(self):
+ mock.patch.object(signal, 'signal').start()
+ self.addCleanup(mock.patch.stopall)
+
+ def test_invalid_kunitconfig(self):
+ with self.assertRaisesRegex(kunit_kernel.ConfigError, 'nonexistent.* does not exist'):
+ kunit_kernel.LinuxSourceTree('', kunitconfig_path='/nonexistent_file')
+
+ def test_valid_kunitconfig(self):
+ with tempfile.NamedTemporaryFile('wt') as kunitconfig:
+ tree = kunit_kernel.LinuxSourceTree('', kunitconfig_path=kunitconfig.name)
+
+ # TODO: add more test cases.
+
+
class KUnitJsonTest(unittest.TestCase):
def _json_for(self, log_file):
@@ -399,5 +417,12 @@ class KUnitMainTest(unittest.TestCase):
self.linux_source_mock.run_kernel.assert_called_once_with(build_dir=build_dir, timeout=300)
self.print_mock.assert_any_call(StrContains('Testing complete.'))
+ @mock.patch.object(kunit_kernel, 'LinuxSourceTree')
+ def test_run_kunitconfig(self, mock_linux_init):
+ mock_linux_init.return_value = self.linux_source_mock
+ kunit.main(['run', 'mykunitconfig'])
+ # Just verify that we parsed and initialized it correctly here.
+ mock_linux_init.assert_called_once_with('.kunit', kunitconfig_path='mykunitconfig')
+
if __name__ == '__main__':
unittest.main()
base-commit: 2b8fdbbf1c616300312f71fe5b21fe8f03129950
--
2.30.0.280.ga3ce27912f-goog
Hi,
This patch series adjusts the semantic of file hierarchy access-control
per layer to get a more pragmatic and compatible approach. I updated
the documentation to explain how layers, bind mounts and overlayfs are
handled by Landlock. A syscall is also renamed to make it less
ambiguous for future evolution. Last but not least, the test file
layout cleanups are more resilient, and a lot of tests are added to
cover bind mounts and overlayfs, which are fully supported.
The SLOC count is 1292 for security/landlock/ and 2425 for
tools/testing/selftest/landlock/ . Test coverage for security/landlock/
is 94.7% of lines. The code not covered only deals with internal kernel
errors (e.g. memory allocation) and race conditions. This series is
being fuzzed by syzkaller, and patches are on their way:
https://github.com/google/syzkaller/pull/2380
The compiled documentation is available here:
https://landlock.io/linux-doc/landlock-v27/userspace-api/landlock.html
This series can be applied on top of v5.11-rc4 . This can be tested
with CONFIG_SECURITY_LANDLOCK, CONFIG_SAMPLE_LANDLOCK and by prepending
"landlock," to CONFIG_LSM. This patch series can be found in a Git
repository here:
https://github.com/landlock-lsm/linux/commits/landlock-v27
This patch series seems ready for upstream and I would really appreciate
final reviews.
# Landlock LSM
The goal of Landlock is to enable to restrict ambient rights (e.g.
global filesystem access) for a set of processes. Because Landlock is a
stackable LSM [1], it makes possible to create safe security sandboxes
as new security layers in addition to the existing system-wide
access-controls. This kind of sandbox is expected to help mitigate the
security impact of bugs or unexpected/malicious behaviors in user-space
applications. Landlock empowers any process, including unprivileged
ones, to securely restrict themselves.
Landlock is inspired by seccomp-bpf but instead of filtering syscalls
and their raw arguments, a Landlock rule can restrict the use of kernel
objects like file hierarchies, according to the kernel semantic.
Landlock also takes inspiration from other OS sandbox mechanisms: XNU
Sandbox, FreeBSD Capsicum or OpenBSD Pledge/Unveil.
In this current form, Landlock misses some access-control features.
This enables to minimize this patch series and ease review. This series
still addresses multiple use cases, especially with the combined use of
seccomp-bpf: applications with built-in sandboxing, init systems,
security sandbox tools and security-oriented APIs [2].
Previous version:
https://lore.kernel.org/lkml/20201209192839.1396820-1-mic@digikod.net/
[1] https://lore.kernel.org/lkml/50db058a-7dde-441b-a7f9-f6837fe8b69f@schaufler…
[2] https://lore.kernel.org/lkml/f646e1c7-33cf-333f-070c-0a40ad0468cd@digikod.n…
Casey Schaufler (1):
LSM: Infrastructure management of the superblock
Mickaël Salaün (11):
landlock: Add object management
landlock: Add ruleset and domain management
landlock: Set up the security framework and manage credentials
landlock: Add ptrace restrictions
fs,security: Add sb_delete hook
landlock: Support filesystem access-control
landlock: Add syscall implementations
arch: Wire up Landlock syscalls
selftests/landlock: Add user space tests
samples/landlock: Add a sandbox manager example
landlock: Add user and kernel documentation
Documentation/security/index.rst | 1 +
Documentation/security/landlock.rst | 79 +
Documentation/userspace-api/index.rst | 1 +
Documentation/userspace-api/landlock.rst | 306 ++
MAINTAINERS | 13 +
arch/Kconfig | 7 +
arch/alpha/kernel/syscalls/syscall.tbl | 3 +
arch/arm/tools/syscall.tbl | 3 +
arch/arm64/include/asm/unistd.h | 2 +-
arch/arm64/include/asm/unistd32.h | 6 +
arch/ia64/kernel/syscalls/syscall.tbl | 3 +
arch/m68k/kernel/syscalls/syscall.tbl | 3 +
arch/microblaze/kernel/syscalls/syscall.tbl | 3 +
arch/mips/kernel/syscalls/syscall_n32.tbl | 3 +
arch/mips/kernel/syscalls/syscall_n64.tbl | 3 +
arch/mips/kernel/syscalls/syscall_o32.tbl | 3 +
arch/parisc/kernel/syscalls/syscall.tbl | 3 +
arch/powerpc/kernel/syscalls/syscall.tbl | 3 +
arch/s390/kernel/syscalls/syscall.tbl | 3 +
arch/sh/kernel/syscalls/syscall.tbl | 3 +
arch/sparc/kernel/syscalls/syscall.tbl | 3 +
arch/um/Kconfig | 1 +
arch/x86/entry/syscalls/syscall_32.tbl | 3 +
arch/x86/entry/syscalls/syscall_64.tbl | 3 +
arch/xtensa/kernel/syscalls/syscall.tbl | 3 +
fs/super.c | 1 +
include/linux/lsm_hook_defs.h | 1 +
include/linux/lsm_hooks.h | 3 +
include/linux/security.h | 4 +
include/linux/syscalls.h | 7 +
include/uapi/asm-generic/unistd.h | 8 +-
include/uapi/linux/landlock.h | 128 +
kernel/sys_ni.c | 5 +
samples/Kconfig | 7 +
samples/Makefile | 1 +
samples/landlock/.gitignore | 1 +
samples/landlock/Makefile | 13 +
samples/landlock/sandboxer.c | 239 ++
security/Kconfig | 11 +-
security/Makefile | 2 +
security/landlock/Kconfig | 21 +
security/landlock/Makefile | 4 +
security/landlock/common.h | 20 +
security/landlock/cred.c | 46 +
security/landlock/cred.h | 58 +
security/landlock/fs.c | 621 ++++
security/landlock/fs.h | 56 +
security/landlock/limits.h | 21 +
security/landlock/object.c | 67 +
security/landlock/object.h | 91 +
security/landlock/ptrace.c | 120 +
security/landlock/ptrace.h | 14 +
security/landlock/ruleset.c | 466 +++
security/landlock/ruleset.h | 161 +
security/landlock/setup.c | 40 +
security/landlock/setup.h | 18 +
security/landlock/syscalls.c | 429 +++
security/security.c | 51 +-
security/selinux/hooks.c | 58 +-
security/selinux/include/objsec.h | 6 +
security/selinux/ss/services.c | 3 +-
security/smack/smack.h | 6 +
security/smack/smack_lsm.c | 35 +-
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/landlock/.gitignore | 2 +
tools/testing/selftests/landlock/Makefile | 24 +
tools/testing/selftests/landlock/base_test.c | 219 ++
tools/testing/selftests/landlock/common.h | 166 ++
tools/testing/selftests/landlock/config | 6 +
tools/testing/selftests/landlock/fs_test.c | 2585 +++++++++++++++++
.../testing/selftests/landlock/ptrace_test.c | 314 ++
tools/testing/selftests/landlock/true.c | 5 +
72 files changed, 6552 insertions(+), 77 deletions(-)
create mode 100644 Documentation/security/landlock.rst
create mode 100644 Documentation/userspace-api/landlock.rst
create mode 100644 include/uapi/linux/landlock.h
create mode 100644 samples/landlock/.gitignore
create mode 100644 samples/landlock/Makefile
create mode 100644 samples/landlock/sandboxer.c
create mode 100644 security/landlock/Kconfig
create mode 100644 security/landlock/Makefile
create mode 100644 security/landlock/common.h
create mode 100644 security/landlock/cred.c
create mode 100644 security/landlock/cred.h
create mode 100644 security/landlock/fs.c
create mode 100644 security/landlock/fs.h
create mode 100644 security/landlock/limits.h
create mode 100644 security/landlock/object.c
create mode 100644 security/landlock/object.h
create mode 100644 security/landlock/ptrace.c
create mode 100644 security/landlock/ptrace.h
create mode 100644 security/landlock/ruleset.c
create mode 100644 security/landlock/ruleset.h
create mode 100644 security/landlock/setup.c
create mode 100644 security/landlock/setup.h
create mode 100644 security/landlock/syscalls.c
create mode 100644 tools/testing/selftests/landlock/.gitignore
create mode 100644 tools/testing/selftests/landlock/Makefile
create mode 100644 tools/testing/selftests/landlock/base_test.c
create mode 100644 tools/testing/selftests/landlock/common.h
create mode 100644 tools/testing/selftests/landlock/config
create mode 100644 tools/testing/selftests/landlock/fs_test.c
create mode 100644 tools/testing/selftests/landlock/ptrace_test.c
create mode 100644 tools/testing/selftests/landlock/true.c
base-commit: 19c329f6808995b142b3966301f217c831e7cf31
--
2.30.0
From: Bongsu Jeon <bongsu.jeon(a)samsung.com>
A NCI virtual device can be made to simulate a NCI device in user space.
Using the virtual NCI device, The NCI module and application can be
validated. This driver supports to communicate between the virtual NCI
device and NCI module. To test the basic features of NCI module, selftest
for nci is added. Test cases consist of making the virtual NCI device
on/off and controlling the device's polling for NCI1.0 and NCI2.0 version.
1/2 is the Virtual NCI device driver.
2/2 is the NCI selftest suite
v3:
1/2
- change the Kconfig help comment.
- remove the mutex init code.
- remove the unnecessary mutex(nci_send_mutex).
- remove the full_txbuff.
- add the code to release skb at error case.
- refactor some code.
v2:
1/2
- change the permission of the Virtual NCI device.
- add the ioctl to find the nci device index.
2/2
- add the NCI selftest suite.
MAINTAINERS | 8 +
drivers/nfc/Kconfig | 11 +
drivers/nfc/Makefile | 1 +
drivers/nfc/virtual_ncidev.c | 227 ++++++++++
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/nci/Makefile | 6 +
tools/testing/selftests/nci/config | 3 +
tools/testing/selftests/nci/nci_dev.c | 599 ++++++++++++++++++++++++++
8 files changed, 856 insertions(+)
create mode 100644 drivers/nfc/virtual_ncidev.c
create mode 100644 tools/testing/selftests/nci/Makefile
create mode 100644 tools/testing/selftests/nci/config
create mode 100644 tools/testing/selftests/nci/nci_dev.c
--
2.25.1
From: Mike Rapoport <rppt(a)linux.ibm.com>
Hi,
@Andrew, this is based on v5.11-rc4-mmots-2021-01-19-13-54 with secretmem
patches dropped from there, I can rebase whatever way you prefer.
This is an implementation of "secret" mappings backed by a file descriptor.
The file descriptor backing secret memory mappings is created using a
dedicated memfd_secret system call The desired protection mode for the
memory is configured using flags parameter of the system call. The mmap()
of the file descriptor created with memfd_secret() will create a "secret"
memory mapping. The pages in that mapping will be marked as not present in
the direct map and will be present only in the page table of the owning mm.
Although normally Linux userspace mappings are protected from other users,
such secret mappings are useful for environments where a hostile tenant is
trying to trick the kernel into giving them access to other tenants
mappings.
Additionally, in the future the secret mappings may be used as a mean to
protect guest memory in a virtual machine host.
For demonstration of secret memory usage we've created a userspace library
https://git.kernel.org/pub/scm/linux/kernel/git/jejb/secret-memory-preloade…
that does two things: the first is act as a preloader for openssl to
redirect all the OPENSSL_malloc calls to secret memory meaning any secret
keys get automatically protected this way and the other thing it does is
expose the API to the user who needs it. We anticipate that a lot of the
use cases would be like the openssl one: many toolkits that deal with
secret keys already have special handling for the memory to try to give
them greater protection, so this would simply be pluggable into the
toolkits without any need for user application modification.
Hiding secret memory mappings behind an anonymous file allows (ab)use of
the page cache for tracking pages allocated for the "secret" mappings as
well as using address_space_operations for e.g. page migration callbacks.
The anonymous file may be also used implicitly, like hugetlb files, to
implement mmap(MAP_SECRET) and use the secret memory areas with "native" mm
ABIs in the future.
To limit fragmentation of the direct map to splitting only PUD-size pages,
I've added an amortizing cache of PMD-size pages to each file descriptor
that is used as an allocation pool for the secret memory areas.
As the memory allocated by secretmem becomes unmovable, we use CMA to back
large page caches so that page allocator won't be surprised by failing attempt
to migrate these pages.
v15:
* Add riscv/Kconfig update to disable set_memory operations for nommu
builds (patch 3)
* Update the code around add_to_page_cache() per Matthew's comments
(patches 6,7)
* Add fixups for build/checkpatch errors discovered by CI systems
v14: https://lore.kernel.org/lkml/20201203062949.5484-1-rppt@kernel.org
* Finally s/mod_node_page_state/mod_lruvec_page_state/
v13: https://lore.kernel.org/lkml/20201201074559.27742-1-rppt@kernel.org
* Added Reviewed-by, thanks Catalin and David
* s/mod_node_page_state/mod_lruvec_page_state/ as Shakeel suggested
v12: https://lore.kernel.org/lkml/20201125092208.12544-1-rppt@kernel.org
* Add detection of whether set_direct_map has actual effect on arm64 and bail
out of CMA allocation for secretmem and the memfd_secret() syscall if pages
would not be removed from the direct map
v11: https://lore.kernel.org/lkml/20201124092556.12009-1-rppt@kernel.org
* Drop support for uncached mappings
Older history:
v10: https://lore.kernel.org/lkml/20201123095432.5860-1-rppt@kernel.org
v9: https://lore.kernel.org/lkml/20201117162932.13649-1-rppt@kernel.org
v8: https://lore.kernel.org/lkml/20201110151444.20662-1-rppt@kernel.org
v7: https://lore.kernel.org/lkml/20201026083752.13267-1-rppt@kernel.org
v6: https://lore.kernel.org/lkml/20200924132904.1391-1-rppt@kernel.org
v5: https://lore.kernel.org/lkml/20200916073539.3552-1-rppt@kernel.org
v4: https://lore.kernel.org/lkml/20200818141554.13945-1-rppt@kernel.org
v3: https://lore.kernel.org/lkml/20200804095035.18778-1-rppt@kernel.org
v2: https://lore.kernel.org/lkml/20200727162935.31714-1-rppt@kernel.org
v1: https://lore.kernel.org/lkml/20200720092435.17469-1-rppt@kernel.org
Mike Rapoport (11):
mm: add definition of PMD_PAGE_ORDER
mmap: make mlock_future_check() global
riscv/Kconfig: make direct map manipulation options depend on MMU
set_memory: allow set_direct_map_*_noflush() for multiple pages
set_memory: allow querying whether set_direct_map_*() is actually enabled
mm: introduce memfd_secret system call to create "secret" memory areas
secretmem: use PMD-size pages to amortize direct map fragmentation
secretmem: add memcg accounting
PM: hibernate: disable when there are active secretmem users
arch, mm: wire up memfd_secret system call where relevant
secretmem: test: add basic selftest for memfd_secret(2)
arch/arm64/include/asm/Kbuild | 1 -
arch/arm64/include/asm/cacheflush.h | 6 -
arch/arm64/include/asm/set_memory.h | 17 +
arch/arm64/include/uapi/asm/unistd.h | 1 +
arch/arm64/kernel/machine_kexec.c | 1 +
arch/arm64/mm/mmu.c | 6 +-
arch/arm64/mm/pageattr.c | 23 +-
arch/riscv/Kconfig | 4 +-
arch/riscv/include/asm/set_memory.h | 4 +-
arch/riscv/include/asm/unistd.h | 1 +
arch/riscv/mm/pageattr.c | 8 +-
arch/x86/entry/syscalls/syscall_32.tbl | 1 +
arch/x86/entry/syscalls/syscall_64.tbl | 1 +
arch/x86/include/asm/set_memory.h | 4 +-
arch/x86/mm/pat/set_memory.c | 8 +-
fs/dax.c | 11 +-
include/linux/pgtable.h | 3 +
include/linux/secretmem.h | 30 ++
include/linux/set_memory.h | 16 +-
include/linux/syscalls.h | 1 +
include/uapi/asm-generic/unistd.h | 6 +-
include/uapi/linux/magic.h | 1 +
kernel/power/hibernate.c | 5 +-
kernel/power/snapshot.c | 4 +-
kernel/sys_ni.c | 2 +
mm/Kconfig | 5 +
mm/Makefile | 1 +
mm/filemap.c | 3 +-
mm/gup.c | 10 +
mm/internal.h | 3 +
mm/mmap.c | 5 +-
mm/secretmem.c | 444 ++++++++++++++++++++++
mm/vmalloc.c | 5 +-
scripts/checksyscalls.sh | 4 +
tools/testing/selftests/vm/.gitignore | 1 +
tools/testing/selftests/vm/Makefile | 3 +-
tools/testing/selftests/vm/memfd_secret.c | 296 +++++++++++++++
tools/testing/selftests/vm/run_vmtests | 17 +
38 files changed, 910 insertions(+), 52 deletions(-)
create mode 100644 arch/arm64/include/asm/set_memory.h
create mode 100644 include/linux/secretmem.h
create mode 100644 mm/secretmem.c
create mode 100644 tools/testing/selftests/vm/memfd_secret.c
--
2.28.0
Changelog
---------
v6
Small update, but I wanted to send it out quicker, as it removes a
controversial patch and replaces it with something sane.
- Removed forcing FOLL_WRITE for longterm gup, instead added a patch to
skip zero pages during migration.
- Added reviewed-by's and minor log changes.
v5
- Added the following patches to the beginning of series, which are fixes
to the other existing problems with CMA migration code:
mm/gup: check every subpage of a compound page during isolation
mm/gup: return an error on migration failure
mm/gup: check for isolation errors also at the beginning of series
mm/gup: do not allow zero page for pinned pages
- remove .gfp_mask/.reclaim_idx changes from mm/vmscan.c
- update movable zone header comment in patch 8 instead of patch 3, fix
the comment
- Added acked, sign-offs
- Updated commit logs based on feedback
- Addressed issues reported by Michal and Jason.
- Remove:
#define PINNABLE_MIGRATE_MAX 10
#define PINNABLE_ISOLATE_MAX 100
Instead: fail on the first migration failure, and retry isolation
forever as their failures are transient.
- In self-set addressed some of the comments from John Hubbard, updated
commit logs, and added comments. Renamed gup->flags with gup->test_flags.
v4
- Address page migration comments. New patch:
mm/gup: limit number of gup migration failures, honor failures
Implements the limiting number of retries for migration failures, and
also check for isolation failures.
Added a test case into gup_test to verify that pages never long-term
pinned in a movable zone, and also added tests to fault both in kernel
and in userland.
v3
- Merged with linux-next, which contains clean-up patch from Jason,
therefore this series is reduced by two patches which did the same
thing.
v2
- Addressed all review comments
- Added Reviewed-by's.
- Renamed PF_MEMALLOC_NOMOVABLE to PF_MEMALLOC_PIN
- Added is_pinnable_page() to check if page can be longterm pinned
- Fixed gup fast path by checking is_in_pinnable_zone()
- rename cma_page_list to movable_page_list
- add a admin-guide note about handling pinned pages in ZONE_MOVABLE,
updated caveat about pinned pages from linux/mmzone.h
- Move current_gfp_context() to fast-path
---------
When page is pinned it cannot be moved and its physical address stays
the same until pages is unpinned.
This is useful functionality to allows userland to implementation DMA
access. For example, it is used by vfio in vfio_pin_pages().
However, this functionality breaks memory hotplug/hotremove assumptions
that pages in ZONE_MOVABLE can always be migrated.
This patch series fixes this issue by forcing new allocations during
page pinning to omit ZONE_MOVABLE, and also to migrate any existing
pages from ZONE_MOVABLE during pinning.
It uses the same scheme logic that is currently used by CMA, and extends
the functionality for all allocations.
For more information read the discussion [1] about this problem.
[1] https://lore.kernel.org/lkml/CA+CK2bBffHBxjmb9jmSKacm0fJMinyt3Nhk8Nx6iudcQS…
Previous versions:
v1
https://lore.kernel.org/lkml/20201202052330.474592-1-pasha.tatashin@soleen.…
v2
https://lore.kernel.org/lkml/20201210004335.64634-1-pasha.tatashin@soleen.c…
v3
https://lore.kernel.org/lkml/20201211202140.396852-1-pasha.tatashin@soleen.…
v4
https://lore.kernel.org/lkml/20201217185243.3288048-1-pasha.tatashin@soleen…
v5
https://lore.kernel.org/lkml/20210119043920.155044-1-pasha.tatashin@soleen.…
Pavel Tatashin (14):
mm/gup: don't pin migrated cma pages in movable zone
mm/gup: check every subpage of a compound page during isolation
mm/gup: return an error on migration failure
mm/gup: check for isolation errors
mm cma: rename PF_MEMALLOC_NOCMA to PF_MEMALLOC_PIN
mm: apply per-task gfp constraints in fast path
mm: honor PF_MEMALLOC_PIN for all movable pages
mm/gup: do not migrate zero page
mm/gup: migrate pinned pages out of movable zone
memory-hotplug.rst: add a note about ZONE_MOVABLE and page pinning
mm/gup: change index type to long as it counts pages
mm/gup: longterm pin migration cleaup
selftests/vm: test flag is broken
selftests/vm: test faulting in kernel, and verify pinnable pages
.../admin-guide/mm/memory-hotplug.rst | 9 +
include/linux/migrate.h | 1 +
include/linux/mm.h | 11 ++
include/linux/mmzone.h | 13 +-
include/linux/sched.h | 2 +-
include/linux/sched/mm.h | 27 +--
include/trace/events/migrate.h | 3 +-
mm/gup.c | 175 ++++++++----------
mm/gup_test.c | 29 +--
mm/gup_test.h | 3 +-
mm/hugetlb.c | 4 +-
mm/page_alloc.c | 33 ++--
tools/testing/selftests/vm/gup_test.c | 36 +++-
13 files changed, 185 insertions(+), 161 deletions(-)
--
2.25.1
Changelog
---------
v7
- Added reviewed-by's
- Fixed a compile bug on non-mmu builds reported by robot
v6
Small update, but I wanted to send it out quicker, as it removes a
controversial patch and replaces it with something sane.
- Removed forcing FOLL_WRITE for longterm gup, instead added a patch to
skip zero pages during migration.
- Added reviewed-by's and minor log changes.
v5
- Added the following patches to the beginning of series, which are fixes
to the other existing problems with CMA migration code:
mm/gup: check every subpage of a compound page during isolation
mm/gup: return an error on migration failure
mm/gup: check for isolation errors also at the beginning of series
mm/gup: do not allow zero page for pinned pages
- remove .gfp_mask/.reclaim_idx changes from mm/vmscan.c
- update movable zone header comment in patch 8 instead of patch 3, fix
the comment
- Added acked, sign-offs
- Updated commit logs based on feedback
- Addressed issues reported by Michal and Jason.
- Remove:
#define PINNABLE_MIGRATE_MAX 10
#define PINNABLE_ISOLATE_MAX 100
Instead: fail on the first migration failure, and retry isolation
forever as their failures are transient.
- In self-set addressed some of the comments from John Hubbard, updated
commit logs, and added comments. Renamed gup->flags with gup->test_flags.
v4
- Address page migration comments. New patch:
mm/gup: limit number of gup migration failures, honor failures
Implements the limiting number of retries for migration failures, and
also check for isolation failures.
Added a test case into gup_test to verify that pages never long-term
pinned in a movable zone, and also added tests to fault both in kernel
and in userland.
v3
- Merged with linux-next, which contains clean-up patch from Jason,
therefore this series is reduced by two patches which did the same
thing.
v2
- Addressed all review comments
- Added Reviewed-by's.
- Renamed PF_MEMALLOC_NOMOVABLE to PF_MEMALLOC_PIN
- Added is_pinnable_page() to check if page can be longterm pinned
- Fixed gup fast path by checking is_in_pinnable_zone()
- rename cma_page_list to movable_page_list
- add a admin-guide note about handling pinned pages in ZONE_MOVABLE,
updated caveat about pinned pages from linux/mmzone.h
- Move current_gfp_context() to fast-path
---------
When page is pinned it cannot be moved and its physical address stays
the same until pages is unpinned.
This is useful functionality to allows userland to implementation DMA
access. For example, it is used by vfio in vfio_pin_pages().
However, this functionality breaks memory hotplug/hotremove assumptions
that pages in ZONE_MOVABLE can always be migrated.
This patch series fixes this issue by forcing new allocations during
page pinning to omit ZONE_MOVABLE, and also to migrate any existing
pages from ZONE_MOVABLE during pinning.
It uses the same scheme logic that is currently used by CMA, and extends
the functionality for all allocations.
For more information read the discussion [1] about this problem.
[1] https://lore.kernel.org/lkml/CA+CK2bBffHBxjmb9jmSKacm0fJMinyt3Nhk8Nx6iudcQS…
Previous versions:
v1
https://lore.kernel.org/lkml/20201202052330.474592-1-pasha.tatashin@soleen.…
v2
https://lore.kernel.org/lkml/20201210004335.64634-1-pasha.tatashin@soleen.c…
v3
https://lore.kernel.org/lkml/20201211202140.396852-1-pasha.tatashin@soleen.…
v4
https://lore.kernel.org/lkml/20201217185243.3288048-1-pasha.tatashin@soleen…
v5
https://lore.kernel.org/lkml/20210119043920.155044-1-pasha.tatashin@soleen.…
v6
https://lore.kernel.org/lkml/20210120014333.222547-1-pasha.tatashin@soleen.…
Pavel Tatashin (14):
mm/gup: don't pin migrated cma pages in movable zone
mm/gup: check every subpage of a compound page during isolation
mm/gup: return an error on migration failure
mm/gup: check for isolation errors
mm cma: rename PF_MEMALLOC_NOCMA to PF_MEMALLOC_PIN
mm: apply per-task gfp constraints in fast path
mm: honor PF_MEMALLOC_PIN for all movable pages
mm/gup: do not migrate zero page
mm/gup: migrate pinned pages out of movable zone
memory-hotplug.rst: add a note about ZONE_MOVABLE and page pinning
mm/gup: change index type to long as it counts pages
mm/gup: longterm pin migration cleanup
selftests/vm: test flag is broken
selftests/vm: test faulting in kernel, and verify pinnable pages
.../admin-guide/mm/memory-hotplug.rst | 9 +
include/linux/migrate.h | 1 +
include/linux/mm.h | 11 ++
include/linux/mmzone.h | 13 +-
include/linux/pgtable.h | 3 +-
include/linux/sched.h | 2 +-
include/linux/sched/mm.h | 27 +--
include/trace/events/migrate.h | 3 +-
mm/gup.c | 176 ++++++++----------
mm/gup_test.c | 29 +--
mm/gup_test.h | 3 +-
mm/hugetlb.c | 4 +-
mm/page_alloc.c | 33 ++--
tools/testing/selftests/vm/gup_test.c | 36 +++-
14 files changed, 191 insertions(+), 159 deletions(-)
--
2.25.1
In function sgx_encl_create(), the logic of directly assigning
value to attributes_mask determines that the call to
SGX_IOC_ENCLAVE_PROVISION must be after the command of
SGX_IOC_ENCLAVE_CREATE. If change this assignment statement to
or operation, the PROVISION command can be executed earlier and
more flexibly.
Reported-by: Jia Zhang <zhang.jia(a)linux.alibaba.com>
Signed-off-by: Tianjia Zhang <tianjia.zhang(a)linux.alibaba.com>
---
arch/x86/kernel/cpu/sgx/ioctl.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
index f45957c05f69..0ca3fc238bc2 100644
--- a/arch/x86/kernel/cpu/sgx/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/ioctl.c
@@ -108,7 +108,7 @@ static int sgx_encl_create(struct sgx_encl *encl, struct sgx_secs *secs)
encl->base = secs->base;
encl->size = secs->size;
encl->attributes = secs->attributes;
- encl->attributes_mask = SGX_ATTR_DEBUG | SGX_ATTR_MODE64BIT | SGX_ATTR_KSS;
+ encl->attributes_mask |= SGX_ATTR_DEBUG | SGX_ATTR_MODE64BIT | SGX_ATTR_KSS;
/* Set only after completion, as encl->lock has not been taken. */
set_bit(SGX_ENCL_CREATED, &encl->flags);
--
2.19.1.3.ge56e4f7
Hi Linus,
Please pull the following KUnit fixes update for Linux 5.11-rc5.
This KUnit update for Linux 5.11-rc5 consists of 5 fixes to kunit tool
and documentation from Daniel Latypov and David Gow.
diff is attached.
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit 65a4e5299739abe0888cda0938d21f8ea3b5c606:
kunit: tool: Force the use of the 'tty' console for UML (2021-01-04
09:18:38 -0700)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
tags/linux-kselftest-kunit-fixes-5.11-rc5
for you to fetch changes up to 2b8fdbbf1c616300312f71fe5b21fe8f03129950:
kunit: tool: move kunitconfig parsing into __init__, make it optional
(2021-01-15 17:52:12 -0700)
----------------------------------------------------------------
linux-kselftest-kunit-fixes-5.11-rc5
This KUnit update for Linux 5.11-rc5 consist of 5 fixes to kunit tool
and documentation from Daniel Latypov and David Gow.
----------------------------------------------------------------
Daniel Latypov (4):
Documentation: kunit: include example of a parameterized test
kunit: tool: surface and address more typing issues
kunit: tool: fix minor typing issue with None status
kunit: tool: move kunitconfig parsing into __init__, make it optional
David Gow (1):
kunit: tool: Fix spelling of "diagnostic" in kunit_parser
Documentation/dev-tools/kunit/usage.rst | 57 +++++++++++++++++++++++
tools/testing/kunit/kunit.py | 34 +++++---------
tools/testing/kunit/kunit_config.py | 7 +--
tools/testing/kunit/kunit_json.py | 2 +-
tools/testing/kunit/kunit_kernel.py | 54 +++++++++++-----------
tools/testing/kunit/kunit_parser.py | 81
++++++++++++++++-----------------
6 files changed, 141 insertions(+), 94 deletions(-)
----------------------------------------------------------------
From: Bongsu Jeon <bongsu.jeon(a)samsung.com>
A NCI virtual device can be made to simulate a NCI device in user space.
Using the virtual NCI device, The NCI module and application can be
validated. This driver supports to communicate between the virtual NCI
device and NCI module. To test the basic features of NCI module, selftest
for NCI is added. Test cases consist of making the virtual NCI device
on/off and controlling the device's polling for NCI1.0 and NCI2.0 version.
1/2 is the Virtual NCI device driver.
2/2 is the NCI selftest suite
v2:
1/2
- change the permission of the Virtual NCI device.
- add the ioctl to find the nci device index.
2/2
- add the NCI selftest suite.
Bongsu Jeon (2):
nfc: Add a virtual nci device driver
selftests: Add nci suite
MAINTAINERS | 8 +
drivers/nfc/Kconfig | 11 +
drivers/nfc/Makefile | 1 +
drivers/nfc/virtual_ncidev.c | 235 ++++++++++
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/nci/Makefile | 6 +
tools/testing/selftests/nci/config | 3 +
tools/testing/selftests/nci/nci_dev.c | 599 ++++++++++++++++++++++++++
8 files changed, 864 insertions(+)
create mode 100644 drivers/nfc/virtual_ncidev.c
create mode 100644 tools/testing/selftests/nci/Makefile
create mode 100644 tools/testing/selftests/nci/config
create mode 100644 tools/testing/selftests/nci/nci_dev.c
--
2.25.1
Add a libbpf dumper function that supports dumping a representation
of data passed in using the BTF id associated with the data in a
manner similar to the bpf_snprintf_btf helper.
Default output format is identical to that dumped by bpf_snprintf_btf(),
for example a "struct sk_buff" representation would look like this:
struct sk_buff){
(union){
(struct){
.next = (struct sk_buff *)0xffffffffffffffff,
.prev = (struct sk_buff *)0xffffffffffffffff,
(union){
.dev = (struct net_device *)0xffffffffffffffff,
.dev_scratch = (long unsigned int)18446744073709551615,
},
},
...
Patches 1 and 2 make functions available that are needed during
dump operations.
Patch 3 implements the dump functionality in a manner similar
to that in kernel/bpf/btf.c, but with a view to fitting into
libbpf more naturally. For example, rather than using flags,
boolean dump options are used to control output.
Patch 4 is a selftest that utilizes a dump printf function
to snprintf the dump output to a string for comparison with
expected output. Tests deliberately mirror those in
snprintf_btf helper test to keep output consistent.
Changes since RFC [1]
- The initial approach explored was to share the kernel code
with libbpf using #defines to paper over the different needs;
however it makes more sense to try and fit in with libbpf
code style for maintenance. A comment in the code points at
the implementation in kernel/bpf/btf.c and notes that any
issues found in it should be fixed there or vice versa;
mirroring the tests should help with this also
(Andrii)
[1] https://lore.kernel.org/bpf/1610386373-24162-1-git-send-email-alan.maguire@…
Alan Maguire (4):
libbpf: add btf_has_size() and btf_int() inlines
libbpf: make skip_mods_and_typedefs available internally in libbpf
libbpf: BTF dumper support for typed data
selftests/bpf: add dump type data tests to btf dump tests
tools/lib/bpf/btf.h | 36 +
tools/lib/bpf/btf_dump.c | 974 ++++++++++++++++++++++
tools/lib/bpf/libbpf.c | 4 +-
tools/lib/bpf/libbpf.map | 5 +
tools/lib/bpf/libbpf_internal.h | 2 +
tools/testing/selftests/bpf/prog_tests/btf_dump.c | 233 ++++++
6 files changed, 1251 insertions(+), 3 deletions(-)
--
1.8.3.1
Obviously, the error variable detection of the if statement is
for the mprotect callback function, so it is also put into the
scope of calling callbck.
Reported-by: Jia Zhang <zhang.jia(a)linux.alibaba.com>
Signed-off-by: Tianjia Zhang <tianjia.zhang(a)linux.alibaba.com>
---
mm/mprotect.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/mm/mprotect.c b/mm/mprotect.c
index ab709023e9aa..94188df1ee55 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -617,10 +617,11 @@ static int do_mprotect_pkey(unsigned long start, size_t len,
if (tmp > end)
tmp = end;
- if (vma->vm_ops && vma->vm_ops->mprotect)
+ if (vma->vm_ops && vma->vm_ops->mprotect) {
error = vma->vm_ops->mprotect(vma, nstart, tmp, newflags);
- if (error)
- goto out;
+ if (error)
+ goto out;
+ }
error = mprotect_fixup(vma, &prev, nstart, tmp, newflags);
if (error)
--
2.19.1.3.ge56e4f7
Hi Jon,
This series have three parts:
1) 10 remaining fixup patches from the series I sent back on Dec, 1st:
parport: fix a kernel-doc markup
rapidio: fix kernel-doc a markup
fs: fix kernel-doc markups
pstore/zone: fix a kernel-doc markup
firmware: stratix10-svc: fix kernel-doc markups
connector: fix a kernel-doc markup
lib/crc7: fix a kernel-doc markup
memblock: fix kernel-doc markups
w1: fix a kernel-doc markup
selftests: kselftest_harness.h: partially fix kernel-doc markups
2) The patch adding the new check to ensure that the kernel-doc
markup will be used for the right declaration;
3) 5 additional patches, produced against next-20210114 with new
problems detected after the original series:
net: tip: fix a couple kernel-doc markups
net: cfg80211: fix a kerneldoc markup
reset: core: fix a kernel-doc markup
drm: drm_crc: fix a kernel-doc markup
platform/surface: aggregator: fix a kernel-doc markup
It probably makes sense to merge at least the first 11 patches
via the doc tree, as they should apply cleanly there, and
having the last 5 patches merged via each maintainer's tree.
-
Kernel-doc has always be limited to a probably bad documented
rule:
The kernel-doc markups should appear *imediatelly before* the
function or data structure that it documents.
On other words, if a C file would contain something like this:
/**
* foo - function foo
* @args: foo args
*/
static inline void bar(int args);
/**
* bar - function bar
* @args: foo args
*/
static inline void foo(void *args);
The output (in ReST format) will be:
.. c:function:: void bar (int args)
function foo
**Parameters**
``int args``
foo args
.. c:function:: void foo (void *args)
function bar
**Parameters**
``void *args``
foo args
Which is clearly a wrong result. Before this changeset,
not even a warning is produced on such cases.
As placing such markups just before the documented
data is a common practice, on most cases this is fine.
However, as patches touch things, identifiers may be
renamed, and people may forget to update the kernel-doc
markups to follow such changes.
This has been happening for quite a while, as there are
lots of files with kernel-doc problems.
This series address those issues and add a file at the
end that will enforce that the identifier will match the
kernel-doc markup, avoiding this problem from
keep happening as time goes by.
This series is based on current upstream tree.
@maintainers: feel free to pick the patches and
apply them directly on your trees, as all patches on
this series are independent from the other ones.
--
v6:
- rebased on the top of next-20210114 and added a few extra fixes
v5:
- The completion.h patch was replaced by another one which drops
an obsolete macro;
- Some typos got fixed and review tags got added;
- Dropped patches that were already merged at linux-next.
v4:
- Patches got rebased and got some acks.
Mauro Carvalho Chehab (16):
parport: fix a kernel-doc markup
rapidio: fix kernel-doc a markup
fs: fix kernel-doc markups
pstore/zone: fix a kernel-doc markup
firmware: stratix10-svc: fix kernel-doc markups
connector: fix a kernel-doc markup
lib/crc7: fix a kernel-doc markup
memblock: fix kernel-doc markups
w1: fix a kernel-doc markup
selftests: kselftest_harness.h: partially fix kernel-doc markups
scripts: kernel-doc: validate kernel-doc markup with the actual names
net: tip: fix a couple kernel-doc markups
net: cfg80211: fix a kerneldoc markup
reset: core: fix a kernel-doc markup
drm: drm_crc: fix a kernel-doc markup
platform/surface: aggregator: fix a kernel-doc markup
drivers/parport/share.c | 2 +-
.../surface/aggregator/ssh_request_layer.c | 2 +-
drivers/rapidio/rio.c | 2 +-
drivers/reset/core.c | 4 +-
fs/dcache.c | 73 ++++++++++---------
fs/inode.c | 4 +-
fs/pstore/zone.c | 2 +-
fs/seq_file.c | 5 +-
fs/super.c | 12 +--
include/drm/drm_crtc.h | 2 +-
include/linux/connector.h | 2 +-
.../firmware/intel/stratix10-svc-client.h | 10 +--
include/linux/memblock.h | 4 +-
include/linux/parport.h | 31 ++++++++
include/linux/w1.h | 2 +-
include/net/cfg80211.h | 2 +-
lib/crc7.c | 2 +-
net/tipc/link.c | 2 +-
net/tipc/node.c | 2 +-
scripts/kernel-doc | 62 ++++++++++++----
tools/testing/selftests/kselftest_harness.h | 26 ++++---
21 files changed, 160 insertions(+), 93 deletions(-)
--
2.29.2
Initially I just wanted to port the selftests to the latest GPIO uAPI,
but on finding that, due to dependency issues, the selftests are not built
for the buildroot environments that I do most of my GPIO testing in, I
decided to take a closer look.
The first patch is essentially a rewrite of the exising test suite.
It uses a simplified abstraction of the uAPI interfaces to allow a common
test suite to test the gpio-mockup using either of the uAPI interfaces.
The simplified cdev interface is implemented in gpio-mockup.sh, with the
actual driving of the uAPI implemented in gpio-mockup-cdev.c.
The simplified sysfs interface replaces gpio-mockup-sysfs.sh and is
loaded over the cdev implementation when selected.
The new tests should also be simpler to extend to cover new mockup
interfaces, such as the one Bart has been working on.
I have dropped support for testing modules other than gpio-mockup from
the command line options, as the tests are very gpio-mockup specific so
I didn't see any calling for it.
I have also tried to emphasise in the test output that the tests are
covering the gpio-mockup itself. They do perform some implicit testing
of gpiolib and the uAPI interfaces, and so can be useful as smoke tests
for those, but their primary focus is the gpio-mockup.
Patches 2 through 5 do some cleaning up that is now possible with the
new implementation, including enabling building in buildroot environments.
Patch 4 doesn't strictly clean up all the old gpio references that it
could - the gpio was the only Level 1 test, so the Level 1 tests could
potentially be removed, but I was unsure if there may be other
implications to removing a whole test level, or that it may be useful
as a placeholder in case other static LDLIBS tests are added in
the future??
Patch 6 finally gets around to porting the tests to the latest GPIO uAPI.
And Patch 7 updates the config to set the CONFIG_GPIO_CDEV option that
was added in v5.10.
Cheers,
Kent.
Changes v2 -> v3:
- remove 'commit' from Fixes tag in patch 1.
- rebase on Bart's gpio/for-next
Changes v1 -> v2 (all in patch 1 and gpio-mockup.sh unless stated
otherwise):
- reorder includes in gpio-mockup-cdev.c
- a multitude of improvements to gpio-mockup.sh and gpio-mockup-sysfs.sh
based on Andy's review comments
- improved cleanup to ensure all child processes are killed on exit
- added race condition prevention or mitigation including the wait in
release_line, the retries in assert_mock, the assert_mock in set_mock,
and the sleep in set_line
Kent Gibson (7):
selftests: gpio: rework and simplify test implementation
selftests: gpio: remove obsolete gpio-mockup-chardev.c
selftests: remove obsolete build restriction for gpio
selftests: remove obsolete gpio references from kselftest_deps.sh
tools: gpio: remove uAPI v1 code no longer used by selftests
selftests: gpio: port to GPIO uAPI v2
selftests: gpio: add CONFIG_GPIO_CDEV to config
tools/gpio/gpio-utils.c | 89 ----
tools/gpio/gpio-utils.h | 6 -
tools/testing/selftests/Makefile | 9 -
tools/testing/selftests/gpio/Makefile | 26 +-
tools/testing/selftests/gpio/config | 1 +
.../testing/selftests/gpio/gpio-mockup-cdev.c | 198 +++++++
.../selftests/gpio/gpio-mockup-chardev.c | 323 ------------
.../selftests/gpio/gpio-mockup-sysfs.sh | 168 ++----
tools/testing/selftests/gpio/gpio-mockup.sh | 497 ++++++++++++------
tools/testing/selftests/kselftest_deps.sh | 4 +-
10 files changed, 603 insertions(+), 718 deletions(-)
create mode 100644 tools/testing/selftests/gpio/gpio-mockup-cdev.c
delete mode 100644 tools/testing/selftests/gpio/gpio-mockup-chardev.c
base-commit: 64e6066e16b8c562983dd9d33e604c0001ae0fc7
--
2.30.0
If you are not using the test-definitions project to run kselftest,
please ignore this email.
A new run script for kselftest, run_kselftest.sh [1], was created during the
Linux v5.10 release.
This script allows someone to run both individual test cases and sets of
test cases. Accordingly, the test-definitions kselftest script [2] was also
improved to support these upstream changes [1]. Currently this change is in
the test-definitions repository in a separate branch "kselftest". This has been
running in LKFT's CI since November 2020 [3].
The test-definitions kselftest script will stop supporting older versions of
the kselftest run script starting 1st-Feb-2021 from master branch.
OTOH, One have to use test-definitions Tag 2021.01 (will be created) for older
kselftest versions.
We request that any users of test-definitions project start updating your
kselftest sources to version v5.10 and above.
Upstream patch,
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/patch/to…
[2] https://github.com/Linaro/test-definitions/tree/kselftest/automated/linux/k…
[3] https://github.com/Linaro/test-definitions/tree/kselftest
---
>From 5da1918446a1d50d57f2f6062f7fdede0b052473 Mon Sep 17 00:00:00 2001
From: Kees Cook <keescook(a)chromium.org>
Date: Mon, 28 Sep 2020 13:26:49 -0700
Subject: selftests/run_kselftest.sh: Make each test individually selectable
Currently with run_kselftest.sh there is no way to choose which test
we could run. All the tests listed in kselftest-list.txt are all run
every time. This patch enhanced the run_kselftest.sh to make the test
collections (or tests) individually selectable. e.g.:
$ ./run_kselftest.sh -c seccomp -t timers:posix_timers -t timers:nanosleep
Additionally adds a way to list all known tests with "-l", usage
with "-h", and perform a dry run without running tests with "-n".
Co-developed-by: Hangbin Liu <liuhangbin(a)gmail.com>
Signed-off-by: Hangbin Liu <liuhangbin(a)gmail.com>
Signed-off-by: Kees Cook <keescook(a)chromium.org>
Tested-by: Naresh Kamboju <naresh.kamboju(a)linaro.org>
Signed-off-by: Shuah Khan <skhan(a)linuxfoundation.org>
--
- Naresh Kamboju
On Wed, Jan 20, 2021 at 6:22 AM angkery <angkery(a)163.com> wrote:
>
> From: Junlin Yang <yangjunlin(a)yulong.com>
>
> Change 'exeeds' to 'exceeds'.
>
> Signed-off-by: Junlin Yang <yangjunlin(a)yulong.com>
The patch didn't reach patchwork.
Please reduce cc list and resubmit to bpf@vger only.
Also pls mention [PATCH bpf-next] in the subject.
> ---
> tools/testing/selftests/bpf/prog_tests/btf.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/btf.c b/tools/testing/selftests/bpf/prog_tests/btf.c
> index 8ae97e2..ea008d0 100644
> --- a/tools/testing/selftests/bpf/prog_tests/btf.c
> +++ b/tools/testing/selftests/bpf/prog_tests/btf.c
> @@ -914,7 +914,7 @@ struct btf_raw_test {
> .err_str = "Member exceeds struct_size",
> },
>
> -/* Test member exeeds the size of struct
> +/* Test member exceeds the size of struct
> *
> * struct A {
> * int m;
> @@ -948,7 +948,7 @@ struct btf_raw_test {
> .err_str = "Member exceeds struct_size",
> },
>
> -/* Test member exeeds the size of struct
> +/* Test member exceeds the size of struct
> *
> * struct A {
> * int m;
> --
> 1.9.1
>
>
From: Mike Rapoport <rppt(a)linux.ibm.com>
Hi,
@Andrew, this is based on v5.10-rc2-mmotm-2020-11-07-21-40, I can rebase on
current mmotm if you prefer.
This is an implementation of "secret" mappings backed by a file descriptor.
The file descriptor backing secret memory mappings is created using a
dedicated memfd_secret system call The desired protection mode for the
memory is configured using flags parameter of the system call. The mmap()
of the file descriptor created with memfd_secret() will create a "secret"
memory mapping. The pages in that mapping will be marked as not present in
the direct map and will be present only in the page table of the owning mm.
Although normally Linux userspace mappings are protected from other users,
such secret mappings are useful for environments where a hostile tenant is
trying to trick the kernel into giving them access to other tenants
mappings.
Additionally, in the future the secret mappings may be used as a mean to
protect guest memory in a virtual machine host.
For demonstration of secret memory usage we've created a userspace library
https://git.kernel.org/pub/scm/linux/kernel/git/jejb/secret-memory-preloade…
that does two things: the first is act as a preloader for openssl to
redirect all the OPENSSL_malloc calls to secret memory meaning any secret
keys get automatically protected this way and the other thing it does is
expose the API to the user who needs it. We anticipate that a lot of the
use cases would be like the openssl one: many toolkits that deal with
secret keys already have special handling for the memory to try to give
them greater protection, so this would simply be pluggable into the
toolkits without any need for user application modification.
Hiding secret memory mappings behind an anonymous file allows (ab)use of
the page cache for tracking pages allocated for the "secret" mappings as
well as using address_space_operations for e.g. page migration callbacks.
The anonymous file may be also used implicitly, like hugetlb files, to
implement mmap(MAP_SECRET) and use the secret memory areas with "native" mm
ABIs in the future.
To limit fragmentation of the direct map to splitting only PUD-size pages,
I've added an amortizing cache of PMD-size pages to each file descriptor
that is used as an allocation pool for the secret memory areas.
As the memory allocated by secretmem becomes unmovable, we use CMA to back
large page caches so that page allocator won't be surprised by failing attempt
to migrate these pages.
v14:
* Finally s/mod_node_page_state/mod_lruvec_page_state/
v13: https://lore.kernel.org/lkml/20201201074559.27742-1-rppt@kernel.org
* Added Reviewed-by, thanks Catalin and David
* s/mod_node_page_state/mod_lruvec_page_state/ as Shakeel suggested
v12: https://lore.kernel.org/lkml/20201125092208.12544-1-rppt@kernel.org
* Add detection of whether set_direct_map has actual effect on arm64 and bail
out of CMA allocation for secretmem and the memfd_secret() syscall if pages
would not be removed from the direct map
v11: https://lore.kernel.org/lkml/20201124092556.12009-1-rppt@kernel.org
* Drop support for uncached mappings
v10: https://lore.kernel.org/lkml/20201123095432.5860-1-rppt@kernel.org
* Drop changes to arm64 compatibility layer
* Add Roman's Ack for memcg accounting
Older history:
v9: https://lore.kernel.org/lkml/20201117162932.13649-1-rppt@kernel.org
v8: https://lore.kernel.org/lkml/20201110151444.20662-1-rppt@kernel.org
v7: https://lore.kernel.org/lkml/20201026083752.13267-1-rppt@kernel.org
v6: https://lore.kernel.org/lkml/20200924132904.1391-1-rppt@kernel.org
v5: https://lore.kernel.org/lkml/20200916073539.3552-1-rppt@kernel.org
v4: https://lore.kernel.org/lkml/20200818141554.13945-1-rppt@kernel.org
v3: https://lore.kernel.org/lkml/20200804095035.18778-1-rppt@kernel.org
v2: https://lore.kernel.org/lkml/20200727162935.31714-1-rppt@kernel.org
v1: https://lore.kernel.org/lkml/20200720092435.17469-1-rppt@kernel.org
Mike Rapoport (10):
mm: add definition of PMD_PAGE_ORDER
mmap: make mlock_future_check() global
set_memory: allow set_direct_map_*_noflush() for multiple pages
set_memory: allow querying whether set_direct_map_*() is actually enabled
mm: introduce memfd_secret system call to create "secret" memory areas
secretmem: use PMD-size pages to amortize direct map fragmentation
secretmem: add memcg accounting
PM: hibernate: disable when there are active secretmem users
arch, mm: wire up memfd_secret system call were relevant
secretmem: test: add basic selftest for memfd_secret(2)
arch/arm64/include/asm/Kbuild | 1 -
arch/arm64/include/asm/cacheflush.h | 6 -
arch/arm64/include/asm/set_memory.h | 17 +
arch/arm64/include/uapi/asm/unistd.h | 1 +
arch/arm64/kernel/machine_kexec.c | 1 +
arch/arm64/mm/mmu.c | 6 +-
arch/arm64/mm/pageattr.c | 23 +-
arch/riscv/include/asm/set_memory.h | 4 +-
arch/riscv/include/asm/unistd.h | 1 +
arch/riscv/mm/pageattr.c | 8 +-
arch/x86/Kconfig | 2 +-
arch/x86/entry/syscalls/syscall_32.tbl | 1 +
arch/x86/entry/syscalls/syscall_64.tbl | 1 +
arch/x86/include/asm/set_memory.h | 4 +-
arch/x86/mm/pat/set_memory.c | 8 +-
fs/dax.c | 11 +-
include/linux/pgtable.h | 3 +
include/linux/secretmem.h | 30 ++
include/linux/set_memory.h | 16 +-
include/linux/syscalls.h | 1 +
include/uapi/asm-generic/unistd.h | 6 +-
include/uapi/linux/magic.h | 1 +
kernel/power/hibernate.c | 5 +-
kernel/power/snapshot.c | 4 +-
kernel/sys_ni.c | 2 +
mm/Kconfig | 5 +
mm/Makefile | 1 +
mm/filemap.c | 3 +-
mm/gup.c | 10 +
mm/internal.h | 3 +
mm/mmap.c | 5 +-
mm/secretmem.c | 439 ++++++++++++++++++++++
mm/vmalloc.c | 5 +-
scripts/checksyscalls.sh | 4 +
tools/testing/selftests/vm/.gitignore | 1 +
tools/testing/selftests/vm/Makefile | 3 +-
tools/testing/selftests/vm/memfd_secret.c | 298 +++++++++++++++
tools/testing/selftests/vm/run_vmtests | 17 +
38 files changed, 906 insertions(+), 51 deletions(-)
create mode 100644 arch/arm64/include/asm/set_memory.h
create mode 100644 include/linux/secretmem.h
create mode 100644 mm/secretmem.c
create mode 100644 tools/testing/selftests/vm/memfd_secret.c
base-commit: 9f8ce377d420db12b19d6a4f636fecbd88a725a5
--
2.28.0
In this scenario, there is no case where va_page is NULL, and
the error has been checked. The if condition statement here is
redundant, so remove the condition detection.
Reported-by: Jia Zhang <zhang.jia(a)linux.alibaba.com>
Signed-off-by: Tianjia Zhang <tianjia.zhang(a)linux.alibaba.com>
---
arch/x86/kernel/cpu/sgx/ioctl.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
index 90a5caf76939..f45957c05f69 100644
--- a/arch/x86/kernel/cpu/sgx/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/ioctl.c
@@ -66,9 +66,8 @@ static int sgx_encl_create(struct sgx_encl *encl, struct sgx_secs *secs)
va_page = sgx_encl_grow(encl);
if (IS_ERR(va_page))
return PTR_ERR(va_page);
- else if (va_page)
- list_add(&va_page->list, &encl->va_pages);
- /* else the tail page of the VA page list had free slots. */
+
+ list_add(&va_page->list, &encl->va_pages);
/* The extra page goes to SECS. */
encl_size = secs->size + PAGE_SIZE;
--
2.19.1.3.ge56e4f7
Increase `section->free_cnt` in sgx_sanitize_section() is
more reasonable, which is called in ksgxd kernel thread,
instead of assigning it to epc section pages number at
initialization. Although this is unlikely to fail, these
pages cannot be allocated after initialization, and which
need to be reset by ksgxd.
Reported-by: Jia Zhang <zhang.jia(a)linux.alibaba.com>
Signed-off-by: Tianjia Zhang <tianjia.zhang(a)linux.alibaba.com>
---
arch/x86/kernel/cpu/sgx/main.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index c519fc5f6948..9e9a3cf7c00b 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -48,9 +48,10 @@ static void sgx_sanitize_section(struct sgx_epc_section *section)
struct sgx_epc_page, list);
ret = __eremove(sgx_get_epc_virt_addr(page));
- if (!ret)
+ if (!ret) {
list_move(&page->list, §ion->page_list);
- else
+ section->free_cnt += 1;
+ } else
list_move_tail(&page->list, &dirty);
spin_unlock(§ion->lock);
@@ -646,7 +647,6 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size,
list_add_tail(§ion->pages[i].list, §ion->init_laundry_list);
}
- section->free_cnt = nr_pages;
return true;
}
--
2.19.1.3.ge56e4f7
From: Colin Ian King <colin.king(a)canonical.com>
There are two spelling mistakes in check_fail messages. Fix them.
Signed-off-by: Colin Ian King <colin.king(a)canonical.com>
---
tools/testing/selftests/net/forwarding/tc_chains.sh | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/net/forwarding/tc_chains.sh b/tools/testing/selftests/net/forwarding/tc_chains.sh
index 2934fb5ed2a2..b95de0463ebd 100755
--- a/tools/testing/selftests/net/forwarding/tc_chains.sh
+++ b/tools/testing/selftests/net/forwarding/tc_chains.sh
@@ -136,7 +136,7 @@ template_filter_fits()
tc filter add dev $h2 ingress protocol ip pref 1 handle 1102 \
flower src_mac $h2mac action drop &> /dev/null
- check_fail $? "Incorrectly succeded to insert filter which does not template"
+ check_fail $? "Incorrectly succeeded to insert filter which does not template"
tc filter add dev $h2 ingress chain 1 protocol ip pref 1 handle 1101 \
flower src_mac $h2mac action drop
@@ -144,7 +144,7 @@ template_filter_fits()
tc filter add dev $h2 ingress chain 1 protocol ip pref 1 handle 1102 \
flower dst_mac $h2mac action drop &> /dev/null
- check_fail $? "Incorrectly succeded to insert filter which does not template"
+ check_fail $? "Incorrectly succeeded to insert filter which does not template"
tc filter del dev $h2 ingress chain 1 protocol ip pref 1 handle 1102 \
flower &> /dev/null
--
2.29.2
Changelog
---------
v5
- Added the following patches to the beginning of series, which are fixes
to the other existing problems with CMA migration code:
mm/gup: check every subpage of a compound page during isolation
mm/gup: return an error on migration failure
mm/gup: check for isolation errors also at the beginning of series
mm/gup: do not allow zero page for pinned pages
- remove .gfp_mask/.reclaim_idx changes from mm/vmscan.c
- update movable zone header comment in patch 8 instead of patch 3, fix the comment
- Added acked, sign-offs
- Updated commit logs based on feedback
- Addressed issues reported by Michal and Jason.
- Remove:
#define PINNABLE_MIGRATE_MAX 10
#define PINNABLE_ISOLATE_MAX 100
Instead: fail on the first migration failure, and retry isolation forever as their
failures are transient.
- In self-set addressed some of the comments from John Hubbard, updated commit logs,
and added comments. Renamed gup->flags with gup->test_flags.
v4
- Address page migration comments. New patch:
mm/gup: limit number of gup migration failures, honor failures
Implements the limiting number of retries for migration failures, and
also check for isolation failures.
Added a test case into gup_test to verify that pages never long-term
pinned in a movable zone, and also added tests to fault both in kernel
and in userland.
v3
- Merged with linux-next, which contains clean-up patch from Jason,
therefore this series is reduced by two patches which did the same
thing.
v2
- Addressed all review comments
- Added Reviewed-by's.
- Renamed PF_MEMALLOC_NOMOVABLE to PF_MEMALLOC_PIN
- Added is_pinnable_page() to check if page can be longterm pinned
- Fixed gup fast path by checking is_in_pinnable_zone()
- rename cma_page_list to movable_page_list
- add a admin-guide note about handling pinned pages in ZONE_MOVABLE,
updated caveat about pinned pages from linux/mmzone.h
- Move current_gfp_context() to fast-path
---------
When page is pinned it cannot be moved and its physical address stays
the same until pages is unpinned.
This is useful functionality to allows userland to implementation DMA
access. For example, it is used by vfio in vfio_pin_pages().
However, this functionality breaks memory hotplug/hotremove assumptions
that pages in ZONE_MOVABLE can always be migrated.
This patch series fixes this issue by forcing new allocations during
page pinning to omit ZONE_MOVABLE, and also to migrate any existing
pages from ZONE_MOVABLE during pinning.
It uses the same scheme logic that is currently used by CMA, and extends
the functionality for all allocations.
For more information read the discussion [1] about this problem.
[1] https://lore.kernel.org/lkml/CA+CK2bBffHBxjmb9jmSKacm0fJMinyt3Nhk8Nx6iudcQS…
Previous versions:
v1
https://lore.kernel.org/lkml/20201202052330.474592-1-pasha.tatashin@soleen.…
v2
https://lore.kernel.org/lkml/20201210004335.64634-1-pasha.tatashin@soleen.c…
v3
https://lore.kernel.org/lkml/20201211202140.396852-1-pasha.tatashin@soleen.…
v4
https://lore.kernel.org/lkml/20201217185243.3288048-1-pasha.tatashin@soleen…
Pavel Tatashin (14):
mm/gup: don't pin migrated cma pages in movable zone
mm/gup: check every subpage of a compound page during isolation
mm/gup: return an error on migration failure
mm/gup: check for isolation errors
mm cma: rename PF_MEMALLOC_NOCMA to PF_MEMALLOC_PIN
mm: apply per-task gfp constraints in fast path
mm: honor PF_MEMALLOC_PIN for all movable pages
mm/gup: do not allow zero page for pinned pages
mm/gup: migrate pinned pages out of movable zone
memory-hotplug.rst: add a note about ZONE_MOVABLE and page pinning
mm/gup: change index type to long as it counts pages
mm/gup: longterm pin migration cleaup
selftests/vm: test flag is broken
selftests/vm: test faulting in kernel, and verify pinnable pages
.../admin-guide/mm/memory-hotplug.rst | 9 +
include/linux/migrate.h | 1 +
include/linux/mm.h | 11 ++
include/linux/mmzone.h | 9 +-
include/linux/sched.h | 2 +-
include/linux/sched/mm.h | 27 +--
include/trace/events/migrate.h | 3 +-
mm/gup.c | 178 ++++++++----------
mm/gup_test.c | 29 +--
mm/gup_test.h | 3 +-
mm/hugetlb.c | 4 +-
mm/page_alloc.c | 33 ++--
tools/testing/selftests/vm/gup_test.c | 36 +++-
13 files changed, 185 insertions(+), 160 deletions(-)
--
2.25.1
Initially I just wanted to port the selftests to the latest GPIO uAPI,
but on finding that, due to dependency issues, the selftests are not built
for the buildroot environments that I do most of my GPIO testing in, I
decided to take a closer look.
The first patch is essentially a rewrite of the exising test suite.
It uses a simplified abstraction of the uAPI interfaces to allow a common
test suite to test the gpio-mockup using either of the uAPI interfaces.
The simplified cdev interface is implemented in gpio-mockup.sh, with the
actual driving of the uAPI implemented in gpio-mockup-cdev.c.
The simplified sysfs interface replaces gpio-mockup-sysfs.sh and is
loaded over the cdev implementation when selected.
The new tests should also be simpler to extend to cover new mockup
interfaces, such as the one Bart has been working on.
I have dropped support for testing modules other than gpio-mockup from
the command line options, as the tests are very gpio-mockup specific so
I didn't see any calling for it.
I have also tried to emphasise in the test output that the tests are
covering the gpio-mockup itself. They do perform some implicit testing
of gpiolib and the uAPI interfaces, and so can be useful as smoke tests
for those, but their primary focus is the gpio-mockup.
Patches 2 through 5 do some cleaning up that is now possible with the
new implementation, including enabling building in buildroot environments.
Patch 4 doesn't strictly clean up all the old gpio references that it
could - the gpio was the only Level 1 test, so the Level 1 tests could
potentially be removed, but I was unsure if there may be other
implications to removing a whole test level, or that it may be useful
as a placeholder in case other static LDLIBS tests are added in
the future??
Patch 6 finally gets around to porting the tests to the latest GPIO uAPI.
And Patch 7 updates the config to set the CONFIG_GPIO_CDEV option that
was added in v5.10.
Cheers,
Kent.
Changes v1 -> v2 (all in patch 1 and gpio-mockup.sh unless stated
otherwise):
- reorder includes in gpio-mockup-cdev.c
- a multitude of improvements to gpio-mockup.sh and gpio-mockup-sysfs.sh
based on Andy's review comments
- improved cleanup to ensure all child processes are killed on exit
- added race condition prevention or mitigation including the wait in
release_line, the retries in assert_mock, the assert_mock in set_mock,
and the sleep in set_line
Kent Gibson (7):
selftests: gpio: rework and simplify test implementation
selftests: gpio: remove obsolete gpio-mockup-chardev.c
selftests: remove obsolete build restriction for gpio
selftests: remove obsolete gpio references from kselftest_deps.sh
tools: gpio: remove uAPI v1 code no longer used by selftests
selftests: gpio: port to GPIO uAPI v2
selftests: gpio: add CONFIG_GPIO_CDEV to config
tools/gpio/gpio-utils.c | 89 ----
tools/gpio/gpio-utils.h | 6 -
tools/testing/selftests/Makefile | 9 -
tools/testing/selftests/gpio/Makefile | 26 +-
tools/testing/selftests/gpio/config | 1 +
.../testing/selftests/gpio/gpio-mockup-cdev.c | 198 +++++++
.../selftests/gpio/gpio-mockup-chardev.c | 323 ------------
.../selftests/gpio/gpio-mockup-sysfs.sh | 168 ++----
tools/testing/selftests/gpio/gpio-mockup.sh | 497 ++++++++++++------
tools/testing/selftests/kselftest_deps.sh | 4 +-
10 files changed, 603 insertions(+), 718 deletions(-)
create mode 100644 tools/testing/selftests/gpio/gpio-mockup-cdev.c
delete mode 100644 tools/testing/selftests/gpio/gpio-mockup-chardev.c
--
2.30.0
This series contains a few cleanups that didn't make it into previous
series, including some cosmetic changes and small bug fixes. The series
also lays the groundwork for a memslot modification test which stresses
the memslot update and page fault code paths in an attempt to expose races.
Tested: dirty_log_perf_test, memslot_modification_stress_test, and
demand_paging_test were run, with all the patches in this series
applied, on an Intel Skylake machine.
echo Y > /sys/module/kvm/parameters/tdp_mmu; \
./memslot_modification_stress_test -i 1000 -v 64 -b 1G; \
./memslot_modification_stress_test -i 1000 -v 64 -b 64M -o; \
./dirty_log_perf_test -v 64 -b 1G; \
./dirty_log_perf_test -v 64 -b 64M -o; \
./demand_paging_test -v 64 -b 1G; \
./demand_paging_test -v 64 -b 64M -o; \
echo N > /sys/module/kvm/parameters/tdp_mmu; \
./memslot_modification_stress_test -i 1000 -v 64 -b 1G; \
./memslot_modification_stress_test -i 1000 -v 64 -b 64M -o; \
./dirty_log_perf_test -v 64 -b 1G; \
./dirty_log_perf_test -v 64 -b 64M -o; \
./demand_paging_test -v 64 -b 1G; \
./demand_paging_test -v 64 -b 64M -o
The tests behaved as expected, and fixed the problem of the
population stage being skipped in dirty_log_perf_test. This can be
seen in the output, with the population stage taking about the time
dirty pass 1 took and dirty pass 1 falling closer to the times for
the other passes.
Note that when running these tests, the -o option causes the test to take
much longer as the work each vCPU must do increases proportional to the
number of vCPUs.
You can view this series in Gerrit at:
https://linux-review.googlesource.com/c/linux/kernel/git/torvalds/linux/+/7…
Ben Gardon (6):
KVM: selftests: Rename timespec_diff_now to timespec_elapsed
KVM: selftests: Avoid flooding debug log while populating memory
KVM: selftests: Convert iterations to int in dirty_log_perf_test
KVM: selftests: Fix population stage in dirty_log_perf_test
KVM: selftests: Add option to overlap vCPU memory access
KVM: selftests: Add memslot modification stress test
tools/testing/selftests/kvm/.gitignore | 1 +
tools/testing/selftests/kvm/Makefile | 1 +
.../selftests/kvm/demand_paging_test.c | 40 +++-
.../selftests/kvm/dirty_log_perf_test.c | 72 +++---
.../selftests/kvm/include/perf_test_util.h | 4 +-
.../testing/selftests/kvm/include/test_util.h | 2 +-
.../selftests/kvm/lib/perf_test_util.c | 25 ++-
tools/testing/selftests/kvm/lib/test_util.c | 2 +-
.../kvm/memslot_modification_stress_test.c | 211 ++++++++++++++++++
9 files changed, 307 insertions(+), 51 deletions(-)
create mode 100644 tools/testing/selftests/kvm/memslot_modification_stress_test.c
--
2.30.0.284.gd98b1dd5eaa7-goog
Changelog
---------
v4
- Address page migration comments. New patch:
mm/gup: limit number of gup migration failures, honor failures
Implements the limiting number of retries for migration failures, and
also check for isolation failures.
Added a test case into gup_test to verify that pages never long-term
pinned in a movable zone, and also added tests to fault both in kernel
and in userland.
v3
- Merged with linux-next, which contains clean-up patch from Jason,
therefore this series is reduced by two patches which did the same
thing.
v2
- Addressed all review comments
- Added Reviewed-by's.
- Renamed PF_MEMALLOC_NOMOVABLE to PF_MEMALLOC_PIN
- Added is_pinnable_page() to check if page can be longterm pinned
- Fixed gup fast path by checking is_in_pinnable_zone()
- rename cma_page_list to movable_page_list
- add a admin-guide note about handling pinned pages in ZONE_MOVABLE,
updated caveat about pinned pages from linux/mmzone.h
- Move current_gfp_context() to fast-path
---------
When page is pinned it cannot be moved and its physical address stays
the same until pages is unpinned.
This is useful functionality to allows userland to implementation DMA
access. For example, it is used by vfio in vfio_pin_pages().
However, this functionality breaks memory hotplug/hotremove assumptions
that pages in ZONE_MOVABLE can always be migrated.
This patch series fixes this issue by forcing new allocations during
page pinning to omit ZONE_MOVABLE, and also to migrate any existing
pages from ZONE_MOVABLE during pinning.
It uses the same scheme logic that is currently used by CMA, and extends
the functionality for all allocations.
For more information read the discussion [1] about this problem.
[1] https://lore.kernel.org/lkml/CA+CK2bBffHBxjmb9jmSKacm0fJMinyt3Nhk8Nx6iudcQS…
Previous versions:
v1
https://lore.kernel.org/lkml/20201202052330.474592-1-pasha.tatashin@soleen.…
v2
https://lore.kernel.org/lkml/20201210004335.64634-1-pasha.tatashin@soleen.c…
v3
https://lore.kernel.org/lkml/20201211202140.396852-1-pasha.tatashin@soleen.…
Pavel Tatashin (10):
mm/gup: don't pin migrated cma pages in movable zone
mm cma: rename PF_MEMALLOC_NOCMA to PF_MEMALLOC_PIN
mm: apply per-task gfp constraints in fast path
mm: honor PF_MEMALLOC_PIN for all movable pages
mm/gup: migrate pinned pages out of movable zone
memory-hotplug.rst: add a note about ZONE_MOVABLE and page pinning
mm/gup: change index type to long as it counts pages
mm/gup: limit number of gup migration failures, honor failures
selftests/vm: test flag is broken
selftests/vm: test faulting in kernel, and verify pinnable pages
.../admin-guide/mm/memory-hotplug.rst | 9 +
include/linux/migrate.h | 1 +
include/linux/mm.h | 11 +
include/linux/mmzone.h | 11 +-
include/linux/sched.h | 2 +-
include/linux/sched/mm.h | 27 +--
include/trace/events/migrate.h | 3 +-
mm/gup.c | 217 ++++++++++--------
mm/gup_test.c | 15 +-
mm/gup_test.h | 1 +
mm/hugetlb.c | 4 +-
mm/page_alloc.c | 33 ++-
mm/vmscan.c | 10 +-
tools/testing/selftests/vm/gup_test.c | 26 ++-
14 files changed, 221 insertions(+), 149 deletions(-)
--
2.25.1
On Mon, Jan 11, 2021 at 9:34 AM Alan Maguire <alan.maguire(a)oracle.com> wrote:
>
> libbpf already supports a "dumper" API for dumping type information,
> but there is currently no support for dumping typed _data_ via libbpf.
> However this functionality does exist in the kernel, in part to
> facilitate the bpf_snprintf_btf() helper which dumps a string
> representation of the pointer passed in utilizing the BTF type id
> of the data pointed to. For example, the pair of a pointer to
> a "struct sk_buff" and the BTF type id of "struct sk_buff" can be
> used.
>
> Here the kernel code is generalized into btf_show_common.c. For the
> most part, code is identical for userspace and kernel, beyond a few API
> differences and missing functions. The only significant differences are
>
> - the "safe copy" logic used by the kernel to ensure we do not induce a
> crash during BPF operation; and
> - the BTF seq file support that is kernel-only.
>
> The mechanics are to maintain identical btf_show_common.c files in
> kernel/bpf and tools/lib/bpf , and a common header btf_common.h in
> include/linux/ and tools/lib/bpf/. This file duplication seems to
> be the common practice with duplication between kernel and tools/
> so it's the approach taken here.
>
> The common code approach could likely be explored further, but here
> the minimum common code required to support BTF show functionality is
> used.
>
I don't think this approach will work. libbpf and kernel have
considerably different restrictions and styles, I don't think it's
appropriate to take kernel code and try to fit it into libbpf almost
as is, with a bunch of #defines. It would be much cleaner, simpler,
and more maintainable to just re-implement core logic for libbpf, IMO.
> Currently the only "show" function for userspace is to write the
> representation of the typed data to a string via
>
> LIBBPF_API int
> btf__snprintf(struct btf *btf, char *buf, int len, __u32 id, void *obj,
> __u64 flags);
>
> ...but other approaches could be pursued including printf()-based
> show, or even a callback mechanism could be supported to allow
> user-defined show functions.
>
It's strange that you saw btf_dump APIs, and yet decided to go with
this API instead. snprintf() is not a natural "method" of struct btf.
Using char buffer as an output is overly restrictive and inconvenient.
It's appropriate for kernel and BPF program due to their restrictions,
but there is no need to cripple libbpf APIs for that. I think it
should follow btf_dump APIs with custom callback so that it's easy to
just printf() everything, but also user can create whatever elaborate
mechanism they need and that fits their use case.
Code reuse is not the ultimate goal, it should facilitate
maintainability, not harm it. There are times where sharing code
introduces unnecessary coupling and maintainability issues. And I
think this one is a very obvious case of that.
See below a few comments as well. But overall it's really hard to
review such a humongous patch, of course. So I so far just skimmed
through it.
> Here's an example usage, storing a string representation of
> struct sk_buff *skb in buf:
>
> struct btf *btf = libbpf_find_kernel_btf();
> char buf[8192];
> __s32 skb_id;
>
> skb_id = btf__find_by_name_kind(btf, "sk_buff", BTF_KIND_STRUCT);
> if (skb_id < 0)
> fprintf(stderr, "no skbuff, err %d\n", skb_id);
> else
> btf__snprintf(btf, buf, sizeof(buf), skb_id, skb, 0);
>
> Suggested-by: Alexei Starovoitov <ast(a)kernel.org>
> Signed-off-by: Alan Maguire <alan.maguire(a)oracle.com>
> ---
> include/linux/btf.h | 121 +---
> include/linux/btf_common.h | 286 +++++++++
> kernel/bpf/Makefile | 2 +-
> kernel/bpf/arraymap.c | 1 +
> kernel/bpf/bpf_struct_ops.c | 1 +
> kernel/bpf/btf.c | 1215 +-------------------------------------
> kernel/bpf/btf_show_common.c | 1218 +++++++++++++++++++++++++++++++++++++++
> kernel/bpf/core.c | 1 +
> kernel/bpf/hashtab.c | 1 +
> kernel/bpf/local_storage.c | 1 +
> kernel/bpf/verifier.c | 1 +
> kernel/trace/bpf_trace.c | 1 +
> tools/lib/bpf/Build | 2 +-
> tools/lib/bpf/btf.h | 7 +
> tools/lib/bpf/btf_common.h | 286 +++++++++
> tools/lib/bpf/btf_show_common.c | 1218 +++++++++++++++++++++++++++++++++++++++
> tools/lib/bpf/libbpf.map | 1 +
> 17 files changed, 3044 insertions(+), 1319 deletions(-)
> create mode 100644 include/linux/btf_common.h
> create mode 100644 kernel/bpf/btf_show_common.c
> create mode 100644 tools/lib/bpf/btf_common.h
> create mode 100644 tools/lib/bpf/btf_show_common.c
>
[...]
> +/* For kernel u64 is long long unsigned int... */
> +#define FMT64 "ll"
> +
> +#else
> +/* ...while for userspace it is long unsigned int. These definitions avoid
> + * format specifier warnings.
> + */
that's not true, it depends on the architecture
> +#define FMT64 "l"
> +
> +/* libbpf names differ slightly to in-kernel function names. */
> +#define btf_type_by_id btf__type_by_id
> +#define btf_name_by_offset btf__name_by_offset
> +#define btf_str_by_offset btf__str_by_offset
> +#define btf_resolve_size btf__resolve_size
ugh... good luck navigating the code in libbpf....
> +
> +#endif /* __KERNEL__ */
> +/*
> + * Options to control show behaviour.
> + * - BTF_SHOW_COMPACT: no formatting around type information
> + * - BTF_SHOW_NONAME: no struct/union member names/types
> + * - BTF_SHOW_PTR_RAW: show raw (unobfuscated) pointer values;
> + * equivalent to %px.
> + * - BTF_SHOW_ZERO: show zero-valued struct/union members; they
> + * are not displayed by default
> + * - BTF_SHOW_UNSAFE: skip use of bpf_probe_read() to safely read
> + * data before displaying it.
> + */
> +#define BTF_SHOW_COMPACT BTF_F_COMPACT
> +#define BTF_SHOW_NONAME BTF_F_NONAME
> +#define BTF_SHOW_PTR_RAW BTF_F_PTR_RAW
> +#define BTF_SHOW_ZERO BTF_F_ZERO
> +#define BTF_SHOW_UNSAFE (1ULL << 4)
this (or some subset of them) should be done as opts struct's bool
fields for libbpf
> +
> +/*
> + * Copy len bytes of string representation of obj of BTF type_id into buf.
> + *
> + * @btf: struct btf object
> + * @type_id: type id of type obj points to
> + * @obj: pointer to typed data
> + * @buf: buffer to write to
> + * @len: maximum length to write to buf
> + * @flags: show options (see above)
> + *
> + * Return: length that would have been/was copied as per snprintf, or
> + * negative error.
> + */
> +int btf_type_snprintf_show(const struct btf *btf, u32 type_id, void *obj,
> + char *buf, int len, u64 flags);
> +
> +#define for_each_member(i, struct_type, member) \
> + for (i = 0, member = btf_type_member(struct_type); \
> + i < btf_type_vlen(struct_type); \
> + i++, member++)
> +
> +#define for_each_vsi(i, datasec_type, member) \
> + for (i = 0, member = btf_type_var_secinfo(datasec_type); \
> + i < btf_type_vlen(datasec_type); \
> + i++, member++)
> +
> +static inline bool btf_type_is_ptr(const struct btf_type *t)
> +{
> + return BTF_INFO_KIND(t->info) == BTF_KIND_PTR;
> +}
> +
> +static inline bool btf_type_is_int(const struct btf_type *t)
> +{
> + return BTF_INFO_KIND(t->info) == BTF_KIND_INT;
> +}
> +
> +static inline bool btf_type_is_small_int(const struct btf_type *t)
> +{
> + return btf_type_is_int(t) && t->size <= sizeof(u64);
> +}
> +
> +static inline bool btf_type_is_enum(const struct btf_type *t)
> +{
> + return BTF_INFO_KIND(t->info) == BTF_KIND_ENUM;
> +}
> +
> +static inline bool btf_type_is_typedef(const struct btf_type *t)
> +{
> + return BTF_INFO_KIND(t->info) == BTF_KIND_TYPEDEF;
> +}
> +
> +static inline bool btf_type_is_func(const struct btf_type *t)
> +{
> + return BTF_INFO_KIND(t->info) == BTF_KIND_FUNC;
> +}
> +
> +static inline bool btf_type_is_func_proto(const struct btf_type *t)
> +{
> + return BTF_INFO_KIND(t->info) == BTF_KIND_FUNC_PROTO;
> +}
> +
> +static inline bool btf_type_is_var(const struct btf_type *t)
> +{
> + return BTF_INFO_KIND(t->info) == BTF_KIND_VAR;
> +}
> +
> +/* union is only a special case of struct:
> + * all its offsetof(member) == 0
> + */
> +static inline bool btf_type_is_struct(const struct btf_type *t)
> +{
> + u8 kind = BTF_INFO_KIND(t->info);
> +
> + return kind == BTF_KIND_STRUCT || kind == BTF_KIND_UNION;
> +}
> +
> +static inline bool btf_type_is_modifier(const struct btf_type *t)
> +{
> + /* Some of them is not strictly a C modifier
> + * but they are grouped into the same bucket
> + * for BTF concern:
> + * A type (t) that refers to another
> + * type through t->type AND its size cannot
> + * be determined without following the t->type.
> + *
> + * ptr does not fall into this bucket
> + * because its size is always sizeof(void *).
> + */
> + switch (BTF_INFO_KIND(t->info)) {
> + case BTF_KIND_TYPEDEF:
> + case BTF_KIND_VOLATILE:
> + case BTF_KIND_CONST:
> + case BTF_KIND_RESTRICT:
> + return true;
> + default:
> + return false;
> + }
> +}
> +
> +static inline
> +const struct btf_type *btf_type_skip_modifiers(const struct btf *btf,
> + u32 id, u32 *res_id)
> +{
> + const struct btf_type *t = btf_type_by_id(btf, id);
> +
> + while (btf_type_is_modifier(t)) {
> + id = t->type;
> + t = btf_type_by_id(btf, t->type);
> + }
> +
> + if (res_id)
> + *res_id = id;
> +
> + return t;
> +}
> +
> +static inline u32 btf_type_int(const struct btf_type *t)
> +{
> + return *(u32 *)(t + 1);
> +}
> +
> +static inline const struct btf_array *btf_type_array(const struct btf_type *t)
> +{
> + return (const struct btf_array *)(t + 1);
> +}
> +
> +static inline const struct btf_enum *btf_type_enum(const struct btf_type *t)
> +{
> + return (const struct btf_enum *)(t + 1);
> +}
> +
> +static inline const struct btf_var *btf_type_var(const struct btf_type *t)
> +{
> + return (const struct btf_var *)(t + 1);
> +}
> +
> +static inline u16 btf_type_vlen(const struct btf_type *t)
> +{
> + return BTF_INFO_VLEN(t->info);
> +}
> +
> +static inline u16 btf_func_linkage(const struct btf_type *t)
> +{
> + return BTF_INFO_VLEN(t->info);
> +}
> +
> +/* size can be used */
> +static inline bool btf_type_has_size(const struct btf_type *t)
> +{
> + switch (BTF_INFO_KIND(t->info)) {
> + case BTF_KIND_INT:
> + case BTF_KIND_STRUCT:
> + case BTF_KIND_UNION:
> + case BTF_KIND_ENUM:
> + case BTF_KIND_DATASEC:
> + return true;
> + default:
> + return false;
> + }
> +}
> +
> +static inline const struct btf_member *btf_type_member(const struct btf_type *t)
> +{
> + return (const struct btf_member *)(t + 1);
> +}
> +
> +static inline const struct btf_var_secinfo *btf_type_var_secinfo(
> + const struct btf_type *t)
> +{
> + return (const struct btf_var_secinfo *)(t + 1);
> +}
> +
> +static inline const char *__btf_name_by_offset(const struct btf *btf,
> + u32 offset)
> +{
> + const char *name;
> +
> + if (!offset)
> + return "(anon)";
> +
> + name = btf_str_by_offset(btf, offset);
> + return name ?: "(invalid-name-offset)";
> +}
> +
(almost?) all of the above helpers are already defined in libbpf's
btf.h, no need to add all this duplication
> +/* functions shared between btf.c and btf_show_common.c */
> +void btf_type_ops_show(const struct btf *btf, const struct btf_type *t,
> + __u32 type_id, void *obj, u8 bits_offset,
> + struct btf_show *show);
[...]
> diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
> index 1c0fd2d..35bd9dc 100644
> --- a/tools/lib/bpf/libbpf.map
> +++ b/tools/lib/bpf/libbpf.map
> @@ -346,6 +346,7 @@ LIBBPF_0.3.0 {
> btf__parse_split;
> btf__new_empty_split;
> btf__new_split;
> + btf__snprintf;
It's LIBBPF_0.4.0 already, I or someone else should send a patch
adding a new section in .map file.
> ring_buffer__epoll_fd;
> xsk_setup_xdp_prog;
> xsk_socket__update_xskmap;
> --
> 1.8.3.1
>
Commit fadb08e7c750 ("kunit: Support for Parameterized Testing")
introduced support but lacks documentation for how to use it.
This patch builds on commit 1f0e943df68a ("Documentation: kunit: provide
guidance for testing many inputs") to show a minimal example of the new
feature.
Signed-off-by: Daniel Latypov <dlatypov(a)google.com>
---
Documentation/dev-tools/kunit/usage.rst | 57 +++++++++++++++++++++++++
1 file changed, 57 insertions(+)
diff --git a/Documentation/dev-tools/kunit/usage.rst b/Documentation/dev-tools/kunit/usage.rst
index d9fdc14f0677..650f99590df5 100644
--- a/Documentation/dev-tools/kunit/usage.rst
+++ b/Documentation/dev-tools/kunit/usage.rst
@@ -522,6 +522,63 @@ There's more boilerplate involved, but it can:
* E.g. if we wanted to also test ``sha256sum``, we could add a ``sha256``
field and reuse ``cases``.
+* be converted to a "parameterized test", see below.
+
+Parameterized Testing
+~~~~~~~~~~~~~~~~~~~~~
+
+The table-driven testing pattern is common enough that KUnit has special
+support for it.
+
+Reusing the same ``cases`` array from above, we can write the test as a
+"parameterized test" with the following.
+
+.. code-block:: c
+
+ // This is copy-pasted from above.
+ struct sha1_test_case {
+ const char *str;
+ const char *sha1;
+ };
+ struct sha1_test_case cases[] = {
+ {
+ .str = "hello world",
+ .sha1 = "2aae6c35c94fcfb415dbe95f408b9ce91ee846ed",
+ },
+ {
+ .str = "hello world!",
+ .sha1 = "430ce34d020724ed75a196dfc2ad67c77772d169",
+ },
+ };
+
+ // Need a helper function to generate a name for each test case.
+ static void case_to_desc(const struct sha1_test_case *t, char *desc)
+ {
+ strcpy(desc, t->str);
+ }
+ // Creates `sha1_gen_params()` to iterate over `cases`.
+ KUNIT_ARRAY_PARAM(sha1, cases, case_to_desc);
+
+ // Looks no different from a normal test.
+ static void sha1_test(struct kunit *test)
+ {
+ // This function can just contain the body of the for-loop.
+ // The former `cases[i]` is accessible under test->param_value.
+ char out[40];
+ struct sha1_test_case *test_param = (struct sha1_test_case *)(test->param_value);
+
+ sha1sum(test_param->str, out);
+ KUNIT_EXPECT_STREQ_MSG(test, (char *)out, test_param->sha1,
+ "sha1sum(%s)", test_param->str);
+ }
+
+ // Instead of KUNIT_CASE, we use KUNIT_CASE_PARAM and pass in the
+ // function declared by KUNIT_ARRAY_PARAM.
+ static struct kunit_case sha1_test_cases[] = {
+ KUNIT_CASE_PARAM(sha1_test, sha1_gen_params),
+ {}
+ };
+
.. _kunit-on-non-uml:
KUnit on non-UML architectures
base-commit: 5f6b99d0287de2c2d0b5e7abcb0092d553ad804a
--
2.29.2.684.gfbc64c5ab5-goog
Hi Linus,
Please pull the following Kselftest fixes update for Linux 5.11-rc4
This Kselftest fixes update for Linux 5.11-rc4 consists of one single
fix to skip BPF selftests by default. BPF selftests have a hard
dependency on cutting edge versions of tools in the BPF ecosystem
including LLVM.
Skipping BPF allows by default will make it easier for users interested
in running kselftest as a whole. Users can include BPF in Kselftest
build by via SKIP_TARGETS variable.
diff is attached.
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit e71ba9452f0b5b2e8dc8aa5445198cd9214a6a62:
Linux 5.11-rc2 (2021-01-03 15:55:30 -0800)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
tags/linux-kselftest-fixes-5.11-rc4
for you to fetch changes up to 7a6eb7c34a78498742b5f82543b7a68c1c443329:
selftests: Skip BPF seftests by default (2021-01-04 09:27:20 -0700)
----------------------------------------------------------------
linux-kselftest-fixes-5.11-rc4
This Kselftest fixes update for Linux 5.11-rc4 consists of one single
fix to skip BPF selftests by default. BPF selftests have a hard
dependency on cutting edge versions of tools in the BPF ecosystem
including LLVM.
Skipping BPF allows by default will make it easier for users interested
in running kselftest as a whole. Users can include BPF in Kselftest
build by via SKIP_TARGETS variable.
----------------------------------------------------------------
Mark Brown (1):
selftests: Skip BPF seftests by default
tools/testing/selftests/Makefile | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
----------------------------------------------------------------
Add support for pointer to mem register spilling, to allow the verifier
to track pointers to valid memory addresses. Such pointers are returned
for example by a successful call of the bpf_ringbuf_reserve helper.
The patch was partially contributed by CyberArk Software, Inc.
Fixes: 457f44363a88 ("bpf: Implement BPF ring buffer and verifier support for it")
Suggested-by: Yonghong Song <yhs(a)fb.com>
Signed-off-by: Gilad Reti <gilad.reti(a)gmail.com>
---
kernel/bpf/verifier.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 17270b8404f1..36af69fac591 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2217,6 +2217,8 @@ static bool is_spillable_regtype(enum bpf_reg_type type)
case PTR_TO_RDWR_BUF:
case PTR_TO_RDWR_BUF_OR_NULL:
case PTR_TO_PERCPU_BTF_ID:
+ case PTR_TO_MEM:
+ case PTR_TO_MEM_OR_NULL:
return true;
default:
return false;
--
2.27.0
A few fixes for cross-building the sefltests out of tree. This will
enable wider automated testing on various Arm hardware.
Changes since v1 [1]:
* Use wildcard in patch 5
* Move the MAKE_DIRS declaration in patch 1
[1] https://lore.kernel.org/bpf/20210112135959.649075-1-jean-philippe@linaro.or…
Jean-Philippe Brucker (5):
selftests/bpf: Enable cross-building
selftests/bpf: Fix out-of-tree build
selftests/bpf: Move generated test files to $(TEST_GEN_FILES)
selftests/bpf: Fix installation of urandom_read
selftests/bpf: Install btf_dump test cases
tools/testing/selftests/bpf/Makefile | 58 ++++++++++++++++++++--------
1 file changed, 41 insertions(+), 17 deletions(-)
--
2.30.0
Following the build fixes for bpftool [1], here are a few fixes for
cross-building the sefltests out of tree. This will enable wider
automated testing on various Arm hardware.
[1] https://lore.kernel.org/bpf/20201110164310.2600671-1-jean-philippe@linaro.o…
Jean-Philippe Brucker (5):
selftests/bpf: Enable cross-building
selftests/bpf: Fix out-of-tree build
selftests/bpf: Move generated test files to $(TEST_GEN_FILES)
selftests/bpf: Fix installation of urandom_read
selftests/bpf: Install btf_dump test cases
tools/testing/selftests/bpf/Makefile | 61 +++++++++++++++++++++-------
1 file changed, 46 insertions(+), 15 deletions(-)
--
2.30.0
The BPF Type Format (BTF) can be used in conjunction with the helper
bpf_snprintf_btf() to display kernel data with type information.
This series generalizes that support and shares it with libbpf so
that libbpf can display typed data. BTF display functionality is
factored out of kernel/bpf/btf.c into kernel/bpf/btf_show_common.c,
and that file is duplicated in tools/lib/bpf. Similarly, common
definitions and inline functions needed for this support are
extracted into include/linux/btf_common.h and this header is again
duplicated in tools/lib/bpf.
Patch 1 carries out the refactoring, for which no kernel changes
are intended, and introduces btf__snprintf() a libbpf function
that supports dumping a string representation of typed data using
the struct btf * and id associated with that type.
Patch 2 tests btf__snprintf() with built-in and kernel types to
ensure data is of expected format. The test closely mirrors
the BPF program associated with the snprintf_btf.c; in this case
however the string representations are verified in userspace rather
than in BPF program context.
Alan Maguire (2):
bpf: share BTF "show" implementation between kernel and libbpf
selftests/bpf: test libbpf-based type display
include/linux/btf.h | 121 +-
include/linux/btf_common.h | 286 +++++
kernel/bpf/Makefile | 2 +-
kernel/bpf/arraymap.c | 1 +
kernel/bpf/bpf_struct_ops.c | 1 +
kernel/bpf/btf.c | 1215 +------------------
kernel/bpf/btf_show_common.c | 1218 ++++++++++++++++++++
kernel/bpf/core.c | 1 +
kernel/bpf/hashtab.c | 1 +
kernel/bpf/local_storage.c | 1 +
kernel/bpf/verifier.c | 1 +
kernel/trace/bpf_trace.c | 1 +
tools/lib/bpf/Build | 2 +-
tools/lib/bpf/btf.h | 7 +
tools/lib/bpf/btf_common.h | 286 +++++
tools/lib/bpf/btf_show_common.c | 1218 ++++++++++++++++++++
tools/lib/bpf/libbpf.map | 1 +
.../selftests/bpf/prog_tests/snprintf_btf_user.c | 192 +++
18 files changed, 3236 insertions(+), 1319 deletions(-)
create mode 100644 include/linux/btf_common.h
create mode 100644 kernel/bpf/btf_show_common.c
create mode 100644 tools/lib/bpf/btf_common.h
create mode 100644 tools/lib/bpf/btf_show_common.c
create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf_btf_user.c
--
1.8.3.1
If a signed number field starts with a '-' the field width must be > 1,
or unlimited, to allow at least one digit after the '-'.
This patch adds a check for this. If a signed field starts with '-'
and field_width == 1 the scanf will quit.
It is ok for a signed number field to have a field width of 1 if it
starts with a digit. In that case the single digit can be converted.
Signed-off-by: Richard Fitzgerald <rf(a)opensource.cirrus.com>
---
lib/vsprintf.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 14c9a6af1b23..8954ff94a53c 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -3433,8 +3433,12 @@ int vsscanf(const char *buf, const char *fmt, va_list args)
str = skip_spaces(str);
digit = *str;
- if (is_sign && digit == '-')
+ if (is_sign && digit == '-') {
+ if (field_width == 1)
+ break;
+
digit = *(str + 1);
+ }
if (!digit
|| (base == 16 && !isxdigit(digit))
--
2.20.1
Hi Linus,
Please pull the following KUnit fixes update for Linux 5.11-rc3.
This kunit update for Linux 5.11-rc3 consists one fix to force the use
of the 'tty' console for UML. Given that kunit tool requires the console
output, explicitly stating the dependency makes sense than relying on
it being the default.
diff is attached.
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit e71ba9452f0b5b2e8dc8aa5445198cd9214a6a62:
Linux 5.11-rc2 (2021-01-03 15:55:30 -0800)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
tags/linux-kselftest-kunit-fixes-5.11-rc3
for you to fetch changes up to 65a4e5299739abe0888cda0938d21f8ea3b5c606:
kunit: tool: Force the use of the 'tty' console for UML (2021-01-04
09:18:38 -0700)
----------------------------------------------------------------
linux-kselftest-kunit-fixes-5.11-rc3
This kunit update for Linux 5.11-rc3 consists one fix to force the use
of the 'tty' console for UML. Given that kunit tool requires the console
output, explicitly stating the dependency makes sense than relying on
it being the default.
----------------------------------------------------------------
David Gow (1):
kunit: tool: Force the use of the 'tty' console for UML
tools/testing/kunit/kunit_kernel.py | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------
Hi Linus,
Please pull the following fixes update for Linux 5.11-rc3.
This fixes update for 5.11-rc3 consists of two minor fixes to vDSO test
changes in 5.11-rc1 update.
diff is attached.
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit e71ba9452f0b5b2e8dc8aa5445198cd9214a6a62:
Linux 5.11-rc2 (2021-01-03 15:55:30 -0800)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
tags/linux-kselftest-next-5.11-rc3
for you to fetch changes up to df00d02989024d193a6efd1a85513a5658c6a10f:
selftests/vDSO: fix -Wformat warning in vdso_test_correctness
(2021-01-04 09:25:45 -0700)
----------------------------------------------------------------
linux-kselftest-next-5.11-rc3
This fixes update for 5.11-rc3 consists of two minor fixes to vDSO test
changes in 5.11-rc1 update.
----------------------------------------------------------------
Tobias Klauser (2):
selftests/vDSO: add additional binaries to .gitignore
selftests/vDSO: fix -Wformat warning in vdso_test_correctness
tools/testing/selftests/vDSO/.gitignore | 3 +++
tools/testing/selftests/vDSO/vdso_test_correctness.c | 2 +-
2 files changed, 4 insertions(+), 1 deletion(-)
----------------------------------------------------------------
kunit_tool relies on the UML console outputting printk() output to the
tty in order to get results. Since the default console driver could
change, pass 'console=tty' to the kernel.
This is triggered by a change[1] to use ttynull as a fallback console
driver which -- by chance or by design -- seems to have changed the
default console output on UML, breaking kunit_tool. While this may be
fixed, we should be less fragile to such changes in the default.
[1]:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
Signed-off-by: David Gow <davidgow(a)google.com>
Fixes: 757055ae8ded ("init/console: Use ttynull as a fallback when there is no console")
---
tools/testing/kunit/kunit_kernel.py | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/kunit/kunit_kernel.py b/tools/testing/kunit/kunit_kernel.py
index 57c1724b7e5d..698358c9c0d6 100644
--- a/tools/testing/kunit/kunit_kernel.py
+++ b/tools/testing/kunit/kunit_kernel.py
@@ -198,7 +198,7 @@ class LinuxSourceTree(object):
return self.validate_config(build_dir)
def run_kernel(self, args=[], build_dir='', timeout=None):
- args.extend(['mem=1G'])
+ args.extend(['mem=1G', 'console=tty'])
self._ops.linux_bin(args, timeout, build_dir)
outfile = get_outfile_path(build_dir)
subprocess.call(['stty', 'sane'])
--
2.29.2.729.g45daf8777d-goog
Hi!
For last few weeks KUnit stopped working. Any insight?
P.S. I guess no need to tell that my kernel on which I run tests has not been
changed as well as command line for wrapper:
tools/testing/kunit/kunit.py run --build_dir ~/$OUT_DIR
--
With Best Regards,
Andy Shevchenko
Initially I just wanted to port the selftests to the latest GPIO uAPI,
but on finding that, due to dependency issues, the selftests are not built
for the buildroot environments that I do most of my GPIO testing in, I
decided to take a closer look.
The first patch is essentially a rewrite of the exising test suite.
It uses a simplified abstraction of the uAPI interfaces to allow a common
test suite to test the gpio-mockup using either of the uAPI interfaces.
The simplified cdev interface is implemented in gpio-mockup.sh, with the
actual driving of the uAPI implemented in gpio-mockup-cdev.c.
The simplified sysfs interface replaces gpio-mockup-sysfs.sh and is
loaded over the cdev implementation when selected.
The new tests should also be simpler to extend to cover new mockup
interfaces, such as the one Bart has been working on.
I have dropped support for testing modules other than gpio-mockup from
the command line options, as the tests are very gpio-mockup specific so
I didn't see any calling for it.
I have also tried to emphasise in the test output that the tests are
covering the gpio-mockup itself. They do perform some implicit testing
of gpiolib and the uAPI interfaces, and so can be useful as smoke tests
for those, but their primary focus is the gpio-mockup.
Patches 2 through 5 do some cleaning up that is now possible with the
new implementation, including enabling building in buildroot environments.
Patch 4 doesn't strictly clean up all the old gpio references that it
could - the gpio was the only Level 1 test, so the Level 1 tests could
potentially be removed, but I was unsure if there may be other
implications to removing a whole test level, or that it may be useful
as a placeholder in case other static LDLIBS tests are added in
the future??
Patch 6 finally gets around to porting the tests to the latest GPIO uAPI.
And Patch 7 updates the config to set the CONFIG_GPIO_CDEV option that
was added in v5.10.
Cheers,
Kent.
Kent Gibson (7):
selftests: gpio: rework and simplify test implementation
selftests: gpio: remove obsolete gpio-mockup-chardev.c
selftests: remove obsolete build restriction for gpio
selftests: remove obsolete gpio references from kselftest_deps.sh
tools: gpio: remove uAPI v1 code no longer used by selftests
selftests: gpio: port to GPIO uAPI v2
selftests: gpio: add CONFIG_GPIO_CDEV to config
tools/gpio/gpio-utils.c | 89 ----
tools/gpio/gpio-utils.h | 6 -
tools/testing/selftests/Makefile | 9 -
tools/testing/selftests/gpio/Makefile | 26 +-
tools/testing/selftests/gpio/config | 1 +
.../testing/selftests/gpio/gpio-mockup-cdev.c | 198 ++++++++
.../selftests/gpio/gpio-mockup-chardev.c | 323 ------------
.../selftests/gpio/gpio-mockup-sysfs.sh | 168 ++-----
tools/testing/selftests/gpio/gpio-mockup.sh | 469 ++++++++++++------
tools/testing/selftests/kselftest_deps.sh | 4 +-
10 files changed, 573 insertions(+), 720 deletions(-)
create mode 100644 tools/testing/selftests/gpio/gpio-mockup-cdev.c
delete mode 100644 tools/testing/selftests/gpio/gpio-mockup-chardev.c
--
2.30.0
When running this xfrm_policy.sh test script, even with some cases
marked as FAIL, the overall test result will still be PASS:
$ sudo ./xfrm_policy.sh
PASS: policy before exception matches
FAIL: expected ping to .254 to fail (exceptions)
PASS: direct policy matches (exceptions)
PASS: policy matches (exceptions)
FAIL: expected ping to .254 to fail (exceptions and block policies)
PASS: direct policy matches (exceptions and block policies)
PASS: policy matches (exceptions and block policies)
FAIL: expected ping to .254 to fail (exceptions and block policies after hresh changes)
PASS: direct policy matches (exceptions and block policies after hresh changes)
PASS: policy matches (exceptions and block policies after hresh changes)
FAIL: expected ping to .254 to fail (exceptions and block policies after hthresh change in ns3)
PASS: direct policy matches (exceptions and block policies after hthresh change in ns3)
PASS: policy matches (exceptions and block policies after hthresh change in ns3)
FAIL: expected ping to .254 to fail (exceptions and block policies after htresh change to normal)
PASS: direct policy matches (exceptions and block policies after htresh change to normal)
PASS: policy matches (exceptions and block policies after htresh change to normal)
PASS: policies with repeated htresh change
$ echo $?
0
This is because the $lret in check_xfrm() is not a local variable.
Therefore when a test failed in check_exceptions(), the non-zero $lret
will later get reset to 0 when the next test calls check_xfrm().
With this fix, the final return value will be 1. Make it easier for
testers to spot this failure.
Fixes: 39aa6928d462d0 ("xfrm: policy: fix netlink/pf_key policy lookups")
Signed-off-by: Po-Hsu Lin <po-hsu.lin(a)canonical.com>
---
tools/testing/selftests/net/xfrm_policy.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/xfrm_policy.sh b/tools/testing/selftests/net/xfrm_policy.sh
index 7a1bf94..5922941 100755
--- a/tools/testing/selftests/net/xfrm_policy.sh
+++ b/tools/testing/selftests/net/xfrm_policy.sh
@@ -202,7 +202,7 @@ check_xfrm() {
# 1: iptables -m policy rule count != 0
rval=$1
ip=$2
- lret=0
+ local lret=0
ip netns exec ns1 ping -q -c 1 10.0.2.$ip > /dev/null
--
2.7.4
On Sat, Jan 02, 2021 at 03:52:32PM +0200, Andy Shevchenko wrote:
> On Saturday, January 2, 2021, Kent Gibson <warthog618(a)gmail.com> wrote:
>
> > The GPIO mockup selftests are overly complicated with separate
> > implementations of the tests for sysfs and cdev uAPI, and with the cdev
> > implementation being dependent on tools/gpio and libmount.
> >
> > Rework the test implementation to provide a common test suite with a
> > simplified pluggable uAPI interface. The cdev implementation utilises
> > the GPIO uAPI directly to remove the dependence on tools/gpio.
> > The simplified uAPI interface removes the need for any file system mount
> > checks in C, and so removes the dependence on libmount.
> >
> > The rework also fixes the sysfs test implementation which has been broken
> > since the device created in the multiple gpiochip case was split into
> > separate devices.
> >
> >
>
> I briefly looked at code in shell below... there are places to improve
> (useless use of: cat, test, negation, etc).
>
My shell is clearly pretty poor, so I would really appreciate a pointer
to an example of each, and I'll then hunt down the rest.
Thanks,
Kent.
From: Ira Weiny <ira.weiny(a)intel.com>
Changes from V2 [4]
Rebased on tip-tree/core/entry
From Thomas Gleixner
Address bisectability
Drop Patch:
x86/entry: Move nmi entry/exit into common code
From Greg KH
Remove WARN_ON's
From Dan Williams
Add __must_check to pks_key_alloc()
New patch: x86/pks: Add PKS defines and config options
Split from Enable patch to build on through the series
Fix compile errors
Changes from V1
Rebase to TIP master; resolve conflicts and test
Clean up some kernel docs updates missed in V1
Add irqentry_state_t kernel doc for PKRS field
Removed redundant irq_state->pkrs
This is only needed when we add the global state and somehow
ended up in this patch series. That will come back when we add
the global functionality in.
From Thomas Gleixner
Update commit messages
Add kernel doc for struct irqentry_state_t
From Dave Hansen add flags to pks_key_alloc()
Changes from RFC V3[3]
Rebase to TIP master
Update test error output
Standardize on 'irq_state' for state variables
From Dave Hansen
Update commit messages
Add/clean up comments
Add X86_FEATURE_PKS to disabled-features.h and remove some
explicit CONFIG checks
Move saved_pkrs member of thread_struct
Remove superfluous preempt_disable()
s/irq_save_pks/irq_save_set_pks/
Ensure PKRS is not seen in faults if not configured or not
supported
s/pks_mknoaccess/pks_mk_noaccess/
s/pks_mkread/pks_mk_readonly/
s/pks_mkrdwr/pks_mk_readwrite/
Change pks_key_alloc return to -EOPNOTSUPP when not supported
From Peter Zijlstra
Clean up Attribution
Remove superfluous preempt_disable()
Add union to differentiate exit_rcu/lockdep use in
irqentry_state_t
From Thomas Gleixner
Add preliminary clean up patch and adjust series as needed
Introduce a new page protection mechanism for supervisor pages, Protection Key
Supervisor (PKS).
2 use cases for PKS are being developed, trusted keys and PMEM. Trusted keys
is a newer use case which is still being explored. PMEM was submitted as part
of the RFC (v2) series[1]. However, since then it was found that some callers
of kmap() require a global implementation of PKS. Specifically some users of
kmap() expect mappings to be available to all kernel threads. While global use
of PKS is rare it needs to be included for correctness. Unfortunately the
kmap() updates required a large patch series to make the needed changes at the
various kmap() call sites so that patch set has been split out. Because the
global PKS feature is only required for that use case it will be deferred to
that set as well.[2] This patch set is being submitted as a precursor to both
of the use cases.
For an overview of the entire PKS ecosystem, a git tree including this series
and 2 proposed use cases can be found here:
https://lore.kernel.org/lkml/20201009195033.3208459-1-ira.weiny@intel.com/https://lore.kernel.org/lkml/20201009201410.3209180-1-ira.weiny@intel.com/
PKS enables protections on 'domains' of supervisor pages to limit supervisor
mode access to those pages beyond the normal paging protections. PKS works in
a similar fashion to user space pkeys, PKU. As with PKU, supervisor pkeys are
checked in addition to normal paging protections and Access or Writes can be
disabled via a MSR update without TLB flushes when permissions change. Also
like PKU, a page mapping is assigned to a domain by setting pkey bits in the
page table entry for that mapping.
Access is controlled through a PKRS register which is updated via WRMSR/RDMSR.
XSAVE is not supported for the PKRS MSR. Therefore the implementation
saves/restores the MSR across context switches and during exceptions. Nested
exceptions are supported by each exception getting a new PKS state.
For consistent behavior with current paging protections, pkey 0 is reserved and
configured to allow full access via the pkey mechanism, thus preserving the
default paging protections on mappings with the default pkey value of 0.
Other keys, (1-15) are allocated by an allocator which prepares us for key
contention from day one. Kernel users should be prepared for the allocator to
fail either because of key exhaustion or due to PKS not being supported on the
arch and/or CPU instance.
The following are key attributes of PKS.
1) Fast switching of permissions
1a) Prevents access without page table manipulations
1b) No TLB flushes required
2) Works on a per thread basis
PKS is available with 4 and 5 level paging. Like PKRU it consumes 4 bits from
the PTE to store the pkey within the entry.
[1] https://lore.kernel.org/lkml/20200717072056.73134-1-ira.weiny@intel.com/
[2] https://lore.kernel.org/lkml/20201009195033.3208459-2-ira.weiny@intel.com/
[3] https://lore.kernel.org/lkml/20201009194258.3207172-1-ira.weiny@intel.com/
[4] https://lore.kernel.org/lkml/20201102205320.1458656-1-ira.weiny@intel.com/
Fenghua Yu (2):
x86/pks: Add PKS kernel API
x86/pks: Enable Protection Keys Supervisor (PKS)
Ira Weiny (8):
x86/pkeys: Create pkeys_common.h
x86/fpu: Refactor arch_set_user_pkey_access() for PKS support
x86/pks: Add PKS defines and Kconfig options
x86/pks: Preserve the PKRS MSR on context switch
x86/entry: Pass irqentry_state_t by reference
x86/entry: Preserve PKRS MSR across exceptions
x86/fault: Report the PKRS state on fault
x86/pks: Add PKS test code
Documentation/core-api/protection-keys.rst | 103 ++-
arch/x86/Kconfig | 1 +
arch/x86/entry/common.c | 46 +-
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/disabled-features.h | 8 +-
arch/x86/include/asm/idtentry.h | 25 +-
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/include/asm/pgtable.h | 13 +-
arch/x86/include/asm/pgtable_types.h | 12 +
arch/x86/include/asm/pkeys.h | 15 +
arch/x86/include/asm/pkeys_common.h | 40 ++
arch/x86/include/asm/processor.h | 18 +-
arch/x86/include/uapi/asm/processor-flags.h | 2 +
arch/x86/kernel/cpu/common.c | 15 +
arch/x86/kernel/cpu/mce/core.c | 4 +-
arch/x86/kernel/fpu/xstate.c | 22 +-
arch/x86/kernel/kvm.c | 6 +-
arch/x86/kernel/nmi.c | 4 +-
arch/x86/kernel/process.c | 26 +
arch/x86/kernel/traps.c | 21 +-
arch/x86/mm/fault.c | 87 ++-
arch/x86/mm/pkeys.c | 196 +++++-
include/linux/entry-common.h | 31 +-
include/linux/pgtable.h | 4 +
include/linux/pkeys.h | 24 +
kernel/entry/common.c | 44 +-
lib/Kconfig.debug | 12 +
lib/Makefile | 3 +
lib/pks/Makefile | 3 +
lib/pks/pks_test.c | 692 ++++++++++++++++++++
mm/Kconfig | 2 +
tools/testing/selftests/x86/Makefile | 3 +-
tools/testing/selftests/x86/test_pks.c | 66 ++
33 files changed, 1410 insertions(+), 140 deletions(-)
create mode 100644 arch/x86/include/asm/pkeys_common.h
create mode 100644 lib/pks/Makefile
create mode 100644 lib/pks/pks_test.c
create mode 100644 tools/testing/selftests/x86/test_pks.c
--
2.28.0.rc0.12.gb6a658bd00c9
This patch set contains dm-user, a device mapper target that proxies incoming
BIOs to userspace via a misc device. Essentially it's FUSE, but for block
devices. There's more information in the documentation patch and as a handful
of commends, so I'm just going to avoid duplicating that here. I don't really
think there's any fundamental functionality that dm-user enables, as one could
use something along the lines of nbd/iscsi, but dm-user does result in
extremely simple userspace daemons -- so simple that when I tried to write a
helper userspace library for dm-user I just ended up with nothing.
I talked about this a bit at Plumbers and was hoping to send patches a bit
earlier on in the process, but got tied up with a few things. As a result this
is actually quite far along: it's at the point where we're starting to run this
on real devices as part of an updated Android OTA update flow, where we're
using this to provide an Android-specific compressed backing store for
dm-snap-persistent. The bulk of that project is scattered throughout the
various Android trees, so there are kselftests and a (somewhat bare for now)
Documentation entry with the intent of making this a self-contained
contribution. There's a lot to the Android userspace daemon, but it doesn't
interact with dm-user in a very complex manner.
This is still in a somewhat early stage, but it's at the point where things
largely function. I'm certainly not ready to commit to the user ABI
implemented here and there are a bunch of FIXMEs scattered throughout the code,
but I do think that it's far along enough to begin a more concrete discussion
of where folks would like to go with something like this. While I'd intending
on sorting that stuff out, I'd like to at least get a feel for whether this is
a path worth pursuing before spending a bunch more time on it.
I haven't done much in the way of performance analysis for dm-user. Earlier on
I did some simple throughput tests and found that dm-user/ext4 was faster than
half the speed of tmpfs, which is way out of the realm of being an issue for
our use case (decompressing blocks out of a phone's storage). The design of
dm-user does preclude an extremely high performance implementation, where I
assume one would want an explicit ring buffer and zero copy, but I feel like
users who want that degree of performance are probably better served writing a
proper kernel driver. I wouldn't be opposed to pushing on performance (ideally
without a major design change), but for now I feel like time is better spent
fortifying the user ABI and fixing the various issues with the implementation.
The patches follow as usual, but in case it's easier I've published a tree as
well:
git://git.kernel.org/pub/scm/linux/kernel/git/palmer/dm-user.git -b dm-user-v1
From: Andy Lutomirski <luto(a)kernel.org>
[ Upstream commit 716572b0003ef67a4889bd7d85baf5099c5a0248 ]
Setting GS to 1, 2, or 3 causes a nonsensical part of the IRET microcode
to change GS back to zero on a return from kernel mode to user mode. The
result is that these tests fail randomly depending on when interrupts
happen. Detect when this happens and let the test pass.
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Link: https://lkml.kernel.org/r/7567fd44a1d60a9424f25b19a998f12149993b0d.16043465…
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/x86/fsgsbase.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/x86/fsgsbase.c b/tools/testing/selftests/x86/fsgsbase.c
index f249e042b3b51..026cd644360f6 100644
--- a/tools/testing/selftests/x86/fsgsbase.c
+++ b/tools/testing/selftests/x86/fsgsbase.c
@@ -318,8 +318,8 @@ static void set_gs_and_switch_to(unsigned long local,
local = read_base(GS);
/*
- * Signal delivery seems to mess up weird selectors. Put it
- * back.
+ * Signal delivery is quite likely to change a selector
+ * of 1, 2, or 3 back to 0 due to IRET being defective.
*/
asm volatile ("mov %0, %%gs" : : "rm" (force_sel));
} else {
@@ -337,6 +337,14 @@ static void set_gs_and_switch_to(unsigned long local,
if (base == local && sel_pre_sched == sel_post_sched) {
printf("[OK]\tGS/BASE remained 0x%hx/0x%lx\n",
sel_pre_sched, local);
+ } else if (base == local && sel_pre_sched >= 1 && sel_pre_sched <= 3 &&
+ sel_post_sched == 0) {
+ /*
+ * IRET is misdesigned and will squash selectors 1, 2, or 3
+ * to zero. Don't fail the test just because this happened.
+ */
+ printf("[OK]\tGS/BASE changed from 0x%hx/0x%lx to 0x%hx/0x%lx because IRET is defective\n",
+ sel_pre_sched, local, sel_post_sched, base);
} else {
nerrs++;
printf("[FAIL]\tGS/BASE changed from 0x%hx/0x%lx to 0x%hx/0x%lx\n",
--
2.27.0
From: Andy Lutomirski <luto(a)kernel.org>
[ Upstream commit 716572b0003ef67a4889bd7d85baf5099c5a0248 ]
Setting GS to 1, 2, or 3 causes a nonsensical part of the IRET microcode
to change GS back to zero on a return from kernel mode to user mode. The
result is that these tests fail randomly depending on when interrupts
happen. Detect when this happens and let the test pass.
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Link: https://lkml.kernel.org/r/7567fd44a1d60a9424f25b19a998f12149993b0d.16043465…
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/x86/fsgsbase.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/x86/fsgsbase.c b/tools/testing/selftests/x86/fsgsbase.c
index f249e042b3b51..026cd644360f6 100644
--- a/tools/testing/selftests/x86/fsgsbase.c
+++ b/tools/testing/selftests/x86/fsgsbase.c
@@ -318,8 +318,8 @@ static void set_gs_and_switch_to(unsigned long local,
local = read_base(GS);
/*
- * Signal delivery seems to mess up weird selectors. Put it
- * back.
+ * Signal delivery is quite likely to change a selector
+ * of 1, 2, or 3 back to 0 due to IRET being defective.
*/
asm volatile ("mov %0, %%gs" : : "rm" (force_sel));
} else {
@@ -337,6 +337,14 @@ static void set_gs_and_switch_to(unsigned long local,
if (base == local && sel_pre_sched == sel_post_sched) {
printf("[OK]\tGS/BASE remained 0x%hx/0x%lx\n",
sel_pre_sched, local);
+ } else if (base == local && sel_pre_sched >= 1 && sel_pre_sched <= 3 &&
+ sel_post_sched == 0) {
+ /*
+ * IRET is misdesigned and will squash selectors 1, 2, or 3
+ * to zero. Don't fail the test just because this happened.
+ */
+ printf("[OK]\tGS/BASE changed from 0x%hx/0x%lx to 0x%hx/0x%lx because IRET is defective\n",
+ sel_pre_sched, local, sel_post_sched, base);
} else {
nerrs++;
printf("[FAIL]\tGS/BASE changed from 0x%hx/0x%lx to 0x%hx/0x%lx\n",
--
2.27.0
From: Andy Lutomirski <luto(a)kernel.org>
[ Upstream commit 716572b0003ef67a4889bd7d85baf5099c5a0248 ]
Setting GS to 1, 2, or 3 causes a nonsensical part of the IRET microcode
to change GS back to zero on a return from kernel mode to user mode. The
result is that these tests fail randomly depending on when interrupts
happen. Detect when this happens and let the test pass.
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Link: https://lkml.kernel.org/r/7567fd44a1d60a9424f25b19a998f12149993b0d.16043465…
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/x86/fsgsbase.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/x86/fsgsbase.c b/tools/testing/selftests/x86/fsgsbase.c
index f249e042b3b51..026cd644360f6 100644
--- a/tools/testing/selftests/x86/fsgsbase.c
+++ b/tools/testing/selftests/x86/fsgsbase.c
@@ -318,8 +318,8 @@ static void set_gs_and_switch_to(unsigned long local,
local = read_base(GS);
/*
- * Signal delivery seems to mess up weird selectors. Put it
- * back.
+ * Signal delivery is quite likely to change a selector
+ * of 1, 2, or 3 back to 0 due to IRET being defective.
*/
asm volatile ("mov %0, %%gs" : : "rm" (force_sel));
} else {
@@ -337,6 +337,14 @@ static void set_gs_and_switch_to(unsigned long local,
if (base == local && sel_pre_sched == sel_post_sched) {
printf("[OK]\tGS/BASE remained 0x%hx/0x%lx\n",
sel_pre_sched, local);
+ } else if (base == local && sel_pre_sched >= 1 && sel_pre_sched <= 3 &&
+ sel_post_sched == 0) {
+ /*
+ * IRET is misdesigned and will squash selectors 1, 2, or 3
+ * to zero. Don't fail the test just because this happened.
+ */
+ printf("[OK]\tGS/BASE changed from 0x%hx/0x%lx to 0x%hx/0x%lx because IRET is defective\n",
+ sel_pre_sched, local, sel_post_sched, base);
} else {
nerrs++;
printf("[FAIL]\tGS/BASE changed from 0x%hx/0x%lx to 0x%hx/0x%lx\n",
--
2.27.0
From: Andy Lutomirski <luto(a)kernel.org>
[ Upstream commit 716572b0003ef67a4889bd7d85baf5099c5a0248 ]
Setting GS to 1, 2, or 3 causes a nonsensical part of the IRET microcode
to change GS back to zero on a return from kernel mode to user mode. The
result is that these tests fail randomly depending on when interrupts
happen. Detect when this happens and let the test pass.
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Link: https://lkml.kernel.org/r/7567fd44a1d60a9424f25b19a998f12149993b0d.16043465…
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/x86/fsgsbase.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/x86/fsgsbase.c b/tools/testing/selftests/x86/fsgsbase.c
index 757bdb218a661..f2916838a7eb5 100644
--- a/tools/testing/selftests/x86/fsgsbase.c
+++ b/tools/testing/selftests/x86/fsgsbase.c
@@ -391,8 +391,8 @@ static void set_gs_and_switch_to(unsigned long local,
local = read_base(GS);
/*
- * Signal delivery seems to mess up weird selectors. Put it
- * back.
+ * Signal delivery is quite likely to change a selector
+ * of 1, 2, or 3 back to 0 due to IRET being defective.
*/
asm volatile ("mov %0, %%gs" : : "rm" (force_sel));
} else {
@@ -410,6 +410,14 @@ static void set_gs_and_switch_to(unsigned long local,
if (base == local && sel_pre_sched == sel_post_sched) {
printf("[OK]\tGS/BASE remained 0x%hx/0x%lx\n",
sel_pre_sched, local);
+ } else if (base == local && sel_pre_sched >= 1 && sel_pre_sched <= 3 &&
+ sel_post_sched == 0) {
+ /*
+ * IRET is misdesigned and will squash selectors 1, 2, or 3
+ * to zero. Don't fail the test just because this happened.
+ */
+ printf("[OK]\tGS/BASE changed from 0x%hx/0x%lx to 0x%hx/0x%lx because IRET is defective\n",
+ sel_pre_sched, local, sel_post_sched, base);
} else {
nerrs++;
printf("[FAIL]\tGS/BASE changed from 0x%hx/0x%lx to 0x%hx/0x%lx\n",
--
2.27.0
From: Jean-Philippe Brucker <jean-philippe(a)linaro.org>
[ Upstream commit 77ce220c0549dcc3db8226c61c60e83fc59dfafc ]
The test fails because of a recent fix to the verifier, even though this
program is valid. In details what happens is:
7: (61) r1 = *(u32 *)(r0 +0)
Load a 32-bit value, with signed bounds [S32_MIN, S32_MAX]. The bounds
of the 64-bit value are [0, U32_MAX]...
8: (65) if r1 s> 0xffffffff goto pc+1
... therefore this is always true (the operand is sign-extended).
10: (b4) w2 = 11
11: (6d) if r2 s> r1 goto pc+1
When true, the 64-bit bounds become [0, 10]. The 32-bit bounds are still
[S32_MIN, 10].
13: (64) w1 <<= 2
Because this is a 32-bit operation, the verifier propagates the new
32-bit bounds to the 64-bit ones, and the knowledge gained from insn 11
is lost.
14: (0f) r0 += r1
15: (7a) *(u64 *)(r0 +0) = 4
Then the verifier considers r0 unbounded here, rejecting the test. To
make the test work, change insn 8 to check the sign of the 32-bit value.
Signed-off-by: Jean-Philippe Brucker <jean-philippe(a)linaro.org>
Acked-by: John Fastabend <john.fastabend(a)gmail.com>
Signed-off-by: Alexei Starovoitov <ast(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/bpf/verifier/array_access.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/verifier/array_access.c b/tools/testing/selftests/bpf/verifier/array_access.c
index f3c33e128709b..a80d806ead15f 100644
--- a/tools/testing/selftests/bpf/verifier/array_access.c
+++ b/tools/testing/selftests/bpf/verifier/array_access.c
@@ -68,7 +68,7 @@
BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 9),
BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, 0),
- BPF_JMP_IMM(BPF_JSGT, BPF_REG_1, 0xffffffff, 1),
+ BPF_JMP32_IMM(BPF_JSGT, BPF_REG_1, 0xffffffff, 1),
BPF_MOV32_IMM(BPF_REG_1, 0),
BPF_MOV32_IMM(BPF_REG_2, MAX_ENTRIES),
BPF_JMP_REG(BPF_JSGT, BPF_REG_2, BPF_REG_1, 1),
--
2.27.0