Resending as the first set got mangled with smtp error.
This is part of the effort to remove the empty element of the ctl_table
structures (used to calculate size) and replace it with an ARRAY_SIZE call. By
replacing the child element in struct ctl_table with a flags element we make
sure that there are no forward recursions on child nodes and therefore set
ourselves up for just using an ARRAY_SIZE. We also added some self tests to
make sure that we do not break anything.
Patchset is separated in 4: parport fixes, selftests fixes, selftests additions and
replacement of child element. Tested everything with sysctl self tests and everything
seems "ok".
1. parport fixes: @mcgrof: this is related to my previous series and it plugs a
sysct table leak in the parport driver. Please tell me if you want me to repost
the parport series with this one stiched in.
2. Selftests fixes: Remove the prefixed zeros when passing a awk field to the
awk print command because it was causing $0009 to be interpreted as $0.
Replaced continue with return in sysctl.sh(test_case) so the test actually
gets skipped. The skip decision is now in sysctl.sh(skip_test).
3. Selftest additions: New test to confirm that unregister actually removes
targets. New test to confirm that permanently empty targets are indeed
created and that no other targets can be created "on top".
4. Replaced the child pointer in struct ctl_table with a u8 flag. The flag
is used to differentiate between permanently empty targets and non-empty ones.
Comments/feedback greatly appreciated
Best
Joel
Joel Granados (8):
parport: plug a sysctl register leak
test_sysctl: Fix test metadata getters
test_sysctl: Group node sysctl test under one func
test_sysctl: Add an unregister sysctl test
test_sysctl: Add an option to prevent test skip
test_sysclt: Test for registering a mount point
sysctl: Remove debugging dump_stack
sysctl: replace child with a flags var
drivers/parport/procfs.c | 23 ++---
fs/proc/proc_sysctl.c | 82 ++++------------
include/linux/sysctl.h | 4 +-
lib/test_sysctl.c | 91 ++++++++++++++++--
tools/testing/selftests/sysctl/sysctl.sh | 115 +++++++++++++++++------
5 files changed, 204 insertions(+), 111 deletions(-)
--
2.30.2
From: Maxim Mikityanskiy <maxim(a)isovalent.com>
See the details in the commit message (TL/DR: under CAP_BPF, the
verifier can be fooled to think that a scalar is zero while in fact it's
your predefined number.)
v1 and v2 were sent off-list.
v2 changes:
Added more tests, migrated them to inline asm, started using
bpf_get_prandom_u32, switched to a more bulletproof dead branch check
and modified the failing spill test scenarios so that an unauthorized
access attempt is performed in both branches.
v3 changes:
Dropped an improvement not necessary for the fix, changed the Fixes tag.
Maxim Mikityanskiy (2):
bpf: Fix verifier tracking scalars on spill
selftests/bpf: Add test cases to assert proper ID tracking on spill
kernel/bpf/verifier.c | 7 +
.../selftests/bpf/progs/verifier_spill_fill.c | 198 ++++++++++++++++++
2 files changed, 205 insertions(+)
--
2.40.1
Let's add some selftests to make sure that:
* R/O long-term pinning always works of file mappings
* R/W long-term pinning always works in MAP_PRIVATE file mappings
* R/W long-term pinning only works in MAP_SHARED mappings with special
filesystems (shmem, hugetlb) and fails with other filesystems (ext4, btrfs,
xfs).
The tests make use of the gup_test kernel module to trigger ordinary GUP
and GUP-fast, and liburing (similar to our COW selftests). Test with memfd,
memfd hugetlb, tmpfile() and mkstemp(). The latter usually gives us a
"real" filesystem (ext4, btrfs, xfs) where long-term pinning is
expected to fail.
Note that these selftests don't contain any actual reproducers for data
corruptions in case R/W long-term pinning on problematic filesystems
"would" work.
Maybe we can later come up with a racy !FOLL_LONGTERM reproducer that can
reuse an existing interface to trigger short-term pinning (I'll look into
that next).
On current mm/mm-unstable:
# ./gup_longterm
# [INFO] detected hugetlb page size: 2048 KiB
# [INFO] detected hugetlb page size: 1048576 KiB
TAP version 13
1..50
# [RUN] R/W longterm GUP pin in MAP_SHARED file mapping ... with memfd
ok 1 Should have worked
# [RUN] R/W longterm GUP pin in MAP_SHARED file mapping ... with tmpfile
ok 2 Should have worked
# [RUN] R/W longterm GUP pin in MAP_SHARED file mapping ... with local tmpfile
ok 3 Should have failed
# [RUN] R/W longterm GUP pin in MAP_SHARED file mapping ... with memfd hugetlb (2048 kB)
ok 4 Should have worked
# [RUN] R/W longterm GUP pin in MAP_SHARED file mapping ... with memfd hugetlb (1048576 kB)
ok 5 Should have worked
# [RUN] R/W longterm GUP-fast pin in MAP_SHARED file mapping ... with memfd
ok 6 Should have worked
# [RUN] R/W longterm GUP-fast pin in MAP_SHARED file mapping ... with tmpfile
ok 7 Should have worked
# [RUN] R/W longterm GUP-fast pin in MAP_SHARED file mapping ... with local tmpfile
ok 8 Should have failed
# [RUN] R/W longterm GUP-fast pin in MAP_SHARED file mapping ... with memfd hugetlb (2048 kB)
ok 9 Should have worked
# [RUN] R/W longterm GUP-fast pin in MAP_SHARED file mapping ... with memfd hugetlb (1048576 kB)
ok 10 Should have worked
# [RUN] R/O longterm GUP pin in MAP_SHARED file mapping ... with memfd
ok 11 Should have worked
# [RUN] R/O longterm GUP pin in MAP_SHARED file mapping ... with tmpfile
ok 12 Should have worked
# [RUN] R/O longterm GUP pin in MAP_SHARED file mapping ... with local tmpfile
ok 13 Should have worked
# [RUN] R/O longterm GUP pin in MAP_SHARED file mapping ... with memfd hugetlb (2048 kB)
ok 14 Should have worked
# [RUN] R/O longterm GUP pin in MAP_SHARED file mapping ... with memfd hugetlb (1048576 kB)
ok 15 Should have worked
# [RUN] R/O longterm GUP-fast pin in MAP_SHARED file mapping ... with memfd
ok 16 Should have worked
# [RUN] R/O longterm GUP-fast pin in MAP_SHARED file mapping ... with tmpfile
ok 17 Should have worked
# [RUN] R/O longterm GUP-fast pin in MAP_SHARED file mapping ... with local tmpfile
ok 18 Should have worked
# [RUN] R/O longterm GUP-fast pin in MAP_SHARED file mapping ... with memfd hugetlb (2048 kB)
ok 19 Should have worked
# [RUN] R/O longterm GUP-fast pin in MAP_SHARED file mapping ... with memfd hugetlb (1048576 kB)
ok 20 Should have worked
# [RUN] R/W longterm GUP pin in MAP_PRIVATE file mapping ... with memfd
ok 21 Should have worked
# [RUN] R/W longterm GUP pin in MAP_PRIVATE file mapping ... with tmpfile
ok 22 Should have worked
# [RUN] R/W longterm GUP pin in MAP_PRIVATE file mapping ... with local tmpfile
ok 23 Should have worked
# [RUN] R/W longterm GUP pin in MAP_PRIVATE file mapping ... with memfd hugetlb (2048 kB)
ok 24 Should have worked
# [RUN] R/W longterm GUP pin in MAP_PRIVATE file mapping ... with memfd hugetlb (1048576 kB)
ok 25 Should have worked
# [RUN] R/W longterm GUP-fast pin in MAP_PRIVATE file mapping ... with memfd
ok 26 Should have worked
# [RUN] R/W longterm GUP-fast pin in MAP_PRIVATE file mapping ... with tmpfile
ok 27 Should have worked
# [RUN] R/W longterm GUP-fast pin in MAP_PRIVATE file mapping ... with local tmpfile
ok 28 Should have worked
# [RUN] R/W longterm GUP-fast pin in MAP_PRIVATE file mapping ... with memfd hugetlb (2048 kB)
ok 29 Should have worked
# [RUN] R/W longterm GUP-fast pin in MAP_PRIVATE file mapping ... with memfd hugetlb (1048576 kB)
ok 30 Should have worked
# [RUN] R/O longterm GUP pin in MAP_PRIVATE file mapping ... with memfd
ok 31 Should have worked
# [RUN] R/O longterm GUP pin in MAP_PRIVATE file mapping ... with tmpfile
ok 32 Should have worked
# [RUN] R/O longterm GUP pin in MAP_PRIVATE file mapping ... with local tmpfile
ok 33 Should have worked
# [RUN] R/O longterm GUP pin in MAP_PRIVATE file mapping ... with memfd hugetlb (2048 kB)
ok 34 Should have worked
# [RUN] R/O longterm GUP pin in MAP_PRIVATE file mapping ... with memfd hugetlb (1048576 kB)
ok 35 Should have worked
# [RUN] R/O longterm GUP-fast pin in MAP_PRIVATE file mapping ... with memfd
ok 36 Should have worked
# [RUN] R/O longterm GUP-fast pin in MAP_PRIVATE file mapping ... with tmpfile
ok 37 Should have worked
# [RUN] R/O longterm GUP-fast pin in MAP_PRIVATE file mapping ... with local tmpfile
ok 38 Should have worked
# [RUN] R/O longterm GUP-fast pin in MAP_PRIVATE file mapping ... with memfd hugetlb (2048 kB)
ok 39 Should have worked
# [RUN] R/O longterm GUP-fast pin in MAP_PRIVATE file mapping ... with memfd hugetlb (1048576 kB)
ok 40 Should have worked
# [RUN] io_uring fixed buffer with MAP_SHARED file mapping ... with memfd
ok 41 Should have worked
# [RUN] io_uring fixed buffer with MAP_SHARED file mapping ... with tmpfile
ok 42 Should have worked
# [RUN] io_uring fixed buffer with MAP_SHARED file mapping ... with local tmpfile
ok 43 Should have failed
# [RUN] io_uring fixed buffer with MAP_SHARED file mapping ... with memfd hugetlb (2048 kB)
ok 44 Should have worked
# [RUN] io_uring fixed buffer with MAP_SHARED file mapping ... with memfd hugetlb (1048576 kB)
ok 45 Should have worked
# [RUN] io_uring fixed buffer with MAP_PRIVATE file mapping ... with memfd
ok 46 Should have worked
# [RUN] io_uring fixed buffer with MAP_PRIVATE file mapping ... with tmpfile
ok 47 Should have worked
# [RUN] io_uring fixed buffer with MAP_PRIVATE file mapping ... with local tmpfile
ok 48 Should have worked
# [RUN] io_uring fixed buffer with MAP_PRIVATE file mapping ... with memfd hugetlb (2048 kB)
ok 49 Should have worked
# [RUN] io_uring fixed buffer with MAP_PRIVATE file mapping ... with memfd hugetlb (1048576 kB)
ok 50 Should have worked
# Totals: pass:50 fail:0 xfail:0 xpass:0 skip:0 error:0
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Lorenzo Stoakes <lstoakes(a)gmail.com>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Jason Gunthorpe <jgg(a)nvidia.com>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Jan Kara <jack(a)suse.cz>
David Hildenbrand (3):
selftests/mm: factor out detection of hugetlb page sizes into vm_util
selftests/mm: gup_longterm: new functional test for FOLL_LONGTERM
selftests/mm: gup_longterm: add liburing tests
tools/testing/selftests/mm/Makefile | 3 +
tools/testing/selftests/mm/cow.c | 29 +-
tools/testing/selftests/mm/gup_longterm.c | 459 ++++++++++++++++++++++
tools/testing/selftests/mm/run_vmtests.sh | 4 +-
tools/testing/selftests/mm/vm_util.c | 27 ++
tools/testing/selftests/mm/vm_util.h | 1 +
6 files changed, 495 insertions(+), 28 deletions(-)
create mode 100644 tools/testing/selftests/mm/gup_longterm.c
--
2.40.1
Hello, Waiman.
On Wed, Apr 12, 2023 at 03:52:36PM -0400, Waiman Long wrote:
> There is still a distribution hierarchy as the list of isolation CPUs have
> to be distributed down to the target cgroup through the hierarchy. For
> example,
>
> cgroup root
> +- isolcpus (cpus 8,9; isolcpus)
> +- user.slice (cpus 1-9; ecpus 1-7; member)
> +- user-x.slice (cpus 8,9; ecpus 8,9; isolated)
> +- user-y.slice (cpus 1,2; ecpus 1,2; member)
>
> OTOH, I do agree that this can be somewhat hacky. That is why I post it as a
> RFC to solicit feedback.
Wouldn't it be possible to make it hierarchical by adding another cpumask to
cpuset which lists the cpus which are allowed in the hierarchy but not used
unless claimed by an isolated domain?
Thanks.
--
tejun
Hi, Willy, Thomas
This is not really for merge, but only let it work as a demo code to
test whether it is possible to restore the next test when there is a bad
pointer access in user-space [1].
Besides, a new 'run' command is added to 'NOLIBC_TEST' environment
variable or arguments to control the running iterations, this may be
used to test the reentrancy issues, but no failures found currently ;-)
With glibc, it works as following:
$ ./nolibc-test run:2,syscall:28-30,stdlib:1
Running iteration(s): 2
Current iteration: 1
Running test 'syscall', from 28 to 30
28 dup3_m1 = -1 EBADF [OK]
29 efault_handler ! 11 SIGSEGV [OK]
30 execve_root = -1 EACCES [OK]
Errors during this test: 0
Running test 'stdlib'
1 getenv_blah = <(null)> [OK]
Errors during this test: 0
Total number of errors in the 1 iteration(s): 0
Current iteration: 2
Running test 'syscall'
28 dup3_m1 = -1 EBADF [OK]
29 efault_handler ! 11 SIGSEGV [OK]
30 execve_root = -1 EACCES [OK]
Errors during this test: 0
Running test 'stdlib'
1 getenv_blah = <(null)> [OK]
Errors during this test: 0
Total number of errors in the 2 iteration(s): 0
With nolibc, it will be skipped (run:2,syscall:28-30,stdlib:10):
Running iteration(s): 2
Current iteration: 1
Running test 'syscall', from 28 to 30
28 dup3_m1 = -1 EBADF [OK]
29 efault_handler [SKIPPED]
30 execve_root = -1 EACCES [OK]
Errors during this test: 0
Running test 'stdlib', from 10 to 10
10 strrchr_foobar_o = <obar> [OK]
Errors during this test: 0
Total number of errors in the 1 iteration(s): 0
Current iteration: 2
Running test 'syscall', from 28 to 30
28 dup3_m1 = -1 EBADF [OK]
29 efault_handler [SKIPPED]
30 execve_root = -1 EACCES [OK]
Errors during this test: 0
Running test 'stdlib', from 10 to 10
10 strrchr_foobar_o = <obar> [OK]
Errors during this test: 0
Total number of errors in the 2 iteration(s): 0
Best regards,
Zhangjin
---
[1]: https://lore.kernel.org/linux-riscv/20230529113143.GB2762@1wt.eu/
Zhangjin Wu (4):
selftests/nolibc: allow rerun with the same settings
selftests/nolibc: add rerun support
selftests/nolibc: add user space efault handler
selftests/nolibc: add user-space efault restore test case
tools/testing/selftests/nolibc/nolibc-test.c | 247 +++++++++++++++++--
1 file changed, 221 insertions(+), 26 deletions(-)
--
2.25.1
From: Menglong Dong <imagedong(a)tencent.com>
For now, the BPF program of type BPF_PROG_TYPE_TRACING can only be used
on the kernel functions whose arguments count less than 6. This is not
friendly at all, as too many functions have arguments count more than 6.
Therefore, let's enhance it by increasing the function arguments count
allowed in arch_prepare_bpf_trampoline(), for now, only x86_64.
In the 1th patch, we make MAX_BPF_FUNC_ARGS 14, according to our
statistics.
In the 2th patch, we make arch_prepare_bpf_trampoline() support to copy
function arguments in stack for x86 arch. Therefore, the maximum
arguments can be up to MAX_BPF_FUNC_ARGS for FENTRY and FEXIT.
And the 3-5th patches are for the testcases of the 2th patch.
Changes since v1:
- change the maximun function arguments to 14 from 12
- add testcases (Jiri Olsa)
- instead EMIT4 with EMIT3_off32 for "lea" to prevent overflow
Menglong Dong (5):
bpf: make MAX_BPF_FUNC_ARGS 14
bpf, x86: allow function arguments up to 14 for TRACING
libbpf: make BPF_PROG support 15 function arguments
selftests/bpf: rename bpf_fentry_test{7,8,9} to bpf_fentry_test_ptr*
selftests/bpf: add testcase for FENTRY/FEXIT with 6+ arguments
arch/x86/net/bpf_jit_comp.c | 96 ++++++++++++++++---
include/linux/bpf.h | 9 +-
net/bpf/test_run.c | 40 ++++++--
tools/lib/bpf/bpf_helpers.h | 9 +-
tools/lib/bpf/bpf_tracing.h | 10 +-
.../selftests/bpf/prog_tests/bpf_cookie.c | 24 ++---
.../bpf/prog_tests/kprobe_multi_test.c | 16 ++--
.../testing/selftests/bpf/progs/fentry_test.c | 50 ++++++++--
.../testing/selftests/bpf/progs/fexit_test.c | 51 ++++++++--
.../selftests/bpf/progs/get_func_ip_test.c | 2 +-
.../selftests/bpf/progs/kprobe_multi.c | 12 +--
.../bpf/progs/verifier_btf_ctx_access.c | 2 +-
.../selftests/bpf/verifier/atomic_fetch_add.c | 4 +-
13 files changed, 249 insertions(+), 76 deletions(-)
--
2.40.1
Hi,
This is v2 of a series that fixes up build errors and warnings for at
least the 64-bit builds on x86 with clang.
There are lots of changes since v1 [1], thanks to reviews from Peter Xu, David
Hildenbrand, and Muhammad Usama Anjum. These include:
* Using "make headers", and documenting that prerequisite as well.
* Better ways to avoid clang's Wformat-security warnings
* Added Cc's, ack-by's, reviewed-by's.
* Updated commit log messages.
The series also includes an optional "improvement" of moving some uffd
code into uffd-common.[ch], which is proving to be somewhat
controversial, and so if that doesn't get resolved, then patches 9 and
10 may just get dropped. They are not required in order to get a clean
build, now that "make headers" is happening.
[1]: https://lore.kernel.org/all/20230602013358.900637-1-jhubbard@nvidia.com/
thanks,
John Hubbard
NVIDIA
John Hubbard (11):
selftests/mm: fix uffd-stress unused function warning
selftests/mm: fix unused variable warnings in hugetlb-madvise.c,
migration.c
selftests/mm: fix "warning: expression which evaluates to zero..." in
mlock2-tests.c
selftests/mm: fix invocation of tests that are run via shell scripts
selftests/mm: .gitignore: add mkdirty, va_high_addr_switch
selftests/mm: fix two -Wformat-security warnings in uffd builds
selftests/mm: fix a "possibly uninitialized" warning in pkey-x86.h
selftests/mm: fix uffd-unit-tests.c build failure due to missing
MADV_COLLAPSE
selftests/mm: move psize(), pshift() into vm_utils.c
selftests/mm: move uffd* routines from vm_util.c to uffd-common.c
Documentation: kselftest: "make headers" is a prerequisite
Documentation/dev-tools/kselftest.rst | 1 +
tools/testing/selftests/mm/.gitignore | 2 +
tools/testing/selftests/mm/Makefile | 7 +-
tools/testing/selftests/mm/cow.c | 7 --
tools/testing/selftests/mm/hugepage-mremap.c | 2 +-
tools/testing/selftests/mm/hugetlb-madvise.c | 8 +-
tools/testing/selftests/mm/khugepaged.c | 10 --
.../selftests/mm/ksm_functional_tests.c | 2 +-
tools/testing/selftests/mm/migration.c | 5 +-
tools/testing/selftests/mm/mlock2-tests.c | 1 -
tools/testing/selftests/mm/pkey-x86.h | 2 +-
tools/testing/selftests/mm/run_vmtests.sh | 6 +-
tools/testing/selftests/mm/uffd-common.c | 105 +++++++++++++++++
tools/testing/selftests/mm/uffd-common.h | 12 +-
tools/testing/selftests/mm/uffd-stress.c | 10 --
tools/testing/selftests/mm/uffd-unit-tests.c | 16 +--
tools/testing/selftests/mm/vm_util.c | 106 ++----------------
tools/testing/selftests/mm/vm_util.h | 36 ++----
18 files changed, 165 insertions(+), 173 deletions(-)
base-commit: 929ed21dfdb6ee94391db51c9eedb63314ef6847
--
2.40.1
On Tue, May 23, 2023 at 11:22:07PM +0000, Ziqi Zhao wrote:
> An output message:
>
> > # # waitpid WEXITSTATUS=0
>
> will be printed for 30,000+ times in the `pidfd_test` selftest, which
> does not seem ideal. This patch removes the print logic in the
> `wait_for_pid` function, so each call to this function does not output
> a line by default. Any existing call sites where the extra line might
> be beneficial have been modified to include extra print statements
> outside of the function calls.
>
> Signed-off-by: Ziqi Zhao <astrajoan(a)yahoo.com>
> ---
Fine by me,
Reviewed-by: Christian Brauner <brauner(a)kernel.org>
The default timeout for selftests tests is 45 seconds. Although
we already have 13 settings for tests of about 96 sefltests which
use a timeout greater than this, we want to try to avoid encouraging
more tests to forcing a higher test timeout as selftests strives to
run all tests quickly. Selftests also uses the timeout as a non-fatal
error. Only tests runners which have control over a system would know
if to treat a timeout as fatal or not.
To help with all this:
o Enhance documentation to avoid future increases of insane timeouts
o Add the option to allow overriding the default timeout with test
runners with a command line option
Suggested-by: Shuah Khan <skhan(a)linuxfoundation.org>
Signed-off-by: Luis Chamberlain <mcgrof(a)kernel.org>
---
Documentation/dev-tools/kselftest.rst | 22 +++++++++++++++++++++
tools/testing/selftests/kselftest/runner.sh | 11 ++++++++++-
tools/testing/selftests/run_kselftest.sh | 5 +++++
3 files changed, 37 insertions(+), 1 deletion(-)
diff --git a/Documentation/dev-tools/kselftest.rst b/Documentation/dev-tools/kselftest.rst
index 12b575b76b20..dd214af7b7ff 100644
--- a/Documentation/dev-tools/kselftest.rst
+++ b/Documentation/dev-tools/kselftest.rst
@@ -168,6 +168,28 @@ the `-t` option for specific single tests. Either can be used multiple times::
For other features see the script usage output, seen with the `-h` option.
+Timeout for selftests
+=====================
+
+Selftests are designed to be quick and so a default timeout is used of 45
+seconds for each test. Tests can override the default timeout by adding
+a settings file in their directory and set a timeout variable there to the
+configured a desired upper timeout for the test. Only a few tests override
+the timeout with a value higher than 45 seconds, selftests strives to keep
+it that way. Timeouts in selftests are not considered fatal because the
+system under which a test runs may change and this can also modify the
+expected time it takes to run a test. If you have control over the systems
+which will run the tests you can configure a test runner on those systems to
+use a greater or lower timeout on the command line as with the `-o` or
+the `--override-timeout` argument. For example to use 165 seconds instead
+one would use:
+
+ $ ./run_kselftest.sh --override-timeout 165
+
+You can look at the TAP output to see if you ran into the timeout. Test
+runners which know a test must run under a specific time can then optionally
+treat these timeouts then as fatal.
+
Packaging selftests
===================
diff --git a/tools/testing/selftests/kselftest/runner.sh b/tools/testing/selftests/kselftest/runner.sh
index 294619ade49f..1c952d1401d4 100644
--- a/tools/testing/selftests/kselftest/runner.sh
+++ b/tools/testing/selftests/kselftest/runner.sh
@@ -8,7 +8,8 @@ export logfile=/dev/stdout
export per_test_logging=
# Defaults for "settings" file fields:
-# "timeout" how many seconds to let each test run before failing.
+# "timeout" how many seconds to let each test run before running
+# over our soft timeout limit.
export kselftest_default_timeout=45
# There isn't a shell-agnostic way to find the path of a sourced file,
@@ -90,6 +91,14 @@ run_one()
done < "$settings"
fi
+ # Command line timeout overrides the settings file
+ if [ -n "$kselftest_override_timeout" ]; then
+ kselftest_timeout="$kselftest_override_timeout"
+ echo "# overriding timeout to $kselftest_timeout" >> "$logfile"
+ else
+ echo "# timeout set to $kselftest_timeout" >> "$logfile"
+ fi
+
TEST_HDR_MSG="selftests: $DIR: $BASENAME_TEST"
echo "# $TEST_HDR_MSG"
if [ ! -e "$TEST" ]; then
diff --git a/tools/testing/selftests/run_kselftest.sh b/tools/testing/selftests/run_kselftest.sh
index 97165a83df63..9a981b36bd7f 100755
--- a/tools/testing/selftests/run_kselftest.sh
+++ b/tools/testing/selftests/run_kselftest.sh
@@ -26,6 +26,7 @@ Usage: $0 [OPTIONS]
-l | --list List the available collection:test entries
-d | --dry-run Don't actually run any tests
-h | --help Show this usage info
+ -o | --override-timeout Number of seconds after which we timeout
EOF
exit $1
}
@@ -33,6 +34,7 @@ EOF
COLLECTIONS=""
TESTS=""
dryrun=""
+kselftest_override_timeout=""
while true; do
case "$1" in
-s | --summary)
@@ -51,6 +53,9 @@ while true; do
-d | --dry-run)
dryrun="echo"
shift ;;
+ -o | --override-timeout)
+ kselftest_override_timeout="$2"
+ shift 2 ;;
-h | --help)
usage 0 ;;
"")
--
2.39.2
It turned out that an even dozen patches were required in order to get the
selftests building cleanly, and all running, once again. I made it worse on
myself by insisting on using clang, which seems to uncover a few more warnings
than gcc these days.
So I still haven't gotten to my original goal of running a new HMM test that
Alistair handed me (it's not here yet), but at least this fixes everything I ran
into just now.
John Hubbard (12):
selftests/mm: fix uffd-stress unused function warning
selftests/mm: fix unused variable warning in hugetlb-madvise.c
selftests/mm: fix unused variable warning in migration.c
selftests/mm: fix a char* assignment in mlock2-tests.c
selftests/mm: fix invocation of tests that are run via shell scripts
selftests/mm: .gitignore: add mkdirty, va_high_addr_switch
selftests/mm: set -Wno-format-security to avoid uffd build warnings
selftests/mm: fix a "possibly uninitialized" warning in pkey-x86.h
selftests/mm: move psize(), pshift() into vm_utils.c
selftests/mm: move uffd* routines from vm_util.c to uffd-common.c
selftests/mm: fix missing UFFDIO_CONTINUE_MODE_WP and similar build
failures
selftests/mm: fix uffd-unit-tests.c build failure due to missing
MADV_COLLAPSE
tools/testing/selftests/mm/.gitignore | 2 +
tools/testing/selftests/mm/Makefile | 9 +-
tools/testing/selftests/mm/cow.c | 7 --
tools/testing/selftests/mm/hugepage-mremap.c | 2 +-
tools/testing/selftests/mm/hugetlb-madvise.c | 2 +-
tools/testing/selftests/mm/khugepaged.c | 10 --
.../selftests/mm/ksm_functional_tests.c | 2 +-
tools/testing/selftests/mm/migration.c | 2 +-
tools/testing/selftests/mm/mlock2-tests.c | 2 +-
tools/testing/selftests/mm/pkey-x86.h | 2 +-
tools/testing/selftests/mm/run_vmtests.sh | 6 +-
tools/testing/selftests/mm/uffd-common.c | 105 +++++++++++++++++
tools/testing/selftests/mm/uffd-common.h | 29 ++++-
tools/testing/selftests/mm/uffd-stress.c | 10 --
tools/testing/selftests/mm/vm_util.c | 106 ++----------------
tools/testing/selftests/mm/vm_util.h | 36 ++----
16 files changed, 170 insertions(+), 162 deletions(-)
base-commit: 929ed21dfdb6ee94391db51c9eedb63314ef6847
--
2.40.1
On Sun, Jun 4, 2023, at 10:29, 吴章金 wrote:
>
> Sorry for missing part of your feedbacks, I will check if -nostdlib
> stops the linking of libgcc_s or my own separated test script forgot
> linking the libgcc_s manually.
According to the gcc documentation, -nostdlib drops libgcc.a, but
adding -lgcc is the recommended way to bring it back.
> And as suggestion from Thomas' reply,
>
>>> Perhaps we really need to add the missing __divdi3 and __aeabi_ldivmod and the
>>> ones for the other architectures, or get one from lib/math/div64.c.
>
>>No, these ones come from the compiler via libgcc_s, we must not try to
> reimplement them. And we should do our best to avoid depending on them
> to avoid the error you got above.
>
> So, the explicit conversion is used instead in the patch.
I think a cast to a 32-bit type is ideal when converting the
clock_gettime() result into microseconds, since the kernel guarantees
that the timespec value is normalized, with all zeroes in the
upper 34 bits. Going through __aeabi_ldivmod would make the
conversion much slower.
For user supplied non-normalized timeval values, it's not obvious
whether we need the full 64-bit division
Arnd
This patchset consolidates a number of disparate items that can all be
considered cleanups. They are all related to mlxsw in that they are
directly in mlxsw code, or in selftests that mlxsw heavily uses.
- patch #1 fixes a comment, patch #2 propagates an extack
- patches #3 and #4 tweak several loops to query a resource once and cache
in a local variable instead of querying on each iteration
- patches #5 and #6 fix selftest diagrams, and #7 adds a missing diagram
into an existing test
- patch #8 disables a PVID on a bridge in a selftest that should not need
said PVID
Petr Machata (8):
mlxsw: spectrum_router: Clarify a comment
mlxsw: spectrum_router: Use extack in
mlxsw_sp~_rif_ipip_lb_configure()
mlxsw: spectrum_router: Do not query MAX_RIFS on each iteration
mlxsw: spectrum_router: Do not query MAX_VRS on each iteration
selftests: mlxsw: ingress_rif_conf_1d: Fix the diagram
selftests: mlxsw: egress_vid_classification: Fix the diagram
selftests: router_bridge_vlan: Add a diagram
selftests: router_bridge_vlan: Set vlan_default_pvid 0 on the bridge
.../ethernet/mellanox/mlxsw/spectrum_router.c | 26 ++++++++++++-------
.../net/mlxsw/egress_vid_classification.sh | 5 ++--
.../drivers/net/mlxsw/ingress_rif_conf_1d.sh | 5 ++--
.../net/forwarding/router_bridge_vlan.sh | 24 ++++++++++++++++-
4 files changed, 43 insertions(+), 17 deletions(-)
--
2.40.1
Hi, Willy
When I worked on adding new syscalls and the related library routines,
I have seen most of the library routines share the same syscall call and
return logic, this patchset adds two macros to simplify and shrink them.
All of them have been tested on arm, aarch64, rv32 and rv64, no new
regressions found.
If this is ok, I will rebase the new syscalls and library routines on
this patchset.
Best regards,
Zhangjin
---
Zhangjin Wu (4):
tools/nolibc: unistd.h: add __syscall() and __syscall_ret() helpers
tools/nolibc: unistd.h: apply __syscall_ret() helper
tools/nolibc: sys.h: apply __syscall_ret() helper
tools/nolibc: sys.h: apply __syscall() helper
tools/include/nolibc/sys.h | 369 ++++++----------------------------
tools/include/nolibc/unistd.h | 12 +-
2 files changed, 65 insertions(+), 316 deletions(-)
--
2.25.1
Hi, Willy
This is the v3 generic part1 for rv32, all of the found issues of v2
part1 [1] have been fixed up, several generic patches have been fixed up
and merged from v2 part2 [2] to this series, the standalone test_fork
patch [4] is merged with a Reviewed-by line into this series too.
This series is based on 20230528-nolibc-rv32+stkp5 branch of [5].
Changes from v2 -> v3:
* selftests/nolibc: fix up compile warning with glibc on x86_64
Use simpler 'long long' conversion instead of old #ifdef ...
(Suggestion from Willy)
* tools/nolibc: add missing nanoseconds support for __NR_statx
Split the compound assignment into two single assignments
(Suggestion from Thomas)
* selftests/nolibc: add new gettimeofday test cases
Removed the gettimeofday(NULL, &tz)
(Suggestion from Thomas)
All of the commit messages have been re-checked, some missing
Suggested-by lines are added.
The whole patchset have been tested on arm, aarch64, rv32 and rv64, no
regressions (the next compile patchset is required to do rv32 test).
The nolibc-test has been tested with glibc on x86_64 too.
Btw, we have found such poll failures on arm (not introduced by this
patchset), this will be fixed in our coming ppoll_time64 patchset:
48 poll_null = -1 ENOSYS [FAIL]
49 poll_stdout = -1 ENOSYS [FAIL]
50 poll_fault = -1 ENOSYS != (-1 EFAULT) [FAIL]
And the gettimeofday_null removal patch from Thomas [3] may conflicts
with the gettimeofday removal and addition patches, but it is not hard
to fix.
Best regards,
Zhangjin
---
[1]: https://lore.kernel.org/linux-riscv/cover.1685362482.git.falcon@tinylab.org…
[2]: https://lore.kernel.org/linux-riscv/cover.1685387484.git.falcon@tinylab.org…
[3]: https://lore.kernel.org/lkml/20230530-nolibc-gettimeofday-v1-1-7307441a002b…
[4]: https://lore.kernel.org/lkml/61bdfe7bacebdef8aa9195f6f2550a5b0d33aab3.16854…
[5]: https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/nolibc.git
Zhangjin Wu (12):
selftests/nolibc: syscall_args: use generic __NR_statx
tools/nolibc: add missing nanoseconds support for __NR_statx
selftests/nolibc: allow specify extra arguments for qemu
selftests/nolibc: fix up compile warning with glibc on x86_64
selftests/nolibc: not include limits.h for nolibc
selftests/nolibc: use INT_MAX instead of __INT_MAX__
tools/nolibc: arm: add missing my_syscall6
tools/nolibc: open: fix up compile warning for arm
selftests/nolibc: support two errnos with EXPECT_SYSER2()
selftests/nolibc: remove gettimeofday_bad1/2 completely
selftests/nolibc: add new gettimeofday test cases
selftests/nolibc: test_fork: fix up duplicated print
tools/include/nolibc/arch-arm.h | 23 +++++++++++
tools/include/nolibc/stdint.h | 14 +++++++
tools/include/nolibc/sys.h | 39 +++++++++---------
tools/testing/selftests/nolibc/Makefile | 2 +-
tools/testing/selftests/nolibc/nolibc-test.c | 42 ++++++++++++--------
5 files changed, 85 insertions(+), 35 deletions(-)
--
2.25.1
running nolibc-test with glibc on x86_64 got such print issue:
29 execve_root = -1 EACCES [OK]
30 fork30 fork = 0 [OK]
31 getdents64_root = 712 [OK]
The fork test case has three printf calls:
(1) llen += printf("%d %s", test, #name);
(2) llen += printf(" = %d %s ", expr, errorname(errno));
(3) llen += pad_spc(llen, 64, "[FAIL]\n"); --> vfprintf()
In the following scene, the above issue happens:
(a) The parent calls (1)
(b) The parent calls fork()
(c) The child runs and shares the print buffer of (1)
(d) The child exits, flushs the print buffer and closes its own stdout/stderr
* "30 fork" is printed at the first time.
(e) The parent calls (2) and (3), with "\n" in (3), it flushs the whole buffer
* "30 fork = 0 ..." is printed
Therefore, there are two "30 fork" in the stdout.
Between (a) and (b), if flush the stdout (and the sterr), the child in
stage (c) will not be able to 'see' the print buffer.
Signed-off-by: Zhangjin Wu <falcon(a)tinylab.org>
---
tools/testing/selftests/nolibc/nolibc-test.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c
index 7de46305f419..88323a60aa4a 100644
--- a/tools/testing/selftests/nolibc/nolibc-test.c
+++ b/tools/testing/selftests/nolibc/nolibc-test.c
@@ -486,7 +486,13 @@ static int test_getpagesize(void)
static int test_fork(void)
{
int status;
- pid_t pid = fork();
+ pid_t pid;
+
+ /* flush the printf buffer to avoid child flush it */
+ fflush(stdout);
+ fflush(stderr);
+
+ pid = fork();
switch (pid) {
case -1:
--
2.25.1
Hi,
This is somewhat related to the 11-patch series that I just posted [1],
but I hadn't originally planned to go this far with it. But since David
Hildenbrand asked if we could warn in this case [2], here it is.
It turns out that automatically doing the "make headers" correctly is
much harder than just warning, so I stopped at that point. It works well,
though.
[1] https://lore.kernel.org/all/20230603021558.95299-1-jhubbard@nvidia.com/
[2] https://lore.kernel.org/all/a4fbc191-9acb-5db8-a375-96c0c1ba3fcd@redhat.com/
thanks,
John Hubbard
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Muhammad Usama Anjum <usama.anjum(a)collabora.com>
Cc: Jonathan Corbet <corbet(a)lwn.net>
Cc: linux-doc(a)vger.kernel.org
John Hubbard (1):
selftests: error out if kernel header files are not yet built
tools/testing/selftests/lib.mk | 36 +++++++++++++++++++++++++++++++---
1 file changed, 33 insertions(+), 3 deletions(-)
base-commit: e5282a7d8f6b604f2bb6a06457734b8cf1e2f8f2
--
2.40.1
Adamos found the issue with the cached XFD state [1]. Although the XFD
state is reset on the CPU hotplug, the per-CPU XFD cache is missing
the reset. Then, running an AMX thread there, the staled value causes
the kernel crash to kill the thread.
This is reproducible when moving an AMX thread to the hot-plugged CPU.
So, add a test case to ensure no issue with that.
It repeats the test due to possible inconsistencies. Then, along with
the hotplug cost, it will bring a noticeable runtime increase. But,
the overall test has a quick turnaround time.
Link: https://lore.kernel.org/lkml/20230519112315.30616-1-attofari@amazon.de/ [1]
Signed-off-by: Chang S. Bae <chang.seok.bae(a)intel.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Adamos Ttofari <attofari(a)amazon.de>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: linux-kselftest(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
---
The overall x86 selftest via "$ make TARGETS='x86' kselftest" takes
about 3.5 -> 5.5 seconds. 'amx_64' itself took about 1.5 more seconds
over 0.x seconds.
But, this overall runtime still takes in a matter of some seconds,
which should be fine I thought.
---
tools/testing/selftests/x86/amx.c | 133 ++++++++++++++++++++++++++++--
1 file changed, 126 insertions(+), 7 deletions(-)
diff --git a/tools/testing/selftests/x86/amx.c b/tools/testing/selftests/x86/amx.c
index d884fd69dd51..6f2f0598c706 100644
--- a/tools/testing/selftests/x86/amx.c
+++ b/tools/testing/selftests/x86/amx.c
@@ -3,6 +3,7 @@
#define _GNU_SOURCE
#include <err.h>
#include <errno.h>
+#include <fcntl.h>
#include <pthread.h>
#include <setjmp.h>
#include <stdio.h>
@@ -25,6 +26,8 @@
# error This test is 64-bit only
#endif
+#define BUF_LEN 1000
+
#define XSAVE_HDR_OFFSET 512
#define XSAVE_HDR_SIZE 64
@@ -239,11 +242,10 @@ static inline uint64_t get_fpx_sw_bytes_features(void *buffer)
}
/* Work around printf() being unsafe in signals: */
-#define SIGNAL_BUF_LEN 1000
-char signal_message_buffer[SIGNAL_BUF_LEN];
+char signal_message_buffer[BUF_LEN];
void sig_print(char *msg)
{
- int left = SIGNAL_BUF_LEN - strlen(signal_message_buffer) - 1;
+ int left = BUF_LEN - strlen(signal_message_buffer) - 1;
strncat(signal_message_buffer, msg, left);
}
@@ -767,15 +769,15 @@ static int create_threads(int num, struct futex_info *finfo)
return 0;
}
-static void affinitize_cpu0(void)
+static inline void affinitize_cpu(int cpu)
{
cpu_set_t cpuset;
CPU_ZERO(&cpuset);
- CPU_SET(0, &cpuset);
+ CPU_SET(cpu, &cpuset);
if (sched_setaffinity(0, sizeof(cpuset), &cpuset) != 0)
- fatal_error("sched_setaffinity to CPU 0");
+ fatal_error("sched_setaffinity to CPU %d", cpu);
}
static void test_context_switch(void)
@@ -784,7 +786,7 @@ static void test_context_switch(void)
int i;
/* Affinitize to one CPU to force context switches */
- affinitize_cpu0();
+ affinitize_cpu(0);
req_xtiledata_perm();
@@ -926,6 +928,121 @@ static void test_ptrace(void)
err(1, "ptrace test");
}
+/* CPU Hotplug test */
+
+static void __hotplug_cpu(int online, int cpu)
+{
+ char buf[BUF_LEN] = {};
+ int fd, rc;
+
+ strncat(buf, "/sys/devices/system/cpu/cpu", BUF_LEN);
+ snprintf(buf + strlen(buf), BUF_LEN - strlen(buf), "%d", cpu);
+ strncat(buf, "/online", BUF_LEN - strlen(buf));
+
+ fd = open(buf, O_RDWR);
+ if (fd == -1)
+ fatal_error("open()");
+
+ snprintf(buf, BUF_LEN, "%d", online);
+ rc = write(fd, buf, strlen(buf));
+ if (rc == -1)
+ fatal_error("write()");
+
+ rc = close(fd);
+ if (rc == -1)
+ fatal_error("close()");
+}
+
+static void offline_cpu(int cpu)
+{
+ __hotplug_cpu(0, cpu);
+}
+
+static void online_cpu(int cpu)
+{
+ __hotplug_cpu(1, cpu);
+}
+
+static jmp_buf jmpbuf;
+
+static void handle_sigsegv(int sig, siginfo_t *si, void *ctx_void)
+{
+ siglongjmp(jmpbuf, 1);
+}
+
+#define RETRY 5
+
+/*
+ * Sanity-check the hotplug CPU for its (re-)initialization.
+ *
+ * Create an AMX thread on a CPU, while the hotplug CPU went offline.
+ * Then, plug the offlined back, and move the thread to run on it.
+ *
+ * Repeat this multiple times to ensure no inconsistent failure.
+ * If something goes wrong, the thread will get a signal or killed.
+ */
+static void *switch_cpus(void *arg)
+{
+ int *result = (int *)arg;
+ int i = 0;
+
+ affinitize_cpu(0);
+ offline_cpu(1);
+ load_rand_tiledata(stashed_xsave);
+
+ sethandler(SIGSEGV, handle_sigsegv, SA_ONSTACK);
+ for (i = 0; i < RETRY; i++) {
+ if (i > 0) {
+ affinitize_cpu(0);
+ offline_cpu(1);
+ }
+ if (sigsetjmp(jmpbuf, 1) == 0) {
+ online_cpu(1);
+ affinitize_cpu(1);
+ } else {
+ *result = 1;
+ goto out;
+ }
+ }
+ *result = 0;
+out:
+ clearhandler(SIGSEGV);
+ return result;
+}
+
+static void test_cpuhp(void)
+{
+ int max_cpu_num = sysconf(_SC_NPROCESSORS_ONLN) - 1;
+ void *thread_retval;
+ pthread_t thread;
+ int result, rc;
+
+ if (!max_cpu_num) {
+ printf("[SKIP]\tThe running system has no more CPU for the hotplug test.\n");
+ return;
+ }
+
+ printf("[RUN]\tTest AMX use with the CPU hotplug.\n");
+
+ if (pthread_create(&thread, NULL, switch_cpus, &result))
+ fatal_error("pthread_create()");
+
+ rc = pthread_join(thread, &thread_retval);
+
+ if (rc)
+ fatal_error("pthread_join()");
+
+ /*
+ * Either an invalid retval or a failed result indicates
+ * the test failure.
+ */
+ if (thread_retval != &result || result != 0)
+ printf("[FAIL]\tThe AMX thread had an issue (%s).\n",
+ thread_retval != &result ? "killed" : "signaled");
+ else
+ printf("[OK]\tThe AMX thread had no issue.\n");
+}
+
int main(void)
{
/* Check hardware availability at first */
@@ -948,6 +1065,8 @@ int main(void)
test_ptrace();
+ test_cpuhp();
+
clearhandler(SIGILL);
free_stashed_xsave();
base-commit: 7877cb91f1081754a1487c144d85dc0d2e2e7fc4
--
2.17.1
Hi,
This follows the discussion here:
https://lore.kernel.org/linux-kselftest/20230324123157.bbwvfq4gsxnlnfwb@hou…
This shows a couple of inconsistencies with regard to how device-managed
resources are cleaned up. Basically, devm resources will only be cleaned up
if the device is attached to a bus and bound to a driver. Failing any of
these cases, a call to device_unregister will not end up in the devm
resources being released.
We had to work around it in DRM to provide helpers to create a device for
kunit tests, but the current discussion around creating similar, generic,
helpers for kunit resumed interest in fixing this.
This can be tested using the command:
./tools/testing/kunit/kunit.py run --kunitconfig=drivers/base/test/
Let me know what you think,
Maxime
Signed-off-by: Maxime Ripard <maxime(a)cerno.tech>
---
Maxime Ripard (2):
drivers: base: Add basic devm tests for root devices
drivers: base: Add basic devm tests for platform devices
drivers/base/test/.kunitconfig | 2 +
drivers/base/test/Kconfig | 4 +
drivers/base/test/Makefile | 3 +
drivers/base/test/platform-device-test.c | 278 +++++++++++++++++++++++++++++++
drivers/base/test/root-device-test.c | 120 +++++++++++++
5 files changed, 407 insertions(+)
---
base-commit: a6faf7ea9fcb7267d06116d4188947f26e00e57e
change-id: 20230329-kunit-devm-inconsistencies-test-5e5a7d01e60d
Best regards,
--
Maxime Ripard <maxime(a)cerno.tech>
Dzień dobry,
w jaki sposób docierają Państwo do odbiorców?
Tworzymy potężne narzędzia sprzedaży, które pozwalają kompleksowo rozwiązać problemy potencjalnych klientów i skutecznie wpłynąć na ich decyzje zakupowe.
Skupiamy się na Państwa potrzebach związanych z obsługą sklepu, oczekiwaniach i planach sprzedażowych. Szczegółowo dopasowujemy grafikę, funkcjonalności, strukturę i mikrointerakcje do Państwa grupy docelowej, co przekłada się na oczekiwane rezultaty.
Chętnie przedstawię dotychczasowe realizacje, aby mogli Państwo przekonać się o naszych możliwościach. Mogę się skontaktować?
Pozdrawiam
Kamil Durjasz
Commit d937bc3449fa ("bpf: make uniform use of array->elem_size
everywhere in arraymap.c") changed array_map_gen_lookup to use
array->elem_size instead of round_up(map->value_size, 8) as the element
size when generating code to access a value in an array map.
array->elem_size, however, is not set by bpf_map_meta_alloc when
initializing an BPF_MAP_TYPE_ARRAY_OF_MAPS or BPF_MAP_TYPE_HASH_OF_MAPS.
This results in array_map_gen_lookup incorrectly outputting code that
always accesses index 0 in the array (as the index will be calculated
via a multiplication with the element size, which is incorrectly set to
0).
This patchset sets elem_size on the bpf_array object when allocating an
array or hash of maps to fix this and adds a selftest that accesses an
array map nested within a hash of maps at a nonzero index to prevent
regressions.
v1: https://lore.kernel.org/bpf/95b5da7c-ee52-3ecb-0a4e-f6a7a114f269@linux.dev/
Changelog:
v1 -> v2:
Address comments by Martin KaFai Lau:
- Directly use inner_array->elem_size instead of using round_up
- Move selftests to a new patch
- Use ASSERT_* macros instead of CHECK and remove duration
- Remove unnecessary usleep
- Shorten selftest name
Rhys Rustad-Elliott (2):
bpf: Fix elem_size not being set for inner maps
selftests/bpf: Add access_inner_map selftest
kernel/bpf/map_in_map.c | 8 +++-
.../bpf/prog_tests/inner_array_lookup.c | 31 +++++++++++++
.../bpf/progs/test_inner_array_lookup.c | 45 +++++++++++++++++++
3 files changed, 82 insertions(+), 2 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/inner_array_lookup.c
create mode 100644 tools/testing/selftests/bpf/progs/test_inner_array_lookup.c
--
2.40.1
*Changes in v16*
- Fix a corner case
- Add exclusive PM_SCAN_OP_WP back
*Changes in v15*
- Build fix (Add missed build fix in RESEND)
*Changes in v14*
- Fix build error caused by #ifdef added at last minute in some configs
*Changes in v13*
- Rebase on top of next-20230414
- Give-up on using uffd_wp_range() and write new helpers, flush tlb only
once
*Changes in v12*
- Update and other memory types to UFFD_FEATURE_WP_ASYNC
- Rebaase on top of next-20230406
- Review updates
*Changes in v11*
- Rebase on top of next-20230307
- Base patches on UFFD_FEATURE_WP_UNPOPULATED
- Do a lot of cosmetic changes and review updates
- Remove ENGAGE_WP + !GET operation as it can be performed with
UFFDIO_WRITEPROTECT
*Changes in v10*
- Add specific condition to return error if hugetlb is used with wp
async
- Move changes in tools/include/uapi/linux/fs.h to separate patch
- Add documentation
*Changes in v9:*
- Correct fault resolution for userfaultfd wp async
- Fix build warnings and errors which were happening on some configs
- Simplify pagemap ioctl's code
*Changes in v8:*
- Update uffd async wp implementation
- Improve PAGEMAP_IOCTL implementation
*Changes in v7:*
- Add uffd wp async
- Update the IOCTL to use uffd under the hood instead of soft-dirty
flags
*Motivation*
The real motivation for adding PAGEMAP_SCAN IOCTL is to emulate Windows
GetWriteWatch() syscall [1]. The GetWriteWatch{} retrieves the addresses of
the pages that are written to in a region of virtual memory.
This syscall is used in Windows applications and games etc. This syscall is
being emulated in pretty slow manner in userspace. Our purpose is to
enhance the kernel such that we translate it efficiently in a better way.
Currently some out of tree hack patches are being used to efficiently
emulate it in some kernels. We intend to replace those with these patches.
So the whole gaming on Linux can effectively get benefit from this. It
means there would be tons of users of this code.
CRIU use case [2] was mentioned by Andrei and Danylo:
> Use cases for migrating sparse VMAs are binaries sanitized with ASAN,
> MSAN or TSAN [3]. All of these sanitizers produce sparse mappings of
> shadow memory [4]. Being able to migrate such binaries allows to highly
> reduce the amount of work needed to identify and fix post-migration
> crashes, which happen constantly.
Andrei's defines the following uses of this code:
* it is more granular and allows us to track changed pages more
effectively. The current interface can clear dirty bits for the entire
process only. In addition, reading info about pages is a separate
operation. It means we must freeze the process to read information
about all its pages, reset dirty bits, only then we can start dumping
pages. The information about pages becomes more and more outdated,
while we are processing pages. The new interface solves both these
downsides. First, it allows us to read pte bits and clear the
soft-dirty bit atomically. It means that CRIU will not need to freeze
processes to pre-dump their memory. Second, it clears soft-dirty bits
for a specified region of memory. It means CRIU will have actual info
about pages to the moment of dumping them.
* The new interface has to be much faster because basic page filtering
is happening in the kernel. With the old interface, we have to read
pagemap for each page.
*Implementation Evolution (Short Summary)*
From the definition of GetWriteWatch(), we feel like kernel's soft-dirty
feature can be used under the hood with some additions like:
* reset soft-dirty flag for only a specific region of memory instead of
clearing the flag for the entire process
* get and clear soft-dirty flag for a specific region atomically
So we decided to use ioctl on pagemap file to read or/and reset soft-dirty
flag. But using soft-dirty flag, sometimes we get extra pages which weren't
even written. They had become soft-dirty because of VMA merging and
VM_SOFTDIRTY flag. This breaks the definition of GetWriteWatch(). We were
able to by-pass this short coming by ignoring VM_SOFTDIRTY until David
reported that mprotect etc messes up the soft-dirty flag while ignoring
VM_SOFTDIRTY [5]. This wasn't happening until [6] got introduced. We
discussed if we can revert these patches. But we could not reach to any
conclusion. So at this point, I made couple of tries to solve this whole
VM_SOFTDIRTY issue by correcting the soft-dirty implementation:
* [7] Correct the bug fixed wrongly back in 2014. It had potential to cause
regression. We left it behind.
* [8] Keep a list of soft-dirty part of a VMA across splits and merges. I
got the reply don't increase the size of the VMA by 8 bytes.
At this point, we left soft-dirty considering it is too much delicate and
userfaultfd [9] seemed like the only way forward. From there onward, we
have been basing soft-dirty emulation on userfaultfd wp feature where
kernel resolves the faults itself when WP_ASYNC feature is used. It was
straight forward to add WP_ASYNC feature in userfautlfd. Now we get only
those pages dirty or written-to which are really written in reality. (PS
There is another WP_UNPOPULATED userfautfd feature is required which is
needed to avoid pre-faulting memory before write-protecting [9].)
All the different masks were added on the request of CRIU devs to create
interface more generic and better.
[1] https://learn.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-…
[2] https://lore.kernel.org/all/20221014134802.1361436-1-mdanylo@google.com
[3] https://github.com/google/sanitizers
[4] https://github.com/google/sanitizers/wiki/AddressSanitizerAlgorithm#64-bit
[5] https://lore.kernel.org/all/bfcae708-db21-04b4-0bbe-712badd03071@redhat.com
[6] https://lore.kernel.org/all/20220725142048.30450-1-peterx@redhat.com/
[7] https://lore.kernel.org/all/20221122115007.2787017-1-usama.anjum@collabora.…
[8] https://lore.kernel.org/all/20221220162606.1595355-1-usama.anjum@collabora.…
[9] https://lore.kernel.org/all/20230306213925.617814-1-peterx@redhat.com
[10] https://lore.kernel.org/all/20230125144529.1630917-1-mdanylo@google.com
* Original Cover letter from v8*
Hello,
Note:
Soft-dirty pages and pages which have been written-to are synonyms. As
kernel already has soft-dirty feature inside which we have given up to
use, we are using written-to terminology while using UFFD async WP under
the hood.
This IOCTL, PAGEMAP_SCAN on pagemap file can be used to get and/or clear
the info about page table entries. The following operations are
supported in this ioctl:
- Get the information if the pages have been written-to (PAGE_IS_WRITTEN),
file mapped (PAGE_IS_FILE), present (PAGE_IS_PRESENT) or swapped
(PAGE_IS_SWAPPED).
- Write-protect the pages (PAGEMAP_WP_ENGAGE) to start finding which
pages have been written-to.
- Find pages which have been written-to and write protect the pages
(atomic PAGE_IS_WRITTEN + PAGEMAP_WP_ENGAGE)
It is possible to find and clear soft-dirty pages entirely in userspace.
But it isn't efficient:
- The mprotect and SIGSEGV handler for bookkeeping
- The userfaultfd wp (synchronous) with the handler for bookkeeping
Some benchmarks can be seen here[1]. This series adds features that weren't
present earlier:
- There is no atomic get soft-dirty/Written-to status and clear present in
the kernel.
- The pages which have been written-to can not be found in accurate way.
(Kernel's soft-dirty PTE bit + sof_dirty VMA bit shows more soft-dirty
pages than there actually are.)
Historically, soft-dirty PTE bit tracking has been used in the CRIU
project. The procfs interface is enough for finding the soft-dirty bit
status and clearing the soft-dirty bit of all the pages of a process.
We have the use case where we need to track the soft-dirty PTE bit for
only specific pages on-demand. We need this tracking and clear mechanism
of a region of memory while the process is running to emulate the
getWriteWatch() syscall of Windows.
*(Moved to using UFFD instead of soft-dirtyi feature to find pages which
have been written-to from v7 patch series)*:
Stop using the soft-dirty flags for finding which pages have been
written to. It is too delicate and wrong as it shows more soft-dirty
pages than the actual soft-dirty pages. There is no interest in
correcting it [2][3] as this is how the feature was written years ago.
It shouldn't be updated to changed behaviour. Peter Xu has suggested
using the async version of the UFFD WP [4] as it is based inherently
on the PTEs.
So in this patch series, I've added a new mode to the UFFD which is
asynchronous version of the write protect. When this variant of the
UFFD WP is used, the page faults are resolved automatically by the
kernel. The pages which have been written-to can be found by reading
pagemap file (!PM_UFFD_WP). This feature can be used successfully to
find which pages have been written to from the time the pages were
write protected. This works just like the soft-dirty flag without
showing any extra pages which aren't soft-dirty in reality.
The information related to pages if the page is file mapped, present and
swapped is required for the CRIU project [5][6]. The addition of the
required mask, any mask, excluded mask and return masks are also required
for the CRIU project [5].
The IOCTL returns the addresses of the pages which match the specific
masks. The page addresses are returned in struct page_region in a compact
form. The max_pages is needed to support a use case where user only wants
to get a specific number of pages. So there is no need to find all the
pages of interest in the range when max_pages is specified. The IOCTL
returns when the maximum number of the pages are found. The max_pages is
optional. If max_pages is specified, it must be equal or greater than the
vec_size. This restriction is needed to handle worse case when one
page_region only contains info of one page and it cannot be compacted.
This is needed to emulate the Windows getWriteWatch() syscall.
The patch series include the detailed selftest which can be used as an
example for the uffd async wp test and PAGEMAP_IOCTL. It shows the
interface usages as well.
[1] https://lore.kernel.org/lkml/54d4c322-cd6e-eefd-b161-2af2b56aae24@collabora…
[2] https://lore.kernel.org/all/20221220162606.1595355-1-usama.anjum@collabora.…
[3] https://lore.kernel.org/all/20221122115007.2787017-1-usama.anjum@collabora.…
[4] https://lore.kernel.org/all/Y6Hc2d+7eTKs7AiH@x1n
[5] https://lore.kernel.org/all/YyiDg79flhWoMDZB@gmail.com/
[6] https://lore.kernel.org/all/20221014134802.1361436-1-mdanylo@google.com/
Regards,
Muhammad Usama Anjum
Muhammad Usama Anjum (4):
fs/proc/task_mmu: Implement IOCTL to get and optionally clear info
about PTEs
tools headers UAPI: Update linux/fs.h with the kernel sources
mm/pagemap: add documentation of PAGEMAP_SCAN IOCTL
selftests: mm: add pagemap ioctl tests
Peter Xu (1):
userfaultfd: UFFD_FEATURE_WP_ASYNC
Documentation/admin-guide/mm/pagemap.rst | 58 +
Documentation/admin-guide/mm/userfaultfd.rst | 35 +
fs/proc/task_mmu.c | 503 ++++++
fs/userfaultfd.c | 26 +-
include/linux/userfaultfd_k.h | 21 +-
include/uapi/linux/fs.h | 53 +
include/uapi/linux/userfaultfd.h | 9 +-
mm/hugetlb.c | 32 +-
mm/memory.c | 27 +-
tools/include/uapi/linux/fs.h | 53 +
tools/testing/selftests/mm/.gitignore | 1 +
tools/testing/selftests/mm/Makefile | 3 +-
tools/testing/selftests/mm/config | 1 +
tools/testing/selftests/mm/pagemap_ioctl.c | 1459 ++++++++++++++++++
tools/testing/selftests/mm/run_vmtests.sh | 4 +
15 files changed, 2262 insertions(+), 23 deletions(-)
create mode 100644 tools/testing/selftests/mm/pagemap_ioctl.c
mode change 100644 => 100755 tools/testing/selftests/mm/run_vmtests.sh
--
2.39.2
This is part of the effort to remove the empty element of the ctl_table
structures (used to calculate size) and replace it with an ARRAY_SIZE call. By
replacing the child element in struct ctl_table with a flags element we make
sure that there are no forward recursions on child nodes and therefore set
ourselves up for just using an ARRAY_SIZE. We also added some self tests to
make sure that we do not break anything.
Patchset is separated in 4: parport fixes, selftests fixes, selftests additions and
replacement of child element. Tested everything with sysctl self tests and everything
seems "ok".
1. parport fixes: @mcgrof: this is related to my previous series and it plugs a
sysct table leak in the parport driver. Please tell me if you want me to repost
the parport series with this one stiched in.
2. Selftests fixes: Remove the prefixed zeros when passing a awk field to the
awk print command because it was causing $0009 to be interpreted as $0.
Replaced continue with return in sysctl.sh(test_case) so the test actually
gets skipped. The skip decision is now in sysctl.sh(skip_test).
3. Selftest additions: New test to confirm that unregister actually removes
targets. New test to confirm that permanently empty targets are indeed
created and that no other targets can be created "on top".
4. Replaced the child pointer in struct ctl_table with a u8 flag. The flag
is used to differentiate between permanently empty targets and non-empty ones.
Comments/feedback greatly appreciated
Best
Joel
Joel Granados (8):
parport: plug a sysctl register leak
test_sysctl: Fix test metadata getters
test_sysctl: Group node sysctl test under one func
test_sysctl: Add an unregister sysctl test
test_sysctl: Add an option to prevent test skip
test_sysclt: Test for registering a mount point
sysctl: Remove debugging dump_stack
sysctl: replace child with a flags var
drivers/parport/procfs.c | 23 ++---
fs/proc/proc_sysctl.c | 82 ++++------------
include/linux/sysctl.h | 4 +-
lib/test_sysctl.c | 91 ++++++++++++++++--
tools/testing/selftests/sysctl/sysctl.sh | 115 +++++++++++++++++------
5 files changed, 204 insertions(+), 111 deletions(-)
--
2.30.2
From: Jeff Xu <jeffxu(a)google.com>
This is the first set of Memory mapping (VMA) protection patches using PKU.
* * *
Background:
As discussed previously in the kernel mailing list [1], V8 CFI [2] uses
PKU to protect memory, and Stephen Röttger proposes to extend the PKU to
memory mapping [3].
We're using PKU for in-process isolation to enforce control-flow integrity
for a JIT compiler. In our threat model, an attacker exploits a
vulnerability and has arbitrary read/write access to the whole process
space concurrently to other threads being executed. This attacker can
manipulate some arguments to syscalls from some threads.
Under such a powerful attack, we want to create a “safe/isolated”
thread environment. We assign dedicated PKUs to this thread,
and use those PKUs to protect the threads’ runtime environment.
The thread has exclusive access to its run-time memory. This
includes modifying the protection of the memory mapping, or
munmap the memory mapping after use. And the other threads
won’t be able to access the memory or modify the memory mapping
(VMA) belonging to the thread.
* * *
Proposed changes:
This patch introduces a new flag, PKEY_ENFORCE_API, to the pkey_alloc()
function. When a PKEY is created with this flag, it is enforced that any
thread that wants to make changes to the memory mapping (such as mprotect)
of the memory must have write access to the PKEY. PKEYs created without
this flag will continue to work as they do now, for backwards
compatibility.
Only PKEY created from user space can have the new flag set, the PKEY
allocated by the kernel internally will not have it. In other words,
ARCH_DEFAULT_PKEY(0) and execute_only_pkey won’t have this flag set,
and continue work as today.
This flag is checked only at syscall entry, such as mprotect/munmap in
this set of patches. It will not apply to other call paths. In other
words, if the kernel want to change attributes of VMA for some reasons,
the kernel is free to do that and not affected by this new flag.
This set of patch covers mprotect/munmap, I plan to work on other
syscalls after this.
* * *
Testing:
I have tested this patch on a Linux kernel 5.15, 6,1, and 6.4-rc1,
new selftest is added in: pkey_enforce_api.c
* * *
Discussion:
We believe that this patch provides a valuable security feature.
It allows us to create “safe/isolated” thread environments that are
protected from attackers with arbitrary read/write access to
the process space.
We believe that the interface change and the patch don't
introduce backwards compatibility risk.
We would like to disucss this patch in Linux kernel community
for feedback and support.
* * *
Reference:
[1]https://lore.kernel.org/all/202208221331.71C50A6F@keescook/
[2]https://docs.google.com/document/d/1O2jwK4dxI3nRcOJuPYkonhTkNQfbmwdvxQMyX…
[3]https://docs.google.com/document/d/1qqVoVfRiF2nRylL3yjZyCQvzQaej1HRPh3f5w…
Best Regards,
-Jeff Xu
Jeff Xu (6):
PKEY: Introduce PKEY_ENFORCE_API flag
PKEY: Add arch_check_pkey_enforce_api()
PKEY: Apply PKEY_ENFORCE_API to mprotect
PKEY:selftest pkey_enforce_api for mprotect
KEY: Apply PKEY_ENFORCE_API to munmap
PKEY:selftest pkey_enforce_api for munmap
arch/powerpc/include/asm/pkeys.h | 19 +-
arch/x86/include/asm/mmu.h | 7 +
arch/x86/include/asm/pkeys.h | 92 +-
arch/x86/mm/pkeys.c | 2 +-
include/linux/mm.h | 2 +-
include/linux/pkeys.h | 18 +-
include/uapi/linux/mman.h | 5 +
mm/mmap.c | 34 +-
mm/mprotect.c | 31 +-
mm/mremap.c | 6 +-
tools/testing/selftests/mm/Makefile | 1 +
tools/testing/selftests/mm/pkey_enforce_api.c | 1312 +++++++++++++++++
12 files changed, 1507 insertions(+), 22 deletions(-)
create mode 100644 tools/testing/selftests/mm/pkey_enforce_api.c
base-commit: ba0ad6ed89fd5dada3b7b65ef2b08e95d449d4ab
--
2.40.1.606.ga4b1b128d6-goog
This patch updates the cgroup-v2.rst file to include information about
the new "cpuset.cpus.reserve" control file as well as the new remote
partition.
Signed-off-by: Waiman Long <longman(a)redhat.com>
---
Documentation/admin-guide/cgroup-v2.rst | 92 +++++++++++++++++++++----
1 file changed, 79 insertions(+), 13 deletions(-)
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index f67c0829350b..3e9351c2cd27 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -2215,6 +2215,38 @@ Cpuset Interface Files
Its value will be affected by memory nodes hotplug events.
+ cpuset.cpus.reserve
+ A read-write multiple values file which exists only on root
+ cgroup.
+
+ It lists all the CPUs that are reserved for adjacent and remote
+ partitions created in the system. See the next section for
+ more information on what an adjacent or remote partitions is.
+
+ Creation of adjacent partition does not require touching this
+ control file as CPU reservation will be done automatically.
+ In order to create a remote partition, the CPUs needed by the
+ remote partition has to be written to this file first.
+
+ Due to the fact that "cpuset.cpus.reserve" holds reserve CPUs
+ that can be used by multiple partitions and automatic reservation
+ may also race with manual reservation, an extension prefixes of
+ "+" and "-" are allowed for this file to reduce race.
+
+ A "+" prefix can be used to indicate a list of additional
+ CPUs that are to be added without disturbing the CPUs that are
+ originally there. For example, if its current value is "3-4",
+ echoing ""+5" to it will change it to "3-5".
+
+ Once a remote partition is destroyed, its CPUs have to be
+ removed from this file or no other process can use them. A "-"
+ prefix can be used to remove a list of CPUs from it. However,
+ removing CPUs that are currently used in existing partitions
+ may cause those partitions to become invalid. A single "-"
+ character without any number can be used to indicate removal
+ of all the free CPUs not yet allocated to any partitions to
+ avoid accidental partition invalidation.
+
cpuset.cpus.partition
A read-write single value file which exists on non-root
cpuset-enabled cgroups. This flag is owned by the parent cgroup
@@ -2228,25 +2260,49 @@ Cpuset Interface Files
"isolated" Partition root without load balancing
========== =====================================
- The root cgroup is always a partition root and its state
- cannot be changed. All other non-root cgroups start out as
- "member".
+ A cpuset partition is a collection of cgroups with a partition
+ root at the top of the hierarchy and its descendants except
+ those that are separate partition roots themselves and their
+ descendants. A partition has exclusive access to the set of
+ CPUs allocated to it. Other cgroups outside of that partition
+ cannot use any CPUs in that set.
+
+ There are two types of partitions - adjacent and remote. The
+ parent of an adjacent partition must be a valid partition root.
+ Partition roots of adjacent partitions are all clustered around
+ the root cgroup. Creation of adjacent partition is done by
+ writing the desired partition type into "cpuset.cpus.partition".
+
+ A remote partition does not require a partition root parent.
+ So a remote partition can be formed far from the root cgroup.
+ However, its creation is a 2-step process. The CPUs needed
+ by a remote partition ("cpuset.cpus" of the partition root)
+ has to be written into "cpuset.cpus.reserve" of the root
+ cgroup first. After that, "isolated" can be written into
+ "cpuset.cpus.partition" of the partition root to form a remote
+ isolated partition which is the only supported remote partition
+ type for now.
+
+ All remote partitions are terminal as adjacent partition cannot
+ be created underneath it. With the way remote partition is
+ formed, it is not possible to create another valid remote
+ partition underneath it.
+
+ The root cgroup is always a partition root and its state cannot
+ be changed. All other non-root cgroups start out as "member".
When set to "root", the current cgroup is the root of a new
- partition or scheduling domain that comprises itself and all
- its descendants except those that are separate partition roots
- themselves and their descendants.
+ partition or scheduling domain.
- When set to "isolated", the CPUs in that partition root will
+ When set to "isolated", the CPUs in that partition will
be in an isolated state without any load balancing from the
scheduler. Tasks placed in such a partition with multiple
CPUs should be carefully distributed and bound to each of the
individual CPUs for optimal performance.
- The value shown in "cpuset.cpus.effective" of a partition root
- is the CPUs that the partition root can dedicate to a potential
- new child partition root. The new child subtracts available
- CPUs from its parent "cpuset.cpus.effective".
+ The value shown in "cpuset.cpus.effective" of a partition root is
+ the CPUs that are dedicated to that partition and not available
+ to cgroups outside of that partittion.
A partition root ("root" or "isolated") can be in one of the
two possible states - valid or invalid. An invalid partition
@@ -2270,8 +2326,8 @@ Cpuset Interface Files
In the case of an invalid partition root, a descriptive string on
why the partition is invalid is included within parentheses.
- For a partition root to become valid, the following conditions
- must be met.
+ For an adjacent partition root to be valid, the following
+ conditions must be met.
1) The "cpuset.cpus" is exclusive with its siblings , i.e. they
are not shared by any of its siblings (exclusivity rule).
@@ -2281,6 +2337,16 @@ Cpuset Interface Files
4) The "cpuset.cpus.effective" cannot be empty unless there is
no task associated with this partition.
+ For a remote partition root to be valid, the following conditions
+ must be met.
+
+ 1) The same exclusivity rule as adjacent partition root.
+ 2) The "cpuset.cpus" is not empty and all the CPUs must be
+ present in "cpuset.cpus.reserve" of the root cgroup and none
+ of them are allocated to another partition.
+ 3) The "cpuset.cpus" value must be present in all its ancestors
+ to ensure proper hierarchical cpu distribution.
+
External events like hotplug or changes to "cpuset.cpus" can
cause a valid partition root to become invalid and vice versa.
Note that a task cannot be moved to a cgroup with empty
--
2.31.1
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit dbcf76390eb9a65d5d0c37b0cd57335218564e37 ]
The ftrace selftests do not currently produce KTAP output, they produce a
custom format much nicer for human consumption. This means that when run in
automated test systems we just get a single result for the suite as a whole
rather than recording results for individual test cases, making it harder
to look at the test data and masking things like inappropriate skips.
Address this by adding support for KTAP output to the ftracetest script and
providing a trivial wrapper which will be invoked by the kselftest runner
to generate output in this format by default, users using ftracetest
directly will continue to get the existing output.
This is not the most elegant solution but it is simple and effective. I
did consider implementing this by post processing the existing output
format but that felt more complex and likely to result in all output being
lost if something goes seriously wrong during the run which would not be
helpful. I did also consider just writing a separate runner script but
there's enough going on with things like the signal handling for that to
seem like it would be duplicating too much.
Acked-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Tested-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Shuah Khan <skhan(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/ftrace/Makefile | 3 +-
tools/testing/selftests/ftrace/ftracetest | 63 ++++++++++++++++++-
.../testing/selftests/ftrace/ftracetest-ktap | 8 +++
3 files changed, 70 insertions(+), 4 deletions(-)
create mode 100755 tools/testing/selftests/ftrace/ftracetest-ktap
diff --git a/tools/testing/selftests/ftrace/Makefile b/tools/testing/selftests/ftrace/Makefile
index d6e106fbce11c..a1e955d2de4cc 100644
--- a/tools/testing/selftests/ftrace/Makefile
+++ b/tools/testing/selftests/ftrace/Makefile
@@ -1,7 +1,8 @@
# SPDX-License-Identifier: GPL-2.0
all:
-TEST_PROGS := ftracetest
+TEST_PROGS_EXTENDED := ftracetest
+TEST_PROGS := ftracetest-ktap
TEST_FILES := test.d settings
EXTRA_CLEAN := $(OUTPUT)/logs/*
diff --git a/tools/testing/selftests/ftrace/ftracetest b/tools/testing/selftests/ftrace/ftracetest
index c3311c8c40890..2506621e75dfb 100755
--- a/tools/testing/selftests/ftrace/ftracetest
+++ b/tools/testing/selftests/ftrace/ftracetest
@@ -13,6 +13,7 @@ echo "Usage: ftracetest [options] [testcase(s)] [testcase-directory(s)]"
echo " Options:"
echo " -h|--help Show help message"
echo " -k|--keep Keep passed test logs"
+echo " -K|--ktap Output in KTAP format"
echo " -v|--verbose Increase verbosity of test messages"
echo " -vv Alias of -v -v (Show all results in stdout)"
echo " -vvv Alias of -v -v -v (Show all commands immediately)"
@@ -85,6 +86,10 @@ parse_opts() { # opts
KEEP_LOG=1
shift 1
;;
+ --ktap|-K)
+ KTAP=1
+ shift 1
+ ;;
--verbose|-v|-vv|-vvv)
if [ $VERBOSE -eq -1 ]; then
usage "--console can not use with --verbose"
@@ -178,6 +183,7 @@ TEST_DIR=$TOP_DIR/test.d
TEST_CASES=`find_testcases $TEST_DIR`
LOG_DIR=$TOP_DIR/logs/`date +%Y%m%d-%H%M%S`/
KEEP_LOG=0
+KTAP=0
DEBUG=0
VERBOSE=0
UNSUPPORTED_RESULT=0
@@ -229,7 +235,7 @@ prlog() { # messages
newline=
shift
fi
- printf "$*$newline"
+ [ "$KTAP" != "1" ] && printf "$*$newline"
[ "$LOG_FILE" ] && printf "$*$newline" | strip_esc >> $LOG_FILE
}
catlog() { #file
@@ -260,11 +266,11 @@ TOTAL_RESULT=0
INSTANCE=
CASENO=0
+CASENAME=
testcase() { # testfile
CASENO=$((CASENO+1))
- desc=`grep "^#[ \t]*description:" $1 | cut -f2- -d:`
- prlog -n "[$CASENO]$INSTANCE$desc"
+ CASENAME=`grep "^#[ \t]*description:" $1 | cut -f2- -d:`
}
checkreq() { # testfile
@@ -277,40 +283,68 @@ test_on_instance() { # testfile
grep -q "^#[ \t]*flags:.*instance" $1
}
+ktaptest() { # result comment
+ if [ "$KTAP" != "1" ]; then
+ return
+ fi
+
+ local result=
+ if [ "$1" = "1" ]; then
+ result="ok"
+ else
+ result="not ok"
+ fi
+ shift
+
+ local comment=$*
+ if [ "$comment" != "" ]; then
+ comment="# $comment"
+ fi
+
+ echo $CASENO $result $INSTANCE$CASENAME $comment
+}
+
eval_result() { # sigval
case $1 in
$PASS)
prlog " [${color_green}PASS${color_reset}]"
+ ktaptest 1
PASSED_CASES="$PASSED_CASES $CASENO"
return 0
;;
$FAIL)
prlog " [${color_red}FAIL${color_reset}]"
+ ktaptest 0
FAILED_CASES="$FAILED_CASES $CASENO"
return 1 # this is a bug.
;;
$UNRESOLVED)
prlog " [${color_blue}UNRESOLVED${color_reset}]"
+ ktaptest 0 UNRESOLVED
UNRESOLVED_CASES="$UNRESOLVED_CASES $CASENO"
return $UNRESOLVED_RESULT # depends on use case
;;
$UNTESTED)
prlog " [${color_blue}UNTESTED${color_reset}]"
+ ktaptest 1 SKIP
UNTESTED_CASES="$UNTESTED_CASES $CASENO"
return 0
;;
$UNSUPPORTED)
prlog " [${color_blue}UNSUPPORTED${color_reset}]"
+ ktaptest 1 SKIP
UNSUPPORTED_CASES="$UNSUPPORTED_CASES $CASENO"
return $UNSUPPORTED_RESULT # depends on use case
;;
$XFAIL)
prlog " [${color_green}XFAIL${color_reset}]"
+ ktaptest 1 XFAIL
XFAILED_CASES="$XFAILED_CASES $CASENO"
return 0
;;
*)
prlog " [${color_blue}UNDEFINED${color_reset}]"
+ ktaptest 0 error
UNDEFINED_CASES="$UNDEFINED_CASES $CASENO"
return 1 # this must be a test bug
;;
@@ -371,6 +405,7 @@ __run_test() { # testfile
run_test() { # testfile
local testname=`basename $1`
testcase $1
+ prlog -n "[$CASENO]$INSTANCE$CASENAME"
if [ ! -z "$LOG_FILE" ] ; then
local testlog=`mktemp $LOG_DIR/${CASENO}-${testname}-log.XXXXXX`
else
@@ -405,6 +440,17 @@ run_test() { # testfile
# load in the helper functions
. $TEST_DIR/functions
+if [ "$KTAP" = "1" ]; then
+ echo "TAP version 13"
+
+ casecount=`echo $TEST_CASES | wc -w`
+ for t in $TEST_CASES; do
+ test_on_instance $t || continue
+ casecount=$((casecount+1))
+ done
+ echo "1..${casecount}"
+fi
+
# Main loop
for t in $TEST_CASES; do
run_test $t
@@ -439,6 +485,17 @@ prlog "# of unsupported: " `echo $UNSUPPORTED_CASES | wc -w`
prlog "# of xfailed: " `echo $XFAILED_CASES | wc -w`
prlog "# of undefined(test bug): " `echo $UNDEFINED_CASES | wc -w`
+if [ "$KTAP" = "1" ]; then
+ echo -n "# Totals:"
+ echo -n " pass:"`echo $PASSED_CASES | wc -w`
+ echo -n " faii:"`echo $FAILED_CASES | wc -w`
+ echo -n " xfail:"`echo $XFAILED_CASES | wc -w`
+ echo -n " xpass:0"
+ echo -n " skip:"`echo $UNTESTED_CASES $UNSUPPORTED_CASES | wc -w`
+ echo -n " error:"`echo $UNRESOLVED_CASES $UNDEFINED_CASES | wc -w`
+ echo
+fi
+
cleanup
# if no error, return 0
diff --git a/tools/testing/selftests/ftrace/ftracetest-ktap b/tools/testing/selftests/ftrace/ftracetest-ktap
new file mode 100755
index 0000000000000..b3284679ef3af
--- /dev/null
+++ b/tools/testing/selftests/ftrace/ftracetest-ktap
@@ -0,0 +1,8 @@
+#!/bin/sh -e
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# ftracetest-ktap: Wrapper to integrate ftracetest with the kselftest runner
+#
+# Copyright (C) Arm Ltd., 2023
+
+./ftracetest -K
--
2.39.2
Dzień dobry,
zapoznałem się z Państwa ofertą i z przyjemnością przyznaję, że przyciąga uwagę i zachęca do dalszych rozmów.
Pomyślałem, że może mógłbym mieć swój wkład w Państwa rozwój i pomóc dotrzeć z tą ofertą do większego grona odbiorców. Pozycjonuję strony www, dzięki czemu generują świetny ruch w sieci.
Możemy porozmawiać w najbliższym czasie?
Pozdrawiam
Adam Charachuta
One can use "cpuset.cpus.partition" to create multiple scheduling domains
or to produce a set of isolated CPUs where load balancing is disabled.
The former use case is less common but the latter one can be frequently
used especially for the Telco use cases like DPDK.
The existing "isolated" partition can be used to produce isolated
CPUs if the applications have full control of a system. However, in a
containerized environment where all the apps are run in a container,
it is hard to distribute out isolated CPUs from the root down given
the unified hierarchy nature of cgroup v2.
The container running on isolated CPUs can be several layers down from
the root. The current partition feature requires that all the ancestors
of a leaf partition root must be parititon roots themselves. This can
be hard to configure.
This patch introduces a new type of partition called remote partition.
A remote partition is a partition whose parent is not a partition root
itself and its CPUs are acquired directly from available CPUs in the
top cpuset's cpuset.cpus.reserve. For contrast, the existing type of
partitions where their parents have to be valid partition roots are
referred to as adjacent partitions as they have to be clustered around
the cgroup root.
This patch enables only the creation of remote isolated partitions
for now.
The creation of a remote isolated partition is a 2-step process.
1) Reserve the CPUs needed by the remote partition by adding CPUs to
cpuset.cpus.reserve of the top cpuset.
2) Enable an isolated partition by
# echo isolated > cpuset.cpus.partition
Such a remote isolated partition P will only be valid if the following
conditions are true.
1) P/cpuset.cpus is a subset of top cpuset's cpuset.cpus.reserve.
2) All the CPUs in P/cpuset.cpus are present in the cpuset.cpus of
all its ancestors to ensure that those CPUs are properly granted
to P in a hierarchical manner.
3) None of the CPUs in P/cpuset.cpus have been acquired by other valid
partitions.
Like adjacent partitions, a remote partition has exclusive access to the
CPUs allocated to that partition. Because of the exclusive nature, none
of the cpuset.cpus of its sibling cpusets can contain any CPUs allocated
to the remote partition or the partition creation process will fail.
Signed-off-by: Waiman Long <longman(a)redhat.com>
---
kernel/cgroup/cpuset.c | 306 +++++++++++++++++++++++++++++++++++++++--
1 file changed, 291 insertions(+), 15 deletions(-)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 69abe95a9969..280018cddaba 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -98,6 +98,7 @@ enum prs_errcode {
PERR_NOCPUS,
PERR_HOTPLUG,
PERR_CPUSEMPTY,
+ PERR_RMTPARENT,
};
static const char * const perr_strings[] = {
@@ -108,6 +109,7 @@ static const char * const perr_strings[] = {
[PERR_NOCPUS] = "Parent unable to distribute cpu downstream",
[PERR_HOTPLUG] = "No cpu available due to hotplug",
[PERR_CPUSEMPTY] = "cpuset.cpus is empty",
+ [PERR_RMTPARENT] = "New partition not allowed under remote partition",
};
struct cpuset {
@@ -206,6 +208,9 @@ struct cpuset {
/* Handle for cpuset.cpus.partition */
struct cgroup_file partition_file;
+
+ /* Remote partition silbling list anchored at remote_children */
+ struct list_head remote_sibling;
};
/*
@@ -236,6 +241,9 @@ static cpumask_var_t cs_reserve_cpus; /* Reserved CPUs */
static cpumask_var_t cs_free_reserve_cpus; /* Unallocated reserved CPUs */
static cpumask_var_t cs_tmp_cpus; /* Temp cpumask for partition */
+/* List of remote partition root children */
+static struct list_head remote_children;
+
/*
* Partition root states:
*
@@ -385,6 +393,8 @@ static struct cpuset top_cpuset = {
.flags = ((1 << CS_ONLINE) | (1 << CS_CPU_EXCLUSIVE) |
(1 << CS_MEM_EXCLUSIVE)),
.partition_root_state = PRS_ROOT,
+ .remote_sibling = LIST_HEAD_INIT(top_cpuset.remote_sibling),
+
};
/**
@@ -1385,6 +1395,209 @@ static void update_partition_sd_lb(struct cpuset *cs, int old_prs)
rebuild_sched_domains_locked();
}
+static inline bool is_remote_partition(struct cpuset *cs)
+{
+ return !list_empty(&cs->remote_sibling);
+}
+
+/*
+ * update_isolated_cpumasks_hier - Update effective cpumasks and tasks
+ * @cs: the cpuset to consider
+ * @lb: load balance flag
+ *
+ * This is called for descendant cpusets when a cpuset switches to or
+ * from an isolated remote partition. There can't be any remote partitions
+ * underneath it.
+ */
+static void update_isolated_cpumasks_hier(struct cpuset *cs, bool lb)
+{
+ struct cpuset *cp;
+ struct cgroup_subsys_state *pos_css;
+
+ rcu_read_lock();
+ cpuset_for_each_descendant_pre(cp, pos_css, cs) {
+ struct cpuset *parent = parent_cs(cp);
+
+ if (cp == cs)
+ continue; /* Skip partition root */
+
+ WARN_ON_ONCE(is_partition_valid(cp));
+ spin_lock_irq(&callback_lock);
+
+ if (cpumask_and(cp->effective_cpus, cp->cpus_allowed,
+ parent->effective_cpus)) {
+ if (cp->use_parent_ecpus) {
+ WARN_ON_ONCE(--parent->child_ecpus_count < 0);
+ cp->use_parent_ecpus = false;
+ }
+ } else {
+ cpumask_copy(cp->effective_cpus, parent->effective_cpus);
+ if (!cp->use_parent_ecpus) {
+ parent->child_ecpus_count++;
+ cp->use_parent_ecpus = true;
+ }
+ }
+ if (lb)
+ set_bit(CS_SCHED_LOAD_BALANCE, &cp->flags);
+ else
+ clear_bit(CS_SCHED_LOAD_BALANCE, &cp->flags);
+
+ spin_unlock_irq(&callback_lock);
+ }
+ rcu_read_unlock();
+}
+
+/*
+ * isolated_cpus_acquire - Acquire isolated CPUs from cpuset.cpus.reserve
+ * @cs: the cpuset to update
+ * Return: 1 if successful, 0 if error
+ *
+ * Acquire isolated CPUs from cpuset.cpus.reserve and become an isolated
+ * partition root. cpuset_mutex must be held by the caller.
+ *
+ * Note that freely available reserve CPUs have already been isolated, so
+ * we don't need to rebuild sched domains. Since the cpuset is likely
+ * using effective_cpus from its parent before the conversion, we have to
+ * update parent's child_ecpus_count accordingly.
+ */
+static int isolated_cpus_acquire(struct cpuset *cs)
+{
+ struct cpuset *ancestor, *parent;
+
+ ancestor = parent = parent_cs(cs);
+
+ /*
+ * To enable acquiring of isolated CPUs from cpuset.cpus.reserve,
+ * cpus_allowed must be a subset of both its ancestor's cpus_allowed
+ * and cs_free_reserve_cpus and the user must have sysadmin privilege.
+ */
+ if (!capable(CAP_SYS_ADMIN) ||
+ !cpumask_subset(cs->cpus_allowed, cs_free_reserve_cpus))
+ return 0;
+
+ /*
+ * Check cpus_allowed of all its ancestors, except top_cpuset.
+ */
+ while (ancestor != &top_cpuset) {
+ if (!cpumask_subset(cs->cpus_allowed, ancestor->cpus_allowed))
+ return 0;
+ ancestor = parent_cs(ancestor);
+ }
+
+ spin_lock_irq(&callback_lock);
+ cpumask_andnot(cs_free_reserve_cpus,
+ cs_free_reserve_cpus, cs->cpus_allowed);
+ cpumask_and(cs->effective_cpus, cs->cpus_allowed, cpu_active_mask);
+
+ if (cs->use_parent_ecpus) {
+ cs->use_parent_ecpus = false;
+ parent->child_ecpus_count--;
+ }
+ list_add(&cs->remote_sibling, &remote_children);
+ clear_bit(CS_SCHED_LOAD_BALANCE, &cs->flags);
+ spin_unlock_irq(&callback_lock);
+
+ if (!list_empty(&cs->css.children))
+ update_isolated_cpumasks_hier(cs, false);
+
+ return 1;
+}
+
+/*
+ * isolated_cpus_release - Release isolated CPUs back to cpuset.cpus.reserve
+ * @cs: the cpuset to update
+ *
+ * Release isolated CPUs back to cpuset.cpus.reserve.
+ * cpuset_mutex must be held by the caller.
+ */
+static void isolated_cpus_release(struct cpuset *cs)
+{
+ struct cpuset *parent = parent_cs(cs);
+
+ if (!is_remote_partition(cs))
+ return;
+
+ /*
+ * This can be called when the cpu list in cs_reserve_cpus
+ * is reduced. So not all the cpus should be returned back to
+ * cs_free_reserve_cpus.
+ */
+ WARN_ON_ONCE(cs->partition_root_state != PRS_ISOLATED);
+ WARN_ON_ONCE(!cpumask_subset(cs->cpus_allowed, cs_reserve_cpus));
+ spin_lock_irq(&callback_lock);
+ if (!cpumask_and(cs->effective_cpus,
+ parent->effective_cpus, cs->cpus_allowed)) {
+ cs->use_parent_ecpus = true;
+ parent->child_ecpus_count++;
+ cpumask_copy(cs->effective_cpus, parent->effective_cpus);
+ }
+ list_del_init(&cs->remote_sibling);
+ cs->partition_root_state = PRS_INVALID_ISOLATED;
+ if (!cs->prs_err)
+ cs->prs_err = PERR_INVCPUS;
+
+ /* Add the CPUs back to cs_free_reserve_cpus */
+ cpumask_or(cs_free_reserve_cpus,
+ cs_free_reserve_cpus, cs->cpus_allowed);
+
+ /*
+ * There is no change in the CPU load balance state that requires
+ * rebuilding sched domains. So the flags bits can be set directly.
+ */
+ set_bit(CS_SCHED_LOAD_BALANCE, &cs->flags);
+ clear_bit(CS_CPU_EXCLUSIVE, &cs->flags);
+ spin_unlock_irq(&callback_lock);
+
+ if (!list_empty(&cs->css.children))
+ update_isolated_cpumasks_hier(cs, true);
+}
+
+/*
+ * isolated_cpus_update - cpuset.cpus change in a remote isolated partition
+ *
+ * Return: 1 if successful, 0 if it needs to become invalid.
+ */
+static int isolated_cpus_update(struct cpuset *cs, struct cpumask *newmask,
+ struct tmpmasks *tmp)
+{
+ bool adding, deleting;
+
+ if (WARN_ON_ONCE((cs->partition_root_state != PRS_ISOLATED) ||
+ !is_remote_partition(cs)))
+ return 0;
+
+ if (cpumask_empty(newmask))
+ goto invalidate;
+
+ adding = cpumask_andnot(tmp->addmask, newmask, cs->cpus_allowed);
+ deleting = cpumask_andnot(tmp->delmask, cs->cpus_allowed, newmask);
+
+ /*
+ * Additions of isolation CPUs is only allowed if those CPUs are
+ * in cs_free_reserve_cpus and the caller has sysadmin privilege.
+ */
+ if (adding && (!capable(CAP_SYS_ADMIN) ||
+ !cpumask_subset(tmp->addmask, cs_free_reserve_cpus)))
+ goto invalidate;
+
+ spin_lock_irq(&callback_lock);
+ if (adding)
+ cpumask_andnot(cs_free_reserve_cpus,
+ cs_free_reserve_cpus, tmp->addmask);
+ if (deleting)
+ cpumask_or(cs_free_reserve_cpus,
+ cs_free_reserve_cpus, tmp->delmask);
+ cpumask_copy(cs->cpus_allowed, newmask);
+ cpumask_andnot(cs->effective_cpus, newmask, cs->subparts_cpus);
+ cpumask_and(cs->effective_cpus, cs->effective_cpus, cpu_active_mask);
+ spin_unlock_irq(&callback_lock);
+ return 1;
+
+invalidate:
+ isolated_cpus_release(cs);
+ return 0;
+}
+
/**
* update_parent_subparts_cpumask - update subparts_cpus mask of parent cpuset
* @cs: The cpuset that requests change in partition root state
@@ -1457,9 +1670,12 @@ static int update_parent_subparts_cpumask(struct cpuset *cs, int cmd,
if (cmd == partcmd_enable) {
/*
* Enabling partition root is not allowed if cpus_allowed
- * doesn't overlap parent's cpus_allowed.
+ * doesn't overlap parent's cpus_allowed or if it intersects
+ * cs_free_reserve_cpus since it needs to be a remote partition
+ * in this case.
*/
- if (!cpumask_intersects(cs->cpus_allowed, parent->cpus_allowed))
+ if (!cpumask_intersects(cs->cpus_allowed, parent->cpus_allowed) ||
+ cpumask_intersects(cs->cpus_allowed, cs_free_reserve_cpus))
return PERR_INVCPUS;
/*
@@ -1694,6 +1910,15 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp,
struct cpuset *parent = parent_cs(cp);
bool update_parent = false;
+ /*
+ * Skip remote partition that acquires isolated CPUs directly
+ * from cs_reserve_cpus.
+ */
+ if (is_remote_partition(cp)) {
+ pos_css = css_rightmost_descendant(pos_css);
+ continue;
+ }
+
compute_effective_cpumask(tmp->new_cpus, cp, parent);
/*
@@ -1804,7 +2029,7 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp,
WARN_ON(!is_in_v2_mode() &&
!cpumask_equal(cp->cpus_allowed, cp->effective_cpus));
- update_tasks_cpumask(cp, tmp->new_cpus);
+ update_tasks_cpumask(cp, cp->effective_cpus);
/*
* On legacy hierarchy, if the effective cpumask of any non-
@@ -1946,6 +2171,14 @@ static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs,
return retval;
if (cs->partition_root_state) {
+ /*
+ * Call isolated_cpus_update() to handle valid remote partition
+ */
+ if (is_remote_partition(cs)) {
+ isolated_cpus_update(cs, cs_tmp_cpus, &tmp);
+ goto update_hier;
+ }
+
if (invalidate)
update_parent_subparts_cpumask(cs, partcmd_invalidate,
NULL, &tmp);
@@ -1980,10 +2213,11 @@ static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs,
}
spin_unlock_irq(&callback_lock);
+update_hier:
/* effective_cpus will be updated here */
update_cpumasks_hier(cs, &tmp, false);
- if (cs->partition_root_state) {
+ if (cs->partition_root_state && !is_remote_partition(cs)) {
struct cpuset *parent = parent_cs(cs);
/*
@@ -2072,7 +2306,13 @@ static int update_reserve_cpumask(struct cpuset *trialcs, const char *buf)
* Invalidate remote partitions if necessary
*/
if (deleting) {
- /* TODO */
+ struct cpuset *child, *next;
+
+ list_for_each_entry_safe(child, next, &remote_children,
+ remote_sibling) {
+ if (cpumask_intersects(child->cpus_allowed, tmp.delmask))
+ isolated_cpus_release(child);
+ }
}
/*
@@ -2539,21 +2779,32 @@ static int update_prstate(struct cpuset *cs, int new_prs)
return 0;
/*
- * For a previously invalid partition root, leave it at being
- * invalid if new_prs is not "member".
+ * For a previously invalid partition root, treat it like a "member".
*/
- if (new_prs && is_prs_invalid(old_prs)) {
- cs->partition_root_state = -new_prs;
- return 0;
- }
+ if (new_prs && is_prs_invalid(old_prs))
+ old_prs = PRS_MEMBER;
if (alloc_cpumasks(NULL, &tmpmask))
return -ENOMEM;
+ if ((old_prs == PRS_ISOLATED) && is_remote_partition(cs)) {
+ /* Pre-invalidate a remote isolated partition */
+ isolated_cpus_release(cs);
+ old_prs = PRS_MEMBER;
+ }
+
err = update_partition_exclusive(cs, new_prs);
if (err)
goto out;
+ /*
+ * New partition is not allowed under a remote partition
+ */
+ if (new_prs && is_remote_partition(parent)) {
+ err = PERR_RMTPARENT;
+ goto out;
+ }
+
if (!old_prs) {
/*
* cpus_allowed cannot be empty.
@@ -2565,6 +2816,12 @@ static int update_prstate(struct cpuset *cs, int new_prs)
err = update_parent_subparts_cpumask(cs, partcmd_enable,
NULL, &tmpmask);
+ /*
+ * If an attempt to become adjacent isolated partition fails,
+ * try to become a remote isolated partition instead.
+ */
+ if (err && (new_prs == PRS_ISOLATED) && isolated_cpus_acquire(cs))
+ err = 0; /* Become remote isolated partition */
} else if (old_prs && new_prs) {
/*
* A change in load balance state only, no change in cpumasks.
@@ -3462,6 +3719,7 @@ cpuset_css_alloc(struct cgroup_subsys_state *parent_css)
nodes_clear(cs->effective_mems);
fmeter_init(&cs->fmeter);
cs->relax_domain_level = -1;
+ INIT_LIST_HEAD(&cs->remote_sibling);
/* Set CS_MEMORY_MIGRATE for default hierarchy */
if (cgroup_subsys_on_dfl(cpuset_cgrp_subsys))
@@ -3497,6 +3755,11 @@ static int cpuset_css_online(struct cgroup_subsys_state *css)
cs->effective_mems = parent->effective_mems;
cs->use_parent_ecpus = true;
parent->child_ecpus_count++;
+ /*
+ * Clear CS_SCHED_LOAD_BALANCE if parent is isolated
+ */
+ if (!is_sched_load_balance(parent))
+ clear_bit(CS_SCHED_LOAD_BALANCE, &cs->flags);
}
spin_unlock_irq(&callback_lock);
@@ -3741,6 +4004,7 @@ int __init cpuset_init(void)
fmeter_init(&top_cpuset.fmeter);
set_bit(CS_SCHED_LOAD_BALANCE, &top_cpuset.flags);
top_cpuset.relax_domain_level = -1;
+ INIT_LIST_HEAD(&remote_children);
BUG_ON(!alloc_cpumask_var(&cpus_attach, GFP_KERNEL));
@@ -3873,9 +4137,20 @@ static void cpuset_hotplug_update_tasks(struct cpuset *cs, struct tmpmasks *tmp)
}
parent = parent_cs(cs);
- compute_effective_cpumask(&new_cpus, cs, parent);
nodes_and(new_mems, cs->mems_allowed, parent->effective_mems);
+ /*
+ * In the special case of a valid remote isolated partition.
+ * We just need to mask offline cpus from cpus_allowed unless
+ * all the isolated cpus are gone.
+ */
+ if (is_remote_partition(cs)) {
+ if (!cpumask_and(&new_cpus, cs->cpus_allowed, cpu_active_mask))
+ isolated_cpus_release(cs);
+ } else {
+ compute_effective_cpumask(&new_cpus, cs, parent);
+ }
+
if (cs->nr_subparts_cpus)
/*
* Make sure that CPUs allocated to child partitions
@@ -3906,10 +4181,11 @@ static void cpuset_hotplug_update_tasks(struct cpuset *cs, struct tmpmasks *tmp)
* the following conditions hold:
* 1) empty effective cpus but not valid empty partition.
* 2) parent is invalid or doesn't grant any cpus to child
- * partitions.
+ * partitions and not a remote partition.
*/
- if (is_partition_valid(cs) && (!parent->nr_subparts_cpus ||
- (cpumask_empty(&new_cpus) && partition_is_populated(cs, NULL)))) {
+ if (is_partition_valid(cs) &&
+ ((!parent->nr_subparts_cpus && !is_remote_partition(cs)) ||
+ (cpumask_empty(&new_cpus) && partition_is_populated(cs, NULL)))) {
int old_prs, parent_prs;
update_parent_subparts_cpumask(cs, partcmd_disable, NULL, tmp);
--
2.31.1
A cpuset partition is a collection of cpusets with a partition root
and its descendants from that root downward excluding any cpusets that
are part of other partitions. A partition has exclusive access to a set
of CPUs granted to it. Other cpusets outside of a partition cannot use
any CPUs in that set.
Currently, creation of partitions requires a hierarchical CPUs
distribution model where the parent of a partition root has to be
a partition root itself. Hence all the partition roots have to be
clustered around the cgroup root.
To enable the creation of a remote partition down in the hierarchy
without a parental partition root, we need a way to reserve the CPUs
that will be used in a remote partition. Introduce a new root-only
"cpuset.cpus.reserve" control file in the top cpuset for this particular
purpose.
By default, the new "cpuset.cpus.reserve" control file will track
the subparts_cpus cpumask in the top cpuset. By writing into this new
control file, however, we can reserve additional CPUs that can be used
in a remote partition. Any CPUs that are in "cpuset.cpus.reserve" will
have to be removed from the effective_cpus of all the cpusets that are
not part of that valid partitions.
The prefix "+" and "-" can be used to indicate the addition to or the
subtraction from the existing CPUs in "cpuset.cpus.reserve". A single
"-" character indicate the deletion of all the free reserve CPUs not
allocated to any existing partition.
Signed-off-by: Waiman Long <longman(a)redhat.com>
---
kernel/cgroup/cpuset.c | 253 ++++++++++++++++++++++++++++++++++++++---
1 file changed, 239 insertions(+), 14 deletions(-)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 8604c919e1e4..69abe95a9969 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -208,7 +208,33 @@ struct cpuset {
struct cgroup_file partition_file;
};
-static cpumask_var_t cs_tmp_cpus; /* Temp cpumask for partition */
+/*
+ * Reserved CPUs for partitions.
+ *
+ * By default, CPUs used in partitions are tracked in the parent's
+ * subparts_cpus mask following a hierarchical CPUs distribution model.
+ * To enable the creation of a remote partition down in the hierarchy
+ * without a parental partition root, one can write directly to
+ * cpuset.cpus.reserve in the root cgroup to allocate more CPUs that can
+ * be used by remote partitions. Removal of existing reserved CPUs may
+ * also cause some existing partitions to become invalid.
+ *
+ * All the cpumasks below should only be used with cpuset_mutex held.
+ * Modification of cs_reserve_cpus & cs_free_reserve_cpus also requires
+ * holding the callback_lock.
+ *
+ * Relationship among cs_reserve_cpus, cs_free_reserve_cpus and
+ * top_cpuset.subparts_cpus are:
+ *
+ * top_cpuset.subparts_cpus ⊆ cs_reserve_cpus
+ * cs_free_reserve_cpus ⊆ cs_reserve_cpus
+ * top_cpuset.subparts_cpus ∩ cs_free_reserve_cpus = ∅
+ * cs_reserve_cpus - cs_free_reserve_cpus - top_cpuset.subparts_cpus
+ * = CPUs dedicated to remote partitions
+ */
+static cpumask_var_t cs_reserve_cpus; /* Reserved CPUs */
+static cpumask_var_t cs_free_reserve_cpus; /* Unallocated reserved CPUs */
+static cpumask_var_t cs_tmp_cpus; /* Temp cpumask for partition */
/*
* Partition root states:
@@ -1202,13 +1228,13 @@ static void rebuild_sched_domains_locked(void)
* should be the same as the active CPUs, so checking only top_cpuset
* is enough to detect racing CPU offlines.
*/
- if (!top_cpuset.nr_subparts_cpus &&
+ if (cpumask_empty(cs_reserve_cpus) &&
!cpumask_equal(top_cpuset.effective_cpus, cpu_active_mask))
return;
/*
* With subpartition CPUs, however, the effective CPUs of a partition
- * root should be only a subset of the active CPUs. Since a CPU in any
+ * root should only be a subset of the active CPUs. Since a CPU in any
* partition root could be offlined, all must be checked.
*/
if (top_cpuset.nr_subparts_cpus) {
@@ -1275,7 +1301,7 @@ static void update_tasks_cpumask(struct cpuset *cs, struct cpumask *new_cpus)
*/
if ((task->flags & PF_KTHREAD) && kthread_is_per_cpu(task))
continue;
- cpumask_andnot(new_cpus, possible_mask, cs->subparts_cpus);
+ cpumask_andnot(new_cpus, possible_mask, cs_reserve_cpus);
} else {
cpumask_and(new_cpus, possible_mask, cs->effective_cpus);
}
@@ -1406,6 +1432,7 @@ static int update_parent_subparts_cpumask(struct cpuset *cs, int cmd,
int deleting; /* Moving cpus from subparts_cpus to effective_cpus */
int old_prs, new_prs;
int part_error = PERR_NONE; /* Partition error? */
+ bool update_reserve = (parent == &top_cpuset);
lockdep_assert_held(&cpuset_mutex);
@@ -1576,7 +1603,7 @@ static int update_parent_subparts_cpumask(struct cpuset *cs, int cmd,
}
/*
- * Change the parent's subparts_cpus.
+ * Change the parent's subparts_cpus and maybe cs_reserve_cpus.
* Newly added CPUs will be removed from effective_cpus and
* newly deleted ones will be added back to effective_cpus.
*/
@@ -1586,10 +1613,25 @@ static int update_parent_subparts_cpumask(struct cpuset *cs, int cmd,
parent->subparts_cpus, tmp->addmask);
cpumask_andnot(parent->effective_cpus,
parent->effective_cpus, tmp->addmask);
+ if (update_reserve) {
+ cpumask_or(cs_reserve_cpus,
+ cs_reserve_cpus, tmp->addmask);
+ cpumask_andnot(cs_free_reserve_cpus,
+ cs_free_reserve_cpus, tmp->addmask);
+ }
}
if (deleting) {
cpumask_andnot(parent->subparts_cpus,
parent->subparts_cpus, tmp->delmask);
+ /*
+ * The automatic cpu reservation of adjacent partition
+ * won't add back the deleted CPUs to cs_free_reserve_cpus.
+ * Instead, they are returned back to effective_cpus of top
+ * cpuset.
+ */
+ if (update_reserve)
+ cpumask_andnot(cs_reserve_cpus,
+ cs_reserve_cpus, tmp->delmask);
/*
* Some of the CPUs in subparts_cpus might have been offlined.
*/
@@ -1783,6 +1825,8 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp,
if (need_rebuild_sched_domains)
rebuild_sched_domains_locked();
+
+ return;
}
/**
@@ -1955,6 +1999,167 @@ static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs,
return 0;
}
+/**
+ * update_reserve_cpumask - update cs_reserve_cpus
+ * @trialcs: trial cpuset
+ * @buf: buffer of cpu numbers written to this cpuset
+ * Return: 0 if successful, < 0 if error
+ */
+static int update_reserve_cpumask(struct cpuset *trialcs, const char *buf)
+{
+ struct cgroup_subsys_state *css;
+ struct cpuset *cs;
+ bool adding, deleting;
+ struct tmpmasks tmp;
+
+ adding = deleting = false;
+ if (*buf == '+') {
+ adding = true;
+ buf++;
+ } else if (*buf == '-') {
+ deleting = true;
+ buf++;
+ }
+
+ if (!*buf) {
+ if (adding)
+ return -EINVAL;
+
+ if (deleting) {
+ if (cpumask_empty(cs_free_reserve_cpus))
+ return 0;
+ cpumask_copy(trialcs->cpus_allowed, cs_free_reserve_cpus);
+ } else {
+ cpumask_clear(trialcs->cpus_allowed);
+ }
+ } else {
+ int retval = cpulist_parse(buf, trialcs->cpus_allowed);
+
+ if (retval < 0)
+ return retval;
+ }
+
+ if (!adding && !deleting &&
+ cpumask_equal(trialcs->cpus_allowed, cs_reserve_cpus))
+ return 0;
+
+ /* Preserve trialcs->cpus_allowed for now */
+ init_tmpmasks(&tmp, NULL, trialcs->subparts_cpus,
+ trialcs->effective_cpus);
+
+ /*
+ * Compute the addition and removal of CPUs to/from cs_reserve_cpus
+ */
+ if (!adding && !deleting) {
+ adding = cpumask_andnot(tmp.addmask, trialcs->cpus_allowed,
+ cs_reserve_cpus);
+ deleting = cpumask_andnot(tmp.delmask, cs_reserve_cpus,
+ trialcs->cpus_allowed);
+ } else if (adding) {
+ adding = cpumask_andnot(tmp.addmask,
+ trialcs->cpus_allowed, cs_reserve_cpus);
+ cpumask_or(trialcs->cpus_allowed, cs_reserve_cpus, tmp.addmask);
+ } else { /* deleting */
+ deleting = cpumask_and(tmp.delmask,
+ trialcs->cpus_allowed, cs_reserve_cpus);
+ cpumask_andnot(trialcs->cpus_allowed, cs_reserve_cpus, tmp.delmask);
+ }
+
+ if (!adding && !deleting)
+ return 0;
+
+ /*
+ * Invalidate remote partitions if necessary
+ */
+ if (deleting) {
+ /* TODO */
+ }
+
+ /*
+ * Cannot use up all the CPUs in top_cpuset.effective_cpus
+ */
+ if (!deleting && adding &&
+ cpumask_subset(top_cpuset.effective_cpus, tmp.addmask))
+ return -EINVAL;
+
+ spin_lock_irq(&callback_lock);
+ /*
+ * Update top_cpuset.effective_cpus, cs_reserve_cpus &
+ * cs_free_reserve_cpus.
+ */
+ if (adding)
+ cpumask_or(cs_free_reserve_cpus, cs_free_reserve_cpus,
+ tmp.addmask);
+ cpumask_copy(cs_reserve_cpus, trialcs->cpus_allowed);
+ cpumask_andnot(top_cpuset.effective_cpus,
+ cpu_active_mask, cs_reserve_cpus);
+
+ /*
+ * Remove CPUs from cs_free_reserve_cpus first. Anything left
+ * means some partitions has to be made invalid.
+ */
+ if (deleting & cpumask_and(cs_tmp_cpus, cs_free_reserve_cpus,
+ tmp.delmask)) {
+ cpumask_andnot(cs_free_reserve_cpus, cs_free_reserve_cpus,
+ cs_tmp_cpus);
+ deleting = cpumask_andnot(tmp.delmask, tmp.delmask,
+ cs_tmp_cpus);
+ }
+ spin_unlock_irq(&callback_lock);
+
+ /*
+ * Invalidate some adjacent partitions under top cpuset, if necessary
+ */
+ if (deleting && cpumask_and(cs_tmp_cpus, tmp.delmask,
+ top_cpuset.subparts_cpus)) {
+ struct cgroup_subsys_state *css;
+ struct cpuset *cp;
+
+ /*
+ * Temporarily save the remaining CPUs to be deleted in
+ * trialcs->cpus_allowed to be restored back to tmp.delmask
+ * later.
+ */
+ deleting = cpumask_andnot(trialcs->cpus_allowed, tmp.delmask,
+ cs_tmp_cpus);
+ rcu_read_lock();
+ cpuset_for_each_child(cp, css, &top_cpuset)
+ if (is_partition_valid(cp) &&
+ cpumask_intersects(cs_tmp_cpus, cp->cpus_allowed)) {
+ rcu_read_unlock();
+ update_parent_subparts_cpumask(cp, partcmd_invalidate, NULL, &tmp);
+ rcu_read_lock();
+ }
+ rcu_read_unlock();
+ if (deleting)
+ cpumask_copy(tmp.delmask, trialcs->cpus_allowed);
+ }
+
+ /* Can now use all of trialcs */
+ init_tmpmasks(&tmp, trialcs->cpus_allowed, trialcs->subparts_cpus,
+ trialcs->effective_cpus);
+
+ /*
+ * Update effective_cpus of all descendants that are not in
+ * partitions and rebuild sched domaiins.
+ */
+ rcu_read_lock();
+ cpuset_for_each_child(cs, css, &top_cpuset) {
+ compute_effective_cpumask(tmp.new_cpus, cs, &top_cpuset);
+ if (cpumask_equal(tmp.new_cpus, cs->effective_cpus))
+ continue;
+ if (!css_tryget_online(&cs->css))
+ continue;
+ rcu_read_unlock();
+ update_cpumasks_hier(cs, &tmp, false);
+ rcu_read_lock();
+ css_put(&cs->css);
+ }
+ rcu_read_unlock();
+ rebuild_sched_domains_locked();
+ return 0;
+}
+
/*
* Migrate memory region from one set of nodes to another. This is
* performed asynchronously as it can be called from process migration path
@@ -2743,6 +2948,7 @@ typedef enum {
FILE_EFFECTIVE_CPULIST,
FILE_EFFECTIVE_MEMLIST,
FILE_SUBPARTS_CPULIST,
+ FILE_RESERVE_CPULIST,
FILE_CPU_EXCLUSIVE,
FILE_MEM_EXCLUSIVE,
FILE_MEM_HARDWALL,
@@ -2880,6 +3086,9 @@ static ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
case FILE_CPULIST:
retval = update_cpumask(cs, trialcs, buf);
break;
+ case FILE_RESERVE_CPULIST:
+ retval = update_reserve_cpumask(trialcs, buf);
+ break;
case FILE_MEMLIST:
retval = update_nodemask(cs, trialcs, buf);
break;
@@ -2927,6 +3136,9 @@ static int cpuset_common_seq_show(struct seq_file *sf, void *v)
case FILE_EFFECTIVE_MEMLIST:
seq_printf(sf, "%*pbl\n", nodemask_pr_args(&cs->effective_mems));
break;
+ case FILE_RESERVE_CPULIST:
+ seq_printf(sf, "%*pbl\n", cpumask_pr_args(cs_reserve_cpus));
+ break;
case FILE_SUBPARTS_CPULIST:
seq_printf(sf, "%*pbl\n", cpumask_pr_args(cs->subparts_cpus));
break;
@@ -3200,6 +3412,14 @@ static struct cftype dfl_files[] = {
.file_offset = offsetof(struct cpuset, partition_file),
},
+ {
+ .name = "cpus.reserve",
+ .seq_show = cpuset_common_seq_show,
+ .write = cpuset_write_resmask,
+ .private = FILE_RESERVE_CPULIST,
+ .flags = CFTYPE_ONLY_ON_ROOT,
+ },
+
{
.name = "cpus.subpartitions",
.seq_show = cpuset_common_seq_show,
@@ -3510,6 +3730,8 @@ int __init cpuset_init(void)
BUG_ON(!alloc_cpumask_var(&top_cpuset.effective_cpus, GFP_KERNEL));
BUG_ON(!zalloc_cpumask_var(&top_cpuset.subparts_cpus, GFP_KERNEL));
BUG_ON(!zalloc_cpumask_var(&cs_tmp_cpus, GFP_KERNEL));
+ BUG_ON(!zalloc_cpumask_var(&cs_reserve_cpus, GFP_KERNEL));
+ BUG_ON(!zalloc_cpumask_var(&cs_free_reserve_cpus, GFP_KERNEL));
cpumask_setall(top_cpuset.cpus_allowed);
nodes_setall(top_cpuset.mems_allowed);
@@ -3788,10 +4010,10 @@ static void cpuset_hotplug_workfn(struct work_struct *work)
mems_updated = !nodes_equal(top_cpuset.effective_mems, new_mems);
/*
- * In the rare case that hotplug removes all the cpus in subparts_cpus,
+ * In the rare case that hotplug removes all the reserve cpus,
* we assumed that cpus are updated.
*/
- if (!cpus_updated && top_cpuset.nr_subparts_cpus)
+ if (!cpus_updated && !cpumask_empty(cs_reserve_cpus))
cpus_updated = true;
/* synchronize cpus_allowed to cpu_active_mask */
@@ -3801,18 +4023,21 @@ static void cpuset_hotplug_workfn(struct work_struct *work)
cpumask_copy(top_cpuset.cpus_allowed, &new_cpus);
/*
* Make sure that CPUs allocated to child partitions
- * do not show up in effective_cpus. If no CPU is left,
- * we clear the subparts_cpus & let the child partitions
- * fight for the CPUs again.
+ * do not show up in top_cpuset's effective_cpus. In the
+ * unlikely event tht no effective CPU is left in top_cpuset,
+ * we clear all the reserve cpus and let the non-remote child
+ * partitions fight for the CPUs again.
*/
- if (top_cpuset.nr_subparts_cpus) {
- if (cpumask_subset(&new_cpus,
- top_cpuset.subparts_cpus)) {
+ if (!cpumask_empty(cs_reserve_cpus)) {
+
+ if (cpumask_subset(&new_cpus, cs_reserve_cpus)) {
top_cpuset.nr_subparts_cpus = 0;
cpumask_clear(top_cpuset.subparts_cpus);
+ cpumask_clear(cs_free_reserve_cpus);
+ cpumask_clear(cs_reserve_cpus);
} else {
cpumask_andnot(&new_cpus, &new_cpus,
- top_cpuset.subparts_cpus);
+ cs_reserve_cpus);
}
}
cpumask_copy(top_cpuset.effective_cpus, &new_cpus);
--
2.31.1
v2:
- [v1] https://lore.kernel.org/lkml/20230412153758.3088111-1-longman@redhat.com/
- Dropped the special "isolcpus" partition in v1
- Add the root only "cpuset.cpus.reserve" control file for reserving
CPUs used for remote isolated partitions.
- Update the test_cpuset_prs.sh test script and documentation
accordingly.
This patch series introduces a new category of cpuset partition called
remote partitions. The existing partition category where the partition
roots have to be clustered around the root cgroup in a hierarchical way
is now referred to as adjacent partitions.
A remote partition can be formed far from the root cgroup with no
partition root parent. The only commonality is that the CPUs that are
used in the partition as specified in "cpuset.cpus" have to be present
in the "cpuset.cpus" of all its ancestors.
It is relatively rare to have applications that require creation of
a separate scheduling domain (root). However, it is more common to
have applications that require the use of isolated CPUs (isolated),
e.g. DPDK. One can use the "isolcpus" or "nohz_full" boot command options
to get that statically. Of course, the "isolated" partition is another
way to achieve that dynamically.
Modern container orchestration tools like Kubernetes use the cgroup
hierarchy to manage different containers. And it is relying on other
middleware like systemd to help managing it. If a container needs to
use isolated CPUs, it is hard to get those with the adjacent partitions
as it will require the administrative parent cgroup to be a partition
root too which tool like systemd may not be ready to manage.
With this patch series, a new root cgroup only "cpuset.cpus.reserve"
file is added to specify the set of CPUs that can be used in partitions
(whether remote or adjacent). To create a remote partition, the set
of CPUs to be used in that partition (the "cpuset.cpus" file of the
partition root) has to be reserved by manually adding them to that
control file first. Then that partition can be activated by writing
"isolated" into its "cpuset.cpus.partition". CPU reservation of adjacent
partitions is done automatically without touching "cpuset.cpus.reserve"
at all.
Currently only remote isolated partitions are supported, we could
support a scheduling partition ("root") in the future if the need arises.
Additional isolation attributes like those with the "isolcpus" or "nohz"
boot command line options may be supported in the isolated partitions
in the future.
Waiman Long (6):
cgroup/cpuset: Extract out CS_CPU_EXCLUSIVE & CS_SCHED_LOAD_BALANCE
handling
cgroup/cpuset: Improve temporary cpumasks handling
cgroup/cpuset: Add cpuset.cpus.reserve for top cpuset
cgroup/cpuset: Introduce remote isolated partition
cgroup/cpuset: Documentation update for partition
cgroup/cpuset: Extend test_cpuset_prs.sh to test remote partition
Documentation/admin-guide/cgroup-v2.rst | 92 ++-
kernel/cgroup/cpuset.c | 749 +++++++++++++++---
.../selftests/cgroup/test_cpuset_prs.sh | 403 ++++++----
3 files changed, 988 insertions(+), 256 deletions(-)
--
2.31.1
From: Mirsad Todorovac <mirsad.todorovac(a)alu.unizg.hr>
[ Upstream commit 976d3c6778e99390c6d854d140b746d12ea18a51 ]
According to Mirsad the gpio-sim.sh test appears to FAIL in a wrong way
due to missing initialisation of shell variables:
4.2. Bias settings work correctly
cat: /sys/devices/platform/gpio-sim.0/gpiochip18/sim_gpio0/value: No such file or directory
./gpio-sim.sh: line 393: test: =: unary operator expected
bias setting does not work
GPIO gpio-sim test FAIL
After this change the test passed:
4.2. Bias settings work correctly
GPIO gpio-sim test PASS
His testing environment is AlmaLinux 8.7 on Lenovo desktop box with
the latest Linux kernel based on v6.2:
Linux 6.2.0-mglru-kmlk-andy-09238-gd2980d8d8265 x86_64
Suggested-by: Mirsad Todorovac <mirsad.todorovac(a)alu.unizg.hr>
Signed-off-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Tested-by: Mirsad Goran Todorovac <mirsad.todorovac(a)alu.unizg.hr>
Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac(a)alu.unizg.hr>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski(a)linaro.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/gpio/gpio-sim.sh | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/testing/selftests/gpio/gpio-sim.sh b/tools/testing/selftests/gpio/gpio-sim.sh
index 341e3de008968..bf67b23ed29ac 100755
--- a/tools/testing/selftests/gpio/gpio-sim.sh
+++ b/tools/testing/selftests/gpio/gpio-sim.sh
@@ -389,6 +389,9 @@ create_chip chip
create_bank chip bank
set_num_lines chip bank 8
enable_chip chip
+DEVNAME=`configfs_dev_name chip`
+CHIPNAME=`configfs_chip_name chip bank`
+SYSFS_PATH="/sys/devices/platform/$DEVNAME/$CHIPNAME/sim_gpio0/value"
$BASE_DIR/gpio-mockup-cdev -b pull-up /dev/`configfs_chip_name chip bank` 0
test `cat $SYSFS_PATH` = "1" || fail "bias setting does not work"
remove_chip chip
--
2.39.2
From: Mirsad Todorovac <mirsad.todorovac(a)alu.unizg.hr>
[ Upstream commit 976d3c6778e99390c6d854d140b746d12ea18a51 ]
According to Mirsad the gpio-sim.sh test appears to FAIL in a wrong way
due to missing initialisation of shell variables:
4.2. Bias settings work correctly
cat: /sys/devices/platform/gpio-sim.0/gpiochip18/sim_gpio0/value: No such file or directory
./gpio-sim.sh: line 393: test: =: unary operator expected
bias setting does not work
GPIO gpio-sim test FAIL
After this change the test passed:
4.2. Bias settings work correctly
GPIO gpio-sim test PASS
His testing environment is AlmaLinux 8.7 on Lenovo desktop box with
the latest Linux kernel based on v6.2:
Linux 6.2.0-mglru-kmlk-andy-09238-gd2980d8d8265 x86_64
Suggested-by: Mirsad Todorovac <mirsad.todorovac(a)alu.unizg.hr>
Signed-off-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Tested-by: Mirsad Goran Todorovac <mirsad.todorovac(a)alu.unizg.hr>
Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac(a)alu.unizg.hr>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski(a)linaro.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/gpio/gpio-sim.sh | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/testing/selftests/gpio/gpio-sim.sh b/tools/testing/selftests/gpio/gpio-sim.sh
index 9f539d454ee4d..fa2ce2b9dd5fc 100755
--- a/tools/testing/selftests/gpio/gpio-sim.sh
+++ b/tools/testing/selftests/gpio/gpio-sim.sh
@@ -389,6 +389,9 @@ create_chip chip
create_bank chip bank
set_num_lines chip bank 8
enable_chip chip
+DEVNAME=`configfs_dev_name chip`
+CHIPNAME=`configfs_chip_name chip bank`
+SYSFS_PATH="/sys/devices/platform/$DEVNAME/$CHIPNAME/sim_gpio0/value"
$BASE_DIR/gpio-mockup-cdev -b pull-up /dev/`configfs_chip_name chip bank` 0
test `cat $SYSFS_PATH` = "1" || fail "bias setting does not work"
remove_chip chip
--
2.39.2
The kunit_add_action() and related functions named the kunit_action_t
parameter 'func' in early drafts, which was later renamed to 'action'
However, the doc comments were not properly updated.
Fix these to avoid confusion and 'make htmldocs' warnings.
Fixes: b9dce8a1ed3e ("kunit: Add kunit_add_action() to defer a call until test exit")
Reported-by: Stephen Rothwell <sfr(a)canb.auug.org.au>
Closes: https://lore.kernel.org/lkml/20230530151840.16a56460@canb.auug.org.au/
Signed-off-by: David Gow <davidgow(a)google.com>
---
include/kunit/resource.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/include/kunit/resource.h b/include/kunit/resource.h
index b64eb783b1bc..c7383e90f5c9 100644
--- a/include/kunit/resource.h
+++ b/include/kunit/resource.h
@@ -393,7 +393,7 @@ typedef void (kunit_action_t)(void *);
/**
* kunit_add_action() - Call a function when the test ends.
* @test: Test case to associate the action with.
- * @func: The function to run on test exit
+ * @action: The function to run on test exit
* @ctx: Data passed into @func
*
* Defer the execution of a function until the test exits, either normally or
@@ -415,7 +415,7 @@ int kunit_add_action(struct kunit *test, kunit_action_t *action, void *ctx);
/**
* kunit_add_action_or_reset() - Call a function when the test ends.
* @test: Test case to associate the action with.
- * @func: The function to run on test exit
+ * @action: The function to run on test exit
* @ctx: Data passed into @func
*
* Defer the execution of a function until the test exits, either normally or
@@ -441,7 +441,7 @@ int kunit_add_action_or_reset(struct kunit *test, kunit_action_t *action,
/**
* kunit_remove_action() - Cancel a matching deferred action.
* @test: Test case the action is associated with.
- * @func: The deferred function to cancel.
+ * @action: The deferred function to cancel.
* @ctx: The context passed to the deferred function to trigger.
*
* Prevent an action deferred via kunit_add_action() from executing when the
@@ -459,7 +459,7 @@ void kunit_remove_action(struct kunit *test,
/**
* kunit_release_action() - Run a matching action call immediately.
* @test: Test case the action is associated with.
- * @func: The deferred function to trigger.
+ * @action: The deferred function to trigger.
* @ctx: The context passed to the deferred function to trigger.
*
* Execute a function deferred via kunit_add_action()) immediately, rather than
--
2.41.0.rc0.172.g3f132b7071-goog
The sample code has Kconfig for tristate configuration. In the case, it
could be friendly to developers that the code has MODULE_LICENSE, since
the missing MODULE_LICENSE brings error to modpost when the code is built
as loadable kernel module.
Signed-off-by: Takashi Sakamoto <o-takashi(a)sakamocchi.jp>
---
Documentation/dev-tools/kunit/start.rst | 2 ++
1 file changed, 2 insertions(+)
diff --git a/Documentation/dev-tools/kunit/start.rst b/Documentation/dev-tools/kunit/start.rst
index c736613c9b19..d4f99ef94f71 100644
--- a/Documentation/dev-tools/kunit/start.rst
+++ b/Documentation/dev-tools/kunit/start.rst
@@ -250,6 +250,8 @@ Now we are ready to write the test cases.
};
kunit_test_suite(misc_example_test_suite);
+ MODULE_LICENSE("GPL");
+
2. Add the following lines to ``drivers/misc/Kconfig``:
.. code-block:: kconfig
--
2.39.2
User processes register name_args for events. If the same name but different
args event are registered. The trace outputs of second event are printed
as the first event. This is incorrect.
Return EADDRINUSE back to the user process if the same name but different args
event has being registered.
Signed-off-by: sunliming <sunliming(a)kylinos.cn>
---
kernel/trace/trace_events_user.c | 36 +++++++++++++++----
.../selftests/user_events/ftrace_test.c | 6 ++++
2 files changed, 36 insertions(+), 6 deletions(-)
diff --git a/kernel/trace/trace_events_user.c b/kernel/trace/trace_events_user.c
index b1ecd7677642..e90161294698 100644
--- a/kernel/trace/trace_events_user.c
+++ b/kernel/trace/trace_events_user.c
@@ -1753,6 +1753,8 @@ static int user_event_parse(struct user_event_group *group, char *name,
int ret;
u32 key;
struct user_event *user;
+ int argc = 0;
+ char **argv;
/* Prevent dyn_event from racing */
mutex_lock(&event_mutex);
@@ -1760,13 +1762,35 @@ static int user_event_parse(struct user_event_group *group, char *name,
mutex_unlock(&event_mutex);
if (user) {
- *newuser = user;
- /*
- * Name is allocated by caller, free it since it already exists.
- * Caller only worries about failure cases for freeing.
- */
- kfree(name);
+ if (args) {
+ argv = argv_split(GFP_KERNEL, args, &argc);
+ if (!argv) {
+ ret = -ENOMEM;
+ goto error;
+ }
+
+ ret = user_fields_match(user, argc, (const char **)argv);
+ argv_free(argv);
+
+ } else
+ ret = list_empty(&user->fields);
+
+ if (ret) {
+ *newuser = user;
+ /*
+ * Name is allocated by caller, free it since it already exists.
+ * Caller only worries about failure cases for freeing.
+ */
+ kfree(name);
+ } else {
+ ret = -EADDRINUSE;
+ goto error;
+ }
+
return 0;
+error:
+ refcount_dec(&user->refcnt);
+ return ret;
}
user = kzalloc(sizeof(*user), GFP_KERNEL_ACCOUNT);
diff --git a/tools/testing/selftests/user_events/ftrace_test.c b/tools/testing/selftests/user_events/ftrace_test.c
index 7c99cef94a65..6e8c4b47281c 100644
--- a/tools/testing/selftests/user_events/ftrace_test.c
+++ b/tools/testing/selftests/user_events/ftrace_test.c
@@ -228,6 +228,12 @@ TEST_F(user, register_events) {
ASSERT_EQ(0, ioctl(self->data_fd, DIAG_IOCSREG, ®));
ASSERT_EQ(0, reg.write_index);
+ /* Multiple registers to same name but different args should fail */
+ reg.enable_bit = 29;
+ reg.name_args = (__u64)"__test_event u32 field1;";
+ ASSERT_EQ(-1, ioctl(self->data_fd, DIAG_IOCSREG, ®));
+ ASSERT_EQ(EADDRINUSE, errno);
+
/* Ensure disabled */
self->enable_fd = open(enable_file, O_RDWR);
ASSERT_NE(-1, self->enable_fd);
--
2.25.1
Hallo, es tut mir so leid, Ihre Privatsphäre zu verletzen. Es heißt:
„Ein Bild sagt mehr als tausend Worte, aber als ich Ihres sah, war es
mehr, als Worte erklären könnten.“ Das charmante Profil ist
unwiderstehlich, obwohl es eine kleine persönliche Nachricht ist, aber
Ihr Aussehen verrät viel über eine nette Person ... Also musste ich
der charmanten Person mit diesem tollen Profil eine Nachricht
hinterlassen. Ich glaube, es ist die Neugier, die mich in einer
solchen Zeit zu Ihnen führt. Ich muss noch einmal sagen, dass es mir
leid tut, wenn das Schreiben an Sie Ihrer moralischen Ethik
widerspricht. Ich möchte dich einfach besser kennenlernen und ein
Freund sein oder mehr. Ich hoffe, irgendwann von Ihnen zu hören.
Hallo, es tut mir so leid, Ihre Privatsphäre zu verletzen. Es heißt:
„Ein Bild sagt mehr als tausend Worte, aber als ich Ihres sah, war es
mehr, als Worte erklären könnten.“ Das charmante Profil ist
unwiderstehlich, obwohl es eine kleine persönliche Nachricht ist, aber
Ihr Aussehen verrät viel über eine nette Person ... Also musste ich
der charmanten Person mit diesem tollen Profil eine Nachricht
hinterlassen. Ich glaube, es ist die Neugier, die mich in einer
solchen Zeit zu Ihnen führt. Ich muss noch einmal sagen, dass es mir
leid tut, wenn das Schreiben an Sie Ihrer moralischen Ethik
widerspricht. Ich möchte dich einfach besser kennenlernen und ein
Freund sein oder mehr. Ich hoffe, irgendwann von Ihnen zu hören.
After a few years of increasing test coverage in the MPTCP selftests, we
realised [1] the last version of the selftests is supposed to run on old
kernels without issues.
Supporting older versions is not that easy for this MPTCP case: these
selftests are often validating the internals by checking packets that
are exchanged, when some MIB counters are incremented after some
actions, how connections are getting opened and closed in some cases,
etc. In other words, it is not limited to the socket interface between
the userspace and the kernelspace. In addition, the current selftests
run a lot of different sub-tests but the TAP13 protocol used in the
selftests don't support sub-tests: in other words, one failure in
sub-tests implies that the whole selftest is seen as failed at the end
because sub-tests are not tracked. It is then important to skip
sub-tests not supported by old kernels.
To minimise the modifications and reduce the complexity to support old
versions, the idea is to look at external signs and skip the whole
selftests or just some sub-tests before starting them.
This first part focuses on marking the different selftests as skipped
if MPTCP is not even supported. That's what is done in patches 2 to 8.
Patch 2/8 introduces a new file (mptcp_lib.sh) to be able to re-use some
helpers in the different selftests. The first MPTCP selftest has been
introduced in v5.6.
Patch 1/8 is a bit different but still linked: it modifies mptcp_join.sh
selftest not to use 'cmp --bytes' which is not supported by the BusyBox
implementation. It is apparently quite common to use BusyBox in CI
environments. This tool is needed for a subtest introduced in v6.1.
Link: https://lore.kernel.org/stable/CA+G9fYtDGpgT4dckXD-y-N92nqUxuvue_7AtDdBcHrb… [1]
Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Signed-off-by: Matthieu Baerts <matthieu.baerts(a)tessares.net>
---
Matthieu Baerts (8):
selftests: mptcp: join: avoid using 'cmp --bytes'
selftests: mptcp: connect: skip if MPTCP is not supported
selftests: mptcp: pm nl: skip if MPTCP is not supported
selftests: mptcp: join: skip if MPTCP is not supported
selftests: mptcp: diag: skip if MPTCP is not supported
selftests: mptcp: simult flows: skip if MPTCP is not supported
selftests: mptcp: sockopt: skip if MPTCP is not supported
selftests: mptcp: userspace pm: skip if MPTCP is not supported
tools/testing/selftests/net/mptcp/Makefile | 2 +-
tools/testing/selftests/net/mptcp/diag.sh | 4 +++
tools/testing/selftests/net/mptcp/mptcp_connect.sh | 4 +++
tools/testing/selftests/net/mptcp/mptcp_join.sh | 17 +++++++--
tools/testing/selftests/net/mptcp/mptcp_lib.sh | 40 ++++++++++++++++++++++
tools/testing/selftests/net/mptcp/mptcp_sockopt.sh | 4 +++
tools/testing/selftests/net/mptcp/pm_netlink.sh | 4 +++
tools/testing/selftests/net/mptcp/simult_flows.sh | 4 +++
tools/testing/selftests/net/mptcp/userspace_pm.sh | 4 +++
9 files changed, 80 insertions(+), 3 deletions(-)
---
base-commit: 9b9e46aa07273ceb96866b2e812b46f1ee0b8d2f
change-id: 20230528-upstream-net-20230528-mptcp-selftests-support-old-kernels-part-1-305638f4dbc0
Best regards,
--
Matthieu Baerts <matthieu.baerts(a)tessares.net>
Hi, Willy
Thanks very mush for your kindly review, discuss and suggestion, now we
get full rv32 support ;-)
In the first series [1], we have fixed up the compile errors about
_start and __NR_llseek for rv32, but left compile errors about tons of
time32 syscalls (removed after kernel commit d4c08b9776b3 ("riscv: Use
latest system call ABI")) and the missing fstat in nolibc-test.c [2],
now we have fixed up all of them.
Introduction
============
This series is based on the 20230524-nolibc-rv32+stkp4 branch of [3], it
includes 3 parts, they work together to add full rv32 support:
* Reverts two old out-of-day patches
* Revert "tools/nolibc: riscv: Support __NR_llseek for rv32"
* Revert "selftests/nolibc: Fix up compile error for rv32"
(these two and the reverted ones:
* commit 606343b7478c ("selftests/nolibc: Fix up compile error for rv32")
* commit d2c3acba6d66 ("tools/nolibc: riscv: Support __NR_llseek for rv32")
can be removed from the git repo completely, there are two new ones to replace
them)
* Compile and test support patches
* selftests/nolibc: print name instead of number for EOVERFLOW
* selftests/nolibc: syscall_args: use __NR_statx for rv32
* --> replace the old one 606343b7478, use statx instead of read
* selftests/nolibc: riscv: customize makefile for rv32
* selftests/nolibc: allow specify a bios for qemu
* selftests/nolibc: remove the duplicated gettimeofday_bad2
* Fix up some missing syscalls, mainly time32 syscalls
* tools/nolibc: sys_lseek: riscv: use __NR_llseek for rv32
* --> replace the old one d2c3acba6d66, cleaned up
* tools/nolibc: sys_poll: riscv: use __NR_ppoll_time64 for rv32
* tools/nolibc: ppoll/ppoll_time64: Add a missing argument
* tools/nolibc: sys_select: riscv: use __NR_pselect6_time64 for rv32
* tools/nolibc: sys_wait4: riscv: use __NR_waitid for rv32
* tools/nolibc: sys_gettimeofday: riscv: use __NR_clock_gettime64 for rv32
Compile
=======
For rv64:
$ make ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- nolibc-test
$ file nolibc-test
nolibc-test: ELF 64-bit LSB executable, UCB RISC-V ...
$ make ARCH=riscv64 CROSS_COMPILE=riscv64-linux-gnu- nolibc-test
$ file nolibc-test
nolibc-test: ELF 64-bit LSB executable, UCB RISC-V ...
For rv32:
$ make ARCH=riscv CONFIG_32BIT=1 CROSS_COMPILE=riscv64-linux-gnu- nolibc-test
$ file nolibc-test
nolibc-test: ELF 32-bit LSB executable, UCB RISC-V ...
$ make ARCH=riscv32 CROSS_COMPILE=riscv64-linux-gnu- nolibc-test
$ file nolibc-test
nolibc-test: ELF 32-bit LSB executable, UCB RISC-V ...
Testing
=======
Environment:
// gcc toolchain
$ riscv64-linux-gnu-gcc --version
riscv64-linux-gnu-gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
// glibc >= 2.33 required, for older glibc, must upgrade include/bits/wordsize.h
$ dpkg -l | grep libc6-dev | grep riscv
ii libc6-dev-riscv64-cross 2.31-0ubuntu7cross1
// glibc include/bits/wordsize.h: manually upgraded to >= 2.33
// without this, can not build tools/testing/selftests/nolibc/nolibc-test.c
$ cat /usr/riscv64-linux-gnu/include/bits/wordsize.h
#if __riscv_xlen == (__SIZEOF_POINTER__ * 8)
# define __WORDSIZE __riscv_xlen
#else
# error unsupported ABI
#endif
# define __WORDSIZE_TIME64_COMPAT32 1
#if __WORDSIZE == 32
# define __WORDSIZE32_SIZE_ULONG 0
# define __WORDSIZE32_PTRDIFF_LONG 0
#endif
// higher qemu version is better, latest version is v8.0.0+
$ qemu-system-riscv64 --version
QEMU emulator version 4.2.1 (Debian 1:4.2-3ubuntu6.18)
Copyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers
// opensbi version, higher is better, must match kernel version and qemu version
// rv64: used version is 1.2, latest is 1.2
$ head -2 /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/run.out | tail -1
OpenSBI v1.2-116-g7919530
// rv32: used version is v0.9, latest is 1.2
$ head -2 /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/run.out | tail -1
OpenSBI v0.9-152-g754d511
For rv64:
$ pwd
/labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc
$ make ARCH=riscv64 CROSS_COMPILE=riscv64-linux-gnu- defconfig
$ make ARCH=riscv64 CROSS_COMPILE=riscv64-linux-gnu- BIOS=/labs/linux-lab/boards/riscv64/virt/bsp/bios/opensbi/generic/fw_jump.elf run
MKDIR sysroot/riscv/include
make[1]: Entering directory '/labs/linux-lab/src/linux-stable/tools/include/nolibc'
make[2]: Entering directory '/labs/linux-lab/src/linux-stable'
make[2]: Leaving directory '/labs/linux-lab/src/linux-stable'
make[2]: Entering directory '/labs/linux-lab/src/linux-stable'
INSTALL /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/sysroot/sysroot/include
make[2]: Leaving directory '/labs/linux-lab/src/linux-stable'
make[1]: Leaving directory '/labs/linux-lab/src/linux-stable/tools/include/nolibc'
CC nolibc-test
MKDIR initramfs
INSTALL initramfs/init
make[1]: Entering directory '/labs/linux-lab/src/linux-stable'
...
LD vmlinux
NM System.map
SORTTAB vmlinux
OBJCOPY arch/riscv/boot/Image
Kernel: arch/riscv/boot/Image is ready
make[1]: Leaving directory '/labs/linux-lab/src/linux-stable'
135 test(s) passed.
$ file ../../../../vmlinux
../../../../vmlinux: ELF 64-bit LSB executable, UCB RISC-V, version 1 (SYSV), statically linked, BuildID[sha1]=b8e1cea5122b04bce540b4022f0d6f171ffe615a, not stripped
For rv32:
$ pwd
/labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc
$ make ARCH=riscv32 CROSS_COMPILE=riscv64-linux-gnu- defconfig
$ make ARCH=riscv32 CROSS_COMPILE=riscv64-linux-gnu- BIOS=/labs/linux-lab/boards/riscv32/virt/bsp/bios/opensbi/generic/fw_jump.elf run
MKDIR sysroot/riscv/include
make[1]: Entering directory '/labs/linux-lab/src/linux-stable/tools/include/nolibc'
make[2]: Entering directory '/labs/linux-lab/src/linux-stable'
make[2]: Leaving directory '/labs/linux-lab/src/linux-stable'
make[2]: Entering directory '/labs/linux-lab/src/linux-stable'
INSTALL /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/sysroot/sysroot/include
make[2]: Leaving directory '/labs/linux-lab/src/linux-stable'
make[1]: Leaving directory '/labs/linux-lab/src/linux-stable/tools/include/nolibc'
CC nolibc-test
MKDIR initramfs
INSTALL initramfs/init
make[1]: Entering directory '/labs/linux-lab/src/linux-stable'
CALL scripts/checksyscalls.sh
GEN usr/initramfs_data.cpio
COPY usr/initramfs_inc_data
AS usr/initramfs_data.o
AR usr/built-in.a
GEN security/selinux/flask.h security/selinux/av_permissions.h
CC security/selinux/avc.o
CC security/selinux/hooks.o
CC security/selinux/selinuxfs.o
CC security/selinux/nlmsgtab.o
CC security/selinux/netif.o
CC security/selinux/netnode.o
CC security/selinux/netport.o
CC security/selinux/status.o
CC security/selinux/ss/services.o
AR security/selinux/built-in.a
AR security/built-in.a
AR built-in.a
AR vmlinux.a
LD vmlinux.o
OBJCOPY modules.builtin.modinfo
GEN modules.builtin
MODPOST vmlinux.symvers
UPD include/generated/utsversion.h
CC init/version-timestamp.o
LD .tmp_vmlinux.kallsyms1
NM .tmp_vmlinux.kallsyms1.syms
KSYMS .tmp_vmlinux.kallsyms1.S
AS .tmp_vmlinux.kallsyms1.S
LD .tmp_vmlinux.kallsyms2
NM .tmp_vmlinux.kallsyms2.syms
KSYMS .tmp_vmlinux.kallsyms2.S
AS .tmp_vmlinux.kallsyms2.S
LD vmlinux
NM System.map
SORTTAB vmlinux
OBJCOPY arch/riscv/boot/Image
Kernel: arch/riscv/boot/Image is ready
make[1]: Leaving directory '/labs/linux-lab/src/linux-stable'
135 test(s) passed.
$ file ../../../../vmlinux
../../../../vmlinux: ELF 32-bit LSB executable, UCB RISC-V, version 1 (SYSV), statically linked, BuildID[sha1]=bad4c1f3899f47355d2a2010bade56972fd94b9d, not stripped
The full rv64 testing result (run.out) is uploaded at [4].
The full rv32 testing result (run.out) is uploaded at [5].
That's all, thanks!
Best regards,
Zhangjin Wu
---
[1]: https://lore.kernel.org/linux-riscv/20230520143154.68663-1-falcon@tinylab.o…
[2]: https://lore.kernel.org/linux-riscv/20230520135235.68155-1-falcon@tinylab.o…
[3]: https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/nolibc.git
[4]: https://pastebin.com/3L0nV78u
[5]: https://pastebin.com/RadrXdta
Zhangjin Wu (13):
Revert "tools/nolibc: riscv: Support __NR_llseek for rv32"
Revert "selftests/nolibc: Fix up compile error for rv32"
selftests/nolibc: print name instead of number for EOVERFLOW
selftests/nolibc: syscall_args: use __NR_statx for rv32
selftests/nolibc: riscv: customize makefile for rv32
selftests/nolibc: allow specify a bios for qemu
selftests/nolibc: remove the duplicated gettimeofday_bad2
tools/nolibc: sys_lseek: riscv: use __NR_llseek for rv32
tools/nolibc: sys_poll: riscv: use __NR_ppoll_time64 for rv32
tools/nolibc: ppoll/ppoll_time64: Add a missing argument
tools/nolibc: sys_select: riscv: use __NR_pselect6_time64 for rv32
tools/nolibc: sys_wait4: riscv: use __NR_waitid for rv32
tools/nolibc: sys_gettimeofday: riscv: use __NR_clock_gettime64 for
rv32
tools/include/nolibc/std.h | 1 +
tools/include/nolibc/sys.h | 135 +++++++++++++++++--
tools/include/nolibc/types.h | 21 ++-
tools/testing/selftests/nolibc/Makefile | 14 +-
tools/testing/selftests/nolibc/nolibc-test.c | 15 ++-
5 files changed, 167 insertions(+), 19 deletions(-)
--
2.25.1
From: Jeff Xu <jeffxu(a)google.com>
This is the first set of Memory mapping (VMA) protection patches using PKU.
* * *
Background:
As discussed previously in the kernel mailing list [1], V8 CFI [2] uses
PKU to protect memory, and Stephen Röttger proposes to extend the PKU to
memory mapping [3].
We're using PKU for in-process isolation to enforce control-flow integrity
for a JIT compiler. In our threat model, an attacker exploits a
vulnerability and has arbitrary read/write access to the whole process
space concurrently to other threads being executed. This attacker can
manipulate some arguments to syscalls from some threads.
Under such a powerful attack, we want to create a “safe/isolated”
thread environment. We assign dedicated PKUs to this thread,
and use those PKUs to protect the threads’ runtime environment.
The thread has exclusive access to its run-time memory. This
includes modifying the protection of the memory mapping, or
munmap the memory mapping after use. And the other threads
won’t be able to access the memory or modify the memory mapping
(VMA) belonging to the thread.
* * *
Proposed changes:
This patch introduces a new flag, PKEY_ENFORCE_API, to the pkey_alloc()
function. When a PKEY is created with this flag, it is enforced that any
thread that wants to make changes to the memory mapping (such as mprotect)
of the memory must have write access to the PKEY. PKEYs created without
this flag will continue to work as they do now, for backwards
compatibility.
Only PKEY created from user space can have the new flag set, the PKEY
allocated by the kernel internally will not have it. In other words,
ARCH_DEFAULT_PKEY(0) and execute_only_pkey won’t have this flag set,
and continue work as today.
This flag is checked only at syscall entry, such as mprotect/munmap in
this set of patches. It will not apply to other call paths. In other
words, if the kernel want to change attributes of VMA for some reasons,
the kernel is free to do that and not affected by this new flag.
This set of patch covers mprotect/munmap, I plan to work on other
syscalls after this.
* * *
Testing:
I have tested this patch on a Linux kernel 5.15, 6,1, and 6.4-rc1,
new selftest is added in: pkey_enforce_api.c
* * *
Discussion:
We believe that this patch provides a valuable security feature.
It allows us to create “safe/isolated” thread environments that are
protected from attackers with arbitrary read/write access to
the process space.
We believe that the interface change and the patch don't
introduce backwards compatibility risk.
We would like to disucss this patch in Linux kernel community
for feedback and support.
* * *
Reference:
[1]https://lore.kernel.org/all/202208221331.71C50A6F@keescook/
[2]https://docs.google.com/document/d/1O2jwK4dxI3nRcOJuPYkonhTkNQfbmwdvxQMyX…
[3]https://docs.google.com/document/d/1qqVoVfRiF2nRylL3yjZyCQvzQaej1HRPh3f5w…
* * *
Current status:
There are on-going discussion related to threat model, io_uring, we will continue discuss using v0 thread.
* * *
PATCH history:
v1: update code related review comments:
mprotect.c:
remove syscall from do_mprotect_pkey()
remove pr_warn_ratelimited
munmap.c:
change syscall to enum caller_origin
remove pr_warn_ratelimited
v0:
https://lore.kernel.org/linux-mm/20230515130553.2311248-1-jeffxu@chromium.o…
Best Regards,
-Jeff Xu
Jeff Xu (6):
PKEY: Introduce PKEY_ENFORCE_API flag
PKEY: Add arch_check_pkey_enforce_api()
PKEY: Apply PKEY_ENFORCE_API to mprotect
PKEY:selftest pkey_enforce_api for mprotect
PKEY: Apply PKEY_ENFORCE_API to munmap
PKEY:selftest pkey_enforce_api for munmap
arch/powerpc/include/asm/pkeys.h | 19 +-
arch/x86/include/asm/mmu.h | 7 +
arch/x86/include/asm/pkeys.h | 92 +-
arch/x86/mm/pkeys.c | 2 +-
include/linux/mm.h | 8 +-
include/linux/pkeys.h | 18 +-
include/uapi/linux/mman.h | 5 +
mm/mmap.c | 31 +-
mm/mprotect.c | 17 +-
mm/mremap.c | 6 +-
tools/testing/selftests/mm/Makefile | 1 +
tools/testing/selftests/mm/pkey_enforce_api.c | 1312 +++++++++++++++++
12 files changed, 1499 insertions(+), 19 deletions(-)
create mode 100644 tools/testing/selftests/mm/pkey_enforce_api.c
base-commit: ba0ad6ed89fd5dada3b7b65ef2b08e95d449d4ab
--
2.40.1.606.ga4b1b128d6-goog
Hi, All
Thanks very much for your review suggestions of the v1 series [1], this
is the generic part1 of the v2 revison.
* selftests/nolibc: syscall_args: use generic __NR_statx
A more generic statx is used instead of fstat
(Review suggestions from Willy, Arnd)
* selftests/nolibc: allow specify extra arguments for qemu
Besides BIOS, QEMU_ARGS_EXTRA is better for more requirements
(Review suggestions from Thomas, Willy)
* selftests/nolibc: fix up compile warning with glibc on x86_64
Definition of uint64_t differs from glibc and nolibc, use the right
print format here
* selftests/nolibc: not include limits.h for nolibc
Remove the requirement of limits.h for nolibc can let us use older
glibc for rv32
(Review suggestions from thomas)
* selftests/nolibc: use INT_MAX instead of __INT_MAX__
A trivial cleanup, based on the previous patch
* tools/nolibc: arm: add missing my_syscall6
Required by future forced pselect6/pselect6_time64, tested on arm/vexpress-a9
(Review suggestions from Arnd)
* tools/nolibc: open: fix up compile warning for arm
A trivial fixup based on compiler's suggestion and glibc code
Best regards,
Zhangjin
----
[1]: https://lore.kernel.org/linux-riscv/20230529113143.GB2762@1wt.eu/T/#t
Zhangjin Wu (7):
selftests/nolibc: syscall_args: use __NR_statx for rv32
selftests/nolibc: allow specify extra arguments for qemu
selftests/nolibc: fix up compile warning with glibc on x86_64
selftests/nolibc: not include limits.h for nolibc
selftests/nolibc: use INT_MAX instead of __INT_MAX__
tools/nolibc: arm: add missing my_syscall6
tools/nolibc: open: fix up compile warning for arm
tools/include/nolibc/arch-arm.h | 23 ++++++++++++++++++++
tools/include/nolibc/stdint.h | 14 ++++++++++++
tools/include/nolibc/sys.h | 2 +-
tools/testing/selftests/nolibc/Makefile | 2 +-
tools/testing/selftests/nolibc/nolibc-test.c | 14 +++++++-----
5 files changed, 47 insertions(+), 8 deletions(-)
--
2.25.1
When A registering user event from dyn_events has no argments, it will pass the
matching check, regardless of whether there is a user event with the same name
and arguments. Add the matching check when the arguments of registering user
event is null.
Signed-off-by: sunliming <sunliming(a)kylinos.cn>
---
kernel/trace/trace_events_user.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/kernel/trace/trace_events_user.c b/kernel/trace/trace_events_user.c
index e90161294698..0d91dac206ff 100644
--- a/kernel/trace/trace_events_user.c
+++ b/kernel/trace/trace_events_user.c
@@ -1712,6 +1712,8 @@ static bool user_event_match(const char *system, const char *event,
if (match && argc > 0)
match = user_fields_match(user, argc, argv);
+ else if (match && argc == 0)
+ match = list_empty(&user->fields);
return match;
}
--
2.25.1
Partially backport v6.3 commit 11f75a01448f ("selftests/memfd: add
tests for MFD_NOEXEC_SEAL MFD_EXEC") to fix an unknown type name
build error.
In some systems, the __u64 typedef is not present due to differences
in system headers, causing compilation errors like this one:
fuse_test.c:64:8: error: unknown type name '__u64'
64 | static __u64 mfd_assert_get_seals(int fd)
This header includes the __u64 typedef which increases the
likelihood of successful compilation on a wider variety of systems.
Signed-off-by: Hardik Garg <hargar(a)linux.microsoft.com>
---
tools/testing/selftests/memfd/fuse_test.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/memfd/fuse_test.c b/tools/testing/selftests/memfd/fuse_test.c
index be675002f918..93798c8c5d54 100644
--- a/tools/testing/selftests/memfd/fuse_test.c
+++ b/tools/testing/selftests/memfd/fuse_test.c
@@ -22,6 +22,7 @@
#include <linux/falloc.h>
#include <fcntl.h>
#include <linux/memfd.h>
+#include <linux/types.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
--
2.25.1
Small optimization to avoid coredump writing during the stack protector
tests.
Adds prctl() as prerequisite.
This series is based on nolibc/20230524-nolibc-rv32+stkp4
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
---
Changes in v2:
- Fix compilation warning in prctl() testcase
- Link to v1: https://lore.kernel.org/r/20230526-nolibc-test-no-dump-v1-0-62e724a96db2@we…
---
Thomas Weißschuh (2):
tools/nolibc: add support for prctl()
selftests/nolibc: prevent coredumps during test execution
tools/include/nolibc/sys.h | 27 +++++++++++++++++++++++++++
tools/testing/selftests/nolibc/nolibc-test.c | 3 +++
2 files changed, 30 insertions(+)
---
base-commit: 1974a2b5fd434812b32952b09df7b79fdee8104d
change-id: 20230526-nolibc-test-no-dump-a1b1d9557df8
Best regards,
--
Thomas Weißschuh <linux(a)weissschuh.net>
Small optimization to avoid coredump writing during the stack protector
tests.
Adds prctl() as prerequisite.
This series is based on nolibc/20230524-nolibc-rv32+stkp4
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
---
Thomas Weißschuh (2):
tools/nolibc: add support for prctl()
selftests/nolibc: prevent coredumps during test execution
tools/include/nolibc/sys.h | 27 +++++++++++++++++++++++++++
tools/testing/selftests/nolibc/nolibc-test.c | 3 +++
2 files changed, 30 insertions(+)
---
base-commit: 1974a2b5fd434812b32952b09df7b79fdee8104d
change-id: 20230526-nolibc-test-no-dump-a1b1d9557df8
Best regards,
--
Thomas Weißschuh <linux(a)weissschuh.net>
Dan Carpenter spotted a race condition in a couple of situations like
these in the test_firmware driver:
static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
u8 val;
int ret;
ret = kstrtou8(buf, 10, &val);
if (ret)
return ret;
mutex_lock(&test_fw_mutex);
*(u8 *)cfg = val;
mutex_unlock(&test_fw_mutex);
/* Always return full write size even if we didn't consume all */
return size;
}
static ssize_t config_num_requests_store(struct device *dev,
struct device_attribute *attr,
const char *buf, size_t count)
{
int rc;
mutex_lock(&test_fw_mutex);
if (test_fw_config->reqs) {
pr_err("Must call release_all_firmware prior to changing config\n");
rc = -EINVAL;
mutex_unlock(&test_fw_mutex);
goto out;
}
mutex_unlock(&test_fw_mutex);
rc = test_dev_config_update_u8(buf, count,
&test_fw_config->num_requests);
out:
return rc;
}
static ssize_t config_read_fw_idx_store(struct device *dev,
struct device_attribute *attr,
const char *buf, size_t count)
{
return test_dev_config_update_u8(buf, count,
&test_fw_config->read_fw_idx);
}
The function test_dev_config_update_u8() is called from both the locked
and the unlocked context, function config_num_requests_store() and
config_read_fw_idx_store() which can both be called asynchronously as
they are driver's methods, while test_dev_config_update_u8() and siblings
change their argument pointed to by u8 *cfg or similar pointer.
To avoid deadlock on test_fw_mutex, the lock is dropped before calling
test_dev_config_update_u8() and re-acquired within test_dev_config_update_u8()
itself, but alas this creates a race condition.
Having two locks wouldn't assure a race-proof mutual exclusion.
This situation is best avoided by the introduction of a new, unlocked
function __test_dev_config_update_u8() which can be called from the locked
context and reducing test_dev_config_update_u8() to:
static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
int ret;
mutex_lock(&test_fw_mutex);
ret = __test_dev_config_update_u8(buf, size, cfg);
mutex_unlock(&test_fw_mutex);
return ret;
}
doing the locking and calling the unlocked primitive, which enables both
locked and unlocked versions without duplication of code.
The similar approach was applied to all functions called from the locked
and the unlocked context, which safely mitigates both deadlocks and race
conditions in the driver.
__test_dev_config_update_bool(), __test_dev_config_update_u8() and
__test_dev_config_update_size_t() unlocked versions of the functions
were introduced to be called from the locked contexts as a workaround
without releasing the main driver's lock and thereof causing a race
condition.
The test_dev_config_update_bool(), test_dev_config_update_u8() and
test_dev_config_update_size_t() locked versions of the functions
are being called from driver methods without the unnecessary multiplying
of the locking and unlocking code for each method, and complicating
the code with saving of the return value across lock.
Fixes: 7feebfa487b92 ("test_firmware: add support for request_firmware_into_buf")
Cc: Luis Chamberlain <mcgrof(a)kernel.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Russ Weight <russell.h.weight(a)intel.com>
Cc: Takashi Iwai <tiwai(a)suse.de>
Cc: Tianfei Zhang <tianfei.zhang(a)intel.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Colin Ian King <colin.i.king(a)gmail.com>
Cc: Randy Dunlap <rdunlap(a)infradead.org>
Cc: linux-kselftest(a)vger.kernel.org
Cc: stable(a)vger.kernel.org # v5.4
Suggested-by: Dan Carpenter <error27(a)gmail.com>
Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac(a)alu.unizg.hr>
---
lib/test_firmware.c | 52 ++++++++++++++++++++++++++++++---------------
1 file changed, 35 insertions(+), 17 deletions(-)
diff --git a/lib/test_firmware.c b/lib/test_firmware.c
index 05ed84c2fc4c..35417e0af3f4 100644
--- a/lib/test_firmware.c
+++ b/lib/test_firmware.c
@@ -353,16 +353,26 @@ static ssize_t config_test_show_str(char *dst,
return len;
}
-static int test_dev_config_update_bool(const char *buf, size_t size,
+static inline int __test_dev_config_update_bool(const char *buf, size_t size,
bool *cfg)
{
int ret;
- mutex_lock(&test_fw_mutex);
if (kstrtobool(buf, cfg) < 0)
ret = -EINVAL;
else
ret = size;
+
+ return ret;
+}
+
+static int test_dev_config_update_bool(const char *buf, size_t size,
+ bool *cfg)
+{
+ int ret;
+
+ mutex_lock(&test_fw_mutex);
+ ret = __test_dev_config_update_bool(buf, size, cfg);
mutex_unlock(&test_fw_mutex);
return ret;
@@ -373,7 +383,8 @@ static ssize_t test_dev_config_show_bool(char *buf, bool val)
return snprintf(buf, PAGE_SIZE, "%d\n", val);
}
-static int test_dev_config_update_size_t(const char *buf,
+static int __test_dev_config_update_size_t(
+ const char *buf,
size_t size,
size_t *cfg)
{
@@ -384,9 +395,7 @@ static int test_dev_config_update_size_t(const char *buf,
if (ret)
return ret;
- mutex_lock(&test_fw_mutex);
*(size_t *)cfg = new;
- mutex_unlock(&test_fw_mutex);
/* Always return full write size even if we didn't consume all */
return size;
@@ -402,7 +411,7 @@ static ssize_t test_dev_config_show_int(char *buf, int val)
return snprintf(buf, PAGE_SIZE, "%d\n", val);
}
-static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
+static int __test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
u8 val;
int ret;
@@ -411,14 +420,23 @@ static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
if (ret)
return ret;
- mutex_lock(&test_fw_mutex);
*(u8 *)cfg = val;
- mutex_unlock(&test_fw_mutex);
/* Always return full write size even if we didn't consume all */
return size;
}
+static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
+{
+ int ret;
+
+ mutex_lock(&test_fw_mutex);
+ ret = __test_dev_config_update_u8(buf, size, cfg);
+ mutex_unlock(&test_fw_mutex);
+
+ return ret;
+}
+
static ssize_t test_dev_config_show_u8(char *buf, u8 val)
{
return snprintf(buf, PAGE_SIZE, "%u\n", val);
@@ -471,10 +489,10 @@ static ssize_t config_num_requests_store(struct device *dev,
mutex_unlock(&test_fw_mutex);
goto out;
}
- mutex_unlock(&test_fw_mutex);
- rc = test_dev_config_update_u8(buf, count,
- &test_fw_config->num_requests);
+ rc = __test_dev_config_update_u8(buf, count,
+ &test_fw_config->num_requests);
+ mutex_unlock(&test_fw_mutex);
out:
return rc;
@@ -518,10 +536,10 @@ static ssize_t config_buf_size_store(struct device *dev,
mutex_unlock(&test_fw_mutex);
goto out;
}
- mutex_unlock(&test_fw_mutex);
- rc = test_dev_config_update_size_t(buf, count,
- &test_fw_config->buf_size);
+ rc = __test_dev_config_update_size_t(buf, count,
+ &test_fw_config->buf_size);
+ mutex_unlock(&test_fw_mutex);
out:
return rc;
@@ -548,10 +566,10 @@ static ssize_t config_file_offset_store(struct device *dev,
mutex_unlock(&test_fw_mutex);
goto out;
}
- mutex_unlock(&test_fw_mutex);
- rc = test_dev_config_update_size_t(buf, count,
- &test_fw_config->file_offset);
+ rc = __test_dev_config_update_size_t(buf, count,
+ &test_fw_config->file_offset);
+ mutex_unlock(&test_fw_mutex);
out:
return rc;
--
2.30.2
KUnit aborts the current thread when an assertion fails. Currently, this
is done conditionally as part of the kunit_do_failed_assertion()
function, but this hides the kunit_abort() call from the compiler
(particularly if it's in another module). This, in turn, can lead to
both suboptimal code generation (the compiler can't know if
kunit_do_failed_assertion() will return), and to static analysis tools
like smatch giving false positives.
Moving the kunit_abort() call into the macro should give the compiler
and tools a better chance at understanding what's going on. Doing so
requires exporting kunit_abort(), though it's recommended to continue to
use assertions in lieu of aborting directly.
Suggested-by: Dan Carpenter <dan.carpenter(a)linaro.org>
Signed-off-by: David Gow <davidgow(a)google.com>
---
include/kunit/test.h | 4 ++++
lib/kunit/test.c | 5 +----
2 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/include/kunit/test.h b/include/kunit/test.h
index 2f23d6efa505..6a35e3e2a1e5 100644
--- a/include/kunit/test.h
+++ b/include/kunit/test.h
@@ -481,6 +481,8 @@ void __printf(2, 3) kunit_log_append(char *log, const char *fmt, ...);
*/
#define KUNIT_SUCCEED(test) do {} while (0)
+void __noreturn kunit_abort(struct kunit *test);
+
void kunit_do_failed_assertion(struct kunit *test,
const struct kunit_loc *loc,
enum kunit_assert_type type,
@@ -498,6 +500,8 @@ void kunit_do_failed_assertion(struct kunit *test,
assert_format, \
fmt, \
##__VA_ARGS__); \
+ if (assert_type == KUNIT_ASSERTION) \
+ kunit_abort(test); \
} while (0)
diff --git a/lib/kunit/test.c b/lib/kunit/test.c
index d3fb93a23ccc..3b350e50cab9 100644
--- a/lib/kunit/test.c
+++ b/lib/kunit/test.c
@@ -310,7 +310,7 @@ static void kunit_fail(struct kunit *test, const struct kunit_loc *loc,
string_stream_destroy(stream);
}
-static void __noreturn kunit_abort(struct kunit *test)
+void __noreturn kunit_abort(struct kunit *test)
{
kunit_try_catch_throw(&test->try_catch); /* Does not return. */
@@ -340,9 +340,6 @@ void kunit_do_failed_assertion(struct kunit *test,
kunit_fail(test, loc, type, assert, assert_format, &message);
va_end(args);
-
- if (type == KUNIT_ASSERTION)
- kunit_abort(test);
}
EXPORT_SYMBOL_GPL(kunit_do_failed_assertion);
--
2.41.0.rc0.172.g3f132b7071-goog
User processes register name_args for events. If the same name but different
args event are registered. The trace outputs of second event are printed
as the first event. This is incorrect.
Return EADDRINUSE back to the user process if the same name but different args
event has being registered.
Signed-off-by: sunliming <sunliming(a)kylinos.cn>
---
kernel/trace/trace_events_user.c | 34 +++++++++++++++----
.../selftests/user_events/ftrace_test.c | 6 ++++
2 files changed, 33 insertions(+), 7 deletions(-)
diff --git a/kernel/trace/trace_events_user.c b/kernel/trace/trace_events_user.c
index b1ecd7677642..bd455052ccd0 100644
--- a/kernel/trace/trace_events_user.c
+++ b/kernel/trace/trace_events_user.c
@@ -1753,6 +1753,8 @@ static int user_event_parse(struct user_event_group *group, char *name,
int ret;
u32 key;
struct user_event *user;
+ int argc = 0;
+ char **argv;
/* Prevent dyn_event from racing */
mutex_lock(&event_mutex);
@@ -1760,13 +1762,31 @@ static int user_event_parse(struct user_event_group *group, char *name,
mutex_unlock(&event_mutex);
if (user) {
- *newuser = user;
- /*
- * Name is allocated by caller, free it since it already exists.
- * Caller only worries about failure cases for freeing.
- */
- kfree(name);
- return 0;
+ if (args) {
+ argv = argv_split(GFP_KERNEL, args, &argc);
+ if (!argv)
+ return -ENOMEM;
+
+ ret = user_fields_match(user, argc, (const char **)argv);
+ argv_free(argv);
+
+ } else
+ ret = list_empty(&user->fields);
+
+ if (ret) {
+ *newuser = user;
+ /*
+ * Name is allocated by caller, free it since it already exists.
+ * Caller only worries about failure cases for freeing.
+ */
+ kfree(name);
+ ret = 0;
+ } else {
+ refcount_dec(&user->refcnt);
+ ret = -EADDRINUSE;
+ }
+
+ return ret;
}
user = kzalloc(sizeof(*user), GFP_KERNEL_ACCOUNT);
diff --git a/tools/testing/selftests/user_events/ftrace_test.c b/tools/testing/selftests/user_events/ftrace_test.c
index 7c99cef94a65..6e8c4b47281c 100644
--- a/tools/testing/selftests/user_events/ftrace_test.c
+++ b/tools/testing/selftests/user_events/ftrace_test.c
@@ -228,6 +228,12 @@ TEST_F(user, register_events) {
ASSERT_EQ(0, ioctl(self->data_fd, DIAG_IOCSREG, ®));
ASSERT_EQ(0, reg.write_index);
+ /* Multiple registers to same name but different args should fail */
+ reg.enable_bit = 29;
+ reg.name_args = (__u64)"__test_event u32 field1;";
+ ASSERT_EQ(-1, ioctl(self->data_fd, DIAG_IOCSREG, ®));
+ ASSERT_EQ(EADDRINUSE, errno);
+
/* Ensure disabled */
self->enable_fd = open(enable_file, O_RDWR);
ASSERT_NE(-1, self->enable_fd);
--
2.25.1
There was a report that the hardware breakpoints and watch points weren't
reporting the debug architecture version as expected, they were reporting
a version of 0 which is not defined in the architecture. This happens
when running in a KVM guest if the host has a debug architecture version
not supported by KVM, it in turn confuses GDB which rejects any debug
architecture version it does not know about.
Add a test that covers that situation and while we're at it reports the
debug architecture version and number of slots available to aid with
figuring out problems that may arise.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Changes in v2:
- Rebase onto v6.4-rc3.
- Link to v1: https://lore.kernel.org/r/20230414-arm64-test-hw-breakpoint-v1-1-14162c8e5b…
---
tools/testing/selftests/arm64/abi/ptrace.c | 32 +++++++++++++++++++++++++++++-
1 file changed, 31 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/arm64/abi/ptrace.c b/tools/testing/selftests/arm64/abi/ptrace.c
index be952511af22..abe4d58d731d 100644
--- a/tools/testing/selftests/arm64/abi/ptrace.c
+++ b/tools/testing/selftests/arm64/abi/ptrace.c
@@ -20,7 +20,7 @@
#include "../../kselftest.h"
-#define EXPECTED_TESTS 7
+#define EXPECTED_TESTS 11
#define MAX_TPIDRS 2
@@ -132,6 +132,34 @@ static void test_tpidr(pid_t child)
}
}
+static void test_hw_debug(pid_t child, int type, const char *type_name)
+{
+ struct user_hwdebug_state state;
+ struct iovec iov;
+ int slots, arch, ret;
+
+ iov.iov_len = sizeof(state);
+ iov.iov_base = &state;
+
+ /* Should be able to read the values */
+ ret = ptrace(PTRACE_GETREGSET, child, type, &iov);
+ ksft_test_result(ret == 0, "read_%s\n", type_name);
+
+ if (ret == 0) {
+ /* Low 8 bits is the number of slots, next 4 bits the arch */
+ slots = state.dbg_info & 0xff;
+ arch = (state.dbg_info >> 8) & 0xf;
+
+ ksft_print_msg("%s version %d with %d slots\n", type_name,
+ arch, slots);
+
+ /* Zero is not currently architecturally valid */
+ ksft_test_result(arch, "%s_arch_set\n", type_name);
+ } else {
+ ksft_test_result_skip("%s_arch_set\n");
+ }
+}
+
static int do_child(void)
{
if (ptrace(PTRACE_TRACEME, -1, NULL, NULL))
@@ -207,6 +235,8 @@ static int do_parent(pid_t child)
ksft_print_msg("Parent is %d, child is %d\n", getpid(), child);
test_tpidr(child);
+ test_hw_debug(child, NT_ARM_HW_WATCH, "NT_ARM_HW_WATCH");
+ test_hw_debug(child, NT_ARM_HW_BREAK, "NT_ARM_HW_BREAK");
ret = EXIT_SUCCESS;
---
base-commit: 44c026a73be8038f03dbdeef028b642880cf1511
change-id: 20230414-arm64-test-hw-breakpoint-83fe02f607fc
Best regards,
--
Mark Brown <broonie(a)kernel.org>
Due to the lack of the SKIP directive in the output, if any of the
parameterized test was skipped, the parser could not recognize that
correctly and was marking the test as PASSED.
This can easily be seen by running the new subtest from patch 1:
$ ./tools/testing/kunit/kunit.py run \
--kunitconfig ./lib/kunit/.kunitconfig *.example_params*
[ ] Starting KUnit Kernel (1/1)...
[ ] ============================================================
[ ] =================== example (1 subtest) ====================
[ ] =================== example_params_test ===================
[ ] [PASSED] example value 2
[ ] [PASSED] example value 1
[ ] [PASSED] example value 0
[ ] =============== [PASSED] example_params_test ===============
[ ] ===================== [PASSED] example =====================
[ ] ============================================================
[ ] Testing complete. Ran 3 tests: passed: 3
$ ./tools/testing/kunit/kunit.py run \
--kunitconfig ./lib/kunit/.kunitconfig *.example_params* \
--raw_output
[ ] Starting KUnit Kernel (1/1)...
KTAP version 1
1..1
# example: initializing suite
KTAP version 1
# Subtest: example
1..1
KTAP version 1
# Subtest: example_params_test
# example_params_test: initializing
ok 1 example value 2
# example_params_test: initializing
ok 2 example value 1
# example_params_test: initializing
ok 3 example value 0
# example_params_test: pass:2 fail:0 skip:1 total:3
ok 1 example_params_test
# Totals: pass:2 fail:0 skip:1 total:3
ok 1 example
After adding the SKIP directive, the report looks as expected:
[ ] Starting KUnit Kernel (1/1)...
[ ] ============================================================
[ ] =================== example (1 subtest) ====================
[ ] =================== example_params_test ===================
[ ] [PASSED] example value 2
[ ] [PASSED] example value 1
[ ] [SKIPPED] example value 0
[ ] =============== [PASSED] example_params_test ===============
[ ] ===================== [PASSED] example =====================
[ ] ============================================================
[ ] Testing complete. Ran 3 tests: passed: 2, skipped: 1
[ ] Starting KUnit Kernel (1/1)...
KTAP version 1
1..1
# example: initializing suite
KTAP version 1
# Subtest: example
1..1
KTAP version 1
# Subtest: example_params_test
# example_params_test: initializing
ok 1 example value 2
# example_params_test: initializing
ok 2 example value 1
# example_params_test: initializing
ok 3 example value 0 # SKIP unsupported param value
# example_params_test: pass:2 fail:0 skip:1 total:3
ok 1 example_params_test
# Totals: pass:2 fail:0 skip:1 total:3
ok 1 example
v2: better align with future support for arbitrary levels of testing
v3: rebased on kunit tree [1]
[1]: https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git/l…
Cc: David Gow <davidgow(a)google.com>
Cc: Rae Moar <rmoar(a)google.com>
Michal Wajdeczko (3):
kunit/test: Add example test showing parameterized testing
kunit: Fix reporting of the skipped parameterized tests
kunit: Update kunit_print_ok_not_ok function
include/kunit/test.h | 1 +
lib/kunit/kunit-example-test.c | 34 +++++++++++++++++++++++++++
lib/kunit/test.c | 43 ++++++++++++++++++++++------------
3 files changed, 63 insertions(+), 15 deletions(-)
--
2.25.1
Hello!
Here is v3 of the mremap start address optimization / fix for exec warning.
The main changes are:
1. Care to be taken to move purely within a VMA, in other words this check
in call_align_down():
if (vma->vm_start <= addr_masked)
return false;
As an example of why this is needed:
Consider the following range which is 2MB aligned and is
a part of a larger 10MB range which is not shown. Each
character is 256KB below making the source and destination
2MB each. The lower case letters are moved (s to d) and the
upper case letters are not moved.
|DDDDddddSSSSssss|
If we align down 'ssss' to start from the 'SSSS', we will end up destroying
SSSS. The above if statement prevents that and I verified it.
I also added a test for this in the last patch.
2. Handle the stack case separately. We do not care about #1 for stack movement
because the 'SSSS' does not matter during this move. Further we need to do this
to prevent the stack move warning.
if (!for_stack && vma->vm_start <= addr_masked)
return false;
History of patches
==================
v2->v3:
1. Masked address was stored in int, fixed it to unsigned long to avoid truncation.
2. We now handle moves happening purely within a VMA, a new test is added to handle this.
3. More code comments.
v1->v2:
1. Trigger the optimization for mremaps smaller than a PMD. I tested by tracing
that it works correctly.
2. Fix issue with bogus return value found by Linus if we broke out of the
above loop for the first PMD itself.
v1: Initial RFC.
Description of patches
======================
These patches optimizes the start addresses in move_page_tables() and tests the
changes. It addresses a warning [1] that occurs due to a downward, overlapping
move on a mutually-aligned offset within a PMD during exec. By initiating the
copy process at the PMD level when such alignment is present, we can prevent
this warning and speed up the copying process at the same time. Linus Torvalds
suggested this idea.
Please check the individual patches for more details.
thanks,
- Joel
[1] https://lore.kernel.org/all/ZB2GTBD%2FLWTrkOiO@dhcp22.suse.cz/
Joel Fernandes (Google) (6):
mm/mremap: Optimize the start addresses in move_page_tables()
mm/mremap: Allow moves within the same VMA
selftests: mm: Fix failure case when new remap region was not found
selftests: mm: Add a test for mutually aligned moves > PMD size
selftests: mm: Add a test for remapping to area immediately after
existing mapping
selftests: mm: Add a test for remapping within a range
fs/exec.c | 2 +-
include/linux/mm.h | 2 +-
mm/mremap.c | 69 ++++++++++-
tools/testing/selftests/mm/mremap_test.c | 148 +++++++++++++++++++++--
4 files changed, 209 insertions(+), 12 deletions(-)
--
2.40.1.698.g37aff9b760-goog
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit dbcf76390eb9a65d5d0c37b0cd57335218564e37 ]
The ftrace selftests do not currently produce KTAP output, they produce a
custom format much nicer for human consumption. This means that when run in
automated test systems we just get a single result for the suite as a whole
rather than recording results for individual test cases, making it harder
to look at the test data and masking things like inappropriate skips.
Address this by adding support for KTAP output to the ftracetest script and
providing a trivial wrapper which will be invoked by the kselftest runner
to generate output in this format by default, users using ftracetest
directly will continue to get the existing output.
This is not the most elegant solution but it is simple and effective. I
did consider implementing this by post processing the existing output
format but that felt more complex and likely to result in all output being
lost if something goes seriously wrong during the run which would not be
helpful. I did also consider just writing a separate runner script but
there's enough going on with things like the signal handling for that to
seem like it would be duplicating too much.
Acked-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Tested-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Shuah Khan <skhan(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/ftrace/Makefile | 3 +-
tools/testing/selftests/ftrace/ftracetest | 63 ++++++++++++++++++-
.../testing/selftests/ftrace/ftracetest-ktap | 8 +++
3 files changed, 70 insertions(+), 4 deletions(-)
create mode 100755 tools/testing/selftests/ftrace/ftracetest-ktap
diff --git a/tools/testing/selftests/ftrace/Makefile b/tools/testing/selftests/ftrace/Makefile
index d6e106fbce11c..a1e955d2de4cc 100644
--- a/tools/testing/selftests/ftrace/Makefile
+++ b/tools/testing/selftests/ftrace/Makefile
@@ -1,7 +1,8 @@
# SPDX-License-Identifier: GPL-2.0
all:
-TEST_PROGS := ftracetest
+TEST_PROGS_EXTENDED := ftracetest
+TEST_PROGS := ftracetest-ktap
TEST_FILES := test.d settings
EXTRA_CLEAN := $(OUTPUT)/logs/*
diff --git a/tools/testing/selftests/ftrace/ftracetest b/tools/testing/selftests/ftrace/ftracetest
index 8ec1922e974eb..9a73a110a8bfc 100755
--- a/tools/testing/selftests/ftrace/ftracetest
+++ b/tools/testing/selftests/ftrace/ftracetest
@@ -13,6 +13,7 @@ echo "Usage: ftracetest [options] [testcase(s)] [testcase-directory(s)]"
echo " Options:"
echo " -h|--help Show help message"
echo " -k|--keep Keep passed test logs"
+echo " -K|--ktap Output in KTAP format"
echo " -v|--verbose Increase verbosity of test messages"
echo " -vv Alias of -v -v (Show all results in stdout)"
echo " -vvv Alias of -v -v -v (Show all commands immediately)"
@@ -85,6 +86,10 @@ parse_opts() { # opts
KEEP_LOG=1
shift 1
;;
+ --ktap|-K)
+ KTAP=1
+ shift 1
+ ;;
--verbose|-v|-vv|-vvv)
if [ $VERBOSE -eq -1 ]; then
usage "--console can not use with --verbose"
@@ -178,6 +183,7 @@ TEST_DIR=$TOP_DIR/test.d
TEST_CASES=`find_testcases $TEST_DIR`
LOG_DIR=$TOP_DIR/logs/`date +%Y%m%d-%H%M%S`/
KEEP_LOG=0
+KTAP=0
DEBUG=0
VERBOSE=0
UNSUPPORTED_RESULT=0
@@ -229,7 +235,7 @@ prlog() { # messages
newline=
shift
fi
- printf "$*$newline"
+ [ "$KTAP" != "1" ] && printf "$*$newline"
[ "$LOG_FILE" ] && printf "$*$newline" | strip_esc >> $LOG_FILE
}
catlog() { #file
@@ -260,11 +266,11 @@ TOTAL_RESULT=0
INSTANCE=
CASENO=0
+CASENAME=
testcase() { # testfile
CASENO=$((CASENO+1))
- desc=`grep "^#[ \t]*description:" $1 | cut -f2- -d:`
- prlog -n "[$CASENO]$INSTANCE$desc"
+ CASENAME=`grep "^#[ \t]*description:" $1 | cut -f2- -d:`
}
checkreq() { # testfile
@@ -277,40 +283,68 @@ test_on_instance() { # testfile
grep -q "^#[ \t]*flags:.*instance" $1
}
+ktaptest() { # result comment
+ if [ "$KTAP" != "1" ]; then
+ return
+ fi
+
+ local result=
+ if [ "$1" = "1" ]; then
+ result="ok"
+ else
+ result="not ok"
+ fi
+ shift
+
+ local comment=$*
+ if [ "$comment" != "" ]; then
+ comment="# $comment"
+ fi
+
+ echo $CASENO $result $INSTANCE$CASENAME $comment
+}
+
eval_result() { # sigval
case $1 in
$PASS)
prlog " [${color_green}PASS${color_reset}]"
+ ktaptest 1
PASSED_CASES="$PASSED_CASES $CASENO"
return 0
;;
$FAIL)
prlog " [${color_red}FAIL${color_reset}]"
+ ktaptest 0
FAILED_CASES="$FAILED_CASES $CASENO"
return 1 # this is a bug.
;;
$UNRESOLVED)
prlog " [${color_blue}UNRESOLVED${color_reset}]"
+ ktaptest 0 UNRESOLVED
UNRESOLVED_CASES="$UNRESOLVED_CASES $CASENO"
return $UNRESOLVED_RESULT # depends on use case
;;
$UNTESTED)
prlog " [${color_blue}UNTESTED${color_reset}]"
+ ktaptest 1 SKIP
UNTESTED_CASES="$UNTESTED_CASES $CASENO"
return 0
;;
$UNSUPPORTED)
prlog " [${color_blue}UNSUPPORTED${color_reset}]"
+ ktaptest 1 SKIP
UNSUPPORTED_CASES="$UNSUPPORTED_CASES $CASENO"
return $UNSUPPORTED_RESULT # depends on use case
;;
$XFAIL)
prlog " [${color_green}XFAIL${color_reset}]"
+ ktaptest 1 XFAIL
XFAILED_CASES="$XFAILED_CASES $CASENO"
return 0
;;
*)
prlog " [${color_blue}UNDEFINED${color_reset}]"
+ ktaptest 0 error
UNDEFINED_CASES="$UNDEFINED_CASES $CASENO"
return 1 # this must be a test bug
;;
@@ -371,6 +405,7 @@ __run_test() { # testfile
run_test() { # testfile
local testname=`basename $1`
testcase $1
+ prlog -n "[$CASENO]$INSTANCE$CASENAME"
if [ ! -z "$LOG_FILE" ] ; then
local testlog=`mktemp $LOG_DIR/${CASENO}-${testname}-log.XXXXXX`
else
@@ -405,6 +440,17 @@ run_test() { # testfile
# load in the helper functions
. $TEST_DIR/functions
+if [ "$KTAP" = "1" ]; then
+ echo "TAP version 13"
+
+ casecount=`echo $TEST_CASES | wc -w`
+ for t in $TEST_CASES; do
+ test_on_instance $t || continue
+ casecount=$((casecount+1))
+ done
+ echo "1..${casecount}"
+fi
+
# Main loop
for t in $TEST_CASES; do
run_test $t
@@ -439,6 +485,17 @@ prlog "# of unsupported: " `echo $UNSUPPORTED_CASES | wc -w`
prlog "# of xfailed: " `echo $XFAILED_CASES | wc -w`
prlog "# of undefined(test bug): " `echo $UNDEFINED_CASES | wc -w`
+if [ "$KTAP" = "1" ]; then
+ echo -n "# Totals:"
+ echo -n " pass:"`echo $PASSED_CASES | wc -w`
+ echo -n " faii:"`echo $FAILED_CASES | wc -w`
+ echo -n " xfail:"`echo $XFAILED_CASES | wc -w`
+ echo -n " xpass:0"
+ echo -n " skip:"`echo $UNTESTED_CASES $UNSUPPORTED_CASES | wc -w`
+ echo -n " error:"`echo $UNRESOLVED_CASES $UNDEFINED_CASES | wc -w`
+ echo
+fi
+
cleanup
# if no error, return 0
diff --git a/tools/testing/selftests/ftrace/ftracetest-ktap b/tools/testing/selftests/ftrace/ftracetest-ktap
new file mode 100755
index 0000000000000..b3284679ef3af
--- /dev/null
+++ b/tools/testing/selftests/ftrace/ftracetest-ktap
@@ -0,0 +1,8 @@
+#!/bin/sh -e
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# ftracetest-ktap: Wrapper to integrate ftracetest with the kselftest runner
+#
+# Copyright (C) Arm Ltd., 2023
+
+./ftracetest -K
--
2.39.2
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit dbcf76390eb9a65d5d0c37b0cd57335218564e37 ]
The ftrace selftests do not currently produce KTAP output, they produce a
custom format much nicer for human consumption. This means that when run in
automated test systems we just get a single result for the suite as a whole
rather than recording results for individual test cases, making it harder
to look at the test data and masking things like inappropriate skips.
Address this by adding support for KTAP output to the ftracetest script and
providing a trivial wrapper which will be invoked by the kselftest runner
to generate output in this format by default, users using ftracetest
directly will continue to get the existing output.
This is not the most elegant solution but it is simple and effective. I
did consider implementing this by post processing the existing output
format but that felt more complex and likely to result in all output being
lost if something goes seriously wrong during the run which would not be
helpful. I did also consider just writing a separate runner script but
there's enough going on with things like the signal handling for that to
seem like it would be duplicating too much.
Acked-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Tested-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Shuah Khan <skhan(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/ftrace/Makefile | 3 +-
tools/testing/selftests/ftrace/ftracetest | 63 ++++++++++++++++++-
.../testing/selftests/ftrace/ftracetest-ktap | 8 +++
3 files changed, 70 insertions(+), 4 deletions(-)
create mode 100755 tools/testing/selftests/ftrace/ftracetest-ktap
diff --git a/tools/testing/selftests/ftrace/Makefile b/tools/testing/selftests/ftrace/Makefile
index d6e106fbce11c..a1e955d2de4cc 100644
--- a/tools/testing/selftests/ftrace/Makefile
+++ b/tools/testing/selftests/ftrace/Makefile
@@ -1,7 +1,8 @@
# SPDX-License-Identifier: GPL-2.0
all:
-TEST_PROGS := ftracetest
+TEST_PROGS_EXTENDED := ftracetest
+TEST_PROGS := ftracetest-ktap
TEST_FILES := test.d settings
EXTRA_CLEAN := $(OUTPUT)/logs/*
diff --git a/tools/testing/selftests/ftrace/ftracetest b/tools/testing/selftests/ftrace/ftracetest
index 8ec1922e974eb..9a73a110a8bfc 100755
--- a/tools/testing/selftests/ftrace/ftracetest
+++ b/tools/testing/selftests/ftrace/ftracetest
@@ -13,6 +13,7 @@ echo "Usage: ftracetest [options] [testcase(s)] [testcase-directory(s)]"
echo " Options:"
echo " -h|--help Show help message"
echo " -k|--keep Keep passed test logs"
+echo " -K|--ktap Output in KTAP format"
echo " -v|--verbose Increase verbosity of test messages"
echo " -vv Alias of -v -v (Show all results in stdout)"
echo " -vvv Alias of -v -v -v (Show all commands immediately)"
@@ -85,6 +86,10 @@ parse_opts() { # opts
KEEP_LOG=1
shift 1
;;
+ --ktap|-K)
+ KTAP=1
+ shift 1
+ ;;
--verbose|-v|-vv|-vvv)
if [ $VERBOSE -eq -1 ]; then
usage "--console can not use with --verbose"
@@ -178,6 +183,7 @@ TEST_DIR=$TOP_DIR/test.d
TEST_CASES=`find_testcases $TEST_DIR`
LOG_DIR=$TOP_DIR/logs/`date +%Y%m%d-%H%M%S`/
KEEP_LOG=0
+KTAP=0
DEBUG=0
VERBOSE=0
UNSUPPORTED_RESULT=0
@@ -229,7 +235,7 @@ prlog() { # messages
newline=
shift
fi
- printf "$*$newline"
+ [ "$KTAP" != "1" ] && printf "$*$newline"
[ "$LOG_FILE" ] && printf "$*$newline" | strip_esc >> $LOG_FILE
}
catlog() { #file
@@ -260,11 +266,11 @@ TOTAL_RESULT=0
INSTANCE=
CASENO=0
+CASENAME=
testcase() { # testfile
CASENO=$((CASENO+1))
- desc=`grep "^#[ \t]*description:" $1 | cut -f2- -d:`
- prlog -n "[$CASENO]$INSTANCE$desc"
+ CASENAME=`grep "^#[ \t]*description:" $1 | cut -f2- -d:`
}
checkreq() { # testfile
@@ -277,40 +283,68 @@ test_on_instance() { # testfile
grep -q "^#[ \t]*flags:.*instance" $1
}
+ktaptest() { # result comment
+ if [ "$KTAP" != "1" ]; then
+ return
+ fi
+
+ local result=
+ if [ "$1" = "1" ]; then
+ result="ok"
+ else
+ result="not ok"
+ fi
+ shift
+
+ local comment=$*
+ if [ "$comment" != "" ]; then
+ comment="# $comment"
+ fi
+
+ echo $CASENO $result $INSTANCE$CASENAME $comment
+}
+
eval_result() { # sigval
case $1 in
$PASS)
prlog " [${color_green}PASS${color_reset}]"
+ ktaptest 1
PASSED_CASES="$PASSED_CASES $CASENO"
return 0
;;
$FAIL)
prlog " [${color_red}FAIL${color_reset}]"
+ ktaptest 0
FAILED_CASES="$FAILED_CASES $CASENO"
return 1 # this is a bug.
;;
$UNRESOLVED)
prlog " [${color_blue}UNRESOLVED${color_reset}]"
+ ktaptest 0 UNRESOLVED
UNRESOLVED_CASES="$UNRESOLVED_CASES $CASENO"
return $UNRESOLVED_RESULT # depends on use case
;;
$UNTESTED)
prlog " [${color_blue}UNTESTED${color_reset}]"
+ ktaptest 1 SKIP
UNTESTED_CASES="$UNTESTED_CASES $CASENO"
return 0
;;
$UNSUPPORTED)
prlog " [${color_blue}UNSUPPORTED${color_reset}]"
+ ktaptest 1 SKIP
UNSUPPORTED_CASES="$UNSUPPORTED_CASES $CASENO"
return $UNSUPPORTED_RESULT # depends on use case
;;
$XFAIL)
prlog " [${color_green}XFAIL${color_reset}]"
+ ktaptest 1 XFAIL
XFAILED_CASES="$XFAILED_CASES $CASENO"
return 0
;;
*)
prlog " [${color_blue}UNDEFINED${color_reset}]"
+ ktaptest 0 error
UNDEFINED_CASES="$UNDEFINED_CASES $CASENO"
return 1 # this must be a test bug
;;
@@ -371,6 +405,7 @@ __run_test() { # testfile
run_test() { # testfile
local testname=`basename $1`
testcase $1
+ prlog -n "[$CASENO]$INSTANCE$CASENAME"
if [ ! -z "$LOG_FILE" ] ; then
local testlog=`mktemp $LOG_DIR/${CASENO}-${testname}-log.XXXXXX`
else
@@ -405,6 +440,17 @@ run_test() { # testfile
# load in the helper functions
. $TEST_DIR/functions
+if [ "$KTAP" = "1" ]; then
+ echo "TAP version 13"
+
+ casecount=`echo $TEST_CASES | wc -w`
+ for t in $TEST_CASES; do
+ test_on_instance $t || continue
+ casecount=$((casecount+1))
+ done
+ echo "1..${casecount}"
+fi
+
# Main loop
for t in $TEST_CASES; do
run_test $t
@@ -439,6 +485,17 @@ prlog "# of unsupported: " `echo $UNSUPPORTED_CASES | wc -w`
prlog "# of xfailed: " `echo $XFAILED_CASES | wc -w`
prlog "# of undefined(test bug): " `echo $UNDEFINED_CASES | wc -w`
+if [ "$KTAP" = "1" ]; then
+ echo -n "# Totals:"
+ echo -n " pass:"`echo $PASSED_CASES | wc -w`
+ echo -n " faii:"`echo $FAILED_CASES | wc -w`
+ echo -n " xfail:"`echo $XFAILED_CASES | wc -w`
+ echo -n " xpass:0"
+ echo -n " skip:"`echo $UNTESTED_CASES $UNSUPPORTED_CASES | wc -w`
+ echo -n " error:"`echo $UNRESOLVED_CASES $UNDEFINED_CASES | wc -w`
+ echo
+fi
+
cleanup
# if no error, return 0
diff --git a/tools/testing/selftests/ftrace/ftracetest-ktap b/tools/testing/selftests/ftrace/ftracetest-ktap
new file mode 100755
index 0000000000000..b3284679ef3af
--- /dev/null
+++ b/tools/testing/selftests/ftrace/ftracetest-ktap
@@ -0,0 +1,8 @@
+#!/bin/sh -e
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# ftracetest-ktap: Wrapper to integrate ftracetest with the kselftest runner
+#
+# Copyright (C) Arm Ltd., 2023
+
+./ftracetest -K
--
2.39.2
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit dbcf76390eb9a65d5d0c37b0cd57335218564e37 ]
The ftrace selftests do not currently produce KTAP output, they produce a
custom format much nicer for human consumption. This means that when run in
automated test systems we just get a single result for the suite as a whole
rather than recording results for individual test cases, making it harder
to look at the test data and masking things like inappropriate skips.
Address this by adding support for KTAP output to the ftracetest script and
providing a trivial wrapper which will be invoked by the kselftest runner
to generate output in this format by default, users using ftracetest
directly will continue to get the existing output.
This is not the most elegant solution but it is simple and effective. I
did consider implementing this by post processing the existing output
format but that felt more complex and likely to result in all output being
lost if something goes seriously wrong during the run which would not be
helpful. I did also consider just writing a separate runner script but
there's enough going on with things like the signal handling for that to
seem like it would be duplicating too much.
Acked-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Tested-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Shuah Khan <skhan(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/ftrace/Makefile | 3 +-
tools/testing/selftests/ftrace/ftracetest | 63 ++++++++++++++++++-
.../testing/selftests/ftrace/ftracetest-ktap | 8 +++
3 files changed, 70 insertions(+), 4 deletions(-)
create mode 100755 tools/testing/selftests/ftrace/ftracetest-ktap
diff --git a/tools/testing/selftests/ftrace/Makefile b/tools/testing/selftests/ftrace/Makefile
index d6e106fbce11c..a1e955d2de4cc 100644
--- a/tools/testing/selftests/ftrace/Makefile
+++ b/tools/testing/selftests/ftrace/Makefile
@@ -1,7 +1,8 @@
# SPDX-License-Identifier: GPL-2.0
all:
-TEST_PROGS := ftracetest
+TEST_PROGS_EXTENDED := ftracetest
+TEST_PROGS := ftracetest-ktap
TEST_FILES := test.d settings
EXTRA_CLEAN := $(OUTPUT)/logs/*
diff --git a/tools/testing/selftests/ftrace/ftracetest b/tools/testing/selftests/ftrace/ftracetest
index c3311c8c40890..2506621e75dfb 100755
--- a/tools/testing/selftests/ftrace/ftracetest
+++ b/tools/testing/selftests/ftrace/ftracetest
@@ -13,6 +13,7 @@ echo "Usage: ftracetest [options] [testcase(s)] [testcase-directory(s)]"
echo " Options:"
echo " -h|--help Show help message"
echo " -k|--keep Keep passed test logs"
+echo " -K|--ktap Output in KTAP format"
echo " -v|--verbose Increase verbosity of test messages"
echo " -vv Alias of -v -v (Show all results in stdout)"
echo " -vvv Alias of -v -v -v (Show all commands immediately)"
@@ -85,6 +86,10 @@ parse_opts() { # opts
KEEP_LOG=1
shift 1
;;
+ --ktap|-K)
+ KTAP=1
+ shift 1
+ ;;
--verbose|-v|-vv|-vvv)
if [ $VERBOSE -eq -1 ]; then
usage "--console can not use with --verbose"
@@ -178,6 +183,7 @@ TEST_DIR=$TOP_DIR/test.d
TEST_CASES=`find_testcases $TEST_DIR`
LOG_DIR=$TOP_DIR/logs/`date +%Y%m%d-%H%M%S`/
KEEP_LOG=0
+KTAP=0
DEBUG=0
VERBOSE=0
UNSUPPORTED_RESULT=0
@@ -229,7 +235,7 @@ prlog() { # messages
newline=
shift
fi
- printf "$*$newline"
+ [ "$KTAP" != "1" ] && printf "$*$newline"
[ "$LOG_FILE" ] && printf "$*$newline" | strip_esc >> $LOG_FILE
}
catlog() { #file
@@ -260,11 +266,11 @@ TOTAL_RESULT=0
INSTANCE=
CASENO=0
+CASENAME=
testcase() { # testfile
CASENO=$((CASENO+1))
- desc=`grep "^#[ \t]*description:" $1 | cut -f2- -d:`
- prlog -n "[$CASENO]$INSTANCE$desc"
+ CASENAME=`grep "^#[ \t]*description:" $1 | cut -f2- -d:`
}
checkreq() { # testfile
@@ -277,40 +283,68 @@ test_on_instance() { # testfile
grep -q "^#[ \t]*flags:.*instance" $1
}
+ktaptest() { # result comment
+ if [ "$KTAP" != "1" ]; then
+ return
+ fi
+
+ local result=
+ if [ "$1" = "1" ]; then
+ result="ok"
+ else
+ result="not ok"
+ fi
+ shift
+
+ local comment=$*
+ if [ "$comment" != "" ]; then
+ comment="# $comment"
+ fi
+
+ echo $CASENO $result $INSTANCE$CASENAME $comment
+}
+
eval_result() { # sigval
case $1 in
$PASS)
prlog " [${color_green}PASS${color_reset}]"
+ ktaptest 1
PASSED_CASES="$PASSED_CASES $CASENO"
return 0
;;
$FAIL)
prlog " [${color_red}FAIL${color_reset}]"
+ ktaptest 0
FAILED_CASES="$FAILED_CASES $CASENO"
return 1 # this is a bug.
;;
$UNRESOLVED)
prlog " [${color_blue}UNRESOLVED${color_reset}]"
+ ktaptest 0 UNRESOLVED
UNRESOLVED_CASES="$UNRESOLVED_CASES $CASENO"
return $UNRESOLVED_RESULT # depends on use case
;;
$UNTESTED)
prlog " [${color_blue}UNTESTED${color_reset}]"
+ ktaptest 1 SKIP
UNTESTED_CASES="$UNTESTED_CASES $CASENO"
return 0
;;
$UNSUPPORTED)
prlog " [${color_blue}UNSUPPORTED${color_reset}]"
+ ktaptest 1 SKIP
UNSUPPORTED_CASES="$UNSUPPORTED_CASES $CASENO"
return $UNSUPPORTED_RESULT # depends on use case
;;
$XFAIL)
prlog " [${color_green}XFAIL${color_reset}]"
+ ktaptest 1 XFAIL
XFAILED_CASES="$XFAILED_CASES $CASENO"
return 0
;;
*)
prlog " [${color_blue}UNDEFINED${color_reset}]"
+ ktaptest 0 error
UNDEFINED_CASES="$UNDEFINED_CASES $CASENO"
return 1 # this must be a test bug
;;
@@ -371,6 +405,7 @@ __run_test() { # testfile
run_test() { # testfile
local testname=`basename $1`
testcase $1
+ prlog -n "[$CASENO]$INSTANCE$CASENAME"
if [ ! -z "$LOG_FILE" ] ; then
local testlog=`mktemp $LOG_DIR/${CASENO}-${testname}-log.XXXXXX`
else
@@ -405,6 +440,17 @@ run_test() { # testfile
# load in the helper functions
. $TEST_DIR/functions
+if [ "$KTAP" = "1" ]; then
+ echo "TAP version 13"
+
+ casecount=`echo $TEST_CASES | wc -w`
+ for t in $TEST_CASES; do
+ test_on_instance $t || continue
+ casecount=$((casecount+1))
+ done
+ echo "1..${casecount}"
+fi
+
# Main loop
for t in $TEST_CASES; do
run_test $t
@@ -439,6 +485,17 @@ prlog "# of unsupported: " `echo $UNSUPPORTED_CASES | wc -w`
prlog "# of xfailed: " `echo $XFAILED_CASES | wc -w`
prlog "# of undefined(test bug): " `echo $UNDEFINED_CASES | wc -w`
+if [ "$KTAP" = "1" ]; then
+ echo -n "# Totals:"
+ echo -n " pass:"`echo $PASSED_CASES | wc -w`
+ echo -n " faii:"`echo $FAILED_CASES | wc -w`
+ echo -n " xfail:"`echo $XFAILED_CASES | wc -w`
+ echo -n " xpass:0"
+ echo -n " skip:"`echo $UNTESTED_CASES $UNSUPPORTED_CASES | wc -w`
+ echo -n " error:"`echo $UNRESOLVED_CASES $UNDEFINED_CASES | wc -w`
+ echo
+fi
+
cleanup
# if no error, return 0
diff --git a/tools/testing/selftests/ftrace/ftracetest-ktap b/tools/testing/selftests/ftrace/ftracetest-ktap
new file mode 100755
index 0000000000000..b3284679ef3af
--- /dev/null
+++ b/tools/testing/selftests/ftrace/ftracetest-ktap
@@ -0,0 +1,8 @@
+#!/bin/sh -e
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# ftracetest-ktap: Wrapper to integrate ftracetest with the kselftest runner
+#
+# Copyright (C) Arm Ltd., 2023
+
+./ftracetest -K
--
2.39.2
From: Jinrong Liang <cloudliang(a)tencent.com>
From: Jinrong Liang <cloudliang(a)tencent.com>
Hi,
This patch set adds some tests to ensure consistent PMU performance event
filter behavior. Specifically, the patches aim to improve KVM's PMU event
filter by strengthening the test coverage, adding documentation, and making
other small changes.
The first patch replaces int with uint32_t for nevents to ensure consistency
and readability in the code. The second patch adds fixed_counter_bitmap to
create_pmu_event_filter() to support the use of the same creator to control
the use of guest fixed counters. The third patch adds test cases for
unsupported input values in PMU filter, including unsupported "action"
values, unsupported "flags" values, and unsupported "nevents" values. Also,
it tests setting non-existent fixed counters in the fixed bitmap doesn't
fail.
The fourth patch updates the documentation for KVM_SET_PMU_EVENT_FILTER ioctl
to include a detailed description of how fixed performance events are handled
in the pmu filter. The fifth patch adds tests to cover that pmu_event_filter
works as expected when applied to fixed performance counters, even if there
is no fixed counter exists. The sixth patch adds a test to ensure that setting
both generic and fixed performance event filters does not affect the consistency
of the fixed performance filter behavior in KVM. The seventh patch adds a test
to verify the behavior of the pmu event filter when an incomplete
kvm_pmu_event_filter structure is used.
These changes help to ensure that KVM's PMU event filter functions as expected
in all supported use cases. These patches have been tested and verified to
function properly.
Thanks for your review and feedback.
Sincerely,
Jinrong Liang
Previous:
https://lore.kernel.org/kvm/20230414110056.19665-1-cloudliang@tencent.com
v2:
- Wrap the code from the documentation in a block of code; (Bagas Sanjaya)
Jinrong Liang (7):
KVM: selftests: Replace int with uint32_t for nevents
KVM: selftests: Apply create_pmu_event_filter() to fixed ctrs
KVM: selftests: Test unavailable event filters are rejected
KVM: x86/pmu: Add documentation for fixed ctr on PMU filter
KVM: selftests: Check if pmu_event_filter meets expectations on fixed
ctrs
KVM: selftests: Check gp event filters without affecting fixed event
filters
KVM: selftests: Test pmu event filter with incompatible
kvm_pmu_event_filter
Documentation/virt/kvm/api.rst | 21 ++
.../kvm/x86_64/pmu_event_filter_test.c | 239 ++++++++++++++++--
2 files changed, 243 insertions(+), 17 deletions(-)
base-commit: a25497a280bbd7bbcc08c87ddb2b3909affc8402
--
2.31.1
This is the v7 of this series which tries to implement the fd-based KVM
guest private memory. The patches are based on latest kvm/queue branch
commit:
b9b71f43683a (kvm/queue) KVM: x86/mmu: Buffer nested MMU
split_desc_cache only by default capacity
Introduction
------------
In general this patch series introduce fd-based memslot which provides
guest memory through memory file descriptor fd[offset,size] instead of
hva/size. The fd can be created from a supported memory filesystem
like tmpfs/hugetlbfs etc. which we refer as memory backing store. KVM
and the the memory backing store exchange callbacks when such memslot
gets created. At runtime KVM will call into callbacks provided by the
backing store to get the pfn with the fd+offset. Memory backing store
will also call into KVM callbacks when userspace punch hole on the fd
to notify KVM to unmap secondary MMU page table entries.
Comparing to existing hva-based memslot, this new type of memslot allows
guest memory unmapped from host userspace like QEMU and even the kernel
itself, therefore reduce attack surface and prevent bugs.
Based on this fd-based memslot, we can build guest private memory that
is going to be used in confidential computing environments such as Intel
TDX and AMD SEV. When supported, the memory backing store can provide
more enforcement on the fd and KVM can use a single memslot to hold both
the private and shared part of the guest memory.
mm extension
---------------------
Introduces new MFD_INACCESSIBLE flag for memfd_create(), the file
created with these flags cannot read(), write() or mmap() etc via normal
MMU operations. The file content can only be used with the newly
introduced memfile_notifier extension.
The memfile_notifier extension provides two sets of callbacks for KVM to
interact with the memory backing store:
- memfile_notifier_ops: callbacks for memory backing store to notify
KVM when memory gets invalidated.
- backing store callbacks: callbacks for KVM to call into memory
backing store to request memory pages for guest private memory.
The memfile_notifier extension also provides APIs for memory backing
store to register/unregister itself and to trigger the notifier when the
bookmarked memory gets invalidated.
The patchset also introduces a new memfd seal F_SEAL_AUTO_ALLOCATE to
prevent double allocation caused by unintentional guest when we only
have a single side of the shared/private memfds effective.
memslot extension
-----------------
Add the private fd and the fd offset to existing 'shared' memslot so
that both private/shared guest memory can live in one single memslot.
A page in the memslot is either private or shared. Whether a guest page
is private or shared is maintained through reusing existing SEV ioctls
KVM_MEMORY_ENCRYPT_{UN,}REG_REGION.
Test
----
To test the new functionalities of this patch TDX patchset is needed.
Since TDX patchset has not been merged so I did two kinds of test:
- Regresion test on kvm/queue (this patchset)
Most new code are not covered. Code also in below repo:
https://github.com/chao-p/linux/tree/privmem-v7
- New Funational test on latest TDX code
The patch is rebased to latest TDX code and tested the new
funcationalities. See below repos:
Linux: https://github.com/chao-p/linux/tree/privmem-v7-tdx
QEMU: https://github.com/chao-p/qemu/tree/privmem-v7
An example QEMU command line for TDX test:
-object tdx-guest,id=tdx,debug=off,sept-ve-disable=off \
-machine confidential-guest-support=tdx \
-object memory-backend-memfd-private,id=ram1,size=${mem} \
-machine memory-backend=ram1
Changelog
----------
v7:
- Move the private/shared info from backing store to KVM.
- Introduce F_SEAL_AUTO_ALLOCATE to avoid double allocation.
- Rework on the sync mechanism between zap/page fault paths.
- Addressed other comments in v6.
v6:
- Re-organzied patch for both mm/KVM parts.
- Added flags for memfile_notifier so its consumers can state their
features and memory backing store can check against these flags.
- Put a backing store reference in the memfile_notifier and move pfn_ops
into backing store.
- Only support boot time backing store register.
- Overall KVM part improvement suggested by Sean and some others.
v5:
- Removed userspace visible F_SEAL_INACCESSIBLE, instead using an
in-kernel flag (SHM_F_INACCESSIBLE for shmem). Private fd can only
be created by MFD_INACCESSIBLE.
- Introduced new APIs for backing store to register itself to
memfile_notifier instead of direct function call.
- Added the accounting and restriction for MFD_INACCESSIBLE memory.
- Added KVM API doc for new memslot extensions and man page for the new
MFD_INACCESSIBLE flag.
- Removed the overlap check for mapping the same file+offset into
multiple gfns due to perf consideration, warned in document.
- Addressed other comments in v4.
v4:
- Decoupled the callbacks between KVM/mm from memfd and use new
name 'memfile_notifier'.
- Supported register multiple memslots to the same backing store.
- Added per-memslot pfn_ops instead of per-system.
- Reworked the invalidation part.
- Improved new KVM uAPIs (private memslot extension and memory
error) per Sean's suggestions.
- Addressed many other minor fixes for comments from v3.
v3:
- Added locking protection when calling
invalidate_page_range/fallocate callbacks.
- Changed memslot structure to keep use useraddr for shared memory.
- Re-organized F_SEAL_INACCESSIBLE and MEMFD_OPS.
- Added MFD_INACCESSIBLE flag to force F_SEAL_INACCESSIBLE.
- Commit message improvement.
- Many small fixes for comments from the last version.
Links to previous discussions
-----------------------------
[1] Original design proposal:
https://lkml.kernel.org/kvm/20210824005248.200037-1-seanjc@google.com/
[2] Updated proposal and RFC patch v1:
https://lkml.kernel.org/linux-fsdevel/20211111141352.26311-1-chao.p.peng@li…
[3] Patch v5: https://lkml.org/lkml/2022/5/19/861
Chao Peng (12):
mm: Add F_SEAL_AUTO_ALLOCATE seal to memfd
selftests/memfd: Add tests for F_SEAL_AUTO_ALLOCATE
mm: Introduce memfile_notifier
mm/memfd: Introduce MFD_INACCESSIBLE flag
KVM: Rename KVM_PRIVATE_MEM_SLOTS to KVM_INTERNAL_MEM_SLOTS
KVM: Use gfn instead of hva for mmu_notifier_retry
KVM: Rename mmu_notifier_*
KVM: Extend the memslot to support fd-based private memory
KVM: Add KVM_EXIT_MEMORY_FAULT exit
KVM: Register/unregister the guest private memory regions
KVM: Handle page fault for private memory
KVM: Enable and expose KVM_MEM_PRIVATE
Kirill A. Shutemov (1):
mm/shmem: Support memfile_notifier
Documentation/virt/kvm/api.rst | 77 +++++-
arch/arm64/kvm/mmu.c | 8 +-
arch/mips/include/asm/kvm_host.h | 2 +-
arch/mips/kvm/mmu.c | 10 +-
arch/powerpc/include/asm/kvm_book3s_64.h | 2 +-
arch/powerpc/kvm/book3s_64_mmu_host.c | 4 +-
arch/powerpc/kvm/book3s_64_mmu_hv.c | 4 +-
arch/powerpc/kvm/book3s_64_mmu_radix.c | 6 +-
arch/powerpc/kvm/book3s_hv_nested.c | 2 +-
arch/powerpc/kvm/book3s_hv_rm_mmu.c | 8 +-
arch/powerpc/kvm/e500_mmu_host.c | 4 +-
arch/riscv/kvm/mmu.c | 4 +-
arch/x86/include/asm/kvm_host.h | 3 +-
arch/x86/kvm/Kconfig | 3 +
arch/x86/kvm/mmu.h | 2 -
arch/x86/kvm/mmu/mmu.c | 74 +++++-
arch/x86/kvm/mmu/mmu_internal.h | 18 ++
arch/x86/kvm/mmu/mmutrace.h | 1 +
arch/x86/kvm/mmu/paging_tmpl.h | 4 +-
arch/x86/kvm/x86.c | 2 +-
include/linux/kvm_host.h | 105 +++++---
include/linux/memfile_notifier.h | 91 +++++++
include/linux/shmem_fs.h | 2 +
include/uapi/linux/fcntl.h | 1 +
include/uapi/linux/kvm.h | 37 +++
include/uapi/linux/memfd.h | 1 +
mm/Kconfig | 4 +
mm/Makefile | 1 +
mm/memfd.c | 18 +-
mm/memfile_notifier.c | 123 ++++++++++
mm/shmem.c | 125 +++++++++-
tools/testing/selftests/memfd/memfd_test.c | 166 +++++++++++++
virt/kvm/Kconfig | 3 +
virt/kvm/kvm_main.c | 272 ++++++++++++++++++---
virt/kvm/pfncache.c | 14 +-
35 files changed, 1074 insertions(+), 127 deletions(-)
create mode 100644 include/linux/memfile_notifier.h
create mode 100644 mm/memfile_notifier.c
--
2.25.1
Many uses of the KUnit resource system are intended to simply defer
calling a function until the test exits (be it due to success or
failure). The existing kunit_alloc_resource() function is often used for
this, but was awkward to use (requiring passing NULL init functions, etc),
and returned a resource without incrementing its reference count, which
-- while okay for this use-case -- could cause problems in others.
Instead, introduce a simple kunit_add_action() API: a simple function
(returning nothing, accepting a single void* argument) can be scheduled
to be called when the test exits. Deferred actions are called in the
opposite order to that which they were registered.
This mimics the devres API, devm_add_action(), and also provides
kunit_remove_action(), to cancel a deferred action, and
kunit_release_action() to trigger one early.
This is implemented as a resource under the hood, so the ordering
between resource cleanup and deferred functions is maintained.
Reviewed-by: Benjamin Berg <benjamin.berg(a)intel.com>
Reviewed-by: Maxime Ripard <maxime(a)cerno.tech>
Tested-by: Maxime Ripard <maxime(a)cerno.tech>
Signed-off-by: David Gow <davidgow(a)google.com>
---
No changes since v2:
https://lore.kernel.org/linux-kselftest/20230518083849.2631178-1-davidgow@g…
Changes since v1:
https://lore.kernel.org/linux-kselftest/20230421084226.2278282-2-davidgow@g…
- Some small documentation updates (Thanks Daniel)
- Reinstate a typedef for the action function.
- This time, it's called kunit_action_t
- Thanks Maxime!
Changes since RFC v2:
https://lore.kernel.org/linux-kselftest/20230331080411.981038-2-davidgow@go…
- Got rid of internal_gfp
- everything uses GFP_KERNEL now
- This includes kunit_kzalloc() and friends, which still allocate the
returned memory with the provided GFP, but use GFP_KERNEL for
internal bookkeeping data.
- Thanks Maxime & Benjamin!
- Got rid of cancellation tokens.
- kunit_add_action() now returns 0 on success, otherwise an error
- Note that this can quite easily lead to a memory leak, so look at
kunit_add_action_or_reset()
- Thanks Maxime & Benjamin!
- Added kunit_add_action_or_reset
- Matches devm_add_action_or_reset()
- Returns 0 on success.
- Thanks Maxime & Benjamin!
- Got rid of the kunit_defer_func_t typedef.
- I liked it, but it is probably pushing the boundaries of kernel
style.
- Use (void (*)(void *)) instead.
Changes since RFC v1:
https://lore.kernel.org/linux-kselftest/20230325043104.3761770-2-davidgow@g…
- Rename functions to better match the devm_* APIs. (Thanks Maxime)
- Embed the kunit_resource in struct kunit_action_ctx to avoid an extra
allocation (Thanks Benjamin)
- Use 'struct kunit_action_ctx' as the type for cancellation tokens
(Thanks Benjamin)
- Add tests.
---
include/kunit/resource.h | 92 +++++++++++++++++++++++++++++++++++++
lib/kunit/kunit-test.c | 88 ++++++++++++++++++++++++++++++++++-
lib/kunit/resource.c | 99 ++++++++++++++++++++++++++++++++++++++++
3 files changed, 278 insertions(+), 1 deletion(-)
diff --git a/include/kunit/resource.h b/include/kunit/resource.h
index c0d88b318e90..b64eb783b1bc 100644
--- a/include/kunit/resource.h
+++ b/include/kunit/resource.h
@@ -387,4 +387,96 @@ static inline int kunit_destroy_named_resource(struct kunit *test,
*/
void kunit_remove_resource(struct kunit *test, struct kunit_resource *res);
+/* A 'deferred action' function to be used with kunit_add_action. */
+typedef void (kunit_action_t)(void *);
+
+/**
+ * kunit_add_action() - Call a function when the test ends.
+ * @test: Test case to associate the action with.
+ * @func: The function to run on test exit
+ * @ctx: Data passed into @func
+ *
+ * Defer the execution of a function until the test exits, either normally or
+ * due to a failure. @ctx is passed as additional context. All functions
+ * registered with kunit_add_action() will execute in the opposite order to that
+ * they were registered in.
+ *
+ * This is useful for cleaning up allocated memory and resources, as these
+ * functions are called even if the test aborts early due to, e.g., a failed
+ * assertion.
+ *
+ * See also: devm_add_action() for the devres equivalent.
+ *
+ * Returns:
+ * 0 on success, an error if the action could not be deferred.
+ */
+int kunit_add_action(struct kunit *test, kunit_action_t *action, void *ctx);
+
+/**
+ * kunit_add_action_or_reset() - Call a function when the test ends.
+ * @test: Test case to associate the action with.
+ * @func: The function to run on test exit
+ * @ctx: Data passed into @func
+ *
+ * Defer the execution of a function until the test exits, either normally or
+ * due to a failure. @ctx is passed as additional context. All functions
+ * registered with kunit_add_action() will execute in the opposite order to that
+ * they were registered in.
+ *
+ * This is useful for cleaning up allocated memory and resources, as these
+ * functions are called even if the test aborts early due to, e.g., a failed
+ * assertion.
+ *
+ * If the action cannot be created (e.g., due to the system being out of memory),
+ * then action(ctx) will be called immediately, and an error will be returned.
+ *
+ * See also: devm_add_action_or_reset() for the devres equivalent.
+ *
+ * Returns:
+ * 0 on success, an error if the action could not be deferred.
+ */
+int kunit_add_action_or_reset(struct kunit *test, kunit_action_t *action,
+ void *ctx);
+
+/**
+ * kunit_remove_action() - Cancel a matching deferred action.
+ * @test: Test case the action is associated with.
+ * @func: The deferred function to cancel.
+ * @ctx: The context passed to the deferred function to trigger.
+ *
+ * Prevent an action deferred via kunit_add_action() from executing when the
+ * test terminates.
+ *
+ * If the function/context pair was deferred multiple times, only the most
+ * recent one will be cancelled.
+ *
+ * See also: devm_remove_action() for the devres equivalent.
+ */
+void kunit_remove_action(struct kunit *test,
+ kunit_action_t *action,
+ void *ctx);
+
+/**
+ * kunit_release_action() - Run a matching action call immediately.
+ * @test: Test case the action is associated with.
+ * @func: The deferred function to trigger.
+ * @ctx: The context passed to the deferred function to trigger.
+ *
+ * Execute a function deferred via kunit_add_action()) immediately, rather than
+ * when the test ends.
+ *
+ * If the function/context pair was deferred multiple times, it will only be
+ * executed once here. The most recent deferral will no longer execute when
+ * the test ends.
+ *
+ * kunit_release_action(test, func, ctx);
+ * is equivalent to
+ * func(ctx);
+ * kunit_remove_action(test, func, ctx);
+ *
+ * See also: devm_release_action() for the devres equivalent.
+ */
+void kunit_release_action(struct kunit *test,
+ kunit_action_t *action,
+ void *ctx);
#endif /* _KUNIT_RESOURCE_H */
diff --git a/lib/kunit/kunit-test.c b/lib/kunit/kunit-test.c
index 42e44caa1bdd..83d8e90ca7a2 100644
--- a/lib/kunit/kunit-test.c
+++ b/lib/kunit/kunit-test.c
@@ -112,7 +112,7 @@ struct kunit_test_resource_context {
struct kunit test;
bool is_resource_initialized;
int allocate_order[2];
- int free_order[2];
+ int free_order[4];
};
static int fake_resource_init(struct kunit_resource *res, void *context)
@@ -403,6 +403,88 @@ static void kunit_resource_test_named(struct kunit *test)
KUNIT_EXPECT_TRUE(test, list_empty(&test->resources));
}
+static void increment_int(void *ctx)
+{
+ int *i = (int *)ctx;
+ (*i)++;
+}
+
+static void kunit_resource_test_action(struct kunit *test)
+{
+ int num_actions = 0;
+
+ kunit_add_action(test, increment_int, &num_actions);
+ KUNIT_EXPECT_EQ(test, num_actions, 0);
+ kunit_cleanup(test);
+ KUNIT_EXPECT_EQ(test, num_actions, 1);
+
+ /* Once we've cleaned up, the action queue is empty. */
+ kunit_cleanup(test);
+ KUNIT_EXPECT_EQ(test, num_actions, 1);
+
+ /* Check the same function can be deferred multiple times. */
+ kunit_add_action(test, increment_int, &num_actions);
+ kunit_add_action(test, increment_int, &num_actions);
+ kunit_cleanup(test);
+ KUNIT_EXPECT_EQ(test, num_actions, 3);
+}
+static void kunit_resource_test_remove_action(struct kunit *test)
+{
+ int num_actions = 0;
+
+ kunit_add_action(test, increment_int, &num_actions);
+ KUNIT_EXPECT_EQ(test, num_actions, 0);
+
+ kunit_remove_action(test, increment_int, &num_actions);
+ kunit_cleanup(test);
+ KUNIT_EXPECT_EQ(test, num_actions, 0);
+}
+static void kunit_resource_test_release_action(struct kunit *test)
+{
+ int num_actions = 0;
+
+ kunit_add_action(test, increment_int, &num_actions);
+ KUNIT_EXPECT_EQ(test, num_actions, 0);
+ /* Runs immediately on trigger. */
+ kunit_release_action(test, increment_int, &num_actions);
+ KUNIT_EXPECT_EQ(test, num_actions, 1);
+
+ /* Doesn't run again on test exit. */
+ kunit_cleanup(test);
+ KUNIT_EXPECT_EQ(test, num_actions, 1);
+}
+static void action_order_1(void *ctx)
+{
+ struct kunit_test_resource_context *res_ctx = (struct kunit_test_resource_context *)ctx;
+
+ KUNIT_RESOURCE_TEST_MARK_ORDER(res_ctx, free_order, 1);
+ kunit_log(KERN_INFO, current->kunit_test, "action_order_1");
+}
+static void action_order_2(void *ctx)
+{
+ struct kunit_test_resource_context *res_ctx = (struct kunit_test_resource_context *)ctx;
+
+ KUNIT_RESOURCE_TEST_MARK_ORDER(res_ctx, free_order, 2);
+ kunit_log(KERN_INFO, current->kunit_test, "action_order_2");
+}
+static void kunit_resource_test_action_ordering(struct kunit *test)
+{
+ struct kunit_test_resource_context *ctx = test->priv;
+
+ kunit_add_action(test, action_order_1, ctx);
+ kunit_add_action(test, action_order_2, ctx);
+ kunit_add_action(test, action_order_1, ctx);
+ kunit_add_action(test, action_order_2, ctx);
+ kunit_remove_action(test, action_order_1, ctx);
+ kunit_release_action(test, action_order_2, ctx);
+ kunit_cleanup(test);
+
+ /* [2 is triggered] [2], [(1 is cancelled)] [1] */
+ KUNIT_EXPECT_EQ(test, ctx->free_order[0], 2);
+ KUNIT_EXPECT_EQ(test, ctx->free_order[1], 2);
+ KUNIT_EXPECT_EQ(test, ctx->free_order[2], 1);
+}
+
static int kunit_resource_test_init(struct kunit *test)
{
struct kunit_test_resource_context *ctx =
@@ -434,6 +516,10 @@ static struct kunit_case kunit_resource_test_cases[] = {
KUNIT_CASE(kunit_resource_test_proper_free_ordering),
KUNIT_CASE(kunit_resource_test_static),
KUNIT_CASE(kunit_resource_test_named),
+ KUNIT_CASE(kunit_resource_test_action),
+ KUNIT_CASE(kunit_resource_test_remove_action),
+ KUNIT_CASE(kunit_resource_test_release_action),
+ KUNIT_CASE(kunit_resource_test_action_ordering),
{}
};
diff --git a/lib/kunit/resource.c b/lib/kunit/resource.c
index c414df922f34..f0209252b179 100644
--- a/lib/kunit/resource.c
+++ b/lib/kunit/resource.c
@@ -77,3 +77,102 @@ int kunit_destroy_resource(struct kunit *test, kunit_resource_match_t match,
return 0;
}
EXPORT_SYMBOL_GPL(kunit_destroy_resource);
+
+struct kunit_action_ctx {
+ struct kunit_resource res;
+ kunit_action_t *func;
+ void *ctx;
+};
+
+static void __kunit_action_free(struct kunit_resource *res)
+{
+ struct kunit_action_ctx *action_ctx = container_of(res, struct kunit_action_ctx, res);
+
+ action_ctx->func(action_ctx->ctx);
+}
+
+
+int kunit_add_action(struct kunit *test, void (*action)(void *), void *ctx)
+{
+ struct kunit_action_ctx *action_ctx;
+
+ KUNIT_ASSERT_NOT_NULL_MSG(test, action, "Tried to action a NULL function!");
+
+ action_ctx = kzalloc(sizeof(*action_ctx), GFP_KERNEL);
+ if (!action_ctx)
+ return -ENOMEM;
+
+ action_ctx->func = action;
+ action_ctx->ctx = ctx;
+
+ action_ctx->res.should_kfree = true;
+ /* As init is NULL, this cannot fail. */
+ __kunit_add_resource(test, NULL, __kunit_action_free, &action_ctx->res, action_ctx);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(kunit_add_action);
+
+int kunit_add_action_or_reset(struct kunit *test, void (*action)(void *),
+ void *ctx)
+{
+ int res = kunit_add_action(test, action, ctx);
+
+ if (res)
+ action(ctx);
+ return res;
+}
+EXPORT_SYMBOL_GPL(kunit_add_action_or_reset);
+
+static bool __kunit_action_match(struct kunit *test,
+ struct kunit_resource *res, void *match_data)
+{
+ struct kunit_action_ctx *match_ctx = (struct kunit_action_ctx *)match_data;
+ struct kunit_action_ctx *res_ctx = container_of(res, struct kunit_action_ctx, res);
+
+ /* Make sure this is a free function. */
+ if (res->free != __kunit_action_free)
+ return false;
+
+ /* Both the function and context data should match. */
+ return (match_ctx->func == res_ctx->func) && (match_ctx->ctx == res_ctx->ctx);
+}
+
+void kunit_remove_action(struct kunit *test,
+ kunit_action_t *action,
+ void *ctx)
+{
+ struct kunit_action_ctx match_ctx;
+ struct kunit_resource *res;
+
+ match_ctx.func = action;
+ match_ctx.ctx = ctx;
+
+ res = kunit_find_resource(test, __kunit_action_match, &match_ctx);
+ if (res) {
+ /* Remove the free function so we don't run the action. */
+ res->free = NULL;
+ kunit_remove_resource(test, res);
+ kunit_put_resource(res);
+ }
+}
+EXPORT_SYMBOL_GPL(kunit_remove_action);
+
+void kunit_release_action(struct kunit *test,
+ kunit_action_t *action,
+ void *ctx)
+{
+ struct kunit_action_ctx match_ctx;
+ struct kunit_resource *res;
+
+ match_ctx.func = action;
+ match_ctx.ctx = ctx;
+
+ res = kunit_find_resource(test, __kunit_action_match, &match_ctx);
+ if (res) {
+ kunit_remove_resource(test, res);
+ /* We have to put() this here, else free won't be called. */
+ kunit_put_resource(res);
+ }
+}
+EXPORT_SYMBOL_GPL(kunit_release_action);
--
2.41.0.rc0.172.g3f132b7071-goog
Many uses of the KUnit resource system are intended to simply defer
calling a function until the test exits (be it due to success or
failure). The existing kunit_alloc_resource() function is often used for
this, but was awkward to use (requiring passing NULL init functions, etc),
and returned a resource without incrementing its reference count, which
-- while okay for this use-case -- could cause problems in others.
Instead, introduce a simple kunit_add_action() API: a simple function
(returning nothing, accepting a single void* argument) can be scheduled
to be called when the test exits. Deferred actions are called in the
opposite order to that which they were registered.
This mimics the devres API, devm_add_action(), and also provides
kunit_remove_action(), to cancel a deferred action, and
kunit_release_action() to trigger one early.
This is implemented as a resource under the hood, so the ordering
between resource cleanup and deferred functions is maintained.
Reviewed-by: Benjamin Berg <benjamin.berg(a)intel.com>
Reviewed-by: Maxime Ripard <maxime(a)cerno.tech>
Tested-by: Maxime Ripard <maxime(a)cerno.tech>
Signed-off-by: David Gow <davidgow(a)google.com>
---
Changes since v1:
https://lore.kernel.org/linux-kselftest/20230421084226.2278282-2-davidgow@g…
- Some small documentation updates (Thanks Daniel)
- Reinstate a typedef for the action function.
- This time, it's called kunit_action_t
- Thanks Maxime!
Changes since RFC v2:
https://lore.kernel.org/linux-kselftest/20230331080411.981038-2-davidgow@go…
- Got rid of internal_gfp
- everything uses GFP_KERNEL now
- This includes kunit_kzalloc() and friends, which still allocate the
returned memory with the provided GFP, but use GFP_KERNEL for
internal bookkeeping data.
- Thanks Maxime & Benjamin!
- Got rid of cancellation tokens.
- kunit_add_action() now returns 0 on success, otherwise an error
- Note that this can quite easily lead to a memory leak, so look at
kunit_add_action_or_reset()
- Thanks Maxime & Benjamin!
- Added kunit_add_action_or_reset
- Matches devm_add_action_or_reset()
- Returns 0 on success.
- Thanks Maxime & Benjamin!
- Got rid of the kunit_defer_func_t typedef.
- I liked it, but it is probably pushing the boundaries of kernel
style.
- Use (void (*)(void *)) instead.
Changes since RFC v1:
https://lore.kernel.org/linux-kselftest/20230325043104.3761770-2-davidgow@g…
- Rename functions to better match the devm_* APIs. (Thanks Maxime)
- Embed the kunit_resource in struct kunit_action_ctx to avoid an extra
allocation (Thanks Benjamin)
- Use 'struct kunit_action_ctx' as the type for cancellation tokens
(Thanks Benjamin)
- Add tests.
---
include/kunit/resource.h | 92 +++++++++++++++++++++++++++++++++++++
lib/kunit/kunit-test.c | 88 ++++++++++++++++++++++++++++++++++-
lib/kunit/resource.c | 99 ++++++++++++++++++++++++++++++++++++++++
3 files changed, 278 insertions(+), 1 deletion(-)
diff --git a/include/kunit/resource.h b/include/kunit/resource.h
index c0d88b318e90..b64eb783b1bc 100644
--- a/include/kunit/resource.h
+++ b/include/kunit/resource.h
@@ -387,4 +387,96 @@ static inline int kunit_destroy_named_resource(struct kunit *test,
*/
void kunit_remove_resource(struct kunit *test, struct kunit_resource *res);
+/* A 'deferred action' function to be used with kunit_add_action. */
+typedef void (kunit_action_t)(void *);
+
+/**
+ * kunit_add_action() - Call a function when the test ends.
+ * @test: Test case to associate the action with.
+ * @func: The function to run on test exit
+ * @ctx: Data passed into @func
+ *
+ * Defer the execution of a function until the test exits, either normally or
+ * due to a failure. @ctx is passed as additional context. All functions
+ * registered with kunit_add_action() will execute in the opposite order to that
+ * they were registered in.
+ *
+ * This is useful for cleaning up allocated memory and resources, as these
+ * functions are called even if the test aborts early due to, e.g., a failed
+ * assertion.
+ *
+ * See also: devm_add_action() for the devres equivalent.
+ *
+ * Returns:
+ * 0 on success, an error if the action could not be deferred.
+ */
+int kunit_add_action(struct kunit *test, kunit_action_t *action, void *ctx);
+
+/**
+ * kunit_add_action_or_reset() - Call a function when the test ends.
+ * @test: Test case to associate the action with.
+ * @func: The function to run on test exit
+ * @ctx: Data passed into @func
+ *
+ * Defer the execution of a function until the test exits, either normally or
+ * due to a failure. @ctx is passed as additional context. All functions
+ * registered with kunit_add_action() will execute in the opposite order to that
+ * they were registered in.
+ *
+ * This is useful for cleaning up allocated memory and resources, as these
+ * functions are called even if the test aborts early due to, e.g., a failed
+ * assertion.
+ *
+ * If the action cannot be created (e.g., due to the system being out of memory),
+ * then action(ctx) will be called immediately, and an error will be returned.
+ *
+ * See also: devm_add_action_or_reset() for the devres equivalent.
+ *
+ * Returns:
+ * 0 on success, an error if the action could not be deferred.
+ */
+int kunit_add_action_or_reset(struct kunit *test, kunit_action_t *action,
+ void *ctx);
+
+/**
+ * kunit_remove_action() - Cancel a matching deferred action.
+ * @test: Test case the action is associated with.
+ * @func: The deferred function to cancel.
+ * @ctx: The context passed to the deferred function to trigger.
+ *
+ * Prevent an action deferred via kunit_add_action() from executing when the
+ * test terminates.
+ *
+ * If the function/context pair was deferred multiple times, only the most
+ * recent one will be cancelled.
+ *
+ * See also: devm_remove_action() for the devres equivalent.
+ */
+void kunit_remove_action(struct kunit *test,
+ kunit_action_t *action,
+ void *ctx);
+
+/**
+ * kunit_release_action() - Run a matching action call immediately.
+ * @test: Test case the action is associated with.
+ * @func: The deferred function to trigger.
+ * @ctx: The context passed to the deferred function to trigger.
+ *
+ * Execute a function deferred via kunit_add_action()) immediately, rather than
+ * when the test ends.
+ *
+ * If the function/context pair was deferred multiple times, it will only be
+ * executed once here. The most recent deferral will no longer execute when
+ * the test ends.
+ *
+ * kunit_release_action(test, func, ctx);
+ * is equivalent to
+ * func(ctx);
+ * kunit_remove_action(test, func, ctx);
+ *
+ * See also: devm_release_action() for the devres equivalent.
+ */
+void kunit_release_action(struct kunit *test,
+ kunit_action_t *action,
+ void *ctx);
#endif /* _KUNIT_RESOURCE_H */
diff --git a/lib/kunit/kunit-test.c b/lib/kunit/kunit-test.c
index 42e44caa1bdd..83d8e90ca7a2 100644
--- a/lib/kunit/kunit-test.c
+++ b/lib/kunit/kunit-test.c
@@ -112,7 +112,7 @@ struct kunit_test_resource_context {
struct kunit test;
bool is_resource_initialized;
int allocate_order[2];
- int free_order[2];
+ int free_order[4];
};
static int fake_resource_init(struct kunit_resource *res, void *context)
@@ -403,6 +403,88 @@ static void kunit_resource_test_named(struct kunit *test)
KUNIT_EXPECT_TRUE(test, list_empty(&test->resources));
}
+static void increment_int(void *ctx)
+{
+ int *i = (int *)ctx;
+ (*i)++;
+}
+
+static void kunit_resource_test_action(struct kunit *test)
+{
+ int num_actions = 0;
+
+ kunit_add_action(test, increment_int, &num_actions);
+ KUNIT_EXPECT_EQ(test, num_actions, 0);
+ kunit_cleanup(test);
+ KUNIT_EXPECT_EQ(test, num_actions, 1);
+
+ /* Once we've cleaned up, the action queue is empty. */
+ kunit_cleanup(test);
+ KUNIT_EXPECT_EQ(test, num_actions, 1);
+
+ /* Check the same function can be deferred multiple times. */
+ kunit_add_action(test, increment_int, &num_actions);
+ kunit_add_action(test, increment_int, &num_actions);
+ kunit_cleanup(test);
+ KUNIT_EXPECT_EQ(test, num_actions, 3);
+}
+static void kunit_resource_test_remove_action(struct kunit *test)
+{
+ int num_actions = 0;
+
+ kunit_add_action(test, increment_int, &num_actions);
+ KUNIT_EXPECT_EQ(test, num_actions, 0);
+
+ kunit_remove_action(test, increment_int, &num_actions);
+ kunit_cleanup(test);
+ KUNIT_EXPECT_EQ(test, num_actions, 0);
+}
+static void kunit_resource_test_release_action(struct kunit *test)
+{
+ int num_actions = 0;
+
+ kunit_add_action(test, increment_int, &num_actions);
+ KUNIT_EXPECT_EQ(test, num_actions, 0);
+ /* Runs immediately on trigger. */
+ kunit_release_action(test, increment_int, &num_actions);
+ KUNIT_EXPECT_EQ(test, num_actions, 1);
+
+ /* Doesn't run again on test exit. */
+ kunit_cleanup(test);
+ KUNIT_EXPECT_EQ(test, num_actions, 1);
+}
+static void action_order_1(void *ctx)
+{
+ struct kunit_test_resource_context *res_ctx = (struct kunit_test_resource_context *)ctx;
+
+ KUNIT_RESOURCE_TEST_MARK_ORDER(res_ctx, free_order, 1);
+ kunit_log(KERN_INFO, current->kunit_test, "action_order_1");
+}
+static void action_order_2(void *ctx)
+{
+ struct kunit_test_resource_context *res_ctx = (struct kunit_test_resource_context *)ctx;
+
+ KUNIT_RESOURCE_TEST_MARK_ORDER(res_ctx, free_order, 2);
+ kunit_log(KERN_INFO, current->kunit_test, "action_order_2");
+}
+static void kunit_resource_test_action_ordering(struct kunit *test)
+{
+ struct kunit_test_resource_context *ctx = test->priv;
+
+ kunit_add_action(test, action_order_1, ctx);
+ kunit_add_action(test, action_order_2, ctx);
+ kunit_add_action(test, action_order_1, ctx);
+ kunit_add_action(test, action_order_2, ctx);
+ kunit_remove_action(test, action_order_1, ctx);
+ kunit_release_action(test, action_order_2, ctx);
+ kunit_cleanup(test);
+
+ /* [2 is triggered] [2], [(1 is cancelled)] [1] */
+ KUNIT_EXPECT_EQ(test, ctx->free_order[0], 2);
+ KUNIT_EXPECT_EQ(test, ctx->free_order[1], 2);
+ KUNIT_EXPECT_EQ(test, ctx->free_order[2], 1);
+}
+
static int kunit_resource_test_init(struct kunit *test)
{
struct kunit_test_resource_context *ctx =
@@ -434,6 +516,10 @@ static struct kunit_case kunit_resource_test_cases[] = {
KUNIT_CASE(kunit_resource_test_proper_free_ordering),
KUNIT_CASE(kunit_resource_test_static),
KUNIT_CASE(kunit_resource_test_named),
+ KUNIT_CASE(kunit_resource_test_action),
+ KUNIT_CASE(kunit_resource_test_remove_action),
+ KUNIT_CASE(kunit_resource_test_release_action),
+ KUNIT_CASE(kunit_resource_test_action_ordering),
{}
};
diff --git a/lib/kunit/resource.c b/lib/kunit/resource.c
index c414df922f34..f0209252b179 100644
--- a/lib/kunit/resource.c
+++ b/lib/kunit/resource.c
@@ -77,3 +77,102 @@ int kunit_destroy_resource(struct kunit *test, kunit_resource_match_t match,
return 0;
}
EXPORT_SYMBOL_GPL(kunit_destroy_resource);
+
+struct kunit_action_ctx {
+ struct kunit_resource res;
+ kunit_action_t *func;
+ void *ctx;
+};
+
+static void __kunit_action_free(struct kunit_resource *res)
+{
+ struct kunit_action_ctx *action_ctx = container_of(res, struct kunit_action_ctx, res);
+
+ action_ctx->func(action_ctx->ctx);
+}
+
+
+int kunit_add_action(struct kunit *test, void (*action)(void *), void *ctx)
+{
+ struct kunit_action_ctx *action_ctx;
+
+ KUNIT_ASSERT_NOT_NULL_MSG(test, action, "Tried to action a NULL function!");
+
+ action_ctx = kzalloc(sizeof(*action_ctx), GFP_KERNEL);
+ if (!action_ctx)
+ return -ENOMEM;
+
+ action_ctx->func = action;
+ action_ctx->ctx = ctx;
+
+ action_ctx->res.should_kfree = true;
+ /* As init is NULL, this cannot fail. */
+ __kunit_add_resource(test, NULL, __kunit_action_free, &action_ctx->res, action_ctx);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(kunit_add_action);
+
+int kunit_add_action_or_reset(struct kunit *test, void (*action)(void *),
+ void *ctx)
+{
+ int res = kunit_add_action(test, action, ctx);
+
+ if (res)
+ action(ctx);
+ return res;
+}
+EXPORT_SYMBOL_GPL(kunit_add_action_or_reset);
+
+static bool __kunit_action_match(struct kunit *test,
+ struct kunit_resource *res, void *match_data)
+{
+ struct kunit_action_ctx *match_ctx = (struct kunit_action_ctx *)match_data;
+ struct kunit_action_ctx *res_ctx = container_of(res, struct kunit_action_ctx, res);
+
+ /* Make sure this is a free function. */
+ if (res->free != __kunit_action_free)
+ return false;
+
+ /* Both the function and context data should match. */
+ return (match_ctx->func == res_ctx->func) && (match_ctx->ctx == res_ctx->ctx);
+}
+
+void kunit_remove_action(struct kunit *test,
+ kunit_action_t *action,
+ void *ctx)
+{
+ struct kunit_action_ctx match_ctx;
+ struct kunit_resource *res;
+
+ match_ctx.func = action;
+ match_ctx.ctx = ctx;
+
+ res = kunit_find_resource(test, __kunit_action_match, &match_ctx);
+ if (res) {
+ /* Remove the free function so we don't run the action. */
+ res->free = NULL;
+ kunit_remove_resource(test, res);
+ kunit_put_resource(res);
+ }
+}
+EXPORT_SYMBOL_GPL(kunit_remove_action);
+
+void kunit_release_action(struct kunit *test,
+ kunit_action_t *action,
+ void *ctx)
+{
+ struct kunit_action_ctx match_ctx;
+ struct kunit_resource *res;
+
+ match_ctx.func = action;
+ match_ctx.ctx = ctx;
+
+ res = kunit_find_resource(test, __kunit_action_match, &match_ctx);
+ if (res) {
+ kunit_remove_resource(test, res);
+ /* We have to put() this here, else free won't be called. */
+ kunit_put_resource(res);
+ }
+}
+EXPORT_SYMBOL_GPL(kunit_release_action);
--
2.40.1.698.g37aff9b760-goog
The basic idea here is to "simulate" memory poisoning for VMs. A VM
running on some host might encounter a memory error, after which some
page(s) are poisoned (i.e., future accesses SIGBUS). They expect that
once poisoned, pages can never become "un-poisoned". So, when we live
migrate the VM, we need to preserve the poisoned status of these pages.
When live migrating, we try to get the guest running on its new host as
quickly as possible. So, we start it running before all memory has been
copied, and before we're certain which pages should be poisoned or not.
So the basic way to use this new feature is:
- On the new host, the guest's memory is registered with userfaultfd, in
either MISSING or MINOR mode (doesn't really matter for this purpose).
- On any first access, we get a userfaultfd event. At this point we can
communicate with the old host to find out if the page was poisoned.
- If so, we can respond with a UFFDIO_SIGBUS - this places a swap marker
so any future accesses will SIGBUS. Because the pte is now "present",
future accesses won't generate more userfaultfd events, they'll just
SIGBUS directly.
UFFDIO_SIGBUS does not handle unmapping previously-present PTEs. This
isn't needed, because during live migration we want to intercept
all accesses with userfaultfd (not just writes, so WP mode isn't useful
for this). So whether minor or missing mode is being used (or both), the
PTE won't be present in any case, so handling that case isn't needed.
Signed-off-by: Axel Rasmussen <axelrasmussen(a)google.com>
---
fs/userfaultfd.c | 63 ++++++++++++++++++++++++++++++++
include/linux/swapops.h | 3 +-
include/linux/userfaultfd_k.h | 4 ++
include/uapi/linux/userfaultfd.h | 25 +++++++++++--
mm/memory.c | 4 ++
mm/userfaultfd.c | 62 ++++++++++++++++++++++++++++++-
6 files changed, 156 insertions(+), 5 deletions(-)
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 0fd96d6e39ce..edc2928dae2b 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1966,6 +1966,66 @@ static int userfaultfd_continue(struct userfaultfd_ctx *ctx, unsigned long arg)
return ret;
}
+static inline int userfaultfd_sigbus(struct userfaultfd_ctx *ctx, unsigned long arg)
+{
+ __s64 ret;
+ struct uffdio_sigbus uffdio_sigbus;
+ struct uffdio_sigbus __user *user_uffdio_sigbus;
+ struct userfaultfd_wake_range range;
+
+ user_uffdio_sigbus = (struct uffdio_sigbus __user *)arg;
+
+ ret = -EAGAIN;
+ if (atomic_read(&ctx->mmap_changing))
+ goto out;
+
+ ret = -EFAULT;
+ if (copy_from_user(&uffdio_sigbus, user_uffdio_sigbus,
+ /* don't copy the output fields */
+ sizeof(uffdio_sigbus) - (sizeof(__s64))))
+ goto out;
+
+ ret = validate_range(ctx->mm, uffdio_sigbus.range.start,
+ uffdio_sigbus.range.len);
+ if (ret)
+ goto out;
+
+ ret = -EINVAL;
+ /* double check for wraparound just in case. */
+ if (uffdio_sigbus.range.start + uffdio_sigbus.range.len <=
+ uffdio_sigbus.range.start) {
+ goto out;
+ }
+ if (uffdio_sigbus.mode & ~UFFDIO_SIGBUS_MODE_DONTWAKE)
+ goto out;
+
+ if (mmget_not_zero(ctx->mm)) {
+ ret = mfill_atomic_sigbus(ctx->mm, uffdio_sigbus.range.start,
+ uffdio_sigbus.range.len,
+ &ctx->mmap_changing, 0);
+ mmput(ctx->mm);
+ } else {
+ return -ESRCH;
+ }
+
+ if (unlikely(put_user(ret, &user_uffdio_sigbus->updated)))
+ return -EFAULT;
+ if (ret < 0)
+ goto out;
+
+ /* len == 0 would wake all */
+ BUG_ON(!ret);
+ range.len = ret;
+ if (!(uffdio_sigbus.mode & UFFDIO_SIGBUS_MODE_DONTWAKE)) {
+ range.start = uffdio_sigbus.range.start;
+ wake_userfault(ctx, &range);
+ }
+ ret = range.len == uffdio_sigbus.range.len ? 0 : -EAGAIN;
+
+out:
+ return ret;
+}
+
static inline unsigned int uffd_ctx_features(__u64 user_features)
{
/*
@@ -2067,6 +2127,9 @@ static long userfaultfd_ioctl(struct file *file, unsigned cmd,
case UFFDIO_CONTINUE:
ret = userfaultfd_continue(ctx, arg);
break;
+ case UFFDIO_SIGBUS:
+ ret = userfaultfd_sigbus(ctx, arg);
+ break;
}
return ret;
}
diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index 3a451b7afcb3..fa778a0ae730 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -405,7 +405,8 @@ typedef unsigned long pte_marker;
#define PTE_MARKER_UFFD_WP BIT(0)
#define PTE_MARKER_SWAPIN_ERROR BIT(1)
-#define PTE_MARKER_MASK (BIT(2) - 1)
+#define PTE_MARKER_UFFD_SIGBUS BIT(2)
+#define PTE_MARKER_MASK (BIT(3) - 1)
static inline swp_entry_t make_pte_marker_entry(pte_marker marker)
{
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index d78b01524349..6de1084939c5 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -46,6 +46,7 @@ enum mfill_atomic_mode {
MFILL_ATOMIC_COPY,
MFILL_ATOMIC_ZEROPAGE,
MFILL_ATOMIC_CONTINUE,
+ MFILL_ATOMIC_SIGBUS,
NR_MFILL_ATOMIC_MODES,
};
@@ -83,6 +84,9 @@ extern ssize_t mfill_atomic_zeropage(struct mm_struct *dst_mm,
extern ssize_t mfill_atomic_continue(struct mm_struct *dst_mm, unsigned long dst_start,
unsigned long len, atomic_t *mmap_changing,
uffd_flags_t flags);
+extern ssize_t mfill_atomic_sigbus(struct mm_struct *dst_mm, unsigned long start,
+ unsigned long len, atomic_t *mmap_changing,
+ uffd_flags_t flags);
extern int mwriteprotect_range(struct mm_struct *dst_mm,
unsigned long start, unsigned long len,
bool enable_wp, atomic_t *mmap_changing);
diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
index 66dd4cd277bd..616e33d3db97 100644
--- a/include/uapi/linux/userfaultfd.h
+++ b/include/uapi/linux/userfaultfd.h
@@ -39,7 +39,8 @@
UFFD_FEATURE_MINOR_SHMEM | \
UFFD_FEATURE_EXACT_ADDRESS | \
UFFD_FEATURE_WP_HUGETLBFS_SHMEM | \
- UFFD_FEATURE_WP_UNPOPULATED)
+ UFFD_FEATURE_WP_UNPOPULATED | \
+ UFFD_FEATURE_SIGBUS_IOCTL)
#define UFFD_API_IOCTLS \
((__u64)1 << _UFFDIO_REGISTER | \
(__u64)1 << _UFFDIO_UNREGISTER | \
@@ -49,12 +50,14 @@
(__u64)1 << _UFFDIO_COPY | \
(__u64)1 << _UFFDIO_ZEROPAGE | \
(__u64)1 << _UFFDIO_WRITEPROTECT | \
- (__u64)1 << _UFFDIO_CONTINUE)
+ (__u64)1 << _UFFDIO_CONTINUE | \
+ (__u64)1 << _UFFDIO_SIGBUS)
#define UFFD_API_RANGE_IOCTLS_BASIC \
((__u64)1 << _UFFDIO_WAKE | \
(__u64)1 << _UFFDIO_COPY | \
+ (__u64)1 << _UFFDIO_WRITEPROTECT | \
(__u64)1 << _UFFDIO_CONTINUE | \
- (__u64)1 << _UFFDIO_WRITEPROTECT)
+ (__u64)1 << _UFFDIO_SIGBUS)
/*
* Valid ioctl command number range with this API is from 0x00 to
@@ -71,6 +74,7 @@
#define _UFFDIO_ZEROPAGE (0x04)
#define _UFFDIO_WRITEPROTECT (0x06)
#define _UFFDIO_CONTINUE (0x07)
+#define _UFFDIO_SIGBUS (0x08)
#define _UFFDIO_API (0x3F)
/* userfaultfd ioctl ids */
@@ -91,6 +95,8 @@
struct uffdio_writeprotect)
#define UFFDIO_CONTINUE _IOWR(UFFDIO, _UFFDIO_CONTINUE, \
struct uffdio_continue)
+#define UFFDIO_SIGBUS _IOWR(UFFDIO, _UFFDIO_SIGBUS, \
+ struct uffdio_sigbus)
/* read() structure */
struct uffd_msg {
@@ -225,6 +231,7 @@ struct uffdio_api {
#define UFFD_FEATURE_EXACT_ADDRESS (1<<11)
#define UFFD_FEATURE_WP_HUGETLBFS_SHMEM (1<<12)
#define UFFD_FEATURE_WP_UNPOPULATED (1<<13)
+#define UFFD_FEATURE_SIGBUS_IOCTL (1<<14)
__u64 features;
__u64 ioctls;
@@ -321,6 +328,18 @@ struct uffdio_continue {
__s64 mapped;
};
+struct uffdio_sigbus {
+ struct uffdio_range range;
+#define UFFDIO_SIGBUS_MODE_DONTWAKE ((__u64)1<<0)
+ __u64 mode;
+
+ /*
+ * Fields below here are written by the ioctl and must be at the end:
+ * the copy_from_user will not read past here.
+ */
+ __s64 updated;
+};
+
/*
* Flags for the userfaultfd(2) system call itself.
*/
diff --git a/mm/memory.c b/mm/memory.c
index f69fbc251198..e4b4207c2590 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3675,6 +3675,10 @@ static vm_fault_t handle_pte_marker(struct vm_fault *vmf)
if (WARN_ON_ONCE(!marker))
return VM_FAULT_SIGBUS;
+ /* SIGBUS explicitly requested for this PTE. */
+ if (marker & PTE_MARKER_UFFD_SIGBUS)
+ return VM_FAULT_SIGBUS;
+
/* Higher priority than uffd-wp when data corrupted */
if (marker & PTE_MARKER_SWAPIN_ERROR)
return VM_FAULT_SIGBUS;
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index e97a0b4889fc..933587eebd5d 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -278,6 +278,51 @@ static int mfill_atomic_pte_continue(pmd_t *dst_pmd,
goto out;
}
+/* Handles UFFDIO_SIGBUS for all non-hugetlb VMAs. */
+static int mfill_atomic_pte_sigbus(pmd_t *dst_pmd,
+ struct vm_area_struct *dst_vma,
+ unsigned long dst_addr,
+ uffd_flags_t flags)
+{
+ int ret;
+ struct mm_struct *dst_mm = dst_vma->vm_mm;
+ pte_t _dst_pte, *dst_pte;
+ spinlock_t *ptl;
+
+ _dst_pte = make_pte_marker(PTE_MARKER_UFFD_SIGBUS);
+ dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl);
+
+ if (vma_is_shmem(dst_vma)) {
+ struct inode *inode;
+ pgoff_t offset, max_off;
+
+ /* serialize against truncate with the page table lock */
+ inode = dst_vma->vm_file->f_inode;
+ offset = linear_page_index(dst_vma, dst_addr);
+ max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
+ ret = -EFAULT;
+ if (unlikely(offset >= max_off))
+ goto out_unlock;
+ }
+
+ ret = -EEXIST;
+ /*
+ * For now, we don't handle unmapping pages, so only support filling in
+ * none PTEs, or replacing PTE markers.
+ */
+ if (!pte_none_mostly(*dst_pte))
+ goto out_unlock;
+
+ set_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte);
+
+ /* No need to invalidate - it was non-present before */
+ update_mmu_cache(dst_vma, dst_addr, dst_pte);
+ ret = 0;
+out_unlock:
+ pte_unmap_unlock(dst_pte, ptl);
+ return ret;
+}
+
static pmd_t *mm_alloc_pmd(struct mm_struct *mm, unsigned long address)
{
pgd_t *pgd;
@@ -328,8 +373,12 @@ static __always_inline ssize_t mfill_atomic_hugetlb(
* supported by hugetlb. A PMD_SIZE huge pages may exist as used
* by THP. Since we can not reliably insert a zero page, this
* feature is not supported.
+ *
+ * PTE marker handling for hugetlb is a bit special, so for now
+ * UFFDIO_SIGBUS is not supported.
*/
- if (uffd_flags_mode_is(flags, MFILL_ATOMIC_ZEROPAGE)) {
+ if (uffd_flags_mode_is(flags, MFILL_ATOMIC_ZEROPAGE) ||
+ uffd_flags_mode_is(flags, MFILL_ATOMIC_SIGBUS)) {
mmap_read_unlock(dst_mm);
return -EINVAL;
}
@@ -473,6 +522,9 @@ static __always_inline ssize_t mfill_atomic_pte(pmd_t *dst_pmd,
if (uffd_flags_mode_is(flags, MFILL_ATOMIC_CONTINUE)) {
return mfill_atomic_pte_continue(dst_pmd, dst_vma,
dst_addr, flags);
+ } else if (uffd_flags_mode_is(flags, MFILL_ATOMIC_SIGBUS)) {
+ return mfill_atomic_pte_sigbus(dst_pmd, dst_vma,
+ dst_addr, flags);
}
/*
@@ -694,6 +746,14 @@ ssize_t mfill_atomic_continue(struct mm_struct *dst_mm, unsigned long start,
uffd_flags_set_mode(flags, MFILL_ATOMIC_CONTINUE));
}
+ssize_t mfill_atomic_sigbus(struct mm_struct *dst_mm, unsigned long start,
+ unsigned long len, atomic_t *mmap_changing,
+ uffd_flags_t flags)
+{
+ return mfill_atomic(dst_mm, start, 0, len, mmap_changing,
+ uffd_flags_set_mode(flags, MFILL_ATOMIC_SIGBUS));
+}
+
long uffd_wp_range(struct vm_area_struct *dst_vma,
unsigned long start, unsigned long len, bool enable_wp)
{
--
2.40.1.606.ga4b1b128d6-goog
*Changes in v15*
- Build fix (Add missed build fix in RESEND)
*Changes in v14*
- Fix build error caused by #ifdef added at last minute in some configs
*Changes in v13*
- Rebase on top of next-20230414
- Give-up on using uffd_wp_range() and write new helpers, flush tlb only
once
*Changes in v12*
- Update and other memory types to UFFD_FEATURE_WP_ASYNC
- Rebaase on top of next-20230406
- Review updates
*Changes in v11*
- Rebase on top of next-20230307
- Base patches on UFFD_FEATURE_WP_UNPOPULATED
- Do a lot of cosmetic changes and review updates
- Remove ENGAGE_WP + !GET operation as it can be performed with
UFFDIO_WRITEPROTECT
*Changes in v10*
- Add specific condition to return error if hugetlb is used with wp
async
- Move changes in tools/include/uapi/linux/fs.h to separate patch
- Add documentation
*Changes in v9:*
- Correct fault resolution for userfaultfd wp async
- Fix build warnings and errors which were happening on some configs
- Simplify pagemap ioctl's code
*Changes in v8:*
- Update uffd async wp implementation
- Improve PAGEMAP_IOCTL implementation
*Changes in v7:*
- Add uffd wp async
- Update the IOCTL to use uffd under the hood instead of soft-dirty
flags
*Motivation*
The real motivation for adding PAGEMAP_SCAN IOCTL is to emulate Windows
GetWriteWatch() syscall [1]. The GetWriteWatch{} retrieves the addresses of
the pages that are written to in a region of virtual memory.
This syscall is used in Windows applications and games etc. This syscall is
being emulated in pretty slow manner in userspace. Our purpose is to
enhance the kernel such that we translate it efficiently in a better way.
Currently some out of tree hack patches are being used to efficiently
emulate it in some kernels. We intend to replace those with these patches.
So the whole gaming on Linux can effectively get benefit from this. It
means there would be tons of users of this code.
CRIU use case [2] was mentioned by Andrei and Danylo:
> Use cases for migrating sparse VMAs are binaries sanitized with ASAN,
> MSAN or TSAN [3]. All of these sanitizers produce sparse mappings of
> shadow memory [4]. Being able to migrate such binaries allows to highly
> reduce the amount of work needed to identify and fix post-migration
> crashes, which happen constantly.
Andrei's defines the following uses of this code:
* it is more granular and allows us to track changed pages more
effectively. The current interface can clear dirty bits for the entire
process only. In addition, reading info about pages is a separate
operation. It means we must freeze the process to read information
about all its pages, reset dirty bits, only then we can start dumping
pages. The information about pages becomes more and more outdated,
while we are processing pages. The new interface solves both these
downsides. First, it allows us to read pte bits and clear the
soft-dirty bit atomically. It means that CRIU will not need to freeze
processes to pre-dump their memory. Second, it clears soft-dirty bits
for a specified region of memory. It means CRIU will have actual info
about pages to the moment of dumping them.
* The new interface has to be much faster because basic page filtering
is happening in the kernel. With the old interface, we have to read
pagemap for each page.
*Implementation Evolution (Short Summary)*
From the definition of GetWriteWatch(), we feel like kernel's soft-dirty
feature can be used under the hood with some additions like:
* reset soft-dirty flag for only a specific region of memory instead of
clearing the flag for the entire process
* get and clear soft-dirty flag for a specific region atomically
So we decided to use ioctl on pagemap file to read or/and reset soft-dirty
flag. But using soft-dirty flag, sometimes we get extra pages which weren't
even written. They had become soft-dirty because of VMA merging and
VM_SOFTDIRTY flag. This breaks the definition of GetWriteWatch(). We were
able to by-pass this short coming by ignoring VM_SOFTDIRTY until David
reported that mprotect etc messes up the soft-dirty flag while ignoring
VM_SOFTDIRTY [5]. This wasn't happening until [6] got introduced. We
discussed if we can revert these patches. But we could not reach to any
conclusion. So at this point, I made couple of tries to solve this whole
VM_SOFTDIRTY issue by correcting the soft-dirty implementation:
* [7] Correct the bug fixed wrongly back in 2014. It had potential to cause
regression. We left it behind.
* [8] Keep a list of soft-dirty part of a VMA across splits and merges. I
got the reply don't increase the size of the VMA by 8 bytes.
At this point, we left soft-dirty considering it is too much delicate and
userfaultfd [9] seemed like the only way forward. From there onward, we
have been basing soft-dirty emulation on userfaultfd wp feature where
kernel resolves the faults itself when WP_ASYNC feature is used. It was
straight forward to add WP_ASYNC feature in userfautlfd. Now we get only
those pages dirty or written-to which are really written in reality. (PS
There is another WP_UNPOPULATED userfautfd feature is required which is
needed to avoid pre-faulting memory before write-protecting [9].)
All the different masks were added on the request of CRIU devs to create
interface more generic and better.
[1] https://learn.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-…
[2] https://lore.kernel.org/all/20221014134802.1361436-1-mdanylo@google.com
[3] https://github.com/google/sanitizers
[4] https://github.com/google/sanitizers/wiki/AddressSanitizerAlgorithm#64-bit
[5] https://lore.kernel.org/all/bfcae708-db21-04b4-0bbe-712badd03071@redhat.com
[6] https://lore.kernel.org/all/20220725142048.30450-1-peterx@redhat.com/
[7] https://lore.kernel.org/all/20221122115007.2787017-1-usama.anjum@collabora.…
[8] https://lore.kernel.org/all/20221220162606.1595355-1-usama.anjum@collabora.…
[9] https://lore.kernel.org/all/20230306213925.617814-1-peterx@redhat.com
[10] https://lore.kernel.org/all/20230125144529.1630917-1-mdanylo@google.com
* Original Cover letter from v8*
Hello,
Note:
Soft-dirty pages and pages which have been written-to are synonyms. As
kernel already has soft-dirty feature inside which we have given up to
use, we are using written-to terminology while using UFFD async WP under
the hood.
This IOCTL, PAGEMAP_SCAN on pagemap file can be used to get and/or clear
the info about page table entries. The following operations are
supported in this ioctl:
- Get the information if the pages have been written-to (PAGE_IS_WRITTEN),
file mapped (PAGE_IS_FILE), present (PAGE_IS_PRESENT) or swapped
(PAGE_IS_SWAPPED).
- Write-protect the pages (PAGEMAP_WP_ENGAGE) to start finding which
pages have been written-to.
- Find pages which have been written-to and write protect the pages
(atomic PAGE_IS_WRITTEN + PAGEMAP_WP_ENGAGE)
It is possible to find and clear soft-dirty pages entirely in userspace.
But it isn't efficient:
- The mprotect and SIGSEGV handler for bookkeeping
- The userfaultfd wp (synchronous) with the handler for bookkeeping
Some benchmarks can be seen here[1]. This series adds features that weren't
present earlier:
- There is no atomic get soft-dirty/Written-to status and clear present in
the kernel.
- The pages which have been written-to can not be found in accurate way.
(Kernel's soft-dirty PTE bit + sof_dirty VMA bit shows more soft-dirty
pages than there actually are.)
Historically, soft-dirty PTE bit tracking has been used in the CRIU
project. The procfs interface is enough for finding the soft-dirty bit
status and clearing the soft-dirty bit of all the pages of a process.
We have the use case where we need to track the soft-dirty PTE bit for
only specific pages on-demand. We need this tracking and clear mechanism
of a region of memory while the process is running to emulate the
getWriteWatch() syscall of Windows.
*(Moved to using UFFD instead of soft-dirtyi feature to find pages which
have been written-to from v7 patch series)*:
Stop using the soft-dirty flags for finding which pages have been
written to. It is too delicate and wrong as it shows more soft-dirty
pages than the actual soft-dirty pages. There is no interest in
correcting it [2][3] as this is how the feature was written years ago.
It shouldn't be updated to changed behaviour. Peter Xu has suggested
using the async version of the UFFD WP [4] as it is based inherently
on the PTEs.
So in this patch series, I've added a new mode to the UFFD which is
asynchronous version of the write protect. When this variant of the
UFFD WP is used, the page faults are resolved automatically by the
kernel. The pages which have been written-to can be found by reading
pagemap file (!PM_UFFD_WP). This feature can be used successfully to
find which pages have been written to from the time the pages were
write protected. This works just like the soft-dirty flag without
showing any extra pages which aren't soft-dirty in reality.
The information related to pages if the page is file mapped, present and
swapped is required for the CRIU project [5][6]. The addition of the
required mask, any mask, excluded mask and return masks are also required
for the CRIU project [5].
The IOCTL returns the addresses of the pages which match the specific
masks. The page addresses are returned in struct page_region in a compact
form. The max_pages is needed to support a use case where user only wants
to get a specific number of pages. So there is no need to find all the
pages of interest in the range when max_pages is specified. The IOCTL
returns when the maximum number of the pages are found. The max_pages is
optional. If max_pages is specified, it must be equal or greater than the
vec_size. This restriction is needed to handle worse case when one
page_region only contains info of one page and it cannot be compacted.
This is needed to emulate the Windows getWriteWatch() syscall.
The patch series include the detailed selftest which can be used as an
example for the uffd async wp test and PAGEMAP_IOCTL. It shows the
interface usages as well.
[1] https://lore.kernel.org/lkml/54d4c322-cd6e-eefd-b161-2af2b56aae24@collabora…
[2] https://lore.kernel.org/all/20221220162606.1595355-1-usama.anjum@collabora.…
[3] https://lore.kernel.org/all/20221122115007.2787017-1-usama.anjum@collabora.…
[4] https://lore.kernel.org/all/Y6Hc2d+7eTKs7AiH@x1n
[5] https://lore.kernel.org/all/YyiDg79flhWoMDZB@gmail.com/
[6] https://lore.kernel.org/all/20221014134802.1361436-1-mdanylo@google.com/
Regards,
Muhammad Usama Anjum
Muhammad Usama Anjum (4):
fs/proc/task_mmu: Implement IOCTL to get and optionally clear info
about PTEs
tools headers UAPI: Update linux/fs.h with the kernel sources
mm/pagemap: add documentation of PAGEMAP_SCAN IOCTL
selftests: mm: add pagemap ioctl tests
Peter Xu (1):
userfaultfd: UFFD_FEATURE_WP_ASYNC
Documentation/admin-guide/mm/pagemap.rst | 56 +
Documentation/admin-guide/mm/userfaultfd.rst | 35 +
fs/proc/task_mmu.c | 481 +++++++
fs/userfaultfd.c | 26 +-
include/linux/userfaultfd_k.h | 21 +-
include/uapi/linux/fs.h | 53 +
include/uapi/linux/userfaultfd.h | 9 +-
mm/hugetlb.c | 32 +-
mm/memory.c | 27 +-
tools/include/uapi/linux/fs.h | 53 +
tools/testing/selftests/mm/.gitignore | 1 +
tools/testing/selftests/mm/Makefile | 3 +-
tools/testing/selftests/mm/config | 1 +
tools/testing/selftests/mm/pagemap_ioctl.c | 1326 ++++++++++++++++++
tools/testing/selftests/mm/run_vmtests.sh | 4 +
15 files changed, 2105 insertions(+), 23 deletions(-)
create mode 100644 tools/testing/selftests/mm/pagemap_ioctl.c
mode change 100644 => 100755 tools/testing/selftests/mm/run_vmtests.sh
--
2.39.2
Hi,
On vanilla AlmaLinux 8.7 (CentOS fork) selftests/net/af_unix/diag_uid.c doesn't
compile out of the box, giving the errors:
make[2]: Entering directory '/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/net/af_unix'
gcc diag_uid.c -o /home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/net/af_unix/diag_uid
diag_uid.c:36:16: error: ‘UDIAG_SHOW_UID’ undeclared here (not in a function); did you mean ‘UDIAG_SHOW_VFS’?
.udiag_show = UDIAG_SHOW_UID
^~~~~~~~~~~~~~
UDIAG_SHOW_VFS
In file included from diag_uid.c:17:
diag_uid.c: In function ‘render_response’:
diag_uid.c:128:28: error: ‘UNIX_DIAG_UID’ undeclared (first use in this function); did you mean ‘UNIX_DIAG_VFS’?
ASSERT_EQ(attr->rta_type, UNIX_DIAG_UID);
^~~~~~~~~~~~~
../../kselftest_harness.h:707:13: note: in definition of macro ‘__EXPECT’
__typeof__(_seen) __seen = (_seen); \
^~~~~
diag_uid.c:128:2: note: in expansion of macro ‘ASSERT_EQ’
ASSERT_EQ(attr->rta_type, UNIX_DIAG_UID);
^~~~~~~~~
diag_uid.c:128:28: note: each undeclared identifier is reported only once for each function it appears in
ASSERT_EQ(attr->rta_type, UNIX_DIAG_UID);
^~~~~~~~~~~~~
../../kselftest_harness.h:707:13: note: in definition of macro ‘__EXPECT’
__typeof__(_seen) __seen = (_seen); \
^~~~~
diag_uid.c:128:2: note: in expansion of macro ‘ASSERT_EQ’
ASSERT_EQ(attr->rta_type, UNIX_DIAG_UID);
^~~~~~~~~
make[2]: *** [../../lib.mk:147: /home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/net/af_unix/diag_uid] Error 1
The correct value is in <uapi/linux/unix_diag.h>:
include/uapi/linux/unix_diag.h:23:#define UDIAG_SHOW_UID 0x00000040 /* show socket's UID */
The fix is as follows:
---
tools/testing/selftests/net/af_unix/diag_uid.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/tools/testing/selftests/net/af_unix/diag_uid.c b/tools/testing/selftests/net/af_unix/diag_uid.c
index 5b88f7129fea..66d75b646d35 100644
--- a/tools/testing/selftests/net/af_unix/diag_uid.c
+++ b/tools/testing/selftests/net/af_unix/diag_uid.c
@@ -16,6 +16,10 @@
#include "../../kselftest_harness.h"
+#ifndef UDIAG_SHOW_UID
+#define UDIAG_SHOW_UID 0x00000040 /* show socket's UID */
+#endif
+
FIXTURE(diag_uid)
{
int netlink_fd;
--
However, this patch reveals another undefined value:
make[2]: Entering directory '/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/net/af_unix'
gcc diag_uid.c -o /home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/net/af_unix/diag_uid
In file included from diag_uid.c:17:
diag_uid.c: In function ‘render_response’:
diag_uid.c:132:28: error: ‘UNIX_DIAG_UID’ undeclared (first use in this function); did you mean ‘UNIX_DIAG_VFS’?
ASSERT_EQ(attr->rta_type, UNIX_DIAG_UID);
^~~~~~~~~~~~~
../../kselftest_harness.h:707:13: note: in definition of macro ‘__EXPECT’
__typeof__(_seen) __seen = (_seen); \
^~~~~
diag_uid.c:132:2: note: in expansion of macro ‘ASSERT_EQ’
ASSERT_EQ(attr->rta_type, UNIX_DIAG_UID);
^~~~~~~~~
diag_uid.c:132:28: note: each undeclared identifier is reported only once for each function it appears in
ASSERT_EQ(attr->rta_type, UNIX_DIAG_UID);
^~~~~~~~~~~~~
../../kselftest_harness.h:707:13: note: in definition of macro ‘__EXPECT’
__typeof__(_seen) __seen = (_seen); \
^~~~~
diag_uid.c:132:2: note: in expansion of macro ‘ASSERT_EQ’
ASSERT_EQ(attr->rta_type, UNIX_DIAG_UID);
^~~~~~~~~
make[2]: *** [../../lib.mk:147: /home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/net/af_unix/diag_uid] Error 1
Apparently, AlmaLinux 8.7 lacks this enum UNIX_DIAG_UID:
diff -u /usr/include/linux/unix_diag.h include/uapi/linux/unix_diag.h
--- /usr/include/linux/unix_diag.h 2023-05-16 13:47:51.000000000 +0200
+++ include/uapi/linux/unix_diag.h 2022-10-12 07:35:58.253481367 +0200
@@ -20,6 +20,7 @@
#define UDIAG_SHOW_ICONS 0x00000008 /* show pending connections */
#define UDIAG_SHOW_RQLEN 0x00000010 /* show skb receive queue len */
#define UDIAG_SHOW_MEMINFO 0x00000020 /* show memory info of a socket */
+#define UDIAG_SHOW_UID 0x00000040 /* show socket's UID */
struct unix_diag_msg {
__u8 udiag_family;
@@ -40,6 +41,7 @@
UNIX_DIAG_RQLEN,
UNIX_DIAG_MEMINFO,
UNIX_DIAG_SHUTDOWN,
+ UNIX_DIAG_UID,
__UNIX_DIAG_MAX,
};
Now, this is a change in enums and there doesn't seem to an easy way out
here. (I think I saw an example, but I cannot recall which thread. I will do
more research.)
When I included
# gcc -I ../../../../include diag_uid.c
I've got the following error:
[marvin@pc-mtodorov linux_torvalds]$ cd tools/testing/selftests/net/af_unix/
[marvin@pc-mtodorov af_unix]$ gcc -I ../../../../../include diag_uid.c -o
/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/net/af_unix/diag_uid
In file included from ../../../../../include/linux/build_bug.h:5,
from ../../../../../include/linux/bits.h:21,
from ../../../../../include/linux/capability.h:18,
from ../../../../../include/linux/netlink.h:6,
from diag_uid.c:8:
../../../../../include/linux/compiler.h:246:10: fatal error: asm/rwonce.h: No such file or directory
#include <asm/rwonce.h>
^~~~~~~~~~~~~~
compilation terminated.
[marvin@pc-mtodorov af_unix]$
At this point I gave up, as it would be an overkill to change kernel system
header to make a test pass, and this probably wouldn't be accepted upsteam?
Hope this helps. (If we still want to build on CentOS/AlmaLinux/Rocky 8?)
Best regards,
Mirsad
--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
Add documentation for the new Virtual PCM Test Driver. It covers all
possible usage cases: errors and delay injections, random and
pattern-based data generation, playback and ioctl redefinition
functionalities testing.
We have a lot of different virtual media drivers, which can be used for
testing of the userspace applications and media subsystem middle layer.
However, all of them are aimed at testing the video functionality and
simulating the video devices. For audio devices we have only snd-dummy
module, which is good in simulating the correct behavior of an ALSA device.
I decided to write a tool, which would help to test the userspace ALSA
programs (and the PCM middle layer as well) under unusual circumstances
to figure out how they would behave. So I came up with this Virtual PCM
Test Driver.
This new Virtual PCM Test Driver has several features which can be useful
during the userspace ALSA applications testing/fuzzing, or testing/fuzzing
of the PCM middle layer. Not all of them can be implemented using the
existing virtual drivers (like dummy or loopback). Here is what can this
driver do:
- Simulate both capture and playback processes
- Check the playback stream for containing the looped pattern
- Generate random or pattern-based capture data
- Inject delays into the playback and capturing processes
- Inject errors during the PCM callbacks
Also, this driver can check the playback stream for containing the
predefined pattern, which is used in the corresponding selftest to check
the PCM middle layer data transferring functionality. Additionally, this
driver redefines the default RESET ioctl, and the selftest covers this PCM
API functionality as well.
Signed-off-by: Ivan Orlov <ivan.orlov0322(a)gmail.com>
---
V1 -> V2:
- Rename the driver from from 'valsa' to 'pcmtest'.
- Implement support for interleaved and non-interleaved access modes
- Add support for 8 substreams and 4 channels
- Extend supported formats
- Extend and rewrite in C the selftest for the driver
Documentation/sound/cards/index.rst | 1 +
Documentation/sound/cards/pcmtest.rst | 119 ++++++++++++++++++++++++++
2 files changed, 120 insertions(+)
create mode 100644 Documentation/sound/cards/pcmtest.rst
diff --git a/Documentation/sound/cards/index.rst b/Documentation/sound/cards/index.rst
index c016f8c3b88b..49c1f2f688f8 100644
--- a/Documentation/sound/cards/index.rst
+++ b/Documentation/sound/cards/index.rst
@@ -17,3 +17,4 @@ Card-Specific Information
hdspm
serial-u16550
img-spdif-in
+ pcmtest
diff --git a/Documentation/sound/cards/pcmtest.rst b/Documentation/sound/cards/pcmtest.rst
new file mode 100644
index 000000000000..ea8070eaa44e
--- /dev/null
+++ b/Documentation/sound/cards/pcmtest.rst
@@ -0,0 +1,119 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+The Virtual PCM Test Driver
+===========================
+
+The Virtual PCM Test Driver emulates a generic PCM device, and can be used for
+testing/fuzzing of the userspace ALSA applications, as well as for testing/fuzzing of
+the PCM middle layer. Additionally, it can be used for simulating hard to reproduce
+problems with PCM devices.
+
+What can this driver do?
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+At this moment the driver can do the following things:
+ * Simulate both capture and playback processes
+ * Generate random or pattern-based capturing data
+ * Inject delays into the playback and capturing processes
+ * Inject errors during the PCM callbacks
+
+It supports up to 8 substreams and 4 channels. Also it supports both interleaved and
+non-interleaved access modes.
+
+Also, this driver can check the playback stream for containing the predefined pattern,
+which is used in the corresponding selftest (alsa/test-pcmtest-driver.c). To check the
+PCM middle layer data transferring functionality. Additionally, this driver redefines
+the default RESET ioctl, and the selftest covers this PCM API functionality as well.
+
+Configuration
+-------------
+
+The driver has several parameters besides the common ALSA module parameters:
+
+ * fill_mode (bool) - Buffer fill mode (see below)
+ * inject_delay (int)
+ * inject_hwpars_err (bool)
+ * inject_prepare_err (bool)
+ * inject_trigger_err (bool)
+
+
+Capture Data Generation
+-----------------------
+
+The driver has two modes of data generation: the first (0 in the fill_mode parameter)
+means random data generation, the second (1 in the fill_mode) - pattern-based
+data generation. Let's look at the second mode.
+
+First of all, you may want to specify the pattern for data generation. You can do it
+by writing the pattern to the debugfs file (/sys/kernel/debug/pcmtest/fill_pattern).
+Like that:
+
+.. code-block:: bash
+
+ echo -n mycoolpattern > /sys/kernel/debug/pcmtest/fill_pattern
+
+After this, every capture action performed on the 'pcmtest' device will return
+'mycoolpatternmycoolpatternmycoolpatternmy...' for every channel buffer.
+
+In case of interleaved access, the capture buffer will contain the repeated pattern
+for every channel. Otherwise, every channel buffer will contain the repeated pattern.
+
+The pattern itself can be up to 4096 bytes long.
+
+Delay injection
+---------------
+
+The driver has 'inject_delay' parameter, which has very self-descriptive name and
+can be used for time delay/speedup simulations. The parameter has integer type, and
+it means the delay added between module's internal timer ticks.
+
+If the 'inject_delay' value is positive, the buffer will be filled slower, if it is
+negative - faster. You can try it yourself by starting a recording in any
+audio recording application (like Audacity) and selecting the 'pcmtest' device as a
+source.
+
+This parameter can be also used for generating a huge amount of sound data in a very
+short period of time (with the negative 'inject_delay' value).
+
+Errors injection
+----------------
+
+This module can be used for injecting errors into the PCM communication process. This
+action can help you to figure out how the userspace ALSA program behaves under unusual
+circumstances.
+
+For example, you can make all 'hw_params' PCM callback calls return EBUSY error by
+writing '1' to the 'inject_hwpars_err' module parameter:
+
+.. code-block:: bash
+
+ echo 1 > /sys/module/snd_pcmtest/parameters/inject_hwpars_err
+
+Errors can be injected into the following PCM callbacks:
+
+ * hw_params (EBUSY)
+ * prepare (EINVAL)
+ * trigger (EINVAL)
+
+
+Playback test
+-------------
+
+This driver can be also used for the playback functionality testing - every time you
+write the playback data to the 'pcmtest' PCM device and close it, the driver checks the
+buffer of each channel for containing the looped pattern (which is specified in the
+fill_pattern debugfs file). If the playback buffer content represents the looped pattern,
+'pc_test' debugfs entry is set into '1'. Otherwise, the driver sets it to '0'.
+
+ioctl redefinition test
+-----------------------
+
+The driver redefines the 'reset' ioctl, which is default for all PCM devices. To test
+this functionality, we can trigger the reset ioctl and check the 'ioctl_test' debugfs
+entry:
+
+.. code-block:: bash
+
+ cat /sys/kernel/debug/pcmtest/ioctl_test
+
+If the ioctl is triggered successfully, this file will contain '1', and '0' otherwise.
--
2.34.1
iommufd gives userspace the capability to manipulate iommu subsytem.
e.g. DMA map/unmap etc. In the near future, it will support iommu nested
translation. Different platform vendors have different implementation for
the nested translation. So before set up nested translation, userspace
needs to know the hardware iommu information. For example, Intel VT-d
supports using guest I/O page table as the stage-1 translation table. This
requires guest I/O page table be compatible with hardware IOMMU.
This series reports the iommu hardware information for a given iommufd_device
which has been bound to iommufd. It is preparation work for userspace to
allocate hwpt for given device. Like the nested translation support[1].
This series introduces an iommu op to report the iommu hardware info,
and an ioctl IOMMU_DEVICE_GET_HW_INFO is added to report such hardware
info to user. enum iommu_hw_info_type is defined to differentiate the
iommu hardware info reported to user hence user can decode them. This
series only adds the framework for iommu hw info reporting, the complete
reporting path needs vendor specific definition and driver support. The
full picture is available in [1] as well.
base-commit: 35db4f4dac813ffaa987cf633694107fabf3aff5
[1] https://github.com/yiliu1765/iommufd/tree/iommufd_nesting
Change log:
v3:
- Add r-b from Baolu
- Rename IOMMU_HW_INFO_TYPE_DEFAULT to be IOMMU_HW_INFO_TYPE_NONE to
better suit what it means
- Let IOMMU_DEVICE_GET_HW_INFO succeed even the underlying iommu driver
does not have driver-specific data to report per below remark.
https://lore.kernel.org/kvm/ZAcwJSK%2F9UVI9LXu@nvidia.com/
v2: https://lore.kernel.org/linux-iommu/20230309075358.571567-1-yi.l.liu@intel.…
- Drop patch 05 of v1 as it is already covered by other series
- Rename the capability info to be iommu hardware info
v1: https://lore.kernel.org/linux-iommu/20230209041642.9346-1-yi.l.liu@intel.co…
Regards,
Yi Liu
Lu Baolu (1):
iommu: Add new iommu op to get iommu hardware information
Nicolin Chen (1):
iommufd/selftest: Add coverage for IOMMU_DEVICE_GET_HW_INFO ioctl
Yi Liu (2):
iommu: Move dev_iommu_ops() to private header
iommufd: Add IOMMU_DEVICE_GET_HW_INFO
drivers/iommu/iommu-priv.h | 11 +++
drivers/iommu/iommu.c | 2 +
drivers/iommu/iommufd/device.c | 73 +++++++++++++++++++
drivers/iommu/iommufd/iommufd_private.h | 1 +
drivers/iommu/iommufd/iommufd_test.h | 9 +++
drivers/iommu/iommufd/main.c | 3 +
drivers/iommu/iommufd/selftest.c | 16 ++++
include/linux/iommu.h | 27 ++++---
include/uapi/linux/iommufd.h | 44 +++++++++++
tools/testing/selftests/iommu/iommufd.c | 17 ++++-
tools/testing/selftests/iommu/iommufd_utils.h | 26 +++++++
11 files changed, 217 insertions(+), 12 deletions(-)
--
2.34.1
As suggested by Willy it is possible to detect the availability of
stackprotector via preprocessor defines.
Make use of that to simplify the code and interface of nolibc.
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
---
Thomas Weißschuh (7):
tools/nolibc: fix typo pint -> point
tools/nolibc: x86_64: disable stack protector for _start
tools/nolibc: ensure stack protector guard is never zero
tools/nolibc: add test for __stack_chk_guard initialization
tools/nolibc: reformat list of headers to be installed
tools/nolibc: add autodetection for stackprotector support
tools/nolibc: simplify stackprotector compiler flags
tools/include/nolibc/Makefile | 19 +++++++++++++++++--
tools/include/nolibc/arch-aarch64.h | 6 +++---
tools/include/nolibc/arch-arm.h | 6 +++---
tools/include/nolibc/arch-i386.h | 6 +++---
tools/include/nolibc/arch-loongarch.h | 6 +++---
tools/include/nolibc/arch-mips.h | 6 +++---
tools/include/nolibc/arch-riscv.h | 6 +++---
tools/include/nolibc/arch-x86_64.h | 8 ++++----
tools/include/nolibc/arch.h | 2 +-
tools/include/nolibc/compiler.h | 15 +++++++++++++++
tools/include/nolibc/stackprotector.h | 15 ++++++---------
tools/testing/selftests/nolibc/Makefile | 13 ++-----------
tools/testing/selftests/nolibc/nolibc-test.c | 10 +++++++++-
13 files changed, 72 insertions(+), 46 deletions(-)
---
base-commit: 606343b7478c319cb30291a39ecbceddb42229d6
change-id: 20230521-nolibc-automatic-stack-protector-b4f7fab9e625
Best regards,
--
Thomas Weißschuh <linux(a)weissschuh.net>
Hi,
On AlmaLinux 8.7, make kselftest-all fails at memfd/memfd_test.c:
make[2]: Entering directory '/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/memfd'
gcc -D_FILE_OFFSET_BITS=64 -isystem /home/marvin/linux/kernel/linux_torvalds/usr/include memfd_test.c common.c -o
/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/memfd/memfd_test
memfd_test.c: In function ‘test_seal_future_write’:
memfd_test.c:916:27: error: ‘F_SEAL_FUTURE_WRITE’ undeclared (first use in this function); did you mean ‘F_SEAL_WRITE’?
mfd_assert_add_seals(fd, F_SEAL_FUTURE_WRITE);
^~~~~~~~~~~~~~~~~~~
F_SEAL_WRITE
memfd_test.c:916:27: note: each undeclared identifier is reported only once for each function it appears in
memfd_test.c: In function ‘test_exec_seal’:
memfd_test.c:36:7: error: ‘F_SEAL_FUTURE_WRITE’ undeclared (first use in this function); did you mean ‘F_SEAL_WRITE’?
F_SEAL_FUTURE_WRITE | \
^~~~~~~~~~~~~~~~~~~
memfd_test.c:1058:27: note: in expansion of macro ‘F_WX_SEALS’
mfd_assert_has_seals(fd, F_WX_SEALS);
^~~~~~~~~~
make[2]: *** [../lib.mk:147: /home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/memfd/memfd_test] Error 1
make[2]: Leaving directory '/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/memfd'
Apparently, the include file include/uapi/linux/fcntl.h defines this
F_SEAL_FUTURE_WRITE as 0x0010:
include/uapi/linux/fcntl.h:45:#define F_SEAL_FUTURE_WRITE 0x0010 /* prevent future writes while mapped */
This patch fixed the issue:
---
tools/testing/selftests/memfd/memfd_test.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/memfd/memfd_test.c b/tools/testing/selftests/memfd/memfd_test.c
index dba0e8ba002f..868f17c02e32 100644
--- a/tools/testing/selftests/memfd/memfd_test.c
+++ b/tools/testing/selftests/memfd/memfd_test.c
@@ -28,7 +28,13 @@
#define MFD_DEF_SIZE 8192
#define STACK_SIZE 65536
-#define F_SEAL_EXEC 0x0020
+#ifndef F_SEAL_FUTURE_WRITE
+#define F_SEAL_FUTURE_WRITE 0x0010
+#endif
+
+#ifndef F_SEAL_EXEC
+#define F_SEAL_EXEC 0x0020
+#endif
#define F_WX_SEALS (F_SEAL_SHRINK | \
F_SEAL_GROW | \
Hope this helps.
Best regards,
Mirsad
--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
Changes since RFC v1:
* add two kselftests (patch 10-11)
* set virtual MSRs also on APs [Pawan]
* enable "virtualize IA32_SPEC_CTRL" for L2 to prevent L2 from changing
some bits of IA32_SPEC_CTRL (patch 4)
* other misc cleanup and cosmetic changes
RFC v1: https://lore.kernel.org/lkml/20221210160046.2608762-1-chen.zhang@intel.com/
This series introduces "virtualize IA32_SPEC_CTRL" support. Here are
introduction and use cases of this new feature.
### Virtualize IA32_SPEC_CTRL
"Virtualize IA32_SPEC_CTRL" [1] is a new VMX feature on Intel CPUs. This feature
allows VMM to lock some bits of IA32_SPEC_CTRL MSR even when the MSR is
pass-thru'd to a guest.
### Use cases of "virtualize IA32_SPEC_CTRL" [2]
Software mitigations like Retpoline and software BHB-clearing sequence depend on
CPU microarchitectures. And guest cannot know exactly the underlying
microarchitecture. When a guest is migrated between processors of different
microarchitectures, software mitigations which work perfectly on previous
microachitecture may be not effective on the new one. To fix the problem, some
hardware mitigations should be used in conjunction with software mitigations.
Using virtual IA32_SPEC_CTRL, VMM can enforce hardware mitigations transparently
to guests and avoid those hardware mitigations being unintentionally disabled
when guest changes IA32_SPEC_CTRL MSR.
### Intention of this series
This series adds the capability of enforcing hardware mitigations for guests
transparently and efficiently (i.e., without intecepting IA32_SPEC_CTRL MSR
accesses) to kvm. The capability can be used to solve the VM migration issue in
a pool consisting of processors of different microarchitectures.
Specifically, below are two target scenarios of this series:
Scenario 1: If retpoline is used by a VM to mitigate IMBTI in CPL0, VMM can set
RRSBA_DIS_S on parts enumerates RRSBA. Note that the VM is presented
with a microarchitecture doesn't enumerate RRSBA.
Scenario 2: If a VM uses software BHB-clearing sequence on transitions into CPL0
to mitigate BHI, VMM can use "virtualize IA32_SPEC_CTRL" to set
BHI_DIS_S on new parts which doesn't enumerate BHI_NO.
Intel defines some virtual MSRs [2] for guests to report in-use software
mitigations. This allows guests to opt in VMM's deploying hardware mitigations
for them if the guests are either running or later migrated to a system on which
in-use software mitigations are not effective. The virtual MSRs interface is
also added in this series.
### Organization of this series
1. Patch 1-3 Advertise RRSBA_CTRL and BHI_CTRL to guest
2. Patch 4 Add "virtualize IA32_SPEC_CTRL" support
3. Patch 5-9 Allow guests to report in-use software mitigations to KVM so
that KVM can enable hardware mitigations for guests.
4. Patch 10-11 Add kselftest for virtual MSRs and IA32_SPEC_CTRL
[1]: https://cdrdv2.intel.com/v1/dl/getContent/671368 Ref. #319433-047 Chapter 12
[2]: https://www.intel.com/content/www/us/en/developer/articles/technical/softwa…
Chao Gao (3):
KVM: VMX: Advertise MITI_ENUM_RETPOLINE_S_SUPPORT
KVM: selftests: Add tests for virtual enumeration/mitigation MSRs
KVM: selftests: Add tests for IA32_SPEC_CTRL MSR
Pawan Gupta (1):
x86/bugs: Use Virtual MSRs to request hardware mitigations
Zhang Chen (7):
x86/msr-index: Add bit definitions for BHI_DIS_S and BHI_NO
KVM: x86: Advertise CPUID.7.2.EDX and RRSBA_CTRL support
KVM: x86: Advertise BHI_CTRL support
KVM: VMX: Add IA32_SPEC_CTRL virtualization support
KVM: x86: Advertise ARCH_CAP_VIRTUAL_ENUM support
KVM: VMX: Advertise MITIGATION_CTRL support
KVM: VMX: Advertise MITI_CTRL_BHB_CLEAR_SEQ_S_SUPPORT
arch/x86/include/asm/msr-index.h | 33 +++-
arch/x86/include/asm/vmx.h | 5 +
arch/x86/include/asm/vmxfeatures.h | 2 +
arch/x86/kernel/cpu/bugs.c | 25 +++
arch/x86/kvm/cpuid.c | 22 ++-
arch/x86/kvm/reverse_cpuid.h | 8 +
arch/x86/kvm/svm/svm.c | 3 +
arch/x86/kvm/vmx/capabilities.h | 5 +
arch/x86/kvm/vmx/nested.c | 13 ++
arch/x86/kvm/vmx/vmcs.h | 2 +
arch/x86/kvm/vmx/vmx.c | 112 ++++++++++-
arch/x86/kvm/vmx/vmx.h | 43 ++++-
arch/x86/kvm/x86.c | 19 +-
tools/arch/x86/include/asm/msr-index.h | 37 +++-
tools/testing/selftests/kvm/Makefile | 2 +
.../selftests/kvm/include/x86_64/processor.h | 5 +
.../selftests/kvm/x86_64/spec_ctrl_msr_test.c | 178 ++++++++++++++++++
.../kvm/x86_64/virtual_mitigation_msr_test.c | 175 +++++++++++++++++
18 files changed, 676 insertions(+), 13 deletions(-)
create mode 100644 tools/testing/selftests/kvm/x86_64/spec_ctrl_msr_test.c
create mode 100644 tools/testing/selftests/kvm/x86_64/virtual_mitigation_msr_test.c
base-commit: 400d2132288edbd6d500f45eab5d85526ca94e46
--
2.40.0
Dzień dobry,
chcielibyśmy zapewnić Państwu kompleksowe rozwiązania, jeśli chodzi o system monitoringu GPS.
Precyzyjne monitorowanie pojazdów na mapach cyfrowych, śledzenie ich parametrów eksploatacyjnych w czasie rzeczywistym oraz kontrola paliwa to kluczowe funkcjonalności naszego systemu.
Organizowanie pracy pracowników jest dzięki temu prostsze i bardziej efektywne, a oszczędności i optymalizacja w zakresie ponoszonych kosztów, mają dla każdego przedsiębiorcy ogromne znaczenie.
Dopasujemy naszą ofertę do Państwa oczekiwań i potrzeb organizacji. Czy moglibyśmy porozmawiać o naszej propozycji?
Pozdrawiam
Konrad Trojanowski
syscall() is used by "normal" libcs to allow users to directly call
syscalls.
By having the same syntax inside nolibc users can more easily write code
that works with different libcs.
The macro logic is adapted from systemtaps STAP_PROBEV() macro that is
released in the public domain / CC0.
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
---
tools/include/nolibc/unistd.h | 15 +++++++++++++++
tools/testing/selftests/nolibc/nolibc-test.c | 2 ++
2 files changed, 17 insertions(+)
diff --git a/tools/include/nolibc/unistd.h b/tools/include/nolibc/unistd.h
index ac7d53d986cd..6773e83c16a0 100644
--- a/tools/include/nolibc/unistd.h
+++ b/tools/include/nolibc/unistd.h
@@ -56,6 +56,21 @@ int tcsetpgrp(int fd, pid_t pid)
return ioctl(fd, TIOCSPGRP, &pid);
}
+#define _syscall(N, ...) \
+({ \
+ int _ret = my_syscall##N(__VA_ARGS__); \
+ if (_ret < 0) { \
+ SET_ERRNO(-_ret); \
+ _ret = -1; \
+ } \
+ _ret; \
+})
+
+#define _sycall_narg(...) __syscall_narg(__VA_ARGS__, 6, 5, 4, 3, 2, 1, 0)
+#define __syscall_narg(_0, _1, _2, _3, _4, _5, _6, N, ...) N
+#define _syscall_n(N, ...) _syscall(N, __VA_ARGS__)
+#define syscall(...) _syscall_n(_sycall_narg(__VA_ARGS__), ##__VA_ARGS__)
+
/* make sure to include all global symbols */
#include "nolibc.h"
diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c
index f042a6436b6b..54bf91847af3 100644
--- a/tools/testing/selftests/nolibc/nolibc-test.c
+++ b/tools/testing/selftests/nolibc/nolibc-test.c
@@ -588,6 +588,8 @@ int run_syscall(int min, int max)
CASE_TEST(waitpid_child); EXPECT_SYSER(1, waitpid(getpid(), &tmp, WNOHANG), -1, ECHILD); break;
CASE_TEST(write_badf); EXPECT_SYSER(1, write(-1, &tmp, 1), -1, EBADF); break;
CASE_TEST(write_zero); EXPECT_SYSZR(1, write(1, &tmp, 0)); break;
+ CASE_TEST(syscall_noargs); EXPECT_SYSEQ(1, syscall(__NR_getpid), getpid()); break;
+ CASE_TEST(syscall_args); EXPECT_SYSER(1, syscall(__NR_fstat, 0, NULL), -1, EFAULT); break;
case __LINE__:
return ret; /* must be last */
/* note: do not set any defaults so as to permit holes above */
---
base-commit: 063dcc53b416ae1e89f767330feab3d0842943ed
change-id: 20230517-nolibc-syscall-bd13da6468c6
Best regards,
--
Thomas Weißschuh <linux(a)weissschuh.net>
Hello,
Here is v2 of the mremap start address optimization / fix for exec warning.
v1->v2:
1. Trigger the optimization for mremaps smaller than a PMD. I tested by tracing
that it works correctly.
2. Fix issue with bogus return value found by Linus if we broke out of the
above loop for the first PMD itself.
Description of patches:
These patches optimizes the start addresses in move_page_tables() and tests the
changes. It addresses a warning [1] that occurs due to a downward, overlapping
move on a mutually-aligned offset within a PMD during exec. By initiating the
copy process at the PMD level when such alignment is present, we can prevent
this warning and speed up the copying process at the same time. Linus Torvalds
suggested this idea.
Please check the individual patches for more details.
thanks,
- Joel
[1] https://lore.kernel.org/all/ZB2GTBD%2FLWTrkOiO@dhcp22.suse.cz/
Joel Fernandes (Google) (4):
mm/mremap: Optimize the start addresses in move_page_tables()
selftests: mm: Fix failure case when new remap region was not found
selftests: mm: Add a test for mutually aligned moves > PMD size
selftests: mm: Add a test for remapping to area immediately after
existing mapping
mm/mremap.c | 56 +++++++++++++++++++
tools/testing/selftests/mm/mremap_test.c | 69 +++++++++++++++++++++---
2 files changed, 119 insertions(+), 6 deletions(-)
--
2.40.1.698.g37aff9b760-goog
Dear,
Please grant me permission to share a very crucial discussion with
you. I am looking forward to hearing from you at your earliest
convenience.
Mrs. Nina Coulibal
> From: Tian, Kevin
> Sent: Friday, May 19, 2023 4:42 PM
> > +struct iommu_hw_info {
> > + __u32 size;
> > + __u32 flags;
> > + __u32 dev_id;
> > + __u32 data_len;
> > + __aligned_u64 data_ptr;
> > + __u32 out_data_type;
> > + __u32 __reserved;
>
> it's unusual to have reserved field in the end. It makes more sense
> to move data_ptr to the end to make it meaningful.
>
Please ignore this comment. typed too fast...
In the end of the test, there will be an error message induced by the
`ip netns del ns1` command in cleanup()
Tests passed: 201
Tests failed: 0
Cannot remove namespace file "/run/netns/ns1": No such file or directory
This can even be reproduced with just `./fib_tests.sh -h` as we're
calling cleanup() on exit.
Redirect the error message to /dev/null to mute it.
V2: Update commit message and fixes tag.
V3: resubmit due to missing netdev ML in V2
Fixes: b60417a9f2b8 ("selftest: fib_tests: Always cleanup before exit")
Signed-off-by: Po-Hsu Lin <po-hsu.lin(a)canonical.com>
---
tools/testing/selftests/net/fib_tests.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/fib_tests.sh b/tools/testing/selftests/net/fib_tests.sh
index 7da8ec8..35d89df 100755
--- a/tools/testing/selftests/net/fib_tests.sh
+++ b/tools/testing/selftests/net/fib_tests.sh
@@ -68,7 +68,7 @@ setup()
cleanup()
{
$IP link del dev dummy0 &> /dev/null
- ip netns del ns1
+ ip netns del ns1 &> /dev/null
ip netns del ns2 &> /dev/null
}
--
2.7.4
Hello,
I am posting this as an RFC for any feedback. I have tested them suitably and I
am continuing to test them.
These patches optimizes the start addresses in move_page_tables(). It addresses a
warning [1] that occurs due to a downward, overlapping move on a mutually-aligned
offset within a PMD during exec. By initiating the copy process at the PMD
level when such alignment is present, we can prevent this warning and speed up
the copying process at the same time. Linus Torvalds suggested this idea.
Please check the individual patches for more details.
thanks,
- Joel
[1] https://lore.kernel.org/all/ZB2GTBD%2FLWTrkOiO@dhcp22.suse.cz/
Joel Fernandes (Google) (4):
mm/mremap: Optimize the start addresses in move_page_tables()
selftests: mm: Fix failure case when new remap region was not found
selftests: mm: Add a test for mutually aligned moves > PMD size
selftests: mm: Add a test for remapping to area immediately after
existing mapping
mm/mremap.c | 49 +++++++++++++++++
tools/testing/selftests/mm/mremap_test.c | 69 +++++++++++++++++++++---
2 files changed, 112 insertions(+), 6 deletions(-)
--
2.40.1.606.ga4b1b128d6-goog
KVM_GET_REG_LIST will dump all register IDs that are available to
KVM_GET/SET_ONE_REG and It's very useful to identify some platform
regression issue during VM migration.
Patch 1 enabled the KVM_GET_REG_LIST API in riscv and patch 2 added
the corresponding kselftest for checking possible register regressions.
Both patches were ported from arm64 and tested with Linux 6.4-rc1 on a
Qemu riscv virt machine.
Haibo Xu (2):
riscv: kvm: Add KVM_GET_REG_LIST API support
KVM: selftests: Add riscv get-reg-list test
Documentation/virt/kvm/api.rst | 2 +-
arch/riscv/kvm/vcpu.c | 346 +++++++
tools/testing/selftests/kvm/Makefile | 3 +
.../selftests/kvm/include/riscv/processor.h | 3 +
.../selftests/kvm/riscv/get-reg-list.c | 869 ++++++++++++++++++
5 files changed, 1222 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/kvm/riscv/get-reg-list.c
--
2.34.1
In the end of the test, there will be an error message induced by the
`ip netns del ns1` command in cleanup()
Tests passed: 201
Tests failed: 0
Cannot remove namespace file "/run/netns/ns1": No such file or directory
This can even be reproduced with just `./fib_tests.sh -h` as we're
calling cleanup() on exit.
Redirect the error message to /dev/null to mute it.
V2: Update commit message and fixes tag.
Fixes: b60417a9f2b8 ("selftest: fib_tests: Always cleanup before exit")
Signed-off-by: Po-Hsu Lin <po-hsu.lin(a)canonical.com>
---
tools/testing/selftests/net/fib_tests.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/fib_tests.sh b/tools/testing/selftests/net/fib_tests.sh
index 7da8ec8..35d89df 100755
--- a/tools/testing/selftests/net/fib_tests.sh
+++ b/tools/testing/selftests/net/fib_tests.sh
@@ -68,7 +68,7 @@ setup()
cleanup()
{
$IP link del dev dummy0 &> /dev/null
- ip netns del ns1
+ ip netns del ns1 &> /dev/null
ip netns del ns2 &> /dev/null
}
--
2.7.4
This patchset adds a stress test for kprobes and a test for checking
optimized probes.
The two tests are being added based on the below discussion:
https://lore.kernel.org/all/20230128101622.ce6f8e64d929e29d36b08b73@kernel.…
kprobe_opt_types.tc is modified as per the below review comments:
https://lore.kernel.org/all/1682506809.uus6y0ir3i.naveen@linux.ibm.com/#t
Changelog:
v3:
* Add Acked-by for kprobe_insn_boundary.tc
* Simplify test for optimized probe, as suggested by Masami
* Add exit_unresolved to exit as unresolved in case no probe was optimized
v2:
* Add an explicit fork after enabling the events ( echo "forked" )
* Remove the extended test from multiple_kprobe_types.tc which adds
multiple consecutive probes in a function and add it as a
separate test case.
* Add new test case which checks for optimized probes.
Akanksha J N (2):
selftests/ftrace: Add new test case which adds multiple consecutive
probes in a function
selftests/ftrace: Add new test case which checks for optimized probes
.../test.d/kprobe/kprobe_insn_boundary.tc | 19 +++++++++++
.../ftrace/test.d/kprobe/kprobe_opt_types.tc | 34 +++++++++++++++++++
2 files changed, 53 insertions(+)
create mode 100644 tools/testing/selftests/ftrace/test.d/kprobe/kprobe_insn_boundary.tc
create mode 100644 tools/testing/selftests/ftrace/test.d/kprobe/kprobe_opt_types.tc
--
2.31.1
In the end of the test, there will be an error message induced by the
`ip netns del ns1` command in cleanup()
Tests passed: 201
Tests failed: 0
Cannot remove namespace file "/run/netns/ns1": No such file or directory
Redirect the error message to /dev/null to mute it.
Fixes: a0e11da78f48 ("fib_tests: Add tests for metrics on routes")
Signed-off-by: Po-Hsu Lin <po-hsu.lin(a)canonical.com>
---
tools/testing/selftests/net/fib_tests.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/fib_tests.sh b/tools/testing/selftests/net/fib_tests.sh
index 7da8ec8..35d89df 100755
--- a/tools/testing/selftests/net/fib_tests.sh
+++ b/tools/testing/selftests/net/fib_tests.sh
@@ -68,7 +68,7 @@ setup()
cleanup()
{
$IP link del dev dummy0 &> /dev/null
- ip netns del ns1
+ ip netns del ns1 &> /dev/null
ip netns del ns2 &> /dev/null
}
--
2.7.4
Hi Linus,
Please pull the following Kselftest fixes update for Linux 6.4-rc3.
This Kselftest fixes update for Linux 6.4-rc3 consists of:
- sgx test fix for false negatives.
- ftrace output is hard to parse and it masks inappropriate skips etc.
This fix addresses the problems by integrating with kselftest runner.
diff is attached.
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit ac9a78681b921877518763ba0e89202254349d1b:
Linux 6.4-rc1 (2023-05-07 13:34:35 -0700)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest tags/linux-kselftest-fixes-6.4-rc3
for you to fetch changes up to dbcf76390eb9a65d5d0c37b0cd57335218564e37:
selftests/ftrace: Improve integration with kselftest runner (2023-05-08 11:10:13 -0600)
----------------------------------------------------------------
linux-kselftest-fixes-6.4-rc3
This Kselftest fixes update for Linux 6.4-rc3 consists of:
- sgx test fix for false negatives.
- ftrace output is hard to parse and it masks inappropriate skips etc.
This fix addresses the problems by integrating with kselftest runner.
----------------------------------------------------------------
Mark Brown (1):
selftests/ftrace: Improve integration with kselftest runner
Yi Lai (1):
selftests/sgx: Add "test_encl.elf" to TEST_FILES
tools/testing/selftests/ftrace/Makefile | 3 +-
tools/testing/selftests/ftrace/ftracetest | 63 ++++++++++++++++++++++++--
tools/testing/selftests/ftrace/ftracetest-ktap | 8 ++++
tools/testing/selftests/sgx/Makefile | 1 +
4 files changed, 71 insertions(+), 4 deletions(-)
create mode 100755 tools/testing/selftests/ftrace/ftracetest-ktap
----------------------------------------------------------------
v2:
---
* swap order of patches (thanks Claudio)
* add r-b
* add comment why memslots are zeroed
Add a new selftest for CMMA migration. Also fix a small issue found during
development of the test.
Nico Boehr (2):
KVM: s390: fix KVM_S390_GET_CMMA_BITS for GFNs in memslot holes
KVM: s390: selftests: add selftest for CMMA migration
arch/s390/kvm/kvm-s390.c | 4 +
tools/testing/selftests/kvm/Makefile | 1 +
tools/testing/selftests/kvm/s390x/cmma_test.c | 680 ++++++++++++++++++
3 files changed, 685 insertions(+)
create mode 100644 tools/testing/selftests/kvm/s390x/cmma_test.c
--
2.39.1
The cited commit added a stray colon to the 'v' option. That makes the
option work incorrectly.
ex:
tools/testing/selftests/net# ./fib_nexthops.sh -v
(should enable verbose mode, instead it shows help text due to missing arg)
Fixes: 5feba4727395 ("selftests: fib_nexthops: Make ping timeout configurable")
Reviewed-by: Ido Schimmel <idosch(a)nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier(a)nvidia.com>
---
tools/testing/selftests/net/fib_nexthops.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/fib_nexthops.sh b/tools/testing/selftests/net/fib_nexthops.sh
index a47b26ab48f2..0f5e88c8f4ff 100755
--- a/tools/testing/selftests/net/fib_nexthops.sh
+++ b/tools/testing/selftests/net/fib_nexthops.sh
@@ -2283,7 +2283,7 @@ EOF
################################################################################
# main
-while getopts :t:pP46hv:w: o
+while getopts :t:pP46hvw: o
do
case $o in
t) TESTS=$OPTARG;;
--
2.40.1
While KUnit tests that cannot be built as a loadable module must depend
on "KUNIT=y", this is not true for modular tests, where it adds an
unnecessary limitation.
Fix this by relaxing the dependency to "KUNIT".
Fixes: 08809e482a1c44d9 ("HID: uclogic: KUnit best practices and naming conventions")
Signed-off-by: Geert Uytterhoeven <geert+renesas(a)glider.be>
---
drivers/hid/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hid/Kconfig b/drivers/hid/Kconfig
index 4ce012f83253ec9f..b977450cac75265d 100644
--- a/drivers/hid/Kconfig
+++ b/drivers/hid/Kconfig
@@ -1285,7 +1285,7 @@ config HID_MCP2221
config HID_KUNIT_TEST
tristate "KUnit tests for HID" if !KUNIT_ALL_TESTS
- depends on KUNIT=y
+ depends on KUNIT
depends on HID_BATTERY_STRENGTH
depends on HID_UCLOGIC
default KUNIT_ALL_TESTS
--
2.34.1
Add documentation for the new Virtual ALSA driver. It covers all possible
usage cases: errors and delay injections, random and pattern-based data
generation, playback and ioctl redefinition functionalities testing.
We have a lot of different virtual media drivers, which can be used for
testing of the userspace applications and media subsystem middle layer.
However, all of them are aimed at testing the video functionality and
simulating the video devices. For audio devices we have only snd-dummy
module, which is good in simulating the correct behavior of an ALSA device.
I decided to write a tool, which would help to test the userspace ALSA
programs (and the PCM middle layer as well) under unusual circumstances
to figure out how they would behave. So I came up with this Virtual ALSA
Driver.
This new Virtual ALSA Driver has several features which can be useful
during the userspace ALSA applications testing/fuzzing, or testing/fuzzing
of the PCM middle layer. Not all of them can be implemented using the
existing virtual drivers (like dummy or loopback). Here is what can this
driver do:
- Simulate both capture and playback processes
- Check the playback stream for containing the looped pattern
- Generate random or pattern-based capture data
- Inject delays into the playback and capturing processes
- Inject errors during the PCM callbacks
Also, this driver can check the playback stream for containing the
predefined pattern, which is used in the corresponding selftest to check
the PCM middle layer data transferring functionality. Additionally, this
driver redefines the default RESET ioctl, and the selftest covers this PCM
API functionality as well.
Signed-off-by: Ivan Orlov <ivan.orlov0322(a)gmail.com>
---
Documentation/admin-guide/index.rst | 1 +
Documentation/admin-guide/valsa.rst | 114 ++++++++++++++++++++++++++++
2 files changed, 115 insertions(+)
create mode 100644 Documentation/admin-guide/valsa.rst
diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst
index 43ea35613dfc..328cc59275a1 100644
--- a/Documentation/admin-guide/index.rst
+++ b/Documentation/admin-guide/index.rst
@@ -131,6 +131,7 @@ configure specific aspects of kernel behavior to your liking.
thunderbolt
ufs
unicode
+ valsa
vga-softcursor
video-output
xfs
diff --git a/Documentation/admin-guide/valsa.rst b/Documentation/admin-guide/valsa.rst
new file mode 100644
index 000000000000..64ffc130fb4c
--- /dev/null
+++ b/Documentation/admin-guide/valsa.rst
@@ -0,0 +1,114 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+The Virtual ALSA Driver
+=======================
+
+The Virtual ALSA Driver emulates a generic ALSA device, and can be used for
+testing/fuzzing of the userspace ALSA applications, as well as for testing/fuzzing of
+the ALSA middle layer. Additionally, it can be used for simulating hard to reproduce
+problems with PCM devices.
+
+What can this driver do?
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+At this moment the driver can do the following things:
+ * Simulate both capture and playback processes
+ * Generate random or pattern-based capturing data
+ * Inject delays into the playback and capturing processes
+ * Inject errors during the PCM callbacks
+
+Also, this driver can check the playback stream for containing the
+predefined pattern, which is used in the corresponding selftest (alsa/valsa-test.sh)
+to check the PCM middle layer data transferring functionality. Additionally, this
+driver redefines the default RESET ioctl, and the selftest covers this PCM
+API functionality as well.
+
+Configuration
+-------------
+
+The driver has several parameters besides the common ALSA module parameters:
+
+ * fill_mode (bool) - Buffer fill mode (see below)
+ * inject_delay (int)
+ * inject_hwpars_err (bool)
+ * inject_prepare_err (bool)
+ * inject_trigger_err (bool)
+
+
+Capture Data Generation
+-----------------------
+
+The driver has two modes of data generation: the first (0 in the fill_mode parameter)
+means random data generation, the second (1 in the fill_mode) - pattern-based
+data generation. Let's look at the second mode.
+
+First of all, you may want to specify the pattern for data generation. You can do it
+by writing the pattern to the debugfs file (/sys/kernel/debug/valsa/fill_pattern).
+Like that:
+
+.. code-block:: bash
+
+ echo -n mycoolpattern > /sys/kernel/debug/valsa/fill_pattern
+
+After this, every capture action performed on the 'valsa' device will return
+'mycoolpatternmycoolpatternmycoolpatternmy...' in the capturing buffer.
+
+The pattern itself can be up to 4096 bytes long.
+
+Delay injection
+---------------
+
+The driver has 'inject_delay' parameter, which has very self-descriptive name and
+can be used for time delay/speedup simulations. The parameter has integer type, and
+it means the delay added between module's internal timer ticks.
+
+If the 'inject_delay' value is positive, the buffer will be filled slower, if it is
+negative - faster. You can try it yourself by starting a recording in any
+audiorecording application (like Audacity) and selecting the 'valsa' device as a
+source.
+
+This parameter can be also used for generating a huge amount of sound data in a very
+short period of time (with the negative 'inject_delay' value).
+
+Errors injection
+----------------
+
+This module can be used for injecting errors into the PCM communication process. This
+action can help you to figure out how the userspace ALSA program behaves under unusual
+circumstances.
+
+For example, you can make all 'hw_params' PCM callback calls return EBUSY error by
+writing '1' to the 'inject_hwpars_err' module parameter:
+
+.. code-block:: bash
+
+ echo 1 > /sys/module/snd_valsa/parameters/inject_hwpars_err
+
+Errors can be injected into the following PCM callbacks:
+
+ * hw_params (EBUSY)
+ * prepare (EINVAL)
+ * trigger (EINVAL)
+
+
+Playback test
+-------------
+
+This driver can be also used for the playback functionality testing - every time you
+write the playback data to the 'valsa' PCM device and close it, the driver checks the
+buffer for containing the looped pattern (which is specified in the fill_pattern
+debugfs file). If the playback buffer content represents the looped pattern, 'pc_test'
+debugfs entry is set into '1'. Otherwise, the driver sets it to '0'.
+
+ioctl redefinition test
+-----------------------
+
+The driver redefines the 'reset' ioctl, which is default for all PCM devices. To test
+this functionality, we can trigger the reset ioctl and check the 'ioctl_test' debugfs
+entry:
+
+.. code-block:: bash
+
+ cat /sys/kernel/debug/valsa/ioctl_test
+
+If the ioctl is triggered successfully, this file will contain '1', and '0' otherwise.
--
2.34.1
This pachset aims to improve and make more robust the selftests performed to
check whether SRv6 End.DT4 beahvior works as expected under different system
configurations.
Some Linux distributions enable Deduplication Address Detection and Reverse
Path Filtering mechanisms by default which can interfere with SRv6 End.DT4
behavior and cause selftests to fail.
The following patches improve selftests for End.DT4 by taking these two
mechanisms into account. Specifically:
- patch 1/2: selftests: seg6: disable DAD on IPv6 router cfg for
srv6_end_dt4_l3vpn_test
- patch 2/2: selftets: seg6: disable rp_filter by default in
srv6_end_dt4_l3vpn_test
Thank you all,
Andrea
Andrea Mayer (2):
selftests: seg6: disable DAD on IPv6 router cfg for
srv6_end_dt4_l3vpn_test
selftets: seg6: disable rp_filter by default in
srv6_end_dt4_l3vpn_test
.../selftests/net/srv6_end_dt4_l3vpn_test.sh | 17 +++++++++++------
1 file changed, 11 insertions(+), 6 deletions(-)
--
2.20.1
Dzień dobry,
w ramach nowej edycji programu Czyste Powietrze dla klientów indywidualnych mogą otrzymać Państwo do 135 tys. zł wsparcia na zakup pompy ciepła.
Prócz wyższego dofinansowania program zakłada m.in. podwyższenie progów dochodowych oraz możliwość złożenia kolejnego wniosku o dofinansowanie dla tych, którzy już wcześniej skorzystali z Programu.
Jako firma specjalizująca się w dostawie, montażu i serwisie pomp ciepła pomożemy Państwu w uzyskaniu dofinansowania wraz z kompleksową realizacją całego projektu.
Są Państwo zainteresowani?
Pozdrawiam
Damian Hordych
KUnit tests run in a kthread, with the current->kunit_test pointer set
to the test's context. This allows the kunit_get_current_test() and
kunit_fail_current_test() macros to work. Normally, this pointer is
still valid during test shutdown (i.e., the suite->exit function, and
any resource cleanup). However, if the test has exited early (e.g., due
to a failed assertion), the cleanup is done in the parent KUnit thread,
which does not have an active context.
Instead, in the event test terminates early, run the test exit and
cleanup from a new 'cleanup' kthread, which sets current->kunit_test,
and better isolates the rest of KUnit from issues which arise in test
cleanup.
If a test cleanup function itself aborts (e.g., due to an assertion
failing), there will be no further attempts to clean up: an error will
be logged and the test failed. For example:
# example_simple_test: test aborted during cleanup. continuing without cleaning up
This should also make it easier to get access to the KUnit context,
particularly from within resource cleanup functions, which may, for
example, need access to data in test->priv.
Reviewed-by: Benjamin Berg <benjamin.berg(a)intel.com>
Reviewed-by: Maxime Ripard <maxime(a)cerno.tech>
Tested-by: Maxime Ripard <maxime(a)cerno.tech>
Signed-off-by: David Gow <davidgow(a)google.com>
---
This is an updated version of / replacement for "kunit: Set the current
KUnit context when cleaning up", which instead creates a new kthread
for cleanup tasks if the original test kthread is aborted. This protects
us from failed assertions during cleanup, if the test exited early.
Changes since v3:
https://lore.kernel.org/all/20230421040218.2156548-1-davidgow@google.com/
- Get rid of a unused 'suite' variable (kernel test robot)
- Add Benjamin and Maxime's Reviewed-by tags.
Changes since v2:
https://lore.kernel.org/linux-kselftest/20230419085426.1671703-1-davidgow@g…
- Always run cleanup in its own kthread
- Therefore, never attempt to re-run it if it exits
- Thanks, Benjamin.
Changes since v1:
https://lore.kernel.org/linux-kselftest/20230415091401.681395-1-davidgow@go…
- Move cleanup execution to another kthread
- (Thanks, Benjamin, for pointing out the assertion issues)
---
lib/kunit/test.c | 56 +++++++++++++++++++++++++++++++++++++++++-------
1 file changed, 48 insertions(+), 8 deletions(-)
diff --git a/lib/kunit/test.c b/lib/kunit/test.c
index e2910b261112..f5e4ceffd282 100644
--- a/lib/kunit/test.c
+++ b/lib/kunit/test.c
@@ -419,15 +419,54 @@ static void kunit_try_run_case(void *data)
* thread will resume control and handle any necessary clean up.
*/
kunit_run_case_internal(test, suite, test_case);
- /* This line may never be reached. */
+}
+
+static void kunit_try_run_case_cleanup(void *data)
+{
+ struct kunit_try_catch_context *ctx = data;
+ struct kunit *test = ctx->test;
+ struct kunit_suite *suite = ctx->suite;
+
+ current->kunit_test = test;
+
kunit_run_case_cleanup(test, suite);
}
+static void kunit_catch_run_case_cleanup(void *data)
+{
+ struct kunit_try_catch_context *ctx = data;
+ struct kunit *test = ctx->test;
+ int try_exit_code = kunit_try_catch_get_result(&test->try_catch);
+
+ /* It is always a failure if cleanup aborts. */
+ kunit_set_failure(test);
+
+ if (try_exit_code) {
+ /*
+ * Test case could not finish, we have no idea what state it is
+ * in, so don't do clean up.
+ */
+ if (try_exit_code == -ETIMEDOUT) {
+ kunit_err(test, "test case cleanup timed out\n");
+ /*
+ * Unknown internal error occurred preventing test case from
+ * running, so there is nothing to clean up.
+ */
+ } else {
+ kunit_err(test, "internal error occurred during test case cleanup: %d\n",
+ try_exit_code);
+ }
+ return;
+ }
+
+ kunit_err(test, "test aborted during cleanup. continuing without cleaning up\n");
+}
+
+
static void kunit_catch_run_case(void *data)
{
struct kunit_try_catch_context *ctx = data;
struct kunit *test = ctx->test;
- struct kunit_suite *suite = ctx->suite;
int try_exit_code = kunit_try_catch_get_result(&test->try_catch);
if (try_exit_code) {
@@ -448,12 +487,6 @@ static void kunit_catch_run_case(void *data)
}
return;
}
-
- /*
- * Test case was run, but aborted. It is the test case's business as to
- * whether it failed or not, we just need to clean up.
- */
- kunit_run_case_cleanup(test, suite);
}
/*
@@ -478,6 +511,13 @@ static void kunit_run_case_catch_errors(struct kunit_suite *suite,
context.test_case = test_case;
kunit_try_catch_run(try_catch, &context);
+ /* Now run the cleanup */
+ kunit_try_catch_init(try_catch,
+ test,
+ kunit_try_run_case_cleanup,
+ kunit_catch_run_case_cleanup);
+ kunit_try_catch_run(try_catch, &context);
+
/* Propagate the parameter result to the test case. */
if (test->status == KUNIT_FAILURE)
test_case->status = KUNIT_FAILURE;
--
2.40.1.521.gf1e218fcd8-goog
Hi All,
In TDX guest, the attestation process is used to verify the TDX guest
trustworthiness to other entities before provisioning secrets to the
guest.
The TDX guest attestation process consists of two steps:
1. TDREPORT generation
2. Quote generation.
The First step (TDREPORT generation) involves getting the TDX guest
measurement data in the format of TDREPORT which is further used to
validate the authenticity of the TDX guest. The second step involves
sending the TDREPORT to a Quoting Enclave (QE) server to generate a
remotely verifiable Quote. TDREPORT by design can only be verified on
the local platform. To support remote verification of the TDREPORT,
TDX leverages Intel SGX Quoting Enclave to verify the TDREPORT
locally and convert it to a remotely verifiable Quote. Although
attestation software can use communication methods like TCP/IP or
vsock to send the TDREPORT to QE, not all platforms support these
communication models. So TDX GHCI specification [1] defines a method
for Quote generation via hypercalls. Please check the discussion from
Google [2] and Alibaba [3] which clarifies the need for hypercall based
Quote generation support. This patch set adds this support.
Support for TDREPORT generation already exists in the TDX guest driver.
This patchset extends the same driver to add the Quote generation
support.
Following are the details of the patch set:
Patch 1/3 -> Adds event notification IRQ support.
Patch 2/3 -> Adds Quote generation support.
Patch 3/3 -> Adds selftest support for Quote generation feature.
[1] https://cdrdv2.intel.com/v1/dl/getContent/726790, section titled "TDG.VP.VMCALL<GetQuote>".
[2] https://lore.kernel.org/lkml/CAAYXXYxxs2zy_978GJDwKfX5Hud503gPc8=1kQ-+JwG_k…
[3] https://lore.kernel.org/lkml/a69faebb-11e8-b386-d591-dbd08330b008@linux.ali…
Kuppuswamy Sathyanarayanan (3):
x86/tdx: Add TDX Guest event notify interrupt support
virt: tdx-guest: Add Quote generation support
selftests/tdx: Test GetQuote TDX attestation feature
Documentation/virt/coco/tdx-guest.rst | 11 ++
arch/x86/coco/tdx/tdx.c | 196 +++++++++++++++++++
arch/x86/include/asm/tdx.h | 8 +
drivers/virt/coco/tdx-guest/tdx-guest.c | 168 +++++++++++++++-
include/uapi/linux/tdx-guest.h | 43 ++++
tools/testing/selftests/tdx/tdx_guest_test.c | 68 ++++++-
6 files changed, 487 insertions(+), 7 deletions(-)
--
2.34.1
From: Ivan Orlov <ivan.orlov0322(a)gmail.com>
[ Upstream commit 735b0e0f2d001b7ed9486db84453fb860e764a4d ]
There is a 'malloc' call in vcpu_save_state function, which can
be unsuccessful. This patch will add the malloc failure checking
to avoid possible null dereference and give more information
about test fail reasons.
Signed-off-by: Ivan Orlov <ivan.orlov0322(a)gmail.com>
Link: https://lore.kernel.org/r/20230322144528.704077-1-ivan.orlov0322@gmail.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/kvm/lib/x86_64/processor.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c
index acfa1d01e7df0..d9365a9d1c490 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/processor.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c
@@ -950,6 +950,7 @@ struct kvm_x86_state *vcpu_save_state(struct kvm_vcpu *vcpu)
vcpu_run_complete_io(vcpu);
state = malloc(sizeof(*state) + msr_list->nmsrs * sizeof(state->msrs.entries[0]));
+ TEST_ASSERT(state, "-ENOMEM when allocating kvm state");
vcpu_events_get(vcpu, &state->events);
vcpu_mp_state_get(vcpu, &state->mp_state);
--
2.39.2
From: Ivan Orlov <ivan.orlov0322(a)gmail.com>
[ Upstream commit 735b0e0f2d001b7ed9486db84453fb860e764a4d ]
There is a 'malloc' call in vcpu_save_state function, which can
be unsuccessful. This patch will add the malloc failure checking
to avoid possible null dereference and give more information
about test fail reasons.
Signed-off-by: Ivan Orlov <ivan.orlov0322(a)gmail.com>
Link: https://lore.kernel.org/r/20230322144528.704077-1-ivan.orlov0322@gmail.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/kvm/lib/x86_64/processor.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c
index c39a4353ba194..827647ff3d41b 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/processor.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c
@@ -954,6 +954,7 @@ struct kvm_x86_state *vcpu_save_state(struct kvm_vcpu *vcpu)
vcpu_run_complete_io(vcpu);
state = malloc(sizeof(*state) + msr_list->nmsrs * sizeof(state->msrs.entries[0]));
+ TEST_ASSERT(state, "-ENOMEM when allocating kvm state");
vcpu_events_get(vcpu, &state->events);
vcpu_mp_state_get(vcpu, &state->mp_state);
--
2.39.2
The generic fork() implementation in nolibc falls back to the clone()
syscall. On s390 the first two arguments to clone() are swapped compared
to other architectures, breaking the implementation in nolibc.
Add a custom implementation of fork() to s390 that works.
While at it also add a testcase for fork().
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
---
Thomas Weißschuh (2):
tools/nolibc: s390: provide custom implementation for sys_fork
tools/nolibc: add testcase for fork()/waitpid()
tools/include/nolibc/arch-s390.h | 8 ++++++++
tools/include/nolibc/sys.h | 2 ++
tools/testing/selftests/nolibc/nolibc-test.c | 20 ++++++++++++++++++++
3 files changed, 30 insertions(+)
---
base-commit: c1c4f33b6be9b3412d9e0ba01b367f4ffe47c379
change-id: 20230415-nolibc-fork-b7087a345166
Best regards,
--
Thomas Weißschuh <linux(a)weissschuh.net>
[ This series depends on the VFIO device cdev series ]
Changelog
v7:
* Rebased on top of v6.4-rc1 and cdev v11 candidate
* Fixed a wrong file in replace() API patch
* Added Kevin's "Reviewed-by" to replace() API patch
v6:
https://lore.kernel.org/all/cover.1679939952.git.nicolinc@nvidia.com/
* Rebased on top of cdev v8 series
https://lore.kernel.org/kvm/20230327094047.47215-1-yi.l.liu@intel.com/
* Added "Reviewed-by" from Kevin to PATCH-4
* Squashed access->ioas updating lines into iommufd_access_change_pt(),
and changed function return type accordingly for simplification.
v5:
https://lore.kernel.org/all/cover.1679559476.git.nicolinc@nvidia.com/
* Kept the cmd->id in the iommufd_test_create_access() so the access can
be created with an ioas by default. Then, renamed the previous ioctl
IOMMU_TEST_OP_ACCESS_SET_IOAS to IOMMU_TEST_OP_ACCESS_REPLACE_IOAS, so
it would be used to replace an access->ioas pointer.
* Added iommufd_access_replace() API after the introductions of the other
two APIs iommufd_access_attach() and iommufd_access_detach().
* Since vdev->iommufd_attached is also set in emulated pathway too, call
iommufd_access_update(), similar to the physical pathway.
v4:
https://lore.kernel.org/all/cover.1678284812.git.nicolinc@nvidia.com/
* Rebased on top of Jason's series adding replace() and hwpt_alloc()
https://lore.kernel.org/all/0-v2-51b9896e7862+8a8c-iommufd_alloc_jgg@nvidia…
* Rebased on top of cdev series v6
https://lore.kernel.org/kvm/20230308132903.465159-1-yi.l.liu@intel.com/
* Dropped the patch that's moved to cdev series.
* Added unmap function pointer sanity before calling it.
* Added "Reviewed-by" from Kevin and Yi.
* Added back the VFIO change updating the ATTACH uAPI.
v3:
https://lore.kernel.org/all/cover.1677288789.git.nicolinc@nvidia.com/
* Rebased on top of Jason's iommufd_hwpt branch:
https://lore.kernel.org/all/0-v2-406f7ac07936+6a-iommufd_hwpt_jgg@nvidia.co…
* Dropped patches from this series accordingly. There were a couple of
VFIO patches that will be submitted after the VFIO cdev series. Also,
renamed the series to be "emulated".
* Moved dma_unmap sanity patch to the first in the series.
* Moved dma_unmap sanity to cover both VFIO and IOMMUFD pathways.
* Added Kevin's "Reviewed-by" to two of the patches.
* Fixed a NULL pointer bug in vfio_iommufd_emulated_bind().
* Moved unmap() call to the common place in iommufd_access_set_ioas().
v2:
https://lore.kernel.org/all/cover.1675802050.git.nicolinc@nvidia.com/
* Rebased on top of vfio_device cdev v2 series.
* Update the kdoc and commit message of iommu_group_replace_domain().
* Dropped revert-to-core-domain part in iommu_group_replace_domain().
* Dropped !ops->dma_unmap check in vfio_iommufd_emulated_attach_ioas().
* Added missing rc value in vfio_iommufd_emulated_attach_ioas() from the
iommufd_access_set_ioas() call.
* Added a new patch in vfio_main to deny vfio_pin/unpin_pages() calls if
vdev->ops->dma_unmap is not implemented.
* Added a __iommmufd_device_detach helper and let the replace routine do
a partial detach().
* Added restriction on auto_domains to use the replace feature.
* Added the patch "iommufd/device: Make hwpt_list list_add/del symmetric"
from the has_group removal series.
v1:
https://lore.kernel.org/all/cover.1675320212.git.nicolinc@nvidia.com/
Hi all,
The existing IOMMU APIs provide a pair of functions: iommu_attach_group()
for callers to attach a device from the default_domain (NULL if not being
supported) to a given iommu domain, and iommu_detach_group() for callers
to detach a device from a given domain to the default_domain. Internally,
the detach_dev op is deprecated for the newer drivers with default_domain.
This means that those drivers likely can switch an attaching domain to
another one, without stagging the device at a blocking or default domain,
for use cases such as:
1) vPASID mode, when a guest wants to replace a single pasid (PASID=0)
table with a larger table (PASID=N)
2) Nesting mode, when switching the attaching device from an S2 domain
to an S1 domain, or when switching between relevant S1 domains.
This series is rebased on top of Jason Gunthorpe's series that introduces
iommu_group_replace_domain API and IOMMUFD infrastructure for the IOMMUFD
"physical" devices. The IOMMUFD "emulated" deivces will need some extra
steps to replace the access->ioas object and its iopt pointer.
You can also find this series on Github:
https://github.com/nicolinc/iommufd/commits/iommu_group_replace_domain-v7
Thank you
Nicolin Chen
Nicolin Chen (4):
vfio: Do not allow !ops->dma_unmap in vfio_pin/unpin_pages()
iommufd: Add iommufd_access_replace() API
iommufd/selftest: Add IOMMU_TEST_OP_ACCESS_REPLACE_IOAS coverage
vfio: Support IO page table replacement
drivers/iommu/iommufd/device.c | 53 ++++++++++++++-----
drivers/iommu/iommufd/iommufd_test.h | 4 ++
drivers/iommu/iommufd/selftest.c | 19 +++++++
drivers/vfio/iommufd.c | 11 ++--
drivers/vfio/vfio_main.c | 4 ++
include/linux/iommufd.h | 1 +
include/uapi/linux/vfio.h | 6 +++
tools/testing/selftests/iommu/iommufd.c | 29 +++++++++-
tools/testing/selftests/iommu/iommufd_utils.h | 19 +++++++
9 files changed, 127 insertions(+), 19 deletions(-)
--
2.40.1
In the unlikely case that CLOCK_REALTIME is not defined, variable ret is
not initialized and further accumulation of return values to ret can leave
ret in an undefined state. Fix this by initialized ret to zero and changing
the assignment of ret to an accumulation for the CLOCK_REALTIME case.
Fixes: 03f55c7952c9 ("kselftest: Extend vDSO selftest to clock_getres")
Signed-off-by: Colin Ian King <colin.i.king(a)gmail.com>
---
tools/testing/selftests/vDSO/vdso_test_clock_getres.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/vDSO/vdso_test_clock_getres.c b/tools/testing/selftests/vDSO/vdso_test_clock_getres.c
index 15dcee16ff72..38d46a8bf7cb 100644
--- a/tools/testing/selftests/vDSO/vdso_test_clock_getres.c
+++ b/tools/testing/selftests/vDSO/vdso_test_clock_getres.c
@@ -84,12 +84,12 @@ static inline int vdso_test_clock(unsigned int clock_id)
int main(int argc, char **argv)
{
- int ret;
+ int ret = 0;
#if _POSIX_TIMERS > 0
#ifdef CLOCK_REALTIME
- ret = vdso_test_clock(CLOCK_REALTIME);
+ ret += vdso_test_clock(CLOCK_REALTIME);
#endif
#ifdef CLOCK_BOOTTIME
--
2.30.2
When we added fd based file streams we created references to STx_FILENO in
stdio.h but these constants are declared in unistd.h which is the last file
included by the top level nolibc.h meaning those constants are not defined
when we try to build stdio.h. This causes programs using nolibc.h to fail
to build.
Reorder the headers to avoid this issue.
Fixes: d449546c957f ("tools/nolibc: implement fd-based FILE streams")
Acked-by: Willy Tarreau <w(a)1wt.eu>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Changes in v2:
- Rebase onto v6.4-rc1.
- This is now a fix for Linus' tree.
- Link to v1: https://lore.kernel.org/r/20230413-nolibc-stdio-fix-v1-1-fa05fc3ba1fe@kerne…
---
tools/include/nolibc/nolibc.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/include/nolibc/nolibc.h b/tools/include/nolibc/nolibc.h
index 04739a6293c4..05a228a6ee78 100644
--- a/tools/include/nolibc/nolibc.h
+++ b/tools/include/nolibc/nolibc.h
@@ -99,11 +99,11 @@
#include "sys.h"
#include "ctype.h"
#include "signal.h"
+#include "unistd.h"
#include "stdio.h"
#include "stdlib.h"
#include "string.h"
#include "time.h"
-#include "unistd.h"
#include "stackprotector.h"
/* Used by programs to avoid std includes */
---
base-commit: ac9a78681b921877518763ba0e89202254349d1b
change-id: 20230413-nolibc-stdio-fix-fb42de39d099
Best regards,
--
Mark Brown,,, <broonie(a)kernel.org>
The ftrace selftests do not currently produce KTAP output, they produce a
custom format much nicer for human consumption. This means that when run in
automated test systems we just get a single result for the suite as a whole
rather than recording results for individual test cases, making it harder
to look at the test data and masking things like inappropriate skips.
Address this by adding support for KTAP output to the ftracetest script and
providing a trivial wrapper which will be invoked by the kselftest runner
to generate output in this format by default, users using ftracetest
directly will continue to get the existing output.
This is not the most elegant solution but it is simple and effective. I
did consider implementing this by post processing the existing output
format but that felt more complex and likely to result in all output being
lost if something goes seriously wrong during the run which would not be
helpful. I did also consider just writing a separate runner script but
there's enough going on with things like the signal handling for that to
seem like it would be duplicating too much.
Acked-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Tested-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
It might make sense to merge this via the ftrace tree - kselftests often
get merged with the code they test.
Changes in v2:
- Rebase onto v6.4-rc1.
- Link to v1: https://lore.kernel.org/r/20230302-ftrace-kselftest-ktap-v1-1-a84a0765b7ad@…
---
tools/testing/selftests/ftrace/Makefile | 3 +-
tools/testing/selftests/ftrace/ftracetest | 63 ++++++++++++++++++++++++--
tools/testing/selftests/ftrace/ftracetest-ktap | 8 ++++
3 files changed, 70 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/ftrace/Makefile b/tools/testing/selftests/ftrace/Makefile
index d6e106fbce11..a1e955d2de4c 100644
--- a/tools/testing/selftests/ftrace/Makefile
+++ b/tools/testing/selftests/ftrace/Makefile
@@ -1,7 +1,8 @@
# SPDX-License-Identifier: GPL-2.0
all:
-TEST_PROGS := ftracetest
+TEST_PROGS_EXTENDED := ftracetest
+TEST_PROGS := ftracetest-ktap
TEST_FILES := test.d settings
EXTRA_CLEAN := $(OUTPUT)/logs/*
diff --git a/tools/testing/selftests/ftrace/ftracetest b/tools/testing/selftests/ftrace/ftracetest
index c3311c8c4089..2506621e75df 100755
--- a/tools/testing/selftests/ftrace/ftracetest
+++ b/tools/testing/selftests/ftrace/ftracetest
@@ -13,6 +13,7 @@ echo "Usage: ftracetest [options] [testcase(s)] [testcase-directory(s)]"
echo " Options:"
echo " -h|--help Show help message"
echo " -k|--keep Keep passed test logs"
+echo " -K|--ktap Output in KTAP format"
echo " -v|--verbose Increase verbosity of test messages"
echo " -vv Alias of -v -v (Show all results in stdout)"
echo " -vvv Alias of -v -v -v (Show all commands immediately)"
@@ -85,6 +86,10 @@ parse_opts() { # opts
KEEP_LOG=1
shift 1
;;
+ --ktap|-K)
+ KTAP=1
+ shift 1
+ ;;
--verbose|-v|-vv|-vvv)
if [ $VERBOSE -eq -1 ]; then
usage "--console can not use with --verbose"
@@ -178,6 +183,7 @@ TEST_DIR=$TOP_DIR/test.d
TEST_CASES=`find_testcases $TEST_DIR`
LOG_DIR=$TOP_DIR/logs/`date +%Y%m%d-%H%M%S`/
KEEP_LOG=0
+KTAP=0
DEBUG=0
VERBOSE=0
UNSUPPORTED_RESULT=0
@@ -229,7 +235,7 @@ prlog() { # messages
newline=
shift
fi
- printf "$*$newline"
+ [ "$KTAP" != "1" ] && printf "$*$newline"
[ "$LOG_FILE" ] && printf "$*$newline" | strip_esc >> $LOG_FILE
}
catlog() { #file
@@ -260,11 +266,11 @@ TOTAL_RESULT=0
INSTANCE=
CASENO=0
+CASENAME=
testcase() { # testfile
CASENO=$((CASENO+1))
- desc=`grep "^#[ \t]*description:" $1 | cut -f2- -d:`
- prlog -n "[$CASENO]$INSTANCE$desc"
+ CASENAME=`grep "^#[ \t]*description:" $1 | cut -f2- -d:`
}
checkreq() { # testfile
@@ -277,40 +283,68 @@ test_on_instance() { # testfile
grep -q "^#[ \t]*flags:.*instance" $1
}
+ktaptest() { # result comment
+ if [ "$KTAP" != "1" ]; then
+ return
+ fi
+
+ local result=
+ if [ "$1" = "1" ]; then
+ result="ok"
+ else
+ result="not ok"
+ fi
+ shift
+
+ local comment=$*
+ if [ "$comment" != "" ]; then
+ comment="# $comment"
+ fi
+
+ echo $CASENO $result $INSTANCE$CASENAME $comment
+}
+
eval_result() { # sigval
case $1 in
$PASS)
prlog " [${color_green}PASS${color_reset}]"
+ ktaptest 1
PASSED_CASES="$PASSED_CASES $CASENO"
return 0
;;
$FAIL)
prlog " [${color_red}FAIL${color_reset}]"
+ ktaptest 0
FAILED_CASES="$FAILED_CASES $CASENO"
return 1 # this is a bug.
;;
$UNRESOLVED)
prlog " [${color_blue}UNRESOLVED${color_reset}]"
+ ktaptest 0 UNRESOLVED
UNRESOLVED_CASES="$UNRESOLVED_CASES $CASENO"
return $UNRESOLVED_RESULT # depends on use case
;;
$UNTESTED)
prlog " [${color_blue}UNTESTED${color_reset}]"
+ ktaptest 1 SKIP
UNTESTED_CASES="$UNTESTED_CASES $CASENO"
return 0
;;
$UNSUPPORTED)
prlog " [${color_blue}UNSUPPORTED${color_reset}]"
+ ktaptest 1 SKIP
UNSUPPORTED_CASES="$UNSUPPORTED_CASES $CASENO"
return $UNSUPPORTED_RESULT # depends on use case
;;
$XFAIL)
prlog " [${color_green}XFAIL${color_reset}]"
+ ktaptest 1 XFAIL
XFAILED_CASES="$XFAILED_CASES $CASENO"
return 0
;;
*)
prlog " [${color_blue}UNDEFINED${color_reset}]"
+ ktaptest 0 error
UNDEFINED_CASES="$UNDEFINED_CASES $CASENO"
return 1 # this must be a test bug
;;
@@ -371,6 +405,7 @@ __run_test() { # testfile
run_test() { # testfile
local testname=`basename $1`
testcase $1
+ prlog -n "[$CASENO]$INSTANCE$CASENAME"
if [ ! -z "$LOG_FILE" ] ; then
local testlog=`mktemp $LOG_DIR/${CASENO}-${testname}-log.XXXXXX`
else
@@ -405,6 +440,17 @@ run_test() { # testfile
# load in the helper functions
. $TEST_DIR/functions
+if [ "$KTAP" = "1" ]; then
+ echo "TAP version 13"
+
+ casecount=`echo $TEST_CASES | wc -w`
+ for t in $TEST_CASES; do
+ test_on_instance $t || continue
+ casecount=$((casecount+1))
+ done
+ echo "1..${casecount}"
+fi
+
# Main loop
for t in $TEST_CASES; do
run_test $t
@@ -439,6 +485,17 @@ prlog "# of unsupported: " `echo $UNSUPPORTED_CASES | wc -w`
prlog "# of xfailed: " `echo $XFAILED_CASES | wc -w`
prlog "# of undefined(test bug): " `echo $UNDEFINED_CASES | wc -w`
+if [ "$KTAP" = "1" ]; then
+ echo -n "# Totals:"
+ echo -n " pass:"`echo $PASSED_CASES | wc -w`
+ echo -n " faii:"`echo $FAILED_CASES | wc -w`
+ echo -n " xfail:"`echo $XFAILED_CASES | wc -w`
+ echo -n " xpass:0"
+ echo -n " skip:"`echo $UNTESTED_CASES $UNSUPPORTED_CASES | wc -w`
+ echo -n " error:"`echo $UNRESOLVED_CASES $UNDEFINED_CASES | wc -w`
+ echo
+fi
+
cleanup
# if no error, return 0
diff --git a/tools/testing/selftests/ftrace/ftracetest-ktap b/tools/testing/selftests/ftrace/ftracetest-ktap
new file mode 100755
index 000000000000..b3284679ef3a
--- /dev/null
+++ b/tools/testing/selftests/ftrace/ftracetest-ktap
@@ -0,0 +1,8 @@
+#!/bin/sh -e
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# ftracetest-ktap: Wrapper to integrate ftracetest with the kselftest runner
+#
+# Copyright (C) Arm Ltd., 2023
+
+./ftracetest -K
---
base-commit: ac9a78681b921877518763ba0e89202254349d1b
change-id: 20230302-ftrace-kselftest-ktap-9d7878691557
Best regards,
--
Mark Brown,,, <broonie(a)kernel.org>
The "test_encl.elf" file used by test_sgx is not installed in
INSTALL_PATH. Attempting to execute test_sgx causes false negative:
"
enclave executable open(): No such file or directory
main.c:188:unclobbered_vdso:Failed to load the test enclave.
"
Add "test_encl.elf" to TEST_FILES so that it will be installed.
Fixes: 2adcba79e69d ("selftests/x86: Add a selftest for SGX")
Signed-off-by: Yi Lai <yi1.lai(a)intel.com>
---
tools/testing/selftests/sgx/Makefile | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/sgx/Makefile b/tools/testing/selftests/sgx/Makefile
index 75af864e07b6..50aab6b57da3 100644
--- a/tools/testing/selftests/sgx/Makefile
+++ b/tools/testing/selftests/sgx/Makefile
@@ -17,6 +17,7 @@ ENCL_CFLAGS := -Wall -Werror -static -nostdlib -nostartfiles -fPIC \
-fno-stack-protector -mrdrnd $(INCLUDES)
TEST_CUSTOM_PROGS := $(OUTPUT)/test_sgx
+TEST_FILES := $(OUTPUT)/test_encl.elf
ifeq ($(CAN_BUILD_X86_64), 1)
all: $(TEST_CUSTOM_PROGS) $(OUTPUT)/test_encl.elf
--
2.25.1
Enabling a (modular) test must not silently enable additional kernel
functionality, as that may increase the attack vector of a product.
Fix this by:
1. making REGMAP visible if CONFIG_KUNIT_ALL_TESTS is enabled,
2. making REGMAP_KUNIT depend on REGMAP instead of selecting it.
After this, one can safely enable CONFIG_KUNIT_ALL_TESTS=m to build
modules for all appropriate tests for ones system, without pulling in
extra unwanted functionality, while still allowing a tester to manually
enable REGMAP and its test suite on a system where REGMAP is not enabled
by default.
Fixes: 2238959b6ad27040 ("regmap: Add some basic kunit tests")
Signed-off-by: Geert Uytterhoeven <geert(a)linux-m68k.org>
---
v2:
- Make REGMAP visible if CONFIG_KUNIT_ALL_TESTS is enabled.
---
drivers/base/regmap/Kconfig | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/drivers/base/regmap/Kconfig b/drivers/base/regmap/Kconfig
index 33a8366e22a584a5..0db2021f7477f2ab 100644
--- a/drivers/base/regmap/Kconfig
+++ b/drivers/base/regmap/Kconfig
@@ -4,16 +4,23 @@
# subsystems should select the appropriate symbols.
config REGMAP
+ bool "Register Map support" if KUNIT_ALL_TESTS
default y if (REGMAP_I2C || REGMAP_SPI || REGMAP_SPMI || REGMAP_W1 || REGMAP_AC97 || REGMAP_MMIO || REGMAP_IRQ || REGMAP_SOUNDWIRE || REGMAP_SOUNDWIRE_MBQ || REGMAP_SCCB || REGMAP_I3C || REGMAP_SPI_AVMM || REGMAP_MDIO || REGMAP_FSI)
select IRQ_DOMAIN if REGMAP_IRQ
select MDIO_BUS if REGMAP_MDIO
- bool
+ help
+ Enable support for the Register Map (regmap) access API.
+
+ Usually, this option is automatically selected when needed.
+ However, you may want to enable it manually for running the regmap
+ KUnit tests.
+
+ If unsure, say N.
config REGMAP_KUNIT
tristate "KUnit tests for regmap"
- depends on KUNIT
+ depends on KUNIT && REGMAP
default KUNIT_ALL_TESTS
- select REGMAP
select REGMAP_RAM
config REGMAP_AC97
--
2.34.1
These patches relax a few verifier requirements around dynptrs.
Patches 1-3 are unchanged from v2, apart from rebasing
Patch 4 is the same as in v1, see
https://lore.kernel.org/bpf/CA+PiJmST4WUH061KaxJ4kRL=fqy3X6+Wgb2E2rrLT5OYjU…
Patch 5 adds a test for the change in Patch 4
Daniel Rosenberg (5):
bpf: Allow NULL buffers in bpf_dynptr_slice(_rw)
selftests/bpf: Test allowing NULL buffer in dynptr slice
selftests/bpf: Check overflow in optional buffer
bpf: verifier: Accept dynptr mem as mem in helpers
selftests/bpf: Accept mem from dynptr in helper funcs
Documentation/bpf/kfuncs.rst | 23 ++++++++++-
include/linux/skbuff.h | 2 +-
kernel/bpf/helpers.c | 30 +++++++++------
kernel/bpf/verifier.c | 21 ++++++++--
.../testing/selftests/bpf/prog_tests/dynptr.c | 2 +
.../testing/selftests/bpf/progs/dynptr_fail.c | 20 ++++++++++
.../selftests/bpf/progs/dynptr_success.c | 38 +++++++++++++++++++
7 files changed, 118 insertions(+), 18 deletions(-)
base-commit: f4dea9689c5fea3d07170c2cb0703e216f1a0922
--
2.40.1.521.gf1e218fcd8-goog
From: Feng Zhou <zhoufeng.zf(a)bytedance.com>
Trace sched related functions, such as enqueue_task_fair, it is necessary to
specify a task instead of the current task which within a given cgroup.
Feng Zhou (2):
bpf: Add bpf_task_under_cgroup() kfunc
selftests/bpf: Add testcase for bpf_task_under_cgroup
Changelog:
v6->v7: Addressed comments from Hao Luo
- Get rid of unnecessary check
Details in here:
https://lore.kernel.org/all/20230505060818.60037-1-zhoufeng.zf@bytedance.co…
v5->v6: Addressed comments from Yonghong Song
- Some code format modifications.
- Add ack-by
Details in here:
https://lore.kernel.org/all/20230504031513.13749-1-zhoufeng.zf@bytedance.co…
v4->v5: Addressed comments from Yonghong Song
- Some code format modifications.
Details in here:
https://lore.kernel.org/all/20230428071737.43849-1-zhoufeng.zf@bytedance.co…
v3->v4: Addressed comments from Yonghong Song
- Modify test cases and test other tasks, not the current task.
Details in here:
https://lore.kernel.org/all/20230427023019.73576-1-zhoufeng.zf@bytedance.co…
v2->v3: Addressed comments from Alexei Starovoitov
- Modify the comment information of the function.
- Narrow down the testcase's hook point
Details in here:
https://lore.kernel.org/all/20230421090403.15515-1-zhoufeng.zf@bytedance.co…
v1->v2: Addressed comments from Alexei Starovoitov
- Add kfunc instead.
Details in here:
https://lore.kernel.org/all/20230420072657.80324-1-zhoufeng.zf@bytedance.co…
kernel/bpf/helpers.c | 17 ++++++
tools/testing/selftests/bpf/DENYLIST.s390x | 1 +
.../bpf/prog_tests/task_under_cgroup.c | 53 +++++++++++++++++++
.../bpf/progs/test_task_under_cgroup.c | 51 ++++++++++++++++++
4 files changed, 122 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/task_under_cgroup.c
create mode 100644 tools/testing/selftests/bpf/progs/test_task_under_cgroup.c
--
2.20.1
This patch series improves guest vCPUs performances on Arm during clearing
dirty log operations by taking MMU read lock instead of MMU write lock.
vCPUs write protection faults are fixed in Arm using MMU read locks.
However, when userspace is clearing dirty logs via KVM_CLEAR_DIRTY_LOG
ioctl, then kernel code takes MMU write lock. This will block vCPUs
write protection faults and degrade guest performance. This
degradation gets worse as guest VM size increases in terms of memory and
vCPU count.
In this series, MMU read lock adoption is made possible by using
KVM_PGTABLE_WALK_SHARED flag in page walker.
Patches 1 to 5:
These patches are modifying dirty_log_perf_test. Intent is to mimic
production scenarios where guest keeps on executing while userspace
threads collect and clear dirty logs independently.
Three new command line options are added:
1. j: Allows to run guest vCPUs and main thread collecting dirty logs
independently of each other after initialization is complete.
2. k: Allows to clear dirty logs in smaller chunks compared to existing
whole memslot clear in one call.
3. l: Allows to add customizable wait time between consecutive clear
dirty log calls to mimic sending dirty memory to destination.
Patch 7-8:
These patches refactor code to move MMU lock operations to arch specific
code, refactor Arm's page table walker APIs, and change MMU write lock
for clearing dirty logs to read lock. Patch 8 has results showing
improvements based on dirty_log_perf_test.
Vipin Sharma (9):
KVM: selftests: Allow dirty_log_perf_test to clear dirty memory in
chunks
KVM: selftests: Add optional delay between consecutive Clear-Dirty-Log
calls
KVM: selftests: Pass count of read and write accesses from guest to
host
KVM: selftests: Print read and write accesses of pages by vCPUs in
dirty_log_perf_test
KVM: selftests: Allow independent execution of vCPUs in
dirty_log_perf_test
KVM: arm64: Correct the kvm_pgtable_stage2_flush() documentation
KVM: mmu: Move mmu lock/unlock to arch code for clear dirty log
KMV: arm64: Allow stage2_apply_range_sched() to pass page table walker
flags
KVM: arm64: Run clear-dirty-log under MMU read lock
arch/arm64/include/asm/kvm_pgtable.h | 17 ++-
arch/arm64/kvm/hyp/nvhe/mem_protect.c | 4 +-
arch/arm64/kvm/hyp/pgtable.c | 16 ++-
arch/arm64/kvm/mmu.c | 36 ++++--
arch/mips/kvm/mmu.c | 2 +
arch/riscv/kvm/mmu.c | 2 +
arch/x86/kvm/mmu/mmu.c | 3 +
.../selftests/kvm/dirty_log_perf_test.c | 108 ++++++++++++++----
.../testing/selftests/kvm/include/memstress.h | 13 ++-
tools/testing/selftests/kvm/lib/memstress.c | 43 +++++--
virt/kvm/dirty_ring.c | 2 -
virt/kvm/kvm_main.c | 4 -
12 files changed, 185 insertions(+), 65 deletions(-)
base-commit: 95b9779c1758f03cf494e8550d6249a40089ed1c
--
2.40.0.634.g4ca3ef3211-goog
There is a spelling mistake in a message. Fix it.
Signed-off-by: Colin Ian King <colin.i.king(a)gmail.com>
---
tools/testing/selftests/cachestat/test_cachestat.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/cachestat/test_cachestat.c b/tools/testing/selftests/cachestat/test_cachestat.c
index c3823b809c25..9be2262e5c17 100644
--- a/tools/testing/selftests/cachestat/test_cachestat.c
+++ b/tools/testing/selftests/cachestat/test_cachestat.c
@@ -191,7 +191,7 @@ bool test_cachestat_shmem(void)
}
if (ftruncate(fd, filesize)) {
- ksft_print_msg("Unable to trucate shmem file.\n");
+ ksft_print_msg("Unable to truncate shmem file.\n");
ret = false;
goto close_fd;
}
--
2.30.2
From: Feng Zhou <zhoufeng.zf(a)bytedance.com>
Trace sched related functions, such as enqueue_task_fair, it is necessary to
specify a task instead of the current task which within a given cgroup.
Feng Zhou (2):
bpf: Add bpf_task_under_cgroup() kfunc
selftests/bpf: Add testcase for bpf_task_under_cgroup
Changelog:
v5->v6: Addressed comments from Yonghong Song
- Some code format modifications.
- Add ack-by
Details in here:
https://lore.kernel.org/all/20230504031513.13749-1-zhoufeng.zf@bytedance.co…
v4->v5: Addressed comments from Yonghong Song
- Some code format modifications.
Details in here:
https://lore.kernel.org/all/20230428071737.43849-1-zhoufeng.zf@bytedance.co…
v3->v4: Addressed comments from Yonghong Song
- Modify test cases and test other tasks, not the current task.
Details in here:
https://lore.kernel.org/all/20230427023019.73576-1-zhoufeng.zf@bytedance.co…
v2->v3: Addressed comments from Alexei Starovoitov
- Modify the comment information of the function.
- Narrow down the testcase's hook point
Details in here:
https://lore.kernel.org/all/20230421090403.15515-1-zhoufeng.zf@bytedance.co…
v1->v2: Addressed comments from Alexei Starovoitov
- Add kfunc instead.
Details in here:
https://lore.kernel.org/all/20230420072657.80324-1-zhoufeng.zf@bytedance.co…
kernel/bpf/helpers.c | 20 +++++++
tools/testing/selftests/bpf/DENYLIST.s390x | 1 +
.../bpf/prog_tests/task_under_cgroup.c | 53 +++++++++++++++++++
.../bpf/progs/test_task_under_cgroup.c | 51 ++++++++++++++++++
4 files changed, 125 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/task_under_cgroup.c
create mode 100644 tools/testing/selftests/bpf/progs/test_task_under_cgroup.c
--
2.20.1
From: Feng Zhou <zhoufeng.zf(a)bytedance.com>
Trace sched related functions, such as enqueue_task_fair, it is necessary to
specify a task instead of the current task which within a given cgroup.
Feng Zhou (2):
bpf: Add bpf_task_under_cgroup() kfunc
selftests/bpf: Add testcase for bpf_task_under_cgroup
Changelog:
v4->v5: Addressed comments from Yonghong Song
- Some code format modifications.
Details in here:
https://lore.kernel.org/all/20230428071737.43849-1-zhoufeng.zf@bytedance.co…
v3->v4: Addressed comments from Yonghong Song
- Modify test cases and test other tasks, not the current task.
Details in here:
https://lore.kernel.org/all/20230427023019.73576-1-zhoufeng.zf@bytedance.co…
v2->v3: Addressed comments from Alexei Starovoitov
- Modify the comment information of the function.
- Narrow down the testcase's hook point
Details in here:
https://lore.kernel.org/all/20230421090403.15515-1-zhoufeng.zf@bytedance.co…
v1->v2: Addressed comments from Alexei Starovoitov
- Add kfunc instead.
Details in here:
https://lore.kernel.org/all/20230420072657.80324-1-zhoufeng.zf@bytedance.co…
kernel/bpf/helpers.c | 20 +++++++
tools/testing/selftests/bpf/DENYLIST.s390x | 1 +
.../bpf/prog_tests/task_under_cgroup.c | 54 +++++++++++++++++++
.../bpf/progs/test_task_under_cgroup.c | 51 ++++++++++++++++++
4 files changed, 126 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/task_under_cgroup.c
create mode 100644 tools/testing/selftests/bpf/progs/test_task_under_cgroup.c
--
2.20.1
Verify that calling clone3 with an exit signal (SIGCHLD) in flags will
fail.
Signed-off-by: Tobias Klauser <tklauser(a)distanz.ch>
---
tools/testing/selftests/clone3/clone3.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/clone3/clone3.c b/tools/testing/selftests/clone3/clone3.c
index e495f895a2cd..e60cf4da8fb0 100644
--- a/tools/testing/selftests/clone3/clone3.c
+++ b/tools/testing/selftests/clone3/clone3.c
@@ -129,7 +129,7 @@ int main(int argc, char *argv[])
uid_t uid = getuid();
ksft_print_header();
- ksft_set_plan(18);
+ ksft_set_plan(19);
test_clone3_supported();
/* Just a simple clone3() should return 0.*/
@@ -198,5 +198,8 @@ int main(int argc, char *argv[])
/* Do a clone3() in a new time namespace */
test_clone3(CLONE_NEWTIME, 0, 0, CLONE3_ARGS_NO_TEST);
+ /* Do a clone3() with exit signal (SIGCHLD) in flags */
+ test_clone3(SIGCHLD, 0, -EINVAL, CLONE3_ARGS_NO_TEST);
+
ksft_finished();
}
--
2.40.0
Hi all,
This patch series fixes a crash in the new input selftest, and makes the
test available when the KUnit framework is modular.
Unfortunately test 3 still fails for me (tested on Koelsch (R-Car M2-W)
and ARAnyM):
KTAP version 1
# Subtest: input_core
1..3
input: Test input device as /devices/virtual/input/input1
ok 1 input_test_polling
input: Test input device as /devices/virtual/input/input2
ok 2 input_test_timestamp
input: Test input device as /devices/virtual/input/input3
# input_test_match_device_id: ASSERTION FAILED at # drivers/input/tests/input_test.c:99
Expected input_match_device_id(input_dev, &id) to be true, but is false
not ok 3 input_test_match_device_id
# input_core: pass:2 fail:1 skip:0 total:3
# Totals: pass:2 fail:1 skip:0 total:3
not ok 1 input_core
Thanks!
Geert Uytterhoeven (2):
Input: tests - fix use-after-free and refcount underflow in
input_test_exit()
Input: tests - modular KUnit tests should not depend on KUNIT=y
drivers/input/Kconfig | 2 +-
drivers/input/tests/input_test.c | 4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)
--
2.34.1
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert(a)linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
When the documentation was updated for modular tests, the dependency on
"KUNIT=y" was forgotten to be updated, now encouraging people to create
tests that cannot be enabled when the KUNIT framework itself is modular.
Fix this by changing the dependency to "KUNIT".
Document when it is appropriate (and required) to depend on "KUNIT=y".
Fixes: c9ef2d3e3f3b3e56 ("KUnit: Docs: make start.rst example Kconfig follow style.rst")
Signed-off-by: Geert Uytterhoeven <geert+renesas(a)glider.be>
---
Documentation/dev-tools/kunit/start.rst | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/Documentation/dev-tools/kunit/start.rst b/Documentation/dev-tools/kunit/start.rst
index c736613c9b199bff..9619a044093042ce 100644
--- a/Documentation/dev-tools/kunit/start.rst
+++ b/Documentation/dev-tools/kunit/start.rst
@@ -256,9 +256,12 @@ Now we are ready to write the test cases.
config MISC_EXAMPLE_TEST
tristate "Test for my example" if !KUNIT_ALL_TESTS
- depends on MISC_EXAMPLE && KUNIT=y
+ depends on MISC_EXAMPLE && KUNIT
default KUNIT_ALL_TESTS
+Note: If your test does not support being built as a loadable module (which is
+discouraged), replace tristate by bool, and depend on KUNIT=y instead of KUNIT.
+
3. Add the following lines to ``drivers/misc/Makefile``:
.. code-block:: make
--
2.34.1
From: Feng Zhou <zhoufeng.zf(a)bytedance.com>
Trace sched related functions, such as enqueue_task_fair, it is necessary to
specify a task instead of the current task which within a given cgroup.
Feng Zhou (2):
bpf: Add bpf_task_under_cgroup() kfunc
selftests/bpf: Add testcase for bpf_task_under_cgroup
Changelog:
v3->v4: Addressed comments from Yonghong Song
- Modify test cases and test other tasks, not the current task.
Details in here:
https://lore.kernel.org/all/20230427023019.73576-1-zhoufeng.zf@bytedance.co…
v2->v3: Addressed comments from Alexei Starovoitov
- Modify the comment information of the function.
- Narrow down the testcase's hook point
Details in here:
https://lore.kernel.org/all/20230421090403.15515-1-zhoufeng.zf@bytedance.co…
v1->v2: Addressed comments from Alexei Starovoitov
- Add kfunc instead.
Details in here:
https://lore.kernel.org/all/20230420072657.80324-1-zhoufeng.zf@bytedance.co…
kernel/bpf/helpers.c | 20 +++++++
tools/testing/selftests/bpf/DENYLIST.s390x | 1 +
.../bpf/prog_tests/task_under_cgroup.c | 55 +++++++++++++++++++
.../bpf/progs/test_task_under_cgroup.c | 51 +++++++++++++++++
4 files changed, 127 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/task_under_cgroup.c
create mode 100644 tools/testing/selftests/bpf/progs/test_task_under_cgroup.c
--
2.20.1
These patches relax a few verifier requirements around dynptrs.
I was unable to test the patch in 0003 due to unrelated issues compiling the
bpf selftests, but did run an equivalent local test program.
This is the issue I was running into:
progs/cgrp_ls_attach_cgroup.c:17:15: error: use of undeclared identifier 'BPF_MAP_TYPE_CGRP_STORAGE'; did you mean 'BPF_MAP_TYPE_CGROUP_STORAGE'?
__uint(type, BPF_MAP_TYPE_CGRP_STORAGE);
^~~~~~~~~~~~~~~~~~~~~~~~~
BPF_MAP_TYPE_CGROUP_STORAGE
/ssd/kernel/fuse-bpf/tools/testing/selftests/bpf/tools/include/bpf/bpf_helpers.h:13:39: note: expanded from macro '__uint'
#define __uint(name, val) int (*name)[val]
^
/ssd/kernel/fuse-bpf/tools/testing/selftests/bpf/tools/include/vmlinux.h:27892:2: note: 'BPF_MAP_TYPE_CGROUP_STORAGE' declared here
BPF_MAP_TYPE_CGROUP_STORAGE = 19,
^
1 error generated.
Daniel Rosenberg (3):
bpf: verifier: Accept dynptr mem as mem in helpers
bpf: Allow NULL buffers in bpf_dynptr_slice(_rw)
selftests/bpf: Test allowing NULL buffer in dynptr slice
Documentation/bpf/kfuncs.rst | 23 ++++++++++++-
kernel/bpf/helpers.c | 32 ++++++++++++-------
kernel/bpf/verifier.c | 21 ++++++++++++
.../testing/selftests/bpf/prog_tests/dynptr.c | 1 +
.../selftests/bpf/progs/dynptr_success.c | 21 ++++++++++++
5 files changed, 85 insertions(+), 13 deletions(-)
base-commit: 5af607a861d43ffff830fc1890033e579ec44799
--
2.40.0.577.gac1e443424-goog
When calling socket lookup from L2 (tc, xdp), VRF boundaries aren't
respected. This patchset fixes this by regarding the incoming device's
VRF attachment when performing the socket lookups from tc/xdp.
The first two patches are coding changes which factor out the tc helper's
logic which was shared with cg/sk_skb (which operate correctly).
This refactoring is needed in order to avoid affecting the cgroup/sk_skb
flows as there does not seem to be a strict criteria for discerning which
flow the helper is called from based on the net device or packet
information.
The third patch contains the actual bugfix.
The fourth patch adds bpf tests for these lookup functions.
---
v4: - Move dev_sdif() to include/linux/netdevice.h as suggested by Stanislav Fomichev
- Remove SYS and SYS_NOFAIL duplicate definitions
v3: - Rename bpf_l2_sdif() to dev_sdif() as suggested by Stanislav Fomichev
- Added xdp tests as suggested by Daniel Borkmann
- Use start_server() to avoid duplicate code as suggested by Stanislav Fomichev
v2: Fixed uninitialized var in test patch (4).
Gilad Sever (4):
bpf: factor out socket lookup functions for the TC hookpoint.
bpf: Call __bpf_sk_lookup()/__bpf_skc_lookup() directly via TC
hookpoint
bpf: fix bpf socket lookup from tc/xdp to respect socket VRF bindings
selftests/bpf: Add vrf_socket_lookup tests
include/linux/netdevice.h | 9 +
net/core/filter.c | 123 +++++--
.../bpf/prog_tests/vrf_socket_lookup.c | 312 ++++++++++++++++++
.../selftests/bpf/progs/vrf_socket_lookup.c | 88 +++++
4 files changed, 511 insertions(+), 21 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/vrf_socket_lookup.c
create mode 100644 tools/testing/selftests/bpf/progs/vrf_socket_lookup.c
--
2.34.1
> This adds the general_profit KSM sysfs knob and the process profit metric
> knobs to ksm_stat.
>
> 1) expose general_profit metric
>
> The documentation mentions a general profit metric, however this
> metric is not calculated. In addition the formula depends on the size
> of internal structures, which makes it more difficult for an
> administrator to make the calculation. Adding the metric for a better
> user experience.
>
> 2) document general_profit sysfs knob
>
> 3) calculate ksm process profit metric
>
> The ksm documentation mentions the process profit metric and how to
> calculate it. This adds the calculation of the metric.
>
> 4) mm: expose ksm process profit metric in ksm_stat
>
> This exposes the ksm process profit metric in /proc/<pid>/ksm_stat.
> The documentation mentions the formula for the ksm process profit
> metric, however it does not calculate it. In addition the formula
> depends on the size of internal structures. So it makes sense to
> expose it.
>
Hi, Stefan, I think you should give some credits to me about my contributions on
the concept and formula of ksm profit (process wide and system wide), it's kind
of idea stealing.
Besides, the idea of Process control KSM was proposed by me last year although you use
prctl instead of /proc fs. you even didn't CC my email. I think you should CC my email
(xu.xin16(a)zte.com.cn) as least.
> 5) document new procfs ksm knobs
>
> Signed-off-by: Stefan Roesch <shr(a)devkernel.io>
> Reviewed-by: Bagas Sanjaya <bagasdotme(a)gmail.com>
> Acked-by: David Hildenbrand <david(a)redhat.com>
> Cc: David Hildenbrand <david(a)redhat.com>
> Cc: Johannes Weiner <hannes(a)cmpxchg.org>
> Cc: Michal Hocko <mhocko(a)suse.com>
> Cc: Rik van Riel <riel(a)surriel.com>
> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
> ---
> Documentation/ABI/testing/sysfs-kernel-mm-ksm | 8 +++++++
> Documentation/admin-guide/mm/ksm.rst | 5 ++++-
> fs/proc/base.c | 3 +++
> include/linux/ksm.h | 4 ++++
> mm/ksm.c | 21 +++++++++++++++++++
> 5 files changed, 40 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-ksm b/Documentation/ABI/testing/sysfs-kernel-mm-ksm
> index d244674a9480..6041a025b65a 100644
> --- a/Documentation/ABI/testing/sysfs-kernel-mm-ksm
> +++ b/Documentation/ABI/testing/sysfs-kernel-mm-ksm
> @@ -51,3 +51,11 @@ Description: Control merging pages across different NUMA nodes.
>
> When it is set to 0 only pages from the same node are merged,
> otherwise pages from all nodes can be merged together (default).
> +
> +What: /sys/kernel/mm/ksm/general_profit
> +Date: April 2023
> +KernelVersion: 6.4
> +Contact: Linux memory management mailing list <linux-mm(a)kvack.org>
> +Description: Measure how effective KSM is.
> + general_profit: how effective is KSM. The formula for the
> + calculation is in Documentation/admin-guide/mm/ksm.rst.
> diff --git a/Documentation/admin-guide/mm/ksm.rst b/Documentation/admin-guide/mm/ksm.rst
On some distributions, the rp_filter is automatically set (=1) by
default on a netdev basis (also on VRFs).
In an SRv6 End.DT46 behavior, decapsulated IPv4 packets are routed using
the table associated with the VRF bound to that tunnel. During lookup
operations, the rp_filter can lead to packet loss when activated on the
VRF.
Therefore, we chose to make this selftest more robust by explicitly
disabling the rp_filter during tests (as it is automatically set by some
Linux distributions).
Fixes: 03a0b567a03d ("selftests: seg6: add selftest for SRv6 End.DT46 Behavior")
Reported-by: Hangbin Liu <liuhangbin(a)gmail.com>
Signed-off-by: Andrea Mayer <andrea.mayer(a)uniroma2.it>
Tested-by: Hangbin Liu <liuhangbin(a)gmail.com>
---
.../testing/selftests/net/srv6_end_dt46_l3vpn_test.sh | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/net/srv6_end_dt46_l3vpn_test.sh b/tools/testing/selftests/net/srv6_end_dt46_l3vpn_test.sh
index aebaab8ce44c..441eededa031 100755
--- a/tools/testing/selftests/net/srv6_end_dt46_l3vpn_test.sh
+++ b/tools/testing/selftests/net/srv6_end_dt46_l3vpn_test.sh
@@ -292,6 +292,11 @@ setup_hs()
ip netns exec ${hsname} sysctl -wq net.ipv6.conf.all.accept_dad=0
ip netns exec ${hsname} sysctl -wq net.ipv6.conf.default.accept_dad=0
+ # disable the rp_filter otherwise the kernel gets confused about how
+ # to route decap ipv4 packets.
+ ip netns exec ${rtname} sysctl -wq net.ipv4.conf.all.rp_filter=0
+ ip netns exec ${rtname} sysctl -wq net.ipv4.conf.default.rp_filter=0
+
ip -netns ${hsname} link add veth0 type veth peer name ${rtveth}
ip -netns ${hsname} link set ${rtveth} netns ${rtname}
ip -netns ${hsname} addr add ${IPv6_HS_NETWORK}::${hs}/64 dev veth0 nodad
@@ -316,11 +321,6 @@ setup_hs()
ip netns exec ${rtname} sysctl -wq net.ipv6.conf.${rtveth}.proxy_ndp=1
ip netns exec ${rtname} sysctl -wq net.ipv4.conf.${rtveth}.proxy_arp=1
- # disable the rp_filter otherwise the kernel gets confused about how
- # to route decap ipv4 packets.
- ip netns exec ${rtname} sysctl -wq net.ipv4.conf.all.rp_filter=0
- ip netns exec ${rtname} sysctl -wq net.ipv4.conf.${rtveth}.rp_filter=0
-
ip netns exec ${rtname} sh -c "echo 1 > /proc/sys/net/vrf/strict_mode"
}
--
2.20.1
memalign() is obsolete according to its manpage.
Replace memalign() with posix_memalign() and remove malloc.h include
that was there for memalign().
As a pointer is passed into posix_memalign(), initialize *s to NULL
to silence a warning about the function's return value being used as
uninitialized (which is not valid anyway because the error is properly
checked before s is returned).
Signed-off-by: Deming Wang <wangdeming(a)inspur.com>
---
tools/testing/selftests/powerpc/stringloops/strlen.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/powerpc/stringloops/strlen.c b/tools/testing/selftests/powerpc/stringloops/strlen.c
index 9055ebc484d0..f9c1f9cc2d32 100644
--- a/tools/testing/selftests/powerpc/stringloops/strlen.c
+++ b/tools/testing/selftests/powerpc/stringloops/strlen.c
@@ -1,5 +1,4 @@
// SPDX-License-Identifier: GPL-2.0
-#include <malloc.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
@@ -51,10 +50,11 @@ static void bench_test(char *s)
static int testcase(void)
{
char *s;
+ int ret;
unsigned long i;
- s = memalign(128, SIZE);
- if (!s) {
+ ret = posix_memalign((void **)&s, 128, SIZE);
+ if (ret < 0) {
perror("memalign");
exit(1);
}
--
2.27.0
For cases like IPv6 addresses, having a means to supply tracing
predicates for fields with more than 8 bytes would be convenient.
This series provides a simple way to support this by allowing
simple ==, != memory comparison with the predicate supplied when
the size of the field exceeds 8 bytes. For example, to trace
::1, the predicate
"dst == 0x00000000000000000000000000000001"
..could be used.
Patch 1 provides the support for > 8 byte fields via a memcmp()-style
predicate. Patch 2 adds tests for filter predicates, and patch 3
documents the fact that for > 8 bytes. only == and != are supported.
Changes since RFC [1]:
- originally a fix was intermixed with the new functionality as
patch 1 in series [1]; the fix landed separately
- small tweaks to how filter predicates are defined via fn_num as
opposed to via fn directly
[1] https://lore.kernel.org/lkml/1659910883-18223-1-git-send-email-alan.maguire…
Alan Maguire (3):
tracing: support > 8 byte array filter predicates
selftests/ftrace: add test coverage for filter predicates
tracing: document > 8 byte numeric filtering support
Documentation/trace/events.rst | 9 +++
kernel/trace/trace_events_filter.c | 55 +++++++++++++++-
.../selftests/ftrace/test.d/event/filter.tc | 62 +++++++++++++++++++
3 files changed, 125 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/ftrace/test.d/event/filter.tc
--
2.31.1
When calling socket lookup from L2 (tc, xdp), VRF boundaries aren't
respected. This patchset fixes this by regarding the incoming device's
VRF attachment when performing the socket lookups from tc/xdp.
The first two patches are coding changes which factor out the tc helper's
logic which was shared with cg/sk_skb (which operate correctly).
This refactoring is needed in order to avoid affecting the cgroup/sk_skb
flows as there does not seem to be a strict criteria for discerning which
flow the helper is called from based on the net device or packet
information.
The third patch contains the actual bugfix.
The fourth patch adds bpf tests for these lookup functions.
---
v3: - Rename bpf_l2_sdif() to dev_sdif() as suggested by Stanislav Fomichev
- Added xdp tests as suggested by Daniel Borkmann
- Use start_server() to avoid duplicate code as suggested by Stanislav Fomichev
v2: Fixed uninitialized var in test patch (4).
Gilad Sever (4):
bpf: factor out socket lookup functions for the TC hookpoint.
bpf: Call __bpf_sk_lookup()/__bpf_skc_lookup() directly via TC
hookpoint
bpf: fix bpf socket lookup from tc/xdp to respect socket VRF bindings
selftests/bpf: Add vrf_socket_lookup tests
net/core/filter.c | 132 +++++--
.../bpf/prog_tests/vrf_socket_lookup.c | 327 ++++++++++++++++++
.../selftests/bpf/progs/vrf_socket_lookup.c | 88 +++++
3 files changed, 526 insertions(+), 21 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/vrf_socket_lookup.c
create mode 100644 tools/testing/selftests/bpf/progs/vrf_socket_lookup.c
--
2.34.1
From: SeongJae Park <sjpark(a)amazon.de>
When running a test program, 'run_one()' checks if the program has the
execution permission and fails if it doesn't. However, it's easy to
mistakenly missing the permission, as some common tools like 'diff'
don't support the permission change well[1]. Compared to that, making
mistakes in the test program's path would only rare, as those are
explicitly listed in 'TEST_PROGS'. Therefore, it might make more sense
to resolve the situation on our own and run the program.
For the reason, this commit makes the test program runner function to
still print the warning message but try parsing the interpreter of the
program and explicitly run it with the interpreter, in the case.
[1] https://lore.kernel.org/mm-commits/YRJisBs9AunccCD4@kroah.com/
Suggested-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: SeongJae Park <sjpark(a)amazon.de>
---
Changes from v1
(https://lore.kernel.org/linux-kselftest/20210810140459.23990-1-sj38.park@gm…)
- Parse and use the interpreter instead of changing the file
tools/testing/selftests/kselftest/runner.sh | 28 +++++++++++++--------
1 file changed, 18 insertions(+), 10 deletions(-)
diff --git a/tools/testing/selftests/kselftest/runner.sh b/tools/testing/selftests/kselftest/runner.sh
index cc9c846585f0..a9ba782d8ca0 100644
--- a/tools/testing/selftests/kselftest/runner.sh
+++ b/tools/testing/selftests/kselftest/runner.sh
@@ -33,9 +33,9 @@ tap_timeout()
{
# Make sure tests will time out if utility is available.
if [ -x /usr/bin/timeout ] ; then
- /usr/bin/timeout --foreground "$kselftest_timeout" "$1"
+ /usr/bin/timeout --foreground "$kselftest_timeout" $1
else
- "$1"
+ $1
fi
}
@@ -65,17 +65,25 @@ run_one()
TEST_HDR_MSG="selftests: $DIR: $BASENAME_TEST"
echo "# $TEST_HDR_MSG"
- if [ ! -x "$TEST" ]; then
- echo -n "# Warning: file $TEST is "
- if [ ! -e "$TEST" ]; then
- echo "missing!"
- else
- echo "not executable, correct this."
- fi
+ if [ ! -e "$TEST" ]; then
+ echo "# Warning: file $TEST is missing!"
echo "not ok $test_num $TEST_HDR_MSG"
else
+ cmd="./$BASENAME_TEST"
+ if [ ! -x "$TEST" ]; then
+ echo "# Warning: file $TEST is not executable"
+
+ if [ $(head -n 1 "$TEST" | cut -c -2) = "#!" ]
+ then
+ interpreter=$(head -n 1 "$TEST" | cut -c 3-)
+ cmd="$interpreter ./$BASENAME_TEST"
+ else
+ echo "not ok $test_num $TEST_HDR_MSG"
+ return
+ fi
+ fi
cd `dirname $TEST` > /dev/null
- ((((( tap_timeout ./$BASENAME_TEST 2>&1; echo $? >&3) |
+ ((((( tap_timeout "$cmd" 2>&1; echo $? >&3) |
tap_prefix >&4) 3>&1) |
(read xs; exit $xs)) 4>>"$logfile" &&
echo "ok $test_num $TEST_HDR_MSG") ||
--
2.17.1
From: Feng Zhou <zhoufeng.zf(a)bytedance.com>
Trace sched related functions, such as enqueue_task_fair, it is necessary to
specify a task instead of the current task which within a given cgroup.
Feng Zhou (2):
bpf: Add bpf_task_under_cgroup() kfunc
selftests/bpf: Add testcase for bpf_task_under_cgroup
Changelog:
v2->v3: Addressed comments from Alexei Starovoitov
- Modify the comment information of the function.
- Narrow down the testcase's hook point
Details in here:
https://lore.kernel.org/all/20230421090403.15515-1-zhoufeng.zf@bytedance.co…
v1->v2: Addressed comments from Alexei Starovoitov
- Add kfunc instead.
Details in here:
https://lore.kernel.org/all/20230420072657.80324-1-zhoufeng.zf@bytedance.co…
kernel/bpf/helpers.c | 20 ++++++++
tools/testing/selftests/bpf/DENYLIST.s390x | 1 +
.../bpf/prog_tests/task_under_cgroup.c | 47 +++++++++++++++++++
.../selftests/bpf/progs/cgrp_kfunc_common.h | 1 +
.../bpf/progs/test_task_under_cgroup.c | 37 +++++++++++++++
5 files changed, 106 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/task_under_cgroup.c
create mode 100644 tools/testing/selftests/bpf/progs/test_task_under_cgroup.c
--
2.20.1
From: Roberto Sassu <roberto.sassu(a)huawei.com>
Goals
=====
Support new key and signature formats with the same kernel component.
Verify the authenticity of system data with newly supported data formats.
Mitigate the risk of parsing arbitrary data in the kernel.
Motivation
==========
Adding new functionality to the kernel comes with an increased risk of
introducing new bugs which, once exploited, can lead to a partial or full
system compromise.
Parsing arbitrary data is particularly critical, since it allows an
attacker to send a malicious sequence of bytes to exploit vulnerabilities
in the parser code. The attacker might be able to overwrite kernel memory
to bypass kernel protections, and obtain more privileges.
User Mode Drivers (UMDs) can effectively mitigate this risk. If the parser
runs in user space, even if it has a bug, it won't allow the attacker to
overwrite kernel memory.
The communication protocol between the UMD and the kernel should be simple
enough, that the kernel can immediately recognize malformed data sent by an
attacker controlling the UMD, and discard it.
Solution
========
Register a new parser of the asymmetric key type which, instead of parsing
the key blob, forwards it to a UMD, and populates the key fields from the
UMD response. That response contains the data for each field of the public
key structure, defined in the kernel, and possibly a key description.
Supporting new data formats can be achieved by simply extending the UMD. As
long as the UMD recognizes them, and provides the crypto material to the
kernel Crypto API in the expected format, the kernel does not need to be
aware of the UMD changes.
Add a new API to verify the authenticity of system data, similar to the one
for PKCS#7 signatures. As for the key parser, send the signature to a UMD,
and fill the public_key_signature structure from the UMD response.
The API still supports a very basic trust model, it accepts a key for
signature verification if it is in the supplied keyring. The API can be
extended later to support more sophisticated models.
Use cases
=========
eBPF
----
The eBPF infrastructure already offers to eBPF programs the ability to
verify PKCS#7 signatures, through the bpf_verify_pkcs7_signature() kfunc.
Add the new bpf_verify_umd_signature() kfunc, to allow eBPF programs verify
signatures in a data format that is not PKCS#7 (for example PGP).
IMA Appraisal
-------------
An alternative to appraising each file with its signature (Fedora 38) is to
build a repository of reference file digests from signed RPM headers, and
lookup the calculated digest of files being accessed in that repository
(DIGLIM[1]).
With this patch set, the kernel can verify the authenticity of RPM headers
from their PGP signature against the Linux distribution GPG keys. Once
verified, RPM headers can be parsed with a UMD to build the repository of
reference file digests.
With DIGLIM, Linux distributions are not required to change anything in
their building infrastructure (no extra data in the RPM header, no new PKI
for IMA signatures).
[1]: https://lore.kernel.org/linux-integrity/20210914163401.864635-1-roberto.sas…
UMD development
===============
The header file crypto/asymmetric_keys/umd_key_sig_umh.h contains the
details of the communication protocol between the kernel and the UMD
handler.
The UMD handler should implement the commands defined, CMD_KEY and CMD_SIG,
should set the result of the processing, and fill the key and
signature-specific structures umd_key_msg_out and umd_sig_msg_out.
The UMD handler should provide the key and signature blobs in a format that
is understood by the kernel. For example, for RSA keys, it should provide
them in ASN.1 format (SEQUENCE of INTEGER).
The auth IDs of the keys and signatures should match, for signature
verification. Auth ID matching can be partial.
Patch set dependencies
======================
This patch set depends on 'usermode_driver: Add management library and
API':
https://lore.kernel.org/bpf/20230317145240.363908-1-roberto.sassu@huaweiclo…
Patch set content
=================
Patch 1 introduces the new parser for the asymmetric key type.
Patch 2 introduces the parser for signatures and its API.
Patch 3 introduces the system-level API for signature verification.
Patch 4 extends eBPF to use the new system-level API.
Patch 5 adds a test for UMD-parser signatures (not executed until the UMD
supports PGP).
Patch 6 introduces the skeleton of the UMD handler.
PGP
===
A work in progress implementation of the PGP format (RFC 4880 and RFC 6637)
in the UMD handler is available at:
https://github.com/robertosassu/linux/commits/pgp-signatures-umd-v1-devel-v…
It is based on a previous work of David Howells, available at:
https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-modsign.git/…
The patches have been adapted for use in user space.
Roberto Sassu (6):
KEYS: asymmetric: Introduce UMD-based asymmetric key parser
KEYS: asymmetric: Introduce UMD-based asymmetric key signature parser
verification: Introduce verify_umd_signature() and
verify_umd_message_sig()
bpf: Introduce bpf_verify_umd_signature() kfunc
selftests/bpf: Prepare a test for UMD-parsed signatures
KEYS: asymmetric: Add UMD handler
.gitignore | 3 +
MAINTAINERS | 1 +
certs/system_keyring.c | 125 ++++++
crypto/asymmetric_keys/Kconfig | 32 ++
crypto/asymmetric_keys/Makefile | 23 +
crypto/asymmetric_keys/asymmetric_type.c | 3 +-
crypto/asymmetric_keys/umd_key.h | 28 ++
crypto/asymmetric_keys/umd_key_parser.c | 203 +++++++++
crypto/asymmetric_keys/umd_key_sig_loader.c | 32 ++
crypto/asymmetric_keys/umd_key_sig_umh.h | 71 +++
crypto/asymmetric_keys/umd_key_sig_umh_blob.S | 7 +
crypto/asymmetric_keys/umd_key_sig_umh_user.c | 84 ++++
crypto/asymmetric_keys/umd_sig_parser.c | 416 ++++++++++++++++++
include/crypto/umd_sig.h | 71 +++
include/keys/asymmetric-type.h | 1 +
include/linux/verification.h | 48 ++
kernel/trace/bpf_trace.c | 69 ++-
...ify_pkcs7_sig.c => verify_pkcs7_umd_sig.c} | 109 +++--
...kcs7_sig.c => test_verify_pkcs7_umd_sig.c} | 18 +-
.../testing/selftests/bpf/verify_sig_setup.sh | 82 +++-
20 files changed, 1378 insertions(+), 48 deletions(-)
create mode 100644 crypto/asymmetric_keys/umd_key.h
create mode 100644 crypto/asymmetric_keys/umd_key_parser.c
create mode 100644 crypto/asymmetric_keys/umd_key_sig_loader.c
create mode 100644 crypto/asymmetric_keys/umd_key_sig_umh.h
create mode 100644 crypto/asymmetric_keys/umd_key_sig_umh_blob.S
create mode 100644 crypto/asymmetric_keys/umd_key_sig_umh_user.c
create mode 100644 crypto/asymmetric_keys/umd_sig_parser.c
create mode 100644 include/crypto/umd_sig.h
rename tools/testing/selftests/bpf/prog_tests/{verify_pkcs7_sig.c => verify_pkcs7_umd_sig.c} (75%)
rename tools/testing/selftests/bpf/progs/{test_verify_pkcs7_sig.c => test_verify_pkcs7_umd_sig.c} (82%)
--
2.25.1
Dan Carpenter spotted a race condition in a couple of situations like
these in the test_firmware driver:
static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
u8 val;
int ret;
ret = kstrtou8(buf, 10, &val);
if (ret)
return ret;
mutex_lock(&test_fw_mutex);
*(u8 *)cfg = val;
mutex_unlock(&test_fw_mutex);
/* Always return full write size even if we didn't consume all */
return size;
}
static ssize_t config_num_requests_store(struct device *dev,
struct device_attribute *attr,
const char *buf, size_t count)
{
int rc;
mutex_lock(&test_fw_mutex);
if (test_fw_config->reqs) {
pr_err("Must call release_all_firmware prior to changing config\n");
rc = -EINVAL;
mutex_unlock(&test_fw_mutex);
goto out;
}
mutex_unlock(&test_fw_mutex);
rc = test_dev_config_update_u8(buf, count,
&test_fw_config->num_requests);
out:
return rc;
}
static ssize_t config_read_fw_idx_store(struct device *dev,
struct device_attribute *attr,
const char *buf, size_t count)
{
return test_dev_config_update_u8(buf, count,
&test_fw_config->read_fw_idx);
}
The function test_dev_config_update_u8() is called from both the locked
and the unlocked context, function config_num_requests_store() and
config_read_fw_idx_store() which can both be called asynchronously as
they are driver's methods, while test_dev_config_update_u8() and siblings
change their argument pointed to by u8 *cfg or similar pointer.
To avoid deadlock on test_fw_mutex, the lock is dropped before calling
test_dev_config_update_u8() and re-acquired within test_dev_config_update_u8()
itself, but alas this creates a race condition.
Having two locks wouldn't assure a race-proof mutual exclusion.
This situation is best avoided by the introduction of a new, unlocked
function __test_dev_config_update_u8() which can be called from the locked
context and reducing test_dev_config_update_u8() to:
static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
int ret;
mutex_lock(&test_fw_mutex);
ret = __test_dev_config_update_u8(buf, size, cfg);
mutex_unlock(&test_fw_mutex);
return ret;
}
doing the locking and calling the unlocked primitive, which enables both
locked and unlocked versions without duplication of code.
The similar approach was applied to all functions called from the locked
and the unlocked context, which safely mitigates both deadlocks and race
conditions in the driver.
__test_dev_config_update_bool(), __test_dev_config_update_u8() and
__test_dev_config_update_size_t() unlocked versions of the functions
were introduced to be called from the locked contexts as a workaround
without releasing the main driver's lock and thereof causing a race
condition.
The test_dev_config_update_bool(), test_dev_config_update_u8() and
test_dev_config_update_size_t() locked versions of the functions
are being called from driver methods without the unnecessary multiplying
of the locking and unlocking code for each method, and complicating
the code with saving of the return value across lock.
Fixes: 7feebfa487b92 ("test_firmware: add support for request_firmware_into_buf")
Cc: Luis Chamberlain <mcgrof(a)kernel.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Russ Weight <russell.h.weight(a)intel.com>
Cc: Takashi Iwai <tiwai(a)suse.de>
Cc: Tianfei Zhang <tianfei.zhang(a)intel.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Colin Ian King <colin.i.king(a)gmail.com>
Cc: Randy Dunlap <rdunlap(a)infradead.org>
Cc: linux-kselftest(a)vger.kernel.org
Cc: stable(a)vger.kernel.org # v5.4
Suggested-by: Dan Carpenter <error27(a)gmail.com>
Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac(a)alu.unizg.hr>
---
lib/test_firmware.c | 52 ++++++++++++++++++++++++++++++---------------
1 file changed, 35 insertions(+), 17 deletions(-)
diff --git a/lib/test_firmware.c b/lib/test_firmware.c
index 05ed84c2fc4c..35417e0af3f4 100644
--- a/lib/test_firmware.c
+++ b/lib/test_firmware.c
@@ -353,16 +353,26 @@ static ssize_t config_test_show_str(char *dst,
return len;
}
-static int test_dev_config_update_bool(const char *buf, size_t size,
+static inline int __test_dev_config_update_bool(const char *buf, size_t size,
bool *cfg)
{
int ret;
- mutex_lock(&test_fw_mutex);
if (kstrtobool(buf, cfg) < 0)
ret = -EINVAL;
else
ret = size;
+
+ return ret;
+}
+
+static int test_dev_config_update_bool(const char *buf, size_t size,
+ bool *cfg)
+{
+ int ret;
+
+ mutex_lock(&test_fw_mutex);
+ ret = __test_dev_config_update_bool(buf, size, cfg);
mutex_unlock(&test_fw_mutex);
return ret;
@@ -373,7 +383,8 @@ static ssize_t test_dev_config_show_bool(char *buf, bool val)
return snprintf(buf, PAGE_SIZE, "%d\n", val);
}
-static int test_dev_config_update_size_t(const char *buf,
+static int __test_dev_config_update_size_t(
+ const char *buf,
size_t size,
size_t *cfg)
{
@@ -384,9 +395,7 @@ static int test_dev_config_update_size_t(const char *buf,
if (ret)
return ret;
- mutex_lock(&test_fw_mutex);
*(size_t *)cfg = new;
- mutex_unlock(&test_fw_mutex);
/* Always return full write size even if we didn't consume all */
return size;
@@ -402,7 +411,7 @@ static ssize_t test_dev_config_show_int(char *buf, int val)
return snprintf(buf, PAGE_SIZE, "%d\n", val);
}
-static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
+static int __test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
u8 val;
int ret;
@@ -411,14 +420,23 @@ static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
if (ret)
return ret;
- mutex_lock(&test_fw_mutex);
*(u8 *)cfg = val;
- mutex_unlock(&test_fw_mutex);
/* Always return full write size even if we didn't consume all */
return size;
}
+static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
+{
+ int ret;
+
+ mutex_lock(&test_fw_mutex);
+ ret = __test_dev_config_update_u8(buf, size, cfg);
+ mutex_unlock(&test_fw_mutex);
+
+ return ret;
+}
+
static ssize_t test_dev_config_show_u8(char *buf, u8 val)
{
return snprintf(buf, PAGE_SIZE, "%u\n", val);
@@ -471,10 +489,10 @@ static ssize_t config_num_requests_store(struct device *dev,
mutex_unlock(&test_fw_mutex);
goto out;
}
- mutex_unlock(&test_fw_mutex);
- rc = test_dev_config_update_u8(buf, count,
- &test_fw_config->num_requests);
+ rc = __test_dev_config_update_u8(buf, count,
+ &test_fw_config->num_requests);
+ mutex_unlock(&test_fw_mutex);
out:
return rc;
@@ -518,10 +536,10 @@ static ssize_t config_buf_size_store(struct device *dev,
mutex_unlock(&test_fw_mutex);
goto out;
}
- mutex_unlock(&test_fw_mutex);
- rc = test_dev_config_update_size_t(buf, count,
- &test_fw_config->buf_size);
+ rc = __test_dev_config_update_size_t(buf, count,
+ &test_fw_config->buf_size);
+ mutex_unlock(&test_fw_mutex);
out:
return rc;
@@ -548,10 +566,10 @@ static ssize_t config_file_offset_store(struct device *dev,
mutex_unlock(&test_fw_mutex);
goto out;
}
- mutex_unlock(&test_fw_mutex);
- rc = test_dev_config_update_size_t(buf, count,
- &test_fw_config->file_offset);
+ rc = __test_dev_config_update_size_t(buf, count,
+ &test_fw_config->file_offset);
+ mutex_unlock(&test_fw_mutex);
out:
return rc;
--
2.30.2
This patchset adds a stress test for kprobes and a test for checking
optimized probes.
The two tests are being added based on the below discussion:
https://lore.kernel.org/all/20230128101622.ce6f8e64d929e29d36b08b73@kernel.…
Changelog:
* Add an explicit fork after enabling the events ( echo "forked" )
* Remove the extended test from multiple_kprobe_types.tc which adds
multiple consecutive probes in a function and add it as a
separate test case.
* Add new test case which checks for optimized probes.
Akanksha J N (2):
selftests/ftrace: Add new test case which adds multiple consecutive
probes in a function
selftests/ftrace: Add new test case which checks for optimized probes
.../test.d/kprobe/kprobe_insn_boundary.tc | 19 +++++++++++
.../ftrace/test.d/kprobe/kprobe_opt_types.tc | 34 +++++++++++++++++++
2 files changed, 53 insertions(+)
create mode 100644 tools/testing/selftests/ftrace/test.d/kprobe/kprobe_insn_boundary.tc
create mode 100644 tools/testing/selftests/ftrace/test.d/kprobe/kprobe_opt_types.tc
--
2.31.1
This is v1 of the KUnit deferred actions API, which implements an
equivalent of devm_add_action[1] on top of KUnit managed resources. This
provides a simple way of scheduling a function to run when the test
terminates (whether successfully, or with an error). It's therefore very
useful for freeing resources, or otherwise cleaning up.
The notable changes since RFCv2[2] are:
- Got rid of the 'cancellation token' concept. It was overcomplicated,
and we can add it back if we need to.
- kunit_add_action() therefore now returns 0 on success, and an error
otherwise (like devm_add_action()). Though you may wish to use:
- Added kunit_add_action_or_reset(), which will call the deferred
function if an error occurs. (See devm_add_action_or_reset()). This
also returns an error on failure, which can be asserted safely.
- Got rid of the function pointer typedef. Personally, I liked it, but
it's more typedef-y than most kernel code.
- Got rid of the 'internal_gfp' argument: all internal state is now
allocated with GFP_KERNEL. The main KUnit resource API can be used
instead if this doesn't work for your use-case.
I'd love to hear any further thoughts!
Cheers,
-- David
[1]: https://docs.kernel.org/driver-api/basics.html#c.devm_add_action
[2]: https://patchwork.kernel.org/project/linux-kselftest/list/?series=735720
David Gow (3):
kunit: Add kunit_add_action() to defer a call until test exit
kunit: executor_test: Use kunit_add_action()
kunit: kmalloc_array: Use kunit_add_action()
include/kunit/resource.h | 76 +++++++++++++++++++++++++++++++
include/kunit/test.h | 10 ++++-
lib/kunit/executor_test.c | 11 ++---
lib/kunit/kunit-test.c | 88 +++++++++++++++++++++++++++++++++++-
lib/kunit/resource.c | 95 +++++++++++++++++++++++++++++++++++++++
lib/kunit/test.c | 48 ++++----------------
6 files changed, 279 insertions(+), 49 deletions(-)
--
2.40.0.634.g4ca3ef3211-goog
KUnit tests run in a kthread, with the current->kunit_test pointer set
to the test's context. This allows the kunit_get_current_test() and
kunit_fail_current_test() macros to work. Normally, this pointer is
still valid during test shutdown (i.e., the suite->exit function, and
any resource cleanup). However, if the test has exited early (e.g., due
to a failed assertion), the cleanup is done in the parent KUnit thread,
which does not have an active context.
Instead, in the event test terminates early, run the test exit and
cleanup from a new 'cleanup' kthread, which sets current->kunit_test,
and better isolates the rest of KUnit from issues which arise in test
cleanup.
If a test cleanup function itself aborts (e.g., due to an assertion
failing), there will be no further attempts to clean up: an error will
be logged and the test failed. For example:
# example_simple_test: test aborted during cleanup. continuing without cleaning up
This should also make it easier to get access to the KUnit context,
particularly from within resource cleanup functions, which may, for
example, need access to data in test->priv.
Signed-off-by: David Gow <davidgow(a)google.com>
---
This is an updated version of / replacement of "kunit: Set the current
KUnit context when cleaning up", which instead creates a new kthread
for cleanup tasks if the original test kthread is aborted. This protects
us from failed assertions during cleanup, if the test exited early.
Changes since v2:
https://lore.kernel.org/linux-kselftest/20230419085426.1671703-1-davidgow@g…
- Always run cleanup in its own kthread
- Therefore, never attempt to re-run it if it exits
- Thanks, Benjamin.
Changes since v1:
https://lore.kernel.org/linux-kselftest/20230415091401.681395-1-davidgow@go…
- Move cleanup execution to another kthread
- (Thanks, Benjamin, for pointing out the assertion issues)
---
lib/kunit/test.c | 55 ++++++++++++++++++++++++++++++++++++++++++------
1 file changed, 48 insertions(+), 7 deletions(-)
diff --git a/lib/kunit/test.c b/lib/kunit/test.c
index e2910b261112..2025e51941e6 100644
--- a/lib/kunit/test.c
+++ b/lib/kunit/test.c
@@ -419,10 +419,50 @@ static void kunit_try_run_case(void *data)
* thread will resume control and handle any necessary clean up.
*/
kunit_run_case_internal(test, suite, test_case);
- /* This line may never be reached. */
+}
+
+static void kunit_try_run_case_cleanup(void *data)
+{
+ struct kunit_try_catch_context *ctx = data;
+ struct kunit *test = ctx->test;
+ struct kunit_suite *suite = ctx->suite;
+
+ current->kunit_test = test;
+
kunit_run_case_cleanup(test, suite);
}
+static void kunit_catch_run_case_cleanup(void *data)
+{
+ struct kunit_try_catch_context *ctx = data;
+ struct kunit *test = ctx->test;
+ int try_exit_code = kunit_try_catch_get_result(&test->try_catch);
+
+ /* It is always a failure if cleanup aborts. */
+ kunit_set_failure(test);
+
+ if (try_exit_code) {
+ /*
+ * Test case could not finish, we have no idea what state it is
+ * in, so don't do clean up.
+ */
+ if (try_exit_code == -ETIMEDOUT) {
+ kunit_err(test, "test case cleanup timed out\n");
+ /*
+ * Unknown internal error occurred preventing test case from
+ * running, so there is nothing to clean up.
+ */
+ } else {
+ kunit_err(test, "internal error occurred during test case cleanup: %d\n",
+ try_exit_code);
+ }
+ return;
+ }
+
+ kunit_err(test, "test aborted during cleanup. continuing without cleaning up\n");
+}
+
+
static void kunit_catch_run_case(void *data)
{
struct kunit_try_catch_context *ctx = data;
@@ -448,12 +488,6 @@ static void kunit_catch_run_case(void *data)
}
return;
}
-
- /*
- * Test case was run, but aborted. It is the test case's business as to
- * whether it failed or not, we just need to clean up.
- */
- kunit_run_case_cleanup(test, suite);
}
/*
@@ -478,6 +512,13 @@ static void kunit_run_case_catch_errors(struct kunit_suite *suite,
context.test_case = test_case;
kunit_try_catch_run(try_catch, &context);
+ /* Now run the cleanup */
+ kunit_try_catch_init(try_catch,
+ test,
+ kunit_try_run_case_cleanup,
+ kunit_catch_run_case_cleanup);
+ kunit_try_catch_run(try_catch, &context);
+
/* Propagate the parameter result to the test case. */
if (test->status == KUNIT_FAILURE)
test_case->status = KUNIT_FAILURE;
--
2.40.0.634.g4ca3ef3211-goog
The ftrace selftests do not currently produce KTAP output, they produce a
custom format much nicer for human consumption. This means that when run in
automated test systems we just get a single result for the suite as a whole
rather than recording results for individual test cases, making it harder
to look at the test data and masking things like inappropriate skips.
Address this by adding support for KTAP output to the ftracetest script and
providing a trivial wrapper which will be invoked by the kselftest runner
to generate output in this format by default, users using ftracetest
directly will continue to get the existing output.
This is not the most elegant solution but it is simple and effective. I
did consider implementing this by post processing the existing output
format but that felt more complex and likely to result in all output being
lost if something goes seriously wrong during the run which would not be
helpful. I did also consider just writing a separate runner script but
there's enough going on with things like the signal handling for that to
seem like it would be duplicating too much.
Acked-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Tested-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Changes in v2:
- Update help text.
- Link to v1: https://lore.kernel.org/r/20230302-ftrace-kselftest-ktap-v1-1-a84a0765b7ad@…
---
tools/testing/selftests/ftrace/Makefile | 3 +-
tools/testing/selftests/ftrace/ftracetest | 63 ++++++++++++++++++++++++--
tools/testing/selftests/ftrace/ftracetest-ktap | 8 ++++
3 files changed, 70 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/ftrace/Makefile b/tools/testing/selftests/ftrace/Makefile
index d6e106fbce11..a1e955d2de4c 100644
--- a/tools/testing/selftests/ftrace/Makefile
+++ b/tools/testing/selftests/ftrace/Makefile
@@ -1,7 +1,8 @@
# SPDX-License-Identifier: GPL-2.0
all:
-TEST_PROGS := ftracetest
+TEST_PROGS_EXTENDED := ftracetest
+TEST_PROGS := ftracetest-ktap
TEST_FILES := test.d settings
EXTRA_CLEAN := $(OUTPUT)/logs/*
diff --git a/tools/testing/selftests/ftrace/ftracetest b/tools/testing/selftests/ftrace/ftracetest
index c3311c8c4089..2506621e75df 100755
--- a/tools/testing/selftests/ftrace/ftracetest
+++ b/tools/testing/selftests/ftrace/ftracetest
@@ -13,6 +13,7 @@ echo "Usage: ftracetest [options] [testcase(s)] [testcase-directory(s)]"
echo " Options:"
echo " -h|--help Show help message"
echo " -k|--keep Keep passed test logs"
+echo " -K|--ktap Output in KTAP format"
echo " -v|--verbose Increase verbosity of test messages"
echo " -vv Alias of -v -v (Show all results in stdout)"
echo " -vvv Alias of -v -v -v (Show all commands immediately)"
@@ -85,6 +86,10 @@ parse_opts() { # opts
KEEP_LOG=1
shift 1
;;
+ --ktap|-K)
+ KTAP=1
+ shift 1
+ ;;
--verbose|-v|-vv|-vvv)
if [ $VERBOSE -eq -1 ]; then
usage "--console can not use with --verbose"
@@ -178,6 +183,7 @@ TEST_DIR=$TOP_DIR/test.d
TEST_CASES=`find_testcases $TEST_DIR`
LOG_DIR=$TOP_DIR/logs/`date +%Y%m%d-%H%M%S`/
KEEP_LOG=0
+KTAP=0
DEBUG=0
VERBOSE=0
UNSUPPORTED_RESULT=0
@@ -229,7 +235,7 @@ prlog() { # messages
newline=
shift
fi
- printf "$*$newline"
+ [ "$KTAP" != "1" ] && printf "$*$newline"
[ "$LOG_FILE" ] && printf "$*$newline" | strip_esc >> $LOG_FILE
}
catlog() { #file
@@ -260,11 +266,11 @@ TOTAL_RESULT=0
INSTANCE=
CASENO=0
+CASENAME=
testcase() { # testfile
CASENO=$((CASENO+1))
- desc=`grep "^#[ \t]*description:" $1 | cut -f2- -d:`
- prlog -n "[$CASENO]$INSTANCE$desc"
+ CASENAME=`grep "^#[ \t]*description:" $1 | cut -f2- -d:`
}
checkreq() { # testfile
@@ -277,40 +283,68 @@ test_on_instance() { # testfile
grep -q "^#[ \t]*flags:.*instance" $1
}
+ktaptest() { # result comment
+ if [ "$KTAP" != "1" ]; then
+ return
+ fi
+
+ local result=
+ if [ "$1" = "1" ]; then
+ result="ok"
+ else
+ result="not ok"
+ fi
+ shift
+
+ local comment=$*
+ if [ "$comment" != "" ]; then
+ comment="# $comment"
+ fi
+
+ echo $CASENO $result $INSTANCE$CASENAME $comment
+}
+
eval_result() { # sigval
case $1 in
$PASS)
prlog " [${color_green}PASS${color_reset}]"
+ ktaptest 1
PASSED_CASES="$PASSED_CASES $CASENO"
return 0
;;
$FAIL)
prlog " [${color_red}FAIL${color_reset}]"
+ ktaptest 0
FAILED_CASES="$FAILED_CASES $CASENO"
return 1 # this is a bug.
;;
$UNRESOLVED)
prlog " [${color_blue}UNRESOLVED${color_reset}]"
+ ktaptest 0 UNRESOLVED
UNRESOLVED_CASES="$UNRESOLVED_CASES $CASENO"
return $UNRESOLVED_RESULT # depends on use case
;;
$UNTESTED)
prlog " [${color_blue}UNTESTED${color_reset}]"
+ ktaptest 1 SKIP
UNTESTED_CASES="$UNTESTED_CASES $CASENO"
return 0
;;
$UNSUPPORTED)
prlog " [${color_blue}UNSUPPORTED${color_reset}]"
+ ktaptest 1 SKIP
UNSUPPORTED_CASES="$UNSUPPORTED_CASES $CASENO"
return $UNSUPPORTED_RESULT # depends on use case
;;
$XFAIL)
prlog " [${color_green}XFAIL${color_reset}]"
+ ktaptest 1 XFAIL
XFAILED_CASES="$XFAILED_CASES $CASENO"
return 0
;;
*)
prlog " [${color_blue}UNDEFINED${color_reset}]"
+ ktaptest 0 error
UNDEFINED_CASES="$UNDEFINED_CASES $CASENO"
return 1 # this must be a test bug
;;
@@ -371,6 +405,7 @@ __run_test() { # testfile
run_test() { # testfile
local testname=`basename $1`
testcase $1
+ prlog -n "[$CASENO]$INSTANCE$CASENAME"
if [ ! -z "$LOG_FILE" ] ; then
local testlog=`mktemp $LOG_DIR/${CASENO}-${testname}-log.XXXXXX`
else
@@ -405,6 +440,17 @@ run_test() { # testfile
# load in the helper functions
. $TEST_DIR/functions
+if [ "$KTAP" = "1" ]; then
+ echo "TAP version 13"
+
+ casecount=`echo $TEST_CASES | wc -w`
+ for t in $TEST_CASES; do
+ test_on_instance $t || continue
+ casecount=$((casecount+1))
+ done
+ echo "1..${casecount}"
+fi
+
# Main loop
for t in $TEST_CASES; do
run_test $t
@@ -439,6 +485,17 @@ prlog "# of unsupported: " `echo $UNSUPPORTED_CASES | wc -w`
prlog "# of xfailed: " `echo $XFAILED_CASES | wc -w`
prlog "# of undefined(test bug): " `echo $UNDEFINED_CASES | wc -w`
+if [ "$KTAP" = "1" ]; then
+ echo -n "# Totals:"
+ echo -n " pass:"`echo $PASSED_CASES | wc -w`
+ echo -n " faii:"`echo $FAILED_CASES | wc -w`
+ echo -n " xfail:"`echo $XFAILED_CASES | wc -w`
+ echo -n " xpass:0"
+ echo -n " skip:"`echo $UNTESTED_CASES $UNSUPPORTED_CASES | wc -w`
+ echo -n " error:"`echo $UNRESOLVED_CASES $UNDEFINED_CASES | wc -w`
+ echo
+fi
+
cleanup
# if no error, return 0
diff --git a/tools/testing/selftests/ftrace/ftracetest-ktap b/tools/testing/selftests/ftrace/ftracetest-ktap
new file mode 100755
index 000000000000..b3284679ef3a
--- /dev/null
+++ b/tools/testing/selftests/ftrace/ftracetest-ktap
@@ -0,0 +1,8 @@
+#!/bin/sh -e
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# ftracetest-ktap: Wrapper to integrate ftracetest with the kselftest runner
+#
+# Copyright (C) Arm Ltd., 2023
+
+./ftracetest -K
---
base-commit: 457391b0380335d5e9a5babdec90ac53928b23b4
change-id: 20230302-ftrace-kselftest-ktap-9d7878691557
Best regards,
--
Mark Brown <broonie(a)kernel.org>
The ftrace selftests do not currently produce KTAP output, they produce a
custom format much nicer for human consumption. This means that when run in
automated test systems we just get a single result for the suite as a whole
rather than recording results for individual test cases, making it harder
to look at the test data and masking things like inappropriate skips.
Address this by adding support for KTAP output to the ftracetest script and
providing a trivial wrapper which will be invoked by the kselftest runner
to generate output in this format by default, users using ftracetest
directly will continue to get the existing output.
This is not the most elegant solution but it is simple and effective. I
did consider implementing this by post processing the existing output
format but that felt more complex and likely to result in all output being
lost if something goes seriously wrong during the run which would not be
helpful. I did also consider just writing a separate runner script but
there's enough going on with things like the signal handling for that to
seem like it would be duplicating too much.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
tools/testing/selftests/ftrace/Makefile | 3 +-
tools/testing/selftests/ftrace/ftracetest | 63 ++++++++++++++++++++++++--
tools/testing/selftests/ftrace/ftracetest-ktap | 8 ++++
3 files changed, 70 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/ftrace/Makefile b/tools/testing/selftests/ftrace/Makefile
index d6e106fbce11..a1e955d2de4c 100644
--- a/tools/testing/selftests/ftrace/Makefile
+++ b/tools/testing/selftests/ftrace/Makefile
@@ -1,7 +1,8 @@
# SPDX-License-Identifier: GPL-2.0
all:
-TEST_PROGS := ftracetest
+TEST_PROGS_EXTENDED := ftracetest
+TEST_PROGS := ftracetest-ktap
TEST_FILES := test.d settings
EXTRA_CLEAN := $(OUTPUT)/logs/*
diff --git a/tools/testing/selftests/ftrace/ftracetest b/tools/testing/selftests/ftrace/ftracetest
index c3311c8c4089..539c8d6d5d71 100755
--- a/tools/testing/selftests/ftrace/ftracetest
+++ b/tools/testing/selftests/ftrace/ftracetest
@@ -13,6 +13,7 @@ echo "Usage: ftracetest [options] [testcase(s)] [testcase-directory(s)]"
echo " Options:"
echo " -h|--help Show help message"
echo " -k|--keep Keep passed test logs"
+echo " -K|--KTAP Output in KTAP format"
echo " -v|--verbose Increase verbosity of test messages"
echo " -vv Alias of -v -v (Show all results in stdout)"
echo " -vvv Alias of -v -v -v (Show all commands immediately)"
@@ -85,6 +86,10 @@ parse_opts() { # opts
KEEP_LOG=1
shift 1
;;
+ --ktap|-K)
+ KTAP=1
+ shift 1
+ ;;
--verbose|-v|-vv|-vvv)
if [ $VERBOSE -eq -1 ]; then
usage "--console can not use with --verbose"
@@ -178,6 +183,7 @@ TEST_DIR=$TOP_DIR/test.d
TEST_CASES=`find_testcases $TEST_DIR`
LOG_DIR=$TOP_DIR/logs/`date +%Y%m%d-%H%M%S`/
KEEP_LOG=0
+KTAP=0
DEBUG=0
VERBOSE=0
UNSUPPORTED_RESULT=0
@@ -229,7 +235,7 @@ prlog() { # messages
newline=
shift
fi
- printf "$*$newline"
+ [ "$KTAP" != "1" ] && printf "$*$newline"
[ "$LOG_FILE" ] && printf "$*$newline" | strip_esc >> $LOG_FILE
}
catlog() { #file
@@ -260,11 +266,11 @@ TOTAL_RESULT=0
INSTANCE=
CASENO=0
+CASENAME=
testcase() { # testfile
CASENO=$((CASENO+1))
- desc=`grep "^#[ \t]*description:" $1 | cut -f2- -d:`
- prlog -n "[$CASENO]$INSTANCE$desc"
+ CASENAME=`grep "^#[ \t]*description:" $1 | cut -f2- -d:`
}
checkreq() { # testfile
@@ -277,40 +283,68 @@ test_on_instance() { # testfile
grep -q "^#[ \t]*flags:.*instance" $1
}
+ktaptest() { # result comment
+ if [ "$KTAP" != "1" ]; then
+ return
+ fi
+
+ local result=
+ if [ "$1" = "1" ]; then
+ result="ok"
+ else
+ result="not ok"
+ fi
+ shift
+
+ local comment=$*
+ if [ "$comment" != "" ]; then
+ comment="# $comment"
+ fi
+
+ echo $CASENO $result $INSTANCE$CASENAME $comment
+}
+
eval_result() { # sigval
case $1 in
$PASS)
prlog " [${color_green}PASS${color_reset}]"
+ ktaptest 1
PASSED_CASES="$PASSED_CASES $CASENO"
return 0
;;
$FAIL)
prlog " [${color_red}FAIL${color_reset}]"
+ ktaptest 0
FAILED_CASES="$FAILED_CASES $CASENO"
return 1 # this is a bug.
;;
$UNRESOLVED)
prlog " [${color_blue}UNRESOLVED${color_reset}]"
+ ktaptest 0 UNRESOLVED
UNRESOLVED_CASES="$UNRESOLVED_CASES $CASENO"
return $UNRESOLVED_RESULT # depends on use case
;;
$UNTESTED)
prlog " [${color_blue}UNTESTED${color_reset}]"
+ ktaptest 1 SKIP
UNTESTED_CASES="$UNTESTED_CASES $CASENO"
return 0
;;
$UNSUPPORTED)
prlog " [${color_blue}UNSUPPORTED${color_reset}]"
+ ktaptest 1 SKIP
UNSUPPORTED_CASES="$UNSUPPORTED_CASES $CASENO"
return $UNSUPPORTED_RESULT # depends on use case
;;
$XFAIL)
prlog " [${color_green}XFAIL${color_reset}]"
+ ktaptest 1 XFAIL
XFAILED_CASES="$XFAILED_CASES $CASENO"
return 0
;;
*)
prlog " [${color_blue}UNDEFINED${color_reset}]"
+ ktaptest 0 error
UNDEFINED_CASES="$UNDEFINED_CASES $CASENO"
return 1 # this must be a test bug
;;
@@ -371,6 +405,7 @@ __run_test() { # testfile
run_test() { # testfile
local testname=`basename $1`
testcase $1
+ prlog -n "[$CASENO]$INSTANCE$CASENAME"
if [ ! -z "$LOG_FILE" ] ; then
local testlog=`mktemp $LOG_DIR/${CASENO}-${testname}-log.XXXXXX`
else
@@ -405,6 +440,17 @@ run_test() { # testfile
# load in the helper functions
. $TEST_DIR/functions
+if [ "$KTAP" = "1" ]; then
+ echo "TAP version 13"
+
+ casecount=`echo $TEST_CASES | wc -w`
+ for t in $TEST_CASES; do
+ test_on_instance $t || continue
+ casecount=$((casecount+1))
+ done
+ echo "1..${casecount}"
+fi
+
# Main loop
for t in $TEST_CASES; do
run_test $t
@@ -439,6 +485,17 @@ prlog "# of unsupported: " `echo $UNSUPPORTED_CASES | wc -w`
prlog "# of xfailed: " `echo $XFAILED_CASES | wc -w`
prlog "# of undefined(test bug): " `echo $UNDEFINED_CASES | wc -w`
+if [ "$KTAP" = "1" ]; then
+ echo -n "# Totals:"
+ echo -n " pass:"`echo $PASSED_CASES | wc -w`
+ echo -n " faii:"`echo $FAILED_CASES | wc -w`
+ echo -n " xfail:"`echo $XFAILED_CASES | wc -w`
+ echo -n " xpass:0"
+ echo -n " skip:"`echo $UNTESTED_CASES $UNSUPPORTED_CASES | wc -w`
+ echo -n " error:"`echo $UNRESOLVED_CASES $UNDEFINED_CASES | wc -w`
+ echo
+fi
+
cleanup
# if no error, return 0
diff --git a/tools/testing/selftests/ftrace/ftracetest-ktap b/tools/testing/selftests/ftrace/ftracetest-ktap
new file mode 100755
index 000000000000..b3284679ef3a
--- /dev/null
+++ b/tools/testing/selftests/ftrace/ftracetest-ktap
@@ -0,0 +1,8 @@
+#!/bin/sh -e
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# ftracetest-ktap: Wrapper to integrate ftracetest with the kselftest runner
+#
+# Copyright (C) Arm Ltd., 2023
+
+./ftracetest -K
---
base-commit: fe15c26ee26efa11741a7b632e9f23b01aca4cc6
change-id: 20230302-ftrace-kselftest-ktap-9d7878691557
Best regards,
--
Mark Brown <broonie(a)kernel.org>
Dzień dobry,
czy rozważali Państwo rozwój kwalifikacji językowych swoich pracowników?
Opracowaliśmy kursy językowe dla różnych branż, w których koncentrujemy się na podniesieniu poziomu słownictwa i jakości komunikacji wykorzystując autorską metodę, stworzoną specjalnie dla wymagającego biznesu.
Niestandardowy kurs on-line, dopasowany do profilu firmy i obszarów świadczonych usług, w szybkim czasie przyniesie efekty, które zwiększą komfort i jakość pracy, rozwijając możliwości biznesowe.
Zdalne szkolenie językowe to m.in. zajęcia z native speakerami, które w szybkim czasie nauczą pracowników rozmawiać za pomocą jasnego i zwięzłego języka Business English.
Czy mógłbym przedstawić więcej szczegółów i opowiedzieć jak działamy?
Pozdrawiam
Krzysztof Maj
When calling socket lookup from L2 (tc, xdp), VRF boundaries aren't
respected. This patchset fixes this by regarding the incoming device's
VRF attachment when performing the socket lookups from tc/xdp.
The first two patches are coding changes which facilitate this fix by
factoring out the tc helper's logic which was shared with cg/sk_skb
(which operate correctly).
The third patch contains the actual bugfix.
The fourth patch adds bpf tests for these lookup functions.
---
v2: Fixed uninitialized var in test patch (4).
Gilad Sever (4):
bpf: factor out socket lookup functions for the TC hookpoint.
bpf: Call __bpf_sk_lookup()/__bpf_skc_lookup() directly via TC
hookpoint
bpf: fix bpf socket lookup from tc/xdp to respect socket VRF bindings
selftests/bpf: Add tc_socket_lookup tests
net/core/filter.c | 132 +++++--
.../bpf/prog_tests/tc_socket_lookup.c | 341 ++++++++++++++++++
.../selftests/bpf/progs/tc_socket_lookup.c | 73 ++++
3 files changed, 525 insertions(+), 21 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/tc_socket_lookup.c
create mode 100644 tools/testing/selftests/bpf/progs/tc_socket_lookup.c
--
2.34.1