The testing effort is increasing throughout the community.
The tests are generally merged into the subsystem trees,
and are of relatively narrow interest. The patch volume on
linux-kselftest(a)vger.kernel.org makes it hard to follow
the changes to the framework, and discuss proposals.
Create a new ML for "all" of kselftests (tests and framework),
replacing the old list. Use the old list for framework changes
only. It would cause less churn to create a ML for just the
framework, but I prefer to use the shorter name for the list
which has much more practical use.
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
---
Posting as an RFC because we need to create the new ML.
CC: shuah(a)kernel.org
CC: linux-kselftest(a)vger.kernel.org
CC: workflows(a)vger.kernel.org
---
MAINTAINERS | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index c27f3190737f..9a03dc1c8974 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -12401,6 +12401,18 @@ S: Maintained
Q: https://patchwork.kernel.org/project/linux-kselftest/list/
T: git git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git
F: Documentation/dev-tools/kselftest*
+F: tools/testing/selftests/kselftest/
+F: tools/testing/selftests/lib/
+F: tools/testing/selftests/lib.mk
+F: tools/testing/selftests/Makefile
+F: tools/testing/selftests/*.sh
+F: tools/testing/selftests/*.h
+
+KERNEL SELFTEST TESTS
+M: Shuah Khan <shuah(a)kernel.org>
+M: Shuah Khan <skhan(a)linuxfoundation.org>
+L: linux-kselftest-all(a)vger.kernel.org
+S: Maintained
F: tools/testing/selftests/
KERNEL SMB3 SERVER (KSMBD)
--
2.46.2
The kernel has recently added support for shadow stacks, currently
x86 only using their CET feature but both arm64 and RISC-V have
equivalent features (GCS and Zicfiss respectively), I am actively
working on GCS[1]. With shadow stacks the hardware maintains an
additional stack containing only the return addresses for branch
instructions which is not generally writeable by userspace and ensures
that any returns are to the recorded addresses. This provides some
protection against ROP attacks and making it easier to collect call
stacks. These shadow stacks are allocated in the address space of the
userspace process.
Our API for shadow stacks does not currently offer userspace any
flexiblity for managing the allocation of shadow stacks for newly
created threads, instead the kernel allocates a new shadow stack with
the same size as the normal stack whenever a thread is created with the
feature enabled. The stacks allocated in this way are freed by the
kernel when the thread exits or shadow stacks are disabled for the
thread. This lack of flexibility and control isn't ideal, in the vast
majority of cases the shadow stack will be over allocated and the
implicit allocation and deallocation is not consistent with other
interfaces. As far as I can tell the interface is done in this manner
mainly because the shadow stack patches were in development since before
clone3() was implemented.
Since clone3() is readily extensible let's add support for specifying a
shadow stack when creating a new thread or process in a similar manner
to how the normal stack is specified, keeping the current implicit
allocation behaviour if one is not specified either with clone3() or
through the use of clone(). The user must provide a shadow stack
address and size, this must point to memory mapped for use as a shadow
stackby map_shadow_stack() with a shadow stack token at the top of the
stack.
Please note that the x86 portions of this code are build tested only, I
don't appear to have a system that can run CET avaible to me, I have
done testing with an integration into my pending work for GCS. There is
some possibility that the arm64 implementation may require the use of
clone3() and explicit userspace allocation of shadow stacks, this is
still under discussion.
Please further note that the token consumption done by clone3() is not
currently implemented in an atomic fashion, Rick indicated that he would
look into fixing this if people are OK with the implementation.
A new architecture feature Kconfig option for shadow stacks is added as
here, this was suggested as part of the review comments for the arm64
GCS series and since we need to detect if shadow stacks are supported it
seemed sensible to roll it in here.
[1] https://lore.kernel.org/r/20231009-arm64-gcs-v6-0-78e55deaa4dd@kernel.org/
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Changes in v9:
- Pull token validation earlier and report problems with an error return
to parent rather than signal delivery to the child.
- Verify that the top of the supplied shadow stack is VM_SHADOW_STACK.
- Rework token validation to only do the page mapping once.
- Drop no longer needed support for testing for signals in selftest.
- Fix typo in comments.
- Link to v8: https://lore.kernel.org/r/20240808-clone3-shadow-stack-v8-0-0acf37caf14c@ke…
Changes in v8:
- Fix token verification with user specified shadow stack.
- Don't track user managed shadow stacks for child processes.
- Link to v7: https://lore.kernel.org/r/20240731-clone3-shadow-stack-v7-0-a9532eebfb1d@ke…
Changes in v7:
- Rebase onto v6.11-rc1.
- Typo fixes.
- Link to v6: https://lore.kernel.org/r/20240623-clone3-shadow-stack-v6-0-9ee7783b1fb9@ke…
Changes in v6:
- Rebase onto v6.10-rc3.
- Ensure we don't try to free the parent shadow stack in error paths of
x86 arch code.
- Spelling fixes in userspace API document.
- Additional cleanups and improvements to the clone3() tests to support
the shadow stack tests.
- Link to v5: https://lore.kernel.org/r/20240203-clone3-shadow-stack-v5-0-322c69598e4b@ke…
Changes in v5:
- Rebase onto v6.8-rc2.
- Rework ABI to have the user allocate the shadow stack memory with
map_shadow_stack() and a token.
- Force inlining of the x86 shadow stack enablement.
- Move shadow stack enablement out into a shared header for reuse by
other tests.
- Link to v4: https://lore.kernel.org/r/20231128-clone3-shadow-stack-v4-0-8b28ffe4f676@ke…
Changes in v4:
- Formatting changes.
- Use a define for minimum shadow stack size and move some basic
validation to fork.c.
- Link to v3: https://lore.kernel.org/r/20231120-clone3-shadow-stack-v3-0-a7b8ed3e2acc@ke…
Changes in v3:
- Rebase onto v6.7-rc2.
- Remove stale shadow_stack in internal kargs.
- If a shadow stack is specified unconditionally use it regardless of
CLONE_ parameters.
- Force enable shadow stacks in the selftest.
- Update changelogs for RISC-V feature rename.
- Link to v2: https://lore.kernel.org/r/20231114-clone3-shadow-stack-v2-0-b613f8681155@ke…
Changes in v2:
- Rebase onto v6.7-rc1.
- Remove ability to provide preallocated shadow stack, just specify the
desired size.
- Link to v1: https://lore.kernel.org/r/20231023-clone3-shadow-stack-v1-0-d867d0b5d4d0@ke…
---
Mark Brown (8):
Documentation: userspace-api: Add shadow stack API documentation
selftests: Provide helper header for shadow stack testing
mm: Introduce ARCH_HAS_USER_SHADOW_STACK
fork: Add shadow stack support to clone3()
selftests/clone3: Remove redundant flushes of output streams
selftests/clone3: Factor more of main loop into test_clone3()
selftests/clone3: Allow tests to flag if -E2BIG is a valid error code
selftests/clone3: Test shadow stack support
Documentation/userspace-api/index.rst | 1 +
Documentation/userspace-api/shadow_stack.rst | 41 ++++
arch/x86/Kconfig | 1 +
arch/x86/include/asm/shstk.h | 11 +-
arch/x86/kernel/process.c | 2 +-
arch/x86/kernel/shstk.c | 103 +++++++---
fs/proc/task_mmu.c | 2 +-
include/linux/mm.h | 2 +-
include/linux/sched/task.h | 18 ++
include/uapi/linux/sched.h | 13 +-
kernel/fork.c | 114 +++++++++--
mm/Kconfig | 6 +
tools/testing/selftests/clone3/clone3.c | 230 ++++++++++++++++++----
tools/testing/selftests/clone3/clone3_selftests.h | 40 +++-
tools/testing/selftests/ksft_shstk.h | 63 ++++++
15 files changed, 560 insertions(+), 87 deletions(-)
---
base-commit: 8400291e289ee6b2bf9779ff1c83a291501f017b
change-id: 20231019-clone3-shadow-stack-15d40d2bf536
Best regards,
--
Mark Brown <broonie(a)kernel.org>
Misbehaving guests can cause bus locks to degrade the performance of
a system. Non-WB (write-back) and misaligned locked RMW
(read-modify-write) instructions are referred to as "bus locks" and
require system wide synchronization among all processors to guarantee
the atomicity. The bus locks can impose notable performance penalties
for all processors within the system.
Support for the Bus Lock Threshold is indicated by CPUID
Fn8000_000A_EDX[29] BusLockThreshold=1, the VMCB provides a Bus Lock
Threshold enable bit and an unsigned 16-bit Bus Lock Threshold count.
VMCB intercept bit
VMCB Offset Bits Function
14h 5 Intercept bus lock operations
Bus lock threshold count
VMCB Offset Bits Function
120h 15:0 Bus lock counter
During VMRUN, the bus lock threshold count is fetched and stored in an
internal count register. Prior to executing a bus lock within the
guest, the processor verifies the count in the bus lock register. If
the count is greater than zero, the processor executes the bus lock,
reducing the count. However, if the count is zero, the bus lock
operation is not performed, and instead, a Bus Lock Threshold #VMEXIT
is triggered to transfer control to the Virtual Machine Monitor (VMM).
A Bus Lock Threshold #VMEXIT is reported to the VMM with VMEXIT code
0xA5h, VMEXIT_BUSLOCK. EXITINFO1 and EXITINFO2 are set to 0 on
a VMEXIT_BUSLOCK. On a #VMEXIT, the processor writes the current
value of the Bus Lock Threshold Counter to the VMCB.
More details about the Bus Lock Threshold feature can be found in AMD
APM [1].
v1 -> v2
- Incorporated misc review comments from Sean.
- Removed bus_lock_counter module parameter.
- Set the value of bus_lock_counter to zero by default and reload the value by 1
in bus lock exit handler.
- Add documentation for the behavioral difference for KVM_EXIT_BUS_LOCK.
- Improved selftest for buslock to work on SVM and VMX.
- Rewrite the commit messages.
Patches are prepared on kvm-next/next (0cdcc99eeaed)
Testing done:
- Added a selftest for the Bus Lock Threshold functionality.
- The bus lock threshold selftest has been tested on both Intel and AMD platforms.
- Tested the Bus Lock Threshold functionality on SEV and SEV-ES guests.
- Tested the Bus Lock Threshold functionality on nested guests.
v1: https://lore.kernel.org/kvm/20240709175145.9986-4-manali.shukla@amd.com/T/
[1]: AMD64 Architecture Programmer's Manual Pub. 24593, April 2024,
Vol 2, 15.14.5 Bus Lock Threshold.
https://bugzilla.kernel.org/attachment.cgi?id=306250
Manali Shukla (3):
x86/cpu: Add virt tag in /proc/cpuinfo
x86/cpufeatures: Add CPUID feature bit for the Bus Lock Threshold
KVM: X86: Add documentation about behavioral difference for
KVM_EXIT_BUS_LOCK
Nikunj A Dadhania (2):
KVM: SVM: Enable Bus lock threshold exit
KVM: selftests: Add bus lock exit test
Documentation/virt/kvm/api.rst | 5 +
arch/x86/include/asm/cpufeature.h | 1 +
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/svm.h | 5 +-
arch/x86/include/uapi/asm/svm.h | 2 +
arch/x86/kernel/cpu/mkcapflags.sh | 3 +
arch/x86/kernel/cpu/proc.c | 5 +
arch/x86/kvm/svm/nested.c | 12 ++
arch/x86/kvm/svm/svm.c | 29 ++++
tools/testing/selftests/kvm/Makefile | 1 +
.../selftests/kvm/x86_64/kvm_buslock_test.c | 130 ++++++++++++++++++
11 files changed, 193 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/kvm/x86_64/kvm_buslock_test.c
base-commit: 0cdcc99eeaedf2422c80d75760293fdbb476cec1
--
2.34.1
Hi all,
This is part of a hackathon organized by LKCAMP[1], focused on writing
tests using KUnit. We reached out a while ago asking for advice on what
would be a useful contribution[2] and ended up choosing data structures
that did not yet have tests.
This patch adds tests for the llist data structure, defined in
include/linux/llist.h, and is inspired by the KUnit tests for the doubly
linked list in lib/list-test.c[3].
It is important to note that this patch depends on the patch referenced
in [4], as it utilizes the newly created lib/tests/ subdirectory.
[1] https://lkcamp.dev/about/
[2] https://lore.kernel.org/all/Zktnt7rjKryTh9-N@arch/
[3] https://elixir.bootlin.com/linux/latest/source/lib/list-test.c
[4] https://lore.kernel.org/all/20240720181025.work.002-kees@kernel.org/
---
Changes in v3:
- Resolved checkpatch warnings:
- Renamed tests for macros starting with 'for_each'
- Removed link from commit message
- Replaced hardcoded constants with ENTRIES_SIZE
- Updated initialization of llist_node array
- Fixed typos
- Update Kconfig.debug message for llist_kunit
Changes in v2:
- Add MODULE_DESCRIPTION()
- Move the tests from lib/llist_kunit.c to lib/tests/llist_kunit.c
- Change the license from "GPL v2" to "GPL"
Artur Alves (1):
lib/llist_kunit.c: add KUnit tests for llist
lib/Kconfig.debug | 11 ++
lib/tests/Makefile | 1 +
lib/tests/llist_kunit.c | 358 ++++++++++++++++++++++++++++++++++++++++
3 files changed, 370 insertions(+)
create mode 100644 lib/tests/llist_kunit.c
--
2.46.0
The SLUB changes for 6.12 included new kunit tests that resulted in
noisy warnings, which we normally suppress, and a boot lockup in some
configurations in case the kunit tests are built-in.
The warnings are addressed in Patch 1.
The lockups I couldn't reproduce, but inspecting boot initialization
order makes me suspect the tests (which call few RCU operations) are
being executed a bit too early before RCU finishes initialization.
Moving the exection later seems to do the trick, so I'd like to ask
kunit folks to ack this change (Patch 2). If RCU folks have any
insights, it would be welcome too.
So these are now fixes for 4e1c44b3db79 ("kunit, slub: add
test_kfree_rcu() and test_leak_destroy()")
Once sent as a full patch, I also want to include comment fixes from
Ulad for kvfree_rcu_queue_batch():
https://lore.kernel.org/all/CA%2BKHdyV%3D0dpJX_v_tcuTQ-_ree-Yb9ch3F_HqfT4Yn…
The plan is to take the fixes via slab tree for a 6.12 rcX.
Signed-off-by: Vlastimil Babka <vbabka(a)suse.cz>
---
Vlastimil Babka (2):
mm, slab: suppress warnings in test_leak_destroy kunit test
kunit: move call to kunit_run_all_tests() after rcu_end_inkernel_boot()
init/main.c | 4 ++--
lib/slub_kunit.c | 4 ++--
mm/slab.h | 6 ++++++
mm/slab_common.c | 5 +++--
mm/slub.c | 5 +++--
5 files changed, 16 insertions(+), 8 deletions(-)
---
base-commit: 9852d85ec9d492ebef56dc5f229416c925758edc
change-id: 20240930-b4-slub-kunit-fix-6fba4d1c1742
Best regards,
--
Vlastimil Babka <vbabka(a)suse.cz>
rxq contains a pointer to the device from where
the redirect happened. Currently, the BPF program
that was executed after a redirect via BPF_MAP_TYPE_DEVMAP*
does not have it set.
Add bugfix and related selftest.
Signed-off-by: Florian Kauer <florian.kauer(a)linutronix.de>
---
Changes in v4:
- return -> goto out_close, thanks Toke
- Link to v3: https://lore.kernel.org/r/20240909-devel-koalo-fix-ingress-ifindex-v3-0-662…
Changes in v3:
- initialize skel to NULL, thanks Stanislav
- Link to v2: https://lore.kernel.org/r/20240906-devel-koalo-fix-ingress-ifindex-v2-0-4ca…
Changes in v2:
- changed fixes tag
- added selftest
- Link to v1: https://lore.kernel.org/r/20240905-devel-koalo-fix-ingress-ifindex-v1-1-d12…
---
Florian Kauer (2):
bpf: devmap: provide rxq after redirect
bpf: selftests: send packet to devmap redirect XDP
kernel/bpf/devmap.c | 11 +-
.../selftests/bpf/prog_tests/xdp_devmap_attach.c | 114 +++++++++++++++++++--
2 files changed, 115 insertions(+), 10 deletions(-)
---
base-commit: 8e69c96df771ab469cec278edb47009351de4da6
change-id: 20240905-devel-koalo-fix-ingress-ifindex-b9293d471db6
Best regards,
--
Florian Kauer <florian.kauer(a)linutronix.de>
step_after_suspend_test fails with device busy error while
writing to /sys/power/state to start suspend. The test believes
it failed to enter suspend state with
$ sudo ./step_after_suspend_test
TAP version 13
Bail out! Failed to enter Suspend state
However, in the kernel message, I indeed see the system get
suspended and then wake up later.
[611172.033108] PM: suspend entry (s2idle)
[611172.044940] Filesystems sync: 0.006 seconds
[611172.052254] Freezing user space processes
[611172.059319] Freezing user space processes completed (elapsed 0.001 seconds)
[611172.067920] OOM killer disabled.
[611172.072465] Freezing remaining freezable tasks
[611172.080332] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[611172.089724] printk: Suspending console(s) (use no_console_suspend to debug)
[611172.117126] serial 00:03: disabled
some other hardware get reconnected
[611203.136277] OOM killer enabled.
[611203.140637] Restarting tasks ...
[611203.141135] usb 1-8.1: USB disconnect, device number 7
[611203.141755] done.
[611203.155268] random: crng reseeded on system resumption
[611203.162059] PM: suspend exit
After investigation, I noticed that for the code block
if (write(power_state_fd, "mem", strlen("mem")) != strlen("mem"))
ksft_exit_fail_msg("Failed to enter Suspend state\n");
The write will return -1 and errno is set to 16 (device busy).
It should be caused by the write function is not successfully returned
before the system suspend and the return value get messed when waking up.
As a result, It may be better to check the time passed of those few
instructions to determine whether the suspend is executed correctly for
it is pretty hard to execute those few lines for 5 seconds.
The timer to wake up the system is set to expire after 5 seconds and
no re-arm. If the timer remaining time is 0 second and 0 nano secomd,
it means the timer expired and wake the system up. Otherwise, the system
could be considered to enter the suspend state failed if there is any
remaining time.
After appling this patch, the test would not fail for it believes the
system does not go to suspend by mistake. It now could continue to the
rest part of the test after suspend.
Fixes: bfd092b8c272 ("selftests: breakpoint: add step_after_suspend_test")
Reported-by: Sinadin Shan <sinadin.shan(a)oracle.com>
Signed-off-by: Yifei Liu <yifei.l.liu(a)oracle.com>
---
v4->v5: Remove the above quotes in the first part.
remove the incorrect format which could confuse the git.
---
.../testing/selftests/breakpoints/step_after_suspend_test.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/breakpoints/step_after_suspend_test.c b/tools/testing/selftests/breakpoints/step_after_suspend_test.c
index dfec31fb9b30d..8d275f03e977f 100644
--- a/tools/testing/selftests/breakpoints/step_after_suspend_test.c
+++ b/tools/testing/selftests/breakpoints/step_after_suspend_test.c
@@ -152,7 +152,10 @@ void suspend(void)
if (err < 0)
ksft_exit_fail_msg("timerfd_settime() failed\n");
- if (write(power_state_fd, "mem", strlen("mem")) != strlen("mem"))
+ system("(echo mem > /sys/power/state) 2> /dev/null");
+
+ timerfd_gettime(timerfd, &spec);
+ if (spec.it_value.tv_sec != 0 || spec.it_value.tv_nsec != 0)
ksft_exit_fail_msg("Failed to enter Suspend state\n");
close(timerfd);
--
2.46.0
Thanks to Miroslav, Petr and Marcos for the reviews!
V4:
Use variable for /sys/kernel/debug.
Be consistent with "" around variables.
Fix path in commit message to /sys/kernel/debug/kprobes/enabled.
V3:
Save and restore kprobe state also when test fails, by integrating it
into setup_config() and cleanup().
Rename SYSFS variables in a more logical way.
Sort test modules in alphabetical order.
Rename module description.
V2:
Save and restore kprobe state.
Michael Vetter (3):
selftests: livepatch: rename KLP_SYSFS_DIR to SYSFS_KLP_DIR
selftests: livepatch: save and restore kprobe state
selftests: livepatch: test livepatching a kprobed function
tools/testing/selftests/livepatch/Makefile | 3 +-
.../testing/selftests/livepatch/functions.sh | 19 ++++--
.../selftests/livepatch/test-kprobe.sh | 62 +++++++++++++++++++
.../selftests/livepatch/test_modules/Makefile | 3 +-
.../livepatch/test_modules/test_klp_kprobe.c | 38 ++++++++++++
5 files changed, 117 insertions(+), 8 deletions(-)
create mode 100755 tools/testing/selftests/livepatch/test-kprobe.sh
create mode 100644 tools/testing/selftests/livepatch/test_modules/test_klp_kprobe.c
--
2.46.1
The SLUB changes for 6.12 included new kunit tests that resulted in
noisy warnings, which we normally suppress, and a boot lockup in some
configurations in case the kunit tests are built-in.
The warnings are addressed in Patch 1.
The lockups I couldn't reproduce, but inspecting boot initialization
order makes me suspect the test_kfree_rcu() calling kfree_rcu() which is
too early before RCU finishes initialization. Moving the exection later
was tried but broke tests marking their code as __init so Patch 2 skips
the test when the slub kunit tests are built-in.
So these are now fixes for 4e1c44b3db79 ("kunit, slub: add
test_kfree_rcu() and test_leak_destroy()")
The plan is to take the fixes via slab tree for a 6.12 rcX.
Signed-off-by: Vlastimil Babka <vbabka(a)suse.cz>
---
Changes in v2:
- patch 2 skips the test when built-in instead of moving kunit execution
later
- Link to v1: https://lore.kernel.org/r/20240930-b4-slub-kunit-fix-v1-0-32ca9dbbbc11@suse…
---
Vlastimil Babka (2):
mm, slab: suppress warnings in test_leak_destroy kunit test
slub/kunit: skip test_kfree_rcu when the slub kunit test is built-in
lib/slub_kunit.c | 18 ++++++++++++------
mm/slab.h | 6 ++++++
mm/slab_common.c | 5 +++--
mm/slub.c | 5 +++--
4 files changed, 24 insertions(+), 10 deletions(-)
---
base-commit: 9852d85ec9d492ebef56dc5f229416c925758edc
change-id: 20240930-b4-slub-kunit-fix-6fba4d1c1742
Best regards,
--
Vlastimil Babka <vbabka(a)suse.cz>
Some applications rely on placing data in free bits addresses allocated
by mmap. Various architectures (eg. x86, arm64, powerpc) restrict the
address returned by mmap to be less than the 48-bit address space,
unless the hint address uses more than 47 bits (the 48th bit is reserved
for the kernel address space).
The riscv architecture needs a way to similarly restrict the virtual
address space. On the riscv port of OpenJDK an error is thrown if
attempted to run on the 57-bit address space, called sv57 [1]. golang
has a comment that sv57 support is not complete, but there are some
workarounds to get it to mostly work [2].
These applications work on x86 because x86 does an implicit 47-bit
restriction of mmap() address that contain a hint address that is less
than 48 bits.
Instead of implicitly restricting the address space on riscv (or any
current/future architecture), provide a flag to the personality syscall
that can be used to ensure an application works in any arbitrary VA
space. A similar feature has already been implemented by the personality
syscall in ADDR_LIMIT_32BIT.
This flag will also allow seemless compatibility between all
architectures, so applications like Go and OpenJDK that use bits in a
virtual address can request the exact number of bits they need in a
generic way. The flag can be checked inside of vm_unmapped_area() so
that this flag does not have to be handled individually by each
architecture.
Link:
https://github.com/openjdk/jdk/blob/f080b4bb8a75284db1b6037f8c00ef3b1ef1add…
[1]
Link:
https://github.com/golang/go/blob/9e8ea567c838574a0f14538c0bbbd83c3215aa55/…
[2]
To: Arnd Bergmann <arnd(a)arndb.de>
To: Richard Henderson <richard.henderson(a)linaro.org>
To: Ivan Kokshaysky <ink(a)jurassic.park.msu.ru>
To: Matt Turner <mattst88(a)gmail.com>
To: Vineet Gupta <vgupta(a)kernel.org>
To: Russell King <linux(a)armlinux.org.uk>
To: Guo Ren <guoren(a)kernel.org>
To: Huacai Chen <chenhuacai(a)kernel.org>
To: WANG Xuerui <kernel(a)xen0n.name>
To: Thomas Bogendoerfer <tsbogend(a)alpha.franken.de>
To: James E.J. Bottomley <James.Bottomley(a)HansenPartnership.com>
To: Helge Deller <deller(a)gmx.de>
To: Michael Ellerman <mpe(a)ellerman.id.au>
To: Nicholas Piggin <npiggin(a)gmail.com>
To: Christophe Leroy <christophe.leroy(a)csgroup.eu>
To: Naveen N Rao <naveen(a)kernel.org>
To: Alexander Gordeev <agordeev(a)linux.ibm.com>
To: Gerald Schaefer <gerald.schaefer(a)linux.ibm.com>
To: Heiko Carstens <hca(a)linux.ibm.com>
To: Vasily Gorbik <gor(a)linux.ibm.com>
To: Christian Borntraeger <borntraeger(a)linux.ibm.com>
To: Sven Schnelle <svens(a)linux.ibm.com>
To: Yoshinori Sato <ysato(a)users.sourceforge.jp>
To: Rich Felker <dalias(a)libc.org>
To: John Paul Adrian Glaubitz <glaubitz(a)physik.fu-berlin.de>
To: David S. Miller <davem(a)davemloft.net>
To: Andreas Larsson <andreas(a)gaisler.com>
To: Thomas Gleixner <tglx(a)linutronix.de>
To: Ingo Molnar <mingo(a)redhat.com>
To: Borislav Petkov <bp(a)alien8.de>
To: Dave Hansen <dave.hansen(a)linux.intel.com>
To: x86(a)kernel.org
To: H. Peter Anvin <hpa(a)zytor.com>
To: Andy Lutomirski <luto(a)kernel.org>
To: Peter Zijlstra <peterz(a)infradead.org>
To: Muchun Song <muchun.song(a)linux.dev>
To: Andrew Morton <akpm(a)linux-foundation.org>
To: Liam R. Howlett <Liam.Howlett(a)oracle.com>
To: Vlastimil Babka <vbabka(a)suse.cz>
To: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
To: Shuah Khan <shuah(a)kernel.org>
To: Christoph Hellwig <hch(a)infradead.org>
To: Michal Hocko <mhocko(a)suse.com>
To: "Kirill A. Shutemov" <kirill(a)shutemov.name>
To: Chris Torek <chris.torek(a)gmail.com>
Cc: linux-arch(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Cc: linux-alpha(a)vger.kernel.org
Cc: linux-snps-arc(a)lists.infradead.org
Cc: linux-arm-kernel(a)lists.infradead.org
Cc: linux-csky(a)vger.kernel.org
Cc: loongarch(a)lists.linux.dev
Cc: linux-mips(a)vger.kernel.org
Cc: linux-parisc(a)vger.kernel.org
Cc: linuxppc-dev(a)lists.ozlabs.org
Cc: linux-s390(a)vger.kernel.org
Cc: linux-sh(a)vger.kernel.org
Cc: sparclinux(a)vger.kernel.org
Cc: linux-mm(a)kvack.org
Cc: linux-kselftest(a)vger.kernel.org
Cc: linux-abi-devel(a)lists.sourceforge.net
Signed-off-by: Charlie Jenkins <charlie(a)rivosinc.com>
Changes in v2:
- Added much greater detail to cover letter
- Removed all code that touched architecture specific code and was able
to factor this out into all generic functions, except for flags that
needed to be added to vm_unmapped_area_info
- Made this an RFC since I have only tested it on riscv and x86
- Link to v1: https://lore.kernel.org/r/20240827-patches-below_hint_mmap-v1-0-46ff2eb9022…
Changes in v3:
- Use a personality flag instead of an mmap flag
- Link to v2: https://lore.kernel.org/r/20240829-patches-below_hint_mmap-v2-0-638a28d9eae…
---
Charlie Jenkins (2):
mm: Add personality flag to limit address to 47 bits
selftests/mm: Create ADDR_LIMIT_47BIT test
include/uapi/linux/personality.h | 1 +
mm/mmap.c | 3 ++
tools/testing/selftests/mm/.gitignore | 1 +
tools/testing/selftests/mm/Makefile | 1 +
tools/testing/selftests/mm/map_47bit_personality.c | 34 ++++++++++++++++++++++
5 files changed, 40 insertions(+)
---
base-commit: 5be63fc19fcaa4c236b307420483578a56986a37
change-id: 20240827-patches-below_hint_mmap-b13d79ae1c55
--
- Charlie
From: Tycho Andersen <tandersen(a)netflix.com>
Zbigniew mentioned at Linux Plumber's that systemd is interested in
switching to execveat() for service execution, but can't, because the
contents of /proc/pid/comm are the file descriptor which was used,
instead of the path to the binary. This makes the output of tools like
top and ps useless, especially in a world where most fds are opened
CLOEXEC so the number is truly meaningless.
Change exec path to fix up /proc/pid/comm in the case where we have
allocated one of these synthetic paths in bprm_init(). This way the actual
exec machinery is unchanged, but cosmetically the comm looks reasonable to
admins investigating things.
Signed-off-by: Tycho Andersen <tandersen(a)netflix.com>
Suggested-by: Zbigniew Jędrzejewski-Szmek <zbyszek(a)in.waw.pl>
CC: Aleksa Sarai <cyphar(a)cyphar.com>
Link: https://github.com/uapi-group/kernel-features#set-comm-field-before-exec
---
v2: * drop the flag, everyone :)
* change the rendered value to f_path.dentry->d_name.name instead of
argv[0], Eric
v3: * fix up subject line, Eric
---
fs/exec.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/fs/exec.c b/fs/exec.c
index dad402d55681..9520359a8dcc 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1416,7 +1416,18 @@ int begin_new_exec(struct linux_binprm * bprm)
set_dumpable(current->mm, SUID_DUMP_USER);
perf_event_exec();
- __set_task_comm(me, kbasename(bprm->filename), true);
+
+ /*
+ * If fdpath was set, execveat() made up a path that will
+ * probably not be useful to admins running ps or similar.
+ * Let's fix it up to be something reasonable.
+ */
+ if (bprm->fdpath) {
+ BUILD_BUG_ON(TASK_COMM_LEN > DNAME_INLINE_LEN);
+ __set_task_comm(me, bprm->file->f_path.dentry->d_name.name, true);
+ } else {
+ __set_task_comm(me, kbasename(bprm->filename), true);
+ }
/* An exec changes our domain. We are no longer part of the thread
group */
base-commit: baeb9a7d8b60b021d907127509c44507539c15e5
--
2.34.1
virtio-net have two usage of hashes: one is RSS and another is hash
reporting. Conventionally the hash calculation was done by the VMM.
However, computing the hash after the queue was chosen defeats the
purpose of RSS.
Another approach is to use eBPF steering program. This approach has
another downside: it cannot report the calculated hash due to the
restrictive nature of eBPF.
Introduce the code to compute hashes to the kernel in order to overcome
thse challenges.
An alternative solution is to extend the eBPF steering program so that it
will be able to report to the userspace, but it is based on context
rewrites, which is in feature freeze. We can adopt kfuncs, but they will
not be UAPIs. We opt to ioctl to align with other relevant UAPIs (KVM
and vhost_net).
The patches for QEMU to use this new feature was submitted as RFC and
is available at:
https://patchew.org/QEMU/20240915-hash-v3-0-79cb08d28647@daynix.com/
This work was presented at LPC 2024:
https://lpc.events/event/18/contributions/1963/
V1 -> V2:
Changed to introduce a new BPF program type.
Signed-off-by: Akihiko Odaki <akihiko.odaki(a)daynix.com>
---
Changes in v4:
- Moved tun_vnet_hash_ext to if_tun.h.
- Renamed virtio_net_toeplitz() to virtio_net_toeplitz_calc().
- Replaced htons() with cpu_to_be16().
- Changed virtio_net_hash_rss() to return void.
- Reordered variable declarations in virtio_net_hash_rss().
- Removed virtio_net_hdr_v1_hash_from_skb().
- Updated messages of "tap: Pad virtio header with zero" and
"tun: Pad virtio header with zero".
- Fixed vnet_hash allocation size.
- Ensured to free vnet_hash when destructing tun_struct.
- Link to v3: https://lore.kernel.org/r/20240915-rss-v3-0-c630015db082@daynix.com
Changes in v3:
- Reverted back to add ioctl.
- Split patch "tun: Introduce virtio-net hashing feature" into
"tun: Introduce virtio-net hash reporting feature" and
"tun: Introduce virtio-net RSS".
- Changed to reuse hash values computed for automq instead of performing
RSS hashing when hash reporting is requested but RSS is not.
- Extracted relevant data from struct tun_struct to keep it minimal.
- Added kernel-doc.
- Changed to allow calling TUNGETVNETHASHCAP before TUNSETIFF.
- Initialized num_buffers with 1.
- Added a test case for unclassified packets.
- Fixed error handling in tests.
- Changed tests to verify that the queue index will not overflow.
- Rebased.
- Link to v2: https://lore.kernel.org/r/20231015141644.260646-1-akihiko.odaki@daynix.com
---
Akihiko Odaki (9):
skbuff: Introduce SKB_EXT_TUN_VNET_HASH
virtio_net: Add functions for hashing
net: flow_dissector: Export flow_keys_dissector_symmetric
tap: Pad virtio header with zero
tun: Pad virtio header with zero
tun: Introduce virtio-net hash reporting feature
tun: Introduce virtio-net RSS
selftest: tun: Add tests for virtio-net hashing
vhost/net: Support VIRTIO_NET_F_HASH_REPORT
Documentation/networking/tuntap.rst | 7 +
drivers/net/Kconfig | 1 +
drivers/net/tap.c | 2 +-
drivers/net/tun.c | 255 ++++++++++++--
drivers/vhost/net.c | 16 +-
include/linux/if_tun.h | 5 +
include/linux/skbuff.h | 3 +
include/linux/virtio_net.h | 174 +++++++++
include/net/flow_dissector.h | 1 +
include/uapi/linux/if_tun.h | 71 ++++
net/core/flow_dissector.c | 3 +-
net/core/skbuff.c | 4 +
tools/testing/selftests/net/Makefile | 2 +-
tools/testing/selftests/net/tun.c | 666 ++++++++++++++++++++++++++++++++++-
14 files changed, 1170 insertions(+), 40 deletions(-)
---
base-commit: 752ebcbe87aceeb6334e846a466116197711a982
change-id: 20240403-rss-e737d89efa77
Best regards,
--
Akihiko Odaki <akihiko.odaki(a)daynix.com>
This patch allows progs to elide a null check on statically known map
lookup keys. In other words, if the verifier can statically prove that
the lookup will be in-bounds, allow the prog to drop the null check.
This is useful for two reasons:
1. Large numbers of nullness checks (especially when they cannot fail)
unnecessarily pushes prog towards BPF_COMPLEXITY_LIMIT_JMP_SEQ.
2. It forms a tighter contract between programmer and verifier.
For (1), bpftrace is starting to make heavier use of percpu scratch
maps. As a result, for user scripts with large number of unrolled loops,
we are starting to hit jump complexity verification errors. These
percpu lookups cannot fail anyways, as we only use static key values.
Eliding nullness probably results in less work for verifier as well.
For (2), percpu scratch maps are often used as a larger stack, as the
currrent stack is limited to 512 bytes. In these situations, it is
desirable for the programmer to express: "this lookup should never fail,
and if it does, it means I messed up the code". By omitting the null
check, the programmer can "ask" the verifier to double check the logic.
Changes in v3:
* Check if stack is (erroneously) growing upwards
* Mention in commit message why existing tests needed change
Changes in v2:
* Added a check for when R2 is not a ptr to stack
* Added a check for when stack is uninitialized (no stack slot yet)
* Updated existing tests to account for null elision
* Added test case for when R2 can be both const and non-const
Daniel Xu (2):
bpf: verifier: Support eliding map lookup nullness
bpf: selftests: verifier: Add nullness elision tests
kernel/bpf/verifier.c | 67 ++++++-
tools/testing/selftests/bpf/progs/iters.c | 14 +-
.../selftests/bpf/progs/map_kptr_fail.c | 2 +-
.../bpf/progs/verifier_array_access.c | 166 ++++++++++++++++++
.../selftests/bpf/progs/verifier_map_in_map.c | 2 +-
.../testing/selftests/bpf/verifier/map_kptr.c | 2 +-
6 files changed, 242 insertions(+), 11 deletions(-)
--
2.46.0
From: Joshua Hahn <joshua.hahn6(a)gmail.com>
v2 -> v3: Signed-off-by & renamed subject for clarity.
v1 -> v2: Edited commit messages for clarity.
Niced CPU usage is a metric reported in host-level /prot/stat, but is
not reported in cgroup-level statistics in cpu.stat. However, when a
host contains multiple tasks across different workloads, it becomes
difficult to gauge how much of the task is being spent on niced
processes based on /proc/stat alone, since host-level metrics do not
provide this cgroup-level granularity.
Exposing this metric will allow users to accurately probe the niced CPU
metric for each workload, and make more informed decisions when
directing higher priority tasks.
Joshua Hahn (2):
Tracking cgroup-level niced CPU time
Selftests for niced CPU statistics
include/linux/cgroup-defs.h | 1 +
kernel/cgroup/rstat.c | 16 ++++-
tools/testing/selftests/cgroup/test_cpu.c | 72 +++++++++++++++++++++++
3 files changed, 86 insertions(+), 3 deletions(-)
--
2.43.5
The recent addition of support for testing with the x86 specific quirk
KVM_X86_QUIRK_SLOT_ZAP_ALL disabled in the generic memslot tests broke the
build of the KVM selftests for all other architectures:
In file included from include/kvm_util.h:8,
from include/memstress.h:13,
from memslot_modification_stress_test.c:21:
memslot_modification_stress_test.c: In function ‘main’:
memslot_modification_stress_test.c:176:38: error: ‘KVM_X86_QUIRK_SLOT_ZAP_ALL’ undeclared (first use in this function)
176 | KVM_X86_QUIRK_SLOT_ZAP_ALL);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
Add __x86_64__ guard defines to avoid building the relevant code on other
architectures.
Fixes: 61de4c34b51c ("KVM: selftests: Test memslot move in memslot_perf_test with quirk disabled")
Fixes: 218f6415004a ("KVM: selftests: Allow slot modification stress test with quirk disabled")
Reported-by: Aishwarya TCV <aishwarya.tcv(a)arm.com>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
This is obviously disruptive for testing of KVM changes on non-x86
architectures.
---
tools/testing/selftests/kvm/memslot_modification_stress_test.c | 2 ++
tools/testing/selftests/kvm/memslot_perf_test.c | 6 ++++++
2 files changed, 8 insertions(+)
diff --git a/tools/testing/selftests/kvm/memslot_modification_stress_test.c b/tools/testing/selftests/kvm/memslot_modification_stress_test.c
index e3343f0df9e1..c81a84990eab 100644
--- a/tools/testing/selftests/kvm/memslot_modification_stress_test.c
+++ b/tools/testing/selftests/kvm/memslot_modification_stress_test.c
@@ -169,12 +169,14 @@ int main(int argc, char *argv[])
case 'i':
p.nr_iterations = atoi_positive("Number of iterations", optarg);
break;
+#ifdef __x86_64__
case 'q':
p.disable_slot_zap_quirk = true;
TEST_REQUIRE(kvm_check_cap(KVM_CAP_DISABLE_QUIRKS2) &
KVM_X86_QUIRK_SLOT_ZAP_ALL);
break;
+#endif
case 'h':
default:
help(argv[0]);
diff --git a/tools/testing/selftests/kvm/memslot_perf_test.c b/tools/testing/selftests/kvm/memslot_perf_test.c
index 893366982f77..989ffe0d047f 100644
--- a/tools/testing/selftests/kvm/memslot_perf_test.c
+++ b/tools/testing/selftests/kvm/memslot_perf_test.c
@@ -113,7 +113,9 @@ static_assert(ATOMIC_BOOL_LOCK_FREE == 2, "atomic bool is not lockless");
static sem_t vcpu_ready;
static bool map_unmap_verify;
+#ifdef __x86_64__
static bool disable_slot_zap_quirk;
+#endif
static bool verbose;
#define pr_info_v(...) \
@@ -579,8 +581,10 @@ static bool test_memslot_move_prepare(struct vm_data *data,
uint32_t guest_page_size = data->vm->page_size;
uint64_t movesrcgpa, movetestgpa;
+#ifdef __x86_64__
if (disable_slot_zap_quirk)
vm_enable_cap(data->vm, KVM_CAP_DISABLE_QUIRKS2, KVM_X86_QUIRK_SLOT_ZAP_ALL);
+#endif
movesrcgpa = vm_slot2gpa(data, data->nslots - 1);
@@ -971,11 +975,13 @@ static bool parse_args(int argc, char *argv[],
case 'd':
map_unmap_verify = true;
break;
+#ifdef __x86_64__
case 'q':
disable_slot_zap_quirk = true;
TEST_REQUIRE(kvm_check_cap(KVM_CAP_DISABLE_QUIRKS2) &
KVM_X86_QUIRK_SLOT_ZAP_ALL);
break;
+#endif
case 's':
targs->nslots = atoi_paranoid(optarg);
if (targs->nslots <= 1 && targs->nslots != -1) {
---
base-commit: 9852d85ec9d492ebef56dc5f229416c925758edc
change-id: 20240930-kvm-build-breakage-a542f46d78f9
Best regards,
--
Mark Brown <broonie(a)kernel.org>
The HID test cases actually run tests using the run-hid-tools-tests.sh
script. However, if installed with "make install", the run-hid-tools-tests.sh
script will not be copied over, resulting in the following error message.
make -C tools/testing/selftests/ TARGETS=hid install \
INSTALL_PATH=$KSFT_INSTALL_PATH
cd $KSFT_INSTALL_PATH
./run_kselftest.sh -c hid
selftests: hid: hid-core.sh
bash: ./run-hid-tools-tests.sh: No such file or directory
So add the run-hid-tools-tests.sh script to the TEST_FILES in the Makefile.
Signed-off-by: Yun Lu <luyun(a)kylinos.cn>
---
tools/testing/selftests/hid/Makefile | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tools/testing/selftests/hid/Makefile b/tools/testing/selftests/hid/Makefile
index 72be55ac4bdf..38ae31bb07b5 100644
--- a/tools/testing/selftests/hid/Makefile
+++ b/tools/testing/selftests/hid/Makefile
@@ -17,6 +17,8 @@ TEST_PROGS += hid-tablet.sh
TEST_PROGS += hid-usb_crash.sh
TEST_PROGS += hid-wacom.sh
+TEST_FILES := run-hid-tools-tests.sh
+
CXX ?= $(CROSS_COMPILE)g++
HOSTPKG_CONFIG := pkg-config
--
2.27.0
If you wish to utilise a pidfd interface to refer to the current process
(from the point of view of userland - from the kernel point of view - the
thread group leader), it is rather cumbersome, requiring something like:
int pidfd = pidfd_open(getpid(), 0);
...
close(pidfd);
Or the equivalent call opening /proc/self. It is more convenient to use a
sentinel value to indicate to an interface that accepts a pidfd that we
simply wish to refer to the current process.
This series introduces such a sentinel, PIDFD_SELF, which can be passed as
the pidfd in this instance rather than having to establish a dummy fd for
this purpose.
The only pidfd interface where this is particularly useful is
process_madvise(), which provides the motivation for this series. However,
as this is a general interface, we ensure that all pidfd interfaces can
handle this correctly.
We ensure that pidfd_send_signal() and pidfd_getfd() work correctly, and
assert as much in selftests. All other interfaces except setns() will work
implicitly with this new interface, however it doesn't make sense to test
waitid(P_PIDFD, ...) as waiting on ourselves is a blocking operation.
In the case of setns() we explicitly disallow use of PIDFD_SELF as it
doesn't make sense to obtain the namespaces of our own process, and it
would require work to implement this functionality there that would be of
no use.
We also do not provide the ability to utilise PIDFD_SELF in ordinary fd
operations such as open() or poll(), as this would require extensive work
and be of no real use.
Lorenzo Stoakes (3):
pidfd: refactor pidfd_get_pid/to_pid() and de-duplicate pid lookup
pidfd: add PIDFD_SELF sentinel to refer to own process
selftests: pidfd: add tests for PIDFD_SELF
include/linux/pid.h | 43 +++++++++++-
include/uapi/linux/pidfd.h | 3 +
kernel/exit.c | 3 +-
kernel/nsproxy.c | 1 +
kernel/pid.c | 70 +++++++++++++------
kernel/signal.c | 26 ++-----
tools/testing/selftests/pidfd/pidfd.h | 5 ++
.../selftests/pidfd/pidfd_getfd_test.c | 38 ++++++++++
.../selftests/pidfd/pidfd_setns_test.c | 6 ++
tools/testing/selftests/pidfd/pidfd_test.c | 13 ++++
10 files changed, 165 insertions(+), 43 deletions(-)
--
2.46.2
"Eric W. Biederman" <ebiederm(a)xmission.com> writes:
> Kees Cook <kees(a)kernel.org> writes:
>> I'm not super comfortable doing this regardless of bprm->fdpath; that
>> seems like too many cases getting changed. Can we just leave it as
>> depending on bprm->fdpath?
I was recommending that because I did not expect that there was any
widespread usage of aliasing of binary names using symlinks.
I realized today that on debian there are many aliases
of binaries created with the /etc/alternatives mechanism.
So there is much wider exposure to problems than I would have
supposed.
So I remove any objections to making the new code conditional on bprm->fdpath.
Eric
Currently, the second bridge command overwrites the first one.
Fix this by adding this VID to the interface behind $swp2.
The one_bridge_two_pvids() test intends to check that there is no
leakage of traffic between bridge ports which have a single VLAN - the
PVID VLAN.
Because of a typo, port $swp1 is configured with a PVID twice (second
command overwrites first), and $swp2 isn't configured at all (and since
the bridge vlan_default_pvid property is set to 0, this port will not
have a PVID at all, so it will drop all untagged and priority-tagged
traffic).
So, instead of testing the configuration that was intended, we are
testing a different one, where one port has PVID 2 and the other has
no PVID. This incorrect version of the test should also pass, but is
ineffective for its purpose, so fix the typo.
This typo has an impact on results of the test results,
potentially leading to wrong conclusions regarding
the functionality of a network device.
Fixes: 476a4f05d9b8 ("selftests: forwarding: add a no_forwarding.sh test")
Reviewed-by: Hangbin Liu <liuhangbin(a)gmail.com>
Reviewed-by: Shuah Khan <skhan(a)linuxfoundation.org>
Signed-off-by: Kacper Ludwinski <kac.ludwinski(a)icloud.com>
---
tools/testing/selftests/net/forwarding/no_forwarding.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
v4:
- Add revision history od this patch
- Add "Reviewed-by:"
- Limit number of characters in commit to 80
- Add impact explanation to commit message
v3:
- Edit commit message
- Add missing Signed-off-by
- Link: https://lore.kernel.org/linux-kselftest/20240927112824.339-1-kac.ludwinski@…
v2:
- Add missing CCs
- Fix typo in commit message
- Add target name
- Link: https://lore.kernel.org/linux-kselftest/fQknN_r6POzmrp8UVjyA3cknLnB1HB9I_jf…
v1:
- Link: https://lore.kernel.org/linux-kselftest/20240925050539.1906-1-kacper@ludwin…
diff --git a/tools/testing/selftests/net/forwarding/no_forwarding.sh b/tools/testing/selftests/net/forwarding/no_forwarding.sh
index 9e677aa64a06..694ece9ba3a7 100755
--- a/tools/testing/selftests/net/forwarding/no_forwarding.sh
+++ b/tools/testing/selftests/net/forwarding/no_forwarding.sh
@@ -202,7 +202,7 @@ one_bridge_two_pvids()
ip link set $swp2 master br0
bridge vlan add dev $swp1 vid 1 pvid untagged
- bridge vlan add dev $swp1 vid 2 pvid untagged
+ bridge vlan add dev $swp2 vid 2 pvid untagged
run_test "Switch ports in VLAN-aware bridge with different PVIDs"
--
2.43.0
Mario brought to my attention that the hugetlb_fault_after_madv test
is currently always skipped on s390x.
Let's adjust the test to be independent of the default hugetlb page size
and while at it, also improve the test output.
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Mario Casquero <mcasquer(a)redhat.com>
Cc: Breno Leitao <leitao(a)debian.org>
David Hildenbrand (2):
selftests/mm: hugetlb_fault_after_madv: use default hguetlb page size
selftests/mm: hugetlb_fault_after_madv: improve test output
.../selftests/mm/hugetlb_fault_after_madv.c | 48 ++++++++++++++++---
1 file changed, 42 insertions(+), 6 deletions(-)
--
2.46.1
Introduce test 'test_proc_pid_mem' to address the issue in the TODO.
Check if VMsize is 0 to determine whether the process has been unmapped.
The child process cannot signal the parent that it has unmapped itself,
as it no longer exists. This includes unmapping the text segment,
preventing the child from proceeding to the next instruction.
Signed-off-by: Siddharth Menon <simeddon(a)gmail.com>
---
v1->v2: Removed redundant parenthesis, fixed other checkpatch warnings.
Revised commit message based on feedback.
tools/testing/selftests/proc/proc-empty-vm.c | 56 +++++++++++++++++---
1 file changed, 50 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/proc/proc-empty-vm.c b/tools/testing/selftests/proc/proc-empty-vm.c
index b3f898aab4ab..bfb7f8823529 100644
--- a/tools/testing/selftests/proc/proc-empty-vm.c
+++ b/tools/testing/selftests/proc/proc-empty-vm.c
@@ -213,6 +213,53 @@ static void vsyscall(void)
}
#endif
+static int test_proc_pid_mem(pid_t pid)
+{
+ char buf[4096];
+ char *line;
+ int vm_size = -1;
+
+ snprintf(buf, sizeof(buf), "/proc/%d/status", pid);
+ int fd = open(buf, O_RDONLY);
+
+ if (fd == -1) {
+ if (errno == ENOENT)
+ /* Process does not exist */
+ return EXIT_SUCCESS;
+
+ perror("open /proc/[pid]/status");
+ return EXIT_FAILURE;
+ }
+
+ ssize_t rv = read(fd, buf, sizeof(buf) - 1);
+
+ if (rv == -1) {
+ perror("read");
+ close(fd);
+ return EXIT_FAILURE;
+ }
+ buf[rv] = '\0';
+
+ line = strtok(buf, "\n");
+ while (line != NULL) {
+ /* Check for VmSize */
+ if (strncmp(line, "VmSize:", 7) == 0) {
+ if (sscanf(line, "VmSize: %d", &vm_size) == 1)
+ break;
+ perror("Failed to parse VmSize.\n");
+ }
+ line = strtok(NULL, "\n");
+ }
+
+ close(fd);
+
+ /* Check if VmSize is 0 */
+ if (vm_size == 0)
+ return EXIT_SUCCESS;
+
+ return EXIT_FAILURE;
+}
+
static int test_proc_pid_maps(pid_t pid)
{
char buf[4096];
@@ -500,14 +547,11 @@ int main(void)
#endif
return EXIT_FAILURE;
} else {
- /*
- * TODO find reliable way to signal parent that munmap(2) completed.
- * Child can't do it directly because it effectively doesn't exist
- * anymore. Looking at child's VM files isn't 100% reliable either:
- * due to a bug they may not become empty or empty-like.
- */
sleep(1);
+ if (rv == EXIT_SUCCESS)
+ rv = test_proc_pid_mem(pid);
+
if (rv == EXIT_SUCCESS) {
rv = test_proc_pid_maps(pid);
}
--
2.39.5
In this series from Geliang, modifying MPTCP BPF selftests, we have:
- A new MPTCP subflow BPF program setting socket options per subflow: it
looks better to have this old test program in the BPF selftests to
track regressions and to serve as example.
Note: Nicolas is no longer working at Tessares, but he did this work
while working for them, and his email address is no longer available.
- A new hook in the same BPF program to do the verification step.
- A new MPTCP BPF subtest validating the new BPF program added in the
first patch, with the help of the new hook added in the second patch.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Changes in v7:
- Patch 2/3: use 'can_loop' instead of 'cond_break'. (Martin)
- Patch 3/3: use bpf_program__attach_cgroup(). (Martin)
- Link to v6: https://lore.kernel.org/r/20240911-upstream-bpf-next-20240506-mptcp-subflow…
Changes in v6:
- Patch 3/3: use usleep() instead of sleep()
- Series: rebased on top of bpf-next/net
- Link to v5: https://lore.kernel.org/r/20240910-upstream-bpf-next-20240506-mptcp-subflow…
Changes in v5:
- See the individual changelog for more details about them
- Patch 1/3: set TCP on the 2nd subflow
- Patch 2/3: new
- Patch 3/3: use the BPF program from patch 2/3 to do the validation
instead of using ss.
- Series: rebased on top of bpf-next/net
- Link to v4: https://lore.kernel.org/r/20240805-upstream-bpf-next-20240506-mptcp-subflow…
Changes in v4:
- Drop former patch 2/3: MPTCP's pm_nl_ctl requires a new header file:
- I will check later if it is possible to avoid having duplicated
header files in tools/include/uapi, but no need to block this series
for that. Patch 2/3 can be added later if needed.
- Patch 2/2: skip the test if 'ip mptcp' is not available.
- Link to v3: https://lore.kernel.org/r/20240703-upstream-bpf-next-20240506-mptcp-subflow…
Changes in v3:
- Sorry for the delay between v2 and v3, this series was conflicting
with the "add netns helpers", but it looks like it is on hold:
https://lore.kernel.org/cover.1715821541.git.tanggeliang@kylinos.cn
- Patch 1/3 includes "bpf_tracing_net.h", introduced in between.
- New patch 2/3: "selftests/bpf: Add mptcp pm_nl_ctl link".
- Patch 3/3: use the tool introduced in patch 2/3 + SYS_NOFAIL() helper.
- Link to v2: https://lore.kernel.org/r/20240509-upstream-bpf-next-20240506-mptcp-subflow…
Changes in v2:
- Previous patches 1/4 and 2/4 have been dropped from this series:
- 1/4: "selftests/bpf: Handle SIGINT when creating netns":
- A new version, more generic and no longer specific to MPTCP BPF
selftest will be sent later, as part of a new series. (Alexei)
- 2/4: "selftests/bpf: Add RUN_MPTCP_TEST macro":
- Removed, not to hide helper functions in macros. (Alexei)
- The commit message of patch 1/2 has been clarified to avoid some
possible confusions spot by Alexei.
- Link to v1: https://lore.kernel.org/r/20240507-upstream-bpf-next-20240506-mptcp-subflow…
---
Geliang Tang (2):
selftests/bpf: Add getsockopt to inspect mptcp subflow
selftests/bpf: Add mptcp subflow subtest
Nicolas Rybowski (1):
selftests/bpf: Add mptcp subflow example
MAINTAINERS | 2 +-
tools/testing/selftests/bpf/prog_tests/mptcp.c | 121 ++++++++++++++++++++
tools/testing/selftests/bpf/progs/mptcp_bpf.h | 42 +++++++
tools/testing/selftests/bpf/progs/mptcp_subflow.c | 128 ++++++++++++++++++++++
4 files changed, 292 insertions(+), 1 deletion(-)
---
base-commit: 151ac45348afc5b56baa584c7cd4876addf461ff
change-id: 20240506-upstream-bpf-next-20240506-mptcp-subflow-test-faef6654bfa3
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
From: Jason Xing <kernelxing(a)tencent.com>
When reading through the whole feature, I feel we can do these things to
make it more robust. They are trivial changes, not big ones.
Jason Xing (3):
net-timestamp: add strict check when setting tx flags
net-timestamp: add OPT_ID_TCP test in selftests
net-timestamp: namespacify the sysctl_tstamp_allow_data
include/net/netns/core.h | 1 +
include/net/sock.h | 2 --
net/core/net_namespace.c | 1 +
net/core/skbuff.c | 2 +-
net/core/sock.c | 4 ++++
net/core/sysctl_net_core.c | 18 +++++++++---------
tools/testing/selftests/net/txtimestamp.c | 6 ++++++
7 files changed, 22 insertions(+), 12 deletions(-)
--
2.37.3
There were several attempts to resolve circular include dependency
after the addition of percpu.h: 1c9df907da83 ("random: fix circular
include dependency on arm64 after addition of percpu.h"), c0842fbc1b18
("random32: move the pseudo-random 32-bit definitions to prandom.h") and
finally d9f29deb7fe8 ("prandom: Remove unused include") that completely
removes the inclusion of <linux/percpu.h>.
Due to legacy reasons, <linux/random.h> includes <linux/prandom.h>, but
with the commit entry remark:
--quote--
A further cleanup step would be to remove this from <linux/random.h>
entirely, and make people who use the prandom infrastructure include
just the new header file. That's a bit of a churn patch, but grepping
for "prandom_" and "next_pseudo_random32" "struct rnd_state" should
catch most users.
But it turns out that that nice cleanup step is fairly painful, because
a _lot_ of code currently seems to depend on the implicit include of
<linux/random.h>, which can currently come in a lot of ways, including
such fairly core headfers as <linux/net.h>.
So the "nice cleanup" part may or may never happen.
--/quote--
We would like to include <linux/percpu.h> in <linux/prandom.h>.
In [1] we would like to repurpose __percpu tag as a named address space
qualifier, where __percpu macro uses defines from <linux/percpu.h>.
The major roadblock to inclusion of <linux/percpu.h> is the above
mentioned legacy inclusion of <linux/prandom.h> in <linux/random.h> that
causes circular include dependency that prevents <linux/percpu.h>
inclusion.
This patch series is the "nice cleanup" part that:
a) Substitutes the inclusion of <linux/random.h> with the
inclusion of <linux/prandom.h> where needed (patches 1 - 17).
b) Removes legacy inclusion of <linux/prandom.h> from
<linux/random.h> (patch 18).
c) Includes <linux/percpu.h> in <linux/prandom.h> (patch 19).
The whole series was tested by compiling the kernel for x86_64 allconfig
and some popular architectures, namely arm64 defconfig, powerpc defconfig
and loongarch defconfig.
[1] https://lore.kernel.org/lkml/20240812115945.484051-4-ubizjak@gmail.com/
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: "H. Peter Anvin" <hpa(a)zytor.com>
Cc: Jani Nikula <jani.nikula(a)linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen(a)linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Cc: Tvrtko Ursulin <tursulin(a)ursulin.net>
Cc: David Airlie <airlied(a)gmail.com>
Cc: Daniel Vetter <daniel(a)ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst(a)linux.intel.com>
Cc: Maxime Ripard <mripard(a)kernel.org>
Cc: Thomas Zimmermann <tzimmermann(a)suse.de>
Cc: Hans Verkuil <hverkuil(a)xs4all.nl>
Cc: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Cc: Miquel Raynal <miquel.raynal(a)bootlin.com>
Cc: Richard Weinberger <richard(a)nod.at>
Cc: Vignesh Raghavendra <vigneshr(a)ti.com>
Cc: Eric Biggers <ebiggers(a)kernel.org>
Cc: "Theodore Y. Ts'o" <tytso(a)mit.edu>
Cc: Jaegeuk Kim <jaegeuk(a)kernel.org>
Cc: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Hannes Reinecke <hare(a)suse.de>
Cc: "James E.J. Bottomley" <James.Bottomley(a)HansenPartnership.com>
Cc: "Martin K. Petersen" <martin.petersen(a)oracle.com>
Cc: Alexei Starovoitov <ast(a)kernel.org>
Cc: Daniel Borkmann <daniel(a)iogearbox.net>
Cc: John Fastabend <john.fastabend(a)gmail.com>
Cc: Andrii Nakryiko <andrii(a)kernel.org>
Cc: Martin KaFai Lau <martin.lau(a)linux.dev>
Cc: Eduard Zingerman <eddyz87(a)gmail.com>
Cc: Song Liu <song(a)kernel.org>
Cc: Yonghong Song <yonghong.song(a)linux.dev>
Cc: KP Singh <kpsingh(a)kernel.org>
Cc: Stanislav Fomichev <sdf(a)fomichev.me>
Cc: Hao Luo <haoluo(a)google.com>
Cc: Jiri Olsa <jolsa(a)kernel.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Brendan Higgins <brendan.higgins(a)linux.dev>
Cc: David Gow <davidgow(a)google.com>
Cc: Rae Moar <rmoar(a)google.com>
Cc: "David S. Miller" <davem(a)davemloft.net>
Cc: Eric Dumazet <edumazet(a)google.com>
Cc: Jakub Kicinski <kuba(a)kernel.org>
Cc: Paolo Abeni <pabeni(a)redhat.com>
Cc: Jiri Pirko <jiri(a)resnulli.us>
Cc: Petr Mladek <pmladek(a)suse.com>
Cc: Steven Rostedt <rostedt(a)goodmis.org>
Cc: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Cc: Rasmus Villemoes <linux(a)rasmusvillemoes.dk>
Cc: Sergey Senozhatsky <senozhatsky(a)chromium.org>
Cc: Stephen Hemminger <stephen(a)networkplumber.org>
Cc: Jamal Hadi Salim <jhs(a)mojatatu.com>
Cc: Cong Wang <xiyou.wangcong(a)gmail.com>
Cc: Kent Overstreet <kent.overstreet(a)linux.dev>
---
v2: - Reword commit messages to mention the removal of legacy inclusion
of <linux/prandom.h> from <linux/random.h>
- Add missing substitution in crypto/testmgr.c
(reported by kernel test robot)
- Add Acked-by: tags.
v3: - Update to linux 6.12rc1.
- Add more Acked-by: tags.
Uros Bizjak (19):
x86/kaslr: Include <linux/prandom.h> instead of <linux/random.h>
crypto: testmgr: Include <linux/prandom.h> instead of <linux/random.h>
drm/i915/selftests: Include <linux/prandom.h> instead of
<linux/random.h>
drm/lib: Include <linux/prandom.h> instead of <linux/random.h>
media: vivid: Include <linux/prandom.h> in vivid-vid-cap.c
mtd: tests: Include <linux/prandom.h> instead of <linux/random.h>
fscrypt: Include <linux/once.h> in fs/crypto/keyring.c
scsi: libfcoe: Include <linux/prandom.h> instead of <linux/random.h>
bpf: Include <linux/prandom.h> instead of <linux/random.h>
lib/interval_tree_test.c: Include <linux/prandom.h> instead of
<linux/random.h>
kunit: string-stream-test: Include <linux/prandom.h>
random32: Include <linux/prandom.h> instead of <linux/random.h>
lib/rbtree-test: Include <linux/prandom.h> instead of <linux/random.h>
bpf/tests: Include <linux/prandom.h> instead of <linux/random.h>
lib/test_parman: Include <linux/prandom.h> instead of <linux/random.h>
lib/test_scanf: Include <linux/prandom.h> instead of <linux/random.h>
netem: Include <linux/prandom.h> in sch_netem.c
random: Do not include <linux/prandom.h> in <linux/random.h>
prandom: Include <linux/percpu.h> in <linux/prandom.h>
arch/x86/mm/kaslr.c | 2 +-
crypto/testmgr.c | 2 +-
drivers/gpu/drm/i915/selftests/i915_gem.c | 2 +-
drivers/gpu/drm/i915/selftests/i915_random.h | 2 +-
drivers/gpu/drm/i915/selftests/scatterlist.c | 2 +-
drivers/gpu/drm/lib/drm_random.h | 2 +-
drivers/media/test-drivers/vivid/vivid-vid-cap.c | 1 +
drivers/mtd/tests/oobtest.c | 2 +-
drivers/mtd/tests/pagetest.c | 2 +-
drivers/mtd/tests/subpagetest.c | 2 +-
fs/crypto/keyring.c | 1 +
include/linux/prandom.h | 1 +
include/linux/random.h | 7 -------
include/scsi/libfcoe.h | 2 +-
kernel/bpf/core.c | 2 +-
lib/interval_tree_test.c | 2 +-
lib/kunit/string-stream-test.c | 1 +
lib/random32.c | 2 +-
lib/rbtree_test.c | 2 +-
lib/test_bpf.c | 2 +-
lib/test_parman.c | 2 +-
lib/test_scanf.c | 2 +-
net/sched/sch_netem.c | 1 +
23 files changed, 22 insertions(+), 24 deletions(-)
--
2.46.2
"step_after_suspend_test fails with device busy error while
writing to /sys/power/state to start suspend." The test believes
it failed to enter suspend state with
$ sudo ./step_after_suspend_test
TAP version 13
Bail out! Failed to enter Suspend state
However, in the kernel message, I indeed see the system get
suspended and then wake up later.
[611172.033108] PM: suspend entry (s2idle)
[611172.044940] Filesystems sync: 0.006 seconds
[611172.052254] Freezing user space processes
[611172.059319] Freezing user space processes completed (elapsed 0.001 seconds)
[611172.067920] OOM killer disabled.
[611172.072465] Freezing remaining freezable tasks
[611172.080332] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[611172.089724] printk: Suspending console(s) (use no_console_suspend to debug)
[611172.117126] serial 00:03: disabled
--- some other hardware get reconnected ---
[611203.136277] OOM killer enabled.
[611203.140637] Restarting tasks ...
[611203.141135] usb 1-8.1: USB disconnect, device number 7
[611203.141755] done.
[611203.155268] random: crng reseeded on system resumption
[611203.162059] PM: suspend exit
After investigation, I noticed that for the code block
if (write(power_state_fd, "mem", strlen("mem")) != strlen("mem"))
ksft_exit_fail_msg("Failed to enter Suspend state\n");
The write will return -1 and errno is set to 16 (device busy).
It should be caused by the write function is not successfully returned
before the system suspend and the return value get messed when waking up.
As a result, It may be better to check the time passed of those few
instructions to determine whether the suspend is executed correctly for
it is pretty hard to execute those few lines for 5 seconds.
The timer to wake up the system is set to expire after 5 seconds and
no re-arm. If the timer remaining time is 0 second and 0 nano secomd,
it means the timer expired and wake the system up. Otherwise, the system
could be considered to enter the suspend state failed if there is any
remaining time.
After appling this patch, the test would not fail for it believes the
system does not go to suspend by mistake. It now could continue to the
rest part of the test after suspend.
Fixes: bfd092b8c272 ("selftests: breakpoint: add step_after_suspend_test")
Reported-by: Sinadin Shan <sinadin.shan(a)oracle.com>
Signed-off-by: Yifei Liu <yifei.l.liu(a)oracle.com>
---
.../testing/selftests/breakpoints/step_after_suspend_test.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/breakpoints/step_after_suspend_test.c b/tools/testing/selftests/breakpoints/step_after_suspend_test.c
index dfec31fb9b30d..8d275f03e977f 100644
--- a/tools/testing/selftests/breakpoints/step_after_suspend_test.c
+++ b/tools/testing/selftests/breakpoints/step_after_suspend_test.c
@@ -152,7 +152,10 @@ void suspend(void)
if (err < 0)
ksft_exit_fail_msg("timerfd_settime() failed\n");
- if (write(power_state_fd, "mem", strlen("mem")) != strlen("mem"))
+ system("(echo mem > /sys/power/state) 2> /dev/null");
+
+ timerfd_gettime(timerfd, &spec);
+ if (spec.it_value.tv_sec != 0 || spec.it_value.tv_nsec != 0)
ksft_exit_fail_msg("Failed to enter Suspend state\n");
close(timerfd);
--
2.46.0
1:The control flow was simplified by using else if statements instead of
goto structure.
2:Error conditions are handled more clearly.
3:The device_unlock call at the end of the function is guaranteed in all
cases.
Github request : https://github.com/torvalds/linux/pull/967
Trivial patches to update the gitignore files unders selftests, and a
little addition to EXTRA_CLEAN under net/rds to account for the
automatically generated include.sh.
Signed-off-by: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
---
Changes in v3:
- Split new entries in core and net gitignore files into two patches.
- Link to v2: https://lore.kernel.org/r/20240925-selftests-gitignore-v2-0-bbbbdef21959@gm…
Changes in v2:
- [PATCH 4/4] add excepction for load_address.c (must be tracked).
- Link to v1: https://lore.kernel.org/r/20240924-selftests-gitignore-v1-0-9755ac883388@gm…
---
Javier Carrasco (5):
selftests: core: add unshare_test to gitignore
selftests: net: add msg_oob to gitignore
selftests: rds: add include.sh to EXTRA_CLEAN
selftests: rds: add gitignore file for include.sh
selftests: exec: update gitignore for load_address
tools/testing/selftests/core/.gitignore | 1 +
tools/testing/selftests/exec/.gitignore | 3 ++-
tools/testing/selftests/net/.gitignore | 1 +
tools/testing/selftests/net/rds/.gitignore | 1 +
tools/testing/selftests/net/rds/Makefile | 2 +-
5 files changed, 6 insertions(+), 2 deletions(-)
---
base-commit: 4d0326b60bb753627437fff0f76bf1525bcda422
change-id: 20240924-selftests-gitignore-e41133e6c5bd
Best regards,
--
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
"step_after_suspend_test fails with device busy error while
writing to /sys/power/state to start suspend." The test believes
it failed to enter suspend state with
$ sudo ./step_after_suspend_test
TAP version 13
Bail out! Failed to enter Suspend state
However, in the kernel message, I indeed see the system get
suspended and then wake up later.
[611172.033108] PM: suspend entry (s2idle)
[611172.044940] Filesystems sync: 0.006 seconds
[611172.052254] Freezing user space processes
[611172.059319] Freezing user space processes completed (elapsed 0.001 seconds)
[611172.067920] OOM killer disabled.
[611172.072465] Freezing remaining freezable tasks
[611172.080332] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[611172.089724] printk: Suspending console(s) (use no_console_suspend to debug)
[611172.117126] serial 00:03: disabled
--- some other hardware get reconnected ---
[611203.136277] OOM killer enabled.
[611203.140637] Restarting tasks ...
[611203.141135] usb 1-8.1: USB disconnect, device number 7
[611203.141755] done.
[611203.155268] random: crng reseeded on system resumption
[611203.162059] PM: suspend exit
After investigation, I noticed that for the code block
if (write(power_state_fd, "mem", strlen("mem")) != strlen("mem"))
ksft_exit_fail_msg("Failed to enter Suspend state\n");
The write will return -1 and errno is set to 16 (device busy).
It should be caused by the write function is not successfully returned
before the system suspend and the return value get messed when waking up.
As a result, It may be better to check the time passed of those few instructions
to determine whether the suspend is executed correctly for it is pretty hard to
execute those few lines for 5 seconds.
The timer to wake up the system is set to expire after 5 seconds and no re-arm.
If the timer remaining time is 0 second and 0 nano secomd, it means the timer
expired and wake the system up. Otherwise, the system could be considered to
enter the suspend state failed if there is any remaining time.
After appling this patch, the test would not fail for it believes the system does
not go to suspend by mistake. It now could continue to the rest part of the test after suspend.
Fixes: bfd092b8c2728 ("selftests: breakpoint: add step_after_suspend_test")
Reported-by: Sinadin Shan <sinadin.shan(a)oracle.com>
Signed-off-by: Yifei Liu <yifei.l.liu(a)oracle.com>
---
.../testing/selftests/breakpoints/step_after_suspend_test.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/breakpoints/step_after_suspend_test.c b/tools/testing/selftests/breakpoints/step_after_suspend_test.c
index dfec31fb9b30d..33f5542bf741d 100644
--- a/tools/testing/selftests/breakpoints/step_after_suspend_test.c
+++ b/tools/testing/selftests/breakpoints/step_after_suspend_test.c
@@ -152,7 +152,10 @@ void suspend(void)
if (err < 0)
ksft_exit_fail_msg("timerfd_settime() failed\n");
- if (write(power_state_fd, "mem", strlen("mem")) != strlen("mem"))
+ system("(echo mem > /sys/power/state) 2> /dev/null");
+
+ timerfd_gettime(timerfd,&spec);
+ if (spec.it_value.tv_sec != 0 || spec.it_value.tv_nsec != 0)
ksft_exit_fail_msg("Failed to enter Suspend state\n");
close(timerfd);
--
2.46.0
This patch set enables the Intel flexible return and event delivery
(FRED) architecture with KVM VMX to allow guests to utilize FRED.
The FRED architecture defines simple new transitions that change
privilege level (ring transitions). The FRED architecture was
designed with the following goals:
1) Improve overall performance and response time by replacing event
delivery through the interrupt descriptor table (IDT event
delivery) and event return by the IRET instruction with lower
latency transitions.
2) Improve software robustness by ensuring that event delivery
establishes the full supervisor context and that event return
establishes the full user context.
The new transitions defined by the FRED architecture are FRED event
delivery and, for returning from events, two FRED return instructions.
FRED event delivery can effect a transition from ring 3 to ring 0, but
it is used also to deliver events incident to ring 0. One FRED
instruction (ERETU) effects a return from ring 0 to ring 3, while the
other (ERETS) returns while remaining in ring 0. Collectively, FRED
event delivery and the FRED return instructions are FRED transitions.
Intel VMX architecture is extended to run FRED guests, and the major
changes are:
1) New VMCS fields for FRED context management, which includes two new
event data VMCS fields, eight new guest FRED context VMCS fields and
eight new host FRED context VMCS fields.
2) VMX nested-exception support for proper virtualization of stack
levels introduced with FRED architecture.
Search for the latest FRED spec in most search engines with this search
pattern:
site:intel.com FRED (flexible return and event delivery) specification
As the native FRED patches are committed in the tip tree "x86/fred"
branch:
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/log/?h=x86/fred,
and we have received a good amount of review comments for v1, it's time
to send out v2 based on this branch for further help from the community.
Patch 1-2 are cleanups to VMX basic and misc MSRs, which were sent
out earlier as a preparation for FRED changes:
https://lore.kernel.org/kvm/20240206182032.1596-1-xin3.li@intel.com/T/#u
Patch 3-15 add FRED support to VMX.
Patch 16-21 add FRED support to nested VMX.
Patch 22 exposes FRED and its baseline features to KVM guests.
Patch 23-25 add FRED selftests.
There is also a counterpart qemu patch set for FRED at:
https://lore.kernel.org/qemu-devel/20231109072012.8078-1-xin3.li@intel.com/…,
which works with this patch set to allow KVM to run FRED guests.
Changes since v1:
* Always load the secondary VM exit controls (Sean Christopherson).
* Remove FRED VM entry/exit controls consistency checks in
setup_vmcs_config() (Sean Christopherson).
* Clear FRED VM entry/exit controls if FRED is not enumerated (Chao Gao).
* Use guest_can_use() to trace FRED enumeration in a vcpu (Chao Gao).
* Enable FRED MSRs intercept if FRED is no longer enumerated in CPUID
(Chao Gao).
* Move guest FRED states init into __vmx_vcpu_reset() (Chao Gao).
* Don't use guest_cpuid_has() in vmx_prepare_switch_to_{host,guest}(),
which are called from IRQ-disabled context (Chao Gao).
* Reset msr_guest_fred_rsp0 in __vmx_vcpu_reset() (Chao Gao).
* Fail host requested FRED MSRs access if KVM cannot virtualize FRED
(Chao Gao).
* Handle the case FRED MSRs are valid but KVM cannot virtualize FRED
(Chao Gao).
* Add sanity checks when writing to FRED MSRs.
* Explain why it is ok to only check CR4.FRED in kvm_is_fred_enabled()
(Chao Gao).
* Document event data should be equal to CR2/DR6/IA32_XFD_ERR instead
of using WARN_ON() (Chao Gao).
* Zero event data if a #NM was not caused by extended feature disable
(Chao Gao).
* Set the nested flag when there is an original interrupt (Chao Gao).
* Dump guest FRED states only if guest has FRED enabled (Nikolay Borisov).
* Add a prerequisite to SHADOW_FIELD_R[OW] macros
* Remove hyperv TLFS related changes (Jeremi Piotrowski).
* Use kvm_cpu_cap_has() instead of cpu_feature_enabled() to decouple
KVM's capability to virtualize a feature and host's enabling of a
feature (Chao Gao).
Xin Li (25):
KVM: VMX: Cleanup VMX basic information defines and usages
KVM: VMX: Cleanup VMX misc information defines and usages
KVM: VMX: Add support for the secondary VM exit controls
KVM: x86: Mark CR4.FRED as not reserved
KVM: VMX: Initialize FRED VM entry/exit controls in vmcs_config
KVM: VMX: Defer enabling FRED MSRs save/load until after set CPUID
KVM: VMX: Set intercept for FRED MSRs
KVM: VMX: Initialize VMCS FRED fields
KVM: VMX: Switch FRED RSP0 between host and guest
KVM: VMX: Add support for FRED context save/restore
KVM: x86: Add kvm_is_fred_enabled()
KVM: VMX: Handle FRED event data
KVM: VMX: Handle VMX nested exception for FRED
KVM: VMX: Disable FRED if FRED consistency checks fail
KVM: VMX: Dump FRED context in dump_vmcs()
KVM: VMX: Invoke vmx_set_cpu_caps() before nested setup
KVM: nVMX: Add support for the secondary VM exit controls
KVM: nVMX: Add a prerequisite to SHADOW_FIELD_R[OW] macros
KVM: nVMX: Add FRED VMCS fields
KVM: nVMX: Add support for VMX FRED controls
KVM: nVMX: Add VMCS FRED states checking
KVM: x86: Allow FRED/LKGS/WRMSRNS to be exposed to guests
KVM: selftests: Run debug_regs test with FRED enabled
KVM: selftests: Add a new VM guest mode to run user level code
KVM: selftests: Add fred exception tests
Documentation/virt/kvm/x86/nested-vmx.rst | 19 +
arch/x86/include/asm/kvm_host.h | 8 +-
arch/x86/include/asm/msr-index.h | 15 +-
arch/x86/include/asm/vmx.h | 59 ++-
arch/x86/kvm/cpuid.c | 4 +-
arch/x86/kvm/governed_features.h | 1 +
arch/x86/kvm/kvm_cache_regs.h | 17 +
arch/x86/kvm/svm/svm.c | 4 +-
arch/x86/kvm/vmx/capabilities.h | 30 +-
arch/x86/kvm/vmx/nested.c | 329 ++++++++++++---
arch/x86/kvm/vmx/nested.h | 2 +-
arch/x86/kvm/vmx/vmcs.h | 1 +
arch/x86/kvm/vmx/vmcs12.c | 19 +
arch/x86/kvm/vmx/vmcs12.h | 38 ++
arch/x86/kvm/vmx/vmcs_shadow_fields.h | 80 ++--
arch/x86/kvm/vmx/vmx.c | 385 +++++++++++++++---
arch/x86/kvm/vmx/vmx.h | 15 +-
arch/x86/kvm/x86.c | 103 ++++-
arch/x86/kvm/x86.h | 5 +-
tools/testing/selftests/kvm/Makefile | 1 +
.../selftests/kvm/include/kvm_util_base.h | 1 +
.../selftests/kvm/include/x86_64/processor.h | 36 ++
tools/testing/selftests/kvm/lib/kvm_util.c | 5 +-
.../selftests/kvm/lib/x86_64/processor.c | 15 +-
tools/testing/selftests/kvm/lib/x86_64/vmx.c | 4 +-
.../testing/selftests/kvm/x86_64/debug_regs.c | 50 ++-
.../testing/selftests/kvm/x86_64/fred_test.c | 297 ++++++++++++++
27 files changed, 1320 insertions(+), 223 deletions(-)
create mode 100644 tools/testing/selftests/kvm/x86_64/fred_test.c
base-commit: e13841907b8fda0ae0ce1ec03684665f578416a8
--
2.43.0