Currently, we can run string-stream and assertion tests only when they
are built into the kernel (with config options = y), since some of the
symbols (string-stream functions and functions from assert.c) are not
exported into any of the namespaces, therefore they are not accessible
for the modules.
This patch series exports the required symbols into the KUnit namespace.
Also, it makes the string-stream test a separate module and removes the
log test stub from kunit-test since now we can access the string-stream
symbols even if the test which uses it is built as a module.
Additionally, this patch series merges the assertion test suite into the
kunit-test, since assert.c (and all of the assertion formatting
functions in it) is a part of the KUnit core.
Ivan Orlov (5):
kunit: string-stream: export non-static functions
kunit: kunit-test: Remove stub for log tests
kunit: string-stream-test: Make it a separate module
kunit: assert: export non-static functions
kunit: Merge assertion test into kunit-test.c
lib/kunit/Kconfig | 8 +
lib/kunit/Makefile | 7 +-
lib/kunit/assert.c | 4 +
lib/kunit/assert_test.c | 388 --------------------------------
lib/kunit/kunit-test.c | 397 +++++++++++++++++++++++++++++++--
lib/kunit/string-stream-test.c | 2 +
lib/kunit/string-stream.c | 12 +-
7 files changed, 405 insertions(+), 413 deletions(-)
delete mode 100644 lib/kunit/assert_test.c
--
2.34.1
The perf subsystem today unifies various tracing and monitoring
features, from both software and hardware. One benefit of the perf
subsystem is automatically inheriting events to child tasks, which
enables process-wide events monitoring with low overheads. By default
perf events are non-intrusive, not affecting behaviour of the tasks
being monitored.
For certain use-cases, however, it makes sense to leverage the
generality of the perf events subsystem and optionally allow the tasks
being monitored to receive signals on events they are interested in.
This patch series adds the option to synchronously signal user space on
events.
To better support process-wide synchronous self-monitoring, without
events propagating to children that do not share the current process's
shared environment, two pre-requisite patches are added to optionally
restrict inheritance to CLONE_THREAD, and remove events on exec (without
affecting the parent).
Examples how to use these features can be found in the tests added at
the end of the series. In addition to the tests added, the series has
also been subjected to syzkaller fuzzing (focus on 'kernel/events/'
coverage).
Motivation and Example Uses
---------------------------
1. Our immediate motivation is low-overhead sampling-based race
detection for user space [1]. By using perf_event_open() at
process initialization, we can create hardware
breakpoint/watchpoint events that are propagated automatically
to all threads in a process. As far as we are aware, today no
existing kernel facility (such as ptrace) allows us to set up
process-wide watchpoints with minimal overheads (that are
comparable to mprotect() of whole pages).
2. Other low-overhead error detectors that rely on detecting
accesses to certain memory locations or code, process-wide and
also only in a specific set of subtasks or threads.
[1] https://llvm.org/devmtg/2020-09/slides/Morehouse-GWP-Tsan.pdf
Other ideas for use-cases we found interesting, but should only
illustrate the range of potential to further motivate the utility (we're
sure there are more):
3. Code hot patching without full stop-the-world. Specifically, by
setting a code breakpoint to entry to the patched routine, then
send signals to threads and check that they are not in the
routine, but without stopping them further. If any of the
threads will enter the routine, it will receive SIGTRAP and
pause.
4. Safepoints without mprotect(). Some Java implementations use
"load from a known memory location" as a safepoint. When threads
need to be stopped, the page containing the location is
mprotect()ed and threads get a signal. This could be replaced with
a watchpoint, which does not require a whole page nor DTLB
shootdowns.
5. Threads receiving signals on performance events to
throttle/unthrottle themselves.
6. Tracking data flow globally.
Changelog
---------
v4:
* Fix for parent and child racing to exit in sync_child_event().
* Fix race between irq_work running and task's sighand being released by
release_task().
* Generalize setting si_perf and si_addr independent of event type;
introduces perf_event_attr::sig_data, which can be set by user space
to be propagated to si_perf.
* Warning in perf_sigtrap() if ctx->task and current mismatch; we expect
this on architectures that do not properly implement
arch_irq_work_raise().
* Require events that want sigtrap to be associated with a task.
* Dropped "perf: Add breakpoint information to siginfo on SIGTRAP"
in favor of more generic solution (perf_event_attr::sig_data).
v3:
* Add patch "perf: Rework perf_event_exit_event()" to beginning of
series, courtesy of Peter Zijlstra.
* Rework "perf: Add support for event removal on exec" based on
the added "perf: Rework perf_event_exit_event()".
* Fix kselftests to work with more recent libc, due to the way it forces
using the kernel's own siginfo_t.
* Add basic perf-tool built-in test.
v2/RFC: https://lkml.kernel.org/r/20210310104139.679618-1-elver@google.com
* Patch "Support only inheriting events if cloned with CLONE_THREAD"
added to series.
* Patch "Add support for event removal on exec" added to series.
* Patch "Add kselftest for process-wide sigtrap handling" added to
series.
* Patch "Add kselftest for remove_on_exec" added to series.
* Implicitly restrict inheriting events if sigtrap, but the child was
cloned with CLONE_CLEAR_SIGHAND, because it is not generally safe if
the child cleared all signal handlers to continue sending SIGTRAP.
* Various minor fixes (see details in patches).
v1/RFC: https://lkml.kernel.org/r/20210223143426.2412737-1-elver@google.com
Pre-series: The discussion at [2] led to the changes in this series. The
approach taken in "Add support for SIGTRAP on perf events" to trigger
the signal was suggested by Peter Zijlstra in [3].
[2] https://lore.kernel.org/lkml/CACT4Y+YPrXGw+AtESxAgPyZ84TYkNZdP0xpocX2jwVAbZ…
[3] https://lore.kernel.org/lkml/YBv3rAT566k+6zjg@hirez.programming.kicks-ass.n…
Marco Elver (9):
perf: Apply PERF_EVENT_IOC_MODIFY_ATTRIBUTES to children
perf: Support only inheriting events if cloned with CLONE_THREAD
perf: Add support for event removal on exec
signal: Introduce TRAP_PERF si_code and si_perf to siginfo
perf: Add support for SIGTRAP on perf events
selftests/perf_events: Add kselftest for process-wide sigtrap handling
selftests/perf_events: Add kselftest for remove_on_exec
tools headers uapi: Sync tools/include/uapi/linux/perf_event.h
perf test: Add basic stress test for sigtrap handling
Peter Zijlstra (1):
perf: Rework perf_event_exit_event()
arch/m68k/kernel/signal.c | 3 +
arch/x86/kernel/signal_compat.c | 5 +-
fs/signalfd.c | 4 +
include/linux/compat.h | 2 +
include/linux/perf_event.h | 9 +-
include/linux/signal.h | 1 +
include/uapi/asm-generic/siginfo.h | 6 +-
include/uapi/linux/perf_event.h | 12 +-
include/uapi/linux/signalfd.h | 4 +-
kernel/events/core.c | 302 +++++++++++++-----
kernel/fork.c | 2 +-
kernel/signal.c | 11 +
tools/include/uapi/linux/perf_event.h | 12 +-
tools/perf/tests/Build | 1 +
tools/perf/tests/builtin-test.c | 5 +
tools/perf/tests/sigtrap.c | 150 +++++++++
tools/perf/tests/tests.h | 1 +
.../testing/selftests/perf_events/.gitignore | 3 +
tools/testing/selftests/perf_events/Makefile | 6 +
tools/testing/selftests/perf_events/config | 1 +
.../selftests/perf_events/remove_on_exec.c | 260 +++++++++++++++
tools/testing/selftests/perf_events/settings | 1 +
.../selftests/perf_events/sigtrap_threads.c | 210 ++++++++++++
23 files changed, 924 insertions(+), 87 deletions(-)
create mode 100644 tools/perf/tests/sigtrap.c
create mode 100644 tools/testing/selftests/perf_events/.gitignore
create mode 100644 tools/testing/selftests/perf_events/Makefile
create mode 100644 tools/testing/selftests/perf_events/config
create mode 100644 tools/testing/selftests/perf_events/remove_on_exec.c
create mode 100644 tools/testing/selftests/perf_events/settings
create mode 100644 tools/testing/selftests/perf_events/sigtrap_threads.c
--
2.31.0.208.g409f899ff0-goog
This patch series is motivated by the following observation:
Raise a signal, jump to signal handler. The ucontext_t structure dumped
by kernel to userspace has a uc_sigmask field having the mask of blocked
signals. If you run a fresh minimalistic program doing this, this field
is empty, even if you block some signals while registering the handler
with sigaction().
Here is what the man-pages have to say:
sigaction(2): "sa_mask specifies a mask of signals which should be blocked
(i.e., added to the signal mask of the thread in which the signal handler
is invoked) during execution of the signal handler. In addition, the
signal which triggered the handler will be blocked, unless the SA_NODEFER
flag is used."
signal(7): Under "Execution of signal handlers", (1.3) implies:
"The thread's current signal mask is accessible via the ucontext_t
object that is pointed to by the third argument of the signal handler."
But, (1.4) states:
"Any signals specified in act->sa_mask when registering the handler with
sigprocmask(2) are added to the thread's signal mask. The signal being
delivered is also added to the signal mask, unless SA_NODEFER was
specified when registering the handler. These signals are thus blocked
while the handler executes."
There clearly is no distinction being made in the man pages between
"Thread's signal mask" and ucontext_t; this logically should imply
that a signal blocked by populating struct sigaction should be visible
in ucontext_t.
Here is what the kernel code does (for Aarch64):
do_signal() -> handle_signal() -> sigmask_to_save(), which returns
¤t->blocked, is passed to setup_rt_frame() -> setup_sigframe() ->
__copy_to_user(). Hence, ¤t->blocked is copied to ucontext_t
exposed to userspace. Returning back to handle_signal(),
signal_setup_done() -> signal_delivered() -> sigorsets() and
set_current_blocked() are responsible for using information from
struct ksignal ksig, which was populated through the sigaction()
system call in kernel/signal.c:
copy_from_user(&new_sa.sa, act, sizeof(new_sa.sa)),
to update ¤t->blocked; hence, the set of blocked signals for the
current thread is updated AFTER the kernel dumps ucontext_t to
userspace.
Assuming that the above is indeed the intended behaviour, because it
semantically makes sense, since the signals blocked using sigaction()
remain blocked only till the execution of the handler, and not in the
context present before jumping to the handler (but nothing can be
confirmed from the man-pages), the series introduces a test for
mangling with uc_sigmask. I will send a separate series to fix the
man-pages.
The proposed selftest has been tested out on Aarch32, Aarch64 and x86_64.
Changes in v2:
- Replace all occurrences of SIGPIPE with SIGSEGV
- Add a testcase: Raise the same signal again; it must not be queued
- Remove unneeded <assert.h>, <unistd.h>
- Give a detailed test description in the comments; also describe the
exact meaning of delivered and blocked
- Handle errors for all libc functions/syscalls
- Mention tests in Makefile and .gitignore in alphabetical order
Dev Jain (2):
selftests: Rename sigaltstack to generic signal
selftests: Add a test mangling with uc_sigmask
tools/testing/selftests/Makefile | 2 +-
.../{sigaltstack => signal}/.gitignore | 3 +-
.../{sigaltstack => signal}/Makefile | 3 +-
.../current_stack_pointer.h | 0
.../selftests/signal/mangle_uc_sigmask.c | 194 ++++++++++++++++++
.../sas.c => signal/sigaltstack.c} | 0
6 files changed, 199 insertions(+), 3 deletions(-)
rename tools/testing/selftests/{sigaltstack => signal}/.gitignore (57%)
rename tools/testing/selftests/{sigaltstack => signal}/Makefile (53%)
rename tools/testing/selftests/{sigaltstack => signal}/current_stack_pointer.h (100%)
create mode 100644 tools/testing/selftests/signal/mangle_uc_sigmask.c
rename tools/testing/selftests/{sigaltstack/sas.c => signal/sigaltstack.c} (100%)
--
2.34.1
From: Jeff Xu <jeffxu(a)chromium.org>
When MFD_NOEXEC_SEAL was introduced, there was one big mistake: it
didn't have proper documentation. This led to a lot of confusion,
especially about whether or not memfd created with the MFD_NOEXEC_SEAL
flag is sealable. Before MFD_NOEXEC_SEAL, memfd had to explicitly set
MFD_ALLOW_SEALING to be sealable, so it's a fair question.
As one might have noticed, unlike other flags in memfd_create,
MFD_NOEXEC_SEAL is actually a combination of multiple flags. The idea
is to make it easier to use memfd in the most common way, which is
NOEXEC + F_SEAL_EXEC + MFD_ALLOW_SEALING. This works with sysctl
vm.noexec to help existing applications move to a more secure way of
using memfd.
Proposals have been made to put MFD_NOEXEC_SEAL non-sealable, unless
MFD_ALLOW_SEALING is set, to be consistent with other flags [1] [2],
Those are based on the viewpoint that each flag is an atomic unit,
which is a reasonable assumption. However, MFD_NOEXEC_SEAL was
designed with the intent of promoting the most secure method of using
memfd, therefore a combination of multiple functionalities into one
bit.
Furthermore, the MFD_NOEXEC_SEAL has been added for more than one
year, and multiple applications and distributions have backported and
utilized it. Altering ABI now presents a degree of risk and may lead
to disruption.
MFD_NOEXEC_SEAL is a new flag, and applications must change their code
to use it. There is no backward compatibility problem.
When sysctl vm.noexec == 1 or 2, applications that don't set
MFD_NOEXEC_SEAL or MFD_EXEC will get MFD_NOEXEC_SEAL memfd. And
old-application might break, that is by-design, in such a system
vm.noexec = 0 shall be used. Also no backward compatibility problem.
I propose to include this documentation patch to assist in clarifying
the semantics of MFD_NOEXEC_SEAL, thereby preventing any potential
future confusion.
This patch supersede previous patch which is trying different
direction [3], and please remove [2] from mm-unstable branch when
applying this patch.
Finally, I would like to express my gratitude to David Rheinsberg and
Barnabás Pőcze for initiating the discussion on the topic of sealability.
[1]
https://lore.kernel.org/lkml/20230714114753.170814-1-david@readahead.eu/
[2]
https://lore.kernel.org/lkml/20240513191544.94754-1-pobrn@protonmail.com/
[3]
https://lore.kernel.org/lkml/20240524033933.135049-1-jeffxu@google.com/
Jeff Xu (1):
mm/memfd: add documentation for MFD_NOEXEC_SEAL MFD_EXEC
Documentation/userspace-api/index.rst | 1 +
Documentation/userspace-api/mfd_noexec.rst | 86 ++++++++++++++++++++++
2 files changed, 87 insertions(+)
create mode 100644 Documentation/userspace-api/mfd_noexec.rst
--
2.45.2.505.gda0bf45e8d-goog
The different patches here are some unrelated fixes for MPTCP:
- Patch 1 ensures 'snd_una' is initialised on connect in case of MPTCP
fallback to TCP followed by retransmissions before the processing of
any other incoming packets. A fix for v5.9+.
- Patch 2 makes sure the RmAddr MIB counter is incremented, and only
once per ID, upon the reception of a RM_ADDR. A fix for v5.10+.
- Patch 3 doesn't update 'add addr' related counters if the connect()
was not possible. A fix for v5.7+.
- Patch 4 updates the mailmap file to add Geliang's new email address.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Geliang Tang (1):
mailmap: map Geliang's new email address
Paolo Abeni (1):
mptcp: ensure snd_una is properly initialized on connect
YonglongLi (2):
mptcp: pm: inc RmAddr MIB counter once per RM_ADDR ID
mptcp: pm: update add_addr counters after connect
.mailmap | 1 +
net/mptcp/pm_netlink.c | 21 ++++++++++++++-------
net/mptcp/protocol.c | 1 +
tools/testing/selftests/net/mptcp/mptcp_join.sh | 5 +++--
4 files changed, 19 insertions(+), 9 deletions(-)
---
base-commit: c44711b78608c98a3e6b49ce91678cd0917d5349
change-id: 20240607-upstream-net-20240607-misc-fixes-024007171d60
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Hi,
This builds on the proposal[1] from Mark and lets me convert the
existing usercopy selftest to KUnit. Besides adding this basic test to
the KUnit collection, it also opens the door for execve testing (which
depends on having a functional current->mm), and should provide the
basic infrastructure for adding Mark's much more complete usercopy tests.
-Kees
[1] https://lore.kernel.org/lkml/20230321122514.1743889-2-mark.rutland@arm.com/
Kees Cook (2):
kunit: test: Add vm_mmap() allocation resource manager
usercopy: Convert test_user_copy to KUnit test
MAINTAINERS | 1 +
include/kunit/test.h | 17 ++
lib/Kconfig.debug | 21 +-
lib/Makefile | 2 +-
lib/kunit/test.c | 139 +++++++++++-
lib/{test_user_copy.c => usercopy_kunit.c} | 252 ++++++++++-----------
6 files changed, 288 insertions(+), 144 deletions(-)
rename lib/{test_user_copy.c => usercopy_kunit.c} (52%)
--
2.34.1
Hi all,
This series does a number of cleanups into resctrl_val() and
generalizes it by removing test name specific handling from the
function.
v6:
- Adjust closing/rollback of the IMC perf
- Move the comment in measure_vals() to function level
- Capitalize MBM
- binded to -> bound to
- Language tweak into kerneldoc
- Removed stale paragraph from commit message
v5:
- Open mem bw file only once and use rewind().
- Add \n to mem bw file read to allow reading fresh values from the file.
- Return 0 if create_grp() is given NULL grp_name (matches the original
behavior). Mention this in function's kerneldoc.
- Cast pid_t to int before printing with %d.
- Caps/typo fixes to kerneldoc and commit messages.
- Use imperative tone in commit messages and improve them based on points
that came up during review.
v4:
- Merged close fix into IMC READ+WRITE rework patch
- Add loop to reset imc_counters_config fds to -1 to be able know which
need closing
- Introduce perf_close_imc_mem_bw() to close fds
- Open resctrl mem bw file (twice) beforehand to avoid opening it during
the test
- Remove MBM .mongrp setup
- Remove mongrp from CMT test
v3:
- Rename init functions to <testname>_init()
- Replace for loops with READ+WRITE statements for clarity
- Don't drop Return: entry from perf_open_imc_mem_bw() func comment
- New patch: Fix closing of IMC fds in case of error
- New patch: Make "bandwidth" consistent in comments & prints
- New patch: Simplify mem bandwidth file code
- Remove wrong comment
- Changed grp_name check to return -1 on fail (internal sanity check)
v2:
- Resolved conflicts with kselftest/next
- Spaces -> tabs correction
Ilpo Järvinen (16):
selftests/resctrl: Fix closing IMC fds on error and open-code R+W
instead of loops
selftests/resctrl: Calculate resctrl FS derived mem bw over sleep(1)
only
selftests/resctrl: Make "bandwidth" consistent in comments & prints
selftests/resctrl: Consolidate get_domain_id() into resctrl_val()
selftests/resctrl: Use correct type for pids
selftests/resctrl: Cleanup bm_pid and ppid usage & limit scope
selftests/resctrl: Rename measure_vals() to measure_mem_bw_vals() &
document
selftests/resctrl: Simplify mem bandwidth file code for MBA & MBM
tests
selftests/resctrl: Add ->measure() callback to resctrl_val_param
selftests/resctrl: Add ->init() callback into resctrl_val_param
selftests/resctrl: Simplify bandwidth report type handling
selftests/resctrl: Make some strings passed to resctrlfs functions
const
selftests/resctrl: Convert ctrlgrp & mongrp to pointers
selftests/resctrl: Remove mongrp from MBA test
selftests/resctrl: Remove mongrp from CMT test
selftests/resctrl: Remove test name comparing from
write_bm_pid_to_resctrl()
tools/testing/selftests/resctrl/cache.c | 10 +-
tools/testing/selftests/resctrl/cat_test.c | 5 +-
tools/testing/selftests/resctrl/cmt_test.c | 22 +-
tools/testing/selftests/resctrl/mba_test.c | 26 +-
tools/testing/selftests/resctrl/mbm_test.c | 26 +-
tools/testing/selftests/resctrl/resctrl.h | 49 ++-
tools/testing/selftests/resctrl/resctrl_val.c | 371 ++++++++----------
tools/testing/selftests/resctrl/resctrlfs.c | 67 ++--
8 files changed, 291 insertions(+), 285 deletions(-)
--
2.39.2
This patch addresses the present TODO in the file.
I have tested it manually on my system and added relevant filtering to
ensure that the correct feature list is being checked.
Signed-off-by: Abhinav Jain <jain.abhinav177(a)gmail.com>
---
tools/testing/selftests/net/netdevice.sh | 21 +++++++++++++++++++--
1 file changed, 19 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/net/netdevice.sh b/tools/testing/selftests/net/netdevice.sh
index e3afcb424710..cbe2573c3827 100755
--- a/tools/testing/selftests/net/netdevice.sh
+++ b/tools/testing/selftests/net/netdevice.sh
@@ -117,14 +117,31 @@ kci_netdev_ethtool()
return 1
fi
- ethtool -k "$netdev" > "$TMP_ETHTOOL_FEATURES"
+ ethtool -k "$netdev" | tail -n +2 > "$TMP_ETHTOOL_FEATURES"
if [ $? -ne 0 ];then
echo "FAIL: $netdev: ethtool list features"
rm "$TMP_ETHTOOL_FEATURES"
return 1
fi
echo "PASS: $netdev: ethtool list features"
- #TODO for each non fixed features, try to turn them on/off
+
+ for feature in $(grep -v fixed "$TMP_ETHTOOL_FEATURES" | \
+ awk '{print $1}' | sed 's/://'); do
+ ethtool --offload "$netdev" "$feature" off
+ if [ $? -eq 0 ]; then
+ echo "PASS: $netdev: Turned off feature: $feature"
+ else
+ echo "FAIL: $netdev: Failed to turn off feature: $feature"
+ fi
+
+ ethtool --offload "$netdev" "$feature" on
+ if [ $? -eq 0 ]; then
+ echo "PASS: $netdev: Turned on feature: $feature"
+ else
+ echo "FAIL: $netdev: Failed to turn on feature: $feature"
+ fi
+ done
+
rm "$TMP_ETHTOOL_FEATURES"
kci_netdev_ethtool_test 74 'dump' "ethtool -d $netdev"
--
2.34.1
From: Pankaj Raghav <p.raghav(a)samsung.com>
create_pagecache_thp_and_fd() in split_huge_page_test.c used the
variable dummy to perform mmap read.
However, this test was skipped even on XFS which has large folio
support. The issue was compiler (gcc 13.2.0) was optimizing out the
dummy variable, therefore, not creating huge page in the page cache.
Use asm volatile() trick to force the compiler not to optimize out
the loop where we read from the mmaped addr. This is similar to what is
being done in other tests (cow.c, etc)
As the variable is now used in the asm statement, remove the unused
attribute.
Signed-off-by: Pankaj Raghav <p.raghav(a)samsung.com>
---
Changes since v2:
- Use the asm volatile trick to force the compiler to not optimize the
read into dummy variable. (David)
tools/testing/selftests/mm/split_huge_page_test.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/mm/split_huge_page_test.c b/tools/testing/selftests/mm/split_huge_page_test.c
index d3c7f5fb3e7b..e5e8dafc9d94 100644
--- a/tools/testing/selftests/mm/split_huge_page_test.c
+++ b/tools/testing/selftests/mm/split_huge_page_test.c
@@ -300,7 +300,7 @@ int create_pagecache_thp_and_fd(const char *testfile, size_t fd_size, int *fd,
char **addr)
{
size_t i;
- int __attribute__((unused)) dummy = 0;
+ int dummy = 0;
srand(time(NULL));
@@ -341,6 +341,7 @@ int create_pagecache_thp_and_fd(const char *testfile, size_t fd_size, int *fd,
for (size_t i = 0; i < fd_size; i++)
dummy += *(*addr + i);
+ asm volatile("" : "+r" (dummy));
if (!check_huge_file(*addr, fd_size / pmd_pagesize, pmd_pagesize)) {
ksft_print_msg("No large pagecache folio generated, please provide a filesystem supporting large folio\n");
base-commit: d97496ca23a2d4ee80b7302849404859d9058bcd
--
2.44.1
The purpose of this series is to rethink how HID-BPF is invoked.
Currently it implies a jmp table, a prog fd bpf_map, a preloaded tracing
bpf program and a lot of manual work for handling the bpf program
lifetime and addition/removal.
OTOH, bpf_struct_ops take care of most of the bpf handling leaving us
with a simple list of ops pointers, and we can directly call the
struct_ops program from the kernel as a regular function.
The net gain right now is in term of code simplicity and lines of code
removal (though is an API breakage), but udev-hid-bpf is able to handle
such breakages.
In the near future, we will be able to extend the HID-BPF struct_ops
with entrypoints for hid_hw_raw_request() and hid_hw_output_report(),
allowing for covering all of the initial use cases:
- firewalling a HID device
- fixing all of the HID device interactions (not just device events as
it is right now).
The matching user-space loader (udev-hid-bpf) MR is at
https://gitlab.freedesktop.org/libevdev/udev-hid-bpf/-/merge_requests/86
I'll put it out of draft once this is merged.
Cheers,
Benjamin
Signed-off-by: Benjamin Tissoires <bentiss(a)kernel.org>
---
Changes in v2:
- drop HID_BPF_FLAGS enum and use BPF_F_BEFORE instead
- fix .init_members to not open code member->offset
- allow struct hid_device to be writeable from HID-BPF for its name,
uniq and phys
- Link to v1: https://lore.kernel.org/r/20240528-hid_bpf_struct_ops-v1-0-8c6663df27d8@ker…
---
Benjamin Tissoires (16):
HID: rename struct hid_bpf_ops into hid_ops
HID: bpf: add hid_get/put_device() helpers
HID: bpf: implement HID-BPF through bpf_struct_ops
selftests/hid: convert the hid_bpf selftests with struct_ops
HID: samples: convert the 2 HID-BPF samples into struct_ops
HID: bpf: add defines for HID-BPF SEC in in-tree bpf fixes
HID: bpf: convert in-tree fixes into struct_ops
HID: bpf: remove tracing HID-BPF capability
selftests/hid: add subprog call test
Documentation: HID: amend HID-BPF for struct_ops
Documentation: HID: add a small blurb on udev-hid-bpf
HID: bpf: Artist24: remove unused variable
HID: bpf: error on warnings when compiling bpf objects
bpf: allow bpf helpers to be used into HID-BPF struct_ops
HID: bpf: rework hid_bpf_ops_btf_struct_access
HID: bpf: make part of struct hid_device writable
Documentation/hid/hid-bpf.rst | 173 ++++---
drivers/hid/bpf/Makefile | 2 +-
drivers/hid/bpf/entrypoints/Makefile | 93 ----
drivers/hid/bpf/entrypoints/README | 4 -
drivers/hid/bpf/entrypoints/entrypoints.bpf.c | 25 -
drivers/hid/bpf/entrypoints/entrypoints.lskel.h | 248 ---------
drivers/hid/bpf/hid_bpf_dispatch.c | 266 +++-------
drivers/hid/bpf/hid_bpf_dispatch.h | 12 +-
drivers/hid/bpf/hid_bpf_jmp_table.c | 565 ---------------------
drivers/hid/bpf/hid_bpf_struct_ops.c | 298 +++++++++++
drivers/hid/bpf/progs/FR-TEC__Raptor-Mach-2.bpf.c | 9 +-
drivers/hid/bpf/progs/HP__Elite-Presenter.bpf.c | 6 +-
drivers/hid/bpf/progs/Huion__Kamvas-Pro-19.bpf.c | 9 +-
.../hid/bpf/progs/IOGEAR__Kaliber-MMOmentum.bpf.c | 6 +-
drivers/hid/bpf/progs/Makefile | 2 +-
.../hid/bpf/progs/Microsoft__XBox-Elite-2.bpf.c | 6 +-
drivers/hid/bpf/progs/Wacom__ArtPen.bpf.c | 6 +-
drivers/hid/bpf/progs/XPPen__Artist24.bpf.c | 10 +-
drivers/hid/bpf/progs/XPPen__ArtistPro16Gen2.bpf.c | 24 +-
drivers/hid/bpf/progs/hid_bpf.h | 5 +
drivers/hid/hid-core.c | 6 +-
include/linux/hid_bpf.h | 125 ++---
samples/hid/Makefile | 5 +-
samples/hid/hid_bpf_attach.bpf.c | 18 -
samples/hid/hid_bpf_attach.h | 14 -
samples/hid/hid_mouse.bpf.c | 26 +-
samples/hid/hid_mouse.c | 39 +-
samples/hid/hid_surface_dial.bpf.c | 10 +-
samples/hid/hid_surface_dial.c | 53 +-
tools/testing/selftests/hid/hid_bpf.c | 100 +++-
tools/testing/selftests/hid/progs/hid.c | 100 +++-
.../testing/selftests/hid/progs/hid_bpf_helpers.h | 19 +-
32 files changed, 805 insertions(+), 1479 deletions(-)
---
base-commit: 70ec81c2e2b4005465ad0d042e90b36087c36104
change-id: 20240513-hid_bpf_struct_ops-e3212a224555
Best regards,
--
Benjamin Tissoires <bentiss(a)kernel.org>
This series fixes build errors found by clang to allow the x86 suite to
get built with the clang.
Unfortunately, there is one bug [1] in the clang becuase of which
extended asm isn't handled correctly by it and build fails for
sysret_rip.c. Hence even after this series the build of this test would
fail with clang. Should we disable this test for now when clang is used
until the bug is fixed in clang? Not sure. Any opinions?
[1] https://github.com/llvm/llvm-project/issues/53728
Muhammad Usama Anjum (8):
selftests: x86: Remove dependence of headers file
selftests: x86: check_initial_reg_state: remove -no-pie while using
-static
selftests: x86: test_vsyscall: remove unused function
selftests: x86: fsgsbase_restore: fix asm directive from =rm to =r
selftests: x86: syscall_arg_fault_32: remove unused variable
selftests: x86: test_FISTTP: use fisttps instead of ambigous fisttp
selftests: x86: fsgsbase: Remove unused function and variable
selftests: x86: amx: Remove unused functions
tools/testing/selftests/x86/Makefile | 9 +++++----
tools/testing/selftests/x86/amx.c | 16 ----------------
tools/testing/selftests/x86/fsgsbase.c | 6 ------
tools/testing/selftests/x86/fsgsbase_restore.c | 2 +-
tools/testing/selftests/x86/syscall_arg_fault.c | 1 -
tools/testing/selftests/x86/test_FISTTP.c | 8 ++++----
tools/testing/selftests/x86/test_vsyscall.c | 5 -----
7 files changed, 10 insertions(+), 37 deletions(-)
--
2.39.2
Commit 1b151e2435fc ("block: Remove special-casing of compound
pages") caused a change in behaviour when releasing the pages
if the buffer does not start at the beginning of the page. This
was because the calculation of the number of pages to release
was incorrect.
This was fixed by commit 38b43539d64b ("block: Fix page refcounts
for unaligned buffers in __bio_release_pages()").
We pin the user buffer during direct I/O writes. If this buffer is a
hugepage, bio_release_page() will unpin it and decrement all references
and pin counts at ->bi_end_io. However, if any references to the hugepage
remain post-I/O, the hugepage will not be freed upon unmap, leading
to a memory leak.
This patch verifies that a hugepage, used as a user buffer for DIO
operations, is correctly freed upon unmapping, regardless of whether
the offsets are aligned or unaligned w.r.t page boundary.
Test Result Fail Scenario (Without the fix)
--------------------------------------------------------
[]# ./hugetlb_dio
TAP version 13
1..4
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 1 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 2 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 3 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 6
not ok 4 : Huge pages not freed!
Totals: pass:3 fail:1 xfail:0 xpass:0 skip:0 error:0
Test Result PASS Scenario (With the fix)
---------------------------------------------------------
[]#./hugetlb_dio
TAP version 13
1..4
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 1 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 2 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 3 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 4 : Huge pages freed successfully !
Totals: pass:4 fail:0 xfail:0 xpass:0 skip:0 error:0
V4:
- Added this test to run_vmtests.sh.
V3:
- Fixed the build error when it is compiled with _FORTIFY_SOURCE.
V2:
- Addressed all review commets from Muhammad Usama Anjum
https://lore.kernel.org/all/20240604132801.23377-1-donettom@linux.ibm.com/
V1:
https://lore.kernel.org/all/20240523063905.3173-1-donettom@linux.ibm.com/#t
Signed-off-by: Donet Tom <donettom(a)linux.ibm.com>
Co-developed-by: Ritesh Harjani (IBM) <ritesh.list(a)gmail.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list(a)gmail.com>
Reviewed-by: Muhammad Usama Anjum <usama.anjum(a)collabora.com>
---
tools/testing/selftests/mm/Makefile | 1 +
tools/testing/selftests/mm/hugetlb_dio.c | 118 ++++++++++++++++++++++
tools/testing/selftests/mm/run_vmtests.sh | 1 +
3 files changed, 120 insertions(+)
create mode 100644 tools/testing/selftests/mm/hugetlb_dio.c
diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
index 3b49bc3d0a3b..a1748a4c7df1 100644
--- a/tools/testing/selftests/mm/Makefile
+++ b/tools/testing/selftests/mm/Makefile
@@ -73,6 +73,7 @@ TEST_GEN_FILES += ksm_functional_tests
TEST_GEN_FILES += mdwe_test
TEST_GEN_FILES += hugetlb_fault_after_madv
TEST_GEN_FILES += hugetlb_madv_vs_map
+TEST_GEN_FILES += hugetlb_dio
ifneq ($(ARCH),arm64)
TEST_GEN_FILES += soft-dirty
diff --git a/tools/testing/selftests/mm/hugetlb_dio.c b/tools/testing/selftests/mm/hugetlb_dio.c
new file mode 100644
index 000000000000..986f3b6c7f7b
--- /dev/null
+++ b/tools/testing/selftests/mm/hugetlb_dio.c
@@ -0,0 +1,118 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * This program tests for hugepage leaks after DIO writes to a file using a
+ * hugepage as the user buffer. During DIO, the user buffer is pinned and
+ * should be properly unpinned upon completion. This patch verifies that the
+ * kernel correctly unpins the buffer at DIO completion for both aligned and
+ * unaligned user buffer offsets (w.r.t page boundary), ensuring the hugepage
+ * is freed upon unmapping.
+ */
+
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <sys/stat.h>
+#include <stdlib.h>
+#include <fcntl.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <string.h>
+#include <sys/mman.h>
+#include "vm_util.h"
+#include "../kselftest.h"
+
+void run_dio_using_hugetlb(unsigned int start_off, unsigned int end_off)
+{
+ int fd;
+ char *buffer = NULL;
+ char *orig_buffer = NULL;
+ size_t h_pagesize = 0;
+ size_t writesize;
+ int free_hpage_b = 0;
+ int free_hpage_a = 0;
+ const int mmap_flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB;
+ const int mmap_prot = PROT_READ | PROT_WRITE;
+
+ writesize = end_off - start_off;
+
+ /* Get the default huge page size */
+ h_pagesize = default_huge_page_size();
+ if (!h_pagesize)
+ ksft_exit_fail_msg("Unable to determine huge page size\n");
+
+ /* Open the file to DIO */
+ fd = open("/tmp", O_TMPFILE | O_RDWR | O_DIRECT, 0664);
+ if (fd < 0)
+ ksft_exit_fail_perror("Error opening file\n");
+
+ /* Get the free huge pages before allocation */
+ free_hpage_b = get_free_hugepages();
+ if (free_hpage_b == 0) {
+ close(fd);
+ ksft_exit_skip("No free hugepage, exiting!\n");
+ }
+
+ /* Allocate a hugetlb page */
+ orig_buffer = mmap(NULL, h_pagesize, mmap_prot, mmap_flags, -1, 0);
+ if (orig_buffer == MAP_FAILED) {
+ close(fd);
+ ksft_exit_fail_perror("Error mapping memory\n");
+ }
+ buffer = orig_buffer;
+ buffer += start_off;
+
+ memset(buffer, 'A', writesize);
+
+ /* Write the buffer to the file */
+ if (write(fd, buffer, writesize) != (writesize)) {
+ munmap(orig_buffer, h_pagesize);
+ close(fd);
+ ksft_exit_fail_perror("Error writing to file\n");
+ }
+
+ /* unmap the huge page */
+ munmap(orig_buffer, h_pagesize);
+ close(fd);
+
+ /* Get the free huge pages after unmap*/
+ free_hpage_a = get_free_hugepages();
+
+ /*
+ * If the no. of free hugepages before allocation and after unmap does
+ * not match - that means there could still be a page which is pinned.
+ */
+ if (free_hpage_a != free_hpage_b) {
+ ksft_print_msg("No. Free pages before allocation : %d\n", free_hpage_b);
+ ksft_print_msg("No. Free pages after munmap : %d\n", free_hpage_a);
+ ksft_test_result_fail(": Huge pages not freed!\n");
+ } else {
+ ksft_print_msg("No. Free pages before allocation : %d\n", free_hpage_b);
+ ksft_print_msg("No. Free pages after munmap : %d\n", free_hpage_a);
+ ksft_test_result_pass(": Huge pages freed successfully !\n");
+ }
+}
+
+int main(void)
+{
+ size_t pagesize = 0;
+
+ ksft_print_header();
+ ksft_set_plan(4);
+
+ /* Get base page size */
+ pagesize = psize();
+
+ /* start and end is aligned to pagesize */
+ run_dio_using_hugetlb(0, (pagesize * 3));
+
+ /* start is aligned but end is not aligned */
+ run_dio_using_hugetlb(0, (pagesize * 3) - (pagesize / 2));
+
+ /* start is unaligned and end is aligned */
+ run_dio_using_hugetlb(pagesize / 2, (pagesize * 3));
+
+ /* both start and end are unaligned */
+ run_dio_using_hugetlb(pagesize / 2, (pagesize * 3) + (pagesize / 2));
+
+ ksft_finished();
+}
+
diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/selftests/mm/run_vmtests.sh
index 3157204b9047..5698d519170d 100755
--- a/tools/testing/selftests/mm/run_vmtests.sh
+++ b/tools/testing/selftests/mm/run_vmtests.sh
@@ -265,6 +265,7 @@ CATEGORY="hugetlb" run_test ./map_hugetlb
CATEGORY="hugetlb" run_test ./hugepage-mremap
CATEGORY="hugetlb" run_test ./hugepage-vmemmap
CATEGORY="hugetlb" run_test ./hugetlb-madvise
+CATEGORY="hugetlb" run_test ./hugetlb_dio
nr_hugepages_tmp=$(cat /proc/sys/vm/nr_hugepages)
# For this test, we need one and just one huge page
--
2.43.0
in the main function of vdso_restorer.c,there is a dlopen function,
but there is no dlclose function to close the file
Signed-off-by: liujing <liujing(a)cmss.chinamobile.com>
---
tools/testing/selftests/x86/vdso_restorer.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tools/testing/selftests/x86/vdso_restorer.c b/tools/testing/selftests/x86/vdso_restorer.c
index fe99f2434155..a0b1155dee31 100644
--- a/tools/testing/selftests/x86/vdso_restorer.c
+++ b/tools/testing/selftests/x86/vdso_restorer.c
@@ -57,6 +57,8 @@ int main()
return 0;
}
+ dlclose(vdso);
+
memset(&sa, 0, sizeof(sa));
sa.handler = handler_with_siginfo;
sa.flags = SA_SIGINFO;
--
2.18.2
Conform individual tests to TAP output. One patch conform one test. With
this series, all vDSO tests become TAP conformant.
Muhammad Usama Anjum (4):
kselftests: vdso: vdso_test_clock_getres: conform test to TAP output
kselftests: vdso: vdso_test_correctness: conform test to TAP output
kselftests: vdso: vdso_test_getcpu: conform test to TAP output
kselftests: vdso: vdso_test_gettimeofday: conform test to TAP output
.../selftests/vDSO/vdso_test_clock_getres.c | 68 ++++----
.../selftests/vDSO/vdso_test_correctness.c | 146 +++++++++---------
.../testing/selftests/vDSO/vdso_test_getcpu.c | 16 +-
.../selftests/vDSO/vdso_test_gettimeofday.c | 23 +--
4 files changed, 126 insertions(+), 127 deletions(-)
--
2.39.2
The kselftests may be built in a couple different ways:
make LLVM=1
make CC=clang
In order to handle both cases, set LLVM=1 if CC=clang. That way,the rest
of lib.mk, and any Makefiles that include lib.mk, can base decisions
solely on whether or not LLVM is set.
Then, build upon that to disable a pair of clang warnings that are
already silenced on gcc.
Doing it this way is much better than the piecemeal approach that I
started with in [1] and [2]. Thanks to Nathan Chancellor for the patch
reviews that led to this approach.
Changes since the first version:
1) Wrote a detailed explanation for suppressing two clang warnings, in
both a lib.mk comment, and the commit description.
2) Added a Reviewed-by tag to the first patch.
[1] https://lore.kernel.org/20240527214704.300444-1-jhubbard@nvidia.com
[2] https://lore.kernel.org/20240527213641.299458-1-jhubbard@nvidia.com
John Hubbard (2):
selftests/lib.mk: handle both LLVM=1 and CC=clang builds
selftests/lib.mk: silence some clang warnings that gcc already ignores
tools/testing/selftests/lib.mk | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
base-commit: e0cce98fe279b64f4a7d81b7f5c3a23d80b92fbc
--
2.45.1
Commit 1b151e2435fc ("block: Remove special-casing of compound
pages") caused a change in behaviour when releasing the pages
if the buffer does not start at the beginning of the page. This
was because the calculation of the number of pages to release
was incorrect.
This was fixed by commit 38b43539d64b ("block: Fix page refcounts
for unaligned buffers in __bio_release_pages()").
We pin the user buffer during direct I/O writes. If this buffer is a
hugepage, bio_release_page() will unpin it and decrement all references
and pin counts at ->bi_end_io. However, if any references to the hugepage
remain post-I/O, the hugepage will not be freed upon unmap, leading
to a memory leak.
This patch verifies that a hugepage, used as a user buffer for DIO
operations, is correctly freed upon unmapping, regardless of whether
the offsets are aligned or unaligned w.r.t page boundary.
Test Result Fail Scenario (Without the fix)
--------------------------------------------------------
[]# ./hugetlb_dio
TAP version 13
1..4
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 1 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 2 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 3 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 6
not ok 4 : Huge pages not freed!
Totals: pass:3 fail:1 xfail:0 xpass:0 skip:0 error:0
Test Result PASS Scenario (With the fix)
---------------------------------------------------------
[]#./hugetlb_dio
TAP version 13
1..4
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 1 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 2 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 3 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 4 : Huge pages freed successfully !
Totals: pass:4 fail:0 xfail:0 xpass:0 skip:0 error:0
V3:
- Fixed the build error when it is compiled with _FORTIFY_SOURCE.
V2:
- Addressed all review commets from Muhammad Usama Anjum
https://lore.kernel.org/all/20240604132801.23377-1-donettom@linux.ibm.com/
V1:
https://lore.kernel.org/all/20240523063905.3173-1-donettom@linux.ibm.com/#t
Signed-off-by: Donet Tom <donettom(a)linux.ibm.com>
Co-developed-by: Ritesh Harjani (IBM) <ritesh.list(a)gmail.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list(a)gmail.com>
---
tools/testing/selftests/mm/Makefile | 1 +
tools/testing/selftests/mm/hugetlb_dio.c | 118 +++++++++++++++++++++++
2 files changed, 119 insertions(+)
create mode 100644 tools/testing/selftests/mm/hugetlb_dio.c
diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
index 3b49bc3d0a3b..a1748a4c7df1 100644
--- a/tools/testing/selftests/mm/Makefile
+++ b/tools/testing/selftests/mm/Makefile
@@ -73,6 +73,7 @@ TEST_GEN_FILES += ksm_functional_tests
TEST_GEN_FILES += mdwe_test
TEST_GEN_FILES += hugetlb_fault_after_madv
TEST_GEN_FILES += hugetlb_madv_vs_map
+TEST_GEN_FILES += hugetlb_dio
ifneq ($(ARCH),arm64)
TEST_GEN_FILES += soft-dirty
diff --git a/tools/testing/selftests/mm/hugetlb_dio.c b/tools/testing/selftests/mm/hugetlb_dio.c
new file mode 100644
index 000000000000..986f3b6c7f7b
--- /dev/null
+++ b/tools/testing/selftests/mm/hugetlb_dio.c
@@ -0,0 +1,118 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * This program tests for hugepage leaks after DIO writes to a file using a
+ * hugepage as the user buffer. During DIO, the user buffer is pinned and
+ * should be properly unpinned upon completion. This patch verifies that the
+ * kernel correctly unpins the buffer at DIO completion for both aligned and
+ * unaligned user buffer offsets (w.r.t page boundary), ensuring the hugepage
+ * is freed upon unmapping.
+ */
+
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <sys/stat.h>
+#include <stdlib.h>
+#include <fcntl.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <string.h>
+#include <sys/mman.h>
+#include "vm_util.h"
+#include "../kselftest.h"
+
+void run_dio_using_hugetlb(unsigned int start_off, unsigned int end_off)
+{
+ int fd;
+ char *buffer = NULL;
+ char *orig_buffer = NULL;
+ size_t h_pagesize = 0;
+ size_t writesize;
+ int free_hpage_b = 0;
+ int free_hpage_a = 0;
+ const int mmap_flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB;
+ const int mmap_prot = PROT_READ | PROT_WRITE;
+
+ writesize = end_off - start_off;
+
+ /* Get the default huge page size */
+ h_pagesize = default_huge_page_size();
+ if (!h_pagesize)
+ ksft_exit_fail_msg("Unable to determine huge page size\n");
+
+ /* Open the file to DIO */
+ fd = open("/tmp", O_TMPFILE | O_RDWR | O_DIRECT, 0664);
+ if (fd < 0)
+ ksft_exit_fail_perror("Error opening file\n");
+
+ /* Get the free huge pages before allocation */
+ free_hpage_b = get_free_hugepages();
+ if (free_hpage_b == 0) {
+ close(fd);
+ ksft_exit_skip("No free hugepage, exiting!\n");
+ }
+
+ /* Allocate a hugetlb page */
+ orig_buffer = mmap(NULL, h_pagesize, mmap_prot, mmap_flags, -1, 0);
+ if (orig_buffer == MAP_FAILED) {
+ close(fd);
+ ksft_exit_fail_perror("Error mapping memory\n");
+ }
+ buffer = orig_buffer;
+ buffer += start_off;
+
+ memset(buffer, 'A', writesize);
+
+ /* Write the buffer to the file */
+ if (write(fd, buffer, writesize) != (writesize)) {
+ munmap(orig_buffer, h_pagesize);
+ close(fd);
+ ksft_exit_fail_perror("Error writing to file\n");
+ }
+
+ /* unmap the huge page */
+ munmap(orig_buffer, h_pagesize);
+ close(fd);
+
+ /* Get the free huge pages after unmap*/
+ free_hpage_a = get_free_hugepages();
+
+ /*
+ * If the no. of free hugepages before allocation and after unmap does
+ * not match - that means there could still be a page which is pinned.
+ */
+ if (free_hpage_a != free_hpage_b) {
+ ksft_print_msg("No. Free pages before allocation : %d\n", free_hpage_b);
+ ksft_print_msg("No. Free pages after munmap : %d\n", free_hpage_a);
+ ksft_test_result_fail(": Huge pages not freed!\n");
+ } else {
+ ksft_print_msg("No. Free pages before allocation : %d\n", free_hpage_b);
+ ksft_print_msg("No. Free pages after munmap : %d\n", free_hpage_a);
+ ksft_test_result_pass(": Huge pages freed successfully !\n");
+ }
+}
+
+int main(void)
+{
+ size_t pagesize = 0;
+
+ ksft_print_header();
+ ksft_set_plan(4);
+
+ /* Get base page size */
+ pagesize = psize();
+
+ /* start and end is aligned to pagesize */
+ run_dio_using_hugetlb(0, (pagesize * 3));
+
+ /* start is aligned but end is not aligned */
+ run_dio_using_hugetlb(0, (pagesize * 3) - (pagesize / 2));
+
+ /* start is unaligned and end is aligned */
+ run_dio_using_hugetlb(pagesize / 2, (pagesize * 3));
+
+ /* both start and end are unaligned */
+ run_dio_using_hugetlb(pagesize / 2, (pagesize * 3) + (pagesize / 2));
+
+ ksft_finished();
+}
+
--
2.43.0
From: Jeff Xu <jeffxu(a)google.com>
By default, memfd_create() creates a non-sealable MFD, unless the
MFD_ALLOW_SEALING flag is set.
When the MFD_NOEXEC_SEAL flag is initially introduced, the MFD created
with that flag is sealable, even though MFD_ALLOW_SEALING is not set.
This patch changes MFD_NOEXEC_SEAL to be non-sealable by default,
unless MFD_ALLOW_SEALING is explicitly set.
This is a non-backward compatible change. However, as MFD_NOEXEC_SEAL
is new, we expect not many applications will rely on the nature of
MFD_NOEXEC_SEAL being sealable. In most cases, the application already
sets MFD_ALLOW_SEALING if they need a sealable MFD.
Additionally, this enhances the useability of pid namespace sysctl
vm.memfd_noexec. When vm.memfd_noexec equals 1 or 2, the kernel will
add MFD_NOEXEC_SEAL if mfd_create does not specify MFD_EXEC or
MFD_NOEXEC_SEAL, and the addition of MFD_NOEXEC_SEAL enables the MFD
to be sealable. This means, any application that does not desire this
behavior will be unable to utilize vm.memfd_noexec = 1 or 2 to
migrate/enforce non-executable MFD. This adjustment ensures that
applications can anticipate that the sealable characteristic will
remain unmodified by vm.memfd_noexec.
This patch was initially developed by Barnabás Pőcze, and Barnabás
used Debian Code Search and GitHub to try to find potential breakages
and could only find a single one. Dbus-broker's memfd_create() wrapper
is aware of this implicit `MFD_ALLOW_SEALING` behavior, and tries to
work around it [1]. This workaround will break. Luckily, this only
affects the test suite, it does not affect
the normal operations of dbus-broker. There is a PR with a fix[2]. In
addition, David Rheinsberg also raised similar fix in [3]
[1]: https://github.com/bus1/dbus-broker/blob/9eb0b7e5826fc76cad7b025bc46f267d4a…
[2]: https://github.com/bus1/dbus-broker/pull/366
[3]: https://lore.kernel.org/lkml/20230714114753.170814-1-david@readahead.eu/
History
======
V2:
update commit message.
add testcase for vm.memfd_noexec
add documentation.
V1:
https://lore.kernel.org/lkml/20240513191544.94754-1-pobrn@protonmail.com/
Jeff Xu (2):
memfd: fix MFD_NOEXEC_SEAL to be non-sealable by default
memfd:add MEMFD_NOEXEC_SEAL documentation
Documentation/userspace-api/index.rst | 1 +
Documentation/userspace-api/mfd_noexec.rst | 90 ++++++++++++++++++++++
mm/memfd.c | 9 +--
tools/testing/selftests/memfd/memfd_test.c | 26 ++++++-
4 files changed, 120 insertions(+), 6 deletions(-)
create mode 100644 Documentation/userspace-api/mfd_noexec.rst
--
2.45.1.288.g0e0cd299f1-goog
`MFD_NOEXEC_SEAL` should remove the executable bits and set
`F_SEAL_EXEC` to prevent further modifications to the executable
bits as per the comment in the uapi header file:
not executable and sealed to prevent changing to executable
However, currently, it also unsets `F_SEAL_SEAL`, essentially
acting as a superset of `MFD_ALLOW_SEALING`. Nothing implies
that it should be so, and indeed up until the second version
of the of the patchset[0] that introduced `MFD_EXEC` and
`MFD_NOEXEC_SEAL`, `F_SEAL_SEAL` was not removed, however it
was changed in the third revision of the patchset[1] without
a clear explanation.
This behaviour is suprising for application developers,
there is no documentation that would reveal that `MFD_NOEXEC_SEAL`
has the additional effect of `MFD_ALLOW_SEALING`.
So do not remove `F_SEAL_SEAL` when `MFD_NOEXEC_SEAL` is requested.
This is technically an ABI break, but it seems very unlikely that an
application would depend on this behaviour (unless by accident).
[0]: https://lore.kernel.org/lkml/20220805222126.142525-3-jeffxu@google.com/
[1]: https://lore.kernel.org/lkml/20221202013404.163143-3-jeffxu@google.com/
Fixes: 105ff5339f498a ("mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC")
Signed-off-by: Barnabás Pőcze <pobrn(a)protonmail.com>
---
Or did I miss the explanation as to why MFD_NOEXEC_SEAL should
imply MFD_ALLOW_SEALING? If so, please direct me to it and
sorry for the noise.
---
mm/memfd.c | 9 ++++-----
tools/testing/selftests/memfd/memfd_test.c | 2 +-
2 files changed, 5 insertions(+), 6 deletions(-)
diff --git a/mm/memfd.c b/mm/memfd.c
index 7d8d3ab3fa37..8b7f6afee21d 100644
--- a/mm/memfd.c
+++ b/mm/memfd.c
@@ -356,12 +356,11 @@ SYSCALL_DEFINE2(memfd_create,
inode->i_mode &= ~0111;
file_seals = memfd_file_seals_ptr(file);
- if (file_seals) {
- *file_seals &= ~F_SEAL_SEAL;
+ if (file_seals)
*file_seals |= F_SEAL_EXEC;
- }
- } else if (flags & MFD_ALLOW_SEALING) {
- /* MFD_EXEC and MFD_ALLOW_SEALING are set */
+ }
+
+ if (flags & MFD_ALLOW_SEALING) {
file_seals = memfd_file_seals_ptr(file);
if (file_seals)
*file_seals &= ~F_SEAL_SEAL;
diff --git a/tools/testing/selftests/memfd/memfd_test.c b/tools/testing/selftests/memfd/memfd_test.c
index 18f585684e20..b6a7ad68c3c1 100644
--- a/tools/testing/selftests/memfd/memfd_test.c
+++ b/tools/testing/selftests/memfd/memfd_test.c
@@ -1151,7 +1151,7 @@ static void test_noexec_seal(void)
mfd_def_size,
MFD_CLOEXEC | MFD_NOEXEC_SEAL);
mfd_assert_mode(fd, 0666);
- mfd_assert_has_seals(fd, F_SEAL_EXEC);
+ mfd_assert_has_seals(fd, F_SEAL_SEAL | F_SEAL_EXEC);
mfd_fail_chmod(fd, 0777);
close(fd);
}
--
2.45.0
In order to be able to save the current value of a sysctl without changing
it, split the relevant bit out of sysctl_set() into a new helper.
Signed-off-by: Petr Machata <petrm(a)nvidia.com>
Reviewed-by: Ido Schimmel <idosch(a)nvidia.com>
---
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: linux-kselftest(a)vger.kernel.org
Notes:
v2:
- New patch.
tools/testing/selftests/net/forwarding/lib.sh | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/forwarding/lib.sh b/tools/testing/selftests/net/forwarding/lib.sh
index eabbdf00d8ca..9086d2015296 100644
--- a/tools/testing/selftests/net/forwarding/lib.sh
+++ b/tools/testing/selftests/net/forwarding/lib.sh
@@ -1134,12 +1134,19 @@ bridge_ageing_time_get()
}
declare -A SYSCTL_ORIG
+sysctl_save()
+{
+ local key=$1; shift
+
+ SYSCTL_ORIG[$key]=$(sysctl -n $key)
+}
+
sysctl_set()
{
local key=$1; shift
local value=$1; shift
- SYSCTL_ORIG[$key]=$(sysctl -n $key)
+ sysctl_save "$key"
sysctl -qw $key="$value"
}
--
2.45.0
This patch series is motivated by the following observation:
Raise a signal, jump to signal handler. The ucontext_t structure dumped
by kernel to userspace has a uc_sigmask field having the mask of blocked
signals. If you run a fresh minimalistic program doing this, this field
is empty, even if you block some signals while registering the handler
with sigaction().
Here is what the man-pages have to say:
sigaction(2): "sa_mask specifies a mask of signals which should be blocked
(i.e., added to the signal mask of the thread in which the signal handler
is invoked) during execution of the signal handler. In addition, the
signal which triggered the handler will be blocked, unless the SA_NODEFER
flag is used."
signal(7): Under "Execution of signal handlers", (1.3) implies:
"The thread's current signal mask is accessible via the ucontext_t
object that is pointed to by the third argument of the signal handler."
But, (1.4) states:
"Any signals specified in act->sa_mask when registering the handler with
sigprocmask(2) are added to the thread's signal mask. The signal being
delivered is also added to the signal mask, unless SA_NODEFER was
specified when registering the handler. These signals are thus blocked
while the handler executes."
There clearly is no distinction being made in the man pages between
"Thread's signal mask" and ucontext_t; this logically should imply
that a signal blocked by populating struct sigaction should be visible
in ucontext_t.
Here is what the kernel code does (for Aarch64):
do_signal() -> handle_signal() -> sigmask_to_save(), which returns
¤t->blocked, is passed to setup_rt_frame() -> setup_sigframe() ->
__copy_to_user(). Hence, ¤t->blocked is copied to ucontext_t
exposed to userspace. Returning back to handle_signal(),
signal_setup_done() -> signal_delivered() -> sigorsets() and
set_current_blocked() are responsible for using information from
struct ksignal ksig, which was populated through the sigaction()
system call in kernel/signal.c:
copy_from_user(&new_sa.sa, act, sizeof(new_sa.sa)),
to update ¤t->blocked; hence, the set of blocked signals for the
current thread is updated AFTER the kernel dumps ucontext_t to
userspace.
Assuming that the above is indeed the intended behaviour, because it
semantically makes sense, since the signals blocked using sigaction()
remain blocked only till the execution of the handler, and not in the
context present before jumping to the handler (but nothing can be
confirmed from the man-pages), the series introduces a test for
mangling with uc_sigmask. I will send a separate series to fix the
man-pages.
The proposed selftest has been tested out on Aarch32, Aarch64 and x86_64.
Dev Jain (2):
selftests: Rename sigaltstack to generic signal
selftests: Add a test mangling with uc_sigmask
tools/testing/selftests/Makefile | 2 +-
.../{sigaltstack => signal}/.gitignore | 3 +-
.../{sigaltstack => signal}/Makefile | 3 +-
.../current_stack_pointer.h | 0
.../selftests/signal/mangle_uc_sigmask.c | 141 ++++++++++++++++++
.../sas.c => signal/sigaltstack.c} | 0
6 files changed, 146 insertions(+), 3 deletions(-)
rename tools/testing/selftests/{sigaltstack => signal}/.gitignore (57%)
rename tools/testing/selftests/{sigaltstack => signal}/Makefile (53%)
rename tools/testing/selftests/{sigaltstack => signal}/current_stack_pointer.h (100%)
create mode 100644 tools/testing/selftests/signal/mangle_uc_sigmask.c
rename tools/testing/selftests/{sigaltstack/sas.c => signal/sigaltstack.c} (100%)
--
2.34.1
Hello,
We're pleased to announce the return of the Kernel Testing &
Dependability Micro-Conference at Linux Plumbers 2024:
https://lpc.events/event/18/contributions/1665/
You can already submit proposals by selecting the micro-conf in
the Track drop-down list:
https://lpc.events/login/?next=/event/18/abstracts/%23submit-abstract
Please note that the deadline for submissions is *Sunday 16th June*
The event description contains a list of suggested topics
inherited from past editions. Is there anything in particular
you would like to see discussed this year?
Knowing people's interests helps with triaging proposals and
making the micro-conf as relevant as possible. See you there!
Thanks,
Guillaume & Shuah & Sasha
When compiling with Android bionic, the MAP_HUGE_* and SHM_HUGE_* macros
are redefined because they are included from the uapi by sys/mman.h and
sys/shm.h:
INFO: From Compiling common/tools/testing/selftests/mm/thuge-gen.c:
common/tools/testing/selftests/mm/thuge-gen.c:32:9: warning: 'MAP_HUGE_2MB' macro redefined [-Wmacro-redefined]
32 | #define MAP_HUGE_2MB (21 << MAP_HUGE_SHIFT)
| ^
external/_main~_repo_rules~prebuilt_ndk/toolchains/llvm/prebuilt/linux-x86_64/sysroot/usr/include/linux/mman.h:38:9: note: previous definition is here
38 | #define MAP_HUGE_2MB HUGETLB_FLAG_ENCODE_2MB
| ^
common/tools/testing/selftests/mm/thuge-gen.c:33:9: warning: 'MAP_HUGE_1GB' macro redefined [-Wmacro-redefined]
33 | #define MAP_HUGE_1GB (30 << MAP_HUGE_SHIFT)
| ^
external/_main~_repo_rules~prebuilt_ndk/toolchains/llvm/prebuilt/linux-x86_64/sysroot/usr/include/linux/mman.h:44:9: note: previous definition is here
This test should probably use the uapi definitions instead of redefining
them. However, glibc gets struct redefinitions when including sys/shm.h
and linux/shm.h together. So, add guards for the SHM_HUGE_* macros
instead.
Edward Liaw (2):
selftests/mm: Include linux/mman.h
selftests/mm: Guard defines from shm
tools/testing/selftests/mm/thuge-gen.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
--
2.45.1.467.gbab1589fc0-goog
From: Geliang Tang <tanggeliang(a)kylinos.cn>
For moving dctcp test dedicated code out of do_test() into test_dctcp().
This patchset adds a new helper start_test() in bpf_tcp_ca.c to refactor
do_test().
Address Martin's comments for the previous series.
Geliang Tang (5):
selftests/bpf: Use connect_to_fd_opts in do_test in bpf_tcp_ca
selftests/bpf: Add start_test helper in bpf_tcp_ca
selftests/bpf: Use start_test in test_dctcp_fallback in bpf_tcp_ca
selftests/bpf: Use start_test in test_dctcp in bpf_tcp_ca
selftests/bpf: Drop useless arguments of do_test in bpf_tcp_ca
.../selftests/bpf/prog_tests/bpf_tcp_ca.c | 140 +++++++++++-------
1 file changed, 85 insertions(+), 55 deletions(-)
--
2.43.0
From: Pankaj Raghav <p.raghav(a)samsung.com>
create_pagecache_thp_and_fd() in split_huge_page_test.c used the
variable dummy to perform mmap read.
However, this test was skipped even on XFS which has large folio
support. The issue was compiler (gcc 13.2.0) was optimizing out the
dummy variable, therefore, not creating huge page in the page cache.
Add volatile keyword to force compiler not to optimize out the loop
where we read from the mmaped addr.
Signed-off-by: Pankaj Raghav <p.raghav(a)samsung.com>
---
tools/testing/selftests/mm/split_huge_page_test.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/mm/split_huge_page_test.c b/tools/testing/selftests/mm/split_huge_page_test.c
index d3c7f5fb3e7b..c573a58f80ab 100644
--- a/tools/testing/selftests/mm/split_huge_page_test.c
+++ b/tools/testing/selftests/mm/split_huge_page_test.c
@@ -300,7 +300,7 @@ int create_pagecache_thp_and_fd(const char *testfile, size_t fd_size, int *fd,
char **addr)
{
size_t i;
- int __attribute__((unused)) dummy = 0;
+ volatile int __attribute__((unused)) dummy = 0;
srand(time(NULL));
base-commit: d97496ca23a2d4ee80b7302849404859d9058bcd
--
2.44.1
From: Pankaj Raghav <p.raghav(a)samsung.com>
create_pagecache_thp_and_fd() in split_huge_page_test.c used the
variable dummy to perform mmap read.
However, this test was skipped even on XFS which has large folio
support. The issue was compiler (gcc 13.2.0) was optimizing out the
dummy variable, therefore, not creating huge page in the page cache.
Make it as a global variable to force the compiler not to optimize out
the loop where we read from the mmaped addr.
Signed-off-by: Pankaj Raghav <p.raghav(a)samsung.com>
---
Changes since v1:
- Make the dummy variable as a global variable(willy).
tools/testing/selftests/mm/split_huge_page_test.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/mm/split_huge_page_test.c b/tools/testing/selftests/mm/split_huge_page_test.c
index d3c7f5fb3e7b..c4857de2c042 100644
--- a/tools/testing/selftests/mm/split_huge_page_test.c
+++ b/tools/testing/selftests/mm/split_huge_page_test.c
@@ -23,6 +23,11 @@
uint64_t pagesize;
unsigned int pageshift;
uint64_t pmd_pagesize;
+/*
+ * Used by create_pagecache_thp_and_fd() to do mmap read.
+ * Made it as global to avoid compiler optimizing out the variable.
+ */
+int dummy;
#define SPLIT_DEBUGFS "/sys/kernel/debug/split_huge_pages"
#define SMAP_PATH "/proc/self/smaps"
@@ -300,7 +305,6 @@ int create_pagecache_thp_and_fd(const char *testfile, size_t fd_size, int *fd,
char **addr)
{
size_t i;
- int __attribute__((unused)) dummy = 0;
srand(time(NULL));
base-commit: d97496ca23a2d4ee80b7302849404859d9058bcd
--
2.44.1
While looking at using 'lib.sh' for the MPTCP selftests [1], we found
some small issues with 'lib.sh'. Here they are:
- Patch 1: fix 'errexit' (set -e) support with busywait. 'errexit' is
supported in some functions, not all. A fix for v6.8+.
- Patch 2: avoid confusing error messages linked to the cleaning part
when the netns setup fails. A fix for v6.8+.
- Patch 3: set a variable as local to avoid accidentally changing the
value of a another one with the same name on the caller side. A fix
for v6.10-rc1+.
Link: https://lore.kernel.org/mptcp/5f4615c3-0621-43c5-ad25-55747a4350ce@kernel.o… [1]
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Matthieu Baerts (NGI0) (3):
selftests: net: lib: support errexit with busywait
selftests: net: lib: avoid error removing empty netns name
selftests: net: lib: set 'i' as local
tools/testing/selftests/net/lib.sh | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
---
base-commit: a535d59432370343058755100ee75ab03c0e3f91
change-id: 20240605-upstream-net-20240605-selftests-net-lib-fixes-7a90a1a8d9d2
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Commit 1b151e2435fc ("block: Remove special-casing of compound
pages") caused a change in behaviour when releasing the pages
if the buffer does not start at the beginning of the page. This
was because the calculation of the number of pages to release
was incorrect.
This was fixed by commit 38b43539d64b ("block: Fix page refcounts
for unaligned buffers in __bio_release_pages()").
We pin the user buffer during direct I/O writes. If this buffer is a
hugepage, bio_release_page() will unpin it and decrement all references
and pin counts at ->bi_end_io. However, if any references to the hugepage
remain post-I/O, the hugepage will not be freed upon unmap, leading
to a memory leak.
This patch verifies that a hugepage, used as a user buffer for DIO
operations, is correctly freed upon unmapping, regardless of whether
the offsets are aligned or unaligned w.r.t page boundary.
Test Result Fail Scenario (Without the fix)
--------------------------------------------------------
[]# ./hugetlb_dio
TAP version 13
1..4
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 1 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 2 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 3 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 6
not ok 4 : Huge pages not freed!
Totals: pass:3 fail:1 xfail:0 xpass:0 skip:0 error:0
Test Result PASS Scenario (With the fix)
---------------------------------------------------------
[]#./hugetlb_dio
TAP version 13
1..4
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 1 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 2 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 3 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 4 : Huge pages freed successfully !
Totals: pass:4 fail:0 xfail:0 xpass:0 skip:0 error:0
V2:
- Addressed all review commets from Muhammad Usama Anjum
V1:
https://lore.kernel.org/all/20240523063905.3173-1-donettom@linux.ibm.com/#t
Signed-off-by: Donet Tom <donettom(a)linux.ibm.com>
Co-developed-by: Ritesh Harjani (IBM) <ritesh.list(a)gmail.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list(a)gmail.com>
---
tools/testing/selftests/mm/Makefile | 1 +
tools/testing/selftests/mm/hugetlb_dio.c | 118 +++++++++++++++++++++++
2 files changed, 119 insertions(+)
create mode 100644 tools/testing/selftests/mm/hugetlb_dio.c
diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
index 3b49bc3d0a3b..a1748a4c7df1 100644
--- a/tools/testing/selftests/mm/Makefile
+++ b/tools/testing/selftests/mm/Makefile
@@ -73,6 +73,7 @@ TEST_GEN_FILES += ksm_functional_tests
TEST_GEN_FILES += mdwe_test
TEST_GEN_FILES += hugetlb_fault_after_madv
TEST_GEN_FILES += hugetlb_madv_vs_map
+TEST_GEN_FILES += hugetlb_dio
ifneq ($(ARCH),arm64)
TEST_GEN_FILES += soft-dirty
diff --git a/tools/testing/selftests/mm/hugetlb_dio.c b/tools/testing/selftests/mm/hugetlb_dio.c
new file mode 100644
index 000000000000..e4f4924179c8
--- /dev/null
+++ b/tools/testing/selftests/mm/hugetlb_dio.c
@@ -0,0 +1,118 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * This program tests for hugepage leaks after DIO writes to a file using a
+ * hugepage as the user buffer. During DIO, the user buffer is pinned and
+ * should be properly unpinned upon completion. This patch verifies that the
+ * kernel correctly unpins the buffer at DIO completion for both aligned and
+ * unaligned user buffer offsets (w.r.t page boundary), ensuring the hugepage
+ * is freed upon unmapping.
+ */
+
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <sys/stat.h>
+#include <stdlib.h>
+#include <fcntl.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <string.h>
+#include <sys/mman.h>
+#include "vm_util.h"
+#include "../kselftest.h"
+
+void run_dio_using_hugetlb(unsigned int start_off, unsigned int end_off)
+{
+ int fd;
+ char *buffer = NULL;
+ char *orig_buffer = NULL;
+ size_t h_pagesize = 0;
+ size_t writesize;
+ int free_hpage_b = 0;
+ int free_hpage_a = 0;
+ const int mmap_flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB;
+ const int mmap_prot = PROT_READ | PROT_WRITE;
+
+ writesize = end_off - start_off;
+
+ /* Get the default huge page size */
+ h_pagesize = default_huge_page_size();
+ if (!h_pagesize)
+ ksft_exit_fail_msg("Unable to determine huge page size\n");
+
+ /* Open the file to DIO */
+ fd = open("/tmp", O_TMPFILE | O_RDWR | O_DIRECT);
+ if (fd < 0)
+ ksft_exit_fail_perror("Error opening file\n");
+
+ /* Get the free huge pages before allocation */
+ free_hpage_b = get_free_hugepages();
+ if (free_hpage_b == 0) {
+ close(fd);
+ ksft_exit_skip("No free hugepage, exiting!\n");
+ }
+
+ /* Allocate a hugetlb page */
+ orig_buffer = mmap(NULL, h_pagesize, mmap_prot, mmap_flags, -1, 0);
+ if (orig_buffer == MAP_FAILED) {
+ close(fd);
+ ksft_exit_fail_perror("Error mapping memory\n");
+ }
+ buffer = orig_buffer;
+ buffer += start_off;
+
+ memset(buffer, 'A', writesize);
+
+ /* Write the buffer to the file */
+ if (write(fd, buffer, writesize) != (writesize)) {
+ munmap(orig_buffer, h_pagesize);
+ close(fd);
+ ksft_exit_fail_perror("Error writing to file\n");
+ }
+
+ /* unmap the huge page */
+ munmap(orig_buffer, h_pagesize);
+ close(fd);
+
+ /* Get the free huge pages after unmap*/
+ free_hpage_a = get_free_hugepages();
+
+ /*
+ * If the no. of free hugepages before allocation and after unmap does
+ * not match - that means there could still be a page which is pinned.
+ */
+ if (free_hpage_a != free_hpage_b) {
+ ksft_print_msg("No. Free pages before allocation : %d\n", free_hpage_b);
+ ksft_print_msg("No. Free pages after munmap : %d\n", free_hpage_a);
+ ksft_test_result_fail(": Huge pages not freed!\n");
+ } else {
+ ksft_print_msg("No. Free pages before allocation : %d\n", free_hpage_b);
+ ksft_print_msg("No. Free pages after munmap : %d\n", free_hpage_a);
+ ksft_test_result_pass(": Huge pages freed successfully !\n");
+ }
+}
+
+int main(void)
+{
+ size_t pagesize = 0;
+
+ ksft_print_header();
+ ksft_set_plan(4);
+
+ /* Get base page size */
+ pagesize = psize();
+
+ /* start and end is aligned to pagesize */
+ run_dio_using_hugetlb(0, (pagesize * 3));
+
+ /* start is aligned but end is not aligned */
+ run_dio_using_hugetlb(0, (pagesize * 3) - (pagesize / 2));
+
+ /* start is unaligned and end is aligned */
+ run_dio_using_hugetlb(pagesize / 2, (pagesize * 3));
+
+ /* both start and end are unaligned */
+ run_dio_using_hugetlb(pagesize / 2, (pagesize * 3) + (pagesize / 2));
+
+ ksft_finished();
+}
+
--
2.43.0
Fixed MAC addresses help with debugging as last four bytes identify the
network namespace.
Signed-off-by: Lukasz Majewski <lukma(a)denx.de>
---
tools/testing/selftests/net/hsr/hsr_ping.sh | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/tools/testing/selftests/net/hsr/hsr_ping.sh b/tools/testing/selftests/net/hsr/hsr_ping.sh
index 3684b813b0f6..f5d207fc770a 100755
--- a/tools/testing/selftests/net/hsr/hsr_ping.sh
+++ b/tools/testing/selftests/net/hsr/hsr_ping.sh
@@ -152,6 +152,15 @@ setup_hsr_interfaces()
ip -net "$ns3" addr add 100.64.0.3/24 dev hsr3
ip -net "$ns3" addr add dead:beef:1::3/64 dev hsr3 nodad
+ ip -net "$ns1" link set address 00:11:22:00:01:01 dev ns1eth1
+ ip -net "$ns1" link set address 00:11:22:00:01:02 dev ns1eth2
+
+ ip -net "$ns2" link set address 00:11:22:00:02:01 dev ns2eth1
+ ip -net "$ns2" link set address 00:11:22:00:02:02 dev ns2eth2
+
+ ip -net "$ns3" link set address 00:11:22:00:03:01 dev ns3eth1
+ ip -net "$ns3" link set address 00:11:22:00:03:02 dev ns3eth2
+
# All Links up
ip -net "$ns1" link set ns1eth1 up
ip -net "$ns1" link set ns1eth2 up
--
2.20.1
Hi all,
This series does a number of cleanups into resctrl_val() and
generalizes it by removing test name specific handling from the
function.
Hopefully these reach also Shuah successfully as I've recently seen
rejects for mail from @linux.intel.com to gmail addresses.
v5:
- Open mem bw file only once and use rewind().
- Add \n to mem bw file read to allow reading fresh values from the file.
- Return 0 if create_grp() is given NULL grp_name (matches the original
behavior). Mention this in function's kerneldoc.
- Cast pid_t to int before printing with %d.
- Caps/typo fixes to kerneldoc and commit messages.
- Use imperative tone in commit messages and improve them based on points
that came up during review.
v4:
- Merged close fix into IMC READ+WRITE rework patch
- Add loop to reset imc_counters_config fds to -1 to be able know which
need closing
- Introduce perf_close_imc_mem_bw() to close fds
- Open resctrl mem bw file (twice) beforehand to avoid opening it during
the test
- Remove MBM .mongrp setup
- Remove mongrp from CMT test
v3:
- Rename init functions to <testname>_init()
- Replace for loops with READ+WRITE statements for clarity
- Don't drop Return: entry from perf_open_imc_mem_bw() func comment
- New patch: Fix closing of IMC fds in case of error
- New patch: Make "bandwidth" consistent in comments & prints
- New patch: Simplify mem bandwidth file code
- Remove wrong comment
- Changed grp_name check to return -1 on fail (internal sanity check)
v2:
- Resolved conflicts with kselftest/next
- Spaces -> tabs correction
Ilpo Järvinen (16):
selftests/resctrl: Fix closing IMC fds on error and open-code R+W
instead of loops
selftests/resctrl: Calculate resctrl FS derived mem bw over sleep(1)
only
selftests/resctrl: Make "bandwidth" consistent in comments & prints
selftests/resctrl: Consolidate get_domain_id() into resctrl_val()
selftests/resctrl: Use correct type for pids
selftests/resctrl: Cleanup bm_pid and ppid usage & limit scope
selftests/resctrl: Rename measure_vals() to measure_mem_bw_vals() &
document
selftests/resctrl: Simplify mem bandwidth file code for MBA & MBM
tests
selftests/resctrl: Add ->measure() callback to resctrl_val_param
selftests/resctrl: Add ->init() callback into resctrl_val_param
selftests/resctrl: Simplify bandwidth report type handling
selftests/resctrl: Make some strings passed to resctrlfs functions
const
selftests/resctrl: Convert ctrlgrp & mongrp to pointers
selftests/resctrl: Remove mongrp from MBA test
selftests/resctrl: Remove mongrp from CMT test
selftests/resctrl: Remove test name comparing from
write_bm_pid_to_resctrl()
tools/testing/selftests/resctrl/cache.c | 10 +-
tools/testing/selftests/resctrl/cat_test.c | 5 +-
tools/testing/selftests/resctrl/cmt_test.c | 22 +-
tools/testing/selftests/resctrl/mba_test.c | 26 +-
tools/testing/selftests/resctrl/mbm_test.c | 26 +-
tools/testing/selftests/resctrl/resctrl.h | 49 ++-
tools/testing/selftests/resctrl/resctrl_val.c | 364 ++++++++----------
tools/testing/selftests/resctrl/resctrlfs.c | 67 ++--
8 files changed, 290 insertions(+), 279 deletions(-)
--
2.39.2
Hi,
I'm trying to build arm64 selftests on next-20240529. I'm getting build
failures. Complete logs are attached while some snippets are as following:
gcc pac.c /pauth/pac_corruptor.o /pauth/helper.o -o /pauth/pac -Wall -O2 -g
-I/linux_mainline/tools/testing/selftests/ -I/linux_mainline/tools/include
-mbranch-protection=pac-ret -march=armv8.2-a
In file included from pac.c:13:
../../kselftest_harness.h: In function ‘clone3_vfork’:
../../kselftest_harness.h:88:9: error: variable ‘args’ has initializer but
incomplete type
88 | struct clone_args args = {
CC check_prctl
check_prctl.c: In function ‘set_tagged_addr_ctrl’:
check_prctl.c:19:14: error: ‘PR_SET_TAGGED_ADDR_CTRL’ undeclared (first use
in this function)
19 | ret = prctl(PR_SET_TAGGED_ADDR_CTRL, val, 0, 0, 0);
gcc -mbranch-protection=standard -DBTI=1 -ffreestanding -Wall -Wextra -Wall
-O2 -g -I/linux_mainline/tools/testing/selftests/
-I/linux_mainline/tools/include -c -o /bti/test-bti.o test.c
test.c: In function ‘handler’:
test.c:85:50: error: ‘PSR_BTYPE_MASK’ undeclared (first use in this
function); did you mean ‘PSR_MODE_MASK’?
85 | write(1, &"00011011"[((uc->uc_mcontext.pstate & PSR_BTYPE_MASK)
I've GCC 8 installed. I'm not expecting the errors because of a little
older compiler. Any more ideas about the failures?
--
BR,
Muhammad Usama Anjum
The series composes of two parts. The first part Specifically,
patch 1 adds a comment at a callsite of riscv_setup_vsize to clarify how
vlenb is observed by the system. Patch 2 fixes the issue by failing the
boot process of a secondary core if vlenb mismatches.
Here is the organization of the series:
- Patch 1, 2 provide a fix for mismatching vlen problem [1]. The
solution is to fail secondary cores if their vlenb is not the same as
the boot core.
- Patch 3 is a cleanup for introducing ZVE* Vector subextensions. It
gives the obsolete ISA parser the ability to expand ISA extensions for
sigle letter extensions.
- Patch 4, 5, 6 introduce Zve32x, Zve32f, Zve64x, Zve64f, Zve64d for isa
parsing and hwprobe, and document about it.
- Patch 7 makes has_vector() check against ZVE32X instead of V, so most
userspace Vector supports will be available for bare ZVE32X.
- Patch 8 updates the prctl test so that it runs on ZVE32X.
The series is tested on a QEMU and verified that booting, Vector
programs context-switch, signal, ptrace, prctl interfaces works when we
only report partial V from the ISA.
Note that the signal test was performed after applying the commit
c27fa53b858b ("riscv: Fix vector state restore in rt_sigreturn()")
This patch should be able to apply on risc-v for-next branch on top of
the commit 0a16a1728790 ("riscv: select ARCH_HAS_FAST_MULTIPLIER")
[1]: https://lore.kernel.org/all/20240228-vicinity-cornstalk-4b8eb5fe5730@spud/T…
Changes in v5:
- Rebase on top of for-next
- Update comments (1, 7)
- Reorder the documentation patch to the front of patches that it
documents about. (5->4)
- Include ZVE64D to the list, which single letter V implies (6)
- Remove ZVE32F_IMPLY_LIST (5)
- Change the semantic of has_vector() thus rewrite patch 7
- Remove the patch that fixes integer promotion as it is merged else
place (8)
- Link to v4: https://lore.kernel.org/r/20240412-zve-detection-v4-0-e0c45bb6b253@sifive.c…
Changes in v4:
- Add a patch to trigger prctl test on ZVE32X (9)
- Add a patch to fix integer promotion bug in hwprobe (8)
- Fix a build fail on !CONFIG_RISCV_ISA_V (7)
- Add more comment in the assembly code change (2)
- Link to v3: https://lore.kernel.org/r/20240318-zve-detection-v3-0-e12d42107fa8@sifive.c…
Changelog v3:
- Include correct maintainers and mailing list into CC.
- Cleanup isa string parser code (3)
- Adjust extensions order and name (4, 5)
- Refine commit message (6)
Changelog v2:
- Update comments and commit messages (1, 2, 7)
- Refine isa_exts[] lists for zve extensions (4)
- Add a patch for dt-binding (5)
- Make ZVE* extensions depend on has_vector(ZVE32X) (6, 7)
---
---
Andy Chiu (8):
riscv: vector: add a comment when calling riscv_setup_vsize()
riscv: smp: fail booting up smp if inconsistent vlen is detected
riscv: cpufeature: call match_isa_ext() for single-letter extensions
dt-bindings: riscv: add Zve32[xf] Zve64[xfd] ISA extension description
riscv: cpufeature: add zve32[xf] and zve64[xfd] isa detection
riscv: hwprobe: add zve Vector subextensions into hwprobe interface
riscv: vector: adjust minimum Vector requirement to ZVE32X
selftest: run vector prctl test for ZVE32X
Documentation/arch/riscv/hwprobe.rst | 15 ++++++
.../devicetree/bindings/riscv/extensions.yaml | 30 +++++++++++
arch/riscv/include/asm/hwcap.h | 5 ++
arch/riscv/include/asm/vector.h | 10 ++--
arch/riscv/include/uapi/asm/hwprobe.h | 5 ++
arch/riscv/kernel/cpufeature.c | 60 +++++++++++++++++++---
arch/riscv/kernel/head.S | 19 ++++---
arch/riscv/kernel/smpboot.c | 14 +++--
arch/riscv/kernel/sys_hwprobe.c | 11 +++-
arch/riscv/kernel/vector.c | 5 +-
arch/riscv/lib/uaccess.S | 2 +-
.../testing/selftests/riscv/vector/vstate_prctl.c | 6 +--
12 files changed, 151 insertions(+), 31 deletions(-)
---
base-commit: 0a16a172879012c42f55ae8c2883e17c1e4e388f
change-id: 20240318-zve-detection-50106d2da527
Best regards,
--
Andy Chiu <andy.chiu(a)sifive.com>
hsr_redbox.sh test need to create bridge for testing. Add the missing
config CONFIG_BRIDGE in config file.
Fixes: eafbf0574e05 ("test: hsr: Extend the hsr_redbox.sh to have more SAN devices connected")
Signed-off-by: Hangbin Liu <liuhangbin(a)gmail.com>
---
tools/testing/selftests/net/hsr/config | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/net/hsr/config b/tools/testing/selftests/net/hsr/config
index 22061204fb69..241542441c51 100644
--- a/tools/testing/selftests/net/hsr/config
+++ b/tools/testing/selftests/net/hsr/config
@@ -2,3 +2,4 @@ CONFIG_IPV6=y
CONFIG_NET_SCH_NETEM=m
CONFIG_HSR=y
CONFIG_VETH=y
+CONFIG_BRIDGE=y
--
2.43.0
This patchset makes it possible for MGLRU to consult secondary MMUs
while doing aging, not just during eviction. This allows for more
accurate reclaim decisions, which is especially important for proactive
reclaim.
This series makes the necessary MMU notifier changes to MGLRU and then
includes optimizations on top of that. This series also now includes
changes to access_tracking_perf_test to verify that aging works properly
for pages that are mainly used by KVM.
access_tracking_perf_test also has a mode (-p) to check performance of
MGLRU aging while the VM is faulting memory in. Here are some results:
Testing MGLRU aging while vCPUs are faulting in memory on x86 with the
TDP MMU. THPs disabled.
The test results varied a decent amount from run to run. I did my best to
take representative averages, but nonetheless, the big picture is the
important part.
Main takeaways:
- With the optimizations, the workload is much less impacted
by the presence of aging.
- With the optimizations, MGLRU is able to do aging much more
quickly, especially at 8+ vCPUs on my machine.
./access_tracking_perf_test -p -l -b 1G -v $N_VCPUS # 1G per vCPU
num_vcpus vcpu wall time aging avg pass time
1 (no aging) 0.878822016 n/a
1 (no opt) 0.938250568 0.008236007
1 (opt) 0.912270190 0.007314582
2 (no aging) 0.984959659 n/a
2 (no opt) 1.057880728 0.017989741
2 (opt) 1.037735641 0.013996319
4 (no aging) 1.264881581 n/a
4 (no opt) 1.318849182 0.056164918
4 (opt) 1.314653465 0.029311993
8 (no aging) 1.473883699 n/a
8 (no opt) 1.589441079 0.227419586s
8 (opt) 1.498439592 0.063857740s
16 (no aging) 2.048766096 n/a
16 (no opt) 2.399335597 1.247142841s
16 (opt) 2.000914001 0.121628689s
32 (no aging) 3.316256321 n/a
32 (no opt) 3.955417018 4.347290433
32 (opt) 3.355274507 0.250886289
64 (no aging) 6.498958516 n/a
64 (no opt) 7.127533884 9.815592054
64 (opt) 6.442582168 1.392907010
112 (no aging) 8.498029491 n/a
112 (no opt) 10.21372495 13.47381656
112 (opt) 8.896963554 2.292223850
Previous versions of this series included logic in MGLRU and KVM to
support batching the updates to secondary page tables. This version
removes this logic, as it was complex and not necessary to enable
proactive reclaim. This optimization, as well as the additional
optimizations for arm64 and powerpc, can be done in a later series.
Changes since v3[1]:
- Vastly simplified the series (thanks David). Removed mmu notifier
batching logic entirely.
- Cleaned up how locking is done for mmu_notifier_test/clear_young
(thanks David).
- Look-around is now only done when there are no secondary MMUs
subscribed to MMU notifiers.
- CONFIG_LRU_GEN_WALKS_SECONDARY_MMU has been added.
- Fixed the lockless implementation of kvm_{test,}age_gfn for x86
(thanks David).
- Added MGLRU functional and performance tests to
access_tracking_perf_test (thanks Axel).
- In v3, an mm would be completely ignored (for aging) if there was a
secondary MMU but support for secondary MMU walking was missing. Now,
missing secondary MMU walking support simply skips the notifier
calls (except for eviction).
- Added a sanity check for that range->lockless and range->on_lock are
never both provided for the memslot walk.
For the changes from v2[2] to v3, see v3[1].
This series applies cleanly to mm/mm-unstable and kvm/queue.
[1]: https://lore.kernel.org/linux-mm/20240401232946.1837665-1-jthoughton@google…
[2]: https://lore.kernel.org/kvmarm/20230526234435.662652-1-yuzhao@google.com/
James Houghton (7):
mm/Kconfig: Add LRU_GEN_WALKS_SECONDARY_MMU
mm: multi-gen LRU: Have secondary MMUs participate in aging
KVM: Add lockless memslot walk to KVM
KVM: Move MMU lock acquisition for test/clear_young to architecture
KVM: x86: Relax locking for kvm_test_age_gfn and kvm_age_gfn
KVM: arm64: Relax locking for kvm_test_age_gfn and kvm_age_gfn
KVM: selftests: Add multi-gen LRU aging to access_tracking_perf_test
Documentation/admin-guide/mm/multigen_lru.rst | 6 +-
arch/arm64/kvm/hyp/pgtable.c | 9 +-
arch/arm64/kvm/mmu.c | 30 +-
arch/loongarch/kvm/mmu.c | 20 +-
arch/mips/kvm/mmu.c | 21 +-
arch/powerpc/kvm/book3s.c | 14 +-
arch/riscv/kvm/mmu.c | 26 +-
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/mmu/mmu.c | 10 +-
arch/x86/kvm/mmu/tdp_iter.h | 27 +-
arch/x86/kvm/mmu/tdp_mmu.c | 67 ++-
include/linux/kvm_host.h | 1 +
include/linux/mmzone.h | 6 +-
mm/Kconfig | 8 +
mm/rmap.c | 9 +-
mm/vmscan.c | 144 +++++--
tools/testing/selftests/kvm/Makefile | 1 +
.../selftests/kvm/access_tracking_perf_test.c | 365 ++++++++++++++--
.../selftests/kvm/include/lru_gen_util.h | 55 +++
.../testing/selftests/kvm/lib/lru_gen_util.c | 391 ++++++++++++++++++
virt/kvm/kvm_main.c | 38 +-
21 files changed, 1104 insertions(+), 145 deletions(-)
create mode 100644 tools/testing/selftests/kvm/include/lru_gen_util.h
create mode 100644 tools/testing/selftests/kvm/lib/lru_gen_util.c
base-commit: e0cce98fe279b64f4a7d81b7f5c3a23d80b92fbc
--
2.45.1.288.g0e0cd299f1-goog
Currrentl a 32 bit 1u value is being shifted more than 32 bits causing
overflow and incorrect checking of bits 32-63. Fix this by using the
BIT_ULL macro for shifting bits.
Detected by cppcheck:
sev_init2_tests.c:108:34: error: Shifting 32-bit value by 63 bits is
undefined behaviour [shiftTooManyBits]
Fixes: dfc083a181ba ("selftests: kvm: add tests for KVM_SEV_INIT2")
Signed-off-by: Colin Ian King <colin.i.king(a)gmail.com>
---
V2: Fix incorrect variable in 2nd BIT_ULL(), kudos to Dan Carpenter for
catching this error.
---
tools/testing/selftests/kvm/x86_64/sev_init2_tests.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/kvm/x86_64/sev_init2_tests.c b/tools/testing/selftests/kvm/x86_64/sev_init2_tests.c
index 7a4a61be119b..ea09f7a06aa4 100644
--- a/tools/testing/selftests/kvm/x86_64/sev_init2_tests.c
+++ b/tools/testing/selftests/kvm/x86_64/sev_init2_tests.c
@@ -105,11 +105,11 @@ void test_features(uint32_t vm_type, uint64_t supported_features)
int i;
for (i = 0; i < 64; i++) {
- if (!(supported_features & (1u << i)))
+ if (!(supported_features & BIT_ULL(i)))
test_init2_invalid(vm_type,
&(struct kvm_sev_init){ .vmsa_features = BIT_ULL(i) },
"unknown feature");
- else if (KNOWN_FEATURES & (1u << i))
+ else if (KNOWN_FEATURES & BIT_ULL(i))
test_init2(vm_type,
&(struct kvm_sev_init){ .vmsa_features = BIT_ULL(i) });
}
--
2.39.2
Hi Linus,
Please pull this urgent kselftest fixes update for Linux 6.10-rc3.
This kselftest fixes update consists of fixes to build warnings
in several tests and fixes to ftrace tests.
diff for pull request is attached.
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit 1613e604df0cd359cf2a7fbd9be7a0bcfacfabd0:
Linux 6.10-rc1 (2024-05-26 15:20:12 -0700)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest tags/linux_kselftest-fixes-6.10-rc3
for you to fetch changes up to 4bf15b1c657d22d1d70173e43264e4606dfe75ff:
selftests/futex: don't pass a const char* to asprintf(3) (2024-05-31 14:37:10 -0600)
----------------------------------------------------------------
linux_kselftest-fixes-6.10-rc3
This kselftest fixes update consists of fixes to build warnings
in several tests and fixes to ftrace tests.
----------------------------------------------------------------
John Hubbard (3):
selftests/futex: pass _GNU_SOURCE without a value to the compiler
selftests/futex: don't redefine .PHONY targets (all, clean)
selftests/futex: don't pass a const char* to asprintf(3)
Mark Brown (1):
kselftest/alsa: Ensure _GNU_SOURCE is defined
Masami Hiramatsu (Google) (3):
selftests/ftrace: Fix to check required event file
selftests/ftrace: Update required config
selftests/tracing: Fix event filter test to retry up to 10 times
Michael Ellerman (3):
selftests: cachestat: Fix build warnings on ppc64
selftests/openat2: Fix build warnings on ppc64
selftests/overlayfs: Fix build error on ppc64
Steven Rostedt (Google) (1):
tracing/selftests: Fix kprobe event name test for .isra. functions
tools/testing/selftests/alsa/Makefile | 2 +-
tools/testing/selftests/cachestat/test_cachestat.c | 1 +
.../selftests/filesystems/overlayfs/dev_in_maps.c | 1 +
tools/testing/selftests/ftrace/config | 26 ++++++++++++++++------
.../ftrace/test.d/dynevent/test_duplicates.tc | 2 +-
.../ftrace/test.d/filter/event-filter-function.tc | 20 ++++++++++++++++-
.../ftrace/test.d/kprobe/kprobe_eventname.tc | 3 ++-
tools/testing/selftests/futex/Makefile | 2 --
tools/testing/selftests/futex/functional/Makefile | 2 +-
.../selftests/futex/functional/futex_requeue_pi.c | 2 +-
tools/testing/selftests/openat2/openat2_test.c | 1 +
11 files changed, 47 insertions(+), 15 deletions(-)
----------------------------------------------------------------
Hi,
This seven series fix an issue reported by kernel test robot [3].
Shuah, I (as well as Kees and Sean [4]) think this should be in -next
really soon to make sure everything works fine for the v6.9 release,
which is not currently the case. I cannot test against all kselftests
though. I would prefer to let you handle this, but I guess you're not
able to do so and I'll push it on my branch without reply from you.
Even if I push it on my branch, please push it on yours too as soon as
you see this and I'll remove it from mine.
Mark, Jakub, could you please test this series?
As reported by Kernel Test Robot [1] and Sean Christopherson [2], some
tests fail since v6.9-rc1 . This is due to the use of vfork() which
introduced some side effects. Similarly, while making it more generic,
a previous commit made some Landlock file system tests flaky, and
subject to the host's file system mount configuration.
This series fixes all these side effects by replacing vfork() with
clone3() and CLONE_VFORK, which is cleaner (no arbitrary shared memory)
and makes the Kselftest framework more robust.
I tried different approaches and I found this one to be the cleaner and
less invasive for current test cases.
I successfully ran the following tests (using TEST_F and
fork/clone/clone3, and KVM_ONE_VCPU_TEST) with this series:
- kvm:fix_hypercall_test
- kvm:sync_regs_test
- kvm:userspace_msr_exit_test
- kvm:vmx_pmu_caps_test
- landlock:fs_test
- landlock:net_test
- landlock:ptrace_test
- move_mount_set_group:move_mount_set_group_test
- net/af_unix:scm_pidfd
- perf_events:remove_on_exec
- pidfd:pidfd_getfd_test
- pidfd:pidfd_setns_test
- seccomp:seccomp_bpf
- user_events:abi_test
[1] https://lore.kernel.org/oe-lkp/202403291015.1fcfa957-oliver.sang@intel.com
[2] https://lore.kernel.org/r/ZjPelW6-AbtYvslu@google.com
[3] https://lore.kernel.org/r/202405100339.vfBe0t9C-lkp@intel.com
[4] https://lore.kernel.org/r/202405061002.01D399877A@keescook
Previous versions:
v1: https://lore.kernel.org/r/20240426172252.1862930-1-mic@digikod.net
v2: https://lore.kernel.org/r/20240429130931.2394118-1-mic@digikod.net
v3: https://lore.kernel.org/r/20240429191911.2552580-1-mic@digikod.net
v4: https://lore.kernel.org/r/20240502210926.145539-1-mic@digikod.net
v5: https://lore.kernel.org/r/20240503105820.300927-1-mic@digikod.net
v6: https://lore.kernel.org/r/20240506165518.474504-1-mic@digikod.net
Regards,
Mickaël Salaün (10):
selftests/pidfd: Fix config for pidfd_setns_test
selftests/landlock: Fix FS tests when run on a private mount point
selftests/harness: Fix fixture teardown
selftests/harness: Fix interleaved scheduling leading to race
conditions
selftests/landlock: Do not allocate memory in fixture data
selftests/harness: Constify fixture variants
selftests/pidfd: Fix wrong expectation
selftests/harness: Share _metadata between forked processes
selftests/harness: Fix vfork() side effects
selftests/harness: Handle TEST_F()'s explicit exit codes
tools/testing/selftests/kselftest_harness.h | 127 +++++++++++++-----
tools/testing/selftests/landlock/fs_test.c | 83 +++++++-----
tools/testing/selftests/pidfd/config | 2 +
.../selftests/pidfd/pidfd_setns_test.c | 2 +-
4 files changed, 147 insertions(+), 67 deletions(-)
base-commit: e67572cd2204894179d89bd7b984072f19313b03
--
2.45.0
Add support for (yet again) more RVA23U64 missing extensions. Add
support for Zimop, Zcmop, Zca, Zcf, Zcd and Zcb extensions ISA string
parsing, hwprobe and kvm support. Zce, Zcmt and Zcmp extensions have
been left out since they target microcontrollers/embedded CPUs and are
not needed by RVA23U64.
Since Zc* extensions states that C implies Zca, Zcf (if F and RV32), Zcd
(if D), this series modifies the way ISA string is parsed and now does
it in two phases. First one parses the string and the second one
validates it for the final ISA description.
Link: https://lore.kernel.org/linux-riscv/20240404103254.1752834-1-cleger@rivosin… [1]
Link: https://lore.kernel.org/all/20240409143839.558784-1-cleger@rivosinc.com/ [2]
---
v6:
- Rebased on riscv/for-next
- Remove ternary operator to use 'if()' instead in extension checks
- v5: https://lore.kernel.org/all/20240517145302.971019-1-cleger@rivosinc.com/
v5:
- Merged in Zimop to avoid any uneeded series dependencies
- Rework dependency resolution loop to loop on source isa first rather
than on all extension.
- Disabled extensions in source isa once set in resolved isa
- Rename riscv_resolve_isa() parameters
- v4: https://lore.kernel.org/all/20240429150553.625165-1-cleger@rivosinc.com/
v4:
- Modify validate() callbacks to return 0, -EPROBEDEFER or another
error.
- v3: https://lore.kernel.org/all/20240423124326.2532796-1-cleger@rivosinc.com/
v3:
- Fix typo "exists" -> "exist"
- Remove C implies Zca, Zcd, Zcf, dt-bindings rules
- Rework ISA string resolver to handle dependencies
- v2: https://lore.kernel.org/all/20240418124300.1387978-1-cleger@rivosinc.com/
v2:
- Add Zc* dependencies validation in dt-bindings
- v1: https://lore.kernel.org/lkml/20240410091106.749233-1-cleger@rivosinc.com/
Clément Léger (16):
dt-bindings: riscv: add Zimop ISA extension description
riscv: add ISA extension parsing for Zimop
riscv: hwprobe: export Zimop ISA extension
RISC-V: KVM: Allow Zimop extension for Guest/VM
KVM: riscv: selftests: Add Zimop extension to get-reg-list test
dt-bindings: riscv: add Zca, Zcf, Zcd and Zcb ISA extension
description
riscv: add ISA extensions validation callback
riscv: add ISA parsing for Zca, Zcf, Zcd and Zcb
riscv: hwprobe: export Zca, Zcf, Zcd and Zcb ISA extensions
RISC-V: KVM: Allow Zca, Zcf, Zcd and Zcb extensions for Guest/VM
KVM: riscv: selftests: Add some Zc* extensions to get-reg-list test
dt-bindings: riscv: add Zcmop ISA extension description
riscv: add ISA extension parsing for Zcmop
riscv: hwprobe: export Zcmop ISA extension
RISC-V: KVM: Allow Zcmop extension for Guest/VM
KVM: riscv: selftests: Add Zcmop extension to get-reg-list test
Documentation/arch/riscv/hwprobe.rst | 28 ++
.../devicetree/bindings/riscv/extensions.yaml | 95 ++++++
arch/riscv/include/asm/cpufeature.h | 1 +
arch/riscv/include/asm/hwcap.h | 7 +-
arch/riscv/include/uapi/asm/hwprobe.h | 6 +
arch/riscv/include/uapi/asm/kvm.h | 6 +
arch/riscv/kernel/cpufeature.c | 278 ++++++++++++------
arch/riscv/kernel/sys_hwprobe.c | 6 +
arch/riscv/kvm/vcpu_onereg.c | 12 +
.../selftests/kvm/riscv/get-reg-list.c | 24 ++
10 files changed, 375 insertions(+), 88 deletions(-)
--
2.45.1