Hello,
kernel test robot noticed "kernel-selftests.seccomp.seccomp_bpf.TRAP.dfl.fail" on:
commit: 0710a1a73fb45033ebb06073e374ab7d44a05f15 ("selftests/harness: Merge TEST_F_FORK() into TEST_F()")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
[test failed on linux-next/master 67908bf6954b7635d33760ff6dfc189fc26ccc89]
in testcase: kernel-selftests
version: kernel-selftests-x86_64-4306b286-1_20240301
with following parameters:
group: group-s
compiler: gcc-12
test machine: 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 32G memory
(please refer to attached dmesg/kmsg for entire log/backtrace)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang(a)intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202403051529.cf359aed-oliver.sang@intel.com
in fact, we found more tests failed on this commit but can pass on parent:
e74048650eaff667 0710a1a73fb45033ebb06073e37
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
:6 100% 6:6 kernel-selftests.seccomp.seccomp_bpf.TRACE_poke.read_has_side_effects.fail
:6 100% 6:6 kernel-selftests.seccomp.seccomp_bpf.TRACE_syscall.ptrace.kill_after.fail
:6 100% 6:6 kernel-selftests.seccomp.seccomp_bpf.TRACE_syscall.ptrace.kill_immediate.fail
:6 100% 6:6 kernel-selftests.seccomp.seccomp_bpf.TRACE_syscall.ptrace.skip_after.fail
:6 100% 6:6 kernel-selftests.seccomp.seccomp_bpf.TRACE_syscall.ptrace.syscall_allowed.fail
:6 100% 6:6 kernel-selftests.seccomp.seccomp_bpf.TRACE_syscall.ptrace.syscall_errno.fail
:6 100% 6:6 kernel-selftests.seccomp.seccomp_bpf.TRACE_syscall.ptrace.syscall_faked.fail
:6 100% 6:6 kernel-selftests.seccomp.seccomp_bpf.TRACE_syscall.ptrace.syscall_redirected.fail
:6 100% 6:6 kernel-selftests.seccomp.seccomp_bpf.TRACE_syscall.seccomp.kill_after.fail
:6 100% 6:6 kernel-selftests.seccomp.seccomp_bpf.TRACE_syscall.seccomp.kill_immediate.fail
:6 100% 6:6 kernel-selftests.seccomp.seccomp_bpf.TRACE_syscall.seccomp.skip_after.fail
:6 100% 6:6 kernel-selftests.seccomp.seccomp_bpf.TRACE_syscall.seccomp.syscall_allowed.fail
:6 100% 6:6 kernel-selftests.seccomp.seccomp_bpf.TRACE_syscall.seccomp.syscall_errno.fail
:6 100% 6:6 kernel-selftests.seccomp.seccomp_bpf.TRACE_syscall.seccomp.syscall_faked.fail
:6 100% 6:6 kernel-selftests.seccomp.seccomp_bpf.TRACE_syscall.seccomp.syscall_redirected.fail
:6 100% 6:6 kernel-selftests.seccomp.seccomp_bpf.TRAP.dfl.fail
:6 100% 6:6 kernel-selftests.seccomp.seccomp_bpf.TRAP.ign.fail
:6 100% 6:6 kernel-selftests.seccomp.seccomp_bpf.fail
:6 100% 6:6 kernel-selftests.seccomp.seccomp_bpf.precedence.kill_is_highest.fail
:6 100% 6:6 kernel-selftests.seccomp.seccomp_bpf.precedence.kill_is_highest_in_any_order.fail
:6 100% 6:6 kernel-selftests.seccomp.seccomp_bpf.precedence.trap_is_second.fail
:6 100% 6:6 kernel-selftests.seccomp.seccomp_bpf.precedence.trap_is_second_in_any_order.fail
more details could be found in 'kernel-selftests' file in below link.
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240305/202403051529.cf359aed-oliv…
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
On systems that have large core counts and large page sizes, but limited
memory, the userfaultfd test hugepage requirement is too large.
Exiting early due to missing one test's requirements is a rather aggressive
strategy, and prevents a lot of other tests from running. Remove the
early exit to prevent this.
Fixes: ee00479d6702 ("selftests: vm: Try harder to allocate huge pages")
Signed-off-by: Nico Pache <npache(a)redhat.com>
---
tools/testing/selftests/mm/run_vmtests.sh | 1 -
1 file changed, 1 deletion(-)
diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/selftests/mm/run_vmtests.sh
index 246d53a5d7f28..727ea22ba408e 100755
--- a/tools/testing/selftests/mm/run_vmtests.sh
+++ b/tools/testing/selftests/mm/run_vmtests.sh
@@ -173,7 +173,6 @@ if [ -n "$freepgs" ] && [ -n "$hpgsize_KB" ]; then
if [ "$freepgs" -lt "$needpgs" ]; then
printf "Not enough huge pages available (%d < %d)\n" \
"$freepgs" "$needpgs"
- exit 1
fi
else
echo "no hugetlbfs support in kernel?"
--
2.43.0
Changes :
- "excercise" is corrected to "exercise" in drivers/net/mlxsw/spectrum-2/tc_flower.sh
- "mutliple" is corrected to "multiple" in drivers/net/netdevsim/ethtool-fec.sh
Signed-off-by: Prabhav Kumar Vaish <pvkumar5749404(a)gmail.com>
---
.../testing/selftests/drivers/net/mlxsw/spectrum-2/tc_flower.sh | 2 +-
tools/testing/selftests/drivers/net/netdevsim/ethtool-fec.sh | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/drivers/net/mlxsw/spectrum-2/tc_flower.sh b/tools/testing/selftests/drivers/net/mlxsw/spectrum-2/tc_flower.sh
index 616d3581419c..31252bc8775e 100755
--- a/tools/testing/selftests/drivers/net/mlxsw/spectrum-2/tc_flower.sh
+++ b/tools/testing/selftests/drivers/net/mlxsw/spectrum-2/tc_flower.sh
@@ -869,7 +869,7 @@ bloom_simple_test()
bloom_complex_test()
{
# Bloom filter index computation is affected from region ID, eRP
- # ID and from the region key size. In order to excercise those parts
+ # ID and from the region key size. In order to exercise those parts
# of the Bloom filter code, use a series of regions, each with a
# different key size and send packet that should hit all of them.
local index
diff --git a/tools/testing/selftests/drivers/net/netdevsim/ethtool-fec.sh b/tools/testing/selftests/drivers/net/netdevsim/ethtool-fec.sh
index 7d7829f57550..6c52ce1b0450 100755
--- a/tools/testing/selftests/drivers/net/netdevsim/ethtool-fec.sh
+++ b/tools/testing/selftests/drivers/net/netdevsim/ethtool-fec.sh
@@ -49,7 +49,7 @@ for o in llrs rs; do
Active FEC encoding: ${o^^}"
done
-# Test mutliple bits
+# Test multiple bits
$ETHTOOL --set-fec $NSIM_NETDEV encoding rs llrs
check $?
s=$($ETHTOOL --show-fec $NSIM_NETDEV | tail -2)
--
2.34.1
This series from Geliang adds two new Netlink commands to the userspace
PM:
- one to dump all addresses of a specific MPTCP connection:
- feature added in patches 3 to 5
- test added in patches 7, 8 and 10
- and one to get a specific address for an MPTCP connection:
- feature added in patches 11 to 13
- test added in patches 14 and 15
These new Netlink commands can be useful if an MPTCP daemon lost track
of the different connections, e.g. after having been restarted.
The other patches are some clean-ups and small improvements added while
working on the new features.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Geliang Tang (15):
mptcp: make pm_remove_addrs_and_subflows static
mptcp: export mptcp_genl_family & mptcp_nl_fill_addr
mptcp: implement mptcp_userspace_pm_dump_addr
mptcp: add token for get-addr in yaml
mptcp: dump addrs in userspace pm list
mptcp: check userspace pm flags
selftests: mptcp: add userspace pm subflow flag
selftests: mptcp: add token for dump_addr
selftests: mptcp: add mptcp_lib_check_output helper
selftests: mptcp: dump userspace addrs list
mptcp: add userspace_pm_lookup_addr_by_id helper
mptcp: implement mptcp_userspace_pm_get_addr
mptcp: get addr in userspace pm list
selftests: mptcp: add token for get_addr
selftests: mptcp: userspace pm get addr tests
Documentation/netlink/specs/mptcp_pm.yaml | 3 +-
net/mptcp/mptcp_pm_gen.c | 7 +-
net/mptcp/mptcp_pm_gen.h | 2 +-
net/mptcp/pm.c | 16 +++
net/mptcp/pm_netlink.c | 30 ++--
net/mptcp/pm_userspace.c | 180 +++++++++++++++++++++---
net/mptcp/protocol.h | 15 +-
tools/testing/selftests/net/mptcp/mptcp_join.sh | 91 ++++++++++++
tools/testing/selftests/net/mptcp/mptcp_lib.sh | 23 +++
tools/testing/selftests/net/mptcp/pm_netlink.sh | 18 +--
tools/testing/selftests/net/mptcp/pm_nl_ctl.c | 39 ++++-
11 files changed, 374 insertions(+), 50 deletions(-)
---
base-commit: e960825709330cb199d209740326cec37e8c419d
change-id: 20240301-upstream-net-next-20240301-mptcp-userspace-pm-dump-addr-221f169ac144
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Here are two patches fixing issues in MPTCP diag.sh kselftest:
- Patch 1 makes sure the exit code is '1' in case of error, and not the
test ID, not to return an exit code that would be wrongly interpreted
by the ksefltests framework, e.g. '4' means 'skip'.
- Patch 2 avoids waiting for unnecessary conditions, which can cause
timeouts in some very slow environments.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Geliang Tang (1):
selftests: mptcp: diag: return KSFT_FAIL not test_cnt
Matthieu Baerts (NGI0) (1):
selftests: mptcp: diag: avoid extra waiting
tools/testing/selftests/net/mptcp/diag.sh | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)
---
base-commit: 1c61728be22c1cb49c1be88693e72d8c06b1c81e
change-id: 20240301-upstream-net-20240301-selftests-mptcp-diag-exit-timeout-207d7925b7c0
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Hi,
Here is version 3 series of patches to support accessing function entry data
from function *return* probes (including kretprobe and fprobe-exit event).
The previous version is here;
https://lore.kernel.org/all/170891987362.609861.6767830614537418260.stgit@d…
In this version, [1/8] is a bugfix patch (but note that this is already pushed to
probes-fixes-v6.8-rc5, just for reference), updated [4/8] changelog and build error,
fixes selftests error [6/8], update document[8/8] and added Steve's reviewed-by.
This allows us to access the results of some functions, which returns the
error code and its results are passed via function parameter, such as an
structure-initialization function.
For example, vfs_open() will link the file structure to the inode and update
mode. Thus we can trace that changes.
# echo 'f vfs_open mode=file->f_mode:x32 inode=file->f_inode:x64' >> dynamic_events
# echo 'f vfs_open%return mode=file->f_mode:x32 inode=file->f_inode:x64' >> dynamic_events
# echo 1 > events/fprobes/enable
# cat trace
sh-131 [006] ...1. 1945.714346: vfs_open__entry: (vfs_open+0x4/0x40) mode=0x2 inode=0x0
sh-131 [006] ...1. 1945.714358: vfs_open__exit: (do_open+0x274/0x3d0 <- vfs_open) mode=0x4d801e inode=0xffff888008470168
cat-143 [007] ...1. 1945.717949: vfs_open__entry: (vfs_open+0x4/0x40) mode=0x1 inode=0x0
cat-143 [007] ...1. 1945.717956: vfs_open__exit: (do_open+0x274/0x3d0 <- vfs_open) mode=0x4a801d inode=0xffff888005f78d28
cat-143 [007] ...1. 1945.720616: vfs_open__entry: (vfs_open+0x4/0x40) mode=0x1 inode=0x0
cat-143 [007] ...1. 1945.728263: vfs_open__exit: (do_open+0x274/0x3d0 <- vfs_open) mode=0xa800d inode=0xffff888004ada8d8
So as you can see those fields are initialized at exit.
This series is based on v6.8-rc5 kernel or you can checkout from
https://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git/log/?h=t…
Thank you,
---
Masami Hiramatsu (Google) (8):
fprobe: Fix to allocate entry_data_size buffer with rethook instances
tracing/fprobe-event: cleanup: Fix a wrong comment in fprobe event
tracing/probes: Cleanup probe argument parser
tracing/probes: cleanup: Set trace_probe::nr_args at trace_probe_init
tracing: Remove redundant #else block for BTF args from README
tracing/probes: Support $argN in return probe (kprobe and fprobe)
selftests/ftrace: Add test cases for entry args at function exit
Documentation: tracing: Add entry argument access at function exit
Documentation/trace/fprobetrace.rst | 31 +
Documentation/trace/kprobetrace.rst | 9
kernel/trace/fprobe.c | 14 -
kernel/trace/trace.c | 5
kernel/trace/trace_eprobe.c | 8
kernel/trace/trace_fprobe.c | 59 ++-
kernel/trace/trace_kprobe.c | 58 ++-
kernel/trace/trace_probe.c | 417 ++++++++++++++------
kernel/trace/trace_probe.h | 30 +
kernel/trace/trace_probe_tmpl.h | 10
kernel/trace/trace_uprobe.c | 14 -
.../ftrace/test.d/dynevent/fprobe_entry_arg.tc | 18 +
.../ftrace/test.d/dynevent/fprobe_syntax_errors.tc | 4
.../ftrace/test.d/kprobe/kprobe_syntax_errors.tc | 2
.../ftrace/test.d/kprobe/kretprobe_entry_arg.tc | 18 +
15 files changed, 521 insertions(+), 176 deletions(-)
create mode 100644 tools/testing/selftests/ftrace/test.d/dynevent/fprobe_entry_arg.tc
create mode 100644 tools/testing/selftests/ftrace/test.d/kprobe/kretprobe_entry_arg.tc
--
Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Hi,
Here is version 2 series of patches to support accessing function entry data
from function *return* probes (including kretprobe and fprobe-exit event).
In this version, I added another cleanup [4/7], updated README[5/7], added
testcases[6/7] and updated document[7/7].
This allows us to access the results of some functions, which returns the
error code and its results are passed via function parameter, such as an
structure-initialization function.
For example, vfs_open() will link the file structure to the inode and update
mode. Thus we can trace that changes.
# echo 'f vfs_open mode=file->f_mode:x32 inode=file->f_inode:x64' >> dynamic_events
# echo 'f vfs_open%return mode=file->f_mode:x32 inode=file->f_inode:x64' >> dynamic_events
# echo 1 > events/fprobes/enable
# cat trace
sh-131 [006] ...1. 1945.714346: vfs_open__entry: (vfs_open+0x4/0x40) mode=0x2 inode=0x0
sh-131 [006] ...1. 1945.714358: vfs_open__exit: (do_open+0x274/0x3d0 <- vfs_open) mode=0x4d801e inode=0xffff888008470168
cat-143 [007] ...1. 1945.717949: vfs_open__entry: (vfs_open+0x4/0x40) mode=0x1 inode=0x0
cat-143 [007] ...1. 1945.717956: vfs_open__exit: (do_open+0x274/0x3d0 <- vfs_open) mode=0x4a801d inode=0xffff888005f78d28
cat-143 [007] ...1. 1945.720616: vfs_open__entry: (vfs_open+0x4/0x40) mode=0x1 inode=0x0
cat-143 [007] ...1. 1945.728263: vfs_open__exit: (do_open+0x274/0x3d0 <- vfs_open) mode=0xa800d inode=0xffff888004ada8d8
So as you can see those fields are initialized at exit.
This series is based on v6.8-rc5 kernel or you can checkout from
https://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git/log/?h=t…
Thank you,
---
Masami Hiramatsu (Google) (7):
tracing/fprobe-event: cleanup: Fix a wrong comment in fprobe event
tracing/probes: Cleanup probe argument parser
tracing/probes: cleanup: Set trace_probe::nr_args at trace_probe_init
tracing: Remove redundant #else block for BTF args from README
tracing/probes: Support $argN in return probe (kprobe and fprobe)
selftests/ftrace: Add test cases for entry args at function exit
Documentation: tracing: Add entry argument access at function exit
Documentation/trace/fprobetrace.rst | 7
Documentation/trace/kprobetrace.rst | 7
kernel/trace/trace.c | 5
kernel/trace/trace_eprobe.c | 8
kernel/trace/trace_fprobe.c | 59 ++-
kernel/trace/trace_kprobe.c | 58 ++-
kernel/trace/trace_probe.c | 417 ++++++++++++++------
kernel/trace/trace_probe.h | 30 +
kernel/trace/trace_probe_tmpl.h | 10
kernel/trace/trace_uprobe.c | 14 -
.../ftrace/test.d/dynevent/fprobe_entry_arg.tc | 18 +
.../ftrace/test.d/kprobe/kretprobe_entry_arg.tc | 18 +
12 files changed, 483 insertions(+), 168 deletions(-)
create mode 100644 tools/testing/selftests/ftrace/test.d/dynevent/fprobe_entry_arg.tc
create mode 100644 tools/testing/selftests/ftrace/test.d/kprobe/kretprobe_entry_arg.tc
--
Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
This series includes 6 types of fixes:
- Patch 1 fixes v4 mapped in v6 addresses support for the userspace PM,
when asking to delete a subflow. It was done everywhere else, but not
there. Patch 2 validates the modification, thanks to a subtest in
mptcp_join.sh. These patches can be backported up to v5.19.
- Patch 3 is a small fix for a recent bug-fix patch, just to avoid
printing an irrelevant warning (pr_warn()) once. It can be backported
up to v5.6, alongside the bug-fix that has been introduced in the
v6.8-rc5.
- Patches 4 to 6 are fixes for bugs found by Paolo while working on
TCP_NOTSENT_LOWAT support for MPTCP. These fixes can improve the
performances in some cases. Patches can be backported up to v5.6,
v5.11 and v6.7 respectively.
- Patch 7 makes sure 'ss -M' is available when starting MPTCP Join
selftest as it is required for some subtests since v5.18.
- Patch 8 fixes a possible double-free on socket dismantle. The issue
always existed, but was unnoticed because it was not causing any
problem so far. This fix can be backported up to v5.6.
- Patch 9 is a fix for a very recent patch causing lockdep warnings in
subflow diag. The patch causing the regression -- which fixes another
issue present since v5.7 -- should be part of the future v6.8-rc6.
Patch 10 validates the modification, thanks to a new subtest in
diag.sh.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Davide Caratti (1):
mptcp: fix double-free on socket dismantle
Geliang Tang (3):
mptcp: map v4 address to v6 when destroying subflow
selftests: mptcp: rm subflow with v4/v4mapped addr
selftests: mptcp: join: add ss mptcp support check
Matthieu Baerts (NGI0) (1):
mptcp: avoid printing warning once on client side
Paolo Abeni (5):
mptcp: push at DSS boundaries
mptcp: fix snd_wnd initialization for passive socket
mptcp: fix potential wake-up event loss
mptcp: fix possible deadlock in subflow diag
selftests: mptcp: explicitly trigger the listener diag code-path
net/mptcp/diag.c | 3 ++
net/mptcp/options.c | 2 +-
net/mptcp/pm_userspace.c | 10 +++++
net/mptcp/protocol.c | 52 ++++++++++++++++++++++++-
net/mptcp/protocol.h | 21 +++++-----
tools/testing/selftests/net/mptcp/diag.sh | 30 +++++++++++++-
tools/testing/selftests/net/mptcp/mptcp_join.sh | 33 ++++++++++------
tools/testing/selftests/net/mptcp/mptcp_lib.sh | 4 +-
8 files changed, 128 insertions(+), 27 deletions(-)
---
base-commit: b0b1210bc150fbd741b4b9fce8a24541306b40fc
change-id: 20240223-upstream-net-20240223-misc-fixes-1630cd6b3b0a
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
This series extends the KVM RISC-V ONE_REG interface to report few more
ISA extensions namely: Ztso and Zacas. These extensions are already
supported by the HWPROBE interface in Linux-6.8 kernel.
To test these patches, use KVMTOOL from the riscv_more_exts_round2_v1
branch at: https://github.com/avpatel/kvmtool.git
These patches can also be found in the riscv_kvm_more_exts_round2_v1
branch at: https://github.com/avpatel/linux.git
Anup Patel (5):
RISC-V: KVM: Forward SEED CSR access to user space
RISC-V: KVM: Allow Ztso extension for Guest/VM
KVM: riscv: selftests: Add Ztso extension to get-reg-list test
RISC-V: KVM: Allow Zacas extension for Guest/VM
KVM: riscv: selftests: Add Zacas extension to get-reg-list test
arch/riscv/include/uapi/asm/kvm.h | 2 ++
arch/riscv/kvm/vcpu_insn.c | 13 +++++++++++++
arch/riscv/kvm/vcpu_onereg.c | 4 ++++
tools/testing/selftests/kvm/riscv/get-reg-list.c | 8 ++++++++
4 files changed, 27 insertions(+)
--
2.34.1
This series fixes a bug in the complete phase of UDP in GRO, in which
socket lookup fails due to using network_header when parsing encapsulated
packets. The fix is to keep track of both outer and inner offsets.
The last commit leverages the first commit to remove some state from
napi_gro_cb, and stateful code in {ipv6,inet}_gro_receive which may be
unnecessarily complicated due to encapsulation support in GRO.
In addition, udpgro_fwd selftest is adjusted to include the socket lookup
case for vxlan. This selftest will test its supposed functionality once
local bind support is merged (https://lore.kernel.org/netdev/df300a49-7811-4126-a56a-a77100c8841b@gmail.c…).
Richard Gobert (3):
net: gro: set {inner_,}network_header in receive phase
selftests/net: add local address bind in vxlan selftest
net: gro: move L3 flush checks to tcp_gro_receive
include/net/gro.h | 23 ++++---
net/8021q/vlan_core.c | 3 +
net/core/gro.c | 3 -
net/ipv4/af_inet.c | 44 ++------------
net/ipv4/tcp_offload.c | 73 ++++++++++++++++++-----
net/ipv4/udp_offload.c | 2 +-
net/ipv6/ip6_offload.c | 22 ++-----
net/ipv6/tcpv6_offload.c | 2 +-
net/ipv6/udp_offload.c | 2 +-
tools/testing/selftests/net/udpgro_fwd.sh | 10 +++-
10 files changed, 97 insertions(+), 87 deletions(-)
--
2.36.1
Previous patch series[1] changes a mmap behavior that treats the hint
address as the upper bound of the mmap address range. The motivation of the
previous patch series is that some user space software may assume 48-bit
address space and use higher bits to encode some information, which may
collide with large virtual address space mmap may return. However, to make
sv48 by default, we don't need to change the meaning of the hint address on
mmap as the upper bound of the mmap address range, especially when this
behavior only shows up on the RISC-V. This behavior also breaks some user
space software which assumes mmap should try to create mapping on the hint
address if possible. As the mmap manpage said:
> If addr is not NULL, then the kernel takes it as a hint about where to
> place the mapping; on Linux, the kernel will pick a nearby page boundary
> (but always above or equal to the value specified by
> /proc/sys/vm/mmap_min_addr) and attempt to create the mapping there.
Unfortunately, what mmap said is not true on RISC-V since kernel v6.6.
Other ISAs with larger than 48-bit virtual address space like x86, arm64,
and powerpc do not have this special mmap behavior on hint address. They
all just make 48-bit / 47-bit virtual address space by default, and if a
user space software wants to large virtual address space, it only need to
specify a hint address larger than 48-bit / 47-bit.
Thus, this patch series keeps the change of mmap to use sv48 by default but
does not treat the hint address as the upper bound of the mmap address
range. After this patch, the behavior of mmap will align with existing
behavior on other ISAs with larger than 48-bit virtual address space like
x86, arm64, and powerpc. The user space software will no longer need to
rewrite their code to fit with this special mmap behavior only on RISC-V.
My concern is that the change of mmap behavior on the hint address is
already in the upstream kernel since v6.6, and it might be hard to revert
it although it already brings some regression on some user space software.
And it will be harder than adding it since v6.6 because mmap not creating
mapping on the hint address is very common, especially when running on a
machine without sv57 / sv48. However, if some user space software already
adopted this special mmap behavior on RISC-V, we should not return a mmap
address larger than the hint if the address is larger than BIT(38). My
opinion is that revert this change on the next kernel release might be a
good choice as only a few of hardware support sv57 / sv48 now, these
changes will have no impact on sv39 systems.
Moreover, previous patch series said it make sv48 by default, which is
in the cover letter, kernel documentation and MMAP_VA_BITS defination.
However, the code on arch_get_mmap_end and arch_get_mmap_base marco still
use sv39 by default, which makes me confused, and I still use sv48 by
default in this patch series including arch_get_mmap_end and
arch_get_mmap_base.
Changes in v2:
- correct arch_get_mmap_end and arch_get_mmap_base
- Add description in documentation about mmap behavior on kernel v6.6-6.7.
- Improve commit message and cover letter
- Rebase to newest riscv/for-next branch
- Link to v1: https://lore.kernel.org/linux-riscv/tencent_F3B3B5AB1C9D704763CA423E1A41F8B…
[1]. https://lore.kernel.org/linux-riscv/20230809232218.849726-1-charlie@rivosin…
Yangyu Chen (3):
RISC-V: mm: do not treat hint addr on mmap as the upper bound to
search
RISC-V: mm: only test mmap without hint
Documentation: riscv: correct sv57 kernel behavior
Documentation/arch/riscv/vm-layout.rst | 54 ++++++++++++-------
arch/riscv/include/asm/processor.h | 38 +++----------
.../selftests/riscv/mm/mmap_bottomup.c | 12 -----
.../testing/selftests/riscv/mm/mmap_default.c | 12 -----
tools/testing/selftests/riscv/mm/mmap_test.h | 30 -----------
5 files changed, 41 insertions(+), 105 deletions(-)
--
2.43.0
Powerpc specific selftests (specifically powerpc/primitives) included in linux-next
tree fails to build with following error
gcc -std=gnu99 -O2 -Wall -Werror -DGIT_VERSION='"next-20240229-0-gf303a3e2bcfb-dirty"' -I/home/sachin/linux-next/tools/testing/selftests/powerpc/include -I/home/sachin/linux-next/tools/testing/selftests/powerpc/primitives load_unaligned_zeropad.c ../harness.c -o /home/sachin/linux-next/tools/testing/selftests/powerpc/primitives/load_unaligned_zeropad
In file included from load_unaligned_zeropad.c:26:
word-at-a-time.h:7:10: fatal error: linux/bitops.h: No such file or directory
7 | #include <linux/bitops.h>
| ^~~~~~~~~~~~~~~~
compilation terminated.
The header file in question was last changed by following commit
commit 66a5c40f60f5d88ad8d47ba6a4ba05892853fa1f
kernel.h: removed REPEAT_BYTE from kernel.h
Thanks
— Sachin
Hi,
when running the dev_addr_lists unit test with lock debugging enabled,
I always get the following lockdep warning.
[ 7.031327] ====================================
[ 7.031393] WARNING: kunit_try_catch/1886 still has locks held!
[ 7.031478] 6.8.0-rc6-00053-g0fec7343edb5-dirty #1 Tainted: G W N
[ 7.031728] ------------------------------------
[ 7.031816] 1 lock held by kunit_try_catch/1886:
[ 7.031896] #0: ffffffff8ed35008 (rtnl_mutex){+.+.}-{3:3}, at: dev_addr_test_init+0x6a/0x100
Instrumentation shows that dev_addr_test_exit() is called, but only
after the warning fires.
Is this a problem with kunit tests or a problem with this specific test ?
Thanks,
Guenter
CC testing
On Wed, Feb 28, 2024 at 8:59 AM Guenter Roeck <linux(a)roeck-us.net> wrote:
> On 2/27/24 23:25, Christophe Leroy wrote:
> [ ... ]
> >>
> >> This test case is supposed to be as true to the "general case" as
> >> possible, so I have aligned the data along 14 + NET_IP_ALIGN. On ARM
> >> this will be a 16-byte boundary since NET_IP_ALIGN is 2. A driver that
> >> does not follow this may not be appropriately tested by this test case,
> >> but anyone is welcome to submit additional test cases that address this
> >> additional alignment concern.
> >
> > But then this test case is becoming less and less true to the "general
> > case" with this patch, whereas your initial implementation was almost
> > perfect as it was covering most cases, a lot more than what we get with
> > that patch applied.
> >
> NP with me if that is where people want to go. I'll simply disable checksum
> tests on all architectures which don't support unaligned accesses (so far
> it looks like that is only arm with thumb instructions, and possibly nios2).
> I personally find that less desirable and would have preferred a second
> configurable set of tests for unaligned accesses, but I have no problem
> with it.
IMHO the tests should validate the expected functionality. If a test
fails, either functionality is missing or behaves wrong, or the test
is wrong.
What is the point of writing tests for a core functionality like network
checksumming that do not match the expected functionality?
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert(a)linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
Basic idea of this series is now to use the kselftest_harness.h
framework to get TAP output in the tests, so that it is easier
for the user to see what is going on, and e.g. to be able to
detect whether a certain test is part of the test binary or not
(which is useful when tests get extended in the course of time).
Since most tests also need a vcpu, we introduce our own macros
to define such tests, so we don't have to repeat this code all
over the place.
v3:
- Add patch from Sean to allow setting vCPU's entry points seperately
- Let each test define the entry point via KVM_ONE_VCPU_TEST(), don't
do it globally from KVM_ONE_VCPU_TEST_SUITE()
v2:
- Dropped the "Rename the ASSERT_EQ macro" patch (already merged)
- Split the fixes in the sync_regs_test into separate patches
(see the first two patches)
- Introduce the KVM_ONE_VCPU_TEST_SUITE() macro as suggested
by Sean (see third patch) and use it in the following patches
- Add a new patch to convert vmx_pmu_caps_test.c, too
Sean Christopherson (1):
KVM: selftests: Move setting a vCPU's entry point to a dedicated API
Thomas Huth (7):
KVM: selftests: x86: sync_regs_test: Use vcpu_run() where appropriate
KVM: selftests: x86: sync_regs_test: Get regs structure before
modifying it
KVM: selftests: Add a macro to define a test with one vcpu
KVM: selftests: x86: Use TAP interface in the sync_regs test
KVM: selftests: x86: Use TAP interface in the fix_hypercall test
KVM: selftests: x86: Use TAP interface in the vmx_pmu_caps test
KVM: selftests: x86: Use TAP interface in the userspace_msr_exit test
.../selftests/kvm/include/kvm_test_harness.h | 36 ++++++
.../selftests/kvm/include/kvm_util_base.h | 11 +-
.../selftests/kvm/lib/aarch64/processor.c | 23 +++-
.../selftests/kvm/lib/riscv/processor.c | 9 +-
.../selftests/kvm/lib/s390x/processor.c | 13 +-
.../selftests/kvm/lib/x86_64/processor.c | 13 +-
.../selftests/kvm/x86_64/fix_hypercall_test.c | 27 ++--
.../selftests/kvm/x86_64/sync_regs_test.c | 121 +++++++++++++-----
.../kvm/x86_64/userspace_msr_exit_test.c | 52 ++------
.../selftests/kvm/x86_64/vmx_pmu_caps_test.c | 50 ++------
10 files changed, 215 insertions(+), 140 deletions(-)
create mode 100644 tools/testing/selftests/kvm/include/kvm_test_harness.h
--
2.43.0
From: Roberto Sassu <roberto.sassu(a)huawei.com>
Introduce the digest_cache LSM, whose purpose is to deliver reference
digest values to integrity providers, such as IMA and IPE, abstracting to
them how those digests where extracted from the respective data source.
The major benefit is the ability to use the vaste amount of digests already
provided (and likely signed) by software vendors, without needing them to
adapt their format to the one understood by the integrity provider.
IMA and IPE can immediately interface with the digest_cache LSM and query
the digest of an accessed file. If the digest is found, it means that the
file is coming from the software vendor and not modified. If not, the file
might have been corrupted. Each integrity provider decides how to handle
this situation.
The second major benefit is performance improvement. Since the digest_cache
LSM has the ability to extract many digests from a single data source, it
means that it has less signatures to verify compared to the approach of
verifying individual file signatures (IMA appraisal). Preliminary tests
have shown a speedup of IMA appraisal of about 65% for sequential read, and
45% for parallel read.
This patch set has as prerequisites the file_release LSM hook (to be
introduced with the move of IMA/EVM to the LSM infrastructure), and
support for PGP keys, which is still unclear how it should be done.
The IMA integration patch set will be introduced separately. Also a PoC
based on the current version of IPE can be provided.
v2:
- Include the TLV parser in this patch set (from user asymmetric keys and
signatures)
- Move from IMA and make an independent LSM
- Remove IMA-specific stuff from this patch set
- Add per algorithm hash table
- Expect all digest lists to be in the same directory and allow changing
the default directory
- Support digest lookup on directories, when there is no
security.digest_list xattr
- Add seq num to digest list file name, to impose ordering on directory
iteration
- Add a new data type DIGEST_LIST_ENTRY_DATA for the nested data in the
tlv digest list format
- Add the concept of verification data attached to digest caches
- Add the reset mechanism to track changes on digest lists and directory
containing the digest lists
- Add kernel selftests
v1:
- Add documentation in Documentation/security/integrity-digest-cache.rst
- Pass the mask of IMA actions to digest_cache_alloc()
- Add a reference count to the digest cache
- Remove the path parameter from digest_cache_get(), and rely on the
reference count to avoid the digest cache disappearing while being used
- Rename the dentry_to_check parameter of digest_cache_get() to dentry
- Rename digest_cache_get() to digest_cache_new() and add
digest_cache_get() to set the digest cache in the iint of the inode for
which the digest cache was requested
- Add dig_owner and dig_user to the iint, to distinguish from which inode
the digest cache was created from, and which is using it; consequently it
makes the digest cache usable to measure/appraise other digest caches
(support not yet enabled)
- Add dig_owner_mutex and dig_user_mutex to serialize accesses to dig_owner
and dig_user until they are initialized
- Enforce strong synchronization and make the contenders wait until
dig_owner and dig_user are assigned to the iint the first time
- Move checking IMA actions on the digest list earlier, and fail if no
action were performed (digest cache not usable)
- Remove digest_cache_put(), not needed anymore with the introduction of
the reference count
- Fail immediately in digest_cache_lookup() if the digest algorithm is
not set in the digest cache
- Use 64 bit mask for IMA actions on the digest list instead of 8 bit
- Return NULL in the inline version of digest_cache_get()
- Use list_add_tail() instead of list_add() in the iterator
- Copy the digest list path to a separate buffer in digest_cache_iter_dir()
- Use digest list parsers verified with Frama-C
- Explicitly disable (for now) the possibility in the IMA policy to use the
digest cache to measure/appraise other digest lists
- Replace exit(<value>) with return <value> in manage_digest_lists.c
Roberto Sassu (13):
lib: Add TLV parser
security: Introduce the digest_cache LSM
digest_cache: Add securityfs interface
digest_cache: Add hash tables and operations
digest_cache: Populate the digest cache from a digest list
digest_cache: Parse tlv digest lists
digest_cache: Parse rpm digest lists
digest_cache: Add management of verification data
digest_cache: Add support for directories
digest cache: Prefetch digest lists if requested
digest_cache: Reset digest cache on file/directory change
selftests/digest_cache: Add selftests for digest_cache LSM
docs: Add documentation of the digest_cache LSM
Documentation/security/digest_cache.rst | 900 ++++++++++++++++++
Documentation/security/index.rst | 1 +
MAINTAINERS | 16 +
include/linux/digest_cache.h | 89 ++
include/linux/kernel_read_file.h | 1 +
include/linux/tlv_parser.h | 28 +
include/uapi/linux/lsm.h | 1 +
include/uapi/linux/tlv_digest_list.h | 72 ++
include/uapi/linux/tlv_parser.h | 59 ++
include/uapi/linux/xattr.h | 6 +
lib/Kconfig | 3 +
lib/Makefile | 3 +
lib/tlv_parser.c | 214 +++++
lib/tlv_parser.h | 17 +
security/Kconfig | 11 +-
security/Makefile | 1 +
security/digest_cache/Kconfig | 34 +
security/digest_cache/Makefile | 11 +
security/digest_cache/dir.c | 245 +++++
security/digest_cache/htable.c | 268 ++++++
security/digest_cache/internal.h | 259 +++++
security/digest_cache/main.c | 545 +++++++++++
security/digest_cache/modsig.c | 66 ++
security/digest_cache/parsers/parsers.h | 15 +
security/digest_cache/parsers/rpm.c | 223 +++++
security/digest_cache/parsers/tlv.c | 299 ++++++
security/digest_cache/populate.c | 163 ++++
security/digest_cache/reset.c | 168 ++++
security/digest_cache/secfs.c | 87 ++
security/digest_cache/verif.c | 119 +++
security/security.c | 3 +-
tools/testing/selftests/Makefile | 1 +
.../testing/selftests/digest_cache/.gitignore | 3 +
tools/testing/selftests/digest_cache/Makefile | 23 +
.../testing/selftests/digest_cache/all_test.c | 706 ++++++++++++++
tools/testing/selftests/digest_cache/common.c | 79 ++
tools/testing/selftests/digest_cache/common.h | 131 +++
.../selftests/digest_cache/common_user.c | 47 +
.../selftests/digest_cache/common_user.h | 17 +
tools/testing/selftests/digest_cache/config | 1 +
.../selftests/digest_cache/generators.c | 248 +++++
.../selftests/digest_cache/generators.h | 19 +
.../selftests/digest_cache/testmod/Makefile | 16 +
.../selftests/digest_cache/testmod/kern.c | 499 ++++++++++
.../selftests/lsm/lsm_list_modules_test.c | 3 +
45 files changed, 5714 insertions(+), 6 deletions(-)
create mode 100644 Documentation/security/digest_cache.rst
create mode 100644 include/linux/digest_cache.h
create mode 100644 include/linux/tlv_parser.h
create mode 100644 include/uapi/linux/tlv_digest_list.h
create mode 100644 include/uapi/linux/tlv_parser.h
create mode 100644 lib/tlv_parser.c
create mode 100644 lib/tlv_parser.h
create mode 100644 security/digest_cache/Kconfig
create mode 100644 security/digest_cache/Makefile
create mode 100644 security/digest_cache/dir.c
create mode 100644 security/digest_cache/htable.c
create mode 100644 security/digest_cache/internal.h
create mode 100644 security/digest_cache/main.c
create mode 100644 security/digest_cache/modsig.c
create mode 100644 security/digest_cache/parsers/parsers.h
create mode 100644 security/digest_cache/parsers/rpm.c
create mode 100644 security/digest_cache/parsers/tlv.c
create mode 100644 security/digest_cache/populate.c
create mode 100644 security/digest_cache/reset.c
create mode 100644 security/digest_cache/secfs.c
create mode 100644 security/digest_cache/verif.c
create mode 100644 tools/testing/selftests/digest_cache/.gitignore
create mode 100644 tools/testing/selftests/digest_cache/Makefile
create mode 100644 tools/testing/selftests/digest_cache/all_test.c
create mode 100644 tools/testing/selftests/digest_cache/common.c
create mode 100644 tools/testing/selftests/digest_cache/common.h
create mode 100644 tools/testing/selftests/digest_cache/common_user.c
create mode 100644 tools/testing/selftests/digest_cache/common_user.h
create mode 100644 tools/testing/selftests/digest_cache/config
create mode 100644 tools/testing/selftests/digest_cache/generators.c
create mode 100644 tools/testing/selftests/digest_cache/generators.h
create mode 100644 tools/testing/selftests/digest_cache/testmod/Makefile
create mode 100644 tools/testing/selftests/digest_cache/testmod/kern.c
--
2.34.1
Changes :
- "excercise" is corrected to "exercise" in drivers/net/mlxsw/spectrum-2/tc_flower.sh
- "mutliple" is corrected to "multiple" in drivers/net/netdevsim/ethtool-fec.sh
Signed-off-by: Prabhav Kumar Vaish <pvkumar5749404(a)gmail.com>
---
.../testing/selftests/drivers/net/mlxsw/spectrum-2/tc_flower.sh | 2 +-
tools/testing/selftests/drivers/net/netdevsim/ethtool-fec.sh | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/drivers/net/mlxsw/spectrum-2/tc_flower.sh b/tools/testing/selftests/drivers/net/mlxsw/spectrum-2/tc_flower.sh
index 616d3581419c..31252bc8775e 100755
--- a/tools/testing/selftests/drivers/net/mlxsw/spectrum-2/tc_flower.sh
+++ b/tools/testing/selftests/drivers/net/mlxsw/spectrum-2/tc_flower.sh
@@ -869,7 +869,7 @@ bloom_simple_test()
bloom_complex_test()
{
# Bloom filter index computation is affected from region ID, eRP
- # ID and from the region key size. In order to excercise those parts
+ # ID and from the region key size. In order to exercise those parts
# of the Bloom filter code, use a series of regions, each with a
# different key size and send packet that should hit all of them.
local index
diff --git a/tools/testing/selftests/drivers/net/netdevsim/ethtool-fec.sh b/tools/testing/selftests/drivers/net/netdevsim/ethtool-fec.sh
index 7d7829f57550..6c52ce1b0450 100755
--- a/tools/testing/selftests/drivers/net/netdevsim/ethtool-fec.sh
+++ b/tools/testing/selftests/drivers/net/netdevsim/ethtool-fec.sh
@@ -49,7 +49,7 @@ for o in llrs rs; do
Active FEC encoding: ${o^^}"
done
-# Test mutliple bits
+# Test multiple bits
$ETHTOOL --set-fec $NSIM_NETDEV encoding rs llrs
check $?
s=$($ETHTOOL --show-fec $NSIM_NETDEV | tail -2)
--
2.34.1
Changes :
- "excercise" is corrected to "exercise" in drivers/net/mlxsw/spectrum-2/tc_flower.sh
- "mutliple" is corrected to "multiple" in drivers/net/netdevsim/ethtool-fec.sh
Signed-off-by: Prabhav Kumar Vaish <pvkumar5749404(a)gmail.com>
---
.../testing/selftests/drivers/net/mlxsw/spectrum-2/tc_flower.sh | 2 +-
tools/testing/selftests/drivers/net/netdevsim/ethtool-fec.sh | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/drivers/net/mlxsw/spectrum-2/tc_flower.sh b/tools/testing/selftests/drivers/net/mlxsw/spectrum-2/tc_flower.sh
index 616d3581419c..31252bc8775e 100755
--- a/tools/testing/selftests/drivers/net/mlxsw/spectrum-2/tc_flower.sh
+++ b/tools/testing/selftests/drivers/net/mlxsw/spectrum-2/tc_flower.sh
@@ -869,7 +869,7 @@ bloom_simple_test()
bloom_complex_test()
{
# Bloom filter index computation is affected from region ID, eRP
- # ID and from the region key size. In order to excercise those parts
+ # ID and from the region key size. In order to exercise those parts
# of the Bloom filter code, use a series of regions, each with a
# different key size and send packet that should hit all of them.
local index
diff --git a/tools/testing/selftests/drivers/net/netdevsim/ethtool-fec.sh b/tools/testing/selftests/drivers/net/netdevsim/ethtool-fec.sh
index 7d7829f57550..6c52ce1b0450 100755
--- a/tools/testing/selftests/drivers/net/netdevsim/ethtool-fec.sh
+++ b/tools/testing/selftests/drivers/net/netdevsim/ethtool-fec.sh
@@ -49,7 +49,7 @@ for o in llrs rs; do
Active FEC encoding: ${o^^}"
done
-# Test mutliple bits
+# Test multiple bits
$ETHTOOL --set-fec $NSIM_NETDEV encoding rs llrs
check $?
s=$($ETHTOOL --show-fec $NSIM_NETDEV | tail -2)
--
2.34.1
Changes :
- "excercise" is corrected to "exercise" in drivers/net/mlxsw/spectrum-2/tc_flower.sh
- "mutliple" is corrected to "multiple" in drivers/net/netdevsim/ethtool-fec.sh
Signed-off-by: Prabhav Kumar Vaish <pvkumar5749404(a)gmail.com>
---
.../testing/selftests/drivers/net/mlxsw/spectrum-2/tc_flower.sh | 2 +-
tools/testing/selftests/drivers/net/netdevsim/ethtool-fec.sh | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/drivers/net/mlxsw/spectrum-2/tc_flower.sh b/tools/testing/selftests/drivers/net/mlxsw/spectrum-2/tc_flower.sh
index 616d3581419c..31252bc8775e 100755
--- a/tools/testing/selftests/drivers/net/mlxsw/spectrum-2/tc_flower.sh
+++ b/tools/testing/selftests/drivers/net/mlxsw/spectrum-2/tc_flower.sh
@@ -869,7 +869,7 @@ bloom_simple_test()
bloom_complex_test()
{
# Bloom filter index computation is affected from region ID, eRP
- # ID and from the region key size. In order to excercise those parts
+ # ID and from the region key size. In order to exercise those parts
# of the Bloom filter code, use a series of regions, each with a
# different key size and send packet that should hit all of them.
local index
diff --git a/tools/testing/selftests/drivers/net/netdevsim/ethtool-fec.sh b/tools/testing/selftests/drivers/net/netdevsim/ethtool-fec.sh
index 7d7829f57550..6c52ce1b0450 100755
--- a/tools/testing/selftests/drivers/net/netdevsim/ethtool-fec.sh
+++ b/tools/testing/selftests/drivers/net/netdevsim/ethtool-fec.sh
@@ -49,7 +49,7 @@ for o in llrs rs; do
Active FEC encoding: ${o^^}"
done
-# Test mutliple bits
+# Test multiple bits
$ETHTOOL --set-fec $NSIM_NETDEV encoding rs llrs
check $?
s=$($ETHTOOL --show-fec $NSIM_NETDEV | tail -2)
--
2.34.1
[Partly a RFC/formal submission: there are still FIXMEs in the code]
[Also using bpf-next as the base tree for HID changes as there will
be conflicting changes otherwise, so I'm personaly fine for the HID
commits to go through bpf-next]
IMO, patches 1-3 and 9-14 are ready to go, rest is still pending review.
For reference, the use cases I have in mind:
---
Basically, I need to be able to defer a HID-BPF program for the
following reasons (from the aforementioned patch):
1. defer an event:
Sometimes we receive an out of proximity event, but the device can not
be trusted enough, and we need to ensure that we won't receive another
one in the following n milliseconds. So we need to wait those n
milliseconds, and eventually re-inject that event in the stack.
2. inject new events in reaction to one given event:
We might want to transform one given event into several. This is the
case for macro keys where a single key press is supposed to send
a sequence of key presses. But this could also be used to patch a
faulty behavior, if a device forgets to send a release event.
3. communicate with the device in reaction to one event:
We might want to communicate back to the device after a given event.
For example a device might send us an event saying that it came back
from sleeping state and needs to be re-initialized.
Currently we can achieve that by keeping a userspace program around,
raise a bpf event, and let that userspace program inject the events and
commands.
However, we are just keeping that program alive as a daemon for just
scheduling commands. There is no logic in it, so it doesn't really justify
an actual userspace wakeup. So a kernel workqueue seems simpler to handle.
The other part I'm not sure is whether we can say that BPF maps of type
queue/stack can be used in sleepable context.
I don't see any warning when running the test programs, but that's probably
not a guarantee I'm doing the things properly :)
Cheers,
Benjamin
To: Alexei Starovoitov <ast(a)kernel.org>
To: Daniel Borkmann <daniel(a)iogearbox.net>
To: John Fastabend <john.fastabend(a)gmail.com>
To: Andrii Nakryiko <andrii(a)kernel.org>
To: Martin KaFai Lau <martin.lau(a)linux.dev>
To: Eduard Zingerman <eddyz87(a)gmail.com>
To: Song Liu <song(a)kernel.org>
To: Yonghong Song <yonghong.song(a)linux.dev>
To: KP Singh <kpsingh(a)kernel.org>
To: Stanislav Fomichev <sdf(a)google.com>
To: Hao Luo <haoluo(a)google.com>
To: Jiri Olsa <jolsa(a)kernel.org>
To: Jiri Kosina <jikos(a)kernel.org>
To: Benjamin Tissoires <benjamin.tissoires(a)redhat.com>
To: Jonathan Corbet <corbet(a)lwn.net>
To: Shuah Khan <shuah(a)kernel.org>
Cc: <bpf(a)vger.kernel.org>
Cc: <linux-kernel(a)vger.kernel.org>
Cc: <linux-input(a)vger.kernel.org>
Cc: <linux-doc(a)vger.kernel.org>
Cc: <linux-kselftest(a)vger.kernel.org>
Signed-off-by: Benjamin Tissoires <bentiss(a)kernel.org>
---
Changes in v3:
- fixed the crash from v2
- changed the API to have only BPF_F_TIMER_SLEEPABLE for
bpf_timer_start()
- split the new kfuncs/verifier patch into several sub-patches, for
easier reviews
- Link to v2: https://lore.kernel.org/r/20240214-hid-bpf-sleepable-v2-0-5756b054724d@kern…
Changes in v2:
- make use of bpf_timer (and dropped the custom HID handling)
- implemented bpf_timer_set_sleepable_cb as a kfunc
- still not implemented global subprogs
- no sleepable bpf_timer selftests yet
- Link to v1: https://lore.kernel.org/r/20240209-hid-bpf-sleepable-v1-0-4cc895b5adbd@kern…
---
Benjamin Tissoires (16):
bpf/verifier: allow more maps in sleepable bpf programs
bpf/verifier: introduce in_sleepable() helper
bpf/verifier: add is_async_callback_calling_insn() helper
bpf/helpers: introduce sleepable bpf_timers
bpf/verifier: add bpf_timer as a kfunc capable type
bpf/helpers: introduce bpf_timer_set_sleepable_cb() kfunc
bpf/helpers: mark the callback of bpf_timer_set_sleepable_cb() as sleepable
bpf/verifier: do_misc_fixups for is_bpf_timer_set_sleepable_cb_kfunc
HID: bpf/dispatch: regroup kfuncs definitions
HID: bpf: export hid_hw_output_report as a BPF kfunc
selftests/hid: Add test for hid_bpf_hw_output_report
HID: bpf: allow to inject HID event from BPF
selftests/hid: add tests for hid_bpf_input_report
HID: bpf: allow to use bpf_timer_set_sleepable_cb() in tracing callbacks.
selftests/hid: add test for bpf_timer
selftests/hid: add KASAN to the VM tests
Documentation/hid/hid-bpf.rst | 2 +-
drivers/hid/bpf/hid_bpf_dispatch.c | 232 ++++++++++++++-------
drivers/hid/hid-core.c | 2 +
include/linux/bpf_verifier.h | 2 +
include/linux/hid_bpf.h | 3 +
include/uapi/linux/bpf.h | 4 +
kernel/bpf/helpers.c | 140 +++++++++++--
kernel/bpf/verifier.c | 114 ++++++++--
tools/testing/selftests/hid/config.common | 1 +
tools/testing/selftests/hid/hid_bpf.c | 195 ++++++++++++++++-
tools/testing/selftests/hid/progs/hid.c | 198 ++++++++++++++++++
.../testing/selftests/hid/progs/hid_bpf_helpers.h | 8 +
12 files changed, 795 insertions(+), 106 deletions(-)
---
base-commit: 5c331823b3fc52ffd27524bf5b7e0d137114f470
change-id: 20240205-hid-bpf-sleepable-c01260fd91c4
Best regards,
--
Benjamin Tissoires <bentiss(a)kernel.org>
The drm_buddy_test's alloc_contiguous test used a u64 for the page size,
which was then updated to be an 'unsigned long' to avoid 64-bit
multiplication division helpers.
However, the variable is logged by some KUNIT_ASSERT_EQ_MSG() using the
'%d' or '%llu' format specifiers, the former of which is always wrong,
and the latter is no longer correct now that ps is no longer a u64. Fix
these to all use '%lu'.
Also, drm_mm_test calls KUNIT_FAIL() with an empty string as the
message. gcc and clang warns if a printf format string is empty, so
give these some more detailed error messages, which should be more
useful anyway.
Fixes: a64056bb5a32 ("drm/tests/drm_buddy: add alloc_contiguous test")
Fixes: fca7526b7d89 ("drm/tests/drm_buddy: fix build failure on 32-bit targets")
Fixes: fc8d29e298cf ("drm: selftest: convert drm_mm selftest to KUnit")
Reviewed-by: Matthew Auld <matthew.auld(a)intel.com>
Acked-by: Christian König <christian.koenig(a)amd.com>
Tested-by: Guenter Roeck <linux(a)roeck-us.net>
Reviewed-by: Justin Stitt <justinstitt(a)google.com>
Signed-off-by: David Gow <davidgow(a)google.com>
---
Changes since v1:
https://lore.kernel.org/linux-kselftest/20240221092728.1281499-8-davidgow@g…
- Split this patch out, as the others have been applied already.
- Rebase on 6.8-rc6
- Add everyone's {Reviewed,Acked,Tested}-by tags. Thanks!
---
drivers/gpu/drm/tests/drm_buddy_test.c | 14 +++++++-------
drivers/gpu/drm/tests/drm_mm_test.c | 6 +++---
2 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/drivers/gpu/drm/tests/drm_buddy_test.c b/drivers/gpu/drm/tests/drm_buddy_test.c
index 2f32fb2f12e7..3dbfa3078449 100644
--- a/drivers/gpu/drm/tests/drm_buddy_test.c
+++ b/drivers/gpu/drm/tests/drm_buddy_test.c
@@ -55,30 +55,30 @@ static void drm_test_buddy_alloc_contiguous(struct kunit *test)
KUNIT_ASSERT_FALSE_MSG(test,
drm_buddy_alloc_blocks(&mm, 0, mm_size,
ps, ps, list, 0),
- "buddy_alloc hit an error size=%u\n",
+ "buddy_alloc hit an error size=%lu\n",
ps);
} while (++i < n_pages);
KUNIT_ASSERT_TRUE_MSG(test, drm_buddy_alloc_blocks(&mm, 0, mm_size,
3 * ps, ps, &allocated,
DRM_BUDDY_CONTIGUOUS_ALLOCATION),
- "buddy_alloc didn't error size=%u\n", 3 * ps);
+ "buddy_alloc didn't error size=%lu\n", 3 * ps);
drm_buddy_free_list(&mm, &middle);
KUNIT_ASSERT_TRUE_MSG(test, drm_buddy_alloc_blocks(&mm, 0, mm_size,
3 * ps, ps, &allocated,
DRM_BUDDY_CONTIGUOUS_ALLOCATION),
- "buddy_alloc didn't error size=%u\n", 3 * ps);
+ "buddy_alloc didn't error size=%lu\n", 3 * ps);
KUNIT_ASSERT_TRUE_MSG(test, drm_buddy_alloc_blocks(&mm, 0, mm_size,
2 * ps, ps, &allocated,
DRM_BUDDY_CONTIGUOUS_ALLOCATION),
- "buddy_alloc didn't error size=%u\n", 2 * ps);
+ "buddy_alloc didn't error size=%lu\n", 2 * ps);
drm_buddy_free_list(&mm, &right);
KUNIT_ASSERT_TRUE_MSG(test, drm_buddy_alloc_blocks(&mm, 0, mm_size,
3 * ps, ps, &allocated,
DRM_BUDDY_CONTIGUOUS_ALLOCATION),
- "buddy_alloc didn't error size=%u\n", 3 * ps);
+ "buddy_alloc didn't error size=%lu\n", 3 * ps);
/*
* At this point we should have enough contiguous space for 2 blocks,
* however they are never buddies (since we freed middle and right) so
@@ -87,13 +87,13 @@ static void drm_test_buddy_alloc_contiguous(struct kunit *test)
KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_alloc_blocks(&mm, 0, mm_size,
2 * ps, ps, &allocated,
DRM_BUDDY_CONTIGUOUS_ALLOCATION),
- "buddy_alloc hit an error size=%u\n", 2 * ps);
+ "buddy_alloc hit an error size=%lu\n", 2 * ps);
drm_buddy_free_list(&mm, &left);
KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_alloc_blocks(&mm, 0, mm_size,
3 * ps, ps, &allocated,
DRM_BUDDY_CONTIGUOUS_ALLOCATION),
- "buddy_alloc hit an error size=%u\n", 3 * ps);
+ "buddy_alloc hit an error size=%lu\n", 3 * ps);
total = 0;
list_for_each_entry(block, &allocated, link)
diff --git a/drivers/gpu/drm/tests/drm_mm_test.c b/drivers/gpu/drm/tests/drm_mm_test.c
index 1eb0c304f960..f37c0d765865 100644
--- a/drivers/gpu/drm/tests/drm_mm_test.c
+++ b/drivers/gpu/drm/tests/drm_mm_test.c
@@ -157,7 +157,7 @@ static void drm_test_mm_init(struct kunit *test)
/* After creation, it should all be one massive hole */
if (!assert_one_hole(test, &mm, 0, size)) {
- KUNIT_FAIL(test, "");
+ KUNIT_FAIL(test, "mm not one hole on creation");
goto out;
}
@@ -171,14 +171,14 @@ static void drm_test_mm_init(struct kunit *test)
/* After filling the range entirely, there should be no holes */
if (!assert_no_holes(test, &mm)) {
- KUNIT_FAIL(test, "");
+ KUNIT_FAIL(test, "mm has holes when filled");
goto out;
}
/* And then after emptying it again, the massive hole should be back */
drm_mm_remove_node(&tmp);
if (!assert_one_hole(test, &mm, 0, size)) {
- KUNIT_FAIL(test, "");
+ KUNIT_FAIL(test, "mm does not have single hole after emptying");
goto out;
}
--
2.44.0.rc1.240.g4c46232300-goog
The changes on lib.mk are both for simplification and also
clarification, like in the case of not handling TEST_GEN_MODS_DIR
directly. There is a new patch to solve one issue reported by build bot.
These changes apply on top of the current kselftest-next branch. Please
review!
Signed-off-by: Marcos Paulo de Souza <mpdesouza(a)suse.com>
---
Changes in v2:
- Added a new patch to avoid building the modules/running the tests if
kernel-devel is not installed. Resolving an issue reported by the
build bot.
- Reordered the patches, showing the more simple ones first. Besides
patch 0002, all the other three didn't changed since v1.
- Link to v1: https://lore.kernel.org/r/20240215-lp-selftests-fixes-v1-0-89f4a6f5cddc@sus…
---
Marcos Paulo de Souza (4):
selftests: livepatch: Add initial .gitignore
selftests: livepatch: Avoid running the tests if kernel-devel is missing
selftests: lib.mk: Do not process TEST_GEN_MODS_DIR
selftests: lib.mk: Simplify TEST_GEN_MODS_DIR handling
tools/testing/selftests/lib.mk | 19 +++++++------------
tools/testing/selftests/livepatch/.gitignore | 1 +
tools/testing/selftests/livepatch/functions.sh | 13 +++++++++++++
.../testing/selftests/livepatch/test_modules/Makefile | 6 ++++++
4 files changed, 27 insertions(+), 12 deletions(-)
---
base-commit: 6f1a214d446b2f2f9c8c4b96755a8f0316ba4436
change-id: 20240215-lp-selftests-fixes-7d4bab3c0712
Best regards,
--
Marcos Paulo de Souza <mpdesouza(a)suse.com>
Non-contiguous CBM support for Intel CAT has been merged into the kernel
with Commit 0e3cd31f6e90 ("x86/resctrl: Enable non-contiguous CBMs in
Intel CAT") but there is no selftest that would validate if this feature
works correctly. The selftest needs to verify if writing non-contiguous
CBMs to the schemata file behaves as expected in comparison to the
information about non-contiguous CBMs support.
The patch series is based on a rework of resctrl selftests that's
currently in review [1]. The patch also implements a similar
functionality presented in the bash script included in the cover letter
of the original non-contiguous CBMs in Intel CAT series [3].
Changelog v6:
- Add Reinette's reviewed-by tag to patch 2/5.
- Fix ret type in noncont test.
- Add a check for bit_center value in noncont test.
- Add resource pointer check in resctrl_mon_feature_exists.
- Fix patch 4 leaking into patch 3 by mistake.
Changelog v5:
- Add a few reviewed-by tags.
- Make some minor text changes in patch messages and comments.
- Redo resctrl_mon_feature_exists() to be more generic and fix some of
my errors in refactoring feature checking.
Changelog v4:
- Changes to error failure return values in non-contiguous test.
- Some minor text refactoring without functional changes.
Changelog v3:
- Rebase onto v4 of Ilpo's series [1].
- Split old patch 3/4 into two parts. One doing refactoring and one
adding a new function.
- Some changes to all the patches after Reinette's review.
Changelog v2:
- Rebase onto v4 of Ilpo's series [2].
- Add two patches that prepare helpers for the new test.
- Move Ilpo's patch that adds test grouping to this series.
- Apply Ilpo's suggestion to the patch that adds a new test.
[1] https://lore.kernel.org/all/20231215150515.36983-1-ilpo.jarvinen@linux.inte…
[2] https://lore.kernel.org/all/20231211121826.14392-1-ilpo.jarvinen@linux.inte…
[3] https://lore.kernel.org/all/cover.1696934091.git.maciej.wieczor-retman@inte…
Older versions of this series:
[v1] https://lore.kernel.org/all/20231109112847.432687-1-maciej.wieczor-retman@i…
[v2] https://lore.kernel.org/all/cover.1702392177.git.maciej.wieczor-retman@inte…
[v3] https://lore.kernel.org/all/cover.1706180726.git.maciej.wieczor-retman@inte…
[v4] https://lore.kernel.org/all/cover.1707130307.git.maciej.wieczor-retman@inte…
[v5] https://lore.kernel.org/all/cover.1707487039.git.maciej.wieczor-retman@inte…
Ilpo Järvinen (1):
selftests/resctrl: Add test groups and name L3 CAT test L3_CAT
Maciej Wieczor-Retman (4):
selftests/resctrl: Add a helper for the non-contiguous test
selftests/resctrl: Split validate_resctrl_feature_request()
selftests/resctrl: Add resource_info_file_exists()
selftests/resctrl: Add non-contiguous CBMs CAT test
tools/testing/selftests/resctrl/cat_test.c | 92 +++++++++++++++++-
tools/testing/selftests/resctrl/cmt_test.c | 2 +-
tools/testing/selftests/resctrl/mba_test.c | 2 +-
tools/testing/selftests/resctrl/mbm_test.c | 6 +-
tools/testing/selftests/resctrl/resctrl.h | 10 +-
.../testing/selftests/resctrl/resctrl_tests.c | 18 +++-
tools/testing/selftests/resctrl/resctrlfs.c | 96 ++++++++++++++++---
7 files changed, 203 insertions(+), 23 deletions(-)
--
2.43.0
KUnit has several macros which accept a log message, which can contain
printf format specifiers. Some of these (the explicit log macros)
already use the __printf() gcc attribute to ensure the format specifiers
are valid, but those which could fail the test, and hence used
__kunit_do_failed_assertion() behind the scenes, did not.
These include:
- KUNIT_EXPECT_*_MSG()
- KUNIT_ASSERT_*_MSG()
- KUNIT_FAIL()
This series adds the __printf() attribute, and fixes all of the issues
uncovered. (Or, at least, all of those I could find with an x86_64
allyesconfig, and the default KUnit config on a number of other
architectures. Please test!)
The issues in question basically take the following forms:
- int / long / long long confusion: typically a type being updated, but
the format string not.
- Use of integer format specifiers (%d/%u/%li/etc) for types like size_t
or pointer differences (technically ptrdiff_t), which would only work
on some architectures.
- Use of integer format specifiers in combination with PTR_ERR(), where
%pe would make more sense.
- Use of empty messages which, whilst technically not incorrect, are not
useful and trigger a gcc warning.
We'd like to get these (or equivalent) in for 6.9 if possible, so please
do take a look if possible.
Thanks,
-- David
Reported-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Closes: https://lore.kernel.org/linux-kselftest/CAHk-=wgJMOquDO5f8ShH1f4rzZwzApNVCw…
David Gow (9):
kunit: test: Log the correct filter string in executor_test
lib/cmdline: Fix an invalid format specifier in an assertion msg
lib: memcpy_kunit: Fix an invalid format specifier in an assertion msg
time: test: Fix incorrect format specifier
rtc: test: Fix invalid format specifier.
net: test: Fix printf format specifier in skb_segment kunit test
drm: tests: Fix invalid printf format specifiers in KUnit tests
drm/xe/tests: Fix printf format specifiers in xe_migrate test
kunit: Annotate _MSG assertion variants with gnu printf specifiers
drivers/gpu/drm/tests/drm_buddy_test.c | 14 +++++++-------
drivers/gpu/drm/tests/drm_mm_test.c | 6 +++---
drivers/gpu/drm/xe/tests/xe_migrate.c | 8 ++++----
drivers/rtc/lib_test.c | 2 +-
include/kunit/test.h | 12 ++++++------
kernel/time/time_test.c | 2 +-
lib/cmdline_kunit.c | 2 +-
lib/kunit/executor_test.c | 2 +-
lib/memcpy_kunit.c | 4 ++--
net/core/gso_test.c | 2 +-
10 files changed, 27 insertions(+), 27 deletions(-)
--
2.44.0.rc0.258.g7320e95886-goog
On 2/27/24 00:42, Meng Li wrote:
> make -C tools/testing/selftests, compiling dev_in_maps fail.
> In file included from dev_in_maps.c:10:
> /usr/include/x86_64-linux-gnu/sys/mount.h:35:3: error: expected identifier before numeric constant
> 35 | MS_RDONLY = 1, /* Mount read-only. */
> | ^~~~~~~~~
>
> That sys/mount.h has to be included before linux/mount.h.
>
> Signed-off-by: Meng Li <li.meng(a)amd.com>
> ---
> tools/testing/selftests/filesystems/overlayfs/dev_in_maps.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
I don't see this problem when I build it on my system when
I run:
make -C tools/testing/selftests
or
make -C tools/testing/selftests/filesystems/overlayfs
Are you running this after doing headers_install?
thanks,
-- Shuah
Add ability to parse all files within a directory. Additionally add the
ability to parse all results in the KUnit debugfs repository.
How to parse all files in directory:
./tools/testing/kunit/kunit.py parse [directory path]
How to parse KUnit debugfs repository:
./tools/testing/kunit/kunit.py parse debugfs
For each file, the parser outputs the file name, results, and test
summary. At the end of all parsing, the parser outputs a total summary
line.
This feature can be easily tested on the tools/testing/kunit/test_data/
directory.
Signed-off-by: Rae Moar <rmoar(a)google.com>
---
tools/testing/kunit/kunit.py | 45 ++++++++++++++++++++++++++----------
1 file changed, 33 insertions(+), 12 deletions(-)
diff --git a/tools/testing/kunit/kunit.py b/tools/testing/kunit/kunit.py
index bc74088c458a..827e6dac40ae 100755
--- a/tools/testing/kunit/kunit.py
+++ b/tools/testing/kunit/kunit.py
@@ -511,19 +511,40 @@ def exec_handler(cli_args: argparse.Namespace) -> None:
def parse_handler(cli_args: argparse.Namespace) -> None:
- if cli_args.file is None:
+ parsed_files = []
+ total_test = kunit_parser.Test()
+ total_test.status = kunit_parser.TestStatus.SUCCESS
+ if cli_args.file_path is None:
sys.stdin.reconfigure(errors='backslashreplace') # type: ignore
kunit_output = sys.stdin # type: Iterable[str]
- else:
- with open(cli_args.file, 'r', errors='backslashreplace') as f:
+ elif cli_args.file_path == "debugfs":
+ for (root, _, files) in os.walk("/sys/kernel/debug/kunit"):
+ for file in files:
+ if file == "results":
+ parsed_files.append(os.path.join(root, file))
+ elif os.path.isdir(cli_args.file_path):
+ for (root, _, files) in os.walk(cli_args.file_path):
+ for file in files:
+ parsed_files.append(os.path.join(root, file))
+ elif os.path.isfile(cli_args.file_path):
+ parsed_files.append(cli_args.file_path)
+
+ for file in parsed_files:
+ print(file)
+ with open(file, 'r', errors='backslashreplace') as f:
kunit_output = f.read().splitlines()
- # We know nothing about how the result was created!
- metadata = kunit_json.Metadata()
- request = KunitParseRequest(raw_output=cli_args.raw_output,
- json=cli_args.json)
- result, _ = parse_tests(request, metadata, kunit_output)
- if result.status != KunitStatus.SUCCESS:
- sys.exit(1)
+ # We know nothing about how the result was created!
+ metadata = kunit_json.Metadata()
+ request = KunitParseRequest(raw_output=cli_args.raw_output,
+ json=cli_args.json)
+ _, test = parse_tests(request, metadata, kunit_output)
+ total_test.subtests.append(test)
+
+ if len(parsed_files) > 1: # if more than one file was parsed output total summary
+ print('All files parsed.')
+ stdout.print_with_timestamp(kunit_parser.DIVIDER)
+ kunit_parser.bubble_up_test_results(total_test)
+ kunit_parser.print_summary_line(total_test)
subcommand_handlers_map = {
@@ -569,8 +590,8 @@ def main(argv: Sequence[str]) -> None:
help='Parses KUnit results from a file, '
'and parses formatted results.')
add_parse_opts(parse_parser)
- parse_parser.add_argument('file',
- help='Specifies the file to read results from.',
+ parse_parser.add_argument('file_path',
+ help='Specifies the file path to read results from.',
type=str, nargs='?', metavar='input_file')
cli_args = parser.parse_args(massage_argv(argv))
base-commit: 08c454e26daab6f843e5883fb96f680f11784fa6
--
2.44.0.rc0.258.g7320e95886-goog
Commit d393acce7b3f ("drm/tests: Switch to kunit devices") switched the
DRM device creation helpers from an ad-hoc implementation to the new
kunit device creation helpers introduced in commit d03c720e03bd ("kunit:
Add APIs for managing devices").
However, while the DRM helpers were using a platform_device, the kunit
helpers are using a dedicated bus and device type.
That situation creates small differences in the initialisation, and one
of them is that the kunit devices do not have the DMA masks setup. In
turn, this means that we can't do any kind of DMA buffer allocation
anymore, which creates a regression on some (downstream for now) tests.
Let's set up a default DMA mask that should work on any platform to fix
it.
Fixes: d03c720e03bd ("kunit: Add APIs for managing devices")
Signed-off-by: Maxime Ripard <mripard(a)kernel.org>
---
lib/kunit/device.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/lib/kunit/device.c b/lib/kunit/device.c
index 644a38a1f5b1..9ea399049749 100644
--- a/lib/kunit/device.c
+++ b/lib/kunit/device.c
@@ -10,6 +10,7 @@
*/
#include <linux/device.h>
+#include <linux/dma-mapping.h>
#include <kunit/test.h>
#include <kunit/device.h>
@@ -133,6 +134,9 @@ static struct kunit_device *kunit_device_register_internal(struct kunit *test,
return ERR_PTR(err);
}
+ kunit_dev->dev.dma_mask = &kunit_dev->dev.coherent_dma_mask;
+ kunit_dev->dev.coherent_dma_mask = DMA_BIT_MASK(32);
+
kunit_add_action(test, device_unregister_wrapper, &kunit_dev->dev);
return kunit_dev;
--
2.43.2
The test is inspired by the pmu_event_filter_test which implemented by x86. On
the arm64 platform, there is the same ability to set the pmu_event_filter
through the KVM_ARM_VCPU_PMU_V3_FILTER attribute. So add the test for arm64.
The series first move some pmu common code from vpmu_counter_access to
lib/aarch64/vpmu.c and include/aarch64/vpmu.h, which can be used by
pmu_event_filter_test. Then fix a bug related to the [enable|disable]_counter,
and at last, implement the test itself.
Changelog:
----------
v3->v4:
- Rebased to the v6.8-rc2.
v2->v3:
- Check the pmceid in guest code instead of pmu event count since different
hardware may have different event count result, check pmceid makes it stable
on different platform. [Eric]
- Some typo fixed and commit message improved.
v1->v2:
- Improve the commit message. [Eric]
- Fix the bug in [enable|disable]_counter. [Raghavendra & Marc]
- Add the check if kvm has attr KVM_ARM_VCPU_PMU_V3_FILTER.
- Add if host pmu support the test event throught pmceid0.
- Split the test_invalid_filter() to another patch. [Eric]
v1: https://lore.kernel.org/all/20231123063750.2176250-1-shahuang@redhat.com/
v2: https://lore.kernel.org/all/20231129072712.2667337-1-shahuang@redhat.com/
v3: https://lore.kernel.org/all/20240116060129.55473-1-shahuang@redhat.com/
Shaoqin Huang (5):
KVM: selftests: aarch64: Make the [create|destroy]_vpmu_vm() public
KVM: selftests: aarch64: Move pmu helper functions into vpmu.h
KVM: selftests: aarch64: Fix the buggy [enable|disable]_counter
KVM: selftests: aarch64: Introduce pmu_event_filter_test
KVM: selftests: aarch64: Add invalid filter test in
pmu_event_filter_test
tools/testing/selftests/kvm/Makefile | 2 +
.../kvm/aarch64/pmu_event_filter_test.c | 255 ++++++++++++++++++
.../kvm/aarch64/vpmu_counter_access.c | 217 ++-------------
.../selftests/kvm/include/aarch64/vpmu.h | 134 +++++++++
.../testing/selftests/kvm/lib/aarch64/vpmu.c | 74 +++++
5 files changed, 489 insertions(+), 193 deletions(-)
create mode 100644 tools/testing/selftests/kvm/aarch64/pmu_event_filter_test.c
create mode 100644 tools/testing/selftests/kvm/include/aarch64/vpmu.h
create mode 100644 tools/testing/selftests/kvm/lib/aarch64/vpmu.c
base-commit: 41bccc98fb7931d63d03f326a746ac4d429c1dd3
--
2.40.1
This series brings various small improvements to MPTCP and its
selftests:
Patch 1 prints an error if there are duplicated subtests names. It is
important to have unique (sub)tests names in TAP, because some CI
environments drop (sub)tests with duplicated names.
Patch 2 is a preparation for patches 3 and 4, which check the protocol
in tcp_sk() and mptcp_sk() with DEBUG_NET, only in code from net/mptcp/.
We recently had the case where an MPTCP socket was wrongly treated as a
TCP one, and fuzzers and static checkers never spot the issue. This
would prevent such issues in the future.
Patches 5 to 7 are some cleanup for the MPTCP selftests. These patches
are not supposed to change the behaviour.
Patch 8 sets the poll timeout in diag selftest to the same value as the
one used in the other selftests.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Geliang Tang (4):
selftests: mptcp: netlink: drop duplicate var ret
selftests: mptcp: simult flows: define missing vars
selftests: mptcp: join: change capture/checksum as bool
selftests: mptcp: diag: change timeout_poll to 30
Matthieu Baerts (NGI0) (4):
selftests: mptcp: lib: catch duplicated subtest entries
mptcp: token kunit: set protocol
mptcp: check the protocol in tcp_sk() with DEBUG_NET
mptcp: check the protocol in mptcp_sk() with DEBUG_NET
net/mptcp/protocol.h | 16 ++++++++++++++++
net/mptcp/token_test.c | 7 ++++++-
tools/testing/selftests/net/mptcp/diag.sh | 2 +-
tools/testing/selftests/net/mptcp/mptcp_join.sh | 22 +++++++++++-----------
tools/testing/selftests/net/mptcp/mptcp_lib.sh | 21 +++++++++++++++++++++
tools/testing/selftests/net/mptcp/pm_netlink.sh | 1 -
tools/testing/selftests/net/mptcp/simult_flows.sh | 6 ++++++
7 files changed, 61 insertions(+), 14 deletions(-)
---
base-commit: a818bd12538c1408c7480de31573cdb3c3c0926f
change-id: 20240223-upstream-net-next-20240223-misc-improvements-7d64a076bca8
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Cleaning up after tests is implemented separately for individual tests
and called at the end of each test execution. Since these functions are
very similar and a more generalized test framework was introduced a
function pointer in the resctrl_test struct can be used to reduce the
amount of function calls.
These functions are also all called in the ctrl-c handler because the
handler isn't aware which test is currently running. Since the handler
is implemented with a sigaction no function parameters can be passed
there but information about what test is currently running can be passed
with a global variable.
Series applies cleanly on top of kselftests/next.
Changelog v4:
- Check current_test pointer and reset it in signal unregistering.
- Move cleanup call to test_cleanup function.
Changelog v3:
- Make current_test static.
- Add callback NULL check to the ctrl-c handler.
Changelog v2:
- Make current_test a const pointer limited in scope to resctrl_val
file.
- Remove tests_cleanup from resctrl.h.
- Cleanup 'goto out' path and labels in individual test functions.
Older versions of this series:
[v1] https://lore.kernel.org/all/cover.1708434017.git.maciej.wieczor-retman@inte…
[v2] https://lore.kernel.org/all/cover.1708596015.git.maciej.wieczor-retman@inte…
[v3] https://lore.kernel.org/all/cover.1708599491.git.maciej.wieczor-retman@inte…
Maciej Wieczor-Retman (3):
selftests/resctrl: Add cleanup function to test framework
selftests/resctrl: Simplify cleanup in ctrl-c handler
selftests/resctrl: Move cleanups out of individual tests
tools/testing/selftests/resctrl/cat_test.c | 8 +++-----
tools/testing/selftests/resctrl/cmt_test.c | 4 ++--
tools/testing/selftests/resctrl/mba_test.c | 8 +++-----
tools/testing/selftests/resctrl/mbm_test.c | 8 +++-----
tools/testing/selftests/resctrl/resctrl.h | 9 +++------
.../testing/selftests/resctrl/resctrl_tests.c | 20 +++++++------------
tools/testing/selftests/resctrl/resctrl_val.c | 8 ++++++--
7 files changed, 27 insertions(+), 38 deletions(-)
--
2.43.2
I tried to split it a bit, maybe I could even go further and split by
TRACE_EVENT_CLASS() changes, but not sure if it adds any value.
But at least all preparation patches are separate.
I wasn't sure if I should just remove tcp_hash_fail() as I did in this
version, or rather put it under CONFIG_TCP_..., making it disabled by
default and with a warning of deprecated, scheduled for removal.
Maybe this won't cause any problems for anybody and I'm just too
cautious of breaking others.
Anyways, version 1, thanks for any reviews!
Signed-off-by: Dmitry Safonov <dima(a)arista.com>
---
Dmitry Safonov (10):
net/tcp: Use static_branch_tcp_{md5,ao} to drop ifdefs
net/tcp: Add a helper tcp_ao_hdr_maclen()
net/tcp: Move tcp_inbound_hash() from headers
net/tcp: Add tcp-md5 and tcp-ao tracepoints
net/tcp: Remove tcp_hash_fail()
selftests/net: Clean-up double assignment
selftests/net: Provide test_snprintf() helper
selftests/net: Be consistnat in kconfig checks
selftests/net: Don't forget to close nsfd after switch_save_ns()
selftest/net: Add trace events matching to tcp_ao
include/net/tcp.h | 79 +-
include/net/tcp_ao.h | 42 +-
include/trace/events/tcp.h | 317 ++++++++
net/ipv4/tcp.c | 86 ++-
net/ipv4/tcp_ao.c | 24 +-
net/ipv4/tcp_input.c | 8 +-
net/ipv4/tcp_ipv4.c | 8 +-
net/ipv4/tcp_output.c | 2 +
tools/testing/selftests/net/tcp_ao/Makefile | 2 +-
tools/testing/selftests/net/tcp_ao/bench-lookups.c | 2 +-
tools/testing/selftests/net/tcp_ao/connect-deny.c | 18 +-
tools/testing/selftests/net/tcp_ao/connect.c | 2 +-
tools/testing/selftests/net/tcp_ao/icmps-discard.c | 2 +-
.../testing/selftests/net/tcp_ao/key-management.c | 18 +-
tools/testing/selftests/net/tcp_ao/lib/aolib.h | 150 +++-
tools/testing/selftests/net/tcp_ao/lib/ftrace.c | 846 +++++++++++++++++++++
tools/testing/selftests/net/tcp_ao/lib/kconfig.c | 31 +-
tools/testing/selftests/net/tcp_ao/lib/setup.c | 15 +-
tools/testing/selftests/net/tcp_ao/lib/sock.c | 1 -
tools/testing/selftests/net/tcp_ao/lib/utils.c | 26 +
tools/testing/selftests/net/tcp_ao/restore.c | 18 +-
tools/testing/selftests/net/tcp_ao/rst.c | 2 +-
tools/testing/selftests/net/tcp_ao/self-connect.c | 19 +-
tools/testing/selftests/net/tcp_ao/seq-ext.c | 10 +-
.../selftests/net/tcp_ao/setsockopt-closed.c | 2 +-
tools/testing/selftests/net/tcp_ao/unsigned-md5.c | 28 +-
26 files changed, 1576 insertions(+), 182 deletions(-)
---
base-commit: d662c5b3ce6dbed9d0991bc83001bbcc4a9bc2f8
change-id: 20240224-tcp-ao-tracepoints-0ea8ba11467a
Best regards,
--
Dmitry Safonov <dima(a)arista.com>
Hi!
When running selftests for our subsystem in our CI we'd like all
tests to pass. Currently some tests use SKIP for cases they
expect to fail, because the kselftest_harness limits the return
codes to pass/fail/skip.
Clean up and support the use of the full range of ksft exit codes
under kselftest_harness.
Merge plan is to put it on top of -rc4 and merge into net-next.
That way others should be able to pull the patches without
any networking changes.
v2: https://lore.kernel.org/all/20240216002619.1999225-1-kuba@kernel.org/
- fix alignment
follow up RFC: https://lore.kernel.org/all/20240216004122.2004689-1-kuba@kernel.org/
v1: https://lore.kernel.org/all/20240213154416.422739-1-kuba@kernel.org/
Jakub Kicinski (11):
selftests: kselftest_harness: pass step via shared memory
selftests: kselftest_harness: use KSFT_* exit codes
selftests: kselftest_harness: generate test name once
selftests: kselftest_harness: save full exit code in metadata
selftests: kselftest_harness: use exit code to store skip
selftests: kselftest: add ksft_test_result_code(), handling all exit
codes
selftests: kselftest_harness: print test name for SKIP
selftests: kselftest_harness: separate diagnostic message with # in
ksft_test_result_code()
selftests: kselftest_harness: let PASS / FAIL provide diagnostic
selftests: kselftest_harness: support using xfail
selftests: ip_local_port_range: use XFAIL instead of SKIP
tools/testing/selftests/kselftest.h | 45 ++++++
tools/testing/selftests/kselftest_harness.h | 148 ++++++++++++------
tools/testing/selftests/landlock/base_test.c | 2 +-
tools/testing/selftests/landlock/common.h | 22 +--
tools/testing/selftests/landlock/fs_test.c | 4 +-
tools/testing/selftests/landlock/net_test.c | 4 +-
.../testing/selftests/landlock/ptrace_test.c | 7 +-
.../selftests/net/ip_local_port_range.c | 6 +-
tools/testing/selftests/net/tls.c | 2 +-
tools/testing/selftests/seccomp/seccomp_bpf.c | 9 +-
10 files changed, 168 insertions(+), 81 deletions(-)
--
2.43.0
Cleaning up after tests is implemented separately for individual tests
and called at the end of each test execution. Since these functions are
very similar and a more generalized test framework was introduced a
function pointer in the resctrl_test struct can be used to reduce the
amount of function calls.
These functions are also all called in the ctrl-c handler because the
handler isn't aware which test is currently running. Since the handler
is implemented with a sigaction no function parameters can be passed
there but information about what test is currently running can be passed
with a global variable.
Changelog v3:
- Make current_test static.
- Add callback NULL check to the ctrl-c handler.
Changelog v2:
- Make current_test a const pointer limited in scope to resctrl_val
file.
- Remove tests_cleanup from resctrl.h.
- Cleanup 'goto out' path and labels in individual test functions.
Older versions of this series:
[v1] https://lore.kernel.org/all/cover.1708434017.git.maciej.wieczor-retman@inte…
[v2] https://lore.kernel.org/all/cover.1708596015.git.maciej.wieczor-retman@inte…
Maciej Wieczor-Retman (3):
selftests/resctrl: Add cleanup function to test framework
selftests/resctrl: Simplify cleanup in ctrl-c handler
selftests/resctrl: Move cleanups out of individual tests
tools/testing/selftests/resctrl/cat_test.c | 8 +++-----
tools/testing/selftests/resctrl/cmt_test.c | 4 ++--
tools/testing/selftests/resctrl/mba_test.c | 8 +++-----
tools/testing/selftests/resctrl/mbm_test.c | 8 +++-----
tools/testing/selftests/resctrl/resctrl.h | 9 +++------
tools/testing/selftests/resctrl/resctrl_tests.c | 16 +++++-----------
tools/testing/selftests/resctrl/resctrl_val.c | 7 +++++--
7 files changed, 24 insertions(+), 36 deletions(-)
--
2.43.2
The RISC-V arch_timer selftests is used to validate Sstc timer
functionality in a guest, which sets up periodic timer interrupts
and check the basic interrupt status upon its receipt.
This KVM selftests was ported from aarch64 arch_timer and tested
with Linux v6.7-rc8 on a Qemu riscv64 virt machine.
---
Changed since v4:
* Rebased to Linux 6.7-rc8
* Added new patch(2/12) to clean up the data type in struct test_args
* Re-ordered patch(11/11) in v4 to patch(3/12)
* Changed the timer_err_margin_us type from int to uint32_t
Haibo Xu (11):
KVM: arm64: selftests: Data type cleanup for arch_timer test
KVM: arm64: selftests: Enable tuning of error margin in arch_timer
test
KVM: arm64: selftests: Split arch_timer test code
KVM: selftests: Add CONFIG_64BIT definition for the build
tools: riscv: Add header file csr.h
tools: riscv: Add header file vdso/processor.h
KVM: riscv: selftests: Switch to use macro from csr.h
KVM: riscv: selftests: Add exception handling support
KVM: riscv: selftests: Add guest helper to get vcpu id
KVM: riscv: selftests: Change vcpu_has_ext to a common function
KVM: riscv: selftests: Add sstc timer test
Paolo Bonzini (1):
selftests/kvm: Fix issues with $(SPLIT_TESTS)
tools/arch/riscv/include/asm/csr.h | 541 ++++++++++++++++++
tools/arch/riscv/include/asm/vdso/processor.h | 32 ++
tools/testing/selftests/kvm/Makefile | 27 +-
.../selftests/kvm/aarch64/arch_timer.c | 295 +---------
tools/testing/selftests/kvm/arch_timer.c | 259 +++++++++
.../selftests/kvm/include/aarch64/processor.h | 4 -
.../selftests/kvm/include/kvm_util_base.h | 9 +
.../selftests/kvm/include/riscv/arch_timer.h | 71 +++
.../selftests/kvm/include/riscv/processor.h | 65 ++-
.../testing/selftests/kvm/include/test_util.h | 2 +
.../selftests/kvm/include/timer_test.h | 45 ++
.../selftests/kvm/lib/riscv/handlers.S | 101 ++++
.../selftests/kvm/lib/riscv/processor.c | 87 +++
.../testing/selftests/kvm/riscv/arch_timer.c | 111 ++++
.../selftests/kvm/riscv/get-reg-list.c | 11 +-
15 files changed, 1353 insertions(+), 307 deletions(-)
create mode 100644 tools/arch/riscv/include/asm/csr.h
create mode 100644 tools/arch/riscv/include/asm/vdso/processor.h
create mode 100644 tools/testing/selftests/kvm/arch_timer.c
create mode 100644 tools/testing/selftests/kvm/include/riscv/arch_timer.h
create mode 100644 tools/testing/selftests/kvm/include/timer_test.h
create mode 100644 tools/testing/selftests/kvm/lib/riscv/handlers.S
create mode 100644 tools/testing/selftests/kvm/riscv/arch_timer.c
--
2.34.1
The changes doesn't change the current functionality. The changes on
lib.mk are both for simplification and also clarification, like in the
case of not handling TEST_GEN_MODS_DIR directly.
These changes apply on top of the current kselftest-next branch. Please
review!
Signed-off-by: Marcos Paulo de Souza <mpdesouza(a)suse.com>
---
Marcos Paulo de Souza (3):
selftests: lib.mk: Do not process TEST_GEN_MODS_DIR
selftests: lib.mk: Simplify TEST_GEN_MODS_DIR handling
selftests: livepatch: Add initial .gitignore
tools/testing/selftests/lib.mk | 19 +++++++------------
tools/testing/selftests/livepatch/.gitignore | 1 +
2 files changed, 8 insertions(+), 12 deletions(-)
---
base-commit: 345e8abe4c355bc24bab3f4a5634122e55be8665
change-id: 20240215-lp-selftests-fixes-7d4bab3c0712
Best regards,
--
Marcos Paulo de Souza <mpdesouza(a)suse.com>
Hi!
I was running some KASan tests with kunit.py recently and noticed that
when KASan is run in hw tags mode, we manually have to add the required
`mte=on` option to kunit_tool's qemu invocation, as the tests will
otherwise crash.
To make life easier, I was looking into ways for kunit.py to recognise
when MTE support was required and set the option automatically.
All solutions I could come up with for having kunit_tool conditionally
pass `mte=on` to qemu, either entailed duplicate code or required
parsing of kernel's config file again. I was working under the
assumption that only after configuring the kernel we would know whether
the 'mte=on' option was necessary, as CONFIG_ARM64_MTE is not visible
before.
Only afterwads did I realise that the qemu arm64 config that kunit_tool
falls back on, uses the `virt` machine, which supports MTE in any case.
So, could it be as easy as just adding the `mte=on` option to
kunit_tool's arm64 config? Would this be a welcome addition?
What do you think?
Many thanks,
Paul
Signed-off-by: Paul Heidekrüger <paul.heidekrueger(a)tum.de>
---
tools/testing/kunit/qemu_configs/arm64.py | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/kunit/qemu_configs/arm64.py b/tools/testing/kunit/qemu_configs/arm64.py
index d3ff27024755..a525f7e1093b 100644
--- a/tools/testing/kunit/qemu_configs/arm64.py
+++ b/tools/testing/kunit/qemu_configs/arm64.py
@@ -9,4 +9,4 @@ CONFIG_SERIAL_AMBA_PL011_CONSOLE=y''',
qemu_arch='aarch64',
kernel_path='arch/arm64/boot/Image.gz',
kernel_command_line='console=ttyAMA0',
- extra_qemu_params=['-machine', 'virt', '-cpu', 'max,pauth-impdef=on'])
+ extra_qemu_params=['-machine', 'virt,mte=on', '-cpu', 'max,pauth-impdef=on'])
--
2.40.1
This series adds the missing cache flush and dirty track set for nested
parent domain (it's s2_domain but used as parent) which has no insight
into devices/DID's under the nested domain (a.k.a s1_domain). This
results in missing cache flush per parent domain change and incomplete
dirty tracking set on the parent domain. There was a discussion about
this in the mailing list [1].
This series adds a s1_domains list in the parent domain to track the nested
domains. Hence, the driver can loop the nested domains and use the DIDs of
the nested domain to flush iotlb. The driver can also loop the nested domains
and nested domain's devices list to flush device iotlb and set the dirty
tracking completely.
This series doesn't touch the pasid iotlb or the pasid devtlb as there is
no support of attaching pasid to nested domain yet. It will be covered when
that feature is enabled.
The complete code can be found at[2], this has been tested with a hacky
Qemu[3] to test the unmap on parent domain and dirty tracking set on parent
domain. This just verifies the new path. So appreciated to see t-b with
regression tests.
This aims to the 6.8 rc#, so your special attention is welcomed.
[1] https://lore.kernel.org/linux-iommu/92f8aaca-093d-4161-b8f2-5ab1680df769@in…
[2] https://github.com/yiliu1765/iommufd/tree/vtd_nesting_fixes
[3] https://github.com/yiliu1765/qemu/tree/tmp/for-testing-unmap-and-dirty-set-…
base commit: 547ab8fc4cb04a1a6b34377dd8fad34cd2c8a8e3
Regards,
Yi Liu
Yi Liu (8):
iommu/vt-d: Track nested domains in parent
iommu/vt-d: Add __iommu_flush_iotlb_psi()
iommu/vt-d: Add missing iotlb flush for parent domain
iommu/vt-d: Update iotlb in nested domain attach
iommu/vt-d: Add missing device iotlb flush for parent domain
iommu/vt-d: Remove @domain parameter from
intel_pasid_setup_dirty_tracking()
iommu/vt-d: Wrap the dirty tracking loop to be a helper
iommu/vt-d: Add missing dirty tracking set for parent domain
drivers/iommu/intel/iommu.c | 213 ++++++++++++++++++++++++++---------
drivers/iommu/intel/iommu.h | 7 ++
drivers/iommu/intel/nested.c | 12 ++
drivers/iommu/intel/pasid.c | 3 +-
drivers/iommu/intel/pasid.h | 1 -
5 files changed, 182 insertions(+), 54 deletions(-)
--
2.34.1
From: Deepak Gupta <debug(a)rivosinc.com>
It's been almost an year since I posted my last patch series [1] to
enable CPU assisted control-flow integrity for usermode on riscv. A lot
has changed since then and so has the patches. It's been a while and since
this is a reboot of series, starting with RFC and v1.
Securing control-flow integrity for usermode requires following
- Securing forward control flow : All callsites must reach
reach a target that they actually intend to reach.
- Securing backward control flow : All function returns must
return to location where they were called from.
This patch series use riscv cpu extension `zicfilp` [2] to secure forward
control flow and `zicfiss` [2] to secure backward control flow. `zicfilp`
enforces that all indirect calls or jmps must land on a landing pad instr
and label embedded in landing pad instr must match a value programmed in
`x7` register (at callsite via compiler). `zicfiss` introduces shadow stack
which can only be writeable via shadow stack instructions (sspush and
ssamoswap) and thus can't be tampered with via inadvertent stores. More
details about extension can be read from [2] and there are details in
documentation as well (in this patch series).
Using config `CONFIG_RISCV_USER_CFI`, kernel support for riscv control flow
integrity for user mode programs can be compiled in the kernel.
Enabling of control flow integrity for user programs is left to user runtime
(specifically expected from dynamic loader). There has been a lot of earlier
discussion on the enabling topic around x86 shadow stack enabling [3, 4, 5] and
overall consensus had been to let dynamic loader (or usermode) to decide for
enabling the feature.
This patch series introduces arch agnostic `prctls` to enable shadow stack
and indirect branch tracking. And implements them on riscv. arm64 is expected
to implement shadow stack part of these arch agnostic `prctls` [6]
Changes since last time
***********************
Spec changes
------------
- Forward cfi spec has become much simpler. `lpad` instruction is pseudo for
`auipc rd, <20bit_imm>`. `lpad` checks x7 against 20bit embedded in instr.
Thus label width is 20bit.
- Shadow stack management instructions are reduced to
sspush - to push x1/x5 on shadow stack
sspopchk - pops from shadow stack and comapres with x1/x5.
ssamoswap - atomically swap value on shadow stack.
rdssp - reads current shadow stack pointer
- Shadow stack accesses on readonly memory always raise AMO/store page fault.
`sspopchk` is load but if underlying page is readonly, it'll raise a store
page fault. It simplifies hardware and kernel for COW handling for shadow
stack pages.
- riscv defines a new exception type `software check exception` and control flow
violations raise software check exception.
- enabling controls for shadow stack and landing are in xenvcfg CSR and controls
lower privilege mode enabling. As an example senvcfg controls enabling for U and
menvcfg controls enabling for S mode.
core mm shadow stack enabling
-----------------------------
Shadow stack for x86 usermode are now in mainline and thus this patch
series builds on top of that for arch-agnostic mm related changes. Big
thanks and shout out to Rick Edgecombe for that.
selftests
---------
Created some minimal selftests to test the patch series.
[1] - https://lore.kernel.org/lkml/20230213045351.3945824-1-debug@rivosinc.com/
[2] - https://github.com/riscv/riscv-cfi
[3] - https://lore.kernel.org/lkml/ZWHcBq0bJ+15eeKs@finisterre.sirena.org.uk/T/#m…
[4] - https://lore.kernel.org/all/20220130211838.8382-1-rick.p.edgecombe@intel.co…
[5] - https://lore.kernel.org/lkml/CAHk-=wgP5mk3poVeejw16Asbid0ghDt4okHnWaWKLBkRh…
[6] - https://lore.kernel.org/linux-mm/20231122-arm64-gcs-v7-2-201c483bd775@kerne…
Deepak Gupta (27):
riscv: abstract envcfg CSR
riscv: envcfg save and restore on trap entry/exit
riscv: define default value for envcfg
riscv/Kconfig: enable HAVE_EXIT_THREAD for riscv
riscv: zicfiss/zicfilp enumeration
riscv: zicfiss/zicfilp extension csr and bit definitions
riscv: kernel handling on trap entry/exit for user cfi
mm: Define VM_SHADOW_STACK for RISC-V
mm: abstract shadow stack vma behind `arch_is_shadow_stack`
riscv/mm : Introducing new protection flag "PROT_SHADOWSTACK"
riscv: Implementing "PROT_SHADOWSTACK" on riscv
riscv mm: manufacture shadow stack pte
riscv mmu: teach pte_mkwrite to manufacture shadow stack PTEs
riscv mmu: write protect and shadow stack
riscv/mm: Implement map_shadow_stack() syscall
riscv/shstk: If needed allocate a new shadow stack on clone
prctl: arch-agnostic prtcl for indirect branch tracking
riscv: Implements arch agnostic shadow stack prctls
riscv: Implements arch argnostic indirect branch tracking prctls
riscv/traps: Introduce software check exception
riscv sigcontext: adding cfi state field in sigcontext
riscv signal: Save and restore of shadow stack for signal
riscv: select config for shadow stack and landing pad instr support
riscv/ptrace: riscv cfi status and state via ptrace and in core files
riscv: Documentation for landing pad / indirect branch tracking
riscv: Documentation for shadow stack on riscv
kselftest/riscv: kselftest for user mode cfi
Mark Brown (1):
prctl: arch-agnostic prctl for shadow stack
Documentation/arch/riscv/zicfilp.rst | 104 ++++
Documentation/arch/riscv/zicfiss.rst | 169 ++++++
arch/riscv/Kconfig | 16 +
arch/riscv/include/asm/asm-prototypes.h | 1 +
arch/riscv/include/asm/cpufeature.h | 18 +
arch/riscv/include/asm/csr.h | 20 +
arch/riscv/include/asm/hwcap.h | 2 +
arch/riscv/include/asm/mman.h | 42 ++
arch/riscv/include/asm/pgtable.h | 32 +-
arch/riscv/include/asm/processor.h | 2 +
arch/riscv/include/asm/thread_info.h | 4 +
arch/riscv/include/asm/usercfi.h | 106 ++++
arch/riscv/include/uapi/asm/ptrace.h | 18 +
arch/riscv/include/uapi/asm/sigcontext.h | 5 +
arch/riscv/kernel/Makefile | 2 +
arch/riscv/kernel/asm-offsets.c | 6 +-
arch/riscv/kernel/cpufeature.c | 4 +-
arch/riscv/kernel/entry.S | 32 ++
arch/riscv/kernel/process.c | 16 +
arch/riscv/kernel/ptrace.c | 83 +++
arch/riscv/kernel/signal.c | 45 ++
arch/riscv/kernel/sys_riscv.c | 19 +
arch/riscv/kernel/traps.c | 38 ++
arch/riscv/kernel/usercfi.c | 497 ++++++++++++++++++
arch/riscv/mm/init.c | 2 +-
arch/riscv/mm/pgtable.c | 21 +
include/linux/mm.h | 35 +-
include/uapi/asm-generic/mman.h | 1 +
include/uapi/linux/elf.h | 1 +
include/uapi/linux/prctl.h | 49 ++
kernel/sys.c | 60 +++
mm/gup.c | 5 +-
mm/internal.h | 2 +-
mm/mmap.c | 1 +
tools/testing/selftests/riscv/Makefile | 2 +-
tools/testing/selftests/riscv/cfi/Makefile | 10 +
.../testing/selftests/riscv/cfi/cfi_rv_test.h | 85 +++
.../selftests/riscv/cfi/riscv_cfi_test.c | 91 ++++
.../testing/selftests/riscv/cfi/shadowstack.c | 376 +++++++++++++
.../testing/selftests/riscv/cfi/shadowstack.h | 39 ++
40 files changed, 2050 insertions(+), 11 deletions(-)
create mode 100644 Documentation/arch/riscv/zicfilp.rst
create mode 100644 Documentation/arch/riscv/zicfiss.rst
create mode 100644 arch/riscv/include/asm/mman.h
create mode 100644 arch/riscv/include/asm/usercfi.h
create mode 100644 arch/riscv/kernel/usercfi.c
create mode 100644 tools/testing/selftests/riscv/cfi/Makefile
create mode 100644 tools/testing/selftests/riscv/cfi/cfi_rv_test.h
create mode 100644 tools/testing/selftests/riscv/cfi/riscv_cfi_test.c
create mode 100644 tools/testing/selftests/riscv/cfi/shadowstack.c
create mode 100644 tools/testing/selftests/riscv/cfi/shadowstack.h
--
2.43.0
The config fragment doesn't follow the correct format to enable those
config options which make the config options getting missed while
merging with other configs.
➜ merge_config.sh -m .config tools/testing/selftests/iommu/config
Using .config as base
Merging tools/testing/selftests/iommu/config
➜ make olddefconfig
.config:5295:warning: unexpected data: CONFIG_IOMMUFD
.config:5296:warning: unexpected data: CONFIG_IOMMUFD_TEST
While at it, add CONFIG_FAULT_INJECTION as well which is needed for
CONFIG_IOMMUFD_TEST. If CONFIG_FAULT_INJECTION isn't present in base
config (such as x86 defconfig), CONFIG_IOMMUFD_TEST doesn't get enabled.
Fixes: 57f0988706fe ("iommufd: Add a selftest")
Signed-off-by: Muhammad Usama Anjum <usama.anjum(a)collabora.com>
---
tools/testing/selftests/iommu/config | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/iommu/config b/tools/testing/selftests/iommu/config
index 6c4f901d6fed3..110d73917615d 100644
--- a/tools/testing/selftests/iommu/config
+++ b/tools/testing/selftests/iommu/config
@@ -1,2 +1,3 @@
-CONFIG_IOMMUFD
-CONFIG_IOMMUFD_TEST
+CONFIG_IOMMUFD=y
+CONFIG_FAULT_INJECTION=y
+CONFIG_IOMMUFD_TEST=y
--
2.42.0
If an integer's type has x bits, shifting the integer left by x or more
is undefined behavior.
This can happen in the rotate function when attempting to do a rotation
of the whole value by 0.
Fixes: 0dd714bfd200 ("KVM: s390: selftest: memop: Add cmpxchg tests")
Signed-off-by: Nina Schoetterl-Glausch <nsg(a)linux.ibm.com>
---
v1 -> v2:
use early return instead of modulus
tools/testing/selftests/kvm/s390x/memop.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tools/testing/selftests/kvm/s390x/memop.c b/tools/testing/selftests/kvm/s390x/memop.c
index bb3ca9a5d731..4ec8d0181e8d 100644
--- a/tools/testing/selftests/kvm/s390x/memop.c
+++ b/tools/testing/selftests/kvm/s390x/memop.c
@@ -489,6 +489,8 @@ static __uint128_t rotate(int size, __uint128_t val, int amount)
amount = (amount + bits) % bits;
val = cut_to_size(size, val);
+ if (!amount)
+ return val;
return (val << (bits - amount)) | (val >> amount);
}
base-commit: 305230142ae0637213bf6e04f6d9f10bbcb74af8
--
2.40.1
Cleaning up after tests is implemented separately for individual tests
and called at the end of each test execution. Since these functions are
very similar and a more generalized test framework was introduced a
function pointer in the resctrl_test struct can be used to reduce the
amount of function calls.
These functions are also all called in the ctrl-c handler because the
handler isn't aware which test is currently running. Since the handler
is implemented with a sigaction no function parameters can be passed
there but information about what test is currently running can be passed
with a global variable.
Changelog v2:
- Make current_test a const pointer limited in scope to resctrl_val
file.
- Remove tests_cleanup from resctrl.h.
- Cleanup 'goto out' path and labels in individual test functions.
Older versions of this series:
[v1] https://lore.kernel.org/all/cover.1708434017.git.maciej.wieczor-retman@inte…
Maciej Wieczor-Retman (3):
selftests/resctrl: Add cleanup function to test framework
selftests/resctrl: Simplify cleanup in ctrl-c handler
selftests/resctrl: Move cleanups out of individual tests
tools/testing/selftests/resctrl/cat_test.c | 8 +++-----
tools/testing/selftests/resctrl/cmt_test.c | 4 ++--
tools/testing/selftests/resctrl/mba_test.c | 8 +++-----
tools/testing/selftests/resctrl/mbm_test.c | 8 +++-----
tools/testing/selftests/resctrl/resctrl.h | 9 +++------
tools/testing/selftests/resctrl/resctrl_tests.c | 16 +++++-----------
tools/testing/selftests/resctrl/resctrl_val.c | 6 ++++--
7 files changed, 23 insertions(+), 36 deletions(-)
--
2.43.2
On 2/21/24 07:44, Nicolai Stange wrote:
> Shresth Prasad <shresthprasad7(a)gmail.com> writes:
>
>> I checked the source code and yes I am on the latest Linux next repo.
>>
>> Here's the warning:
>> /home/shresthp/dev/linux_work/linux_next/tools/testing/selftests/livepatch/test_modules/test_klp_state.c:38:24: warning: assignment to ‘struct klp_state *’ from ‘int’ makes pointer from integer without a cast [-Wint-conversion]
>> 38 | loglevel_state = klp_get_state(&patch, CONSOLE_LOGLEVEL_STATE);
>> | ^
>
>
> Is the declaration of klp_get_state() visible at that point, i.e. is
> there perhaps any warning about missing declarations above that?
>
> Otherwise C rules would default to assume an 'int' return type.
>
This is an interesting clue. I thought I might be able to reproduce the
build error by modifying include/livepatch.h and running `make -j15 -C
tools/testing/selftests/livepatch` ... but that seemed to work fine on
my system. I even removed the entire include/ subdir from my tree and
it still built the test module. Huh?
Then I moved /lib/modules/$(uname -r)/build out of the way and saw that
the compilation failed. Ah hah -- that's right, it's using the system
build tree. That version of livepatch.h may have a missing or
completely different definition of klp_get_state().
How does this sequence work for you, Shresth:
# Verify that kernel livepatching is turned on
$ grep LIVEPATCH .config
CONFIG_HAVE_LIVEPATCH=y
CONFIG_LIVEPATCH=y
# Build linux-next kernel tree and then the livepatch selftests,
# pointing KDIR to this tree
$ make -j$(nproc) vmlinux && \
make -j$(nproc) KDIR=$(pwd) -C tools/testing/selftests/livepatch
--
Joe
Changelog:
v3:
* More cleanup (patch 3) (suggested by Yosry Ahmed).
* Check swap peak in swapin test
v2:
* Make the swapin test also checks for zswap usage (patch 3)
(suggested by Yosry Ahmed)
* Some test simplifications/cleanups (patch 3)
(suggested by Yosry Ahmed).
Fix a broken zswap kselftest due to cgroup zswap writeback counter
renaming, and add 2 zswap kselftests, one to cover the (z)swapin case,
and another to check that no zswapping happens when the cgroup limit is
0.
Also, add the zswap kselftest file to zswap maintainer entry so that
get_maintainers script can find zswap maintainers.
Nhat Pham (3):
selftests: zswap: add zswap selftest file to zswap maintainer entry
selftests: fix the zswap invasive shrink test
selftests: add zswapin and no zswap tests
MAINTAINERS | 1 +
tools/testing/selftests/cgroup/test_zswap.c | 122 +++++++++++++++++++-
2 files changed, 121 insertions(+), 2 deletions(-)
base-commit: 91f3daa1765ee4e0c89987dc25f72c40f07af34d
--
2.39.3
There are multiple bugs in tls_sw_recvmsg's handling of record types
when MSG_PEEK flag is used, which can lead to incorrectly merging two
records:
- consecutive non-DATA records shouldn't be merged, even if they're
the same type (partly handled by the test at the end of the main
loop)
- records of the same type (even DATA) shouldn't be merged if one
record of a different type comes in between
Sabrina Dubroca (5):
tls: break out of main loop when PEEK gets a non-data record
tls: stop recv() if initial process_rx_list gave us non-DATA
tls: don't skip over different type records from the rx_list
selftests: tls: add test for merging of same-type control messages
selftests: tls: add test for peeking past a record of a different type
net/tls/tls_sw.c | 24 +++++++++++------
tools/testing/selftests/net/tls.c | 45 +++++++++++++++++++++++++++++++
2 files changed, 61 insertions(+), 8 deletions(-)
--
2.43.0
Resending misc fixes for DAMON selftets on behalf of the original
authors for more visibility and inclusion on mm tree.
The patches are same to their original versions, except added Links: for
the original posts, and Signed-off-by: of mine.
Javier Carrasco (1):
selftests: damon: add access_memory to .gitignore
Vincenzo Mezzela (1):
selftest: damon: fix minor typos in test logs
tools/testing/selftests/damon/.gitignore | 1 +
.../selftests/damon/sysfs_update_schemes_tried_regions_hang.py | 2 +-
.../damon/sysfs_update_schemes_tried_regions_wss_estimation.py | 2 +-
3 files changed, 3 insertions(+), 2 deletions(-)
--
2.39.2
The series is a host of cleanups to the openvswitch selftest suite
which should be ready to run under the netdev selftest runners using
vng. For now, the testing has been done with RW directories, but
additional testing will be done to try and keep it all as RO to be
more friendly.
There is one more test case I plan which will print the debug log
details when a test case fails so that a developer can get a clear
picture why the test case failed. That will be done for the proper
submission as another patch in this series.
Additionally, the timeout setting was just an arbitrary number that
I picked, but needs more testing to tune it properly (since 5
minutes may be a bit too long).
Tested on fedora 38 using virtme-ng with the following commandline:
../virtme-ng/vng -v --run . --user root --cpus 4 \
--rwdir=/home/aconole/git/linux/tools/testing/selftests/net/openvswitch/ \
-- \
make -C tools/testing/selftests/net/openvswitch \
TARGETS=openvswitch TEST_PROGS=openvswitch.sh run_tests
Aaron Conole (7):
selftests: openvswitch: add test case error directories to clean list
selftests: openvswitch: be more verbose with selftest debugging
selftests: openvswitch: use non-graceful kills when needed
selftests: openvswitch: delete previously allocated netns
selftests: openvswitch: make arping test a bit 'slower'
selftests: openvswitch: insert module when running the tests
selftests: openvswitch: add config and timeout settings
.../selftests/net/openvswitch/Makefile | 12 ++++-
.../testing/selftests/net/openvswitch/config | 50 +++++++++++++++++++
.../selftests/net/openvswitch/openvswitch.sh | 33 +++++++++---
.../selftests/net/openvswitch/settings | 1 +
4 files changed, 89 insertions(+), 7 deletions(-)
create mode 100644 tools/testing/selftests/net/openvswitch/config
create mode 100644 tools/testing/selftests/net/openvswitch/settings
--
2.41.0
Hi all:
The core frequency is subjected to the process variation in semiconductors.
Not all cores are able to reach the maximum frequency respecting the
infrastructure limits. Consequently, AMD has redefined the concept of
maximum frequency of a part. This means that a fraction of cores can reach
maximum frequency. To find the best process scheduling policy for a given
scenario, OS needs to know the core ordering informed by the platform through
highest performance capability register of the CPPC interface.
Earlier implementations of amd-pstate preferred core only support a static
core ranking and targeted performance. Now it has the ability to dynamically
change the preferred core based on the workload and platform conditions and
accounting for thermals and aging.
Amd-pstate driver utilizes the functions and data structures provided by
the ITMT architecture to enable the scheduler to favor scheduling on cores
which can be get a higher frequency with lower voltage.
We call it amd-pstate preferred core.
Here sched_set_itmt_core_prio() is called to set priorities and
sched_set_itmt_support() is called to enable ITMT feature.
Amd-pstate driver uses the highest performance value to indicate
the priority of CPU. The higher value has a higher priority.
Amd-pstate driver will provide an initial core ordering at boot time.
It relies on the CPPC interface to communicate the core ranking to the
operating system and scheduler to make sure that OS is choosing the cores
with highest performance firstly for scheduling the process. When amd-pstate
driver receives a message with the highest performance change, it will
update the core ranking.
Changes from V13->V14:
- cpufreq:
- - fix build error without CONFIG_CPU_FREQ
- ACPI: CPPC:
Changes from V12->V13:
- ACPI: CPPC:
- - modify commit message.
- - modify handle function of the notify(0x85).
- cpufreq: amd-pstate:
- - implement update_limits() callback function.
- x86:
- - pick up Acked-By flag added by Petkov.
Changes from V11->V12:
- all:
- - pick up Reviewed-By flag added by Perry.
- cpufreq: amd-pstate:
- - rebase the latest linux-next and fixed conflicts.
- - fixed the issue about cpudata without init in amd_pstate_update_highest_perf().
Changes from V10->V11:
- cpufreq: amd-pstate:
- - according Perry's commnts, I replace the string with str_enabled_disable().
Changes from V9->V10:
- cpufreq: amd-pstate:
- - add judgement for highest_perf. When it is less than 255, the
preferred core feature is enabled. And it will set the priority.
- - deleset "static u32 max_highest_perf" etc, because amd p-state
perferred coe does not require specail process for hotpulg.
Changes form V8->V9:
- all:
- - pick up Tested-By flag added by Oleksandr.
- cpufreq: amd-pstate:
- - pick up Review-By flag added by Wyes.
- - ignore modification of bug.
- - add a attribute of prefcore_ranking.
- - modify data type conversion from u32 to int.
- Documentation: amd-pstate:
- - pick up Review-By flag added by Wyes.
Changes form V7->V8:
- all:
- - pick up Review-By flag added by Mario and Ray.
- cpufreq: amd-pstate:
- - use hw_prefcore embeds into cpudata structure.
- - delete preferred core init from cpu online/off.
Changes form V6->V7:
- x86:
- - Modify kconfig about X86_AMD_PSTATE.
- cpufreq: amd-pstate:
- - modify incorrect comments about scheduler_work().
- - convert highest_perf data type.
- - modify preferred core init when cpu init and online.
- ACPI: CPPC:
- - modify link of CPPC highest performance.
- cpufreq:
- - modify link of CPPC highest performance changed.
Changes form V5->V6:
- cpufreq: amd-pstate:
- - modify the wrong tag order.
- - modify warning about hw_prefcore sysfs attribute.
- - delete duplicate comments.
- - modify the variable name cppc_highest_perf to prefcore_ranking.
- - modify judgment conditions for setting highest_perf.
- - modify sysfs attribute for CPPC highest perf to pr_debug message.
- Documentation: amd-pstate:
- - modify warning: title underline too short.
Changes form V4->V5:
- cpufreq: amd-pstate:
- - modify sysfs attribute for CPPC highest perf.
- - modify warning about comments
- - rebase linux-next
- cpufreq:
- - Moidfy warning about function declarations.
- Documentation: amd-pstate:
- - align with ``amd-pstat``
Changes form V3->V4:
- Documentation: amd-pstate:
- - Modify inappropriate descriptions.
Changes form V2->V3:
- x86:
- - Modify kconfig and description.
- cpufreq: amd-pstate:
- - Add Co-developed-by tag in commit message.
- cpufreq:
- - Modify commit message.
- Documentation: amd-pstate:
- - Modify inappropriate descriptions.
Changes form V1->V2:
- ACPI: CPPC:
- - Add reference link.
- cpufreq:
- - Moidfy link error.
- cpufreq: amd-pstate:
- - Init the priorities of all online CPUs
- - Use a single variable to represent the status of preferred core.
- Documentation:
- - Default enabled preferred core.
- Documentation: amd-pstate:
- - Modify inappropriate descriptions.
- - Default enabled preferred core.
- - Use a single variable to represent the status of preferred core.
Meng Li (7):
x86: Drop CPU_SUP_INTEL from SCHED_MC_PRIO for the expansion.
ACPI: CPPC: Add get the highest performance cppc control
cpufreq: amd-pstate: Enable amd-pstate preferred core supporting.
cpufreq: Add a notification message that the highest perf has changed
cpufreq: amd-pstate: Update amd-pstate preferred core ranking
dynamically
Documentation: amd-pstate: introduce amd-pstate preferred core
Documentation: introduce amd-pstate preferrd core mode kernel command
line options
.../admin-guide/kernel-parameters.txt | 5 +
Documentation/admin-guide/pm/amd-pstate.rst | 59 +++++-
arch/x86/Kconfig | 5 +-
drivers/acpi/cppc_acpi.c | 13 ++
drivers/acpi/processor_driver.c | 6 +
drivers/cpufreq/amd-pstate.c | 183 +++++++++++++++++-
include/acpi/cppc_acpi.h | 5 +
include/linux/amd-pstate.h | 10 +
include/linux/cpufreq.h | 1 +
9 files changed, 275 insertions(+), 12 deletions(-)
--
2.34.1
> drivers/misc/ntsync.c | 1146 ++++++++++++++
Assuming this doesn't go into futex(2) or some other existing code...
Can you start putting all of this into top-level "windows" directory?
I suspect there will be more Windows stuff in the future.
So those who don't care about Windows can turn off just one config option
(CONFIG_WINDOWS) and be done with it.
Name it "Linux Subsystem for Windows" for 146% better memes.
[Still a RFC: there are a lot of FIXMEs in the code, and
calling the sleepable timer cb actually crashes.]
[Also using bpf-next as the base tree as there will be conflicting
changes otherwise]
This is crashing, and I have a few questions in the code (look for all
of the FIXMEs), so sending this now before I become insane :)
For reference, the use cases I have in mind:
---
Basically, I need to be able to defer a HID-BPF program for the
following reasons (from the aforementioned patch):
1. defer an event:
Sometimes we receive an out of proximity event, but the device can not
be trusted enough, and we need to ensure that we won't receive another
one in the following n milliseconds. So we need to wait those n
milliseconds, and eventually re-inject that event in the stack.
2. inject new events in reaction to one given event:
We might want to transform one given event into several. This is the
case for macro keys where a single key press is supposed to send
a sequence of key presses. But this could also be used to patch a
faulty behavior, if a device forgets to send a release event.
3. communicate with the device in reaction to one event:
We might want to communicate back to the device after a given event.
For example a device might send us an event saying that it came back
from sleeping state and needs to be re-initialized.
Currently we can achieve that by keeping a userspace program around,
raise a bpf event, and let that userspace program inject the events and
commands.
However, we are just keeping that program alive as a daemon for just
scheduling commands. There is no logic in it, so it doesn't really justify
an actual userspace wakeup. So a kernel workqueue seems simpler to handle.
The other part I'm not sure is whether we can say that BPF maps of type
queue/stack can be used in sleepable context.
I don't see any warning when running the test programs, but that's probably
not a guarantee I'm doing the things properly :)
Cheers,
Benjamin
To: Alexei Starovoitov <ast(a)kernel.org>
To: Daniel Borkmann <daniel(a)iogearbox.net>
To: John Fastabend <john.fastabend(a)gmail.com>
To: Andrii Nakryiko <andrii(a)kernel.org>
To: Martin KaFai Lau <martin.lau(a)linux.dev>
To: Eduard Zingerman <eddyz87(a)gmail.com>
To: Song Liu <song(a)kernel.org>
To: Yonghong Song <yonghong.song(a)linux.dev>
To: KP Singh <kpsingh(a)kernel.org>
To: Stanislav Fomichev <sdf(a)google.com>
To: Hao Luo <haoluo(a)google.com>
To: Jiri Olsa <jolsa(a)kernel.org>
To: Jiri Kosina <jikos(a)kernel.org>
To: Benjamin Tissoires <benjamin.tissoires(a)redhat.com>
To: Jonathan Corbet <corbet(a)lwn.net>
To: Shuah Khan <shuah(a)kernel.org>
Cc: <bpf(a)vger.kernel.org>
Cc: <linux-kernel(a)vger.kernel.org>
Cc: <linux-input(a)vger.kernel.org>
Cc: <linux-doc(a)vger.kernel.org>
Cc: <linux-kselftest(a)vger.kernel.org>
Signed-off-by: Benjamin Tissoires <bentiss(a)kernel.org>
---
Changes in v2:
- make use of bpf_timer (and dropped the custom HID handling)
- implemented bpf_timer_set_sleepable_cb as a kfunc
- still not implemented global subprogs
- no sleepable bpf_timer selftests yet
- Link to v1: https://lore.kernel.org/r/20240209-hid-bpf-sleepable-v1-0-4cc895b5adbd@kern…
---
Benjamin Tissoires (10):
bpf/verifier: introduce in_sleepable() helper
bpf/helpers: introduce sleepable timers
bpf/verifier: allow more maps in sleepable bpf programs
HID: bpf/dispatch: regroup kfuncs definitions
HID: bpf: export hid_hw_output_report as a BPF kfunc
selftests/hid: Add test for hid_bpf_hw_output_report
HID: bpf: allow to inject HID event from BPF
selftests/hid: add tests for hid_bpf_input_report
HID: bpf: allow to use bpf_timer_set_sleepable_cb() in tracing callbacks.
selftests/hid: add test for bpf_timer
Documentation/hid/hid-bpf.rst | 2 +-
drivers/hid/bpf/hid_bpf_dispatch.c | 232 ++++++++++++++-------
drivers/hid/hid-core.c | 2 +
include/linux/bpf_verifier.h | 2 +
include/linux/hid_bpf.h | 3 +
include/uapi/linux/bpf.h | 12 ++
kernel/bpf/helpers.c | 105 +++++++++-
kernel/bpf/verifier.c | 91 +++++++-
tools/testing/selftests/hid/hid_bpf.c | 195 ++++++++++++++++-
tools/testing/selftests/hid/progs/hid.c | 198 ++++++++++++++++++
.../testing/selftests/hid/progs/hid_bpf_helpers.h | 8 +
11 files changed, 756 insertions(+), 94 deletions(-)
---
base-commit: 4f7a05917237b006ceae760507b3d15305769ade
change-id: 20240205-hid-bpf-sleepable-c01260fd91c4
Best regards,
--
Benjamin Tissoires <bentiss(a)kernel.org>
Hi Christian,
On Tue, Feb 20, 2024 at 04:03:57PM +0100, Christian König wrote:
> Am 20.02.24 um 15:56 schrieb Maxime Ripard:
> > On Tue, Feb 20, 2024 at 02:28:53PM +0100, Christian König wrote:
> > > [SNIP]
> > > This kunit test is not meant to be run on real hardware, but rather just as
> > > stand a long kunit tests within user mode linux. I was assuming that it
> > > doesn't even compiles on bare metal.
> > >
> > > We should probably either double check the kconfig options to prevent
> > > compiling it or modify the test so that it can run on real hardware as well.
> > I think any cross-compiled kunit run will be impossible to differentiate
> > from running on real hardware. We should just make it work there.
>
> The problem is what the unit test basically does is registering and
> destroying a dummy device to see if initializing and tear down of the global
> pools work correctly.
>
> If you run on real hardware and have a real device
I assume you mean a real DRM device backed by TTM here, right?
> additionally to the dummy device the reference count of the global
> pool never goes down to zero and so it is never torn down.
>
> So running this test just doesn't make any sense in that environment.
> Any idea how to work around that?
I've added David, Brendan and Rae in Cc.
To sum up the problem, your tests are relying on the mock device created
to run a kunit test to be the sole DRM device in the system. But if you
compile a kernel with the kunit tests enabled and boot that on a real
hardware, then that assumption might not be true anymore and things
break apart. Is that a fair description?
If so, maybe we could detect if it's running under qemu or UML (if
that's something we can do in the first place), and then extend
kunit_attributes to only run that test if it's in a simulated
environment.
Maxime
For now, the BPF program of type BPF_PROG_TYPE_TRACING is not allowed to
be attached to multiple hooks, and we have to create a BPF program for
each kernel function, for which we want to trace, even through all the
program have the same (or similar) logic. This can consume extra memory,
and make the program loading slow if we have plenty of kernel function to
trace.
In the commit 4a1e7c0c63e0 ("bpf: Support attaching freplace programs to
multiple attach points"), the freplace BPF program is made to support
attach to multiple attach points. And in this series, we extend it to
fentry/fexit/raw_tp/...
In the 1st patch, we add the support to record index of the accessed
function args of the target for tracing program. Meanwhile, we add the
function btf_check_func_part_match() to compare the accessed function args
of two function prototype. This function will be used in the next commit.
In the 2nd patch, we do some adjust to bpf_tracing_prog_attach() to make
it support multiple attaching.
In the 3rd patch, we allow to set bpf cookie in bpf_link_create() even if
target_btf_id is set, as we are allowed to attach the tracing program to
new target.
In the 4th patch, we introduce the function libbpf_find_kernel_btf_id() to
libbpf to find the btf type id of the kernel function, and this function
will be used in the next commit.
In the 5th patch, we add the testcases for this series.
Menglong Dong (5):
bpf: tracing: add support to record and check the accessed args
bpf: tracing: support to attach program to multi hooks
libbpf: allow to set coookie when target_btf_id is set in
bpf_link_create
libbpf: add the function libbpf_find_kernel_btf_id()
selftests/bpf: add test cases for multiple attach of tracing program
include/linux/bpf.h | 6 +
include/uapi/linux/bpf.h | 1 +
kernel/bpf/btf.c | 121 ++++++++++++++
kernel/bpf/syscall.c | 118 +++++++++++---
tools/lib/bpf/bpf.c | 17 +-
tools/lib/bpf/libbpf.c | 83 ++++++++++
tools/lib/bpf/libbpf.h | 3 +
tools/lib/bpf/libbpf.map | 1 +
.../selftests/bpf/bpf_testmod/bpf_testmod.c | 49 ++++++
.../bpf/prog_tests/tracing_multi_attach.c | 153 ++++++++++++++++++
.../selftests/bpf/progs/tracing_multi_test.c | 66 ++++++++
11 files changed, 583 insertions(+), 35 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/tracing_multi_attach.c
create mode 100644 tools/testing/selftests/bpf/progs/tracing_multi_test.c
--
2.39.2
There is a spelling mistake in a printed message. Fix it.
Signed-off-by: Colin Ian King <colin.i.king(a)gmail.com>
---
tools/testing/selftests/sched/cs_prctl_test.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/sched/cs_prctl_test.c b/tools/testing/selftests/sched/cs_prctl_test.c
index 7ba057154343..62fba7356af2 100644
--- a/tools/testing/selftests/sched/cs_prctl_test.c
+++ b/tools/testing/selftests/sched/cs_prctl_test.c
@@ -276,7 +276,7 @@ int main(int argc, char *argv[])
if (setpgid(0, 0) != 0)
handle_error("process group");
- printf("\n## Create a thread/process/process group hiearchy\n");
+ printf("\n## Create a thread/process/process group hierarchy\n");
create_processes(num_processes, num_threads, procs);
need_cleanup = 1;
disp_processes(num_processes, procs);
--
2.39.2
While mq_perf_tests runs with the default kselftest timeout limit, which
is 45 seconds, the test takes about 60 seconds to complete on i3.metal
AWS instances. Hence, the test always times out. Increase the timeout
to 180 seconds.
Fixes: 852c8cbf34d3 ("selftests/kselftest/runner.sh: Add 45 second timeout per test")
Cc: <stable(a)vger.kernel.org> # 5.4.x
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Reviewed-by: Kees Cook <keescook(a)chromium.org>
---
Changes from v2
(https://lore.kernel.org/r/20240220000243.162285-1-sj@kernel.org)
- Update commit message about the new timeout limit to 180 seconds
- Remove wrong Link: line
Changes from v1
(https://lore.kernel.org/r/20240208212925.68286-1-sj@kernel.org)
- Use 180 seconds timeout instead of 100 seconds
tools/testing/selftests/mqueue/setting | 1 +
1 file changed, 1 insertion(+)
create mode 100644 tools/testing/selftests/mqueue/setting
diff --git a/tools/testing/selftests/mqueue/setting b/tools/testing/selftests/mqueue/setting
new file mode 100644
index 000000000000..a953c96aa16e
--- /dev/null
+++ b/tools/testing/selftests/mqueue/setting
@@ -0,0 +1 @@
+timeout=180
--
2.39.2
Resolves a spelling error in the test log, preventing potential
confusion.
Signed-off-by: Vincenzo Mezzela <vincenzo.mezzela(a)gmail.com>
---
It is submitted as part of my application to the "Linux Kernel
Bug Fixing Spring Unpaid 2024" mentorship program of the Linux
Foundation.
.../testing/selftests/ftrace/test.d/trigger/trigger-hist-mod.tc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/ftrace/test.d/trigger/trigger-hist-mod.tc b/tools/testing/selftests/ftrace/test.d/trigger/trigger-hist-mod.tc
index 4562e13cb26b..717898894ef7 100644
--- a/tools/testing/selftests/ftrace/test.d/trigger/trigger-hist-mod.tc
+++ b/tools/testing/selftests/ftrace/test.d/trigger/trigger-hist-mod.tc
@@ -40,7 +40,7 @@ grep "id: \(unknown_\|sys_\)" events/raw_syscalls/sys_exit/hist > /dev/null || \
reset_trigger
-echo "Test histgram with log2 modifier"
+echo "Test histogram with log2 modifier"
echo 'hist:keys=bytes_req.log2' > events/kmem/kmalloc/trigger
for i in `seq 1 10` ; do ( echo "forked" > /dev/null); done
--
2.34.1
Add a test to exercize cpu hotplug with the function tracer active to
ensure that sensitive functions in idle path are excluded from being
traced. This helps catch issues such as the one fixed by commit
4b3338aaa74d ("powerpc/ftrace: Fix stack teardown in ftrace_no_trace").
Signed-off-by: Naveen N Rao <naveen(a)kernel.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
---
.../ftrace/test.d/ftrace/func_hotplug.tc | 42 +++++++++++++++++++
1 file changed, 42 insertions(+)
create mode 100644 tools/testing/selftests/ftrace/test.d/ftrace/func_hotplug.tc
diff --git a/tools/testing/selftests/ftrace/test.d/ftrace/func_hotplug.tc b/tools/testing/selftests/ftrace/test.d/ftrace/func_hotplug.tc
new file mode 100644
index 000000000000..ccfbfde3d942
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/ftrace/func_hotplug.tc
@@ -0,0 +1,42 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0-or-later
+# description: ftrace - function trace across cpu hotplug
+# requires: function:tracer
+
+if ! which nproc ; then
+ nproc() {
+ ls -d /sys/devices/system/cpu/cpu[0-9]* | wc -l
+ }
+fi
+
+NP=`nproc`
+
+if [ $NP -eq 1 ] ;then
+ echo "We cannot test cpu hotplug in UP environment"
+ exit_unresolved
+fi
+
+# Find online cpu
+for i in /sys/devices/system/cpu/cpu[1-9]*; do
+ if [ -f $i/online ] && [ "$(cat $i/online)" = "1" ]; then
+ cpu=$i
+ break
+ fi
+done
+
+if [ -z "$cpu" ]; then
+ echo "We cannot test cpu hotplug with a single cpu online"
+ exit_unresolved
+fi
+
+echo 0 > tracing_on
+echo > trace
+
+: "Set $(basename $cpu) offline/online with function tracer enabled"
+echo function > current_tracer
+echo 1 > tracing_on
+(echo 0 > $cpu/online)
+(echo "forked"; sleep 1)
+(echo 1 > $cpu/online)
+echo 0 > tracing_on
+echo nop > current_tracer
base-commit: 130a83879954a9fed35cf4474d223b4fcfd479fa
--
2.43.0
This series aims to keep the git status clean after building the
selftests by adding some missing .gitignore files and object inclusion
in existing .gitignore files. This is one of the requirements listed in
the selftests documentation for new tests, but it is not always followed
as desired.
After adding these .gitignore files and including the generated objects,
the working tree appears clean again.
To: Shuah Khan <shuah(a)kernel.org>
To: SeongJae Park <sj(a)kernel.org>
Cc: linux-kernel(a)vger.kernel.org
Cc: linux-kselftest(a)vger.kernel.org
Signed-off-by: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
Changes in v4:
- damon: remove from the series to apply it in mm separately.
- Link to v3: https://lore.kernel.org/r/20240213-selftest_gitignore-v3-0-1f812368702b@gma…
Changes in v3:
- General: base on mm-unstable to avoid conflicts.
- damon: add missing Closes tag.
- Link to v2: https://lore.kernel.org/r/20240212-selftest_gitignore-v2-0-75f0de50a178@gma…
Changes in v2:
- Remove patch for netfilter (not relevant anymore).
- Add patch for damon (missing binary in .gitignore).
- Link to v1: https://lore.kernel.org/r/20240101-selftest_gitignore-v1-0-eb61b09adb05@gma…
---
Javier Carrasco (3):
selftests: uevent: add missing gitignore
selftests: thermal: intel: power_floor: add missing gitignore
selftests: thermal: intel: workload_hint: add missing gitignore
tools/testing/selftests/thermal/intel/power_floor/.gitignore | 1 +
tools/testing/selftests/thermal/intel/workload_hint/.gitignore | 1 +
tools/testing/selftests/uevent/.gitignore | 1 +
3 files changed, 3 insertions(+)
---
base-commit: 7e90b5c295ec1e47c8ad865429f046970c549a66
change-id: 20240101-selftest_gitignore-7da2c503766e
Best regards,
--
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
On Tue, 20 Feb 2024 at 11:57, Linus Torvalds
<torvalds(a)linux-foundation.org> wrote:
>
> It turns out that that commit is buggy for another reason, but it's
> hidden by the fact that apparently KUNIT_ASSERT_FALSE_MSG() doesn't
> check the format string.
The fix for that is this:
--- a/include/kunit/test.h
+++ b/include/kunit/test.h
@@ -579,7 +579,7 @@ void __printf(2, 3) kunit_log_append(struct
string_stream *log, const char *fmt,
void __noreturn __kunit_abort(struct kunit *test);
-void __kunit_do_failed_assertion(struct kunit *test,
+void __printf(6,7) __kunit_do_failed_assertion(struct kunit *test,
const struct kunit_loc *loc,
enum kunit_assert_type type,
const struct kunit_assert *assert,
but that causes a *lot* of noise (not just in drm_buddy_test.c), so
I'm not going to apply that fix as-is. Clearly there's a lot of
incorrect format parameters that have never been checked.
Instead adding Shuah and the KUnit people to the participants, and
hoping that they will fix this up and we can get the format fixes for
KUnit in the 6.9 timeframe.
Side note: when I apply the above patch, the suggestions gcc spews out
look invalid. Gcc seems to suggest turning a a format string of '%d"
to "%ld" for a size_t variable. That's wrong. It should be "%zu".
A 'size_t' can in fact be 'unsigned int' on some platforms (not just
in theory), so %ld is really incorrect not just from a sign
perspective.
Anyway, I guess I will commit the immediate drm_buddy_test.c fix to
get rid of the build issue, but the KUnit message format string issue
will have to be a "let's get this fixed up _later_" issue.
Linus
This series adds a few missing functions to the shell KTAP helpers
script which are present in the C counterpart, kselftest.h.
This series depends on
"selftests: Move KTAP bash helpers to selftests common folder"
https://lore.kernel.org/all/20240102141528.169947-1-laura.nao@collabora.com/
Signed-off-by: Nícolas F. R. A. Prado <nfraprado(a)collabora.com>
---
Nícolas F. R. A. Prado (4):
selftests: ktap_helpers: Add helper to print diagnostic messages
selftests: ktap_helpers: Add helper to pass/fail test based on exit code
selftests: ktap_helpers: Add a helper to abort the test
selftests: ktap_helpers: Add a helper to finish the test
tools/testing/selftests/kselftest/ktap_helpers.sh | 39 +++++++++++++++++++++--
1 file changed, 37 insertions(+), 2 deletions(-)
---
base-commit: f1ca07220ad16a9efae7f68e4bade0db89b63a3c
change-id: 20240131-ktap-sh-helpers-extend-805b77ca773c
Best regards,
--
Nícolas F. R. A. Prado <nfraprado(a)collabora.com>
The typo in the description shows up in test logs and output.
This patch submission is part of my application to the Linux Foundation
mentorship program: Linux kernel Bug Fixing Spring Unpaid 2024.
Signed-off-by: Ali Zahraee <ahzahraee(a)gmail.com>
---
tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc b/tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc
index add7d5bf585d..c45094d1e1d2 100644
--- a/tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc
+++ b/tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc
@@ -1,6 +1,6 @@
#!/bin/sh
# SPDX-License-Identifier: GPL-2.0
-# description: Test file and directory owership changes for eventfs
+# description: Test file and directory ownership changes for eventfs
original_group=`stat -c "%g" .`
original_owner=`stat -c "%u" .`
--
2.34.1
This series includes 4 types of fixes:
Patches 1 and 2 force the path-managers not to allocate a new address
entry when dealing with the "special" ID 0, reserved to the address of
the initial subflow. These patches can be backported up to v5.19 and
v5.12 respectively.
Patch 3 to 6 fix the in-kernel path-manager not to create duplicated
subflows. Patch 6 is the main fix, but patches 3 to 5 are some kind of
pre-requisities: they fix some data races that could also lead to the
creation of unexpected subflows. These patches can be backported up to
v5.7, v5.10, v6.0, and v5.15 respectively.
Note that patch 3 modifies the existing ULP API. No better solutions
have been found for -net, and there is some similar prior art, see
commit 0df48c26d841 ("tcp: add tcpi_bytes_acked to tcp_info"). Please
also note that TLS ULP Diag has likely the same issue.
Patches 7 to 9 fix issues in the selftests, when executing them on older
kernels, e.g. when testing the last version of these kselftests on the
v5.15.148 kernel as it is done by LKFT when validating stable kernels.
These patches only avoid printing expected errors the console and
marking some tests as "OK" while they have been skipped. Patches 7 and 8
can be backported up to v6.6.
Patches 10 to 13 make sure all MPTCP selftests subtests have a unique
name. It is important to have a unique (sub)test name in TAP, because
that's the test identifier. Some CI environments might drop tests with
duplicated names. Patches 10 to 12 can be backported up to v6.6.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Geliang Tang (2):
mptcp: add needs_id for userspace appending addr
mptcp: add needs_id for netlink appending addr
Matthieu Baerts (NGI0) (7):
selftests: mptcp: pm nl: also list skipped tests
selftests: mptcp: pm nl: avoid error msg on older kernels
selftests: mptcp: diag: fix bash warnings on older kernels
selftests: mptcp: simult flows: fix some subtest names
selftests: mptcp: userspace_pm: unique subtest names
selftests: mptcp: diag: unique 'in use' subtest names
selftests: mptcp: diag: unique 'cestab' subtest names
Paolo Abeni (4):
mptcp: fix lockless access in subflow ULP diag
mptcp: fix data races on local_id
mptcp: fix data races on remote_id
mptcp: fix duplicate subflow creation
include/net/tcp.h | 2 +-
net/mptcp/diag.c | 8 ++-
net/mptcp/pm_netlink.c | 69 ++++++++++++++---------
net/mptcp/pm_userspace.c | 15 ++---
net/mptcp/protocol.c | 2 +-
net/mptcp/protocol.h | 15 ++++-
net/mptcp/subflow.c | 15 ++---
net/tls/tls_main.c | 2 +-
tools/testing/selftests/net/mptcp/diag.sh | 41 ++++++++------
tools/testing/selftests/net/mptcp/pm_netlink.sh | 8 ++-
tools/testing/selftests/net/mptcp/simult_flows.sh | 3 +-
tools/testing/selftests/net/mptcp/userspace_pm.sh | 4 +-
12 files changed, 116 insertions(+), 68 deletions(-)
---
base-commit: c40c0d3a768c78a023a72fb2ceea00743e3a695d
change-id: 20240215-upstream-net-20240215-misc-fixes-03815ec14dc6
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
From: Paul Durrant <pdurrant(a)amazon.com>
This series contains a new patch from Sean added since v12 [1]:
* KVM: s390: Refactor kvm_is_error_gpa() into kvm_is_gpa_in_memslot()
This frees up the function name kvm_is_error_gpa() such that it can then be
re-defined in:
* KVM: pfncache: allow a cache to be activated with a fixed (userspace) HVA
to be used for a simple GPA validation helper function. The patch also now
contains an either/or address check for GPA versus HVA in
__kvm_gpc_refresh().
In:
* KVM: pfncache: add a mark-dirty helper
The function name has been changed from kvm_gpc_mark_dirty() to
kvm_gpc_mark_dirty_in_slot().
In:
* KVM: x86/xen: allow shared_info to be mapped by fixed HVA
missing HVA validation checks have been added and the 'hva == 0' test
has been changed to '!hva'. The KVM_XEN_ATTR_TYPE_SHARED_INFO and
KVM_XEN_ATTR_TYPE_SHARED_INFO_HVA cases are still largely handled as one
though as separation leads to duplicate calls to
kvm_xen_shared_info_init() which looks messy.
Also, patches with a 'xen' prefix have now been modified to have a
'x86/xen' prefix and patches with a 'selftests / xen' prefix have been
modified to have simply a 'selftests' prefix.
[1] https://lore.kernel.org/kvm/20240115125707.1183-1-paul@xen.org/
David Woodhouse (1):
KVM: pfncache: rework __kvm_gpc_refresh() to fix locking issues
Paul Durrant (19):
KVM: pfncache: Add a map helper function
KVM: pfncache: remove unnecessary exports
KVM: x86/xen: mark guest pages dirty with the pfncache lock held
KVM: pfncache: add a mark-dirty helper
KVM: pfncache: remove KVM_GUEST_USES_PFN usage
KVM: pfncache: stop open-coding offset_in_page()
KVM: pfncache: include page offset in uhva and use it consistently
KVM: pfncache: allow a cache to be activated with a fixed (userspace)
HVA
KVM: x86/xen: separate initialization of shared_info cache and content
KVM: x86/xen: re-initialize shared_info if guest (32/64-bit) mode is
set
KVM: x86/xen: allow shared_info to be mapped by fixed HVA
KVM: x86/xen: allow vcpu_info to be mapped by fixed HVA
KVM: selftests: map Xen's shared_info page using HVA rather than GFN
KVM: selftests: re-map Xen's vcpu_info using HVA rather than GPA
KVM: x86/xen: advertize the KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA
capability
KVM: x86/xen: split up kvm_xen_set_evtchn_fast()
KVM: x86/xen: don't block on pfncache locks in
kvm_xen_set_evtchn_fast()
KVM: pfncache: check the need for invalidation under read lock first
KVM: x86/xen: allow vcpu_info content to be 'safely' copied
Sean Christopherson (1):
KVM: s390: Refactor kvm_is_error_gpa() into kvm_is_gpa_in_memslot()
Documentation/virt/kvm/api.rst | 53 ++-
arch/s390/kvm/diag.c | 2 +-
arch/s390/kvm/gaccess.c | 14 +-
arch/s390/kvm/kvm-s390.c | 4 +-
arch/s390/kvm/priv.c | 4 +-
arch/s390/kvm/sigp.c | 2 +-
arch/x86/kvm/x86.c | 7 +-
arch/x86/kvm/xen.c | 361 +++++++++++------
include/linux/kvm_host.h | 49 ++-
include/linux/kvm_types.h | 8 -
include/uapi/linux/kvm.h | 9 +-
.../selftests/kvm/x86_64/xen_shinfo_test.c | 59 ++-
virt/kvm/pfncache.c | 382 ++++++++++--------
13 files changed, 591 insertions(+), 363 deletions(-)
base-commit: 7455665a3521aa7b56245c0a2810f748adc5fdd4
--
2.39.2
Hi Christian, Janosch, Heiko,
Here is a new version for the AR/MEM_OP issue I'm attempting to address.
(Thank you, Heiko, for the offline chat!)
Changes:
v3:
[HC] Drop the AR swap in MEM_OP path
[HC] Remove WARN and don't do save_access_regs on !bool
v2: https://lore.kernel.org/r/20240215205344.2562020-1-farman@linux.ibm.com/
[HC] Add a flag to indicate access registers have been loaded
v1: https://lore.kernel.org/r/20240209204539.4150550-1-farman@linux.ibm.com/
[CB] Store access registers around memop ioctl
[JF] Add a kernel selftest
RFC: https://lore.kernel.org/r/20240131205832.2179029-1-farman@linux.ibm.com/
Eric Farman (2):
KVM: s390: fix access register usage in ioctls
KVM: s390: selftests: memop: add a simple AR test
arch/s390/include/asm/kvm_host.h | 1 +
arch/s390/kvm/gaccess.c | 3 ++-
arch/s390/kvm/kvm-s390.c | 3 +++
tools/testing/selftests/kvm/s390x/memop.c | 28 +++++++++++++++++++++++
4 files changed, 34 insertions(+), 1 deletion(-)
--
2.40.1
Hi Christian, Janosch, Heiko,
Here is a new version for the AR/MEM_OP issue I'm attempting to address,
with Heiko's feedback to v1.
Patch 1 performs the host/guest access register swap that Christian
suggested (instead of a full sync_reg/store_reg process).
Patch 2 provides a selftest patch that exercises this scenario.
Applying patch 2 without patch 1 fails in the following way:
[eric@host linux]# ./tools/testing/selftests/kvm/s390x/memop
TAP version 13
1..16
ok 1 simple copy
ok 2 generic error checks
ok 3 copy with storage keys
ok 4 cmpxchg with storage keys
ok 5 concurrently cmpxchg with storage keys
ok 6 copy with key storage protection override
ok 7 copy with key fetch protection
ok 8 copy with key fetch protection override
==== Test Assertion Failure ====
s390x/memop.c:186: !r
pid=5720 tid=5720 errno=4 - Interrupted system call
1 0x00000000010042af: memop_ioctl at memop.c:186 (discriminator 3)
2 0x0000000001006697: test_copy_access_register at memop.c:400 (discriminator 2)
3 0x0000000001002aaf: main at memop.c:1181
4 0x000003ffaec33a5b: ?? ??:0
5 0x000003ffaec33b5d: ?? ??:0
6 0x0000000001002ba9: _start at ??:?
KVM_S390_MEM_OP failed, rc: 40 errno: 4 (Interrupted system call)
Thoughts on this approach?
Thanks,
Eric
Changes:
v2:
[HC] Add a flag to indicate access registers have been loaded
v1: https://lore.kernel.org/r/20240209204539.4150550-1-farman@linux.ibm.com/
[CB] Store access registers around memop ioctl
[JF] Add a kernel selftest
RFC: https://lore.kernel.org/r/20240131205832.2179029-1-farman@linux.ibm.com/
Eric Farman (2):
KVM: s390: load guest access registers in MEM_OP ioctl
KVM: s390: selftests: memop: add a simple AR test
arch/s390/include/asm/kvm_host.h | 1 +
arch/s390/kvm/gaccess.c | 2 ++
arch/s390/kvm/kvm-s390.c | 11 +++++++++
tools/testing/selftests/kvm/s390x/memop.c | 28 +++++++++++++++++++++++
4 files changed, 42 insertions(+)
--
2.40.1
Cleaning up after tests is implemented separately for individual tests
and called at the end of each test execution. Since these functions are
very similar and a more generalized test framework was introduced a
function pointer in the resctrl_test struct can be used to reduce the
amount of function calls.
These functions are also all called in the ctrl-c handler because the
handler isn't aware which test is currently running. Since the handler
is implemented with a sigaction no function parameters can be passed
there but information about what test is currently running can be passed
with a global variable.
Maciej Wieczor-Retman (3):
selftests/resctrl: Add cleanup function to test framework
selftests/resctrl: Simplify cleanup in ctrl-c handler
selftests/resctrl: Move cleanups out of individual tests
tools/testing/selftests/resctrl/cat_test.c | 5 ++---
tools/testing/selftests/resctrl/cmt_test.c | 4 ++--
tools/testing/selftests/resctrl/mba_test.c | 5 ++---
tools/testing/selftests/resctrl/mbm_test.c | 5 ++---
tools/testing/selftests/resctrl/resctrl.h | 8 +++----
.../testing/selftests/resctrl/resctrl_tests.c | 22 ++++++++++---------
tools/testing/selftests/resctrl/resctrl_val.c | 2 +-
7 files changed, 25 insertions(+), 26 deletions(-)
--
2.43.2
I have been steadily working but struggled to find a seamlessly
integrated way to implement tty frontend until Guilherme inspired me
that multi-backend and tty frontend are actually two separate entities.
This submission presents the 3rd iteration of my efforts, listing
notable changes form the v1:
1. pstore.backend no longer acts as "registered backend", but "backends
eligible for registration".
2. drop subdir since it will break user space
3. drop tty frontend since I haven't yet devised a satisfactory
implementation strategy
Changes from v2:
1. Fix ftrace.c build error as I did not compile with
CONFIG_PSTORE_FTRACE.
A heartfelt thank you to Kees and Guilherme for your suggestions.
I firmly believe that a tty frontend is crucial for kdump debugging,
and I am still dedicating effort to develop one. Hope in the future I
can accomplish it with deeper comprehension with tty driver :)
Yuanhe Shu (3):
pstore: add multi-backend support
Documentation: adjust pstore backend related document
tools/testing: adjust pstore backend related selftest
Documentation/ABI/testing/pstore | 8 +-
.../admin-guide/kernel-parameters.txt | 4 +-
fs/pstore/ftrace.c | 31 ++-
fs/pstore/inode.c | 19 +-
fs/pstore/internal.h | 4 +-
fs/pstore/platform.c | 225 ++++++++++++------
fs/pstore/pmsg.c | 24 +-
include/linux/pstore.h | 29 +++
tools/testing/selftests/pstore/common_tests | 8 +-
.../selftests/pstore/pstore_post_reboot_tests | 65 ++---
tools/testing/selftests/pstore/pstore_tests | 2 +-
11 files changed, 295 insertions(+), 124 deletions(-)
--
2.39.3
This is a v2 for previous series [1] to allow mapping for compound tail
pages for IO or PFNMAP mapping.
Compared to v1, this version provides selftest to check functionality in
KVM to map memslots for MMIO BARs (VMAs with flag VM_IO | VM_PFNMAP), as
requested by Sean in [1].
The selftest can also be used to test series "allow mapping non-refcounted
pages" [2].
Tag RFC is added because a test driver is introduced in patch 2, which is
new to KVM selftest, and test "set_memory_region_io" in patch 3 depends on
that the test driver is compiled and loaded in kernel.
Besides, patch 3 calls vm_set_user_memory_region() directly without
modifying vm_mem_add().
So, this series is sent to ensure the main direction is right.
Thanks
Yan
[1] https://lore.kernel.org/all/20230719083332.4584-1-yan.y.zhao@intel.com/
[2] https://lore.kernel.org/all/20230911021637.1941096-1-stevensd@google.com/
v2:
added patch 2 and 3 to do selftest for patch 1 (Sean).
Yan Zhao (3):
KVM: allow mapping of compound tail pages for IO or PFNMAP mapping
KVM: selftests: add selftest driver for KVM to test memory slots for
MMIO BARs
KVM: selftests: Add set_memory_region_io to test memslots for MMIO
BARs
lib/Kconfig.debug | 14 +
lib/Makefile | 1 +
lib/test_kvm_mock_device.c | 281 ++++++++++++++++++
lib/test_kvm_mock_device_uapi.h | 16 +
tools/testing/selftests/kvm/Makefile | 1 +
.../selftests/kvm/set_memory_region_io.c | 188 ++++++++++++
virt/kvm/kvm_main.c | 2 +-
7 files changed, 502 insertions(+), 1 deletion(-)
create mode 100644 lib/test_kvm_mock_device.c
create mode 100644 lib/test_kvm_mock_device_uapi.h
create mode 100644 tools/testing/selftests/kvm/set_memory_region_io.c
base-commit: 8ed26ab8d59111c2f7b86d200d1eb97d2a458fd1
--
2.17.1
While mq_perf_tests runs with the default kselftest timeout limit, which
is 45 seconds, the test takes about 60 seconds to complete on i3.metal
AWS instances. Hence, the test always times out. Increase the timeout
to 100 seconds.
Link: https://lore.kernel.org/r/20240208212925.68286-1-sj@kernel.org
Fixes: 852c8cbf34d3 ("selftests/kselftest/runner.sh: Add 45 second timeout per test")
Cc: <stable(a)vger.kernel.org> # 5.4.x
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Reviewed-by: Kees Cook <keescook(a)chromium.org>
---
Changes from v1
(https://lore.kernel.org/r/20240208212925.68286-1-sj@kernel.org)
- Use 180 seconds timeout instead of 100 seconds
tools/testing/selftests/mqueue/setting | 1 +
1 file changed, 1 insertion(+)
create mode 100644 tools/testing/selftests/mqueue/setting
diff --git a/tools/testing/selftests/mqueue/setting b/tools/testing/selftests/mqueue/setting
new file mode 100644
index 000000000000..a953c96aa16e
--- /dev/null
+++ b/tools/testing/selftests/mqueue/setting
@@ -0,0 +1 @@
+timeout=180
--
2.39.2
While mq_perf_tests runs with the default kselftest timeout limit, which
is 45 seconds, the test takes about 60 seconds to complete on i3.metal
AWS instances. Hence, the test always times out. Increase the timeout
to 100 seconds.
Fixes: 852c8cbf34d3 ("selftests/kselftest/runner.sh: Add 45 second timeout per test")
Cc: <stable(a)vger.kernel.org> # 5.4.x
Signed-off-by: SeongJae Park <sj(a)kernel.org>
---
tools/testing/selftests/mqueue/setting | 1 +
1 file changed, 1 insertion(+)
create mode 100644 tools/testing/selftests/mqueue/setting
diff --git a/tools/testing/selftests/mqueue/setting b/tools/testing/selftests/mqueue/setting
new file mode 100644
index 000000000000..54dc12287839
--- /dev/null
+++ b/tools/testing/selftests/mqueue/setting
@@ -0,0 +1 @@
+timeout=100
--
2.39.2
The futex_requeue_pi test program is run a number of times with different
options to provide multiple test cases. Currently every time it runs it
reports the result with a consistent string, meaning that automated systems
parsing the TAP output from a test run have difficulty in distinguishing
which test is which.
The parameters used for the test are already logged as part of the test
output, let's use the same format to roll them into the test name that we
use with KTAP so that automated systems can follow the results of the
individual cases that get run.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
tools/testing/selftests/futex/functional/futex_requeue_pi.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/futex/functional/futex_requeue_pi.c b/tools/testing/selftests/futex/functional/futex_requeue_pi.c
index 1ee5518ee6b7..7f3ca5c78df1 100644
--- a/tools/testing/selftests/futex/functional/futex_requeue_pi.c
+++ b/tools/testing/selftests/futex/functional/futex_requeue_pi.c
@@ -17,6 +17,8 @@
*
*****************************************************************************/
+#define _GNU_SOURCE
+
#include <errno.h>
#include <limits.h>
#include <pthread.h>
@@ -358,6 +360,7 @@ int unit_test(int broadcast, long lock, int third_party_owner, long timeout_ns)
int main(int argc, char *argv[])
{
+ const char *test_name;
int c, ret;
while ((c = getopt(argc, argv, "bchlot:v:")) != -1) {
@@ -397,6 +400,14 @@ int main(int argc, char *argv[])
"\tArguments: broadcast=%d locked=%d owner=%d timeout=%ldns\n",
broadcast, locked, owner, timeout_ns);
+ ret = asprintf(&test_name,
+ "%s broadcast=%d locked=%d owner=%d timeout=%ldns",
+ TEST_NAME, broadcast, locked, owner, timeout_ns);
+ if (ret < 0) {
+ ksft_print_msg("Failed to generate test name\n");
+ test_name = TEST_NAME;
+ }
+
/*
* FIXME: unit_test is obsolete now that we parse options and the
* various style of runs are done by run.sh - simplify the code and move
@@ -404,6 +415,6 @@ int main(int argc, char *argv[])
*/
ret = unit_test(broadcast, locked, owner, timeout_ns);
- print_result(TEST_NAME, ret);
+ print_result(test_name, ret);
return ret;
}
---
base-commit: 54be6c6c5ae8e0d93a6c4641cb7528eb0b6ba478
change-id: 20240213-kselftest-futex-requeue-pi-unique-5a462303f6bc
Best regards,
--
Mark Brown <broonie(a)kernel.org>
Replace deprecated 0-length array in struct bpf_lpm_trie_key with
flexible array. Found with GCC 13:
../kernel/bpf/lpm_trie.c:207:51: warning: array subscript i is outside array bounds of 'const __u8[0]' {aka 'const unsigned char[]'} [-Warray-bounds=]
207 | *(__be16 *)&key->data[i]);
| ^~~~~~~~~~~~~
../include/uapi/linux/swab.h:102:54: note: in definition of macro '__swab16'
102 | #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x))
| ^
../include/linux/byteorder/generic.h:97:21: note: in expansion of macro '__be16_to_cpu'
97 | #define be16_to_cpu __be16_to_cpu
| ^~~~~~~~~~~~~
../kernel/bpf/lpm_trie.c:206:28: note: in expansion of macro 'be16_to_cpu'
206 | u16 diff = be16_to_cpu(*(__be16 *)&node->data[i]
^
| ^~~~~~~~~~~
In file included from ../include/linux/bpf.h:7:
../include/uapi/linux/bpf.h:82:17: note: while referencing 'data'
82 | __u8 data[0]; /* Arbitrary size */
| ^~~~
And found at run-time under CONFIG_FORTIFY_SOURCE:
UBSAN: array-index-out-of-bounds in kernel/bpf/lpm_trie.c:218:49
index 0 is out of range for type '__u8 [*]'
This includes fixing the selftest which was incorrectly using a
variable length struct as a header, identified earlier[1]. Avoid this
by just explicitly including the prefixlen member instead of struct
bpf_lpm_trie_key.
Note that it is not possible to simply remove the "data" member, as it
is referenced by userspace
cilium:
struct egress_gw_policy_key in_key = {
.lpm_key = { 32 + 24, {} },
.saddr = CLIENT_IP,
.daddr = EXTERNAL_SVC_IP & 0Xffffff,
};
systemd:
ipv6_map_fd = bpf_map_new(
BPF_MAP_TYPE_LPM_TRIE,
offsetof(struct bpf_lpm_trie_key, data) + sizeof(uint32_t)*4,
sizeof(uint64_t),
...
The only risk to UAPI would be if sizeof() were used directly on the
data member, which it does not seem to be. It is only used as a static
initializer destination and to find its location via offsetof().
Link: https://lore.kernel.org/all/202206281009.4332AA33@keescook/ [1]
Reported-by: Mark Rutland <mark.rutland(a)arm.com>
Closes: https://paste.debian.net/hidden/ca500597/
Signed-off-by: Kees Cook <keescook(a)chromium.org>
---
Cc: Alexei Starovoitov <ast(a)kernel.org>
Cc: Daniel Borkmann <daniel(a)iogearbox.net>
Cc: Andrii Nakryiko <andrii(a)kernel.org>
Cc: Martin KaFai Lau <martin.lau(a)linux.dev>
Cc: Song Liu <song(a)kernel.org>
Cc: Yonghong Song <yhs(a)fb.com>
Cc: John Fastabend <john.fastabend(a)gmail.com>
Cc: KP Singh <kpsingh(a)kernel.org>
Cc: Stanislav Fomichev <sdf(a)google.com>
Cc: Hao Luo <haoluo(a)google.com>
Cc: Jiri Olsa <jolsa(a)kernel.org>
Cc: Mykola Lysenko <mykolal(a)fb.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Haowen Bai <baihaowen(a)meizu.com>
Cc: bpf(a)vger.kernel.org
Cc: linux-kselftest(a)vger.kernel.org
v2- clarify commit log, add more failure examples
v1- https://lore.kernel.org/all/63e531e3.170a0220.3a46a.3262@mx.google.com/
---
include/uapi/linux/bpf.h | 2 +-
tools/testing/selftests/bpf/progs/map_ptr_kern.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 754e68ca8744..359dd8a429c1 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -80,7 +80,7 @@ struct bpf_insn {
/* Key of an a BPF_MAP_TYPE_LPM_TRIE entry */
struct bpf_lpm_trie_key {
__u32 prefixlen; /* up to 32 for AF_INET, 128 for AF_INET6 */
- __u8 data[0]; /* Arbitrary size */
+ __u8 data[]; /* Arbitrary size */
};
struct bpf_cgroup_storage_key {
diff --git a/tools/testing/selftests/bpf/progs/map_ptr_kern.c b/tools/testing/selftests/bpf/progs/map_ptr_kern.c
index 3325da17ec81..1d476c6ae284 100644
--- a/tools/testing/selftests/bpf/progs/map_ptr_kern.c
+++ b/tools/testing/selftests/bpf/progs/map_ptr_kern.c
@@ -316,7 +316,7 @@ struct lpm_trie {
} __attribute__((preserve_access_index));
struct lpm_key {
- struct bpf_lpm_trie_key trie_key;
+ __u32 prefixlen;
__u32 data;
};
--
2.34.1
This patch series introduces a new char misc driver, /dev/ntsync, which is used
to implement Windows NT synchronization primitives.
This was previously submitted as an RFC [1]. Since there were no major changes
requested to the last RFC revision, I've stripped the RFC prefix.
[1] https://lore.kernel.org/lkml/20240131021356.10322-1-zfigura@codeweavers.com/
== Background ==
The Wine project emulates the Windows API in user space. One particular part of
that API, namely the NT synchronization primitives, have historically been
implemented via RPC to a dedicated "kernel" process. However, more recent
applications use these APIs more strenuously, and the overhead of RPC has become
a bottleneck.
The NT synchronization APIs are too complex to implement on top of existing
primitives without sacrificing correctness. Certain operations, such as
NtPulseEvent() or the "wait-for-all" mode of NtWaitForMultipleObjects(), require
direct control over the underlying wait queue, and implementing a wait queue
sufficiently robust for Wine in user space is not possible. This proposed
driver, therefore, implements the problematic interfaces directly in the Linux
kernel.
This driver was presented at Linux Plumbers Conference 2023. For those further
interested in the history of synchronization in Wine and past attempts to solve
this problem in user space, a recording of the presentation can be viewed here:
https://www.youtube.com/watch?v=NjU4nyWyhU8
== Performance ==
The gain in performance varies wildly depending on the application in question
and the user's hardware. For some games NT synchronization is not a bottleneck
and no change can be observed, but for others frame rate improvements of 50 to
150 percent are not atypical. The following table lists frame rate measurements
from a variety of games on a variety of hardware, taken by users Dmitry
Skvortsov, FuzzyQuils, OnMars, and myself:
Game Upstream ntsync improvement
===========================================================================
Anger Foot 69 99 43%
Call of Juarez 99.8 224.1 125%
Dirt 3 110.6 860.7 678%
Forza Horizon 5 108 160 48%
Lara Croft: Temple of Osiris 141 326 131%
Metro 2033 164.4 199.2 21%
Resident Evil 2 26 77 196%
The Crew 26 51 96%
Tiny Tina's Wonderlands 130 360 177%
Total War Saga: Troy 109 146 34%
===========================================================================
== Patches ==
The intended semantics of the patches are broadly intended to match those of the
corresponding Windows functions. For those not already familiar with the Windows
functions (or their undocumented behaviour), patch 31/31 provides a detailed
specification, and individual patches also include a brief description of the
API they are implementing.
The patches making use of this driver in Wine can be retrieved or browsed here:
https://repo.or.cz/wine/zf.git/shortlog/refs/heads/ntsync5
== Implementation ==
Some aspects of the implementation may deserve particular comment:
* In the interest of performance, each object is governed only by a single
spinlock. However, NTSYNC_IOC_WAIT_ALL requires that the state of multiple
objects be changed as a single atomic operation. In order to achieve this, we
first take a device-wide lock ("wait_all_lock") any time we are going to lock
more than one object at a time.
The maximum number of objects that can be used in a vectored wait, and
therefore the maximum that can be locked simultaneously, is 64. This number is
NT's own limit.
The acquisition of multiple spinlocks will degrade performance. This is a
conscious choice, however. Wait-for-all is known to be a very rare operation
in practice, especially with counts that approach the maximum, and it is the
intent of the ntsync driver to optimize wait-for-any at the expense of
wait-for-all as much as possible.
* NT mutexes are tied to their threads on an OS level, and the kernel includes
builtin support for "robust" mutexes. In order to keep the ntsync driver
self-contained and avoid touching more code than necessary, it does not hook
into task exit nor use pids.
Instead, the user space emulator is expected to manage thread IDs and pass
them as an argument to any relevant functions; this is the "owner" field of
ntsync_wait_args and ntsync_mutex_args.
When the emulator detects that a thread dies, it should therefore call
NTSYNC_IOC_MUTEX_KILL on any open mutexes.
* ntsync is module-capable mostly because there was nothing preventing it, and
because it aided development. It is not a hard requirement, though.
== Previous versions ==
Changes from the last (v2) RFC:
* Add a new wait flag NTSYNC_WAIT_REALTIME. I had originally missed a corner
case in NtWaitForMultipleObjects() related to its interaction with system time
adjustments. Essentially the function is sometimes supposed to respect system
time adjustments and sometimes supposed to ignore them, so in order to achieve
this I've added a function that controls which flag is being synchronized to.
Thanks Piotr Caban for catching this.
* Add tests for overflowing semaphore and mutex counters, and a test for
exceeding NTSYNC_MAX_WAIT_COUNT, per Andi Kleen.
* Add a more intense and realistic test involving multiple threads using the
same mutex to access data, per Andi Kleen.
* Use check_add_overflow() instead of writing out overflow checking manually
[and thereby avoid relying on -fwrapv].
* Add some missing headers that were being implicitly included: atomic.h,
hrtimer.h, ktime.h, sched.h, sched/signal.h, spinlock.h.
* Link to RFC v2: https://lore.kernel.org/lkml/20240131021356.10322-1-zfigura@codeweavers.com/
* Link to RFC v1: https://lore.kernel.org/lkml/20240124004028.16826-1-zfigura@codeweavers.com/
Elizabeth Figura (31):
ntsync: Introduce the ntsync driver and character device.
ntsync: Introduce NTSYNC_IOC_CREATE_SEM.
ntsync: Introduce NTSYNC_IOC_SEM_POST.
ntsync: Introduce NTSYNC_IOC_WAIT_ANY.
ntsync: Introduce NTSYNC_IOC_WAIT_ALL.
ntsync: Introduce NTSYNC_IOC_CREATE_MUTEX.
ntsync: Introduce NTSYNC_IOC_MUTEX_UNLOCK.
ntsync: Introduce NTSYNC_IOC_MUTEX_KILL.
ntsync: Introduce NTSYNC_IOC_CREATE_EVENT.
ntsync: Introduce NTSYNC_IOC_EVENT_SET.
ntsync: Introduce NTSYNC_IOC_EVENT_RESET.
ntsync: Introduce NTSYNC_IOC_EVENT_PULSE.
ntsync: Introduce NTSYNC_IOC_SEM_READ.
ntsync: Introduce NTSYNC_IOC_MUTEX_READ.
ntsync: Introduce NTSYNC_IOC_EVENT_READ.
ntsync: Introduce alertable waits.
ntsync: Allow waits to use the REALTIME clock.
selftests: ntsync: Add some tests for semaphore state.
selftests: ntsync: Add some tests for mutex state.
selftests: ntsync: Add some tests for NTSYNC_IOC_WAIT_ANY.
selftests: ntsync: Add some tests for NTSYNC_IOC_WAIT_ALL.
selftests: ntsync: Add some tests for wakeup signaling with
WINESYNC_IOC_WAIT_ANY.
selftests: ntsync: Add some tests for wakeup signaling with
WINESYNC_IOC_WAIT_ALL.
selftests: ntsync: Add some tests for manual-reset event state.
selftests: ntsync: Add some tests for auto-reset event state.
selftests: ntsync: Add some tests for wakeup signaling with events.
selftests: ntsync: Add tests for alertable waits.
selftests: ntsync: Add some tests for wakeup signaling via alerts.
selftests: ntsync: Add a stress test for contended waits.
maintainers: Add an entry for ntsync.
docs: ntsync: Add documentation for the ntsync uAPI.
Documentation/userspace-api/index.rst | 1 +
.../userspace-api/ioctl/ioctl-number.rst | 2 +
Documentation/userspace-api/ntsync.rst | 399 +++++
MAINTAINERS | 9 +
drivers/misc/Kconfig | 9 +
drivers/misc/Makefile | 1 +
drivers/misc/ntsync.c | 1146 ++++++++++++++
include/uapi/linux/ntsync.h | 62 +
tools/testing/selftests/Makefile | 1 +
.../testing/selftests/drivers/ntsync/Makefile | 8 +
tools/testing/selftests/drivers/ntsync/config | 1 +
.../testing/selftests/drivers/ntsync/ntsync.c | 1407 +++++++++++++++++
12 files changed, 3046 insertions(+)
create mode 100644 Documentation/userspace-api/ntsync.rst
create mode 100644 drivers/misc/ntsync.c
create mode 100644 include/uapi/linux/ntsync.h
create mode 100644 tools/testing/selftests/drivers/ntsync/Makefile
create mode 100644 tools/testing/selftests/drivers/ntsync/config
create mode 100644 tools/testing/selftests/drivers/ntsync/ntsync.c
base-commit: e21817acb23ece75d41a4fa7b40c85550f147389
--
2.43.0
When adapting the test to the kselftest framework, a few printf() calls
indicating test progress were not updated.
Fix this by replacing these printf() calls by ksft_print_msg() calls.
Fixes: ce7d101750ff8450 ("selftests: timers: clocksource-switch: adapt to kselftest framework")
Signed-off-by: Geert Uytterhoeven <geert+renesas(a)glider.be>
---
When just running the test, the output looks like:
# Validating clocksource arch_sys_counter
TAP version 13
1..12
ok 1 CLOCK_REALTIME
...
# Validating clocksource ffca0000.timer
TAP version 13
1..12
ok 1 CLOCK_REALTIME
...
When redirecting the test output to a file, the progress prints are not
interspersed with the test output, but collated at the end:
TAP version 13
1..12
ok 1 CLOCK_REALTIME
...
TAP version 13
1..12
ok 1 CLOCK_REALTIME
...
# Totals: pass:6 fail:0 xfail:0 xpass:0 skip:6 error:0
# Validating clocksource arch_sys_counter
# Validating clocksource ffca0000.timer
...
This makes it hard to match the test results with the timer under test.
Is there a way to fix this? The test does use fork().
Thanks!
---
tools/testing/selftests/timers/clocksource-switch.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/timers/clocksource-switch.c b/tools/testing/selftests/timers/clocksource-switch.c
index c5264594064c8516..83faa4e354e389c2 100644
--- a/tools/testing/selftests/timers/clocksource-switch.c
+++ b/tools/testing/selftests/timers/clocksource-switch.c
@@ -156,8 +156,8 @@ int main(int argc, char **argv)
/* Check everything is sane before we start switching asynchronously */
if (do_sanity_check) {
for (i = 0; i < count; i++) {
- printf("Validating clocksource %s\n",
- clocksource_list[i]);
+ ksft_print_msg("Validating clocksource %s\n",
+ clocksource_list[i]);
if (change_clocksource(clocksource_list[i])) {
status = -1;
goto out;
@@ -169,7 +169,7 @@ int main(int argc, char **argv)
}
}
- printf("Running Asynchronous Switching Tests...\n");
+ ksft_print_msg("Running Asynchronous Switching Tests...\n");
pid = fork();
if (!pid)
return run_tests(runtime);
--
2.34.1
Make sure the IOAM data insertion is not applied on cloned skb's. As a
consequence, ioam selftests needed a refactoring.
Justin Iurman (2):
ioam6: fix write to cloned skb in ipv6_hop_ioam()
selftests: ioam6: refactoring to align with the fix
net/ipv6/exthdrs.c | 8 ++
tools/testing/selftests/net/ioam6.sh | 38 ++++----
tools/testing/selftests/net/ioam6_parser.c | 101 +++++++++++----------
3 files changed, 81 insertions(+), 66 deletions(-)
base-commit: 166c2c8a6a4dc2e4ceba9e10cfe81c3e469e3210
--
2.34.1
Hi!
When running selftests for our subsystem in our CI we'd like all
tests to pass. Currently some tests use SKIP for cases they
expect to fail, because the kselftest_harness limits the return
codes to pass/fail/skip.
Clean up and support the use of the full range of ksft exit codes
under kselftest_harness.
To avoid conflicts and get the functionality into the networking
tree ASAP I'd like to put the patches on shared branch so that
both linux-kselftest and net-next can pull it in. Shuah, please
LMK if that'd work for you, and if so which -rc should I base
the branch on. Or is merging directly into net-next okay?
Jakub Kicinski (4):
selftests: kselftest_harness: pass step via shared memory
selftests: kselftest_harness: use KSFT_* exit codes
selftests: kselftest_harness: support using xfail
selftests: ip_local_port_range: use XFAIL instead of SKIP
tools/testing/selftests/kselftest_harness.h | 67 ++++++++++++++-----
.../selftests/net/ip_local_port_range.c | 2 +-
2 files changed, 52 insertions(+), 17 deletions(-)
--
2.43.0
Should set the SSADE (Second Stage Access/Dirty bit Enable) bit of the
pasid entry when attaching a device to a nested domain if its parent
has already enabled dirty tracking.
Fixes: 111bf85c68f6 ("iommu/vt-d: Add helper to setup pasid nested translation")
Signed-off-by: Yi Liu <yi.l.liu(a)intel.com>
---
base commit: 547ab8fc4cb04a1a6b34377dd8fad34cd2c8a8e3
---
drivers/iommu/intel/pasid.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
index 3239cefa4c33..9be24bb762cf 100644
--- a/drivers/iommu/intel/pasid.c
+++ b/drivers/iommu/intel/pasid.c
@@ -658,6 +658,8 @@ int intel_pasid_setup_nested(struct intel_iommu *iommu, struct device *dev,
pasid_set_domain_id(pte, did);
pasid_set_address_width(pte, s2_domain->agaw);
pasid_set_page_snoop(pte, !!ecap_smpwc(iommu->ecap));
+ if (s2_domain->dirty_tracking)
+ pasid_set_ssade(pte);
pasid_set_translation_type(pte, PASID_ENTRY_PGTT_NESTED);
pasid_set_present(pte);
spin_unlock(&iommu->lock);
--
2.34.1
Add a common result printing helper and always include test name
in the result line. Previously when SKIP or XPASS would happen
we printed:
ok 1 # SKIP unknown
without the test name. Now we'll print:
ok 1 global.no_pad # SKIP unknown
This appears to be more inline with:
https://docs.kernel.org/dev-tools/ktap.html
and makes parsing results easier.
First 3 patches rearrange kselftest_harness to use exit code
as an enum rather than separate passed/skip/xfail members.
Rest of the series builds a ksft_test_result_code() helper.
This series is on top of:
https://lore.kernel.org/all/20240216002619.1999225-1-kuba@kernel.org/
Jakub Kicinski (7):
selftests: kselftest_harness: generate test name once
selftests: kselftest_harness: save full exit code in metadata
selftests: kselftest_harness: use exit code to store skip and xfail
selftests: kselftest: add ksft_test_result_code(), handling all exit
codes
selftests: kselftest_harness: print test name for SKIP and XFAIL
selftests: kselftest_harness: let ksft_test_result_code() handle line
termination
selftests: kselftest_harness: let PASS / FAIL provide diagnostic
tools/testing/selftests/kselftest.h | 45 ++++++++++
tools/testing/selftests/kselftest_harness.h | 96 ++++++++++-----------
tools/testing/selftests/net/tls.c | 2 +-
3 files changed, 91 insertions(+), 52 deletions(-)
--
2.43.0
Hi!
When running selftests for our subsystem in our CI we'd like all
tests to pass. Currently some tests use SKIP for cases they
expect to fail, because the kselftest_harness limits the return
codes to pass/fail/skip.
Clean up and support the use of the full range of ksft exit codes
under kselftest_harness.
Merge plan is to put it on top of -rc4 and merge into net-next.
That way others should be able to pull the patches without
any networking changes.
v2:
- fix alignment
v1: https://lore.kernel.org/all/20240213154416.422739-1-kuba@kernel.org/
Jakub Kicinski (4):
selftests: kselftest_harness: pass step via shared memory
selftests: kselftest_harness: use KSFT_* exit codes
selftests: kselftest_harness: support using xfail
selftests: ip_local_port_range: use XFAIL instead of SKIP
tools/testing/selftests/kselftest_harness.h | 67 ++++++++++++++-----
.../selftests/net/ip_local_port_range.c | 2 +-
2 files changed, 52 insertions(+), 17 deletions(-)
--
2.43.0
From: Roberto Sassu <roberto.sassu(a)huawei.com>
IMA and EVM are not effectively LSMs, especially due to the fact that in
the past they could not provide a security blob while there is another LSM
active.
That changed in the recent years, the LSM stacking feature now makes it
possible to stack together multiple LSMs, and allows them to provide a
security blob for most kernel objects. While the LSM stacking feature has
some limitations being worked out, it is already suitable to make IMA and
EVM as LSMs.
The main purpose of this patch set is to remove IMA and EVM function calls,
hardcoded in the LSM infrastructure and other places in the kernel, and to
register them as LSM hook implementations, so that those functions are
called by the LSM infrastructure like other regular LSMs.
This patch set introduces two new LSMs 'ima' and 'evm', so that functions
can be registered to their respective LSM, and removes the 'integrity' LSM.
integrity_kernel_module_request() was moved to IMA, since deadlock could
occur only there. integrity_inode_free() was replaced with ima_inode_free()
(EVM does not need to free memory).
In order to make 'ima' and 'evm' independent LSMs, it was necessary to
split integrity metadata used by both IMA and EVM, and to let them manage
their own. The special case of the IMA_NEW_FILE flag, managed by IMA and
used by EVM, was handled by introducing a new flag in EVM, EVM_NEW_FILE,
managed by two additional LSM hooks, evm_post_path_mknod() and
evm_file_release(), equivalent to their counterparts ima_post_path_mknod()
and ima_file_free().
In addition to splitting metadata, it was decided to embed the
evm_iint_inode structure into the inode security blob, since it is small
and because anyway it cannot rely on IMA anymore allocating it (since it
uses a different structure).
On the other hand, to avoid memory pressure concerns, only a pointer to the
ima_iint_cache structure is stored in the inode security blob, and the
structure is allocated on demand, like before.
Another follow-up change was removing the iint parameter from
evm_verifyxattr(), that IMA used to pass integrity metadata to EVM. After
splitting metadata, and aligning EVM_NEW_FILE with IMA_NEW_FILE, this
parameter was not necessary anymore.
The last part was to ensure that the order of IMA and EVM functions is
respected after they become LSMs. Since the order of lsm_info structures in
the .lsm_info.init section depends on the order object files containing
those structures are passed to the linker of the kernel image, and since
IMA is before EVM in the Makefile, that is sufficient to assert that IMA
functions are executed before EVM ones.
The patch set is organized as follows.
Patches 1-9 make IMA and EVM functions suitable to be registered to the LSM
infrastructure, by aligning function parameters.
Patches 10-18 add new LSM hooks in the same places where IMA and EVM
functions are called, if there is no LSM hook already.
Patch 19 moves integrity_kernel_module_request() to IMA, as a prerequisite
for removing the 'integrity' LSM.
Patches 20-22 introduce the new standalone LSMs 'ima' and 'evm', and move
hardcoded calls to IMA, EVM and integrity functions to those LSMs.
Patches 23-24 remove the dependency on the 'integrity' LSM by splitting
integrity metadata, so that the 'ima' and 'evm' LSMs can use their own.
They also duplicate iint_lockdep_annotate() in ima_main.c, since the mutex
field was moved from integrity_iint_cache to ima_iint_cache.
Patch 25 finally removes the 'integrity' LSM, since 'ima' and 'evm' are now
self-contained and independent.
The patch set applies on top of lsm/next, commit 97280fa1ed94 ("Automated
merge of 'dev' into 'next'").
Changelog:
v9:
- Add new Reviewed-by/Acked-by
- Rewrite documentation of ima_kernel_module_request() (suggested by
Stefan)
- Move evm_inode_post_setxattr() registration after evm_inode_setxattr()
(suggested by Stefan)
v8:
- Restore dynamic allocation of IMA integrity metadata, and store only the
pointer in the inode security blob
- Select SECURITY_PATH both in IMA and EVM
- Rename evm_file_free() to evm_file_release()
- Unconditionally register evm_post_path_mknod()
- Introduce the new ima_iint.c file for the management of IMA integrity
metadata
- Introduce ima_inode_set_iint()/ima_inode_get_iint() in ima.h to
respectively store/retrieve the IMA integrity metadata pointer
- Replace ima_iint_inode() with ima_inode_get() and ima_iint_find(), with
same behavior of integrity_inode_get() and integrity_iint_find()
- Initialize the ima_iint_cache in ima_iintcache_init() and call it from
init_ima_lsm()
- Move integrity_kernel_module_request() to IMA in a separate patch
(suggested by Mimi)
- Compile ima_kernel_module_request() if CONFIG_INTEGRITY_ASYMMETRIC_KEYS
is enabled
- Remove ima_inode_alloc_security() and ima_inode_free_security(), since
the IMA integrity metadata is not fully embedded in the inode security
blob
- Fixed the missed initialization of ima_iint_cache in
process_measurement() and __ima_inode_hash()
- Add a sentence in 'evm: Move to LSM infrastructure' to mention about
moving evm_inode_remove_acl(), evm_inode_post_remove_acl() and
evm_inode_post_set_acl() to evm_main.c
- Add a sentence in 'ima: Move IMA-Appraisal to LSM infrastructure' to
mention about moving ima_inode_remove_acl() to ima_appraise.c
v7:
- Use return instead of goto in __vfs_removexattr_locked() (suggested by
Casey)
- Clarify in security/integrity/Makefile that the order of 'ima' and 'evm'
LSMs depends on the order in which IMA and EVM are compiled
- Move integrity_iint_cache flags to ima.h and evm.h in security/ and
duplicate IMA_NEW_FILE to EVM_NEW_FILE
- Rename evm_inode_get_iint() to evm_iint_inode() and ima_inode_get_iint()
to ima_iint_inode(), check if inode->i_security is NULL, and just return
the pointer from the inode security blob
- Restore the non-NULL checks after ima_iint_inode() and evm_iint_inode()
(suggested by Casey)
- Introduce evm_file_free() to clear EVM_NEW_FILE
- Remove comment about LSM_ORDER_LAST not guaranteeing the order of 'ima'
and 'evm' LSMs
- Lock iint->mutex before reading IMA_COLLECTED flag in __ima_inode_hash()
and restored ima_policy_flag check
- Remove patch about the hardcoded ordering of 'ima' and 'evm' LSMs in
security.c
- Add missing ima_inode_free_security() to free iint->ima_hash
- Add the cases for LSM_ID_IMA and LSM_ID_EVM in lsm_list_modules_test.c
- Mention about the change in IMA and EVM post functions for private
inodes
v6:
- See v7
v5:
- Rename security_file_pre_free() to security_file_release() and the LSM
hook file_pre_free_security to file_release (suggested by Paul)
- Move integrity_kernel_module_request() to ima_main.c (renamed to
ima_kernel_module_request())
- Split the integrity_iint_cache structure into ima_iint_cache and
evm_iint_cache, so that IMA and EVM can use disjoint metadata and
reserve space with the LSM infrastructure
- Reserve space for the entire ima_iint_cache and evm_iint_cache
structures, not just the pointer (suggested by Paul)
- Introduce ima_inode_get_iint() and evm_inode_get_iint() to retrieve
respectively the ima_iint_cache and evm_iint_cache structure from the
security blob
- Remove the various non-NULL checks for the ima_iint_cache and
evm_iint_cache structures, since the LSM infrastructure ensure that they
always exist
- Remove the iint parameter from evm_verifyxattr() since IMA and EVM
use disjoint integrity metaddata
- Introduce the evm_post_path_mknod() to set the IMA_NEW_FILE flag
- Register the inode_alloc_security LSM hook in IMA and EVM to
initialize the respective integrity metadata structures
- Remove the 'integrity' LSM completely and instead make 'ima' and 'evm'
proper standalone LSMs
- Add the inode parameter to ima_get_verity_digest(), since the inode
field is not present in ima_iint_cache
- Move iint_lockdep_annotate() to ima_main.c (renamed to
ima_iint_lockdep_annotate())
- Remove ima_get_lsm_id() and evm_get_lsm_id(), since IMA and EVM directly
register the needed LSM hooks
- Enforce 'ima' and 'evm' LSM ordering at LSM infrastructure level
v4:
- Improve short and long description of
security_inode_post_create_tmpfile(), security_inode_post_set_acl(),
security_inode_post_remove_acl() and security_file_post_open()
(suggested by Mimi)
- Improve commit message of 'ima: Move to LSM infrastructure' (suggested
by Mimi)
v3:
- Drop 'ima: Align ima_post_path_mknod() definition with LSM
infrastructure' and 'ima: Align ima_post_create_tmpfile() definition
with LSM infrastructure', define the new LSM hooks with the same
IMA parameters instead (suggested by Mimi)
- Do IS_PRIVATE() check in security_path_post_mknod() and
security_inode_post_create_tmpfile() on the new inode rather than the
parent directory (in the post method it is available)
- Don't export ima_file_check() (suggested by Stefan)
- Remove redundant check of file mode in ima_post_path_mknod() (suggested
by Mimi)
- Mention that ima_post_path_mknod() is now conditionally invoked when
CONFIG_SECURITY_PATH=y (suggested by Mimi)
- Mention when a LSM hook will be introduced in the IMA/EVM alignment
patches (suggested by Mimi)
- Simplify the commit messages when introducing a new LSM hook
- Still keep the 'extern' in the function declaration, until the
declaration is removed (suggested by Mimi)
- Improve documentation of security_file_pre_free()
- Register 'ima' and 'evm' as standalone LSMs (suggested by Paul)
- Initialize the 'ima' and 'evm' LSMs from 'integrity', to keep the
original ordering of IMA and EVM functions as when they were hardcoded
- Return the IMA and EVM LSM IDs to 'integrity' for registration of the
integrity-specific hooks
- Reserve an xattr slot from the 'evm' LSM instead of 'integrity'
- Pass the LSM ID to init_ima_appraise_lsm()
v2:
- Add description for newly introduced LSM hooks (suggested by Casey)
- Clarify in the description of security_file_pre_free() that actions can
be performed while the file is still open
v1:
- Drop 'evm: Complete description of evm_inode_setattr()', 'fs: Fix
description of vfs_tmpfile()' and 'security: Introduce LSM_ORDER_LAST',
they were sent separately (suggested by Christian Brauner)
- Replace dentry with file descriptor parameter for
security_inode_post_create_tmpfile()
- Introduce mode_stripped and pass it as mode argument to
security_path_mknod() and security_path_post_mknod()
- Use goto in do_mknodat() and __vfs_removexattr_locked() (suggested by
Mimi)
- Replace __lsm_ro_after_init with __ro_after_init
- Modify short description of security_inode_post_create_tmpfile() and
security_inode_post_set_acl() (suggested by Stefan)
- Move security_inode_post_setattr() just after security_inode_setattr()
(suggested by Mimi)
- Modify short description of security_key_post_create_or_update()
(suggested by Mimi)
- Add back exported functions ima_file_check() and
evm_inode_init_security() respectively to ima.h and evm.h (reported by
kernel robot)
- Remove extern from prototype declarations and fix style issues
- Remove unnecessary include of linux/lsm_hooks.h in ima_main.c and
ima_appraise.c
Roberto Sassu (25):
ima: Align ima_inode_post_setattr() definition with LSM infrastructure
ima: Align ima_file_mprotect() definition with LSM infrastructure
ima: Align ima_inode_setxattr() definition with LSM infrastructure
ima: Align ima_inode_removexattr() definition with LSM infrastructure
ima: Align ima_post_read_file() definition with LSM infrastructure
evm: Align evm_inode_post_setattr() definition with LSM infrastructure
evm: Align evm_inode_setxattr() definition with LSM infrastructure
evm: Align evm_inode_post_setxattr() definition with LSM
infrastructure
security: Align inode_setattr hook definition with EVM
security: Introduce inode_post_setattr hook
security: Introduce inode_post_removexattr hook
security: Introduce file_post_open hook
security: Introduce file_release hook
security: Introduce path_post_mknod hook
security: Introduce inode_post_create_tmpfile hook
security: Introduce inode_post_set_acl hook
security: Introduce inode_post_remove_acl hook
security: Introduce key_post_create_or_update hook
integrity: Move integrity_kernel_module_request() to IMA
ima: Move to LSM infrastructure
ima: Move IMA-Appraisal to LSM infrastructure
evm: Move to LSM infrastructure
evm: Make it independent from 'integrity' LSM
ima: Make it independent from 'integrity' LSM
integrity: Remove LSM
fs/attr.c | 5 +-
fs/file_table.c | 3 +-
fs/namei.c | 12 +-
fs/nfsd/vfs.c | 3 +-
fs/open.c | 1 -
fs/posix_acl.c | 5 +-
fs/xattr.c | 9 +-
include/linux/evm.h | 117 +-------
include/linux/ima.h | 142 ----------
include/linux/integrity.h | 27 --
include/linux/lsm_hook_defs.h | 20 +-
include/linux/security.h | 59 ++++
include/uapi/linux/lsm.h | 2 +
security/integrity/Makefile | 1 +
security/integrity/digsig_asymmetric.c | 23 --
security/integrity/evm/Kconfig | 1 +
security/integrity/evm/evm.h | 19 ++
security/integrity/evm/evm_crypto.c | 4 +-
security/integrity/evm/evm_main.c | 195 ++++++++++---
security/integrity/iint.c | 197 +------------
security/integrity/ima/Kconfig | 1 +
security/integrity/ima/Makefile | 2 +-
security/integrity/ima/ima.h | 148 ++++++++--
security/integrity/ima/ima_api.c | 23 +-
security/integrity/ima/ima_appraise.c | 66 +++--
security/integrity/ima/ima_iint.c | 142 ++++++++++
security/integrity/ima/ima_init.c | 2 +-
security/integrity/ima/ima_main.c | 148 +++++++---
security/integrity/ima/ima_policy.c | 2 +-
security/integrity/integrity.h | 80 +-----
security/keys/key.c | 10 +-
security/security.c | 263 +++++++++++-------
security/selinux/hooks.c | 3 +-
security/smack/smack_lsm.c | 4 +-
.../selftests/lsm/lsm_list_modules_test.c | 6 +
35 files changed, 906 insertions(+), 839 deletions(-)
create mode 100644 security/integrity/ima/ima_iint.c
--
2.34.1
From: Zi Yan <ziy(a)nvidia.com>
Hi all,
File folio supports any order and multi-size THP is upstreamed[1], so both
file and anonymous folios can be >0 order. Currently, split_huge_page()
only splits a huge page to order-0 pages, but splitting to orders higher than
0 is going to better utilize large folios. In addition, Large Block
Sizes in XFS support would benefit from it[2]. This patchset adds support for
splitting a large folio to any lower order folios and uses it during file
folio truncate operations.
For Patch 6, Hugh did not like my approach to minimize the number of
folios for truncate[3]. I would like to get more feedback, especially
from FS people, on it to decide whether to keep it or not.
The patchset is on top of mm-everything-2024-02-13-01-26.
Changelog
===
Since v3
---
1. Excluded shmem folios and pagecache folios without FS support from
splitting to any order (per Hugh Dickins).
2. Allowed splitting anonymous large folio to any lower order since
multi-size THP is upstreamed.
3. Adapted selftests code to new framework.
Since v2
---
1. Fixed an issue in __split_page_owner() introduced during my rebase
Since v1
---
1. Changed split_page_memcg() and split_page_owner() parameter to use order
2. Used folio_test_pmd_mappable() in place of the equivalent code
Details
===
* Patch 1 changes split_page_memcg() to use order instead of nr_pages
* Patch 2 changes split_page_owner() to use order instead of nr_pages
* Patch 3 and 4 add new_order parameter split_page_memcg() and
split_page_owner() and prepare for upcoming changes.
* Patch 5 adds split_huge_page_to_list_to_order() to split a huge page
to any lower order. The original split_huge_page_to_list() calls
split_huge_page_to_list_to_order() with new_order = 0.
* Patch 6 uses split_huge_page_to_list_to_order() in large pagecache folio
truncation instead of split the large folio all the way down to order-0.
* Patch 7 adds a test API to debugfs and test cases in
split_huge_page_test selftests.
Comments and/or suggestions are welcome.
[1] https://lore.kernel.org/all/20231207161211.2374093-1-ryan.roberts@arm.com/
[2] https://lore.kernel.org/linux-mm/qzbcjn4gcyxla4gwuj6smlnwknz2wvo5wrjctin6ee…
[3] https://lore.kernel.org/linux-mm/9dd96da-efa2-5123-20d4-4992136ef3ad@google…
Zi Yan (7):
mm/memcg: use order instead of nr in split_page_memcg()
mm/page_owner: use order instead of nr in split_page_owner()
mm: memcg: make memcg huge page split support any order split.
mm: page_owner: add support for splitting to any order in split
page_owner.
mm: thp: split huge page to any lower order pages (except order-1).
mm: truncate: split huge page cache page to a non-zero order if
possible.
mm: huge_memory: enable debugfs to split huge pages to any order.
include/linux/huge_mm.h | 21 +-
include/linux/memcontrol.h | 4 +-
include/linux/page_owner.h | 10 +-
mm/huge_memory.c | 149 +++++++++---
mm/memcontrol.c | 10 +-
mm/page_alloc.c | 8 +-
mm/page_owner.c | 8 +-
mm/truncate.c | 21 +-
.../selftests/mm/split_huge_page_test.c | 223 +++++++++++++++++-
9 files changed, 382 insertions(+), 72 deletions(-)
--
2.43.0
Arch maintainers, please ack/review patches.
This is a resend of a series from Frank last year[1]. I worked in Rob's
review comments to unconditionally call unflatten_device_tree() and
fixup/audit calls to of_have_populated_dt() so that behavior doesn't
change.
I need this series so I can add DT based tests in the clk framework.
Either I can merge it through the clk tree once everyone is happy, or
Rob can merge it through the DT tree and provide some branch so I can
base clk patches on it.
Changes from Frank's series[1]:
* Add a DTB loaded kunit test
* Make of_have_populated_dt() return false if the DTB isn't from the
bootloader
* Architecture calls made unconditional so that a root node is always
made
Changes from v2 (https://lore.kernel.org/r/20240130004508.1700335-1-sboyd@kernel.org):
* Reorder patches to have OF changes largely first
* No longer modify initial_boot_params if ACPI=y
* Put arm64 patch back to v1
Changes from v1 (https://lore.kernel.org/r/20240112200750.4062441-1-sboyd@kernel.org):
* x86 patch included
* arm64 knocks out initial dtb if acpi is in use
* keep Kconfig hidden but def_bool enabled otherwise
Frank Rowand (2):
of: Create of_root if no dtb provided by firmware
of: unittest: treat missing of_root as error instead of fixing up
Stephen Boyd (5):
of: Always unflatten in unflatten_and_copy_device_tree()
um: Unconditionally call unflatten_device_tree()
x86/of: Unconditionally call unflatten_and_copy_device_tree()
arm64: Unconditionally call unflatten_device_tree()
of: Add KUnit test to confirm DTB is loaded
arch/arm64/kernel/setup.c | 3 +-
arch/um/kernel/dtb.c | 14 ++++----
arch/x86/kernel/devicetree.c | 24 +++++++-------
drivers/of/.kunitconfig | 3 ++
drivers/of/Kconfig | 11 ++++++-
drivers/of/Makefile | 4 ++-
drivers/of/empty_root.dts | 6 ++++
drivers/of/fdt.c | 64 +++++++++++++++++++++++++++---------
drivers/of/of_test.c | 48 +++++++++++++++++++++++++++
drivers/of/platform.c | 3 --
drivers/of/unittest.c | 16 +++------
include/linux/of.h | 25 ++++++++------
12 files changed, 158 insertions(+), 63 deletions(-)
create mode 100644 drivers/of/.kunitconfig
create mode 100644 drivers/of/empty_root.dts
create mode 100644 drivers/of/of_test.c
Cc: Anton Ivanov <anton.ivanov(a)cambridgegreys.com>
Cc: Brendan Higgins <brendan.higgins(a)linux.dev>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: David Gow <davidgow(a)google.com>
Cc: Frank Rowand <frowand.list(a)gmail.com>
Cc: Johannes Berg <johannes(a)sipsolutions.net>
Cc: Richard Weinberger <richard(a)nod.at>
Cc: Rob Herring <robh+dt(a)kernel.org>
Cc: Will Deacon <will(a)kernel.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: <x86(a)kernel.org>
Cc: "H. Peter Anvin" <hpa(a)zytor.com>
Cc: Saurabh Sengar <ssengar(a)linux.microsoft.com>
[1] https://lore.kernel.org/r/20230317053415.2254616-1-frowand.list@gmail.com
base-commit: 0dd3ee31125508cd67f7e7172247f05b7fd1753a
--
https://git.kernel.org/pub/scm/linux/kernel/git/clk/linux.git/https://git.kernel.org/pub/scm/linux/kernel/git/sboyd/spmi.git
Add specification for test metadata to the KTAP v2 spec.
KTAP v1 only specifies the output format of very basic test information:
test result and test name. Any additional test information either gets
added to general diagnostic data or is not included in the output at all.
The purpose of KTAP metadata is to create a framework to include and
easily identify additional important test information in KTAP.
KTAP metadata could include any test information that is pertinent for
user interaction before or after the running of the test. For example,
the test file path or the test speed.
Since this includes a large variety of information, this specification
will recognize notable types of KTAP metadata to ensure consistent format
across test frameworks. See the full list of types in the specification.
Example of KTAP Metadata:
KTAP version 2
#:ktap_test: main
#:ktap_arch: uml
1..1
KTAP version 2
#:ktap_test: suite_1
#:ktap_subsystem: example
#:ktap_test_file: lib/test.c
1..2
ok 1 test_1
#:ktap_test: test_2
#:ktap_speed: very_slow
# test_2 has begun
#:custom_is_flaky: true
ok 2 test_2
# suite_1 has passed
ok 1 suite_1
The changes to the KTAP specification outline the format, location, and
different types of metadata.
Reviewed-by: David Gow <davidgow(a)google.com>
Signed-off-by: Rae Moar <rmoar(a)google.com>
---
Changes since v2:
- Change format of metadata line from "# prefix_type: value" to
"#:prefix_type: value".
- Add examples of edge cases, such as diagnostic information interrupting
metadata lines and lines in the wrong location, to the example in the
specification.
- Add current list of accepted prefixes.
- Add notes on indentation and generated files.
- Fix a few typos.
Documentation/dev-tools/ktap.rst | 182 ++++++++++++++++++++++++++++++-
1 file changed, 178 insertions(+), 4 deletions(-)
diff --git a/Documentation/dev-tools/ktap.rst b/Documentation/dev-tools/ktap.rst
index ff77f4aaa6ef..42a7e50b9270 100644
--- a/Documentation/dev-tools/ktap.rst
+++ b/Documentation/dev-tools/ktap.rst
@@ -17,19 +17,22 @@ KTAP test results describe a series of tests (which may be nested: i.e., test
can have subtests), each of which can contain both diagnostic data -- e.g., log
lines -- and a final result. The test structure and results are
machine-readable, whereas the diagnostic data is unstructured and is there to
-aid human debugging.
+aid human debugging. Since version 2, tests can also contain metadata which
+consists of important supplemental test information and can be
+machine-readable.
+
+KTAP output is built from five different types of lines:
-KTAP output is built from four different types of lines:
- Version lines
- Plan lines
- Test case result lines
- Diagnostic lines
+- Metadata lines
In general, valid KTAP output should also form valid TAP output, but some
information, in particular nested test results, may be lost. Also note that
there is a stagnant draft specification for TAP14, KTAP diverges from this in
-a couple of places (notably the "Subtest" header), which are described where
-relevant later in this document.
+a couple of places, which are described where relevant later in this document.
Version lines
-------------
@@ -166,6 +169,171 @@ even if they do not start with a "#": this is to capture any other useful
kernel output which may help debug the test. It is nevertheless recommended
that tests always prefix any diagnostic output they have with a "#" character.
+KTAP metadata lines
+-------------------
+
+KTAP metadata lines are used to include and easily identify important
+supplemental test information in KTAP. These lines may appear similar to
+diagnostic lines. The format of metadata lines is below:
+
+.. code-block:: none
+
+ #:<prefix>_<metadata type>: <metadata value>
+
+The <prefix> indicates where to find the specification for the type of
+metadata, such as the name of a test framework or "ktap" to indicate this
+specification. The list of currently approved prefixes and where to find the
+documentation of the metadata types is below. Note any metadata type that does
+not use a prefix from the list below must use the prefix "custom".
+
+Current List of Approved Prefixes:
+
+- ``ktap``: See Types of KTAP Metadata below for the list of metadata types.
+
+The format of <metadata type> and <value> varies based on the type. See the
+individual specification. For "custom" types the <metadata type> can be any
+string excluding ":", spaces, or newline characters and the <value> can be any
+string.
+
+**Location:**
+
+The first KTAP metadata line for a test must be "#:ktap_test: <test name>",
+which acts as a header to associate metadata with the correct test. Metadata
+for the main KTAP level uses the test name "main".
+
+For test cases, the location of the metadata is between the prior test result
+line and the current test result line. For test suites, the location of the
+metadata is between the suite's version line and test plan line. For the main
+level, the location of the metadata is between the main version line and main
+test plan line. See the example below.
+
+Note that a test case's metadata is inline with the test's result line. Whereas
+a suite's metadata is inline with the suite's version line and thus will be
+more indented than the suite's result line. Additionally, metadata for the main
+level is inline with the main version line.
+
+KTAP metadata for a test does not need to be contiguous. For example, a kernel
+warning or other diagnostic output could interrupt metadata lines. However, it
+is recommended to keep a test's metadata lines in the correct location and
+together when possible, as this improves readability.
+
+**Example of KTAP metadata:**
+
+::
+
+ KTAP version 2
+ #:ktap_test: main
+ #:ktap_arch: uml
+ 1..1
+ KTAP version 2
+ #:ktap_test: suite_1
+ #:ktap_subsystem: example
+ #:ktap_test_file: lib/test.c
+ 1..2
+ # WARNING: test_1 skipped
+ ok 1 test_1 # SKIP
+ #:ktap_test: test_2
+ #:ktap_speed: very_slow
+ # test_2 has begun
+ #:custom_is_flaky: true
+ ok 2 test_2
+ #:ktap_duration: 1.231s
+ # suite_1 passed
+ ok 1 suite_1
+
+In this example, the tests are running on UML. The test suite "suite_1" is part
+of the subsystem "example" and belongs to the file "lib/test.c". It has
+two subtests, "test_1" and "test_2". The subtest "test_2" has a speed of
+"very_slow" and has been marked with a custom KTAP metadata type called
+"custom_is_flaky" with the value of "true".
+
+Additionally "test_2" has a duration of 1.231s. The ktap_duration line is not
+contiguous with the other metadata and is not in the correct location, which
+is not recommended. However, it is allowed because it is below the
+"#:ktap_test: test_2" line.
+
+**Types of KTAP Metadata:**
+
+This is the current list of KTAP metadata types recognized in this
+specification. Note that all of these metadata types are optional (except for
+ktap_test as the KTAP metadata header).
+
+- ``ktap_test``: Name of test (used as header of KTAP metadata). This should
+ match the test name printed in the test result line: "ok 1 [test_name]".
+
+- ``ktap_module``: Name of the module containing the test
+
+- ``ktap_subsystem``: Name of the subsystem being tested
+
+- ``ktap_start_time``: Time tests started in ISO8601 format
+
+ - Example: "#:ktap_start_time: 2024-01-09T13:09:01.990000+00:00"
+
+- ``ktap_duration``: Time taken (in seconds) to execute the test
+
+ - Example: "#:ktap_duration: 10.154s"
+
+- ``ktap_speed``: Category of how fast test runs: "normal", "slow", or
+ "very_slow"
+
+- ``ktap_test_file``: Path to source file containing the test. This metadata
+ line can be repeated if the test is spread across multiple files.
+
+ - Example: "#:ktap_test_file: lib/test.c"
+
+- ``ktap_generated_file``: Description of and path to file generated during
+ test execution. This could be a core dump, generated filesystem image, some
+ form of visual output (for graphics drivers), etc. This metadata line can be
+ repeated to attach multiple files to the test. Note use ktap_log_file or
+ ktap_error_file instead of this type if more applicable.
+
+ - Example: "#:ktap_generated_file: Core dump: /var/lib/systemd/coredump/hello.core"
+
+- ``ktap_log_file``: Path to file containing kernel log test output
+
+ - Example: "#:ktap_log_file: /sys/kernel/debugfs/kunit/example/results"
+
+- ``ktap_error_file``: Path to file containing context for test failure or
+ error. This could include the difference between optimal test output and
+ actual test output.
+
+ - Example: "#:ktap_error_file: fs/results/example.out.bad"
+
+- ``ktap_results_url``: Link to webpage describing this test run and its
+ results
+
+ - Example: "#:ktap_results_url: https://kcidb.kernelci.org/hello"
+
+- ``ktap_arch``: Architecture used during test run
+
+ - Example: "#:ktap_arch: x86_64"
+
+- ``ktap_compiler``: Compiler used during test run
+
+ - Example: "#:ktap_compiler: gcc (GCC) 10.1.1 20200507 (Red Hat 10.1.1-1)"
+
+- ``ktap_respository_url``: Link to git repository of the checked out code.
+
+ - Example: "#:ktap_respository_url: https://github.com/torvalds/linux.git"
+
+- ``ktap_git_branch``: Name of git branch of checked out code
+
+ - Example: "#:ktap_git_branch: kselftest/kunit"
+
+- ``ktap_kernel_version``: Version of Linux Kernel being used during test run
+
+ - Example: "#:ktap_kernel_version: 6.7-rc1"
+
+- ``ktap_commit_hash``: The full git commit hash of the checked out base code.
+
+ - Example: "#:ktap_commit_hash: 064725faf8ec2e6e36d51e22d3b86d2707f0f47f"
+
+**Other Metadata Types:**
+
+There can also be KTAP metadata that is not included in the recognized list
+above. This metadata must be prefixed with the test framework, ie. "kselftest",
+or with the prefix "custom". For example, "# custom_batch: 20".
+
Unknown lines
-------------
@@ -206,6 +374,7 @@ An example of a test with two nested subtests:
KTAP version 2
1..1
KTAP version 2
+ #:ktap_test: example
1..2
ok 1 test_1
not ok 2 test_2
@@ -219,6 +388,7 @@ An example format with multiple levels of nested testing:
KTAP version 2
1..2
KTAP version 2
+ #:ktap_test: example_test_1
1..2
KTAP version 2
1..2
@@ -254,6 +424,7 @@ Example KTAP output
KTAP version 2
1..1
KTAP version 2
+ #:ktap_test: main_test
1..3
KTAP version 2
1..1
@@ -261,11 +432,14 @@ Example KTAP output
ok 1 test_1
ok 1 example_test_1
KTAP version 2
+ #:ktap_test: example_test_2
+ #:ktap_speed: slow
1..2
ok 1 test_1 # SKIP test_1 skipped
ok 2 test_2
ok 2 example_test_2
KTAP version 2
+ #:ktap_test: example_test_3
1..3
ok 1 test_1
# test_2: FAIL
base-commit: 906f02e42adfbd5ae70d328ee71656ecb602aaf5
--
2.43.0.687.g38aa6559b0-goog
From: Roberto Sassu <roberto.sassu(a)huawei.com>
IMA and EVM are not effectively LSMs, especially due to the fact that in
the past they could not provide a security blob while there is another LSM
active.
That changed in the recent years, the LSM stacking feature now makes it
possible to stack together multiple LSMs, and allows them to provide a
security blob for most kernel objects. While the LSM stacking feature has
some limitations being worked out, it is already suitable to make IMA and
EVM as LSMs.
The main purpose of this patch set is to remove IMA and EVM function calls,
hardcoded in the LSM infrastructure and other places in the kernel, and to
register them as LSM hook implementations, so that those functions are
called by the LSM infrastructure like other regular LSMs.
This patch set introduces two new LSMs 'ima' and 'evm', so that functions
can be registered to their respective LSM, and removes the 'integrity' LSM.
integrity_kernel_module_request() was moved to IMA, since deadlock could
occur only there. integrity_inode_free() was replaced with ima_inode_free()
(EVM does not need to free memory).
In order to make 'ima' and 'evm' independent LSMs, it was necessary to
split integrity metadata used by both IMA and EVM, and to let them manage
their own. The special case of the IMA_NEW_FILE flag, managed by IMA and
used by EVM, was handled by introducing a new flag in EVM, EVM_NEW_FILE,
managed by two additional LSM hooks, evm_post_path_mknod() and
evm_file_release(), equivalent to their counterparts ima_post_path_mknod()
and ima_file_free().
In addition to splitting metadata, it was decided to embed the
evm_iint_inode structure into the inode security blob, since it is small
and because anyway it cannot rely on IMA anymore allocating it (since it
uses a different structure).
On the other hand, to avoid memory pressure concerns, only a pointer to the
ima_iint_cache structure is stored in the inode security blob, and the
structure is allocated on demand, like before.
Another follow-up change was removing the iint parameter from
evm_verifyxattr(), that IMA used to pass integrity metadata to EVM. After
splitting metadata, and aligning EVM_NEW_FILE with IMA_NEW_FILE, this
parameter was not necessary anymore.
The last part was to ensure that the order of IMA and EVM functions is
respected after they become LSMs. Since the order of lsm_info structures in
the .lsm_info.init section depends on the order object files containing
those structures are passed to the linker of the kernel image, and since
IMA is before EVM in the Makefile, that is sufficient to assert that IMA
functions are executed before EVM ones.
The patch set is organized as follows.
Patches 1-9 make IMA and EVM functions suitable to be registered to the LSM
infrastructure, by aligning function parameters.
Patches 10-18 add new LSM hooks in the same places where IMA and EVM
functions are called, if there is no LSM hook already.
Patch 19 moves integrity_kernel_module_request() to IMA, as a prerequisite
for removing the 'integrity' LSM.
Patches 20-22 introduce the new standalone LSMs 'ima' and 'evm', and move
hardcoded calls to IMA, EVM and integrity functions to those LSMs.
Patches 23-24 remove the dependency on the 'integrity' LSM by splitting
integrity metadata, so that the 'ima' and 'evm' LSMs can use their own.
They also duplicate iint_lockdep_annotate() in ima_main.c, since the mutex
field was moved from integrity_iint_cache to ima_iint_cache.
Patch 25 finally removes the 'integrity' LSM, since 'ima' and 'evm' are now
self-contained and independent.
The patch set applies on top of lsm/dev, commit f1bb47a31dff ("lsm: new
security_file_ioctl_compat() hook"). The linux-integrity/next-integrity at
commit c00f94b3a5be ("overlay: disable EVM") was merged.
Changelog:
v8:
- Restore dynamic allocation of IMA integrity metadata, and store only the
pointer in the inode security blob
- Select SECURITY_PATH both in IMA and EVM
- Rename evm_file_free() to evm_file_release()
- Unconditionally register evm_post_path_mknod()
- Introduce the new ima_iint.c file for the management of IMA integrity
metadata
- Introduce ima_inode_set_iint()/ima_inode_get_iint() in ima.h to
respectively store/retrieve the IMA integrity metadata pointer
- Replace ima_iint_inode() with ima_inode_get() and ima_iint_find(), with
same behavior of integrity_inode_get() and integrity_iint_find()
- Initialize the ima_iint_cache in ima_iintcache_init() and call it from
init_ima_lsm()
- Move integrity_kernel_module_request() to IMA in a separate patch
(suggested by Mimi)
- Compile ima_kernel_module_request() if CONFIG_INTEGRITY_ASYMMETRIC_KEYS
is enabled
- Remove ima_inode_alloc_security() and ima_inode_free_security(), since
the IMA integrity metadata is not fully embedded in the inode security
blob
- Fixed the missed initialization of ima_iint_cache in
process_measurement() and __ima_inode_hash()
- Add a sentence in 'evm: Move to LSM infrastructure' to mention about
moving evm_inode_remove_acl(), evm_inode_post_remove_acl() and
evm_inode_post_set_acl() to evm_main.c
- Add a sentence in 'ima: Move IMA-Appraisal to LSM infrastructure' to
mention about moving ima_inode_remove_acl() to ima_appraise.c
v7:
- Use return instead of goto in __vfs_removexattr_locked() (suggested by
Casey)
- Clarify in security/integrity/Makefile that the order of 'ima' and 'evm'
LSMs depends on the order in which IMA and EVM are compiled
- Move integrity_iint_cache flags to ima.h and evm.h in security/ and
duplicate IMA_NEW_FILE to EVM_NEW_FILE
- Rename evm_inode_get_iint() to evm_iint_inode() and ima_inode_get_iint()
to ima_iint_inode(), check if inode->i_security is NULL, and just return
the pointer from the inode security blob
- Restore the non-NULL checks after ima_iint_inode() and evm_iint_inode()
(suggested by Casey)
- Introduce evm_file_free() to clear EVM_NEW_FILE
- Remove comment about LSM_ORDER_LAST not guaranteeing the order of 'ima'
and 'evm' LSMs
- Lock iint->mutex before reading IMA_COLLECTED flag in __ima_inode_hash()
and restored ima_policy_flag check
- Remove patch about the hardcoded ordering of 'ima' and 'evm' LSMs in
security.c
- Add missing ima_inode_free_security() to free iint->ima_hash
- Add the cases for LSM_ID_IMA and LSM_ID_EVM in lsm_list_modules_test.c
- Mention about the change in IMA and EVM post functions for private
inodes
v6:
- See v7
v5:
- Rename security_file_pre_free() to security_file_release() and the LSM
hook file_pre_free_security to file_release (suggested by Paul)
- Move integrity_kernel_module_request() to ima_main.c (renamed to
ima_kernel_module_request())
- Split the integrity_iint_cache structure into ima_iint_cache and
evm_iint_cache, so that IMA and EVM can use disjoint metadata and
reserve space with the LSM infrastructure
- Reserve space for the entire ima_iint_cache and evm_iint_cache
structures, not just the pointer (suggested by Paul)
- Introduce ima_inode_get_iint() and evm_inode_get_iint() to retrieve
respectively the ima_iint_cache and evm_iint_cache structure from the
security blob
- Remove the various non-NULL checks for the ima_iint_cache and
evm_iint_cache structures, since the LSM infrastructure ensure that they
always exist
- Remove the iint parameter from evm_verifyxattr() since IMA and EVM
use disjoint integrity metaddata
- Introduce the evm_post_path_mknod() to set the IMA_NEW_FILE flag
- Register the inode_alloc_security LSM hook in IMA and EVM to
initialize the respective integrity metadata structures
- Remove the 'integrity' LSM completely and instead make 'ima' and 'evm'
proper standalone LSMs
- Add the inode parameter to ima_get_verity_digest(), since the inode
field is not present in ima_iint_cache
- Move iint_lockdep_annotate() to ima_main.c (renamed to
ima_iint_lockdep_annotate())
- Remove ima_get_lsm_id() and evm_get_lsm_id(), since IMA and EVM directly
register the needed LSM hooks
- Enforce 'ima' and 'evm' LSM ordering at LSM infrastructure level
v4:
- Improve short and long description of
security_inode_post_create_tmpfile(), security_inode_post_set_acl(),
security_inode_post_remove_acl() and security_file_post_open()
(suggested by Mimi)
- Improve commit message of 'ima: Move to LSM infrastructure' (suggested
by Mimi)
v3:
- Drop 'ima: Align ima_post_path_mknod() definition with LSM
infrastructure' and 'ima: Align ima_post_create_tmpfile() definition
with LSM infrastructure', define the new LSM hooks with the same
IMA parameters instead (suggested by Mimi)
- Do IS_PRIVATE() check in security_path_post_mknod() and
security_inode_post_create_tmpfile() on the new inode rather than the
parent directory (in the post method it is available)
- Don't export ima_file_check() (suggested by Stefan)
- Remove redundant check of file mode in ima_post_path_mknod() (suggested
by Mimi)
- Mention that ima_post_path_mknod() is now conditionally invoked when
CONFIG_SECURITY_PATH=y (suggested by Mimi)
- Mention when a LSM hook will be introduced in the IMA/EVM alignment
patches (suggested by Mimi)
- Simplify the commit messages when introducing a new LSM hook
- Still keep the 'extern' in the function declaration, until the
declaration is removed (suggested by Mimi)
- Improve documentation of security_file_pre_free()
- Register 'ima' and 'evm' as standalone LSMs (suggested by Paul)
- Initialize the 'ima' and 'evm' LSMs from 'integrity', to keep the
original ordering of IMA and EVM functions as when they were hardcoded
- Return the IMA and EVM LSM IDs to 'integrity' for registration of the
integrity-specific hooks
- Reserve an xattr slot from the 'evm' LSM instead of 'integrity'
- Pass the LSM ID to init_ima_appraise_lsm()
v2:
- Add description for newly introduced LSM hooks (suggested by Casey)
- Clarify in the description of security_file_pre_free() that actions can
be performed while the file is still open
v1:
- Drop 'evm: Complete description of evm_inode_setattr()', 'fs: Fix
description of vfs_tmpfile()' and 'security: Introduce LSM_ORDER_LAST',
they were sent separately (suggested by Christian Brauner)
- Replace dentry with file descriptor parameter for
security_inode_post_create_tmpfile()
- Introduce mode_stripped and pass it as mode argument to
security_path_mknod() and security_path_post_mknod()
- Use goto in do_mknodat() and __vfs_removexattr_locked() (suggested by
Mimi)
- Replace __lsm_ro_after_init with __ro_after_init
- Modify short description of security_inode_post_create_tmpfile() and
security_inode_post_set_acl() (suggested by Stefan)
- Move security_inode_post_setattr() just after security_inode_setattr()
(suggested by Mimi)
- Modify short description of security_key_post_create_or_update()
(suggested by Mimi)
- Add back exported functions ima_file_check() and
evm_inode_init_security() respectively to ima.h and evm.h (reported by
kernel robot)
- Remove extern from prototype declarations and fix style issues
- Remove unnecessary include of linux/lsm_hooks.h in ima_main.c and
ima_appraise.c
Roberto Sassu (25):
ima: Align ima_inode_post_setattr() definition with LSM infrastructure
ima: Align ima_file_mprotect() definition with LSM infrastructure
ima: Align ima_inode_setxattr() definition with LSM infrastructure
ima: Align ima_inode_removexattr() definition with LSM infrastructure
ima: Align ima_post_read_file() definition with LSM infrastructure
evm: Align evm_inode_post_setattr() definition with LSM infrastructure
evm: Align evm_inode_setxattr() definition with LSM infrastructure
evm: Align evm_inode_post_setxattr() definition with LSM
infrastructure
security: Align inode_setattr hook definition with EVM
security: Introduce inode_post_setattr hook
security: Introduce inode_post_removexattr hook
security: Introduce file_post_open hook
security: Introduce file_release hook
security: Introduce path_post_mknod hook
security: Introduce inode_post_create_tmpfile hook
security: Introduce inode_post_set_acl hook
security: Introduce inode_post_remove_acl hook
security: Introduce key_post_create_or_update hook
integrity: Move integrity_kernel_module_request() to IMA
ima: Move to LSM infrastructure
ima: Move IMA-Appraisal to LSM infrastructure
evm: Move to LSM infrastructure
evm: Make it independent from 'integrity' LSM
ima: Make it independent from 'integrity' LSM
integrity: Remove LSM
fs/attr.c | 5 +-
fs/file_table.c | 3 +-
fs/namei.c | 12 +-
fs/nfsd/vfs.c | 3 +-
fs/open.c | 1 -
fs/posix_acl.c | 5 +-
fs/xattr.c | 9 +-
include/linux/evm.h | 117 +-------
include/linux/ima.h | 142 ----------
include/linux/integrity.h | 27 --
include/linux/lsm_hook_defs.h | 20 +-
include/linux/security.h | 59 ++++
include/uapi/linux/lsm.h | 2 +
security/integrity/Makefile | 1 +
security/integrity/digsig_asymmetric.c | 23 --
security/integrity/evm/Kconfig | 1 +
security/integrity/evm/evm.h | 19 ++
security/integrity/evm/evm_crypto.c | 4 +-
security/integrity/evm/evm_main.c | 195 ++++++++++---
security/integrity/iint.c | 197 +------------
security/integrity/ima/Kconfig | 1 +
security/integrity/ima/Makefile | 2 +-
security/integrity/ima/ima.h | 148 ++++++++--
security/integrity/ima/ima_api.c | 23 +-
security/integrity/ima/ima_appraise.c | 66 +++--
security/integrity/ima/ima_iint.c | 142 ++++++++++
security/integrity/ima/ima_init.c | 2 +-
security/integrity/ima/ima_main.c | 144 +++++++---
security/integrity/ima/ima_policy.c | 2 +-
security/integrity/integrity.h | 80 +-----
security/keys/key.c | 10 +-
security/security.c | 263 +++++++++++-------
security/selinux/hooks.c | 3 +-
security/smack/smack_lsm.c | 4 +-
.../selftests/lsm/lsm_list_modules_test.c | 6 +
35 files changed, 902 insertions(+), 839 deletions(-)
create mode 100644 security/integrity/ima/ima_iint.c
--
2.34.1
Calling get_system_loc_code before checking devfd and errno - fails the test
when the device is not available, expected a SKIP.
Change the order of 'SKIP_IF_MSG' correctly SKIP when the /dev/papr-vpd device
is not available.
with out patch: Test FAILED on line 271
with patch: [SKIP] Test skipped on line 266: /dev/papr-vpd not present
Signed-off-by: R Nageswara Sastry <rnsastry(a)linux.ibm.com>
---
tools/testing/selftests/powerpc/papr_vpd/papr_vpd.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/powerpc/papr_vpd/papr_vpd.c b/tools/testing/selftests/powerpc/papr_vpd/papr_vpd.c
index 98cbb9109ee6..505294da1b9f 100644
--- a/tools/testing/selftests/powerpc/papr_vpd/papr_vpd.c
+++ b/tools/testing/selftests/powerpc/papr_vpd/papr_vpd.c
@@ -263,10 +263,10 @@ static int papr_vpd_system_loc_code(void)
off_t size;
int fd;
- SKIP_IF_MSG(get_system_loc_code(&lc),
- "Cannot determine system location code");
SKIP_IF_MSG(devfd < 0 && errno == ENOENT,
DEVPATH " not present");
+ SKIP_IF_MSG(get_system_loc_code(&lc),
+ "Cannot determine system location code");
FAIL_IF(devfd < 0);
--
2.37.1 (Apple Git-137.1)
The kernel has recently added support for shadow stacks, currently
x86 only using their CET feature but both arm64 and RISC-V have
equivalent features (GCS and Zicfiss respectively), I am actively
working on GCS[1]. With shadow stacks the hardware maintains an
additional stack containing only the return addresses for branch
instructions which is not generally writeable by userspace and ensures
that any returns are to the recorded addresses. This provides some
protection against ROP attacks and making it easier to collect call
stacks. These shadow stacks are allocated in the address space of the
userspace process.
Our API for shadow stacks does not currently offer userspace any
flexiblity for managing the allocation of shadow stacks for newly
created threads, instead the kernel allocates a new shadow stack with
the same size as the normal stack whenever a thread is created with the
feature enabled. The stacks allocated in this way are freed by the
kernel when the thread exits or shadow stacks are disabled for the
thread. This lack of flexibility and control isn't ideal, in the vast
majority of cases the shadow stack will be over allocated and the
implicit allocation and deallocation is not consistent with other
interfaces. As far as I can tell the interface is done in this manner
mainly because the shadow stack patches were in development since before
clone3() was implemented.
Since clone3() is readily extensible let's add support for specifying a
shadow stack when creating a new thread or process in a similar manner
to how the normal stack is specified, keeping the current implicit
allocation behaviour if one is not specified either with clone3() or
through the use of clone(). The user must provide a shadow stack
address and size, this must point to memory mapped for use as a shadow
stackby map_shadow_stack() with a shadow stack token at the top of the
stack.
Please note that the x86 portions of this code are build tested only, I
don't appear to have a system that can run CET avaible to me, I have
done testing with an integration into my pending work for GCS. There is
some possibility that the arm64 implementation may require the use of
clone3() and explicit userspace allocation of shadow stacks, this is
still under discussion.
Please further note that the token consumption done by clone3() is not
currently implemented in an atomic fashion, Rick indicated that he would
look into fixing this if people are OK with the implementation.
A new architecture feature Kconfig option for shadow stacks is added as
here, this was suggested as part of the review comments for the arm64
GCS series and since we need to detect if shadow stacks are supported it
seemed sensible to roll it in here.
[1] https://lore.kernel.org/r/20231009-arm64-gcs-v6-0-78e55deaa4dd@kernel.org/
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Changes in v5:
- Rebase onto v6.8-rc2.
- Rework ABI to have the user allocate the shadow stack memory with
map_shadow_stack() and a token.
- Force inlining of the x86 shadow stack enablement.
- Move shadow stack enablement out into a shared header for reuse by
other tests.
- Link to v4: https://lore.kernel.org/r/20231128-clone3-shadow-stack-v4-0-8b28ffe4f676@ke…
Changes in v4:
- Formatting changes.
- Use a define for minimum shadow stack size and move some basic
validation to fork.c.
- Link to v3: https://lore.kernel.org/r/20231120-clone3-shadow-stack-v3-0-a7b8ed3e2acc@ke…
Changes in v3:
- Rebase onto v6.7-rc2.
- Remove stale shadow_stack in internal kargs.
- If a shadow stack is specified unconditionally use it regardless of
CLONE_ parameters.
- Force enable shadow stacks in the selftest.
- Update changelogs for RISC-V feature rename.
- Link to v2: https://lore.kernel.org/r/20231114-clone3-shadow-stack-v2-0-b613f8681155@ke…
Changes in v2:
- Rebase onto v6.7-rc1.
- Remove ability to provide preallocated shadow stack, just specify the
desired size.
- Link to v1: https://lore.kernel.org/r/20231023-clone3-shadow-stack-v1-0-d867d0b5d4d0@ke…
---
Mark Brown (7):
Documentation: userspace-api: Add shadow stack API documentation
selftests: Provide helper header for shadow stack testing
mm: Introduce ARCH_HAS_USER_SHADOW_STACK
fork: Add shadow stack support to clone3()
selftests/clone3: Factor more of main loop into test_clone3()
selftests/clone3: Allow tests to flag if -E2BIG is a valid error code
selftests/clone3: Test shadow stack support
Documentation/userspace-api/index.rst | 1 +
Documentation/userspace-api/shadow_stack.rst | 41 +++++
arch/x86/Kconfig | 1 +
arch/x86/include/asm/shstk.h | 11 +-
arch/x86/kernel/process.c | 2 +-
arch/x86/kernel/shstk.c | 91 +++++++---
fs/proc/task_mmu.c | 2 +-
include/linux/mm.h | 2 +-
include/linux/sched/task.h | 2 +
include/uapi/linux/sched.h | 13 +-
kernel/fork.c | 61 +++++--
mm/Kconfig | 6 +
tools/testing/selftests/clone3/clone3.c | 211 ++++++++++++++++++----
tools/testing/selftests/clone3/clone3_selftests.h | 8 +
tools/testing/selftests/ksft_shstk.h | 63 +++++++
15 files changed, 430 insertions(+), 85 deletions(-)
---
base-commit: 41bccc98fb7931d63d03f326a746ac4d429c1dd3
change-id: 20231019-clone3-shadow-stack-15d40d2bf536
Best regards,
--
Mark Brown <broonie(a)kernel.org>
Hi Linus,
Please pull the following KUnit fixes update for Linux 6.8-rc5.
This KUnit update for Linux 6.8-rc5 consists of one important fix
to unregister kunit_bus when KUnit module is unloaded. Not doing
so causes an error when KUnit module tries to re-register the bus
when it gets reloaded.
diff is attached.
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit 1a9f2c776d1416c4ea6cb0d0b9917778c41a1a7d:
Documentation: KUnit: Update the instructions on how to test static functions (2024-01-22 07:59:03 -0700)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest tags/linux_kselftest-kunit-fixes-6.8-rc5
for you to fetch changes up to 829388b725f8d266ccec32a2f446717d8693eaba:
kunit: device: Unregister the kunit_bus on shutdown (2024-02-06 17:07:37 -0700)
----------------------------------------------------------------
linux_kselftest-kunit-fixes-6.8-rc5
This KUnit update for Linux 6.8-rc5 consists of one important fix
to unregister kunit_bus when KUnit module is unloaded. Not doing
so causes an error when KUnit module tries to re-register the bus
when it gets reloaded.
----------------------------------------------------------------
David Gow (1):
kunit: device: Unregister the kunit_bus on shutdown
lib/kunit/device-impl.h | 2 ++
lib/kunit/device.c | 14 ++++++++++++++
lib/kunit/test.c | 3 +++
3 files changed, 19 insertions(+)
----------------------------------------------------------------
[Putting this as a RFC because I'm pretty sure I'm not doing the things
correctly at the BPF level.]
[Also using bpf-next as the base tree as there will be conflicting
changes otherwise]
Ideally I'd like to have something similar to bpf_timers, but not
in soft IRQ context. So I'm emulating this with a sleepable
bpf_tail_call() (see "HID: bpf: allow to defer work in a delayed
workqueue").
Basically, I need to be able to defer a HID-BPF program for the
following reasons (from the aforementioned patch):
1. defer an event:
Sometimes we receive an out of proximity event, but the device can not
be trusted enough, and we need to ensure that we won't receive another
one in the following n milliseconds. So we need to wait those n
milliseconds, and eventually re-inject that event in the stack.
2. inject new events in reaction to one given event:
We might want to transform one given event into several. This is the
case for macro keys where a single key press is supposed to send
a sequence of key presses. But this could also be used to patch a
faulty behavior, if a device forgets to send a release event.
3. communicate with the device in reaction to one event:
We might want to communicate back to the device after a given event.
For example a device might send us an event saying that it came back
from sleeping state and needs to be re-initialized.
Currently we can achieve that by keeping a userspace program around,
raise a bpf event, and let that userspace program inject the events and
commands.
However, we are just keeping that program alive as a daemon for just
scheduling commands. There is no logic in it, so it doesn't really justify
an actual userspace wakeup. So a kernel workqueue seems simpler to handle.
Besides the "this is insane to reimplement the wheel", the other part I'm
not sure is whether we can say that BPF maps of type prog_array and of
type queue/stack can be used in sleepable context.
I don't see any warning when running the test programs, but that's probably
not a guarantee I'm doing the things properly :)
Cheers,
Benjamin
Signed-off-by: Benjamin Tissoires <bentiss(a)kernel.org>
---
Benjamin Tissoires (9):
bpf: allow more maps in sleepable bpf programs
HID: bpf/dispatch: regroup kfuncs definitions
HID: bpf: export hid_hw_output_report as a BPF kfunc
selftests/hid: Add test for hid_bpf_hw_output_report
HID: bpf: allow to inject HID event from BPF
selftests/hid: add tests for hid_bpf_input_report
HID: bpf: allow to defer work in a delayed workqueue
selftests/hid: add test for hid_bpf_schedule_delayed_work
selftests/hid: add another set of delayed work tests
Documentation/hid/hid-bpf.rst | 26 +-
drivers/hid/bpf/entrypoints/entrypoints.bpf.c | 8 +
drivers/hid/bpf/entrypoints/entrypoints.lskel.h | 236 +++++++++------
drivers/hid/bpf/hid_bpf_dispatch.c | 315 ++++++++++++++++-----
drivers/hid/bpf/hid_bpf_jmp_table.c | 34 ++-
drivers/hid/hid-core.c | 2 +
include/linux/hid_bpf.h | 10 +
kernel/bpf/verifier.c | 3 +
tools/testing/selftests/hid/hid_bpf.c | 286 ++++++++++++++++++-
tools/testing/selftests/hid/progs/hid.c | 205 ++++++++++++++
.../testing/selftests/hid/progs/hid_bpf_helpers.h | 8 +
11 files changed, 962 insertions(+), 171 deletions(-)
---
base-commit: 4f7a05917237b006ceae760507b3d15305769ade
change-id: 20240205-hid-bpf-sleepable-c01260fd91c4
Best regards,
--
Benjamin Tissoires <bentiss(a)kernel.org>
From: Paul Durrant <pdurrant(a)amazon.com>
This series has one small fix to what was in v11 [1]:
* KVM: xen: re-initialize shared_info if guest (32/64-bit) mode is set
The v11 patch failed to set the return code of the ioctl if the mode
was not actually changed, leading to a spurious failure.
This version of the series also contains a new bug-fix to the pfncache
code from David Woodhouse.
[1] https://lore.kernel.org/kvm/20231219161109.1318-1-paul@xen.org/
David Woodhouse (1):
KVM: pfncache: rework __kvm_gpc_refresh() to fix locking issues
Paul Durrant (19):
KVM: pfncache: Add a map helper function
KVM: pfncache: remove unnecessary exports
KVM: xen: mark guest pages dirty with the pfncache lock held
KVM: pfncache: add a mark-dirty helper
KVM: pfncache: remove KVM_GUEST_USES_PFN usage
KVM: pfncache: stop open-coding offset_in_page()
KVM: pfncache: include page offset in uhva and use it consistently
KVM: pfncache: allow a cache to be activated with a fixed (userspace)
HVA
KVM: xen: separate initialization of shared_info cache and content
KVM: xen: re-initialize shared_info if guest (32/64-bit) mode is set
KVM: xen: allow shared_info to be mapped by fixed HVA
KVM: xen: allow vcpu_info to be mapped by fixed HVA
KVM: selftests / xen: map shared_info using HVA rather than GFN
KVM: selftests / xen: re-map vcpu_info using HVA rather than GPA
KVM: xen: advertize the KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA capability
KVM: xen: split up kvm_xen_set_evtchn_fast()
KVM: xen: don't block on pfncache locks in kvm_xen_set_evtchn_fast()
KVM: pfncache: check the need for invalidation under read lock first
KVM: xen: allow vcpu_info content to be 'safely' copied
Documentation/virt/kvm/api.rst | 53 ++-
arch/x86/kvm/x86.c | 7 +-
arch/x86/kvm/xen.c | 356 +++++++++++-------
include/linux/kvm_host.h | 40 +-
include/linux/kvm_types.h | 8 -
include/uapi/linux/kvm.h | 9 +-
.../selftests/kvm/x86_64/xen_shinfo_test.c | 59 ++-
virt/kvm/pfncache.c | 340 +++++++++--------
8 files changed, 535 insertions(+), 337 deletions(-)
base-commit: 1c6d984f523f67ecfad1083bb04c55d91977bb15
--
2.39.2
Major changes in RFC v5:
------------------------
1. Rebased on top of 'Abstract page from net stack' series and used the
new netmem type to refer to LSB set pointers instead of re-using
struct page.
2. Downgraded this series back to RFC and called it RFC v5. This is
because this series is now dependent on 'Abstract page from net
stack'[1] and the queue API. Both are removed from the series to
pre-requisite work.
3. Reworked the page_pool devmem support to use netmem and for some
more unified handling.
4. Reworked the reference counting of net_iov (renamed from
page_pool_iov) to use pp_ref_count for refcounting.
The full changes including the dependent series and GVE page pool
support is here:
https://github.com/mina/linux/commits/tcpdevmem-rfcv5/
[1] https://patchwork.kernel.org/project/netdevbpf/list/?series=810774
Major changes in v1:
--------------------
1. Implemented MVP queue API ndos to remove the userspace-visible
driver reset.
2. Fixed issues in the napi_pp_put_page() devmem frag unref path.
3. Removed RFC tag.
Many smaller addressed comments across all the patches (patches have
individual change log).
Full tree including the rest of the GVE driver changes:
https://github.com/mina/linux/commits/tcpdevmem-v1
Changes in RFC v3:
------------------
1. Pulled in the memory-provider dependency from Jakub's RFC[1] to make the
series reviewable and mergable.
2. Implemented multi-rx-queue binding which was a todo in v2.
3. Fix to cmsg handling.
The sticking point in RFC v2[2] was the device reset required to refill
the device rx-queues after the dmabuf bind/unbind. The solution
suggested as I understand is a subset of the per-queue management ops
Jakub suggested or similar:
https://lore.kernel.org/netdev/20230815171638.4c057dcd@kernel.org/
This is not addressed in this revision, because:
1. This point was discussed at netconf & netdev and there is openness to
using the current approach of requiring a device reset.
2. Implementing individual queue resetting seems to be difficult for my
test bed with GVE. My prototype to test this ran into issues with the
rx-queues not coming back up properly if reset individually. At the
moment I'm unsure if it's a mistake in the POC or a genuine issue in
the virtualization stack behind GVE, which currently doesn't test
individual rx-queue restart.
3. Our usecases are not bothered by requiring a device reset to refill
the buffer queues, and we'd like to support NICs that run into this
limitation with resetting individual queues.
My thought is that drivers that have trouble with per-queue configs can
use the support in this series, while drivers that support new netdev
ops to reset individual queues can automatically reset the queue as
part of the dma-buf bind/unbind.
The same approach with device resets is presented again for consideration
with other sticking points addressed.
This proposal includes the rx devmem path only proposed for merge. For a
snapshot of my entire tree which includes the GVE POC page pool support &
device memory support:
https://github.com/torvalds/linux/compare/master...mina:linux:tcpdevmem-v3
[1] https://lore.kernel.org/netdev/f8270765-a27b-6ccf-33ea-cda097168d79@redhat.…
[2] https://lore.kernel.org/netdev/CAHS8izOVJGJH5WF68OsRWFKJid1_huzzUK+hpKbLcL4…
Changes in RFC v2:
------------------
The sticking point in RFC v1[1] was the dma-buf pages approach we used to
deliver the device memory to the TCP stack. RFC v2 is a proof-of-concept
that attempts to resolve this by implementing scatterlist support in the
networking stack, such that we can import the dma-buf scatterlist
directly. This is the approach proposed at a high level here[2].
Detailed changes:
1. Replaced dma-buf pages approach with importing scatterlist into the
page pool.
2. Replace the dma-buf pages centric API with a netlink API.
3. Removed the TX path implementation - there is no issue with
implementing the TX path with scatterlist approach, but leaving
out the TX path makes it easier to review.
4. Functionality is tested with this proposal, but I have not conducted
perf testing yet. I'm not sure there are regressions, but I removed
perf claims from the cover letter until they can be re-confirmed.
5. Added Signed-off-by: contributors to the implementation.
6. Fixed some bugs with the RX path since RFC v1.
Any feedback welcome, but specifically the biggest pending questions
needing feedback IMO are:
1. Feedback on the scatterlist-based approach in general.
2. Netlink API (Patch 1 & 2).
3. Approach to handle all the drivers that expect to receive pages from
the page pool (Patch 6).
[1] https://lore.kernel.org/netdev/dfe4bae7-13a0-3c5d-d671-f61b375cb0b4@gmail.c…
[2] https://lore.kernel.org/netdev/CAHS8izPm6XRS54LdCDZVd0C75tA1zHSu6jLVO8nzTLX…
----------------------
* TL;DR:
Device memory TCP (devmem TCP) is a proposal for transferring data to and/or
from device memory efficiently, without bouncing the data to a host memory
buffer.
* Problem:
A large amount of data transfers have device memory as the source and/or
destination. Accelerators drastically increased the volume of such transfers.
Some examples include:
- ML accelerators transferring large amounts of training data from storage into
GPU/TPU memory. In some cases ML training setup time can be as long as 50% of
TPU compute time, improving data transfer throughput & efficiency can help
improving GPU/TPU utilization.
- Distributed training, where ML accelerators, such as GPUs on different hosts,
exchange data among them.
- Distributed raw block storage applications transfer large amounts of data with
remote SSDs, much of this data does not require host processing.
Today, the majority of the Device-to-Device data transfers the network are
implemented as the following low level operations: Device-to-Host copy,
Host-to-Host network transfer, and Host-to-Device copy.
The implementation is suboptimal, especially for bulk data transfers, and can
put significant strains on system resources, such as host memory bandwidth,
PCIe bandwidth, etc. One important reason behind the current state is the
kernel’s lack of semantics to express device to network transfers.
* Proposal:
In this patch series we attempt to optimize this use case by implementing
socket APIs that enable the user to:
1. send device memory across the network directly, and
2. receive incoming network packets directly into device memory.
Packet _payloads_ go directly from the NIC to device memory for receive and from
device memory to NIC for transmit.
Packet _headers_ go to/from host memory and are processed by the TCP/IP stack
normally. The NIC _must_ support header split to achieve this.
Advantages:
- Alleviate host memory bandwidth pressure, compared to existing
network-transfer + device-copy semantics.
- Alleviate PCIe BW pressure, by limiting data transfer to the lowest level
of the PCIe tree, compared to traditional path which sends data through the
root complex.
* Patch overview:
** Part 1: netlink API
Gives user ability to bind dma-buf to an RX queue.
** Part 2: scatterlist support
Currently the standard for device memory sharing is DMABUF, which doesn't
generate struct pages. On the other hand, networking stack (skbs, drivers, and
page pool) operate on pages. We have 2 options:
1. Generate struct pages for dmabuf device memory, or,
2. Modify the networking stack to process scatterlist.
** part 3: page pool support
We piggy back on page pool memory providers proposal:
https://github.com/kuba-moo/linux/tree/pp-providers
It allows the page pool to define a memory provider that provides the
page allocation and freeing. It helps abstract most of the device memory
TCP changes from the driver.
** part 4: support for unreadable skb frags
Page pool iovs are not accessible by the host; we implement changes
throughput the networking stack to correctly handle skbs with unreadable
frags.
** Part 5: recvmsg() APIs
We define user APIs for the user to send and receive device memory.
Not included with this series is the GVE devmem TCP support, just to
simplify the review. Code available here if desired:
https://github.com/mina/linux/tree/tcpdevmem
This series is built on top of net-next with Jakub's pp-providers changes
cherry-picked.
* NIC dependencies:
1. (strict) Devmem TCP require the NIC to support header split, i.e. the
capability to split incoming packets into a header + payload and to put
each into a separate buffer. Devmem TCP works by using device memory
for the packet payload, and host memory for the packet headers.
2. (optional) Devmem TCP works better with flow steering support & RSS support,
i.e. the NIC's ability to steer flows into certain rx queues. This allows the
sysadmin to enable devmem TCP on a subset of the rx queues, and steer
devmem TCP traffic onto these queues and non devmem TCP elsewhere.
The NIC I have access to with these properties is the GVE with DQO support
running in Google Cloud, but any NIC that supports these features would suffice.
I may be able to help reviewers bring up devmem TCP on their NICs.
* Testing:
The series includes a udmabuf kselftest that show a simple use case of
devmem TCP and validates the entire data path end to end without
a dependency on a specific dmabuf provider.
** Test Setup
Kernel: net-next with this series and memory provider API cherry-picked
locally.
Hardware: Google Cloud A3 VMs.
NIC: GVE with header split & RSS & flow steering support.
Cc: Pavel Begunkov <asml.silence(a)gmail.com>
Cc: David Wei <dw(a)davidwei.uk>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Yunsheng Lin <linyunsheng(a)huawei.com>
Cc: Shailend Chand <shailend(a)google.com>
Cc: Harshitha Ramamurthy <hramamurthy(a)google.com>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: Jeroen de Borst <jeroendb(a)google.com>
Cc: Praveen Kaligineedi <pkaligineedi(a)google.com>
Jakub Kicinski (1):
net: page_pool: create hooks for custom page providers
Mina Almasry (13):
net: page_pool: factor out page_pool recycle check
net: netdev netlink api to bind dma-buf to a net device
netdev: support binding dma-buf to netdevice
netdev: netdevice devmem allocator
page_pool: convert to use netmem
page_pool: devmem support
memory-provider: dmabuf devmem memory provider
net: support non paged skb frags
net: add support for skbs with unreadable frags
tcp: RX path for devmem TCP
net: add SO_DEVMEM_DONTNEED setsockopt to release RX frags
net: add devmem TCP documentation
selftests: add ncdevmem, netcat for devmem TCP
Documentation/netlink/specs/netdev.yaml | 52 +++
Documentation/networking/devmem.rst | 271 +++++++++++++
Documentation/networking/index.rst | 1 +
arch/alpha/include/uapi/asm/socket.h | 8 +-
arch/mips/include/uapi/asm/socket.h | 6 +
arch/parisc/include/uapi/asm/socket.h | 6 +
arch/sparc/include/uapi/asm/socket.h | 6 +
include/linux/skbuff.h | 68 +++-
include/linux/socket.h | 1 +
include/net/devmem.h | 127 ++++++
include/net/netdev_rx_queue.h | 1 +
include/net/netmem.h | 223 ++++++++++-
include/net/page_pool/helpers.h | 117 ++++--
include/net/page_pool/types.h | 27 +-
include/net/sock.h | 2 +
include/net/tcp.h | 5 +-
include/trace/events/page_pool.h | 28 +-
include/uapi/asm-generic/socket.h | 6 +
include/uapi/linux/netdev.h | 19 +
include/uapi/linux/uio.h | 14 +
net/bpf/test_run.c | 5 +-
net/core/datagram.c | 6 +
net/core/dev.c | 314 ++++++++++++++-
net/core/gro.c | 7 +-
net/core/netdev-genl-gen.c | 19 +
net/core/netdev-genl-gen.h | 2 +
net/core/netdev-genl.c | 123 ++++++
net/core/page_pool.c | 419 +++++++++++++-------
net/core/skbuff.c | 92 ++++-
net/core/sock.c | 45 +++
net/ipv4/tcp.c | 196 +++++++++-
net/ipv4/tcp_input.c | 13 +-
net/ipv4/tcp_ipv4.c | 9 +
net/ipv4/tcp_output.c | 5 +-
net/packet/af_packet.c | 4 +-
tools/include/uapi/linux/netdev.h | 19 +
tools/testing/selftests/net/.gitignore | 1 +
tools/testing/selftests/net/Makefile | 5 +
tools/testing/selftests/net/ncdevmem.c | 489 ++++++++++++++++++++++++
39 files changed, 2527 insertions(+), 234 deletions(-)
create mode 100644 Documentation/networking/devmem.rst
create mode 100644 include/net/devmem.h
create mode 100644 tools/testing/selftests/net/ncdevmem.c
--
2.43.0.472.g3155946c3a-goog
From: Jeff Xu <jeffxu(a)chromium.org>
This is V9 version, with removing MAP_SEALABLE and PROT_SEAL
from mmap(), adding perfrmance benchmark and a test
to demo the sealing of read-only memory segment of elf mapping.
------------------------------------------------------------------
This patchset proposes a new mseal() syscall for the Linux kernel.
In a nutshell, mseal() protects the VMAs of a given virtual memory
range against modifications, such as changes to their permission bits.
Modern CPUs support memory permissions, such as the read/write (RW)
and no-execute (NX) bits. Linux has supported NX since the release of
kernel version 2.6.8 in August 2004 [1]. The memory permission feature
improves the security stance on memory corruption bugs, as an attacker
cannot simply write to arbitrary memory and point the code to it. The
memory must be marked with the X bit, or else an exception will occur.
Internally, the kernel maintains the memory permissions in a data
structure called VMA (vm_area_struct). mseal() additionally protects
the VMA itself against modifications of the selected seal type.
Memory sealing is useful to mitigate memory corruption issues where a
corrupted pointer is passed to a memory management system. For
example, such an attacker primitive can break control-flow integrity
guarantees since read-only memory that is supposed to be trusted can
become writable or .text pages can get remapped. Memory sealing can
automatically be applied by the runtime loader to seal .text and
.rodata pages and applications can additionally seal security critical
data at runtime. A similar feature already exists in the XNU kernel
with the VM_FLAGS_PERMANENT [3] flag and on OpenBSD with the
mimmutable syscall [4]. Also, Chrome wants to adopt this feature for
their CFI work [2] and this patchset has been designed to be
compatible with the Chrome use case.
Two system calls are involved in sealing the map: mmap() and mseal().
The new mseal() is an syscall on 64 bit CPU, and with
following signature:
int mseal(void addr, size_t len, unsigned long flags)
addr/len: memory range.
flags: reserved.
mseal() blocks following operations for the given memory range.
1> Unmapping, moving to another location, and shrinking the size,
via munmap() and mremap(), can leave an empty space, therefore can
be replaced with a VMA with a new set of attributes.
2> Moving or expanding a different VMA into the current location,
via mremap().
3> Modifying a VMA via mmap(MAP_FIXED).
4> Size expansion, via mremap(), does not appear to pose any specific
risks to sealed VMAs. It is included anyway because the use case is
unclear. In any case, users can rely on merging to expand a sealed VMA.
5> mprotect() and pkey_mprotect().
6> Some destructive madvice() behaviors (e.g. MADV_DONTNEED) for anonymous
memory, when users don't have write permission to the memory. Those
behaviors can alter region contents by discarding pages, effectively a
memset(0) for anonymous memory.
The idea that inspired this patch comes from Stephen Röttger’s work in
V8 CFI [5]. Chrome browser in ChromeOS will be the first user of this
API.
Indeed, the Chrome browser has very specific requirements for sealing,
which are distinct from those of most applications. For example, in
the case of libc, sealing is only applied to read-only (RO) or
read-execute (RX) memory segments (such as .text and .RELRO) to
prevent them from becoming writable, the lifetime of those mappings
are tied to the lifetime of the process.
Chrome wants to seal two large address space reservations that are
managed by different allocators. The memory is mapped RW- and RWX
respectively but write access to it is restricted using pkeys (or in
the future ARM permission overlay extensions). The lifetime of those
mappings are not tied to the lifetime of the process, therefore, while
the memory is sealed, the allocators still need to free or discard the
unused memory. For example, with madvise(DONTNEED).
However, always allowing madvise(DONTNEED) on this range poses a
security risk. For example if a jump instruction crosses a page
boundary and the second page gets discarded, it will overwrite the
target bytes with zeros and change the control flow. Checking
write-permission before the discard operation allows us to control
when the operation is valid. In this case, the madvise will only
succeed if the executing thread has PKEY write permissions and PKRU
changes are protected in software by control-flow integrity.
Although the initial version of this patch series is targeting the
Chrome browser as its first user, it became evident during upstream
discussions that we would also want to ensure that the patch set
eventually is a complete solution for memory sealing and compatible
with other use cases. The specific scenario currently in mind is
glibc's use case of loading and sealing ELF executables. To this end,
Stephen is working on a change to glibc to add sealing support to the
dynamic linker, which will seal all non-writable segments at startup.
Once this work is completed, all applications will be able to
automatically benefit from these new protections.
In closing, I would like to formally acknowledge the valuable
contributions received during the RFC process, which were instrumental
in shaping this patch:
Jann Horn: raising awareness and providing valuable insights on the
destructive madvise operations.
Liam R. Howlett: perf optimization.
Linus Torvalds: assisting in defining system call signature and scope.
Theo de Raadt: sharing the experiences and insight gained from
implementing mimmutable() in OpenBSD.
MM perf benchmarks
==================
This patch adds a loop in the mprotect/munmap/madvise(DONTNEED) to
check the VMAs’ sealing flag, so that no partial update can be made,
when any segment within the given memory range is sealed.
To measure the performance impact of this loop, two tests are developed.
[8]
The first is measuring the time taken for a particular system call,
by using clock_gettime(CLOCK_MONOTONIC). The second is using
PERF_COUNT_HW_REF_CPU_CYCLES (exclude user space). Both tests have
similar results.
The tests have roughly below sequence:
for (i = 0; i < 1000, i++)
create 1000 mappings (1 page per VMA)
start the sampling
for (j = 0; j < 1000, j++)
mprotect one mapping
stop and save the sample
delete 1000 mappings
calculates all samples.
Below tests are performed on Intel(R) Pentium(R) Gold 7505 @ 2.00GHz,
4G memory, Chromebook.
Based on the latest upstream code:
The first test (measuring time)
syscall__ vmas t t_mseal delta_ns per_vma %
munmap__ 1 909 944 35 35 104%
munmap__ 2 1398 1502 104 52 107%
munmap__ 4 2444 2594 149 37 106%
munmap__ 8 4029 4323 293 37 107%
munmap__ 16 6647 6935 288 18 104%
munmap__ 32 11811 12398 587 18 105%
mprotect 1 439 465 26 26 106%
mprotect 2 1659 1745 86 43 105%
mprotect 4 3747 3889 142 36 104%
mprotect 8 6755 6969 215 27 103%
mprotect 16 13748 14144 396 25 103%
mprotect 32 27827 28969 1142 36 104%
madvise_ 1 240 262 22 22 109%
madvise_ 2 366 442 76 38 121%
madvise_ 4 623 751 128 32 121%
madvise_ 8 1110 1324 215 27 119%
madvise_ 16 2127 2451 324 20 115%
madvise_ 32 4109 4642 534 17 113%
The second test (measuring cpu cycle)
syscall__ vmas cpu cmseal delta_cpu per_vma %
munmap__ 1 1790 1890 100 100 106%
munmap__ 2 2819 3033 214 107 108%
munmap__ 4 4959 5271 312 78 106%
munmap__ 8 8262 8745 483 60 106%
munmap__ 16 13099 14116 1017 64 108%
munmap__ 32 23221 24785 1565 49 107%
mprotect 1 906 967 62 62 107%
mprotect 2 3019 3203 184 92 106%
mprotect 4 6149 6569 420 105 107%
mprotect 8 9978 10524 545 68 105%
mprotect 16 20448 21427 979 61 105%
mprotect 32 40972 42935 1963 61 105%
madvise_ 1 434 497 63 63 115%
madvise_ 2 752 899 147 74 120%
madvise_ 4 1313 1513 200 50 115%
madvise_ 8 2271 2627 356 44 116%
madvise_ 16 4312 4883 571 36 113%
madvise_ 32 8376 9319 943 29 111%
Based on the result, for 6.8 kernel, sealing check adds
20-40 nano seconds, or around 50-100 CPU cycles, per VMA.
In addition, I applied the sealing to 5.10 kernel:
The first test (measuring time)
syscall__ vmas t tmseal delta_ns per_vma %
munmap__ 1 357 390 33 33 109%
munmap__ 2 442 463 21 11 105%
munmap__ 4 614 634 20 5 103%
munmap__ 8 1017 1137 120 15 112%
munmap__ 16 1889 2153 263 16 114%
munmap__ 32 4109 4088 -21 -1 99%
mprotect 1 235 227 -7 -7 97%
mprotect 2 495 464 -30 -15 94%
mprotect 4 741 764 24 6 103%
mprotect 8 1434 1437 2 0 100%
mprotect 16 2958 2991 33 2 101%
mprotect 32 6431 6608 177 6 103%
madvise_ 1 191 208 16 16 109%
madvise_ 2 300 324 24 12 108%
madvise_ 4 450 473 23 6 105%
madvise_ 8 753 806 53 7 107%
madvise_ 16 1467 1592 125 8 108%
madvise_ 32 2795 3405 610 19 122%
The second test (measuring cpu cycle)
syscall__ nbr_vma cpu cmseal delta_cpu per_vma %
munmap__ 1 684 715 31 31 105%
munmap__ 2 861 898 38 19 104%
munmap__ 4 1183 1235 51 13 104%
munmap__ 8 1999 2045 46 6 102%
munmap__ 16 3839 3816 -23 -1 99%
munmap__ 32 7672 7887 216 7 103%
mprotect 1 397 443 46 46 112%
mprotect 2 738 788 50 25 107%
mprotect 4 1221 1256 35 9 103%
mprotect 8 2356 2429 72 9 103%
mprotect 16 4961 4935 -26 -2 99%
mprotect 32 9882 10172 291 9 103%
madvise_ 1 351 380 29 29 108%
madvise_ 2 565 615 49 25 109%
madvise_ 4 872 933 61 15 107%
madvise_ 8 1508 1640 132 16 109%
madvise_ 16 3078 3323 245 15 108%
madvise_ 32 5893 6704 811 25 114%
For 5.10 kernel, sealing check adds 0-15 ns in time, or 10-30
CPU cycles, there is even decrease in some cases.
It might be interesting to compare 5.10 and 6.8 kernel
The first test (measuring time)
syscall__ vmas t_5_10 t_6_8 delta_ns per_vma %
munmap__ 1 357 909 552 552 254%
munmap__ 2 442 1398 956 478 316%
munmap__ 4 614 2444 1830 458 398%
munmap__ 8 1017 4029 3012 377 396%
munmap__ 16 1889 6647 4758 297 352%
munmap__ 32 4109 11811 7702 241 287%
mprotect 1 235 439 204 204 187%
mprotect 2 495 1659 1164 582 335%
mprotect 4 741 3747 3006 752 506%
mprotect 8 1434 6755 5320 665 471%
mprotect 16 2958 13748 10790 674 465%
mprotect 32 6431 27827 21397 669 433%
madvise_ 1 191 240 49 49 125%
madvise_ 2 300 366 67 33 122%
madvise_ 4 450 623 173 43 138%
madvise_ 8 753 1110 357 45 147%
madvise_ 16 1467 2127 660 41 145%
madvise_ 32 2795 4109 1314 41 147%
The second test (measuring cpu cycle)
syscall__ vmas cpu_5_10 c_6_8 delta_cpu per_vma %
munmap__ 1 684 1790 1106 1106 262%
munmap__ 2 861 2819 1958 979 327%
munmap__ 4 1183 4959 3776 944 419%
munmap__ 8 1999 8262 6263 783 413%
munmap__ 16 3839 13099 9260 579 341%
munmap__ 32 7672 23221 15549 486 303%
mprotect 1 397 906 509 509 228%
mprotect 2 738 3019 2281 1140 409%
mprotect 4 1221 6149 4929 1232 504%
mprotect 8 2356 9978 7622 953 423%
mprotect 16 4961 20448 15487 968 412%
mprotect 32 9882 40972 31091 972 415%
madvise_ 1 351 434 82 82 123%
madvise_ 2 565 752 186 93 133%
madvise_ 4 872 1313 442 110 151%
madvise_ 8 1508 2271 763 95 151%
madvise_ 16 3078 4312 1234 77 140%
madvise_ 32 5893 8376 2483 78 142%
From 5.10 to 6.8
munmap: added 250-550 ns in time, or 500-1100 in cpu cycle, per vma.
mprotect: added 200-750 ns in time, or 500-1200 in cpu cycle, per vma.
madvise: added 33-50 ns in time, or 70-110 in cpu cycle, per vma.
In comparison to mseal, which adds 20-40 ns or 50-100 CPU cycles, the
increase from 5.10 to 6.8 is significantly larger, approximately ten
times greater for munmap and mprotect.
When I discuss the mm performance with Brian Makin, an engineer worked
on performance, it was brought to my attention that such a performance
benchmarks, which measuring millions of mm syscall in a tight loop, may
not accurately reflect real-world scenarios, such as that of a database
service. Also this is tested using a single HW and ChromeOS, the data
from another HW or distribution might be different. It might be best
to take this data with a grain of salt.
Change history:
===============
V9:
- remove mmap(PROT_SEAL) and mmap(MAP_SEALABLE) (Linus, Theo de Raadt)
- Update mseal_test to check for prot bit (Liam R. Howlett)
- Update documentation to give more detail on sealing check (Liam R. Howlett)
- Add seal_elf test.
- Add performance measure data.
- mseal_test: fix arm build.
V8:
- perf optimization in mmap. (Liam R. Howlett)
- add one testcase (test_seal_zero_address)
- Update mseal.rst to add note for MAP_SEALABLE.
https://lore.kernel.org/lkml/20240131175027.3287009-1-jeffxu@chromium.org/
V7:
- fix index.rst (Randy Dunlap)
- fix arm build (Randy Dunlap)
- return EPERM for blocked operations (Theo de Raadt)
https://lore.kernel.org/linux-mm/20240122152905.2220849-2-jeffxu@chromium.o…
V6:
- Drop RFC from subject, Given Linus's general approval.
- Adjust syscall number for mseal (main Jan.11/2024)
- Code style fix (Matthew Wilcox)
- selftest: use ksft macros (Muhammad Usama Anjum)
- Document fix. (Randy Dunlap)
https://lore.kernel.org/all/20240111234142.2944934-1-jeffxu@chromium.org/
V5:
- fix build issue in mseal-Wire-up-mseal-syscall
(Suggested by Linus Torvalds, and Greg KH)
- updates on selftest.
https://lore.kernel.org/lkml/20240109154547.1839886-1-jeffxu@chromium.org/#r
V4:
(Suggested by Linus Torvalds)
- new signature: mseal(start,len,flags)
- 32 bit is not supported. vm_seal is removed, use vm_flags instead.
- single bit in vm_flags for sealed state.
- CONFIG_MSEAL kernel config is removed.
- single bit of PROT_SEAL in the "Prot" field of mmap().
Other changes:
- update selftest (Suggested by Muhammad Usama Anjum)
- update documentation.
https://lore.kernel.org/all/20240104185138.169307-1-jeffxu@chromium.org/
V3:
- Abandon per-syscall approach, (Suggested by Linus Torvalds).
- Organize sealing types around their functionality, such as
MM_SEAL_BASE, MM_SEAL_PROT_PKEY.
- Extend the scope of sealing from calls originated in userspace to
both kernel and userspace. (Suggested by Linus Torvalds)
- Add seal type support in mmap(). (Suggested by Pedro Falcato)
- Add a new sealing type: MM_SEAL_DISCARD_RO_ANON to prevent
destructive operations of madvise. (Suggested by Jann Horn and
Stephen Röttger)
- Make sealed VMAs mergeable. (Suggested by Jann Horn)
- Add MAP_SEALABLE to mmap()
- Add documentation - mseal.rst
https://lore.kernel.org/linux-mm/20231212231706.2680890-2-jeffxu@chromium.o…
v2:
Use _BITUL to define MM_SEAL_XX type.
Use unsigned long for seal type in sys_mseal() and other functions.
Remove internal VM_SEAL_XX type and convert_user_seal_type().
Remove MM_ACTION_XX type.
Remove caller_origin(ON_BEHALF_OF_XX) and replace with sealing bitmask.
Add more comments in code.
Add a detailed commit message.
https://lore.kernel.org/lkml/20231017090815.1067790-1-jeffxu@chromium.org/
v1:
https://lore.kernel.org/lkml/20231016143828.647848-1-jeffxu@chromium.org/
----------------------------------------------------------------
[1] https://kernelnewbies.org/Linux_2_6_8
[2] https://v8.dev/blog/control-flow-integrity
[3] https://github.com/apple-oss-distributions/xnu/blob/1031c584a5e37aff177559b…
[4] https://man.openbsd.org/mimmutable.2
[5] https://docs.google.com/document/d/1O2jwK4dxI3nRcOJuPYkonhTkNQfbmwdvxQMyXge…
[6] https://lore.kernel.org/lkml/CAG48ez3ShUYey+ZAFsU2i1RpQn0a5eOs2hzQ426Fkcgnf…
[7] https://lore.kernel.org/lkml/20230515130553.2311248-1-jeffxu@chromium.org/
[8] https://github.com/peaktocreek/mmperf
Jeff Xu (5):
mseal: Wire up mseal syscall
mseal: add mseal syscall
selftest mm/mseal memory sealing
mseal:add documentation
selftest mm/mseal read-only elf memory segment
Documentation/userspace-api/index.rst | 1 +
Documentation/userspace-api/mseal.rst | 199 ++
arch/alpha/kernel/syscalls/syscall.tbl | 1 +
arch/arm/tools/syscall.tbl | 1 +
arch/arm64/include/asm/unistd.h | 2 +-
arch/arm64/include/asm/unistd32.h | 2 +
arch/m68k/kernel/syscalls/syscall.tbl | 1 +
arch/microblaze/kernel/syscalls/syscall.tbl | 1 +
arch/mips/kernel/syscalls/syscall_n32.tbl | 1 +
arch/mips/kernel/syscalls/syscall_n64.tbl | 1 +
arch/mips/kernel/syscalls/syscall_o32.tbl | 1 +
arch/parisc/kernel/syscalls/syscall.tbl | 1 +
arch/powerpc/kernel/syscalls/syscall.tbl | 1 +
arch/s390/kernel/syscalls/syscall.tbl | 1 +
arch/sh/kernel/syscalls/syscall.tbl | 1 +
arch/sparc/kernel/syscalls/syscall.tbl | 1 +
arch/x86/entry/syscalls/syscall_32.tbl | 1 +
arch/x86/entry/syscalls/syscall_64.tbl | 1 +
arch/xtensa/kernel/syscalls/syscall.tbl | 1 +
include/linux/syscalls.h | 1 +
include/uapi/asm-generic/unistd.h | 5 +-
kernel/sys_ni.c | 1 +
mm/Makefile | 4 +
mm/internal.h | 37 +
mm/madvise.c | 12 +
mm/mmap.c | 31 +-
mm/mprotect.c | 10 +
mm/mremap.c | 31 +
mm/mseal.c | 307 ++++
tools/testing/selftests/mm/.gitignore | 2 +
tools/testing/selftests/mm/Makefile | 2 +
tools/testing/selftests/mm/mseal_test.c | 1836 +++++++++++++++++++
tools/testing/selftests/mm/seal_elf.c | 183 ++
33 files changed, 2678 insertions(+), 3 deletions(-)
create mode 100644 Documentation/userspace-api/mseal.rst
create mode 100644 mm/mseal.c
create mode 100644 tools/testing/selftests/mm/mseal_test.c
create mode 100644 tools/testing/selftests/mm/seal_elf.c
--
2.43.0.687.g38aa6559b0-goog
When opening yama/ptrace_scope we unconditionally use sudo to ensure we
are running as root, resulting in failures if running in a minimal root
filesystem where sudo is not installed. Since automated test systems will
typically just run all of kselftest as root (and many kselftests rely on
this for full functionality) add a check to see if we're already root and
only invoke sudo if not.
Since I am unclear what the intended effect of the command being run is I
have not added any error handling for the case where we fail to obtain
root.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
tools/testing/selftests/mm/run_vmtests.sh | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/selftests/mm/run_vmtests.sh
index fe140a9f4f9d..c8ca830dba93 100755
--- a/tools/testing/selftests/mm/run_vmtests.sh
+++ b/tools/testing/selftests/mm/run_vmtests.sh
@@ -248,6 +248,17 @@ run_test() {
echo "TAP version 13" | tap_output
+HAVE_ROOT=0
+if [ "$(id -u)" = "0" ]; then
+ AS_ROOT=
+ HAVE_ROOT=1
+elif [ "$(command -v sudo)" != "" ]; then
+ AS_ROOT=sudo
+ HAVE_ROOT=1
+else
+ echo # WARNING: Unable to run as root
+fi
+
CATEGORY="hugetlb" run_test ./hugepage-mmap
shmmax=$(cat /proc/sys/kernel/shmmax)
@@ -363,7 +374,8 @@ CATEGORY="hmm" run_test bash ./test_hmm.sh smoke
# MADV_POPULATE_READ and MADV_POPULATE_WRITE tests
CATEGORY="madv_populate" run_test ./madv_populate
-(echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope 2>&1) | tap_prefix
+# FIXME: What if we can't get root?
+(echo 0 | ${AS_ROOT} tee /proc/sys/kernel/yama/ptrace_scope 2>&1) | tap_prefix
CATEGORY="memfd_secret" run_test ./memfd_secret
# KSM KSM_MERGE_TIME_HUGE_PAGES test with size of 100
---
base-commit: 445a555e0623387fa9b94e68e61681717e70200a
change-id: 20240209-kselftest-mm-check-deps-01a825e5fed4
Best regards,
--
Mark Brown <broonie(a)kernel.org>
This series aims to keep the git status clean after building the
selftests by adding some missing .gitignore files and object inclusion
in existing .gitignore files. This is one of the requirements listed in
the selftests documentation for new tests, but it is not always followed
as desired.
After adding these .gitignore files and including the generated objects,
the working tree appears clean again.
The new version includes a missing entry fot the .gitignore in damon,
which was reported by Bernd Edlinger <bernd.edlinger(a)hotmail.de>, who
also proposed a patch for it as well as for other missing .gitignore
files covered by v1. Bernd has been added to the corresponding patch as
the reporter. If a different tag is desired, I am fine with it.
To: Shuah Khan <shuah(a)kernel.org>
To: SeongJae Park <sj(a)kernel.org>
To: Bernd Edlinger <bernd.edlinger(a)hotmail.de>
Cc: linux-kselftest(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Cc: damon(a)lists.linux.dev
Cc: linux-mm(a)kvack.org
Signed-off-by: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
Changes in v3:
- General: base on mm-unstable to avoid conflicts.
- damon: add missing Closes tag.
- Link to v2: https://lore.kernel.org/r/20240212-selftest_gitignore-v2-0-75f0de50a178@gma…
Changes in v2:
- Remove patch for netfilter (not relevant anymore).
- Add patch for damon (missing binary in .gitignore).
- Link to v1: https://lore.kernel.org/r/20240101-selftest_gitignore-v1-0-eb61b09adb05@gma…
---
Javier Carrasco (4):
selftests: uevent: add missing gitignore
selftests: thermal: intel: power_floor: add missing gitignore
selftests: thermal: intel: workload_hint: add missing gitignore
selftests: damon: add access_memory to .gitignore
tools/testing/selftests/damon/.gitignore | 1 +
tools/testing/selftests/thermal/intel/power_floor/.gitignore | 1 +
tools/testing/selftests/thermal/intel/workload_hint/.gitignore | 1 +
tools/testing/selftests/uevent/.gitignore | 1 +
4 files changed, 4 insertions(+)
---
base-commit: 7e56cf9a7f108e8129d75cea0dabc9488fb4defa
change-id: 20240101-selftest_gitignore-7da2c503766e
Best regards,
--
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
The mentioned test is still flaky, unusally enough in 'fast'
environments.
Patch 2/2 [try to] address the existing issues, while patch 1/2
introduces more strict tests for the existing net helpers, to hopefully
prevent future pain.
Paolo Abeni (2):
selftests: net: more strict check in net_helper
selftests: net: more pmtu.sh fixes
tools/testing/selftests/net/net_helper.sh | 11 +++++++----
tools/testing/selftests/net/pmtu.sh | 4 ++--
2 files changed, 9 insertions(+), 6 deletions(-)
--
2.43.0
The gro self-tests sends the packets to be aggregated with
multiple write operations.
When running is slow environment, it's hard to guarantee that
the GRO engine will wait for the last packet in an intended
train.
The above causes almost deterministic failures in our CI for
the 'large' test-case.
Address the issue explicitly ignoring failures for such case
in slow environments (KSFT_MACHINE_SLOW==true).
Fixes: 7d1575014a63 ("selftests/net: GRO coalesce test")
Reviewed-by: Willem de Bruijn <willemb(a)google.com>
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
---
v2 -> v3:
- dump the error suppression message only on actual errors (Jakub)
v1 -> v2:
- replace the '-a' operator with '&&' - Mattbe
Note that the fixes tag is there mainly to justify targeting the net
tree, and this is aiming at net to hopefully make the test more stable
ASAP for both trees.
---
tools/testing/selftests/net/gro.sh | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/tools/testing/selftests/net/gro.sh b/tools/testing/selftests/net/gro.sh
index 19352f106c1d..02c21ff4ca81 100755
--- a/tools/testing/selftests/net/gro.sh
+++ b/tools/testing/selftests/net/gro.sh
@@ -31,6 +31,11 @@ run_test() {
1>>log.txt
wait "${server_pid}"
exit_code=$?
+ if [[ ${test} == "large" && -n "${KSFT_MACHINE_SLOW}" && \
+ ${exit_code} -ne 0 ]]; then
+ echo "Ignoring errors due to slow environment" 1>&2
+ exit_code=0
+ fi
if [[ "${exit_code}" -eq 0 ]]; then
break;
fi
--
2.43.0
Hi Christian, Janosch, Heiko,
Here is a pair of patches to replace the RFC I recently sent [1].
Patch 1 performs the host/guest access register swap that Christian
suggested (instead of a full sync_reg/store_reg process).
Patch 2 provides a selftest patch that exercises this scenario.
Applying patch 2 without patch 1 fails in the following way:
[eric@host linux]# ./tools/testing/selftests/kvm/s390x/memop
TAP version 13
1..16
ok 1 simple copy
ok 2 generic error checks
ok 3 copy with storage keys
ok 4 cmpxchg with storage keys
ok 5 concurrently cmpxchg with storage keys
ok 6 copy with key storage protection override
ok 7 copy with key fetch protection
ok 8 copy with key fetch protection override
==== Test Assertion Failure ====
s390x/memop.c:186: !r
pid=5720 tid=5720 errno=4 - Interrupted system call
1 0x00000000010042af: memop_ioctl at memop.c:186 (discriminator 3)
2 0x0000000001006697: test_copy_access_register at memop.c:400 (discriminator 2)
3 0x0000000001002aaf: main at memop.c:1181
4 0x000003ffaec33a5b: ?? ??:0
5 0x000003ffaec33b5d: ?? ??:0
6 0x0000000001002ba9: _start at ??:?
KVM_S390_MEM_OP failed, rc: 40 errno: 4 (Interrupted system call)
Thoughts on this approach?
Thanks,
Eric
[1] https://lore.kernel.org/r/20240131205832.2179029-1-farman@linux.ibm.com/
Eric Farman (2):
KVM: s390: load guest access registers in MEM_OP ioctl
KVM: s390: selftests: memop: add a simple AR test
arch/s390/kvm/kvm-s390.c | 6 +++++
tools/testing/selftests/kvm/s390x/memop.c | 28 +++++++++++++++++++++++
2 files changed, 34 insertions(+)
--
2.40.1
From: Zi Yan <ziy(a)nvidia.com>
Hi all,
File folio supports any order and people would like to support flexible orders
for anonymous folio[1] too. Currently, split_huge_page() only splits a huge
page to order-0 pages, but splitting to orders higher than 0 is also useful.
This patchset adds support for splitting a huge page to any lower order pages
and uses it during file folio truncate operations.
The patchset is on top of mm-everything-2023-03-27-21-20.
Changelog
===
Since v2
---
1. Fixed an issue in __split_page_owner() introduced during my rebase
Since v1
---
1. Changed split_page_memcg() and split_page_owner() parameter to use order
2. Used folio_test_pmd_mappable() in place of the equivalent code
Details
===
* Patch 1 changes split_page_memcg() to use order instead of nr_pages
* Patch 2 changes split_page_owner() to use order instead of nr_pages
* Patch 3 and 4 add new_order parameter split_page_memcg() and
split_page_owner() and prepare for upcoming changes.
* Patch 5 adds split_huge_page_to_list_to_order() to split a huge page
to any lower order. The original split_huge_page_to_list() calls
split_huge_page_to_list_to_order() with new_order = 0.
* Patch 6 uses split_huge_page_to_list_to_order() in large pagecache folio
truncation instead of split the large folio all the way down to order-0.
* Patch 7 adds a test API to debugfs and test cases in
split_huge_page_test selftests.
Comments and/or suggestions are welcome.
[1] https://lore.kernel.org/linux-mm/Y%2FblF0GIunm+pRIC@casper.infradead.org/
Zi Yan (7):
mm/memcg: use order instead of nr in split_page_memcg()
mm/page_owner: use order instead of nr in split_page_owner()
mm: memcg: make memcg huge page split support any order split.
mm: page_owner: add support for splitting to any order in split
page_owner.
mm: thp: split huge page to any lower order pages.
mm: truncate: split huge page cache page to a non-zero order if
possible.
mm: huge_memory: enable debugfs to split huge pages to any order.
include/linux/huge_mm.h | 10 +-
include/linux/memcontrol.h | 4 +-
include/linux/page_owner.h | 10 +-
mm/huge_memory.c | 137 ++++++++---
mm/memcontrol.c | 10 +-
mm/page_alloc.c | 8 +-
mm/page_owner.c | 8 +-
mm/truncate.c | 21 +-
.../selftests/mm/split_huge_page_test.c | 225 +++++++++++++++++-
9 files changed, 365 insertions(+), 68 deletions(-)
--
2.39.2
This series aims to keep the git status clean after building the
selftests by adding some missing .gitignore files and object inclusion
in existing .gitignore files. This is one of the requirements listed in
the selftests documentation for new tests, but it is not always followed
as desired.
After adding these .gitignore files and including the generated objects,
the working tree appears clean again.
The new version includes a missing entry fot the .gitignore in damon,
which was reported by Bernd Edlinger <bernd.edlinger(a)hotmail.de>, who
also proposed a patch for it as well as for other missing .gitignore
files covered by v1. Bernd has been added to the corresponding patch as
the reporter. If a different tag is desired, I am fine with it.
To: Shuah Khan <shuah(a)kernel.org>
To: SeongJae Park <sj(a)kernel.org>
To: Bernd Edlinger <bernd.edlinger(a)hotmail.de>
Cc: linux-kselftest(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Cc: damon(a)lists.linux.dev
Cc: linux-mm(a)kvack.org
Signed-off-by: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
Changes in v2:
- Remove patch for netfilter (not relevant anymore).
- Add patch for damon (missing binary in .gitignore).
- Link to v1: https://lore.kernel.org/r/20240101-selftest_gitignore-v1-0-eb61b09adb05@gma…
---
Javier Carrasco (4):
selftests: uevent: add missing gitignore
selftests: thermal: intel: power_floor: add missing gitignore
selftests: thermal: intel: workload_hint: add missing gitignore
selftests: damon: add access_memory to .gitignore
tools/testing/selftests/damon/.gitignore | 1 +
tools/testing/selftests/thermal/intel/power_floor/.gitignore | 1 +
tools/testing/selftests/thermal/intel/workload_hint/.gitignore | 1 +
tools/testing/selftests/uevent/.gitignore | 1 +
4 files changed, 4 insertions(+)
---
base-commit: 716f4aaa7b48a55c73d632d0657b35342b1fefd7
change-id: 20240101-selftest_gitignore-7da2c503766e
Best regards,
--
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
Non-contiguous CBM support for Intel CAT has been merged into the kernel
with Commit 0e3cd31f6e90 ("x86/resctrl: Enable non-contiguous CBMs in
Intel CAT") but there is no selftest that would validate if this feature
works correctly. The selftest needs to verify if writing non-contiguous
CBMs to the schemata file behaves as expected in comparison to the
information about non-contiguous CBMs support.
The patch series is based on a rework of resctrl selftests that's
currently in review [1]. The patch also implements a similar
functionality presented in the bash script included in the cover letter
of the original non-contiguous CBMs in Intel CAT series [3].
Changelog v5:
- Add a few reviewed-by tags.
- Make some minor text changes in patch messages and comments.
- Redo resctrl_mon_feature_exists() to be more generic and fix some of
my errors in refactoring feature checking.
Changelog v4:
- Changes to error failure return values in non-contiguous test.
- Some minor text refactoring without functional changes.
Changelog v3:
- Rebase onto v4 of Ilpo's series [1].
- Split old patch 3/4 into two parts. One doing refactoring and one
adding a new function.
- Some changes to all the patches after Reinette's review.
Changelog v2:
- Rebase onto v4 of Ilpo's series [2].
- Add two patches that prepare helpers for the new test.
- Move Ilpo's patch that adds test grouping to this series.
- Apply Ilpo's suggestion to the patch that adds a new test.
[1] https://lore.kernel.org/all/20231215150515.36983-1-ilpo.jarvinen@linux.inte…
[2] https://lore.kernel.org/all/20231211121826.14392-1-ilpo.jarvinen@linux.inte…
[3] https://lore.kernel.org/all/cover.1696934091.git.maciej.wieczor-retman@inte…
Older versions of this series:
[v1] https://lore.kernel.org/all/20231109112847.432687-1-maciej.wieczor-retman@i…
[v2] https://lore.kernel.org/all/cover.1702392177.git.maciej.wieczor-retman@inte…
[v3] https://lore.kernel.org/all/cover.1706180726.git.maciej.wieczor-retman@inte…
[v4] https://lore.kernel.org/all/cover.1707130307.git.maciej.wieczor-retman@inte…
Ilpo Järvinen (1):
selftests/resctrl: Add test groups and name L3 CAT test L3_CAT
Maciej Wieczor-Retman (4):
selftests/resctrl: Add a helper for the non-contiguous test
selftests/resctrl: Split validate_resctrl_feature_request()
selftests/resctrl: Add resource_info_file_exists()
selftests/resctrl: Add non-contiguous CBMs CAT test
tools/testing/selftests/resctrl/cat_test.c | 84 ++++++++++++++++-
tools/testing/selftests/resctrl/cmt_test.c | 2 +-
tools/testing/selftests/resctrl/mba_test.c | 2 +-
tools/testing/selftests/resctrl/mbm_test.c | 6 +-
tools/testing/selftests/resctrl/resctrl.h | 10 +-
.../testing/selftests/resctrl/resctrl_tests.c | 18 +++-
tools/testing/selftests/resctrl/resctrlfs.c | 94 +++++++++++++++++--
7 files changed, 194 insertions(+), 22 deletions(-)
--
2.43.0
Title under line too short
Signed-off-by: Meng Li <li.meng(a)amd.com>
---
Documentation/admin-guide/pm/amd-pstate.rst | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Documentation/admin-guide/pm/amd-pstate.rst b/Documentation/admin-guide/pm/amd-pstate.rst
index 0a3aa6b8ffd5..1e0d101b020a 100644
--- a/Documentation/admin-guide/pm/amd-pstate.rst
+++ b/Documentation/admin-guide/pm/amd-pstate.rst
@@ -381,7 +381,7 @@ driver receives a message with the highest performance change, it will
update the core ranking and set the cpu's priority.
``amd-pstate`` Preferred Core Switch
-=================================
+=====================================
Kernel Parameters
-----------------
--
2.34.1
The gro self-tests sends the packets to be aggregated with
multiple write operations.
When running is slow environment, it's hard to guarantee that
the GRO engine will wait for the last packet in an intended
train.
The above causes almost deterministic failures in our CI for
the 'large' test-case.
Address the issue explicitly ignoring failures for such case
in slow environments (KSFT_MACHINE_SLOW==true).
Fixes: 7d1575014a63 ("selftests/net: GRO coalesce test")
Reviewed-by: Willem de Bruijn <willemb(a)google.com>
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
---
v1 -> v2:
- replace the '-a' operator with '&&' - Mattbe
Note that the fixes tag is there mainly to justify targeting the net
tree, and this is aiming at net to hopefully make the test more stable
ASAP for both trees.
I experimented with a largish refactory replacing the multiple writes
with a single GSO packet, but exhausted by time budget before reaching
any good result.
---
tools/testing/selftests/net/gro.sh | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/tools/testing/selftests/net/gro.sh b/tools/testing/selftests/net/gro.sh
index 19352f106c1d..3190b41e8bfc 100755
--- a/tools/testing/selftests/net/gro.sh
+++ b/tools/testing/selftests/net/gro.sh
@@ -31,6 +31,10 @@ run_test() {
1>>log.txt
wait "${server_pid}"
exit_code=$?
+ if [[ ${test} == "large" && -n "${KSFT_MACHINE_SLOW}" ]]; then
+ echo "Ignoring errors due to slow environment" 1>&2
+ exit_code=0
+ fi
if [[ "${exit_code}" -eq 0 ]]; then
break;
fi
--
2.43.0
The mentioned test is failing in slow environments:
# SO_TXTIME ipv4 clock monotonic
# ./so_txtime: recv: timeout: Resource temporarily unavailable
not ok 1 selftests: net: so_txtime.sh # exit=1
The receiver is started in background and the sender could end-up
transmitting the packet before the receiver is ready, so that the
later recv times out.
Address the issue explcitly waiting for the socket being bound to
the relevant port.
Fixes: af5136f95045 ("selftests/net: SO_TXTIME with ETF and FQ")
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
---
Note that to really cope with slow env the mentioned self-tests also
need net-next commit c41dfb0dfbec ("selftests/net: ignore timing
errors in so_txtime if KSFT_MACHINE_SLOW"), so this could be applied to
net-next, too
---
tools/testing/selftests/net/so_txtime.sh | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/tools/testing/selftests/net/so_txtime.sh b/tools/testing/selftests/net/so_txtime.sh
index 3f06f4d286a9..ade0e5755099 100755
--- a/tools/testing/selftests/net/so_txtime.sh
+++ b/tools/testing/selftests/net/so_txtime.sh
@@ -5,6 +5,8 @@
set -e
+source net_helper.sh
+
readonly DEV="veth0"
readonly BIN="./so_txtime"
@@ -51,13 +53,16 @@ do_test() {
local readonly CLOCK="$2"
local readonly TXARGS="$3"
local readonly RXARGS="$4"
+ local PROTO
if [[ "${IP}" == "4" ]]; then
local readonly SADDR="${SADDR4}"
local readonly DADDR="${DADDR4}"
+ PROTO=udp
elif [[ "${IP}" == "6" ]]; then
local readonly SADDR="${SADDR6}"
local readonly DADDR="${DADDR6}"
+ PROTO=udp6
else
echo "Invalid IP version ${IP}"
exit 1
@@ -65,6 +70,7 @@ do_test() {
local readonly START="$(date +%s%N --date="+ 0.1 seconds")"
ip netns exec "${NS2}" "${BIN}" -"${IP}" -c "${CLOCK}" -t "${START}" -S "${SADDR}" -D "${DADDR}" "${RXARGS}" -r &
+ wait_local_port_listen "${NS2}" 8000 "${PROTO}"
ip netns exec "${NS1}" "${BIN}" -"${IP}" -c "${CLOCK}" -t "${START}" -S "${SADDR}" -D "${DADDR}" "${TXARGS}"
wait "$!"
}
--
2.43.0
This patch series is based on the RFC patch from Frederic [1]. Instead
of offering RCU_NOCB as a separate option, it is now lumped into a
root-only cpuset.cpus.isolation_full flag that will enable all the
additional CPU isolation capabilities available for isolated partitions
if set. RCU_NOCB is just the first one to this party. Additional dynamic
CPU isolation capabilities will be added in the future.
The first 2 patches are adopted from Federic with minor twists to fix
merge conflicts and compilation issue. The rests are for implementing
the new cpuset.cpus.isolation_full interface which is essentially a flag
to globally enable or disable full CPU isolation on isolated partitions.
On read, it also shows the CPU isolation capabilities that are currently
enabled. RCU_NOCB requires that the rcu_nocbs option be present in
the kernel boot command line. Without that, the rcu_nocb functionality
cannot be enabled even if the isolation_full flag is set. So we allow
users to check the isolation_full file to verify that if the desired
CPU isolation capability is enabled or not.
Only sanity checking has been done so far. More testing, especially on
the RCU side, will be needed.
[1] https://lore.kernel.org/lkml/20220525221055.1152307-1-frederic@kernel.org/
Frederic Weisbecker (2):
rcu/nocb: Pass a cpumask instead of a single CPU to offload/deoffload
rcu/nocb: Prepare to change nocb cpumask from CPU-hotplug protected
cpuset caller
Waiman Long (6):
rcu/no_cb: Add rcu_nocb_enabled() to expose the rcu_nocb state
cgroup/cpuset: Better tracking of addition/deletion of isolated CPUs
cgroup/cpuset: Add cpuset.cpus.isolation_full
cgroup/cpuset: Enable dynamic rcu_nocb mode on isolated CPUs
cgroup/cpuset: Document the new cpuset.cpus.isolation_full control
file
cgroup/cpuset: Update test_cpuset_prs.sh to handle
cpuset.cpus.isolation_full
Documentation/admin-guide/cgroup-v2.rst | 24 ++
include/linux/rcupdate.h | 15 +-
kernel/cgroup/cpuset.c | 237 ++++++++++++++----
kernel/rcu/rcutorture.c | 6 +-
kernel/rcu/tree_nocb.h | 118 ++++++---
.../selftests/cgroup/test_cpuset_prs.sh | 23 +-
6 files changed, 337 insertions(+), 86 deletions(-)
--
2.39.3
The altnames test uses the forwarding/lib.sh and that dependency
currently causes failures when running the test after install:
make -C tools/testing/selftests/ TARGETS=net install
./tools/testing/selftests/kselftest_install/run_kselftest.sh \
-t net:altnames.sh
# ...
# ./altnames.sh: line 8: ./forwarding/lib.sh: No such file or directory
# RTNETLINK answers: Operation not permitted
# ./altnames.sh: line 73: tests_run: command not found
# ./altnames.sh: line 65: pre_cleanup: command not found
Address the issue leveraging the TEST_INCLUDES infrastructure
provided by commit 2a0683be5b4c ("selftests: Introduce Makefile variable
to list shared bash scripts")
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
---
tools/testing/selftests/net/Makefile | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index 211753756bde..7b6918d5f4af 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -97,6 +97,8 @@ TEST_PROGS += vlan_hw_filter.sh
TEST_FILES := settings
TEST_FILES += in_netns.sh lib.sh net_helper.sh setup_loopback.sh setup_veth.sh
+TEST_INCLUDES := forwarding/lib.sh
+
include ../lib.mk
$(OUTPUT)/reuseport_bpf_numa: LDLIBS += -lnuma
--
2.43.0
Open vSwitch module accepts actions as a list from the netlink socket
and then creates a copy which it uses in the action set processing.
During processing of the action list on a packet, the module keeps a
count of the execution depth and exits processing if the action depth
goes too high.
However, during netlink processing the recursion depth isn't checked
anywhere, and the copy trusts that kernel has large enough stack to
accommodate it. The OVS sample action was the original action which
could perform this kinds of recursion, and it originally checked that
it didn't exceed the sample depth limit. However, when sample became
optimized to provide the clone() semantics, the recursion limit was
dropped.
This series adds a depth limit during the __ovs_nla_copy_actions() call
that will ensure we don't exceed the max that the OVS userspace could
generate for a clone().
Additionally, this series provides a selftest in 2/2 that can be used to
determine if the OVS module is allowing unbounded access. It can be
safely omitted where the ovs selftest framework isn't available.
Aaron Conole (2):
net: openvswitch: limit the number of recursions from action sets
selftests: openvswitch: Add validation for the recursion test
net/openvswitch/flow_netlink.c | 49 ++++++++-----
.../selftests/net/openvswitch/openvswitch.sh | 13 ++++
.../selftests/net/openvswitch/ovs-dpctl.py | 71 +++++++++++++++----
3 files changed, 102 insertions(+), 31 deletions(-)
--
2.41.0
Fix various problems in the forwarding selftests so that they will pass
in the netdev CI instead of being ignored. See commit messages for
details.
Ido Schimmel (4):
selftests: forwarding: Fix layer 2 miss test flakiness
selftests: forwarding: Fix bridge MDB test flakiness
selftests: forwarding: Suppress grep warnings
selftests: forwarding: Fix bridge locked port test flakiness
.../selftests/net/forwarding/bridge_locked_port.sh | 4 ++--
.../testing/selftests/net/forwarding/bridge_mdb.sh | 14 +++++++++-----
.../selftests/net/forwarding/tc_flower_l2_miss.sh | 8 ++++++--
3 files changed, 17 insertions(+), 9 deletions(-)
--
2.43.0
The two tests that make use of multicast routig (router.sh and
router_multicast.sh) are currently failing in the netdev CI because the
kernel is missing multicast routing support.
Fix by adding the required config entries.
Fixes: 6d4efada3b82 ("selftests: forwarding: Add multicast routing test")
Signed-off-by: Ido Schimmel <idosch(a)nvidia.com>
---
Targeting at net-next because this is where 4acf4e62cd57 ("selftests:
forwarding: Add missing config entries") was applied to, but you can
apply to net as well.
---
tools/testing/selftests/net/forwarding/config | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/tools/testing/selftests/net/forwarding/config b/tools/testing/selftests/net/forwarding/config
index ba2343514582..f59083d8c59d 100644
--- a/tools/testing/selftests/net/forwarding/config
+++ b/tools/testing/selftests/net/forwarding/config
@@ -2,7 +2,14 @@ CONFIG_BRIDGE=m
CONFIG_VLAN_8021Q=m
CONFIG_BRIDGE_VLAN_FILTERING=y
CONFIG_NET_L3_MASTER_DEV=y
+CONFIG_IP_MROUTE=y
+CONFIG_IP_MROUTE_MULTIPLE_TABLES=y
+CONFIG_IP_PIMSM_V1=y
+CONFIG_IP_PIMSM_V2=y
+CONFIG_IPV6_MROUTE=y
+CONFIG_IPV6_MROUTE_MULTIPLE_TABLES=y
CONFIG_IPV6_MULTIPLE_TABLES=y
+CONFIG_IPV6_PIMSM_V2=y
CONFIG_NET_VRF=m
CONFIG_BPF_SYSCALL=y
CONFIG_CGROUP_BPF=y
--
2.43.0
A couple of small updates for the check_compaction selftest which make
it play more nicely with test automation systems.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Mark Brown (2):
selftests/mm: Log skipped compaction test as a skip
selftests/mm: Log a consistent test name for check_compaction
tools/testing/selftests/mm/compaction_test.c | 37 +++++++++++++++-------------
1 file changed, 20 insertions(+), 17 deletions(-)
---
base-commit: b1d3a0e70c3881d2f8cf6692ccf7c2a4fb2d030d
change-id: 20240208-kselftest-mm-cleanup-30edd2e567cb
Best regards,
--
Mark Brown <broonie(a)kernel.org>
I encountered the following build errors while compiling the selftests net
test cases on Linux next-20240208 tag with clang toolchain.
Reported-by: Linux Kernel Functional Testing <lkft(a)linaro.org>
selftests/net/ip_local_port_range
ip_local_port_range.c:152:17: error: use of undeclared identifier
'IPPROTO_MPTCP'
152 | .so_protocol = IPPROTO_MPTCP,
| ^
ip_local_port_range.c:176:17: error: use of undeclared identifier
'IPPROTO_MPTCP'
176 | .so_protocol = IPPROTO_MPTCP,
| ^
2 errors generated.
Build link,
- https://storage.tuxsuite.com/public/linaro/lkft/builds/2c4LtUoRSYhdGbErOY8h…
--
Linaro LKFT
https://lkft.linaro.org