Isolated CPUs are not allowed to be used in a non-isolated partition.
The only exception is the top cpuset which is allowed to contain boot
time isolated CPUs.
Commit ccac8e8de99c ("cgroup/cpuset: Fix remote root partition creation
problem") introduces a simplified scheme of including only partition
roots in sched domain generation. However, it does not properly account
for this exception case. This can result in leakage of isolated CPUs
into a sched domain.
Fix it by making sure that isolated CPUs are excluded from the top
cpuset before generating sched domains.
Also update the way the boot time isolated CPUs are handled in
test_cpuset_prs.sh to make sure that those isolated CPUs are really
isolated instead of just skipping them in the tests.
Fixes: ccac8e8de99c ("cgroup/cpuset: Fix remote root partition creation problem")
Signed-off-by: Waiman Long <longman(a)redhat.com>
---
kernel/cgroup/cpuset.c | 10 +++++-
.../selftests/cgroup/test_cpuset_prs.sh | 33 +++++++++++--------
2 files changed, 28 insertions(+), 15 deletions(-)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index f321ed515f3a..33b264c3e258 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -890,7 +890,15 @@ static int generate_sched_domains(cpumask_var_t **domains,
*/
if (cgrpv2) {
for (i = 0; i < ndoms; i++) {
- cpumask_copy(doms[i], csa[i]->effective_cpus);
+ /*
+ * The top cpuset may contain some boot time isolated
+ * CPUs that need to be excluded from the sched domain.
+ */
+ if (csa[i] == &top_cpuset)
+ cpumask_and(doms[i], csa[i]->effective_cpus,
+ housekeeping_cpumask(HK_TYPE_DOMAIN));
+ else
+ cpumask_copy(doms[i], csa[i]->effective_cpus);
if (dattr)
dattr[i] = SD_ATTR_INIT;
}
diff --git a/tools/testing/selftests/cgroup/test_cpuset_prs.sh b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
index 03c1bdaed2c3..400a696a0d21 100755
--- a/tools/testing/selftests/cgroup/test_cpuset_prs.sh
+++ b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
@@ -86,15 +86,15 @@ echo "" > test/cpuset.cpus
#
# If isolated CPUs have been reserved at boot time (as shown in
-# cpuset.cpus.isolated), these isolated CPUs should be outside of CPUs 0-7
+# cpuset.cpus.isolated), these isolated CPUs should be outside of CPUs 0-8
# that will be used by this script for testing purpose. If not, some of
-# the tests may fail incorrectly. These isolated CPUs will also be removed
-# before being compared with the expected results.
+# the tests may fail incorrectly. These pre-isolated CPUs should stay in
+# an isolated state throughout the testing process for now.
#
BOOT_ISOLCPUS=$(cat $CGROUP2/cpuset.cpus.isolated)
if [[ -n "$BOOT_ISOLCPUS" ]]
then
- [[ $(echo $BOOT_ISOLCPUS | sed -e "s/[,-].*//") -le 7 ]] &&
+ [[ $(echo $BOOT_ISOLCPUS | sed -e "s/[,-].*//") -le 8 ]] &&
skip_test "Pre-isolated CPUs ($BOOT_ISOLCPUS) overlap CPUs to be tested"
echo "Pre-isolated CPUs: $BOOT_ISOLCPUS"
fi
@@ -683,15 +683,19 @@ check_isolcpus()
EXPECT_VAL2=$EXPECT_VAL
fi
+ #
+ # Appending pre-isolated CPUs
+ # Even though CPU #8 isn't used for testing, it can't be pre-isolated
+ # to make appending those CPUs easier.
+ #
+ [[ -n "$BOOT_ISOLCPUS" ]] && {
+ EXPECT_VAL=${EXPECT_VAL:+${EXPECT_VAL},}${BOOT_ISOLCPUS}
+ EXPECT_VAL2=${EXPECT_VAL2:+${EXPECT_VAL2},}${BOOT_ISOLCPUS}
+ }
+
#
# Check cpuset.cpus.isolated cpumask
#
- if [[ -z "$BOOT_ISOLCPUS" ]]
- then
- ISOLCPUS=$(cat $ISCPUS)
- else
- ISOLCPUS=$(cat $ISCPUS | sed -e "s/,*$BOOT_ISOLCPUS//")
- fi
[[ "$EXPECT_VAL2" != "$ISOLCPUS" ]] && {
# Take a 50ms pause and try again
pause 0.05
@@ -731,8 +735,6 @@ check_isolcpus()
fi
done
[[ "$ISOLCPUS" = *- ]] && ISOLCPUS=${ISOLCPUS}$LASTISOLCPU
- [[ -n "BOOT_ISOLCPUS" ]] &&
- ISOLCPUS=$(echo $ISOLCPUS | sed -e "s/,*$BOOT_ISOLCPUS//")
[[ "$EXPECT_VAL" = "$ISOLCPUS" ]]
}
@@ -836,8 +838,11 @@ run_state_test()
# if available
[[ -n "$ICPUS" ]] && {
check_isolcpus $ICPUS
- [[ $? -ne 0 ]] && test_fail $I "isolated CPU" \
- "Expect $ICPUS, get $ISOLCPUS instead"
+ [[ $? -ne 0 ]] && {
+ [[ -n "$BOOT_ISOLCPUS" ]] && ICPUS=${ICPUS},${BOOT_ISOLCPUS}
+ test_fail $I "isolated CPU" \
+ "Expect $ICPUS, get $ISOLCPUS instead"
+ }
}
reset_cgroup_states
#
--
2.47.1
Hi,
This series carries forward the effort to add Kselftest for PCI Endpoint
Subsystem started by Aman Gupta [1] a while ago. I reworked the initial version
based on another patch that fixes the return values of IOCTLs in
pci_endpoint_test driver and did many cleanups. Since the resulting work
modified the initial version substantially, I took over the authorship.
This series also incorporates the review comment by Shuah Khan [2] to move the
existing tests from 'tools/pci' to 'tools/testing/kselftest/pci_endpoint' before
migrating to Kselftest framework. I made sure that the tests are executable in
each commit and updated documentation accordingly.
NOTE: Patch 1 is strictly not related to this series, but necessary to execute
Kselftests with Qualcomm Endpoint devices. So this can be merged separately.
- Mani
[1] https://lore.kernel.org/linux-pci/20221007053934.5188-1-aman1.gupta@samsung…
[2] https://lore.kernel.org/linux-pci/b2a5db97-dc59-33ab-71cd-f591e0b1b34d@linu…
Changes in v2:
* Added a patch that fixes return values of IOCTL in pci_endpoint_test driver
* Moved the existing tests to new location before migrating
* Added a fix for BARs on Qcom devices
* Updated documentation and also added fixture variants for memcpy & DMA modes
Manivannan Sadhasivam (4):
PCI: qcom-ep: Mark BAR0/BAR2 as 64bit BARs and BAR1/BAR3 as RESERVED
misc: pci_endpoint_test: Fix the return value of IOCTL
selftests: Move PCI Endpoint tests from tools/pci to Kselftests
selftests: pci_endpoint: Migrate to Kselftest framework
Documentation/PCI/endpoint/pci-test-howto.rst | 144 +++-------
MAINTAINERS | 2 +-
drivers/misc/pci_endpoint_test.c | 236 ++++++++---------
drivers/pci/controller/dwc/pcie-qcom-ep.c | 4 +
tools/pci/Build | 1 -
tools/pci/Makefile | 58 ----
tools/pci/pcitest.c | 250 ------------------
tools/pci/pcitest.sh | 72 -----
tools/testing/selftests/Makefile | 1 +
.../testing/selftests/pci_endpoint/.gitignore | 2 +
tools/testing/selftests/pci_endpoint/Makefile | 7 +
tools/testing/selftests/pci_endpoint/config | 4 +
.../pci_endpoint/pci_endpoint_test.c | 186 +++++++++++++
13 files changed, 365 insertions(+), 602 deletions(-)
delete mode 100644 tools/pci/Build
delete mode 100644 tools/pci/Makefile
delete mode 100644 tools/pci/pcitest.c
delete mode 100644 tools/pci/pcitest.sh
create mode 100644 tools/testing/selftests/pci_endpoint/.gitignore
create mode 100644 tools/testing/selftests/pci_endpoint/Makefile
create mode 100644 tools/testing/selftests/pci_endpoint/config
create mode 100644 tools/testing/selftests/pci_endpoint/pci_endpoint_test.c
--
2.25.1
When compiling the pointer masking tests with -Wall this warning
is present:
pointer_masking.c: In function ‘test_tagged_addr_abi_sysctl’:
pointer_masking.c:203:9: warning: ignoring return value of ‘pwrite’
declared with attribute ‘warn_unused_result’ [-Wunused-result]
203 | pwrite(fd, &value, 1, 0); |
^~~~~~~~~~~~~~~~~~~~~~~~ pointer_masking.c:208:9: warning:
ignoring return value of ‘pwrite’ declared with attribute
‘warn_unused_result’ [-Wunused-result]
208 | pwrite(fd, &value, 1, 0);
I came across this on riscv64-linux-gnu-gcc (Ubuntu
11.4.0-1ubuntu1~22.04).
Fix this by checking that the number of bytes written equal the expected
number of bytes written.
Fixes: 7470b5afd150 ("riscv: selftests: Add a pointer masking test")
Signed-off-by: Charlie Jenkins <charlie(a)rivosinc.com>
---
Changes in v5:
- No longer skip second pwrite if first one fails
- Use wrapper function instead of goto (Drew)
- Link to v4: https://lore.kernel.org/r/20241205-fix_warnings_pointer_masking_tests-v4-1-…
Changes in v4:
- Skip sysctl_enabled test if first pwrite failed
- Link to v3: https://lore.kernel.org/r/20241205-fix_warnings_pointer_masking_tests-v3-1-…
Changes in v3:
- Fix sysctl enabled test case (Drew/Alex)
- Move pwrite err condition into goto (Drew)
- Link to v2: https://lore.kernel.org/r/20241204-fix_warnings_pointer_masking_tests-v2-1-…
Changes in v2:
- I had ret != 2 for testing, I changed it to be ret != 1.
- Link to v1: https://lore.kernel.org/r/20241204-fix_warnings_pointer_masking_tests-v1-1-…
---
.../testing/selftests/riscv/abi/pointer_masking.c | 22 ++++++++++++++++------
1 file changed, 16 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/riscv/abi/pointer_masking.c b/tools/testing/selftests/riscv/abi/pointer_masking.c
index dee41b7ee3e3..50c4d1bc7570 100644
--- a/tools/testing/selftests/riscv/abi/pointer_masking.c
+++ b/tools/testing/selftests/riscv/abi/pointer_masking.c
@@ -185,8 +185,20 @@ static void test_fork_exec(void)
}
}
+static bool pwrite_wrapper(int fd, void *buf, size_t count, const char *msg)
+{
+ int ret = pwrite(fd, buf, count, 0);
+
+ if (ret != count) {
+ ksft_perror(msg);
+ return false;
+ }
+ return true;
+}
+
static void test_tagged_addr_abi_sysctl(void)
{
+ char *err_pwrite_msg = "failed to write to /proc/sys/abi/tagged_addr_disabled\n";
char value;
int fd;
@@ -200,14 +212,12 @@ static void test_tagged_addr_abi_sysctl(void)
}
value = '1';
- pwrite(fd, &value, 1, 0);
- ksft_test_result(set_tagged_addr_ctrl(min_pmlen, true) == -EINVAL,
- "sysctl disabled\n");
+ if (!pwrite_wrapper(fd, &value, 1, "write '1'"))
+ ksft_test_result_fail(err_pwrite_msg);
value = '0';
- pwrite(fd, &value, 1, 0);
- ksft_test_result(set_tagged_addr_ctrl(min_pmlen, true) == 0,
- "sysctl enabled\n");
+ if (!pwrite_wrapper(fd, &value, 1, "write '0'"))
+ ksft_test_result_fail(err_pwrite_msg);
set_tagged_addr_ctrl(0, false);
---
base-commit: 40384c840ea1944d7c5a392e8975ed088ecf0b37
change-id: 20241204-fix_warnings_pointer_masking_tests-3860e4f35429
--
- Charlie
Compiled binary files should be added to .gitignore
'git status' complains:
Untracked files:
(use "git add <file>..." to include in what will be committed)
filesystems/statmount/statmount_test_ns
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: Miklos Szeredi <mszeredi(a)redhat.com>
Cc: Josef Bacik <josef(a)toxicpanda.com>
Signed-off-by: Li Zhijian <lizhijian(a)fujitsu.com>
---
Hello,
Cover letter is here.
This patch set aims to make 'git status' clear after 'make' and 'make
run_tests' for kselftests.
---
V3:
sorted the ignored files
V2:
split as a separate patch from a small one [0]
[0] https://lore.kernel.org/linux-kselftest/20241015010817.453539-1-lizhijian@f…
Signed-off-by: Li Zhijian <lizhijian(a)fujitsu.com>
---
tools/testing/selftests/filesystems/statmount/.gitignore | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/filesystems/statmount/.gitignore b/tools/testing/selftests/filesystems/statmount/.gitignore
index 82a4846cbc4b..973363ad66a2 100644
--- a/tools/testing/selftests/filesystems/statmount/.gitignore
+++ b/tools/testing/selftests/filesystems/statmount/.gitignore
@@ -1,2 +1,3 @@
# SPDX-License-Identifier: GPL-2.0-only
+statmount_test_ns
/*_test
--
2.44.0
After `make run_tests`, the git status complains:
Untracked files:
(use "git add <file>..." to include in what will be committed)
zram/err.log
This file will be cleaned up when execute 'make clean'
Cc: Shuah Khan <shuah(a)kernel.org>
Signed-off-by: Li Zhijian <lizhijian(a)fujitsu.com>
---
Hello,
Cover letter is here.
This patch set aims to make 'git status' clear after 'make' and 'make
run_tests' for kselftests.
---
V3:
Add Copyright description
V2:
split as a separate patch from a small one [0]
[0] https://lore.kernel.org/linux-kselftest/20241015010817.453539-1-lizhijian@f…
Signed-off-by: Li Zhijian <lizhijian(a)fujitsu.com>
---
tools/testing/selftests/zram/.gitignore | 2 ++
1 file changed, 2 insertions(+)
create mode 100644 tools/testing/selftests/zram/.gitignore
diff --git a/tools/testing/selftests/zram/.gitignore b/tools/testing/selftests/zram/.gitignore
new file mode 100644
index 000000000000..088cd9bad87a
--- /dev/null
+++ b/tools/testing/selftests/zram/.gitignore
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+err.log
--
2.44.0
Compiled binary files should be added to .gitignore
'git status' complains:
Untracked files:
(use "git add <file>..." to include in what will be committed)
filesystems/statmount/statmount_test_ns
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: Miklos Szeredi <mszeredi(a)redhat.com>
Cc: Josef Bacik <josef(a)toxicpanda.com>
Reviewed-by: Charlie Jenkins <charlie(a)rivosinc.com>
Tested-by: Charlie Jenkins <charlie(a)rivosinc.com>
Signed-off-by: Li Zhijian <lizhijian(a)fujitsu.com>
---
Hello,
Cover letter is here.
This patch set aims to make 'git status' clear after 'make' and 'make
run_tests' for kselftests.
---
V4:
Collect Reviewed-by and Tested-by from Charlie, many thanks
Remove the duplicate Signed-off-by # Shuah
V3:
sorted the ignored files
V2:
split as a separate patch from a small one [0]
[0] https://lore.kernel.org/linux-kselftest/20241015010817.453539-1-lizhijian@f…
---
tools/testing/selftests/filesystems/statmount/.gitignore | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/filesystems/statmount/.gitignore b/tools/testing/selftests/filesystems/statmount/.gitignore
index 82a4846cbc4b..973363ad66a2 100644
--- a/tools/testing/selftests/filesystems/statmount/.gitignore
+++ b/tools/testing/selftests/filesystems/statmount/.gitignore
@@ -1,2 +1,3 @@
# SPDX-License-Identifier: GPL-2.0-only
+statmount_test_ns
/*_test
--
2.44.0
After `make run_tests`, the git status complains:
Untracked files:
(use "git add <file>..." to include in what will be committed)
zram/err.log
This file will be cleaned up when execute 'make clean'
Cc: Shuah Khan <shuah(a)kernel.org>
Signed-off-by: Li Zhijian <lizhijian(a)fujitsu.com>
---
Hello,
Cover letter is here.
This patch set aims to make 'git status' clear after 'make' and 'make
run_tests' for kselftests.
---
V4:
Remove duplicate Signed-off-by # Shuah
V3:
Add Copyright description
V2:
split as a separate patch from a small one [0]
[0] https://lore.kernel.org/linux-kselftest/20241015010817.453539-1-lizhijian@f…
---
tools/testing/selftests/zram/.gitignore | 2 ++
1 file changed, 2 insertions(+)
create mode 100644 tools/testing/selftests/zram/.gitignore
diff --git a/tools/testing/selftests/zram/.gitignore b/tools/testing/selftests/zram/.gitignore
new file mode 100644
index 000000000000..088cd9bad87a
--- /dev/null
+++ b/tools/testing/selftests/zram/.gitignore
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+err.log
--
2.44.0
This adds support for receiving KeyUpdate messages (RFC 8446, 4.6.3
[1]). A sender transmits a KeyUpdate message and then changes its TX
key. The receiver should react by updating its RX key before
processing the next message.
This patchset implements key updates by:
1. pausing decryption when a KeyUpdate message is received, to avoid
attempting to use the old key to decrypt a record encrypted with
the new key
2. returning -EKEYEXPIRED to syscalls that cannot receive the
KeyUpdate message, until the rekey has been performed by userspace
3. passing the KeyUpdate message to userspace as a control message
4. allowing updates of the crypto_info via the TLS_TX/TLS_RX
setsockopts
This API has been tested with gnutls to make sure that it allows
userspace libraries to implement key updates [2]. Thanks to Frantisek
Krenzelok <fkrenzel(a)redhat.com> for providing the implementation in
gnutls and testing the kernel patches.
=======================================================================
Discussions around v2 of this patchset focused on how HW offload would
interact with rekey.
RX
- The existing SW path will handle all records between the KeyUpdate
message signaling the change of key and the new key becoming known
to the kernel -- those will be queued encrypted, and decrypted in
SW as they are read by userspace (once the key is provided, ie same
as this patchset)
- Call ->tls_dev_del + ->tls_dev_add immediately during
setsockopt(TLS_RX)
TX
- After setsockopt(TLS_TX), switch to the existing SW path (not the
current device_fallback) until we're able to re-enable HW offload
- tls_device_sendmsg will call into tls_sw_sendmsg under lock_sock
to avoid changing socket ops during the rekey while another
thread might be waiting on the lock
- We only re-enable HW offload (call ->tls_dev_add to install the new
key in HW) once all records sent with the old key have been
ACKed. At this point, all unacked records are SW-encrypted with the
new key, and the old key is unused by both HW and retransmissions.
- If there are no unacked records when userspace does
setsockopt(TLS_TX), we can (try to) install the new key in HW
immediately.
- If yet another key has been provided via setsockopt(TLS_TX), we
don't install intermediate keys, only the latest.
- TCP notifies ktls of ACKs via the icsk_clean_acked callback. In
case of a rekey, tls_icsk_clean_acked will record when all data
sent with the most recent past key has been sent. The next call
to sendmsg will install the new key in HW.
- We close and push the current SW record before reenabling
offload.
If ->tls_dev_add fails to install the new key in HW, we stay in SW
mode. We can add a counter to keep track of this.
In addition:
Because we can't change socket ops during a rekey, we'll also have to
modify do_tls_setsockopt_conf to check ctx->tx_conf and only call
either tls_set_device_offload or tls_set_sw_offload. RX already uses
the same ops for both TLS_HW and TLS_SW, so we could switch between HW
and SW mode on rekey.
An alternative would be to have a common sendmsg which locks
the socket and then calls the correct implementation. We'll need that
anyway for the offload under rekey case, so that would only add a test
to the SW path's ops (compared to the current code). That should allow
us to simplify build_protos a bit, but might have a performance
impact - we'll need to check it if we want to go that route.
=======================================================================
Changes since v3:
- rebase on top of net-next
- rework tls_check_pending_rekey according to Jakub's feedback
- add statistics for rekey: {RX,TX}REKEY{OK,ERROR}
- some coding style clean ups
Link: https://lore.kernel.org/netdev/cover.1691584074.git.sd@queasysnail.net/ [v3]
Link: https://lore.kernel.org/netdev/cover.1676052788.git.sd@queasysnail.net/ [v2]
Link: https://lore.kernel.org/netdev/cover.1673952268.git.sd@queasysnail.net/ [v1]
Link: https://www.rfc-editor.org/rfc/rfc8446#section-4.6.3 [1]
Link: https://gitlab.com/gnutls/gnutls/-/merge_requests/1625 [2]
Sabrina Dubroca (6):
tls: block decryption when a rekey is pending
tls: implement rekey for TLS1.3
tls: add counters for rekey
docs: tls: document TLS1.3 key updates
selftests: tls: add key_generation argument to tls_crypto_info_init
selftests: tls: add rekey tests
Documentation/networking/tls.rst | 31 ++
include/net/tls.h | 3 +
include/uapi/linux/snmp.h | 4 +
net/tls/tls.h | 3 +-
net/tls/tls_device.c | 2 +-
net/tls/tls_main.c | 71 ++++-
net/tls/tls_proc.c | 4 +
net/tls/tls_sw.c | 138 +++++++--
tools/testing/selftests/net/tls.c | 480 +++++++++++++++++++++++++++++-
9 files changed, 676 insertions(+), 60 deletions(-)
--
2.47.0
Series takes care of two issues with sockmap update: inconsistent behaviour
after update with same, and race/refcount imbalance on element replace.
I am hesitant if patch 3/3 ("bpf, sockmap: Fix race between element replace
and close()") is the right approach. I might have missed some detail of the
current __sock_map_delete() implementation. I'd be grateful for comments,
thanks.
Signed-off-by: Michal Luczaj <mhal(a)rbox.co>
---
Michal Luczaj (3):
bpf, sockmap: Fix update element with same
selftest/bpf: Extend test for sockmap update with same
bpf, sockmap: Fix race between element replace and close()
net/core/sock_map.c | 6 +++---
tools/testing/selftests/bpf/prog_tests/sockmap_basic.c | 8 +++++---
2 files changed, 8 insertions(+), 6 deletions(-)
---
base-commit: 537a2525eaf76ea9b0dca62b994500d8670b39d5
change-id: 20241201-sockmap-replace-67c7077f3a31
Best regards,
--
Michal Luczaj <mhal(a)rbox.co>
Context
=======
We've observed within Red Hat that isolated, NOHZ_FULL CPUs running a
pure-userspace application get regularly interrupted by IPIs sent from
housekeeping CPUs. Those IPIs are caused by activity on the housekeeping CPUs
leading to various on_each_cpu() calls, e.g.:
64359.052209596 NetworkManager 0 1405 smp_call_function_many_cond (cpu=0, func=do_kernel_range_flush)
smp_call_function_many_cond+0x1
smp_call_function+0x39
on_each_cpu+0x2a
flush_tlb_kernel_range+0x7b
__purge_vmap_area_lazy+0x70
_vm_unmap_aliases.part.42+0xdf
change_page_attr_set_clr+0x16a
set_memory_ro+0x26
bpf_int_jit_compile+0x2f9
bpf_prog_select_runtime+0xc6
bpf_prepare_filter+0x523
sk_attach_filter+0x13
sock_setsockopt+0x92c
__sys_setsockopt+0x16a
__x64_sys_setsockopt+0x20
do_syscall_64+0x87
entry_SYSCALL_64_after_hwframe+0x65
The heart of this series is the thought that while we cannot remove NOHZ_FULL
CPUs from the list of CPUs targeted by these IPIs, they may not have to execute
the callbacks immediately. Anything that only affects kernelspace can wait
until the next user->kernel transition, providing it can be executed "early
enough" in the entry code.
The original implementation is from Peter [1]. Nicolas then added kernel TLB
invalidation deferral to that [2], and I picked it up from there.
Deferral approach
=================
Storing each and every callback, like a secondary call_single_queue turned out
to be a no-go: the whole point of deferral is to keep NOHZ_FULL CPUs in
userspace for as long as possible - no signal of any form would be sent when
deferring an IPI. This means that any form of queuing for deferred callbacks
would end up as a convoluted memory leak.
Deferred IPIs must thus be coalesced, which this series achieves by assigning
IPIs a "type" and having a mapping of IPI type to callback, leveraged upon
kernel entry.
What about IPIs whose callback take a parameter, you may ask?
Peter suggested during OSPM23 [3] that since on_each_cpu() targets
housekeeping CPUs *and* isolated CPUs, isolated CPUs can access either global or
housekeeping-CPU-local state to "reconstruct" the data that would have been sent
via the IPI.
This series does not affect any IPI callback that requires an argument, but the
approach would remain the same (one coalescable callback executed on kernel
entry).
Kernel entry vs execution of the deferred operation
===================================================
This is what I've referred to as the "Danger Zone" during my LPC24 talk [4].
There is a non-zero length of code that is executed upon kernel entry before the
deferred operation can be itself executed (i.e. before we start getting into
context_tracking.c proper), i.e.:
idtentry_func_foo() <--- we're in the kernel
irqentry_enter()
enter_from_user_mode()
__ct_user_exit()
ct_kernel_enter_state()
ct_work_flush() <--- deferred operation is executed here
This means one must take extra care to what can happen in the early entry code,
and that <bad things> cannot happen. For instance, we really don't want to hit
instructions that have been modified by a remote text_poke() while we're on our
way to execute a deferred sync_core(). Patches doing the actual deferral have
more detail on this.
Patches
=======
o Patches 1-3 are standalone cleanups.
o Patches 4-5 add an RCU testing feature.
o Patches 6-8 add a new type of jump label for static keys that will not have
their IPI be deferred.
o Patch 9 adds objtool verification of static keys vs their text_poke IPI
deferral
o Patches 10-14 add the actual IPI deferrals
o Patch 15 is a freebie to enable the deferral feature for NO_HZ_IDLE
Patches are also available at:
https://gitlab.com/vschneid/linux.git -b redhat/isolirq/defer/v3
RFC status
==========
Things I'd like to get comments on and/or that are a bit WIPish; they're called
out in the individual changelogs:
o "forceful" jump label naming which I don't particularly like
o objtool usage of 'offset_of(static_key.type)' and JUMP_TYPE_FORCEFUL. I've
hardcoded them but it could do with being shoved in a kernel header objtool
can include directly
o The noinstr variant of __flush_tlb_all() doesn't have a paravirt variant, does
it need one?
Testing
=======
Xeon E5-2699 system with SMToff, NOHZ_FULL, isolated CPUs.
RHEL9 userspace.
Workload is using rteval (kernel compilation + hackbench) on housekeeping CPUs
and a dummy stay-in-userspace loop on the isolated CPUs. The main invocation is:
$ trace-cmd record -e "csd_queue_cpu" -f "cpu & CPUS{$ISOL_CPUS}" \
-e "ipi_send_cpumask" -f "cpumask & CPUS{$ISOL_CPUS}" \
-e "ipi_send_cpu" -f "cpu & CPUS{$ISOL_CPUS}" \
rteval --onlyload --loads-cpulist=$HK_CPUS \
--hackbench-runlowmem=True --duration=$DURATION
This only records IPIs sent to isolated CPUs, so any event there is interference
(with a bit of fuzz at the start/end of the workload when spawning the
processes). All tests were done with a duration of 1hr.
v6.12-rc4
# This is the actual IPI count
$ trace-cmd report trace-base.dat | grep callback | awk '{ print $(NF) }' | sort | uniq -c | sort -nr
1782 callback=generic_smp_call_function_single_interrupt+0x0
73 callback=0x0
# These are the different CSD's that caused IPIs
$ trace-cmd report | grep csd_queue | awk '{ print $(NF-1) }' | sort | uniq -c | sort -nr
22048 func=tlb_remove_table_smp_sync
16536 func=do_sync_core
2262 func=do_flush_tlb_all
182 func=do_kernel_range_flush
144 func=rcu_exp_handler
60 func=sched_ttwu_pending
v6.12-rc4 + patches:
# This is the actual IPI count
$ trace-cmd report | grep callback | awk '{ print $(NF) }' | sort | uniq -c | sort -nr
1168 callback=generic_smp_call_function_single_interrupt+0x0
74 callback=0x0
# These are the different CSD's that caused IPIs
$ trace-cmd report | grep csd_queue | awk '{ print $(NF-1) }' | sort | uniq -c | sort -nr
23686 func=tlb_remove_table_smp_sync
192 func=rcu_exp_handler
65 func=sched_ttwu_pending
Interestingly tlb_remove_table_smp_sync() started showing up on this machine,
while it didn't during testing for v2 and it's the same machine. Yair had a
series adressing this [5] which per these results would be worth revisiting.
Acknowledgements
================
Special thanks to:
o Clark Williams for listening to my ramblings about this and throwing ideas my way
o Josh Poimboeuf for his guidance regarding objtool and hinting at the
.data..ro_after_init section.
o All of the folks who attended various talks about this and provided precious
feedback.
Links
=====
[1]: https://lore.kernel.org/all/20210929151723.162004989@infradead.org/
[2]: https://github.com/vianpl/linux.git -b ct-work-defer-wip
[3]: https://youtu.be/0vjE6fjoVVE
[4]: https://lpc.events/event/18/contributions/1889/
[5]: https://lore.kernel.org/lkml/20230620144618.125703-1-ypodemsk@redhat.com/
Revisions
=========
RFCv2 -> RFCv3
+++++++++++
o Rebased onto v6.12-rc7
o Added objtool documentation for the new warning (Josh)
o Added low-size RCU watching counter to TREE04 torture scenario (Paul)
o Added FORCEFUL jump label and static key types
o Added noinstr-compliant helpers for tlb flush deferral
o Overall changelog & comments cleanup
RFCv1 -> RFCv2
++++++++++++++
o Rebased onto v6.5-rc1
o Updated the trace filter patches (Steven)
o Fixed __ro_after_init keys used in modules (Peter)
o Dropped the extra context_tracking atomic, squashed the new bits in the
existing .state field (Peter, Frederic)
o Added an RCU_EXPERT config for the RCU dynticks counter size, and added an
rcutorture case for a low-size counter (Paul)
o Fixed flush_tlb_kernel_range_deferrable() definition
Valentin Schneider (15):
objtool: Make validate_call() recognize indirect calls to pv_ops[]
objtool: Flesh out warning related to pv_ops[] calls
sched/clock: Make sched_clock_running __ro_after_init
rcu: Add a small-width RCU watching counter debug option
rcutorture: Make TREE04 use CONFIG_RCU_DYNTICKS_TORTURE
jump_label: Add forceful jump label type
x86/speculation/mds: Make mds_idle_clear forceful
sched/clock, x86: Make __sched_clock_stable forceful
objtool: Warn about non __ro_after_init static key usage in .noinstr
x86/alternatives: Record text_poke's of JUMP_TYPE_FORCEFUL labels
context-tracking: Introduce work deferral infrastructure
context_tracking,x86: Defer kernel text patching IPIs
context_tracking,x86: Add infrastructure to defer kernel TLBI
x86/mm, mm/vmalloc: Defer flush_tlb_kernel_range() targeting NOHZ_FULL
CPUs
context-tracking: Add a Kconfig to enable IPI deferral for NO_HZ_IDLE
arch/Kconfig | 9 +++
arch/x86/Kconfig | 1 +
arch/x86/include/asm/context_tracking_work.h | 20 +++++++
arch/x86/include/asm/special_insns.h | 1 +
arch/x86/include/asm/text-patching.h | 13 ++++-
arch/x86/include/asm/tlbflush.h | 17 +++++-
arch/x86/kernel/alternative.c | 49 ++++++++++++----
arch/x86/kernel/cpu/bugs.c | 2 +-
arch/x86/kernel/cpu/common.c | 6 +-
arch/x86/kernel/jump_label.c | 7 ++-
arch/x86/kernel/kprobes/core.c | 4 +-
arch/x86/kernel/kprobes/opt.c | 4 +-
arch/x86/kernel/module.c | 2 +-
arch/x86/mm/tlb.c | 49 ++++++++++++++--
include/linux/context_tracking.h | 21 +++++++
include/linux/context_tracking_state.h | 54 ++++++++++++++---
include/linux/context_tracking_work.h | 28 +++++++++
include/linux/jump_label.h | 26 ++++++---
kernel/context_tracking.c | 46 ++++++++++++++-
kernel/rcu/Kconfig.debug | 14 +++++
kernel/sched/clock.c | 4 +-
kernel/time/Kconfig | 19 ++++++
mm/vmalloc.c | 35 +++++++++--
tools/objtool/Documentation/objtool.txt | 13 +++++
tools/objtool/check.c | 58 ++++++++++++++++---
tools/objtool/include/objtool/check.h | 1 +
tools/objtool/include/objtool/special.h | 2 +
tools/objtool/special.c | 3 +
.../selftests/rcutorture/configs/rcu/TREE04 | 1 +
29 files changed, 450 insertions(+), 59 deletions(-)
create mode 100644 arch/x86/include/asm/context_tracking_work.h
create mode 100644 include/linux/context_tracking_work.h
--
2.43.0
When compiling these selftests the host-tools directory is generated.
Add it to the .gitignore so git doesn't see these files as trackable.
Signed-off-by: Charlie Jenkins <charlie(a)rivosinc.com>
---
tools/testing/selftests/hid/.gitignore | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/hid/.gitignore b/tools/testing/selftests/hid/.gitignore
index 746c62361f77..933f483815b2 100644
--- a/tools/testing/selftests/hid/.gitignore
+++ b/tools/testing/selftests/hid/.gitignore
@@ -1,5 +1,6 @@
bpftool
*.skel.h
+/host-tools
/tools
hid_bpf
hidraw
---
base-commit: 40384c840ea1944d7c5a392e8975ed088ecf0b37
change-id: 20241206-host_tools_gitignore-8a89f8820a61
--
- Charlie
This patch series adds more test case issuing ioctls to ucontrol VMs and
its floating interrupt controller.
The test cases trigger three possible null pointer dereferences within
the handling of the KVM_DEV_FLIC_APF_ENABLE,
KVM_DEV_FLIC_APF_DISABLE_WAIT and KVM_SET_GSI_ROUTING ioctl.
All of these issues do only exist on ucontrol VMs. Fixes for the issues
are included within the patch series.
Christoph Schlameuss (6):
kvm: s390: Reject setting flic pfault attributes on ucontrol VMs
selftests: kvm: s390: Add ucontrol flic attr selftests
kvm: s390: Reject KVM_SET_GSI_ROUTING on ucontrol VMs
selftests: kvm: s390: Add ucontrol gis routing test
selftests: kvm: s390: Streamline uc_skey test to issue iske after sske
selftests: kvm: s390: Add has device attr check to uc_attr_mem_limit
selftest
arch/s390/kvm/interrupt.c | 6 +
.../selftests/kvm/s390x/ucontrol_test.c | 196 ++++++++++++++++--
2 files changed, 184 insertions(+), 18 deletions(-)
--
2.47.1
With CONFIG_KPROBES_ON_FTRACE enabled on powerpc, ftrace_location_range
returns ftrace location for bpf_fentry_test1 at offset of 4 bytes from
function entry. This is because branch to _mcount function is at offset
of 4 bytes in function profile sequence.
To fix this, add entry_offset of 4 bytes while verifying the address for
kprobe entry address of bpf_fentry_test1 in verify_perf_link_info in
selftest, when CONFIG_KPROBES_ON_FTRACE is enabled.
Disassemble of bpf_fentry_test1:
c000000000e4b080 <bpf_fentry_test1>:
c000000000e4b080: a6 02 08 7c mflr r0
c000000000e4b084: b9 e2 22 4b bl c00000000007933c <_mcount>
c000000000e4b088: 01 00 63 38 addi r3,r3,1
c000000000e4b08c: b4 07 63 7c extsw r3,r3
c000000000e4b090: 20 00 80 4e blr
When CONFIG_PPC_FTRACE_OUT_OF_LINE [1] is enabled, these function profile
sequence is moved out of line with an unconditional branch at offset 0.
So, the test works without altering the offset for
'CONFIG_KPROBES_ON_FTRACE && CONFIG_PPC_FTRACE_OUT_OF_LINE' case.
Disassemble of bpf_fentry_test1:
c000000000f95190 <bpf_fentry_test1>:
c000000000f95190: 00 00 00 60 nop
c000000000f95194: 01 00 63 38 addi r3,r3,1
c000000000f95198: b4 07 63 7c extsw r3,r3
c000000000f9519c: 20 00 80 4e blr
[1] https://lore.kernel.org/all/20241030070850.1361304-13-hbathini@linux.ibm.co…
Fixes: 23cf7aa539dc ("selftests/bpf: Add selftest for fill_link_info")
Signed-off-by: Saket Kumar Bhaskar <skb99(a)linux.ibm.com>
---
.../selftests/bpf/prog_tests/fill_link_info.c | 4 ++++
.../selftests/bpf/progs/test_fill_link_info.c | 13 ++++++++++---
2 files changed, 14 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/fill_link_info.c b/tools/testing/selftests/bpf/prog_tests/fill_link_info.c
index d50cbd804..e59af2aa6 100644
--- a/tools/testing/selftests/bpf/prog_tests/fill_link_info.c
+++ b/tools/testing/selftests/bpf/prog_tests/fill_link_info.c
@@ -171,6 +171,10 @@ static void test_kprobe_fill_link_info(struct test_fill_link_info *skel,
/* See also arch_adjust_kprobe_addr(). */
if (skel->kconfig->CONFIG_X86_KERNEL_IBT)
entry_offset = 4;
+ if (skel->kconfig->CONFIG_PPC64 &&
+ skel->kconfig->CONFIG_KPROBES_ON_FTRACE &&
+ !skel->kconfig->CONFIG_PPC_FTRACE_OUT_OF_LINE)
+ entry_offset = 4;
err = verify_perf_link_info(link_fd, type, kprobe_addr, 0, entry_offset);
ASSERT_OK(err, "verify_perf_link_info");
} else {
diff --git a/tools/testing/selftests/bpf/progs/test_fill_link_info.c b/tools/testing/selftests/bpf/progs/test_fill_link_info.c
index 6afa83475..fac33a14f 100644
--- a/tools/testing/selftests/bpf/progs/test_fill_link_info.c
+++ b/tools/testing/selftests/bpf/progs/test_fill_link_info.c
@@ -6,13 +6,20 @@
#include <stdbool.h>
extern bool CONFIG_X86_KERNEL_IBT __kconfig __weak;
+extern bool CONFIG_PPC_FTRACE_OUT_OF_LINE __kconfig __weak;
+extern bool CONFIG_KPROBES_ON_FTRACE __kconfig __weak;
+extern bool CONFIG_PPC64 __kconfig __weak;
-/* This function is here to have CONFIG_X86_KERNEL_IBT
- * used and added to object BTF.
+/* This function is here to have CONFIG_X86_KERNEL_IBT,
+ * CONFIG_PPC_FTRACE_OUT_OF_LINE, CONFIG_KPROBES_ON_FTRACE,
+ * CONFIG_PPC6 used and added to object BTF.
*/
int unused(void)
{
- return CONFIG_X86_KERNEL_IBT ? 0 : 1;
+ return CONFIG_X86_KERNEL_IBT ||
+ CONFIG_PPC_FTRACE_OUT_OF_LINE ||
+ CONFIG_KPROBES_ON_FTRACE ||
+ CONFIG_PPC64 ? 0 : 1;
}
SEC("kprobe")
--
2.45.2
Recently, I reviewed a patch on the mm/kselftest mailing list about a
test which had obvious type mismatch fix in it. It was strange why that
wasn't caught during development and when patch was accepted. This led
me to discover that those extra compiler options to catch these warnings
aren't being used. When I added them, I found tens of warnings in just
mm suite.
In this series, I'm fixing those warnings in a few files. More fixes
would be sent later.
Muhammad Usama Anjum (4):
selftests/mm: thp_settings: remove const from return type
selftests/mm: pagemap_ioctl: Fix types mismatches shown by compiler
options
selftests/mm: mseal_test: remove unused variables
selftests/mm: mremap_test: Remove unused variable and type mismatches
tools/testing/selftests/mm/mremap_test.c | 15 +--
tools/testing/selftests/mm/mseal_test.c | 8 +-
tools/testing/selftests/mm/pagemap_ioctl.c | 108 +++++++++++----------
tools/testing/selftests/mm/thp_settings.c | 4 +-
tools/testing/selftests/mm/thp_settings.h | 4 +-
tools/testing/selftests/mm/vm_util.c | 2 +-
6 files changed, 75 insertions(+), 66 deletions(-)
--
2.39.5
Recently, I reviewed a patch on the mm/kselftest mailing list about a
test which had obvious type mismatch fix in it. It was strange why that
wasn't caught during development and when patch was accepted. This led
me to discover that those extra compiler options to catch these warnings
aren't being used. When I added them, I found tens of warnings in just
mm suite.
In this series, I'm fixing those warnings in a few files. More fixes
would be sent later.
Muhammad Usama Anjum (4):
selftests/mm: thp_settings: remove const from return type
selftests/mm: pagemap_ioctl: Fix types mismatches shown by compiler
options
selftests/mm: mseal_test: remove unused variables
selftests/mm: mremap_test: Remove unused variable and type mismatches
tools/testing/selftests/mm/mremap_test.c | 15 +--
tools/testing/selftests/mm/mseal_test.c | 8 +-
tools/testing/selftests/mm/pagemap_ioctl.c | 108 +++++++++++----------
tools/testing/selftests/mm/thp_settings.c | 4 +-
tools/testing/selftests/mm/thp_settings.h | 4 +-
tools/testing/selftests/mm/vm_util.c | 2 +-
6 files changed, 75 insertions(+), 66 deletions(-)
--
2.39.5
This is the 12th version of the patchset.
Hopefully there are no major flaws that will require more resendings.
I am sure we'll have plenty of time to polish up all bells and whistles
:-)
@Sergey, at the end I think I took in all your suggested changes, maybe
with some adaptations.
Notable changes from v11:
* move 'select' entries in Kconfig from patch 1 to where those deps are
used
* mark mailing list as subscribers-only in MAINTAINERS file
* check iface validity against net_device_ops instead of ndo_start_xmit
* drop DRV_ defines in favour of literals
* use "ovpn" literal instead of OVPN_FAMILY_NAME in code that is not
netlink related
* delete all peers on ifdown (new del-peer reason added accordingly)
* don't allow adding new peers if iface is down
* clarified uniqueness of IDs in netlink spec
* renamed ovpn_struct to ovpn_priv
* removed packet.h and moved content to proto.h
* fixed overhead/head_room calculation
* dropped unused ovpn_priv.dev_list member
* ensured all defines are prefixed with OVPN_
* kept carrier on only for MP mode
* carrier in P2P mode goes on/off when peer is added/deleted
* dropped skb_protocol_to_family() in favour of checking skb->protocol
directly
* dropped ovpn_priv.peers.lock in favour of ovpn_priv.lock
* dropped error message in case of packet with unknown ID
* dropped sanity check in udp socket attach function
* made ovpn_peer_skb_to_sockaddr() return sockaddr len to simplify code
* dropped __must_hold() in favour of lockdep_assert_held()
* with TCP patch ovpn_socket now holds reference to ovpn_priv (UDP) or
ovpn_peer (TCP) to prevent use-after-free of peer in TCP code and to
force cleanup code to wait for TCP scheduled work
* ovpn_peer release refactored in two steps to allow implementing
previous point (reference to socket is now dropped in first step,
instead of kref callback)
* dropped all mentions of __func__ in messages
* moved introduction of UDP_ENCAP_OVPNINUDP from patch 1 to related patch
* properly update vpn and link statistics at right time instead of same
spot
* properly checked skb head size before accessing ipv6 header in
ovpn_ip_check_protocol()
* merged ovpn_peer_update_local_endpoint() and ovpn_peer_float()
* properly locked peer collection when rehashing upon peer float
* used netdev_name() when possible for printing iface name
* destroyed dst_cache only upon final peer release
* used bitfield APIs for opcode parsing and creation
* dropped struct ovpn_nonce_tail in favour of using u8[] directly
* added comment about skb_reset_network_header() placement
* added locking around peer->bind modifications
* added TCP out_queue to stash data skbs when socket is owned by user
(to be sent out upon sock release)
* added call to barrier() in TCP socket release
* fixed hlist nulls lookup by adding loop restart
* used WRITE/READ_ONCE with last_recv/sent
* stopped counting keepalive msgs as dropped packets
* improved ovpn_nl_peer_precheck() to account for mixed v4mapped IPv6
* rehash peer after PEER_SET only in MP mode
addresses
* added iface teardown check to kselftest script
* Link to v11: https://lore.kernel.org/r/20241029-b4-ovpn-v11-0-de4698c73a25@openvpn.net
Please note that some patches were already reviewed by Andre Lunn,
Donald Hunter and Shuah Khan. They have retained the Reviewed-by tag
since no major code modification has happened since the review.
Patch
The latest code can also be found at:
https://github.com/OpenVPN/linux-kernel-ovpn
Thanks a lot!
Best Regards,
Antonio Quartulli
OpenVPN Inc.
---
Antonio Quartulli (22):
net: introduce OpenVPN Data Channel Offload (ovpn)
ovpn: add basic netlink support
ovpn: add basic interface creation/destruction/management routines
ovpn: keep carrier always on for MP interfaces
ovpn: introduce the ovpn_peer object
ovpn: introduce the ovpn_socket object
ovpn: implement basic TX path (UDP)
ovpn: implement basic RX path (UDP)
ovpn: implement packet processing
ovpn: store tunnel and transport statistics
ovpn: implement TCP transport
ovpn: implement multi-peer support
ovpn: implement peer lookup logic
ovpn: implement keepalive mechanism
ovpn: add support for updating local UDP endpoint
ovpn: add support for peer floating
ovpn: implement peer add/get/dump/delete via netlink
ovpn: implement key add/get/del/swap via netlink
ovpn: kill key and notify userspace in case of IV exhaustion
ovpn: notify userspace when a peer is deleted
ovpn: add basic ethtool support
testing/selftests: add test tool and scripts for ovpn module
Documentation/netlink/specs/ovpn.yaml | 368 +++
MAINTAINERS | 11 +
drivers/net/Kconfig | 14 +
drivers/net/Makefile | 1 +
drivers/net/ovpn/Makefile | 22 +
drivers/net/ovpn/bind.c | 55 +
drivers/net/ovpn/bind.h | 101 +
drivers/net/ovpn/crypto.c | 211 ++
drivers/net/ovpn/crypto.h | 145 ++
drivers/net/ovpn/crypto_aead.c | 383 ++++
drivers/net/ovpn/crypto_aead.h | 33 +
drivers/net/ovpn/io.c | 446 ++++
drivers/net/ovpn/io.h | 34 +
drivers/net/ovpn/main.c | 339 +++
drivers/net/ovpn/main.h | 14 +
drivers/net/ovpn/netlink-gen.c | 212 ++
drivers/net/ovpn/netlink-gen.h | 41 +
drivers/net/ovpn/netlink.c | 1178 ++++++++++
drivers/net/ovpn/netlink.h | 18 +
drivers/net/ovpn/ovpnstruct.h | 57 +
drivers/net/ovpn/peer.c | 1278 +++++++++++
drivers/net/ovpn/peer.h | 163 ++
drivers/net/ovpn/pktid.c | 129 ++
drivers/net/ovpn/pktid.h | 87 +
drivers/net/ovpn/proto.h | 118 +
drivers/net/ovpn/skb.h | 58 +
drivers/net/ovpn/socket.c | 180 ++
drivers/net/ovpn/socket.h | 55 +
drivers/net/ovpn/stats.c | 21 +
drivers/net/ovpn/stats.h | 47 +
drivers/net/ovpn/tcp.c | 579 +++++
drivers/net/ovpn/tcp.h | 33 +
drivers/net/ovpn/udp.c | 397 ++++
drivers/net/ovpn/udp.h | 23 +
include/uapi/linux/if_link.h | 15 +
include/uapi/linux/ovpn.h | 110 +
include/uapi/linux/udp.h | 1 +
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/net/ovpn/.gitignore | 2 +
tools/testing/selftests/net/ovpn/Makefile | 17 +
tools/testing/selftests/net/ovpn/config | 10 +
tools/testing/selftests/net/ovpn/data64.key | 5 +
tools/testing/selftests/net/ovpn/ovpn-cli.c | 2370 ++++++++++++++++++++
tools/testing/selftests/net/ovpn/tcp_peers.txt | 5 +
.../testing/selftests/net/ovpn/test-chachapoly.sh | 9 +
tools/testing/selftests/net/ovpn/test-float.sh | 9 +
tools/testing/selftests/net/ovpn/test-tcp.sh | 9 +
tools/testing/selftests/net/ovpn/test.sh | 182 ++
tools/testing/selftests/net/ovpn/udp_peers.txt | 5 +
49 files changed, 9601 insertions(+)
---
base-commit: 65ae975e97d5aab3ee9dc5ec701b12090572ed43
change-id: 20241002-b4-ovpn-eeee35c694a2
Best regards,
--
Antonio Quartulli <antonio(a)openvpn.net>
When using svcr_in to check ZA and Streaming Mode, we should make sure
that the value in x2 is correct, otherwise it may trigger an Illegal
instruction if FEAT_SVE and !FEAT_SME.
Fixes: 43e3f85523e4 ("kselftest/arm64: Add SME support to syscall ABI test")
Signed-off-by: Weizhao Ouyang <o451686892(a)gmail.com>
---
tools/testing/selftests/arm64/abi/syscall-abi-asm.S | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/arm64/abi/syscall-abi-asm.S b/tools/testing/selftests/arm64/abi/syscall-abi-asm.S
index df3230fdac39..98cde4f37abf 100644
--- a/tools/testing/selftests/arm64/abi/syscall-abi-asm.S
+++ b/tools/testing/selftests/arm64/abi/syscall-abi-asm.S
@@ -81,9 +81,9 @@ do_syscall:
stp x27, x28, [sp, #96]
// Set SVCR if we're doing SME
- cbz x1, 1f
adrp x2, svcr_in
ldr x2, [x2, :lo12:svcr_in]
+ cbz x1, 1f
msr S3_3_C4_C2_2, x2
1:
--
2.45.2
Add PKEY_UNRESTRICTED macro to mman.h and use it in selftests.
For context, this change will also allow for more consistent update of the
Glibc manual which in turn will help with introducing memory protection
keys on AArch64 targets.
Applies to fac04efc5c79 (tag: v6.13-rc2).
Note that I couldn't build ppc tests so I would appreciate if someone
could check the 3rd patch. Thank you!
Signed-off-by: Yury Khrustalev <yury.khrustalev(a)arm.com>
---
Changes in v4:
- Removed change to tools/include/uapi/asm-generic/mman-common.h as it is not
necessary.
Link to v3: https://lore.kernel.org/all/20241028090715.509527-1-yury.khrustalev@arm.com/
Changes in v3:
- Replaced previously missed 0-s tools/testing/selftests/mm/mseal_test.c
- Replaced previously missed 0-s in tools/testing/selftests/mm/mseal_test.c
Link to v2: https://lore.kernel.org/linux-arch/20241027170006.464252-2-yury.khrustalev@…
Changes in v2:
- Update tools/include/uapi/asm-generic/mman-common.h as well
- Add usages of the new macro to selftests.
Link to v1: https://lore.kernel.org/linux-arch/20241022120128.359652-1-yury.khrustalev@…
---
Yury Khrustalev (3):
mm/pkey: Add PKEY_UNRESTRICTED macro
selftests/mm: Use PKEY_UNRESTRICTED macro
selftests/powerpc: Use PKEY_UNRESTRICTED macro
include/uapi/asm-generic/mman-common.h | 1 +
tools/testing/selftests/mm/mseal_test.c | 6 +++---
tools/testing/selftests/mm/pkey-helpers.h | 3 ++-
tools/testing/selftests/mm/pkey_sighandler_tests.c | 4 ++--
tools/testing/selftests/mm/protection_keys.c | 2 +-
tools/testing/selftests/powerpc/include/pkeys.h | 2 +-
tools/testing/selftests/powerpc/mm/pkey_exec_prot.c | 2 +-
tools/testing/selftests/powerpc/mm/pkey_siginfo.c | 2 +-
tools/testing/selftests/powerpc/ptrace/core-pkey.c | 6 +++---
tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c | 6 +++---
10 files changed, 18 insertions(+), 16 deletions(-)
--
2.39.5
Basics and overview
===================
Software with larger attack surfaces (e.g. network facing apps like databases,
browsers or apps relying on browser runtimes) suffer from memory corruption
issues which can be utilized by attackers to bend control flow of the program
to eventually gain control (by making their payload executable). Attackers are
able to perform such attacks by leveraging call-sites which rely on indirect
calls or return sites which rely on obtaining return address from stack memory.
To mitigate such attacks, risc-v extension zicfilp enforces that all indirect
calls must land on a landing pad instruction `lpad` else cpu will raise software
check exception (a new cpu exception cause code on riscv).
Similarly for return flow, risc-v extension zicfiss extends architecture with
- `sspush` instruction to push return address on a shadow stack
- `sspopchk` instruction to pop return address from shadow stack
and compare with input operand (i.e. return address on stack)
- `sspopchk` to raise software check exception if comparision above
was a mismatch
- Protection mechanism using which shadow stack is not writeable via
regular store instructions
More information an details can be found at extensions github repo [1].
Equivalent to landing pad (zicfilp) on x86 is `ENDBRANCH` instruction in Intel
CET [3] and branch target identification (BTI) [4] on arm.
Similarly x86's Intel CET has shadow stack [5] and arm64 has guarded control
stack (GCS) [6] which are very similar to risc-v's zicfiss shadow stack.
x86 already supports shadow stack for user mode and arm64 support for GCS in
usermode [7] is in -next.
Kernel awareness for user control flow integrity
================================================
This series picks up Samuel Holland's envcfg changes [2] as well. So if those are
being applied independently, they should be removed from this series.
Enabling:
In order to maintain compatibility and not break anything in user mode, kernel
doesn't enable control flow integrity cpu extensions on binary by default.
Instead exposes a prctl interface to enable, disable and lock the shadow stack
or landing pad feature for a task. This allows userspace (loader) to enumerate
if all objects in its address space are compiled with shadow stack and landing
pad support and accordingly enable the feature. Additionally if a subsequent
`dlopen` happens on a library, user mode can take a decision again to disable
the feature (if incoming library is not compiled with support) OR terminate the
task (if user mode policy is strict to have all objects in address space to be
compiled with control flow integirty cpu feature). prctl to enable shadow stack
results in allocating shadow stack from virtual memory and activating for user
address space. x86 and arm64 are also following same direction due to similar
reason(s).
clone/fork:
On clone and fork, cfi state for task is inherited by child. Shadow stack is
part of virtual memory and is a writeable memory from kernel perspective
(writeable via a restricted set of instructions aka shadow stack instructions)
Thus kernel changes ensure that this memory is converted into read-only when
fork/clone happens and COWed when fault is taken due to sspush, sspopchk or
ssamoswap. In case `CLONE_VM` is specified and shadow stack is to be enabled,
kernel will automatically allocate a shadow stack for that clone call.
map_shadow_stack:
x86 introduced `map_shadow_stack` system call to allow user space to explicitly
map shadow stack memory in its address space. It is useful to allocate shadow
for different contexts managed by a single thread (green threads or contexts)
risc-v implements this system call as well.
signal management:
If shadow stack is enabled for a task, kernel performs an asynchronous control
flow diversion to deliver the signal and eventually expects userspace to issue
sigreturn so that original execution can be resumed. Even though resume context
is prepared by kernel, it is in user space memory and is subject to memory
corruption and corruption bugs can be utilized by attacker in this race window
to perform arbitrary sigreturn and eventually bypass cfi mechanism.
Another issue is how to ensure that cfi related state on sigcontext area is not
trampled by legacy apps or apps compiled with old kernel headers.
In order to mitigate control-flow hijacting, kernel prepares a token and place
it on shadow stack before signal delivery and places address of token in
sigcontext structure. During sigreturn, kernel obtains address of token from
sigcontext struture, reads token from shadow stack and validates it and only
then allow sigreturn to succeed. Compatiblity issue is solved by adopting
dynamic sigcontext management introduced for vector extension. This series
re-factor the code little bit to allow future sigcontext management easy (as
proposed by Andy Chiu from SiFive)
config and compilation:
Introduce a new risc-v config option `CONFIG_RISCV_USER_CFI`. Selecting this
config option picks the kernel support for user control flow integrity. This
optin is presented only if toolchain has shadow stack and landing pad support.
And is on purpose guarded by toolchain support. Reason being that eventually
vDSO also needs to be compiled in with shadow stack and landing pad support.
vDSO compile patches are not included as of now because landing pad labeling
scheme is yet to settle for usermode runtime.
To get more information on kernel interactions with respect to
zicfilp and zicfiss, patch series adds documentation for
`zicfilp` and `zicfiss` in following:
Documentation/arch/riscv/zicfiss.rst
Documentation/arch/riscv/zicfilp.rst
How to test this series
=======================
Toolchain
---------
$ git clone git@github.com:sifive/riscv-gnu-toolchain.git -b cfi-dev
$ riscv-gnu-toolchain/configure --prefix=<path-to-where-to-build> --with-arch=rv64gc_zicfilp_zicfiss --enable-linux --disable-gdb --with-extra-multilib-test="rv64gc_zicfilp_zicfiss-lp64d:-static"
$ make -j$(nproc)
Qemu
----
$ git clone git@github.com:deepak0414/qemu.git -b zicfilp_zicfiss_ratified_master_july11
$ cd qemu
$ mkdir build
$ cd build
$ ../configure --target-list=riscv64-softmmu
$ make -j$(nproc)
Opensbi
-------
$ git clone git@github.com:deepak0414/opensbi.git -b v6_cfi_spec_split_opensbi
$ make CROSS_COMPILE=<your riscv toolchain> -j$(nproc) PLATFORM=generic
Linux
-----
Running defconfig is fine. CFI is enabled by default if the toolchain
supports it.
$ make ARCH=riscv CROSS_COMPILE=<path-to-cfi-riscv-gnu-toolchain>/build/bin/riscv64-unknown-linux-gnu- -j$(nproc) defconfig
$ make ARCH=riscv CROSS_COMPILE=<path-to-cfi-riscv-gnu-toolchain>/build/bin/riscv64-unknown-linux-gnu- -j$(nproc)
Branch where user cfi enabling patches are maintained
https://github.com/deepak0414/linux-riscv-cfi/tree/vdso_user_cfi_v6.12-rc1
In case you're building your own rootfs using toolchain, please make sure you
pick following patch to ensure that vDSO compiled with lpad and shadow stack.
"arch/riscv: compile vdso with landing pad"
Running
-------
Modify your qemu command to have:
-bios <path-to-cfi-opensbi>/build/platform/generic/firmware/fw_dynamic.bin
-cpu rv64,zicfilp=true,zicfiss=true,zimop=true,zcmop=true
vDSO related Opens (in the flux)
=================================
I am listing these opens for laying out plan and what to expect in future
patch sets. And of course for the sake of discussion.
Shadow stack and landing pad enabling in vDSO
----------------------------------------------
vDSO must have shadow stack and landing pad support compiled in for task
to have shadow stack and landing pad support. This patch series doesn't
enable that (yet). Enabling shadow stack support in vDSO should be
straight forward (intend to do that in next versions of patch set). Enabling
landing pad support in vDSO requires some collaboration with toolchain folks
to follow a single label scheme for all object binaries. This is necessary to
ensure that all indirect call-sites are setting correct label and target landing
pads are decorated with same label scheme.
How many vDSOs
---------------
Shadow stack instructions are carved out of zimop (may be operations) and if CPU
doesn't implement zimop, they're illegal instructions. Kernel could be running on
a CPU which may or may not implement zimop. And thus kernel will have to carry 2
different vDSOs and expose the appropriate one depending on whether CPU implements
zimop or not.
References
==========
[1] - https://github.com/riscv/riscv-cfi
[2] - https://lore.kernel.org/all/20240814081126.956287-1-samuel.holland@sifive.c…
[3] - https://lwn.net/Articles/889475/
[4] - https://developer.arm.com/documentation/109576/0100/Branch-Target-Identific…
[5] - https://www.intel.com/content/dam/develop/external/us/en/documents/catc17-i…
[6] - https://lwn.net/Articles/940403/
[7] - https://lore.kernel.org/all/20241001-arm64-gcs-v13-0-222b78d87eee@kernel.or…
---
changelog
---------
v7:
- Removed "riscv/Kconfig: enable HAVE_EXIT_THREAD for riscv"
Instead using `deactivate_mm` flow to clean up.
see here for more context
https://lore.kernel.org/all/20230908203655.543765-1-rick.p.edgecombe@intel.…
- Changed the header include in `kselftest`. Hopefully this fixes compile
issue faced by Zong Li at SiFive.
- Cleaned up an orphaned change to `mm/mmap.c` in below patch
"riscv/mm : ensure PROT_WRITE leads to VM_READ | VM_WRITE"
- Lock interfaces for shadow stack and indirect branch tracking expect arg == 0
Any future evolution of this interface should accordingly define how arg should
be setup.
- `mm/map.c` has an instance of using `VM_SHADOW_STACK`. Fixed it to use helper
`is_shadow_stack_vma`.
- Link to v6: https://lore.kernel.org/r/20241008-v5_user_cfi_series-v6-0-60d9fe073f37@riv…
v6:
- Picked up Samuel Holland's changes as is with `envcfg` placed in
`thread` instead of `thread_info`
- fixed unaligned newline escapes in kselftest
- cleaned up messages in kselftest and included test output in commit message
- fixed a bug in clone path reported by Zong Li
- fixed a build issue if CONFIG_RISCV_ISA_V is not selected
(this was introduced due to re-factoring signal context
management code)
v5:
- rebased on v6.12-rc1
- Fixed schema related issues in device tree file
- Fixed some of the documentation related issues in zicfilp/ss.rst
(style issues and added index)
- added `SHADOW_STACK_SET_MARKER` so that implementation can define base
of shadow stack.
- Fixed warnings on definitions added in usercfi.h when
CONFIG_RISCV_USER_CFI is not selected.
- Adopted context header based signal handling as proposed by Andy Chiu
- Added support for enabling kernel mode access to shadow stack using
FWFT
(https://github.com/riscv-non-isa/riscv-sbi-doc/blob/master/src/ext-firmware…)
- Link to v5: https://lore.kernel.org/r/20241001-v5_user_cfi_series-v1-0-3ba65b6e550f@riv…
(Note: I had an issue in my workflow due to which version number wasn't
picked up correctly while sending out patches)
v4:
- rebased on 6.11-rc6
- envcfg: Converged with Samuel Holland's patches for envcfg management on per-
thread basis.
- vma_is_shadow_stack is renamed to is_vma_shadow_stack
- picked up Mark Brown's `ARCH_HAS_USER_SHADOW_STACK` patch
- signal context: using extended context management to maintain compatibility.
- fixed `-Wmissing-prototypes` compiler warnings for prctl functions
- Documentation fixes and amending typos.
- Link to v4: https://lore.kernel.org/all/20240912231650.3740732-1-debug@rivosinc.com/
v3:
- envcfg
logic to pick up base envcfg had a bug where `ENVCFG_CBZE` could have been
picked on per task basis, even though CPU didn't implement it. Fixed in
this series.
- dt-bindings
As suggested, split into separate commit. fixed the messaging that spec is
in public review
- arch_is_shadow_stack change
arch_is_shadow_stack changed to vma_is_shadow_stack
- hwprobe
zicfiss / zicfilp if present will get enumerated in hwprobe
- selftests
As suggested, added object and binary filenames to .gitignore
Selftest binary anyways need to be compiled with cfi enabled compiler which
will make sure that landing pad and shadow stack are enabled. Thus removed
separate enable/disable tests. Cleaned up tests a bit.
- Link to v3: https://lore.kernel.org/lkml/20240403234054.2020347-1-debug@rivosinc.com/
v2:
- Using config `CONFIG_RISCV_USER_CFI`, kernel support for riscv control flow
integrity for user mode programs can be compiled in the kernel.
- Enabling of control flow integrity for user programs is left to user runtime
- This patch series introduces arch agnostic `prctls` to enable shadow stack
and indirect branch tracking. And implements them on riscv.
---
Andy Chiu (1):
riscv: signal: abstract header saving for setup_sigcontext
Clément Léger (1):
riscv: Add Firmware Feature SBI extensions definitions
Deepak Gupta (25):
mm: helper `is_shadow_stack_vma` to check shadow stack vma
dt-bindings: riscv: zicfilp and zicfiss in dt-bindings (extensions.yaml)
riscv: zicfiss / zicfilp enumeration
riscv: zicfiss / zicfilp extension csr and bit definitions
riscv: usercfi state for task and save/restore of CSR_SSP on trap entry/exit
riscv/mm : ensure PROT_WRITE leads to VM_READ | VM_WRITE
riscv mm: manufacture shadow stack pte
riscv mmu: teach pte_mkwrite to manufacture shadow stack PTEs
riscv mmu: write protect and shadow stack
riscv/mm: Implement map_shadow_stack() syscall
riscv/shstk: If needed allocate a new shadow stack on clone
prctl: arch-agnostic prctl for indirect branch tracking
riscv: Implements arch agnostic shadow stack prctls
riscv: Implements arch agnostic indirect branch tracking prctls
riscv/traps: Introduce software check exception
riscv/signal: save and restore of shadow stack for signal
riscv/kernel: update __show_regs to print shadow stack register
riscv/ptrace: riscv cfi status and state via ptrace and in core files
riscv/hwprobe: zicfilp / zicfiss enumeration in hwprobe
riscv: enable kernel access to shadow stack memory via FWFT sbi call
riscv: kernel command line option to opt out of user cfi
riscv: create a config for shadow stack and landing pad instr support
riscv: Documentation for landing pad / indirect branch tracking
riscv: Documentation for shadow stack on riscv
kselftest/riscv: kselftest for user mode cfi
Mark Brown (2):
mm: Introduce ARCH_HAS_USER_SHADOW_STACK
prctl: arch-agnostic prctl for shadow stack
Samuel Holland (3):
riscv: Enable cbo.zero only when all harts support Zicboz
riscv: Add support for per-thread envcfg CSR values
riscv: Call riscv_user_isa_enable() only on the boot hart
Documentation/arch/riscv/index.rst | 2 +
Documentation/arch/riscv/zicfilp.rst | 115 +++++
Documentation/arch/riscv/zicfiss.rst | 176 +++++++
.../devicetree/bindings/riscv/extensions.yaml | 14 +
arch/riscv/Kconfig | 20 +
arch/riscv/include/asm/asm-prototypes.h | 1 +
arch/riscv/include/asm/cpufeature.h | 15 +-
arch/riscv/include/asm/csr.h | 16 +
arch/riscv/include/asm/entry-common.h | 2 +
arch/riscv/include/asm/hwcap.h | 2 +
arch/riscv/include/asm/mman.h | 24 +
arch/riscv/include/asm/mmu_context.h | 7 +
arch/riscv/include/asm/pgtable.h | 30 +-
arch/riscv/include/asm/processor.h | 3 +
arch/riscv/include/asm/sbi.h | 27 ++
arch/riscv/include/asm/switch_to.h | 8 +
arch/riscv/include/asm/thread_info.h | 3 +
arch/riscv/include/asm/usercfi.h | 89 ++++
arch/riscv/include/asm/vector.h | 3 +
arch/riscv/include/uapi/asm/hwprobe.h | 2 +
arch/riscv/include/uapi/asm/ptrace.h | 22 +
arch/riscv/include/uapi/asm/sigcontext.h | 1 +
arch/riscv/kernel/Makefile | 2 +
arch/riscv/kernel/asm-offsets.c | 8 +
arch/riscv/kernel/cpufeature.c | 13 +-
arch/riscv/kernel/entry.S | 31 +-
arch/riscv/kernel/head.S | 12 +
arch/riscv/kernel/process.c | 26 +-
arch/riscv/kernel/ptrace.c | 83 ++++
arch/riscv/kernel/signal.c | 140 +++++-
arch/riscv/kernel/smpboot.c | 2 -
arch/riscv/kernel/suspend.c | 4 +-
arch/riscv/kernel/sys_hwprobe.c | 2 +
arch/riscv/kernel/sys_riscv.c | 10 +
arch/riscv/kernel/traps.c | 42 ++
arch/riscv/kernel/usercfi.c | 526 +++++++++++++++++++++
arch/riscv/mm/init.c | 2 +-
arch/riscv/mm/pgtable.c | 17 +
arch/x86/Kconfig | 1 +
fs/proc/task_mmu.c | 2 +-
include/linux/cpu.h | 4 +
include/linux/mm.h | 5 +-
include/uapi/asm-generic/mman.h | 4 +
include/uapi/linux/elf.h | 1 +
include/uapi/linux/prctl.h | 48 ++
kernel/sys.c | 60 +++
mm/Kconfig | 6 +
mm/gup.c | 2 +-
mm/mmap.c | 2 +-
mm/vma.h | 10 +-
tools/testing/selftests/riscv/Makefile | 2 +-
tools/testing/selftests/riscv/cfi/.gitignore | 3 +
tools/testing/selftests/riscv/cfi/Makefile | 10 +
tools/testing/selftests/riscv/cfi/cfi_rv_test.h | 84 ++++
tools/testing/selftests/riscv/cfi/riscv_cfi_test.c | 78 +++
tools/testing/selftests/riscv/cfi/shadowstack.c | 373 +++++++++++++++
tools/testing/selftests/riscv/cfi/shadowstack.h | 37 ++
57 files changed, 2191 insertions(+), 43 deletions(-)
---
base-commit: 7d9923ee3960bdbfaa7f3a4e0ac2364e770c46ff
change-id: 20240930-v5_user_cfi_series-3dc332f8f5b2
--
- debug
The word 'accross' is wrong, so fix it.
Signed-off-by: Zhu Jun <zhujun2(a)cmss.chinamobile.com>
---
tools/testing/selftests/net/psock_tpacket.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/psock_tpacket.c b/tools/testing/selftests/net/psock_tpacket.c
index 404a2ce75..221270cee 100644
--- a/tools/testing/selftests/net/psock_tpacket.c
+++ b/tools/testing/selftests/net/psock_tpacket.c
@@ -12,7 +12,7 @@
*
* Datapath:
* Open a pair of packet sockets and send resp. receive an a priori known
- * packet pattern accross the sockets and check if it was received resp.
+ * packet pattern across the sockets and check if it was received resp.
* sent correctly. Fanout in combination with RX_RING is currently not
* tested here.
*
--
2.17.1
Danielle Ratson writes:
Currently, the sharedbuffer test fails sometimes because it is reading a
maximum occupancy that is larger than expected on some different cases.
This is happening because the test assumes that the packet it is sending
is the only packet being passed to the device.
In addition, some duplications on one hand, and redundant test cases on
the other hand, were found in the test.
Add egress filters on h1 and h2 that will guarantee that the packets in
the buffer are sent in the test, and remove the redundant test cases.
Danielle Ratson (3):
selftests: mlxsw: sharedbuffer: Remove h1 ingress test case
selftests: mlxsw: sharedbuffer: Remove duplicate test cases
selftests: mlxsw: sharedbuffer: Ensure no extra packets are counted
.../drivers/net/mlxsw/sharedbuffer.sh | 55 ++++++++++++++-----
1 file changed, 40 insertions(+), 15 deletions(-)
--
2.47.0
Hi all,
This patch series continues the work to migrate the script tests into
prog_tests.
test_xdp_meta.sh uses the BPF programs defined in progs/test_xdp_meta.c
to do a simple XDP/TC functional test that checks the metadata
allocation performed by the bpf_xdp_adjust_meta() helper.
This is already partly covered by two tests under prog_tests/:
- xdp_context_test_run.c uses bpf_prog_test_run_opts() to verify the
validity of the xdp_md context after a call to bpf_xdp_adjust_meta()
- xdp_metadata.c ensures that these meta-data can be exchanged through
an AF_XDP socket.
However test_xdp_meta.sh also verifies that the meta-data initialized
in the struct xdp_md is forwarded to the struct __sk_buff used by BPF
programs at 'TC level'. To cover this, I add a test case in
xdp_context_test_run.c that uses the same BPF programs from
progs/test_xdp_meta.c.
---
Bastien Curutchet (2):
selftests/bpf: test_xdp_meta: Rename BPF sections
selftests/bpf: Migrate test_xdp_meta.sh into xdp_context_test_run.c
tools/testing/selftests/bpf/Makefile | 1 -
.../bpf/prog_tests/xdp_context_test_run.c | 86 ++++++++++++++++++++++
tools/testing/selftests/bpf/progs/test_xdp_meta.c | 4 +-
tools/testing/selftests/bpf/test_xdp_meta.sh | 58 ---------------
4 files changed, 88 insertions(+), 61 deletions(-)
---
base-commit: 6849a3de3507a490fb0788c9bafbb2f29a904f05
change-id: 20241203-xdp_meta-868307cd0e03
Best regards,
--
Bastien Curutchet (eBPF Foundation) <bastien.curutchet(a)bootlin.com>
When compiling the pointer masking tests with -Wall this warning
is present:
pointer_masking.c: In function ‘test_tagged_addr_abi_sysctl’:
pointer_masking.c:203:9: warning: ignoring return value of ‘pwrite’
declared with attribute ‘warn_unused_result’ [-Wunused-result]
203 | pwrite(fd, &value, 1, 0); |
^~~~~~~~~~~~~~~~~~~~~~~~ pointer_masking.c:208:9: warning:
ignoring return value of ‘pwrite’ declared with attribute
‘warn_unused_result’ [-Wunused-result]
208 | pwrite(fd, &value, 1, 0);
I came across this on riscv64-linux-gnu-gcc (Ubuntu
11.4.0-1ubuntu1~22.04).
Fix this by checking that the number of bytes written equal the expected
number of bytes written.
Fixes: 7470b5afd150 ("riscv: selftests: Add a pointer masking test")
Signed-off-by: Charlie Jenkins <charlie(a)rivosinc.com>
---
Changes in v4:
- Skip sysctl_enabled test if first pwrite failed
- Link to v3: https://lore.kernel.org/r/20241205-fix_warnings_pointer_masking_tests-v3-1-…
Changes in v3:
- Fix sysctl enabled test case (Drew/Alex)
- Move pwrite err condition into goto (Drew)
- Link to v2: https://lore.kernel.org/r/20241204-fix_warnings_pointer_masking_tests-v2-1-…
Changes in v2:
- I had ret != 2 for testing, I changed it to be ret != 1.
- Link to v1: https://lore.kernel.org/r/20241204-fix_warnings_pointer_masking_tests-v1-1-…
---
tools/testing/selftests/riscv/abi/pointer_masking.c | 20 ++++++++++++++++++--
1 file changed, 18 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/riscv/abi/pointer_masking.c b/tools/testing/selftests/riscv/abi/pointer_masking.c
index dee41b7ee3e3..759445d5f265 100644
--- a/tools/testing/selftests/riscv/abi/pointer_masking.c
+++ b/tools/testing/selftests/riscv/abi/pointer_masking.c
@@ -189,6 +189,8 @@ static void test_tagged_addr_abi_sysctl(void)
{
char value;
int fd;
+ int ret;
+ char *err_pwrite_msg = "failed to write to /proc/sys/abi/tagged_addr_disabled\n";
ksft_print_msg("Testing tagged address ABI sysctl\n");
@@ -200,18 +202,32 @@ static void test_tagged_addr_abi_sysctl(void)
}
value = '1';
- pwrite(fd, &value, 1, 0);
+ ret = pwrite(fd, &value, 1, 0);
+ if (ret != 1) {
+ ksft_test_result_skip(err_pwrite_msg);
+ goto err_pwrite;
+ }
+
ksft_test_result(set_tagged_addr_ctrl(min_pmlen, true) == -EINVAL,
"sysctl disabled\n");
value = '0';
- pwrite(fd, &value, 1, 0);
+ ret = pwrite(fd, &value, 1, 0);
+ if (ret != 1)
+ goto err_pwrite;
+
ksft_test_result(set_tagged_addr_ctrl(min_pmlen, true) == 0,
"sysctl enabled\n");
set_tagged_addr_ctrl(0, false);
close(fd);
+
+ return;
+
+err_pwrite:
+ close(fd);
+ ksft_test_result_fail(err_pwrite_msg);
}
static void test_tagged_addr_abi_pmlen(int pmlen)
---
base-commit: 40384c840ea1944d7c5a392e8975ed088ecf0b37
change-id: 20241204-fix_warnings_pointer_masking_tests-3860e4f35429
--
- Charlie
Add ip_link_set_addr(), ip_link_set_up(), ip_addr_add() and ip_route_add()
to the suite of helpers that automatically schedule a corresponding
cleanup.
When setting a new MAC, one needs to remember the old address first. Move
mac_get() from forwarding/ to that end.
Signed-off-by: Petr Machata <petrm(a)nvidia.com>
Reviewed-by: Ido Schimmel <idosch(a)nvidia.com>
---
Notes:
CC: Shuah Khan <shuah(a)kernel.org>
CC: Benjamin Poirier <bpoirier(a)nvidia.com>
CC: Hangbin Liu <liuhangbin(a)gmail.com>
CC: Vladimir Oltean <vladimir.oltean(a)nxp.com>
CC: linux-kselftest(a)vger.kernel.org
tools/testing/selftests/net/forwarding/lib.sh | 7 ----
tools/testing/selftests/net/lib.sh | 39 +++++++++++++++++++
2 files changed, 39 insertions(+), 7 deletions(-)
diff --git a/tools/testing/selftests/net/forwarding/lib.sh b/tools/testing/selftests/net/forwarding/lib.sh
index 7337f398f9cc..1fd40bada694 100644
--- a/tools/testing/selftests/net/forwarding/lib.sh
+++ b/tools/testing/selftests/net/forwarding/lib.sh
@@ -932,13 +932,6 @@ packets_rate()
echo $(((t1 - t0) / interval))
}
-mac_get()
-{
- local if_name=$1
-
- ip -j link show dev $if_name | jq -r '.[]["address"]'
-}
-
ether_addr_to_u64()
{
local addr="$1"
diff --git a/tools/testing/selftests/net/lib.sh b/tools/testing/selftests/net/lib.sh
index 5ea6537acd2b..2cd5c743b2d9 100644
--- a/tools/testing/selftests/net/lib.sh
+++ b/tools/testing/selftests/net/lib.sh
@@ -435,6 +435,13 @@ xfail_on_veth()
fi
}
+mac_get()
+{
+ local if_name=$1
+
+ ip -j link show dev $if_name | jq -r '.[]["address"]'
+}
+
kill_process()
{
local pid=$1; shift
@@ -459,3 +466,35 @@ ip_link_set_master()
ip link set dev "$member" master "$master"
defer ip link set dev "$member" nomaster
}
+
+ip_link_set_addr()
+{
+ local name=$1; shift
+ local addr=$1; shift
+
+ local old_addr=$(mac_get "$name")
+ ip link set dev "$name" address "$addr"
+ defer ip link set dev "$name" address "$old_addr"
+}
+
+ip_link_set_up()
+{
+ local name=$1; shift
+
+ ip link set dev "$name" up
+ defer ip link set dev "$name" down
+}
+
+ip_addr_add()
+{
+ local name=$1; shift
+
+ ip addr add dev "$name" "$@"
+ defer ip addr del dev "$name" "$@"
+}
+
+ip_route_add()
+{
+ ip route add "$@"
+ defer ip route del "$@"
+}
--
2.47.0
This series is a follow-up to v1[1], aimed at making the watchdog selftest
more suitable for CI environments. Currently, in non-interactive setups,
the watchdog kselftest can only run with oneshot parameters, preventing the
testing of the WDIOC_KEEPALIVE ioctl since the ping loop is only
interrupted by SIGINT.
The first patch adds a new -c option to limit the number of watchdog pings,
allowing the test to be optionally finite. The second patch updates the
test output to conform to KTAP.
The default behavior remains unchanged: without the -c option, the
keep_alive() loop continues indefinitely until interrupted by SIGINT.
[1] https://lore.kernel.org/all/20240506111359.224579-1-laura.nao@collabora.com/
Changes in v2:
- The keep_alive() loop remains infinite by default
- Introduced keep_alive_res variable to track the WDIOC_KEEPALIVE ioctl return code for user reporting
Laura Nao (2):
selftests/watchdog: add -c option to limit the ping loop
selftests/watchdog: convert the test output to KTAP format
.../selftests/watchdog/watchdog-test.c | 169 +++++++++++-------
1 file changed, 103 insertions(+), 66 deletions(-)
--
2.30.2
Currently, user needs to manually enable transmit hardware timestamp
feature of certain Ethernet drivers, e.g. stmmac and igc drivers, through
following command after running the xdp_hw_metadata app.
sudo hwstamp_ctl -i eth0 -t 1
To simplify the step test of xdp_hw_metadata, set tx_type to HWTSTAMP_TX_ON
to enable hardware timestamping for all outgoing packets, so that user no
longer need to execute hwstamp_ctl command.
Signed-off-by: Song Yoong Siang <yoong.siang.song(a)intel.com>
Acked-by: Stanislav Fomichev <sdf(a)fomichev.me>
---
v1: https://patchwork.kernel.org/project/netdevbpf/patch/20241204115715.3148412…
v1->v2 changelog:
- Add detail in commit msg on why HWTSTAMP_TX_ON is needed (Stanislav).
- Separate the patch into two, current one submit to bpf-next,
another one submit to bpf.
---
tools/testing/selftests/bpf/xdp_hw_metadata.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c
index 06266aad2f99..96c65500f4b4 100644
--- a/tools/testing/selftests/bpf/xdp_hw_metadata.c
+++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c
@@ -551,6 +551,7 @@ static void hwtstamp_enable(const char *ifname)
{
struct hwtstamp_config cfg = {
.rx_filter = HWTSTAMP_FILTER_ALL,
+ .tx_type = HWTSTAMP_TX_ON,
};
hwtstamp_ioctl(SIOCGHWTSTAMP, ifname, &saved_hwtstamp_cfg);
--
2.34.1
When compiling the pointer masking tests with -Wall this warning
is present:
pointer_masking.c: In function ‘test_tagged_addr_abi_sysctl’:
pointer_masking.c:203:9: warning: ignoring return value of ‘pwrite’
declared with attribute ‘warn_unused_result’ [-Wunused-result]
203 | pwrite(fd, &value, 1, 0); |
^~~~~~~~~~~~~~~~~~~~~~~~ pointer_masking.c:208:9: warning:
ignoring return value of ‘pwrite’ declared with attribute
‘warn_unused_result’ [-Wunused-result]
208 | pwrite(fd, &value, 1, 0);
I came across this on riscv64-linux-gnu-gcc (Ubuntu
11.4.0-1ubuntu1~22.04).
Fix this by checking that the number of bytes written equal the expected
number of bytes written.
Fixes: 7470b5afd150 ("riscv: selftests: Add a pointer masking test")
Signed-off-by: Charlie Jenkins <charlie(a)rivosinc.com>
---
Changes in v2:
- I had ret != 2 for testing, I changed it to be ret != 1.
- Link to v1: https://lore.kernel.org/r/20241204-fix_warnings_pointer_masking_tests-v1-1-…
---
tools/testing/selftests/riscv/abi/pointer_masking.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/riscv/abi/pointer_masking.c b/tools/testing/selftests/riscv/abi/pointer_masking.c
index dee41b7ee3e3..229d85ccff50 100644
--- a/tools/testing/selftests/riscv/abi/pointer_masking.c
+++ b/tools/testing/selftests/riscv/abi/pointer_masking.c
@@ -189,6 +189,7 @@ static void test_tagged_addr_abi_sysctl(void)
{
char value;
int fd;
+ int ret;
ksft_print_msg("Testing tagged address ABI sysctl\n");
@@ -200,14 +201,24 @@ static void test_tagged_addr_abi_sysctl(void)
}
value = '1';
- pwrite(fd, &value, 1, 0);
+ ret = pwrite(fd, &value, 1, 0);
+ if (ret != 1) {
+ ksft_test_result_fail("Write to /proc/sys/abi/tagged_addr_disabled failed.\n");
+ return;
+ }
+
ksft_test_result(set_tagged_addr_ctrl(min_pmlen, true) == -EINVAL,
"sysctl disabled\n");
value = '0';
- pwrite(fd, &value, 1, 0);
- ksft_test_result(set_tagged_addr_ctrl(min_pmlen, true) == 0,
- "sysctl enabled\n");
+ ret = pwrite(fd, &value, 1, 0);
+ if (ret != 1) {
+ ksft_test_result_fail("Write to /proc/sys/abi/tagged_addr_disabled failed.\n");
+ return;
+ }
+
+ ksft_test_result(set_tagged_addr_ctrl(min_pmlen, true) == -EINVAL,
+ "sysctl disabled\n");
set_tagged_addr_ctrl(0, false);
---
base-commit: 40384c840ea1944d7c5a392e8975ed088ecf0b37
change-id: 20241204-fix_warnings_pointer_masking_tests-3860e4f35429
--
- Charlie
When compiling the pointer masking tests with -Wall this warning
is present:
pointer_masking.c: In function ‘test_tagged_addr_abi_sysctl’:
pointer_masking.c:203:9: warning: ignoring return value of ‘pwrite’
declared with attribute ‘warn_unused_result’ [-Wunused-result]
203 | pwrite(fd, &value, 1, 0); |
^~~~~~~~~~~~~~~~~~~~~~~~ pointer_masking.c:208:9: warning:
ignoring return value of ‘pwrite’ declared with attribute
‘warn_unused_result’ [-Wunused-result]
208 | pwrite(fd, &value, 1, 0);
I came across this on riscv64-linux-gnu-gcc (Ubuntu
11.4.0-1ubuntu1~22.04).
Fix this by checking that the number of bytes written equal the expected
number of bytes written.
Fixes: 7470b5afd150 ("riscv: selftests: Add a pointer masking test")
Signed-off-by: Charlie Jenkins <charlie(a)rivosinc.com>
---
Changes in v3:
- Fix sysctl enabled test case (Drew/Alex)
- Move pwrite err condition into goto (Drew)
- Link to v2: https://lore.kernel.org/r/20241204-fix_warnings_pointer_masking_tests-v2-1-…
Changes in v2:
- I had ret != 2 for testing, I changed it to be ret != 1.
- Link to v1: https://lore.kernel.org/r/20241204-fix_warnings_pointer_masking_tests-v1-1-…
---
tools/testing/selftests/riscv/abi/pointer_masking.c | 17 +++++++++++++++--
1 file changed, 15 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/riscv/abi/pointer_masking.c b/tools/testing/selftests/riscv/abi/pointer_masking.c
index dee41b7ee3e3..2367b24a2b4e 100644
--- a/tools/testing/selftests/riscv/abi/pointer_masking.c
+++ b/tools/testing/selftests/riscv/abi/pointer_masking.c
@@ -189,6 +189,7 @@ static void test_tagged_addr_abi_sysctl(void)
{
char value;
int fd;
+ int ret;
ksft_print_msg("Testing tagged address ABI sysctl\n");
@@ -200,18 +201,30 @@ static void test_tagged_addr_abi_sysctl(void)
}
value = '1';
- pwrite(fd, &value, 1, 0);
+ ret = pwrite(fd, &value, 1, 0);
+ if (ret != 1)
+ goto err_pwrite;
+
ksft_test_result(set_tagged_addr_ctrl(min_pmlen, true) == -EINVAL,
"sysctl disabled\n");
value = '0';
- pwrite(fd, &value, 1, 0);
+ ret = pwrite(fd, &value, 1, 0);
+ if (ret != 1)
+ goto err_pwrite;
+
ksft_test_result(set_tagged_addr_ctrl(min_pmlen, true) == 0,
"sysctl enabled\n");
set_tagged_addr_ctrl(0, false);
close(fd);
+
+ return;
+
+err_pwrite:
+ close(fd);
+ ksft_test_result_fail("failed to write to /proc/sys/abi/tagged_addr_disabled\n");
}
static void test_tagged_addr_abi_pmlen(int pmlen)
---
base-commit: 40384c840ea1944d7c5a392e8975ed088ecf0b37
change-id: 20241204-fix_warnings_pointer_masking_tests-3860e4f35429
--
- Charlie
The sysctl tests for vm.memfd_noexec rely on the kernel to support PID
namespaces (i.e. the kernel is built with CONFIG_PID_NS=y). If the
kernel the test runs on does not support PID namespaces, the first
sysctl test will fail when attempting to spawn a new thread in a new
PID namespace, abort the test, preventing the remaining tests from
being run.
This is not desirable, as not all kernels need PID namespaces, but can
still use the other features provided by memfd. Therefore, only run the
sysctl tests if the kernel supports PID namespaces. Otherwise, skip
those tests and emit an informative message to let the user know why
the sysctl tests are not being run.
Fixes: 11f75a01448f ("selftests/memfd: add tests for MFD_NOEXEC_SEAL MFD_EXEC")
Cc: stable(a)vger.kernel.org # v6.6+
Cc: Jeff Xu <jeffxu(a)google.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Kalesh Singh <kaleshsingh(a)google.com>
Signed-off-by: Isaac J. Manjarres <isaacmanjarres(a)google.com>
---
tools/testing/selftests/memfd/memfd_test.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/memfd/memfd_test.c b/tools/testing/selftests/memfd/memfd_test.c
index 95af2d78fd31..0a0b55516028 100644
--- a/tools/testing/selftests/memfd/memfd_test.c
+++ b/tools/testing/selftests/memfd/memfd_test.c
@@ -9,6 +9,7 @@
#include <fcntl.h>
#include <linux/memfd.h>
#include <sched.h>
+#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
@@ -1557,6 +1558,11 @@ static void test_share_fork(char *banner, char *b_suffix)
close(fd);
}
+static bool pid_ns_supported(void)
+{
+ return access("/proc/self/ns/pid", F_OK) == 0;
+}
+
int main(int argc, char **argv)
{
pid_t pid;
@@ -1591,8 +1597,12 @@ int main(int argc, char **argv)
test_seal_grow();
test_seal_resize();
- test_sysctl_simple();
- test_sysctl_nested();
+ if (pid_ns_supported()) {
+ test_sysctl_simple();
+ test_sysctl_nested();
+ } else {
+ printf("PID namespaces are not supported; skipping sysctl tests\n");
+ }
test_share_dup("SHARE-DUP", "");
test_share_mmap("SHARE-MMAP", "");
--
2.47.0.338.g60cca15819-goog
When we fork anonymous pages, apply a guard page then remove it, the
previous CoW mapping is cleared.
This might not be obvious to an outside observer without taking some time
to think about how the overall process functions, so document that this is
the case through a test, which also usefully asserts that the behaviour is
as we expect.
This is grouped with other, more important, fork tests that ensure that
guard pages are correctly propagated on fork.
Fix a typo in a nearby comment at the same time.
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
---
tools/testing/selftests/mm/guard-pages.c | 73 +++++++++++++++++++++++-
1 file changed, 72 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/mm/guard-pages.c b/tools/testing/selftests/mm/guard-pages.c
index 7cdf815d0d63..d8f8dee9ebbd 100644
--- a/tools/testing/selftests/mm/guard-pages.c
+++ b/tools/testing/selftests/mm/guard-pages.c
@@ -990,7 +990,7 @@ TEST_F(guard_pages, fork)
MAP_ANON | MAP_PRIVATE, -1, 0);
ASSERT_NE(ptr, MAP_FAILED);
- /* Establish guard apges in the first 5 pages. */
+ /* Establish guard pages in the first 5 pages. */
ASSERT_EQ(madvise(ptr, 5 * page_size, MADV_GUARD_INSTALL), 0);
pid = fork();
@@ -1029,6 +1029,77 @@ TEST_F(guard_pages, fork)
ASSERT_EQ(munmap(ptr, 10 * page_size), 0);
}
+/*
+ * Assert expected behaviour after we fork populated ranges of anonymous memory
+ * and then guard and unguard the range.
+ */
+TEST_F(guard_pages, fork_cow)
+{
+ const unsigned long page_size = self->page_size;
+ char *ptr;
+ pid_t pid;
+ int i;
+
+ /* Map 10 pages. */
+ ptr = mmap(NULL, 10 * page_size, PROT_READ | PROT_WRITE,
+ MAP_ANON | MAP_PRIVATE, -1, 0);
+ ASSERT_NE(ptr, MAP_FAILED);
+
+ /* Populate range. */
+ for (i = 0; i < 10 * page_size; i++) {
+ char chr = 'a' + (i % 26);
+
+ ptr[i] = chr;
+ }
+
+ pid = fork();
+ ASSERT_NE(pid, -1);
+ if (!pid) {
+ /* This is the child process now. */
+
+ /* Ensure the range is as expected. */
+ for (i = 0; i < 10 * page_size; i++) {
+ char expected = 'a' + (i % 26);
+ char actual = ptr[i];
+
+ ASSERT_EQ(actual, expected);
+ }
+
+ /* Establish guard pages across the whole range. */
+ ASSERT_EQ(madvise(ptr, 10 * page_size, MADV_GUARD_INSTALL), 0);
+ /* Remove it. */
+ ASSERT_EQ(madvise(ptr, 10 * page_size, MADV_GUARD_REMOVE), 0);
+
+ /*
+ * By removing the guard pages, the page tables will be
+ * cleared. Assert that we are looking at the zero page now.
+ */
+ for (i = 0; i < 10 * page_size; i++) {
+ char actual = ptr[i];
+
+ ASSERT_EQ(actual, '\0');
+ }
+
+ exit(0);
+ }
+
+ /* Parent process. */
+
+ /* Parent simply waits on child. */
+ waitpid(pid, NULL, 0);
+
+ /* Ensure the range is unchanged in parent anon range. */
+ for (i = 0; i < 10 * page_size; i++) {
+ char expected = 'a' + (i % 26);
+ char actual = ptr[i];
+
+ ASSERT_EQ(actual, expected);
+ }
+
+ /* Cleanup. */
+ ASSERT_EQ(munmap(ptr, 10 * page_size), 0);
+}
+
/*
* Assert that forking a process with VMAs that do have VM_WIPEONFORK set
* behave as expected.
--
2.47.1
In 'NOFENTRY_ARGS' test case for syntax check, any offset X of
`vfs_read+X` except function entry offset (0) fits the criterion,
even if that offset is not at instruction boundary, as the parser
comes before probing. But with "ENDBR64" instruction on x86, offset
4 is treated as function entry. So, X can't be 4 as well. Thus, 8
was used as offset for the test case. On 64-bit powerpc though, any
offset <= 16 can be considered function entry depending on build
configuration (see arch_kprobe_on_func_entry() for implementation
details). So, use `vfs_read+20` to accommodate that scenario too.
Suggested-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Signed-off-by: Hari Bathini <hbathini(a)linux.ibm.com>
---
Changes in v2:
* Use 20 as offset for all arches.
.../selftests/ftrace/test.d/kprobe/kprobe_syntax_errors.tc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_syntax_errors.tc b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_syntax_errors.tc
index a16c6a6f6055..8f1c58f0c239 100644
--- a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_syntax_errors.tc
+++ b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_syntax_errors.tc
@@ -111,7 +111,7 @@ check_error 'p vfs_read $arg* ^$arg*' # DOUBLE_ARGS
if !grep -q 'kernel return probes support:' README; then
check_error 'r vfs_read ^$arg*' # NOFENTRY_ARGS
fi
-check_error 'p vfs_read+8 ^$arg*' # NOFENTRY_ARGS
+check_error 'p vfs_read+20 ^$arg*' # NOFENTRY_ARGS
check_error 'p vfs_read ^hoge' # NO_BTFARG
check_error 'p kfree ^$arg10' # NO_BTFARG (exceed the number of parameters)
check_error 'r kfree ^$retval' # NO_RETVAL
--
2.47.0
`MFD_NOEXEC_SEAL` should remove the executable bits and set `F_SEAL_EXEC`
to prevent further modifications to the executable bits as per the comment
in the uapi header file:
not executable and sealed to prevent changing to executable
However, commit 105ff5339f498a ("mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC")
that introduced this feature made it so that `MFD_NOEXEC_SEAL` unsets
`F_SEAL_SEAL`, essentially acting as a superset of `MFD_ALLOW_SEALING`.
Nothing implies that it should be so, and indeed up until the second version
of the of the patchset[0] that introduced `MFD_EXEC` and `MFD_NOEXEC_SEAL`,
`F_SEAL_SEAL` was not removed, however, it was changed in the third revision
of the patchset[1] without a clear explanation.
This behaviour is surprising for application developers, there is no
documentation that would reveal that `MFD_NOEXEC_SEAL` has the additional
effect of `MFD_ALLOW_SEALING`. Additionally, combined with `vm.memfd_noexec=2`
it has the effect of making all memfds initially sealable.
So do not remove `F_SEAL_SEAL` when `MFD_NOEXEC_SEAL` is requested,
thereby returning to the pre-Linux 6.3 behaviour of only allowing
sealing when `MFD_ALLOW_SEALING` is specified.
Now, this is technically a uapi break. However, the damage is expected
to be minimal. To trigger user visible change, a program has to do the
following steps:
- create memfd:
- with `MFD_NOEXEC_SEAL`,
- without `MFD_ALLOW_SEALING`;
- try to add seals / check the seals.
But that seems unlikely to happen intentionally since this change
essentially reverts the kernel's behaviour to that of Linux <6.3,
so if a program worked correctly on those older kernels, it will
likely work correctly after this change.
I have used Debian Code Search and GitHub to try to find potential
breakages, and I could only find a single one. dbus-broker's
memfd_create() wrapper is aware of this implicit `MFD_ALLOW_SEALING`
behaviour, and tries to work around it[2]. This workaround will
break. Luckily, this only affects the test suite, it does not affect
the normal operations of dbus-broker. There is a PR with a fix[3].
I also carried out a smoke test by building a kernel with this change
and booting an Arch Linux system into GNOME and Plasma sessions.
There was also a previous attempt to address this peculiarity by
introducing a new flag[4].
[0]: https://lore.kernel.org/lkml/20220805222126.142525-3-jeffxu@google.com/
[1]: https://lore.kernel.org/lkml/20221202013404.163143-3-jeffxu@google.com/
[2]: https://github.com/bus1/dbus-broker/blob/9eb0b7e5826fc76cad7b025bc46f267d4a…
[3]: https://github.com/bus1/dbus-broker/pull/366
[4]: https://lore.kernel.org/lkml/20230714114753.170814-1-david@readahead.eu/
Cc: stable(a)vger.kernel.org
Signed-off-by: Barnabás Pőcze <pobrn(a)protonmail.com>
---
* v3: https://lore.kernel.org/linux-mm/20240611231409.3899809-1-jeffxu@chromium.o…
* v2: https://lore.kernel.org/linux-mm/20240524033933.135049-1-jeffxu@google.com/
* v1: https://lore.kernel.org/linux-mm/20240513191544.94754-1-pobrn@protonmail.co…
This fourth version returns to removing the inconsistency as opposed to documenting
its existence, with the same code change as v1 but with a somewhat extended commit
message. This is sent because I believe it is worth at least a try; it can be easily
reverted if bigger application breakages are discovered than initially imagined.
---
mm/memfd.c | 9 ++++-----
tools/testing/selftests/memfd/memfd_test.c | 2 +-
2 files changed, 5 insertions(+), 6 deletions(-)
diff --git a/mm/memfd.c b/mm/memfd.c
index 7d8d3ab3fa37..8b7f6afee21d 100644
--- a/mm/memfd.c
+++ b/mm/memfd.c
@@ -356,12 +356,11 @@ SYSCALL_DEFINE2(memfd_create,
inode->i_mode &= ~0111;
file_seals = memfd_file_seals_ptr(file);
- if (file_seals) {
- *file_seals &= ~F_SEAL_SEAL;
+ if (file_seals)
*file_seals |= F_SEAL_EXEC;
- }
- } else if (flags & MFD_ALLOW_SEALING) {
- /* MFD_EXEC and MFD_ALLOW_SEALING are set */
+ }
+
+ if (flags & MFD_ALLOW_SEALING) {
file_seals = memfd_file_seals_ptr(file);
if (file_seals)
*file_seals &= ~F_SEAL_SEAL;
diff --git a/tools/testing/selftests/memfd/memfd_test.c b/tools/testing/selftests/memfd/memfd_test.c
index 95af2d78fd31..7b78329f65b6 100644
--- a/tools/testing/selftests/memfd/memfd_test.c
+++ b/tools/testing/selftests/memfd/memfd_test.c
@@ -1151,7 +1151,7 @@ static void test_noexec_seal(void)
mfd_def_size,
MFD_CLOEXEC | MFD_NOEXEC_SEAL);
mfd_assert_mode(fd, 0666);
- mfd_assert_has_seals(fd, F_SEAL_EXEC);
+ mfd_assert_has_seals(fd, F_SEAL_SEAL | F_SEAL_EXEC);
mfd_fail_chmod(fd, 0777);
close(fd);
}
--
2.45.2
Currently, sendmmsg is implemented in udpgso_bench_tx.c,
but it is not called by any test script.
This patch adds a test for sendmmsg in udpgso_bench.sh.
This allows for basic API testing and benchmarking
comparisons with GSO.
Signed-off-by: Kenjiro Nakayama <nakayamakenjiro(a)gmail.com>
---
tools/testing/selftests/net/udpgso_bench.sh | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/testing/selftests/net/udpgso_bench.sh b/tools/testing/selftests/net/udpgso_bench.sh
index 640bc43452fa..88fa1d53ba2b 100755
--- a/tools/testing/selftests/net/udpgso_bench.sh
+++ b/tools/testing/selftests/net/udpgso_bench.sh
@@ -92,6 +92,9 @@ run_udp() {
echo "udp"
run_in_netns ${args}
+ echo "udp sendmmsg"
+ run_in_netns ${args} -m
+
echo "udp gso"
run_in_netns ${args} -S 0
--
2.39.3 (Apple Git-146)
If fopen succeeds, the fscanf function is called to read the data.
Regardless of whether fscanf is successful, you need to run
fclose(proc) to prevent memory leaks.
Signed-off-by: liujing <liujing(a)cmss.chinamobile.com>
diff --git a/tools/testing/selftests/timens/procfs.c b/tools/testing/selftests/timens/procfs.c
index 1833ca97eb24..e47844a73c31 100644
--- a/tools/testing/selftests/timens/procfs.c
+++ b/tools/testing/selftests/timens/procfs.c
@@ -79,9 +79,11 @@ static int read_proc_uptime(struct timespec *uptime)
if (fscanf(proc, "%lu.%02lu", &up_sec, &up_nsec) != 2) {
if (errno) {
pr_perror("fscanf");
+ fclose(proc);
return -errno;
}
pr_err("failed to parse /proc/uptime");
+ fclose(proc);
return -1;
}
fclose(proc);
--
2.27.0
After using the fopen function successfully, you need to call
fclose to close the file before finally returning.
Signed-off-by: liujing <liujing(a)cmss.chinamobile.com>
diff --git a/tools/testing/selftests/prctl/set-process-name.c b/tools/testing/selftests/prctl/set-process-name.c
index 562f707ba771..7be7afff0cd2 100644
--- a/tools/testing/selftests/prctl/set-process-name.c
+++ b/tools/testing/selftests/prctl/set-process-name.c
@@ -66,9 +66,12 @@ int check_name(void)
return -EIO;
fscanf(fptr, "%s", output);
- if (ferror(fptr))
+ if (ferror(fptr)) {
+ fclose(fptr);
return -EIO;
+ }
+ fclose(fptr);
int res = prctl(PR_GET_NAME, name, NULL, NULL, NULL);
if (res < 0)
--
2.27.0
Commit f803bcf9208a ("selftests/bpf: Prevent client connect before
server bind in test_tc_tunnel.sh") added code that waits for the
netcat server to start before the netcat client attempts to connect to
it. However, not all calls to 'server_listen' were guarded.
This patch adds the existing 'wait_for_port' guard after the remaining
call to 'server_listen'.
Fixes: f803bcf9208a ("selftests/bpf: Prevent client connect before server bind in test_tc_tunnel.sh")
Signed-off-by: Marco Leogrande <leogrande(a)google.com>
---
tools/testing/selftests/bpf/test_tc_tunnel.sh | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh
index 7989ec6084545..cb55a908bb0d7 100755
--- a/tools/testing/selftests/bpf/test_tc_tunnel.sh
+++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh
@@ -305,6 +305,7 @@ else
client_connect
verify_data
server_listen
+ wait_for_port ${port} ${netcat_opt}
fi
# serverside, use BPF for decap
--
2.47.0.338.g60cca15819-goog
The current implementation of netconsole sends all log messages in
parallel, which can lead to an intermixed and interleaved output on the
receiving side. This makes it challenging to demultiplex the messages
and attribute them to their originating CPUs.
As a result, users and developers often struggle to effectively analyze
and debug the parallel log output received through netconsole.
Example of a message got from produciton hosts:
------------[ cut here ]------------
------------[ cut here ]------------
refcount_t: saturated; leaking memory.
WARNING: CPU: 2 PID: 1613668 at lib/refcount.c:22 refcount_warn_saturate+0x5e/0xe0
refcount_t: addition on 0; use-after-free.
WARNING: CPU: 26 PID: 4139916 at lib/refcount.c:25 refcount_warn_saturate+0x7d/0xe0
Modules linked in: bpf_preload(E) vhost_net(E) tun(E) vhost(E)
This series of patches introduces a new feature to the netconsole
subsystem that allows the automatic population of the CPU number in the
userdata field for each log message. This enhancement provides several
benefits:
* Improved demultiplexing of parallel log output: When multiple CPUs are
sending messages concurrently, the added CPU number in the userdata
makes it easier to differentiate and attribute the messages to their
originating CPUs.
* Better visibility into message sources: The CPU number information
gives users and developers more insight into which specific CPU a
particular log message came from, which can be valuable for debugging
and analysis.
The changes in this series are as follows:
Patch "Ensure dynamic_netconsole_mutex is held during userdata update"
Add a lockdep assert to make sure dynamic_netconsole_mutex is held when
calling update_userdata().
Patch "netconsole: Add option to auto-populate CPU number in userdata"
Adds a new option to enable automatic CPU number population in the
netconsole userdata Provides a new "populate_cpu_nr" sysfs attribute to
control this feature
Patch "netconsole: selftest: test CPU number auto-population"
Expands the existing netconsole selftest to verify the CPU number
auto-population functionality Ensures the received netconsole messages
contain the expected "cpu=" entry in the userdata
Patch "netconsole: docs: Add documentation for CPU number auto-population"
Updates the netconsole documentation to explain the new CPU number
auto-population feature Provides instructions on how to enable and use
the feature
I believe these changes will be a valuable addition to the netconsole
subsystem, enhancing its usefulness for kernel developers and users.
Signed-off-by: Breno Leitao <leitao(a)debian.org>
---
Breno Leitao (4):
netconsole: Ensure dynamic_netconsole_mutex is held during userdata update
netconsole: Add option to auto-populate CPU number in userdata
netconsole: docs: Add documentation for CPU number auto-population
netconsole: selftest: Validate CPU number auto-population in userdata
Documentation/networking/netconsole.rst | 44 +++++++++++++++
drivers/net/netconsole.c | 63 ++++++++++++++++++++++
.../testing/selftests/drivers/net/netcons_basic.sh | 18 +++++++
3 files changed, 125 insertions(+)
---
base-commit: a58f00ed24b849d449f7134fd5d86f07090fe2f5
change-id: 20241108-netcon_cpu-ce3917e88f4b
Best regards,
--
Breno Leitao <leitao(a)debian.org>
Changes v6:
- Rebase onto latest kselftests-next.
- Looking at the two patches with a fresh eye decided to make a split
along the lines of:
- Patch 1/2 contains all of the code that relates to SNC mode
detection and checking that detection's reliability.
- Patch 2/2 contains checking kernel support for SNC and
modifying the messages at the end of affected tests.
Changes v5:
- Tests are skipped if snc_unreliable was set.
- Moved resctrlfs.c changes from patch 2/2 to 1/2.
- Removed CAT changes since it's not impacted by SNC in the selftest.
- Updated various comments.
- Fixed a bunch of minor issues pointed out in the review.
Changes v4:
- Printing SNC warnings at the start of every test.
- Printing SNC warnings at the end of every relevant test.
- Remove global snc_mode variable, consolidate snc detection functions
into one.
- Correct minor mistakes.
Changes v3:
- Reworked patch 2.
- Changed minor things in patch 1 like function name and made
corrections to the patch message.
Changes v2:
- Removed patches 2 and 3 since now this part will be supported by the
kernel.
Sub-Numa Clustering (SNC) allows splitting CPU cores, caches and memory
into multiple NUMA nodes. When enabled, NUMA-aware applications can
achieve better performance on bigger server platforms.
SNC support in the kernel was merged into x86/cache [1]. With SNC enabled
and kernel support in place all the tests will function normally (aside
from effective cache size). There might be a problem when SNC is enabled
but the system is still using an older kernel version without SNC
support. Currently the only message displayed in that situation is a
guess that SNC might be enabled and is causing issues. That message also
is displayed whenever the test fails on an Intel platform.
Add a mechanism to discover kernel support for SNC which will add more
meaning and certainty to the error message.
Add runtime SNC mode detection and verify how reliable that information
is.
Series was tested on Ice Lake server platforms with SNC disabled, SNC-2
and SNC-4. The tests were also ran with and without kernel support for
SNC.
Series applies cleanly on kselftest/next.
[1] https://lore.kernel.org/all/20240628215619.76401-1-tony.luck@intel.com/
Previous versions of this series:
[v1] https://lore.kernel.org/all/cover.1709721159.git.maciej.wieczor-retman@inte…
[v2] https://lore.kernel.org/all/cover.1715769576.git.maciej.wieczor-retman@inte…
[v3] https://lore.kernel.org/all/cover.1719842207.git.maciej.wieczor-retman@inte…
[v4] https://lore.kernel.org/all/cover.1720774981.git.maciej.wieczor-retman@inte…
[v5] https://lore.kernel.org/all/cover.1730206468.git.maciej.wieczor-retman@inte…
Maciej Wieczor-Retman (2):
selftests/resctrl: Adjust effective L3 cache size with SNC enabled
selftests/resctrl: Discover SNC kernel support and adjust messages
tools/testing/selftests/resctrl/cmt_test.c | 4 +-
tools/testing/selftests/resctrl/mba_test.c | 2 +
tools/testing/selftests/resctrl/mbm_test.c | 4 +-
tools/testing/selftests/resctrl/resctrl.h | 5 +
.../testing/selftests/resctrl/resctrl_tests.c | 9 +-
tools/testing/selftests/resctrl/resctrlfs.c | 129 ++++++++++++++++++
6 files changed, 148 insertions(+), 5 deletions(-)
--
2.47.1
From: Yicong Yang <yangyicong(a)hisilicon.com>
Armv8.7 introduces single-copy atomic 64-byte loads and stores
instructions and its variants named under FEAT_{LS64, LS64_V,
LS64_ACCDATA}. Add support for Armv8.7 FEAT_{LS64, LS64_V, LS64_ACCDATA}:
- Add identifying and enabling in the cpufeature list
- Expose the support of these features to userspace through HWCAP3
and cpuinfo
- Add related hwcap test
- Handle the trap of unsupported memory (normal/uncacheable) access in a VM
A real scenario for this feature is that the userspace driver can make use of
this to implement direct WQE (workqueue entry) - a mechanism to fill WQE
directly into the hardware.
This patchset also depends on Marc's patchset[1] for enabling related
features in a VM, HCRX trap handling, etc.
[1] https://lore.kernel.org/linux-arm-kernel/20240815125959.2097734-1-maz@kerne…
Tested with updated hwcap test:
On host:
root@localhost:/# dmesg | grep "All CPU(s) started"
[ 1.600263] CPU: All CPU(s) started at EL2
root@localhost:/# ./hwcap
[snip...]
# LS64 present
ok 169 cpuinfo_match_LS64
ok 170 sigill_LS64
ok 171 # SKIP sigbus_LS64
# LS64_V present
ok 172 cpuinfo_match_LS64_V
ok 173 sigill_LS64_V
ok 174 # SKIP sigbus_LS64_V
# LS64_ACCDATA present
ok 175 cpuinfo_match_LS64_ACCDATA
ok 176 sigill_LS64_ACCDATA
ok 177 # SKIP sigbus_LS64_ACCDATA
# Totals: pass:92 fail:0 xfail:0 xpass:0 skip:85 error:0
On guest:
root@localhost:/# dmesg | grep "All CPU(s) started"
[ 1.375272] CPU: All CPU(s) started at EL1
root@localhost:/# ./hwcap
[snip...]
# LS64 present
ok 169 cpuinfo_match_LS64
ok 170 sigill_LS64
ok 171 # SKIP sigbus_LS64
# LS64_V present
ok 172 cpuinfo_match_LS64_V
ok 173 sigill_LS64_V
ok 174 # SKIP sigbus_LS64_V
# LS64_ACCDATA present
ok 175 cpuinfo_match_LS64_ACCDATA
ok 176 sigill_LS64_ACCDATA
ok 177 # SKIP sigbus_LS64_ACCDATA
# Totals: pass:92 fail:0 xfail:0 xpass:0 skip:85 error:0
Yicong Yang (5):
arm64: Provide basic EL2 setup for FEAT_{LS64, LS64_V, LS64_ACCDATA}
usage at EL0/1
arm64: Add support for FEAT_{LS64, LS64_V, LS64_ACCDATA}
kselftest/arm64: Add HWCAP test for FEAT_{LS64, LS64_V, LS64_ACCDATA}
arm64: Add ESR.DFSC definition of unsupported exclusive or atomic
access
KVM: arm64: Handle DABT caused by LS64* instructions on unsupported
memory
Documentation/arch/arm64/booting.rst | 28 +++++
Documentation/arch/arm64/elf_hwcaps.rst | 9 ++
arch/arm64/include/asm/el2_setup.h | 26 ++++-
arch/arm64/include/asm/esr.h | 8 ++
arch/arm64/include/asm/hwcap.h | 3 +
arch/arm64/include/uapi/asm/hwcap.h | 3 +
arch/arm64/kernel/cpufeature.c | 70 +++++++++++-
arch/arm64/kernel/cpuinfo.c | 3 +
arch/arm64/kvm/mmu.c | 14 +++
arch/arm64/tools/cpucaps | 3 +
tools/testing/selftests/arm64/abi/hwcap.c | 127 ++++++++++++++++++++++
11 files changed, 291 insertions(+), 3 deletions(-)
--
2.24.0
This version 5 patch series replace direct error handling methods with ksft
macros, which provide better reporting.Currently, when the tmpfs test runs,
it does not display any output if it passes,and if it fails
(particularly when not run as root),it simply exits without any warning or
message.
This series of patch adds:
1. Add 'ksft_print_header()' and 'ksft_set_plan()'
to structure test outputs more effectively.
2. Error if not run as root.
3. Replace direct error handling with 'ksft_test_result_*',
'ksft_exit_fail_msg' macros for better reporting.
v4->v5:
- Remove unnecessary pass messages.
- Remove unnecessary use of KSFT_SKIP.
- Add appropriate use of ksft_exit_fail_msg.
v4 1/2: https://lore.kernel.org/all/20241105202639.1977356-2-cvam0000@gmail.com/
v4 2/2: https://lore.kernel.org/all/20241105202639.1977356-3-cvam0000@gmail.com/
v3->v4:
- Start a patchset
- Split patch into smaller patches to make it easy to review.
Patch1 Replace 'ksft_test_result_skip' with 'KSFT_SKIP' during root run check.
Patch2 Replace 'ksft_test_result_fail' with 'KSFT_SKIP' where fail does not make sense,
or failure could be due to not unsupported APIs with appropriate warnings.
v3: https://lore.kernel.org/all/20241028185756.111832-1-cvam0000@gmail.com/
v2->v3:
- Remove extra ksft_set_plan()
- Remove function for unshare()
- Fix the comment style
v2: https://lore.kernel.org/all/20241026191621.2860376-1-cvam0000@gmail.com/
v1->v2:
- Make the commit message more clear.
v1: https://lore.kernel.org/all/20241024200228.1075840-1-cvam0000@gmail.com/T/#u
thanks
Shivam
Shivam Chaudhary (2):
selftests: tmpfs: Add Test-fail if not run as root
selftests: tmpfs: Add kselftest support to tmpfs
.../selftests/tmpfs/bug-link-o-tmpfile.c | 60 ++++++++++++-------
1 file changed, 37 insertions(+), 23 deletions(-)
--
2.45.2
If the selftest is not running as root, it should fail and
give an appropriate warning to the user. This patch adds
ksft_exit_fail_msg() if the test is not running as root.
Logs:
Before change:
TAP version 13
1..1
ok 1 # SKIP This test needs root to run!
After change:
TAP version 13
1..1
Bail out! Error : Need to run as root# Planned tests != run tests (1 != 0)
Totals: pass:0 fail:0 xfail:0 xpass:0 skip:0 error:0
Signed-off-by: Shivam Chaudhary <cvam0000(a)gmail.com>
---
tools/testing/selftests/acct/acct_syscall.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/tools/testing/selftests/acct/acct_syscall.c b/tools/testing/selftests/acct/acct_syscall.c
index e44e8fe1f4a3..7c65deef54e3 100644
--- a/tools/testing/selftests/acct/acct_syscall.c
+++ b/tools/testing/selftests/acct/acct_syscall.c
@@ -24,8 +24,7 @@ int main(void)
// Check if test is run a root
if (geteuid()) {
- ksft_test_result_skip("This test needs root to run!\n");
- return 1;
+ ksft_exit_fail_msg("Error : Need to run as root");
}
// Create file to log closed processes
--
2.45.2
The kernel has recently added support for shadow stacks, currently
x86 only using their CET feature but both arm64 and RISC-V have
equivalent features (GCS and Zicfiss respectively), I am actively
working on GCS[1]. With shadow stacks the hardware maintains an
additional stack containing only the return addresses for branch
instructions which is not generally writeable by userspace and ensures
that any returns are to the recorded addresses. This provides some
protection against ROP attacks and making it easier to collect call
stacks. These shadow stacks are allocated in the address space of the
userspace process.
Our API for shadow stacks does not currently offer userspace any
flexiblity for managing the allocation of shadow stacks for newly
created threads, instead the kernel allocates a new shadow stack with
the same size as the normal stack whenever a thread is created with the
feature enabled. The stacks allocated in this way are freed by the
kernel when the thread exits or shadow stacks are disabled for the
thread. This lack of flexibility and control isn't ideal, in the vast
majority of cases the shadow stack will be over allocated and the
implicit allocation and deallocation is not consistent with other
interfaces. As far as I can tell the interface is done in this manner
mainly because the shadow stack patches were in development since before
clone3() was implemented.
Since clone3() is readily extensible let's add support for specifying a
shadow stack when creating a new thread or process, keeping the current
implicit allocation behaviour if one is not specified either with
clone3() or through the use of clone(). The user must provide a shadow
stack pointer, this must point to memory mapped for use as a shadow
stackby map_shadow_stack() with an architecture specified shadow stack
token at the top of the stack.
Please note that the x86 portions of this code are build tested only, I
don't appear to have a system that can run CET available to me.
[1] https://lore.kernel.org/linux-arm-kernel/20241001-arm64-gcs-v13-0-222b78d87…
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Changes in v13:
- Rebase onto v6.13-rc1.
- Link to v12: https://lore.kernel.org/r/20241031-clone3-shadow-stack-v12-0-7183eb8bee17@k…
Changes in v12:
- Add the regular prctl() to the userspace API document since arm64
support is queued in -next.
- Link to v11: https://lore.kernel.org/r/20241005-clone3-shadow-stack-v11-0-2a6a2bd6d651@k…
Changes in v11:
- Rebase onto arm64 for-next/gcs, which is based on v6.12-rc1, and
integrate arm64 support.
- Rework the interface to specify a shadow stack pointer rather than a
base and size like we do for the regular stack.
- Link to v10: https://lore.kernel.org/r/20240821-clone3-shadow-stack-v10-0-06e8797b9445@k…
Changes in v10:
- Integrate fixes & improvements for the x86 implementation from Rick
Edgecombe.
- Require that the shadow stack be VM_WRITE.
- Require that the shadow stack base and size be sizeof(void *) aligned.
- Clean up trailing newline.
- Link to v9: https://lore.kernel.org/r/20240819-clone3-shadow-stack-v9-0-962d74f99464@ke…
Changes in v9:
- Pull token validation earlier and report problems with an error return
to parent rather than signal delivery to the child.
- Verify that the top of the supplied shadow stack is VM_SHADOW_STACK.
- Rework token validation to only do the page mapping once.
- Drop no longer needed support for testing for signals in selftest.
- Fix typo in comments.
- Link to v8: https://lore.kernel.org/r/20240808-clone3-shadow-stack-v8-0-0acf37caf14c@ke…
Changes in v8:
- Fix token verification with user specified shadow stack.
- Don't track user managed shadow stacks for child processes.
- Link to v7: https://lore.kernel.org/r/20240731-clone3-shadow-stack-v7-0-a9532eebfb1d@ke…
Changes in v7:
- Rebase onto v6.11-rc1.
- Typo fixes.
- Link to v6: https://lore.kernel.org/r/20240623-clone3-shadow-stack-v6-0-9ee7783b1fb9@ke…
Changes in v6:
- Rebase onto v6.10-rc3.
- Ensure we don't try to free the parent shadow stack in error paths of
x86 arch code.
- Spelling fixes in userspace API document.
- Additional cleanups and improvements to the clone3() tests to support
the shadow stack tests.
- Link to v5: https://lore.kernel.org/r/20240203-clone3-shadow-stack-v5-0-322c69598e4b@ke…
Changes in v5:
- Rebase onto v6.8-rc2.
- Rework ABI to have the user allocate the shadow stack memory with
map_shadow_stack() and a token.
- Force inlining of the x86 shadow stack enablement.
- Move shadow stack enablement out into a shared header for reuse by
other tests.
- Link to v4: https://lore.kernel.org/r/20231128-clone3-shadow-stack-v4-0-8b28ffe4f676@ke…
Changes in v4:
- Formatting changes.
- Use a define for minimum shadow stack size and move some basic
validation to fork.c.
- Link to v3: https://lore.kernel.org/r/20231120-clone3-shadow-stack-v3-0-a7b8ed3e2acc@ke…
Changes in v3:
- Rebase onto v6.7-rc2.
- Remove stale shadow_stack in internal kargs.
- If a shadow stack is specified unconditionally use it regardless of
CLONE_ parameters.
- Force enable shadow stacks in the selftest.
- Update changelogs for RISC-V feature rename.
- Link to v2: https://lore.kernel.org/r/20231114-clone3-shadow-stack-v2-0-b613f8681155@ke…
Changes in v2:
- Rebase onto v6.7-rc1.
- Remove ability to provide preallocated shadow stack, just specify the
desired size.
- Link to v1: https://lore.kernel.org/r/20231023-clone3-shadow-stack-v1-0-d867d0b5d4d0@ke…
---
Mark Brown (8):
arm64/gcs: Return a success value from gcs_alloc_thread_stack()
Documentation: userspace-api: Add shadow stack API documentation
selftests: Provide helper header for shadow stack testing
fork: Add shadow stack support to clone3()
selftests/clone3: Remove redundant flushes of output streams
selftests/clone3: Factor more of main loop into test_clone3()
selftests/clone3: Allow tests to flag if -E2BIG is a valid error code
selftests/clone3: Test shadow stack support
Documentation/userspace-api/index.rst | 1 +
Documentation/userspace-api/shadow_stack.rst | 42 ++++
arch/arm64/include/asm/gcs.h | 8 +-
arch/arm64/kernel/process.c | 8 +-
arch/arm64/mm/gcs.c | 62 +++++-
arch/x86/include/asm/shstk.h | 11 +-
arch/x86/kernel/process.c | 2 +-
arch/x86/kernel/shstk.c | 57 +++++-
include/asm-generic/cacheflush.h | 11 ++
include/linux/sched/task.h | 17 ++
include/uapi/linux/sched.h | 10 +-
kernel/fork.c | 96 +++++++--
tools/testing/selftests/clone3/clone3.c | 226 ++++++++++++++++++----
tools/testing/selftests/clone3/clone3_selftests.h | 65 ++++++-
tools/testing/selftests/ksft_shstk.h | 98 ++++++++++
15 files changed, 633 insertions(+), 81 deletions(-)
---
base-commit: 40384c840ea1944d7c5a392e8975ed088ecf0b37
change-id: 20231019-clone3-shadow-stack-15d40d2bf536
Best regards,
--
Mark Brown <broonie(a)kernel.org>
There are a few patches in vIRQ series that rework the fault.c file. So,
we should fix this before that bigger series touches the same code.
And add missing coverage for IOMMU_FAULT_QUEUE_ALLOC in iommufd_fail_nth.
Thanks!
Nicolin
Nicolin Chen (2):
iommufd/fault: Fix out_fput in iommufd_fault_alloc()
iommufd/selftest: Cover IOMMU_FAULT_QUEUE_ALLOC in iommufd_fail_nth
drivers/iommu/iommufd/fault.c | 2 --
tools/testing/selftests/iommu/iommufd_fail_nth.c | 14 ++++++++++++++
2 files changed, 14 insertions(+), 2 deletions(-)
--
2.43.0
Currently, sendmmsg is implemented in udpgso_bench_tx.c,
but it is not called by any test script.
This patch adds a test for sendmmsg in udpgso_bench.sh.
This allows for basic API testing and benchmarking
comparisons with GSO.
---
tools/testing/selftests/net/udpgso_bench.sh | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/testing/selftests/net/udpgso_bench.sh b/tools/testing/selftests/net/udpgso_bench.sh
index 640bc43452fa..88fa1d53ba2b 100755
--- a/tools/testing/selftests/net/udpgso_bench.sh
+++ b/tools/testing/selftests/net/udpgso_bench.sh
@@ -92,6 +92,9 @@ run_udp() {
echo "udp"
run_in_netns ${args}
+ echo "udp sendmmsg"
+ run_in_netns ${args} -m
+
echo "udp gso"
run_in_netns ${args} -S 0
--
2.39.3 (Apple Git-146)
As discussed in the v1 [1], with guest_memfd moving from KVM to mm, it
is more practical to have a non-KVM-specific API to populate guest
memory in a generic way. The series proposes using the write syscall
for this purpose instead of a KVM ioctl as in the v1. The approach also
has an advantage that the guest_memfd handle can be sent to another
process that would be responsible for population. I also included a
suggestion from Mike Day for excluding the code from compilation if AMD
SEV is configured.
There is a potential for refactoring of the kvm_gmem_populate to extract
common parts with the write. I did not do that in this series yet to
keep it clear what the write would do and get feedback on whether
write's behaviour is sensible.
Nikita
[1]: https://lore.kernel.org/kvm/20241024095429.54052-1-kalyazin@amazon.com/T/
Nikita Kalyazin (2):
KVM: guest_memfd: add generic population via write
KVM: selftests: update guest_memfd write tests
.../testing/selftests/kvm/guest_memfd_test.c | 85 +++++++++++++++++--
virt/kvm/guest_memfd.c | 79 +++++++++++++++++
2 files changed, 158 insertions(+), 6 deletions(-)
base-commit: 1508bae37044ebffd7c7e09915f041936f338123
--
2.40.1
Add ip_link_set_addr(), ip_link_set_up(), ip_addr_add() and ip_route_add()
to the suite of helpers that automatically schedule a corresponding
cleanup.
When setting a new MAC, one needs to remember the old address first. Move
mac_get() from forwarding/ to that end.
Signed-off-by: Petr Machata <petrm(a)nvidia.com>
Reviewed-by: Ido Schimmel <idosch(a)nvidia.com>
---
Notes:
CC: Shuah Khan <shuah(a)kernel.org>
CC: Benjamin Poirier <bpoirier(a)nvidia.com>
CC: Hangbin Liu <liuhangbin(a)gmail.com>
CC: Vladimir Oltean <vladimir.oltean(a)nxp.com>
CC: linux-kselftest(a)vger.kernel.org
tools/testing/selftests/net/forwarding/lib.sh | 7 ----
tools/testing/selftests/net/lib.sh | 39 +++++++++++++++++++
2 files changed, 39 insertions(+), 7 deletions(-)
diff --git a/tools/testing/selftests/net/forwarding/lib.sh b/tools/testing/selftests/net/forwarding/lib.sh
index 7337f398f9cc..1fd40bada694 100644
--- a/tools/testing/selftests/net/forwarding/lib.sh
+++ b/tools/testing/selftests/net/forwarding/lib.sh
@@ -932,13 +932,6 @@ packets_rate()
echo $(((t1 - t0) / interval))
}
-mac_get()
-{
- local if_name=$1
-
- ip -j link show dev $if_name | jq -r '.[]["address"]'
-}
-
ether_addr_to_u64()
{
local addr="$1"
diff --git a/tools/testing/selftests/net/lib.sh b/tools/testing/selftests/net/lib.sh
index 5ea6537acd2b..2cd5c743b2d9 100644
--- a/tools/testing/selftests/net/lib.sh
+++ b/tools/testing/selftests/net/lib.sh
@@ -435,6 +435,13 @@ xfail_on_veth()
fi
}
+mac_get()
+{
+ local if_name=$1
+
+ ip -j link show dev $if_name | jq -r '.[]["address"]'
+}
+
kill_process()
{
local pid=$1; shift
@@ -459,3 +466,35 @@ ip_link_set_master()
ip link set dev "$member" master "$master"
defer ip link set dev "$member" nomaster
}
+
+ip_link_set_addr()
+{
+ local name=$1; shift
+ local addr=$1; shift
+
+ local old_addr=$(mac_get "$name")
+ ip link set dev "$name" address "$addr"
+ defer ip link set dev "$name" address "$old_addr"
+}
+
+ip_link_set_up()
+{
+ local name=$1; shift
+
+ ip link set dev "$name" up
+ defer ip link set dev "$name" down
+}
+
+ip_addr_add()
+{
+ local name=$1; shift
+
+ ip addr add dev "$name" "$@"
+ defer ip addr del dev "$name" "$@"
+}
+
+ip_route_add()
+{
+ ip route add "$@"
+ defer ip route del "$@"
+}
--
2.47.0
If the coreutils variant of ping is used instead of the busybox one, the
ping_do() command is broken. This comes by the fact that for coreutils
ping, the ping IP needs to be the very last elements.
To handle this, reorder the ping args and make $dip last element.
The use of coreutils ping might be useful for case where busybox is not
compiled with float interval support and ping command doesn't support
0.1 interval. (in such case a dedicated ping utility is installed
instead)
Cc: stable(a)vger.kernel.org
Fixes: 73bae6736b6b ("selftests: forwarding: Add initial testing framework")
Signed-off-by: Christian Marangi <ansuelsmth(a)gmail.com>
---
tools/testing/selftests/net/forwarding/lib.sh | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/net/forwarding/lib.sh b/tools/testing/selftests/net/forwarding/lib.sh
index c992e385159c..2060f95d5c62 100644
--- a/tools/testing/selftests/net/forwarding/lib.sh
+++ b/tools/testing/selftests/net/forwarding/lib.sh
@@ -1473,8 +1473,8 @@ ping_do()
vrf_name=$(master_name_get $if_name)
ip vrf exec $vrf_name \
- $PING $args $dip -c $PING_COUNT -i 0.1 \
- -w $PING_TIMEOUT &> /dev/null
+ $PING $args -c $PING_COUNT -i 0.1 \
+ -w $PING_TIMEOUT $dip &> /dev/null
}
ping_test()
@@ -1504,8 +1504,8 @@ ping6_do()
vrf_name=$(master_name_get $if_name)
ip vrf exec $vrf_name \
- $PING6 $args $dip -c $PING_COUNT -i 0.1 \
- -w $PING_TIMEOUT &> /dev/null
+ $PING6 $args -c $PING_COUNT -i 0.1 \
+ -w $PING_TIMEOUT $dip &> /dev/null
}
ping6_test()
--
2.45.2
Create a test for the robust list mechanism. Test the following uAPI
operations:
- Creating a robust mutex where the lock waiter is wake by the kernel
when the lock owner died
- Setting a robust list to the current task
- Getting a robust list from the current task
- Getting a robust list from another task
- Using the list_op_pending field from robust_list_head struct to test
robustness when the lock owner dies before completing the locking
- Setting a invalid size for syscall argument `len`
Signed-off-by: André Almeida <andrealmeid(a)igalia.com>
---
Changes from v2:
- Dropped kselftest_harness include, using just futex test API
- This is the expected output:
TAP version 13
1..6
ok 1 test_robustness
ok 2 test_set_robust_list_invalid_size
ok 3 test_get_robust_list_self
ok 4 test_get_robust_list_child
ok 5 test_set_list_op_pending
ok 6 test_robust_list_multiple_elements
# Totals: pass:6 fail:0 xfail:0 xpass:0 skip:0 error:0
Changes from v1:
- Change futex type from int to _Atomic(unsigned int)
- Use old futex(FUTEX_WAIT) instead of the new sys_futex_wait()
---
.../selftests/futex/functional/.gitignore | 1 +
.../selftests/futex/functional/Makefile | 3 +-
.../selftests/futex/functional/robust_list.c | 512 ++++++++++++++++++
3 files changed, 515 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/futex/functional/robust_list.c
diff --git a/tools/testing/selftests/futex/functional/.gitignore b/tools/testing/selftests/futex/functional/.gitignore
index fbcbdb6963b3..4726e1be7497 100644
--- a/tools/testing/selftests/futex/functional/.gitignore
+++ b/tools/testing/selftests/futex/functional/.gitignore
@@ -9,3 +9,4 @@ futex_wait_wouldblock
futex_wait
futex_requeue
futex_waitv
+robust_list
diff --git a/tools/testing/selftests/futex/functional/Makefile b/tools/testing/selftests/futex/functional/Makefile
index f79f9bac7918..b8635a1ac7f6 100644
--- a/tools/testing/selftests/futex/functional/Makefile
+++ b/tools/testing/selftests/futex/functional/Makefile
@@ -17,7 +17,8 @@ TEST_GEN_PROGS := \
futex_wait_private_mapped_file \
futex_wait \
futex_requeue \
- futex_waitv
+ futex_waitv \
+ robust_list
TEST_PROGS := run.sh
diff --git a/tools/testing/selftests/futex/functional/robust_list.c b/tools/testing/selftests/futex/functional/robust_list.c
new file mode 100644
index 000000000000..7bfda2d39821
--- /dev/null
+++ b/tools/testing/selftests/futex/functional/robust_list.c
@@ -0,0 +1,512 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2024 Igalia S.L.
+ *
+ * Robust list test by André Almeida <andrealmeid(a)igalia.com>
+ *
+ * The robust list uAPI allows userspace to create "robust" locks, in the sense
+ * that if the lock holder thread dies, the remaining threads that are waiting
+ * for the lock won't block forever, waiting for a lock that will never be
+ * released.
+ *
+ * This is achieve by userspace setting a list where a thread can enter all the
+ * locks (futexes) that it is holding. The robust list is a linked list, and
+ * userspace register the start of the list with the syscall set_robust_list().
+ * If such thread eventually dies, the kernel will walk this list, waking up one
+ * thread waiting for each futex and marking the futex word with the flag
+ * FUTEX_OWNER_DIED.
+ *
+ * See also
+ * man set_robust_list
+ * Documententation/locking/robust-futex-ABI.rst
+ * Documententation/locking/robust-futexes.rst
+ */
+
+#define _GNU_SOURCE
+
+#include "futextest.h"
+#include "logging.h"
+
+#include <errno.h>
+#include <pthread.h>
+#include <signal.h>
+#include <stdatomic.h>
+#include <stdbool.h>
+#include <stddef.h>
+#include <sys/mman.h>
+#include <sys/wait.h>
+
+#define STACK_SIZE (1024 * 1024)
+
+#define FUTEX_TIMEOUT 3
+
+static pthread_barrier_t barrier, barrier2;
+
+int set_robust_list(struct robust_list_head *head, size_t len)
+{
+ return syscall(SYS_set_robust_list, head, len);
+}
+
+int get_robust_list(int pid, struct robust_list_head **head, size_t *len_ptr)
+{
+ return syscall(SYS_get_robust_list, pid, head, len_ptr);
+}
+
+/*
+ * Basic lock struct, contains just the futex word and the robust list element
+ * Real implementations have also a *prev to easily walk in the list
+ */
+struct lock_struct {
+ _Atomic(unsigned int) futex;
+ struct robust_list list;
+};
+
+/*
+ * Helper function to spawn a child thread. Returns -1 on error, pid on success
+ */
+static int create_child(int (*fn)(void *arg), void *arg)
+{
+ char *stack;
+ pid_t pid;
+
+ stack = mmap(NULL, STACK_SIZE, PROT_READ | PROT_WRITE,
+ MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
+ if (stack == MAP_FAILED)
+ return -1;
+
+ stack += STACK_SIZE;
+
+ pid = clone(fn, stack, CLONE_VM | SIGCHLD, arg);
+
+ if (pid == -1)
+ return -1;
+
+ return pid;
+}
+
+/*
+ * Helper function to prepare and register a robust list
+ */
+static int set_list(struct robust_list_head *head)
+{
+ int ret;
+
+ ret = set_robust_list(head, sizeof(struct robust_list_head));
+ if (ret)
+ return ret;
+
+ head->futex_offset = (size_t) offsetof(struct lock_struct, futex) -
+ (size_t) offsetof(struct lock_struct, list);
+ head->list.next = &head->list;
+ head->list_op_pending = NULL;
+
+ return 0;
+}
+
+/*
+ * A basic (and incomplete) mutex lock function with robustness
+ */
+static int mutex_lock(struct lock_struct *lock, struct robust_list_head *head, bool error_inject)
+{
+ _Atomic(unsigned int) *futex = &lock->futex;
+ int zero = 0, ret = -1;
+ pid_t tid = gettid();
+
+ /*
+ * Set list_op_pending before starting the lock, so the kernel can catch
+ * the case where the thread died during the lock operation
+ */
+ head->list_op_pending = &lock->list;
+
+ if (atomic_compare_exchange_strong(futex, &zero, tid)) {
+ /*
+ * We took the lock, insert it in the robust list
+ */
+ struct robust_list *list = &head->list;
+
+ /* Error injection to test list_op_pending */
+ if (error_inject)
+ return 0;
+
+ while (list->next != &head->list)
+ list = list->next;
+
+ list->next = &lock->list;
+ lock->list.next = &head->list;
+
+ ret = 0;
+ } else {
+ /*
+ * We didn't take the lock, wait until the owner wakes (or dies)
+ */
+ struct timespec to;
+
+ clock_gettime(CLOCK_MONOTONIC, &to);
+ to.tv_sec = to.tv_sec + FUTEX_TIMEOUT;
+
+ tid = atomic_load(futex);
+ /* Kernel ignores futexes without the waiters flag */
+ tid |= FUTEX_WAITERS;
+ atomic_store(futex, tid);
+
+ ret = futex_wait((futex_t *) futex, tid, &to, 0);
+
+ /*
+ * A real mutex_lock() implementation would loop here to finally
+ * take the lock. We don't care about that, so we stop here.
+ */
+ }
+
+ head->list_op_pending = NULL;
+
+ return ret;
+}
+
+/*
+ * This child thread will succeed taking the lock, and then will exit holding it
+ */
+static int child_fn_lock(void *arg)
+{
+ struct lock_struct *lock = (struct lock_struct *) arg;
+ struct robust_list_head head;
+ int ret;
+
+ ret = set_list(&head);
+ if (ret)
+ ksft_test_result_fail("set_robust_list error\n");
+
+ ret = mutex_lock(lock, &head, false);
+ if (ret)
+ ksft_test_result_fail("mutex_lock error\n");
+
+ pthread_barrier_wait(&barrier);
+
+ /*
+ * There's a race here: the parent thread needs to be inside
+ * futex_wait() before the child thread dies, otherwise it will miss the
+ * wakeup from handle_futex_death() that this child will emit. We wait a
+ * little bit just to make sure that this happens.
+ */
+ sleep(1);
+
+ return 0;
+}
+
+/*
+ * Spawns a child thread that will set a robust list, take the lock, register it
+ * in the robust list and die. The parent thread will wait on this futex, and
+ * should be waken up when the child exits.
+ */
+static void test_robustness()
+{
+ struct lock_struct lock = { .futex = 0 };
+ struct robust_list_head head;
+ _Atomic(unsigned int) *futex = &lock.futex;
+ int ret;
+
+ ret = set_list(&head);
+ ASSERT_EQ(ret, 0);
+
+ /*
+ * Lets use a barrier to ensure that the child thread takes the lock
+ * before the parent
+ */
+ ret = pthread_barrier_init(&barrier, NULL, 2);
+ ASSERT_EQ(ret, 0);
+
+ ret = create_child(&child_fn_lock, &lock);
+ ASSERT_NE(ret, -1);
+
+ pthread_barrier_wait(&barrier);
+ ret = mutex_lock(&lock, &head, false);
+
+ /*
+ * futex_wait() should return 0 and the futex word should be marked with
+ * FUTEX_OWNER_DIED
+ */
+ ASSERT_EQ(ret, 0);
+ if (ret != 0)
+ printf("futex wait returned %d", errno);
+
+ ASSERT_TRUE(*futex | FUTEX_OWNER_DIED);
+
+ pthread_barrier_destroy(&barrier);
+
+ ksft_test_result_pass("%s\n", __func__);
+}
+
+/*
+ * The only valid value for len is sizeof(*head)
+ */
+static void test_set_robust_list_invalid_size()
+{
+ struct robust_list_head head;
+ size_t head_size = sizeof(struct robust_list_head);
+ int ret;
+
+ ret = set_robust_list(&head, head_size);
+ ASSERT_EQ(ret, 0);
+
+ ret = set_robust_list(&head, head_size * 2);
+ ASSERT_EQ(ret, -1);
+ ASSERT_EQ(errno, EINVAL);
+
+ ret = set_robust_list(&head, head_size - 1);
+ ASSERT_EQ(ret, -1);
+ ASSERT_EQ(errno, EINVAL);
+
+ ret = set_robust_list(&head, 0);
+ ASSERT_EQ(ret, -1);
+ ASSERT_EQ(errno, EINVAL);
+
+ ksft_test_result_pass("%s\n", __func__);
+}
+
+/*
+ * Test get_robust_list with pid = 0, getting the list of the running thread
+ */
+static void test_get_robust_list_self()
+{
+ struct robust_list_head head, head2, *get_head;
+ size_t head_size = sizeof(struct robust_list_head), len_ptr;
+ int ret;
+
+ ret = set_robust_list(&head, head_size);
+ ASSERT_EQ(ret, 0);
+
+ ret = get_robust_list(0, &get_head, &len_ptr);
+ ASSERT_EQ(ret, 0);
+ ASSERT_EQ(get_head, &head);
+ ASSERT_EQ(head_size, len_ptr);
+
+ ret = set_robust_list(&head2, head_size);
+ ASSERT_EQ(ret, 0);
+
+ ret = get_robust_list(0, &get_head, &len_ptr);
+ ASSERT_EQ(ret, 0);
+ ASSERT_EQ(get_head, &head2);
+ ASSERT_EQ(head_size, len_ptr);
+
+ ksft_test_result_pass("%s\n", __func__);
+}
+
+static int child_list(void *arg)
+{
+ struct robust_list_head *head = (struct robust_list_head *) arg;
+ int ret;
+
+ ret = set_robust_list(head, sizeof(struct robust_list_head));
+ if (ret)
+ ksft_test_result_fail("set_robust_list error\n");
+
+ pthread_barrier_wait(&barrier);
+ pthread_barrier_wait(&barrier2);
+
+ return 0;
+}
+
+/*
+ * Test get_robust_list from another thread. We use two barriers here to ensure
+ * that:
+ * 1) the child thread set the list before we try to get it from the
+ * parent
+ * 2) the child thread still alive when we try to get the list from it
+ */
+static void test_get_robust_list_child()
+{
+ pid_t tid;
+ int ret;
+ struct robust_list_head head, *get_head;
+ size_t len_ptr;
+
+ ret = pthread_barrier_init(&barrier, NULL, 2);
+ ret = pthread_barrier_init(&barrier2, NULL, 2);
+ ASSERT_EQ(ret, 0);
+
+ tid = create_child(&child_list, &head);
+ ASSERT_NE(tid, -1);
+
+ pthread_barrier_wait(&barrier);
+
+ ret = get_robust_list(tid, &get_head, &len_ptr);
+ ASSERT_EQ(ret, 0);
+ ASSERT_EQ(&head, get_head);
+
+ pthread_barrier_wait(&barrier2);
+
+ pthread_barrier_destroy(&barrier);
+ pthread_barrier_destroy(&barrier2);
+
+ ksft_test_result_pass("%s\n", __func__);
+}
+
+static int child_fn_lock_with_error(void *arg)
+{
+ struct lock_struct *lock = (struct lock_struct *) arg;
+ struct robust_list_head head;
+ int ret;
+
+ ret = set_list(&head);
+ if (ret)
+ ksft_test_result_fail("set_robust_list error\n");
+
+ ret = mutex_lock(lock, &head, true);
+ if (ret)
+ ksft_test_result_fail("mutex_lock error\n");
+
+ pthread_barrier_wait(&barrier);
+
+ sleep(1);
+
+ return 0;
+}
+
+/*
+ * Same as robustness test, but inject an error where the mutex_lock() exits
+ * earlier, just after setting list_op_pending and taking the lock, to test the
+ * list_op_pending mechanism
+ */
+static void test_set_list_op_pending()
+{
+ struct lock_struct lock = { .futex = 0 };
+ struct robust_list_head head;
+ _Atomic(unsigned int) *futex = &lock.futex;
+ int ret;
+
+ ret = set_list(&head);
+ ASSERT_EQ(ret, 0);
+
+ ret = pthread_barrier_init(&barrier, NULL, 2);
+ ASSERT_EQ(ret, 0);
+
+ ret = create_child(&child_fn_lock_with_error, &lock);
+ ASSERT_NE(ret, -1);
+
+ pthread_barrier_wait(&barrier);
+ ret = mutex_lock(&lock, &head, false);
+
+ ASSERT_EQ(ret, 0);
+ if (ret != 0)
+ printf("futex wait returned %d", errno);
+
+ ASSERT_TRUE(*futex | FUTEX_OWNER_DIED);
+
+ pthread_barrier_destroy(&barrier);
+
+ ksft_test_result_pass("%s\n", __func__);
+}
+
+#define CHILD_NR 10
+
+static int child_lock_holder(void *arg)
+{
+ struct lock_struct *locks = (struct lock_struct *) arg;
+ struct robust_list_head head;
+ int i;
+
+ set_list(&head);
+
+ for (i = 0; i < CHILD_NR; i++) {
+ locks[i].futex = 0;
+ mutex_lock(&locks[i], &head, false);
+ }
+
+ pthread_barrier_wait(&barrier);
+ pthread_barrier_wait(&barrier2);
+
+ sleep(1);
+ return 0;
+}
+
+static int child_wait_lock(void *arg)
+{
+ struct lock_struct *lock = (struct lock_struct *) arg;
+ struct robust_list_head head;
+ int ret;
+
+ pthread_barrier_wait(&barrier2);
+ ret = mutex_lock(lock, &head, false);
+
+ if (ret)
+ ksft_test_result_fail("mutex_lock error\n");
+
+ if (!(lock->futex | FUTEX_OWNER_DIED))
+ ksft_test_result_fail("futex not marked with FUTEX_OWNER_DIED\n");
+
+ return 0;
+}
+
+/*
+ * Test a robust list of more than one element. All the waiters should wake when
+ * the holder dies
+ */
+static void test_robust_list_multiple_elements()
+{
+ struct lock_struct locks[CHILD_NR];
+ int i, ret;
+
+ ret = pthread_barrier_init(&barrier, NULL, 2);
+ ASSERT_EQ(ret, 0);
+ ret = pthread_barrier_init(&barrier2, NULL, CHILD_NR + 1);
+ ASSERT_EQ(ret, 0);
+
+ create_child(&child_lock_holder, &locks);
+
+ /* Wait until the locker thread takes the look */
+ pthread_barrier_wait(&barrier);
+
+ for (i = 0; i < CHILD_NR; i++)
+ create_child(&child_wait_lock, &locks[i]);
+
+ /* Wait for all children to return */
+ while (wait(NULL) > 0);
+
+ pthread_barrier_destroy(&barrier);
+ pthread_barrier_destroy(&barrier2);
+
+ ksft_test_result_pass("%s\n", __func__);
+}
+
+void usage(char *prog)
+{
+ printf("Usage: %s\n", prog);
+ printf(" -c Use color\n");
+ printf(" -h Display this help message\n");
+ printf(" -v L Verbosity level: %d=QUIET %d=CRITICAL %d=INFO\n",
+ VQUIET, VCRITICAL, VINFO);
+}
+
+int main(int argc, char *argv[])
+{
+ int c;
+
+ while ((c = getopt(argc, argv, "cht:v:")) != -1) {
+ switch (c) {
+ case 'c':
+ log_color(1);
+ break;
+ case 'h':
+ usage(basename(argv[0]));
+ exit(0);
+ case 'v':
+ log_verbosity(atoi(optarg));
+ break;
+ default:
+ usage(basename(argv[0]));
+ exit(1);
+ }
+ }
+
+ ksft_print_header();
+ ksft_set_plan(6);
+
+ test_robustness();
+ test_set_robust_list_invalid_size();
+ test_get_robust_list_self();
+ test_get_robust_list_child();
+ test_set_list_op_pending();
+ test_robust_list_multiple_elements();
+
+ ksft_print_cnts();
+ return 0;
+}
--
2.47.1
serial_test_flow_dissector_namespace manipulates both the root net
namespace and a dedicated non-root net namespace. If for some reason a
program attach on root namespace succeeds while it was expected to
fail, the unexpected program will remain attached to the root namespace,
possibly affecting other runs or even other tests in the same run.
Fix undesired test failure side effect by explicitly detaching programs
on failing tests expecting attach to fail. As a side effect of this
change, do not test errno value if the tested operation do not fail.
Fixes: 284ed00a59dd ("selftests/bpf: migrate flow_dissector namespace exclusivity test")
Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore(a)bootlin.com>
---
This small fix addresses an issue discovered while trying to add a new
test in my recently merged work on flow_dissector migration. This new
test is still only present in bpf-next, hence this fix does not target
the bpf tree but the bpf-next tree.
---
tools/testing/selftests/bpf/prog_tests/flow_dissector.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/flow_dissector.c b/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
index 8e6e483fead3f71f21e2223c707c6d4fb548a61e..08bae13248c4a8ab0bfa356a34b2738964d97f4c 100644
--- a/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
+++ b/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
@@ -525,11 +525,14 @@ void serial_test_flow_dissector_namespace(void)
ns = open_netns(TEST_NS);
if (!ASSERT_OK_PTR(ns, "enter non-root net namespace"))
goto out_clean_ns;
-
err = bpf_prog_attach(prog_fd, 0, BPF_FLOW_DISSECTOR, 0);
+ if (!ASSERT_ERR(err,
+ "refuse new flow dissector in non-root net namespace"))
+ bpf_prog_detach2(prog_fd, 0, BPF_FLOW_DISSECTOR);
+ else
+ ASSERT_EQ(errno, EEXIST,
+ "refused because of already attached prog");
close_netns(ns);
- ASSERT_ERR(err, "refuse new flow dissector in non-root net namespace");
- ASSERT_EQ(errno, EEXIST, "refused because of already attached prog");
/* If no flow dissector is attached to the root namespace, we must
* be able to attach one to a non-root net namespace
@@ -545,8 +548,11 @@ void serial_test_flow_dissector_namespace(void)
* a flow dissector to root namespace must fail
*/
err = bpf_prog_attach(prog_fd, 0, BPF_FLOW_DISSECTOR, 0);
- ASSERT_ERR(err, "refuse new flow dissector on root namespace");
- ASSERT_EQ(errno, EEXIST, "refused because of already attached prog");
+ if (!ASSERT_ERR(err, "refuse new flow dissector on root namespace"))
+ bpf_prog_detach2(prog_fd, 0, BPF_FLOW_DISSECTOR);
+ else
+ ASSERT_EQ(errno, EEXIST,
+ "refused because of already attached prog");
ns = open_netns(TEST_NS);
bpf_prog_detach2(prog_fd, 0, BPF_FLOW_DISSECTOR);
---
base-commit: 04e7b00083a120d60511443d900a5cc10dbed263
change-id: 20241128-small_flow_test_fix-0c53624a3c4c
Best regards,
--
Alexis Lothoré, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
From: Edward Cree <ecree.xilinx(a)gmail.com>
The original semantics of ntuple filters with FLOW_RSS were not
fully understood by all drivers, some ignoring the ring_cookie from
the flow rule. Require this support to be explicitly declared by
the driver for filters relying on it to be inserted, and add self-
test coverage for this functionality.
Also teach ethtool_check_max_channel() about this.
Edward Cree (5):
net: ethtool: only allow set_rxnfc with rss + ring_cookie if driver
opts in
net: ethtool: account for RSS+RXNFC add semantics when checking
channel count
selftest: include dst-ip in ethtool ntuple rules
selftest: validate RSS+ntuple filters with nonzero ring_cookie
selftest: extend test_rss_context_queue_reconfigure for action
addition
drivers/net/ethernet/sfc/ef100_ethtool.c | 1 +
drivers/net/ethernet/sfc/ethtool.c | 1 +
include/linux/ethtool.h | 4 +
net/ethtool/common.c | 42 +++++++---
net/ethtool/ioctl.c | 5 ++
.../selftests/drivers/net/hw/rss_ctx.py | 79 +++++++++++++++++--
6 files changed, 113 insertions(+), 19 deletions(-)
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit 3e360ef0c0a1fb6ce9a302e40b8057c41ba8a9d2 ]
When building for streaming SVE the irritator for SVE skips updates of both
P0 and FFR. While FFR is skipped since it might not be present there is no
reason to skip corrupting P0 so switch to an instruction valid in streaming
mode and move the ifdef.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20241107-arm64-fp-stress-irritator-v2-3-c4b9622e3…
Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/arm64/fp/sve-test.S | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/arm64/fp/sve-test.S b/tools/testing/selftests/arm64/fp/sve-test.S
index 07f14e279a904..7784c684a865b 100644
--- a/tools/testing/selftests/arm64/fp/sve-test.S
+++ b/tools/testing/selftests/arm64/fp/sve-test.S
@@ -459,7 +459,8 @@ function irritator_handler
movi v9.16b, #2
movi v31.8b, #3
// And P0
- rdffr p0.b
+ ptrue p0.d
+#ifndef SSVE
// And FFR
wrffr p15.b
--
2.43.0
This patchset fixes a bug where bnxt driver was failing to report that
an ntuple rule is redirecting to an RSS context. First commit is the
fix, then second commit extends selftests to detect if other/new drivers
are compliant with ntuple/rss_ctx API.
=== Changelog ===
Changes from v2:
* Rebase to net instead of net-next
* Make regex work with ethtool output changes
Changes from v1:
* Add selftest in patch 2
Daniel Xu (2):
bnxt_en: ethtool: Supply ntuple rss context action
selftests: drv-net: rss_ctx: Add test for ntuple rule
drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 8 ++++++--
tools/testing/selftests/drivers/net/hw/rss_ctx.py | 12 +++++++++++-
2 files changed, 17 insertions(+), 3 deletions(-)
--
2.46.0
Hello and Welcome,
We appreciate your understanding during this time, and we regret the slow response to your previous note. We appreciate your interest and are pleased to assist you with the information you need.
This email contains an attached screenshot that presents essential details concerning your request. Please take a moment to access it and familiarize yourself with the details for a full understanding of the information.
If you need any information or have any questions, do not hesitate to get in touch. We are always ready to assist you and eager to provide all required help.
Cheers,
Ivory Davis
Catalyst Solutions, LLC
+1 (212) 143-51-81
Change the spelling from metadate -> metadata
Signed-off-by: Gautam Somani <gautamsomani(a)gmail.com>
---
tools/testing/selftests/x86/lam.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/x86/lam.c b/tools/testing/selftests/x86/lam.c
index 0ea4f6813930..4d4a76532dc9 100644
--- a/tools/testing/selftests/x86/lam.c
+++ b/tools/testing/selftests/x86/lam.c
@@ -237,7 +237,7 @@ static uint64_t set_metadata(uint64_t src, unsigned long lam)
* both pointers should point to the same address.
*
* @return:
- * 0: value on the pointer with metadate and value on original are same
+ * 0: value on the pointer with metadata and value on original are same
* 1: not same.
*/
static int handle_lam_test(void *src, unsigned int lam)
--
2.43.0
The kernel has recently added support for shadow stacks, currently
x86 only using their CET feature but both arm64 and RISC-V have
equivalent features (GCS and Zicfiss respectively), I am actively
working on GCS[1]. With shadow stacks the hardware maintains an
additional stack containing only the return addresses for branch
instructions which is not generally writeable by userspace and ensures
that any returns are to the recorded addresses. This provides some
protection against ROP attacks and making it easier to collect call
stacks. These shadow stacks are allocated in the address space of the
userspace process.
Our API for shadow stacks does not currently offer userspace any
flexiblity for managing the allocation of shadow stacks for newly
created threads, instead the kernel allocates a new shadow stack with
the same size as the normal stack whenever a thread is created with the
feature enabled. The stacks allocated in this way are freed by the
kernel when the thread exits or shadow stacks are disabled for the
thread. This lack of flexibility and control isn't ideal, in the vast
majority of cases the shadow stack will be over allocated and the
implicit allocation and deallocation is not consistent with other
interfaces. As far as I can tell the interface is done in this manner
mainly because the shadow stack patches were in development since before
clone3() was implemented.
Since clone3() is readily extensible let's add support for specifying a
shadow stack when creating a new thread or process, keeping the current
implicit allocation behaviour if one is not specified either with
clone3() or through the use of clone(). The user must provide a shadow
stack pointer, this must point to memory mapped for use as a shadow
stackby map_shadow_stack() with an architecture specified shadow stack
token at the top of the stack.
Please note that the x86 portions of this code are build tested only, I
don't appear to have a system that can run CET available to me.
[1] https://lore.kernel.org/linux-arm-kernel/20241001-arm64-gcs-v13-0-222b78d87…
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Changes in v12:
- Add the regular prctl() to the userspace API document since arm64
support is queued in -next.
- Link to v11: https://lore.kernel.org/r/20241005-clone3-shadow-stack-v11-0-2a6a2bd6d651@k…
Changes in v11:
- Rebase onto arm64 for-next/gcs, which is based on v6.12-rc1, and
integrate arm64 support.
- Rework the interface to specify a shadow stack pointer rather than a
base and size like we do for the regular stack.
- Link to v10: https://lore.kernel.org/r/20240821-clone3-shadow-stack-v10-0-06e8797b9445@k…
Changes in v10:
- Integrate fixes & improvements for the x86 implementation from Rick
Edgecombe.
- Require that the shadow stack be VM_WRITE.
- Require that the shadow stack base and size be sizeof(void *) aligned.
- Clean up trailing newline.
- Link to v9: https://lore.kernel.org/r/20240819-clone3-shadow-stack-v9-0-962d74f99464@ke…
Changes in v9:
- Pull token validation earlier and report problems with an error return
to parent rather than signal delivery to the child.
- Verify that the top of the supplied shadow stack is VM_SHADOW_STACK.
- Rework token validation to only do the page mapping once.
- Drop no longer needed support for testing for signals in selftest.
- Fix typo in comments.
- Link to v8: https://lore.kernel.org/r/20240808-clone3-shadow-stack-v8-0-0acf37caf14c@ke…
Changes in v8:
- Fix token verification with user specified shadow stack.
- Don't track user managed shadow stacks for child processes.
- Link to v7: https://lore.kernel.org/r/20240731-clone3-shadow-stack-v7-0-a9532eebfb1d@ke…
Changes in v7:
- Rebase onto v6.11-rc1.
- Typo fixes.
- Link to v6: https://lore.kernel.org/r/20240623-clone3-shadow-stack-v6-0-9ee7783b1fb9@ke…
Changes in v6:
- Rebase onto v6.10-rc3.
- Ensure we don't try to free the parent shadow stack in error paths of
x86 arch code.
- Spelling fixes in userspace API document.
- Additional cleanups and improvements to the clone3() tests to support
the shadow stack tests.
- Link to v5: https://lore.kernel.org/r/20240203-clone3-shadow-stack-v5-0-322c69598e4b@ke…
Changes in v5:
- Rebase onto v6.8-rc2.
- Rework ABI to have the user allocate the shadow stack memory with
map_shadow_stack() and a token.
- Force inlining of the x86 shadow stack enablement.
- Move shadow stack enablement out into a shared header for reuse by
other tests.
- Link to v4: https://lore.kernel.org/r/20231128-clone3-shadow-stack-v4-0-8b28ffe4f676@ke…
Changes in v4:
- Formatting changes.
- Use a define for minimum shadow stack size and move some basic
validation to fork.c.
- Link to v3: https://lore.kernel.org/r/20231120-clone3-shadow-stack-v3-0-a7b8ed3e2acc@ke…
Changes in v3:
- Rebase onto v6.7-rc2.
- Remove stale shadow_stack in internal kargs.
- If a shadow stack is specified unconditionally use it regardless of
CLONE_ parameters.
- Force enable shadow stacks in the selftest.
- Update changelogs for RISC-V feature rename.
- Link to v2: https://lore.kernel.org/r/20231114-clone3-shadow-stack-v2-0-b613f8681155@ke…
Changes in v2:
- Rebase onto v6.7-rc1.
- Remove ability to provide preallocated shadow stack, just specify the
desired size.
- Link to v1: https://lore.kernel.org/r/20231023-clone3-shadow-stack-v1-0-d867d0b5d4d0@ke…
---
Mark Brown (8):
arm64/gcs: Return a success value from gcs_alloc_thread_stack()
Documentation: userspace-api: Add shadow stack API documentation
selftests: Provide helper header for shadow stack testing
fork: Add shadow stack support to clone3()
selftests/clone3: Remove redundant flushes of output streams
selftests/clone3: Factor more of main loop into test_clone3()
selftests/clone3: Allow tests to flag if -E2BIG is a valid error code
selftests/clone3: Test shadow stack support
Documentation/userspace-api/index.rst | 1 +
Documentation/userspace-api/shadow_stack.rst | 42 ++++
arch/arm64/include/asm/gcs.h | 8 +-
arch/arm64/kernel/process.c | 8 +-
arch/arm64/mm/gcs.c | 62 +++++-
arch/x86/include/asm/shstk.h | 11 +-
arch/x86/kernel/process.c | 2 +-
arch/x86/kernel/shstk.c | 57 +++++-
include/asm-generic/cacheflush.h | 11 ++
include/linux/sched/task.h | 17 ++
include/uapi/linux/sched.h | 10 +-
kernel/fork.c | 96 +++++++--
tools/testing/selftests/clone3/clone3.c | 226 ++++++++++++++++++----
tools/testing/selftests/clone3/clone3_selftests.h | 65 ++++++-
tools/testing/selftests/ksft_shstk.h | 98 ++++++++++
15 files changed, 633 insertions(+), 81 deletions(-)
---
base-commit: d17cd7b7cc92d37ee8b2df8f975fc859a261f4dc
change-id: 20231019-clone3-shadow-stack-15d40d2bf536
Best regards,
--
Mark Brown <broonie(a)kernel.org>
bpftool now embeds the kfuncs definitions directly in the generated
vmlinux.h
This is great, but because the selftests dir might be compiled with
HID_BPF disabled, we have no guarantees to be able to compile the
sources with the generated kfuncs.
If we have the kfuncs, because we have the `__not_used` hack, the newly
defined kfuncs do not match the ones from vmlinux.h and things go wrong.
Prevent vmlinux.h to define its kfuncs and also add the missing `__weak`
symbols for our custom kfuncs definitions
Signed-off-by: Benjamin Tissoires <bentiss(a)kernel.org>
---
This was noticed while bumping the CI to fedora 41 which has an update
of bpftool.
I'll probably take this in for-6.13/upstream-fixes tomorrow if no bots
comes back at me.
Cheers,
Benjamin
---
tools/testing/selftests/hid/progs/hid_bpf_helpers.h | 19 +++++++++++--------
1 file changed, 11 insertions(+), 8 deletions(-)
diff --git a/tools/testing/selftests/hid/progs/hid_bpf_helpers.h b/tools/testing/selftests/hid/progs/hid_bpf_helpers.h
index e5db897586bbfe010d8799f6f52fc5c418344e6b..531228b849daebcf40d994abb8bf35e760b3cc4e 100644
--- a/tools/testing/selftests/hid/progs/hid_bpf_helpers.h
+++ b/tools/testing/selftests/hid/progs/hid_bpf_helpers.h
@@ -22,6 +22,9 @@
#define HID_REQ_SET_IDLE HID_REQ_SET_IDLE___not_used
#define HID_REQ_SET_PROTOCOL HID_REQ_SET_PROTOCOL___not_used
+/* do not define kfunc through vmlinux.h as this messes up our custom hack */
+#define BPF_NO_KFUNC_PROTOTYPES
+
#include "vmlinux.h"
#undef hid_bpf_ctx
@@ -91,31 +94,31 @@ struct hid_bpf_ops {
/* following are kfuncs exported by HID for HID-BPF */
extern __u8 *hid_bpf_get_data(struct hid_bpf_ctx *ctx,
unsigned int offset,
- const size_t __sz) __ksym;
-extern struct hid_bpf_ctx *hid_bpf_allocate_context(unsigned int hid_id) __ksym;
-extern void hid_bpf_release_context(struct hid_bpf_ctx *ctx) __ksym;
+ const size_t __sz) __weak __ksym;
+extern struct hid_bpf_ctx *hid_bpf_allocate_context(unsigned int hid_id) __weak __ksym;
+extern void hid_bpf_release_context(struct hid_bpf_ctx *ctx) __weak __ksym;
extern int hid_bpf_hw_request(struct hid_bpf_ctx *ctx,
__u8 *data,
size_t buf__sz,
enum hid_report_type type,
- enum hid_class_request reqtype) __ksym;
+ enum hid_class_request reqtype) __weak __ksym;
extern int hid_bpf_hw_output_report(struct hid_bpf_ctx *ctx,
- __u8 *buf, size_t buf__sz) __ksym;
+ __u8 *buf, size_t buf__sz) __weak __ksym;
extern int hid_bpf_input_report(struct hid_bpf_ctx *ctx,
enum hid_report_type type,
__u8 *data,
- size_t buf__sz) __ksym;
+ size_t buf__sz) __weak __ksym;
extern int hid_bpf_try_input_report(struct hid_bpf_ctx *ctx,
enum hid_report_type type,
__u8 *data,
- size_t buf__sz) __ksym;
+ size_t buf__sz) __weak __ksym;
/* bpf_wq implementation */
extern int bpf_wq_init(struct bpf_wq *wq, void *p__map, unsigned int flags) __weak __ksym;
extern int bpf_wq_start(struct bpf_wq *wq, unsigned int flags) __weak __ksym;
extern int bpf_wq_set_callback_impl(struct bpf_wq *wq,
int (callback_fn)(void *map, int *key, void *wq),
- unsigned int flags__k, void *aux__ign) __ksym;
+ unsigned int flags__k, void *aux__ign) __weak __ksym;
#define bpf_wq_set_callback(timer, cb, flags) \
bpf_wq_set_callback_impl(timer, cb, flags, NULL)
---
base-commit: 919464deeca24e5bf13b6c8efd0b1d25cc43866f
change-id: 20241128-fix-new-bpftool-31a9f43aafbb
Best regards,
--
Benjamin Tissoires <bentiss(a)kernel.org>
The parameter passed to load_mod() is stored in $MOD, but never used.
Obviously it was intended to be used instead of the hardcoded
"test_kallsyms_b" module name.
Fixes: 84b4a51fce4ccc66 ("selftests: add new kallsyms selftests")
Signed-off-by: Geert Uytterhoeven <geert(a)linux-m68k.org>
---
tools/testing/selftests/module/find_symbol.sh | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/module/find_symbol.sh b/tools/testing/selftests/module/find_symbol.sh
index 140364d3c49fc912..2c56805c9b6e6ea0 100755
--- a/tools/testing/selftests/module/find_symbol.sh
+++ b/tools/testing/selftests/module/find_symbol.sh
@@ -44,10 +44,10 @@ load_mod()
local ARCH="$(uname -m)"
case "${ARCH}" in
x86_64)
- perf stat $STATS $MODPROBE test_kallsyms_b
+ perf stat $STATS $MODPROBE $MOD
;;
*)
- time $MODPROBE test_kallsyms_b
+ time $MODPROBE $MOD
exit 1
;;
esac
--
2.34.1
The string logged when a test passes or fails is used by the selftest
framework to identify which test is being reported. The hugetlb_dio test
not only uses the same strings for every test that is run but it also uses
different strings for test passes and failures which means that test
automation is unable to follow what the test is doing at all.
Pull the existing duplicated logging of the number of free huge pages
before and after the test out of the conditional and replace that and the
logging of the result with a single ksft_print_result() which incorporates
the parameters passed into the test into the output.
Fixes: fae1980347bf ("selftests: hugetlb_dio: fixup check for initial conditions to skip in the start")
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
tools/testing/selftests/mm/hugetlb_dio.c | 14 +++++---------
1 file changed, 5 insertions(+), 9 deletions(-)
diff --git a/tools/testing/selftests/mm/hugetlb_dio.c b/tools/testing/selftests/mm/hugetlb_dio.c
index 432d5af15e66b7d6cac0273fb244d6696d7c9ddc..db63abe5ee5e85ff7795d3ea176c3ac47184bf4f 100644
--- a/tools/testing/selftests/mm/hugetlb_dio.c
+++ b/tools/testing/selftests/mm/hugetlb_dio.c
@@ -76,19 +76,15 @@ void run_dio_using_hugetlb(unsigned int start_off, unsigned int end_off)
/* Get the free huge pages after unmap*/
free_hpage_a = get_free_hugepages();
+ ksft_print_msg("No. Free pages before allocation : %d\n", free_hpage_b);
+ ksft_print_msg("No. Free pages after munmap : %d\n", free_hpage_a);
+
/*
* If the no. of free hugepages before allocation and after unmap does
* not match - that means there could still be a page which is pinned.
*/
- if (free_hpage_a != free_hpage_b) {
- ksft_print_msg("No. Free pages before allocation : %d\n", free_hpage_b);
- ksft_print_msg("No. Free pages after munmap : %d\n", free_hpage_a);
- ksft_test_result_fail(": Huge pages not freed!\n");
- } else {
- ksft_print_msg("No. Free pages before allocation : %d\n", free_hpage_b);
- ksft_print_msg("No. Free pages after munmap : %d\n", free_hpage_a);
- ksft_test_result_pass(": Huge pages freed successfully !\n");
- }
+ ksft_test_result(free_hpage_a == free_hpage_b,
+ "free huge pages from %u-%u\n", start_off, end_off);
}
int main(void)
---
base-commit: 6f3d2b5299b0a8bcb8a9405a8d3fceb24f79c4f0
change-id: 20241127-kselftest-mm-hugetlb-dio-names-1ebccbe8183d
Best regards,
--
Mark Brown <broonie(a)kernel.org>
The test.py should not be run separately. It should be run via run.sh,
which will do some sanity checks first. Move the test.py from TEST_PROGS
to TEST_FILES.
Reported-by: Maximilian Heyne <mheyne(a)amazon.de>
Closes: https://lore.kernel.org/netdev/20241122150129.GB18887@dev-dsk-mheyne-1b-556…
Fixes: 3ade6ce1255e ("selftests: rds: add testing infrastructure")
Signed-off-by: Hangbin Liu <liuhangbin(a)gmail.com>
---
tools/testing/selftests/net/rds/Makefile | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/net/rds/Makefile b/tools/testing/selftests/net/rds/Makefile
index 1803c39dbacb..612a7219990e 100644
--- a/tools/testing/selftests/net/rds/Makefile
+++ b/tools/testing/selftests/net/rds/Makefile
@@ -3,10 +3,9 @@
all:
@echo mk_build_dir="$(shell pwd)" > include.sh
-TEST_PROGS := run.sh \
- test.py
+TEST_PROGS := run.sh
-TEST_FILES := include.sh
+TEST_FILES := include.sh test.py
EXTRA_CLEAN := /tmp/rds_logs include.sh
--
2.46.0
Recent change in how get_user() handles pointers [1] has a specific case
for LAM. It assigns a different bitmask that's later used to check
whether a pointer comes from userland in get_user().
While currently commented out (until LASS [2] is merged into the kernel)
it's worth making changes to the LAM selftest ahead of time.
Modify cpu_has_la57() so it provides current paging level information
instead of the cpuid one.
Add test case to LAM that utilizes a ioctl (FIOASYNC) syscall which uses
get_user() in its implementation. Execute the syscall with differently
tagged pointers to verify that valid user pointers are passing through
and invalid kernel/non-canonical pointers are not.
Also to avoid unhelpful test failures add a check in main() to skip
running tests if LAM was not compiled into the kernel.
Code was tested on a Sierra Forest Xeon machine that's LAM capable. The
test was ran without issues with both the LAM lines from [1] untouched
and commented out. The test was also ran without issues with LAM_SUP
both enabled and disabled.
4/5 level pagetables code paths were also successfully tested in Simics
on a 5-level capable machine.
[1] https://lore.kernel.org/all/20241024013214.129639-1-torvalds@linux-foundati…
[2] https://lore.kernel.org/all/20241028160917.1380714-1-alexander.shishkin@lin…
Maciej Wieczor-Retman (3):
selftests/lam: Move cpu_has_la57() to use cpuinfo flag
selftests/lam: Skip test if LAM is disabled
selftests/lam: Test get_user() LAM pointer handling
tools/testing/selftests/x86/lam.c | 122 ++++++++++++++++++++++++++++--
1 file changed, 117 insertions(+), 5 deletions(-)
--
2.47.1
When running selftests I encountered the following error message with
some damon tests:
# Traceback (most recent call last):
# File "[...]/damon/./damos_quota.py", line 7, in <module>
# import _damon_sysfs
# ModuleNotFoundError: No module named '_damon_sysfs'
Fix this by adding the _damon_sysfs.py file to TEST_FILES so that it
will be available when running the respective damon selftests.
Fixes: 306abb63a8ca ("selftests/damon: implement a python module for test-purpose DAMON sysfs controls")
Signed-off-by: Maximilian Heyne <mheyne(a)amazon.de>
---
tools/testing/selftests/damon/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/damon/Makefile b/tools/testing/selftests/damon/Makefile
index 5b2a6a5dd1af7..812f656260fba 100644
--- a/tools/testing/selftests/damon/Makefile
+++ b/tools/testing/selftests/damon/Makefile
@@ -6,7 +6,7 @@ TEST_GEN_FILES += debugfs_target_ids_read_before_terminate_race
TEST_GEN_FILES += debugfs_target_ids_pid_leak
TEST_GEN_FILES += access_memory access_memory_even
-TEST_FILES = _chk_dependency.sh _debugfs_common.sh
+TEST_FILES = _chk_dependency.sh _debugfs_common.sh _damon_sysfs.py
# functionality tests
TEST_PROGS = debugfs_attrs.sh debugfs_schemes.sh debugfs_target_ids.sh
--
2.40.1
Amazon Web Services Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597
__ktap_test() may be called without the optional third argument which is
an issue for scripts using `set -u` to detect uninitialized variables
and potential bugs.
Fix this optional "directive" argument by either using the third
argument or an empty string.
This was discovered while developing tests for script control execution:
https://lore.kernel.org/r/20241112191858.162021-7-mic@digikod.net
Cc: Kees Cook <kees(a)kernel.org>
Cc: Nícolas F. R. A. Prado <nfraprado(a)collabora.com>
Cc: Shuah Khan <skhan(a)linuxfoundation.org>
Signed-off-by: Mickaël Salaün <mic(a)digikod.net>
---
tools/testing/selftests/kselftest/ktap_helpers.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kselftest/ktap_helpers.sh b/tools/testing/selftests/kselftest/ktap_helpers.sh
index 79a125eb24c2..14e7f3ec3f84 100644
--- a/tools/testing/selftests/kselftest/ktap_helpers.sh
+++ b/tools/testing/selftests/kselftest/ktap_helpers.sh
@@ -40,7 +40,7 @@ ktap_skip_all() {
__ktap_test() {
result="$1"
description="$2"
- directive="$3" # optional
+ directive="${3:-}" # optional
local directive_str=
[ ! -z "$directive" ] && directive_str="# $directive"
--
2.47.1
From: Tycho Andersen <tandersen(a)netflix.com>
Zbigniew mentioned at Linux Plumber's that systemd is interested in
switching to execveat() for service execution, but can't, because the
contents of /proc/pid/comm are the file descriptor which was used,
instead of the path to the binary. This makes the output of tools like
top and ps useless, especially in a world where most fds are opened
CLOEXEC so the number is truly meaningless.
Change exec path to fix up /proc/pid/comm in the case where we have
allocated one of these synthetic paths in bprm_init(). This way the actual
exec machinery is unchanged, but cosmetically the comm looks reasonable to
admins investigating things.
Signed-off-by: Tycho Andersen <tandersen(a)netflix.com>
Suggested-by: Zbigniew Jędrzejewski-Szmek <zbyszek(a)in.waw.pl>
CC: Aleksa Sarai <cyphar(a)cyphar.com>
Link: https://github.com/uapi-group/kernel-features#set-comm-field-before-exec
---
v2: * drop the flag, everyone :)
* change the rendered value to f_path.dentry->d_name.name instead of
argv[0], Eric
v3: * fix up subject line, Eric
v4: * switch to no flag, always rewrite approach, with some cleanup
suggested by Kees
---
fs/exec.c | 36 +++++++++++++++++++++++++++++++++++-
include/linux/binfmts.h | 1 +
2 files changed, 36 insertions(+), 1 deletion(-)
diff --git a/fs/exec.c b/fs/exec.c
index 6c53920795c2..3b559f598c74 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1347,7 +1347,16 @@ int begin_new_exec(struct linux_binprm * bprm)
set_dumpable(current->mm, SUID_DUMP_USER);
perf_event_exec();
- __set_task_comm(me, kbasename(bprm->filename), true);
+
+ /*
+ * If argv0 was set, alloc_bprm() made up a path that will
+ * probably not be useful to admins running ps or similar.
+ * Let's fix it up to be something reasonable.
+ */
+ if (bprm->argv0)
+ __set_task_comm(me, kbasename(bprm->argv0), true);
+ else
+ __set_task_comm(me, kbasename(bprm->filename), true);
/* An exec changes our domain. We are no longer part of the thread
group */
@@ -1497,9 +1506,28 @@ static void free_bprm(struct linux_binprm *bprm)
if (bprm->interp != bprm->filename)
kfree(bprm->interp);
kfree(bprm->fdpath);
+ kfree(bprm->argv0);
kfree(bprm);
}
+static int bprm_add_fixup_comm(struct linux_binprm *bprm,
+ struct user_arg_ptr argv)
+{
+ const char __user *p = get_user_arg_ptr(argv, 0);
+
+ /*
+ * If p == NULL, let's just fall back to fdpath.
+ */
+ if (!p)
+ return 0;
+
+ bprm->argv0 = strndup_user(p, MAX_ARG_STRLEN);
+ if (bprm->argv0)
+ return 0;
+
+ return -EFAULT;
+}
+
static struct linux_binprm *alloc_bprm(int fd, struct filename *filename, int flags)
{
struct linux_binprm *bprm;
@@ -1906,6 +1934,12 @@ static int do_execveat_common(int fd, struct filename *filename,
goto out_ret;
}
+ if (unlikely(bprm->fdpath)) {
+ retval = bprm_add_fixup_comm(bprm, argv);
+ if (retval != 0)
+ goto out_free;
+ }
+
retval = count(argv, MAX_ARG_STRINGS);
if (retval == 0)
pr_warn_once("process '%s' launched '%s' with NULL argv: empty string added\n",
diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h
index e6c00e860951..bab5121a746b 100644
--- a/include/linux/binfmts.h
+++ b/include/linux/binfmts.h
@@ -55,6 +55,7 @@ struct linux_binprm {
of the time same as filename, but could be
different for binfmt_{misc,script} */
const char *fdpath; /* generated filename for execveat */
+ const char *argv0; /* argv0 from execveat */
unsigned interp_flags;
int execfd; /* File descriptor of the executable */
unsigned long loader, exec;
base-commit: c1e939a21eb111a6d6067b38e8e04b8809b64c4e
--
2.34.1
Hi Kuan-Wei,
On Sun, Oct 20, 2024 at 6:02 AM Kuan-Wei Chiu <visitorckw(a)gmail.com> wrote:
> All current min heap API functions are marked with '__always_inline'.
> However, as the number of users increases, inlining these functions
> everywhere leads to a increase in kernel size.
>
> In performance-critical paths, such as when perf events are enabled and
> min heap functions are called on every context switch, it is important
> to retain the inline versions for optimal performance. To balance this,
> the original inline functions are kept, and additional non-inline
> versions of the functions have been added in lib/min_heap.c.
>
> Link: https://lore.kernel.org/20240522161048.8d8bbc7b153b4ecd92c50666@linux-found…
> Suggested-by: Andrew Morton <akpm(a)linux-foundation.org>
> Signed-off-by: Kuan-Wei Chiu <visitorckw(a)gmail.com>
Thanks for your patch, which is now commit 92a8b224b833e82d
("lib/min_heap: introduce non-inline versions of min heap API
functions") upstream.
> --- a/include/linux/min_heap.h
> +++ b/include/linux/min_heap.h
> @@ -50,33 +50,33 @@ void __min_heap_init(min_heap_char *heap, void *data, int size)
> heap->data = heap->preallocated;
> }
>
> -#define min_heap_init(_heap, _data, _size) \
> - __min_heap_init((min_heap_char *)_heap, _data, _size)
> +#define min_heap_init_inline(_heap, _data, _size) \
> + __min_heap_init_inline((min_heap_char *)_heap, _data, _size)
Casting macro parameters without any further checks prevents the
compiler from detecting silly mistakes. Would it be possible to
add safety-nets here and below, using e.g. container_of() or typeof()
checks?
> --- a/lib/Kconfig
> +++ b/lib/Kconfig
> @@ -777,3 +777,6 @@ config POLYNOMIAL
>
> config FIRMWARE_TABLE
> bool
> +
> +config MIN_HEAP
> + bool
Perhaps tristate? See also below.
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -2279,6 +2279,7 @@ config TEST_LIST_SORT
> config TEST_MIN_HEAP
> tristate "Min heap test"
> depends on DEBUG_KERNEL || m
> + select MIN_HEAP
Ideally, tests should not select functionality, to prevent increasing the
attack vector by merely enabling (modular) tests.
In this particular case, just using "depends on MIN_HEAP" is not an
option, as MIN_HEAP is not user-visible, and thus cannot be enabled
by the user on its own. However, making MIN_HEAP tristate could be
a first step for the modular case.
The builtin case is harder to fix, as e.g.
depends on MIN_HEAP || COMPILE_TEST
select MIN_HEAP if COMPILE_TEST
would still trigger a recursive dependency error.
Alternatively, the test could just keep on using the inline variants,
unless CONFIG_MIN_HEAP=y? Or event test both for the latter?
> help
> Enable this to turn on min heap function tests. This test is
> executed only once during system boot (so affects only boot time),
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert(a)linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
This patchset adds basic kunit test coverage for initramfs unpacking
and cleans up some minor buffer handling issues / inefficiencies.
Changes since v2 (patch 2 only):
- fix !CONFIG_INITRAMFS_PRESERVE_MTIME kunit test checks
- add test MODULE_DESCRIPTION(), as suggested by Jeff Johnson
- add some missing headers, reported by kernel test robot
Changes since v1 (RFC):
- rebase atop v6.12-rc6 and filename field overrun fix from
https://lore.kernel.org/r/20241030035509.20194-2-ddiss@suse.de
- add unit test coverage (new patches 1 and 2)
- add patch: fix hardlink hash leak without TRAILER
- rework patch: avoid static buffer for error message
+ drop unnecessary message propagation
- drop patch: cpio_buf reuse for built-in and bootloader initramfs
+ no good justification for the change
Feedback appreciated.
David Disseldorp (9):
init: add initramfs_internal.h
initramfs_test: kunit tests for initramfs unpacking
vsprintf: add simple_strntoul
initramfs: avoid memcpy for hex header fields
initramfs: remove extra symlink path buffer
initramfs: merge header_buf and name_buf
initramfs: reuse name_len for dir mtime tracking
initramfs: fix hardlink hash leak without TRAILER
initramfs: avoid static buffer for error message
include/linux/kstrtox.h | 1 +
init/.kunitconfig | 3 +
init/Kconfig | 7 +
init/Makefile | 1 +
init/initramfs.c | 73 +++++----
init/initramfs_internal.h | 8 +
init/initramfs_test.c | 403 ++++++++++++++++++++++++++++++++++++++++++++++
lib/vsprintf.c | 7 +
8 files changed, 471 insertions(+), 32 deletions(-)
create mode 100644 init/.kunitconfig
create mode 100644 init/initramfs_internal.h
create mode 100644 init/initramfs_test.c
CONFIG_PREEMPT is a preemtion model the so called "Low-Latency Desktop".
A different preemption model is PREEMPT_RT the so called "Real-Time".
Both implement preemption in kernel and set CONFIG_PREEMPTION.
There is also the so called "LAZY PREEMPT" which the "Scheduler
controlled preemption model". Here we have also preemption in the kernel
the rules are slightly different.
Therefore the testsuite should not check for CONFIG_PREEMPT (as one
model) but for CONFIG_PREEMPTION to figure out if preemption in the
kernel is possible.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
---
tools/testing/selftests/bpf/map_tests/task_storage_map.c | 4 ++--
tools/testing/selftests/bpf/prog_tests/task_local_storage.c | 2 +-
.../testing/selftests/bpf/progs/read_bpf_task_storage_busy.c | 4 ++--
tools/testing/selftests/bpf/progs/task_storage_nodeadlock.c | 4 ++--
4 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/tools/testing/selftests/bpf/map_tests/task_storage_map.c b/tools/testing/selftests/bpf/map_tests/task_storage_map.c
index 7d050364efca1..5fd7c923544c3 100644
--- a/tools/testing/selftests/bpf/map_tests/task_storage_map.c
+++ b/tools/testing/selftests/bpf/map_tests/task_storage_map.c
@@ -77,8 +77,8 @@ void test_task_storage_map_stress_lookup(void)
CHECK(err, "open_and_load", "error %d\n", err);
/* Only for a fully preemptible kernel */
- if (!skel->kconfig->CONFIG_PREEMPT) {
- printf("%s SKIP (no CONFIG_PREEMPT)\n", __func__);
+ if (!skel->kconfig->CONFIG_PREEMPTION) {
+ printf("%s SKIP (no CONFIG_PREEMPTION)\n", __func__);
read_bpf_task_storage_busy__destroy(skel);
skips++;
return;
diff --git a/tools/testing/selftests/bpf/prog_tests/task_local_storage.c b/tools/testing/selftests/bpf/prog_tests/task_local_storage.c
index c33c05161a9ea..acaeebf83f3ee 100644
--- a/tools/testing/selftests/bpf/prog_tests/task_local_storage.c
+++ b/tools/testing/selftests/bpf/prog_tests/task_local_storage.c
@@ -189,7 +189,7 @@ static void test_nodeadlock(void)
/* Unnecessary recursion and deadlock detection are reproducible
* in the preemptible kernel.
*/
- if (!skel->kconfig->CONFIG_PREEMPT) {
+ if (!skel->kconfig->CONFIG_PREEMPTION) {
test__skip();
goto done;
}
diff --git a/tools/testing/selftests/bpf/progs/read_bpf_task_storage_busy.c b/tools/testing/selftests/bpf/progs/read_bpf_task_storage_busy.c
index 76556e0b42b24..69da05bb6c63e 100644
--- a/tools/testing/selftests/bpf/progs/read_bpf_task_storage_busy.c
+++ b/tools/testing/selftests/bpf/progs/read_bpf_task_storage_busy.c
@@ -4,7 +4,7 @@
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
-extern bool CONFIG_PREEMPT __kconfig __weak;
+extern bool CONFIG_PREEMPTION __kconfig __weak;
extern const int bpf_task_storage_busy __ksym;
char _license[] SEC("license") = "GPL";
@@ -24,7 +24,7 @@ int BPF_PROG(read_bpf_task_storage_busy)
{
int *value;
- if (!CONFIG_PREEMPT)
+ if (!CONFIG_PREEMPTION)
return 0;
if (bpf_get_current_pid_tgid() >> 32 != pid)
diff --git a/tools/testing/selftests/bpf/progs/task_storage_nodeadlock.c b/tools/testing/selftests/bpf/progs/task_storage_nodeadlock.c
index ea2dbb80f7b3e..986829aaf73a6 100644
--- a/tools/testing/selftests/bpf/progs/task_storage_nodeadlock.c
+++ b/tools/testing/selftests/bpf/progs/task_storage_nodeadlock.c
@@ -10,7 +10,7 @@ char _license[] SEC("license") = "GPL";
#define EBUSY 16
#endif
-extern bool CONFIG_PREEMPT __kconfig __weak;
+extern bool CONFIG_PREEMPTION __kconfig __weak;
int nr_get_errs = 0;
int nr_del_errs = 0;
@@ -29,7 +29,7 @@ int BPF_PROG(socket_post_create, struct socket *sock, int family, int type,
int ret, zero = 0;
int *value;
- if (!CONFIG_PREEMPT)
+ if (!CONFIG_PREEMPTION)
return 0;
task = bpf_get_current_task_btf();
--
2.45.2
Hello,
this is the revision 3 of test_flow_dissector_migration.sh into
test_progs. This revision addresses comments from Stanislas, especially
about proper reuse of pseudo-header checksuming in new network helpers.
There are 2 "main" parts in test_flow_dissector.sh:
- a set of tests checking flow_dissector programs attachment to either
root namespace or non-root namespace
- dissection test
The first set is integrated in flow_dissector.c, which already contains
some existing tests for flow_dissector programs. This series uses the
opportunity to update a bit this file (use new assert, re-split tests,
etc)
The second part is migrated into a new file under test_progs,
flow_dissector_classification.c. It uses the same eBPF programs as
flow_dissector.c, but the difference is rather about how those program
are executed:
- flow_dissector.c manually runs programs with BPF_PROG_RUN
- flow_dissector_classification.c sends real packets to be dissected, and
so it also executes kernel code related to eBPF flow dissector (eg:
__skb_flow_bpf_to_target)
---
Changes in v3:
- Keep new helpers name in sync with kernel ones
- Document some existing network helpers
- Properly reuse pseudo-header csum helper in transport layer csum
helper
- Drop duplicate assert
- Use const for test structure in the migrated test
- Simplify shutdown callchain for basic test
- collect Acked-by
- Link to v2: https://lore.kernel.org/r/20241114-flow_dissector-v2-0-ee4a3be3de65@bootlin…
Changes in v2:
- allow tests to run in parallel
- move some generic helpers to network_helpers.h
- define proper function for ASSERT_MEMEQ
- fetch acked-by tags
- Link to v1: https://lore.kernel.org/r/20241113-flow_dissector-v1-0-27c4df0592dc@bootlin…
---
Alexis Lothoré (eBPF Foundation) (14):
selftests/bpf: add a macro to compare raw memory
selftests/bpf: use ASSERT_MEMEQ to compare bpf flow keys
selftests/bpf: replace CHECK calls with ASSERT macros in flow_dissector test
selftests/bpf: re-split main function into dedicated tests
selftests/bpf: expose all subtests from flow_dissector
selftests/bpf: add gre packets testing to flow_dissector
selftests/bpf: migrate flow_dissector namespace exclusivity test
selftests/bpf: Enable generic tc actions in selftests config
selftests/bpf: move ip checksum helper to network helpers
selftests/bpf: document pseudo-header checksum helpers
selftests/bpf: use the same udp and tcp headers in tests under test_progs
selftests/bpf: add network helpers to generate udp checksums
selftests/bpf: migrate bpf flow dissectors tests to test_progs
selftests/bpf: remove test_flow_dissector.sh
tools/testing/selftests/bpf/.gitignore | 1 -
tools/testing/selftests/bpf/Makefile | 3 +-
tools/testing/selftests/bpf/config | 1 +
tools/testing/selftests/bpf/network_helpers.c | 2 +-
tools/testing/selftests/bpf/network_helpers.h | 96 +++
.../selftests/bpf/prog_tests/flow_dissector.c | 323 +++++++--
.../bpf/prog_tests/flow_dissector_classification.c | 792 +++++++++++++++++++++
.../testing/selftests/bpf/prog_tests/sockopt_sk.c | 2 +-
.../testing/selftests/bpf/prog_tests/xdp_bonding.c | 2 +-
.../selftests/bpf/prog_tests/xdp_do_redirect.c | 2 +-
.../selftests/bpf/prog_tests/xdp_flowtable.c | 2 +-
.../selftests/bpf/prog_tests/xdp_metadata.c | 21 +-
.../selftests/bpf/progs/test_cls_redirect.c | 2 +-
.../selftests/bpf/progs/test_cls_redirect.h | 2 +-
.../selftests/bpf/progs/test_cls_redirect_dynptr.c | 2 +-
tools/testing/selftests/bpf/test_flow_dissector.c | 780 --------------------
tools/testing/selftests/bpf/test_flow_dissector.sh | 178 -----
tools/testing/selftests/bpf/test_progs.c | 15 +
tools/testing/selftests/bpf/test_progs.h | 15 +
tools/testing/selftests/bpf/xdp_hw_metadata.c | 2 +-
20 files changed, 1176 insertions(+), 1067 deletions(-)
---
base-commit: 8e403f7465a7c73e3a8ef62bba8cd75ef525e4b1
change-id: 20241019-flow_dissector-3eb0c07fc163
Best regards,
--
Alexis Lothoré, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
Currently the temporary address is not removed when mngtmpaddr is deleted
or becomes unmanaged. The patch set fixed this issue and add a related
test.
v2:
1) delete the tempaddrs directly instead of remove them in addrconf_verify_rtnl(Sam Edwards)
2) Update the test case by checking the address including, add Sam in SOB (Sam Edwards)
Hangbin Liu (2):
net/ipv6: delete temporary address if mngtmpaddr is removed or
unmanaged
selftests/rtnetlink.sh: add mngtempaddr test
net/ipv6/addrconf.c | 41 +++++++---
tools/testing/selftests/net/rtnetlink.sh | 95 ++++++++++++++++++++++++
2 files changed, 124 insertions(+), 12 deletions(-)
--
2.46.0
Two small fixes for vsock: poll() missing a queue check, and close() not
invoking sockmap cleanup.
Signed-off-by: Michal Luczaj <mhal(a)rbox.co>
---
Michal Luczaj (4):
bpf, vsock: Fix poll() missing a queue
selftest/bpf: Add test for af_vsock poll()
bpf, vsock: Invoke proto::close on close()
selftest/bpf: Add test for vsock removal from sockmap on close()
net/vmw_vsock/af_vsock.c | 70 ++++++++++++--------
.../selftests/bpf/prog_tests/sockmap_basic.c | 77 ++++++++++++++++++++++
2 files changed, 120 insertions(+), 27 deletions(-)
---
base-commit: 6c4139b0f19b7397286897caee379f8321e78272
change-id: 20241118-vsock-bpf-poll-close-64f432e682ec
Best regards,
--
Michal Luczaj <mhal(a)rbox.co>
Compiled binary files should be added to .gitignore
'git status' complains:
Untracked files:
(use "git add <file>..." to include in what will be committed)
mm/hugetlb_dio
mm/pkey_sighandler_tests_32
mm/pkey_sighandler_tests_64
Cc: Donet Tom <donettom(a)linux.ibm.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Shuah Khan <shuah(a)kernel.org>
Signed-off-by: Li Zhijian <lizhijian(a)fujitsu.com>
---
Cc: linux-mm(a)kvack.org
---
Hello,
Cover letter is here.
This patch set aims to make 'git status' clear after 'make' and 'make
run_tests' for kselftests.
---
V3:
nothing change, just resend it
(This .gitignore have not sorted, so I appended new files to the end)
V2:
split as seperate patch from a small one [0]
[0] https://lore.kernel.org/linux-kselftest/20241015010817.453539-1-lizhijian@f…
---
tools/testing/selftests/mm/.gitignore | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tools/testing/selftests/mm/.gitignore b/tools/testing/selftests/mm/.gitignore
index da030b43e43b..2ac11b7fcb26 100644
--- a/tools/testing/selftests/mm/.gitignore
+++ b/tools/testing/selftests/mm/.gitignore
@@ -51,3 +51,5 @@ hugetlb_madv_vs_map
mseal_test
seal_elf
droppable
+hugetlb_dio
+pkey_sighandler_tests*
--
2.44.0
Currently, when testing a certain target in selftests, executing the
command 'make TARGETS=XX -C tools/testing/selftests' succeeds for non-BPF,
but a similar command fails for BPF:
'''
make TARGETS=bpf -C tools/testing/selftests
make: Entering directory '/linux-kselftest/tools/testing/selftests'
make: *** [Makefile:197: all] Error 1
make: Leaving directory '/linux-kselftest/tools/testing/selftests'
'''
The reason is that the previous commit:
commit 7a6eb7c34a78 ("selftests: Skip BPF seftests by default")
led to the default filtering of bpf in TARGETS which make TARGETS empty.
That commit also mentioned that building BPF tests requires external
commands to run. This caused target like 'bpf' or 'sched_ext' defined
in SKIP_TARGETS to need an additional specification of SKIP_TARGETS as
empty to avoid skipping it, for example:
'''
make TARGETS=bpf SKIP_TARGETS="" -C tools/testing/selftests
'''
If special steps are required to execute certain test, it is extremely
unfair. We need a fairer way to treat different test targets.
This commit provider a way: If a user has specified a single TARGETS,
it indicates an expectation to run the specified target, and thus the
object should not be skipped.
Another way is to change TARGETS to DEFAULT_TARGETS in the Makefile and
then check if the user specified TARGETS and decide whether filter or not,
though this approach requires too many modifications.
Signed-off-by: Jiayuan Chen <mrpre(a)163.com>
---
tools/testing/selftests/Makefile | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 363d031a16f7..d76c1781ec09 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -116,7 +116,7 @@ TARGETS += vDSO
TARGETS += mm
TARGETS += x86
TARGETS += zram
-#Please keep the TARGETS list alphabetically sorted
+# Please keep the TARGETS list alphabetically sorted
# Run "make quicktest=1 run_tests" or
# "make quicktest=1 kselftest" from top level Makefile
@@ -132,12 +132,15 @@ endif
# User can optionally provide a TARGETS skiplist. By default we skip
# targets using BPF since it has cutting edge build time dependencies
-# which require more effort to install.
+# If user provide custom TARGETS, we just ignore SKIP_TARGETS so that
+# user can easy to test single target which defined in SKIP_TARGETS
SKIP_TARGETS ?= bpf sched_ext
ifneq ($(SKIP_TARGETS),)
+ifneq ($(words $(TARGETS)), 1)
TMP := $(filter-out $(SKIP_TARGETS), $(TARGETS))
override TARGETS := $(TMP)
endif
+endif
# User can set FORCE_TARGETS to 1 to require all targets to be successfully
# built; make will fail if any of the targets cannot be built. If
base-commit: 67b6d342fb6d5abfbeb71e0f23141b9b96cf7bb1
--
2.43.5
From: Reinette Chatre <reinette.chatre(a)intel.com>
[ Upstream commit 46058430fc5d39c114f7e1b9c6ff14c9f41bd531 ]
resctrl selftests discover system properties via a variety of sysfs files.
The MBM and MBA tests need to discover the event and umask with which to
configure the performance event used to measure read memory bandwidth.
This is done by parsing the contents of
/sys/bus/event_source/devices/uncore_imc_<imc instance>/events/cas_count_read
Similarly, the resctrl selftests discover the cache size via
/sys/bus/cpu/devices/cpu<id>/cache/index<index>/size.
Take care to do bounds checking when using fscanf() to read the
contents of files into a string buffer because by default fscanf() assumes
arbitrarily long strings. If the file contains more bytes than the array
can accommodate then an overflow will occur.
Provide a maximum field width to the conversion specifier to protect
against array overflow. The maximum is one less than the array size because
string input stores a terminating null byte that is not covered by the
maximum field width.
Signed-off-by: Reinette Chatre <reinette.chatre(a)intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
Signed-off-by: Shuah Khan <skhan(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/resctrl/resctrl_val.c | 4 ++--
tools/testing/selftests/resctrl/resctrlfs.c | 2 +-
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/resctrl/resctrl_val.c b/tools/testing/selftests/resctrl/resctrl_val.c
index 45439e726e79c..438dd86f2704c 100644
--- a/tools/testing/selftests/resctrl/resctrl_val.c
+++ b/tools/testing/selftests/resctrl/resctrl_val.c
@@ -179,7 +179,7 @@ static int read_from_imc_dir(char *imc_dir, int count)
return -1;
}
- if (fscanf(fp, "%s", cas_count_cfg) <= 0) {
+ if (fscanf(fp, "%1023s", cas_count_cfg) <= 0) {
ksft_perror("Could not get iMC cas count read");
fclose(fp);
@@ -197,7 +197,7 @@ static int read_from_imc_dir(char *imc_dir, int count)
return -1;
}
- if (fscanf(fp, "%s", cas_count_cfg) <= 0) {
+ if (fscanf(fp, "%1023s", cas_count_cfg) <= 0) {
ksft_perror("Could not get iMC cas count write");
fclose(fp);
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
index 71ad2b335b83f..fe3241799841b 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -160,7 +160,7 @@ int get_cache_size(int cpu_no, char *cache_type, unsigned long *cache_size)
return -1;
}
- if (fscanf(fp, "%s", cache_str) <= 0) {
+ if (fscanf(fp, "%63s", cache_str) <= 0) {
ksft_perror("Could not get cache_size");
fclose(fp);
--
2.43.0
From: Reinette Chatre <reinette.chatre(a)intel.com>
[ Upstream commit 46058430fc5d39c114f7e1b9c6ff14c9f41bd531 ]
resctrl selftests discover system properties via a variety of sysfs files.
The MBM and MBA tests need to discover the event and umask with which to
configure the performance event used to measure read memory bandwidth.
This is done by parsing the contents of
/sys/bus/event_source/devices/uncore_imc_<imc instance>/events/cas_count_read
Similarly, the resctrl selftests discover the cache size via
/sys/bus/cpu/devices/cpu<id>/cache/index<index>/size.
Take care to do bounds checking when using fscanf() to read the
contents of files into a string buffer because by default fscanf() assumes
arbitrarily long strings. If the file contains more bytes than the array
can accommodate then an overflow will occur.
Provide a maximum field width to the conversion specifier to protect
against array overflow. The maximum is one less than the array size because
string input stores a terminating null byte that is not covered by the
maximum field width.
Signed-off-by: Reinette Chatre <reinette.chatre(a)intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
Signed-off-by: Shuah Khan <skhan(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/resctrl/resctrl_val.c | 4 ++--
tools/testing/selftests/resctrl/resctrlfs.c | 2 +-
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/resctrl/resctrl_val.c b/tools/testing/selftests/resctrl/resctrl_val.c
index 8c275f6b4dd77..1bba85e4c0675 100644
--- a/tools/testing/selftests/resctrl/resctrl_val.c
+++ b/tools/testing/selftests/resctrl/resctrl_val.c
@@ -160,7 +160,7 @@ static int read_from_imc_dir(char *imc_dir, int count)
return -1;
}
- if (fscanf(fp, "%s", cas_count_cfg) <= 0) {
+ if (fscanf(fp, "%1023s", cas_count_cfg) <= 0) {
ksft_perror("Could not get iMC cas count read");
fclose(fp);
@@ -178,7 +178,7 @@ static int read_from_imc_dir(char *imc_dir, int count)
return -1;
}
- if (fscanf(fp, "%s", cas_count_cfg) <= 0) {
+ if (fscanf(fp, "%1023s", cas_count_cfg) <= 0) {
ksft_perror("Could not get iMC cas count write");
fclose(fp);
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
index 250c320349a78..a53cd1cb6e0c6 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -182,7 +182,7 @@ int get_cache_size(int cpu_no, const char *cache_type, unsigned long *cache_size
return -1;
}
- if (fscanf(fp, "%s", cache_str) <= 0) {
+ if (fscanf(fp, "%63s", cache_str) <= 0) {
ksft_perror("Could not get cache_size");
fclose(fp);
--
2.43.0
From: Reinette Chatre <reinette.chatre(a)intel.com>
[ Upstream commit 46058430fc5d39c114f7e1b9c6ff14c9f41bd531 ]
resctrl selftests discover system properties via a variety of sysfs files.
The MBM and MBA tests need to discover the event and umask with which to
configure the performance event used to measure read memory bandwidth.
This is done by parsing the contents of
/sys/bus/event_source/devices/uncore_imc_<imc instance>/events/cas_count_read
Similarly, the resctrl selftests discover the cache size via
/sys/bus/cpu/devices/cpu<id>/cache/index<index>/size.
Take care to do bounds checking when using fscanf() to read the
contents of files into a string buffer because by default fscanf() assumes
arbitrarily long strings. If the file contains more bytes than the array
can accommodate then an overflow will occur.
Provide a maximum field width to the conversion specifier to protect
against array overflow. The maximum is one less than the array size because
string input stores a terminating null byte that is not covered by the
maximum field width.
Signed-off-by: Reinette Chatre <reinette.chatre(a)intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
Signed-off-by: Shuah Khan <skhan(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/resctrl/resctrl_val.c | 4 ++--
tools/testing/selftests/resctrl/resctrlfs.c | 2 +-
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/resctrl/resctrl_val.c b/tools/testing/selftests/resctrl/resctrl_val.c
index 8c275f6b4dd77..1bba85e4c0675 100644
--- a/tools/testing/selftests/resctrl/resctrl_val.c
+++ b/tools/testing/selftests/resctrl/resctrl_val.c
@@ -160,7 +160,7 @@ static int read_from_imc_dir(char *imc_dir, int count)
return -1;
}
- if (fscanf(fp, "%s", cas_count_cfg) <= 0) {
+ if (fscanf(fp, "%1023s", cas_count_cfg) <= 0) {
ksft_perror("Could not get iMC cas count read");
fclose(fp);
@@ -178,7 +178,7 @@ static int read_from_imc_dir(char *imc_dir, int count)
return -1;
}
- if (fscanf(fp, "%s", cas_count_cfg) <= 0) {
+ if (fscanf(fp, "%1023s", cas_count_cfg) <= 0) {
ksft_perror("Could not get iMC cas count write");
fclose(fp);
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
index 250c320349a78..a53cd1cb6e0c6 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -182,7 +182,7 @@ int get_cache_size(int cpu_no, const char *cache_type, unsigned long *cache_size
return -1;
}
- if (fscanf(fp, "%s", cache_str) <= 0) {
+ if (fscanf(fp, "%63s", cache_str) <= 0) {
ksft_perror("Could not get cache_size");
fclose(fp);
--
2.43.0
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit 27141b690547da5650a420f26ec369ba142a9ebb ]
The PAC exec_sign_all() test spawns some child processes, creating pipes
to be stdin and stdout for the child. It cleans up most of the file
descriptors that are created as part of this but neglects to clean up the
parent end of the child stdin and stdout. Add the missing close() calls.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20241111-arm64-pac-test-collisions-v1-1-171875f37…
Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/arm64/pauth/pac.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/testing/selftests/arm64/pauth/pac.c b/tools/testing/selftests/arm64/pauth/pac.c
index b743daa772f55..5a07b3958fbf2 100644
--- a/tools/testing/selftests/arm64/pauth/pac.c
+++ b/tools/testing/selftests/arm64/pauth/pac.c
@@ -182,6 +182,9 @@ int exec_sign_all(struct signatures *signed_vals, size_t val)
return -1;
}
+ close(new_stdin[1]);
+ close(new_stdout[0]);
+
return 0;
}
--
2.43.0
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit 27141b690547da5650a420f26ec369ba142a9ebb ]
The PAC exec_sign_all() test spawns some child processes, creating pipes
to be stdin and stdout for the child. It cleans up most of the file
descriptors that are created as part of this but neglects to clean up the
parent end of the child stdin and stdout. Add the missing close() calls.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20241111-arm64-pac-test-collisions-v1-1-171875f37…
Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/arm64/pauth/pac.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/testing/selftests/arm64/pauth/pac.c b/tools/testing/selftests/arm64/pauth/pac.c
index b743daa772f55..5a07b3958fbf2 100644
--- a/tools/testing/selftests/arm64/pauth/pac.c
+++ b/tools/testing/selftests/arm64/pauth/pac.c
@@ -182,6 +182,9 @@ int exec_sign_all(struct signatures *signed_vals, size_t val)
return -1;
}
+ close(new_stdin[1]);
+ close(new_stdout[0]);
+
return 0;
}
--
2.43.0
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit 3e360ef0c0a1fb6ce9a302e40b8057c41ba8a9d2 ]
When building for streaming SVE the irritator for SVE skips updates of both
P0 and FFR. While FFR is skipped since it might not be present there is no
reason to skip corrupting P0 so switch to an instruction valid in streaming
mode and move the ifdef.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20241107-arm64-fp-stress-irritator-v2-3-c4b9622e3…
Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/arm64/fp/sve-test.S | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/arm64/fp/sve-test.S b/tools/testing/selftests/arm64/fp/sve-test.S
index e3e08d9c7020e..f631d17c899d2 100644
--- a/tools/testing/selftests/arm64/fp/sve-test.S
+++ b/tools/testing/selftests/arm64/fp/sve-test.S
@@ -459,7 +459,8 @@ function irritator_handler
movi v9.16b, #2
movi v31.8b, #3
// And P0
- rdffr p0.b
+ ptrue p0.d
+#ifndef SSVE
// And FFR
wrffr p15.b
--
2.43.0
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit 27141b690547da5650a420f26ec369ba142a9ebb ]
The PAC exec_sign_all() test spawns some child processes, creating pipes
to be stdin and stdout for the child. It cleans up most of the file
descriptors that are created as part of this but neglects to clean up the
parent end of the child stdin and stdout. Add the missing close() calls.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20241111-arm64-pac-test-collisions-v1-1-171875f37…
Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/arm64/pauth/pac.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/testing/selftests/arm64/pauth/pac.c b/tools/testing/selftests/arm64/pauth/pac.c
index b743daa772f55..5a07b3958fbf2 100644
--- a/tools/testing/selftests/arm64/pauth/pac.c
+++ b/tools/testing/selftests/arm64/pauth/pac.c
@@ -182,6 +182,9 @@ int exec_sign_all(struct signatures *signed_vals, size_t val)
return -1;
}
+ close(new_stdin[1]);
+ close(new_stdout[0]);
+
return 0;
}
--
2.43.0
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit 3e360ef0c0a1fb6ce9a302e40b8057c41ba8a9d2 ]
When building for streaming SVE the irritator for SVE skips updates of both
P0 and FFR. While FFR is skipped since it might not be present there is no
reason to skip corrupting P0 so switch to an instruction valid in streaming
mode and move the ifdef.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20241107-arm64-fp-stress-irritator-v2-3-c4b9622e3…
Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/arm64/fp/sve-test.S | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/arm64/fp/sve-test.S b/tools/testing/selftests/arm64/fp/sve-test.S
index 2a18cb4c528c5..b9648daeabbb1 100644
--- a/tools/testing/selftests/arm64/fp/sve-test.S
+++ b/tools/testing/selftests/arm64/fp/sve-test.S
@@ -304,9 +304,9 @@ function irritator_handler
movi v0.8b, #1
movi v9.16b, #2
movi v31.8b, #3
-#ifndef SSVE
// And P0
- rdffr p0.b
+ ptrue p0.d
+#ifndef SSVE
// And FFR
wrffr p15.b
#endif
--
2.43.0
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit 27141b690547da5650a420f26ec369ba142a9ebb ]
The PAC exec_sign_all() test spawns some child processes, creating pipes
to be stdin and stdout for the child. It cleans up most of the file
descriptors that are created as part of this but neglects to clean up the
parent end of the child stdin and stdout. Add the missing close() calls.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20241111-arm64-pac-test-collisions-v1-1-171875f37…
Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/arm64/pauth/pac.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/testing/selftests/arm64/pauth/pac.c b/tools/testing/selftests/arm64/pauth/pac.c
index b743daa772f55..5a07b3958fbf2 100644
--- a/tools/testing/selftests/arm64/pauth/pac.c
+++ b/tools/testing/selftests/arm64/pauth/pac.c
@@ -182,6 +182,9 @@ int exec_sign_all(struct signatures *signed_vals, size_t val)
return -1;
}
+ close(new_stdin[1]);
+ close(new_stdout[0]);
+
return 0;
}
--
2.43.0
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit 3e360ef0c0a1fb6ce9a302e40b8057c41ba8a9d2 ]
When building for streaming SVE the irritator for SVE skips updates of both
P0 and FFR. While FFR is skipped since it might not be present there is no
reason to skip corrupting P0 so switch to an instruction valid in streaming
mode and move the ifdef.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20241107-arm64-fp-stress-irritator-v2-3-c4b9622e3…
Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/arm64/fp/sve-test.S | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/arm64/fp/sve-test.S b/tools/testing/selftests/arm64/fp/sve-test.S
index 4328895dfc876..ff60360a97f80 100644
--- a/tools/testing/selftests/arm64/fp/sve-test.S
+++ b/tools/testing/selftests/arm64/fp/sve-test.S
@@ -304,9 +304,9 @@ function irritator_handler
movi v0.8b, #1
movi v9.16b, #2
movi v31.8b, #3
-#ifndef SSVE
// And P0
- rdffr p0.b
+ ptrue p0.d
+#ifndef SSVE
// And FFR
wrffr p15.b
#endif
--
2.43.0
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit dca93d29845dfed60910ba13dbfb6ae6a0e19f6d ]
Currently if we encounter an error between fork() and exec() of a child
process we log the error to stderr. This means that the errors don't get
annotated with the child information which makes diagnostics harder and
means that if we miss the exit signal from the child we can deadlock
waiting for output from the child. Improve robustness and output quality
by logging to stdout instead.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20241023-arm64-fp-stress-exec-fail-v1-1-ee3c62932…
Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/arm64/fp/fp-stress.c | 15 +++++++--------
1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/tools/testing/selftests/arm64/fp/fp-stress.c b/tools/testing/selftests/arm64/fp/fp-stress.c
index dd31647b00a22..cf9d7b2e4630c 100644
--- a/tools/testing/selftests/arm64/fp/fp-stress.c
+++ b/tools/testing/selftests/arm64/fp/fp-stress.c
@@ -79,7 +79,7 @@ static void child_start(struct child_data *child, const char *program)
*/
ret = dup2(pipefd[1], 1);
if (ret == -1) {
- fprintf(stderr, "dup2() %d\n", errno);
+ printf("dup2() %d\n", errno);
exit(EXIT_FAILURE);
}
@@ -89,7 +89,7 @@ static void child_start(struct child_data *child, const char *program)
*/
ret = dup2(startup_pipe[0], 3);
if (ret == -1) {
- fprintf(stderr, "dup2() %d\n", errno);
+ printf("dup2() %d\n", errno);
exit(EXIT_FAILURE);
}
@@ -107,16 +107,15 @@ static void child_start(struct child_data *child, const char *program)
*/
ret = read(3, &i, sizeof(i));
if (ret < 0)
- fprintf(stderr, "read(startp pipe) failed: %s (%d)\n",
- strerror(errno), errno);
+ printf("read(startp pipe) failed: %s (%d)\n",
+ strerror(errno), errno);
if (ret > 0)
- fprintf(stderr, "%d bytes of data on startup pipe\n",
- ret);
+ printf("%d bytes of data on startup pipe\n", ret);
close(3);
ret = execl(program, program, NULL);
- fprintf(stderr, "execl(%s) failed: %d (%s)\n",
- program, errno, strerror(errno));
+ printf("execl(%s) failed: %d (%s)\n",
+ program, errno, strerror(errno));
exit(EXIT_FAILURE);
} else {
--
2.43.0
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit 27141b690547da5650a420f26ec369ba142a9ebb ]
The PAC exec_sign_all() test spawns some child processes, creating pipes
to be stdin and stdout for the child. It cleans up most of the file
descriptors that are created as part of this but neglects to clean up the
parent end of the child stdin and stdout. Add the missing close() calls.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20241111-arm64-pac-test-collisions-v1-1-171875f37…
Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/arm64/pauth/pac.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/testing/selftests/arm64/pauth/pac.c b/tools/testing/selftests/arm64/pauth/pac.c
index b743daa772f55..5a07b3958fbf2 100644
--- a/tools/testing/selftests/arm64/pauth/pac.c
+++ b/tools/testing/selftests/arm64/pauth/pac.c
@@ -182,6 +182,9 @@ int exec_sign_all(struct signatures *signed_vals, size_t val)
return -1;
}
+ close(new_stdin[1]);
+ close(new_stdout[0]);
+
return 0;
}
--
2.43.0
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit 3e360ef0c0a1fb6ce9a302e40b8057c41ba8a9d2 ]
When building for streaming SVE the irritator for SVE skips updates of both
P0 and FFR. While FFR is skipped since it might not be present there is no
reason to skip corrupting P0 so switch to an instruction valid in streaming
mode and move the ifdef.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20241107-arm64-fp-stress-irritator-v2-3-c4b9622e3…
Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/arm64/fp/sve-test.S | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/arm64/fp/sve-test.S b/tools/testing/selftests/arm64/fp/sve-test.S
index fff60e2a25add..4fcb492aee1fb 100644
--- a/tools/testing/selftests/arm64/fp/sve-test.S
+++ b/tools/testing/selftests/arm64/fp/sve-test.S
@@ -304,9 +304,9 @@ function irritator_handler
movi v0.8b, #1
movi v9.16b, #2
movi v31.8b, #3
-#ifndef SSVE
// And P0
- rdffr p0.b
+ ptrue p0.d
+#ifndef SSVE
// And FFR
wrffr p15.b
#endif
--
2.43.0
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit dca93d29845dfed60910ba13dbfb6ae6a0e19f6d ]
Currently if we encounter an error between fork() and exec() of a child
process we log the error to stderr. This means that the errors don't get
annotated with the child information which makes diagnostics harder and
means that if we miss the exit signal from the child we can deadlock
waiting for output from the child. Improve robustness and output quality
by logging to stdout instead.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20241023-arm64-fp-stress-exec-fail-v1-1-ee3c62932…
Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/arm64/fp/fp-stress.c | 15 +++++++--------
1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/tools/testing/selftests/arm64/fp/fp-stress.c b/tools/testing/selftests/arm64/fp/fp-stress.c
index faac24bdefeb9..80f22789504d6 100644
--- a/tools/testing/selftests/arm64/fp/fp-stress.c
+++ b/tools/testing/selftests/arm64/fp/fp-stress.c
@@ -79,7 +79,7 @@ static void child_start(struct child_data *child, const char *program)
*/
ret = dup2(pipefd[1], 1);
if (ret == -1) {
- fprintf(stderr, "dup2() %d\n", errno);
+ printf("dup2() %d\n", errno);
exit(EXIT_FAILURE);
}
@@ -89,7 +89,7 @@ static void child_start(struct child_data *child, const char *program)
*/
ret = dup2(startup_pipe[0], 3);
if (ret == -1) {
- fprintf(stderr, "dup2() %d\n", errno);
+ printf("dup2() %d\n", errno);
exit(EXIT_FAILURE);
}
@@ -107,16 +107,15 @@ static void child_start(struct child_data *child, const char *program)
*/
ret = read(3, &i, sizeof(i));
if (ret < 0)
- fprintf(stderr, "read(startp pipe) failed: %s (%d)\n",
- strerror(errno), errno);
+ printf("read(startp pipe) failed: %s (%d)\n",
+ strerror(errno), errno);
if (ret > 0)
- fprintf(stderr, "%d bytes of data on startup pipe\n",
- ret);
+ printf("%d bytes of data on startup pipe\n", ret);
close(3);
ret = execl(program, program, NULL);
- fprintf(stderr, "execl(%s) failed: %d (%s)\n",
- program, errno, strerror(errno));
+ printf("execl(%s) failed: %d (%s)\n",
+ program, errno, strerror(errno));
exit(EXIT_FAILURE);
} else {
--
2.43.0
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit 27141b690547da5650a420f26ec369ba142a9ebb ]
The PAC exec_sign_all() test spawns some child processes, creating pipes
to be stdin and stdout for the child. It cleans up most of the file
descriptors that are created as part of this but neglects to clean up the
parent end of the child stdin and stdout. Add the missing close() calls.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20241111-arm64-pac-test-collisions-v1-1-171875f37…
Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/arm64/pauth/pac.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/testing/selftests/arm64/pauth/pac.c b/tools/testing/selftests/arm64/pauth/pac.c
index b743daa772f55..5a07b3958fbf2 100644
--- a/tools/testing/selftests/arm64/pauth/pac.c
+++ b/tools/testing/selftests/arm64/pauth/pac.c
@@ -182,6 +182,9 @@ int exec_sign_all(struct signatures *signed_vals, size_t val)
return -1;
}
+ close(new_stdin[1]);
+ close(new_stdout[0]);
+
return 0;
}
--
2.43.0
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit 3e360ef0c0a1fb6ce9a302e40b8057c41ba8a9d2 ]
When building for streaming SVE the irritator for SVE skips updates of both
P0 and FFR. While FFR is skipped since it might not be present there is no
reason to skip corrupting P0 so switch to an instruction valid in streaming
mode and move the ifdef.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20241107-arm64-fp-stress-irritator-v2-3-c4b9622e3…
Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/arm64/fp/sve-test.S | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/arm64/fp/sve-test.S b/tools/testing/selftests/arm64/fp/sve-test.S
index fff60e2a25add..4fcb492aee1fb 100644
--- a/tools/testing/selftests/arm64/fp/sve-test.S
+++ b/tools/testing/selftests/arm64/fp/sve-test.S
@@ -304,9 +304,9 @@ function irritator_handler
movi v0.8b, #1
movi v9.16b, #2
movi v31.8b, #3
-#ifndef SSVE
// And P0
- rdffr p0.b
+ ptrue p0.d
+#ifndef SSVE
// And FFR
wrffr p15.b
#endif
--
2.43.0
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit dca93d29845dfed60910ba13dbfb6ae6a0e19f6d ]
Currently if we encounter an error between fork() and exec() of a child
process we log the error to stderr. This means that the errors don't get
annotated with the child information which makes diagnostics harder and
means that if we miss the exit signal from the child we can deadlock
waiting for output from the child. Improve robustness and output quality
by logging to stdout instead.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20241023-arm64-fp-stress-exec-fail-v1-1-ee3c62932…
Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/arm64/fp/fp-stress.c | 15 +++++++--------
1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/tools/testing/selftests/arm64/fp/fp-stress.c b/tools/testing/selftests/arm64/fp/fp-stress.c
index faac24bdefeb9..80f22789504d6 100644
--- a/tools/testing/selftests/arm64/fp/fp-stress.c
+++ b/tools/testing/selftests/arm64/fp/fp-stress.c
@@ -79,7 +79,7 @@ static void child_start(struct child_data *child, const char *program)
*/
ret = dup2(pipefd[1], 1);
if (ret == -1) {
- fprintf(stderr, "dup2() %d\n", errno);
+ printf("dup2() %d\n", errno);
exit(EXIT_FAILURE);
}
@@ -89,7 +89,7 @@ static void child_start(struct child_data *child, const char *program)
*/
ret = dup2(startup_pipe[0], 3);
if (ret == -1) {
- fprintf(stderr, "dup2() %d\n", errno);
+ printf("dup2() %d\n", errno);
exit(EXIT_FAILURE);
}
@@ -107,16 +107,15 @@ static void child_start(struct child_data *child, const char *program)
*/
ret = read(3, &i, sizeof(i));
if (ret < 0)
- fprintf(stderr, "read(startp pipe) failed: %s (%d)\n",
- strerror(errno), errno);
+ printf("read(startp pipe) failed: %s (%d)\n",
+ strerror(errno), errno);
if (ret > 0)
- fprintf(stderr, "%d bytes of data on startup pipe\n",
- ret);
+ printf("%d bytes of data on startup pipe\n", ret);
close(3);
ret = execl(program, program, NULL);
- fprintf(stderr, "execl(%s) failed: %d (%s)\n",
- program, errno, strerror(errno));
+ printf("execl(%s) failed: %d (%s)\n",
+ program, errno, strerror(errno));
exit(EXIT_FAILURE);
} else {
--
2.43.0
Compiled binary files should be added to .gitignore
'git status' complains:
Untracked files:
(use "git add <file>..." to include in what will be committed)
alsa/global-timer
alsa/utimer-test
Cc: Mark Brown <broonie(a)kernel.org>
Cc: Jaroslav Kysela <perex(a)perex.cz>
Cc: Takashi Iwai <tiwai(a)suse.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Signed-off-by: Li Zhijian <lizhijian(a)fujitsu.com>
---
Cc: linux-sound(a)vger.kernel.org
---
Hello,
Cover letter is here.
This patch set aims to make 'git status' clear after 'make' and 'make
run_tests' for kselftests.
---
V3:
sorted the ignore files
V2:
split as a separate patch from a small one [0]
[0] https://lore.kernel.org/linux-kselftest/20241015010817.453539-1-lizhijian@f…
Signed-off-by: Li Zhijian <lizhijian(a)fujitsu.com>
---
tools/testing/selftests/alsa/.gitignore | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tools/testing/selftests/alsa/.gitignore b/tools/testing/selftests/alsa/.gitignore
index 12dc3fcd3456..3dd8e1176b89 100644
--- a/tools/testing/selftests/alsa/.gitignore
+++ b/tools/testing/selftests/alsa/.gitignore
@@ -1,3 +1,5 @@
+global-timer
mixer-test
pcm-test
test-pcmtest-driver
+utimer-test
--
2.44.0
The include.sh file is generated for inclusion and should not be executable.
Otherwise, it will be added to kselftest-list.txt. Additionally, add the
executable bit for test.py at the same time to ensure proper functionality.
Fixes: 3ade6ce1255e ("selftests: rds: add testing infrastructure")
Signed-off-by: Hangbin Liu <liuhangbin(a)gmail.com>
---
tools/testing/selftests/net/rds/Makefile | 3 ++-
tools/testing/selftests/net/rds/test.py | 0
2 files changed, 2 insertions(+), 1 deletion(-)
mode change 100644 => 100755 tools/testing/selftests/net/rds/test.py
diff --git a/tools/testing/selftests/net/rds/Makefile b/tools/testing/selftests/net/rds/Makefile
index da9714bc7aad..cf30307a829b 100644
--- a/tools/testing/selftests/net/rds/Makefile
+++ b/tools/testing/selftests/net/rds/Makefile
@@ -4,9 +4,10 @@ all:
@echo mk_build_dir="$(shell pwd)" > include.sh
TEST_PROGS := run.sh \
- include.sh \
test.py
+TEST_FILES := include.sh
+
EXTRA_CLEAN := /tmp/rds_logs
include ../../lib.mk
diff --git a/tools/testing/selftests/net/rds/test.py b/tools/testing/selftests/net/rds/test.py
old mode 100644
new mode 100755
--
2.39.3 (Apple Git-146)
Hi Linus,
Please pull the following kunit update for Linux 6.13-rc1.
This pull request is fixed up with the right fix for the UAF bug and
includes other fixes.
linux_kselftest-kunit-6.13-rc1-fixed
kunit update for Linux 6.13-rc1
-- fixes user-after-free (UAF) bug in kunit_init_suite()
-- adds option to kunit tool to print just the summary of test results
-- adds option to kunit tool to print just the failed test results
-- fixes kunit_zalloc_skb() to use user passed in gfp value instead of
hardcoding GFP_KERNEL
-- fixes kunit_zalloc_skb() kernel doc to include allocation flags variable
-- updates KUnit email address for Brendan Higgins
-- adds LoongArch config to qemu_configs
-- changes tool to allow overriding the shutdown mode from qemu config
-- enables shutdown in loongarch qemu_config
-- fixes potential null dereference in kunit_device_driver_test()
-- fixes debugfs to use IS_ERR() for alloc_string_stream() error check
diff is attached.
Tests passed on my kunit repo & linux-next:
- Build make allmodconfig
./tools/testing/kunit/kunit.py run
./tools/testing/kunit/kunit.py run --alltests
./tools/testing/kunit/kunit.py run --arch x86_64
./tools/testing/kunit/kunit.py run --alltests --arch x86_64
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit 2d5404caa8c7bb5c4e0435f94b28834ae5456623:
Linux 6.12-rc7 (2024-11-10 14:19:35 -0800)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest tags/linux_kselftest-kunit-6.13-rc1-fixed
for you to fetch changes up to 62adcae479fe5bc04fa3b6c3f93bd340441f8b25:
kunit: qemu_configs: loongarch: Enable shutdown (2024-11-19 15:26:30 -0700)
----------------------------------------------------------------
linux_kselftest-kunit-6.13-rc1-fixed
kunit update for Linux 6.13-rc1
-- fixes user-after-free (UAF) bug in kunit_init_suite()
-- adds option to kunit tool to print just the summary of test results
-- adds option to kunit tool to print just the failed test results
-- fixes kunit_zalloc_skb() to use user passed in gfp value instead of
hardcoding GFP_KERNEL
-- fixes kunit_zalloc_skb() kernel doc to include allocation flags variable
-- updates KUnit email address for Brendan Higgins
-- adds LoongArch config to qemu_configs
-- changes tool to allow overriding the shutdown mode from qemu config
-- enables shutdown in loongarch qemu_config
-- fixes potential null dereference in kunit_device_driver_test()
-- fixes debugfs to use IS_ERR() for alloc_string_stream() error check
----------------------------------------------------------------
Brendan Higgins (1):
MAINTAINERS: Update KUnit email address for Brendan Higgins
Dan Carpenter (2):
kunit: skb: use "gfp" variable instead of hardcoding GFP_KERNEL
kunit: skb: add gfp to kernel doc for kunit_zalloc_skb()
David Gow (1):
kunit: tool: Only print the summary
Jinjie Ruan (1):
kunit: string-stream: Fix a UAF bug in kunit_init_suite()
Kuan-Wei Chiu (1):
kunit: debugfs: Use IS_ERR() for alloc_string_stream() error check
Rae Moar (1):
kunit: tool: print failed tests only
Thomas Weißschuh (3):
kunit: qemu_configs: Add LoongArch config
kunit: tool: Allow overriding the shutdown mode from qemu config
kunit: qemu_configs: loongarch: Enable shutdown
Zichen Xie (1):
kunit: Fix potential null dereference in kunit_device_driver_test()
MAINTAINERS | 2 +-
include/kunit/skbuff.h | 5 +-
lib/kunit/debugfs.c | 9 +-
lib/kunit/kunit-test.c | 2 +
tools/testing/kunit/kunit.py | 28 +++++-
tools/testing/kunit/kunit_kernel.py | 4 +-
tools/testing/kunit/kunit_parser.py | 134 ++++++++++++++++----------
tools/testing/kunit/kunit_printer.py | 14 ++-
tools/testing/kunit/kunit_tool_test.py | 55 +++++------
tools/testing/kunit/qemu_configs/loongarch.py | 21 ++++
10 files changed, 183 insertions(+), 91 deletions(-)
create mode 100644 tools/testing/kunit/qemu_configs/loongarch.py
----------------------------------------------------------------
The testing effort is increasing throughout the community.
The tests are generally merged into the subsystem trees,
and are of relatively narrow interest. The patch volume on
linux-kselftest(a)vger.kernel.org makes it hard to follow
the changes to the framework, and discuss proposals.
Create a new ML focused on the framework of kselftests,
which will hopefully be similarly low volume to the workflows@
mailing list.
From the responses to v1 I gather that the preference is to
keep the existing list for all (or create an alias to it);
the framework-only section should be the one to have a new
list created.
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
---
Sorry for the delay, the responses to v1 weren't super positive
but I keep thinking this would be very useful :) Or at the very
least I find workflows@ very useful and informative as a maintainer.
Posting as an RFC because we need to create the new ML.
v1: https://lore.kernel.org/20241003142328.622730-1-kuba@kernel.org
CC: shuah(a)kernel.org
CC: Tim.Bird(a)sony.com
CC: linux-kselftest(a)vger.kernel.org
CC: workflows(a)vger.kernel.org
---
MAINTAINERS | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/MAINTAINERS b/MAINTAINERS
index f245573722ef..345a0657ba1d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -12395,7 +12395,7 @@ S: Supported
F: Documentation/admin-guide/reporting-regressions.rst
F: Documentation/process/handling-regressions.rst
-KERNEL SELFTEST FRAMEWORK
+KERNEL SELFTEST
M: Shuah Khan <shuah(a)kernel.org>
M: Shuah Khan <skhan(a)linuxfoundation.org>
L: linux-kselftest(a)vger.kernel.org
@@ -12405,6 +12405,19 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git
F: Documentation/dev-tools/kselftest*
F: tools/testing/selftests/
+KERNEL SELFTEST FRAMEWORK
+M: Shuah Khan <shuah(a)kernel.org>
+M: Shuah Khan <skhan(a)linuxfoundation.org>
+L: linux-kselftest-core(a)vger.kernel.org
+S: Maintained
+F: Documentation/dev-tools/kselftest*
+F: tools/testing/selftests/kselftest/
+F: tools/testing/selftests/lib/
+F: tools/testing/selftests/lib.mk
+F: tools/testing/selftests/Makefile
+F: tools/testing/selftests/*.sh
+F: tools/testing/selftests/*.h
+
KERNEL SMB3 SERVER (KSMBD)
M: Namjae Jeon <linkinjeon(a)kernel.org>
M: Steve French <sfrench(a)samba.org>
--
2.47.0
This patch series includes some netns-related improvements and fixes for
RTNL and ip_tunnel, to make link creation more intuitive:
- Creating link in another net namespace doesn't conflict with link names
in current one.
- Refector rtnetlink link creation. Create link in target namespace
directly. Pass both source and link netns to drivers via newlink()
callback.
So that
# ip link add netns ns1 link-netns ns2 tun0 type gre ...
will create tun0 in ns1, rather than create it in ns2 and move to ns1.
And don't conflict with another interface named "tun0" in current netns.
---
v4:
- Pack newlink() parameters to a single struct.
- Use ynl async_msg_queue.empty() in selftest.
v3:
link: https://lore.kernel.org/all/20241113125715.150201-1-shaw.leon@gmail.com/
- Drop "netns_atomic" flag and module parameter. Add netns parameter to
newlink() instead, and convert drivers accordingly.
- Move python NetNSEnter helper to net selftest lib.
v2:
link: https://lore.kernel.org/all/20241107133004.7469-1-shaw.leon@gmail.com/
- Check NLM_F_EXCL to ensure only link creation is affected.
- Add self tests for link name/ifindex conflict and notifications
in different netns.
- Changes in dummy driver and ynl in order to add the test case.
v1:
link: https://lore.kernel.org/all/20241023023146.372653-1-shaw.leon@gmail.com/
Xiao Liang (5):
net: ip_tunnel: Build flow in underlay net namespace
rtnetlink: Lookup device in target netns when creating link
rtnetlink: Decouple net namespaces in rtnl_newlink_create()
selftests: net: Add python context manager for netns entering
selftests: net: Add two test cases for link netns
drivers/infiniband/ulp/ipoib/ipoib_netlink.c | 11 ++++--
drivers/net/amt.c | 13 ++++---
drivers/net/bareudp.c | 11 ++++--
drivers/net/bonding/bond_netlink.c | 8 ++--
drivers/net/can/dev/netlink.c | 4 +-
drivers/net/can/vxcan.c | 11 ++++--
.../ethernet/qualcomm/rmnet/rmnet_config.c | 11 ++++--
drivers/net/geneve.c | 11 ++++--
drivers/net/gtp.c | 9 +++--
drivers/net/ipvlan/ipvlan.h | 4 +-
drivers/net/ipvlan/ipvlan_main.c | 11 ++++--
drivers/net/ipvlan/ipvtap.c | 7 ++--
drivers/net/macsec.c | 11 ++++--
drivers/net/macvlan.c | 8 ++--
drivers/net/macvtap.c | 8 ++--
drivers/net/netkit.c | 11 ++++--
drivers/net/pfcp.c | 8 ++--
drivers/net/ppp/ppp_generic.c | 10 +++--
drivers/net/team/team_core.c | 7 ++--
drivers/net/veth.c | 11 ++++--
drivers/net/vrf.c | 7 ++--
drivers/net/vxlan/vxlan_core.c | 11 ++++--
drivers/net/wireguard/device.c | 8 ++--
drivers/net/wireless/virtual/virt_wifi.c | 10 +++--
drivers/net/wwan/wwan_core.c | 15 +++++--
include/net/ip_tunnels.h | 5 ++-
include/net/rtnetlink.h | 34 +++++++++++++---
net/8021q/vlan_netlink.c | 11 ++++--
net/batman-adv/soft-interface.c | 8 ++--
net/bridge/br_netlink.c | 8 ++--
net/caif/chnl_net.c | 6 +--
net/core/rtnetlink.c | 29 +++++++++-----
net/hsr/hsr_netlink.c | 14 ++++---
net/ieee802154/6lowpan/core.c | 9 +++--
net/ipv4/ip_gre.c | 27 ++++++++-----
net/ipv4/ip_tunnel.c | 16 ++++----
net/ipv4/ip_vti.c | 10 +++--
net/ipv4/ipip.c | 10 +++--
net/ipv6/ip6_gre.c | 28 +++++++------
net/ipv6/ip6_tunnel.c | 16 ++++----
net/ipv6/ip6_vti.c | 15 ++++---
net/ipv6/sit.c | 16 ++++----
net/xfrm/xfrm_interface_core.c | 14 +++----
tools/testing/selftests/net/Makefile | 1 +
.../testing/selftests/net/lib/py/__init__.py | 2 +-
tools/testing/selftests/net/lib/py/netns.py | 18 +++++++++
tools/testing/selftests/net/netns-name.sh | 10 +++++
tools/testing/selftests/net/netns_atomic.py | 39 +++++++++++++++++++
48 files changed, 377 insertions(+), 205 deletions(-)
create mode 100755 tools/testing/selftests/net/netns_atomic.py
--
2.47.0
Hi,
At Linux Plumbers, a few dozen of us gathered together to discuss how
to expose what tests subsystem maintainers would like to run for every
patch submitted or when CI runs tests. We agreed on a mock up of a
yaml template to start gathering info. The yaml file could be
temporarily stored on kernelci.org until a more permanent home could
be found. Attached is a template to start the conversation.
Longer story.
The current problem is CI systems are not unanimous about what tests
they run on submitted patches or git branches. This makes it
difficult to figure out why a test failed or how to reproduce.
Further, it isn't always clear what tests a normal contributor should
run before posting patches.
It has been long communicated that the tests LTP, xfstest and/or
kselftests should be the tests to run. However, not all maintainers
use those tests for their subsystems. I am hoping to either capture
those tests or find ways to convince them to add their tests to the
preferred locations.
The goal is for a given subsystem (defined in MAINTAINERS), define a
set of tests that should be run for any contributions to that
subsystem. The hope is the collective CI results can be triaged
collectively (because they are related) and even have the numerous
flakes waived collectively (same reason) improving the ability to
find and debug new test failures. Because the tests and process are
known, having a human help debug any failures becomes easier.
The plan is to put together a minimal yaml template that gets us going
(even if it is not optimized yet) and aim for about a dozen or so
subsystems. At that point we should have enough feedback to promote
this more seriously and talk optimizations.
Feedback encouraged.
Cheers,
Don
---
# List of tests by subsystem
#
# Tests should adhere to KTAP definitions for results
#
# Description of section entries
#
# maintainer: test maintainer - name <email>
# list: mailing list for discussion
# version: stable version of the test
# dependency: necessary distro package for testing
# test:
# path: internal git path or url to fetch from
# cmd: command to run; ability to run locally
# param: additional param necessary to run test
# hardware: hardware necessary for validation
#
# Subsystems (alphabetical)
KUNIT TEST:
maintainer:
- name: name1
email: email1
- name: name2
email: email2
list:
version:
dependency:
- dep1
- dep2
test:
- path: tools/testing/kunit
cmd:
param:
- path:
cmd:
param:
hardware: none
Hi Linus,
Please pull the following kselftest fixes update for Linux 6.13-rc1.
kselftest update for Linux 6.13-rc1
-- timers test - removes duplicates defines
-- timers test - fixes to improve error reporting
-- rtc test - adds check rtc alarm status to alarm test
-- resctrl test - adds array overrun checks during iMC config parsing code
-- resctrl test - adds array overflow checks when reading strings
-- resctrl test - fixes and reorganizing code
Looks like I forgot to mention signal test changes in my tag
message. :(
diff is attached.
Tests passed on my kselftest next branch:
- Individual test runs of signal, timers, rtc, and resctrl
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit 8e929cb546ee42c9a61d24fae60605e9e3192354:
Linux 6.12-rc3 (2024-10-13 14:33:32 -0700)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest tags/linux_kselftest-next-6.13-rc1
for you to fetch changes up to a44c26d7fa74a5f4d2795a5c55a2d6ec1ebf1e38:
selftests/resctrl: Replace magic constants used as array size (2024-11-04 17:02:03 -0700)
----------------------------------------------------------------
linux_kselftest-next-6.13-rc1
kselftest update for Linux 6.13-rc1
-- timers test - removes duplicates defines
-- timers test - fixes to improve error reporting
-- rtc test - adds check rtc alarm status to alarm test
-- resctrl test - adds array overrun checks during iMC config parsing code
-- resctrl test - adds array overflow checks when reading strings
-- resctrl test - fixes and reorganizing code
----------------------------------------------------------------
Chen Ni (1):
selftests: timers: Remove unneeded semicolon
Dev Jain (2):
selftests: Rename sigaltstack to generic signal
selftests: Add a test mangling with uc_sigmask
Gianfranco Trad (1):
selftests: timers: improve timer_create failure message
Joseph Jang (1):
selftest: rtc: Add to check rtc alarm status for alarm related test
Nícolas F. R. A. Prado (1):
docs: dev-tools: Add documentation for the device focused kselftests
Reinette Chatre (15):
selftests/resctrl: Make functions only used in same file static
selftests/resctrl: Print accurate buffer size as part of MBM results
selftests/resctrl: Fix memory overflow due to unhandled wraparound
selftests/resctrl: Protect against array overrun during iMC config parsing
selftests/resctrl: Protect against array overflow when reading strings
selftests/resctrl: Make wraparound handling obvious
selftests/resctrl: Remove "once" parameter required to be false
selftests/resctrl: Only support measured read operation
selftests/resctrl: Remove unused measurement code
selftests/resctrl: Make benchmark parameter passing robust
selftests/resctrl: Ensure measurements skip initialization of default benchmark
selftests/resctrl: Use cache size to determine "fill_buf" buffer size
selftests/resctrl: Do not compare performance counters and resctrl at low bandwidth
selftests/resctrl: Keep results from first test run
selftests/resctrl: Replace magic constants used as array size
Shuah Khan (2):
selftests: timers: Remove local NSEC_PER_SEC and USEC_PER_SEC defines
selftests:timers: remove local CLOCKID defines
Documentation/dev-tools/kselftest.rst | 9 +
Documentation/dev-tools/testing-devices.rst | 47 +++
tools/testing/selftests/Makefile | 2 +-
tools/testing/selftests/resctrl/cmt_test.c | 37 +-
tools/testing/selftests/resctrl/fill_buf.c | 45 +--
tools/testing/selftests/resctrl/mba_test.c | 54 ++-
tools/testing/selftests/resctrl/mbm_test.c | 37 +-
tools/testing/selftests/resctrl/resctrl.h | 79 +++-
tools/testing/selftests/resctrl/resctrl_tests.c | 95 ++++-
tools/testing/selftests/resctrl/resctrl_val.c | 447 ++++++---------------
tools/testing/selftests/resctrl/resctrlfs.c | 19 +-
tools/testing/selftests/rtc/Makefile | 2 +-
tools/testing/selftests/rtc/rtctest.c | 64 +++
.../selftests/{sigaltstack => signal}/.gitignore | 1 +
.../selftests/{sigaltstack => signal}/Makefile | 3 +-
.../current_stack_pointer.h | 0
tools/testing/selftests/signal/mangle_uc_sigmask.c | 184 +++++++++
.../selftests/{sigaltstack => signal}/sas.c | 0
tools/testing/selftests/timers/Makefile | 2 +-
tools/testing/selftests/timers/adjtick.c | 6 +-
.../testing/selftests/timers/alarmtimer-suspend.c | 22 +-
.../testing/selftests/timers/inconsistency-check.c | 21 +-
tools/testing/selftests/timers/leap-a-day.c | 2 +-
tools/testing/selftests/timers/mqueue-lat.c | 2 +-
tools/testing/selftests/timers/nanosleep.c | 21 +-
tools/testing/selftests/timers/nsleep-lat.c | 22 +-
tools/testing/selftests/timers/posix_timers.c | 15 +-
tools/testing/selftests/timers/raw_skew.c | 4 +-
tools/testing/selftests/timers/set-2038.c | 3 +-
tools/testing/selftests/timers/set-timer-lat.c | 21 +-
tools/testing/selftests/timers/valid-adjtimex.c | 4 +-
31 files changed, 701 insertions(+), 569 deletions(-)
create mode 100644 Documentation/dev-tools/testing-devices.rst
rename tools/testing/selftests/{sigaltstack => signal}/.gitignore (70%)
rename tools/testing/selftests/{sigaltstack => signal}/Makefile (56%)
rename tools/testing/selftests/{sigaltstack => signal}/current_stack_pointer.h (100%)
create mode 100644 tools/testing/selftests/signal/mangle_uc_sigmask.c
rename tools/testing/selftests/{sigaltstack => signal}/sas.c (100%)
----------------------------------------------------------------
Page Detective is a new kernel debugging tool that provides detailed
information about the usage and mapping of physical memory pages.
It is often known that a particular page is corrupted, but it is hard to
extract more information about such a page from live system. Examples
are:
- Checksum failure during live migration
- Filesystem journal failure
- dump_page warnings on the console log
- Unexcpected segfaults
Page Detective helps to extract more information from the kernel, so it
can be used by developers to root cause the associated problem.
It operates through the Linux debugfs interface, with two files: "virt"
and "phys".
The "virt" file takes a virtual address and PID and outputs information
about the corresponding page.
The "phys" file takes a physical address and outputs information about
that page.
The output is presented via kernel log messages (can be accessed with
dmesg), and includes information such as the page's reference count,
mapping, flags, and memory cgroup. It also shows whether the page is
mapped in the kernel page table, and if so, how many times.
Pasha Tatashin (6):
mm: Make get_vma_name() function public
pagewalk: Add a page table walker for init_mm page table
mm: Add a dump_page variant that accept log level argument
misc/page_detective: Introduce Page Detective
misc/page_detective: enable loadable module
selftests/page_detective: Introduce self tests for Page Detective
Documentation/misc-devices/index.rst | 1 +
Documentation/misc-devices/page_detective.rst | 78 ++
MAINTAINERS | 8 +
drivers/misc/Kconfig | 11 +
drivers/misc/Makefile | 1 +
drivers/misc/page_detective.c | 808 ++++++++++++++++++
fs/inode.c | 18 +-
fs/kernfs/dir.c | 1 +
fs/proc/task_mmu.c | 61 --
include/linux/fs.h | 5 +-
include/linux/mmdebug.h | 1 +
include/linux/pagewalk.h | 2 +
kernel/pid.c | 1 +
mm/debug.c | 53 +-
mm/memcontrol.c | 1 +
mm/oom_kill.c | 1 +
mm/pagewalk.c | 32 +
mm/vma.c | 60 ++
tools/testing/selftests/Makefile | 1 +
.../selftests/page_detective/.gitignore | 1 +
.../testing/selftests/page_detective/Makefile | 7 +
tools/testing/selftests/page_detective/config | 4 +
.../page_detective/page_detective_test.c | 727 ++++++++++++++++
23 files changed, 1787 insertions(+), 96 deletions(-)
create mode 100644 Documentation/misc-devices/page_detective.rst
create mode 100644 drivers/misc/page_detective.c
create mode 100644 tools/testing/selftests/page_detective/.gitignore
create mode 100644 tools/testing/selftests/page_detective/Makefile
create mode 100644 tools/testing/selftests/page_detective/config
create mode 100644 tools/testing/selftests/page_detective/page_detective_test.c
--
2.47.0.338.g60cca15819-goog
Hi all,
Another rather vague question from me - does anyone know any reasons
why we couldn't support KUNIT_EXPECT in atomic contexts?
For Address Space Isolation we have some tests that want to disable
preemption. I guess this is not a totally insane thing to want to do -
am I missing anything there?
You can see some examples here:
https://github.com/googleprodkernel/linux-kvm/blob/asi-lpc-24/arch/x86/mm/a…
(Actually, looks like in that code we just started doing KUNIT_EXPECT
in preempt-disabled regions anyway, and I never suffered any
consequences until recently. Probably we just never had those
assertions fail under DEBUG_ATOMIC_SLEEP).
From a very quick look, it seems like the only reason you can't
KUNIT_EXPECT from atomic right now is the GFP_KERNEL allocations.
So... what if we just made them GFP_ATOMIC? In general I think that's
a pretty ropey approach, but maybe fine in the context of unit tests?
Or, we could have variants of the assertions, or a test attribute, to
just use GFP_ATOMIC in the cases where it's needed.
Is there anything else missing that would need to be done?
Alternatively, we could expose the whole context concern to the user
in such a way that KUNIT_ASSERT can be used too, something like:
struct kunit_defer defer = KUNIT_INIT_DEFER(test);
preempt_disable();
KUNIT_EXPECT_DEFERRED_TRUE(&defer, ...);
KUNIT_ASSERT_DEFERRED_EQ(&defer, ...);
preempt_enable();
// Prints failures from above, aborts if the ASSERT failed:
kunit_defer_finish(&defer);
I hope that wouldn't be necessary though, it seems like a lot of API surface.
Currently the mount_setattr_test fails on machines with a 64K PAGE_SIZE,
with errors such as:
# RUN mount_setattr_idmapped.invalid_fd_negative ...
mkfs.ext4: No space left on device while writing out and closing file system
# mount_setattr_test.c:1055:invalid_fd_negative:Expected system("mkfs.ext4 -q /mnt/C/ext4.img") (256) == 0 (0)
# invalid_fd_negative: Test terminated by assertion
# FAIL mount_setattr_idmapped.invalid_fd_negative
not ok 12 mount_setattr_idmapped.invalid_fd_negative
The code creates a 100,000 byte tmpfs:
ASSERT_EQ(mount("testing", "/mnt", "tmpfs", MS_NOATIME | MS_NODEV,
"size=100000,mode=700"), 0);
And then a little later creates a 2MB ext4 filesystem in that tmpfs:
ASSERT_EQ(ftruncate(img_fd, 1024 * 2048), 0);
ASSERT_EQ(system("mkfs.ext4 -q /mnt/C/ext4.img"), 0);
At first glance it seems like that should never work, after all 2MB is
larger than 100,000 bytes. However the filesystem image doesn't actually
occupy 2MB on "disk" (actually RAM, due to tmpfs). On 4K kernels the
ext4.img uses ~84KB of actual space (according to du), which just fits.
However on 64K PAGE_SIZE kernels the ext4.img takes at least 256KB,
which is too large to fit in the tmpfs, hence the errors.
It seems fraught to rely on the ext4.img taking less space on disk than
the allocated size, so instead create the tmpfs with a size of 2MB. With
that all 21 tests pass on 64K PAGE_SIZE kernels.
Fixes: 01eadc8dd96d ("tests: add mount_setattr() selftests")
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
---
tools/testing/selftests/mount_setattr/mount_setattr_test.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/mount_setattr/mount_setattr_test.c b/tools/testing/selftests/mount_setattr/mount_setattr_test.c
index 68801e1a9ec2..70f65eb320a7 100644
--- a/tools/testing/selftests/mount_setattr/mount_setattr_test.c
+++ b/tools/testing/selftests/mount_setattr/mount_setattr_test.c
@@ -1026,7 +1026,7 @@ FIXTURE_SETUP(mount_setattr_idmapped)
"size=100000,mode=700"), 0);
ASSERT_EQ(mount("testing", "/mnt", "tmpfs", MS_NOATIME | MS_NODEV,
- "size=100000,mode=700"), 0);
+ "size=2m,mode=700"), 0);
ASSERT_EQ(mkdir("/mnt/A", 0777), 0);
--
2.47.0
Hi Linus,
Please pull the following kunit update for Linux 6.13-rc1.
kunit update for Linux 6.13-rc1
-- fixes user-after-free (UAF) bug in kunit_init_suite()
-- adds option to kunit tool to print just the summary of test results
-- adds option to kunit tool to print just the failed test results
-- fixes kunit_zalloc_skb() to use user passed in gfp value instead of
hardcoding GFP_KERNEL
-- fixes kunit_zalloc_skb() kernel doc to include allocation flags variable
diff is attached.
Tests passed on my kunit repo:
- Build make allmodconfig
./tools/testing/kunit/kunit.py run
./tools/testing/kunit/kunit.py run --alltests
./tools/testing/kunit/kunit.py run --arch x86_64
./tools/testing/kunit/kunit.py run --alltests --arch x86_64
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit 2d5404caa8c7bb5c4e0435f94b28834ae5456623:
Linux 6.12-rc7 (2024-11-10 14:19:35 -0800)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest tags/linux_kselftest-kunit-6.13-rc1
for you to fetch changes up to 67b6d342fb6d5abfbeb71e0f23141b9b96cf7bb1:
kunit: tool: print failed tests only (2024-11-14 09:38:19 -0700)
----------------------------------------------------------------
linux_kselftest-kunit-6.13-rc1
kunit update for Linux 6.13-rc1
-- fixes user-after-free (UAF) bug in kunit_init_suite()
-- adds option to kunit tool to print just the summary of test results
-- adds option to kunit tool to print just the failed test results
-- fixes kunit_zalloc_skb() to use user passed in gfp value instead of
hardcoding GFP_KERNEL
-- fixes kunit_zalloc_skb() kernel doc to include allocation flags variable
----------------------------------------------------------------
Dan Carpenter (2):
kunit: skb: use "gfp" variable instead of hardcoding GFP_KERNEL
kunit: skb: add gfp to kernel doc for kunit_zalloc_skb()
David Gow (1):
kunit: tool: Only print the summary
Jinjie Ruan (1):
kunit: string-stream: Fix a UAF bug in kunit_init_suite()
Rae Moar (1):
kunit: tool: print failed tests only
include/kunit/skbuff.h | 5 +-
lib/kunit/string-stream.c | 1 +
tools/testing/kunit/kunit.py | 28 ++++++-
tools/testing/kunit/kunit_parser.py | 134 +++++++++++++++++++++------------
tools/testing/kunit/kunit_printer.py | 14 +++-
tools/testing/kunit/kunit_tool_test.py | 55 +++++++-------
6 files changed, 151 insertions(+), 86 deletions(-)
----------------------------------------------------------------
page_frag test module is an out of tree module, but built
using KDIR as the main kernel tree, the mm test suite is
just getting skipped if newly added page_frag test module
fails to compile due to kernel not yet compiled.
Fix the above problem by ensuring both kernel is built first
and a newer kernel which has page_frag_cache.h is used.
CC: Andrew Morton <akpm(a)linux-foundation.org>
CC: Alexander Duyck <alexanderduyck(a)fb.com>
CC: Linux-MM <linux-mm(a)kvack.org>
Fixes: 7fef0dec415c ("mm: page_frag: add a test module for page_frag")
Fixes: 65941f10caf2 ("mm: move the page fragment allocator from page_alloc into its own file")
Reported-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Yunsheng Lin <linyunsheng(a)huawei.com>
Tested-by: Mark Brown <broonie(a)kernel.org>
---
V2: Repost by adding net-next ML.
Note, page_frag test module is only in the net-next tree for now,
so target the net-next tree.
---
tools/testing/selftests/mm/Makefile | 18 ++++++++++++++++++
tools/testing/selftests/mm/page_frag/Makefile | 2 +-
2 files changed, 19 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
index acec529baaca..04e04733fc8a 100644
--- a/tools/testing/selftests/mm/Makefile
+++ b/tools/testing/selftests/mm/Makefile
@@ -36,7 +36,16 @@ MAKEFLAGS += --no-builtin-rules
CFLAGS = -Wall -I $(top_srcdir) $(EXTRA_CFLAGS) $(KHDR_INCLUDES) $(TOOLS_INCLUDES)
LDLIBS = -lrt -lpthread -lm
+KDIR ?= /lib/modules/$(shell uname -r)/build
+ifneq (,$(wildcard $(KDIR)/Module.symvers))
+ifneq (,$(wildcard $(KDIR)/include/linux/page_frag_cache.h))
TEST_GEN_MODS_DIR := page_frag
+else
+PAGE_FRAG_WARNING = "missing page_frag_cache.h, please use a newer kernel"
+endif
+else
+PAGE_FRAG_WARNING = "missing Module.symvers, please have the kernel built first"
+endif
TEST_GEN_FILES = cow
TEST_GEN_FILES += compaction_test
@@ -214,3 +223,12 @@ warn_missing_liburing:
echo "Warning: missing liburing support. Some tests will be skipped." ; \
echo
endif
+
+ifneq ($(PAGE_FRAG_WARNING),)
+all: warn_missing_page_frag
+
+warn_missing_page_frag:
+ @echo ; \
+ echo "Warning: $(PAGE_FRAG_WARNING). page_frag test will be skipped." ; \
+ echo
+endif
diff --git a/tools/testing/selftests/mm/page_frag/Makefile b/tools/testing/selftests/mm/page_frag/Makefile
index 58dda74d50a3..8c8bb39ffa28 100644
--- a/tools/testing/selftests/mm/page_frag/Makefile
+++ b/tools/testing/selftests/mm/page_frag/Makefile
@@ -1,5 +1,5 @@
PAGE_FRAG_TEST_DIR := $(realpath $(dir $(abspath $(lastword $(MAKEFILE_LIST)))))
-KDIR ?= $(abspath $(PAGE_FRAG_TEST_DIR)/../../../../..)
+KDIR ?= /lib/modules/$(shell uname -r)/build
ifeq ($(V),1)
Q =
--
2.33.0
The series of patches are for doing basic tests
of NIC driver. Test comprises checks for auto-negotiation,
speed, duplex state and throughput between local NIC and
partner. Tools such as ethtool, iperf3 are used.
Signed-off-by: Mohan Prasad J <mohan.prasad(a)microchip.com>
---
Changes in v4:
- Type hints are added for arguments and return types.
- Test values can be modified through command line.
- Minor code improvements.
- GenerateTraffic class extended for throughput test.
- Protocol testcases separated.
Changes in v3:
Link to v3: https://patchwork.kernel.org/project/netdevbpf/cover/20241016215014.401476-…
- LinkConfig class is included in the hw library. This contains
generic APIs for doing link layer operations.
- Auto-negotiation checks involve changing the auto-neg state
both in local and partner NIC.
- Link layer test and performance test are separated to
different selftest files.
- Resetting of NIC driver done after test completion.
Changes in v2:
Link to v2: https://patchwork.kernel.org/project/netdevbpf/cover/20240917023525.2571082…
- Changed the hardcoded implementation of speed, duplex states,
throughput to generic values, in order to support all type
of NIC drivers.
- Test executes based on the supported link modes between local
NIC driver and partner.
- Instead of lan743x directory, selftest file is now relocated
to /selftests/drivers/net/hw.
Link to v1: https://patchwork.kernel.org/project/netdevbpf/cover/20240903221549.1215842…
---
Mohan Prasad J (3):
selftests: nic_link_layer: Add link layer selftest for NIC driver
selftests: nic_link_layer: Add selftest case for speed and duplex
states
selftests: nic_performance: Add selftest for performance of NIC driver
.../testing/selftests/drivers/net/hw/Makefile | 6 +-
.../drivers/net/hw/lib/py/__init__.py | 1 +
.../drivers/net/hw/lib/py/linkconfig.py | 222 ++++++++++++++++++
.../drivers/net/hw/nic_link_layer.py | 113 +++++++++
.../drivers/net/hw/nic_performance.py | 137 +++++++++++
.../selftests/drivers/net/lib/py/load.py | 20 +-
6 files changed, 496 insertions(+), 3 deletions(-)
create mode 100644 tools/testing/selftests/drivers/net/hw/lib/py/linkconfig.py
create mode 100644 tools/testing/selftests/drivers/net/hw/nic_link_layer.py
create mode 100644 tools/testing/selftests/drivers/net/hw/nic_performance.py
--
2.43.0
Hello,
this new series aims to migrate test_flow_dissector.sh into test_progs.
There are 2 "main" parts in test_flow_dissector.sh:
- a set of tests checking flow_dissector programs attachment to either
root namespace or non-root namespace
- dissection test
The first set is integrated in flow_dissector.c, which already contains
some existing tests for flow_dissector programs. This series uses the
opportunity to update a bit this file (use new assert, re-split tests,
etc)
The second part is migrated into a new file under test_progs,
flow_dissector_classification.c. It uses the same eBPF programs as
flow_dissector.c, but the difference is rather about how those program
are executed:
- flow_dissector.c manually runs programs with BPF_PROG_RUN
- flow_dissector_classification.c sends real packets to be dissected, and
so it also executes kernel code related to eBPF flow dissector (eg:
__skb_flow_bpf_to_target)
---
Changes in v2:
- allow tests to run in parallel
- move some generic helpers to network_helpers.h
- define proper function for ASSERT_MEMEQ
- fetch acked-by tags
- Link to v1: https://lore.kernel.org/r/20241113-flow_dissector-v1-0-27c4df0592dc@bootlin…
---
Alexis Lothoré (eBPF Foundation) (13):
selftests/bpf: add a macro to compare raw memory
selftests/bpf: use ASSERT_MEMEQ to compare bpf flow keys
selftests/bpf: replace CHECK calls with ASSERT macros in flow_dissector test
selftests/bpf: re-split main function into dedicated tests
selftests/bpf: expose all subtests from flow_dissector
selftests/bpf: add gre packets testing to flow_dissector
selftests/bpf: migrate flow_dissector namespace exclusivity test
selftests/bpf: Enable generic tc actions in selftests config
selftests/bpf: move ip checksum helper to network helpers
selftests/bpf: rename pseudo headers checksum computation
selftests/bpf: add network helpers to generate udp checksums
selftests/bpf: migrate bpf flow dissectors tests to test_progs
selftests/bpf: remove test_flow_dissector.sh
tools/testing/selftests/bpf/.gitignore | 1 -
tools/testing/selftests/bpf/Makefile | 3 +-
tools/testing/selftests/bpf/config | 1 +
tools/testing/selftests/bpf/network_helpers.h | 64 +-
.../selftests/bpf/prog_tests/flow_dissector.c | 323 +++++++--
.../bpf/prog_tests/flow_dissector_classification.c | 807 +++++++++++++++++++++
.../selftests/bpf/prog_tests/xdp_metadata.c | 24 +-
tools/testing/selftests/bpf/test_flow_dissector.c | 780 --------------------
tools/testing/selftests/bpf/test_flow_dissector.sh | 178 -----
tools/testing/selftests/bpf/test_progs.c | 15 +
tools/testing/selftests/bpf/test_progs.h | 15 +
tools/testing/selftests/bpf/xdp_hw_metadata.c | 12 +-
12 files changed, 1153 insertions(+), 1070 deletions(-)
---
base-commit: 9e71d50d3befb93a6394b0979f8ebd0dc9bd8d0f
change-id: 20241019-flow_dissector-3eb0c07fc163
Best regards,
--
Alexis Lothoré, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
Currently the temporary address is not removed when mngtmpaddr is deleted
or becomes unmanaged. The patch set fixed this issue and add a related
test.
Hangbin Liu (2):
net/ipv6: delete temporary address if mngtmpaddr is removed or
un-mngtmpaddr
selftests/rtnetlink.sh: add mngtempaddr test
net/ipv6/addrconf.c | 5 ++
tools/testing/selftests/net/rtnetlink.sh | 89 ++++++++++++++++++++++++
2 files changed, 94 insertions(+)
--
2.46.0
1. fix recursive lock when ebpf prog return SK_PASS.
2. add selftest to reproduce recursive lock.
Note that the test code can reproduce the 'dead-lock' and if just
the selftest merged without first patch, the test case will
definitely fail, because the issue of deadlock is inevitable.
---
v2->v4: fix line length reported by patchwork and remove unused code.
(max_line_length is set to 80 in patchwork but default is 100 in kernel tree)
v1->v2: 1.inspired by martin.lau to add selftest to reproduce the issue.
2. follow the community rules for patch.
v1: https://lore.kernel.org/bpf/55fc6114-7e64-4b65-86d2-92cfd1e9e92f@linux.dev/…
---
Jiayuan Chen (2):
bpf: fix recursive lock when verdict program return SK_PASS
selftests/bpf: Add some tests with sockmap SK_PASS
net/core/skmsg.c | 4 +-
.../selftests/bpf/prog_tests/sockmap_basic.c | 54 +++++++++++++++++++
2 files changed, 56 insertions(+), 2 deletions(-)
--
2.43.5
sizeof when applied to a pointer typed expression gives the size of
the pointer.
tools/testing/selftests/bpf/progs/test_tunnel_kern.c:678:41-47: ERROR: application of sizeof to pointer
The proper fix in this particular case is to code sizeof(*gopt)
instead of sizeof(gopt).
This issue was detected with the help of Coccinelle.
Fixes: 5ddafcc377f9 ("selftests/bpf: Fix a few tests for GCC related warnings.")
Signed-off-by: guanjing <guanjing(a)cmss.chinamobile.com>
---
tools/testing/selftests/bpf/progs/test_tunnel_kern.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c
index 32127f1cd687..3a437cdc5c15 100644
--- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c
+++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c
@@ -675,7 +675,7 @@ int ip6geneve_set_tunnel(struct __sk_buff *skb)
gopt->length = 2; /* 4-byte multiple */
*(int *) &gopt->opt_data = bpf_htonl(0xfeedbeef);
- ret = bpf_skb_set_tunnel_opt(skb, gopt, sizeof(gopt));
+ ret = bpf_skb_set_tunnel_opt(skb, gopt, sizeof(*gopt));
if (ret < 0) {
log_err(ret);
return TC_ACT_SHOT;
--
2.33.0
page_frag test module is an out of tree module, but built
using KDIR as the main kernel tree, the mm test suite is
just getting skipped if newly added page_frag test module
fails to compile due to kernel not yet compiled.
Fix the above problem by ensuring both kernel is built first
and a newer kernel which has page_frag_cache.h is used.
CC: Andrew Morton <akpm(a)linux-foundation.org>
CC: Alexander Duyck <alexanderduyck(a)fb.com>
CC: Linux-MM <linux-mm(a)kvack.org>
Fixes: 7fef0dec415c ("mm: page_frag: add a test module for page_frag")
Fixes: 65941f10caf2 ("mm: move the page fragment allocator from page_alloc into its own file")
Reported-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Yunsheng Lin <yunshenglin0825(a)gmail.com>
---
Mote, page_frag test module is only in the net-next tree for now,
so target the net-next tree.
---
tools/testing/selftests/mm/Makefile | 18 ++++++++++++++++++
tools/testing/selftests/mm/page_frag/Makefile | 2 +-
2 files changed, 19 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
index acec529baaca..04e04733fc8a 100644
--- a/tools/testing/selftests/mm/Makefile
+++ b/tools/testing/selftests/mm/Makefile
@@ -36,7 +36,16 @@ MAKEFLAGS += --no-builtin-rules
CFLAGS = -Wall -I $(top_srcdir) $(EXTRA_CFLAGS) $(KHDR_INCLUDES) $(TOOLS_INCLUDES)
LDLIBS = -lrt -lpthread -lm
+KDIR ?= /lib/modules/$(shell uname -r)/build
+ifneq (,$(wildcard $(KDIR)/Module.symvers))
+ifneq (,$(wildcard $(KDIR)/include/linux/page_frag_cache.h))
TEST_GEN_MODS_DIR := page_frag
+else
+PAGE_FRAG_WARNING = "missing page_frag_cache.h, please use a newer kernel"
+endif
+else
+PAGE_FRAG_WARNING = "missing Module.symvers, please have the kernel built first"
+endif
TEST_GEN_FILES = cow
TEST_GEN_FILES += compaction_test
@@ -214,3 +223,12 @@ warn_missing_liburing:
echo "Warning: missing liburing support. Some tests will be skipped." ; \
echo
endif
+
+ifneq ($(PAGE_FRAG_WARNING),)
+all: warn_missing_page_frag
+
+warn_missing_page_frag:
+ @echo ; \
+ echo "Warning: $(PAGE_FRAG_WARNING). page_frag test will be skipped." ; \
+ echo
+endif
diff --git a/tools/testing/selftests/mm/page_frag/Makefile b/tools/testing/selftests/mm/page_frag/Makefile
index 58dda74d50a3..8c8bb39ffa28 100644
--- a/tools/testing/selftests/mm/page_frag/Makefile
+++ b/tools/testing/selftests/mm/page_frag/Makefile
@@ -1,5 +1,5 @@
PAGE_FRAG_TEST_DIR := $(realpath $(dir $(abspath $(lastword $(MAKEFILE_LIST)))))
-KDIR ?= $(abspath $(PAGE_FRAG_TEST_DIR)/../../../../..)
+KDIR ?= /lib/modules/$(shell uname -r)/build
ifeq ($(V),1)
Q =
--
2.34.1
Add ip_link_set_addr(), ip_link_set_up(), ip_addr_add() and ip_route_add()
to the suite of helpers that automatically schedule a corresponding
cleanup.
When setting a new MAC, one needs to remember the old address first. Move
mac_get() from forwarding/ to that end.
Signed-off-by: Petr Machata <petrm(a)nvidia.com>
Reviewed-by: Ido Schimmel <idosch(a)nvidia.com>
---
Notes:
CC: Shuah Khan <shuah(a)kernel.org>
CC: Benjamin Poirier <bpoirier(a)nvidia.com>
CC: Hangbin Liu <liuhangbin(a)gmail.com>
CC: Vladimir Oltean <vladimir.oltean(a)nxp.com>
CC: linux-kselftest(a)vger.kernel.org
tools/testing/selftests/net/forwarding/lib.sh | 7 ----
tools/testing/selftests/net/lib.sh | 39 +++++++++++++++++++
2 files changed, 39 insertions(+), 7 deletions(-)
diff --git a/tools/testing/selftests/net/forwarding/lib.sh b/tools/testing/selftests/net/forwarding/lib.sh
index 7337f398f9cc..1fd40bada694 100644
--- a/tools/testing/selftests/net/forwarding/lib.sh
+++ b/tools/testing/selftests/net/forwarding/lib.sh
@@ -932,13 +932,6 @@ packets_rate()
echo $(((t1 - t0) / interval))
}
-mac_get()
-{
- local if_name=$1
-
- ip -j link show dev $if_name | jq -r '.[]["address"]'
-}
-
ether_addr_to_u64()
{
local addr="$1"
diff --git a/tools/testing/selftests/net/lib.sh b/tools/testing/selftests/net/lib.sh
index 5ea6537acd2b..2cd5c743b2d9 100644
--- a/tools/testing/selftests/net/lib.sh
+++ b/tools/testing/selftests/net/lib.sh
@@ -435,6 +435,13 @@ xfail_on_veth()
fi
}
+mac_get()
+{
+ local if_name=$1
+
+ ip -j link show dev $if_name | jq -r '.[]["address"]'
+}
+
kill_process()
{
local pid=$1; shift
@@ -459,3 +466,35 @@ ip_link_set_master()
ip link set dev "$member" master "$master"
defer ip link set dev "$member" nomaster
}
+
+ip_link_set_addr()
+{
+ local name=$1; shift
+ local addr=$1; shift
+
+ local old_addr=$(mac_get "$name")
+ ip link set dev "$name" address "$addr"
+ defer ip link set dev "$name" address "$old_addr"
+}
+
+ip_link_set_up()
+{
+ local name=$1; shift
+
+ ip link set dev "$name" up
+ defer ip link set dev "$name" down
+}
+
+ip_addr_add()
+{
+ local name=$1; shift
+
+ ip addr add dev "$name" "$@"
+ defer ip addr del dev "$name" "$@"
+}
+
+ip_route_add()
+{
+ ip route add "$@"
+ defer ip route del "$@"
+}
--
2.47.0
After reviewing the code, it was found that the macro GUEST_CODE_PIO_PORT
is never referenced in the code. Just remove it.
Signed-off-by: Ba Jing <bajing(a)cmss.chinamobile.com>
---
tools/testing/selftests/kvm/hardware_disable_test.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/tools/testing/selftests/kvm/hardware_disable_test.c b/tools/testing/selftests/kvm/hardware_disable_test.c
index bce73bcb973c..94bd6ed24cf3 100644
--- a/tools/testing/selftests/kvm/hardware_disable_test.c
+++ b/tools/testing/selftests/kvm/hardware_disable_test.c
@@ -20,7 +20,6 @@
#define SLEEPING_THREAD_NUM (1 << 4)
#define FORK_NUM (1ULL << 9)
#define DELAY_US_MAX 2000
-#define GUEST_CODE_PIO_PORT 4
sem_t *sem;
--
2.33.0
This series is a follow-up to Joey's Permission Overlay Extension (POE)
series [1] that recently landed on mainline. The goal is to improve the
way we handle the register that governs which pkeys/POIndex are
accessible (POR_EL0) during signal delivery. As things stand, we may
unexpectedly fail to write the signal frame on the stack because POR_EL0
is not reset before the uaccess operations. See patch 3 for more details
and the main changes this series brings.
A similar series landed recently for x86/MPK [2]; the present series
aims at aligning arm64 with x86. Worth noting: once the signal frame is
written, POR_EL0 is still set to POR_EL0_INIT, granting access to pkey 0
only. This means that a program that sets up an alternate signal stack
with a non-zero pkey will need some assembly trampoline to set POR_EL0
before invoking the real signal handler, as discussed here [3]. This is
not ideal, but it makes experimentation with pkeys in signal handlers
possible while waiting for a potential interface to control the pkey
state when delivering a signal. See Pierre's reply [4] for more
information about use-cases and a potential interface.
The x86 series also added kselftests to ensure that no spurious SIGSEGV
occurs during signal delivery regardless of which pkey is accessible at
the point where the signal is delivered. This series adapts those
kselftests to allow running them on arm64 (patch 4-5).
Finally patch 2 is a clean-up following feedback on Joey's series [5].
I have tested this series on arm64 and x86_64 (booting and running the
protection_keys and pkey_sighandler_tests mm kselftests).
v1..v2:
* In setup_rt_frame(), ensured that POR_EL0 is reset to its original
value if we fail to deliver the signal (addresses Catalin's concern [6]).
* Renamed *unpriv_access* to *user_access* in patch 3 (suggestion from
Dave).
* Made what patch 1-2 do explicit in the commit message body (suggestion
from Dave).
- Kevin
[1] https://lore.kernel.org/linux-arm-kernel/20240822151113.1479789-1-joey.goul…
[2] https://lore.kernel.org/lkml/20240802061318.2140081-1-aruna.ramakrishna@ora…
[3] https://lore.kernel.org/lkml/CABi2SkWxNkP2O7ipkP67WKz0-LV33e5brReevTTtba6oK…
[4] https://lore.kernel.org/linux-arm-kernel/87plns8owh.fsf@arm.com/
[5] https://lore.kernel.org/linux-arm-kernel/20241015114116.GA19334@willie-the-…
[6] https://lore.kernel.org/linux-arm-kernel/Zw6D2waVyIwYE7wd@arm.com/
Cc: akpm(a)linux-foundation.org
Cc: anshuman.khandual(a)arm.com
Cc: aruna.ramakrishna(a)oracle.com
Cc: broonie(a)kernel.org
Cc: catalin.marinas(a)arm.com
Cc: dave.hansen(a)linux.intel.com
Cc: dave.martin(a)arm.com
Cc: jeffxu(a)chromium.org
Cc: joey.gouly(a)arm.com
Cc: pierre.langlois(a)arm.com
Cc: shuah(a)kernel.org
Cc: sroettger(a)google.com
Cc: will(a)kernel.org
Cc: linux-kselftest(a)vger.kernel.org
Cc: x86(a)kernel.org
Kevin Brodsky (5):
arm64: signal: Remove unused macro
arm64: signal: Remove unnecessary check when saving POE state
arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
selftests/mm: Use generic pkey register manipulation
selftests/mm: Enable pkey_sighandler_tests on arm64
arch/arm64/kernel/signal.c | 95 +++++++++++++---
tools/testing/selftests/mm/Makefile | 8 +-
tools/testing/selftests/mm/pkey-arm64.h | 1 +
tools/testing/selftests/mm/pkey-x86.h | 2 +
.../selftests/mm/pkey_sighandler_tests.c | 101 +++++++++++++-----
5 files changed, 162 insertions(+), 45 deletions(-)
--
2.43.0
Here is a series from Geliang, adding mptcp_subflow bpf_iter support.
We are working on extending MPTCP with BPF, e.g. to control the path
manager -- in charge of the creation, deletion, and announcements of
subflows (paths) -- and the packet scheduler -- in charge of selecting
which available path the next data will be sent to. These extensions
need to iterate over the list of subflows attached to an MPTCP
connection, and do some specific actions via some new kfunc that will be
added later on.
This preparation work is split in different patches:
- Patch 1: register some "basic" MPTCP kfunc.
- Patch 2: add mptcp_subflow bpf_iter support. Note that previous
versions of this single patch have already been shared to the
BPF mailing list. The changelog has been kept with a comment,
but the version number has been reset to avoid confusions.
- Patch 3: add kfunc to make sure the msk is valid
- Patch 4: add more MPTCP endpoints in the selftests, in order to create
more than 2 subflows.
- Patch 5: add a very simple test validating mptcp_subflow bpf_iter
support. This test could be written without the new bpf_iter,
but it is there only to make sure this specific feature works
as expected.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Geliang Tang (5):
bpf: Register mptcp common kfunc set
bpf: Add mptcp_subflow bpf_iter
bpf: Acquire and release mptcp socket
selftests/bpf: More endpoints for endpoint_init
selftests/bpf: Add mptcp_subflow bpf_iter subtest
net/mptcp/bpf.c | 104 ++++++++++++++++-
tools/testing/selftests/bpf/bpf_experimental.h | 8 ++
tools/testing/selftests/bpf/prog_tests/mptcp.c | 129 ++++++++++++++++++++-
tools/testing/selftests/bpf/progs/mptcp_bpf.h | 10 ++
.../testing/selftests/bpf/progs/mptcp_bpf_iters.c | 64 ++++++++++
5 files changed, 308 insertions(+), 7 deletions(-)
---
base-commit: 141b4d6a8049cecdc8124f87e044b83a9e80730d
change-id: 20241108-bpf-next-net-mptcp-bpf_iter-subflows-027f6d87770e
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
This series introduces a new vIOMMU infrastructure and related ioctls.
IOMMUFD has been using the HWPT infrastructure for all cases, including a
nested IO page table support. Yet, there're limitations for an HWPT-based
structure to support some advanced HW-accelerated features, such as CMDQV
on NVIDIA Grace, and HW-accelerated vIOMMU on AMD. Even for a multi-IOMMU
environment, it is not straightforward for nested HWPTs to share the same
parent HWPT (stage-2 IO pagetable), with the HWPT infrastructure alone: a
parent HWPT typically hold one stage-2 IO pagetable and tag it with only
one ID in the cache entries. When sharing one large stage-2 IO pagetable
across physical IOMMU instances, that one ID may not always be available
across all the IOMMU instances. In other word, it's ideal for SW to have
a different container for the stage-2 IO pagetable so it can hold another
ID that's available. And this container will be able to hold some advanced
feature too.
For this "different container", add vIOMMU, an additional layer to hold
extra virtualization information:
_______________________________________________________________________
| iommufd (with vIOMMU) |
| _____________ |
| | | |
| |----------------| vIOMMU | |
| | ______ | | _____________ ________ |
| | | | | | | | | | |
| | | IOAS |<---|(HWPT_PAGING)|<---| HWPT_NESTED |<--| DEVICE | |
| | |______| |_____________| |_____________| |________| |
| | | | | | |
|______|________|______________|__________________|_______________|_____|
| | | | |
______v_____ | ______v_____ ______v_____ ___v__
| struct | | PFN | (paging) | | (nested) | |struct|
|iommu_device| |------>|iommu_domain|<----|iommu_domain|<----|device|
|____________| storage|____________| |____________| |______|
The vIOMMU object should be seen as a slice of a physical IOMMU instance
that is passed to or shared with a VM. That can be some HW/SW resources:
- Security namespace for guest owned ID, e.g. guest-controlled cache tags
- Non-device-affiliated event reporting, e.g. invalidation queue errors
- Access to a sharable nesting parent pagetable across physical IOMMUs
- Virtualization of various platforms IDs, e.g. RIDs and others
- Delivery of paravirtualized invalidation
- Direct assigned invalidation queues
- Direct assigned interrupts
On a multi-IOMMU system, the vIOMMU object must be instanced to the number
of the physical IOMMUs that have a slice passed to (via device) a guest VM,
while being able to hold the shareable parent HWPT. Each vIOMMU then just
needs to allocate its own individual ID to tag its own cache:
----------------------------
---------------- | | paging_hwpt0 |
| hwpt_nested0 |--->| viommu0 ------------------
---------------- | | IDx |
----------------------------
----------------------------
---------------- | | paging_hwpt0 |
| hwpt_nested1 |--->| viommu1 ------------------
---------------- | | IDy |
----------------------------
As an initial part-1, add IOMMUFD_CMD_VIOMMU_ALLOC ioctl for an allocation
only.
More vIOMMU-based structs and ioctls will be introduced in the follow-up
series to support vDEVICE, vIRQ (vEVENT) and vQUEUE objects. Although we
repurposed the vIOMMU object from an earlier RFC, just for a referece:
https://lore.kernel.org/all/cover.1712978212.git.nicolinc@nvidia.com/
This series is on Github:
https://github.com/nicolinc/iommufd/commits/iommufd_viommu_p1-v7
(QEMU branch for testing will be provided in Jason's nesting series)
Changelog
v7
* Added "Reviewed-by" from Jason
* Dropped "select IOMMUFD_DRIVER_CORE" in Kconfig
* Decoupled IOMMUFD_DRIVER from IOMMUFD_DRIVER_CORE
* Moved vIOMMU negative tests out of FIXTURE_SETUP to a TEST_F
* Dropped the "flags" check in iommufd_viommu_alloc_hwpt_nested
* Added the kdoc for "flags" in iommufd_viommu_alloc_hwpt_nested
v6
https://lore.kernel.org/all/cover.1730313237.git.nicolinc@nvidia.com/
* Improved comment lines
* Added a TEST_F for IO page fault
* Fixed indentations in iommufd.rst
* Revised kdoc of the viommu_alloc op
* Added "Reviewed-by" from Kevin and Jason
* Passed in "flags" to ->alloc_domain_nested
* Renamed "free" op to "destroy" in viommu_ops
* Skipped SMMUv3 driver changes (to post in a separate series)
* Fixed "flags" validation in iommufd_viommu_alloc_hwpt_nested
* Added CONFIG_IOMMUFD_DRIVER_CORE for sharing between iommufd
core and IOMMU dirvers
* Replaced iommufd_verify_unfinalized_object with xa_cmpxchg in
iommufd_object_finalize/abort functions
v5
https://lore.kernel.org/all/cover.1729897352.git.nicolinc@nvidia.com/
* Added "Reviewed-by" from Kevin
* Reworked iommufd_viommu_alloc helper
* Revised the uAPI kdoc for vIOMMU object
* Revised comments for pluggable iommu_dev
* Added a couple of cleanup patches for selftest
* Renamed domain_alloc_nested op to alloc_domain_nested
* Updated a few commit messages to reflect the latest series
* Renamed iommufd_hwpt_nested_alloc_for_viommu to
iommufd_viommu_alloc_hwpt_nested, and added flag validation
v4
https://lore.kernel.org/all/cover.1729553811.git.nicolinc@nvidia.com/
* Added "Reviewed-by" from Jason
* Dropped IOMMU_VIOMMU_TYPE_DEFAULT support
* Dropped iommufd_object_alloc_elm renamings
* Renamed iommufd's viommu_api.c to driver.c
* Reworked iommufd_viommu_alloc helper
* Added a separate iommufd_hwpt_nested_alloc_for_viommu function for
hwpt_nested allocations on a vIOMMU, and added comparison between
viommu->iommu_dev->ops and dev_iommu_ops(idev->dev)
* Replaced s2_parent with vsmmu in arm_smmu_nested_domain
* Replaced domain_alloc_user in iommu_ops with domain_alloc_nested in
viommu_ops
* Replaced wait_queue_head_t with a completion, to delay the unplug of
mock_iommu_dev
* Corrected documentation graph that was missing struct iommu_device
* Added an iommufd_verify_unfinalized_object helper to verify driver-
allocated vIOMMU/vDEVICE objects
* Added missing test cases for TEST_LENGTH and fail_nth
v3
https://lore.kernel.org/all/cover.1728491453.git.nicolinc@nvidia.com/
* Rebased on top of Jason's nesting v3 series
https://lore.kernel.org/all/0-v3-e2e16cd7467f+2a6a1-smmuv3_nesting_jgg@nvid…
* Split the series into smaller parts
* Added Jason's Reviewed-by
* Added back viommu->iommu_dev
* Added support for driver-allocated vIOMMU v.s. core-allocated
* Dropped arm_smmu_cache_invalidate_user
* Added an iommufd_test_wait_for_users() in selftest
* Reworked test code to make viommu an individual FIXTURE
* Added missing TEST_LENGTH case for the new ioctl command
v2
https://lore.kernel.org/all/cover.1724776335.git.nicolinc@nvidia.com/
* Limited vdev_id to one per idev
* Added a rw_sem to protect the vdev_id list
* Reworked driver-level APIs with proper lockings
* Added a new viommu_api file for IOMMUFD_DRIVER config
* Dropped useless iommu_dev point from the viommu structure
* Added missing index numnbers to new types in the uAPI header
* Dropped IOMMU_VIOMMU_INVALIDATE uAPI; Instead, reuse the HWPT one
* Reworked mock_viommu_cache_invalidate() using the new iommu helper
* Reordered details of set/unset_vdev_id handlers for proper lockings
v1
https://lore.kernel.org/all/cover.1723061377.git.nicolinc@nvidia.com/
Thanks!
Nicolin
Nicolin Chen (13):
iommufd: Move struct iommufd_object to public iommufd header
iommufd: Move _iommufd_object_alloc helper to a sharable file
iommufd: Introduce IOMMUFD_OBJ_VIOMMU and its related struct
iommufd: Verify object in iommufd_object_finalize/abort()
iommufd/viommu: Add IOMMU_VIOMMU_ALLOC ioctl
iommufd: Add alloc_domain_nested op to iommufd_viommu_ops
iommufd: Allow pt_id to carry viommu_id for IOMMU_HWPT_ALLOC
iommufd/selftest: Add container_of helpers
iommufd/selftest: Prepare for mock_viommu_alloc_domain_nested()
iommufd/selftest: Add refcount to mock_iommu_device
iommufd/selftest: Add IOMMU_VIOMMU_TYPE_SELFTEST
iommufd/selftest: Add IOMMU_VIOMMU_ALLOC test coverage
Documentation: userspace-api: iommufd: Update vIOMMU
drivers/iommu/iommufd/Kconfig | 4 +
drivers/iommu/iommufd/Makefile | 4 +-
drivers/iommu/iommufd/iommufd_private.h | 33 +--
drivers/iommu/iommufd/iommufd_test.h | 2 +
include/linux/iommu.h | 14 +
include/linux/iommufd.h | 86 ++++++
include/uapi/linux/iommufd.h | 54 +++-
tools/testing/selftests/iommu/iommufd_utils.h | 28 ++
drivers/iommu/iommufd/driver.c | 40 +++
drivers/iommu/iommufd/hw_pagetable.c | 73 ++++-
drivers/iommu/iommufd/main.c | 54 ++--
drivers/iommu/iommufd/selftest.c | 266 +++++++++++++-----
drivers/iommu/iommufd/viommu.c | 81 ++++++
tools/testing/selftests/iommu/iommufd.c | 137 +++++++++
.../selftests/iommu/iommufd_fail_nth.c | 11 +
Documentation/userspace-api/iommufd.rst | 69 ++++-
16 files changed, 804 insertions(+), 152 deletions(-)
create mode 100644 drivers/iommu/iommufd/driver.c
create mode 100644 drivers/iommu/iommufd/viommu.c
base-commit: 0bcceb1f51c77f6b98a7aab00847ed340bf36e35
--
2.43.0
Some distros may not load nf_conntrack by default, which will cause
subsequent nf_conntrack settings to fail. Let's load this module if it's
not loaded by default.
Fixes: e7096c131e51 ("net: WireGuard secure network tunnel")
Signed-off-by: Hangbin Liu <liuhangbin(a)gmail.com>
---
v2: load the mode directly in case nf_conntrack is build in (Simon Horman)
---
tools/testing/selftests/wireguard/netns.sh | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/wireguard/netns.sh b/tools/testing/selftests/wireguard/netns.sh
index 405ff262ca93..fa4dd7eb5918 100755
--- a/tools/testing/selftests/wireguard/netns.sh
+++ b/tools/testing/selftests/wireguard/netns.sh
@@ -66,6 +66,7 @@ cleanup() {
orig_message_cost="$(< /proc/sys/net/core/message_cost)"
trap cleanup EXIT
printf 0 > /proc/sys/net/core/message_cost
+modprobe nf_conntrack
ip netns del $netns0 2>/dev/null || true
ip netns del $netns1 2>/dev/null || true
--
2.39.5 (Apple Git-154)
The count_stcx_fail test runs for close to or just over 2 minutes, which
means it sometimes times out.
That's overkill for a test that just demonstrates some PMU counters
are working. Drop the 64 billion instruction case, to lower the runtime
to ~30s.
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
---
tools/testing/selftests/powerpc/pmu/count_stcx_fail.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/tools/testing/selftests/powerpc/pmu/count_stcx_fail.c b/tools/testing/selftests/powerpc/pmu/count_stcx_fail.c
index 2070a1e2b3a5..d8dd9a9c6c1b 100644
--- a/tools/testing/selftests/powerpc/pmu/count_stcx_fail.c
+++ b/tools/testing/selftests/powerpc/pmu/count_stcx_fail.c
@@ -144,9 +144,6 @@ static int test_body(void)
/* Run for 16Bi instructions */
FAIL_IF(do_count_loop(events, 16000000000, overhead, true));
- /* Run for 64Bi instructions */
- FAIL_IF(do_count_loop(events, 64000000000, overhead, true));
-
event_close(&events[0]);
event_close(&events[1]);
--
2.47.0
Hello,
I am a private investment consultant representing the interest of a multinational conglomerate that wishes to place funds into a trust management portfolio.
Please indicate your interest for additional information.
Regards,
Van Collin.
This patch series includes some netns-related improvements and fixes for
RTNL and ip_tunnel, to make link creation more intuitive:
- Creating link in another net namespace doesn't conflict with link names
in current one.
- Refector rtnetlink link creation. Create link in target namespace
directly. Pass both source and link netns to drivers via newlink()
callback.
So that
# ip link add netns ns1 link-netns ns2 tun0 type gre ...
will create tun0 in ns1, rather than create it in ns2 and move to ns1.
And don't conflict with another interface named "tun0" in current netns.
Patch 1 from Donald is included just as a dependency.
---
v3:
- Drop "netns_atomic" flag and module parameter. Add netns parameter to
newlink() instead, and convert drivers accordingly.
- Move python NetNSEnter helper to net selftest lib.
v2:
link: https://lore.kernel.org/all/20241107133004.7469-1-shaw.leon@gmail.com/
- Check NLM_F_EXCL to ensure only link creation is affected.
- Add self tests for link name/ifindex conflict and notifications
in different netns.
- Changes in dummy driver and ynl in order to add the test case.
v1:
link: https://lore.kernel.org/all/20241023023146.372653-1-shaw.leon@gmail.com/
Donald Hunter (1):
Revert "tools/net/ynl: improve async notification handling"
Xiao Liang (5):
net: ip_tunnel: Build flow in underlay net namespace
rtnetlink: Lookup device in target netns when creating link
rtnetlink: Decouple net namespaces in rtnl_newlink_create()
selftests: net: Add python context manager for netns entering
selftests: net: Add two test cases for link netns
drivers/infiniband/ulp/ipoib/ipoib_netlink.c | 6 ++-
drivers/net/amt.c | 6 +--
drivers/net/bareudp.c | 4 +-
drivers/net/bonding/bond_netlink.c | 3 +-
drivers/net/can/dev/netlink.c | 2 +-
drivers/net/can/vxcan.c | 4 +-
.../ethernet/qualcomm/rmnet/rmnet_config.c | 5 +-
drivers/net/geneve.c | 4 +-
drivers/net/gtp.c | 4 +-
drivers/net/ipvlan/ipvlan.h | 2 +-
drivers/net/ipvlan/ipvlan_main.c | 5 +-
drivers/net/ipvlan/ipvtap.c | 4 +-
drivers/net/macsec.c | 5 +-
drivers/net/macvlan.c | 5 +-
drivers/net/macvtap.c | 5 +-
drivers/net/netkit.c | 4 +-
drivers/net/pfcp.c | 4 +-
drivers/net/ppp/ppp_generic.c | 4 +-
drivers/net/team/team_core.c | 2 +-
drivers/net/veth.c | 4 +-
drivers/net/vrf.c | 2 +-
drivers/net/vxlan/vxlan_core.c | 4 +-
drivers/net/wireguard/device.c | 4 +-
drivers/net/wireless/virtual/virt_wifi.c | 5 +-
drivers/net/wwan/wwan_core.c | 6 ++-
include/net/ip_tunnels.h | 5 +-
include/net/rtnetlink.h | 22 ++++++++-
net/8021q/vlan_netlink.c | 5 +-
net/batman-adv/soft-interface.c | 5 +-
net/bridge/br_netlink.c | 2 +-
net/caif/chnl_net.c | 2 +-
net/core/rtnetlink.c | 25 ++++++----
net/hsr/hsr_netlink.c | 8 +--
net/ieee802154/6lowpan/core.c | 5 +-
net/ipv4/ip_gre.c | 13 +++--
net/ipv4/ip_tunnel.c | 16 +++---
net/ipv4/ip_vti.c | 5 +-
net/ipv4/ipip.c | 5 +-
net/ipv6/ip6_gre.c | 17 ++++---
net/ipv6/ip6_tunnel.c | 11 ++---
net/ipv6/ip6_vti.c | 11 ++---
net/ipv6/sit.c | 11 ++---
net/xfrm/xfrm_interface_core.c | 13 +++--
tools/net/ynl/cli.py | 10 ++--
tools/net/ynl/lib/ynl.py | 49 ++++++++-----------
tools/testing/selftests/net/Makefile | 1 +
.../testing/selftests/net/lib/py/__init__.py | 2 +-
tools/testing/selftests/net/lib/py/netns.py | 18 +++++++
tools/testing/selftests/net/netns-name.sh | 10 ++++
tools/testing/selftests/net/netns_atomic.py | 38 ++++++++++++++
50 files changed, 255 insertions(+), 157 deletions(-)
create mode 100755 tools/testing/selftests/net/netns_atomic.py
--
2.47.0
The alloc_string_stream() function only returns ERR_PTR(-ENOMEM) on
failure and never returns NULL. Therefore, switching the error check in
the caller from IS_ERR_OR_NULL to IS_ERR improves clarity, indicating
that this function will return an error pointer (not NULL) when an
error occurs. This change avoids any ambiguity regarding the function's
return behavior.
Link: https://lore.kernel.org/lkml/Zy9deU5VK3YR+r9N@visitorckw-System-Product-Name
Signed-off-by: Kuan-Wei Chiu <visitorckw(a)gmail.com>
---
lib/kunit/debugfs.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/lib/kunit/debugfs.c b/lib/kunit/debugfs.c
index d548750a325a..6273fa9652df 100644
--- a/lib/kunit/debugfs.c
+++ b/lib/kunit/debugfs.c
@@ -181,7 +181,7 @@ void kunit_debugfs_create_suite(struct kunit_suite *suite)
* successfully.
*/
stream = alloc_string_stream(GFP_KERNEL);
- if (IS_ERR_OR_NULL(stream))
+ if (IS_ERR(stream))
return;
string_stream_set_append_newlines(stream, true);
@@ -189,7 +189,7 @@ void kunit_debugfs_create_suite(struct kunit_suite *suite)
kunit_suite_for_each_test_case(suite, test_case) {
stream = alloc_string_stream(GFP_KERNEL);
- if (IS_ERR_OR_NULL(stream))
+ if (IS_ERR(stream))
goto err;
string_stream_set_append_newlines(stream, true);
--
2.34.1
Update Brendan's email address for the KUnit entry.
Signed-off-by: Brendan Higgins <brendanhiggins(a)google.com>
---
I am leaving Google and am going through and cleaning up my @google.com
address in the relevant places. This patch updates my email address for
the KUnit entry. I am removing myself from the ASPEED I2C entry in a
separate patch.
Do note that Friday, November 15 2024 is my last day at Google after
which I will lose access to this email account so any future updates or
comments after Friday will come from my @linux.dev account.
---
MAINTAINERS | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/MAINTAINERS b/MAINTAINERS
index b878ddc99f94e..d077a3deeb386 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -12405,7 +12405,7 @@ F: fs/smb/common/
F: fs/smb/server/
KERNEL UNIT TESTING FRAMEWORK (KUnit)
-M: Brendan Higgins <brendanhiggins(a)google.com>
+M: Brendan Higgins <brendan.higgins(a)linux.dev>
M: David Gow <davidgow(a)google.com>
R: Rae Moar <rmoar(a)google.com>
L: linux-kselftest(a)vger.kernel.org
base-commit: cfaaa7d010d1fc58f9717fcc8591201e741d2d49
--
2.47.0.338.g60cca15819-goog
Unmapping virtual machine guest memory from the host kernel's direct map
is a successful mitigation against Spectre-style transient execution
issues: If the kernel page tables do not contain entries pointing to
guest memory, then any attempted speculative read through the direct map
will necessarily be blocked by the MMU before any observable
microarchitectural side-effects happen. This means that Spectre-gadgets
and similar cannot be used to target virtual machine memory. Roughly 60%
of speculative execution issues fall into this category [1, Table 1].
This patch series extends guest_memfd with the ability to remove its
memory from the host kernel's direct map, to be able to attain the above
protection for KVM guests running inside guest_memfd.
=== Changes to v2 ===
- Handle direct map removal for physically contiguous pages in arch code
(Mike R.)
- Track the direct map state in guest_memfd itself instead of at the
folio level, to prepare for huge pages support (Sean C.)
- Allow configuring direct map state of not-yet faulted in memory
(Vishal A.)
- Pay attention to alignment in ftrace structs (Steven R.)
Most significantly, I've reduced the patch series to focus only on
direct map removal for guest_memfd for now, leaving the whole "how to do
non-CoCo VMs in guest_memfd" for later. If this separation is
acceptable, then I think I can drop the RFC tag in the next revision
(I've mainly kept it here because I'm not entirely sure what to do with
patches 3 and 4).
=== Implementation ===
This patch series introduces a new flag to the KVM_CREATE_GUEST_MEMFD
that causes guest_memfd to remove its pages from the host kernel's
direct map immediately after population/preparation. It also adds
infrastructure for tracking the direct map state of all gmem folios
inside the guest_memfd inode. Storing this information in the inode has
the advantage that the code is ready for future hugepages extensions,
where only removing/reinserting direct map entries for sub-ranges of a
huge folio is a valid usecase, and it allows pre-configuring the direct
map state of not-yet faulted in parts of memory (for example, when the
VMM is receiving a RX virtio buffer from the guest).
=== Summary ===
Patch 1 (from Mike Rapoport) adds arch APIs for manipulating the direct
map for ranges of physically contiguous pages, which are used by
guest_memfd in follow up patches. Patch 2 adds the
KVM_GMEM_NO_DIRECT_MAP flag and the logic for configuring direct map
state of freshly prepared folios. Patches 3 and 4 mainly serve an
illustrative purpose, to show how the framework from patch 2 can be
extended with routines for runtime direct map manipulation. Patches 5
and 6 deal with documentation and self-tests respectively.
[1]: https://download.vusec.net/papers/quarantine_raid23.pdf
[RFC v1]: https://lore.kernel.org/kvm/20240709132041.3625501-1-roypat@amazon.co.uk/
[RFC v2]: https://lore.kernel.org/kvm/20240910163038.1298452-1-roypat@amazon.co.uk/
Mike Rapoport (Microsoft) (1):
arch: introduce set_direct_map_valid_noflush()
Patrick Roy (5):
kvm: gmem: add flag to remove memory from kernel direct map
kvm: gmem: implement direct map manipulation routines
kvm: gmem: add trace point for direct map state changes
kvm: document KVM_GMEM_NO_DIRECT_MAP flag
kvm: selftests: run gmem tests with KVM_GMEM_NO_DIRECT_MAP set
Documentation/virt/kvm/api.rst | 14 +
arch/arm64/include/asm/set_memory.h | 1 +
arch/arm64/mm/pageattr.c | 10 +
arch/loongarch/include/asm/set_memory.h | 1 +
arch/loongarch/mm/pageattr.c | 21 ++
arch/riscv/include/asm/set_memory.h | 1 +
arch/riscv/mm/pageattr.c | 15 +
arch/s390/include/asm/set_memory.h | 1 +
arch/s390/mm/pageattr.c | 11 +
arch/x86/include/asm/set_memory.h | 1 +
arch/x86/mm/pat/set_memory.c | 8 +
include/linux/set_memory.h | 6 +
include/trace/events/kvm.h | 22 ++
include/uapi/linux/kvm.h | 2 +
.../testing/selftests/kvm/guest_memfd_test.c | 2 +-
.../kvm/x86_64/private_mem_conversions_test.c | 7 +-
virt/kvm/guest_memfd.c | 280 +++++++++++++++++-
17 files changed, 384 insertions(+), 19 deletions(-)
base-commit: 5cb1659f412041e4780f2e8ee49b2e03728a2ba6
--
2.47.0
Hello LKFT maintainers, CI operators,
First, I would like to say thank you to the people behind the LKFT
project for validating stable kernels (and more), and including some
Network selftests in their tests suites.
A lot of improvements around the networking kselftests have been done
this year. At the last Netconf [1], we discussed how these tests were
validated on stable kernels from CIs like the LKFT one, and we have some
suggestions to improve the situation.
KSelftests from the same version
--------------------------------
According to the doc [2], kselftests should support all previous kernel
versions. The LKFT CI is then using the kselftests from the last stable
release to validate all stable versions. Even if there are good reasons
to do that, we would like to ask for an opt-out for this policy for the
networking tests: this is hard to maintain with the increased
complexity, hard to validate on all stable kernels before applying
patches, and hard to put in place in some situations. As a result, many
tests are failing on older kernels, and it looks like it is a lot of
work to support older kernels, and to maintain this.
Many networking tests are validating the internal behaviour that is not
exposed to the userspace. A typical example: some tests look at the raw
packets being exchanged during a test, and this behaviour can change
without modifying how the userspace is interacting with the kernel. The
kernel could expose capabilities, but that's not something that seems
natural to put in place for internal behaviours that are not exposed to
end users. Maybe workarounds could be used, e.g. looking at kernel
symbols, etc. Nut that doesn't always work, increase the complexity, and
often "false positive" issue will be noticed only after a patch hits
stable, and will cause a bunch of tests to be ignored.
Regarding fixes, ideally they will come with a new or modified test that
can also be backported. So the coverage can continue to grow in stable
versions too.
Do you think that from the kernel v6.12 (or before?), the LKFT CI could
run the networking kselftests from the version that is being validated,
and not from a newer one? So validating the selftests from v6.12.1 on a
v6.12.1, and not the ones from a future v6.16.y on a v6.12.42.
Skipped tests
-------------
It looks like many tests are skipped:
- Some have been in a skip file [3] for a while: maybe they can be removed?
- Some are skipped because of missing tools: maybe they can be added?
e.g. iputils, tshark, ipv6toolkit, etc.
- Some tests are in 'net', but in subdirectories, and hence not tested,
e.g. forwarding, packetdrill, netfilter, tcp_ao. Could they be tested too?
How can we change this to increase the code coverage using existing tests?
KVM
---
It looks like different VMs are being used to execute the different
tests. Do these VMs benefit from any accelerations like KVM? If not,
some tests might fail because the environment is too slow.
The KSFT_MACHINE_SLOW=yes env var can be set to increase some
tolerances, timeout or to skip some parts, but that might not be enough
for some tests.
Notifications
-------------
In case of new regressions, who is being notified? Are the people from
the MAINTAINERS file, and linked to the corresponding selftests being
notified or do they need to do the monitoring on their side?
Looking forward to improving the networking selftests results when
validating stable kernels!
[1] https://netdev.bots.linux.dev/netconf/2024/
[2] https://docs.kernel.org/dev-tools/kselftest.html
[3]
https://github.com/Linaro/test-definitions/blob/master/automated/linux/ksel…
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
Basics and overview
===================
Software with larger attack surfaces (e.g. network facing apps like databases,
browsers or apps relying on browser runtimes) suffer from memory corruption
issues which can be utilized by attackers to bend control flow of the program
to eventually gain control (by making their payload executable). Attackers are
able to perform such attacks by leveraging call-sites which rely on indirect
calls or return sites which rely on obtaining return address from stack memory.
To mitigate such attacks, risc-v extension zicfilp enforces that all indirect
calls must land on a landing pad instruction `lpad` else cpu will raise software
check exception (a new cpu exception cause code on riscv).
Similarly for return flow, risc-v extension zicfiss extends architecture with
- `sspush` instruction to push return address on a shadow stack
- `sspopchk` instruction to pop return address from shadow stack
and compare with input operand (i.e. return address on stack)
- `sspopchk` to raise software check exception if comparision above
was a mismatch
- Protection mechanism using which shadow stack is not writeable via
regular store instructions
More information an details can be found at extensions github repo [1].
Equivalent to landing pad (zicfilp) on x86 is `ENDBRANCH` instruction in Intel
CET [3] and branch target identification (BTI) [4] on arm.
Similarly x86's Intel CET has shadow stack [5] and arm64 has guarded control
stack (GCS) [6] which are very similar to risc-v's zicfiss shadow stack.
x86 already supports shadow stack for user mode and arm64 support for GCS in
usermode [7] is in -next.
Kernel awareness for user control flow integrity
================================================
This series picks up Samuel Holland's envcfg changes [2] as well. So if those are
being applied independently, they should be removed from this series.
Enabling:
In order to maintain compatibility and not break anything in user mode, kernel
doesn't enable control flow integrity cpu extensions on binary by default.
Instead exposes a prctl interface to enable, disable and lock the shadow stack
or landing pad feature for a task. This allows userspace (loader) to enumerate
if all objects in its address space are compiled with shadow stack and landing
pad support and accordingly enable the feature. Additionally if a subsequent
`dlopen` happens on a library, user mode can take a decision again to disable
the feature (if incoming library is not compiled with support) OR terminate the
task (if user mode policy is strict to have all objects in address space to be
compiled with control flow integirty cpu feature). prctl to enable shadow stack
results in allocating shadow stack from virtual memory and activating for user
address space. x86 and arm64 are also following same direction due to similar
reason(s).
clone/fork:
On clone and fork, cfi state for task is inherited by child. Shadow stack is
part of virtual memory and is a writeable memory from kernel perspective
(writeable via a restricted set of instructions aka shadow stack instructions)
Thus kernel changes ensure that this memory is converted into read-only when
fork/clone happens and COWed when fault is taken due to sspush, sspopchk or
ssamoswap. In case `CLONE_VM` is specified and shadow stack is to be enabled,
kernel will automatically allocate a shadow stack for that clone call.
map_shadow_stack:
x86 introduced `map_shadow_stack` system call to allow user space to explicitly
map shadow stack memory in its address space. It is useful to allocate shadow
for different contexts managed by a single thread (green threads or contexts)
risc-v implements this system call as well.
signal management:
If shadow stack is enabled for a task, kernel performs an asynchronous control
flow diversion to deliver the signal and eventually expects userspace to issue
sigreturn so that original execution can be resumed. Even though resume context
is prepared by kernel, it is in user space memory and is subject to memory
corruption and corruption bugs can be utilized by attacker in this race window
to perform arbitrary sigreturn and eventually bypass cfi mechanism.
Another issue is how to ensure that cfi related state on sigcontext area is not
trampled by legacy apps or apps compiled with old kernel headers.
In order to mitigate control-flow hijacting, kernel prepares a token and place
it on shadow stack before signal delivery and places address of token in
sigcontext structure. During sigreturn, kernel obtains address of token from
sigcontext struture, reads token from shadow stack and validates it and only
then allow sigreturn to succeed. Compatiblity issue is solved by adopting
dynamic sigcontext management introduced for vector extension. This series
re-factor the code little bit to allow future sigcontext management easy (as
proposed by Andy Chiu from SiFive)
config and compilation:
Introduce a new risc-v config option `CONFIG_RISCV_USER_CFI`. Selecting this
config option picks the kernel support for user control flow integrity. This
optin is presented only if toolchain has shadow stack and landing pad support.
And is on purpose guarded by toolchain support. Reason being that eventually
vDSO also needs to be compiled in with shadow stack and landing pad support.
vDSO compile patches are not included as of now because landing pad labeling
scheme is yet to settle for usermode runtime.
To get more information on kernel interactions with respect to
zicfilp and zicfiss, patch series adds documentation for
`zicfilp` and `zicfiss` in following:
Documentation/arch/riscv/zicfiss.rst
Documentation/arch/riscv/zicfilp.rst
How to test this series
=======================
Toolchain
---------
$ git clone git@github.com:sifive/riscv-gnu-toolchain.git -b cfi-dev
$ riscv-gnu-toolchain/configure --prefix=<path-to-where-to-build> --with-arch=rv64gc_zicfilp_zicfiss --enable-linux --disable-gdb --with-extra-multilib-test="rv64gc_zicfilp_zicfiss-lp64d:-static"
$ make -j$(nproc)
Qemu
----
$ git clone git@github.com:deepak0414/qemu.git -b zicfilp_zicfiss_ratified_master_july11
$ cd qemu
$ mkdir build
$ cd build
$ ../configure --target-list=riscv64-softmmu
$ make -j$(nproc)
Opensbi
-------
$ git clone git@github.com:deepak0414/opensbi.git -b v6_cfi_spec_split_opensbi
$ make CROSS_COMPILE=<your riscv toolchain> -j$(nproc) PLATFORM=generic
Linux
-----
Running defconfig is fine. CFI is enabled by default if the toolchain
supports it.
$ make ARCH=riscv CROSS_COMPILE=<path-to-cfi-riscv-gnu-toolchain>/build/bin/riscv64-unknown-linux-gnu- -j$(nproc) defconfig
$ make ARCH=riscv CROSS_COMPILE=<path-to-cfi-riscv-gnu-toolchain>/build/bin/riscv64-unknown-linux-gnu- -j$(nproc)
Branch where user cfi enabling patches are maintained
https://github.com/deepak0414/linux-riscv-cfi/tree/vdso_user_cfi_v6.12-rc1
In case you're building your own rootfs using toolchain, please make sure you
pick following patch to ensure that vDSO compiled with lpad and shadow stack.
"arch/riscv: compile vdso with landing pad"
Running
-------
Modify your qemu command to have:
-bios <path-to-cfi-opensbi>/build/platform/generic/firmware/fw_dynamic.bin
-cpu rv64,zicfilp=true,zicfiss=true,zimop=true,zcmop=true
vDSO related Opens (in the flux)
=================================
I am listing these opens for laying out plan and what to expect in future
patch sets. And of course for the sake of discussion.
Shadow stack and landing pad enabling in vDSO
----------------------------------------------
vDSO must have shadow stack and landing pad support compiled in for task
to have shadow stack and landing pad support. This patch series doesn't
enable that (yet). Enabling shadow stack support in vDSO should be
straight forward (intend to do that in next versions of patch set). Enabling
landing pad support in vDSO requires some collaboration with toolchain folks
to follow a single label scheme for all object binaries. This is necessary to
ensure that all indirect call-sites are setting correct label and target landing
pads are decorated with same label scheme.
How many vDSOs
---------------
Shadow stack instructions are carved out of zimop (may be operations) and if CPU
doesn't implement zimop, they're illegal instructions. Kernel could be running on
a CPU which may or may not implement zimop. And thus kernel will have to carry 2
different vDSOs and expose the appropriate one depending on whether CPU implements
zimop or not.
References
==========
[1] - https://github.com/riscv/riscv-cfi
[2] - https://lore.kernel.org/all/20240814081126.956287-1-samuel.holland@sifive.c…
[3] - https://lwn.net/Articles/889475/
[4] - https://developer.arm.com/documentation/109576/0100/Branch-Target-Identific…
[5] - https://www.intel.com/content/dam/develop/external/us/en/documents/catc17-i…
[6] - https://lwn.net/Articles/940403/
[7] - https://lore.kernel.org/all/20241001-arm64-gcs-v13-0-222b78d87eee@kernel.or…
---
changelog
---------
v8:
- rebased on palmer/for-next
- dropped samuel holland's `envcfg` context switch patches.
they are in parlmer/for-next
v7:
- Removed "riscv/Kconfig: enable HAVE_EXIT_THREAD for riscv"
Instead using `deactivate_mm` flow to clean up.
see here for more context
https://lore.kernel.org/all/20230908203655.543765-1-rick.p.edgecombe@intel.…
- Changed the header include in `kselftest`. Hopefully this fixes compile
issue faced by Zong Li at SiFive.
- Cleaned up an orphaned change to `mm/mmap.c` in below patch
"riscv/mm : ensure PROT_WRITE leads to VM_READ | VM_WRITE"
- Lock interfaces for shadow stack and indirect branch tracking expect arg == 0
Any future evolution of this interface should accordingly define how arg should
be setup.
- `mm/map.c` has an instance of using `VM_SHADOW_STACK`. Fixed it to use helper
`is_shadow_stack_vma`.
- Link to v6: https://lore.kernel.org/r/20241008-v5_user_cfi_series-v6-0-60d9fe073f37@riv…
v6:
- Picked up Samuel Holland's changes as is with `envcfg` placed in
`thread` instead of `thread_info`
- fixed unaligned newline escapes in kselftest
- cleaned up messages in kselftest and included test output in commit message
- fixed a bug in clone path reported by Zong Li
- fixed a build issue if CONFIG_RISCV_ISA_V is not selected
(this was introduced due to re-factoring signal context
management code)
v5:
- rebased on v6.12-rc1
- Fixed schema related issues in device tree file
- Fixed some of the documentation related issues in zicfilp/ss.rst
(style issues and added index)
- added `SHADOW_STACK_SET_MARKER` so that implementation can define base
of shadow stack.
- Fixed warnings on definitions added in usercfi.h when
CONFIG_RISCV_USER_CFI is not selected.
- Adopted context header based signal handling as proposed by Andy Chiu
- Added support for enabling kernel mode access to shadow stack using
FWFT
(https://github.com/riscv-non-isa/riscv-sbi-doc/blob/master/src/ext-firmware…)
- Link to v5: https://lore.kernel.org/r/20241001-v5_user_cfi_series-v1-0-3ba65b6e550f@riv…
(Note: I had an issue in my workflow due to which version number wasn't
picked up correctly while sending out patches)
v4:
- rebased on 6.11-rc6
- envcfg: Converged with Samuel Holland's patches for envcfg management on per-
thread basis.
- vma_is_shadow_stack is renamed to is_vma_shadow_stack
- picked up Mark Brown's `ARCH_HAS_USER_SHADOW_STACK` patch
- signal context: using extended context management to maintain compatibility.
- fixed `-Wmissing-prototypes` compiler warnings for prctl functions
- Documentation fixes and amending typos.
- Link to v4: https://lore.kernel.org/all/20240912231650.3740732-1-debug@rivosinc.com/
v3:
- envcfg
logic to pick up base envcfg had a bug where `ENVCFG_CBZE` could have been
picked on per task basis, even though CPU didn't implement it. Fixed in
this series.
- dt-bindings
As suggested, split into separate commit. fixed the messaging that spec is
in public review
- arch_is_shadow_stack change
arch_is_shadow_stack changed to vma_is_shadow_stack
- hwprobe
zicfiss / zicfilp if present will get enumerated in hwprobe
- selftests
As suggested, added object and binary filenames to .gitignore
Selftest binary anyways need to be compiled with cfi enabled compiler which
will make sure that landing pad and shadow stack are enabled. Thus removed
separate enable/disable tests. Cleaned up tests a bit.
- Link to v3: https://lore.kernel.org/lkml/20240403234054.2020347-1-debug@rivosinc.com/
v2:
- Using config `CONFIG_RISCV_USER_CFI`, kernel support for riscv control flow
integrity for user mode programs can be compiled in the kernel.
- Enabling of control flow integrity for user programs is left to user runtime
- This patch series introduces arch agnostic `prctls` to enable shadow stack
and indirect branch tracking. And implements them on riscv.
---
Changes in v8:
- EDITME: describe what is new in this series revision.
- EDITME: use bulletpoints and terse descriptions.
- Link to v7: https://lore.kernel.org/r/20241029-v5_user_cfi_series-v7-0-2727ce9936cb@riv…
---
Andy Chiu (1):
riscv: signal: abstract header saving for setup_sigcontext
Clément Léger (1):
riscv: Add Firmware Feature SBI extensions definitions
Deepak Gupta (25):
mm: helper `is_shadow_stack_vma` to check shadow stack vma
dt-bindings: riscv: zicfilp and zicfiss in dt-bindings (extensions.yaml)
riscv: zicfiss / zicfilp enumeration
riscv: zicfiss / zicfilp extension csr and bit definitions
riscv: usercfi state for task and save/restore of CSR_SSP on trap entry/exit
riscv/mm : ensure PROT_WRITE leads to VM_READ | VM_WRITE
riscv mm: manufacture shadow stack pte
riscv mmu: teach pte_mkwrite to manufacture shadow stack PTEs
riscv mmu: write protect and shadow stack
riscv/mm: Implement map_shadow_stack() syscall
riscv/shstk: If needed allocate a new shadow stack on clone
prctl: arch-agnostic prctl for indirect branch tracking
riscv: Implements arch agnostic shadow stack prctls
riscv: Implements arch agnostic indirect branch tracking prctls
riscv/traps: Introduce software check exception
riscv/signal: save and restore of shadow stack for signal
riscv/kernel: update __show_regs to print shadow stack register
riscv/ptrace: riscv cfi status and state via ptrace and in core files
riscv/hwprobe: zicfilp / zicfiss enumeration in hwprobe
riscv: enable kernel access to shadow stack memory via FWFT sbi call
riscv: kernel command line option to opt out of user cfi
riscv: create a config for shadow stack and landing pad instr support
riscv: Documentation for landing pad / indirect branch tracking
riscv: Documentation for shadow stack on riscv
kselftest/riscv: kselftest for user mode cfi
Mark Brown (2):
mm: Introduce ARCH_HAS_USER_SHADOW_STACK
prctl: arch-agnostic prctl for shadow stack
Documentation/arch/riscv/index.rst | 2 +
Documentation/arch/riscv/zicfilp.rst | 115 +++++
Documentation/arch/riscv/zicfiss.rst | 176 +++++++
.../devicetree/bindings/riscv/extensions.yaml | 14 +
arch/riscv/Kconfig | 20 +
arch/riscv/include/asm/asm-prototypes.h | 1 +
arch/riscv/include/asm/cpufeature.h | 13 +
arch/riscv/include/asm/csr.h | 16 +
arch/riscv/include/asm/entry-common.h | 2 +
arch/riscv/include/asm/hwcap.h | 2 +
arch/riscv/include/asm/mman.h | 24 +
arch/riscv/include/asm/mmu_context.h | 7 +
arch/riscv/include/asm/pgtable.h | 30 +-
arch/riscv/include/asm/processor.h | 2 +
arch/riscv/include/asm/sbi.h | 27 ++
arch/riscv/include/asm/thread_info.h | 3 +
arch/riscv/include/asm/usercfi.h | 89 ++++
arch/riscv/include/asm/vector.h | 3 +
arch/riscv/include/uapi/asm/hwprobe.h | 2 +
arch/riscv/include/uapi/asm/ptrace.h | 22 +
arch/riscv/include/uapi/asm/sigcontext.h | 1 +
arch/riscv/kernel/Makefile | 2 +
arch/riscv/kernel/asm-offsets.c | 8 +
arch/riscv/kernel/cpufeature.c | 2 +
arch/riscv/kernel/entry.S | 31 +-
arch/riscv/kernel/head.S | 12 +
arch/riscv/kernel/process.c | 26 +-
arch/riscv/kernel/ptrace.c | 83 ++++
arch/riscv/kernel/signal.c | 140 +++++-
arch/riscv/kernel/sys_hwprobe.c | 2 +
arch/riscv/kernel/sys_riscv.c | 10 +
arch/riscv/kernel/traps.c | 42 ++
arch/riscv/kernel/usercfi.c | 526 +++++++++++++++++++++
arch/riscv/mm/init.c | 2 +-
arch/riscv/mm/pgtable.c | 17 +
arch/x86/Kconfig | 1 +
fs/proc/task_mmu.c | 2 +-
include/linux/cpu.h | 4 +
include/linux/mm.h | 5 +-
include/uapi/asm-generic/mman.h | 4 +
include/uapi/linux/elf.h | 1 +
include/uapi/linux/prctl.h | 48 ++
kernel/sys.c | 60 +++
mm/Kconfig | 6 +
mm/gup.c | 2 +-
mm/mmap.c | 2 +-
mm/vma.h | 10 +-
tools/testing/selftests/riscv/Makefile | 2 +-
tools/testing/selftests/riscv/cfi/.gitignore | 3 +
tools/testing/selftests/riscv/cfi/Makefile | 10 +
tools/testing/selftests/riscv/cfi/cfi_rv_test.h | 84 ++++
tools/testing/selftests/riscv/cfi/riscv_cfi_test.c | 78 +++
tools/testing/selftests/riscv/cfi/shadowstack.c | 373 +++++++++++++++
tools/testing/selftests/riscv/cfi/shadowstack.h | 37 ++
54 files changed, 2171 insertions(+), 35 deletions(-)
---
base-commit: 64f7b77f0bd9271861ed9e410e9856b6b0b21c48
change-id: 20240930-v5_user_cfi_series-3dc332f8f5b2
--
- debug
For logging to be useful, something has to set RET and retmsg by calling
ret_set_ksft_status(). There is a suite of functions to that end in
forwarding/lib: check_err, check_fail et.al. Move them to net/lib.sh so
that every net test can use them.
Existing lib.sh users might be using these same names for their functions.
However lib.sh is always sourced near the top of the file (checked), and
whatever new definitions will simply override the ones provided by lib.sh.
Signed-off-by: Petr Machata <petrm(a)nvidia.com>
Reviewed-by: Amit Cohen <amcohen(a)nvidia.com>
Acked-by: Shuah Khan <skhan(a)linuxfoundation.org>
---
Notes:
CC: Shuah Khan <shuah(a)kernel.org>
CC: Benjamin Poirier <bpoirier(a)nvidia.com>
CC: Hangbin Liu <liuhangbin(a)gmail.com>
CC: linux-kselftest(a)vger.kernel.org
CC: Jiri Pirko <jiri(a)resnulli.us>
tools/testing/selftests/net/forwarding/lib.sh | 73 -------------------
tools/testing/selftests/net/lib.sh | 73 +++++++++++++++++++
2 files changed, 73 insertions(+), 73 deletions(-)
diff --git a/tools/testing/selftests/net/forwarding/lib.sh b/tools/testing/selftests/net/forwarding/lib.sh
index d28dbf27c1f0..8625e3c99f55 100644
--- a/tools/testing/selftests/net/forwarding/lib.sh
+++ b/tools/testing/selftests/net/forwarding/lib.sh
@@ -445,79 +445,6 @@ done
##############################################################################
# Helpers
-# Whether FAILs should be interpreted as XFAILs. Internal.
-FAIL_TO_XFAIL=
-
-check_err()
-{
- local err=$1
- local msg=$2
-
- if ((err)); then
- if [[ $FAIL_TO_XFAIL = yes ]]; then
- ret_set_ksft_status $ksft_xfail "$msg"
- else
- ret_set_ksft_status $ksft_fail "$msg"
- fi
- fi
-}
-
-check_fail()
-{
- local err=$1
- local msg=$2
-
- check_err $((!err)) "$msg"
-}
-
-check_err_fail()
-{
- local should_fail=$1; shift
- local err=$1; shift
- local what=$1; shift
-
- if ((should_fail)); then
- check_fail $err "$what succeeded, but should have failed"
- else
- check_err $err "$what failed"
- fi
-}
-
-xfail()
-{
- FAIL_TO_XFAIL=yes "$@"
-}
-
-xfail_on_slow()
-{
- if [[ $KSFT_MACHINE_SLOW = yes ]]; then
- FAIL_TO_XFAIL=yes "$@"
- else
- "$@"
- fi
-}
-
-omit_on_slow()
-{
- if [[ $KSFT_MACHINE_SLOW != yes ]]; then
- "$@"
- fi
-}
-
-xfail_on_veth()
-{
- local dev=$1; shift
- local kind
-
- kind=$(ip -j -d link show dev $dev |
- jq -r '.[].linkinfo.info_kind')
- if [[ $kind = veth ]]; then
- FAIL_TO_XFAIL=yes "$@"
- else
- "$@"
- fi
-}
-
not()
{
"$@"
diff --git a/tools/testing/selftests/net/lib.sh b/tools/testing/selftests/net/lib.sh
index 4f52b8e48a3a..6bcf5d13879d 100644
--- a/tools/testing/selftests/net/lib.sh
+++ b/tools/testing/selftests/net/lib.sh
@@ -361,3 +361,76 @@ tests_run()
$current_test
done
}
+
+# Whether FAILs should be interpreted as XFAILs. Internal.
+FAIL_TO_XFAIL=
+
+check_err()
+{
+ local err=$1
+ local msg=$2
+
+ if ((err)); then
+ if [[ $FAIL_TO_XFAIL = yes ]]; then
+ ret_set_ksft_status $ksft_xfail "$msg"
+ else
+ ret_set_ksft_status $ksft_fail "$msg"
+ fi
+ fi
+}
+
+check_fail()
+{
+ local err=$1
+ local msg=$2
+
+ check_err $((!err)) "$msg"
+}
+
+check_err_fail()
+{
+ local should_fail=$1; shift
+ local err=$1; shift
+ local what=$1; shift
+
+ if ((should_fail)); then
+ check_fail $err "$what succeeded, but should have failed"
+ else
+ check_err $err "$what failed"
+ fi
+}
+
+xfail()
+{
+ FAIL_TO_XFAIL=yes "$@"
+}
+
+xfail_on_slow()
+{
+ if [[ $KSFT_MACHINE_SLOW = yes ]]; then
+ FAIL_TO_XFAIL=yes "$@"
+ else
+ "$@"
+ fi
+}
+
+omit_on_slow()
+{
+ if [[ $KSFT_MACHINE_SLOW != yes ]]; then
+ "$@"
+ fi
+}
+
+xfail_on_veth()
+{
+ local dev=$1; shift
+ local kind
+
+ kind=$(ip -j -d link show dev $dev |
+ jq -r '.[].linkinfo.info_kind')
+ if [[ $kind = veth ]]; then
+ FAIL_TO_XFAIL=yes "$@"
+ else
+ "$@"
+ fi
+}
--
2.45.0
It would be good to use the same mechanism for scheduling and dispatching
general net tests as the many forwarding tests already use. To that end,
move the logging helpers to net/lib.sh so that every net test can use them.
Existing lib.sh users might be using the name themselves. However lib.sh is
always sourced near the top of the file (checked), and whatever new
definition will simply override the one provided by lib.sh.
Signed-off-by: Petr Machata <petrm(a)nvidia.com>
Reviewed-by: Amit Cohen <amcohen(a)nvidia.com>
Acked-by: Shuah Khan <skhan(a)linuxfoundation.org>
---
Notes:
CC: Shuah Khan <shuah(a)kernel.org>
CC: Benjamin Poirier <bpoirier(a)nvidia.com>
CC: Hangbin Liu <liuhangbin(a)gmail.com>
CC: linux-kselftest(a)vger.kernel.org
CC: Jiri Pirko <jiri(a)resnulli.us>
tools/testing/selftests/net/forwarding/lib.sh | 10 ----------
tools/testing/selftests/net/lib.sh | 10 ++++++++++
2 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/tools/testing/selftests/net/forwarding/lib.sh b/tools/testing/selftests/net/forwarding/lib.sh
index 41dd14c42c48..d28dbf27c1f0 100644
--- a/tools/testing/selftests/net/forwarding/lib.sh
+++ b/tools/testing/selftests/net/forwarding/lib.sh
@@ -1285,16 +1285,6 @@ matchall_sink_create()
action drop
}
-tests_run()
-{
- local current_test
-
- for current_test in ${TESTS:-$ALL_TESTS}; do
- in_defer_scope \
- $current_test
- done
-}
-
cleanup()
{
pre_cleanup
diff --git a/tools/testing/selftests/net/lib.sh b/tools/testing/selftests/net/lib.sh
index 691318b1ec55..4f52b8e48a3a 100644
--- a/tools/testing/selftests/net/lib.sh
+++ b/tools/testing/selftests/net/lib.sh
@@ -351,3 +351,13 @@ log_info()
echo "INFO: $msg"
}
+
+tests_run()
+{
+ local current_test
+
+ for current_test in ${TESTS:-$ALL_TESTS}; do
+ in_defer_scope \
+ $current_test
+ done
+}
--
2.45.0
Many net selftests invent their own logging helpers. These really should be
in a library sourced by these tests. Currently forwarding/lib.sh has a
suite of perfectly fine logging helpers, but sourcing a forwarding/ library
from a higher-level directory smells of layering violation. In this patch,
move the logging helpers to net/lib.sh so that every net test can use them.
Together with the logging helpers, it's also necessary to move
pause_on_fail(), and EXIT_STATUS and RET.
Existing lib.sh users might be using these same names for their functions
or variables. However lib.sh is always sourced near the top of the
file (checked), and whatever new definitions will simply override the ones
provided by lib.sh.
Signed-off-by: Petr Machata <petrm(a)nvidia.com>
Reviewed-by: Amit Cohen <amcohen(a)nvidia.com>
Acked-by: Shuah Khan <skhan(a)linuxfoundation.org>
---
Notes:
CC: Shuah Khan <shuah(a)kernel.org>
CC: Benjamin Poirier <bpoirier(a)nvidia.com>
CC: Hangbin Liu <liuhangbin(a)gmail.com>
CC: linux-kselftest(a)vger.kernel.org
CC: Jiri Pirko <jiri(a)resnulli.us>
tools/testing/selftests/net/forwarding/lib.sh | 113 -----------------
tools/testing/selftests/net/lib.sh | 115 ++++++++++++++++++
2 files changed, 115 insertions(+), 113 deletions(-)
diff --git a/tools/testing/selftests/net/forwarding/lib.sh b/tools/testing/selftests/net/forwarding/lib.sh
index 89c25f72b10c..41dd14c42c48 100644
--- a/tools/testing/selftests/net/forwarding/lib.sh
+++ b/tools/testing/selftests/net/forwarding/lib.sh
@@ -48,7 +48,6 @@ declare -A NETIFS=(
: "${WAIT_TIME:=5}"
# Whether to pause on, respectively, after a failure and before cleanup.
-: "${PAUSE_ON_FAIL:=no}"
: "${PAUSE_ON_CLEANUP:=no}"
# Whether to create virtual interfaces, and what netdevice type they should be.
@@ -446,22 +445,6 @@ done
##############################################################################
# Helpers
-# Exit status to return at the end. Set in case one of the tests fails.
-EXIT_STATUS=0
-# Per-test return value. Clear at the beginning of each test.
-RET=0
-
-ret_set_ksft_status()
-{
- local ksft_status=$1; shift
- local msg=$1; shift
-
- RET=$(ksft_status_merge $RET $ksft_status)
- if (( $? )); then
- retmsg=$msg
- fi
-}
-
# Whether FAILs should be interpreted as XFAILs. Internal.
FAIL_TO_XFAIL=
@@ -535,102 +518,6 @@ xfail_on_veth()
fi
}
-log_test_result()
-{
- local test_name=$1; shift
- local opt_str=$1; shift
- local result=$1; shift
- local retmsg=$1; shift
-
- printf "TEST: %-60s [%s]\n" "$test_name $opt_str" "$result"
- if [[ $retmsg ]]; then
- printf "\t%s\n" "$retmsg"
- fi
-}
-
-pause_on_fail()
-{
- if [[ $PAUSE_ON_FAIL == yes ]]; then
- echo "Hit enter to continue, 'q' to quit"
- read a
- [[ $a == q ]] && exit 1
- fi
-}
-
-handle_test_result_pass()
-{
- local test_name=$1; shift
- local opt_str=$1; shift
-
- log_test_result "$test_name" "$opt_str" " OK "
-}
-
-handle_test_result_fail()
-{
- local test_name=$1; shift
- local opt_str=$1; shift
-
- log_test_result "$test_name" "$opt_str" FAIL "$retmsg"
- pause_on_fail
-}
-
-handle_test_result_xfail()
-{
- local test_name=$1; shift
- local opt_str=$1; shift
-
- log_test_result "$test_name" "$opt_str" XFAIL "$retmsg"
- pause_on_fail
-}
-
-handle_test_result_skip()
-{
- local test_name=$1; shift
- local opt_str=$1; shift
-
- log_test_result "$test_name" "$opt_str" SKIP "$retmsg"
-}
-
-log_test()
-{
- local test_name=$1
- local opt_str=$2
-
- if [[ $# -eq 2 ]]; then
- opt_str="($opt_str)"
- fi
-
- if ((RET == ksft_pass)); then
- handle_test_result_pass "$test_name" "$opt_str"
- elif ((RET == ksft_xfail)); then
- handle_test_result_xfail "$test_name" "$opt_str"
- elif ((RET == ksft_skip)); then
- handle_test_result_skip "$test_name" "$opt_str"
- else
- handle_test_result_fail "$test_name" "$opt_str"
- fi
-
- EXIT_STATUS=$(ksft_exit_status_merge $EXIT_STATUS $RET)
- return $RET
-}
-
-log_test_skip()
-{
- RET=$ksft_skip retmsg= log_test "$@"
-}
-
-log_test_xfail()
-{
- RET=$ksft_xfail retmsg= log_test "$@"
-}
-
-log_info()
-{
- local msg=$1
-
- echo "INFO: $msg"
-}
-
not()
{
"$@"
diff --git a/tools/testing/selftests/net/lib.sh b/tools/testing/selftests/net/lib.sh
index c8991cc6bf28..691318b1ec55 100644
--- a/tools/testing/selftests/net/lib.sh
+++ b/tools/testing/selftests/net/lib.sh
@@ -9,6 +9,9 @@ source "$net_dir/lib/sh/defer.sh"
: "${WAIT_TIMEOUT:=20}"
+# Whether to pause on after a failure.
+: "${PAUSE_ON_FAIL:=no}"
+
BUSYWAIT_TIMEOUT=$((WAIT_TIMEOUT * 1000)) # ms
# Kselftest framework constants.
@@ -20,6 +23,11 @@ ksft_skip=4
# namespace list created by setup_ns
NS_LIST=()
+# Exit status to return at the end. Set in case one of the tests fails.
+EXIT_STATUS=0
+# Per-test return value. Clear at the beginning of each test.
+RET=0
+
##############################################################################
# Helpers
@@ -236,3 +244,110 @@ tc_rule_handle_stats_get()
| jq ".[] | select(.options.handle == $handle) | \
.options.actions[0].stats$selector"
}
+
+ret_set_ksft_status()
+{
+ local ksft_status=$1; shift
+ local msg=$1; shift
+
+ RET=$(ksft_status_merge $RET $ksft_status)
+ if (( $? )); then
+ retmsg=$msg
+ fi
+}
+
+log_test_result()
+{
+ local test_name=$1; shift
+ local opt_str=$1; shift
+ local result=$1; shift
+ local retmsg=$1; shift
+
+ printf "TEST: %-60s [%s]\n" "$test_name $opt_str" "$result"
+ if [[ $retmsg ]]; then
+ printf "\t%s\n" "$retmsg"
+ fi
+}
+
+pause_on_fail()
+{
+ if [[ $PAUSE_ON_FAIL == yes ]]; then
+ echo "Hit enter to continue, 'q' to quit"
+ read a
+ [[ $a == q ]] && exit 1
+ fi
+}
+
+handle_test_result_pass()
+{
+ local test_name=$1; shift
+ local opt_str=$1; shift
+
+ log_test_result "$test_name" "$opt_str" " OK "
+}
+
+handle_test_result_fail()
+{
+ local test_name=$1; shift
+ local opt_str=$1; shift
+
+ log_test_result "$test_name" "$opt_str" FAIL "$retmsg"
+ pause_on_fail
+}
+
+handle_test_result_xfail()
+{
+ local test_name=$1; shift
+ local opt_str=$1; shift
+
+ log_test_result "$test_name" "$opt_str" XFAIL "$retmsg"
+ pause_on_fail
+}
+
+handle_test_result_skip()
+{
+ local test_name=$1; shift
+ local opt_str=$1; shift
+
+ log_test_result "$test_name" "$opt_str" SKIP "$retmsg"
+}
+
+log_test()
+{
+ local test_name=$1
+ local opt_str=$2
+
+ if [[ $# -eq 2 ]]; then
+ opt_str="($opt_str)"
+ fi
+
+ if ((RET == ksft_pass)); then
+ handle_test_result_pass "$test_name" "$opt_str"
+ elif ((RET == ksft_xfail)); then
+ handle_test_result_xfail "$test_name" "$opt_str"
+ elif ((RET == ksft_skip)); then
+ handle_test_result_skip "$test_name" "$opt_str"
+ else
+ handle_test_result_fail "$test_name" "$opt_str"
+ fi
+
+ EXIT_STATUS=$(ksft_exit_status_merge $EXIT_STATUS $RET)
+ return $RET
+}
+
+log_test_skip()
+{
+ RET=$ksft_skip retmsg= log_test "$@"
+}
+
+log_test_xfail()
+{
+ RET=$ksft_xfail retmsg= log_test "$@"
+}
+
+log_info()
+{
+ local msg=$1
+
+ echo "INFO: $msg"
+}
--
2.45.0
Hello,
this new series aims to migrate test_flow_dissector.sh into test_progs.
There are 2 "main" parts in test_flow_dissector.sh:
- a set of tests checking flow_dissector programs attachment to either
root namespace or non-root namespace
- dissection test
The first set is integrated in flow_dissector.c, which already contains
some existing tests for flow_dissector programs. This series uses the
opportunity to update a bit this file (use new assert, re-split tests,
etc)
The second part is migrated into a new file under test_progs,
flow_dissector_classification.c. It uses the same eBPF programs as
flow_dissector.c, but the difference is rather about how those program
are executed:
- flow_dissector.c manually runs programs with BPF_PROG_RUN
- flow_dissector_classification.c sends real packets to be dissected, and
so it also executes kernel code related to eBPF flow dissector (eg:
__skb_flow_bpf_to_target)
---
Alexis Lothoré (eBPF Foundation) (10):
selftests/bpf: add a macro to compare raw memory
selftests/bpf: use ASSERT_MEMEQ to compare bpf flow keys
selftests/bpf: replace CHECK calls with ASSERT macros in flow_dissector test
selftests/bpf: re-split main function into dedicated tests
selftests/bpf: expose all subtests from flow_dissector
selftests/bpf: add gre packets testing to flow_dissector
selftests/bpf: migrate flow_dissector namespace exclusivity test
selftests/bpf: Enable generic tc actions in selftests config
selftests/bpf: migrate bpf flow dissectors tests to test_progs
selftests/bpf: remove test_flow_dissector.sh
tools/testing/selftests/bpf/.gitignore | 1 -
tools/testing/selftests/bpf/Makefile | 3 +-
tools/testing/selftests/bpf/config | 1 +
.../selftests/bpf/prog_tests/flow_dissector.c | 307 ++++++--
.../bpf/prog_tests/flow_dissector_classification.c | 851 +++++++++++++++++++++
tools/testing/selftests/bpf/test_flow_dissector.c | 780 -------------------
tools/testing/selftests/bpf/test_flow_dissector.sh | 178 -----
tools/testing/selftests/bpf/test_progs.h | 25 +
8 files changed, 1107 insertions(+), 1039 deletions(-)
---
base-commit: 16e1d1c377aa4c56223701a31f3bfa88505e7e4f
change-id: 20241019-flow_dissector-3eb0c07fc163
Best regards,
--
Alexis Lothoré, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
On Wed, Nov 13, 2024, Paolo Bonzini wrote:
> Il mar 12 nov 2024, 21:44 Doug Covelli <doug.covelli(a)broadcom.com> ha
> scritto:
>
> > > Split irqchip should be the best tradeoff. Without it, moves from cr8
> > > stay in the kernel, but moves to cr8 always go to userspace with a
> > > KVM_EXIT_SET_TPR exit. You also won't be able to use Intel
> > > flexpriority (in-processor accelerated TPR) because KVM does not know
> > > which bits are set in IRR. So it will be *really* every move to cr8
> > > that goes to userspace.
> >
> > Sorry to hijack this thread but is there a technical reason not to allow
> > CR8 based accesses to the TPR (not MMIO accesses) when the in-kernel local
> > APIC is not in use?
>
> No worries, you're not hijacking :) The only reason is that it would be
> more code for a seldom used feature and anyway with worse performance. (To
> be clear, CR8 based accesses are allowed, but stores cause an exit in order
> to check the new TPR against IRR. That's because KVM's API does not have an
> equivalent of the TPR threshold as you point out below).
>
> > Also I could not find these documented anywhere but with MSFT's APIC our
> > monitor relies on extensions for trapping certain events such as INIT/SIPI
> > plus LINT0 and SVR writes:
> >
> > UINT64 X64ApicInitSipiExitTrap : 1; //
> > WHvRunVpExitReasonX64ApicInitSipiTrap
> > UINT64 X64ApicWriteLint0ExitTrap : 1; //
> > WHvRunVpExitReasonX64ApicWriteTrap
> > UINT64 X64ApicWriteLint1ExitTrap : 1; //
> > WHvRunVpExitReasonX64ApicWriteTrap
> > UINT64 X64ApicWriteSvrExitTrap : 1; //
> > WHvRunVpExitReasonX64ApicWriteTrap
> >
>
> There's no need for this in KVM's in-kernel APIC model. INIT and SIPI are
> handled in the hypervisor and you can get the current state of APs via
> KVM_GET_MPSTATE. LINT0 and LINT1 are injected with KVM_INTERRUPT and
> KVM_NMI respectively, and they obey IF/PPR and NMI blocking respectively,
> plus the interrupt shadow; so there's no need for userspace to know when
> LINT0/LINT1 themselves change. The spurious interrupt vector register is
> also handled completely in kernel.
>
> > I did not see any similar functionality for KVM. Does anything like that
> > exist? In any case we would be happy to add support for handling CR8
> > accesses w/o exiting w/o the in-kernel APIC along with some sort of a way
> > to configure the TPR threshold if folks are not opposed to that.
>
> As far I know everybody who's using KVM (whether proprietary or open
> source) has had no need for that, so I don't think it's a good idea to make
> the API more complex.
+1
> Performance of Windows guests is going to be bad anyway with userspace APIC.
Heh, on modern hardware, performance of any guest is going to suck with a userspace
APIC, compared to what is possible with an in-kernel APIC.
More importantly, I really, really don't want to encourage non-trivial usage of
a local APIC in userspace. KVM's support for a userspace local APIC is very
poorly tested these days. I have zero desire to spend any amount of time reviewing
and fixing issues that are unique to emulating the local APIC in userspace. And
long term, I would love to force an in-kernel local APIC, though I don't know if
that's entirely feasible.
This patchset fixes a bug where bnxt driver was failing to report that
an ntuple rule is redirecting to an RSS context. First commit is the
fix, then second commit extends selftests to detect if other/new drivers
are compliant with ntuple/rss_ctx API.
Daniel Xu (2):
bnxt_en: ethtool: Supply ntuple rss context action
selftests: drv-net: rss_ctx: Add test for ntuple rule
drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 8 ++++++--
tools/testing/selftests/drivers/net/hw/rss_ctx.py | 13 ++++++++++++-
2 files changed, 18 insertions(+), 3 deletions(-)
--
2.46.0
To be able to switch VMware products running on Linux to KVM some minor
changes are required to let KVM run/resume unmodified VMware guests.
First allow enabling of the VMware backdoor via an api. Currently the
setting of the VMware backdoor is limited to kernel boot parameters,
which forces all VM's running on a host to either run with or without
the VMware backdoor. Add a simple cap to allow enabling of the VMware
backdoor on a per VM basis. The default for that setting remains the
kvm.enable_vmware_backdoor boot parameter (which is false by default)
and can be changed on a per-vm basis via the KVM_CAP_X86_VMWARE_BACKDOOR
cap.
Second add a cap to forward hypercalls to userspace. I know that in
general that's frowned upon but VMwre guests send quite a few hypercalls
from userspace and it would be both impractical and largelly impossible
to handle all in the kernel. The change is trivial and I'd be maintaining
this code so I hope it's not a big deal.
The third commit just adds a self-test for the "forward VMware hypercalls
to userspace" functionality.
Cc: Doug Covelli <doug.covelli(a)broadcom.com>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Jonathan Corbet <corbet(a)lwn.net>
Cc: Sean Christopherson <seanjc(a)google.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: x86(a)kernel.org
Cc: "H. Peter Anvin" <hpa(a)zytor.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Namhyung Kim <namhyung(a)kernel.org>
Cc: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Cc: Isaku Yamahata <isaku.yamahata(a)intel.com>
Cc: Joel Stanley <joel(a)jms.id.au>
Cc: Zack Rusin <zack.rusin(a)broadcom.com>
Cc: kvm(a)vger.kernel.org
Cc: linux-doc(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Cc: linux-kselftest(a)vger.kernel.org
Zack Rusin (3):
KVM: x86: Allow enabling of the vmware backdoor via a cap
KVM: x86: Add support for VMware guest specific hypercalls
KVM: selftests: x86: Add a test for KVM_CAP_X86_VMWARE_HYPERCALL
Documentation/virt/kvm/api.rst | 56 ++++++++-
arch/x86/include/asm/kvm_host.h | 2 +
arch/x86/kvm/emulate.c | 5 +-
arch/x86/kvm/svm/svm.c | 6 +-
arch/x86/kvm/vmx/vmx.c | 4 +-
arch/x86/kvm/x86.c | 47 ++++++++
arch/x86/kvm/x86.h | 7 +-
include/uapi/linux/kvm.h | 2 +
tools/include/uapi/linux/kvm.h | 2 +
tools/testing/selftests/kvm/Makefile | 1 +
.../kvm/x86_64/vmware_hypercall_test.c | 108 ++++++++++++++++++
11 files changed, 227 insertions(+), 13 deletions(-)
create mode 100644 tools/testing/selftests/kvm/x86_64/vmware_hypercall_test.c
--
2.43.0
This patch series represents the final release of KTAP version 2.
There have been having open discussions on version 2 for just over 2
years. This patch series marks the end of KTAP version 2 development
and beginning of the KTAP version 3 development.
The largest component of KTAP version 2 release is the addition of test
metadata to the specification. KTAP metadata could include any test
information that is pertinent for user interaction before or after the
running of the test. For example, the test file path or the test speed.
Example of KTAP Metadata:
KTAP version 2
#:ktap_test: main
#:ktap_arch: uml
1..1
KTAP version 2
#:ktap_test: suite_1
#:ktap_subsystem: example
#:ktap_test_file: lib/test.c
1..2
ok 1 test_1
#:ktap_test: test_2
#:ktap_speed: very_slow
# test_2 has begun
#:custom_is_flaky: true
ok 2 test_2
# suite_1 has passed
ok 1 suite_1
The release also includes some formatting fixes and changes to update
the specification to version 2.
Frank Rowand (2):
ktap_v2: change version to 2-rc in KTAP specification
ktap_v2: change "version 1" to "version 2" in examples
Rae Moar (3):
ktap_v2: add test metadata
ktap_v2: formatting fixes to ktap spec
ktap_v2: change version to 2 in KTAP specification
Documentation/dev-tools/ktap.rst | 276 +++++++++++++++++++++++++++++--
1 file changed, 260 insertions(+), 16 deletions(-)
base-commit: 2d5404caa8c7bb5c4e0435f94b28834ae5456623
--
2.47.0.277.g8800431eea-goog
Since commit 49f59573e9e0 ("selftests/mm: Enable pkey_sighandler_tests
on arm64"), pkey_sighandler_tests.c (which includes pkey-arm64.h via
pkey-helpers.h) ends up compiled for arm64. Since it doesn't use
aarch64_write_signal_pkey(), the compiler warns:
In file included from pkey-helpers.h:106,
from pkey_sighandler_tests.c:31:
pkey-arm64.h:130:13: warning: ‘aarch64_write_signal_pkey’ defined but not used [-Wunused-function]
130 | static void aarch64_write_signal_pkey(ucontext_t *uctxt, u64 pkey)
| ^~~~~~~~~~~~~~~~~~~~~~~~~
Make the aarch64_write_signal_pkey() a 'static inline void' function to
avoid the compiler warning.
Fixes: f5b5ea51f78f ("selftests: mm: make protection_keys test work on arm64")
Cc: Shuah Khan <skhan(a)linuxfoundation.org>
Cc: Joey Gouly <joey.gouly(a)arm.com>
Cc: Kevin Brodsky <kevin.brodsky(a)arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com>
---
I'll add this on top of the arm64 for-next/pkey-signal branch together with
Kevin's other patches.
tools/testing/selftests/mm/pkey-arm64.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/mm/pkey-arm64.h b/tools/testing/selftests/mm/pkey-arm64.h
index d57fbeace38f..d9d2100eafc0 100644
--- a/tools/testing/selftests/mm/pkey-arm64.h
+++ b/tools/testing/selftests/mm/pkey-arm64.h
@@ -127,7 +127,7 @@ static inline u64 get_pkey_bits(u64 reg, int pkey)
return 0;
}
-static void aarch64_write_signal_pkey(ucontext_t *uctxt, u64 pkey)
+static inline void aarch64_write_signal_pkey(ucontext_t *uctxt, u64 pkey)
{
struct _aarch64_ctx *ctx = GET_UC_RESV_HEAD(uctxt);
struct poe_context *poe_ctx =
It looks like people started ignoring the compiler warnings (or even
errors) when building the arm64-specific kselftests. The first three
patches are printf() arguments adjustment. The last one adds
".arch_extension sme", otherwise they fail to build (with my toolchain
versions at least).
Note for future kselftest contributors - try to keep them warning-free.
Catalin Marinas (4):
kselftest/arm64: Fix printf() compiler warnings in the arm64 fp tests
kselftest/arm64: Fix printf() warning in the arm64 MTE prctl() test
kselftest/arm64: Fix printf() compiler warnings in the arm64
syscall-abi.c tests
kselftest/arm64: Fix compilation of SME instructions in the arm64 fp
tests
tools/testing/selftests/arm64/abi/syscall-abi.c | 8 ++++----
tools/testing/selftests/arm64/fp/sve-ptrace.c | 16 +++++++++-------
tools/testing/selftests/arm64/fp/za-ptrace.c | 8 +++++---
tools/testing/selftests/arm64/fp/za-test.S | 1 +
tools/testing/selftests/arm64/fp/zt-ptrace.c | 8 +++++---
tools/testing/selftests/arm64/fp/zt-test.S | 1 +
tools/testing/selftests/arm64/mte/check_prctl.c | 2 +-
7 files changed, 26 insertions(+), 18 deletions(-)
Currently we test signal delivery to the programs run by fp-stress but
our signal handlers simply count the number of signals seen and don't do
anything with the floating point state. The original fpsimd-test and
sve-test programs had signal handlers called irritators which modify the
live register state, verifying that we restore the signal context on
return, but a combination of misleading comments and code resulted in
them never being used and the equivalent handlers in the other tests
being stubbed or omitted.
Clarify the code, implement effective irritator handlers for the test
programs that can have them and then switch the signals generated by the
fp-stress program over to use the irritators, ensuring that we validate
that we restore the saved signal context properly.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Changes in v2:
- Change instruction and commit log for corruption of predicat
registers.
- Link to v1: https://lore.kernel.org/r/20241023-arm64-fp-stress-irritator-v1-0-a51af298d…
---
Mark Brown (6):
kselftest/arm64: Correct misleading comments on fp-stress irritators
kselftest/arm64: Remove unused ADRs from irritator handlers
kselftest/arm64: Corrupt P0 in the irritator when testing SSVE
kselftest/arm64: Implement irritators for ZA and ZT
kselftest/arm64: Provide a SIGUSR1 handler in the kernel mode FP stress test
kselftest/arm64: Test signal handler state modification in fp-stress
tools/testing/selftests/arm64/fp/fp-stress.c | 2 +-
tools/testing/selftests/arm64/fp/fpsimd-test.S | 4 +---
tools/testing/selftests/arm64/fp/kernel-test.c | 4 ++++
tools/testing/selftests/arm64/fp/sve-test.S | 8 +++-----
tools/testing/selftests/arm64/fp/za-test.S | 13 ++++---------
tools/testing/selftests/arm64/fp/zt-test.S | 13 ++++---------
6 files changed, 17 insertions(+), 27 deletions(-)
---
base-commit: 8e929cb546ee42c9a61d24fae60605e9e3192354
change-id: 20241023-arm64-fp-stress-irritator-5415fe92adef
Best regards,
--
Mark Brown <broonie(a)kernel.org>
This series contains a bit of a grab bag of improvements to the floating
point tests, mainly fp-ptrace. Globally over all the tests we start
using defines from the generated sysregs (following the example of the
KVM selftests) for SVCR, stop being quite so wasteful with registers
when calling into the assembler code then expand the coverage of both ZA
writes and FPMR (which was not there since fp-ptrace and the 2023 dpISA
extensions were on the list at the same time).
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Changes in v2:
- Drop attempt to reference sysreg-defs.h due to build issues, just
import the defines directly into fp-ptrace. We can revisit later.
- Link to v1: https://lore.kernel.org/r/20241107-arm64-fp-ptrace-fpmr-v1-0-3e5e0b6e3be9@k…
---
Mark Brown (3):
kselftets/arm64: Use flag bits for features in fp-ptrace assembler code
kselftest/arm64: Expand the set of ZA writes fp-ptrace does
kselftest/arm64: Add FPMR coverage to fp-ptrace
tools/testing/selftests/arm64/fp/fp-ptrace-asm.S | 41 ++++--
tools/testing/selftests/arm64/fp/fp-ptrace.c | 161 +++++++++++++++++++++--
tools/testing/selftests/arm64/fp/fp-ptrace.h | 12 ++
tools/testing/selftests/arm64/fp/sme-inst.h | 2 +
4 files changed, 188 insertions(+), 28 deletions(-)
---
base-commit: 8e929cb546ee42c9a61d24fae60605e9e3192354
change-id: 20241105-arm64-fp-ptrace-fpmr-ff061facd3da
Best regards,
--
Mark Brown <broonie(a)kernel.org>
While some assemblers (including the LLVM assembler I mostly use) will
happily accept SMSTART as an instruction by default others, specifically
gas, require that any architecture extensions be explicitly enabled.
The assembler SME test programs use manually encoded helpers for the new
instructions but no SMSTART helper is defined, only SM and ZA specific
variants. Unfortunately the irritators that were just added use plain
SMSTART so on stricter assemblers these fail to build:
za-test.S:160: Error: selected processor does not support `smstart'
Switch to using SMSTART ZA via the manually encoded smstart_za macro we
already have defined.
Fixes: d65f27d240bb ("kselftest/arm64: Implement irritators for ZA and ZT")
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
tools/testing/selftests/arm64/fp/za-test.S | 2 +-
tools/testing/selftests/arm64/fp/zt-test.S | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/arm64/fp/za-test.S b/tools/testing/selftests/arm64/fp/za-test.S
index 95fdc1c1f228221bc812087a528e4b7c99767bba..9c33e13e9dc4a6f084649fe7d0fb838d9171e3aa 100644
--- a/tools/testing/selftests/arm64/fp/za-test.S
+++ b/tools/testing/selftests/arm64/fp/za-test.S
@@ -157,7 +157,7 @@ function irritator_handler
// This will reset ZA to all bits 0
smstop
- smstart
+ smstart_za
ret
endfunction
diff --git a/tools/testing/selftests/arm64/fp/zt-test.S b/tools/testing/selftests/arm64/fp/zt-test.S
index a90712802801efb97dc6bf8027fb9ceac8f0a895..38080f3c328042af6b3e2d7c3300162ea6efa4ea 100644
--- a/tools/testing/selftests/arm64/fp/zt-test.S
+++ b/tools/testing/selftests/arm64/fp/zt-test.S
@@ -126,7 +126,7 @@ function irritator_handler
// This will reset ZT to all bits 0
smstop
- smstart
+ smstart_za
ret
endfunction
---
base-commit: 95ad089d464da2a4cd4511fb077f25994104c8f1
change-id: 20241108-arm64-selftest-asm-error-d78570e50b3b
Best regards,
--
Mark Brown <broonie(a)kernel.org>
Currently we don't build the PAC selftests when building with LLVM=1 since
we attempt to test for PAC support in the toolchain before we've set up the
build system to point at LLVM in lib.mk, which has to be one of the last
things in the Makefile.
Since all versions of LLVM supported for use with the kernel have PAC
support we can just sidestep the issue by just assuming PAC is there when
doing a LLVM=1 build.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
tools/testing/selftests/arm64/pauth/Makefile | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/tools/testing/selftests/arm64/pauth/Makefile b/tools/testing/selftests/arm64/pauth/Makefile
index 72e290b0b10c1ea5bf1b84232f70844a601b8129..b5a1c80e0ead6932d2441a192b9758a049e3b3f8 100644
--- a/tools/testing/selftests/arm64/pauth/Makefile
+++ b/tools/testing/selftests/arm64/pauth/Makefile
@@ -7,8 +7,14 @@ CC := $(CROSS_COMPILE)gcc
endif
CFLAGS += -mbranch-protection=pac-ret
+
+# All supported LLVMs have PAC, test for GCC
+ifeq ($(LLVM),1)
+pauth_cc_support := 1
+else
# check if the compiler supports ARMv8.3 and branch protection with PAuth
pauth_cc_support := $(shell if ($(CC) $(CFLAGS) -march=armv8.3-a -E -x c /dev/null -o /dev/null 2>&1) then echo "1"; fi)
+endif
ifeq ($(pauth_cc_support),1)
TEST_GEN_PROGS := pac
---
base-commit: 8e929cb546ee42c9a61d24fae60605e9e3192354
change-id: 20241111-arm64-selftest-pac-clang-cb80de079169
Best regards,
--
Mark Brown <broonie(a)kernel.org>
The PAC tests currently generate a very small number of false failures
since the limited size of PAC keys, especially with fewer bits allocated
for PAC due to larger VA, means collisions in generated PACs are
possible even if the PAC keys are different. The test tries to work around
this by testing repeatedly but doesn't iterate often enough to be
reliable.
We also fix a leak of file descriptors in the exec test which doesn't
matter now but can become an issue when the number of iterations is
increased, we can bump into limits on allocated file descriptors.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Mark Brown (2):
kselftest/arm64: Don't leak pipe fds in pac.exec_sign_all()
kselftest/arm64: Try harder to generate different keys during PAC tests
tools/testing/selftests/arm64/pauth/pac.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
---
base-commit: 8e929cb546ee42c9a61d24fae60605e9e3192354
change-id: 20241111-arm64-pac-test-collisions-5613f5dfe479
Best regards,
--
Mark Brown <broonie(a)kernel.org>
This series contains a bit of a grab bag of improvements to the floating
point tests, mainly fp-ptrace. Globally over all the tests we start
using defines from the generated sysregs (following the example of the
KVM selftests) for SVCR, stop being quite so wasteful with registers
when calling into the assembler code then expand the coverage of both ZA
writes and FPMR (which was not there since fp-ptrace and the 2023 dpISA
extensions were on the list at the same time).
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Mark Brown (4):
kselftets/arm64: Use flag bits for features in fp-ptrace assembler code
kselftest/arm64: Use a define for SVCR
kselftest/arm64: Expand the set of ZA writes fp-ptrace does
kselftest/arm64: Add FPMR coverage to fp-ptrace
tools/testing/selftests/arm64/fp/Makefile | 20 ++-
tools/testing/selftests/arm64/fp/fp-ptrace-asm.S | 52 +++++---
tools/testing/selftests/arm64/fp/fp-ptrace.c | 155 +++++++++++++++++++++--
tools/testing/selftests/arm64/fp/fp-ptrace.h | 16 ++-
tools/testing/selftests/arm64/fp/sve-test.S | 5 +-
tools/testing/selftests/arm64/fp/za-fork-asm.S | 3 +-
tools/testing/selftests/arm64/fp/za-test.S | 6 +-
tools/testing/selftests/arm64/fp/zt-test.S | 5 +-
8 files changed, 213 insertions(+), 49 deletions(-)
---
base-commit: 8e929cb546ee42c9a61d24fae60605e9e3192354
change-id: 20241105-arm64-fp-ptrace-fpmr-ff061facd3da
Best regards,
--
Mark Brown <broonie(a)kernel.org>
Kuan-Wei Chiu pointed out that the kernel doc for kunit_zalloc_skb()
needs to include the @gfp information. Add it.
Reported-by: Kuan-Wei Chiu <visitorckw(a)gmail.com>
Closes: https://lore.kernel.org/all/Zy+VIXDPuU613fFd@visitorckw-System-Product-Name/
Signed-off-by: Dan Carpenter <dan.carpenter(a)linaro.org>
---
include/kunit/skbuff.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/include/kunit/skbuff.h b/include/kunit/skbuff.h
index 345e1e8f0312..07784694357c 100644
--- a/include/kunit/skbuff.h
+++ b/include/kunit/skbuff.h
@@ -20,8 +20,9 @@ static void kunit_action_kfree_skb(void *p)
* kunit_zalloc_skb() - Allocate and initialize a resource managed skb.
* @test: The test case to which the skb belongs
* @len: size to allocate
+ * @gfp: allocation flags
*
- * Allocate a new struct sk_buff with GFP_KERNEL, zero fill the give length
+ * Allocate a new struct sk_buff with gfp flags, zero fill the given length
* and add it as a resource to the kunit test for automatic cleanup.
*
* Returns: newly allocated SKB, or %NULL on error
--
2.45.2