Hello,
This patch clears the warnings seen while compiling the tests; at the same time, it closes a test report.
Thank you,
Link: https://lore.kernel.org/oe-kbuild-all/202412222015.lMBH62zB-lkp@intel.com/
Ariel Otilibili (1):
selftests: Clear -Wimplicit-function-declaration warnings
tools/testing/selftests/pid_namespace/pid_max.c | 1 +
tools/testing/selftests/pidfd/pidfd_fdinfo_test.c | 1 +
2 files changed, 2 insertions(+)
--
2.43.0
The tool pp_alloc_fail.py tested error recovery by injecting errors
into page_pool_alloc_pages(). Perhaps due to the netmems conversion,
page_pool_put_full_page() does not end up calling that function.
page_pool_alloc_netmems() seems to be the base function for all of the
allocation functions in the API, so put the error injection
there instead.
Signed-off-by: John Daley <johndale(a)cisco.com>
John Daley (1):
page_pool: inject pp_alloc_fail errors in the right place
net/core/page_pool.c | 2 +-
tools/testing/selftests/drivers/net/hw/pp_alloc_fail.py | 6 +++---
2 files changed, 4 insertions(+), 4 deletions(-)
--
2.44.0
Currently, kselftests does not have a generalised mechanism to skip compiling
and running tests when required kernel configuration flags are missing.
This patch introduces a check to validate the presence of required config flags
specified in the selftest config files. In case scripts/config or the current
kernel config is not found, this check is skipped.
In order to skip checking for config options required to compile the test,
set the environment variable SKIP_CHECKS=1.
example usage:
```
make SKIP_CHECKS=1 -C livepatch/
```
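A rough sketch of the kind of comparison such a check performs is given below.
It is not the check_kselftest_deps.pl script from this patch: the helper name
missing_configs and the exact-match rule (an option counts as present only if
the same CONFIG_X=value line appears in the kernel config) are assumptions made
for illustration.

```shell
# Hypothetical helper, not part of the patch: print every option from a
# selftest "config" file that does not appear verbatim in the kernel config.
missing_configs() {
    required=$1   # selftest config file, e.g. livepatch/config
    kconfig=$2    # kernel configuration file to compare against
    grep -E '^CONFIG_[A-Za-z0-9_]+=' "$required" | while IFS= read -r opt; do
        # -x: whole-line match, -F: fixed string, -q: quiet
        grep -qxF -- "$opt" "$kconfig" || echo "$opt"
    done
}
```

When every required option is satisfied the function prints nothing, so a
caller can treat non-empty output as "skip this test directory".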
Suggested-by: Petr Mladek <pmladek(a)suse.com>
Suggested-by: Miroslav Benes <mbenes(a)suse.cz>
Signed-off-by: Siddharth Menon <simeddon(a)gmail.com>
---
v1->v2:
- Moved the logic to check for required configurations
to an external script
v2 -> v3:
- Add SKIP_CHECKS flag to skip checking the dependencies
if required
- Updated the test skip statement to be more meaningful
tools/testing/selftests/lib.mk | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)
mode change 100644 => 100755 tools/testing/selftests/lib.mk
diff --git a/tools/testing/selftests/lib.mk b/tools/testing/selftests/lib.mk
old mode 100644
new mode 100755
index d6edcfcb5be8..0e11d1d3bab8
--- a/tools/testing/selftests/lib.mk
+++ b/tools/testing/selftests/lib.mk
@@ -97,7 +97,18 @@ TEST_GEN_PROGS := $(patsubst %,$(OUTPUT)/%,$(TEST_GEN_PROGS))
TEST_GEN_PROGS_EXTENDED := $(patsubst %,$(OUTPUT)/%,$(TEST_GEN_PROGS_EXTENDED))
TEST_GEN_FILES := $(patsubst %,$(OUTPUT)/%,$(TEST_GEN_FILES))
-all: $(TEST_GEN_PROGS) $(TEST_GEN_PROGS_EXTENDED) $(TEST_GEN_FILES) \
+TEST_DIR := $(shell pwd)
+
+check_kselftest_deps:
+ifneq ($(SKIP_CHECKS),1)
+ @$(selfdir)/check_kselftest_deps.pl $(TEST_DIR) $(CC) || { \
+ echo "Skipping test: $(notdir $(TEST_DIR)) (missing required kernel features)"; \
+ exit 1; \
+ }
+endif
+
+
+all: check_kselftest_deps $(TEST_GEN_PROGS) $(TEST_GEN_PROGS_EXTENDED) $(TEST_GEN_FILES) \
$(if $(TEST_GEN_MODS_DIR),gen_mods_dir)
define RUN_TESTS
@@ -228,4 +239,4 @@ $(OUTPUT)/%:%.S
$(LINK.S) $^ $(LDLIBS) -o $@
endif
-.PHONY: run_tests all clean install emit_tests gen_mods_dir clean_mods_dir
+.PHONY: run_tests all clean install emit_tests gen_mods_dir clean_mods_dir check_kselftest_deps
--
2.39.5
Hi,
This series carries forward the effort to add Kselftest for PCI Endpoint
Subsystem started by Aman Gupta [1] a while ago. I reworked the initial version
based on another patch that fixes the return values of IOCTLs in
pci_endpoint_test driver and did many cleanups. Since the resulting work
modified the initial version substantially, I took over the authorship.
This series also incorporates the review comment by Shuah Khan [2] to move the
existing tests from 'tools/pci' to 'tools/testing/selftests/pci_endpoint' before
migrating to Kselftest framework. I made sure that the tests are executable in
each commit and updated documentation accordingly.
- Mani
[1] https://lore.kernel.org/linux-pci/20221007053934.5188-1-aman1.gupta@samsung…
[2] https://lore.kernel.org/linux-pci/b2a5db97-dc59-33ab-71cd-f591e0b1b34d@linu…
Changes in v5:
* Incorporated comments from Niklas
* Added a patch to fix the DMA MEMCPY check in pci-epf-test driver
* Collected tags
* Rebased on top of pci/next 0333f56dbbf7ef6bb46d2906766c3e1b2a04a94d
Changes in v4:
* Dropped the BAR fix patches and submitted them separately:
https://lore.kernel.org/linux-pci/20241231130224.38206-1-manivannan.sadhasi…
* Rebased on top of pci/next 9e1b45d7a5bc0ad20f6b5267992da422884b916e
Changes in v3:
* Collected tags.
* Added a note about failing testcase 10 and command to skip it in
documentation.
* Removed Aman Gupta and Padmanabhan Rajanbabu from CC as their addresses are
bouncing.
Changes in v2:
* Added a patch that fixes return values of IOCTL in pci_endpoint_test driver
* Moved the existing tests to new location before migrating
* Added a fix for BARs on Qcom devices
* Updated documentation and also added fixture variants for memcpy & DMA modes
Manivannan Sadhasivam (4):
PCI: endpoint: pci-epf-test: Fix the check for DMA MEMCPY test
misc: pci_endpoint_test: Fix the return value of IOCTL
selftests: Move PCI Endpoint tests from tools/pci to Kselftests
selftests: pci_endpoint: Migrate to Kselftest framework
Documentation/PCI/endpoint/pci-test-howto.rst | 170 +++++------
MAINTAINERS | 2 +-
drivers/misc/pci_endpoint_test.c | 255 +++++++++--------
drivers/pci/endpoint/functions/pci-epf-test.c | 4 +-
tools/pci/Build | 1 -
tools/pci/Makefile | 58 ----
tools/pci/pcitest.c | 264 ------------------
tools/pci/pcitest.sh | 73 -----
tools/testing/selftests/Makefile | 1 +
.../testing/selftests/pci_endpoint/.gitignore | 2 +
tools/testing/selftests/pci_endpoint/Makefile | 7 +
tools/testing/selftests/pci_endpoint/config | 4 +
.../pci_endpoint/pci_endpoint_test.c | 221 +++++++++++++++
13 files changed, 435 insertions(+), 627 deletions(-)
delete mode 100644 tools/pci/Build
delete mode 100644 tools/pci/Makefile
delete mode 100644 tools/pci/pcitest.c
delete mode 100644 tools/pci/pcitest.sh
create mode 100644 tools/testing/selftests/pci_endpoint/.gitignore
create mode 100644 tools/testing/selftests/pci_endpoint/Makefile
create mode 100644 tools/testing/selftests/pci_endpoint/config
create mode 100644 tools/testing/selftests/pci_endpoint/pci_endpoint_test.c
--
2.25.1
From: Steven Rostedt <rostedt(a)goodmis.org>
Now that there's a :mod: command that can be sent into set_event, add a
test that exercises it. The test covers both setting events for a loaded
module and caching which events to set for a module that is not loaded yet.
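The set_event strings used by the test follow the pattern
[system:][event:]mod:<module>, with a leading ':' when neither a system nor an
event is given. A small sketch that builds such strings is shown below; the
helper name mod_event_spec is hypothetical and not part of the patch.

```shell
# Hypothetical helper (not part of the patch): build a set_event string
# for the :mod: command from a module name and optional system/event names.
mod_event_spec() {
    module=$1
    system=${2:-}
    event=${3:-}
    prefix=""
    if [ -n "$system" ]; then
        prefix="$system:"
    fi
    if [ -n "$event" ]; then
        prefix="$prefix$event:"
    fi
    if [ -z "$prefix" ]; then
        # No system or event given: select every event of the module.
        echo ":mod:$module"
    else
        echo "${prefix}mod:$module"
    fi
}

# Example (needs root and a mounted tracefs, so it is left commented out):
# mod_event_spec trace-events-sample sample-trace foo_bar_with_cond \
#     >> /sys/kernel/tracing/set_event
```

Passing an empty string as the system name yields the event-only form, such as
foo_bar:mod:trace-events-sample.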
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: linux-kselftest(a)vger.kernel.org
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
.../ftrace/test.d/event/event-mod.tc | 192 ++++++++++++++++++
1 file changed, 192 insertions(+)
create mode 100644 tools/testing/selftests/ftrace/test.d/event/event-mod.tc
diff --git a/tools/testing/selftests/ftrace/test.d/event/event-mod.tc b/tools/testing/selftests/ftrace/test.d/event/event-mod.tc
new file mode 100644
index 000000000000..6f7601c4b54b
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/event/event-mod.tc
@@ -0,0 +1,192 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: event tracing - enable/disable with module event
+# requires: set_event "Can enable module events via: :mod:":README
+# flags: instance
+
+rmmod trace-events-sample ||:
+if ! modprobe trace-events-sample ; then
+ echo "No trace-events sample module - please make CONFIG_SAMPLE_TRACE_EVENTS=m"
+ exit_unresolved;
+fi
+trap "rmmod trace-events-sample" EXIT
+
+# Set events for the module
+echo ":mod:trace-events-sample" > set_event
+
+test_all_enabled() {
+
+ # Check if more than one is enabled
+ grep -q sample-trace:foo_bar set_event
+ grep -q sample-trace:foo_bar_with_cond set_event
+ grep -q sample-trace:foo_bar_with_fn set_event
+
+ # All of them should be enabled. Check via the enable file
+ val=`cat events/sample-trace/enable`
+ if [ $val -ne 1 ]; then
+ exit_fail
+ fi
+}
+
+clear_events() {
+ echo > set_event
+ val=`cat events/enable`
+ if [ "$val" != "0" ]; then
+ exit_fail
+ fi
+ count=`cat set_event | wc -l`
+ if [ $count -ne 0 ]; then
+ exit_fail
+ fi
+}
+
+test_all_enabled
+
+echo clear all events
+echo 0 > events/enable
+
+echo Confirm the events are disabled
+val=`cat events/sample-trace/enable`
+if [ $val -ne 0 ]; then
+ exit_fail
+fi
+
+echo And the set_event file is empty
+
+cnt=`cat set_event | wc -l`
+if [ $cnt -ne 0 ]; then
+ exit_fail
+fi
+
+echo now enable all events
+echo 1 > events/enable
+
+echo Confirm the events are enabled again
+val=`cat events/sample-trace/enable`
+if [ $val -ne 1 ]; then
+ exit_fail
+fi
+
+echo disable just the module events
+echo '!:mod:trace-events-sample' >> set_event
+
+echo Should have mix of events enabled
+val=`cat events/enable`
+if [ "$val" != "X" ]; then
+ exit_fail
+fi
+
+echo Confirm the module events are disabled
+val=`cat events/sample-trace/enable`
+if [ $val -ne 0 ]; then
+ exit_fail
+fi
+
+echo 0 > events/enable
+
+echo now enable the system events
+echo 'sample-trace:mod:trace-events-sample' > set_event
+
+test_all_enabled
+
+echo clear all events
+echo 0 > events/enable
+
+echo Confirm the events are disabled
+val=`cat events/sample-trace/enable`
+if [ $val -ne 0 ]; then
+ exit_fail
+fi
+
+echo Test enabling foo_bar only
+echo 'foo_bar:mod:trace-events-sample' > set_event
+
+grep -q sample-trace:foo_bar set_event
+
+echo make sure nothing is found besides foo_bar
+if grep -q -v sample-trace:foo_bar set_event ; then
+ exit_fail
+fi
+
+echo Append another using the system and event name
+echo 'sample-trace:foo_bar_with_cond:mod:trace-events-sample' >> set_event
+
+grep -q sample-trace:foo_bar set_event
+grep -q sample-trace:foo_bar_with_cond set_event
+
+count=`cat set_event | wc -l`
+
+if [ $count -ne 2 ]; then
+ exit_fail
+fi
+
+clear_events
+
+rmmod trace-events-sample
+
+echo ':mod:trace-events-sample' > set_event
+
+echo make sure that the module shows up, and '-' is converted to '_'
+grep -q '\*:\*:mod:trace_events_sample' set_event
+
+modprobe trace-events-sample
+
+test_all_enabled
+
+clear_events
+
+rmmod trace-events-sample
+
+echo Enable just the system events
+echo 'sample-trace:mod:trace-events-sample' > set_event
+grep -q 'sample-trace:mod:trace_events_sample' set_event
+
+modprobe trace-events-sample
+
+test_all_enabled
+
+clear_events
+
+rmmod trace-events-sample
+
+echo Enable event with just event name
+echo 'foo_bar:mod:trace-events-sample' > set_event
+grep -q 'foo_bar:mod:trace_events_sample' set_event
+
+echo Enable another event with both system and event name
+echo 'sample-trace:foo_bar_with_cond:mod:trace-events-sample' >> set_event
+grep -q 'sample-trace:foo_bar_with_cond:mod:trace_events_sample' set_event
+echo Make sure the other event was still there
+grep -q 'foo_bar:mod:trace_events_sample' set_event
+
+modprobe trace-events-sample
+
+echo There should be no :mod: cached events
+if grep -q ':mod:' set_event; then
+ exit_fail
+fi
+
+echo two events should be enabled
+count=`cat set_event | wc -l`
+if [ $count -ne 2 ]; then
+ exit_fail
+fi
+
+echo only two events should be enabled
+val=`cat events/sample-trace/enable`
+if [ "$val" != "X" ]; then
+ exit_fail
+fi
+
+val=`cat events/sample-trace/foo_bar/enable`
+if [ "$val" != "1" ]; then
+ exit_fail
+fi
+
+val=`cat events/sample-trace/foo_bar_with_cond/enable`
+if [ "$val" != "1" ]; then
+ exit_fail
+fi
+
+clear_trace
+
--
2.45.2
Hi,
This series carries forward the effort to add Kselftest for PCI Endpoint
Subsystem started by Aman Gupta [1] a while ago. I reworked the initial version
based on another patch that fixes the return values of IOCTLs in
pci_endpoint_test driver and did many cleanups. Since the resulting work
modified the initial version substantially, I took over the authorship.
This series also incorporates the review comment by Shuah Khan [2] to move the
existing tests from 'tools/pci' to 'tools/testing/selftests/pci_endpoint' before
migrating to Kselftest framework. I made sure that the tests are executable in
each commit and updated documentation accordingly.
- Mani
[1] https://lore.kernel.org/linux-pci/20221007053934.5188-1-aman1.gupta@samsung…
[2] https://lore.kernel.org/linux-pci/b2a5db97-dc59-33ab-71cd-f591e0b1b34d@linu…
Changes in v4:
* Dropped the BAR fix patches and submitted them separately:
https://lore.kernel.org/linux-pci/20241231130224.38206-1-manivannan.sadhasi…
* Rebased on top of pci/next 9e1b45d7a5bc0ad20f6b5267992da422884b916e
Changes in v3:
* Collected tags.
* Added a note about failing testcase 10 and command to skip it in
documentation.
* Removed Aman Gupta and Padmanabhan Rajanbabu from CC as their addresses are
bouncing.
Changes in v2:
* Added a patch that fixes return values of IOCTL in pci_endpoint_test driver
* Moved the existing tests to new location before migrating
* Added a fix for BARs on Qcom devices
* Updated documentation and also added fixture variants for memcpy & DMA modes
Manivannan Sadhasivam (3):
misc: pci_endpoint_test: Fix the return value of IOCTL
selftests: Move PCI Endpoint tests from tools/pci to Kselftests
selftests: pci_endpoint: Migrate to Kselftest framework
Documentation/PCI/endpoint/pci-test-howto.rst | 155 ++++------
MAINTAINERS | 2 +-
drivers/misc/pci_endpoint_test.c | 250 ++++++++---------
tools/pci/Build | 1 -
tools/pci/Makefile | 58 ----
tools/pci/pcitest.c | 264 ------------------
tools/pci/pcitest.sh | 73 -----
tools/testing/selftests/Makefile | 1 +
.../testing/selftests/pci_endpoint/.gitignore | 2 +
tools/testing/selftests/pci_endpoint/Makefile | 7 +
tools/testing/selftests/pci_endpoint/config | 4 +
.../pci_endpoint/pci_endpoint_test.c | 194 +++++++++++++
12 files changed, 386 insertions(+), 625 deletions(-)
delete mode 100644 tools/pci/Build
delete mode 100644 tools/pci/Makefile
delete mode 100644 tools/pci/pcitest.c
delete mode 100644 tools/pci/pcitest.sh
create mode 100644 tools/testing/selftests/pci_endpoint/.gitignore
create mode 100644 tools/testing/selftests/pci_endpoint/Makefile
create mode 100644 tools/testing/selftests/pci_endpoint/config
create mode 100644 tools/testing/selftests/pci_endpoint/pci_endpoint_test.c
--
2.25.1
This series expands the XDP TX metadata framework to allow user
applications to pass per packet 64-bit launch time directly to the kernel
driver, requesting launch time hardware offload support. The XDP TX
metadata framework will not perform any clock conversion or packet
reordering.
Please note that the role of Tx metadata is just to pass the launch time,
not to enable the offload feature. Users will need to enable the launch
time hardware offload feature of the device by using the respective
command, such as the tc-etf command.
Although some devices use the tc-etf command to enable their launch time
hardware offload feature, xsk packets will not go through the etf qdisc.
Therefore, in my opinion, the launch time should always be based on the PTP
Hardware Clock (PHC). Thus, I did not include a clock ID to indicate the
clock source.
To simplify the test steps, I modified the xdp_hw_metadata bpf self-test
tool in such a way that it will set the launch time based on the offset
provided by the user and the value of the Receive Hardware Timestamp, which
is against the PHC. This will eliminate the need to discipline System Clock
with the PHC and then use clock_gettime() to get the time.
Please note that AF_XDP lacks a feedback mechanism to inform the
application if the requested launch time is invalid. So, users are expected
to be familiar with the horizon of the launch time of the device they use and
not request a launch time that is beyond the horizon. Otherwise, the driver
might interpret the launch time incorrectly and react wrongly. For stmmac
and igc, where modulo computation is used, a launch time larger than the
horizon will cause the device to transmit the packet earlier than the
requested launch time.
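To illustrate the wrap-around, the sketch below models the modulo computation
as a plain remainder of the requested offset by the horizon. The function name
and the 1 s horizon are assumptions chosen for the arithmetic only; the real
per-driver limits are the ones documented in xsk-tx-metadata.rst.

```shell
# Illustrative model only: treats "modulo computation" as offset % horizon.
# Both arguments are in nanoseconds.
wrapped_offset_ns() {
    offset_ns=$1    # requested launch time minus the current PHC time
    horizon_ns=$2   # hypothetical device horizon
    echo $(( offset_ns % horizon_ns ))
}

# A request 1.5 s ahead with an assumed 1 s horizon:
wrapped_offset_ns 1500000000 1000000000
```

In this model a request 1.5 s in the future behaves like a request 0.5 s in
the future, so the packet would leave 1 s earlier than intended.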
Although there is no feedback mechanism for the launch time request
for now, users can still check whether the requested launch time is
working or not, by requesting the Transmit Completion Hardware Timestamp.
Changes since v1:
- renamed to use Earliest TxTime First (Willem)
- renamed to use txtime (Willem)
Changes since v2:
- renamed to use launch time (Jesper & Willem)
- changed the default launch time in xdp_hw_metadata apps from 1s to 0.1s
because some NICs do not support such a large future time.
Changes since v3:
- added XDP launch time support to the igc driver (Jesper & Florian)
- added per-driver launch time limitation on xsk-tx-metadata.rst (Jesper)
- added explanation on FIFO behavior on xsk-tx-metadata.rst (Jakub)
- added step to enable launch time in the commit message (Jesper & Willem)
- explicitly documented the type of launch_time and which clock source
it is against (Willem)
Changes since v4:
- change netdev feature name from tx-launch-time to tx-launch-time-fifo
to explicitly state the FIFO behaviour (Stanislav)
- improve the looping of xdp_hw_metadata app to wait for packet tx
completion to be more readable by using clock_gettime() (Stanislav)
- add launch time setup steps into xdp_hw_metadata app (Stanislav)
v1: https://patchwork.kernel.org/project/netdevbpf/cover/20231130162028.852006-…
v2: https://patchwork.kernel.org/project/netdevbpf/cover/20231201062421.1074768…
v3: https://patchwork.kernel.org/project/netdevbpf/cover/20231203165129.1740512…
v4: https://patchwork.kernel.org/project/netdevbpf/cover/20250106135506.9687-1-…
Song Yoong Siang (4):
xsk: Add launch time hardware offload support to XDP Tx metadata
selftests/bpf: Add launch time request to xdp_hw_metadata
net: stmmac: Add launch time support to XDP ZC
igc: Add launch time support to XDP ZC
Documentation/netlink/specs/netdev.yaml | 4 +
Documentation/networking/xsk-tx-metadata.rst | 62 +++++++++
drivers/net/ethernet/intel/igc/igc_main.c | 78 +++++++----
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 +
.../net/ethernet/stmicro/stmmac/stmmac_main.c | 13 ++
include/net/xdp_sock.h | 10 ++
include/net/xdp_sock_drv.h | 1 +
include/uapi/linux/if_xdp.h | 10 ++
include/uapi/linux/netdev.h | 3 +
net/core/netdev-genl.c | 2 +
net/xdp/xsk.c | 3 +
tools/include/uapi/linux/if_xdp.h | 10 ++
tools/include/uapi/linux/netdev.h | 3 +
tools/testing/selftests/bpf/xdp_hw_metadata.c | 121 +++++++++++++++++-
14 files changed, 298 insertions(+), 24 deletions(-)
--
2.34.1
The orig_a0 is missing in struct user_regs_struct of riscv, and there is
no way to add it without breaking UAPI. (See Link tag below)
As NT_ARM_SYSTEM_CALL does, add a new regset named NT_RISCV_ORIG_A0 to
access the original a0 register from userspace via the ptrace API.
Link: https://lore.kernel.org/all/59505464-c84a-403d-972f-d4b2055eeaac@gmail.com/
Signed-off-by: Celeste Liu <uwu(a)coelacanthus.name>
---
Changes in v6:
- Fix obsolete comment.
- Copy include/linux/stddef.h to tools/include to use offsetofend in
selftests.
- Link to v5: https://lore.kernel.org/r/20250115-riscv-new-regset-v5-0-d0e6ec031a23@coela…
Changes in v5:
- Fix wrong usage in selftests.
- Link to v4: https://lore.kernel.org/r/20241226-riscv-new-regset-v4-0-4496a29d0436@coela…
Changes in v4:
- Fix a copy-paste error in selftest. (Forgot to commit...)
- Link to v3: https://lore.kernel.org/r/20241226-riscv-new-regset-v3-0-f5b96465826b@coela…
Changes in v3:
- Use return 0 directly for readability.
- Fix the test for modifying a0.
- Add Fixes: tag
- Remove useless Cc: stable.
- Selftest will check both a0 and orig_a0, but depends on the
correctness of PTRACE_GET_SYSCALL_INFO.
- Link to v2: https://lore.kernel.org/r/20241203-riscv-new-regset-v2-0-d37da8c0cba6@coela…
Changes in v2:
- Fix integer width.
- Add selftest.
- Link to v1: https://lore.kernel.org/r/20241201-riscv-new-regset-v1-1-c83c58abcc7b@coela…
---
Celeste Liu (3):
riscv/ptrace: add new regset to access original a0 register
tools: copy include/linux/stddef.h to tools/include
riscv: selftests: Add a ptrace test to verify a0 and orig_a0 access
arch/riscv/kernel/ptrace.c | 32 +++++
include/uapi/linux/elf.h | 1 +
tools/include/linux/stddef.h | 85 ++++++++++++
tools/include/uapi/linux/stddef.h | 6 +-
tools/testing/selftests/riscv/abi/.gitignore | 1 +
tools/testing/selftests/riscv/abi/Makefile | 6 +-
tools/testing/selftests/riscv/abi/ptrace.c | 193 +++++++++++++++++++++++++++
7 files changed, 319 insertions(+), 5 deletions(-)
---
base-commit: 0e287d31b62bb53ad81d5e59778384a40f8b6f56
change-id: 20241201-riscv-new-regset-d529b952ad0d
Best regards,
--
Celeste Liu <uwu(a)coelacanthus.name>
Here are just a bunch of small improvements for the MPTCP selftests:
Patch 1: Unify error messages in simult_flows: print MIB and 'ss -Me'.
Patch 2: Unify error messages in sockopt: print MIB.
Patch 3: Move common code to print debug info to mptcp_lib.sh.
Patch 4: Use 'ss' with '-m' in case of errors.
Patch 5: Remove an unused variable.
Patch 6: Print only the size instead of size + filename again.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Geliang Tang (1):
selftests: mptcp: sockopt: save nstat infos
Matthieu Baerts (NGI0) (5):
selftests: mptcp: simult_flows: unify errors msgs
selftests: mptcp: move stats info in case of errors to lib.sh
selftests: mptcp: add -m with ss in case of errors
selftests: mptcp: connect: remove unused variable
selftests: mptcp: connect: better display the files size
tools/testing/selftests/net/mptcp/mptcp_connect.sh | 13 ++++---------
tools/testing/selftests/net/mptcp/mptcp_join.sh | 9 ++-------
tools/testing/selftests/net/mptcp/mptcp_lib.sh | 21 +++++++++++++++++++++
tools/testing/selftests/net/mptcp/mptcp_sockopt.sh | 17 ++++++++++++-----
tools/testing/selftests/net/mptcp/simult_flows.sh | 21 ++++++++++++++-------
5 files changed, 53 insertions(+), 28 deletions(-)
---
base-commit: 9c7ad35632297edc08d0f2c7b599137e9fb5f9ff
change-id: 20250114-net-next-mptcp-st-more-debug-err-3f3f1aa15a10
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
When porting librseq commit:
commit c7b45750fa85 ("Adapt to glibc __rseq_size feature detection")
from librseq to the kernel selftests, the following line was missed
at the end of rseq_init():
rseq_size = get_rseq_kernel_feature_size();
which effectively leaves rseq_size initialized to -1U when glibc does not
have rseq support. glibc supports rseq from version 2.35 onwards.
In a following librseq commit
commit c67d198627c2 ("Only set 'rseq_size' on first thread registration")
to mimic the libc behavior, a new approach is taken: don't set the
feature size in 'rseq_size' until at least one thread has successfully
registered. This allows using 'rseq_size' in fast-paths to test for both
registration status and available features. The caveat is that on libc
either all threads are registered or none are, while with bare librseq
it is the responsibility of the user to register all threads using rseq.
This combines the changes from the following librseq commits:
commit c7b45750fa85 ("Adapt to glibc __rseq_size feature detection")
commit c67d198627c2 ("Only set 'rseq_size' on first thread registration")
Fixes: 73a4f5a704a2 ("selftests/rseq: Fix mm_cid test failure")
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Raghavendra Rao Ananta <rananta(a)google.com>
Cc: Shuah Khan <skhan(a)linuxfoundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Boqun Feng <boqun.feng(a)gmail.com>
Cc: "Paul E. McKenney" <paulmck(a)kernel.org>
Cc: Carlos O'Donell <carlos(a)redhat.com>
Cc: Florian Weimer <fweimer(a)redhat.com>
Cc: Michael Jeanson <mjeanson(a)efficios.com>
Cc: linux-kselftest(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
---
tools/testing/selftests/rseq/rseq.c | 32 ++++++++++++++++++++++-------
tools/testing/selftests/rseq/rseq.h | 9 +++++++-
2 files changed, 33 insertions(+), 8 deletions(-)
diff --git a/tools/testing/selftests/rseq/rseq.c b/tools/testing/selftests/rseq/rseq.c
index 5b9772cdf265..f6156790c3b4 100644
--- a/tools/testing/selftests/rseq/rseq.c
+++ b/tools/testing/selftests/rseq/rseq.c
@@ -61,7 +61,6 @@ unsigned int rseq_size = -1U;
unsigned int rseq_flags;
static int rseq_ownership;
-static int rseq_reg_success; /* At least one rseq registration has succeded. */
/* Allocate a large area for the TLS. */
#define RSEQ_THREAD_AREA_ALLOC_SIZE 1024
@@ -152,14 +151,27 @@ int rseq_register_current_thread(void)
}
rc = sys_rseq(&__rseq_abi, get_rseq_min_alloc_size(), 0, RSEQ_SIG);
if (rc) {
- if (RSEQ_READ_ONCE(rseq_reg_success)) {
+ /*
+ * After at least one thread has registered successfully
+ * (rseq_size > 0), the registration of other threads should
+ * never fail.
+ */
+ if (RSEQ_READ_ONCE(rseq_size) > 0) {
/* Incoherent success/failure within process. */
abort();
}
return -1;
}
assert(rseq_current_cpu_raw() >= 0);
- RSEQ_WRITE_ONCE(rseq_reg_success, 1);
+
+ /*
+ * The first thread to register sets the rseq_size to mimic the libc
+ * behavior.
+ */
+ if (RSEQ_READ_ONCE(rseq_size) == 0) {
+ RSEQ_WRITE_ONCE(rseq_size, get_rseq_kernel_feature_size());
+ }
+
return 0;
}
@@ -235,12 +247,18 @@ void rseq_init(void)
return;
}
rseq_ownership = 1;
- if (!rseq_available()) {
- rseq_size = 0;
- return;
- }
+
+ /* Calculate the offset of the rseq area from the thread pointer. */
rseq_offset = (void *)&__rseq_abi - rseq_thread_pointer();
+
+ /* rseq flags are deprecated, always set to 0. */
rseq_flags = 0;
+
+ /*
+ * Set the size to 0 until at least one thread registers to mimic the
+ * libc behavior.
+ */
+ rseq_size = 0;
}
static __attribute__((destructor))
diff --git a/tools/testing/selftests/rseq/rseq.h b/tools/testing/selftests/rseq/rseq.h
index 4e217b620e0c..062d10925a10 100644
--- a/tools/testing/selftests/rseq/rseq.h
+++ b/tools/testing/selftests/rseq/rseq.h
@@ -60,7 +60,14 @@
extern ptrdiff_t rseq_offset;
/*
- * Size of the registered rseq area. 0 if the registration was
+ * The rseq ABI is composed of extensible feature fields. The extensions
+ * are done by appending additional fields at the end of the structure.
+ * The rseq_size defines the size of the active feature set which can be
+ * used by the application for the current rseq registration. Features
+ * starting at offset >= rseq_size are inactive and should not be used.
+ *
+ * The rseq_size is the intersection between the available allocation
+ * size for the rseq area and the feature size supported by the kernel.
* unsuccessful.
*/
extern unsigned int rseq_size;
--
2.39.5
The orig_a0 is missing in struct user_regs_struct of riscv, and there is
no way to add it without breaking UAPI. (See Link tag below)
As NT_ARM_SYSTEM_CALL does, add a new regset named NT_RISCV_ORIG_A0 to
access the original a0 register from userspace via the ptrace API.
Link: https://lore.kernel.org/all/59505464-c84a-403d-972f-d4b2055eeaac@gmail.com/
Signed-off-by: Celeste Liu <uwu(a)coelacanthus.name>
---
Changes in v5:
- Fix wrong usage in selftests.
- Link to v4: https://lore.kernel.org/r/20241226-riscv-new-regset-v4-0-4496a29d0436@coela…
Changes in v4:
- Fix a copy-paste error in selftest. (Forgot to commit...)
- Link to v3: https://lore.kernel.org/r/20241226-riscv-new-regset-v3-0-f5b96465826b@coela…
Changes in v3:
- Use return 0 directly for readability.
- Fix the test for modifying a0.
- Add Fixes: tag
- Remove useless Cc: stable.
- Selftest will check both a0 and orig_a0, but depends on the
correctness of PTRACE_GET_SYSCALL_INFO.
- Link to v2: https://lore.kernel.org/r/20241203-riscv-new-regset-v2-0-d37da8c0cba6@coela…
Changes in v2:
- Fix integer width.
- Add selftest.
- Link to v1: https://lore.kernel.org/r/20241201-riscv-new-regset-v1-1-c83c58abcc7b@coela…
---
Celeste Liu (2):
riscv/ptrace: add new regset to access original a0 register
riscv: selftests: Add a ptrace test to verify a0 and orig_a0 access
arch/riscv/kernel/ptrace.c | 32 +++++
include/uapi/linux/elf.h | 1 +
tools/testing/selftests/riscv/abi/.gitignore | 1 +
tools/testing/selftests/riscv/abi/Makefile | 6 +-
tools/testing/selftests/riscv/abi/ptrace.c | 201 +++++++++++++++++++++++++++
5 files changed, 240 insertions(+), 1 deletion(-)
---
base-commit: 0e287d31b62bb53ad81d5e59778384a40f8b6f56
change-id: 20241201-riscv-new-regset-d529b952ad0d
Best regards,
--
Celeste Liu <uwu(a)coelacanthus.name>
Changes v8:
- Fix Makefile changes.
- Update cover letter SNC status information.
- Add Reinette's reviewed by tag to patch 2/2.
Changes v7:
- Include fallthrough in resctrlfs.c.
- Check fp after opening empty cpus file.
- Correct a comment and merge strings in snprintf().
Changes v6:
- Rebase onto latest kselftest-next.
- After a fresh look at the two patches, decided to split them
along the lines of:
- Patch 1/2 contains all of the code that relates to SNC mode
detection and checking that detection's reliability.
- Patch 2/2 contains checking kernel support for SNC and
modifying the messages at the end of affected tests.
Changes v5:
- Tests are skipped if snc_unreliable was set.
- Moved resctrlfs.c changes from patch 2/2 to 1/2.
- Removed CAT changes since it's not impacted by SNC in the selftest.
- Updated various comments.
- Fixed a bunch of minor issues pointed out in the review.
Changes v4:
- Printing SNC warnings at the start of every test.
- Printing SNC warnings at the end of every relevant test.
- Remove global snc_mode variable, consolidate snc detection functions
into one.
- Correct minor mistakes.
Changes v3:
- Reworked patch 2.
- Changed minor things in patch 1 like function name and made
corrections to the patch message.
Changes v2:
- Removed patches 2 and 3 since now this part will be supported by the
kernel.
Sub-NUMA Clustering (SNC) allows splitting CPU cores, caches and memory
into multiple NUMA nodes. When enabled, NUMA-aware applications can
achieve better performance on bigger server platforms.
SNC support was merged into the kernel [1]. With SNC enabled
and kernel support in place all the tests will function normally (aside
from effective cache size). There might be a problem when SNC is enabled
but the system is still using an older kernel version without SNC
support. Currently the only message displayed in that situation is a
guess that SNC might be enabled and is causing issues. That message is
also displayed whenever the test fails on an Intel platform.
Add a mechanism to discover kernel support for SNC which will add more
meaning and certainty to the error message.
Add runtime SNC mode detection and verify how reliable that information
is.
Series was tested on Ice Lake server platforms with SNC disabled, SNC-2
and SNC-4. The tests were also run with and without kernel support for
SNC.
Series applies cleanly on kselftest/next.
[1] https://lore.kernel.org/all/20240716065458.GAZpYZQhh0PBItpD1k@fat_crate.loc…
Previous versions of this series:
[v1] https://lore.kernel.org/all/cover.1709721159.git.maciej.wieczor-retman@inte…
[v2] https://lore.kernel.org/all/cover.1715769576.git.maciej.wieczor-retman@inte…
[v3] https://lore.kernel.org/all/cover.1719842207.git.maciej.wieczor-retman@inte…
[v4] https://lore.kernel.org/all/cover.1720774981.git.maciej.wieczor-retman@inte…
[v5] https://lore.kernel.org/all/cover.1730206468.git.maciej.wieczor-retman@inte…
[v6] https://lore.kernel.org/all/cover.1733136454.git.maciej.wieczor-retman@inte…
[v7] https://lore.kernel.org/all/cover.1733741950.git.maciej.wieczor-retman@inte…
Maciej Wieczor-Retman (2):
selftests/resctrl: Adjust effective L3 cache size with SNC enabled
selftests/resctrl: Discover SNC kernel support and adjust messages
tools/testing/selftests/resctrl/Makefile | 1 +
tools/testing/selftests/resctrl/cmt_test.c | 4 +-
tools/testing/selftests/resctrl/mba_test.c | 2 +
tools/testing/selftests/resctrl/mbm_test.c | 4 +-
tools/testing/selftests/resctrl/resctrl.h | 6 +
.../testing/selftests/resctrl/resctrl_tests.c | 9 +-
tools/testing/selftests/resctrl/resctrlfs.c | 137 ++++++++++++++++++
7 files changed, 158 insertions(+), 5 deletions(-)
--
2.47.1
The new option controls whether tests run on boot or module load. With the new
debugfs "run" dentry allowing tests to be run on demand, the ability to disable
automatic test runs becomes useful for intrusive tests.
The option is set to true by default to preserve the existing behavior. It
can be overridden by either the corresponding module option or by the
corresponding config build option.
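The intended behavior can be sketched in user space. This is a simplified model, not the kernel code: resolve_autorun() and init_suites() are hypothetical names, and the model assumes that suites are always initialized but only executed when the autorun value is true.

```python
from typing import Callable, List, Optional


def resolve_autorun(config_default: bool, param_override: Optional[bool]) -> bool:
    """Build-time default, optionally overridden by a module/boot parameter."""
    return config_default if param_override is None else param_override


def init_suites(suites: List[Callable[[], None]], run_tests: bool) -> int:
    """Walk every suite, but only execute it when run_tests is set.

    Returns the number of suites that were actually run.
    """
    executed = 0
    for suite in suites:
        # In this model a skipped suite stays registered, so it can
        # still be triggered on demand later.
        if run_tests:
            suite()
            executed += 1
    return executed


# Default build (autorun on), overridden to off by a parameter.
autorun = resolve_autorun(config_default=True, param_override=False)
print(init_suites([lambda: None, lambda: None], autorun))
```

An on-demand run corresponds to calling init_suites() with run_tests=True regardless of the autorun value.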
Signed-off-by: Stanislav Kinsburskii <skinsburskii(a)linux.microsoft.com>
---
include/kunit/test.h | 4 +++-
lib/kunit/Kconfig | 12 ++++++++++++
lib/kunit/debugfs.c | 2 +-
lib/kunit/executor.c | 21 +++++++++++++++++++--
lib/kunit/test.c | 6 ++++--
5 files changed, 39 insertions(+), 6 deletions(-)
diff --git a/include/kunit/test.h b/include/kunit/test.h
index 34b71e42fb10..58dbab60f853 100644
--- a/include/kunit/test.h
+++ b/include/kunit/test.h
@@ -312,6 +312,7 @@ static inline void kunit_set_failure(struct kunit *test)
}
bool kunit_enabled(void);
+bool kunit_autorun(void);
const char *kunit_action(void);
const char *kunit_filter_glob(void);
char *kunit_filter(void);
@@ -334,7 +335,8 @@ kunit_filter_suites(const struct kunit_suite_set *suite_set,
int *err);
void kunit_free_suite_set(struct kunit_suite_set suite_set);
-int __kunit_test_suites_init(struct kunit_suite * const * const suites, int num_suites);
+int __kunit_test_suites_init(struct kunit_suite * const * const suites, int num_suites,
+ bool run_tests);
void __kunit_test_suites_exit(struct kunit_suite **suites, int num_suites);
diff --git a/lib/kunit/Kconfig b/lib/kunit/Kconfig
index 34d7242d526d..a97897edd964 100644
--- a/lib/kunit/Kconfig
+++ b/lib/kunit/Kconfig
@@ -81,4 +81,16 @@ config KUNIT_DEFAULT_ENABLED
In most cases this should be left as Y. Only if additional opt-in
behavior is needed should this be set to N.
+config KUNIT_AUTORUN_ENABLED
+ bool "Default value of kunit.autorun"
+ default y
+ help
+ Sets the default value of kunit.autorun. If set to N then KUnit
+ tests will not run after initialization unless kunit.autorun=1 is
+ passed to the kernel command line. The test can still be run manually
+ via debugfs interface.
+
+ In most cases this should be left as Y. Only if additional opt-in
+ behavior is needed should this be set to N.
+
endif # KUNIT
diff --git a/lib/kunit/debugfs.c b/lib/kunit/debugfs.c
index d548750a325a..9df064f40d98 100644
--- a/lib/kunit/debugfs.c
+++ b/lib/kunit/debugfs.c
@@ -145,7 +145,7 @@ static ssize_t debugfs_run(struct file *file,
struct inode *f_inode = file->f_inode;
struct kunit_suite *suite = (struct kunit_suite *) f_inode->i_private;
- __kunit_test_suites_init(&suite, 1);
+ __kunit_test_suites_init(&suite, 1, true);
return count;
}
diff --git a/lib/kunit/executor.c b/lib/kunit/executor.c
index 34b7b6833df3..3f39955cb0f1 100644
--- a/lib/kunit/executor.c
+++ b/lib/kunit/executor.c
@@ -29,6 +29,22 @@ const char *kunit_action(void)
return action_param;
}
+/*
+ * Run KUnit tests after initialization
+ */
+#ifdef CONFIG_KUNIT_AUTORUN_ENABLED
+static bool autorun_param = true;
+#else
+static bool autorun_param;
+#endif
+module_param_named(autorun, autorun_param, bool, 0);
+MODULE_PARM_DESC(autorun, "Run KUnit tests after initialization");
+
+bool kunit_autorun(void)
+{
+ return autorun_param;
+}
+
static char *filter_glob_param;
static char *filter_param;
static char *filter_action_param;
@@ -260,13 +276,14 @@ kunit_filter_suites(const struct kunit_suite_set *suite_set,
void kunit_exec_run_tests(struct kunit_suite_set *suite_set, bool builtin)
{
size_t num_suites = suite_set->end - suite_set->start;
+ bool autorun = kunit_autorun();
- if (builtin || num_suites) {
+ if (autorun && (builtin || num_suites)) {
pr_info("KTAP version 1\n");
pr_info("1..%zu\n", num_suites);
}
- __kunit_test_suites_init(suite_set->start, num_suites);
+ __kunit_test_suites_init(suite_set->start, num_suites, autorun);
}
void kunit_exec_list_tests(struct kunit_suite_set *suite_set, bool include_attr)
diff --git a/lib/kunit/test.c b/lib/kunit/test.c
index 089c832e3cdb..146d1b48a096 100644
--- a/lib/kunit/test.c
+++ b/lib/kunit/test.c
@@ -708,7 +708,8 @@ bool kunit_enabled(void)
return enable_param;
}
-int __kunit_test_suites_init(struct kunit_suite * const * const suites, int num_suites)
+int __kunit_test_suites_init(struct kunit_suite * const * const suites, int num_suites,
+ bool run_tests)
{
unsigned int i;
@@ -731,7 +732,8 @@ int __kunit_test_suites_init(struct kunit_suite * const * const suites, int num_
for (i = 0; i < num_suites; i++) {
kunit_init_suite(suites[i]);
- kunit_run_tests(suites[i]);
+ if (run_tests)
+ kunit_run_tests(suites[i]);
}
static_branch_dec(&kunit_running);
Fixes an issue where out-of-tree kselftest builds fail when building
the BPF and bpftools components. The failure occurs because the top-level
Makefile passes a relative srctree path to its sub-Makefiles, which
leads to errors in locating necessary files.
For example, the following error is encountered:
```
$ make V=1 O=$build/ TARGETS=hid kselftest-all
...
make -C ../tools/testing/selftests all
make[4]: Entering directory '/path/to/linux/tools/testing/selftests/hid'
make -C /path/to/linux/tools/testing/selftests/../../../tools/lib/bpf OUTPUT=/path/to/linux/O/kselftest/hid/tools/build/libbpf/ \
EXTRA_CFLAGS='-g -O0' \
DESTDIR=/path/to/linux/O/kselftest/hid/tools prefix= all install_headers
make[5]: Entering directory '/path/to/linux/tools/lib/bpf'
...
make[5]: Entering directory '/path/to/linux/tools/bpf/bpftool'
Makefile:127: ../tools/build/Makefile.feature: No such file or directory
make[5]: *** No rule to make target '../tools/build/Makefile.feature'. Stop.
```
To resolve this, override srctree in the kselftest top-level Makefile
when performing an out-of-tree build. This ensures that all sub-Makefiles
have the correct path to the source tree, preventing directory resolution
errors.
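Why a relative srctree breaks can be shown with plain path arithmetic. The sketch below assumes the output directory is /path/to/linux/O and that srctree was passed as "..", which is consistent with the log above but is still an assumption; resolve() is a hypothetical helper, and an absolute srctree is used in the last call purely for illustration.

```python
import posixpath


def resolve(cwd: str, srctree: str, rel: str) -> str:
    """Resolve $(srctree)/rel the way a sub-make running in cwd would."""
    return posixpath.normpath(posixpath.join(cwd, srctree, rel))


feature = "tools/build/Makefile.feature"

# Relative srctree, valid only when evaluated from the output directory.
print(resolve("/path/to/linux/O", "..", feature))
# Same relative value seen from a directory entered via `make -C`.
print(resolve("/path/to/linux/tools/bpf/bpftool", "..", feature))
# A srctree that is absolute resolves identically from any directory.
print(resolve("/path/to/linux/tools/bpf/bpftool", "/path/to/linux", feature))
```

Only the second call lands on a path outside tools/build, which matches the "No such file or directory" failure in the log.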
Signed-off-by: Li Zhijian <lizhijian(a)fujitsu.com>
---
Cc: Masahiro Yamada <masahiroy(a)kernel.org>
V2:
- handle srctree in selftests itself rather than the linux' top Makefile # Masahiro Yamada <masahiroy(a)kernel.org>
V1: https://lore.kernel.org/lkml/20241217031052.69744-1-lizhijian@fujitsu.com/
---
tools/testing/selftests/Makefile | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 3d8a80abd4f0..ab82278353cf 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -154,15 +154,19 @@ override LDFLAGS =
override MAKEFLAGS =
endif
+top_srcdir ?= ../../..
+
# Append kselftest to KBUILD_OUTPUT and O to avoid cluttering
# KBUILD_OUTPUT with selftest objects and headers installed
# by selftests Makefile or lib.mk.
+# Override the `srctree` variable to ensure it is correctly resolved in
+# sub-Makefiles, such as those within `bpf`, when managing targets like
+# `net` and `hid`.
ifdef building_out_of_srctree
override LDFLAGS =
+override srctree := $(top_srcdir)
endif
-top_srcdir ?= ../../..
-
ifeq ("$(origin O)", "command line")
KBUILD_OUTPUT := $(O)
endif
--
2.44.0
Android uses the ashmem driver [1] for creating shared memory regions
between processes. The ashmem driver exposes an ioctl command for
processes to restrict the permissions an ashmem buffer can be mapped
with.
Buffers are created with the ability to be mapped as readable, writable,
and executable. Processes remove the ability to map some ashmem buffers
as executable to ensure that those buffers cannot be exploited to run
unintended code. Other buffers retain their ability to be mapped as
executable, as these buffers can be used for just-in-time (JIT)
compilation. So there is a need to be able to remove the ability to
map a buffer as executable on a per-buffer basis.
Android is currently trying to migrate towards replacing its ashmem
driver usage with memfd. Part of the transition involved introducing a
library that serves to abstract away how shared memory regions are
allocated (i.e. ashmem vs memfd). This allows clients to use a single
interface for restricting how a buffer can be mapped without having to
worry about how it is handled for ashmem (through the ioctl
command mentioned earlier) or memfd (through file seals).
While memfd has support for preventing buffers from being mapped as
writable beyond a certain point in time (thanks to
F_SEAL_FUTURE_WRITE), it does not have a similar interface to prevent
buffers from being mapped as executable beyond a certain point.
However, that could be implemented as a file seal (F_SEAL_FUTURE_EXEC)
which works similarly to F_SEAL_FUTURE_WRITE.
F_SEAL_FUTURE_WRITE was chosen as a template for how this new seal
should behave, instead of F_SEAL_WRITE, for the following reasons:
1. Having the new seal behave like F_SEAL_FUTURE_WRITE matches the
behavior that was present with ashmem. This aids in seamlessly
transitioning clients away from ashmem to memfd.
2. Making the new seal behave like F_SEAL_WRITE would mean that no
mappings that could become executable in the future (i.e. via
mprotect()) can exist when the seal is applied. However, there are
known cases (e.g. CursorWindow [2]) where restrictions are applied
on how a buffer can be mapped after a mapping has already been made.
That mapping may have VM_MAYEXEC set, which would not allow the seal
to be applied successfully.
Therefore, the F_SEAL_FUTURE_EXEC seal was designed to have the same
semantics as F_SEAL_FUTURE_WRITE.
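The difference between the two candidate semantics can be illustrated with a toy model. This is a user-space sketch, not kernel code: ToyMemfd and its methods are hypothetical, and "may_exec" only loosely stands in for VM_MAYEXEC.

```python
class ToyMemfd:
    """User-space toy model of a sealable buffer; not kernel code."""

    def __init__(self) -> None:
        self.future_exec_sealed = False
        self.mappings = []  # each entry: {"exec": bool, "may_exec": bool}

    def mmap(self, executable: bool = False) -> dict:
        if executable and self.future_exec_sealed:
            raise PermissionError("executable mapping denied by seal")
        # Mappings created after the seal cannot become executable later.
        mapping = {"exec": executable, "may_exec": not self.future_exec_sealed}
        self.mappings.append(mapping)
        return mapping

    def seal_strict(self) -> bool:
        """F_SEAL_WRITE-like rule: refuse if any mapping may become executable."""
        if any(m["may_exec"] for m in self.mappings):
            return False
        self.future_exec_sealed = True
        return True

    def seal_future(self) -> bool:
        """F_SEAL_FUTURE_WRITE-like rule: always applies, affects later mmap()."""
        self.future_exec_sealed = True
        return True


buf = ToyMemfd()
buf.mmap()                 # a mapping made before any restriction is applied
print(buf.seal_strict())   # the strict rule cannot be applied any more
print(buf.seal_future())   # the future-style rule still can
```

In this model the pre-existing mapping is left untouched by seal_future(), which corresponds to the CursorWindow-style case described in reason 2.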
Note: this series depends on Lorenzo's work [3] which allows for a
memfd's file seals to be read in do_mmap().
[1] https://cs.android.com/android/kernel/superproject/+/common-android-mainlin…
[2] https://developer.android.com/reference/android/database/CursorWindow
[3] https://lore.kernel.org/all/cover.1732804776.git.lorenzo.stoakes@oracle.com/
Isaac J. Manjarres (2):
mm/memfd: Add support for F_SEAL_FUTURE_EXEC to memfd
selftests/memfd: Add tests for F_SEAL_FUTURE_EXEC
include/linux/mm.h | 5 ++
include/uapi/linux/fcntl.h | 1 +
mm/memfd.c | 1 +
mm/mmap.c | 11 +++
tools/testing/selftests/memfd/memfd_test.c | 79 ++++++++++++++++++++++
5 files changed, 97 insertions(+)
--
2.47.0.338.g60cca15819-goog
Last week, Jakub reported [1] that the MPTCP Connect selftest was
unstable. It looked like it started after the introduction of some fixes
[2]. After analysis from Paolo, these patches revealed existing bugs,
that should be fixed by the following patches.
- Patch 1: Make sure ACKs are sent when the MPTCP-level window re-opens. In
some corner cases, the other peer was not notified when more data
could be sent. A fix for v5.11, but depending on a feature introduced
in v5.19.
- Patch 2: Fix spurious wake-ups under memory pressure. In this
situation, userspace could be invited to read data that is not there
yet. A fix for v6.7.
- Patch 3: Fix a false positive error when running the MPTCP Connect
selftest with the "disconnect" cases. Userspace could disconnect
the socket too soon, which would reset (MP_FASTCLOSE) the connection,
interpreted as an error by the test. A fix for v5.17.
Link: https://lore.kernel.org/20250107131845.5e5de3c5@kernel.org [1]
Link: https://lore.kernel.org/20241230-net-mptcp-rbuf-fixes-v1-0-8608af434ceb@ker… [2]
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Paolo Abeni (3):
mptcp: be sure to send ack when mptcp-level window re-opens
mptcp: fix spurious wake-up on under memory pressure
selftests: mptcp: avoid spurious errors on disconnect
net/mptcp/options.c | 6 ++--
net/mptcp/protocol.h | 9 +++--
tools/testing/selftests/net/mptcp/mptcp_connect.c | 43 +++++++++++++++++------
3 files changed, 43 insertions(+), 15 deletions(-)
---
base-commit: 76201b5979768500bca362871db66d77cb4c225e
change-id: 20250113-net-mptcp-connect-st-flakes-4af6389808de
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
The orig_a0 is missing in struct user_regs_struct of riscv, and there is
no way to add it without breaking UAPI. (See Link tag below)
As is done for NT_ARM_SYSTEM_CALL, add a new regset named NT_RISCV_ORIG_A0 to
access the original a0 register from userspace via the ptrace API.
Link: https://lore.kernel.org/all/59505464-c84a-403d-972f-d4b2055eeaac@gmail.com/
Signed-off-by: Celeste Liu <uwu(a)coelacanthus.name>
---
Changes in v4:
- Fix a copy-paste error in selftest. (Forgot to commit...)
- Link to v3: https://lore.kernel.org/r/20241226-riscv-new-regset-v3-0-f5b96465826b@coela…
Changes in v3:
- Use return 0 directly for readability.
- Fix test for modify a0.
- Add Fixes: tag
- Remove useless Cc: stable.
- Selftest will check both a0 and orig_a0, but depends on the
correctness of PTRACE_GET_SYSCALL_INFO.
- Link to v2: https://lore.kernel.org/r/20241203-riscv-new-regset-v2-0-d37da8c0cba6@coela…
Changes in v2:
- Fix integer width.
- Add selftest.
- Link to v1: https://lore.kernel.org/r/20241201-riscv-new-regset-v1-1-c83c58abcc7b@coela…
---
Celeste Liu (2):
riscv/ptrace: add new regset to access original a0 register
riscv: selftests: Add a ptrace test to verify syscall parameter modification
arch/riscv/kernel/ptrace.c | 32 ++++++
include/uapi/linux/elf.h | 1 +
tools/testing/selftests/riscv/abi/.gitignore | 1 +
tools/testing/selftests/riscv/abi/Makefile | 5 +-
tools/testing/selftests/riscv/abi/ptrace.c | 151 +++++++++++++++++++++++++++
5 files changed, 189 insertions(+), 1 deletion(-)
---
base-commit: 0e287d31b62bb53ad81d5e59778384a40f8b6f56
change-id: 20241201-riscv-new-regset-d529b952ad0d
Best regards,
--
Celeste Liu <uwu(a)coelacanthus.name>
Fix several issues in the mptcp connect test's main_loop function.
- Fix a bug where the wrong file descriptor was being checked for errors
- Fix the input file descriptor lifecycle in the reconnection loop to
prevent use of invalid fd
- Add proper resource cleanup in error paths
Cong Liu (3):
selftests: mptcp: Fix incorrect file descriptor check in main_loop
selftests: mptcp: Fix input fd lifecycle in reconnection loop
selftests: mptcp: Clean up resources properly in main_loop
.../selftests/net/mptcp/mptcp_connect.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
base-commit: 2b88851f583d3c4e40bcd40cfe1965241ec229dd
--
2.43.0
When working on OpenRISC support for restartable sequences I noticed
and fixed these two issues with the riscv support bits.
1. The 'inc' argument to RSEQ_ASM_OP_R_DEREF_ADDV was being implicitly
passed to the macro. Fix this by adding 'inc' to the list of macro
arguments.
2. The inline asm input constraints for 'inc' and 'off' use "er". The
riscv gcc port does not have an "e" constraint; this looks to have been
copied from the x86 port. Fix this by just using an "r" constraint.
I have only compile tested this for riscv. However, I use the same fixes
in the OpenRISC rseq selftests and everything passes with no issues.
Signed-off-by: Stafford Horne <shorne(a)gmail.com>
---
tools/testing/selftests/rseq/rseq-riscv-bits.h | 6 +++---
tools/testing/selftests/rseq/rseq-riscv.h | 2 +-
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/rseq/rseq-riscv-bits.h b/tools/testing/selftests/rseq/rseq-riscv-bits.h
index de31a0143139..f02f411d550d 100644
--- a/tools/testing/selftests/rseq/rseq-riscv-bits.h
+++ b/tools/testing/selftests/rseq/rseq-riscv-bits.h
@@ -243,7 +243,7 @@ int RSEQ_TEMPLATE_IDENTIFIER(rseq_offset_deref_addv)(intptr_t *ptr, off_t off, i
#ifdef RSEQ_COMPARE_TWICE
RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, "%l[error1]")
#endif
- RSEQ_ASM_OP_R_DEREF_ADDV(ptr, off, 3)
+ RSEQ_ASM_OP_R_DEREF_ADDV(ptr, off, inc, 3)
RSEQ_INJECT_ASM(4)
RSEQ_ASM_DEFINE_ABORT(4, abort)
: /* gcc asm goto does not allow outputs */
@@ -251,8 +251,8 @@ int RSEQ_TEMPLATE_IDENTIFIER(rseq_offset_deref_addv)(intptr_t *ptr, off_t off, i
[current_cpu_id] "m" (rseq_get_abi()->RSEQ_TEMPLATE_CPU_ID_FIELD),
[rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
[ptr] "r" (ptr),
- [off] "er" (off),
- [inc] "er" (inc)
+ [off] "r" (off),
+ [inc] "r" (inc)
RSEQ_INJECT_INPUT
: "memory", RSEQ_ASM_TMP_REG_1
RSEQ_INJECT_CLOBBER
diff --git a/tools/testing/selftests/rseq/rseq-riscv.h b/tools/testing/selftests/rseq/rseq-riscv.h
index 37e598d0a365..67d544aaa9a3 100644
--- a/tools/testing/selftests/rseq/rseq-riscv.h
+++ b/tools/testing/selftests/rseq/rseq-riscv.h
@@ -158,7 +158,7 @@ do { \
"bnez " RSEQ_ASM_TMP_REG_1 ", 222b\n" \
"333:\n"
-#define RSEQ_ASM_OP_R_DEREF_ADDV(ptr, off, post_commit_label) \
+#define RSEQ_ASM_OP_R_DEREF_ADDV(ptr, off, inc, post_commit_label) \
"mv " RSEQ_ASM_TMP_REG_1 ", %[" __rseq_str(ptr) "]\n" \
RSEQ_ASM_OP_R_ADD(off) \
REG_L RSEQ_ASM_TMP_REG_1 ", 0(" RSEQ_ASM_TMP_REG_1 ")\n" \
--
2.47.0
The selftest started failing since commit e93d2521b27f
("x86/vdso: Split virtual clock pages into dedicated mapping")
was merged. While debugging I stumbled upon some memory usage
optimizations.
With these, the test now runs on a VM with only 60MiB of memory.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh(a)linutronix.de>
---
Changes in v4:
- Pick up review tags
- Correct Fixes: of patch 1
- Drop git rebase commit message artifacts
- Replace strtok_r() with strspn() and strcspn()
- Avoid uninitialized read on error in __get_smap_entry()
- Link to v3: https://lore.kernel.org/r/20250113-virtual_address_range-tests-v3-0-f4a8e6b…
Changes in v3:
- Pick up review tags
- Fix naming around PR_SET_VMA_ANON_NAME helper functions
- Skip selftest if PR_SET_VMA_ANON_NAME is not supported
- Check for VM_IO instead of [vvar name prefix
- Link to v2: https://lore.kernel.org/r/20250110-virtual_address_range-tests-v2-0-262a2bf…
Changes in v2:
- Drop /dev/null usage
- Avoid overcommit restrictions by dropping PROT_WRITE
- Avoid high memory usage due to PTEs
- Link to v1: https://lore.kernel.org/r/20250107-virtual_address_range-tests-v1-0-3834a2f…
---
Thomas Weißschuh (4):
selftests/mm: virtual_address_range: mmap() without PROT_WRITE
selftests/mm: virtual_address_range: Unmap chunks after validation
selftests/mm: vm_util: Split up /proc/self/smaps parsing
selftests/mm: virtual_address_range: Avoid reading from VM_IO mappings
tools/testing/selftests/mm/config | 1 +
tools/testing/selftests/mm/virtual_address_range.c | 41 ++++++++++++--
tools/testing/selftests/mm/vm_util.c | 66 +++++++++++++++++-----
tools/testing/selftests/mm/vm_util.h | 1 +
4 files changed, 92 insertions(+), 17 deletions(-)
---
base-commit: 3043cb9a517b707c12a3f5879f4970c97bfeb3fb
change-id: 20250107-virtual_address_range-tests-95843766fa97
Best regards,
--
Thomas Weißschuh <thomas.weissschuh(a)linutronix.de>
Currently the rseq constructor, rseq_init(), assumes that glibc always
provides the rseq symbols (__rseq_size, for instance). However,
glibc only supports rseq from version 2.35 onwards. As a result, on
systems that run a glibc older than 2.35, the global rseq_size remains
initialized to -1U. When a thread then tries to register for rseq,
get_rseq_min_alloc_size() ends up returning -1U, which is
incorrect. Hence, initialize rseq_size for the cases where glibc doesn't
provide the rseq symbols.
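The effect of the fix can be modelled in a few lines. This is a sketch only: the value 32 for ORIG_RSEQ_ALLOC_SIZE is an assumption, rseq_size_without_glibc() is a hypothetical helper, and the model covers just the path where glibc provides no rseq symbols.

```python
UINT_MAX = 0xFFFFFFFF       # what -1U looks like in a 32-bit unsigned int
ORIG_RSEQ_ALLOC_SIZE = 32   # assumed value of the selftest constant


def default_rseq_size(kernel_feature_size: int) -> int:
    """Same rule as set_default_rseq_size(): cap at the original alloc size."""
    return min(kernel_feature_size, ORIG_RSEQ_ALLOC_SIZE)


def rseq_size_without_glibc(kernel_feature_size: int, patched: bool) -> int:
    """rseq_size after init when glibc provides no rseq symbols."""
    rseq_size = UINT_MAX  # the global starts out as -1U
    if patched:
        # With the patch, the fallback path also picks a sane default.
        rseq_size = default_rseq_size(kernel_feature_size)
    return rseq_size


print(hex(rseq_size_without_glibc(20, patched=False)))
print(rseq_size_without_glibc(20, patched=True))
```

The unpatched result is the sentinel itself, which is what a later size query would then hand back to the caller.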
Cc: stable(a)vger.kernel.org
Fixes: 73a4f5a704a2 ("selftests/rseq: Fix mm_cid test failure")
Signed-off-by: Raghavendra Rao Ananta <rananta(a)google.com>
---
tools/testing/selftests/rseq/rseq.c | 19 +++++++++++++------
1 file changed, 13 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/rseq/rseq.c b/tools/testing/selftests/rseq/rseq.c
index 5b9772cdf265..9eb5356f25fa 100644
--- a/tools/testing/selftests/rseq/rseq.c
+++ b/tools/testing/selftests/rseq/rseq.c
@@ -142,6 +142,16 @@ unsigned int get_rseq_kernel_feature_size(void)
return ORIG_RSEQ_FEATURE_SIZE;
}
+static void set_default_rseq_size(void)
+{
+ unsigned int rseq_kernel_feature_size = get_rseq_kernel_feature_size();
+
+ if (rseq_kernel_feature_size < ORIG_RSEQ_ALLOC_SIZE)
+ rseq_size = rseq_kernel_feature_size;
+ else
+ rseq_size = ORIG_RSEQ_ALLOC_SIZE;
+}
+
int rseq_register_current_thread(void)
{
int rc;
@@ -219,12 +229,7 @@ void rseq_init(void)
fallthrough;
case ORIG_RSEQ_ALLOC_SIZE:
{
- unsigned int rseq_kernel_feature_size = get_rseq_kernel_feature_size();
-
- if (rseq_kernel_feature_size < ORIG_RSEQ_ALLOC_SIZE)
- rseq_size = rseq_kernel_feature_size;
- else
- rseq_size = ORIG_RSEQ_ALLOC_SIZE;
+ set_default_rseq_size();
break;
}
default:
@@ -239,8 +244,10 @@ void rseq_init(void)
rseq_size = 0;
return;
}
+
rseq_offset = (void *)&__rseq_abi - rseq_thread_pointer();
rseq_flags = 0;
+ set_default_rseq_size();
}
static __attribute__((destructor))
base-commit: 40384c840ea1944d7c5a392e8975ed088ecf0b37
--
2.47.0.338.g60cca15819-goog
The selftest started failing since commit e93d2521b27f
("x86/vdso: Split virtual clock pages into dedicated mapping")
was merged. While debugging I stumbled upon some memory usage
optimizations.
With these, the test now runs on a VM with only 60MiB of memory.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh(a)linutronix.de>
---
Changes in v3:
- Pick up review tags
- Fix naming around PR_SET_VMA_ANON_NAME helper functions
- Skip selftest if PR_SET_VMA_ANON_NAME is not supported
- Check for VM_IO instead of [vvar name prefix
- Link to v2: https://lore.kernel.org/r/20250110-virtual_address_range-tests-v2-0-262a2bf…
Changes in v2:
- Drop /dev/null usage
- Avoid overcommit restrictions by dropping PROT_WRITE
- Avoid high memory usage due to PTEs
- Link to v1: https://lore.kernel.org/r/20250107-virtual_address_range-tests-v1-0-3834a2f…
---
Thomas Weißschuh (4):
selftests/mm: virtual_address_range: mmap() without PROT_WRITE
selftests/mm: virtual_address_range: Unmap chunks after validation
selftests/mm: vm_util: Split up /proc/self/smaps parsing
selftests/mm: virtual_address_range: Avoid reading from VM_IO mappings
tools/testing/selftests/mm/config | 1 +
tools/testing/selftests/mm/virtual_address_range.c | 41 ++++++++++++--
tools/testing/selftests/mm/vm_util.c | 63 +++++++++++++++++-----
tools/testing/selftests/mm/vm_util.h | 1 +
4 files changed, 89 insertions(+), 17 deletions(-)
---
base-commit: 7793bee8fed2027eb15219014de6fb0dc15d4a03
change-id: 20250107-virtual_address_range-tests-95843766fa97
Best regards,
--
Thomas Weißschuh <thomas.weissschuh(a)linutronix.de>
v1/v2:
There is only the first patch: RISC-V: Enable cbo.clean/flush in usermode,
which mainly removes the enabling of cbo.inval in user mode.
v3:
Expose Zicbom via hwprobe and add selftests for Zicbom.
v4:
Modify the order of macros. The test_no_cbo_inval function is added
separately.
Yunhui Cui (3):
RISC-V: Enable cbo.clean/flush in usermode
RISC-V: hwprobe: Expose Zicbom extension and its block size
RISC-V: selftests: Add TEST_ZICBOM into CBO tests
Documentation/arch/riscv/hwprobe.rst | 6 ++
arch/riscv/include/asm/hwprobe.h | 2 +-
arch/riscv/include/uapi/asm/hwprobe.h | 2 +
arch/riscv/kernel/cpufeature.c | 8 +++
arch/riscv/kernel/sys_hwprobe.c | 6 ++
tools/testing/selftests/riscv/hwprobe/cbo.c | 66 +++++++++++++++++----
6 files changed, 78 insertions(+), 12 deletions(-)
--
2.39.2
From: "Mike Rapoport (Microsoft)" <rppt(a)kernel.org>
Hi,
Following Peter's comments [1] these patches rework handling of ROX caches
for module text allocations.
Instead of using a writable copy that really complicates alternatives
patching, temporarily remap parts of a large ROX page as RW for the time of
module formation and then restore its ROX protections when the module is
ready.
To keep the ROX memory mapped with large pages, make set_memory code
capable of restoring large pages (more details are in patch 3).
The patches are also available in git:
https://git.kernel.org/rppt/h/execmem/x86-rox/v8
[1] https://lore.kernel.org/all/20241209083818.GK8562@noisy.programming.kicks-a…
Kirill A. Shutemov (1):
x86/mm/pat: Restore large pages after fragmentation
Mike Rapoport (Microsoft) (7):
x86/mm/pat: cpa-test: fix length for CPA_ARRAY test
x86/mm/pat: drop duplicate variable in cpa_flush()
execmem: add API for temporal remapping as RW and restoring ROX
afterwards
module: introduce MODULE_STATE_GONE
modules: switch to execmem API for remapping as RW and restoring ROX
Revert "x86/module: prepare module loading for ROX allocations of
text"
module: drop unused module_writable_address()
arch/um/kernel/um_arch.c | 11 +-
arch/x86/entry/vdso/vma.c | 3 +-
arch/x86/include/asm/alternative.h | 14 +-
arch/x86/include/asm/pgtable_types.h | 2 +
arch/x86/kernel/alternative.c | 181 ++++++---------
arch/x86/kernel/ftrace.c | 30 ++-
arch/x86/kernel/module.c | 45 ++--
arch/x86/mm/pat/cpa-test.c | 2 +-
arch/x86/mm/pat/set_memory.c | 216 +++++++++++++++++-
include/linux/execmem.h | 31 +++
include/linux/module.h | 21 +-
include/linux/moduleloader.h | 4 -
include/linux/vm_event_item.h | 2 +
kernel/module/kallsyms.c | 8 +-
kernel/module/kdb.c | 2 +-
kernel/module/main.c | 86 ++-----
kernel/module/procfs.c | 2 +-
kernel/module/strict_rwx.c | 9 +-
kernel/tracepoint.c | 2 +
lib/kunit/test.c | 2 +
mm/execmem.c | 118 ++++++++--
mm/vmstat.c | 2 +
samples/livepatch/livepatch-callbacks-demo.c | 1 +
.../test_modules/test_klp_callbacks_demo.c | 1 +
.../test_modules/test_klp_callbacks_demo2.c | 1 +
.../livepatch/test_modules/test_klp_state.c | 1 +
.../livepatch/test_modules/test_klp_state2.c | 1 +
27 files changed, 511 insertions(+), 287 deletions(-)
--
2.45.2
This patch series includes some netns-related improvements and fixes for
rtnetlink, to make link creation more intuitive:
1) Creating a link in another net namespace doesn't conflict with link
names in the current one.
2) Refactor rtnetlink link creation. Create the link in the target namespace
directly.
So that
# ip link add netns ns1 link-netns ns2 tun0 type gre ...
will create tun0 in ns1, rather than creating it in ns2 and moving it to ns1,
and it won't conflict with another interface named "tun0" in the current netns.
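The name-conflict part of this can be modelled with plain sets. The sketch below is a simplification under stated assumptions: namespaces are dictionaries of interface names, create_link() is a hypothetical helper, and the lookup_in_target flag only approximates the difference between checking the caller's netns and checking the target netns.

```python
from typing import Dict, Set

Netns = Dict[str, Set[str]]  # netns name -> interface names it contains


def name_conflicts(namespaces: Netns, lookup_ns: str, ifname: str) -> bool:
    """True when ifname already exists in the namespace being checked."""
    return ifname in namespaces.get(lookup_ns, set())


def create_link(namespaces: Netns, current_ns: str, target_ns: str,
                ifname: str, lookup_in_target: bool = True) -> bool:
    """Create ifname in target_ns; return False on a name conflict.

    lookup_in_target=False models checking the caller's netns instead.
    """
    lookup_ns = target_ns if lookup_in_target else current_ns
    if name_conflicts(namespaces, lookup_ns, ifname):
        return False
    namespaces.setdefault(target_ns, set()).add(ifname)
    return True


ns: Netns = {"init": {"tun0"}, "ns1": set(), "ns2": set()}
# Checking the caller's netns reports a conflict that does not matter.
print(create_link(ns, "init", "ns1", "tun0", lookup_in_target=False))
# Checking the target netns lets the creation go through.
print(create_link(ns, "init", "ns1", "tun0"))
```

A second creation of "tun0" in ns1 would still fail in this model, since the name is then taken in the namespace where the link actually lives.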
Patch 01 serves 1) by avoiding link name conflicts across different netns.
To achieve 2), there are mainly 3 steps:
- Patch 02 packs newlink() parameters into a struct, including
the original "src_net" along with more netns context. No semantic
changes are introduced.
- Patches 03 ~ 07 convert device drivers to use the explicit netns
extracted from params.
- Patches 08 ~ 09 remove the old netns parameter, and convert
rtnetlink to create the device in the target netns directly.
Patches 10 ~ 11 add some tests for link name and link netns.
BTW please note there're some issues found in current code:
- In amt_newlink() in drivers/net/amt.c:
amt->net = net;
...
amt->stream_dev = dev_get_by_index(net, ...
Uses net, but amt_lookup_upper_dev() only searches in dev_net.
So the AMT device may not be properly deleted if it's in a different
netns from lower dev.
- In gtp_newlink() in drivers/net/gtp.c:
gtp->net = src_net;
...
gn = net_generic(dev_net(dev), gtp_net_id);
list_add_rcu(&gtp->list, &gn->gtp_dev_list);
Uses src_net, but priv is linked to list in dev_net. So it may not be
properly deleted on removal of link netns.
- In pfcp_newlink() in drivers/net/pfcp.c:
pfcp->net = net;
...
pn = net_generic(dev_net(dev), pfcp_net_id);
list_add_rcu(&pfcp->list, &pn->pfcp_dev_list);
Same as above.
- In lowpan_newlink() in net/ieee802154/6lowpan/core.c:
wdev = dev_get_by_index(dev_net(ldev), nla_get_u32(tb[IFLA_LINK]));
Looks for IFLA_LINK in dev_net, but in theory the ifindex is defined
in link netns.
Kuniyuki has a patchset to address the issues of gtp and pfcp:
https://lore.kernel.org/netdev/20250110014754.33847-1-kuniyu@amazon.com/
---
v8:
- Move dev and ext_ack out from param struct.
- Validate link_net and dev_net are identical for 6lowpan.
v7:
link: https://lore.kernel.org/all/20250104125732.17335-1-shaw.leon@gmail.com/
- Add selftest kconfig.
- Remove a duplicated test of ip6gre.
v6:
link: https://lore.kernel.org/all/20241218130909.2173-1-shaw.leon@gmail.com/
- Split prototype, driver and rtnetlink changes.
- Add more tests for link netns.
- Fix IPv6 tunnel net overwritten in ndo_init().
- Reorder variable declarations.
- Exclude a ip_tunnel-specific patch.
v5:
link: https://lore.kernel.org/all/20241209140151.231257-1-shaw.leon@gmail.com/
- Fix function doc in batman-adv.
- Include peer_net in rtnl newlink parameters.
v4:
link: https://lore.kernel.org/all/20241118143244.1773-1-shaw.leon@gmail.com/
- Pack newlink() parameters to a single struct.
- Use ynl async_msg_queue.empty() in selftest.
v3:
link: https://lore.kernel.org/all/20241113125715.150201-1-shaw.leon@gmail.com/
- Drop "netns_atomic" flag and module parameter. Add netns parameter to
newlink() instead, and convert drivers accordingly.
- Move python NetNSEnter helper to net selftest lib.
v2:
link: https://lore.kernel.org/all/20241107133004.7469-1-shaw.leon@gmail.com/
- Check NLM_F_EXCL to ensure only link creation is affected.
- Add self tests for link name/ifindex conflict and notifications
in different netns.
- Changes in dummy driver and ynl in order to add the test case.
v1:
link: https://lore.kernel.org/all/20241023023146.372653-1-shaw.leon@gmail.com/
Xiao Liang (11):
rtnetlink: Lookup device in target netns when creating link
rtnetlink: Pack newlink() params into struct
net: Use link netns in newlink() of rtnl_link_ops
ieee802154: 6lowpan: Validate link netns in newlink() of rtnl_link_ops
net: ip_tunnel: Use link netns in newlink() of rtnl_link_ops
net: ipv6: Use link netns in newlink() of rtnl_link_ops
net: xfrm: Use link netns in newlink() of rtnl_link_ops
rtnetlink: Remove "net" from newlink params
rtnetlink: Create link directly in target net namespace
selftests: net: Add python context manager for netns entering
selftests: net: Add test cases for link and peer netns
drivers/infiniband/ulp/ipoib/ipoib_netlink.c | 9 +-
drivers/net/amt.c | 11 +-
drivers/net/bareudp.c | 9 +-
drivers/net/bonding/bond_netlink.c | 6 +-
drivers/net/can/dev/netlink.c | 4 +-
drivers/net/can/vxcan.c | 7 +-
.../ethernet/qualcomm/rmnet/rmnet_config.c | 9 +-
drivers/net/geneve.c | 9 +-
drivers/net/gtp.c | 8 +-
drivers/net/ipvlan/ipvlan.h | 3 +-
drivers/net/ipvlan/ipvlan_main.c | 8 +-
drivers/net/ipvlan/ipvtap.c | 6 +-
drivers/net/macsec.c | 9 +-
drivers/net/macvlan.c | 7 +-
drivers/net/macvtap.c | 7 +-
drivers/net/netkit.c | 7 +-
drivers/net/pfcp.c | 7 +-
drivers/net/ppp/ppp_generic.c | 9 +-
drivers/net/team/team_core.c | 6 +-
drivers/net/veth.c | 7 +-
drivers/net/vrf.c | 5 +-
drivers/net/vxlan/vxlan_core.c | 9 +-
drivers/net/wireguard/device.c | 7 +-
drivers/net/wireless/virtual/virt_wifi.c | 8 +-
drivers/net/wwan/wwan_core.c | 16 +-
include/net/ip_tunnels.h | 5 +-
include/net/rtnetlink.h | 40 ++++-
net/8021q/vlan_netlink.c | 9 +-
net/batman-adv/soft-interface.c | 9 +-
net/bridge/br_netlink.c | 6 +-
net/caif/chnl_net.c | 5 +-
net/core/rtnetlink.c | 33 ++--
net/hsr/hsr_netlink.c | 12 +-
net/ieee802154/6lowpan/core.c | 7 +-
net/ipv4/ip_gre.c | 24 ++-
net/ipv4/ip_tunnel.c | 10 +-
net/ipv4/ip_vti.c | 9 +-
net/ipv4/ipip.c | 9 +-
net/ipv6/ip6_gre.c | 30 ++--
net/ipv6/ip6_tunnel.c | 19 ++-
net/ipv6/ip6_vti.c | 15 +-
net/ipv6/sit.c | 17 ++-
net/xfrm/xfrm_interface_core.c | 15 +-
tools/testing/selftests/net/Makefile | 1 +
tools/testing/selftests/net/config | 5 +
.../testing/selftests/net/lib/py/__init__.py | 2 +-
tools/testing/selftests/net/lib/py/netns.py | 18 +++
tools/testing/selftests/net/link_netns.py | 141 ++++++++++++++++++
tools/testing/selftests/net/netns-name.sh | 10 ++
49 files changed, 479 insertions(+), 165 deletions(-)
create mode 100755 tools/testing/selftests/net/link_netns.py
--
2.47.1
This series implements feature detection of hardware virtualization on
Linux and macOS; the latter being my primary use case.
This yields approximately a 6x improvement using HVF on M3 Pro.
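The two ideas named in the patch titles can be sketched as follows. This is an illustration rather than the code from the series: usable_cpu_count() and accel_args() are hypothetical helper names, and the accelerator list in the usage lines is an assumption. The sketch relies on os.sched_getaffinity being absent on macOS and on QEMU trying repeated -accel options in the order given.

```python
import os
from typing import List


def usable_cpu_count() -> int:
    """CPUs available to this process, with a portable fallback."""
    try:
        # Linux-only API; the attribute is missing on macOS.
        return len(os.sched_getaffinity(0))
    except AttributeError:
        return os.cpu_count() or 1


def accel_args(accelerators: List[str]) -> List[str]:
    """Build repeated -accel options; QEMU tries them in the order given."""
    args: List[str] = []
    for name in accelerators:
        args += ["-accel", name]
    return args


print(usable_cpu_count())
print(accel_args(["kvm", "hvf", "tcg"]))
```

Listing a software accelerator last keeps the command usable on hosts where no hardware virtualization is available.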
Signed-off-by: Tamir Duberstein <tamird(a)gmail.com>
---
Changes in v2:
- Use QEMU accelerator fallback (Alyssa Ross, Thomas Weißschuh).
- Link to v1: https://lore.kernel.org/r/20241025-kunit-qemu-accel-macos-v1-0-2f30c26192d4…
---
Tamir Duberstein (2):
kunit: add fallback for os.sched_getaffinity
kunit: enable hardware acceleration when available
tools/testing/kunit/kunit.py | 11 ++++++++++-
tools/testing/kunit/kunit_kernel.py | 3 +++
tools/testing/kunit/qemu_configs/arm64.py | 2 +-
3 files changed, 14 insertions(+), 2 deletions(-)
---
base-commit: 81983758430957d9a5cb3333fe324fd70cf63e7e
change-id: 20241025-kunit-qemu-accel-macos-2840e4c2def5
Best regards,
--
Tamir Duberstein <tamird(a)gmail.com>
This patch series extends the sev_init2 and the sev_smoke test to
exercise the SEV-SNP VM launch workflow.
Primarily, it introduces the architectural defines, its support in the SEV
library and extends the tests to interact with the SEV-SNP ioctl()
wrappers.
Patch 1 - Do not advertise SNP on incompatible firmware
Patch 2 - SNP test for KVM_SEV_INIT2
Patch 3 - Add VMGEXIT helper
Patch 4 - Introduce SEV+ VM type check
Patch 5 - SNP ioctl() plumbing for the SEV library
Patch 6 - Force set GUEST_MEMFD for SNP
Patch 7 - Cleanups of smoke test - Decouple policy from type
Patch 8 - SNP smoke test
v4:
1. Remove SNP FW API version check in the test and ensure the KVM
capability advertises the presence of the feature. Retain the minimum
version definitions to exercise these API versions in the smoke test.
2. Retained only the SNP smoke test and SNP_INIT2 test
3. The SNP architectural defines merged with SNP_INIT2 test patch
4. SNP shutdown merged with SNP smoke test patch
5. Add SEV VM type check to abstract comparisons and reduce clutter
6. Define a SNP default policy which sets bits based on the presence of
SMT
7. Decouple privatization and encryption for it to be SNP agnostic
8. Assert for only positive tests using vm_ioctl()
9. Dropped tested-by tags
In summary - based on comments from Sean, I have primarily reduced the
scope of this patch series to focus on breaking down the SNP smoke test
patch (v3 - patch2) to first introduce SEV-SNP support and use this
interface to extend the sev_init2 and the sev_smoke test.
The rest of the v3 patchset, which introduces ioctl, prefault, fallocate
and negative tests, will be reworked and reintroduced in a future patch
series after addressing the issues discussed.
v3:
https://lore.kernel.org/kvm/20240905124107.6954-1-pratikrajesh.sampat@amd.c…
1. Remove the assignments for the prefault and fallocate test type
enums.
2. Fix error message for sev launch measure and finish.
3. Collect tested-by tags [Peter, Srikanth]
Any feedback/review is highly appreciated!
Pratik R. Sampat (8):
KVM: SEV: Disable SEV-SNP on FW validation failure
KVM: selftests: SEV-SNP test for KVM_SEV_INIT2
KVM: selftests: Add VMGEXIT helper
KVM: selftests: Introduce SEV VM type check
KVM: selftests: Add library support for interacting with SNP
KVM: selftests: Force GUEST_MEMFD flag for SNP VM type
KVM: selftests: Abstractions for SEV to decouple policy from type
KVM: selftests: Add a basic SEV-SNP smoke test
arch/x86/kvm/svm/sev.c | 4 +-
drivers/crypto/ccp/sev-dev.c | 6 ++
include/linux/psp-sev.h | 3 +
.../selftests/kvm/include/x86_64/processor.h | 1 +
.../selftests/kvm/include/x86_64/sev.h | 55 ++++++++++-
tools/testing/selftests/kvm/lib/kvm_util.c | 7 +-
.../selftests/kvm/lib/x86_64/processor.c | 4 +-
tools/testing/selftests/kvm/lib/x86_64/sev.c | 98 ++++++++++++++++++-
.../selftests/kvm/x86_64/sev_init2_tests.c | 13 +++
.../selftests/kvm/x86_64/sev_smoke_test.c | 96 ++++++++++++++----
10 files changed, 258 insertions(+), 29 deletions(-)
--
2.43.0
The new option controls whether tests run on boot or module load. Now that
the debugfs "run" dentry allows running tests on demand, the ability to
disable automatic test runs is useful for intrusive tests.
The option is set to true by default to preserve the existing behavior. It
can be overridden by either the corresponding module option or the
corresponding config build option.
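The gating described above can be sketched in user space as follows. This is a minimal model, not the kernel code: `struct demo_suite` and `demo_suites_init()` are hypothetical stand-ins for `struct kunit_suite` and `__kunit_test_suites_init()`, and the `runs` counter only exists to make the effect observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kunit_suite; not the kernel type. */
struct demo_suite {
	const char *name;
	bool initialized;	/* set by the init step, always */
	int runs;		/* incremented only when tests are run */
};

/*
 * Models the patched loop: every suite is initialized, but tests are
 * executed only when the caller passes run_tests == true.
 */
int demo_suites_init(struct demo_suite *suites, size_t num_suites,
		     bool run_tests)
{
	size_t i;

	assert(suites != NULL || num_suites == 0);

	for (i = 0; i < num_suites; i++) {
		suites[i].initialized = true;
		if (run_tests)
			suites[i].runs++;
	}
	return 0;
}
```

In this model, the boot or module-load path would pass the value of the autorun parameter, while the debugfs path would always pass `true`, so on-demand runs keep working when autorun is disabled.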
Signed-off-by: Stanislav Kinsburskii <skinsburskii(a)linux.microsoft.com>
---
include/kunit/test.h | 4 +++-
lib/kunit/Kconfig | 12 ++++++++++++
lib/kunit/debugfs.c | 2 +-
lib/kunit/executor.c | 18 +++++++++++++++++-
lib/kunit/test.c | 6 ++++--
5 files changed, 37 insertions(+), 5 deletions(-)
diff --git a/include/kunit/test.h b/include/kunit/test.h
index 34b71e42fb10..58dbab60f853 100644
--- a/include/kunit/test.h
+++ b/include/kunit/test.h
@@ -312,6 +312,7 @@ static inline void kunit_set_failure(struct kunit *test)
}
bool kunit_enabled(void);
+bool kunit_autorun(void);
const char *kunit_action(void);
const char *kunit_filter_glob(void);
char *kunit_filter(void);
@@ -334,7 +335,8 @@ kunit_filter_suites(const struct kunit_suite_set *suite_set,
int *err);
void kunit_free_suite_set(struct kunit_suite_set suite_set);
-int __kunit_test_suites_init(struct kunit_suite * const * const suites, int num_suites);
+int __kunit_test_suites_init(struct kunit_suite * const * const suites, int num_suites,
+ bool run_tests);
void __kunit_test_suites_exit(struct kunit_suite **suites, int num_suites);
diff --git a/lib/kunit/Kconfig b/lib/kunit/Kconfig
index 34d7242d526d..a97897edd964 100644
--- a/lib/kunit/Kconfig
+++ b/lib/kunit/Kconfig
@@ -81,4 +81,16 @@ config KUNIT_DEFAULT_ENABLED
In most cases this should be left as Y. Only if additional opt-in
behavior is needed should this be set to N.
+config KUNIT_AUTORUN_ENABLED
+ bool "Default value of kunit.autorun"
+ default y
+ help
+ Sets the default value of kunit.autorun. If set to N then KUnit
+ tests will not run after initialization unless kunit.autorun=1 is
+ passed to the kernel command line. The test can still be run manually
+ via debugfs interface.
+
+ In most cases this should be left as Y. Only if additional opt-in
+ behavior is needed should this be set to N.
+
endif # KUNIT
diff --git a/lib/kunit/debugfs.c b/lib/kunit/debugfs.c
index d548750a325a..9df064f40d98 100644
--- a/lib/kunit/debugfs.c
+++ b/lib/kunit/debugfs.c
@@ -145,7 +145,7 @@ static ssize_t debugfs_run(struct file *file,
struct inode *f_inode = file->f_inode;
struct kunit_suite *suite = (struct kunit_suite *) f_inode->i_private;
- __kunit_test_suites_init(&suite, 1);
+ __kunit_test_suites_init(&suite, 1, true);
return count;
}
diff --git a/lib/kunit/executor.c b/lib/kunit/executor.c
index 34b7b6833df3..340723571b0f 100644
--- a/lib/kunit/executor.c
+++ b/lib/kunit/executor.c
@@ -29,6 +29,22 @@ const char *kunit_action(void)
return action_param;
}
+/*
+ * Run KUnit tests after initialization
+ */
+#ifdef CONFIG_KUNIT_AUTORUN_ENABLED
+static bool autorun_param = true;
+#else
+static bool autorun_param;
+#endif
+module_param_named(autorun, autorun_param, bool, 0);
+MODULE_PARM_DESC(autorun, "Run KUnit tests after initialization");
+
+bool kunit_autorun(void)
+{
+ return autorun_param;
+}
+
static char *filter_glob_param;
static char *filter_param;
static char *filter_action_param;
@@ -266,7 +282,7 @@ void kunit_exec_run_tests(struct kunit_suite_set *suite_set, bool builtin)
pr_info("1..%zu\n", num_suites);
}
- __kunit_test_suites_init(suite_set->start, num_suites);
+ __kunit_test_suites_init(suite_set->start, num_suites, kunit_autorun());
}
void kunit_exec_list_tests(struct kunit_suite_set *suite_set, bool include_attr)
diff --git a/lib/kunit/test.c b/lib/kunit/test.c
index 089c832e3cdb..146d1b48a096 100644
--- a/lib/kunit/test.c
+++ b/lib/kunit/test.c
@@ -708,7 +708,8 @@ bool kunit_enabled(void)
return enable_param;
}
-int __kunit_test_suites_init(struct kunit_suite * const * const suites, int num_suites)
+int __kunit_test_suites_init(struct kunit_suite * const * const suites, int num_suites,
+ bool run_tests)
{
unsigned int i;
@@ -731,7 +732,8 @@ int __kunit_test_suites_init(struct kunit_suite * const * const suites, int num_
for (i = 0; i < num_suites; i++) {
kunit_init_suite(suites[i]);
- kunit_run_tests(suites[i]);
+ if (run_tests)
+ kunit_run_tests(suites[i]);
}
static_branch_dec(&kunit_running);
PTRACE_SET_SYSCALL_INFO is a generic ptrace API that complements
PTRACE_GET_SYSCALL_INFO by letting the ptracer modify details of
system calls the tracee is blocked in.
This API allows ptracers to obtain and modify system call details
in a straightforward and architecture-agnostic way.
The current implementation supports changing only those bits of system call
information that are used by strace, namely, syscall number, syscall
arguments, and syscall return value.
Support for changing additional details returned by PTRACE_GET_SYSCALL_INFO,
such as instruction pointer and stack pointer, could be added later
if needed, by using struct ptrace_syscall_info.flags to specify
the additional details that should be set. Currently, flags and reserved
fields of struct ptrace_syscall_info must be initialized with zeroes;
arch, instruction_pointer, and stack_pointer fields are ignored.
PTRACE_SET_SYSCALL_INFO currently supports only PTRACE_SYSCALL_INFO_ENTRY,
PTRACE_SYSCALL_INFO_EXIT, and PTRACE_SYSCALL_INFO_SECCOMP operations.
Other operations could be added later if needed.
Ideally, PTRACE_SET_SYSCALL_INFO should have been introduced along with
PTRACE_GET_SYSCALL_INFO, but it didn't happen. The last straw that
convinced me to implement PTRACE_SET_SYSCALL_INFO was the apparent failure
to provide an API for changing the first system call argument on the riscv
architecture [1].
ptrace(2) man page:
long ptrace(enum __ptrace_request request, pid_t pid, void *addr, void *data);
...
PTRACE_SET_SYSCALL_INFO
Modify information about the system call that caused the stop.
The "data" argument is a pointer to struct ptrace_syscall_info
that specifies the system call information to be set.
The "addr" argument should be set to sizeof(struct ptrace_syscall_info).
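As a rough illustration of the zero-initialization rule above, the sketch below prepares a `struct ptrace_syscall_info` for a syscall-entry stop. It assumes UAPI headers that already provide the structure and `PTRACE_SYSCALL_INFO_ENTRY` (Linux 5.3 or later); `prepare_entry_info()` is a hypothetical helper, and the ptrace() call itself is left out because the new request is not defined in existing headers.

```c
#include <assert.h>
#include <string.h>
#include <linux/ptrace.h>

/*
 * Fill *info for a syscall-entry stop. Zeroing the whole structure first
 * keeps the flags and reserved fields at zero, whatever their exact names
 * are in the installed UAPI header. arch, instruction_pointer and
 * stack_pointer also stay zero; the cover letter says they are ignored.
 */
void prepare_entry_info(struct ptrace_syscall_info *info,
			unsigned long long nr,
			const unsigned long long args[6])
{
	int i;

	assert(info != NULL && args != NULL);

	memset(info, 0, sizeof(*info));
	info->op = PTRACE_SYSCALL_INFO_ENTRY;
	info->entry.nr = nr;
	for (i = 0; i < 6; i++)
		info->entry.args[i] = args[i];
}
```

Following the man page excerpt, a tracer on a kernel with this series would then pass `sizeof(info)` as "addr" and `&info` as "data" to the PTRACE_SET_SYSCALL_INFO request.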
[1] https://lore.kernel.org/all/59505464-c84a-403d-972f-d4b2055eeaac@gmail.com/
---
Notes:
v2:
* Add patch to fix syscall_set_return_value() on powerpc
* Add patch to fix mips_get_syscall_arg() on mips
* Merge two patches adding syscall_set_arguments() implementations
from different sources into a single patch
* Add syscall_set_return_value() implementation on hexagon
* Add syscall_set_return_value() invocation to syscall_set_nr()
on arm and arm64.
* Fix syscall_set_nr() and mips_set_syscall_arg() on mips
* Add a comment to syscall_set_nr() on arc, powerpc, s390, sh,
and sparc
* Remove redundant ptrace_syscall_info.op assignments in
ptrace_get_syscall_info_*
* Minor style tweaks in ptrace_get_syscall_info_op()
* Remove syscall_set_return_value() invocation from
ptrace_set_syscall_info_entry()
* Skip syscall_set_arguments() invocation in case of syscall number -1
in ptrace_set_syscall_info_entry()
* Split ptrace_syscall_info.reserved into ptrace_syscall_info.reserved
and ptrace_syscall_info.flags
* Use __kernel_ulong_t instead of unsigned long in set_syscall_info test
Dmitry V. Levin (7):
powerpc: properly negate error in syscall_set_return_value()
mips: fix mips_get_syscall_arg() for O32 and N32
syscall.h: add syscall_set_arguments() and syscall_set_return_value()
syscall.h: introduce syscall_set_nr()
ptrace_get_syscall_info: factor out ptrace_get_syscall_info_op
ptrace: introduce PTRACE_SET_SYSCALL_INFO request
selftests/ptrace: add a test case for PTRACE_SET_SYSCALL_INFO
arch/arc/include/asm/syscall.h | 25 +
arch/arm/include/asm/syscall.h | 37 ++
arch/arm64/include/asm/syscall.h | 29 ++
arch/csky/include/asm/syscall.h | 13 +
arch/hexagon/include/asm/syscall.h | 21 +
arch/loongarch/include/asm/syscall.h | 15 +
arch/m68k/include/asm/syscall.h | 7 +
arch/microblaze/include/asm/syscall.h | 7 +
arch/mips/include/asm/syscall.h | 72 ++-
arch/nios2/include/asm/syscall.h | 16 +
arch/openrisc/include/asm/syscall.h | 13 +
arch/parisc/include/asm/syscall.h | 19 +
arch/powerpc/include/asm/syscall.h | 26 +-
arch/riscv/include/asm/syscall.h | 16 +
arch/s390/include/asm/syscall.h | 24 +
arch/sh/include/asm/syscall_32.h | 24 +
arch/sparc/include/asm/syscall.h | 22 +
arch/um/include/asm/syscall-generic.h | 19 +
arch/x86/include/asm/syscall.h | 43 ++
arch/xtensa/include/asm/syscall.h | 18 +
include/asm-generic/syscall.h | 30 ++
include/linux/ptrace.h | 3 +
include/uapi/linux/ptrace.h | 4 +-
kernel/ptrace.c | 153 +++++-
tools/testing/selftests/ptrace/Makefile | 2 +-
.../selftests/ptrace/set_syscall_info.c | 441 ++++++++++++++++++
26 files changed, 1052 insertions(+), 47 deletions(-)
create mode 100644 tools/testing/selftests/ptrace/set_syscall_info.c
--
ldv
Enabling cbo.clean and cbo.flush in user mode makes it more
convenient to manage the cache state and achieve better performance.
Reviewed-by: Andrew Jones <ajones(a)ventanamicro.com>
Signed-off-by: Yunhui Cui <cuiyunhui(a)bytedance.com>
---
arch/riscv/kernel/cpufeature.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index c0916ed318c2..60d180b98f52 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -30,6 +30,7 @@
#define NUM_ALPHA_EXTS ('z' - 'a' + 1)
static bool any_cpu_has_zicboz;
+static bool any_cpu_has_zicbom;
unsigned long elf_hwcap __read_mostly;
@@ -87,6 +88,8 @@ static int riscv_ext_zicbom_validate(const struct riscv_isa_ext_data *data,
pr_err("Zicbom disabled as cbom-block-size present, but is not a power-of-2\n");
return -EINVAL;
}
+
+ any_cpu_has_zicbom = true;
return 0;
}
@@ -944,6 +947,11 @@ void __init riscv_user_isa_enable(void)
current->thread.envcfg |= ENVCFG_CBZE;
else if (any_cpu_has_zicboz)
pr_warn("Zicboz disabled as it is unavailable on some harts\n");
+
+ if (riscv_has_extension_unlikely(RISCV_ISA_EXT_ZICBOM))
+ current->thread.envcfg |= ENVCFG_CBCFE;
+ else if (any_cpu_has_zicbom)
+ pr_warn("Zicbom disabled as it is unavailable on some harts\n");
}
#ifdef CONFIG_RISCV_ALTERNATIVE
--
2.39.2
Running "make kselftest TARGETS=net/forwarding" results in several
occurrences of the same error:
./lib.sh: line 787: teamd: command not found
Since many tests depend on teamd, this fix stops the tests if the
teamd command is not installed.
Signed-off-by: Alessandro Zanni <alessandro.zanni87(a)gmail.com>
---
tools/testing/selftests/net/forwarding/lib.sh | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/net/forwarding/lib.sh b/tools/testing/selftests/net/forwarding/lib.sh
index 7337f398f9cc..a6a74a4be4bf 100644
--- a/tools/testing/selftests/net/forwarding/lib.sh
+++ b/tools/testing/selftests/net/forwarding/lib.sh
@@ -784,6 +784,7 @@ team_destroy()
{
local if_name=$1; shift
+ require_command $TEAMD
$TEAMD -t $if_name -k
}
--
2.43.0
The selftest started failing since commit e93d2521b27f
("x86/vdso: Split virtual clock pages into dedicated mapping")
was merged. While debugging I stumbled upon some memory usage
optimizations.
With these, the test now runs on a VM with only 60MiB of memory.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh(a)linutronix.de>
---
Changes in v2:
- Drop /dev/null usage
- Avoid overcommit restrictions by dropping PROT_WRITE
- Avoid high memory usage due to PTEs
- Link to v1: https://lore.kernel.org/r/20250107-virtual_address_range-tests-v1-0-3834a2f…
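The two memory-saving ideas from the changelog, mapping without PROT_WRITE and unmapping each chunk once it has been checked, can be sketched as below. This is only an illustration, not the selftest's code: `reserve_readonly()` and `validate_and_release()` are hypothetical helpers, and "validation" is reduced to reading one byte.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Reserve len bytes of address space without PROT_WRITE. A private,
 * read-only anonymous mapping cannot be dirtied, so it is normally not
 * charged against the overcommit limit.
 */
void *reserve_readonly(size_t len)
{
	void *p;

	assert(len > 0);

	p = mmap(NULL, len, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	return p == MAP_FAILED ? NULL : p;
}

/*
 * Check a chunk (here: read its first byte) and unmap it right away so
 * that page tables for already-checked chunks do not pile up.
 * Returns the byte read, or -1 if munmap() fails.
 */
int validate_and_release(void *p, size_t len)
{
	volatile const unsigned char *bytes = p;
	unsigned char first = bytes[0];

	if (munmap(p, len) != 0)
		return -1;
	return first;
}
```

A caller would loop over chunks, calling `reserve_readonly()` and then `validate_and_release()` for each one; a fresh anonymous mapping reads back as zero.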
---
Thomas Weißschuh (3):
selftests/mm: virtual_address_range: mmap() without PROT_WRITE
selftests/mm: virtual_address_range: Unmap chunks after validation
selftests/mm: virtual_address_range: Avoid reading VVAR mappings
tools/testing/selftests/mm/config | 1 +
tools/testing/selftests/mm/virtual_address_range.c | 34 +++++++++++++++++++---
2 files changed, 31 insertions(+), 4 deletions(-)
---
base-commit: 32af4d2269d20fe2f8d32aaa456cad8e40abd365
change-id: 20250107-virtual_address_range-tests-95843766fa97
Best regards,
--
Thomas Weißschuh <thomas.weissschuh(a)linutronix.de>
Notable changes since v16:
* fixed usage of netdev tracker by removing dev_tracker member from
ovpn_priv and adding it to ovpn_peer and ovpn_socket as those are the
objects really holding a ref to the netdev
* switched ovpn_get_dev_from_attrs() to GFP_ATOMIC to prevent sleep under
rcu_read_lock
* allocated netdevice_tracker in ovpn_nl_pre_doit() [stored in
user_ptr[1]] to keep track of the netdev reference held during netlink
handler calls
* moved whole socket detaching routine to worker. This way the code is
allowed to sleep and in turn it can be executed under lock_sock. This
lock allows us to happily coordinate concurrent attach/detach calls.
(note: lock is acquired every time the refcnt for the socket is
decremented, because this guarantees us that setting the refcnt to 0
and detaching the socket will happen atomically)
* dropped kref_put_sock()/refcount handler as it's not required anymore,
thanks to the point above
* re-arranged ovpn_socket_new() in order to simplify error path by first
allocating the new ovpn_sock and then attaching
Please note that some patches were already reviewed/tested by a few
people. These patches have retained the tags as they have hardly been
touched.
The latest code can also be found at:
https://github.com/OpenVPN/linux-kernel-ovpn
Thanks a lot!
Best Regards,
Antonio Quartulli
OpenVPN Inc.
---
Antonio Quartulli (25):
net: introduce OpenVPN Data Channel Offload (ovpn)
ovpn: add basic netlink support
ovpn: add basic interface creation/destruction/management routines
ovpn: keep carrier always on for MP interfaces
ovpn: introduce the ovpn_peer object
ovpn: introduce the ovpn_socket object
ovpn: implement basic TX path (UDP)
ovpn: implement basic RX path (UDP)
ovpn: implement packet processing
ovpn: store tunnel and transport statistics
ipv6: export inet6_stream_ops via EXPORT_SYMBOL_GPL
ovpn: implement TCP transport
skb: implement skb_send_sock_locked_with_flags()
ovpn: add support for MSG_NOSIGNAL in tcp_sendmsg
ovpn: implement multi-peer support
ovpn: implement peer lookup logic
ovpn: implement keepalive mechanism
ovpn: add support for updating local UDP endpoint
ovpn: add support for peer floating
ovpn: implement peer add/get/dump/delete via netlink
ovpn: implement key add/get/del/swap via netlink
ovpn: kill key and notify userspace in case of IV exhaustion
ovpn: notify userspace when a peer is deleted
ovpn: add basic ethtool support
testing/selftests: add test tool and scripts for ovpn module
Documentation/netlink/specs/ovpn.yaml | 372 +++
Documentation/netlink/specs/rt_link.yaml | 16 +
MAINTAINERS | 11 +
drivers/net/Kconfig | 15 +
drivers/net/Makefile | 1 +
drivers/net/ovpn/Makefile | 22 +
drivers/net/ovpn/bind.c | 55 +
drivers/net/ovpn/bind.h | 101 +
drivers/net/ovpn/crypto.c | 211 ++
drivers/net/ovpn/crypto.h | 145 ++
drivers/net/ovpn/crypto_aead.c | 382 ++++
drivers/net/ovpn/crypto_aead.h | 33 +
drivers/net/ovpn/io.c | 446 ++++
drivers/net/ovpn/io.h | 34 +
drivers/net/ovpn/main.c | 350 +++
drivers/net/ovpn/main.h | 14 +
drivers/net/ovpn/netlink-gen.c | 213 ++
drivers/net/ovpn/netlink-gen.h | 41 +
drivers/net/ovpn/netlink.c | 1183 ++++++++++
drivers/net/ovpn/netlink.h | 18 +
drivers/net/ovpn/ovpnstruct.h | 54 +
drivers/net/ovpn/peer.c | 1269 +++++++++++
drivers/net/ovpn/peer.h | 164 ++
drivers/net/ovpn/pktid.c | 129 ++
drivers/net/ovpn/pktid.h | 87 +
drivers/net/ovpn/proto.h | 118 +
drivers/net/ovpn/skb.h | 60 +
drivers/net/ovpn/socket.c | 204 ++
drivers/net/ovpn/socket.h | 49 +
drivers/net/ovpn/stats.c | 21 +
drivers/net/ovpn/stats.h | 47 +
drivers/net/ovpn/tcp.c | 565 +++++
drivers/net/ovpn/tcp.h | 33 +
drivers/net/ovpn/udp.c | 421 ++++
drivers/net/ovpn/udp.h | 22 +
include/linux/skbuff.h | 2 +
include/uapi/linux/if_link.h | 15 +
include/uapi/linux/ovpn.h | 111 +
include/uapi/linux/udp.h | 1 +
net/core/skbuff.c | 18 +-
net/ipv6/af_inet6.c | 1 +
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/net/ovpn/.gitignore | 2 +
tools/testing/selftests/net/ovpn/Makefile | 17 +
tools/testing/selftests/net/ovpn/config | 10 +
tools/testing/selftests/net/ovpn/data64.key | 5 +
tools/testing/selftests/net/ovpn/ovpn-cli.c | 2366 ++++++++++++++++++++
tools/testing/selftests/net/ovpn/tcp_peers.txt | 5 +
.../testing/selftests/net/ovpn/test-chachapoly.sh | 9 +
tools/testing/selftests/net/ovpn/test-float.sh | 9 +
tools/testing/selftests/net/ovpn/test-tcp.sh | 9 +
tools/testing/selftests/net/ovpn/test.sh | 185 ++
tools/testing/selftests/net/ovpn/udp_peers.txt | 5 +
53 files changed, 9672 insertions(+), 5 deletions(-)
---
base-commit: 7b24f164cf005b9649138ef6de94aaac49c9f3d1
change-id: 20241002-b4-ovpn-eeee35c694a2
Best regards,
--
Antonio Quartulli <antonio(a)openvpn.net>
Hi all,
This patch series continues the work to migrate the *.sh tests into
prog_tests.
test_xdp_redirect.sh tests the XDP redirections done through
bpf_redirect().
These XDP redirections are already tested by prog_tests/xdp_do_redirect.c
but IMO it doesn't cover the exact same code path because
xdp_do_redirect.c uses bpf_prog_test_run_opts() to trigger redirections
of 'fake packets' while test_xdp_redirect.sh redirects packets coming
from the network. Also, the test_xdp_redirect.sh script tests the
redirections with both SKB and DRV modes while xdp_do_redirect.c only
tests the DRV mode.
The patch series adds two new test cases in prog_tests/xdp_do_redirect.c
to replace the test_xdp_redirect.sh script.
Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet(a)bootlin.com>
---
Changes in v2:
- Use directly skel->progs instead of 'bpf_object__find_program_by_name()'
- Use 'ip -n NSX' in SYS calls instead of opening NSX with open_netns()
- Use #define for static indexes of veth1 and veth2
- Delete the useless second ping
- Set nstoken to NULL after close_netns()
- Merge the two added tests into one with 3 subtests (one for each flag:
0, DRV, SKB)
- Link to v1: https://lore.kernel.org/r/20250103-xdp_redirect-v1-0-e93099f59069@bootlin.c…
---
Bastien Curutchet (eBPF Foundation) (3):
selftests/bpf: test_xdp_redirect: Rename BPF sections
selftests/bpf: Migrate test_xdp_redirect.sh to xdp_do_redirect.c
selftests/bpf: Migrate test_xdp_redirect.c to test_xdp_do_redirect.c
tools/testing/selftests/bpf/Makefile | 1 -
.../selftests/bpf/prog_tests/xdp_do_redirect.c | 164 +++++++++++++++++++++
.../selftests/bpf/progs/test_xdp_do_redirect.c | 12 ++
.../selftests/bpf/progs/test_xdp_redirect.c | 26 ----
tools/testing/selftests/bpf/test_xdp_redirect.sh | 79 ----------
5 files changed, 176 insertions(+), 106 deletions(-)
---
base-commit: b27feb5365c6a1bf7e71ba5c795717ee0eec298d
change-id: 20241219-xdp_redirect-2b8ec79dc24e
Best regards,
--
Bastien Curutchet (eBPF Foundation) <bastien.curutchet(a)bootlin.com>
`-l2 -v` is a useful combination of flags to dump the entire
verification log. This is helpful when making changes to the verifier,
as you can see what it thinks of the program one instruction at a time.
This was more or less a hidden feature before. Document it so others can
discover it.
Signed-off-by: Daniel Xu <dxu(a)dxuuu.xyz>
---
tools/testing/selftests/bpf/veristat.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/bpf/veristat.c b/tools/testing/selftests/bpf/veristat.c
index 974c808f9321..7d0a9cb753e3 100644
--- a/tools/testing/selftests/bpf/veristat.c
+++ b/tools/testing/selftests/bpf/veristat.c
@@ -216,7 +216,8 @@ const char argp_program_doc[] =
"\n"
"USAGE: veristat <obj-file> [<obj-file>...]\n"
" OR: veristat -C <baseline.csv> <comparison.csv>\n"
-" OR: veristat -R <results.csv>\n";
+" OR: veristat -R <results.csv>\n"
+" OR: veristat -v -l2 <to_analyze.bpf.o>\n";
enum {
OPT_LOG_FIXED = 1000,
@@ -228,7 +229,7 @@ static const struct argp_option opts[] = {
{ "version", 'V', NULL, 0, "Print version" },
{ "verbose", 'v', NULL, 0, "Verbose mode" },
{ "debug", 'd', NULL, 0, "Debug mode (turns on libbpf debug logging)" },
- { "log-level", 'l', "LEVEL", 0, "Verifier log level (default 0 for normal mode, 1 for verbose mode)" },
+ { "log-level", 'l', "LEVEL", 0, "Verifier log level (default 0 for normal mode, 1 for verbose mode, 2 for full verification log)" },
{ "log-fixed", OPT_LOG_FIXED, NULL, 0, "Disable verifier log rotation" },
{ "log-size", OPT_LOG_SIZE, "BYTES", 0, "Customize verifier log size (default to 16MB)" },
{ "top-n", 'n', "N", 0, "Emit only up to first N results." },
--
2.47.1
Android uses the ashmem driver [1] for creating shared memory regions
between processes. The ashmem driver exposes an ioctl command for
processes to restrict the permissions an ashmem buffer can be mapped
with.
Buffers are created with the ability to be mapped as readable, writable,
and executable. Processes remove the ability to map some ashmem buffers
as executable to ensure that those buffers cannot be used to inject
malicious code for another process to run. Other buffers retain their
ability to be mapped as executable, as these buffers can be used for
just-in-time (JIT) compilation. So there is a need to be able to remove
the ability to map a buffer as executable on a per-buffer basis.
Android is currently trying to migrate towards replacing its ashmem
driver usage with memfd. Part of the transition involved introducing a
library that serves to abstract away how shared memory regions are
allocated (i.e. ashmem vs memfd). This allows clients to use a single
interface for restricting how a buffer can be mapped without having to
worry about how it is handled for ashmem (through the ioctl
command mentioned earlier) or memfd (through file seals).
While memfd has support for preventing buffers from being mapped as
writable beyond a certain point in time (thanks to
F_SEAL_FUTURE_WRITE), it does not have a similar interface to prevent
buffers from being mapped as executable beyond a certain point.
However, that could be implemented as a file seal (F_SEAL_FUTURE_EXEC)
which works similarly to F_SEAL_FUTURE_WRITE.
F_SEAL_FUTURE_WRITE was chosen as a template for how this new seal
should behave, instead of F_SEAL_WRITE, for the following reasons:
1. Having the new seal behave like F_SEAL_FUTURE_WRITE matches the
behavior that was present with ashmem. This aids in seamlessly
transitioning clients away from ashmem to memfd.
2. Making the new seal behave like F_SEAL_WRITE would mean that no
mappings that could become executable in the future (i.e. via
mprotect()) can exist when the seal is applied. However, there are
known cases (e.g. CursorWindow [2]) where restrictions are applied
on how a buffer can be mapped after a mapping has already been made.
That mapping may have VM_MAYEXEC set, which would not allow the seal
to be applied successfully.
Therefore, the F_SEAL_FUTURE_EXEC seal was designed to have the same
semantics as F_SEAL_FUTURE_WRITE.
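For reference, the existing F_SEAL_FUTURE_WRITE behavior that serves as the template can be observed with a small program like the one below. It only exercises the seal that already exists upstream, not the proposed F_SEAL_FUTURE_EXEC; `future_write_seal_demo()` is a hypothetical helper, and the fallback `#define` assumes the UAPI value 0x0010 for libcs that do not expose the constant.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef F_SEAL_FUTURE_WRITE
#define F_SEAL_FUTURE_WRITE 0x0010	/* value from the Linux UAPI header */
#endif

/*
 * Returns 0 if the seal behaves as described (old mapping stays writable,
 * new shared writable mapping is refused with EPERM), 1 if it does not,
 * and -1 if the environment cannot run the demo (e.g. no memfd support).
 */
int future_write_seal_demo(void)
{
	const size_t len = 4096;
	char *old_map;
	void *new_map;
	int fd, ret = 1;

	fd = memfd_create("seal-demo", MFD_CLOEXEC | MFD_ALLOW_SEALING);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, len) != 0) {
		close(fd);
		return -1;
	}

	/* A writable shared mapping made before the seal is applied. */
	old_map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (old_map == MAP_FAILED) {
		close(fd);
		return -1;
	}

	/* Unlike F_SEAL_WRITE, this succeeds despite the mapping above. */
	if (fcntl(fd, F_ADD_SEALS, F_SEAL_FUTURE_WRITE) != 0) {
		ret = -1;
		goto out;
	}

	old_map[0] = 'x';	/* the pre-existing mapping is still writable */

	/* New shared writable mappings are expected to fail with EPERM. */
	new_map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (new_map == MAP_FAILED && errno == EPERM)
		ret = 0;
	else if (new_map != MAP_FAILED)
		munmap(new_map, len);
out:
	munmap(old_map, len);
	close(fd);
	return ret;
}
```

The point of the comparison is the `fcntl()` step: a seal modeled on F_SEAL_WRITE would be refused there because a mapping already exists, which is the CursorWindow situation described above.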
Note: this series depends on Lorenzo's work [3], [4], [5] from Andrew
Morton's mm-unstable branch [6], which reworks memfd's file seal checks,
allowing for newer file seals to be implemented in a cleaner fashion.
Changes from v1 ==> v2:
- Changed the return code to be -EPERM instead of -EACCES when
attempting to map an exec sealed file with PROT_EXEC to align
to mmap()'s man page. Thank you Kalesh Singh for spotting this!
- Rebased on top of Lorenzo's work to cleanup memfd file seal checks in
mmap() ([3], [4], and [5]). Thank you for this Lorenzo!
- Changed to deny PROT_EXEC mappings only if the mapping is shared,
instead of for both shared and private mappings, after discussing
this with Lorenzo.
Opens:
- Lorenzo brought up that this patch may negatively impact the usage of
MFD_NOEXEC_SCOPE_NOEXEC_ENFORCED [7]. However, it is not clear to me
why that is the case. At the moment, my intent is for the executable
permissions of the file to be disjoint from the ability to create
executable mappings.
Links:
[1] https://cs.android.com/android/kernel/superproject/+/common-android-mainlin…
[2] https://developer.android.com/reference/android/database/CursorWindow
[3] https://lore.kernel.org/all/cover.1732804776.git.lorenzo.stoakes@oracle.com/
[4] https://lkml.kernel.org/r/20241206212846.210835-1-lorenzo.stoakes@oracle.com
[5] https://lkml.kernel.org/r/7dee6c5d-480b-4c24-b98e-6fa47dbd8a23@lucifer.local
[6] https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/tree/?h=mm-unst…
[7] https://lore.kernel.org/all/3a53b154-1e46-45fb-a559-65afa7a8a788@lucifer.lo…
Links to previous versions:
v1: https://lore.kernel.org/all/20241206010930.3871336-1-isaacmanjarres@google.…
Isaac J. Manjarres (2):
mm/memfd: Add support for F_SEAL_FUTURE_EXEC to memfd
selftests/memfd: Add tests for F_SEAL_FUTURE_EXEC
include/uapi/linux/fcntl.h | 1 +
mm/memfd.c | 39 ++++++++++-
tools/testing/selftests/memfd/memfd_test.c | 79 ++++++++++++++++++++++
3 files changed, 118 insertions(+), 1 deletion(-)
--
2.47.1.613.gc27f4b7a9f-goog
After reviewing the code, it was found that these macros are never
referenced in the code. Just remove them.
Signed-off-by: Ba Jing <bajing(a)cmss.chinamobile.com>
---
tools/testing/selftests/landlock/ptrace_test.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/tools/testing/selftests/landlock/ptrace_test.c b/tools/testing/selftests/landlock/ptrace_test.c
index a19db4d0b3bd..8f31b673ff2d 100644
--- a/tools/testing/selftests/landlock/ptrace_test.c
+++ b/tools/testing/selftests/landlock/ptrace_test.c
@@ -22,8 +22,6 @@
/* Copied from security/yama/yama_lsm.c */
#define YAMA_SCOPE_DISABLED 0
#define YAMA_SCOPE_RELATIONAL 1
-#define YAMA_SCOPE_CAPABILITY 2
-#define YAMA_SCOPE_NO_ATTACH 3
static void create_domain(struct __test_metadata *const _metadata)
{
--
2.33.0
This series depends on: "[PATCH v2 0/3] tun: Unify vnet implementation
and fill full vnet header"
https://lore.kernel.org/r/20250109-tun-v2-0-388d7d5a287a@daynix.com
virtio-net has two uses of hashes: one is RSS and another is hash
reporting. Conventionally the hash calculation was done by the VMM.
However, computing the hash after the queue was chosen defeats the
purpose of RSS.
Another approach is to use an eBPF steering program. This approach has
another downside: it cannot report the calculated hash due to the
restrictive nature of eBPF.
Introduce the code to compute hashes into the kernel in order to overcome
these challenges.
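To make the ordering argument concrete, here is a minimal sketch of an RSS-style computation: a textbook bitwise Toeplitz hash (the changelog below mentions virtio_net_toeplitz_calc(), but this is not that function) followed by an indirection-table lookup. `toeplitz_hash()`, `rss_select_queue()` and their parameters are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Textbook Toeplitz hash: for every set bit of the input (most significant
 * bit first), XOR in the 32-bit window of the key that starts at the same
 * bit offset. The key must be at least 4 bytes longer than the data.
 */
uint32_t toeplitz_hash(const uint8_t *key, size_t key_len,
		       const uint8_t *data, size_t data_len)
{
	uint32_t window, hash = 0;
	size_t i, next = 4;
	int bit;

	assert(key_len >= data_len + 4);

	window = (uint32_t)key[0] << 24 | (uint32_t)key[1] << 16 |
		 (uint32_t)key[2] << 8 | (uint32_t)key[3];

	for (i = 0; i < data_len; i++) {
		for (bit = 7; bit >= 0; bit--) {
			if (data[i] & (1u << bit))
				hash ^= window;
			/* Slide the window one bit further along the key. */
			window <<= 1;
			if (key[next] & (1u << bit))
				window |= 1;
		}
		next++;
	}
	return hash;
}

/* The queue is derived from the hash, so the hash has to come first. */
uint16_t rss_select_queue(uint32_t hash, const uint16_t *indirection_table,
			  uint16_t table_mask)
{
	return indirection_table[hash & table_mask];
}
```

Because the queue index is a function of the hash, a hash computed only after the queue has been picked cannot influence steering, which is the problem the paragraph above describes.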
An alternative solution is to extend the eBPF steering program so that it
will be able to report to the userspace, but it is based on context
rewrites, which is in feature freeze. We can adopt kfuncs, but they will
not be UAPIs. We opt to ioctl to align with other relevant UAPIs (KVM
and vhost_net).
The patches for QEMU to use this new feature were submitted as RFC and
are available at:
https://patchew.org/QEMU/20240915-hash-v3-0-79cb08d28647@daynix.com/
This work was presented at LPC 2024:
https://lpc.events/event/18/contributions/1963/
V1 -> V2:
Changed to introduce a new BPF program type.
Signed-off-by: Akihiko Odaki <akihiko.odaki(a)daynix.com>
---
Changes in v6:
- Extracted changes to fill vnet header holes into another series.
- Squashed patches "skbuff: Introduce SKB_EXT_TUN_VNET_HASH", "tun:
Introduce virtio-net hash reporting feature", and "tun: Introduce
virtio-net RSS" into patch "tun: Introduce virtio-net hash feature".
- Dropped the RFC tag.
- Link to v5: https://lore.kernel.org/r/20241008-rss-v5-0-f3cf68df005d@daynix.com
Changes in v5:
- Fixed a compilation error with CONFIG_TUN_VNET_CROSS_LE.
- Optimized the calculation of the hash value according to:
https://git.dpdk.org/dpdk/commit/?id=3fb1ea032bd6ff8317af5dac9af901f1f324ca…
- Added patch "tun: Unify vnet implementation".
- Dropped patch "tap: Pad virtio header with zero".
- Added patch "selftest: tun: Test vnet ioctls without device".
- Reworked selftests to skip for older kernels.
- Documented the case when the underlying device is deleted and packets
have queue_mapping set by TC.
- Reordered test harness arguments.
- Added code to handle fragmented packets.
- Link to v4: https://lore.kernel.org/r/20240924-rss-v4-0-84e932ec0e6c@daynix.com
Changes in v4:
- Moved tun_vnet_hash_ext to if_tun.h.
- Renamed virtio_net_toeplitz() to virtio_net_toeplitz_calc().
- Replaced htons() with cpu_to_be16().
- Changed virtio_net_hash_rss() to return void.
- Reordered variable declarations in virtio_net_hash_rss().
- Removed virtio_net_hdr_v1_hash_from_skb().
- Updated messages of "tap: Pad virtio header with zero" and
"tun: Pad virtio header with zero".
- Fixed vnet_hash allocation size.
- Ensured to free vnet_hash when destructing tun_struct.
- Link to v3: https://lore.kernel.org/r/20240915-rss-v3-0-c630015db082@daynix.com
Changes in v3:
- Reverted back to add ioctl.
- Split patch "tun: Introduce virtio-net hashing feature" into
"tun: Introduce virtio-net hash reporting feature" and
"tun: Introduce virtio-net RSS".
- Changed to reuse hash values computed for automq instead of performing
RSS hashing when hash reporting is requested but RSS is not.
- Extracted relevant data from struct tun_struct to keep it minimal.
- Added kernel-doc.
- Changed to allow calling TUNGETVNETHASHCAP before TUNSETIFF.
- Initialized num_buffers with 1.
- Added a test case for unclassified packets.
- Fixed error handling in tests.
- Changed tests to verify that the queue index will not overflow.
- Rebased.
- Link to v2: https://lore.kernel.org/r/20231015141644.260646-1-akihiko.odaki@daynix.com
---
Akihiko Odaki (6):
virtio_net: Add functions for hashing
net: flow_dissector: Export flow_keys_dissector_symmetric
tun: Introduce virtio-net hash feature
selftest: tun: Test vnet ioctls without device
selftest: tun: Add tests for virtio-net hashing
vhost/net: Support VIRTIO_NET_F_HASH_REPORT
Documentation/networking/tuntap.rst | 7 +
drivers/net/Kconfig | 1 +
drivers/net/tap.c | 50 ++-
drivers/net/tun.c | 93 ++++--
drivers/net/tun_vnet.c | 167 +++++++++-
drivers/net/tun_vnet.h | 33 +-
drivers/vhost/net.c | 16 +-
include/linux/if_tap.h | 2 +
include/linux/skbuff.h | 3 +
include/linux/virtio_net.h | 188 +++++++++++
include/net/flow_dissector.h | 1 +
include/uapi/linux/if_tun.h | 75 +++++
net/core/flow_dissector.c | 3 +-
net/core/skbuff.c | 4 +
tools/testing/selftests/net/Makefile | 2 +-
tools/testing/selftests/net/tun.c | 630 ++++++++++++++++++++++++++++++++++-
16 files changed, 1224 insertions(+), 51 deletions(-)
---
base-commit: 9b2ffa6148b1e4468d08f7e0e7e371c43cac9ffe
change-id: 20240403-rss-e737d89efa77
prerequisite-change-id: 20241230-tun-66e10a49b0c7:v2
prerequisite-patch-id: 057e888c371f2ce750064b7c40c2cc6abbdf6819
prerequisite-patch-id: 22d53dd3443a2c72496bffb90f19d429972550a3
prerequisite-patch-id: 1520f0c1f7b11559d0898bea556f745f6b8914ac
Best regards,
--
Akihiko Odaki <akihiko.odaki(a)daynix.com>
PTRACE_SET_SYSCALL_INFO is a generic ptrace API that complements
PTRACE_GET_SYSCALL_INFO by letting the ptracer modify details of
system calls the tracee is blocked in.
This API allows ptracers to obtain and modify system call details
in a straightforward and architecture-agnostic way.
Current implementation supports changing only those bits of system call
information that are used by strace, namely, syscall number, syscall
arguments, and syscall return value.
Support of changing additional details returned by PTRACE_GET_SYSCALL_INFO,
such as instruction pointer and stack pointer, could be added later
if needed, by re-using struct ptrace_syscall_info.reserved to specify
the additional details that should be set. Currently, the reserved
field of struct ptrace_syscall_info must be initialized with zeroes;
arch, instruction_pointer, and stack_pointer fields are ignored.
PTRACE_SET_SYSCALL_INFO currently supports only PTRACE_SYSCALL_INFO_ENTRY,
PTRACE_SYSCALL_INFO_EXIT, and PTRACE_SYSCALL_INFO_SECCOMP operations.
Other operations could be added later if needed.
Ideally, PTRACE_SET_SYSCALL_INFO should have been introduced along with
PTRACE_GET_SYSCALL_INFO, but it didn't happen. The last straw that
convinced me to implement PTRACE_SET_SYSCALL_INFO was the apparent failure
to provide an API for changing the first system call argument on the riscv
architecture [1].
ptrace(2) man page:
long ptrace(enum __ptrace_request request, pid_t pid, void *addr, void *data);
...
PTRACE_SET_SYSCALL_INFO
Modify information about the system call that caused the stop.
The "data" argument is a pointer to struct ptrace_syscall_info
that specifies the system call information to be set.
The "addr" argument should be set to sizeof(struct ptrace_syscall_info).
[1] https://lore.kernel.org/all/59505464-c84a-403d-972f-d4b2055eeaac@gmail.com/
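For illustration only, a ptracer could prepare the structure for a
syscall-entry stop roughly as follows. This is a minimal sketch, not code
from the series: prepare_entry_info() is a hypothetical helper name, and the
sketch relies only on the struct ptrace_syscall_info layout already exported
by <linux/ptrace.h> for PTRACE_GET_SYSCALL_INFO. It zeroes the whole
structure first, so the reserved bytes end up zero as required above.

```c
#include <linux/ptrace.h> /* struct ptrace_syscall_info, PTRACE_SYSCALL_INFO_ENTRY */
#include <string.h>

/*
 * Hypothetical helper: describe a syscall-entry stop with a new syscall
 * number and new arguments. memset() clears every other field, including
 * the padding/reserved bytes; arch, instruction_pointer and stack_pointer
 * are left at zero because the cover letter says they are ignored.
 */
static inline void prepare_entry_info(struct ptrace_syscall_info *info,
				      unsigned long long nr,
				      const unsigned long long args[6])
{
	memset(info, 0, sizeof(*info));
	info->op = PTRACE_SYSCALL_INFO_ENTRY;
	info->entry.nr = nr;
	for (int i = 0; i < 6; i++)
		info->entry.args[i] = args[i];
}

/*
 * Following the man page text above, the request itself would then be
 * issued along the lines of:
 *
 *   ptrace(PTRACE_SET_SYSCALL_INFO, pid, sizeof(info), &info);
 *
 * PTRACE_SET_SYSCALL_INFO comes from the patched uapi header and may be
 * missing from installed headers, so the call is not compiled here.
 */
```

In practice a tracer would more likely start from the structure returned by
PTRACE_GET_SYSCALL_INFO and change only the fields it cares about.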
Dmitry V. Levin (6):
Revert "arch: remove unused function syscall_set_arguments()"
syscall.h: add syscall_set_arguments() on remaining
HAVE_ARCH_TRACEHOOK arches
syscall.h: introduce syscall_set_nr()
ptrace_get_syscall_info: factor out ptrace_get_syscall_info_op
ptrace: introduce PTRACE_SET_SYSCALL_INFO request
selftests/ptrace: add a test case for PTRACE_SET_SYSCALL_INFO
arch/arc/include/asm/syscall.h | 20 +
arch/arm/include/asm/syscall.h | 25 +
arch/arm64/include/asm/syscall.h | 20 +
arch/csky/include/asm/syscall.h | 13 +
arch/hexagon/include/asm/syscall.h | 14 +
arch/loongarch/include/asm/syscall.h | 15 +
arch/m68k/include/asm/syscall.h | 7 +
arch/microblaze/include/asm/syscall.h | 7 +
arch/mips/include/asm/syscall.h | 53 +++
arch/nios2/include/asm/syscall.h | 16 +
arch/openrisc/include/asm/syscall.h | 13 +
arch/parisc/include/asm/syscall.h | 19 +
arch/powerpc/include/asm/syscall.h | 15 +
arch/riscv/include/asm/syscall.h | 16 +
arch/s390/include/asm/syscall.h | 19 +
arch/sh/include/asm/syscall_32.h | 19 +
arch/sparc/include/asm/syscall.h | 17 +
arch/um/include/asm/syscall-generic.h | 19 +
arch/x86/include/asm/syscall.h | 43 ++
arch/xtensa/include/asm/syscall.h | 18 +
include/asm-generic/syscall.h | 30 ++
include/linux/ptrace.h | 3 +
include/uapi/linux/ptrace.h | 3 +-
kernel/ptrace.c | 154 ++++++-
tools/testing/selftests/ptrace/Makefile | 2 +-
.../selftests/ptrace/set_syscall_info.c | 436 ++++++++++++++++++
26 files changed, 994 insertions(+), 22 deletions(-)
create mode 100644 tools/testing/selftests/ptrace/set_syscall_info.c
--
ldv
Implement comprehensive testing for netconsole userdata entry handling,
demonstrating correct behavior when creating maximum entries and
preventing unauthorized overflow.
Refactor existing test infrastructure to support modular, reusable
helper functions that validate strict entry limit enforcement.
Also, add a warning if update_userdata() sees more than
MAX_USERDATA_ITEMS entries. This shouldn't happen and it is a bug that
shouldn't be silently ignored.
Signed-off-by: Breno Leitao <leitao(a)debian.org>
---
Changes in v3:
- Added the new shell helpers files in the TEST_INCLUDES (Jakub)
- Link to v2: https://lore.kernel.org/r/20250103-netcons_overflow_test-v2-0-a49f9be64c21@…
Changes in v2:
- Add the new script (netcons_overflow.sh) in
tools/testing/selftests/drivers/net/Makefile as suggested by Simon
Horman
- Link to v1: https://lore.kernel.org/r/20241204-netcons_overflow_test-v1-0-a85a8d0ace21@…
---
Breno Leitao (4):
netconsole: Warn if MAX_USERDATA_ITEMS limit is exceeded
netconsole: selftest: Split the helpers from the selftest
netconsole: selftest: Delete all userdata keys
netconsole: selftest: verify userdata entry limit
MAINTAINERS | 3 +-
drivers/net/netconsole.c | 2 +-
tools/testing/selftests/drivers/net/Makefile | 2 +
.../selftests/drivers/net/lib/sh/lib_netcons.sh | 225 +++++++++++++++++++++
.../testing/selftests/drivers/net/netcons_basic.sh | 218 +-------------------
.../selftests/drivers/net/netcons_overflow.sh | 67 ++++++
6 files changed, 298 insertions(+), 219 deletions(-)
---
base-commit: 7bf1659bad4e9413cdba132ef9cbd0caa9cabcc4
change-id: 20241204-netcons_overflow_test-eaf735d1f743
Best regards,
--
Breno Leitao <leitao(a)debian.org>
This patch allows progs to elide a null check on statically known map
lookup keys. In other words, if the verifier can statically prove that
the lookup will be in-bounds, allow the prog to drop the null check.
This is useful for two reasons:
1. A large number of nullness checks (especially when they cannot fail)
unnecessarily pushes progs towards BPF_COMPLEXITY_LIMIT_JMP_SEQ.
2. It forms a tighter contract between programmer and verifier.
For (1), bpftrace is starting to make heavier use of percpu scratch
maps. As a result, for user scripts with a large number of unrolled loops,
we are starting to hit jump complexity verification errors. These
percpu lookups cannot fail anyway, as we only use static key values.
Eliding nullness checks probably results in less work for the verifier as
well.
For (2), percpu scratch maps are often used as a larger stack, as the
current stack is limited to 512 bytes. In these situations, it is
desirable for the programmer to express: "this lookup should never fail,
and if it does, it means I messed up the code". By omitting the null
check, the programmer can "ask" the verifier to double check the logic.
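The condition described above can be modelled in plain user-space C. This
is only a simplified sketch of the idea, not the verifier code from the
series: struct lookup_site and lookup_cannot_fail() are hypothetical names,
and restricting the rule to index-addressed (array-like) maps is an
assumption made for the illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified description of one map lookup call site. */
struct lookup_site {
	bool is_array_map;    /* assumption: map is indexed by an integer key */
	bool key_is_const;    /* key value is known at verification time */
	int64_t key;          /* the constant key; valid only if key_is_const */
	uint32_t max_entries; /* number of slots in the map */
};

/*
 * Returns true when the lookup is provably in-bounds, i.e. when a
 * verifier following the rule in the cover letter could let the program
 * skip the null check on the returned pointer.
 */
static inline bool lookup_cannot_fail(const struct lookup_site *s)
{
	if (!s->is_array_map || !s->key_is_const)
		return false;
	return s->key >= 0 && (uint64_t)s->key < s->max_entries;
}
```

A key that is unknown at verification time, negative, or equal to
max_entries keeps the null check mandatory in this model.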
=== Changelog ===
Changes in v6:
* Use is_spilled_scalar_reg() helper and remove unnecessary comment
* Add back deleted selftest with different helper to dirty dst buffer
* Check size of spill is exactly key_size and update selftests
* Read slot_type from correct offset into the spi
* Rewrite selftests in C where possible
* Mark constant map keys as precise
Changes in v5:
* Dropped all acks
* Use s64 instead of long for const_map_key
* Ensure stack slot contains spilled reg before accessing spilled_ptr
* Ensure spilled reg is a scalar before accessing tnum const value
* Fix verifier selftest for 32-bit write to write at 8 byte alignment
to ensure spill is tracked
* Introduce more precise tracking of helper stack accesses
* Do constant map key extraction as part of helper argument processing
and then remove duplicated stack checks
* Use ret_flag instead of regs[BPF_REG_0].type
* Handle STACK_ZERO
* Fix bug in bpf_load_hdr_opt() arg annotation
Changes in v4:
* Only allow for CAP_BPF
* Add test for stack growing upwards
* Improve comment about stack growing upwards
Changes in v3:
* Check if stack is (erroneously) growing upwards
* Mention in commit message why existing tests needed change
Changes in v2:
* Added a check for when R2 is not a ptr to stack
* Added a check for when stack is uninitialized (no stack slot yet)
* Updated existing tests to account for null elision
* Added test case for when R2 can be both const and non-const
Daniel Xu (5):
bpf: verifier: Add missing newline on verbose() call
bpf: tcp: Mark bpf_load_hdr_opt() arg2 as read-write
bpf: verifier: Refactor helper access type tracking
bpf: verifier: Support eliding map lookup nullness
bpf: selftests: verifier: Add nullness elision tests
kernel/bpf/verifier.c | 139 +++++++++++----
net/core/filter.c | 2 +-
.../testing/selftests/bpf/progs/dynptr_fail.c | 6 +-
tools/testing/selftests/bpf/progs/iters.c | 14 +-
.../selftests/bpf/progs/map_kptr_fail.c | 2 +-
.../selftests/bpf/progs/test_global_func10.c | 2 +-
.../selftests/bpf/progs/uninit_stack.c | 5 +-
.../bpf/progs/verifier_array_access.c | 168 ++++++++++++++++++
.../bpf/progs/verifier_basic_stack.c | 2 +-
.../selftests/bpf/progs/verifier_const_or.c | 4 +-
.../progs/verifier_helper_access_var_len.c | 12 +-
.../selftests/bpf/progs/verifier_int_ptr.c | 2 +-
.../selftests/bpf/progs/verifier_map_in_map.c | 2 +-
.../selftests/bpf/progs/verifier_mtu.c | 2 +-
.../selftests/bpf/progs/verifier_raw_stack.c | 4 +-
.../selftests/bpf/progs/verifier_unpriv.c | 2 +-
.../selftests/bpf/progs/verifier_var_off.c | 8 +-
tools/testing/selftests/bpf/verifier/calls.c | 2 +-
.../testing/selftests/bpf/verifier/map_kptr.c | 2 +-
19 files changed, 311 insertions(+), 69 deletions(-)
--
2.47.1
Reverse the order in which
the PML log is read to align more closely to the hardware. It should
not affect regular users of the dirty logging but it fixes a unit test
specific assumption in the dirty_log_test dirty-ring mode.
Best regards,
Maxim Levitsky
Maxim Levitsky (2):
KVM: VMX: refactor PML terminology
KVM: VMX: read the PML log in the same order as it was written
arch/x86/kvm/vmx/main.c | 2 +-
arch/x86/kvm/vmx/nested.c | 2 +-
arch/x86/kvm/vmx/vmx.c | 32 ++++++++++++++++++++------------
arch/x86/kvm/vmx/vmx.h | 5 ++++-
4 files changed, 26 insertions(+), 15 deletions(-)
--
2.26.3
Extend the XDP Tx metadata framework so that users can request launch time
hardware offload, where the Ethernet device will schedule the packet for
transmission at a pre-determined time called launch time. The value of
launch time is communicated from user space to Ethernet driver via
launch_time field of struct xsk_tx_metadata.
Suggested-by: Stanislav Fomichev <sdf(a)google.com>
Signed-off-by: Song Yoong Siang <yoong.siang.song(a)intel.com>
---
Documentation/netlink/specs/netdev.yaml | 4 ++
Documentation/networking/xsk-tx-metadata.rst | 64 ++++++++++++++++++++
include/net/xdp_sock.h | 10 +++
include/net/xdp_sock_drv.h | 1 +
include/uapi/linux/if_xdp.h | 10 +++
include/uapi/linux/netdev.h | 3 +
net/core/netdev-genl.c | 2 +
net/xdp/xsk.c | 3 +
tools/include/uapi/linux/if_xdp.h | 10 +++
tools/include/uapi/linux/netdev.h | 3 +
10 files changed, 110 insertions(+)
diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml
index cbb544bd6c84..e59c8a14f7d1 100644
--- a/Documentation/netlink/specs/netdev.yaml
+++ b/Documentation/netlink/specs/netdev.yaml
@@ -70,6 +70,10 @@ definitions:
name: tx-checksum
doc:
L3 checksum HW offload is supported by the driver.
+ -
+ name: tx-launch-time
+ doc:
+ Launch time HW offload is supported by the driver.
-
name: queue-type
type: enum
diff --git a/Documentation/networking/xsk-tx-metadata.rst b/Documentation/networking/xsk-tx-metadata.rst
index e76b0cfc32f7..3cec089747ce 100644
--- a/Documentation/networking/xsk-tx-metadata.rst
+++ b/Documentation/networking/xsk-tx-metadata.rst
@@ -50,6 +50,10 @@ The flags field enables the particular offload:
checksum. ``csum_start`` specifies byte offset of where the checksumming
should start and ``csum_offset`` specifies byte offset where the
device should store the computed checksum.
+- ``XDP_TXMD_FLAGS_LAUNCH_TIME``: requests the device to schedule the
+ packet for transmission at a pre-determined time called launch time. The
+ value of launch time is indicated by ``launch_time`` field of
+ ``union xsk_tx_metadata``.
Besides the flags above, in order to trigger the offloads, the first
packet's ``struct xdp_desc`` descriptor should set ``XDP_TX_METADATA``
@@ -65,6 +69,65 @@ In this case, when running in ``XDK_COPY`` mode, the TX checksum
is calculated on the CPU. Do not enable this option in production because
it will negatively affect performance.
+Launch Time
+===========
+
+The value of the requested launch time should be based on the device's PTP
+Hardware Clock (PHC) to ensure accuracy. AF_XDP takes a different data path
+compared to the ETF queuing discipline, which organizes packets and delays
+their transmission. Instead, AF_XDP immediately hands off the packets to
+the device driver without rearranging their order or holding them prior to
+transmission. In scenarios where the launch time offload feature is
+disabled, the device driver is expected to disregard the launch time
+request. For correct interpretation and meaningful operation, the launch
+time should never be set to a value larger than the farthest programmable
+time in the future (the horizon). Different devices have different hardware
+limitations on the launch time offload feature.
+
+stmmac driver
+-------------
+
+For stmmac, TSO and launch time (TBS) features are mutually exclusive for
+each individual Tx Queue. By default, the driver configures Tx Queue 0 to
+support TSO and the rest of the Tx Queues to support TBS. The launch time
+hardware offload feature can be enabled or disabled by using the tc-etf
+command to call the driver's ndo_setup_tc() callback.
+
+The value of the launch time that is programmed in the Enhanced Normal
+Transmit Descriptors is a 32-bit value, where the most significant 8 bits
+represent the time in seconds and the remaining 24 bits represent the time
+in 256 ns increments. The programmed launch time is compared against the
+PTP time (bits[39:8]) and rolls over after 256 seconds. Therefore, the
+horizon of the launch time for dwmac4 and dwxlgmac2 is 128 seconds in the
+future.
+
+The stmmac driver maintains FIFO behavior and does not perform packet
+reordering. This means that a packet with a launch time request will block
+other packets in the same Tx Queue until it is transmitted.
+
+igc driver
+----------
+
+For igc, all four Tx Queues support the launch time feature. The launch
+time hardware offload feature can be enabled or disabled by using the
+tc-etf command to call the driver's ndo_setup_tc() callback. When entering
+TSN mode, the igc driver will reset the device and create a default Qbv
+schedule with a 1-second cycle time, with all Tx Queues open at all times.
+
+The value of the launch time that is programmed in the Advanced Transmit
+Context Descriptor is a relative offset to the starting time of the Qbv
+transmission window of the queue. The Frst flag of the descriptor can be
+set to schedule the packet for the next Qbv cycle. Therefore, the horizon
+of the launch time for i225 and i226 is the ending time of the next cycle
+of the Qbv transmission window of the queue. For example, when the Qbv
+cycle time is set to 1 second, the horizon of the launch time ranges
+from 1 second to 2 seconds, depending on where the Qbv cycle is currently
+running.
+
+The igc driver maintains FIFO behavior and does not perform packet
+reordering. This means that a packet with a launch time request will block
+other packets in the same Tx Queue until it is transmitted.
+
Querying Device Capabilities
============================
@@ -74,6 +137,7 @@ Refer to ``xsk-flags`` features bitmask in
- ``tx-timestamp``: device supports ``XDP_TXMD_FLAGS_TIMESTAMP``
- ``tx-checksum``: device supports ``XDP_TXMD_FLAGS_CHECKSUM``
+- ``tx-launch-time``: device supports ``XDP_TXMD_FLAGS_LAUNCH_TIME``
See ``tools/net/ynl/samples/netdev.c`` on how to query this information.
diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h
index bfe625b55d55..a58ae7589d12 100644
--- a/include/net/xdp_sock.h
+++ b/include/net/xdp_sock.h
@@ -110,11 +110,16 @@ struct xdp_sock {
* indicates position where checksumming should start.
* csum_offset indicates position where checksum should be stored.
*
+ * void (*tmo_request_launch_time)(u64 launch_time, void *priv)
+ * Called when AF_XDP frame requested launch time HW offload support.
+ * launch_time indicates the PTP time at which the device can schedule the
+ * packet for transmission.
*/
struct xsk_tx_metadata_ops {
void (*tmo_request_timestamp)(void *priv);
u64 (*tmo_fill_timestamp)(void *priv);
void (*tmo_request_checksum)(u16 csum_start, u16 csum_offset, void *priv);
+ void (*tmo_request_launch_time)(u64 launch_time, void *priv);
};
#ifdef CONFIG_XDP_SOCKETS
@@ -162,6 +167,11 @@ static inline void xsk_tx_metadata_request(const struct xsk_tx_metadata *meta,
if (!meta)
return;
+ if (ops->tmo_request_launch_time)
+ if (meta->flags & XDP_TXMD_FLAGS_LAUNCH_TIME)
+ ops->tmo_request_launch_time(meta->request.launch_time,
+ priv);
+
if (ops->tmo_request_timestamp)
if (meta->flags & XDP_TXMD_FLAGS_TIMESTAMP)
ops->tmo_request_timestamp(priv);
diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h
index 40085afd9160..78af371bc002 100644
--- a/include/net/xdp_sock_drv.h
+++ b/include/net/xdp_sock_drv.h
@@ -198,6 +198,7 @@ static inline void *xsk_buff_raw_get_data(struct xsk_buff_pool *pool, u64 addr)
#define XDP_TXMD_FLAGS_VALID ( \
XDP_TXMD_FLAGS_TIMESTAMP | \
XDP_TXMD_FLAGS_CHECKSUM | \
+ XDP_TXMD_FLAGS_LAUNCH_TIME | \
0)
static inline bool xsk_buff_valid_tx_metadata(struct xsk_tx_metadata *meta)
diff --git a/include/uapi/linux/if_xdp.h b/include/uapi/linux/if_xdp.h
index 42ec5ddaab8d..42869770776e 100644
--- a/include/uapi/linux/if_xdp.h
+++ b/include/uapi/linux/if_xdp.h
@@ -127,6 +127,12 @@ struct xdp_options {
*/
#define XDP_TXMD_FLAGS_CHECKSUM (1 << 1)
+/* Request launch time hardware offload. The device will schedule the packet for
+ * transmission at a pre-determined time called launch time. The value of
+ * launch time is communicated via launch_time field of struct xsk_tx_metadata.
+ */
+#define XDP_TXMD_FLAGS_LAUNCH_TIME (1 << 2)
+
/* AF_XDP offloads request. 'request' union member is consumed by the driver
* when the packet is being transmitted. 'completion' union member is
* filled by the driver when the transmit completion arrives.
@@ -142,6 +148,10 @@ struct xsk_tx_metadata {
__u16 csum_start;
/* Offset from csum_start where checksum should be stored. */
__u16 csum_offset;
+
+ /* XDP_TXMD_FLAGS_LAUNCH_TIME */
+ /* Launch time in nanosecond against the PTP HW Clock */
+ __u64 launch_time;
} request;
struct {
diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h
index e4be227d3ad6..5ab85f4af009 100644
--- a/include/uapi/linux/netdev.h
+++ b/include/uapi/linux/netdev.h
@@ -59,10 +59,13 @@ enum netdev_xdp_rx_metadata {
* by the driver.
* @NETDEV_XSK_FLAGS_TX_CHECKSUM: L3 checksum HW offload is supported by the
* driver.
+ * @NETDEV_XSK_FLAGS_LAUNCH_TIME: Launch Time HW offload is supported by the
+ * driver.
*/
enum netdev_xsk_flags {
NETDEV_XSK_FLAGS_TX_TIMESTAMP = 1,
NETDEV_XSK_FLAGS_TX_CHECKSUM = 2,
+ NETDEV_XSK_FLAGS_LAUNCH_TIME = 4,
};
enum netdev_queue_type {
diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c
index 9527dd46e4dc..e2515cf9190f 100644
--- a/net/core/netdev-genl.c
+++ b/net/core/netdev-genl.c
@@ -52,6 +52,8 @@ XDP_METADATA_KFUNC_xxx
xsk_features |= NETDEV_XSK_FLAGS_TX_TIMESTAMP;
if (netdev->xsk_tx_metadata_ops->tmo_request_checksum)
xsk_features |= NETDEV_XSK_FLAGS_TX_CHECKSUM;
+ if (netdev->xsk_tx_metadata_ops->tmo_request_launch_time)
+ xsk_features |= NETDEV_XSK_FLAGS_LAUNCH_TIME;
}
if (nla_put_u32(rsp, NETDEV_A_DEV_IFINDEX, netdev->ifindex) ||
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 3fa70286c846..8feaa0e86f07 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -743,6 +743,9 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
goto free_err;
}
}
+
+ if (meta->flags & XDP_TXMD_FLAGS_LAUNCH_TIME)
+ skb->skb_mstamp_ns = meta->request.launch_time;
}
}
diff --git a/tools/include/uapi/linux/if_xdp.h b/tools/include/uapi/linux/if_xdp.h
index 2f082b01ff22..67719f8966c2 100644
--- a/tools/include/uapi/linux/if_xdp.h
+++ b/tools/include/uapi/linux/if_xdp.h
@@ -127,6 +127,12 @@ struct xdp_options {
*/
#define XDP_TXMD_FLAGS_CHECKSUM (1 << 1)
+/* Request launch time hardware offload. The device will schedule the packet for
+ * transmission at a pre-determined time called launch time. The value of
+ * launch time is communicated via launch_time field of struct xsk_tx_metadata.
+ */
+#define XDP_TXMD_FLAGS_LAUNCH_TIME (1 << 2)
+
/* AF_XDP offloads request. 'request' union member is consumed by the driver
* when the packet is being transmitted. 'completion' union member is
* filled by the driver when the transmit completion arrives.
@@ -142,6 +148,10 @@ struct xsk_tx_metadata {
__u16 csum_start;
/* Offset from csum_start where checksum should be stored. */
__u16 csum_offset;
+
+ /* XDP_TXMD_FLAGS_LAUNCH_TIME */
+ /* Launch time in nanosecond against the PTP HW Clock */
+ __u64 launch_time;
} request;
struct {
diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h
index e4be227d3ad6..5ab85f4af009 100644
--- a/tools/include/uapi/linux/netdev.h
+++ b/tools/include/uapi/linux/netdev.h
@@ -59,10 +59,13 @@ enum netdev_xdp_rx_metadata {
* by the driver.
* @NETDEV_XSK_FLAGS_TX_CHECKSUM: L3 checksum HW offload is supported by the
* driver.
+ * @NETDEV_XSK_FLAGS_LAUNCH_TIME: Launch Time HW offload is supported by the
+ * driver.
*/
enum netdev_xsk_flags {
NETDEV_XSK_FLAGS_TX_TIMESTAMP = 1,
NETDEV_XSK_FLAGS_TX_CHECKSUM = 2,
+ NETDEV_XSK_FLAGS_LAUNCH_TIME = 4,
};
enum netdev_queue_type {
--
2.34.1
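As a rough illustration of the stmmac arithmetic described in the
documentation hunk of the patch above (8 bits of seconds, 24 bits in 256 ns
units, rollover after 256 seconds, 128-second horizon), the conversion can
be sketched as below. pack_launch_time() and within_horizon() are
hypothetical helpers written for this note; packing both fields into one
32-bit word follows the wording of the documentation and is an assumption,
so the driver's real descriptor layout may differ.

```c
#include <stdbool.h>
#include <stdint.h>

#define NS_PER_SEC 1000000000ULL

/*
 * Hypothetical helper: pack a PTP time given in nanoseconds into a 32-bit
 * value with the seconds (modulo 256) in the top 8 bits and the
 * sub-second part, in 256 ns units, in the low 24 bits.
 */
static inline uint32_t pack_launch_time(uint64_t ptp_ns)
{
	uint32_t sec = (uint32_t)((ptp_ns / NS_PER_SEC) & 0xff);
	/* (10^9 - 1) >> 8 is below 2^24, so this always fits in 24 bits */
	uint32_t frac = (uint32_t)((ptp_ns % NS_PER_SEC) >> 8);

	return (sec << 24) | frac;
}

/*
 * Hypothetical check: with a 256-second rollover, a launch time is only
 * unambiguous when it is less than 128 seconds ahead of the current PTP
 * time. Treating exactly 128 seconds as out of range is a conservative
 * choice made for this sketch.
 */
static inline bool within_horizon(uint64_t now_ns, uint64_t launch_ns)
{
	return launch_ns >= now_ns &&
	       launch_ns - now_ns < 128 * NS_PER_SEC;
}
```

For example, a PTP time of 2 seconds and 512 ns packs to 0x02000002 under
these assumptions.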
The selftest started failing since commit e93d2521b27f
("x86/vdso: Split virtual clock pages into dedicated mapping")
was merged. While debugging I stumbled upon another bug and potential
cleanup.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh(a)linutronix.de>
---
Thomas Weißschuh (3):
selftests/mm: virtual_address_range: Fix error when CommitLimit < 1GiB
selftests/mm: virtual_address_range: Avoid reading VVAR mappings
selftests/mm: virtual_address_range: Dump to /dev/null
tools/testing/selftests/mm/virtual_address_range.c | 21 +++++++++++++++------
1 file changed, 15 insertions(+), 6 deletions(-)
---
base-commit: fbfd64d25c7af3b8695201ebc85efe90be28c5a3
change-id: 20250107-virtual_address_range-tests-95843766fa97
Best regards,
--
Thomas Weißschuh <thomas.weissschuh(a)linutronix.de>
Notable changes since v15:
* added IPV6 hack in Kconfig
* switched doc '|' operator to '>-' in yaml netlink spec
* added ovpn-mode doc to rt_link.yaml
* implemented rtnl_link_ops.fill_info
* removed ovpn_socket_detach() function because UDP and TCP detachment
is now happening in different moments
* reworked ovpn_socket lifetime:
** introduced ovpn_socket_release() that depending on transport proto
will take the right step towards releasing the socket (check large
comment on top of function for greater details)
** extended comments on various ovpn_socket* functions to ensure socket
lifecycle is clear
** implemented kref_put_lock() to allow UDP sockets to be detached while
holding socket lock
** acquired socket lock in ovpn_socket_new() to avoid race with detach
(point above)
** socket is now released upon peer removal (not upon peer free!)
* added convenient define OVPN_AAD_SIZE
* renamed AUTH_TAG_SIZE to OVPN_AUTH_TAG_SIZE
* s/dev_core_stats_rx_dropped_inc/dev_core_stats_tx_dropped_inc where
needed
* fixed some typos
* moved tcp_close() call outside of rcu_read_lock area
* moved ovpn_socket creation from ovpn_nl_peer_modify() to
ovpn_nl_peer_new_doit() to make smatch happy (ovpn_socket_new() may
have been called under spinlock, but it may sleep)
* added support for MSG_NOSIGNAL flag in TCP calls (required extending
the skb API)
* improved TCP proto/ops customization (required exporting
inet6_stream_ops)
* changed kselftest tool (ovpn-cli.c) to pass MSG_NOSIGNAL to TCP
send/recv calls.
The ovpn_socket lifecycle changes above address the race conditions
previously reported by Sabrina.
Hopefully all tough nuts have been cracked at this point.
Please note that some patches were already reviewed by Andrew Lunn,
Donald Hunter and Shuah Khan. They have retained the Reviewed-by tag
since no major code modification has happened since the review.
The latest code can also be found at:
https://github.com/OpenVPN/linux-kernel-ovpn
Thanks a lot!
Best Regards,
Antonio Quartulli
OpenVPN Inc.
---
Antonio Quartulli (26):
net: introduce OpenVPN Data Channel Offload (ovpn)
ovpn: add basic netlink support
ovpn: add basic interface creation/destruction/management routines
ovpn: keep carrier always on for MP interfaces
ovpn: introduce the ovpn_peer object
kref/refcount: implement kref_put_sock()
ovpn: introduce the ovpn_socket object
ovpn: implement basic TX path (UDP)
ovpn: implement basic RX path (UDP)
ovpn: implement packet processing
ovpn: store tunnel and transport statistics
ipv6: export inet6_stream_ops via EXPORT_SYMBOL_GPL
ovpn: implement TCP transport
skb: implement skb_send_sock_locked_with_flags()
ovpn: add support for MSG_NOSIGNAL in tcp_sendmsg
ovpn: implement multi-peer support
ovpn: implement peer lookup logic
ovpn: implement keepalive mechanism
ovpn: add support for updating local UDP endpoint
ovpn: add support for peer floating
ovpn: implement peer add/get/dump/delete via netlink
ovpn: implement key add/get/del/swap via netlink
ovpn: kill key and notify userspace in case of IV exhaustion
ovpn: notify userspace when a peer is deleted
ovpn: add basic ethtool support
testing/selftests: add test tool and scripts for ovpn module
Documentation/netlink/specs/ovpn.yaml | 372 +++
Documentation/netlink/specs/rt_link.yaml | 16 +
MAINTAINERS | 11 +
drivers/net/Kconfig | 15 +
drivers/net/Makefile | 1 +
drivers/net/ovpn/Makefile | 22 +
drivers/net/ovpn/bind.c | 55 +
drivers/net/ovpn/bind.h | 101 +
drivers/net/ovpn/crypto.c | 211 ++
drivers/net/ovpn/crypto.h | 145 ++
drivers/net/ovpn/crypto_aead.c | 382 ++++
drivers/net/ovpn/crypto_aead.h | 33 +
drivers/net/ovpn/io.c | 446 ++++
drivers/net/ovpn/io.h | 34 +
drivers/net/ovpn/main.c | 350 +++
drivers/net/ovpn/main.h | 14 +
drivers/net/ovpn/netlink-gen.c | 213 ++
drivers/net/ovpn/netlink-gen.h | 41 +
drivers/net/ovpn/netlink.c | 1178 ++++++++++
drivers/net/ovpn/netlink.h | 18 +
drivers/net/ovpn/ovpnstruct.h | 57 +
drivers/net/ovpn/peer.c | 1256 +++++++++++
drivers/net/ovpn/peer.h | 159 ++
drivers/net/ovpn/pktid.c | 129 ++
drivers/net/ovpn/pktid.h | 87 +
drivers/net/ovpn/proto.h | 118 +
drivers/net/ovpn/skb.h | 60 +
drivers/net/ovpn/socket.c | 237 ++
drivers/net/ovpn/socket.h | 45 +
drivers/net/ovpn/stats.c | 21 +
drivers/net/ovpn/stats.h | 47 +
drivers/net/ovpn/tcp.c | 567 +++++
drivers/net/ovpn/tcp.h | 33 +
drivers/net/ovpn/udp.c | 392 ++++
drivers/net/ovpn/udp.h | 23 +
include/linux/kref.h | 11 +
include/linux/refcount.h | 3 +
include/linux/skbuff.h | 2 +
include/uapi/linux/if_link.h | 15 +
include/uapi/linux/ovpn.h | 111 +
include/uapi/linux/udp.h | 1 +
lib/refcount.c | 32 +
net/core/skbuff.c | 18 +-
net/ipv6/af_inet6.c | 1 +
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/net/ovpn/.gitignore | 2 +
tools/testing/selftests/net/ovpn/Makefile | 17 +
tools/testing/selftests/net/ovpn/config | 10 +
tools/testing/selftests/net/ovpn/data64.key | 5 +
tools/testing/selftests/net/ovpn/ovpn-cli.c | 2366 ++++++++++++++++++++
tools/testing/selftests/net/ovpn/tcp_peers.txt | 5 +
.../testing/selftests/net/ovpn/test-chachapoly.sh | 9 +
tools/testing/selftests/net/ovpn/test-float.sh | 9 +
tools/testing/selftests/net/ovpn/test-tcp.sh | 9 +
tools/testing/selftests/net/ovpn/test.sh | 182 ++
tools/testing/selftests/net/ovpn/udp_peers.txt | 5 +
56 files changed, 9698 insertions(+), 5 deletions(-)
---
base-commit: 4b252f2dab2ebb654eebbb2aee980ab8373b2295
change-id: 20241002-b4-ovpn-eeee35c694a2
Best regards,
--
Antonio Quartulli <antonio(a)openvpn.net>
On 08.01.25 07:09, Dev Jain wrote:
>
> On 07/01/25 8:44 pm, Thomas Weißschuh wrote:
>> During the execution of validate_complete_va_space() a lot of memory is
>> on the VM subsystem. When running on a low memory subsystem an OOM may
>> be triggered, when writing to the dump file as the filesystem may also
>> require memory.
>>
>> On my test system with 1100MiB physical memory:
>>
>> Tasks state (memory values in pages):
>> [ pid ] uid tgid total_vm rss rss_anon rss_file rss_shmem pgtables_bytes swapents oom_score_adj name
>> [ 57] 0 57 34359215953 695 256 0 439 1064390656 0 0 virtual_address
>>
>> Out of memory: Killed process 57 (virtual_address) total-vm:137436863812kB, anon-rss:1024kB, file-rss:0kB, shmem-rss:1756kB, UID:0 pgtables:1039444kB oom_score_adj:0
>> <snip>
>> fault_in_iov_iter_readable+0x4a/0xd0
>> generic_perform_write+0x9c/0x280
>> shmem_file_write_iter+0x86/0x90
>> vfs_write+0x29c/0x480
>> ksys_write+0x6c/0xe0
>> do_syscall_64+0x9e/0x1a0
>> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>
>> Write the dumped data into /dev/null instead which does not require
>> additional memory during write(), making the code simpler as a
>> side-effect.
>>
>> Signed-off-by: Thomas Weißschuh<thomas.weissschuh(a)linutronix.de>
>> ---
>> tools/testing/selftests/mm/virtual_address_range.c | 6 ++----
>> 1 file changed, 2 insertions(+), 4 deletions(-)
>>
>> diff --git a/tools/testing/selftests/mm/virtual_address_range.c b/tools/testing/selftests/mm/virtual_address_range.c
>> index 484f82c7b7c871f82a7d9ec6d6c649f2ab1eb0cd..4042fd878acd702d23da2c3293292de33bd48143 100644
>> --- a/tools/testing/selftests/mm/virtual_address_range.c
>> +++ b/tools/testing/selftests/mm/virtual_address_range.c
>> @@ -103,10 +103,9 @@ static int validate_complete_va_space(void)
>> FILE *file;
>> int fd;
>>
>> - fd = open("va_dump", O_CREAT | O_WRONLY, 0600);
>> - unlink("va_dump");
>> + fd = open("/dev/null", O_WRONLY);
>> if (fd < 0) {
>> - ksft_test_result_skip("cannot create or open dump file\n");
>> + ksft_test_result_skip("cannot create or open /dev/null\n");
>> ksft_finished();
>> }
>> @@ -152,7 +151,6 @@ static int validate_complete_va_space(void)
>> while (start_addr + hop < end_addr) {
>> if (write(fd, (void *)(start_addr + hop), 1) != 1)
>> return 1;
>> - lseek(fd, 0, SEEK_SET);
>>
>> hop += MAP_CHUNK_SIZE;
>> }
>>
>
> The reason I had not used /dev/null was that write() was succeeding to /dev/null
> even from an address not in my VA space. I was puzzled about this behaviour of
> /dev/null and I chose to ignore it and just use a real file.
>
> To test this behaviour, run the following program:
>
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
> #include <fcntl.h>
> #include <sys/mman.h>
>
> int main(void)
> {
> 	int fd;
>
> 	fd = open("va_dump", O_CREAT | O_WRONLY, 0600);
> 	unlink("va_dump");
> 	// fd = open("/dev/null", O_WRONLY);
>
> 	/* Make sure nothing is mapped at 1 GiB. */
> 	int ret = munmap((void *)(1UL << 30), 100);
> 	if (!ret)
> 		printf("munmap succeeded\n");
>
> 	/* Try to write one byte from the unmapped address. */
> 	int res = write(fd, (void *)(1UL << 30), 1);
> 	if (res == 1)
> 		printf("write succeeded\n");
> 	return 0;
> }
>
> The write will fail as expected, but if you comment out the va_dump
> lines and use /dev/null, the write will succeed.
What exactly do we want to achieve with the write? Verify that the
output of /proc/self/maps is reasonable and we can actually resolve a
fault / map a page?
Why not access the memory directly with a signal handler, or use
/proc/self/mem, so you can avoid the temp file completely?
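To make the /proc/self/mem idea a bit more concrete, here is a rough sketch. It is not code from the selftest, and the helper name addr_is_readable() is made up for illustration. The file offset passed to pread() is the virtual address to access, so a read from an address that is not mapped is expected to fail instead of being silently accepted.

```c
#define _FILE_OFFSET_BITS 64
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of virtual_address_range.c):
 * returns 1 if one byte at addr can be read through /proc/self/mem,
 * 0 if the read fails, -1 if /proc/self/mem cannot be opened.
 */
static int addr_is_readable(const void *addr)
{
	char byte;
	ssize_t n;
	int fd;

	fd = open("/proc/self/mem", O_RDONLY);
	if (fd < 0)
		return -1;

	/* The file offset is the virtual address to access. */
	n = pread(fd, &byte, 1, (off_t)(uintptr_t)addr);
	close(fd);

	return n == 1;
}
```

A test could call this once per chunk instead of write(), which would avoid both the temp file and the /dev/null behaviour described above.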
--
Cheers,
David / dhildenb
* Resending because I accidentally forgot to include Lorenzo in the
"to" list.
Android uses the ashmem driver [1] for creating shared memory regions
between processes. The ashmem driver exposes an ioctl command for
processes to restrict the permissions an ashmem buffer can be mapped
with.
Buffers are created with the ability to be mapped as readable, writable,
and executable. Processes remove the ability to map some ashmem buffers
as executable to ensure that those buffers cannot be used to inject
malicious code for another process to run. Other buffers retain their
ability to be mapped as executable, as these buffers can be used for
just-in-time (JIT) compilation. So there is a need to be able to remove
the ability to map a buffer as executable on a per-buffer basis.
Android is currently trying to migrate towards replacing its ashmem
driver usage with memfd. Part of the transition involved introducing a
library that serves to abstract away how shared memory regions are
allocated (i.e. ashmem vs memfd). This allows clients to use a single
interface for restricting how a buffer can be mapped without having to
worry about how it is handled for ashmem (through the ioctl
command mentioned earlier) or memfd (through file seals).
While memfd has support for preventing buffers from being mapped as
writable beyond a certain point in time (thanks to
F_SEAL_FUTURE_WRITE), it does not have a similar interface to prevent
buffers from being mapped as executable beyond a certain point.
However, that could be implemented as a file seal (F_SEAL_FUTURE_EXEC)
which works similarly to F_SEAL_FUTURE_WRITE.
F_SEAL_FUTURE_WRITE was chosen as a template for how this new seal
should behave, instead of F_SEAL_WRITE, for the following reasons:
1. Having the new seal behave like F_SEAL_FUTURE_WRITE matches the
behavior that was present with ashmem. This aids in seamlessly
transitioning clients away from ashmem to memfd.
2. Making the new seal behave like F_SEAL_WRITE would mean that no
mappings that could become executable in the future (i.e. via
mprotect()) can exist when the seal is applied. However, there are
known cases (e.g. CursorWindow [2]) where restrictions are applied
on how a buffer can be mapped after a mapping has already been made.
That mapping may have VM_MAYEXEC set, which would not allow the seal
to be applied successfully.
Therefore, the F_SEAL_FUTURE_EXEC seal was designed to have the same
semantics as F_SEAL_FUTURE_WRITE.
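For readers less familiar with the seal used as the template, the existing F_SEAL_FUTURE_WRITE behaviour can be sketched from userspace as below. This only exercises the seal that is already upstream and does not use F_SEAL_FUTURE_EXEC; the function name future_write_seal_demo() and the fallback define are assumptions made for illustration.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef F_SEAL_FUTURE_WRITE
#define F_SEAL_FUTURE_WRITE 0x0010	/* fallback for older libc headers */
#endif

/*
 * Hypothetical demo: returns 1 if, after F_SEAL_FUTURE_WRITE, the old
 * shared mapping is still writable while a new writable shared mapping
 * is refused with EPERM; 0 if the behaviour differs; -1 if the setup
 * itself fails (e.g. the kernel does not support the seal).
 */
static int future_write_seal_demo(void)
{
	const size_t len = 4096;
	char *old_map, *new_map;
	int fd, ret = -1;

	fd = memfd_create("seal-demo", MFD_ALLOW_SEALING);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, len) < 0)
		goto out_close;

	/* Mapping created before the seal is applied. */
	old_map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (old_map == MAP_FAILED)
		goto out_close;

	/* Unlike F_SEAL_WRITE, this can succeed with old_map in place. */
	if (fcntl(fd, F_ADD_SEALS, F_SEAL_FUTURE_WRITE) < 0)
		goto out_unmap;

	old_map[0] = 'x';	/* the pre-existing mapping stays writable */

	/* A new writable shared mapping should now be refused. */
	new_map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (new_map == MAP_FAILED) {
		ret = (errno == EPERM);
	} else {
		munmap(new_map, len);
		ret = 0;
	}

out_unmap:
	munmap(old_map, len);
out_close:
	close(fd);
	return ret;
}
```

The intent described above is for the new seal to follow the same pattern for executable mappings: mappings that exist when the seal is applied are left alone, and later attempts to create the restricted kind of mapping are rejected.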
Note: this series depends on Lorenzo's work [3], [4], [5] from Andrew
Morton's mm-unstable branch [6], which reworks memfd's file seal checks,
allowing for newer file seals to be implemented in a cleaner fashion.
Changes from v1 ==> v2:
- Changed the return code to be -EPERM instead of -EACCES when
attempting to map an exec sealed file with PROT_EXEC to align
to mmap()'s man page. Thank you Kalesh Singh for spotting this!
- Rebased on top of Lorenzo's work to clean up memfd file seal checks in
mmap() ([3], [4], and [5]). Thank you for this Lorenzo!
- Changed to deny PROT_EXEC mappings only if the mapping is shared,
instead of for both shared and private mappings, after discussing
this with Lorenzo.
Opens:
- Lorenzo brought up that this patch may negatively impact the usage of
MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED [7]. However, it is not clear to me
why that is the case. At the moment, my intent is for the executable
permissions of the file to be disjoint from the ability to create
executable mappings.
Links:
[1] https://cs.android.com/android/kernel/superproject/+/common-android-mainlin…
[2] https://developer.android.com/reference/android/database/CursorWindow
[3] https://lore.kernel.org/all/cover.1732804776.git.lorenzo.stoakes@oracle.com/
[4] https://lkml.kernel.org/r/20241206212846.210835-1-lorenzo.stoakes@oracle.com
[5] https://lkml.kernel.org/r/7dee6c5d-480b-4c24-b98e-6fa47dbd8a23@lucifer.local
[6] https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/tree/?h=mm-unst…
[7] https://lore.kernel.org/all/3a53b154-1e46-45fb-a559-65afa7a8a788@lucifer.lo…
Links to previous versions:
v1: https://lore.kernel.org/all/20241206010930.3871336-1-isaacmanjarres@google.…
Isaac J. Manjarres (2):
mm/memfd: Add support for F_SEAL_FUTURE_EXEC to memfd
selftests/memfd: Add tests for F_SEAL_FUTURE_EXEC
include/uapi/linux/fcntl.h | 1 +
mm/memfd.c | 39 ++++++++++-
tools/testing/selftests/memfd/memfd_test.c | 79 ++++++++++++++++++++++
3 files changed, 118 insertions(+), 1 deletion(-)
--
2.47.1.613.gc27f4b7a9f-goog
When compiling the pointer masking tests with -Wall this warning
is present:
pointer_masking.c: In function ‘test_tagged_addr_abi_sysctl’:
pointer_masking.c:203:9: warning: ignoring return value of ‘pwrite’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
  203 |         pwrite(fd, &value, 1, 0);
      |         ^~~~~~~~~~~~~~~~~~~~~~~~
pointer_masking.c:208:9: warning: ignoring return value of ‘pwrite’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
  208 |         pwrite(fd, &value, 1, 0);
I came across this on riscv64-linux-gnu-gcc (Ubuntu
11.4.0-1ubuntu1~22.04).
Fix this by checking that the number of bytes written equals the expected
number of bytes written.
Fixes: 7470b5afd150 ("riscv: selftests: Add a pointer masking test")
Signed-off-by: Charlie Jenkins <charlie(a)rivosinc.com>
Reviewed-by: Andrew Jones <ajones(a)ventanamicro.com>
---
Changes in v6:
- Add back ksft_test_result() (Samuel)
- Link to v5: https://lore.kernel.org/r/20241206-fix_warnings_pointer_masking_tests-v5-1-…
Changes in v5:
- No longer skip second pwrite if first one fails
- Use wrapper function instead of goto (Drew)
- Link to v4: https://lore.kernel.org/r/20241205-fix_warnings_pointer_masking_tests-v4-1-…
Changes in v4:
- Skip sysctl_enabled test if first pwrite failed
- Link to v3: https://lore.kernel.org/r/20241205-fix_warnings_pointer_masking_tests-v3-1-…
Changes in v3:
- Fix sysctl enabled test case (Drew/Alex)
- Move pwrite err condition into goto (Drew)
- Link to v2: https://lore.kernel.org/r/20241204-fix_warnings_pointer_masking_tests-v2-1-…
Changes in v2:
- I had ret != 2 for testing, I changed it to be ret != 1.
- Link to v1: https://lore.kernel.org/r/20241204-fix_warnings_pointer_masking_tests-v1-1-…
---
.../testing/selftests/riscv/abi/pointer_masking.c | 28 +++++++++++++++++-----
1 file changed, 22 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/riscv/abi/pointer_masking.c b/tools/testing/selftests/riscv/abi/pointer_masking.c
index dee41b7ee3e323150d55523c8acbf3ec38857b87..059d2e87eb1f737caf44f692b239bf3e49c233b4 100644
--- a/tools/testing/selftests/riscv/abi/pointer_masking.c
+++ b/tools/testing/selftests/riscv/abi/pointer_masking.c
@@ -185,8 +185,20 @@ static void test_fork_exec(void)
}
}
+static bool pwrite_wrapper(int fd, void *buf, size_t count, const char *msg)
+{
+ int ret = pwrite(fd, buf, count, 0);
+
+ if (ret != count) {
+ ksft_perror(msg);
+ return false;
+ }
+ return true;
+}
+
static void test_tagged_addr_abi_sysctl(void)
{
+ char *err_pwrite_msg = "failed to write to /proc/sys/abi/tagged_addr_disabled\n";
char value;
int fd;
@@ -200,14 +212,18 @@ static void test_tagged_addr_abi_sysctl(void)
}
value = '1';
- pwrite(fd, &value, 1, 0);
- ksft_test_result(set_tagged_addr_ctrl(min_pmlen, true) == -EINVAL,
- "sysctl disabled\n");
+ if (!pwrite_wrapper(fd, &value, 1, "write '1'"))
+ ksft_test_result_fail(err_pwrite_msg);
+ else
+ ksft_test_result(set_tagged_addr_ctrl(min_pmlen, true) == -EINVAL,
+ "sysctl disabled\n");
value = '0';
- pwrite(fd, &value, 1, 0);
- ksft_test_result(set_tagged_addr_ctrl(min_pmlen, true) == 0,
- "sysctl enabled\n");
+ if (!pwrite_wrapper(fd, &value, 1, "write '0'"))
+ ksft_test_result_fail(err_pwrite_msg);
+ else
+ ksft_test_result(set_tagged_addr_ctrl(min_pmlen, true) == 0,
+ "sysctl enabled\n");
set_tagged_addr_ctrl(0, false);
---
base-commit: 40384c840ea1944d7c5a392e8975ed088ecf0b37
change-id: 20241204-fix_warnings_pointer_masking_tests-3860e4f35429
--
- Charlie
This patch series includes some netns-related improvements and fixes for
rtnetlink, to make link creation more intuitive:
1) Creating a link in another net namespace doesn't conflict with link
names in the current one.
2) Refactor rtnetlink link creation. Create the link in the target
namespace directly.
So that
# ip link add netns ns1 link-netns ns2 tun0 type gre ...
will create tun0 in ns1, rather than create it in ns2 and move it to ns1.
It also won't conflict with another interface named "tun0" in the current netns.
Patch 01 serves 1) by avoiding link name conflicts across different netns.
To achieve 2), there're mainly 3 steps:
- Patch 02 packs newlink() parameters into a struct, including
the original "src_net" along with more netns context. No semantic
changes are introduced.
- Patches 03 ~ 07 convert device drivers to use the explicit netns
extracted from params.
- Patches 08 ~ 09 remove the old netns parameter, and convert
rtnetlink to create the device in the target netns directly.
Patches 10 ~ 11 add some tests for link name and link netns.
BTW please note there're some issues found in current code:
- In amt_newlink() drivers/net/amt.c:
amt->net = net;
...
amt->stream_dev = dev_get_by_index(net, ...
Uses net, but amt_lookup_upper_dev() only searches in dev_net.
So the AMT device may not be properly deleted if it's in a different
netns from lower dev.
- In gtp_newlink() in drivers/net/gtp.c:
gtp->net = src_net;
...
gn = net_generic(dev_net(dev), gtp_net_id);
list_add_rcu(&gtp->list, &gn->gtp_dev_list);
Uses src_net, but priv is linked to list in dev_net. So it may not be
properly deleted on removal of link netns.
- In pfcp_newlink() in drivers/net/pfcp.c:
pfcp->net = net;
...
pn = net_generic(dev_net(dev), pfcp_net_id);
list_add_rcu(&pfcp->list, &pn->pfcp_dev_list);
Same as above.
- In lowpan_newlink() in net/ieee802154/6lowpan/core.c:
wdev = dev_get_by_index(dev_net(ldev), nla_get_u32(tb[IFLA_LINK]));
Looks for IFLA_LINK in dev_net, but in theory the ifindex is defined
in link netns.
---
v7:
- Add selftest kconfig.
- Remove a duplicated test of ip6gre.
v6:
link: https://lore.kernel.org/all/20241218130909.2173-1-shaw.leon@gmail.com/
- Split prototype, driver and rtnetlink changes.
- Add more tests for link netns.
- Fix IPv6 tunnel net overwritten in ndo_init().
- Reorder variable declarations.
- Exclude a ip_tunnel-specific patch.
v5:
link: https://lore.kernel.org/all/20241209140151.231257-1-shaw.leon@gmail.com/
- Fix function doc in batman-adv.
- Include peer_net in rtnl newlink parameters.
v4:
link: https://lore.kernel.org/all/20241118143244.1773-1-shaw.leon@gmail.com/
- Pack newlink() parameters to a single struct.
- Use ynl async_msg_queue.empty() in selftest.
v3:
link: https://lore.kernel.org/all/20241113125715.150201-1-shaw.leon@gmail.com/
- Drop "netns_atomic" flag and module parameter. Add netns parameter to
newlink() instead, and convert drivers accordingly.
- Move python NetNSEnter helper to net selftest lib.
v2:
link: https://lore.kernel.org/all/20241107133004.7469-1-shaw.leon@gmail.com/
- Check NLM_F_EXCL to ensure only link creation is affected.
- Add self tests for link name/ifindex conflict and notifications
in different netns.
- Changes in dummy driver and ynl in order to add the test case.
v1:
link: https://lore.kernel.org/all/20241023023146.372653-1-shaw.leon@gmail.com/
Xiao Liang (11):
rtnetlink: Lookup device in target netns when creating link
rtnetlink: Pack newlink() params into struct
net: Use link netns in newlink() of rtnl_link_ops
ieee802154: 6lowpan: Use link netns in newlink() of rtnl_link_ops
net: ip_tunnel: Use link netns in newlink() of rtnl_link_ops
net: ipv6: Use link netns in newlink() of rtnl_link_ops
net: xfrm: Use link netns in newlink() of rtnl_link_ops
rtnetlink: Remove "net" from newlink params
rtnetlink: Create link directly in target net namespace
selftests: net: Add python context manager for netns entering
selftests: net: Add test cases for link and peer netns
drivers/infiniband/ulp/ipoib/ipoib_netlink.c | 11 +-
drivers/net/amt.c | 16 +-
drivers/net/bareudp.c | 11 +-
drivers/net/bonding/bond_netlink.c | 8 +-
drivers/net/can/dev/netlink.c | 4 +-
drivers/net/can/vxcan.c | 9 +-
.../ethernet/qualcomm/rmnet/rmnet_config.c | 11 +-
drivers/net/geneve.c | 11 +-
drivers/net/gtp.c | 9 +-
drivers/net/ipvlan/ipvlan.h | 4 +-
drivers/net/ipvlan/ipvlan_main.c | 15 +-
drivers/net/ipvlan/ipvtap.c | 10 +-
drivers/net/macsec.c | 15 +-
drivers/net/macvlan.c | 8 +-
drivers/net/macvtap.c | 11 +-
drivers/net/netkit.c | 9 +-
drivers/net/pfcp.c | 11 +-
drivers/net/ppp/ppp_generic.c | 10 +-
drivers/net/team/team_core.c | 7 +-
drivers/net/veth.c | 9 +-
drivers/net/vrf.c | 11 +-
drivers/net/vxlan/vxlan_core.c | 11 +-
drivers/net/wireguard/device.c | 11 +-
drivers/net/wireless/virtual/virt_wifi.c | 14 +-
drivers/net/wwan/wwan_core.c | 25 +++-
include/net/ip_tunnels.h | 5 +-
include/net/rtnetlink.h | 44 +++++-
net/8021q/vlan_netlink.c | 15 +-
net/batman-adv/soft-interface.c | 16 +-
net/bridge/br_netlink.c | 12 +-
net/caif/chnl_net.c | 6 +-
net/core/rtnetlink.c | 35 +++--
net/hsr/hsr_netlink.c | 14 +-
net/ieee802154/6lowpan/core.c | 9 +-
net/ipv4/ip_gre.c | 27 ++--
net/ipv4/ip_tunnel.c | 10 +-
net/ipv4/ip_vti.c | 10 +-
net/ipv4/ipip.c | 14 +-
net/ipv6/ip6_gre.c | 42 ++++--
net/ipv6/ip6_tunnel.c | 20 ++-
net/ipv6/ip6_vti.c | 16 +-
net/ipv6/sit.c | 18 ++-
net/xfrm/xfrm_interface_core.c | 15 +-
tools/testing/selftests/net/Makefile | 1 +
tools/testing/selftests/net/config | 5 +
.../testing/selftests/net/lib/py/__init__.py | 2 +-
tools/testing/selftests/net/lib/py/netns.py | 18 +++
tools/testing/selftests/net/link_netns.py | 141 ++++++++++++++++++
tools/testing/selftests/net/netns-name.sh | 10 ++
49 files changed, 550 insertions(+), 226 deletions(-)
create mode 100755 tools/testing/selftests/net/link_netns.py
--
2.47.1
When I implemented virtio's hash-related features to tun/tap [1],
I found tun/tap does not fill the entire region reserved for the virtio
header, leaving some uninitialized hole in the middle of the buffer
after read()/recvmsg().
This series fills the uninitialized hole. More concretely, the
num_buffers field will be initialized with 1, and the other fields will
be initialized with 0. Setting the num_buffers field to 1 is mandated by
virtio 1.0 [2].
The change to virtio header is preceded by another change that refactors
tun and tap to unify their virtio-related code.
[1]: https://lore.kernel.org/r/20241008-rss-v5-0-f3cf68df005d@daynix.com
[2]: https://lore.kernel.org/r/20241227084256-mutt-send-email-mst@kernel.org/
Signed-off-by: Akihiko Odaki <akihiko.odaki(a)daynix.com>
---
Akihiko Odaki (3):
tun: Unify vnet implementation
tun: Pad virtio header with zero
tun: Set num_buffers for virtio 1.0
MAINTAINERS | 1 +
drivers/net/Kconfig | 5 ++
drivers/net/Makefile | 1 +
drivers/net/tap.c | 174 ++++++----------------------------------
drivers/net/tun.c | 212 ++++++++-----------------------------------------
drivers/net/tun_vnet.c | 191 ++++++++++++++++++++++++++++++++++++++++++++
drivers/net/tun_vnet.h | 24 ++++++
7 files changed, 281 insertions(+), 327 deletions(-)
---
base-commit: a32e14f8aef69b42826cf0998b068a43d486a9e9
change-id: 20241230-tun-66e10a49b0c7
Best regards,
--
Akihiko Odaki <akihiko.odaki(a)daynix.com>
Implement comprehensive testing for netconsole userdata entry handling,
demonstrating correct behavior when creating maximum entries and
preventing unauthorized overflow.
Refactor existing test infrastructure to support modular, reusable
helper functions that validate strict entry limit enforcement.
Also, add a warning if update_userdata() sees more than
MAX_USERDATA_ITEMS entries. This shouldn't happen and it is a bug that
shouldn't be silently ignored.
Signed-off-by: Breno Leitao <leitao(a)debian.org>
---
Changes in v2:
- Add the new script (netcons_overflow.sh) in
tools/testing/selftests/drivers/net/Makefile as suggested by Simon
Horman
- Link to v1: https://lore.kernel.org/r/20241204-netcons_overflow_test-v1-0-a85a8d0ace21@…
---
Breno Leitao (4):
netconsole: Warn if MAX_USERDATA_ITEMS limit is exceeded
netconsole: selftest: Split the helpers from the selftest
netconsole: selftest: Delete all userdata keys
netconsole: selftest: verify userdata entry limit
MAINTAINERS | 3 +-
drivers/net/netconsole.c | 2 +-
tools/testing/selftests/drivers/net/Makefile | 1 +
.../selftests/drivers/net/lib/sh/lib_netcons.sh | 225 +++++++++++++++++++++
.../testing/selftests/drivers/net/netcons_basic.sh | 218 +-------------------
.../selftests/drivers/net/netcons_overflow.sh | 67 ++++++
6 files changed, 297 insertions(+), 219 deletions(-)
---
base-commit: 94c16fd4df9089931f674fb9aaec41ea20b0fd7a
change-id: 20241204-netcons_overflow_test-eaf735d1f743
Best regards,
--
Breno Leitao <leitao(a)debian.org>
After commit b1f202060afe ("mm: remap unused subpages to shared zeropage
when splitting isolated thp"), cow test cases involving swapping out
THPs via madvise(MADV_PAGEOUT) started to be skipped due to the
subsequent check via pagemap determining that the memory was not
actually swapped out. Logs similar to this were emitted:
...
# [RUN] Basic COW after fork() ... with swapped-out, PTE-mapped THP (16 kB)
ok 2 # SKIP MADV_PAGEOUT did not work, is swap enabled?
# [RUN] Basic COW after fork() ... with single PTE of swapped-out THP (16 kB)
ok 3 # SKIP MADV_PAGEOUT did not work, is swap enabled?
# [RUN] Basic COW after fork() ... with swapped-out, PTE-mapped THP (32 kB)
ok 4 # SKIP MADV_PAGEOUT did not work, is swap enabled?
...
The commit in question introduces the behaviour of scanning THPs and if
their content is predominantly zero, it splits them and replaces the
pages which are wholly zero with the zero page. These cow test cases
were getting caught up in this.
So let's avoid that by filling the contents of all allocated memory with
a non-zero value. With this in place, the tests are passing again.
Signed-off-by: Ryan Roberts <ryan.roberts(a)arm.com>
---
Applies on top of mm-unstable (f349e79bfbf3)
Thanks,
Ryan
tools/testing/selftests/mm/cow.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/mm/cow.c b/tools/testing/selftests/mm/cow.c
index 32c6ccc2a6be..1238e1c5aae1 100644
--- a/tools/testing/selftests/mm/cow.c
+++ b/tools/testing/selftests/mm/cow.c
@@ -758,7 +758,7 @@ static void do_run_with_base_page(test_fn fn, bool swapout)
}
/* Populate a base page. */
- memset(mem, 0, pagesize);
+ memset(mem, 1, pagesize);
if (swapout) {
madvise(mem, pagesize, MADV_PAGEOUT);
@@ -824,12 +824,12 @@ static void do_run_with_thp(test_fn fn, enum thp_run thp_run, size_t thpsize)
* Try to populate a THP. Touch the first sub-page and test if
* we get the last sub-page populated automatically.
*/
- mem[0] = 0;
+ mem[0] = 1;
if (!pagemap_is_populated(pagemap_fd, mem + thpsize - pagesize)) {
ksft_test_result_skip("Did not get a THP populated\n");
goto munmap;
}
- memset(mem, 0, thpsize);
+ memset(mem, 1, thpsize);
size = thpsize;
switch (thp_run) {
@@ -1012,7 +1012,7 @@ static void run_with_hugetlb(test_fn fn, const char *desc, size_t hugetlbsize)
}
/* Populate an huge page. */
- memset(mem, 0, hugetlbsize);
+ memset(mem, 1, hugetlbsize);
/*
* We need a total of two hugetlb pages to handle COW/unsharing
--
2.43.0
On Wed, Jan 08, 2025 at 11:39:40AM +0530, Dev Jain wrote:
>
> On 07/01/25 8:44 pm, Thomas Weißschuh wrote:
> > During the execution of validate_complete_va_space() a lot of memory is
> > on the VM subsystem. When running on a low memory subsystem an OOM may
> > be triggered, when writing to the dump file as the filesystem may also
> > require memory.
> >
> > On my test system with 1100MiB physical memory:
> >
> > Tasks state (memory values in pages):
> > [ pid ] uid tgid total_vm rss rss_anon rss_file rss_shmem pgtables_bytes swapents oom_score_adj name
> > [ 57] 0 57 34359215953 695 256 0 439 1064390656 0 0 virtual_address
> >
> > Out of memory: Killed process 57 (virtual_address) total-vm:137436863812kB, anon-rss:1024kB, file-rss:0kB, shmem-rss:1756kB, UID:0 pgtables:1039444kB oom_score_adj:0
> > <snip>
> > fault_in_iov_iter_readable+0x4a/0xd0
> > generic_perform_write+0x9c/0x280
> > shmem_file_write_iter+0x86/0x90
> > vfs_write+0x29c/0x480
> > ksys_write+0x6c/0xe0
> > do_syscall_64+0x9e/0x1a0
> > entry_SYSCALL_64_after_hwframe+0x77/0x7f
> >
> > Write the dumped data into /dev/null instead which does not require
> > additional memory during write(), making the code simpler as a
> > side-effect.
> >
> > Signed-off-by: Thomas Weißschuh <thomas.weissschuh(a)linutronix.de>
> > ---
> > tools/testing/selftests/mm/virtual_address_range.c | 6 ++----
> > 1 file changed, 2 insertions(+), 4 deletions(-)
> >
> > diff --git a/tools/testing/selftests/mm/virtual_address_range.c b/tools/testing/selftests/mm/virtual_address_range.c
> > index 484f82c7b7c871f82a7d9ec6d6c649f2ab1eb0cd..4042fd878acd702d23da2c3293292de33bd48143 100644
> > --- a/tools/testing/selftests/mm/virtual_address_range.c
> > +++ b/tools/testing/selftests/mm/virtual_address_range.c
> > @@ -103,10 +103,9 @@ static int validate_complete_va_space(void)
> > FILE *file;
> > int fd;
> > - fd = open("va_dump", O_CREAT | O_WRONLY, 0600);
> > - unlink("va_dump");
> > + fd = open("/dev/null", O_WRONLY);
> > if (fd < 0) {
> > - ksft_test_result_skip("cannot create or open dump file\n");
> > + ksft_test_result_skip("cannot create or open /dev/null\n");
> > ksft_finished();
> > }
> > @@ -152,7 +151,6 @@ static int validate_complete_va_space(void)
> > while (start_addr + hop < end_addr) {
> > if (write(fd, (void *)(start_addr + hop), 1) != 1)
> > return 1;
> > - lseek(fd, 0, SEEK_SET);
> > hop += MAP_CHUNK_SIZE;
> > }
> >
>
> The reason I had not used /dev/null was that write() was succeeding to /dev/null
> even from an address not in my VA space. I was puzzled about this behaviour of
> /dev/null and I chose to ignore it and just use a real file.
That makes sense and I can reproduce your example.
Switching to another dummy file which reads the written data like
/dev/random also leads to OOM, so wouldn't help either.
Thanks for the explanation.
@Andrew, could you drop this patch?
> To test this behaviour, run the following program:
[..]
PS: Your mail contained HTML and did not make it to the list archives.
(And the text variant of the example program was corrupted)
This patch series implements a new char misc driver, /dev/ntsync, which is used
to implement Windows NT synchronization primitives.
NT synchronization primitives are unique in that the wait functions are
vectored, operate on multiple types of object with different behaviour (mutex,
semaphore, event), and affect the state of the objects they wait on. This model
is not compatible with existing kernel synchronization objects or interfaces,
and therefore the ntsync driver implements its own wait queues and locking.
This patch series is rebased against the "char-misc-next" branch of
gregkh/char-misc.git.
== Background ==
The Wine project emulates the Windows API in user space. One particular part of
that API, namely the NT synchronization primitives, have historically been
implemented via RPC to a dedicated "kernel" process. However, more recent
applications use these APIs more strenuously, and the overhead of RPC has become
a bottleneck.
The NT synchronization APIs are too complex to implement on top of existing
primitives without sacrificing correctness. Certain operations, such as
NtPulseEvent() or the "wait-for-all" mode of NtWaitForMultipleObjects(), require
direct control over the underlying wait queue, and implementing a wait queue
sufficiently robust for Wine in user space is not possible. This proposed
driver, therefore, implements the problematic interfaces directly in the Linux
kernel.
This driver was presented at Linux Plumbers Conference 2023. For those further
interested in the history of synchronization in Wine and past attempts to solve
this problem in user space, a recording of the presentation can be viewed here:
https://www.youtube.com/watch?v=NjU4nyWyhU8
== Performance ==
The performance measurements described below are copied from earlier versions of
the patch set. While some of the code has changed, I do not currently anticipate
that it has changed drastically enough to affect those measurements.
The gain in performance varies wildly depending on the application in question
and the user's hardware. For some games NT synchronization is not a bottleneck
and no change can be observed, but for others frame rate improvements of 50 to
150 percent are not atypical. The following table lists frame rate measurements
from a variety of games on a variety of hardware, taken by users Dmitry
Skvortsov, FuzzyQuils, OnMars, and myself:
Game                            Upstream    ntsync    improvement
===========================================================================
Anger Foot                          69        99          43%
Call of Juarez                      99.8     224.1       125%
Dirt 3                             110.6     860.7       678%
Forza Horizon 5                    108       160          48%
Lara Croft: Temple of Osiris       141       326         131%
Metro 2033                         164.4     199.2        21%
Resident Evil 2                     26        77         196%
The Crew                            26        51          96%
Tiny Tina's Wonderlands            130       360         177%
Total War Saga: Troy               109       146          34%
===========================================================================
== Patches ==
The semantics of the patches are broadly intended to match those of the
corresponding Windows functions. For those not already familiar with the Windows
functions (or their undocumented behaviour), patch 29/30 provides a detailed
specification, and individual patches also include a brief description of the
API they are implementing.
The patches making use of this driver in Wine can be retrieved or browsed here:
https://repo.or.cz/wine/zf.git/shortlog/refs/heads/ntsync7
== Previous versions ==
Changes from v6:
* rename NTSYNC_IOC_SEM_POST to NTSYNC_IOC_SEM_RELEASE (matching the NT
terminology instead of POSIX),
* change object creation ioctls to return the fds directly in the return value
instead of through the args struct, which simplifies the API a bit.
* Link to v6: https://lore.kernel.org/lkml/20241209185904.507350-1-zfigura@codeweavers.co…
* Link to v5: https://lore.kernel.org/lkml/20240519202454.1192826-1-zfigura@codeweavers.c…
* Link to v4: https://lore.kernel.org/lkml/20240416010837.333694-1-zfigura@codeweavers.co…
* Link to v3: https://lore.kernel.org/lkml/20240329000621.148791-1-zfigura@codeweavers.co…
* Link to v2: https://lore.kernel.org/lkml/20240219223833.95710-1-zfigura@codeweavers.com/
* Link to v1: https://lore.kernel.org/lkml/20240214233645.9273-1-zfigura@codeweavers.com/
* Link to RFC v2: https://lore.kernel.org/lkml/20240131021356.10322-1-zfigura@codeweavers.com/
* Link to RFC v1: https://lore.kernel.org/lkml/20240124004028.16826-1-zfigura@codeweavers.com/
Elizabeth Figura (30):
ntsync: Return the fd from NTSYNC_IOC_CREATE_SEM.
ntsync: Rename NTSYNC_IOC_SEM_POST to NTSYNC_IOC_SEM_RELEASE.
ntsync: Introduce NTSYNC_IOC_WAIT_ANY.
ntsync: Introduce NTSYNC_IOC_WAIT_ALL.
ntsync: Introduce NTSYNC_IOC_CREATE_MUTEX.
ntsync: Introduce NTSYNC_IOC_MUTEX_UNLOCK.
ntsync: Introduce NTSYNC_IOC_MUTEX_KILL.
ntsync: Introduce NTSYNC_IOC_CREATE_EVENT.
ntsync: Introduce NTSYNC_IOC_EVENT_SET.
ntsync: Introduce NTSYNC_IOC_EVENT_RESET.
ntsync: Introduce NTSYNC_IOC_EVENT_PULSE.
ntsync: Introduce NTSYNC_IOC_SEM_READ.
ntsync: Introduce NTSYNC_IOC_MUTEX_READ.
ntsync: Introduce NTSYNC_IOC_EVENT_READ.
ntsync: Introduce alertable waits.
selftests: ntsync: Add some tests for semaphore state.
selftests: ntsync: Add some tests for mutex state.
selftests: ntsync: Add some tests for NTSYNC_IOC_WAIT_ANY.
selftests: ntsync: Add some tests for NTSYNC_IOC_WAIT_ALL.
selftests: ntsync: Add some tests for wakeup signaling with
WINESYNC_IOC_WAIT_ANY.
selftests: ntsync: Add some tests for wakeup signaling with
WINESYNC_IOC_WAIT_ALL.
selftests: ntsync: Add some tests for manual-reset event state.
selftests: ntsync: Add some tests for auto-reset event state.
selftests: ntsync: Add some tests for wakeup signaling with events.
selftests: ntsync: Add tests for alertable waits.
selftests: ntsync: Add some tests for wakeup signaling via alerts.
selftests: ntsync: Add a stress test for contended waits.
maintainers: Add an entry for ntsync.
docs: ntsync: Add documentation for the ntsync uAPI.
ntsync: No longer depend on BROKEN.
Documentation/userspace-api/index.rst | 1 +
Documentation/userspace-api/ntsync.rst | 385 +++++
MAINTAINERS | 9 +
drivers/misc/Kconfig | 1 -
drivers/misc/ntsync.c | 992 +++++++++++-
include/uapi/linux/ntsync.h | 42 +-
tools/testing/selftests/Makefile | 1 +
.../selftests/drivers/ntsync/.gitignore | 1 +
.../testing/selftests/drivers/ntsync/Makefile | 7 +
tools/testing/selftests/drivers/ntsync/config | 1 +
.../testing/selftests/drivers/ntsync/ntsync.c | 1343 +++++++++++++++++
11 files changed, 2767 insertions(+), 16 deletions(-)
create mode 100644 Documentation/userspace-api/ntsync.rst
create mode 100644 tools/testing/selftests/drivers/ntsync/.gitignore
create mode 100644 tools/testing/selftests/drivers/ntsync/Makefile
create mode 100644 tools/testing/selftests/drivers/ntsync/config
create mode 100644 tools/testing/selftests/drivers/ntsync/ntsync.c
base-commit: cdd30ebb1b9f36159d66f088b61aee264e649d7a
--
2.45.2
Hi all,
This patch series continues the work to migrate the *.sh tests into
prog_tests.
test_xdp_redirect.sh tests the XDP redirections done through
bpf_redirect().
These XDP redirections are already tested by prog_tests/xdp_do_redirect.c
but IMO it doesn't cover the exact same code path because
xdp_do_redirect.c uses bpf_prog_test_run_opts() to trigger redirections
of 'fake packets' while test_xdp_redirect.sh redirects packets coming
from the network. Also, the test_xdp_redirect.sh script tests the
redirections with both SKB and DRV modes while xdp_do_redirect.c only
tests the DRV mode.
The patch series adds two new test cases in prog_tests/xdp_do_redirect.c
to replace the test_xdp_redirect.sh script.
Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet(a)bootlin.com>
---
Bastien Curutchet (eBPF Foundation) (3):
selftests/bpf: test_xdp_redirect: Rename BPF sections
selftests/bpf: Migrate test_xdp_redirect.sh to xdp_do_redirect.c
selftests/bpf: Migrate test_xdp_redirect.c to test_xdp_do_redirect.c
tools/testing/selftests/bpf/Makefile | 1 -
.../selftests/bpf/prog_tests/xdp_do_redirect.c | 192 +++++++++++++++++++++
.../selftests/bpf/progs/test_xdp_do_redirect.c | 12 ++
.../selftests/bpf/progs/test_xdp_redirect.c | 26 ---
tools/testing/selftests/bpf/test_xdp_redirect.sh | 79 ---------
5 files changed, 204 insertions(+), 106 deletions(-)
---
base-commit: da86bde1e6d1b887efc46af5ee1f9bbccd27233e
change-id: 20241219-xdp_redirect-2b8ec79dc24e
Best regards,
--
Bastien Curutchet (eBPF Foundation) <bastien.curutchet(a)bootlin.com>
The 2024 architecture release includes a number of data processing
extensions, mostly SVE and SME additions with a few others. These are
all very straightforward extensions which add instructions but no
architectural state so only need hwcaps and exposing of the ID registers
to KVM guests and userspace.
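For readers less familiar with hwcaps, a minimal sketch of how userspace
typically consumes them is shown below. It is not taken from the series:
cpu_has_hwcap() is a hypothetical helper name, and the mask passed in would be
one of the HWCAP bits from the uapi header, which are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/auxv.h>

/*
 * Hypothetical helper: report whether every bit in `mask` is set in the
 * auxiliary-vector entry `type` (for example AT_HWCAP or AT_HWCAP2).
 */
bool cpu_has_hwcap(unsigned long type, unsigned long mask)
{
	/* An empty mask would make the check vacuous. */
	assert(mask != 0);

	return (getauxval(type) & mask) == mask;
}
```

A program would call this once at startup and select an optimised code path
only when the relevant bit is reported.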
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Changes in v4:
- Fix encodings for ID_AA64ISAR3_EL1.
- Link to v3: https://lore.kernel.org/r/20241203-arm64-2024-dpisa-v3-0-a6c78b1aa297@kerne…
Changes in v3:
- Commit log update for the hwcap test.
- Link to v2: https://lore.kernel.org/r/20241030-arm64-2024-dpisa-v2-0-b6601a15d2a5@kerne…
Changes in v2:
- Filter KVM guest visible bitfields in ID_AA64ISAR3_EL1 to only those
we make writeable.
- Link to v1: https://lore.kernel.org/r/20241028-arm64-2024-dpisa-v1-0-a38d08b008a8@kerne…
---
Mark Brown (9):
arm64/sysreg: Update ID_AA64PFR2_EL1 to DDI0601 2024-09
arm64/sysreg: Update ID_AA64ISAR3_EL1 to DDI0601 2024-09
arm64/sysreg: Update ID_AA64FPFR0_EL1 to DDI0601 2024-09
arm64/sysreg: Update ID_AA64ZFR0_EL1 to DDI0601 2024-09
arm64/sysreg: Update ID_AA64SMFR0_EL1 to DDI0601 2024-09
arm64/sysreg: Update ID_AA64ISAR2_EL1 to DDI0601 2024-09
arm64/hwcap: Describe 2024 dpISA extensions to userspace
KVM: arm64: Allow control of dpISA extensions in ID_AA64ISAR3_EL1
kselftest/arm64: Add 2024 dpISA extensions to hwcap test
Documentation/arch/arm64/elf_hwcaps.rst | 51 ++++++
arch/arm64/include/asm/hwcap.h | 17 ++
arch/arm64/include/uapi/asm/hwcap.h | 17 ++
arch/arm64/kernel/cpufeature.c | 35 ++++
arch/arm64/kernel/cpuinfo.c | 17 ++
arch/arm64/kvm/sys_regs.c | 6 +-
arch/arm64/tools/sysreg | 87 +++++++++-
tools/testing/selftests/arm64/abi/hwcap.c | 273 +++++++++++++++++++++++++++++-
8 files changed, 493 insertions(+), 10 deletions(-)
---
base-commit: 40384c840ea1944d7c5a392e8975ed088ecf0b37
change-id: 20241008-arm64-2024-dpisa-8091074a7f48
Best regards,
--
Mark Brown <broonie(a)kernel.org>
This series brings various cleanups and fixes for the mm (mostly
pkeys) kselftests. The original goal was to make the pkeys tests work
out of the box and without build warnings - it turned out to be more
involved than expected.
The most important change is enabling -O2 when building all mm
kselftests (patch 5). This is actually needed for the pkeys tests to run
successfully (see gcc command line at the top of protection_keys.c and
pkey_sighandler_tests.c), and seems to have no negative impact on the
other tests. It certainly can't hurt performance!
The following patches address a few obvious issues in the pkeys tests
(unused code, bad scope for functions/variables, etc.) and finally make
a couple of small improvements.
There is one ugliness that this series does not fix: some functions in
pkey-<arch>.h call functions that are actually defined in
protection_keys.c. For instance, expect_fault_on_read_execonly_key() in
pkey-x86.h calls expected_pkey_fault(). This means that other test
programs that use pkey-helpers.h (namely pkey_sighandler_tests) would
fail to link if they called such functions defined in pkey-<arch>.h.
Fixing this would require a more comprehensive reorganisation of the
pkey-* headers, which doesn't seem worth it (patch 9 adds a comment to
pkey-helpers.h to clarify the situation).
Some more details on the patches:
- Patch 1 is an unrelated fix that was revealed by inspecting a warning.
It seems fairly harmless though, so I thought I'd just post it as part
of this series.
- Patches 2-5 fix various warnings that come up when building the mm tests
at -O2 and finally enable -O2.
- Patches 6-12 are various cleanups for the pkeys tests. Patch 11 in
particular enables is_pkeys_supported() to be called from outside
protection_keys.c (patch 13 relies on this).
- Patches 13-14 are small improvements to pkey_sighandler_tests.c.
Many thanks to Ryan Roberts for checking that the mm tests still run
fine on arm64 with those patches applied. I've also checked that the
pkeys tests run fine on arm64 and x86.
- Kevin
---
Cc: akpm(a)linux-foundation.org
Cc: aruna.ramakrishna(a)oracle.com
Cc: catalin.marinas(a)arm.com
Cc: dave.hansen(a)linux.intel.com
Cc: joey.gouly(a)arm.com
Cc: keith.lucas(a)oracle.com
Cc: ryan.roberts(a)arm.com
Cc: shuah(a)kernel.org
Cc: linux-arm-kernel(a)lists.infradead.org
Cc: linux-kselftest(a)vger.kernel.org
Cc: x86(a)kernel.org
---
Kevin Brodsky (14):
selftests/mm: Fix condition in uffd_move_test_common()
selftests/mm: Fix -Wmaybe-uninitialized warnings
selftests/mm: Fix strncpy() length
selftests/mm: Fix -Warray-bounds warnings in pkey_sighandler_tests
selftests/mm: Build with -O2
selftests/mm: Remove unused pkey helpers
selftests/mm: Define types using typedef in pkey-helpers.h
selftests/mm: Ensure pkey-*.h define inline functions only
selftests/mm: Remove empty pkey helper definition
selftests/mm: Ensure non-global pkey symbols are marked static
selftests/mm: Use sys_pkey helpers consistently
selftests/mm: Rename pkey register macro
selftests/mm: Skip pkey_sighandler_tests if support is missing
selftests/mm: Remove X permission from sigaltstack mapping
tools/testing/selftests/mm/Makefile | 6 +-
tools/testing/selftests/mm/ksm_tests.c | 2 +-
tools/testing/selftests/mm/mremap_test.c | 2 +-
tools/testing/selftests/mm/pkey-arm64.h | 6 +-
tools/testing/selftests/mm/pkey-helpers.h | 61 ++---
tools/testing/selftests/mm/pkey-powerpc.h | 4 +-
tools/testing/selftests/mm/pkey-x86.h | 6 +-
.../selftests/mm/pkey_sighandler_tests.c | 32 +--
tools/testing/selftests/mm/pkey_util.c | 40 ++++
tools/testing/selftests/mm/protection_keys.c | 212 +++++++-----------
tools/testing/selftests/mm/soft-dirty.c | 2 +-
tools/testing/selftests/mm/uffd-unit-tests.c | 4 +-
.../testing/selftests/mm/write_to_hugetlbfs.c | 2 +-
13 files changed, 163 insertions(+), 216 deletions(-)
create mode 100644 tools/testing/selftests/mm/pkey_util.c
--
2.47.0
The recently introduced guard-pages mm selftest uses the
process_madvise() syscall, a wrapper for which was added to glibc v2.36.
For those of us stuck with older distributions this causes a compile
error when compiling the mm selftests. For example Ubuntu 22.04 uses
glibc 2.35, which does not have the wrapper.
To work around the issue, let's introduce our own static
process_madvise() wrapper that uses glibc's syscall() helper.
While we are at it, add the guard-page test suite to run_vmtests.sh so
that it can be automatically run by CI systems.
Signed-off-by: Ryan Roberts <ryan.roberts(a)arm.com>
---
Applies on top of mm-unstable (f349e79bfbf3)
Thanks,
Ryan
tools/testing/selftests/mm/guard-pages.c | 10 ++++++++--
tools/testing/selftests/mm/run_vmtests.sh | 5 +++++
2 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/mm/guard-pages.c b/tools/testing/selftests/mm/guard-pages.c
index d8f8dee9ebbd..ece37212a8a2 100644
--- a/tools/testing/selftests/mm/guard-pages.c
+++ b/tools/testing/selftests/mm/guard-pages.c
@@ -55,6 +55,12 @@ static int pidfd_open(pid_t pid, unsigned int flags)
return syscall(SYS_pidfd_open, pid, flags);
}
+static ssize_t sys_process_madvise(int pidfd, const struct iovec *iovec,
+ size_t n, int advice, unsigned int flags)
+{
+ return syscall(__NR_process_madvise, pidfd, iovec, n, advice, flags);
+}
+
/*
* Enable our signal catcher and try to read/write the specified buffer. The
* return value indicates whether the read/write succeeds without a fatal
@@ -419,7 +425,7 @@ TEST_F(guard_pages, process_madvise)
ASSERT_EQ(munmap(&ptr_region[99 * page_size], page_size), 0);
/* Now guard in one step. */
- count = process_madvise(pidfd, vec, 6, MADV_GUARD_INSTALL, 0);
+ count = sys_process_madvise(pidfd, vec, 6, MADV_GUARD_INSTALL, 0);
/* OK we don't have permission to do this, skip. */
if (count == -1 && errno == EPERM)
@@ -440,7 +446,7 @@ TEST_F(guard_pages, process_madvise)
ASSERT_FALSE(try_read_write_buf(&ptr3[19 * page_size]));
/* Now do the same with unguard... */
- count = process_madvise(pidfd, vec, 6, MADV_GUARD_REMOVE, 0);
+ count = sys_process_madvise(pidfd, vec, 6, MADV_GUARD_REMOVE, 0);
/* ...and everything should now succeed. */
diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/selftests/mm/run_vmtests.sh
index 2fc290d9430c..00c3f07ea100 100755
--- a/tools/testing/selftests/mm/run_vmtests.sh
+++ b/tools/testing/selftests/mm/run_vmtests.sh
@@ -45,6 +45,8 @@ separated by spaces:
vmalloc smoke tests
- hmm
hmm smoke tests
+- madv_guard
+ test madvise(2) MADV_GUARD_INSTALL and MADV_GUARD_REMOVE options
- madv_populate
test memadvise(2) MADV_POPULATE_{READ,WRITE} options
- memfd_secret
@@ -375,6 +377,9 @@ CATEGORY="mremap" run_test ./mremap_dontunmap
CATEGORY="hmm" run_test bash ./test_hmm.sh smoke
+# MADV_GUARD_INSTALL and MADV_GUARD_REMOVE tests
+CATEGORY="madv_guard" run_test ./guard-pages
+
# MADV_POPULATE_READ and MADV_POPULATE_WRITE tests
CATEGORY="madv_populate" run_test ./madv_populate
--
2.43.0
Currently, kselftests does not have a generalised mechanism to skip compilation
and run tests when required kernel configuration options are disabled.
This patch series addresses this issue by checking whether all required configs
from selftest/<test>/config are enabled in the current kernel.
Siddharth Menon (2):
selftests: Introduce script to validate required configs
selftests/lib.mk: Introduce check to validate required configs
tools/testing/selftests/lib.mk | 11 ++-
tools/testing/selftests/mktest.pl | 138 ++++++++++++++++++++++++++++++
2 files changed, 147 insertions(+), 2 deletions(-)
mode change 100644 => 100755 tools/testing/selftests/lib.mk
create mode 100755 tools/testing/selftests/mktest.pl
--
2.39.5
If you wish to utilise a pidfd interface to refer to the current process or
thread it is rather cumbersome, requiring something like:
int pidfd = pidfd_open(getpid(), 0 or PIDFD_THREAD);
...
close(pidfd);
Or the equivalent call opening /proc/self. It is more convenient to use a
sentinel value to indicate to an interface that accepts a pidfd that we
simply wish to refer to the current process or thread.
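To make the cost of the current approach concrete, here is a rough,
self-contained sketch of the open/use/close pattern described above. It is
not taken from the series: self_pidfd_open() and signal_self_via_pidfd() are
hypothetical names, and raw syscall() is used on the assumption that the libc
may not export pidfd wrappers.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper: open a pidfd that refers to the calling process. */
int self_pidfd_open(unsigned int flags)
{
	return (int)syscall(SYS_pidfd_open, getpid(), flags);
}

/*
 * Send `sig` to ourselves through a temporary pidfd. Signal 0 performs only
 * the existence and permission checks, so nothing is delivered.
 * Returns 0 on success or a negative errno value.
 */
int signal_self_via_pidfd(int sig)
{
	int pidfd, ret = 0;

	assert(sig >= 0);

	pidfd = self_pidfd_open(0);
	if (pidfd < 0)
		return -errno;

	if (syscall(SYS_pidfd_send_signal, pidfd, sig, NULL, 0) < 0)
		ret = -errno;

	/* The fd exists only to name ourselves, so drop it straight away. */
	close(pidfd);
	return ret;
}
```

Every caller that only wants to say "myself" pays for an extra syscall pair
and has to get the fd lifetime right, which is the overhead a sentinel value
avoids.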
This series introduces sentinels for this purpose, which can be passed as
the pidfd in this instance rather than having to establish a dummy fd.
It is useful to refer to both the current thread from the userland's
perspective for which we use PIDFD_SELF, and the current process from the
userland's perspective, for which we use PIDFD_SELF_PROCESS.
There is unfortunately some confusion between the kernel and userland as to
what constitutes a process - a thread from the userland perspective is a
process in the kernel, and a userland process is a thread group (more
specifically the thread group leader) from the kernel perspective. We
therefore alias things thusly:
* PIDFD_SELF_THREAD aliased by PIDFD_SELF - use PIDTYPE_PID.
* PIDFD_SELF_THREAD_GROUP aliased by PIDFD_SELF_PROCESS - use PIDTYPE_TGID.
In all of the kernel code we refer to PIDFD_SELF_THREAD and
PIDFD_SELF_THREAD_GROUP. However we expect users to use PIDFD_SELF and
PIDFD_SELF_PROCESS.
This matters for cases where, for instance, a user unshare()'s FDs or does
thread-specific signal handling and where the user would be hugely confused
if the FDs referenced or signal processed referred to the thread group
leader rather than the individual thread.
We ensure that pidfd_send_signal() and pidfd_getfd() work correctly, and
assert as much in selftests. All other interfaces except setns() will work
implicitly with this new interface, however it doesn't make sense to test
waitid(P_PIDFD, ...) as waiting on ourselves is a blocking operation.
In the case of setns() we explicitly disallow use of PIDFD_SELF* as it
doesn't make sense to obtain the namespaces of our own process, and it
would require work to implement this functionality there that would be of
no use.
We also do not provide the ability to utilise PIDFD_SELF* in ordinary fd
operations such as open() or poll(), as this would require extensive work
and be of no real use.
v6:
* Avoid static inline in UAPI header as suggested by Pedro.
* Place PIDFD_SELF values out of range of errors and any other sentinel as
suggested by Pedro.
v5:
* Fixup self test dependencies on pidfd/pidfd.h.
https://lore.kernel.org/linux-mm/cover.1729848252.git.lorenzo.stoakes@oracl…
v4:
* Avoid returning an fd in the __pidfd_get_pid() function as pointed out by
Christian, instead simply always pin the pid and maintain fd scope in the
helper alone.
* Add wrapper header file in tools/include/linux to allow for import of
UAPI pidfd.h header without encountering the collision between system
fcntl.h and linux/fcntl.h as discussed with Shuah and John.
* Fixup tests to import the UAPI pidfd.h header working around conflicts
between system fcntl.h and linux/fcntl.h which the UAPI pidfd.h imports,
as reported by Shuah.
* Use an int for pidfd_is_self_sentinel() to avoid any dependency on
stdbool.h in userland.
https://lore.kernel.org/linux-mm/cover.1729198898.git.lorenzo.stoakes@oracl…
v3:
* Do not fput() an invalid fd as reported by kernel test bot.
* Fix unintended churn from moving variable declaration.
https://lore.kernel.org/linux-mm/cover.1729073310.git.lorenzo.stoakes@oracl…
v2:
* Fix tests as reported by Shuah.
* Correct RFC version lore link.
https://lore.kernel.org/linux-mm/cover.1728643714.git.lorenzo.stoakes@oracl…
Non-RFC v1:
* Removed RFC tag - there seems to be general consensus that this change is
a good idea, but perhaps some debate to be had on implementation. It
seems sensible then to move forward with the RFC flag removed.
* Introduced PIDFD_SELF_THREAD, PIDFD_SELF_THREAD_GROUP and their aliases
PIDFD_SELF and PIDFD_SELF_PROCESS respectively.
* Updated testing accordingly.
https://lore.kernel.org/linux-mm/cover.1728578231.git.lorenzo.stoakes@oracl…
RFC version:
https://lore.kernel.org/linux-mm/cover.1727644404.git.lorenzo.stoakes@oracl…
Lorenzo Stoakes (5):
pidfd: extend pidfd_get_pid() and de-duplicate pid lookup
pidfd: add PIDFD_SELF_* sentinels to refer to own thread/process
tools: testing: separate out wait_for_pid() into helper header
selftests: pidfd: add pidfd.h UAPI wrapper
selftests: pidfd: add tests for PIDFD_SELF_*
include/linux/pid.h | 34 ++++-
include/uapi/linux/pidfd.h | 10 ++
kernel/exit.c | 4 +-
kernel/nsproxy.c | 1 +
kernel/pid.c | 65 +++++---
kernel/signal.c | 29 +---
tools/include/linux/pidfd.h | 14 ++
tools/testing/selftests/cgroup/test_kill.c | 2 +-
.../pid_namespace/regression_enomem.c | 2 +-
tools/testing/selftests/pidfd/Makefile | 3 +-
tools/testing/selftests/pidfd/pidfd.h | 28 +---
.../selftests/pidfd/pidfd_getfd_test.c | 141 ++++++++++++++++++
tools/testing/selftests/pidfd/pidfd_helpers.h | 39 +++++
.../selftests/pidfd/pidfd_setns_test.c | 11 ++
tools/testing/selftests/pidfd/pidfd_test.c | 76 ++++++++--
15 files changed, 371 insertions(+), 88 deletions(-)
create mode 100644 tools/include/linux/pidfd.h
create mode 100644 tools/testing/selftests/pidfd/pidfd_helpers.h
--
2.47.0
As the vIOMMU infrastructure series part-3, this introduces a new vEVENTQ
object. The existing FAULT object provides a nice notification pathway to
the user space with a queue already, so let vEVENTQ reuse that.
Mimicking the HWPT structure, add a common EVENTQ structure to support its
derivatives: IOMMUFD_OBJ_FAULT (existing) and IOMMUFD_OBJ_VEVENTQ (new).
IOMMUFD_CMD_VEVENTQ_ALLOC is introduced to allocate vEVENTQ objects for
vIOMMUs. One vIOMMU can have multiple vEVENTQs of different types but
cannot support multiple vEVENTQs of the same type.
The forwarding part is fairly simple but might need to replace a physical
device ID with a virtual device ID in a driver-level event data structure.
So, this also adds some helpers for drivers to use.
As usual, this series comes with the selftest coverage for this new ioctl
and with a real world use case in the ARM SMMUv3 driver.
This is on Github:
https://github.com/nicolinc/iommufd/commits/iommufd_veventq-v4
Testing with RMR patches for MSI:
https://github.com/nicolinc/iommufd/commits/iommufd_veventq-v4-with-rmr
Pairing QEMU branch for testing:
https://github.com/nicolinc/qemu/commits/wip/for_iommufd_veventq-v4
Changelog
v4
* Rename "vIRQ" to "vEVENTQ"
* Use flexible array in struct iommufd_vevent
* Add the new ioctl command to union ucmd_buffer
* Fix the alphabetical order in union ucmd_buffer too
* Rename _TYPE_NONE to _TYPE_DEFAULT aligning with vIOMMU naming
v3
https://lore.kernel.org/all/cover.1734477608.git.nicolinc@nvidia.com/
* Rebase on Will's for-joerg/arm-smmu/updates for arm_smmu_event series
* Add "Reviewed-by" lines from Kevin
* Fix typos in comments, kdocs, and jump tags
* Add a patch to sort struct iommufd_ioctl_op
* Update iommufd's userspace-api documentation
* Update uAPI kdoc to quote SMMUv3 official spec
* Drop the unused workqueue in struct iommufd_virq
* Drop might_sleep() in iommufd_viommu_report_irq() helper
* Add missing "break" in iommufd_viommu_get_vdev_id() helper
* Shrink the scope of the vmaster's read lock in SMMUv3 driver
* Pass in two arguments to iommufd_eventq_virq_handler() helper
* Move "!ops || !ops->read" validation into iommufd_eventq_init()
* Move "fault->ictx = ictx" closer to iommufd_ctx_get(fault->ictx)
* Update commit message for arm_smmu_attach_prepare/commit_vmaster()
* Keep "iommufd_fault" as-is and rename "iommufd_eventq_virq" to just
"iommufd_virq"
v2
https://lore.kernel.org/all/cover.1733263737.git.nicolinc@nvidia.com/
* Rebase on v6.13-rc1
* Add IOPF and vIRQ in iommufd.rst (userspace-api)
* Add proper locking in iommufd_event_virq_destroy
* Add iommufd_event_virq_abort with a lockdep_assert_held
* Rename "EVENT_*" to "EVENTQ_*" to describe the objects better
* Reorganize flows in iommufd_eventq_virq_alloc for abort() to work
* Add struct arm_smmu_vmaster to store vSID upon attaching to a nested
domain, calling a newly added iommufd_viommu_get_vdev_id helper
* Add an arm_vmaster_report_event helper in arm-smmu-v3-iommufd file
to simplify the routine in arm_smmu_handle_evt() of the main driver
v1
https://lore.kernel.org/all/cover.1724777091.git.nicolinc@nvidia.com/
Thanks!
Nicolin
Nicolin Chen (14):
iommufd: Keep IOCTL list in an alphabetical order
iommufd/fault: Add an iommufd_fault_init() helper
iommufd/fault: Move iommufd_fault_iopf_handler() to header
iommufd: Abstract an iommufd_eventq from iommufd_fault
iommufd: Rename fault.c to eventq.c
iommufd: Add IOMMUFD_OBJ_VEVENTQ and IOMMUFD_CMD_VEVENTQ_ALLOC
iommufd/viommu: Add iommufd_viommu_get_vdev_id helper
iommufd/viommu: Add iommufd_viommu_report_event helper
iommufd/selftest: Require vdev_id when attaching to a nested domain
iommufd/selftest: Add IOMMU_TEST_OP_TRIGGER_VEVENT for vEVENTQ
coverage
iommufd/selftest: Add IOMMU_VEVENTQ_ALLOC test coverage
Documentation: userspace-api: iommufd: Update FAULT and VEVENTQ
iommu/arm-smmu-v3: Introduce struct arm_smmu_vmaster
iommu/arm-smmu-v3: Report events that belong to devices attached to
vIOMMU
drivers/iommu/iommufd/Makefile | 2 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 30 ++
drivers/iommu/iommufd/iommufd_private.h | 116 ++++++-
drivers/iommu/iommufd/iommufd_test.h | 10 +
include/linux/iommufd.h | 22 ++
include/uapi/linux/iommufd.h | 46 +++
tools/testing/selftests/iommu/iommufd_utils.h | 65 ++++
.../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 65 ++++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 90 ++++--
drivers/iommu/iommufd/driver.c | 60 ++++
drivers/iommu/iommufd/{fault.c => eventq.c} | 298 ++++++++++++++----
drivers/iommu/iommufd/hw_pagetable.c | 6 +-
drivers/iommu/iommufd/main.c | 23 +-
drivers/iommu/iommufd/selftest.c | 53 ++++
drivers/iommu/iommufd/viommu.c | 2 +
tools/testing/selftests/iommu/iommufd.c | 27 ++
.../selftests/iommu/iommufd_fail_nth.c | 7 +
Documentation/userspace-api/iommufd.rst | 16 +
18 files changed, 820 insertions(+), 118 deletions(-)
rename drivers/iommu/iommufd/{fault.c => eventq.c} (55%)
base-commit: e94dc6ddda8dd3770879a132d577accd2cce25f9
--
2.43.0
Recently the loongarch defconfig stopped working with the default 128 MiB
of memory. The VM just spins indefinitely.
Increasing the available memory to 1 GiB, similar to s390, fixes the
issue. To avoid having to do this for each architecture on its own,
proactively apply it to all architectures.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh(a)linutronix.de>
---
tools/testing/selftests/nolibc/Makefile | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/nolibc/Makefile b/tools/testing/selftests/nolibc/Makefile
index 8de98ea7af8071caa0597aa7b86d91a2d1d50e68..e92e0b88586111072a0e043cb15f3b59cf42c3a6 100644
--- a/tools/testing/selftests/nolibc/Makefile
+++ b/tools/testing/selftests/nolibc/Makefile
@@ -130,9 +130,9 @@ QEMU_ARGS_ppc = -M g3beige -append "console=ttyS0 panic=-1 $(TEST:%=NOLIB
QEMU_ARGS_ppc64 = -M powernv -append "console=hvc0 panic=-1 $(TEST:%=NOLIBC_TEST=%)"
QEMU_ARGS_ppc64le = -M powernv -append "console=hvc0 panic=-1 $(TEST:%=NOLIBC_TEST=%)"
QEMU_ARGS_riscv = -M virt -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)"
-QEMU_ARGS_s390 = -M s390-ccw-virtio -m 1G -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)"
+QEMU_ARGS_s390 = -M s390-ccw-virtio -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)"
QEMU_ARGS_loongarch = -M virt -append "console=ttyS0,115200 panic=-1 $(TEST:%=NOLIBC_TEST=%)"
-QEMU_ARGS = $(QEMU_ARGS_$(XARCH)) $(QEMU_ARGS_BIOS) $(QEMU_ARGS_EXTRA)
+QEMU_ARGS = -m 1G $(QEMU_ARGS_$(XARCH)) $(QEMU_ARGS_BIOS) $(QEMU_ARGS_EXTRA)
# OUTPUT is only set when run from the main makefile, otherwise
# it defaults to this nolibc directory.
---
base-commit: 8cf0b93919e13d1e8d4466eb4080a4c4d9d66d7b
change-id: 20241007-nolibc-qemu-mem-5ed605520472
Best regards,
--
Thomas Weißschuh <thomas.weissschuh(a)linutronix.de>
DAMON debugfs interface was the only user interface of DAMON at the
beginning[1]. However, it turned out that the interface would not be good
enough for long-term flexibility and stability.
In Feb 2022[2], we therefore introduced the DAMON sysfs interface as an
alternative user interface that aims for long-term flexibility and
stability. With its introduction, the DAMON debugfs interface was announced
to be deprecated in the near future.
In Feb 2023[3], we announced the official deprecation of the DAMON debugfs
interface. In Jan 2024[4], we further made the deprecation difficult to
ignore.
In Oct 2024[5], we posted an RFC version of this patch series as the
last notice.
As of this writing, no problems or concerns about the removal plan
have been reported. Apparently users have already moved to the alternative,
or made good plans for the change.
Remove the DAMON debugfs interface code from the tree. Given the past
timeline and the absence of reported problems or concerns, it is safe
enough to do so.
[1] https://lore.kernel.org/20210716081449.22187-1-sj38.park@gmail.com
[2] https://lore.kernel.org/20220228081314.5770-1-sj@kernel.org
[3] https://lore.kernel.org/20230209192009.7885-1-sj@kernel.org
[4] https://lore.kernel.org/20240130013549.89538-1-sj@kernel.org
[5] https://lore.kernel.org/20241015175412.60563-1-sj@kernel.org
Revision History
----------------
Changes from v1
(https://lore.kernel.org/20250101213527.74203-1-sj@kernel.org)
- Remove debugfs usage section and references from translations
(https://lore.kernel.org/20250106183944.103569-1-sj@kernel.org)
Changes from RFC
(https://lore.kernel.org/20241015175412.60563-1-sj@kernel.org)
- Rebased on latest mm-unstable
- Update and wordsmith commit messages
SeongJae Park (8):
Docs/translations/*/admin-guide/mm/damon/usage: remove DAMON debugfs
interface documentation
Docs/admin-guide/mm/damon/usage: remove DAMON debugfs interface
documentation
Docs/mm/damon/design: update for removal of DAMON debugfs interface
selftests/damon/config: remove configs for DAMON debugfs interface
selftests
selftests/damon: remove tests for DAMON debugfs interface
kunit: configs: remove configs for DAMON debugfs interface tests
mm/damon: remove DAMON debugfs interface kunit tests
mm/damon: remove DAMON debugfs interface
Documentation/admin-guide/mm/damon/usage.rst | 309 -----
Documentation/mm/damon/design.rst | 23 +-
.../zh_CN/admin-guide/mm/damon/usage.rst | 248 +---
.../zh_TW/admin-guide/mm/damon/usage.rst | 248 +---
mm/damon/Kconfig | 30 -
mm/damon/Makefile | 1 -
mm/damon/dbgfs.c | 1148 -----------------
mm/damon/tests/.kunitconfig | 7 -
mm/damon/tests/dbgfs-kunit.h | 173 ---
tools/testing/kunit/configs/all_tests.config | 3 -
tools/testing/selftests/damon/.gitignore | 3 -
tools/testing/selftests/damon/Makefile | 11 +-
tools/testing/selftests/damon/config | 1 -
.../testing/selftests/damon/debugfs_attrs.sh | 17 -
.../debugfs_duplicate_context_creation.sh | 27 -
.../selftests/damon/debugfs_empty_targets.sh | 21 -
.../damon/debugfs_huge_count_read_write.sh | 22 -
.../damon/debugfs_rm_non_contexts.sh | 19 -
.../selftests/damon/debugfs_schemes.sh | 19 -
.../selftests/damon/debugfs_target_ids.sh | 19 -
.../damon/debugfs_target_ids_pid_leak.c | 68 -
.../damon/debugfs_target_ids_pid_leak.sh | 22 -
...fs_target_ids_read_before_terminate_race.c | 80 --
...s_target_ids_read_before_terminate_race.sh | 14 -
.../selftests/damon/huge_count_read_write.c | 46 -
25 files changed, 13 insertions(+), 2566 deletions(-)
delete mode 100644 mm/damon/dbgfs.c
delete mode 100644 mm/damon/tests/dbgfs-kunit.h
delete mode 100755 tools/testing/selftests/damon/debugfs_attrs.sh
delete mode 100755 tools/testing/selftests/damon/debugfs_duplicate_context_creation.sh
delete mode 100755 tools/testing/selftests/damon/debugfs_empty_targets.sh
delete mode 100755 tools/testing/selftests/damon/debugfs_huge_count_read_write.sh
delete mode 100755 tools/testing/selftests/damon/debugfs_rm_non_contexts.sh
delete mode 100755 tools/testing/selftests/damon/debugfs_schemes.sh
delete mode 100755 tools/testing/selftests/damon/debugfs_target_ids.sh
delete mode 100644 tools/testing/selftests/damon/debugfs_target_ids_pid_leak.c
delete mode 100755 tools/testing/selftests/damon/debugfs_target_ids_pid_leak.sh
delete mode 100644 tools/testing/selftests/damon/debugfs_target_ids_read_before_terminate_race.c
delete mode 100755 tools/testing/selftests/damon/debugfs_target_ids_read_before_terminate_race.sh
delete mode 100644 tools/testing/selftests/damon/huge_count_read_write.c
--
2.39.5
This RFC patch series proposes a new ioctl PTP_SYS_OFFSET_STAT and adds
support for it in the proposed virtio_rtc driver [1]. The new
PTP_SYS_OFFSET_STAT ioctl provides a cross-timestamp like
PTP_SYS_OFFSET_PRECISE2, plus the following status information (for
now):
- for UTC timescale clocks: leap second related status,
- clock accuracy.
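For context, the sketch below shows how the existing PTP_SYS_OFFSET_PRECISE2
ioctl, which the new one is modelled on, is typically called from userspace.
It deliberately does not use PTP_SYS_OFFSET_STAT, since its argument layout is
not described in this cover letter. read_cross_timestamp() is a hypothetical
helper name, and opening the device read-only is an assumption.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/ptp_clock.h>

/*
 * Hypothetical helper: fetch one device/system cross-timestamp from the PTP
 * character device at `dev` (for example "/dev/ptp0").
 * Returns 0 on success, -1 on failure with errno set by open() or ioctl().
 */
int read_cross_timestamp(const char *dev, struct ptp_sys_offset_precise *out)
{
	int fd, ret;

	assert(dev != NULL && out != NULL);

	/* Assumption: read-only access suffices; use O_RDWR if rejected. */
	fd = open(dev, O_RDONLY);
	if (fd < 0)
		return -1;

	memset(out, 0, sizeof(*out));
	ret = ioctl(fd, PTP_SYS_OFFSET_PRECISE2, out);
	close(fd);

	return ret < 0 ? -1 : 0;
}
```

On success, out->device holds the PHC time and out->sys_realtime the matching
system time; the proposed ioctl would add the status information on top of
such a pair.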
The second commit adds support for the ioctl in the proposed virtio_rtc
driver, and hence depends on the patch series "Add virtio_rtc module" [1].
[1] https://lore.kernel.org/lkml/20241219201118.2233-1-quic_philber@quicinc.com…
Signed-off-by: Peter Hilber <quic_philber(a)quicinc.com>
Peter Hilber (2):
ptp: add PTP_SYS_OFFSET_STAT for xtstamping with status
virtio_rtc: Support PTP_SYS_OFFSET_STAT ioctl
drivers/ptp/ptp_chardev.c | 39 ++++++++
drivers/ptp/ptp_clock.c | 9 ++
drivers/virtio/Kconfig | 4 +-
drivers/virtio/virtio_rtc_driver.c | 122 +++++++++++++++++++++++-
drivers/virtio/virtio_rtc_internal.h | 3 +-
drivers/virtio/virtio_rtc_ptp.c | 25 +++--
include/linux/ptp_clock_kernel.h | 31 ++++++
include/uapi/linux/ptp_clock.h | 130 +++++++++++++++++++++++++-
tools/testing/selftests/ptp/Makefile | 2 +-
tools/testing/selftests/ptp/testptp.c | 126 ++++++++++++++++++++++++-
10 files changed, 471 insertions(+), 20 deletions(-)
base-commit: 8a8009abbfa04e58f1b01b20534cac9e8fe61a46
--
2.43.0
As the part-3 of the vIOMMU infrastructure, this series introduces a vIRQ
object. The existing FAULT object provides a nice notification pathway to
the user space already, so let vIRQ reuse the infrastructure.
Mimicking the HWPT structure, add a common EVENTQ structure to support its
derivatives: IOMMUFD_OBJ_FAULT (existing) and IOMMUFD_OBJ_VIRQ (new).
IOMMUFD_CMD_VIRQ_ALLOC is introduced to allocate vIRQ objects for vIOMMUs.
One vIOMMU can have multiple vIRQs of different types but cannot support
multiple vIRQs of the same type.
The forwarding part is fairly simple but might need to replace a physical
device ID with a virtual device ID in a driver-level IRQ data structure.
So, this comes with some helpers for drivers to use.
As usual, this series comes with the selftest coverage for this new vIRQ,
and with a real world use case in the ARM SMMUv3 driver.
This is on Github:
https://github.com/nicolinc/iommufd/commits/iommufd_virq-v3
Testing with RMR patches for MSI:
https://github.com/nicolinc/iommufd/commits/iommufd_virq-v3-with-rmr
Pairing QEMU branch for testing:
https://github.com/nicolinc/qemu/commits/wip/for_iommufd_virq-v3
Changelog
v3
* Rebase on Will's for-joerg/arm-smmu/updates for arm_smmu_event series
* Add "Reviewed-by" lines from Kevin
* Fix typos in comments, kdocs, and jump tags
* Add a patch to sort struct iommufd_ioctl_op
* Update iommufd's userspace-api documentation
* Update uAPI kdoc to quote SMMUv3 official spec
* Drop the unused workqueue in struct iommufd_virq
* Drop might_sleep() in iommufd_viommu_report_irq() helper
* Add missing "break" in iommufd_viommu_get_vdev_id() helper
* Shrink the scope of the vmaster's read lock in SMMUv3 driver
* Pass in two arguments to iommufd_eventq_virq_handler() helper
* Move "!ops || !ops->read" validation into iommufd_eventq_init()
* Move "fault->ictx = ictx" closer to iommufd_ctx_get(fault->ictx)
* Update commit message for arm_smmu_attach_prepare/commit_vmaster()
* Keep "iommufd_fault" as-is and rename "iommufd_eventq_virq" to just
"iommufd_virq"
v2
https://lore.kernel.org/all/cover.1733263737.git.nicolinc@nvidia.com/
* Rebase on v6.13-rc1
* Add IOPF and vIRQ in iommufd.rst (userspace-api)
* Add proper locking in iommufd_event_virq_destroy
* Add iommufd_event_virq_abort with a lockdep_assert_held
* Rename "EVENT_*" to "EVENTQ_*" to describe the objects better
* Reorganize flows in iommufd_eventq_virq_alloc for abort() to work
* Add struct arm_smmu_vmaster to store vSID upon attaching to a nested
domain, calling a newly added iommufd_viommu_get_vdev_id helper
* Add an arm_vmaster_report_event helper in arm-smmu-v3-iommufd file
to simplify the routine in arm_smmu_handle_evt() of the main driver
v1
https://lore.kernel.org/all/cover.1724777091.git.nicolinc@nvidia.com/
Thanks!
Nicolin
Nicolin Chen (14):
iommufd: Keep IOCTL list in an alphabetical order
iommufd/fault: Add an iommufd_fault_init() helper
iommufd/fault: Move iommufd_fault_iopf_handler() to header
iommufd: Abstract an iommufd_eventq from iommufd_fault
iommufd: Rename fault.c to eventq.c
iommufd: Add IOMMUFD_OBJ_VIRQ and IOMMUFD_CMD_VIRQ_ALLOC
iommufd/viommu: Add iommufd_viommu_get_vdev_id helper
iommufd/viommu: Add iommufd_viommu_report_irq helper
iommufd/selftest: Require vdev_id when attaching to a nested domain
iommufd/selftest: Add IOMMU_TEST_OP_TRIGGER_VIRQ for vIRQ coverage
iommufd/selftest: Add IOMMU_VIRQ_ALLOC test coverage
Documentation: userspace-api: iommufd: Update FAULT and VIRQ
iommu/arm-smmu-v3: Introduce struct arm_smmu_vmaster
iommu/arm-smmu-v3: Report IRQs that belong to devices attached to
vIOMMU
drivers/iommu/iommufd/Makefile | 2 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 30 ++
drivers/iommu/iommufd/iommufd_private.h | 115 ++++++-
drivers/iommu/iommufd/iommufd_test.h | 10 +
include/linux/iommufd.h | 20 ++
include/uapi/linux/iommufd.h | 46 +++
tools/testing/selftests/iommu/iommufd_utils.h | 63 ++++
.../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 65 ++++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 90 ++++--
drivers/iommu/iommufd/driver.c | 57 ++++
drivers/iommu/iommufd/{fault.c => eventq.c} | 298 ++++++++++++++----
drivers/iommu/iommufd/hw_pagetable.c | 6 +-
drivers/iommu/iommufd/main.c | 20 +-
drivers/iommu/iommufd/selftest.c | 53 ++++
drivers/iommu/iommufd/viommu.c | 2 +
tools/testing/selftests/iommu/iommufd.c | 27 ++
.../selftests/iommu/iommufd_fail_nth.c | 6 +
Documentation/userspace-api/iommufd.rst | 16 +
18 files changed, 809 insertions(+), 117 deletions(-)
rename drivers/iommu/iommufd/{fault.c => eventq.c} (55%)
base-commit: 376ce8b35ed15d5deee57bdecd8449f6a4df4c42
--
2.43.0
This series expands the XDP TX metadata framework to allow user
applications to pass per packet 64-bit launch time directly to the kernel
driver, requesting launch time hardware offload support. The XDP TX
metadata framework will not perform any clock conversion or packet
reordering.
Please note that the role of Tx metadata is just to pass the launch time,
not to enable the offload feature. Users will need to enable the launch
time hardware offload feature of the device by using the respective
command, such as the tc-etf command.
Although some devices use the tc-etf command to enable their launch time
hardware offload feature, xsk packets will not go through the etf qdisc.
Therefore, in my opinion, the launch time should always be based on the PTP
Hardware Clock (PHC). Thus, I did not include a clock ID to indicate the
clock source.
To simplify the test steps, I modified the xdp_hw_metadata bpf self-test
tool in such a way that it will set the launch time based on the offset
provided by the user and the value of the Receive Hardware Timestamp, which
is against the PHC. This will eliminate the need to discipline System Clock
with the PHC and then use clock_gettime() to get the time.
Please note that AF_XDP lacks a feedback mechanism to inform the
application if the requested launch time is invalid. So, users are expected
to be familiar with the horizon of the launch time of the device they use and
not request a launch time that is beyond the horizon. Otherwise, the driver
might interpret the launch time incorrectly and react wrongly. For stmmac
and igc, where modulo computation is used, a launch time larger than the
horizon will cause the device to transmit the packet earlier than the
requested launch time.
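The wraparound described above can be illustrated with a small sketch. It
assumes a hypothetical device that only keeps the launch time modulo a fixed
cycle of 1 second; HORIZON_NS and effective_delay_ns() are illustrative names
and are not taken from the stmmac or igc drivers.

```c
#include <stdint.h>

/* Hypothetical horizon: the device keeps the launch time modulo 1 s. */
#define HORIZON_NS 1000000000ULL

/*
 * Return the delay (in ns, relative to `now`) that a modulo-based device
 * would effectively apply for a requested launch time.
 * The caller must ensure launch_time >= now.
 */
uint64_t effective_delay_ns(uint64_t now, uint64_t launch_time)
{
	uint64_t requested = launch_time - now;

	/* Anything at or beyond the horizon wraps around to a shorter delay. */
	return requested % HORIZON_NS;
}
```

With these assumptions, a request 1.5 s in the future ends up 0.5 s in the
future, which is why the application has to keep its requests within the
horizon of the device.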
Although there is no feedback mechanism for the launch time request
for now, users can still check whether the requested launch time is
working by requesting the Transmit Completion Hardware Timestamp.
Changes since v1:
- renamed to use Earliest TxTime First (Willem)
- renamed to use txtime (Willem)
Changes since v2:
- renamed to use launch time (Jesper & Willem)
- changed the default launch time in xdp_hw_metadata apps from 1s to 0.1s
because some NICs do not support such a large future time.
Changes since v3:
- added XDP launch time support to the igc driver (Jesper & Florian)
- added per-driver launch time limitation on xsk-tx-metadata.rst (Jesper)
- added explanation on FIFO behavior on xsk-tx-metadata.rst (Jakub)
- added step to enable launch time in the commit message (Jesper & Willem)
- explicitly documented the type of launch_time and which clock source
it is against (Willem)
v1: https://patchwork.kernel.org/project/netdevbpf/cover/20231130162028.852006-…
v2: https://patchwork.kernel.org/project/netdevbpf/cover/20231201062421.1074768…
v3: https://patchwork.kernel.org/project/netdevbpf/cover/20231203165129.1740512…
Song Yoong Siang (4):
xsk: Add launch time hardware offload support to XDP Tx metadata
selftests/bpf: Add Launch Time request to xdp_hw_metadata
net: stmmac: Add launch time support to XDP ZC
igc: Add launch time support to XDP ZC
Documentation/netlink/specs/netdev.yaml | 4 +
Documentation/networking/xsk-tx-metadata.rst | 64 +++++++++++++++
drivers/net/ethernet/intel/igc/igc_main.c | 78 +++++++++++++------
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 +
.../net/ethernet/stmicro/stmmac/stmmac_main.c | 13 ++++
include/net/xdp_sock.h | 10 +++
include/net/xdp_sock_drv.h | 1 +
include/uapi/linux/if_xdp.h | 10 +++
include/uapi/linux/netdev.h | 3 +
net/core/netdev-genl.c | 2 +
net/xdp/xsk.c | 3 +
tools/include/uapi/linux/if_xdp.h | 10 +++
tools/include/uapi/linux/netdev.h | 3 +
tools/testing/selftests/bpf/xdp_hw_metadata.c | 30 ++++++-
14 files changed, 208 insertions(+), 25 deletions(-)
--
2.34.1
Hi,
In /proc/PID/stat, there is the kstkesp field which is the stack pointer of
a thread. While the thread is active, this field reads zero. But during a
coredump, it should have a valid value.
However, at the moment, kstkesp is zero even during coredump.
The first commit fixes this problem, and the second commit adds a selftest
to detect if this problem appears again in the future.
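For readers who want to look at the field by hand, here is a minimal sketch of
extracting kstkesp from one line of /proc/PID/stat. It relies on proc(5)
listing kstkesp as field 29; parse_kstkesp() is an illustrative helper and is
not the code used by the selftest in this series.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Extract kstkesp (field 29 per proc(5)) from one /proc/PID/stat line.
 * Returns 0 on success, -1 if the line is malformed or too short.
 */
int parse_kstkesp(const char *stat_line, unsigned long long *kstkesp)
{
	/* comm (field 2) may contain spaces or ')', so skip to the last ')'. */
	const char *p = strrchr(stat_line, ')');
	int field = 2;

	if (!p)
		return -1;
	p++;

	/* Walk space-separated tokens; the first one is field 3 (state). */
	while (*p) {
		while (*p == ' ')
			p++;
		if (!*p)
			break;
		field++;
		if (field == 29) {
			char *end;

			*kstkesp = strtoull(p, &end, 10);
			return end == p ? -1 : 0;
		}
		while (*p && *p != ' ')
			p++;
	}
	return -1;
}
```

A caller would read the line with fgets() from /proc/<pid>/stat and pass it
to the helper.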
v2..v3 https://lore.kernel.org/lkml/cover.1735550994.git.namcao@linutronix.de/
- Move stackdump file to local directory [Kees]
- Always cleanup the stackdump file after the test [Kees]
- Remove unused empty function
v1..v2 https://lore.kernel.org/lkml/cover.1730883229.git.namcao@linutronix.de/
- Change the fix patch to use PF_POSTCOREDUMP [Oleg]
Nam Cao (2):
fs/proc: do_task_stat: Fix ESP not readable during coredump
selftests: coredump: Add stackdump test
fs/proc/array.c | 2 +-
tools/testing/selftests/coredump/Makefile | 7 +
tools/testing/selftests/coredump/README.rst | 50 ++++++
tools/testing/selftests/coredump/stackdump | 14 ++
.../selftests/coredump/stackdump_test.c | 151 ++++++++++++++++++
5 files changed, 223 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/coredump/Makefile
create mode 100644 tools/testing/selftests/coredump/README.rst
create mode 100755 tools/testing/selftests/coredump/stackdump
create mode 100644 tools/testing/selftests/coredump/stackdump_test.c
--
2.39.5
From: Li Zhijian <lizhijian(a)fujitsu.com>
[ Upstream commit 55853cb829dc707427c3519f6b8686682a204368 ]
The pattern rule `$(OUTPUT)/%: %.c` inadvertently included a circular
dependency on the global-timer target due to its inclusion in
$(TEST_GEN_PROGS_EXTENDED). This resulted in a circular dependency
warning during the build process.
To resolve this, the dependency on $(TEST_GEN_PROGS_EXTENDED) has been
replaced with an explicit dependency on $(OUTPUT)/libatest.so. This change
ensures that libatest.so is built before any other targets that require it,
without creating a circular dependency.
This fix addresses the following warning:
make[4]: Entering directory 'tools/testing/selftests/alsa'
make[4]: Circular default_modconfig/kselftest/alsa/global-timer <- default_modconfig/kselftest/alsa/global-timer dependency dropped.
make[4]: Nothing to be done for 'all'.
make[4]: Leaving directory 'tools/testing/selftests/alsa'
Cc: Mark Brown <broonie(a)kernel.org>
Cc: Jaroslav Kysela <perex(a)perex.cz>
Cc: Takashi Iwai <tiwai(a)suse.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Signed-off-by: Li Zhijian <lizhijian(a)fujitsu.com>
Link: https://patch.msgid.link/20241218025931.914164-1-lizhijian@fujitsu.com
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/alsa/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/alsa/Makefile b/tools/testing/selftests/alsa/Makefile
index 5af9ba8a4645..140c7f821727 100644
--- a/tools/testing/selftests/alsa/Makefile
+++ b/tools/testing/selftests/alsa/Makefile
@@ -23,5 +23,5 @@ include ../lib.mk
$(OUTPUT)/libatest.so: conf.c alsa-local.h
$(CC) $(CFLAGS) -shared -fPIC $< $(LDLIBS) -o $@
-$(OUTPUT)/%: %.c $(TEST_GEN_PROGS_EXTENDED) alsa-local.h
+$(OUTPUT)/%: %.c $(OUTPUT)/libatest.so alsa-local.h
$(CC) $(CFLAGS) $< $(LDLIBS) -latest -o $@
--
2.39.5
From: Li Zhijian <lizhijian(a)fujitsu.com>
[ Upstream commit 55853cb829dc707427c3519f6b8686682a204368 ]
The pattern rule `$(OUTPUT)/%: %.c` inadvertently included a circular
dependency on the global-timer target due to its inclusion in
$(TEST_GEN_PROGS_EXTENDED). This resulted in a circular dependency
warning during the build process.
To resolve this, the dependency on $(TEST_GEN_PROGS_EXTENDED) has been
replaced with an explicit dependency on $(OUTPUT)/libatest.so. This change
ensures that libatest.so is built before any other targets that require it,
without creating a circular dependency.
This fix addresses the following warning:
make[4]: Entering directory 'tools/testing/selftests/alsa'
make[4]: Circular default_modconfig/kselftest/alsa/global-timer <- default_modconfig/kselftest/alsa/global-timer dependency dropped.
make[4]: Nothing to be done for 'all'.
make[4]: Leaving directory 'tools/testing/selftests/alsa'
Cc: Mark Brown <broonie(a)kernel.org>
Cc: Jaroslav Kysela <perex(a)perex.cz>
Cc: Takashi Iwai <tiwai(a)suse.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Signed-off-by: Li Zhijian <lizhijian(a)fujitsu.com>
Link: https://patch.msgid.link/20241218025931.914164-1-lizhijian@fujitsu.com
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/alsa/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/alsa/Makefile b/tools/testing/selftests/alsa/Makefile
index 944279160fed..8dab90ad22bb 100644
--- a/tools/testing/selftests/alsa/Makefile
+++ b/tools/testing/selftests/alsa/Makefile
@@ -27,5 +27,5 @@ include ../lib.mk
$(OUTPUT)/libatest.so: conf.c alsa-local.h
$(CC) $(CFLAGS) -shared -fPIC $< $(LDLIBS) -o $@
-$(OUTPUT)/%: %.c $(TEST_GEN_PROGS_EXTENDED) alsa-local.h
+$(OUTPUT)/%: %.c $(OUTPUT)/libatest.so alsa-local.h
$(CC) $(CFLAGS) $< $(LDLIBS) -latest -o $@
--
2.39.5
Hi all,
v5 here is a small set of fixes and a rebase of the previous versions.
If there are no major issues, I'd like to land this soon so it can be
used and tested ready for 6.14.
This series was originally written by José Expósito, and has been
modified and updated by Matt Gilbride and myself. The original version
can be found here:
https://github.com/Rust-for-Linux/linux/pull/950
Add support for writing KUnit tests in Rust. While Rust doctests are
already converted to KUnit tests and run, they're really better suited
for examples, rather than as first-class unit tests.
This series implements a series of direct Rust bindings for KUnit tests,
as well as a new macro which allows KUnit tests to be written using a
close variant of normal Rust unit test syntax. The only change required
is replacing '#[cfg(test)]' with '#[kunit_tests(kunit_test_suite_name)]'
An example test would look like:
#[kunit_tests(rust_kernel_hid_driver)]
mod tests {
    use super::*;
    use crate::{c_str, driver, hid, prelude::*};
    use core::ptr;

    struct SimpleTestDriver;
    impl Driver for SimpleTestDriver {
        type Data = ();
    }

    #[test]
    fn rust_test_hid_driver_adapter() {
        let mut hid = bindings::hid_driver::default();
        let name = c_str!("SimpleTestDriver");
        static MODULE: ThisModule = unsafe { ThisModule::from_ptr(ptr::null_mut()) };
        let res = unsafe {
            <hid::Adapter<SimpleTestDriver> as driver::DriverOps>::register(&mut hid, name, &MODULE)
        };
        assert_eq!(res, Err(ENODEV)); // The mock returns -19
    }
}
Please give this a go, and make sure I haven't broken it! There's almost
certainly a lot of improvements which can be made -- and there's a fair
case to be made for replacing some of this with generated C code which
can use the C macros -- but this is hopefully an adequate implementation
for now, and the interface can (with luck) remain the same even if the
implementation changes.
A few small notable missing features:
- Attributes (like the speed of a test) are hardcoded to the default
value.
- Similarly, the module name attribute is hardcoded to NULL. In C, we
use the KBUILD_MODNAME macro, but I couldn't find a way to use this
from Rust which wasn't more ugly than just disabling it.
- Assertions are not automatically rewritten to use KUnit assertions.
---
Changes since v4:
https://lore.kernel.org/linux-kselftest/20241101064505.3820737-1-davidgow@g…
- Rebased against 6.13-rc1
- Allowed an unused_unsafe warning after the behaviour of addr_of_mut!()
changed in Rust 1.82. (Thanks Boqun, Miguel)
- "Expect" that the sample assert_eq!(1+1, 2) produces a clippy warning
due to a redundant assertion. (Thanks Boqun, Miguel)
- Fix some missing safety comments, and remove some unneeded 'unsafe'
blocks. (Thanks Boqun)
- Fix a couple of minor rustfmt issues which were triggering checkpatch
warnings.
Changes since v3:
https://lore.kernel.org/linux-kselftest/20241030045719.3085147-2-davidgow@g…
- The kunit_unsafe_test_suite!() macro now panic!s if the suite name is
too long, triggering a compile error. (Thanks, Alice!)
- The #[kunit_tests()] macro now preserves span information, so
errors can be better reported. (Thanks, Boqun!)
- The example tests have been updated to no longer use assert_eq!() with
a constant bool argument (which triggered a clippy warning now we
have the span info).
Changes since v2:
https://lore.kernel.org/linux-kselftest/20241029092422.2884505-1-davidgow@g…
- Include missing rust/macros/kunit.rs file from v2. (Thanks Boqun!)
- The kunit_unsafe_test_suite!() macro will truncate the name of the
suite if it is too long. (Thanks Alice!)
- The proc macro now emits an error if the suite name is too long.
- We no longer needlessly use UnsafeCell<> in
kunit_unsafe_test_suite!(). (Thanks Alice!)
Changes since v1:
https://lore.kernel.org/lkml/20230720-rustbind-v1-0-c80db349e3b5@google.com…
- Rebase on top of the latest rust-next (commit 718c4069896c)
- Make kunit_case a const fn, rather than a macro (Thanks Boqun)
- As a result, the null terminator is now created with
kernel::kunit::kunit_case_null()
- Use the C kunit_get_current_test() function to implement
in_kunit_test(), rather than re-implementing it (less efficiently)
ourselves.
Changes since the GitHub PR:
- Rebased on top of kselftest/kunit
- Add const_mut_refs feature
This may conflict with https://lore.kernel.org/lkml/20230503090708.2524310-6-nmi@metaspace.dk/
- Add rust/macros/kunit.rs to the KUnit MAINTAINERS entry
---
José Expósito (3):
rust: kunit: add KUnit case and suite macros
rust: macros: add macro to easily run KUnit tests
rust: kunit: allow to know if we are in a test
MAINTAINERS | 1 +
rust/kernel/kunit.rs | 207 +++++++++++++++++++++++++++++++++++++++++++
rust/kernel/lib.rs | 1 +
rust/macros/kunit.rs | 168 +++++++++++++++++++++++++++++++++++
rust/macros/lib.rs | 29 ++++++
5 files changed, 406 insertions(+)
create mode 100644 rust/macros/kunit.rs
--
2.47.1.613.gc27f4b7a9f-goog
Hello all,
I was looking at other test candidates for conversion to bpf test_progs
framework (to increase automatic testing scope) and found test_xsk.sh, which
does not seem to have coverage yet in test_progs. This test validates the AF_XDP
socket behavior with different XDP modes (SKB, DRV, zero copy) and socket
configuration (normal, busy polling).
The testing program looks pretty big, considering all files involved
(test_xsk.sh, xskxceiver.c, xsk.c, the different XDP programs) and the matrix of
tests it runs. So before really diving into it, I would like to ask:
- is it indeed a good/relevant target for integration in test_progs (all tests
look like functional tests, so I guess it is) ?
- if so, is there anyone already working on this ?
- multiple commits on xskxceiver.c hint that the program is also used for
testing on real hardware, could someone confirm that it is still the case
(similar need has been seen with test_xdp_features.sh for example) ? If so, it
means that the current form must be preserved, and it would be an additional
integration into test_progs rather than a conversion (then most of the code should be
shared between the non-test_progs and the test_progs version)
Thanks,
Alexis
--
Alexis Lothoré, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
Implement comprehensive testing for netconsole userdata entry handling,
demonstrating correct behavior when creating maximum entries and
preventing unauthorized overflow.
Refactor existing test infrastructure to support modular, reusable
helper functions that validate strict entry limit enforcement.
Also, add a warning if update_userdata() sees more than
MAX_USERDATA_ITEMS entries. This shouldn't happen and it is a bug that
shouldn't be silently ignored.
Signed-off-by: Breno Leitao <leitao(a)debian.org>
---
Breno Leitao (4):
netconsole: Warn if MAX_USERDATA_ITEMS limit is exceeded
netconsole: selftest: Split the helpers from the selftest
netconsole: selftest: Delete all userdata keys
netconsole: selftest: verify userdata entry limit
MAINTAINERS | 3 +-
drivers/net/netconsole.c | 2 +-
.../selftests/drivers/net/lib/sh/lib_netcons.sh | 225 +++++++++++++++++++++
.../testing/selftests/drivers/net/netcons_basic.sh | 218 +-------------------
.../selftests/drivers/net/netcons_overflow.sh | 67 ++++++
5 files changed, 296 insertions(+), 219 deletions(-)
---
base-commit: bb18265c3aba92b91a1355609769f3e967b65dee
change-id: 20241204-netcons_overflow_test-eaf735d1f743
Best regards,
--
Breno Leitao <leitao(a)debian.org>
Handle the case where hugetlbfs is not supported, to make debugging
easier.
On a system that does not support hugetlbfs, there is no
HugePages_Free entry in /proc/meminfo, and consequently freepgs is
empty. The huge pages availability check then fails and the test
is started anyway:
./run_hugetlbfs_test.sh: line 47: [: -lt: unary operator expected
./run_hugetlbfs_test.sh: line 60: 12577 Aborted
(core dumped) ./memfd_test hugetlbfs
Aborted (core dumped)
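The failure above comes from treating a missing entry like a number. As a
rough illustration of telling "entry missing" apart from "zero free pages",
here is a sketch in C; the patch itself changes the shell script, and
parse_hugepages_free() is a hypothetical helper, not part of the selftest.

```c
#include <stdio.h>
#include <string.h>

/*
 * Look for the HugePages_Free entry in the text of /proc/meminfo.
 * Returns 0 and stores the value on success, -1 if the entry is
 * missing or malformed (hugetlbfs is then likely unsupported).
 */
int parse_hugepages_free(const char *meminfo, long *free_pages)
{
	const char *key = "HugePages_Free:";
	const char *line = meminfo;

	while (line && *line) {
		/* Only match the key at the start of a line. */
		if (strncmp(line, key, strlen(key)) == 0)
			return sscanf(line + strlen(key), "%ld", free_pages) == 1 ? 0 : -1;
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}
```

A caller can then skip the test when the helper returns -1 instead of running
it with an empty value.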
Po-Hsu Lin (1):
selftests/memfd: skip hugetlbfs test if it's not supported
tools/testing/selftests/memfd/run_hugetlbfs_test.sh | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--
2.34.1
DAMON debugfs interface was the only user interface of DAMON at the
beginning[1]. However, it turned out the interface would not be good
enough for long-term flexibility and stability.
In Feb 2022[2], we therefore introduced DAMON sysfs interface as an
alternative user interface that aims for long-term flexibility and
stability. With its introduction, DAMON debugfs interface was announced
to be deprecated in the near future.
In Feb 2023[3], we announced the official deprecation of DAMON debugfs
interface. In Jan 2024[4], we further made the deprecation difficult to
be ignored.
In Oct 2024[5], we posted an RFC version of this patch series as the
last notice.
And as of this writing, no problems or concerns about the removal plan
have been reported. Apparently users have already moved to the alternative,
or made good plans for the change.
Remove the DAMON debugfs interface code from the tree. Given the past
timeline and the absence of reported problems or concerns, it is safe
enough to be done.
[1] https://lore.kernel.org/20210716081449.22187-1-sj38.park@gmail.com
[2] https://lore.kernel.org/20220228081314.5770-1-sj@kernel.org
[3] https://lore.kernel.org/20230209192009.7885-1-sj@kernel.org
[4] https://lore.kernel.org/20240130013549.89538-1-sj@kernel.org
[5] https://lore.kernel.org/20241015175412.60563-1-sj@kernel.org
Changes from RFC
(https://lore.kernel.org/20241015175412.60563-1-sj@kernel.org)
- Rebased on latest mm-unstable
- Update and wordsmith commit messages
SeongJae Park (7):
Docs/admin-guide/mm/damon/usage: remove DAMON debugfs interface
documentation
Docs/mm/damon/design: update for removal of DAMON debugfs interface
selftests/damon/config: remove configs for DAMON debugfs interface
selftests
selftests/damon: remove tests for DAMON debugfs interface
kunit: configs: remove configs for DAMON debugfs interface tests
mm/damon: remove DAMON debugfs interface kunit tests
mm/damon: remove DAMON debugfs interface
Documentation/admin-guide/mm/damon/usage.rst | 309 -----
Documentation/mm/damon/design.rst | 23 +-
mm/damon/Kconfig | 30 -
mm/damon/Makefile | 1 -
mm/damon/dbgfs.c | 1148 -----------------
mm/damon/tests/.kunitconfig | 7 -
mm/damon/tests/dbgfs-kunit.h | 173 ---
tools/testing/kunit/configs/all_tests.config | 3 -
tools/testing/selftests/damon/.gitignore | 3 -
tools/testing/selftests/damon/Makefile | 11 +-
tools/testing/selftests/damon/config | 1 -
.../testing/selftests/damon/debugfs_attrs.sh | 17 -
.../debugfs_duplicate_context_creation.sh | 27 -
.../selftests/damon/debugfs_empty_targets.sh | 21 -
.../damon/debugfs_huge_count_read_write.sh | 22 -
.../damon/debugfs_rm_non_contexts.sh | 19 -
.../selftests/damon/debugfs_schemes.sh | 19 -
.../selftests/damon/debugfs_target_ids.sh | 19 -
.../damon/debugfs_target_ids_pid_leak.c | 68 -
.../damon/debugfs_target_ids_pid_leak.sh | 22 -
...fs_target_ids_read_before_terminate_race.c | 80 --
...s_target_ids_read_before_terminate_race.sh | 14 -
.../selftests/damon/huge_count_read_write.c | 46 -
23 files changed, 11 insertions(+), 2072 deletions(-)
delete mode 100644 mm/damon/dbgfs.c
delete mode 100644 mm/damon/tests/dbgfs-kunit.h
delete mode 100755 tools/testing/selftests/damon/debugfs_attrs.sh
delete mode 100755 tools/testing/selftests/damon/debugfs_duplicate_context_creation.sh
delete mode 100755 tools/testing/selftests/damon/debugfs_empty_targets.sh
delete mode 100755 tools/testing/selftests/damon/debugfs_huge_count_read_write.sh
delete mode 100755 tools/testing/selftests/damon/debugfs_rm_non_contexts.sh
delete mode 100755 tools/testing/selftests/damon/debugfs_schemes.sh
delete mode 100755 tools/testing/selftests/damon/debugfs_target_ids.sh
delete mode 100644 tools/testing/selftests/damon/debugfs_target_ids_pid_leak.c
delete mode 100755 tools/testing/selftests/damon/debugfs_target_ids_pid_leak.sh
delete mode 100644 tools/testing/selftests/damon/debugfs_target_ids_read_before_terminate_race.c
delete mode 100755 tools/testing/selftests/damon/debugfs_target_ids_read_before_terminate_race.sh
delete mode 100644 tools/testing/selftests/damon/huge_count_read_write.c
--
2.39.5
./powerpc/ptrace/Makefile includes flags.mk. In flags.mk,
-I$(selfdir)/powerpc/include is always included as part of
CFLAGS. So it will pick up the "pkeys.h" defined in
powerpc/include.
The core-pkey.c test has a couple of macros defined which
are part of the "pkeys.h" header file. Remove those
duplicates and include "pkeys.h".
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list(a)gmail.com>
Signed-off-by: Madhavan Srinivasan <maddy(a)linux.ibm.com>
---
Changelog v1:
- Added Reviewed-by tag
- made changes to commit message
.../selftests/powerpc/ptrace/core-pkey.c | 19 +------------------
1 file changed, 1 insertion(+), 18 deletions(-)
diff --git a/tools/testing/selftests/powerpc/ptrace/core-pkey.c b/tools/testing/selftests/powerpc/ptrace/core-pkey.c
index f6da4cb30cd6..31c9bf6d95db 100644
--- a/tools/testing/selftests/powerpc/ptrace/core-pkey.c
+++ b/tools/testing/selftests/powerpc/ptrace/core-pkey.c
@@ -16,14 +16,7 @@
#include <unistd.h>
#include "ptrace.h"
#include "child.h"
-
-#ifndef __NR_pkey_alloc
-#define __NR_pkey_alloc 384
-#endif
-
-#ifndef __NR_pkey_free
-#define __NR_pkey_free 385
-#endif
+#include "pkeys.h"
#ifndef NT_PPC_PKEY
#define NT_PPC_PKEY 0x110
@@ -61,16 +54,6 @@ struct shared_info {
time_t core_time;
};
-static int sys_pkey_alloc(unsigned long flags, unsigned long init_access_rights)
-{
- return syscall(__NR_pkey_alloc, flags, init_access_rights);
-}
-
-static int sys_pkey_free(int pkey)
-{
- return syscall(__NR_pkey_free, pkey);
-}
-
static int increase_core_file_limit(void)
{
struct rlimit rlim;
--
2.47.0
Hi All,
This patch-set aims to improve precision of BPF_MUL and add testcases
to illustrate precision gains using signed and unsigned bounds.
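As a rough illustration of what tracking bounds through a multiplication
means, here is a sketch of interval multiplication on unsigned and signed
64-bit ranges. This is not the verifier code from the series; the struct and
function names are made up for the example, and it assumes the GCC/Clang
__builtin_mul_overflow() builtin is available.

```c
#include <stdint.h>

struct u64_range { uint64_t min, max; };
struct s64_range { int64_t min, max; };

/* Unsigned: operands are non-negative, so the product is monotonic. */
struct u64_range umul_range(struct u64_range a, struct u64_range b)
{
	struct u64_range r;

	if (__builtin_mul_overflow(a.max, b.max, &r.max)) {
		/* The largest product may wrap: fall back to "unknown". */
		r.min = 0;
		r.max = UINT64_MAX;
		return r;
	}
	/* Cannot overflow, since max * max did not. */
	r.min = a.min * b.min;
	return r;
}

/* Signed: the extremes are among the four corner products. */
struct s64_range smul_range(struct s64_range a, struct s64_range b)
{
	struct s64_range r = { INT64_MIN, INT64_MAX };
	int64_t p[4];
	int i;

	if (__builtin_mul_overflow(a.min, b.min, &p[0]) ||
	    __builtin_mul_overflow(a.min, b.max, &p[1]) ||
	    __builtin_mul_overflow(a.max, b.min, &p[2]) ||
	    __builtin_mul_overflow(a.max, b.max, &p[3]))
		return r;	/* a corner overflows: give up on precision */

	r.min = r.max = p[0];
	for (i = 1; i < 4; i++) {
		if (p[i] < r.min)
			r.min = p[i];
		if (p[i] > r.max)
			r.max = p[i];
	}
	return r;
}
```

For example, multiplying the signed range [-2, 3] by [4, 5] gives the corner
products -8, -10, 12 and 15, so the result is [-10, 15].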
Thanks for taking the time to review and for all the feedback!
Best,
Matan
Changes from v1:
- Fixed typo made in patch.
Changes from v2:
- Added signed multiplication to BPF_MUL.
- Added test cases to exercise BPF_MUL.
- Reordered patches in the series.
Changes from v3:
- Coding style fixes.
Matan Shachnai (2):
bpf, verifier: Improve precision of BPF_MUL
selftests/bpf: Add testcases for BPF_MUL
kernel/bpf/verifier.c | 80 +++++------
.../selftests/bpf/progs/verifier_bounds.c | 134 ++++++++++++++++++
2 files changed, 170 insertions(+), 44 deletions(-)
--
2.25.1
Hi,
In /proc/PID/stat, there is the kstkesp field which is the stack pointer of
a thread. While the thread is active, this field reads zero. But during a
coredump, it should have a valid value.
However, at the moment, kstkesp is zero even during coredump.
The first commit fixes this problem, and the second commit adds a selftest
to detect if this problem appears again in the future.
v2:
- Change the fix patch to use PF_POSTCOREDUMP [Oleg]
Link to v1:
https://lore.kernel.org/lkml/cover.1730883229.git.namcao@linutronix.de/
Nam Cao (2):
fs/proc: do_task_stat: Fix ESP not readable during coredump
selftests: coredump: Add stackdump test
fs/proc/array.c | 2 +-
tools/testing/selftests/coredump/Makefile | 7 +
tools/testing/selftests/coredump/README.rst | 50 ++++++
tools/testing/selftests/coredump/stackdump | 14 ++
.../selftests/coredump/stackdump_test.c | 154 ++++++++++++++++++
5 files changed, 226 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/coredump/Makefile
create mode 100644 tools/testing/selftests/coredump/README.rst
create mode 100755 tools/testing/selftests/coredump/stackdump
create mode 100644 tools/testing/selftests/coredump/stackdump_test.c
--
2.39.5
The upcoming new Idle HLT Intercept feature allows for the HLT
instruction execution by a vCPU to be intercepted by the hypervisor
only if there are no pending V_INTR and V_NMI events for the vCPU.
When the vCPU is expected to service the pending V_INTR and V_NMI
events, the Idle HLT intercept won’t trigger. The feature allows the
hypervisor to determine if the vCPU is actually idle and reduces
wasteful VMEXITs.
The idle HLT intercept feature is used for enlightened guests who wish
to securely handle the events. When an enlightened guest does a HLT
while an interrupt is pending, hypervisor will not have a way to
figure out whether the guest needs to be re-entered or not. The Idle
HLT intercept feature allows the HLT execution only if there are no
pending V_INTR and V_NMI events.
Presence of the Idle HLT Intercept feature is indicated via CPUID
function Fn8000_000A_EDX[30].
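For illustration, a userspace check of that bit could look like the sketch
below. It assumes a GCC or Clang toolchain that ships <cpuid.h> with
__get_cpuid(); the macro and function names are made up for the example and
do not come from the series.

```c
#include <stdbool.h>
#include <stdint.h>

#define SVM_FEATURE_LEAF	0x8000000Au
#define IDLE_HLT_EDX_BIT	30

/* Test bit 30 of an EDX value read from CPUID Fn8000_000A. */
bool edx_has_idle_hlt(uint32_t edx)
{
	return (edx >> IDLE_HLT_EDX_BIT) & 1u;
}

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>

/* Query the running CPU; returns false if the leaf is not available. */
bool cpu_has_idle_hlt(void)
{
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid(SVM_FEATURE_LEAF, &eax, &ebx, &ecx, &edx))
		return false;
	return edx_has_idle_hlt(edx);
}
#endif
```

Note that inside a guest the result depends on what the hypervisor exposes
in this leaf.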
Documentation for the Idle HLT intercept feature is available at [1].
This series is based on kvm-next/next (64dbb3a771a1) + [2].
Experiments done:
----------------
kvm_amd.avic is set to '0' for this experiment.
The below numbers represent the average of 10 runs.
Normal guest (L1)
The below netperf command was run on the guest with smp = 1 (pinned).
netperf -H <host ip> -t TCP_RR -l 60
----------------------------------------------------------------
|with Idle HLT(transactions/Sec)|w/o Idle HLT(transactions/Sec)|
----------------------------------------------------------------
| 25645.7136 | 25773.2796 |
----------------------------------------------------------------
The number of transactions/sec with and without the Idle HLT intercept
feature is almost the same.
Nested guest (L2)
The below netperf command was run on L2 guest with smp = 1 (pinned).
netperf -H <host ip> -t TCP_RR -l 60
----------------------------------------------------------------
|with Idle HLT(transactions/Sec)|w/o Idle HLT(transactions/Sec)|
----------------------------------------------------------------
| 5655.4468 | 5755.2189 |
----------------------------------------------------------------
The number of transactions/sec with and without the Idle HLT intercept
feature is almost the same.
Testing Done:
- Tested the functionality for the Idle HLT intercept feature
using selftest svm_idle_hlt_test.
- Tested SEV and SEV-ES guest for the Idle HLT intercept functionality.
- Tested the Idle HLT intercept functionality on nested guest.
v3 -> v4
- Drop the patches to add vcpu_get_stat() into a new series [2].
- Added nested Idle HLT intercept support.
v2 -> v3
- Incorporated Andrew's suggestion to structure vcpu_stat_types in
a way that each architecture can share the generic types and also
provide its own.
v1 -> v2
- Made changes to svm_idle_hlt_test based on the review comments from Sean.
- Added an enum based approach to get binary stats in vcpu_get_stat() which
doesn't use a string to get stat data, based on the comments from Sean.
- Added self_halt() and cli() helpers based on the comments from Sean.
[1]: AMD64 Architecture Programmer's Manual Pub. 24593, April 2024,
Vol 2, 15.9 Instruction Intercepts (Table 15-7: IDLE_HLT).
https://bugzilla.kernel.org/attachment.cgi?id=306250
[2]: https://lore.kernel.org/kvm/20241021062226.108657-1-manali.shukla@amd.com/T…
Manali Shukla (4):
x86/cpufeatures: Add CPUID feature bit for Idle HLT intercept
KVM: SVM: Add Idle HLT intercept support
KVM: nSVM: implement the nested idle halt intercept
KVM: selftests: KVM: SVM: Add Idle HLT intercept test
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/svm.h | 1 +
arch/x86/include/uapi/asm/svm.h | 2 +
arch/x86/kvm/governed_features.h | 1 +
arch/x86/kvm/svm/nested.c | 7 ++
arch/x86/kvm/svm/svm.c | 15 +++-
tools/testing/selftests/kvm/Makefile | 1 +
.../selftests/kvm/include/x86_64/processor.h | 1 +
.../selftests/kvm/x86_64/svm_idle_hlt_test.c | 89 +++++++++++++++++++
9 files changed, 115 insertions(+), 3 deletions(-)
create mode 100644 tools/testing/selftests/kvm/x86_64/svm_idle_hlt_test.c
base-commit: c8d430db8eec7d4fd13a6bea27b7086a54eda6da
prerequisite-patch-id: ca912571db5c004f77b70843b8dd35517ff1267f
prerequisite-patch-id: 164ea3b4346f9e04bc69819278d20f5e1b5df5ed
prerequisite-patch-id: 90d870f426ebc2cec43c0dd89b701ee998385455
prerequisite-patch-id: 45812b799c517a4521782a1fdbcda881237e1eda
--
2.34.1
Hi,
Here is the 7th version of the series to support polling on event 'hist' file.
The previous version is here;
https://lore.kernel.org/all/172907575534.470540.12941248697563459082.stgit@…
This version updates descriptions, uses guard() for the mutex, and fixes
a selftest problem.
Background
----------
There has been interest in allowing user programs to monitor kernel
events in real time. Ftrace provides the `trace_pipe` interface to wait
on events in the ring buffer, but it has to wait until a page in the
ring buffer is filled with events. We can also peek at the
`trace` file periodically, but that is an inefficient way to monitor
a randomly occurring event.
Overview
--------
This patch set allows users to `poll` (or `select`, `epoll`) on the event
histogram interface. As you know, each event has its own `hist` file
which shows histograms generated by trigger action. So users can set
a new hist trigger on any event they want to monitor, and poll on the
`hist` file until it is updated.
Two poll events are supported, POLLIN and POLLPRI. POLLIN
means that there is a readable update on the `hist` file, and this
event is flushed only when you call read(). So, this is
useful if you want to read the histogram periodically.
The other event, POLLPRI, is for monitoring trace events. Like
POLLIN, it is returned when the histogram is updated, but
you don't need to read() the file before using poll() again.
Note that this waits for a histogram update (not event arrival), thus
you must first set a histogram on the event.
Usage
-----
Here is an example usage:
----
TRACEFS=/sys/kernel/tracing
EVENT=$TRACEFS/events/sched/sched_process_free
# setup histogram trigger and enable event
echo "hist:key=comm" >> $EVENT/trigger
echo 1 > $EVENT/enable
# Wait for update
poll pri $EVENT/hist
# Event arrived.
echo "process free event is coming"
tail $TRACEFS/trace
----
The 'poll' command is in the selftest patch.
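For readers without the selftest at hand, here is a minimal sketch of what
such a helper might do with poll(2). It is not the poll.c from the selftest
patch; wait_for_events() and wait_on_path() are illustrative names.

```c
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/*
 * Block until `events` (e.g. POLLIN or POLLPRI) is reported on fd, or until
 * timeout_ms elapses. Returns revents (> 0), 0 on timeout, -1 on error.
 */
int wait_for_events(int fd, short events, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = events };
	int ret = poll(&pfd, 1, timeout_ms);

	if (ret <= 0)
		return ret;
	return pfd.revents;
}

/* Open a path (e.g. an event's hist file) read-only and wait on it. */
int wait_on_path(const char *path, short events, int timeout_ms)
{
	int fd = open(path, O_RDONLY);
	int ret;

	if (fd < 0)
		return -1;
	ret = wait_for_events(fd, events, timeout_ms);
	close(fd);
	return ret;
}
```

With this series applied, calling wait_on_path() on $EVENT/hist with POLLPRI
and a timeout of -1 would be expected to block until the histogram is updated.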
You can take this series also from here;
https://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git/log/?h=t…
Thank you,
---
Masami Hiramatsu (Google) (3):
tracing/hist: Add poll(POLLIN) support on hist file
tracing/hist: Support POLLPRI event for poll on histogram
selftests/tracing: Add hist poll() support test
include/linux/trace_events.h | 14 +++
kernel/trace/trace_events.c | 14 +++
kernel/trace/trace_events_hist.c | 92 +++++++++++++++++++-
tools/testing/selftests/ftrace/Makefile | 2
tools/testing/selftests/ftrace/poll.c | 74 ++++++++++++++++
.../ftrace/test.d/trigger/trigger-hist-poll.tc | 74 ++++++++++++++++
6 files changed, 267 insertions(+), 3 deletions(-)
create mode 100644 tools/testing/selftests/ftrace/poll.c
create mode 100644 tools/testing/selftests/ftrace/test.d/trigger/trigger-hist-poll.tc
--
Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Hi,
Here is the v6 patch to support polling on event 'hist' file.
The previous version is here;
https://lore.kernel.org/all/172398710447.295714.4489282566285719918.stgit@d…
This version is rebased on the ftrace/for-next branch of the
linux-trace tree, and uses a global irq_work and wq instead of per-event
ones.
Background
----------
There has been interest in allowing user programs to monitor kernel
events in real time. Ftrace provides the `trace_pipe` interface to wait
on events in the ring buffer, but it has to wait until a page in the
ring buffer fills up with events. We can also peek at the
`trace` file periodically, but that is an inefficient way to monitor
a randomly occurring event.
Overview
--------
This patch set allows users to `poll` (or `select`, `epoll`) on the event
histogram interface. As you know, each event has its own `hist` file,
which shows histograms generated by trigger actions. So a user can set
a new hist trigger on any event they want to monitor, and poll on the
`hist` file until it is updated.
Two poll events are supported, POLLIN and POLLPRI. POLLIN
means that there is a readable update on the `hist` file, and this
event is flushed only when you call read(). So, this is
useful if you want to read the histogram periodically.
The other event, POLLPRI, is for monitoring a trace event. Like
POLLIN, it is returned when the histogram is updated, but you do
not need to read() the file before calling poll() again.
Note that this waits for a histogram update (not event arrival), so
you must set a histogram on the event first.
Usage
-----
Here is an example usage:
----
TRACEFS=/sys/kernel/tracing
EVENT=$TRACEFS/events/sched/sched_process_free
# setup histogram trigger and enable event
echo "hist:key=comm" >> $EVENT/trigger
echo 1 > $EVENT/enable
# Wait for update
poll pri $EVENT/hist
# Event arrived.
echo "process free event is coming"
tail $TRACEFS/trace
----
The 'poll' command is in the selftest patch.
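For readers who want to see the calling pattern from C, here is a minimal
sketch of a poll() wrapper. It is not the selftest's poll.c;
wait_for_events() is a hypothetical helper name, and the caller is assumed
to have opened the event's `hist` file read-only beforehand.

```c
#include <poll.h>

/*
 * Wait until one of the requested events is reported on fd, or until
 * timeout_ms elapses (-1 waits indefinitely).
 * Returns the reported revents mask, 0 on timeout, or -1 on error.
 */
static int wait_for_events(int fd, short events, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = events };
	int ret = poll(&pfd, 1, timeout_ms);

	if (ret < 0)
		return -1;	/* errno is set by poll() */
	if (ret == 0)
		return 0;	/* timed out, nothing reported */
	return pfd.revents;
}
```

With an fd for $EVENT/hist, wait_for_events(fd, POLLPRI, -1) would presumably
correspond to `poll pri` above. With POLLIN, the caller is expected to read()
the file before polling again.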
You can also get this series from here:
https://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git/log/?h=t…
Thank you,
---
Masami Hiramatsu (Google) (3):
tracing/hist: Add poll(POLLIN) support on hist file
tracing/hist: Support POLLPRI event for poll on histogram
selftests/tracing: Add hist poll() support test
include/linux/trace_events.h | 14 +++
kernel/trace/trace_events.c | 14 +++
kernel/trace/trace_events_hist.c | 100 +++++++++++++++++++-
tools/testing/selftests/ftrace/Makefile | 2
tools/testing/selftests/ftrace/poll.c | 74 +++++++++++++++
.../ftrace/test.d/trigger/trigger-hist-poll.tc | 74 +++++++++++++++
6 files changed, 275 insertions(+), 3 deletions(-)
create mode 100644 tools/testing/selftests/ftrace/poll.c
create mode 100644 tools/testing/selftests/ftrace/test.d/trigger/trigger-hist-poll.tc
--
Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
orig_a0 is missing from struct user_regs_struct on riscv, and there is
no way to add it without breaking UAPI. (See Link tag below)
As NT_ARM_SYSTEM_CALL does, we add a new regset named NT_RISCV_ORIG_A0 to
access the original a0 register from userspace via the ptrace API.
Link: https://lore.kernel.org/all/59505464-c84a-403d-972f-d4b2055eeaac@gmail.com/
Signed-off-by: Celeste Liu <uwu(a)coelacanthus.name>
---
Changes in v2:
- Fix integer width.
- Add selftest.
- Link to v1: https://lore.kernel.org/r/20241201-riscv-new-regset-v1-1-c83c58abcc7b@coela…
---
Celeste Liu (1):
riscv/ptrace: add new regset to access original a0 register
Charlie Jenkins (1):
riscv: selftests: Add a ptrace test to verify syscall parameter modification
arch/riscv/kernel/ptrace.c | 32 +++++++
include/uapi/linux/elf.h | 1 +
tools/testing/selftests/riscv/abi/.gitignore | 1 +
tools/testing/selftests/riscv/abi/Makefile | 5 +-
tools/testing/selftests/riscv/abi/ptrace.c | 134 +++++++++++++++++++++++++++
5 files changed, 172 insertions(+), 1 deletion(-)
---
base-commit: 0e287d31b62bb53ad81d5e59778384a40f8b6f56
change-id: 20241201-riscv-new-regset-d529b952ad0d
Best regards,
--
Celeste Liu <uwu(a)coelacanthus.name>
orig_a0 is missing from struct user_regs_struct on riscv, and there is
no way to add it without breaking UAPI. (See Link tag below)
As NT_ARM_SYSTEM_CALL does, we add a new regset named NT_RISCV_ORIG_A0 to
access the original a0 register from userspace via the ptrace API.
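As a rough illustration of how a tracer could use such a regset, the sketch
below wraps PTRACE_GETREGSET. read_regset() is a hypothetical helper. The
note type is passed in as a parameter because the numeric value of
NT_RISCV_ORIG_A0 comes from the patched include/uapi/linux/elf.h and is not
repeated here; the size of the regset payload is likewise left to the caller.

```c
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Read one regset of a stopped tracee into buf.
 * nt_type is the ELF note type that names the regset; a caller on riscv
 * would pass NT_RISCV_ORIG_A0 (assumed to be provided by the headers).
 * Returns the number of bytes the kernel filled in, or -1 on error.
 */
static long read_regset(pid_t pid, unsigned int nt_type, void *buf, size_t len)
{
	struct iovec iov = { .iov_base = buf, .iov_len = len };

	if (ptrace(PTRACE_GETREGSET, pid, (void *)(unsigned long)nt_type, &iov) == -1)
		return -1;	/* errno is set by ptrace() */
	return (long)iov.iov_len;	/* the kernel updates iov_len */
}
```

Writing the register back would use PTRACE_SETREGSET with the same iovec
layout.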
Link: https://lore.kernel.org/all/59505464-c84a-403d-972f-d4b2055eeaac@gmail.com/
Signed-off-by: Celeste Liu <uwu(a)coelacanthus.name>
---
Changes in v3:
- Use return 0 directly for readability.
- Fix the test for modifying a0.
- Add Fixes: tag
- Remove useless Cc: stable.
- Selftest will check both a0 and orig_a0, but depends on the
correctness of PTRACE_GET_SYSCALL_INFO.
- Link to v2: https://lore.kernel.org/r/20241203-riscv-new-regset-v2-0-d37da8c0cba6@coela…
Changes in v2:
- Fix integer width.
- Add selftest.
- Link to v1: https://lore.kernel.org/r/20241201-riscv-new-regset-v1-1-c83c58abcc7b@coela…
---
Celeste Liu (2):
riscv/ptrace: add new regset to access original a0 register
riscv: selftests: Add a ptrace test to verify syscall parameter modification
arch/riscv/kernel/ptrace.c | 32 ++++++
include/uapi/linux/elf.h | 1 +
tools/testing/selftests/riscv/abi/.gitignore | 1 +
tools/testing/selftests/riscv/abi/Makefile | 5 +-
tools/testing/selftests/riscv/abi/ptrace.c | 151 +++++++++++++++++++++++++++
5 files changed, 189 insertions(+), 1 deletion(-)
---
base-commit: 0e287d31b62bb53ad81d5e59778384a40f8b6f56
change-id: 20241201-riscv-new-regset-d529b952ad0d
Best regards,
--
Celeste Liu <uwu(a)coelacanthus.name>
Fixes an issue where out-of-tree kselftest builds fail when building
the BPF and bpftools components. The failure occurs because the top-level
Makefile passes a relative srctree path ('..') to its sub-Makefiles, which
leads to errors in locating necessary files.
For example, the following error is encountered:
```
$ make V=1 O=$build/ TARGETS=hid kselftest-all
...
make -C ../tools/testing/selftests all
make[4]: Entering directory '/path/to/linux/tools/testing/selftests/hid'
make -C /path/to/linux/tools/testing/selftests/../../../tools/lib/bpf OUTPUT=/path/to/linux/O/kselftest/hid/tools/build/libbpf/ \
EXTRA_CFLAGS='-g -O0' \
DESTDIR=/path/to/linux/O/kselftest/hid/tools prefix= all install_headers
make[5]: Entering directory '/path/to/linux/tools/lib/bpf'
...
make[5]: Entering directory '/path/to/linux/tools/bpf/bpftool'
Makefile:127: ../tools/build/Makefile.feature: No such file or directory
make[5]: *** No rule to make target '../tools/build/Makefile.feature'. Stop.
```
To resolve this, the srctree is exported as an absolute path (abs_srctree)
when performing an out-of-tree build. This ensures that all sub-Makefiles
have the correct path to the source tree, preventing directory resolution
errors.
Signed-off-by: Li Zhijian <lizhijian(a)fujitsu.com>
---
Request for Additional Testing
We welcome all contributors and CI systems to test this change thoroughly.
In theory, this change should not affect in-tree builds. However, to ensure
stability and compatibility, we encourage testing across different
configurations.
What has been tested?
- out-of-tree kernel build
- out-of-tree kselftest-all
---
Makefile | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/Makefile b/Makefile
index e5b8a8832c0c..36e65806bb5e 100644
--- a/Makefile
+++ b/Makefile
@@ -275,7 +275,8 @@ else ifeq ($(srcroot)/,$(dir $(CURDIR)))
srcroot := ..
endif
-export srctree := $(if $(KBUILD_EXTMOD),$(abs_srctree),$(srcroot))
+srctree := $(if $(KBUILD_EXTMOD),$(abs_srctree),$(srcroot))
+export srctree := $(if $(building_out_of_srctree),$(abs_srctree),$(srctree))
ifdef building_out_of_srctree
export VPATH := $(srcroot)
--
2.44.0
Hi,
In /proc/PID/stat, there is the kstkesp field which is the stack pointer of
a thread. While the thread is active, this field reads zero. But during a
coredump, it should have a valid value.
However, at the moment, kstkesp is zero even during coredump.
The first commit fixes this problem, and the second commit adds a selftest
to detect if this problem appears again in the future.
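To check the field by hand, the value can be pulled out of a /proc/PID/stat
line. The sketch below is not the selftest from this series; parse_kstkesp()
is a hypothetical helper, and the field position (29) follows the numbering
used in proc(5).

```c
#include <stdlib.h>
#include <string.h>

/*
 * Extract kstkesp (field 29 in proc(5) numbering) from one line of
 * /proc/PID/stat. The comm field may contain spaces and parentheses,
 * so parsing starts after the last ')'.
 * Returns 0 and stores the value in *esp, or -1 on malformed input.
 */
static int parse_kstkesp(const char *stat_line, unsigned long long *esp)
{
	const char *p = strrchr(stat_line, ')');
	int field;

	if (!p)
		return -1;
	p++;	/* now just before field 3 (state) */

	/* skip fields 3..28 so that p ends up before field 29 */
	for (field = 3; field < 29; field++) {
		p += strspn(p, " ");
		if (*p == '\0')
			return -1;
		p += strcspn(p, " ");
	}
	p += strspn(p, " ");
	if (*p < '0' || *p > '9')
		return -1;
	*esp = strtoull(p, NULL, 10);
	return 0;
}
```

For a running thread the parsed value is expected to be zero, as described
above; the interesting case is a read taken while the coredump is in progress.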
Nam Cao (2):
fs/proc: do_task_stat: Fix ESP not readable during coredump
selftests: coredump: Add stackdump test
fs/proc/array.c | 36 ++--
tools/testing/selftests/coredump/Makefile | 7 +
tools/testing/selftests/coredump/README.rst | 50 ++++++
tools/testing/selftests/coredump/stackdump | 14 ++
.../selftests/coredump/stackdump_test.c | 154 ++++++++++++++++++
5 files changed, 243 insertions(+), 18 deletions(-)
create mode 100644 tools/testing/selftests/coredump/Makefile
create mode 100644 tools/testing/selftests/coredump/README.rst
create mode 100755 tools/testing/selftests/coredump/stackdump
create mode 100644 tools/testing/selftests/coredump/stackdump_test.c
--
2.39.5
This series:
1. makes the behavior of of_find_device_by_node(),
bus_find_device_by_of_node(), bus_find_device_by_fwnode(), etc., more
consistent when provided with a NULL node/handle;
2. adds kunit tests to validate the new NULL-argument behavior; and
3. makes some related improvements and refactoring for the drivers/base/
kunit tests.
This series aims to prevent problems like the ones resolved in commit
5c8418cf4025 ("PCI/pwrctrl: Unregister platform device only if one
actually exists").
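The effect of a NULL key can be shown with a small userspace model. This is
only a sketch: struct fake_device, both match functions and find_device() are
made-up stand-ins, and it assumes the problem comes from a plain pointer
comparison in the match callback.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for struct device; only the field used here. */
struct fake_device {
	const void *of_node;
};

/* Plain pointer comparison: a NULL key matches any device without a node. */
static bool match_of_node_loose(const struct fake_device *dev, const void *np)
{
	return dev->of_node == np;
}

/* NULL-rejecting variant: a NULL key never matches. */
static bool match_of_node_strict(const struct fake_device *dev, const void *np)
{
	return np && dev->of_node == np;
}

/* Return the first device accepted by match(), or NULL if there is none. */
static const struct fake_device *
find_device(const struct fake_device *devs, size_t n, const void *key,
	    bool (*match)(const struct fake_device *, const void *))
{
	for (size_t i = 0; i < n; i++)
		if (match(&devs[i], key))
			return &devs[i];
	return NULL;
}
```

With the loose match, a lookup by NULL returns some unrelated device that
happens to have no node; with the strict match, the same lookup finds nothing.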
Changes in v3:
* Fix potential leak in test error case
Changes in v2:
* CC LKML (oops!)
* Keep "devm" and "match" tests in separate suites
Brian Norris (3):
drivers: base: Don't match devices with NULL of_node/fwnode/etc
drivers: base: test: Enable device model tests with KUNIT_ALL_TESTS
drivers: base: test: Add ...find_device_by...(... NULL) tests
drivers/base/core.c | 8 ++---
drivers/base/test/Kconfig | 1 +
drivers/base/test/platform-device-test.c | 41 +++++++++++++++++++++++-
3 files changed, 45 insertions(+), 5 deletions(-)
--
2.47.1.613.gc27f4b7a9f-goog
This is documented as --per_test_log but the argument is actually
--per-test-log.
Signed-off-by: Brendan Jackman <jackmanb(a)google.com>
---
tools/testing/selftests/run_kselftest.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/run_kselftest.sh b/tools/testing/selftests/run_kselftest.sh
index a28c1416cb89b96ba5f8b287e68b324b51d95673..50e03eefe7ac70d1b21ec1da4d245182dda7b8ad 100755
--- a/tools/testing/selftests/run_kselftest.sh
+++ b/tools/testing/selftests/run_kselftest.sh
@@ -21,7 +21,7 @@ usage()
cat <<EOF
Usage: $0 [OPTIONS]
-s | --summary Print summary with detailed log in output.log (conflict with -p)
- -p | --per_test_log Print test log in /tmp with each test name (conflict with -s)
+ -p | --per-test-log Print test log in /tmp with each test name (conflict with -s)
-t | --test COLLECTION:TEST Run TEST from COLLECTION
-c | --collection COLLECTION Run all tests from COLLECTION
-l | --list List the available collection:test entries
---
base-commit: eabcdba3ad4098460a376538df2ae36500223c1e
change-id: 20241220-per-test-log-33ecf9d49406
Best regards,
--
Brendan Jackman <jackmanb(a)google.com>
From: Eduard Zingerman <eddyz87(a)gmail.com>
[ Upstream commit 1a4607ffba35bf2a630aab299e34dd3f6e658d70 ]
Tail-called programs could execute any of the helpers that invalidate
packet pointers. Hence, conservatively assume that each tail call
invalidates packet pointers.
Making the change in bpf_helper_changes_pkt_data() automatically makes
use of check_cfg() logic that computes 'changes_pkt_data' effect for
global sub-programs, such that the following program could be
rejected:
int tail_call(struct __sk_buff *sk)
{
bpf_tail_call_static(sk, &jmp_table, 0);
return 0;
}
SEC("tc")
int not_safe(struct __sk_buff *sk)
{
int *p = (void *)(long)sk->data;
... make p valid ...
tail_call(sk);
*p = 42; /* this is unsafe */
...
}
The tc_bpf2bpf.c:subprog_tc() needs change: mark it as a function that
can invalidate packet pointers. Otherwise, it can't be freplaced with
tailcall_freplace.c:entry_freplace() that does a tail call.
Signed-off-by: Eduard Zingerman <eddyz87(a)gmail.com>
Link: https://lore.kernel.org/r/20241210041100.1898468-8-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
net/core/filter.c | 2 ++
tools/testing/selftests/bpf/progs/tc_bpf2bpf.c | 2 ++
2 files changed, 4 insertions(+)
diff --git a/net/core/filter.c b/net/core/filter.c
index 33125317994e..bbd0c08072cb 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -7934,6 +7934,8 @@ bool bpf_helper_changes_pkt_data(enum bpf_func_id func_id)
case BPF_FUNC_xdp_adjust_head:
case BPF_FUNC_xdp_adjust_meta:
case BPF_FUNC_xdp_adjust_tail:
+ /* tail-called program could call any of the above */
+ case BPF_FUNC_tail_call:
return true;
default:
return false;
diff --git a/tools/testing/selftests/bpf/progs/tc_bpf2bpf.c b/tools/testing/selftests/bpf/progs/tc_bpf2bpf.c
index 8a0632c37839..79f5087dade2 100644
--- a/tools/testing/selftests/bpf/progs/tc_bpf2bpf.c
+++ b/tools/testing/selftests/bpf/progs/tc_bpf2bpf.c
@@ -10,6 +10,8 @@ int subprog(struct __sk_buff *skb)
int ret = 1;
__sink(ret);
+ /* let verifier know that 'subprog_tc' can change pointers to skb->data */
+ bpf_skb_change_proto(skb, 0, 0);
return ret;
}
--
2.39.5
Alongside the helper ip_link_set_up(), one to set the link down will be
useful as well. Add a helper to determine the link state as well,
ip_link_is_up(), and use it to short-circuit any changes if the state is
already the desired one.
Furthermore, add a helper bridge_vlan_add().
Signed-off-by: Petr Machata <petrm(a)nvidia.com>
Reviewed-by: Ido Schimmel <idosch(a)nvidia.com>
---
CC: Shuah Khan <shuah(a)kernel.org>
CC: linux-kselftest(a)vger.kernel.org
---
tools/testing/selftests/net/lib.sh | 31 ++++++++++++++++++++++++++++--
1 file changed, 29 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/net/lib.sh b/tools/testing/selftests/net/lib.sh
index 2cd5c743b2d9..0bd9a038a1f0 100644
--- a/tools/testing/selftests/net/lib.sh
+++ b/tools/testing/selftests/net/lib.sh
@@ -477,12 +477,33 @@ ip_link_set_addr()
defer ip link set dev "$name" address "$old_addr"
}
+ip_link_is_up()
+{
+ local name=$1; shift
+
+ local state=$(ip -j link show "$name" |
+ jq -r '(.[].flags[] | select(. == "UP")) // "DOWN"')
+ [[ $state == "UP" ]]
+}
+
ip_link_set_up()
{
local name=$1; shift
- ip link set dev "$name" up
- defer ip link set dev "$name" down
+ if ! ip_link_is_up "$name"; then
+ ip link set dev "$name" up
+ defer ip link set dev "$name" down
+ fi
+}
+
+ip_link_set_down()
+{
+ local name=$1; shift
+
+ if ip_link_is_up "$name"; then
+ ip link set dev "$name" down
+ defer ip link set dev "$name" up
+ fi
}
ip_addr_add()
@@ -498,3 +519,9 @@ ip_route_add()
ip route add "$@"
defer ip route del "$@"
}
+
+bridge_vlan_add()
+{
+ bridge vlan add "$@"
+ defer bridge vlan del "$@"
+}
--
2.47.0
The pattern rule `$(OUTPUT)/%: %.c` inadvertently included a circular
dependency on the global-timer target due to its inclusion in
$(TEST_GEN_PROGS_EXTENDED). This resulted in a circular dependency
warning during the build process.
To resolve this, the dependency on $(TEST_GEN_PROGS_EXTENDED) has been
replaced with an explicit dependency on $(OUTPUT)/libatest.so. This change
ensures that libatest.so is built before any other targets that require it,
without creating a circular dependency.
This fix addresses the following warning:
make[4]: Entering directory 'tools/testing/selftests/alsa'
make[4]: Circular default_modconfig/kselftest/alsa/global-timer <- default_modconfig/kselftest/alsa/global-timer dependency dropped.
make[4]: Nothing to be done for 'all'.
make[4]: Leaving directory 'tools/testing/selftests/alsa'
Cc: Mark Brown <broonie(a)kernel.org>
Cc: Jaroslav Kysela <perex(a)perex.cz>
Cc: Takashi Iwai <tiwai(a)suse.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Signed-off-by: Li Zhijian <lizhijian(a)fujitsu.com>
---
Cc: linux-sound(a)vger.kernel.org
Cc: linux-kselftest(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
---
tools/testing/selftests/alsa/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/alsa/Makefile b/tools/testing/selftests/alsa/Makefile
index 944279160fed..8dab90ad22bb 100644
--- a/tools/testing/selftests/alsa/Makefile
+++ b/tools/testing/selftests/alsa/Makefile
@@ -27,5 +27,5 @@ include ../lib.mk
$(OUTPUT)/libatest.so: conf.c alsa-local.h
$(CC) $(CFLAGS) -shared -fPIC $< $(LDLIBS) -o $@
-$(OUTPUT)/%: %.c $(TEST_GEN_PROGS_EXTENDED) alsa-local.h
+$(OUTPUT)/%: %.c $(OUTPUT)/libatest.so alsa-local.h
$(CC) $(CFLAGS) $< $(LDLIBS) -latest -o $@
--
2.44.0
This patch allows progs to elide a null check on statically known map
lookup keys. In other words, if the verifier can statically prove that
the lookup will be in-bounds, allow the prog to drop the null check.
This is useful for two reasons:
1. Large numbers of nullness checks (especially when they cannot fail)
unnecessarily push the prog towards BPF_COMPLEXITY_LIMIT_JMP_SEQ.
2. It forms a tighter contract between programmer and verifier.
For (1), bpftrace is starting to make heavier use of percpu scratch
maps. As a result, for user scripts with a large number of unrolled loops,
we are starting to hit jump complexity verification errors. These
percpu lookups cannot fail anyway, as we only use static key values.
Eliding nullness checks probably results in less work for the verifier as well.
For (2), percpu scratch maps are often used as a larger stack, as the
current stack is limited to 512 bytes. In these situations, it is
desirable for the programmer to express: "this lookup should never fail,
and if it does, it means I messed up the code". By omitting the null
check, the programmer can "ask" the verifier to double check the logic.
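The in-bounds condition can be pictured with a toy model. This is not
verifier code: lookup_cannot_fail() is a hypothetical name, the negative
"key not known" sentinel is an assumption made for the sketch, and it only
models an array-style map where every index below max_entries exists.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Toy model of the decision. const_key is the lookup key when it is a
 * known constant, or a negative value when it is not known statically.
 * For an array-style map with max_entries slots, the lookup cannot
 * return NULL when the constant key is inside the bounds.
 */
static bool lookup_cannot_fail(int64_t const_key, uint32_t max_entries)
{
	if (const_key < 0)
		return false;	/* key unknown: keep the null check */
	return (uint64_t)const_key < max_entries;
}
```

A prog that omits the null check would then only be accepted when the model
returns true, which is the "double check" the programmer is asking for.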
Changes in v5:
* Dropped all acks
* Use s64 instead of long for const_map_key
* Ensure stack slot contains spilled reg before accessing spilled_ptr
* Ensure spilled reg is a scalar before accessing tnum const value
* Fix verifier selftest for 32-bit write to write at 8 byte alignment
to ensure spill is tracked
* Introduce more precise tracking of helper stack accesses
* Do constant map key extraction as part of helper argument processing
and then remove duplicated stack checks
* Use ret_flag instead of regs[BPF_REG_0].type
* Handle STACK_ZERO
* Fix bug in bpf_load_hdr_opt() arg annotation
Changes in v4:
* Only allow for CAP_BPF
* Add test for stack growing upwards
* Improve comment about stack growing upwards
Changes in v3:
* Check if stack is (erroneously) growing upwards
* Mention in commit message why existing tests needed change
Changes in v2:
* Added a check for when R2 is not a ptr to stack
* Added a check for when stack is uninitialized (no stack slot yet)
* Updated existing tests to account for null elision
* Added test case for when R2 can be both const and non-const
Daniel Xu (5):
bpf: verifier: Add missing newline on verbose() call
bpf: tcp: Mark bpf_load_hdr_opt() arg2 as read-write
bpf: verifier: Refactor helper access type tracking
bpf: verifier: Support eliding map lookup nullness
bpf: selftests: verifier: Add nullness elision tests
kernel/bpf/verifier.c | 127 ++++++++---
net/core/filter.c | 2 +-
.../testing/selftests/bpf/progs/dynptr_fail.c | 6 +-
tools/testing/selftests/bpf/progs/iters.c | 14 +-
.../selftests/bpf/progs/map_kptr_fail.c | 2 +-
.../selftests/bpf/progs/test_global_func10.c | 2 +-
.../selftests/bpf/progs/uninit_stack.c | 29 ---
.../bpf/progs/verifier_array_access.c | 214 ++++++++++++++++++
.../bpf/progs/verifier_basic_stack.c | 2 +-
.../selftests/bpf/progs/verifier_const_or.c | 4 +-
.../progs/verifier_helper_access_var_len.c | 12 +-
.../selftests/bpf/progs/verifier_int_ptr.c | 2 +-
.../selftests/bpf/progs/verifier_map_in_map.c | 2 +-
.../selftests/bpf/progs/verifier_mtu.c | 2 +-
.../selftests/bpf/progs/verifier_raw_stack.c | 4 +-
.../selftests/bpf/progs/verifier_unpriv.c | 2 +-
.../selftests/bpf/progs/verifier_var_off.c | 8 +-
tools/testing/selftests/bpf/verifier/calls.c | 2 +-
.../testing/selftests/bpf/verifier/map_kptr.c | 2 +-
19 files changed, 342 insertions(+), 96 deletions(-)
--
2.46.0
This series was prompted by feedback given in [1].
Patch 1 : Adds safe_halt() and cli() helpers.
Patch 2, 3: Add an interface to read vcpu stats in selftests. Add
a macro that generates a compiler error, so that typos in
vcpu and vm stat names are caught at compile time.
Patch 4 : Fixes a few of the selftests based on the newly defined macro.
This series was split from the Idle HLT intercept support series [2]
because the series has a few changes in the vm_get_stat() interface
as suggested in [1] and a few changes in two of the self-tests
(nx_huge_pages_test.c and dirty_log_page_splitting_test.c) which use
vm_get_stat() functionality to retrieve specified VM stats. These
changes are unrelated to the Idle HLT intercept support series [2].
[1] https://lore.kernel.org/kvm/ZruDweYzQRRcJeTO@google.com/T/#m7cd7a110f0fcff9…
[2] https://lore.kernel.org/kvm/ZruDweYzQRRcJeTO@google.com/T/#m6c67ca8ccb226e5…
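The compile-time typo check can be pictured with a generic C technique: have
the macro reference an identifier derived from the stat name, so an unknown
name fails to compile, and stringify the same token for the runtime lookup.
This is a sketch under that assumption, not the macro from the series; the
STAT_NAME_ enumerators, struct stat_entry, lookup_stat() and get_stat() are
all made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical list of valid stat names, one enumerator per name. */
enum { STAT_NAME_pages_4k, STAT_NAME_pages_2m, STAT_NAME_halt_exits };

struct stat_entry {
	const char *name;
	uint64_t value;
};

/* Runtime lookup by string; returns 0 when the name is not in the table. */
static uint64_t lookup_stat(const struct stat_entry *tbl, size_t n,
			    const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(tbl[i].name, name) == 0)
			return tbl[i].value;
	return 0;
}

/*
 * STAT_NAME_##stat must be a declared enumerator, so a misspelled stat
 * name is rejected by the compiler instead of failing at run time.
 */
#define get_stat(tbl, n, stat) \
	((void)STAT_NAME_##stat, lookup_stat((tbl), (n), #stat))
```

A call such as get_stat(tbl, n, pages_4kk) would stop the build with an
undeclared identifier error, while valid names expand to an ordinary lookup.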
Manali Shukla (4):
KVM: selftests: Add safe_halt() and cli() helpers to common code
KVM: selftests: Add an interface to read the data of named vcpu stat
KVM: selftests: convert vm_get_stat to macro
KVM: selftests: Replace previously used vm_get_stat() to macro
.../testing/selftests/kvm/include/kvm_util.h | 83 +++++++++++++++++--
.../kvm/include/x86_64/kvm_util_arch.h | 52 ++++++++++++
.../selftests/kvm/include/x86_64/processor.h | 17 ++++
tools/testing/selftests/kvm/lib/kvm_util.c | 40 +++++++++
.../x86_64/dirty_log_page_splitting_test.c | 6 +-
.../selftests/kvm/x86_64/nx_huge_pages_test.c | 4 +-
6 files changed, 191 insertions(+), 11 deletions(-)
base-commit: c8d430db8eec7d4fd13a6bea27b7086a54eda6da
--
2.34.1
Currently, unhandleable vectoring (e.g. when the guest accesses MMIO
during vectoring) is handled differently on VMX and SVM: on VMX, KVM
returns an internal error, whereas SVM goes into an infinite loop trying
to deliver the event again and again.
This patch series eliminates this difference by returning a KVM internal
error when KVM can't emulate during vectoring for both VMX and SVM.
Also, introduce a selftest test case which covers the error handling
mentioned above.
V1 -> V2:
- Make commit messages more brief, avoid using pronouns
- Extract SVM error handling into a separate commit
- Introduce a new X86EMUL_ return type and detect the unhandleable
vectoring error in vendor-specific check_emulate_instruction instead of
handling it in the common MMU code (which is specific for cached MMIO)
V2 -> V3:
- Make the new X86EMUL_ code more generic
- Prohibit any emulation during vectoring if it is due to an intercepted
#PF
- Add a new patch for checking whether unprotect & retry is possible
before exiting to userspace due to unhandleable vectoring
- Codestyle fixes
Ivan Orlov (7):
KVM: x86: Add function for vectoring error generation
KVM: x86: Add emulation status for unhandleable vectoring
KVM: x86: Unprotect & retry before unhandleable vectoring check
KVM: VMX: Handle vectoring error in check_emulate_instruction
KVM: SVM: Handle vectoring error in check_emulate_instruction
selftests: KVM: extract lidt into helper function
selftests: KVM: Add test case for MMIO during vectoring
arch/x86/include/asm/kvm_host.h | 11 +++-
arch/x86/kvm/kvm_emulate.h | 2 +
arch/x86/kvm/svm/svm.c | 6 +++
arch/x86/kvm/vmx/vmx.c | 30 ++++-------
arch/x86/kvm/x86.c | 31 +++++++++++
.../selftests/kvm/include/x86_64/processor.h | 7 +++
.../selftests/kvm/set_memory_region_test.c | 53 ++++++++++++++++++-
.../selftests/kvm/x86_64/sev_smoke_test.c | 2 +-
8 files changed, 117 insertions(+), 25 deletions(-)
--
2.43.0
Hi,
This series carries forward the effort to add Kselftests for the PCI Endpoint
Subsystem started by Aman Gupta [1] a while ago. I reworked the initial version
based on another patch that fixes the return values of IOCTLs in the
pci_endpoint_test driver and did many cleanups. Since the resulting work
modified the initial version substantially, I took over the authorship.
This series also incorporates the review comment by Shuah Khan [2] to move the
existing tests from 'tools/pci' to 'tools/testing/selftests/pci_endpoint' before
migrating to the Kselftest framework. I made sure that the tests are executable in
each commit and updated the documentation accordingly.
NOTE: Patch 1 is not strictly related to this series, but is necessary to execute
Kselftests with Qualcomm Endpoint devices. So it can be merged separately.
- Mani
[1] https://lore.kernel.org/linux-pci/20221007053934.5188-1-aman1.gupta@samsung…
[2] https://lore.kernel.org/linux-pci/b2a5db97-dc59-33ab-71cd-f591e0b1b34d@linu…
Changes in v3:
* Collected tags.
* Added a note about failing testcase 10 and command to skip it in
documentation.
* Removed Aman Gupta and Padmanabhan Rajanbabu from CC as their addresses are
bouncing.
Changes in v2:
* Added a patch that fixes return values of IOCTL in pci_endpoint_test driver
* Moved the existing tests to new location before migrating
* Added a fix for BARs on Qcom devices
* Updated documentation and also added fixture variants for memcpy & DMA modes
Manivannan Sadhasivam (4):
PCI: qcom-ep: Mark BAR0/BAR2 as 64bit BARs and BAR1/BAR3 as RESERVED
misc: pci_endpoint_test: Fix the return value of IOCTL
selftests: Move PCI Endpoint tests from tools/pci to Kselftests
selftests: pci_endpoint: Migrate to Kselftest framework
Documentation/PCI/endpoint/pci-test-howto.rst | 152 ++++-------
MAINTAINERS | 2 +-
drivers/misc/pci_endpoint_test.c | 236 ++++++++---------
drivers/pci/controller/dwc/pcie-qcom-ep.c | 4 +
tools/pci/Build | 1 -
tools/pci/Makefile | 58 ----
tools/pci/pcitest.c | 250 ------------------
tools/pci/pcitest.sh | 72 -----
tools/testing/selftests/Makefile | 1 +
.../testing/selftests/pci_endpoint/.gitignore | 2 +
tools/testing/selftests/pci_endpoint/Makefile | 7 +
tools/testing/selftests/pci_endpoint/config | 4 +
.../pci_endpoint/pci_endpoint_test.c | 186 +++++++++++++
13 files changed, 373 insertions(+), 602 deletions(-)
delete mode 100644 tools/pci/Build
delete mode 100644 tools/pci/Makefile
delete mode 100644 tools/pci/pcitest.c
delete mode 100644 tools/pci/pcitest.sh
create mode 100644 tools/testing/selftests/pci_endpoint/.gitignore
create mode 100644 tools/testing/selftests/pci_endpoint/Makefile
create mode 100644 tools/testing/selftests/pci_endpoint/config
create mode 100644 tools/testing/selftests/pci_endpoint/pci_endpoint_test.c
--
2.25.1
The 2024 architecture release includes a number of data processing
extensions, mostly SVE and SME additions with a few others. These are
all very straightforward extensions which add instructions but no
architectural state, so they only need hwcaps and exposure of the ID
registers to KVM guests and userspace.
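For context on the userspace side, hwcaps are read from the auxiliary vector.
The sketch below is a generic check, not code from this series;
cpu_has_hwcap() is a hypothetical helper, and the bit names added for the
2024 extensions are not listed here, so the mask is left to the caller.

```c
#include <stdbool.h>
#include <sys/auxv.h>

/*
 * True when every bit in mask is set in the given auxv entry
 * (for example AT_HWCAP or AT_HWCAP2).
 */
static bool cpu_has_hwcap(unsigned long auxv_type, unsigned long mask)
{
	return (getauxval(auxv_type) & mask) == mask;
}
```

A program would test the relevant bit before using the new instructions and
fall back to a generic code path otherwise.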
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Changes in v3:
- Commit log update for the hwcap test.
- Link to v2: https://lore.kernel.org/r/20241030-arm64-2024-dpisa-v2-0-b6601a15d2a5@kerne…
Changes in v2:
- Filter KVM guest visible bitfields in ID_AA64ISAR3_EL1 to only those
we make writeable.
- Link to v1: https://lore.kernel.org/r/20241028-arm64-2024-dpisa-v1-0-a38d08b008a8@kerne…
---
Mark Brown (9):
arm64/sysreg: Update ID_AA64PFR2_EL1 to DDI0601 2024-09
arm64/sysreg: Update ID_AA64ISAR3_EL1 to DDI0601 2024-09
arm64/sysreg: Update ID_AA64FPFR0_EL1 to DDI0601 2024-09
arm64/sysreg: Update ID_AA64ZFR0_EL1 to DDI0601 2024-09
arm64/sysreg: Update ID_AA64SMFR0_EL1 to DDI0601 2024-09
arm64/sysreg: Update ID_AA64ISAR2_EL1 to DDI0601 2024-09
arm64/hwcap: Describe 2024 dpISA extensions to userspace
KVM: arm64: Allow control of dpISA extensions in ID_AA64ISAR3_EL1
kselftest/arm64: Add 2024 dpISA extensions to hwcap test
Documentation/arch/arm64/elf_hwcaps.rst | 51 ++++++
arch/arm64/include/asm/hwcap.h | 17 ++
arch/arm64/include/uapi/asm/hwcap.h | 17 ++
arch/arm64/kernel/cpufeature.c | 35 ++++
arch/arm64/kernel/cpuinfo.c | 17 ++
arch/arm64/kvm/sys_regs.c | 6 +-
arch/arm64/tools/sysreg | 87 +++++++++-
tools/testing/selftests/arm64/abi/hwcap.c | 273 +++++++++++++++++++++++++++++-
8 files changed, 493 insertions(+), 10 deletions(-)
---
base-commit: 40384c840ea1944d7c5a392e8975ed088ecf0b37
change-id: 20241008-arm64-2024-dpisa-8091074a7f48
Best regards,
--
Mark Brown <broonie(a)kernel.org>
This patch series adds more test cases that issue ioctls to ucontrol VMs and
their floating interrupt controller.
The test cases trigger three possible null pointer dereferences within
the handling of the KVM_DEV_FLIC_APF_ENABLE,
KVM_DEV_FLIC_APF_DISABLE_WAIT and KVM_SET_GSI_ROUTING ioctls.
All of these issues exist only on ucontrol VMs. Fixes for the issues
are included within the patch series.
v2:
- added documentation changes
- simplify uc_flic_attrs; remove .getrc and .setrc from uc_flic_attrs
(Thanks Claudio)
Christoph Schlameuss (6):
kvm: s390: Reject setting flic pfault attributes on ucontrol VMs
selftests: kvm: s390: Add ucontrol flic attr selftests
kvm: s390: Reject KVM_SET_GSI_ROUTING on ucontrol VMs
selftests: kvm: s390: Add ucontrol gis routing test
selftests: kvm: s390: Streamline uc_skey test to issue iske after sske
selftests: kvm: s390: Add has device attr check to uc_attr_mem_limit
selftest
Documentation/virt/kvm/api.rst | 3 +
Documentation/virt/kvm/devices/s390_flic.rst | 4 +
arch/s390/kvm/interrupt.c | 6 +
.../selftests/kvm/s390x/ucontrol_test.c | 194 ++++++++++++++++--
4 files changed, 189 insertions(+), 18 deletions(-)
--
2.47.1
This patch set fixes errors in install and run_tests when
'O=' is specified, such as with `make O=/path/to/build TARGETS=net kselftest-install`.
Li Zhijian (2):
selftests/Makefile: Create BUILD_TARGET directory for
INSTALL_DEP_TARGETS
selftests/Makefile: add INSTALL_DEP_TARGETS to run_tests
tools/testing/selftests/Makefile | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--
2.44.0
Since commit e87412e621f1 ("integrate Zaamo and Zalrsc text (#1304)"),
the A extension has been described as a set of instructions provided by
Zaamo and Zalrsc. Add these two extensions.
This series is based on the Zc one [1].
Link: https://lore.kernel.org/linux-riscv/20240619113529.676940-1-cleger@rivosinc…
---
Clément Léger (5):
dt-bindings: riscv: add Zaamo and Zalrsc ISA extension description
riscv: add parsing for Zaamo and Zalrsc extensions
riscv: hwprobe: export Zaamo and Zalrsc extensions
RISC-V: KVM: Allow Zaamo/Zalrsc extensions for Guest/VM
KVM: riscv: selftests: Add Zaamo/Zalrsc extensions to get-reg-list
test
Documentation/arch/riscv/hwprobe.rst | 8 ++++++++
.../devicetree/bindings/riscv/extensions.yaml | 19 +++++++++++++++++++
arch/riscv/include/asm/hwcap.h | 2 ++
arch/riscv/include/uapi/asm/hwprobe.h | 2 ++
arch/riscv/include/uapi/asm/kvm.h | 2 ++
arch/riscv/kernel/cpufeature.c | 9 ++++++++-
arch/riscv/kernel/sys_hwprobe.c | 2 ++
arch/riscv/kvm/vcpu_onereg.c | 4 ++++
.../selftests/kvm/riscv/get-reg-list.c | 8 ++++++++
9 files changed, 55 insertions(+), 1 deletion(-)
--
2.45.2
This patch series includes some netns-related improvements and fixes for
rtnetlink, to make link creation more intuitive:
1) Creating a link in another net namespace doesn't conflict with link
names in the current one.
2) Refactor rtnetlink link creation. Create the link in the target namespace
directly.
So that
# ip link add netns ns1 link-netns ns2 tun0 type gre ...
will create tun0 in ns1, rather than creating it in ns2 and moving it to ns1,
and will not conflict with another interface named "tun0" in the current netns.
Patch 01 serves 1) by avoiding link name conflicts between different netns.
To achieve 2), there are 3 main steps:
- Patch 02 packs newlink() parameters into a struct, including
the original "src_net" along with more netns context.
- Patch 03 ~ 07 convert device drivers to use the explicit netns
extracted from params.
- Patch 08 ~ 09 remove the old netns parameter, and convert
rtnetlink to create the device in the target netns directly.
Patch 10 ~ 11 add some tests for link name and link netns.
BTW, please note that some issues were found in the current code:
- In amt_newlink() in drivers/net/amt.c:
amt->net = net;
...
amt->stream_dev = dev_get_by_index(net, ...
Uses net, but amt_lookup_upper_dev() only searches in dev_net.
So the AMT device may not be properly deleted if it's in a different
netns from lower dev.
- In gtp_newlink() in drivers/net/gtp.c:
gtp->net = src_net;
...
gn = net_generic(dev_net(dev), gtp_net_id);
list_add_rcu(&gtp->list, &gn->gtp_dev_list);
Uses src_net, but priv is linked to list in dev_net. So it may not be
properly deleted on removal of link netns.
- In pfcp_newlink() in drivers/net/pfcp.c:
pfcp->net = net;
...
pn = net_generic(dev_net(dev), pfcp_net_id);
list_add_rcu(&pfcp->list, &pn->pfcp_dev_list);
Same as above.
- In lowpan_newlink() in net/ieee802154/6lowpan/core.c:
wdev = dev_get_by_index(dev_net(ldev), nla_get_u32(tb[IFLA_LINK]));
Looks for IFLA_LINK in dev_net, but in theory the ifindex is defined
in link netns.
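The last item rests on the fact that an ifindex is only meaningful inside one namespace. A minimal sketch, assuming two namespaces that reuse the same index for unrelated devices; the tables and `lookup_ifindex` are hypothetical stand-ins, not kernel code.

```python
def lookup_ifindex(netns_table, ifindex):
    """Return the device name registered under `ifindex` in one namespace."""
    return netns_table.get(ifindex)


# Two namespaces that happen to reuse ifindex 3 for unrelated devices.
link_netns = {1: "lo", 3: "wpan0"}
dev_netns = {1: "lo", 3: "eth0"}

# Value passed by userspace in IFLA_LINK; it refers to the link netns.
ifla_link = 3

right = lookup_ifindex(link_netns, ifla_link)  # the device the user meant
wrong = lookup_ifindex(dev_netns, ifla_link)   # same index, different device
```

Looking the index up in the wrong table does not necessarily fail; it can silently return a different device, which is why the choice of namespace matters.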
---
v6:
- Split prototype, driver and rtnetlink changes.
- Add more tests for link netns.
- Fix IPv6 tunnel net overwritten in ndo_init().
- Reorder variable declarations.
- Exclude a ip_tunnel-specific patch.
v5:
link: https://lore.kernel.org/all/20241209140151.231257-1-shaw.leon@gmail.com/
- Fix function doc in batman-adv.
- Include peer_net in rtnl newlink parameters.
v4:
link: https://lore.kernel.org/all/20241118143244.1773-1-shaw.leon@gmail.com/
- Pack newlink() parameters to a single struct.
- Use ynl async_msg_queue.empty() in selftest.
v3:
link: https://lore.kernel.org/all/20241113125715.150201-1-shaw.leon@gmail.com/
- Drop "netns_atomic" flag and module parameter. Add netns parameter to
newlink() instead, and convert drivers accordingly.
- Move python NetNSEnter helper to net selftest lib.
v2:
link: https://lore.kernel.org/all/20241107133004.7469-1-shaw.leon@gmail.com/
- Check NLM_F_EXCL to ensure only link creation is affected.
- Add self tests for link name/ifindex conflict and notifications
in different netns.
- Changes in dummy driver and ynl in order to add the test case.
v1:
link: https://lore.kernel.org/all/20241023023146.372653-1-shaw.leon@gmail.com/
Xiao Liang (11):
rtnetlink: Lookup device in target netns when creating link
rtnetlink: Pack newlink() params into struct
net: Use link netns in newlink() of rtnl_link_ops
ieee802154: 6lowpan: Use link netns in newlink() of rtnl_link_ops
net: ip_tunnel: Use link netns in newlink() of rtnl_link_ops
net: ipv6: Use link netns in newlink() of rtnl_link_ops
net: xfrm: Use link netns in newlink() of rtnl_link_ops
rtnetlink: Remove "net" from newlink params
rtnetlink: Create link directly in target net namespace
selftests: net: Add python context manager for netns entering
selftests: net: Add test cases for link and peer netns
drivers/infiniband/ulp/ipoib/ipoib_netlink.c | 11 +-
drivers/net/amt.c | 16 +-
drivers/net/bareudp.c | 11 +-
drivers/net/bonding/bond_netlink.c | 8 +-
drivers/net/can/dev/netlink.c | 4 +-
drivers/net/can/vxcan.c | 9 +-
.../ethernet/qualcomm/rmnet/rmnet_config.c | 11 +-
drivers/net/geneve.c | 11 +-
drivers/net/gtp.c | 9 +-
drivers/net/ipvlan/ipvlan.h | 4 +-
drivers/net/ipvlan/ipvlan_main.c | 15 +-
drivers/net/ipvlan/ipvtap.c | 10 +-
drivers/net/macsec.c | 15 +-
drivers/net/macvlan.c | 8 +-
drivers/net/macvtap.c | 11 +-
drivers/net/netkit.c | 9 +-
drivers/net/pfcp.c | 11 +-
drivers/net/ppp/ppp_generic.c | 10 +-
drivers/net/team/team_core.c | 7 +-
drivers/net/veth.c | 9 +-
drivers/net/vrf.c | 11 +-
drivers/net/vxlan/vxlan_core.c | 11 +-
drivers/net/wireguard/device.c | 11 +-
drivers/net/wireless/virtual/virt_wifi.c | 14 +-
drivers/net/wwan/wwan_core.c | 25 ++-
include/net/ip_tunnels.h | 5 +-
include/net/rtnetlink.h | 44 +++++-
net/8021q/vlan_netlink.c | 15 +-
net/batman-adv/soft-interface.c | 16 +-
net/bridge/br_netlink.c | 12 +-
net/caif/chnl_net.c | 6 +-
net/core/rtnetlink.c | 35 +++--
net/hsr/hsr_netlink.c | 14 +-
net/ieee802154/6lowpan/core.c | 9 +-
net/ipv4/ip_gre.c | 27 ++--
net/ipv4/ip_tunnel.c | 10 +-
net/ipv4/ip_vti.c | 10 +-
net/ipv4/ipip.c | 14 +-
net/ipv6/ip6_gre.c | 42 ++++--
net/ipv6/ip6_tunnel.c | 20 ++-
net/ipv6/ip6_vti.c | 16 +-
net/ipv6/sit.c | 18 ++-
net/xfrm/xfrm_interface_core.c | 15 +-
tools/testing/selftests/net/Makefile | 1 +
.../testing/selftests/net/lib/py/__init__.py | 2 +-
tools/testing/selftests/net/lib/py/netns.py | 18 +++
tools/testing/selftests/net/link_netns.py | 142 ++++++++++++++++++
tools/testing/selftests/net/netns-name.sh | 10 ++
48 files changed, 546 insertions(+), 226 deletions(-)
create mode 100755 tools/testing/selftests/net/link_netns.py
--
2.47.1
If the <kunit/platform_device.h> header is included in a test without
certain other headers, it produces compiler warnings like:
In file included from [...]
../include/kunit/platform_device.h:15:57: warning: ‘struct completion’
declared inside parameter list will not be visible outside of this
definition or declaration
15 | struct completion *x);
| ^~~~~~~~~~
Add a 'struct completion' forward declaration to resolve this.
Signed-off-by: Brian Norris <briannorris(a)chromium.org>
---
I'm not bothering with a Fixes tag, since this only shows up with new
tests I'm writing.
include/kunit/platform_device.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/kunit/platform_device.h b/include/kunit/platform_device.h
index 0fc0999d2420..f8236a8536f7 100644
--- a/include/kunit/platform_device.h
+++ b/include/kunit/platform_device.h
@@ -2,6 +2,7 @@
#ifndef _KUNIT_PLATFORM_DRIVER_H
#define _KUNIT_PLATFORM_DRIVER_H
+struct completion;
struct kunit;
struct platform_device;
struct platform_driver;
--
2.47.1.613.gc27f4b7a9f-goog
Fix the way tcpdump is executed by:
- Using the right variable for the namespace. Currently the use of the
empty "ns" makes the command fail.
- Waiting until it starts to capture to ensure the interesting traffic
is caught on slow systems.
- Using line-buffered output to ensure logs are available when the test
is paused with "-p". Otherwise the last chunk of data might only be
written when tcpdump is killed.
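The second point, waiting until the capture has really started, amounts to polling the daemon's stderr for a marker line. A rough Python analogue of that idea, assuming the log path and marker are supplied by the caller; `wait_for_marker` is a hypothetical helper and not part of the selftest.

```python
import time


def wait_for_marker(path, marker, timeout=5.0, interval=0.1):
    """Poll a log file until `marker` shows up or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with open(path, "r", errors="replace") as log:
                if marker in log.read():
                    return True
        except FileNotFoundError:
            pass  # the daemon may not have created its log file yet
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller would start the capture in the background and proceed only once `wait_for_marker(stderr_path, "listening on any")` returns True, so that traffic generated right afterwards is not missed on slow systems.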
Fixes: 74cc26f416b9 ("selftests: openvswitch: add interface support")
Signed-off-by: Adrian Moreno <amorenoz(a)redhat.com>
---
tools/testing/selftests/net/openvswitch/openvswitch.sh | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/net/openvswitch/openvswitch.sh b/tools/testing/selftests/net/openvswitch/openvswitch.sh
index cc0bfae2bafa..960e1ab4dd04 100755
--- a/tools/testing/selftests/net/openvswitch/openvswitch.sh
+++ b/tools/testing/selftests/net/openvswitch/openvswitch.sh
@@ -171,8 +171,10 @@ ovs_add_netns_and_veths () {
ovs_add_if "$1" "$2" "$4" -u || return 1
fi
- [ $TRACING -eq 1 ] && ovs_netns_spawn_daemon "$1" "$ns" \
- tcpdump -i any -s 65535
+ if [ $TRACING -eq 1 ]; then
+ ovs_netns_spawn_daemon "$1" "$3" tcpdump -l -i any -s 6553
+ ovs_wait grep -q "listening on any" ${ovs_dir}/stderr
+ fi
return 0
}
--
2.47.1
On 2024-12-16 10:21, Christoph Schlameuss wrote:
> Prevent null pointer dereference when processing the
> KVM_DEV_FLIC_APF_ENABLE and KVM_DEV_FLIC_APF_DISABLE_WAIT ioctls in the
> interrupt controller.
>
> Fixes: 3c038e6be0e2 ("KVM: async_pf: Async page fault support on s390")
> Reported-by: Claudio Imbrenda <imbrenda(a)linux.ibm.com>
> Signed-off-by: Christoph Schlameuss <schlameuss(a)linux.ibm.com>
Tested-by: Hariharan Mari <hari55(a)linux.ibm.com>
> ---
> Documentation/virt/kvm/devices/s390_flic.rst | 4 ++++
> arch/s390/kvm/interrupt.c | 4 ++++
> 2 files changed, 8 insertions(+)
>
> diff --git a/Documentation/virt/kvm/devices/s390_flic.rst
> b/Documentation/virt/kvm/devices/s390_flic.rst
> index ea96559ba501..b784f8016748 100644
> --- a/Documentation/virt/kvm/devices/s390_flic.rst
> +++ b/Documentation/virt/kvm/devices/s390_flic.rst
> @@ -58,11 +58,15 @@ Groups:
> Enables async page faults for the guest. So in case of a major
> page fault
> the host is allowed to handle this async and continues the guest.
>
> + -EINVAL is returned when called on the FLIC of a ucontrol VM.
> +
> KVM_DEV_FLIC_APF_DISABLE_WAIT
> Disables async page faults for the guest and waits until already
> pending
> async page faults are done. This is necessary to trigger a
> completion interrupt
> for every init interrupt before migrating the interrupt list.
>
> + -EINVAL is returned when called on the FLIC of a ucontrol VM.
> +
> KVM_DEV_FLIC_ADAPTER_REGISTER
> Register an I/O adapter interrupt source. Takes a
> kvm_s390_io_adapter
> describing the adapter to register::
> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
> index ea8dce299954..22d73c13e555 100644
> --- a/arch/s390/kvm/interrupt.c
> +++ b/arch/s390/kvm/interrupt.c
> @@ -2678,9 +2678,13 @@ static int flic_set_attr(struct kvm_device
> *dev, struct kvm_device_attr *attr)
> kvm_s390_clear_float_irqs(dev->kvm);
> break;
> case KVM_DEV_FLIC_APF_ENABLE:
> + if (kvm_is_ucontrol(dev->kvm))
> + return -EINVAL;
> dev->kvm->arch.gmap->pfault_enabled = 1;
> break;
> case KVM_DEV_FLIC_APF_DISABLE_WAIT:
> + if (kvm_is_ucontrol(dev->kvm))
> + return -EINVAL;
> dev->kvm->arch.gmap->pfault_enabled = 0;
> /*
> * Make sure no async faults are in transition when
Introduce a new helper function, `sk_set_prio_allowed`,
to centralize the logic for validating priority settings.
Add support for the `SO_PRIORITY` control message,
enabling user-space applications to set socket priority
via control messages (cmsg).
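From user space, such a control message is ordinary ancillary data at level SOL_SOCKET. A minimal sketch, assuming a Linux target, a 32-bit unsigned payload, and a kernel that carries this series; `priority_cmsg` and `send_with_priority` are hypothetical helper names.

```python
import socket
import struct

# SO_PRIORITY is 12 in asm-generic/socket.h. Fall back to that value if this
# Python build does not export the constant (assumption: generic numbering).
SO_PRIORITY = getattr(socket, "SO_PRIORITY", 12)


def priority_cmsg(priority):
    """Build the ancillary-data list carrying one SO_PRIORITY value."""
    if not 0 <= priority <= 0xFFFFFFFF:
        raise ValueError("priority must fit in an unsigned 32-bit integer")
    # Native byte order, exactly four bytes.
    return [(socket.SOL_SOCKET, SO_PRIORITY, struct.pack("=I", priority))]


def send_with_priority(sock, payload, addr, priority):
    """Send one datagram whose priority is set per message, not per socket."""
    return sock.sendmsg([payload], priority_cmsg(priority), 0, addr)
```

Unlike setsockopt(), this leaves the socket's own priority untouched and affects only the message being sent. On kernels without the series the sendmsg() call is expected to fail, and values above 6 likely need CAP_NET_ADMIN or CAP_NET_RAW, as with the existing socket option.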
Patch Overview:
Patch 1/4: Introduce 'sk_set_prio_allowed' helper function.
Patch 2/4: Add support for setting SO_PRIORITY via control messages
Patch 3/4: Add test for SO_PRIORITY setting via control messages
Patch 4/4: Add new socket option, SO_RCVPRIORITY
v7:
- Carry Eric's and Willem's "Reviewed-by" tags from v3 to
patch 1/4 since that is resubmitted without changes.
- Carry Willem's "Reviewed-by" tag from v4 in patch 2/4,
as it is resubmitted without changes.
- Carry Willem's "Reviewed-by" tag from v5 in patch 4/4,
as it is resubmitted without changes.
- Carry Willem's "Reviewed-by" tag from v6
since it is resubmitted with minor cosmetic changes in
patch 3/4.
- Carry Willem's "Acked-by" tag from v5 on FILTER_COUNTER
(patch 3/4).
- Carry Ido's "Reviewed-by" and "Tested-by" tags from v6
since it is resubmitted with minor cosmetic changes in
patch 3/4.
- Align the code to the open parenthesis in cmsg_sender.c
(patch 3/4).
- Remove unnecessary blank line in cmsg_so_priority.sh
(patch 3/4).
- Remove unused delay variable from cmsg_so_priority.sh
(patch 3/4).
- Rebased on net-next.
v6:
https://lore.kernel.org/netdev/20241210191309.8681-1-annaemesenyiri@gmail.c…
- Carry Eric's and Willem's "Reviewed-by" tags from v3 to
patch 1/4 since that is resubmitted without changes.
- Carry Willem's "Reviewed-by" tag from v4 in patch 2/4,
as it is resubmitted without changes.
- Carry Willem's "Reviewed-by" tag from v5 in patch 4/4,
as it is resubmitted without changes.
- Use KSFT_SKIP in jq installation test and
add 'nodad' flag for IPv6 address in cmsg_so_priority.sh (patch 3/4).
- Rebased on net-next.
v5:
https://lore.kernel.org/netdev/20241205133112.17903-1-annaemesenyiri@gmail.…
- Carry Eric's and Willem's "Reviewed-by" tags from v3 to
patch 1/4 since that is resubmitted without changes.
- Carry Willem's "Reviewed-by" tag from v4 in patch 2/4,
as it is resubmitted without changes.
- Eliminate variable duplication, fix indentation, simplify cleanup,
verify dependencies, separate setsockopt and control message
priority testing, and modify namespace setup
in patch 3/4 cmsg_so_priority.sh.
- Add cmsg_so_priority.sh to tools/testing/selftests/net/Makefile.
- Remove the unused variable, rename priority_cmsg to priority,
and document the -P option in cmsg_sender.c in patch 3/4.
- New in v5: add new socket option, SO_RCVPRIORITY in patch 4/4.
- Rebased on net-next.
v4:
https://lore.kernel.org/netdev/20241118145147.56236-1-annaemesenyiri@gmail.…
- Carry Eric's and Willem's "Reviewed-by" tags from v3 to
patch 1/3 since that is resubmitted without changes.
- Updated description in patch 2/3.
- Missing ipc6.sockc.priority field added in ping_v6_sendmsg()
in patch 2/3.
- Update cmsg_so_priority.sh to test SO_PRIORITY sockopt and cmsg
setting with VLAN priority tagging in patch 3/3. (Ido Schimmel)
- Rebased on net-next.
v3:
https://lore.kernel.org/netdev/20241107132231.9271-1-annaemesenyiri@gmail.c…
- Updated cover letter text.
- Removed priority field from ipcm_cookie.
- Removed cork->tos value check from ip_setup_cork, so
cork->priority will now take its value from ipc->sockc.priority.
- Replaced ipc->priority with ipc->sockc.priority
in ip_cmsg_send().
- Modified the error handling for the SO_PRIORITY
case in __sock_cmsg_send().
- Added missing initialization for ipc6.sockc.priority.
- Introduced cmsg_so_priority.sh test script.
- Modified cmsg_sender.c to set priority via control message (cmsg).
- Rebased on net-next.
v2:
https://lore.kernel.org/netdev/20241102125136.5030-1-annaemesenyiri@gmail.c…
- Introduced sk_set_prio_allowed helper to check capability
for setting priority.
- Removed new fields and changed sockcm_cookie::priority
from char to u32 to align with sk_buff::priority.
- Moved the cork->tos value check for priority setting
from __ip_make_skb() to ip_setup_cork().
- Rebased on net-next.
v1:
https://lore.kernel.org/all/20241029144142.31382-1-annaemesenyiri@gmail.com/
Anna Emese Nyiri (4):
Introduce sk_set_prio_allowed helper function
support SO_PRIORITY cmsg
test SO_PRIORITY ancillary data with cmsg_sender
introduce SO_RCVPRIORITY socket option
arch/alpha/include/uapi/asm/socket.h | 2 +
arch/mips/include/uapi/asm/socket.h | 2 +
arch/parisc/include/uapi/asm/socket.h | 2 +
arch/sparc/include/uapi/asm/socket.h | 2 +
include/net/inet_sock.h | 2 +-
include/net/ip.h | 2 +-
include/net/sock.h | 8 +-
include/uapi/asm-generic/socket.h | 2 +
net/can/raw.c | 2 +-
net/core/sock.c | 26 ++-
net/ipv4/ip_output.c | 4 +-
net/ipv4/ip_sockglue.c | 2 +-
net/ipv4/raw.c | 2 +-
net/ipv6/ip6_output.c | 3 +-
net/ipv6/ping.c | 1 +
net/ipv6/raw.c | 3 +-
net/ipv6/udp.c | 1 +
net/packet/af_packet.c | 2 +-
net/socket.c | 11 ++
tools/include/uapi/asm-generic/socket.h | 2 +
tools/testing/selftests/net/Makefile | 1 +
tools/testing/selftests/net/cmsg_sender.c | 11 +-
.../testing/selftests/net/cmsg_so_priority.sh | 151 ++++++++++++++++++
23 files changed, 228 insertions(+), 16 deletions(-)
create mode 100755 tools/testing/selftests/net/cmsg_so_priority.sh
--
2.43.0
Currently, when we run the BPF selftests with the following command:
'make -C tools/testing/selftests TARGETS=bpf SKIP_TARGETS=""'
The command generates untracked files and directories:
'''
Untracked files:
(use "git add <file>..." to include in what will be committed)
tools/testing/selftests/bpfFEATURE-DUMP.selftests
tools/testing/selftests/bpffeature/
'''
The core reason is that our Makefile (tools/testing/selftests/bpf/Makefile)
was written like this:
'''
OUTPUT := $(OUTPUT)/
$(eval include ../../../build/Makefile.feature)
OUTPUT := $(patsubst %/,%,$(OUTPUT))
'''
Assigning OUTPUT this way has no effect when OUTPUT is passed on the
command line, which is how the parent makefile
(tools/testing/selftests/Makefile) invokes the sub-make:
'''
all:
...
$(MAKE) OUTPUT=$$BUILD_TARGET -C $$TARGET
'''
As stated in the GNU make documentation:
'''
An argument that contains '=' specifies the value of a variable: 'v=x'
sets the value of the variable v to x. If you specify a value in this way,
all ordinary assignments of the same variable in the makefile are ignored;
we say they have been overridden by the command line argument.
'''
Following the GNU make documentation, use the override directive to fix this:
'''
If you want to set the variable in the makefile even though it was set
with a command argument, you can use an override directive, which is a
line that looks like this:
override variable := value
'''
Link: https://www.gnu.org/software/make/manual/make.html#Override-Directive
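The precedence rule quoted from the GNU make manual can be imitated with a small Python model. This is only a sketch of the rule, not of make itself: `MakeVars` and `feature_dump_path` are hypothetical, and the way Makefile.feature concatenates `$(OUTPUT)` with the file name is an assumption inferred from the untracked path shown above.

```python
class MakeVars:
    """Toy model of GNU make precedence for command-line variables."""

    def __init__(self, cmdline=None):
        self.cmdline = dict(cmdline or {})
        self.values = dict(self.cmdline)

    def assign(self, name, value, override=False):
        # Ordinary assignments to a command-line variable are ignored;
        # only an `override` assignment takes effect.
        if name in self.cmdline and not override:
            return
        self.values[name] = value


def feature_dump_path(override):
    """Replay the three Makefile lines with or without `override`."""
    mk = MakeVars({"OUTPUT": "tools/testing/selftests/bpf"})
    # OUTPUT := $(OUTPUT)/
    mk.assign("OUTPUT", mk.values["OUTPUT"] + "/", override=override)
    # Assumed use inside the included file: $(OUTPUT)FEATURE-DUMP.selftests
    path = mk.values["OUTPUT"] + "FEATURE-DUMP.selftests"
    # OUTPUT := $(patsubst %/,%,$(OUTPUT))
    value = mk.values["OUTPUT"]
    stripped = value[:-1] if value.endswith("/") else value
    mk.assign("OUTPUT", stripped, override=override)
    return path, mk.values["OUTPUT"]
```

Without `override` the model yields the stray "bpfFEATURE-DUMP.selftests" name; with it, the file lands inside the bpf directory and OUTPUT is restored afterwards.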
Fixes: dc3a8804d790 ("selftests/bpf: Adapt OUTPUT appending logic to lower versions of Make")
Signed-off-by: Jiayuan Chen <mrpre(a)163.com>
---
tools/testing/selftests/bpf/Makefile | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 9e870e519c30..eb4d21651aa7 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -202,9 +202,9 @@ ifeq ($(shell expr $(MAKE_VERSION) \>= 4.4), 1)
$(let OUTPUT,$(OUTPUT)/,\
$(eval include ../../../build/Makefile.feature))
else
-OUTPUT := $(OUTPUT)/
+override OUTPUT := $(OUTPUT)/
$(eval include ../../../build/Makefile.feature)
-OUTPUT := $(patsubst %/,%,$(OUTPUT))
+override OUTPUT := $(patsubst %/,%,$(OUTPUT))
endif
endif
base-commit: a7c205120d339b6ad2557fe3f33fdf20394f1a0f
--
2.43.5
In commit 03c7527e97f7 ("KVM: arm64: Do not allow ID_AA64MMFR0_EL1.ASIDbits
to be overridden") we made that bitfield in the ID registers unwritable.
However, the change neglected to make the corresponding update to
set_id_regs, resulting in it failing:
# ok 56 ID_AA64MMFR0_EL1_BIGEND
# ==== Test Assertion Failure ====
# aarch64/set_id_regs.c:434: masks[idx] & ftr_bits[j].mask == ftr_bits[j].mask
# pid=5566 tid=5566 errno=22 - Invalid argument
# 1 0x00000000004034a7: test_vm_ftr_id_regs at set_id_regs.c:434
# 2 0x0000000000401b53: main at set_id_regs.c:684
# 3 0x0000ffff8e6b7543: ?? ??:0
# 4 0x0000ffff8e6b7617: ?? ??:0
# 5 0x0000000000401e6f: _start at ??:?
# 0 != 0xf0 (masks[idx] & ftr_bits[j].mask != ftr_bits[j].mask)
not ok 8 selftests: kvm: set_id_regs # exit=254
Remove ID_AA64MMFR0_EL1.ASIDBITS from the set of bitfields we test for
writeability.
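The failing assertion boils down to a mask containment check. A small sketch, assuming ASIDBits occupies bits [7:4] of the register (which matches the 0xf0 in the log); the helper names and the example writable mask are hypothetical.

```python
def field_mask(shift, width):
    """Mask covering `width` bits starting at bit `shift`."""
    return ((1 << width) - 1) << shift


# ASIDBits sits in bits [7:4], so its mask is 0xf0.
ASIDBITS_MASK = field_mask(4, 4)


def field_is_writable(writable_mask, mask):
    """A field counts as writable only if every one of its bits is writable."""
    return (writable_mask & mask) == mask


# Hypothetical writable mask with the ASIDBits field cleared.
example_writable = 0xFFFFFFFFFFFFFFFF & ~ASIDBITS_MASK
```

With the field cleared from the writable mask, `example_writable & ASIDBITS_MASK` is 0 rather than 0xf0, which is the "0 != 0xf0" reported by the test.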
Fixes: 03c7527e97f7 ("KVM: arm64: Do not allow ID_AA64MMFR0_EL1.ASIDbits to be overridden")
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
tools/testing/selftests/kvm/aarch64/set_id_regs.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/tools/testing/selftests/kvm/aarch64/set_id_regs.c b/tools/testing/selftests/kvm/aarch64/set_id_regs.c
index a79b7f18452d2ec336ae623b8aa5c9cf329b6b4e..3a97c160b5fec990aaf8dfaf100a907b913f057c 100644
--- a/tools/testing/selftests/kvm/aarch64/set_id_regs.c
+++ b/tools/testing/selftests/kvm/aarch64/set_id_regs.c
@@ -152,7 +152,6 @@ static const struct reg_ftr_bits ftr_id_aa64mmfr0_el1[] = {
REG_FTR_BITS(FTR_LOWER_SAFE, ID_AA64MMFR0_EL1, BIGENDEL0, 0),
REG_FTR_BITS(FTR_LOWER_SAFE, ID_AA64MMFR0_EL1, SNSMEM, 0),
REG_FTR_BITS(FTR_LOWER_SAFE, ID_AA64MMFR0_EL1, BIGEND, 0),
- REG_FTR_BITS(FTR_LOWER_SAFE, ID_AA64MMFR0_EL1, ASIDBITS, 0),
REG_FTR_BITS(FTR_LOWER_SAFE, ID_AA64MMFR0_EL1, PARANGE, 0),
REG_FTR_END,
};
---
base-commit: 78d4f34e2115b517bcbfe7ec0d018bbbb6f9b0b8
change-id: 20241216-kvm-arm64-fix-set-id-asidbits-9bede25b7ad3
Best regards,
--
Mark Brown <broonie(a)kernel.org>
Hi All,
This patch-set aims to improve precision of BPF_MUL and add testcases
to illustrate precision gains using signed and unsigned bounds.
Thanks for taking the time to review and specifically for Eduard's feedback!
Best,
Matan
Changes from v1:
- Fixed typo made in patch
Changes from v2:
- Added signed multiplication to BPF_MUL
- Added test cases to exercise BPF_MUL
- Reordered patches in the series.
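For readers unfamiliar with bounds tracking, the general idea behind multiplying signed and unsigned ranges can be sketched with plain interval arithmetic. This is textbook reasoning under the assumption of 64-bit registers, not a copy of the verifier code; both function names are hypothetical.

```python
U64_MAX = (1 << 64) - 1
S64_MIN = -(1 << 63)
S64_MAX = (1 << 63) - 1


def mul_unsigned_bounds(a_min, a_max, b_min, b_max):
    """Bounds of a*b for unsigned ranges, or 'unknown' if it may wrap."""
    lo, hi = a_min * b_min, a_max * b_max
    if hi > U64_MAX:
        return 0, U64_MAX  # possible wrap-around: nothing can be said
    return lo, hi


def mul_signed_bounds(a_min, a_max, b_min, b_max):
    """Bounds of a*b for signed ranges: extremes lie at the four corners."""
    products = [a_min * b_min, a_min * b_max, a_max * b_min, a_max * b_max]
    if any(p < S64_MIN or p > S64_MAX for p in products):
        return S64_MIN, S64_MAX  # a corner overflows: fall back to unknown
    return min(products), max(products)
```

For example, [2, 3] times [4, 5] gives unsigned bounds [8, 15], while the signed ranges [-2, 3] and [-4, 5] give [-12, 15] because the extremes come from mixed-sign corners.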
Matan Shachnai (2):
bpf, verifier: Improve precision of BPF_MUL
selftests/bpf: Add testcases for BPF_MUL
kernel/bpf/verifier.c | 72 +++++-----
.../selftests/bpf/progs/verifier_bounds.c | 134 ++++++++++++++++++
2 files changed, 166 insertions(+), 40 deletions(-)
--
2.25.1
If the selftest is not running as root, it should skip rather than fail
and give the user an appropriate warning. This patch calls
ksft_exit_skip() if the test is not running as root.
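The practical difference shows up in the exit status seen by the test runner. A simplified model, assuming the kselftest convention that exit code 4 means skip; the three functions are hypothetical and ignore the rest of the test body.

```python
KSFT_PASS = 0
KSFT_FAIL = 1
KSFT_SKIP = 4


def exit_status_before(euid):
    # Before: a skip result is recorded, but the following `return 1`
    # becomes the exit status of the process.
    return 1 if euid != 0 else KSFT_PASS  # root case simplified to "pass"


def exit_status_after(euid):
    # After: the test terminates right away with the skip status.
    return KSFT_SKIP if euid != 0 else KSFT_PASS  # root case simplified


def runner_verdict(status):
    """How a runner following the kselftest exit codes reads the status."""
    return {KSFT_PASS: "pass", KSFT_SKIP: "skip"}.get(status, "fail")
```

For a non-root user the old status reads as a failure to the runner even though the TAP line said SKIP, while the new status reads as a skip.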
Logs:
Before change:
TAP version 13
1..1
ok 1 # SKIP This test needs root to run!
After change:
TAP version 13
1..1
ok 2 # SKIP This test needs root to run!
Totals: pass:0 fail:0 xfail:0 xpass:0 skip:1 error:0
Signed-off-by: Shivam Chaudhary <cvam0000(a)gmail.com>
---
v1->v2 : Replace ksft_exit_fail_msg -> ksft_exit_skip
v1 : https://lore.kernel.org/all/20241115191721.621381-1-cvam0000@gmail.com/
tools/testing/selftests/acct/acct_syscall.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/acct/acct_syscall.c b/tools/testing/selftests/acct/acct_syscall.c
index e44e8fe1f4a3..87c044fb9293 100644
--- a/tools/testing/selftests/acct/acct_syscall.c
+++ b/tools/testing/selftests/acct/acct_syscall.c
@@ -24,7 +24,7 @@ int main(void)
// Check if test is run a root
if (geteuid()) {
- ksft_test_result_skip("This test needs root to run!\n");
+ ksft_exit_skip("This test needs root to run!\n");
return 1;
}
--
2.34.1
When adapting the test to the kselftest framework, a few printf() calls
indicating test progress were not updated.
Fix this by replacing these printf() calls with ksft_print_msg() calls.
Fixes: ce7d101750ff8450 ("selftests: timers: clocksource-switch: adapt to kselftest framework")
Signed-off-by: Geert Uytterhoeven <geert+renesas(a)glider.be>
Reviewed-by: Thomas Gleixner <tglx(a)linutronix.de>
---
v2:
- Add Reviewed-by.
When just running the test, the output looks like:
# Validating clocksource arch_sys_counter
TAP version 13
1..12
ok 1 CLOCK_REALTIME
...
# Validating clocksource ffca0000.timer
TAP version 13
1..12
ok 1 CLOCK_REALTIME
...
When redirecting the test output to a file, the progress prints are not
interspersed with the test output, but collated at the end:
TAP version 13
1..12
ok 1 CLOCK_REALTIME
...
TAP version 13
1..12
ok 1 CLOCK_REALTIME
...
# Totals: pass:6 fail:0 xfail:0 xpass:0 skip:6 error:0
# Validating clocksource arch_sys_counter
# Validating clocksource ffca0000.timer
...
This makes it hard to match the test results with the timer under test.
Is there a way to fix this? The test does use fork().
Thanks!
---
tools/testing/selftests/timers/clocksource-switch.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/timers/clocksource-switch.c b/tools/testing/selftests/timers/clocksource-switch.c
index c5264594064c8516..83faa4e354e389c2 100644
--- a/tools/testing/selftests/timers/clocksource-switch.c
+++ b/tools/testing/selftests/timers/clocksource-switch.c
@@ -156,8 +156,8 @@ int main(int argc, char **argv)
/* Check everything is sane before we start switching asynchronously */
if (do_sanity_check) {
for (i = 0; i < count; i++) {
- printf("Validating clocksource %s\n",
- clocksource_list[i]);
+ ksft_print_msg("Validating clocksource %s\n",
+ clocksource_list[i]);
if (change_clocksource(clocksource_list[i])) {
status = -1;
goto out;
@@ -169,7 +169,7 @@ int main(int argc, char **argv)
}
}
- printf("Running Asynchronous Switching Tests...\n");
+ ksft_print_msg("Running Asynchronous Switching Tests...\n");
pid = fork();
if (!pid)
return run_tests(runtime);
--
2.34.1
Hi all,
This patch series continues the work to migrate the script tests into
prog_tests.
test_xdp_meta.sh uses the BPF programs defined in progs/test_xdp_meta.c
to do a simple XDP/TC functional test that checks the metadata
allocation performed by the bpf_xdp_adjust_meta() helper.
This is already partly covered by two tests under prog_tests/:
- xdp_context_test_run.c uses bpf_prog_test_run_opts() to verify the
validity of the xdp_md context after a call to bpf_xdp_adjust_meta()
- xdp_metadata.c ensures that these meta-data can be exchanged through
an AF_XDP socket.
However test_xdp_meta.sh also verifies that the meta-data initialized
in the struct xdp_md is forwarded to the struct __sk_buff used by BPF
programs at 'TC level'. To cover this, I add a test case in
xdp_context_test_run.c that uses the same BPF programs from
progs/test_xdp_meta.c.
---
Changes in v2:
- Add missing close_netns()
- Use a unique 'close' label
- Link to v1: https://lore.kernel.org/r/20241206-xdp_meta-v1-0-5c150618f6e9@bootlin.com
---
Bastien Curutchet (2):
selftests/bpf: test_xdp_meta: Rename BPF sections
selftests/bpf: Migrate test_xdp_meta.sh into xdp_context_test_run.c
tools/testing/selftests/bpf/Makefile | 1 -
.../bpf/prog_tests/xdp_context_test_run.c | 87 ++++++++++++++++++++++
tools/testing/selftests/bpf/progs/test_xdp_meta.c | 4 +-
tools/testing/selftests/bpf/test_xdp_meta.sh | 58 ---------------
4 files changed, 89 insertions(+), 61 deletions(-)
---
base-commit: 0c30734c4f35c4784d3d3ca1bb89d9779045878c
change-id: 20241203-xdp_meta-868307cd0e03
Best regards,
--
Bastien Curutchet (eBPF Foundation) <bastien.curutchet(a)bootlin.com>
Changes v7:
- Include fallthrough in resctrlfs.c.
- Check fp after opening empty cpus file.
- Correct a comment and merge strings in snprintf().
Changes v6:
- Rebase onto latest kselftest-next.
- After looking at the two patches with a fresh eye, decided to split them
along the lines of:
- Patch 1/2 contains all of the code that relates to SNC mode
detection and checking that detection's reliability.
- Patch 2/2 contains checking kernel support for SNC and
modifying the messages at the end of affected tests.
Changes v5:
- Tests are skipped if snc_unreliable was set.
- Moved resctrlfs.c changes from patch 2/2 to 1/2.
- Removed CAT changes since it's not impacted by SNC in the selftest.
- Updated various comments.
- Fixed a bunch of minor issues pointed out in the review.
Changes v4:
- Printing SNC warnings at the start of every test.
- Printing SNC warnings at the end of every relevant test.
- Remove global snc_mode variable, consolidate snc detection functions
into one.
- Correct minor mistakes.
Changes v3:
- Reworked patch 2.
- Changed minor things in patch 1 like function name and made
corrections to the patch message.
Changes v2:
- Removed patches 2 and 3 since now this part will be supported by the
kernel.
Sub-NUMA Clustering (SNC) allows splitting CPU cores, caches and memory
into multiple NUMA nodes. When enabled, NUMA-aware applications can
achieve better performance on bigger server platforms.
SNC support in the kernel was merged into x86/cache [1]. With SNC enabled
and kernel support in place all the tests will function normally (aside
from effective cache size). There might be a problem when SNC is enabled
but the system is still using an older kernel version without SNC
support. Currently the only message displayed in that situation is a
guess that SNC might be enabled and is causing issues. That message is
also displayed whenever the test fails on an Intel platform.
Add a mechanism to discover kernel support for SNC which will add more
meaning and certainty to the error message.
Add runtime SNC mode detection and verify how reliable that information
is.
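One plausible way to detect the SNC mode at runtime is to compare how many CPUs share an L3 cache with how many CPUs belong to one NUMA node. The sketch below assumes that approach, which the cover letter does not spell out, and treats any offline CPU as making the result unreliable; all function names are hypothetical, and the inputs are sysfs-style CPU list strings passed in by the caller.

```python
def count_cpus(cpulist):
    """Count CPUs in a sysfs-style list such as "0-3,8-11"."""
    total = 0
    for part in cpulist.strip().split(","):
        if not part:
            continue  # empty list, e.g. no offline CPUs
        first, _, last = part.partition("-")
        total += int(last or first) - int(first) + 1
    return total


def snc_nodes_per_l3(node_cpulist, l3_shared_cpulist, offline_cpulist=""):
    """Return (nodes per L3 cache, whether the estimate looks reliable)."""
    node_cpus = count_cpus(node_cpulist)
    cache_cpus = count_cpus(l3_shared_cpulist)
    if node_cpus == 0 or cache_cpus == 0:
        return 1, False
    # Assumption: offline CPUs can skew either count, so flag the result.
    reliable = count_cpus(offline_cpulist) == 0
    return max(1, cache_cpus // node_cpus), reliable


def effective_l3_size(l3_bytes, snc_nodes):
    """Portion of the L3 cache attributed to one SNC node."""
    return l3_bytes // snc_nodes
```

With 16 CPUs in node 0 and 32 CPUs sharing the L3 cache, the sketch reports two nodes per cache, and a test sizing its buffer against the L3 would use half of the reported cache size.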
Series was tested on Ice Lake server platforms with SNC disabled, SNC-2
and SNC-4. The tests were also run with and without kernel support for
SNC.
Series applies cleanly on kselftest/next.
[1] https://lore.kernel.org/all/20240628215619.76401-1-tony.luck@intel.com/
Previous versions of this series:
[v1] https://lore.kernel.org/all/cover.1709721159.git.maciej.wieczor-retman@inte…
[v2] https://lore.kernel.org/all/cover.1715769576.git.maciej.wieczor-retman@inte…
[v3] https://lore.kernel.org/all/cover.1719842207.git.maciej.wieczor-retman@inte…
[v4] https://lore.kernel.org/all/cover.1720774981.git.maciej.wieczor-retman@inte…
[v5] https://lore.kernel.org/all/cover.1730206468.git.maciej.wieczor-retman@inte…
[v6] https://lore.kernel.org/all/cover.1733136454.git.maciej.wieczor-retman@inte…
Maciej Wieczor-Retman (2):
selftests/resctrl: Adjust effective L3 cache size with SNC enabled
selftests/resctrl: Discover SNC kernel support and adjust messages
tools/testing/selftests/resctrl/Makefile | 3 +-
tools/testing/selftests/resctrl/cmt_test.c | 4 +-
tools/testing/selftests/resctrl/mba_test.c | 2 +
tools/testing/selftests/resctrl/mbm_test.c | 4 +-
tools/testing/selftests/resctrl/resctrl.h | 6 +
.../testing/selftests/resctrl/resctrl_tests.c | 9 +-
tools/testing/selftests/resctrl/resctrlfs.c | 137 ++++++++++++++++++
7 files changed, 159 insertions(+), 6 deletions(-)
--
2.47.1
This adds support for receiving KeyUpdate messages (RFC 8446, 4.6.3
[1]). A sender transmits a KeyUpdate message and then changes its TX
key. The receiver should react by updating its RX key before
processing the next message.
This patchset implements key updates by:
1. pausing decryption when a KeyUpdate message is received, to avoid
attempting to use the old key to decrypt a record encrypted with
the new key
2. returning -EKEYEXPIRED to syscalls that cannot receive the
KeyUpdate message, until the rekey has been performed by userspace
3. passing the KeyUpdate message to userspace as a control message
4. allowing updates of the crypto_info via the TLS_TX/TLS_RX
setsockopts
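In step 4 the new keys come from userspace, which derives them as specified in RFC 8446 section 7.2. A sketch of that derivation using only the standard library, assuming a SHA-256 based cipher suite by default; it illustrates what a TLS library would compute before calling setsockopt() and is not part of the kernel patches.

```python
import hashlib
import hmac


def hkdf_expand(prk, info, length, hash_name="sha256"):
    """HKDF-Expand as defined in RFC 5869."""
    out, block, counter = b"", b"", 1
    while len(out) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hash_name).digest()
        out += block
        counter += 1
    return out[:length]


def hkdf_expand_label(secret, label, context, length, hash_name="sha256"):
    """HKDF-Expand-Label from RFC 8446, section 7.1."""
    full_label = b"tls13 " + label
    hkdf_label = (
        length.to_bytes(2, "big")
        + bytes([len(full_label)]) + full_label
        + bytes([len(context)]) + context
    )
    return hkdf_expand(secret, hkdf_label, length, hash_name)


def next_traffic_secret(current, hash_name="sha256"):
    """application_traffic_secret_N+1 derived from secret N ("traffic upd")."""
    size = hashlib.new(hash_name).digest_size
    return hkdf_expand_label(current, b"traffic upd", b"", size, hash_name)
```

The record key and IV installed through TLS_TX/TLS_RX would then be derived from the new secret with the "key" and "iv" labels of the same function.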
This API has been tested with gnutls to make sure that it allows
userspace libraries to implement key updates [2]. Thanks to Frantisek
Krenzelok <fkrenzel(a)redhat.com> for providing the implementation in
gnutls and testing the kernel patches.
=======================================================================
Discussions around v2 of this patchset focused on how HW offload would
interact with rekey.
RX
- The existing SW path will handle all records between the KeyUpdate
message signaling the change of key and the new key becoming known
to the kernel -- those will be queued encrypted, and decrypted in
SW as they are read by userspace (once the key is provided, ie same
as this patchset)
- Call ->tls_dev_del + ->tls_dev_add immediately during
setsockopt(TLS_RX)
TX
- After setsockopt(TLS_TX), switch to the existing SW path (not the
current device_fallback) until we're able to re-enable HW offload
- tls_device_sendmsg will call into tls_sw_sendmsg under lock_sock
to avoid changing socket ops during the rekey while another
thread might be waiting on the lock
- We only re-enable HW offload (call ->tls_dev_add to install the new
key in HW) once all records sent with the old key have been
ACKed. At this point, all unacked records are SW-encrypted with the
new key, and the old key is unused by both HW and retransmissions.
- If there are no unacked records when userspace does
setsockopt(TLS_TX), we can (try to) install the new key in HW
immediately.
- If yet another key has been provided via setsockopt(TLS_TX), we
don't install intermediate keys, only the latest.
- TCP notifies ktls of ACKs via the icsk_clean_acked callback. In
case of a rekey, tls_icsk_clean_acked will record when all data
sent with the most recent past key has been sent. The next call
to sendmsg will install the new key in HW.
- We close and push the current SW record before reenabling
offload.
If ->tls_dev_add fails to install the new key in HW, we stay in SW
mode. We can add a counter to keep track of this.
In addition:
Because we can't change socket ops during a rekey, we'll also have to
modify do_tls_setsockopt_conf to check ctx->tx_conf and only call
either tls_set_device_offload or tls_set_sw_offload. RX already uses
the same ops for both TLS_HW and TLS_SW, so we could switch between HW
and SW mode on rekey.
An alternative would be to have a common sendmsg which locks
the socket and then calls the correct implementation. We'll need that
anyway for the offload under rekey case, so that would only add a test
to the SW path's ops (compared to the current code). That should allow
us to simplify build_protos a bit, but might have a performance
impact - we'll need to check it if we want to go that route.
=======================================================================
Changes since v4:
- add counter for received KeyUpdate messages
- improve wording in the documentation
- improve handling of bogus messages when looking for KeyUpdate's
- some coding style clean ups
Changes since v3:
- rebase on top of net-next
- rework tls_check_pending_rekey according to Jakub's feedback
- add statistics for rekey: {RX,TX}REKEY{OK,ERROR}
- some coding style clean ups
Link: https://lore.kernel.org/netdev/cover.1731597571.git.sd@queasysnail.net/ [v4]
Link: https://lore.kernel.org/netdev/cover.1691584074.git.sd@queasysnail.net/ [v3]
Link: https://lore.kernel.org/netdev/cover.1676052788.git.sd@queasysnail.net/ [v2]
Link: https://lore.kernel.org/netdev/cover.1673952268.git.sd@queasysnail.net/ [v1]
Link: https://www.rfc-editor.org/rfc/rfc8446#section-4.6.3 [1]
Link: https://gitlab.com/gnutls/gnutls/-/merge_requests/1625 [2]
Sabrina Dubroca (6):
tls: block decryption when a rekey is pending
tls: implement rekey for TLS1.3
tls: add counters for rekey
docs: tls: document TLS1.3 key updates
selftests: tls: add key_generation argument to tls_crypto_info_init
selftests: tls: add rekey tests
Documentation/networking/tls.rst | 36 +++
include/net/tls.h | 3 +
include/uapi/linux/snmp.h | 5 +
net/tls/tls.h | 3 +-
net/tls/tls_device.c | 2 +-
net/tls/tls_main.c | 71 ++++-
net/tls/tls_proc.c | 5 +
net/tls/tls_sw.c | 140 ++++++---
tools/testing/selftests/net/tls.c | 478 +++++++++++++++++++++++++++++-
9 files changed, 682 insertions(+), 61 deletions(-)
--
2.47.1
This test already catches a netlink bug fixed by this series,
but only when running on HW with many queues. Make sure the
netdevsim instance created has a lot of queues, and constrain
the size of the recv_buffer used by netlink.
While at it test both rx and tx queues.
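The sysfs side of the comparison can be tried without a real device. The sketch below is an assumption-laden stand-in for the test helper: `count_queues` takes a `sysfs_root` argument (not present in queues.py, which reads /sys/class/net directly) so that it can be pointed at a fake directory tree built by the hypothetical `make_fake_sysfs`.

```python
import glob
import os
import tempfile


def count_queues(sysfs_root, ifname, qtype="rx"):
    """Count queue directories of one type under <root>/<ifname>/queues/."""
    pattern = os.path.join(sysfs_root, ifname, "queues", f"{qtype}-*")
    return len(glob.glob(pattern))


def make_fake_sysfs(root, ifname, n_rx, n_tx):
    """Create an rx-N / tx-N directory layout purely for demonstration."""
    for qtype, count in (("rx", n_rx), ("tx", n_tx)):
        for i in range(count):
            os.makedirs(os.path.join(root, ifname, "queues", f"{qtype}-{i}"))


with tempfile.TemporaryDirectory() as root:
    make_fake_sysfs(root, "eth0", n_rx=4, n_tx=2)
    rx_count = count_queues(root, "eth0", "rx")
    tx_count = count_queues(root, "eth0", "tx")
```

Parameterising on the queue type is what lets a single helper cover both rx and tx.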
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
---
CC: shuah(a)kernel.org
CC: linux-kselftest(a)vger.kernel.org
---
tools/testing/selftests/drivers/net/queues.py | 23 +++++++++++--------
1 file changed, 13 insertions(+), 10 deletions(-)
diff --git a/tools/testing/selftests/drivers/net/queues.py b/tools/testing/selftests/drivers/net/queues.py
index 30f29096e27c..9c5473abbd78 100755
--- a/tools/testing/selftests/drivers/net/queues.py
+++ b/tools/testing/selftests/drivers/net/queues.py
@@ -8,25 +8,28 @@ from lib.py import cmd
 import glob
-def sys_get_queues(ifname) -> int:
-    folders = glob.glob(f'/sys/class/net/{ifname}/queues/rx-*')
+def sys_get_queues(ifname, qtype='rx') -> int:
+    folders = glob.glob(f'/sys/class/net/{ifname}/queues/{qtype}-*')
     return len(folders)
-def nl_get_queues(cfg, nl):
+def nl_get_queues(cfg, nl, qtype='rx'):
     queues = nl.queue_get({'ifindex': cfg.ifindex}, dump=True)
     if queues:
-        return len([q for q in queues if q['type'] == 'rx'])
+        return len([q for q in queues if q['type'] == qtype])
     return None
 def get_queues(cfg, nl) -> None:
-    queues = nl_get_queues(cfg, nl)
-    if not queues:
-        raise KsftSkipEx('queue-get not supported by device')
+    snl = NetdevFamily(recv_size=4096)
-    expected = sys_get_queues(cfg.dev['ifname'])
-    ksft_eq(queues, expected)
+    for qtype in ['rx', 'tx']:
+        queues = nl_get_queues(cfg, snl, qtype)
+        if not queues:
+            raise KsftSkipEx('queue-get not supported by device')
+
+        expected = sys_get_queues(cfg.dev['ifname'], qtype)
+        ksft_eq(queues, expected)
 def addremove_queues(cfg, nl) -> None:
@@ -57,7 +60,7 @@ import glob
 def main() -> None:
-    with NetDrvEnv(__file__, queue_count=3) as cfg:
+    with NetDrvEnv(__file__, queue_count=100) as cfg:
         ksft_run([get_queues, addremove_queues], args=(cfg, NetdevFamily()))
     ksft_exit()
--
2.47.1
As a part of the effort to start running kvm selftests nested, this patch
series contains several fixes to the dirty_log_test, which allows this test
to run nested very well.
I also included a mostly nop change to KVM, to reverse the order in which
the PML log is read to align more closely to the hardware. It should
not affect regular users of the dirty logging but it fixes a unit test
specific assumption in the dirty_log_test dirty-ring mode.
Patch 4 fixes a very rare problem, which is hard to reproduce with standard
test parameters, but due to some weird timing issue, it
actually happened a few times on my machine which prompted me to investigate
it.
The issue can be reproduced well by running the test nested
(without patch 4 applied) with a very short iteration time and with a
few iterations in a loop like this:
while ./dirty_log_test -i 10 -I 1 -M dirty-ring ; do true ; done
Or even better, it's possible to manually patch the test to not wait at all
(effectively setting iteration time to 0), then it fails pretty fast.
Best regards,
Maxim Levitsky
Maxim Levitsky (4):
KVM: VMX: read the PML log in the same order as it was written
KVM: selftests: dirty_log_test: Limit s390x workaround to s390x
KVM: selftests: dirty_log_test: run the guest until some dirty ring
entries were harvested
KVM: selftests: dirty_log_test: support multiple write retires
arch/x86/kvm/vmx/vmx.c | 32 +++++---
arch/x86/kvm/vmx/vmx.h | 1 +
tools/testing/selftests/kvm/dirty_log_test.c | 79 +++++++++++++++++---
3 files changed, 91 insertions(+), 21 deletions(-)
--
2.26.3
This series:
1. makes the behavior of of_find_device_by_node(),
bus_find_device_by_of_node(), bus_find_device_by_fwnode(), etc., more
consistent when provided with a NULL node/handle;
2. adds kunit tests to validate the new NULL-argument behavior; and
3. makes some related improvements and refactoring for the drivers/base/
kunit tests.
This series aims to prevent problems like the ones resolved in commit
5c8418cf4025 ("PCI/pwrctrl: Unregister platform device only if one
actually exists").
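The kind of surprise a NULL lookup key can cause is easy to model outside the kernel. The following is a sketch only: `Device`, `match_of_node_naive`, `match_of_node_strict` and `find_device` are hypothetical Python stand-ins, with `None` playing the role of a NULL node, and they are not the driver-core implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Device:
    name: str
    of_node: Optional[object] = None   # None models a NULL of_node


def match_of_node_naive(dev, node):
    # Plain comparison: a None key "matches" every device without a node.
    return dev.of_node is node


def match_of_node_strict(dev, node):
    # Reject a None key up front, so a lookup with no node finds nothing.
    return node is not None and dev.of_node is node


def find_device(devices, node, match):
    """Return the first device accepted by match(), or None."""
    return next((d for d in devices if match(d, node)), None)


node_a = object()
devices = [Device("no-node"), Device("with-node", node_a)]
```

With the naive match, `find_device(devices, None, ...)` hands back an unrelated device that simply has no node; with the strict match it returns nothing, which is the consistent behaviour the series aims for.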
Changes in v2:
* Add Rob's Reviewed-by
* CC LKML (oops!)
* Keep "devm" and "match" tests in separate suites
Brian Norris (3):
drivers: base: Don't match devices with NULL of_node/fwnode/etc
drivers: base: test: Enable device model tests with KUNIT_ALL_TESTS
drivers: base: test: Add ...find_device_by...(... NULL) tests
drivers/base/core.c | 8 ++---
drivers/base/test/Kconfig | 1 +
drivers/base/test/platform-device-test.c | 42 +++++++++++++++++++++++-
3 files changed, 46 insertions(+), 5 deletions(-)
--
2.47.0.338.g60cca15819-goog
Currently, a guest MMIO access during vectoring is handled differently on
VMX and SVM: on VMX, KVM returns an internal error, while SVM goes into an
infinite loop trying to deliver the event again and again.
This patch series eliminates this difference by returning a KVM internal
error when the guest performs MMIO during vectoring on both VMX and SVM.
Also, introduce a selftest test case which covers the error handling
mentioned above.
V1 -> V2:
- Make commit messages more brief, avoid using pronouns
- Extract SVM error handling into a separate commit
- Introduce a new X86EMUL_ return type and detect the unhandleable
vectoring error in vendor-specific check_emulate_instruction instead of
handling it in the common MMU code (which is specific for cached MMIO)
Ivan Orlov (6):
KVM: x86: Add function for vectoring error generation
KVM: x86: Add emulation status for vectoring during MMIO
KVM: VMX: Handle vectoring error in check_emulate_instruction
KVM: SVM: Handle MMIO during vectroing error
selftests: KVM: extract lidt into helper function
selftests: KVM: Add test case for MMIO during vectoring
arch/x86/include/asm/kvm_host.h | 12 ++++-
arch/x86/kvm/kvm_emulate.h | 2 +
arch/x86/kvm/svm/svm.c | 9 +++-
arch/x86/kvm/vmx/vmx.c | 33 +++++-------
arch/x86/kvm/x86.c | 27 ++++++++++
.../selftests/kvm/include/x86_64/processor.h | 7 +++
.../selftests/kvm/set_memory_region_test.c | 53 ++++++++++++++++++-
.../selftests/kvm/x86_64/sev_smoke_test.c | 2 +-
8 files changed, 119 insertions(+), 26 deletions(-)
--
2.43.0
This patchset moves the task_mm_cid_work to a preemptible and migratable
context. This reduces the impact of this task on the scheduling latency
of real-time tasks.
The change makes the recurrence of the task a bit more predictable.
We also add optimisations and fixes to make sure the task_mm_cid_work
works as intended.
Patch 1 contains the main changes, removing the task_work on the
scheduler tick and using a delayed_work instead.
Patch 2 adds some optimisations to the approach. Since we rely on a
delayed_work, it is no longer required to check that the minimum
interval has passed since the last execution. We do, however, terminate
the call immediately if we see that no mm_cid is actually active, which
could happen for processes that sleep for a long time or that exited but
whose mm has not been freed yet.
Patch 3 allows the mm_cids to be actually compacted when a process
reduces its number of threads. This was not the case before, since the
same mm_cids were reused to improve cache locality; more details in [1].
Patch 4 adds a selftest to validate the functionality of the
task_mm_cid_work (i.e. to compact the mm_cids). This test requires patch
3 to be applied.
Changes since V1 [1]:
* Re-arm the delayed_work at each invocation
* Cancel the work synchronously at mmdrop
* Remove next scan fields and completely rely on the delayed_work
* Shrink mm_cid allocation with nr thread/affinity (Mathieu Desnoyers)
* Add self test
OVERHEAD COMPARISON
In this section, I'm going to refer to head as the current state
upstream without my patch applied, patch is the same head with these
patches applied. Likewise, I'm going to refer to task_mm_cid_work as
either the task or the function. The experiments are run on an aarch64
machine with 128 cores. The kernel has a bare configuration with
PREEMPT_RT enabled.
- Memory
The patch introduces some memory overhead:
* head uses a callback_head per thread (16 bytes)
* patch relies on a delayed work per mm but drops a long (80 bytes net)
Tasks with 5 threads or less have lower memory footprint with the
current approach.
Considering a task_struct can be 7-13 kB and an mm_struct is about 1.4
kB, the overhead should be acceptable.
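The break-even point follows directly from the two figures above. A small worked example, assuming the 16 bytes per thread and the 80 bytes net per mm are the only differences between the two approaches (the function names are made up for the illustration):

```python
CALLBACK_HEAD_BYTES = 16   # per thread, on head (figure from the text)
PATCH_NET_BYTES = 80       # per mm, with the patch (figure from the text)


def overhead_head(nr_threads):
    """Bytes used by head: one callback_head for every thread."""
    return CALLBACK_HEAD_BYTES * nr_threads


def overhead_patch(nr_threads):
    """Bytes used by the patch: one delayed work per mm, whatever the thread count."""
    return PATCH_NET_BYTES


for n in (1, 4, 5, 6, 128):
    print(f"{n:>4} threads: head={overhead_head(n):>5} B  patch={overhead_patch(n):>3} B")
```

Under these assumptions the two are equal at exactly 5 threads, head is smaller below that, and the patch is smaller above it.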
- Boot time
I tested the patch by booting a virtual machine with vng[2]; both head and
patch get similar boot times (around 8s).
- Runtime
I ran some rather demanding tests to show what could possibly be a worst
case for the approach introduced by this patch. The following tests were
again run in vng to have a plain system, running mostly the stressors
(if any). Unless otherwise specified, time is in us. All tests run for
30s.
The stress-ng tests were run with 128 stressors; I omit that from the
tables for clarity.
No load                          head            patch
running processes(threads):      12(12)          12(12)
duration(avg,max,sum):           75,426,987      2,42,45ms
invocations:                     13              20k

stress-ng --cpu-load 80          head            patch
running processes(threads):      129(129)        129(129)
duration(avg,max,sum):           20,2ms,740ms    7,774,280ms
invocations:                     36k             39k

stress-ng --fork                 head            patch
running processes(threads):      3.6k(3.6k)      4k(4k)
duration(avg,max,sum):           34,41,720       19,457,880ms
invocations:                     21              46k

stress-ng --pthread-max 4        head            patch
running processes(threads):      129(4k)         129(4k)
duration(avg,max,sum):           31,195,41ms     21,1ms,830ms
invocations:                     1290            38k
It is important to note that some of those stressors run for a very
short period of time, just to fork/create a thread. This heavily favours
head, since the task simply won't run as often.
Moreover, the duration needs to be read carefully, since the task can now
be preempted by threads. I tried to exclude that from the computation,
but to keep the probes simple, I didn't exclude interference caused by
interrupts.
On the same system while isolated, the task runs in about 30-35ms. It is
hence highly likely that much larger values are only due to
interruptions, rather than the function actually running that long.
I will post another email with the scripts used to retrieve the data and
more details about the runtime distribution.
[1] - https://lore.kernel.org/linux-kernel/20241205083110.180134-2-gmonaco@redhat…
[2] - https://github.com/arighi/virtme-ng
Gabriele Monaco (3):
sched: Move task_mm_cid_work to mm delayed work
sched: Remove mm_cid_next_scan as obsolete
rseq/selftests: Add test for mm_cid compaction
Mathieu Desnoyers (1):
sched: Compact RSEQ concurrency IDs with reduced threads and affinity
include/linux/mm_types.h | 23 ++-
include/linux/sched.h | 1 -
kernel/sched/core.c | 66 +-------
kernel/sched/sched.h | 32 ++--
tools/testing/selftests/rseq/.gitignore | 1 +
tools/testing/selftests/rseq/Makefile | 2 +-
.../selftests/rseq/mm_cid_compaction_test.c | 157 ++++++++++++++++++
7 files changed, 203 insertions(+), 79 deletions(-)
create mode 100644 tools/testing/selftests/rseq/mm_cid_compaction_test.c
base-commit: 231825b2e1ff6ba799c5eaf396d3ab2354e37c6b
--
2.47.1
The word 'accross' is wrong, so fix it.
Signed-off-by: Zhu Jun <zhujun2(a)cmss.chinamobile.com>
---
tools/testing/selftests/powerpc/vphn/test-vphn.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/powerpc/vphn/test-vphn.c b/tools/testing/selftests/powerpc/vphn/test-vphn.c
index 81d3069ff..f348f5491 100644
--- a/tools/testing/selftests/powerpc/vphn/test-vphn.c
+++ b/tools/testing/selftests/powerpc/vphn/test-vphn.c
@@ -275,7 +275,7 @@ static struct test {
}
},
{
- /* Parse a 32-bit value split accross two consecutives 64-bit
+ /* Parse a 32-bit value split across two consecutives 64-bit
* input values.
*/
"vphn: 16-bit value followed by 2 x 32-bit values",
--
2.17.1
This patch set converts iptables to nftables for wireguard testing, as
iptables is deprecated and nftables is the default framework of most releases.
v3: drop iptables directly (Jason A. Donenfeld)
Also convert to using nft for qemu testing (Jason A. Donenfeld)
v2: use one nft table for testing (Phil Sutter)
Hangbin Liu (2):
selftests: wireguards: convert iptables to nft
selftests: wireguard: update to using nft for qemu test
tools/testing/selftests/wireguard/netns.sh | 29 +++++++++-----
.../testing/selftests/wireguard/qemu/Makefile | 40 ++++++++++++++-----
.../selftests/wireguard/qemu/kernel.config | 7 ++--
3 files changed, 53 insertions(+), 23 deletions(-)
--
2.39.5 (Apple Git-154)
As the part-3 of the vIOMMU infrastructure, this series introduces a vIRQ
object. The existing FAULT object provides a nice notification pathway to
the user space already, so let vIRQ reuse the infrastructure.
Mimicking the HWPT structure, add a common EVENTQ structure to support its
derivatives: EVENTQ_IOPF (the prior FAULT object) and EVENTQ_VIRQ (new).
IOMMUFD_CMD_VIRQ_ALLOC is introduced to allocate EVENTQ_VIRQ for vIOMMUs.
One vIOMMU can have multiple vIRQs of different types but cannot support
multiple vIRQs of the same type.
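The one-vIRQ-per-type rule amounts to a registry keyed by type. The sketch below is a toy model only: `ViommuSketch`, `virq_alloc` and the string type names are hypothetical, and raising `FileExistsError` is just one way to signal the conflict; the text does not say which error the real ioctl returns.

```python
class ViommuSketch:
    """Toy model: at most one vIRQ queue per type on a given vIOMMU."""

    def __init__(self):
        self._virqs = {}   # maps vIRQ type -> its event queue

    def virq_alloc(self, virq_type):
        # A second allocation of an already-present type is refused.
        if virq_type in self._virqs:
            raise FileExistsError(f"vIRQ of type {virq_type!r} already allocated")
        queue = []
        self._virqs[virq_type] = queue
        return queue

    def virq_types(self):
        return sorted(self._virqs)
```

Allocating two different types on the same `ViommuSketch` succeeds, while repeating a type raises.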
The forwarding part is fairly simple but might need to replace a physical
device ID with a virtual device ID in a driver-level IRQ data structure.
So, this comes with some helpers for drivers to use.
As usual, this series comes with the selftest coverage for this new vIRQ,
and with a real world use case in the ARM SMMUv3 driver.
This is on Github:
https://github.com/nicolinc/iommufd/commits/iommufd_virq-v2
Testing with RMR patches for MSI:
https://github.com/nicolinc/iommufd/commits/iommufd_virq-v2-with-rmr
Pairing QEMU branch for testing:
https://github.com/nicolinc/qemu/commits/wip/for_iommufd_virq-v2
Changelog
v2
* Rebased on v6.13-rc1
* Added IOPF and vIRQ in iommufd.rst (userspace-api)
* Added a proper locking in iommufd_event_virq_destroy
* Added iommufd_event_virq_abort with a lockdep_assert_held
* Renamed "EVENT_*" to "EVENTQ_*" to describe the objects better
* Reorganized flows in iommufd_eventq_virq_alloc for abort() to work
* Added struct arm_smmu_vmaster to store vSID upon attaching to a nested
domain, calling a newly added iommufd_viommu_get_vdev_id helper
* Added an arm_vmaster_report_event helper in arm-smmu-v3-iommufd file
to simplify the routine in arm_smmu_handle_evt() of the main driver
v1
https://lore.kernel.org/all/cover.1724777091.git.nicolinc@nvidia.com/
Thanks!
Nicolin
Nicolin Chen (13):
iommufd/fault: Add an iommufd_fault_init() helper
iommufd/fault: Move iommufd_fault_iopf_handler() to header
iommufd: Rename IOMMUFD_OBJ_FAULT to IOMMUFD_OBJ_EVENTQ_IOPF
iommufd: Rename fault.c to eventq.c
iommufd: Add IOMMUFD_OBJ_EVENTQ_VIRQ and IOMMUFD_CMD_VIRQ_ALLOC
iommufd/viommu: Add iommufd_viommu_get_vdev_id helper
iommufd/viommu: Add iommufd_viommu_report_irq helper
iommufd/selftest: Require vdev_id when attaching to a nested domain
iommufd/selftest: Add IOMMU_TEST_OP_TRIGGER_VIRQ for vIRQ coverage
iommufd/selftest: Add EVENT_VIRQ test coverage
Documentation: userspace-api: iommufd: Update EVENTQ_IOPF and
EVENTQ_VIRQ
iommu/arm-smmu-v3: Introduce struct arm_smmu_vmaster
iommu/arm-smmu-v3: Report IRQs that belong to devices attached to
vIOMMU
drivers/iommu/iommufd/Makefile | 2 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 30 +
drivers/iommu/iommufd/iommufd_private.h | 150 ++++-
drivers/iommu/iommufd/iommufd_test.h | 10 +
include/linux/iommufd.h | 22 +-
include/uapi/linux/iommufd.h | 45 ++
tools/testing/selftests/iommu/iommufd_utils.h | 63 ++
.../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 65 ++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 94 ++-
drivers/iommu/iommufd/driver.c | 59 ++
drivers/iommu/iommufd/eventq.c | 612 ++++++++++++++++++
drivers/iommu/iommufd/fault.c | 444 -------------
drivers/iommu/iommufd/hw_pagetable.c | 12 +-
drivers/iommu/iommufd/main.c | 14 +-
drivers/iommu/iommufd/selftest.c | 53 ++
drivers/iommu/iommufd/viommu.c | 2 +
tools/testing/selftests/iommu/iommufd.c | 27 +
.../selftests/iommu/iommufd_fail_nth.c | 6 +
Documentation/userspace-api/iommufd.rst | 19 +
19 files changed, 1218 insertions(+), 511 deletions(-)
create mode 100644 drivers/iommu/iommufd/eventq.c
delete mode 100644 drivers/iommu/iommufd/fault.c
base-commit: 2ca704f55e22b7b00cc7025953091af3c82fa5c0
--
2.43.0
This patchset creates a selftest for the robust list interface, to track
regressions and ensure that the interface keeps working as expected.
In this version I removed the kselftest_harness include, but I expanded the
current futex selftest API a little bit with basic ASSERT_ macros to make the
test easier to write and read. In the future, hopefully we can move all futex
selftests to the kselftest_harness API anyway.
This is the expected output:
TAP version 13
1..6
ok 1 test_robustness
ok 2 test_set_robust_list_invalid_size
ok 3 test_get_robust_list_self
ok 4 test_get_robust_list_child
ok 5 test_set_list_op_pending
ok 6 test_robust_list_multiple_elements
# Totals: pass:6 fail:0 xfail:0 xpass:0 skip:0 error:0
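Output in this form is easy to check mechanically. As a minimal sketch (the helper `summarize_tap` is made up for this example and handles only the plan line and plain `ok` / `not ok` lines, not the full TAP grammar):

```python
import re

TAP_OUTPUT = """\
TAP version 13
1..6
ok 1 test_robustness
ok 2 test_set_robust_list_invalid_size
ok 3 test_get_robust_list_self
ok 4 test_get_robust_list_child
ok 5 test_set_list_op_pending
ok 6 test_robust_list_multiple_elements
# Totals: pass:6 fail:0 xfail:0 xpass:0 skip:0 error:0
"""


def summarize_tap(text):
    """Return (planned_count, passed_names, failed_names) for simple TAP text."""
    planned = None
    passed, failed = [], []
    for line in text.splitlines():
        line = line.strip()
        plan = re.fullmatch(r"1\.\.(\d+)", line)
        if plan:
            planned = int(plan.group(1))
            continue
        result = re.match(r"(not ok|ok)\s+(\d+)\s*(.*)", line)
        if result:
            bucket = passed if result.group(1) == "ok" else failed
            bucket.append(result.group(3))
    return planned, passed, failed


planned, passed, failed = summarize_tap(TAP_OUTPUT)
```

Comparing `planned` with `len(passed)` gives a quick sanity check that every planned test reported a result.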
Changelog
v4:
- Fixed clang warning "robust_list.c:121: converts between pointers to integer types
with different sign"
v3: https://lore.kernel.org/lkml/20241010011142.905297-1-andrealmeid@igalia.com/
- Create ASSERT_ macros for futex selftests
- Dropped kselftest_harness include, using just futex test API
André Almeida (2):
selftests/futex: Add ASSERT_ macros
selftests/futex: Create test for robust list
.../selftests/futex/functional/.gitignore | 1 +
.../selftests/futex/functional/Makefile | 3 +-
.../selftests/futex/functional/robust_list.c | 513 ++++++++++++++++++
.../testing/selftests/futex/include/logging.h | 28 +
4 files changed, 544 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/futex/functional/robust_list.c
--
2.47.1
This patch series includes some netns-related improvements and fixes for
RTNL and ip_tunnel, to make link creation more intuitive:
- Creating a link in another net namespace doesn't conflict with link
names in the current one.
- Refactor rtnetlink link creation. Create the link in the target namespace
directly. Pass both source and link netns to drivers via the newlink()
callback.
So that
# ip link add netns ns1 link-netns ns2 tun0 type gre ...
will create tun0 in ns1, rather than creating it in ns2 and moving it to ns1.
It also won't conflict with another interface named "tun0" in the current netns.
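The name-conflict aspect can be illustrated with a toy model of namespaces as sets of link names. Everything here is hypothetical (`NetNs`, `newlink`, the namespace names); it only shows the idea of checking for an existing name in the target namespace instead of the one the request came from, and it is not the rtnetlink code.

```python
class NetNs:
    """Toy network namespace: just a name and a set of link names."""

    def __init__(self, name):
        self.name = name
        self.links = set()


def newlink(current, target, ifname):
    """Create ifname directly in target; current is only where the request was issued."""
    # The existence check looks at the target namespace, not at current.
    if ifname in target.links:
        raise FileExistsError(f"{ifname} already exists in {target.name}")
    target.links.add(ifname)


cur, ns1 = NetNs("current"), NetNs("ns1")
cur.links.add("tun0")          # a tun0 already exists where the command runs
newlink(cur, ns1, "tun0")      # still succeeds: ns1 has no tun0 yet
```

A second `newlink(cur, ns1, "tun0")` would raise, because the conflict is now in the target namespace itself.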
---
v5:
- Fix function doc in batman-adv.
- Include peer_net in rtnl newlink parameters.
v4:
link: https://lore.kernel.org/all/20241118143244.1773-1-shaw.leon@gmail.com/
- Pack newlink() parameters to a single struct.
- Use ynl async_msg_queue.empty() in selftest.
v3:
link: https://lore.kernel.org/all/20241113125715.150201-1-shaw.leon@gmail.com/
- Drop "netns_atomic" flag and module parameter. Add netns parameter to
newlink() instead, and convert drivers accordingly.
- Move python NetNSEnter helper to net selftest lib.
v2:
link: https://lore.kernel.org/all/20241107133004.7469-1-shaw.leon@gmail.com/
- Check NLM_F_EXCL to ensure only link creation is affected.
- Add self tests for link name/ifindex conflict and notifications
in different netns.
- Changes in dummy driver and ynl in order to add the test case.
v1:
link: https://lore.kernel.org/all/20241023023146.372653-1-shaw.leon@gmail.com/
Xiao Liang (5):
net: ip_tunnel: Build flow in underlay net namespace
rtnetlink: Lookup device in target netns when creating link
rtnetlink: Decouple net namespaces in rtnl_newlink_create()
selftests: net: Add python context manager for netns entering
selftests: net: Add two test cases for link netns
drivers/infiniband/ulp/ipoib/ipoib_netlink.c | 11 +++--
drivers/net/amt.c | 13 +++---
drivers/net/bareudp.c | 11 +++--
drivers/net/bonding/bond_netlink.c | 8 ++--
drivers/net/can/dev/netlink.c | 4 +-
drivers/net/can/vxcan.c | 9 ++--
.../ethernet/qualcomm/rmnet/rmnet_config.c | 11 +++--
drivers/net/geneve.c | 11 +++--
drivers/net/gtp.c | 9 ++--
drivers/net/ipvlan/ipvlan.h | 4 +-
drivers/net/ipvlan/ipvlan_main.c | 11 +++--
drivers/net/ipvlan/ipvtap.c | 7 ++-
drivers/net/macsec.c | 11 +++--
drivers/net/macvlan.c | 8 ++--
drivers/net/macvtap.c | 8 ++--
drivers/net/netkit.c | 9 ++--
drivers/net/pfcp.c | 8 ++--
drivers/net/ppp/ppp_generic.c | 10 +++--
drivers/net/team/team_core.c | 7 +--
drivers/net/veth.c | 9 ++--
drivers/net/vrf.c | 7 +--
drivers/net/vxlan/vxlan_core.c | 11 +++--
drivers/net/wireguard/device.c | 8 ++--
drivers/net/wireless/virtual/virt_wifi.c | 10 +++--
drivers/net/wwan/wwan_core.c | 15 +++++--
include/net/ip_tunnels.h | 5 ++-
include/net/rtnetlink.h | 44 ++++++++++++++++---
net/8021q/vlan_netlink.c | 11 +++--
net/batman-adv/soft-interface.c | 12 ++---
net/bridge/br_netlink.c | 8 ++--
net/caif/chnl_net.c | 6 +--
net/core/rtnetlink.c | 35 ++++++++-------
net/hsr/hsr_netlink.c | 14 +++---
net/ieee802154/6lowpan/core.c | 9 ++--
net/ipv4/ip_gre.c | 27 ++++++++----
net/ipv4/ip_tunnel.c | 16 ++++---
net/ipv4/ip_vti.c | 10 +++--
net/ipv4/ipip.c | 10 +++--
net/ipv6/ip6_gre.c | 28 +++++++-----
net/ipv6/ip6_tunnel.c | 16 +++----
net/ipv6/ip6_vti.c | 15 +++----
net/ipv6/sit.c | 16 +++----
net/xfrm/xfrm_interface_core.c | 14 +++---
tools/testing/selftests/net/Makefile | 1 +
.../testing/selftests/net/lib/py/__init__.py | 2 +-
tools/testing/selftests/net/lib/py/netns.py | 18 ++++++++
tools/testing/selftests/net/netns-name.sh | 10 +++++
tools/testing/selftests/net/netns_atomic.py | 39 ++++++++++++++++
48 files changed, 385 insertions(+), 211 deletions(-)
create mode 100755 tools/testing/selftests/net/netns_atomic.py
--
2.47.1
This patch series implements a new char misc driver, /dev/ntsync, which is used
to implement Windows NT synchronization primitives.
NT synchronization primitives are unique in that the wait functions are
vectored, operate on multiple types of object with different behaviour (mutex,
semaphore, event), and affect the state of the objects they wait on. This model
is not compatible with existing kernel synchronization objects or interfaces,
and therefore the ntsync driver implements its own wait queues and locking.
This patch series is rebased against the "char-misc-next" branch of
gregkh/char-misc.git.
== Background ==
The Wine project emulates the Windows API in user space. One particular part of
that API, namely the NT synchronization primitives, has historically been
implemented via RPC to a dedicated "kernel" process. However, more recent
applications use these APIs more strenuously, and the overhead of RPC has become
a bottleneck.
The NT synchronization APIs are too complex to implement on top of existing
primitives without sacrificing correctness. Certain operations, such as
NtPulseEvent() or the "wait-for-all" mode of NtWaitForMultipleObjects(), require
direct control over the underlying wait queue, and implementing a wait queue
sufficiently robust for Wine in user space is not possible. This proposed
driver, therefore, implements the problematic interfaces directly in the Linux
kernel.
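Why "wait-for-all" needs control over all objects at once can be seen in a toy model. This is a sketch under strong simplifying assumptions: only semaphore-like objects, non-blocking "try" variants, distinct objects in each call, and one global lock. `Sem` and `WaitQueueSketch` are hypothetical names and do not reflect the ntsync uAPI.

```python
import threading


class Sem:
    """Semaphore-like object: signaled while count > 0."""

    def __init__(self, count):
        self.count = count


class WaitQueueSketch:
    def __init__(self):
        self._lock = threading.Lock()   # one lock covering all objects

    def try_wait_any(self, objs):
        """Consume the first signaled object and return its index, else None."""
        with self._lock:
            for i, obj in enumerate(objs):
                if obj.count > 0:
                    obj.count -= 1
                    return i
            return None

    def try_wait_all(self, objs):
        """Consume every object, but only if all are signaled at the same time."""
        with self._lock:
            if all(obj.count > 0 for obj in objs):
                for obj in objs:
                    obj.count -= 1
                return True
            # Nothing is consumed on failure: no partial acquisition.
            return False
```

The all-or-nothing step in `try_wait_all` is the part that is hard to build from independent per-object primitives, since checking and consuming several objects must appear atomic to other waiters.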
This driver was presented at Linux Plumbers Conference 2023. For those further
interested in the history of synchronization in Wine and past attempts to solve
this problem in user space, a recording of the presentation can be viewed here:
https://www.youtube.com/watch?v=NjU4nyWyhU8
== Performance ==
The performance measurements described below are copied from earlier versions of
the patch set. While some of the code has changed, I do not currently anticipate
that it has changed drastically enough to affect those measurements.
The gain in performance varies wildly depending on the application in question
and the user's hardware. For some games NT synchronization is not a bottleneck
and no change can be observed, but for others frame rate improvements of 50 to
150 percent are not atypical. The following table lists frame rate measurements
from a variety of games on a variety of hardware, taken by users Dmitry
Skvortsov, FuzzyQuils, OnMars, and myself:
Game                            Upstream        ntsync          improvement
===========================================================================
Anger Foot                      69              99              43%
Call of Juarez                  99.8            224.1           125%
Dirt 3                          110.6           860.7           678%
Forza Horizon 5                 108             160             48%
Lara Croft: Temple of Osiris    141             326             131%
Metro 2033                      164.4           199.2           21%
Resident Evil 2                 26              77              196%
The Crew                        26              51              96%
Tiny Tina's Wonderlands         130             360             177%
Total War Saga: Troy            109             146             34%
===========================================================================
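The improvement column appears to be the relative gain over upstream, rounded to a whole percent. A short worked example using three rows from the table (the function name is chosen for this illustration):

```python
def improvement_percent(upstream_fps, ntsync_fps):
    """Relative frame-rate gain over upstream, as a rounded percentage."""
    return round((ntsync_fps - upstream_fps) / upstream_fps * 100)


rows = {
    "Anger Foot": (69, 99),
    "Dirt 3": (110.6, 860.7),
    "Metro 2033": (164.4, 199.2),
}

for game, (upstream, ntsync) in rows.items():
    print(f"{game}: {improvement_percent(upstream, ntsync)}%")
```

For these rows the computed values are 43, 678 and 21, matching the table.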
== Patches ==
The semantics of the patches are broadly intended to match those of the
corresponding Windows functions. For those not already familiar with the Windows
functions (or their undocumented behaviour), patch 27/28 provides a detailed
specification, and individual patches also include a brief description of the
API they are implementing.
The patches making use of this driver in Wine can be retrieved or browsed here:
https://repo.or.cz/wine/zf.git/shortlog/refs/heads/ntsync5
== Previous versions ==
No changes were made from v5 other than rebasing on top of the 6.13-rc1
char-misc-next tree.
I would like to repeat a question from the last round of review, though. Two
changes were suggested related to API design, which I did not make because the
APIs in question were already released in upstream Linux. However, the driver is
also completely nonfunctional and hidden behind BROKEN, so would this be
acceptable anyway? The changes in question are:
* rename NTSYNC_IOC_SEM_POST to NTSYNC_IOC_SEM_RELEASE (matching the NT
terminology instead of POSIX),
* change object creation ioctls to return the fds directly in the return value
instead of through the args struct. I would also still appreciate a
clarification on the advice in [1], which is why I didn't do this in the first
place.
[1] https://docs.kernel.org/driver-api/ioctl.html#return-code
* Link to v5: https://lore.kernel.org/lkml/20240519202454.1192826-1-zfigura@codeweavers.c…
* Link to v4: https://lore.kernel.org/lkml/20240416010837.333694-1-zfigura@codeweavers.co…
* Link to v3: https://lore.kernel.org/lkml/20240329000621.148791-1-zfigura@codeweavers.co…
* Link to v2: https://lore.kernel.org/lkml/20240219223833.95710-1-zfigura@codeweavers.com/
* Link to v1: https://lore.kernel.org/lkml/20240214233645.9273-1-zfigura@codeweavers.com/
* Link to RFC v2: https://lore.kernel.org/lkml/20240131021356.10322-1-zfigura@codeweavers.com/
* Link to RFC v1: https://lore.kernel.org/lkml/20240124004028.16826-1-zfigura@codeweavers.com/
Elizabeth Figura (28):
ntsync: Introduce NTSYNC_IOC_WAIT_ANY.
ntsync: Introduce NTSYNC_IOC_WAIT_ALL.
ntsync: Introduce NTSYNC_IOC_CREATE_MUTEX.
ntsync: Introduce NTSYNC_IOC_MUTEX_UNLOCK.
ntsync: Introduce NTSYNC_IOC_MUTEX_KILL.
ntsync: Introduce NTSYNC_IOC_CREATE_EVENT.
ntsync: Introduce NTSYNC_IOC_EVENT_SET.
ntsync: Introduce NTSYNC_IOC_EVENT_RESET.
ntsync: Introduce NTSYNC_IOC_EVENT_PULSE.
ntsync: Introduce NTSYNC_IOC_SEM_READ.
ntsync: Introduce NTSYNC_IOC_MUTEX_READ.
ntsync: Introduce NTSYNC_IOC_EVENT_READ.
ntsync: Introduce alertable waits.
selftests: ntsync: Add some tests for semaphore state.
selftests: ntsync: Add some tests for mutex state.
selftests: ntsync: Add some tests for NTSYNC_IOC_WAIT_ANY.
selftests: ntsync: Add some tests for NTSYNC_IOC_WAIT_ALL.
selftests: ntsync: Add some tests for wakeup signaling with
WINESYNC_IOC_WAIT_ANY.
selftests: ntsync: Add some tests for wakeup signaling with
WINESYNC_IOC_WAIT_ALL.
selftests: ntsync: Add some tests for manual-reset event state.
selftests: ntsync: Add some tests for auto-reset event state.
selftests: ntsync: Add some tests for wakeup signaling with events.
selftests: ntsync: Add tests for alertable waits.
selftests: ntsync: Add some tests for wakeup signaling via alerts.
selftests: ntsync: Add a stress test for contended waits.
maintainers: Add an entry for ntsync.
docs: ntsync: Add documentation for the ntsync uAPI.
ntsync: No longer depend on BROKEN.
Documentation/userspace-api/index.rst | 1 +
Documentation/userspace-api/ntsync.rst | 398 +++++
MAINTAINERS | 9 +
drivers/misc/Kconfig | 1 -
drivers/misc/ntsync.c | 989 +++++++++++-
include/uapi/linux/ntsync.h | 39 +
tools/testing/selftests/Makefile | 1 +
.../selftests/drivers/ntsync/.gitignore | 1 +
.../testing/selftests/drivers/ntsync/Makefile | 7 +
tools/testing/selftests/drivers/ntsync/config | 1 +
.../testing/selftests/drivers/ntsync/ntsync.c | 1407 +++++++++++++++++
11 files changed, 2850 insertions(+), 4 deletions(-)
create mode 100644 Documentation/userspace-api/ntsync.rst
create mode 100644 tools/testing/selftests/drivers/ntsync/.gitignore
create mode 100644 tools/testing/selftests/drivers/ntsync/Makefile
create mode 100644 tools/testing/selftests/drivers/ntsync/config
create mode 100644 tools/testing/selftests/drivers/ntsync/ntsync.c
base-commit: cdd30ebb1b9f36159d66f088b61aee264e649d7a
--
2.45.2
On Tue, Dec 10, 2024 at 10:00:17PM +0800, Zijun Hu wrote:
> This patch series is to fix bug for APIs
> - devm_pci_epc_destroy().
> - pci_epf_remove_vepf().
>
> and simplify APIs below:
> - pci_epc_get().
>
> Signed-off-by: Zijun Hu <quic_zijuhu(a)quicinc.com>
> ---
This is very good. This is Config FS. Is there a kself test for configfs
or did you create your own test?
regards,
dan carpenter
Currently, kselftests does not have a generalised mechanism to skip compiling
and running tests when required kernel configuration options are disabled.
This patch series addresses this limitation by introducing a new flag,
'TEST_CONFIG_DEPS' in lib.mk, along with corresponding updates to the
documentation.
The selftests/livepatch/Makefile has been updated to utilize TEST_CONFIG_DEPS.
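The idea behind such a dependency check can be sketched in a few lines. This is not the lib.mk logic from the series; it is a stand-alone illustration in which `parse_kconfig`, `missing_options` and the sample option names are assumptions, and an option counts as enabled when it is set to `y` or `m`.

```python
def parse_kconfig(text):
    """Return {option: value} for lines such as CONFIG_FOO=y."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks and comments like "# CONFIG_FOO is not set".
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep:
            options[name] = value
    return options


def missing_options(required, kernel_config_text):
    """List the required options that are not built in (y) or modular (m)."""
    enabled = parse_kconfig(kernel_config_text)
    return [opt for opt in required if enabled.get(opt) not in ("y", "m")]


KERNEL_CONFIG = """\
CONFIG_LIVEPATCH=y
# CONFIG_DYNAMIC_DEBUG is not set
CONFIG_TEST_LIVEPATCH=m
"""

missing = missing_options(
    ["CONFIG_LIVEPATCH", "CONFIG_DYNAMIC_DEBUG", "CONFIG_TEST_LIVEPATCH"],
    KERNEL_CONFIG,
)
```

A build wrapper could skip a test directory whenever the returned list is non-empty.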
Siddharth Menon (3):
docs/kselftests: Explain the usage of TEST_CONFIG_DEPS
selftests/lib.mk: Introduce check to validate required configs
selftests/livepatch: Check if required config options are enabled
Documentation/dev-tools/kselftest.rst | 3 +++
tools/testing/selftests/lib.mk | 18 ++++++++++++++++--
tools/testing/selftests/livepatch/Makefile | 1 +
3 files changed, 20 insertions(+), 2 deletions(-)
--
2.39.5
These patches are all simple fixes with no strong dependency on each other,
but I hope that making them a patchset will be more convenient for merging.
The patchset is based on v6.12-rc2.
Chunyan Zhang (4):
riscv: Remove unused GENERATING_ASM_OFFSETS
riscv: Remove duplicated GET_RM
selftest/mm: Fix typo in virtual_address_range
selftests/mm: skip virtual_address_range tests on riscv
arch/riscv/kernel/asm-offsets.c | 2 --
arch/riscv/kernel/traps_misaligned.c | 2 --
tools/testing/selftests/mm/Makefile | 2 ++
tools/testing/selftests/mm/run_vmtests.sh | 10 ++++++----
tools/testing/selftests/mm/virtual_address_range.c | 4 ++--
5 files changed, 10 insertions(+), 10 deletions(-)
--
2.34.1
Hi Linus,
Please pull the following fixes update for Linux 6.13-rc3.
linux_kselftest-fixes-6.13-rc3
-- fixes the offset for kprobe syntax error test case when checking the
BTF arguments on 64-bit powerpc.
Note: This fix has been in linux-next since last week. I had to drop
a patch and rebase this morning.
diff is attached.
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit 40384c840ea1944d7c5a392e8975ed088ecf0b37:
Linux 6.13-rc1 (2024-12-01 14:28:56 -0800)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest tags/linux_kselftest-fixes-6.13-rc3
for you to fetch changes up to 777f290ab328de333b85558bb6807a69a59b36ba:
selftests/ftrace: adjust offset for kprobe syntax error test (2024-12-11 10:08:04 -0700)
----------------------------------------------------------------
linux_kselftest-fixes-6.13-rc3
-- fixes the offset for kprobe syntax error test case when checking the
BTF arguments on 64-bit powerpc.
----------------------------------------------------------------
Hari Bathini (1):
selftests/ftrace: adjust offset for kprobe syntax error test
tools/testing/selftests/ftrace/test.d/kprobe/kprobe_syntax_errors.tc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------
This series introduces a new ioctl KVM_TRANSLATE2, which expands on
KVM_TRANSLATE. It is required to implement Hyper-V's
HvTranslateVirtualAddress hypercall as part of the ongoing effort to
emulate Hyper-V's Virtual Secure Mode (VSM) within KVM and QEMU. The
hypercall requires several new KVM APIs, one of which is KVM_TRANSLATE2,
which implements the core functionality of the hypercall. The rest of the
required functionality will be implemented in subsequent series.
Other than translating guest virtual addresses, the ioctl allows the
caller to control whether the accessed and dirty bits are set during the
page walk. It also allows specifying an access mode instead of returning
viable access modes, which enables setting the bits up to the level that
caused a failure. Additionally, the ioctl provides more information about
why the page walk failed, and which page table is responsible. This
functionality is not available within KVM_TRANSLATE, and can't be added
without breaking backwards compatibility, thus a new ioctl is required.
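To make the flag behavior described above concrete, here is a minimal user-space sketch in C. It is not the KVM implementation: the flag names are taken from the patch titles in this series, but their numeric values, the `toy_pte` structure, the fixed four-level walk, and the exact meaning given to PWALK_FORCE_SET_ACCESSED (mark the entries walked before a failure) are assumptions made only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag names come from the patch titles; the values are made up here. */
#define PWALK_SET_ACCESSED        (1u << 0)
#define PWALK_SET_DIRTY           (1u << 1)
#define PWALK_FORCE_SET_ACCESSED  (1u << 2)

#define LEVELS 4

/* Hypothetical, heavily simplified page table entry. */
struct toy_pte {
	bool present;
	bool accessed;
	bool dirty;
};

/*
 * Walk LEVELS entries from the root (index 0) to the leaf (index LEVELS - 1).
 * Returns -1 on success, or the index of the entry that caused the failure.
 */
static int toy_walk(struct toy_pte pt[LEVELS], unsigned int flags, bool write)
{
	int fail = -1;

	/* Find the first non-present entry, if any. */
	for (int i = 0; i < LEVELS; i++) {
		if (!pt[i].present) {
			fail = i;
			break;
		}
	}

	/* Entries before the failing one were traversed successfully. */
	int walked = (fail < 0) ? LEVELS : fail;

	/* Assumption: on failure only the FORCE flag still marks entries. */
	unsigned int want = (fail < 0) ? PWALK_SET_ACCESSED
				       : PWALK_FORCE_SET_ACCESSED;

	if (flags & want)
		for (int i = 0; i < walked; i++)
			pt[i].accessed = true;

	/* The dirty bit is only set on the leaf of a successful write walk. */
	if (fail < 0 && write && (flags & PWALK_SET_DIRTY))
		pt[LEVELS - 1].dirty = true;

	return fail;
}
```

In this toy model the return value doubles as the "which page table is responsible" information mentioned above; the real ioctl presumably reports that through its own result structure, which is not reproduced here.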
The ioctl was designed to facilitate as many other use cases as possible
apart from VSM. The error codes were intentionally chosen to be broad
enough to avoid exposing architecture specific details. Even though
HvTranslateVirtualAddress only really needs one flag to set the accessed
and dirty bits whenever possible, that was split into several flags so
that future users can choose at a finer granularity when these bits should be set.
Furthermore, as much information as possible is provided to the caller.
The patch series includes selftests for the ioctl, as well as fuzz
testing on random garbage guest page table entries. All previously passing
KVM selftests and KVM unit tests still pass.
Series overview:
- 1: Document the new ioctl
- 2-11: Update the page walker in preparation
- 12-14: Implement the ioctl
- 15: Implement testing
This series, alongside the series by Nicolas Saenz Julienne [1]
introducing the core building blocks for VSM and the accompanying QEMU
implementation [2], is capable of booting Windows Server 2019.
Both series are also available on GitHub [3].
[1] https://lore.kernel.org/linux-hyperv/20240609154945.55332-1-nsaenz@amazon.c…
[2] https://github.com/vianpl/qemu/tree/vsm/next
[3] https://github.com/vianpl/linux/tree/vsm/next
Best,
Nikolas
Nikolas Wipper (15):
KVM: Add API documentation for KVM_TRANSLATE2
KVM: x86/mmu: Abort page walk if permission checks fail
KVM: x86/mmu: Introduce exception flag for unmapped GPAs
KVM: x86/mmu: Store GPA in exception if applicable
KVM: x86/mmu: Introduce flags parameter to page walker
KVM: x86/mmu: Implement PWALK_SET_ACCESSED in page walker
KVM: x86/mmu: Implement PWALK_SET_DIRTY in page walker
KVM: x86/mmu: Implement PWALK_FORCE_SET_ACCESSED in page walker
KVM: x86/mmu: Introduce status parameter to page walker
KVM: x86/mmu: Implement PWALK_STATUS_READ_ONLY_PTE_GPA in page walker
KVM: x86: Introduce generic gva to gpa translation function
KVM: Introduce KVM_TRANSLATE2
KVM: Add KVM_TRANSLATE2 stub
KVM: x86: Implement KVM_TRANSLATE2
KVM: selftests: Add test for KVM_TRANSLATE2
Documentation/virt/kvm/api.rst | 131 ++++++++
arch/x86/include/asm/kvm_host.h | 18 +-
arch/x86/kvm/hyperv.c | 3 +-
arch/x86/kvm/kvm_emulate.h | 8 +
arch/x86/kvm/mmu.h | 10 +-
arch/x86/kvm/mmu/mmu.c | 7 +-
arch/x86/kvm/mmu/paging_tmpl.h | 80 +++--
arch/x86/kvm/x86.c | 123 ++++++-
include/linux/kvm_host.h | 6 +
include/uapi/linux/kvm.h | 33 ++
tools/testing/selftests/kvm/Makefile | 1 +
.../selftests/kvm/x86_64/kvm_translate2.c | 310 ++++++++++++++++++
virt/kvm/kvm_main.c | 41 +++
13 files changed, 724 insertions(+), 47 deletions(-)
create mode 100644 tools/testing/selftests/kvm/x86_64/kvm_translate2.c
--
2.40.1
This series:
1. makes the behavior of of_find_device_by_node(),
bus_find_device_by_of_node(), bus_find_device_by_fwnode(), etc., more
consistent when provided with a NULL node/handle;
2. adds kunit tests to validate the new NULL-argument behavior; and
3. makes some related improvements and refactoring for the drivers/base/
kunit tests.
This series aims to prevent problems like the ones resolved in commit
5c8418cf4025 ("PCI/pwrctrl: Unregister platform device only if one
actually exists").
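As a rough illustration of why a NULL lookup key can be a problem, the following user-space C sketch contrasts a match callback that compares pointers directly with one that rejects a NULL key. The `node` and `dev` structures and the `find_dev()` helper are hypothetical stand-ins, not the driver core's types or functions, and the "loose" variant is only one plausible model of the inconsistent behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a firmware node and a device. */
struct node {
	const char *name;
};

struct dev {
	const char *name;
	const struct node *of_node;	/* NULL if the device has no node */
};

/* Loose match: a NULL key "matches" any device that has no node. */
static bool match_loose(const struct dev *d, const struct node *np)
{
	return d->of_node == np;
}

/* Strict match: a NULL key never matches anything. */
static bool match_strict(const struct dev *d, const struct node *np)
{
	return np != NULL && d->of_node == np;
}

/* Return the first device accepted by the match callback, or NULL. */
static const struct dev *find_dev(const struct dev *devs, size_t n,
				  const struct node *np,
				  bool (*match)(const struct dev *,
						const struct node *))
{
	assert(devs != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
		if (match(&devs[i], np))
			return &devs[i];
	return NULL;
}
```

With the loose callback, a lookup with a NULL key returns an unrelated device that happens to have no node, and a caller that then unregisters the result would act on the wrong device. With the strict callback the same lookup returns NULL.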
Brian Norris (4):
drivers: base: Don't match devices with NULL of_node/fwnode/etc
drivers: base: test: Enable device model tests with KUNIT_ALL_TESTS
drivers: base: test: Drop "devm" from platform-device-test names
drivers: base: test: Add ...find_device_by...(... NULL) tests
drivers/base/core.c | 8 ++---
drivers/base/test/Kconfig | 1 +
drivers/base/test/platform-device-test.c | 42 ++++++++++++++++++++----
3 files changed, 40 insertions(+), 11 deletions(-)
--
2.47.0.338.g60cca15819-goog
From: Kumar Kartikeya Dwivedi <memxor(a)gmail.com>
[ Upstream commit bd74e238ae6944b462f57ce8752440a011ba4530 ]
Andrii spotted that process_dynptr_func's rejection of incorrect
argument register type will print an error string where argument numbers
are not zero-indexed, unlike elsewhere in the verifier. Fix this by
subtracting 1 from regno. The same scenario exists for iterator
messages. Fix selftest error strings that match on the exact argument
number while we're at it to ensure clean bisection.
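The off-by-one can be sketched in plain C as follows. This is a stand-alone illustration, not verifier code: it assumes the usual BPF convention that call arguments live in registers R1 to R5, and the helper names `arg_index()`, `format_uninit_dynptr()` and `msg_matches()` are invented for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption: call arguments are passed in registers R1..R5. */
enum { FIRST_ARG_REG = 1, LAST_ARG_REG = 5 };

/* Convert a register number into a zero-indexed argument number. */
static int arg_index(int regno)
{
	assert(regno >= FIRST_ARG_REG && regno <= LAST_ARG_REG);
	return regno - 1;
}

/* Format the message the way the fixed code does, using regno - 1. */
static int format_uninit_dynptr(char *buf, size_t len, int regno)
{
	return snprintf(buf, len,
			"Expected an initialized dynptr as arg #%d",
			arg_index(regno));
}

/* Substring check, loosely modelled on how a test expects a message. */
static int msg_matches(const char *expected, int regno)
{
	char buf[64];

	format_uninit_dynptr(buf, sizeof(buf), regno);
	return strstr(buf, expected) != NULL;
}
```

A test that still expects "arg #3" for a dynptr passed in R3 stops matching once the message is zero-indexed, which is why the expected strings in the selftests change in the same commit.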
Suggested-by: Andrii Nakryiko <andrii(a)kernel.org>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor(a)gmail.com>
Link: https://lore.kernel.org/r/20241203002235.3776418-1-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
kernel/bpf/verifier.c | 12 +++++-----
.../testing/selftests/bpf/progs/dynptr_fail.c | 22 +++++++++----------
.../selftests/bpf/progs/iters_state_safety.c | 14 ++++++------
.../selftests/bpf/progs/iters_testmod_seq.c | 4 ++--
.../bpf/progs/test_kfunc_dynptr_param.c | 2 +-
.../selftests/bpf/progs/verifier_bits_iter.c | 4 ++--
6 files changed, 29 insertions(+), 29 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 91317857ea3ee..436a83784b7d2 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -7903,7 +7903,7 @@ static int process_dynptr_func(struct bpf_verifier_env *env, int regno, int insn
if (reg->type != PTR_TO_STACK && reg->type != CONST_PTR_TO_DYNPTR) {
verbose(env,
"arg#%d expected pointer to stack or const struct bpf_dynptr\n",
- regno);
+ regno - 1);
return -EINVAL;
}
@@ -7957,7 +7957,7 @@ static int process_dynptr_func(struct bpf_verifier_env *env, int regno, int insn
if (!is_dynptr_reg_valid_init(env, reg)) {
verbose(env,
"Expected an initialized dynptr as arg #%d\n",
- regno);
+ regno - 1);
return -EINVAL;
}
@@ -7965,7 +7965,7 @@ static int process_dynptr_func(struct bpf_verifier_env *env, int regno, int insn
if (!is_dynptr_type_expected(env, reg, arg_type & ~MEM_RDONLY)) {
verbose(env,
"Expected a dynptr of type %s as arg #%d\n",
- dynptr_type_str(arg_to_dynptr_type(arg_type)), regno);
+ dynptr_type_str(arg_to_dynptr_type(arg_type)), regno - 1);
return -EINVAL;
}
@@ -8029,7 +8029,7 @@ static int process_iter_arg(struct bpf_verifier_env *env, int regno, int insn_id
*/
btf_id = btf_check_iter_arg(meta->btf, meta->func_proto, regno - 1);
if (btf_id < 0) {
- verbose(env, "expected valid iter pointer as arg #%d\n", regno);
+ verbose(env, "expected valid iter pointer as arg #%d\n", regno - 1);
return -EINVAL;
}
t = btf_type_by_id(meta->btf, btf_id);
@@ -8039,7 +8039,7 @@ static int process_iter_arg(struct bpf_verifier_env *env, int regno, int insn_id
/* bpf_iter_<type>_new() expects pointer to uninit iter state */
if (!is_iter_reg_valid_uninit(env, reg, nr_slots)) {
verbose(env, "expected uninitialized iter_%s as arg #%d\n",
- iter_type_str(meta->btf, btf_id), regno);
+ iter_type_str(meta->btf, btf_id), regno - 1);
return -EINVAL;
}
@@ -8063,7 +8063,7 @@ static int process_iter_arg(struct bpf_verifier_env *env, int regno, int insn_id
break;
case -EINVAL:
verbose(env, "expected an initialized iter_%s as arg #%d\n",
- iter_type_str(meta->btf, btf_id), regno);
+ iter_type_str(meta->btf, btf_id), regno - 1);
return err;
case -EPROTO:
verbose(env, "expected an RCU CS when using %s\n", meta->func_name);
diff --git a/tools/testing/selftests/bpf/progs/dynptr_fail.c b/tools/testing/selftests/bpf/progs/dynptr_fail.c
index 8f36c9de75915..dfd817d0348c4 100644
--- a/tools/testing/selftests/bpf/progs/dynptr_fail.c
+++ b/tools/testing/selftests/bpf/progs/dynptr_fail.c
@@ -149,7 +149,7 @@ int ringbuf_release_uninit_dynptr(void *ctx)
/* A dynptr can't be used after it has been invalidated */
SEC("?raw_tp")
-__failure __msg("Expected an initialized dynptr as arg #3")
+__failure __msg("Expected an initialized dynptr as arg #2")
int use_after_invalid(void *ctx)
{
struct bpf_dynptr ptr;
@@ -428,7 +428,7 @@ int invalid_helper2(void *ctx)
/* A bpf_dynptr is invalidated if it's been written into */
SEC("?raw_tp")
-__failure __msg("Expected an initialized dynptr as arg #1")
+__failure __msg("Expected an initialized dynptr as arg #0")
int invalid_write1(void *ctx)
{
struct bpf_dynptr ptr;
@@ -1407,7 +1407,7 @@ int invalid_slice_rdwr_rdonly(struct __sk_buff *skb)
/* bpf_dynptr_adjust can only be called on initialized dynptrs */
SEC("?raw_tp")
-__failure __msg("Expected an initialized dynptr as arg #1")
+__failure __msg("Expected an initialized dynptr as arg #0")
int dynptr_adjust_invalid(void *ctx)
{
struct bpf_dynptr ptr = {};
@@ -1420,7 +1420,7 @@ int dynptr_adjust_invalid(void *ctx)
/* bpf_dynptr_is_null can only be called on initialized dynptrs */
SEC("?raw_tp")
-__failure __msg("Expected an initialized dynptr as arg #1")
+__failure __msg("Expected an initialized dynptr as arg #0")
int dynptr_is_null_invalid(void *ctx)
{
struct bpf_dynptr ptr = {};
@@ -1433,7 +1433,7 @@ int dynptr_is_null_invalid(void *ctx)
/* bpf_dynptr_is_rdonly can only be called on initialized dynptrs */
SEC("?raw_tp")
-__failure __msg("Expected an initialized dynptr as arg #1")
+__failure __msg("Expected an initialized dynptr as arg #0")
int dynptr_is_rdonly_invalid(void *ctx)
{
struct bpf_dynptr ptr = {};
@@ -1446,7 +1446,7 @@ int dynptr_is_rdonly_invalid(void *ctx)
/* bpf_dynptr_size can only be called on initialized dynptrs */
SEC("?raw_tp")
-__failure __msg("Expected an initialized dynptr as arg #1")
+__failure __msg("Expected an initialized dynptr as arg #0")
int dynptr_size_invalid(void *ctx)
{
struct bpf_dynptr ptr = {};
@@ -1459,7 +1459,7 @@ int dynptr_size_invalid(void *ctx)
/* Only initialized dynptrs can be cloned */
SEC("?raw_tp")
-__failure __msg("Expected an initialized dynptr as arg #1")
+__failure __msg("Expected an initialized dynptr as arg #0")
int clone_invalid1(void *ctx)
{
struct bpf_dynptr ptr1 = {};
@@ -1493,7 +1493,7 @@ int clone_invalid2(struct xdp_md *xdp)
/* Invalidating a dynptr should invalidate its clones */
SEC("?raw_tp")
-__failure __msg("Expected an initialized dynptr as arg #3")
+__failure __msg("Expected an initialized dynptr as arg #2")
int clone_invalidate1(void *ctx)
{
struct bpf_dynptr clone;
@@ -1514,7 +1514,7 @@ int clone_invalidate1(void *ctx)
/* Invalidating a dynptr should invalidate its parent */
SEC("?raw_tp")
-__failure __msg("Expected an initialized dynptr as arg #3")
+__failure __msg("Expected an initialized dynptr as arg #2")
int clone_invalidate2(void *ctx)
{
struct bpf_dynptr ptr;
@@ -1535,7 +1535,7 @@ int clone_invalidate2(void *ctx)
/* Invalidating a dynptr should invalidate its siblings */
SEC("?raw_tp")
-__failure __msg("Expected an initialized dynptr as arg #3")
+__failure __msg("Expected an initialized dynptr as arg #2")
int clone_invalidate3(void *ctx)
{
struct bpf_dynptr ptr;
@@ -1723,7 +1723,7 @@ __noinline long global_call_bpf_dynptr(const struct bpf_dynptr *dynptr)
}
SEC("?raw_tp")
-__failure __msg("arg#1 expected pointer to stack or const struct bpf_dynptr")
+__failure __msg("arg#0 expected pointer to stack or const struct bpf_dynptr")
int test_dynptr_reg_type(void *ctx)
{
struct task_struct *current = NULL;
diff --git a/tools/testing/selftests/bpf/progs/iters_state_safety.c b/tools/testing/selftests/bpf/progs/iters_state_safety.c
index d47e59aba6de3..f41257eadbb25 100644
--- a/tools/testing/selftests/bpf/progs/iters_state_safety.c
+++ b/tools/testing/selftests/bpf/progs/iters_state_safety.c
@@ -73,7 +73,7 @@ int create_and_forget_to_destroy_fail(void *ctx)
}
SEC("?raw_tp")
-__failure __msg("expected an initialized iter_num as arg #1")
+__failure __msg("expected an initialized iter_num as arg #0")
int destroy_without_creating_fail(void *ctx)
{
/* init with zeros to stop verifier complaining about uninit stack */
@@ -91,7 +91,7 @@ int destroy_without_creating_fail(void *ctx)
}
SEC("?raw_tp")
-__failure __msg("expected an initialized iter_num as arg #1")
+__failure __msg("expected an initialized iter_num as arg #0")
int compromise_iter_w_direct_write_fail(void *ctx)
{
struct bpf_iter_num iter;
@@ -143,7 +143,7 @@ int compromise_iter_w_direct_write_and_skip_destroy_fail(void *ctx)
}
SEC("?raw_tp")
-__failure __msg("expected an initialized iter_num as arg #1")
+__failure __msg("expected an initialized iter_num as arg #0")
int compromise_iter_w_helper_write_fail(void *ctx)
{
struct bpf_iter_num iter;
@@ -230,7 +230,7 @@ int valid_stack_reuse(void *ctx)
}
SEC("?raw_tp")
-__failure __msg("expected uninitialized iter_num as arg #1")
+__failure __msg("expected uninitialized iter_num as arg #0")
int double_create_fail(void *ctx)
{
struct bpf_iter_num iter;
@@ -258,7 +258,7 @@ int double_create_fail(void *ctx)
}
SEC("?raw_tp")
-__failure __msg("expected an initialized iter_num as arg #1")
+__failure __msg("expected an initialized iter_num as arg #0")
int double_destroy_fail(void *ctx)
{
struct bpf_iter_num iter;
@@ -284,7 +284,7 @@ int double_destroy_fail(void *ctx)
}
SEC("?raw_tp")
-__failure __msg("expected an initialized iter_num as arg #1")
+__failure __msg("expected an initialized iter_num as arg #0")
int next_without_new_fail(void *ctx)
{
struct bpf_iter_num iter;
@@ -305,7 +305,7 @@ int next_without_new_fail(void *ctx)
}
SEC("?raw_tp")
-__failure __msg("expected an initialized iter_num as arg #1")
+__failure __msg("expected an initialized iter_num as arg #0")
int next_after_destroy_fail(void *ctx)
{
struct bpf_iter_num iter;
diff --git a/tools/testing/selftests/bpf/progs/iters_testmod_seq.c b/tools/testing/selftests/bpf/progs/iters_testmod_seq.c
index 4a176e6aede89..6543d5b6e0a97 100644
--- a/tools/testing/selftests/bpf/progs/iters_testmod_seq.c
+++ b/tools/testing/selftests/bpf/progs/iters_testmod_seq.c
@@ -79,7 +79,7 @@ int testmod_seq_truncated(const void *ctx)
SEC("?raw_tp")
__failure
-__msg("expected an initialized iter_testmod_seq as arg #2")
+__msg("expected an initialized iter_testmod_seq as arg #1")
int testmod_seq_getter_before_bad(const void *ctx)
{
struct bpf_iter_testmod_seq it;
@@ -89,7 +89,7 @@ int testmod_seq_getter_before_bad(const void *ctx)
SEC("?raw_tp")
__failure
-__msg("expected an initialized iter_testmod_seq as arg #2")
+__msg("expected an initialized iter_testmod_seq as arg #1")
int testmod_seq_getter_after_bad(const void *ctx)
{
struct bpf_iter_testmod_seq it;
diff --git a/tools/testing/selftests/bpf/progs/test_kfunc_dynptr_param.c b/tools/testing/selftests/bpf/progs/test_kfunc_dynptr_param.c
index e68667aec6a65..cd4d752bd089c 100644
--- a/tools/testing/selftests/bpf/progs/test_kfunc_dynptr_param.c
+++ b/tools/testing/selftests/bpf/progs/test_kfunc_dynptr_param.c
@@ -45,7 +45,7 @@ int BPF_PROG(not_valid_dynptr, int cmd, union bpf_attr *attr, unsigned int size)
}
SEC("?lsm.s/bpf")
-__failure __msg("arg#1 expected pointer to stack or const struct bpf_dynptr")
+__failure __msg("arg#0 expected pointer to stack or const struct bpf_dynptr")
int BPF_PROG(not_ptr_to_stack, int cmd, union bpf_attr *attr, unsigned int size)
{
unsigned long val = 0;
diff --git a/tools/testing/selftests/bpf/progs/verifier_bits_iter.c b/tools/testing/selftests/bpf/progs/verifier_bits_iter.c
index 7c881bca9af5c..497febf5c578d 100644
--- a/tools/testing/selftests/bpf/progs/verifier_bits_iter.c
+++ b/tools/testing/selftests/bpf/progs/verifier_bits_iter.c
@@ -32,7 +32,7 @@ int BPF_PROG(no_destroy, struct bpf_iter_meta *meta, struct cgroup *cgrp)
SEC("iter/cgroup")
__description("uninitialized iter in ->next()")
-__failure __msg("expected an initialized iter_bits as arg #1")
+__failure __msg("expected an initialized iter_bits as arg #0")
int BPF_PROG(next_uninit, struct bpf_iter_meta *meta, struct cgroup *cgrp)
{
struct bpf_iter_bits *it = NULL;
@@ -43,7 +43,7 @@ int BPF_PROG(next_uninit, struct bpf_iter_meta *meta, struct cgroup *cgrp)
SEC("iter/cgroup")
__description("uninitialized iter in ->destroy()")
-__failure __msg("expected an initialized iter_bits as arg #1")
+__failure __msg("expected an initialized iter_bits as arg #0")
int BPF_PROG(destroy_uninit, struct bpf_iter_meta *meta, struct cgroup *cgrp)
{
struct bpf_iter_bits it = {};
--
2.43.0
Isolated CPUs are not allowed to be used in a non-isolated partition.
The only exception is the top cpuset which is allowed to contain boot
time isolated CPUs.
Commit ccac8e8de99c ("cgroup/cpuset: Fix remote root partition creation
problem") introduces a simplified scheme of including only partition
roots in sched domain generation. However, it does not properly account
for this exception case. This can result in leakage of isolated CPUs
into a sched domain.
Fix it by making sure that isolated CPUs are excluded from the top
cpuset before generating sched domains.
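The masking step can be sketched with plain bitmasks in user-space C. This is only an illustration: `mask_t` and `domain_mask()` are hypothetical names, each bit stands for one CPU, and the housekeeping mask is assumed to be the set of CPUs that were not isolated at boot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical CPU mask: bit n set means CPU n is in the mask. */
typedef uint64_t mask_t;

/*
 * Compute the CPUs of one sched domain.
 * Only the top cpuset may contain boot-time isolated CPUs, so only it
 * is intersected with the housekeeping mask; other partition roots are
 * copied unchanged.
 */
static mask_t domain_mask(mask_t effective, mask_t housekeeping, bool is_top)
{
	return is_top ? (effective & housekeeping) : effective;
}
```

For example, with CPUs 0-7 in the top cpuset (0xFF) and CPUs 6-7 isolated at boot (housekeeping mask 0x3F), the top cpuset's domain becomes 0x3F, so the isolated CPUs no longer leak into a sched domain.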
Also update the way the boot time isolated CPUs are handled in
test_cpuset_prs.sh to make sure that those isolated CPUs are really
isolated instead of just skipping them in the tests.
Fixes: ccac8e8de99c ("cgroup/cpuset: Fix remote root partition creation problem")
Signed-off-by: Waiman Long <longman(a)redhat.com>
---
kernel/cgroup/cpuset.c | 10 +++++-
.../selftests/cgroup/test_cpuset_prs.sh | 33 +++++++++++--------
2 files changed, 28 insertions(+), 15 deletions(-)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index f321ed515f3a..33b264c3e258 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -890,7 +890,15 @@ static int generate_sched_domains(cpumask_var_t **domains,
*/
if (cgrpv2) {
for (i = 0; i < ndoms; i++) {
- cpumask_copy(doms[i], csa[i]->effective_cpus);
+ /*
+ * The top cpuset may contain some boot time isolated
+ * CPUs that need to be excluded from the sched domain.
+ */
+ if (csa[i] == &top_cpuset)
+ cpumask_and(doms[i], csa[i]->effective_cpus,
+ housekeeping_cpumask(HK_TYPE_DOMAIN));
+ else
+ cpumask_copy(doms[i], csa[i]->effective_cpus);
if (dattr)
dattr[i] = SD_ATTR_INIT;
}
diff --git a/tools/testing/selftests/cgroup/test_cpuset_prs.sh b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
index 03c1bdaed2c3..400a696a0d21 100755
--- a/tools/testing/selftests/cgroup/test_cpuset_prs.sh
+++ b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
@@ -86,15 +86,15 @@ echo "" > test/cpuset.cpus
#
# If isolated CPUs have been reserved at boot time (as shown in
-# cpuset.cpus.isolated), these isolated CPUs should be outside of CPUs 0-7
+# cpuset.cpus.isolated), these isolated CPUs should be outside of CPUs 0-8
# that will be used by this script for testing purpose. If not, some of
-# the tests may fail incorrectly. These isolated CPUs will also be removed
-# before being compared with the expected results.
+# the tests may fail incorrectly. These pre-isolated CPUs should stay in
+# an isolated state throughout the testing process for now.
#
BOOT_ISOLCPUS=$(cat $CGROUP2/cpuset.cpus.isolated)
if [[ -n "$BOOT_ISOLCPUS" ]]
then
- [[ $(echo $BOOT_ISOLCPUS | sed -e "s/[,-].*//") -le 7 ]] &&
+ [[ $(echo $BOOT_ISOLCPUS | sed -e "s/[,-].*//") -le 8 ]] &&
skip_test "Pre-isolated CPUs ($BOOT_ISOLCPUS) overlap CPUs to be tested"
echo "Pre-isolated CPUs: $BOOT_ISOLCPUS"
fi
@@ -683,15 +683,19 @@ check_isolcpus()
EXPECT_VAL2=$EXPECT_VAL
fi
+ #
+ # Appending pre-isolated CPUs
+ # Even though CPU #8 isn't used for testing, it can't be pre-isolated
+ # to make appending those CPUs easier.
+ #
+ [[ -n "$BOOT_ISOLCPUS" ]] && {
+ EXPECT_VAL=${EXPECT_VAL:+${EXPECT_VAL},}${BOOT_ISOLCPUS}
+ EXPECT_VAL2=${EXPECT_VAL2:+${EXPECT_VAL2},}${BOOT_ISOLCPUS}
+ }
+
#
# Check cpuset.cpus.isolated cpumask
#
- if [[ -z "$BOOT_ISOLCPUS" ]]
- then
- ISOLCPUS=$(cat $ISCPUS)
- else
- ISOLCPUS=$(cat $ISCPUS | sed -e "s/,*$BOOT_ISOLCPUS//")
- fi
[[ "$EXPECT_VAL2" != "$ISOLCPUS" ]] && {
# Take a 50ms pause and try again
pause 0.05
@@ -731,8 +735,6 @@ check_isolcpus()
fi
done
[[ "$ISOLCPUS" = *- ]] && ISOLCPUS=${ISOLCPUS}$LASTISOLCPU
- [[ -n "BOOT_ISOLCPUS" ]] &&
- ISOLCPUS=$(echo $ISOLCPUS | sed -e "s/,*$BOOT_ISOLCPUS//")
[[ "$EXPECT_VAL" = "$ISOLCPUS" ]]
}
@@ -836,8 +838,11 @@ run_state_test()
# if available
[[ -n "$ICPUS" ]] && {
check_isolcpus $ICPUS
- [[ $? -ne 0 ]] && test_fail $I "isolated CPU" \
- "Expect $ICPUS, get $ISOLCPUS instead"
+ [[ $? -ne 0 ]] && {
+ [[ -n "$BOOT_ISOLCPUS" ]] && ICPUS=${ICPUS},${BOOT_ISOLCPUS}
+ test_fail $I "isolated CPU" \
+ "Expect $ICPUS, get $ISOLCPUS instead"
+ }
}
reset_cgroup_states
#
--
2.47.1