This changes the selftest output is several ways:
- total test count is reported at start.
- each test's result is on a single line with "# SKIP" as needed per spec.
- each test's output is report _before_ the result line as a commented
"diagnostic" line.
This creates a bit of a kernel-specific TAP output where the diagnostics
precede the results. The TAP spec isn't entirely clear about this, though,
so I think it's the correct solution so as to not keep interactive
runs making sense. If the output _followed_ the result line in the
spec-suggested YAML form, each test would dump all of its output at once
instead of as it went, making debugging harder.
Also, while "sed -u" is used to add the "# " line prefixes, this
still doesn't work for all output. For example, the "timer" lists
print out text, then do the work, then print a result and a newline.
This isn't visible any more. And some tests still show nothing until
they finish. I haven't found a way to force the prefixing while keeping
the output entirely unbuffered. :(
Note that the shell construct needed to both get an exit code from
the first command in a pipe and still filter the pipe (to add the "# "
prefix) uses a POSIX solution rather than the bash "pipefail" option
which is not supported by dash.
Signed-off-by: Kees Cook <keescook(a)chromium.org>
---
tools/testing/selftests/lib.mk | 54 +++++++++++++++++++---------------
1 file changed, 31 insertions(+), 23 deletions(-)
diff --git a/tools/testing/selftests/lib.mk b/tools/testing/selftests/lib.mk
index 8b0f16409ed7..056dac8f5701 100644
--- a/tools/testing/selftests/lib.mk
+++ b/tools/testing/selftests/lib.mk
@@ -5,6 +5,7 @@ CC := $(CROSS_COMPILE)gcc
ifeq (0,$(MAKELEVEL))
OUTPUT := $(shell pwd)
endif
+srcdir = $(notdir $(shell pwd))
# The following are built by lib.mk common compile rules.
# TEST_CUSTOM_PROGS should be used by tests that require
@@ -32,38 +33,45 @@ endif
.ONESHELL:
define RUN_TEST_PRINT_RESULT
- TEST_HDR_MSG="selftests: "`basename $$PWD`:" $$BASENAME_TEST"; \
- echo $$TEST_HDR_MSG; \
- echo "========================================"; \
+ TEST_HDR_MSG="selftests: $(srcdir): $$BASENAME_TEST"; \
if [ ! -x $$TEST ]; then \
- echo "$$TEST_HDR_MSG: Warning: file $$BASENAME_TEST is not executable, correct this.";\
- echo "not ok 1..$$test_num $$TEST_HDR_MSG [FAIL]"; \
- else \
- cd `dirname $$TEST` > /dev/null; \
- if [ "X$(summary)" != "X" ]; then \
- (./$$BASENAME_TEST > /tmp/$$BASENAME_TEST 2>&1 && \
- echo "ok 1..$$test_num $$TEST_HDR_MSG [PASS]") || \
- (if [ $$? -eq $$skip ]; then \
- echo "not ok 1..$$test_num $$TEST_HDR_MSG [SKIP]"; \
- else echo "not ok 1..$$test_num $$TEST_HDR_MSG [FAIL]"; \
- fi;) \
- else \
- (./$$BASENAME_TEST && \
- echo "ok 1..$$test_num $$TEST_HDR_MSG [PASS]") || \
- (if [ $$? -eq $$skip ]; then \
- echo "not ok 1..$$test_num $$TEST_HDR_MSG [SKIP]"; \
- else echo "not ok 1..$$test_num $$TEST_HDR_MSG [FAIL]"; \
- fi;) \
- fi; \
- cd - > /dev/null; \
+ if [ "X$(summary)" = "X" ]; then \
+ echo "# warning: 'file $$TEST is not executable, correct this.'";\
+ fi; \
+ echo "not ok $$test_num $$TEST_HDR_MSG"; \
+ else \
+ cd `dirname $$TEST` > /dev/null; \
+ if [ "X$(summary)" != "X" ]; then \
+ (./$$BASENAME_TEST > /tmp/$$BASENAME_TEST 2>&1 &&\
+ echo "ok $$test_num $$TEST_HDR_MSG") || \
+ (if [ $$? -eq $$skip ]; then \
+ echo "not ok $$test_num $$TEST_HDR_MSG # SKIP"; \
+ else \
+ echo "not ok $$test_num $$TEST_HDR_MSG";\
+ fi); \
+ else \
+ echo "# $$TEST_HDR_MSG"; \
+ (((((./$$BASENAME_TEST 2>&1; echo $$? >&3) | \
+ sed -ue 's/^/# /' >&4) 3>&1) | \
+ (read xs; exit $$xs)) 4>&1 && \
+ echo "ok $$test_num $$TEST_HDR_MSG") || \
+ (if [ $$? -eq $$skip ]; then \
+ echo "not ok $$test_num $$TEST_HDR_MSG # SKIP"; \
+ else \
+ echo "not ok $$test_num $$TEST_HDR_MSG";\
+ fi); \
+ fi; \
+ cd - > /dev/null; \
fi;
endef
define RUN_TESTS
@export KSFT_TAP_LEVEL=`echo 1`; \
test_num=`echo 0`; \
+ total=`echo "$(1)" | wc -w`; \
skip=`echo 4`; \
echo "TAP version 13"; \
+ echo "1..$$total selftests: $(srcdir)"; \
for TEST in $(1); do \
BASENAME_TEST=`basename $$TEST`; \
test_num=`echo $$test_num+1 | bc`; \
--
2.17.1
--
Kees Cook
Extend bpf_skb_adjust_room growth to mark inner MAC header so that
L2 encapsulation can be used for tc tunnels.
Patch #1 extends the existing test_tc_tunnel to support UDP
encapsulation; later we want to be able to test MPLS over UDP and
MPLS over GRE encapsulation.
Patch #2 adds the BPF_F_ADJ_ROOM_ENCAP_L2(len) macro, which
allows specification of inner mac length. Other approaches were
explored prior to taking this approach. Specifically, I tried
automatically computing the inner mac length on the basis of the
specified flags (so inner maclen for GRE/IPv4 encap is the len_diff
specified to bpf_skb_adjust_room minus GRE + IPv4 header length
for example). Problem with this is that we don't know for sure
what form of GRE/UDP header we have; is it a full GRE header,
or is it a FOU UDP header or generic UDP encap header? My fear
here was we'd end up with an explosion of flags. The other approach
tried was to support inner L2 header marking as a separate room
adjustment, i.e. adjust for L3/L4 encap, then call
bpf_skb_adjust_room for L2 encap. This can be made to work but
because it imposed an order on operations, felt a bit clunky.
Patch #3 syncs tools/ bpf.h.
Patch #4 extends the tests again to support MPLSoverGRE,
MPLSoverUDP, and transparent ethernet bridging (TEB) where
the inner L2 header is an ethernet header. Testing of BPF
encap against tunnels is done for cases where configuration
of such tunnels is possible (MPLSoverGRE[6], MPLSoverUDP,
gre[6]tap), and skipped otherwise. Testing of BPF encap/decap
is always carried out.
Changes since v1:
- fixed formatting of commit references.
- BPF_F_ADJ_ROOM_FIXED_GSO flag enabled on all variants (patch 1)
- fixed fou6 options for UDP encap; checksum errors observed were
due to the fact fou6 tunnel was not set up with correct ipproto
options (41 -6). 0 checksums work fine (patch 1)
- added definitions for mask and shift used in setting L2 length
(patch 2)
- allow udp encap with fixed GSO (patch 2)
- changed "elen" to "l2_len" to be more descriptive (patch 4)
- added support for ethernet L2 encap to tests; testing against
gre[6]tap tunnels (patch 4).
Alan Maguire (4):
selftests_bpf: add UDP encap to test_tc_tunnel
bpf: add layer 2 encap support to bpf_skb_adjust_room
bpf: sync bpf.h to tools/ for BPF_F_ADJ_ROOM_ENCAP_L2
selftests_bpf: add L2 encap to test_tc_tunnel
include/uapi/linux/bpf.h | 10 +
net/core/filter.c | 15 +-
tools/include/uapi/linux/bpf.h | 10 +
tools/testing/selftests/bpf/progs/test_tc_tunnel.c | 335 +++++++++++++++++----
tools/testing/selftests/bpf/test_tc_tunnel.sh | 147 +++++++--
5 files changed, 427 insertions(+), 90 deletions(-)
--
1.8.3.1
On 4/3/19 7:34 PM, shuah wrote:
> Hi Linus,
>
> Please pull the following Kselftest update for Linux 5.1-rc4
>
> This Kselftest update for Linux 5.1-rc4 consists of fixes to rseq,
> cgroup, and efivarfs tests.
>
> diff is attached.
>
> thanks,
> -- Shuah
>
Hi Linus,
I just noticed that one commit in this pull request has mismatched
author and signed-off. Please disregard the pull request.
I will fix this and send a new pull request. Sorry for the noise.
thanks,
-- Shuah
[I suggest to stop waiting for more acks and merge this into linux-next as is.]
PTRACE_GET_SYSCALL_INFO is a generic ptrace API that lets ptracer obtain
details of the syscall the tracee is blocked in.
There are two reasons for a special syscall-related ptrace request.
Firstly, with the current ptrace API there are cases when ptracer cannot
retrieve necessary information about syscalls. Some examples include:
* The notorious int-0x80-from-64-bit-task issue. See [1] for details.
In short, if a 64-bit task performs a syscall through int 0x80, its tracer
has no reliable means to find out that the syscall was, in fact,
a compat syscall, and misidentifies it.
* Syscall-enter-stop and syscall-exit-stop look the same for the tracer.
Common practice is to keep track of the sequence of ptrace-stops in order
not to mix the two syscall-stops up. But it is not as simple as it looks;
for example, strace had a (just recently fixed) long-standing bug where
attaching strace to a tracee that is performing the execve system call
led to the tracer identifying the following syscall-exit-stop as
syscall-enter-stop, which messed up all the state tracking.
* Since the introduction of commit 84d77d3f06e7e8dea057d10e8ec77ad71f721be3
("ptrace: Don't allow accessing an undumpable mm"), both PTRACE_PEEKDATA
and process_vm_readv become unavailable when the process dumpable flag
is cleared. On such architectures as ia64 this results in all syscall
arguments being unavailable for the tracer.
Secondly, ptracers also have to support a lot of arch-specific code for
obtaining information about the tracee. For some architectures, this
requires a ptrace(PTRACE_PEEKUSER, ...) invocation for every syscall
argument and return value.
PTRACE_GET_SYSCALL_INFO returns the following structure:
struct ptrace_syscall_info {
__u8 op; /* PTRACE_SYSCALL_INFO_* */
__u32 arch __attribute__((__aligned__(sizeof(__u32))));
__u64 instruction_pointer;
__u64 stack_pointer;
union {
struct {
__u64 nr;
__u64 args[6];
} entry;
struct {
__s64 rval;
__u8 is_error;
} exit;
struct {
__u64 nr;
__u64 args[6];
__u32 ret_data;
} seccomp;
};
};
The structure was chosen according to [2], except for the following
changes:
* seccomp substructure was added as a superset of entry substructure;
* the type of nr field was changed from int to __u64 because syscall
numbers are, as a practical matter, 64 bits;
* stack_pointer field was added along with instruction_pointer field
since it is readily available and can save the tracer from extra
PTRACE_GETREGS/PTRACE_GETREGSET calls;
* arch is always initialized to aid with tracing system calls
* such as execve();
* instruction_pointer and stack_pointer are always initialized
so they could be easily obtained for non-syscall stops;
* a boolean is_error field was added along with rval field, this way
the tracer can more reliably distinguish a return value
from an error value.
strace has been ported to PTRACE_GET_SYSCALL_INFO.
Starting with release 4.26, strace uses PTRACE_GET_SYSCALL_INFO API
as the preferred mechanism of obtaining syscall information.
[1] https://lore.kernel.org/lkml/CA+55aFzcSVmdDj9Lh_gdbz1OzHyEm6ZrGPBDAJnywm2LF…
[2] https://lore.kernel.org/lkml/CAObL_7GM0n80N7J_DFw_eQyfLyzq+sf4y2AvsCCV88Tb3…
---
Notes:
v9:
* Rebased to linux-next again due to syscall_get_arguments() signature change.
v8:
* Moved syscall_get_arch() specific patches to a separate patchset
which is now merged into audit/next tree.
* Rebased to linux-next.
* Moved ptrace_get_syscall_info code under #ifdef CONFIG_HAVE_ARCH_TRACEHOOK,
narrowing down the set of architectures supported by this implementation
back to those 19 that enable CONFIG_HAVE_ARCH_TRACEHOOK because
I failed to get all syscall_get_*(), instruction_pointer(),
and user_stack_pointer() functions implemented on some niche
architectures. This leaves the following architectures out:
alpha, h8300, m68k, microblaze, and unicore32.
v7:
* Rebased to v5.0-rc1.
* 5 arch-specific preparatory patches out of 25 have been merged
into v5.0-rc1 via arch trees.
v6:
* Add syscall_get_arguments and syscall_set_arguments wrappers
to asm-generic/syscall.h, requested by Geert.
* Change PTRACE_GET_SYSCALL_INFO return code: do not take trailing paddings
into account, use the end of the last field of the structure being written.
* Change struct ptrace_syscall_info:
* remove .frame_pointer field, is is not needed and not portable;
* make .arch field explicitly aligned, remove no longer needed
padding before .arch field;
* remove trailing pads, they are no longer needed.
v5:
* Merge separate series and patches into the single series.
* Change PTRACE_EVENTMSG_SYSCALL_{ENTRY,EXIT} values as requested by Oleg.
* Change struct ptrace_syscall_info: generalize instruction_pointer,
stack_pointer, and frame_pointer fields by moving them from
ptrace_syscall_info.{entry,seccomp} substructures to ptrace_syscall_info
and initializing them for all stops.
* Add PTRACE_SYSCALL_INFO_NONE, set it when not in a syscall stop,
so e.g. "strace -i" could use PTRACE_SYSCALL_INFO_SECCOMP to obtain
instruction_pointer when the tracee is in a signal stop.
* Patch all remaining architectures to provide all necessary
syscall_get_* functions.
* Make available for all architectures: do not conditionalize on
CONFIG_HAVE_ARCH_TRACEHOOK since all syscall_get_* functions
are implemented on all architectures.
* Add a test for PTRACE_GET_SYSCALL_INFO to selftests/ptrace.
v4:
* Do not introduce task_struct.ptrace_event,
use child->last_siginfo->si_code instead.
* Implement PTRACE_SYSCALL_INFO_SECCOMP and ptrace_syscall_info.seccomp
support along with PTRACE_SYSCALL_INFO_{ENTRY,EXIT} and
ptrace_syscall_info.{entry,exit}.
v3:
* Change struct ptrace_syscall_info.
* Support PTRACE_EVENT_SECCOMP by adding ptrace_event to task_struct.
* Add proper defines for ptrace_syscall_info.op values.
* Rename PT_SYSCALL_IS_ENTERING and PT_SYSCALL_IS_EXITING to
PTRACE_EVENTMSG_SYSCALL_ENTRY and PTRACE_EVENTMSG_SYSCALL_EXIT
* and move them to uapi.
v2:
* Do not use task->ptrace.
* Replace entry_info.is_compat with entry_info.arch, use syscall_get_arch().
* Use addr argument of sys_ptrace to get expected size of the struct;
return full size of the struct.
Dmitry V. Levin (6):
nds32: fix asm/syscall.h # waiting for ack since early January
hexagon: define syscall_get_error() and syscall_get_return_value() # waiting for ack since November
mips: define syscall_get_error() # acked
parisc: define syscall_get_error() # acked
powerpc: define syscall_get_error() # waiting for ack since early December
selftests/ptrace: add a test case for PTRACE_GET_SYSCALL_INFO
Elvira Khabirova (1):
ptrace: add PTRACE_GET_SYSCALL_INFO request # reviewed
arch/hexagon/include/asm/syscall.h | 14 +
arch/mips/include/asm/syscall.h | 6 +
arch/nds32/include/asm/syscall.h | 27 +-
arch/parisc/include/asm/syscall.h | 7 +
arch/powerpc/include/asm/syscall.h | 10 +
include/linux/tracehook.h | 9 +-
include/uapi/linux/ptrace.h | 35 +++
kernel/ptrace.c | 103 ++++++-
tools/testing/selftests/ptrace/.gitignore | 1 +
tools/testing/selftests/ptrace/Makefile | 2 +-
.../selftests/ptrace/get_syscall_info.c | 271 ++++++++++++++++++
11 files changed, 470 insertions(+), 15 deletions(-)
create mode 100644 tools/testing/selftests/ptrace/get_syscall_info.c
--
ldv
Hi,
I'm working on writing a LKDTM selftest, and I noticed that the output
format for the existing selftest tests don't seem to actually conform
to the TAP 13 specification[1]. Namely, there is a ton of output
between the "ok" and "not ok" lines, in various places for various
reasons. IIUC, the spec says these lines should follow the "ok" lines
and be (at least) indented to be parsed as YAML.
How does the existing CI do the selftest output parsing? Is there some
other spec I should have read for the correct interpretation here?
(TAP 13 says this part isn't exactly agreed on yet.)
e.g., I see this right now:
# cd tools/testing/selftests
# make --silent -C lib run_tests
TAP version 13
selftests: lib: printf.sh
========================================
printf: module test_printf is not found [SKIP]
not ok 1..1 selftests: lib: printf.sh [SKIP]
selftests: lib: bitmap.sh
========================================
bitmap: module test_bitmap is not found [SKIP]
not ok 1..2 selftests: lib: bitmap.sh [SKIP]
selftests: lib: prime_numbers.sh
========================================
prime_numbers: module prime_numbers is not found [SKIP]
not ok 1..3 selftests: lib: prime_numbers.sh [SKIP]
There is no "plan" line, lots of lines between test lines, and the
test numbering is non-spec. I would have expected the following
output:
TAP version 13
1..3 lib
not ok 1 lib: printf.sh # SKIP module test_printf is not found
not ok 2 lib: bitmap.sh # SKIP module test_bitmap is not found
not ok 3 lib: prime_numbers.sh # SKIP module prime_numbers is not found
And for output, here's another:
# make --silent -C ipc run_tests
TAP version 13
selftests: ipc: msgque
========================================
Pass 0 Fail 0 Xfail 0 Xpass 0 Skip 0 Error 0
1..0
ok 1..1 selftests: ipc: msgque [PASS]
The test-specific output should be after the "ok" line:
TAP version 13
1..1 ipc
ok 1 ipc: msgque
---
output: |+
Pass 0 Fail 0 Xfail 0 Xpass 0 Skip 0 Error 0
...
The test runner knows how many tests are going to be run, so the
"plan" line can be correct. The ok/not-ok lines can be fixed to avoid
the prefixed "1..", and the output during the tests can be captured
and indented for valid YAML (above uses "output" to capture the text
output with a literal block quote style of "|+"[2]).
It seems like the current output cannot be parsed by things like
Test::Harness. The spec is pretty specific about "Any output that is
not a version, a plan, a test line, a YAML block, a diagnostic or a
bail out is incorrect."
Has this been discussed before? I spent a little time looking but
couldn't find an earlier thread...
Thanks!
-Kees
[1] https://testanything.org/tap-version-13-specification.html
[2] https://yaml-multiline.info
--
Kees Cook
Hi,
this is a draft trying to define some API in order to remove some
redundancy from kselftest shell scripts. Existing kselftest.h already
defines some sort of API for C, there is none for shell.
It's just a small example how things could be. Draft, not meant to be
really merged. But instead of defining shell library (with more useful
helpers), I'd rather adopt LTP shell [1] and C [2] API to kselftest.
LTP API [1] is more like a framework, easy to use with a lot of helpers
making tests 1) small, concentrating on the problem itself 2) have
unique output. API is well documented [3] [4], it's creator Cyril Hrubis
made it after years experience of handling (at the time) quite bad
quality LTP code. Rewriting LTP tests to use this API improved tests a
lot (less buggy, easier to read).
Some examples of advantages of LTP API:
* SAFE_*() macros for C, which handles errors inside a library
* unified messages, unified test status, unified way to exit testing due
missing functionality, at the end of testing there is summary of passed,
failed and skipped tests
* many prepared functionality for both C and shell
* handling threads, parent-child synchronization
* setup and cleanup functions
* "flags" for defining requirements or certain functionality (need root, temporary
directory, ...)
* and many other
kselftest and LTP has a bit different goals and approach. Probably
not all of LTP API is needed atm, but I guess it's at least worth of
thinking to adopt it.
There are of course other options: reinvent a wheel or left kselftest
code in a state it is now (code quality varies, some of the code is
really messy, buggy, not even compile).
[1] https://github.com/linux-test-project/ltp/blob/master/testcases/lib/tst_tes…
[2] https://github.com/linux-test-project/ltp/tree/master/lib
[3] https://github.com/linux-test-project/ltp/wiki/Test-Writing-Guidelines#22-w…
[4] https://github.com/linux-test-project/ltp/wiki/Test-Writing-Guidelines#23-w…
Petr Vorel (2):
selftests: Start shell API
selftest/kexec: Use kselftest shell API
.../selftests/kexec/kexec_common_lib.sh | 74 +++++--------------
.../selftests/kexec/test_kexec_file_load.sh | 53 ++++++-------
.../selftests/kexec/test_kexec_load.sh | 20 ++---
tools/testing/selftests/kselftest.sh | 53 +++++++++++++
4 files changed, 105 insertions(+), 95 deletions(-)
mode change 100755 => 100644 tools/testing/selftests/kexec/kexec_common_lib.sh
create mode 100644 tools/testing/selftests/kselftest.sh
--
2.20.1
Summary
------------------------------------------------------------------------
git repo: not informed
git branch: not informed
git commit: not informed
git describe: not informed
Test details: https://qa-reports.linaro.org/lkft/linux-next-oe/build/next-20190408
Regressions (compared to build next-20190405)
------------------------------------------------------------------------
No regressions
Fixes (compared to build next-20190405)
------------------------------------------------------------------------
No fixes
In total:
------------------------------------------------------------------------
Ran 0 total tests in the following environments and test suites.
pass 0
fail 0
xfail 0
skip 0
Environments
--------------
- x86_64
Test Suites
-----------
Failures
------------------------------------------------------------------------
x86:
Skips
------------------------------------------------------------------------
No skips
--
Linaro LKFT
https://lkft.linaro.org
The kernel may be configured or an IMA policy specified on the boot
command line requiring the kexec kernel image signature to be verified.
At runtime a custom IMA policy may be loaded, replacing the policy
specified on the boot command line. In addition, the arch specific
policy rules are dynamically defined based on the secure boot mode that
may require the kernel image signature to be verified.
The kernel image may have a PE signature, an IMA signature, or both. In
addition, there are two kexec syscalls - kexec_load and kexec_file_load
- but only the kexec_file_load syscall can verify signatures.
These kexec selftests verify that only properly signed kernel images are
loaded as required, based on the kernel config, the secure boot mode,
and the IMA runtime policy.
Loading a kernel image requires root privileges. To run just the KEXEC
selftests: sudo make TARGETS=kexec kselftest
Changelog v5:
- Make tests independent of IMA being enabled, folding the changes
into the kexec_file_load test.
- Add support for CONFIG_KEXEC_VERIFY_SIG being enabled, but not
CONFIG_KEXEC_BZIMAGE_VERIFY_SIG.
Changelog v4:
- Moved the kexec tests to selftests/kexec, as requested by Dave Young.
- Removed the kernel module selftest from this patch set.
- Rewritten cover letter, removing reference to kernel modules.
Changelog v3:
- Updated tests based on Petr's review, including the defining a common
test to check for root privileges.
- Modified config, removing the CONFIG_KEXEC_VERIFY_SIG requirement.
- Updated the SPDX license to GPL-2.0 based on Shuah's review.
- Updated the secureboot mode test to check the SetupMode as well, based
on David Young's review.
Mimi Zohar (8):
selftests/kexec: move the IMA kexec_load selftest to selftests/kexec
selftests/kexec: cleanup the kexec selftest
selftests/kexec: define a set of common functions
selftests/kexec: define common logging functions
kselftest/kexec: define "require_root_privileges"
selftests/kexec: kexec_file_load syscall test
selftests/kexec: check kexec_load and kexec_file_load are enabled
selftests/kexec: make kexec_load test independent of IMA being enabled
Petr Vorel (1):
selftests/kexec: Add missing '=y' to config options
tools/testing/selftests/Makefile | 2 +-
tools/testing/selftests/ima/Makefile | 11 --
tools/testing/selftests/ima/config | 4 -
tools/testing/selftests/ima/test_kexec_load.sh | 54 ------
tools/testing/selftests/kexec/Makefile | 12 ++
tools/testing/selftests/kexec/config | 3 +
tools/testing/selftests/kexec/kexec_common_lib.sh | 175 +++++++++++++++++
.../selftests/kexec/test_kexec_file_load.sh | 208 +++++++++++++++++++++
tools/testing/selftests/kexec/test_kexec_load.sh | 47 +++++
9 files changed, 446 insertions(+), 70 deletions(-)
delete mode 100644 tools/testing/selftests/ima/Makefile
delete mode 100644 tools/testing/selftests/ima/config
delete mode 100755 tools/testing/selftests/ima/test_kexec_load.sh
create mode 100644 tools/testing/selftests/kexec/Makefile
create mode 100644 tools/testing/selftests/kexec/config
create mode 100755 tools/testing/selftests/kexec/kexec_common_lib.sh
create mode 100755 tools/testing/selftests/kexec/test_kexec_file_load.sh
create mode 100755 tools/testing/selftests/kexec/test_kexec_load.sh
--
2.7.5
If the cgroup destruction races with an exit() of a belonging
process(es), cg_kill_all() may fail. It's not a good reason to make
cg_destroy() fail and leave the cgroup in place, potentially causing
next test runs to fail.
Signed-off-by: Roman Gushchin <guro(a)fb.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Tejun Heo <tj(a)kernel.org>
Cc: kernel-team(a)fb.com
Cc: linux-kselftest(a)vger.kernel.org
---
tools/testing/selftests/cgroup/cgroup_util.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/tools/testing/selftests/cgroup/cgroup_util.c b/tools/testing/selftests/cgroup/cgroup_util.c
index 14c9fe284806..eba06f94433b 100644
--- a/tools/testing/selftests/cgroup/cgroup_util.c
+++ b/tools/testing/selftests/cgroup/cgroup_util.c
@@ -227,9 +227,7 @@ int cg_destroy(const char *cgroup)
retry:
ret = rmdir(cgroup);
if (ret && errno == EBUSY) {
- ret = cg_killall(cgroup);
- if (ret)
- return ret;
+ cg_killall(cgroup);
usleep(100);
goto retry;
}
--
2.20.1
A test for the basic NAT functionality uses ip command which
needs veth device.There is a condition where the kernel support
for veth is not compiled into the kernel and the test script
breaks.This patch contains code for reasonable error display
and correct code exit.
Signed-off-by: Jeffrin Jose T <jeffrin(a)rajagiritech.edu.in>
---
tools/testing/selftests/netfilter/nft_nat.sh | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/netfilter/nft_nat.sh b/tools/testing/selftests/netfilter/nft_nat.sh
index 8ec76681605c..f25f72a75cf3 100755
--- a/tools/testing/selftests/netfilter/nft_nat.sh
+++ b/tools/testing/selftests/netfilter/nft_nat.sh
@@ -23,7 +23,11 @@ ip netns add ns0
ip netns add ns1
ip netns add ns2
-ip link add veth0 netns ns0 type veth peer name eth0 netns ns1
+ip link add veth0 netns ns0 type veth peer name eth0 netns ns1 > /dev/null 2>&1
+if [ $? -ne 0 ];then
+ echo "SKIP: No virtual ethernet pair device support in kernel"
+ exit $ksft_skip
+fi
ip link add veth1 netns ns0 type veth peer name eth0 netns ns2
ip -net ns0 link set lo up
--
2.20.1
Hi,
strscpy_pad() patch set now with added test shenanigans.
This version adds 5 initial patches to the set and splits the single
patch from v2 into two separate patches (6 and 7).
While doing the testing for strscpy_pad() it was noticed that there is
duplication in how test modules are being fed to kselftest and also in
the test modules themselves.
This set makes an attempt at adding a framework to kselftest for writing
kernel test modules. It also adds a script for use in creating script
test runners for kselftest. My macro-foo is not great, all criticism
and suggestions very much appreciated. The design is based on test
modules lib/test_printf.c, lib/test_bitmap.c, lib/test_xarray.c.
Shua, I'm by no means a kselftest expert, if this approach does not fit
in with your general direction please say so.
Kees, I put the strscpy_pad() addition patch separate so if this goes in
through Shua's tree (and if it goes in at all) its a single patch to
grab if we want to start playing around with strscpy_pad().
Patch 1 fixes module unload for lib/test_printf in preparation for the
rest of the series.
Patch 2 Adds a shell script that can be used to create shell script test
runners.
Patch 3 Converts current shell script runners in
tools/testing/selftests/lib/ to use the script introduced in
patch 2.
Patch 4 Adds the test framework by way of a header file (inc. documentation)
Patch 5 Converts a couple of current test modules to make some use of
the newly added test framework.
Patch 6 Adds strscpy_pad()
Patch 7 Adds test module for strscpy_pad() using the new framework and script.
If you are a testing geek and you would like to play with this; if you
are already running a kernel built recently from your tree you may
want to just apply the first 5 patches then you don't need to build/boot
a new kernel, just config and build the lib/ test modules (test_printf
etc.) and then:
sudo make TARGETS=lib kselftest
Late in the development of this I found that a bunch of boiler plate had
to be added to the script to handle running tests with:
make O=/path/to/kout kselftest
The reason is that during the build we are in the output directory but
the script is in the source directory. I get the feeling that a better
understanding of how the kernel build process works would provide a
better solution to this. The current solution is disappointing since
removing duplication and boiler plate was the point of the whole
exercise. I'd love a better way to solve this?
One final interesting note: there are 36 test modules in lib/ only 3 of
them are run by kselftest from tools/testing/selfests/lib?
Thanks for looking at this,
Tobin.
Tobin C. Harding (7):
lib/test_printf: Add empty module_exit function
kselftest: Add test runner creation script
kselftest/lib: Use new shell runner to define tests
kselftest: Add test module framework header
lib: Use new kselftest header
lib/string: Add strscpy_pad() function
lib: Add test module for strscpy_pad
Documentation/dev-tools/kselftest.rst | 108 ++++++++++++-
include/linux/string.h | 4 +
lib/Kconfig.debug | 3 +
lib/Makefile | 1 +
lib/string.c | 47 +++++-
lib/test_bitmap.c | 20 +--
lib/test_printf.c | 17 +--
lib/test_strscpy.c | 150 +++++++++++++++++++
tools/testing/selftests/kselftest_module.h | 48 ++++++
tools/testing/selftests/kselftest_module.sh | 75 ++++++++++
tools/testing/selftests/lib/Makefile | 2 +-
tools/testing/selftests/lib/bitmap.sh | 25 ++--
tools/testing/selftests/lib/config | 1 +
tools/testing/selftests/lib/prime_numbers.sh | 23 ++-
tools/testing/selftests/lib/printf.sh | 25 ++--
tools/testing/selftests/lib/strscpy.sh | 17 +++
16 files changed, 490 insertions(+), 76 deletions(-)
create mode 100644 lib/test_strscpy.c
create mode 100644 tools/testing/selftests/kselftest_module.h
create mode 100755 tools/testing/selftests/kselftest_module.sh
create mode 100755 tools/testing/selftests/lib/strscpy.sh
--
2.20.1
Introduce in-kernel headers and other artifacts which are made available
as an archive through proc (/proc/kheaders.txz file). This archive makes
it possible to build kernel modules, run eBPF programs, and other
tracing programs that need to extend the kernel for tracing purposes
without any dependency on the file system having headers and build
artifacts.
On Android and embedded systems, it is common to switch kernels but not
have kernel headers available on the file system. Raw kernel headers
also cannot be copied into the filesystem like they can be on other
distros, due to licensing and other issues. There's no linux-headers
package on Android. Further once a different kernel is booted, any
headers stored on the file system will no longer be useful. By storing
the headers as a compressed archive within the kernel, we can avoid these
issues that have been a hindrance for a long time.
The feature is also buildable as a module just in case the user desires
it not being part of the kernel image. This makes it possible to load
and unload the headers on demand. A tracing program, or a kernel module
builder can load the module, do its operations, and then unload the
module to save kernel memory. The total memory needed is 3.8MB.
The code to read the headers is based on /proc/config.gz code and uses
the same technique to embed the headers.
To build a module, the below steps have been tested on an x86 machine:
modprobe kheaders
rm -rf $HOME/headers
mkdir -p $HOME/headers
tar -xvf /proc/kheaders.txz -C $HOME/headers >/dev/null
cd my-kernel-module
make -C $HOME/headers M=$(pwd) modules
rmmod kheaders
Signed-off-by: Joel Fernandes (Google) <joel(a)joelfernandes.org>
---
Changes since v1:
- removed IKH_EXTRA variable, not needed (Masahiro Yamada)
- small fix ups to selftest
- added target to main Makefile etc
- added MODULE_LICENSE to test module
- made selftest more quiet
Changes since RFC:
Both changes bring size down to 3.8MB:
- use xz for compression
- strip comments except SPDX lines
- Call out the module name in Kconfig
- Also added selftests in second patch to ensure headers are always
working.
Documentation/dontdiff | 1 +
init/Kconfig | 11 ++++++
kernel/.gitignore | 2 ++
kernel/Makefile | 27 ++++++++++++++
kernel/kheaders.c | 74 +++++++++++++++++++++++++++++++++++++++
scripts/gen_ikh_data.sh | 19 ++++++++++
scripts/strip-comments.pl | 8 +++++
7 files changed, 142 insertions(+)
create mode 100644 kernel/kheaders.c
create mode 100755 scripts/gen_ikh_data.sh
create mode 100755 scripts/strip-comments.pl
diff --git a/Documentation/dontdiff b/Documentation/dontdiff
index 2228fcc8e29f..05a2319ee2a2 100644
--- a/Documentation/dontdiff
+++ b/Documentation/dontdiff
@@ -151,6 +151,7 @@ int8.c
kallsyms
kconfig
keywords.c
+kheaders_data.h*
ksym.c*
ksym.h*
kxgettext
diff --git a/init/Kconfig b/init/Kconfig
index c9386a365eea..9fbf4f73d98c 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -563,6 +563,17 @@ config IKCONFIG_PROC
This option enables access to the kernel configuration file
through /proc/config.gz.
+config IKHEADERS_PROC
+ tristate "Enable kernel header artifacts through /proc/kheaders.txz"
+ select BUILD_BIN2C
+ depends on PROC_FS
+ help
+ This option enables access to the kernel header and other artifacts that
+ are generated during the build process. These can be used to build kernel
+ modules, and other in-kernel programs such as those generated by eBPF
+ and systemtap tools. If you build the headers as a module, a module
+ called kheaders.ko is built which can be loaded to get access to them.
+
config LOG_BUF_SHIFT
int "Kernel log buffer size (16 => 64KB, 17 => 128KB)"
range 12 25
diff --git a/kernel/.gitignore b/kernel/.gitignore
index b3097bde4e9c..6acf71acbdcb 100644
--- a/kernel/.gitignore
+++ b/kernel/.gitignore
@@ -3,5 +3,7 @@
#
config_data.h
config_data.gz
+kheaders_data.h
+kheaders_data.txz
timeconst.h
hz.bc
diff --git a/kernel/Makefile b/kernel/Makefile
index 6aa7543bcdb2..1d13a7a6c537 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -70,6 +70,7 @@ obj-$(CONFIG_UTS_NS) += utsname.o
obj-$(CONFIG_USER_NS) += user_namespace.o
obj-$(CONFIG_PID_NS) += pid_namespace.o
obj-$(CONFIG_IKCONFIG) += configs.o
+obj-$(CONFIG_IKHEADERS_PROC) += kheaders.o
obj-$(CONFIG_SMP) += stop_machine.o
obj-$(CONFIG_KPROBES_SANITY_TEST) += test_kprobes.o
obj-$(CONFIG_AUDIT) += audit.o auditfilter.o
@@ -130,3 +131,29 @@ filechk_ikconfiggz = \
targets += config_data.h
$(obj)/config_data.h: $(obj)/config_data.gz FORCE
$(call filechk,ikconfiggz)
+
+# Build a list of in-kernel headers for building kernel modules
+ikh_file_list := include/
+ikh_file_list += arch/$(ARCH)/Makefile
+ikh_file_list += arch/$(ARCH)/include/
+ikh_file_list += scripts/
+ikh_file_list += Makefile
+ikh_file_list += Module.symvers
+ifeq ($(CONFIG_STACK_VALIDATION), y)
+ikh_file_list += $(objtree)/tools/objtool/objtool
+endif
+
+$(obj)/kheaders.o: $(obj)/kheaders_data.h
+
+targets += kheaders_data.txz
+
+quiet_cmd_genikh = GEN $(obj)/kheaders_data.txz
+cmd_genikh = $(srctree)/scripts/gen_ikh_data.sh $@ $^ >/dev/null 2>&1
+$(obj)/kheaders_data.txz: $(ikh_file_list) FORCE
+ $(call cmd,genikh)
+
+filechk_ikheadersxz = (echo "static const char kernel_headers_data[] __used = KH_MAGIC_START"; cat $< | scripts/bin2c; echo "KH_MAGIC_END;")
+
+targets += kheaders_data.h
+$(obj)/kheaders_data.h: $(obj)/kheaders_data.txz FORCE
+ $(call filechk,ikheadersxz)
diff --git a/kernel/kheaders.c b/kernel/kheaders.c
new file mode 100644
index 000000000000..c39930f51202
--- /dev/null
+++ b/kernel/kheaders.c
@@ -0,0 +1,74 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * kernel/kheaders.c
+ * Provide headers and artifacts needed to build kernel modules.
+ * (Borrowed code from kernel/configs.c)
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/proc_fs.h>
+#include <linux/seq_file.h>
+#include <linux/init.h>
+#include <linux/uaccess.h>
+
+/*
+ * Define kernel_headers_data and kernel_headers_data_size, which contains the
+ * compressed kernel headers. The file is first compressed with xz and then
+ * bounded by two eight byte magic numbers to allow extraction from a binary
+ * kernel image:
+ *
+ * IKHD_ST
+ * <image>
+ * IKHD_ED
+ */
+#define KH_MAGIC_START "IKHD_ST"
+#define KH_MAGIC_END "IKHD_ED"
+#include "kheaders_data.h"
+
+
+#define KH_MAGIC_SIZE (sizeof(KH_MAGIC_START) - 1)
+#define kernel_headers_data_size \
+ (sizeof(kernel_headers_data) - 1 - KH_MAGIC_SIZE * 2)
+
+static ssize_t
+ikheaders_read_current(struct file *file, char __user *buf,
+ size_t len, loff_t *offset)
+{
+ return simple_read_from_buffer(buf, len, offset,
+ kernel_headers_data + KH_MAGIC_SIZE,
+ kernel_headers_data_size);
+}
+
+static const struct file_operations ikheaders_file_ops = {
+ .owner = THIS_MODULE,
+ .read = ikheaders_read_current,
+ .llseek = default_llseek,
+};
+
+static int __init ikheaders_init(void)
+{
+ struct proc_dir_entry *entry;
+
+ /* create the current headers file */
+ entry = proc_create("kheaders.txz", S_IFREG | S_IRUGO, NULL,
+ &ikheaders_file_ops);
+ if (!entry)
+ return -ENOMEM;
+
+ proc_set_size(entry, kernel_headers_data_size);
+
+ return 0;
+}
+
+static void __exit ikheaders_cleanup(void)
+{
+ remove_proc_entry("kheaders.txz", NULL);
+}
+
+module_init(ikheaders_init);
+module_exit(ikheaders_cleanup);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Joel Fernandes");
+MODULE_DESCRIPTION("Echo the kernel header artifacts used to build the kernel");
diff --git a/scripts/gen_ikh_data.sh b/scripts/gen_ikh_data.sh
new file mode 100755
index 000000000000..609196b5cea2
--- /dev/null
+++ b/scripts/gen_ikh_data.sh
@@ -0,0 +1,19 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+spath="$(dirname "$(readlink -f "$0")")"
+
+rm -rf $1.tmp
+mkdir $1.tmp
+
+for f in "${@:2}";
+ do find "$f" ! -name "*.c" ! -name "*.o" ! -name "*.cmd" ! -name ".*";
+done | cpio -pd $1.tmp
+
+for f in $(find $1.tmp); do
+ $spath/strip-comments.pl $f
+done
+
+tar -Jcf $1 -C $1.tmp/ . > /dev/null
+
+rm -rf $1.tmp
diff --git a/scripts/strip-comments.pl b/scripts/strip-comments.pl
new file mode 100755
index 000000000000..f8ada87c5802
--- /dev/null
+++ b/scripts/strip-comments.pl
@@ -0,0 +1,8 @@
+#!/usr/bin/perl -pi
+# SPDX-License-Identifier: GPL-2.0
+
+# This script removes /**/ comments from a file, unless such comments
+# contain "SPDX". It is used when building compressed in-kernel headers.
+
+BEGIN {undef $/;}
+s/\/\*((?!SPDX).)*?\*\///smg;
--
2.20.1.791.gb4d0f1c61a-goog
Add a new test for Media Device Allocator API.
Media Device Allocator API to allows multiple drivers share a media device.
This API solves a very common use-case for media devices where one physical
device (an USB stick) provides both audio and video. When such media device
exposes a standard USB Audio class, a proprietary Video class, two or more
independent drivers will share a single physical USB bridge. In such cases,
it is necessary to coordinate access to the shared resource.
Using this API, drivers can allocate a media device with the shared struct
device as the key. Once the media device is allocated by a driver, other
drivers can get a reference to it. The media device is released when all
the references are released.
This test does a series of unbind/bind tests to make sure media device
is released correctly when it is no longer is use and when the last
driver releases the reference.
Signed-off-by: Shuah Khan <shuah(a)kernel.org>
---
.../media_tests/media_dev_allocator.sh | 85 +++++++++++++++++++
1 file changed, 85 insertions(+)
create mode 100755 tools/testing/selftests/media_tests/media_dev_allocator.sh
diff --git a/tools/testing/selftests/media_tests/media_dev_allocator.sh b/tools/testing/selftests/media_tests/media_dev_allocator.sh
new file mode 100755
index 000000000000..ffe00c59a483
--- /dev/null
+++ b/tools/testing/selftests/media_tests/media_dev_allocator.sh
@@ -0,0 +1,85 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Media Device Allocator API test script
+# Copyright (c) 2019 Shuah Khan <shuah(a)kernel.org>
+
+echo "Media Device Allocator testing: unbind and bind"
+echo "media driver $1 audio driver $2"
+
+MDRIVER=/sys/bus/usb/drivers/$1
+cd $MDRIVER
+MDEV=$(ls -d *\-*)
+
+ADRIVER=/sys/bus/usb/drivers/$2
+cd $ADRIVER
+ADEV=$(ls -d *\-*.1)
+
+echo "=================================="
+echo "Test unbind both devices - start"
+echo "Running unbind of $MDEV from $MDRIVER"
+echo $MDEV > $MDRIVER/unbind;
+
+echo "Media device should still be present!"
+ls -l /dev/media*
+
+echo "sound driver is at: $ADRIVER"
+echo "Device is: $ADEV"
+
+echo "Running unbind of $ADEV from $ADRIVER"
+echo $ADEV > $ADRIVER/unbind;
+
+echo "Media device should have been deleted!"
+ls -l /dev/media*
+echo "Test unbind both devices - end"
+
+echo "=================================="
+
+echo "Test bind both devices - start"
+echo "Running bind of $MDEV from $MDRIVER"
+echo $MDEV > $MDRIVER/bind;
+
+echo "Media device should be present!"
+ls -l /dev/media*
+
+echo "Running bind of $ADEV from $ADRIVER"
+echo $ADEV > $ADRIVER/bind;
+
+echo "Media device should be there!"
+ls -l /dev/media*
+
+echo "Test bind both devices - end"
+
+echo "=================================="
+
+echo "Test unbind $MDEV - bind $MDEV - unbind $ADEV - bind $ADEV start"
+
+echo "Running unbind of $MDEV from $MDRIVER"
+echo $MDEV > $MDRIVER/unbind;
+
+echo "Media device should be there!"
+ls -l /dev/media*
+
+sleep 1
+
+echo "Running bind of $MDEV from $MDRIVER"
+echo $MDEV > $MDRIVER/bind;
+
+echo "Media device should be there!"
+ls -l /dev/media*
+
+echo "Running unbind of $ADEV from $ADRIVER"
+echo $ADEV > $ADRIVER/unbind;
+
+echo "Media device should be there!"
+ls -l /dev/media*
+
+sleep 1
+
+echo "Running bind of $ADEV from $ADRIVER"
+echo $ADEV > $ADRIVER/bind;
+
+echo "Media device should be there!"
+ls -l /dev/media*
+
+echo "Test unbind $MDEV - bind $MDEV - unbind $ADEV - bind $ADEV end"
+echo "=================================="
--
2.17.1
hello
i think the script nft_nat.sh is assuming devices eth0 and eth1
which may not be the case always. my suggestion is why not give the needed
network devices as arguments to the script. iam showing related
command line sessions below and error related file is attached.
---------------------------x-------------x----------------------------
$ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast
state DOWN mode DEFAULT group default qlen 1000
link/ether 70:5a:0f:b9:d8:5c brd ff:ff:ff:ff:ff:ff
3: wlp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state
UP mode DORMANT group default qlen 1000
link/ether 68:14:01:07:36:1f brd ff:ff:ff:ff:ff:ff
$
------------------------x-----------x---------------------------------------
$sudo ./nft_nat.sh 2> error-related.txt
ERROR: ping failed
SKIP: Could not add add ip6 dnat hook
ERROR: canot ping ns1 from ns2
ERROR: cannot ping ns1 from ns2 with active ip masquerading
ERROR: cannot ping ns1 from ns2 via ipv6
ERROR: cannot ping ns1 from ns2
ERROR: cannot ping ns1 from ns2 with active ip redirect
ERROR: cannnot ping ns1 from ns2 via ipv6
ERROR: cannot ping ns1 from ns2 with active ip6 redirect
-------------------------x---------------------------x------------------------------------
a file is attached which shows the contents of error-related.txt
/Jeffrin
--
software engineer
rajagiri school of engineering and technology
Extend bpf_skb_adjust_room growth to mark inner MAC header so that
L2 encapsulation can be used for tc tunnels.
Patch #1 extends the existing test_tc_tunnel to support UDP
encapsulation; later we want to be able to test MPLS over UDP and
MPLS over GRE encapsulation.
Patch #2 adds the BPF_F_ADJ_ROOM_ENCAP_L2(len) macro, which
allows specification of inner mac length. Other approaches were
explored prior to taking this approach. Specifically, I tried
automatically computing the inner mac length on the basis of the
specified flags (so inner maclen for GRE/IPv4 encap is the len_diff
specified to bpf_skb_adjust_room minus GRE + IPv4 header length
for example). Problem with this is that we don't know for sure
what form of GRE/UDP header we have; is it a full GRE header,
or is it a FOU UDP header or generic UDP encap header? My fear
here was we'd end up with an explosion of flags. The other approach
tried was to support inner L2 header marking as a separate room
adjustment, i.e. adjust for L3/L4 encap, then call
bpf_skb_adjust_room for L2 encap. This can be made to work but
because it imposed an order on operations, felt a bit clunky.
Patch #3 syncs tools/ bpf.h.
Patch #4 extends the tests again to support MPLSoverGRE and
MPLSoverUDP encap, along with existing test coverage.
Alan Maguire (4):
selftests_bpf: extend test_tc_tunnel for UDP encap
bpf: add layer 2 encap support to bpf_skb_adjust_room
bpf: sync bpf.h to tools/ for BPF_F_ADJ_ROOM_ENCAP_L2
selftests_bpf: extend test_tc_tunnel.sh test for L2 encap
include/uapi/linux/bpf.h | 5 +
net/core/filter.c | 19 +-
tools/include/uapi/linux/bpf.h | 5 +
tools/testing/selftests/bpf/progs/test_tc_tunnel.c | 281 ++++++++++++++++-----
tools/testing/selftests/bpf/test_tc_tunnel.sh | 105 +++++---
5 files changed, 318 insertions(+), 97 deletions(-)
--
1.8.3.1
Add a new test for Media Device Allocator API.
Media Device Allocator API to allows multiple drivers share a media device.
This API solves a very common use-case for media devices where one physical
device (an USB stick) provides both audio and video. When such media device
exposes a standard USB Audio class, a proprietary Video class, two or more
independent drivers will share a single physical USB bridge. In such cases,
it is necessary to coordinate access to the shared resource.
Using this API, drivers can allocate a media device with the shared struct
device as the key. Once the media device is allocated by a driver, other
drivers can get a reference to it. The media device is released when all
the references are released.
This test does a series of unbind/bind tests to make sure media device
is released correctly when it is no longer is use and when the last
driver releases the reference.
Signed-off-by: Shuah Khan <shuah(a)kernel.org>
---
.../media_tests/media_dev_allocator.sh | 85 +++++++++++++++++++
1 file changed, 85 insertions(+)
create mode 100755 tools/testing/selftests/media_tests/media_dev_allocator.sh
diff --git a/tools/testing/selftests/media_tests/media_dev_allocator.sh b/tools/testing/selftests/media_tests/media_dev_allocator.sh
new file mode 100755
index 000000000000..ffe00c59a483
--- /dev/null
+++ b/tools/testing/selftests/media_tests/media_dev_allocator.sh
@@ -0,0 +1,85 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Media Device Allocator API test script
+# Copyright (c) 2019 Shuah Khan <shuah(a)kernel.org>
+
+echo "Media Device Allocator testing: unbind and bind"
+echo "media driver $1 audio driver $2"
+
+MDRIVER=/sys/bus/usb/drivers/$1
+cd $MDRIVER
+MDEV=$(ls -d *\-*)
+
+ADRIVER=/sys/bus/usb/drivers/$2
+cd $ADRIVER
+ADEV=$(ls -d *\-*.1)
+
+echo "=================================="
+echo "Test unbind both devices - start"
+echo "Running unbind of $MDEV from $MDRIVER"
+echo $MDEV > $MDRIVER/unbind;
+
+echo "Media device should still be present!"
+ls -l /dev/media*
+
+echo "sound driver is at: $ADRIVER"
+echo "Device is: $ADEV"
+
+echo "Running unbind of $ADEV from $ADRIVER"
+echo $ADEV > $ADRIVER/unbind;
+
+echo "Media device should have been deleted!"
+ls -l /dev/media*
+echo "Test unbind both devices - end"
+
+echo "=================================="
+
+echo "Test bind both devices - start"
+echo "Running bind of $MDEV from $MDRIVER"
+echo $MDEV > $MDRIVER/bind;
+
+echo "Media device should be present!"
+ls -l /dev/media*
+
+echo "Running bind of $ADEV from $ADRIVER"
+echo $ADEV > $ADRIVER/bind;
+
+echo "Media device should be there!"
+ls -l /dev/media*
+
+echo "Test bind both devices - end"
+
+echo "=================================="
+
+echo "Test unbind $MDEV - bind $MDEV - unbind $ADEV - bind $ADEV start"
+
+echo "Running unbind of $MDEV from $MDRIVER"
+echo $MDEV > $MDRIVER/unbind;
+
+echo "Media device should be there!"
+ls -l /dev/media*
+
+sleep 1
+
+echo "Running bind of $MDEV from $MDRIVER"
+echo $MDEV > $MDRIVER/bind;
+
+echo "Media device should be there!"
+ls -l /dev/media*
+
+echo "Running unbind of $ADEV from $ADRIVER"
+echo $ADEV > $ADRIVER/unbind;
+
+echo "Media device should be there!"
+ls -l /dev/media*
+
+sleep 1
+
+echo "Running bind of $ADEV from $ADRIVER"
+echo $ADEV > $ADRIVER/bind;
+
+echo "Media device should be there!"
+ls -l /dev/media*
+
+echo "Test unbind $MDEV - bind $MDEV - unbind $ADEV - bind $ADEV end"
+echo "=================================="
--
2.17.1
Add a new test for Media Device Allocator API.
Media Device Allocator API to allows multiple drivers share a media device.
This API solves a very common use-case for media devices where one physical
device (an USB stick) provides both audio and video. When such media device
exposes a standard USB Audio class, a proprietary Video class, two or more
independent drivers will share a single physical USB bridge. In such cases,
it is necessary to coordinate access to the shared resource.
Using this API, drivers can allocate a media device with the shared struct
device as the key. Once the media device is allocated by a driver, other
drivers can get a reference to it. The media device is released when all
the references are released.
This test does a series of unbind/bind tests to make sure media device
is released correctly when it is no longer is use and when the last
driver releases the reference.
Signed-off-by: Shuah Khan <shuah(a)kernel.org>
---
.../media_tests/media_dev_allocator.sh | 81 +++++++++++++++++++
1 file changed, 81 insertions(+)
create mode 100755 tools/testing/selftests/media_tests/media_dev_allocator.sh
diff --git a/tools/testing/selftests/media_tests/media_dev_allocator.sh b/tools/testing/selftests/media_tests/media_dev_allocator.sh
new file mode 100755
index 000000000000..d58e39c1b66c
--- /dev/null
+++ b/tools/testing/selftests/media_tests/media_dev_allocator.sh
@@ -0,0 +1,81 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Media Device Allocator API test script
+# Copyright (c) 2018 Shuah Khan <shuah(a)kernel.org>
+
+echo "Media Device Allocator testing: unbind and bind"
+echo "media driver $1 audio driver $2"
+
+MDRIVER=/sys/bus/usb/drivers/$1
+cd $MDRIVER
+MDEV=$(ls -d *\-*)
+
+ADRIVER=/sys/bus/usb/drivers/$2
+cd $ADRIVER
+ADEV=$(ls -d *\-*.1)
+
+echo "=================================="
+echo "Test unbind both devices - start"
+echo "Running unbind of $MDEV from $MDRIVER"
+echo $MDEV > $MDRIVER/unbind;
+
+echo "Media device should still be present!"
+ls -l /dev/media*
+
+echo "sound driver is at: $ADRIVER"
+echo "Device is: $ADEV"
+
+echo "Running unbind of $ADEV from $ADRIVER"
+echo $ADEV > $ADRIVER/unbind;
+
+echo "Media device should have been deleted!"
+ls -l /dev/media*
+echo "Test unbind both devices - end"
+
+echo "=================================="
+
+echo "Test bind both devices - start"
+echo "Running bind of $MDEV from $MDRIVER"
+echo $MDEV > $MDRIVER/bind;
+
+echo "Media device should be present!"
+ls -l /dev/media*
+
+echo "Running bind of $ADEV from $ADRIVER"
+echo $ADEV > $ADRIVER/bind;
+
+echo "Media device should be there!"
+ls -l /dev/media*
+
+echo "Test bind both devices - end"
+
+echo "=================================="
+
+echo "Test unbind $MDEV - bind $MDEV - unbind $ADEV - bind $ADEV start"
+
+echo "Running unbind of $MDEV from $MDRIVER"
+echo $MDEV > $MDRIVER/unbind;
+
+echo "Media device should be there!"
+ls -l /dev/media*
+
+echo "Running bind of $MDEV from $MDRIVER"
+echo $MDEV > $MDRIVER/bind;
+
+echo "Media device should be there!"
+ls -l /dev/media*
+
+echo "Running unbind of $ADEV from $ADRIVER"
+echo $ADEV > $ADRIVER/unbind;
+
+echo "Media device should be there!"
+ls -l /dev/media*
+
+echo "Running bind of $ADEV from $ADRIVER"
+echo $ADEV > $ADRIVER/bind;
+
+echo "Media device should be there!"
+ls -l /dev/media*
+
+echo "Test unbind $MDEV - bind $MDEV - unbind $ADEV - bind $ADEV end"
+echo "=================================="
--
2.17.1
In rcu_rrupt_from_idle, we want to check if it is called from within an
interrupt, but want to do such checking only for debug builds. lockdep
already tracks when we enter an interrupt. Let us expose it as an
assertion macro so it can be used to assert this.
Suggested-by: Steven Rostedt <rostedt(a)goodmis.org>
Cc: kernel-team(a)android.com
Cc: rcu(a)vger.kernel.org
Signed-off-by: Joel Fernandes (Google) <joel(a)joelfernandes.org>
---
include/linux/lockdep.h | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index c5335df2372f..d24f564823d3 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -601,11 +601,18 @@ do { \
"IRQs not disabled as expected\n"); \
} while (0)
+#define lockdep_assert_in_irq() do { \
+ WARN_ONCE(debug_locks && !current->lockdep_recursion && \
+ !current->hardirq_context, \
+ "Not in hardirq as expected\n"); \
+ } while (0)
+
#else
# define might_lock(lock) do { } while (0)
# define might_lock_read(lock) do { } while (0)
# define lockdep_assert_irqs_enabled() do { } while (0)
# define lockdep_assert_irqs_disabled() do { } while (0)
+# define lockdep_assert_in_irq() do { } while (0)
#endif
#ifdef CONFIG_LOCKDEP
--
2.21.0.392.gf8f6787159e-goog
From: Tycho Andersen <tycho(a)tycho.ws>
[ Upstream commit 3aa415dd2128e478ea3225b59308766de0e94d6b ]
The get_metadata() test requires real root, so let's skip it if we're not
real root.
Note that I used XFAIL here because that's what the test does later if
CONFIG_CHEKCKPOINT_RESTORE happens to not be enabled. After looking at the
code, there doesn't seem to be a nice way to skip tests defined as TEST(),
since there's no return code (I tried exit(KSFT_SKIP), but that didn't work
either...). So let's do it this way to be consistent, and easier to fix
when someone comes along and fixes it.
Signed-off-by: Tycho Andersen <tycho(a)tycho.ws>
Acked-by: Kees Cook <keescook(a)chromium.org>
Signed-off-by: Shuah Khan <shuah(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/seccomp/seccomp_bpf.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 83057fa9d391..14cad657bc6a 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -2920,6 +2920,12 @@ TEST(get_metadata)
struct seccomp_metadata md;
long ret;
+ /* Only real root can get metadata. */
+ if (geteuid()) {
+ XFAIL(return, "get_metadata requires real root");
+ return;
+ }
+
ASSERT_EQ(0, pipe(pipefd));
pid = fork();
--
2.19.1
From: Tycho Andersen <tycho(a)tycho.ws>
[ Upstream commit 3aa415dd2128e478ea3225b59308766de0e94d6b ]
The get_metadata() test requires real root, so let's skip it if we're not
real root.
Note that I used XFAIL here because that's what the test does later if
CONFIG_CHEKCKPOINT_RESTORE happens to not be enabled. After looking at the
code, there doesn't seem to be a nice way to skip tests defined as TEST(),
since there's no return code (I tried exit(KSFT_SKIP), but that didn't work
either...). So let's do it this way to be consistent, and easier to fix
when someone comes along and fixes it.
Signed-off-by: Tycho Andersen <tycho(a)tycho.ws>
Acked-by: Kees Cook <keescook(a)chromium.org>
Signed-off-by: Shuah Khan <shuah(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/seccomp/seccomp_bpf.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 7e632b465ab4..6d7a81306f8a 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -2971,6 +2971,12 @@ TEST(get_metadata)
struct seccomp_metadata md;
long ret;
+ /* Only real root can get metadata. */
+ if (geteuid()) {
+ XFAIL(return, "get_metadata requires real root");
+ return;
+ }
+
ASSERT_EQ(0, pipe(pipefd));
pid = fork();
--
2.19.1
The rcutorture jitter.sh script selects a random CPU but does not check
if it is offline or online. This leads to taskset errors many times. On
my machine, hyper threading is disabled so half the cores are offline
causing taskset errors a lot of times. Let us fix this by checking from
only the online CPUs on the system.
Signed-off-by: Joel Fernandes (Google) <joel(a)joelfernandes.org>
---
tools/testing/selftests/rcutorture/bin/jitter.sh | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/rcutorture/bin/jitter.sh b/tools/testing/selftests/rcutorture/bin/jitter.sh
index 3633828375e3..53bf9d99b5cd 100755
--- a/tools/testing/selftests/rcutorture/bin/jitter.sh
+++ b/tools/testing/selftests/rcutorture/bin/jitter.sh
@@ -47,10 +47,19 @@ do
exit 0;
fi
- # Set affinity to randomly selected CPU
+ # Set affinity to randomly selected online CPU
cpus=`ls /sys/devices/system/cpu/*/online |
sed -e 's,/[^/]*$,,' -e 's/^[^0-9]*//' |
grep -v '^0*$'`
+
+ for c in $cpus; do
+ if [ "$(cat /sys/devices/system/cpu/cpu$c/online)" == "1" ];
+ then
+ cpus_tmp="$cpus_tmp $c"
+ fi
+ done
+ cpus=$cpus_tmp
+
cpumask=`awk -v cpus="$cpus" -v me=$me -v n=$n 'BEGIN {
srand(n + me + systime());
ncpus = split(cpus, ca);
--
2.21.0.392.gf8f6787159e-goog
This patch set proposes KUnit, a lightweight unit testing and mocking
framework for the Linux kernel.
Unlike Autotest and kselftest, KUnit is a true unit testing framework;
it does not require installing the kernel on a test machine or in a VM
and does not require tests to be written in userspace running on a host
kernel. Additionally, KUnit is fast: From invocation to completion KUnit
can run several dozen tests in under a second. Currently, the entire
KUnit test suite for KUnit runs in under a second from the initial
invocation (build time excluded).
KUnit is heavily inspired by JUnit, Python's unittest.mock, and
Googletest/Googlemock for C++. KUnit provides facilities for defining
unit test cases, grouping related test cases into test suites, providing
common infrastructure for running tests, mocking, spying, and much more.
## What's so special about unit testing?
A unit test is supposed to test a single unit of code in isolation,
hence the name. There should be no dependencies outside the control of
the test; this means no external dependencies, which makes tests orders
of magnitudes faster. Likewise, since there are no external dependencies,
there are no hoops to jump through to run the tests. Additionally, this
makes unit tests deterministic: a failing unit test always indicates a
problem. Finally, because unit tests necessarily have finer granularity,
they are able to test all code paths easily solving the classic problem
of difficulty in exercising error handling code.
## Is KUnit trying to replace other testing frameworks for the kernel?
No. Most existing tests for the Linux kernel are end-to-end tests, which
have their place. A well tested system has lots of unit tests, a
reasonable number of integration tests, and some end-to-end tests. KUnit
is just trying to address the unit test space which is currently not
being addressed.
## More information on KUnit
There is a bunch of documentation near the end of this patch set that
describes how to use KUnit and best practices for writing unit tests.
For convenience I am hosting the compiled docs here:
https://google.github.io/kunit-docs/third_party/kernel/docs/
Additionally for convenience, I have applied these patches to a branch:
https://kunit.googlesource.com/linux/+/kunit/rfc/5.0-rc5/v4
The repo may be cloned with:
git clone https://kunit.googlesource.com/linux
This patchset is on the kunit/rfc/5.0-rc5/v4 branch.
## Changes Since Last Version
- Got KUnit working on (hypothetically) all architectures (tested on
x86), as per Rob's (and other's) request
- Punting all KUnit features/patches depending on UML for now.
- Broke out UML specific support into arch/um/* as per "[RFC v3 01/19]
kunit: test: add KUnit test runner core", as requested by Luis.
- Added support to kunit_tool to allow it to build kernels in external
directories, as suggested by Kieran.
- Added a UML defconfig, and a config fragment for KUnit as suggested
by Kieran and Luis.
- Cleaned up, and reformatted a bunch of stuff.
--
2.21.0.rc0.258.g878e2cd30e-goog
This is version 3 of the MSI interrupts for ntb_transport patchset.
I've addressed the feedback so far and rebased on the latest kernel
and would like this to be considered for merging this cycle.
The only outstanding issue I know of is that it still will not work
with IDT hardware, but ntb_transport doesn't work with IDT hardware
and there is still no sensible common infrastructure to support
ntb_peer_mw_set_trans(). Thus, I decline to consider that complication
in this patchset. However, I'll be happy to review work that adds this
feature in the future.
Also as the port number and resource index stuff is a bit complicated,
I made a quick out of tree test fixture to ensure it's correct[1]. As
an excerise I also wrote some test code[2] using the upcomming KUnit
feature.
Logan
[1] https://repl.it/repls/ExcitingPresentFile
[2] https://github.com/sbates130272/linux-p2pmem/commits/ntb_kunit
--
Changes in v3:
* Rebased onto v5.1-rc1 (Dropped the first two patches as they have
been merged, and cleaned up some minor conflicts in the PCI tree)
* Added a new patch (#3) to calculate logical port numbers that
are port numbers from 0 to (number of ports - 1). This is
then used in ntb_peer_resource_idx() to fix the issues brought
up by Serge.
* Fixed missing __iomem and iowrite calls (as noticed by Serge)
* Added patch 10 which describes ntb_msi_test in the documentation
file (as requested by Serge)
* A couple other minor nits and documentation fixes
--
Changes in v2:
* Cleaned up the changes in intel_irq_remapping.c to make them
less confusing and add a comment. (Per discussion with Jacob and
Joerg)
* Fixed a nit from Bjorn and collected his Ack
* Added a Kconfig dependancy on CONFIG_PCI_MSI for CONFIG_NTB_MSI
as the Kbuild robot hit a random config that didn't build
without it.
* Worked in a callback for when the MSI descriptor changes so that
the clients can resend the new address and data values to the peer.
On my test system this was never necessary, but there may be
other platforms where this can occur. I tested this by hacking
in a path to rewrite the MSI descriptor when I change the cpu
affinity of an IRQ. There's a bit of uncertainty over the latency
of the change, but without hardware this can acctually occur on
we can't test this. This was the result of a discussion with Dave.
--
This patch series adds optional support for using MSI interrupts instead
of NTB doorbells in ntb_transport. This is desirable seeing doorbells on
current hardware are quite slow and therefore switching to MSI interrupts
provides a significant performance gain. On switchtec hardware, a simple
apples-to-apples comparison shows ntb_netdev/iperf numbers going from
3.88Gb/s to 14.1Gb/s when switching to MSI interrupts.
To do this, a couple changes are required outside of the NTB tree:
1) The IOMMU must know to accept MSI requests from aliased bused numbers
seeing NTB hardware typically sends proxied request IDs through
additional requester IDs. The first patch in this series adds support
for the Intel IOMMU. A quirk to add these aliases for switchtec hardware
was already accepted. See commit ad281ecf1c7d ("PCI: Add DMA alias quirk
for Microsemi Switchtec NTB") for a description of NTB proxy IDs and why
this is necessary.
2) NTB transport (and other clients) may often need more MSI interrupts
than the NTB hardware actually advertises support for. However, seeing
these interrupts will not be triggered by the hardware but through an
NTB memory window, the hardware does not actually need support or need
to know about them. Therefore we add the concept of Virtual MSI
interrupts which are allocated just like any other MSI interrupt but
are not programmed into the hardware's MSI table. This is done in
Patch 2 and then made use of in Patch 3.
The remaining patches in this series add a library for dealing with MSI
interrupts, a test client and finally support in ntb_transport.
The series is based off of v5.1-rc1 plus the patches in ntb-next.
A git repo is available here:
https://github.com/sbates130272/linux-p2pmem/ ntb_transport_msi_v3
Thanks,
Logan
--
Logan Gunthorpe (10):
PCI/MSI: Support allocating virtual MSI interrupts
PCI/switchtec: Add module parameter to request more interrupts
NTB: Introduce helper functions to calculate logical port number
NTB: Introduce functions to calculate multi-port resource index
NTB: Rename ntb.c to support multiple source files in the module
NTB: Introduce MSI library
NTB: Introduce NTB MSI Test Client
NTB: Add ntb_msi_test support to ntb_test
NTB: Add MSI interrupt support to ntb_transport
NTB: Describe the ntb_msi_test client in the documentation.
Documentation/ntb.txt | 27 ++
drivers/ntb/Kconfig | 11 +
drivers/ntb/Makefile | 3 +
drivers/ntb/{ntb.c => core.c} | 0
drivers/ntb/msi.c | 415 +++++++++++++++++++++++
drivers/ntb/ntb_transport.c | 169 ++++++++-
drivers/ntb/test/Kconfig | 9 +
drivers/ntb/test/Makefile | 1 +
drivers/ntb/test/ntb_msi_test.c | 433 ++++++++++++++++++++++++
drivers/pci/msi.c | 54 ++-
drivers/pci/switch/switchtec.c | 12 +-
include/linux/msi.h | 8 +
include/linux/ntb.h | 196 ++++++++++-
include/linux/pci.h | 9 +
tools/testing/selftests/ntb/ntb_test.sh | 54 ++-
15 files changed, 1386 insertions(+), 15 deletions(-)
rename drivers/ntb/{ntb.c => core.c} (100%)
create mode 100644 drivers/ntb/msi.c
create mode 100644 drivers/ntb/test/ntb_msi_test.c
--
2.20.1
After the first run, the test case 'test_create_read' will always
fail because the file is exist and file's attr is 'S_IMMUTABLE',
open with 'O_RDWR' will always return -EPERM.
Signed-off-by: ZhangXiaoxu <zhangxiaoxu5(a)huawei.com>
---
tools/testing/selftests/efivarfs/efivarfs.sh | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/tools/testing/selftests/efivarfs/efivarfs.sh b/tools/testing/selftests/efivarfs/efivarfs.sh
index a47029a..d386610 100755
--- a/tools/testing/selftests/efivarfs/efivarfs.sh
+++ b/tools/testing/selftests/efivarfs/efivarfs.sh
@@ -77,6 +77,10 @@ test_create_empty()
test_create_read()
{
local file=$efivarfs_mount/$FUNCNAME-$test_guid
+ if [ -f $file]; then
+ chattr -i $file
+ rm -rf $file
+ fi
./create-read $file
}
--
2.7.4
test_tc_tunnel.sh sets up a pair of namespaces connected by a
veth pair to verify encap/decap using bpf_skb_adjust_room. In
testing this, it uses tunnel links as the peer of the bpf-based
encap/decap. However because the same IP header is used for inner
and outer IP, when packets arrive at the tunnel interface they will
be dropped by reverse path filtering as those packets are expected
on the veth interface (where the destination IP of the decapped
packet is configured).
To avoid this, ensure reverse path filtering is disabled for the
namespace using tunneling.
Fixes: 98cdabcd0798 ("selftests/bpf: bpf tunnel encap test")
Signed-off-by: Alan Maguire <alan.maguire(a)oracle.com>
---
tools/testing/selftests/bpf/test_tc_tunnel.sh | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh
index dcf3206..c805adb 100755
--- a/tools/testing/selftests/bpf/test_tc_tunnel.sh
+++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh
@@ -160,6 +160,14 @@ server_listen
# client can connect again
ip netns exec "${ns2}" ip link add dev testtun0 type "${tuntype}" \
remote "${addr1}" local "${addr2}"
+# Because packets are decapped by the tunnel they arrive on testtun0 from
+# the IP stack perspective. Ensure reverse path filtering is disabled
+# otherwise we drop the TCP SYN as arriving on testtun0 instead of the
+# expected veth2 (veth2 is where 192.168.1.2 is configured).
+ip netns exec "${ns2}" sysctl -qw net.ipv4.conf.all.rp_filter=0
+# rp needs to be disabled for both all and testtun0 as the rp value is
+# selected as the max of the "all" and device-specific values.
+ip netns exec "${ns2}" sysctl -qw net.ipv4.conf.testtun0.rp_filter=0
ip netns exec "${ns2}" ip link set dev testtun0 up
echo "test bpf encap with tunnel device decap"
client_connect
--
1.8.3.1
PTRACE_GET_SYSCALL_INFO is a generic ptrace API that lets ptracer obtain
details of the syscall the tracee is blocked in.
There are two reasons for a special syscall-related ptrace request.
Firstly, with the current ptrace API there are cases when ptracer cannot
retrieve necessary information about syscalls. Some examples include:
* The notorious int-0x80-from-64-bit-task issue. See [1] for details.
In short, if a 64-bit task performs a syscall through int 0x80, its tracer
has no reliable means to find out that the syscall was, in fact,
a compat syscall, and misidentifies it.
* Syscall-enter-stop and syscall-exit-stop look the same for the tracer.
Common practice is to keep track of the sequence of ptrace-stops in order
not to mix the two syscall-stops up. But it is not as simple as it looks;
for example, strace had a (just recently fixed) long-standing bug where
attaching strace to a tracee that is performing the execve system call
led to the tracer identifying the following syscall-exit-stop as
syscall-enter-stop, which messed up all the state tracking.
* Since the introduction of commit 84d77d3f06e7e8dea057d10e8ec77ad71f721be3
("ptrace: Don't allow accessing an undumpable mm"), both PTRACE_PEEKDATA
and process_vm_readv become unavailable when the process dumpable flag
is cleared. On such architectures as ia64 this results in all syscall
arguments being unavailable for the tracer.
Secondly, ptracers also have to support a lot of arch-specific code for
obtaining information about the tracee. For some architectures, this
requires a ptrace(PTRACE_PEEKUSER, ...) invocation for every syscall
argument and return value.
PTRACE_GET_SYSCALL_INFO returns the following structure:
struct ptrace_syscall_info {
__u8 op; /* PTRACE_SYSCALL_INFO_* */
__u32 arch __attribute__((__aligned__(sizeof(__u32))));
__u64 instruction_pointer;
__u64 stack_pointer;
union {
struct {
__u64 nr;
__u64 args[6];
} entry;
struct {
__s64 rval;
__u8 is_error;
} exit;
struct {
__u64 nr;
__u64 args[6];
__u32 ret_data;
} seccomp;
};
};
The structure was chosen according to [2], except for the following
changes:
* seccomp substructure was added as a superset of entry substructure;
* the type of nr field was changed from int to __u64 because syscall
numbers are, as a practical matter, 64 bits;
* stack_pointer field was added along with instruction_pointer field
since it is readily available and can save the tracer from extra
PTRACE_GETREGS/PTRACE_GETREGSET calls;
* arch is always initialized to aid with tracing system calls
* such as execve();
* instruction_pointer and stack_pointer are always initialized
so they could be easily obtained for non-syscall stops;
* a boolean is_error field was added along with rval field, this way
the tracer can more reliably distinguish a return value
from an error value.
strace has been ported to PTRACE_GET_SYSCALL_INFO.
Starting with release 4.26, strace uses PTRACE_GET_SYSCALL_INFO API
as the preferred mechanism of obtaining syscall information.
[1] https://lore.kernel.org/lkml/CA+55aFzcSVmdDj9Lh_gdbz1OzHyEm6ZrGPBDAJnywm2LF…
[2] https://lore.kernel.org/lkml/CAObL_7GM0n80N7J_DFw_eQyfLyzq+sf4y2AvsCCV88Tb3…
---
Notes:
v8:
* Moved syscall_get_arch() specific patches to a separate patchset
which is now merged into audit/next tree.
* Rebased to linux-next.
* Moved ptrace_get_syscall_info code under #ifdef CONFIG_HAVE_ARCH_TRACEHOOK,
narrowing down the set of architectures supported by this implementation
back to those 19 that enable CONFIG_HAVE_ARCH_TRACEHOOK because
I failed to get all syscall_get_*(), instruction_pointer(),
and user_stack_pointer() functions implemented on some niche
architectures. This leaves the following architectures out:
alpha, h8300, m68k, microblaze, and unicore32.
v7:
* Rebased to v5.0-rc1.
* 5 arch-specific preparatory patches out of 25 have been merged
into v5.0-rc1 via arch trees.
v6:
* Add syscall_get_arguments and syscall_set_arguments wrappers
to asm-generic/syscall.h, requested by Geert.
* Change PTRACE_GET_SYSCALL_INFO return code: do not take trailing paddings
into account, use the end of the last field of the structure being written.
* Change struct ptrace_syscall_info:
* remove .frame_pointer field, is is not needed and not portable;
* make .arch field explicitly aligned, remove no longer needed
padding before .arch field;
* remove trailing pads, they are no longer needed.
v5:
* Merge separate series and patches into the single series.
* Change PTRACE_EVENTMSG_SYSCALL_{ENTRY,EXIT} values as requested by Oleg.
* Change struct ptrace_syscall_info: generalize instruction_pointer,
stack_pointer, and frame_pointer fields by moving them from
ptrace_syscall_info.{entry,seccomp} substructures to ptrace_syscall_info
and initializing them for all stops.
* Add PTRACE_SYSCALL_INFO_NONE, set it when not in a syscall stop,
so e.g. "strace -i" could use PTRACE_SYSCALL_INFO_SECCOMP to obtain
instruction_pointer when the tracee is in a signal stop.
* Patch all remaining architectures to provide all necessary
syscall_get_* functions.
* Make available for all architectures: do not conditionalize on
CONFIG_HAVE_ARCH_TRACEHOOK since all syscall_get_* functions
are implemented on all architectures.
* Add a test for PTRACE_GET_SYSCALL_INFO to selftests/ptrace.
v4:
* Do not introduce task_struct.ptrace_event,
use child->last_siginfo->si_code instead.
* Implement PTRACE_SYSCALL_INFO_SECCOMP and ptrace_syscall_info.seccomp
support along with PTRACE_SYSCALL_INFO_{ENTRY,EXIT} and
ptrace_syscall_info.{entry,exit}.
v3:
* Change struct ptrace_syscall_info.
* Support PTRACE_EVENT_SECCOMP by adding ptrace_event to task_struct.
* Add proper defines for ptrace_syscall_info.op values.
* Rename PT_SYSCALL_IS_ENTERING and PT_SYSCALL_IS_EXITING to
PTRACE_EVENTMSG_SYSCALL_ENTRY and PTRACE_EVENTMSG_SYSCALL_EXIT
* and move them to uapi.
v2:
* Do not use task->ptrace.
* Replace entry_info.is_compat with entry_info.arch, use syscall_get_arch().
* Use addr argument of sys_ptrace to get expected size of the struct;
return full size of the struct.
Dmitry V. Levin (6):
nds32: fix asm/syscall.h
hexagon: define syscall_get_error() and syscall_get_return_value()
mips: define syscall_get_error()
parisc: define syscall_get_error()
powerpc: define syscall_get_error()
selftests/ptrace: add a test case for PTRACE_GET_SYSCALL_INFO
Elvira Khabirova (1):
ptrace: add PTRACE_GET_SYSCALL_INFO request
arch/hexagon/include/asm/syscall.h | 14 +
arch/mips/include/asm/syscall.h | 6 +
arch/nds32/include/asm/syscall.h | 29 +-
arch/parisc/include/asm/syscall.h | 7 +
arch/powerpc/include/asm/syscall.h | 10 +
include/linux/tracehook.h | 9 +-
include/uapi/linux/ptrace.h | 35 +++
kernel/ptrace.c | 103 ++++++-
tools/testing/selftests/ptrace/.gitignore | 1 +
tools/testing/selftests/ptrace/Makefile | 2 +-
.../selftests/ptrace/get_syscall_info.c | 271 ++++++++++++++++++
11 files changed, 471 insertions(+), 16 deletions(-)
create mode 100644 tools/testing/selftests/ptrace/get_syscall_info.c
--
ldv