Hi,
eBPF is said to be a safe way to extend the kernel, and that is very
attractive, but adding a new use of eBPF requires kfuncs, and kfuncs are
said to be as unstable as EXPORT_SYMBOL_GPL. So I'd like to ask some
questions:
1) Which should I choose, BPF kfuncs or ioctl, when adding a new feature
for userspace apps?
2) How should I use BPF kfuncs from userspace apps if I add them?
Here, a "userspace app" means something other than a system-wide daemon
such as systemd (in particular, I have QEMU in mind). I'll describe the
context in more detail below:
---
I'm working on a new feature that aids virtio-net implementations using
the tuntap virtual network device. You can see [1] for details, but
basically it extends BPF_PROG_TYPE_SOCKET_FILTER to report four more
bytes.
However, after long discussions we have confirmed that extending
BPF_PROG_TYPE_SOCKET_FILTER is not going to happen, and that adding
kfuncs is the way forward. So I looked into how to add kfuncs to the
kernel and how to use them. There is rich documentation for the kernel
side, but I found little about the userspace side. The best I could find
is a systemd change proposal that is based on WIP kernel changes[2].
So now I'm wondering how I should use BPF kfuncs from userspace apps if
I add them. In the systemd discussion, it is reported that Linus said
it's fine to use BPF kfuncs in private infrastructure owned by big
companies, or in systemd, as those users know the system well[3].
Indeed, those users should be able to make more assumptions about the
kernel than "normal" userspace applications can.
Returning to my proposal, I'm proposing a new feature to be used by QEMU
or other VMM applications. QEMU is more like a normal userspace
application, and usually does not make many assumptions about the kernel
it runs on. For example, it's generally safe to run a Debian container
that includes QEMU installed with apt on Fedora. BPF kfuncs may work
even in such a situation thanks to CO-RE, but that sounds like
*accidentally* creating UAPIs.
Considering all of the above, how can I integrate BPF kfuncs into the
application?
If BPF kfuncs are like EXPORT_SYMBOL_GPL, the natural way to handle them
is to think of BPF programs as some sort of kernel modules and
incorporate logic that behaves like modprobe. More concretely, I can put
eBPF binaries in a directory like:
/usr/local/share/qemu/ebpf/$KERNEL_RELEASE
Then, QEMU can call uname() and derive the path to the binary from the
kernel release. It will give an error if it can't find the binary for
the current kernel so that it won't create accidental UAPIs.
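A minimal sketch of the lookup described above, only to make the idea
concrete. The base directory is the example path from this mail; the
function names and the per-object file name argument are hypothetical
and not part of QEMU or any existing library.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/utsname.h>
#include <unistd.h>

/* Hypothetical base directory, taken from the example path in this mail. */
#define EBPF_BASE_DIR "/usr/local/share/qemu/ebpf"

/*
 * Build "<base>/<kernel release>/<name>" for the running kernel.
 * Returns 0 on success, -1 if uname() fails or the buffer is too small.
 */
static int ebpf_object_path(char *buf, size_t len, const char *base,
                            const char *name)
{
    struct utsname uts;
    int n;

    assert(buf != NULL && len > 0);

    if (uname(&uts) != 0)
        return -1;

    n = snprintf(buf, len, "%s/%s/%s", base, uts.release, name);
    if (n < 0 || (size_t)n >= len)
        return -1;

    return 0;
}

/*
 * Like modprobe, refuse to proceed when there is no object built for this
 * exact kernel release, instead of loading one built for another kernel.
 * Returns 0 if a readable object exists, -1 otherwise.
 */
static int ebpf_object_find(char *buf, size_t len, const char *base,
                            const char *name)
{
    if (ebpf_object_path(buf, len, base, name) != 0)
        return -1;

    return access(buf, R_OK) == 0 ? 0 : -1;
}
```

A caller would pass EBPF_BASE_DIR and an object name, and report an
error to the user when ebpf_object_find() returns -1.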
The obvious downside of this is that it complicates packaging a lot; it
requires packaging QEMU eBPF binaries each time a new kernel comes out.
This complexity is centrally managed by modprobe for kernel modules, but
apparently each application needs to take care of it for BPF programs.
In conclusion, I see too much complexity in using BPF in a userspace
application, which we didn't have to care about with
BPF_PROG_TYPE_SOCKET_FILTER. Isn't there a better way? Or shouldn't I
use BPF in my case in the first place?
Thanks,
Akihiko Odaki
[1] https://lore.kernel.org/all/20231015141644.260646-1-akihiko.odaki@daynix.co…
[2] https://github.com/systemd/systemd/pull/29797
[3] https://github.com/systemd/systemd/pull/29797#discussion_r1384637939
The vec-syscfg selftest verifies that setting the VL of the currently
tested vector type does not disrupt the VL of the other vector type. To do
this it records the current vector length for each type but neglects to
guard this with a check for that vector type actually being supported. Add
one, using a helper function, and also update all the other instances of
this pattern to use the helper.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
tools/testing/selftests/arm64/fp/vec-syscfg.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/arm64/fp/vec-syscfg.c b/tools/testing/selftests/arm64/fp/vec-syscfg.c
index 5f648b97a06f..ea9c7d47790f 100644
--- a/tools/testing/selftests/arm64/fp/vec-syscfg.c
+++ b/tools/testing/selftests/arm64/fp/vec-syscfg.c
@@ -66,6 +66,11 @@ static struct vec_data vec_data[] = {
},
};
+static bool vec_type_supported(struct vec_data *data)
+{
+ return getauxval(data->hwcap_type) & data->hwcap;
+}
+
static int stdio_read_integer(FILE *f, const char *what, int *val)
{
int n = 0;
@@ -564,8 +569,11 @@ static void prctl_set_all_vqs(struct vec_data *data)
return;
}
- for (i = 0; i < ARRAY_SIZE(vec_data); i++)
+ for (i = 0; i < ARRAY_SIZE(vec_data); i++) {
+ if (!vec_type_supported(&vec_data[i]))
+ continue;
orig_vls[i] = vec_data[i].rdvl();
+ }
for (vq = SVE_VQ_MIN; vq <= SVE_VQ_MAX; vq++) {
vl = sve_vl_from_vq(vq);
@@ -594,7 +602,7 @@ static void prctl_set_all_vqs(struct vec_data *data)
if (&vec_data[i] == data)
continue;
- if (!(getauxval(vec_data[i].hwcap_type) & vec_data[i].hwcap))
+ if (!vec_type_supported(&vec_data[i]))
continue;
if (vec_data[i].rdvl() != orig_vls[i]) {
@@ -765,7 +773,7 @@ int main(void)
struct vec_data *data = &vec_data[i];
unsigned long supported;
- supported = getauxval(data->hwcap_type) & data->hwcap;
+ supported = vec_type_supported(data);
if (!supported)
all_supported = false;
---
base-commit: 2cc14f52aeb78ce3f29677c2de1f06c0e91471ab
change-id: 20231215-kselftest-arm64-vec-syscfg-rdvl-7944e19ac64f
Best regards,
--
Mark Brown <broonie(a)kernel.org>
When running tests on a CI system (e.g. LAVA) it is useful to output
test results in TAP format so that the CI can parse the fine-grained
results to show regressions. Many of the mm selftest binaries already
output using the TAP format. And the kselftests runner
(run_kselftest.sh) also uses the format. CI systems such as LAVA can
already handle nested TAP reports. However, with the mm selftests we
have 3 levels of nesting (run_kselftest.sh -> run_vmtests.sh ->
individual test binaries) and the middle level did not previously
support TAP, which breaks the parser.
Let's fix that by teaching run_vmtests.sh to output using the TAP
format. Ideally this would be opt-in via a command line argument to
avoid the possibility of breaking anyone's existing scripts that might
scrape the output. However, it is not possible to pass arguments to
tests invoked via run_kselftest.sh. So I've implemented an opt-out
option (-n), which will revert to the existing output format.
Future changes to this file should be aware of 2 new conventions:
- output that is part of the TAP reporting is piped through tap_output
- general output is piped through tap_prefix
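For illustration only, the mapping from a test's exit code to its TAP
result line can be sketched in C as below. The script itself does this
in bash inside run_test(); the function name tap_result_line and the
KSFT_SKIP macro are hypothetical and not part of this patch.

```c
#include <assert.h>
#include <stdio.h>

/* Kselftest SKIP exit code, same value as ksft_skip in run_vmtests.sh. */
#define KSFT_SKIP 4

/*
 * Format one TAP result line for test number `num` from its exit code:
 * 0 is a pass, KSFT_SKIP is a skip (reported as "ok ... # SKIP"), and
 * anything else is a failure that records the exit code.
 * Returns 0 on success, -1 if the line did not fit in the buffer.
 */
static int tap_result_line(char *buf, size_t len, int num,
                           const char *name, int ret)
{
    int n;

    assert(buf != NULL && len > 0);

    if (ret == 0)
        n = snprintf(buf, len, "ok %d %s", num, name);
    else if (ret == KSFT_SKIP)
        n = snprintf(buf, len, "ok %d %s # SKIP", num, name);
    else
        n = snprintf(buf, len, "not ok %d %s # exit=%d", num, name, ret);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Lines produced this way belong to the TAP report proper; everything
else a test prints is general output and gets the "# " prefix instead.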
Signed-off-by: Ryan Roberts <ryan.roberts(a)arm.com>
---
tools/testing/selftests/mm/run_vmtests.sh | 51 +++++++++++++++++------
1 file changed, 39 insertions(+), 12 deletions(-)
diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/selftests/mm/run_vmtests.sh
index 87f513f5cf91..246d53a5d7f2 100755
--- a/tools/testing/selftests/mm/run_vmtests.sh
+++ b/tools/testing/selftests/mm/run_vmtests.sh
@@ -5,6 +5,7 @@
# Kselftest framework requirement - SKIP code is 4.
ksft_skip=4
+count_total=0
count_pass=0
count_fail=0
count_skip=0
@@ -17,6 +18,7 @@ usage: ${BASH_SOURCE[0]:-$0} [ options ]
-a: run all tests, including extra ones
-t: specify specific categories to tests to run
-h: display this message
+ -n: disable TAP output
The default behavior is to run required tests only. If -a is specified,
will run all tests.
@@ -77,12 +79,14 @@ EOF
}
RUN_ALL=false
+TAP_PREFIX="# "
-while getopts "aht:" OPT; do
+while getopts "aht:n" OPT; do
case ${OPT} in
"a") RUN_ALL=true ;;
"h") usage ;;
"t") VM_SELFTEST_ITEMS=${OPTARG} ;;
+ "n") TAP_PREFIX= ;;
esac
done
shift $((OPTIND -1))
@@ -184,30 +188,52 @@ fi
VADDR64=0
echo "$ARCH64STR" | grep "$ARCH" &>/dev/null && VADDR64=1
+tap_prefix() {
+ sed -e "s/^/${TAP_PREFIX}/"
+}
+
+tap_output() {
+ if [[ ! -z "$TAP_PREFIX" ]]; then
+ read str
+ echo $str
+ fi
+}
+
+pretty_name() {
+ echo "$*" | sed -e 's/^\(bash \)\?\.\///'
+}
+
# Usage: run_test [test binary] [arbitrary test arguments...]
run_test() {
if test_selected ${CATEGORY}; then
+ local test=$(pretty_name "$*")
local title="running $*"
local sep=$(echo -n "$title" | tr "[:graph:][:space:]" -)
- printf "%s\n%s\n%s\n" "$sep" "$title" "$sep"
+ printf "%s\n%s\n%s\n" "$sep" "$title" "$sep" | tap_prefix
- "$@"
- local ret=$?
+ ("$@" 2>&1) | tap_prefix
+ local ret=${PIPESTATUS[0]}
+ count_total=$(( count_total + 1 ))
if [ $ret -eq 0 ]; then
count_pass=$(( count_pass + 1 ))
- echo "[PASS]"
+ echo "[PASS]" | tap_prefix
+ echo "ok ${count_total} ${test}" | tap_output
elif [ $ret -eq $ksft_skip ]; then
count_skip=$(( count_skip + 1 ))
- echo "[SKIP]"
+ echo "[SKIP]" | tap_prefix
+ echo "ok ${count_total} ${test} # SKIP" | tap_output
exitcode=$ksft_skip
else
count_fail=$(( count_fail + 1 ))
- echo "[FAIL]"
+ echo "[FAIL]" | tap_prefix
+ echo "not ok ${count_total} ${test} # exit=$ret" | tap_output
exitcode=1
fi
fi # test_selected
}
+echo "TAP version 13" | tap_output
+
CATEGORY="hugetlb" run_test ./hugepage-mmap
shmmax=$(cat /proc/sys/kernel/shmmax)
@@ -231,9 +257,9 @@ CATEGORY="hugetlb" run_test ./hugetlb_fault_after_madv
echo "$nr_hugepages_tmp" > /proc/sys/vm/nr_hugepages
if test_selected "hugetlb"; then
- echo "NOTE: These hugetlb tests provide minimal coverage. Use"
- echo " https://github.com/libhugetlbfs/libhugetlbfs.git for"
- echo " hugetlb regression testing."
+ echo "NOTE: These hugetlb tests provide minimal coverage. Use" | tap_prefix
+ echo " https://github.com/libhugetlbfs/libhugetlbfs.git for" | tap_prefix
+ echo " hugetlb regression testing." | tap_prefix
fi
CATEGORY="mmap" run_test ./map_fixed_noreplace
@@ -312,7 +338,7 @@ CATEGORY="hmm" run_test bash ./test_hmm.sh smoke
# MADV_POPULATE_READ and MADV_POPULATE_WRITE tests
CATEGORY="madv_populate" run_test ./madv_populate
-echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
+(echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope 2>&1) | tap_prefix
CATEGORY="memfd_secret" run_test ./memfd_secret
# KSM KSM_MERGE_TIME_HUGE_PAGES test with size of 100
@@ -369,6 +395,7 @@ CATEGORY="mkdirty" run_test ./mkdirty
CATEGORY="mdwe" run_test ./mdwe_test
-echo "SUMMARY: PASS=${count_pass} SKIP=${count_skip} FAIL=${count_fail}"
+echo "SUMMARY: PASS=${count_pass} SKIP=${count_skip} FAIL=${count_fail}" | tap_prefix
+echo "1..${count_total}" | tap_output
exit $exitcode
--
2.25.1
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
Add a test that writes long strings, some over the size of the sub buffer,
and makes sure that the entire content is there.
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
Changes since v3: https://lore.kernel.org/linux-trace-kernel/20231212192317.0fb6b101@gandalf.…
- Removed / */ from regex, to catch more than one space added to the
beginning of the print. This would have caught the bug of using "%*s"
instead of "%.*s". Luckily, the trace_printk test caught that.
.../ftrace/test.d/00basic/trace_marker.tc | 82 +++++++++++++++++++
1 file changed, 82 insertions(+)
create mode 100755 tools/testing/selftests/ftrace/test.d/00basic/trace_marker.tc
diff --git a/tools/testing/selftests/ftrace/test.d/00basic/trace_marker.tc b/tools/testing/selftests/ftrace/test.d/00basic/trace_marker.tc
new file mode 100755
index 000000000000..9aa0db2b84fc
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/00basic/trace_marker.tc
@@ -0,0 +1,82 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Basic tests on writing to trace_marker
+# requires: trace_marker
+# flags: instance
+
+get_buffer_data_size() {
+ sed -ne 's/^.*data.*size:\([0-9][0-9]*\).*/\1/p' events/header_page
+}
+
+get_buffer_data_offset() {
+ sed -ne 's/^.*data.*offset:\([0-9][0-9]*\).*/\1/p' events/header_page
+}
+
+get_event_header_size() {
+ type_len=`sed -ne 's/^.*type_len.*:[^0-9]*\([0-9][0-9]*\).*/\1/p' events/header_event`
+ time_len=`sed -ne 's/^.*time_delta.*:[^0-9]*\([0-9][0-9]*\).*/\1/p' events/header_event`
+ array_len=`sed -ne 's/^.*array.*:[^0-9]*\([0-9][0-9]*\).*/\1/p' events/header_event`
+ total_bits=$((type_len+time_len+array_len))
+ total_bits=$((total_bits+7))
+ echo $((total_bits/8))
+}
+
+get_print_event_buf_offset() {
+ sed -ne 's/^.*buf.*offset:\([0-9][0-9]*\).*/\1/p' events/ftrace/print/format
+}
+
+event_header_size=`get_event_header_size`
+print_header_size=`get_print_event_buf_offset`
+
+data_offset=`get_buffer_data_offset`
+
+marker_meta=$((event_header_size+print_header_size))
+
+make_str() {
+ cnt=$1
+ # subtract two for \n\0 as marker adds these
+ cnt=$((cnt-2))
+ printf -- 'X%.0s' $(seq $cnt)
+}
+
+write_buffer() {
+ size=$1
+
+ str=`make_str $size`
+
+ # clear the buffer
+ echo > trace
+
+ # write the string into the marker
+ echo -n $str > trace_marker
+
+ echo $str
+}
+
+test_buffer() {
+
+ size=`get_buffer_data_size`
+ oneline_size=$((size-marker_meta))
+ echo size = $size
+ echo meta size = $marker_meta
+
+ # Now add a little more the meta data overhead will overflow
+
+ str=`write_buffer $size`
+
+ # Make sure the line was broken
+ new_str=`awk ' /tracing_mark_write:/ { sub(/^.*tracing_mark_write: /,"");printf "%s", $0; exit}' trace`
+
+ if [ "$new_str" = "$str" ]; then
+ exit fail;
+ fi
+
+ # Make sure the entire line can be found
+ new_str=`awk ' /tracing_mark_write:/ { sub(/^.*tracing_mark_write: /,"");printf "%s", $0; }' trace`
+
+ if [ "$new_str" != "$str" ]; then
+ exit fail;
+ fi
+}
+
+test_buffer
--
2.42.0
If an integer's type has x bits, shifting the integer left by x or more
is undefined behavior.
This can happen in the rotate function when attempting to do a rotation
of the whole value by 0.
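As a worked example, here is a 64-bit analogue of the problem and of the
fix. This is a sketch under assumptions, not the selftest code: the
function name rotate64 is hypothetical, it uses uint64_t rather than
__uint128_t, and unlike the selftest it masks the result itself.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Rotate the low `size` bytes of `val` right by `amount` bits (a negative
 * amount rotates left). `size` must be between 1 and 8.
 */
static uint64_t rotate64(int size, uint64_t val, int amount)
{
    unsigned int left, right, bits;
    uint64_t mask;

    assert(size >= 1 && size <= 8);
    bits = (unsigned int)size * 8;

    /* All-ones mask covering the low `bits` bits. */
    mask = bits < 64 ? (UINT64_C(1) << bits) - 1 : UINT64_MAX;

    /* Normalize the rotation amount into [0, bits). */
    right = (unsigned int)(amount % (int)bits + (int)bits) % bits;

    /*
     * With size == 8 and right == 0, bits - right is 64, and shifting a
     * 64-bit value left by 64 is undefined. Reducing modulo 64 turns
     * that shift into a shift by 0.
     */
    left = (bits - right) % 64;

    val &= mask;
    return ((val << left) | (val >> right)) & mask;
}
```

Rotating a full-width value by 0, for example rotate64(8, x, 0), now
returns x unchanged without ever shifting by 64.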
Fixes: 0dd714bfd200 ("KVM: s390: selftest: memop: Add cmpxchg tests")
Signed-off-by: Nina Schoetterl-Glausch <nsg(a)linux.ibm.com>
---
tools/testing/selftests/kvm/s390x/memop.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/kvm/s390x/memop.c b/tools/testing/selftests/kvm/s390x/memop.c
index bb3ca9a5d731..2eba9575828e 100644
--- a/tools/testing/selftests/kvm/s390x/memop.c
+++ b/tools/testing/selftests/kvm/s390x/memop.c
@@ -485,11 +485,13 @@ static bool popcount_eq(__uint128_t a, __uint128_t b)
static __uint128_t rotate(int size, __uint128_t val, int amount)
{
- unsigned int bits = size * 8;
+ unsigned int left, right, bits = size * 8;
- amount = (amount + bits) % bits;
+ right = (amount + bits) % bits;
+ /* % 128 prevents left shift UB if size == 16 && right == 0 */
+ left = (bits - right) % 128;
val = cut_to_size(size, val);
- return (val << (bits - amount)) | (val >> amount);
+ return (val << left) | (val >> right);
}
const unsigned int max_block = 16;
base-commit: 305230142ae0637213bf6e04f6d9f10bbcb74af8
--
2.40.1
A ksft_print_msg() call used the %d format specifier where %s should
have been used, since the argument is a string. Fix it.
Signed-off-by: Ghanshyam Agrawal <ghanshyam1898(a)gmail.com>
---
tools/testing/selftests/alsa/mixer-test.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/alsa/mixer-test.c b/tools/testing/selftests/alsa/mixer-test.c
index 21e482b23f50..23df154fcdd7 100644
--- a/tools/testing/selftests/alsa/mixer-test.c
+++ b/tools/testing/selftests/alsa/mixer-test.c
@@ -138,7 +138,7 @@ static void find_controls(void)
err = snd_ctl_elem_info(card_data->handle,
ctl_data->info);
if (err < 0) {
- ksft_print_msg("%s getting info for %d\n",
+ ksft_print_msg("%s getting info for %s\n",
snd_strerror(err),
ctl_data->name);
}
--
2.25.1
Here are a few fixes related to MPTCP:
Patch 1 avoids skipping some subtests of the MPTCP Join selftest by
mistake when using older versions of GCC. This fixes a patch introduced
in v6.4, backported up to v6.1.
Patch 2 fixes an inconsistent state when using MPTCP + FastOpen. A fix
for v6.2.
Patch 3 adds a description for MPTCP Kunit test modules to avoid a
warning.
Patch 4 adds an entry to the mailmap file for Geliang's email addresses.
Signed-off-by: Matthieu Baerts <matttbe(a)kernel.org>
---
Geliang Tang (2):
selftests: mptcp: join: fix subflow_send_ack lookup
mailmap: add entries for Geliang Tang
Matthieu Baerts (1):
mptcp: fill in missing MODULE_DESCRIPTION()
Paolo Abeni (1):
mptcp: fix inconsistent state on fastopen race
.mailmap | 4 ++++
net/mptcp/crypto_test.c | 1 +
net/mptcp/protocol.c | 6 +++---
net/mptcp/protocol.h | 9 +++++---
net/mptcp/subflow.c | 28 +++++++++++++++----------
net/mptcp/token_test.c | 1 +
tools/testing/selftests/net/mptcp/mptcp_join.sh | 8 +++----
7 files changed, 36 insertions(+), 21 deletions(-)
---
base-commit: 64b8bc7d5f1434c636a40bdcfcd42b278d1714be
change-id: 20231215-upstream-net-20231215-mptcp-misc-fixes-33c4380c2f32
Best regards,
--
Matthieu Baerts <matttbe(a)kernel.org>