From: Jakub Kicinski <kuba(a)kernel.org>
[ Upstream commit 2d3b8dfd82d76b1295167c6453d683ab99e50794 ]
On slow machines the SND timestamp sometimes doesn't arrive before
we quit. The test only waits as long as the packet delay, so it's
easy for a race condition to happen.
Double the wait but do a bit of polling, once the SND timestamp
arrives there's no point to wait any longer.
This fixes the "TXTIME abs" failures on debug kernels, like:
Case ICMPv4 - TXTIME abs returned '', expected 'OK'
Reviewed-by: Willem de Bruijn <willemb(a)google.com>
Link: https://lore.kernel.org/r/20240510005705.43069-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/net/cmsg_sender.c | 20 +++++++++++++++-----
1 file changed, 15 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/net/cmsg_sender.c b/tools/testing/selftests/net/cmsg_sender.c
index c79e65581dc37..161db24e3c409 100644
--- a/tools/testing/selftests/net/cmsg_sender.c
+++ b/tools/testing/selftests/net/cmsg_sender.c
@@ -333,16 +333,17 @@ static const char *cs_ts_info2str(unsigned int info)
return "unknown";
}
-static void
+static unsigned long
cs_read_cmsg(int fd, struct msghdr *msg, char *cbuf, size_t cbuf_sz)
{
struct sock_extended_err *see;
struct scm_timestamping *ts;
+ unsigned long ts_seen = 0;
struct cmsghdr *cmsg;
int i, err;
if (!opt.ts.ena)
- return;
+ return 0;
msg->msg_control = cbuf;
msg->msg_controllen = cbuf_sz;
@@ -396,8 +397,11 @@ cs_read_cmsg(int fd, struct msghdr *msg, char *cbuf, size_t cbuf_sz)
printf(" %5s ts%d %lluus\n",
cs_ts_info2str(see->ee_info),
i, rel_time);
+ ts_seen |= 1 << see->ee_info;
}
}
+
+ return ts_seen;
}
static void ca_set_sockopts(int fd)
@@ -509,10 +513,16 @@ int main(int argc, char *argv[])
err = ERN_SUCCESS;
if (opt.ts.ena) {
- /* Make sure all timestamps have time to loop back */
- usleep(opt.txtime.delay);
+ unsigned long seen;
+ int i;
- cs_read_cmsg(fd, &msg, cbuf, sizeof(cbuf));
+ /* Make sure all timestamps have time to loop back */
+ for (i = 0; i < 40; i++) {
+ seen = cs_read_cmsg(fd, &msg, cbuf, sizeof(cbuf));
+ if (seen & (1 << SCM_TSTAMP_SND))
+ break;
+ usleep(opt.txtime.delay / 20);
+ }
}
err_out:
--
2.43.0
From: "Jose E. Marchesi" <jose.marchesi(a)oracle.com>
[ Upstream commit cd3fc3b9782130a5bc1dc3dfccffbc1657637a93 ]
[Changes from V1:
- The warning to disable is -Wmaybe-uninitialized, not -Wuninitialized.
- This warning is only supported in GCC.]
The BPF selftest verifier_global_subprogs.c contains code that
purposedly performs out of bounds access to memory, to check whether
the kernel verifier is able to catch them. For example:
__noinline int global_unsupp(const int *mem)
{
if (!mem)
return 0;
return mem[100]; /* BOOM */
}
With -O1 and higher and no inlining, GCC notices this fact and emits a
"maybe uninitialized" warning. This is by design. Note that the
emission of these warnings is highly dependent on the precise
optimizations that are performed.
This patch adds a compiler pragma to verifier_global_subprogs.c to
ignore these warnings.
Tested in bpf-next master.
No regressions.
Signed-off-by: Jose E. Marchesi <jose.marchesi(a)oracle.com>
Cc: david.faust(a)oracle.com
Cc: cupertino.miranda(a)oracle.com
Cc: Yonghong Song <yonghong.song(a)linux.dev>
Cc: Eduard Zingerman <eddyz87(a)gmail.com>
Acked-by: Yonghong Song <yonghong.song(a)linux.dev>
Link: https://lore.kernel.org/r/20240507184756.1772-1-jose.marchesi@oracle.com
Signed-off-by: Alexei Starovoitov <ast(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
.../testing/selftests/bpf/progs/verifier_global_subprogs.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/tools/testing/selftests/bpf/progs/verifier_global_subprogs.c b/tools/testing/selftests/bpf/progs/verifier_global_subprogs.c
index 67dddd9418911..27f4b2da131b1 100644
--- a/tools/testing/selftests/bpf/progs/verifier_global_subprogs.c
+++ b/tools/testing/selftests/bpf/progs/verifier_global_subprogs.c
@@ -8,6 +8,13 @@
#include "xdp_metadata.h"
#include "bpf_kfuncs.h"
+/* The compiler may be able to detect the access to uninitialized
+ memory in the routines performing out of bound memory accesses and
+ emit warnings about it. This is the case of GCC. */
+#if !defined(__clang__)
+#pragma GCC diagnostic ignored "-Wmaybe-uninitialized"
+#endif
+
int arr[1];
int unkn_idx;
const volatile bool call_dead_subprog = false;
--
2.43.0
From: Yonghong Song <yonghong.song(a)linux.dev>
[ Upstream commit 14bb1e8c8d4ad5d9d2febb7d19c70a3cf536e1e5 ]
Recently, I frequently hit the following test failure:
[root@arch-fb-vm1 bpf]# ./test_progs -n 33/1
test_lookup_update:PASS:skel_open 0 nsec
[...]
test_lookup_update:PASS:sync_rcu 0 nsec
test_lookup_update:FAIL:map1_leak inner_map1 leaked!
#33/1 btf_map_in_map/lookup_update:FAIL
#33 btf_map_in_map:FAIL
In the test, after map is closed and then after two rcu grace periods,
it is assumed that map_id is not available to user space.
But the above assumption cannot be guaranteed. After zero or one
or two rcu grace periods in different siturations, the actual
freeing-map-work is put into a workqueue. Later on, when the work
is dequeued, the map will be actually freed.
See bpf_map_put() in kernel/bpf/syscall.c.
By using workqueue, there is no ganrantee that map will be actually
freed after a couple of rcu grace periods. This patch removed
such map leak detection and then the test can pass consistently.
Signed-off-by: Yonghong Song <yonghong.song(a)linux.dev>
Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net>
Link: https://lore.kernel.org/bpf/20240322061353.632136-1-yonghong.song@linux.dev
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
.../selftests/bpf/prog_tests/btf_map_in_map.c | 26 +------------------
1 file changed, 1 insertion(+), 25 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/btf_map_in_map.c b/tools/testing/selftests/bpf/prog_tests/btf_map_in_map.c
index a8b53b8736f01..f66ceccd7029c 100644
--- a/tools/testing/selftests/bpf/prog_tests/btf_map_in_map.c
+++ b/tools/testing/selftests/bpf/prog_tests/btf_map_in_map.c
@@ -25,7 +25,7 @@ static void test_lookup_update(void)
int map1_fd, map2_fd, map3_fd, map4_fd, map5_fd, map1_id, map2_id;
int outer_arr_fd, outer_hash_fd, outer_arr_dyn_fd;
struct test_btf_map_in_map *skel;
- int err, key = 0, val, i, fd;
+ int err, key = 0, val, i;
skel = test_btf_map_in_map__open_and_load();
if (CHECK(!skel, "skel_open", "failed to open&load skeleton\n"))
@@ -102,30 +102,6 @@ static void test_lookup_update(void)
CHECK(map1_id == 0, "map1_id", "failed to get ID 1\n");
CHECK(map2_id == 0, "map2_id", "failed to get ID 2\n");
- test_btf_map_in_map__destroy(skel);
- skel = NULL;
-
- /* we need to either wait for or force synchronize_rcu(), before
- * checking for "still exists" condition, otherwise map could still be
- * resolvable by ID, causing false positives.
- *
- * Older kernels (5.8 and earlier) freed map only after two
- * synchronize_rcu()s, so trigger two, to be entirely sure.
- */
- CHECK(kern_sync_rcu(), "sync_rcu", "failed\n");
- CHECK(kern_sync_rcu(), "sync_rcu", "failed\n");
-
- fd = bpf_map_get_fd_by_id(map1_id);
- if (CHECK(fd >= 0, "map1_leak", "inner_map1 leaked!\n")) {
- close(fd);
- goto cleanup;
- }
- fd = bpf_map_get_fd_by_id(map2_id);
- if (CHECK(fd >= 0, "map2_leak", "inner_map2 leaked!\n")) {
- close(fd);
- goto cleanup;
- }
-
cleanup:
test_btf_map_in_map__destroy(skel);
}
--
2.43.0
From: "Alessandro Carminati (Red Hat)" <alessandro.carminati(a)gmail.com>
[ Upstream commit f803bcf9208a2540acb4c32bdc3616673169f490 ]
In some systems, the netcat server can incur in delay to start listening.
When this happens, the test can randomly fail in various points.
This is an example error message:
# ip gre none gso
# encap 192.168.1.1 to 192.168.1.2, type gre, mac none len 2000
# test basic connectivity
# Ncat: Connection refused.
The issue stems from a race condition between the netcat client and server.
The test author had addressed this problem by implementing a sleep, which
I have removed in this patch.
This patch introduces a function capable of sleeping for up to two seconds.
However, it can terminate the waiting period early if the port is reported
to be listening.
Signed-off-by: Alessandro Carminati (Red Hat) <alessandro.carminati(a)gmail.com>
Signed-off-by: Andrii Nakryiko <andrii(a)kernel.org>
Link: https://lore.kernel.org/bpf/20240314105911.213411-1-alessandro.carminati@gm…
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/bpf/test_tc_tunnel.sh | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh
index 910044f08908a..7989ec6084545 100755
--- a/tools/testing/selftests/bpf/test_tc_tunnel.sh
+++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh
@@ -72,7 +72,6 @@ cleanup() {
server_listen() {
ip netns exec "${ns2}" nc "${netcat_opt}" -l "${port}" > "${outfile}" &
server_pid=$!
- sleep 0.2
}
client_connect() {
@@ -93,6 +92,16 @@ verify_data() {
fi
}
+wait_for_port() {
+ for i in $(seq 20); do
+ if ip netns exec "${ns2}" ss ${2:--4}OHntl | grep -q "$1"; then
+ return 0
+ fi
+ sleep 0.1
+ done
+ return 1
+}
+
set -e
# no arguments: automated test, run all
@@ -193,6 +202,7 @@ setup
# basic communication works
echo "test basic connectivity"
server_listen
+wait_for_port ${port} ${netcat_opt}
client_connect
verify_data
@@ -204,6 +214,7 @@ ip netns exec "${ns1}" tc filter add dev veth1 egress \
section "encap_${tuntype}_${mac}"
echo "test bpf encap without decap (expect failure)"
server_listen
+wait_for_port ${port} ${netcat_opt}
! client_connect
if [[ "$tuntype" =~ "udp" ]]; then
--
2.43.0
From: Jakub Kicinski <kuba(a)kernel.org>
[ Upstream commit 2d3b8dfd82d76b1295167c6453d683ab99e50794 ]
On slow machines the SND timestamp sometimes doesn't arrive before
we quit. The test only waits as long as the packet delay, so it's
easy for a race condition to happen.
Double the wait but do a bit of polling, once the SND timestamp
arrives there's no point to wait any longer.
This fixes the "TXTIME abs" failures on debug kernels, like:
Case ICMPv4 - TXTIME abs returned '', expected 'OK'
Reviewed-by: Willem de Bruijn <willemb(a)google.com>
Link: https://lore.kernel.org/r/20240510005705.43069-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/net/cmsg_sender.c | 20 +++++++++++++++-----
1 file changed, 15 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/net/cmsg_sender.c b/tools/testing/selftests/net/cmsg_sender.c
index c79e65581dc37..161db24e3c409 100644
--- a/tools/testing/selftests/net/cmsg_sender.c
+++ b/tools/testing/selftests/net/cmsg_sender.c
@@ -333,16 +333,17 @@ static const char *cs_ts_info2str(unsigned int info)
return "unknown";
}
-static void
+static unsigned long
cs_read_cmsg(int fd, struct msghdr *msg, char *cbuf, size_t cbuf_sz)
{
struct sock_extended_err *see;
struct scm_timestamping *ts;
+ unsigned long ts_seen = 0;
struct cmsghdr *cmsg;
int i, err;
if (!opt.ts.ena)
- return;
+ return 0;
msg->msg_control = cbuf;
msg->msg_controllen = cbuf_sz;
@@ -396,8 +397,11 @@ cs_read_cmsg(int fd, struct msghdr *msg, char *cbuf, size_t cbuf_sz)
printf(" %5s ts%d %lluus\n",
cs_ts_info2str(see->ee_info),
i, rel_time);
+ ts_seen |= 1 << see->ee_info;
}
}
+
+ return ts_seen;
}
static void ca_set_sockopts(int fd)
@@ -509,10 +513,16 @@ int main(int argc, char *argv[])
err = ERN_SUCCESS;
if (opt.ts.ena) {
- /* Make sure all timestamps have time to loop back */
- usleep(opt.txtime.delay);
+ unsigned long seen;
+ int i;
- cs_read_cmsg(fd, &msg, cbuf, sizeof(cbuf));
+ /* Make sure all timestamps have time to loop back */
+ for (i = 0; i < 40; i++) {
+ seen = cs_read_cmsg(fd, &msg, cbuf, sizeof(cbuf));
+ if (seen & (1 << SCM_TSTAMP_SND))
+ break;
+ usleep(opt.txtime.delay / 20);
+ }
}
err_out:
--
2.43.0
From: "Jose E. Marchesi" <jose.marchesi(a)oracle.com>
[ Upstream commit cd3fc3b9782130a5bc1dc3dfccffbc1657637a93 ]
[Changes from V1:
- The warning to disable is -Wmaybe-uninitialized, not -Wuninitialized.
- This warning is only supported in GCC.]
The BPF selftest verifier_global_subprogs.c contains code that
purposedly performs out of bounds access to memory, to check whether
the kernel verifier is able to catch them. For example:
__noinline int global_unsupp(const int *mem)
{
if (!mem)
return 0;
return mem[100]; /* BOOM */
}
With -O1 and higher and no inlining, GCC notices this fact and emits a
"maybe uninitialized" warning. This is by design. Note that the
emission of these warnings is highly dependent on the precise
optimizations that are performed.
This patch adds a compiler pragma to verifier_global_subprogs.c to
ignore these warnings.
Tested in bpf-next master.
No regressions.
Signed-off-by: Jose E. Marchesi <jose.marchesi(a)oracle.com>
Cc: david.faust(a)oracle.com
Cc: cupertino.miranda(a)oracle.com
Cc: Yonghong Song <yonghong.song(a)linux.dev>
Cc: Eduard Zingerman <eddyz87(a)gmail.com>
Acked-by: Yonghong Song <yonghong.song(a)linux.dev>
Link: https://lore.kernel.org/r/20240507184756.1772-1-jose.marchesi@oracle.com
Signed-off-by: Alexei Starovoitov <ast(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
.../testing/selftests/bpf/progs/verifier_global_subprogs.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/tools/testing/selftests/bpf/progs/verifier_global_subprogs.c b/tools/testing/selftests/bpf/progs/verifier_global_subprogs.c
index baff5ffe94051..a9fc30ed4d732 100644
--- a/tools/testing/selftests/bpf/progs/verifier_global_subprogs.c
+++ b/tools/testing/selftests/bpf/progs/verifier_global_subprogs.c
@@ -8,6 +8,13 @@
#include "xdp_metadata.h"
#include "bpf_kfuncs.h"
+/* The compiler may be able to detect the access to uninitialized
+ memory in the routines performing out of bound memory accesses and
+ emit warnings about it. This is the case of GCC. */
+#if !defined(__clang__)
+#pragma GCC diagnostic ignored "-Wmaybe-uninitialized"
+#endif
+
int arr[1];
int unkn_idx;
const volatile bool call_dead_subprog = false;
--
2.43.0
From: Yonghong Song <yonghong.song(a)linux.dev>
[ Upstream commit 14bb1e8c8d4ad5d9d2febb7d19c70a3cf536e1e5 ]
Recently, I frequently hit the following test failure:
[root@arch-fb-vm1 bpf]# ./test_progs -n 33/1
test_lookup_update:PASS:skel_open 0 nsec
[...]
test_lookup_update:PASS:sync_rcu 0 nsec
test_lookup_update:FAIL:map1_leak inner_map1 leaked!
#33/1 btf_map_in_map/lookup_update:FAIL
#33 btf_map_in_map:FAIL
In the test, after map is closed and then after two rcu grace periods,
it is assumed that map_id is not available to user space.
But the above assumption cannot be guaranteed. After zero or one
or two rcu grace periods in different siturations, the actual
freeing-map-work is put into a workqueue. Later on, when the work
is dequeued, the map will be actually freed.
See bpf_map_put() in kernel/bpf/syscall.c.
By using workqueue, there is no ganrantee that map will be actually
freed after a couple of rcu grace periods. This patch removed
such map leak detection and then the test can pass consistently.
Signed-off-by: Yonghong Song <yonghong.song(a)linux.dev>
Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net>
Link: https://lore.kernel.org/bpf/20240322061353.632136-1-yonghong.song@linux.dev
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
.../selftests/bpf/prog_tests/btf_map_in_map.c | 26 +------------------
1 file changed, 1 insertion(+), 25 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/btf_map_in_map.c b/tools/testing/selftests/bpf/prog_tests/btf_map_in_map.c
index a8b53b8736f01..f66ceccd7029c 100644
--- a/tools/testing/selftests/bpf/prog_tests/btf_map_in_map.c
+++ b/tools/testing/selftests/bpf/prog_tests/btf_map_in_map.c
@@ -25,7 +25,7 @@ static void test_lookup_update(void)
int map1_fd, map2_fd, map3_fd, map4_fd, map5_fd, map1_id, map2_id;
int outer_arr_fd, outer_hash_fd, outer_arr_dyn_fd;
struct test_btf_map_in_map *skel;
- int err, key = 0, val, i, fd;
+ int err, key = 0, val, i;
skel = test_btf_map_in_map__open_and_load();
if (CHECK(!skel, "skel_open", "failed to open&load skeleton\n"))
@@ -102,30 +102,6 @@ static void test_lookup_update(void)
CHECK(map1_id == 0, "map1_id", "failed to get ID 1\n");
CHECK(map2_id == 0, "map2_id", "failed to get ID 2\n");
- test_btf_map_in_map__destroy(skel);
- skel = NULL;
-
- /* we need to either wait for or force synchronize_rcu(), before
- * checking for "still exists" condition, otherwise map could still be
- * resolvable by ID, causing false positives.
- *
- * Older kernels (5.8 and earlier) freed map only after two
- * synchronize_rcu()s, so trigger two, to be entirely sure.
- */
- CHECK(kern_sync_rcu(), "sync_rcu", "failed\n");
- CHECK(kern_sync_rcu(), "sync_rcu", "failed\n");
-
- fd = bpf_map_get_fd_by_id(map1_id);
- if (CHECK(fd >= 0, "map1_leak", "inner_map1 leaked!\n")) {
- close(fd);
- goto cleanup;
- }
- fd = bpf_map_get_fd_by_id(map2_id);
- if (CHECK(fd >= 0, "map2_leak", "inner_map2 leaked!\n")) {
- close(fd);
- goto cleanup;
- }
-
cleanup:
test_btf_map_in_map__destroy(skel);
}
--
2.43.0
From: "Alessandro Carminati (Red Hat)" <alessandro.carminati(a)gmail.com>
[ Upstream commit f803bcf9208a2540acb4c32bdc3616673169f490 ]
In some systems, the netcat server can incur in delay to start listening.
When this happens, the test can randomly fail in various points.
This is an example error message:
# ip gre none gso
# encap 192.168.1.1 to 192.168.1.2, type gre, mac none len 2000
# test basic connectivity
# Ncat: Connection refused.
The issue stems from a race condition between the netcat client and server.
The test author had addressed this problem by implementing a sleep, which
I have removed in this patch.
This patch introduces a function capable of sleeping for up to two seconds.
However, it can terminate the waiting period early if the port is reported
to be listening.
Signed-off-by: Alessandro Carminati (Red Hat) <alessandro.carminati(a)gmail.com>
Signed-off-by: Andrii Nakryiko <andrii(a)kernel.org>
Link: https://lore.kernel.org/bpf/20240314105911.213411-1-alessandro.carminati@gm…
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/bpf/test_tc_tunnel.sh | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh
index 910044f08908a..7989ec6084545 100755
--- a/tools/testing/selftests/bpf/test_tc_tunnel.sh
+++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh
@@ -72,7 +72,6 @@ cleanup() {
server_listen() {
ip netns exec "${ns2}" nc "${netcat_opt}" -l "${port}" > "${outfile}" &
server_pid=$!
- sleep 0.2
}
client_connect() {
@@ -93,6 +92,16 @@ verify_data() {
fi
}
+wait_for_port() {
+ for i in $(seq 20); do
+ if ip netns exec "${ns2}" ss ${2:--4}OHntl | grep -q "$1"; then
+ return 0
+ fi
+ sleep 0.1
+ done
+ return 1
+}
+
set -e
# no arguments: automated test, run all
@@ -193,6 +202,7 @@ setup
# basic communication works
echo "test basic connectivity"
server_listen
+wait_for_port ${port} ${netcat_opt}
client_connect
verify_data
@@ -204,6 +214,7 @@ ip netns exec "${ns1}" tc filter add dev veth1 egress \
section "encap_${tuntype}_${mac}"
echo "test bpf encap without decap (expect failure)"
server_listen
+wait_for_port ${port} ${netcat_opt}
! client_connect
if [[ "$tuntype" =~ "udp" ]]; then
--
2.43.0
Add several new test cases which assert corner cases on the eventfd
mechanism, for example, the supplied buffer is less than 8 bytes,
attempting to write a value that is too large, etc.
./eventfd_test
# Starting 9 tests from 1 test cases.
# RUN global.eventfd_check_flag_rdwr ...
# OK global.eventfd_check_flag_rdwr
ok 1 global.eventfd_check_flag_rdwr
# RUN global.eventfd_check_flag_cloexec ...
# OK global.eventfd_check_flag_cloexec
ok 2 global.eventfd_check_flag_cloexec
# RUN global.eventfd_check_flag_nonblock ...
# OK global.eventfd_check_flag_nonblock
ok 3 global.eventfd_check_flag_nonblock
# RUN global.eventfd_chek_flag_cloexec_and_nonblock ...
# OK global.eventfd_chek_flag_cloexec_and_nonblock
ok 4 global.eventfd_chek_flag_cloexec_and_nonblock
# RUN global.eventfd_check_flag_semaphore ...
# OK global.eventfd_check_flag_semaphore
ok 5 global.eventfd_check_flag_semaphore
# RUN global.eventfd_check_write ...
# OK global.eventfd_check_write
ok 6 global.eventfd_check_write
# RUN global.eventfd_check_read ...
# OK global.eventfd_check_read
ok 7 global.eventfd_check_read
# RUN global.eventfd_check_read_with_nonsemaphore ...
# OK global.eventfd_check_read_with_nonsemaphore
ok 8 global.eventfd_check_read_with_nonsemaphore
# RUN global.eventfd_check_read_with_semaphore ...
# OK global.eventfd_check_read_with_semaphore
ok 9 global.eventfd_check_read_with_semaphore
# PASSED: 9 / 9 tests passed.
# Totals: pass:9 fail:0 xfail:0 xpass:0 skip:0 error:0
Signed-off-by: Wen Yang <wen.yang(a)linux.dev>
Cc: SShuah Khan <shuah(a)kernel.org>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: Andrei Vagin <avagin(a)google.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Steven Rostedt <rostedt(a)goodmis.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Dave Young <dyoung(a)redhat.com>
Cc: Tim Bird <tim.bird(a)sony.com>
Cc: linux-kselftest(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
---
v2: use strings which indicate what is being tested, that are useful to a human
.../filesystems/eventfd/eventfd_test.c | 136 +++++++++++++++++-
1 file changed, 131 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/filesystems/eventfd/eventfd_test.c b/tools/testing/selftests/filesystems/eventfd/eventfd_test.c
index f142a137526c..85acb4e3ef00 100644
--- a/tools/testing/selftests/filesystems/eventfd/eventfd_test.c
+++ b/tools/testing/selftests/filesystems/eventfd/eventfd_test.c
@@ -13,6 +13,8 @@
#include <sys/eventfd.h>
#include "../../kselftest_harness.h"
+#define EVENTFD_TEST_ITERATIONS 100000UL
+
struct error {
int code;
char msg[512];
@@ -40,7 +42,7 @@ static inline int sys_eventfd2(unsigned int count, int flags)
return syscall(__NR_eventfd2, count, flags);
}
-TEST(eventfd01)
+TEST(eventfd_check_flag_rdwr)
{
int fd, flags;
@@ -54,7 +56,7 @@ TEST(eventfd01)
close(fd);
}
-TEST(eventfd02)
+TEST(eventfd_check_flag_cloexec)
{
int fd, flags;
@@ -68,7 +70,7 @@ TEST(eventfd02)
close(fd);
}
-TEST(eventfd03)
+TEST(eventfd_check_flag_nonblock)
{
int fd, flags;
@@ -83,7 +85,7 @@ TEST(eventfd03)
close(fd);
}
-TEST(eventfd04)
+TEST(eventfd_chek_flag_cloexec_and_nonblock)
{
int fd, flags;
@@ -161,7 +163,7 @@ static int verify_fdinfo(int fd, struct error *err, const char *prefix,
return 0;
}
-TEST(eventfd05)
+TEST(eventfd_check_flag_semaphore)
{
struct error err = {0};
int fd, ret;
@@ -183,4 +185,128 @@ TEST(eventfd05)
close(fd);
}
+/*
+ * A write(2) fails with the error EINVAL if the size of the supplied buffer
+ * is less than 8 bytes, or if an attempt is made to write the value
+ * 0xffffffffffffffff.
+ */
+TEST(eventfd_check_write)
+{
+ uint64_t value = 1;
+ ssize_t size;
+ int fd;
+
+ fd = sys_eventfd2(0, 0);
+ ASSERT_GE(fd, 0);
+
+ size = write(fd, &value, sizeof(int));
+ EXPECT_EQ(size, -1);
+ EXPECT_EQ(errno, EINVAL);
+
+ size = write(fd, &value, sizeof(value));
+ EXPECT_EQ(size, sizeof(value));
+
+ value = (uint64_t)-1;
+ size = write(fd, &value, sizeof(value));
+ EXPECT_EQ(size, -1);
+ EXPECT_EQ(errno, EINVAL);
+
+ close(fd);
+}
+
+/*
+ * A read(2) fails with the error EINVAL if the size of the supplied buffer is
+ * less than 8 bytes.
+ */
+TEST(eventfd_check_read)
+{
+ uint64_t value;
+ ssize_t size;
+ int fd;
+
+ fd = sys_eventfd2(1, 0);
+ ASSERT_GE(fd, 0);
+
+ size = read(fd, &value, sizeof(int));
+ EXPECT_EQ(size, -1);
+ EXPECT_EQ(errno, EINVAL);
+
+ size = read(fd, &value, sizeof(value));
+ EXPECT_EQ(size, sizeof(value));
+ EXPECT_EQ(value, 1);
+
+ close(fd);
+}
+
+
+/*
+ * If EFD_SEMAPHORE was not specified and the eventfd counter has a nonzero
+ * value, then a read(2) returns 8 bytes containing that value, and the
+ * counter's value is reset to zero.
+ * If the eventfd counter is zero at the time of the call to read(2), then the
+ * call fails with the error EAGAIN if the file descriptor has been made nonblocking.
+ */
+TEST(eventfd_check_read_with_nonsemaphore)
+{
+ uint64_t value;
+ ssize_t size;
+ int fd;
+ int i;
+
+ fd = sys_eventfd2(0, EFD_NONBLOCK);
+ ASSERT_GE(fd, 0);
+
+ value = 1;
+ for (i = 0; i < EVENTFD_TEST_ITERATIONS; i++) {
+ size = write(fd, &value, sizeof(value));
+ EXPECT_EQ(size, sizeof(value));
+ }
+
+ size = read(fd, &value, sizeof(value));
+ EXPECT_EQ(size, sizeof(uint64_t));
+ EXPECT_EQ(value, EVENTFD_TEST_ITERATIONS);
+
+ size = read(fd, &value, sizeof(value));
+ EXPECT_EQ(size, -1);
+ EXPECT_EQ(errno, EAGAIN);
+
+ close(fd);
+}
+
+/*
+ * If EFD_SEMAPHORE was specified and the eventfd counter has a nonzero value,
+ * then a read(2) returns 8 bytes containing the value 1, and the counter's
+ * value is decremented by 1.
+ * If the eventfd counter is zero at the time of the call to read(2), then the
+ * call fails with the error EAGAIN if the file descriptor has been made nonblocking.
+ */
+TEST(eventfd_check_read_with_semaphore)
+{
+ uint64_t value;
+ ssize_t size;
+ int fd;
+ int i;
+
+ fd = sys_eventfd2(0, EFD_SEMAPHORE|EFD_NONBLOCK);
+ ASSERT_GE(fd, 0);
+
+ value = 1;
+ for (i = 0; i < EVENTFD_TEST_ITERATIONS; i++) {
+ size = write(fd, &value, sizeof(value));
+ EXPECT_EQ(size, sizeof(value));
+ }
+
+ for (i = 0; i < EVENTFD_TEST_ITERATIONS; i++) {
+ size = read(fd, &value, sizeof(value));
+ EXPECT_EQ(size, sizeof(value));
+ EXPECT_EQ(value, 1);
+ }
+
+ size = read(fd, &value, sizeof(value));
+ EXPECT_EQ(size, -1);
+ EXPECT_EQ(errno, EAGAIN);
+
+ close(fd);
+}
+
TEST_HARNESS_MAIN
--
2.25.1
Commit 1b151e2435fc ("block: Remove special-casing of compound
pages") caused a change in behaviour when releasing the pages
if the buffer does not start at the beginning of the page. This
was because the calculation of the number of pages to release
was incorrect.
This was fixed by commit 38b43539d64b ("block: Fix page refcounts
for unaligned buffers in __bio_release_pages()").
We pin the user buffer during direct I/O writes. If this buffer is a
hugepage, bio_release_page() will unpin it and decrement all references
and pin counts at ->bi_end_io. However, if any references to the hugepage
remain post-I/O, the hugepage will not be freed upon unmap, leading
to a memory leak.
This patch verifies that a hugepage, used as a user buffer for DIO
operations, is correctly freed upon unmapping, regardless of whether
the offsets are aligned or unaligned w.r.t page boundary.
Test Result Fail Scenario (Without the fix)
--------------------------------------------------------
[]# ./hugetlb_dio
TAP version 13
1..4
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 1 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 2 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 3 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 6
not ok 4 : Huge pages not freed!
Totals: pass:3 fail:1 xfail:0 xpass:0 skip:0 error:0
Test Result PASS Scenario (With the fix)
---------------------------------------------------------
[]#./hugetlb_dio
TAP version 13
1..4
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 1 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 2 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 3 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 4 : Huge pages freed successfully !
Totals: pass:4 fail:0 xfail:0 xpass:0 skip:0 error:0
Signed-off-by: Donet Tom <donettom(a)linux.ibm.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list(a)gmail.com>
---
tools/testing/selftests/mm/Makefile | 1 +
tools/testing/selftests/mm/hugetlb_dio.c | 118 +++++++++++++++++++++++
2 files changed, 119 insertions(+)
create mode 100644 tools/testing/selftests/mm/hugetlb_dio.c
diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
index eb5f39a2668b..87d8130b3376 100644
--- a/tools/testing/selftests/mm/Makefile
+++ b/tools/testing/selftests/mm/Makefile
@@ -71,6 +71,7 @@ TEST_GEN_FILES += ksm_functional_tests
TEST_GEN_FILES += mdwe_test
TEST_GEN_FILES += hugetlb_fault_after_madv
TEST_GEN_FILES += hugetlb_madv_vs_map
+TEST_GEN_FILES += hugetlb_dio
ifneq ($(ARCH),arm64)
TEST_GEN_FILES += soft-dirty
diff --git a/tools/testing/selftests/mm/hugetlb_dio.c b/tools/testing/selftests/mm/hugetlb_dio.c
new file mode 100644
index 000000000000..6f6587c7913c
--- /dev/null
+++ b/tools/testing/selftests/mm/hugetlb_dio.c
@@ -0,0 +1,118 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * This program tests for hugepage leaks after DIO writes to a file using a
+ * hugepage as the user buffer. During DIO, the user buffer is pinned and
+ * should be properly unpinned upon completion. This patch verifies that the
+ * kernel correctly unpins the buffer at DIO completion for both aligned and
+ * unaligned user buffer offsets (w.r.t page boundary), ensuring the hugepage
+ * is freed upon unmapping.
+ */
+
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <sys/stat.h>
+#include <stdlib.h>
+#include <fcntl.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <string.h>
+#include <sys/mman.h>
+#include "vm_util.h"
+#include "../kselftest.h"
+
+void run_dio_using_hugetlb(unsigned int start_off, unsigned int end_off)
+{
+ int fd;
+ char *buffer = NULL;
+ char *orig_buffer = NULL;
+ size_t h_pagesize = 0;
+ size_t writesize;
+ int free_hpage_b = 0;
+ int free_hpage_a = 0;
+
+ writesize = end_off - start_off;
+
+ /* Get the default huge page size */
+ h_pagesize = default_huge_page_size();
+ if (!h_pagesize)
+ ksft_exit_fail_msg("Unable to determine huge page size\n");
+
+ /* Open the file to DIO */
+ fd = open("/tmp", O_TMPFILE | O_RDWR | O_DIRECT);
+ if (fd < 0)
+ ksft_exit_fail_msg("Error opening file");
+
+ /* Get the free huge pages before allocation */
+ free_hpage_b = get_free_hugepages();
+ if (free_hpage_b == 0) {
+ close(fd);
+ ksft_exit_skip("No free hugepage, exiting!\n");
+ }
+
+ /* Allocate a hugetlb page */
+ orig_buffer = mmap(NULL, h_pagesize, PROT_READ | PROT_WRITE, MAP_PRIVATE
+ | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
+ if (orig_buffer == MAP_FAILED) {
+ close(fd);
+ ksft_exit_fail_msg("Error mapping memory");
+ }
+ buffer = orig_buffer;
+ buffer += start_off;
+
+ memset(buffer, 'A', writesize);
+
+ /* Write the buffer to the file */
+ if (write(fd, buffer, writesize) != (writesize)) {
+ munmap(orig_buffer, h_pagesize);
+ close(fd);
+ ksft_exit_fail_msg("Error writing to file");
+ }
+
+ /* unmap the huge page */
+ munmap(orig_buffer, h_pagesize);
+ close(fd);
+
+ /* Get the free huge pages after unmap*/
+ free_hpage_a = get_free_hugepages();
+
+ /*
+ * If the no. of free hugepages before allocation and after unmap does
+ * not match - that means there could still be a page which is pinned.
+ */
+ if (free_hpage_a != free_hpage_b) {
+ printf("No. Free pages before allocation : %d\n", free_hpage_b);
+ printf("No. Free pages after munmap : %d\n", free_hpage_a);
+ ksft_test_result_fail(": Huge pages not freed!\n");
+ } else {
+ printf("No. Free pages before allocation : %d\n", free_hpage_b);
+ printf("No. Free pages after munmap : %d\n", free_hpage_a);
+ ksft_test_result_pass(": Huge pages freed successfully !\n");
+ }
+}
+
+int main(void)
+{
+ size_t pagesize = 0;
+
+ ksft_print_header();
+ ksft_set_plan(4);
+
+ /* Get base page size */
+ pagesize = psize();
+
+ /* start and end is aligned to pagesize */
+ run_dio_using_hugetlb(0, (pagesize * 3));
+
+ /* start is aligned but end is not aligned */
+ run_dio_using_hugetlb(0, (pagesize * 3) - (pagesize / 2));
+
+ /* start is unaligned and end is aligned */
+ run_dio_using_hugetlb(pagesize / 2, (pagesize * 3));
+
+ /* both start and end are unaligned */
+ run_dio_using_hugetlb(pagesize / 2, (pagesize * 3) + (pagesize / 2));
+
+ ksft_finished();
+ return 0;
+}
+
--
2.39.3
Here is a couple of patches to fix some issues related to kconfig.
I found these issues when I built the kernel with
tools/testing/selftests/ftrace/config.
Thank you,
---
Masami Hiramatsu (Google) (2):
selftests/ftrace: Fix to check required event file
selftests/ftrace: Update required config
tools/testing/selftests/ftrace/config | 26 +++++++++++++++-----
.../ftrace/test.d/dynevent/test_duplicates.tc | 2 +-
2 files changed, 20 insertions(+), 8 deletions(-)
--
Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
From: Geliang Tang <tanggeliang(a)kylinos.cn>
This patchset uses post_socket_cb and post_connect_cb callbacks of struct
network_helper_opts to refactor do_test() in bpf_tcp_ca.c to move dctcp
test dedicated code out of do_test() into test_dctcp().
v4:
- address Martin's comments in v3 (thanks).
- drop 2 patches, keep "type" as the individual arg to start_server_addr,
connect_to_addr and start_server_str.
v3:
- Add 4 new patches, 1-3 are cleanups. 4 adds a new helper.
- address Martin's comments in v2.
v2:
- rebased on commit "selftests/bpf: Add test for the use of new args in
cong_control"
Geliang Tang (6):
selftests/bpf: Drop struct post_socket_opts
selftests/bpf: Add start_server_str helper
selftests/bpf: Use post_socket_cb in connect_to_fd_opts
selftests/bpf: Use start_server_str in bpf_tcp_ca
selftests/bpf: Use connect_to_fd_opts in do_test in bpf_tcp_ca
selftests/bpf: Add post_connect_cb callback
tools/testing/selftests/bpf/network_helpers.c | 39 +++--
tools/testing/selftests/bpf/network_helpers.h | 9 +-
.../selftests/bpf/prog_tests/bpf_tcp_ca.c | 138 +++++++++++++-----
.../bpf/prog_tests/sockopt_inherit.c | 2 +-
.../bpf/test_tcp_check_syncookie_user.c | 4 +-
5 files changed, 133 insertions(+), 59 deletions(-)
--
2.43.0
Sending out v3 for cpu assisted riscv user mode control flow integrity.
v2 [9] was sent a week ago for this riscv usermode control flow integrity
enabling. RFC patchset was (v1) early this year (January) [7].
changes in v3
--------------
envcfg:
logic to pick up base envcfg had a bug where `ENVCFG_CBZE` could have been
picked on per task basis, even though CPU didn't implement it. Fixed in
this series.
dt-bindings:
As suggested, split into separate commit. fixed the messaging that spec is
in public review
arch_is_shadow_stack change:
arch_is_shadow_stack changed to vma_is_shadow_stack
hwprobe:
zicfiss / zicfilp if present will get enumerated in hwprobe
selftests:
As suggested, added object and binary filenames to .gitignore
Selftest binary anyways need to be compiled with cfi enabled compiler which
will make sure that landing pad and shadow stack are enabled. Thus removed
separate enable/disable tests. Cleaned up tests a bit.
changes in v2
---------------
As part of testing effort, compiled a rootfs with shadow stack and landing
pad enabled (libraries and binaries) and booted to shell. As part of long
running tests, I have been able to run some spec 2006 benchmarks [8] (here
link is provided only for list of benchmarks that were tested for long
running tests, excel sheet provided here actually is for some static stats
like code size growth on spec binaries). Thus converting from RFC to
regular patchset.
Securing control-flow integrity for usermode requires following
- Securing forward control flow : All callsites must reach
reach a target that they actually intend to reach.
- Securing backward control flow : All function returns must
return to location where they were called from.
This patch series use riscv cpu extension `zicfilp` [2] to secure forward
control flow and `zicfiss` [2] to secure backward control flow. `zicfilp`
enforces that all indirect calls or jmps must land on a landing pad instr
and label embedded in landing pad instr must match a value programmed in
`x7` register (at callsite via compiler). `zicfiss` introduces shadow stack
which can only be writeable via shadow stack instructions (sspush and
ssamoswap) and thus can't be tampered with via inadvertent stores. More
details about extension can be read from [2] and there are details in
documentation as well (in this patch series).
Using config `CONFIG_RISCV_USER_CFI`, kernel support for riscv control flow
integrity for user mode programs can be compiled in the kernel.
Enabling of control flow integrity for user programs is left to user runtime
(specifically expected from dynamic loader). There has been a lot of earlier
discussion on the enabling topic around x86 shadow stack enabling [3, 4, 5] and
overall consensus had been to let dynamic loader (or usermode) to decide for
enabling the feature.
This patch series introduces arch agnostic `prctls` to enable shadow stack
and indirect branch tracking. And implements them on riscv. arm64 is expected
to implement shadow stack part of these arch agnostic `prctls` [6]
Changes since last time
***********************
Spec changes
------------
- Forward cfi spec has become much simpler. `lpad` instruction is pseudo for
`auipc rd, <20bit_imm>`. `lpad` checks x7 against 20bit embedded in instr.
Thus label width is 20bit.
- Shadow stack management instructions are reduced to
sspush - to push x1/x5 on shadow stack
sspopchk - pops from shadow stack and comapres with x1/x5.
ssamoswap - atomically swap value on shadow stack.
rdssp - reads current shadow stack pointer
- Shadow stack accesses on readonly memory always raise AMO/store page fault.
`sspopchk` is load but if underlying page is readonly, it'll raise a store
page fault. It simplifies hardware and kernel for COW handling for shadow
stack pages.
- riscv defines a new exception type `software check exception` and control flow
violations raise software check exception.
- enabling controls for shadow stack and landing are in xenvcfg CSR and controls
lower privilege mode enabling. As an example senvcfg controls enabling for U and
menvcfg controls enabling for S mode.
core mm shadow stack enabling
-----------------------------
Shadow stack for x86 usermode are now in mainline and thus this patch
series builds on top of that for arch-agnostic mm related changes. Big
thanks and shout out to Rick Edgecombe for that.
selftests
---------
Created some minimal selftests to test the patch series.
[1] - https://lore.kernel.org/lkml/20230213045351.3945824-1-debug@rivosinc.com/
[2] - https://github.com/riscv/riscv-cfi
[3] - https://lore.kernel.org/lkml/ZWHcBq0bJ+15eeKs@finisterre.sirena.org.uk/T/#m…
[4] - https://lore.kernel.org/all/20220130211838.8382-1-rick.p.edgecombe@intel.co…
[5] - https://lore.kernel.org/lkml/CAHk-=wgP5mk3poVeejw16Asbid0ghDt4okHnWaWKLBkRh…
[6] - https://lore.kernel.org/linux-mm/20231122-arm64-gcs-v7-2-201c483bd775@kerne…
[7] - https://lore.kernel.org/lkml/20240125062739.1339782-1-debug@rivosinc.com/
[8] - https://docs.google.com/spreadsheets/d/1_cHGH4ctNVvFRiS7hW9dEGKtXLAJ3aX4Z_i…
[9] - https://lore.kernel.org/lkml/20240329044459.3990638-1-debug@rivosinc.com/
From: Jeff Xu <jeffxu(a)chromium.org>
This is V10 version, it rebases v9 patch to 6.9.rc3.
We also applied and tested mseal() in chrome and chromebook.
------------------------------------------------------------------
This patchset proposes a new mseal() syscall for the Linux kernel.
In a nutshell, mseal() protects the VMAs of a given virtual memory
range against modifications, such as changes to their permission bits.
Modern CPUs support memory permissions, such as the read/write (RW)
and no-execute (NX) bits. Linux has supported NX since the release of
kernel version 2.6.8 in August 2004 [1]. The memory permission feature
improves the security stance on memory corruption bugs, as an attacker
cannot simply write to arbitrary memory and point the code to it. The
memory must be marked with the X bit, or else an exception will occur.
Internally, the kernel maintains the memory permissions in a data
structure called VMA (vm_area_struct). mseal() additionally protects
the VMA itself against modifications of the selected seal type.
Memory sealing is useful to mitigate memory corruption issues where a
corrupted pointer is passed to a memory management system. For
example, such an attacker primitive can break control-flow integrity
guarantees since read-only memory that is supposed to be trusted can
become writable or .text pages can get remapped. Memory sealing can
automatically be applied by the runtime loader to seal .text and
.rodata pages and applications can additionally seal security critical
data at runtime. A similar feature already exists in the XNU kernel
with the VM_FLAGS_PERMANENT [3] flag and on OpenBSD with the
mimmutable syscall [4]. Also, Chrome wants to adopt this feature for
their CFI work [2] and this patchset has been designed to be
compatible with the Chrome use case.
Two system calls are involved in sealing the map: mmap() and mseal().
The new mseal() is an syscall on 64 bit CPU, and with
following signature:
int mseal(void addr, size_t len, unsigned long flags)
addr/len: memory range.
flags: reserved.
mseal() blocks following operations for the given memory range.
1> Unmapping, moving to another location, and shrinking the size,
via munmap() and mremap(), can leave an empty space, therefore can
be replaced with a VMA with a new set of attributes.
2> Moving or expanding a different VMA into the current location,
via mremap().
3> Modifying a VMA via mmap(MAP_FIXED).
4> Size expansion, via mremap(), does not appear to pose any specific
risks to sealed VMAs. It is included anyway because the use case is
unclear. In any case, users can rely on merging to expand a sealed VMA.
5> mprotect() and pkey_mprotect().
6> Some destructive madvice() behaviors (e.g. MADV_DONTNEED) for anonymous
memory, when users don't have write permission to the memory. Those
behaviors can alter region contents by discarding pages, effectively a
memset(0) for anonymous memory.
The idea that inspired this patch comes from Stephen Röttger’s work in
V8 CFI [5]. Chrome browser in ChromeOS will be the first user of this
API.
Indeed, the Chrome browser has very specific requirements for sealing,
which are distinct from those of most applications. For example, in
the case of libc, sealing is only applied to read-only (RO) or
read-execute (RX) memory segments (such as .text and .RELRO) to
prevent them from becoming writable, the lifetime of those mappings
are tied to the lifetime of the process.
Chrome wants to seal two large address space reservations that are
managed by different allocators. The memory is mapped RW- and RWX
respectively but write access to it is restricted using pkeys (or in
the future ARM permission overlay extensions). The lifetime of those
mappings are not tied to the lifetime of the process, therefore, while
the memory is sealed, the allocators still need to free or discard the
unused memory. For example, with madvise(DONTNEED).
However, always allowing madvise(DONTNEED) on this range poses a
security risk. For example if a jump instruction crosses a page
boundary and the second page gets discarded, it will overwrite the
target bytes with zeros and change the control flow. Checking
write-permission before the discard operation allows us to control
when the operation is valid. In this case, the madvise will only
succeed if the executing thread has PKEY write permissions and PKRU
changes are protected in software by control-flow integrity.
Although the initial version of this patch series is targeting the
Chrome browser as its first user, it became evident during upstream
discussions that we would also want to ensure that the patch set
eventually is a complete solution for memory sealing and compatible
with other use cases. The specific scenario currently in mind is
glibc's use case of loading and sealing ELF executables. To this end,
Stephen is working on a change to glibc to add sealing support to the
dynamic linker, which will seal all non-writable segments at startup.
Once this work is completed, all applications will be able to
automatically benefit from these new protections.
In closing, I would like to formally acknowledge the valuable
contributions received during the RFC process, which were instrumental
in shaping this patch:
Jann Horn: raising awareness and providing valuable insights on the
destructive madvise operations.
Liam R. Howlett: perf optimization.
Linus Torvalds: assisting in defining system call signature and scope.
Theo de Raadt: sharing the experiences and insight gained from
implementing mimmutable() in OpenBSD.
MM perf benchmarks
==================
This patch adds a loop in the mprotect/munmap/madvise(DONTNEED) to
check the VMAs’ sealing flag, so that no partial update can be made,
when any segment within the given memory range is sealed.
To measure the performance impact of this loop, two tests are developed.
[8]
The first is measuring the time taken for a particular system call,
by using clock_gettime(CLOCK_MONOTONIC). The second is using
PERF_COUNT_HW_REF_CPU_CYCLES (exclude user space). Both tests have
similar results.
The tests have roughly below sequence:
for (i = 0; i < 1000, i++)
create 1000 mappings (1 page per VMA)
start the sampling
for (j = 0; j < 1000, j++)
mprotect one mapping
stop and save the sample
delete 1000 mappings
calculates all samples.
Below tests are performed on Intel(R) Pentium(R) Gold 7505 @ 2.00GHz,
4G memory, Chromebook.
Based on the latest upstream code:
The first test (measuring time)
syscall__ vmas t t_mseal delta_ns per_vma %
munmap__ 1 909 944 35 35 104%
munmap__ 2 1398 1502 104 52 107%
munmap__ 4 2444 2594 149 37 106%
munmap__ 8 4029 4323 293 37 107%
munmap__ 16 6647 6935 288 18 104%
munmap__ 32 11811 12398 587 18 105%
mprotect 1 439 465 26 26 106%
mprotect 2 1659 1745 86 43 105%
mprotect 4 3747 3889 142 36 104%
mprotect 8 6755 6969 215 27 103%
mprotect 16 13748 14144 396 25 103%
mprotect 32 27827 28969 1142 36 104%
madvise_ 1 240 262 22 22 109%
madvise_ 2 366 442 76 38 121%
madvise_ 4 623 751 128 32 121%
madvise_ 8 1110 1324 215 27 119%
madvise_ 16 2127 2451 324 20 115%
madvise_ 32 4109 4642 534 17 113%
The second test (measuring cpu cycle)
syscall__ vmas cpu cmseal delta_cpu per_vma %
munmap__ 1 1790 1890 100 100 106%
munmap__ 2 2819 3033 214 107 108%
munmap__ 4 4959 5271 312 78 106%
munmap__ 8 8262 8745 483 60 106%
munmap__ 16 13099 14116 1017 64 108%
munmap__ 32 23221 24785 1565 49 107%
mprotect 1 906 967 62 62 107%
mprotect 2 3019 3203 184 92 106%
mprotect 4 6149 6569 420 105 107%
mprotect 8 9978 10524 545 68 105%
mprotect 16 20448 21427 979 61 105%
mprotect 32 40972 42935 1963 61 105%
madvise_ 1 434 497 63 63 115%
madvise_ 2 752 899 147 74 120%
madvise_ 4 1313 1513 200 50 115%
madvise_ 8 2271 2627 356 44 116%
madvise_ 16 4312 4883 571 36 113%
madvise_ 32 8376 9319 943 29 111%
Based on the result, for 6.8 kernel, sealing check adds
20-40 nano seconds, or around 50-100 CPU cycles, per VMA.
In addition, I applied the sealing to 5.10 kernel:
The first test (measuring time)
syscall__ vmas t tmseal delta_ns per_vma %
munmap__ 1 357 390 33 33 109%
munmap__ 2 442 463 21 11 105%
munmap__ 4 614 634 20 5 103%
munmap__ 8 1017 1137 120 15 112%
munmap__ 16 1889 2153 263 16 114%
munmap__ 32 4109 4088 -21 -1 99%
mprotect 1 235 227 -7 -7 97%
mprotect 2 495 464 -30 -15 94%
mprotect 4 741 764 24 6 103%
mprotect 8 1434 1437 2 0 100%
mprotect 16 2958 2991 33 2 101%
mprotect 32 6431 6608 177 6 103%
madvise_ 1 191 208 16 16 109%
madvise_ 2 300 324 24 12 108%
madvise_ 4 450 473 23 6 105%
madvise_ 8 753 806 53 7 107%
madvise_ 16 1467 1592 125 8 108%
madvise_ 32 2795 3405 610 19 122%
The second test (measuring cpu cycle)
syscall__ nbr_vma cpu cmseal delta_cpu per_vma %
munmap__ 1 684 715 31 31 105%
munmap__ 2 861 898 38 19 104%
munmap__ 4 1183 1235 51 13 104%
munmap__ 8 1999 2045 46 6 102%
munmap__ 16 3839 3816 -23 -1 99%
munmap__ 32 7672 7887 216 7 103%
mprotect 1 397 443 46 46 112%
mprotect 2 738 788 50 25 107%
mprotect 4 1221 1256 35 9 103%
mprotect 8 2356 2429 72 9 103%
mprotect 16 4961 4935 -26 -2 99%
mprotect 32 9882 10172 291 9 103%
madvise_ 1 351 380 29 29 108%
madvise_ 2 565 615 49 25 109%
madvise_ 4 872 933 61 15 107%
madvise_ 8 1508 1640 132 16 109%
madvise_ 16 3078 3323 245 15 108%
madvise_ 32 5893 6704 811 25 114%
For 5.10 kernel, sealing check adds 0-15 ns in time, or 10-30
CPU cycles, there is even decrease in some cases.
It might be interesting to compare 5.10 and 6.8 kernel
The first test (measuring time)
syscall__ vmas t_5_10 t_6_8 delta_ns per_vma %
munmap__ 1 357 909 552 552 254%
munmap__ 2 442 1398 956 478 316%
munmap__ 4 614 2444 1830 458 398%
munmap__ 8 1017 4029 3012 377 396%
munmap__ 16 1889 6647 4758 297 352%
munmap__ 32 4109 11811 7702 241 287%
mprotect 1 235 439 204 204 187%
mprotect 2 495 1659 1164 582 335%
mprotect 4 741 3747 3006 752 506%
mprotect 8 1434 6755 5320 665 471%
mprotect 16 2958 13748 10790 674 465%
mprotect 32 6431 27827 21397 669 433%
madvise_ 1 191 240 49 49 125%
madvise_ 2 300 366 67 33 122%
madvise_ 4 450 623 173 43 138%
madvise_ 8 753 1110 357 45 147%
madvise_ 16 1467 2127 660 41 145%
madvise_ 32 2795 4109 1314 41 147%
The second test (measuring cpu cycle)
syscall__ vmas cpu_5_10 c_6_8 delta_cpu per_vma %
munmap__ 1 684 1790 1106 1106 262%
munmap__ 2 861 2819 1958 979 327%
munmap__ 4 1183 4959 3776 944 419%
munmap__ 8 1999 8262 6263 783 413%
munmap__ 16 3839 13099 9260 579 341%
munmap__ 32 7672 23221 15549 486 303%
mprotect 1 397 906 509 509 228%
mprotect 2 738 3019 2281 1140 409%
mprotect 4 1221 6149 4929 1232 504%
mprotect 8 2356 9978 7622 953 423%
mprotect 16 4961 20448 15487 968 412%
mprotect 32 9882 40972 31091 972 415%
madvise_ 1 351 434 82 82 123%
madvise_ 2 565 752 186 93 133%
madvise_ 4 872 1313 442 110 151%
madvise_ 8 1508 2271 763 95 151%
madvise_ 16 3078 4312 1234 77 140%
madvise_ 32 5893 8376 2483 78 142%
From 5.10 to 6.8
munmap: added 250-550 ns in time, or 500-1100 in cpu cycle, per vma.
mprotect: added 200-750 ns in time, or 500-1200 in cpu cycle, per vma.
madvise: added 33-50 ns in time, or 70-110 in cpu cycle, per vma.
In comparison to mseal, which adds 20-40 ns or 50-100 CPU cycles, the
increase from 5.10 to 6.8 is significantly larger, approximately ten
times greater for munmap and mprotect.
When I discuss the mm performance with Brian Makin, an engineer worked
on performance, it was brought to my attention that such a performance
benchmarks, which measuring millions of mm syscall in a tight loop, may
not accurately reflect real-world scenarios, such as that of a database
service. Also this is tested using a single HW and ChromeOS, the data
from another HW or distribution might be different. It might be best
to take this data with a grain of salt.
Change history:
===============
V10:
- rebase to 6.9.rc3 (no code change, resolve conflict only)
- Stephen Röttger applied mseal() in Chrome code, and I tested it on
chromebook, the mseal() is working as designed.
V9:
- remove mmap(PROT_SEAL) and mmap(MAP_SEALABLE) (Linus, Theo de Raadt)
- Update mseal_test to check for prot bit (Liam R. Howlett)
- Update documentation to give more detail on sealing check (Liam R. Howlett)
- Add seal_elf test.
- Add performance measure data.
- mseal_test: fix arm build.
https://lore.kernel.org/all/20240214151130.616240-1-jeffxu@chromium.org/
V8:
- perf optimization in mmap. (Liam R. Howlett)
- add one testcase (test_seal_zero_address)
- Update mseal.rst to add note for MAP_SEALABLE.
https://lore.kernel.org/lkml/20240131175027.3287009-1-jeffxu@chromium.org/
V7:
- fix index.rst (Randy Dunlap)
- fix arm build (Randy Dunlap)
- return EPERM for blocked operations (Theo de Raadt)
https://lore.kernel.org/linux-mm/20240122152905.2220849-2-jeffxu@chromium.o…
V6:
- Drop RFC from subject, Given Linus's general approval.
- Adjust syscall number for mseal (main Jan.11/2024)
- Code style fix (Matthew Wilcox)
- selftest: use ksft macros (Muhammad Usama Anjum)
- Document fix. (Randy Dunlap)
https://lore.kernel.org/all/20240111234142.2944934-1-jeffxu@chromium.org/
V5:
- fix build issue in mseal-Wire-up-mseal-syscall
(Suggested by Linus Torvalds, and Greg KH)
- updates on selftest.
https://lore.kernel.org/lkml/20240109154547.1839886-1-jeffxu@chromium.org/#r
V4:
(Suggested by Linus Torvalds)
- new signature: mseal(start,len,flags)
- 32 bit is not supported. vm_seal is removed, use vm_flags instead.
- single bit in vm_flags for sealed state.
- CONFIG_MSEAL kernel config is removed.
- single bit of PROT_SEAL in the "Prot" field of mmap().
Other changes:
- update selftest (Suggested by Muhammad Usama Anjum)
- update documentation.
https://lore.kernel.org/all/20240104185138.169307-1-jeffxu@chromium.org/
V3:
- Abandon per-syscall approach, (Suggested by Linus Torvalds).
- Organize sealing types around their functionality, such as
MM_SEAL_BASE, MM_SEAL_PROT_PKEY.
- Extend the scope of sealing from calls originated in userspace to
both kernel and userspace. (Suggested by Linus Torvalds)
- Add seal type support in mmap(). (Suggested by Pedro Falcato)
- Add a new sealing type: MM_SEAL_DISCARD_RO_ANON to prevent
destructive operations of madvise. (Suggested by Jann Horn and
Stephen Röttger)
- Make sealed VMAs mergeable. (Suggested by Jann Horn)
- Add MAP_SEALABLE to mmap()
- Add documentation - mseal.rst
https://lore.kernel.org/linux-mm/20231212231706.2680890-2-jeffxu@chromium.o…
v2:
Use _BITUL to define MM_SEAL_XX type.
Use unsigned long for seal type in sys_mseal() and other functions.
Remove internal VM_SEAL_XX type and convert_user_seal_type().
Remove MM_ACTION_XX type.
Remove caller_origin(ON_BEHALF_OF_XX) and replace with sealing bitmask.
Add more comments in code.
Add a detailed commit message.
https://lore.kernel.org/lkml/20231017090815.1067790-1-jeffxu@chromium.org/
v1:
https://lore.kernel.org/lkml/20231016143828.647848-1-jeffxu@chromium.org/
----------------------------------------------------------------
[1] https://kernelnewbies.org/Linux_2_6_8
[2] https://v8.dev/blog/control-flow-integrity
[3] https://github.com/apple-oss-distributions/xnu/blob/1031c584a5e37aff177559b…
[4] https://man.openbsd.org/mimmutable.2
[5] https://docs.google.com/document/d/1O2jwK4dxI3nRcOJuPYkonhTkNQfbmwdvxQMyXge…
[6] https://lore.kernel.org/lkml/CAG48ez3ShUYey+ZAFsU2i1RpQn0a5eOs2hzQ426Fkcgnf…
[7] https://lore.kernel.org/lkml/20230515130553.2311248-1-jeffxu@chromium.org/
[8] https://github.com/peaktocreek/mmperf
Jeff Xu (5):
mseal: Wire up mseal syscall
mseal: add mseal syscall
selftest mm/mseal memory sealing
mseal:add documentation
selftest mm/mseal read-only elf memory segment
Documentation/userspace-api/index.rst | 1 +
Documentation/userspace-api/mseal.rst | 199 ++
arch/alpha/kernel/syscalls/syscall.tbl | 1 +
arch/arm/tools/syscall.tbl | 1 +
arch/arm64/include/asm/unistd.h | 2 +-
arch/arm64/include/asm/unistd32.h | 2 +
arch/m68k/kernel/syscalls/syscall.tbl | 1 +
arch/microblaze/kernel/syscalls/syscall.tbl | 1 +
arch/mips/kernel/syscalls/syscall_n32.tbl | 1 +
arch/mips/kernel/syscalls/syscall_n64.tbl | 1 +
arch/mips/kernel/syscalls/syscall_o32.tbl | 1 +
arch/parisc/kernel/syscalls/syscall.tbl | 1 +
arch/powerpc/kernel/syscalls/syscall.tbl | 1 +
arch/s390/kernel/syscalls/syscall.tbl | 1 +
arch/sh/kernel/syscalls/syscall.tbl | 1 +
arch/sparc/kernel/syscalls/syscall.tbl | 1 +
arch/x86/entry/syscalls/syscall_32.tbl | 1 +
arch/x86/entry/syscalls/syscall_64.tbl | 1 +
arch/xtensa/kernel/syscalls/syscall.tbl | 1 +
include/linux/syscalls.h | 1 +
include/uapi/asm-generic/unistd.h | 5 +-
kernel/sys_ni.c | 1 +
mm/Makefile | 4 +
mm/internal.h | 37 +
mm/madvise.c | 12 +
mm/mmap.c | 31 +-
mm/mprotect.c | 10 +
mm/mremap.c | 31 +
mm/mseal.c | 307 ++++
tools/testing/selftests/mm/.gitignore | 2 +
tools/testing/selftests/mm/Makefile | 2 +
tools/testing/selftests/mm/mseal_test.c | 1836 +++++++++++++++++++
tools/testing/selftests/mm/seal_elf.c | 183 ++
33 files changed, 2678 insertions(+), 3 deletions(-)
create mode 100644 Documentation/userspace-api/mseal.rst
create mode 100644 mm/mseal.c
create mode 100644 tools/testing/selftests/mm/mseal_test.c
create mode 100644 tools/testing/selftests/mm/seal_elf.c
--
2.44.0.683.g7961c838ac-goog
Hello,
this was reported in https://lore.kernel.org/all/202404151340.5b152d96-lkp@intel.com/
since we still observed same failure after the commit is merged in mainline,
we just report again FYI.
kernel test robot noticed "kunit.VCAP_API_DebugFS_Testsuite.vcap_api_show_admin_raw_test.fail" on:
commit: 3a35c13007dea132a65f07de05c26b87837fadc2 ("kunit: Handle test faults")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
[test failed in linus/master 6e51b4b5bbc07e52b226017936874715629932d1]
[test failed on linux-next/master 632483ea8004edfadd035de36e1ab2c7c4f53158]
in testcase: kunit
version:
with following parameters:
group: group-03
compiler: gcc-13
test machine: 4 threads 1 sockets Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz (Ivy Bridge) with 8G memory
(please refer to attached dmesg/kmsg for entire log/backtrace)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang(a)intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202405241710.148db8b0-oliver.sang@intel.com
[ 116.216583] # vcap_api_show_admin_raw_test: EXPECTATION FAILED at drivers/net/ethernet/microchip/vcap/vcap_api_debugfs_kunit.c:377
Expected test_expected == test_pr_buffer[0], but
test_expected == " addr: 786, X6 rule, keysets: VCAP_KFS_MAC_ETYPE
"
test_pr_buffer[0] == ""
[ 116.222467] not ok 2 vcap_api_show_admin_raw_test
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240524/202405241710.148db8b0-oliv…
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
From: Geliang Tang <tanggeliang(a)kylinos.cn>
This patchset uses post_socket_cb and post_connect_cb callbacks of struct
network_helper_opts to refactor do_test() in bpf_tcp_ca.c to move dctcp
test dedicated code out of do_test() into test_dctcp().
v3:
- Add 4 new patches, 1-3 are cleanups. 4 adds a new helper.
- address Martin's comments in v2.
v2:
- rebased on commit "selftests/bpf: Add test for the use of new args in
cong_control"
Geliang Tang (8):
selftests/bpf: Drop struct post_socket_opts
selftests/bpf: Drop type parameter of start_server_addr
selftests/bpf: Drop type parameter of connect_to_addr
selftests/bpf: Add start_server_str helper
selftests/bpf: Use post_socket_cb in connect_to_fd_opts
selftests/bpf: Use start_server_str in bpf_tcp_ca
selftests/bpf: Use connect_to_fd_opts in do_test in bpf_tcp_ca
selftests/bpf: Add post_connect_cb callback
tools/testing/selftests/bpf/network_helpers.c | 56 ++++---
tools/testing/selftests/bpf/network_helpers.h | 13 +-
.../selftests/bpf/prog_tests/bpf_tcp_ca.c | 138 +++++++++++++-----
.../selftests/bpf/prog_tests/cls_redirect.c | 7 +-
.../testing/selftests/bpf/prog_tests/mptcp.c | 2 +-
.../selftests/bpf/prog_tests/sk_assign.c | 13 +-
.../selftests/bpf/prog_tests/sock_addr.c | 23 ++-
.../bpf/prog_tests/sockopt_inherit.c | 4 +-
.../bpf/test_tcp_check_syncookie_user.c | 10 +-
9 files changed, 179 insertions(+), 87 deletions(-)
--
2.43.0
Currrentl a 32 bit 1u value is being shifted more than 32 bits causing
overflow and incorrect checking of bits 32-63. Fix this by using the
BIT_ULL macro for shifting bits.
Detected by cppcheck:
sev_init2_tests.c:108:34: error: Shifting 32-bit value by 63 bits is
undefined behaviour [shiftTooManyBits]
Fixes: dfc083a181ba ("selftests: kvm: add tests for KVM_SEV_INIT2")
Signed-off-by: Colin Ian King <colin.i.king(a)gmail.com>
---
tools/testing/selftests/kvm/x86_64/sev_init2_tests.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/kvm/x86_64/sev_init2_tests.c b/tools/testing/selftests/kvm/x86_64/sev_init2_tests.c
index 7a4a61be119b..ea09f7a06aa4 100644
--- a/tools/testing/selftests/kvm/x86_64/sev_init2_tests.c
+++ b/tools/testing/selftests/kvm/x86_64/sev_init2_tests.c
@@ -105,11 +105,11 @@ void test_features(uint32_t vm_type, uint64_t supported_features)
int i;
for (i = 0; i < 64; i++) {
- if (!(supported_features & (1u << i)))
+ if (!(supported_features & BIT_ULL(i)))
test_init2_invalid(vm_type,
&(struct kvm_sev_init){ .vmsa_features = BIT_ULL(i) },
"unknown feature");
- else if (KNOWN_FEATURES & (1u << i))
+ else if (KNOWN_FEATURES & BIT_ULL(u))
test_init2(vm_type,
&(struct kvm_sev_init){ .vmsa_features = BIT_ULL(i) });
}
--
2.39.2
Dear Kernel Community,
This patch introduces a `.gitlab-ci` file along with a `ci/` folder, defining a
basic test pipeline triggered by code pushes to a GitLab-CI instance. This
initial version includes static checks (checkpatch and smatch for now) and build
tests across various architectures and configurations. It leverages an
integrated cache for efficient build times and introduces a flexible 'scenarios'
mechanism for subsystem-specific extensions.
tl;dr: check this video to see a quick demo: https://youtu.be/TWiTjhjOuzg,
but don't forget to check the "Motivation for this work" below. Your feedback,
whether a simple thumbs up or down, is crucial to determine if it is worthwhile
to pursue this initiative.
GitLab is an Open Source platform that includes integrated CI/CD. The pipeline
provided in this patch is designed to work out-of-the-box with any GitLab
instance, including the gitlab.com Free Tier. If you reach the limits of the
Free Tier, consider using community instances like https://gitlab.freedesktop.org/.
Alternatively, you can set up a local runner for more flexibility. The
bootstrap-gitlab-runner.sh script included with this patch simplifies this
process, enabling you to run tests on your preferred infrastructure, including
your own machine.
For detailed information, please refer to the documentation included in the
patch, or check the rendered version here: https://koike.pages.collabora.com/-/linux/-/jobs/298498/artifacts/artifacts… .
Motivation for this Work
========================
We all know tests are a major topic in the community, so let's mention the
specificities of this approach:
1. **Built-in User Interface:** GitLab CI/CD is growing in popularity and has an
user-friendly interface. Our experience with the upstream DRM-CI in the kernel
tree (see this blog post [https://www.collabora.com/news-and-blog/blog/2024/02/08/drm-ci-a-gitlab-ci-…] )
has provided insights into how such a system can benefit the wider community.
2. **Distributed Infrastructure:**
The proposed GitLab-CI pipeline is designed with a distributed infrastructure
model, being possible to run in any gitlab instance.
3. **Reduce regressions:** Fostering a culture where people habitually run
validated tests and post their results can prevent many issues in post-merge
tests.
4. **Collaborative Testing Environment:** The kernel community is already
engaged in numerous testing efforts, including various GitLab-CI pipelines such
as DRM-CI, which I maintain, along with other solutions like KernelCI and
BPF-CI. This proposal is designed to further stimulate contributions to the
evolving testing landscape. Our goal is to establish a comprehensive suite of
common tools and files.
5. **Ownership of QA:**
Discrepancies between kernel code and outdated tests often lead to misattributed
failures, complicating regression tracking. This issue, often arising from
neglected or deprioritized test updates, creates uncertainty about the source of
failures. Adopting an "always green pipeline" approach, as detailed in this
patch's documentation, encourages timely maintenance and validation of tests.
This ensures that testing accurately reflects the current state of the kernel,
thereby improving the effectiveness of our QA processes.
Additionally, if we discover that this method isn't working for us, we can
easily remove it from the codebase, as it is primarily contained within the ci/
folder.
Future Work
===========
**Expanding Static Checks:**
We have the opportunity to integrate a variety of static analysis tools,
including:
- dtbs_checks
- sparse
- yamllint
- dt-doc-validate
- coccicheck
**Adding Userspace Tests on VMs:**
To further our testing, we can implement userspace tests that run on virtual
machines (VMs), such as:
- kselftests
- kunit tests
- Subsystem-specific tests, customizable in the scenarios.
**Leveraging External Test Labs:**
We can extend our testing to external labs, similar to what DRM-CI currently
does. This includes:
- Lava labs
- Bare metal labs
- Using KernelCI-provided labs
**Other integrations**
- Submit results to KCIDB
**Lightweight Implementation for All Developers:**
We aim to design these tests to be lightweight, ensuring developers with limited
computing resources can still run essential tests. Resource-intensive tests can
be set to trigger manually, rather than automatically, to accommodate diverse
development environments.
Chat Discussions
================
For those interested in further discussions:
**Join Our Slack Channel:**
We have a Slack channel, #gitlab-ci, on the KernelCI Slack instance https://kernelci.slack.com/ .
Feel free to join and contribute to the conversation. The KernelCI team has
weekly calls where we also discuss the GitLab-CI pipeline.
**Acknowledgments:**
A special thanks to Nikolai Kondrashov, Tales da Aparecida - both from Red Hat -
and KernelCI community for their valuable feedback and support in this proposal.
I eagerly await your thoughts and suggestions on this initiative.
Also, if you want to see this initiave move faster, we are happy to discuss
funding options.
Best regards,
Helen Koike
Helen Koike (3):
kci-gitlab: Introducing GitLab-CI Pipeline for Kernel Testing
kci-gitlab: Add documentation
kci-gitlab: docs: Add images
.gitlab-ci.yml | 2 +
Documentation/ci/gitlab-ci/gitlab-ci.rst | 404 ++++++++++++++++++
.../ci/gitlab-ci/images/job-matrix.png | Bin 0 -> 159752 bytes
.../gitlab-ci/images/new-project-runner.png | Bin 0 -> 607737 bytes
.../ci/gitlab-ci/images/pipelines-on-push.png | Bin 0 -> 532143 bytes
.../ci/gitlab-ci/images/the-pipeline.png | Bin 0 -> 91675 bytes
.../ci/gitlab-ci/images/variables.png | Bin 0 -> 277518 bytes
Documentation/index.rst | 7 +
MAINTAINERS | 9 +
ci/gitlab-ci/bootstrap-gitlab-runner.sh | 55 +++
ci/gitlab-ci/ci-scripts/build-docs.sh | 35 ++
ci/gitlab-ci/ci-scripts/build-kernel.sh | 35 ++
ci/gitlab-ci/ci-scripts/ici-functions.sh | 104 +++++
ci/gitlab-ci/ci-scripts/install-smatch.sh | 13 +
.../ci-scripts/parse_commit_message.sh | 27 ++
ci/gitlab-ci/ci-scripts/run-checkpatch.sh | 19 +
ci/gitlab-ci/ci-scripts/run-smatch.sh | 45 ++
ci/gitlab-ci/docker-compose.yaml | 18 +
ci/gitlab-ci/linux.code-workspace | 11 +
ci/gitlab-ci/yml/build.yml | 43 ++
ci/gitlab-ci/yml/cache.yml | 26 ++
ci/gitlab-ci/yml/container.yml | 36 ++
ci/gitlab-ci/yml/gitlab-ci.yml | 71 +++
ci/gitlab-ci/yml/kernel-combinations.yml | 18 +
ci/gitlab-ci/yml/scenarios.yml | 12 +
ci/gitlab-ci/yml/scenarios/file-systems.yml | 21 +
ci/gitlab-ci/yml/scenarios/media.yml | 21 +
ci/gitlab-ci/yml/scenarios/network.yml | 21 +
ci/gitlab-ci/yml/static-checks.yml | 21 +
29 files changed, 1074 insertions(+)
create mode 100644 .gitlab-ci.yml
create mode 100644 Documentation/ci/gitlab-ci/gitlab-ci.rst
create mode 100644 Documentation/ci/gitlab-ci/images/job-matrix.png
create mode 100644 Documentation/ci/gitlab-ci/images/new-project-runner.png
create mode 100644 Documentation/ci/gitlab-ci/images/pipelines-on-push.png
create mode 100644 Documentation/ci/gitlab-ci/images/the-pipeline.png
create mode 100644 Documentation/ci/gitlab-ci/images/variables.png
create mode 100755 ci/gitlab-ci/bootstrap-gitlab-runner.sh
create mode 100755 ci/gitlab-ci/ci-scripts/build-docs.sh
create mode 100755 ci/gitlab-ci/ci-scripts/build-kernel.sh
create mode 100644 ci/gitlab-ci/ci-scripts/ici-functions.sh
create mode 100755 ci/gitlab-ci/ci-scripts/install-smatch.sh
create mode 100755 ci/gitlab-ci/ci-scripts/parse_commit_message.sh
create mode 100755 ci/gitlab-ci/ci-scripts/run-checkpatch.sh
create mode 100755 ci/gitlab-ci/ci-scripts/run-smatch.sh
create mode 100644 ci/gitlab-ci/docker-compose.yaml
create mode 100644 ci/gitlab-ci/linux.code-workspace
create mode 100644 ci/gitlab-ci/yml/build.yml
create mode 100644 ci/gitlab-ci/yml/cache.yml
create mode 100644 ci/gitlab-ci/yml/container.yml
create mode 100644 ci/gitlab-ci/yml/gitlab-ci.yml
create mode 100644 ci/gitlab-ci/yml/kernel-combinations.yml
create mode 100644 ci/gitlab-ci/yml/scenarios.yml
create mode 100644 ci/gitlab-ci/yml/scenarios/file-systems.yml
create mode 100644 ci/gitlab-ci/yml/scenarios/media.yml
create mode 100644 ci/gitlab-ci/yml/scenarios/network.yml
create mode 100644 ci/gitlab-ci/yml/static-checks.yml
--
2.40.1
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit 907f33028871fa7c9a3db1efd467b78ef82cce20 ]
The standard library perror() function provides a convenient way to print
an error message based on the current errno but this doesn't play nicely
with KTAP output. Provide a helper which does an equivalent thing in a KTAP
compatible format.
nolibc doesn't have a strerror() and adding the table of strings required
doesn't seem like a good fit for what it's trying to do so when we're using
that only print the errno.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Reviewed-by: Kees Cook <keescook(a)chromium.org>
Signed-off-by: Shuah Khan <skhan(a)linuxfoundation.org>
Stable-dep-of: 071af0c9e582 ("selftests: timers: Convert posix_timers test to generate KTAP output")
Signed-off-by: Edward Liaw <edliaw(a)google.com>
---
tools/testing/selftests/kselftest.h | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/tools/testing/selftests/kselftest.h b/tools/testing/selftests/kselftest.h
index e8eecbc83a60..ad7b97e16f37 100644
--- a/tools/testing/selftests/kselftest.h
+++ b/tools/testing/selftests/kselftest.h
@@ -48,6 +48,7 @@
#include <stdlib.h>
#include <unistd.h>
#include <stdarg.h>
+#include <string.h>
#include <stdio.h>
#include <sys/utsname.h>
#endif
@@ -156,6 +157,19 @@ static inline void ksft_print_msg(const char *msg, ...)
va_end(args);
}
+static inline void ksft_perror(const char *msg)
+{
+#ifndef NOLIBC
+ ksft_print_msg("%s: %s (%d)\n", msg, strerror(errno), errno);
+#else
+ /*
+ * nolibc doesn't provide strerror() and it seems
+ * inappropriate to add one, just print the errno.
+ */
+ ksft_print_msg("%s: %d)\n", msg, errno);
+#endif
+}
+
static inline void ksft_test_result_pass(const char *msg, ...)
{
int saved_errno = errno;
--
2.45.0.215.g3402c0e53f-goog
Currently array buf is not being initialized and so garbage values
on the stack are being used in the mq_send calls. Initialize the
values in the array to zero.
Cleans up cppcheck warning:
mq_perf_tests.c:334:25: error: Uninitialized variable: buff [uninitvar]
Signed-off-by: Colin Ian King <colin.i.king(a)gmail.com>
---
tools/testing/selftests/mqueue/mq_perf_tests.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/mqueue/mq_perf_tests.c b/tools/testing/selftests/mqueue/mq_perf_tests.c
index 5c16159d0bcd..bd561dc785d8 100644
--- a/tools/testing/selftests/mqueue/mq_perf_tests.c
+++ b/tools/testing/selftests/mqueue/mq_perf_tests.c
@@ -322,7 +322,7 @@ void *fake_cont_thread(void *arg)
void *cont_thread(void *arg)
{
- char buff[MSG_SIZE];
+ char buff[MSG_SIZE] = { };
int i, priority;
for (i = 0; i < num_cpus_to_pin; i++)
--
2.39.2
Testing a network device that has large numbers of bytes/packets may
overflow. Using stats64 when comparing fixes this problem.
I tripped on this while iterating on a qstats patch for mlx5. See below
for confirmation without my added code that this is a bug.
Before this patch (with added debugging output):
$ NETIF=eth0 tools/testing/selftests/drivers/net/stats.py
KTAP version 1
1..4
ok 1 stats.check_pause
ok 2 stats.check_fec
rstat: 481708634 qstat: 666201639514 key: tx-bytes
not ok 3 stats.pkt_byte_sum
ok 4 stats.qstat_by_ifindex
Note the huge delta above ^^^ in the rtnl vs qstats.
After this patch:
$ NETIF=eth0 tools/testing/selftests/drivers/net/stats.py
KTAP version 1
1..4
ok 1 stats.check_pause
ok 2 stats.check_fec
ok 3 stats.pkt_byte_sum
ok 4 stats.qstat_by_ifindex
It looks like rtnl_fill_stats in net/core/rtnetlink.c will attempt to
copy the 64bit stats into a 32bit structure which is probably why this
behavior is occurring.
To show this is happening, you can get the underlying stats that the
stats.py test uses like this:
$ ./cli.py --spec ../../../Documentation/netlink/specs/rt_link.yaml \
--do getlink --json '{"ifi-index": 7}'
And examine the output (heavily snipped to show relevant fields):
'stats': {
'multicast': 3739197,
'rx-bytes': 1201525399,
'rx-packets': 56807158,
'tx-bytes': 492404458,
'tx-packets': 1200285371,
'stats64': {
'multicast': 3739197,
'rx-bytes': 35561263767,
'rx-packets': 56807158,
'tx-bytes': 666212335338,
'tx-packets': 1200285371,
The stats.py test prior to this patch was using the 'stats' structure
above, which matches the failure output on my system.
Comparing side by side, rx-bytes and tx-bytes, and getting ethtool -S
output:
rx-bytes stats: 1201525399
rx-bytes stats64: 35561263767
rx-bytes ethtool: 36203402638
tx-bytes stats: 492404458
tx-bytes stats64: 666212335338
tx-bytes ethtool: 666215360113
Note that the above was taken from a system with an mlx5 NIC, which only
exposes ndo_get_stats64.
Based on the ethtool output and qstat output, it appears that stats.py
should be updated to use the 'stats64' structure for accurate
comparisons when packet/byte counters get very large.
To confirm that this was not related to the qstats code I was iterating
on, I booted a kernel without my driver changes and re-ran the test
which shows the qstats are skipped (as they don't exist for mlx5):
NETIF=eth0 tools/testing/selftests/drivers/net/stats.py
KTAP version 1
1..4
ok 1 stats.check_pause
ok 2 stats.check_fec
ok 3 stats.pkt_byte_sum # SKIP qstats not supported by the device
ok 4 stats.qstat_by_ifindex # SKIP No ifindex supports qstats
But, fetching the stats using the CLI
$ ./cli.py --spec ../../../Documentation/netlink/specs/rt_link.yaml \
--do getlink --json '{"ifi-index": 7}'
Shows the same issue (heavily snipped for relevant fields only):
'stats': {
'multicast': 105489,
'rx-bytes': 530879526,
'rx-packets': 751415,
'tx-bytes': 2510191396,
'tx-packets': 27700323,
'stats64': {
'multicast': 105489,
'rx-bytes': 530879526,
'rx-packets': 751415,
'tx-bytes': 15395093284,
'tx-packets': 27700323,
Comparing side by side with ethtool -S on the unmodified mlx5 driver:
tx-bytes stats: 2510191396
tx-bytes stats64: 15395093284
tx-bytes ethtool: 17718435810
Fixes: f0e6c86e4bab ("testing: net-drv: add a driver test for stats reporting")
Signed-off-by: Joe Damato <jdamato(a)fastly.com>
---
tools/testing/selftests/drivers/net/stats.py | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/drivers/net/stats.py b/tools/testing/selftests/drivers/net/stats.py
index 7a7b16b180e2..820b8e0a22c6 100755
--- a/tools/testing/selftests/drivers/net/stats.py
+++ b/tools/testing/selftests/drivers/net/stats.py
@@ -69,7 +69,7 @@ def pkt_byte_sum(cfg) -> None:
return 0
for _ in range(10):
- rtstat = rtnl.getlink({"ifi-index": cfg.ifindex})['stats']
+ rtstat = rtnl.getlink({"ifi-index": cfg.ifindex})['stats64']
if stat_cmp(rtstat, qstat) < 0:
raise Exception("RTNL stats are lower, fetched later")
qstat = get_qstat(cfg)
--
2.25.1
The cbo and which-cpu hwprobe selftests leave their artifacts in the
kernel tree and end up being tracked by git. Add the binaries to the
hwprobe selftest .gitignore so this no longer happens.
Signed-off-by: Charlie Jenkins <charlie(a)rivosinc.com>
Fixes: a29e2a48afe3 ("RISC-V: selftests: Add CBO tests")
Fixes: ef7d6abb2cf5 ("RISC-V: selftests: Add which-cpus hwprobe test")
---
tools/testing/selftests/riscv/hwprobe/.gitignore | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tools/testing/selftests/riscv/hwprobe/.gitignore b/tools/testing/selftests/riscv/hwprobe/.gitignore
index 8113dc3bdd03..6e384e80ea1a 100644
--- a/tools/testing/selftests/riscv/hwprobe/.gitignore
+++ b/tools/testing/selftests/riscv/hwprobe/.gitignore
@@ -1 +1,3 @@
hwprobe
+cbo
+which-cpus
---
base-commit: ed30a4a51bb196781c8058073ea720133a65596f
change-id: 20240425-gitignore_hwprobe_artifacts-fb0f2cd3509c
--
- Charlie
Hi all,
We are students from the State University of Campinas with an interest in contributing to the kernel. We are part of LKCAMP, a student group that focuses on researching and contributing to open source software. Our group has organized kernel hackathons in the past [1] that resulted in sucessful contributions, and we would like to continue the effort this year.
This time, we were thinking about writing KUnit tests for data structures in `lib/` (or converting existing lib test code), similarly to our previous hackathon. We are currently considering a few candidates:
- lib/kfifo.c
- lib/llist.c
- tools/testing/scatterlist
- tools/testing/radix-tree
We would like to know if these are good candidates, and also ask for suggestions of other code that could benefit from having KUnit tests.
Thanks!
Artur Alves
[1] https://lore.kernel.org/dri-devel/20211011152333.gm5jkaog6b6nbv5w@notapiano/
From: Geliang Tang <tanggeliang(a)kylinos.cn>
bpf_prog5 and bpf_prog7 are removed from progs/test_sockmap_kern.h in
commit d79a32129b21 ("bpf: Selftests, remove prints from sockmap tests"),
now there are only 9 progs in it, not 11:
SEC("sk_skb1")
int bpf_prog1(struct __sk_buff *skb)
SEC("sk_skb2")
int bpf_prog2(struct __sk_buff *skb)
SEC("sk_skb3")
int bpf_prog3(struct __sk_buff *skb)
SEC("sockops")
int bpf_sockmap(struct bpf_sock_ops *skops)
SEC("sk_msg1")
int bpf_prog4(struct sk_msg_md *msg)
SEC("sk_msg2")
int bpf_prog6(struct sk_msg_md *msg)
SEC("sk_msg3")
int bpf_prog8(struct sk_msg_md *msg)
SEC("sk_msg4")
int bpf_prog9(struct sk_msg_md *msg)
SEC("sk_msg5")
int bpf_prog10(struct sk_msg_md *msg)
This patch updates the array sizes of prog_fd[], prog_attach_type[] and
prog_type[] from 11 to 9 accordingly.
Fixes: d79a32129b21 ("bpf: Selftests, remove prints from sockmap tests")
Signed-off-by: Geliang Tang <tanggeliang(a)kylinos.cn>
---
tools/testing/selftests/bpf/test_sockmap.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/tools/testing/selftests/bpf/test_sockmap.c b/tools/testing/selftests/bpf/test_sockmap.c
index 92752f5eeded..4499b3cfc3a6 100644
--- a/tools/testing/selftests/bpf/test_sockmap.c
+++ b/tools/testing/selftests/bpf/test_sockmap.c
@@ -63,7 +63,7 @@ int passed;
int failed;
int map_fd[9];
struct bpf_map *maps[9];
-int prog_fd[11];
+int prog_fd[9];
int txmsg_pass;
int txmsg_redir;
@@ -1793,8 +1793,6 @@ int prog_attach_type[] = {
BPF_SK_MSG_VERDICT,
BPF_SK_MSG_VERDICT,
BPF_SK_MSG_VERDICT,
- BPF_SK_MSG_VERDICT,
- BPF_SK_MSG_VERDICT,
};
int prog_type[] = {
@@ -1807,8 +1805,6 @@ int prog_type[] = {
BPF_PROG_TYPE_SK_MSG,
BPF_PROG_TYPE_SK_MSG,
BPF_PROG_TYPE_SK_MSG,
- BPF_PROG_TYPE_SK_MSG,
- BPF_PROG_TYPE_SK_MSG,
};
static int populate_progs(char *bpf_file)
--
2.43.0
The va_high_addr_switch memory selftest tests out some corner cases
related to allocation and page/hugepage faulting around the switch
boundary. Currently, the page size and hugepage size have been statically
defined. Post FEAT_LPA2, the Aarch64 Linux kernel adds support for 4k and
16k translation granules on higher addresses; we restructure the test to
support the same. In addition, we avoid invocation of the binary twice,
in the shell script, to reduce test noise.
Dev Jain (2):
selftests/mm: va_high_addr_switch: Reduce test noise
selftests/mm: va_high_addr_switch: Dynamically initialize testcases to
enable LPA2 testing
.../selftests/mm/va_high_addr_switch.c | 454 +++++++++---------
.../selftests/mm/va_high_addr_switch.sh | 4 -
2 files changed, 232 insertions(+), 226 deletions(-)
--
2.34.1
From: donsheng <dongsheng.x.zhang(a)intel.com>
If the host was booted with the "default_hugepagesz=1G" kernel command-line
parameter, running the NX hugepage test will fail with error "Invalid argument"
at the TEST_ASSERT line in kvm_util.c's __vm_mem_region_delete() function:
static void __vm_mem_region_delete(struct kvm_vm *vm,
struct userspace_mem_region *region,
bool unlink)
{
int ret;
...
ret = munmap(region->mmap_start, region->mmap_size);
TEST_ASSERT(!ret, __KVM_SYSCALL_ERROR("munmap()", ret));
...
}
NX hugepage test creates a VM with a data slot of 6M size backed with huge
pages. If the default hugetlb page size is set to 1G, calling mmap() with
MAP_HUGETLB and a length of 6M will succeed but calling its matching munmap()
will fail. Documentation/admin-guide/mm/hugetlbpage.rst specifies this behavior:
"Syscalls that operate on memory backed by hugetlb pages only have their lengths
aligned to the native page size of the processor; they will normally fail with
errno set to EINVAL or exclude hugetlb pages that extend beyond the length if
not hugepage aligned. For example, munmap(2) will fail if memory is backed by
a hugetlb page and the length is smaller than the hugepage size."
Explicitly use MAP_HUGE_2MB in conjunction with MAP_HUGETLB to fix the issue.
Signed-off-by: donsheng <dongsheng.x.zhang(a)intel.com>
Suggested-by: Zide Chen <zide.chen(a)intel.com>
---
tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
index 17bbb96fc4df..146e9033e206 100644
--- a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
+++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
@@ -129,7 +129,7 @@ void run_test(int reclaim_period_ms, bool disable_nx_huge_pages,
vcpu = vm_vcpu_add(vm, 0, guest_code);
- vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS_HUGETLB,
+ vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS_HUGETLB_2MB,
HPAGE_GPA, HPAGE_SLOT,
HPAGE_SLOT_NPAGES, 0);
--
2.43.0
From: Geliang Tang <tanggeliang(a)kylinos.cn>
This patchset uses post_socket_cb and post_connect_cb callbacks of struct
network_helper_opts to refactor do_test() in bpf_tcp_ca.c to move dctcp
test dedicated code out of do_test() into test_dctcp().
Patch 3 adds a new member in post_socket_opts and patch 4 adds a new
callback in network_helper_opts. I'm not sure if this is going too far.
v2:
- rebased on commit "selftests/bpf: Add test for the use of new args in
cong_control"
Geliang Tang (4):
selftests/bpf: Use post_socket_cb in connect_to_fd_opts
selftests/bpf: Use start_server_addr in bpf_tcp_ca
selftests/bpf: Use connect_to_fd_opts in do_test in bpf_tcp_ca
selftests/bpf: Add post_connect_cb callback
tools/testing/selftests/bpf/network_helpers.c | 13 +-
tools/testing/selftests/bpf/network_helpers.h | 8 +-
.../selftests/bpf/prog_tests/bpf_tcp_ca.c | 111 ++++++++++++------
3 files changed, 86 insertions(+), 46 deletions(-)
--
2.43.0
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
The kprobe_eventname.tc test checks if a function with .isra. can have a
kprobe attached to it. It loops through the kallsyms file for all the
functions that have the .isra. name, and checks if it exists in the
available_filter_functions file, and if it does, it uses it to attach a
kprobe to it.
The issue is that kprobes can not attach to functions that are listed more
than once in available_filter_functions. With the latest kernel, the
function that is found is: rapl_event_update.isra.0
# grep rapl_event_update.isra.0 /sys/kernel/tracing/available_filter_functions
rapl_event_update.isra.0
rapl_event_update.isra.0
It is listed twice. This causes the attached kprobe to it to fail which in
turn fails the test. Instead of just picking the function function that is
found in available_filter_functions, pick the first one that is listed
only once in available_filter_functions.
Cc: stable(a)vger.kernel.org
Fixes: 604e3548236de ("selftests/ftrace: Select an existing function in kprobe_eventname test")
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
.../testing/selftests/ftrace/test.d/kprobe/kprobe_eventname.tc | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_eventname.tc b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_eventname.tc
index 1f6981ef7afa..ba19b81cef39 100644
--- a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_eventname.tc
+++ b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_eventname.tc
@@ -30,7 +30,8 @@ find_dot_func() {
fi
grep " [tT] .*\.isra\..*" /proc/kallsyms | cut -f 3 -d " " | while read f; do
- if grep -s $f available_filter_functions; then
+ cnt=`grep -s $f available_filter_functions | wc -l`;
+ if [ $cnt -eq 1 ]; then
echo $f
break
fi
--
2.43.0
Post FEAT_LPA2, Aarch64 extends the 4KB and 16KB translation granule to
large virtual addresses. Currently, the test is being skipped for said
granule sizes, because the page sizes have been statically defined; to
work around that would mean breaking the nice array of structs used for
adding testcases. Instead, don't skip the test, and encourage the user
to manually change the macros.
Signed-off-by: Dev Jain <dev.jain(a)arm.com>
---
.../testing/selftests/mm/va_high_addr_switch.c | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/mm/va_high_addr_switch.c b/tools/testing/selftests/mm/va_high_addr_switch.c
index cfbc501290d3..ba862f51d395 100644
--- a/tools/testing/selftests/mm/va_high_addr_switch.c
+++ b/tools/testing/selftests/mm/va_high_addr_switch.c
@@ -292,12 +292,24 @@ static int supported_arch(void)
#elif defined(__x86_64__)
return 1;
#elif defined(__aarch64__)
- return getpagesize() == PAGE_SIZE;
+ return 1;
#else
return 0;
#endif
}
+#if defined(__aarch64__)
+void failure_message(void)
+{
+ printf("TEST MAY FAIL: Are you running on a pagesize other than 64K?\n");
+ printf("If yes, please change macros manually. Ensure to change the\n");
+ printf("address macros too if running defconfig on 16K pagesize,\n");
+ printf("since userspace VA = 47 bits post FEAT_LPA2.\n");
+}
+#else
+void failure_message(void) {}
+#endif
+
int main(int argc, char **argv)
{
int ret;
@@ -308,5 +320,8 @@ int main(int argc, char **argv)
ret = run_test(testcases, ARRAY_SIZE(testcases));
if (argc == 2 && !strcmp(argv[1], "--run-hugetlb"))
ret = run_test(hugetlb_testcases, ARRAY_SIZE(hugetlb_testcases));
+
+ if (ret)
+ failure_message();
return ret;
}
--
2.39.2
The compaction_test memory selftest introduces fragmentation in memory
and then tries to allocate as many hugepages as possible. This series
addresses some problems.
On Aarch64, if nr_hugepages == 0, then the test trivially succeeds since
compaction_index becomes 0, which is less than 3, due to no division by
zero exception being raised. We fix that by checking for division by
zero.
Secondly, correctly set the number of hugepages to zero before trying
to set a large number of them.
Now, consider a situation in which, at the start of the test, a non-zero
number of hugepages have been already set (while running the entire
selftests/mm suite, or manually by the admin). The test operates on 80%
of memory to avoid OOM-killer invocation, and because some memory is
already blocked by hugepages, it would increase the chance of OOM-killing.
Also, since mem_free used in check_compaction() is the value before we
set nr_hugepages to zero, the chance that the compaction_index will
be small is very high if the preset nr_hugepages was high, leading to a
bogus test success.
This series applies on top of the stable 6.9 kernel.
Changes in v2:
- Handle an unsigned long number of hugepages
- Combine the first patch (previously standalone) with this series
Link to v1:
https://lore.kernel.org/all/20240513082842.4117782-1-dev.jain@arm.com/https://lore.kernel.org/all/20240515093633.54814-1-dev.jain@arm.com/
Dev Jain (3):
selftests/mm: compaction_test: Fix bogus test success on Aarch64
selftests/mm: compaction_test: Fix incorrect write of zero to
nr_hugepages
selftests/mm: compaction_test: Fix bogus test success and reduce
probability of OOM-killer invocation
tools/testing/selftests/mm/compaction_test.c | 85 ++++++++++++++------
1 file changed, 60 insertions(+), 25 deletions(-)
--
2.34.1
The upcoming new Idle HLT Intercept feature allows for the HLT
instruction execution by a vCPU to be intercepted by the hypervisor
only if there are no pending V_INTR and V_NMI events for the vCPU.
When the vCPU is expected to service the pending V_INTR and V_NMI
events, the Idle HLT intercept won’t trigger. The feature allows the
hypervisor to determine if the vCPU is actually idle and reduces
wasteful VMEXITs.
Presence of the Idle HLT Intercept feature is indicated via CPUID
function Fn8000_000A_EDX[30].
Document for the Idle HLT intercept feature is available at [1].
[1]: AMD64 Architecture Programmer's Manual Pub. 24593, April 2024,
Vol 2, 15.9 Instruction Intercepts (Table 15-7: IDLE_HLT).
https://bugzilla.kernel.org/attachment.cgi?id=306250
Testing Done:
Added a selftest to test the Idle HLT intercept functionality.
Tested SEV and SEV-ES guest for the Idle HLT intercept functionality.
v1 -> v2
- Done changes in svm_idle_hlt_test based on the review comments from Sean.
- Added an enum based approach to get binary stats in vcpu_get_stat() which
doesn't use string to get stat data based on the comments from Sean.
- Added self_halt() and cli() helpers based on the comments from Sean.
Manali Shukla (5):
x86/cpufeatures: Add CPUID feature bit for Idle HLT intercept
KVM: SVM: Add Idle HLT intercept support
KVM: selftests: Add safe_halt() and cli() helpers to common code
KVM: selftests: Add an interface to read the data of named vcpu stat
KVM: selftests: KVM: SVM: Add Idle HLT intercept test
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/svm.h | 1 +
arch/x86/include/uapi/asm/svm.h | 2 +
arch/x86/kvm/svm/svm.c | 15 +++-
tools/testing/selftests/kvm/Makefile | 1 +
.../testing/selftests/kvm/include/kvm_util.h | 66 ++++++++++++++
.../selftests/kvm/include/x86_64/processor.h | 18 ++++
tools/testing/selftests/kvm/lib/kvm_util.c | 32 +++++++
.../selftests/kvm/x86_64/svm_idle_hlt_test.c | 87 +++++++++++++++++++
9 files changed, 220 insertions(+), 3 deletions(-)
create mode 100644 tools/testing/selftests/kvm/x86_64/svm_idle_hlt_test.c
base-commit: 2489e6c9ebb57d6d0e98936479b5f586201379c7
--
2.34.1
Hi Linus,
Please pull this urgent kselftest fixes update for Linux 6.10-rc1.
This kselftest fixes update for Linux 6.10-rc1 consists of
reverts framework change to add D_GNU_SOURCE to KHDR_INCLUDES
to Makefile, lib.mk, and kselftest_harness.h and follow-on
changes to cgroup and sgx test as they are causing build
failures and warnings.
These three reverts have bee in next for a few days prior
to a rebase before generating the pull request.
diff for pull request is attached.
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit ea5f6ad9ad9645733b72ab53a98e719b460d36a6:
Merge tag 'platform-drivers-x86-v6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 (2024-05-16 09:14:50 -0700)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest tags/linux_kselftest-next-6.10-rc1-fixes
for you to fetch changes up to a97853f25b06f71c23b2d7a59fbd40f3f42d55ac:
Revert "selftests/cgroup: Drop define _GNU_SOURCE" (2024-05-20 09:00:15 -0600)
----------------------------------------------------------------
linux_kselftest-next-6.10-rc1-fixes
This kselftest fixes update for Linux 6.10-rc1 consists of
reverts framework change to add D_GNU_SOURCE to KHDR_INCLUDES
to Makefile, lib.mk, and kselftest_harness.h and follow-on
changes to cgroup and sgx test as they are causing build
failures and warnings.
----------------------------------------------------------------
Shuah Khan (3):
Revert "selftests: Compile kselftest headers with -D_GNU_SOURCE"
Revert "selftests/sgx: Include KHDR_INCLUDES in Makefile"
Revert "selftests/cgroup: Drop define _GNU_SOURCE"
tools/testing/selftests/Makefile | 4 ++--
tools/testing/selftests/cgroup/cgroup_util.c | 3 +++
tools/testing/selftests/cgroup/test_core.c | 2 ++
tools/testing/selftests/cgroup/test_cpu.c | 2 ++
tools/testing/selftests/cgroup/test_hugetlb_memcg.c | 2 ++
tools/testing/selftests/cgroup/test_kmem.c | 2 ++
tools/testing/selftests/cgroup/test_memcontrol.c | 2 ++
tools/testing/selftests/cgroup/test_zswap.c | 2 ++
tools/testing/selftests/kselftest_harness.h | 2 +-
tools/testing/selftests/lib.mk | 2 +-
tools/testing/selftests/sgx/Makefile | 2 +-
tools/testing/selftests/sgx/sigstruct.c | 1 +
12 files changed, 21 insertions(+), 5 deletions(-)
----------------------------------------------------------------
The pcmtest driver tests use the kselftest harness which requires that
_GNU_SOURCE is defined but nothing causes it to be defined. Since the
KHDR_INCLUDES Makefile variable has had the required define added let's
use that, this should provide some futureproofing.
Fixes: daef47b89efd ("selftests: Compile kselftest headers with -D_GNU_SOURCE")
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
tools/testing/selftests/alsa/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/alsa/Makefile b/tools/testing/selftests/alsa/Makefile
index 5af9ba8a4645..c1ce39874e2b 100644
--- a/tools/testing/selftests/alsa/Makefile
+++ b/tools/testing/selftests/alsa/Makefile
@@ -1,7 +1,7 @@
# SPDX-License-Identifier: GPL-2.0
#
-CFLAGS += $(shell pkg-config --cflags alsa)
+CFLAGS += $(shell pkg-config --cflags alsa) $(KHDR_INCLUDES)
LDLIBS += $(shell pkg-config --libs alsa)
ifeq ($(LDLIBS),)
LDLIBS += -lasound
---
base-commit: 3c999d1ae3c75991902a1a7dad0cb62c2a3008b4
change-id: 20240516-kselftest-fix-gnu-source-81ddd00870a8
Best regards,
--
Mark Brown <broonie(a)kernel.org>
Hi all,
This series does a number of cleanups into resctrl_val() and
generalizes it by removing test name specific handling from the
function.
One of the changes improves MBA/MBM measurement by narrowing down the
period the resctrl FS derived memory bandwidth numbers are measured
over. My feel is it didn't cause noticeable difference into the numbers
because they're generally good anyway except for the small number of
outliers. To see the impact on outliers, I'd need to setup a test to
run large number of replications and do a statistical analysis, which
I've not spent my time on. Even without the statistical analysis, the
new way to measure seems obviously better and makes sense even if I
cannot see a major improvement with the setup I'm using.
v4:
- Merged close fix into IMC READ+WRITE rework patch
- Add loop to reset imc_counters_config fds to -1 to be able know which
need closing
- Introduce perf_close_imc_mem_bw() to close fds
- Open resctrl mem bw file (twice) beforehand to avoid opening it during
the test
- Remove MBM .mongrp setup
- Remove mongrp from CMT test
v3:
- Rename init functions to <testname>_init()
- Replace for loops with READ+WRITE statements for clarity
- Don't drop Return: entry from perf_open_imc_mem_bw() func comment
- New patch: Fix closing of IMC fds in case of error
- New patch: Make "bandwidth" consistent in comments & prints
- New patch: Simplify mem bandwidth file code
- Remove wrong comment
- Changed grp_name check to return -1 on fail (internal sanity check)
v2:
- Resolved conflicts with kselftest/next
- Spaces -> tabs correction
Ilpo Järvinen (16):
selftests/resctrl: Fix closing IMC fds on error and open-code R+W
instead of loops
selftests/resctrl: Calculate resctrl FS derived mem bw over sleep(1)
only
selftests/resctrl: Make "bandwidth" consistent in comments & prints
selftests/resctrl: Consolidate get_domain_id() into resctrl_val()
selftests/resctrl: Use correct type for pids
selftests/resctrl: Cleanup bm_pid and ppid usage & limit scope
selftests/resctrl: Rename measure_vals() to measure_mem_bw_vals() &
document
selftests/resctrl: Simplify mem bandwidth file code for MBA & MBM
tests
selftests/resctrl: Add ->measure() callback to resctrl_val_param
selftests/resctrl: Add ->init() callback into resctrl_val_param
selftests/resctrl: Simplify bandwidth report type handling
selftests/resctrl: Make some strings passed to resctrlfs functions
const
selftests/resctrl: Convert ctrlgrp & mongrp to pointers
selftests/resctrl: Remove mongrp from MBA test
selftests/resctrl: Remove mongrp from CMT test
selftests/resctrl: Remove test name comparing from
write_bm_pid_to_resctrl()
tools/testing/selftests/resctrl/cache.c | 6 +-
tools/testing/selftests/resctrl/cat_test.c | 5 +-
tools/testing/selftests/resctrl/cmt_test.c | 22 +-
tools/testing/selftests/resctrl/mba_test.c | 26 +-
tools/testing/selftests/resctrl/mbm_test.c | 26 +-
tools/testing/selftests/resctrl/resctrl.h | 49 ++-
tools/testing/selftests/resctrl/resctrl_val.c | 362 ++++++++----------
tools/testing/selftests/resctrl/resctrlfs.c | 64 ++--
8 files changed, 287 insertions(+), 273 deletions(-)
--
2.39.2
The amt.sh requires smcrouted for multicasting routing.
So, it starts smcrouted before forwarding tests.
It must be stopped after all tests, but it isn't.
To fix this issue, it kills smcrouted in the cleanup logic.
Fixes: c08e8baea78e ("selftests: add amt interface selftest script")
Signed-off-by: Taehee Yoo <ap420073(a)gmail.com>
---
The v1 patch is here:
https://lore.kernel.org/netdev/20240508040643.229383-1-ap420073@gmail.com/
v3
- Do not change shebang.
v2
- Headline change.
- Kill smcrouted process only if amt.pid exists.
- Do not remove the return value.
- Remove timeout logic because it was already fixed by following commit
4c639b6a7b9d ("selftests: net: move amt to socat for better compatibility")
- Fix shebang.
tools/testing/selftests/net/amt.sh | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/amt.sh b/tools/testing/selftests/net/amt.sh
index 5175a42cbe8a..7e7ed6c558da 100755
--- a/tools/testing/selftests/net/amt.sh
+++ b/tools/testing/selftests/net/amt.sh
@@ -77,6 +77,7 @@ readonly LISTENER=$(mktemp -u listener-XXXXXXXX)
readonly GATEWAY=$(mktemp -u gateway-XXXXXXXX)
readonly RELAY=$(mktemp -u relay-XXXXXXXX)
readonly SOURCE=$(mktemp -u source-XXXXXXXX)
+readonly SMCROUTEDIR="$(mktemp -d)"
ERR=4
err=0
@@ -85,6 +86,11 @@ exit_cleanup()
for ns in "$@"; do
ip netns delete "${ns}" 2>/dev/null || true
done
+ if [ -f "$SMCROUTEDIR/amt.pid" ]; then
+ smcpid=$(< $SMCROUTEDIR/amt.pid)
+ kill $smcpid
+ fi
+ rm -rf $SMCROUTEDIR
exit $ERR
}
@@ -167,7 +173,7 @@ setup_iptables()
setup_mcast_routing()
{
- ip netns exec "${RELAY}" smcrouted
+ ip netns exec "${RELAY}" smcrouted -P $SMCROUTEDIR/amt.pid
ip netns exec "${RELAY}" smcroutectl a relay_src \
172.17.0.2 239.0.0.1 amtr
ip netns exec "${RELAY}" smcroutectl a relay_src \
--
2.34.1
The compaction_test memory selftest introduces fragmentation in memory
and then tries to allocate as many hugepages as possible. This series
addresses some problems.
First off, correctly set the number of hugepages to zero before trying
to set a large number of them.
Now, consider a situation in which, at the start of the test, a non-zero
number of hugepages have been already set (while running the entire
selftests/mm suite, or manually by the admin). The test operates on 80%
of memory to avoid OOM-killer invocation, and because some memory is
already blocked by hugepages, it would increase the chance of OOM-killing.
Also, since mem_free used in check_compaction() is the value before we
set nr_hugepages to zero, the chance that the compaction_index will
be small is very high if the preset nr_hugepages was high, leading to a
bogus test success.
This series applies on top of the stable 6.9 kernel.
Dev Jain (2):
selftests/mm: compaction_test: Fix incorrect write of zero to
nr_hugepages
selftests/mm: compaction_test: Fix trivial test success and reduce
probability of OOM-killer invocation
tools/testing/selftests/mm/compaction_test.c | 70 ++++++++++++++------
1 file changed, 50 insertions(+), 20 deletions(-)
--
2.30.2
This series refactors __constructor_order because
__constructor_order_last() is unneeded.
BTW, the comments in kselftest_harness.h was confusing to me.
As far as I tested, all arches executed constructors in the forward
order.
[test code]
#include <stdio.h>
static int x;
static void __attribute__((constructor)) increment(void)
{
x += 1;
}
static void __attribute__((constructor)) multiply(void)
{
x *= 2;
}
int main(void)
{
printf("foo = %d\n", x);
return 0;
}
It should print 2 for forward order systems, 1 for reverse order systems.
I executed it on some archtes by using QEMU. I always got 2.
Masahiro Yamada (2):
selftests: harness: remove unneeded __constructor_order_last()
selftests: harness: rename __constructor_order for clarification
.../drivers/s390x/uvdevice/test_uvdevice.c | 6 ------
tools/testing/selftests/hid/hid_bpf.c | 6 ------
tools/testing/selftests/kselftest_harness.h | 18 ++++--------------
tools/testing/selftests/rtc/rtctest.c | 7 -------
4 files changed, 4 insertions(+), 33 deletions(-)
--
2.40.1
compile_commands.json is used by clangd[1] to provide code navigation
and completion functionality to editors. See [2] for an example
configuration that includes this functionality for VSCode.
It can currently be built manually when using kunit.py, by running:
./scripts/clang-tools/gen_compile_commands.py -d .kunit
With this change however, it's built automatically so you don't need to
manually keep it up to date.
Unlike the manual approach, having make build the compile_commands.json
means that it appears in the build output tree instead of at the root of
the source tree, so you'll need to add --compile-commands-dir=.kunit to
your clangd args for it to be found. This might turn out to be pretty
annoying, I'm not sure yet. If so maybe we can later add some hackery to
kunit.py to work around it.
[1] https://clangd.llvm.org/
[2] https://github.com/FlorentRevest/linux-kernel-vscode
Signed-off-by: Brendan Jackman <jackmanb(a)google.com>
---
tools/testing/kunit/kunit_kernel.py | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/testing/kunit/kunit_kernel.py b/tools/testing/kunit/kunit_kernel.py
index 7254c110ff23..61931c4926fd 100644
--- a/tools/testing/kunit/kunit_kernel.py
+++ b/tools/testing/kunit/kunit_kernel.py
@@ -72,7 +72,8 @@ class LinuxSourceTreeOperations:
raise ConfigError(e.output.decode())
def make(self, jobs: int, build_dir: str, make_options: Optional[List[str]]) -> None:
- command = ['make', 'ARCH=' + self._linux_arch, 'O=' + build_dir, '--jobs=' + str(jobs)]
+ command = ['make', 'all', 'compile_commands.json', 'ARCH=' + self._linux_arch,
+ 'O=' + build_dir, '--jobs=' + str(jobs)]
if make_options:
command.extend(make_options)
if self._cross_compile:
---
base-commit: 3c999d1ae3c75991902a1a7dad0cb62c2a3008b4
change-id: 20240516-kunit-compile-commands-d994074fc2be
Best regards,
--
Brendan Jackman <jackmanb(a)google.com>
Hi,
I'm seeing quite a lot of breakage in mainline as a result of
daef47b89efd0b7 ("selftests: Compile kselftest headers with
-D_GNU_SOURCE") and daef47b89efd0 ("selftests: Compile kselftest headers
with -D_GNU_SOURCE") - thus far I've found that the use of
static_assert() is triggering build breaks where testsuites aren't
picking up the addition of _GNU_SOURCE (including stopping installing
the other tests in the same directory), and there's a bunch of tests
which #define _GNU_SOURCE in their code and now trigger build warnings.
I'm looking at fixes and mitigations now.
The build failures are taking out the ALSA tests entirely which has
caused my personal CI to explode badly :/
Thanks,
Mark
The check_random_order test add/get plenty of xfrm rules, which consume
a lot time on debug kernel and always TIMEOUT. Let's reduce the test
loop and see if it works.
Signed-off-by: Hangbin Liu <liuhangbin(a)gmail.com>
---
tools/testing/selftests/net/xfrm_policy.sh | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/net/xfrm_policy.sh b/tools/testing/selftests/net/xfrm_policy.sh
index 457789530645..3eeeeffb4005 100755
--- a/tools/testing/selftests/net/xfrm_policy.sh
+++ b/tools/testing/selftests/net/xfrm_policy.sh
@@ -293,7 +293,7 @@ check_random_order()
local ns=$1
local log=$2
- for i in $(seq 100); do
+ for i in $(seq 50); do
ip -net $ns xfrm policy flush
for j in $(seq 0 16 255 | sort -R); do
ip -net $ns xfrm policy add dst $j.0.0.0/24 dir out priority 10 action allow
@@ -306,7 +306,7 @@ check_random_order()
done
done
- for i in $(seq 100); do
+ for i in $(seq 50); do
ip -net $ns xfrm policy flush
for j in $(seq 0 16 255 | sort -R); do
local addr=$(printf "e000:0000:%02x00::/56" $j)
--
2.43.0
Commit daef47b89efd0b7 ("selftests: Compile kselftest headers with
-D_GNU_SOURCE") adds a static_assert() which means that things which
would be warnings about undeclared functions get escalated into build
failures. While we do actually want _GNU_SOURCE to be defined for users
of kselftest_harness we haven't actually done that yet and this is
causing widespread build breaks which were previously warnings about
uses of asprintf() without prototypes, including causing other test
programs in the same directory to fail to build.
Since the build failures that are introduced cause additional issues due
to make stopping builds early replace the static_assert() with a
missing without making the error more severe than it already was. This
will be moot once the issue is fixed properly but reduces the disruption
while that happens.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
tools/testing/selftests/kselftest_harness.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kselftest_harness.h b/tools/testing/selftests/kselftest_harness.h
index 37b03f1b8741..1cee8cacf9dc 100644
--- a/tools/testing/selftests/kselftest_harness.h
+++ b/tools/testing/selftests/kselftest_harness.h
@@ -51,7 +51,7 @@
#define __KSELFTEST_HARNESS_H
#ifndef _GNU_SOURCE
-static_assert(0, "kselftest harness requires _GNU_SOURCE to be defined");
+#warning kselftest harness requires _GNU_SOURCE to be defined
#endif
#include <asm/types.h>
#include <ctype.h>
---
base-commit: 3c999d1ae3c75991902a1a7dad0cb62c2a3008b4
change-id: 20240516-kselftest-mitigate-gnu-source-b41b2d2cb8a1
Best regards,
--
Mark Brown <broonie(a)kernel.org>
This patch series ended up much larger than expected, please bear with
me! The goal here is to support vendor extensions, starting at probing
the device tree and ending with reporting to userspace.
The main design objective was to allow vendors to operate independently
of each other. This has been achieved by delegating vendor extensions to
a their own files and then accumulating the extensions in
arch/riscv/kernel/vendor_extensions.c.
Each vendor will have their own list of extensions they support.
There is a new hwprobe key RISCV_HWPROBE_KEY_VENDOR_EXT_THEAD_0 that is
used to request which thead vendor extensions are supported on the
current platform. This allows future vendors to allocate hwprobe keys
for their vendor.
On to the xtheadvector specific code. xtheadvector is a custom extension
that is based upon riscv vector version 0.7.1 [1]. All of the vector
routines have been modified to support this alternative vector version
based upon whether xtheadvector was determined to be supported at boot.
I have tested this with an Allwinner Nezha board. I ran into issues
booting the board on 6.9-rc1 so I applied these patches to 6.8. There
are a couple of minor merge conflicts that do arrise when doing that, so
please let me know if you have been able to boot this board with a 6.9
kernel. I used SkiffOS [2] to manage building the image, but upgraded
the U-Boot version to Samuel Holland's more up-to-date version [3] and
changed out the device tree used by U-Boot with the device trees that
are present in upstream linux and this series. Thank you Samuel for all
of the work you did to make this task possible.
To test the integration, I used the riscv vector kselftests. I modified
the test cases to be able to more easily extend them, and then added a
xtheadvector target that works by calling hwprobe and swapping out the
vector asm if needed.
[1] https://github.com/T-head-Semi/thead-extension-spec/blob/95358cb2cca9489361…
[2] https://github.com/skiffos/SkiffOS/tree/master/configs/allwinner/nezha
[3] https://github.com/smaeul/u-boot/commit/2e89b706f5c956a70c989cd31665f1429e9…
Signed-off-by: Charlie Jenkins <charlie(a)rivosinc.com>
---
Changes in v6:
- Only check vlenb from of if vector enabled in kernel (Conor)
- No need for variadic args in VENDOR_EXTENSION_SUPPORTED so just use a
standard argument
- Make 'first' variable in riscv_fill_vendor_ext_list() static so that
the variable value remains across calls to the function (Evan)
- Link to v5: https://lore.kernel.org/r/20240502-dev-charlie-support_thead_vector_6_9-v5-…
Changes in v5:
- Make all vendors have the same size bitmap
- Extract vendor hwprobe code into helper macro
- Fix bug related to the handling of vendor extensions in the parsing of
the isa string (Conor)
- Fix bug with the vendor bitmap being incorrectly populated (Evan)
- Add vendor extensions to /proc/cpuinfo
- Link to v4: https://lore.kernel.org/r/20240426-dev-charlie-support_thead_vector_6_9-v4-…
Changes in v4:
- Disable vector immediately if vlenb from the device tree is not
homogeneous
- Hide vendor extension code behind a hidden config that vendor
extensions select to eliminate the code when kernel is compiled
without vendor extensions
- Clear up naming conventions and introduce some defines to make the
vendor extension code clearer
- Link to v3: https://lore.kernel.org/r/20240420-dev-charlie-support_thead_vector_6_9-v3-…
Changes in v3:
- Allow any hardware to support any vendor extension, rather than
restricting the vendor extensions to the same vendor as the hardware
- Introduce config options to enable/disable a vendor's extensions
- Link to v2: https://lore.kernel.org/r/20240415-dev-charlie-support_thead_vector_6_9-v2-…
Changes in v2:
- Added commit hash to xtheadvector
- Simplified riscv,isa vector removal fix to not mess with the DT
riscv,vendorid
- Moved riscv,vendorid parsing into a different patch and cache the
value to be used by alternative patching
- Reduce riscv,vendorid missing severity to "info"
- Separate vendor extension list to vendor files
- xtheadvector no longer puts v in the elf_hwcap
- Only patch vendor extension if all harts are associated with the same
vendor. This is the best chance the kernel has for working properly if
there are multiple vendors.
- Split hwprobe vendor keys out into vendor file
- Add attribution for Heiko's patches
- Link to v1: https://lore.kernel.org/r/20240411-dev-charlie-support_thead_vector_6_9-v1-…
---
Charlie Jenkins (15):
dt-bindings: riscv: Add xtheadvector ISA extension description
riscv: vector: Use vlenb from DT
riscv: dts: allwinner: Add xtheadvector to the D1/D1s devicetree
riscv: Extend cpufeature.c to detect vendor extensions
riscv: Add vendor extensions to /proc/cpuinfo
riscv: Introduce vendor variants of extension helpers
riscv: cpufeature: Extract common elements from extension checking
riscv: Convert xandespmu to use the vendor extension framework
riscv: csr: Add CSR encodings for VCSR_VXRM/VCSR_VXSAT
riscv: Add xtheadvector instruction definitions
riscv: vector: Support xtheadvector save/restore
riscv: hwprobe: Add thead vendor extension probing
riscv: hwprobe: Document thead vendor extensions and xtheadvector extension
selftests: riscv: Fix vector tests
selftests: riscv: Support xtheadvector in vector tests
Conor Dooley (1):
dt-bindings: riscv: cpus: add a vlen register length property
Heiko Stuebner (1):
RISC-V: define the elements of the VCSR vector CSR
Documentation/arch/riscv/hwprobe.rst | 10 +
Documentation/devicetree/bindings/riscv/cpus.yaml | 6 +
.../devicetree/bindings/riscv/extensions.yaml | 10 +
arch/riscv/Kconfig | 2 +
arch/riscv/Kconfig.vendor | 44 +++
arch/riscv/boot/dts/allwinner/sun20i-d1s.dtsi | 3 +-
arch/riscv/errata/andes/errata.c | 2 +
arch/riscv/errata/sifive/errata.c | 3 +
arch/riscv/errata/thead/errata.c | 3 +
arch/riscv/include/asm/cpufeature.h | 98 ++++---
arch/riscv/include/asm/csr.h | 13 +
arch/riscv/include/asm/hwcap.h | 1 -
arch/riscv/include/asm/hwprobe.h | 4 +-
arch/riscv/include/asm/switch_to.h | 2 +-
arch/riscv/include/asm/vector.h | 247 +++++++++++++----
arch/riscv/include/asm/vendor_extensions.h | 103 +++++++
arch/riscv/include/asm/vendor_extensions/andes.h | 19 ++
arch/riscv/include/asm/vendor_extensions/thead.h | 42 +++
.../include/asm/vendor_extensions/thead_hwprobe.h | 18 ++
.../include/asm/vendor_extensions/vendor_hwprobe.h | 37 +++
arch/riscv/include/uapi/asm/hwprobe.h | 3 +-
arch/riscv/include/uapi/asm/vendor/thead.h | 3 +
arch/riscv/kernel/Makefile | 2 +
arch/riscv/kernel/cpu.c | 35 ++-
arch/riscv/kernel/cpufeature.c | 175 +++++++++---
arch/riscv/kernel/kernel_mode_vector.c | 8 +-
arch/riscv/kernel/process.c | 4 +-
arch/riscv/kernel/signal.c | 6 +-
arch/riscv/kernel/sys_hwprobe.c | 5 +
arch/riscv/kernel/vector.c | 25 +-
arch/riscv/kernel/vendor_extensions.c | 66 +++++
arch/riscv/kernel/vendor_extensions/Makefile | 5 +
arch/riscv/kernel/vendor_extensions/andes.c | 18 ++
arch/riscv/kernel/vendor_extensions/thead.c | 18 ++
.../riscv/kernel/vendor_extensions/thead_hwprobe.c | 19 ++
drivers/perf/riscv_pmu_sbi.c | 9 +-
tools/testing/selftests/riscv/vector/.gitignore | 3 +-
tools/testing/selftests/riscv/vector/Makefile | 17 +-
.../selftests/riscv/vector/v_exec_initval_nolibc.c | 93 +++++++
tools/testing/selftests/riscv/vector/v_helpers.c | 67 +++++
tools/testing/selftests/riscv/vector/v_helpers.h | 7 +
tools/testing/selftests/riscv/vector/v_initval.c | 22 ++
.../selftests/riscv/vector/v_initval_nolibc.c | 68 -----
.../selftests/riscv/vector/vstate_exec_nolibc.c | 20 +-
.../testing/selftests/riscv/vector/vstate_prctl.c | 295 ++++++++++++---------
45 files changed, 1319 insertions(+), 341 deletions(-)
---
base-commit: 4cece764965020c22cff7665b18a012006359095
change-id: 20240411-dev-charlie-support_thead_vector_6_9-1591fc2a431d
--
- Charlie
Changes from RFC v3 -> PATCH v1:
- Updated selftest to use ksft_print_msg instead of fprintf(stderr, ...)
(Muhammad Usama Anjum)
- Included more detail in patch skipping pmd_young with force_scan
(Huang, Ying)
- Deferred reaccess histogram as a followup
- Removed per-memcg page age interval configs for simplicity
Changes from RFC v2 -> RFC v3:
- Update to v6.8
- Added an aging kernel thread (gated behind config)
- Added basic selftests for sysfs interface files
- Track swapped out pages for reaccesses
- Refactoring and cleanup
- Dropped the virtio-balloon extension to make things manageable
Changes from RFC v1 -> RFC v2:
- Refactored the patchs into smaller pieces
- Renamed interfaces and functions from wss to wsr (Working Set Reporting)
- Fixed build errors when CONFIG_WSR is not set
- Changed working_set_num_bins to u8 for virtio-balloon
- Added support for per-NUMA node reporting for virtio-balloon
[rfc v1]
https://lore.kernel.org/linux-mm/20230509185419.1088297-1-yuanchu@google.co…
[rfc v2]
https://lore.kernel.org/linux-mm/20230621180454.973862-1-yuanchu@google.com/
[rfc v3]
https://lore.kernel.org/linux-mm/20240327213108.2384666-1-yuanchu@google.co…
This patch series provides workingset reporting of user pages in
lruvecs, of which coldness can be tracked by accessed bits and fd
references. However, the concept of workingset applies generically to
all types of memory, which could be kernel slab caches, discardable
userspace caches (databases), or CXL.mem. Therefore, data sources might
come from slab shrinkers, device drivers, or the userspace. IMO, the
kernel should provide a set of workingset interfaces that should be
generic enough to accommodate the various use cases, and be extensible
to potential future use cases. The current proposed interfaces are not
sufficient in that regard, but I would like to start somewhere, solicit
feedback, and iterate.
Use cases
==========
Job scheduling
On overcommitted hosts, workingset information allows the job scheduler
to right-size each job and land more jobs on the same host or NUMA node,
and in the case of a job with increasing workingset, policy decisions
can be made to migrate other jobs off the host/NUMA node, or oom-kill
the misbehaving job. If the job shape is very different from the machine
shape, knowing the workingset per-node can also help inform page
allocation policies.
Proactive reclaim
Workingset information allows the a container manager to proactively
reclaim memory while not impacting a job's performance. While PSI may
provide a reactive measure of when a proactive reclaim has reclaimed too
much, workingset reporting allows the policy to be more accurate and
flexible.
Ballooning (similar to proactive reclaim)
While this patch series does not extend the virtio-balloon device,
balloon policies benefit from workingset to more precisely determine
the size of the memory balloon. On desktops/laptops/mobile devices where
memory is scarce and overcommitted, the balloon sizing in multiple VMs
running on the same device can be orchestrated with workingset reports
from each one.
Promotion/Demotion
Similar to proactive reclaim, a workingset report enables demotion to a
slower tier of memory.
For promotion, the workingset report interfaces need to be extended to
report hotness and gather hotness information from the devices[1].
[1]
https://www.opencompute.org/documents/ocp-cms-hotness-tracking-requirements…
Sysfs and Cgroup Interfaces
==========
The interfaces are detailed in the patches that introduce them. The main
idea here is we break down the workingset per-node per-memcg into time
intervals (ms), e.g.
1000 anon=137368 file=24530
20000 anon=34342 file=0
30000 anon=353232 file=333608
40000 anon=407198 file=206052
9223372036854775807 anon=4925624 file=892892
I realize this does not generalize well to hotness information, but I
lack the intuition for an abstraction that presents hotness in a useful
way. Please advise.
Implementation
==========
Currently, the reporting of user pages is based off of MGLRU, and
therefore requires CONFIG_LRU_GEN=y. We would benefit from more MGLRU
generations for a more fine-grained workingset report. I will make the
generation count configurable in the next version. The workingset
reporting mechanism is gated behind CONFIG_WORKINGSET_REPORT, and the
aging thread is behind CONFIG_WORKINGSET_REPORT_AGING.
Yuanchu Xie (7):
mm: multi-gen LRU: ignore non-leaf pmd_young for force_scan=true
mm: aggregate working set information into histograms
mm: use refresh interval to rate-limit workingset report aggregation
mm: report workingset during memory pressure driven scanning
mm: extend working set reporting to memcgs
mm: add kernel aging thread for workingset reporting
selftest: test system-wide workingset reporting
drivers/base/node.c | 6 +
include/linux/memcontrol.h | 5 +
include/linux/mmzone.h | 9 +
include/linux/workingset_report.h | 97 ++++
mm/Kconfig | 15 +
mm/Makefile | 2 +
mm/internal.h | 17 +
mm/memcontrol.c | 184 +++++-
mm/mm_init.c | 2 +
mm/mmzone.c | 2 +
mm/vmscan.c | 85 ++-
mm/workingset_report.c | 545 ++++++++++++++++++
mm/workingset_report_aging.c | 127 ++++
tools/testing/selftests/mm/.gitignore | 1 +
tools/testing/selftests/mm/Makefile | 3 +
.../testing/selftests/mm/workingset_report.c | 317 ++++++++++
.../testing/selftests/mm/workingset_report.h | 39 ++
.../selftests/mm/workingset_report_test.c | 332 +++++++++++
18 files changed, 1786 insertions(+), 2 deletions(-)
create mode 100644 include/linux/workingset_report.h
create mode 100644 mm/workingset_report.c
create mode 100644 mm/workingset_report_aging.c
create mode 100644 tools/testing/selftests/mm/workingset_report.c
create mode 100644 tools/testing/selftests/mm/workingset_report.h
create mode 100644 tools/testing/selftests/mm/workingset_report_test.c
--
2.45.0.rc1.225.g2a3ae87e7f-goog
There is no need to add the name to ns_list again if the netns already
recoreded.
Fixes: 25ae948b4478 ("selftests/net: add lib.sh")
Signed-off-by: Hangbin Liu <liuhangbin(a)gmail.com>
---
tools/testing/selftests/net/lib.sh | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/net/lib.sh b/tools/testing/selftests/net/lib.sh
index f9fe182dfbd4..56a9454b7ba3 100644
--- a/tools/testing/selftests/net/lib.sh
+++ b/tools/testing/selftests/net/lib.sh
@@ -73,15 +73,17 @@ setup_ns()
local ns=""
local ns_name=""
local ns_list=""
+ local ns_exist=
for ns_name in "$@"; do
# Some test may setup/remove same netns multi times
if unset ${ns_name} 2> /dev/null; then
ns="${ns_name,,}-$(mktemp -u XXXXXX)"
eval readonly ${ns_name}="$ns"
+ ns_exist=false
else
eval ns='$'${ns_name}
cleanup_ns "$ns"
-
+ ns_exist=true
fi
if ! ip netns add "$ns"; then
@@ -90,7 +92,7 @@ setup_ns()
return $ksft_skip
fi
ip -n "$ns" link set lo up
- ns_list="$ns_list $ns"
+ ! $ns_exist && ns_list="$ns_list $ns"
done
NS_LIST="$NS_LIST $ns_list"
}
--
2.43.0
This patch series implements a new char misc driver, /dev/ntsync, which is used
to implement Windows NT synchronization primitives.
NT synchronization primitives are unique in that the wait functions both are
vectored, operate on multiple types of object with different behaviour (mutex,
semaphore, event), and affect the state of the objects they wait on. This model
is not compatible with existing kernel synchronization objects or interfaces,
and therefore the ntsync driver implements its own wait queues and locking.
Hence I would like to request review from someone familiar with locking to make
sure that the usage of low-level kernel primitives is correct and that the wait
queues work as intended, and to that end I've CC'd the locking maintainers.
== Background ==
The Wine project emulates the Windows API in user space. One particular part of
that API, namely the NT synchronization primitives, have historically been
implemented via RPC to a dedicated "kernel" process. However, more recent
applications use these APIs more strenuously, and the overhead of RPC has become
a bottleneck.
The NT synchronization APIs are too complex to implement on top of existing
primitives without sacrificing correctness. Certain operations, such as
NtPulseEvent() or the "wait-for-all" mode of NtWaitForMultipleObjects(), require
direct control over the underlying wait queue, and implementing a wait queue
sufficiently robust for Wine in user space is not possible. This proposed
driver, therefore, implements the problematic interfaces directly in the Linux
kernel.
This driver was presented at Linux Plumbers Conference 2023. For those further
interested in the history of synchronization in Wine and past attempts to solve
this problem in user space, a recording of the presentation can be viewed here:
https://www.youtube.com/watch?v=NjU4nyWyhU8
== Performance ==
The gain in performance varies wildly depending on the application in question
and the user's hardware. For some games NT synchronization is not a bottleneck
and no change can be observed, but for others frame rate improvements of 50 to
150 percent are not atypical. The following table lists frame rate measurements
from a variety of games on a variety of hardware, taken by users Dmitry
Skvortsov, FuzzyQuils, OnMars, and myself:
Game Upstream ntsync improvement
===========================================================================
Anger Foot 69 99 43%
Call of Juarez 99.8 224.1 125%
Dirt 3 110.6 860.7 678%
Forza Horizon 5 108 160 48%
Lara Croft: Temple of Osiris 141 326 131%
Metro 2033 164.4 199.2 21%
Resident Evil 2 26 77 196%
The Crew 26 51 96%
Tiny Tina's Wonderlands 130 360 177%
Total War Saga: Troy 109 146 34%
===========================================================================
== Patches ==
The intended semantics of the patches are broadly intended to match those of the
corresponding Windows functions. For those not already familiar with the Windows
functions (or their undocumented behaviour), patch 27/27 provides a detailed
specification, and individual patches also include a brief description of the
API they are implementing.
The patches making use of this driver in Wine can be retrieved or browsed here:
https://repo.or.cz/wine/zf.git/shortlog/refs/heads/ntsync5
== Implementation ==
Some aspects of the implementation may deserve particular comment:
* In the interest of performance, each object is governed only by a single
spinlock. However, NTSYNC_IOC_WAIT_ALL requires that the state of multiple
objects be changed as a single atomic operation. In order to achieve this, we
first take a device-wide lock ("wait_all_lock") any time we are going to lock
more than one object at a time.
The maximum number of objects that can be used in a vectored wait, and
therefore the maximum that can be locked simultaneously, is 64. This number is
NT's own limit.
The acquisition of multiple spinlocks will degrade performance. This is a
conscious choice, however. Wait-for-all is known to be a very rare operation
in practice, especially with counts that approach the maximum, and it is the
intent of the ntsync driver to optimize wait-for-any at the expense of
wait-for-all as much as possible.
* NT mutexes are tied to their threads on an OS level, and the kernel includes
builtin support for "robust" mutexes. In order to keep the ntsync driver
self-contained and avoid touching more code than necessary, it does not hook
into task exit nor use pids.
Instead, the user space emulator is expected to manage thread IDs and pass
them as an argument to any relevant functions; this is the "owner" field of
ntsync_wait_args and ntsync_mutex_args.
When the emulator detects that a thread dies, it should therefore call
NTSYNC_IOC_MUTEX_KILL on any open mutexes.
* ntsync is module-capable mostly because there was nothing preventing it, and
because it aided development. It is not a hard requirement, though.
== Previous versions ==
Changes from v3:
* Add .gitignore and use KHDR_INCLUDES in selftest build files, per Muhammad
Usama Anjum.
* Try to explain why we are rolling our own primitives a little better, per Greg
Kroah-Hartman.
* Link to v3: https://lore.kernel.org/lkml/20240329000621.148791-1-zfigura@codeweavers.co…
* Link to v2: https://lore.kernel.org/lkml/20240219223833.95710-1-zfigura@codeweavers.com/
* Link to v1: https://lore.kernel.org/lkml/20240214233645.9273-1-zfigura@codeweavers.com/
* Link to RFC v2: https://lore.kernel.org/lkml/20240131021356.10322-1-zfigura@codeweavers.com/
* Link to RFC v1: https://lore.kernel.org/lkml/20240124004028.16826-1-zfigura@codeweavers.com/
Elizabeth Figura (27):
ntsync: Introduce NTSYNC_IOC_WAIT_ANY.
ntsync: Introduce NTSYNC_IOC_WAIT_ALL.
ntsync: Introduce NTSYNC_IOC_CREATE_MUTEX.
ntsync: Introduce NTSYNC_IOC_MUTEX_UNLOCK.
ntsync: Introduce NTSYNC_IOC_MUTEX_KILL.
ntsync: Introduce NTSYNC_IOC_CREATE_EVENT.
ntsync: Introduce NTSYNC_IOC_EVENT_SET.
ntsync: Introduce NTSYNC_IOC_EVENT_RESET.
ntsync: Introduce NTSYNC_IOC_EVENT_PULSE.
ntsync: Introduce NTSYNC_IOC_SEM_READ.
ntsync: Introduce NTSYNC_IOC_MUTEX_READ.
ntsync: Introduce NTSYNC_IOC_EVENT_READ.
ntsync: Introduce alertable waits.
selftests: ntsync: Add some tests for semaphore state.
selftests: ntsync: Add some tests for mutex state.
selftests: ntsync: Add some tests for NTSYNC_IOC_WAIT_ANY.
selftests: ntsync: Add some tests for NTSYNC_IOC_WAIT_ALL.
selftests: ntsync: Add some tests for wakeup signaling with
WINESYNC_IOC_WAIT_ANY.
selftests: ntsync: Add some tests for wakeup signaling with
WINESYNC_IOC_WAIT_ALL.
selftests: ntsync: Add some tests for manual-reset event state.
selftests: ntsync: Add some tests for auto-reset event state.
selftests: ntsync: Add some tests for wakeup signaling with events.
selftests: ntsync: Add tests for alertable waits.
selftests: ntsync: Add some tests for wakeup signaling via alerts.
selftests: ntsync: Add a stress test for contended waits.
maintainers: Add an entry for ntsync.
docs: ntsync: Add documentation for the ntsync uAPI.
Documentation/userspace-api/index.rst | 1 +
Documentation/userspace-api/ntsync.rst | 399 +++++
MAINTAINERS | 9 +
drivers/misc/ntsync.c | 925 ++++++++++-
include/uapi/linux/ntsync.h | 39 +
tools/testing/selftests/Makefile | 1 +
.../selftests/drivers/ntsync/.gitignore | 1 +
.../testing/selftests/drivers/ntsync/Makefile | 7 +
tools/testing/selftests/drivers/ntsync/config | 1 +
.../testing/selftests/drivers/ntsync/ntsync.c | 1407 +++++++++++++++++
10 files changed, 2786 insertions(+), 4 deletions(-)
create mode 100644 Documentation/userspace-api/ntsync.rst
create mode 100644 tools/testing/selftests/drivers/ntsync/.gitignore
create mode 100644 tools/testing/selftests/drivers/ntsync/Makefile
create mode 100644 tools/testing/selftests/drivers/ntsync/config
create mode 100644 tools/testing/selftests/drivers/ntsync/ntsync.c
base-commit: ebbc1a4789c666846b9854ef845a37a64879e5f9
--
2.43.0
Hi everyone,
My name is Abhinav Saxena. I am a graduate student at the University
of Calgary. This is my first patch series for the Linux kernel. I am
applying for the "Linux kernel Bug Fixing Summer Unpaid
2024". Apologies in advance if I made any trivial mistakes :)
This patch mainly includes issues reported by checkpatch.pl. The
changes include:
- Running clang-format on `binderfs_test.c` to fix formatting issues.
- Updates the macro close_prot_errno_disarm macro.
Testing: I tested patches on my local machine (ARM64 ubuntu) with
checkpatch.pl and running the selftests.
Best,
Abhinav
Abhinav Saxena (4):
run clang-format on bindergfs test
update close_prot_errno_disarm macro to do{...}while(false) structure
for safety
Macro argument 'fd' may be better as '(fd)' to avoid precedence issues
add missing a blank line after declarations; fix alignment formatting
.../filesystems/binderfs/binderfs_test.c | 204 +++++++++++-------
1 file changed, 126 insertions(+), 78 deletions(-)
--
2.34.1
Add support for (yet again) more RVA23U64 missing extensions. Add
support for Zcmop, Zca, Zcf, Zcd and Zcb extensions isa string parsing,
hwprobe and kvm support. Zce, Zcmt and Zcmp extensions have been left
out since they target microcontrollers/embedded CPUs and are not needed
by RVA23U64.
Since Zc* extensions states that C implies Zca, Zcf (if F and RV32), Zcd
(if D), this series modifies the way ISA string is parsed and now does
it in two phases. First one parses the string and the second one
validates it for the final ISA description.
This series is based on the Zimop one [1]. An additional fix [2] should
be applied to correctly test that series.
Link: https://lore.kernel.org/linux-riscv/20240404103254.1752834-1-cleger@rivosin… [1]
Link: https://lore.kernel.org/all/20240409143839.558784-1-cleger@rivosinc.com/ [2]
---
v4:
- Modify validate() callbacks to return an 0, -EPROBEDEFER or another
error.
- v3: https://lore.kernel.org/all/20240423124326.2532796-1-cleger@rivosinc.com/
v3:
- Fix typo "exists" -> "exist"
- Remove C implies Zca, Zcd, Zcf, dt-bindings rules
- Rework ISA string resolver to handle dependencies
- v2: https://lore.kernel.org/all/20240418124300.1387978-1-cleger@rivosinc.com/
v2:
- Add Zc* dependencies validation in dt-bindings
- v1: https://lore.kernel.org/lkml/20240410091106.749233-1-cleger@rivosinc.com/
Clément Léger (11):
dt-bindings: riscv: add Zca, Zcf, Zcd and Zcb ISA extension
description
riscv: add ISA extensions validation
riscv: add ISA parsing for Zca, Zcf, Zcd and Zcb
riscv: hwprobe: export Zca, Zcf, Zcd and Zcb ISA extensions
RISC-V: KVM: Allow Zca, Zcf, Zcd and Zcb extensions for Guest/VM
KVM: riscv: selftests: Add some Zc* extensions to get-reg-list test
dt-bindings: riscv: add Zcmop ISA extension description
riscv: add ISA extension parsing for Zcmop
riscv: hwprobe: export Zcmop ISA extension
RISC-V: KVM: Allow Zcmop extension for Guest/VM
KVM: riscv: selftests: Add Zcmop extension to get-reg-list test
Documentation/arch/riscv/hwprobe.rst | 24 ++
.../devicetree/bindings/riscv/extensions.yaml | 90 ++++++
arch/riscv/include/asm/cpufeature.h | 1 +
arch/riscv/include/asm/hwcap.h | 5 +
arch/riscv/include/uapi/asm/hwprobe.h | 5 +
arch/riscv/include/uapi/asm/kvm.h | 5 +
arch/riscv/kernel/cpufeature.c | 259 ++++++++++++------
arch/riscv/kernel/sys_hwprobe.c | 5 +
arch/riscv/kvm/vcpu_onereg.c | 10 +
.../selftests/kvm/riscv/get-reg-list.c | 20 ++
10 files changed, 337 insertions(+), 87 deletions(-)
--
2.43.0
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit 907f33028871fa7c9a3db1efd467b78ef82cce20 ]
The standard library perror() function provides a convenient way to print
an error message based on the current errno but this doesn't play nicely
with KTAP output. Provide a helper which does an equivalent thing in a KTAP
compatible format.
nolibc doesn't have a strerror() and adding the table of strings required
doesn't seem like a good fit for what it's trying to do so when we're using
that only print the errno.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Reviewed-by: Kees Cook <keescook(a)chromium.org>
Signed-off-by: Shuah Khan <skhan(a)linuxfoundation.org>
Stable-dep-of: 071af0c9e582 ("selftests: timers: Convert posix_timers test to generate KTAP output")
Signed-off-by: Edward Liaw <edliaw(a)google.com>
---
tools/testing/selftests/kselftest.h | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/tools/testing/selftests/kselftest.h b/tools/testing/selftests/kselftest.h
index e8eecbc83a60..ad7b97e16f37 100644
--- a/tools/testing/selftests/kselftest.h
+++ b/tools/testing/selftests/kselftest.h
@@ -48,6 +48,7 @@
#include <stdlib.h>
#include <unistd.h>
#include <stdarg.h>
+#include <string.h>
#include <stdio.h>
#include <sys/utsname.h>
#endif
@@ -156,6 +157,19 @@ static inline void ksft_print_msg(const char *msg, ...)
va_end(args);
}
+static inline void ksft_perror(const char *msg)
+{
+#ifndef NOLIBC
+ ksft_print_msg("%s: %s (%d)\n", msg, strerror(errno), errno);
+#else
+ /*
+ * nolibc doesn't provide strerror() and it seems
+ * inappropriate to add one, just print the errno.
+ */
+ ksft_print_msg("%s: %d)\n", msg, errno);
+#endif
+}
+
static inline void ksft_test_result_pass(const char *msg, ...)
{
int saved_errno = errno;
--
2.44.0.769.g3c40516874-goog
To verify IFS (In Field Scan [1]) driver functionality, add the following 6
test cases:
1. Verify that IFS sysfs entries are created after loading the IFS module
2. Check if loading an invalid IFS test image fails and loading a valid
one succeeds
3. Perform IFS scan test on each CPU using all the available image files
4. Perform IFS scan with first test image file on a random CPU for 3
rounds
5. Perform IFS ARRAY BIST(Board Integrated System Test) test on each CPU
6. Perform IFS ARRAY BIST test on a random CPU for 3 rounds
These are not exhaustive, but some minimal test runs to check various
parts of the driver. Some negative tests are also included.
[1] https://docs.kernel.org/arch/x86/ifs.html
Pengfei Xu (4):
selftests: ifs: verify test interfaces are created by the driver
selftests: ifs: verify test image loading functionality
selftests: ifs: verify IFS scan test functionality
selftests: ifs: verify IFS ARRAY BIST functionality
MAINTAINERS | 1 +
tools/testing/selftests/Makefile | 1 +
.../drivers/platform/x86/intel/ifs/Makefile | 6 +
.../platform/x86/intel/ifs/test_ifs.sh | 496 ++++++++++++++++++
4 files changed, 504 insertions(+)
create mode 100644 tools/testing/selftests/drivers/platform/x86/intel/ifs/Makefile
create mode 100755 tools/testing/selftests/drivers/platform/x86/intel/ifs/test_ifs.sh
--
2.43.0
From: Geliang Tang <tanggeliang(a)kylinos.cn>
This patchset uses post_socket_cb and post_connect_cb callbacks of struct
network_helper_opts to refactor do_test() in bpf_tcp_ca.c to move dctcp
test dedicated code out of do_test() into test_dctcp().
Patch 3 adds a new member in post_socket_opts and patch 4 adds a new
callback in network_helper_opts. I'm not sure if this is going too far.
Geliang Tang (4):
selftests/bpf: Use post_socket_cb in connect_to_fd_opts
selftests/bpf: Use start_server_addr in bpf_tcp_ca
selftests/bpf: Use connect_to_fd_opts in do_test in bpf_tcp_ca
selftests/bpf: Add post_connect_cb callback
tools/testing/selftests/bpf/network_helpers.c | 13 ++-
tools/testing/selftests/bpf/network_helpers.h | 8 +-
.../selftests/bpf/prog_tests/bpf_tcp_ca.c | 105 +++++++++++-------
3 files changed, 81 insertions(+), 45 deletions(-)
--
2.43.0
It seems obvious once you know, but at first I didn't realise that the
suite name is part of this format. Document it and add example.
Signed-off-by: Brendan Jackman <jackmanb(a)google.com>
---
Documentation/dev-tools/kunit/run_wrapper.rst | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/Documentation/dev-tools/kunit/run_wrapper.rst b/Documentation/dev-tools/kunit/run_wrapper.rst
index 19ddf5e07013..e75a5fc05814 100644
--- a/Documentation/dev-tools/kunit/run_wrapper.rst
+++ b/Documentation/dev-tools/kunit/run_wrapper.rst
@@ -156,13 +156,20 @@ Filtering tests
===============
By passing a bash style glob filter to the ``exec`` or ``run``
-commands, we can run a subset of the tests built into a kernel . For
+commands, we can run a subset of the tests built into a kernel,
+identified by a string like ``$suite_name.$test_name``. For
example: if we only want to run KUnit resource tests, use:
.. code-block::
./tools/testing/kunit/kunit.py run 'kunit-resource*'
+Or to run just one specific test from that suite:
+
+.. code-block::
+
+ ./tools/testing/kunit/kunit.py run 'kunit-resource-test.kunit_resource_test_init_resources'
+
This uses the standard glob format with wildcard characters.
.. _kunit-on-qemu:
--
2.44.0.396.g6e790dbe36-goog
It seems obvious once you know, but at first I didn't realise that the
suite name is part of this format. Document it and add some examples.
Signed-off-by: Brendan Jackman <jackmanb(a)google.com>
---
v1->v2: Expanded to clarify that suite_glob and test_glob are two separate
patterns. Also made some other trivial changes to formatting etc.
Documentation/dev-tools/kunit/run_wrapper.rst | 33 +++++++++++++++++--
1 file changed, 30 insertions(+), 3 deletions(-)
diff --git a/Documentation/dev-tools/kunit/run_wrapper.rst b/Documentation/dev-tools/kunit/run_wrapper.rst
index 19ddf5e07013..b07252d3fa9d 100644
--- a/Documentation/dev-tools/kunit/run_wrapper.rst
+++ b/Documentation/dev-tools/kunit/run_wrapper.rst
@@ -156,12 +156,39 @@ Filtering tests
===============
By passing a bash style glob filter to the ``exec`` or ``run``
-commands, we can run a subset of the tests built into a kernel . For
-example: if we only want to run KUnit resource tests, use:
+commands, we can run a subset of the tests built into a kernel,
+identified by a string like ``<suite_glob>[.<test_glob>]``.
+
+For example, to run the ``kunit-resource-test`` suite:
+
+.. code-block::
+
+ ./tools/testing/kunit/kunit.py run kunit-resource-test
+
+To run a specific test from that suite:
+
+.. code-block::
+
+ ./tools/testing/kunit/kunit.py run kunit-resource-test.kunit_resource_test
+
+To run all tests from suites whose names start with ``kunit``:
+
+.. code-block::
+
+ ./tools/testing/kunit/kunit.py run 'kunit*'
+
+To run all tests whose name ends with ``remove_resource``:
+
+.. code-block::
+
+ ./tools/testing/kunit/kunit.py run '*.*remove_resource'
+
+To run all tests whose name ends with ``remove_resource``, from suites whose
+names start with ``kunit``:
.. code-block::
- ./tools/testing/kunit/kunit.py run 'kunit-resource*'
+ ./tools/testing/kunit/kunit.py run 'kunit*.*remove_resource'
This uses the standard glob format with wildcard characters.
--
2.44.0.478.gd926399ef9-goog
Hi Linus,
Please pull the following KUnit next update for Linux 6.10-rc1.
This kunit update for Linux 6.10-rc1 consists of:
- fix to race condition in try-catch completion
- change to __kunit_test_suites_init() to exit early if there is
nothing to test
- change to string-stream-test to use KUNIT_DEFINE_ACTION_WRAPPER
- moving fault tests behind KUNIT_FAULT_TEST Kconfig option
- kthread test fixes and improvements
- iov_iter test fixes
diff is attached.
Tests passed on linux-next on my test system:
- allmodconfig build
Default arch um:
./tools/testing/kunit/kunit.py run
./tools/testing/kunit/kunit.py run --alltests
./tools/testing/kunit/kunit.py run --arch x86_64
./tools/testing/kunit/kunit.py run --alltests --arch x86_64
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit dd5a440a31fae6e459c0d6271dddd62825505361:
Linux 6.9-rc7 (2024-05-05 14:06:01 -0700)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest tags/linux_kselftest-kunit-6.10-rc1
for you to fetch changes up to 5496b9b77d7420652202b73cf036e69760be5deb:
kunit: bail out early in __kunit_test_suites_init() if there are no suites to test (2024-05-06 14:22:02 -0600)
----------------------------------------------------------------
linux_kselftest-kunit-6.10-rc1
This kunit update for Linux 6.10-rc1 consists of:
- fix to race condition in try-catch completion
- change to __kunit_test_suites_init() to exit early if there is
nothing to test
- change to string-stream-test to use KUNIT_DEFINE_ACTION_WRAPPER
- moving fault tests behind KUNIT_FAULT_TEST Kconfig option
- kthread test fixes and improvements
- iov_iter test fixes
----------------------------------------------------------------
David Gow (2):
kunit: Fix race condition in try-catch completion
kunit: test: Move fault tests behind KUNIT_FAULT_TEST Kconfig option
Ivan Orlov (1):
kunit: string-stream-test: use KUNIT_DEFINE_ACTION_WRAPPER
Mickaël Salaün (7):
kunit: Handle thread creation error
kunit: Fix kthread reference
kunit: Fix timeout message
kunit: Handle test faults
kunit: Fix KUNIT_SUCCESS() calls in iov_iter tests
kunit: Print last test location on fault
kunit: Add tests for fault
Scott Mayhew (1):
kunit: bail out early in __kunit_test_suites_init() if there are no suites to test
Wander Lairson Costa (1):
kunit: unregister the device on error
include/kunit/test.h | 24 +++++++++++++++++++---
include/kunit/try-catch.h | 3 ---
kernel/kthread.c | 1 +
lib/kunit/Kconfig | 11 +++++++++++
lib/kunit/device.c | 2 +-
lib/kunit/kunit-test.c | 45 +++++++++++++++++++++++++++++++++++++++++-
lib/kunit/string-stream-test.c | 12 ++---------
lib/kunit/test.c | 3 +++
lib/kunit/try-catch.c | 40 ++++++++++++++++++++++++++-----------
lib/kunit_iov_iter.c | 18 ++++++++---------
10 files changed, 121 insertions(+), 38 deletions(-)
----------------------------------------------------------------
Hi Linus,
Please pull the nolibc update for Linux 6.10-rc1.
This nolibc update for Linux 6.10-rc1
- adds support for uname(2)
- removes open-coded strnlen()
- exports strlen()
- adds tests for strlcat() and strlcpy()
- fixes memory error in realloc()
- fixes strlcat() return code and size usage
- fixes strlcpy() return code and size usage
diff is attached.
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit 4cece764965020c22cff7665b18a012006359095:
Linux 6.9-rc1 (2024-03-24 14:10:05 -0700)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest tags/linux_kselftest-nolibc-6.10-rc1
for you to fetch changes up to 0adab2b6b7336fb6ee3c6456a432dad3b1d25647:
tools/nolibc: add support for uname(2) (2024-04-14 20:28:54 +0200)
----------------------------------------------------------------
linux_kselftest-nolibc-6.10-rc1
This nolibc update for Linux 6.10-rc1
- adds support for uname(2)
- removes open-coded strnlen()
- exports strlen()
- adds tests for strlcat() and strlcpy()
- fixes memory error in realloc()
- fixes strlcat() return code and size usage
- fixes strlcpy() return code and size usage
----------------------------------------------------------------
Brennan Xavier McManus (1):
tools/nolibc/stdlib: fix memory error in realloc()
Rodrigo Campos (4):
tools/nolibc/string: export strlen()
tools/nolibc: Fix strlcat() return code and size usage
tools/nolibc: Fix strlcpy() return code and size usage
selftests/nolibc: Add tests for strlcat() and strlcpy()
Thomas Weißschuh (2):
tools/nolibc/string: remove open-coded strnlen()
tools/nolibc: add support for uname(2)
tools/include/nolibc/stdlib.h | 2 +-
tools/include/nolibc/string.h | 46 +++++++++-------
tools/include/nolibc/sys.h | 27 +++++++++
tools/testing/selftests/nolibc/nolibc-test.c | 82 ++++++++++++++++++++++++++++
4 files changed, 136 insertions(+), 21 deletions(-)
----------------------------------------------------------------
Hi Linus,
Please pull the kselftest update for Linux 6.10-rc1.
This kselftest update for Linux 6.10-rc1 consists of:
- changes to make framework and tests reporting KTAP compliant
- changes to make ktap_helpers and power_supply test POSIX compliant
- adds ksft_exit_fail_perror() to include errono in string form
- fixes to avoid clang reporting false positive static analysis errors
about functions that exit and never return. ksft_exit* functions
are marked __noreturn to address this problem
- adds mechanism for reporting a KSFT_ result code
- fixes to build warnings related missing headers and unused variables
- fixes to clang build failures
- cleanups to resctrl test
- adds host arch for LLVM builds
Please note that Stepen found the following conflict in
tools/testing/selftests/mm/soft-dirty.c in next and fixed it up.
between commit:
258ff696db6b ("selftests/mm: soft-dirty should fail if a testcase fails")
from the mm-unstable branch of the mm tree and commit:
e6162a96c81d ("selftests/mm: ksft_exit functions do not return")
from the kselftest tree.
Stepehen's fix taking the 258ff696db6b change to use ksft_finished()
looks good to me.
diff for pull request is attached.
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit dd5a440a31fae6e459c0d6271dddd62825505361:
Linux 6.9-rc7 (2024-05-05 14:06:01 -0700)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest tags/linux_kselftest-next-6.10-rc1
for you to fetch changes up to 2c3b8f8f37c6c0c926d584cf4158db95e62b960c:
selftests/sgx: Include KHDR_INCLUDES in Makefile (2024-05-08 17:08:46 -0600)
----------------------------------------------------------------
linux_kselftest-next-6.10-rc1
This kselftest update for Linux 6.10-rc1 consists of:
- changes to make framework and tests reporting KTAP compliant
- changes to make ktap_helpers and power_supply test POSIX compliant
- adds ksft_exit_fail_perror() to include errono in string form
- fixes to avoid clang reporting false positive static analysis errors
about functions that exit and never return. ksft_exit* functions
are marked __noreturn to address this problem
- adds mechanism for reporting a KSFT_ result code
- fixes to build warnings related missing headers and unused variables
- fixes to clang build failures
- cleanups to resctrl test
- adds host arch for LLVM builds
----------------------------------------------------------------
Amer Al Shanawany (2):
selftests: filesystems: add missing stddef header
selftests/capabilities: fix warn_unused_result build warnings
Edward Liaw (2):
selftests: Compile kselftest headers with -D_GNU_SOURCE
selftests/sgx: Include KHDR_INCLUDES in Makefile
John Hubbard (3):
selftests/binderfs: use the Makefile's rules, not Make's implicit rules
selftests/resctrl: fix clang build failure: use LOCAL_HDRS
selftests/resctrl: fix clang build warnings related to abs(), labs() calls
Lu Dai (1):
selftests: kselftest_deps: fix l5_test() empty variable
Maciej Wieczor-Retman (3):
selftests/resctrl: Add cleanup function to test framework
selftests/resctrl: Simplify cleanup in ctrl-c handler
selftests/resctrl: Move cleanups out of individual tests
Mark Brown (8):
kselftest: Add mechanism for reporting a KSFT_ result code
kselftest/tty: Report a consistent test name for the one test we run
kselftest/clone3: Make test names for set_tid test stable
tracing/selftests: Support log output when generating KTAP output
tracing/selftests: Default to verbose mode when running in kselftest
selftests/clone3: Fix compiler warning
selftests/clone3: Check that the child exited cleanly
selftests/clone3: Correct log message for waitpid() failures
Masami Hiramatsu (Google) (2):
selftests/ftrace: Fix BTFARG testcase to check fprobe is enabled correctly
selftests/ftrace: Fix checkbashisms errors
Muhammad Usama Anjum (9):
selftests: x86: test_vsyscall: reorder code to reduce #ifdef blocks
selftests: x86: test_vsyscall: conform test to TAP format output
selftests: x86: test_mremap_vdso: conform test to TAP format output
selftests/dmabuf-heap: conform test to TAP format output
kselftest: Add missing signature to the comments
selftests: add ksft_exit_fail_perror()
selftests: exec: Use new ksft_exit_fail_perror() helper
selftests: Mark ksft_exit_fail_perror() as __noreturn
selftests: cpufreq: conform test to TAP
Nathan Chancellor (10):
selftests/clone3: ksft_exit functions do not return
selftests/ipc: ksft_exit functions do not return
selftests: membarrier: ksft_exit_pass() does not return
selftests/mm: ksft_exit functions do not return
selftests: pidfd: ksft_exit functions do not return
selftests/resctrl: ksft_exit_skip() does not return
selftests: sync: ksft_exit_pass() does not return
selftests: timers: ksft_exit functions do not return
selftests: x86: ksft_exit_pass() does not return
selftests: kselftest: Make ksft_exit functions return void instead of int
Nícolas F. R. A. Prado (2):
selftests: ktap_helpers: Make it POSIX-compliant
selftests: power_supply: Make it POSIX-compliant
Valentin Obst (1):
selftests: default to host arch for LLVM builds
Yo-Jung (Leo) Lin (1):
Documentation: kselftest: fix codeblock
Documentation/dev-tools/kselftest.rst | 2 +-
tools/testing/selftests/Makefile | 4 +-
tools/testing/selftests/capabilities/test_execve.c | 12 +-
.../testing/selftests/capabilities/validate_cap.c | 7 +-
tools/testing/selftests/clone3/clone3.c | 7 +-
.../selftests/clone3/clone3_clear_sighand.c | 2 +-
tools/testing/selftests/clone3/clone3_set_tid.c | 121 +++--
tools/testing/selftests/cpufreq/cpufreq.sh | 3 +-
tools/testing/selftests/cpufreq/main.sh | 47 +-
tools/testing/selftests/cpufreq/module.sh | 6 +-
tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c | 247 ++++------
tools/testing/selftests/exec/recursion-depth.c | 10 +-
.../selftests/filesystems/binderfs/Makefile | 2 -
.../filesystems/statmount/statmount_test.c | 1 +
tools/testing/selftests/ftrace/ftracetest | 8 +-
tools/testing/selftests/ftrace/ftracetest-ktap | 2 +-
.../ftrace/test.d/dynevent/add_remove_btfarg.tc | 2 +-
.../ftrace/test.d/dynevent/fprobe_entry_arg.tc | 2 +-
.../ftrace/test.d/kprobe/kretprobe_entry_arg.tc | 2 +-
tools/testing/selftests/ipc/msgque.c | 11 +-
tools/testing/selftests/kselftest.h | 49 +-
tools/testing/selftests/kselftest/ktap_helpers.sh | 4 +-
tools/testing/selftests/kselftest_deps.sh | 1 +
tools/testing/selftests/kselftest_harness.h | 2 +-
tools/testing/selftests/lib.mk | 14 +-
.../membarrier/membarrier_test_multi_thread.c | 2 +-
.../membarrier/membarrier_test_single_thread.c | 2 +-
tools/testing/selftests/mm/compaction_test.c | 6 +-
tools/testing/selftests/mm/cow.c | 2 +-
tools/testing/selftests/mm/gup_longterm.c | 2 +-
tools/testing/selftests/mm/gup_test.c | 4 +-
tools/testing/selftests/mm/ksm_functional_tests.c | 2 +-
tools/testing/selftests/mm/madv_populate.c | 2 +-
tools/testing/selftests/mm/mkdirty.c | 2 +-
tools/testing/selftests/mm/pagemap_ioctl.c | 4 +-
tools/testing/selftests/mm/soft-dirty.c | 2 +-
tools/testing/selftests/pidfd/pidfd_fdinfo_test.c | 2 +-
tools/testing/selftests/pidfd/pidfd_open_test.c | 4 +-
tools/testing/selftests/pidfd/pidfd_poll_test.c | 2 +-
tools/testing/selftests/pidfd/pidfd_test.c | 2 +-
.../power_supply/test_power_supply_properties.sh | 2 +-
tools/testing/selftests/resctrl/Makefile | 4 +-
tools/testing/selftests/resctrl/cat_test.c | 8 +-
tools/testing/selftests/resctrl/cmt_test.c | 8 +-
tools/testing/selftests/resctrl/mba_test.c | 10 +-
tools/testing/selftests/resctrl/mbm_test.c | 10 +-
tools/testing/selftests/resctrl/resctrl.h | 9 +-
tools/testing/selftests/resctrl/resctrl_tests.c | 26 +-
tools/testing/selftests/resctrl/resctrl_val.c | 8 +-
tools/testing/selftests/sgx/Makefile | 2 +-
tools/testing/selftests/sgx/sigstruct.c | 1 -
tools/testing/selftests/sync/sync_test.c | 3 +-
tools/testing/selftests/timers/adjtick.c | 4 +-
.../testing/selftests/timers/alarmtimer-suspend.c | 4 +-
tools/testing/selftests/timers/change_skew.c | 4 +-
tools/testing/selftests/timers/freq-step.c | 4 +-
tools/testing/selftests/timers/leap-a-day.c | 10 +-
tools/testing/selftests/timers/leapcrash.c | 4 +-
tools/testing/selftests/timers/mqueue-lat.c | 4 +-
tools/testing/selftests/timers/posix_timers.c | 12 +-
tools/testing/selftests/timers/raw_skew.c | 6 +-
tools/testing/selftests/timers/set-2038.c | 4 +-
tools/testing/selftests/timers/set-tai.c | 4 +-
tools/testing/selftests/timers/set-timer-lat.c | 4 +-
tools/testing/selftests/timers/set-tz.c | 4 +-
tools/testing/selftests/timers/skew_consistency.c | 4 +-
tools/testing/selftests/timers/threadtest.c | 2 +-
tools/testing/selftests/timers/valid-adjtimex.c | 6 +-
tools/testing/selftests/tty/tty_tstamp_update.c | 48 +-
tools/testing/selftests/x86/lam.c | 2 +-
tools/testing/selftests/x86/test_mremap_vdso.c | 43 +-
tools/testing/selftests/x86/test_vsyscall.c | 506 ++++++++++-----------
72 files changed, 703 insertions(+), 675 deletions(-)
----------------------------------------------------------------
Clean up the KVM clock mess somewhat so that it is either based on the guest
TSC ("master clock" mode), or on the host CLOCK_MONOTONIC_RAW in cases where
the TSC isn't usable.
Eliminate the third variant where it was based directly on the *host* TSC,
due to bugs in e.g. __get_kvmclock().
Kill off the last vestiges of the KVM clock being based on CLOCK_MONOTONIC
instead of CLOCK_MONOTONIC_RAW and thus being subject to NTP skew.
Fix up migration support to allow the KVM clock to be saved/restored as an
arithmetic function of the guest TSC, since that's what it actually is in
the *common* case so it can be migrated precisely. Or at least to within
±1 ns which is good enough, as discussed in
https://lore.kernel.org/kvm/c8dca08bf848e663f192de6705bf04aa3966e856.camel@…
In v2 of this series, TSC synchronization is improved and simplified a bit
too, and we allow masterclock mode to be used even when the guest TSCs are
out of sync, as long as they're running at the same *rate*. The different
*offset* shouldn't matter.
And the kvm_get_time_scale() function annoyed me by being entirely opaque,
so I studied it until my brain hurt and then added some comments.
In v2 I also dropped the commits which were removing the periodic clock
syncs. Those are going to be needed still but *only* for non-masterclock
mode, which I'll do next. Along with ensuring that a masterclock update
while already in masterclock mode doesn't jump the clock, and just does
the same as KVM_SET_CLOCK_GUEST does to preserve it.
Needs a *lot* more testing. I think I'm almost done refactoring the code,
so should focus on building up the tests next.
(I do still hate that we're abusing KVM_GET_CLOCK just to get the tuple
of {host_tsc, CLOCK_REALTIME} without even *caring* about the eponymous
KVM clock. Especially as this information is (a) fundamentally what the
vDSO gettimeofday() exposes to us anyway, (b) using CLOCK_REALTIME not
TAI, (c) not available on other platforms, for example for migrating
the Arm arch counter.)
David Woodhouse (13):
KVM: x86/xen: Do not corrupt KVM clock in kvm_xen_shared_info_init()
KVM: x86: Improve accuracy of KVM clock when TSC scaling is in force
KVM: x86: Explicitly disable TSC scaling without CONSTANT_TSC
KVM: x86: Add KVM_VCPU_TSC_SCALE and fix the documentation on TSC migration
KVM: x86: Avoid NTP frequency skew for KVM clock on 32-bit host
KVM: x86: Fix KVM clock precision in __get_kvmclock()
KVM: x86: Fix software TSC upscaling in kvm_update_guest_time()
KVM: x86: Simplify and comment kvm_get_time_scale()
KVM: x86: Remove implicit rdtsc() from kvm_compute_l1_tsc_offset()
KVM: x86: Improve synchronization in kvm_synchronize_tsc()
KVM: x86: Kill cur_tsc_{nsec,offset,write} fields
KVM: x86: Allow KVM master clock mode when TSCs are offset from each other
KVM: x86: Factor out kvm_use_master_clock()
Jack Allister (2):
KVM: x86: Add KVM_[GS]ET_CLOCK_GUEST for accurate KVM clock migration
KVM: selftests: Add KVM/PV clock selftest to prove timer correction
Documentation/virt/kvm/api.rst | 37 ++
Documentation/virt/kvm/devices/vcpu.rst | 115 +++-
arch/x86/include/asm/kvm_host.h | 15 +-
arch/x86/include/uapi/asm/kvm.h | 6 +
arch/x86/kvm/svm/svm.c | 3 +-
arch/x86/kvm/vmx/vmx.c | 2 +-
arch/x86/kvm/x86.c | 687 +++++++++++++++-------
arch/x86/kvm/xen.c | 4 +-
include/uapi/linux/kvm.h | 3 +
tools/testing/selftests/kvm/Makefile | 1 +
tools/testing/selftests/kvm/x86_64/pvclock_test.c | 192 ++++++
11 files changed, 822 insertions(+), 243 deletions(-)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mark Rutland <mark.rutland(a)arm.com>
[ Upstream commit 8ecab2e64572f1aecdfc5a8feae748abda6e3347 ]
The event filter function test has been failing in our internal test
farm:
| # not ok 33 event filter function - test event filtering on functions
Running the test in verbose mode indicates that this is because the test
erroneously determines that kmem_cache_free() is the most common caller
of kmem_cache_free():
# # + cut -d: -f3 trace
# # + sed s/call_site=([^+]*)+0x.*/1/
# # + sort
# # + uniq -c
# # + sort
# # + tail -n 1
# # + sed s/^[ 0-9]*//
# # + target_func=kmem_cache_free
... and as kmem_cache_free() doesn't call itself, setting this as the
filter function for kmem_cache_free() results in no hits, and
consequently the test fails:
# # + grep kmem_cache_free trace
# # + grep kmem_cache_free
# # + wc -l
# # + hitcnt=0
# # + grep kmem_cache_free trace
# # + grep -v kmem_cache_free
# # + wc -l
# # + misscnt=0
# # + [ 0 -eq 0 ]
# # + exit_fail
This seems to be because the system in question has tasks with ':' in
their name (which a number of kernel worker threads have). These show up
in the trace, e.g.
test:.sh-1299 [004] ..... 2886.040608: kmem_cache_free: call_site=putname+0xa4/0xc8 ptr=000000000f4d22f4 name=names_cache
... and so when we try to extact the call_site with:
cut -d: -f3 trace | sed 's/call_site=\([^+]*\)+0x.*/\1/'
... the 'cut' command will extrace the column containing
'kmem_cache_free' rather than the column containing 'call_site=...', and
the 'sed' command will leave this unchanged. Consequently, the test will
decide to use 'kmem_cache_free' as the filter function, resulting in the
failure seen above.
Fix this by matching the 'call_site=<func>' part specifically to extract
the function name.
Signed-off-by: Mark Rutland <mark.rutland(a)arm.com>
Reported-by: Aishwarya TCV <aishwarya.tcv(a)arm.com>
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Steven Rostedt <rostedt(a)goodmis.org>
Cc: linux-kernel(a)vger.kernel.org
Cc: linux-kselftest(a)vger.kernel.org
Cc: linux-trace-kernel(a)vger.kernel.org
Acked-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Signed-off-by: Shuah Khan <skhan(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
.../selftests/ftrace/test.d/filter/event-filter-function.tc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/ftrace/test.d/filter/event-filter-function.tc b/tools/testing/selftests/ftrace/test.d/filter/event-filter-function.tc
index 2de7c61d1ae30..3f74c09c56b62 100644
--- a/tools/testing/selftests/ftrace/test.d/filter/event-filter-function.tc
+++ b/tools/testing/selftests/ftrace/test.d/filter/event-filter-function.tc
@@ -24,7 +24,7 @@ echo 0 > events/enable
echo "Get the most frequently calling function"
sample_events
-target_func=`cut -d: -f3 trace | sed 's/call_site=\([^+]*\)+0x.*/\1/' | sort | uniq -c | sort | tail -n 1 | sed 's/^[ 0-9]*//'`
+target_func=`cat trace | grep -o 'call_site=\([^+]*\)' | sed 's/call_site=//' | sort | uniq -c | sort | tail -n 1 | sed 's/^[ 0-9]*//'`
if [ -z "$target_func" ]; then
exit_fail
fi
--
2.43.0
6.8-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mark Rutland <mark.rutland(a)arm.com>
[ Upstream commit 8ecab2e64572f1aecdfc5a8feae748abda6e3347 ]
The event filter function test has been failing in our internal test
farm:
| # not ok 33 event filter function - test event filtering on functions
Running the test in verbose mode indicates that this is because the test
erroneously determines that kmem_cache_free() is the most common caller
of kmem_cache_free():
# # + cut -d: -f3 trace
# # + sed s/call_site=([^+]*)+0x.*/1/
# # + sort
# # + uniq -c
# # + sort
# # + tail -n 1
# # + sed s/^[ 0-9]*//
# # + target_func=kmem_cache_free
... and as kmem_cache_free() doesn't call itself, setting this as the
filter function for kmem_cache_free() results in no hits, and
consequently the test fails:
# # + grep kmem_cache_free trace
# # + grep kmem_cache_free
# # + wc -l
# # + hitcnt=0
# # + grep kmem_cache_free trace
# # + grep -v kmem_cache_free
# # + wc -l
# # + misscnt=0
# # + [ 0 -eq 0 ]
# # + exit_fail
This seems to be because the system in question has tasks with ':' in
their name (which a number of kernel worker threads have). These show up
in the trace, e.g.
test:.sh-1299 [004] ..... 2886.040608: kmem_cache_free: call_site=putname+0xa4/0xc8 ptr=000000000f4d22f4 name=names_cache
... and so when we try to extact the call_site with:
cut -d: -f3 trace | sed 's/call_site=\([^+]*\)+0x.*/\1/'
... the 'cut' command will extrace the column containing
'kmem_cache_free' rather than the column containing 'call_site=...', and
the 'sed' command will leave this unchanged. Consequently, the test will
decide to use 'kmem_cache_free' as the filter function, resulting in the
failure seen above.
Fix this by matching the 'call_site=<func>' part specifically to extract
the function name.
Signed-off-by: Mark Rutland <mark.rutland(a)arm.com>
Reported-by: Aishwarya TCV <aishwarya.tcv(a)arm.com>
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Steven Rostedt <rostedt(a)goodmis.org>
Cc: linux-kernel(a)vger.kernel.org
Cc: linux-kselftest(a)vger.kernel.org
Cc: linux-trace-kernel(a)vger.kernel.org
Acked-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Signed-off-by: Shuah Khan <skhan(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
.../selftests/ftrace/test.d/filter/event-filter-function.tc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/ftrace/test.d/filter/event-filter-function.tc b/tools/testing/selftests/ftrace/test.d/filter/event-filter-function.tc
index 2de7c61d1ae30..3f74c09c56b62 100644
--- a/tools/testing/selftests/ftrace/test.d/filter/event-filter-function.tc
+++ b/tools/testing/selftests/ftrace/test.d/filter/event-filter-function.tc
@@ -24,7 +24,7 @@ echo 0 > events/enable
echo "Get the most frequently calling function"
sample_events
-target_func=`cut -d: -f3 trace | sed 's/call_site=\([^+]*\)+0x.*/\1/' | sort | uniq -c | sort | tail -n 1 | sed 's/^[ 0-9]*//'`
+target_func=`cat trace | grep -o 'call_site=\([^+]*\)' | sed 's/call_site=//' | sort | uniq -c | sort | tail -n 1 | sed 's/^[ 0-9]*//'`
if [ -z "$target_func" ]; then
exit_fail
fi
--
2.43.0
After this change the single SAN device (ns3eth1) is now replaced with
two SAN devices - respectively ns4eth1 and ns5eth1.
It is possible to extend this script to have more SAN devices connected
by adding them to ns3br1 bridge.
Signed-off-by: Lukasz Majewski <lukma(a)denx.de>
---
tools/testing/selftests/net/hsr/hsr_redbox.sh | 71 +++++++++++++------
1 file changed, 49 insertions(+), 22 deletions(-)
diff --git a/tools/testing/selftests/net/hsr/hsr_redbox.sh b/tools/testing/selftests/net/hsr/hsr_redbox.sh
index db69be95ecb3..1f36785347c0 100755
--- a/tools/testing/selftests/net/hsr/hsr_redbox.sh
+++ b/tools/testing/selftests/net/hsr/hsr_redbox.sh
@@ -8,12 +8,19 @@ source ./hsr_common.sh
do_complete_ping_test()
{
echo "INFO: Initial validation ping (HSR-SAN/RedBox)."
- # Each node has to be able each one.
+ # Each node has to be able to reach each one.
do_ping "${ns1}" 100.64.0.2
do_ping "${ns2}" 100.64.0.1
- # Ping from SAN to hsr1 (via hsr2)
+ # Ping between SANs (test bridge)
+ do_ping "${ns4}" 100.64.0.51
+ do_ping "${ns5}" 100.64.0.41
+ # Ping from SANs to hsr1 (via hsr2) (and opposite)
do_ping "${ns3}" 100.64.0.1
do_ping "${ns1}" 100.64.0.3
+ do_ping "${ns1}" 100.64.0.41
+ do_ping "${ns4}" 100.64.0.1
+ do_ping "${ns1}" 100.64.0.51
+ do_ping "${ns5}" 100.64.0.1
stop_if_error "Initial validation failed."
# Wait for MGNT HSR frames being received and nodes being
@@ -23,8 +30,12 @@ do_complete_ping_test()
echo "INFO: Longer ping test (HSR-SAN/RedBox)."
# Ping from SAN to hsr1 (via hsr2)
do_ping_long "${ns3}" 100.64.0.1
- # Ping from hsr1 (via hsr2) to SAN
+ # Ping from hsr1 (via hsr2) to SANs (and opposite)
do_ping_long "${ns1}" 100.64.0.3
+ do_ping_long "${ns1}" 100.64.0.41
+ do_ping_long "${ns4}" 100.64.0.1
+ do_ping_long "${ns1}" 100.64.0.51
+ do_ping_long "${ns5}" 100.64.0.1
stop_if_error "Longer ping test failed."
echo "INFO: All good."
@@ -35,22 +46,26 @@ setup_hsr_interfaces()
local HSRv="$1"
echo "INFO: preparing interfaces for HSRv${HSRv} (HSR-SAN/RedBox)."
-
-# |NS1 |
-# | |
-# | /-- hsr1 --\ |
-# | ns1eth1 ns1eth2 |
-# |------------------------|
-# | |
-# | |
-# | |
-# |------------------------| |-----------|
-# | ns2eth1 ns2eth2 | | |
-# | \-- hsr2 --/ | | |
-# | \ | | |
-# | ns2eth3 |--------| ns3eth1 |
-# | (interlink)| | |
-# |NS2 (RedBOX) | |NS3 (SAN) |
+#
+# IPv4 addresses (100.64.X.Y/24), and [X.Y] is presented on below diagram:
+#
+#
+# |NS1 | |NS4 |
+# | [0.1] | | |
+# | /-- hsr1 --\ | | [0.41] |
+# | ns1eth1 ns1eth2 | | ns4eth1 (SAN) |
+# |------------------------| |-------------------|
+# | | |
+# | | |
+# | | |
+# |------------------------| |-------------------------------|
+# | ns2eth1 ns2eth2 | | ns3eth2 |
+# | \-- hsr2 --/ | | / |
+# | [0.2] \ | | / | |------------|
+# | ns2eth3 |---| ns3eth1 -- ns3br1 -- ns3eth3--|--| ns5eth1 |
+# | (interlink)| | [0.3] [0.11] | | [0.51] |
+# |NS2 (RedBOX) | |NS3 (BR) | | NS5 (SAN) |
+#
#
# Check if iproute2 supports adding interlink port to hsrX device
ip link help hsr | grep -q INTERLINK
@@ -59,7 +74,9 @@ setup_hsr_interfaces()
# Create interfaces for name spaces
ip link add ns1eth1 netns "${ns1}" type veth peer name ns2eth1 netns "${ns2}"
ip link add ns1eth2 netns "${ns1}" type veth peer name ns2eth2 netns "${ns2}"
- ip link add ns3eth1 netns "${ns3}" type veth peer name ns2eth3 netns "${ns2}"
+ ip link add ns2eth3 netns "${ns2}" type veth peer name ns3eth1 netns "${ns3}"
+ ip link add ns3eth2 netns "${ns3}" type veth peer name ns4eth1 netns "${ns4}"
+ ip link add ns3eth3 netns "${ns3}" type veth peer name ns5eth1 netns "${ns5}"
sleep 1
@@ -70,21 +87,31 @@ setup_hsr_interfaces()
ip -n "${ns2}" link set ns2eth2 up
ip -n "${ns2}" link set ns2eth3 up
- ip -n "${ns3}" link set ns3eth1 up
+ ip -n "${ns3}" link add name ns3br1 type bridge
+ ip -n "${ns3}" link set ns3br1 up
+ ip -n "${ns3}" link set ns3eth1 master ns3br1 up
+ ip -n "${ns3}" link set ns3eth2 master ns3br1 up
+ ip -n "${ns3}" link set ns3eth3 master ns3br1 up
+
+ ip -n "${ns4}" link set ns4eth1 up
+ ip -n "${ns5}" link set ns5eth1 up
ip -net "${ns1}" link add name hsr1 type hsr slave1 ns1eth1 slave2 ns1eth2 supervision 45 version ${HSRv} proto 0
ip -net "${ns2}" link add name hsr2 type hsr slave1 ns2eth1 slave2 ns2eth2 interlink ns2eth3 supervision 45 version ${HSRv} proto 0
ip -n "${ns1}" addr add 100.64.0.1/24 dev hsr1
ip -n "${ns2}" addr add 100.64.0.2/24 dev hsr2
+ ip -n "${ns3}" addr add 100.64.0.11/24 dev ns3br1
ip -n "${ns3}" addr add 100.64.0.3/24 dev ns3eth1
+ ip -n "${ns4}" addr add 100.64.0.41/24 dev ns4eth1
+ ip -n "${ns5}" addr add 100.64.0.51/24 dev ns5eth1
ip -n "${ns1}" link set hsr1 up
ip -n "${ns2}" link set hsr2 up
}
check_prerequisites
-setup_ns ns1 ns2 ns3
+setup_ns ns1 ns2 ns3 ns4 ns5
trap cleanup_all_ns EXIT
--
2.20.1
Joachim kindly merged the IPv6 support in
https://github.com/troglobit/mtools/pull/2, so we can just use his
version now. A few more fixes subsequently came in for IPv6, so even
better.
Check that the deployed mtools version is 3.0 or above. Note that the
version check breaks compatibility with my fork where I didn't bump the
version, but I assume that won't be a problem.
Signed-off-by: Vladimir Oltean <vladimir.oltean(a)nxp.com>
---
tools/testing/selftests/net/forwarding/lib.sh | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/net/forwarding/lib.sh b/tools/testing/selftests/net/forwarding/lib.sh
index 4fe28ab5d8b9..aa925c0954a5 100644
--- a/tools/testing/selftests/net/forwarding/lib.sh
+++ b/tools/testing/selftests/net/forwarding/lib.sh
@@ -309,6 +309,21 @@ require_command()
fi
}
+# IPv6 support was added in v3.0
+check_mtools_version()
+{
+ local version="$(msend -v)"
+ local major
+
+ version=${version##msend version }
+ major=$(echo $version | cut -d. -f1)
+
+ if [ $major -lt 3 ]; then
+ echo "SKIP: expected mtools version 3.0, got $version"
+ exit $ksft_skip
+ fi
+}
+
if [[ "$REQUIRE_JQ" = "yes" ]]; then
require_command jq
fi
@@ -316,10 +331,10 @@ if [[ "$REQUIRE_MZ" = "yes" ]]; then
require_command $MZ
fi
if [[ "$REQUIRE_MTOOLS" = "yes" ]]; then
- # https://github.com/vladimiroltean/mtools/
- # patched for IPv6 support
+ # https://github.com/troglobit/mtools
require_command msend
require_command mreceive
+ check_mtools_version
fi
##############################################################################
--
2.34.1
Hi Linus,
Without reply from Shuah, and given the importance of these fixes [1], here is
a PR to fix Kselftest (broken since v6.9-rc1) for at least KVM, pidfd, and
Landlock. I cannot test against all kselftests though. This has been in
linux-next since the beginning of this week, and so far only one issue has been
reported [2] and fixed [3].
Feel free to take this PR if you see fit.
Regards,
Mickaël
[1] https://lore.kernel.org/r/Zjo1xyhjmehsRhZ2@google.com
[2] https://lore.kernel.org/r/202405100339.vfBe0t9C-lkp@intel.com
[3] https://lore.kernel.org/r/20240511171445.904356-1-mic@digikod.net
--
The following changes since commit e67572cd2204894179d89bd7b984072f19313b03:
Linux 6.9-rc6 (2024-04-28 13:47:24 -0700)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux.git tags/kselftest-fix-vfork-2024-05-12
for you to fetch changes up to 323feb3bdb67649bfa5614eb24ec9cb92a60cf33:
selftests/harness: Handle TEST_F()'s explicit exit codes (2024-05-11 19:18:47 +0200)
----------------------------------------------------------------
Fix Kselftest's vfork() side effects
See https://lore.kernel.org/r/20240511171445.904356-1-mic@digikod.net
----------------------------------------------------------------
Mickaël Salaün (10):
selftests/pidfd: Fix config for pidfd_setns_test
selftests/landlock: Fix FS tests when run on a private mount point
selftests/harness: Fix fixture teardown
selftests/harness: Fix interleaved scheduling leading to race conditions
selftests/landlock: Do not allocate memory in fixture data
selftests/harness: Constify fixture variants
selftests/pidfd: Fix wrong expectation
selftests/harness: Share _metadata between forked processes
selftests/harness: Fix vfork() side effects
selftests/harness: Handle TEST_F()'s explicit exit codes
tools/testing/selftests/kselftest_harness.h | 127 +++++++++++++++++------
tools/testing/selftests/landlock/fs_test.c | 83 +++++++++------
tools/testing/selftests/pidfd/config | 2 +
tools/testing/selftests/pidfd/pidfd_setns_test.c | 2 +-
4 files changed, 147 insertions(+), 67 deletions(-)
The selftest/sgx/main.c didn't compile with [-Werror=implicit-function-declaration]
[edited]:
make[3]: Entering directory 'tools/testing/selftests/sgx'
gcc -Wall -Werror -g -Itools/testing/selftests/../../../tools/include -fPIC -c main.c \
-o tools/testing/selftests/sgx/main.o
In file included from main.c:21:
../kselftest_harness.h: In function ‘__run_test’:
../kselftest_harness.h:1169:13: error: implicit declaration of function ‘asprintf’; \
did you mean ‘vsprintf’? [-Werror=implicit-function-declaration]
1169 | if (asprintf(&test_name, "%s%s%s.%s", f->name,
| ^~~~~~~~
| vsprintf
cc1: all warnings being treated as errors
make[3]: *** [Makefile:36: tools/testing/selftests/sgx/main.o] Error 1
The cause is in the included <stdio.h> on Ubuntu 22.04 LTS:
19 /*
20 * ISO C99 Standard: 7.19 Input/output <stdio.h>
21 */
.
.
.
387 #if __GLIBC_USE (LIB_EXT2)
388 /* Write formatted output to a string dynamically allocated with `malloc'.
389 Store the address of the string in *PTR. */
390 extern int vasprintf (char **__restrict __ptr, const char *__restrict __f,
391 __gnuc_va_list __arg)
392 __THROWNL __attribute__ ((__format__ (__printf__, 2, 0))) __wur;
393 extern int __asprintf (char **__restrict __ptr,
394 const char *__restrict __fmt, ...)
395 __THROWNL __attribute__ ((__format__ (__printf__, 2, 3))) __wur;
396 extern int asprintf (char **__restrict __ptr,
397 const char *__restrict __fmt, ...)
398 __THROWNL __attribute__ ((__format__ (__printf__, 2, 3))) __wur;
399 #endif
__GLIBC_USE (LIB_EXT2) expands into __GLIBC_USE_LIB_EXT2 as defined here:
/usr/include/features.h:186:#define __GLIBC_USE(F) __GLIBC_USE_ ## F
Now, what is unobvious is that <stdio.h> includes
/usr/include/x86_64-linux-gnu/bits/libc-header-start.h:
------------------------------------------------------
35 /* ISO/IEC TR 24731-2:2010 defines the __STDC_WANT_LIB_EXT2__
36 macro. */
37 #undef __GLIBC_USE_LIB_EXT2
38 #if (defined __USE_GNU \
39 || (defined __STDC_WANT_LIB_EXT2__ && __STDC_WANT_LIB_EXT2__ > 0))
40 # define __GLIBC_USE_LIB_EXT2 1
41 #else
42 # define __GLIBC_USE_LIB_EXT2 0
43 #endif
This makes <stdio.h> exclude line 396 and asprintf() prototype from normal
include file processing.
The fix defines __USE_GNU before including <stdio.h> in case it isn't already
defined. After this intervention the module compiles OK.
Converting snprintf() to asprintf() in selftests/kselftest_harness.h:1169
created this new dependency and the implicit declaration broke the compilation.
Fixes: 809216233555 ("selftests/harness: remove use of LINE_MAX")
Cc: Edward Liaw <edliaw(a)google.com>
Cc: Jarkko Sakkinen <jarkko(a)kernel.org>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: linux-sgx(a)vger.kernel.org
Cc: linux-kselftest(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Signed-off-by: Mirsad Todorovac <mtodorov69(a)gmail.com>
---
tools/testing/selftests/sgx/main.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c
index 9820b3809c69..f5cb426bd797 100644
--- a/tools/testing/selftests/sgx/main.c
+++ b/tools/testing/selftests/sgx/main.c
@@ -6,6 +6,9 @@
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
+#ifndef __USE_GNU
+#define __USE_GNU
+#endif
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
--
2.34.1
Currently, if at runtime we are not able to allocate a huge page, the
test will trivially pass on Aarch64 due to no exception being raised on
division by zero while computing compaction_index. Fix that by checking
for nr_hugepages == 0. Anyways, in general, avoid a division by zero by
exiting the program beforehand. While at it, fix a typo.
Signed-off-by: Dev Jain <dev.jain(a)arm.com>
---
tools/testing/selftests/mm/compaction_test.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/mm/compaction_test.c b/tools/testing/selftests/mm/compaction_test.c
index 533999b6c284..df1b76f9c734 100644
--- a/tools/testing/selftests/mm/compaction_test.c
+++ b/tools/testing/selftests/mm/compaction_test.c
@@ -134,6 +134,10 @@ int check_compaction(unsigned long mem_free, unsigned int hugepage_size)
/* We should have been able to request at least 1/3 rd of the memory in
huge pages */
+ if (!atoi(nr_hugepages)) {
+ ksft_print_msg("ERROR: No memory is available as huge pages\n");
+ goto close_fd;
+ }
compaction_index = mem_free/(atoi(nr_hugepages) * hugepage_size);
lseek(fd, 0, SEEK_SET);
@@ -149,7 +153,7 @@ int check_compaction(unsigned long mem_free, unsigned int hugepage_size)
atoi(nr_hugepages));
if (compaction_index > 3) {
- ksft_print_msg("ERROR: Less that 1/%d of memory is available\n"
+ ksft_print_msg("ERROR: Less than 1/%d of memory is available\n"
"as huge pages\n", compaction_index);
goto close_fd;
}
--
2.39.2
From: Kunwu Chan <chentao(a)kylinos.cn>
The "malloc" call may not be successful.Add the malloc
failure checking to avoid possible null dereference.
Changes since v1 [1]:
- As Daniel Borkmann suggested, change patch 4/4 only
- Other 3 patches no changes.
[1] https://lore.kernel.org/all/20240424020444.2375773-1-chentao@kylinos.cn/
Kunwu Chan (4):
selftests/bpf: Add some null pointer checks
selftests/bpf/sockopt: Add a null pointer check for the run_test
selftests/bpf: Add a null pointer check for the load_btf_spec
selftests/bpf: Add a null pointer check for the
serial_test_tp_attach_query
tools/testing/selftests/bpf/prog_tests/sockopt.c | 6 ++++++
tools/testing/selftests/bpf/prog_tests/tp_attach_query.c | 3 +++
tools/testing/selftests/bpf/test_progs.c | 7 +++++++
tools/testing/selftests/bpf/test_verifier.c | 2 ++
4 files changed, 18 insertions(+)
--
2.40.1
From: Geliang Tang <tanggeliang(a)kylinos.cn>
The strdup() function returns a pointer to a new string which is a
duplicate of the string "ifname". Memory for the new string is obtained
with malloc(), and need to be freed with free().
This patch adds this missing "free(saved_hwtstamp_ifname)" in cleanup()
to avoid a potential memory leak in xdp_hw_metadata.c.
Signed-off-by: Geliang Tang <tanggeliang(a)kylinos.cn>
---
tools/testing/selftests/bpf/xdp_hw_metadata.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c
index 0859fe727da7..6f9956eed797 100644
--- a/tools/testing/selftests/bpf/xdp_hw_metadata.c
+++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c
@@ -581,6 +581,8 @@ static void cleanup(void)
if (bpf_obj)
xdp_hw_metadata__destroy(bpf_obj);
+
+ free((void *)saved_hwtstamp_ifname);
}
static void handle_signal(int sig)
--
2.43.0
In this series from Geliang, modifying MPTCP BPF selftests, we have:
- SIGINT support
- A new macro to reduce duplicated code
- A new MPTCP subflow BPF program setting socket options per subflow: it
looks better to have this old test program in the BPF selftests to
track regressions and to serve as example.
Note: Nicolas is no longer working for Tessares, but he did this work
while working for them, and his email address is no longer available.
- A new MPTCP BPF subtest validating the new BPF program.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Geliang Tang (3):
selftests/bpf: Handle SIGINT when creating netns
selftests/bpf: Add RUN_MPTCP_TEST macro
selftests/bpf: Add mptcp subflow subtest
Nicolas Rybowski (1):
selftests/bpf: Add mptcp subflow example
tools/testing/selftests/bpf/prog_tests/mptcp.c | 127 +++++++++++++++++++++-
tools/testing/selftests/bpf/progs/mptcp_subflow.c | 70 ++++++++++++
2 files changed, 193 insertions(+), 4 deletions(-)
---
base-commit: 329a6720a3ebbc041983b267981ab2cac102de93
change-id: 20240506-upstream-bpf-next-20240506-mptcp-subflow-test-faef6654bfa3
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
On slow machines the SND timestamp sometimes doesn't arrive before
we quit. The test only waits as long as the packet delay, so it's
easy for a race condition to happen.
Double the wait but do a bit of polling, once the SND timestamp
arrives there's no point to wait any longer.
This fixes the "TXTIME abs" failures on debug kernels, like:
Case ICMPv4 - TXTIME abs returned '', expected 'OK'
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
---
CC: shuah(a)kernel.org
CC: linux-kselftest(a)vger.kernel.org
---
tools/testing/selftests/net/cmsg_sender.c | 20 +++++++++++++++-----
1 file changed, 15 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/net/cmsg_sender.c b/tools/testing/selftests/net/cmsg_sender.c
index c79e65581dc3..f25268504937 100644
--- a/tools/testing/selftests/net/cmsg_sender.c
+++ b/tools/testing/selftests/net/cmsg_sender.c
@@ -333,16 +333,17 @@ static const char *cs_ts_info2str(unsigned int info)
return "unknown";
}
-static void
+static unsigned long
cs_read_cmsg(int fd, struct msghdr *msg, char *cbuf, size_t cbuf_sz)
{
struct sock_extended_err *see;
struct scm_timestamping *ts;
+ unsigned int ts_seen = 0;
struct cmsghdr *cmsg;
int i, err;
if (!opt.ts.ena)
- return;
+ return 0;
msg->msg_control = cbuf;
msg->msg_controllen = cbuf_sz;
@@ -396,8 +397,11 @@ cs_read_cmsg(int fd, struct msghdr *msg, char *cbuf, size_t cbuf_sz)
printf(" %5s ts%d %lluus\n",
cs_ts_info2str(see->ee_info),
i, rel_time);
+ ts_seen |= 1 << see->ee_info;
}
}
+
+ return ts_seen;
}
static void ca_set_sockopts(int fd)
@@ -509,10 +513,16 @@ int main(int argc, char *argv[])
err = ERN_SUCCESS;
if (opt.ts.ena) {
- /* Make sure all timestamps have time to loop back */
- usleep(opt.txtime.delay);
+ unsigned long seen;
+ int i;
- cs_read_cmsg(fd, &msg, cbuf, sizeof(cbuf));
+ /* Make sure all timestamps have time to loop back */
+ for (i = 0; i < 40; i++) {
+ seen = cs_read_cmsg(fd, &msg, cbuf, sizeof(cbuf));
+ if (seen & (1 << SCM_TSTAMP_SND))
+ break;
+ usleep(opt.txtime.delay / 20);
+ }
}
err_out:
--
2.45.0
The selftest/sgx/main.c didn't compile with [-Werror=implicit-function-declaration]
[edited]:
make[3]: Entering directory 'tools/testing/selftests/sgx'
gcc -Wall -Werror -g -Itools/testing/selftests/../../../tools/include -fPIC -c main.c \
-o tools/testing/selftests/sgx/main.o
In file included from main.c:21:
../kselftest_harness.h: In function ‘__run_test’:
../kselftest_harness.h:1169:13: error: implicit declaration of function ‘asprintf’; \
did you mean ‘vsprintf’? [-Werror=implicit-function-declaration]
1169 | if (asprintf(&test_name, "%s%s%s.%s", f->name,
| ^~~~~~~~
| vsprintf
cc1: all warnings being treated as errors
make[3]: *** [Makefile:36: tools/testing/selftests/sgx/main.o] Error 1
The cause is in the included <stdio.h> on Ubuntu 22.04 LTS:
19 /*
20 * ISO C99 Standard: 7.19 Input/output <stdio.h>
21 */
.
.
.
387 #if __GLIBC_USE (LIB_EXT2)
388 /* Write formatted output to a string dynamically allocated with `malloc'.
389 Store the address of the string in *PTR. */
390 extern int vasprintf (char **__restrict __ptr, const char *__restrict __f,
391 __gnuc_va_list __arg)
392 __THROWNL __attribute__ ((__format__ (__printf__, 2, 0))) __wur;
393 extern int __asprintf (char **__restrict __ptr,
394 const char *__restrict __fmt, ...)
395 __THROWNL __attribute__ ((__format__ (__printf__, 2, 3))) __wur;
396 extern int asprintf (char **__restrict __ptr,
397 const char *__restrict __fmt, ...)
398 __THROWNL __attribute__ ((__format__ (__printf__, 2, 3))) __wur;
399 #endif
__GLIBC_USE (LIB_EXT2) expands into __GLIBC_USE_LIB_EXT2 as defined here:
/usr/include/features.h:186:#define __GLIBC_USE(F) __GLIBC_USE_ ## F
Now, what is unobvious is that <stdio.h> includes
/usr/include/x86_64-linux-gnu/bits/libc-header-start.h:
------------------------------------------------------
35 /* ISO/IEC TR 24731-2:2010 defines the __STDC_WANT_LIB_EXT2__
36 macro. */
37 #undef __GLIBC_USE_LIB_EXT2
38 #if (defined __USE_GNU \
39 || (defined __STDC_WANT_LIB_EXT2__ && __STDC_WANT_LIB_EXT2__ > 0))
40 # define __GLIBC_USE_LIB_EXT2 1
41 #else
42 # define __GLIBC_USE_LIB_EXT2 0
43 #endif
This makes <stdio.h> exclude line 396 and asprintf() prototype from normal
include file processing.
The fix defines __USE_GNU before including <stdio.h> in case it isn't already
defined. After this intervention the module compiles OK.
Converting snprintf() to asprintf() in selftests/kselftest_harness.h:1169
created this new dependency and the implicit declaration broke the compilation.
Fixes: 809216233555 ("selftests/harness: remove use of LINE_MAX")
Cc: Edward Liaw <edliaw(a)google.com>
Cc: Jarkko Sakkinen <jarkko(a)kernel.org>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: linux-sgx(a)vger.kernel.org
Cc: linux-kselftest(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Signed-off-by: Mirsad Todorovac <mtodorov69(a)gmail.com>
---
tools/testing/selftests/sgx/main.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c
index 9820b3809c69..f5cb426bd797 100644
--- a/tools/testing/selftests/sgx/main.c
+++ b/tools/testing/selftests/sgx/main.c
@@ -6,6 +6,9 @@
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
+#ifndef __USE_GNU
+#define __USE_GNU
+#endif
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
--
2.34.1
The cb fields network_offset and inner_network_offset are used instead of
skb->network_header throughout GRO.
These fields are then leveraged in the next commit to remove flush_id state
from napi_gro_cb, and stateful code in {ipv6,inet}_gro_receive which may be
unnecessarily complicated due to encapsulation support in GRO. These fields
are checked in L4 instead.
3rd patch adds tests for different flush_id flows in GRO.
v8 -> v9:
- rename skb_gro_network_offset to skb_gro_receive_network_offset for
clarification
- improved code readability in tests and gro_network_flush functions
- v8:
https://lore.kernel.org/all/20240506093550.128210-1-richardbgobert@gmail.co…
v7 -> v8:
- Remove network_header use in gro
- Re-send commits after the dependent patch to net was applied
- v7:
https://lore.kernel.org/all/20240412155533.115507-1-richardbgobert@gmail.co…
v6 -> v7:
- Moved bug fixes to a separate submission in net
- Added UDP fwd benchmark
- v6:
https://lore.kernel.org/all/20240410153423.107381-1-richardbgobert@gmail.co…
v5 -> v6:
- Write inner_network_offset in vxlan and geneve
- Ignore is_atomic when DF=0
- v5:
https://lore.kernel.org/all/20240408141720.98832-1-richardbgobert@gmail.com/
v4 -> v5:
- Add 1st commit - flush id checks in udp_gro_receive segment which can be
backported by itself
- Add TCP measurements for the 5th commit
- Add flush id tests to ensure flush id logic is preserved in GRO
- Simplify gro_inet_flush by removing a branch
- v4:
https://lore.kernel.org/all/202420325182543.87683-1-richardbgobert@gmail.co…
v3 -> v4:
- Fix code comment and commit message typos
- v3:
https://lore.kernel.org/all/f939c84a-2322-4393-a5b0-9b1e0be8ed8e@gmail.com/
v2 -> v3:
- Use napi_gro_cb instead of skb->{offset}
- v2:
https://lore.kernel.org/all/2ce1600b-e733-448b-91ac-9d0ae2b866a4@gmail.com/
v1 -> v2:
- Pass p_off in *_gro_complete to fix UDP bug
- Remove more conditionals and memory fetches from inet_gro_flush
- v1:
https://lore.kernel.org/netdev/e1d22505-c5f8-4c02-a997-64248480338b@gmail.c…
Richard Gobert (3):
net: gro: use cb instead of skb->network_header
net: gro: move L3 flush checks to tcp_gro_receive and udp_gro_receive_segment
selftests/net: add flush id selftests
include/net/gro.h | 87 ++++++++++++++++---
net/core/gro.c | 3 -
net/ipv4/af_inet.c | 45 +---------
net/ipv4/tcp_offload.c | 20 ++---
net/ipv4/udp_offload.c | 10 +--
net/ipv6/ip6_offload.c | 16 +---
net/ipv6/tcpv6_offload.c | 3 +-
tools/testing/selftests/net/gro.c | 138 ++++++++++++++++++++++++++++++
8 files changed, 227 insertions(+), 95 deletions(-)
--
2.36.1
When building with clang, via:
make LLVM=1 -C tools/testing/selftests
...two types of warnings occur:
warning: absolute value function 'abs' given an argument of type
'long' but has parameter of type 'int' which may cause truncation of
value
warning: taking the absolute value of unsigned type 'unsigned long'
has no effect
Fix these by:
a) using labs() in place of abs(), when long integers are involved, and
b) Change to use signed integer data types, in places where subtraction
is used (and could end up with negative values).
c) Remove a duplicate abs() call in cmt_test.c.
Cc: Reinette Chatre <reinette.chatre(a)intel.com>
Cc: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
Signed-off-by: John Hubbard <jhubbard(a)nvidia.com>
---
tools/testing/selftests/resctrl/cmt_test.c | 4 ++--
tools/testing/selftests/resctrl/mba_test.c | 2 +-
tools/testing/selftests/resctrl/mbm_test.c | 2 +-
3 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c
index a81f91222a89..05a241519ae8 100644
--- a/tools/testing/selftests/resctrl/cmt_test.c
+++ b/tools/testing/selftests/resctrl/cmt_test.c
@@ -40,11 +40,11 @@ static int show_results_info(unsigned long sum_llc_val, int no_of_bits,
int ret;
avg_llc_val = sum_llc_val / num_of_runs;
- avg_diff = (long)abs(cache_span - avg_llc_val);
+ avg_diff = (long)(cache_span - avg_llc_val);
diff_percent = ((float)cache_span - avg_llc_val) / cache_span * 100;
ret = platform && abs((int)diff_percent) > max_diff_percent &&
- abs(avg_diff) > max_diff;
+ labs(avg_diff) > max_diff;
ksft_print_msg("%s Check cache miss rate within %lu%%\n",
ret ? "Fail:" : "Pass:", max_diff_percent);
diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
index 7946e32e85c8..5fffbc9ff6a4 100644
--- a/tools/testing/selftests/resctrl/mba_test.c
+++ b/tools/testing/selftests/resctrl/mba_test.c
@@ -60,7 +60,7 @@ static bool show_mba_info(unsigned long *bw_imc, unsigned long *bw_resc)
/* Memory bandwidth from 100% down to 10% */
for (allocation = 0; allocation < ALLOCATION_MAX / ALLOCATION_STEP;
allocation++) {
- unsigned long avg_bw_imc, avg_bw_resc;
+ long avg_bw_imc, avg_bw_resc;
unsigned long sum_bw_imc = 0, sum_bw_resc = 0;
int avg_diff_per;
float avg_diff;
diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c
index d67ffa3ec63a..a4c3ea49b0e8 100644
--- a/tools/testing/selftests/resctrl/mbm_test.c
+++ b/tools/testing/selftests/resctrl/mbm_test.c
@@ -17,7 +17,7 @@
static int
show_bw_info(unsigned long *bw_imc, unsigned long *bw_resc, size_t span)
{
- unsigned long avg_bw_imc = 0, avg_bw_resc = 0;
+ long avg_bw_imc = 0, avg_bw_resc = 0;
unsigned long sum_bw_imc = 0, sum_bw_resc = 0;
int runs, ret, avg_diff_per;
float avg_diff = 0;
base-commit: 45db3ab70092637967967bfd8e6144017638563c
prerequisite-patch-id: b901ece2a5b78503e2fb5480f20e304d36a0ea27
prerequisite-patch-id: 8d96c4b8c3ed6d9ea2588ef7f594ae0f9f83c279
--
2.45.0
Since kselftest_harness.h introduces asprintf()[1], many selftests have
compilation warnings or errors due to missing _GNU_SOURCE definitions.
The issue stems from a lack of a LINE_MAX definition in Android (see
commit 38c957f07038), which is the reason why asprintf() was introduced.
We tried adding _GNU_SOURCE definitions to more selftests to fix, but
asprintf() may continue to cause problems, and since it is quite late in
the 6.9 cycle, we would like to revert 809216233555 first to provide
testing for forks[2].
[1] https://lore.kernel.org/all/20240411231954.62156-1-edliaw@google.com
[2] https://lore.kernel.org/linux-kselftest/ZjuA3aY_iHkjP7bQ@google.com
v1 -> v2:
- Stop defining _GNU_SOURCE in related selftests
- Revert commit 809216233555
- Use 1024 in place of LINE_MAX to fix 38c957f07038
v1: https://lore.kernel.org/all/20240507063534.4191447-1-tao1.su@linux.intel.co…
Tao Su (2):
Revert "selftests/harness: remove use of LINE_MAX"
selftests/harness: Use 1024 in place of LINE_MAX
tools/testing/selftests/kselftest_harness.h | 11 +++--------
tools/testing/selftests/mm/mdwe_test.c | 1 -
2 files changed, 3 insertions(+), 9 deletions(-)
base-commit: 45db3ab70092637967967bfd8e6144017638563c
--
2.34.1
When building with clang, via:
make LLVM=1 -C tools/testing/selftest
...clang warns about several cases of using a signed integer for the
priority argument to mq_receive(3), which expects an unsigned int.
Fix this by declaring the type as unsigned int in all cases.
Also, both input and output priority are unsigned, per the man pages, so
let's change the type of both priorities throughout, even though clang
did not warn about the prio_out variable.
Also, add an argument name to test->func(), in order to address another
warning from clang.
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Signed-off-by: John Hubbard <jhubbard(a)nvidia.com>
---
tools/testing/selftests/mqueue/mq_perf_tests.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/tools/testing/selftests/mqueue/mq_perf_tests.c b/tools/testing/selftests/mqueue/mq_perf_tests.c
index 5c16159d0bcd..9380c656581f 100644
--- a/tools/testing/selftests/mqueue/mq_perf_tests.c
+++ b/tools/testing/selftests/mqueue/mq_perf_tests.c
@@ -323,7 +323,8 @@ void *fake_cont_thread(void *arg)
void *cont_thread(void *arg)
{
char buff[MSG_SIZE];
- int i, priority;
+ int i;
+ unsigned int priority;
for (i = 0; i < num_cpus_to_pin; i++)
if (cpu_threads[i] == pthread_self())
@@ -373,27 +374,27 @@ void *cont_thread(void *arg)
struct test {
char *desc;
- void (*func)(int *);
+ void (*func)(unsigned int *prio);
};
-void const_prio(int *prio)
+void const_prio(unsigned int *prio)
{
return;
}
-void inc_prio(int *prio)
+void inc_prio(unsigned int *prio)
{
if (++*prio == mq_prio_max)
*prio = 0;
}
-void dec_prio(int *prio)
+void dec_prio(unsigned int *prio)
{
if (--*prio < 0)
*prio = mq_prio_max - 1;
}
-void random_prio(int *prio)
+void random_prio(unsigned int *prio)
{
*prio = random() % mq_prio_max;
}
@@ -425,7 +426,7 @@ struct test test2[] = {
void *perf_test_thread(void *arg)
{
char buff[MSG_SIZE];
- int prio_out, prio_in;
+ unsigned int prio_out, prio_in;
int i;
clockid_t clock;
pthread_t *t;
base-commit: 45db3ab70092637967967bfd8e6144017638563c
prerequisite-patch-id: b901ece2a5b78503e2fb5480f20e304d36a0ea27
--
2.45.0
There is a 'malloc' call in test_vmx_nested_state function, which can
be unsuccessful. This patch will add the malloc failure checking
to avoid possible null dereference and give more information
about test fail reasons.
Signed-off-by: Kunwu Chan <chentao(a)kylinos.cn>
---
tools/testing/selftests/kvm/x86_64/vmx_set_nested_state_test.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/kvm/x86_64/vmx_set_nested_state_test.c b/tools/testing/selftests/kvm/x86_64/vmx_set_nested_state_test.c
index 67a62a5a8895..18afc2000a74 100644
--- a/tools/testing/selftests/kvm/x86_64/vmx_set_nested_state_test.c
+++ b/tools/testing/selftests/kvm/x86_64/vmx_set_nested_state_test.c
@@ -91,6 +91,7 @@ void test_vmx_nested_state(struct kvm_vcpu *vcpu)
const int state_sz = sizeof(struct kvm_nested_state) + getpagesize();
struct kvm_nested_state *state =
(struct kvm_nested_state *)malloc(state_sz);
+ TEST_ASSERT(state, "-ENOMEM when allocating kvm state");
/* The format must be set to 0. 0 for VMX, 1 for SVM. */
set_default_vmx_state(state, state_sz);
--
2.40.1
The "malloc" call may not be successful.Add the malloc
failure checking to avoid possible null dereference.
Kunwu Chan (4):
selftests/bpf: Add some null pointer checks
selftests/bpf/sockopt: Add a null pointer check for the run_test
selftests/bpf: Add a null pointer check for the load_btf_spec
selftests/bpf: Add a null pointer check for the
serial_test_tp_attach_query
tools/testing/selftests/bpf/prog_tests/sockopt.c | 6 ++++++
tools/testing/selftests/bpf/prog_tests/tp_attach_query.c | 3 +++
tools/testing/selftests/bpf/test_progs.c | 7 +++++++
tools/testing/selftests/bpf/test_verifier.c | 2 ++
4 files changed, 18 insertions(+)
--
2.40.1
The series composes of two parts. The first part provides a quick fix for
the issue on a recent thread[1]. The issue happens when a platform has
ununified vector register length across multiple cores. Specifically,
patch 1 adds a comment at a callsite of riscv_setup_vsize to clarify how
vlenb is observed by the system. Patch 2 fixes the issue by failing the
boot process of a secondary core if vlenb mismatches.
The second part of the series provide a finer grain view of the Vector
extension. Patch 3 give the obsolete ISA parser the ability to expand
ISA extensions for sigle letter extensions. Patch 3, 4 introduces Zve32x,
Zve32f, Zve64x, Zve64f, Zve64d for isa parsing and hwprobe. Patch 5
updates all callsites such that Vector subextensions are maximumly
supported by the kernel.
Two parts of the series are sent together to ease the effort of picking
dependency patches. The first part can be merged independent of the
second one if necessary.
The series is tested on a QEMU and verified that booting, Vector
programs context-switch, signal, ptrace, prctl interfaces works when we
only report partial V from the ISA.
Note that the signal test was performed after applying the commit
c27fa53b858b ("riscv: Fix vector state restore in rt_sigreturn()")
This patch should be able to apply on risc-v for-next branch on top of
the commit ba5ea59f768f ("riscv: Do not save the scratch CSR during suspend")
[1]: https://lore.kernel.org/all/20240228-vicinity-cornstalk-4b8eb5fe5730@spud/T…
Changes in v4:
- Add a patch to trigger prctl test on ZVE32X (9)
- Add a patch to fix integer promotion bug in hwprobe (8)
- Fix a build fail on !CONFIG_RISCV_ISA_V (7)
- Add more comment in the assembly code change (2)
- Link to v3: https://lore.kernel.org/r/20240318-zve-detection-v3-0-e12d42107fa8@sifive.c…
Changelog v3:
- Include correct maintainers and mailing list into CC.
- Cleanup isa string parser code (3)
- Adjust extensions order and name (4, 5)
- Refine commit message (6)
Changelog v2:
- Update comments and commit messages (1, 2, 7)
- Refine isa_exts[] lists for zve extensions (4)
- Add a patch for dt-binding (5)
- Make ZVE* extensions depend on has_vector(ZVE32X) (6, 7)
---
---
Andy Chiu (9):
riscv: vector: add a comment when calling riscv_setup_vsize()
riscv: smp: fail booting up smp if inconsistent vlen is detected
riscv: cpufeature: call match_isa_ext() for single-letter extensions
riscv: cpufeature: add zve32[xf] and zve64[xfd] isa detection
dt-bindings: riscv: add Zve32[xf] Zve64[xfd] ISA extension description
riscv: hwprobe: add zve Vector subextensions into hwprobe interface
riscv: vector: adjust minimum Vector requirement to ZVE32X
hwprobe: fix integer promotion in RISCV_HWPROBE_EXT macro
selftest: run vector prctl test for ZVE32X
Documentation/arch/riscv/hwprobe.rst | 15 ++++++
.../devicetree/bindings/riscv/extensions.yaml | 30 ++++++++++++
arch/riscv/include/asm/hwcap.h | 5 ++
arch/riscv/include/asm/switch_to.h | 2 +-
arch/riscv/include/asm/vector.h | 25 ++++++----
arch/riscv/include/asm/xor.h | 2 +-
arch/riscv/include/uapi/asm/hwprobe.h | 7 ++-
arch/riscv/kernel/cpufeature.c | 56 ++++++++++++++++++----
arch/riscv/kernel/head.S | 19 +++++---
arch/riscv/kernel/kernel_mode_vector.c | 4 +-
arch/riscv/kernel/process.c | 4 +-
arch/riscv/kernel/signal.c | 6 +--
arch/riscv/kernel/smpboot.c | 14 ++++--
arch/riscv/kernel/sys_hwprobe.c | 13 ++++-
arch/riscv/kernel/vector.c | 15 +++---
arch/riscv/lib/uaccess.S | 2 +-
.../testing/selftests/riscv/vector/vstate_prctl.c | 6 +--
17 files changed, 174 insertions(+), 51 deletions(-)
---
base-commit: ba5ea59f768f67d127b319b26ba209ff67e0d9a5
change-id: 20240318-zve-detection-50106d2da527
Best regards,
--
Andy Chiu <andy.chiu(a)sifive.com>
From: Geliang Tang <tanggeliang(a)kylinos.cn>
This patchset adds post_socket_cb pointer together with 'struct
post_socket_opts cb_opts' into struct network_helper_opts to make
start_server_addr() helper more flexible. With these modifications,
many duplicate codes can be dropped.
Patches 1-3 address Martin's comments in the previous series.
Geliang Tang (6):
selftests/bpf: Add post_socket_cb for network_helper_opts
selftests/bpf: Use start_server_addr in sockopt_inherit
selftests/bpf: Use start_server_addr in test_tcp_check_syncookie
selftests/bpf: Use connect_to_fd in sockopt_inherit
selftests/bpf: Use connect_to_fd in test_tcp_check_syncookie
selftests/bpf: Drop get_port in test_tcp_check_syncookie
tools/testing/selftests/bpf/Makefile | 1 +
tools/testing/selftests/bpf/network_helpers.c | 25 ++--
tools/testing/selftests/bpf/network_helpers.h | 4 +
.../bpf/prog_tests/sockopt_inherit.c | 63 ++--------
.../bpf/test_tcp_check_syncookie_user.c | 117 ++++--------------
5 files changed, 61 insertions(+), 149 deletions(-)
--
2.43.0
As it is said in the docs, we should always check the KVM API version
before running the KVM-based applications. Add the function which
queries the current KVM API version through `ioctl` to the `kvm_util.c`
file.
Add a new TEST_REQUIRE statement to the `vm_open` function in order
to verify the version of the KVM API before creating a VM.
Signed-off-by: Ivan Orlov <ivan.orlov(a)codethink.co.uk>
---
.../testing/selftests/kvm/include/kvm_util_base.h | 2 ++
tools/testing/selftests/kvm/lib/kvm_util.c | 14 ++++++++++++++
2 files changed, 16 insertions(+)
diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index 3e0db283a46a..d7a83387ae33 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -38,6 +38,7 @@
#define kvm_static_assert(expr, ...) __kvm_static_assert(expr, ##__VA_ARGS__, #expr)
#define KVM_DEV_PATH "/dev/kvm"
+#define KVM_DEFAULT_API_VERSION 12
#define KVM_MAX_VCPUS 512
#define NSEC_PER_SEC 1000000000L
@@ -275,6 +276,7 @@ int get_kvm_intel_param_integer(const char *param);
int get_kvm_amd_param_integer(const char *param);
unsigned int kvm_check_cap(long cap);
+int kvm_get_api_version(void);
static inline bool kvm_has_cap(long cap)
{
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index b2262b5fad9e..58a5deccb388 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -176,6 +176,19 @@ unsigned int kvm_check_cap(long cap)
return (unsigned int)ret;
}
+int kvm_get_api_version(void)
+{
+ int ret;
+ int kvm_fd;
+
+ kvm_fd = open_kvm_dev_path_or_exit();
+ ret = __kvm_ioctl(kvm_fd, KVM_GET_API_VERSION, NULL);
+
+ close(kvm_fd);
+
+ return ret;
+}
+
void vm_enable_dirty_ring(struct kvm_vm *vm, uint32_t ring_size)
{
if (vm_check_cap(vm, KVM_CAP_DIRTY_LOG_RING_ACQ_REL))
@@ -190,6 +203,7 @@ static void vm_open(struct kvm_vm *vm)
vm->kvm_fd = _open_kvm_dev_path_or_exit(O_RDWR);
TEST_REQUIRE(kvm_has_cap(KVM_CAP_IMMEDIATE_EXIT));
+ TEST_REQUIRE(kvm_get_api_version() == KVM_DEFAULT_API_VERSION);
vm->fd = __kvm_ioctl(vm->kvm_fd, KVM_CREATE_VM, (void *)vm->type);
TEST_ASSERT(vm->fd >= 0, KVM_IOCTL_ERROR(KVM_CREATE_VM, vm->fd));
--
2.34.1
The test is inspired by the pmu_event_filter_test which implemented by x86. On
the arm64 platform, there is the same ability to set the pmu_event_filter
through the KVM_ARM_VCPU_PMU_V3_FILTER attribute. So add the test for arm64.
The series first create the helper function which can be used
for the vpmu related tests. Then, it implement the test.
Changelog:
----------
v7->v8:
- Rebased to kvm-arm/next.
- Deleted the GIC layout related staff.
- Fixed the checking logic in the kvm_pmu_support_events.
v6->v7:
- Rebased to v6.9-rc3.
v5->v6:
- Rebased to v6.9-rc1.
- Collect RB.
- Add multiple filter test.
v4->v5:
- Rebased to v6.8-rc6.
- Refactor the helper function, make it fine-grained and easy to be used.
- Namimg improvements.
- Use the kvm_device_attr_set() helper.
- Make the test descriptor array readable and clean.
- Delete the patch which moves the pmu related helper to vpmu.h.
- Remove the kvm_supports_pmu_event_filter() function since nobody will run
this on a old kernel.
v3->v4:
- Rebased to the v6.8-rc2.
v2->v3:
- Check the pmceid in guest code instead of pmu event count since different
hardware may have different event count result, check pmceid makes it stable
on different platform. [Eric]
- Some typo fixed and commit message improved.
v1->v2:
- Improve the commit message. [Eric]
- Fix the bug in [enable|disable]_counter. [Raghavendra & Marc]
- Add the check if kvm has attr KVM_ARM_VCPU_PMU_V3_FILTER.
- Add if host pmu support the test event throught pmceid0.
- Split the test_invalid_filter() to another patch. [Eric]
v1: https://lore.kernel.org/all/20231123063750.2176250-1-shahuang@redhat.com/
v2: https://lore.kernel.org/all/20231129072712.2667337-1-shahuang@redhat.com/
v3: https://lore.kernel.org/all/20240116060129.55473-1-shahuang@redhat.com/
v4: https://lore.kernel.org/all/20240202025659.5065-1-shahuang@redhat.com/
v5: https://lore.kernel.org/all/20240229065625.114207-1-shahuang@redhat.com/
v6: https://lore.kernel.org/all/20240326033706.117189-1-shahuang@redhat.com/
v7: https://lore.kernel.org/all/20240409030320.182591-1-shahuang@redhat.com/
Shaoqin Huang (3):
KVM: selftests: aarch64: Add helper function for the vpmu vcpu
creation
KVM: selftests: aarch64: Introduce pmu_event_filter_test
KVM: selftests: aarch64: Add invalid filter test in
pmu_event_filter_test
tools/testing/selftests/kvm/Makefile | 1 +
.../kvm/aarch64/pmu_event_filter_test.c | 341 ++++++++++++++++++
.../kvm/aarch64/vpmu_counter_access.c | 32 +-
.../selftests/kvm/include/aarch64/vpmu.h | 28 ++
4 files changed, 376 insertions(+), 26 deletions(-)
create mode 100644 tools/testing/selftests/kvm/aarch64/pmu_event_filter_test.c
create mode 100644 tools/testing/selftests/kvm/include/aarch64/vpmu.h
--
2.40.1
The test is inspired by the pmu_event_filter_test which implemented by x86. On
the arm64 platform, there is the same ability to set the pmu_event_filter
through the KVM_ARM_VCPU_PMU_V3_FILTER attribute. So add the test for arm64.
The series first create the helper function which can be used
for the vpmu related tests. Then, it implement the test.
Changelog:
----------
v6->v7:
- Rebased to v6.9-rc3.
v5->v6:
- Rebased to v6.9-rc1.
- Collect RB.
- Add multiple filter test.
v4->v5:
- Rebased to v6.8-rc6.
- Refactor the helper function, make it fine-grained and easy to be used.
- Namimg improvements.
- Use the kvm_device_attr_set() helper.
- Make the test descriptor array readable and clean.
- Delete the patch which moves the pmu related helper to vpmu.h.
- Remove the kvm_supports_pmu_event_filter() function since nobody will run
this on a old kernel.
v3->v4:
- Rebased to the v6.8-rc2.
v2->v3:
- Check the pmceid in guest code instead of pmu event count since different
hardware may have different event count result, check pmceid makes it stable
on different platform. [Eric]
- Some typo fixed and commit message improved.
v1->v2:
- Improve the commit message. [Eric]
- Fix the bug in [enable|disable]_counter. [Raghavendra & Marc]
- Add the check if kvm has attr KVM_ARM_VCPU_PMU_V3_FILTER.
- Add if host pmu support the test event throught pmceid0.
- Split the test_invalid_filter() to another patch. [Eric]
v1: https://lore.kernel.org/all/20231123063750.2176250-1-shahuang@redhat.com/
v2: https://lore.kernel.org/all/20231129072712.2667337-1-shahuang@redhat.com/
v3: https://lore.kernel.org/all/20240116060129.55473-1-shahuang@redhat.com/
v4: https://lore.kernel.org/all/20240202025659.5065-1-shahuang@redhat.com/
v5: https://lore.kernel.org/all/20240229065625.114207-1-shahuang@redhat.com/
v6: https://lore.kernel.org/all/20240326033706.117189-1-shahuang@redhat.com/
Shaoqin Huang (3):
KVM: selftests: aarch64: Add helper function for the vpmu vcpu
creation
KVM: selftests: aarch64: Introduce pmu_event_filter_test
KVM: selftests: aarch64: Add invalid filter test in
pmu_event_filter_test
tools/testing/selftests/kvm/Makefile | 1 +
.../kvm/aarch64/pmu_event_filter_test.c | 336 ++++++++++++++++++
.../kvm/aarch64/vpmu_counter_access.c | 33 +-
.../selftests/kvm/include/aarch64/vpmu.h | 28 ++
4 files changed, 372 insertions(+), 26 deletions(-)
create mode 100644 tools/testing/selftests/kvm/aarch64/pmu_event_filter_test.c
create mode 100644 tools/testing/selftests/kvm/include/aarch64/vpmu.h
--
2.40.1
RFC v8:
=======
Major Changes:
--------------
- Fixed build error generated by patch-by-patch build.
- Applied docs suggestions from Randy.
RFC v7:
=======
Major Changes:
--------------
This revision largely rebases on top of net-next and addresses the feedback
RFCv6 received from folks, namely Jakub, Yunsheng, Arnd, David, & Pavel.
The series remains in RFC because the queue-API ndos defined in this
series are not yet implemented. I have a GVE implementation I carry out
of tree for my testing. A upstreamable GVE implementation is in the
works. Aside from that, in my estimation all the patches are ready for
review/merge. Please do take a look.
As usual the full devmem TCP changes including the full GVE driver
implementation is here:
https://github.com/mina/linux/commits/tcpdevmem-v7/
Detailed changelog:
- Use admin-perm in netlink API.
- Addressed feedback from Jakub with regards to netlink API
implementation.
- Renamed devmem.c functions to something more appropriate for that
file.
- Improve the performance seen through the page_pool benchmark.
- Fix the value definition of all the SO_DEVMEM_* uapi.
- Various fixes to documentation.
Perf - page-pool benchmark:
---------------------------
Improved performance of bench_page_pool_simple.ko tests compared to v6:
https://pastebin.com/raw/v5dYRg8L
net-next base: 8 cycle fast path.
RFC v6: 10 cycle fast path.
RFC v7: 9 cycle fast path.
RFC v7 with CONFIG_DMA_SHARED_BUFFER disabled: 8 cycle fast path,
same as baseline.
Perf - Devmem TCP benchmark:
---------------------
Perf is about the same regardless of the changes in v7, namely the
removal of the static_branch_unlikely to improve the page_pool benchmark
performance:
189/200gbps bi-directional throughput with RX devmem TCP and regular TCP
TX i.e. ~95% line rate.
RFC v6:
=======
Major Changes:
--------------
This revision largely rebases on top of net-next and addresses the little
feedback RFCv5 received.
The series remains in RFC because the queue-API ndos defined in this
series are not yet implemented. I have a GVE implementation I carry out
of tree for my testing. A upstreamable GVE implementation is in the
works. Aside from that, in my estimation all the patches are ready for
review/merge. Please do take a look.
As usual the full devmem TCP changes including the full GVE driver
implementation is here:
https://github.com/mina/linux/commits/tcpdevmem-v6/
This version also comes with some performance data recorded in the cover
letter (see below changelog).
Detailed changelog:
- Rebased on top of the merged netmem_ref changes.
- Converted skb->dmabuf to skb->readable (Pavel). Pavel's original
suggestion was to remove the skb->dmabuf flag entirely, but when I
looked into it closely, I found the issue that if we remove the flag
we have to dereference the shinfo(skb) pointer to obtain the first
frag to tell whether an skb is readable or not. This can cause a
performance regression if it dirties the cache line when the
shinfo(skb) was not really needed. Instead, I converted the skb->dmabuf
flag into a generic skb->readable flag which can be re-used by io_uring
0-copy RX.
- Squashed a few locking optimizations from Eric Dumazet in the RX path
and the DEVMEM_DONTNEED setsockopt.
- Expanded the tests a bit. Added validation for invalid scenarios and
added some more coverage.
Perf - page-pool benchmark:
---------------------------
bench_page_pool_simple.ko tests with and without these changes:
https://pastebin.com/raw/ncHDwAbn
AFAIK the number that really matters in the perf tests is the
'tasklet_page_pool01_fast_path Per elem'. This one measures at about 8
cycles without the changes but there is some 1 cycle noise in some
results.
With the patches this regresses to 9 cycles with the changes but there
is 1 cycle noise occasionally running this test repeatedly.
Lastly I tried disable the static_branch_unlikely() in
netmem_is_net_iov() check. To my surprise disabling the
static_branch_unlikely() check reduces the fast path back to 8 cycles,
but the 1 cycle noise remains.
Perf - Devmem TCP benchmark:
---------------------
189/200gbps bi-directional throughput with RX devmem TCP and regular TCP
TX i.e. ~95% line rate.
Major changes in RFC v5:
========================
1. Rebased on top of 'Abstract page from net stack' series and used the
new netmem type to refer to LSB set pointers instead of re-using
struct page.
2. Downgraded this series back to RFC and called it RFC v5. This is
because this series is now dependent on 'Abstract page from net
stack'[1] and the queue API. Both are removed from the series to
reduce the patch # and those bits are fairly independent or
pre-requisite work.
3. Reworked the page_pool devmem support to use netmem and for some
more unified handling.
4. Reworked the reference counting of net_iov (renamed from
page_pool_iov) to use pp_ref_count for refcounting.
The full changes including the dependent series and GVE page pool
support is here:
https://github.com/mina/linux/commits/tcpdevmem-rfcv5/
[1] https://patchwork.kernel.org/project/netdevbpf/list/?series=810774
Major changes in v1:
====================
1. Implemented MVP queue API ndos to remove the userspace-visible
driver reset.
2. Fixed issues in the napi_pp_put_page() devmem frag unref path.
3. Removed RFC tag.
Many smaller addressed comments across all the patches (patches have
individual change log).
Full tree including the rest of the GVE driver changes:
https://github.com/mina/linux/commits/tcpdevmem-v1
Changes in RFC v3:
==================
1. Pulled in the memory-provider dependency from Jakub's RFC[1] to make the
series reviewable and mergeable.
2. Implemented multi-rx-queue binding which was a todo in v2.
3. Fix to cmsg handling.
The sticking point in RFC v2[2] was the device reset required to refill
the device rx-queues after the dmabuf bind/unbind. The solution
suggested as I understand is a subset of the per-queue management ops
Jakub suggested or similar:
https://lore.kernel.org/netdev/20230815171638.4c057dcd@kernel.org/
This is not addressed in this revision, because:
1. This point was discussed at netconf & netdev and there is openness to
using the current approach of requiring a device reset.
2. Implementing individual queue resetting seems to be difficult for my
test bed with GVE. My prototype to test this ran into issues with the
rx-queues not coming back up properly if reset individually. At the
moment I'm unsure if it's a mistake in the POC or a genuine issue in
the virtualization stack behind GVE, which currently doesn't test
individual rx-queue restart.
3. Our usecases are not bothered by requiring a device reset to refill
the buffer queues, and we'd like to support NICs that run into this
limitation with resetting individual queues.
My thought is that drivers that have trouble with per-queue configs can
use the support in this series, while drivers that support new netdev
ops to reset individual queues can automatically reset the queue as
part of the dma-buf bind/unbind.
The same approach with device resets is presented again for consideration
with other sticking points addressed.
This proposal includes the rx devmem path only proposed for merge. For a
snapshot of my entire tree which includes the GVE POC page pool support &
device memory support:
https://github.com/torvalds/linux/compare/master...mina:linux:tcpdevmem-v3
[1] https://lore.kernel.org/netdev/f8270765-a27b-6ccf-33ea-cda097168d79@redhat.…
[2] https://lore.kernel.org/netdev/CAHS8izOVJGJH5WF68OsRWFKJid1_huzzUK+hpKbLcL4…
Changes in RFC v2:
==================
The sticking point in RFC v1[1] was the dma-buf pages approach we used to
deliver the device memory to the TCP stack. RFC v2 is a proof-of-concept
that attempts to resolve this by implementing scatterlist support in the
networking stack, such that we can import the dma-buf scatterlist
directly. This is the approach proposed at a high level here[2].
Detailed changes:
1. Replaced dma-buf pages approach with importing scatterlist into the
page pool.
2. Replace the dma-buf pages centric API with a netlink API.
3. Removed the TX path implementation - there is no issue with
implementing the TX path with scatterlist approach, but leaving
out the TX path makes it easier to review.
4. Functionality is tested with this proposal, but I have not conducted
perf testing yet. I'm not sure there are regressions, but I removed
perf claims from the cover letter until they can be re-confirmed.
5. Added Signed-off-by: contributors to the implementation.
6. Fixed some bugs with the RX path since RFC v1.
Any feedback welcome, but specifically the biggest pending questions
needing feedback IMO are:
1. Feedback on the scatterlist-based approach in general.
2. Netlink API (Patch 1 & 2).
3. Approach to handle all the drivers that expect to receive pages from
the page pool (Patch 6).
[1] https://lore.kernel.org/netdev/dfe4bae7-13a0-3c5d-d671-f61b375cb0b4@gmail.c…
[2] https://lore.kernel.org/netdev/CAHS8izPm6XRS54LdCDZVd0C75tA1zHSu6jLVO8nzTLX…
==================
* TL;DR:
Device memory TCP (devmem TCP) is a proposal for transferring data to and/or
from device memory efficiently, without bouncing the data to a host memory
buffer.
* Problem:
A large amount of data transfers have device memory as the source and/or
destination. Accelerators drastically increased the volume of such transfers.
Some examples include:
- ML accelerators transferring large amounts of training data from storage into
GPU/TPU memory. In some cases ML training setup time can be as long as 50% of
TPU compute time, improving data transfer throughput & efficiency can help
improving GPU/TPU utilization.
- Distributed training, where ML accelerators, such as GPUs on different hosts,
exchange data among them.
- Distributed raw block storage applications transfer large amounts of data with
remote SSDs, much of this data does not require host processing.
Today, the majority of the Device-to-Device data transfers the network are
implemented as the following low level operations: Device-to-Host copy,
Host-to-Host network transfer, and Host-to-Device copy.
The implementation is suboptimal, especially for bulk data transfers, and can
put significant strains on system resources, such as host memory bandwidth,
PCIe bandwidth, etc. One important reason behind the current state is the
kernel’s lack of semantics to express device to network transfers.
* Proposal:
In this patch series we attempt to optimize this use case by implementing
socket APIs that enable the user to:
1. send device memory across the network directly, and
2. receive incoming network packets directly into device memory.
Packet _payloads_ go directly from the NIC to device memory for receive and from
device memory to NIC for transmit.
Packet _headers_ go to/from host memory and are processed by the TCP/IP stack
normally. The NIC _must_ support header split to achieve this.
Advantages:
- Alleviate host memory bandwidth pressure, compared to existing
network-transfer + device-copy semantics.
- Alleviate PCIe BW pressure, by limiting data transfer to the lowest level
of the PCIe tree, compared to traditional path which sends data through the
root complex.
* Patch overview:
** Part 1: netlink API
Gives user ability to bind dma-buf to an RX queue.
** Part 2: scatterlist support
Currently the standard for device memory sharing is DMABUF, which doesn't
generate struct pages. On the other hand, networking stack (skbs, drivers, and
page pool) operate on pages. We have 2 options:
1. Generate struct pages for dmabuf device memory, or,
2. Modify the networking stack to process scatterlist.
Approach #1 was attempted in RFC v1. RFC v2 implements approach #2.
** part 3: page pool support
We piggy back on page pool memory providers proposal:
https://github.com/kuba-moo/linux/tree/pp-providers
It allows the page pool to define a memory provider that provides the
page allocation and freeing. It helps abstract most of the device memory
TCP changes from the driver.
** part 4: support for unreadable skb frags
Page pool iovs are not accessible by the host; we implement changes
throughput the networking stack to correctly handle skbs with unreadable
frags.
** Part 5: recvmsg() APIs
We define user APIs for the user to send and receive device memory.
Not included with this series is the GVE devmem TCP support, just to
simplify the review. Code available here if desired:
https://github.com/mina/linux/tree/tcpdevmem
This series is built on top of net-next with Jakub's pp-providers changes
cherry-picked.
* NIC dependencies:
1. (strict) Devmem TCP require the NIC to support header split, i.e. the
capability to split incoming packets into a header + payload and to put
each into a separate buffer. Devmem TCP works by using device memory
for the packet payload, and host memory for the packet headers.
2. (optional) Devmem TCP works better with flow steering support & RSS support,
i.e. the NIC's ability to steer flows into certain rx queues. This allows the
sysadmin to enable devmem TCP on a subset of the rx queues, and steer
devmem TCP traffic onto these queues and non devmem TCP elsewhere.
The NIC I have access to with these properties is the GVE with DQO support
running in Google Cloud, but any NIC that supports these features would suffice.
I may be able to help reviewers bring up devmem TCP on their NICs.
* Testing:
The series includes a udmabuf kselftest that show a simple use case of
devmem TCP and validates the entire data path end to end without
a dependency on a specific dmabuf provider.
** Test Setup
Kernel: net-next with this series and memory provider API cherry-picked
locally.
Hardware: Google Cloud A3 VMs.
NIC: GVE with header split & RSS & flow steering support.
Cc: Pavel Begunkov <asml.silence(a)gmail.com>
Cc: David Wei <dw(a)davidwei.uk>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Yunsheng Lin <linyunsheng(a)huawei.com>
Cc: Shailend Chand <shailend(a)google.com>
Cc: Harshitha Ramamurthy <hramamurthy(a)google.com>
Cc: Shakeel Butt <shakeel.butt(a)linux.dev>
Cc: Jeroen de Borst <jeroendb(a)google.com>
Cc: Praveen Kaligineedi <pkaligineedi(a)google.com>
Jakub Kicinski (1):
net: page_pool: create hooks for custom page providers
Mina Almasry (13):
queue_api: define queue api
net: netdev netlink api to bind dma-buf to a net device
netdev: support binding dma-buf to netdevice
netdev: netdevice devmem allocator
page_pool: convert to use netmem
page_pool: devmem support
memory-provider: dmabuf devmem memory provider
net: support non paged skb frags
net: add support for skbs with unreadable frags
tcp: RX path for devmem TCP
net: add SO_DEVMEM_DONTNEED setsockopt to release RX frags
net: add devmem TCP documentation
selftests: add ncdevmem, netcat for devmem TCP
Documentation/netlink/specs/netdev.yaml | 57 +++
Documentation/networking/devmem.rst | 256 +++++++++++
Documentation/networking/index.rst | 1 +
arch/alpha/include/uapi/asm/socket.h | 6 +
arch/mips/include/uapi/asm/socket.h | 6 +
arch/parisc/include/uapi/asm/socket.h | 6 +
arch/sparc/include/uapi/asm/socket.h | 6 +
include/linux/netdevice.h | 3 +
include/linux/skbuff.h | 73 +++-
include/linux/socket.h | 1 +
include/net/devmem.h | 124 ++++++
include/net/netdev_queues.h | 27 ++
include/net/netdev_rx_queue.h | 2 +
include/net/netmem.h | 234 +++++++++-
include/net/page_pool/helpers.h | 155 +++++--
include/net/page_pool/types.h | 33 +-
include/net/sock.h | 2 +
include/net/tcp.h | 5 +-
include/trace/events/page_pool.h | 29 +-
include/uapi/asm-generic/socket.h | 6 +
include/uapi/linux/netdev.h | 19 +
include/uapi/linux/uio.h | 17 +
net/bpf/test_run.c | 5 +-
net/core/Makefile | 2 +-
net/core/datagram.c | 6 +
net/core/dev.c | 6 +-
net/core/devmem.c | 425 ++++++++++++++++++
net/core/gro.c | 8 +-
net/core/netdev-genl-gen.c | 23 +
net/core/netdev-genl-gen.h | 6 +
net/core/netdev-genl.c | 107 +++++
net/core/page_pool.c | 364 +++++++++-------
net/core/skbuff.c | 110 ++++-
net/core/sock.c | 61 +++
net/ipv4/esp4.c | 2 +-
net/ipv4/tcp.c | 254 ++++++++++-
net/ipv4/tcp_input.c | 13 +-
net/ipv4/tcp_ipv4.c | 9 +
net/ipv4/tcp_minisocks.c | 2 +
net/ipv4/tcp_output.c | 5 +-
net/ipv6/esp6.c | 2 +-
net/packet/af_packet.c | 4 +-
tools/include/uapi/linux/netdev.h | 19 +
tools/testing/selftests/net/.gitignore | 1 +
tools/testing/selftests/net/Makefile | 5 +
tools/testing/selftests/net/ncdevmem.c | 546 ++++++++++++++++++++++++
46 files changed, 2776 insertions(+), 277 deletions(-)
create mode 100644 Documentation/networking/devmem.rst
create mode 100644 include/net/devmem.h
create mode 100644 net/core/devmem.c
create mode 100644 tools/testing/selftests/net/ncdevmem.c
--
2.44.0.478.gd926399ef9-goog
When building with clang, via:
make LLVM=1 -C tools/testing/selftests
...two types of warnings occur:
warning: absolute value function 'abs' given an argument of type
'long' but has parameter of type 'int' which may cause truncation of
value
warning: taking the absolute value of unsigned type 'unsigned long'
has no effect
Fix these by:
a) using labs() in place of abs(), when long integers are involved, and
b) Change to use signed integer data types, in places where subtraction
is used (and could end up with negative values).
c) Remove a duplicate abs() call in cmt_test.c.
Cc: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre(a)intel.com>
Signed-off-by: John Hubbard <jhubbard(a)nvidia.com>
---
tools/testing/selftests/resctrl/cmt_test.c | 4 ++--
tools/testing/selftests/resctrl/mba_test.c | 2 +-
tools/testing/selftests/resctrl/mbm_test.c | 2 +-
3 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c
index a81f91222a89..05a241519ae8 100644
--- a/tools/testing/selftests/resctrl/cmt_test.c
+++ b/tools/testing/selftests/resctrl/cmt_test.c
@@ -40,11 +40,11 @@ static int show_results_info(unsigned long sum_llc_val, int no_of_bits,
int ret;
avg_llc_val = sum_llc_val / num_of_runs;
- avg_diff = (long)abs(cache_span - avg_llc_val);
+ avg_diff = (long)(cache_span - avg_llc_val);
diff_percent = ((float)cache_span - avg_llc_val) / cache_span * 100;
ret = platform && abs((int)diff_percent) > max_diff_percent &&
- abs(avg_diff) > max_diff;
+ labs(avg_diff) > max_diff;
ksft_print_msg("%s Check cache miss rate within %lu%%\n",
ret ? "Fail:" : "Pass:", max_diff_percent);
diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
index 7946e32e85c8..8fd16b117092 100644
--- a/tools/testing/selftests/resctrl/mba_test.c
+++ b/tools/testing/selftests/resctrl/mba_test.c
@@ -60,8 +60,8 @@ static bool show_mba_info(unsigned long *bw_imc, unsigned long *bw_resc)
/* Memory bandwidth from 100% down to 10% */
for (allocation = 0; allocation < ALLOCATION_MAX / ALLOCATION_STEP;
allocation++) {
- unsigned long avg_bw_imc, avg_bw_resc;
unsigned long sum_bw_imc = 0, sum_bw_resc = 0;
+ long avg_bw_imc, avg_bw_resc;
int avg_diff_per;
float avg_diff;
diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c
index d67ffa3ec63a..252c94ff2a3d 100644
--- a/tools/testing/selftests/resctrl/mbm_test.c
+++ b/tools/testing/selftests/resctrl/mbm_test.c
@@ -17,8 +17,8 @@
static int
show_bw_info(unsigned long *bw_imc, unsigned long *bw_resc, size_t span)
{
- unsigned long avg_bw_imc = 0, avg_bw_resc = 0;
unsigned long sum_bw_imc = 0, sum_bw_resc = 0;
+ long avg_bw_imc = 0, avg_bw_resc = 0;
int runs, ret, avg_diff_per;
float avg_diff = 0;
base-commit: 45db3ab70092637967967bfd8e6144017638563c
prerequisite-patch-id: b901ece2a5b78503e2fb5480f20e304d36a0ea27
prerequisite-patch-id: 8d96c4b8c3ed6d9ea2588ef7f594ae0f9f83c279
--
2.45.0
When building with clang, via:
make LLVM=1 -C tools/testing/selftests
...two types of warnings occur:
warning: absolute value function 'abs' given an argument of type
'long' but has parameter of type 'int' which may cause truncation of
value
warning: taking the absolute value of unsigned type 'unsigned long'
has no effect
Fix these by:
a) using labs() in place of abs(), when long integers are involved, and
b) Change to use signed integer data types, in places where subtraction
is used (and could end up with negative values).
c) Remove a duplicate abs() call in cmt_test.c.
Cc: Reinette Chatre <reinette.chatre(a)intel.com>
Cc: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
Signed-off-by: John Hubbard <jhubbard(a)nvidia.com>
---
tools/testing/selftests/resctrl/cmt_test.c | 4 ++--
tools/testing/selftests/resctrl/mba_test.c | 2 +-
tools/testing/selftests/resctrl/mbm_test.c | 2 +-
3 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c
index a81f91222a89..05a241519ae8 100644
--- a/tools/testing/selftests/resctrl/cmt_test.c
+++ b/tools/testing/selftests/resctrl/cmt_test.c
@@ -40,11 +40,11 @@ static int show_results_info(unsigned long sum_llc_val, int no_of_bits,
int ret;
avg_llc_val = sum_llc_val / num_of_runs;
- avg_diff = (long)abs(cache_span - avg_llc_val);
+ avg_diff = (long)(cache_span - avg_llc_val);
diff_percent = ((float)cache_span - avg_llc_val) / cache_span * 100;
ret = platform && abs((int)diff_percent) > max_diff_percent &&
- abs(avg_diff) > max_diff;
+ labs(avg_diff) > max_diff;
ksft_print_msg("%s Check cache miss rate within %lu%%\n",
ret ? "Fail:" : "Pass:", max_diff_percent);
diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
index 7946e32e85c8..5fffbc9ff6a4 100644
--- a/tools/testing/selftests/resctrl/mba_test.c
+++ b/tools/testing/selftests/resctrl/mba_test.c
@@ -60,7 +60,7 @@ static bool show_mba_info(unsigned long *bw_imc, unsigned long *bw_resc)
/* Memory bandwidth from 100% down to 10% */
for (allocation = 0; allocation < ALLOCATION_MAX / ALLOCATION_STEP;
allocation++) {
- unsigned long avg_bw_imc, avg_bw_resc;
+ long avg_bw_imc, avg_bw_resc;
unsigned long sum_bw_imc = 0, sum_bw_resc = 0;
int avg_diff_per;
float avg_diff;
diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c
index d67ffa3ec63a..a4c3ea49b0e8 100644
--- a/tools/testing/selftests/resctrl/mbm_test.c
+++ b/tools/testing/selftests/resctrl/mbm_test.c
@@ -17,7 +17,7 @@
static int
show_bw_info(unsigned long *bw_imc, unsigned long *bw_resc, size_t span)
{
- unsigned long avg_bw_imc = 0, avg_bw_resc = 0;
+ long avg_bw_imc = 0, avg_bw_resc = 0;
unsigned long sum_bw_imc = 0, sum_bw_resc = 0;
int runs, ret, avg_diff_per;
float avg_diff = 0;
base-commit: 45db3ab70092637967967bfd8e6144017638563c
--
2.45.0
If mnl_socket_open() or mnl_socket_bind() fails, it's generally due to
not having the user space parts fully installed and configured
correctly. This was previously ignored and reported as a test PASS, but
what really happened is that the tests were being skipped. This led to
generating inaccurate TAP output (I've omitted the leading '#'
character):
Totals: pass:3 fail:0 xfail:0 xpass:0 skip:0 error:0
Fix this by using kselftest's SKIP() macro. The new output on the same
misconfigured system shows:
Totals: pass:0 fail:0 xfail:0 xpass:0 skip:3 error:0
This was briefly discussed already with Felix Huettner [1].
[1] https://lore.kernel.org/all/edca87be-d8a7-4c62-b9c1-f9b3f5752595@nvidia.com/
Cc: Felix Huettner <felix.huettner(a)mail.schwarz>
Signed-off-by: John Hubbard <jhubbard(a)nvidia.com>
---
tools/testing/selftests/netfilter/conntrack_dump_flush.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/netfilter/conntrack_dump_flush.c b/tools/testing/selftests/netfilter/conntrack_dump_flush.c
index e9df4ae14e16..4a73afad4de4 100644
--- a/tools/testing/selftests/netfilter/conntrack_dump_flush.c
+++ b/tools/testing/selftests/netfilter/conntrack_dump_flush.c
@@ -317,12 +317,12 @@ FIXTURE_SETUP(conntrack_dump_flush)
self->sock = mnl_socket_open(NETLINK_NETFILTER);
if (!self->sock) {
perror("mnl_socket_open");
- exit(EXIT_FAILURE);
+ SKIP(exit(EXIT_FAILURE), "mnl_socket_open() failed");
}
if (mnl_socket_bind(self->sock, 0, MNL_SOCKET_AUTOPID) < 0) {
perror("mnl_socket_bind");
- exit(EXIT_FAILURE);
+ SKIP(exit(EXIT_FAILURE), "mnl_socket_bind() failed");
}
ret = conntracK_count_zone(self->sock, TEST_ZONE_ID);
base-commit: 45db3ab70092637967967bfd8e6144017638563c
prerequisite-patch-id: b901ece2a5b78503e2fb5480f20e304d36a0ea27
prerequisite-patch-id: 9db2d20be98dc44731d8605a3da64ff118d2546d
prerequisite-patch-id: 72413b9b47d66666f20967a664470199892fe282
--
2.45.0
First of all, in order to build with clang at all, one must first apply
Valentin Obst's build fix for LLVM [1]. Furthermore, for this particular
resctrl directory, my pending fix [2] must also be applied. Once those
fixes are in place, then when building with clang, via:
make LLVM=1 -C tools/testing/selftests
...two types of warnings occur:
warning: absolute value function 'abs' given an argument of type
'long' but has parameter of type 'int' which may cause truncation of
value
warning: taking the absolute value of unsigned type 'unsigned long'
has no effect
Fix these by:
a) using labs() in place of abs(), when long integers are involved, and
b) Change to use signed integer data types, in places where subtraction
is used (and could end up with negative values).
[1] https://lore.kernel.org/all/20240329-selftests-libmk-llvm-rfc-v1-1-2f9ed7d1…
[2] https://lore.kernel.org/all/20240503021712.78601-1-jhubbard@nvidia.com/
Cc: Reinette Chatre <reinette.chatre(a)intel.com>
Signed-off-by: John Hubbard <jhubbard(a)nvidia.com>
---
Hi Reinette,
This v2 includes a fix for the bugs that you pointed out (thanks!) in v1.
I kept the changes to signed integers minimal: only what is required in
order to get a clean clang build.
thanks,
John Hubbard
tools/testing/selftests/resctrl/cmt_test.c | 12 ++++++------
tools/testing/selftests/resctrl/mba_test.c | 4 ++--
tools/testing/selftests/resctrl/mbm_test.c | 4 ++--
3 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c
index a81f91222a89..af33abd1cca7 100644
--- a/tools/testing/selftests/resctrl/cmt_test.c
+++ b/tools/testing/selftests/resctrl/cmt_test.c
@@ -29,22 +29,22 @@ static int cmt_setup(const struct resctrl_test *test,
return 0;
}
-static int show_results_info(unsigned long sum_llc_val, int no_of_bits,
- unsigned long cache_span, unsigned long max_diff,
- unsigned long max_diff_percent, unsigned long num_of_runs,
+static int show_results_info(long sum_llc_val, int no_of_bits,
+ long cache_span, long max_diff,
+ long max_diff_percent, long num_of_runs,
bool platform)
{
- unsigned long avg_llc_val = 0;
+ long avg_llc_val = 0;
float diff_percent;
long avg_diff = 0;
int ret;
avg_llc_val = sum_llc_val / num_of_runs;
- avg_diff = (long)abs(cache_span - avg_llc_val);
+ avg_diff = labs(cache_span - avg_llc_val);
diff_percent = ((float)cache_span - avg_llc_val) / cache_span * 100;
ret = platform && abs((int)diff_percent) > max_diff_percent &&
- abs(avg_diff) > max_diff;
+ labs(avg_diff) > max_diff;
ksft_print_msg("%s Check cache miss rate within %lu%%\n",
ret ? "Fail:" : "Pass:", max_diff_percent);
diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
index 7946e32e85c8..707b07687249 100644
--- a/tools/testing/selftests/resctrl/mba_test.c
+++ b/tools/testing/selftests/resctrl/mba_test.c
@@ -60,8 +60,8 @@ static bool show_mba_info(unsigned long *bw_imc, unsigned long *bw_resc)
/* Memory bandwidth from 100% down to 10% */
for (allocation = 0; allocation < ALLOCATION_MAX / ALLOCATION_STEP;
allocation++) {
- unsigned long avg_bw_imc, avg_bw_resc;
- unsigned long sum_bw_imc = 0, sum_bw_resc = 0;
+ long avg_bw_imc, avg_bw_resc;
+ long sum_bw_imc = 0, sum_bw_resc = 0;
int avg_diff_per;
float avg_diff;
diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c
index d67ffa3ec63a..30af15020731 100644
--- a/tools/testing/selftests/resctrl/mbm_test.c
+++ b/tools/testing/selftests/resctrl/mbm_test.c
@@ -17,8 +17,8 @@
static int
show_bw_info(unsigned long *bw_imc, unsigned long *bw_resc, size_t span)
{
- unsigned long avg_bw_imc = 0, avg_bw_resc = 0;
- unsigned long sum_bw_imc = 0, sum_bw_resc = 0;
+ long avg_bw_imc = 0, avg_bw_resc = 0;
+ long sum_bw_imc = 0, sum_bw_resc = 0;
int runs, ret, avg_diff_per;
float avg_diff = 0;
base-commit: ddb4c3f25b7b95df3d6932db0b379d768a6ebdf7
prerequisite-patch-id: b901ece2a5b78503e2fb5480f20e304d36a0ea27
prerequisite-patch-id: 8d96c4b8c3ed6d9ea2588ef7f594ae0f9f83c279
--
2.45.0
There is a spelling mistake in the help message. Fix it.
Signed-off-by: Colin Ian King <colin.i.king(a)gmail.com>
---
tools/testing/selftests/powerpc/dexcr/chdexcr.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/powerpc/dexcr/chdexcr.c b/tools/testing/selftests/powerpc/dexcr/chdexcr.c
index bda44630cada..c548d7a5bb9b 100644
--- a/tools/testing/selftests/powerpc/dexcr/chdexcr.c
+++ b/tools/testing/selftests/powerpc/dexcr/chdexcr.c
@@ -26,7 +26,7 @@ static void help(void)
"\n"
"The normal option sets the aspect in the DEXCR. The --no- variant\n"
"clears that aspect. For example, --ibrtpd sets the IBRTPD aspect bit,\n"
- "so indirect branch predicition will be disabled in the provided program.\n"
+ "so indirect branch prediction will be disabled in the provided program.\n"
"Conversely, --no-ibrtpd clears the aspect bit, so indirect branch\n"
"prediction may occur.\n"
"\n"
--
2.39.2
Without this change the created netns instances are not cleared after
this script execution. To fix this problem the cleanup_all_ns function
from ../lib.sh is called.
Signed-off-by: Lukasz Majewski <lukma(a)denx.de>
---
tools/testing/selftests/net/hsr/hsr_redbox.sh | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tools/testing/selftests/net/hsr/hsr_redbox.sh b/tools/testing/selftests/net/hsr/hsr_redbox.sh
index 52e0412c32e6..db69be95ecb3 100755
--- a/tools/testing/selftests/net/hsr/hsr_redbox.sh
+++ b/tools/testing/selftests/net/hsr/hsr_redbox.sh
@@ -86,6 +86,8 @@ setup_hsr_interfaces()
check_prerequisites
setup_ns ns1 ns2 ns3
+trap cleanup_all_ns EXIT
+
setup_hsr_interfaces 1
do_complete_ping_test
--
2.20.1
When building with clang, via:
make LLVM=1 -C tools/testing/selftest
...clang warns about several cases of using a signed integer for the
priority argument to mq_receive(3), which expects an unsigned int.
Fix this by declaring the type as unsigned int in all cases.
Signed-off-by: John Hubbard <jhubbard(a)nvidia.com>
---
tools/testing/selftests/mqueue/mq_perf_tests.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/mqueue/mq_perf_tests.c b/tools/testing/selftests/mqueue/mq_perf_tests.c
index 5c16159d0bcd..fb898850867c 100644
--- a/tools/testing/selftests/mqueue/mq_perf_tests.c
+++ b/tools/testing/selftests/mqueue/mq_perf_tests.c
@@ -323,7 +323,8 @@ void *fake_cont_thread(void *arg)
void *cont_thread(void *arg)
{
char buff[MSG_SIZE];
- int i, priority;
+ int i;
+ unsigned int priority;
for (i = 0; i < num_cpus_to_pin; i++)
if (cpu_threads[i] == pthread_self())
@@ -425,7 +426,8 @@ struct test test2[] = {
void *perf_test_thread(void *arg)
{
char buff[MSG_SIZE];
- int prio_out, prio_in;
+ int prio_out;
+ unsigned int prio_in;
int i;
clockid_t clock;
pthread_t *t;
base-commit: f462ae0edd3703edd6f22fe41d336369c38b884b
prerequisite-patch-id: b901ece2a5b78503e2fb5480f20e304d36a0ea27
--
2.45.0
Add several new test cases which assert corner cases on the eventfd
mechanism, for example, the supplied buffer is less than 8 bytes,
attempting to write a value that is too large, etc.
./eventfd_test
# Starting 9 tests from 1 test cases.
# RUN global.eventfd_check_flag_rdwr ...
# OK global.eventfd_check_flag_rdwr
ok 1 global.eventfd_check_flag_rdwr
# RUN global.eventfd_check_flag_cloexec ...
# OK global.eventfd_check_flag_cloexec
ok 2 global.eventfd_check_flag_cloexec
# RUN global.eventfd_check_flag_nonblock ...
# OK global.eventfd_check_flag_nonblock
ok 3 global.eventfd_check_flag_nonblock
# RUN global.eventfd_chek_flag_cloexec_and_nonblock ...
# OK global.eventfd_chek_flag_cloexec_and_nonblock
ok 4 global.eventfd_chek_flag_cloexec_and_nonblock
# RUN global.eventfd_check_flag_semaphore ...
# OK global.eventfd_check_flag_semaphore
ok 5 global.eventfd_check_flag_semaphore
# RUN global.eventfd_check_write ...
# OK global.eventfd_check_write
ok 6 global.eventfd_check_write
# RUN global.eventfd_check_read ...
# OK global.eventfd_check_read
ok 7 global.eventfd_check_read
# RUN global.eventfd_check_read_with_nonsemaphore ...
# OK global.eventfd_check_read_with_nonsemaphore
ok 8 global.eventfd_check_read_with_nonsemaphore
# RUN global.eventfd_check_read_with_semaphore ...
# OK global.eventfd_check_read_with_semaphore
ok 9 global.eventfd_check_read_with_semaphore
# PASSED: 9 / 9 tests passed.
# Totals: pass:9 fail:0 xfail:0 xpass:0 skip:0 error:0
Signed-off-by: Wen Yang <wen.yang(a)linux.dev>
Cc: SShuah Khan <shuah(a)kernel.org>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: Andrei Vagin <avagin(a)google.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Steven Rostedt <rostedt(a)goodmis.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Dave Young <dyoung(a)redhat.com>
Cc: Tim Bird <tim.bird(a)sony.com>
Cc: linux-kselftest(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
---
v2: use strings which indicate what is being tested, that are useful to a human
.../filesystems/eventfd/eventfd_test.c | 136 +++++++++++++++++-
1 file changed, 131 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/filesystems/eventfd/eventfd_test.c b/tools/testing/selftests/filesystems/eventfd/eventfd_test.c
index f142a137526c..85acb4e3ef00 100644
--- a/tools/testing/selftests/filesystems/eventfd/eventfd_test.c
+++ b/tools/testing/selftests/filesystems/eventfd/eventfd_test.c
@@ -13,6 +13,8 @@
#include <sys/eventfd.h>
#include "../../kselftest_harness.h"
+#define EVENTFD_TEST_ITERATIONS 100000UL
+
struct error {
int code;
char msg[512];
@@ -40,7 +42,7 @@ static inline int sys_eventfd2(unsigned int count, int flags)
return syscall(__NR_eventfd2, count, flags);
}
-TEST(eventfd01)
+TEST(eventfd_check_flag_rdwr)
{
int fd, flags;
@@ -54,7 +56,7 @@ TEST(eventfd01)
close(fd);
}
-TEST(eventfd02)
+TEST(eventfd_check_flag_cloexec)
{
int fd, flags;
@@ -68,7 +70,7 @@ TEST(eventfd02)
close(fd);
}
-TEST(eventfd03)
+TEST(eventfd_check_flag_nonblock)
{
int fd, flags;
@@ -83,7 +85,7 @@ TEST(eventfd03)
close(fd);
}
-TEST(eventfd04)
+TEST(eventfd_chek_flag_cloexec_and_nonblock)
{
int fd, flags;
@@ -161,7 +163,7 @@ static int verify_fdinfo(int fd, struct error *err, const char *prefix,
return 0;
}
-TEST(eventfd05)
+TEST(eventfd_check_flag_semaphore)
{
struct error err = {0};
int fd, ret;
@@ -183,4 +185,128 @@ TEST(eventfd05)
close(fd);
}
+/*
+ * A write(2) fails with the error EINVAL if the size of the supplied buffer
+ * is less than 8 bytes, or if an attempt is made to write the value
+ * 0xffffffffffffffff.
+ */
+TEST(eventfd_check_write)
+{
+ uint64_t value = 1;
+ ssize_t size;
+ int fd;
+
+ fd = sys_eventfd2(0, 0);
+ ASSERT_GE(fd, 0);
+
+ size = write(fd, &value, sizeof(int));
+ EXPECT_EQ(size, -1);
+ EXPECT_EQ(errno, EINVAL);
+
+ size = write(fd, &value, sizeof(value));
+ EXPECT_EQ(size, sizeof(value));
+
+ value = (uint64_t)-1;
+ size = write(fd, &value, sizeof(value));
+ EXPECT_EQ(size, -1);
+ EXPECT_EQ(errno, EINVAL);
+
+ close(fd);
+}
+
+/*
+ * A read(2) fails with the error EINVAL if the size of the supplied buffer is
+ * less than 8 bytes.
+ */
+TEST(eventfd_check_read)
+{
+ uint64_t value;
+ ssize_t size;
+ int fd;
+
+ fd = sys_eventfd2(1, 0);
+ ASSERT_GE(fd, 0);
+
+ size = read(fd, &value, sizeof(int));
+ EXPECT_EQ(size, -1);
+ EXPECT_EQ(errno, EINVAL);
+
+ size = read(fd, &value, sizeof(value));
+ EXPECT_EQ(size, sizeof(value));
+ EXPECT_EQ(value, 1);
+
+ close(fd);
+}
+
+
+/*
+ * If EFD_SEMAPHORE was not specified and the eventfd counter has a nonzero
+ * value, then a read(2) returns 8 bytes containing that value, and the
+ * counter's value is reset to zero.
+ * If the eventfd counter is zero at the time of the call to read(2), then the
+ * call fails with the error EAGAIN if the file descriptor has been made nonblocking.
+ */
+TEST(eventfd_check_read_with_nonsemaphore)
+{
+ uint64_t value;
+ ssize_t size;
+ int fd;
+ int i;
+
+ fd = sys_eventfd2(0, EFD_NONBLOCK);
+ ASSERT_GE(fd, 0);
+
+ value = 1;
+ for (i = 0; i < EVENTFD_TEST_ITERATIONS; i++) {
+ size = write(fd, &value, sizeof(value));
+ EXPECT_EQ(size, sizeof(value));
+ }
+
+ size = read(fd, &value, sizeof(value));
+ EXPECT_EQ(size, sizeof(uint64_t));
+ EXPECT_EQ(value, EVENTFD_TEST_ITERATIONS);
+
+ size = read(fd, &value, sizeof(value));
+ EXPECT_EQ(size, -1);
+ EXPECT_EQ(errno, EAGAIN);
+
+ close(fd);
+}
+
+/*
+ * If EFD_SEMAPHORE was specified and the eventfd counter has a nonzero value,
+ * then a read(2) returns 8 bytes containing the value 1, and the counter's
+ * value is decremented by 1.
+ * If the eventfd counter is zero at the time of the call to read(2), then the
+ * call fails with the error EAGAIN if the file descriptor has been made nonblocking.
+ */
+TEST(eventfd_check_read_with_semaphore)
+{
+ uint64_t value;
+ ssize_t size;
+ int fd;
+ int i;
+
+ fd = sys_eventfd2(0, EFD_SEMAPHORE|EFD_NONBLOCK);
+ ASSERT_GE(fd, 0);
+
+ value = 1;
+ for (i = 0; i < EVENTFD_TEST_ITERATIONS; i++) {
+ size = write(fd, &value, sizeof(value));
+ EXPECT_EQ(size, sizeof(value));
+ }
+
+ for (i = 0; i < EVENTFD_TEST_ITERATIONS; i++) {
+ size = read(fd, &value, sizeof(value));
+ EXPECT_EQ(size, sizeof(value));
+ EXPECT_EQ(value, 1);
+ }
+
+ size = read(fd, &value, sizeof(value));
+ EXPECT_EQ(size, -1);
+ EXPECT_EQ(errno, EAGAIN);
+
+ close(fd);
+}
+
TEST_HARNESS_MAIN
--
2.25.1
When building with clang, via:
make LLVM=1 -C tools/testing/selftest
...clang warns about three variables that are not initialized in all
cases:
1) The opt_ipproto_off variable is used uninitialized if "testname" is
not "ip". Willem de Bruijn pointed out that this is an actual bug, and
suggested the fix that I'm using here (thanks!).
2) The addr_len is used uninitialized, but only in the assert case,
which bails out, so this is harmless.
3) The family variable in add_listener() is only used uninitialized in
the error case (neither IPv4 nor IPv6 is specified), so it's also
harmless.
Fix by initializing each variable.
Cc: Willem de Bruijn <willemdebruijn.kernel(a)gmail.com>
Signed-off-by: John Hubbard <jhubbard(a)nvidia.com>
---
tools/testing/selftests/net/gro.c | 3 +++
tools/testing/selftests/net/ip_local_port_range.c | 2 +-
tools/testing/selftests/net/mptcp/pm_nl_ctl.c | 2 +-
3 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/net/gro.c b/tools/testing/selftests/net/gro.c
index 353e1e867fbb..6038b96ecee8 100644
--- a/tools/testing/selftests/net/gro.c
+++ b/tools/testing/selftests/net/gro.c
@@ -119,6 +119,9 @@ static void setup_sock_filter(int fd)
next_off = offsetof(struct ipv6hdr, nexthdr);
ipproto_off = ETH_HLEN + next_off;
+ /* Overridden later if exthdrs are used: */
+ opt_ipproto_off = ipproto_off;
+
if (strcmp(testname, "ip") == 0) {
if (proto == PF_INET)
optlen = sizeof(struct ip_timestamp);
diff --git a/tools/testing/selftests/net/ip_local_port_range.c b/tools/testing/selftests/net/ip_local_port_range.c
index 193b82745fd8..29451d2244b7 100644
--- a/tools/testing/selftests/net/ip_local_port_range.c
+++ b/tools/testing/selftests/net/ip_local_port_range.c
@@ -359,7 +359,7 @@ TEST_F(ip_local_port_range, late_bind)
struct sockaddr_in v4;
struct sockaddr_in6 v6;
} addr;
- socklen_t addr_len;
+ socklen_t addr_len = 0;
const int one = 1;
int fd, err;
__u32 range;
diff --git a/tools/testing/selftests/net/mptcp/pm_nl_ctl.c b/tools/testing/selftests/net/mptcp/pm_nl_ctl.c
index 7426a2cbd4a0..7ad5a59adff2 100644
--- a/tools/testing/selftests/net/mptcp/pm_nl_ctl.c
+++ b/tools/testing/selftests/net/mptcp/pm_nl_ctl.c
@@ -1276,7 +1276,7 @@ int add_listener(int argc, char *argv[])
struct sockaddr_storage addr;
struct sockaddr_in6 *a6;
struct sockaddr_in *a4;
- u_int16_t family;
+ u_int16_t family = AF_UNSPEC;
int enable = 1;
int sock;
int err;
base-commit: f462ae0edd3703edd6f22fe41d336369c38b884b
prerequisite-patch-id: b901ece2a5b78503e2fb5480f20e304d36a0ea27
prerequisite-patch-id: e81ae5ca6c427dde802acd4c1442c82e170c251a
--
2.45.0
From: Clément Léger <cleger(a)rivosinc.com>
[ Upstream commit 17c67ed752d6a456602b3dbb25c5ae4d3de5deab ]
Currently, the sud_test expects the emulated syscall to return the
emulated syscall number. This assumption only works on architectures
were the syscall calling convention use the same register for syscall
number/syscall return value. This is not the case for RISC-V and thus
the return value must be also emulated using the provided ucontext.
Signed-off-by: Clément Léger <cleger(a)rivosinc.com>
Reviewed-by: Palmer Dabbelt <palmer(a)rivosinc.com>
Acked-by: Palmer Dabbelt <palmer(a)rivosinc.com>
Link: https://lore.kernel.org/r/20231206134438.473166-1-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer(a)rivosinc.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
.../selftests/syscall_user_dispatch/sud_test.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/tools/testing/selftests/syscall_user_dispatch/sud_test.c b/tools/testing/selftests/syscall_user_dispatch/sud_test.c
index b5d592d4099e8..d975a67673299 100644
--- a/tools/testing/selftests/syscall_user_dispatch/sud_test.c
+++ b/tools/testing/selftests/syscall_user_dispatch/sud_test.c
@@ -158,6 +158,20 @@ static void handle_sigsys(int sig, siginfo_t *info, void *ucontext)
/* In preparation for sigreturn. */
SYSCALL_DISPATCH_OFF(glob_sel);
+
+ /*
+ * The tests for argument handling assume that `syscall(x) == x`. This
+ * is a NOP on x86 because the syscall number is passed in %rax, which
+ * happens to also be the function ABI return register. Other
+ * architectures may need to swizzle the arguments around.
+ */
+#if defined(__riscv)
+/* REG_A7 is not defined in libc headers */
+# define REG_A7 (REG_A0 + 7)
+
+ ((ucontext_t *)ucontext)->uc_mcontext.__gregs[REG_A0] =
+ ((ucontext_t *)ucontext)->uc_mcontext.__gregs[REG_A7];
+#endif
}
TEST(dispatch_and_return)
--
2.43.0
From: Clément Léger <cleger(a)rivosinc.com>
[ Upstream commit 17c67ed752d6a456602b3dbb25c5ae4d3de5deab ]
Currently, the sud_test expects the emulated syscall to return the
emulated syscall number. This assumption only works on architectures
were the syscall calling convention use the same register for syscall
number/syscall return value. This is not the case for RISC-V and thus
the return value must be also emulated using the provided ucontext.
Signed-off-by: Clément Léger <cleger(a)rivosinc.com>
Reviewed-by: Palmer Dabbelt <palmer(a)rivosinc.com>
Acked-by: Palmer Dabbelt <palmer(a)rivosinc.com>
Link: https://lore.kernel.org/r/20231206134438.473166-1-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer(a)rivosinc.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
.../selftests/syscall_user_dispatch/sud_test.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/tools/testing/selftests/syscall_user_dispatch/sud_test.c b/tools/testing/selftests/syscall_user_dispatch/sud_test.c
index b5d592d4099e8..d975a67673299 100644
--- a/tools/testing/selftests/syscall_user_dispatch/sud_test.c
+++ b/tools/testing/selftests/syscall_user_dispatch/sud_test.c
@@ -158,6 +158,20 @@ static void handle_sigsys(int sig, siginfo_t *info, void *ucontext)
/* In preparation for sigreturn. */
SYSCALL_DISPATCH_OFF(glob_sel);
+
+ /*
+ * The tests for argument handling assume that `syscall(x) == x`. This
+ * is a NOP on x86 because the syscall number is passed in %rax, which
+ * happens to also be the function ABI return register. Other
+ * architectures may need to swizzle the arguments around.
+ */
+#if defined(__riscv)
+/* REG_A7 is not defined in libc headers */
+# define REG_A7 (REG_A0 + 7)
+
+ ((ucontext_t *)ucontext)->uc_mcontext.__gregs[REG_A0] =
+ ((ucontext_t *)ucontext)->uc_mcontext.__gregs[REG_A7];
+#endif
}
TEST(dispatch_and_return)
--
2.43.0
From: Clément Léger <cleger(a)rivosinc.com>
[ Upstream commit 17c67ed752d6a456602b3dbb25c5ae4d3de5deab ]
Currently, the sud_test expects the emulated syscall to return the
emulated syscall number. This assumption only works on architectures
were the syscall calling convention use the same register for syscall
number/syscall return value. This is not the case for RISC-V and thus
the return value must be also emulated using the provided ucontext.
Signed-off-by: Clément Léger <cleger(a)rivosinc.com>
Reviewed-by: Palmer Dabbelt <palmer(a)rivosinc.com>
Acked-by: Palmer Dabbelt <palmer(a)rivosinc.com>
Link: https://lore.kernel.org/r/20231206134438.473166-1-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer(a)rivosinc.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
.../selftests/syscall_user_dispatch/sud_test.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/tools/testing/selftests/syscall_user_dispatch/sud_test.c b/tools/testing/selftests/syscall_user_dispatch/sud_test.c
index b5d592d4099e8..d975a67673299 100644
--- a/tools/testing/selftests/syscall_user_dispatch/sud_test.c
+++ b/tools/testing/selftests/syscall_user_dispatch/sud_test.c
@@ -158,6 +158,20 @@ static void handle_sigsys(int sig, siginfo_t *info, void *ucontext)
/* In preparation for sigreturn. */
SYSCALL_DISPATCH_OFF(glob_sel);
+
+ /*
+ * The tests for argument handling assume that `syscall(x) == x`. This
+ * is a NOP on x86 because the syscall number is passed in %rax, which
+ * happens to also be the function ABI return register. Other
+ * architectures may need to swizzle the arguments around.
+ */
+#if defined(__riscv)
+/* REG_A7 is not defined in libc headers */
+# define REG_A7 (REG_A0 + 7)
+
+ ((ucontext_t *)ucontext)->uc_mcontext.__gregs[REG_A0] =
+ ((ucontext_t *)ucontext)->uc_mcontext.__gregs[REG_A7];
+#endif
}
TEST(dispatch_and_return)
--
2.43.0
From: Clément Léger <cleger(a)rivosinc.com>
[ Upstream commit 17c67ed752d6a456602b3dbb25c5ae4d3de5deab ]
Currently, the sud_test expects the emulated syscall to return the
emulated syscall number. This assumption only works on architectures
were the syscall calling convention use the same register for syscall
number/syscall return value. This is not the case for RISC-V and thus
the return value must be also emulated using the provided ucontext.
Signed-off-by: Clément Léger <cleger(a)rivosinc.com>
Reviewed-by: Palmer Dabbelt <palmer(a)rivosinc.com>
Acked-by: Palmer Dabbelt <palmer(a)rivosinc.com>
Link: https://lore.kernel.org/r/20231206134438.473166-1-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer(a)rivosinc.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
.../selftests/syscall_user_dispatch/sud_test.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/tools/testing/selftests/syscall_user_dispatch/sud_test.c b/tools/testing/selftests/syscall_user_dispatch/sud_test.c
index b5d592d4099e8..d975a67673299 100644
--- a/tools/testing/selftests/syscall_user_dispatch/sud_test.c
+++ b/tools/testing/selftests/syscall_user_dispatch/sud_test.c
@@ -158,6 +158,20 @@ static void handle_sigsys(int sig, siginfo_t *info, void *ucontext)
/* In preparation for sigreturn. */
SYSCALL_DISPATCH_OFF(glob_sel);
+
+ /*
+ * The tests for argument handling assume that `syscall(x) == x`. This
+ * is a NOP on x86 because the syscall number is passed in %rax, which
+ * happens to also be the function ABI return register. Other
+ * architectures may need to swizzle the arguments around.
+ */
+#if defined(__riscv)
+/* REG_A7 is not defined in libc headers */
+# define REG_A7 (REG_A0 + 7)
+
+ ((ucontext_t *)ucontext)->uc_mcontext.__gregs[REG_A0] =
+ ((ucontext_t *)ucontext)->uc_mcontext.__gregs[REG_A7];
+#endif
}
TEST(dispatch_and_return)
--
2.43.0
Hi, all,
With the latest 6.9-rc7 torvalds tree kernel, the "make kselftest"
always hangs in the following
./pidfd_setns_test.
Symptoms are that all ./pidfd_netne_test processes end up in pause()
syscalls, with
nothing to wake them up.
Please find the config attached. All the options from the
selftests/pidfdconfig are on (verified):
CONFIG_UTS_NS=y
CONFIG_IPC_NS=y
CONFIG_USER_NS=y
CONFIG_PID_NS=y
CONFIG_NET_NS=y
CONFIG_CGROUPS=y
CONFIG_CHECKPOINT_RESTORE=y
root 2090101 5211 0 21:53 pts/1 00:00:00 make
OUTPUT=/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/pidfd
-C pidfd run_tests
SRC_PATH=/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests
OB
root 2090102 2090101 0 21:53 pts/1 00:00:00 /bin/sh -c
BASE_DIR="/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests";
. /home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/kselftest/runner.sh;
if
root 2133026 2090102 0 21:54 pts/1 00:00:00 /bin/sh -c
BASE_DIR="/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests";
. /home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/kselftest/runner.sh;
if
root 2133027 2133026 0 21:54 pts/1 00:00:00 /bin/sh -c
BASE_DIR="/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests";
. /home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/kselftest/runner.sh;
if
root 2133028 2133027 0 21:54 pts/1 00:00:00 /bin/sh -c
BASE_DIR="/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests";
. /home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/kselftest/runner.sh;
if
root 2133031 2133028 0 21:54 pts/1 00:00:00 /bin/sh -c
BASE_DIR="/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests";
. /home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/kselftest/runner.sh;
if
root 2133033 2133031 0 21:54 pts/1 00:00:00 perl
/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/kselftest/prefix.pl
root 2133050 2887 0 21:54 pts/1 00:00:00 ./pidfd_setns_test
root 2133051 2887 0 21:54 pts/1 00:00:00 ./pidfd_setns_test
root 2133056 2887 0 21:54 pts/1 00:00:00 ./pidfd_setns_test
root 2133057 2887 0 21:54 pts/1 00:00:00 ./pidfd_setns_test
root 2133062 2887 0 21:54 pts/1 00:00:00 ./pidfd_setns_test
root 2133063 2887 0 21:54 pts/1 00:00:00 ./pidfd_setns_test
root 2133068 2887 0 21:54 pts/1 00:00:00 ./pidfd_setns_test
root 2133069 2887 0 21:54 pts/1 00:00:00 ./pidfd_setns_test
.
.
.
marvin@defiant:~/linux/kernel/linux_torvalds$ sudo bash
[sudo] password for marvin:
root@defiant:/home/marvin/linux/kernel/linux_torvalds# strace -p 2133050
strace: Process 2133050 attached
pause(^Cstrace: Process 2133050 detached
<detached ...>
root@defiant:/home/marvin/linux/kernel/linux_torvalds# strace -p 2133051
strace: Process 2133051 attached
pause(^Cstrace: Process 2133051 detached
<detached ...>
root@defiant:/home/marvin/linux/kernel/linux_torvalds# strace -p 2133056
strace: Process 2133056 attached
pause(^Cstrace: Process 2133056 detached
<detached ...>
root@defiant:/home/marvin/linux/kernel/linux_torvalds# strace -p 2133057
strace: Process 2133057 attached
pause(^Cstrace: Process 2133057 detached
<detached ...>
root@defiant:/home/marvin/linux/kernel/linux_torvalds# strace -p 2133062
strace: Process 2133062 attached
pause(^Cstrace: Process 2133062 detached
<detached ...>
root@defiant:/home/marvin/linux/kernel/linux_torvalds# strace -p 2133063
strace: Process 2133063 attached
pause(^Cstrace: Process 2133063 detached
<detached ...>
root@defiant:/home/marvin/linux/kernel/linux_torvalds#
root@defiant:/home/marvin/linux/kernel/linux_torvalds#
root@defiant:/home/marvin/linux/kernel/linux_torvalds# strace -p 2133068
strace: Process 2133068 attached
pause(^Cstrace: Process 2133068 detached
<detached ...>
root@defiant:/home/marvin/linux/kernel/linux_torvalds# strace -p 2133069
strace: Process 2133069 attached
pause(^Cstrace: Process 2133069 detached
<detached ...>
root@defiant:/home/marvin/linux/kernel/linux_torvalds#
The output log is:
# selftests: pidfd: pidfd_setns_test
# TAP version 13
# 1..7
# # Starting 7 tests from 2 test cases.
# # RUN global.setns_einval ...
# # OK global.setns_einval
# ok 1 global.setns_einval
# # RUN current_nsset.invalid_flags ...
# # pidfd_setns_test.c:161:invalid_flags:Expected
self->child_pid_exited (0) > 0 (0)
# # OK current_nsset.invalid_flags
# ok 2 current_nsset.invalid_flags
# # RUN current_nsset.pidfd_exited_child ...
# # pidfd_setns_test.c:161:pidfd_exited_child:Expected
self->child_pid_exited (0) > 0 (0)
# # OK current_nsset.pidfd_exited_child
# ok 3 current_nsset.pidfd_exited_child
# # RUN current_nsset.pidfd_incremental_setns ...
# # pidfd_setns_test.c:161:pidfd_incremental_setns:Expected
self->child_pid_exited (0) > 0 (0)
# # pidfd_setns_test.c:408:pidfd_incremental_setns:Managed to
correctly setns to user namespace of 2133050 via pidfd 20
# # pidfd_setns_test.c:408:pidfd_incremental_setns:Managed to
correctly setns to mnt namespace of 2133050 via pidfd 20
# # pidfd_setns_test.c:408:pidfd_incremental_setns:Managed to
correctly setns to pid namespace of 2133050 via pidfd 20
# # pidfd_setns_test.c:408:pidfd_incremental_setns:Managed to
correctly setns to uts namespace of 2133050 via pidfd 20
# # pidfd_setns_test.c:408:pidfd_incremental_setns:Managed to
correctly setns to ipc namespace of 2133050 via pidfd 20
# # pidfd_setns_test.c:408:pidfd_incremental_setns:Managed to
correctly setns to net namespace of 2133050 via pidfd 20
# # pidfd_setns_test.c:408:pidfd_incremental_setns:Managed to
correctly setns to cgroup namespace of 2133050 via pidfd 20
# # pidfd_setns_test.c:408:pidfd_incremental_setns:Managed to
correctly setns to pid_for_children namespace of 2133050 via pidfd 20
# # pidfd_setns_test.c:391:pidfd_incremental_setns:Expected
setns(self->child_pidfd1, info->flag) (-1) == 0 (0)
# # pidfd_setns_test.c:392:pidfd_incremental_setns:Too many users -
Failed to setns to time namespace of 2133050 via pidfd 20
# # pidfd_incremental_setns: Test terminated by assertion
# # FAIL current_nsset.pidfd_incremental_setns
# not ok 4 current_nsset.pidfd_incremental_setns
# # RUN current_nsset.nsfd_incremental_setns ...
# # pidfd_setns_test.c:161:nsfd_incremental_setns:Expected
self->child_pid_exited (0) > 0 (0)
# # pidfd_setns_test.c:444:nsfd_incremental_setns:Managed to correctly
setns to user namespace of 2133056 via nsfd 19
# # pidfd_setns_test.c:444:nsfd_incremental_setns:Managed to correctly
setns to mnt namespace of 2133056 via nsfd 24
# # pidfd_setns_test.c:444:nsfd_incremental_setns:Managed to correctly
setns to pid namespace of 2133056 via nsfd 27
# # pidfd_setns_test.c:444:nsfd_incremental_setns:Managed to correctly
setns to uts namespace of 2133056 via nsfd 30
# # pidfd_setns_test.c:444:nsfd_incremental_setns:Managed to correctly
setns to ipc namespace of 2133056 via nsfd 33
# # pidfd_setns_test.c:444:nsfd_incremental_setns:Managed to correctly
setns to net namespace of 2133056 via nsfd 36
# # pidfd_setns_test.c:444:nsfd_incremental_setns:Managed to correctly
setns to cgroup namespace of 2133056 via nsfd 39
# # pidfd_setns_test.c:444:nsfd_incremental_setns:Managed to correctly
setns to pid_for_children namespace of 2133056 via nsfd 42
# # pidfd_setns_test.c:427:nsfd_incremental_setns:Expected
setns(self->child_nsfds1[i], info->flag) (-1) == 0 (0)
# # pidfd_setns_test.c:428:nsfd_incremental_setns:Too many users -
Failed to setns to time namespace of 2133056 via nsfd 45
# # nsfd_incremental_setns: Test terminated by assertion
# # FAIL current_nsset.nsfd_incremental_setns
# not ok 5 current_nsset.nsfd_incremental_setns
# # RUN current_nsset.pidfd_one_shot_setns ...
# # pidfd_setns_test.c:161:pidfd_one_shot_setns:Expected
self->child_pid_exited (0) > 0 (0)
# # pidfd_setns_test.c:462:pidfd_one_shot_setns:Adding user namespace
of 2133062 to list of namespaces to attach to
# # pidfd_setns_test.c:462:pidfd_one_shot_setns:Adding mnt namespace
of 2133062 to list of namespaces to attach to
# # pidfd_setns_test.c:462:pidfd_one_shot_setns:Adding pid namespace
of 2133062 to list of namespaces to attach to
# # pidfd_setns_test.c:462:pidfd_one_shot_setns:Adding uts namespace
of 2133062 to list of namespaces to attach to
# # pidfd_setns_test.c:462:pidfd_one_shot_setns:Adding ipc namespace
of 2133062 to list of namespaces to attach to
# # pidfd_setns_test.c:462:pidfd_one_shot_setns:Adding net namespace
of 2133062 to list of namespaces to attach to
# # pidfd_setns_test.c:462:pidfd_one_shot_setns:Adding cgroup
namespace of 2133062 to list of namespaces to attach to
# # pidfd_setns_test.c:462:pidfd_one_shot_setns:Adding
pid_for_children namespace of 2133062 to list of namespaces to attach
to
# # pidfd_setns_test.c:462:pidfd_one_shot_setns:Adding time namespace
of 2133062 to list of namespaces to attach to
# # pidfd_setns_test.c:466:pidfd_one_shot_setns:Expected
setns(self->child_pidfd1, flags) (-1) == 0 (0)
# # pidfd_setns_test.c:467:pidfd_one_shot_setns:Too many users -
Failed to setns to namespaces of 2133062
# # pidfd_one_shot_setns: Test terminated by assertion
# # FAIL current_nsset.pidfd_one_shot_setns
# not ok 6 current_nsset.pidfd_one_shot_setns
# # RUN current_nsset.no_foul_play ...
# # pidfd_setns_test.c:161:no_foul_play:Expected
self->child_pid_exited (0) > 0 (0)
# # pidfd_setns_test.c:506:no_foul_play:Adding user namespace of
2133068 to list of namespaces to attach to
# # pidfd_setns_test.c:506:no_foul_play:Adding mnt namespace of
2133068 to list of namespaces to attach to
# # pidfd_setns_test.c:506:no_foul_play:Adding pid namespace of
2133068 to list of namespaces to attach to
# # pidfd_setns_test.c:506:no_foul_play:Adding uts namespace of
2133068 to list of namespaces to attach to
# # pidfd_setns_test.c:506:no_foul_play:Adding ipc namespace of
2133068 to list of namespaces to attach to
# # pidfd_setns_test.c:506:no_foul_play:Adding net namespace of
2133068 to list of namespaces to attach to
# # pidfd_setns_test.c:506:no_foul_play:Adding cgroup namespace of
2133068 to list of namespaces to attach to
# # pidfd_setns_test.c:506:no_foul_play:Adding time namespace of
2133068 to list of namespaces to attach to
# # pidfd_setns_test.c:510:no_foul_play:Expected
setns(self->child_pidfd1, flags) (-1) == 0 (0)
# # pidfd_setns_test.c:511:no_foul_play:Too many users - Failed to
setns to namespaces of 2133068 vid pidfd 20
# # no_foul_play: Test terminated by assertion
# # FAIL current_nsset.no_foul_play
# not ok 7 current_nsset.no_foul_play
# # FAILED: 3 / 7 tests passed.
# # Totals: pass:3 fail:4 xfail:0 xpass:0 skip:0 error:0
Thanks for your time and patience for reviewing this BUG report.
Best regards,
Mirsad Todorovac
P.S.
I have changed the email address because of the uncertainty whether the employer
would continue to support my work on the Linux kernel testing as the
part of my work
research.
Thank you.
809216233555 ("selftests/harness: remove use of LINE_MAX") introduced
asprintf into kselftest_harness.h, which is a GNU extension and needs
_GNU_SOURCE to either be defined prior to including headers or with the
-D_GNU_SOURCE flag passed to the compiler.
Edward Liaw (10):
selftests/sgx: Compile with -D_GNU_SOURCE
selftests/alsa: Compile with -D_GNU_SOURCE
selftests/hid: Compile with -D_GNU_SOURCE
selftests/kvm: Define _GNU_SOURCE
selftests/nci: Compile with -D_GNU_SOURCE
selftests/net: Define _GNU_SOURCE
selftests/prctl: Compile with -D_GNU_SOURCE
selftests/rtc: Compile with -D_GNU_SOURCE
selftests/tdx: Compile with -D_GNU_SOURCE
selftests/user_events: Compiled with -D_GNU_SOURCE
tools/testing/selftests/alsa/Makefile | 1 +
tools/testing/selftests/hid/Makefile | 2 +-
tools/testing/selftests/kvm/x86_64/fix_hypercall_test.c | 2 ++
tools/testing/selftests/nci/Makefile | 2 +-
tools/testing/selftests/net/bind_wildcard.c | 1 +
tools/testing/selftests/net/ip_local_port_range.c | 1 +
tools/testing/selftests/net/reuseaddr_ports_exhausted.c | 2 ++
tools/testing/selftests/prctl/Makefile | 1 +
tools/testing/selftests/rtc/Makefile | 2 +-
tools/testing/selftests/sgx/Makefile | 2 +-
tools/testing/selftests/sgx/sigstruct.c | 2 --
tools/testing/selftests/tdx/Makefile | 2 +-
tools/testing/selftests/user_events/Makefile | 2 +-
tools/testing/selftests/user_events/abi_test.c | 1 -
14 files changed, 14 insertions(+), 9 deletions(-)
--
2.45.0.rc0.197.gbae5840b3b-goog
The cb fields network_offset and inner_network_offset are used instead of
skb->network_header throughout GRO.
These fields are then leveraged in the next commit to remove flush_id state
from napi_gro_cb, and stateful code in {ipv6,inet}_gro_receive which may be
unnecessarily complicated due to encapsulation support in GRO. These fields
are checked in L4 instead.
3rd patch adds tests for different flush_id flows in GRO.
v7 -> v8:
- Remove network_header use in gro
- Re-send commits after the dependent patch to net was applied
- v7:
https://lore.kernel.org/all/20240412155533.115507-1-richardbgobert@gmail.co…
v6 -> v7:
- Moved bug fixes to a separate submission in net
- Added UDP fwd benchmark
- v6:
https://lore.kernel.org/all/20240410153423.107381-1-richardbgobert@gmail.co…
v5 -> v6:
- Write inner_network_offset in vxlan and geneve
- Ignore is_atomic when DF=0
- v5:
https://lore.kernel.org/all/20240408141720.98832-1-richardbgobert@gmail.com/
v4 -> v5:
- Add 1st commit - flush id checks in udp_gro_receive segment which can be
backported by itself
- Add TCP measurements for the 5th commit
- Add flush id tests to ensure flush id logic is preserved in GRO
- Simplify gro_inet_flush by removing a branch
- v4:
https://lore.kernel.org/all/202420325182543.87683-1-richardbgobert@gmail.co…
v3 -> v4:
- Fix code comment and commit message typos
- v3:
https://lore.kernel.org/all/f939c84a-2322-4393-a5b0-9b1e0be8ed8e@gmail.com/
v2 -> v3:
- Use napi_gro_cb instead of skb->{offset}
- v2:
https://lore.kernel.org/all/2ce1600b-e733-448b-91ac-9d0ae2b866a4@gmail.com/
v1 -> v2:
- Pass p_off in *_gro_complete to fix UDP bug
- Remove more conditionals and memory fetches from inet_gro_flush
- v1:
https://lore.kernel.org/netdev/e1d22505-c5f8-4c02-a997-64248480338b@gmail.c…
Richard Gobert (3):
net: gro: use cb instead of skb->network_header
net: gro: move L3 flush checks to tcp_gro_receive and udp_gro_receive_segment
selftests/net: add flush id selftests
include/net/gro.h | 75 +++++++++++++--
net/core/gro.c | 3 -
net/ipv4/af_inet.c | 45 +--------
net/ipv4/tcp_offload.c | 18 +---
net/ipv4/udp_offload.c | 10 +-
net/ipv6/ip6_offload.c | 16 +---
net/ipv6/tcpv6_offload.c | 3 +-
tools/testing/selftests/net/gro.c | 147 ++++++++++++++++++++++++++++++
8 files changed, 225 insertions(+), 92 deletions(-)
--
2.36.1
Hi,
This sixth series just update the last patch description.
Shuah, I think this should be in -next really soon to make sure
everything works fine for the v6.9 release, which is not currently the
case. I cannot test against all kselftests though. I would prefer to
let you handle this, but I guess you're not able to do so and I'll push
it on my branch without reply from you. Even if I push it on my branch,
please push it on yours too as soon as you see this and I'll remove it
from mine.
Mark, Jakub, could you please test this series?
As reported by Kernel Test Robot [1] and Sean Christopherson [2], some
tests fail since v6.9-rc1 . This is due to the use of vfork() which
introduced some side effects. Similarly, while making it more generic,
a previous commit made some Landlock file system tests flaky, and
subject to the host's file system mount configuration.
This series fixes all these side effects by replacing vfork() with
clone3() and CLONE_VFORK, which is cleaner (no arbitrary shared memory)
and makes the Kselftest framework more robust.
I tried different approaches and I found this one to be the cleaner and
less invasive for current test cases.
I successfully ran the following tests (using TEST_F and
fork/clone/clone3, and KVM_ONE_VCPU_TEST) with this series:
- kvm:fix_hypercall_test
- kvm:sync_regs_test
- kvm:userspace_msr_exit_test
- kvm:vmx_pmu_caps_test
- landlock:fs_test
- landlock:net_test
- landlock:ptrace_test
- move_mount_set_group:move_mount_set_group_test
- net/af_unix:scm_pidfd
- perf_events:remove_on_exec
- pidfd:pidfd_getfd_test
- pidfd:pidfd_setns_test
- seccomp:seccomp_bpf
- user_events:abi_test
[1] https://lore.kernel.org/oe-lkp/202403291015.1fcfa957-oliver.sang@intel.com
[2] https://lore.kernel.org/r/ZjPelW6-AbtYvslu@google.com
Previous versions:
v1: https://lore.kernel.org/r/20240426172252.1862930-1-mic@digikod.net
v2: https://lore.kernel.org/r/20240429130931.2394118-1-mic@digikod.net
v3: https://lore.kernel.org/r/20240429191911.2552580-1-mic@digikod.net
v4: https://lore.kernel.org/r/20240502210926.145539-1-mic@digikod.net
v5: https://lore.kernel.org/r/20240503105820.300927-1-mic@digikod.net
Regards,
Mickaël Salaün (10):
selftests/pidfd: Fix config for pidfd_setns_test
selftests/landlock: Fix FS tests when run on a private mount point
selftests/harness: Fix fixture teardown
selftests/harness: Fix interleaved scheduling leading to race
conditions
selftests/landlock: Do not allocate memory in fixture data
selftests/harness: Constify fixture variants
selftests/pidfd: Fix wrong expectation
selftests/harness: Share _metadata between forked processes
selftests/harness: Fix vfork() side effects
selftests/harness: Handle TEST_F()'s explicit exit codes
tools/testing/selftests/kselftest_harness.h | 122 +++++++++++++-----
tools/testing/selftests/landlock/fs_test.c | 83 +++++++-----
tools/testing/selftests/pidfd/config | 2 +
.../selftests/pidfd/pidfd_setns_test.c | 2 +-
4 files changed, 143 insertions(+), 66 deletions(-)
base-commit: e67572cd2204894179d89bd7b984072f19313b03
--
2.45.0
When I introduced HID-BPF, I mentioned that we should ship the HID-BPF
programs in the kernel when they are fixes so that everybody can benefit
from them.
I tried multiple times to do so but I was confronted to a tough problem:
how can I make the kernel load them automatically?
I went over a few solutions, but it always came down to something either
ugly, or either not satisfying (like forcing `bpftool` to be compiled
first, or not being able to insert them as a module).
OTOH, I was working with Peter on `udev-hid-bpf`[0] as a proof of
concept on how a minimal loader should look like. This allowed me to
experiment on the BPF files and how they should look like.
And after further thoughts, I realized that `udev-hid-bpf` could very
well be the `kmod load` that we currently have:
- the kernel handles the device normally
- a udev event is emitted
- a udev rule fires `udev-hid-bpf` and load the appropriate HID-BPF
file(s) based on the modalias
Given that most HID devices are supposed to work to a minimal level when
connected without any driver, this makes the whole HID-BPF programs nice
to have but not critical. We can then postpone the HID-BPF loading when
userspace is ready.
Working with HID-BPF is also a much better user experience for end users
(as I predicted). All they have to do is to go to the `udev-hid-bpf`
project, fetch an artifact from the MR that concerns them, run
`install.sh` (no compilation required), and their devices are fixed
(minus some back and forth when the HID-BPF program needs some changes).
So I already have that loader available, and it works well enough for
our users. But the missing point was still how to "upstream" those BPF
fixes?
That's where this patch series comes in: we simply store the fixes in
the kernel under `drivers/hid/bpf/progs`, provide a way to compile them,
but also add tests for them in the selftests dir.
Once a program is accepted here, for convenience, the same program will
move from a "testing" directory to a "stable" directory on
`udev-hid-bpf`. This way, distributions don't need to follow when there
is a new program added here, they can just ship the "stable" ones from
`udev-hid-bpf`.
Signed-off-by: Benjamin Tissoires <bentiss(a)kernel.org>
[0] https://gitlab.freedesktop.org/libevdev/udev-hid-bpf/
---
Benjamin Tissoires (18):
HID: do not assume HAT Switch logical max < 8
HID: bpf: add first in-tree HID-BPF fix for the XPPen Artist 24
HID: bpf: add in-tree HID-BPF fix for the XPPen Artist 16
HID: bpf: add in-tree HID-BPF fix for the HP Elite Presenter Mouse
HID: bpf: add in-tree HID-BPF fix for the IOGear Kaliber Gaming MMOmentum mouse
HID: bpf: add in-tree HID-BPF fix for the Wacom ArtPen
HID: bpf: add in-tree HID-BPF fix for the XBox Elite 2 over Bluetooth
HID: bpf: add in-tree HID-BPF fix for the Huion Kamvas Pro 19
HID: bpf: add in-tree HID-BPF fix for the Raptor Mach 2
selftests/hid: import base_device.py from hid-tools
selftests/hid: add support for HID-BPF pre-loading before starting a test
selftests/hid: tablets: reduce the number of pen state
selftests/hid: tablets: add a couple of XP-PEN tablets
selftests/hid: tablets: also check for XP-Pen offset correction
selftests/hid: add Huion Kamvas Pro 19 tests
selftests/hid: import base_gamepad.py from hid-tools
selftests/hid: move the gamepads definitions in the test file
selftests/hid: add tests for the Raptor Mach 2 joystick
drivers/hid/bpf/progs/FR-TEC__Raptor-Mach-2.bpf.c | 185 ++++++
drivers/hid/bpf/progs/HP__Elite-Presenter.bpf.c | 58 ++
drivers/hid/bpf/progs/Huion__Kamvas-Pro-19.bpf.c | 290 +++++++++
.../hid/bpf/progs/IOGEAR__Kaliber-MMOmentum.bpf.c | 59 ++
drivers/hid/bpf/progs/Makefile | 91 +++
.../hid/bpf/progs/Microsoft__XBox-Elite-2.bpf.c | 133 ++++
drivers/hid/bpf/progs/README | 102 +++
drivers/hid/bpf/progs/Wacom__ArtPen.bpf.c | 173 +++++
drivers/hid/bpf/progs/XPPen__Artist24.bpf.c | 229 +++++++
drivers/hid/bpf/progs/XPPen__ArtistPro16Gen2.bpf.c | 274 ++++++++
drivers/hid/bpf/progs/hid_bpf.h | 15 +
drivers/hid/bpf/progs/hid_bpf_helpers.h | 170 +++++
include/linux/hid.h | 6 +-
tools/testing/selftests/hid/tests/base.py | 87 ++-
tools/testing/selftests/hid/tests/base_device.py | 421 ++++++++++++
tools/testing/selftests/hid/tests/base_gamepad.py | 238 +++++++
tools/testing/selftests/hid/tests/test_gamepad.py | 457 ++++++++++++-
tools/testing/selftests/hid/tests/test_tablet.py | 723 +++++++++++++++------
18 files changed, 3507 insertions(+), 204 deletions(-)
---
base-commit: 3e78a6c0d3e02e4cf881dc84c5127e9990f939d6
change-id: 20240328-bpf_sources-be1f3c617c5e
Best regards,
--
Benjamin Tissoires <bentiss(a)kernel.org>
Add several new test cases which assert corner cases on the eventfd
mechanism, for example, the supplied buffer is less than 8 bytes,
attempting to write a value that is too large, etc.
./eventfd_test
# Starting 9 tests from 1 test cases.
# RUN global.eventfd01 ...
# OK global.eventfd01
ok 1 global.eventfd01
# RUN global.eventfd02 ...
# OK global.eventfd02
ok 2 global.eventfd02
# RUN global.eventfd03 ...
# OK global.eventfd03
ok 3 global.eventfd03
# RUN global.eventfd04 ...
# OK global.eventfd04
ok 4 global.eventfd04
# RUN global.eventfd05 ...
# OK global.eventfd05
ok 5 global.eventfd05
# RUN global.eventfd06 ...
# OK global.eventfd06
ok 6 global.eventfd06
# RUN global.eventfd07 ...
# OK global.eventfd07
ok 7 global.eventfd07
# RUN global.eventfd08 ...
# OK global.eventfd08
ok 8 global.eventfd08
# RUN global.eventfd09 ...
# OK global.eventfd09
ok 9 global.eventfd09
# PASSED: 9 / 9 tests passed.
# Totals: pass:9 fail:0 xfail:0 xpass:0 skip:0 error:0
Signed-off-by: Wen Yang <wen.yang(a)linux.dev>
Cc: SShuah Khan <shuah(a)kernel.org>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: Andrei Vagin <avagin(a)google.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Steven Rostedt <rostedt(a)goodmis.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Dave Young <dyoung(a)redhat.com>
Cc: linux-kselftest(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
---
.../filesystems/eventfd/eventfd_test.c | 116 ++++++++++++++++++
1 file changed, 116 insertions(+)
diff --git a/tools/testing/selftests/filesystems/eventfd/eventfd_test.c b/tools/testing/selftests/filesystems/eventfd/eventfd_test.c
index f142a137526c..eeab8df5b1b5 100644
--- a/tools/testing/selftests/filesystems/eventfd/eventfd_test.c
+++ b/tools/testing/selftests/filesystems/eventfd/eventfd_test.c
@@ -183,4 +183,120 @@ TEST(eventfd05)
close(fd);
}
+/*
+ * A write(2) fails with the error EINVAL if the size of the supplied buffer
+ * is less than 8 bytes, or if an attempt is made to write the value
+ * 0xffffffffffffffff.
+ */
+TEST(eventfd06)
+{
+ uint64_t value = 1;
+ ssize_t size;
+ int fd;
+
+ fd = sys_eventfd2(0, 0);
+ ASSERT_GE(fd, 0);
+
+ size = write(fd, &value, sizeof(int));
+ EXPECT_EQ(size, -1);
+ EXPECT_EQ(errno, EINVAL);
+
+ value = (uint64_t)-1;
+ size = write(fd, &value, sizeof(value));
+ EXPECT_EQ(size, -1);
+ EXPECT_EQ(errno, EINVAL);
+
+ close(fd);
+}
+
+/*
+ * A read(2) fails with the error EINVAL if the size of the supplied buffer is
+ * less than 8 bytes.
+ */
+TEST(eventfd07)
+{
+ int value = 0;
+ ssize_t size;
+ int fd;
+
+ fd = sys_eventfd2(0, 0);
+ ASSERT_GE(fd, 0);
+
+ size = write(fd, &value, sizeof(value));
+ EXPECT_EQ(size, -1);
+ EXPECT_EQ(errno, EINVAL);
+
+ close(fd);
+}
+
+/*
+ * If EFD_SEMAPHORE was not specified and the eventfd counter has a nonzero
+ * value, then a read(2) returns 8 bytes containing that value, and the
+ * counter's value is reset to zero.
+ * If the eventfd counter is zero at the time of the call to read(2), then the
+ * call fails with the error EAGAIN if the file descriptor has been made nonblocking.
+ */
+TEST(eventfd08)
+{
+ uint64_t value;
+ ssize_t size;
+ int fd;
+ int i;
+
+ fd = sys_eventfd2(0, EFD_NONBLOCK);
+ ASSERT_GE(fd, 0);
+
+ value = 1;
+ for (i = 0; i < 10000000; i++) {
+ size = write(fd, &value, sizeof(value));
+ EXPECT_EQ(size, sizeof(value));
+ }
+
+ size = read(fd, &value, sizeof(value));
+ EXPECT_EQ(size, sizeof(uint64_t));
+ EXPECT_EQ(value, 10000000);
+
+ size = read(fd, &value, sizeof(value));
+ EXPECT_EQ(size, -1);
+ EXPECT_EQ(errno, EAGAIN);
+
+ close(fd);
+}
+
+/*
+ * If EFD_SEMAPHORE was specified and the eventfd counter has a nonzero value,
+ * then a read(2) returns 8 bytes containing the value 1, and the counter's
+ * value is decremented by 1.
+ * If the eventfd counter is zero at the time of the call to read(2), then the
+ * call fails with the error EAGAIN if the file descriptor has been made nonblocking.
+ */
+TEST(eventfd09)
+{
+ uint64_t value;
+ ssize_t size;
+ int fd;
+ int i;
+
+ fd = sys_eventfd2(0, EFD_SEMAPHORE|EFD_NONBLOCK);
+ ASSERT_GE(fd, 0);
+
+ value = 1;
+ for (i = 0; i < 10000000; i++) {
+ size = write(fd, &value, sizeof(value));
+ EXPECT_EQ(size, sizeof(value));
+ }
+
+ for (i = 0; i < 10000000; i++) {
+ size = read(fd, &value, sizeof(value));
+ EXPECT_EQ(size, sizeof(value));
+ EXPECT_EQ(value, 1);
+ }
+
+ size = read(fd, &value, sizeof(value));
+ EXPECT_EQ(size, -1);
+ EXPECT_EQ(errno, EAGAIN);
+
+ close(fd);
+}
+
TEST_HARNESS_MAIN
--
2.25.1
This is a RFC, following[0].
It works, still needs some care but this is mainly to see if this will
have a chance to get upsrteamed or if I should rely on struct_ops
instead.
Cheers,
Benjamin
Signed-off-by: Benjamin Tissoires <bentiss(a)kernel.org>
---
Benjamin Tissoires (8):
bpf: ignore sleepable prog parameter for kfuncs
bpf: add kfunc_meta parameter to push_callback_call()
bpf: implement __async and __s_async kfunc suffixes
bpf: typedef a type for the bpf_wq callbacks
selftests/bpf: rely on wq_callback_fn_t
bpf: remove one special case of is_bpf_wq_set_callback_impl_kfunc
bpf: implement __aux kfunc argument suffix to fetch prog_aux
bpf: rely on __aux suffix for bpf_wq_set_callback_impl
kernel/bpf/helpers.c | 10 +-
kernel/bpf/verifier.c | 333 +++++++++++++++++++-----
tools/testing/selftests/bpf/bpf_experimental.h | 3 +-
tools/testing/selftests/bpf/progs/wq.c | 10 +-
tools/testing/selftests/bpf/progs/wq_failures.c | 4 +-
5 files changed, 280 insertions(+), 80 deletions(-)
---
base-commit: 05cbc217aafbc631a6c2fab4accf95850cb48358
change-id: 20240507-bpf_async-bd2e65847525
Best regards,
--
Benjamin Tissoires <bentiss(a)kernel.org>
Hi,
This fix, originally intended for XFRM/IPsec, has been recommended by
Steffen Klassert to submit to the net tree.
The patch addresses a minor issue related to the IPv4 source address of
ICMP error messages, which originated from an old 2011 commit:
415b3334a21a ("icmp: Fix regression in nexthop resolution during replies.")
The omission of a "Fixes" tag in the following commit is deliberate
to prevent potential test failures and subsequent regression issues
that may arise from backporting this patch all stable kerenels.
This is a minor fix, anot not security fix.
With a seleftest I am submitting this to net-next tree.
v2->v3 : fix testscript. The IFS, space, got mangled.
v1->v2 : add kernel selftest script
Antony Antony (2):
xfrm: fix source address in icmp error generation from IPsec gateway
selftests/net: add ICMP unreachable over IPsec tunnel
net/ipv4/icmp.c | 1 -
tools/testing/selftests/net/Makefile | 1 +
tools/testing/selftests/net/xfrm_state.sh | 624 ++++++++++++++++++++++
3 files changed, 625 insertions(+), 1 deletion(-)
create mode 100755 tools/testing/selftests/net/xfrm_state.sh
--
2.30.2
Cast operation has a higher precedence than addition. The code here
wants to zero the 2nd half of the 64-bit metadata, but due to a pointer
arithmetic mistake, it writes the zero at offset 16 instead.
Just adding parentheses around "data + 4" would fix this, but I think
this will be slightly better readable with array syntax.
I was unable to test this with tools/testing/selftests/bpf/vmtest.sh,
because my glibc is newer than glibc in the provided VM image.
So I just checked the difference in the compiled code.
objdump -S tools/testing/selftests/bpf/xdp_do_redirect.test.o:
- *((__u32 *)data) = 0x42; /* metadata test value */
+ ((__u32 *)data)[0] = 0x42; /* metadata test value */
be7: 48 8d 85 30 fc ff ff lea -0x3d0(%rbp),%rax
bee: c7 00 42 00 00 00 movl $0x42,(%rax)
- *((__u32 *)data + 4) = 0;
+ ((__u32 *)data)[1] = 0;
bf4: 48 8d 85 30 fc ff ff lea -0x3d0(%rbp),%rax
- bfb: 48 83 c0 10 add $0x10,%rax
+ bfb: 48 83 c0 04 add $0x4,%rax
bff: c7 00 00 00 00 00 movl $0x0,(%rax)
Fixes: 5640b6d89434 ("selftests/bpf: fix "metadata marker" getting overwritten by the netstack")
Signed-off-by: Michal Schmidt <mschmidt(a)redhat.com>
---
tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c b/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c
index 498d3bdaa4b0..bad0ea167be7 100644
--- a/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c
@@ -107,8 +107,8 @@ void test_xdp_do_redirect(void)
.attach_point = BPF_TC_INGRESS);
memcpy(&data[sizeof(__u64)], &pkt_udp, sizeof(pkt_udp));
- *((__u32 *)data) = 0x42; /* metadata test value */
- *((__u32 *)data + 4) = 0;
+ ((__u32 *)data)[0] = 0x42; /* metadata test value */
+ ((__u32 *)data)[1] = 0;
skel = test_xdp_do_redirect__open();
if (!ASSERT_OK_PTR(skel, "skel"))
--
2.44.0
From: Geliang Tang <tanggeliang(a)kylinos.cn>
v4:
- fix a bug in v3, it should be 'if (err)', not 'if (!err)'.
- move "selftests/bpf: Use log_err in network_helpers" out of this
series.
v3:
- add two more patches.
- use log_err instead of ASSERT in v3.
- let send_recv_data return int as Martin suggested.
v2:
Address Martin's comments for v1 (thanks.)
- drop patch 1, "export send_byte helper".
- drop "WRITE_ONCE(arg.stop, 0)".
- rebased.
send_recv_data will be re-used in MPTCP bpf tests, but not included
in this set because it depends on other patches that have not been
in the bpf-next yet. It will be sent as another set soon.
Geliang Tang (3):
selftests/bpf: Add struct send_recv_arg
selftests/bpf: Export send_recv_data helper
selftests/bpf: Support nonblock for send_recv_data
tools/testing/selftests/bpf/network_helpers.c | 103 ++++++++++++++++++
tools/testing/selftests/bpf/network_helpers.h | 1 +
.../selftests/bpf/prog_tests/bpf_tcp_ca.c | 71 +-----------
3 files changed, 105 insertions(+), 70 deletions(-)
--
2.40.1
Fixes clang compilation warnings by adding explicit unsigned conversion:
parse_vdso.c:206:22: warning: passing 'const char *' to parameter of
type 'const unsigned char *' converts between pointers to integer types
where one is of the unique plain 'char' type and the other is not
[-Wpointer-sign]
ver_hash = elf_hash(version);
^~~~~~~
parse_vdso.c:59:52: note: passing argument to parameter 'name' here
static unsigned long elf_hash(const unsigned char *name)
^
parse_vdso.c:207:46: warning: passing 'const char *' to parameter of
type 'const unsigned char *' converts between pointers to integer types
where one is of the unique plain 'char' type and the other is not
[-Wpointer-sign]
ELF(Word) chain = vdso_info.bucket[elf_hash(name) % vdso_info.nbucket];
^~~~
parse_vdso.c:59:52: note: passing argument to parameter 'name' here
static unsigned long elf_hash(const unsigned char *name)
Fixes: 98eedc3a9dbf ("Document the vDSO and add a reference parser")
Signed-off-by: Edward Liaw <edliaw(a)google.com>
---
v2: update commit message with correct compiler warning
v3: fix checkpatch errors and indentation
---
tools/testing/selftests/vDSO/parse_vdso.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/vDSO/parse_vdso.c b/tools/testing/selftests/vDSO/parse_vdso.c
index 413f75620a35..9e29ff0657ea 100644
--- a/tools/testing/selftests/vDSO/parse_vdso.c
+++ b/tools/testing/selftests/vDSO/parse_vdso.c
@@ -203,8 +203,9 @@ void *vdso_sym(const char *version, const char *name)
if (!vdso_info.valid)
return 0;
- ver_hash = elf_hash(version);
- ELF(Word) chain = vdso_info.bucket[elf_hash(name) % vdso_info.nbucket];
+ ver_hash = elf_hash((const unsigned char *)version);
+ ELF(Word) chain = vdso_info.bucket[
+ elf_hash((const unsigned char *)name) % vdso_info.nbucket];
for (; chain != STN_UNDEF; chain = vdso_info.chain[chain]) {
ELF(Sym) *sym = &vdso_info.symtab[chain];
--
2.45.0.rc0.197.gbae5840b3b-goog
When building either tools/bpf/bpftool, or tools/testing/selftests/hid,
(the same Makefile is used for these), clang generates many instances of
the following:
"clang: warning: -lLLVM-17: 'linker' input unused"
Quentin points out that the LLVM version is only required in $(LIBS),
not in $(CFLAGS), so the fix is to remove it from CFLAGS.
Suggested-by: Quentin Monnet <qmo(a)kernel.org>
Signed-off-by: John Hubbard <jhubbard(a)nvidia.com>
---
tools/bpf/bpftool/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile
index e9154ace80ff..a5445a422109 100644
--- a/tools/bpf/bpftool/Makefile
+++ b/tools/bpf/bpftool/Makefile
@@ -143,7 +143,7 @@ ifeq ($(feature-llvm),1)
# If LLVM is available, use it for JIT disassembly
CFLAGS += -DHAVE_LLVM_SUPPORT
LLVM_CONFIG_LIB_COMPONENTS := mcdisassembler all-targets
- CFLAGS += $(shell $(LLVM_CONFIG) --cflags --libs $(LLVM_CONFIG_LIB_COMPONENTS))
+ CFLAGS += $(shell $(LLVM_CONFIG) --cflags)
LIBS += $(shell $(LLVM_CONFIG) --libs $(LLVM_CONFIG_LIB_COMPONENTS))
ifeq ($(shell $(LLVM_CONFIG) --shared-mode),static)
LIBS += $(shell $(LLVM_CONFIG) --system-libs $(LLVM_CONFIG_LIB_COMPONENTS))
base-commit: f462ae0edd3703edd6f22fe41d336369c38b884b
prerequisite-patch-id: b901ece2a5b78503e2fb5480f20e304d36a0ea27
--
2.45.0
clang doesn't deal well with "-pie -static": it warns that -pie is an
unused option here. Changing to "-fPIE -static" solves this problem for
clang, while keeping the gcc results identical.
The problem is visible when building via:
make LLVM=1 -C tools/testing/selftests
Again: gcc 13 produces identical binaries for all of these programs,
both before and after this commit (using "-pie"), and after (using
"-fPIE").
Also, the runtime results are the same for both clang and gcc builds.
Signed-off-by: John Hubbard <jhubbard(a)nvidia.com>
---
tools/testing/selftests/exec/Makefile | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/exec/Makefile b/tools/testing/selftests/exec/Makefile
index fb4472ddffd8..b7b54d442378 100644
--- a/tools/testing/selftests/exec/Makefile
+++ b/tools/testing/selftests/exec/Makefile
@@ -29,8 +29,8 @@ $(OUTPUT)/execveat.denatured: $(OUTPUT)/execveat
cp $< $@
chmod -x $@
$(OUTPUT)/load_address_4096: load_address.c
- $(CC) $(CFLAGS) $(LDFLAGS) -Wl,-z,max-page-size=0x1000 -pie -static $< -o $@
+ $(CC) $(CFLAGS) $(LDFLAGS) -Wl,-z,max-page-size=0x1000 -fPIE -static $< -o $@
$(OUTPUT)/load_address_2097152: load_address.c
- $(CC) $(CFLAGS) $(LDFLAGS) -Wl,-z,max-page-size=0x200000 -pie -static $< -o $@
+ $(CC) $(CFLAGS) $(LDFLAGS) -Wl,-z,max-page-size=0x200000 -fPIE -static $< -o $@
$(OUTPUT)/load_address_16777216: load_address.c
- $(CC) $(CFLAGS) $(LDFLAGS) -Wl,-z,max-page-size=0x1000000 -pie -static $< -o $@
+ $(CC) $(CFLAGS) $(LDFLAGS) -Wl,-z,max-page-size=0x1000000 -fPIE -static $< -o $@
base-commit: ddb4c3f25b7b95df3d6932db0b379d768a6ebdf7
prerequisite-patch-id: b901ece2a5b78503e2fb5480f20e304d36a0ea27
--
2.45.0
When building with clang, via:
make LLVM=1 -C tools/testing/selftest
...clang warns that "a variable sized type not at the end of a struct or
class is a GNU extension".
These cases are not easily changed, because they involve structs that
are part of the API. Fortunately, however, the tests seem to be doing
just fine (specifically, neither affected test runs any differently with
gcc vs. clang builds, on my test system) regardless of the warning. So,
all the warning is doing is preventing a clean build of selftests/net.
Fix this by suppressing this particular clang warning for the
selftests/net suite.
Signed-off-by: John Hubbard <jhubbard(a)nvidia.com>
---
tools/testing/selftests/net/Makefile | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index 7b6918d5f4af..956481174783 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -6,6 +6,10 @@ CFLAGS += -I../../../../usr/include/ $(KHDR_INCLUDES)
# Additional include paths needed by kselftest.h
CFLAGS += -I../
+ifneq ($(LLVM),)
+ CFLAGS += -Wno-gnu-variable-sized-type-not-at-end
+endif
+
TEST_PROGS := run_netsocktests run_afpackettests test_bpf.sh netdevice.sh \
rtnetlink.sh xfrm_policy.sh test_blackhole_dev.sh
TEST_PROGS += fib_tests.sh fib-onlink-tests.sh pmtu.sh udpgso.sh ip_defrag.sh
base-commit: f462ae0edd3703edd6f22fe41d336369c38b884b
prerequisite-patch-id: b901ece2a5b78503e2fb5480f20e304d36a0ea27
--
2.45.0
When building with clang, via:
make LLVM=1 -C tools/testing/selftests
...clang warns, correctly, that several functions declared with an "int"
return type are not always returning values in all cases (or at least,
clang cannot prove that they always return a value).
Fix this by returning an appropriate value for each function. Thanks to
Felix Huettner for recommending MNL_CB_OK (which is non-zero) for the
return value of the count_entries() callback.
Cc: Felix Huettner <felix.huettner(a)mail.schwarz>
Signed-off-by: John Hubbard <jhubbard(a)nvidia.com>
---
tools/testing/selftests/netfilter/conntrack_dump_flush.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/tools/testing/selftests/netfilter/conntrack_dump_flush.c b/tools/testing/selftests/netfilter/conntrack_dump_flush.c
index b11ea8ee6719..e9df4ae14e16 100644
--- a/tools/testing/selftests/netfilter/conntrack_dump_flush.c
+++ b/tools/testing/selftests/netfilter/conntrack_dump_flush.c
@@ -43,6 +43,7 @@ static int build_cta_tuple_v4(struct nlmsghdr *nlh, int type,
mnl_attr_nest_end(nlh, nest_proto);
mnl_attr_nest_end(nlh, nest);
+ return 0;
}
static int build_cta_tuple_v6(struct nlmsghdr *nlh, int type,
@@ -71,6 +72,7 @@ static int build_cta_tuple_v6(struct nlmsghdr *nlh, int type,
mnl_attr_nest_end(nlh, nest_proto);
mnl_attr_nest_end(nlh, nest);
+ return 0;
}
static int build_cta_proto(struct nlmsghdr *nlh)
@@ -90,6 +92,7 @@ static int build_cta_proto(struct nlmsghdr *nlh)
mnl_attr_nest_end(nlh, nest_proto);
mnl_attr_nest_end(nlh, nest);
+ return 0;
}
static int conntrack_data_insert(struct mnl_socket *sock, struct nlmsghdr *nlh,
@@ -207,6 +210,7 @@ static int conntrack_data_generate_v6(struct mnl_socket *sock,
static int count_entries(const struct nlmsghdr *nlh, void *data)
{
reply_counter++;
+ return MNL_CB_OK;
}
static int conntracK_count_zone(struct mnl_socket *sock, uint16_t zone)
base-commit: f462ae0edd3703edd6f22fe41d336369c38b884b
prerequisite-patch-id: b901ece2a5b78503e2fb5480f20e304d36a0ea27
prerequisite-patch-id: 9db2d20be98dc44731d8605a3da64ff118d2546d
--
2.45.0
On 5/6/24 7:41 AM, Felix Huettner wrote:
> On Sun, May 05, 2024 at 02:47:16PM -0700, John Hubbard wrote:
...
> > @@ -207,6 +210,7 @@ static int conntrack_data_generate_v6(struct
> mnl_socket *sock,
> > static int count_entries(const struct nlmsghdr *nlh, void *data)
> > {
> > reply_counter++;
> > + return 0;
>
> Hi John,
>
> This will need to return MNL_CB_OK.
> Otherwise mnl_cb_run below will abort early and the connection count
> will be wrong.
>
Thanks for catching that, I'm sending a v2 with that fix.
I was thinking about it, and expected that the pre-existing code
appeared to work because the return value was some non-zero garbage
value scrounged off of the stack (or %rax, for example on x86).
However, just a quick test showed that *any* value (O, 1==MNL_CB_OK,
or no value at all) allows the test to report success...oh, I see,
it's reporting PASSED when it really ought to say SKIPPED:
$ ./conntrack_dump_flush
TAP version 13
1..3
# Starting 3 tests from 1 test cases.
# RUN conntrack_dump_flush.test_dump_by_zone ...
mnl_socket_open: Protocol not supported
# OK conntrack_dump_flush.test_dump_by_zone
ok 1 conntrack_dump_flush.test_dump_by_zone
# RUN conntrack_dump_flush.test_flush_by_zone ...
mnl_socket_open: Protocol not supported
# OK conntrack_dump_flush.test_flush_by_zone
ok 2 conntrack_dump_flush.test_flush_by_zone
# RUN conntrack_dump_flush.test_flush_by_zone_default ...
mnl_socket_open: Protocol not supported
# OK conntrack_dump_flush.test_flush_by_zone_default
ok 3 conntrack_dump_flush.test_flush_by_zone_default
# PASSED: 3 / 3 tests passed.
# Totals: pass:3 fail:0 xfail:0 xpass:0 skip:0 error:0
As long as we are looking at this, what do you think about
this:
diff --git a/tools/testing/selftests/netfilter/conntrack_dump_flush.c
b/tools/testing/selftests/netfilter/conntrack_dump_flush.c
index e9df4ae14e16..4a73afad4de4 100644
--- a/tools/testing/selftests/netfilter/conntrack_dump_flush.c
+++ b/tools/testing/selftests/netfilter/conntrack_dump_flush.c
@@ -317,12 +317,12 @@ FIXTURE_SETUP(conntrack_dump_flush)
self->sock = mnl_socket_open(NETLINK_NETFILTER);
if (!self->sock) {
perror("mnl_socket_open");
- exit(EXIT_FAILURE);
+ SKIP(exit(EXIT_FAILURE), "mnl_socket_open() failed");
}
if (mnl_socket_bind(self->sock, 0, MNL_SOCKET_AUTOPID) < 0) {
perror("mnl_socket_bind");
- exit(EXIT_FAILURE);
+ SKIP(exit(EXIT_FAILURE), "mnl_socket_bind() failed");
}
ret = conntracK_count_zone(self->sock, TEST_ZONE_ID);
...which changes the above output, to:
$ ./conntrack_dump_flush
TAP version 13
1..3
# Starting 3 tests from 1 test cases.
# RUN conntrack_dump_flush.test_dump_by_zone ...
mnl_socket_open: Protocol not supported
# SKIP mnl_socket_open() failed
# OK conntrack_dump_flush.test_dump_by_zone
ok 1 conntrack_dump_flush.test_dump_by_zone # SKIP mnl_socket_open() failed
# RUN conntrack_dump_flush.test_flush_by_zone ...
mnl_socket_open: Protocol not supported
# SKIP mnl_socket_open() failed
# OK conntrack_dump_flush.test_flush_by_zone
ok 2 conntrack_dump_flush.test_flush_by_zone # SKIP mnl_socket_open() failed
# RUN conntrack_dump_flush.test_flush_by_zone_default ...
mnl_socket_open: Protocol not supported
# SKIP mnl_socket_open() failed
# OK conntrack_dump_flush.test_flush_by_zone_default
ok 3 conntrack_dump_flush.test_flush_by_zone_default # SKIP
mnl_socket_open() failed
# PASSED: 3 / 3 tests passed.
# Totals: pass:0 fail:0 xfail:0 xpass:0 skip:3 error:0
?
thanks,
--
John Hubbard
NVIDIA
Fixes clang compilation warnings by changing elf_hash's parameter type
to char * and casting to unsigned char * inside elf_hash:
parse_vdso.c:206:22: warning: passing 'const char *' to parameter of
type 'const unsigned char *' converts between pointers to integer types
where one is of the unique plain 'char' type and the other is not
[-Wpointer-sign]
ver_hash = elf_hash(version);
^~~~~~~
parse_vdso.c:59:52: note: passing argument to parameter 'name' here
static unsigned long elf_hash(const unsigned char *name)
^
parse_vdso.c:207:46: warning: passing 'const char *' to parameter of
type 'const unsigned char *' converts between pointers to integer types
where one is of the unique plain 'char' type and the other is not
[-Wpointer-sign]
ELF(Word) chain = vdso_info.bucket[elf_hash(name) % vdso_info.nbucket];
^~~~
parse_vdso.c:59:52: note: passing argument to parameter 'name' here
static unsigned long elf_hash(const unsigned char *name)
Fixes: 98eedc3a9dbf ("Document the vDSO and add a reference parser")
Signed-off-by: Edward Liaw <edliaw(a)google.com>
---
v2: updated commit message with correct compiler warning
v3: fixed checkpatch errors and indentation
https://lore.kernel.org/all/20240501180622.1676340-1-edliaw@google.com/
v4: moved the typecast into elf_hash based on libelf
https://sourceforge.net/p/elftoolchain/code/HEAD/tree/trunk/libelf/elf_hash…
---
tools/testing/selftests/vDSO/parse_vdso.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/vDSO/parse_vdso.c b/tools/testing/selftests/vDSO/parse_vdso.c
index 413f75620a35..33db8abd7d59 100644
--- a/tools/testing/selftests/vDSO/parse_vdso.c
+++ b/tools/testing/selftests/vDSO/parse_vdso.c
@@ -56,12 +56,15 @@ static struct vdso_info
} vdso_info;
/* Straight from the ELF specification. */
-static unsigned long elf_hash(const unsigned char *name)
+static unsigned long elf_hash(const char *name)
{
unsigned long h = 0, g;
- while (*name)
+ const unsigned char *s;
+
+ s = (const unsigned char *) name;
+ while (*s)
{
- h = (h << 4) + *name++;
+ h = (h << 4) + *s++;
if (g = h & 0xf0000000)
h ^= g >> 24;
h &= ~g;
--
2.45.0.rc1.225.g2a3ae87e7f-goog
dump_config_tree() is declared to return an int, but the compiler cannot
prove that it always returns any value at all. This leads to a clang
warning, when building via:
make LLVM=1 -C tools/testing/selftests
Furthermore, Mark Brown noticed that dump_config_tree() isn't even used
anymore, so just delete the entire function.
Cc: Mark Brown <broonie(a)kernel.org>
Signed-off-by: John Hubbard <jhubbard(a)nvidia.com>
---
tools/testing/selftests/alsa/conf.c | 13 -------------
1 file changed, 13 deletions(-)
diff --git a/tools/testing/selftests/alsa/conf.c b/tools/testing/selftests/alsa/conf.c
index 89e3656a042d..357561c1759b 100644
--- a/tools/testing/selftests/alsa/conf.c
+++ b/tools/testing/selftests/alsa/conf.c
@@ -105,19 +105,6 @@ static struct card_cfg_data *conf_data_by_card(int card, bool msg)
return NULL;
}
-static int dump_config_tree(snd_config_t *top)
-{
- snd_output_t *out;
- int err;
-
- err = snd_output_stdio_attach(&out, stdout, 0);
- if (err < 0)
- ksft_exit_fail_msg("stdout attach\n");
- if (snd_config_save(top, out))
- ksft_exit_fail_msg("config save\n");
- snd_output_close(out);
-}
-
snd_config_t *conf_load_from_file(const char *filename)
{
snd_config_t *dst;
base-commit: f462ae0edd3703edd6f22fe41d336369c38b884b
prerequisite-patch-id: b901ece2a5b78503e2fb5480f20e304d36a0ea27
--
2.45.0
First of all, in order to build with clang at all, one must first apply
Valentin Obst's build fix for LLVM [1]. Furthermore, for this particular
resctrl directory, my pending fix [2] must also be applied. Once those
fixes are in place, then when building with clang, via:
make LLVM=1 -C tools/testing/selftests
...two types of warnings occur:
warning: absolute value function 'abs' given an argument of type
'long' but has parameter of type 'int' which may cause truncation of
value
warning: taking the absolute value of unsigned type 'unsigned long'
has no effect
Fix these by:
a) using labs() in place of abs(), when long integers are involved, and
b) don't call labs() unnecessarily.
[1] https://lore.kernel.org/all/20240329-selftests-libmk-llvm-rfc-v1-1-2f9ed7d1…
[2] https://lore.kernel.org/all/20240503021712.78601-1-jhubbard@nvidia.com/
Signed-off-by: John Hubbard <jhubbard(a)nvidia.com>
---
tools/testing/selftests/resctrl/cmt_test.c | 4 ++--
tools/testing/selftests/resctrl/mba_test.c | 2 +-
tools/testing/selftests/resctrl/mbm_test.c | 2 +-
3 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c
index a81f91222a89..05a241519ae8 100644
--- a/tools/testing/selftests/resctrl/cmt_test.c
+++ b/tools/testing/selftests/resctrl/cmt_test.c
@@ -40,11 +40,11 @@ static int show_results_info(unsigned long sum_llc_val, int no_of_bits,
int ret;
avg_llc_val = sum_llc_val / num_of_runs;
- avg_diff = (long)abs(cache_span - avg_llc_val);
+ avg_diff = (long)(cache_span - avg_llc_val);
diff_percent = ((float)cache_span - avg_llc_val) / cache_span * 100;
ret = platform && abs((int)diff_percent) > max_diff_percent &&
- abs(avg_diff) > max_diff;
+ labs(avg_diff) > max_diff;
ksft_print_msg("%s Check cache miss rate within %lu%%\n",
ret ? "Fail:" : "Pass:", max_diff_percent);
diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
index 7946e32e85c8..673b2bb800f7 100644
--- a/tools/testing/selftests/resctrl/mba_test.c
+++ b/tools/testing/selftests/resctrl/mba_test.c
@@ -77,7 +77,7 @@ static bool show_mba_info(unsigned long *bw_imc, unsigned long *bw_resc)
avg_bw_imc = sum_bw_imc / (NUM_OF_RUNS - 1);
avg_bw_resc = sum_bw_resc / (NUM_OF_RUNS - 1);
- avg_diff = (float)labs(avg_bw_resc - avg_bw_imc) / avg_bw_imc;
+ avg_diff = (float)(avg_bw_resc - avg_bw_imc) / avg_bw_imc;
avg_diff_per = (int)(avg_diff * 100);
ksft_print_msg("%s Check MBA diff within %d%% for schemata %u\n",
diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c
index d67ffa3ec63a..c873793d016d 100644
--- a/tools/testing/selftests/resctrl/mbm_test.c
+++ b/tools/testing/selftests/resctrl/mbm_test.c
@@ -33,7 +33,7 @@ show_bw_info(unsigned long *bw_imc, unsigned long *bw_resc, size_t span)
avg_bw_imc = sum_bw_imc / 4;
avg_bw_resc = sum_bw_resc / 4;
- avg_diff = (float)labs(avg_bw_resc - avg_bw_imc) / avg_bw_imc;
+ avg_diff = (float)(avg_bw_resc - avg_bw_imc) / avg_bw_imc;
avg_diff_per = (int)(avg_diff * 100);
ret = avg_diff_per > MAX_DIFF_PERCENT;
base-commit: f03359bca01bf4372cf2c118cd9a987a5951b1c8
prerequisite-patch-id: b901ece2a5b78503e2fb5480f20e304d36a0ea27
prerequisite-patch-id: 8d96c4b8c3ed6d9ea2588ef7f594ae0f9f83c279
--
2.45.0
Hi,
This fifth series fixes _metadata reset and fixes the last patch to
handle code set with direct calls to _exit().
As reported by Kernel Test Robot [1], some pidfd tests fail. This is
due to the use of vfork() which introduced some side effects.
Similarly, while making it more generic, a previous commit made some
Landlock file system tests flaky, and subject to the host's file system
mount configuration.
This series fixes all these side effects by replacing vfork() with
clone3() and CLONE_VFORK, which is cleaner (no arbitrary shared memory)
and makes the Kselftest framework more robust.
I tried different approaches and I found this one to be the cleaner and
less invasive for current test cases.
I successfully ran the following tests (using TEST_F and
fork/clone/clone3, and KVM_ONE_VCPU_TEST) with this series:
- kvm:fix_hypercall_test
- kvm:sync_regs_test
- kvm:userspace_msr_exit_test
- kvm:vmx_pmu_caps_test
- landlock:fs_test
- landlock:net_test
- landlock:ptrace_test
- move_mount_set_group:move_mount_set_group_test
- net/af_unix:scm_pidfd
- perf_events:remove_on_exec
- pidfd:pidfd_getfd_test
- pidfd:pidfd_setns_test
- seccomp:seccomp_bpf
- user_events:abi_test
[1] https://lore.kernel.org/oe-lkp/202403291015.1fcfa957-oliver.sang@intel.com
Previous versions:
v1: https://lore.kernel.org/r/20240426172252.1862930-1-mic@digikod.net
v2: https://lore.kernel.org/r/20240429130931.2394118-1-mic@digikod.net
v3: https://lore.kernel.org/r/20240429191911.2552580-1-mic@digikod.net
v4: https://lore.kernel.org/r/20240502210926.145539-1-mic@digikod.net
Regards,
Mickaël Salaün (10):
selftests/pidfd: Fix config for pidfd_setns_test
selftests/landlock: Fix FS tests when run on a private mount point
selftests/harness: Fix fixture teardown
selftests/harness: Fix interleaved scheduling leading to race
conditions
selftests/landlock: Do not allocate memory in fixture data
selftests/harness: Constify fixture variants
selftests/pidfd: Fix wrong expectation
selftests/harness: Share _metadata between forked processes
selftests/harness: Fix vfork() side effects
selftests/harness: Handle TEST_F()'s explicit exit codes
tools/testing/selftests/kselftest_harness.h | 122 +++++++++++++-----
tools/testing/selftests/landlock/fs_test.c | 83 +++++++-----
tools/testing/selftests/pidfd/config | 2 +
.../selftests/pidfd/pidfd_setns_test.c | 2 +-
4 files changed, 143 insertions(+), 66 deletions(-)
base-commit: e67572cd2204894179d89bd7b984072f19313b03
--
2.45.0
When building either tools/bpf/bpftool, or tools/testing/selftests/hid,
(the same Makefile is used for these), clang generates many instances of
a warning that is useless here:
"clang: warning: -lLLVM-17: 'linker' input unused"
Silence this in both locations, by disabling that warning when building
with clang.
Signed-off-by: John Hubbard <jhubbard(a)nvidia.com>
---
tools/bpf/bpftool/Makefile | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile
index e9154ace80ff..c7457921d136 100644
--- a/tools/bpf/bpftool/Makefile
+++ b/tools/bpf/bpftool/Makefile
@@ -133,6 +133,10 @@ CFLAGS += -DUSE_LIBCAP
LIBS += -lcap
endif
+ifneq ($(LLVM),)
+ CFLAGS += -Wno-unused-command-line-argument
+endif
+
include $(wildcard $(OUTPUT)*.d)
all: $(OUTPUT)bpftool
base-commit: f462ae0edd3703edd6f22fe41d336369c38b884b
prerequisite-patch-id: b901ece2a5b78503e2fb5480f20e304d36a0ea27
--
2.45.0
When building with clang, via:
make LLVM=1 -C tools/testing/selftest
...clang warns about using "int *" interchangeably with "socklen_t *".
clang is correct, so fix this by declaring len as a socklen_t, instead
of as an int.
Signed-off-by: John Hubbard <jhubbard(a)nvidia.com>
---
tools/testing/selftests/netfilter/sctp_collision.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/netfilter/sctp_collision.c b/tools/testing/selftests/netfilter/sctp_collision.c
index 21bb1cfd8a85..91df996367e9 100644
--- a/tools/testing/selftests/netfilter/sctp_collision.c
+++ b/tools/testing/selftests/netfilter/sctp_collision.c
@@ -9,7 +9,8 @@
int main(int argc, char *argv[])
{
struct sockaddr_in saddr = {}, daddr = {};
- int sd, ret, len = sizeof(daddr);
+ int sd, ret;
+ socklen_t len = sizeof(daddr);
struct timeval tv = {25, 0};
char buf[] = "hello";
base-commit: f462ae0edd3703edd6f22fe41d336369c38b884b
prerequisite-patch-id: b901ece2a5b78503e2fb5480f20e304d36a0ea27
--
2.45.0
When building with clang, via:
make LLVM=1 -C tools/testing/selftest
...clang warns about "taking address of packed member 'write_index' ".
This is not particularly helpful, because the test code really wants to
write to exactly this location, and if it ends up being unaligned, then
the test won't work (and may segfault, depending on the CPU type).
If that ever comes up, it will be obvious and can be fixed. But all it's
doing now is prevent a clean clang build, so disable the warning.
Signed-off-by: John Hubbard <jhubbard(a)nvidia.com>
---
tools/testing/selftests/user_events/Makefile | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/tools/testing/selftests/user_events/Makefile b/tools/testing/selftests/user_events/Makefile
index 10fcd0066203..617e94344711 100644
--- a/tools/testing/selftests/user_events/Makefile
+++ b/tools/testing/selftests/user_events/Makefile
@@ -1,5 +1,10 @@
# SPDX-License-Identifier: GPL-2.0
CFLAGS += -Wl,-no-as-needed -Wall $(KHDR_INCLUDES)
+
+ifneq ($(LLVM),)
+ CFLAGS += -Wno-address-of-packed-member
+endif
+
LDLIBS += -lrt -lpthread -lm
TEST_GEN_PROGS = ftrace_test dyn_test perf_test abi_test
base-commit: f462ae0edd3703edd6f22fe41d336369c38b884b
prerequisite-patch-id: b901ece2a5b78503e2fb5480f20e304d36a0ea27
--
2.45.0
dump_config_tree() is declared to return an int, but the compiler cannot
prove that it always returns any value at all. This leads to a clang
warning, when building via:
make LLVM=1 -C tools/testing/selftests
Fix this by unconditionally returning the "err" variable if the code
reaches the end of the function.
Signed-off-by: John Hubbard <jhubbard(a)nvidia.com>
---
tools/testing/selftests/alsa/conf.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tools/testing/selftests/alsa/conf.c b/tools/testing/selftests/alsa/conf.c
index 89e3656a042d..0109fde53e6f 100644
--- a/tools/testing/selftests/alsa/conf.c
+++ b/tools/testing/selftests/alsa/conf.c
@@ -116,6 +116,8 @@ static int dump_config_tree(snd_config_t *top)
if (snd_config_save(top, out))
ksft_exit_fail_msg("config save\n");
snd_output_close(out);
+
+ return err;
}
snd_config_t *conf_load_from_file(const char *filename)
base-commit: ddb4c3f25b7b95df3d6932db0b379d768a6ebdf7
prerequisite-patch-id: b901ece2a5b78503e2fb5480f20e304d36a0ea27
--
2.45.0
Hi,
This fix, originally intended for XFRM/IPsec, has been recommended by
Steffen Klassert to submit to the net tree.
The patch addresses a minor issue related to the IPv4 source address of
ICMP error messages, which originated from an old 2011 commit:
415b3334a21a ("icmp: Fix regression in nexthop resolution during replies.")
The omission of a "Fixes" tag in the following commit is deliberate
to prevent potential test failures and subsequent regression issues
that may arise from backporting this patch all stable kerenels.
This is a minor fix, anot not security fix.
With a seleftest I am submitting this to net-next tree.
v1->v2 : add kernel selftest script
Antony Antony (2):
xfrm: fix source address in icmp error generation from IPsec gateway
selftests: add ICMP unreachable over IPsec tunnel
net/ipv4/icmp.c | 1 -
tools/testing/selftests/net/Makefile | 1 +
tools/testing/selftests/net/xfrm_state.sh | 624 ++++++++++++++++++++++
3 files changed, 625 insertions(+), 1 deletion(-)
create mode 100755 tools/testing/selftests/net/xfrm_state.sh
--
2.30.2
In the function l5_test(), variable $tests is empty when there is no .mk
file in the subsystem to be tested. It causes the following grep operation
stuck.
This fix check the variable $tests, return when it is empty.
Signed-off-by: Lu Dai <dai.lu(a)exordes.com>
---
tools/testing/selftests/kselftest_deps.sh | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/kselftest_deps.sh b/tools/testing/selftests/kselftest_deps.sh
index de59cc8f03c3..487e49fdf2a6 100755
--- a/tools/testing/selftests/kselftest_deps.sh
+++ b/tools/testing/selftests/kselftest_deps.sh
@@ -244,6 +244,7 @@ l4_test()
l5_test()
{
tests=$(find $(dirname "$test") -type f -name "*.mk")
+ [[ -z "${tests// }" ]] && return
test_libs=$(grep "^IOURING_EXTRA_LIBS +\?=" $tests | \
cut -d "=" -f 2)
--
2.39.2
From: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Since the VFS type argument test case uses fprobe events, it must
check the availablity of dynamic_events file and fprobe events syntax
in README. Without this fix, the test fails if CONFIG_FPROBE_EVENTS=n.
Fixes: ee97e5e135c6 ("selftests/ftrace: add fprobe test cases for VFS type "%pd" and "%pD"")
Signed-off-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
---
.../ftrace/test.d/dynevent/fprobe_args_vfs.tc | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/ftrace/test.d/dynevent/fprobe_args_vfs.tc b/tools/testing/selftests/ftrace/test.d/dynevent/fprobe_args_vfs.tc
index 49a833bf334c..c6a9d2466a71 100644
--- a/tools/testing/selftests/ftrace/test.d/dynevent/fprobe_args_vfs.tc
+++ b/tools/testing/selftests/ftrace/test.d/dynevent/fprobe_args_vfs.tc
@@ -1,7 +1,8 @@
#!/bin/sh
# SPDX-License-Identifier: GPL-2.0
# description: Fprobe event VFS type argument
-# requires: kprobe_events "%pd/%pD":README
+# requires: dynamic_events "%pd/%pD":README "f[:[<group>/][<event>]] <func-name>[%return] [<args>]":README
+
: "Test argument %pd with name for fprobe"
echo 'f:testprobe dput name=$arg1:%pd' > dynamic_events
Here is a couple of patches for fixing errors on ftracetest.
Shuah, can you pick these to your fixes branch? Or I also can push it.
Thank you,
---
Masami Hiramatsu (Google) (2):
selftests/ftrace: Fix BTFARG testcase to check fprobe is enabled correctly
selftests/ftrace: Fix checkbashisms errors
.../ftrace/test.d/dynevent/add_remove_btfarg.tc | 2 +-
.../ftrace/test.d/dynevent/fprobe_entry_arg.tc | 2 +-
.../ftrace/test.d/kprobe/kretprobe_entry_arg.tc | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)
--
Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Use kselftest_harness.h to simplify the code structure by eliminating
conditional logic. Enhance diagnostics by directly printing relevant info,
such as access and modification times, upon test failure. Reflecting
common I/O optimizations, the access time usually remains unchanged, while
the modify time is expected to update. Accordingly, these elements have
been logically separated.
Signed-off-by: Shengyu Li <shengyu.li.evgeny(a)gmail.com>
---
v3: Explain the need for refactoring
v2: Fixed the last Assert
---
.../testing/selftests/tty/tty_tstamp_update.c | 49 +++++++++----------
1 file changed, 22 insertions(+), 27 deletions(-)
diff --git a/tools/testing/selftests/tty/tty_tstamp_update.c b/tools/testing/selftests/tty/tty_tstamp_update.c
index 0ee97943dccc..38de211e0715 100644
--- a/tools/testing/selftests/tty/tty_tstamp_update.c
+++ b/tools/testing/selftests/tty/tty_tstamp_update.c
@@ -9,7 +9,7 @@
#include <unistd.h>
#include <linux/limits.h>
-#include "../kselftest.h"
+#include "../kselftest_harness.h"
#define MIN_TTY_PATH_LEN 8
@@ -42,47 +42,42 @@ static int write_dev_tty(void)
return r;
}
-int main(int argc, char **argv)
+TEST(tty_tstamp_update)
{
int r;
char tty[PATH_MAX] = {};
struct stat st1, st2;
- ksft_print_header();
- ksft_set_plan(1);
+ ASSERT_GE(readlink("/proc/self/fd/0", tty, PATH_MAX), 0)
+ TH_LOG("readlink on /proc/self/fd/0 failed: %m");
- r = readlink("/proc/self/fd/0", tty, PATH_MAX);
- if (r < 0)
- ksft_exit_fail_msg("readlink on /proc/self/fd/0 failed: %m\n");
-
- if (!tty_valid(tty))
- ksft_exit_skip("invalid tty path '%s'\n", tty);
+ ASSERT_TRUE(tty_valid(tty)) {
+ TH_LOG("SKIP: invalid tty path '%s'", tty);
+ _exit(KSFT_SKIP);
+ }
- r = stat(tty, &st1);
- if (r < 0)
- ksft_exit_fail_msg("stat failed on tty path '%s': %m\n", tty);
+ ASSERT_GE(stat(tty, &st1), 0)
+ TH_LOG("stat failed on tty path '%s': %m", tty);
/* We need to wait at least 8 seconds in order to observe timestamp change */
/* https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?… */
sleep(10);
r = write_dev_tty();
- if (r < 0)
- ksft_exit_fail_msg("failed to write to /dev/tty: %s\n",
- strerror(-r));
+ ASSERT_GE(r, 0)
+ TH_LOG("failed to write to /dev/tty: %s", strerror(-r));
- r = stat(tty, &st2);
- if (r < 0)
- ksft_exit_fail_msg("stat failed on tty path '%s': %m\n", tty);
+ ASSERT_GE(stat(tty, &st2), 0)
+ TH_LOG("stat failed on tty path '%s': %m", tty);
+
+ /* Validate unchanged atime under 'relatime' to ensure minimal disk I/O */
+ EXPECT_EQ(st1.st_atim.tv_sec, st2.st_atim.tv_sec);
/* We wrote to the terminal so timestamps should have been updated */
- if (st1.st_atim.tv_sec == st2.st_atim.tv_sec &&
- st1.st_mtim.tv_sec == st2.st_mtim.tv_sec) {
- ksft_test_result_fail("tty timestamps not updated\n");
- ksft_exit_fail();
- }
+ ASSERT_NE(st1.st_mtim.tv_sec, st2.st_mtim.tv_sec)
+ TH_LOG("tty timestamps not updated");
- ksft_test_result_pass(
- "timestamps of terminal '%s' updated after write to /dev/tty\n", tty);
- return EXIT_SUCCESS;
+ TH_LOG("timestamps of terminal '%s' updated after write to /dev/tty",
+ tty);
}
+TEST_HARNESS_MAIN
--
2.25.1
Currently, the migration worker delays 1-10 us, assuming that one
KVM_RUN iteration only takes a few microseconds. But if the CPU low
power wakeup latency is large enough, for example, hundreds or even
thousands of microseconds deep C-state exit latencies on x86 server
CPUs, it may happen that it's not able to wakeup the target CPU before
the migration worker starts to migrate the vCPU thread to the next CPU.
If the system workload is light, most CPUs could be at a certain low
power state, which may result in less successful migrations and fail the
migration/KVM_RUN ratio sanity check. But this is not supposed to be
deemed a test failure.
This patch adds a command line option to skip the sanity check in
this case.
Co-developed-by: Dongsheng Zhang <dongsheng.x.zhang(a)intel.com>
Signed-off-by: Dongsheng Zhang <dongsheng.x.zhang(a)intel.com>
Signed-off-by: Zide Chen <zide.chen(a)intel.com>
---
V2:
- removed the busy loop implementation
- add the new "-s" option
V3:
- drop the usleep randomization code
- removed the term C-state for less confusion for non-x86 archetectures
- changed patch subject
v4:
- replaced Signed-off-by with Co-developed-by
- changed command line option from "-s" to "-u"
- Adopted the much clearer assertion error messages provided by Sean.
V5:
- Fixed the missing SoB
---
tools/testing/selftests/kvm/rseq_test.c | 35 +++++++++++++++++++++++--
1 file changed, 33 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/kvm/rseq_test.c b/tools/testing/selftests/kvm/rseq_test.c
index 28f97fb52044..ad418a5c59dd 100644
--- a/tools/testing/selftests/kvm/rseq_test.c
+++ b/tools/testing/selftests/kvm/rseq_test.c
@@ -186,12 +186,35 @@ static void calc_min_max_cpu(void)
"Only one usable CPU, task migration not possible");
}
+static void help(const char *name)
+{
+ puts("");
+ printf("usage: %s [-h] [-u]\n", name);
+ printf(" -u: Don't sanity check the number of successful KVM_RUNs\n");
+ puts("");
+ exit(0);
+}
+
int main(int argc, char *argv[])
{
int r, i, snapshot;
struct kvm_vm *vm;
struct kvm_vcpu *vcpu;
u32 cpu, rseq_cpu;
+ bool skip_sanity_check = false;
+ int opt;
+
+ while ((opt = getopt(argc, argv, "hu")) != -1) {
+ switch (opt) {
+ case 'u':
+ skip_sanity_check = true;
+ break;
+ case 'h':
+ default:
+ help(argv[0]);
+ break;
+ }
+ }
r = sched_getaffinity(0, sizeof(possible_mask), &possible_mask);
TEST_ASSERT(!r, "sched_getaffinity failed, errno = %d (%s)", errno,
@@ -254,9 +277,17 @@ int main(int argc, char *argv[])
* getcpu() to stabilize. A 2:1 migration:KVM_RUN ratio is a fairly
* conservative ratio on x86-64, which can do _more_ KVM_RUNs than
* migrations given the 1us+ delay in the migration task.
+ *
+ * Another reason why it may have small migration:KVM_RUN ratio is that,
+ * on systems with large low power mode wakeup latency, it may happen
+ * quite often that the scheduler is not able to wake up the target CPU
+ * before the vCPU thread is scheduled to another CPU.
*/
- TEST_ASSERT(i > (NR_TASK_MIGRATIONS / 2),
- "Only performed %d KVM_RUNs, task stalled too much?", i);
+ TEST_ASSERT(skip_sanity_check || i > (NR_TASK_MIGRATIONS / 2),
+ "Only performed %d KVM_RUNs, task stalled too much? \n"
+ " Try disabling deep sleep states to reduce CPU wakeup latency,\n"
+ " e.g. via cpuidle.off=1 or setting /dev/cpu_dma_latency to '0',\n"
+ " or run with -u to disable this sanity check.", i);
pthread_join(migration_thread, NULL);
--
2.34.1
Align the behavior for gcc and clang builds by interpreting unset
`ARCH` and `CROSS_COMPILE` variables in `LLVM` builds as a sign that the
user wants to build for the host architecture.
This patch preserves the properties that setting the `ARCH` variable to an
unknown value will trigger an error that complains about insufficient
information, and that a set `CROSS_COMPILE` variable will override the
target triple that is determined based on presence/absence of `ARCH`.
When compiling with clang, i.e., `LLVM` is set, an unset `ARCH` variable in
combination with an unset `CROSS_COMPILE` variable, i.e., compiling for
the host architecture, leads to compilation failures since `lib.mk` can
not determine the clang target triple. In this case, the following error
message is displayed for each subsystem that does not set `ARCH` in its
own Makefile before including `lib.mk` (lines wrapped at 75 chrs):
make[1]: Entering directory '/mnt/build/linux/tools/testing/selftests/
sysctl'
../lib.mk:33: *** Specify CROSS_COMPILE or add '--target=' option to
lib.mk. Stop.
make[1]: Leaving directory '/mnt/build/linux/tools/testing/selftests/
sysctl'
In the same scenario a gcc build would default to the host architecture,
i.e., it would use plain `gcc`.
Fixes: 795285ef2425 ("selftests: Fix clang cross compilation")
Reviewed-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Valentin Obst <kernel(a)valentinobst.de>
---
I am not entirely sure whether this behavior is in fact known and intended
and whether the way to obtain the host target triple is sufficiently
general. The flag was introduced in llvm-8 with [1], it will be an error in
older clang versions.
The target triple you get with `-print-target-triple` may not be the
same that you would get when explicitly setting ARCH to you host
architecture. For example on my x86_64 system it get
`x86_64-pc-linux-gnu` instead of `x86_64-linux-gnu`, similar deviations
were observed when testing other clang binaries on compiler-explorer,
e.g., [2].
An alternative could be to simply do:
ARCH ?= $(shell uname -m)
before using it to select the target. Possibly with some post processing,
but at that point we would likely be replicating `scripts/subarch.include`.
This is what some subsystem Makefiles do before including `lib.mk`. This
change might make it possible to remove the explicit setting of `ARCH` from
the few subsystem Makefiles that do it.
[1]: https://reviews.llvm.org/D50755
[2]: https://godbolt.org/z/r7Gn9bvv1
Changes in v1:
- Shortened commit message.
- Link to RFC: https://lore.kernel.org/r/20240303-selftests-libmk-llvm-rfc-v1-1-9ab53e365e…
---
tools/testing/selftests/lib.mk | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/lib.mk b/tools/testing/selftests/lib.mk
index da2cade3bab0..8ae203d8ed7f 100644
--- a/tools/testing/selftests/lib.mk
+++ b/tools/testing/selftests/lib.mk
@@ -7,6 +7,8 @@ else ifneq ($(filter -%,$(LLVM)),)
LLVM_SUFFIX := $(LLVM)
endif
+CLANG := $(LLVM_PREFIX)clang$(LLVM_SUFFIX)
+
CLANG_TARGET_FLAGS_arm := arm-linux-gnueabi
CLANG_TARGET_FLAGS_arm64 := aarch64-linux-gnu
CLANG_TARGET_FLAGS_hexagon := hexagon-linux-musl
@@ -18,7 +20,13 @@ CLANG_TARGET_FLAGS_riscv := riscv64-linux-gnu
CLANG_TARGET_FLAGS_s390 := s390x-linux-gnu
CLANG_TARGET_FLAGS_x86 := x86_64-linux-gnu
CLANG_TARGET_FLAGS_x86_64 := x86_64-linux-gnu
-CLANG_TARGET_FLAGS := $(CLANG_TARGET_FLAGS_$(ARCH))
+
+# Default to host architecture if ARCH is not explicitly given.
+ifeq ($(ARCH),)
+CLANG_TARGET_FLAGS := $(shell $(CLANG) -print-target-triple)
+else
+CLANG_TARGET_FLAGS := $(CLANG_TARGET_FLAGS_$(ARCH))
+endif
ifeq ($(CROSS_COMPILE),)
ifeq ($(CLANG_TARGET_FLAGS),)
@@ -30,7 +38,7 @@ else
CLANG_FLAGS += --target=$(notdir $(CROSS_COMPILE:%-=%))
endif # CROSS_COMPILE
-CC := $(LLVM_PREFIX)clang$(LLVM_SUFFIX) $(CLANG_FLAGS) -fintegrated-as
+CC := $(CLANG) $(CLANG_FLAGS) -fintegrated-as
else
CC := $(CROSS_COMPILE)gcc
endif # LLVM
---
base-commit: 4cece764965020c22cff7665b18a012006359095
change-id: 20240303-selftests-libmk-llvm-rfc-5fe3cfa9f094
Best regards,
--
Valentin Obst <kernel(a)valentinobst.de>