Fixes and cleanups for various issues in the vDSO selftests.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh(a)linutronix.de>
---
Thomas Weißschuh (7):
selftests: vDSO: chacha: Correctly skip test if necessary
selftests: vDSO: clock_getres: Drop unused include of err.h
selftests: vDSO: vdso_test_correctness: Fix -Wold-style-definitions
selftests: vDSO: vdso_test_getrandom: Drop unused include of linux/compiler.h
selftests: vDSO: vdso_test_getrandom: Drop some dead code
selftests: vDSO: vdso_test_getrandom: Always print TAP header
selftests: vDSO: vdso_config: Avoid -Wunused-variables
tools/testing/selftests/vDSO/vdso_config.h | 2 ++
tools/testing/selftests/vDSO/vdso_test_chacha.c | 3 ++-
tools/testing/selftests/vDSO/vdso_test_clock_getres.c | 1 -
tools/testing/selftests/vDSO/vdso_test_correctness.c | 2 +-
tools/testing/selftests/vDSO/vdso_test_getrandom.c | 18 +++++-------------
5 files changed, 10 insertions(+), 16 deletions(-)
---
base-commit: 0af2f6be1b4281385b618cb86ad946eded089ac8
change-id: 20250423-selftests-vdso-fixes-d2ce74142359
Best regards,
--
Thomas Weißschuh <thomas.weissschuh(a)linutronix.de>
I'd like to cut down the memory usage of parsing vmlinux BTF in ebpf-go.
With some upcoming changes the library is sitting at 5MiB for a parse.
Most of that memory is simply copying the BTF blob into user space.
By allowing vmlinux BTF to be mmapped read-only into user space I can
cut memory usage by about 75%.
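For illustration, a minimal user-space sketch of what a consumer of the mapping might do: map the blob read-only and decode the fixed BTF header without copying the whole file. The helper name read_btf_header() is hypothetical, the use of MAP_PRIVATE with PROT_READ is an assumption about which flags the kernel side accepts, and the fallback to pread() covers kernels (or files) where mmap is refused.

```python
import mmap
import os
import struct

BTF_MAGIC = 0xEB9F
# struct btf_header starts with: u16 magic, u8 version, u8 flags, u32 hdr_len.
BTF_HDR_FMT = "=HBBI"
BTF_HDR_LEN = struct.calcsize(BTF_HDR_FMT)


def read_btf_header(path="/sys/kernel/btf/vmlinux"):
    """Hypothetical helper: decode the start of a BTF blob.

    Tries a private read-only mapping first (assumed to be what the
    kernel permits) and falls back to pread() if mmap is refused.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        try:
            with mmap.mmap(fd, 0, flags=mmap.MAP_PRIVATE,
                           prot=mmap.PROT_READ) as m:
                raw = bytes(m[:BTF_HDR_LEN])
            mapped = True
        except (OSError, ValueError):
            # mmap not supported for this file: copy only the header.
            raw = os.pread(fd, BTF_HDR_LEN, 0)
            mapped = False
    finally:
        os.close(fd)

    magic, version, flags, hdr_len = struct.unpack(BTF_HDR_FMT, raw)
    if magic != BTF_MAGIC:
        raise ValueError("not a native-endian BTF blob")
    return {"magic": magic, "version": version, "flags": flags,
            "hdr_len": hdr_len, "mapped": mapped}
```

A real parser would keep the mapping open and walk the type and string sections in place; only the header is decoded here to keep the sketch short.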
Signed-off-by: Lorenz Bauer <lmb(a)isovalent.com>
---
Lorenz Bauer (2):
btf: allow mmap of vmlinux btf
selftests: bpf: add a test for mmapable vmlinux BTF
include/asm-generic/vmlinux.lds.h | 3 +-
kernel/bpf/sysfs_btf.c | 25 ++++++-
tools/testing/selftests/bpf/prog_tests/btf_sysfs.c | 79 ++++++++++++++++++++++
3 files changed, 104 insertions(+), 3 deletions(-)
---
base-commit: 38d976c32d85ef12dcd2b8a231196f7049548477
change-id: 20250501-vmlinux-mmap-2ec5563c3ef1
Best regards,
--
Lorenz Bauer <lmb(a)isovalent.com>
Ensure the following prerequisites before executing the test:
1. 'socat' is installed on the remote host.
2. Python version supports socket.SO_INCOMING_CPU (available since v3.11).
Skip the test if either prerequisite is not met.
Reviewed-by: Nimrod Oren <noren(a)nvidia.com>
Signed-off-by: Gal Pressman <gal(a)nvidia.com>
---
Changelog -
v1->v2: https://lore.kernel.org/netdev/20250317123149.364565-1-gal@nvidia.com/
* Use require_cmd() helper (Jakub).
---
tools/testing/selftests/drivers/net/hw/rss_input_xfrm.py | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/tools/testing/selftests/drivers/net/hw/rss_input_xfrm.py b/tools/testing/selftests/drivers/net/hw/rss_input_xfrm.py
index 53bb08cc29ec..f439c434ba36 100755
--- a/tools/testing/selftests/drivers/net/hw/rss_input_xfrm.py
+++ b/tools/testing/selftests/drivers/net/hw/rss_input_xfrm.py
@@ -32,6 +32,11 @@ def test_rss_input_xfrm(cfg, ipver):
if multiprocessing.cpu_count() < 2:
raise KsftSkipEx("Need at least two CPUs to test symmetric RSS hash")
+ cfg.require_cmd("socat", remote=True)
+
+ if not hasattr(socket, "SO_INCOMING_CPU"):
+ raise KsftSkipEx("socket.SO_INCOMING_CPU was added in Python 3.11")
+
input_xfrm = cfg.ethnl.rss_get(
{'header': {'dev-name': cfg.ifname}}).get('input_xfrm')
--
2.40.1
This series is a follow-up to [1], which adds mTHP support to khugepaged.
mTHP khugepaged support is a "loose" dependency for the sysfs/sysctl
configs to make sense. Without it, the global="defer" with mTHP="inherit"
case is "undefined" behavior.
We've seen cases where customers switching from RHEL7 to RHEL8 see a
significant increase in the memory footprint for the same workloads.
Through our investigations we found that a large contributing factor to
the increase in RSS was an increase in THP usage.
For workloads like MySQL, or when using allocators like jemalloc, it is
often recommended to set /transparent_hugepages/enabled=never. This is
in part due to performance degradations and increased memory waste.
This series introduces enabled=defer. This setting acts as a middle
ground between always and madvise. If the mapping is MADV_HUGEPAGE, the
page fault handler will act normally, making a hugepage if possible. If
the allocation is not MADV_HUGEPAGE, then the page fault handler will
default to the base size allocation. The caveat is that khugepaged can
still operate on pages that are not MADV_HUGEPAGE.
This allows for three things. First, applications specifically designed to
use hugepages will get them. Second, applications that don't use
hugepages can still benefit from them without aggressively inserting
THPs at every possible chance. This curbs the memory waste and defers
the use of hugepages to khugepaged, which can then scan the memory
for regions eligible for collapse. Lastly, there is the added benefit for
those who want THPs but experience higher-latency page faults: you now get
base page performance at the page fault handler and hugepage performance
for those mappings after they collapse.
Admins may want to lower max_ptes_none; otherwise, khugepaged may
aggressively collapse single allocations into hugepages.
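As a rough illustration of the policy described above, the following sketch models the two decisions per mode: whether the page fault handler tries a hugepage, and whether khugepaged may later collapse the mapping. It is a simplified model and not the kernel implementation; the names ThpMode, fault_allocates_thp() and khugepaged_may_collapse() are made up for this example, and it ignores MADV_NOHUGEPAGE, alignment, and per-size mTHP settings.

```python
from enum import Enum


class ThpMode(Enum):
    ALWAYS = "always"
    MADVISE = "madvise"
    DEFER = "defer"
    NEVER = "never"


def fault_allocates_thp(mode, madv_hugepage):
    """True if the page fault handler would try to insert a hugepage."""
    if mode is ThpMode.ALWAYS:
        return True
    if mode in (ThpMode.MADVISE, ThpMode.DEFER):
        # defer behaves like madvise at fault time.
        return madv_hugepage
    return False


def khugepaged_may_collapse(mode, madv_hugepage):
    """True if khugepaged is allowed to collapse the mapping later."""
    if mode in (ThpMode.ALWAYS, ThpMode.DEFER):
        # defer behaves like always for khugepaged.
        return True
    if mode is ThpMode.MADVISE:
        return madv_hugepage
    return False


def describe(mode):
    """Return (fault, collapse) pairs for non-madvised and madvised mappings."""
    return {
        madv: (fault_allocates_thp(mode, madv),
               khugepaged_may_collapse(mode, madv))
        for madv in (False, True)
    }
```

In this model the only difference between defer and always is the non-madvised fault path, and the only difference between defer and madvise is that khugepaged may still collapse non-madvised mappings.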
TESTING:
- Built for x86_64, aarch64, ppc64le, and s390x
- selftests mm
- In [1] I provided a script [2] that has multiple access patterns
- lots of general use.
- redis testing. This test was my original case for the defer mode. What I
was able to prove was that THP=always leads to increased max_latency
cases, which is why it is recommended to disable THPs for redis servers.
However, with 'defer' we don't have the max_latency spikes and can still
get the system to utilize THPs. I further tested this with the mTHP
defer setting and found that redis (and probably other jemalloc users)
can utilize THPs via defer (+mTHP defer) without a large latency
penalty and with some potential gains. I uploaded some mmtest results
here [3], which compare:
stock+thp=never
stock+(m)thp=always
khugepaged-mthp + defer (max_ptes_none=64)
The results show that (m)THPs can cause some throughput regression in
some cases, but also gains in other cases. The mTHP+defer results
have more gains and fewer losses than the (m)THP=always case.
V5 Changes:
- rebased dependent series
- added reviewed-by tag on 2/4
V4 Changes:
- Minor Documentation fixes
- rebased the dependent series [1] onto mm-unstable
commit 0e68b850b1d3 ("vmalloc: use atomic_long_add_return_relaxed()")
V3 Changes:
- Combined the documentation commits into one, and moved a section to the
khugepaged mthp patchset
V2 Changes:
- base changes on mTHP khugepaged support
- Fix selftests parsing issue
- add mTHP defer option
- add mTHP defer Documentation
[1] - https://lore.kernel.org/lkml/20250428181218.85925-1-npache@redhat.com/
[2] - https://gitlab.com/npache/khugepaged_mthp_test
[3] - https://people.redhat.com/npache/mthp_khugepaged_defer/testoutput2/output.h…
Nico Pache (4):
mm: defer THP insertion to khugepaged
mm: document (m)THP defer usage
khugepaged: add defer option to mTHP options
selftests: mm: add defer to thp setting parser
Documentation/admin-guide/mm/transhuge.rst | 31 +++++++---
include/linux/huge_mm.h | 18 +++++-
mm/huge_memory.c | 69 +++++++++++++++++++---
mm/khugepaged.c | 8 +--
tools/testing/selftests/mm/thp_settings.c | 1 +
tools/testing/selftests/mm/thp_settings.h | 1 +
6 files changed, 106 insertions(+), 22 deletions(-)
--
2.48.1
In the current implementation, if a program is dev-bound to a specific
device, it cannot perform XDP_REDIRECT into a DEVMAP or CPUMAP even if
the program is running in the driver NAPI context.
Fix the issue by introducing the __bpf_prog_map_compatible() utility
routine in order to avoid the bpf_prog_is_dev_bound() check during XDP
program load.
Continue to forbid attaching a dev-bound program to XDP maps.
---
Changes in v3:
- move selftest changes to a dedicated patch
- Link to v2: https://lore.kernel.org/r/20250423-xdp-prog-bound-fix-v2-1-51742a5dfbce@ker…
Changes in v2:
- Introduce __bpf_prog_map_compatible() utility routine in order to skip
bpf_prog_is_dev_bound check in bpf_check_tail_call()
- Extend xdp_metadata selftest
- Link to v1: https://lore.kernel.org/r/20250422-xdp-prog-bound-fix-v1-1-0b581fa186fe@ker…
---
Lorenzo Bianconi (2):
bpf: Allow XDP dev-bound programs to perform XDP_REDIRECT into maps
selftests/bpf: xdp_metadata: check XDP_REDIRCT support for dev-bound progs
kernel/bpf/core.c | 27 +++++++++++++---------
.../selftests/bpf/prog_tests/xdp_metadata.c | 22 +++++++++++++++++-
tools/testing/selftests/bpf/progs/xdp_metadata.c | 13 +++++++++++
3 files changed, 50 insertions(+), 12 deletions(-)
---
base-commit: 91dbac4076537b464639953c055c460d2bdfc7ea
change-id: 20250422-xdp-prog-bound-fix-9f30f3e134aa
Best regards,
--
Lorenzo Bianconi <lorenzo(a)kernel.org>
This series fixes misaligned access handling in non-interruptible
context by reenabling interrupts when possible. A previous commit
replaced raw_copy_from_user() with copy_from_user(), which enables page
faulting and thus can sleep. While correct, this now triggers a warning
because it is called in an invalid context (sleeping in
non-interruptible context). This series fixes that problem by factoring
the misaligned load/store entry into a single function that reenables
interrupts if the interrupted context had interrupts enabled.
In order for misaligned handling problems to be caught sooner, add a
kselftest for all the currently supported instructions.
Note: these commits were originally part of another, larger series for
misaligned request delegation but were split out since they are not
directly required.
Clément Léger (5):
riscv: misaligned: factorize trap handling
riscv: misaligned: enable IRQs while handling misaligned accesses
riscv: misaligned: use get_user() instead of __get_user()
Documentation/sysctl: add riscv to unaligned-trap supported archs
selftests: riscv: add misaligned access testing
Documentation/admin-guide/sysctl/kernel.rst | 4 +-
arch/riscv/kernel/traps.c | 57 ++--
arch/riscv/kernel/traps_misaligned.c | 2 +-
.../selftests/riscv/misaligned/.gitignore | 1 +
.../selftests/riscv/misaligned/Makefile | 12 +
.../selftests/riscv/misaligned/common.S | 33 +++
.../testing/selftests/riscv/misaligned/fpu.S | 180 +++++++++++++
tools/testing/selftests/riscv/misaligned/gp.S | 103 +++++++
.../selftests/riscv/misaligned/misaligned.c | 254 ++++++++++++++++++
9 files changed, 614 insertions(+), 32 deletions(-)
create mode 100644 tools/testing/selftests/riscv/misaligned/.gitignore
create mode 100644 tools/testing/selftests/riscv/misaligned/Makefile
create mode 100644 tools/testing/selftests/riscv/misaligned/common.S
create mode 100644 tools/testing/selftests/riscv/misaligned/fpu.S
create mode 100644 tools/testing/selftests/riscv/misaligned/gp.S
create mode 100644 tools/testing/selftests/riscv/misaligned/misaligned.c
--
2.49.0
For some services we use an "established-over-unconnected" model.
'''
// create unconnected socket and 'listen()'
srv_fd = socket(AF_INET, SOCK_DGRAM)
setsockopt(srv_fd, SO_REUSEPORT)
bind(srv_fd, SERVER_ADDR, SERVER_PORT)
// 'accept()'
data, client_addr = recvmsg(srv_fd)
// create a connected socket for this request
cli_fd = socket(AF_INET, SOCK_DGRAM)
setsockopt(cli_fd, SO_REUSEPORT)
bind(cli_fd, SERVER_ADDR, SERVER_PORT)
connect(cli_fd, client_addr)
...
// do handshake with cli_fd
'''
This programming pattern simulates accept() using UDP, creating a new
socket for each client request. The server can then use separate sockets
to handle client requests, avoiding the need to use a single UDP socket
for I/O transmission.
But there is a race condition between the bind() and connect() of the
connected socket:
We might receive unexpected packets belonging to the unconnected socket
before connect() is executed, which is not what we need.
(Of course, before connect(), the unconnected socket will also receive
packets from the connected socket, which is easily resolved because
upper-layer protocols typically require explicit boundaries, and we
receive a complete packet before creating a connected socket.)
Before this patch, the connected socket had to filter requests at recvmsg
time, acting as a dispatcher to some extent. With this patch, we can
consider the bind and connect operations to be atomic.
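For reference, a runnable sketch of the pattern above, with the new option wrapped around bind() and connect(). The helper names (make_listener(), accept_client()) are invented for this example. Setting UDP_STOP_RCV before bind() and clearing it after connect() is an assumption about the intended usage, and the constant 105 is taken from the uapi hunk below since Python's socket module does not define it. On a kernel without this patch the setsockopt() fails and the sketch simply carries on, so the race is still present there.

```python
import socket

# Value from the uapi header in this patch; not defined by Python's socket module.
UDP_STOP_RCV = 105


def _reuseport_udp():
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    return s


def _set_stop_rcv(sock, on):
    """Try to toggle UDP_STOP_RCV; return False if the kernel lacks it."""
    try:
        sock.setsockopt(socket.IPPROTO_UDP, UDP_STOP_RCV, int(on))
        return True
    except OSError:
        return False


def make_listener(host="127.0.0.1", port=0):
    """Create the unconnected 'listening' socket."""
    s = _reuseport_udp()
    s.bind((host, port))
    return s


def accept_client(listener, bufsize=2048):
    """'accept()' one datagram and return a socket connected to its sender."""
    data, peer = listener.recvfrom(bufsize)
    conn = _reuseport_udp()
    try:
        # Assumed usage: keep the new socket out of lookups until it is
        # connected, so it cannot pick up other clients' datagrams.
        _set_stop_rcv(conn, True)
        conn.bind(listener.getsockname())
        conn.connect(peer)
        _set_stop_rcv(conn, False)
    except OSError:
        conn.close()
        raise
    return conn, data, peer
```

The handshake then continues on the returned connected socket, while the listener keeps serving first packets from other clients.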
Signed-off-by: Jiayuan Chen <jiayuan.chen(a)linux.dev>
---
include/linux/udp.h | 1 +
include/uapi/linux/udp.h | 1 +
net/ipv4/udp.c | 13 ++++++++++---
net/ipv6/udp.c | 5 +++--
4 files changed, 15 insertions(+), 5 deletions(-)
diff --git a/include/linux/udp.h b/include/linux/udp.h
index 895240177f4f..8d281a0c0d9d 100644
--- a/include/linux/udp.h
+++ b/include/linux/udp.h
@@ -42,6 +42,7 @@ enum {
UDP_FLAGS_ENCAP_ENABLED, /* This socket enabled encap */
UDP_FLAGS_UDPLITE_SEND_CC, /* set via udplite setsockopt */
UDP_FLAGS_UDPLITE_RECV_CC, /* set via udplite setsockopt */
+ UDP_FLAGS_STOP_RCV, /* Stop receiving packets */
};
struct udp_sock {
diff --git a/include/uapi/linux/udp.h b/include/uapi/linux/udp.h
index edca3e430305..bb8e0a749a55 100644
--- a/include/uapi/linux/udp.h
+++ b/include/uapi/linux/udp.h
@@ -34,6 +34,7 @@ struct udphdr {
#define UDP_NO_CHECK6_RX 102 /* Disable accepting checksum for UDP6 */
#define UDP_SEGMENT 103 /* Set GSO segmentation size */
#define UDP_GRO 104 /* This socket can receive UDP GRO packets */
+#define UDP_STOP_RCV 105 /* This socket will not receive any packets */
/* UDP encapsulation types */
#define UDP_ENCAP_ESPINUDP_NON_IKE 1 /* unused draft-ietf-ipsec-nat-t-ike-00/01 */
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index f9f5b92cf4b6..764d337ab1b3 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -376,7 +376,8 @@ static int compute_score(struct sock *sk, const struct net *net,
if (!net_eq(sock_net(sk), net) ||
udp_sk(sk)->udp_port_hash != hnum ||
- ipv6_only_sock(sk))
+ ipv6_only_sock(sk) ||
+ udp_test_bit(STOP_RCV, sk))
return -1;
if (sk->sk_rcv_saddr != daddr)
@@ -494,7 +495,7 @@ static struct sock *udp4_lib_lookup2(const struct net *net,
result = inet_lookup_reuseport(net, sk, skb, sizeof(struct udphdr),
saddr, sport, daddr, hnum, udp_ehashfn);
- if (!result) {
+ if (!result || udp_test_bit(STOP_RCV, result)) {
result = sk;
continue;
}
@@ -3031,7 +3032,9 @@ int udp_lib_setsockopt(struct sock *sk, int level, int optname,
set_xfrm_gro_udp_encap_rcv(up->encap_type, sk->sk_family, sk);
sockopt_release_sock(sk);
break;
-
+ case UDP_STOP_RCV:
+ udp_assign_bit(STOP_RCV, sk, valbool);
+ break;
/*
* UDP-Lite's partial checksum coverage (RFC 3828).
*/
@@ -3120,6 +3123,10 @@ int udp_lib_getsockopt(struct sock *sk, int level, int optname,
val = udp_test_bit(GRO_ENABLED, sk);
break;
+ case UDP_STOP_RCV:
+ val = udp_test_bit(STOP_RCV, sk);
+ break;
+
/* The following two cannot be changed on UDP sockets, the return is
* always 0 (which corresponds to the full checksum coverage of UDP). */
case UDPLITE_SEND_CSCOV:
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 7317f8e053f1..55896a78e94b 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -137,7 +137,8 @@ static int compute_score(struct sock *sk, const struct net *net,
if (!net_eq(sock_net(sk), net) ||
udp_sk(sk)->udp_port_hash != hnum ||
- sk->sk_family != PF_INET6)
+ sk->sk_family != PF_INET6 ||
+ udp_test_bit(STOP_RCV, sk))
return -1;
if (!ipv6_addr_equal(&sk->sk_v6_rcv_saddr, daddr))
@@ -245,7 +246,7 @@ static struct sock *udp6_lib_lookup2(const struct net *net,
result = inet6_lookup_reuseport(net, sk, skb, sizeof(struct udphdr),
saddr, sport, daddr, hnum, udp6_ehashfn);
- if (!result) {
+ if (!result || udp_test_bit(STOP_RCV, result)) {
result = sk;
continue;
}
--
2.47.1
This series improves the following tests.
1. Get-reg-list : Adds vector support
2. SBI PMU test : Distinguish between different types of illegal exception
The first patch is just a helper patch that adds stval support during
exception handling.
Signed-off-by: Atish Patra <atishp(a)rivosinc.com>
---
Changes in v3:
- Dropped the redundant macros and rv32 specific csr details.
- Changed to vcpu_get_reg from __vcpu_get_reg based on suggestion from Drew.
- Added RB tags from Drew.
- Link to v2: https://lore.kernel.org/r/20250429-kvm_selftest_improve-v2-0-51713f91e04a@r…
Changes in v2:
- Rebased on top of Linux 6.15-rc4
- Changed from ex_regs to pt_regs based on Drew's suggestion.
- Dropped Anup's review on PATCH1 as it is significantly changed from last review.
- Moved the instruction decoding macros to a common header file.
- Improved the vector reg list test as per the feedback.
- Link to v1: https://lore.kernel.org/r/20250324-kvm_selftest_improve-v1-0-583620219d4f@r…
---
Atish Patra (3):
KVM: riscv: selftests: Align the trap information wiht pt_regs
KVM: riscv: selftests: Decode stval to identify exact exception type
KVM: riscv: selftests: Add vector extension tests
.../selftests/kvm/include/riscv/processor.h | 23 +++-
tools/testing/selftests/kvm/lib/riscv/handlers.S | 139 +++++++++++----------
tools/testing/selftests/kvm/lib/riscv/processor.c | 2 +-
tools/testing/selftests/kvm/riscv/arch_timer.c | 2 +-
tools/testing/selftests/kvm/riscv/ebreak_test.c | 2 +-
tools/testing/selftests/kvm/riscv/get-reg-list.c | 132 +++++++++++++++++++
tools/testing/selftests/kvm/riscv/sbi_pmu_test.c | 24 +++-
7 files changed, 247 insertions(+), 77 deletions(-)
---
base-commit: f15d97df5afae16f40ecef942031235d1c6ba14f
change-id: 20250324-kvm_selftest_improve-9bedb9f0a6d3
--
Regards,
Atish patra
Fix the spelling error from "multible" to "multiple".
Signed-off-by: Ankit Chauhan <ankitchauhan2065(a)gmail.com>
---
tools/testing/selftests/ptrace/peeksiginfo.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/ptrace/peeksiginfo.c b/tools/testing/selftests/ptrace/peeksiginfo.c
index a6884f66dc01..2f345d11e4b8 100644
--- a/tools/testing/selftests/ptrace/peeksiginfo.c
+++ b/tools/testing/selftests/ptrace/peeksiginfo.c
@@ -199,7 +199,7 @@ int main(int argc, char *argv[])
/*
* Dump signal from the process-wide queue.
- * The number of signals is not multible to the buffer size
+ * The number of signals is not multiple to the buffer size
*/
if (check_direct_path(child, 1, 3))
goto out;
--
2.34.1
A kunit kernel build can fail if there are any build artifacts from a
prior kernel build. These failures can be hard to debug if the build
artifact happens to be a generated header file. It took me a while to
debug a kunit build failure on ARCH=x86_64 in a tree which had a
generated header file, arch/x86/realmode/rm/pasyms.h.
make ARCH=um mrproper will not clean the tree. It is necessary to run
make ARCH=x86_64 mrproper
Example work-flow that could lead to this:
make allmodconfig (x86_64)
make
./tools/testing/kunit/kunit.py run
Add this to the documentation and kunit.py build help message.
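As an aside, a small sketch of how one might check a tree for leftovers before running kunit.py. This is not part of the series or of kunit.py; find_stale_artifacts() and mrproper_hint() are hypothetical helpers, and the marker list is an illustrative assumption (the top-level config and generated directories, plus the header mentioned above), not an exhaustive one.

```python
import os

# Illustrative, non-exhaustive markers of a previous in-tree build (assumption).
STALE_MARKERS = (
    ".config",
    "include/config",
    "include/generated",
    "arch/x86/include/generated",
    "arch/x86/realmode/rm/pasyms.h",
)


def find_stale_artifacts(srctree, markers=STALE_MARKERS):
    """Return the markers that exist under srctree."""
    return [m for m in markers
            if os.path.exists(os.path.join(srctree, m))]


def mrproper_hint(srctree, arch="x86_64"):
    """Suggest a cleanup command if the tree looks dirty, else None.

    The arch must match the one used for the earlier build; as noted
    above, ARCH=um mrproper does not remove x86_64 artifacts.
    """
    stale = find_stale_artifacts(srctree)
    if not stale:
        return None
    return "make ARCH=%s mrproper  # found: %s" % (arch, ", ".join(stale))
```

Calling mrproper_hint(".") from the top of the source tree returns None for a tree with none of these markers present.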
Shuah Khan (2):
doc: kunit: add information about cleaning source trees
kunit: add tips to clean source tree to build help message
Documentation/dev-tools/kunit/start.rst | 12 ++++++++++++
tools/testing/kunit/kunit.py | 2 +-
2 files changed, 13 insertions(+), 1 deletion(-)
--
2.47.2