Dear all,
This patchset is just a respin of my latest PR to net-next, including all
modifications requested by Jakub and Sabrina.
However, this time I am also adding patches targeting selftests/net/ovpn, as
they come in handy for testing the new features (originally I wanted
them to be a separate PR, but that doesn't make a lot of sense).
That said, since these kselftest patches are quite invasive, I didn't
feel confident sending them in a PR right away and would rather get
some feedback from Sabrina and Shuah first, if possible.
So here we go.
Once I get some approval on this batch, I'll then send them all
to net-next again as PRv2.
Thanks a lot!
Regards,
Antonio Quartulli (1):
  selftests: ovpn: allow compiling ovpn-cli.c with mbedtls3

Qingfang Deng (1):
  ovpn: pktid: use bitops.h API

Ralf Lici (10):
  selftests: ovpn: add notification parsing and matching
  ovpn: notify userspace on client float event
  ovpn: add support for asymmetric peer IDs
  selftests: ovpn: check asymmetric peer-id
  selftests: ovpn: add test for the FW mark feature
  ovpn: consolidate crypto allocations in one chunk
  ovpn: use bound device in UDP when available
  selftests: ovpn: add test for bound device
  ovpn: use bound address in UDP when available
  selftests: ovpn: add test for bound address

Sabrina Dubroca (1):
  ovpn: use correct array size to parse nested attributes in
    ovpn_nl_key_swap_doit

Documentation/netlink/specs/ovpn.yaml | 23 +-
drivers/net/ovpn/crypto_aead.c | 162 +++++++---
drivers/net/ovpn/io.c | 8 +-
drivers/net/ovpn/netlink-gen.c | 13 +-
drivers/net/ovpn/netlink-gen.h | 6 +-
drivers/net/ovpn/netlink.c | 98 +++++-
drivers/net/ovpn/netlink.h | 2 +
drivers/net/ovpn/peer.c | 6 +
drivers/net/ovpn/peer.h | 4 +-
drivers/net/ovpn/pktid.c | 11 +-
drivers/net/ovpn/pktid.h | 2 +-
drivers/net/ovpn/skb.h | 13 +-
drivers/net/ovpn/udp.c | 10 +-
include/uapi/linux/ovpn.h | 2 +
tools/testing/selftests/net/ovpn/Makefile | 17 +-
.../selftests/net/ovpn/check_requirements.py | 37 +++
tools/testing/selftests/net/ovpn/common.sh | 60 +++-
tools/testing/selftests/net/ovpn/data64.key | 6 +-
.../selftests/net/ovpn/json/peer0-float.json | 9 +
.../selftests/net/ovpn/json/peer0.json | 6 +
.../selftests/net/ovpn/json/peer1-float.json | 1 +
.../selftests/net/ovpn/json/peer1.json | 1 +
.../selftests/net/ovpn/json/peer2-float.json | 1 +
.../selftests/net/ovpn/json/peer2.json | 1 +
.../selftests/net/ovpn/json/peer3-float.json | 1 +
.../selftests/net/ovpn/json/peer3.json | 1 +
.../selftests/net/ovpn/json/peer4-float.json | 1 +
.../selftests/net/ovpn/json/peer4.json | 1 +
.../selftests/net/ovpn/json/peer5-float.json | 1 +
.../selftests/net/ovpn/json/peer5.json | 1 +
.../selftests/net/ovpn/json/peer6-float.json | 1 +
.../selftests/net/ovpn/json/peer6.json | 1 +
tools/testing/selftests/net/ovpn/ovpn-cli.c | 281 +++++++++++-------
.../selftests/net/ovpn/requirements.txt | 1 +
.../testing/selftests/net/ovpn/tcp_peers.txt | 11 +-
.../selftests/net/ovpn/test-bind-addr.sh | 10 +
tools/testing/selftests/net/ovpn/test-bind.sh | 117 ++++++++
.../selftests/net/ovpn/test-close-socket.sh | 2 +-
tools/testing/selftests/net/ovpn/test-mark.sh | 81 +++++
tools/testing/selftests/net/ovpn/test.sh | 57 +++-
.../testing/selftests/net/ovpn/udp_peers.txt | 12 +-
41 files changed, 855 insertions(+), 224 deletions(-)
create mode 100755 tools/testing/selftests/net/ovpn/check_requirements.py
create mode 100644 tools/testing/selftests/net/ovpn/json/peer0-float.json
create mode 100644 tools/testing/selftests/net/ovpn/json/peer0.json
create mode 120000 tools/testing/selftests/net/ovpn/json/peer1-float.json
create mode 100644 tools/testing/selftests/net/ovpn/json/peer1.json
create mode 120000 tools/testing/selftests/net/ovpn/json/peer2-float.json
create mode 100644 tools/testing/selftests/net/ovpn/json/peer2.json
create mode 120000 tools/testing/selftests/net/ovpn/json/peer3-float.json
create mode 100644 tools/testing/selftests/net/ovpn/json/peer3.json
create mode 120000 tools/testing/selftests/net/ovpn/json/peer4-float.json
create mode 100644 tools/testing/selftests/net/ovpn/json/peer4.json
create mode 120000 tools/testing/selftests/net/ovpn/json/peer5-float.json
create mode 100644 tools/testing/selftests/net/ovpn/json/peer5.json
create mode 120000 tools/testing/selftests/net/ovpn/json/peer6-float.json
create mode 100644 tools/testing/selftests/net/ovpn/json/peer6.json
create mode 120000 tools/testing/selftests/net/ovpn/requirements.txt
create mode 100755 tools/testing/selftests/net/ovpn/test-bind-addr.sh
create mode 100755 tools/testing/selftests/net/ovpn/test-bind.sh
create mode 100755 tools/testing/selftests/net/ovpn/test-mark.sh
--
2.51.2
The unix_connreset.c test included <stdlib.h>, but no symbol from that
header is used. This causes a fatal build error under certain
linux-next configurations where stdlib.h is not available.
Remove the unused include to fix the build.
Reported-by: kernel test robot <lkp(a)intel.com>
Closes: https://lore.kernel.org/r/202511221800.hcgCKvVa-lkp@intel.com/
Signed-off-by: Sunday Adelodun <adelodunolaoluwa(a)yahoo.com>
---
tools/testing/selftests/net/af_unix/unix_connreset.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/tools/testing/selftests/net/af_unix/unix_connreset.c b/tools/testing/selftests/net/af_unix/unix_connreset.c
index bffef2b54bfd..9844e829aed5 100644
--- a/tools/testing/selftests/net/af_unix/unix_connreset.c
+++ b/tools/testing/selftests/net/af_unix/unix_connreset.c
@@ -14,7 +14,6 @@
*/
#define _GNU_SOURCE
-#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
--
2.43.0
Hi all,
This patch series introduces improvements to the cgroup selftests by adding helper functions to better handle
asynchronous updates in cgroup statistics. These changes are especially useful for managing cgroup stats like
memory.stat and cgroup.stat, which can be affected by delays (e.g., RCU callbacks and asynchronous rstat flushing).
v4:
- Patch 1/3: Adds the `cg_read_key_long_poll()` helper to poll cgroup keys with retries and configurable intervals.
- Patch 2/3: Updates `test_memcg_sock()` to use `cg_read_key_long_poll()` for handling delayed "sock" counter updates in memory.stat.
- Patch 3/3: Replaces `sleep` and retry logic in `test_kmem_dead_cgroups()` with `cg_read_key_long_poll()` for waiting on `nr_dying_descendants`.
v3:
- Move `MEMCG_SOCKSTAT_WAIT_*` defines after the `#include` block as suggested.
v2:
- Clarify the rationale for the 3s timeout and mention the periodic rstat flush interval (FLUSH_TIME = 2*HZ) in the comment.
- Replace hardcoded retry count and wait interval with macros to avoid magic numbers and make the timeout calculation explicit.
Thanks to Michal Koutný for the suggestion to introduce the polling helper, and to Lance Yang for the review.

Guopeng Zhang (3):
  selftests: cgroup: Add cg_read_key_long_poll() to poll a cgroup key
    with retries
  selftests: cgroup: make test_memcg_sock robust against delayed sock
    stats
  selftests: cgroup: Replace sleep with cg_read_key_long_poll() for
    waiting on nr_dying_descendants

.../selftests/cgroup/lib/cgroup_util.c | 21 +++++++++++++
.../cgroup/lib/include/cgroup_util.h | 5 +++
tools/testing/selftests/cgroup/test_kmem.c | 31 ++++++++-----------
.../selftests/cgroup/test_memcontrol.c | 20 +++++++++++-
4 files changed, 58 insertions(+), 19 deletions(-)
--
2.25.1
From: Hui Zhu <zhuhui(a)kylinos.cn>
This series proposes adding eBPF support to the Linux memory
controller, enabling dynamic and extensible memory management
policies at runtime.
Background
The memory controller (memcg) currently provides fixed memory
accounting and reclamation policies through static kernel code.
This limits flexibility for specialized workloads and use cases
that require custom memory management strategies.
By enabling eBPF programs to hook into key memory control
operations, administrators can implement custom policies without
recompiling the kernel, while maintaining the safety guarantees
provided by the BPF verifier.
Use Cases
1. Custom memory reclamation strategies for specialized workloads
2. Dynamic memory pressure monitoring and telemetry
3. Memory accounting adjustments based on runtime conditions
4. Integration with container orchestration systems for
intelligent resource management
5. Research and experimentation with novel memory management
algorithms
Design Overview
This series introduces:
1. A new BPF struct ops type (`memcg_ops`) that allows eBPF
programs to implement custom behavior for memory charging
operations.
2. A hook point in the `try_charge_memcg()` fast path that
invokes registered eBPF programs to determine if custom
memory management should be applied.
3. The eBPF handler can inspect memory cgroup context and
optionally modify certain parameters (e.g., `nr_pages` for
reclamation size).
4. A reference counting mechanism using `percpu_ref` to safely
manage the lifecycle of registered eBPF struct ops instances.
5. Configuration via `CONFIG_MEMCG_BPF` to allow disabling this
feature at build time.
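
To make the design points above more concrete, here is a simplified userspace model of a single-handler hook. It is only a sketch under stated assumptions: `struct charge_ops`, `ops_register()`, `ops_unregister()` and `charge()` are hypothetical names, the real `memcg_ops` layout is not reproduced here, and C11 atomics stand in for the kernel mechanisms (static branch, RCU, `percpu_ref`), so handler lifetime protection is not modeled.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ops table; not the layout used by the series. */
struct charge_ops {
	/* Handler may inspect and adjust the number of pages. */
	void (*try_charge)(unsigned int *nr_pages);
};

/* At most one handler can be active at a time. */
static _Atomic(const struct charge_ops *) active_ops;

/* Register ops; fails if another handler is already registered. */
static bool ops_register(const struct charge_ops *ops)
{
	const struct charge_ops *expected = NULL;

	return atomic_compare_exchange_strong(&active_ops, &expected, ops);
}

/* Unregister ops; fails if ops is not the active handler. */
static bool ops_unregister(const struct charge_ops *ops)
{
	const struct charge_ops *expected = ops;

	return atomic_compare_exchange_strong(&active_ops, &expected, NULL);
}

/* Model of the charge path: call the handler only if one is registered. */
static unsigned int charge(unsigned int nr_pages)
{
	const struct charge_ops *ops = atomic_load(&active_ops);

	if (ops && ops->try_charge)
		ops->try_charge(&nr_pages);

	return nr_pages;
}

/* Example handler: doubles the requested page count. */
static void double_pages(unsigned int *nr_pages)
{
	*nr_pages *= 2;
}

static const struct charge_ops double_ops = { .try_charge = double_pages };
```

In this model `charge(8)` returns 8 while nothing is registered and 16 once `double_ops` is registered, and a second `ops_register()` call fails, mirroring the single-handler constraint mentioned in the open questions.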
Implementation Details
- Uses BPF struct ops for a cleaner integration model
- Leverages static branch keys for minimal overhead when feature
is unused
- RCU synchronization ensures safe replacement of handlers
- Sample eBPF program demonstrates monitoring capabilities
- Comprehensive selftest suite validates core functionality
Performance Considerations
- Zero overhead when feature is disabled or no eBPF program is
loaded (static branch is disabled)
- Minimal overhead when enabled: one indirect function call per
charge attempt
- eBPF programs run under the restrictions of the BPF verifier
Patch Overview
PATCH 1/3: Core kernel implementation
- Adds eBPF struct ops support to memcg
- Introduces CONFIG_MEMCG_BPF option
- Implements safe registration/unregistration mechanism
PATCH 2/3: Selftest suite
- prog_tests/memcg_ops.c: Test entry points
- progs/memcg_ops.bpf.c: Test eBPF program
- Validates load, attach, and single-handler constraints
PATCH 3/3: Sample userspace program
- samples/bpf/memcg_printk.bpf.c: Monitoring eBPF program
- samples/bpf/memcg_printk.c: Userspace loader
- Demonstrates real-world usage and debugging capabilities
Open Questions & Discussion Points
1. Should the eBPF handler have access to additional memory
cgroup state? Current design exposes minimal context to
reduce attack surface.
2. Are there other memory control operations that would benefit
from eBPF extensibility (e.g., uncharge, reclaim)?
3. Should there be permission checks or restrictions on who can
load memcg eBPF programs? Currently inherits BPF's
CAP_PERFMON/CAP_SYS_ADMIN requirements.
4. How should we handle multiple eBPF programs trying to
register? Current implementation allows only one active
handler.
5. Is the current exposed context in `try_charge_memcg` struct
sufficient, or should additional fields be added?
Testing
The selftests provide comprehensive coverage of the core
functionality. The sample program can be used for manual
testing and as a reference for implementing additional
monitoring tools.

Hui Zhu (3):
  memcg: add eBPF struct ops support for memory charging
  selftests/bpf: add memcg eBPF struct ops test
  samples/bpf: add example memcg eBPF program

MAINTAINERS | 5 +
init/Kconfig | 38 ++++
mm/Makefile | 1 +
mm/memcontrol.c | 26 ++-
mm/memcontrol_bpf.c | 200 ++++++++++++++++++
mm/memcontrol_bpf.h | 103 +++++++++
samples/bpf/Makefile | 2 +
samples/bpf/memcg_printk.bpf.c | 30 +++
samples/bpf/memcg_printk.c | 82 +++++++
.../selftests/bpf/prog_tests/memcg_ops.c | 117 ++++++++++
tools/testing/selftests/bpf/progs/memcg_ops.c | 20 ++
11 files changed, 617 insertions(+), 7 deletions(-)
create mode 100644 mm/memcontrol_bpf.c
create mode 100644 mm/memcontrol_bpf.h
create mode 100644 samples/bpf/memcg_printk.bpf.c
create mode 100644 samples/bpf/memcg_printk.c
create mode 100644 tools/testing/selftests/bpf/prog_tests/memcg_ops.c
create mode 100644 tools/testing/selftests/bpf/progs/memcg_ops.c
--
2.43.0
In cgroup v2, a mutual overlap check is required when at least one of two
cpusets is exclusive. However, this check should be relaxed and limited to
cases where both cpusets are exclusive.
This patch ensures that for sibling cpusets A1 (exclusive) and B1
(non-exclusive), changes to B1 cannot affect A1's exclusivity.
For example, assume a machine has 4 CPUs (0-3).
   root cgroup
     /    \
    A1    B1
Case 1:
Table 1.1: Before applying the patch
Step                                       | A1's prstate | B1's prstate |
#1> echo "0-1" > A1/cpuset.cpus            | member       | member       |
#2> echo "root" > A1/cpuset.cpus.partition | root         | member       |
#3> echo "0" > B1/cpuset.cpus              | root invalid | member       |
After step #3, A1 changes from "root" to "root invalid" because its CPUs
(0-1) overlap with those requested by B1 (0-3). However, B1 can actually
use CPUs 2-3 (from B1's parent), so it would be more reasonable for A1 to
remain as "root."
Table 1.2: After applying the patch
Step                                       | A1's prstate | B1's prstate |
#1> echo "0-1" > A1/cpuset.cpus            | member       | member       |
#2> echo "root" > A1/cpuset.cpus.partition | root         | member       |
#3> echo "0" > B1/cpuset.cpus              | root         | member       |
Case 2: (This situation remains unchanged from before)
Table 2.1: Before applying the patch
Step                                       | A1's prstate | B1's prstate |
#1> echo "0-1" > A1/cpuset.cpus            | member       | member       |
#3> echo "1-2" > B1/cpuset.cpus            | member       | member       |
#2> echo "root" > A1/cpuset.cpus.partition | root invalid | member       |
Table 2.2: After applying the patch
Step                                       | A1's prstate | B1's prstate |
#1> echo "0-1" > A1/cpuset.cpus            | member       | member       |
#3> echo "1-2" > B1/cpuset.cpus            | member       | member       |
#2> echo "root" > A1/cpuset.cpus.partition | root invalid | member       |
All other cases remain unaffected, for example cgroup v1, or cases where
A1 and B1 are both exclusive or both non-exclusive.
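
The rule change for Case 1 can be sketched as a small userspace model. This is a simplification and not the kernel code: `struct cs_model`, `cpu_range()`, `conflict_before()` and `conflict_after()` are hypothetical names, CPUs are represented as a 64-bit mask, and a partition root is simply treated as "exclusive". Case 2, where A1 becomes a partition root after B1 has already requested overlapping CPUs, is unchanged by the patch and is not modeled here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of a cpuset. */
struct cs_model {
	uint64_t cpus;		/* bit n set means CPU n is requested */
	bool exclusive;		/* e.g. the cpuset is a partition root */
};

/* Build a mask covering CPUs lo..hi inclusive (0 <= lo <= hi <= 63). */
static uint64_t cpu_range(unsigned int lo, unsigned int hi)
{
	return (~0ULL >> (63 - hi)) & (~0ULL << lo);
}

static bool cpus_overlap(const struct cs_model *a, const struct cs_model *b)
{
	return (a->cpus & b->cpus) != 0;
}

/* Old rule: overlap conflicts if at least one sibling is exclusive. */
static bool conflict_before(const struct cs_model *a, const struct cs_model *b)
{
	return (a->exclusive || b->exclusive) && cpus_overlap(a, b);
}

/* Relaxed rule: overlap conflicts only if both siblings are exclusive. */
static bool conflict_after(const struct cs_model *a, const struct cs_model *b)
{
	return a->exclusive && b->exclusive && cpus_overlap(a, b);
}
```

With A1 = {CPUs 0-1, exclusive} and B1 = {CPU 0, non-exclusive}, `conflict_before()` reports a conflict while `conflict_after()` does not, which corresponds to step #3 in Tables 1.1 and 1.2.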
---
v3 -> v4:
- Adjust the test_cpuset_prs.sh test file to align with the current
behavior.
v2 -> v3:
- Ensure compliance with constraints such as cpuset.cpus.exclusive.
- Link: https://lore.kernel.org/cgroups/20251113131434.606961-1-sunshaojie@kylinos.…
v1 -> v2:
- Keeps the current cgroup v1 behavior unchanged
- Link: https://lore.kernel.org/cgroups/c8e234f4-2c27-4753-8f39-8ae83197efd3@redhat…
---
kernel/cgroup/cpuset-internal.h | 3 ++
kernel/cgroup/cpuset-v1.c | 20 +++++++++
kernel/cgroup/cpuset.c | 43 ++++++++++++++-----
.../selftests/cgroup/test_cpuset_prs.sh | 5 ++-
4 files changed, 58 insertions(+), 13 deletions(-)
--
2.25.1