This patch set introduces the BPF_F_CPU and BPF_F_ALL_CPUS flags for
percpu maps, as the requirement of BPF_F_ALL_CPUS flag for percpu_array
maps was discussed in the thread of
"[PATCH bpf-next v3 0/4] bpf: Introduce global percpu data"[1].
The goal of BPF_F_ALL_CPUS flag is to reduce data caching overhead in light
skeletons by allowing a single value to be reused to update values across all
CPUs. This avoids the M:N problem where M cached values are used to update a
map on N CPUs kernel.
The BPF_F_CPU flag is accompanied by *flags*-embedded cpu info, which
specifies the target CPU for the operation:
* For lookup operations: the flag field alongside cpu info enable querying
a value on the specified CPU.
* For update operations: the flag field alongside cpu info enable
updating value for specified CPU.
Links:
[1] https://lore.kernel.org/bpf/20250526162146.24429-1-leon.hwang@linux.dev/
Changes:
v11 -> v12:
* Dropped the v11 changes.
* Stabilized the lru_percpu_hash map test by keeping an extra spare entry,
which can be used temporarily during updates to avoid unintended LRU
evictions.
v10 -> v11:
* Support the combination of BPF_EXIST and BPF_F_CPU/BPF_F_ALL_CPUS for
update operations.
* Fix unstable lru_percpu_hash map test using the combination of
BPF_EXIST and BPF_F_CPU/BPF_F_ALL_CPUS to avoid LRU eviction
(reported by Alexei).
v9 -> v10:
* Add tests to verify array and hash maps do not support BPF_F_CPU and
BPF_F_ALL_CPUS flags.
* Address comment from Andrii:
* Copy map value using copy_map_value_long for percpu_cgroup_storage
maps in a separate patch.
v8 -> v9:
* Change value type from u64 to u32 in selftests.
* Address comments from Andrii:
* Keep value_size unaligned and update everywhere for consistency when
cpu flags are specified.
* Update value by getting pointer for percpu hash and percpu
cgroup_storage maps.
v7 -> v8:
* Address comments from Andrii:
* Check BPF_F_LOCK when update percpu_array, percpu_hash and
lru_percpu_hash maps.
* Refactor flags check in __htab_map_lookup_and_delete_batch().
* Keep value_size unaligned and copy value using copy_map_value() in
__htab_map_lookup_and_delete_batch() when BPF_F_CPU is specified.
* Update warn message in libbpf's validate_map_op().
* Update comment of libbpf's bpf_map__lookup_elem().
v6 -> v7:
* Get correct value size for percpu_hash and lru_percpu_hash in
update_batch API.
* Set 'count' as 'max_entries' in test cases for lookup_batch API.
* Address comment from Alexei:
* Move cpu flags check into bpf_map_check_op_flags().
v5 -> v6:
* Move bpf_map_check_op_flags() from 'bpf.h' to 'syscall.c'.
* Address comments from Alexei:
* Drop the refactoring code of data copying logic for percpu maps.
* Drop bpf_map_check_op_flags() wrappers.
v4 -> v5:
* Address comments from Andrii:
* Refactor data copying logic for all percpu maps.
* Drop this_cpu_ptr() micro-optimization.
* Drop cpu check in libbpf's validate_map_op().
* Enhance bpf_map_check_op_flags() using *allowed flags* instead of
'extra_flags_mask'.
v3 -> v4:
* Address comments from Andrii:
* Remove unnecessary map_type check in bpf_map_value_size().
* Reduce code churn.
* Remove unnecessary do_delete check in
__htab_map_lookup_and_delete_batch().
* Introduce bpf_percpu_copy_to_user() and bpf_percpu_copy_from_user().
* Rename check_map_flags() to bpf_map_check_op_flags() with
extra_flags_mask.
* Add human-readable pr_warn() explanations in validate_map_op().
* Use flags in bpf_map__delete_elem() and
bpf_map__lookup_and_delete_elem().
* Drop "for alignment reasons".
v3 link: https://lore.kernel.org/bpf/20250821160817.70285-1-leon.hwang@linux.dev/
v2 -> v3:
* Address comments from Alexei:
* Use BPF_F_ALL_CPUS instead of BPF_ALL_CPUS magic.
* Introduce these two cpu flags for all percpu maps.
* Address comments from Jiri:
* Reduce some unnecessary u32 cast.
* Refactor more generic map flags check function.
* A code style issue.
v2 link: https://lore.kernel.org/bpf/20250805163017.17015-1-leon.hwang@linux.dev/
v1 -> v2:
* Address comments from Andrii:
* Embed cpu info as high 32 bits of *flags* totally.
* Use ERANGE instead of E2BIG.
* Few format issues.
Leon Hwang (7):
bpf: Introduce BPF_F_CPU and BPF_F_ALL_CPUS flags
bpf: Add BPF_F_CPU and BPF_F_ALL_CPUS flags support for percpu_array
maps
bpf: Add BPF_F_CPU and BPF_F_ALL_CPUS flags support for percpu_hash
and lru_percpu_hash maps
bpf: Copy map value using copy_map_value_long for
percpu_cgroup_storage maps
bpf: Add BPF_F_CPU and BPF_F_ALL_CPUS flags support for
percpu_cgroup_storage maps
libbpf: Add BPF_F_CPU and BPF_F_ALL_CPUS flags support for percpu maps
selftests/bpf: Add cases to test BPF_F_CPU and BPF_F_ALL_CPUS flags
include/linux/bpf-cgroup.h | 4 +-
include/linux/bpf.h | 35 +-
include/uapi/linux/bpf.h | 2 +
kernel/bpf/arraymap.c | 29 +-
kernel/bpf/hashtab.c | 94 +++--
kernel/bpf/local_storage.c | 27 +-
kernel/bpf/syscall.c | 37 +-
tools/include/uapi/linux/bpf.h | 2 +
tools/lib/bpf/bpf.h | 8 +
tools/lib/bpf/libbpf.c | 26 +-
tools/lib/bpf/libbpf.h | 21 +-
.../selftests/bpf/prog_tests/percpu_alloc.c | 328 ++++++++++++++++++
.../selftests/bpf/progs/percpu_alloc_array.c | 32 ++
13 files changed, 560 insertions(+), 85 deletions(-)
--
2.51.2
This patch series builds upon the discussion in
"[PATCH bpf-next v4 0/4] bpf: Improve error reporting for freplace attachment failure" [1].
This patch series introduces support for *common attributes* in the BPF
syscall, providing a unified mechanism for passing shared metadata across
all BPF commands.
The initial set of common attributes includes:
1. 'log_buf': User-provided buffer for storing log output.
2. 'log_size': Size of the provided log buffer.
3. 'log_level': Verbosity level for logging.
4. 'log_true_size': The size of log reported by kernel.
With this extension, the BPF syscall will be able to return meaningful
error messages (e.g., failures of creating map), improving debuggability
and user experience.
Changes:
RFC v3 -> v4:
* Drop RFC.
* Address comments from Andrii:
* Add parentheses in 'sys_bpf_ext()'.
* Avoid creating new fd in 'probe_sys_bpf_ext()'.
* Add a new struct to wrap log fields in libbpf.
* Address comments from Alexei:
* Do not skip writing to user space when log_true_size is zero.
* Do not use 'bool' arguments.
* Drop the adding WARN_ON_ONCE()'s.
RFC v2 -> RFC v3:
* Rename probe_sys_bpf_extended to probe_sys_bpf_ext.
* Refactor reporting 'log_true_size' for prog_load.
* Refactor reporting 'btf_log_true_size' for btf_load.
* Add warnings for internal bugs in map_create.
* Check log_true_size in test cases.
* Address comment from Alexei:
* Change kvzalloc/kvfree to kzalloc/kfree.
* Address comments from Andrii:
* Move BPF_COMMON_ATTRS to 'enum bpf_cmd' alongside brief comment.
* Add bpf_check_uarg_tail_zero() for extra checks.
* Rename sys_bpf_extended to sys_bpf_ext.
* Rename sys_bpf_fd_extended to sys_bpf_ext_fd.
* Probe the new feature using NULL and -EFAULT.
* Move probe_sys_bpf_ext to libbpf_internal.h and drop LIBBPF_API.
* Return -EUSERS when log attrs are conflict between bpf_attr and
bpf_common_attr.
* Avoid touching bpf_vlog_init().
* Update the reason messages in map_create.
* Finalize the log using __cleanup().
* Report log size to users.
* Change type of log_buf from '__u64' to 'const char *' and cast type
using ptr_to_u64() in bpf_map_create().
* Do not return -EOPNOTSUPP when kernel doesn't support this feature
in bpf_map_create().
* Add log_level support for map creation for consistency.
* Address comment from Eduard:
* Use common_attrs->log_level instead of BPF_LOG_FIXED.
RFC v1 -> RFC v2:
* Fix build error reported by test bot.
* Address comments from Alexei:
* Drop new uapi for freplace.
* Add common attributes support for prog_load and btf_load.
* Add common attributes support for map_create.
Links:
[1] https://lore.kernel.org/bpf/20250224153352.64689-1-leon.hwang@linux.dev/
Leon Hwang (9):
bpf: Extend bpf syscall with common attributes support
libbpf: Add support for extended bpf syscall
bpf: Refactor reporting log_true_size for prog_load
bpf: Add common attr support for prog_load
bpf: Refactor reporting btf_log_true_size for btf_load
bpf: Add common attr support for btf_load
bpf: Add common attr support for map_create
libbpf: Add common attr support for map_create
selftests/bpf: Add tests to verify map create failure log
include/linux/bpf.h | 2 +-
include/linux/btf.h | 2 +-
include/linux/syscalls.h | 3 +-
include/uapi/linux/bpf.h | 8 +
kernel/bpf/btf.c | 25 +-
kernel/bpf/syscall.c | 223 ++++++++++++++++--
kernel/bpf/verifier.c | 12 +-
tools/include/uapi/linux/bpf.h | 8 +
tools/lib/bpf/bpf.c | 49 +++-
tools/lib/bpf/bpf.h | 17 +-
tools/lib/bpf/features.c | 8 +
tools/lib/bpf/libbpf_internal.h | 3 +
.../selftests/bpf/prog_tests/map_init.c | 143 +++++++++++
13 files changed, 448 insertions(+), 55 deletions(-)
--
2.52.0
This small patchset is about avoid user-memory-access vulnerability
for LIVE_FRAMES at specific xdp_md context.
---
KaFai Wan (2):
bpf, test_run: Fix user-memory-access vulnerability for LIVE_FRAMES
selftests/bpf: Add test for xdp_md context with LIVE_FRAMES in
BPF_PROG_TEST_RUN
net/bpf/test_run.c | 23 +++++++++----------
.../bpf/prog_tests/xdp_context_test_run.c | 19 +++++++++++++++
.../bpf/prog_tests/xdp_do_redirect.c | 6 ++---
.../bpf/progs/test_xdp_context_test_run.c | 6 +++++
4 files changed, 39 insertions(+), 15 deletions(-)
--
2.43.0
The upcoming RISC-V Ssdtso specification introduces a bit in the senvcfg
CSR to switch the memory consistency model of user mode at run-time from
RVWMO to TSO. The active consistency model can therefore be switched on a
per-hart base and managed by the kernel on a per-process base.
This patchset implements basic Ssdtso support and adds a prctl API on top
so that user-space processes can switch to a stronger memory consistency
model (than the kernel was written for) at run-time.
The patchset also comes with a short documentation of the prctl API.
This series is based on the third draft of the Ssdtso specification
which can be found here:
https://github.com/riscv/riscv-ssdtso/releases/tag/v1.0-draft3
Note, that the Ssdtso specification is in development state
(i.e., not frozen or even ratified) which is also the reason
why this series is marked as RFC.
This series saw the following changes since v1:
* Reordered/restructured patches
* Fixed build issues
* Addressed typos
* Removed ability to switch TSO->WMO
* Moved the state from per-thread to per-process
* Reschedule all CPUs after switching
* Some cleanups in the documentation
* Adding compatibility with Ztso (spec change in draft 3)
This patchset can also be found in this GitHub branch:
https://github.com/cmuellner/linux/tree/ssdtso-v2
A QEMU implementation of DTSO can be found in this GitHub branch:
https://github.com/cmuellner/qemu/tree/ssdtso-v2
Christoph Müllner (6):
mm: Add dynamic memory consistency model switching
uapi: prctl: Add new prctl call to set/get the memory consistency
model
RISC-V: Enable dynamic memory consistency model support with Ssdtso
RISC-V: Implement prctl call to set/get the memory consistency model
RISC-V: Expose Ssdtso via hwprobe API
RISC-V: selftests: Add DTSO tests
Documentation/arch/riscv/hwprobe.rst | 3 +
.../mm/dynamic-memory-consistency-model.rst | 86 ++++++++++++++++
Documentation/mm/index.rst | 1 +
arch/Kconfig | 14 +++
arch/riscv/Kconfig | 11 +++
arch/riscv/include/asm/csr.h | 1 +
arch/riscv/include/asm/dtso.h | 97 +++++++++++++++++++
arch/riscv/include/asm/hwcap.h | 1 +
arch/riscv/include/asm/processor.h | 7 ++
arch/riscv/include/asm/switch_to.h | 3 +
arch/riscv/include/uapi/asm/hwprobe.h | 1 +
arch/riscv/kernel/Makefile | 1 +
arch/riscv/kernel/asm-offsets.c | 3 +
arch/riscv/kernel/cpufeature.c | 1 +
arch/riscv/kernel/dtso.c | 67 +++++++++++++
arch/riscv/kernel/sys_hwprobe.c | 2 +
include/linux/sched.h | 5 +
include/uapi/linux/prctl.h | 5 +
kernel/sys.c | 12 +++
tools/testing/selftests/riscv/Makefile | 2 +-
tools/testing/selftests/riscv/dtso/.gitignore | 1 +
tools/testing/selftests/riscv/dtso/Makefile | 11 +++
tools/testing/selftests/riscv/dtso/dtso.c | 82 ++++++++++++++++
23 files changed, 416 insertions(+), 1 deletion(-)
create mode 100644 Documentation/mm/dynamic-memory-consistency-model.rst
create mode 100644 arch/riscv/include/asm/dtso.h
create mode 100644 arch/riscv/kernel/dtso.c
create mode 100644 tools/testing/selftests/riscv/dtso/.gitignore
create mode 100644 tools/testing/selftests/riscv/dtso/Makefile
create mode 100644 tools/testing/selftests/riscv/dtso/dtso.c
--
2.43.0
The primary goal is to add test validation for GSO when operating over
UDP tunnels, a scenario which is not currently covered.
The design strategy is to extend the existing tun/tap testing infrastructure
to support this new use-case, rather than introducing a new or parallel framework.
This allows for better integration and re-use of existing test logic.
---
v2 -> v3:
- Re-send the patch series becasue Patchwork don't update them
v2: https://lore.kernel.org/all/cover.1767074545.git.xudu@redhat.com/
- Addresse sporadic failures due to too early send.
- Refactor environment address assign helper function.
- Fix incorrect argument passing in build packet functions.
v1: https://lore.kernel.org/netdev/cover.1763345426.git.xudu@redhat.com/
Xu Du (8):
selftest: tun: Format tun.c existing code
selftest: tun: Introduce tuntap_helpers.h header for TUN/TAP testing
selftest: tun: Refactor tun_delete to use tuntap_helpers
selftest: tap: Refactor tap test to use tuntap_helpers
selftest: tun: Add helpers for GSO over UDP tunnel
selftest: tun: Add test for sending gso packet into tun
selftest: tun: Add test for receiving gso packet from tun
selftest: tun: Add test data for success and failure paths
tools/testing/selftests/net/tap.c | 287 +-----
tools/testing/selftests/net/tun.c | 917 ++++++++++++++++++-
tools/testing/selftests/net/tuntap_helpers.h | 608 ++++++++++++
3 files changed, 1530 insertions(+), 282 deletions(-)
create mode 100644 tools/testing/selftests/net/tuntap_helpers.h
base-commit: 7b8e9264f55a9c320f398e337d215e68cca50131
--
2.49.0