Hi all,
This is part of a hackathon organized by LKCAMP[1], focused on writing
tests using KUnit. We reached out a while ago asking for advice on what would
be a useful contribution[2] and ended up choosing data structures that did
not yet have tests.
This patch adds tests for the llist data structure, defined in
include/linux/llist.h, and is inspired by the KUnit tests for the doubly
linked list in lib/list-test.c[3].
[1] https://lkcamp.dev/about/
[2] https://lore.kernel.org/all/Zktnt7rjKryTh9-N@arch/
[3] https://elixir.bootlin.com/linux/latest/source/lib/list-test.c
Artur Alves (1):
lib/llist_kunit.c: add KUnit tests for llist
lib/Kconfig.debug | 11 ++
lib/Makefile | 1 +
lib/llist_kunit.c | 360 ++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 372 insertions(+)
create mode 100644 lib/llist_kunit.c
--
2.45.2
There are multiple possible timer sources which could be useful for
the sound stream synchronization: hrtimers, hardware clocks (e.g. PTP),
timer wheels (jiffies). Currently, using one of them to synchronize
the audio stream of snd-aloop module would require writing a
kernel-space driver which exports an ALSA timer through the
snd_timer interface.
However, it is not really convenient for application developers, who may
want to define their custom timer sources for audio synchronization.
For instance, we could have a network application which receives frames
and sends them to snd-aloop pcm device, and another application
listening on the other end of snd-aloop. It makes sense to transfer a
new period of data only when certain amount of frames is received
through the network, but definitely not when a certain amount of jiffies
on a local system elapses. Since all of the devices are purely virtual
it won't introduce any glitches and will help the application developers
to avoid using sample-rate conversion.
This patch series introduces userspace-driven ALSA timers: virtual
timers which are created and controlled from userspace. The timer can
be created from the userspace using the new ioctl SNDRV_TIMER_IOCTL_CREATE.
After creating a timer, it becomes available for use system-wide, so it
can be passed to snd-aloop as a timer source (timer_source parameter
would be "-1.SNDRV_TIMER_GLOBAL_UDRIVEN.{timer_id}"). When the userspace
app decides to trigger a timer, it calls another ioctl
SNDRV_TIMER_IOCTL_TRIGGER on the file descriptor of a timer. It
initiates a transfer of a new period of data.
Userspace-driven timers are associated with file descriptors. If the
application wishes to destroy the timer, it can simply release the file
descriptor of a virtual timer.
I believe introducing new ioctl calls is quite inconvenient (as we have
a limited amount of them), but other possible ways of app <-> kernel
communication (like virtual FS) seem completely inappropriate for this
task (but I'd love to discuss alternative solutions).
This patch series also updates the snd-aloop module so the global timers
can be used as a timer_source for it (it allows using userspace-driven
timers as timer source).
Ivan Orlov (4):
ALSA: aloop: Allow using global timers
Docs/sound: Add documentation for userspace-driven ALSA timers
ALSA: timer: Introduce virtual userspace-driven timers
selftests: ALSA: Cover userspace-driven timers with test
Documentation/sound/index.rst | 1 +
Documentation/sound/utimers.rst | 120 +++++++++++
include/uapi/sound/asound.h | 17 ++
sound/core/Kconfig | 11 +
sound/core/timer.c | 226 ++++++++++++++++++++
sound/drivers/aloop.c | 2 +
tools/testing/selftests/alsa/Makefile | 2 +-
tools/testing/selftests/alsa/global-timer.c | 87 ++++++++
tools/testing/selftests/alsa/utimer-test.c | 133 ++++++++++++
9 files changed, 598 insertions(+), 1 deletion(-)
create mode 100644 Documentation/sound/utimers.rst
create mode 100644 tools/testing/selftests/alsa/global-timer.c
create mode 100644 tools/testing/selftests/alsa/utimer-test.c
--
2.34.1
Hello,
this small series aims to integrate test_dev_cgroup in test_progs so it
could be run automatically in CI. The new version brings a few differences
with the current one:
- test now uses directly syscalls instead of wrapping commandline tools
into system() calls
- test_progs manipulates /dev/null (eg: redirecting test logs into it), so
disabling access to it in the bpf program confuses the tests. To fix this,
the first commit modifies the bpf program to allow access to char devices
1:3 (/dev/null), and disable access to char devices 1:5 (/dev/zero)
- once test is converted, add a small subtest to also check for device type
interpretation (char or block)
- paths used in mknod tests are now in /dev instead of /tmp: due to the CI
runner organisation and mountpoints manipulations, trying to create nodes
in /tmp leads to errors unrelated to the test (ie, mknod calls refused by
kernel, not the bpf program). I don't understand exactly the root cause
at the deepest point (all I see in CI is an -ENXIO error on mknod when trying to
create the node in tmp, and I can not make sense out of it neither
replicate it locally), so I would gladly take inputs from anyone more
educated than me about this.
The new test_progs part has been tested in a local qemu environment as well
as in upstream CI:
./test_progs -a cgroup_dev
47/1 cgroup_dev/deny-mknod:OK
47/2 cgroup_dev/allow-mknod:OK
47/3 cgroup_dev/deny-mknod-wrong-type:OK
47/4 cgroup_dev/allow-read:OK
47/5 cgroup_dev/allow-write:OK
47/6 cgroup_dev/deny-read:OK
47/7 cgroup_dev/deny-write:OK
47 cgroup_dev:OK
Summary: 1/7 PASSED, 0 SKIPPED, 0 FAILED
---
Alexis Lothoré (eBPF Foundation) (3):
selftests/bpf: do not disable /dev/null device access in cgroup dev test
selftests/bpf: convert test_dev_cgroup to test_progs
selftests/bpf: add wrong type test to cgroup dev
tools/testing/selftests/bpf/.gitignore | 1 -
tools/testing/selftests/bpf/Makefile | 2 -
.../testing/selftests/bpf/prog_tests/cgroup_dev.c | 123 +++++++++++++++++++++
tools/testing/selftests/bpf/progs/dev_cgroup.c | 4 +-
tools/testing/selftests/bpf/test_dev_cgroup.c | 85 --------------
5 files changed, 125 insertions(+), 90 deletions(-)
---
base-commit: c90e2d4a7738a24fbb3657092dbd1ca20c040ed1
change-id: 20240723-convert_dev_cgroup-6464b0d37f1a
Best regards,
--
Alexis Lothoré, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
The format specifier in fprintf is "%u", that "%u" should use
unsigned int type instead.the problem is discovered by reading code.
Signed-off-by: Zhu Jun <zhujun2(a)cmss.chinamobile.com>
---
v1->v2:
modify commit info add how to find the problem in the log
v2->v3:
Seems this can use macro WTERMSIG like those above usage, rather than
changing the print format.
v3->v4:
Now the commit summary doesn't match the change you are making.
Also WTERMSIG() is incorrect for this conditional code path.
See comments below in the code path.
I would leave the v2 code intact. How are you testing this change?
Please include the details in the change log.
v4->v5:
Compile the kernel for testing using make
tools/testing/selftests/kselftest_harness.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kselftest_harness.h b/tools/testing/selftests/kselftest_harness.h
index e05ac8261046..675b8f43e148 100644
--- a/tools/testing/selftests/kselftest_harness.h
+++ b/tools/testing/selftests/kselftest_harness.h
@@ -910,7 +910,7 @@ void __wait_for_test(struct __test_metadata *t)
.sa_flags = SA_SIGINFO,
};
struct sigaction saved_action;
- int status;
+ unsigned int status;
if (sigaction(SIGALRM, &action, &saved_action)) {
t->passed = 0;
--
2.17.1
xtheadvector is a custom extension that is based upon riscv vector
version 0.7.1 [1]. All of the vector routines have been modified to
support this alternative vector version based upon whether xtheadvector
was determined to be supported at boot.
vlenb is not supported on the existing xtheadvector hardware, so a
devicetree property thead,vlenb is added to provide the vlenb to Linux.
There is a new hwprobe key RISCV_HWPROBE_KEY_VENDOR_EXT_THEAD_0 that is
used to request which thead vendor extensions are supported on the
current platform. This allows future vendors to allocate hwprobe keys
for their vendor.
Support for xtheadvector is also added to the vector kselftests.
Signed-off-by: Charlie Jenkins <charlie(a)rivosinc.com>
[1] https://github.com/T-head-Semi/thead-extension-spec/blob/95358cb2cca9489361…
---
This series is a continuation of a different series that was fragmented
into two other series in an attempt to get part of it merged in the 6.10
merge window. The split-off series did not get merged due to a NAK on
the series that added the generic riscv,vlenb devicetree entry. This
series has converted riscv,vlenb to thead,vlenb to remedy this issue.
The original series is titled "riscv: Support vendor extensions and
xtheadvector" [3].
The series titled "riscv: Extend cpufeature.c to detect vendor
extensions" is still under development and this series is based on that
series! [4]
I have tested this with an Allwinner Nezha board. I ran into issues
booting the board after 6.9-rc1 so I applied these patches to 6.8. There
are a couple of minor merge conflicts that do arrise when doing that, so
please let me know if you have been able to boot this board with a 6.9
kernel. I used SkiffOS [1] to manage building the image, but upgraded
the U-Boot version to Samuel Holland's more up-to-date version [2] and
changed out the device tree used by U-Boot with the device trees that
are present in upstream linux and this series. Thank you Samuel for all
of the work you did to make this task possible.
[1] https://github.com/skiffos/SkiffOS/tree/master/configs/allwinner/nezha
[2] https://github.com/smaeul/u-boot/commit/2e89b706f5c956a70c989cd31665f1429e9…
[3] https://lore.kernel.org/all/20240503-dev-charlie-support_thead_vector_6_9-v…
[4] https://lore.kernel.org/lkml/20240719-support_vendor_extensions-v3-4-0af758…
---
Changes in v8:
- Rebase onto palmer's for-next
- Link to v7: https://lore.kernel.org/r/20240724-xtheadvector-v7-0-b741910ada3e@rivosinc.…
Changes in v7:
- Add defs for has_xtheadvector_no_alternatives() and has_xtheadvector()
when vector disabled. (Palmer)
- Link to v6: https://lore.kernel.org/r/20240722-xtheadvector-v6-0-c9af0130fa00@rivosinc.…
Changes in v6:
- Fix return type of is_vector_supported()/is_xthead_supported() to be bool
- Link to v5: https://lore.kernel.org/r/20240719-xtheadvector-v5-0-4b485fc7d55f@rivosinc.…
Changes in v5:
- Rebase on for-next
- Link to v4: https://lore.kernel.org/r/20240702-xtheadvector-v4-0-2bad6820db11@rivosinc.…
Changes in v4:
- Replace inline asm with C (Samuel)
- Rename VCSRs to CSRs (Samuel)
- Replace .insn directives with .4byte directives
- Link to v3: https://lore.kernel.org/r/20240619-xtheadvector-v3-0-bff39eb9668e@rivosinc.…
Changes in v3:
- Add back Heiko's signed-off-by (Conor)
- Mark RISCV_HWPROBE_KEY_VENDOR_EXT_THEAD_0 as a bitmask
- Link to v2: https://lore.kernel.org/r/20240610-xtheadvector-v2-0-97a48613ad64@rivosinc.…
Changes in v2:
- Removed extraneous references to "riscv,vlenb" (Jess)
- Moved declaration of "thead,vlenb" into cpus.yaml and added
restriction that it's only applicable to thead cores (Conor)
- Check CONFIG_RISCV_ISA_XTHEADVECTOR instead of CONFIG_RISCV_ISA_V for
thead,vlenb (Jess)
- Fix naming of hwprobe variables (Evan)
- Link to v1: https://lore.kernel.org/r/20240609-xtheadvector-v1-0-3fe591d7f109@rivosinc.…
---
Charlie Jenkins (12):
dt-bindings: riscv: Add xtheadvector ISA extension description
dt-bindings: cpus: add a thead vlen register length property
riscv: dts: allwinner: Add xtheadvector to the D1/D1s devicetree
riscv: Add thead and xtheadvector as a vendor extension
riscv: vector: Use vlenb from DT for thead
riscv: csr: Add CSR encodings for CSR_VXRM/CSR_VXSAT
riscv: Add xtheadvector instruction definitions
riscv: vector: Support xtheadvector save/restore
riscv: hwprobe: Add thead vendor extension probing
riscv: hwprobe: Document thead vendor extensions and xtheadvector extension
selftests: riscv: Fix vector tests
selftests: riscv: Support xtheadvector in vector tests
Heiko Stuebner (1):
RISC-V: define the elements of the VCSR vector CSR
Documentation/arch/riscv/hwprobe.rst | 10 +
Documentation/devicetree/bindings/riscv/cpus.yaml | 19 ++
.../devicetree/bindings/riscv/extensions.yaml | 10 +
arch/riscv/Kconfig.vendor | 26 ++
arch/riscv/boot/dts/allwinner/sun20i-d1s.dtsi | 3 +-
arch/riscv/include/asm/cpufeature.h | 2 +
arch/riscv/include/asm/csr.h | 15 ++
arch/riscv/include/asm/hwprobe.h | 3 +-
arch/riscv/include/asm/switch_to.h | 2 +-
arch/riscv/include/asm/vector.h | 225 ++++++++++++----
arch/riscv/include/asm/vendor_extensions/thead.h | 42 +++
.../include/asm/vendor_extensions/thead_hwprobe.h | 18 ++
.../include/asm/vendor_extensions/vendor_hwprobe.h | 37 +++
arch/riscv/include/uapi/asm/hwprobe.h | 3 +-
arch/riscv/include/uapi/asm/vendor/thead.h | 3 +
arch/riscv/kernel/cpufeature.c | 51 +++-
arch/riscv/kernel/kernel_mode_vector.c | 8 +-
arch/riscv/kernel/process.c | 4 +-
arch/riscv/kernel/signal.c | 6 +-
arch/riscv/kernel/sys_hwprobe.c | 5 +
arch/riscv/kernel/vector.c | 24 +-
arch/riscv/kernel/vendor_extensions.c | 10 +
arch/riscv/kernel/vendor_extensions/Makefile | 2 +
arch/riscv/kernel/vendor_extensions/thead.c | 18 ++
.../riscv/kernel/vendor_extensions/thead_hwprobe.c | 19 ++
tools/testing/selftests/riscv/vector/.gitignore | 3 +-
tools/testing/selftests/riscv/vector/Makefile | 17 +-
.../selftests/riscv/vector/v_exec_initval_nolibc.c | 93 +++++++
tools/testing/selftests/riscv/vector/v_helpers.c | 68 +++++
tools/testing/selftests/riscv/vector/v_helpers.h | 8 +
tools/testing/selftests/riscv/vector/v_initval.c | 22 ++
.../selftests/riscv/vector/v_initval_nolibc.c | 68 -----
.../selftests/riscv/vector/vstate_exec_nolibc.c | 20 +-
.../testing/selftests/riscv/vector/vstate_prctl.c | 295 ++++++++++++---------
34 files changed, 890 insertions(+), 269 deletions(-)
---
base-commit: 2709e400c2e06ddae9ad120f301a5254f629cf3e
change-id: 20240530-xtheadvector-833d3d17b423
--
- Charlie
Add RTC wakeup alarm for devices to resume after specific time interval.
This improvement in the test will help in enabling this test
in the CI systems and will eliminate the need of manual intervention
for resuming back the devices after suspend/hibernation.
Signed-off-by: Shreeya Patel <shreeya.patel(a)collabora.com>
---
Changes in v2
- Use rtcwake utility instead of sysfs for setting up
a RTC wakeup alarm
tools/testing/selftests/cpufreq/cpufreq.sh | 15 +++++++++++++++
tools/testing/selftests/cpufreq/main.sh | 13 ++++++++++++-
2 files changed, 27 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/cpufreq/cpufreq.sh b/tools/testing/selftests/cpufreq/cpufreq.sh
index b583a2fb4504..a427de1f9e08 100755
--- a/tools/testing/selftests/cpufreq/cpufreq.sh
+++ b/tools/testing/selftests/cpufreq/cpufreq.sh
@@ -232,6 +232,21 @@ do_suspend()
for i in `seq 1 $2`; do
printf "Starting $1\n"
+
+ if [ "$3" = "rtc" ]; then
+ if ! command -v rtcwake &> /dev/null; then
+ printf "rtcwake could not be found, please install it.\n"
+ return 1
+ fi
+
+ rtcwake -m $filename -s 15
+
+ if [ $? -ne 0 ]; then
+ printf "Failed to suspend using RTC wake alarm\n"
+ return 1
+ fi
+ fi
+
echo $filename > $SYSFS/power/state
printf "Came out of $1\n"
diff --git a/tools/testing/selftests/cpufreq/main.sh b/tools/testing/selftests/cpufreq/main.sh
index 60ce18ed0666..b1ca4147a5e6 100755
--- a/tools/testing/selftests/cpufreq/main.sh
+++ b/tools/testing/selftests/cpufreq/main.sh
@@ -24,6 +24,8 @@ helpme()
[-t <basic: Basic cpufreq testing
suspend: suspend/resume,
hibernate: hibernate/resume,
+ suspend_rtc: suspend/resume back using the RTC wakeup alarm,
+ hibernate_rtc: hibernate/resume back using the RTC wakeup alarm,
modtest: test driver or governor modules. Only to be used with -d or -g options,
sptest1: Simple governor switch to produce lockdep.
sptest2: Concurrent governor switch to produce lockdep.
@@ -76,7 +78,8 @@ parse_arguments()
helpme
;;
- t) # --func_type (Function to perform: basic, suspend, hibernate, modtest, sptest1/2/3/4 (default: basic))
+ t) # --func_type (Function to perform: basic, suspend, hibernate,
+ # suspend_rtc, hibernate_rtc, modtest, sptest1/2/3/4 (default: basic))
FUNC=$OPTARG
;;
@@ -122,6 +125,14 @@ do_test()
do_suspend "hibernate" 1
;;
+ "suspend_rtc")
+ do_suspend "suspend" 1 rtc
+ ;;
+
+ "hibernate_rtc")
+ do_suspend "hibernate" 1 rtc
+ ;;
+
"modtest")
# Do we have modules in place?
if [ -z $DRIVER_MOD ] && [ -z $GOVERNOR_MOD ]; then
--
2.39.2
After HID-BPF struct_ops was merged into 6.11-rc0, there are a few
mishaps:
- the bpf_wq API changed and needs to be updated here
- libbpf now auto-attach all the struct_ops it sees in the bpf object,
leading to attempting at attaching them multiple times
Fix the selftests but also prevent the same struct_ops to be attached
more than once as this enters various locks, confusions, and kernel
oopses.
Signed-off-by: Benjamin Tissoires <bentiss(a)kernel.org>
---
Benjamin Tissoires (4):
selftests/hid: fix bpf_wq new API
selftests/hid: disable struct_ops auto-attach
HID: bpf: prevent the same struct_ops to be attached more than once
selftests/hid: add test for attaching multiple time the same struct_ops
drivers/hid/bpf/hid_bpf_struct_ops.c | 5 +++++
tools/testing/selftests/hid/hid_bpf.c | 26 ++++++++++++++++++++++
tools/testing/selftests/hid/progs/hid.c | 2 +-
.../testing/selftests/hid/progs/hid_bpf_helpers.h | 2 +-
4 files changed, 33 insertions(+), 2 deletions(-)
---
base-commit: 6e504d2c61244a01226c5100c835e44fb9b85ca8
change-id: 20240723-fix-6-11-bpf-cfa63dcda5bc
Best regards,
--
Benjamin Tissoires <bentiss(a)kernel.org>
The format specifier in fprintf is "%u", that "%u" should use
unsigned int type instead.the problem is discovered by reading code.
Signed-off-by: Zhu Jun <zhujun2(a)cmss.chinamobile.com>
---
v1->v2:
modify commit info add how to find the problem in the log
v2->v3:
Seems this can use macro WTERMSIG like those above usage, rather than
changing the print format.
v3->v4:
Now the commit summary doesn't match the change you are making.
Also WTERMSIG() is incorrect for this conditional code path.
See comments below in the code path.
I would leave the v2 code intact. How are you testing this change?
Please include the details in the change log.
tools/testing/selftests/kselftest_harness.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kselftest_harness.h b/tools/testing/selftests/kselftest_harness.h
index e05ac8261046..675b8f43e148 100644
--- a/tools/testing/selftests/kselftest_harness.h
+++ b/tools/testing/selftests/kselftest_harness.h
@@ -910,7 +910,7 @@ void __wait_for_test(struct __test_metadata *t)
.sa_flags = SA_SIGINFO,
};
struct sigaction saved_action;
- int status;
+ unsigned int status;
if (sigaction(SIGALRM, &action, &saved_action)) {
t->passed = 0;
--
2.17.1
This patch series adds a selftest suite to validate the s390x
architecture specific ucontrol KVM interface.
When creating a VM on s390x it is possible to create it as userspace
controlled VM or in short ucontrol VM.
These VMs delegates the management of the VM to userspace instead
of handling most events within the kernel. Consequently the userspace
has to manage interrupts, memory allocation etc.
Before this patch set this functionality lacks any public test cases.
It is desirable to add test cases for this interface to be able to
reduce the risk of breaking changes in the future.
In order to provision a ucontrol VM the kernel needs to be compiled with
the CONFIG_KVM_S390_UCONTROL enabled. The users with sys_admin capability
can then create a new ucontrol VM providing the KVM_VM_S390_UCONTROL
parameter to the KVM_CREATE_VM ioctl.
The kernels existing selftest helper functions can only be partially be
reused for these tests.
The test cases cover existing special handling of ucontrol VMs within the
implementation and basic VM creation and handling cases:
* Reject setting HPAGE when VM is ucontrol
* Assert KVM_GET_DIRTY_LOG is rejected
* Assert KVM_S390_VM_MEM_LIMIT_SIZE is rejected
* Assert state of initial SIE flags setup by the kernel
* Run simple program in VM with and without DAT
* Assert KVM_EXIT_S390_UCONTROL exit on not mapped memory access
* Assert functionality of storage keys in ucontrol VM
Running the test cases requires sys_admin capabilities to start the
ucontrol VM.
This can be achieved by running as root or with a command like:
sudo setpriv --reuid nobody --inh-caps -all,+sys_admin \
--ambient-caps -all,+sys_admin --bounding-set -all,+sys_admin \
./ucontrol_test
The patch set does also contain some code cleanup / consolidation of
architecture specific defines that are now used in multiple test cases.
---
V1 -> V2:
- add ucontrol to s390 debug config (new patch)
- PATCH 2: changed atomic_t to __u32 (thanks Claudio)
- PATCH 4: reformatted comment in FIXTURE_SETUP(uc_kvm)
- PATCH 5: refactored to display 8 byte blocks + more internal reuse
(thanks Claudio)
- PATCH 7: make use of more declarative defines instead of magic values
- PATCH 8: make use of more declarative defines instead of magic values
(thanks Claudio)
- PATCH 9: add reference to fix verified by the test case
Christoph Schlameuss (10):
selftests: kvm: s390: Define page sizes in shared header
selftests: kvm: s390: Add kvm_s390_sie_block definition for userspace
tests
selftests: kvm: s390: Add s390x ucontrol test suite with hpage test
selftests: kvm: s390: Add test fixture and simple VM setup tests
selftests: kvm: s390: Add debug print functions
selftests: kvm: s390: Add VM run test case
selftests: kvm: s390: Add uc_map_unmap VM test case
selftests: kvm: s390: Add uc_skey VM test case
selftests: kvm: s390: Verify reject memory region operations for
ucontrol VMs
s390: Enable KVM_S390_UCONTROL config in debug_defconfig
arch/s390/Kconfig.debug | 3 +
tools/testing/selftests/kvm/.gitignore | 1 +
tools/testing/selftests/kvm/Makefile | 1 +
.../selftests/kvm/include/s390x/debug_print.h | 69 ++
.../selftests/kvm/include/s390x/processor.h | 5 +
.../testing/selftests/kvm/include/s390x/sie.h | 240 +++++++
.../selftests/kvm/lib/s390x/processor.c | 10 +-
tools/testing/selftests/kvm/s390x/cmma_test.c | 7 +-
tools/testing/selftests/kvm/s390x/config | 2 +
.../testing/selftests/kvm/s390x/debug_test.c | 4 +-
tools/testing/selftests/kvm/s390x/memop.c | 4 +-
tools/testing/selftests/kvm/s390x/tprot.c | 5 +-
.../selftests/kvm/s390x/ucontrol_test.c | 614 ++++++++++++++++++
13 files changed, 949 insertions(+), 16 deletions(-)
create mode 100644 tools/testing/selftests/kvm/include/s390x/debug_print.h
create mode 100644 tools/testing/selftests/kvm/include/s390x/sie.h
create mode 100644 tools/testing/selftests/kvm/s390x/config
create mode 100644 tools/testing/selftests/kvm/s390x/ucontrol_test.c
base-commit: 66ebbdfdeb093e097399b1883390079cd4c3022b
--
2.45.2
Hello everyone,
this small series is a first step in a larger effort aiming to help improve
eBPF selftests and the testing coverage in CI. It focuses for now on
test_xdp_veth.sh, a small test which is not integrated yet in test_progs.
The series is mostly about a rewrite of test_xdp_veth.sh to make it able to
run under test_progs, relying on libbpf to manipulate bpf programs involved
in the test.
Signed-off-by: Alexis Lothoré <alexis.lothore(a)bootlin.com>
---
Changes in v4:
- add missing netns_close in an error path during programs attach
- Link to v3: https://lore.kernel.org/r/20240716-convert_test_xdp_veth-v3-0-7b01389e3cb3@…
Changes in v3:
- Fix doc style in the new test
- Collect acked-by tags
- Link to v2: https://lore.kernel.org/r/20240715-convert_test_xdp_veth-v2-0-46290b82f6d2@…
Changes in v2:
- fix many formatting issues raised by checkpatch
- use static namespaces instead of random ones
- use SYS_NOFAIL instead of snprintf() + system ()
- squashed the new test addition patch and the old test removal patch
- Link to v1: https://lore.kernel.org/r/20240711-convert_test_xdp_veth-v1-0-868accb0a727@…
---
Alexis Lothoré (eBPF Foundation) (2):
selftests/bpf: update xdp_redirect_map prog sections for libbpf
selftests/bpf: integrate test_xdp_veth into test_progs
tools/testing/selftests/bpf/Makefile | 1 -
.../selftests/bpf/prog_tests/test_xdp_veth.c | 213 +++++++++++++++++++++
.../testing/selftests/bpf/progs/xdp_redirect_map.c | 6 +-
tools/testing/selftests/bpf/test_xdp_veth.sh | 121 ------------
4 files changed, 216 insertions(+), 125 deletions(-)
---
base-commit: 4837cbaa1365cdb213b58577197c5b10f6e2aa81
change-id: 20240710-convert_test_xdp_veth-04cc05f5557d
Best regards,
--
Alexis Lothoré, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
xtheadvector is a custom extension that is based upon riscv vector
version 0.7.1 [1]. All of the vector routines have been modified to
support this alternative vector version based upon whether xtheadvector
was determined to be supported at boot.
vlenb is not supported on the existing xtheadvector hardware, so a
devicetree property thead,vlenb is added to provide the vlenb to Linux.
There is a new hwprobe key RISCV_HWPROBE_KEY_VENDOR_EXT_THEAD_0 that is
used to request which thead vendor extensions are supported on the
current platform. This allows future vendors to allocate hwprobe keys
for their vendor.
Support for xtheadvector is also added to the vector kselftests.
Signed-off-by: Charlie Jenkins <charlie(a)rivosinc.com>
[1] https://github.com/T-head-Semi/thead-extension-spec/blob/95358cb2cca9489361…
---
This series is a continuation of a different series that was fragmented
into two other series in an attempt to get part of it merged in the 6.10
merge window. The split-off series did not get merged due to a NAK on
the series that added the generic riscv,vlenb devicetree entry. This
series has converted riscv,vlenb to thead,vlenb to remedy this issue.
The original series is titled "riscv: Support vendor extensions and
xtheadvector" [3].
The series titled "riscv: Extend cpufeature.c to detect vendor
extensions" is still under development and this series is based on that
series! [4]
I have tested this with an Allwinner Nezha board. I ran into issues
booting the board after 6.9-rc1 so I applied these patches to 6.8. There
are a couple of minor merge conflicts that do arrise when doing that, so
please let me know if you have been able to boot this board with a 6.9
kernel. I used SkiffOS [1] to manage building the image, but upgraded
the U-Boot version to Samuel Holland's more up-to-date version [2] and
changed out the device tree used by U-Boot with the device trees that
are present in upstream linux and this series. Thank you Samuel for all
of the work you did to make this task possible.
[1] https://github.com/skiffos/SkiffOS/tree/master/configs/allwinner/nezha
[2] https://github.com/smaeul/u-boot/commit/2e89b706f5c956a70c989cd31665f1429e9…
[3] https://lore.kernel.org/all/20240503-dev-charlie-support_thead_vector_6_9-v…
[4] https://lore.kernel.org/lkml/20240719-support_vendor_extensions-v3-4-0af758…
---
Changes in v7:
- Add defs for has_xtheadvector_no_alternatives() and has_xtheadvector()
when vector disabled. (Palmer)
- Link to v6: https://lore.kernel.org/r/20240722-xtheadvector-v6-0-c9af0130fa00@rivosinc.…
Changes in v6:
- Fix return type of is_vector_supported()/is_xthead_supported() to be bool
- Link to v5: https://lore.kernel.org/r/20240719-xtheadvector-v5-0-4b485fc7d55f@rivosinc.…
Changes in v5:
- Rebase on for-next
- Link to v4: https://lore.kernel.org/r/20240702-xtheadvector-v4-0-2bad6820db11@rivosinc.…
Changes in v4:
- Replace inline asm with C (Samuel)
- Rename VCSRs to CSRs (Samuel)
- Replace .insn directives with .4byte directives
- Link to v3: https://lore.kernel.org/r/20240619-xtheadvector-v3-0-bff39eb9668e@rivosinc.…
Changes in v3:
- Add back Heiko's signed-off-by (Conor)
- Mark RISCV_HWPROBE_KEY_VENDOR_EXT_THEAD_0 as a bitmask
- Link to v2: https://lore.kernel.org/r/20240610-xtheadvector-v2-0-97a48613ad64@rivosinc.…
Changes in v2:
- Removed extraneous references to "riscv,vlenb" (Jess)
- Moved declaration of "thead,vlenb" into cpus.yaml and added
restriction that it's only applicable to thead cores (Conor)
- Check CONFIG_RISCV_ISA_XTHEADVECTOR instead of CONFIG_RISCV_ISA_V for
thead,vlenb (Jess)
- Fix naming of hwprobe variables (Evan)
- Link to v1: https://lore.kernel.org/r/20240609-xtheadvector-v1-0-3fe591d7f109@rivosinc.…
---
Charlie Jenkins (12):
dt-bindings: riscv: Add xtheadvector ISA extension description
dt-bindings: cpus: add a thead vlen register length property
riscv: dts: allwinner: Add xtheadvector to the D1/D1s devicetree
riscv: Add thead and xtheadvector as a vendor extension
riscv: vector: Use vlenb from DT for thead
riscv: csr: Add CSR encodings for CSR_VXRM/CSR_VXSAT
riscv: Add xtheadvector instruction definitions
riscv: vector: Support xtheadvector save/restore
riscv: hwprobe: Add thead vendor extension probing
riscv: hwprobe: Document thead vendor extensions and xtheadvector extension
selftests: riscv: Fix vector tests
selftests: riscv: Support xtheadvector in vector tests
Heiko Stuebner (1):
RISC-V: define the elements of the VCSR vector CSR
Documentation/arch/riscv/hwprobe.rst | 10 +
Documentation/devicetree/bindings/riscv/cpus.yaml | 19 ++
.../devicetree/bindings/riscv/extensions.yaml | 10 +
arch/riscv/Kconfig.vendor | 26 ++
arch/riscv/boot/dts/allwinner/sun20i-d1s.dtsi | 3 +-
arch/riscv/include/asm/cpufeature.h | 2 +
arch/riscv/include/asm/csr.h | 15 ++
arch/riscv/include/asm/hwprobe.h | 5 +-
arch/riscv/include/asm/switch_to.h | 2 +-
arch/riscv/include/asm/vector.h | 225 ++++++++++++----
arch/riscv/include/asm/vendor_extensions/thead.h | 42 +++
.../include/asm/vendor_extensions/thead_hwprobe.h | 18 ++
.../include/asm/vendor_extensions/vendor_hwprobe.h | 37 +++
arch/riscv/include/uapi/asm/hwprobe.h | 3 +-
arch/riscv/include/uapi/asm/vendor/thead.h | 3 +
arch/riscv/kernel/cpufeature.c | 54 +++-
arch/riscv/kernel/kernel_mode_vector.c | 8 +-
arch/riscv/kernel/process.c | 4 +-
arch/riscv/kernel/signal.c | 6 +-
arch/riscv/kernel/sys_hwprobe.c | 5 +
arch/riscv/kernel/vector.c | 24 +-
arch/riscv/kernel/vendor_extensions.c | 10 +
arch/riscv/kernel/vendor_extensions/Makefile | 2 +
arch/riscv/kernel/vendor_extensions/thead.c | 18 ++
.../riscv/kernel/vendor_extensions/thead_hwprobe.c | 19 ++
tools/testing/selftests/riscv/vector/.gitignore | 3 +-
tools/testing/selftests/riscv/vector/Makefile | 17 +-
.../selftests/riscv/vector/v_exec_initval_nolibc.c | 93 +++++++
tools/testing/selftests/riscv/vector/v_helpers.c | 68 +++++
tools/testing/selftests/riscv/vector/v_helpers.h | 8 +
tools/testing/selftests/riscv/vector/v_initval.c | 22 ++
.../selftests/riscv/vector/v_initval_nolibc.c | 68 -----
.../selftests/riscv/vector/vstate_exec_nolibc.c | 20 +-
.../testing/selftests/riscv/vector/vstate_prctl.c | 295 ++++++++++++---------
34 files changed, 891 insertions(+), 273 deletions(-)
---
base-commit: 554462ced9ac97487c8f725fe466a9c20ed87521
change-id: 20240530-xtheadvector-833d3d17b423
--
- Charlie
Hello all,
This series includes the bulk of libc-related compile fixes accumulated to
support systems using musl, with smaller numbers to follow. These patches
are simple and straightforward, and the series has been tested with the
kernel-patches/bpf CI and locally using mips64el-gcc/musl-libc and QEMU
with an OpenWrt rootfs.
The patches address a few general categories of libc portability issues:
- missing, redundant or incorrect include headers
- disabled GNU header extensions (i.e. missing #define _GNU_SOURCE)
- issues with types and casting
Feedback and suggestions for improvement are welcome!
Thanks,
Tony
Tony Ambardar (19):
selftests/bpf: Use pid_t consistently in test_progs.c
selftests/bpf: Fix compile error from rlim_t in sk_storage_map.c
selftests/bpf: Fix error compiling bpf_iter_setsockopt.c with musl
libc
selftests/bpf: Drop unneeded include in unpriv_helpers.c
selftests/bpf: Drop unneeded include in sk_lookup.c
selftests/bpf: Drop unneeded include in flow_dissector.c
selftests/bpf: Fix missing ARRAY_SIZE() definition in bench.c
selftests/bpf: Fix missing UINT_MAX definitions in benchmarks
selftests/bpf: Fix missing BUILD_BUG_ON() declaration
selftests/bpf: Fix include of <sys/fcntl.h>
selftests/bpf: Fix compiling parse_tcp_hdr_opt.c with musl-libc
selftests/bpf: Fix compiling kfree_skb.c with musl-libc
selftests/bpf: Fix compiling flow_dissector.c with musl-libc
selftests/bpf: Fix compiling tcp_rtt.c with musl-libc
selftests/bpf: Fix compiling core_reloc.c with musl-libc
selftests/bpf: Fix errors compiling lwt_redirect.c with musl libc
selftests/bpf: Fix errors compiling decap_sanity.c with musl libc
selftests/bpf: Fix errors compiling crypto_sanity.c with musl libc
selftests/bpf: Fix errors compiling cg_storage_multi.h with musl libc
tools/testing/selftests/bpf/bench.c | 1 +
tools/testing/selftests/bpf/bench.h | 1 +
tools/testing/selftests/bpf/map_tests/sk_storage_map.c | 2 +-
tools/testing/selftests/bpf/prog_tests/bpf_iter_setsockopt.c | 2 +-
tools/testing/selftests/bpf/prog_tests/core_reloc.c | 1 +
tools/testing/selftests/bpf/prog_tests/crypto_sanity.c | 1 -
tools/testing/selftests/bpf/prog_tests/decap_sanity.c | 1 -
tools/testing/selftests/bpf/prog_tests/flow_dissector.c | 2 +-
tools/testing/selftests/bpf/prog_tests/kfree_skb.c | 1 +
tools/testing/selftests/bpf/prog_tests/lwt_redirect.c | 1 -
tools/testing/selftests/bpf/prog_tests/ns_current_pid_tgid.c | 2 +-
tools/testing/selftests/bpf/prog_tests/parse_tcp_hdr_opt.c | 1 +
tools/testing/selftests/bpf/prog_tests/sk_lookup.c | 1 -
tools/testing/selftests/bpf/prog_tests/tcp_rtt.c | 1 +
tools/testing/selftests/bpf/prog_tests/user_ringbuf.c | 1 +
tools/testing/selftests/bpf/progs/cg_storage_multi.h | 2 --
tools/testing/selftests/bpf/test_progs.c | 2 +-
tools/testing/selftests/bpf/unpriv_helpers.c | 1 -
18 files changed, 12 insertions(+), 12 deletions(-)
--
2.34.1
commit 0518dbe97fe6 ("selftests/mm: fix cross compilation with LLVM")
changed the env variable for the architecture from MACHINE to ARCH.
This is preventing 3 required TEST_GEN_FILES from being included when
cross compiling s390x and errors when trying to run the test suite.
This is due to the ARCH variable already being set and the arch folder
name being s390.
Add "s390" to the filtered list to cover this case and have the 3 files
included in the build.
Fixes: 0518dbe97fe6 ("selftests/mm: fix cross compilation with LLVM")
Cc: stable(a)kernel.org
Cc: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Nico Pache <npache(a)redhat.com>
---
tools/testing/selftests/mm/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
index 901e0d07765b..7b8a5def54a1 100644
--- a/tools/testing/selftests/mm/Makefile
+++ b/tools/testing/selftests/mm/Makefile
@@ -110,7 +110,7 @@ endif
endif
-ifneq (,$(filter $(ARCH),arm64 ia64 mips64 parisc64 powerpc riscv64 s390x sparc64 x86_64))
+ifneq (,$(filter $(ARCH),arm64 ia64 mips64 parisc64 powerpc riscv64 s390x sparc64 x86_64 s390))
TEST_GEN_FILES += va_high_addr_switch
TEST_GEN_FILES += virtual_address_range
TEST_GEN_FILES += write_to_hugetlbfs
--
2.45.2
'%u' in format string requires 'unsigned int' in __wait_for_test()
but the argument type is 'signed int' that this problem was discovered
by reading code.use macro WTERMSIG like those above usage to
fix the wrong format specifier.
Signed-off-by: Zhu Jun <zhujun2(a)cmss.chinamobile.com>
---
Changes
v1->v2:
modify commit info add how to find the problem in the log
v2->v3:
Seems this can use macro WTERMSIG like those above usage, rather than
changing the print format.
tools/testing/selftests/kselftest_harness.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kselftest_harness.h b/tools/testing/selftests/kselftest_harness.h
index dbbbcc6c04ee..f41f4435e9a4 100644
--- a/tools/testing/selftests/kselftest_harness.h
+++ b/tools/testing/selftests/kselftest_harness.h
@@ -1086,7 +1086,7 @@ void __wait_for_test(struct __test_metadata *t)
fprintf(TH_LOG_STREAM,
"# %s: Test ended in some other way [%d]\n",
t->name,
- status);
+ WTERMSIG(status));
}
}
--
2.17.1
In this series from Geliang, modifying MPTCP BPF selftests, we have:
- A new MPTCP subflow BPF program setting socket options per subflow: it
looks better to have this old test program in the BPF selftests to
track regressions and to serve as example.
Note: Nicolas is no longer working for Tessares, but he did this work
while working for them, and his email address is no longer available.
- A new symlink to MPTCP's pm_nl_ctl tool is added in BPF selftests, to
be able to use it instead of 'ip mptcp' which is not supported by the
BPF CI running IPRoute 5.5.0.
- A new MPTCP BPF subtest validating the new BPF program added in the
first patch.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Changes in v3:
- Sorry for the delay between v2 and v3, this series was conflicting
with the "add netns helpers", but it looks like it is on hold:
https://lore.kernel.org/cover.1715821541.git.tanggeliang@kylinos.cn
- Patch 1/3 includes "bpf_tracing_net.h", introduced in between.
- New patch 2/3: "selftests/bpf: Add mptcp pm_nl_ctl link".
- Patch 3/3: use the tool introduced in patch 2/3 + SYS_NOFAIL() helper.
- Link to v2: https://lore.kernel.org/r/20240509-upstream-bpf-next-20240506-mptcp-subflow…
Changes in v2:
- Previous patches 1/4 and 2/4 have been dropped from this series:
- 1/4: "selftests/bpf: Handle SIGINT when creating netns":
- A new version, more generic and no longer specific to MPTCP BPF
selftest will be sent later, as part of a new series. (Alexei)
- 2/4: "selftests/bpf: Add RUN_MPTCP_TEST macro":
- Removed, not to hide helper functions in macros. (Alexei)
- The commit message of patch 1/2 has been clarified to avoid some
possible confusions spot by Alexei.
- Link to v1: https://lore.kernel.org/r/20240507-upstream-bpf-next-20240506-mptcp-subflow…
---
Geliang Tang (2):
selftests/bpf: Add mptcp pm_nl_ctl link
selftests/bpf: Add mptcp subflow subtest
Nicolas Rybowski (1):
selftests/bpf: Add mptcp subflow example
MAINTAINERS | 1 +
tools/testing/selftests/bpf/Makefile | 3 +-
tools/testing/selftests/bpf/mptcp_pm_nl_ctl.c | 1 +
tools/testing/selftests/bpf/prog_tests/mptcp.c | 104 ++++++++++++++++++++++
tools/testing/selftests/bpf/progs/mptcp_subflow.c | 59 ++++++++++++
5 files changed, 167 insertions(+), 1 deletion(-)
---
base-commit: fd8db07705c55a995c42b1e71afc42faad675b0b
change-id: 20240506-upstream-bpf-next-20240506-mptcp-subflow-test-faef6654bfa3
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
v15: https://patchwork.kernel.org/project/netdevbpf/list/?series=865481&state=*
====
No material changes in this version, only a fix to linking against
libynl.a from the last version. Per Jakub's instructions I've pulled one
of his patches into this series, and now use the new libynl.a correctly,
I hope.
As usual, the full devmem TCP changes including the full GVE driver
implementation is here:
https://github.com/mina/linux/commits/tcpdevmem-v15/
v14: https://patchwork.kernel.org/project/netdevbpf/list/?series=865135&archive=…
====
No material changes in this version. Only rebase and re-verification on
top of net-next. v13, I think, raced with commit ebad6d0334793
("net/ipv4: Use nested-BH locking for ipv4_tcp_sk.") being merged to
net-next that caused a patchwork failure to apply. This series should
apply cleanly on commit c4532232fa2a4 ("selftests: net: remove unneeded
IP_GRE config").
I did not wait the customary 24hr as Jakub said it's OK to repost as soon
as I build test the rebased version:
https://lore.kernel.org/netdev/20240625075926.146d769d@kernel.org/
v13: https://patchwork.kernel.org/project/netdevbpf/list/?series=861406&archive=…
====
Major changes:
--------------
This iteration addresses Pavel's review comments, applies his
reviewed-by's, and seeks to fix the patchwork build error (sorry!).
As usual, the full devmem TCP changes including the full GVE driver
implementation is here:
https://github.com/mina/linux/commits/tcpdevmem-v13/
v12: https://patchwork.kernel.org/project/netdevbpf/list/?series=859747&state=*
====
Major changes:
--------------
This iteration only addresses one minor comment from Pavel with regards
to the trace printing of netmem, and the patchwork build error
introduced in v11 because I missed doing an allmodconfig build, sorry.
Other than that v11, AFAICT, received no feedback. There is one
discussion about how the specifics of plugging io uring memory through
the page pool, but not relevant to content in this particular patchset,
AFAICT.
As usual, the full devmem TCP changes including the full GVE driver
implementation is here:
https://github.com/mina/linux/commits/tcpdevmem-v12/
v11: https://patchwork.kernel.org/project/netdevbpf/list/?series=857457&state=*
====
Major Changes:
--------------
v11 addresses feedback received in v10. The major change is the removal
of the memory provider ops as requested by Christoph. We still
accomplish the same thing, but utilizing direct function calls with if
statements rather than generic ops.
Additionally address sparse warnings, bugs and review comments from
folks that reviewed.
As usual, the full devmem TCP changes including the full GVE driver
implementation is here:
https://github.com/mina/linux/commits/tcpdevmem-v11/
Detailed changelog:
-------------------
- Fixes in netdev_rx_queue_restart() from Pavel & David.
- Remove commit e650e8c3a36f5 ("net: page_pool: create hooks for
custom page providers") from the series to address Christoph's
feedback and rebased other patches on the series on this change.
- Fixed build errors with CONFIG_DMA_SHARED_BUFFER &&
!CONFIG_GENERIC_ALLOCATOR build.
- Fixed sparse warnings pointed out by Paolo.
- Drop unnecessary gro_pull_from_frag0 checks.
- Added Bagas reviewed-by to docs.
Cc: Bagas Sanjaya <bagasdotme(a)gmail.com>
Cc: Steven Rostedt <rostedt(a)goodmis.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Nikolay Aleksandrov <razor(a)blackwall.org>
v10: https://patchwork.kernel.org/project/netdevbpf/list/?series=852422&state=*
====
Major Changes:
--------------
v9 was sent right before the merge window closed (sorry!). v10 is almost
a re-send of the series now that the merge window re-opened. Only
rebased to latest net-next and addressed some minor iterative comments
received on v9.
As usual, the full devmem TCP changes including the full GVE driver
implementation is here:
https://github.com/mina/linux/commits/tcpdevmem-v10/
Detailed changelog:
-------------------
- Fixed tokens leaking in DONTNEED setsockopt (Nikolay).
- Moved net_iov_dma_addr() to devmem.c and made it a devmem specific
helpers (David).
- Rename hook alloc_pages to alloc_netmems as alloc_pages is now
preprocessor macro defined and causes a build error.
v9:
===
Major Changes:
--------------
GVE queue API has been merged. Submitting this version as non-RFC after
rebasing on top of the merged API, and dropped the out of tree queue API
I was carrying on github. Addressed the little feedback v8 has received.
Detailed changelog:
------------------
- Added new patch from David Wei to this series for
netdev_rx_queue_restart()
- Fixed sparse error.
- Removed CONFIG_ checks in netmem_is_net_iov()
- Flipped skb->readable to skb->unreadable
- Minor fixes to selftests & docs.
RFC v8:
=======
Major Changes:
--------------
- Fixed build error generated by patch-by-patch build.
- Applied docs suggestions from Randy.
RFC v7:
=======
Major Changes:
--------------
This revision largely rebases on top of net-next and addresses the feedback
RFCv6 received from folks, namely Jakub, Yunsheng, Arnd, David, & Pavel.
The series remains in RFC because the queue-API ndos defined in this
series are not yet implemented. I have a GVE implementation I carry out
of tree for my testing. A upstreamable GVE implementation is in the
works. Aside from that, in my estimation all the patches are ready for
review/merge. Please do take a look.
As usual the full devmem TCP changes including the full GVE driver
implementation is here:
https://github.com/mina/linux/commits/tcpdevmem-v7/
Detailed changelog:
- Use admin-perm in netlink API.
- Addressed feedback from Jakub with regards to netlink API
implementation.
- Renamed devmem.c functions to something more appropriate for that
file.
- Improve the performance seen through the page_pool benchmark.
- Fix the value definition of all the SO_DEVMEM_* uapi.
- Various fixes to documentation.
Perf - page-pool benchmark:
---------------------------
Improved performance of bench_page_pool_simple.ko tests compared to v6:
https://pastebin.com/raw/v5dYRg8L
net-next base: 8 cycle fast path.
RFC v6: 10 cycle fast path.
RFC v7: 9 cycle fast path.
RFC v7 with CONFIG_DMA_SHARED_BUFFER disabled: 8 cycle fast path,
same as baseline.
Perf - Devmem TCP benchmark:
---------------------
Perf is about the same regardless of the changes in v7, namely the
removal of the static_branch_unlikely to improve the page_pool benchmark
performance:
189/200gbps bi-directional throughput with RX devmem TCP and regular TCP
TX i.e. ~95% line rate.
RFC v6:
=======
Major Changes:
--------------
This revision largely rebases on top of net-next and addresses the little
feedback RFCv5 received.
The series remains in RFC because the queue-API ndos defined in this
series are not yet implemented. I have a GVE implementation I carry out
of tree for my testing. A upstreamable GVE implementation is in the
works. Aside from that, in my estimation all the patches are ready for
review/merge. Please do take a look.
As usual the full devmem TCP changes including the full GVE driver
implementation is here:
https://github.com/mina/linux/commits/tcpdevmem-v6/
This version also comes with some performance data recorded in the cover
letter (see below changelog).
Detailed changelog:
- Rebased on top of the merged netmem_ref changes.
- Converted skb->dmabuf to skb->readable (Pavel). Pavel's original
suggestion was to remove the skb->dmabuf flag entirely, but when I
looked into it closely, I found the issue that if we remove the flag
we have to dereference the shinfo(skb) pointer to obtain the first
frag to tell whether an skb is readable or not. This can cause a
performance regression if it dirties the cache line when the
shinfo(skb) was not really needed. Instead, I converted the skb->dmabuf
flag into a generic skb->readable flag which can be re-used by io_uring
0-copy RX.
- Squashed a few locking optimizations from Eric Dumazet in the RX path
and the DEVMEM_DONTNEED setsockopt.
- Expanded the tests a bit. Added validation for invalid scenarios and
added some more coverage.
Perf - page-pool benchmark:
---------------------------
bench_page_pool_simple.ko tests with and without these changes:
https://pastebin.com/raw/ncHDwAbn
AFAIK the number that really matters in the perf tests is the
'tasklet_page_pool01_fast_path Per elem'. This one measures at about 8
cycles without the changes but there is some 1 cycle noise in some
results.
With the patches this regresses to 9 cycles with the changes but there
is 1 cycle noise occasionally running this test repeatedly.
Lastly I tried disable the static_branch_unlikely() in
netmem_is_net_iov() check. To my surprise disabling the
static_branch_unlikely() check reduces the fast path back to 8 cycles,
but the 1 cycle noise remains.
Perf - Devmem TCP benchmark:
---------------------
189/200gbps bi-directional throughput with RX devmem TCP and regular TCP
TX i.e. ~95% line rate.
Major changes in RFC v5:
========================
1. Rebased on top of 'Abstract page from net stack' series and used the
new netmem type to refer to LSB set pointers instead of re-using
struct page.
2. Downgraded this series back to RFC and called it RFC v5. This is
because this series is now dependent on 'Abstract page from net
stack'[1] and the queue API. Both are removed from the series to
reduce the patch # and those bits are fairly independent or
pre-requisite work.
3. Reworked the page_pool devmem support to use netmem and for some
more unified handling.
4. Reworked the reference counting of net_iov (renamed from
page_pool_iov) to use pp_ref_count for refcounting.
The full changes including the dependent series and GVE page pool
support is here:
https://github.com/mina/linux/commits/tcpdevmem-rfcv5/
[1] https://patchwork.kernel.org/project/netdevbpf/list/?series=810774
Major changes in v1:
====================
1. Implemented MVP queue API ndos to remove the userspace-visible
driver reset.
2. Fixed issues in the napi_pp_put_page() devmem frag unref path.
3. Removed RFC tag.
Many smaller addressed comments across all the patches (patches have
individual change log).
Full tree including the rest of the GVE driver changes:
https://github.com/mina/linux/commits/tcpdevmem-v1
Changes in RFC v3:
==================
1. Pulled in the memory-provider dependency from Jakub's RFC[1] to make the
series reviewable and mergeable.
2. Implemented multi-rx-queue binding which was a todo in v2.
3. Fix to cmsg handling.
The sticking point in RFC v2[2] was the device reset required to refill
the device rx-queues after the dmabuf bind/unbind. The solution
suggested as I understand is a subset of the per-queue management ops
Jakub suggested or similar:
https://lore.kernel.org/netdev/20230815171638.4c057dcd@kernel.org/
This is not addressed in this revision, because:
1. This point was discussed at netconf & netdev and there is openness to
using the current approach of requiring a device reset.
2. Implementing individual queue resetting seems to be difficult for my
test bed with GVE. My prototype to test this ran into issues with the
rx-queues not coming back up properly if reset individually. At the
moment I'm unsure if it's a mistake in the POC or a genuine issue in
the virtualization stack behind GVE, which currently doesn't test
individual rx-queue restart.
3. Our usecases are not bothered by requiring a device reset to refill
the buffer queues, and we'd like to support NICs that run into this
limitation with resetting individual queues.
My thought is that drivers that have trouble with per-queue configs can
use the support in this series, while drivers that support new netdev
ops to reset individual queues can automatically reset the queue as
part of the dma-buf bind/unbind.
The same approach with device resets is presented again for consideration
with other sticking points addressed.
This proposal includes the rx devmem path only proposed for merge. For a
snapshot of my entire tree which includes the GVE POC page pool support &
device memory support:
https://github.com/torvalds/linux/compare/master...mina:linux:tcpdevmem-v3
[1] https://lore.kernel.org/netdev/f8270765-a27b-6ccf-33ea-cda097168d79@redhat.…
[2] https://lore.kernel.org/netdev/CAHS8izOVJGJH5WF68OsRWFKJid1_huzzUK+hpKbLcL4…
Changes in RFC v2:
==================
The sticking point in RFC v1[1] was the dma-buf pages approach we used to
deliver the device memory to the TCP stack. RFC v2 is a proof-of-concept
that attempts to resolve this by implementing scatterlist support in the
networking stack, such that we can import the dma-buf scatterlist
directly. This is the approach proposed at a high level here[2].
Detailed changes:
1. Replaced dma-buf pages approach with importing scatterlist into the
page pool.
2. Replace the dma-buf pages centric API with a netlink API.
3. Removed the TX path implementation - there is no issue with
implementing the TX path with scatterlist approach, but leaving
out the TX path makes it easier to review.
4. Functionality is tested with this proposal, but I have not conducted
perf testing yet. I'm not sure there are regressions, but I removed
perf claims from the cover letter until they can be re-confirmed.
5. Added Signed-off-by: contributors to the implementation.
6. Fixed some bugs with the RX path since RFC v1.
Any feedback welcome, but specifically the biggest pending questions
needing feedback IMO are:
1. Feedback on the scatterlist-based approach in general.
2. Netlink API (Patch 1 & 2).
3. Approach to handle all the drivers that expect to receive pages from
the page pool (Patch 6).
[1] https://lore.kernel.org/netdev/dfe4bae7-13a0-3c5d-d671-f61b375cb0b4@gmail.c…
[2] https://lore.kernel.org/netdev/CAHS8izPm6XRS54LdCDZVd0C75tA1zHSu6jLVO8nzTLX…
==================
* TL;DR:
Device memory TCP (devmem TCP) is a proposal for transferring data to and/or
from device memory efficiently, without bouncing the data to a host memory
buffer.
* Problem:
A large amount of data transfers have device memory as the source and/or
destination. Accelerators drastically increased the volume of such transfers.
Some examples include:
- ML accelerators transferring large amounts of training data from storage into
GPU/TPU memory. In some cases ML training setup time can be as long as 50% of
TPU compute time, improving data transfer throughput & efficiency can help
improving GPU/TPU utilization.
- Distributed training, where ML accelerators, such as GPUs on different hosts,
exchange data among them.
- Distributed raw block storage applications transfer large amounts of data with
remote SSDs, much of this data does not require host processing.
Today, the majority of the Device-to-Device data transfers the network are
implemented as the following low level operations: Device-to-Host copy,
Host-to-Host network transfer, and Host-to-Device copy.
The implementation is suboptimal, especially for bulk data transfers, and can
put significant strains on system resources, such as host memory bandwidth,
PCIe bandwidth, etc. One important reason behind the current state is the
kernel’s lack of semantics to express device to network transfers.
* Proposal:
In this patch series we attempt to optimize this use case by implementing
socket APIs that enable the user to:
1. send device memory across the network directly, and
2. receive incoming network packets directly into device memory.
Packet _payloads_ go directly from the NIC to device memory for receive and from
device memory to NIC for transmit.
Packet _headers_ go to/from host memory and are processed by the TCP/IP stack
normally. The NIC _must_ support header split to achieve this.
Advantages:
- Alleviate host memory bandwidth pressure, compared to existing
network-transfer + device-copy semantics.
- Alleviate PCIe BW pressure, by limiting data transfer to the lowest level
of the PCIe tree, compared to traditional path which sends data through the
root complex.
* Patch overview:
** Part 1: netlink API
Gives user ability to bind dma-buf to an RX queue.
** Part 2: scatterlist support
Currently the standard for device memory sharing is DMABUF, which doesn't
generate struct pages. On the other hand, networking stack (skbs, drivers, and
page pool) operate on pages. We have 2 options:
1. Generate struct pages for dmabuf device memory, or,
2. Modify the networking stack to process scatterlist.
Approach #1 was attempted in RFC v1. RFC v2 implements approach #2.
** part 3: page pool support
We piggy back on page pool memory providers proposal:
https://github.com/kuba-moo/linux/tree/pp-providers
It allows the page pool to define a memory provider that provides the
page allocation and freeing. It helps abstract most of the device memory
TCP changes from the driver.
** part 4: support for unreadable skb frags
Page pool iovs are not accessible by the host; we implement changes
throughput the networking stack to correctly handle skbs with unreadable
frags.
** Part 5: recvmsg() APIs
We define user APIs for the user to send and receive device memory.
Not included with this series is the GVE devmem TCP support, just to
simplify the review. Code available here if desired:
https://github.com/mina/linux/tree/tcpdevmem
This series is built on top of net-next with Jakub's pp-providers changes
cherry-picked.
* NIC dependencies:
1. (strict) Devmem TCP require the NIC to support header split, i.e. the
capability to split incoming packets into a header + payload and to put
each into a separate buffer. Devmem TCP works by using device memory
for the packet payload, and host memory for the packet headers.
2. (optional) Devmem TCP works better with flow steering support & RSS support,
i.e. the NIC's ability to steer flows into certain rx queues. This allows the
sysadmin to enable devmem TCP on a subset of the rx queues, and steer
devmem TCP traffic onto these queues and non devmem TCP elsewhere.
The NIC I have access to with these properties is the GVE with DQO support
running in Google Cloud, but any NIC that supports these features would suffice.
I may be able to help reviewers bring up devmem TCP on their NICs.
* Testing:
The series includes a udmabuf kselftest that show a simple use case of
devmem TCP and validates the entire data path end to end without
a dependency on a specific dmabuf provider.
** Test Setup
Kernel: net-next with this series and memory provider API cherry-picked
locally.
Hardware: Google Cloud A3 VMs.
NIC: GVE with header split & RSS & flow steering support.
Cc: Pavel Begunkov <asml.silence(a)gmail.com>
Cc: David Wei <dw(a)davidwei.uk>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Yunsheng Lin <linyunsheng(a)huawei.com>
Cc: Shailend Chand <shailend(a)google.com>
Cc: Harshitha Ramamurthy <hramamurthy(a)google.com>
Cc: Shakeel Butt <shakeel.butt(a)linux.dev>
Cc: Jeroen de Borst <jeroendb(a)google.com>
Cc: Praveen Kaligineedi <pkaligineedi(a)google.com>
Jakub Kicinski (1):
tools: net: package libynl for use in selftests
Mina Almasry (13):
netdev: add netdev_rx_queue_restart()
net: netdev netlink api to bind dma-buf to a net device
netdev: support binding dma-buf to netdevice
netdev: netdevice devmem allocator
page_pool: convert to use netmem
page_pool: devmem support
memory-provider: dmabuf devmem memory provider
net: support non paged skb frags
net: add support for skbs with unreadable frags
tcp: RX path for devmem TCP
net: add SO_DEVMEM_DONTNEED setsockopt to release RX frags
net: add devmem TCP documentation
selftests: add ncdevmem, netcat for devmem TCP
Documentation/netlink/specs/netdev.yaml | 57 +++
Documentation/networking/devmem.rst | 258 +++++++++++
Documentation/networking/index.rst | 1 +
arch/alpha/include/uapi/asm/socket.h | 6 +
arch/mips/include/uapi/asm/socket.h | 6 +
arch/parisc/include/uapi/asm/socket.h | 6 +
arch/sparc/include/uapi/asm/socket.h | 6 +
include/linux/skbuff.h | 61 ++-
include/linux/skbuff_ref.h | 11 +-
include/linux/socket.h | 1 +
include/net/devmem.h | 124 ++++++
include/net/mp_dmabuf_devmem.h | 44 ++
include/net/netdev_rx_queue.h | 5 +
include/net/netmem.h | 208 ++++++++-
include/net/page_pool/helpers.h | 124 ++++--
include/net/page_pool/types.h | 22 +-
include/net/sock.h | 2 +
include/net/tcp.h | 5 +-
include/trace/events/page_pool.h | 30 +-
include/uapi/asm-generic/socket.h | 6 +
include/uapi/linux/netdev.h | 19 +
include/uapi/linux/uio.h | 17 +
net/bpf/test_run.c | 5 +-
net/core/Makefile | 3 +-
net/core/datagram.c | 6 +
net/core/dev.c | 6 +-
net/core/devmem.c | 376 ++++++++++++++++
net/core/gro.c | 3 +-
net/core/netdev-genl-gen.c | 23 +
net/core/netdev-genl-gen.h | 6 +
net/core/netdev-genl.c | 103 +++++
net/core/netdev_rx_queue.c | 74 ++++
net/core/page_pool.c | 362 +++++++++-------
net/core/skbuff.c | 83 +++-
net/core/sock.c | 61 +++
net/ipv4/esp4.c | 3 +-
net/ipv4/tcp.c | 261 +++++++++++-
net/ipv4/tcp_input.c | 13 +-
net/ipv4/tcp_ipv4.c | 16 +
net/ipv4/tcp_minisocks.c | 2 +
net/ipv4/tcp_output.c | 5 +-
net/ipv6/esp6.c | 3 +-
net/packet/af_packet.c | 4 +-
tools/include/uapi/linux/netdev.h | 19 +
tools/net/ynl/Makefile | 6 +-
tools/net/ynl/lib/Makefile | 4 +-
tools/testing/selftests/net/.gitignore | 1 +
tools/testing/selftests/net/Makefile | 9 +
tools/testing/selftests/net/ncdevmem.c | 542 ++++++++++++++++++++++++
tools/testing/selftests/net/ynl.mk | 21 +
50 files changed, 2786 insertions(+), 253 deletions(-)
create mode 100644 Documentation/networking/devmem.rst
create mode 100644 include/net/devmem.h
create mode 100644 include/net/mp_dmabuf_devmem.h
create mode 100644 net/core/devmem.c
create mode 100644 net/core/netdev_rx_queue.c
create mode 100644 tools/testing/selftests/net/ncdevmem.c
create mode 100644 tools/testing/selftests/net/ynl.mk
--
2.45.2.803.g4e1b14247a-goog
If the testing kernel doesn't support setting fdb_max_learned or show
fdb_n_learned, just skip it. Or we will get errors like
./bridge_fdb_learning_limit.sh: line 218: [: null: integer expression expected
./bridge_fdb_learning_limit.sh: line 225: [: null: integer expression expected
Fixes: 6f84090333bb ("selftests: forwarding: bridge_fdb_learning_limit: Add a new selftest")
Signed-off-by: Hangbin Liu <liuhangbin(a)gmail.com>
---
.../forwarding/bridge_fdb_learning_limit.sh | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/tools/testing/selftests/net/forwarding/bridge_fdb_learning_limit.sh b/tools/testing/selftests/net/forwarding/bridge_fdb_learning_limit.sh
index 0760a34b7114..a21b7085da2e 100755
--- a/tools/testing/selftests/net/forwarding/bridge_fdb_learning_limit.sh
+++ b/tools/testing/selftests/net/forwarding/bridge_fdb_learning_limit.sh
@@ -178,6 +178,22 @@ fdb_del()
check_err $? "Failed to remove a FDB entry of type ${type}"
}
+check_fdb_n_learned_support()
+{
+ if ! ip link help bridge 2>&1 | grep -q "fdb_max_learned"; then
+ echo "SKIP: iproute2 too old, missing bridge max learned support"
+ exit $ksft_skip
+ fi
+
+ ip link add dev br0 type bridge
+ local learned=$(fdb_get_n_learned)
+ ip link del dev br0
+ if [ "$learned" == "null" ]; then
+ echo "SKIP: kernel too old; bridge fdb_n_learned feature not supported."
+ exit $ksft_skip
+ fi
+}
+
check_accounting_one_type()
{
local type=$1 is_counted=$2 overrides_learned=$3
@@ -274,6 +290,8 @@ check_limit()
done
}
+check_fdb_n_learned_support
+
trap cleanup EXIT
setup_prepare
--
2.45.0
Based on feedback from Linus[1] and follow-up discussions, change the
suggested file naming for KUnit tests.
Link: https://lore.kernel.org/lkml/CAHk-=wgim6pNiGTBMhP8Kd3tsB7_JTAuvNJ=XYd3wPvvk… [1]
Signed-off-by: Kees Cook <kees(a)kernel.org>
---
Cc: David Gow <davidgow(a)google.com>
Cc: Brendan Higgins <brendan.higgins(a)linux.dev>
Cc: Rae Moar <rmoar(a)google.com>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Jonathan Corbet <corbet(a)lwn.net>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: linux-kselftest(a)vger.kernel.org
Cc: kunit-dev(a)googlegroups.com
Cc: linux-doc(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Cc: linux-hardening(a)vger.kernel.org
---
Documentation/dev-tools/kunit/style.rst | 25 +++++++++++++++----------
1 file changed, 15 insertions(+), 10 deletions(-)
diff --git a/Documentation/dev-tools/kunit/style.rst b/Documentation/dev-tools/kunit/style.rst
index b6d0d7359f00..1538835cd0e2 100644
--- a/Documentation/dev-tools/kunit/style.rst
+++ b/Documentation/dev-tools/kunit/style.rst
@@ -188,15 +188,20 @@ For example, a Kconfig entry might look like:
Test File and Module Names
==========================
-KUnit tests can often be compiled as a module. These modules should be named
-after the test suite, followed by ``_test``. If this is likely to conflict with
-non-KUnit tests, the suffix ``_kunit`` can also be used.
-
-The easiest way of achieving this is to name the file containing the test suite
-``<suite>_test.c`` (or, as above, ``<suite>_kunit.c``). This file should be
-placed next to the code under test.
+Whether a KUnit test is compiled as a separate module or via an
+``#include`` in a core kernel source file, the file should be named
+after the test suite, followed by ``_kunit``, and live in a ``tests``
+subdirectory to avoid conflicting with regular modules (e.g. if "foobar"
+is the core module, then "foobar_kunit" is the KUnit test module) or the
+core kernel source file names (e.g. for tab-completion). Many existing
+tests use a ``_test`` suffix, but this is considered deprecated.
+
+So for the common case, name the file containing the test suite
+``tests/<suite>_kunit.c``. The ``tests`` directory should be placed at
+the same level as the code under test. For example, tests for
+``lib/string.c`` live in ``lib/tests/string_kunit.c``.
If the suite name contains some or all of the name of the test's parent
-directory, it may make sense to modify the source filename to reduce redundancy.
-For example, a ``foo_firmware`` suite could be in the ``foo/firmware_test.c``
-file.
+directory, it may make sense to modify the source filename to reduce
+redundancy. For example, a ``foo_firmware`` suite could be in the
+``tests/foo/firmware_kunit.c`` file.
--
2.34.1
From: Geliang Tang <tanggeliang(a)kylinos.cn>
This set is part 11 of series "use network helpers" all BPF selftests
wide.
Finally something new in this set.
The helper make_sockaddr is extended to support sockets of AF_PACKET,
AF_ALG and AF_VSOCK families. Then these types of sockets can be used
to start_server_str() helper too.
Imitating connect_to_* interfaces, send_to_* interfaces are added to
support sendto() with given FD or the address string.
Add more conditions to control listen: nolisten flag, listen_support()
helper and clear "type" bits for listen.
Patch 1 for AF_UNIX socket:
Patch 1 uses start_server_str for a AF_UNIX socket.
Patches 2-6 for AF_PACKET sockets:
Patch 2 adds AF_PACKET support for make_sockaddr.
Patch 3 uses start_server_str for a AF_PACKET socket.
Patches 4-5 adds send_to_fd_opts/send_to_addr_str helpers.
Patch 6 uses send_to_addr_str for a AF_PACKET socket.
Patches 7-9 for AF_ALG sockets:
Patch 7 adds AF_ALG support for make_sockaddr.
Patch 8 add nolisten flag, needed by patch 9.
Patch 9 uses send_to_addr_str for a AF_ALG socket.
Patches 10-15 for AF_VSOCK sockets:
Patch 10 adds AF_VSOCK support for make_sockaddr.
Patches 11-12 uses make_sockaddr for AF_VSOCK sockets.
Patches 13-14 adds more conditions to control listen.
Patch 15 uses start_server_str for AF_VSOCK sockets.
Geliang Tang (15):
selftests/bpf: Use start_server_str in skc_to_unix_sock
selftests/bpf: AF_PACKET support for make_sockaddr
selftests/bpf: Use start_server_str in lwt_redirect
selftests/bpf: Add send_to_fd_opts helper
selftests/bpf: Add send_to_addr_str helper
selftests/bpf: Use send_to_addr_str in xdp_bonding
selftests/bpf: AF_ALG support for make_sockaddr
selftests/bpf: Add nolisten for network_helper_opts
selftests/bpf: Use start_server_str in crypto_sanity
selftests/bpf: AF_VSOCK support for make_sockaddr
selftests/bpf: Add loopback_addr_str helper
selftests/bpf: Use make_sockaddr in sockmap_helpers
selftests/bpf: Check listen support for start_server_addr
selftests/bpf: Clear type bits for start_server_addr
selftests/bpf: Use start_server_str in sockmap_helpers
tools/testing/selftests/bpf/network_helpers.c | 144 +++++++++++++++---
tools/testing/selftests/bpf/network_helpers.h | 21 +++
.../selftests/bpf/prog_tests/crypto_sanity.c | 12 +-
.../selftests/bpf/prog_tests/lwt_redirect.c | 21 +--
.../bpf/prog_tests/migrate_reuseport.c | 2 +-
.../bpf/prog_tests/skc_to_unix_sock.c | 22 +--
.../bpf/prog_tests/sockmap_helpers.h | 101 +++---------
.../selftests/bpf/prog_tests/xdp_bonding.c | 20 +--
8 files changed, 186 insertions(+), 157 deletions(-)
--
2.43.0
'%u' in format string requires 'unsigned int' in __wait_for_test()
but the argument type is 'signed int' that this problem was discovered
by reading code
Signed-off-by: Zhu Jun <zhujun2(a)cmss.chinamobile.com>
---
Changes in v2:
- modify commit info add how to find the problem in the log
tools/testing/selftests/kselftest_harness.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kselftest_harness.h b/tools/testing/selftests/kselftest_harness.h
index b634969cbb6f..dbbbcc6c04ee 100644
--- a/tools/testing/selftests/kselftest_harness.h
+++ b/tools/testing/selftests/kselftest_harness.h
@@ -1084,7 +1084,7 @@ void __wait_for_test(struct __test_metadata *t)
}
} else {
fprintf(TH_LOG_STREAM,
- "# %s: Test ended in some other way [%u]\n",
+ "# %s: Test ended in some other way [%d]\n",
t->name,
status);
}
--
2.17.1
Hello,
This series includes two fixes to support builds targeting MIPS systems.
The patches have been tested both with the kernel-patches/bpf CI and
locally using mips64el-gcc/musl-libc and QEMU with an OpenWrt rootfs.
Patch 1 adds support for MIPS system includes when compiling BPF.
Patch 2 fixes a MIPS GOT issue when linking uprobe_multi.
Feedback and suggestions for improvement are welcome!
Thanks,
Tony
Tony Ambardar (2):
selftests/bpf: Add missing system defines for mips
selftests/bpf: Fix error linking uprobe_multi on mips
tools/testing/selftests/bpf/Makefile | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
--
2.34.1
This is an updated patchset following some excellent comments from Roman
and Longman. [1]
As suggested, I've broken this into two commits:
1) with the implementation changes
2) extending the tests tests
I haven't been able to induce anything problematic, but I'm a bit
unclear as to whether there's reference counting on cgroups such that
we don't need to handle the case where the cgroup is freed before the
one of the peak files is closed.
Documentation/admin-guide/cgroup-v2.rst | 26 ++-
include/linux/cgroup.h | 8 +
include/linux/memcontrol.h | 5 +
include/linux/page_counter.h | 11 +-
kernel/cgroup/cgroup-internal.h | 2 +
kernel/cgroup/cgroup.c | 7 +
mm/memcontrol.c | 129 +++++++++++++--
mm/page_counter.c | 36 ++++-
tools/testing/selftests/cgroup/cgroup_util.c | 22 +++
tools/testing/selftests/cgroup/cgroup_util.h | 2 +
tools/testing/selftests/cgroup/test_memcontrol.c | 227 ++++++++++++++++++++++++++-
11 files changed, 444 insertions(+), 31 deletions(-)
[1]: https://lore.kernel.org/cgroups/20240722151713.2724855-1-davidf@vimeo.com/T/
Thank you for your efforts and reviews,
David Finkel
Senior Principal Software Engineer
Vimeo Inc.
From: Geliang Tang <tanggeliang(a)kylinos.cn>
This set is part 10 of series "use network helpers" all BPF selftests
wide.
Patches 1-3 drop local functions make_client(), make_socket() and
inetaddr_len() in sk_lookup.c. Patch 4 drops a useless function
__start_server() in network_helpers.c.
Geliang Tang (4):
selftests/bpf: Drop make_client in sk_lookup
selftests/bpf: Drop make_socket in sk_lookup
selftests/bpf: Drop inetaddr_len in sk_lookup
selftests/bpf: Drop __start_server in network_helpers
tools/testing/selftests/bpf/network_helpers.c | 26 ++---
.../selftests/bpf/prog_tests/sk_lookup.c | 110 +++++-------------
2 files changed, 40 insertions(+), 96 deletions(-)
--
2.43.0
From: Xu Kuohai <xukuohai(a)huawei.com>
LSM BPF prog returning a positive number attached to the hook
file_alloc_security makes kernel panic.
Here is a panic log:
[ 441.235774] BUG: kernel NULL pointer dereference, address: 00000000000009
[ 441.236748] #PF: supervisor write access in kernel mode
[ 441.237429] #PF: error_code(0x0002) - not-present page
[ 441.238119] PGD 800000000b02f067 P4D 800000000b02f067 PUD b031067 PMD 0
[ 441.238990] Oops: 0002 [#1] PREEMPT SMP PTI
[ 441.239546] CPU: 0 PID: 347 Comm: loader Not tainted 6.8.0-rc6-gafe0cbf23373 #22
[ 441.240496] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b4
[ 441.241933] RIP: 0010:alloc_file+0x4b/0x190
[ 441.242485] Code: 8b 04 25 c0 3c 1f 00 48 8b b0 30 0c 00 00 e8 9c fe ff ff 48 3d 00 f0 ff fb
[ 441.244820] RSP: 0018:ffffc90000c67c40 EFLAGS: 00010203
[ 441.245484] RAX: ffff888006a891a0 RBX: ffffffff8223bd00 RCX: 0000000035b08000
[ 441.246391] RDX: ffff88800b95f7b0 RSI: 00000000001fc110 RDI: f089cd0b8088ffff
[ 441.247294] RBP: ffffc90000c67c58 R08: 0000000000000001 R09: 0000000000000001
[ 441.248209] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000001
[ 441.249108] R13: ffffc90000c67c78 R14: ffffffff8223bd00 R15: fffffffffffffff4
[ 441.250007] FS: 00000000005f3300(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000
[ 441.251053] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 441.251788] CR2: 00000000000001a9 CR3: 000000000bdc4003 CR4: 0000000000170ef0
[ 441.252688] Call Trace:
[ 441.253011] <TASK>
[ 441.253296] ? __die+0x24/0x70
[ 441.253702] ? page_fault_oops+0x15b/0x480
[ 441.254236] ? fixup_exception+0x26/0x330
[ 441.254750] ? exc_page_fault+0x6d/0x1c0
[ 441.255257] ? asm_exc_page_fault+0x26/0x30
[ 441.255792] ? alloc_file+0x4b/0x190
[ 441.256257] alloc_file_pseudo+0x9f/0xf0
[ 441.256760] __anon_inode_getfile+0x87/0x190
[ 441.257311] ? lock_release+0x14e/0x3f0
[ 441.257808] bpf_link_prime+0xe8/0x1d0
[ 441.258315] bpf_tracing_prog_attach+0x311/0x570
[ 441.258916] ? __pfx_bpf_lsm_file_alloc_security+0x10/0x10
[ 441.259605] __sys_bpf+0x1bb7/0x2dc0
[ 441.260070] __x64_sys_bpf+0x20/0x30
[ 441.260533] do_syscall_64+0x72/0x140
[ 441.261004] entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 441.261643] RIP: 0033:0x4b0349
[ 441.262045] Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 88
[ 441.264355] RSP: 002b:00007fff74daee38 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
[ 441.265293] RAX: ffffffffffffffda RBX: 00007fff74daef30 RCX: 00000000004b0349
[ 441.266187] RDX: 0000000000000040 RSI: 00007fff74daee50 RDI: 000000000000001c
[ 441.267114] RBP: 000000000000001b R08: 00000000005ef820 R09: 0000000000000000
[ 441.268018] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
[ 441.268907] R13: 0000000000000004 R14: 00000000005ef018 R15: 00000000004004e8
This is because the filesystem uses IS_ERR to check if the return value
is an error code. If it is not, the filesystem takes the return value
as a file pointer. Since the positive number returned by the BPF prog
is not a real file pointer, this misinterpretation causes a panic.
Since other LSM modules always return either a negative error code
or a valid pointer, this specific issue only exists in BPF LSM. The
proposed solution is to reject LSM BPF progs returning unexpected
values in the verifier. This patch set adds return value check to
ensure only BPF progs returning expected values are accepted.
Since each LSM hook has different excepted return values, we need to
know the expected return values for each individual hook to do the
check. Earlier versions of the patch set used LSM hook annotations
to specify the return value range for each hook. Based on Paul's
suggestion, current version gets rid of such annotations and instead
converts hook return values to a common pattern: return 0 on success
and negative error code on failure.
Basically, LSM hooks are divided into two types: hooks that return a
negative error code and zero or other values, and hooks that do not
return a negative error code. This patch set converts all hooks of the
first type and part of the second type to return 0 on success and a
negative error code on failure (see patches 1-10). For certain hooks,
like ismaclabel and inode_xattr_skipcap, the hook name already imply
that returning 0 or 1 is the best choice, so they are not converted.
There are four unconverted hooks. Except for ismaclabel, which is not
used by BPF LSM, the other three are specified with a BTF ID list to
only return 0 or 1.
v4:
1. remove LSM_HOOK return value annotaion and convert LSM hook return
value to a common patern: return 0 on success and negative error code
on failure (patch 1-10)
2. enable BPF LSM progs to read and write output params (patch 12)
3. add a special case for bitwise AND on range [-1, 0] (patch 16)
4. add a 32-bit comparing flag for retval_range_within (patch 15)
5. collect ACKs, style fix, etc
v3: https://lore.kernel.org/bpf/20240411122752.2873562-1-xukuohai@huaweicloud.c…
1. Fix incorrect lsm hook return value ranges, and add disabled hook
list for bpf lsm, and merge two LSM_RET_INT patches. (KP Singh)
2. Avoid bpf lsm progs attached to different hooks to call each other
with tail call
3. Fix a CI failure caused by false rejection of AND operation
4. Add tests
v2: https://lore.kernel.org/bpf/20240325095653.1720123-1-xukuohai@huaweicloud.c…
fix bpf ci failure
v1: https://lore.kernel.org/bpf/20240316122359.1073787-1-xukuohai@huaweicloud.c…
Xu Kuohai (20):
lsm: Refactor return value of LSM hook vm_enough_memory
lsm: Refactor return value of LSM hook inode_need_killpriv
lsm: Refactor return value of LSM hook inode_getsecurity
lsm: Refactor return value of LSM hook inode_listsecurity
lsm: Refactor return value of LSM hook inode_copy_up_xattr
lsm: Refactor return value of LSM hook getselfattr
lsm: Refactor return value of LSM hook setprocattr
lsm: Refactor return value of LSM hook getprocattr
lsm: Refactor return value of LSM hook key_getsecurity
lsm: Refactor return value of LSM hook audit_rule_match
bpf, lsm: Add disabled BPF LSM hook list
bpf, lsm: Enable BPF LSM prog to read/write return value parameters
bpf, lsm: Add check for BPF LSM return value
bpf: Prevent tail call between progs attached to different hooks
bpf: Fix compare error in function retval_range_within
bpf: Add a special case for bitwise AND on range [-1, 0]
selftests/bpf: Avoid load failure for token_lsm.c
selftests/bpf: Add return value checks for failed tests
selftests/bpf: Add test for lsm tail call
selftests/bpf: Add verifier tests for bpf lsm
fs/attr.c | 5 +-
fs/inode.c | 4 +-
fs/nfs/nfs4proc.c | 5 +-
fs/overlayfs/copy_up.c | 6 +-
fs/proc/base.c | 10 +-
fs/xattr.c | 24 +-
include/linux/bpf.h | 2 +
include/linux/bpf_lsm.h | 15 +
include/linux/lsm_hook_defs.h | 22 +-
include/linux/security.h | 62 ++--
include/linux/tnum.h | 3 +
kernel/bpf/bpf_lsm.c | 64 +++-
kernel/bpf/btf.c | 21 +-
kernel/bpf/core.c | 21 +-
kernel/bpf/tnum.c | 25 ++
kernel/bpf/verifier.c | 173 ++++++++++-
net/socket.c | 9 +-
security/apparmor/audit.c | 22 +-
security/apparmor/include/audit.h | 2 +-
security/apparmor/lsm.c | 22 +-
security/commoncap.c | 32 +-
security/integrity/evm/evm_main.c | 2 +-
security/keys/keyctl.c | 11 +-
security/lsm_syscalls.c | 6 +-
security/security.c | 167 ++++++++---
security/selinux/hooks.c | 94 +++---
security/selinux/include/audit.h | 8 +-
security/selinux/ss/services.c | 54 ++--
security/smack/smack_lsm.c | 104 ++++---
.../selftests/bpf/prog_tests/test_lsm.c | 46 ++-
.../selftests/bpf/prog_tests/verifier.c | 2 +
tools/testing/selftests/bpf/progs/err.h | 10 +
.../selftests/bpf/progs/lsm_tailcall.c | 34 +++
.../selftests/bpf/progs/test_sig_in_xattr.c | 4 +
.../bpf/progs/test_verify_pkcs7_sig.c | 8 +-
tools/testing/selftests/bpf/progs/token_lsm.c | 4 +-
.../bpf/progs/verifier_global_subprogs.c | 7 +-
.../selftests/bpf/progs/verifier_lsm.c | 274 ++++++++++++++++++
38 files changed, 1098 insertions(+), 286 deletions(-)
create mode 100644 tools/testing/selftests/bpf/progs/lsm_tailcall.c
create mode 100644 tools/testing/selftests/bpf/progs/verifier_lsm.c
--
2.30.2
This patch series is motivated by the following observation:
Raise a signal, jump to signal handler. The ucontext_t structure dumped
by kernel to userspace has a uc_sigmask field having the mask of blocked
signals. If you run a fresh minimalistic program doing this, this field
is empty, even if you block some signals while registering the handler
with sigaction().
Here is what the man-pages have to say:
sigaction(2): "sa_mask specifies a mask of signals which should be blocked
(i.e., added to the signal mask of the thread in which the signal handler
is invoked) during execution of the signal handler. In addition, the
signal which triggered the handler will be blocked, unless the SA_NODEFER
flag is used."
signal(7): Under "Execution of signal handlers", (1.3) implies:
"The thread's current signal mask is accessible via the ucontext_t
object that is pointed to by the third argument of the signal handler."
But, (1.4) states:
"Any signals specified in act->sa_mask when registering the handler with
sigprocmask(2) are added to the thread's signal mask. The signal being
delivered is also added to the signal mask, unless SA_NODEFER was
specified when registering the handler. These signals are thus blocked
while the handler executes."
There clearly is no distinction being made in the man pages between
"Thread's signal mask" and ucontext_t; this logically should imply
that a signal blocked by populating struct sigaction should be visible
in ucontext_t.
Here is what the kernel code does (for Aarch64):
do_signal() -> handle_signal() -> sigmask_to_save(), which returns
¤t->blocked, is passed to setup_rt_frame() -> setup_sigframe() ->
__copy_to_user(). Hence, ¤t->blocked is copied to ucontext_t
exposed to userspace. Returning back to handle_signal(),
signal_setup_done() -> signal_delivered() -> sigorsets() and
set_current_blocked() are responsible for using information from
struct ksignal ksig, which was populated through the sigaction()
system call in kernel/signal.c:
copy_from_user(&new_sa.sa, act, sizeof(new_sa.sa)),
to update ¤t->blocked; hence, the set of blocked signals for the
current thread is updated AFTER the kernel dumps ucontext_t to
userspace.
Assuming that the above is indeed the intended behaviour, because it
semantically makes sense, since the signals blocked using sigaction()
remain blocked only till the execution of the handler, and not in the
context present before jumping to the handler (but nothing can be
confirmed from the man-pages), the series introduces a test for
mangling with uc_sigmask. I will send a separate series to fix the
man-pages.
The proposed selftest has been tested out on Aarch32, Aarch64 and x86_64.
v3->v4:
- Allocate sigsets as automatic variables to avoid malloc()
v2->v3:
- ucontext describes current state -> ucontext describes interrupted context
- Add a comment for blockage of USR2 even after return from handler
- Describe blockage of signals in a better way
v1->v2:
- Replace all occurrences of SIGPIPE with SIGSEGV
- Fixed a mismatch between code comment and ksft log
- Add a testcase: Raise the same signal again; it must not be queued
- Remove unneeded <assert.h>, <unistd.h>
- Give a detailed test description in the comments; also describe the
exact meaning of delivered and blocked
- Handle errors for all libc functions/syscalls
- Mention tests in Makefile and .gitignore in alphabetical order
v1:
- https://lore.kernel.org/all/20240607122319.768640-1-dev.jain@arm.com/
Dev Jain (2):
selftests: Rename sigaltstack to generic signal
selftests: Add a test mangling with uc_sigmask
tools/testing/selftests/Makefile | 2 +-
.../{sigaltstack => signal}/.gitignore | 3 +-
.../{sigaltstack => signal}/Makefile | 3 +-
.../current_stack_pointer.h | 0
.../selftests/signal/mangle_uc_sigmask.c | 186 ++++++++++++++++++
.../sas.c => signal/sigaltstack.c} | 0
6 files changed, 191 insertions(+), 3 deletions(-)
rename tools/testing/selftests/{sigaltstack => signal}/.gitignore (57%)
rename tools/testing/selftests/{sigaltstack => signal}/Makefile (53%)
rename tools/testing/selftests/{sigaltstack => signal}/current_stack_pointer.h (100%)
create mode 100644 tools/testing/selftests/signal/mangle_uc_sigmask.c
rename tools/testing/selftests/{sigaltstack/sas.c => signal/sigaltstack.c} (100%)
--
2.34.1
Hi everyone,
I'm currently working on optimizing the Linux kernel for an ARM-based project and would appreciate some guidance. I've already gone through the basics of kernel configuration, but I'm looking for more advanced tips. Specifically, I'm interested in:
Best practices for enabling/disabling specific kernel features for ARM.
Effective ways to profile and analyze performance bottlenecks.
Recommendations for debugging tools tailored for ARM architecture.
Any common pitfalls to avoid during the optimization process.
If you have any experiences or resources to share, I'd be very grateful. Looking forward to your insights!
Best,
Selena Thomas
https://www.igmguru.com/data-science-bi/msbi-certification-training/
Hi everyone,
I'm currently working on optimizing the Linux kernel for an ARM-based project and would appreciate some guidance. I've already gone through the basics of kernel configuration, but I'm looking for more advanced tips. Specifically, I'm interested in:
Best practices for enabling/disabling specific kernel features for ARM.
Effective ways to profile and analyze performance bottlenecks.
Recommendations for debugging tools tailored for ARM architecture.
Any common pitfalls to avoid during the optimization process.
If you have any experiences or resources to share, I'd be very grateful. Looking forward to your insights!
Best regards
Selena
xtheadvector is a custom extension that is based upon riscv vector
version 0.7.1 [1]. All of the vector routines have been modified to
support this alternative vector version based upon whether xtheadvector
was determined to be supported at boot.
vlenb is not supported on the existing xtheadvector hardware, so a
devicetree property thead,vlenb is added to provide the vlenb to Linux.
There is a new hwprobe key RISCV_HWPROBE_KEY_VENDOR_EXT_THEAD_0 that is
used to request which thead vendor extensions are supported on the
current platform. This allows future vendors to allocate hwprobe keys
for their vendor.
Support for xtheadvector is also added to the vector kselftests.
Signed-off-by: Charlie Jenkins <charlie(a)rivosinc.com>
[1] https://github.com/T-head-Semi/thead-extension-spec/blob/95358cb2cca9489361…
---
This series is a continuation of a different series that was fragmented
into two other series in an attempt to get part of it merged in the 6.10
merge window. The split-off series did not get merged due to a NAK on
the series that added the generic riscv,vlenb devicetree entry. This
series has converted riscv,vlenb to thead,vlenb to remedy this issue.
The original series is titled "riscv: Support vendor extensions and
xtheadvector" [3].
The series titled "riscv: Extend cpufeature.c to detect vendor
extensions" is still under development and this series is based on that
series! [4]
I have tested this with an Allwinner Nezha board. I ran into issues
booting the board after 6.9-rc1 so I applied these patches to 6.8. There
are a couple of minor merge conflicts that do arrise when doing that, so
please let me know if you have been able to boot this board with a 6.9
kernel. I used SkiffOS [1] to manage building the image, but upgraded
the U-Boot version to Samuel Holland's more up-to-date version [2] and
changed out the device tree used by U-Boot with the device trees that
are present in upstream linux and this series. Thank you Samuel for all
of the work you did to make this task possible.
[1] https://github.com/skiffos/SkiffOS/tree/master/configs/allwinner/nezha
[2] https://github.com/smaeul/u-boot/commit/2e89b706f5c956a70c989cd31665f1429e9…
[3] https://lore.kernel.org/all/20240503-dev-charlie-support_thead_vector_6_9-v…
[4] https://lore.kernel.org/lkml/20240719-support_vendor_extensions-v3-4-0af758…
---
Changes in v6:
- Fix return type of is_vector_supported()/is_xthead_supported() to be bool
- Link to v5: https://lore.kernel.org/r/20240719-xtheadvector-v5-0-4b485fc7d55f@rivosinc.…
Changes in v5:
- Rebase on for-next
- Link to v4: https://lore.kernel.org/r/20240702-xtheadvector-v4-0-2bad6820db11@rivosinc.…
Changes in v4:
- Replace inline asm with C (Samuel)
- Rename VCSRs to CSRs (Samuel)
- Replace .insn directives with .4byte directives
- Link to v3: https://lore.kernel.org/r/20240619-xtheadvector-v3-0-bff39eb9668e@rivosinc.…
Changes in v3:
- Add back Heiko's signed-off-by (Conor)
- Mark RISCV_HWPROBE_KEY_VENDOR_EXT_THEAD_0 as a bitmask
- Link to v2: https://lore.kernel.org/r/20240610-xtheadvector-v2-0-97a48613ad64@rivosinc.…
Changes in v2:
- Removed extraneous references to "riscv,vlenb" (Jess)
- Moved declaration of "thead,vlenb" into cpus.yaml and added
restriction that it's only applicable to thead cores (Conor)
- Check CONFIG_RISCV_ISA_XTHEADVECTOR instead of CONFIG_RISCV_ISA_V for
thead,vlenb (Jess)
- Fix naming of hwprobe variables (Evan)
- Link to v1: https://lore.kernel.org/r/20240609-xtheadvector-v1-0-3fe591d7f109@rivosinc.…
---
Charlie Jenkins (12):
dt-bindings: riscv: Add xtheadvector ISA extension description
dt-bindings: cpus: add a thead vlen register length property
riscv: dts: allwinner: Add xtheadvector to the D1/D1s devicetree
riscv: Add thead and xtheadvector as a vendor extension
riscv: vector: Use vlenb from DT for thead
riscv: csr: Add CSR encodings for CSR_VXRM/CSR_VXSAT
riscv: Add xtheadvector instruction definitions
riscv: vector: Support xtheadvector save/restore
riscv: hwprobe: Add thead vendor extension probing
riscv: hwprobe: Document thead vendor extensions and xtheadvector extension
selftests: riscv: Fix vector tests
selftests: riscv: Support xtheadvector in vector tests
Heiko Stuebner (1):
RISC-V: define the elements of the VCSR vector CSR
Documentation/arch/riscv/hwprobe.rst | 10 +
Documentation/devicetree/bindings/riscv/cpus.yaml | 19 ++
.../devicetree/bindings/riscv/extensions.yaml | 10 +
arch/riscv/Kconfig.vendor | 26 ++
arch/riscv/boot/dts/allwinner/sun20i-d1s.dtsi | 3 +-
arch/riscv/include/asm/cpufeature.h | 2 +
arch/riscv/include/asm/csr.h | 15 ++
arch/riscv/include/asm/hwprobe.h | 5 +-
arch/riscv/include/asm/switch_to.h | 2 +-
arch/riscv/include/asm/vector.h | 223 ++++++++++++----
arch/riscv/include/asm/vendor_extensions/thead.h | 42 +++
.../include/asm/vendor_extensions/thead_hwprobe.h | 18 ++
.../include/asm/vendor_extensions/vendor_hwprobe.h | 37 +++
arch/riscv/include/uapi/asm/hwprobe.h | 3 +-
arch/riscv/include/uapi/asm/vendor/thead.h | 3 +
arch/riscv/kernel/cpufeature.c | 54 +++-
arch/riscv/kernel/kernel_mode_vector.c | 8 +-
arch/riscv/kernel/process.c | 4 +-
arch/riscv/kernel/signal.c | 6 +-
arch/riscv/kernel/sys_hwprobe.c | 5 +
arch/riscv/kernel/vector.c | 24 +-
arch/riscv/kernel/vendor_extensions.c | 10 +
arch/riscv/kernel/vendor_extensions/Makefile | 2 +
arch/riscv/kernel/vendor_extensions/thead.c | 18 ++
.../riscv/kernel/vendor_extensions/thead_hwprobe.c | 19 ++
tools/testing/selftests/riscv/vector/.gitignore | 3 +-
tools/testing/selftests/riscv/vector/Makefile | 17 +-
.../selftests/riscv/vector/v_exec_initval_nolibc.c | 93 +++++++
tools/testing/selftests/riscv/vector/v_helpers.c | 68 +++++
tools/testing/selftests/riscv/vector/v_helpers.h | 8 +
tools/testing/selftests/riscv/vector/v_initval.c | 22 ++
.../selftests/riscv/vector/v_initval_nolibc.c | 68 -----
.../selftests/riscv/vector/vstate_exec_nolibc.c | 20 +-
.../testing/selftests/riscv/vector/vstate_prctl.c | 295 ++++++++++++---------
34 files changed, 889 insertions(+), 273 deletions(-)
---
base-commit: 554462ced9ac97487c8f725fe466a9c20ed87521
change-id: 20240530-xtheadvector-833d3d17b423
--
- Charlie
My last patch[1] was met with a general desire for a safer scheme that
avoided global resets, which expose unclear ownership.
Fortunately, Johannes[2] suggested a reasonably simple scheme to provide
an FD-local reset, which eliminates most of those issues.
The one open question I have is whether the cgroup/memcg itself is kept
alive by an open FD, or if we need to update the memcg freeing code to
traverse the new list of "watchers" so they don't try to access freed
memory.
Thank you,
David Finkel
Senior Principal Software Engineer, Core Services
Vimeo Inc.
[1]: https://lore.kernel.org/cgroups/20240715203625.1462309-1-davidf@vimeo.com/
[2]: https://lore.kernel.org/cgroups/20240717170408.GC1321673@cmpxchg.org/
xtheadvector is a custom extension that is based upon riscv vector
version 0.7.1 [1]. All of the vector routines have been modified to
support this alternative vector version based upon whether xtheadvector
was determined to be supported at boot.
vlenb is not supported on the existing xtheadvector hardware, so a
devicetree property thead,vlenb is added to provide the vlenb to Linux.
There is a new hwprobe key RISCV_HWPROBE_KEY_VENDOR_EXT_THEAD_0 that is
used to request which thead vendor extensions are supported on the
current platform. This allows future vendors to allocate hwprobe keys
for their vendor.
Support for xtheadvector is also added to the vector kselftests.
Signed-off-by: Charlie Jenkins <charlie(a)rivosinc.com>
[1] https://github.com/T-head-Semi/thead-extension-spec/blob/95358cb2cca9489361…
---
This series is a continuation of a different series that was fragmented
into two other series in an attempt to get part of it merged in the 6.10
merge window. The split-off series did not get merged due to a NAK on
the series that added the generic riscv,vlenb devicetree entry. This
series has converted riscv,vlenb to thead,vlenb to remedy this issue.
The original series is titled "riscv: Support vendor extensions and
xtheadvector" [3].
The series titled "riscv: Extend cpufeature.c to detect vendor
extensions" is still under development and this series is based on that
series! [4]
I have tested this with an Allwinner Nezha board. I ran into issues
booting the board after 6.9-rc1 so I applied these patches to 6.8. There
are a couple of minor merge conflicts that do arrise when doing that, so
please let me know if you have been able to boot this board with a 6.9
kernel. I used SkiffOS [1] to manage building the image, but upgraded
the U-Boot version to Samuel Holland's more up-to-date version [2] and
changed out the device tree used by U-Boot with the device trees that
are present in upstream linux and this series. Thank you Samuel for all
of the work you did to make this task possible.
[1] https://github.com/skiffos/SkiffOS/tree/master/configs/allwinner/nezha
[2] https://github.com/smaeul/u-boot/commit/2e89b706f5c956a70c989cd31665f1429e9…
[3] https://lore.kernel.org/all/20240503-dev-charlie-support_thead_vector_6_9-v…
[4] https://lore.kernel.org/linux-riscv/20240609-support_vendor_extensions-v2-0…
---
Changes in v5:
- Rebase on for-next
- Link to v4: https://lore.kernel.org/r/20240702-xtheadvector-v4-0-2bad6820db11@rivosinc.…
Changes in v4:
- Replace inline asm with C (Samuel)
- Rename VCSRs to CSRs (Samuel)
- Replace .insn directives with .4byte directives
- Link to v3: https://lore.kernel.org/r/20240619-xtheadvector-v3-0-bff39eb9668e@rivosinc.…
Changes in v3:
- Add back Heiko's signed-off-by (Conor)
- Mark RISCV_HWPROBE_KEY_VENDOR_EXT_THEAD_0 as a bitmask
- Link to v2: https://lore.kernel.org/r/20240610-xtheadvector-v2-0-97a48613ad64@rivosinc.…
Changes in v2:
- Removed extraneous references to "riscv,vlenb" (Jess)
- Moved declaration of "thead,vlenb" into cpus.yaml and added
restriction that it's only applicable to thead cores (Conor)
- Check CONFIG_RISCV_ISA_XTHEADVECTOR instead of CONFIG_RISCV_ISA_V for
thead,vlenb (Jess)
- Fix naming of hwprobe variables (Evan)
- Link to v1: https://lore.kernel.org/r/20240609-xtheadvector-v1-0-3fe591d7f109@rivosinc.…
---
Charlie Jenkins (12):
dt-bindings: riscv: Add xtheadvector ISA extension description
dt-bindings: cpus: add a thead vlen register length property
riscv: dts: allwinner: Add xtheadvector to the D1/D1s devicetree
riscv: Add thead and xtheadvector as a vendor extension
riscv: vector: Use vlenb from DT for thead
riscv: csr: Add CSR encodings for CSR_VXRM/CSR_VXSAT
riscv: Add xtheadvector instruction definitions
riscv: vector: Support xtheadvector save/restore
riscv: hwprobe: Add thead vendor extension probing
riscv: hwprobe: Document thead vendor extensions and xtheadvector extension
selftests: riscv: Fix vector tests
selftests: riscv: Support xtheadvector in vector tests
Heiko Stuebner (1):
RISC-V: define the elements of the VCSR vector CSR
Documentation/arch/riscv/hwprobe.rst | 10 +
Documentation/devicetree/bindings/riscv/cpus.yaml | 19 ++
.../devicetree/bindings/riscv/extensions.yaml | 10 +
arch/riscv/Kconfig.vendor | 26 ++
arch/riscv/boot/dts/allwinner/sun20i-d1s.dtsi | 3 +-
arch/riscv/include/asm/cpufeature.h | 2 +
arch/riscv/include/asm/csr.h | 15 ++
arch/riscv/include/asm/hwprobe.h | 5 +-
arch/riscv/include/asm/switch_to.h | 2 +-
arch/riscv/include/asm/vector.h | 223 ++++++++++++----
arch/riscv/include/asm/vendor_extensions/thead.h | 42 +++
.../include/asm/vendor_extensions/thead_hwprobe.h | 18 ++
.../include/asm/vendor_extensions/vendor_hwprobe.h | 37 +++
arch/riscv/include/uapi/asm/hwprobe.h | 3 +-
arch/riscv/include/uapi/asm/vendor/thead.h | 3 +
arch/riscv/kernel/cpufeature.c | 54 +++-
arch/riscv/kernel/kernel_mode_vector.c | 8 +-
arch/riscv/kernel/process.c | 4 +-
arch/riscv/kernel/signal.c | 6 +-
arch/riscv/kernel/sys_hwprobe.c | 5 +
arch/riscv/kernel/vector.c | 24 +-
arch/riscv/kernel/vendor_extensions.c | 10 +
arch/riscv/kernel/vendor_extensions/Makefile | 2 +
arch/riscv/kernel/vendor_extensions/thead.c | 18 ++
.../riscv/kernel/vendor_extensions/thead_hwprobe.c | 19 ++
tools/testing/selftests/riscv/abi/ptrace | Bin 0 -> 759368 bytes
tools/testing/selftests/riscv/vector/.gitignore | 3 +-
tools/testing/selftests/riscv/vector/Makefile | 17 +-
.../selftests/riscv/vector/v_exec_initval_nolibc.c | 93 +++++++
tools/testing/selftests/riscv/vector/v_helpers.c | 67 +++++
tools/testing/selftests/riscv/vector/v_helpers.h | 7 +
tools/testing/selftests/riscv/vector/v_initval.c | 22 ++
.../selftests/riscv/vector/v_initval_nolibc.c | 68 -----
.../selftests/riscv/vector/vstate_exec_nolibc.c | 20 +-
.../testing/selftests/riscv/vector/vstate_prctl.c | 295 ++++++++++++---------
35 files changed, 887 insertions(+), 273 deletions(-)
---
base-commit: 554462ced9ac97487c8f725fe466a9c20ed87521
change-id: 20240530-xtheadvector-833d3d17b423
--
- Charlie
This is a rebase of a patch I sent a few months ago, on which I received
two acks, but the thread petered out:
https://www.spinics.net/lists/cgroups/msg40602.html.
I'm hoping that it can make it into the next LTS (and 6.11 if possible)
Thank you,
David Finkel
Sr. Principal Software Engineer, Vimeo
The arm64 Guarded Control Stack (GCS) feature provides support for
hardware protected stacks of return addresses, intended to provide
hardening against return oriented programming (ROP) attacks and to make
it easier to gather call stacks for applications such as profiling.
When GCS is active a secondary stack called the Guarded Control Stack is
maintained, protected with a memory attribute which means that it can
only be written with specific GCS operations. The current GCS pointer
can not be directly written to by userspace. When a BL is executed the
value stored in LR is also pushed onto the GCS, and when a RET is
executed the top of the GCS is popped and compared to LR with a fault
being raised if the values do not match. GCS operations may only be
performed on GCS pages, a data abort is generated if they are not.
The combination of hardware enforcement and lack of extra instructions
in the function entry and exit paths should result in something which
has less overhead and is more difficult to attack than a purely software
implementation like clang's shadow stacks.
This series implements support for use of GCS by userspace, along with
support for use of GCS within KVM guests. It does not enable use of GCS
by either EL1 or EL2, this will be implemented separately. Executables
are started without GCS and must use a prctl() to enable it, it is
expected that this will be done very early in application execution by
the dynamic linker or other startup code. For dynamic linking this will
be done by checking that everything in the executable is marked as GCS
compatible.
x86 has an equivalent feature called shadow stacks, this series depends
on the x86 patches for generic memory management support for the new
guarded/shadow stack page type and shares APIs as much as possible. As
there has been extensive discussion with the wider community around the
ABI for shadow stacks I have as far as practical kept implementation
decisions close to those for x86, anticipating that review would lead to
similar conclusions in the absence of strong reasoning for divergence.
The main divergence I am concious of is that x86 allows shadow stack to
be enabled and disabled repeatedly, freeing the shadow stack for the
thread whenever disabled, while this implementation keeps the GCS
allocated after disable but refuses to reenable it. This is to avoid
races with things actively walking the GCS during a disable, we do
anticipate that some systems will wish to disable GCS at runtime but are
not aware of any demand for subsequently reenabling it.
x86 uses an arch_prctl() to manage enable and disable, since only x86
and S/390 use arch_prctl() a generic prctl() was proposed[1] as part of a
patch set for the equivalent RISC-V Zicfiss feature which I initially
adopted fairly directly but following review feedback has been revised
quite a bit.
We currently maintain the x86 pattern of implicitly allocating a shadow
stack for threads started with shadow stack enabled, there has been some
discussion of removing this support and requiring the use of clone3()
with explicit allocation of shadow stacks instead. I have no strong
feelings either way, implicit allocation is not really consistent with
anything else we do and creates the potential for errors around thread
exit but on the other hand it is existing ABI on x86 and minimises the
changes needed in userspace code.
glibc and bionic changes using this ABI have been implemented and
tested. Headless Android systems have been validated and Ross Burton
has used this code has been used to bring up a Yocto system with GCS
enabed as standard, a test implementation of V8 support has also been
done.
There is an open issue with support for CRIU, on x86 this required the
ability to set the GCS mode via ptrace. This series supports
configuring mode bits other than enable/disable via ptrace but it needs
to be confirmed if this is sufficient.
The series depends on support for shadow stacks in clone3(), that series
includes the addition of ARCH_HAS_USER_SHADOW_STACK.
https://lore.kernel.org/r/20240623-clone3-shadow-stack-v6-0-9ee7783b1fb9@ke…
You can see a branch with the full set of dependencies against Linus'
tree at:
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/misc.git arm64-gcs
[1] https://lore.kernel.org/lkml/20230213045351.3945824-1-debug@rivosinc.com/
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Changes in v9:
- Rebase onto v6.10-rc3.
- Restructure and clarify memory management fault handling.
- Fix up basic-gcs for the latest clone3() changes.
- Convert to newly merged KVM ID register based feature configuration.
- Fixes for NV traps.
- Link to v8: https://lore.kernel.org/r/20240203-arm64-gcs-v8-0-c9fec77673ef@kernel.org
Changes in v8:
- Invalidate signal cap token on stack when consuming.
- Typo and other trivial fixes.
- Don't try to use process_vm_write() on GCS, it intentionally does not
work.
- Fix leak of thread GCSs.
- Rebase onto latest clone3() series.
- Link to v7: https://lore.kernel.org/r/20231122-arm64-gcs-v7-0-201c483bd775@kernel.org
Changes in v7:
- Rebase onto v6.7-rc2 via the clone3() patch series.
- Change the token used to cap the stack during signal handling to be
compatible with GCSPOPM.
- Fix flags for new page types.
- Fold in support for clone3().
- Replace copy_to_user_gcs() with put_user_gcs().
- Link to v6: https://lore.kernel.org/r/20231009-arm64-gcs-v6-0-78e55deaa4dd@kernel.org
Changes in v6:
- Rebase onto v6.6-rc3.
- Add some more gcsb_dsync() barriers following spec clarifications.
- Due to ongoing discussion around clone()/clone3() I've not updated
anything there, the behaviour is the same as on previous versions.
- Link to v5: https://lore.kernel.org/r/20230822-arm64-gcs-v5-0-9ef181dd6324@kernel.org
Changes in v5:
- Don't map any permissions for user GCSs, we always use EL0 accessors
or use a separate mapping of the page.
- Reduce the standard size of the GCS to RLIMIT_STACK/2.
- Enforce a PAGE_SIZE alignment requirement on map_shadow_stack().
- Clarifications and fixes to documentation.
- More tests.
- Link to v4: https://lore.kernel.org/r/20230807-arm64-gcs-v4-0-68cfa37f9069@kernel.org
Changes in v4:
- Implement flags for map_shadow_stack() allowing the cap and end of
stack marker to be enabled independently or not at all.
- Relax size and alignment requirements for map_shadow_stack().
- Add more blurb explaining the advantages of hardware enforcement.
- Link to v3: https://lore.kernel.org/r/20230731-arm64-gcs-v3-0-cddf9f980d98@kernel.org
Changes in v3:
- Rebase onto v6.5-rc4.
- Add a GCS barrier on context switch.
- Add a GCS stress test.
- Link to v2: https://lore.kernel.org/r/20230724-arm64-gcs-v2-0-dc2c1d44c2eb@kernel.org
Changes in v2:
- Rebase onto v6.5-rc3.
- Rework prctl() interface to allow each bit to be locked independently.
- map_shadow_stack() now places the cap token based on the size
requested by the caller not the actual space allocated.
- Mode changes other than enable via ptrace are now supported.
- Expand test coverage.
- Various smaller fixes and adjustments.
- Link to v1: https://lore.kernel.org/r/20230716-arm64-gcs-v1-0-bf567f93bba6@kernel.org
---
Mark Brown (39):
arm64/mm: Restructure arch_validate_flags() for extensibility
prctl: arch-agnostic prctl for shadow stack
mman: Add map_shadow_stack() flags
arm64: Document boot requirements for Guarded Control Stacks
arm64/gcs: Document the ABI for Guarded Control Stacks
arm64/sysreg: Add definitions for architected GCS caps
arm64/gcs: Add manual encodings of GCS instructions
arm64/gcs: Provide put_user_gcs()
arm64/cpufeature: Runtime detection of Guarded Control Stack (GCS)
arm64/mm: Allocate PIE slots for EL0 guarded control stack
mm: Define VM_SHADOW_STACK for arm64 when we support GCS
arm64/mm: Map pages for guarded control stack
KVM: arm64: Manage GCS registers for guests
arm64/gcs: Allow GCS usage at EL0 and EL1
arm64/idreg: Add overrride for GCS
arm64/hwcap: Add hwcap for GCS
arm64/traps: Handle GCS exceptions
arm64/mm: Handle GCS data aborts
arm64/gcs: Context switch GCS state for EL0
arm64/gcs: Ensure that new threads have a GCS
arm64/gcs: Implement shadow stack prctl() interface
arm64/mm: Implement map_shadow_stack()
arm64/signal: Set up and restore the GCS context for signal handlers
arm64/signal: Expose GCS state in signal frames
arm64/ptrace: Expose GCS via ptrace and core files
arm64: Add Kconfig for Guarded Control Stack (GCS)
kselftest/arm64: Verify the GCS hwcap
kselftest: Provide shadow stack enable helpers for arm64
selftests/clone3: Enable arm64 shadow stack testing
kselftest/arm64: Add GCS as a detected feature in the signal tests
kselftest/arm64: Add framework support for GCS to signal handling tests
kselftest/arm64: Allow signals tests to specify an expected si_code
kselftest/arm64: Always run signals tests with GCS enabled
kselftest/arm64: Add very basic GCS test program
kselftest/arm64: Add a GCS test program built with the system libc
kselftest/arm64: Add test coverage for GCS mode locking
kselftest/arm64: Add GCS signal tests
kselftest/arm64: Add a GCS stress test
kselftest/arm64: Enable GCS for the FP stress tests
Documentation/admin-guide/kernel-parameters.txt | 6 +
Documentation/arch/arm64/booting.rst | 22 +
Documentation/arch/arm64/elf_hwcaps.rst | 2 +
Documentation/arch/arm64/gcs.rst | 233 +++++++
Documentation/arch/arm64/index.rst | 1 +
Documentation/filesystems/proc.rst | 2 +-
arch/arm64/Kconfig | 20 +
arch/arm64/include/asm/cpufeature.h | 6 +
arch/arm64/include/asm/el2_setup.h | 17 +
arch/arm64/include/asm/esr.h | 28 +-
arch/arm64/include/asm/exception.h | 2 +
arch/arm64/include/asm/gcs.h | 107 +++
arch/arm64/include/asm/hwcap.h | 1 +
arch/arm64/include/asm/kvm_host.h | 14 +
arch/arm64/include/asm/mman.h | 23 +-
arch/arm64/include/asm/pgtable-prot.h | 14 +-
arch/arm64/include/asm/processor.h | 7 +
arch/arm64/include/asm/sysreg.h | 20 +
arch/arm64/include/asm/uaccess.h | 40 ++
arch/arm64/include/asm/vncr_mapping.h | 2 +
arch/arm64/include/uapi/asm/hwcap.h | 1 +
arch/arm64/include/uapi/asm/ptrace.h | 8 +
arch/arm64/include/uapi/asm/sigcontext.h | 9 +
arch/arm64/kernel/cpufeature.c | 19 +
arch/arm64/kernel/cpuinfo.c | 1 +
arch/arm64/kernel/entry-common.c | 23 +
arch/arm64/kernel/pi/idreg-override.c | 2 +
arch/arm64/kernel/process.c | 85 +++
arch/arm64/kernel/ptrace.c | 59 ++
arch/arm64/kernel/signal.c | 242 ++++++-
arch/arm64/kernel/traps.c | 11 +
arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h | 48 +-
arch/arm64/kvm/sys_regs.c | 25 +-
arch/arm64/mm/Makefile | 1 +
arch/arm64/mm/fault.c | 43 ++
arch/arm64/mm/gcs.c | 325 +++++++++
arch/arm64/mm/mmap.c | 13 +-
arch/arm64/tools/cpucaps | 1 +
arch/x86/include/uapi/asm/mman.h | 3 -
fs/proc/task_mmu.c | 3 +
include/linux/mm.h | 16 +-
include/uapi/asm-generic/mman.h | 4 +
include/uapi/linux/elf.h | 1 +
include/uapi/linux/prctl.h | 22 +
kernel/sys.c | 30 +
tools/testing/selftests/arm64/Makefile | 2 +-
tools/testing/selftests/arm64/abi/hwcap.c | 19 +
tools/testing/selftests/arm64/fp/assembler.h | 15 +
tools/testing/selftests/arm64/fp/fpsimd-test.S | 2 +
tools/testing/selftests/arm64/fp/sve-test.S | 2 +
tools/testing/selftests/arm64/fp/za-test.S | 2 +
tools/testing/selftests/arm64/fp/zt-test.S | 2 +
tools/testing/selftests/arm64/gcs/.gitignore | 5 +
tools/testing/selftests/arm64/gcs/Makefile | 24 +
tools/testing/selftests/arm64/gcs/asm-offsets.h | 0
tools/testing/selftests/arm64/gcs/basic-gcs.c | 357 ++++++++++
tools/testing/selftests/arm64/gcs/gcs-locking.c | 200 ++++++
.../selftests/arm64/gcs/gcs-stress-thread.S | 311 +++++++++
tools/testing/selftests/arm64/gcs/gcs-stress.c | 532 +++++++++++++++
tools/testing/selftests/arm64/gcs/gcs-util.h | 100 +++
tools/testing/selftests/arm64/gcs/libc-gcs.c | 736 +++++++++++++++++++++
tools/testing/selftests/arm64/signal/.gitignore | 1 +
.../testing/selftests/arm64/signal/test_signals.c | 17 +-
.../testing/selftests/arm64/signal/test_signals.h | 6 +
.../selftests/arm64/signal/test_signals_utils.c | 32 +-
.../selftests/arm64/signal/test_signals_utils.h | 39 ++
.../arm64/signal/testcases/gcs_exception_fault.c | 62 ++
.../selftests/arm64/signal/testcases/gcs_frame.c | 88 +++
.../arm64/signal/testcases/gcs_write_fault.c | 67 ++
.../selftests/arm64/signal/testcases/testcases.c | 7 +
.../selftests/arm64/signal/testcases/testcases.h | 1 +
tools/testing/selftests/clone3/clone3_selftests.h | 26 +
tools/testing/selftests/ksft_shstk.h | 37 ++
73 files changed, 4213 insertions(+), 41 deletions(-)
---
base-commit: 4c8cf8814957090ce50ad18f318f72e6fe0d1a32
change-id: 20230303-arm64-gcs-e311ab0d8729
Best regards,
--
Mark Brown <broonie(a)kernel.org>
From: diegodvv <diego.daniel.professional(a)gmail.com>
Hi all,
This is part of a hackathon organized by LKCAMP[1], focused on writing
tests using KUnit. We reached out a while ago asking for advice on what would
be a useful contribution[2] and ended up choosing data structures that did
not yet have tests.
This patch adds tests for the kfifo data structure, defined in
include/linux/kfifo.h, and is inspired by the KUnit tests for the doubly
linked list in lib/list-test.c[3].
[1] https://lkcamp.dev/about/
[2] https://lore.kernel.org/all/Zktnt7rjKryTh9-N@arch/
[3] https://elixir.bootlin.com/linux/latest/source/lib/list-test.c
Diego Vieira (1):
lib/kfifo-test.c: add tests for the kfifo structure
lib/Kconfig.debug | 14 +++
lib/Makefile | 1 +
lib/kfifo-test.c | 222 ++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 237 insertions(+)
create mode 100644 lib/kfifo-test.c
--
2.34.1