Changes from PATCH v1 -> v2:
- Updated selftest to use ksft_test_result_code instead of switch-case
(Muhammad Usama Anjum)
- Included more use cases in the cover letter
(Huang, Ying)
- Added documentation for sysfs and memcg interfaces
- Added an aging-specific struct lru_gen_mm_walk in struct pglist_data
to avoid allocating for each lruvec.
Changes from RFC v3 -> PATCH v1:
- Updated selftest to use ksft_print_msg instead of fprintf(stderr, ...)
(Muhammad Usama Anjum)
- Included more detail in patch skipping pmd_young with force_scan
(Huang, Ying)
- Deferred reaccess histogram as a followup
- Removed per-memcg page age interval configs for simplicity
Changes from RFC v2 -> RFC v3:
- Update to v6.8
- Added an aging kernel thread (gated behind config)
- Added basic selftests for sysfs interface files
- Track swapped out pages for reaccesses
- Refactoring and cleanup
- Dropped the virtio-balloon extension to make things manageable
Changes from RFC v1 -> RFC v2:
- Refactored the patchs into smaller pieces
- Renamed interfaces and functions from wss to wsr (Working Set Reporting)
- Fixed build errors when CONFIG_WSR is not set
- Changed working_set_num_bins to u8 for virtio-balloon
- Added support for per-NUMA node reporting for virtio-balloon
[rfc v1]
https://lore.kernel.org/linux-mm/20230509185419.1088297-1-yuanchu@google.co…
[rfc v2]
https://lore.kernel.org/linux-mm/20230621180454.973862-1-yuanchu@google.com/
[rfc v3]
https://lore.kernel.org/linux-mm/20240327213108.2384666-1-yuanchu@google.co…
This patch series provides workingset reporting of user pages in
lruvecs, of which coldness can be tracked by accessed bits and fd
references. However, the concept of workingset applies generically to
all types of memory, which could be kernel slab caches, discardable
userspace caches (databases), or CXL.mem. Therefore, data sources might
come from slab shrinkers, device drivers, or the userspace. IMO, the
kernel should provide a set of workingset interfaces that should be
generic enough to accommodate the various use cases, and be extensible
to potential future use cases. The current proposed interfaces are not
sufficient in that regard, but I would like to start somewhere, solicit
feedback, and iterate.
Use cases
==========
Job scheduling
On overcommitted hosts, workingset information allows the job scheduler
to right-size each job and land more jobs on the same host or NUMA node,
and in the case of a job with increasing workingset, policy decisions
can be made to migrate other jobs off the host/NUMA node, or oom-kill
the misbehaving job. If the job shape is very different from the machine
shape, knowing the workingset per-node can also help inform page
allocation policies.
Proactive reclaim
Workingset information allows the a container manager to proactively
reclaim memory while not impacting a job's performance. While PSI may
provide a reactive measure of when a proactive reclaim has reclaimed too
much, workingset reporting allows the policy to be more accurate and
flexible.
Ballooning (similar to proactive reclaim)
While this patch series does not extend the virtio-balloon device,
balloon policies benefit from workingset to more precisely determine
the size of the memory balloon. On desktops/laptops/mobile devices where
memory is scarce and overcommitted, the balloon sizing in multiple VMs
running on the same device can be orchestrated with workingset reports
from each one.
Promotion/Demotion
If different mechanisms are used for promition and demotion, workingset
information can help connect the two and avoid pages being migrated back
and forth.
For example, given a promotion hot page threshold defined in reaccess
distance of N seconds (promote pages accessed more often than every N
seconds). The threshold N should be set so that ~80% (e.g.) of pages on
the fast memory node passes the threshold. This calculation can be done
with workingset reports.
To be directly useful for promotion policies, the workingset report
interfaces need to be extended to report hotness and gather hotness
information from the devices[1].
[1]
https://www.opencompute.org/documents/ocp-cms-hotness-tracking-requirements…
Sysfs and Cgroup Interfaces
==========
The interfaces are detailed in the patches that introduce them. The main
idea here is we break down the workingset per-node per-memcg into time
intervals (ms), e.g.
1000 anon=137368 file=24530
20000 anon=34342 file=0
30000 anon=353232 file=333608
40000 anon=407198 file=206052
9223372036854775807 anon=4925624 file=892892
I realize this does not generalize well to hotness information, but I
lack the intuition for an abstraction that presents hotness in a useful
way. Based on a recent proposal for move_phys_pages[2], it seems like
userspace tiering software would like to move specific physical pages,
instead of informing the kernel "move x number of hot pages to y
device". Please advise.
[2]
https://lore.kernel.org/lkml/20240319172609.332900-1-gregory.price@memverge…
Implementation
==========
Currently, the reporting of user pages is based off of MGLRU, and
therefore requires CONFIG_LRU_GEN=y. We would benefit from more MGLRU
generations for a more fine-grained workingset report. I will make the
generation count configurable in the next version. The workingset
reporting mechanism is gated behind CONFIG_WORKINGSET_REPORT, and the
aging thread is behind CONFIG_WORKINGSET_REPORT_AGING.
Yuanchu Xie (8):
mm: multi-gen LRU: ignore non-leaf pmd_young for force_scan=true
mm: aggregate working set information into histograms
mm: use refresh interval to rate-limit workingset report aggregation
mm: report workingset during memory pressure driven scanning
mm: extend working set reporting to memcgs
mm: add kernel aging thread for workingset reporting
selftest: test system-wide workingset reporting
Docs/admin-guide/mm/workingset_report: document sysfs and memcg
interfaces
Documentation/admin-guide/mm/index.rst | 1 +
.../admin-guide/mm/workingset_report.rst | 105 ++++
drivers/base/node.c | 6 +
include/linux/memcontrol.h | 5 +
include/linux/mmzone.h | 9 +
include/linux/workingset_report.h | 97 +++
mm/Kconfig | 15 +
mm/Makefile | 2 +
mm/internal.h | 18 +
mm/memcontrol.c | 184 +++++-
mm/mm_init.c | 2 +
mm/mmzone.c | 2 +
mm/vmscan.c | 58 +-
mm/workingset_report.c | 561 ++++++++++++++++++
mm/workingset_report_aging.c | 127 ++++
tools/testing/selftests/mm/.gitignore | 1 +
tools/testing/selftests/mm/Makefile | 3 +
tools/testing/selftests/mm/run_vmtests.sh | 5 +
.../testing/selftests/mm/workingset_report.c | 306 ++++++++++
.../testing/selftests/mm/workingset_report.h | 39 ++
.../selftests/mm/workingset_report_test.c | 329 ++++++++++
21 files changed, 1869 insertions(+), 6 deletions(-)
create mode 100644 Documentation/admin-guide/mm/workingset_report.rst
create mode 100644 include/linux/workingset_report.h
create mode 100644 mm/workingset_report.c
create mode 100644 mm/workingset_report_aging.c
create mode 100644 tools/testing/selftests/mm/workingset_report.c
create mode 100644 tools/testing/selftests/mm/workingset_report.h
create mode 100644 tools/testing/selftests/mm/workingset_report_test.c
--
2.45.1.467.gbab1589fc0-goog
The variable are never referenced in the code, just remove it
that this problem was discovered by reading code
Signed-off-by: Zhu Jun <zhujun2(a)cmss.chinamobile.com>
---
tools/testing/selftests/dma/dma_map_benchmark.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/tools/testing/selftests/dma/dma_map_benchmark.c b/tools/testing/selftests/dma/dma_map_benchmark.c
index 3fcea00961c0..c91b485ca99a 100644
--- a/tools/testing/selftests/dma/dma_map_benchmark.c
+++ b/tools/testing/selftests/dma/dma_map_benchmark.c
@@ -33,7 +33,6 @@ int main(int argc, char **argv)
int granule = 1;
int cmd = DMA_MAP_BENCHMARK;
- char *p;
while ((opt = getopt(argc, argv, "t:s:n:b:d:x:g:")) != -1) {
switch (opt) {
--
2.17.1
This variable is never referenced in the code, just remove them
that this problem was discovered by reading the code
Signed-off-by: Zhu Jun <zhujun2(a)cmss.chinamobile.com>
---
Changes in v2:
- modify commit info
tools/testing/selftests/breakpoints/step_after_suspend_test.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/tools/testing/selftests/breakpoints/step_after_suspend_test.c b/tools/testing/selftests/breakpoints/step_after_suspend_test.c
index b8703c499d28..dfec31fb9b30 100644
--- a/tools/testing/selftests/breakpoints/step_after_suspend_test.c
+++ b/tools/testing/selftests/breakpoints/step_after_suspend_test.c
@@ -130,7 +130,6 @@ int run_test(int cpu)
void suspend(void)
{
int power_state_fd;
- struct sigevent event = {};
int timerfd;
int err;
struct itimerspec spec = {};
--
2.17.1
Main function return value is int type, so add return
value in the end that this problem was discovered by reading the code
Signed-off-by: Zhu Jun <zhujun2(a)cmss.chinamobile.com>
---
Changes in v2:
- modify commit info
tools/testing/selftests/breakpoints/step_after_suspend_test.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/breakpoints/step_after_suspend_test.c b/tools/testing/selftests/breakpoints/step_after_suspend_test.c
index dfec31fb9b30..b473131fce3e 100644
--- a/tools/testing/selftests/breakpoints/step_after_suspend_test.c
+++ b/tools/testing/selftests/breakpoints/step_after_suspend_test.c
@@ -166,7 +166,7 @@ int main(int argc, char **argv)
bool succeeded = true;
unsigned int tests = 0;
cpu_set_t available_cpus;
- int err;
+ int err = 0;
int cpu;
ksft_print_header();
@@ -222,4 +222,6 @@ int main(int argc, char **argv)
ksft_exit_pass();
else
ksft_exit_fail();
+
+ return err;
}
--
2.17.1
From: Geliang Tang <tanggeliang(a)kylinos.cn>
Resend patch 1 out of "skip ENOTSUPP BPF selftests" set as Eduard
suggested. Together with two other cleanups.
Geliang Tang (3):
selftests/bpf: Null checks for links in bpf_tcp_ca
selftests/bpf: Check ASSERT_OK(err) in dummy_st_ops
selftests/bpf: Close obj in error paths in xdp_adjust_tail
.../selftests/bpf/prog_tests/bpf_tcp_ca.c | 21 +++++++++++++------
.../selftests/bpf/prog_tests/dummy_st_ops.c | 8 +++++--
.../bpf/prog_tests/xdp_adjust_tail.c | 2 +-
3 files changed, 22 insertions(+), 9 deletions(-)
--
2.43.0
On Thu, Jul 04, 2024 at 04:36:04PM +0200, Arnd Bergmann wrote:
> #define __ARCH_WANT_SYS_CLONE
> +#define __ARCH_WANT_NEW_STAT
>
> -#ifndef __COMPAT_SYSCALL_NR
> -#include <uapi/asm/unistd.h>
> -#endif
> +#include <asm/unistd_64.h>
It looks like this is causing widespread build breakage in kselftest in
-next for arm64, there are *many* errors in the form:
In file included from test_signals_utils.c:14:
/build/stage/build-work/usr/include/asm/unistd.h:2:10: fatal error: unistd_64.h: No such file or directory
2 | #include <unistd_64.h>
| ^~~~~~~~~~~~~
which obviously looks like it's tied to the above but I've not fully
understood the patch/series yet. Build log at:
https://builds.sirena.org.uk/82d01fe6ee52086035b201cfa1410a3b04384257/arm64…
A bisect appears to confirm that it's this commit, which is in -next as
6e4a077c0b607c674536908c5b68f1c31e4e26ec.
git bisect start
# status: waiting for both good and bad commits
# bad: [82d01fe6ee52086035b201cfa1410a3b04384257] Add linux-next specific files for 20240709
git bisect bad 82d01fe6ee52086035b201cfa1410a3b04384257
# status: waiting for good commit(s), bad commit known
# good: [037206cd4cb43d535453723140fde1bcde0b296e] Merge branch 'for-linux-next-fixes' of https://gitlab.freedesktop.org/drm/misc/kernel.git
git bisect good 037206cd4cb43d535453723140fde1bcde0b296e
# bad: [2ae3e655fc40f1b6620194b90dcf9a4515257918] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git
git bisect bad 2ae3e655fc40f1b6620194b90dcf9a4515257918
# bad: [4f2a367612d46dff2068582feadfbdd8e1c0443f] Merge branch 'fs-next' of linux-next
git bisect bad 4f2a367612d46dff2068582feadfbdd8e1c0443f
# bad: [d3da7ed72840f3660f90966490adfd499d96ea8f] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc.git
git bisect bad d3da7ed72840f3660f90966490adfd499d96ea8f
# good: [6355edbb3dfe322f0748b1eb3987973a568bbb42] Merge tag 'v6.11-rockchip-dts64-2' of https://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into soc/dt
git bisect good 6355edbb3dfe322f0748b1eb3987973a568bbb42
# good: [2073cda629a47f2ebe2afcd3cb8b3000d5cd13d1] mm: optimization on page allocation when CMA enabled
git bisect good 2073cda629a47f2ebe2afcd3cb8b3000d5cd13d1
# good: [91a2b5b12867f77dc68d2d15ec7381e6e43820cb] Merge branch 'perf-tools-next' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git
git bisect good 91a2b5b12867f77dc68d2d15ec7381e6e43820cb
# bad: [b8c38a39b6ee44b02ee563b60439f417fec441ad] Merge branch 'for-next/perf' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git
git bisect bad b8c38a39b6ee44b02ee563b60439f417fec441ad
# good: [c100216635e922f43d9e783da918a749995350ca] Merge branch 'for-next/vcpu-hotplug' into for-next/core
git bisect good c100216635e922f43d9e783da918a749995350ca
# bad: [fafb823fc82dfb746cc9043b1573c4b29ef1d52a] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux.git
git bisect bad fafb823fc82dfb746cc9043b1573c4b29ef1d52a
# bad: [8d46f9dd06378e346a562c75bc2a260a03abe807] csky: convert to generic syscall table
git bisect bad 8d46f9dd06378e346a562c75bc2a260a03abe807
# good: [57029ba74296a4dafe35f147e88d56d8ae7b69da] kbuild: add syscall table generation to scripts/Makefile.asm-headers
git bisect good 57029ba74296a4dafe35f147e88d56d8ae7b69da
# good: [ea0130bf3c45f276b1f9e005eeb255a80a10358b] arm64: convert unistd_32.h to syscall.tbl format
git bisect good ea0130bf3c45f276b1f9e005eeb255a80a10358b
# bad: [b2595bdb3eb3fe24137d0bd07a51bc622f068a81] arm64: rework compat syscall macros
git bisect bad b2595bdb3eb3fe24137d0bd07a51bc622f068a81
# bad: [6e4a077c0b607c674536908c5b68f1c31e4e26ec] arm64: generate 64-bit syscall.tbl
git bisect bad 6e4a077c0b607c674536908c5b68f1c31e4e26ec
# first bad commit: [6e4a077c0b607c674536908c5b68f1c31e4e26ec] arm64: generate 64-bit syscall.tbl
xtheadvector is a custom extension that is based upon riscv vector
version 0.7.1 [1]. All of the vector routines have been modified to
support this alternative vector version based upon whether xtheadvector
was determined to be supported at boot.
vlenb is not supported on the existing xtheadvector hardware, so a
devicetree property thead,vlenb is added to provide the vlenb to Linux.
There is a new hwprobe key RISCV_HWPROBE_KEY_VENDOR_EXT_THEAD_0 that is
used to request which thead vendor extensions are supported on the
current platform. This allows future vendors to allocate hwprobe keys
for their vendor.
Support for xtheadvector is also added to the vector kselftests.
Signed-off-by: Charlie Jenkins <charlie(a)rivosinc.com>
[1] https://github.com/T-head-Semi/thead-extension-spec/blob/95358cb2cca9489361…
---
This series is a continuation of a different series that was fragmented
into two other series in an attempt to get part of it merged in the 6.10
merge window. The split-off series did not get merged due to a NAK on
the series that added the generic riscv,vlenb devicetree entry. This
series has converted riscv,vlenb to thead,vlenb to remedy this issue.
The original series is titled "riscv: Support vendor extensions and
xtheadvector" [3].
The series titled "riscv: Extend cpufeature.c to detect vendor
extensions" is still under development and this series is based on that
series! [4]
I have tested this with an Allwinner Nezha board. I ran into issues
booting the board after 6.9-rc1 so I applied these patches to 6.8. There
are a couple of minor merge conflicts that do arrise when doing that, so
please let me know if you have been able to boot this board with a 6.9
kernel. I used SkiffOS [1] to manage building the image, but upgraded
the U-Boot version to Samuel Holland's more up-to-date version [2] and
changed out the device tree used by U-Boot with the device trees that
are present in upstream linux and this series. Thank you Samuel for all
of the work you did to make this task possible.
[1] https://github.com/skiffos/SkiffOS/tree/master/configs/allwinner/nezha
[2] https://github.com/smaeul/u-boot/commit/2e89b706f5c956a70c989cd31665f1429e9…
[3] https://lore.kernel.org/all/20240503-dev-charlie-support_thead_vector_6_9-v…
[4] https://lore.kernel.org/linux-riscv/20240609-support_vendor_extensions-v2-0…
---
Changes in v3:
- Add back Heiko's signed-off-by (Conor)
- Mark RISCV_HWPROBE_KEY_VENDOR_EXT_THEAD_0 as a bitmask
- Link to v2: https://lore.kernel.org/r/20240610-xtheadvector-v2-0-97a48613ad64@rivosinc.…
Changes in v2:
- Removed extraneous references to "riscv,vlenb" (Jess)
- Moved declaration of "thead,vlenb" into cpus.yaml and added
restriction that it's only applicable to thead cores (Conor)
- Check CONFIG_RISCV_ISA_XTHEADVECTOR instead of CONFIG_RISCV_ISA_V for
thead,vlenb (Jess)
- Fix naming of hwprobe variables (Evan)
- Link to v1: https://lore.kernel.org/r/20240609-xtheadvector-v1-0-3fe591d7f109@rivosinc.…
---
Charlie Jenkins (12):
dt-bindings: riscv: Add xtheadvector ISA extension description
dt-bindings: cpus: add a thead vlen register length property
riscv: dts: allwinner: Add xtheadvector to the D1/D1s devicetree
riscv: Add thead and xtheadvector as a vendor extension
riscv: vector: Use vlenb from DT for thead
riscv: csr: Add CSR encodings for VCSR_VXRM/VCSR_VXSAT
riscv: Add xtheadvector instruction definitions
riscv: vector: Support xtheadvector save/restore
riscv: hwprobe: Add thead vendor extension probing
riscv: hwprobe: Document thead vendor extensions and xtheadvector extension
selftests: riscv: Fix vector tests
selftests: riscv: Support xtheadvector in vector tests
Heiko Stuebner (1):
RISC-V: define the elements of the VCSR vector CSR
Documentation/arch/riscv/hwprobe.rst | 10 +
Documentation/devicetree/bindings/riscv/cpus.yaml | 19 ++
.../devicetree/bindings/riscv/extensions.yaml | 10 +
arch/riscv/Kconfig.vendor | 26 ++
arch/riscv/boot/dts/allwinner/sun20i-d1s.dtsi | 3 +-
arch/riscv/include/asm/cpufeature.h | 2 +
arch/riscv/include/asm/csr.h | 13 +
arch/riscv/include/asm/hwprobe.h | 5 +-
arch/riscv/include/asm/switch_to.h | 2 +-
arch/riscv/include/asm/vector.h | 249 +++++++++++++----
arch/riscv/include/asm/vendor_extensions/thead.h | 42 +++
.../include/asm/vendor_extensions/thead_hwprobe.h | 18 ++
.../include/asm/vendor_extensions/vendor_hwprobe.h | 37 +++
arch/riscv/include/uapi/asm/hwprobe.h | 3 +-
arch/riscv/include/uapi/asm/vendor/thead.h | 3 +
arch/riscv/kernel/cpufeature.c | 51 +++-
arch/riscv/kernel/kernel_mode_vector.c | 8 +-
arch/riscv/kernel/process.c | 4 +-
arch/riscv/kernel/signal.c | 6 +-
arch/riscv/kernel/sys_hwprobe.c | 5 +
arch/riscv/kernel/vector.c | 25 +-
arch/riscv/kernel/vendor_extensions.c | 10 +
arch/riscv/kernel/vendor_extensions/Makefile | 2 +
arch/riscv/kernel/vendor_extensions/thead.c | 18 ++
.../riscv/kernel/vendor_extensions/thead_hwprobe.c | 19 ++
tools/testing/selftests/riscv/vector/.gitignore | 3 +-
tools/testing/selftests/riscv/vector/Makefile | 17 +-
.../selftests/riscv/vector/v_exec_initval_nolibc.c | 93 +++++++
tools/testing/selftests/riscv/vector/v_helpers.c | 67 +++++
tools/testing/selftests/riscv/vector/v_helpers.h | 7 +
tools/testing/selftests/riscv/vector/v_initval.c | 22 ++
.../selftests/riscv/vector/v_initval_nolibc.c | 68 -----
.../selftests/riscv/vector/vstate_exec_nolibc.c | 20 +-
.../testing/selftests/riscv/vector/vstate_prctl.c | 295 ++++++++++++---------
34 files changed, 911 insertions(+), 271 deletions(-)
---
base-commit: 11cc01d4d2af304b7288251aad7e03315db8dffc
change-id: 20240530-xtheadvector-833d3d17b423
--
- Charlie
In the middle of the thread about a patch to add the skip test result,
I suggested documenting the process of deprecating the KTAP v1 Specification
method of marking a skipped test:
https://lore.kernel.org/all/490271eb-1429-2217-6e38-837c6e5e328b@gmail.com/…
In a reply to that email I suggested that we ought to have a process to transition
the KTAP Specification from v1 to v2, and possibly v3 and future.
This email is meant to be the root of that discussion.
My initial thinking is that there are at least three different types of project
and/or community that may have different needs in this area.
Type 1 - project controls both the test output generation and the test output
parsing tool. Both generation and parsing code are in the same repository
and/or synchronized versions are distributed together.
Devicetree unittests are an example of Type 1. I plan to maintain changes
of test output to KTAP v2 format in coordination with updating the parser
to process KTAP v2 data.
Type 2 - project controls both the test output generation and the test output
parsing tool. The test output generation and a parser modifications may be
controlled by the project BUT there are one or more external testing projects
that (1) may have their own parsers, and (2) may have a single framework that
tests multiple versions of the tests.
I think that kselftest and kunit tests are probably examples of Type 2. I also
think that DT unittests will become a Type 2 project as a result of converting
to KTAP v2 data.
Type 3 - project may create and maintain some tests, but is primarily a consumer
of tests created by other projects. Type 3 projects typically have a single
framework that is able to execute and process multiple versions of the tests.
The Fuego test project is an example of Type 3.
Maybe adding all of this complexity of different Types in my initial thinking
was silly -- maybe everything in this topic is governed by the more complex
Type 3.
My thinking was that the three different Types of project would be impacted
in different ways by transition plans. Type 3 would be the most impacted,
so I wanted to be sure that any transition plan especially considered their
needs.
There is an important aspect of the KTAP format that might ease the transition
from one version to another: All KTAP formatted results begin with a "version
line", so as soon as a parser has processed the first line of a test, it can
apply the appropriate KTAP Specification version to all subsequent lines of
test output. A parser implementation could choose to process all versions,
could choose to invoke a version specific parser, or some other approach
all together.
In the "add skip test results" thread, I suggested deprecating the v1
method of marking a skipped test in v2, with a scheduled removal of
the v1 method in v3. But since the KTAP format version is available
in the very first line of test output, is it necessary to do a slow
deprecation and removal over two versions?
One argument to doing a two version deprecation/removal process is that
a parser that is one version older the the test output _might_ be able
to process the test output without error, but would not be able to take
advantage of features added in the newer version of the Specification.
My opinion is that a two version deprecation/removal process will slow
the Specification update process and lead to more versions of the
Specification over a given time interval.
A one version deprecation/removal process puts more of a burden on Type 3
projects and external parsers for Type 2 projects to implement parsers
that can process the newer Specification more quickly and puts a burden
on test maintainers to delay a move to the newer Specification, or possibly
pressure to support selection of more than one Specification version format
for output data.
One additional item... On the KTAP Specification version 2 process wiki page,
I suggested that it is "desirable for test result parsers that understand the
KTAP Specification version 2 data also be able to parse version 1 data."
With the implication "Converting version 1 compliant data to version 2 compliant
data should not require a "flag day" switch of test result parsers." If this
thread discussion results in a different decision, I will update the wiki.
Thoughts?
-Frank
Hi everyone,
I am new to this forum and excited to be here.
I'm interested in learning more about Linux kernel self-tests and contributing where I can.
I look forward to engaging with you all and gaining a deeper understanding of the topics discussed here.
Could someone please guide me on how to ask questions here ?
Where should I post if I have a query ?
Looking forward to your advice and connecting with you all.
Thanks,
From: Geliang Tang <tanggeliang(a)kylinos.cn>
v2:
- Although all CI tests passed on x86_64 "bpf/vmtest-bpf-next-VM_Test-22
Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64
with gcc", some unexpect "SKIP"s are showed in the log:
#29/1 bpf_tcp_ca/dctcp:SKIP
#29/2 bpf_tcp_ca/cubic:OK
#29/3 bpf_tcp_ca/invalid_license:OK
#29/4 bpf_tcp_ca/dctcp_fallback:SKIP
#29/5 bpf_tcp_ca/rel_setsockopt:OK
#29/6 bpf_tcp_ca/write_sk_pacing:OK
#29/7 bpf_tcp_ca/incompl_cong_ops:OK
#29/8 bpf_tcp_ca/unsupp_cong_op:OK
#29/9 bpf_tcp_ca/update_ca:OK
#29/10 bpf_tcp_ca/update_wrong:OK
#29/11 bpf_tcp_ca/mixed_links:OK
#29/12 bpf_tcp_ca/multi_links:OK
#29/13 bpf_tcp_ca/link_replace:OK
#29/14 bpf_tcp_ca/tcp_ca_kfunc:OK
#29/15 bpf_tcp_ca/cc_cubic:OK
#29/16 bpf_tcp_ca/dctcp_autoattach_map:SKIP
#29 bpf_tcp_ca:OK (SKIP: 3/16)
Shouldn't skip any tests on X86_64. Fix this in v2.
- add a new helper test_progs_get_error.
BPF selftests seem to have not been fully tested on Loongarch platforms.
There are so many "ENOTSUPP" (-524) errors when running BPF selftests on
them since lacking BPF trampoline on Loongarch.
For these "ENOTSUPP" tests, it's better to skip them, instead of reporting
some "ENOTSUPP" errors. This patchset skips ENOTSUPP in ASSERT_OK/
ASSERT_OK_PTR/ASSERT_GE helpers to fix them. This is useful for running BPF
selftests for other architectures too.
Geliang Tang (6):
selftests/bpf: Define ENOTSUPP in testing_helpers.h
selftests/bpf: Skip ENOTSUPP in ASSERT_OK
selftests/bpf: Use ASSERT_OK to skip ENOTSUPP
selftests/bpf: Null checks for link in bpf_tcp_ca
selftests/bpf: Skip ENOTSUPP in ASSERT_OK_PTR
selftests/bpf: Skip ENOTSUPP in ASSERT_GE
.../selftests/bpf/prog_tests/bpf_tcp_ca.c | 20 ++++++-----
.../testing/selftests/bpf/prog_tests/d_path.c | 2 +-
.../selftests/bpf/prog_tests/lsm_cgroup.c | 10 +-----
.../selftests/bpf/prog_tests/module_attach.c | 2 +-
.../selftests/bpf/prog_tests/ringbuf.c | 2 +-
.../selftests/bpf/prog_tests/sock_addr.c | 4 ---
.../selftests/bpf/prog_tests/test_bprm_opts.c | 2 +-
.../selftests/bpf/prog_tests/test_ima.c | 2 +-
.../selftests/bpf/prog_tests/trace_ext.c | 2 +-
tools/testing/selftests/bpf/test_maps.c | 4 ---
tools/testing/selftests/bpf/test_progs.h | 33 +++++++++++++++----
tools/testing/selftests/bpf/test_verifier.c | 4 ---
tools/testing/selftests/bpf/testing_helpers.h | 4 +++
13 files changed, 50 insertions(+), 41 deletions(-)
--
2.43.0
Conform individual tests to TAP output. One patch conform one test. With
this series, all vDSO tests become TAP conformant.
First patch conform the test by using kselftest_harness.h. Other patches
are conforming using default kselftest.h helpers.
All tests have been tested multiple times before and after these
patches. They are working correctly and outputting TAP messaging to find
failures quikly when they happen.
---
Changes since v1:
- Update cover letter
- Update commit message of first patch
Muhammad Usama Anjum (4):
kselftests: vdso: vdso_test_clock_getres: conform test to TAP output
kselftests: vdso: vdso_test_correctness: conform test to TAP output
kselftests: vdso: vdso_test_getcpu: conform test to TAP output
kselftests: vdso: vdso_test_gettimeofday: conform test to TAP output
.../selftests/vDSO/vdso_test_clock_getres.c | 68 ++++----
.../selftests/vDSO/vdso_test_correctness.c | 146 +++++++++---------
.../testing/selftests/vDSO/vdso_test_getcpu.c | 16 +-
.../selftests/vDSO/vdso_test_gettimeofday.c | 23 +--
4 files changed, 126 insertions(+), 127 deletions(-)
--
2.39.2
Changes since v3:
1) Rebased onto Linux 6.10-rc6+.
2) Added Muhammad's acks for the series.
Cover letter for v3:
Hi,
Dave Hansen, Muhammad Usama Anjum, here is the combined series that we
discussed yesterday [1].
As I mentioned then, this is a bit intrusive--but no more than
necessary, IMHO. Specifically, it moves some clang-un-inlineable things
out to "pure" assembly code files.
I've tested this by building with clang, then running each binary on my
x86_64 test system with today's 6.10-rc1, and comparing the console and
dmesg output to a gcc-based build without these patches applied. Aside
from timestamps and virtual addresses, it looks identical.
Earlier cover letter:
Just a bunch of build and warnings fixes that show up when building with
clang. Some of these depend on each other, so I'm sending them as a
series.
Changes since v2:
1) Dropped my test_FISTTP.c patch, and picked up Muhammad's fix instead,
seeing as how that was posted first.
2) Updated patch descriptions to reflect that Valentin Obst's build fix
for LLVM [1] has already been merged into Linux main.
3) Minor wording and typo corrections in the commit logs throughout.
Changes since the first version:
1) Rebased onto Linux 6.10-rc1
Enjoy!
[1] https://lore.kernel.org/44428518-4d21-4de7-8587-04eceefb330d@nvidia.com
thanks,
John Hubbard
John Hubbard (6):
selftests/x86: fix Makefile dependencies to work with clang
selftests/x86: build fsgsbase_restore.c with clang
selftests/x86: build sysret_rip.c with clang
selftests/x86: avoid -no-pie warnings from clang during compilation
selftests/x86: remove (or use) unused variables and functions
selftests/x86: fix printk warnings reported by clang
Muhammad Usama Anjum (1):
selftests: x86: test_FISTTP: use fisttps instead of ambiguous fisttp
tools/testing/selftests/x86/Makefile | 31 +++++++++++++++----
tools/testing/selftests/x86/amx.c | 16 ----------
.../testing/selftests/x86/clang_helpers_32.S | 11 +++++++
.../testing/selftests/x86/clang_helpers_64.S | 28 +++++++++++++++++
tools/testing/selftests/x86/fsgsbase.c | 6 ----
.../testing/selftests/x86/fsgsbase_restore.c | 11 +++----
tools/testing/selftests/x86/sigreturn.c | 2 +-
.../testing/selftests/x86/syscall_arg_fault.c | 1 -
tools/testing/selftests/x86/sysret_rip.c | 20 ++++--------
tools/testing/selftests/x86/test_FISTTP.c | 8 ++---
tools/testing/selftests/x86/test_vsyscall.c | 15 +++------
tools/testing/selftests/x86/vdso_restorer.c | 2 ++
12 files changed, 87 insertions(+), 64 deletions(-)
create mode 100644 tools/testing/selftests/x86/clang_helpers_32.S
create mode 100644 tools/testing/selftests/x86/clang_helpers_64.S
base-commit: 795c58e4c7fc6163d8fb9f2baa86cfe898fa4b19
--
2.45.2
This patch series adds unit tests for the clk fixed rate basic type and
the clk registration functions that use struct clk_parent_data. To get
there, we add support for loading device tree overlays onto the live DTB
along with probing platform drivers to bind to device nodes in the
overlays. With this series, we're able to exercise some of the code in
the common clk framework that uses devicetree lookups to find parents
and the fixed rate clk code that scans device tree directly and creates
clks. Please review.
I Cced everyone to all the patches so they get the full context. I'm
hoping I can take the whole pile through the clk tree as they all build
upon each other. Or the DT part can be merged through the DT tree to
reduce the dependencies.
Changes from v5: https://lore.kernel.org/r/20240603223811.3815762-1-sboyd@kernel.org
* Pick up reviewed-by tags
* Drop test vendor prefix bindings as dtschema allows anything now
* Use of_node_put_kunit() more to plug some reference leaks
* Select DTC config to avoid compile fails because of missing dtc
* Don't skip for OF_OVERLAY in overlay tests because they depend on it
Changes from v4: https://lore.kernel.org/r/20240422232404.213174-1-sboyd@kernel.org
* Picked up reviewed-by tags
* Check for non-NULL device pointers before calling put_device()
* Fix CFI issues with kunit actions
* Introduce platform_device_prepare_wait_for_probe() helper to wait for
a platform device to probe
* Move platform code to lib/kunit and rename functions to have kunit
prefix
* Fix issue with platform wrappers messing up reference counting
because they used kunit actions
* New patch to populate overlay devices on root node for powerpc
* Make fixed-rate binding generic single clk consumer binding
Changes from v3: https://lore.kernel.org/r/20230327222159.3509818-1-sboyd@kernel.org
* No longer depend on Frank's series[1] because it was merged upstream[2]
* Use kunit_add_action_or_reset() to shorten code
* Skip tests properly when CONFIG_OF_OVERLAY isn't set
Changes from v2: https://lore.kernel.org/r/20230315183729.2376178-1-sboyd@kernel.org
* Overlays don't depend on __symbols__ node
* Depend on Frank's always create root node if CONFIG_OF series[1]
* Added kernel-doc to KUnit API doc
* Fixed some kernel-doc on functions
* More test cases for fixed rate clk
Changes from v1: https://lore.kernel.org/r/20230302013822.1808711-1-sboyd@kernel.org
* Don't depend on UML, use unittest data approach to attach nodes
* Introduce overlay loading API for KUnit
* Move platform_device KUnit code to drivers/base/test
* Use #define macros for constants shared between unit tests and
overlays
* Settle on "test" as a vendor prefix
* Make KUnit wrappers have "_kunit" postfix
[1] https://lore.kernel.org/r/20230317053415.2254616-1-frowand.list@gmail.com
[2] https://lore.kernel.org/r/20240308195737.GA1174908-robh@kernel.org
Stephen Boyd (8):
of/platform: Allow overlays to create platform devices from the root
node
of: Add test managed wrappers for of_overlay_apply()/of_node_put()
dt-bindings: vendor-prefixes: Add "test" vendor for KUnit and friends
of: Add a KUnit test for overlays and test managed APIs
platform: Add test managed platform_device/driver APIs
clk: Add test managed clk provider/consumer APIs
clk: Add KUnit tests for clk fixed rate basic type
clk: Add KUnit tests for clks registered with struct clk_parent_data
Documentation/dev-tools/kunit/api/clk.rst | 10 +
Documentation/dev-tools/kunit/api/index.rst | 21 +
Documentation/dev-tools/kunit/api/of.rst | 13 +
.../dev-tools/kunit/api/platformdevice.rst | 10 +
.../devicetree/bindings/vendor-prefixes.yaml | 2 +
drivers/clk/.kunitconfig | 2 +
drivers/clk/Kconfig | 11 +
drivers/clk/Makefile | 9 +-
drivers/clk/clk-fixed-rate_test.c | 379 +++++++++++++++
drivers/clk/clk-fixed-rate_test.h | 8 +
drivers/clk/clk_kunit_helpers.c | 204 ++++++++
drivers/clk/clk_parent_data_test.h | 10 +
drivers/clk/clk_test.c | 453 +++++++++++++++++-
drivers/clk/kunit_clk_fixed_rate_test.dtso | 19 +
drivers/clk/kunit_clk_parent_data_test.dtso | 28 ++
drivers/of/.kunitconfig | 1 +
drivers/of/Kconfig | 10 +
drivers/of/Makefile | 2 +
drivers/of/kunit_overlay_test.dtso | 9 +
drivers/of/of_kunit_helpers.c | 74 +++
drivers/of/overlay_test.c | 114 +++++
drivers/of/platform.c | 9 +-
include/kunit/clk.h | 28 ++
include/kunit/of.h | 115 +++++
include/kunit/platform_device.h | 20 +
lib/kunit/Makefile | 4 +-
lib/kunit/platform-test.c | 223 +++++++++
lib/kunit/platform.c | 302 ++++++++++++
28 files changed, 2084 insertions(+), 6 deletions(-)
create mode 100644 Documentation/dev-tools/kunit/api/clk.rst
create mode 100644 Documentation/dev-tools/kunit/api/of.rst
create mode 100644 Documentation/dev-tools/kunit/api/platformdevice.rst
create mode 100644 drivers/clk/clk-fixed-rate_test.c
create mode 100644 drivers/clk/clk-fixed-rate_test.h
create mode 100644 drivers/clk/clk_kunit_helpers.c
create mode 100644 drivers/clk/clk_parent_data_test.h
create mode 100644 drivers/clk/kunit_clk_fixed_rate_test.dtso
create mode 100644 drivers/clk/kunit_clk_parent_data_test.dtso
create mode 100644 drivers/of/kunit_overlay_test.dtso
create mode 100644 drivers/of/of_kunit_helpers.c
create mode 100644 drivers/of/overlay_test.c
create mode 100644 include/kunit/clk.h
create mode 100644 include/kunit/of.h
create mode 100644 include/kunit/platform_device.h
create mode 100644 lib/kunit/platform-test.c
create mode 100644 lib/kunit/platform.c
base-commit: 1613e604df0cd359cf2a7fbd9be7a0bcfacfabd0
--
https://git.kernel.org/pub/scm/linux/kernel/git/clk/linux.git/https://git.kernel.org/pub/scm/linux/kernel/git/sboyd/spmi.git
Hi Linus,
Please pull this kselftest fixes update for Linux 6.10.
This kselftest fixes update for Linux 6.10 consists of fixes to clang
build failures to timerns, vDSO tests and fixes to vDSO makefile.
Note: makefile fixes are included to avoid conflicts during 6.11 merge
window.
diff is attached.
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit 48236960c06d32370bfa6f2cc408e786873262c8:
selftests/resctrl: Fix non-contiguous CBM for AMD (2024-06-26 13:22:34 -0600)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest tags/linux_kselftest-fixes-6.10
for you to fetch changes up to 66cde337fa1b7c6cf31f856fa015bd91a4d383e7:
selftests/vDSO: remove duplicate compiler invocations from Makefile (2024-07-05 14:12:34 -0600)
----------------------------------------------------------------
linux_kselftest-fixes-6.10
This kselftest fixes update for Linux 6.10 consists of fixes to clang
build failures to timerns, vDSO tests and fixes to vDSO makefile.
----------------------------------------------------------------
John Hubbard (4):
selftest/timerns: fix clang build failures for abs() calls
selftests/vDSO: fix clang build errors and warnings
selftests/vDSO: remove partially duplicated "all:" target in Makefile
selftests/vDSO: remove duplicate compiler invocations from Makefile
tools/testing/selftests/timens/exec.c | 6 ++---
tools/testing/selftests/timens/timer.c | 2 +-
tools/testing/selftests/timens/timerfd.c | 2 +-
tools/testing/selftests/timens/vfork_exec.c | 4 +--
tools/testing/selftests/vDSO/Makefile | 29 +++++++++-------------
tools/testing/selftests/vDSO/parse_vdso.c | 16 ++++++++----
.../selftests/vDSO/vdso_standalone_test_x86.c | 18 ++++++++++++--
7 files changed, 46 insertions(+), 31 deletions(-)
----------------------------------------------------------------
For cgroup v1, if turned on, and there's any cgroup in the "cpu" hierarchy it
needs an RT budget assigned, otherwise the processes in it will not be able to
get RT at all. The problem with RT group scheduling is that it requires the
budget assigned but there's no way we could assign a default budget, since the
values to assign are both upper and lower time limits, are absolute, and need to
be sum up to < 1 for each individal cgroup. That means we cannot really come up
with values that would work by default in the general case.[1]
For cgroup v2, it's almost unusable as well. If it turned on, the cpu controller
can only be enabled when all RT processes are in the root cgroup. But it will
lose the benefits of cgroup v2 if all RT process were placed in the same cgroup.
Red Hat, Gentoo, Arch Linux and Debian all disable it. systemd also doesn't
support it.[2]
I leave tools/testing/selftests/bpf/config.{s390x,aarch64} untouched because
I don't whether bpf testing requires it.
[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1229700
[2]: https://github.com/systemd/systemd/issues/13781#issuecomment-549164383
Celeste Liu (6):
riscv: defconfig: drop RT_GROUP_SCHED=y
loongarch: defconfig: drop RT_GROUP_SCHED=y
mips: defconfig: drop RT_GROUP_SCHED=y from generic/db1xxx/eyeq5
powerpc: defconfig: drop RT_GROUP_SCHED=y from ppc6xx_defconfig
sh: defconfig: drop RT_GROUP_SCHED=y from sdk7786/urquell
arm: defconfig: drop RT_GROUP_SCHED=y from bcm2855/tegra/omap2plus
arch/arm/configs/bcm2835_defconfig | 1 -
arch/arm/configs/omap2plus_defconfig | 1 -
arch/arm/configs/tegra_defconfig | 1 -
arch/loongarch/configs/loongson3_defconfig | 1 -
arch/mips/configs/db1xxx_defconfig | 1 -
arch/mips/configs/eyeq5_defconfig | 1 -
arch/mips/configs/generic_defconfig | 1 -
arch/powerpc/configs/ppc6xx_defconfig | 1 -
arch/riscv/configs/defconfig | 1 -
arch/sh/configs/sdk7786_defconfig | 1 -
arch/sh/configs/urquell_defconfig | 1 -
11 files changed, 11 deletions(-)
--
2.45.1
in randomize function, there is a open function, but there is no
close function in the randomize, which is easy to cause memory leaks.
Signed-off-by: Liu Jing <liujing(a)cmss.chinamobile.com>
---
tools/testing/selftests/net/tcp_mmap.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/net/tcp_mmap.c b/tools/testing/selftests/net/tcp_mmap.c
index 4fcce5150850..ab305e262d0a 100644
--- a/tools/testing/selftests/net/tcp_mmap.c
+++ b/tools/testing/selftests/net/tcp_mmap.c
@@ -438,6 +438,7 @@ static void randomize(void *target, size_t count)
perror("read /dev/urandom");
exit(1);
}
+ close(urandom);
}
int main(int argc, char *argv[])
--
2.33.0
+ Dave Miller, Jakub Kicinski, Paolo Abeni, Shuah Khan,
linux-kselftest(a)vger.kernel.org
On Mon, Jul 08, 2024 at 09:04:05PM +0000, zijianzhang(a)bytedance.com wrote:
> From: Zijian Zhang <zijianzhang(a)bytedance.com>
>
> We update selftests/net/msg_zerocopy.c to accommodate the new mechanism,
> cfg_notification_limit has the same semantics for both methods. Test
> results are as follows, we update skb_orphan_frags_rx to the same as
> skb_orphan_frags to support zerocopy in the localhost test.
>
> cfg_notification_limit = 1, both method get notifications after 1 calling
> of sendmsg. In this case, the new method has around 17% cpu savings in TCP
> and 23% cpu savings in UDP.
> +---------------------+---------+---------+---------+---------+
> | Test Type / Protocol| TCP v4 | TCP v6 | UDP v4 | UDP v6 |
> +---------------------+---------+---------+---------+---------+
> | ZCopy (MB) | 7523 | 7706 | 7489 | 7304 |
> +---------------------+---------+---------+---------+---------+
> | New ZCopy (MB) | 8834 | 8993 | 9053 | 9228 |
> +---------------------+---------+---------+---------+---------+
> | New ZCopy / ZCopy | 117.42% | 116.70% | 120.88% | 126.34% |
> +---------------------+---------+---------+---------+---------+
>
> cfg_notification_limit = 32, both get notifications after 32 calling of
> sendmsg, which means more chances to coalesce notifications, and less
> overhead of poll + recvmsg for the original method. In this case, the new
> method has around 7% cpu savings in TCP and slightly better cpu usage in
> UDP. In the context of selftest, notifications of TCP are more likely to
> out of order than UDP, it's easier to coalesce more notifications in UDP.
> The original method can get one notification with range of 32 in a recvmsg
> most of the time. In TCP, most notifications' range is around 2, so the
> original method needs around 16 recvmsgs to get notified in one round.
> That's the reason for the "New ZCopy / ZCopy" diff in TCP and UDP here.
> +---------------------+---------+---------+---------+---------+
> | Test Type / Protocol| TCP v4 | TCP v6 | UDP v4 | UDP v6 |
> +---------------------+---------+---------+---------+---------+
> | ZCopy (MB) | 8842 | 8735 | 10072 | 9380 |
> +---------------------+---------+---------+---------+---------+
> | New ZCopy (MB) | 9366 | 9477 | 10108 | 9385 |
> +---------------------+---------+---------+---------+---------+
> | New ZCopy / ZCopy | 106.00% | 108.28% | 100.31% | 100.01% |
> +---------------------+---------+---------+---------+---------+
>
> In conclusion, when notification interval is small or notifications are
> hard to be coalesced, the new mechanism is highly recommended. Otherwise,
> the performance gain from the new mechanism is very limited.
>
> Signed-off-by: Zijian Zhang <zijianzhang(a)bytedance.com>
> Signed-off-by: Xiaochun Lu <xiaochun.lu(a)bytedance.com>
> ---
> tools/testing/selftests/net/msg_zerocopy.c | 111 ++++++++++++++++++--
> tools/testing/selftests/net/msg_zerocopy.sh | 1 +
> 2 files changed, 105 insertions(+), 7 deletions(-)
>
> diff --git a/tools/testing/selftests/net/msg_zerocopy.c b/tools/testing/selftests/net/msg_zerocopy.c
...
> @@ -466,6 +504,44 @@ static void do_recv_completions(int fd, int domain)
> sends_since_notify = 0;
> }
>
> +static void do_recv_completions2(void)
> +{
> + struct cmsghdr *cm = (struct cmsghdr *)zc_ckbuf;
> + struct zc_info *zc_info;
> + __u32 hi, lo, range;
> + __u8 zerocopy;
> + int i;
> +
> + zc_info = (struct zc_info *)CMSG_DATA(cm);
> + for (i = 0; i < zc_info->size; i++) {
> + hi = zc_info->arr[i].hi;
> + lo = zc_info->arr[i].lo;
> + zerocopy = zc_info->arr[i].zerocopy;
> + range = hi - lo + 1;
> +
> + if (cfg_verbose && lo != next_completion)
> + fprintf(stderr, "gap: %u..%u does not append to %u\n",
> + lo, hi, next_completion);
> + next_completion = hi + 1;
> +
> + if (zerocopied == -1)
> + zerocopied = zerocopy;
> + else if (zerocopied != zerocopy) {
> + fprintf(stderr, "serr: inconsistent\n");
> + zerocopied = zerocopy;
> + }
nit: If any arms of a conditional have {}, then all arms should have them
> +
> + completions += range;
> +
> + if (cfg_verbose >= 2)
> + fprintf(stderr, "completed: %u (h=%u l=%u)\n",
> + range, hi, lo);
> + }
> +
> + sends_since_notify = 0;
> + added_zcopy_info = false;
> +}
...
From: Geliang Tang <tanggeliang(a)kylinos.cn>
Run this BPF selftests (./test_progs -t sockmap_basic) on a Loongarch
platform, a kernel panic occurs:
'''
Oops[#1]:
CPU: 22 PID: 2824 Comm: test_progs Tainted: G OE 6.10.0-rc2+ #18
Hardware name: LOONGSON Dabieshan/Loongson-TC542F0, BIOS Loongson-UDK2018
... ...
ra: 90000000048bf6c0 sk_msg_recvmsg+0x120/0x560
ERA: 9000000004162774 copy_page_to_iter+0x74/0x1c0
CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
PRMD: 0000000c (PPLV0 +PIE +PWE)
EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
BADV: 0000000000000040
PRID: 0014c011 (Loongson-64bit, Loongson-3C5000)
Modules linked in: bpf_testmod(OE) xt_CHECKSUM xt_MASQUERADE xt_conntrack
Process test_progs (pid: 2824, threadinfo=0000000000863a31, task=...)
Stack : ...
...
Call Trace:
[<9000000004162774>] copy_page_to_iter+0x74/0x1c0
[<90000000048bf6c0>] sk_msg_recvmsg+0x120/0x560
[<90000000049f2b90>] tcp_bpf_recvmsg_parser+0x170/0x4e0
[<90000000049aae34>] inet_recvmsg+0x54/0x100
[<900000000481ad5c>] sock_recvmsg+0x7c/0xe0
[<900000000481e1a8>] __sys_recvfrom+0x108/0x1c0
[<900000000481e27c>] sys_recvfrom+0x1c/0x40
[<9000000004c076ec>] do_syscall+0x8c/0xc0
[<9000000003731da4>] handle_syscall+0xc4/0x160
Code: ...
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Fatal exception
Kernel relocated by 0x3510000
.text @ 0x9000000003710000
.data @ 0x9000000004d70000
.bss @ 0x9000000006469400
---[ end Kernel panic - not syncing: Fatal exception ]---
'''
This crash happens every time when running sockmap_skb_verdict_shutdown
subtest in sockmap_basic.
This crash is because a NULL pointer is passed to page_address() in
sk_msg_recvmsg(). Due to the difference implementations depending on the
architecture, page_address(NULL) will trigger a panic on Loongarch
platform but not on X86 platform. So this bug was hidden on X86 platform
for a while, but now it is exposed on Loongarch platform.
The root cause is a zero length skb (skb->len == 0) is put on the queue.
This zero length skb is a TCP FIN packet, which is sent by shutdown(),
invoked in test_sockmap_skb_verdict_shutdown():
shutdown(p1, SHUT_WR);
In this case, in sk_psock_skb_ingress_enqueue(), num_sge is zero, and no
page is put to this sge (see sg_set_page in sg_set_page), but this empty
sge is queued into ingress_msg list.
And in sk_msg_recvmsg(), this empty sge is used, and a NULL page is got by
sg_page(sge). Pass this NULL page to copy_page_to_iter(), which passes it
to kmap_local_page() and to page_address(), then kernel panics.
To solve this, we should skip this zero length skb. So in sk_msg_recvmsg(),
if copy is zero, that means it's a zero length skb, skip invoking
copy_page_to_iter(). We are using the EFAULT return triggered by
copy_page_to_iter to check for is_fin in tcp_bpf.c.
Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Suggested-by: John Fastabend <john.fastabend(a)gmail.com>
Signed-off-by: Geliang Tang <tanggeliang(a)kylinos.cn>
---
v5:
- update v5 as John suggested.
- skmsg: skip zero length skb in sk_msg_recvmsg
v4:
- skmsg: skip empty sge in sk_msg_recvmsg
v3:
- skmsg: prevent empty ingress skb from enqueuing
v2:
- skmsg: null check for sg_page in sk_msg_recvmsg
---
net/core/skmsg.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index fd20aae30be2..bbf40b999713 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -434,7 +434,8 @@ int sk_msg_recvmsg(struct sock *sk, struct sk_psock *psock, struct msghdr *msg,
page = sg_page(sge);
if (copied + copy > len)
copy = len - copied;
- copy = copy_page_to_iter(page, sge->offset, copy, iter);
+ if (copy)
+ copy = copy_page_to_iter(page, sge->offset, copy, iter);
if (!copy) {
copied = copied ? copied : -EFAULT;
goto out;
--
2.43.0
Changes v3:
- Reworked patch 2.
- Changed minor things in patch 1 like function name and made
corrections to the patch message.
Changes v2:
- Removed patches 2 and 3 since now this part will be supported by the
kernel.
Sub-Numa Clustering (SNC) allows splitting CPU cores, caches and memory
into multiple NUMA nodes. When enabled, NUMA-aware applications can
achieve better performance on bigger server platforms.
SNC support in the kernel is currently in review [1]. With SNC enabled
and kernel support in place all the tests will function normally (aside
from effective cache size). There might be a problem when SNC is enabled
but the system is still using an older kernel version without SNC
support. Currently the only message displayed in that situation is a
guess that SNC might be enabled and is causing issues. That message also
is displayed whenever the test fails on an Intel platform.
Add a mechanism to discover kernel support for SNC which will add more
meaning and certainty to the error message.
Add runtime SNC mode detection and verify how reliable that information
is.
Series was tested on Ice Lake server platforms with SNC disabled, SNC-2
and SNC-4. The tests were also ran with and without kernel support for
SNC.
Series applies cleanly on kselftest/next.
[1] https://lore.kernel.org/all/20240628215619.76401-1-tony.luck@intel.com/
Previous versions of this series:
[v1] https://lore.kernel.org/all/cover.1709721159.git.maciej.wieczor-retman@inte…
[v2] https://lore.kernel.org/all/cover.1715769576.git.maciej.wieczor-retman@inte…
Maciej Wieczor-Retman (2):
selftests/resctrl: Adjust effective L3 cache size with SNC enabled
selftests/resctrl: Adjust SNC support messages
tools/testing/selftests/resctrl/cache.c | 3 +
tools/testing/selftests/resctrl/cmt_test.c | 4 +-
tools/testing/selftests/resctrl/mba_test.c | 4 +
tools/testing/selftests/resctrl/mbm_test.c | 6 +-
tools/testing/selftests/resctrl/resctrl.h | 8 +
.../testing/selftests/resctrl/resctrl_tests.c | 7 +
tools/testing/selftests/resctrl/resctrlfs.c | 138 ++++++++++++++++++
7 files changed, 166 insertions(+), 4 deletions(-)
--
2.45.2
Even if a vgem device is configured in, we will skip the import_vgem_fd()
test almost every time.
TAP version 13
1..11
# Testing heap: system
# =======================================
# Testing allocation and importing:
ok 1 # SKIP Could not open vgem -1
The problem is that we use the DRM_IOCTL_VERSION ioctl to query the driver
version information but leave the name field a non-null-terminated string.
Terminate it properly to actually test against the vgem device.
Signed-off-by: Zenghui Yu <yuzenghui(a)huawei.com>
---
tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c b/tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c
index 5f541522364f..2fcc74998fa9 100644
--- a/tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c
+++ b/tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c
@@ -32,6 +32,8 @@ static int check_vgem(int fd)
if (ret)
return 0;
+ name[4] = '\0';
+
return !strcmp(name, "vgem");
}
--
2.33.0
From: Geliang Tang <tanggeliang(a)kylinos.cn>
v10:
- a new patch 10 is added.
- patches 1-6, 8-9 unchanged, only commit logs updated.
- "err = -errno" is used in patches 7, 11, 12 to get the real error
number before checking value of "err".
v9:
- new patches 5-7, new struct member expect_errno for network_helper_opts.
- patches 1-4, 8-9 unchanged.
- update patches 10-11 to make sure all tests pass.
v8:
- only patch 8 updated, to fix errors reported by CI.
v7:
- address Martin's comments in v6. (thanks)
- use MAX(opts->backlog, 0) instead of opts->backlog.
- use connect_to_fd_opts instead connect_to_fd.
- more ASSERT_* to check errors.
v6:
- update patch 6 as Daniel suggested. (thanks)
v5:
- keep make_server and make_client as Eduard suggested.
v4:
- a new patch to use make_sockaddr in sockmap_ktls.
- a new patch to close fd in error path in drop_on_reuseport.
- drop make_server() in patch 7.
- drop make_client() too in patch 9.
v3:
- a new patch to add backlog for network_helper_opts.
- use start_server_str in sockmap_ktls now, not start_server.
v2:
- address Eduard's comments in v1. (thanks)
- fix errors reported by CI.
This patch set uses network helpers in sockmap_ktls and sk_lookup, and
drop three local helpers tcp_server(), inetaddr_len() and make_socket()
in them.
Geliang Tang (12):
selftests/bpf: Add backlog for network_helper_opts
selftests/bpf: Use start_server_str in sockmap_ktls
selftests/bpf: Use connect_to_fd_opts in sockmap_ktls
selftests/bpf: Use make_sockaddr in sockmap_ktls
selftests/bpf: Add network_helper_opts for connect_fd_to_fd
selftests/bpf: Add expect_errno for network_helper_opts
selftests/bpf: Set expect_errno for cgroup_skb_sk_lookup
selftests/bpf: Close fd in error path in drop_on_reuseport
selftests/bpf: Use start_server_str in sk_lookup
selftests/bpf: Use connect_fd_to_fd in sk_lookup
selftests/bpf: Use connect_to_addr in sk_lookup
selftests/bpf: Drop make_socket in sk_lookup
tools/testing/selftests/bpf/network_helpers.c | 23 ++-
tools/testing/selftests/bpf/network_helpers.h | 8 +-
.../testing/selftests/bpf/prog_tests/bpf_nf.c | 5 +-
.../bpf/prog_tests/cgroup_skb_sk_lookup.c | 8 +-
.../selftests/bpf/prog_tests/cgroup_tcp_skb.c | 4 +-
.../selftests/bpf/prog_tests/cgroup_v1v2.c | 1 +
.../selftests/bpf/prog_tests/sk_lookup.c | 162 +++++++-----------
.../selftests/bpf/prog_tests/sockmap_ktls.c | 53 ++----
8 files changed, 108 insertions(+), 156 deletions(-)
--
2.43.0
I realized this while having a map containing both a struct bpf_timer and
a struct bpf_wq: the third argument provided to the bpf_wq callback is
not the struct bpf_wq pointer itself, but the pointer to the value in
the map.
Which means that the users need to double cast the provided "value" as
this is not a struct bpf_wq *.
This is a change of API, but there doesn't seem to be much users of bpf_wq
right now, so we should be able to go with this right now.
Signed-off-by: Benjamin Tissoires <bentiss(a)kernel.org>
---
Changes in v2:
- amended the selftests to retrieve something from the third argument of
the callback
- Link to v1: https://lore.kernel.org/r/20240705-fix-wq-v1-0-91b4d82cd825@kernel.org
---
Benjamin Tissoires (2):
bpf: helpers: fix bpf_wq_set_callback_impl signature
selftests/bpf: amend for wrong bpf_wq_set_callback_impl signature
kernel/bpf/helpers.c | 2 +-
tools/testing/selftests/bpf/bpf_experimental.h | 2 +-
tools/testing/selftests/bpf/progs/wq.c | 19 ++++++++++++++-----
tools/testing/selftests/bpf/progs/wq_failures.c | 4 ++--
4 files changed, 18 insertions(+), 9 deletions(-)
---
base-commit: fd8db07705c55a995c42b1e71afc42faad675b0b
change-id: 20240705-fix-wq-f069c7fb36c3
Best regards,
--
Benjamin Tissoires <bentiss(a)kernel.org>
From: Geliang Tang <tanggeliang(a)kylinos.cn>
BPF selftests seem to have not been fully tested on Loongarch platforms.
There are so many "ENOTSUPP" (-524) errors when running BPF selftests on
them since lacking BPF trampoline on Loongarch.
For these "ENOTSUPP" tests, it's better to skip them, instead of reporting
some "ENOTSUPP" errors. This patchset skips ENOTSUPP in ASSERT_OK/
ASSERT_OK_PTR/ASSERT_GE helpers to fix them. This is useful for running BPF
selftests for other architectures too.
Geliang Tang (6):
selftests/bpf: Define ENOTSUPP in testing_helpers.h
selftests/bpf: Skip ENOTSUPP in ASSERT_OK
selftests/bpf: Use ASSERT_OK to skip ENOTSUPP
selftests/bpf: Null checks for link in bpf_tcp_ca
selftests/bpf: Skip ENOTSUPP in ASSERT_OK_PTR
selftests/bpf: Skip ENOTSUPP in ASSERT_GE
.../selftests/bpf/prog_tests/bpf_tcp_ca.c | 20 +++++++++-------
.../testing/selftests/bpf/prog_tests/d_path.c | 2 +-
.../selftests/bpf/prog_tests/lsm_cgroup.c | 10 +-------
.../selftests/bpf/prog_tests/module_attach.c | 2 +-
.../selftests/bpf/prog_tests/ringbuf.c | 2 +-
.../selftests/bpf/prog_tests/sock_addr.c | 4 ----
.../selftests/bpf/prog_tests/test_bprm_opts.c | 2 +-
.../selftests/bpf/prog_tests/test_ima.c | 2 +-
.../selftests/bpf/prog_tests/trace_ext.c | 2 +-
tools/testing/selftests/bpf/test_maps.c | 4 ----
tools/testing/selftests/bpf/test_progs.h | 24 ++++++++++++++-----
tools/testing/selftests/bpf/test_verifier.c | 4 ----
tools/testing/selftests/bpf/testing_helpers.h | 4 ++++
13 files changed, 41 insertions(+), 41 deletions(-)
--
2.43.0
I realized this while having a map containing both a struct bpf_timer and
a struct bpf_wq: the third argument provided to the bpf_wq callback is
not the struct bpf_wq pointer itself, but the pointer to the value in
the map.
Which means that the users need to double cast the provided "value" as
this is not a struct bpf_wq *.
This is a change of API, but there doesn't seem to be much users of bpf_wq
right now, so we should be able to go with this right now.
Signed-off-by: Benjamin Tissoires <bentiss(a)kernel.org>
---
Benjamin Tissoires (2):
bpf: helpers: fix bpf_wq_set_callback_impl signature
selftests/bpf: amend for wrong bpf_wq_set_callback_impl signature
kernel/bpf/helpers.c | 2 +-
tools/testing/selftests/bpf/bpf_experimental.h | 2 +-
tools/testing/selftests/bpf/progs/wq.c | 8 ++++----
tools/testing/selftests/bpf/progs/wq_failures.c | 4 ++--
4 files changed, 8 insertions(+), 8 deletions(-)
---
base-commit: fd8db07705c55a995c42b1e71afc42faad675b0b
change-id: 20240705-fix-wq-f069c7fb36c3
Best regards,
--
Benjamin Tissoires <bentiss(a)kernel.org>
From: Geliang Tang <tanggeliang(a)kylinos.cn>
v9:
- new patches 5-7, new struct member expect_errno for network_helper_opts.
- patches 1-4, 8-9 unchanged.
- update patches 10-11 to make sure all tests pass.
v8:
- only patch 8 updated, to fix errors reported by CI.
v7:
- address Martin's comments in v6. (thanks)
- use MAX(opts->backlog, 0) instead of opts->backlog.
- use connect_to_fd_opts instead connect_to_fd.
- more ASSERT_* to check errors.
v6:
- update patch 6 as Daniel suggested. (thanks)
v5:
- keep make_server and make_client as Eduard suggested.
v4:
- a new patch to use make_sockaddr in sockmap_ktls.
- a new patch to close fd in error path in drop_on_reuseport.
- drop make_server() in patch 7.
- drop make_client() too in patch 9.
v3:
- a new patch to add backlog for network_helper_opts.
- use start_server_str in sockmap_ktls now, not start_server.
v2:
- address Eduard's comments in v1. (thanks)
- fix errors reported by CI.
This patch set uses network helpers in sockmap_ktls and sk_lookup, and
drop three local helpers tcp_server(), inetaddr_len() and make_socket()
in them.
Geliang Tang (11):
selftests/bpf: Add backlog for network_helper_opts
selftests/bpf: Use start_server_str in sockmap_ktls
selftests/bpf: Use connect_to_fd_opts in sockmap_ktls
selftests/bpf: Use make_sockaddr in sockmap_ktls
selftests/bpf: Add network_helper_opts for connect_fd_to_fd
selftests/bpf: Add expect_errno for network_helper_opts
selftests/bpf: Set expect_errno for cgroup_skb_sk_lookup
selftests/bpf: Close fd in error path in drop_on_reuseport
selftests/bpf: Use start_server_str in sk_lookup
selftests/bpf: Use connect_to_addr in sk_lookup
selftests/bpf: Drop make_socket in sk_lookup
tools/testing/selftests/bpf/network_helpers.c | 23 ++-
tools/testing/selftests/bpf/network_helpers.h | 8 +-
.../testing/selftests/bpf/prog_tests/bpf_nf.c | 5 +-
.../bpf/prog_tests/cgroup_skb_sk_lookup.c | 8 +-
.../selftests/bpf/prog_tests/cgroup_tcp_skb.c | 4 +-
.../selftests/bpf/prog_tests/cgroup_v1v2.c | 1 +
.../selftests/bpf/prog_tests/sk_lookup.c | 152 +++++++-----------
.../selftests/bpf/prog_tests/sockmap_ktls.c | 53 ++----
8 files changed, 106 insertions(+), 148 deletions(-)
--
2.43.0
** Background **
Currently, OVS supports several packet sampling mechanisms (sFlow,
per-bridge IPFIX, per-flow IPFIX). These end up being translated into a
userspace action that needs to be handled by ovs-vswitchd's handler
threads only to be forwarded to some third party application that
will somehow process the sample and provide observability on the
datapath.
A particularly interesting use-case is controller-driven
per-flow IPFIX sampling where the OpenFlow controller can add metadata
to samples (via two 32bit integers) and this metadata is then available
to the sample-collecting system for correlation.
** Problem **
The fact that sampled traffic share netlink sockets and handler thread
time with upcalls, apart from being a performance bottleneck in the
sample extraction itself, can severely compromise the datapath,
yielding this solution unfit for highly loaded production systems.
Users are left with little options other than guessing what sampling
rate will be OK for their traffic pattern and system load and dealing
with the lost accuracy.
Looking at available infrastructure, an obvious candidated would be
to use psample. However, it's current state does not help with the
use-case at stake because sampled packets do not contain user-defined
metadata.
** Proposal **
This series is an attempt to fix this situation by extending the
existing psample infrastructure to carry a variable length
user-defined cookie.
The main existing user of psample is tc's act_sample. It is also
extended to forward the action's cookie to psample.
Finally, a new OVS action (OVS_SAMPLE_ATTR_PSAMPLE) is created.
It accepts a group and an optional cookie and uses psample to
multicast the packet and the metadata.
--
v8 -> v9:
- Rebased.
v7 -> v8:
- Rebased
- Redirect flow insertion to /dev/null to avoid spat in test.
- Removed inline keyword in stub execute_psample_action function.
v6 -> v7:
- Rebased
- Fixed typo in comment.
v5 -> v6:
- Renamed emit_sample -> psample
- Addressed unused variable and conditionally compilation of function.
v4 -> v5:
- Rebased.
- Removed lefover enum value and wrapped some long lines in selftests.
v3 -> v4:
- Rebased.
- Addressed Jakub's comment on private and unused nla attributes.
v2 -> v3:
- Addressed comments from Simon, Aaron and Ilya.
- Dropped probability propagation in nested sample actions.
- Dropped patch v2's 7/9 in favor of a userspace implementation and
consume skb if emit_sample is the last action, same as we do with
userspace.
- Split ovs-dpctl.py features in independent patches.
v1 -> v2:
- Create a new action ("emit_sample") rather than reuse existing
"sample" one.
- Add probability semantics to psample's sampling rate.
- Store sampling probability in skb's cb area and use it in emit_sample.
- Test combining "emit_sample" with "trunc"
- Drop group_id filtering and tracepoint in psample.
rfc_v2 -> v1:
- Accommodate Ilya's comments.
- Split OVS's attribute in two attributes and simplify internal
handling of psample arguments.
- Extend psample and tc with a user-defined cookie.
- Add a tracepoint to psample to facilitate troubleshooting.
rfc_v1 -> rfc_v2:
- Use psample instead of a new OVS-only multicast group.
- Extend psample and tc with a user-defined cookie.
Adrian Moreno (10):
net: psample: add user cookie
net: sched: act_sample: add action cookie to sample
net: psample: skip packet copy if no listeners
net: psample: allow using rate as probability
net: openvswitch: add psample action
net: openvswitch: store sampling probability in cb.
selftests: openvswitch: add psample action
selftests: openvswitch: add userspace parsing
selftests: openvswitch: parse trunc action
selftests: openvswitch: add psample test
Documentation/netlink/specs/ovs_flow.yaml | 17 ++
include/net/psample.h | 5 +-
include/uapi/linux/openvswitch.h | 31 +-
include/uapi/linux/psample.h | 11 +-
net/openvswitch/Kconfig | 1 +
net/openvswitch/actions.c | 66 ++++-
net/openvswitch/datapath.h | 3 +
net/openvswitch/flow_netlink.c | 32 ++-
net/openvswitch/vport.c | 1 +
net/psample/psample.c | 16 +-
net/sched/act_sample.c | 12 +
.../selftests/net/openvswitch/openvswitch.sh | 115 +++++++-
.../selftests/net/openvswitch/ovs-dpctl.py | 272 +++++++++++++++++-
13 files changed, 566 insertions(+), 16 deletions(-)
--
2.45.2
From: Geliang Tang <tanggeliang(a)kylinos.cn>
v8:
- only patch 8 updated, to fix errors reported by CI.
v7:
- address Martin's comments in v6. (thanks)
- use MAX(opts->backlog, 0) instead of opts->backlog.
- use connect_to_fd_opts instead connect_to_fd.
- more ASSERT_* to check errors.
v6:
- update patch 6 as Daniel suggested. (thanks)
v5:
- keep make_server and make_client as Eduard suggested.
v4:
- a new patch to use make_sockaddr in sockmap_ktls.
- a new patch to close fd in error path in drop_on_reuseport.
- drop make_server() in patch 7.
- drop make_client() too in patch 9.
v3:
- a new patch to add backlog for network_helper_opts.
- use start_server_str in sockmap_ktls now, not start_server.
v2:
- address Eduard's comments in v1. (thanks)
- fix errors reported by CI.
This patch set uses network helpers in sockmap_ktls and sk_lookup, and
drop three local helpers tcp_server(), inetaddr_len() and make_socket()
in them.
Geliang Tang (9):
selftests/bpf: Add backlog for network_helper_opts
selftests/bpf: Use start_server_str in sockmap_ktls
selftests/bpf: Use connect_to_fd_opts in sockmap_ktls
selftests/bpf: Use make_sockaddr in sockmap_ktls
selftests/bpf: Close fd in error path in drop_on_reuseport
selftests/bpf: Use start_server_str in sk_lookup
selftests/bpf: Use connect_to_fd_opts in sk_lookup
selftests/bpf: Use connect_to_addr in sk_lookup
selftests/bpf: Drop make_socket in sk_lookup
tools/testing/selftests/bpf/network_helpers.c | 2 +-
tools/testing/selftests/bpf/network_helpers.h | 4 +
.../selftests/bpf/prog_tests/sk_lookup.c | 152 +++++++-----------
.../selftests/bpf/prog_tests/sockmap_ktls.c | 53 ++----
4 files changed, 76 insertions(+), 135 deletions(-)
--
2.43.0
Hi Shuah,
These are for 6.10, as we just discussed.
Changes since v4:
1) Subject line on patch #2/3: s/mm/vDSO/
2) Added Muhammad's review tag.
Changes since v3:
1. Rebased onto Linux 6.10-rc6+.
Cover letter for v3:
Jason A. Donenfeld, I've added you because I ended up looking through
your latest "implement getrandom() in vDSO" series [1], which also
touches this Makefile, so just a heads up about upcoming (minor) merge
conflicts.
Changes since v2:
1. Added two patches, both of which apply solely to the Makefile.
These provide a smaller, cleaner, and more accurate Makefile.
2. Added Reviewed-by and Tested-by tags for the original patch, which
fixes all of the clang errors and warnings for this selftest.
3. Removed an obsolete blurb from the commit description of the original
patch, now that Valentin Obst LLVM build fix has been merged.
thanks,
John Hubbard
NVIDIA
John Hubbard (3):
selftests/vDSO: fix clang build errors and warnings
selftests/vDSO: remove partially duplicated "all:" target in Makefile
selftests/vDSO: remove duplicate compiler invocations from Makefile
tools/testing/selftests/vDSO/Makefile | 29 ++++++++-----------
tools/testing/selftests/vDSO/parse_vdso.c | 16 ++++++----
.../selftests/vDSO/vdso_standalone_test_x86.c | 18 ++++++++++--
3 files changed, 39 insertions(+), 24 deletions(-)
base-commit: d270dd21bee023ab627f34cfb77a9b89a688492a
--
2.40.1
Hi,
This is basically a resend, with a rebase onto today's latest Linux
main, in order to show that the patches are still relevant and correct.
Changes since v3:
1. Rebased onto Linux 6.10-rc6+.
Cover letter for v3:
Jason A. Donenfeld, I've added you because I ended up looking through
your latest "implement getrandom() in vDSO" series [1], which also
touches this Makefile, so just a heads up about upcoming (minor) merge
conflicts.
Changes since v2:
1. Added two patches, both of which apply solely to the Makefile.
These provide a smaller, cleaner, and more accurate Makefile.
2. Added Reviewed-by and Tested-by tags for the original patch, which
fixes all of the clang errors and warnings for this selftest.
3. Removed an obsolete blurb from the commit description of the original
patch, now that Valentin Obst LLVM build fix has been merged.
John Hubbard (3):
selftests/vDSO: fix clang build errors and warnings
selftests/mm: remove partially duplicated "all:" target in Makefile
selftests/vDSO: remove duplicate compiler invocations from Makefile
tools/testing/selftests/vDSO/Makefile | 29 ++++++++-----------
tools/testing/selftests/vDSO/parse_vdso.c | 16 ++++++----
.../selftests/vDSO/vdso_standalone_test_x86.c | 18 ++++++++++--
3 files changed, 39 insertions(+), 24 deletions(-)
base-commit: 8a9c6c40432e265600232b864f97d7c675e8be52
--
2.45.2
When building with clang, via:
make LLVM=1 -C tools/testing/selftest
...clang warns about an unused irqcount variable. clang is correct: the
variable is incremented and then ignored.
Fix this by deleting the irqcount variable.
Signed-off-by: John Hubbard <jhubbard(a)nvidia.com>
---
Changes since v2:
1) Rebased onto Linux 6.10-rc6+
Changes since the first version:
1) Rebased onto Linux 6.10-rc1
thanks,
John Hubbard
tools/testing/selftests/timers/rtcpie.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/tools/testing/selftests/timers/rtcpie.c b/tools/testing/selftests/timers/rtcpie.c
index 4ef2184f1558..7c07edd0d450 100644
--- a/tools/testing/selftests/timers/rtcpie.c
+++ b/tools/testing/selftests/timers/rtcpie.c
@@ -29,7 +29,7 @@ static const char default_rtc[] = "/dev/rtc0";
int main(int argc, char **argv)
{
- int i, fd, retval, irqcount = 0;
+ int i, fd, retval;
unsigned long tmp, data, old_pie_rate;
const char *rtc = default_rtc;
struct timeval start, end, diff;
@@ -120,7 +120,6 @@ int main(int argc, char **argv)
fprintf(stderr, " %d",i);
fflush(stderr);
- irqcount++;
}
/* Disable periodic interrupts */
base-commit: 8a9c6c40432e265600232b864f97d7c675e8be52
--
2.45.2
These patches aim to make using the openvswitch testsuite more reliable.
These should address the major sources of flakiness in the openvswitch
test suite allowing the CI infrastructure to exercise the openvswitch
module for patch series. There should be no change for users who simply
run the tests (except that patch 3/3 does make some of the debugging a bit
easier by making some output more verbose).
Aaron Conole (3):
selftests: openvswitch: Bump timeout to 15 minutes.
selftests: openvswitch: Attempt to autoload module.
selftests: openvswitch: Be more verbose with selftest debugging.
.../selftests/net/openvswitch/openvswitch.sh | 23 ++++++++++++-------
.../selftests/net/openvswitch/settings | 1 +
2 files changed, 16 insertions(+), 8 deletions(-)
create mode 100644 tools/testing/selftests/net/openvswitch/settings
--
2.45.1
On Fri, Jul 05, 2024 at 11:49:28AM GMT, Adrián Moreno wrote:
> On Thu, Jul 04, 2024 at 10:56:51AM GMT, Adrian Moreno wrote:
> > ** Background **
> > Currently, OVS supports several packet sampling mechanisms (sFlow,
> > per-bridge IPFIX, per-flow IPFIX). These end up being translated into a
> > userspace action that needs to be handled by ovs-vswitchd's handler
> > threads only to be forwarded to some third party application that
> > will somehow process the sample and provide observability on the
> > datapath.
> >
> > A particularly interesting use-case is controller-driven
> > per-flow IPFIX sampling where the OpenFlow controller can add metadata
> > to samples (via two 32bit integers) and this metadata is then available
> > to the sample-collecting system for correlation.
> >
> > ** Problem **
> > The fact that sampled traffic share netlink sockets and handler thread
> > time with upcalls, apart from being a performance bottleneck in the
> > sample extraction itself, can severely compromise the datapath,
> > yielding this solution unfit for highly loaded production systems.
> >
> > Users are left with little options other than guessing what sampling
> > rate will be OK for their traffic pattern and system load and dealing
> > with the lost accuracy.
> >
> > Looking at available infrastructure, an obvious candidated would be
> > to use psample. However, it's current state does not help with the
> > use-case at stake because sampled packets do not contain user-defined
> > metadata.
> >
> > ** Proposal **
> > This series is an attempt to fix this situation by extending the
> > existing psample infrastructure to carry a variable length
> > user-defined cookie.
> >
> > The main existing user of psample is tc's act_sample. It is also
> > extended to forward the action's cookie to psample.
> >
> > Finally, a new OVS action (OVS_SAMPLE_ATTR_PSAMPLE) is created.
> > It accepts a group and an optional cookie and uses psample to
> > multicast the packet and the metadata.
> >
> > --
> > v8 -> v9:
> > - Rebased.
> >
> > v7 -> v8:
> > - Rebased
> > - Redirect flow insertion to /dev/null to avoid spat in test.
> > - Removed inline keyword in stub execute_psample_action function.
> >
> > v6 -> v7:
> > - Rebased
> > - Fixed typo in comment.
> >
> > v5 -> v6:
> > - Renamed emit_sample -> psample
> > - Addressed unused variable and conditionally compilation of function.
> >
> > v4 -> v5:
> > - Rebased.
> > - Removed lefover enum value and wrapped some long lines in selftests.
> >
> > v3 -> v4:
> > - Rebased.
> > - Addressed Jakub's comment on private and unused nla attributes.
> >
> > v2 -> v3:
> > - Addressed comments from Simon, Aaron and Ilya.
> > - Dropped probability propagation in nested sample actions.
> > - Dropped patch v2's 7/9 in favor of a userspace implementation and
> > consume skb if emit_sample is the last action, same as we do with
> > userspace.
> > - Split ovs-dpctl.py features in independent patches.
> >
> > v1 -> v2:
> > - Create a new action ("emit_sample") rather than reuse existing
> > "sample" one.
> > - Add probability semantics to psample's sampling rate.
> > - Store sampling probability in skb's cb area and use it in emit_sample.
> > - Test combining "emit_sample" with "trunc"
> > - Drop group_id filtering and tracepoint in psample.
> >
> > rfc_v2 -> v1:
> > - Accommodate Ilya's comments.
> > - Split OVS's attribute in two attributes and simplify internal
> > handling of psample arguments.
> > - Extend psample and tc with a user-defined cookie.
> > - Add a tracepoint to psample to facilitate troubleshooting.
> >
> > rfc_v1 -> rfc_v2:
> > - Use psample instead of a new OVS-only multicast group.
> > - Extend psample and tc with a user-defined cookie.
> >
> > Adrian Moreno (10):
> > net: psample: add user cookie
> > net: sched: act_sample: add action cookie to sample
> > net: psample: skip packet copy if no listeners
> > net: psample: allow using rate as probability
> > net: openvswitch: add psample action
> > net: openvswitch: store sampling probability in cb.
> > selftests: openvswitch: add psample action
> > selftests: openvswitch: add userspace parsing
> > selftests: openvswitch: parse trunc action
> > selftests: openvswitch: add psample test
> >
> > Documentation/netlink/specs/ovs_flow.yaml | 17 ++
> > include/net/psample.h | 5 +-
> > include/uapi/linux/openvswitch.h | 31 +-
> > include/uapi/linux/psample.h | 11 +-
> > net/openvswitch/Kconfig | 1 +
> > net/openvswitch/actions.c | 66 ++++-
> > net/openvswitch/datapath.h | 3 +
> > net/openvswitch/flow_netlink.c | 32 ++-
> > net/openvswitch/vport.c | 1 +
> > net/psample/psample.c | 16 +-
> > net/sched/act_sample.c | 12 +
> > .../selftests/net/openvswitch/openvswitch.sh | 115 +++++++-
> > .../selftests/net/openvswitch/ovs-dpctl.py | 272 +++++++++++++++++-
> > 13 files changed, 566 insertions(+), 16 deletions(-)
> >
> > --
> > 2.45.2
> >
>
> Hi,
>
> Simon Horman has spotted that openvswitch.sh tests are failing in the
> debug executor:
>
> https://netdev.bots.linux.dev/contest.html?test=openvswitch-sh
>
> The failing tests are two: psample and upcall_interfaces. These two
> tests have a known source of instability (they use "sleep") that make
> them specially unreliable in slow systems.
>
> Aaron and I already discussed this and I'm working on a patch to make
> both tests more robust by adding a wait-and-retry mechanism.
>
> I hope this series can be considered regardless of this flaky tests.
>
Adding more context to explain our situation.
This series has a counterpart in OVS [1]. The state of this other series
is still RFC just because the kernel bits have not yet been merged.
OVS 3.4 "softfreeze" was declared last monday, which excludes from the
release any series that is stil in RFC state.
Given the kernel parts seemed very close to be merged, an exception was
given to the series so we can consider it for inclusion [2].
I hate to put any pressure on already busy maintainers but I would also
dislike missing this OVS release by just one or two days and having
to wait 6 months (OVS release cadence) for it to be available.
Again, I don't want to put pressure on maintainers. If it's not
possible, that's it. I just wanted to voice our timeline constraints.
Thanks for your understanding.
Adrián
[1] https://patchwork.ozlabs.org/project/openvswitch/cover/20240704085710.35384…
[2] https://mail.openvswitch.org/pipermail/ovs-dev/2024-July/415261.html
in the do_setcpu, this function does not need to have a return value,
which is meaningless
Signed-off-by: Liu Jing <liujing(a)cmss.chinamobile.com>
---
tools/testing/selftests/net/msg_zerocopy.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/tools/testing/selftests/net/msg_zerocopy.c b/tools/testing/selftests/net/msg_zerocopy.c
index bdc03a2097e8..0b54f2011449 100644
--- a/tools/testing/selftests/net/msg_zerocopy.c
+++ b/tools/testing/selftests/net/msg_zerocopy.c
@@ -118,7 +118,7 @@ static uint16_t get_ip_csum(const uint16_t *start, int num_words)
return ~sum;
}
-static int do_setcpu(int cpu)
+static void do_setcpu(int cpu)
{
cpu_set_t mask;
@@ -129,7 +129,6 @@ static int do_setcpu(int cpu)
else if (cfg_verbose)
fprintf(stderr, "cpu: %u\n", cpu);
- return 0;
}
static void do_setsockopt(int fd, int level, int optname, int val)
--
2.33.0
Recent CI failures brought my attention to the fact that pmu_counters_test
sometimes fails because it doesn't get any LLC cache misses.
It apparently happens because CLFLUSH can race with CPU prediction.
To attempt to fix this, implement a more aggressive cache flushing - now it is flushed
on each iteration of the measured loop which should at least reduce by order
of magnitude the chance of this happening.
This patch survived more that a day of running in a loop on a Comet Lake machine,
where the test used to fail after about 10-20 minites.
Best regards,
Maxim Levitsky
Maxim Levitsky (1):
KVM: selftests: pmu_counters_test: increase robustness of LLC cache
misses
.../selftests/kvm/x86_64/pmu_counters_test.c | 20 +++++++++----------
1 file changed, 9 insertions(+), 11 deletions(-)
--
2.26.3
From: Geliang Tang <tanggeliang(a)kylinos.cn>
v7:
- address Martin's comments in v6. (thanks)
- use MAX(opts->backlog, 0) instead of opts->backlog.
- use connect_to_fd_opts instead connect_to_fd.
- more ASSERT_* to check errors.
v6:
- update patch 6 as Daniel suggested. (thanks)
v5:
- keep make_server and make_client as Eduard suggested.
v4:
- a new patch to use make_sockaddr in sockmap_ktls.
- a new patch to close fd in error path in drop_on_reuseport.
- drop make_server() in patch 7.
- drop make_client() too in patch 9.
v3:
- a new patch to add backlog for network_helper_opts.
- use start_server_str in sockmap_ktls now, not start_server.
v2:
- address Eduard's comments in v1. (thanks)
- fix errors reported by CI.
This patch set uses network helpers in sockmap_ktls and sk_lookup, and
drop three local helpers tcp_server(), inetaddr_len() and make_socket()
in them.
Geliang Tang (9):
selftests/bpf: Add backlog for network_helper_opts
selftests/bpf: Use start_server_str in sockmap_ktls
selftests/bpf: Use connect_to_fd_opts in sockmap_ktls
selftests/bpf: Use make_sockaddr in sockmap_ktls
selftests/bpf: Close fd in error path in drop_on_reuseport
selftests/bpf: Use start_server_str in sk_lookup
selftests/bpf: Use connect_to_fd_opts in sk_lookup
selftests/bpf: Use connect_to_addr in sk_lookup
selftests/bpf: Drop make_socket in sk_lookup
tools/testing/selftests/bpf/network_helpers.c | 2 +-
tools/testing/selftests/bpf/network_helpers.h | 4 +
.../selftests/bpf/prog_tests/sk_lookup.c | 150 ++++++++----------
.../selftests/bpf/prog_tests/sockmap_ktls.c | 53 ++-----
4 files changed, 77 insertions(+), 132 deletions(-)
--
2.43.0
Hi Linus,
This PR fixes a few kselftests [1]. This has been in linux-next for a week and
rebased to add Mark Brown's Tested-by. The race condition found while writing
this fix is not new and seems specific to UML's hostfs (I also tested against
ext4 and btrfs without being able to trigger this issue).
Feel free to take this PR if you see fit.
Regards,
Mickaël
[1] https://lore.kernel.org/r/9341d4db-5e21-418c-bf9e-9ae2da7877e1@sirena.org.uk
--
The following changes since commit f2661062f16b2de5d7b6a5c42a9a5c96326b8454:
Linux 6.10-rc5 (2024-06-23 17:08:54 -0400)
are available in the Git repository at:
https://git.kernel.org/pub/scm/linux/kernel/git/mic/linux.git tags/kselftest-fix-2024-07-04
for you to fetch changes up to 130e42806773013e9cf32d211922c935ae2df86c:
selftests/harness: Fix tests timeout and race condition (2024-06-28 16:06:03 +0200)
----------------------------------------------------------------
Fix Kselftests timeout and race condition
----------------------------------------------------------------
Mickaël Salaün (1):
selftests/harness: Fix tests timeout and race condition
tools/testing/selftests/kselftest_harness.h | 43 ++++++++++++++++-------------
1 file changed, 24 insertions(+), 19 deletions(-)