From: Björn Töpel <bjorn(a)rivosinc.com>
This effectively is a revert of commit 7a6eb7c34a78 ("selftests: Skip
BPF seftests by default"). At the time when this was added, BPF had
"build time dependencies on cutting edge versions". Since then a
number of BPF capable tests have been included in net, hid, sched_ext.
There is no reason not to include BPF by default in the build.
Remove BPF from the selftests skiplist.
Signed-off-by: Björn Töpel <bjorn(a)rivosinc.com>
---
tools/testing/selftests/Makefile | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index b38199965f99..88f59a5fef96 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -129,10 +129,8 @@ ifeq ($(filter net/lib,$(TARGETS)),)
endif
endif
-# User can optionally provide a TARGETS skiplist. By default we skip
-# BPF since it has cutting edge build time dependencies which require
-# more effort to install.
-SKIP_TARGETS ?= bpf
+# User can optionally provide a TARGETS skiplist.
+SKIP_TARGETS ?=
ifneq ($(SKIP_TARGETS),)
TMP := $(filter-out $(SKIP_TARGETS), $(TARGETS))
override TARGETS := $(TMP)
base-commit: 0c559323bbaabee7346c12e74b497e283aaafef5
--
2.43.0
Hi Linus,
Please pull this kselftest fixes update for Linux 6.12-rc2.
This kselftest fixes update for Linux 6.12-rc2 consists of fixes
to build warnings, install scripts, run-time error path, and
git status cleanups to tests:
-- devices/probe: fix for Python3 regex string syntax warnings
-- clone3: removing unused macro from clone3_cap_checkpoint_restore()
-- vDSO: fix to align getrandom states to cache line
-- core and exec: add missing executables to .gitignore files
-- rtc: change to skip test if /dev/rtc0 can't be accessed
-- timers/posix: fix warn_unused_result in __fatal_error()
-- breakpoints: fix to detect suspend successful condition correctly
-- hid: fix to install required dependencies to run the test
diff is attached.
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit 9852d85ec9d492ebef56dc5f229416c925758edc:
Linux 6.12-rc1 (2024-09-29 15:06:19 -0700)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest tags/linux_kselftest-fixes-6.12-rc2
for you to fetch changes up to c66be905cda24fb782b91053b196bd2e966f95b7:
selftests: breakpoints: use remaining time to check if suspend succeed (2024-10-02 14:37:30 -0600)
----------------------------------------------------------------
linux_kselftest-fixes-6.12-rc2
----------------------------------------------------------------
Alessandro Zanni (1):
kselftest/devices/probe: Fix SyntaxWarning in regex strings for Python3
Ba Jing (1):
clone3: clone3_cap_checkpoint_restore: remove unused MAX_PID_NS_LEVEL macro
Jason A. Donenfeld (1):
selftests: vDSO: align getrandom states to cache line
Javier Carrasco (2):
selftests: core: add unshare_test to gitignore
selftests: exec: update gitignore for load_address
Joseph Jang (1):
selftest: rtc: Check if could access /dev/rtc0 before testing
Shuah Khan (1):
selftests:timers: posix_timers: Fix warn_unused_result in __fatal_error()
Yifei Liu (1):
selftests: breakpoints: use remaining time to check if suspend succeed
Yun Lu (1):
selftest: hid: add missing run-hid-tools-tests.sh
.../testing/selftests/breakpoints/step_after_suspend_test.c | 5 ++++-
.../testing/selftests/clone3/clone3_cap_checkpoint_restore.c | 2 --
tools/testing/selftests/core/.gitignore | 1 +
.../selftests/devices/probe/test_discoverable_devices.py | 4 ++--
tools/testing/selftests/exec/.gitignore | 3 ++-
tools/testing/selftests/hid/Makefile | 2 ++
tools/testing/selftests/rtc/rtctest.c | 11 ++++++++++-
tools/testing/selftests/timers/posix_timers.c | 12 ++++++++----
tools/testing/selftests/vDSO/vdso_test_getrandom.c | 8 +++++---
9 files changed, 34 insertions(+), 14 deletions(-)
----------------------------------------------------------------
v2:
- v1 missed the merge window, so while we're at it...
- split changes into two patches instead of one for readability (#1
removes the ioam selftests, #2 adds the updated ioam selftests)
TL;DR This patch comes from a discussion we had with Jakub and Paolo on
aligning the ioam selftests with its new "tunsrc" feature.
This patch updates the IOAM selftests to support the new "tunsrc"
feature of IOAM. As a consequence, some changes were required. For
example, the IPv6 header must be accessed to check some fields (i.e.,
the source address for the "tunsrc" feature), which is not possible
AFAIK with IPv6 raw sockets. The latter is currently used with
IPV6_RECVHOPOPTS and was introduced by commit 187bbb6968af ("selftests:
ioam: refactoring to align with the fix") to fix an issue. But, we
really need packet sockets actually... which is one of the changes in
this patch (see the description of the topology at the top of ioam6.sh
for explanations). Another change is that all IPv6 addresses used in the
topology are now based on the documentation prefix (2001:db8::/32).
Also, the tests have been improved and there are now many more of them.
Overall, the script is more robust.
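As a rough illustration of why packet sockets help here: a packet socket opened as AF_PACKET/SOCK_RAW delivers the whole frame including the link-level header, so the IPv6 source address sits at a fixed offset and can be checked directly. The sketch below is not taken from ioam6_parser.c. frame_ipv6_src() and the DEMO_* constants are hypothetical names, and it assumes an untagged Ethernet frame that is already in a buffer.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define DEMO_ETH_HLEN      14      /* dst MAC + src MAC + EtherType */
#define DEMO_ETH_P_IPV6    0x86DD  /* EtherType value for IPv6 */
#define DEMO_IPV6_HLEN     40      /* fixed IPv6 header length */
#define DEMO_IPV6_SRC_OFF  8       /* offset of the source address in that header */

/*
 * Hypothetical helper: copy the IPv6 source address out of a raw Ethernet
 * frame. Returns 0 on success, -1 if the frame is too short or not IPv6.
 */
static int frame_ipv6_src(const uint8_t *frame, size_t len,
			  struct in6_addr *src)
{
	uint16_t ethertype;

	if (len < DEMO_ETH_HLEN + DEMO_IPV6_HLEN)
		return -1;

	ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
	if (ethertype != DEMO_ETH_P_IPV6)
		return -1;

	/* The top nibble of the first IPv6 header byte is the version. */
	if ((frame[DEMO_ETH_HLEN] >> 4) != 6)
		return -1;

	memcpy(src, frame + DEMO_ETH_HLEN + DEMO_IPV6_SRC_OFF, sizeof(*src));
	return 0;
}

#ifdef DEMO_MAIN
int main(void)
{
	uint8_t frame[DEMO_ETH_HLEN + DEMO_IPV6_HLEN] = { 0 };
	struct in6_addr want, got;
	char text[INET6_ADDRSTRLEN];

	/* Build a minimal frame whose source is in the documentation prefix. */
	frame[12] = 0x86;
	frame[13] = 0xDD;
	frame[DEMO_ETH_HLEN] = 0x60;
	if (inet_pton(AF_INET6, "2001:db8::1", &want) != 1)
		return 1;
	memcpy(frame + DEMO_ETH_HLEN + DEMO_IPV6_SRC_OFF, &want, sizeof(want));

	if (frame_ipv6_src(frame, sizeof(frame), &got) != 0)
		return 1;
	printf("source: %s\n", inet_ntop(AF_INET6, &got, text, sizeof(text)));
	return 0;
}
#endif
```

Building with -DDEMO_MAIN gives a small standalone driver. A real parser would also have to walk the Hop-by-Hop extension header to reach the IOAM data, which this sketch leaves out.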
Justin Iurman (2):
selftests: net: remove ioam tests
selftests: net: add new ioam tests
tools/testing/selftests/net/ioam6.sh | 1832 +++++++++++++++-----
tools/testing/selftests/net/ioam6_parser.c | 1087 ++++++++----
2 files changed, 2129 insertions(+), 790 deletions(-)
--
2.34.1
Currently, the second bridge command overwrites the first one.
Fix this by adding this VID to the interface behind $swp2.
The one_bridge_two_pvids() test intends to check that there is no
leakage of traffic between bridge ports which have a single VLAN - the
PVID VLAN.
Because of a typo, port $swp1 is configured with a PVID twice (second
command overwrites first), and $swp2 isn't configured at all (and since
the bridge vlan_default_pvid property is set to 0, this port will not
have a PVID at all, so it will drop all untagged and priority-tagged
traffic).
So, instead of testing the configuration that was intended, we are
testing a different one, where one port has PVID 2 and the other has
no PVID. This incorrect version of the test should also pass, but is
ineffective for its purpose, so fix the typo.
This typo has an impact on results of the test,
potentially leading to wrong conclusions regarding
the functionality of a network device.
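To make the reasoning above concrete, here is a toy model of VLAN classification on a VLAN-aware bridge. It is only a sketch under simplifying assumptions: it ignores FDB learning, flooding rules, and STP, and struct port, vlan_add_pvid(), and forwards() are hypothetical names, not kernel code. It shows that both the typo'd and the intended configuration forward nothing, which is why the test passes either way, while only the intended one gives $swp2 a PVID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define DEMO_MAX_VID 4095

struct port {
	int pvid;                        /* 0 models "no PVID" */
	bool member[DEMO_MAX_VID + 1];   /* VLAN membership per VID */
};

/* Models: bridge vlan add dev <port> vid <vid> pvid untagged */
static void vlan_add_pvid(struct port *p, int vid)
{
	p->member[vid] = true;
	p->pvid = vid;	/* a later PVID replaces the earlier one */
}

/*
 * Would a frame entering on "in" be allowed out of "out"?
 * tag == 0 models an untagged frame, which is classified to the PVID.
 */
static bool forwards(const struct port *in, const struct port *out, int tag)
{
	int vid = tag ? tag : in->pvid;

	if (vid == 0 || !in->member[vid])
		return false;	/* dropped on ingress */
	return out->member[vid];
}

#ifdef DEMO_MAIN
int main(void)
{
	static struct port typo1, typo2, ok1, ok2;

	/* Configuration with the typo: both commands hit the first port. */
	vlan_add_pvid(&typo1, 1);
	vlan_add_pvid(&typo1, 2);

	/* Intended configuration: one PVID per port. */
	vlan_add_pvid(&ok1, 1);
	vlan_add_pvid(&ok2, 2);

	printf("typo:     pvids %d/%d, untagged forwarded: %d\n",
	       typo1.pvid, typo2.pvid, forwards(&typo1, &typo2, 0));
	printf("intended: pvids %d/%d, untagged forwarded: %d\n",
	       ok1.pvid, ok2.pvid, forwards(&ok1, &ok2, 0));
	return 0;
}
#endif
```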
The test results:
TEST: Switch ports in VLAN-aware bridge with different PVIDs:
Unicast non-IP untagged [ OK ]
Multicast non-IP untagged [ OK ]
Broadcast non-IP untagged [ OK ]
Unicast IPv4 untagged [ OK ]
Multicast IPv4 untagged [ OK ]
Unicast IPv6 untagged [ OK ]
Multicast IPv6 untagged [ OK ]
Unicast non-IP VID 1 [ OK ]
Multicast non-IP VID 1 [ OK ]
Broadcast non-IP VID 1 [ OK ]
Unicast IPv4 VID 1 [ OK ]
Multicast IPv4 VID 1 [ OK ]
Unicast IPv6 VID 1 [ OK ]
Multicast IPv6 VID 1 [ OK ]
Unicast non-IP VID 4094 [ OK ]
Multicast non-IP VID 4094 [ OK ]
Broadcast non-IP VID 4094 [ OK ]
Unicast IPv4 VID 4094 [ OK ]
Multicast IPv4 VID 4094 [ OK ]
Unicast IPv6 VID 4094 [ OK ]
Multicast IPv6 VID 4094 [ OK ]
Fixes: 476a4f05d9b8 ("selftests: forwarding: add a no_forwarding.sh test")
Reviewed-by: Hangbin Liu <liuhangbin(a)gmail.com>
Reviewed-by: Shuah Khan <skhan(a)linuxfoundation.org>
Signed-off-by: Kacper Ludwinski <kac.ludwinski(a)icloud.com>
---
tools/testing/selftests/net/forwarding/no_forwarding.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
v5:
- Add test results impacted by the changes
- Fix typo in commit message
v4:
- Add revision history of this patch
- Add "Reviewed-by:"
- Limit number of characters in commit to 80
- Add impact explanation to commit message
- Link: https://lore.kernel.org/linux-kselftest/20240930063543.94247-1-kac.ludwinsk…
v3:
- Edit commit message
- Add missing Signed-off-by
- Link: https://lore.kernel.org/linux-kselftest/20240927112824.339-1-kac.ludwinski@…
v2:
- Add missing CCs
- Fix typo in commit message
- Add target name
- Link: https://lore.kernel.org/linux-kselftest/fQknN_r6POzmrp8UVjyA3cknLnB1HB9I_jf…
v1:
- Link: https://lore.kernel.org/linux-kselftest/20240925050539.1906-1-kacper@ludwin…
diff --git a/tools/testing/selftests/net/forwarding/no_forwarding.sh b/tools/testing/selftests/net/forwarding/no_forwarding.sh
index 9e677aa64a06..694ece9ba3a7 100755
--- a/tools/testing/selftests/net/forwarding/no_forwarding.sh
+++ b/tools/testing/selftests/net/forwarding/no_forwarding.sh
@@ -202,7 +202,7 @@ one_bridge_two_pvids()
ip link set $swp2 master br0
bridge vlan add dev $swp1 vid 1 pvid untagged
- bridge vlan add dev $swp1 vid 2 pvid untagged
+ bridge vlan add dev $swp2 vid 2 pvid untagged
run_test "Switch ports in VLAN-aware bridge with different PVIDs"
--
2.43.0
Changes since V1:
- V1: https://lore.kernel.org/cover.1724970211.git.reinette.chatre@intel.com/
- V2 contains the same general solutions to stated problem as V1 but these
are now preceded by more fixes (patches 1 to 5) and improved robustness
(patches 6 to 9) to existing tests before the series gets back
to solving the original problem with more confidence in patches 10 to 13.
- The possibility of making "memflush = false" for CMT test was discussed
during V1. Modifying this setting does not have a significant impact on the
observed results that are already well within acceptable range and this
version thus keeps original default. If performance was a goal it may
be possible to do further experimentation where "memflush = false" could
eliminate the need for the sleep(1) within the test wrapper, but
improving the performance is not a goal of this work.
- (New) Support what seems to be unintended ability for user space to provide
parameters to "fill_buf" by making the parsing robust and only support
changing parameters that are supported to be changed. Drop support for
"write" operation since it has never been measured.
- (New) Improve wraparound handling. (Ilpo)
- (New) A couple of new fixes addressing issues discovered during development.
- (Change from V1) To support fill_buf parameters provided by user space as
well as test specific fill_buf parameters struct fill_buf_param is no longer
just a member of struct resctrl_val_param, instead there could be at most
two instances of struct fill_buf_param, the immutable parameters provided
by user space and the parameters used by individual tests. (Ilpo)
- Please see individual patches for detailed changes.
V1 cover:
The resctrl selftests for Memory Bandwidth Allocation (MBA) and Memory
Bandwidth Monitoring (MBM) are failing on some (for example [1]) Emerald
Rapids systems. The test failures result from the following two
properties of these systems:
1) Emerald Rapids systems can have up to 320MB L3 cache. The resctrl
MBA and MBM selftests measure memory traffic for which a hardcoded
250MB buffer has been sufficient so far. On platforms with L3 cache
larger than the buffer, the buffer fits in the L3 cache and thus
no/very little memory traffic is generated during the "memory
bandwidth" tests.
2) Some platform features, for example RAS features or memory
performance features that generate memory traffic may drive accesses
that are counted differently by performance counters and MBM
respectively, for instance generating "overhead" traffic which is not
counted against any specific RMID. Until now these counting
differences have always been "in the noise". On Emerald Rapids
systems the maximum MBA throttling (10% memory bandwidth)
throttles memory bandwidth to where memory accesses by these other
platform features push the memory bandwidth difference between
memory controller performance counters and resctrl (MBM) beyond the
tests' hardcoded tolerance.
Make the tests more robust against platform variations:
1) Let the buffer used by memory bandwidth tests be guided by the size
of the L3 cache.
2) Larger buffers require longer initialization time before the buffer can
be used for measurement. Rework the tests to ensure that buffer
initialization is complete before measurements start.
3) Do not compare performance counters and MBM measurements at low
bandwidth. The value of "low" is hardcoded to 750MiB based on
measurements on Emerald Rapids, Sapphire Rapids, and Ice Lake
systems. This limit is not applicable to AMD systems since it
only applies to the MBA and MBM tests that are isolated to Intel.
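The first and third mitigations above can be sketched roughly as follows. This is an illustration only, not the code from the series: pick_span(), should_compare(), the factor of 2 applied to the L3 size, and the choice to keep 250MB as a floor are all assumptions, since the cover letter only says the buffer is "guided by" the L3 size. The 250MB default and the 750MiB limit are the values quoted in the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define DEMO_MB            (1024UL * 1024UL)
#define DEMO_DEFAULT_SPAN  (250 * DEMO_MB)  /* historical hardcoded buffer size */
#define DEMO_L3_FACTOR     2                /* assumption, not from the cover letter */
#define DEMO_LOW_BW_MIB    750              /* "low" bandwidth limit from the cover letter */

/* Pick a buffer size that cannot fit in an L3 cache of l3_bytes. */
static size_t pick_span(size_t l3_bytes)
{
	size_t guided = l3_bytes * DEMO_L3_FACTOR;

	return guided > DEMO_DEFAULT_SPAN ? guided : DEMO_DEFAULT_SPAN;
}

/*
 * Only compare memory controller counters with resctrl (MBM) when both
 * readings are at or above the "low" limit. Requiring both to be above
 * the limit is an assumption made for this sketch.
 */
static bool should_compare(unsigned long bw_imc_mib,
			   unsigned long bw_resctrl_mib)
{
	return bw_imc_mib >= DEMO_LOW_BW_MIB &&
	       bw_resctrl_mib >= DEMO_LOW_BW_MIB;
}

#ifdef DEMO_MAIN
int main(void)
{
	/* 320MB is the largest L3 size mentioned in the cover letter. */
	printf("span for 320MB L3: %zu MB\n",
	       pick_span(320 * DEMO_MB) / DEMO_MB);
	printf("compare at 600/900 MiB: %d\n", should_compare(600, 900));
	printf("compare at 800/900 MiB: %d\n", should_compare(800, 900));
	return 0;
}
#endif
```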
[1]
https://ark.intel.com/content/www/us/en/ark/products/237261/intel-xeon-plat…
Reinette Chatre (13):
selftests/resctrl: Make functions only used in same file static
selftests/resctrl: Print accurate buffer size as part of MBM results
selftests/resctrl: Fix memory overflow due to unhandled wraparound
selftests/resctrl: Protect against array overrun during iMC config
parsing
selftests/resctrl: Make wraparound handling obvious
selftests/resctrl: Remove "once" parameter required to be false
selftests/resctrl: Only support measured read operation
selftests/resctrl: Remove unused measurement code
selftests/resctrl: Make benchmark parameter passing robust
selftests/resctrl: Ensure measurements skip initialization of default
benchmark
selftests/resctrl: Use cache size to determine "fill_buf" buffer size
selftests/resctrl: Do not compare performance counters and resctrl at
low bandwidth
selftests/resctrl: Keep results from first test run
tools/testing/selftests/resctrl/cmt_test.c | 37 +-
tools/testing/selftests/resctrl/fill_buf.c | 40 +-
tools/testing/selftests/resctrl/mba_test.c | 52 +-
tools/testing/selftests/resctrl/mbm_test.c | 38 +-
tools/testing/selftests/resctrl/resctrl.h | 73 ++-
.../testing/selftests/resctrl/resctrl_tests.c | 95 +++-
tools/testing/selftests/resctrl/resctrl_val.c | 445 +++++-------------
tools/testing/selftests/resctrl/resctrlfs.c | 17 -
8 files changed, 339 insertions(+), 458 deletions(-)
--
2.46.0
Add Kunit tests for the kernel's implementation of the standard CRC-16
algorithm (<linux/crc16.h>). The test data consists of 100
randomly-generated test cases, validated against a naive CRC-16
implementation.
This test follows roughly the same logic as lib/crc32test.c, but
without the performance measurements.
Signed-off-by: Vinicius Peixoto <vpeixoto(a)lkcamp.dev>
Co-developed-by: Enzo Bertoloti <ebertoloti(a)lkcamp.dev>
Signed-off-by: Enzo Bertoloti <ebertoloti(a)lkcamp.dev>
Co-developed-by: Fabricio Gasperin <fgasperin(a)lkcamp.dev>
Signed-off-by: Fabricio Gasperin <fgasperin(a)lkcamp.dev>
Suggested-by: David Laight <David.Laight(a)ACULAB.COM>
---
Hi all,
This patch was developed during a hackathon organized by LKCAMP [1],
with the objective of writing KUnit tests, both to introduce people to
the kernel development process and to learn about different subsystems
(with the positive side effect of improving the kernel test coverage, of
course).
We noticed there were tests for CRC32 in lib/crc32test.c and thought it
would be nice to have something similar for CRC16, since it seems to be
widely used in network drivers (as well as in some ext4 code).
We would really appreciate any feedback/suggestions on how to improve
this. Thanks! :-)
Link to v1: https://lore.kernel.org/linux-kselftest/20240922232643.535329-1-vpeixoto@lk…
Changes in v2 (suggested by David Laight):
- Use the PRNG from include/linux/prandom.h to generate pseudorandom
data/test cases instead of having them hardcoded as large static
arrays
- Add a naive CRC16 implementation used to validate the kernel's
implementation (instead of having the test case results be hard-coded)
[1] https://lkcamp.dev/about
---
lib/Kconfig.debug | 9 +++
lib/Makefile | 1 +
lib/crc16_kunit.c | 165 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 175 insertions(+)
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 7315f643817ae1021f1e4b3dd27b424f49e3f761..f9617e3054948ce43090f524dc67650e9549cee8 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -2850,6 +2850,15 @@ config USERCOPY_KUNIT_TEST
on the copy_to/from_user infrastructure, making sure basic
user/kernel boundary testing is working.
+config CRC16_KUNIT_TEST
+ tristate "KUnit tests for CRC16"
+ depends on KUNIT
+ default KUNIT_ALL_TESTS
+ select CRC16
+ help
+ Enable this option to run unit tests for the kernel's CRC16
+ implementation (<linux/crc16.h>).
+
config TEST_UDELAY
tristate "udelay test driver"
help
diff --git a/lib/Makefile b/lib/Makefile
index 773adf88af41665b2419202e5427e0513c6becae..1faed6414a85fd366b4966a00e8ba231d7546e14 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -389,6 +389,7 @@ CFLAGS_fortify_kunit.o += $(DISABLE_STRUCTLEAK_PLUGIN)
obj-$(CONFIG_FORTIFY_KUNIT_TEST) += fortify_kunit.o
obj-$(CONFIG_SIPHASH_KUNIT_TEST) += siphash_kunit.o
obj-$(CONFIG_USERCOPY_KUNIT_TEST) += usercopy_kunit.o
+obj-$(CONFIG_CRC16_KUNIT_TEST) += crc16_kunit.o
obj-$(CONFIG_GENERIC_LIB_DEVMEM_IS_ALLOWED) += devmem_is_allowed.o
diff --git a/lib/crc16_kunit.c b/lib/crc16_kunit.c
new file mode 100644
index 0000000000000000000000000000000000000000..7a79989815c451a21210d463729436fcc620d6b3
--- /dev/null
+++ b/lib/crc16_kunit.c
@@ -0,0 +1,165 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * KUnit tests for CRC16.
+ *
+ * Copyright (C) 2024, LKCAMP
+ * Author: Vinicius Peixoto <vpeixoto(a)lkcamp.dev>
+ * Author: Fabricio Gasperin <fgasperin(a)lkcamp.dev>
+ * Author: Enzo Bertoloti <ebertoloti(a)lkcamp.dev>
+ */
+#include <kunit/test.h>
+#include <linux/crc16.h>
+#include <linux/prandom.h>
+
+#define CRC16_KUNIT_DATA_SIZE 4096
+#define CRC16_KUNIT_TEST_SIZE 100
+#define CRC16_KUNIT_SEED 0x12345678
+
+/**
+ * struct crc16_test - CRC16 test data
+ * @crc: initial input value to CRC16
+ * @start: Start index within the data buffer
+ * @length: Length of the data
+ * @crc16: Expected CRC16 value for the test
+ */
+static struct crc16_test {
+ u16 crc;
+ u16 start;
+ u16 length;
+} tests[CRC16_KUNIT_TEST_SIZE];
+
+u8 data[CRC16_KUNIT_DATA_SIZE];
+
+
+/* Naive implementation of CRC16 for validation purposes */
+static inline u16 _crc16_naive_byte(u16 crc, u8 data)
+{
+ u8 i = 0;
+
+ crc ^= (u16) data;
+ for (i = 0; i < 8; i++) {
+ if (crc & 0x01)
+ crc = (crc >> 1) ^ 0xa001;
+ else
+ crc = crc >> 1;
+ }
+
+ return crc;
+}
+
+
+static inline u16 _crc16_naive(u16 crc, u8 *buffer, size_t len)
+{
+ while (len--)
+ crc = _crc16_naive_byte(crc, *buffer++);
+ return crc;
+}
+
+
+/* Small helper for generating pseudorandom 16-bit data */
+static inline u16 _rand16(void)
+{
+ static u32 rand = CRC16_KUNIT_SEED;
+
+ rand = next_pseudo_random32(rand);
+ return rand & 0xFFFF;
+}
+
+
+static int crc16_init_test_data(struct kunit_suite *suite)
+{
+ size_t i;
+
+ /* Fill the data buffer with random bytes */
+ for (i = 0; i < CRC16_KUNIT_DATA_SIZE; i++)
+ data[i] = _rand16() & 0xFF;
+
+ /* Generate random test data while ensuring the random
+ * start + length values won't overflow the 4096-byte
+ * buffer (0x7FF * 2 = 0xFFE < 0x1000)
+ */
+ for (size_t i = 0; i < CRC16_KUNIT_TEST_SIZE; i++) {
+ tests[i].crc = _rand16();
+ tests[i].start = _rand16() & 0x7FF;
+ tests[i].length = _rand16() & 0x7FF;
+ }
+
+ return 0;
+}
+
+/**
+ * crc16_test_empty - Test crc16 with empty data
+ *
+ * Test crc16 with empty data, the result should be the same as the initial crc
+ */
+static void crc16_test_empty(struct kunit *test)
+{
+ u16 crc;
+
+ crc = crc16(0x00, data, 0);
+ KUNIT_EXPECT_EQ(test, crc, 0);
+ crc = crc16(0xFF, data, 0);
+ KUNIT_EXPECT_EQ(test, crc, 0xFF);
+}
+
+/**
+ * crc16_test_correctness - Test crc16
+ *
+ * Test crc16 against a naive implementation
+ */
+static void crc16_test_correctness(struct kunit *test)
+{
+ size_t i;
+ u16 crc, crc_naive;
+
+ for (i = 0; i < CRC16_KUNIT_TEST_SIZE; i++) {
+ crc = crc16(tests[i].crc, data + tests[i].start,
+ tests[i].length);
+ crc_naive = _crc16_naive(tests[i].crc, data + tests[i].start,
+ tests[i].length);
+ KUNIT_EXPECT_EQ(test, crc, crc_naive);
+ }
+}
+
+
+/**
+ * crc16_test_combine - Test split crc16 calculations
+ *
+ * Test crc16 with data split in two parts, the result should be the same as
+ * crc16 with the data combined
+ */
+static void crc16_test_combine(struct kunit *test)
+{
+ size_t i, j;
+ u16 crc, crc_naive;
+
+ for (i = 0; i < CRC16_KUNIT_TEST_SIZE; i++) {
+ crc_naive = crc16(tests[i].crc, data + tests[i].start, tests[i].length);
+ for (j = 0; j < tests[i].length; j++) {
+ crc = crc16(tests[i].crc, data + tests[i].start, j);
+ crc = crc16(crc, data + tests[i].start + j, tests[i].length - j);
+ KUNIT_EXPECT_EQ(test, crc, crc_naive);
+ }
+ }
+}
+
+
+static struct kunit_case crc16_test_cases[] = {
+ KUNIT_CASE(crc16_test_empty),
+ KUNIT_CASE(crc16_test_combine),
+ KUNIT_CASE(crc16_test_correctness),
+ {},
+};
+
+static struct kunit_suite crc16_test_suite = {
+ .name = "crc16",
+ .test_cases = crc16_test_cases,
+ .suite_init = crc16_init_test_data,
+};
+kunit_test_suite(crc16_test_suite);
+
+MODULE_AUTHOR("Fabricio Gasperin <fgasperin(a)lkcamp.dev>");
+MODULE_AUTHOR("Vinicius Peixoto <vpeixoto(a)lkcamp.dev>");
+MODULE_AUTHOR("Enzo Bertoloti <ebertoloti(a)lkcamp.dev>");
+MODULE_DESCRIPTION("Unit tests for crc16");
+MODULE_LICENSE("GPL");
---
base-commit: 9852d85ec9d492ebef56dc5f229416c925758edc
change-id: 20241003-crc16-kunit-127a4dc2b72c
Best regards,
--
Vinicius Peixoto <vpeixoto(a)lkcamp.dev>
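For readers who want to try the validation idea from the CRC16 patch outside the kernel, here is a user-space sketch. It reuses the bit-at-a-time algorithm and the 0xa001 polynomial shown in the patch, and compares it with a table-driven variant. The table-driven function and all names (crc16_naive, crc16_tab, crc16_by_table) are assumptions made for this sketch; it does not link against or reproduce lib/crc16.c. The value 0xBB3D is the commonly published check value for this parameter set (often listed as CRC-16/ARC) over the ASCII string "123456789".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* One byte, bit by bit, with the reflected polynomial 0xa001. */
static uint16_t crc16_naive_byte(uint16_t crc, uint8_t byte)
{
	int i;

	crc ^= byte;
	for (i = 0; i < 8; i++)
		crc = (crc & 1) ? (crc >> 1) ^ 0xa001 : crc >> 1;
	return crc;
}

static uint16_t crc16_naive(uint16_t crc, const uint8_t *buf, size_t len)
{
	while (len--)
		crc = crc16_naive_byte(crc, *buf++);
	return crc;
}

/* crc16_tab[b] holds the CRC of the single byte b with an initial CRC of 0. */
static uint16_t crc16_tab[256];

static void crc16_tab_init(void)
{
	int b;

	for (b = 0; b < 256; b++)
		crc16_tab[b] = crc16_naive_byte(0, (uint8_t)b);
}

/* Table-driven variant; call crc16_tab_init() once before using it. */
static uint16_t crc16_by_table(uint16_t crc, const uint8_t *buf, size_t len)
{
	while (len--)
		crc = (crc >> 8) ^ crc16_tab[(crc ^ *buf++) & 0xff];
	return crc;
}

#ifdef DEMO_MAIN
int main(void)
{
	const uint8_t msg[] = "123456789";
	size_t j;

	crc16_tab_init();
	assert(crc16_naive(0, msg, 9) == 0xBB3D);
	assert(crc16_by_table(0, msg, 9) == 0xBB3D);

	/* Splitting the input at any point must not change the result. */
	for (j = 0; j <= 9; j++) {
		uint16_t crc = crc16_by_table(0x1234, msg, j);

		crc = crc16_by_table(crc, msg + j, 9 - j);
		assert(crc == crc16_naive(0x1234, msg, 9));
	}
	puts("ok");
	return 0;
}
#endif
```

The loop at the end mirrors the idea behind crc16_test_combine() in the patch: a CRC computed over two consecutive parts, feeding the first result in as the initial value for the second, matches the CRC over the whole buffer.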
Add documentation for the kselftests focused on testing devices and
point to it from the kselftest documentation. There are multiple tests
in this category so the aim of this page is to make it clear when to run
each test.
Signed-off-by: Nícolas F. R. A. Prado <nfraprado(a)collabora.com>
---
This patch depends on patch "kselftest: devices: Add test to detect
missing devices" [1], since this patch documents that test.
[1] https://lore.kernel.org/all/20240928-kselftest-dev-exist-v2-1-fab07de6b80b@…
---
Documentation/dev-tools/kselftest.rst | 9 ++++++
Documentation/dev-tools/testing-devices.rst | 47 +++++++++++++++++++++++++++++
2 files changed, 56 insertions(+)
diff --git a/Documentation/dev-tools/kselftest.rst b/Documentation/dev-tools/kselftest.rst
index f3766e326d1e..fdb1df86783a 100644
--- a/Documentation/dev-tools/kselftest.rst
+++ b/Documentation/dev-tools/kselftest.rst
@@ -31,6 +31,15 @@ kselftest runs as a userspace process. Tests that can be written/run in
userspace may wish to use the `Test Harness`_. Tests that need to be
run in kernel space may wish to use a `Test Module`_.
+Documentation on the tests
+==========================
+
+For documentation on the kselftests themselves, see:
+
+.. toctree::
+
+ testing-devices
+
Running the selftests (hotplug tests are run in limited mode)
=============================================================
diff --git a/Documentation/dev-tools/testing-devices.rst b/Documentation/dev-tools/testing-devices.rst
new file mode 100644
index 000000000000..ab26adb99051
--- /dev/null
+++ b/Documentation/dev-tools/testing-devices.rst
@@ -0,0 +1,47 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. Copyright (c) 2024 Collabora Ltd
+
+=============================
+Device testing with kselftest
+=============================
+
+
+There are a few different kselftests available for testing devices generically,
+with some overlap in coverage and different requirements. This document aims to
+give an overview of each one.
+
+Note: Paths in this document are relative to the kselftest folder
+(``tools/testing/selftests``).
+
+Device oriented kselftests:
+
+* Devicetree (``dt``)
+
+ * **Coverage**: Probe status for devices described in Devicetree
+ * **Requirements**: None
+
+* Error logs (``devices/error_logs``)
+
+ * **Coverage**: Error (or more critical) log messages presence coming from any
+ device
+ * **Requirements**: None
+
+* Discoverable bus (``devices/probe``)
+
+ * **Coverage**: Presence and probe status of USB or PCI devices that have been
+ described in the reference file
+ * **Requirements**: Manually describe the devices that should be tested in a
+ YAML reference file (see ``devices/probe/boards/google,spherion.yaml`` for
+ an example)
+
+* Exist (``devices/exist``)
+
+ * **Coverage**: Presence of all devices
+ * **Requirements**: Generate the reference (see ``devices/exist/README.rst``
+ for details) on a known-good kernel
+
+Therefore, the suggestion is to enable the error log and devicetree tests on all
+(DT-based) platforms, since they don't have any requirements. Then to greatly
+improve coverage, generate the reference for each platform and enable the exist
+test. The discoverable bus test can be used to verify the probe status of
+specific USB or PCI devices, but is probably not worth it for most cases.
---
base-commit: cea5425829f77e476b03702426f6b3701299b925
change-id: 20241001-kselftest-device-docs-6c8a411109b5
Best regards,
--
Nícolas F. R. A. Prado <nfraprado(a)collabora.com>
Hi,
Here is v5 of the series to support polling on an event's 'hist' file.
The previous version is here:
https://lore.kernel.org/all/172377544331.67914.7474878424159759789.stgit@de…
This version just updates the comment in poll.c and adds Shuah's
Reviewed-by.
Background
----------
There has been interest in allowing user programs to monitor kernel
events in real time. Ftrace provides the `trace_pipe` interface to wait
on events in the ring buffer, but the reader has to wait until a page
of the ring buffer has been filled with events. We can also peek at the
`trace` file periodically, but that is an inefficient way to monitor
a randomly occurring event.
Overview
--------
This patch set allows the user to `poll` (or `select`, `epoll`) on the
event histogram interface. Each event has its own `hist` file, which
shows the histograms generated by trigger actions. So the user can set
a new hist trigger on any event to be monitored, and poll on the
`hist` file until it is updated.
Two poll events are supported, POLLIN and POLLPRI. POLLIN means that
there is a readable update on the `hist` file, and this event is
flushed only when you call read(). So, this is useful if you want to
read the histogram periodically.
The other event, POLLPRI, is for monitoring a trace event. Like
POLLIN, it is returned when the histogram is updated, but you don't
need to read() the file before using poll() again.
Note that this waits for a histogram update (not event arrival), so
you must first set a histogram on the event.
Usage
-----
Here is an example usage:
----
TRACEFS=/sys/kernel/tracing
EVENT=$TRACEFS/events/sched/sched_process_free
# setup histogram trigger and enable event
echo "hist:key=comm" >> $EVENT/trigger
echo 1 > $EVENT/enable
# Wait for update
poll pri $EVENT/hist
# Event arrived.
echo "process free event is coming"
tail $TRACEFS/trace
----
The 'poll' command is in the selftest patch.
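For illustration, the waiting step could look roughly like the following in C. This is a minimal sketch and not the poll.c from the selftest patch: wait_for_events() is a hypothetical helper, and the driver simply takes the path of a `hist` file as its only argument and waits for POLLPRI with no timeout.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Wait until any of "events" is reported on fd.
 * Returns the reported event mask, 0 on timeout, -1 on error.
 */
static int wait_for_events(int fd, short events, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = events };
	int ret = poll(&pfd, 1, timeout_ms);

	if (ret <= 0)
		return ret;
	return pfd.revents;
}

#ifdef DEMO_MAIN
int main(int argc, char **argv)
{
	int fd, revents;

	if (argc != 2) {
		fprintf(stderr, "usage: %s <path to hist file>\n", argv[0]);
		return 1;
	}

	fd = open(argv[1], O_RDONLY);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	/* Block until the histogram is updated; no read() is needed. */
	revents = wait_for_events(fd, POLLPRI, -1);
	if (revents < 0) {
		perror("poll");
		close(fd);
		return 1;
	}
	printf("revents: 0x%x\n", (unsigned int)revents);
	close(fd);
	return 0;
}
#endif
```

To follow the POLLIN semantics described above instead, the caller would pass POLLIN and read() the file after each wakeup before polling again.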
You can take this series also from here;
https://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git/log/?h=t…
Thank you,
---
Masami Hiramatsu (Google) (3):
tracing/hist: Add poll(POLLIN) support on hist file
tracing/hist: Support POLLPRI event for poll on histogram
selftests/tracing: Add hist poll() support test
include/linux/trace_events.h | 5 +
kernel/trace/trace_events.c | 18 ++++
kernel/trace/trace_events_hist.c | 101 +++++++++++++++++++-
tools/testing/selftests/ftrace/Makefile | 2
tools/testing/selftests/ftrace/poll.c | 74 +++++++++++++++
.../ftrace/test.d/trigger/trigger-hist-poll.tc | 74 +++++++++++++++
6 files changed, 271 insertions(+), 3 deletions(-)
create mode 100644 tools/testing/selftests/ftrace/poll.c
create mode 100644 tools/testing/selftests/ftrace/test.d/trigger/trigger-hist-poll.tc
--
Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Hi
There is a long-standing problem whereby running Intel PT on host and guest
in Host/Guest mode causes VM-Entry failure.
The motivation for this patch set is to provide a fix for stable kernels
prior to the advent of the "Mediated Passthrough vPMU" patch set:
https://lore.kernel.org/kvm/20240801045907.4010984-1-mizhang@google.com/
which would render a large part of the fix unnecessary but likely not be
suitable for backport to stable due to its size and complexity.
Ideally, this patch set would be applied before "Mediated Passthrough vPMU".
Note that the fix does not conflict with "Mediated Passthrough vPMU", it
is just that "Mediated Passthrough vPMU" will make the code to stop and
restart Intel PT unnecessary.
Adrian Hunter (3):
KVM: x86: Fix Intel PT IA32_RTIT_CTL MSR validation
KVM: x86: Fix Intel PT Host/Guest mode when host tracing also
KVM: selftests: Add guest Intel PT test
arch/x86/events/intel/pt.c | 131 ++++++-
arch/x86/events/intel/pt.h | 10 +
arch/x86/include/asm/intel_pt.h | 4 +
arch/x86/kvm/vmx/vmx.c | 26 +-
arch/x86/kvm/vmx/vmx.h | 1 -
tools/testing/selftests/kvm/Makefile | 1 +
.../selftests/kvm/include/x86_64/processor.h | 1 +
tools/testing/selftests/kvm/x86_64/intel_pt.c | 381 +++++++++++++++++++++
8 files changed, 532 insertions(+), 23 deletions(-)
create mode 100644 tools/testing/selftests/kvm/x86_64/intel_pt.c
base-commit: d45aab436cf06544abeeffc607110f559a3af3b4
Regards
Adrian
This series is a cherry-pick on top of v6.12-rc1 from the one I sent
for selftests with other patches that were not net-related:
https://lore.kernel.org/all/20240925-selftests-gitignore-v3-0-9db896474170@…
The patches have not been modified, and the Reviewed-by tags have
been kept.
Signed-off-by: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
---
Javier Carrasco (3):
selftests: net: add msg_oob to gitignore
selftests: net: rds: add include.sh to EXTRA_CLEAN
selftests: net: rds: add gitignore file for include.sh
tools/testing/selftests/net/.gitignore | 1 +
tools/testing/selftests/net/rds/.gitignore | 1 +
tools/testing/selftests/net/rds/Makefile | 2 +-
3 files changed, 3 insertions(+), 1 deletion(-)
---
base-commit: 9852d85ec9d492ebef56dc5f229416c925758edc
change-id: 20240930-net-selftests-gitignore-18b844f29391
Best regards,
--
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
This is a slight change from the fundamentals of HID-BPF.
In theory, HID-BPF is abstract to the kernel itself, and makes
only changes at the HID level (through report descriptors or
events emitted to/from the device).
However, we have seen a few use cases where HID-BPF might interact with
the running kernel when the target device is already handled by a
specific driver.
For example, the XP-Pen/Huion/UC-Logic tablets are handled by
hid-uclogic but this driver is also doing a report descriptor fixup
without checking if the device has already been fixed by HID-BPF.
In the same way, another recent example[0] was when a cheap foot pedal is
used and tricks iPhones and Windows machines by presenting itself as a
known Apple wireless keyboard. The problem is that this fake keyboard is
not presenting a compatible report descriptor and hid-core merges all
device nodes together making libinput ignore the keyboard part for
historical reasons.
This series aims at tackling this problem:
- first, we promote hid_bpf_report_descriptor_fixup to be called before
any driver is even matched for the device
- then we allow hdev->quirks to be written during report_fixup and add a
new quirk to force hid-core to ignore any non hid-generic driver.
Basically, it means that when we insert a BPF program to fix a device,
we can force hid-generic to handle the device, thus preventing
any other kernel driver from tampering with our device.
This branch is on top of the for-6.12/upstream-fixes branch of hid.git.
[0] https://gitlab.freedesktop.org/libinput/libinput/-/issues/1014
Signed-off-by: Benjamin Tissoires <bentiss(a)kernel.org>
---
Changes in v3:
- dropped the last 2 patches with hid-input control, as I'm not 100%
sure of it
- changed the first patch to avoid a double free on cleanup of a device
when a HID-BPF program was attached
- kept Peter's rev-by for all but patches 1 and 6
- Link to v2: https://lore.kernel.org/r/20240910-hid-bpf-hid-generic-v2-0-083dfc189e97@ke…
Changes in v2:
- Refactored the API to not use a new hook but hid_bpf_rdesc_fixup
instead
- Some cleanups in hid-core.c probe() device to not kmemdup the report
descriptor multiple times when it's not required
- I'm still not 100% sure the HID_QUIRK_IGNORE_HIDINPUT is that
required, but I can not think of anything else at the moment to
temporarily disable any driver input device.
- Link to v1: https://lore.kernel.org/r/20240903-hid-bpf-hid-generic-v1-0-9511a565b2da@ke…
---
Benjamin Tissoires (9):
HID: bpf: move HID-BPF report descriptor fixup earlier
HID: core: save one kmemdup during .probe()
HID: core: remove one more kmemdup on .probe()
HID: bpf: allow write access to quirks field in struct hid_device
selftests/hid: add dependency on hid_common.h
selftests/hid: cleanup C tests by adding a common struct uhid_device
selftests/hid: allow to parametrize bus/vid/pid/rdesc on the test device
HID: add per device quirk to force bind to hid-generic
selftests/hid: add test for assigning a given device to hid-generic
drivers/hid/bpf/hid_bpf_dispatch.c | 9 +-
drivers/hid/bpf/hid_bpf_struct_ops.c | 1 +
drivers/hid/hid-core.c | 84 +++++++++---
drivers/hid/hid-generic.c | 3 +
include/linux/hid.h | 20 +--
include/linux/hid_bpf.h | 11 +-
tools/testing/selftests/hid/Makefile | 2 +-
tools/testing/selftests/hid/hid_bpf.c | 151 ++++++++++++++-------
tools/testing/selftests/hid/hid_common.h | 112 ++++++++++-----
tools/testing/selftests/hid/hidraw.c | 36 ++---
tools/testing/selftests/hid/progs/hid.c | 12 ++
.../testing/selftests/hid/progs/hid_bpf_helpers.h | 6 +-
12 files changed, 296 insertions(+), 151 deletions(-)
---
base-commit: acd5f76fd5292c91628e04da83e8b78c986cfa2b
change-id: 20240829-hid-bpf-hid-generic-61579f5b5945
Best regards,
--
Benjamin Tissoires <bentiss(a)kernel.org>
If MPLS is not available in the kernel then skip MPLS tests.
This avoids the test failing in situations where the test is not
supported by the underlying kernel.
In the case where all tests are run, just skip over the MPLS tests
without altering the exit code of the overall test run - there
is only one exit code in this scenario.
In the case where a single test is run, exit with KSFT_SKIP (4).
In both cases log an informative message.
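The skip decision described above can be sketched in C as follows. This is only an illustration of the logic, not code from the patch (which is a shell script): the function name should_skip_mac() and the idea of passing the marker path as a parameter are assumptions made here so the check is easy to exercise.

```c
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper mirroring the skip logic described above.
 * Only the "mpls" MAC type depends on kernel MPLS support; support is
 * detected by the presence of a marker file (in the patch this is
 * /proc/sys/net/mpls/platform_labels).
 *
 * Returns 1 if the test for this MAC type should be skipped, 0 otherwise.
 */
static int should_skip_mac(const char *mac, const char *mpls_marker)
{
	if (strcmp(mac, "mpls") != 0)
		return 0;	/* non-MPLS variants never need the marker */

	return access(mpls_marker, F_OK) != 0;
}
```

A run-all loop would simply continue past a variant for which this returns 1, leaving the overall exit code alone, while a single-test invocation would exit with 4 (KSFT_SKIP).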
Signed-off-by: Simon Horman <horms(a)kernel.org>
---
tools/testing/selftests/bpf/test_tc_tunnel.sh | 22 ++++++++++++++++++++--
1 file changed, 20 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh
index 7989ec608454..71cddabc4ade 100755
--- a/tools/testing/selftests/bpf/test_tc_tunnel.sh
+++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh
@@ -102,6 +102,20 @@ wait_for_port() {
return 1
}
+skip_mac() {
+ if [ "$1" = "mpls" ]; then
+ modprobe mpls_iptunnel || true
+ modprobe mpls_gso || true
+
+ if [ ! -e /proc/sys/net/mpls/platform_labels ]; then
+ echo -e "skip: mpls tunnel not supported by kernel\n"
+ return # true
+ fi
+ fi
+
+ false
+}
+
set -e
# no arguments: automated test, run all
@@ -125,6 +139,8 @@ if [[ "$#" -eq "0" ]]; then
$0 ipv6 ip6vxlan eth 2000
for mac in none mpls eth ; do
+ ! skip_mac "$mac" || continue
+
echo "ip gre $mac"
$0 ipv4 gre $mac 100
@@ -193,6 +209,10 @@ readonly tuntype=$2
readonly mac=$3
readonly datalen=$4
+if skip_mac "$mac"; then
+ exit 4 # KSFT_SKIP=4
+fi
+
echo "encap ${addr1} to ${addr2}, type ${tuntype}, mac ${mac} len ${datalen}"
trap cleanup EXIT
@@ -278,8 +298,6 @@ elif [[ "$tuntype" =~ (gre|vxlan) && "$mac" == "eth" ]]; then
awk '/ether/ { print $2 }')
ip netns exec "${ns2}" ip link set testtun0 address $ethaddr
elif [[ "$mac" == "mpls" ]]; then
- modprobe mpls_iptunnel ||true
- modprobe mpls_gso ||true
ip netns exec "${ns2}" sysctl -qw net.mpls.platform_labels=65536
ip netns exec "${ns2}" ip -f mpls route add 1000 dev lo
ip netns exec "${ns2}" ip link set lo up
This patch allows progs to elide a null check on statically known map
lookup keys. In other words, if the verifier can statically prove that
the lookup will be in-bounds, allow the prog to drop the null check.
This is useful for two reasons:
1. Large numbers of nullness checks (especially when they cannot fail)
unnecessarily push the prog towards BPF_COMPLEXITY_LIMIT_JMP_SEQ.
2. It forms a tighter contract between programmer and verifier.
For (1), bpftrace is starting to make heavier use of percpu scratch
maps. As a result, for user scripts with a large number of unrolled loops,
we are starting to hit jump complexity verification errors. These
percpu lookups cannot fail anyway, as we only use static key values.
Eliding nullness probably results in less work for the verifier as well.
For (2), percpu scratch maps are often used as a larger stack, as the
current stack is limited to 512 bytes. In these situations, it is
desirable for the programmer to express: "this lookup should never fail,
and if it does, it means I messed up the code". By omitting the null
check, the programmer can "ask" the verifier to double check the logic.
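The condition under which a null check can be dropped might be modelled roughly as below. This is a simplified userspace sketch, not the verifier code: struct lookup_info and can_elide_null_check() are hypothetical names, and the real patch has additional checks (for example on the register and stack state listed in the changelog) that are not modelled here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of what is known about one map lookup. */
struct lookup_info {
	bool     is_array_map;	/* keys are indices 0..max_entries-1 */
	bool     key_is_const;	/* key value is statically known */
	uint64_t key;		/* the constant key, valid if key_is_const */
	uint32_t max_entries;	/* number of elements in the map */
};

/*
 * True if the lookup provably cannot return NULL, i.e. the key is a
 * known constant that is in bounds for an array-like map.
 */
static bool can_elide_null_check(const struct lookup_info *li)
{
	if (!li->is_array_map || !li->key_is_const)
		return false;

	return li->key < li->max_entries;
}
```

Under this model a constant key of 0 into a one-element percpu array needs no null check, whereas an out-of-bounds or non-constant key still does.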
Changes in v4:
* Only allow for CAP_BPF
* Add test for stack growing upwards
* Improve comment about stack growing upwards
Changes in v3:
* Check if stack is (erroneously) growing upwards
* Mention in commit message why existing tests needed change
Changes in v2:
* Added a check for when R2 is not a ptr to stack
* Added a check for when stack is uninitialized (no stack slot yet)
* Updated existing tests to account for null elision
* Added test case for when R2 can be both const and non-const
Daniel Xu (2):
bpf: verifier: Support eliding map lookup nullness
bpf: selftests: verifier: Add nullness elision tests
kernel/bpf/verifier.c | 73 ++++++-
tools/testing/selftests/bpf/progs/iters.c | 14 +-
.../selftests/bpf/progs/map_kptr_fail.c | 2 +-
.../bpf/progs/verifier_array_access.c | 183 ++++++++++++++++++
.../selftests/bpf/progs/verifier_map_in_map.c | 2 +-
.../testing/selftests/bpf/verifier/map_kptr.c | 2 +-
6 files changed, 265 insertions(+), 11 deletions(-)
--
2.46.0
When cross building kselftest out-of-tree the following issue can be
seen:
[...]
make[4]: Entering directory
'/src/kernel/linux/tools/testing/selftests/net/lib'
CC csum
/usr/lib/gcc-cross/aarch64-linux-gnu/13/../../../../aarch64-linux-gnu/bin/ld:
cannot open output file /tmp/build/kselftest/net/lib/csum: No such
file or directory
collect2: error: ld returned 1 exit status
[...]
Creating the output build directory before building the targets solves
this issue with building 'net/lib/csum'.
Suggested-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Anders Roxell <anders.roxell(a)linaro.org>
---
tools/testing/selftests/Makefile | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index b38199965f99..05c143bcff6a 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -261,6 +261,7 @@ ifdef INSTALL_PATH
@ret=1; \
for TARGET in $(TARGETS) $(INSTALL_DEP_TARGETS); do \
BUILD_TARGET=$$BUILD/$$TARGET; \
+ mkdir -p $$BUILD_TARGET; \
$(MAKE) OUTPUT=$$BUILD_TARGET -C $$TARGET install \
INSTALL_PATH=$(INSTALL_PATH)/$$TARGET \
SRC_PATH=$(shell readlink -e $$(pwd)) \
--
2.45.2
Rename ip_len to payload_len since the length in this case refers only
to the payload, and not the entire IP packet like for IPv4. While we're
at it, just use the variable directly when calling
recv_verify_packet_udp/tcp.
Signed-off-by: Sean Anderson <sean.anderson(a)linux.dev>
---
tools/testing/selftests/net/lib/csum.c | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/tools/testing/selftests/net/lib/csum.c b/tools/testing/selftests/net/lib/csum.c
index e0a34e5e8dd5..27437590eeb5 100644
--- a/tools/testing/selftests/net/lib/csum.c
+++ b/tools/testing/selftests/net/lib/csum.c
@@ -675,22 +675,20 @@ static int recv_verify_packet_ipv6(void *nh, int len)
{
struct ipv6hdr *ip6h = nh;
uint16_t proto = cfg_encap ? IPPROTO_UDP : cfg_proto;
- uint16_t ip_len;
+ uint16_t payload_len;
if (len < sizeof(*ip6h) || ip6h->nexthdr != proto)
return -1;
- ip_len = ntohs(ip6h->payload_len);
- if (ip_len > len - sizeof(*ip6h))
+ payload_len = ntohs(ip6h->payload_len);
+ if (payload_len > len - sizeof(*ip6h))
return -1;
- len = ip_len;
iph_addr_p = &ip6h->saddr;
-
if (proto == IPPROTO_TCP)
- return recv_verify_packet_tcp(ip6h + 1, len);
+ return recv_verify_packet_tcp(ip6h + 1, payload_len);
else
- return recv_verify_packet_udp(ip6h + 1, len);
+ return recv_verify_packet_udp(ip6h + 1, payload_len);
}
/* return whether auxdata includes TP_STATUS_CSUM_VALID */
--
2.35.1.1320.gc452695387.dirty
From: Amit Cohen <amcohen(a)nvidia.com>
The test runs "devlink reload" explicitly. Instead, it is better to use
devlink_reload() which waits for udev events to be processed. Do not sleep
after reload, as devlink_reload() blocks until all the netdevs are renamed.
Signed-off-by: Amit Cohen <amcohen(a)nvidia.com>
Reviewed-by: Ido Schimmel <idosch(a)nvidia.com>
Signed-off-by: Petr Machata <petrm(a)nvidia.com>
---
tools/testing/selftests/drivers/net/mlxsw/rtnetlink.sh | 10 ++--------
1 file changed, 2 insertions(+), 8 deletions(-)
diff --git a/tools/testing/selftests/drivers/net/mlxsw/rtnetlink.sh b/tools/testing/selftests/drivers/net/mlxsw/rtnetlink.sh
index 893a693ad805..45a569618424 100755
--- a/tools/testing/selftests/drivers/net/mlxsw/rtnetlink.sh
+++ b/tools/testing/selftests/drivers/net/mlxsw/rtnetlink.sh
@@ -186,10 +186,7 @@ bridge_vlan_flags_test()
# If we did not handle references correctly, then this should produce a
# trace
- devlink dev reload "$DEVLINK_DEV"
-
- # Allow netdevices to be re-created following the reload
- sleep 20
+ devlink_reload
log_test "bridge vlan flags"
}
@@ -923,12 +920,9 @@ devlink_reload_test()
# devlink reload can be performed without errors
RET=0
- devlink dev reload "$DEVLINK_DEV"
- check_err $? "devlink reload failed"
+ devlink_reload
log_test "devlink reload - last test"
-
- sleep 20
}
trap cleanup EXIT
--
2.45.0
The testing effort is increasing throughout the community.
The tests are generally merged into the subsystem trees,
and are of relatively narrow interest. The patch volume on
linux-kselftest(a)vger.kernel.org makes it hard to follow
the changes to the framework, and discuss proposals.
Create a new ML for "all" of kselftests (tests and framework),
replacing the old list. Use the old list for framework changes
only. It would cause less churn to create a ML for just the
framework, but I prefer to use the shorter name for the list
which has much more practical use.
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
---
Posting as an RFC because we need to create the new ML.
CC: shuah(a)kernel.org
CC: linux-kselftest(a)vger.kernel.org
CC: workflows(a)vger.kernel.org
---
MAINTAINERS | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index c27f3190737f..9a03dc1c8974 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -12401,6 +12401,18 @@ S: Maintained
Q: https://patchwork.kernel.org/project/linux-kselftest/list/
T: git git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git
F: Documentation/dev-tools/kselftest*
+F: tools/testing/selftests/kselftest/
+F: tools/testing/selftests/lib/
+F: tools/testing/selftests/lib.mk
+F: tools/testing/selftests/Makefile
+F: tools/testing/selftests/*.sh
+F: tools/testing/selftests/*.h
+
+KERNEL SELFTEST TESTS
+M: Shuah Khan <shuah(a)kernel.org>
+M: Shuah Khan <skhan(a)linuxfoundation.org>
+L: linux-kselftest-all(a)vger.kernel.org
+S: Maintained
F: tools/testing/selftests/
KERNEL SMB3 SERVER (KSMBD)
--
2.46.2
The kernel has recently added support for shadow stacks, currently
x86 only using their CET feature, but both arm64 and RISC-V have
equivalent features (GCS and Zicfiss respectively); I am actively
working on GCS[1]. With shadow stacks the hardware maintains an
additional stack containing only the return addresses for branch
instructions which is not generally writeable by userspace and ensures
that any returns are to the recorded addresses. This provides some
protection against ROP attacks and makes it easier to collect call
stacks. These shadow stacks are allocated in the address space of the
userspace process.
Our API for shadow stacks does not currently offer userspace any
flexibility for managing the allocation of shadow stacks for newly
created threads; instead the kernel allocates a new shadow stack with
the same size as the normal stack whenever a thread is created with the
feature enabled. The stacks allocated in this way are freed by the
kernel when the thread exits or shadow stacks are disabled for the
thread. This lack of flexibility and control isn't ideal: in the vast
majority of cases the shadow stack will be over allocated and the
implicit allocation and deallocation is not consistent with other
interfaces. As far as I can tell the interface is done in this manner
mainly because the shadow stack patches were in development since before
clone3() was implemented.
Since clone3() is readily extensible let's add support for specifying a
shadow stack when creating a new thread or process in a similar manner
to how the normal stack is specified, keeping the current implicit
allocation behaviour if one is not specified either with clone3() or
through the use of clone(). The user must provide a shadow stack
address and size; this must point to memory mapped for use as a shadow
stack by map_shadow_stack() with a shadow stack token at the top of the
stack.
Please note that the x86 portions of this code are build tested only; I
don't appear to have a system that can run CET available to me. I have
done testing with an integration into my pending work for GCS. There is
some possibility that the arm64 implementation may require the use of
clone3() and explicit userspace allocation of shadow stacks; this is
still under discussion.
Please further note that the token consumption done by clone3() is not
currently implemented in an atomic fashion, Rick indicated that he would
look into fixing this if people are OK with the implementation.
A new architecture feature Kconfig option for shadow stacks is added
here; this was suggested as part of the review comments for the arm64
GCS series and since we need to detect if shadow stacks are supported it
seemed sensible to roll it in here.
[1] https://lore.kernel.org/r/20231009-arm64-gcs-v6-0-78e55deaa4dd@kernel.org/
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Changes in v9:
- Pull token validation earlier and report problems with an error return
to parent rather than signal delivery to the child.
- Verify that the top of the supplied shadow stack is VM_SHADOW_STACK.
- Rework token validation to only do the page mapping once.
- Drop no longer needed support for testing for signals in selftest.
- Fix typo in comments.
- Link to v8: https://lore.kernel.org/r/20240808-clone3-shadow-stack-v8-0-0acf37caf14c@ke…
Changes in v8:
- Fix token verification with user specified shadow stack.
- Don't track user managed shadow stacks for child processes.
- Link to v7: https://lore.kernel.org/r/20240731-clone3-shadow-stack-v7-0-a9532eebfb1d@ke…
Changes in v7:
- Rebase onto v6.11-rc1.
- Typo fixes.
- Link to v6: https://lore.kernel.org/r/20240623-clone3-shadow-stack-v6-0-9ee7783b1fb9@ke…
Changes in v6:
- Rebase onto v6.10-rc3.
- Ensure we don't try to free the parent shadow stack in error paths of
x86 arch code.
- Spelling fixes in userspace API document.
- Additional cleanups and improvements to the clone3() tests to support
the shadow stack tests.
- Link to v5: https://lore.kernel.org/r/20240203-clone3-shadow-stack-v5-0-322c69598e4b@ke…
Changes in v5:
- Rebase onto v6.8-rc2.
- Rework ABI to have the user allocate the shadow stack memory with
map_shadow_stack() and a token.
- Force inlining of the x86 shadow stack enablement.
- Move shadow stack enablement out into a shared header for reuse by
other tests.
- Link to v4: https://lore.kernel.org/r/20231128-clone3-shadow-stack-v4-0-8b28ffe4f676@ke…
Changes in v4:
- Formatting changes.
- Use a define for minimum shadow stack size and move some basic
validation to fork.c.
- Link to v3: https://lore.kernel.org/r/20231120-clone3-shadow-stack-v3-0-a7b8ed3e2acc@ke…
Changes in v3:
- Rebase onto v6.7-rc2.
- Remove stale shadow_stack in internal kargs.
- If a shadow stack is specified unconditionally use it regardless of
CLONE_ parameters.
- Force enable shadow stacks in the selftest.
- Update changelogs for RISC-V feature rename.
- Link to v2: https://lore.kernel.org/r/20231114-clone3-shadow-stack-v2-0-b613f8681155@ke…
Changes in v2:
- Rebase onto v6.7-rc1.
- Remove ability to provide preallocated shadow stack, just specify the
desired size.
- Link to v1: https://lore.kernel.org/r/20231023-clone3-shadow-stack-v1-0-d867d0b5d4d0@ke…
---
Mark Brown (8):
Documentation: userspace-api: Add shadow stack API documentation
selftests: Provide helper header for shadow stack testing
mm: Introduce ARCH_HAS_USER_SHADOW_STACK
fork: Add shadow stack support to clone3()
selftests/clone3: Remove redundant flushes of output streams
selftests/clone3: Factor more of main loop into test_clone3()
selftests/clone3: Allow tests to flag if -E2BIG is a valid error code
selftests/clone3: Test shadow stack support
Documentation/userspace-api/index.rst | 1 +
Documentation/userspace-api/shadow_stack.rst | 41 ++++
arch/x86/Kconfig | 1 +
arch/x86/include/asm/shstk.h | 11 +-
arch/x86/kernel/process.c | 2 +-
arch/x86/kernel/shstk.c | 103 +++++++---
fs/proc/task_mmu.c | 2 +-
include/linux/mm.h | 2 +-
include/linux/sched/task.h | 18 ++
include/uapi/linux/sched.h | 13 +-
kernel/fork.c | 114 +++++++++--
mm/Kconfig | 6 +
tools/testing/selftests/clone3/clone3.c | 230 ++++++++++++++++++----
tools/testing/selftests/clone3/clone3_selftests.h | 40 +++-
tools/testing/selftests/ksft_shstk.h | 63 ++++++
15 files changed, 560 insertions(+), 87 deletions(-)
---
base-commit: 8400291e289ee6b2bf9779ff1c83a291501f017b
change-id: 20231019-clone3-shadow-stack-15d40d2bf536
Best regards,
--
Mark Brown <broonie(a)kernel.org>
Misbehaving guests can cause bus locks to degrade the performance of
a system. Non-WB (write-back) and misaligned locked RMW
(read-modify-write) instructions are referred to as "bus locks" and
require system wide synchronization among all processors to guarantee
the atomicity. The bus locks can impose notable performance penalties
for all processors within the system.
Support for the Bus Lock Threshold is indicated by CPUID
Fn8000_000A_EDX[29] BusLockThreshold=1, the VMCB provides a Bus Lock
Threshold enable bit and an unsigned 16-bit Bus Lock Threshold count.
VMCB intercept bit
    VMCB Offset    Bits    Function
    14h            5       Intercept bus lock operations
Bus lock threshold count
    VMCB Offset    Bits    Function
    120h           15:0    Bus lock counter
During VMRUN, the bus lock threshold count is fetched and stored in an
internal count register. Prior to executing a bus lock within the
guest, the processor verifies the count in the bus lock register. If
the count is greater than zero, the processor executes the bus lock,
reducing the count. However, if the count is zero, the bus lock
operation is not performed, and instead, a Bus Lock Threshold #VMEXIT
is triggered to transfer control to the Virtual Machine Monitor (VMM).
A Bus Lock Threshold #VMEXIT is reported to the VMM with VMEXIT code
0xA5h, VMEXIT_BUSLOCK. EXITINFO1 and EXITINFO2 are set to 0 on
a VMEXIT_BUSLOCK. On a #VMEXIT, the processor writes the current
value of the Bus Lock Threshold Counter to the VMCB.
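The counter behaviour described above can be illustrated with a small software model. This is only a sketch of the semantics as stated in this cover letter: struct buslock_model and the model_*() functions are hypothetical names, and the struct is not the real VMCB layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the state involved; not the real VMCB layout. */
struct buslock_model {
	uint16_t vmcb_count;		/* Bus Lock Threshold count in the VMCB */
	uint16_t internal_count;	/* processor-internal copy loaded at VMRUN */
};

/* VMRUN: fetch the threshold count into the internal count register. */
static void model_vmrun(struct buslock_model *m)
{
	m->internal_count = m->vmcb_count;
}

/*
 * The guest attempts a bus lock. Returns true if the bus lock was
 * executed, false if a Bus Lock Threshold #VMEXIT was taken instead.
 */
static bool model_guest_bus_lock(struct buslock_model *m)
{
	if (m->internal_count > 0) {
		m->internal_count--;	/* execute the lock, consume one credit */
		return true;
	}

	/* Count is zero: exit to the VMM and write the counter back. */
	m->vmcb_count = m->internal_count;
	return false;
}
```

With a threshold of 2, the model lets two bus locks through and exits on the third, at which point the VMM sees a count of 0 and can decide whether to reload it.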
More details about the Bus Lock Threshold feature can be found in AMD
APM [1].
v1 -> v2
- Incorporated misc review comments from Sean.
- Removed bus_lock_counter module parameter.
- Set the value of bus_lock_counter to zero by default and reload the value by 1
in bus lock exit handler.
- Add documentation for the behavioral difference for KVM_EXIT_BUS_LOCK.
- Improved selftest for buslock to work on SVM and VMX.
- Rewrite the commit messages.
Patches are prepared on kvm-next/next (0cdcc99eeaed)
Testing done:
- Added a selftest for the Bus Lock Threshold functionality.
- The bus lock threshold selftest has been tested on both Intel and AMD platforms.
- Tested the Bus Lock Threshold functionality on SEV and SEV-ES guests.
- Tested the Bus Lock Threshold functionality on nested guests.
v1: https://lore.kernel.org/kvm/20240709175145.9986-4-manali.shukla@amd.com/T/
[1]: AMD64 Architecture Programmer's Manual Pub. 24593, April 2024,
Vol 2, 15.14.5 Bus Lock Threshold.
https://bugzilla.kernel.org/attachment.cgi?id=306250
Manali Shukla (3):
x86/cpu: Add virt tag in /proc/cpuinfo
x86/cpufeatures: Add CPUID feature bit for the Bus Lock Threshold
KVM: X86: Add documentation about behavioral difference for
KVM_EXIT_BUS_LOCK
Nikunj A Dadhania (2):
KVM: SVM: Enable Bus lock threshold exit
KVM: selftests: Add bus lock exit test
Documentation/virt/kvm/api.rst | 5 +
arch/x86/include/asm/cpufeature.h | 1 +
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/svm.h | 5 +-
arch/x86/include/uapi/asm/svm.h | 2 +
arch/x86/kernel/cpu/mkcapflags.sh | 3 +
arch/x86/kernel/cpu/proc.c | 5 +
arch/x86/kvm/svm/nested.c | 12 ++
arch/x86/kvm/svm/svm.c | 29 ++++
tools/testing/selftests/kvm/Makefile | 1 +
.../selftests/kvm/x86_64/kvm_buslock_test.c | 130 ++++++++++++++++++
11 files changed, 193 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/kvm/x86_64/kvm_buslock_test.c
base-commit: 0cdcc99eeaedf2422c80d75760293fdbb476cec1
--
2.34.1
Hi all,
This is part of a hackathon organized by LKCAMP[1], focused on writing
tests using KUnit. We reached out a while ago asking for advice on what
would be a useful contribution[2] and ended up choosing data structures
that did not yet have tests.
This patch adds tests for the llist data structure, defined in
include/linux/llist.h, and is inspired by the KUnit tests for the doubly
linked list in lib/list-test.c[3].
It is important to note that this patch depends on the patch referenced
in [4], as it utilizes the newly created lib/tests/ subdirectory.
[1] https://lkcamp.dev/about/
[2] https://lore.kernel.org/all/Zktnt7rjKryTh9-N@arch/
[3] https://elixir.bootlin.com/linux/latest/source/lib/list-test.c
[4] https://lore.kernel.org/all/20240720181025.work.002-kees@kernel.org/
---
Changes in v3:
- Resolved checkpatch warnings:
- Renamed tests for macros starting with 'for_each'
- Removed link from commit message
- Replaced hardcoded constants with ENTRIES_SIZE
- Updated initialization of llist_node array
- Fixed typos
- Update Kconfig.debug message for llist_kunit
Changes in v2:
- Add MODULE_DESCRIPTION()
- Move the tests from lib/llist_kunit.c to lib/tests/llist_kunit.c
- Change the license from "GPL v2" to "GPL"
Artur Alves (1):
lib/llist_kunit.c: add KUnit tests for llist
lib/Kconfig.debug | 11 ++
lib/tests/Makefile | 1 +
lib/tests/llist_kunit.c | 358 ++++++++++++++++++++++++++++++++++++++++
3 files changed, 370 insertions(+)
create mode 100644 lib/tests/llist_kunit.c
--
2.46.0
The SLUB changes for 6.12 included new kunit tests that resulted in
noisy warnings, which we normally suppress, and a boot lockup in some
configurations in case the kunit tests are built-in.
The warnings are addressed in Patch 1.
The lockups I couldn't reproduce, but inspecting boot initialization
order makes me suspect the tests (which call a few RCU operations) are
being executed a bit too early, before RCU finishes initialization.
Moving the execution later seems to do the trick, so I'd like to ask
kunit folks to ack this change (Patch 2). If RCU folks have any
insights, it would be welcome too.
So these are now fixes for 4e1c44b3db79 ("kunit, slub: add
test_kfree_rcu() and test_leak_destroy()")
Once sent as a full patch, I also want to include comment fixes from
Ulad for kvfree_rcu_queue_batch():
https://lore.kernel.org/all/CA%2BKHdyV%3D0dpJX_v_tcuTQ-_ree-Yb9ch3F_HqfT4Yn…
The plan is to take the fixes via slab tree for a 6.12 rcX.
Signed-off-by: Vlastimil Babka <vbabka(a)suse.cz>
---
Vlastimil Babka (2):
mm, slab: suppress warnings in test_leak_destroy kunit test
kunit: move call to kunit_run_all_tests() after rcu_end_inkernel_boot()
init/main.c | 4 ++--
lib/slub_kunit.c | 4 ++--
mm/slab.h | 6 ++++++
mm/slab_common.c | 5 +++--
mm/slub.c | 5 +++--
5 files changed, 16 insertions(+), 8 deletions(-)
---
base-commit: 9852d85ec9d492ebef56dc5f229416c925758edc
change-id: 20240930-b4-slub-kunit-fix-6fba4d1c1742
Best regards,
--
Vlastimil Babka <vbabka(a)suse.cz>
rxq contains a pointer to the device from where
the redirect happened. Currently, the BPF program
that was executed after a redirect via BPF_MAP_TYPE_DEVMAP*
does not have it set.
Add bugfix and related selftest.
Signed-off-by: Florian Kauer <florian.kauer(a)linutronix.de>
---
Changes in v4:
- return -> goto out_close, thanks Toke
- Link to v3: https://lore.kernel.org/r/20240909-devel-koalo-fix-ingress-ifindex-v3-0-662…
Changes in v3:
- initialize skel to NULL, thanks Stanislav
- Link to v2: https://lore.kernel.org/r/20240906-devel-koalo-fix-ingress-ifindex-v2-0-4ca…
Changes in v2:
- changed fixes tag
- added selftest
- Link to v1: https://lore.kernel.org/r/20240905-devel-koalo-fix-ingress-ifindex-v1-1-d12…
---
Florian Kauer (2):
bpf: devmap: provide rxq after redirect
bpf: selftests: send packet to devmap redirect XDP
kernel/bpf/devmap.c | 11 +-
.../selftests/bpf/prog_tests/xdp_devmap_attach.c | 114 +++++++++++++++++++--
2 files changed, 115 insertions(+), 10 deletions(-)
---
base-commit: 8e69c96df771ab469cec278edb47009351de4da6
change-id: 20240905-devel-koalo-fix-ingress-ifindex-b9293d471db6
Best regards,
--
Florian Kauer <florian.kauer(a)linutronix.de>
step_after_suspend_test fails with device busy error while
writing to /sys/power/state to start suspend. The test believes
it failed to enter suspend state with
$ sudo ./step_after_suspend_test
TAP version 13
Bail out! Failed to enter Suspend state
However, the kernel log shows that the system did in fact suspend
and then wake up later.
[611172.033108] PM: suspend entry (s2idle)
[611172.044940] Filesystems sync: 0.006 seconds
[611172.052254] Freezing user space processes
[611172.059319] Freezing user space processes completed (elapsed 0.001 seconds)
[611172.067920] OOM killer disabled.
[611172.072465] Freezing remaining freezable tasks
[611172.080332] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[611172.089724] printk: Suspending console(s) (use no_console_suspend to debug)
[611172.117126] serial 00:03: disabled
some other hardware get reconnected
[611203.136277] OOM killer enabled.
[611203.140637] Restarting tasks ...
[611203.141135] usb 1-8.1: USB disconnect, device number 7
[611203.141755] done.
[611203.155268] random: crng reseeded on system resumption
[611203.162059] PM: suspend exit
After investigation, I noticed that for the code block
if (write(power_state_fd, "mem", strlen("mem")) != strlen("mem"))
ksft_exit_fail_msg("Failed to enter Suspend state\n");
The write returns -1 with errno set to 16 (EBUSY, device busy).
This is probably because the write has not returned successfully before
the system suspends, and the return value gets clobbered on wake-up.
It may therefore be better to check how much time has passed across
those few instructions to determine whether the suspend actually
happened, since executing those few lines is very unlikely to take
5 seconds.
The timer that wakes the system up is set to expire after 5 seconds and
is not re-armed. If the timer's remaining time is 0 seconds and 0
nanoseconds, the timer has expired and woken the system up. Otherwise,
if any time remains, the system is considered to have failed to enter
the suspend state.
After applying this patch, the test no longer fails by mistakenly
believing that the system did not suspend. It can now continue with the
rest of the test after suspend.
Fixes: bfd092b8c272 ("selftests: breakpoint: add step_after_suspend_test")
Reported-by: Sinadin Shan <sinadin.shan(a)oracle.com>
Signed-off-by: Yifei Liu <yifei.l.liu(a)oracle.com>
---
v4->v5: Remove the above quotes in the first part.
remove the incorrect format which could confuse the git.
---
.../testing/selftests/breakpoints/step_after_suspend_test.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/breakpoints/step_after_suspend_test.c b/tools/testing/selftests/breakpoints/step_after_suspend_test.c
index dfec31fb9b30d..8d275f03e977f 100644
--- a/tools/testing/selftests/breakpoints/step_after_suspend_test.c
+++ b/tools/testing/selftests/breakpoints/step_after_suspend_test.c
@@ -152,7 +152,10 @@ void suspend(void)
if (err < 0)
ksft_exit_fail_msg("timerfd_settime() failed\n");
- if (write(power_state_fd, "mem", strlen("mem")) != strlen("mem"))
+ system("(echo mem > /sys/power/state) 2> /dev/null");
+
+ timerfd_gettime(timerfd, &spec);
+ if (spec.it_value.tv_sec != 0 || spec.it_value.tv_nsec != 0)
ksft_exit_fail_msg("Failed to enter Suspend state\n");
close(timerfd);
--
2.46.0
Thanks to Miroslav, Petr and Marcos for the reviews!
V4:
Use variable for /sys/kernel/debug.
Be consistent with "" around variables.
Fix path in commit message to /sys/kernel/debug/kprobes/enabled.
V3:
Save and restore kprobe state also when test fails, by integrating it
into setup_config() and cleanup().
Rename SYSFS variables in a more logical way.
Sort test modules in alphabetical order.
Rename module description.
V2:
Save and restore kprobe state.
Michael Vetter (3):
selftests: livepatch: rename KLP_SYSFS_DIR to SYSFS_KLP_DIR
selftests: livepatch: save and restore kprobe state
selftests: livepatch: test livepatching a kprobed function
tools/testing/selftests/livepatch/Makefile | 3 +-
.../testing/selftests/livepatch/functions.sh | 19 ++++--
.../selftests/livepatch/test-kprobe.sh | 62 +++++++++++++++++++
.../selftests/livepatch/test_modules/Makefile | 3 +-
.../livepatch/test_modules/test_klp_kprobe.c | 38 ++++++++++++
5 files changed, 117 insertions(+), 8 deletions(-)
create mode 100755 tools/testing/selftests/livepatch/test-kprobe.sh
create mode 100644 tools/testing/selftests/livepatch/test_modules/test_klp_kprobe.c
--
2.46.1
The SLUB changes for 6.12 included new kunit tests that resulted in
noisy warnings, which we normally suppress, and a boot lockup in some
configurations in case the kunit tests are built-in.
The warnings are addressed in Patch 1.
The lockups I couldn't reproduce, but inspecting boot initialization
order makes me suspect that test_kfree_rcu() calls kfree_rcu() too
early, before RCU finishes initialization. Moving the execution later
was tried but broke tests marking their code as __init, so Patch 2 skips
the test when the slub kunit tests are built-in.
So these are now fixes for 4e1c44b3db79 ("kunit, slub: add
test_kfree_rcu() and test_leak_destroy()")
The plan is to take the fixes via slab tree for a 6.12 rcX.
Signed-off-by: Vlastimil Babka <vbabka(a)suse.cz>
---
Changes in v2:
- patch 2 skips the test when built-in instead of moving kunit execution
later
- Link to v1: https://lore.kernel.org/r/20240930-b4-slub-kunit-fix-v1-0-32ca9dbbbc11@suse…
---
Vlastimil Babka (2):
mm, slab: suppress warnings in test_leak_destroy kunit test
slub/kunit: skip test_kfree_rcu when the slub kunit test is built-in
lib/slub_kunit.c | 18 ++++++++++++------
mm/slab.h | 6 ++++++
mm/slab_common.c | 5 +++--
mm/slub.c | 5 +++--
4 files changed, 24 insertions(+), 10 deletions(-)
---
base-commit: 9852d85ec9d492ebef56dc5f229416c925758edc
change-id: 20240930-b4-slub-kunit-fix-6fba4d1c1742
Best regards,
--
Vlastimil Babka <vbabka(a)suse.cz>
Some applications rely on placing data in free bits addresses allocated
by mmap. Various architectures (eg. x86, arm64, powerpc) restrict the
address returned by mmap to be less than the 48-bit address space,
unless the hint address uses more than 47 bits (the 48th bit is reserved
for the kernel address space).
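As a rough illustration of why such applications care about the width
of the address space, the sketch below stores a 16-bit tag in the upper
bits of a 64-bit word that holds a user address. It assumes the address
fits in 47 bits. The helper names (fits_in_47_bits, tag_pack, tag_addr,
tag_value) are hypothetical and are not taken from OpenJDK, Go or this
series.

```c
#include <stdbool.h>
#include <stdint.h>

#define ADDR_BITS 47	/* assumed usable width of a user address */
#define ADDR_MASK ((UINT64_C(1) << ADDR_BITS) - 1)

/* True if no bit at or above bit 47 is set, i.e. the upper bits are free. */
bool fits_in_47_bits(uint64_t addr)
{
	return (addr & ~ADDR_MASK) == 0;
}

/* Store a 16-bit tag in bits 48..63; callers check fits_in_47_bits() first. */
uint64_t tag_pack(uint64_t addr, uint16_t tag)
{
	return (addr & ADDR_MASK) | ((uint64_t)tag << 48);
}

/* Recover the address part of a tagged word. */
uint64_t tag_addr(uint64_t word)
{
	return word & ADDR_MASK;
}

/* Recover the tag part of a tagged word. */
uint16_t tag_value(uint64_t word)
{
	return (uint16_t)(word >> 48);
}
```

If mmap() returned an address with bit 47 or above set, tag_pack()
would silently drop those bits, which is the kind of breakage an
address limit is meant to rule out.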
The riscv architecture needs a way to similarly restrict the virtual
address space. The riscv port of OpenJDK throws an error if it is
run on the 57-bit address space, called sv57 [1]. golang
has a comment that sv57 support is not complete, but there are some
workarounds to get it to mostly work [2].
These applications work on x86 because x86 implicitly restricts mmap()
addresses to 47 bits when the hint address is less than 48 bits.
Instead of implicitly restricting the address space on riscv (or any
current/future architecture), provide a flag to the personality syscall
that can be used to ensure an application works in any arbitrary VA
space. A similar feature has already been implemented by the personality
syscall in ADDR_LIMIT_32BIT.
This flag will also allow seamless compatibility between all
architectures, so applications like Go and OpenJDK that use bits in a
virtual address can request the exact number of bits they need in a
generic way. The flag can be checked inside of vm_unmapped_area() so
that this flag does not have to be handled individually by each
architecture.
Link: https://github.com/openjdk/jdk/blob/f080b4bb8a75284db1b6037f8c00ef3b1ef1add… [1]
Link: https://github.com/golang/go/blob/9e8ea567c838574a0f14538c0bbbd83c3215aa55/… [2]
To: Arnd Bergmann <arnd(a)arndb.de>
To: Richard Henderson <richard.henderson(a)linaro.org>
To: Ivan Kokshaysky <ink(a)jurassic.park.msu.ru>
To: Matt Turner <mattst88(a)gmail.com>
To: Vineet Gupta <vgupta(a)kernel.org>
To: Russell King <linux(a)armlinux.org.uk>
To: Guo Ren <guoren(a)kernel.org>
To: Huacai Chen <chenhuacai(a)kernel.org>
To: WANG Xuerui <kernel(a)xen0n.name>
To: Thomas Bogendoerfer <tsbogend(a)alpha.franken.de>
To: James E.J. Bottomley <James.Bottomley(a)HansenPartnership.com>
To: Helge Deller <deller(a)gmx.de>
To: Michael Ellerman <mpe(a)ellerman.id.au>
To: Nicholas Piggin <npiggin(a)gmail.com>
To: Christophe Leroy <christophe.leroy(a)csgroup.eu>
To: Naveen N Rao <naveen(a)kernel.org>
To: Alexander Gordeev <agordeev(a)linux.ibm.com>
To: Gerald Schaefer <gerald.schaefer(a)linux.ibm.com>
To: Heiko Carstens <hca(a)linux.ibm.com>
To: Vasily Gorbik <gor(a)linux.ibm.com>
To: Christian Borntraeger <borntraeger(a)linux.ibm.com>
To: Sven Schnelle <svens(a)linux.ibm.com>
To: Yoshinori Sato <ysato(a)users.sourceforge.jp>
To: Rich Felker <dalias(a)libc.org>
To: John Paul Adrian Glaubitz <glaubitz(a)physik.fu-berlin.de>
To: David S. Miller <davem(a)davemloft.net>
To: Andreas Larsson <andreas(a)gaisler.com>
To: Thomas Gleixner <tglx(a)linutronix.de>
To: Ingo Molnar <mingo(a)redhat.com>
To: Borislav Petkov <bp(a)alien8.de>
To: Dave Hansen <dave.hansen(a)linux.intel.com>
To: x86(a)kernel.org
To: H. Peter Anvin <hpa(a)zytor.com>
To: Andy Lutomirski <luto(a)kernel.org>
To: Peter Zijlstra <peterz(a)infradead.org>
To: Muchun Song <muchun.song(a)linux.dev>
To: Andrew Morton <akpm(a)linux-foundation.org>
To: Liam R. Howlett <Liam.Howlett(a)oracle.com>
To: Vlastimil Babka <vbabka(a)suse.cz>
To: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
To: Shuah Khan <shuah(a)kernel.org>
To: Christoph Hellwig <hch(a)infradead.org>
To: Michal Hocko <mhocko(a)suse.com>
To: "Kirill A. Shutemov" <kirill(a)shutemov.name>
To: Chris Torek <chris.torek(a)gmail.com>
Cc: linux-arch(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Cc: linux-alpha(a)vger.kernel.org
Cc: linux-snps-arc(a)lists.infradead.org
Cc: linux-arm-kernel(a)lists.infradead.org
Cc: linux-csky(a)vger.kernel.org
Cc: loongarch(a)lists.linux.dev
Cc: linux-mips(a)vger.kernel.org
Cc: linux-parisc(a)vger.kernel.org
Cc: linuxppc-dev(a)lists.ozlabs.org
Cc: linux-s390(a)vger.kernel.org
Cc: linux-sh(a)vger.kernel.org
Cc: sparclinux(a)vger.kernel.org
Cc: linux-mm(a)kvack.org
Cc: linux-kselftest(a)vger.kernel.org
Cc: linux-abi-devel(a)lists.sourceforge.net
Signed-off-by: Charlie Jenkins <charlie(a)rivosinc.com>
Changes in v2:
- Added much greater detail to cover letter
- Removed all code that touched architecture specific code and was able
to factor this out into all generic functions, except for flags that
needed to be added to vm_unmapped_area_info
- Made this an RFC since I have only tested it on riscv and x86
- Link to v1: https://lore.kernel.org/r/20240827-patches-below_hint_mmap-v1-0-46ff2eb9022…
Changes in v3:
- Use a personality flag instead of an mmap flag
- Link to v2: https://lore.kernel.org/r/20240829-patches-below_hint_mmap-v2-0-638a28d9eae…
---
Charlie Jenkins (2):
mm: Add personality flag to limit address to 47 bits
selftests/mm: Create ADDR_LIMIT_47BIT test
include/uapi/linux/personality.h | 1 +
mm/mmap.c | 3 ++
tools/testing/selftests/mm/.gitignore | 1 +
tools/testing/selftests/mm/Makefile | 1 +
tools/testing/selftests/mm/map_47bit_personality.c | 34 ++++++++++++++++++++++
5 files changed, 40 insertions(+)
---
base-commit: 5be63fc19fcaa4c236b307420483578a56986a37
change-id: 20240827-patches-below_hint_mmap-b13d79ae1c55
--
- Charlie
From: Tycho Andersen <tandersen(a)netflix.com>
Zbigniew mentioned at Linux Plumbers that systemd is interested in
switching to execveat() for service execution, but can't, because the
contents of /proc/pid/comm are the file descriptor which was used,
instead of the path to the binary. This makes the output of tools like
top and ps useless, especially in a world where most fds are opened
CLOEXEC so the number is truly meaningless.
Change exec path to fix up /proc/pid/comm in the case where we have
allocated one of these synthetic paths in bprm_init(). This way the actual
exec machinery is unchanged, but cosmetically the comm looks reasonable to
admins investigating things.
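To make the problem concrete, here is a small userspace model of how
the comm ends up as a bare number. It assumes the synthetic path has
the form "/dev/fd/<n>"; the helpers path_basename() and make_comm() are
hypothetical stand-ins for illustration, not the kernel functions.

```c
#include <stddef.h>
#include <string.h>

#define TASK_COMM_LEN 16	/* size of the comm buffer, including the NUL */

/* Userspace stand-in for kbasename(): the text after the last '/'. */
const char *path_basename(const char *path)
{
	const char *slash = strrchr(path, '/');

	return slash ? slash + 1 : path;
}

/* Copy a name into a comm-sized buffer, truncating it if necessary. */
void make_comm(char comm[TASK_COMM_LEN], const char *name)
{
	strncpy(comm, name, TASK_COMM_LEN - 1);
	comm[TASK_COMM_LEN - 1] = '\0';
}
```

With these helpers, the basename of "/dev/fd/3" is just "3", whereas the
name of the file that was actually executed would give something an
admin can recognise.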
Signed-off-by: Tycho Andersen <tandersen(a)netflix.com>
Suggested-by: Zbigniew Jędrzejewski-Szmek <zbyszek(a)in.waw.pl>
CC: Aleksa Sarai <cyphar(a)cyphar.com>
Link: https://github.com/uapi-group/kernel-features#set-comm-field-before-exec
---
v2: * drop the flag, everyone :)
* change the rendered value to f_path.dentry->d_name.name instead of
argv[0], Eric
v3: * fix up subject line, Eric
---
fs/exec.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/fs/exec.c b/fs/exec.c
index dad402d55681..9520359a8dcc 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1416,7 +1416,18 @@ int begin_new_exec(struct linux_binprm * bprm)
set_dumpable(current->mm, SUID_DUMP_USER);
perf_event_exec();
- __set_task_comm(me, kbasename(bprm->filename), true);
+
+ /*
+ * If fdpath was set, execveat() made up a path that will
+ * probably not be useful to admins running ps or similar.
+ * Let's fix it up to be something reasonable.
+ */
+ if (bprm->fdpath) {
+ BUILD_BUG_ON(TASK_COMM_LEN > DNAME_INLINE_LEN);
+ __set_task_comm(me, bprm->file->f_path.dentry->d_name.name, true);
+ } else {
+ __set_task_comm(me, kbasename(bprm->filename), true);
+ }
/* An exec changes our domain. We are no longer part of the thread
group */
base-commit: baeb9a7d8b60b021d907127509c44507539c15e5
--
2.34.1
virtio-net has two uses for hashes: one is RSS and the other is hash
reporting. Conventionally the hash calculation was done by the VMM.
However, computing the hash after the queue was chosen defeats the
purpose of RSS.
Another approach is to use an eBPF steering program. This approach has
another downside: it cannot report the calculated hash due to the
restrictive nature of eBPF.
Introduce code to compute the hashes in the kernel in order to overcome
these challenges.
An alternative solution is to extend the eBPF steering program so that it
can report the hash to userspace, but that is based on context
rewrites, which are in feature freeze. We could adopt kfuncs, but they
would not be UAPIs. We opt for an ioctl to align with other relevant UAPIs
(KVM and vhost_net).
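For readers unfamiliar with RSS, the following is a minimal userspace
sketch of a Toeplitz hash and of a queue lookup through an indirection
table. It is not the code added by this series: the function names
toeplitz_hash() and rss_pick_queue() are made up for illustration, the
key must be at least 4 bytes long, and the table length is assumed to
be a power of two.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Toeplitz hash: for every set bit of the input, XOR in the 32-bit
 * window of the key that starts at the same bit position.
 */
uint32_t toeplitz_hash(const uint8_t *key, size_t key_len,
		       const uint8_t *data, size_t data_len)
{
	uint32_t window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
			  ((uint32_t)key[2] << 8) | key[3];
	size_t next_key_byte = 4;
	uint32_t hash = 0;

	for (size_t i = 0; i < data_len; i++) {
		/* Key bits that slide into the window while this byte is consumed. */
		uint8_t incoming = next_key_byte < key_len ? key[next_key_byte] : 0;

		next_key_byte++;
		for (int bit = 7; bit >= 0; bit--) {
			if (data[i] & (1u << bit))
				hash ^= window;
			window = (window << 1) | ((incoming >> bit) & 1u);
		}
	}
	return hash;
}

/* Pick a queue from an indirection table whose length is a power of two. */
uint16_t rss_pick_queue(uint32_t hash, const uint16_t *table, size_t table_len)
{
	return table[hash & (table_len - 1)];
}
```

The queue is derived from the hash, which is why the hash has to be
available before the queue is chosen.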
The patches for QEMU to use this new feature were submitted as an RFC
and are available at:
https://patchew.org/QEMU/20240915-hash-v3-0-79cb08d28647@daynix.com/
This work was presented at LPC 2024:
https://lpc.events/event/18/contributions/1963/
V1 -> V2:
Changed to introduce a new BPF program type.
Signed-off-by: Akihiko Odaki <akihiko.odaki(a)daynix.com>
---
Changes in v4:
- Moved tun_vnet_hash_ext to if_tun.h.
- Renamed virtio_net_toeplitz() to virtio_net_toeplitz_calc().
- Replaced htons() with cpu_to_be16().
- Changed virtio_net_hash_rss() to return void.
- Reordered variable declarations in virtio_net_hash_rss().
- Removed virtio_net_hdr_v1_hash_from_skb().
- Updated messages of "tap: Pad virtio header with zero" and
"tun: Pad virtio header with zero".
- Fixed vnet_hash allocation size.
- Ensured to free vnet_hash when destructing tun_struct.
- Link to v3: https://lore.kernel.org/r/20240915-rss-v3-0-c630015db082@daynix.com
Changes in v3:
- Reverted back to add ioctl.
- Split patch "tun: Introduce virtio-net hashing feature" into
"tun: Introduce virtio-net hash reporting feature" and
"tun: Introduce virtio-net RSS".
- Changed to reuse hash values computed for automq instead of performing
RSS hashing when hash reporting is requested but RSS is not.
- Extracted relevant data from struct tun_struct to keep it minimal.
- Added kernel-doc.
- Changed to allow calling TUNGETVNETHASHCAP before TUNSETIFF.
- Initialized num_buffers with 1.
- Added a test case for unclassified packets.
- Fixed error handling in tests.
- Changed tests to verify that the queue index will not overflow.
- Rebased.
- Link to v2: https://lore.kernel.org/r/20231015141644.260646-1-akihiko.odaki@daynix.com
---
Akihiko Odaki (9):
skbuff: Introduce SKB_EXT_TUN_VNET_HASH
virtio_net: Add functions for hashing
net: flow_dissector: Export flow_keys_dissector_symmetric
tap: Pad virtio header with zero
tun: Pad virtio header with zero
tun: Introduce virtio-net hash reporting feature
tun: Introduce virtio-net RSS
selftest: tun: Add tests for virtio-net hashing
vhost/net: Support VIRTIO_NET_F_HASH_REPORT
Documentation/networking/tuntap.rst | 7 +
drivers/net/Kconfig | 1 +
drivers/net/tap.c | 2 +-
drivers/net/tun.c | 255 ++++++++++++--
drivers/vhost/net.c | 16 +-
include/linux/if_tun.h | 5 +
include/linux/skbuff.h | 3 +
include/linux/virtio_net.h | 174 +++++++++
include/net/flow_dissector.h | 1 +
include/uapi/linux/if_tun.h | 71 ++++
net/core/flow_dissector.c | 3 +-
net/core/skbuff.c | 4 +
tools/testing/selftests/net/Makefile | 2 +-
tools/testing/selftests/net/tun.c | 666 ++++++++++++++++++++++++++++++++++-
14 files changed, 1170 insertions(+), 40 deletions(-)
---
base-commit: 752ebcbe87aceeb6334e846a466116197711a982
change-id: 20240403-rss-e737d89efa77
Best regards,
--
Akihiko Odaki <akihiko.odaki(a)daynix.com>
This patch allows progs to elide a null check on statically known map
lookup keys. In other words, if the verifier can statically prove that
the lookup will be in-bounds, allow the prog to drop the null check.
This is useful for two reasons:
1. Large numbers of nullness checks (especially when they cannot fail)
unnecessarily push progs towards BPF_COMPLEXITY_LIMIT_JMP_SEQ.
2. It forms a tighter contract between programmer and verifier.
For (1), bpftrace is starting to make heavier use of percpu scratch
maps. As a result, for user scripts with a large number of unrolled loops,
we are starting to hit jump complexity verification errors. These
percpu lookups cannot fail anyway, as we only use static key values.
Eliding nullness checks probably results in less work for the verifier as well.
For (2), percpu scratch maps are often used as a larger stack, as the
current stack is limited to 512 bytes. In these situations, it is
desirable for the programmer to express: "this lookup should never fail,
and if it does, it means I messed up the code". By omitting the null
check, the programmer can "ask" the verifier to double check the logic.
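The idea can be modelled in plain C. The sketch below is only a toy
model under the assumption of an array-like map with max_entries slots;
struct array_map, array_map_lookup() and can_elide_null_check() are
hypothetical names and do not reflect the verifier's actual logic.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy array map: max_entries value slots indexed by a 32-bit key. */
struct array_map {
	uint32_t max_entries;
	uint64_t *values;
};

/* Model of a lookup: returns NULL when the key is out of bounds. */
uint64_t *array_map_lookup(const struct array_map *map, uint32_t key)
{
	return key < map->max_entries ? &map->values[key] : NULL;
}

/*
 * The NULL check after a lookup is redundant only if the key is a
 * statically known constant and that constant is in bounds.
 */
bool can_elide_null_check(const struct array_map *map, bool key_is_const,
			  uint32_t key)
{
	return key_is_const && key < map->max_entries;
}
```

In this model a constant key of 0 on a one-entry map never yields NULL,
while a key that is only known at run time still needs the check.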
Changes in v3:
* Check if stack is (erroneously) growing upwards
* Mention in commit message why existing tests needed change
Changes in v2:
* Added a check for when R2 is not a ptr to stack
* Added a check for when stack is uninitialized (no stack slot yet)
* Updated existing tests to account for null elision
* Added test case for when R2 can be both const and non-const
Daniel Xu (2):
bpf: verifier: Support eliding map lookup nullness
bpf: selftests: verifier: Add nullness elision tests
kernel/bpf/verifier.c | 67 ++++++-
tools/testing/selftests/bpf/progs/iters.c | 14 +-
.../selftests/bpf/progs/map_kptr_fail.c | 2 +-
.../bpf/progs/verifier_array_access.c | 166 ++++++++++++++++++
.../selftests/bpf/progs/verifier_map_in_map.c | 2 +-
.../testing/selftests/bpf/verifier/map_kptr.c | 2 +-
6 files changed, 242 insertions(+), 11 deletions(-)
--
2.46.0
From: Joshua Hahn <joshua.hahn6(a)gmail.com>
v2 -> v3: Signed-off-by & renamed subject for clarity.
v1 -> v2: Edited commit messages for clarity.
Niced CPU usage is a metric reported in host-level /proc/stat, but is
not reported in cgroup-level statistics in cpu.stat. However, when a
host contains multiple tasks across different workloads, it becomes
difficult to gauge how much CPU time is being spent on niced
processes based on /proc/stat alone, since host-level metrics do not
provide this cgroup-level granularity.
Exposing this metric will allow users to accurately probe the niced CPU
metric for each workload, and make more informed decisions when
directing higher priority tasks.
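As a sketch of how a user might consume such a value, the helper below
looks up a key in cpu.stat-style text, where each line is "<key>
<value>". The function name stat_lookup() is hypothetical, and the key
"nice_usec" used in the note after the block is an assumption, since
this cover letter does not name the new field.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Find a "<key> <value>" line in cpu.stat-style text and parse the value. */
bool stat_lookup(const char *text, const char *key, unsigned long long *value)
{
	size_t key_len = strlen(key);
	const char *line = text;

	while (line && *line) {
		/* The key must match the whole first word of the line. */
		if (strncmp(line, key, key_len) == 0 && line[key_len] == ' ')
			return sscanf(line + key_len, "%llu", value) == 1;

		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return false;
}
```

A caller would read a cgroup's cpu.stat into a buffer and call, for
example, stat_lookup(buf, "nice_usec", &usec), treating a false return
as "field not available on this kernel".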
Joshua Hahn (2):
Tracking cgroup-level niced CPU time
Selftests for niced CPU statistics
include/linux/cgroup-defs.h | 1 +
kernel/cgroup/rstat.c | 16 ++++-
tools/testing/selftests/cgroup/test_cpu.c | 72 +++++++++++++++++++++++
3 files changed, 86 insertions(+), 3 deletions(-)
--
2.43.5
The recent addition of support for testing with the x86 specific quirk
KVM_X86_QUIRK_SLOT_ZAP_ALL disabled in the generic memslot tests broke the
build of the KVM selftests for all other architectures:
In file included from include/kvm_util.h:8,
from include/memstress.h:13,
from memslot_modification_stress_test.c:21:
memslot_modification_stress_test.c: In function ‘main’:
memslot_modification_stress_test.c:176:38: error: ‘KVM_X86_QUIRK_SLOT_ZAP_ALL’ undeclared (first use in this function)
176 | KVM_X86_QUIRK_SLOT_ZAP_ALL);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
Add __x86_64__ guard defines to avoid building the relevant code on other
architectures.
Fixes: 61de4c34b51c ("KVM: selftests: Test memslot move in memslot_perf_test with quirk disabled")
Fixes: 218f6415004a ("KVM: selftests: Allow slot modification stress test with quirk disabled")
Reported-by: Aishwarya TCV <aishwarya.tcv(a)arm.com>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
This is obviously disruptive for testing of KVM changes on non-x86
architectures.
---
tools/testing/selftests/kvm/memslot_modification_stress_test.c | 2 ++
tools/testing/selftests/kvm/memslot_perf_test.c | 6 ++++++
2 files changed, 8 insertions(+)
diff --git a/tools/testing/selftests/kvm/memslot_modification_stress_test.c b/tools/testing/selftests/kvm/memslot_modification_stress_test.c
index e3343f0df9e1..c81a84990eab 100644
--- a/tools/testing/selftests/kvm/memslot_modification_stress_test.c
+++ b/tools/testing/selftests/kvm/memslot_modification_stress_test.c
@@ -169,12 +169,14 @@ int main(int argc, char *argv[])
case 'i':
p.nr_iterations = atoi_positive("Number of iterations", optarg);
break;
+#ifdef __x86_64__
case 'q':
p.disable_slot_zap_quirk = true;
TEST_REQUIRE(kvm_check_cap(KVM_CAP_DISABLE_QUIRKS2) &
KVM_X86_QUIRK_SLOT_ZAP_ALL);
break;
+#endif
case 'h':
default:
help(argv[0]);
diff --git a/tools/testing/selftests/kvm/memslot_perf_test.c b/tools/testing/selftests/kvm/memslot_perf_test.c
index 893366982f77..989ffe0d047f 100644
--- a/tools/testing/selftests/kvm/memslot_perf_test.c
+++ b/tools/testing/selftests/kvm/memslot_perf_test.c
@@ -113,7 +113,9 @@ static_assert(ATOMIC_BOOL_LOCK_FREE == 2, "atomic bool is not lockless");
static sem_t vcpu_ready;
static bool map_unmap_verify;
+#ifdef __x86_64__
static bool disable_slot_zap_quirk;
+#endif
static bool verbose;
#define pr_info_v(...) \
@@ -579,8 +581,10 @@ static bool test_memslot_move_prepare(struct vm_data *data,
uint32_t guest_page_size = data->vm->page_size;
uint64_t movesrcgpa, movetestgpa;
+#ifdef __x86_64__
if (disable_slot_zap_quirk)
vm_enable_cap(data->vm, KVM_CAP_DISABLE_QUIRKS2, KVM_X86_QUIRK_SLOT_ZAP_ALL);
+#endif
movesrcgpa = vm_slot2gpa(data, data->nslots - 1);
@@ -971,11 +975,13 @@ static bool parse_args(int argc, char *argv[],
case 'd':
map_unmap_verify = true;
break;
+#ifdef __x86_64__
case 'q':
disable_slot_zap_quirk = true;
TEST_REQUIRE(kvm_check_cap(KVM_CAP_DISABLE_QUIRKS2) &
KVM_X86_QUIRK_SLOT_ZAP_ALL);
break;
+#endif
case 's':
targs->nslots = atoi_paranoid(optarg);
if (targs->nslots <= 1 && targs->nslots != -1) {
---
base-commit: 9852d85ec9d492ebef56dc5f229416c925758edc
change-id: 20240930-kvm-build-breakage-a542f46d78f9
Best regards,
--
Mark Brown <broonie(a)kernel.org>
The HID test cases actually run tests using the run-hid-tools-tests.sh
script. However, if installed with "make install", the run-hid-tools-tests.sh
script will not be copied over, resulting in the following error message.
make -C tools/testing/selftests/ TARGETS=hid install \
INSTALL_PATH=$KSFT_INSTALL_PATH
cd $KSFT_INSTALL_PATH
./run_kselftest.sh -c hid
selftests: hid: hid-core.sh
bash: ./run-hid-tools-tests.sh: No such file or directory
So add the run-hid-tools-tests.sh script to the TEST_FILES in the Makefile.
Signed-off-by: Yun Lu <luyun(a)kylinos.cn>
---
tools/testing/selftests/hid/Makefile | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tools/testing/selftests/hid/Makefile b/tools/testing/selftests/hid/Makefile
index 72be55ac4bdf..38ae31bb07b5 100644
--- a/tools/testing/selftests/hid/Makefile
+++ b/tools/testing/selftests/hid/Makefile
@@ -17,6 +17,8 @@ TEST_PROGS += hid-tablet.sh
TEST_PROGS += hid-usb_crash.sh
TEST_PROGS += hid-wacom.sh
+TEST_FILES := run-hid-tools-tests.sh
+
CXX ?= $(CROSS_COMPILE)g++
HOSTPKG_CONFIG := pkg-config
--
2.27.0
If you wish to utilise a pidfd interface to refer to the current process
(as seen from userland; from the kernel's point of view, the thread group
leader), it is rather cumbersome, requiring something like:
int pidfd = pidfd_open(getpid(), 0);
...
close(pidfd);
Or the equivalent call opening /proc/self. It is more convenient to use a
sentinel value to indicate to an interface that accepts a pidfd that we
simply wish to refer to the current process.
This series introduces such a sentinel, PIDFD_SELF, which can be passed as
the pidfd in this instance rather than having to establish a dummy fd for
this purpose.
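For comparison, a fuller sketch of the existing open/use/close pattern
is shown below. It assumes Linux and uses raw syscalls in case libc
does not provide wrappers. The fallback syscall numbers are the generic
ones and do not apply to every architecture, and probe_self_via_pidfd()
is a hypothetical helper name.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallback numbers from the generic syscall table (assumption). */
#ifndef SYS_pidfd_send_signal
#define SYS_pidfd_send_signal 424
#endif
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434
#endif

/*
 * Probe our own process through a temporary pidfd.
 * Returns 0 on success, -1 if any step fails.
 */
int probe_self_via_pidfd(void)
{
	int pidfd = (int)syscall(SYS_pidfd_open, getpid(), 0);
	long ret;

	if (pidfd < 0)
		return -1;

	/* Signal 0 performs only the existence and permission checks. */
	ret = syscall(SYS_pidfd_send_signal, pidfd, 0, NULL, 0);
	close(pidfd);

	return ret == 0 ? 0 : -1;
}
```

The dummy fd exists only to name the calling process, which is the
overhead a sentinel value avoids.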
The only pidfd interface where this is particularly useful is
process_madvise(), which provides the motivation for this series. However,
as this is a general interface, we ensure that all pidfd interfaces can
handle this correctly.
We ensure that pidfd_send_signal() and pidfd_getfd() work correctly, and
assert as much in selftests. All other interfaces except setns() will work
implicitly with this new interface; however, it doesn't make sense to test
waitid(P_PIDFD, ...) as waiting on ourselves is a blocking operation.
In the case of setns() we explicitly disallow use of PIDFD_SELF as it
doesn't make sense to obtain the namespaces of our own process, and it
would require work to implement this functionality there that would be of
no use.
We also do not provide the ability to utilise PIDFD_SELF in ordinary fd
operations such as open() or poll(), as this would require extensive work
and be of no real use.
Lorenzo Stoakes (3):
pidfd: refactor pidfd_get_pid/to_pid() and de-duplicate pid lookup
pidfd: add PIDFD_SELF sentinel to refer to own process
selftests: pidfd: add tests for PIDFD_SELF
include/linux/pid.h | 43 +++++++++++-
include/uapi/linux/pidfd.h | 3 +
kernel/exit.c | 3 +-
kernel/nsproxy.c | 1 +
kernel/pid.c | 70 +++++++++++++------
kernel/signal.c | 26 ++-----
tools/testing/selftests/pidfd/pidfd.h | 5 ++
.../selftests/pidfd/pidfd_getfd_test.c | 38 ++++++++++
.../selftests/pidfd/pidfd_setns_test.c | 6 ++
tools/testing/selftests/pidfd/pidfd_test.c | 13 ++++
10 files changed, 165 insertions(+), 43 deletions(-)
--
2.46.2