Will noticed that with newer toolchains memcpy() ends up being
implemented with SVE instructions, breaking the signals tests when in
streaming mode. We fixed this by using an open coded version of
OPTIMZER_HIDE_VAR(), but in the process it was noticed that some of the
selftests are using the tools/include headers and it might be nice to
share things there. We also have a custom compiler.h in the BTI tests.
Update the tools/include headers to have what we need, pull them into
the arm64 selftests build and make use of them in the signals and BTI
tests. Since the resulting changes are a bit invasive for a fix we keep
an initial patch using the open coding, updating and replacing that
later.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Changes in v4:
- Roll in a refactoring to include and use the tools/include headers.
- Link to v3: https://lore.kernel.org/r/20230720-arm64-signal-memcpy-fix-v3-1-08aed2385d6…
Changes in v3:
- Open code OPTIMISER_HIDE_VAR() instead of the memory clobber.
- Link to v2: https://lore.kernel.org/r/20230712-arm64-signal-memcpy-fix-v2-1-494f7025caf…
Changes in v2:
- Rebase onto v6.5-rc1.
- Link to v1: https://lore.kernel.org/r/20230628-arm64-signal-memcpy-fix-v1-1-db3e0300829…
---
Mark Brown (6):
kselftest/arm64: Exit streaming mode after collecting signal context
tools compiler.h: Add OPTIMIZER_HIDE_VAR()
tools include: Add some common function attributes
kselftest/arm64: Make the tools/include headers available
kselftest/arm64: Use shared OPTIMZER_HIDE_VAR() definiton
kselftest/arm64: Use the tools/include compiler.h rather than our own
tools/include/linux/compiler.h | 18 +++++++++++++++
tools/testing/selftests/arm64/Makefile | 2 ++
tools/testing/selftests/arm64/bti/compiler.h | 21 -----------------
tools/testing/selftests/arm64/bti/system.c | 4 +---
tools/testing/selftests/arm64/bti/system.h | 4 ++--
tools/testing/selftests/arm64/bti/test.c | 1 -
.../selftests/arm64/signal/test_signals_utils.h | 27 +++++++++++++++++++++-
7 files changed, 49 insertions(+), 28 deletions(-)
---
base-commit: 6eaae198076080886b9e7d57f4ae06fa782f90ef
change-id: 20230628-arm64-signal-memcpy-fix-7de3b3c8fa10
Best regards,
--
Mark Brown <broonie(a)kernel.org>
[ Upstream commit 4acfe3dfde685a5a9eaec5555351918e2d7266a1 ]
Dan Carpenter spotted a race condition in a couple of situations like
these in the test_firmware driver:
static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
u8 val;
int ret;
ret = kstrtou8(buf, 10, &val);
if (ret)
return ret;
mutex_lock(&test_fw_mutex);
*(u8 *)cfg = val;
mutex_unlock(&test_fw_mutex);
/* Always return full write size even if we didn't consume all */
return size;
}
static ssize_t config_num_requests_store(struct device *dev,
struct device_attribute *attr,
const char *buf, size_t count)
{
int rc;
mutex_lock(&test_fw_mutex);
if (test_fw_config->reqs) {
pr_err("Must call release_all_firmware prior to changing config\n");
rc = -EINVAL;
mutex_unlock(&test_fw_mutex);
goto out;
}
mutex_unlock(&test_fw_mutex);
// NOTE: HERE is the race!!! Function can be preempted!
// test_fw_config->reqs can change between the release of
// the lock about and acquire of the lock in the
// test_dev_config_update_u8()
rc = test_dev_config_update_u8(buf, count,
&test_fw_config->num_requests);
out:
return rc;
}
static ssize_t config_read_fw_idx_store(struct device *dev,
struct device_attribute *attr,
const char *buf, size_t count)
{
return test_dev_config_update_u8(buf, count,
&test_fw_config->read_fw_idx);
}
The function test_dev_config_update_u8() is called from both the locked
and the unlocked context, function config_num_requests_store() and
config_read_fw_idx_store() which can both be called asynchronously as
they are driver's methods, while test_dev_config_update_u8() and siblings
change their argument pointed to by u8 *cfg or similar pointer.
To avoid deadlock on test_fw_mutex, the lock is dropped before calling
test_dev_config_update_u8() and re-acquired within test_dev_config_update_u8()
itself, but alas this creates a race condition.
Having two locks wouldn't assure a race-proof mutual exclusion.
This situation is best avoided by the introduction of a new, unlocked
function __test_dev_config_update_u8() which can be called from the locked
context and reducing test_dev_config_update_u8() to:
static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
int ret;
mutex_lock(&test_fw_mutex);
ret = __test_dev_config_update_u8(buf, size, cfg);
mutex_unlock(&test_fw_mutex);
return ret;
}
doing the locking and calling the unlocked primitive, which enables both
locked and unlocked versions without duplication of code.
Fixes: c92316bf8e948 ("test_firmware: add batched firmware tests")
Cc: Luis R. Rodriguez <mcgrof(a)kernel.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Russ Weight <russell.h.weight(a)intel.com>
Cc: Takashi Iwai <tiwai(a)suse.de>
Cc: Tianfei Zhang <tianfei.zhang(a)intel.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Colin Ian King <colin.i.king(a)gmail.com>
Cc: Randy Dunlap <rdunlap(a)infradead.org>
Cc: linux-kselftest(a)vger.kernel.org
Cc: stable(a)vger.kernel.org # v5.4, 4.19, 4.14
Suggested-by: Dan Carpenter <error27(a)gmail.com>
Link: https://lore.kernel.org/r/20230509084746.48259-1-mirsad.todorovac@alu.unizg…
Signed-off-by: Mirsad Todorovac <mirsad.todorovac(a)alu.unizg.hr>
[ This is the patch to fix the racing condition in locking for the 5.4, ]
[ 4.19 and 4.14 stable branches. Not all the fixes from the upstream ]
[ commit apply, but those which do are verbatim equal to those in the ]
[ upstream commit. ]
---
v3:
minor bug fixes in the commit description. no change to the code.
5.4, 4.19 and 4.14 passed build, 5.4 and 4.19 passed kselftest.
unable to boot 4.14, should work (no changes to lib/test_firmware.c).
v2:
bundled locking and ENOSPC patches together.
tested on 5.4 and 4.19 stable.
lib/test_firmware.c | 37 ++++++++++++++++++++++++++++---------
1 file changed, 28 insertions(+), 9 deletions(-)
diff --git a/lib/test_firmware.c b/lib/test_firmware.c
index 38553944e967..92d7195d5b5b 100644
--- a/lib/test_firmware.c
+++ b/lib/test_firmware.c
@@ -301,16 +301,26 @@ static ssize_t config_test_show_str(char *dst,
return len;
}
-static int test_dev_config_update_bool(const char *buf, size_t size,
- bool *cfg)
+static inline int __test_dev_config_update_bool(const char *buf, size_t size,
+ bool *cfg)
{
int ret;
- mutex_lock(&test_fw_mutex);
if (strtobool(buf, cfg) < 0)
ret = -EINVAL;
else
ret = size;
+
+ return ret;
+}
+
+static int test_dev_config_update_bool(const char *buf, size_t size,
+ bool *cfg)
+{
+ int ret;
+
+ mutex_lock(&test_fw_mutex);
+ ret = __test_dev_config_update_bool(buf, size, cfg);
mutex_unlock(&test_fw_mutex);
return ret;
@@ -340,7 +350,7 @@ static ssize_t test_dev_config_show_int(char *buf, int cfg)
return snprintf(buf, PAGE_SIZE, "%d\n", val);
}
-static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
+static inline int __test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
int ret;
long new;
@@ -352,14 +362,23 @@ static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
if (new > U8_MAX)
return -EINVAL;
- mutex_lock(&test_fw_mutex);
*(u8 *)cfg = new;
- mutex_unlock(&test_fw_mutex);
/* Always return full write size even if we didn't consume all */
return size;
}
+static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
+{
+ int ret;
+
+ mutex_lock(&test_fw_mutex);
+ ret = __test_dev_config_update_u8(buf, size, cfg);
+ mutex_unlock(&test_fw_mutex);
+
+ return ret;
+}
+
static ssize_t test_dev_config_show_u8(char *buf, u8 cfg)
{
u8 val;
@@ -392,10 +411,10 @@ static ssize_t config_num_requests_store(struct device *dev,
mutex_unlock(&test_fw_mutex);
goto out;
}
- mutex_unlock(&test_fw_mutex);
- rc = test_dev_config_update_u8(buf, count,
- &test_fw_config->num_requests);
+ rc = __test_dev_config_update_u8(buf, count,
+ &test_fw_config->num_requests);
+ mutex_unlock(&test_fw_mutex);
out:
return rc;
--
2.34.1
[ Upstream commit 4acfe3dfde685a5a9eaec5555351918e2d7266a1 ]
Dan Carpenter spotted a race condition in a couple of situations like
these in the test_firmware driver:
static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
u8 val;
int ret;
ret = kstrtou8(buf, 10, &val);
if (ret)
return ret;
mutex_lock(&test_fw_mutex);
*(u8 *)cfg = val;
mutex_unlock(&test_fw_mutex);
/* Always return full write size even if we didn't consume all */
return size;
}
static ssize_t config_num_requests_store(struct device *dev,
struct device_attribute *attr,
const char *buf, size_t count)
{
int rc;
mutex_lock(&test_fw_mutex);
if (test_fw_config->reqs) {
pr_err("Must call release_all_firmware prior to changing config\n");
rc = -EINVAL;
mutex_unlock(&test_fw_mutex);
goto out;
}
mutex_unlock(&test_fw_mutex);
// NOTE: HERE is the race!!! Function can be preempted!
// test_fw_config->reqs can change between the release of
// the lock about and acquire of the lock in the
// test_dev_config_update_u8()
rc = test_dev_config_update_u8(buf, count,
&test_fw_config->num_requests);
out:
return rc;
}
static ssize_t config_read_fw_idx_store(struct device *dev,
struct device_attribute *attr,
const char *buf, size_t count)
{
return test_dev_config_update_u8(buf, count,
&test_fw_config->read_fw_idx);
}
The function test_dev_config_update_u8() is called from both the locked
and the unlocked context, function config_num_requests_store() and
config_read_fw_idx_store() which can both be called asynchronously as
they are driver's methods, while test_dev_config_update_u8() and siblings
change their argument pointed to by u8 *cfg or similar pointer.
To avoid deadlock on test_fw_mutex, the lock is dropped before calling
test_dev_config_update_u8() and re-acquired within test_dev_config_update_u8()
itself, but alas this creates a race condition.
Having two locks wouldn't assure a race-proof mutual exclusion.
This situation is best avoided by the introduction of a new, unlocked
function __test_dev_config_update_u8() which can be called from the locked
context and reducing test_dev_config_update_u8() to:
static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
int ret;
mutex_lock(&test_fw_mutex);
ret = __test_dev_config_update_u8(buf, size, cfg);
mutex_unlock(&test_fw_mutex);
return ret;
}
doing the locking and calling the unlocked primitive, which enables both
locked and unlocked versions without duplication of code.
Fixes: c92316bf8e948 ("test_firmware: add batched firmware tests")
Cc: Luis R. Rodriguez <mcgrof(a)kernel.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Russ Weight <russell.h.weight(a)intel.com>
Cc: Takashi Iwai <tiwai(a)suse.de>
Cc: Tianfei Zhang <tianfei.zhang(a)intel.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Colin Ian King <colin.i.king(a)gmail.com>
Cc: Randy Dunlap <rdunlap(a)infradead.org>
Cc: linux-kselftest(a)vger.kernel.org
Cc: stable(a)vger.kernel.org # v5.4, 4.19, 4.14
Suggested-by: Dan Carpenter <error27(a)gmail.com>
Link: https://lore.kernel.org/r/20230509084746.48259-1-mirsad.todorovac@alu.unizg…
Signed-off-by: Mirsad Todorovac <mirsad.todorovac(a)alu.unizg.hr>
[ This is the patch to fix the racing condition in locking for the 5.4, ]
[ 4.19 and 4.14 stable branches. Not all the fixes from the upstream ]
[ commit apply, but those which do are verbatim equal to those in the ]
[ upstream commit. ]
---
v4:
minor versioning clarifications for the patchwork. no changes to the commit.
v3:
fixed a minor typo. no change to commit.
v2:
tested on 5.4 stable build.
lib/test_firmware.c | 37 ++++++++++++++++++++++++++++---------
1 file changed, 28 insertions(+), 9 deletions(-)
diff --git a/lib/test_firmware.c b/lib/test_firmware.c
index 38553944e967..92d7195d5b5b 100644
--- a/lib/test_firmware.c
+++ b/lib/test_firmware.c
@@ -301,16 +301,26 @@ static ssize_t config_test_show_str(char *dst,
return len;
}
-static int test_dev_config_update_bool(const char *buf, size_t size,
- bool *cfg)
+static inline int __test_dev_config_update_bool(const char *buf, size_t size,
+ bool *cfg)
{
int ret;
- mutex_lock(&test_fw_mutex);
if (strtobool(buf, cfg) < 0)
ret = -EINVAL;
else
ret = size;
+
+ return ret;
+}
+
+static int test_dev_config_update_bool(const char *buf, size_t size,
+ bool *cfg)
+{
+ int ret;
+
+ mutex_lock(&test_fw_mutex);
+ ret = __test_dev_config_update_bool(buf, size, cfg);
mutex_unlock(&test_fw_mutex);
return ret;
@@ -340,7 +350,7 @@ static ssize_t test_dev_config_show_int(char *buf, int cfg)
return snprintf(buf, PAGE_SIZE, "%d\n", val);
}
-static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
+static inline int __test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
int ret;
long new;
@@ -352,14 +362,23 @@ static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
if (new > U8_MAX)
return -EINVAL;
- mutex_lock(&test_fw_mutex);
*(u8 *)cfg = new;
- mutex_unlock(&test_fw_mutex);
/* Always return full write size even if we didn't consume all */
return size;
}
+static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
+{
+ int ret;
+
+ mutex_lock(&test_fw_mutex);
+ ret = __test_dev_config_update_u8(buf, size, cfg);
+ mutex_unlock(&test_fw_mutex);
+
+ return ret;
+}
+
static ssize_t test_dev_config_show_u8(char *buf, u8 cfg)
{
u8 val;
@@ -392,10 +411,10 @@ static ssize_t config_num_requests_store(struct device *dev,
mutex_unlock(&test_fw_mutex);
goto out;
}
- mutex_unlock(&test_fw_mutex);
- rc = test_dev_config_update_u8(buf, count,
- &test_fw_config->num_requests);
+ rc = __test_dev_config_update_u8(buf, count,
+ &test_fw_config->num_requests);
+ mutex_unlock(&test_fw_mutex);
out:
return rc;
--
2.34.1
[ commit be37bed754ed90b2655382f93f9724b3c1aae847 upstream ]
Dan Carpenter spotted that test_fw_config->reqs will be leaked if
trigger_batched_requests_store() is called two or more times.
The same appears with trigger_batched_requests_async_store().
This bug wasn't triggered by the tests, but observed by Dan's visual
inspection of the code.
The recommended workaround was to return -EBUSY if test_fw_config->reqs
is already allocated.
Fixes: c92316bf8e94 ("test_firmware: add batched firmware tests")
Cc: Luis Chamberlain <mcgrof(a)kernel.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Russ Weight <russell.h.weight(a)intel.com>
Cc: Tianfei Zhang <tianfei.zhang(a)intel.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Colin Ian King <colin.i.king(a)gmail.com>
Cc: Randy Dunlap <rdunlap(a)infradead.org>
Cc: linux-kselftest(a)vger.kernel.org
Cc: stable(a)vger.kernel.org # v4.19
Suggested-by: Dan Carpenter <error27(a)gmail.com>
Suggested-by: Takashi Iwai <tiwai(a)suse.de>
Link: https://lore.kernel.org/r/20230509084746.48259-2-mirsad.todorovac@alu.unizg…
Signed-off-by: Mirsad Todorovac <mirsad.todorovac(a)alu.unizg.hr>
[ This is a backport to v4.19 stable branch without a change in code from the 5.4+ patch ]
---
v2:
no changes to commit. minor clarifications with versioning for the patchwork.
v1:
patch sumbmitted verbatim from the 5.4+ branch to 4.19
lib/test_firmware.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/lib/test_firmware.c b/lib/test_firmware.c
index f4cc874021da..e4688821eab8 100644
--- a/lib/test_firmware.c
+++ b/lib/test_firmware.c
@@ -618,6 +618,11 @@ static ssize_t trigger_batched_requests_store(struct device *dev,
mutex_lock(&test_fw_mutex);
+ if (test_fw_config->reqs) {
+ rc = -EBUSY;
+ goto out_bail;
+ }
+
test_fw_config->reqs =
vzalloc(array3_size(sizeof(struct test_batched_req),
test_fw_config->num_requests, 2));
@@ -721,6 +726,11 @@ ssize_t trigger_batched_requests_async_store(struct device *dev,
mutex_lock(&test_fw_mutex);
+ if (test_fw_config->reqs) {
+ rc = -EBUSY;
+ goto out_bail;
+ }
+
test_fw_config->reqs =
vzalloc(array3_size(sizeof(struct test_batched_req),
test_fw_config->num_requests, 2));
--
2.34.1
The openvswitch selftests currently contain a few cases for managing the
datapath, which includes creating datapath instances, adding interfaces,
and doing some basic feature / upcall tests. This is useful to validate
the control path.
Add the ability to program some of the more common flows with actions. This
can be improved overtime to include regression testing, etc.
v2->v3:
1. Dropped support for ipv6 in nat() case
2. Fixed a spelling mistake in 2/5 commit message.
v1->v2:
1. Fix issue when parsing ipv6 in the NAT action
2. Fix issue calculating length during ctact parsing
3. Fix error message when invalid bridge is passed
4. Fold in Adrian's patch to support key masks
Aaron Conole (4):
selftests: openvswitch: add an initial flow programming case
selftests: openvswitch: add a test for ipv4 forwarding
selftests: openvswitch: add basic ct test case parsing
selftests: openvswitch: add ct-nat test case with ipv4
Adrian Moreno (1):
selftests: openvswitch: support key masks
.../selftests/net/openvswitch/openvswitch.sh | 223 +++++++
.../selftests/net/openvswitch/ovs-dpctl.py | 588 +++++++++++++++++-
2 files changed, 787 insertions(+), 24 deletions(-)
--
2.40.1
Submit the top-level headers also from the kunit test module notifier
initialization callback, so external tools that are parsing dmesg for
kunit test output are able to tell how many test suites should be expected
and whether to continue parsing after complete output from the first test
suite is collected.
Extend kunit module notifier initialization callback with a processing
path for only listing the tests provided by a module if the kunit action
parameter is set to "list", so external tools can obtain a list of test
cases to be executed in advance and can make a better job on assigning
kernel messages interleaved with kunit output to specific tests.
Use test filtering functions in kunit module notifier callback functions,
so external tools are able to execute individual test cases from kunit
test modules in order to still better isolate their potential impact on
kernel messages that appear interleaved with output from other tests.
v3: Fix CONFIG_GLOB, required by filtering fuctions, not selected when
building as a module.
v2: Fix new name of a structure moved to kunit namespace not updated
across all uses.
Janusz Krzysztofik (3):
kunit: Report the count of test suites in a module
kunit: Make 'list' action available to kunit test modules
kunit: Allow kunit test modules to use test filtering
include/kunit/test.h | 14 +++++++++++
lib/kunit/Kconfig | 2 +-
lib/kunit/executor.c | 57 +++++++++++++++++++++++++-------------------
lib/kunit/test.c | 57 +++++++++++++++++++++++++++++++++++++++++++-
4 files changed, 104 insertions(+), 26 deletions(-)
--
2.41.0
NOTE: This patch is tested against 5.4 stable
NOTE: This is a patch for the 5.4 stable branch, not for the torvalds tree.
The torvalds tree, and stable tree 5.10, 5.15, 6.1 and 6.4 branches
were fixed in the separate
commit ID 4acfe3dfde68 ("test_firmware: prevent race conditions by a correct implementation of locking")
which was incompatible with 5.4
Dan Carpenter spotted a race condition in a couple of situations like
these in the test_firmware driver:
static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
u8 val;
int ret;
ret = kstrtou8(buf, 10, &val);
if (ret)
return ret;
mutex_lock(&test_fw_mutex);
*(u8 *)cfg = val;
mutex_unlock(&test_fw_mutex);
/* Always return full write size even if we didn't consume all */
return size;
}
static ssize_t config_num_requests_store(struct device *dev,
struct device_attribute *attr,
const char *buf, size_t count)
{
int rc;
mutex_lock(&test_fw_mutex);
if (test_fw_config->reqs) {
pr_err("Must call release_all_firmware prior to changing config\n");
rc = -EINVAL;
mutex_unlock(&test_fw_mutex);
goto out;
}
mutex_unlock(&test_fw_mutex);
// NOTE: HERE is the race!!! Function can be preempted!
// test_fw_config->reqs can change between the release of
// the lock about and acquire of the lock in the
// test_dev_config_update_u8()
rc = test_dev_config_update_u8(buf, count,
&test_fw_config->num_requests);
out:
return rc;
}
static ssize_t config_read_fw_idx_store(struct device *dev,
struct device_attribute *attr,
const char *buf, size_t count)
{
return test_dev_config_update_u8(buf, count,
&test_fw_config->read_fw_idx);
}
The function test_dev_config_update_u8() is called from both the locked
and the unlocked context, function config_num_requests_store() and
config_read_fw_idx_store() which can both be called asynchronously as
they are driver's methods, while test_dev_config_update_u8() and siblings
change their argument pointed to by u8 *cfg or similar pointer.
To avoid deadlock on test_fw_mutex, the lock is dropped before calling
test_dev_config_update_u8() and re-acquired within test_dev_config_update_u8()
itself, but alas this creates a race condition.
Having two locks wouldn't assure a race-proof mutual exclusion.
This situation is best avoided by the introduction of a new, unlocked
function __test_dev_config_update_u8() which can be called from the locked
context and reducing test_dev_config_update_u8() to:
static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
int ret;
mutex_lock(&test_fw_mutex);
ret = __test_dev_config_update_u8(buf, size, cfg);
mutex_unlock(&test_fw_mutex);
return ret;
}
doing the locking and calling the unlocked primitive, which enables both
locked and unlocked versions without duplication of code.
Fixes: c92316bf8e948 ("test_firmware: add batched firmware tests")
Cc: Luis R. Rodriguez <mcgrof(a)kernel.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Russ Weight <russell.h.weight(a)intel.com>
Cc: Takashi Iwai <tiwai(a)suse.de>
Cc: Tianfei Zhang <tianfei.zhang(a)intel.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Colin Ian King <colin.i.king(a)gmail.com>
Cc: Randy Dunlap <rdunlap(a)infradead.org>
Cc: linux-kselftest(a)vger.kernel.org
Cc: stable(a)vger.kernel.org # v5.4
Suggested-by: Dan Carpenter <error27(a)gmail.com>
Link: https://lore.kernel.org/r/20230509084746.48259-1-mirsad.todorovac@alu.unizg…
Signed-off-by: Mirsad Todorovac <mirsad.todorovac(a)alu.unizg.hr>
---
lib/test_firmware.c | 37 ++++++++++++++++++++++++++++---------
1 file changed, 28 insertions(+), 9 deletions(-)
diff --git a/lib/test_firmware.c b/lib/test_firmware.c
index 38553944e967..92d7195d5b5b 100644
--- a/lib/test_firmware.c
+++ b/lib/test_firmware.c
@@ -301,16 +301,26 @@ static ssize_t config_test_show_str(char *dst,
return len;
}
-static int test_dev_config_update_bool(const char *buf, size_t size,
- bool *cfg)
+static inline int __test_dev_config_update_bool(const char *buf, size_t size,
+ bool *cfg)
{
int ret;
- mutex_lock(&test_fw_mutex);
if (strtobool(buf, cfg) < 0)
ret = -EINVAL;
else
ret = size;
+
+ return ret;
+}
+
+static int test_dev_config_update_bool(const char *buf, size_t size,
+ bool *cfg)
+{
+ int ret;
+
+ mutex_lock(&test_fw_mutex);
+ ret = __test_dev_config_update_bool(buf, size, cfg);
mutex_unlock(&test_fw_mutex);
return ret;
@@ -340,7 +350,7 @@ static ssize_t test_dev_config_show_int(char *buf, int cfg)
return snprintf(buf, PAGE_SIZE, "%d\n", val);
}
-static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
+static inline int __test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
int ret;
long new;
@@ -352,14 +362,23 @@ static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
if (new > U8_MAX)
return -EINVAL;
- mutex_lock(&test_fw_mutex);
*(u8 *)cfg = new;
- mutex_unlock(&test_fw_mutex);
/* Always return full write size even if we didn't consume all */
return size;
}
+static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
+{
+ int ret;
+
+ mutex_lock(&test_fw_mutex);
+ ret = __test_dev_config_update_u8(buf, size, cfg);
+ mutex_unlock(&test_fw_mutex);
+
+ return ret;
+}
+
static ssize_t test_dev_config_show_u8(char *buf, u8 cfg)
{
u8 val;
@@ -392,10 +411,10 @@ static ssize_t config_num_requests_store(struct device *dev,
mutex_unlock(&test_fw_mutex);
goto out;
}
- mutex_unlock(&test_fw_mutex);
- rc = test_dev_config_update_u8(buf, count,
- &test_fw_config->num_requests);
+ rc = __test_dev_config_update_u8(buf, count,
+ &test_fw_config->num_requests);
+ mutex_unlock(&test_fw_mutex);
out:
return rc;
--
2.34.1
test_kmem_basic creates 100,000 negative dentries, with each one mapping
to a slab object. After memory.high is set, these are reclaimed through
the shrink_slab function call which reclaims all 100,000 entries. The
test passes the majority of the time because when slab1 is calculated,
it is often above 0, however, 0 is also an acceptable value.
Signed-off-by: Lucas Karpinski <lkarpins(a)redhat.com>
---
v3: rebased on mm-unstable
tools/testing/selftests/cgroup/test_kmem.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/cgroup/test_kmem.c b/tools/testing/selftests/cgroup/test_kmem.c
index 1b2cec9d18a4..67cc0182058d 100644
--- a/tools/testing/selftests/cgroup/test_kmem.c
+++ b/tools/testing/selftests/cgroup/test_kmem.c
@@ -75,7 +75,7 @@ static int test_kmem_basic(const char *root)
sleep(1);
slab1 = cg_read_key_long(cg, "memory.stat", "slab ");
- if (slab1 <= 0)
+ if (slab1 < 0)
goto cleanup;
current = cg_read_long(cg, "memory.current");
--
2.41.0
As is described in the "How to use MPTCP?" section in MPTCP wiki [1]:
"Your app should create sockets with IPPROTO_MPTCP as the proto:
( socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP); ). Legacy apps can be
forced to create and use MPTCP sockets instead of TCP ones via the
mptcpize command bundled with the mptcpd daemon."
But the mptcpize (LD_PRELOAD technique) command has some limitations
[2]:
- it doesn't work if the application is not using libc (e.g. GoLang
apps)
- in some envs, it might not be easy to set env vars / change the way
apps are launched, e.g. on Android
- mptcpize needs to be launched with all apps that want MPTCP: we could
have more control from BPF to enable MPTCP only for some apps or all the
ones of a netns or a cgroup, etc.
- it is not in BPF, we cannot talk about it at netdev conf.
So this patchset attempts to use BPF to implement functions similer to
mptcpize.
The main idea is to add a hook in sys_socket() to change the protocol id
from IPPROTO_TCP (or 0) to IPPROTO_MPTCP.
[1]
https://github.com/multipath-tcp/mptcp_net-next/wiki
[2]
https://github.com/multipath-tcp/mptcp_net-next/issues/79
v10:
- drop "#ifdef CONFIG_BPF_JIT".
- include vmlinux.h and bpf_tracing_net.h to avoid defining some
macros.
- drop unneeded checks for mptcp.
v9:
- update comment for 'update_socket_protocol'.
v8:
- drop the additional checks on the 'protocol' value after the
'update_socket_protocol()' call.
v7:
- add __weak and __diag_* for update_socket_protocol.
v6:
- add update_socket_protocol.
v5:
- add bpf_mptcpify helper.
v4:
- use lsm_cgroup/socket_create
v3:
- patch 8: char cmd[128]; -> char cmd[256];
v2:
- Fix build selftests errors reported by CI
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/79
Geliang Tang (5):
bpf: Add update_socket_protocol hook
selftests/bpf: Use random netns name for mptcp
selftests/bpf: Add two mptcp netns helpers
selftests/bpf: Drop unneeded checks for mptcp
selftests/bpf: Add mptcpify test
net/mptcp/bpf.c | 15 ++
net/socket.c | 24 ++++
.../testing/selftests/bpf/prog_tests/mptcp.c | 129 +++++++++++++++---
tools/testing/selftests/bpf/progs/mptcpify.c | 20 +++
4 files changed, 169 insertions(+), 19 deletions(-)
create mode 100644 tools/testing/selftests/bpf/progs/mptcpify.c
--
2.35.3
As is described in the "How to use MPTCP?" section in MPTCP wiki [1]:
"Your app should create sockets with IPPROTO_MPTCP as the proto:
( socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP); ). Legacy apps can be
forced to create and use MPTCP sockets instead of TCP ones via the
mptcpize command bundled with the mptcpd daemon."
But the mptcpize (LD_PRELOAD technique) command has some limitations
[2]:
- it doesn't work if the application is not using libc (e.g. GoLang
apps)
- in some envs, it might not be easy to set env vars / change the way
apps are launched, e.g. on Android
- mptcpize needs to be launched with all apps that want MPTCP: we could
have more control from BPF to enable MPTCP only for some apps or all the
ones of a netns or a cgroup, etc.
- it is not in BPF, we cannot talk about it at netdev conf.
So this patchset attempts to use BPF to implement functions similer to
mptcpize.
The main idea is to add a hook in sys_socket() to change the protocol id
from IPPROTO_TCP (or 0) to IPPROTO_MPTCP.
[1]
https://github.com/multipath-tcp/mptcp_net-next/wiki
[2]
https://github.com/multipath-tcp/mptcp_net-next/issues/79
v9:
- update comment for 'update_socket_protocol'.
v8:
- drop the additional checks on the 'protocol' value after the
'update_socket_protocol()' call.
v7:
- add __weak and __diag_* for update_socket_protocol.
v6:
- add update_socket_protocol.
v5:
- add bpf_mptcpify helper.
v4:
- use lsm_cgroup/socket_create
v3:
- patch 8: char cmd[128]; -> char cmd[256];
v2:
- Fix build selftests errors reported by CI
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/79
Geliang Tang (4):
bpf: Add update_socket_protocol hook
selftests/bpf: Use random netns name for mptcp
selftests/bpf: Add two mptcp netns helpers
selftests/bpf: Add mptcpify test
net/mptcp/bpf.c | 17 +++
net/socket.c | 24 ++++
.../testing/selftests/bpf/prog_tests/mptcp.c | 125 ++++++++++++++++--
tools/testing/selftests/bpf/progs/mptcpify.c | 25 ++++
4 files changed, 182 insertions(+), 9 deletions(-)
create mode 100644 tools/testing/selftests/bpf/progs/mptcpify.c
--
2.35.3
This test fails routinely in our prod testing environment, and I can
reproduce it locally as well.
The test allocates dcache inside a cgroup, then drops the memory limit
and checks that usage drops correspondingly. The reason it fails is
because dentries are freed with an RCU delay - a debugging sleep shows
that usage drops as expected shortly after.
Insert a 1s sleep after dropping the limit. This should be good
enough, assuming that machines running those tests are otherwise not
very busy.
Signed-off-by: Johannes Weiner <hannes(a)cmpxchg.org>
---
tools/testing/selftests/cgroup/test_kmem.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/tools/testing/selftests/cgroup/test_kmem.c b/tools/testing/selftests/cgroup/test_kmem.c
index 258ddc565deb..1b2cec9d18a4 100644
--- a/tools/testing/selftests/cgroup/test_kmem.c
+++ b/tools/testing/selftests/cgroup/test_kmem.c
@@ -70,6 +70,10 @@ static int test_kmem_basic(const char *root)
goto cleanup;
cg_write(cg, "memory.high", "1M");
+
+ /* wait for RCU freeing */
+ sleep(1);
+
slab1 = cg_read_key_long(cg, "memory.stat", "slab ");
if (slab1 <= 0)
goto cleanup;
--
2.41.0
The riscv selftests (which were modeled after the arm64 selftests) are
improperly declaring the "emit_tests" target to depend upon the "all"
target. This approach, when combined with commit 9fc96c7c19df
("selftests: error out if kernel header files are not yet built"), has
caused build failures [1] on arm64, and is likely to cause similar
failures for riscv.
To fix this, simply remove the unnecessary "all" dependency from the
emit_tests target. The dependency is still effectively honored, because
again, invocation is via "install", which also depends upon "all".
An alternative approach would be to harden the emit_tests target so that
it can depend upon "all", but that's a lot more complicated and hard to
get right, and doesn't seem worth it, especially given that emit_tests
should probably not be overridden at all.
[1] https://lore.kernel.org/20230710-kselftest-fix-arm64-v1-1-48e872844f25@kern…
Fixes: 9fc96c7c19df ("selftests: error out if kernel header files are not yet built")
Signed-off-by: John Hubbard <jhubbard(a)nvidia.com>
---
Andrew,
With this, and with my arm64 fix [2] that you've already put into
mm-unstable, you should be able to safely drop commit 819187ab8741
("selftests: fix arm64 test installation").
[2] https://lore.kernel.org/20230711005629.2547838-1-jhubbard@nvidia.com
thanks,
John Hubbard
tools/testing/selftests/riscv/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/riscv/Makefile b/tools/testing/selftests/riscv/Makefile
index 9dd629cc86aa..f4b3d5c9af5b 100644
--- a/tools/testing/selftests/riscv/Makefile
+++ b/tools/testing/selftests/riscv/Makefile
@@ -43,7 +43,7 @@ run_tests: all
done
# Avoid any output on non riscv on emit_tests
-emit_tests: all
+emit_tests:
@for DIR in $(RISCV_SUBTARGETS); do \
BUILD_TARGET=$(OUTPUT)/$$DIR; \
$(MAKE) OUTPUT=$$BUILD_TARGET -C $$DIR $@; \
base-commit: 3f01e9fed8454dcd89727016c3e5b2fbb8f8e50c
prerequisite-patch-id: 37c92f7425689ff069fb83996a25cd98e78d7242
--
2.41.0
Nested translation is a hardware feature that is supported by many modern
IOMMU hardwares. It has two stages (stage-1, stage-2) address translation
to get access to the physical address. stage-1 translation table is owned
by userspace (e.g. by a guest OS), while stage-2 is owned by kernel. Changes
to stage-1 translation table should be followed by an IOTLB invalidation.
Take Intel VT-d as an example, the stage-1 translation table is I/O page
table. As the below diagram shows, guest I/O page table pointer in GPA
(guest physical address) is passed to host and be used to perform the stage-1
address translation. Along with it, modifications to present mappings in the
guest I/O page table should be followed with an IOTLB invalidation.
.-------------. .---------------------------.
| vIOMMU | | Guest I/O page table |
| | '---------------------------'
.----------------/
| PASID Entry |--- PASID cache flush --+
'-------------' |
| | V
| | I/O page table pointer in GPA
'-------------'
Guest
------| Shadow |---------------------------|--------
v v v
Host
.-------------. .------------------------.
| pIOMMU | | FS for GIOVA->GPA |
| | '------------------------'
.----------------/ |
| PASID Entry | V (Nested xlate)
'----------------\.----------------------------------.
| | | SS for GPA->HPA, unmanaged domain|
| | '----------------------------------'
'-------------'
Where:
- FS = First stage page tables
- SS = Second stage page tables
<Intel VT-d Nested translation>
In IOMMUFD, all the translation tables are tracked by hw_pagetable (hwpt)
and each has an iommu_domain allocated from iommu driver. So in this series
hw_pagetable and iommu_domain means the same thing if no special note.
IOMMUFD has already supported allocating hw_pagetable that is linked with
an IOAS. However, nesting requires IOMMUFD to allow allocating hw_pagetable
with driver specific parameters and interface to sync stage-1 IOTLB as user
owns the stage-1 translation table.
This series is based on the iommu hw info reporting series [1]. It first
introduces new iommu op for allocating domains with user data and the op
for invalidate stage-1 IOTLB, and then extend the IOMMUFD internal infrastructure
to accept user_data and parent hwpt, then relay the data to iommu core to
allocate user iommu_domain. After it, extends the ioctl IOMMU_HWPT_ALLOC to
accept user data and stage-2 hwpt ID to allocate hwpt. Along with it, ioctl
IOMMU_HWPT_INVALIDATE is added to invalidate stage-1 IOTLB. This is needed
for user-managed hwpts. Selftest is added as well to cover the new ioctls.
Complete code can be found in [2], QEMU could can be found in [3].
At last, this is a team work together with Nicolin Chen, Lu Baolu. Thanks
them for the help. ^_^. Look forward to your feedbacks.
[1] https://lore.kernel.org/linux-iommu/20230724105936.107042-1-yi.l.liu@intel.…
[2] https://github.com/yiliu1765/iommufd/tree/iommufd_nesting
[3] https://github.com/yiliu1765/qemu/tree/wip/iommufd_rfcv4_nesting
Change log:
v3:
- Add new uAPI things in alphabetical order
- Pass in "enum iommu_hwpt_type hwpt_type" to op->domain_alloc_user for
sanity, replacing the previous op->domain_alloc_user_data_len solution
- Return ERR_PTR from domain_alloc_user instead of NULL
- Only add IOMMU_RESV_SW_MSI to kernel-managed HWPT in nested translation (Kevin)
- Add IOMMU_RESV_IOVA_RANGES to report resv iova ranges to userspace hence
userspace is able to exclude the ranges in the stage-1 HWPT (e.g. guest I/O
page table). (Kevin)
- Add selftest coverage for the new IOMMU_RESV_IOVA_RANGES ioctl
- Minor changes per Kevin's inputs
v2: https://lore.kernel.org/linux-iommu/20230511143844.22693-1-yi.l.liu@intel.c…
- Add union iommu_domain_user_data to include all user data structures to avoid
passing void * in kernel APIs.
- Add iommu op to return user data length for user domain allocation
- Rename struct iommu_hwpt_alloc::data_type to be hwpt_type
- Store the invalidation data length in iommu_domain_ops::cache_invalidate_user_data_len
- Convert cache_invalidate_user op to be int instead of void
- Remove @data_type in struct iommu_hwpt_invalidate
- Remove out_hwpt_type_bitmap in struct iommu_hw_info hence drop patch 08 of v1
v1: https://lore.kernel.org/linux-iommu/20230309080910.607396-1-yi.l.liu@intel.…
Thanks,
Yi Liu
Lu Baolu (2):
iommu: Add new iommu op to create domains owned by userspace
iommu: Add nested domain support
Nicolin Chen (6):
iommufd/hw_pagetable: Do not populate user-managed hw_pagetables
iommufd: Only enforce IOMMU_RESV_SW_MSI when attaching user-managed
HWPT
iommufd/selftest: Add domain_alloc_user() support in iommu mock
iommufd/selftest: Add coverage for IOMMU_HWPT_ALLOC with user data
iommufd/selftest: Add IOMMU_TEST_OP_MD_CHECK_IOTLB test op
iommufd/selftest: Add coverage for IOMMU_HWPT_INVALIDATE ioctl
Yi Liu (9):
iommufd/hw_pagetable: Use domain_alloc_user op for domain allocation
iommufd: Pass in hwpt_type/parent/user_data to
iommufd_hw_pagetable_alloc()
iommufd: Add IOMMU_RESV_IOVA_RANGES
iommufd: IOMMU_HWPT_ALLOC allocation with user data
iommufd: Add IOMMU_HWPT_INVALIDATE
iommufd/selftest: Add a helper to get test device
iommufd/selftest: Add IOMMU_TEST_OP_DEV_[ADD|DEL]_RESERVED to add/del
reserved regions to selftest device
iommufd/selftest: Add .get_resv_regions() for mock_dev
iommufd/selftest: Add coverage for IOMMU_RESV_IOVA_RANGES
drivers/iommu/iommufd/device.c | 9 +-
drivers/iommu/iommufd/hw_pagetable.c | 181 +++++++++++-
drivers/iommu/iommufd/io_pagetable.c | 5 +-
drivers/iommu/iommufd/iommufd_private.h | 20 +-
drivers/iommu/iommufd/iommufd_test.h | 36 +++
drivers/iommu/iommufd/main.c | 59 +++-
drivers/iommu/iommufd/selftest.c | 266 ++++++++++++++++--
include/linux/iommu.h | 34 +++
include/uapi/linux/iommufd.h | 96 ++++++-
tools/testing/selftests/iommu/iommufd.c | 224 ++++++++++++++-
tools/testing/selftests/iommu/iommufd_utils.h | 70 +++++
11 files changed, 958 insertions(+), 42 deletions(-)
--
2.34.1
test_kmem_basic creates 100,000 negative dentries, with each one mapping
to a slab object. After memory.high is set, these are reclaimed through
the shrink_slab function call which reclaims all 100,000 entries. The
test passes the majority of the time because when slab1 is calculated,
it is often above 0, however, 0 is also an acceptable value.
Signed-off-by: Lucas Karpinski <lkarpins(a)redhat.com>
---
https://lore.kernel.org/all/m6jbt5hzq27ygt3l4xyiaxxb7i5auvb2lahbcj4yaxxigqz…
V2: Corrected title
tools/testing/selftests/cgroup/test_kmem.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/cgroup/test_kmem.c b/tools/testing/selftests/cgroup/test_kmem.c
index 258ddc565deb..ba0a0bfc5a98 100644
--- a/tools/testing/selftests/cgroup/test_kmem.c
+++ b/tools/testing/selftests/cgroup/test_kmem.c
@@ -71,7 +71,7 @@ static int test_kmem_basic(const char *root)
cg_write(cg, "memory.high", "1M");
slab1 = cg_read_key_long(cg, "memory.stat", "slab ");
- if (slab1 <= 0)
+ if (slab1 < 0)
goto cleanup;
current = cg_read_long(cg, "memory.current");
--
2.41.0
The following error happens:
In file included from vstate_exec_nolibc.c:2:
/usr/include/riscv64-linux-gnu/sys/prctl.h:42:12: error: conflicting types for ‘prctl’; h
ave ‘int(int, ...)’
42 | extern int prctl (int __option, ...) __THROW;
| ^~~~~
In file included from ./../../../../include/nolibc/nolibc.h:99,
from <command-line>:
./../../../../include/nolibc/sys.h:892:5: note: previous definition of ‘prctl’ with type
‘int(int, long unsigned int, long unsigned int, long unsigned int, long unsigned int)
’
892 | int prctl(int option, unsigned long arg2, unsigned long arg3,
| ^~~~~
Fix this by not including <sys/prctl.h>, which is not needed here since
prctl syscall is directly called using its number.
Fixes: 7cf6198ce22d ("selftests: Test RISC-V Vector prctl interface")
Signed-off-by: Alexandre Ghiti <alexghiti(a)rivosinc.com>
---
tools/testing/selftests/riscv/vector/vstate_exec_nolibc.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/tools/testing/selftests/riscv/vector/vstate_exec_nolibc.c b/tools/testing/selftests/riscv/vector/vstate_exec_nolibc.c
index 5cbc392944a6..2c0d2b1126c1 100644
--- a/tools/testing/selftests/riscv/vector/vstate_exec_nolibc.c
+++ b/tools/testing/selftests/riscv/vector/vstate_exec_nolibc.c
@@ -1,6 +1,4 @@
// SPDX-License-Identifier: GPL-2.0-only
-#include <sys/prctl.h>
-
#define THIS_PROGRAM "./vstate_exec_nolibc"
int main(int argc, char **argv)
--
2.39.2
[ Upstream commit 4acfe3dfde685a5a9eaec5555351918e2d7266a1 ]
Dan Carpenter spotted a race condition in a couple of situations like
these in the test_firmware driver:
static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
u8 val;
int ret;
ret = kstrtou8(buf, 10, &val);
if (ret)
return ret;
mutex_lock(&test_fw_mutex);
*(u8 *)cfg = val;
mutex_unlock(&test_fw_mutex);
/* Always return full write size even if we didn't consume all */
return size;
}
static ssize_t config_num_requests_store(struct device *dev,
struct device_attribute *attr,
const char *buf, size_t count)
{
int rc;
mutex_lock(&test_fw_mutex);
if (test_fw_config->reqs) {
pr_err("Must call release_all_firmware prior to changing config\n");
rc = -EINVAL;
mutex_unlock(&test_fw_mutex);
goto out;
}
mutex_unlock(&test_fw_mutex);
// NOTE: HERE is the race!!! Function can be preempted!
// test_fw_config->reqs can change between the release of
// the lock about and acquire of the lock in the
// test_dev_config_update_u8()
rc = test_dev_config_update_u8(buf, count,
&test_fw_config->num_requests);
out:
return rc;
}
static ssize_t config_read_fw_idx_store(struct device *dev,
struct device_attribute *attr,
const char *buf, size_t count)
{
return test_dev_config_update_u8(buf, count,
&test_fw_config->read_fw_idx);
}
The function test_dev_config_update_u8() is called from both the locked
and the unlocked context, function config_num_requests_store() and
config_read_fw_idx_store() which can both be called asynchronously as
they are driver's methods, while test_dev_config_update_u8() and siblings
change their argument pointed to by u8 *cfg or similar pointer.
To avoid deadlock on test_fw_mutex, the lock is dropped before calling
test_dev_config_update_u8() and re-acquired within test_dev_config_update_u8()
itself, but alas this creates a race condition.
Having two locks wouldn't assure a race-proof mutual exclusion.
This situation is best avoided by the introduction of a new, unlocked
function __test_dev_config_update_u8() which can be called from the locked
context and reducing test_dev_config_update_u8() to:
static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
int ret;
mutex_lock(&test_fw_mutex);
ret = __test_dev_config_update_u8(buf, size, cfg);
mutex_unlock(&test_fw_mutex);
return ret;
}
doing the locking and calling the unlocked primitive, which enables both
locked and unlocked versions without duplication of code.
Fixes: c92316bf8e948 ("test_firmware: add batched firmware tests")
Cc: Luis R. Rodriguez <mcgrof(a)kernel.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Russ Weight <russell.h.weight(a)intel.com>
Cc: Takashi Iwai <tiwai(a)suse.de>
Cc: Tianfei Zhang <tianfei.zhang(a)intel.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Colin Ian King <colin.i.king(a)gmail.com>
Cc: Randy Dunlap <rdunlap(a)infradead.org>
Cc: linux-kselftest(a)vger.kernel.org
Cc: stable(a)vger.kernel.org # v5.4, 4.19
Suggested-by: Dan Carpenter <error27(a)gmail.com>
Link: https://lore.kernel.org/r/20230509084746.48259-1-mirsad.todorovac@alu.unizg…
Signed-off-by: Mirsad Todorovac <mirsad.todorovac(a)alu.unizg.hr>
[ This is the patch to fix the racing condition in locking for the 5.4, ]
[ 4.19 and 4.4 stable branches. Not all the fixes from the upstream ]
[ commit apply, but those which do are verbatim equal to those in the ]
[ upstream commit. ]
---
v2:
bundled locking and ENOSPC patches together.
tested on 5.4 and 4.19 stable.
lib/test_firmware.c | 37 ++++++++++++++++++++++++++++---------
1 file changed, 28 insertions(+), 9 deletions(-)
diff --git a/lib/test_firmware.c b/lib/test_firmware.c
index 38553944e967..92d7195d5b5b 100644
--- a/lib/test_firmware.c
+++ b/lib/test_firmware.c
@@ -301,16 +301,26 @@ static ssize_t config_test_show_str(char *dst,
return len;
}
-static int test_dev_config_update_bool(const char *buf, size_t size,
- bool *cfg)
+static inline int __test_dev_config_update_bool(const char *buf, size_t size,
+ bool *cfg)
{
int ret;
- mutex_lock(&test_fw_mutex);
if (strtobool(buf, cfg) < 0)
ret = -EINVAL;
else
ret = size;
+
+ return ret;
+}
+
+static int test_dev_config_update_bool(const char *buf, size_t size,
+ bool *cfg)
+{
+ int ret;
+
+ mutex_lock(&test_fw_mutex);
+ ret = __test_dev_config_update_bool(buf, size, cfg);
mutex_unlock(&test_fw_mutex);
return ret;
@@ -340,7 +350,7 @@ static ssize_t test_dev_config_show_int(char *buf, int cfg)
return snprintf(buf, PAGE_SIZE, "%d\n", val);
}
-static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
+static inline int __test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
int ret;
long new;
@@ -352,14 +362,23 @@ static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
if (new > U8_MAX)
return -EINVAL;
- mutex_lock(&test_fw_mutex);
*(u8 *)cfg = new;
- mutex_unlock(&test_fw_mutex);
/* Always return full write size even if we didn't consume all */
return size;
}
+static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
+{
+ int ret;
+
+ mutex_lock(&test_fw_mutex);
+ ret = __test_dev_config_update_u8(buf, size, cfg);
+ mutex_unlock(&test_fw_mutex);
+
+ return ret;
+}
+
static ssize_t test_dev_config_show_u8(char *buf, u8 cfg)
{
u8 val;
@@ -392,10 +411,10 @@ static ssize_t config_num_requests_store(struct device *dev,
mutex_unlock(&test_fw_mutex);
goto out;
}
- mutex_unlock(&test_fw_mutex);
- rc = test_dev_config_update_u8(buf, count,
- &test_fw_config->num_requests);
+ rc = __test_dev_config_update_u8(buf, count,
+ &test_fw_config->num_requests);
+ mutex_unlock(&test_fw_mutex);
out:
return rc;
--
2.39.3
It seems that the most critical issue with vm.memfd_noexec=2 (the fact
that passing MFD_EXEC would bypass it entirely[1]) has been fixed in
Andrew's tree[2], but there are still some outstanding issues that need
to be addressed:
* The dmesg warnings are pr_warn_once, which on most systems means that
they will be used up by systemd or some other boot process and
userspace developers will never see it. The original patch posted to
the ML used pr_warn_ratelimited but the merged patch had it changed
(with a comment about it being "per review"), but given that the
current warnings are useless, pr_warn_ratelimited makes far more
sense.
* vm.memfd_noexec=2 shouldn't reject old-style memfd_create(2) syscalls
because it will make it far to difficult to ever migrate. Instead it
should imply MFD_EXEC.
* The racheting mechanism for vm.memfd_noexec doesn't make sense as a
security mechanism because a CAP_SYS_ADMIN capable user can create
executable binaries in a hidden tmpfs very easily, not to mention the
many other things they can do.
* The memfd selftests would not exit with a non-zero error code when
certain tests that ran in a forked process (specifically the ones
related to MFD_EXEC and MFD_NOEXEC_SEAL) failed.
(This patchset is based on top of Jeff Xu's patches[2] fixing the
MFD_EXEC bug in vm.memfd_noexec=2.)
[1]: https://lore.kernel.org/all/ZJwcsU0vI-nzgOB_@codewreck.org/
[2]: https://lore.kernel.org/all/20230705063315.3680666-1-jeffxu@google.com/
Aleksa Sarai (3):
memfd: cleanups for vm.memfd_noexec handling
memfd: remove racheting feature from vm.memfd_noexec
selftests: memfd: error out test process when child test fails
include/linux/pid_namespace.h | 16 +++------
kernel/pid_sysctl.h | 7 ----
mm/memfd.c | 32 +++++++----------
tools/testing/selftests/memfd/memfd_test.c | 41 ++++++++++++++++++----
4 files changed, 51 insertions(+), 45 deletions(-)
--
2.41.0
As is described in the "How to use MPTCP?" section in MPTCP wiki [1]:
"Your app should create sockets with IPPROTO_MPTCP as the proto:
( socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP); ). Legacy apps can be
forced to create and use MPTCP sockets instead of TCP ones via the
mptcpize command bundled with the mptcpd daemon."
But the mptcpize (LD_PRELOAD technique) command has some limitations
[2]:
- it doesn't work if the application is not using libc (e.g. GoLang
apps)
- in some envs, it might not be easy to set env vars / change the way
apps are launched, e.g. on Android
- mptcpize needs to be launched with all apps that want MPTCP: we could
have more control from BPF to enable MPTCP only for some apps or all the
ones of a netns or a cgroup, etc.
- it is not in BPF, we cannot talk about it at netdev conf.
So this patchset attempts to use BPF to implement functions similer to
mptcpize.
The main idea is to add a hook in sys_socket() to change the protocol id
from IPPROTO_TCP (or 0) to IPPROTO_MPTCP.
[1]
https://github.com/multipath-tcp/mptcp_net-next/wiki
[2]
https://github.com/multipath-tcp/mptcp_net-next/issues/79
v8:
- drop the additional checks on the 'protocol' value after the
'update_socket_protocol()' call.
v7:
- add __weak and __diag_* for update_socket_protocol.
v6:
- add update_socket_protocol.
v5:
- add bpf_mptcpify helper.
v4:
- use lsm_cgroup/socket_create
v3:
- patch 8: char cmd[128]; -> char cmd[256];
v2:
- Fix build selftests errors reported by CI
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/79
Geliang Tang (4):
bpf: Add update_socket_protocol hook
selftests/bpf: Use random netns name for mptcp
selftests/bpf: Add two mptcp netns helpers
selftests/bpf: Add mptcpify test
net/mptcp/bpf.c | 17 +++
net/socket.c | 25 ++++
.../testing/selftests/bpf/prog_tests/mptcp.c | 125 ++++++++++++++++--
tools/testing/selftests/bpf/progs/mptcpify.c | 25 ++++
4 files changed, 183 insertions(+), 9 deletions(-)
create mode 100644 tools/testing/selftests/bpf/progs/mptcpify.c
--
2.35.3
Hi, Willy
Here is the v5, purely include the ppc parts, with two critical fixups
for the latest gcc 13.1.0 toolchain, now, both run and run-user pass.
Here is the run-user test report:
// with local toolchains
$ for arch in ppc ppc64 ppc64le; do make run-user XARCH=$arch | grep "status: "; done
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
// with latest toolchains
$ for arch in ppc ppc64 ppc64le; do make run-user XARCH=$arch CC=/path/to/gcc-13.1.0-nolibc/powerpc64-linux/bin/powerpc64-linux-gcc | grep status; file nolibc-test; done
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
nolibc-test: ELF 32-bit MSB executable, PowerPC or cisco 4500, version 1 (SYSV), statically linked, stripped
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
nolibc-test: ELF 64-bit MSB executable, 64-bit PowerPC or cisco 7500, version 1 (SYSV), statically linked, stripped
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
nolibc-test: ELF 64-bit LSB executable, 64-bit PowerPC or cisco 7500, version 1 (SYSV), statically linked, stripped
Since the missing serial console enabling patch [1] for ppc32 has
already gotten a Reviewed-by line from the ppc maintainer, now, the ppc
defconfig aligns with the others', and it is able to simply move the
nolibc-test-config related stuff to the next tinyconfig series.
Based on v4 [2], beside removing several nolibc-test-config related
patches, two bugs with the latest gcc 13.1.0 have been fixed.
Changes from v4 --> v5:
* tools/nolibc: add support for powerpc64
selftests/nolibc: add XARCH and ARCH mapping support
selftests/nolibc: add test support for ppc64
selftests/nolibc: allow customize CROSS_COMPILE by architecture
selftests/nolibc: customize CROSS_COMPILE for 32/64-bit powerpc
Almost the same as v4.
* tools/nolibc: add support for powerpc
For 32-bit PowerPC, with newer gcc compilers (e.g. gcc 13.1.0),
"omit-frame-pointer" fails with __attribute__((no_stack_protector)) but
works with __attribute__((__optimize__("-fno-stack-protector")))
Using the later for ppc32 to workaround the issue.
* selftests/nolibc: add test support for ppc
Add default CFLAGS for ppc to allow build with the
latest powerpc64-linux-gcc toolchain from
https://mirrors.edge.kernel.org/pub/tools/crosstool/
* selftests/nolibc: add test support for ppc64le
Align with kernel, prefer elfv2 ABI to elfv1 ABI when the toolchain
support, otherwise, ABI mismatched binary will not run.
Best regards,
Zhangjin Wu
---
[1]: https://lore.kernel.org/lkml/bb7b5f9958b3e3a20f6573ff7ce7c5dc566e7e32.16909…
[2]: https://lore.kernel.org/lkml/cover.1690916314.git.falcon@tinylab.org/
Zhangjin Wu (8):
tools/nolibc: add support for powerpc
tools/nolibc: add support for powerpc64
selftests/nolibc: add XARCH and ARCH mapping support
selftests/nolibc: add test support for ppc
selftests/nolibc: add test support for ppc64le
selftests/nolibc: add test support for ppc64
selftests/nolibc: allow customize CROSS_COMPILE by architecture
selftests/nolibc: customize CROSS_COMPILE for 32/64-bit powerpc
tools/include/nolibc/arch-powerpc.h | 213 ++++++++++++++++++++++++
tools/include/nolibc/arch.h | 2 +
tools/testing/selftests/nolibc/Makefile | 74 ++++++--
3 files changed, 277 insertions(+), 12 deletions(-)
create mode 100644 tools/include/nolibc/arch-powerpc.h
--
2.25.1
Hi, Willy, Hi Thomas
v4 here is mainly with a new nolibc-test-config target from your
suggestions and with the reordering of some patches to make
nolibc-test-config be fast forward.
run-user tests for all of the powerpc variants:
$ for arch in ppc ppc64 ppc64le; do make run-user XARCH=$arch | grep status; done
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
and defconfig + run for ppc:
$ make nolibc-test-config XARCH=ppc
$ make run XARCH=ppc
165 test(s): 159 passed, 6 skipped, 0 failed => status: warning
* tools/nolibc: add support for powerpc
tools/nolibc: add support for powerpc64
No change.
* selftests/nolibc: fix up O= option support
selftests/nolibc: add macros to reduce duplicated changes
From tinyconfig-part1 patchset, required by our nolibc-test-config target
Let nolibc-test-config be able to use objtree and the kernel related
macros directly.
* selftests/nolibc: add XARCH and ARCH mapping support
Moved before nolibc-test-config, for the NOLIBC_TEST_CONFIG macro used by
nolibc-test-config target
Willy talked about this twice, let nolibc-test-config be able to use
nolibc-test-$(XARCH).config listed in NOLIBC_TEST_CONFIG directly.
* selftests/nolibc: add nolibc-test-config target
selftests/nolibc: add help for nolibc-test-config target
A new generic nolibc-test-config target is added, allows to enable
additional options for a top-level config target.
defconfig is reserved as an alias of nolibc-test-config.
As suggested by Thomas and Willy.
* selftests/nolibc: add test support for ppc
selftests/nolibc: add test support for ppc64le
selftests/nolibc: add test support for ppc64
Renamed from $(XARCH).config to nolibc-test-$(XARCH).config
As suggested by Willy.
* selftests/nolibc: allow customize CROSS_COMPILE by architecture
selftests/nolibc: customize CROSS_COMPILE for 32/64-bit powerpc
Moved here as suggested by Willy.
Best regards,
Zhangjin
---
[1]: https://lore.kernel.org/lkml/cover.1690468707.git.falcon@tinylab.org/
Zhangjin Wu (12):
tools/nolibc: add support for powerpc
tools/nolibc: add support for powerpc64
selftests/nolibc: fix up O= option support
selftests/nolibc: add macros to reduce duplicated changes
selftests/nolibc: add XARCH and ARCH mapping support
selftests/nolibc: add nolibc-test-config target
selftests/nolibc: add help for nolibc-test-config target
selftests/nolibc: add test support for ppc
selftests/nolibc: add test support for ppc64le
selftests/nolibc: add test support for ppc64
selftests/nolibc: allow customize CROSS_COMPILE by architecture
selftests/nolibc: customize CROSS_COMPILE for 32/64-bit powerpc
tools/include/nolibc/arch-powerpc.h | 202 ++++++++++++++++++
tools/include/nolibc/arch.h | 2 +
tools/testing/selftests/nolibc/Makefile | 157 ++++++++++----
.../nolibc/configs/nolibc-test-ppc.config | 3 +
4 files changed, 327 insertions(+), 37 deletions(-)
create mode 100644 tools/include/nolibc/arch-powerpc.h
create mode 100644 tools/testing/selftests/nolibc/configs/nolibc-test-ppc.config
--
2.25.1
As is described in the "How to use MPTCP?" section in MPTCP wiki [1]:
"Your app should create sockets with IPPROTO_MPTCP as the proto:
( socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP); ). Legacy apps can be
forced to create and use MPTCP sockets instead of TCP ones via the
mptcpize command bundled with the mptcpd daemon."
But the mptcpize (LD_PRELOAD technique) command has some limitations
[2]:
- it doesn't work if the application is not using libc (e.g. GoLang
apps)
- in some envs, it might not be easy to set env vars / change the way
apps are launched, e.g. on Android
- mptcpize needs to be launched with all apps that want MPTCP: we could
have more control from BPF to enable MPTCP only for some apps or all the
ones of a netns or a cgroup, etc.
- it is not in BPF, we cannot talk about it at netdev conf.
So this patchset attempts to use BPF to implement functions similer to
mptcpize.
The main idea is to add a hook in sys_socket() to change the protocol id
from IPPROTO_TCP (or 0) to IPPROTO_MPTCP.
[1]
https://github.com/multipath-tcp/mptcp_net-next/wiki
[2]
https://github.com/multipath-tcp/mptcp_net-next/issues/79
v7:
- add __weak and __diag_* for update_socket_protocol.
v6:
- add update_socket_protocol.
v5:
- add bpf_mptcpify helper.
v4:
- use lsm_cgroup/socket_create
v3:
- patch 8: char cmd[128]; -> char cmd[256];
v2:
- Fix build selftests errors reported by CI
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/79
Geliang Tang (6):
net: socket: add update_socket_protocol hook
bpf: Register mptcp modret set
selftests/bpf: Add mptcpify program
selftests/bpf: use random netns name for mptcp
selftests/bpf: add two mptcp netns helpers
selftests/bpf: Add mptcpify selftest
net/mptcp/bpf.c | 17 +++
net/socket.c | 26 ++++
.../testing/selftests/bpf/prog_tests/mptcp.c | 125 ++++++++++++++++--
tools/testing/selftests/bpf/progs/mptcpify.c | 25 ++++
4 files changed, 184 insertions(+), 9 deletions(-)
create mode 100644 tools/testing/selftests/bpf/progs/mptcpify.c
--
2.35.3
Here's a follow-up from my RFC series last year:
https://lore.kernel.org/lkml/20221004093131.40392-1-thuth@redhat.com/T/
Basic idea of this series is now to use the kselftest_harness.h
framework to get TAP output in the tests, so that it is easier
for the user to see what is going on, and e.g. to be able to
detect whether a certain test is part of the test binary or not
(which is useful when tests get extended in the course of time).
Thomas Huth (4):
KVM: selftests: Rename the ASSERT_EQ macro
KVM: selftests: x86: Use TAP interface in the sync_regs test
KVM: selftests: x86: Use TAP interface in the fix_hypercall test
KVM: selftests: x86: Use TAP interface in the userspace_msr_exit test
.../selftests/kvm/aarch64/aarch32_id_regs.c | 8 +-
.../selftests/kvm/aarch64/page_fault_test.c | 10 +-
.../testing/selftests/kvm/include/test_util.h | 4 +-
tools/testing/selftests/kvm/lib/kvm_util.c | 2 +-
.../selftests/kvm/max_guest_memory_test.c | 2 +-
tools/testing/selftests/kvm/s390x/cmma_test.c | 62 +++++-----
tools/testing/selftests/kvm/s390x/memop.c | 6 +-
tools/testing/selftests/kvm/s390x/tprot.c | 4 +-
.../x86_64/dirty_log_page_splitting_test.c | 18 +--
.../x86_64/exit_on_emulation_failure_test.c | 2 +-
.../selftests/kvm/x86_64/fix_hypercall_test.c | 16 ++-
.../kvm/x86_64/nested_exceptions_test.c | 12 +-
.../kvm/x86_64/recalc_apic_map_test.c | 6 +-
.../selftests/kvm/x86_64/sync_regs_test.c | 113 +++++++++++++++---
.../selftests/kvm/x86_64/tsc_msrs_test.c | 32 ++---
.../kvm/x86_64/userspace_msr_exit_test.c | 19 +--
.../vmx_exception_with_invalid_guest_state.c | 2 +-
.../selftests/kvm/x86_64/vmx_pmu_caps_test.c | 3 +-
.../selftests/kvm/x86_64/xapic_state_test.c | 8 +-
.../selftests/kvm/x86_64/xen_vmcall_test.c | 20 ++--
20 files changed, 218 insertions(+), 131 deletions(-)
--
2.39.3
To help the developers to avoid mistakes and keep the code smaller let's
enable compiler warnings.
I stuck with __attribute__((unused)) over __maybe_unused in
nolibc-test.c for consistency with nolibc proper.
If we want to add a define it needs to be added twice once for nolibc
proper and once for nolibc-test otherwise libc-test wouldn't build
anymore.
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
---
Changes in v2:
- Don't drop unused test helpers, mark them as __attribute__((unused))
- Make some function in nolibc-test static
- Also handle -W and -Wextra
- Link to v1: https://lore.kernel.org/r/20230731-nolibc-warnings-v1-0-74973d2a52d7@weisss…
---
Thomas Weißschuh (10):
tools/nolibc: drop unused variables
tools/nolibc: sys: avoid implicit sign cast
tools/nolibc: stdint: use int for size_t on 32bit
selftests/nolibc: drop unused variables
selftests/nolibc: mark test helpers as potentially unused
selftests/nolibc: make functions static if possible
selftests/nolibc: avoid unused arguments warnings
selftests/nolibc: avoid sign-compare warnings
selftests/nolibc: test return value of read() in test_vfprintf
selftests/nolibc: enable compiler warnings
tools/include/nolibc/stdint.h | 4 +
tools/include/nolibc/sys.h | 3 +-
tools/testing/selftests/nolibc/Makefile | 2 +-
tools/testing/selftests/nolibc/nolibc-test.c | 108 +++++++++++++++++----------
4 files changed, 74 insertions(+), 43 deletions(-)
---
base-commit: dfef4fc45d5713eb23d87f0863aff9c33bd4bfaf
change-id: 20230731-nolibc-warnings-c6e47284ac03
Best regards,
--
Thomas Weißschuh <linux(a)weissschuh.net>
Changes from RFC[1]
- Rebase on latest mm-unstable
- Add base-commit
----
There are use cases that need to apply DAMOS schemes to specific address
ranges or DAMON monitoring targets. NUMA nodes in the physical address
space, special memory objects in the virtual address space, and
monitoring target specific efficient monitoring results snapshot
retrieval could be examples of such use cases. This patchset extends
DAMOS filters feature for such cases, by implementing two more filter
types, namely address ranges and DAMON monitoring types.
Patches sequence
----------------
The first seven patches are for the address ranges based DAMOS filter.
The first patch implements the filter feature and expose it via DAMON
kernel API. The second patch further expose the feature to users via
DAMON sysfs interface. The third and fourth patches implement unit
tests and selftests for the feature. Three patches (fifth to seventh)
updating the documents follow.
The following six patches are for the DAMON monitoring target based
DAMOS filter. The eighth patch implements the feature in the core layer
and expose it via DAMON's kernel API. The ninth patch further expose it
to users via DAMON sysfs interface. Tenth patch add a selftest, and two
patches (eleventh and twelfth) update documents.
[1] https://lore.kernel.org/damon/20230728203444.70703-1-sj@kernel.org/
SeongJae Park (13):
mm/damon/core: introduce address range type damos filter
mm/damon/sysfs-schemes: support address range type DAMOS filter
mm/damon/core-test: add a unit test for __damos_filter_out()
selftests/damon/sysfs: test address range damos filter
Docs/mm/damon/design: update for address range filters
Docs/ABI/damon: update for address range DAMOS filter
Docs/admin-guide/mm/damon/usage: update for address range type DAMOS
filter
mm/damon/core: implement target type damos filter
mm/damon/sysfs-schemes: support target damos filter
selftests/damon/sysfs: test damon_target filter
Docs/mm/damon/design: update for DAMON monitoring target type DAMOS
filter
Docs/ABI/damon: update for DAMON monitoring target type DAMOS filter
Docs/admin-guide/mm/damon/usage: update for DAMON monitoring target
type DAMOS filter
.../ABI/testing/sysfs-kernel-mm-damon | 27 +++++-
Documentation/admin-guide/mm/damon/usage.rst | 34 +++++---
Documentation/mm/damon/design.rst | 24 ++++--
include/linux/damon.h | 28 +++++--
mm/damon/core-test.h | 61 ++++++++++++++
mm/damon/core.c | 62 ++++++++++++++
mm/damon/sysfs-schemes.c | 83 +++++++++++++++++++
tools/testing/selftests/damon/sysfs.sh | 5 ++
8 files changed, 299 insertions(+), 25 deletions(-)
base-commit: 32f9db36a0031f99629b5910d795b3f13f284472
--
2.25.1
Changes from RFC[1]
- Rebase on latest mm-unstable
- Add base-commit
----
The tried_regions directory of DAMON sysfs interface is useful for
retrieving monitoring results snapshot or DAMOS debugging. However, for
common use case that need to monitor only the total size of the scheme
tried regions (e.g., monitoring working set size), the kernel overhead
for directory construction and user overhead for reading the content
could be high if the number of monitoring region is not small. This
patchset implements DAMON sysfs files for efficient support of the use
case.
The first patch implements the sysfs file to reduce the user space
overhead, and the second patch implements a command for reducing the
kernel space overhead.
The third patch adds a selftest for the new file, and following two
patches update documents.
[1] https://lore.kernel.org/damon/20230728201817.70602-1-sj@kernel.org/
SeongJae Park (5):
mm/damon/sysfs-schemes: implement DAMOS tried total bytes file
mm/damon/sysfs: implement a command for updating only schemes tried
total bytes
selftests/damon/sysfs: test tried_regions/total_bytes file
Docs/ABI/damon: update for tried_regions/total_bytes
Docs/admin-guide/mm/damon/usage: update for tried_regions/total_bytes
.../ABI/testing/sysfs-kernel-mm-damon | 13 +++++-
Documentation/admin-guide/mm/damon/usage.rst | 42 ++++++++++++-------
mm/damon/sysfs-common.h | 2 +-
mm/damon/sysfs-schemes.c | 24 ++++++++++-
mm/damon/sysfs.c | 26 +++++++++---
tools/testing/selftests/damon/sysfs.sh | 1 +
6 files changed, 83 insertions(+), 25 deletions(-)
base-commit: a57d8094e1946e9dbdba0dddf0e10f9f4dceae0d
--
2.25.1
With test case kvm_page_table_test, start time is acquired with
time type CLOCK_MONOTONIC_RAW, however end time in function timespec_elapsed
is acquired with time type CLOCK_MONOTONIC. This will cause
inaccurate elapsed time calculation on some platform such as LoongArch.
This patch modified test case kvm_page_table_test, and uses unified
time type CLOCK_MONOTONIC for start time.
Signed-off-by: Bibo Mao <maobibo(a)loongson.cn>
---
tools/testing/selftests/kvm/kvm_page_table_test.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/kvm/kvm_page_table_test.c b/tools/testing/selftests/kvm/kvm_page_table_test.c
index b3b00be1ef82..69f26d80c821 100644
--- a/tools/testing/selftests/kvm/kvm_page_table_test.c
+++ b/tools/testing/selftests/kvm/kvm_page_table_test.c
@@ -200,7 +200,7 @@ static void *vcpu_worker(void *data)
if (READ_ONCE(host_quit))
return NULL;
- clock_gettime(CLOCK_MONOTONIC_RAW, &start);
+ clock_gettime(CLOCK_MONOTONIC, &start);
ret = _vcpu_run(vcpu);
ts_diff = timespec_elapsed(start);
@@ -367,7 +367,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
/* Test the stage of KVM creating mappings */
*current_stage = KVM_CREATE_MAPPINGS;
- clock_gettime(CLOCK_MONOTONIC_RAW, &start);
+ clock_gettime(CLOCK_MONOTONIC, &start);
vcpus_complete_new_stage(*current_stage);
ts_diff = timespec_elapsed(start);
@@ -380,7 +380,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
*current_stage = KVM_UPDATE_MAPPINGS;
- clock_gettime(CLOCK_MONOTONIC_RAW, &start);
+ clock_gettime(CLOCK_MONOTONIC, &start);
vcpus_complete_new_stage(*current_stage);
ts_diff = timespec_elapsed(start);
@@ -392,7 +392,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
*current_stage = KVM_ADJUST_MAPPINGS;
- clock_gettime(CLOCK_MONOTONIC_RAW, &start);
+ clock_gettime(CLOCK_MONOTONIC, &start);
vcpus_complete_new_stage(*current_stage);
ts_diff = timespec_elapsed(start);
--
2.27.0
test_kmem_basic creates 100,000 negative dentries, with each one mapping
to a slab object. After memory.high is set, these are reclaimed through
the shrink_slab function call which reclaims all 100,000 entries. The test
passes the majority of the time because when slab1 is calculated, it is
often above 0, however, 0 is also an acceptable value.
Signed-off-by: Lucas Karpinski <lkarpins(a)redhat.com>
---
tools/testing/selftests/cgroup/test_kmem.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/cgroup/test_kmem.c b/tools/testing/selftests/cgroup/test_kmem.c
index 258ddc565deb..ba0a0bfc5a98 100644
--- a/tools/testing/selftests/cgroup/test_kmem.c
+++ b/tools/testing/selftests/cgroup/test_kmem.c
@@ -71,7 +71,7 @@ static int test_kmem_basic(const char *root)
cg_write(cg, "memory.high", "1M");
slab1 = cg_read_key_long(cg, "memory.stat", "slab ");
- if (slab1 <= 0)
+ if (slab1 < 0)
goto cleanup;
current = cg_read_long(cg, "memory.current");
--
2.41.0
This is agains mm/mm-unstable, but everything except patch #7 and #8
should apply on current master. Especially patch #1 and #2 should go
upstream first, so we can let the other stuff mature a bit longer.
Next attempt to handle the fallout of 474098edac26
("mm/gup: replace FOLL_NUMA by gup_can_follow_protnone()") where I
accidentially missed that follow_page() and smaps implicitly kept the
FOLL_NUMA flag clear by not setting it if FOLL_FORCE is absent, to
not trigger faults on PROT_NONE-mapped PTEs.
Patch #1 fixes the known issues by reintroducing FOLL_NUMA as
FOLL_HONOR_NUMA_FAULT and decoupling it from FOLL_FORCE.
Patch #2 is a cleanup that I think actually fixes some corner cases, so
I added a Fixes: tag.
Patch #3 makes KVM explicitly set FOLL_HONOR_NUMA_FAULT in the single
case where it is required, and documents the situation.
Patch #4 then stops implicitly setting FOLL_HONOR_NUMA_FAULT. But note that
for FOLL_WRITE we always implicitly honor NUMA hinting faults.
Patch #5 and patch #6 cleanup some comments.
Patch #7 improves the KVM functional tests such that patch #8 can
actually check for one of the known issues: KSM no longer working on
PROT_NONE mappings on x86-64 with CONFIG_NUMA_BALANCING.
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: liubo <liubo254(a)huawei.com>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
David Hildenbrand (8):
mm/gup: reintroduce FOLL_NUMA as FOLL_HONOR_NUMA_FAULT
smaps: use vm_normal_page_pmd() instead of follow_trans_huge_pmd()
kvm: explicitly set FOLL_HONOR_NUMA_FAULT in hva_to_pfn_slow()
mm/gup: don't implicitly set FOLL_HONOR_NUMA_FAULT
pgtable: improve pte_protnone() comment
mm/huge_memory: remove stale NUMA hinting comment from
follow_trans_huge_pmd()
selftest/mm: ksm_functional_tests: test in mmap_and_merge_range() if
anything got merged
selftest/mm: ksm_functional_tests: Add PROT_NONE test
fs/proc/task_mmu.c | 3 +-
include/linux/mm.h | 21 +++-
include/linux/mm_types.h | 9 ++
include/linux/pgtable.h | 16 ++-
mm/gup.c | 23 +++-
mm/huge_memory.c | 3 +-
.../selftests/mm/ksm_functional_tests.c | 106 ++++++++++++++++--
virt/kvm/kvm_main.c | 13 ++-
8 files changed, 164 insertions(+), 30 deletions(-)
--
2.41.0
Add trap and cleanup for SIGTERM sent by timeout and SIGINT from
keyboard, for the test times out and leaves incoherent network stack.
Fixes: 511e8db54036c ("selftests: forwarding: Add test for custom multipath hash")
Cc: Ido Schimmel <idosch(a)nvidia.com>
Cc: netdev(a)vger.kernel.org
---
tools/testing/selftests/net/forwarding/custom_multipath_hash.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/forwarding/custom_multipath_hash.sh b/tools/testing/selftests/net/forwarding/custom_multipath_hash.sh
index 56eb83d1a3bd..c7ab883d2515 100755
--- a/tools/testing/selftests/net/forwarding/custom_multipath_hash.sh
+++ b/tools/testing/selftests/net/forwarding/custom_multipath_hash.sh
@@ -363,7 +363,7 @@ custom_hash()
custom_hash_v6
}
-trap cleanup EXIT
+trap cleanup INT TERM EXIT
setup_prepare
setup_wait
--
2.34.1
On Tue, 1 Aug 2023 at 15:11, Greg Kroah-Hartman
<gregkh(a)linuxfoundation.org> wrote:
>
> This is the start of the stable review cycle for the 6.4.8 release.
> There are 239 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Thu, 03 Aug 2023 09:18:38 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.4.8-rc1.…
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.4.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
Following kselftest build regression found,
selftests/rseq: Play nice with binaries statically linked against
glibc 2.35+
commit 3bcbc20942db5d738221cca31a928efc09827069 upstream.
To allow running rseq and KVM's rseq selftests as statically linked
binaries, initialize the various "trampoline" pointers to point directly
at the expect glibc symbols, and skip the dlysm() lookups if the rseq
size is non-zero, i.e. the binary is statically linked *and* the libc
registered its own rseq.
Define weak versions of the symbols so as not to break linking against
libc versions that don't support rseq in any capacity.
The KVM selftests in particular are often statically linked so that they
can be run on targets with very limited runtime environments, i.e. test
machines.
Fixes: 233e667e1ae3 ("selftests/rseq: Uplift rseq selftests for
compatibility with glibc-2.35")
Cc: Aaron Lewis <aaronlewis(a)google.com>
Cc: kvm(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Message-Id: <20230721223352.2333911-1-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Build log:
----
x86_64-linux-gnu-gcc -O2 -Wall -g -I./ -isystem
/home/tuxbuild/.cache/tuxmake/builds/1/build/usr/include
-L/home/tuxbuild/.cache/tuxmake/builds/1/build/kselftest/rseq
-Wl,-rpath=./ -shared -fPIC rseq.c -lpthread -ldl -o
/home/tuxbuild/.cache/tuxmake/builds/1/build/kselftest/rseq/librseq.so
rseq.c:41:1: error: unknown type name '__weak'
41 | __weak ptrdiff_t __rseq_offset;
| ^~~~~~
rseq.c:41:18: error: expected '=', ',', ';', 'asm' or '__attribute__'
before '__rseq_offset'
41 | __weak ptrdiff_t __rseq_offset;
| ^~~~~~~~~~~~~
rseq.c:42:7: error: expected ';' before 'unsigned'
42 | __weak unsigned int __rseq_size;
| ^~~~~~~~~
| ;
rseq.c:43:7: error: expected ';' before 'unsigned'
43 | __weak unsigned int __rseq_flags;
| ^~~~~~~~~
| ;
rseq.c:45:47: error: '__rseq_offset' undeclared here (not in a
function); did you mean 'rseq_offset'?
45 | static const ptrdiff_t *libc_rseq_offset_p = &__rseq_offset;
| ^~~~~~~~~~~~~
| rseq_offset
make[3]: Leaving directory 'tools/testing/selftests/rseq'
Reported-by: Linux Kernel Functional Testing <lkft(a)linaro.org>
Links:
- https://storage.tuxsuite.com/public/linaro/lkft/builds/2TNSVjRCfcIaJWQNkPwD…
- https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.4.y/build/v6.4.7…
--
Linaro LKFT
https://lkft.linaro.org
The lkdtm selftest config fragment enables CONFIG_UBSAN_TRAP to make the
ARRAY_BOUNDS test kill the calling process when an out-of-bound access
is detected by UBSAN. However, after this [1] commit, UBSAN is triggered
under many new scenarios that weren't detected before, such as in struct
definitions with fixed-size trailing arrays used as flexible arrays. As
a result, CONFIG_UBSAN_TRAP=y has become a very aggressive option to
enable except for specific situations.
`make kselftest-merge` applies CONFIG_UBSAN_TRAP=y to the kernel config
for all selftests, which makes many of them fail because of system hangs
during boot.
This change removes the config option from the lkdtm kselftest and also
the ARRAY_BOUNDS test to skip it rather than have it failing. If
out-of-bound array accesses need to be checked, there's
CONFIG_TEST_UBSAN for that.
[1] commit 2d47c6956ab3 ("ubsan: Tighten UBSAN_BOUNDS on GCC")'
Signed-off-by: Ricardo Cañuelo <ricardo.canuelo(a)collabora.com>
---
tools/testing/selftests/lkdtm/config | 1 -
tools/testing/selftests/lkdtm/tests.txt | 2 +-
2 files changed, 1 insertion(+), 2 deletions(-)
diff --git a/tools/testing/selftests/lkdtm/config b/tools/testing/selftests/lkdtm/config
index 5d52f64dfb43..7afe05e8c4d7 100644
--- a/tools/testing/selftests/lkdtm/config
+++ b/tools/testing/selftests/lkdtm/config
@@ -9,7 +9,6 @@ CONFIG_INIT_ON_FREE_DEFAULT_ON=y
CONFIG_INIT_ON_ALLOC_DEFAULT_ON=y
CONFIG_UBSAN=y
CONFIG_UBSAN_BOUNDS=y
-CONFIG_UBSAN_TRAP=y
CONFIG_STACKPROTECTOR_STRONG=y
CONFIG_SLUB_DEBUG=y
CONFIG_SLUB_DEBUG_ON=y
diff --git a/tools/testing/selftests/lkdtm/tests.txt b/tools/testing/selftests/lkdtm/tests.txt
index 607b8d7e3ea3..6a49f2abbda8 100644
--- a/tools/testing/selftests/lkdtm/tests.txt
+++ b/tools/testing/selftests/lkdtm/tests.txt
@@ -7,7 +7,7 @@ EXCEPTION
#EXHAUST_STACK Corrupts memory on failure
#CORRUPT_STACK Crashes entire system on success
#CORRUPT_STACK_STRONG Crashes entire system on success
-ARRAY_BOUNDS
+#ARRAY_BOUNDS Needs CONFIG_UBSAN_TRAP=y, might cause unrelated system hangs
CORRUPT_LIST_ADD list_add corruption
CORRUPT_LIST_DEL list_del corruption
STACK_GUARD_PAGE_LEADING
--
2.25.1
[ commit be37bed754ed90b2655382f93f9724b3c1aae847 upstream ]
Dan Carpenter spotted that test_fw_config->reqs will be leaked if
trigger_batched_requests_store() is called two or more times.
The same appears with trigger_batched_requests_async_store().
This bug wasn't triggered by the tests, but observed by Dan's visual
inspection of the code.
The recommended workaround was to return -EBUSY if test_fw_config->reqs
is already allocated.
Fixes: c92316bf8e94 ("test_firmware: add batched firmware tests")
Cc: Luis Chamberlain <mcgrof(a)kernel.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Russ Weight <russell.h.weight(a)intel.com>
Cc: Tianfei Zhang <tianfei.zhang(a)intel.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Colin Ian King <colin.i.king(a)gmail.com>
Cc: Randy Dunlap <rdunlap(a)infradead.org>
Cc: linux-kselftest(a)vger.kernel.org
Cc: stable(a)vger.kernel.org # v4.14
Suggested-by: Dan Carpenter <error27(a)gmail.com>
Suggested-by: Takashi Iwai <tiwai(a)suse.de>
Link: https://lore.kernel.org/r/20230509084746.48259-2-mirsad.todorovac@alu.unizg…
Signed-off-by: Mirsad Todorovac <mirsad.todorovac(a)alu.unizg.hr>
[ This fix is applied against the 4.14 stable branch. There are no changes to the ]
[ fix in code when compared to the upstread, only the reformatting for backport. ]
Signed-off-by: Mirsad Todorovac <mirsad.todorovac(a)alu.unizg.hr>
---
v1 -> v2:
removed the Reviewed-by: and Acked-by tags, as this is a slightly different patch and
those need to be reacquired
lib/test_firmware.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/lib/test_firmware.c b/lib/test_firmware.c
index 1c5e5246bf10..5318c5e18acf 100644
--- a/lib/test_firmware.c
+++ b/lib/test_firmware.c
@@ -621,6 +621,11 @@ static ssize_t trigger_batched_requests_store(struct device *dev,
mutex_lock(&test_fw_mutex);
+ if (test_fw_config->reqs) {
+ rc = -EBUSY;
+ goto out_bail;
+ }
+
test_fw_config->reqs = vzalloc(sizeof(struct test_batched_req) *
test_fw_config->num_requests * 2);
if (!test_fw_config->reqs) {
@@ -723,6 +728,11 @@ ssize_t trigger_batched_requests_async_store(struct device *dev,
mutex_lock(&test_fw_mutex);
+ if (test_fw_config->reqs) {
+ rc = -EBUSY;
+ goto out_bail;
+ }
+
test_fw_config->reqs = vzalloc(sizeof(struct test_batched_req) *
test_fw_config->num_requests * 2);
if (!test_fw_config->reqs) {
--
2.34.1
There are macros in kernel.h that can be used outside of that header.
Split them to args.h and replace open coded variants.
Test compiled with `make allmodconfig` for x86_64.
Test cross-compiled with `make multi_v7_defconfig` for arm.
(Note that positive diff statistics is due to documentation being
updated.)
In v4:
- fixed compilation error on arm (LKP, Stephen)
In v3:
- split to a series of patches
- fixed build issue on `make allmodconfig` for x86_64 (Andrew)
In v2:
- converted existing users at the same time (Andrew, Rasmus)
- documented how it does work (Andrew, Rasmus)
Andy Shevchenko (4):
kernel.h: Split out COUNT_ARGS() and CONCATENATE() to args.h
x86/asm: Replace custom COUNT_ARGS() & CONCATENATE() implementations
arm64: smccc: Replace custom COUNT_ARGS() & CONCATENATE()
implementations
genetlink: Replace custom CONCATENATE() implementation
arch/x86/include/asm/rmwcc.h | 11 ++---
include/kunit/test.h | 1 +
include/linux/args.h | 28 +++++++++++++
include/linux/arm-smccc.h | 69 ++++++++++++++-----------------
include/linux/genl_magic_func.h | 27 ++++++------
include/linux/genl_magic_struct.h | 8 ++--
include/linux/kernel.h | 7 ----
include/linux/pci.h | 2 +-
include/trace/bpf_probe.h | 2 +
9 files changed, 84 insertions(+), 71 deletions(-)
create mode 100644 include/linux/args.h
--
2.40.0.1.gaa8946217a0b
lwt xmit hook does not expect positive return values in function
ip_finish_output2 and ip6_finish_output2. However, BPF redirect programs
can return positive values such like NET_XMIT_DROP, NET_RX_DROP, and etc
as errors. Such return values can panic the kernel unexpectedly:
https://gist.github.com/zhaiyan920/8fbac245b261fe316a7ef04c9b1eba48
This patch fixes the return values from BPF redirect, so the error
handling would be consistent at xmit hook. It also adds a few test cases
to prevent future regressions.
v3: https://lore.kernel.org/bpf/cover.1690255889.git.yan@cloudflare.com/
v2: https://lore.kernel.org/netdev/ZLdY6JkWRccunvu0@debian.debian/
v1: https://lore.kernel.org/bpf/ZLbYdpWC8zt9EJtq@debian.debian/
changes since v3:
* minor change in commit message and changelogs
* tested by Jakub Sitnicki
changes since v2:
* subject name changed
* also covered redirect to ingress case
* added selftests
changes since v1:
* minor code style changes
Yan Zhai (2):
bpf: fix skb_do_redirect return values
bpf: selftests: add lwt redirect regression test cases
include/linux/netdevice.h | 2 +
net/core/filter.c | 9 +-
tools/testing/selftests/bpf/Makefile | 1 +
.../selftests/bpf/progs/test_lwt_redirect.c | 66 +++++++
.../selftests/bpf/test_lwt_redirect.sh | 174 ++++++++++++++++++
5 files changed, 250 insertions(+), 2 deletions(-)
create mode 100644 tools/testing/selftests/bpf/progs/test_lwt_redirect.c
create mode 100755 tools/testing/selftests/bpf/test_lwt_redirect.sh
--
2.30.2
Hi Willy and Thomas,
In v3, I have fixed all the problems you mentioned.
Welcome any other suggestion.
---
Changes in v3:
- Fix the missing return
- Fix __NR_pipe to __NR_pipe2
- Fix the missing static
- Test case works in one process
- Link to v2:
https://lore.kernel.org/all/cover.1690733545.git.tanyuan@tinylab.org
Changes in v2:
- Use sys_pipe2 to implement the pipe()
- Use memcmp() instead of strcmp()
- Link to v1:
https://lore.kernel.org/all/cover.1690307717.git.tanyuan@tinylab.org
---
Yuan Tan (2):
tools/nolibc: add pipe() and pipe2() support
selftests/nolibc: add testcase for pipe
tools/include/nolibc/sys.h | 24 ++++++++++++++++++++
tools/testing/selftests/nolibc/nolibc-test.c | 22 ++++++++++++++++++
2 files changed, 46 insertions(+)
--
2.34.1
[ commit be37bed754ed90b2655382f93f9724b3c1aae847 upstream ]
Dan Carpenter spotted that test_fw_config->reqs will be leaked if
trigger_batched_requests_store() is called two or more times.
The same appears with trigger_batched_requests_async_store().
This bug wasn't triggered by the tests, but observed by Dan's visual
inspection of the code.
The recommended workaround was to return -EBUSY if test_fw_config->reqs
is already allocated.
Fixes: c92316bf8e94 ("test_firmware: add batched firmware tests")
Cc: Luis Chamberlain <mcgrof(a)kernel.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Russ Weight <russell.h.weight(a)intel.com>
Cc: Tianfei Zhang <tianfei.zhang(a)intel.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Colin Ian King <colin.i.king(a)gmail.com>
Cc: Randy Dunlap <rdunlap(a)infradead.org>
Cc: linux-kselftest(a)vger.kernel.org
Cc: stable(a)vger.kernel.org # v4.19
Suggested-by: Dan Carpenter <error27(a)gmail.com>
Suggested-by: Takashi Iwai <tiwai(a)suse.de>
Reviewed-by: Dan Carpenter <dan.carpenter(a)linaro.org>
Acked-by: Luis Chamberlain <mcgrof(a)kernel.org>
Link: https://lore.kernel.org/r/20230509084746.48259-2-mirsad.todorovac@alu.unizg…
Signed-off-by: Mirsad Todorovac <mirsad.todorovac(a)alu.unizg.hr>
[ This is a backport to v4.19 stable branch without a change in code from the 5.4+ patch ]
---
v1:
patch sumbmitted verbatim from the 5.4+ branch to 4.19
lib/test_firmware.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/lib/test_firmware.c b/lib/test_firmware.c
index f4cc874021da..e4688821eab8 100644
--- a/lib/test_firmware.c
+++ b/lib/test_firmware.c
@@ -618,6 +618,11 @@ static ssize_t trigger_batched_requests_store(struct device *dev,
mutex_lock(&test_fw_mutex);
+ if (test_fw_config->reqs) {
+ rc = -EBUSY;
+ goto out_bail;
+ }
+
test_fw_config->reqs =
vzalloc(array3_size(sizeof(struct test_batched_req),
test_fw_config->num_requests, 2));
@@ -721,6 +726,11 @@ ssize_t trigger_batched_requests_async_store(struct device *dev,
mutex_lock(&test_fw_mutex);
+ if (test_fw_config->reqs) {
+ rc = -EBUSY;
+ goto out_bail;
+ }
+
test_fw_config->reqs =
vzalloc(array3_size(sizeof(struct test_batched_req),
test_fw_config->num_requests, 2));
--
2.34.1
Hi, Willy, Thomas,
Thanks to your advice and I really learned a lot from it.
V2 now uses pipe2() to wrap pipe(), and fixes the strcmp issue in test
case.
Best regards,
Yuan Tan
Yuan Tan (2):
tools/nolibc: add pipe() and pipe2() support
selftests/nolibc: add testcase for pipe
tools/include/nolibc/sys.h | 24 ++++++++++++++
tools/testing/selftests/nolibc/nolibc-test.c | 35 ++++++++++++++++++++
2 files changed, 59 insertions(+)
--
2.34.1
This series was originally written by José Expósito, and can be found
here:
https://github.com/Rust-for-Linux/linux/pull/950
Add support for writing KUnit tests in Rust. While Rust doctests are
already converted to KUnit tests and run, they're really better suited
for examples, rather than as first-class unit tests.
This series implements a series of direct Rust bindings for KUnit tests,
as well as a new macro which allows KUnit tests to be written using a
close variant of normal Rust unit test syntax. The only change required
is replacing '#[cfg(test)]' with '#[kunit_tests(kunit_test_suite_name)]'
An example test would look like:
#[kunit_tests(rust_kernel_hid_driver)]
mod tests {
use super::*;
use crate::{c_str, driver, hid, prelude::*};
use core::ptr;
struct SimpleTestDriver;
impl Driver for SimpleTestDriver {
type Data = ();
}
#[test]
fn rust_test_hid_driver_adapter() {
let mut hid = bindings::hid_driver::default();
let name = c_str!("SimpleTestDriver");
static MODULE: ThisModule = unsafe { ThisModule::from_ptr(ptr::null_mut()) };
let res = unsafe {
<hid::Adapter<SimpleTestDriver> as driver::DriverOps>::register(&mut hid, name, &MODULE)
};
assert_eq!(res, Err(ENODEV)); // The mock returns -19
}
}
Changes since the GitHub PR:
- Rebased on top of kselftest/kunit
- Add const_mut_refs feature
This may conflict with https://lore.kernel.org/lkml/20230503090708.2524310-6-nmi@metaspace.dk/
- Add rust/macros/kunit.rs to the KUnit MAINTAINERS entry
Signed-off-by: David Gow <davidgow(a)google.com>
---
José Expósito (3):
rust: kunit: add KUnit case and suite macros
rust: macros: add macro to easily run KUnit tests
rust: kunit: allow to know if we are in a test
MAINTAINERS | 1 +
rust/kernel/kunit.rs | 181 +++++++++++++++++++++++++++++++++++++++++++++++++++
rust/kernel/lib.rs | 1 +
rust/macros/kunit.rs | 149 ++++++++++++++++++++++++++++++++++++++++++
rust/macros/lib.rs | 29 +++++++++
5 files changed, 361 insertions(+)
---
base-commit: 64bd4641310c41a1ecf07c13c67bc0ed61045dfd
change-id: 20230720-rustbind-477964954da5
Best regards,
--
David Gow <davidgow(a)google.com>
[ commit be37bed754ed90b2655382f93f9724b3c1aae847 upstream ]
Dan Carpenter spotted that test_fw_config->reqs will be leaked if
trigger_batched_requests_store() is called two or more times.
The same appears with trigger_batched_requests_async_store().
This bug wasn't triggered by the tests, but observed by Dan's visual
inspection of the code.
The recommended workaround was to return -EBUSY if test_fw_config->reqs
is already allocated.
Fixes: c92316bf8e94 ("test_firmware: add batched firmware tests")
Cc: Luis Chamberlain <mcgrof(a)kernel.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Russ Weight <russell.h.weight(a)intel.com>
Cc: Tianfei Zhang <tianfei.zhang(a)intel.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Colin Ian King <colin.i.king(a)gmail.com>
Cc: Randy Dunlap <rdunlap(a)infradead.org>
Cc: linux-kselftest(a)vger.kernel.org
Cc: stable(a)vger.kernel.org # v4.14
Suggested-by: Dan Carpenter <error27(a)gmail.com>
Suggested-by: Takashi Iwai <tiwai(a)suse.de>
Reviewed-by: Dan Carpenter <dan.carpenter(a)linaro.org>
Acked-by: Luis Chamberlain <mcgrof(a)kernel.org>
Link: https://lore.kernel.org/r/20230509084746.48259-2-mirsad.todorovac@alu.unizg…
Signed-off-by: Mirsad Todorovac <mirsad.todorovac(a)alu.unizg.hr>
[ This fix is applied against the 4.14 stable branch. There are no changes to the ]
[ fix in code when compared to the upstread, only the reformatting for backport. ]
---
lib/test_firmware.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/lib/test_firmware.c b/lib/test_firmware.c
index 1c5e5246bf10..5318c5e18acf 100644
--- a/lib/test_firmware.c
+++ b/lib/test_firmware.c
@@ -621,6 +621,11 @@ static ssize_t trigger_batched_requests_store(struct device *dev,
mutex_lock(&test_fw_mutex);
+ if (test_fw_config->reqs) {
+ rc = -EBUSY;
+ goto out_bail;
+ }
+
test_fw_config->reqs = vzalloc(sizeof(struct test_batched_req) *
test_fw_config->num_requests * 2);
if (!test_fw_config->reqs) {
@@ -723,6 +728,11 @@ ssize_t trigger_batched_requests_async_store(struct device *dev,
mutex_lock(&test_fw_mutex);
+ if (test_fw_config->reqs) {
+ rc = -EBUSY;
+ goto out_bail;
+ }
+
test_fw_config->reqs = vzalloc(sizeof(struct test_batched_req) *
test_fw_config->num_requests * 2);
if (!test_fw_config->reqs) {
--
2.39.3
Hi, Willy, Thomas
Currently, since this part of the discussion is out of the original topic [1],
as suggested by Willy, open a new thread for this.
[1]: https://lore.kernel.org/lkml/20230731224929.GA18296@1wt.eu/#R
The original topic [1] tries to enable -Wall (or even -Wextra) to report some
issues (include the dead unused functions/data) found during compiling stage,
this further propose a method to enable '-ffunction-sections -fdata-sections
-Wl,--gc-sections,--print-gc-sections to' find the dead unused functions/data
during linking stage:
Just thought about gc-section, do we need to further remove dead code/data
in the binary? I don't think it is necessary for nolibc-test itself, but with
'-Wl,--gc-sections -Wl,--print-gc-sections' may be a good helper to show us
which ones should be dropped or which ones are wrongly declared as public?
Just found '-O3 + -Wl,--gc-section + -Wl,--print-gc-sections' did tell
us something as below:
removing unused section '.text.nolibc_raise'
removing unused section '.text.nolibc_memmove'
removing unused section '.text.nolibc_abort'
removing unused section '.text.nolibc_memcpy'
removing unused section '.text.__stack_chk_init'
removing unused section '.text.is_setting_valid'
These info may help us further add missing 'static' keyword or find
another method to to drop the wrongly used status of some functions from
the code side.
Let's continue the discussion as below.
> On Mon, Jul 31, 2023 at 08:36:05PM +0200, Thomas Wei�schuh wrote:
>
> [...]
>
> > > > > It is very easy to add the missing 'static' keyword for is_setting_valid(), but
> > > > > for __stack_chk_init(), is it ok for us to convert it to 'static' and remove
> > > > > the 'weak' attrbute and even the 'section' attribute? seems it is only used by
> > > > > our _start_c() currently.
> > > >
> > > > Making is_setting_valid(), __stack_chk_init() seems indeed useful.
> > > > Also all the run_foo() test functions.
> > >
> > > Most of them could theoretically be turned to static. *But* it causes a
> > > problem which is that it will multiply their occurrences in multi-unit
> > > programs, and that's in part why we've started to use weak instead. Also
> > > if you run through gdb and want to mark a break point, you won't have the
> > > symbol when it's static,
Willy, did I misunderstand something again? a simple test shows, seems this is
not really always like that, static mainly means 'local', the symbol is still
there if without -O2/-Os and is able to be set a breakpoint at:
// test.c: gcc -o test test.c
#include <stdio.h>
static int test (void)
{
printf("hello, world!\n");
}
int main(void)
{
test();
return 0;
}
Even with -Os/-O2, an additional '-g' is able to generate the 'test' symbol for
debug as we expect.
> and the code will appear at multiple locations,
> > > which is really painful. I'd instead really prefer to avoid static when
> > > we don't strictly want to inline the code, and prefer weak when possible
> > > because we know many of them will be dropped at link time (and that's
> > > the exact purpose).
For the empty __stack_chk_init() one (when the arch not support stackprotector)
we used, when with 'weak', it is not possible drop it during link time even
with -O3, the weak one will be dropped may be only when there is a global one
with the same name is used or the 'weak' one is never really used?
#include <stdio.h>
__attribute__((weak,unused,section(".text.nolibc_memset")))
int test (void)
{
printf("hello, world!\n");
}
int main(void)
{
test();
return 0;
}
0000000000001060 <main>:
1060: f3 0f 1e fa endbr64
1064: 48 83 ec 08 sub $0x8,%rsp
1068: e8 03 01 00 00 callq 1170 <test>
106d: 31 c0 xor %eax,%eax
106f: 48 83 c4 08 add $0x8,%rsp
1073: c3 retq
1074: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
107b: 00 00 00
107e: 66 90 xchg %ax,%ax
Seems it is either impossible to add a 'inline' keyword again with the 'weak'
attribute (warned by compiler), so, the _start_c (itself is always called by
_start) will always add an empty call to the weak empty __stack_chk_init(),
-Os/-O2/-O3 don't help. for such an empty function, in my opinion, as the size
we want to care about, the calling place should be simply removed by compiler.
Test also shows, with current __inline__ method, the calling place is removed,
but with c89, the __stack_chk_init() itself will not be droped automatically,
only if not with -std=c89, it will be dropped and not appear in the
--print-gc-sections result.
Even for a supported architecture, the shorter __stack_chk_init() may be better
to inlined to the _start_c()?
So, If my above test is ok, then, we'd better simply convert the whole
__stack_chk_init() to a static one as below (I didn't investigate this deeply
due to the warning about static and weak conflict at the first time):
diff --git a/tools/include/nolibc/crt.h b/tools/include/nolibc/crt.h
index 32e128b0fb62..a5f33fef1672 100644
--- a/tools/include/nolibc/crt.h
+++ b/tools/include/nolibc/crt.h
@@ -10,7 +10,7 @@
char **environ __attribute__((weak));
const unsigned long *_auxv __attribute__((weak));
-void __stack_chk_init(void);
+static void __stack_chk_init(void);
static void exit(int);
void _start_c(long *sp)
diff --git a/tools/include/nolibc/stackprotector.h b/tools/include/nolibc/stackprotector.h
index b620f2b9578d..13f1d0e60387 100644
--- a/tools/include/nolibc/stackprotector.h
+++ b/tools/include/nolibc/stackprotector.h
@@ -37,8 +37,7 @@ void __stack_chk_fail_local(void)
__attribute__((weak,section(".data.nolibc_stack_chk")))
uintptr_t __stack_chk_guard;
-__attribute__((weak,section(".text.nolibc_stack_chk"))) __no_stack_protector
-void __stack_chk_init(void)
+static __no_stack_protector void __stack_chk_init(void)
{
my_syscall3(__NR_getrandom, &__stack_chk_guard, sizeof(__stack_chk_guard), 0);
/* a bit more randomness in case getrandom() fails, ensure the guard is never 0 */
@@ -46,7 +45,7 @@ void __stack_chk_init(void)
__stack_chk_guard ^= (uintptr_t) &__stack_chk_guard;
}
#else /* !defined(_NOLIBC_STACKPROTECTOR) */
-__inline__ void __stack_chk_init(void) {}
+static void __stack_chk_init(void) {}
#endif /* defined(_NOLIBC_STACKPROTECTOR) */
#endif /* _NOLIBC_STACKPROTECTOR_H */
> >
> > Thanks for the clarification. I forgot about that completely!
> >
> > The stuff from nolibc-test.c itself (run_foo() and is_settings_valid())
> > should still be done.
>
> Yes, likely. Nolibc-test should be done just like users expect to use
> nolibc, and nolibc should be the most flexible possible.
For the 'static' keyword we tested above, '-g' may help the debug requirement,
so, is ok for us to apply 'static' for them safely now?
A further test shows, with 'static' on _start_c() doesn't help the size, for it
is always called from _start(), will never save move instructions, but we need
a more 'used' attribute to silence the error 'nolibc-test.c:(.text+0x38cd):
undefined reference to `_start_c'', so, reserve it as the before simpler 'void
_start_c(void)' may be better?
static __attribute__((used)) void _start_c(long *sp)
Thanks,
Zhangjin
>
> Cheers,
> Willy
[ Upstream commit 4acfe3dfde685a5a9eaec5555351918e2d7266a1 ]
Dan Carpenter spotted a race condition in a couple of situations like
these in the test_firmware driver:
static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
u8 val;
int ret;
ret = kstrtou8(buf, 10, &val);
if (ret)
return ret;
mutex_lock(&test_fw_mutex);
*(u8 *)cfg = val;
mutex_unlock(&test_fw_mutex);
/* Always return full write size even if we didn't consume all */
return size;
}
static ssize_t config_num_requests_store(struct device *dev,
struct device_attribute *attr,
const char *buf, size_t count)
{
int rc;
mutex_lock(&test_fw_mutex);
if (test_fw_config->reqs) {
pr_err("Must call release_all_firmware prior to changing config\n");
rc = -EINVAL;
mutex_unlock(&test_fw_mutex);
goto out;
}
mutex_unlock(&test_fw_mutex);
// NOTE: HERE is the race!!! Function can be preempted!
// test_fw_config->reqs can change between the release of
// the lock about and acquire of the lock in the
// test_dev_config_update_u8()
rc = test_dev_config_update_u8(buf, count,
&test_fw_config->num_requests);
out:
return rc;
}
static ssize_t config_read_fw_idx_store(struct device *dev,
struct device_attribute *attr,
const char *buf, size_t count)
{
return test_dev_config_update_u8(buf, count,
&test_fw_config->read_fw_idx);
}
The function test_dev_config_update_u8() is called from both the locked
and the unlocked context, function config_num_requests_store() and
config_read_fw_idx_store() which can both be called asynchronously as
they are driver's methods, while test_dev_config_update_u8() and siblings
change their argument pointed to by u8 *cfg or similar pointer.
To avoid deadlock on test_fw_mutex, the lock is dropped before calling
test_dev_config_update_u8() and re-acquired within test_dev_config_update_u8()
itself, but alas this creates a race condition.
Having two locks wouldn't assure a race-proof mutual exclusion.
This situation is best avoided by the introduction of a new, unlocked
function __test_dev_config_update_u8() which can be called from the locked
context and reducing test_dev_config_update_u8() to:
static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
int ret;
mutex_lock(&test_fw_mutex);
ret = __test_dev_config_update_u8(buf, size, cfg);
mutex_unlock(&test_fw_mutex);
return ret;
}
doing the locking and calling the unlocked primitive, which enables both
locked and unlocked versions without duplication of code.
Fixes: c92316bf8e948 ("test_firmware: add batched firmware tests")
Cc: Luis R. Rodriguez <mcgrof(a)kernel.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Russ Weight <russell.h.weight(a)intel.com>
Cc: Takashi Iwai <tiwai(a)suse.de>
Cc: Tianfei Zhang <tianfei.zhang(a)intel.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Colin Ian King <colin.i.king(a)gmail.com>
Cc: Randy Dunlap <rdunlap(a)infradead.org>
Cc: linux-kselftest(a)vger.kernel.org
Cc: stable(a)vger.kernel.org # v5.4
Suggested-by: Dan Carpenter <error27(a)gmail.com>
Link: https://lore.kernel.org/r/20230509084746.48259-1-mirsad.todorovac@alu.unizg…
Signed-off-by: Mirsad Todorovac <mirsad.todorovac(a)alu.unizg.hr>
[ NOTE: This patch is tested against 5.4 stable. Some of the fixes from the upstream ]
[ patch to stable 5.10+ do not apply. ]
---
v2:
minor clarification and consistency improvements. no change to code.
lib/test_firmware.c | 37 ++++++++++++++++++++++++++++---------
1 file changed, 28 insertions(+), 9 deletions(-)
diff --git a/lib/test_firmware.c b/lib/test_firmware.c
index 38553944e967..92d7195d5b5b 100644
--- a/lib/test_firmware.c
+++ b/lib/test_firmware.c
@@ -301,16 +301,26 @@ static ssize_t config_test_show_str(char *dst,
return len;
}
-static int test_dev_config_update_bool(const char *buf, size_t size,
- bool *cfg)
+static inline int __test_dev_config_update_bool(const char *buf, size_t size,
+ bool *cfg)
{
int ret;
- mutex_lock(&test_fw_mutex);
if (strtobool(buf, cfg) < 0)
ret = -EINVAL;
else
ret = size;
+
+ return ret;
+}
+
+static int test_dev_config_update_bool(const char *buf, size_t size,
+ bool *cfg)
+{
+ int ret;
+
+ mutex_lock(&test_fw_mutex);
+ ret = __test_dev_config_update_bool(buf, size, cfg);
mutex_unlock(&test_fw_mutex);
return ret;
@@ -340,7 +350,7 @@ static ssize_t test_dev_config_show_int(char *buf, int cfg)
return snprintf(buf, PAGE_SIZE, "%d\n", val);
}
-static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
+static inline int __test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
int ret;
long new;
@@ -352,14 +362,23 @@ static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
if (new > U8_MAX)
return -EINVAL;
- mutex_lock(&test_fw_mutex);
*(u8 *)cfg = new;
- mutex_unlock(&test_fw_mutex);
/* Always return full write size even if we didn't consume all */
return size;
}
+static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
+{
+ int ret;
+
+ mutex_lock(&test_fw_mutex);
+ ret = __test_dev_config_update_u8(buf, size, cfg);
+ mutex_unlock(&test_fw_mutex);
+
+ return ret;
+}
+
static ssize_t test_dev_config_show_u8(char *buf, u8 cfg)
{
u8 val;
@@ -392,10 +411,10 @@ static ssize_t config_num_requests_store(struct device *dev,
mutex_unlock(&test_fw_mutex);
goto out;
}
- mutex_unlock(&test_fw_mutex);
- rc = test_dev_config_update_u8(buf, count,
- &test_fw_config->num_requests);
+ rc = __test_dev_config_update_u8(buf, count,
+ &test_fw_config->num_requests);
+ mutex_unlock(&test_fw_mutex);
out:
return rc;
--
2.34.1
iommufd gives userspace the capability to manipulate iommu subsytem.
e.g. DMA map/unmap etc. In the near future, it will support iommu nested
translation. Different platform vendors have different implementation for
the nested translation. For example, Intel VT-d supports using guest I/O
page table as the stage-1 translation table. This requires guest I/O page
table be compatible with hardware IOMMU. So before set up nested translation,
userspace needs to know the hardware iommu information to understand the
nested translation requirements.
This series reports the iommu hardware information for a given device
which has been bound to iommufd. It is preparation work for userspace to
allocate hwpt for given device. Like the nested translation support[1].
This series introduces an iommu op to report the iommu hardware info,
and an ioctl IOMMU_GET_HW_INFO is added to report such hardware info to
user. enum iommu_hw_info_type is defined to differentiate the iommu hardware
info reported to user hence user can decode them. This series only adds the
framework for iommu hw info reporting, the complete reporting path needs vendor
specific definition and driver support. The full code is available in [1]
as well.
[1] https://github.com/yiliu1765/iommufd/tree/iommufd_nesting
Change log:
v4:
- Rename ioctl to IOMMU_GET_HW_INFO and structure to iommu_hw_info
- Move the iommufd_get_hw_info handler to main.c
- Place iommu_hw_info prior to iommu_hwpt_alloc
- Update the function namings accordingly
- Update uapi kdocs
v3: https://lore.kernel.org/linux-iommu/20230511143024.19542-1-yi.l.liu@intel.c…
- Add r-b from Baolu
- Rename IOMMU_HW_INFO_TYPE_DEFAULT to be IOMMU_HW_INFO_TYPE_NONE to
better suit what it means
- Let IOMMU_DEVICE_GET_HW_INFO succeed even the underlying iommu driver
does not have driver-specific data to report per below remark.
https://lore.kernel.org/kvm/ZAcwJSK%2F9UVI9LXu@nvidia.com/
v2: https://lore.kernel.org/linux-iommu/20230309075358.571567-1-yi.l.liu@intel.…
- Drop patch 05 of v1 as it is already covered by other series
- Rename the capability info to be iommu hardware info
v1: https://lore.kernel.org/linux-iommu/20230209041642.9346-1-yi.l.liu@intel.co…
Regards,
Yi Liu
Lu Baolu (1):
iommu: Add new iommu op to get iommu hardware information
Nicolin Chen (1):
iommufd/selftest: Add coverage for IOMMU_GET_HW_INFO ioctl
Yi Liu (2):
iommu: Move dev_iommu_ops() to private header
iommufd: Add IOMMU_GET_HW_INFO
drivers/iommu/iommu-priv.h | 11 +++
drivers/iommu/iommufd/device.c | 1 +
drivers/iommu/iommufd/iommufd_test.h | 9 +++
drivers/iommu/iommufd/main.c | 76 +++++++++++++++++++
drivers/iommu/iommufd/selftest.c | 16 ++++
include/linux/iommu.h | 27 ++++---
include/uapi/linux/iommufd.h | 44 +++++++++++
tools/testing/selftests/iommu/iommufd.c | 17 ++++-
tools/testing/selftests/iommu/iommufd_utils.h | 26 +++++++
9 files changed, 215 insertions(+), 12 deletions(-)
--
2.34.1
This small series of 4 patches adds some improvements in MPTCP
selftests:
- Patch 1 reworks the detailed report of mptcp_join.sh selftest to
better display what went well or wrong per test.
- Patch 2 adds colours (if supported, forced and/or not disabled) in
mptcp_join.sh selftest output to help spotting issues.
- Patch 3 modifies an MPTCP selftest tool to interact with the
path-manager via Netlink to always look for errors if any. This makes
sure odd behaviours can be seen in the logs and errors can be caught
later if needed.
- Patch 4 removes stdout and stderr redirections to /dev/null when using
pm_nl_ctl if no errors are expected in order to log odd behaviours.
Signed-off-by: Matthieu Baerts <matthieu.baerts(a)tessares.net>
---
Matthieu Baerts (4):
selftests: mptcp: join: rework detailed report
selftests: mptcp: join: colored results
selftests: mptcp: pm_nl_ctl: always look for errors
selftests: mptcp: userspace_pm: unmute unexpected errors
tools/testing/selftests/net/mptcp/mptcp_join.sh | 452 ++++++++++------------
tools/testing/selftests/net/mptcp/mptcp_lib.sh | 39 ++
tools/testing/selftests/net/mptcp/pm_netlink.sh | 6 +-
tools/testing/selftests/net/mptcp/pm_nl_ctl.c | 33 +-
tools/testing/selftests/net/mptcp/userspace_pm.sh | 100 ++---
5 files changed, 329 insertions(+), 301 deletions(-)
---
base-commit: 64a37272fa5fb2d951ebd1a96fd42b045d64924c
change-id: 20230728-upstream-net-next-20230728-mptcp-selftests-misc-0190cfd69ef9
Best regards,
--
Matthieu Baerts <matthieu.baerts(a)tessares.net>
There is a warning reported by coccinelle:
./tools/testing/selftests/cgroup/test_zswap.c:211:6-18: WARNING:
Unsigned expression compared with zero: stored_pages < 0
The type of "stored_pages" is size_t, which always be an unsigned type,
so it is impossible less than zero. Drop the if statements to silence
the warning.
Signed-off-by: Li Zetao <lizetao1(a)huawei.com>
---
tools/testing/selftests/cgroup/test_zswap.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/tools/testing/selftests/cgroup/test_zswap.c b/tools/testing/selftests/cgroup/test_zswap.c
index 49def87a909b..dbad8d0cd090 100644
--- a/tools/testing/selftests/cgroup/test_zswap.c
+++ b/tools/testing/selftests/cgroup/test_zswap.c
@@ -208,8 +208,6 @@ static int test_no_kmem_bypass(const char *root)
free(trigger_allocation);
if (get_zswap_stored_pages(&stored_pages))
break;
- if (stored_pages < 0)
- break;
/* If memory was pushed to zswap, verify it belongs to memcg */
if (stored_pages > stored_pages_threshold) {
int zswapped = cg_read_key_long(test_group, "memory.stat", "zswapped ");
--
2.34.1
This 3 patch series consists of fixes to proc_filter test
found during linun-next testing.
The first patch fixes the LKFT reported compile error, second
one adds .gitignore and the third fixes error paths to skip
instead of fail (root check, and argument checks)
Shuah Khan (3):
selftests:connector: Fix Makefile to include KHDR_INCLUDES
selftests:connector: Add .gitignore and poupulate it with test
selftests:connector: Add root check and fix arg error paths to skip
tools/testing/selftests/connector/.gitignore | 1 +
tools/testing/selftests/connector/Makefile | 2 +-
tools/testing/selftests/connector/proc_filter.c | 9 +++++++--
3 files changed, 9 insertions(+), 3 deletions(-)
create mode 100644 tools/testing/selftests/connector/.gitignore
--
2.39.2
The openvswitch selftests currently contain a few cases for managing the
datapath, which includes creating datapath instances, adding interfaces,
and doing some basic feature / upcall tests. This is useful to validate
the control path.
Add the ability to program some of the more common flows with actions. This
can be improved overtime to include regression testing, etc.
Changes from original:
1. Fix issue when parsing ipv6 in the NAT action
2. Fix issue calculating length during ctact parsing
3. Fix error message when invalid bridge is passed
4. Fold in Adrian's patch to support key masks
Aaron Conole (4):
selftests: openvswitch: add an initial flow programming case
selftests: openvswitch: add a test for ipv4 forwarding
selftests: openvswitch: add basic ct test case parsing
selftests: openvswitch: add ct-nat test case with ipv4
Adrian Moreno (1):
selftests: openvswitch: support key masks
.../selftests/net/openvswitch/openvswitch.sh | 223 +++++++
.../selftests/net/openvswitch/ovs-dpctl.py | 601 +++++++++++++++++-
2 files changed, 800 insertions(+), 24 deletions(-)
--
2.40.1
I checked and the Landlock ptrace test failed because Yama is enabled,
which is expected. You can check that with
/proc/sys/kernel/yama/ptrace_scope
Jeff Xu sent a patch to fix this case but it is not ready yet:
https://lore.kernel.org/r/20220628222941.2642917-1-jeffxu@google.com
Could you please send a new patch Jeff, and add Limin in Cc?
On 29/11/2022 12:26, limin wrote:
> cat /proc/cmdline
> BOOT_IMAGE=/vmlinuz-6.1.0-next-20221116
> root=UUID=a65b3a79-dc02-4728-8a0c-5cf24f4ae08b ro
> systemd.unified_cgroup_hierarchy=1 cgroup_no_v1=all
>
>
> config
>
> #
> # Automatically generated file; DO NOT EDIT.
> # Linux/x86 6.1.0-rc6 Kernel Configuration
> #
[...]
> CONFIG_SECURITY_YAMA=y
[...]
> CONFIG_LSM="landlock,lockdown,yama,integrity,apparmor"
[...]
>
> On 2022/11/29 19:03, Mickaël Salaün wrote:
>> I tested with next-20221116 and all tests are OK. Could you share your
>> kernel configuration with a link? What is the content of /proc/cmdline?
>>
>> On 29/11/2022 02:42, limin wrote:
>>> I run test on Linux ubuntu2204 6.1.0-next-20221116
>>>
>>> I did't use yama.
>>>
>>> you can reproduce by this step:
>>>
>>> cd kernel_src
>>>
>>> cd tools/testing/selftests/landlock/
>>> make
>>> ./ptrace_test
>>>
>>>
>>>
>>>
>>> On 2022/11/29 3:44, Mickaël Salaün wrote:
>>>> This patch changes the test semantic and then cannot work on my test
>>>> environment. On which kernel did you run test? Do you use Yama or
>>>> something similar?
>>>>
>>>> On 28/11/2022 03:04, limin wrote:
>>>>> Tests PTRACE_ATTACH and PTRACE_MODE_READ on the parent,
>>>>> trace parent return -1 when child== 0
>>>>> How to reproduce warning:
>>>>> $ make -C tools/testing/selftests TARGETS=landlock run_tests
>>>>>
>>>>> Signed-off-by: limin <limin100(a)huawei.com>
>>>>> ---
>>>>> tools/testing/selftests/landlock/ptrace_test.c | 5 ++---
>>>>> 1 file changed, 2 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/tools/testing/selftests/landlock/ptrace_test.c
>>>>> b/tools/testing/selftests/landlock/ptrace_test.c
>>>>> index c28ef98ff3ac..88c4dc63eea0 100644
>>>>> --- a/tools/testing/selftests/landlock/ptrace_test.c
>>>>> +++ b/tools/testing/selftests/landlock/ptrace_test.c
>>>>> @@ -267,12 +267,11 @@ TEST_F(hierarchy, trace)
>>>>> /* Tests PTRACE_ATTACH and PTRACE_MODE_READ on the
>>>>> parent. */
>>>>> err_proc_read = test_ptrace_read(parent);
>>>>> ret = ptrace(PTRACE_ATTACH, parent, NULL, 0);
>>>>> + EXPECT_EQ(-1, ret);
>>>>> + EXPECT_EQ(EPERM, errno);
>>>>> if (variant->domain_child) {
>>>>> - EXPECT_EQ(-1, ret);
>>>>> - EXPECT_EQ(EPERM, errno);
>>>>> EXPECT_EQ(EACCES, err_proc_read);
>>>>> } else {
>>>>> - EXPECT_EQ(0, ret);
>>>>> EXPECT_EQ(0, err_proc_read);
>>>>> }
>>>>> if (ret == 0) {
Add the definition of the '__weak' attribute since it is not available in
GCC 11.3.0 and newer:
rseq.c:41:1: error: unknown type name ‘__weak’
41 | __weak ptrdiff_t __rseq_offset;
Fixes: 3bcbc20942db ("selftests/rseq: Play nice with binaries statically linked against glibc 2.35+")
Signed-off-by: Anh Tuan Phan <tuananhlfc(a)gmail.com>
---
tools/testing/selftests/rseq/rseq.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/tools/testing/selftests/rseq/rseq.c b/tools/testing/selftests/rseq/rseq.c
index a723da253244..584eec9b0930 100644
--- a/tools/testing/selftests/rseq/rseq.c
+++ b/tools/testing/selftests/rseq/rseq.c
@@ -34,6 +34,10 @@
#include "../kselftest.h"
#include "rseq.h"
+#ifndef __weak
+#define __weak __attribute__((weak))
+#endif
+
/*
* Define weak versions to play nice with binaries that are statically linked
* against a libc that doesn't support registering its own rseq.
--
2.34.1
Hi,
Since v6.5-rc1, kunit gained a devm/drmm-like mechanism that makes tests
resources much easier to cleanup.
This series converts the existing tests to use those new actions where
relevant.
Let me know what you think,
Maxime
Signed-off-by: Maxime Ripard <mripard(a)kernel.org>
---
Changes in v3:
- Fixed the build cast warnings by switching to wrapper functions
- Link to v2: https://lore.kernel.org/r/20230720-kms-kunit-actions-rework-v2-0-175017bd56…
Changes in v2:
- Fix some typos
- Use plaltform_device_del instead of removing the call to
platform_device_put after calling platform_device_add
- Link to v1: https://lore.kernel.org/r/20230710-kms-kunit-actions-rework-v1-0-722c58d72c…
---
Maxime Ripard (11):
drm/tests: helpers: Switch to kunit actions
drm/tests: client-modeset: Remove call to drm_kunit_helper_free_device()
drm/tests: modes: Remove call to drm_kunit_helper_free_device()
drm/tests: probe-helper: Remove call to drm_kunit_helper_free_device()
drm/tests: helpers: Create a helper to allocate a locking ctx
drm/tests: helpers: Create a helper to allocate an atomic state
drm/vc4: tests: pv-muxing: Remove call to drm_kunit_helper_free_device()
drm/vc4: tests: mock: Use a kunit action to unregister DRM device
drm/vc4: tests: pv-muxing: Switch to managed locking init
drm/vc4: tests: Switch to atomic state allocation helper
drm/vc4: tests: pv-muxing: Document test scenario
drivers/gpu/drm/tests/drm_client_modeset_test.c | 8 --
drivers/gpu/drm/tests/drm_kunit_helpers.c | 141 +++++++++++++++++++++++-
drivers/gpu/drm/tests/drm_modes_test.c | 8 --
drivers/gpu/drm/tests/drm_probe_helper_test.c | 8 --
drivers/gpu/drm/vc4/tests/vc4_mock.c | 12 ++
drivers/gpu/drm/vc4/tests/vc4_test_pv_muxing.c | 115 +++++++------------
include/drm/drm_kunit_helpers.h | 7 ++
7 files changed, 198 insertions(+), 101 deletions(-)
---
base-commit: d7b3af5a77e8d8da28f435f313e069aea5bcf172
change-id: 20230710-kms-kunit-actions-rework-5d163762c93b
Best regards,
--
Maxime Ripard <mripard(a)kernel.org>
Hi,
Since v6.5-rc1, kunit gained a devm/drmm-like mechanism that makes tests
resources much easier to cleanup.
This series converts the existing tests to use those new actions where
relevant.
Let me know what you think,
Maxime
Signed-off-by: Maxime Ripard <mripard(a)kernel.org>
---
Changes in v2:
- Fix some typos
- Use plaltform_device_del instead of removing the call to
platform_device_put after calling platform_device_add
- Link to v1: https://lore.kernel.org/r/20230710-kms-kunit-actions-rework-v1-0-722c58d72c…
---
Maxime Ripard (11):
drm/tests: helpers: Switch to kunit actions
drm/tests: client-modeset: Remove call to drm_kunit_helper_free_device()
drm/tests: modes: Remove call to drm_kunit_helper_free_device()
drm/tests: probe-helper: Remove call to drm_kunit_helper_free_device()
drm/tests: helpers: Create a helper to allocate a locking ctx
drm/tests: helpers: Create a helper to allocate an atomic state
drm/vc4: tests: pv-muxing: Remove call to drm_kunit_helper_free_device()
drm/vc4: tests: mock: Use a kunit action to unregister DRM device
drm/vc4: tests: pv-muxing: Switch to managed locking init
drm/vc4: tests: Switch to atomic state allocation helper
drm/vc4: tests: pv-muxing: Document test scenario
drivers/gpu/drm/tests/drm_client_modeset_test.c | 8 --
drivers/gpu/drm/tests/drm_kunit_helpers.c | 108 +++++++++++++++++++++-
drivers/gpu/drm/tests/drm_modes_test.c | 8 --
drivers/gpu/drm/tests/drm_probe_helper_test.c | 8 --
drivers/gpu/drm/vc4/tests/vc4_mock.c | 5 ++
drivers/gpu/drm/vc4/tests/vc4_test_pv_muxing.c | 115 +++++++++---------------
include/drm/drm_kunit_helpers.h | 7 ++
7 files changed, 158 insertions(+), 101 deletions(-)
---
base-commit: c58c49dd89324b18a812762a2bfa5a0458e4f252
change-id: 20230710-kms-kunit-actions-rework-5d163762c93b
Best regards,
--
Maxime Ripard <mripard(a)kernel.org>
Submit the top-level headers also from the kunit test module notifier
initialization callback, so external tools that are parsing dmesg for
kunit test output are able to tell how many test suites should be expected
and whether to continue parsing after complete output from the first test
suite is collected.
Extend kunit module notifier initialization callback with a processing
path for only listing the tests provided by a module if the kunit action
parameter is set to "list", so external tools can obtain a list of test
cases to be executed in advance and can make a better job on assigning
kernel messages interleaved with kunit output to specific tests.
Use test filtering functions in kunit module notifier callback functions,
so external tools are able to execute individual test cases from kunit
test modules in order to still better isolate their potential impact on
kernel messages that appear interleaved with output from other tests.
v2: Fix new name of a structure moved to kunit namespace not updated
across all uses.
Janusz Krzysztofik (3):
kunit: Report the count of test suites in a module
kunit: Make 'list' action available to kunit test modules
kunit: Allow kunit test modules to use test filtering
include/kunit/test.h | 14 +++++++++++
lib/kunit/executor.c | 57 +++++++++++++++++++++++++-------------------
lib/kunit/test.c | 57 +++++++++++++++++++++++++++++++++++++++++++-
3 files changed, 103 insertions(+), 25 deletions(-)
--
2.41.0
Hi, Willy
Most of the suggestions of v2 [1] have been applied in this v3 revision,
except the local menuconfig and mrproper targets, as explained in [2].
A fresh run with tinyconfig for ppc, ppc64 and ppc64le:
$ for arch in ppc ppc64 ppc64le; do \
mkdir -p $PWD/kernel-$arch;
time make defconfig run DEFCONFIG=tinyconfig ARCH=$arch O=$PWD/kernel-$arch RUN_OUT=$PWD/run.$arch.out;
done
rerun for ppc, ppc64 and ppc64le:
$ for arch in ppc ppc64 ppc64le; do \
make rerun ARCH=$arch O=$PWD/kernel-$arch RUN_OUT=$PWD/run.$arch.out;
done
Running /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/kernel-ppc/vmlinux on qemu-system-ppc
>> [ppc] Kernel command line: console=ttyS0 panic=-1
printk: console [ttyS0] enabled
Run /init as init process
Running test 'startup'
Running test 'syscall'
Running test 'stdlib'
Running test 'vfprintf'
Running test 'protection'
Leaving init with final status: 0
reboot: Power down
powered off, test finish
qemu-system-ppc: terminating on signal 15 from pid 190248 ()
165 test(s): 156 passed, 9 skipped, 0 failed => status: warning
See all results in /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/run.ppc.out
Running /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/kernel-ppc64/vmlinux on qemu-system-ppc64
Linux version 6.4.0+ (ubuntu@linux-lab) (powerpc64le-linux-gnu-gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #2 SMP Fri Jul 28 01:40:55 CST 2023
Kernel command line: console=hvc0 panic=-1
printk: console [hvc0] enabled
printk: console [hvc0] enabled
Run /init as init process
Running test 'startup'
Running test 'syscall'
Running test 'stdlib'
Running test 'vfprintf'
Running test 'protection'
Leaving init with final status: 0
reboot: Power down
powered off, test finish
165 test(s): 156 passed, 9 skipped, 0 failed => status: warning
See all results in /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/run.ppc64.out
Running /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/kernel-ppc64le/arch/powerpc/boot/zImage on qemu-system-ppc64le
Linux version 6.4.0+ (ubuntu@linux-lab) (powerpc64le-linux-gnu-gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #2 SMP Fri Jul 28 01:41:12 CST 2023
Kernel command line: console=hvc0 panic=-1
Run /init as init process
Running test 'startup'
Running test 'syscall'
Running test 'stdlib'
Running test 'vfprintf'
Running test 'protection'
Leaving init with final status: 0
reboot: Power down
powered off, test finish
165 test(s): 156 passed, 9 skipped, 0 failed => status: warning
See all results in /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/run.ppc64le.out
A fast report on existing test logs:
$ for arch in ppc ppc64 ppc64le; do \
make report ARCH=$arch RUN_OUT=$PWD/run.$arch.out | grep status; \
done
165 test(s): 156 passed, 9 skipped, 0 failed => status: warning
165 test(s): 156 passed, 9 skipped, 0 failed => status: warning
165 test(s): 156 passed, 9 skipped, 0 failed => status: warning
Changes from v2 --> v3:
* selftests/nolibc: allow report with existing test log
selftests/nolibc: fix up O= option support
selftests/nolibc: allow customize CROSS_COMPILE by architecture
selftests/nolibc: customize CROSS_COMPILE for 32/64-bit powerpc
selftests/nolibc: tinyconfig: add extra common options
No Change.
* selftests/nolibc: add macros to reduce duplicated changes
Remove REPORT_RUN_OUT and LOG_OUT.
* selftests/nolibc: string the core targets
Removed extconfig target from our v3 powerpc patchset [3], the
operations have been merged into the defconfig target.
Let kernel depends on $(KERNEL_CONFIG) instead of the removed
extconfig.
* selftests/nolibc: add menuconfig and mrproper for development
like the other local nolibc targets, still require local menuconfig
and mrproper targets for consistent usage with the same ARCH and no -C
/path/to/srctree
Merge them together to reduce duplicated entries.
* selftests/nolibc: allow quit qemu-system when poweroff fails
Enhance timeout logic with more expected strings print and detection
about the booting of bios, kernel, init and test.
Add a default 10 seconds of QEMU_TIMEOUT for every architecture to
detect all of the potential boog hang or failed poweroff.
* selftests/nolibc: customize QEMU_TIMEOUT for ppc64/ppc64le
Reduce QEMU_TIMEOUT from 60 seconds to a more normal 15 and 20
seconds for ppc64 and ppc64le respectively. the main time cost is
the slow bios used.
* selftests/nolibc: tinyconfig: add support for 32/64-bit powerpc
Rename the file names to shorter ones as suggestions from the powerpc
patchset.
* selftests/nolibc: speed up some targets with multiple jobs
New to speed up with -j<N> by default.
Best regards,
Zhangjin
---
[1]: https://lore.kernel.org/lkml/cover.1689759351.git.falcon@tinylab.org/
[2]: https://lore.kernel.org/lkml/20230727132418.117924-1-falcon@tinylab.org/
[3]: https://lore.kernel.org/lkml/8e9e5ac6283c6ec2ecf10a70ce55b219028497c1.16904…
Zhangjin Wu (12):
selftests/nolibc: allow report with existing test log
selftests/nolibc: add macros to reduce duplicated changes
selftests/nolibc: fix up O= option support
selftests/nolibc: string the core targets
selftests/nolibc: allow customize CROSS_COMPILE by architecture
selftests/nolibc: customize CROSS_COMPILE for 32/64-bit powerpc
selftests/nolibc: add menuconfig and mrproper for development
selftests/nolibc: allow quit qemu-system when poweroff fails
selftests/nolibc: customize QEMU_TIMEOUT for ppc64/ppc64le
selftests/nolibc: tinyconfig: add extra common options
selftests/nolibc: tinyconfig: add support for 32/64-bit powerpc
selftests/nolibc: speed up some targets with multiple jobs
tools/testing/selftests/nolibc/Makefile | 102 ++++++++++++++----
.../selftests/nolibc/configs/common.config | 4 +
.../selftests/nolibc/configs/ppc.config | 3 +
.../selftests/nolibc/configs/ppc64.config | 3 +
.../selftests/nolibc/configs/ppc64le.config | 4 +
5 files changed, 98 insertions(+), 18 deletions(-)
create mode 100644 tools/testing/selftests/nolibc/configs/common.config
create mode 100644 tools/testing/selftests/nolibc/configs/ppc64.config
create mode 100644 tools/testing/selftests/nolibc/configs/ppc64le.config
--
2.25.1
Hi, Willy, Thomas
v3 here is to fix up two issues introduced in v2 powerpc patchset [1].
- One is restore the wrongly removed '\' for a '\$$ARCH'
- Another is add the missing $(ARCH).config for ppc, the default variant
for powerpc is renamed to ppc in v2 (as discussed with Willy in [2]), but
ppc.config is missing in v2 patchset, not sure why this happen, may a
'git clean -fdx .' is required to do a new test, just did it.
Btw, the v3 tinyconfig-part1 for powerpc is ready, I will send it out
soon.
Best regards,
Zhangjin
---
[1]: https://lore.kernel.org/lkml/cover.1690373704.git.falcon@tinylab.org/
[2]: https://lore.kernel.org/lkml/ZL9leVOI25S2+0+g@1wt.eu/
Zhangjin Wu (7):
tools/nolibc: add support for powerpc
tools/nolibc: add support for powerpc64
selftests/nolibc: add extra configs customize support
selftests/nolibc: add XARCH and ARCH mapping support
selftests/nolibc: add test support for ppc
selftests/nolibc: add test support for ppc64le
selftests/nolibc: add test support for ppc64
tools/include/nolibc/arch-powerpc.h | 202 ++++++++++++++++++
tools/include/nolibc/arch.h | 2 +
tools/testing/selftests/nolibc/Makefile | 46 +++-
.../selftests/nolibc/configs/ppc.config | 3 +
4 files changed, 246 insertions(+), 7 deletions(-)
create mode 100644 tools/include/nolibc/arch-powerpc.h
create mode 100644 tools/testing/selftests/nolibc/configs/ppc.config
--
2.25.1
Hi,
Zhangjin and I are working on a tiny shell with nolibc. This patch
enables the missing pipe() with its testcase.
Thanks.
Yuan Tan (2):
tools/nolibc: add pipe() support
selftests/nolibc: add testcase for pipe.
tools/include/nolibc/sys.h | 17 ++++++++++
tools/testing/selftests/nolibc/nolibc-test.c | 34 ++++++++++++++++++++
2 files changed, 51 insertions(+)
--
2.39.2
Hi,
This is the semi-friendly patch-bot of Greg Kroah-Hartman.
Markus, you seem to have sent a nonsensical or otherwise pointless
review comment to a patch submission on a Linux kernel developer mailing
list. I strongly suggest that you not do this anymore. Please do not
bother developers who are actively working to produce patches and
features with comments that, in the end, are a waste of time.
Patch submitter, please ignore Markus's suggestion; you do not need to
follow it at all. The person/bot/AI that sent it is being ignored by
almost all Linux kernel maintainers for having a persistent pattern of
behavior of producing distracting and pointless commentary, and
inability to adapt to feedback. Please feel free to also ignore emails
from them.
thanks,
greg k-h's patch email bot
damos_new_filter() is returning a damos_filter struct without
initializing its ->list field. And the users of the function uses the
struct without initializing the field. As a result, uninitialized
memory access error is possible. Actually, a kernel NULL pointer
dereference BUG can be triggered using DAMON user-space tool, like
below.
# damo start --damos_action stat --damos_filter anon matching
# damo tune --damos_action stat --damos_filter anon matching --damos_filter anon nomatching
# dmesg
[...]
[ 36.908136] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 36.910483] #PF: supervisor write access in kernel mode
[ 36.912238] #PF: error_code(0x0002) - not-present page
[ 36.913415] PGD 0 P4D 0
[ 36.913978] Oops: 0002 [#1] PREEMPT SMP PTI
[ 36.914878] CPU: 32 PID: 1335 Comm: kdamond.0 Not tainted 6.5.0-rc3-mm-unstable-damon+ #1
[ 36.916621] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[ 36.919051] RIP: 0010:damos_destroy_filter (include/linux/list.h:114 include/linux/list.h:137 include/linux/list.h:148 mm/damon/core.c:345 mm/damon/core.c:355)
[...]
[ 36.938247] Call Trace:
[ 36.938721] <TASK>
[...]
[ 36.950064] ? damos_destroy_filter (include/linux/list.h:114 include/linux/list.h:137 include/linux/list.h:148 mm/damon/core.c:345 mm/damon/core.c:355)
[ 36.950883] ? damon_sysfs_set_scheme_filters.isra.0 (mm/damon/sysfs-schemes.c:1573)
[ 36.952019] damon_sysfs_set_schemes (mm/damon/sysfs-schemes.c:1674 mm/damon/sysfs-schemes.c:1686)
[ 36.952875] damon_sysfs_apply_inputs (mm/damon/sysfs.c:1312 mm/damon/sysfs.c:1298)
[ 36.953757] ? damon_pa_check_accesses (mm/damon/paddr.c:168 mm/damon/paddr.c:179)
[ 36.954648] damon_sysfs_cmd_request_callback (mm/damon/sysfs.c:1329 mm/damon/sysfs.c:1359)
[...]
The first patch of this patchset fixes the bug by initializing the field in
damos_new_filter(). The second patch adds a unit test for the problem.
Note that the second patch Cc stable@ without Fixes: tag, since it would
be better to be ingested together for avoiding any future regression.
SeongJae Park (2):
mm/damon/core: initialize damo_filter->list from damos_new_filter()
mm/damon/core-test: add a test for damos_new_filter()
mm/damon/core-test.h | 13 +++++++++++++
mm/damon/core.c | 1 +
2 files changed, 14 insertions(+)
--
2.25.1
Hi, Willy, Thomas
Here is the first part of v2 of our tinyconfig support for nolibc-test
[1], the patchset subject is reserved as before.
As discussed in v1 thread [1], to easier the review progress, the whole
tinyconfig support is divided into several parts, mainly by
architecture, here is the first part, include basic preparation and
powerpc example.
This patchset should be applied after the 32/64-bit powerpc support [2],
exactly these two are required by us:
* selftests/nolibc: add extra config file customize support
* selftests/nolibc: add XARCH and ARCH mapping support
In this patchset, we firstly add some misc preparations and at last add
the tinyconfig target and use powerpc as the first example.
Tests:
// powerpc run-user
$ for arch in powerpc powerpc64 powerpc64le; do \
rm -rf $PWD/kernel-$arch; \
mkdir -p $PWD/kernel-$arch; \
make run-user XARCH=$arch O=$PWD/kernel-$arch RUN_OUT=$PWD/run.$arch.out | grep "status: "; \
done
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
// powerpc run
$ for arch in powerpc powerpc64 powerpc64le; do \
rm -rf $PWD/kernel-$arch; \
mkdir -p $PWD/kernel-$arch; \
make tinyconfig run XARCH=$arch O=$PWD/kernel-$arch RUN_OUT=$PWD/run.$arch.out; \
done
$ for arch in powerpc powerpc64 powerpc64le; do \
make report XARCH=$arch O=$PWD/kernel-$arch RUN_OUT=$PWD/run.$arch.out | grep "status: "; \
done
165 test(s): 156 passed, 9 skipped, 0 failed => status: warning
165 test(s): 156 passed, 9 skipped, 0 failed => status: warning
165 test(s): 156 passed, 9 skipped, 0 failed => status: warning
// the others, randomly choose some
$ make run-user XARCH=arm O=$PWD/kernel-arm RUN_OUT=$PWD/run.arm.out CROSS_COMPILE=arm-linux-gnueabi- | grep status:
165 test(s): 156 passed, 9 skipped, 0 failed => status: warning
$ make run-user XARCH=x86_64 O=$PWD/kernel-arm RUN_OUT=$PWD/run.x86_64.out CROSS_COMPILE=x86_64-linux-gnu- | grep status:
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
$ make run-libc-test | grep status:
165 test(s): 153 passed, 12 skipped, 0 failed => status: warning
// x86_64, require noapic kernel command line option for old qemu-system-x86_64 (v4.2.1)
$ make run XARCH=x86_64 O=$PWD/kernel-x86_64 RUN_OUT=$PWD/run.x86_64.out CROSS_COMPILE=x86_64-linux-gnu- | grep status
$ make rerun XARCH=x86_64 O=$PWD/kernel-x86_64 RUN_OUT=$PWD/run.x86_64.out CROSS_COMPILE=x86_64-linux-gnu- | grep status
165 test(s): 159 passed, 6 skipped, 0 failed => status: warning
tinyconfig mainly targets as a time-saver, the misc preparations service
for the same goal, let's take a look:
* selftests/nolibc: allow report with existing test log
Like rerun without rebuild, Add report (without rerun) to summarize
the existing test log, this may work perfectly with the 'grep status'
* selftests/nolibc: add macros to enhance maintainability
Several macros are added to dedup the shared code to shrink lines
and easier the maintainability
The macros are added just before the using area to avoid code change
conflicts in the future.
* selftests/nolibc: print running log to screen
Enable logging to let developers learn what is happening at the
first glance, without the need to edit the Makefile and rerun it.
These helps a lot when there is a long-time running, a failed
poweroff or even a forever hang.
For test summmary, the 'grep status' can be used together with the
standalone report target.
* selftests/nolibc: fix up O= option support
With objtree instead srctree for .config and IMAGE, now, O= works.
Using O=$PWD/kernel-$arch avoid the mrproer for every build.
* selftests/nolibc: add menuconfig for development
Allow manually tuning some options, mainly for a new architecture
porting.
* selftests/nolibc: add mrproper for development
selftests/nolibc: defconfig: remove mrproper target
Split the mrproper target out of defconfig, when with O=, mrproper is not
required by defconfig, but reserve it for the other use scenes.
* selftests/nolibc: string the core targets
Allow simply 'make run' instead of 'make defconfig; make extconfig;
make kernel; make run'.
* selftests/nolibc: allow quit qemu-system when poweroff fails
When poweroff fails, allow exit while detects the power off string
from output or the wait time is too long (specified by QEMU_TIMEOUT).
This helps the boards who have no poweroff support or the kernel not
enable the poweroff options (mainly for tinyconfig).
* selftests/nolibc: allow customize CROSS_COMPILE by architecture
* selftests/nolibc: customize CROSS_COMPILE for 32/64-bit powerpc
This further saves a CROSS_COMPILE option for 'make run', it is very
important when iterates all of the supported architectures and the
compilers are not just prefixed with the XARCH variable.
For example, binary of big endian powerpc64 can be compiled with
powerpc64le-linux-gnu-, but the prefix is powerpc64le.
Even if the pre-customized compiler not exist, we can configure
CROSS_COMPILE_<ARCH> before the test loop to use the code.
* selftests/nolibc: add tinyconfig target
selftests/nolibc: tinyconfig: add extra common options
selftests/nolibc: tinyconfig: add support for 32/64-bit powerpc
Here is the first architecture(and its variants) support tinyconfig.
powerpc is actually a very good architecture, for it has 'various'
variants for test.
Best regards,
Zhangjin
---
[1]: https://lore.kernel.org/lkml/cover.1687706332.git.falcon@tinylab.org/
[2]: https://lore.kernel.org/lkml/cover.1689713175.git.falcon@tinylab.org/
Zhangjin Wu (14):
selftests/nolibc: allow report with existing test log
selftests/nolibc: add macros to enhance maintainability
selftests/nolibc: print running log to screen
selftests/nolibc: fix up O= option support
selftests/nolibc: add menuconfig for development
selftests/nolibc: add mrproper for development
selftests/nolibc: defconfig: remove mrproper target
selftests/nolibc: string the core targets
selftests/nolibc: allow quit qemu-system when poweroff fails
selftests/nolibc: allow customize CROSS_COMPILE by architecture
selftests/nolibc: customize CROSS_COMPILE for 32/64-bit powerpc
selftests/nolibc: add tinyconfig target
selftests/nolibc: tinyconfig: add extra common options
selftests/nolibc: tinyconfig: add support for 32/64-bit powerpc
tools/testing/selftests/nolibc/Makefile | 102 ++++++++++++++----
.../selftests/nolibc/configs/common.config | 4 +
.../selftests/nolibc/configs/powerpc.config | 3 +
.../selftests/nolibc/configs/powerpc64.config | 3 +
.../nolibc/configs/powerpc64le.config | 4 +
5 files changed, 98 insertions(+), 18 deletions(-)
create mode 100644 tools/testing/selftests/nolibc/configs/common.config
create mode 100644 tools/testing/selftests/nolibc/configs/powerpc64.config
create mode 100644 tools/testing/selftests/nolibc/configs/powerpc64le.config
--
2.25.1
Submit the top-level headers also from the kunit test module notifier
initialization callback, so external tools that are parsing dmesg for
kunit test output are able to tell how many test suites should be expected
and whether to continue parsing after complete output from the first test
suite is collected.
Extend kunit module notifier initialization callback with a processing
path for only listing the tests provided by a module if the kunit action
parameter is set to "list", so external tools can obtain a list of test
cases to be executed in advance and can make a better job on assigning
kernel messages interleaved with kunit output to specific tests.
Use test filtering functions in kunit module notifier callback functions,
so external tools are able to execute individual test cases from kunit
test modules in order to still better isolate their potential impact on
kernel messages that appear interleaved with output from other tests.
Janusz Krzysztofik (3):
kunit: Report the count of test suites in a module
kunit: Make 'list' action available to kunit test modules
kunit: Allow kunit test modules to use test filtering
include/kunit/test.h | 14 +++++++++++
lib/kunit/executor.c | 51 ++++++++++++++++++++++-----------------
lib/kunit/test.c | 57 +++++++++++++++++++++++++++++++++++++++++++-
3 files changed, 99 insertions(+), 23 deletions(-)
--
2.41.0
Hi, Willy
Here is the powerpc support, includes 32-bit big-endian powerpc, 64-bit
little endian and big endian powerpc.
All of them passes run-user with the default powerpc toolchain from
Ubuntu 20.04:
$ make run-user DEFCONFIG=tinyconfig XARCH=powerpc CROSS_COMPILE=powerpc-linux-gnu- | grep status
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
$ make run-user DEFCONFIG=tinyconfig XARCH=powerpc64 CROSS_COMPILE=powerpc64le-linux-gnu- | grep status
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
$ make run-user DEFCONFIG=tinyconfig XARCH=powerpc64le CROSS_COMPILE=powerpc64le-linux-gnu- | grep status
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
For the slow 'run' target, I have run with defconfig before, and just
verified them via the fast tinyconfig + run with a new patch from next
patchset, all of them passes:
165 test(s): 156 passed, 9 skipped, 0 failed => status: warning
Note, the big endian crosstool powerpc64-linux-gcc from
https://mirrors.edge.kernel.org/pub/tools/crosstool/ has been used to
test both little endian and big endian powerpc64 too, both passed.
Here simply explain what they are:
* tools/nolibc: add support for powerpc
tools/nolibc: add support for powerpc64
32-bit & 64-bit powerpc support of nolibc.
* selftests/nolibc: select_null: fix up for big endian powerpc64
fix up a test case for big endian powerpc64.
* selftests/nolibc: add extra config file customize support
add extconfig target to allow enable extra config options via
configs/<ARCH>.config
applied suggestion from Thomas to use config files instead of config
lines.
* selftests/nolibc: add XARCH and ARCH mapping support
applied suggestions from Willy, use XARCH as the input of our nolibc
test, use ARCH as the pure kernel input, at last build the mapping
between XARCH and ARCH.
Customize the variables via the input XARCH.
* selftests/nolibc: add test support for powerpc
Require to use extconfig to enable the console options specified in
configs/powerpc.config
currently, we should manually run extconfig after defconfig, in next
patchset, we will do this automatically.
* selftests/nolibc: add test support for powerpc64le
selftests/nolibc: add test support for powerpc64
Very simple, but customize CFLAGS carefully to let them work with
powerpc64le-linux-gnu-gcc (from Linux distributions) and
powerpc64-linux-gcc (from mirrors.edge.kernel.org)
The next patchset will not be tinyconfig, but some prepare patches, will
be sent out soon.
Best regards,
Zhangjin
---
Zhangjin Wu (8):
tools/nolibc: add support for powerpc
tools/nolibc: add support for powerpc64
selftests/nolibc: select_null: fix up for big endian powerpc64
selftests/nolibc: add extra config file customize support
selftests/nolibc: add XARCH and ARCH mapping support
selftests/nolibc: add test support for powerpc
selftests/nolibc: add test support for powerpc64le
selftests/nolibc: add test support for powerpc64
tools/include/nolibc/arch-powerpc.h | 170 ++++++++++++++++++
tools/testing/selftests/nolibc/Makefile | 55 ++++--
.../selftests/nolibc/configs/powerpc.config | 3 +
tools/testing/selftests/nolibc/nolibc-test.c | 2 +-
4 files changed, 217 insertions(+), 13 deletions(-)
create mode 100644 tools/include/nolibc/arch-powerpc.h
create mode 100644 tools/testing/selftests/nolibc/configs/powerpc.config
--
2.25.1
If the test description is longer than the status alignment the
parameter 'n' to putcharn() would lead to a signed underflow that then
gets converted to a very large unsigned value.
This in turn leads out-of-bound writes in memset() crashing the
application.
The failure case of EXPECT_PTRER() used in "mmap_bad" exhibits this
exact behavior.
Fixes: 8a27526f49f9 ("selftests/nolibc: add EXPECT_PTREQ, EXPECT_PTRNE and EXPECT_PTRER")
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
---
tools/testing/selftests/nolibc/nolibc-test.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c
index 03b1d30f5507..9b76603e4ce3 100644
--- a/tools/testing/selftests/nolibc/nolibc-test.c
+++ b/tools/testing/selftests/nolibc/nolibc-test.c
@@ -151,7 +151,8 @@ static void result(int llen, enum RESULT r)
else
msg = "[FAIL]";
- putcharn(' ', 64 - llen);
+ if (llen < 64)
+ putcharn(' ', 64 - llen);
puts(msg);
}
---
base-commit: dfef4fc45d5713eb23d87f0863aff9c33bd4bfaf
change-id: 20230726-nolibc-result-width-1f4b0b4f3ca0
Best regards,
--
Thomas Weißschuh <linux(a)weissschuh.net>
=== Context ===
In the context of a middlebox, fragmented packets are tricky to handle.
The full 5-tuple of a packet is often only available in the first
fragment which makes enforcing consistent policy difficult. There are
really only two stateless options, neither of which are very nice:
1. Enforce policy on first fragment and accept all subsequent fragments.
This works but may let in certain attacks or allow data exfiltration.
2. Enforce policy on first fragment and drop all subsequent fragments.
This does not really work b/c some protocols may rely on
fragmentation. For example, DNS may rely on oversized UDP packets for
large responses.
So stateful tracking is the only sane option. RFC 8900 [0] calls this
out as well in section 6.3:
Middleboxes [...] should process IP fragments in a manner that is
consistent with [RFC0791] and [RFC8200]. In many cases, middleboxes
must maintain state in order to achieve this goal.
=== BPF related bits ===
Policy has traditionally been enforced from XDP/TC hooks. Both hooks
run before kernel reassembly facilities. However, with the new
BPF_PROG_TYPE_NETFILTER, we can rather easily hook into existing
netfilter reassembly infra.
The basic idea is we bump a refcnt on the netfilter defrag module and
then run the bpf prog after the defrag module runs. This allows bpf
progs to transparently see full, reassembled packets. The nice thing
about this is that progs don't have to carry around logic to detect
fragments.
=== Changelog ===
Changes from v5:
* Fix defrag disable codepaths
Changes from v4:
* Refactor module handling code to not sleep in rcu_read_lock()
* Also unify the v4 and v6 hook structs so they can share codepaths
* Fixed some checkpatch.pl formatting warnings
Changes from v3:
* Correctly initialize `addrlen` stack var for recvmsg()
Changes from v2:
* module_put() if ->enable() fails
* Fix CI build errors
Changes from v1:
* Drop bpf_program__attach_netfilter() patches
* static -> static const where appropriate
* Fix callback assignment order during registration
* Only request_module() if callbacks are missing
* Fix retval when modprobe fails in userspace
* Fix v6 defrag module name (nf_defrag_ipv6_hooks -> nf_defrag_ipv6)
* Simplify priority checking code
* Add warning if module doesn't assign callbacks in the future
* Take refcnt on module while defrag link is active
[0]: https://datatracker.ietf.org/doc/html/rfc8900
Daniel Xu (5):
netfilter: defrag: Add glue hooks for enabling/disabling defrag
netfilter: bpf: Support BPF_F_NETFILTER_IP_DEFRAG in netfilter link
bpf: selftests: Support not connecting client socket
bpf: selftests: Support custom type and proto for client sockets
bpf: selftests: Add defrag selftests
include/linux/netfilter.h | 10 +
include/uapi/linux/bpf.h | 5 +
net/ipv4/netfilter/nf_defrag_ipv4.c | 17 +-
net/ipv6/netfilter/nf_defrag_ipv6_hooks.c | 11 +
net/netfilter/core.c | 6 +
net/netfilter/nf_bpf_link.c | 123 +++++++-
tools/include/uapi/linux/bpf.h | 5 +
tools/testing/selftests/bpf/Makefile | 4 +-
.../selftests/bpf/generate_udp_fragments.py | 90 ++++++
.../selftests/bpf/ip_check_defrag_frags.h | 57 ++++
tools/testing/selftests/bpf/network_helpers.c | 26 +-
tools/testing/selftests/bpf/network_helpers.h | 3 +
.../bpf/prog_tests/ip_check_defrag.c | 283 ++++++++++++++++++
.../selftests/bpf/progs/ip_check_defrag.c | 104 +++++++
14 files changed, 718 insertions(+), 26 deletions(-)
create mode 100755 tools/testing/selftests/bpf/generate_udp_fragments.py
create mode 100644 tools/testing/selftests/bpf/ip_check_defrag_frags.h
create mode 100644 tools/testing/selftests/bpf/prog_tests/ip_check_defrag.c
create mode 100644 tools/testing/selftests/bpf/progs/ip_check_defrag.c
--
2.41.0
There are use cases that need to apply DAMOS schemes to specific address
ranges or DAMON monitoring targets. NUMA nodes in the physical address
space, special memory objects in the virtual address space, and
monitoring target specific efficient monitoring results snapshot
retrieval could be examples of such use cases. This patchset extends
DAMOS filters feature for such cases, by implementing two more filter
types, namely address ranges and DAMON monitoring types.
Patches sequence
----------------
The first seven patches are for the address ranges based DAMOS filter.
The first patch implements the filter feature and expose it via DAMON
kernel API. The second patch further expose the feature to users via
DAMON sysfs interface. The third and fourth patches implement unit
tests and selftests for the feature. Three patches (fifth to seventh)
updating the documents follow.
The following six patches are for the DAMON monitoring target based
DAMOS filter. The eighth patch implements the feature in the core layer
and expose it via DAMON's kernel API. The ninth patch further expose it
to users via DAMON sysfs interface. Tenth patch add a selftest, and two
patches (eleventh and twelfth) update documents.
SeongJae Park (13):
mm/damon/core: introduce address range type damos filter
mm/damon/sysfs-schemes: support address range type DAMOS filter
mm/damon/core-test: add a unit test for __damos_filter_out()
selftests/damon/sysfs: test address range damos filter
Docs/mm/damon/design: update for address range filters
Docs/ABI/damon: update for address range DAMOS filter
Docs/admin-guide/mm/damon/usage: update for address range type DAMOS
filter
mm/damon/core: implement target type damos filter
mm/damon/sysfs-schemes: support target damos filter
selftests/damon/sysfs: test damon_target filter
Docs/mm/damon/design: update for DAMON monitoring target type DAMOS
filter
Docs/ABI/damon: update for DAMON monitoring target type DAMOS filter
Docs/admin-guide/mm/damon/usage: update for DAMON monitoring target
type DAMOS filter
.../ABI/testing/sysfs-kernel-mm-damon | 27 +++++-
Documentation/admin-guide/mm/damon/usage.rst | 34 +++++---
Documentation/mm/damon/design.rst | 24 ++++--
include/linux/damon.h | 28 +++++--
mm/damon/core-test.h | 61 ++++++++++++++
mm/damon/core.c | 62 ++++++++++++++
mm/damon/sysfs-schemes.c | 83 +++++++++++++++++++
tools/testing/selftests/damon/sysfs.sh | 5 ++
8 files changed, 299 insertions(+), 25 deletions(-)
--
2.25.1
The tried_regions directory of DAMON sysfs interface is useful for
retrieving monitoring results snapshot or DAMOS debugging. However, for
common use case that need to monitor only the total size of the scheme
tried regions (e.g., monitoring working set size), the kernel overhead
for directory construction and user overhead for reading the content
could be high if the number of monitoring region is not small. This
patchset implements DAMON sysfs files for efficient support of the use
case.
The first patch implements the sysfs file to reduce the user space
overhead, and the second patch implements a command for reducing the
kernel space overhead.
The third patch adds a selftest for the new file, and following two
patches update documents.
SeongJae Park (5):
mm/damon/sysfs-schemes: implement DAMOS tried total bytes file
mm/damon/sysfs: implement a command for updating only schemes tried
total bytes
selftests/damon/sysfs: test tried_regions/total_bytes file
Docs/ABI/damon: update for tried_regions/total_bytes
Docs/admin-guide/mm/damon/usage: update for tried_regions/total_bytes
.../ABI/testing/sysfs-kernel-mm-damon | 13 +++++-
Documentation/admin-guide/mm/damon/usage.rst | 42 ++++++++++++-------
mm/damon/sysfs-common.h | 2 +-
mm/damon/sysfs-schemes.c | 24 ++++++++++-
mm/damon/sysfs.c | 26 +++++++++---
tools/testing/selftests/damon/sysfs.sh | 1 +
6 files changed, 83 insertions(+), 25 deletions(-)
--
2.25.1
Events Tracing infrastructure contains lot of files, directories
(internally in terms of inodes, dentries). And ends up by consuming
memory in MBs. We can have multiple events of Events Tracing, which
further requires more memory.
Instead of creating inodes/dentries, eventfs could keep meta-data and
skip the creation of inodes/dentries. As and when require, eventfs will
create the inodes/dentries only for required files/directories.
Also eventfs would delete the inodes/dentries once no more requires
but preserve the meta data.
Tracing events took ~9MB, with this approach it took ~4.5MB
for ~10K files/dir.
Diff from v5:
Patch 02: removed TRACEFS_EVENT_INODE enum.
Patch 04: added TRACEFS_EVENT_INODE enum.
Patch 06: removed WARN_ON_ONCE in eventfs_set_ef_status_free()
Patch 07: added WARN_ON_ONCE in create_dentry()
moved declaration of following to internal.h:
eventfs_start_creating()
eventfs_failed_creating()
eventfs_end_creating()
Patch 08: added WARN_ON_ONCE in eventfs_set_ef_status_free()
Diff from v4:
Patch 02: moved from v4 08/10
added fs/tracefs/internal.h
Patch 03: moved from v4 02/10
removed fs/tracefs/internal.h
Patch 04: moved from v4 03/10
moved out changes of fs/tracefs/internal.h
Patch 05: moved from v4 04/10
renamed eventfs_add_top_file() -> eventfs_add_events_file()
Patch 06: moved from v4 07/10
implemented create_dentry() helper function
added create_file(), create_dir() stub function
Patch 07: moved from v4 06/10
Patch 08: moved from v4 05/10
improved eventfs remove functionality
Patch 09: removed unwanted if conditions
Patch 10: added available_filter_functions check
Diff from v3:
Patch 3,4,5,7,9:
removed all the eventfs_rwsem code and replaced it with an srcu
lock for the readers, and a mutex to synchronize the writers of
the list.
Patch 2: moved 'tracefs_inode' and 'get_tracefs()' to v4 03/10
Patch 3: moved the struct eventfs_file and eventfs_inode into event_inode.c
as it really should not be exposed to all users.
Patch 5: added a recursion check to eventfs_remove_rec() as it is really
dangerous to have unchecked recursion in the kernel (we do have
a fixed size stack).
have the free use srcu callbacks. After the srcu grace periods
are done, it adds the eventfs_file onto a llist (lockless link
list) and wakes up a work queue. Then the work queue does the
freeing (this needs to be done in task/workqueue context, as
srcu callbacks are done in softirq context).
Patch 6: renamed:
eventfs_create_file() -> create_file()
eventfs_create_dir() -> create_dir()
Diff from v2:
Patch 01: new patch:'Require all trace events to have a TRACE_SYSTEM'
Patch 02: moved from v1 1/9
Patch 03: moved from v1 2/9
As suggested by Zheng Yejian, introduced eventfs_prepare_ef()
helper function to add files or directories to eventfs
fix WARNING reported by kernel test robot in v1 8/9
Patch 04: moved from v1 3/9
used eventfs_prepare_ef() to add files
fix WARNING reported by kernel test robot in v1 8/9
Patch 05: moved from v1 4/9
fix compiling warning reported by kernel test robot in v1 4/9
Patch 06: moved from v1 5/9
Patch 07: moved from v1 6/9
Patch 08: moved from v1 7/9
Patch 09: moved from v1 8/9
rebased because of v3 01/10
Patch 10: moved from v1 9/9
Diff from v1:
Patch 1: add header file
Patch 2: resolved kernel test robot issues
protecting eventfs lists using nested eventfs_rwsem
Patch 3: protecting eventfs lists using nested eventfs_rwsem
Patch 4: improve events cleanup code to fix crashes
Patch 5: resolved kernel test robot issues
removed d_instantiate_anon() calls
Patch 6: resolved kernel test robot issues
fix kprobe test in eventfs_root_lookup()
protecting eventfs lists using nested eventfs_rwsem
Patch 7: remove header file
Patch 8: pass eventfs_rwsem as argument to eventfs functions
called eventfs_remove_events_dir() instead of tracefs_remove()
from event_trace_del_tracer()
Patch 9: new patch to fix kprobe test case
fs/tracefs/Makefile | 1 +
fs/tracefs/event_inode.c | 801 ++++++++++++++++++
fs/tracefs/inode.c | 151 +++-
fs/tracefs/internal.h | 29 +
include/linux/trace_events.h | 1 +
include/linux/tracefs.h | 23 +
kernel/trace/trace.h | 2 +-
kernel/trace/trace_events.c | 76 +-
.../ftrace/test.d/kprobe/kprobe_args_char.tc | 9 +-
.../test.d/kprobe/kprobe_args_string.tc | 9 +-
10 files changed, 1050 insertions(+), 52 deletions(-)
create mode 100644 fs/tracefs/event_inode.c
create mode 100644 fs/tracefs/internal.h
--
2.39.0
[ This series depends on the VFIO device cdev series ]
Changelog
v11:
* Added Reviewed-by from Kevin
* Dropped 'rc' in iommufd_access_detach()
* Dropped a duplicated IS_ERR check at new_ioas
* Separate into a new patch the change in iommufd_access_destroy_object
v10:
https://lore.kernel.org/all/cover.1690488745.git.nicolinc@nvidia.com/
* Added Reviewed-by from Jason
* Replaced the iommufd_ref_to_user call with refcount_inc
* Added a wrapper iommufd_access_change_ioas_id and used it in the
iommufd_access_attach() and iommufd_access_replace() APIs
v9:
https://lore.kernel.org/all/cover.1690440730.git.nicolinc@nvidia.com/
* Rebased on top of Jason's iommufd for-next tree
* Added Reviewed-by from Jason and Alex
* Reworked the replace API patches
* Added a new patch allowing passing in to iopt_remove_access
* Added a new patch of a helper function following Jason's design,
mainly by blocking any concurrent detach/replace and keeping the
refcount_dec at the end of the function
* Added a call of the new helper in iommufd_access_destroy_object()
to reduce race condition
* Simplified the replace API patch
v8:
https://lore.kernel.org/all/cover.1690226015.git.nicolinc@nvidia.com/
* Rebased on top of Jason's iommufd_hwpt series and then cdev v15 series:
https://lore.kernel.org/all/0-v8-6659224517ea+532-iommufd_alloc_jgg@nvidia.…https://lore.kernel.org/kvm/20230718135551.6592-1-yi.l.liu@intel.com/
* Changed the order of detach() and attach() in replace(), to fix a bug
v7:
https://lore.kernel.org/all/cover.1683593831.git.nicolinc@nvidia.com/
* Rebased on top of v6.4-rc1 and cdev v11 candidate
* Fixed a wrong file in replace() API patch
* Added Kevin's "Reviewed-by" to replace() API patch
v6:
https://lore.kernel.org/all/cover.1679939952.git.nicolinc@nvidia.com/
* Rebased on top of cdev v8 series
https://lore.kernel.org/kvm/20230327094047.47215-1-yi.l.liu@intel.com/
* Added "Reviewed-by" from Kevin to PATCH-4
* Squashed access->ioas updating lines into iommufd_access_change_pt(),
and changed function return type accordingly for simplification.
v5:
https://lore.kernel.org/all/cover.1679559476.git.nicolinc@nvidia.com/
* Kept the cmd->id in the iommufd_test_create_access() so the access can
be created with an ioas by default. Then, renamed the previous ioctl
IOMMU_TEST_OP_ACCESS_SET_IOAS to IOMMU_TEST_OP_ACCESS_REPLACE_IOAS, so
it would be used to replace an access->ioas pointer.
* Added iommufd_access_replace() API after the introductions of the other
two APIs iommufd_access_attach() and iommufd_access_detach().
* Since vdev->iommufd_attached is also set in emulated pathway too, call
iommufd_access_update(), similar to the physical pathway.
v4:
https://lore.kernel.org/all/cover.1678284812.git.nicolinc@nvidia.com/
* Rebased on top of Jason's series adding replace() and hwpt_alloc()
https://lore.kernel.org/all/0-v2-51b9896e7862+8a8c-iommufd_alloc_jgg@nvidia…
* Rebased on top of cdev series v6
https://lore.kernel.org/kvm/20230308132903.465159-1-yi.l.liu@intel.com/
* Dropped the patch that's moved to cdev series.
* Added unmap function pointer sanity before calling it.
* Added "Reviewed-by" from Kevin and Yi.
* Added back the VFIO change updating the ATTACH uAPI.
v3:
https://lore.kernel.org/all/cover.1677288789.git.nicolinc@nvidia.com/
* Rebased on top of Jason's iommufd_hwpt branch:
https://lore.kernel.org/all/0-v2-406f7ac07936+6a-iommufd_hwpt_jgg@nvidia.co…
* Dropped patches from this series accordingly. There were a couple of
VFIO patches that will be submitted after the VFIO cdev series. Also,
renamed the series to be "emulated".
* Moved dma_unmap sanity patch to the first in the series.
* Moved dma_unmap sanity to cover both VFIO and IOMMUFD pathways.
* Added Kevin's "Reviewed-by" to two of the patches.
* Fixed a NULL pointer bug in vfio_iommufd_emulated_bind().
* Moved unmap() call to the common place in iommufd_access_set_ioas().
v2:
https://lore.kernel.org/all/cover.1675802050.git.nicolinc@nvidia.com/
* Rebased on top of vfio_device cdev v2 series.
* Update the kdoc and commit message of iommu_group_replace_domain().
* Dropped revert-to-core-domain part in iommu_group_replace_domain().
* Dropped !ops->dma_unmap check in vfio_iommufd_emulated_attach_ioas().
* Added missing rc value in vfio_iommufd_emulated_attach_ioas() from the
iommufd_access_set_ioas() call.
* Added a new patch in vfio_main to deny vfio_pin/unpin_pages() calls if
vdev->ops->dma_unmap is not implemented.
* Added a __iommmufd_device_detach helper and let the replace routine do
a partial detach().
* Added restriction on auto_domains to use the replace feature.
* Added the patch "iommufd/device: Make hwpt_list list_add/del symmetric"
from the has_group removal series.
v1:
https://lore.kernel.org/all/cover.1675320212.git.nicolinc@nvidia.com/
Hi all,
The existing IOMMU APIs provide a pair of functions: iommu_attach_group()
for callers to attach a device from the default_domain (NULL if not being
supported) to a given iommu domain, and iommu_detach_group() for callers
to detach a device from a given domain to the default_domain. Internally,
the detach_dev op is deprecated for the newer drivers with default_domain.
This means that those drivers likely can switch an attaching domain to
another one, without stagging the device at a blocking or default domain,
for use cases such as:
1) vPASID mode, when a guest wants to replace a single pasid (PASID=0)
table with a larger table (PASID=N)
2) Nesting mode, when switching the attaching device from an S2 domain
to an S1 domain, or when switching between relevant S1 domains.
This series is rebased on top of Jason Gunthorpe's series that introduces
iommu_group_replace_domain API and IOMMUFD infrastructure for the IOMMUFD
"physical" devices. The IOMMUFD "emulated" deivces will need some extra
steps to replace the access->ioas object and its iopt pointer.
You can also find this series on Github:
https://github.com/nicolinc/iommufd/commits/iommu_group_replace_domain-v11
Thank you
Nicolin Chen
Nicolin Chen (7):
vfio: Do not allow !ops->dma_unmap in vfio_pin/unpin_pages()
iommufd: Allow passing in iopt_access_list_id to iopt_remove_access()
iommufd: Add iommufd_access_change_ioas(_id) helpers
iommufd: Use iommufd_access_change_ioas in
iommufd_access_destroy_object
iommufd: Add iommufd_access_replace() API
iommufd/selftest: Add IOMMU_TEST_OP_ACCESS_REPLACE_IOAS coverage
vfio: Support IO page table replacement
drivers/iommu/iommufd/device.c | 128 ++++++++++++------
drivers/iommu/iommufd/io_pagetable.c | 6 +-
drivers/iommu/iommufd/iommufd_private.h | 3 +-
drivers/iommu/iommufd/iommufd_test.h | 4 +
drivers/iommu/iommufd/selftest.c | 19 +++
drivers/vfio/iommufd.c | 11 +-
drivers/vfio/vfio_main.c | 4 +
include/linux/iommufd.h | 1 +
include/uapi/linux/vfio.h | 6 +
tools/testing/selftests/iommu/iommufd.c | 29 +++-
tools/testing/selftests/iommu/iommufd_utils.h | 19 +++
11 files changed, 179 insertions(+), 51 deletions(-)
--
2.41.0
A missing break in kms_tests leads to kselftest hang when the
parameter -s is used.
In current code flow because of missing break in -s, -t parses
args spilled from -s and as -t accepts only valid values as 0,1
so any arg in -s >1 or <0, gets in ksm_test failure
This went undetected since, before the addition of option -t,
the next case -M would immediately break out of the switch
statement but that is no longer the case
Add the missing break statement.
----Before----
./ksm_tests -H -s 100
Invalid merge type
----After----
./ksm_tests -H -s 100
Number of normal pages: 0
Number of huge pages: 50
Total size: 100 MiB
Total time: 0.401732682 s
Average speed: 248.922 MiB/s
Fixes: 9e7cb94ca218 ("selftests: vm: add KSM merging time test")
Signed-off-by: Ayush Jain <ayush.jain3(a)amd.com>
---
tools/testing/selftests/mm/ksm_tests.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/mm/ksm_tests.c b/tools/testing/selftests/mm/ksm_tests.c
index 435acebdc325..380b691d3eb9 100644
--- a/tools/testing/selftests/mm/ksm_tests.c
+++ b/tools/testing/selftests/mm/ksm_tests.c
@@ -831,6 +831,7 @@ int main(int argc, char *argv[])
printf("Size must be greater than 0\n");
return KSFT_FAIL;
}
+ break;
case 't':
{
int tmp = atoi(optarg);
--
2.34.1
[ This series depends on the VFIO device cdev series ]
Changelog
v8:
* Rebased on top of Jason's iommufd_hwpt series and then cdev v15 series:
https://lore.kernel.org/all/0-v8-6659224517ea+532-iommufd_alloc_jgg@nvidia.…https://lore.kernel.org/kvm/20230718135551.6592-1-yi.l.liu@intel.com/
* Changed the order of detach() and attach() in replace(), to fix a bug
v7:
https://lore.kernel.org/all/cover.1683593831.git.nicolinc@nvidia.com/
* Rebased on top of v6.4-rc1 and cdev v11 candidate
* Fixed a wrong file in replace() API patch
* Added Kevin's "Reviewed-by" to replace() API patch
v6:
https://lore.kernel.org/all/cover.1679939952.git.nicolinc@nvidia.com/
* Rebased on top of cdev v8 series
https://lore.kernel.org/kvm/20230327094047.47215-1-yi.l.liu@intel.com/
* Added "Reviewed-by" from Kevin to PATCH-4
* Squashed access->ioas updating lines into iommufd_access_change_pt(),
and changed function return type accordingly for simplification.
v5:
https://lore.kernel.org/all/cover.1679559476.git.nicolinc@nvidia.com/
* Kept the cmd->id in the iommufd_test_create_access() so the access can
be created with an ioas by default. Then, renamed the previous ioctl
IOMMU_TEST_OP_ACCESS_SET_IOAS to IOMMU_TEST_OP_ACCESS_REPLACE_IOAS, so
it would be used to replace an access->ioas pointer.
* Added iommufd_access_replace() API after the introductions of the other
two APIs iommufd_access_attach() and iommufd_access_detach().
* Since vdev->iommufd_attached is also set in emulated pathway too, call
iommufd_access_update(), similar to the physical pathway.
v4:
https://lore.kernel.org/all/cover.1678284812.git.nicolinc@nvidia.com/
* Rebased on top of Jason's series adding replace() and hwpt_alloc()
https://lore.kernel.org/all/0-v2-51b9896e7862+8a8c-iommufd_alloc_jgg@nvidia…
* Rebased on top of cdev series v6
https://lore.kernel.org/kvm/20230308132903.465159-1-yi.l.liu@intel.com/
* Dropped the patch that's moved to cdev series.
* Added unmap function pointer sanity before calling it.
* Added "Reviewed-by" from Kevin and Yi.
* Added back the VFIO change updating the ATTACH uAPI.
v3:
https://lore.kernel.org/all/cover.1677288789.git.nicolinc@nvidia.com/
* Rebased on top of Jason's iommufd_hwpt branch:
https://lore.kernel.org/all/0-v2-406f7ac07936+6a-iommufd_hwpt_jgg@nvidia.co…
* Dropped patches from this series accordingly. There were a couple of
VFIO patches that will be submitted after the VFIO cdev series. Also,
renamed the series to be "emulated".
* Moved dma_unmap sanity patch to the first in the series.
* Moved dma_unmap sanity to cover both VFIO and IOMMUFD pathways.
* Added Kevin's "Reviewed-by" to two of the patches.
* Fixed a NULL pointer bug in vfio_iommufd_emulated_bind().
* Moved unmap() call to the common place in iommufd_access_set_ioas().
v2:
https://lore.kernel.org/all/cover.1675802050.git.nicolinc@nvidia.com/
* Rebased on top of vfio_device cdev v2 series.
* Update the kdoc and commit message of iommu_group_replace_domain().
* Dropped revert-to-core-domain part in iommu_group_replace_domain().
* Dropped !ops->dma_unmap check in vfio_iommufd_emulated_attach_ioas().
* Added missing rc value in vfio_iommufd_emulated_attach_ioas() from the
iommufd_access_set_ioas() call.
* Added a new patch in vfio_main to deny vfio_pin/unpin_pages() calls if
vdev->ops->dma_unmap is not implemented.
* Added a __iommmufd_device_detach helper and let the replace routine do
a partial detach().
* Added restriction on auto_domains to use the replace feature.
* Added the patch "iommufd/device: Make hwpt_list list_add/del symmetric"
from the has_group removal series.
v1:
https://lore.kernel.org/all/cover.1675320212.git.nicolinc@nvidia.com/
Hi all,
The existing IOMMU APIs provide a pair of functions: iommu_attach_group()
for callers to attach a device from the default_domain (NULL if not being
supported) to a given iommu domain, and iommu_detach_group() for callers
to detach a device from a given domain to the default_domain. Internally,
the detach_dev op is deprecated for the newer drivers with default_domain.
This means that those drivers likely can switch an attaching domain to
another one, without stagging the device at a blocking or default domain,
for use cases such as:
1) vPASID mode, when a guest wants to replace a single pasid (PASID=0)
table with a larger table (PASID=N)
2) Nesting mode, when switching the attaching device from an S2 domain
to an S1 domain, or when switching between relevant S1 domains.
This series is rebased on top of Jason Gunthorpe's series that introduces
iommu_group_replace_domain API and IOMMUFD infrastructure for the IOMMUFD
"physical" devices. The IOMMUFD "emulated" deivces will need some extra
steps to replace the access->ioas object and its iopt pointer.
You can also find this series on Github:
https://github.com/nicolinc/iommufd/commits/iommu_group_replace_domain-v8
Thank you
Nicolin Chen
Nicolin Chen (4):
vfio: Do not allow !ops->dma_unmap in vfio_pin/unpin_pages()
iommufd: Add iommufd_access_replace() API
iommufd/selftest: Add IOMMU_TEST_OP_ACCESS_REPLACE_IOAS coverage
vfio: Support IO page table replacement
drivers/iommu/iommufd/device.c | 72 ++++++++++++++-----
drivers/iommu/iommufd/iommufd_test.h | 4 ++
drivers/iommu/iommufd/selftest.c | 19 +++++
drivers/vfio/iommufd.c | 11 +--
drivers/vfio/vfio_main.c | 4 ++
include/linux/iommufd.h | 1 +
include/uapi/linux/vfio.h | 6 ++
tools/testing/selftests/iommu/iommufd.c | 29 +++++++-
tools/testing/selftests/iommu/iommufd_utils.h | 19 +++++
9 files changed, 142 insertions(+), 23 deletions(-)
--
2.41.0
PTP_SYS_OFFSET_EXTENDED was added in November 2018 in
361800876f80 (" ptp: add PTP_SYS_OFFSET_EXTENDED ioctl")
and PTP_SYS_OFFSET_PRECISE was added in February 2016 in
719f1aa4a671 ("ptp: Add PTP_SYS_OFFSET_PRECISE for driver crosstimestamping")
The PTP selftest code is lacking support for these two IOCTLS.
This short series of patches adds support for them.
Changes in v2:
- Fixed rebase issues (v1 somehow ended up with patch 1 being from the
first manual split of my changes and patch 2 being from rebase 2 out
of 3)
- Rebased on top of net-next
Alex Maftei (2):
selftests/ptp: Add -x option for testing PTP_SYS_OFFSET_EXTENDED
selftests/ptp: Add -X option for testing PTP_SYS_OFFSET_PRECISE
tools/testing/selftests/ptp/testptp.c | 73 ++++++++++++++++++++++++++-
1 file changed, 71 insertions(+), 2 deletions(-)
--
2.25.1
[ This series depends on the VFIO device cdev series ]
Changelog
v10:
* Added Reviewed-by from Jason
* Replaced the iommufd_ref_to_user call with refcount_inc
* Added a wrapper iommufd_access_change_ioas_id and used it in the
iommufd_access_attach() and iommufd_access_replace() APIs
v9:
https://lore.kernel.org/linux-iommu/cover.1690440730.git.nicolinc@nvidia.co…
* Rebased on top of Jason's iommufd for-next tree
* Added Reviewed-by from Jason and Alex
* Reworked the replace API patches
* Added a new patch allowing passing in to iopt_remove_access
* Added a new patch of a helper function following Jason's design,
mainly by blocking any concurrent detach/replace and keeping the
refcount_dec at the end of the function
* Added a call of the new helper in iommufd_access_destroy_object()
to reduce race condition
* Simplified the replace API patch
v8:
https://lore.kernel.org/all/cover.1690226015.git.nicolinc@nvidia.com/
* Rebased on top of Jason's iommufd_hwpt series and then cdev v15 series:
https://lore.kernel.org/all/0-v8-6659224517ea+532-iommufd_alloc_jgg@nvidia.…https://lore.kernel.org/kvm/20230718135551.6592-1-yi.l.liu@intel.com/
* Changed the order of detach() and attach() in replace(), to fix a bug
v7:
https://lore.kernel.org/all/cover.1683593831.git.nicolinc@nvidia.com/
* Rebased on top of v6.4-rc1 and cdev v11 candidate
* Fixed a wrong file in replace() API patch
* Added Kevin's "Reviewed-by" to replace() API patch
v6:
https://lore.kernel.org/all/cover.1679939952.git.nicolinc@nvidia.com/
* Rebased on top of cdev v8 series
https://lore.kernel.org/kvm/20230327094047.47215-1-yi.l.liu@intel.com/
* Added "Reviewed-by" from Kevin to PATCH-4
* Squashed access->ioas updating lines into iommufd_access_change_pt(),
and changed function return type accordingly for simplification.
v5:
https://lore.kernel.org/all/cover.1679559476.git.nicolinc@nvidia.com/
* Kept the cmd->id in the iommufd_test_create_access() so the access can
be created with an ioas by default. Then, renamed the previous ioctl
IOMMU_TEST_OP_ACCESS_SET_IOAS to IOMMU_TEST_OP_ACCESS_REPLACE_IOAS, so
it would be used to replace an access->ioas pointer.
* Added iommufd_access_replace() API after the introductions of the other
two APIs iommufd_access_attach() and iommufd_access_detach().
* Since vdev->iommufd_attached is also set in emulated pathway too, call
iommufd_access_update(), similar to the physical pathway.
v4:
https://lore.kernel.org/all/cover.1678284812.git.nicolinc@nvidia.com/
* Rebased on top of Jason's series adding replace() and hwpt_alloc()
https://lore.kernel.org/all/0-v2-51b9896e7862+8a8c-iommufd_alloc_jgg@nvidia…
* Rebased on top of cdev series v6
https://lore.kernel.org/kvm/20230308132903.465159-1-yi.l.liu@intel.com/
* Dropped the patch that's moved to cdev series.
* Added unmap function pointer sanity before calling it.
* Added "Reviewed-by" from Kevin and Yi.
* Added back the VFIO change updating the ATTACH uAPI.
v3:
https://lore.kernel.org/all/cover.1677288789.git.nicolinc@nvidia.com/
* Rebased on top of Jason's iommufd_hwpt branch:
https://lore.kernel.org/all/0-v2-406f7ac07936+6a-iommufd_hwpt_jgg@nvidia.co…
* Dropped patches from this series accordingly. There were a couple of
VFIO patches that will be submitted after the VFIO cdev series. Also,
renamed the series to be "emulated".
* Moved dma_unmap sanity patch to the first in the series.
* Moved dma_unmap sanity to cover both VFIO and IOMMUFD pathways.
* Added Kevin's "Reviewed-by" to two of the patches.
* Fixed a NULL pointer bug in vfio_iommufd_emulated_bind().
* Moved unmap() call to the common place in iommufd_access_set_ioas().
v2:
https://lore.kernel.org/all/cover.1675802050.git.nicolinc@nvidia.com/
* Rebased on top of vfio_device cdev v2 series.
* Update the kdoc and commit message of iommu_group_replace_domain().
* Dropped revert-to-core-domain part in iommu_group_replace_domain().
* Dropped !ops->dma_unmap check in vfio_iommufd_emulated_attach_ioas().
* Added missing rc value in vfio_iommufd_emulated_attach_ioas() from the
iommufd_access_set_ioas() call.
* Added a new patch in vfio_main to deny vfio_pin/unpin_pages() calls if
vdev->ops->dma_unmap is not implemented.
* Added a __iommmufd_device_detach helper and let the replace routine do
a partial detach().
* Added restriction on auto_domains to use the replace feature.
* Added the patch "iommufd/device: Make hwpt_list list_add/del symmetric"
from the has_group removal series.
v1:
https://lore.kernel.org/all/cover.1675320212.git.nicolinc@nvidia.com/
Hi all,
The existing IOMMU APIs provide a pair of functions: iommu_attach_group()
for callers to attach a device from the default_domain (NULL if not being
supported) to a given iommu domain, and iommu_detach_group() for callers
to detach a device from a given domain to the default_domain. Internally,
the detach_dev op is deprecated for the newer drivers with default_domain.
This means that those drivers likely can switch an attaching domain to
another one, without stagging the device at a blocking or default domain,
for use cases such as:
1) vPASID mode, when a guest wants to replace a single pasid (PASID=0)
table with a larger table (PASID=N)
2) Nesting mode, when switching the attaching device from an S2 domain
to an S1 domain, or when switching between relevant S1 domains.
This series is rebased on top of Jason Gunthorpe's series that introduces
iommu_group_replace_domain API and IOMMUFD infrastructure for the IOMMUFD
"physical" devices. The IOMMUFD "emulated" deivces will need some extra
steps to replace the access->ioas object and its iopt pointer.
You can also find this series on Github:
https://github.com/nicolinc/iommufd/commits/iommu_group_replace_domain-v10
Thank you
Nicolin Chen
Nicolin Chen (6):
vfio: Do not allow !ops->dma_unmap in vfio_pin/unpin_pages()
iommufd: Allow passing in iopt_access_list_id to iopt_remove_access()
iommufd: Add iommufd_access_change_ioas(_id) helpers
iommufd: Add iommufd_access_replace() API
iommufd/selftest: Add IOMMU_TEST_OP_ACCESS_REPLACE_IOAS coverage
vfio: Support IO page table replacement
drivers/iommu/iommufd/device.c | 132 ++++++++++++------
drivers/iommu/iommufd/io_pagetable.c | 6 +-
drivers/iommu/iommufd/iommufd_private.h | 3 +-
drivers/iommu/iommufd/iommufd_test.h | 4 +
drivers/iommu/iommufd/selftest.c | 19 +++
drivers/vfio/iommufd.c | 11 +-
drivers/vfio/vfio_main.c | 4 +
include/linux/iommufd.h | 1 +
include/uapi/linux/vfio.h | 6 +
tools/testing/selftests/iommu/iommufd.c | 29 +++-
tools/testing/selftests/iommu/iommufd_utils.h | 19 +++
11 files changed, 184 insertions(+), 50 deletions(-)
--
2.41.0
Make sv48 the default address space for mmap as some applications
currently depend on this assumption. Users can now select a
desired address space using a non-zero hint address to mmap. Previously,
requesting the default address space from mmap by passing zero as the hint
address would result in using the largest address space possible. Some
applications depend on empty bits in the virtual address space, like Go and
Java, so this patch provides more flexibility for application developers.
-Charlie
---
v7:
- Changing RLIMIT_STACK inside of an executing program does not trigger
arch_pick_mmap_layout(), so rewrite tests to change RLIMIT_STACK from a
script before executing tests. RLIMIT_STACK of infinity forces bottomup
mmap allocation.
- Make arch_get_mmap_base macro more readible by extracting out the rnd
calculation.
- Use MMAP_MIN_VA_BITS in TASK_UNMAPPED_BASE to support case when mmap
attempts to allocate address smaller than DEFAULT_MAP_WINDOW.
- Fix incorrect wording in documentation.
v6:
- Rebase onto the correct base
v5:
- Minor wording change in documentation
- Change some parenthesis in arch_get_mmap_ macros
- Added case for addr==0 in arch_get_mmap_ because without this, programs would
crash if RLIMIT_STACK was modified before executing the program. This was
tested using the libhugetlbfs tests.
v4:
- Split testcases/document patch into test cases, in-code documentation, and
formal documentation patches
- Modified the mmap_base macro to be more legible and better represent memory
layout
- Fixed documentation to better reflect the implmentation
- Renamed DEFAULT_VA_BITS to MMAP_VA_BITS
- Added additional test case for rlimit changes
---
Charlie Jenkins (4):
RISC-V: mm: Restrict address space for sv39,sv48,sv57
RISC-V: mm: Add tests for RISC-V mm
RISC-V: mm: Update pgtable comment documentation
RISC-V: mm: Document mmap changes
Documentation/riscv/vm-layout.rst | 22 +++++++
arch/riscv/include/asm/elf.h | 2 +-
arch/riscv/include/asm/pgtable.h | 21 ++++--
arch/riscv/include/asm/processor.h | 47 ++++++++++++--
tools/testing/selftests/riscv/Makefile | 2 +-
tools/testing/selftests/riscv/mm/.gitignore | 2 +
tools/testing/selftests/riscv/mm/Makefile | 15 +++++
.../riscv/mm/testcases/mmap_bottomup.c | 35 ++++++++++
.../riscv/mm/testcases/mmap_default.c | 35 ++++++++++
.../selftests/riscv/mm/testcases/mmap_test.h | 64 +++++++++++++++++++
.../selftests/riscv/mm/testcases/run_mmap.sh | 12 ++++
11 files changed, 244 insertions(+), 13 deletions(-)
create mode 100644 tools/testing/selftests/riscv/mm/.gitignore
create mode 100644 tools/testing/selftests/riscv/mm/Makefile
create mode 100644 tools/testing/selftests/riscv/mm/testcases/mmap_bottomup.c
create mode 100644 tools/testing/selftests/riscv/mm/testcases/mmap_default.c
create mode 100644 tools/testing/selftests/riscv/mm/testcases/mmap_test.h
create mode 100755 tools/testing/selftests/riscv/mm/testcases/run_mmap.sh
--
2.41.0
[ This series depends on the VFIO device cdev series ]
Changelog
v9:
* Rebased on top of Jason's iommufd for-next tree
* Added Reviewed-by from Jason and Alex
* Reworked the replace API patches
* Added a new patch allowing passing in to iopt_remove_access
* Added a new patch of a helper function following Jason's design,
mainly by blocking any concurrent detach/replace and keeping the
refcount_dec at the end of the function
* Added a call of the new helper in iommufd_access_destroy_object()
to reduce race condition
* Simplified the replace API patch
v8:
https://lore.kernel.org/all/cover.1690226015.git.nicolinc@nvidia.com/
* Rebased on top of Jason's iommufd_hwpt series and then cdev v15 series:
https://lore.kernel.org/all/0-v8-6659224517ea+532-iommufd_alloc_jgg@nvidia.…https://lore.kernel.org/kvm/20230718135551.6592-1-yi.l.liu@intel.com/
* Changed the order of detach() and attach() in replace(), to fix a bug
v7:
https://lore.kernel.org/all/cover.1683593831.git.nicolinc@nvidia.com/
* Rebased on top of v6.4-rc1 and cdev v11 candidate
* Fixed a wrong file in replace() API patch
* Added Kevin's "Reviewed-by" to replace() API patch
v6:
https://lore.kernel.org/all/cover.1679939952.git.nicolinc@nvidia.com/
* Rebased on top of cdev v8 series
https://lore.kernel.org/kvm/20230327094047.47215-1-yi.l.liu@intel.com/
* Added "Reviewed-by" from Kevin to PATCH-4
* Squashed access->ioas updating lines into iommufd_access_change_pt(),
and changed function return type accordingly for simplification.
v5:
https://lore.kernel.org/all/cover.1679559476.git.nicolinc@nvidia.com/
* Kept the cmd->id in the iommufd_test_create_access() so the access can
be created with an ioas by default. Then, renamed the previous ioctl
IOMMU_TEST_OP_ACCESS_SET_IOAS to IOMMU_TEST_OP_ACCESS_REPLACE_IOAS, so
it would be used to replace an access->ioas pointer.
* Added iommufd_access_replace() API after the introductions of the other
two APIs iommufd_access_attach() and iommufd_access_detach().
* Since vdev->iommufd_attached is also set in emulated pathway too, call
iommufd_access_update(), similar to the physical pathway.
v4:
https://lore.kernel.org/all/cover.1678284812.git.nicolinc@nvidia.com/
* Rebased on top of Jason's series adding replace() and hwpt_alloc()
https://lore.kernel.org/all/0-v2-51b9896e7862+8a8c-iommufd_alloc_jgg@nvidia…
* Rebased on top of cdev series v6
https://lore.kernel.org/kvm/20230308132903.465159-1-yi.l.liu@intel.com/
* Dropped the patch that's moved to cdev series.
* Added unmap function pointer sanity before calling it.
* Added "Reviewed-by" from Kevin and Yi.
* Added back the VFIO change updating the ATTACH uAPI.
v3:
https://lore.kernel.org/all/cover.1677288789.git.nicolinc@nvidia.com/
* Rebased on top of Jason's iommufd_hwpt branch:
https://lore.kernel.org/all/0-v2-406f7ac07936+6a-iommufd_hwpt_jgg@nvidia.co…
* Dropped patches from this series accordingly. There were a couple of
VFIO patches that will be submitted after the VFIO cdev series. Also,
renamed the series to be "emulated".
* Moved dma_unmap sanity patch to the first in the series.
* Moved dma_unmap sanity to cover both VFIO and IOMMUFD pathways.
* Added Kevin's "Reviewed-by" to two of the patches.
* Fixed a NULL pointer bug in vfio_iommufd_emulated_bind().
* Moved unmap() call to the common place in iommufd_access_set_ioas().
v2:
https://lore.kernel.org/all/cover.1675802050.git.nicolinc@nvidia.com/
* Rebased on top of vfio_device cdev v2 series.
* Update the kdoc and commit message of iommu_group_replace_domain().
* Dropped revert-to-core-domain part in iommu_group_replace_domain().
* Dropped !ops->dma_unmap check in vfio_iommufd_emulated_attach_ioas().
* Added missing rc value in vfio_iommufd_emulated_attach_ioas() from the
iommufd_access_set_ioas() call.
* Added a new patch in vfio_main to deny vfio_pin/unpin_pages() calls if
vdev->ops->dma_unmap is not implemented.
* Added a __iommmufd_device_detach helper and let the replace routine do
a partial detach().
* Added restriction on auto_domains to use the replace feature.
* Added the patch "iommufd/device: Make hwpt_list list_add/del symmetric"
from the has_group removal series.
v1:
https://lore.kernel.org/all/cover.1675320212.git.nicolinc@nvidia.com/
Hi all,
The existing IOMMU APIs provide a pair of functions: iommu_attach_group()
for callers to attach a device from the default_domain (NULL if not being
supported) to a given iommu domain, and iommu_detach_group() for callers
to detach a device from a given domain to the default_domain. Internally,
the detach_dev op is deprecated for the newer drivers with default_domain.
This means that those drivers likely can switch an attaching domain to
another one, without stagging the device at a blocking or default domain,
for use cases such as:
1) vPASID mode, when a guest wants to replace a single pasid (PASID=0)
table with a larger table (PASID=N)
2) Nesting mode, when switching the attaching device from an S2 domain
to an S1 domain, or when switching between relevant S1 domains.
This series is rebased on top of Jason Gunthorpe's series that introduces
iommu_group_replace_domain API and IOMMUFD infrastructure for the IOMMUFD
"physical" devices. The IOMMUFD "emulated" deivces will need some extra
steps to replace the access->ioas object and its iopt pointer.
You can also find this series on Github:
https://github.com/nicolinc/iommufd/commits/iommu_group_replace_domain-v9
Thank you
Nicolin Chen
Nicolin Chen (6):
vfio: Do not allow !ops->dma_unmap in vfio_pin/unpin_pages()
iommufd: Allow passing in iopt_access_list_id to iopt_remove_access()
iommufd: Add iommufd_access_change_ioas helper
iommufd: Add iommufd_access_replace() API
iommufd/selftest: Add IOMMU_TEST_OP_ACCESS_REPLACE_IOAS coverage
vfio: Support IO page table replacement
drivers/iommu/iommufd/device.c | 123 ++++++++++++------
drivers/iommu/iommufd/io_pagetable.c | 6 +-
drivers/iommu/iommufd/iommufd_private.h | 3 +-
drivers/iommu/iommufd/iommufd_test.h | 4 +
drivers/iommu/iommufd/selftest.c | 19 +++
drivers/vfio/iommufd.c | 11 +-
drivers/vfio/vfio_main.c | 4 +
include/linux/iommufd.h | 1 +
include/uapi/linux/vfio.h | 6 +
tools/testing/selftests/iommu/iommufd.c | 29 ++++-
tools/testing/selftests/iommu/iommufd_utils.h | 19 +++
11 files changed, 175 insertions(+), 50 deletions(-)
--
2.41.0
When we collect a signal context with one of the SME modes enabled we will
have enabled that mode behind the compiler and libc's back so they may
issue some instructions not valid in streaming mode, causing spurious
failures.
For the code prior to issuing the BRK to trigger signal handling we need to
stay in streaming mode if we were already there since that's a part of the
signal context the caller is trying to collect. Unfortunately this code
includes a memset() which is likely to be heavily optimised and is likely
to use FP instructions incompatible with streaming mode. We can avoid this
happening by open coding the memset(), inserting a volatile assembly
statement to avoid the compiler recognising what's being done and doing
something in optimisation. This code is not performance critical so the
inefficiency should not be an issue.
After collecting the context we can simply exit streaming mode, avoiding
these issues. Use a full SMSTOP for safety to prevent any issues appearing
with ZA.
Reported-by: Will Deacon <will(a)kernel.org>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Changes in v3:
- Open code OPTIMISER_HIDE_VAR() instead of the memory clobber.
- Link to v2: https://lore.kernel.org/r/20230712-arm64-signal-memcpy-fix-v2-1-494f7025caf…
Changes in v2:
- Rebase onto v6.5-rc1.
- Link to v1: https://lore.kernel.org/r/20230628-arm64-signal-memcpy-fix-v1-1-db3e0300829…
---
.../selftests/arm64/signal/test_signals_utils.h | 25 +++++++++++++++++++++-
1 file changed, 24 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/arm64/signal/test_signals_utils.h b/tools/testing/selftests/arm64/signal/test_signals_utils.h
index 222093f51b67..c7f5627171dd 100644
--- a/tools/testing/selftests/arm64/signal/test_signals_utils.h
+++ b/tools/testing/selftests/arm64/signal/test_signals_utils.h
@@ -60,13 +60,25 @@ static __always_inline bool get_current_context(struct tdescr *td,
size_t dest_sz)
{
static volatile bool seen_already;
+ int i;
+ char *uc = (char *)dest_uc;
assert(td && dest_uc);
/* it's a genuine invocation..reinit */
seen_already = 0;
td->live_uc_valid = 0;
td->live_sz = dest_sz;
- memset(dest_uc, 0x00, td->live_sz);
+
+ /*
+ * This is a memset() but we don't want the compiler to
+ * optimise it into either instructions or a library call
+ * which might be incompatible with streaming mode.
+ */
+ for (i = 0; i < td->live_sz; i++) {
+ uc[i] = 0;
+ __asm__ ("" : "=r" (uc[i]) : "0" (uc[i]));
+ }
+
td->live_uc = dest_uc;
/*
* Grab ucontext_t triggering a SIGTRAP.
@@ -103,6 +115,17 @@ static __always_inline bool get_current_context(struct tdescr *td,
:
: "memory");
+ /*
+ * If we were grabbing a streaming mode context then we may
+ * have entered streaming mode behind the system's back and
+ * libc or compiler generated code might decide to do
+ * something invalid in streaming mode, or potentially even
+ * the state of ZA. Issue a SMSTOP to exit both now we have
+ * grabbed the state.
+ */
+ if (td->feats_supported & FEAT_SME)
+ asm volatile("msr S0_3_C4_C6_3, xzr");
+
/*
* If we get here with seen_already==1 it implies the td->live_uc
* context has been used to get back here....this probably means
---
base-commit: 06c2afb862f9da8dc5efa4b6076a0e48c3fbaaa5
change-id: 20230628-arm64-signal-memcpy-fix-7de3b3c8fa10
Best regards,
--
Mark Brown <broonie(a)kernel.org>
This series fixes an issue which David Spickett found where if we change
the SVE VL while SME is in use we can end up attempting to save state to
an unallocated buffer and adds testing coverage for that plus a bit more
coverage of VL changes, just for paranioa.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Changes in v2:
- Always reallocate the SVE state.
- Rebase onto v6.5-rc2.
- Link to v1: https://lore.kernel.org/r/20230713-arm64-fix-sve-sme-vl-change-v1-0-129dd86…
---
Mark Brown (3):
arm64/fpsimd: Ensure SME storage is allocated after SVE VL changes
kselftest/arm64: Add a test case for SVE VL changes with SME active
kselftest/arm64: Validate that changing one VL type does not affect another
arch/arm64/kernel/fpsimd.c | 33 +++++--
tools/testing/selftests/arm64/fp/vec-syscfg.c | 127 +++++++++++++++++++++++++-
2 files changed, 148 insertions(+), 12 deletions(-)
---
base-commit: 06785562d1b99ff6dc1cd0af54be5e3ff999dc02
change-id: 20230713-arm64-fix-sve-sme-vl-change-60eb1fa6a707
Best regards,
--
Mark Brown <broonie(a)kernel.org>
The openvswitch selftests currently contain a few cases for managing the
datapath, which includes creating datapath instances, adding interfaces,
and doing some basic feature / upcall tests. This is useful to validate
the control path.
Add the ability to program some of the more common flows with actions. This
can be improved overtime to include regression testing, etc.
Aaron Conole (4):
selftests: openvswitch: add an initial flow programming case
selftests: openvswitch: add a test for ipv4 forwarding
selftests: openvswitch: add basic ct test case parsing
selftests: openvswitch: add ct-nat test case with ipv4
.../selftests/net/openvswitch/openvswitch.sh | 223 ++++++++
.../selftests/net/openvswitch/ovs-dpctl.py | 507 ++++++++++++++++++
2 files changed, 730 insertions(+)
--
2.40.1
* TL;DR:
Device memory TCP (devmem TCP) is a proposal for transferring data to and/or
from device memory efficiently, without bouncing the data to a host memory
buffer.
* Problem:
A large amount of data transfers have device memory as the source and/or
destination. Accelerators drastically increased the volume of such transfers.
Some examples include:
- ML accelerators transferring large amounts of training data from storage into
GPU/TPU memory. In some cases ML training setup time can be as long as 50% of
TPU compute time, improving data transfer throughput & efficiency can help
improving GPU/TPU utilization.
- Distributed training, where ML accelerators, such as GPUs on different hosts,
exchange data among them.
- Distributed raw block storage applications transfer large amounts of data with
remote SSDs, much of this data does not require host processing.
Today, the majority of the Device-to-Device data transfers the network are
implemented as the following low level operations: Device-to-Host copy,
Host-to-Host network transfer, and Host-to-Device copy.
The implementation is suboptimal, especially for bulk data transfers, and can
put significant strains on system resources, such as host memory bandwidth,
PCIe bandwidth, etc. One important reason behind the current state is the
kernel’s lack of semantics to express device to network transfers.
* Proposal:
In this patch series we attempt to optimize this use case by implementing
socket APIs that enable the user to:
1. send device memory across the network directly, and
2. receive incoming network packets directly into device memory.
Packet _payloads_ go directly from the NIC to device memory for receive and from
device memory to NIC for transmit.
Packet _headers_ go to/from host memory and are processed by the TCP/IP stack
normally. The NIC _must_ support header split to achieve this.
Advantages:
- Alleviate host memory bandwidth pressure, compared to existing
network-transfer + device-copy semantics.
- Alleviate PCIe BW pressure, by limiting data transfer to the lowest level
of the PCIe tree, compared to traditional path which sends data through the
root complex.
With this proposal we're able to reach ~96.6% line rate speeds with data sent
and received directly from/to device memory.
* Patch overview:
** Part 1: struct paged device memory
Currently the standard for device memory sharing is DMABUF, which doesn't
generate struct pages. On the other hand, networking stack (skbs, drivers, and
page pool) operate on pages. We have 2 options:
1. Generate struct pages for dmabuf device memory, or,
2. Modify the networking stack to understand a new memory type.
This proposal implements option #1. We implement a small framework to generate
struct pages for an sg_table returned from dma_buf_map_attachment(). The support
added here should be generic and easily extended to other use cases interested
in struct paged device memory. We use this framework to generate pages that can
be used in the networking stack.
** Part 2: recvmsg() & sendmsg() APIs
We define user APIs for the user to send and receive these dmabuf pages.
** part 3: support for unreadable skb frags
Dmabuf pages are not accessible by the host; we implement changes throughput the
networking stack to correctly handle skbs with unreadable frags.
** part 4: page pool support
We piggy back on Jakub's page pool memory providers idea:
https://github.com/kuba-moo/linux/tree/pp-providers
It allows the page pool to define a memory provider that provides the
page allocation and freeing. It helps abstract most of the device memory TCP
changes from the driver.
This is not strictly necessary, the driver can choose to allocate dmabuf pages
and use them directly without going through the page pool (if acceptable to
their maintainers).
Not included with this RFC is the GVE devmem TCP support, just to
simplify the review. Code available here if desired:
https://github.com/mina/linux/tree/tcpdevmem
This RFC is built on top of v6.4-rc7 with Jakub's pp-providers changes
cherry-picked.
* NIC dependencies:
1. (strict) Devmem TCP require the NIC to support header split, i.e. the
capability to split incoming packets into a header + payload and to put
each into a separate buffer. Devmem TCP works by using dmabuf pages
for the packet payload, and host memory for the packet headers.
2. (optional) Devmem TCP works better with flow steering support & RSS support,
i.e. the NIC's ability to steer flows into certain rx queues. This allows the
sysadmin to enable devmem TCP on a subset of the rx queues, and steer
devmem TCP traffic onto these queues and non devmem TCP elsewhere.
The NIC I have access to with these properties is the GVE with DQO support
running in Google Cloud, but any NIC that supports these features would suffice.
I may be able to help reviewers bring up devmem TCP on their NICs.
* Testing:
The series includes a udmabuf kselftest that show a simple use case of
devmem TCP and validates the entire data path end to end without
a dependency on a specific dmabuf provider.
Not included in this series is our devmem TCP benchmark, which
transfers data to/from GPU dmabufs directly.
With this implementation & benchmark we're able to reach ~96.6% line rate
speeds with 4 GPU/NIC pairs running bi-direction traffic, with all the
packet payloads going straight to the GPU memory (no host buffer bounce).
** Test Setup
Kernel: v6.4-rc7, with this RFC and Jakub's memory provider API
cherry-picked locally.
Hardware: Google Cloud A3 VMs.
NIC: GVE with header split & RSS & flow steering support.
Benchmark: custom devmem TCP benchmark not yet open sourced.
Mina Almasry (10):
dma-buf: add support for paged attachment mappings
dma-buf: add support for NET_RX pages
dma-buf: add support for NET_TX pages
net: add support for skbs with unreadable frags
tcp: implement recvmsg() RX path for devmem TCP
net: add SO_DEVMEM_DONTNEED setsockopt to release RX pages
tcp: implement sendmsg() TX path for for devmem tcp
selftests: add ncdevmem, netcat for devmem TCP
memory-provider: updates core provider API for devmem TCP
memory-provider: add dmabuf devmem provider
drivers/dma-buf/dma-buf.c | 444 ++++++++++++++++
include/linux/dma-buf.h | 142 +++++
include/linux/netdevice.h | 1 +
include/linux/skbuff.h | 34 +-
include/linux/socket.h | 1 +
include/net/page_pool.h | 21 +
include/net/sock.h | 4 +
include/net/tcp.h | 6 +-
include/uapi/asm-generic/socket.h | 6 +
include/uapi/linux/dma-buf.h | 12 +
include/uapi/linux/uio.h | 10 +
net/core/datagram.c | 3 +
net/core/page_pool.c | 111 +++-
net/core/skbuff.c | 81 ++-
net/core/sock.c | 47 ++
net/ipv4/tcp.c | 262 +++++++++-
net/ipv4/tcp_input.c | 13 +-
net/ipv4/tcp_ipv4.c | 8 +
net/ipv4/tcp_output.c | 5 +-
net/packet/af_packet.c | 4 +-
tools/testing/selftests/net/.gitignore | 1 +
tools/testing/selftests/net/Makefile | 1 +
tools/testing/selftests/net/ncdevmem.c | 693 +++++++++++++++++++++++++
23 files changed, 1868 insertions(+), 42 deletions(-)
create mode 100644 tools/testing/selftests/net/ncdevmem.c
--
2.41.0.390.g38632f3daf-goog
Events Tracing infrastructure contains lot of files, directories
(internally in terms of inodes, dentries). And ends up by consuming
memory in MBs. We can have multiple events of Events Tracing, which
further requires more memory.
Instead of creating inodes/dentries, eventfs could keep meta-data and
skip the creation of inodes/dentries. As and when require, eventfs will
create the inodes/dentries only for required files/directories.
Also eventfs would delete the inodes/dentries once no more requires
but preserve the meta data.
Tracing events took ~9MB, with this approach it took ~4.5MB
for ~10K files/dir.
Diff from v4:
Patch 02: moved from v4 08/10
added fs/tracefs/internal.h
Patch 03: moved from v4 02/10
removed fs/tracefs/internal.h
Patch 04: moved from v4 03/10
moved out changes of fs/tracefs/internal.h
Patch 05: moved from v4 04/10
renamed eventfs_add_top_file() -> eventfs_add_events_file()
Patch 06: moved from v4 07/10
implemented create_dentry() helper function
added create_file(), create_dir() stub function
Patch 07: moved from v4 06/10
Patch 08: moved from v4 05/10
improved eventfs remove functionality
Patch 09: removed unwanted if conditions
Patch 10: added available_filter_functions check
Diff from v3:
Patch 3,4,5,7,9:
removed all the eventfs_rwsem code and replaced it with an srcu
lock for the readers, and a mutex to synchronize the writers of
the list.
Patch 2: moved 'tracefs_inode' and 'get_tracefs()' to v4 03/10
Patch 3: moved the struct eventfs_file and eventfs_inode into event_inode.c
as it really should not be exposed to all users.
Patch 5: added a recursion check to eventfs_remove_rec() as it is really
dangerous to have unchecked recursion in the kernel (we do have
a fixed size stack).
have the free use srcu callbacks. After the srcu grace periods
are done, it adds the eventfs_file onto a llist (lockless link
list) and wakes up a work queue. Then the work queue does the
freeing (this needs to be done in task/workqueue context, as
srcu callbacks are done in softirq context).
Patch 6: renamed:
eventfs_create_file() -> create_file()
eventfs_create_dir() -> create_dir()
Diff from v2:
Patch 01: new patch:'Require all trace events to have a TRACE_SYSTEM'
Patch 02: moved from v1 1/9
Patch 03: moved from v1 2/9
As suggested by Zheng Yejian, introduced eventfs_prepare_ef()
helper function to add files or directories to eventfs
fix WARNING reported by kernel test robot in v1 8/9
Patch 04: moved from v1 3/9
used eventfs_prepare_ef() to add files
fix WARNING reported by kernel test robot in v1 8/9
Patch 05: moved from v1 4/9
fix compiling warning reported by kernel test robot in v1 4/9
Patch 06: moved from v1 5/9
Patch 07: moved from v1 6/9
Patch 08: moved from v1 7/9
Patch 09: moved from v1 8/9
rebased because of v3 01/10
Patch 10: moved from v1 9/9
Diff from v1:
Patch 1: add header file
Patch 2: resolved kernel test robot issues
protecting eventfs lists using nested eventfs_rwsem
Patch 3: protecting eventfs lists using nested eventfs_rwsem
Patch 4: improve events cleanup code to fix crashes
Patch 5: resolved kernel test robot issues
removed d_instantiate_anon() calls
Patch 6: resolved kernel test robot issues
fix kprobe test in eventfs_root_lookup()
protecting eventfs lists using nested eventfs_rwsem
Patch 7: remove header file
Patch 8: pass eventfs_rwsem as argument to eventfs functions
called eventfs_remove_events_dir() instead of tracefs_remove()
from event_trace_del_tracer()
Patch 9: new patch to fix kprobe test case
fs/tracefs/Makefile | 1 +
fs/tracefs/event_inode.c | 795 ++++++++++++++++++
fs/tracefs/inode.c | 151 +++-
fs/tracefs/internal.h | 26 +
include/linux/trace_events.h | 1 +
include/linux/tracefs.h | 30 +
kernel/trace/trace.h | 2 +-
kernel/trace/trace_events.c | 76 +-
.../ftrace/test.d/kprobe/kprobe_args_char.tc | 9 +-
.../test.d/kprobe/kprobe_args_string.tc | 9 +-
10 files changed, 1048 insertions(+), 52 deletions(-)
create mode 100644 fs/tracefs/event_inode.c
create mode 100644 fs/tracefs/internal.h
--
2.39.0
Hello,
This is v4 of the patch series for TDX selftests.
It has been updated for Intel’s v14 of the TDX host patches which was
proposed here:
https://lore.kernel.org/lkml/cover.1685333727.git.isaku.yamahata@intel.com/
The tree can be found at:
https://github.com/googleprodkernel/linux-cc/tree/tdx-selftests-rfc-v4
Changes from RFC v3:
In v14, TDX can only run with UPM enabled so the necessary changes were
made to handle that.
td_vcpu_run() was added to handle TdVmCalls that are now handled in
userspace.
The comments under the patch "KVM: selftests: Require GCC to realign
stacks on function entry" were addressed with the following patch:
https://lore.kernel.org/lkml/Y%2FfHLdvKHlK6D%2F1v@google.com/T/
And other minor tweaks were made to integrate the selftest
infrastructure onto v14.
In RFCv4, TDX selftest code is organized into:
+ headers in tools/testing/selftests/kvm/include/x86_64/tdx/
+ common code in tools/testing/selftests/kvm/lib/x86_64/tdx/
+ selftests in tools/testing/selftests/kvm/x86_64/tdx_*
Dependencies
+ Peter’s patches, which provide functions for the host to allocate
and track protected memory in the
guest. https://lore.kernel.org/lkml/20221018205845.770121-1-pgonda@google.com/T/
Further work for this patch series/TODOs
+ Sean’s comments for the non-confidential UPM selftests patch series
at https://lore.kernel.org/lkml/Y8dC8WDwEmYixJqt@google.com/T/#u apply
here as well
+ Add ucall support for TDX selftests
I would also like to acknowledge the following people, who helped
review or test patches in RFCv1, RFCv2, and RFCv3:
+ Sean Christopherson <seanjc(a)google.com>
+ Zhenzhong Duan <zhenzhong.duan(a)intel.com>
+ Peter Gonda <pgonda(a)google.com>
+ Andrew Jones <drjones(a)redhat.com>
+ Maxim Levitsky <mlevitsk(a)redhat.com>
+ Xiaoyao Li <xiaoyao.li(a)intel.com>
+ David Matlack <dmatlack(a)google.com>
+ Marc Orr <marcorr(a)google.com>
+ Isaku Yamahata <isaku.yamahata(a)gmail.com>
+ Maciej S. Szmigiero <maciej.szmigiero(a)oracle.com>
Links to earlier patch series
+ RFC v1: https://lore.kernel.org/lkml/20210726183816.1343022-1-erdemaktas@google.com…
+ RFC v2: https://lore.kernel.org/lkml/20220830222000.709028-1-sagis@google.com/T/#u
+ RFC v3: https://lore.kernel.org/lkml/20230121001542.2472357-1-ackerleytng@google.co…
Ackerley Tng (12):
KVM: selftests: Add function to allow one-to-one GVA to GPA mappings
KVM: selftests: Expose function that sets up sregs based on VM's mode
KVM: selftests: Store initial stack address in struct kvm_vcpu
KVM: selftests: Refactor steps in vCPU descriptor table initialization
KVM: selftests: TDX: Use KVM_TDX_CAPABILITIES to validate TDs'
attribute configuration
KVM: selftests: TDX: Update load_td_memory_region for VM memory backed
by guest memfd
KVM: selftests: Add functions to allow mapping as shared
KVM: selftests: Expose _vm_vaddr_alloc
KVM: selftests: TDX: Add support for TDG.MEM.PAGE.ACCEPT
KVM: selftests: TDX: Add support for TDG.VP.VEINFO.GET
KVM: selftests: TDX: Add TDX UPM selftest
KVM: selftests: TDX: Add TDX UPM selftests for implicit conversion
Erdem Aktas (3):
KVM: selftests: Add helper functions to create TDX VMs
KVM: selftests: TDX: Add TDX lifecycle test
KVM: selftests: TDX: Adding test case for TDX port IO
Roger Wang (1):
KVM: selftests: TDX: Add TDG.VP.INFO test
Ryan Afranji (2):
KVM: selftests: TDX: Verify the behavior when host consumes a TD
private memory
KVM: selftests: TDX: Add shared memory test
Sagi Shahar (10):
KVM: selftests: TDX: Add report_fatal_error test
KVM: selftests: TDX: Add basic TDX CPUID test
KVM: selftests: TDX: Add basic get_td_vmcall_info test
KVM: selftests: TDX: Add TDX IO writes test
KVM: selftests: TDX: Add TDX IO reads test
KVM: selftests: TDX: Add TDX MSR read/write tests
KVM: selftests: TDX: Add TDX HLT exit test
KVM: selftests: TDX: Add TDX MMIO reads test
KVM: selftests: TDX: Add TDX MMIO writes test
KVM: selftests: TDX: Add TDX CPUID TDVMCALL test
tools/testing/selftests/kvm/Makefile | 8 +
.../selftests/kvm/include/kvm_util_base.h | 35 +
.../selftests/kvm/include/x86_64/processor.h | 4 +
.../kvm/include/x86_64/tdx/td_boot.h | 82 +
.../kvm/include/x86_64/tdx/td_boot_asm.h | 16 +
.../selftests/kvm/include/x86_64/tdx/tdcall.h | 59 +
.../selftests/kvm/include/x86_64/tdx/tdx.h | 65 +
.../kvm/include/x86_64/tdx/tdx_util.h | 19 +
.../kvm/include/x86_64/tdx/test_util.h | 164 ++
tools/testing/selftests/kvm/lib/kvm_util.c | 115 +-
.../selftests/kvm/lib/x86_64/processor.c | 77 +-
.../selftests/kvm/lib/x86_64/tdx/td_boot.S | 101 ++
.../selftests/kvm/lib/x86_64/tdx/tdcall.S | 158 ++
.../selftests/kvm/lib/x86_64/tdx/tdx.c | 262 ++++
.../selftests/kvm/lib/x86_64/tdx/tdx_util.c | 565 +++++++
.../selftests/kvm/lib/x86_64/tdx/test_util.c | 101 ++
.../kvm/x86_64/tdx_shared_mem_test.c | 134 ++
.../selftests/kvm/x86_64/tdx_upm_test.c | 469 ++++++
.../selftests/kvm/x86_64/tdx_vm_tests.c | 1322 +++++++++++++++++
19 files changed, 3730 insertions(+), 26 deletions(-)
create mode 100644 tools/testing/selftests/kvm/include/x86_64/tdx/td_boot.h
create mode 100644 tools/testing/selftests/kvm/include/x86_64/tdx/td_boot_asm.h
create mode 100644 tools/testing/selftests/kvm/include/x86_64/tdx/tdcall.h
create mode 100644 tools/testing/selftests/kvm/include/x86_64/tdx/tdx.h
create mode 100644 tools/testing/selftests/kvm/include/x86_64/tdx/tdx_util.h
create mode 100644 tools/testing/selftests/kvm/include/x86_64/tdx/test_util.h
create mode 100644 tools/testing/selftests/kvm/lib/x86_64/tdx/td_boot.S
create mode 100644 tools/testing/selftests/kvm/lib/x86_64/tdx/tdcall.S
create mode 100644 tools/testing/selftests/kvm/lib/x86_64/tdx/tdx.c
create mode 100644 tools/testing/selftests/kvm/lib/x86_64/tdx/tdx_util.c
create mode 100644 tools/testing/selftests/kvm/lib/x86_64/tdx/test_util.c
create mode 100644 tools/testing/selftests/kvm/x86_64/tdx_shared_mem_test.c
create mode 100644 tools/testing/selftests/kvm/x86_64/tdx_upm_test.c
create mode 100644 tools/testing/selftests/kvm/x86_64/tdx_vm_tests.c
--
2.41.0.487.g6d72f3e995-goog
[ Resending because claws-mail is messing with the Cc again. It doesn't like quotes :-p ]
On Fri, 21 Jul 2023 08:48:39 -0400
Steven Rostedt <rostedt(a)goodmis.org> wrote:
> diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
> index 4db048250cdb..2718de1533e6 100644
> --- a/fs/tracefs/event_inode.c
> +++ b/fs/tracefs/event_inode.c
> @@ -36,16 +36,36 @@ struct eventfs_file {
> const struct file_operations *fop;
> const struct inode_operations *iop;
> union {
> + struct list_head del_list;
> struct rcu_head rcu;
> - struct llist_node llist; /* For freeing after RCU */
> + unsigned long is_freed; /* Freed if one of the above is set */
I changed the freeing around. The dentries are freed before returning from
eventfs_remove_dir().
I also added a "is_freed" field that is part of the union and is set if
list elements have content. Note, since the union was criticized before, I
will state the entire purpose of doing this patch set is to save memory.
This structure will be used for every event file. What's the point of
getting rid of dentries if we are replacing it with something just as big?
Anyway, struct dentry does the exact same thing!
> };
> void *data;
> umode_t mode;
> - bool created;
> + unsigned int flags;
Bah, I forgot to remove flags (one iteration replaced the created with
flags to set both created and freed). I removed the freed with the above
"is_freed" and noticed that created is set if and only if ef->dentry is
set. So instead of using the created boolean, just test ef->dentry.
The flags isn't used and can be removed. I just forgot to do so.
> };
>
> static DEFINE_MUTEX(eventfs_mutex);
> DEFINE_STATIC_SRCU(eventfs_srcu);
> +
> +static struct dentry *eventfs_root_lookup(struct inode *dir,
> + struct dentry *dentry,
> + unsigned int flags);
> +static int dcache_dir_open_wrapper(struct inode *inode, struct file *file);
> +static int eventfs_release(struct inode *inode, struct file *file);
> +
> +static const struct inode_operations eventfs_root_dir_inode_operations = {
> + .lookup = eventfs_root_lookup,
> +};
> +
> +static const struct file_operations eventfs_file_operations = {
> + .open = dcache_dir_open_wrapper,
> + .read = generic_read_dir,
> + .iterate_shared = dcache_readdir,
> + .llseek = generic_file_llseek,
> + .release = eventfs_release,
> +};
> +
In preparing for getting rid of eventfs_file, I noticed that all
directories are set to the above ops. In create_dir() instead of passing in
ef->*ops, just use these directly. This does help with future work.
> /**
> * create_file - create a file in the tracefs filesystem
> * @name: the name of the file to create.
> @@ -123,17 +143,12 @@ static struct dentry *create_file(const char *name, umode_t mode,
> * If tracefs is not enabled in the kernel, the value -%ENODEV will be
> * returned.
> */
> -static struct dentry *create_dir(const char *name, umode_t mode,
> - struct dentry *parent, void *data,
> - const struct file_operations *fop,
> - const struct inode_operations *iop)
> +static struct dentry *create_dir(const char *name, struct dentry *parent, void *data)
> {
As stated, the directories always used the same *op values, so I just hard
coded it.
> struct tracefs_inode *ti;
> struct dentry *dentry;
> struct inode *inode;
>
> - WARN_ON(!S_ISDIR(mode));
> -
> dentry = eventfs_start_creating(name, parent);
> if (IS_ERR(dentry))
> return dentry;
> @@ -142,9 +157,9 @@ static struct dentry *create_dir(const char *name, umode_t mode,
> if (unlikely(!inode))
> return eventfs_failed_creating(dentry);
>
> - inode->i_mode = mode;
> - inode->i_op = iop;
> - inode->i_fop = fop;
> + inode->i_mode = S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO;
> + inode->i_op = &eventfs_root_dir_inode_operations;
> + inode->i_fop = &eventfs_file_operations;
> inode->i_private = data;
>
> ti = get_tracefs(inode);
> @@ -169,15 +184,27 @@ void eventfs_set_ef_status_free(struct dentry *dentry)
> struct tracefs_inode *ti_parent;
> struct eventfs_file *ef;
>
> + mutex_lock(&eventfs_mutex);
To synchronize with the removals, I needed to add locking here.
> ti_parent = get_tracefs(dentry->d_parent->d_inode);
> if (!ti_parent || !(ti_parent->flags & TRACEFS_EVENT_INODE))
> - return;
> + goto out;
>
> ef = dentry->d_fsdata;
> if (!ef)
> - return;
> - ef->created = false;
> + goto out;
> + /*
> + * If ef was freed, then the LSB bit is set for d_fsdata.
> + * But this should not happen, as it should still have a
> + * ref count that prevents it. Warn in case it does.
> + */
> + if (WARN_ON_ONCE((unsigned long)ef & 1))
> + goto out;
During the remove, a dget() is done to keep the dentry from freeing. To
make sure that it doesn't get freed, I added this test.
> +
> + dentry->d_fsdata = NULL;
> +
> ef->dentry = NULL;
> + out:
> + mutex_unlock(&eventfs_mutex);
> }
>
> /**
> @@ -202,6 +229,79 @@ static void eventfs_post_create_dir(struct eventfs_file *ef)
> ti->private = ef->ei;
> }
>
> +static struct dentry *
> +create_dentry(struct eventfs_file *ef, struct dentry *parent, bool lookup)
> +{
Because both the lookup and the dir_open_wrapper did basically the same
thing, I created a helper function so that I didn't have to update both
locations.
> + bool invalidate = false;
> + struct dentry *dentry;
> +
> + mutex_lock(&eventfs_mutex);
> + if (ef->is_freed) {
> + mutex_unlock(&eventfs_mutex);
> + return NULL;
> + }
Ignore if the ef is on its way to be freed.
> + if (ef->dentry) {
> + dentry = ef->dentry;
If the ef already has a dentry (created) then use it.
> + /* On dir open, up the ref count */
> + if (!lookup)
> + dget(dentry);
> + mutex_unlock(&eventfs_mutex);
> + return dentry;
> + }
> + mutex_unlock(&eventfs_mutex);
> +
> + if (!lookup)
> + inode_lock(parent->d_inode);
> +
> + if (ef->ei)
> + dentry = create_dir(ef->name, parent, ef->data);
> + else
> + dentry = create_file(ef->name, ef->mode, parent,
> + ef->data, ef->fop);
> +
> + if (!lookup)
> + inode_unlock(parent->d_inode);
> +
> + mutex_lock(&eventfs_mutex);
> + if (IS_ERR_OR_NULL(dentry)) {
With the lock dropped, the dentry could have been created causing it to
fail. Check if the ef->dentry exists, and if so, use it instead.
Note, if the ef is freed, it should not have a dentry.
> + /* If the ef was already updated get it */
> + dentry = ef->dentry;
> + if (dentry && !lookup)
> + dget(dentry);
> + mutex_unlock(&eventfs_mutex);
> + return dentry;
> + }
> +
> + if (!ef->dentry && !ef->is_freed) {
With the lock dropped, the dentry could have been filled too. If so, drop
the created dentry and use the one owned by the ef->dentry.
> + ef->dentry = dentry;
> + if (ef->ei)
> + eventfs_post_create_dir(ef);
> + dentry->d_fsdata = ef;
> + } else {
> + /* A race here, should try again (unless freed) */
> + invalidate = true;
I had a WARN_ON() once here. Probably could add a:
WARN_ON_ONCE(!ef->is_freed);
> + }
> + mutex_unlock(&eventfs_mutex);
> + if (invalidate)
> + d_invalidate(dentry);
> +
> + if (lookup || invalidate)
> + dput(dentry);
> +
> + return invalidate ? NULL : dentry;
> +}
> +
> +static bool match_event_file(struct eventfs_file *ef, const char *name)
> +{
A bit of a paranoid helper function. I wanted to make sure to synchronize
with the removals.
> + bool ret;
> +
> + mutex_lock(&eventfs_mutex);
> + ret = !ef->is_freed && strcmp(ef->name, name) == 0;
> + mutex_unlock(&eventfs_mutex);
> +
> + return ret;
> +}
> +
> /**
> * eventfs_root_lookup - lookup routine to create file/dir
> * @dir: directory in which lookup to be done
> @@ -211,7 +311,6 @@ static void eventfs_post_create_dir(struct eventfs_file *ef)
> * Used to create dynamic file/dir with-in @dir, search with-in ei
> * list, if @dentry found go ahead and create the file/dir
> */
> -
> static struct dentry *eventfs_root_lookup(struct inode *dir,
> struct dentry *dentry,
> unsigned int flags)
> @@ -230,30 +329,10 @@ static struct dentry *eventfs_root_lookup(struct inode *dir,
> idx = srcu_read_lock(&eventfs_srcu);
> list_for_each_entry_srcu(ef, &ei->e_top_files, list,
> srcu_read_lock_held(&eventfs_srcu)) {
> - if (strcmp(ef->name, dentry->d_name.name))
> + if (!match_event_file(ef, dentry->d_name.name))
> continue;
> ret = simple_lookup(dir, dentry, flags);
> - if (ef->created)
> - continue;
> - mutex_lock(&eventfs_mutex);
> - ef->created = true;
> - if (ef->ei)
> - ef->dentry = create_dir(ef->name, ef->mode, ef->d_parent,
> - ef->data, ef->fop, ef->iop);
> - else
> - ef->dentry = create_file(ef->name, ef->mode, ef->d_parent,
> - ef->data, ef->fop);
> -
> - if (IS_ERR_OR_NULL(ef->dentry)) {
> - ef->created = false;
> - mutex_unlock(&eventfs_mutex);
> - } else {
> - if (ef->ei)
> - eventfs_post_create_dir(ef);
> - ef->dentry->d_fsdata = ef;
> - mutex_unlock(&eventfs_mutex);
> - dput(ef->dentry);
> - }
> + create_dentry(ef, ef->d_parent, true);
> break;
> }
> srcu_read_unlock(&eventfs_srcu, idx);
> @@ -270,6 +349,7 @@ static int eventfs_release(struct inode *inode, struct file *file)
> struct tracefs_inode *ti;
> struct eventfs_inode *ei;
> struct eventfs_file *ef;
> + struct dentry *dentry;
> int idx;
>
> ti = get_tracefs(inode);
> @@ -280,8 +360,11 @@ static int eventfs_release(struct inode *inode, struct file *file)
> idx = srcu_read_lock(&eventfs_srcu);
> list_for_each_entry_srcu(ef, &ei->e_top_files, list,
> srcu_read_lock_held(&eventfs_srcu)) {
> - if (ef->created)
> - dput(ef->dentry);
> + mutex_lock(&eventfs_mutex);
> + dentry = ef->dentry;
> + mutex_unlock(&eventfs_mutex);
> + if (dentry)
> + dput(dentry);
> }
> srcu_read_unlock(&eventfs_srcu, idx);
> return dcache_dir_close(inode, file);
> @@ -312,47 +395,12 @@ static int dcache_dir_open_wrapper(struct inode *inode, struct file *file)
> ei = ti->private;
> idx = srcu_read_lock(&eventfs_srcu);
> list_for_each_entry_rcu(ef, &ei->e_top_files, list) {
> - if (ef->created) {
> - dget(ef->dentry);
> - continue;
> - }
> - mutex_lock(&eventfs_mutex);
> - ef->created = true;
> -
> - inode_lock(dentry->d_inode);
> - if (ef->ei)
> - ef->dentry = create_dir(ef->name, ef->mode, dentry,
> - ef->data, ef->fop, ef->iop);
> - else
> - ef->dentry = create_file(ef->name, ef->mode, dentry,
> - ef->data, ef->fop);
> - inode_unlock(dentry->d_inode);
> -
> - if (IS_ERR_OR_NULL(ef->dentry)) {
> - ef->created = false;
> - } else {
> - if (ef->ei)
> - eventfs_post_create_dir(ef);
> - ef->dentry->d_fsdata = ef;
> - }
> - mutex_unlock(&eventfs_mutex);
> + create_dentry(ef, dentry, false);
> }
> srcu_read_unlock(&eventfs_srcu, idx);
> return dcache_dir_open(inode, file);
> }
>
> -static const struct file_operations eventfs_file_operations = {
> - .open = dcache_dir_open_wrapper,
> - .read = generic_read_dir,
> - .iterate_shared = dcache_readdir,
> - .llseek = generic_file_llseek,
> - .release = eventfs_release,
> -};
> -
> -static const struct inode_operations eventfs_root_dir_inode_operations = {
> - .lookup = eventfs_root_lookup,
> -};
> -
> /**
> * eventfs_prepare_ef - helper function to prepare eventfs_file
> * @name: the name of the file/directory to create.
> @@ -470,11 +518,7 @@ struct eventfs_file *eventfs_add_subsystem_dir(const char *name,
> ti_parent = get_tracefs(parent->d_inode);
> ei_parent = ti_parent->private;
>
> - ef = eventfs_prepare_ef(name,
> - S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO,
> - &eventfs_file_operations,
> - &eventfs_root_dir_inode_operations, NULL);
> -
> + ef = eventfs_prepare_ef(name, S_IFDIR, NULL, NULL, NULL);
For directories, just use the hard coded values.
> if (IS_ERR(ef))
> return ef;
>
> @@ -502,11 +546,7 @@ struct eventfs_file *eventfs_add_dir(const char *name,
> if (!ef_parent)
> return ERR_PTR(-EINVAL);
>
> - ef = eventfs_prepare_ef(name,
> - S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO,
> - &eventfs_file_operations,
> - &eventfs_root_dir_inode_operations, NULL);
> -
> + ef = eventfs_prepare_ef(name, S_IFDIR, NULL, NULL, NULL);
ditto.
> if (IS_ERR(ef))
> return ef;
>
> @@ -601,37 +641,15 @@ int eventfs_add_file(const char *name, umode_t mode,
> return 0;
> }
>
> -static LLIST_HEAD(free_list);
> -
> -static void eventfs_workfn(struct work_struct *work)
> -{
> - struct eventfs_file *ef, *tmp;
> - struct llist_node *llnode;
> -
> - llnode = llist_del_all(&free_list);
> - llist_for_each_entry_safe(ef, tmp, llnode, llist) {
> - if (ef->created && ef->dentry)
> - dput(ef->dentry);
> - kfree(ef->name);
> - kfree(ef->ei);
> - kfree(ef);
> - }
> -}
> -
> -DECLARE_WORK(eventfs_work, eventfs_workfn);
> -
> static void free_ef(struct rcu_head *head)
> {
> struct eventfs_file *ef = container_of(head, struct eventfs_file, rcu);
>
> - if (!llist_add(&ef->llist, &free_list))
> - return;
> -
> - queue_work(system_unbound_wq, &eventfs_work);
> + kfree(ef->name);
> + kfree(ef->ei);
> + kfree(ef);
Since I did not do the dput() or d_invalidate() here I don't need call this
from task context. This simplifies the process.
> }
>
> -
> -
> /**
> * eventfs_remove_rec - remove eventfs dir or file from list
> * @ef: eventfs_file to be removed.
> @@ -639,7 +657,7 @@ static void free_ef(struct rcu_head *head)
> * This function recursively remove eventfs_file which
> * contains info of file or dir.
> */
> -static void eventfs_remove_rec(struct eventfs_file *ef, int level)
> +static void eventfs_remove_rec(struct eventfs_file *ef, struct list_head *head, int level)
> {
> struct eventfs_file *ef_child;
>
> @@ -659,15 +677,12 @@ static void eventfs_remove_rec(struct eventfs_file *ef, int level)
> /* search for nested folders or files */
> list_for_each_entry_srcu(ef_child, &ef->ei->e_top_files, list,
> lockdep_is_held(&eventfs_mutex)) {
> - eventfs_remove_rec(ef_child, level + 1);
> + eventfs_remove_rec(ef_child, head, level + 1);
> }
> }
>
> - if (ef->created && ef->dentry)
> - d_invalidate(ef->dentry);
> -
> list_del_rcu(&ef->list);
> - call_srcu(&eventfs_srcu, &ef->rcu, free_ef);
> + list_add_tail(&ef->del_list, head);
Hold off on freeing the ef. Add it to a link list to do so later.
> }
>
> /**
> @@ -678,12 +693,62 @@ static void eventfs_remove_rec(struct eventfs_file *ef, int level)
> */
> void eventfs_remove(struct eventfs_file *ef)
> {
> + struct eventfs_file *tmp;
> + LIST_HEAD(ef_del_list);
> + struct dentry *dentry_list = NULL;
> + struct dentry *dentry;
> +
> if (!ef)
> return;
>
> mutex_lock(&eventfs_mutex);
> - eventfs_remove_rec(ef, 0);
> + eventfs_remove_rec(ef, &ef_del_list, 0);
The above returns back with ef_del_list holding all the ef's to be freed.
I probably could have just passed the dentry_list down instead, but I
wanted the below complexity done in a non recursive function.
> +
> + list_for_each_entry_safe(ef, tmp, &ef_del_list, del_list) {
> + if (ef->dentry) {
> + unsigned long ptr = (unsigned long)dentry_list;
> +
> + /* Keep the dentry from being freed yet */
> + dget(ef->dentry);
> +
> + /*
> + * Paranoid: The dget() above should prevent the dentry
> + * from being freed and calling eventfs_set_ef_status_free().
> + * But just in case, set the link list LSB pointer to 1
> + * and have eventfs_set_ef_status_free() check that to
> + * make sure that if it does happen, it will not think
> + * the d_fsdata is an event_file.
> + *
> + * For this to work, no event_file should be allocated
> + * on a odd space, as the ef should always be allocated
> + * to be at least word aligned. Check for that too.
> + */
> + WARN_ON_ONCE(ptr & 1);
> +
> + ef->dentry->d_fsdata = (void *)(ptr | 1);
Set the d_fsdata to be a link list. The comment above needs to say to say
struct eventfs_file and struct dentry should be word aligned. Anyway, while
the eventfs_mutex is held, set all the dentries belonging to eventfs_files
to the dentry_list and clear the ef->dentry.
> + dentry_list = ef->dentry;
> + ef->dentry = NULL;
> + }
> + call_srcu(&eventfs_srcu, &ef->rcu, free_ef);
> + }
> mutex_unlock(&eventfs_mutex);
> +
> + while (dentry_list) {
> + unsigned long ptr;
> +
> + dentry = dentry_list;
> + ptr = (unsigned long)dentry->d_fsdata & ~1UL;
> + dentry_list = (struct dentry *)ptr;
> + dentry->d_fsdata = NULL;
With the mutex released, it is safe to free the dentries here. This also
must be done before returning from this function, as when I had it done in
the workqueue, it was failing some tests that would remove a dynamic event
and still see that the directory was still around!
> + d_invalidate(dentry);
> + mutex_lock(&eventfs_mutex);
> + /* dentry should now have at least a single reference */
> + WARN_ONCE((int)d_count(dentry) < 1,
> + "dentry %px less than one reference (%d) after invalidate\n",
I did update the above to:
WARN_ONCE((int)d_count(dentry) < 1,
"dentry %px (%s) less than one reference (%d) after invalidate\n",
dentry, dentry->d_name.name, d_count(dentry));
To include the name of the dentry (my current work is triggering this still).
> + dentry, d_count(dentry));
> + mutex_unlock(&eventfs_mutex);
> + dput(dentry);
> + }
> }
>
> /**
> diff --git a/fs/tracefs/internal.h b/fs/tracefs/internal.h
> index c443a0c32a8c..1b880b5cd29d 100644
> --- a/fs/tracefs/internal.h
> +++ b/fs/tracefs/internal.h
> @@ -22,4 +22,6 @@ struct dentry *tracefs_end_creating(struct dentry *dentry);
> struct dentry *tracefs_failed_creating(struct dentry *dentry);
> struct inode *tracefs_get_inode(struct super_block *sb);
>
> +void eventfs_set_ef_status_free(struct dentry *dentry);
> +
> #endif /* _TRACEFS_INTERNAL_H */
> diff --git a/include/linux/tracefs.h b/include/linux/tracefs.h
> index 4d30b0cafc5f..47c1b4d21735 100644
> --- a/include/linux/tracefs.h
> +++ b/include/linux/tracefs.h
> @@ -51,8 +51,6 @@ void eventfs_remove(struct eventfs_file *ef);
>
> void eventfs_remove_events_dir(struct dentry *dentry);
>
> -void eventfs_set_ef_status_free(struct dentry *dentry);
> -
Oh, and eventfs_set_ef_status_free() should not be exported to outside the
tracefs system.
-- Steve
> struct dentry *tracefs_create_file(const char *name, umode_t mode,
> struct dentry *parent, void *data,
> const struct file_operations *fops);
Hi, Willy, Thomas
The suggestions of v1 nolibc powerpc patchset [1] from you have been applied,
here is v2.
Testing results:
- run with tinyconfig
arch/board | result
------------|------------
ppc/g3beige | 165 test(s): 158 passed, 7 skipped, 0 failed => status: warning.
ppc/ppce500 | 165 test(s): 158 passed, 7 skipped, 0 failed => status: warning.
ppc64le/pseries | 165 test(s): 158 passed, 7 skipped, 0 failed => status: warning.
ppc64le/powernv | 165 test(s): 158 passed, 7 skipped, 0 failed => status: warning.
ppc64/pseries | 165 test(s): 158 passed, 7 skipped, 0 failed => status: warning.
ppc64/powernv | 165 test(s): 158 passed, 7 skipped, 0 failed => status: warning.
- run-user
(Tested with -Os, -O0 and -O2)
// for 32-bit PowerPC
$ for arch in powerpc ppc; do make run-user ARCH=$arch CROSS_COMPILE=powerpc-linux-gnu- ; done | grep status
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
// for 64-bit big-endian PowerPC and 64-bit little-endian PowerPC
$ for arch in ppc64 ppc64le; do make run-user ARCH=$arch CROSS_COMPILE=powerpc64le-linux-gnu- ; done | grep status
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
Changes from v1 --> v2:
- tools/nolibc: add support for powerpc
Add missing arch-powerpc.h lines to arch.h
Align with the other arch-<ARCH>.h, naming the variables
with more meaningful words, such as _ret, _num, _arg1 ...
Clean up the syscall instructions
No line from musl now.
Suggestons from Thomas
* tools/nolibc: add support for pppc64
No change
* selftests/nolibc: add extra configs customize support
To reduce complexity, merge the commands from the new extraconfig
target to defconfig target and drop the extconfig target completely.
Derived from Willy's suggestion of the tinyconfig patchset
* selftests/nolibc: add XARCH and ARCH mapping support
To reduce complexity, let's use XARCH internally and only reserve
ARCH as the input variable.
Derived from Willy's suggestion
* selftests/nolibc: add test support for powerpc
Add ppc as the default 32-bit variant for powerpc target, allow pass
ARCH=ppc or ARCH=powerpc to test 32-bit powerpc
Derived from Willy's suggestion
* selftests/nolibc: add test support for pppc64le
Rename powerpc64le to ppc64le
Suggestion from Willy
* selftests/nolibc: add test support for pppc64
Rename powerpc64 to ppc64
Suggestion from Willy
Best regards,
Zhangjin
---
[1]: https://lore.kernel.org/lkml/cover.1689713175.git.falcon@tinylab.org/
Zhangjin Wu (7):
tools/nolibc: add support for powerpc
tools/nolibc: add support for powerpc64
selftests/nolibc: add extra configs customize support
selftests/nolibc: add XARCH and ARCH mapping support
selftests/nolibc: add test support for ppc
selftests/nolibc: add test support for ppc64le
selftests/nolibc: add test support for ppc64
tools/include/nolibc/arch-powerpc.h | 202 ++++++++++++++++++++++++
tools/include/nolibc/arch.h | 2 +
tools/testing/selftests/nolibc/Makefile | 48 +++++-
3 files changed, 244 insertions(+), 8 deletions(-)
create mode 100644 tools/include/nolibc/arch-powerpc.h
--
2.25.1
Make sv48 the default address space for mmap as some applications
currently depend on this assumption. Users can now select a
desired address space using a non-zero hint address to mmap. Previously,
requesting the default address space from mmap by passing zero as the hint
address would result in using the largest address space possible. Some
applications depend on empty bits in the virtual address space, like Go and
Java, so this patch provides more flexibility for application developers.
-Charlie
---
v6:
- Rebase onto the correct base
v5:
- Minor wording change in documentation
- Change some parenthesis in arch_get_mmap_ macros
- Added case for addr==0 in arch_get_mmap_ because without this, programs would
crash if RLIMIT_STACK was modified before executing the program. This was
tested using the libhugetlbfs tests.
v4:
- Split testcases/document patch into test cases, in-code documentation, and
formal documentation patches
- Modified the mmap_base macro to be more legible and better represent memory
layout
- Fixed documentation to better reflect the implmentation
- Renamed DEFAULT_VA_BITS to MMAP_VA_BITS
- Added additional test case for rlimit changes
---
Charlie Jenkins (4):
RISC-V: mm: Restrict address space for sv39,sv48,sv57
RISC-V: mm: Add tests for RISC-V mm
RISC-V: mm: Update pgtable comment documentation
RISC-V: mm: Document mmap changes
Documentation/riscv/vm-layout.rst | 22 +++
arch/riscv/include/asm/elf.h | 2 +-
arch/riscv/include/asm/pgtable.h | 20 ++-
arch/riscv/include/asm/processor.h | 46 +++++-
tools/testing/selftests/riscv/Makefile | 2 +-
tools/testing/selftests/riscv/mm/.gitignore | 1 +
tools/testing/selftests/riscv/mm/Makefile | 21 +++
.../selftests/riscv/mm/testcases/mmap.c | 133 ++++++++++++++++++
8 files changed, 234 insertions(+), 13 deletions(-)
create mode 100644 tools/testing/selftests/riscv/mm/.gitignore
create mode 100644 tools/testing/selftests/riscv/mm/Makefile
create mode 100644 tools/testing/selftests/riscv/mm/testcases/mmap.c
--
2.41.0
This is the basic functionality for iommufd to support
iommufd_device_replace() and IOMMU_HWPT_ALLOC for physical devices.
iommufd_device_replace() allows changing the HWPT associated with the
device to a new IOAS or HWPT. Replace does this in way that failure leaves
things unchanged, and utilizes the iommu iommu_group_replace_domain() API
to allow the iommu driver to perform an optional non-disruptive change.
IOMMU_HWPT_ALLOC allows HWPTs to be explicitly allocated by the user and
used by attach or replace. At this point it isn't very useful since the
HWPT is the same as the automatically managed HWPT from the IOAS. However
a following series will allow userspace to customize the created HWPT.
The implementation is complicated because we have to introduce some
per-iommu_group memory in iommufd and redo how we think about multi-device
groups to be more explicit. This solves all the locking problems in the
prior attempts.
This series is infrastructure work for the following series which:
- Add replace for attach
- Expose replace through VFIO APIs
- Implement driver parameters for HWPT creation (nesting)
Once review of this is complete I will keep it on a side branch and
accumulate the following series when they are ready so we can have a
stable base and make more incremental progress. When we have all the parts
together to get a full implementation it can go to Linus.
This is on github: https://github.com/jgunthorpe/linux/commits/iommufd_hwpt
v8:
- Rebase to v6.5-rc2, update to new behavior of __iommu_group_set_domain()
v7: https://lore.kernel.org/r/0-v7-6c0fd698eda2+5e3-iommufd_alloc_jgg@nvidia.com
- Rebase to v6.4-rc2, update to new signature of iommufd_get_ioas()
v6: https://lore.kernel.org/r/0-v6-fdb604df649a+369-iommufd_alloc_jgg@nvidia.com
- Go back to the v4 locking arragnment with now both the attach/detach
igroup->locks inside the functions, Kevin says he needs this for a
followup series. This still fixes the syzkaller bug
- Fix two more error unwind locking bugs where
iommufd_object_abort_and_destroy(hwpt) would deadlock or be mislocked.
Make sure fail_nth will catch these mistakes
- Add a patch allowing objects to have different abort than destroy
function, it allows hwpt abort to require the caller to continue
to hold the lock and enforces this with lockdep.
v5: https://lore.kernel.org/r/0-v5-6716da355392+c5-iommufd_alloc_jgg@nvidia.com
- Go back to the v3 version of the code, keep the comment changes from
v4. Syzkaller says the group lock change in v4 didn't work.
- Adjust the fail_nth test to cover the path syzkaller found. We need to
have an ioas with a mapped page installed to inject a failure during
domain attachment.
v4: https://lore.kernel.org/r/0-v4-9cd79ad52ee8+13f5-iommufd_alloc_jgg@nvidia.c…
- Refine comments and commit messages
- Move the group lock into iommufd_hw_pagetable_attach()
- Fix error unwind in iommufd_device_do_replace()
v3: https://lore.kernel.org/r/0-v3-61d41fd9e13e+1f5-iommufd_alloc_jgg@nvidia.com
- Refine comments and commit messages
- Adjust the flow in iommufd_device_auto_get_domain() so pt_id is only
set on success
- Reject replace on non-attached devices
- Add missing __reserved check for IOMMU_HWPT_ALLOC
v2: https://lore.kernel.org/r/0-v2-51b9896e7862+8a8c-iommufd_alloc_jgg@nvidia.c…
- Use WARN_ON for the igroup->group test and move that logic to a
function iommufd_group_try_get()
- Change igroup->devices to igroup->device list
Replace will need to iterate over all attached idevs
- Rename to iommufd_group_setup_msi()
- New patch to export iommu_get_resv_regions()
- New patch to use per-device reserved regions instead of per-group
regions
- Split out the reorganizing of iommufd_device_change_pt() from the
replace patch
- Replace uses the per-dev reserved regions
- Use stdev_id in a few more places in the selftest
- Fix error handling in IOMMU_HWPT_ALLOC
- Clarify comments
- Rebase on v6.3-rc1
v1: https://lore.kernel.org/all/0-v1-7612f88c19f5+2f21-iommufd_alloc_jgg@nvidia…
Jason Gunthorpe (17):
iommufd: Move isolated msi enforcement to iommufd_device_bind()
iommufd: Add iommufd_group
iommufd: Replace the hwpt->devices list with iommufd_group
iommu: Export iommu_get_resv_regions()
iommufd: Keep track of each device's reserved regions instead of
groups
iommufd: Use the iommufd_group to avoid duplicate MSI setup
iommufd: Make sw_msi_start a group global
iommufd: Move putting a hwpt to a helper function
iommufd: Add enforced_cache_coherency to iommufd_hw_pagetable_alloc()
iommufd: Allow a hwpt to be aborted after allocation
iommufd: Fix locking around hwpt allocation
iommufd: Reorganize iommufd_device_attach into
iommufd_device_change_pt
iommufd: Add iommufd_device_replace()
iommufd: Make destroy_rwsem use a lock class per object type
iommufd: Add IOMMU_HWPT_ALLOC
iommufd/selftest: Return the real idev id from selftest mock_domain
iommufd/selftest: Add a selftest for IOMMU_HWPT_ALLOC
Nicolin Chen (2):
iommu: Introduce a new iommu_group_replace_domain() API
iommufd/selftest: Test iommufd_device_replace()
drivers/iommu/iommu-priv.h | 10 +
drivers/iommu/iommu.c | 38 +-
drivers/iommu/iommufd/device.c | 555 +++++++++++++-----
drivers/iommu/iommufd/hw_pagetable.c | 112 +++-
drivers/iommu/iommufd/io_pagetable.c | 32 +-
drivers/iommu/iommufd/iommufd_private.h | 52 +-
drivers/iommu/iommufd/iommufd_test.h | 6 +
drivers/iommu/iommufd/main.c | 24 +-
drivers/iommu/iommufd/selftest.c | 40 ++
include/linux/iommufd.h | 1 +
include/uapi/linux/iommufd.h | 26 +
tools/testing/selftests/iommu/iommufd.c | 67 ++-
.../selftests/iommu/iommufd_fail_nth.c | 67 ++-
tools/testing/selftests/iommu/iommufd_utils.h | 63 +-
14 files changed, 867 insertions(+), 226 deletions(-)
create mode 100644 drivers/iommu/iommu-priv.h
base-commit: fdf0eaf11452d72945af31804e2a1048ee1b574c
--
2.41.0
Apologize for sending previous mail from a wrong app (not text mode).
Resending to keep the mailing list thread consistent.
On Wed, Jul 26, 2023 at 3:10 AM Markus Elfring <Markus.Elfring(a)web.de>
wrote:
>
> > Tests BPF redirect at the lwt xmit hook to ensure error handling are
> > safe, i.e. won't panic the kernel.
>
> Are imperative change descriptions still preferred?
Hi Markus,
I think you linked this to me yesterday that it should be described
imperatively:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Doc…
>
> See also:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Doc…
>
I don’t follow the purpose of this reference. This points to user impact
but this is a selftest, so I don’t see any user impact here. Or is there
anything I missed?
>
> Can remaining wording weaknesses be adjusted accordingly?
I am not following this question . Can you be more specific or provide an
example?
Yan
>
> Regards,
> Markus
>
Dzień dobry,
zapoznałem się z Państwa ofertą i z przyjemnością przyznaję, że przyciąga uwagę i zachęca do dalszych rozmów.
Pomyślałem, że może mógłbym mieć swój wkład w Państwa rozwój i pomóc dotrzeć z tą ofertą do większego grona odbiorców. Pozycjonuję strony www, dzięki czemu generują świetny ruch w sieci.
Możemy porozmawiać w najbliższym czasie?
Pozdrawiam
Adam Charachuta
Add specification for declaring test metadata to the KTAP v2 spec.
The purpose of test metadata is to allow for the declaration of essential
testing information in KTAP output. This information includes test
names, test configuration info, test attributes, and test files.
There have been similar ideas around the idea of test metadata such as test
prefixes and test name lines. However, I propose this specification as an
overall fix for these issues.
These test metadata lines are a form of diagnostic lines with the
format: "# <metadata_type>: <data>". As a type of diagnostic line, test
metadata lines are compliant with KTAP v1, which will help to not
interfere too much with current parsers.
Specifically the "# Subtest:" line is derived from the TAP 14 spec:
https://testanything.org/tap-version-14-specification.html.
The proposed location for test metadata is in the test header, between the
version line and the test plan line. Note including diagnostic lines in
the test header is a depature from KTAP v1.
This location provides two main benefits:
First, metadata will be printed prior to when subtests are run. Then if a
test fails, test metadata can help discern which test is causing the issue
and potentially why.
Second, this location ensures that the lines will not be accidentally
parsed as a subtest's diagnostic lines because the lines are bordered by
the version line and plan line.
Here is an example of test metadata:
KTAP version 2
# Config: CONFIG_TEST=y
1..1
KTAP version 2
# Subtest: test_suite
# File: /sys/kernel/...
# Attributes: slow
# Other: example_test
1..2
ok 1 test_1
ok 2 test_2
ok 1 test_suite
Here is a link to a version of the KUnit parser that is able to parse test
metadata lines for KTAP version 2. Note this includes test metadata
lines for the main level of KTAP.
Link: https://kunit-review.googlesource.com/c/linux/+/5809
Signed-off-by: Rae Moar <rmoar(a)google.com>
---
Hi everyone,
I would like to use this proposal similar to an RFC to gather ideas on the
topic of test metadata. Let me know what you think.
I am also interested in brainstorming a list of recognized metadata types.
Providing recognized metadata types would be helpful in parsing and
displaying test metadata in a useful way.
Current ideas:
- "# Subtest: <test_name>" to indicate test name (name must match
corresponding result line)
- "# Attributes: <attributes list>" to indicate test attributes (list
separated by commas)
- "# File: <file_path>" to indicate file used in testing
Any other ideas?
Note this proposal replaces two of my previous proposals: "ktap_v2: add
recognized test name line" and "ktap_v2: allow prefix to KTAP lines."
Thanks!
-Rae
Note: this patch is based on Frank's ktap_spec_version_2 branch.
Documentation/dev-tools/ktap.rst | 51 ++++++++++++++++++++++++++++++--
1 file changed, 48 insertions(+), 3 deletions(-)
diff --git a/Documentation/dev-tools/ktap.rst b/Documentation/dev-tools/ktap.rst
index ff77f4aaa6ef..a2d0a196c115 100644
--- a/Documentation/dev-tools/ktap.rst
+++ b/Documentation/dev-tools/ktap.rst
@@ -17,7 +17,9 @@ KTAP test results describe a series of tests (which may be nested: i.e., test
can have subtests), each of which can contain both diagnostic data -- e.g., log
lines -- and a final result. The test structure and results are
machine-readable, whereas the diagnostic data is unstructured and is there to
-aid human debugging.
+aid human debugging. One exception to this is test metadata lines - a type
+of diagnostic lines. Test metadata is located between the version line and
+plan line of a test and can be machine-readable.
KTAP output is built from four different types of lines:
- Version lines
@@ -28,8 +30,7 @@ KTAP output is built from four different types of lines:
In general, valid KTAP output should also form valid TAP output, but some
information, in particular nested test results, may be lost. Also note that
there is a stagnant draft specification for TAP14, KTAP diverges from this in
-a couple of places (notably the "Subtest" header), which are described where
-relevant later in this document.
+a couple of places, which are described where relevant later in this document.
Version lines
-------------
@@ -166,6 +167,45 @@ even if they do not start with a "#": this is to capture any other useful
kernel output which may help debug the test. It is nevertheless recommended
that tests always prefix any diagnostic output they have with a "#" character.
+Test metadata lines
+-------------------
+
+Test metadata lines are a type of diagnostic lines used to the declare the
+name of a test and other helpful testing information in the test header.
+These lines are often helpful for parsing and for providing context during
+crashes.
+
+Test metadata lines must follow the format: "# <metadata_type>: <data>".
+These lines must be located between the version line and the plan line
+within a test header.
+
+There are a few currently recognized metadata types:
+- "# Subtest: <test_name>" to indicate test name (name must match
+ corresponding result line)
+- "# Attributes: <attributes list>" to indicate test attributes (list
+ separated by commas)
+- "# File: <file_path>" to indicate file used in testing
+
+As a rule, the "# Subtest:" line is generally first to declare the test
+name. Note that metadata lines do not necessarily need to use a
+recognized metadata type.
+
+An example of using metadata lines:
+
+::
+
+ KTAP version 2
+ 1..1
+ # File: /sys/kernel/...
+ KTAP version 2
+ # Subtest: example
+ # Attributes: slow, example_test
+ 1..1
+ ok 1 test_1
+ # example passed
+ ok 1 example
+
+
Unknown lines
-------------
@@ -206,6 +246,7 @@ An example of a test with two nested subtests:
KTAP version 2
1..1
KTAP version 2
+ # Subtest: example
1..2
ok 1 test_1
not ok 2 test_2
@@ -219,6 +260,7 @@ An example format with multiple levels of nested testing:
KTAP version 2
1..2
KTAP version 2
+ # Subtest: example_test_1
1..2
KTAP version 2
1..2
@@ -254,6 +296,7 @@ Example KTAP output
KTAP version 2
1..1
KTAP version 2
+ # Subtest: main_test
1..3
KTAP version 2
1..1
@@ -261,11 +304,13 @@ Example KTAP output
ok 1 test_1
ok 1 example_test_1
KTAP version 2
+ # Attributes: slow
1..2
ok 1 test_1 # SKIP test_1 skipped
ok 2 test_2
ok 2 example_test_2
KTAP version 2
+ # Subtest: example_test_3
1..3
ok 1 test_1
# test_2: FAIL
base-commit: 906f02e42adfbd5ae70d328ee71656ecb602aaf5
--
2.40.0.396.gfff15efe05-goog
lwt xmit hook does not expect positive return values in function
ip_finish_output2 and ip6_finish_output2. However, BPF redirect programs
can return positive values such like NET_XMIT_DROP, NET_RX_DROP, and etc
as errors. Such return values can panic the kernel unexpectedly:
https://gist.github.com/zhaiyan920/8fbac245b261fe316a7ef04c9b1eba48
This patch fixes the return values from BPF redirect, so the error
handling would be consistent at xmit hook. It also adds a few test cases
to prevent future regressions.
v2: https://lore.kernel.org/netdev/ZLdY6JkWRccunvu0@debian.debian/
v1: https://lore.kernel.org/bpf/ZLbYdpWC8zt9EJtq@debian.debian/
changes since v2:
* subject name changed
* also covered redirect to ingress case
* added selftests
changes since v1:
* minor code style changes
Yan Zhai (2):
bpf: fix skb_do_redirect return values
bpf: selftests: add lwt redirect regression test cases
net/core/filter.c | 12 +-
tools/testing/selftests/bpf/Makefile | 1 +
.../selftests/bpf/progs/test_lwt_redirect.c | 67 +++++++
.../selftests/bpf/test_lwt_redirect.sh | 165 ++++++++++++++++++
4 files changed, 244 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/bpf/progs/test_lwt_redirect.c
create mode 100755 tools/testing/selftests/bpf/test_lwt_redirect.sh
--
2.30.2
Remove clean target in Makefile to fix the following warning
and use the one in common lib.mk
Makefile:14: warning: overriding recipe for target 'clean'
../lib.mk:160: warning: ignoring old recipe for target 'clean'
Signed-off-by: Shuah Khan <skhan(a)linuxfoundation.org>
---
tools/testing/selftests/prctl/Makefile | 2 --
1 file changed, 2 deletions(-)
diff --git a/tools/testing/selftests/prctl/Makefile b/tools/testing/selftests/prctl/Makefile
index cfc35d29fc2e..01dc90fbb509 100644
--- a/tools/testing/selftests/prctl/Makefile
+++ b/tools/testing/selftests/prctl/Makefile
@@ -10,7 +10,5 @@ all: $(TEST_PROGS)
include ../lib.mk
-clean:
- rm -fr $(TEST_PROGS)
endif
endif
--
2.39.2
On Tue, Jul 25, 2023 at 12:11 AM Markus Elfring <Markus.Elfring(a)web.de> wrote:
>
> > … unexpected problems. This change
> > converts the positive status code to proper error code.
>
> Please choose a corresponding imperative change suggestion.
>
> See also:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Doc…
>
>
> Did you provide sufficient justification for a possible addition of the tag “Fixes”?
>
>
> …
> > v2: code style change suggested by Stanislav Fomichev
> > ---
> > net/core/filter.c | 12 +++++++++++-
> …
>
> How do you think about to replace this marker by a line break?
>
> See also:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Doc…
>
> Regards,
> Markus
Hi Markus,
Thanks for the suggestions, those are what I could use more help with.
Will address these in the next version.
Yan
Hello everyone,
This patch series adds a test attributes framework to KUnit.
There has been interest in filtering out "slow" KUnit tests. Most notably,
a new config, CONFIG_MEMCPY_SLOW_KUNIT_TEST, has been added to exclude a
particularly slow memcpy test
(https://lore.kernel.org/all/20230118200653.give.574-kees@kernel.org/).
This attributes framework can be used to save and access test associated
data, including whether a test is slow. These attributes are reportable
(via KTAP and command line output) and are also filterable.
This framework is designed to allow for the addition of other attributes in
the future. These attributes could include whether the test can be run
concurrently, test file path, etc.
To try out the framework I suggest running:
"./tools/testing/kunit/kunit.py run --filter speed!=slow"
This patch series was originally sent out as an RFC. Here is a link to the
RFC v2:
https://lore.kernel.org/all/20230707210947.1208717-1-rmoar@google.com/
Thanks!
Rae
Rae Moar (9):
kunit: Add test attributes API structure
kunit: Add speed attribute
kunit: Add module attribute
kunit: Add ability to filter attributes
kunit: tool: Add command line interface to filter and report
attributes
kunit: memcpy: Mark tests as slow using test attributes
kunit: time: Mark test as slow using test attributes
kunit: add tests for filtering attributes
kunit: Add documentation of KUnit test attributes
.../dev-tools/kunit/running_tips.rst | 166 +++++++
include/kunit/attributes.h | 50 +++
include/kunit/test.h | 70 ++-
kernel/time/time_test.c | 2 +-
lib/Kconfig.debug | 3 +
lib/kunit/Makefile | 3 +-
lib/kunit/attributes.c | 418 ++++++++++++++++++
lib/kunit/executor.c | 115 ++++-
lib/kunit/executor_test.c | 128 +++++-
lib/kunit/kunit-example-test.c | 9 +
lib/kunit/test.c | 27 +-
lib/memcpy_kunit.c | 8 +-
tools/testing/kunit/kunit.py | 70 ++-
tools/testing/kunit/kunit_kernel.py | 8 +-
tools/testing/kunit/kunit_parser.py | 11 +-
tools/testing/kunit/kunit_tool_test.py | 39 +-
16 files changed, 1051 insertions(+), 76 deletions(-)
create mode 100644 include/kunit/attributes.h
create mode 100644 lib/kunit/attributes.c
base-commit: 64bd4641310c41a1ecf07c13c67bc0ed61045dfd
--
2.41.0.487.g6d72f3e995-goog
Hi All,
This is v3 of my series to clean up mm selftests so that they run correctly on
arm64. See [1] for full explanation.
Only patch 6 has changed vs v2. The rest are the same and already carry
reviewed/acked-bys. So I'm hoping I can get the final patch reviewed and this
series is hopefully then good enough to merge?
Changes Since v2 [2]
--------------------
- Patch 6: Change approach to cleaning up child processes; Use "parent death
signal", as suggested by David.
- Added Reviewed-by/Acked-by tags: thanks to David, Mark and Peter!
Changes Since v1 [1]
--------------------
- Patch 1: Explicitly set line buffer mode in ksft_print_header()
- Dropped v1 patch 2 (set execute permissions): Andrew has taken this into his
branch separately.
- Patch 2: Don't compile `soft-dirty` suite for arm64 instead of skipping it
at runtime.
- Patch 2: Declare fewer tests and skip all of test_softdirty() if soft-dirty
is not supported, rather than conditionally marking each check as skipped.
- Added Reviewed-by tags: thanks DavidH!
- Patch 8: Clarified commit message.
[1] https://lore.kernel.org/linux-mm/20230713135440.3651409-1-ryan.roberts@arm.…
[2] https://lore.kernel.org/linux-mm/20230717103152.202078-1-ryan.roberts@arm.c…
Thanks,
Ryan
Ryan Roberts (8):
selftests: Line buffer test program's stdout
selftests/mm: Skip soft-dirty tests on arm64
selftests/mm: Enable mrelease_test for arm64
selftests/mm: Fix thuge-gen test bugs
selftests/mm: va_high_addr_switch should skip unsupported arm64
configs
selftests/mm: Make migration test robust to failure
selftests/mm: Optionally pass duration to transhuge-stress
selftests/mm: Run all tests from run_vmtests.sh
tools/testing/selftests/kselftest.h | 9 ++
tools/testing/selftests/kselftest/runner.sh | 7 +-
tools/testing/selftests/mm/Makefile | 82 ++++++++++---------
tools/testing/selftests/mm/madv_populate.c | 26 +++++-
tools/testing/selftests/mm/migration.c | 12 ++-
tools/testing/selftests/mm/mrelease_test.c | 1 +
tools/testing/selftests/mm/run_vmtests.sh | 28 ++++++-
tools/testing/selftests/mm/settings | 2 +-
tools/testing/selftests/mm/thuge-gen.c | 4 +-
tools/testing/selftests/mm/transhuge-stress.c | 12 ++-
.../selftests/mm/va_high_addr_switch.c | 2 +-
11 files changed, 132 insertions(+), 53 deletions(-)
--
2.25.1
The arm64 Guarded Control Stack (GCS) feature provides support for
hardware protected stacks of return addresses, intended to provide
hardening against return oriented programming (ROP) attacks and to make
it easier to gather call stacks for applications such as profiling.
When GCS is active a secondary stack called the Guarded Control Stack is
maintained, protected with a memory attribute which means that it can
only be written with specific GCS operations. When a BL is executed the
value stored in LR is also pushed onto the GCS, and when a RET is
executed the top of the GCS is popped and compared to LR with a fault
being raised if the values do not match. GCS operations may only be
performed on GCS pages, a data abort is generated if they are not.
This series implements support for use of GCS by userspace, along with
support for use of GCS within KVM guests. It does not enable use of GCS
by either EL1 or EL2. Executables are started without GCS and must use
a prctl() to enable it, it is expected that this will be done very early
in application execution by the dynamic linker or other startup code.
x86 has an equivalent feature called shadow stacks, this series depends
on the x86 patches for generic memory management support for the new
guarded/shadow stack page type and shares APIs as much as possible. As
there has been extensive discussion with the wider community around the
ABI for shadow stacks I have as far as practical kept implementation
decisions close to those for x86, anticipating that review would lead to
similar conclusions in the absence of strong reasoning for divergence.
The main divergence I am concious of is that x86 allows shadow stack to
be enabled and disabled repeatedly, freeing the shadow stack for the
thread whenever disabled, while this implementation keeps the GCS
allocated after disable but refuses to reenable it. This is to avoid
races with things actively walking the GCS during a disable, we do
anticipate that some systems will wish to disable GCS at runtime but are
not aware of any demand for subsequently reenabling it.
x86 uses an arch_prctl() to manage enable and disable, since only x86
and S/390 use arch_prctl() a generic prctl() was proposed[1] as part of a
patch set for the equivalent RISC-V zisslpcfi feature which I initially
adopted fairly directly but following review feedback has been reviewed
quite a bit.
There is an open issue with support for CRIU, on x86 this required the
ability to set the GCS mode via ptrace. This series supports
configuring mode bits other than enable/disable via ptrace but it needs
to be confirmed if this is sufficient.
There's a few bits where I'm not convinced with where I've placed
things, in particular the GCS write operation is in the GCS header not
in uaccess.h, I wasn't sure what was clearest there and am probably too
close to the code to have a clear opinion. The reporting of GCS in
/proc/PID/smaps is also a bit awkward.
The series depends on the x86 shadow stack support:
https://lore.kernel.org/lkml/20230227222957.24501-1-rick.p.edgecombe@intel.…
I've rebased this onto v6.5-rc3 but not included it in the series in
order to avoid confusion with Rick's work and cut down the size of the
series, you can see the branch at:
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/misc.git arm64-gcs
[1] https://lore.kernel.org/lkml/20230213045351.3945824-1-debug@rivosinc.com/
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Changes in v2:
- Rebase onto v6.5-rc3.
- Rework prctl() interface to allow each bit to be locked independently.
- map_shadow_stack() now places the cap token based on the size
requested by the caller not the actual space allocated.
- Mode changes other than enable via ptrace are now supported.
- Expand test coverage.
- Various smaller fixes and adjustments.
- Link to v1: https://lore.kernel.org/r/20230716-arm64-gcs-v1-0-bf567f93bba6@kernel.org
---
Mark Brown (35):
prctl: arch-agnostic prctl for shadow stack
arm64: Document boot requirements for Guarded Control Stacks
arm64/gcs: Document the ABI for Guarded Control Stacks
arm64/sysreg: Add new system registers for GCS
arm64/sysreg: Add definitions for architected GCS caps
arm64/gcs: Add manual encodings of GCS instructions
arm64/gcs: Provide copy_to_user_gcs()
arm64/cpufeature: Runtime detection of Guarded Control Stack (GCS)
arm64/mm: Allocate PIE slots for EL0 guarded control stack
mm: Define VM_SHADOW_STACK for arm64 when we support GCS
arm64/mm: Map pages for guarded control stack
KVM: arm64: Manage GCS registers for guests
arm64/el2_setup: Allow GCS usage at EL0 and EL1
arm64/idreg: Add overrride for GCS
arm64/hwcap: Add hwcap for GCS
arm64/traps: Handle GCS exceptions
arm64/mm: Handle GCS data aborts
arm64/gcs: Context switch GCS registers for EL0
arm64/gcs: Allocate a new GCS for threads with GCS enabled
arm64/gcs: Implement shadow stack prctl() interface
arm64/mm: Implement map_shadow_stack()
arm64/signal: Set up and restore the GCS context for signal handlers
arm64/signal: Expose GCS state in signal frames
arm64/ptrace: Expose GCS via ptrace and core files
arm64: Add Kconfig for Guarded Control Stack (GCS)
kselftest/arm64: Verify the GCS hwcap
kselftest/arm64: Add GCS as a detected feature in the signal tests
kselftest/arm64: Add framework support for GCS to signal handling tests
kselftest/arm64: Allow signals tests to specify an expected si_code
kselftest/arm64: Always run signals tests with GCS enabled
kselftest/arm64: Add very basic GCS test program
kselftest/arm64: Add a GCS test program built with the system libc
kselftest/arm64: Add test coverage for GCS mode locking
selftests/arm64: Add GCS signal tests
kselftest/arm64: Enable GCS for the FP stress tests
Documentation/admin-guide/kernel-parameters.txt | 3 +
Documentation/arch/arm64/booting.rst | 22 ++
Documentation/arch/arm64/elf_hwcaps.rst | 3 +
Documentation/arch/arm64/gcs.rst | 225 +++++++++++++
Documentation/arch/arm64/index.rst | 1 +
Documentation/filesystems/proc.rst | 2 +-
arch/arm64/Kconfig | 19 ++
arch/arm64/include/asm/cpufeature.h | 6 +
arch/arm64/include/asm/el2_setup.h | 17 +
arch/arm64/include/asm/esr.h | 28 +-
arch/arm64/include/asm/exception.h | 2 +
arch/arm64/include/asm/gcs.h | 106 ++++++
arch/arm64/include/asm/hwcap.h | 1 +
arch/arm64/include/asm/kvm_host.h | 12 +
arch/arm64/include/asm/pgtable-prot.h | 14 +-
arch/arm64/include/asm/processor.h | 7 +
arch/arm64/include/asm/sysreg.h | 20 ++
arch/arm64/include/asm/uaccess.h | 42 +++
arch/arm64/include/uapi/asm/hwcap.h | 1 +
arch/arm64/include/uapi/asm/ptrace.h | 8 +
arch/arm64/include/uapi/asm/sigcontext.h | 9 +
arch/arm64/kernel/cpufeature.c | 19 ++
arch/arm64/kernel/cpuinfo.c | 1 +
arch/arm64/kernel/entry-common.c | 23 ++
arch/arm64/kernel/idreg-override.c | 2 +
arch/arm64/kernel/process.c | 78 +++++
arch/arm64/kernel/ptrace.c | 59 ++++
arch/arm64/kernel/signal.c | 237 ++++++++++++-
arch/arm64/kernel/traps.c | 11 +
arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h | 17 +
arch/arm64/kvm/sys_regs.c | 22 ++
arch/arm64/mm/Makefile | 1 +
arch/arm64/mm/fault.c | 78 ++++-
arch/arm64/mm/gcs.c | 226 +++++++++++++
arch/arm64/mm/mmap.c | 17 +-
arch/arm64/tools/cpucaps | 1 +
arch/arm64/tools/sysreg | 55 +++
fs/proc/task_mmu.c | 3 +
include/linux/mm.h | 16 +-
include/linux/syscalls.h | 1 +
include/uapi/asm-generic/unistd.h | 5 +-
include/uapi/linux/elf.h | 1 +
include/uapi/linux/prctl.h | 22 ++
kernel/sys.c | 30 ++
kernel/sys_ni.c | 1 +
tools/testing/selftests/arm64/Makefile | 2 +-
tools/testing/selftests/arm64/abi/hwcap.c | 19 ++
tools/testing/selftests/arm64/fp/assembler.h | 15 +
tools/testing/selftests/arm64/fp/fpsimd-test.S | 2 +
tools/testing/selftests/arm64/fp/sve-test.S | 2 +
tools/testing/selftests/arm64/fp/za-test.S | 2 +
tools/testing/selftests/arm64/fp/zt-test.S | 2 +
tools/testing/selftests/arm64/gcs/.gitignore | 3 +
tools/testing/selftests/arm64/gcs/Makefile | 19 ++
tools/testing/selftests/arm64/gcs/basic-gcs.c | 351 +++++++++++++++++++
tools/testing/selftests/arm64/gcs/gcs-locking.c | 200 +++++++++++
tools/testing/selftests/arm64/gcs/gcs-util.h | 87 +++++
tools/testing/selftests/arm64/gcs/libc-gcs.c | 372 +++++++++++++++++++++
tools/testing/selftests/arm64/signal/.gitignore | 1 +
.../testing/selftests/arm64/signal/test_signals.c | 17 +-
.../testing/selftests/arm64/signal/test_signals.h | 6 +
.../selftests/arm64/signal/test_signals_utils.c | 32 +-
.../selftests/arm64/signal/test_signals_utils.h | 39 +++
.../arm64/signal/testcases/gcs_exception_fault.c | 59 ++++
.../selftests/arm64/signal/testcases/gcs_frame.c | 78 +++++
.../arm64/signal/testcases/gcs_write_fault.c | 67 ++++
.../selftests/arm64/signal/testcases/testcases.c | 7 +
.../selftests/arm64/signal/testcases/testcases.h | 1 +
68 files changed, 2825 insertions(+), 32 deletions(-)
---
base-commit: b8f2cc1100d85456f9a48243328b33ab0ce5caff
change-id: 20230303-arm64-gcs-e311ab0d8729
Best regards,
--
Mark Brown <broonie(a)kernel.org>
Hi,
Using the same config for 6.5-rc2 on Ubuntu 22.04 LTS and 22.10, the execution
stop at the exact same line on both boxes (os I reckon it is more than an
accident):
# selftests: net/forwarding: sch_ets.sh
# TEST: ping vlan 10 [ OK ]
# TEST: ping vlan 11 [ OK ]
# TEST: ping vlan 12 [ OK ]
# Running in priomap mode
# Testing ets bands 3 strict 3, streams 0 1
# TEST: band 0 [ OK ]
# INFO: Expected ratio >95% Measured ratio 100.00
# TEST: band 1 [ OK ]
# INFO: Expected ratio <5% Measured ratio 0
# Testing ets bands 3 strict 3, streams 1 2
# TEST: band 1 [ OK ]
# INFO: Expected ratio >95% Measured ratio 100.00
# TEST: band 2 [ OK ]
# INFO: Expected ratio <5% Measured ratio 0
# Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 0 1
# TEST: band 0 [ OK ]
# INFO: Expected ratio >95% Measured ratio 100.00
# TEST: band 1 [ OK ]
# INFO: Expected ratio <5% Measured ratio 0
# Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 1 2
# TEST: bands 1:2 [ OK ]
# INFO: Expected ratio 2.00 Measured ratio 1.99
# Testing ets bands 3 quanta 3300 3300 3300, streams 0 1 2
# TEST: bands 0:1 [ OK ]
# INFO: Expected ratio 1.00 Measured ratio .99
# TEST: bands 0:2 [ OK ]
# INFO: Expected ratio 1.00 Measured ratio 1.00
# Testing ets bands 3 quanta 5000 3500 1500, streams 0 1 2
# TEST: bands 0:1 [ OK ]
# INFO: Expected ratio 1.42 Measured ratio 1.42
# TEST: bands 0:2 [ OK ]
# INFO: Expected ratio 3.33 Measured ratio 3.33
# Testing ets bands 3 quanta 5000 8000 1500, streams 0 1 2
# TEST: bands 0:1 [ OK ]
# INFO: Expected ratio 1.60 Measured ratio 1.59
# TEST: bands 0:2 [ OK ]
# INFO: Expected ratio 3.33 Measured ratio 3.33
# Testing ets bands 2 quanta 5000 2500, streams 0 1
# TEST: bands 0:1 [ OK ]
# INFO: Expected ratio 2.00 Measured ratio 1.99
# Running in classifier mode
# Testing ets bands 3 strict 3, streams 0 1
# TEST: band 0 [ OK ]
# INFO: Expected ratio >95% Measured ratio 100.00
# TEST: band 1 [ OK ]
# INFO: Expected ratio <5% Measured ratio 0
# Testing ets bands 3 strict 3, streams 1 2
# TEST: band 1 [ OK ]
# INFO: Expected ratio >95% Measured ratio 100.00
# TEST: band 2 [ OK ]
# INFO: Expected ratio <5% Measured ratio 0
# Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 0 1
I tried to run 'set -x' enabled version standalone, but that one finished
correctly (?).
It could be something previous scripts left, but right now I don't have a clue.
I can attempt to rerun all tests with sch_ets.sh bash 'set -x' enabled later today.
Best regards,
Mirsad Todorovac
Adds a check to verify if the rtc device file is valid or not
and prints a useful error message if the file is not accessible.
Signed-off-by: Atul Kumar Pant <atulpant.linux(a)gmail.com>
Acked-by: Alexandre Belloni <alexandre.belloni(a)bootlin.com>
---
changes since v4:
Updated the commit message.
changes since v3:
Added Linux-kselftest and Linux-kernel mailing lists.
changes since v2:
Changed error message when rtc file does not exist.
changes since v1:
Removed check for uid=0
If rtc file is invalid, then exit the test.
tools/testing/selftests/rtc/rtctest.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/rtc/rtctest.c b/tools/testing/selftests/rtc/rtctest.c
index 63ce02d1d5cc..630fef735c7e 100644
--- a/tools/testing/selftests/rtc/rtctest.c
+++ b/tools/testing/selftests/rtc/rtctest.c
@@ -17,6 +17,7 @@
#include <unistd.h>
#include "../kselftest_harness.h"
+#include "../kselftest.h"
#define NUM_UIE 3
#define ALARM_DELTA 3
@@ -419,6 +420,8 @@ __constructor_order_last(void)
int main(int argc, char **argv)
{
+ int ret = -1;
+
switch (argc) {
case 2:
rtc_file = argv[1];
@@ -430,5 +433,11 @@ int main(int argc, char **argv)
return 1;
}
- return test_harness_run(argc, argv);
+ // Run the test if rtc_file is valid
+ if (access(rtc_file, F_OK) == 0)
+ ret = test_harness_run(argc, argv);
+ else
+ ksft_exit_fail_msg("[ERROR]: Cannot access rtc file %s - Exiting\n", rtc_file);
+
+ return ret;
}
--
2.25.1
Hello everyone,
This patch series adds a test attributes framework to KUnit.
There has been interest in filtering out "slow" KUnit tests. Most notably,
a new config, CONFIG_MEMCPY_SLOW_KUNIT_TEST, has been added to exclude a
particularly slow memcpy test
(https://lore.kernel.org/all/20230118200653.give.574-kees@kernel.org/).
This attributes framework can be used to save and access test associated
data, including whether a test is slow. These attributes are reportable
(via KTAP and command line output) and are also filterable.
This framework is designed to allow for the addition of other attributes in
the future. These attributes could include whether the test can be run
concurrently, test file path, etc.
To try out the framework I suggest running:
"./tools/testing/kunit/kunit.py run --filter speed!=slow"
This patch series was originally sent out as an RFC. Here is a link to the
RFC v2:
https://lore.kernel.org/all/20230707210947.1208717-1-rmoar@google.com/
Thanks!
Rae
Rae Moar (9):
kunit: Add test attributes API structure
kunit: Add speed attribute
kunit: Add module attribute
kunit: Add ability to filter attributes
kunit: tool: Add command line interface to filter and report
attributes
kunit: memcpy: Mark tests as slow using test attributes
kunit: time: Mark test as slow using test attributes
kunit: add tests for filtering attributes
kunit: Add documentation of KUnit test attributes
.../dev-tools/kunit/running_tips.rst | 166 +++++++
include/kunit/attributes.h | 50 +++
include/kunit/test.h | 70 ++-
kernel/time/time_test.c | 2 +-
lib/Kconfig.debug | 3 +
lib/kunit/Makefile | 3 +-
lib/kunit/attributes.c | 421 ++++++++++++++++++
lib/kunit/executor.c | 115 ++++-
lib/kunit/executor_test.c | 128 +++++-
lib/kunit/kunit-example-test.c | 9 +
lib/kunit/test.c | 27 +-
lib/memcpy_kunit.c | 8 +-
tools/testing/kunit/kunit.py | 70 ++-
tools/testing/kunit/kunit_kernel.py | 8 +-
tools/testing/kunit/kunit_parser.py | 11 +-
tools/testing/kunit/kunit_tool_test.py | 39 +-
16 files changed, 1054 insertions(+), 76 deletions(-)
create mode 100644 include/kunit/attributes.h
create mode 100644 lib/kunit/attributes.c
base-commit: 64bd4641310c41a1ecf07c13c67bc0ed61045dfd
--
2.41.0.255.g8b1d071c50-goog
Hi,
There seems to be a problem with net/forwarding line of 6.5-rc2 kselftests,
vanilla Torvalds tree, commit fdf0eaf11452, on Ubuntu 22.04 LTS Jammy Jellyfish.
(Confirmed on Ubuntu 22.10 Kinetic Kudu.)
Tests fail with error message:
Command line is not complete. Try option "help"
Failed to create netif
The script
# tools/testing/seltests/net/forwarding/bridge_igmp.sh
bash `set -x` ends with an error:
++ create_netif_veth
++ local i
++ (( i = 1 ))
++ (( i <= NUM_NETIFS ))
++ local j=2
++ ip link show dev
++ [[ 255 -ne 0 ]]
++ ip link add type veth peer name
Command line is not complete. Try option "help"
++ [[ 255 -ne 0 ]]
++ echo 'Failed to create netif'
Failed to create netif
++ exit 1
The problem seems to be linked with this piece of code of "lib.sh":
create_netif_veth()
{
local i
for ((i = 1; i <= NUM_NETIFS; ++i)); do
local j=$((i+1))
ip link show dev ${NETIFS[p$i]} &> /dev/null
if [[ $? -ne 0 ]]; then
ip link add ${NETIFS[p$i]} type veth \
peer name ${NETIFS[p$j]}
if [[ $? -ne 0 ]]; then
echo "Failed to create netif"
exit 1
fi
fi
i=$j
done
}
Somehow, ${NETIFS[p$i]} is evaluated to an empty string?
However, I can't seem to see what is the expected result.
The problem was confirmed in the backlogs of 6.5-rc1 and 6.4 kselftests.
It is possible that I'm doing something terribly wrong, but it is basically
the default kselftest suite on a rather minimal Ubuntu.
Please find attached the bash output from `set -x`.
Version of iproute2 is:
ii iproute2 5.15.0-1ubuntu2 amd64 networking and traffic control tools
Hope this helps.
Best regards,
Mirsad Todorovac
Thank you Michał.
On 7/21/23 12:28 AM, Michał Mirosław wrote:
> b. rename match "flags" to 'page categories' everywhere - this makes
> it easier to differentiate the ioctl()s categorisation of pages
> from struct page flags;
> c. change {required + excluded} to {inverted + required}. This was
> rejected before, but I'd like to illustrate the difference.
> Old interface can be translated to the new by:
> categories_inverted = excluded_mask
> categories_mask = required_mask | excluded_mask
> categories_anyof_mask = anyof_mask
> The new way allows filtering by: A & (B | !C)
> categories_inverted = C
> categories_mask = A
> categories_anyof_mask = B | C
Andrei and Danylo,
Are you okay with these masks? It were you two who had proposed these.
--
BR,
Muhammad Usama Anjum
This series demonstrates how KTAP output can be used by nolibc-test to
make the test results better to read for people and machines.
Especially when running multiple invocations for different architectors
or build configurations we can make use of the kernels TAP parser to
automatically provide aggregated test reports.
The code is very hacky and incomplete and mostly meant to validate if
the output format is useful.
Start with the last patch of the series to actually see the generated
format, or run it for yourself.
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
---
Thomas Weißschuh (7):
selftests/nolibc: statically calculate number of testsuites
selftests/nolibc: use unsigned indices for testcases
selftests/nolibc: replace repetitive test structure with macro
selftests/nolibc: count subtests
kselftest: support KTAP format
kselftest: support skipping tests with testname
selftests/nolibc: proof of concept for TAP output
tools/testing/selftests/kselftest.h | 20 +++
tools/testing/selftests/nolibc/nolibc-test.c | 197 ++++++++++--------------
tools/testing/selftests/nolibc/run-all-tests.sh | 22 +++
3 files changed, 127 insertions(+), 112 deletions(-)
---
base-commit: dfef4fc45d5713eb23d87f0863aff9c33bd4bfaf
change-id: 20230718-nolibc-ktap-tmp-4408f505408d
Best regards,
--
Thomas Weißschuh <linux(a)weissschuh.net>
Hi Michał,
kernel test robot noticed the following build errors:
[auto build test ERROR on akpm-mm/mm-everything]
[also build test ERROR on linus/master v6.5-rc2 next-20230721]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Micha-Miros-aw/Re-fs-proc-ta…
base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/r/a0b5c6776b2ed91f78a7575649f8b100e58bd3a9.16898810…
patch subject: Re: fs/proc/task_mmu: Implement IOCTL for efficient page table scanning
config: powerpc-randconfig-r015-20230720 (https://download.01.org/0day-ci/archive/20230721/202307211507.xOl45LiR-lkp@…)
compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project.git 4a5ac14ee968ff0ad5d2cc1ffa0299048db4c88a)
reproduce: (https://download.01.org/0day-ci/archive/20230721/202307211507.xOl45LiR-lkp@…)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp(a)intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202307211507.xOl45LiR-lkp@intel.com/
All errors (new ones prefixed by >>):
>> fs/proc/task_mmu.c:1921:6: error: call to undeclared function 'userfaultfd_wp_async'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
1921 | if (userfaultfd_wp_async(vma) && userfaultfd_wp_use_markers(vma))
| ^
>> fs/proc/task_mmu.c:2200:12: error: call to undeclared function 'uffd_wp_range'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
2200 | int err = uffd_wp_range(vma, addr, end - addr, true);
| ^
2 errors generated.
vim +/userfaultfd_wp_async +1921 fs/proc/task_mmu.c
1913
1914 static int pagemap_scan_test_walk(unsigned long start, unsigned long end,
1915 struct mm_walk *walk)
1916 {
1917 struct pagemap_scan_private *p = walk->private;
1918 struct vm_area_struct *vma = walk->vma;
1919 unsigned long vma_category = 0;
1920
> 1921 if (userfaultfd_wp_async(vma) && userfaultfd_wp_use_markers(vma))
1922 vma_category |= PAGE_IS_WPASYNC;
1923 else if (p->arg.flags & PM_SCAN_CHECK_WPASYNC)
1924 return -EPERM;
1925
1926 if (vma->vm_flags & VM_PFNMAP)
1927 return 1;
1928
1929 if (!pagemap_scan_is_interesting_vma(vma_category, p))
1930 return 1;
1931
1932 p->cur_vma_category = vma_category;
1933 return 0;
1934 }
1935
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
Hi Michał,
kernel test robot noticed the following build errors:
[auto build test ERROR on akpm-mm/mm-everything]
[also build test ERROR on linus/master v6.5-rc2 next-20230720]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Micha-Miros-aw/Re-fs-proc-ta…
base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/r/a0b5c6776b2ed91f78a7575649f8b100e58bd3a9.16898810…
patch subject: Re: fs/proc/task_mmu: Implement IOCTL for efficient page table scanning
config: i386-randconfig-r022-20230720 (https://download.01.org/0day-ci/archive/20230721/202307211337.5dwCMeHb-lkp@…)
compiler: clang version 15.0.7 (https://github.com/llvm/llvm-project.git 8dfdcc7b7bf66834a761bd8de445840ef68e4d1a)
reproduce: (https://download.01.org/0day-ci/archive/20230721/202307211337.5dwCMeHb-lkp@…)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp(a)intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202307211337.5dwCMeHb-lkp@intel.com/
All errors (new ones prefixed by >>):
>> fs/proc/task_mmu.c:1921:6: error: call to undeclared function 'userfaultfd_wp_async'; ISO C99 and later do not support implicit function declarations [-Werror,-Wimplicit-function-declaration]
if (userfaultfd_wp_async(vma) && userfaultfd_wp_use_markers(vma))
^
>> fs/proc/task_mmu.c:2200:12: error: call to undeclared function 'uffd_wp_range'; ISO C99 and later do not support implicit function declarations [-Werror,-Wimplicit-function-declaration]
int err = uffd_wp_range(vma, addr, end - addr, true);
^
2 errors generated.
vim +/userfaultfd_wp_async +1921 fs/proc/task_mmu.c
1913
1914 static int pagemap_scan_test_walk(unsigned long start, unsigned long end,
1915 struct mm_walk *walk)
1916 {
1917 struct pagemap_scan_private *p = walk->private;
1918 struct vm_area_struct *vma = walk->vma;
1919 unsigned long vma_category = 0;
1920
> 1921 if (userfaultfd_wp_async(vma) && userfaultfd_wp_use_markers(vma))
1922 vma_category |= PAGE_IS_WPASYNC;
1923 else if (p->arg.flags & PM_SCAN_CHECK_WPASYNC)
1924 return -EPERM;
1925
1926 if (vma->vm_flags & VM_PFNMAP)
1927 return 1;
1928
1929 if (!pagemap_scan_is_interesting_vma(vma_category, p))
1930 return 1;
1931
1932 p->cur_vma_category = vma_category;
1933 return 0;
1934 }
1935
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
In order to select a nexthop for multipath routes, fib_select_multipath()
is used with legacy nexthops and nexthop_select_path_hthr() is used with
nexthop objects. Those two functions perform a validity test on the
neighbor related to each nexthop but their logic is structured differently.
This causes a divergence in behavior and nexthop_select_path_hthr() may
return a nexthop that failed the neighbor validity test even if there was
one that passed.
Refactor nexthop_select_path_hthr() to make it more similar to
fib_select_multipath() and fix the problem mentioned above.
v2:
Removed unnecessary "first" variable in "nexthop: Do not return invalid
nexthop object during multipath selection".
v1:
https://lore.kernel.org/netdev/20230529201914.69828-1-bpoirier@nvidia.com/
---
Benjamin Poirier (4):
nexthop: Factor out hash threshold fdb nexthop selection
nexthop: Factor out neighbor validity check
nexthop: Do not return invalid nexthop object during multipath selection
selftests: net: Add test cases for nexthop groups with invalid neighbors
net/ipv4/nexthop.c | 61 +++++++++----
tools/testing/selftests/net/fib_nexthops.sh | 129 ++++++++++++++++++++++++++++
2 files changed, 171 insertions(+), 19 deletions(-)
---
base-commit: 36395b2efe905650cd179d67411ffee3b770268b
change-id: 20230719-nh_select-0303d55a1fb0
Best regards,
--
Benjamin Poirier <bpoirier(a)nvidia.com>
Hi Michał,
kernel test robot noticed the following build errors:
[auto build test ERROR on akpm-mm/mm-everything]
[also build test ERROR on linus/master v6.5-rc2 next-20230720]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Micha-Miros-aw/Re-fs-proc-ta…
base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/r/a0b5c6776b2ed91f78a7575649f8b100e58bd3a9.16898810…
patch subject: Re: fs/proc/task_mmu: Implement IOCTL for efficient page table scanning
config: arc-randconfig-r035-20230720 (https://download.01.org/0day-ci/archive/20230721/202307211030.2CJH6TkM-lkp@…)
compiler: arceb-elf-gcc (GCC) 12.3.0
reproduce: (https://download.01.org/0day-ci/archive/20230721/202307211030.2CJH6TkM-lkp@…)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp(a)intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202307211030.2CJH6TkM-lkp@intel.com/
All errors (new ones prefixed by >>):
fs/proc/task_mmu.c: In function 'pagemap_scan_test_walk':
fs/proc/task_mmu.c:1921:13: error: implicit declaration of function 'userfaultfd_wp_async'; did you mean 'userfaultfd_wp'? [-Werror=implicit-function-declaration]
1921 | if (userfaultfd_wp_async(vma) && userfaultfd_wp_use_markers(vma))
| ^~~~~~~~~~~~~~~~~~~~
| userfaultfd_wp
fs/proc/task_mmu.c: In function 'pagemap_scan_pte_hole':
>> fs/proc/task_mmu.c:2200:19: error: implicit declaration of function 'uffd_wp_range' [-Werror=implicit-function-declaration]
2200 | int err = uffd_wp_range(vma, addr, end - addr, true);
| ^~~~~~~~~~~~~
fs/proc/task_mmu.c: In function 'pagemap_scan_init_bounce_buffer':
fs/proc/task_mmu.c:2290:22: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
2290 | p->vec_out = (void __user *)p->arg.vec;
| ^
fs/proc/task_mmu.c: At top level:
fs/proc/task_mmu.c:1967:13: warning: 'pagemap_scan_backout_range' defined but not used [-Wunused-function]
1967 | static void pagemap_scan_backout_range(struct pagemap_scan_private *p,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: some warnings being treated as errors
vim +/uffd_wp_range +2200 fs/proc/task_mmu.c
2182
2183 static int pagemap_scan_pte_hole(unsigned long addr, unsigned long end,
2184 int depth, struct mm_walk *walk)
2185 {
2186 struct pagemap_scan_private *p = walk->private;
2187 struct vm_area_struct *vma = walk->vma;
2188 int ret;
2189
2190 if (!vma || !pagemap_scan_is_interesting_page(p->cur_vma_category, p))
2191 return 0;
2192
2193 ret = pagemap_scan_output(p->cur_vma_category, p, addr, &end);
2194 if (addr == end)
2195 return ret;
2196
2197 if (~p->arg.flags & PM_SCAN_WP_MATCHING)
2198 return ret;
2199
> 2200 int err = uffd_wp_range(vma, addr, end - addr, true);
2201 if (err < 0)
2202 ret = err;
2203
2204 return ret;
2205 }
2206
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
Hi Michał,
kernel test robot noticed the following build warnings:
[auto build test WARNING on akpm-mm/mm-everything]
[also build test WARNING on linus/master v6.5-rc2 next-20230720]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Micha-Miros-aw/Re-fs-proc-ta…
base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/r/a0b5c6776b2ed91f78a7575649f8b100e58bd3a9.16898810…
patch subject: Re: fs/proc/task_mmu: Implement IOCTL for efficient page table scanning
config: mips-allyesconfig (https://download.01.org/0day-ci/archive/20230721/202307210528.2qgK1vwi-lkp@…)
compiler: mips-linux-gcc (GCC) 12.3.0
reproduce: (https://download.01.org/0day-ci/archive/20230721/202307210528.2qgK1vwi-lkp@…)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp(a)intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202307210528.2qgK1vwi-lkp@intel.com/
All warnings (new ones prefixed by >>):
fs/proc/task_mmu.c: In function 'pagemap_scan_test_walk':
fs/proc/task_mmu.c:1921:13: error: implicit declaration of function 'userfaultfd_wp_async'; did you mean 'userfaultfd_wp'? [-Werror=implicit-function-declaration]
1921 | if (userfaultfd_wp_async(vma) && userfaultfd_wp_use_markers(vma))
| ^~~~~~~~~~~~~~~~~~~~
| userfaultfd_wp
fs/proc/task_mmu.c: In function 'pagemap_scan_init_bounce_buffer':
>> fs/proc/task_mmu.c:2290:22: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
2290 | p->vec_out = (void __user *)p->arg.vec;
| ^
fs/proc/task_mmu.c: At top level:
fs/proc/task_mmu.c:1967:13: warning: 'pagemap_scan_backout_range' defined but not used [-Wunused-function]
1967 | static void pagemap_scan_backout_range(struct pagemap_scan_private *p,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: some warnings being treated as errors
vim +2290 fs/proc/task_mmu.c
2264
2265 static int pagemap_scan_init_bounce_buffer(struct pagemap_scan_private *p)
2266 {
2267 if (!p->arg.vec_len) {
2268 /*
2269 * An arbitrary non-page-aligned sentinel value for
2270 * pagemap_scan_push_range().
2271 */
2272 p->cur_buf.start = p->cur_buf.end = ULLONG_MAX;
2273 return 0;
2274 }
2275
2276 /*
2277 * Allocate a smaller buffer to get output from inside the page
2278 * walk functions and walk the range in PAGEMAP_WALK_SIZE chunks.
2279 * The last range is always stored in p.cur_buf to allow coalescing
2280 * consecutive ranges that have the same categories returned across
2281 * walk_page_range() calls.
2282 */
2283 p->vec_buf_len = min_t(size_t, PAGEMAP_WALK_SIZE >> PAGE_SHIFT,
2284 p->arg.vec_len - 1);
2285 p->vec_buf = kmalloc_array(p->vec_buf_len, sizeof(*p->vec_buf),
2286 GFP_KERNEL);
2287 if (!p->vec_buf)
2288 return -ENOMEM;
2289
> 2290 p->vec_out = (void __user *)p->arg.vec;
2291
2292 return 0;
2293 }
2294
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
=== Context ===
In the context of a middlebox, fragmented packets are tricky to handle.
The full 5-tuple of a packet is often only available in the first
fragment which makes enforcing consistent policy difficult. There are
really only two stateless options, neither of which are very nice:
1. Enforce policy on first fragment and accept all subsequent fragments.
This works but may let in certain attacks or allow data exfiltration.
2. Enforce policy on first fragment and drop all subsequent fragments.
This does not really work b/c some protocols may rely on
fragmentation. For example, DNS may rely on oversized UDP packets for
large responses.
So stateful tracking is the only sane option. RFC 8900 [0] calls this
out as well in section 6.3:
Middleboxes [...] should process IP fragments in a manner that is
consistent with [RFC0791] and [RFC8200]. In many cases, middleboxes
must maintain state in order to achieve this goal.
=== BPF related bits ===
Policy has traditionally been enforced from XDP/TC hooks. Both hooks
run before kernel reassembly facilities. However, with the new
BPF_PROG_TYPE_NETFILTER, we can rather easily hook into existing
netfilter reassembly infra.
The basic idea is we bump a refcnt on the netfilter defrag module and
then run the bpf prog after the defrag module runs. This allows bpf
progs to transparently see full, reassembled packets. The nice thing
about this is that progs don't have to carry around logic to detect
fragments.
=== Changelog ===
Changes from v4:
* Refactor module handling code to not sleep in rcu_read_lock()
* Also unify the v4 and v6 hook structs so they can share codepaths
* Fixed some checkpatch.pl formatting warnings
Changes from v3:
* Correctly initialize `addrlen` stack var for recvmsg()
Changes from v2:
* module_put() if ->enable() fails
* Fix CI build errors
Changes from v1:
* Drop bpf_program__attach_netfilter() patches
* static -> static const where appropriate
* Fix callback assignment order during registration
* Only request_module() if callbacks are missing
* Fix retval when modprobe fails in userspace
* Fix v6 defrag module name (nf_defrag_ipv6_hooks -> nf_defrag_ipv6)
* Simplify priority checking code
* Add warning if module doesn't assign callbacks in the future
* Take refcnt on module while defrag link is active
[0]: https://datatracker.ietf.org/doc/html/rfc8900
Daniel Xu (5):
netfilter: defrag: Add glue hooks for enabling/disabling defrag
netfilter: bpf: Support BPF_F_NETFILTER_IP_DEFRAG in netfilter link
bpf: selftests: Support not connecting client socket
bpf: selftests: Support custom type and proto for client sockets
bpf: selftests: Add defrag selftests
include/linux/netfilter.h | 10 +
include/uapi/linux/bpf.h | 5 +
net/ipv4/netfilter/nf_defrag_ipv4.c | 17 +-
net/ipv6/netfilter/nf_defrag_ipv6_hooks.c | 11 +
net/netfilter/core.c | 6 +
net/netfilter/nf_bpf_link.c | 116 ++++++-
tools/include/uapi/linux/bpf.h | 5 +
tools/testing/selftests/bpf/Makefile | 4 +-
.../selftests/bpf/generate_udp_fragments.py | 90 ++++++
.../selftests/bpf/ip_check_defrag_frags.h | 57 ++++
tools/testing/selftests/bpf/network_helpers.c | 26 +-
tools/testing/selftests/bpf/network_helpers.h | 3 +
.../bpf/prog_tests/ip_check_defrag.c | 283 ++++++++++++++++++
.../selftests/bpf/progs/ip_check_defrag.c | 104 +++++++
14 files changed, 715 insertions(+), 22 deletions(-)
create mode 100755 tools/testing/selftests/bpf/generate_udp_fragments.py
create mode 100644 tools/testing/selftests/bpf/ip_check_defrag_frags.h
create mode 100644 tools/testing/selftests/bpf/prog_tests/ip_check_defrag.c
create mode 100644 tools/testing/selftests/bpf/progs/ip_check_defrag.c
--
2.41.0
On Thu, Jul 20, 2023 at 09:28:52PM +0200, Michał Mirosław wrote:
> This is a massaged version of patch by Muhammad Usama Anjum [1]
> to illustrate my review comments and hopefully push the implementation
> efforts closer to conclusion. The changes are:
[...]
> +static void pagemap_scan_backout_range(struct pagemap_scan_private *p,
> + unsigned long addr, unsigned long end)
> +{
> + struct page_region *cur_buf = &p->cur_buf;
> +
> + if (cur_buf->start != addr) {
> + cur_buf->end = addr;
> + } else {
> + cur_buf->start = cur_buf->end = 0;
> + }
> +
> + p->end_addr = 0;
Just noticed that this is missing:
p->found_pages -= (end - addr) / PAGE_SIZE;
> +}
[...]
Best Regards
Michał Mirosław
v1: https://lore.kernel.org/rust-for-linux/20230614180837.630180-1-ojeda@kernel…
v2:
- Rebased on top of v6.5-rc1, which requires a change from
`kunit_do_failed_assertion` to `__kunit_do_failed_assertion` (since
the former became a macro) and the addition of a call to
`__kunit_abort` (since previously the call was done by the old
function which we cannot use anymore since it is a macro). (David)
- Added prerequisite patch to KUnit header to include `stddef.h` to
support the `KUNIT=y` case. (Reported by Boqun)
- Added comment on the purpose of `trait FromErrno`. (Martin asked
about it)
- Simplify code to use `std::fs::write` instead of `write!`, which
improves code size too. (Suggested by Alice)
- Fix copy-paste type in docs from "error" to "info" and, to make it
proper English, copy the `printk` docs style, i.e. from "info"
to "info-level message" -- and same for the "error" one. (Miguel)
- Swap `FILE` and `LINE` `static`s to keep the same order as done
elsewhere. (Miguel)
- Rename config option from `RUST_KERNEL_KUNIT_TEST` to
`RUST_KERNEL_DOCTESTS` (and update its title), so that we can use
the former for the "normal" (i.e. non-doctests, e.g. `#[test]` ones)
tests in the future. (David)
- Follow the syntax proposed for declaring test metadata in the KTAP
v2 spec, which may also get used for the KUnit test attributes API.
Thus, instead of "# Doctest from line {line}", use
"# {test_name}.location: {file}.{line}", which ideally will allow to
migrate to a KUnit attribute later.
This is done in all cases, i.e. when the tests succeeds, because
it may be useful for users running KUnit manually, or when passing
`--raw_output` to `kunit.py`. (David)
David: I used "location" instead of your suggested "line" alone, in
order to have both in a single line, which looked nice and closer to
the "ASSERTION FAILURE" case/line, since now we do have the original
file (please see below).
- Figure out the original line. This is done by deploying an anchor
so that the difference in lines between the beginning of the test
and the assert, in the generated file, can be computed. Then, we
offset the line number of the original test, which is given by
`rustdoc`. (developed by Boqun)
- Figure out the original file. This is done by walking the
filesystem, checking directory prefixes to reduce the amount of
combinations to check, and it is only done once per file (rather
than per test).
Ambiguities are detected and reported. It does limit the filenames
(module names) we can use, but it is unlikely we will hit it soon
and this should be temporary anyway until `rustdoc` provides us
with the real path. (Miguel)
Tested with both in-tree and `O=` builds, but I would appreciate
extra testing on this one, including via the `kunit.py` script.
- The three last items combined means that we now see this output even
for successful cases:
# rust_doctest_kernel_sync_locked_by_rs_0.location: rust/kernel/sync/locked_by.rs:28
ok 53 rust_doctest_kernel_sync_locked_by_rs_0
Which basically gives the user all the information they need to go
back to the source code of the doctest, while keeping them fairly
stable for bisection, and while avoiding to require users to write
test names manually. (David + Boqun + Miguel)
David: from what I saw in v2 of the RFC for the test attributes API,
the syntax still contains the test name when it is not a suite, so
I followed that, but if you prefer to omit it, please feel free to
do so (for me either way it is fine, and if this is the expected
attribute syntax, I guess it is worth to follow it to make migration
easier later on):
# location: rust/kernel/sync/locked_by.rs:28
ok 53 rust_doctest_kernel_sync_locked_by_rs_0
- Collected `Reviewed-by`s and `Tested-by`s, except for the main
commit due to the substantial changes.
Miguel Ojeda (7):
kunit: test-bug.h: include `stddef.h` for `NULL`
rust: init: make doctests compilable/testable
rust: str: make doctests compilable/testable
rust: sync: make doctests compilable/testable
rust: types: make doctests compilable/testable
rust: support running Rust documentation tests as KUnit ones
MAINTAINERS: add Rust KUnit files to the KUnit entry
MAINTAINERS | 2 +
include/kunit/test-bug.h | 2 +
lib/Kconfig.debug | 13 ++
rust/.gitignore | 2 +
rust/Makefile | 29 ++++
rust/bindings/bindings_helper.h | 1 +
rust/helpers.c | 7 +
rust/kernel/init.rs | 26 +--
rust/kernel/kunit.rs | 163 +++++++++++++++++++
rust/kernel/lib.rs | 2 +
rust/kernel/str.rs | 4 +-
rust/kernel/sync/arc.rs | 9 +-
rust/kernel/sync/lock/mutex.rs | 1 +
rust/kernel/sync/lock/spinlock.rs | 1 +
rust/kernel/types.rs | 6 +-
scripts/.gitignore | 2 +
scripts/Makefile | 4 +
scripts/rustdoc_test_builder.rs | 72 +++++++++
scripts/rustdoc_test_gen.rs | 260 ++++++++++++++++++++++++++++++
19 files changed, 591 insertions(+), 15 deletions(-)
create mode 100644 rust/kernel/kunit.rs
create mode 100644 scripts/rustdoc_test_builder.rs
create mode 100644 scripts/rustdoc_test_gen.rs
base-commit: 06c2afb862f9da8dc5efa4b6076a0e48c3fbaaa5
--
2.41.0
This series fixes an issue which David Spickett found where if we change
the SVE VL while SME is in use we can end up attempting to save state to
an unallocated buffer and adds testing coverage for that plus a bit more
coverage of VL changes, just for paranioa.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Mark Brown (3):
arm64/fpsimd: Ensure SME storage is allocated after SVE VL changes
kselftest/arm64: Add a test case for SVE VL changes with SME active
kselftest/arm64: Validate that changing one VL type does not affect another
arch/arm64/kernel/fpsimd.c | 32 +++++--
tools/testing/selftests/arm64/fp/vec-syscfg.c | 127 +++++++++++++++++++++++++-
2 files changed, 148 insertions(+), 11 deletions(-)
---
base-commit: 06c2afb862f9da8dc5efa4b6076a0e48c3fbaaa5
change-id: 20230713-arm64-fix-sve-sme-vl-change-60eb1fa6a707
Best regards,
--
Mark Brown <broonie(a)kernel.org>
Hi,
This follows the discussion here:
https://lore.kernel.org/linux-kselftest/20230324123157.bbwvfq4gsxnlnfwb@hou…
This shows a couple of inconsistencies with regard to how device-managed
resources are cleaned up. Basically, devm resources will only be cleaned up
if the device is attached to a bus and bound to a driver. Failing any of
these cases, a call to device_unregister will not end up in the devm
resources being released.
We had to work around it in DRM to provide helpers to create a device for
kunit tests, but the current discussion around creating similar, generic,
helpers for kunit resumed interest in fixing this.
This can be tested using the command:
./tools/testing/kunit/kunit.py run --kunitconfig=drivers/base/test/
I added the fix David suggested back in that discussion which does fix
the tests. The SoB is missing, since David didn't provide it back then.
Let me know what you think,
Maxime
Signed-off-by: Maxime Ripard <maxime(a)cerno.tech>
---
Changes in v2:
- Use an init function
- Document the tests
- Add a fix for the bugs
- Link to v1: https://lore.kernel.org/r/20230329-kunit-devm-inconsistencies-test-v1-0-c33…
---
David Gow (1):
drivers: base: Free devm resources when unregistering a device
Maxime Ripard (2):
drivers: base: Add basic devm tests for root devices
drivers: base: Add basic devm tests for platform devices
drivers/base/core.c | 11 ++
drivers/base/test/.kunitconfig | 2 +
drivers/base/test/Kconfig | 4 +
drivers/base/test/Makefile | 3 +
drivers/base/test/platform-device-test.c | 220 +++++++++++++++++++++++++++++++
drivers/base/test/root-device-test.c | 108 +++++++++++++++
6 files changed, 348 insertions(+)
---
base-commit: 53cdf865f90ba922a854c65ed05b519f9d728424
change-id: 20230329-kunit-devm-inconsistencies-test-5e5a7d01e60d
Best regards,
--
Maxime Ripard <mripard(a)kernel.org>
Hi,
This patch series aims to improve the PMU event filter settings with a cleaner
and more organized structure and adds several test cases related to PMU event
filters.
These changes help to ensure that KVM's PMU event filter functions as expected
in all supported use cases.
Any feedback or suggestions are greatly appreciated.
Sincerely,
Jinrong Liang
Changes log:
v5:
- Add more x86 properties for Intel PMU;
- Designated initializer instead of overwrite all members; (Isaku Yamahata)
- PMU event filter invalid flag modified to "KVM_PMU_EVENT_FLAGS_VALID_MASK << 1"; (Isaku Yamahata)
- sizeof(bitmap) is modified to "sizeof(bitmap) * 8" to represent the number of
bits that can be represented by the bitmap variable. (Isaku Yamahata)
Previous:
https://lore.kernel.org/kvm/20230717062343.3743-1-cloudliang@tencent.com/T/
Jinrong Liang (6):
KVM: selftests: Add x86 properties for Intel PMU in processor.h
KVM: selftests: Drop the return of remove_event()
KVM: selftests: Introduce __kvm_pmu_event_filter to improved event
filter settings
KVM: selftests: Add test cases for unsupported PMU event filter input
values
KVM: selftests: Test if event filter meets expectations on fixed
counters
KVM: selftests: Test gp event filters don't affect fixed event filters
.../selftests/kvm/include/x86_64/processor.h | 5 +
.../kvm/x86_64/pmu_event_filter_test.c | 317 ++++++++++++------
2 files changed, 228 insertions(+), 94 deletions(-)
base-commit: 88bb466c9dec4f70d682cf38c685324e7b1b3d60
--
2.39.3
When we collect a signal context with one of the SME modes enabled we will
have enabled that mode behind the compiler and libc's back so they may
issue some instructions not valid in streaming mode, causing spurious
failures.
For the code prior to issuing the BRK to trigger signal handling we need to
stay in streaming mode if we were already there since that's a part of the
signal context the caller is trying to collect. Unfortunately this code
includes a memset() which is likely to be heavily optimised and is likely
to use FP instructions incompatible with streaming mode. We can avoid this
happening by open coding the memset(), inserting a volatile assembly
statement to avoid the compiler recognising what's being done and doing
something in optimisation. This code is not performance critical so the
inefficiency should not be an issue.
After collecting the context we can simply exit streaming mode, avoiding
these issues. Use a full SMSTOP for safety to prevent any issues appearing
with ZA.
Reported-by: Will Deacon <will(a)kernel.org>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Changes in v2:
- Rebase onto v6.5-rc1.
- Link to v1: https://lore.kernel.org/r/20230628-arm64-signal-memcpy-fix-v1-1-db3e0300829…
---
.../selftests/arm64/signal/test_signals_utils.h | 28 +++++++++++++++++++++-
1 file changed, 27 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/arm64/signal/test_signals_utils.h b/tools/testing/selftests/arm64/signal/test_signals_utils.h
index 222093f51b67..db28409fd44b 100644
--- a/tools/testing/selftests/arm64/signal/test_signals_utils.h
+++ b/tools/testing/selftests/arm64/signal/test_signals_utils.h
@@ -60,13 +60,28 @@ static __always_inline bool get_current_context(struct tdescr *td,
size_t dest_sz)
{
static volatile bool seen_already;
+ int i;
+ char *uc = (char *)dest_uc;
assert(td && dest_uc);
/* it's a genuine invocation..reinit */
seen_already = 0;
td->live_uc_valid = 0;
td->live_sz = dest_sz;
- memset(dest_uc, 0x00, td->live_sz);
+
+ /*
+ * This is a memset() but we don't want the compiler to
+ * optimise it into either instructions or a library call
+ * which might be incompatible with streaming mode.
+ */
+ for (i = 0; i < td->live_sz; i++) {
+ asm volatile("nop"
+ : "+m" (*dest_uc)
+ :
+ : "memory");
+ uc[i] = 0;
+ }
+
td->live_uc = dest_uc;
/*
* Grab ucontext_t triggering a SIGTRAP.
@@ -103,6 +118,17 @@ static __always_inline bool get_current_context(struct tdescr *td,
:
: "memory");
+ /*
+ * If we were grabbing a streaming mode context then we may
+ * have entered streaming mode behind the system's back and
+ * libc or compiler generated code might decide to do
+ * something invalid in streaming mode, or potentially even
+ * the state of ZA. Issue a SMSTOP to exit both now we have
+ * grabbed the state.
+ */
+ if (td->feats_supported & FEAT_SME)
+ asm volatile("msr S0_3_C4_C6_3, xzr");
+
/*
* If we get here with seen_already==1 it implies the td->live_uc
* context has been used to get back here....this probably means
---
base-commit: 06c2afb862f9da8dc5efa4b6076a0e48c3fbaaa5
change-id: 20230628-arm64-signal-memcpy-fix-7de3b3c8fa10
Best regards,
--
Mark Brown <broonie(a)kernel.org>
Hi All,
This is v2 of my series to clean up mm selftests so that they run correctly on
arm64. See [1] for full explanation.
Changes Since v1 [1]
--------------------
- Patch 1: Explicitly set line buffer mode in ksft_print_header()
- Dropped v1 patch 2 (set execute permissions): Andrew has taken this into his
branch separately.
- Patch 2: Don't compile `soft-dirty` suite for arm64 instead of skipping it
at runtime.
- Patch 2: Declare fewer tests and skip all of test_softdirty() if soft-dirty
is not supported, rather than conditionally marking each check as skipped.
- Added Reviewed-by tags: thanks DavidH!
- Patch 8: Clarified commit message.
[1] https://lore.kernel.org/linux-mm/20230713135440.3651409-1-ryan.roberts@arm.…
Thanks,
Ryan
Ryan Roberts (8):
selftests: Line buffer test program's stdout
selftests/mm: Skip soft-dirty tests on arm64
selftests/mm: Enable mrelease_test for arm64
selftests/mm: Fix thuge-gen test bugs
selftests/mm: va_high_addr_switch should skip unsupported arm64
configs
selftests/mm: Make migration test robust to failure
selftests/mm: Optionally pass duration to transhuge-stress
selftests/mm: Run all tests from run_vmtests.sh
tools/testing/selftests/kselftest.h | 9 ++
tools/testing/selftests/kselftest/runner.sh | 7 +-
tools/testing/selftests/mm/Makefile | 82 ++++++++++---------
tools/testing/selftests/mm/madv_populate.c | 26 +++++-
tools/testing/selftests/mm/migration.c | 14 +++-
tools/testing/selftests/mm/mrelease_test.c | 1 +
tools/testing/selftests/mm/run_vmtests.sh | 28 ++++++-
tools/testing/selftests/mm/settings | 2 +-
tools/testing/selftests/mm/thuge-gen.c | 4 +-
tools/testing/selftests/mm/transhuge-stress.c | 12 ++-
.../selftests/mm/va_high_addr_switch.c | 2 +-
11 files changed, 133 insertions(+), 54 deletions(-)
--
2.25.1
Hi,
Consequential to the previous problem report, this one addresses almost the very
next test script.
The testing environment is the same: 6.5-rc2 vanilla Torvalds tree on Ubuntu 22.04 LTS.
The used config is the same, please find it with the bridge_mdb.sh normal and "set -x"
output on this link (too large to attach):
https://domac.alu.unizg.hr/~mtodorov/linux/selftests/net-forwarding/bridge_…
root@defiant:# ./bridge_mdb.sh
INFO: # Host entries configuration tests
TEST: Common host entries configuration tests (IPv4) [FAIL]
Managed to add IPv4 host entry with a filter mode
TEST: Common host entries configuration tests (IPv6) [FAIL]
Managed to add IPv6 host entry with a filter mode
TEST: Common host entries configuration tests (L2) [FAIL]
Managed to add L2 host entry with a filter mode
INFO: # Port group entries configuration tests - (*, G)
Command "replace" is unknown, try "bridge mdb help".
TEST: Common port group entries configuration tests (IPv4 (*, G)) [FAIL]
Failed to replace IPv4 (*, G) entry
Command "replace" is unknown, try "bridge mdb help".
TEST: Common port group entries configuration tests (IPv6 (*, G)) [FAIL]
Failed to replace IPv6 (*, G) entry
Command "replace" is unknown, try "bridge mdb help".
Command "replace" is unknown, try "bridge mdb help".
Command "replace" is unknown, try "bridge mdb help".
Command "replace" is unknown, try "bridge mdb help".
Command "replace" is unknown, try "bridge mdb help".
Command "replace" is unknown, try "bridge mdb help".
Command "replace" is unknown, try "bridge mdb help".
RTNETLINK answers: Invalid argument
Error: bridge: (*, G) group is already joined by port.
Error: bridge: (*, G) group is already joined by port.
TEST: IPv4 (*, G) port group entries configuration tests [FAIL]
(S, G) entry not created
Command "replace" is unknown, try "bridge mdb help".
Command "replace" is unknown, try "bridge mdb help".
Command "replace" is unknown, try "bridge mdb help".
Command "replace" is unknown, try "bridge mdb help".
Command "replace" is unknown, try "bridge mdb help".
Command "replace" is unknown, try "bridge mdb help".
Command "replace" is unknown, try "bridge mdb help".
RTNETLINK answers: Invalid argument
Error: bridge: (*, G) group is already joined by port.
Error: bridge: (*, G) group is already joined by port.
TEST: IPv6 (*, G) port group entries configuration tests [FAIL]
(S, G) entry not created
INFO: # Port group entries configuration tests - (S, G)
Command "replace" is unknown, try "bridge mdb help".
TEST: Common port group entries configuration tests (IPv4 (S, G)) [FAIL]
Failed to replace IPv4 (S, G) entry
Command "replace" is unknown, try "bridge mdb help".
TEST: Common port group entries configuration tests (IPv6 (S, G)) [FAIL]
Failed to replace IPv6 (S, G) entry
Error: bridge: (S, G) group is already joined by port.
Command "replace" is unknown, try "bridge mdb help".
Command "replace" is unknown, try "bridge mdb help".
Command "replace" is unknown, try "bridge mdb help".
TEST: IPv4 (S, G) port group entries configuration tests [FAIL]
Managed to add an entry with a filter mode
Error: bridge: (S, G) group is already joined by port.
Command "replace" is unknown, try "bridge mdb help".
Command "replace" is unknown, try "bridge mdb help".
Command "replace" is unknown, try "bridge mdb help".
TEST: IPv6 (S, G) port group entries configuration tests [FAIL]
"temp" entry has an unpending group timer
INFO: # Port group entries configuration tests - L2
Command "replace" is unknown, try "bridge mdb help".
TEST: Common port group entries configuration tests (L2 (*, G)) [FAIL]
Failed to replace L2 (*, G) entry
TEST: L2 (*, G) port group entries configuration tests [FAIL]
Managed to add an entry with a filter mode
INFO: # Large scale dump tests
TEST: IPv4 large scale dump tests [ OK ]
TEST: IPv6 large scale dump tests [ OK ]
TEST: L2 large scale dump tests [ OK ]
INFO: # Forwarding tests
Error: bridge: Group is already joined by host.
TEST: IPv4 host entries forwarding tests [FAIL]
Packet not locally received after adding a host entry
Error: bridge: Group is already joined by host.
TEST: IPv6 host entries forwarding tests [FAIL]
Packet locally received after flood
TEST: L2 host entries forwarding tests [FAIL]
Packet not locally received after flood
Command "replace" is unknown, try "bridge mdb help".
TEST: IPv4 port group "exclude" entries forwarding tests [FAIL]
Packet from valid source not received on H2 after adding entry
Command "replace" is unknown, try "bridge mdb help".
TEST: IPv6 port group "exclude" entries forwarding tests [FAIL]
Packet from invalid source received on H2 after adding entry
Command "replace" is unknown, try "bridge mdb help".
TEST: IPv4 port group "include" entries forwarding tests [FAIL]
Packet from valid source not received on H2 after adding entry
Command "replace" is unknown, try "bridge mdb help".
TEST: IPv6 port group "include" entries forwarding tests [FAIL]
Packet from invalid source received on H2 after adding entry
TEST: L2 port entries forwarding tests [ OK ]
INFO: # Control packets tests
Command "replace" is unknown, try "bridge mdb help".
TEST: IGMPv3 MODE_IS_INCLUDE tests [FAIL]
Source not add to source list
Command "replace" is unknown, try "bridge mdb help".
TEST: MLDv2 MODE_IS_INCLUDE tests [FAIL]
Source not add to source list
root@defiant:# bridge mdb show
root@defiant:#
NOTE that several "sleep 10" command looped in the script can easily exceed
the default timeout of 45 seconds, and SIGTERM to the script isn't processed,
so it leaves the system in an unpredictable state from which even
"systemctl restart networking" didn't bail out.
Setting tools/testing/selftests/net/forwarding/settings:timeout=150 seemed enough.
Best regards,
Mirsad Todorovac
This series is a follow up to the recent change [1] which added
per-cpu insert/delete statistics for maps. The bpf_map_sum_elem_count
kfunc presented in the original series was only available to tracing
programs, so let's make it available to all.
The first patch makes types listed in the reg2btf_ids[] array to be
considered trusted by kfuncs.
The second patch allows to treat CONST_PTR_TO_MAP as trusted pointers from
kfunc's point of view by adding it to the reg2btf_ids[] array.
The third patch adds missing const to the map argument of the
bpf_map_sum_elem_count kfunc.
The fourth patch registers the bpf_map_sum_elem_count for all programs,
and patches selftests correspondingly.
[1] https://lore.kernel.org/bpf/20230705160139.19967-1-aspsk@isovalent.com/
v1 -> v2:
* treat the whole reg2btf_ids array as trusted (Alexei)
Anton Protopopov (4):
bpf: consider types listed in reg2btf_ids as trusted
bpf: consider CONST_PTR_TO_MAP as trusted pointer to struct bpf_map
bpf: make an argument const in the bpf_map_sum_elem_count kfunc
bpf: allow any program to use the bpf_map_sum_elem_count kfunc
include/linux/btf_ids.h | 1 +
kernel/bpf/map_iter.c | 7 +++---
kernel/bpf/verifier.c | 22 +++++++++++--------
.../selftests/bpf/progs/map_ptr_kern.c | 5 +++++
4 files changed, 22 insertions(+), 13 deletions(-)
--
2.34.1
bpf_dynptr_slice(_rw) uses a user provided buffer if it can not provide
a pointer to a block of contiguous memory. This buffer is unused in the
case of local dynptrs, and may be unused in other cases as well. There
is no need to require the buffer, as the kfunc can just return NULL if
it was needed and not provided.
This adds another kfunc annotation, __opt, which combines with __sz and
__szk to allow the buffer associated with the size to be NULL. If the
buffer is NULL, the verifier does not check that the buffer is of
sufficient size.
Signed-off-by: Daniel Rosenberg <drosen(a)google.com>
---
Documentation/bpf/kfuncs.rst | 23 ++++++++++++++++++++++-
include/linux/skbuff.h | 2 +-
kernel/bpf/helpers.c | 30 ++++++++++++++++++------------
kernel/bpf/verifier.c | 17 +++++++++++++----
4 files changed, 54 insertions(+), 18 deletions(-)
diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst
index ea2516374d92..7a3d9de5f315 100644
--- a/Documentation/bpf/kfuncs.rst
+++ b/Documentation/bpf/kfuncs.rst
@@ -100,7 +100,7 @@ Hence, whenever a constant scalar argument is accepted by a kfunc which is not a
size parameter, and the value of the constant matters for program safety, __k
suffix should be used.
-2.2.2 __uninit Annotation
+2.2.3 __uninit Annotation
-------------------------
This annotation is used to indicate that the argument will be treated as
@@ -117,6 +117,27 @@ Here, the dynptr will be treated as an uninitialized dynptr. Without this
annotation, the verifier will reject the program if the dynptr passed in is
not initialized.
+2.2.4 __opt Annotation
+-------------------------
+
+This annotation is used to indicate that the buffer associated with an __sz or __szk
+argument may be null. If the function is passed a nullptr in place of the buffer,
+the verifier will not check that length is appropriate for the buffer. The kfunc is
+responsible for checking if this buffer is null before using it.
+
+An example is given below::
+
+ __bpf_kfunc void *bpf_dynptr_slice(..., void *buffer__opt, u32 buffer__szk)
+ {
+ ...
+ }
+
+Here, the buffer may be null. If buffer is not null, it at least of size buffer_szk.
+Either way, the returned buffer is either NULL, or of size buffer_szk. Without this
+annotation, the verifier will reject the program if a null pointer is passed in with
+a nonzero size.
+
+
.. _BPF_kfunc_nodef:
2.3 Using an existing kernel function
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 738776ab8838..8ddb4af1a501 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -4033,7 +4033,7 @@ __skb_header_pointer(const struct sk_buff *skb, int offset, int len,
if (likely(hlen - offset >= len))
return (void *)data + offset;
- if (!skb || unlikely(skb_copy_bits(skb, offset, buffer, len) < 0))
+ if (!skb || !buffer || unlikely(skb_copy_bits(skb, offset, buffer, len) < 0))
return NULL;
return buffer;
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 8d368fa353f9..26efb6fbeab2 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2167,13 +2167,15 @@ __bpf_kfunc struct task_struct *bpf_task_from_pid(s32 pid)
* bpf_dynptr_slice() - Obtain a read-only pointer to the dynptr data.
* @ptr: The dynptr whose data slice to retrieve
* @offset: Offset into the dynptr
- * @buffer: User-provided buffer to copy contents into
- * @buffer__szk: Size (in bytes) of the buffer. This is the length of the
- * requested slice. This must be a constant.
+ * @buffer__opt: User-provided buffer to copy contents into. May be NULL
+ * @buffer__szk: Size (in bytes) of the buffer if present. This is the
+ * length of the requested slice. This must be a constant.
*
* For non-skb and non-xdp type dynptrs, there is no difference between
* bpf_dynptr_slice and bpf_dynptr_data.
*
+ * If buffer__opt is NULL, the call will fail if buffer_opt was needed.
+ *
* If the intention is to write to the data slice, please use
* bpf_dynptr_slice_rdwr.
*
@@ -2190,7 +2192,7 @@ __bpf_kfunc struct task_struct *bpf_task_from_pid(s32 pid)
* direct pointer)
*/
__bpf_kfunc void *bpf_dynptr_slice(const struct bpf_dynptr_kern *ptr, u32 offset,
- void *buffer, u32 buffer__szk)
+ void *buffer__opt, u32 buffer__szk)
{
enum bpf_dynptr_type type;
u32 len = buffer__szk;
@@ -2210,15 +2212,17 @@ __bpf_kfunc void *bpf_dynptr_slice(const struct bpf_dynptr_kern *ptr, u32 offset
case BPF_DYNPTR_TYPE_RINGBUF:
return ptr->data + ptr->offset + offset;
case BPF_DYNPTR_TYPE_SKB:
- return skb_header_pointer(ptr->data, ptr->offset + offset, len, buffer);
+ return skb_header_pointer(ptr->data, ptr->offset + offset, len, buffer__opt);
case BPF_DYNPTR_TYPE_XDP:
{
void *xdp_ptr = bpf_xdp_pointer(ptr->data, ptr->offset + offset, len);
if (xdp_ptr)
return xdp_ptr;
- bpf_xdp_copy_buf(ptr->data, ptr->offset + offset, buffer, len, false);
- return buffer;
+ if (!buffer__opt)
+ return NULL;
+ bpf_xdp_copy_buf(ptr->data, ptr->offset + offset, buffer__opt, len, false);
+ return buffer__opt;
}
default:
WARN_ONCE(true, "unknown dynptr type %d\n", type);
@@ -2230,13 +2234,15 @@ __bpf_kfunc void *bpf_dynptr_slice(const struct bpf_dynptr_kern *ptr, u32 offset
* bpf_dynptr_slice_rdwr() - Obtain a writable pointer to the dynptr data.
* @ptr: The dynptr whose data slice to retrieve
* @offset: Offset into the dynptr
- * @buffer: User-provided buffer to copy contents into
- * @buffer__szk: Size (in bytes) of the buffer. This is the length of the
- * requested slice. This must be a constant.
+ * @buffer__opt: User-provided buffer to copy contents into. May be NULL
+ * @buffer__szk: Size (in bytes) of the buffer if present. This is the
+ * length of the requested slice. This must be a constant.
*
* For non-skb and non-xdp type dynptrs, there is no difference between
* bpf_dynptr_slice and bpf_dynptr_data.
*
+ * If buffer__opt is NULL, the call will fail if buffer_opt was needed.
+ *
* The returned pointer is writable and may point to either directly the dynptr
* data at the requested offset or to the buffer if unable to obtain a direct
* data pointer to (example: the requested slice is to the paged area of an skb
@@ -2267,7 +2273,7 @@ __bpf_kfunc void *bpf_dynptr_slice(const struct bpf_dynptr_kern *ptr, u32 offset
* direct pointer)
*/
__bpf_kfunc void *bpf_dynptr_slice_rdwr(const struct bpf_dynptr_kern *ptr, u32 offset,
- void *buffer, u32 buffer__szk)
+ void *buffer__opt, u32 buffer__szk)
{
if (!ptr->data || bpf_dynptr_is_rdonly(ptr))
return NULL;
@@ -2294,7 +2300,7 @@ __bpf_kfunc void *bpf_dynptr_slice_rdwr(const struct bpf_dynptr_kern *ptr, u32 o
* will be copied out into the buffer and the user will need to call
* bpf_dynptr_write() to commit changes.
*/
- return bpf_dynptr_slice(ptr, offset, buffer, buffer__szk);
+ return bpf_dynptr_slice(ptr, offset, buffer__opt, buffer__szk);
}
__bpf_kfunc void *bpf_cast_to_kern_ctx(void *obj)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index fbcf5a4e2fcd..708ae7bca1fe 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -9398,6 +9398,11 @@ static bool is_kfunc_arg_const_mem_size(const struct btf *btf,
return __kfunc_param_match_suffix(btf, arg, "__szk");
}
+static bool is_kfunc_arg_optional(const struct btf *btf, const struct btf_param *arg)
+{
+ return __kfunc_param_match_suffix(btf, arg, "__opt");
+}
+
static bool is_kfunc_arg_constant(const struct btf *btf, const struct btf_param *arg)
{
return __kfunc_param_match_suffix(btf, arg, "__k");
@@ -10464,13 +10469,17 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
break;
case KF_ARG_PTR_TO_MEM_SIZE:
{
+ struct bpf_reg_state *buff_reg = ®s[regno];
+ const struct btf_param *buff_arg = &args[i];
struct bpf_reg_state *size_reg = ®s[regno + 1];
const struct btf_param *size_arg = &args[i + 1];
- ret = check_kfunc_mem_size_reg(env, size_reg, regno + 1);
- if (ret < 0) {
- verbose(env, "arg#%d arg#%d memory, len pair leads to invalid memory access\n", i, i + 1);
- return ret;
+ if (!register_is_null(buff_reg) || !is_kfunc_arg_optional(meta->btf, buff_arg)) {
+ ret = check_kfunc_mem_size_reg(env, size_reg, regno + 1);
+ if (ret < 0) {
+ verbose(env, "arg#%d arg#%d memory, len pair leads to invalid memory access\n", i, i + 1);
+ return ret;
+ }
}
if (is_kfunc_arg_const_mem_size(meta->btf, size_arg, size_reg)) {
base-commit: 6e98b09da931a00bf4e0477d0fa52748bf28fcce
--
2.40.1.495.gc816e09b53d-goog
The arm64 Guarded Control Stack (GCS) feature provides support for
hardware protected stacks of return addresses, intended to provide
hardening against return oriented programming (ROP) attacks and to make
it easier to gather call stacks for applications such as profiling.
When GCS is active a secondary stack called the Guarded Control Stack is
maintained, protected with a memory attribute which means that it can
only be written with specific GCS operations. When a BL is executed the
value stored in LR is also pushed onto the GCS, and when a RET is
executed the top of the GCS is popped and compared to LR with a fault
being raised if the values do not match. GCS operations may only be
performed on GCS pages, a data abort is generated if they are not.
This series implements support for use of GCS by EL0, along with support
for use of GCS within KVM guests. It does not enable use of GCS by
either EL1 or EL2. Executables are started without GCS and must use a
prctl() to enable it, it is expected that this will be done very early
in application execution by the dynamic linker or other startup code.
x86 has an equivalent feature called shadow stacks, this series depends
on the x86 patches for generic memory management support for the new
guarded/shadow stack page type and shares APIs as much as possible. As
there has been extensive discussion with the wider community around the
ABI for shadow stacks I have as far as practical kept implementation
decisions close to those for x86, anticipating that review would lead to
similar conclusions in the absence of strong reasoning for divergence.
The main divergence I am concious of is that x86 allows shadow stack to
be enabled and disabled repeatedly, freeing the shadow stack for the
thread whenever disabled, while this implementation keeps the GCS
allocated after disable but refuses to reenable it. This is to avoid
races with things actively walking the GCS during a disable, we do
anticipate that some systems will wish to disable GCS at runtime but are
not aware of any demand for subsequently reenabling it.
x86 uses an arch_prctl() to manage enable and disable, since only x86
and S/390 use arch_prctl() a generic prctl() was proposed[1] as part of a
patch set for the equivalent RISC-V zisslpcfi feature which is adopted
with some enhancements here.
There's a few bits where I'm not convinced with where I've placed
things, in particular the GCS write operation is in the GCS header not
in uaccess.h, I wasn't sure what was clearest there and am probably too
close to the code to have a clear opinion.
The series depends on the x86 shadow stack support:
https://lore.kernel.org/lkml/20230227222957.24501-1-rick.p.edgecombe@intel.…
I've rebased this onto v6.5-rc1 but not included it in the series in
order to avoid confusion with Rick's work and cut down the size of the
series, you can see the branch at:
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/misc.git arm64-gcs
[1] https://lore.kernel.org/lkml/20230213045351.3945824-1-debug@rivosinc.com/
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Deepak Gupta (1):
prctl: arch-agnostic prctl for shadow stack
Mark Brown (34):
prctl: Add flag for shadow stack writeability and push/pop
arm64: Document boot requirements for Guarded Control Stacks
arm64/gcs: Document the ABI for Guarded Control Stacks
arm64/sysreg: Add new system registers for GCS
arm64/sysreg: Add definitions for architected GCS caps
arm64/gcs: Add manual encodings of GCS instructions
arm64/gcs: Provide copy_to_user_gcs()
arm64/cpufeature: Runtime detection of Guarded Control Stack (GCS)
arm64/mm: Allocate PIE slots for EL0 guarded control stack
mm: Define VM_SHADOW_STACK for arm64 when we support GCS
arm64/mm: Map pages for guarded control stack
KVM: arm64: Manage GCS registers for guests
arm64: Disable traps for GCS usage at EL0 and EL1
arm64/idreg: Add overrride for GCS
arm64/hwcap: Add hwcap for GCS
arm64/traps: Handle GCS exceptions
arm64/mm: Handle GCS data aborts
arm64/gcs: Context switch GCS registers for EL0
arm64/gcs: Allocate a new GCS for threads with GCS enabled
arm64/gcs: Implement shadow stack prctl() interface
arm64/mm: Implement map_shadow_stack()
arm64/signal: Set up and restore the GCS context for signal handlers
arm64/signal: Expose GCS state in signal frames
arm64/ptrace: Expose GCS via ptrace and core files
arm64: Add Kconfig for Guarded Control Stack (GCS)
kselftest/arm64: Verify the GCS hwcap
kselftest/arm64: Add GCS as a detected feature in the signal tests
kselftest/arm64: Add framework support for GCS to signal handling tests
kselftest/arm64: Allow signals tests to specify an expected si_code
kselftest/arm64: Always run signals tests with GCS enabled
kselftest/arm64: Add very basic GCS test program
kselftest/arm64: Add a GCS test program built with the system libc
selftests/arm64: Add GCS signal tests
kselftest/arm64: Enable GCS for the FP stress tests
Documentation/admin-guide/kernel-parameters.txt | 3 +
Documentation/arch/arm64/booting.rst | 22 ++
Documentation/arch/arm64/elf_hwcaps.rst | 3 +
Documentation/arch/arm64/gcs.rst | 216 +++++++++++++
Documentation/arch/arm64/index.rst | 1 +
Documentation/filesystems/proc.rst | 2 +-
arch/arm64/Kconfig | 19 ++
arch/arm64/include/asm/cpufeature.h | 6 +
arch/arm64/include/asm/el2_setup.h | 9 +
arch/arm64/include/asm/esr.h | 26 +-
arch/arm64/include/asm/exception.h | 2 +
arch/arm64/include/asm/gcs.h | 88 ++++++
arch/arm64/include/asm/hwcap.h | 1 +
arch/arm64/include/asm/kvm_host.h | 12 +
arch/arm64/include/asm/pgtable-prot.h | 14 +-
arch/arm64/include/asm/processor.h | 6 +
arch/arm64/include/asm/sysreg.h | 20 ++
arch/arm64/include/asm/uaccess.h | 42 +++
arch/arm64/include/uapi/asm/hwcap.h | 1 +
arch/arm64/include/uapi/asm/ptrace.h | 7 +
arch/arm64/include/uapi/asm/sigcontext.h | 9 +
arch/arm64/kernel/cpufeature.c | 23 ++
arch/arm64/kernel/cpuinfo.c | 1 +
arch/arm64/kernel/entry-common.c | 23 ++
arch/arm64/kernel/idreg-override.c | 2 +
arch/arm64/kernel/process.c | 77 +++++
arch/arm64/kernel/ptrace.c | 50 +++
arch/arm64/kernel/signal.c | 240 +++++++++++++-
arch/arm64/kernel/traps.c | 11 +
arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h | 17 +
arch/arm64/kvm/sys_regs.c | 22 ++
arch/arm64/mm/Makefile | 1 +
arch/arm64/mm/fault.c | 75 ++++-
arch/arm64/mm/gcs.c | 202 ++++++++++++
arch/arm64/mm/mmap.c | 17 +-
arch/arm64/tools/cpucaps | 1 +
arch/arm64/tools/sysreg | 55 ++++
fs/proc/task_mmu.c | 3 +
include/linux/mm.h | 15 +-
include/linux/syscalls.h | 1 +
include/uapi/asm-generic/unistd.h | 5 +-
include/uapi/linux/elf.h | 1 +
include/uapi/linux/prctl.h | 19 ++
kernel/sys.c | 20 ++
kernel/sys_ni.c | 1 +
tools/testing/selftests/arm64/Makefile | 2 +-
tools/testing/selftests/arm64/abi/hwcap.c | 19 ++
tools/testing/selftests/arm64/fp/assembler.h | 15 +
tools/testing/selftests/arm64/fp/fpsimd-test.S | 2 +
tools/testing/selftests/arm64/fp/sve-test.S | 2 +
tools/testing/selftests/arm64/fp/za-test.S | 2 +
tools/testing/selftests/arm64/fp/zt-test.S | 2 +
tools/testing/selftests/arm64/gcs/.gitignore | 2 +
tools/testing/selftests/arm64/gcs/Makefile | 19 ++
tools/testing/selftests/arm64/gcs/basic-gcs.c | 350 +++++++++++++++++++++
tools/testing/selftests/arm64/gcs/gcs-util.h | 65 ++++
tools/testing/selftests/arm64/gcs/libc-gcs.c | 217 +++++++++++++
tools/testing/selftests/arm64/signal/.gitignore | 1 +
.../testing/selftests/arm64/signal/test_signals.c | 17 +-
.../testing/selftests/arm64/signal/test_signals.h | 6 +
.../selftests/arm64/signal/test_signals_utils.c | 32 +-
.../selftests/arm64/signal/test_signals_utils.h | 39 +++
.../arm64/signal/testcases/gcs_exception_fault.c | 59 ++++
.../selftests/arm64/signal/testcases/gcs_frame.c | 78 +++++
.../arm64/signal/testcases/gcs_write_fault.c | 67 ++++
.../selftests/arm64/signal/testcases/testcases.c | 7 +
.../selftests/arm64/signal/testcases/testcases.h | 1 +
67 files changed, 2363 insertions(+), 32 deletions(-)
---
base-commit: 023ee2d672f3d7c2d15acf62bcfc4bc49c3677e5
change-id: 20230303-arm64-gcs-e311ab0d8729
Best regards,
--
Mark Brown <broonie(a)kernel.org>
We have some KUnit tests for ASoC but they're not being run as much as
they should be since ASoC isn't enabled in the configs used by default
with KUnit and in the case of the topology tests there is no way to
enable them without enabling drivers that use them. This series
provides a Kconfig option which KUnit can use directly rather than worry
about drivers.
Further, since KUnit is typically run in UML but ALSA prevents build
with UML we need to remove that Kconfig conflict. As far as I can tell
the motiviation for this is that many ALSA drivers use iomem APIs which
are not available under UML and it's more trouble than it's worth to go
through and add per driver dependencies. In order to avoid these issues
we also provide stubs for these APIs so there are no build time issues
if a driver relies on iomem but does not depend on it. With these stubs
I am able to build all the sound drivers available in a UML defconfig
(UML allmodconfig appears to have substantial other issues in a quick
test).
With this series I am able to run the topology KUnit tests as part of a
kunit --alltests run.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Changes in v2:
- Add support for building ALSA with UML.
- Link to v1: https://lore.kernel.org/r/20230712-asoc-topology-kunit-enable-v1-0-b9f2da9d…
---
Mark Brown (5):
driver core: Provide stubs for !IOMEM builds
platform: Provide stubs for !HAS_IOMEM builds
ALSA: Enable build with UML
kunit: Enable ASoC in all_tests.config
ASoC: topology: Add explicit build option
include/linux/device.h | 26 ++++++++++++++++++++++++++
include/linux/platform_device.h | 28 ++++++++++++++++++++++++++++
sound/Kconfig | 4 ----
sound/soc/Kconfig | 11 +++++++++++
tools/testing/kunit/configs/all_tests.config | 5 +++++
5 files changed, 70 insertions(+), 4 deletions(-)
---
base-commit: 06c2afb862f9da8dc5efa4b6076a0e48c3fbaaa5
change-id: 20230701-asoc-topology-kunit-enable-5e8dd50d0ed7
Best regards,
--
Mark Brown <broonie(a)kernel.org>
We have some KUnit tests for ASoC but they're not being run as much as
they should be since ASoC isn't enabled in the configs used by default
with KUnit and in the case of the topolofy tests there is no way to
enable them without enabling drivers that use them. Let's improve that.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Mark Brown (2):
kunit: Enable ASoC in all_tests.config
ASoC: topology: Add explicit build option
sound/soc/Kconfig | 11 +++++++++++
tools/testing/kunit/configs/all_tests.config | 5 +++++
2 files changed, 16 insertions(+)
---
base-commit: 06c2afb862f9da8dc5efa4b6076a0e48c3fbaaa5
change-id: 20230701-asoc-topology-kunit-enable-5e8dd50d0ed7
Best regards,
--
Mark Brown <broonie(a)kernel.org>
In order to select a nexthop for multipath routes, fib_select_multipath()
is used with legacy nexthops and nexthop_select_path_hthr() is used with
nexthop objects. Those two functions perform a validity test on the
neighbor related to each nexthop but their logic is structured differently.
This causes a divergence in behavior and nexthop_select_path_hthr() may
return a nexthop that failed the neighbor validity test even if there was
one that passed.
Refactor nexthop_select_path_hthr() to make it more similar to
fib_select_multipath() and fix the problem mentioned above.
Benjamin Poirier (4):
nexthop: Factor out hash threshold fdb nexthop selection
nexthop: Factor out neighbor validity check
nexthop: Do not return invalid nexthop object during multipath
selection
selftests: net: Add test cases for nexthop groups with invalid
neighbors
net/ipv4/nexthop.c | 64 +++++++---
tools/testing/selftests/net/fib_nexthops.sh | 129 ++++++++++++++++++++
2 files changed, 174 insertions(+), 19 deletions(-)
--
2.40.1
The current selftests infrastructure formats the results in TAP 13. This
version doesn't support subtests and only the end result of each
selftest is taken into account. It means that a single issue in a
subtest of a selftest containing multiple subtests forces the whole
selftest to be marked as failed. It also means that subtests results are
not tracked by CI executing selftests.
MPTCP selftests run hundreds of various subtests. It is then important
to track each of them and not one result per selftest.
It is particularly interesting to do that when validating stable kernels
with the last version of the test suite: tests might fail because a
feature is not supported but the test didn't skip that part. In this
case, if subtests are not tracked, the whole selftest will be marked as
failed making the other subtests useless because their results are
ignored.
Regarding this patch set:
- The two first patches modify connect and userspace_pm selftests to
continue executing other tests if there is an error before the end.
This is what is done in the other MPTCP selftests.
- Patches 3-5 are refactoring the code in userspace_pm selftest to
reduce duplicated code, suppress some shellcheck warnings and prepare
subtests' support by using new helpers.
- Patch 6 adds new helpers in mptcp_lib.sh to easily support printing
the subtests results in the different MPTCP selftests.
- Patch 7-13 format subtests results in TAP 13 in the different MPTCP
selftests.
Signed-off-by: Matthieu Baerts <matthieu.baerts(a)tessares.net>
---
Matthieu Baerts (13):
selftests: mptcp: connect: don't stop if error
selftests: mptcp: userspace pm: don't stop if error
selftests: mptcp: userspace_pm: fix shellcheck warnings
selftests: mptcp: userspace_pm: uniform results printing
selftests: mptcp: userspace_pm: reduce dup code around printf
selftests: mptcp: lib: format subtests results in TAP
selftests: mptcp: connect: format subtests results in TAP
selftests: mptcp: pm_netlink: format subtests results in TAP
selftests: mptcp: join: format subtests results in TAP
selftests: mptcp: diag: format subtests results in TAP
selftests: mptcp: simult flows: format subtests results in TAP
selftests: mptcp: sockopt: format subtests results in TAP
selftests: mptcp: userspace_pm: format subtests results in TAP
tools/testing/selftests/net/mptcp/diag.sh | 7 +
tools/testing/selftests/net/mptcp/mptcp_connect.sh | 66 ++++++--
tools/testing/selftests/net/mptcp/mptcp_join.sh | 37 ++++-
tools/testing/selftests/net/mptcp/mptcp_lib.sh | 66 ++++++++
tools/testing/selftests/net/mptcp/mptcp_sockopt.sh | 20 ++-
tools/testing/selftests/net/mptcp/pm_netlink.sh | 6 +-
tools/testing/selftests/net/mptcp/simult_flows.sh | 4 +
tools/testing/selftests/net/mptcp/userspace_pm.sh | 181 +++++++++++++--------
8 files changed, 298 insertions(+), 89 deletions(-)
---
base-commit: 60cc1f7d0605598b47ee3c0c2b4b6fbd4da50a06
change-id: 20230712-upstream-net-next-20230712-selftests-mptcp-subtests-25d250d77886
Best regards,
--
Matthieu Baerts <matthieu.baerts(a)tessares.net>
Hello everyone,
This is an RFC patch series to propose the addition of a test attributes
framework to KUnit.
There has been interest in filtering out "slow" KUnit tests. Most notably,
a new config, CONFIG_MEMCPY_SLOW_KUNIT_TEST, has been added to exclude a
particularly slow memcpy test
(https://lore.kernel.org/all/20230118200653.give.574-kees@kernel.org/).
This proposed attributes framework would be used to save and access test
associated data, including whether a test is slow. These attributes would
be reportable (via KTAP and command line output) and some will be
filterable.
This framework is designed to allow for the addition of other attributes in
the future. These attributes could include whether the test is flaky,
associated test files, etc.
This is the second version of the RFC I have added a few big changes:
- Change method for inputting filters to allow for spaces in filtering
values
- Add option to skip filtered tests instead of not run or show them with
the --filter_skip flag
- Separate the new feature to list tests and their attributes into both
--list_tests (lists just tests) and --list_tests_attr (lists all)
- Add new attribute to store module name associated with test
- Add Tests to executor_test.c
- Add Documentation
- A few small changes to code commented on previously
I would love to hear about the new features. If the series seems overall
good I will send out the next version as an official patch series.
Thanks!
Rae
Rae Moar (9):
kunit: Add test attributes API structure
kunit: Add speed attribute
kunit: Add module attribute
kunit: Add ability to filter attributes
kunit: tool: Add command line interface to filter and report
attributes
kunit: memcpy: Mark tests as slow using test attributes
kunit: time: Mark test as slow using test attributes
kunit: add tests for filtering attributes
kunit: Add documentation of KUnit test attributes
.../dev-tools/kunit/running_tips.rst | 163 +++++++
include/kunit/attributes.h | 50 +++
include/kunit/test.h | 68 ++-
kernel/time/time_test.c | 2 +-
lib/Kconfig.debug | 3 +
lib/kunit/Makefile | 3 +-
lib/kunit/attributes.c | 406 ++++++++++++++++++
lib/kunit/executor.c | 115 ++++-
lib/kunit/executor_test.c | 119 ++++-
lib/kunit/kunit-example-test.c | 9 +
lib/kunit/test.c | 27 +-
lib/memcpy_kunit.c | 8 +-
tools/testing/kunit/kunit.py | 80 +++-
tools/testing/kunit/kunit_kernel.py | 6 +-
tools/testing/kunit/kunit_tool_test.py | 39 +-
15 files changed, 1022 insertions(+), 76 deletions(-)
create mode 100644 include/kunit/attributes.h
create mode 100644 lib/kunit/attributes.c
base-commit: 2e66833579ed759d7b7da1a8f07eb727ec6e80db
--
2.41.0.255.g8b1d071c50-goog
This short series is a follow up to the recent series [1] which added
per-cpu insert/delete statistics for maps. The bpf_map_sum_elem_count
kfunc presented in the original series was only available to tracing
programs, so let's make it available to all.
The first patch allows to treat CONST_PTR_TO_MAP as trusted pointers
from kfunc's point of view.
The second patch just adds const to the map argument of the
bpf_map_sum_elem_count kfunc.
The third patch registers the bpf_map_sum_elem_count for all programs,
and patches selftests correspondingly.
Anton Protopopov (3):
bpf: consider CONST_PTR_TO_MAP as trusted pointer to struct bpf_map
bpf: make an argument const in the bpf_map_sum_elem_count kfunc
bpf: allow any program to use the bpf_map_sum_elem_count kfunc
include/linux/btf_ids.h | 1 +
kernel/bpf/map_iter.c | 7 +++----
kernel/bpf/verifier.c | 5 ++++-
tools/testing/selftests/bpf/progs/map_ptr_kern.c | 5 +++++
4 files changed, 13 insertions(+), 5 deletions(-)
--
2.34.1
Hi,
This patch series aims to improve the PMU event filter settings with a cleaner
and more organized structure and adds several test cases related to PMU event
filters.
These changes help to ensure that KVM's PMU event filter functions as expected
in all supported use cases.
Any feedback or suggestions are greatly appreciated.
Sincerely,
Jinrong Liang
Changes log:
v4:
- Rebased to 88bb466c9dec(tag: kvm-x86-next-2023.06.22);
- Add a patch to add macros for fixed counters in processor.h;
- Add a patch to drop the return of remove_event(); (Sean)
- Reverse xmas tree; (Sean)
- Optimize code style and comments; (Sean)
Previous:
https://lore.kernel.org/kvm/20230607123700.40229-1-cloudliang@tencent.com/T
Jinrong Liang (6):
KVM: selftests: Add macros for fixed counters in processor.h
KVM: selftests: Drop the return of remove_event()
KVM: selftests: Introduce __kvm_pmu_event_filter to improved event
filter settings
KVM: selftests: Add test cases for unsupported PMU event filter input
values
KVM: selftests: Test if event filter meets expectations on fixed
counters
KVM: selftests: Test gp event filters don't affect fixed event filters
.../selftests/kvm/include/x86_64/processor.h | 2 +
.../kvm/x86_64/pmu_event_filter_test.c | 314 ++++++++++++------
2 files changed, 222 insertions(+), 94 deletions(-)
base-commit: 88bb466c9dec4f70d682cf38c685324e7b1b3d60
--
2.39.3
When looking for something else in LKFT reports [1], I noticed that the
TC selftest ended with a timeout error:
not ok 1 selftests: tc-testing: tdc.sh # TIMEOUT 45 seconds
I also noticed most of the tests were skipped because the "teardown
stage" did not complete successfully. It was due to missing kconfig.
These patches fix these two errors plus an extra one because this
selftest reads info from "/proc/net/nf_conntrack". Thank you Pedro for
having helped me fixing these issues [2].
Link: https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20230711/te… [1]
Link: https://lore.kernel.org/netdev/0e061d4a-9a23-9f58-3b35-d8919de332d7@tessare… [2]
Signed-off-by: Matthieu Baerts <matthieu.baerts(a)tessares.net>
---
Matthieu Baerts (3):
selftests: tc: set timeout to 15 minutes
selftests: tc: add 'ct' action kconfig dep
selftests: tc: add ConnTrack procfs kconfig
tools/testing/selftests/tc-testing/config | 2 ++
tools/testing/selftests/tc-testing/settings | 1 +
2 files changed, 3 insertions(+)
---
base-commit: 9d23aac8a85f69239e585c8656c6fdb21be65695
change-id: 20230713-tc-selftests-lkft-363e4590f105
Best regards,
--
Matthieu Baerts <matthieu.baerts(a)tessares.net>
Hi Willy, Zhangjin,
after your recent discussions about the test output and report I
wondered if it would make sense to switch nolibc-test to KTAP output
format [0].
With this it would be possible to have a wrapper script run each
architecture test as its own test subcomponent.
A (K)TAP parser/runner could then directly recognize and report failing
testcases, making it easier to validate.
Also maybe we can hook it up into the regular kselftests setup and have
the bots run it as part of that.
The kernel even includes a header-only library to implement the format [1].
It also should be fairly easy to emit the format without a library.
Thomas
[0] Documentation/dev-tools/ktap.rst
[1] Documentation/dev-tools/kselftest.rst (Test harness)
Willy, Thomas
We just sent the 'selftests/nolibc: allow run with minimal kernel
config' series [1], Here is the 'minimal' kernel config support, with
both of them, it is possible to run nolibc-test for all architectures
with oneline command and in less than ~30 minutes - 1 hour (not fullly
measured yet):
// run with tiny config + qemu-system
// Note: rv32 and loongarch require to download the bios at first
$ time make run-tiny-all QUIET_RUN=1
// run with default config + qemu-system
$ time make run-default-all QUIET_RUN=1
// run with qemu-user
$ time make run-user-all QUIET_RUN=1
Besides the 'tinyconfig' suggestion from Thomas, this patch also merge
the generic part of my local powerpc porting (the extconfig to add
additional console support).
This is applied after the test report patchset [2] and the rv32 compile
patchset [3], because all of them touched the same Makefile.
Even without the 'selftests/nolibc: allow run with minimal kernel
config' series [1], all of the tests can pass except the /proc/self/net
related ones (We haven't enable CONFIG_NET in this patchset), the
chmod_net one will be removed by Thomas from this patchset [4] for the
wrong chmodable attribute issue of /proc/self/net, the link_cross one
can be simply fixed up by using another /proc/self interface (like
/proc/self/cmdline), which will be covered in our revision of the [1]
series.
Beside the core 'minimal' config support, some generic patch are added
together to avoid patch conflicts.
* selftests/nolibc: add test for -include /path/to/nolibc.h
Add a test switch to allow run nolibc-test with nolibc.h
* selftests/nolibc: print result to the screen too
Let the run targets print results by default, allow disable by
QUIET_RUN=1
* selftests/nolibc: allow use x86_64 toolchain for i386
Allow use x86_64 toolchains for i386
* selftests/nolibc: add menuconfig target for manual config
a new 'menuconfig' target added for development and debugging
* selftests/nolibc: add tinyconfig target
a new 'tinyconfig' compare to 'defconfig', smaller and faster, but not
enough for boot and print, require following 'extconfig' target
* selftests/nolibc: allow customize extra kernel config options
a new 'extconfig' allows to add extra config options for 'defconfig'
and 'tinyconfig'
* selftests/nolibc: add common extra config options
selftests/nolibc: add power reset control support
selftests/nolibc: add procfs, shmem and tmpfs
Add common extra configs, the 3rd one (procfs, shmem and tmpfs) can be
completely reverted after [1] series, but as discuss with Thomas,
procfs may be still a hard requirement.
* selftests/nolibc: add extra configs for i386
selftests/nolibc: add extra configs for x86_64
selftests/nolibc: add extra configs for arm64
selftests/nolibc: add extra configs for arm
selftests/nolibc: add extra configs for mips
selftests/nolibc: add extra configs for riscv32
selftests/nolibc: add extra configs for riscv64
selftests/nolibc: add extra configs for s390x
selftests/nolibc: add extra configs for loongarch
Add architecture specific extra configs to let kernel boot and
nolibc-test print. The rv32 added here is only for test, it should not
be merged before the missing 64bit syscalls are added (still wait for
the merging of the __sysret and -ENOSYS patches).
* selftests/nolibc: config default CROSS_COMPILE
selftests/nolibc: add run-tiny and run-default
both run-tiny and run-default are added to do config and run together,
this easier test a log.
* selftests/nolibc: allow run tests on all targets
selftests/nolibc: detect bios existing to avoid hang
Further allow do run-user, run-tiny and run-default for all
architectures at once, the -all suffix is added to do so.
Since some generic patches are still in review, before sending the left
rv32 patches, I'm will send more generic patches later, the coming one
is arch-xxx.h cleanup, and then, the 32bit powerpc porting support.
For the compile speedup, the next step may be add architecture specific
'O' support, which may allow us rerun across architectures without
mrproper, for a single architecture development, this 'minimal' config
should be enough ;-)
Thanks.
Best regards,
Zhangjin
---
[1]: https://lore.kernel.org/lkml/cover.1687344643.git.falcon@tinylab.org/
[2]: https://lore.kernel.org/lkml/cover.1687156559.git.falcon@tinylab.org/
[3]: https://lore.kernel.org/linux-riscv/cover.1687176996.git.falcon@tinylab.org/
[4]: https://lore.kernel.org/lkml/20230624-proc-net-setattr-v1-0-73176812adee@we…
Zhangjin Wu (22):
selftests/nolibc: add test for -include /path/to/nolibc.h
selftests/nolibc: print result to the screen too
selftests/nolibc: allow use x86_64 toolchain for i386
selftests/nolibc: add menuconfig target for manual config
selftests/nolibc: add tinyconfig target
selftests/nolibc: allow customize extra kernel config options
selftests/nolibc: add common extra config options
selftests/nolibc: add power reset control support
selftests/nolibc: add procfs, shmem and tmpfs
selftests/nolibc: add extra configs for i386
selftests/nolibc: add extra configs for x86_64
selftests/nolibc: add extra configs for arm64
selftests/nolibc: add extra configs for arm
selftests/nolibc: add extra configs for mips
selftests/nolibc: add extra configs for riscv32
selftests/nolibc: add extra configs for riscv64
selftests/nolibc: add extra configs for s390x
selftests/nolibc: add extra configs for loongarch
selftests/nolibc: config default CROSS_COMPILE
selftests/nolibc: add run-tiny and run-default
selftests/nolibc: allow run tests on all targets
selftests/nolibc: detect bios existing to avoid hang
tools/testing/selftests/nolibc/Makefile | 125 ++++++++++++++++++++++--
1 file changed, 119 insertions(+), 6 deletions(-)
--
2.25.1
Hi Linus,
Please pull the following Kselftest fixes update for Linux 6.5-rc3.
This Kselftest fixes update for Linux 6.5-rc3 consists of fixes to
bugs that are interfering with arm64 and riscv workflows. This update
also includes two fixes to timer and mincore tests that are causing
test failures.
diff is attached.
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit 06c2afb862f9da8dc5efa4b6076a0e48c3fbaaa5:
Linux 6.5-rc1 (2023-07-09 13:53:13 -0700)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest tags/linux-kselftest-fixes-6.5-rc3
for you to fetch changes up to 569f8b501b177f21121d483a96491716ab8905f4:
selftests/arm64: fix build failure during the "emit_tests" step (2023-07-14 12:33:35 -0600)
----------------------------------------------------------------
linux-kselftest-fixes-6.5-rc3
This Kselftest fixes update for Linux 6.5-rc3 consists of fixes to
bugs that are interfering with arm64 and riscv workflows. This update
also includes two fixes to timer and mincore tests that are causing
test failures.
----------------------------------------------------------------
John Hubbard (2):
selftests/riscv: fix potential build failure during the "emit_tests" step
selftests/arm64: fix build failure during the "emit_tests" step
Minjie Du (1):
tools: timers: fix freq average calculation
Ricardo Cañuelo (1):
selftests/mincore: fix skip condition for check_huge_pages test
tools/testing/selftests/arm64/Makefile | 2 +-
tools/testing/selftests/mincore/mincore_selftest.c | 4 ++--
tools/testing/selftests/riscv/Makefile | 2 +-
tools/testing/selftests/timers/raw_skew.c | 3 +--
4 files changed, 5 insertions(+), 6 deletions(-)
----------------------------------------------------------------
*Changes in v25*:
- Do proper filtering on hole as well (hole got missed earlier)
*Changes in v24*:
- Rebase on top of next-20230710
- Place WP markers in case of hole as well
*Changes in v23*:
- Set vec_buf_index in loop only when vec_buf_index is set
- Return -EFAULT instead of -EINVAL if vec is NULL
- Correctly return the walk ending address to the page granularity
*Changes in v22*:
- Interface change:
- Replace [start start + len) with [start, end)
- Return the ending address of the address walk in start
*Changes in v21*:
- Abort walk instead of returning error if WP is to be performed on
partial hugetlb
*Changes in v20*
- Correct PAGE_IS_FILE and add PAGE_IS_PFNZERO
*Changes in v19*
- Minor changes and interface updates
*Changes in v18*
- Rebase on top of next-20230613
- Minor updates
*Changes in v17*
- Rebase on top of next-20230606
- Minor improvements in PAGEMAP_SCAN IOCTL patch
*Changes in v16*
- Fix a corner case
- Add exclusive PM_SCAN_OP_WP back
*Changes in v15*
- Build fix (Add missed build fix in RESEND)
*Changes in v14*
- Fix build error caused by #ifdef added at last minute in some configs
*Changes in v13*
- Rebase on top of next-20230414
- Give-up on using uffd_wp_range() and write new helpers, flush tlb only
once
*Changes in v12*
- Update and other memory types to UFFD_FEATURE_WP_ASYNC
- Rebaase on top of next-20230406
- Review updates
*Changes in v11*
- Rebase on top of next-20230307
- Base patches on UFFD_FEATURE_WP_UNPOPULATED
- Do a lot of cosmetic changes and review updates
- Remove ENGAGE_WP + !GET operation as it can be performed with
UFFDIO_WRITEPROTECT
*Changes in v10*
- Add specific condition to return error if hugetlb is used with wp
async
- Move changes in tools/include/uapi/linux/fs.h to separate patch
- Add documentation
*Changes in v9:*
- Correct fault resolution for userfaultfd wp async
- Fix build warnings and errors which were happening on some configs
- Simplify pagemap ioctl's code
*Changes in v8:*
- Update uffd async wp implementation
- Improve PAGEMAP_IOCTL implementation
*Changes in v7:*
- Add uffd wp async
- Update the IOCTL to use uffd under the hood instead of soft-dirty
flags
*Motivation*
The real motivation for adding PAGEMAP_SCAN IOCTL is to emulate Windows
GetWriteWatch() syscall [1]. The GetWriteWatch{} retrieves the addresses of
the pages that are written to in a region of virtual memory.
This syscall is used in Windows applications and games etc. This syscall is
being emulated in pretty slow manner in userspace. Our purpose is to
enhance the kernel such that we translate it efficiently in a better way.
Currently some out of tree hack patches are being used to efficiently
emulate it in some kernels. We intend to replace those with these patches.
So the whole gaming on Linux can effectively get benefit from this. It
means there would be tons of users of this code.
CRIU use case [2] was mentioned by Andrei and Danylo:
> Use cases for migrating sparse VMAs are binaries sanitized with ASAN,
> MSAN or TSAN [3]. All of these sanitizers produce sparse mappings of
> shadow memory [4]. Being able to migrate such binaries allows to highly
> reduce the amount of work needed to identify and fix post-migration
> crashes, which happen constantly.
Andrei's defines the following uses of this code:
* it is more granular and allows us to track changed pages more
effectively. The current interface can clear dirty bits for the entire
process only. In addition, reading info about pages is a separate
operation. It means we must freeze the process to read information
about all its pages, reset dirty bits, only then we can start dumping
pages. The information about pages becomes more and more outdated,
while we are processing pages. The new interface solves both these
downsides. First, it allows us to read pte bits and clear the
soft-dirty bit atomically. It means that CRIU will not need to freeze
processes to pre-dump their memory. Second, it clears soft-dirty bits
for a specified region of memory. It means CRIU will have actual info
about pages to the moment of dumping them.
* The new interface has to be much faster because basic page filtering
is happening in the kernel. With the old interface, we have to read
pagemap for each page.
*Implementation Evolution (Short Summary)*
From the definition of GetWriteWatch(), we feel like kernel's soft-dirty
feature can be used under the hood with some additions like:
* reset soft-dirty flag for only a specific region of memory instead of
clearing the flag for the entire process
* get and clear soft-dirty flag for a specific region atomically
So we decided to use ioctl on pagemap file to read or/and reset soft-dirty
flag. But using soft-dirty flag, sometimes we get extra pages which weren't
even written. They had become soft-dirty because of VMA merging and
VM_SOFTDIRTY flag. This breaks the definition of GetWriteWatch(). We were
able to by-pass this short coming by ignoring VM_SOFTDIRTY until David
reported that mprotect etc messes up the soft-dirty flag while ignoring
VM_SOFTDIRTY [5]. This wasn't happening until [6] got introduced. We
discussed if we can revert these patches. But we could not reach to any
conclusion. So at this point, I made couple of tries to solve this whole
VM_SOFTDIRTY issue by correcting the soft-dirty implementation:
* [7] Correct the bug fixed wrongly back in 2014. It had potential to cause
regression. We left it behind.
* [8] Keep a list of soft-dirty part of a VMA across splits and merges. I
got the reply don't increase the size of the VMA by 8 bytes.
At this point, we left soft-dirty considering it is too much delicate and
userfaultfd [9] seemed like the only way forward. From there onward, we
have been basing soft-dirty emulation on userfaultfd wp feature where
kernel resolves the faults itself when WP_ASYNC feature is used. It was
straight forward to add WP_ASYNC feature in userfautlfd. Now we get only
those pages dirty or written-to which are really written in reality. (PS
There is another WP_UNPOPULATED userfautfd feature is required which is
needed to avoid pre-faulting memory before write-protecting [9].)
All the different masks were added on the request of CRIU devs to create
interface more generic and better.
[1] https://learn.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-…
[2] https://lore.kernel.org/all/20221014134802.1361436-1-mdanylo@google.com
[3] https://github.com/google/sanitizers
[4] https://github.com/google/sanitizers/wiki/AddressSanitizerAlgorithm#64-bit
[5] https://lore.kernel.org/all/bfcae708-db21-04b4-0bbe-712badd03071@redhat.com
[6] https://lore.kernel.org/all/20220725142048.30450-1-peterx@redhat.com/
[7] https://lore.kernel.org/all/20221122115007.2787017-1-usama.anjum@collabora.…
[8] https://lore.kernel.org/all/20221220162606.1595355-1-usama.anjum@collabora.…
[9] https://lore.kernel.org/all/20230306213925.617814-1-peterx@redhat.com
[10] https://lore.kernel.org/all/20230125144529.1630917-1-mdanylo@google.com
* Original Cover letter from v8*
Hello,
Note:
Soft-dirty pages and pages which have been written-to are synonyms. As
kernel already has soft-dirty feature inside which we have given up to
use, we are using written-to terminology while using UFFD async WP under
the hood.
This IOCTL, PAGEMAP_SCAN on pagemap file can be used to get and/or clear
the info about page table entries. The following operations are
supported in this ioctl:
- Get the information if the pages have been written-to (PAGE_IS_WRITTEN),
file mapped (PAGE_IS_FILE), present (PAGE_IS_PRESENT) or swapped
(PAGE_IS_SWAPPED).
- Write-protect the pages (PAGEMAP_WP_ENGAGE) to start finding which
pages have been written-to.
- Find pages which have been written-to and write protect the pages
(atomic PAGE_IS_WRITTEN + PAGEMAP_WP_ENGAGE)
It is possible to find and clear soft-dirty pages entirely in userspace.
But it isn't efficient:
- The mprotect and SIGSEGV handler for bookkeeping
- The userfaultfd wp (synchronous) with the handler for bookkeeping
Some benchmarks can be seen here[1]. This series adds features that weren't
present earlier:
- There is no atomic get soft-dirty/Written-to status and clear present in
the kernel.
- The pages which have been written-to can not be found in accurate way.
(Kernel's soft-dirty PTE bit + sof_dirty VMA bit shows more soft-dirty
pages than there actually are.)
Historically, soft-dirty PTE bit tracking has been used in the CRIU
project. The procfs interface is enough for finding the soft-dirty bit
status and clearing the soft-dirty bit of all the pages of a process.
We have the use case where we need to track the soft-dirty PTE bit for
only specific pages on-demand. We need this tracking and clear mechanism
of a region of memory while the process is running to emulate the
getWriteWatch() syscall of Windows.
*(Moved to using UFFD instead of soft-dirtyi feature to find pages which
have been written-to from v7 patch series)*:
Stop using the soft-dirty flags for finding which pages have been
written to. It is too delicate and wrong as it shows more soft-dirty
pages than the actual soft-dirty pages. There is no interest in
correcting it [2][3] as this is how the feature was written years ago.
It shouldn't be updated to changed behaviour. Peter Xu has suggested
using the async version of the UFFD WP [4] as it is based inherently
on the PTEs.
So in this patch series, I've added a new mode to the UFFD which is
asynchronous version of the write protect. When this variant of the
UFFD WP is used, the page faults are resolved automatically by the
kernel. The pages which have been written-to can be found by reading
pagemap file (!PM_UFFD_WP). This feature can be used successfully to
find which pages have been written to from the time the pages were
write protected. This works just like the soft-dirty flag without
showing any extra pages which aren't soft-dirty in reality.
The information related to pages if the page is file mapped, present and
swapped is required for the CRIU project [5][6]. The addition of the
required mask, any mask, excluded mask and return masks are also required
for the CRIU project [5].
The IOCTL returns the addresses of the pages which match the specific
masks. The page addresses are returned in struct page_region in a compact
form. The max_pages is needed to support a use case where user only wants
to get a specific number of pages. So there is no need to find all the
pages of interest in the range when max_pages is specified. The IOCTL
returns when the maximum number of the pages are found. The max_pages is
optional. If max_pages is specified, it must be equal or greater than the
vec_size. This restriction is needed to handle worse case when one
page_region only contains info of one page and it cannot be compacted.
This is needed to emulate the Windows getWriteWatch() syscall.
The patch series include the detailed selftest which can be used as an
example for the uffd async wp test and PAGEMAP_IOCTL. It shows the
interface usages as well.
[1] https://lore.kernel.org/lkml/54d4c322-cd6e-eefd-b161-2af2b56aae24@collabora…
[2] https://lore.kernel.org/all/20221220162606.1595355-1-usama.anjum@collabora.…
[3] https://lore.kernel.org/all/20221122115007.2787017-1-usama.anjum@collabora.…
[4] https://lore.kernel.org/all/Y6Hc2d+7eTKs7AiH@x1n
[5] https://lore.kernel.org/all/YyiDg79flhWoMDZB@gmail.com/
[6] https://lore.kernel.org/all/20221014134802.1361436-1-mdanylo@google.com/
Regards,
Muhammad Usama Anjum
Muhammad Usama Anjum (4):
fs/proc/task_mmu: Implement IOCTL to get and optionally clear info
about PTEs
tools headers UAPI: Update linux/fs.h with the kernel sources
mm/pagemap: add documentation of PAGEMAP_SCAN IOCTL
selftests: mm: add pagemap ioctl tests
Peter Xu (1):
userfaultfd: UFFD_FEATURE_WP_ASYNC
Documentation/admin-guide/mm/pagemap.rst | 58 +
Documentation/admin-guide/mm/userfaultfd.rst | 35 +
fs/proc/task_mmu.c | 591 +++++++
fs/userfaultfd.c | 26 +-
include/linux/hugetlb.h | 1 +
include/linux/userfaultfd_k.h | 21 +-
include/uapi/linux/fs.h | 55 +
include/uapi/linux/userfaultfd.h | 9 +-
mm/hugetlb.c | 34 +-
mm/memory.c | 27 +-
tools/include/uapi/linux/fs.h | 55 +
tools/testing/selftests/mm/.gitignore | 2 +
tools/testing/selftests/mm/Makefile | 3 +-
tools/testing/selftests/mm/config | 1 +
tools/testing/selftests/mm/pagemap_ioctl.c | 1464 ++++++++++++++++++
tools/testing/selftests/mm/run_vmtests.sh | 4 +
16 files changed, 2362 insertions(+), 24 deletions(-)
create mode 100644 tools/testing/selftests/mm/pagemap_ioctl.c
mode change 100644 => 100755 tools/testing/selftests/mm/run_vmtests.sh
--
2.39.2
kselftest.rst states that flags must be specified before including lib.mk,
but the vDSO selftest Makefile does not follow this order. As a result,
changes made by lib.mk to flags and other variables are overwritten by the
Makefile. For example, it is impossible to pass CFLAGS to the compiler via
make.
Rectify this by including lib.mk after assigning flag values.
Also change the paths of the generated programs from absolute to relative
paths as lib.mk will now correctly prepend the output directory path to
the program name as intended.
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Vincenzo Frascino <vincenzo.frascino(a)arm.com>
Signed-off-by: Aditya Deshpande <aditya.deshpande(a)arm.com>
---
tools/testing/selftests/vDSO/Makefile | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/tools/testing/selftests/vDSO/Makefile b/tools/testing/selftests/vDSO/Makefile
index d53a4d8008f9..19145210d044 100644
--- a/tools/testing/selftests/vDSO/Makefile
+++ b/tools/testing/selftests/vDSO/Makefile
@@ -1,16 +1,15 @@
# SPDX-License-Identifier: GPL-2.0
-include ../lib.mk
-
uname_M := $(shell uname -m 2>/dev/null || echo not)
ARCH ?= $(shell echo $(uname_M) | sed -e s/i.86/x86/ -e s/x86_64/x86/)
-TEST_GEN_PROGS := $(OUTPUT)/vdso_test_gettimeofday $(OUTPUT)/vdso_test_getcpu
-TEST_GEN_PROGS += $(OUTPUT)/vdso_test_abi
-TEST_GEN_PROGS += $(OUTPUT)/vdso_test_clock_getres
+TEST_GEN_PROGS := vdso_test_gettimeofday
+TEST_GEN_PROGS += vdso_test_getcpu
+TEST_GEN_PROGS += vdso_test_abi
+TEST_GEN_PROGS += vdso_test_clock_getres
ifeq ($(ARCH),$(filter $(ARCH),x86 x86_64))
-TEST_GEN_PROGS += $(OUTPUT)/vdso_standalone_test_x86
+TEST_GEN_PROGS += vdso_standalone_test_x86
endif
-TEST_GEN_PROGS += $(OUTPUT)/vdso_test_correctness
+TEST_GEN_PROGS += vdso_test_correctness
CFLAGS := -std=gnu99
CFLAGS_vdso_standalone_test_x86 := -nostdlib -fno-asynchronous-unwind-tables -fno-stack-protector
@@ -19,7 +18,8 @@ ifeq ($(CONFIG_X86_32),y)
LDLIBS += -lgcc_s
endif
-all: $(TEST_GEN_PROGS)
+include ../lib.mk
+
$(OUTPUT)/vdso_test_gettimeofday: parse_vdso.c vdso_test_gettimeofday.c
$(OUTPUT)/vdso_test_getcpu: parse_vdso.c vdso_test_getcpu.c
$(OUTPUT)/vdso_test_abi: parse_vdso.c vdso_test_abi.c
--
2.25.1
When running the rtctest if we pass wrong rtc device file as an argument
the test fails expectedly, but prints the logs that are not useful
to point out the issue.
To handle this, the patch adds a checks to verify if the rtc_file is valid.
Signed-off-by: Atul Kumar Pant <atulpant.linux(a)gmail.com>
---
changes since v3:
Added Linux-kselftest and Linux-kernel mailing lists.
changes since v2:
Changed error message when rtc file does not exist.
changes since v1:
Removed check for uid=0
If rtc file is invalid, then exit the test.
tools/testing/selftests/rtc/rtctest.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/rtc/rtctest.c b/tools/testing/selftests/rtc/rtctest.c
index 63ce02d1d5cc..630fef735c7e 100644
--- a/tools/testing/selftests/rtc/rtctest.c
+++ b/tools/testing/selftests/rtc/rtctest.c
@@ -17,6 +17,7 @@
#include <unistd.h>
#include "../kselftest_harness.h"
+#include "../kselftest.h"
#define NUM_UIE 3
#define ALARM_DELTA 3
@@ -419,6 +420,8 @@ __constructor_order_last(void)
int main(int argc, char **argv)
{
+ int ret = -1;
+
switch (argc) {
case 2:
rtc_file = argv[1];
@@ -430,5 +433,11 @@ int main(int argc, char **argv)
return 1;
}
- return test_harness_run(argc, argv);
+ // Run the test if rtc_file is valid
+ if (access(rtc_file, F_OK) == 0)
+ ret = test_harness_run(argc, argv);
+ else
+ ksft_exit_fail_msg("[ERROR]: Cannot access rtc file %s - Exiting\n", rtc_file);
+
+ return ret;
}
--
2.25.1
There are macros in kernel.h that can be used outside of that header.
Split them to args.h and replace open coded variants.
Test compiled with `make allmodconfig` for x86_64.
(Note that positive diff statistics is due to documentation being
updated.)
In v3:
- split to a series of patches
- fixed build issue on `make allmodconfig` for x86_64 (Andrew)
In v2:
- converted existing users at the same time (Andrew, Rasmus)
- documented how it does work (Andrew, Rasmus)
Andy Shevchenko (4):
kernel.h: Split out COUNT_ARGS() and CONCATENATE() to args.h
x86/asm: Replace custom COUNT_ARGS() & CONCATENATE() implementations
arm64: smccc: Replace custom COUNT_ARGS() & CONCATENATE()
implementations
genetlink: Replace custom CONCATENATE() implementation
arch/x86/include/asm/rmwcc.h | 11 +++--------
include/kunit/test.h | 1 +
include/linux/args.h | 28 ++++++++++++++++++++++++++++
include/linux/arm-smccc.h | 27 ++++++++++-----------------
include/linux/genl_magic_func.h | 27 ++++++++++++++-------------
include/linux/genl_magic_struct.h | 8 +++-----
include/linux/kernel.h | 7 -------
include/linux/pci.h | 2 +-
include/trace/bpf_probe.h | 2 ++
9 files changed, 62 insertions(+), 51 deletions(-)
create mode 100644 include/linux/args.h
--
2.40.0.1.gaa8946217a0b
Hi All,
Given my on-going work on large anon folios and contpte mappings, I decided it
would be a good idea to start running mm selftests to help guard against
regressions. However, it soon became clear that I couldn't get the suite to run
cleanly on arm64 with a vanilla v6.5-rc1 kernel (perhaps I'm just doing it
wrong??), so got stuck in a rabbit hole trying to debug and fix all the issues.
Some were down to misconfigurations, but I also found a number of issues with
the tests and even a couple of issues with the kernel.
This series aims to fix (most of) the test issues. It applies on top of
v6.5-rc1.
Reproducing
-----------
What follows is a write up of how I'm running the tests and the results I see
with this series applied. I don't yet have a concrete understanding of all of
the remaining failures. So if anyone has any comments on my setup or reasons for
the test failures it would be great to hear.
Source: v6.5-rc1 + this series + [1] + [2]. [1] is a patch from Florent Revest to
fix mdwe mmap_FIXED tests. [2] is a fix for a regression in the kernel that I
found by running `mlock-random-test` and `mlock2-tests`.
Compile the kernel (on arm64 system):
$ make defconfig
$ ./scripts/config --enable CONFIG_SQUASHFS_LZ4
$ ./scripts/config --enable CONFIG_SQUASHFS_LZO
$ ./scripts/config --enable CONFIG_SQUASHFS_XZ
$ ./scripts/config --enable CONFIG_SQUASHFS_ZSTD
$ ./scripts/config --enable CONFIG_XFS_FS
$ ./scripts/config --enable CONFIG_SYSVIPC
$ ./scripts/config --enable CONFIG_USERFAULTFD
$ ./scripts/config --enable CONFIG_TEST_VMALLOC
$ ./scripts/config --enable CONFIG_GUP_TEST
$ ./scripts/config --enable CONFIG_TRANSPARENT_HUGEPAGE
$ ./scripts/config --enable CONFIG_MEM_SOFT_DIRTY
$ make olddefconfig
$ make -s -j`nproc` Image
(In the above case, I'm building/testing a 4K kernel).
Note that it turns out that arm64 doesn't really support ZONE_DEVICE; Although
it defines ARCH_HAS_PTE_DEVMAP, it can't allocate `struct page`s for arbitrary
physical addresses. This means that the TEST_HMM module causes warnings to be
emitted when initializing because it tries to reserve arbitrary PA range then
requests struct page's for them. I haven't fully investigated this yet, but for
now, I'm just deliverately excluding ZONE_DEVICE, (which TEST_HMM depends upon).
This means that the `hmm-tests` selftest gets skipped at runtime.
Compile the tests:
$ make -j`nproc` headers_install
$ make -C tools/testing/selftests TARGETS=mm install INSTALL_PATH=<path/to/install>
Start a VM running the kernel we just compiled:
$ taskset -c 8-15 qemu-system-aarch64 \
-object memory-backend-file,id=mem0,size=6G,mem-path=/hugetlbfs,merge=off,prealloc=on,host-nodes=0,policy=bind,align=1G \
-object memory-backend-file,id=mem1,size=6G,mem-path=/hugetlbfs,merge=off,prealloc=on,host-nodes=0,policy=bind,align=1G \
-nographic -enable-kvm -machine virt,gic-version=3 -cpu max \
-smp 8 -m 12G \
-numa node,memdev=mem0,cpus=0-3,nodeid=0 \
-numa node,memdev=mem1,cpus=4-7,nodeid=1 \
-drive if=virtio,format=raw,file=ubuntu-22.04.xfs \
-object rng-random,filename=/dev/urandom,id=rng0 \
-device virtio-scsi-pci,id=scsi0 \
-netdev user,id=net0,hostfwd=tcp::8022-:22 \
-device virtio-rng-pci,rng=rng0 \
-device virtio-net-pci,netdev=net0 \
-kernel arch/arm64/boot/Image \
-append "earlycon root=/dev/vda2 secretmem.enable hugepagesz=1G hugepages=0:2,1:2 hugepagesz=32M hugepages=0:2,1:2 default_hugepagesz=2M hugepages=0:64,1:64 hugepagesz=64K hugepages=0:2,1:2"
This starts a VM with 2 numa nodes (needed by ksm and migration tests), with 6G
of memory and 4 CPUs on each node. The kernel command line enables secretmem
(needed for `memfd_secret` test), and preallocates a bunch of huge pages
(divined by reading the comments and source for a bunch of tests that require
huge pages). 128M of the default huge page size, and 4 pages of each of the
other sizes appear to be sufficient. I'm allocating half on each numa node.
Once booted, copy the selftests we just compiled onto it.
On the VM, run the tests:
$ cd path/to/selftests
$ sudo ./run_kselftest.sh
or alternatively:
$ cd path/to/selftests/mm
$ sudo ./run_vmtests.sh
Test Results
------------
TOP-LEVEL SUMMARY: PASS=42 SKIP=4 FAIL=2
Only showing nested tests if they are skipped or failed.
[PASS] hugepage-mmap
[PASS] hugepage-shm
[PASS] map_hugetlb
[PASS] hugepage-mremap
[PASS] hugepage-vmemmap
[PASS] hugetlb-madvise
[PASS] map_fixed_noreplace
[PASS] gup_test -u
[PASS] gup_test -a
[PASS] gup_test -ct -F 0x1 0 19 0x1000
[PASS] gup_longterm
[PASS] uffd-unit-tests
[PASS] uffd-stress anon 20 16
[PASS] uffd-stress hugetlb 128 32
[PASS] uffd-stress hugetlb-private 128 32
[PASS] uffd-stress shmem 20 16
[PASS] uffd-stress shmem-private 20 16
[PASS] compaction_test
[PASS] on-fault-limit
[PASS] map_populate
[PASS] mlock-random-test
[PASS] mlock2-tests
[PASS] mrelease_test
[PASS] mremap_test
[PASS] thuge-gen
[PASS] virtual_address_range
[SKIP] va_high_addr_switch.sh
# 4K kernel does not support big enough VA space for test
[SKIP] test_vmalloc.sh smoke
# Test requires test_vmalloc kernel module which isn't present
[PASS] mremap_dontunmap
[SKIP] test_hmm.sh smoke
# Test requires test_hmm kernel module - see ZONE_DEVICE issue above
[PASS] madv_populate
[PASS] test_softdirty
[SKIP] range is not softdirty
[SKIP] MADV_POPULATE_READ
[SKIP] range is not softdirty
[SKIP] MADV_POPULATE_WRITE
[SKIP] range is softdirty
# All skipped because arm64 does not support soft-dirty
[PASS] memfd_secret
[PASS] ksm_tests -M -p 10
[PASS] ksm_tests -U
[PASS] ksm_tests -Z -p 10 -z 0
[PASS] ksm_tests -Z -p 10 -z 1
[PASS] ksm_tests -N -m 1
[PASS] ksm_tests -N -m 0
[PASS] ksm_functional_tests
[SKIP] test_unmerge_uffd_wp
# UFFD_FEATURE_PAGEFAULT_FLAG_WP not available on arm64
[PASS] ksm_functional_tests
[SKIP] test_unmerge_uffd_wp
# UFFD_FEATURE_PAGEFAULT_FLAG_WP not available on arm64
[SKIP] soft-dirty
# Skipped because arm64 does not support soft-dirty
[FAIL] cow
[FAIL] vmsplice() + unmap in child ... with hugetlb
[FAIL] vmsplice() + unmap in child with mprotect() optimization ... with hugetlb
[FAIL] vmsplice() before fork(), unmap in parent after fork() ... with hugetlb
[FAIL] vmsplice() + unmap in parent after fork() ... with hugetlb
# Above are known issues for vmsplice + hugetlb
# Reproduces on x86
[SKIP] Basic COW after fork() ... with swapped-out, PTE-mapped THP
[SKIP] Basic COW after fork() with mprotect() optimization ... with swapped-out, PTE-mapped THP
[SKIP] vmsplice() + unmap in child ... with swapped-out, PTE-mapped THP
[SKIP] vmsplice() + unmap in child with mprotect() optimization ... with swapped-out, PTE-mapped THP
[SKIP] vmsplice() before fork(), unmap in parent after fork() ... with swapped-out, PTE-mapped THP
[SKIP] vmsplice() + unmap in parent after fork() ... with swapped-out, PTE-mapped THP
[SKIP] R/O-mapping a page registered as iouring fixed buffer ... with swapped-out, PTE-mapped THP
[SKIP] fork() with an iouring fixed buffer ... with swapped-out, PTE-mapped THP
[SKIP] R/O GUP pin on R/O-mapped shared page ... with swapped-out, PTE-mapped THP
[SKIP] R/O GUP-fast pin on R/O-mapped shared page ... with swapped-out, PTE-mapped THP
[SKIP] R/O GUP pin on R/O-mapped previously-shared page ... with swapped-out, PTE-mapped THP
[SKIP] R/O GUP-fast pin on R/O-mapped previously-shared page ... with swapped-out, PTE-mapped THP
[SKIP] R/O GUP pin on R/O-mapped exclusive page ... with swapped-out, PTE-mapped THP
[SKIP] R/O GUP-fast pin on R/O-mapped exclusive page ... with swapped-out, PTE-mapped THP
# Above all skipped due to "MADV_PAGEOUT did not work, is swap enabled?"
# swap is enabled though
# Reproduces on x86
[SKIP] Basic COW after fork() when collapsing after fork() (fully shared)
# MADV_COLLAPSE failed: Invalid argument
[PASS] khugepaged
[PASS] transhuge-stress -d 20
[PASS] split_huge_page_test
[FAIL] migration
[FAIL] migration.shared_anon
# move_pages() reports that the requested page was not migrated
# after a few iterations.
[PASS] mkdirty
[PASS] mdwe_test
[1] https://lore.kernel.org/lkml/20230704153630.1591122-3-revest@chromium.org/
[2] https://lore.kernel.org/linux-mm/20230711175020.4091336-1-Liam.Howlett@orac…
Thanks,
Ryan
Ryan Roberts (9):
selftests: Line buffer test program's stdout
selftests/mm: Give scripts execute permission
selftests/mm: Skip soft-dirty tests on arm64
selftests/mm: Enable mrelease_test for arm64
selftests/mm: Fix thuge-gen test bugs
selftests/mm: va_high_addr_switch should skip unsupported arm64
configs
selftests/mm: Make migration test robust to failure
selftests/mm: Optionally pass duration to transhuge-stress
selftests/mm: Run all tests from run_vmtests.sh
tools/testing/selftests/kselftest/runner.sh | 5 +-
tools/testing/selftests/mm/Makefile | 79 ++++++++++---------
.../selftests/mm/charge_reserved_hugetlb.sh | 0
tools/testing/selftests/mm/check_config.sh | 0
.../selftests/mm/hugetlb_reparenting_test.sh | 0
tools/testing/selftests/mm/madv_populate.c | 18 +++--
tools/testing/selftests/mm/migration.c | 14 +++-
tools/testing/selftests/mm/mrelease_test.c | 1 +
tools/testing/selftests/mm/run_vmtests.sh | 23 ++++++
tools/testing/selftests/mm/settings | 2 +-
tools/testing/selftests/mm/soft-dirty.c | 3 +
tools/testing/selftests/mm/test_hmm.sh | 0
tools/testing/selftests/mm/test_vmalloc.sh | 0
tools/testing/selftests/mm/thuge-gen.c | 4 +-
tools/testing/selftests/mm/transhuge-stress.c | 12 ++-
.../selftests/mm/va_high_addr_switch.c | 3 +-
.../selftests/mm/va_high_addr_switch.sh | 0
tools/testing/selftests/mm/vm_util.c | 17 ++++
tools/testing/selftests/mm/vm_util.h | 1 +
.../selftests/mm/write_hugetlb_memory.sh | 0
20 files changed, 127 insertions(+), 55 deletions(-)
mode change 100644 => 100755 tools/testing/selftests/mm/charge_reserved_hugetlb.sh
mode change 100644 => 100755 tools/testing/selftests/mm/check_config.sh
mode change 100644 => 100755 tools/testing/selftests/mm/hugetlb_reparenting_test.sh
mode change 100644 => 100755 tools/testing/selftests/mm/run_vmtests.sh
mode change 100644 => 100755 tools/testing/selftests/mm/test_hmm.sh
mode change 100644 => 100755 tools/testing/selftests/mm/test_vmalloc.sh
mode change 100644 => 100755 tools/testing/selftests/mm/va_high_addr_switch.sh
mode change 100644 => 100755 tools/testing/selftests/mm/write_hugetlb_memory.sh
--
2.25.1
Hi, Willy, Thomas
Thanks very much for your careful review and great suggestions, now, we
get v4 revision of the arch shrink series [1], it mainly include a new
fixup for -O0 under gcc < 11.1.0, the stackprotector support for
_start_c(), new testcases for startup code and two new test targets.
All of the tests passed or skipped (tinyconfig + few options +
qemu-system) for both -Os and -O0:
arch/board | result
------------|------------
arm/versatilepb | 165 test(s): 158 passed, 7 skipped, 0 failed => status: warning.
arm/vexpress-a9 | 165 test(s): 158 passed, 7 skipped, 0 failed => status: warning.
arm/virt | 165 test(s): 158 passed, 7 skipped, 0 failed => status: warning.
aarch64/virt | 165 test(s): 158 passed, 7 skipped, 0 failed => status: warning.
i386/pc | 165 test(s): 158 passed, 7 skipped, 0 failed => status: warning.
x86_64/pc | 165 test(s): 158 passed, 7 skipped, 0 failed => status: warning.
mipsel/malta | 165 test(s): 158 passed, 7 skipped, 0 failed => status: warning.
loongarch64/virt | 165 test(s): 158 passed, 7 skipped, 0 failed => status: warning.
riscv64/virt | 165 test(s): 158 passed, 7 skipped, 0 failed => status: warning.
s390x/s390-ccw-virtio | 165 test(s): 158 passed, 7 skipped, 0 failed => status: warning.
And more, for both -Os and -O0:
$ for r in run-user run-nolibc-test run-libc-test; do make clean > /dev/null; make $r | grep status; done
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
165 test(s): 153 passed, 12 skipped, 0 failed => status: warning
// for make run-user, the euid0 and 32bit limit related tests are
// skipped
$ make clean && make run-user
$ grep -i skip run.out
17 chroot_root [SKIPPED]
39 link_dir [SKIPPED]
62 limit_intptr_min_32 [SKIPPED]
63 limit_intptr_max_32 [SKIPPED]
64 limit_uintptr_max_32 [SKIPPED]
65 limit_ptrdiff_min_32 [SKIPPED]
66 limit_ptrdiff_max_32 [SKIPPED]
67 limit_size_max_32 [SKIPPED]
// for run-libc-test, the _auxv variables, euid0, 32bits limit and
// stackprotector related tests are skipped
$ make clean && make run-libc-test
$ grep -i skip run.out
9 environ_auxv [SKIPPED]
10 environ_total [SKIPPED]
12 auxv_addr [SKIPPED]
17 chroot_root [SKIPPED]
39 link_dir [SKIPPED]
62 limit_intptr_min_32 [SKIPPED]
63 limit_intptr_max_32 [SKIPPED]
64 limit_uintptr_max_32 [SKIPPED]
65 limit_ptrdiff_min_32 [SKIPPED]
66 limit_ptrdiff_max_32 [SKIPPED]
67 limit_size_max_32 [SKIPPED]
0 -fstackprotector not supported [SKIPPED]
$ make clean >/dev/null; make run-libc-test CC=/labs/linux-lab/src/examples/musl-install/bin/musl-gcc | grep status
165 test(s): 151 passed, 12 skipped, 2 failed => status: failure
// The failures are expected for musl has disabled both sbrk and brk
// but not the sbrk(0); the _auxv variables, euid0, 32bits limit and
// stackprotector related tests are skipped for musl too
$ grep FAIL -ur run.out
9 sbrk = 1 ENOMEM [FAIL]
10 brk = -1 ENOMEM [FAIL]
$ grep "SKIP" -ur run.out
9 environ_auxv [SKIPPED]
10 environ_total [SKIPPED]
12 auxv_addr [SKIPPED]
17 chroot_root [SKIPPED]
39 link_dir [SKIPPED]
62 limit_intptr_min_32 [SKIPPED]
63 limit_intptr_max_32 [SKIPPED]
64 limit_uintptr_max_32 [SKIPPED]
65 limit_ptrdiff_min_32 [SKIPPED]
66 limit_ptrdiff_max_32 [SKIPPED]
67 limit_size_max_32 [SKIPPED]
0 -fstackprotector not supported [SKIPPED]
For stackprotector, gcc 13.1.0 is used to test on x86_64 standalonely:
$ make run-user CROSS_COMPILE=x86_64-linux- | grep status
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
$ grep stack -ur run.out
0 -fstackprotector [OK]
$ make run-nolibc-test CROSS_COMPILE=x86_64-linux- | grep status
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
$ grep stack -ur run.out
0 -fstackprotector [OK]
Changes from v3 --> v4:
* tools/nolibc: arch-*.h: add missing space after ','
tools/nolibc: fix up startup failures for -O0 under gcc < 11.1.0
Both of the above changes are for _start, it is able to merge them
if necessary.
The first one is old for format errors reported by
scripts/checkpatch.pl
The second one is for -O0 failure under gcc < 11.1.0, applied the
optimize("-Os", "omit-frame-pointer") suggestion from Thomas.
* tools/nolibc: remove the old sys_stat support
As suggested by Willy, Document carefully about the statx supported
Linux version info.
* tools/nolibc: add new crt.h with _start_c
The code is polished carefully for smaller size and better
readability.
* tools/nolibc: stackprotector.h: add empty __stack_chk_init for !_NOLIBC_STACKPROTECTOR
tools/nolibc: crt.h: initialize stack protector
As suggested by Thomas, init stackprotector in _start_c() too.
* tools/nolibc: arm: shrink _start with _start_c
tools/nolibc: aarch64: shrink _start with _start_c
tools/nolibc: i386: shrink _start with _start_c
tools/nolibc: x86_64: shrink _start with _start_c
tools/nolibc: mips: shrink _start with _start_c
tools/nolibc: loongarch: shrink _start with _start_c
tools/nolibc: riscv: shrink _start with _start_c
tools/nolibc: s390: shrink _start with _start_c
Removed the stackprotector initialization from _start too, we
already have it in _start_c().
* selftests/nolibc: add EXPECT_PTRGE, EXPECT_PTRGT, EXPECT_PTRLE, EXPECT_PTRLT
selftests/nolibc: add testcases for startup code
Add a new startup test group to cover the testing of argc,
argv/argv0, envp/environ and _auxv.
Some testcases are enhanced, some are newly added from after the
discussion during v3 review.
* selftests/nolibc: allow run nolibc-test locally
selftests/nolibc: allow test -include /path/to/nolibc.h
Two new test targets are added to cover more scenes.
Hope you like this revisoin ;-)
Next patchset is powerpc & powerpc64 support, after that we will send
the v2 of tinyconfig support, at last the left rv32 patches (mainly
64bit time).
Best regards,
Zhangjin
---
[1]: https://lore.kernel.org/lkml/20230715100134.GD24086@1wt.eu/
Zhangjin Wu (18):
tools/nolibc: arch-*.h: add missing space after ','
tools/nolibc: fix up startup failures for -O0 under gcc < 11.1.0
tools/nolibc: remove the old sys_stat support
tools/nolibc: add new crt.h with _start_c
tools/nolibc: stackprotector.h: add empty __stack_chk_init for
!_NOLIBC_STACKPROTECTOR
tools/nolibc: crt.h: initialize stack protector
tools/nolibc: arm: shrink _start with _start_c
tools/nolibc: aarch64: shrink _start with _start_c
tools/nolibc: i386: shrink _start with _start_c
tools/nolibc: x86_64: shrink _start with _start_c
tools/nolibc: mips: shrink _start with _start_c
tools/nolibc: loongarch: shrink _start with _start_c
tools/nolibc: riscv: shrink _start with _start_c
tools/nolibc: s390: shrink _start with _start_c
selftests/nolibc: add EXPECT_PTRGE, EXPECT_PTRGT, EXPECT_PTRLE,
EXPECT_PTRLT
selftests/nolibc: add testcases for startup code
selftests/nolibc: allow run nolibc-test locally
selftests/nolibc: allow test -include /path/to/nolibc.h
tools/include/nolibc/Makefile | 1 +
tools/include/nolibc/arch-aarch64.h | 57 +---------
tools/include/nolibc/arch-arm.h | 83 ++-------------
tools/include/nolibc/arch-i386.h | 62 ++---------
tools/include/nolibc/arch-loongarch.h | 46 +-------
tools/include/nolibc/arch-mips.h | 76 ++-----------
tools/include/nolibc/arch-riscv.h | 69 ++----------
tools/include/nolibc/arch-s390.h | 63 ++---------
tools/include/nolibc/arch-x86_64.h | 58 ++--------
tools/include/nolibc/crt.h | 61 +++++++++++
tools/include/nolibc/stackprotector.h | 2 +
tools/include/nolibc/sys.h | 63 ++---------
tools/include/nolibc/types.h | 4 +-
tools/testing/selftests/nolibc/Makefile | 12 +++
tools/testing/selftests/nolibc/nolibc-test.c | 106 ++++++++++++++++++-
15 files changed, 246 insertions(+), 517 deletions(-)
create mode 100644 tools/include/nolibc/crt.h
--
2.25.1
Make sv48 the default address space for mmap as some applications
currently depend on this assumption. Users can now select a
desired address space using a non-zero hint address to mmap. Previously,
requesting the default address space from mmap by passing zero as the hint
address would result in using the largest address space possible. Some
applications depend on empty bits in the virtual address space, like Go and
Java, so this patch provides more flexibility for application developers.
-Charlie
---
v5:
- Minor wording change in documentation
- Change some parenthesis in arch_get_mmap_ macros
- Added case for addr==0 in arch_get_mmap_ because without this, programs would
crash if RLIMIT_STACK was modified before executing the program. This was
tested using the libhugetlbfs tests.
v4:
- Split testcases/document patch into test cases, in-code documentation, and
formal documentation patches
- Modified the mmap_base macro to be more legible and better represent memory
layout
- Fixed documentation to better reflect the implmentation
- Renamed DEFAULT_VA_BITS to MMAP_VA_BITS
- Added additional test case for rlimit changes
---
Charlie Jenkins (4):
RISC-V: mm: Restrict address space for sv39,sv48,sv57
RISC-V: mm: Add tests for RISC-V mm
RISC-V: mm: Update pgtable comment documentation
RISC-V: mm: Document mmap changes
Documentation/riscv/vm-layout.rst | 22 +++
arch/riscv/include/asm/elf.h | 2 +-
arch/riscv/include/asm/pgtable.h | 20 ++-
arch/riscv/include/asm/processor.h | 46 +++++-
tools/testing/selftests/riscv/Makefile | 2 +-
tools/testing/selftests/riscv/mm/.gitignore | 1 +
tools/testing/selftests/riscv/mm/Makefile | 21 +++
.../selftests/riscv/mm/testcases/mmap.c | 133 ++++++++++++++++++
8 files changed, 234 insertions(+), 13 deletions(-)
create mode 100644 tools/testing/selftests/riscv/mm/.gitignore
create mode 100644 tools/testing/selftests/riscv/mm/Makefile
create mode 100644 tools/testing/selftests/riscv/mm/testcases/mmap.c
--
2.41.0
In this series, Geliang did some refactoring in the mptcp_join.sh file.
Patch 1 reduces the scope of some global env vars, only used by some
tests: easier to deal with.
Patch 2 uses a dedicated env var for fastclose case instead of re-using
addr_nr_ns2 with embedded info, clearer.
Patch 3 is similar but for the fullmesh case.
Patch 4 moves a positional but optional argument of run_tests() to an
env var like it has already been done with the other args, cleaner.
Signed-off-by: Matthieu Baerts <matthieu.baerts(a)tessares.net>
---
Geliang Tang (4):
selftests: mptcp: set all env vars as local ones
selftests: mptcp: add fastclose env var
selftests: mptcp: add fullmesh env var
selftests: mptcp: add speed env var
tools/testing/selftests/net/mptcp/mptcp_join.sh | 271 +++++++++++++-----------
1 file changed, 151 insertions(+), 120 deletions(-)
---
base-commit: e0f0a5db5f8c413cbbf48607f711c2a21023ee66
change-id: 20230712-upstream-net-next-20230712-selftests-mptcp-use-local-env-ad964224bc2a
Best regards,
--
Matthieu Baerts <matthieu.baerts(a)tessares.net>
This RFC is a follow-up of the discussions taken here:
https://lore.kernel.org/linux-doc/20230704132812.02ba97ba@maurocar-mobl2/T/…
It adds a new extension that allows documenting tests using the same tool we're
using for DRM unit tests at IGT GPU tools: https://gitlab.freedesktop.org/drm/igt-gpu-tools.
While kernel-doc has provided documentation for in-lined functions/struct comments,
it was not meant to document tests.
Tests need to be grouped by the test functions. It should also be possible to produce
other outputs from the documentation, to integrate it with test suites. For instance,
Internally at Intel, we use the comments to generate DOT files hierarchically grouped
per feature categories.
This is just an initial RFC to start discussions around the solution. Before being merged
upstream, we need to define what tags will be used to identify test markups and add
a simple change at kernel-doc to let it ignore such markups.
On this series, we have:
- patch 1:
adding test_list.py as present at the IGT tree, after a patch series to make it
more generic: https://patchwork.freedesktop.org/series/120622/
- patch 2:
adds an example about how tests could be documented. This is a really simple
example, just to test the feature, specially designed to make easy to build just
the test documentation from a single DRM kunit file.
After discussions, my plan is to send a new version addressing the issues, and add
some documentation for DRM and/or i915 kunit tests.
Mauro Carvalho Chehab (2):
docs: add support for documenting kUnit and kSelftests
drm: add documentation for drm_buddy_test kUnit test
Documentation/conf.py | 2 +-
Documentation/index.rst | 2 +-
Documentation/sphinx/test_kdoc.py | 108 ++
Documentation/sphinx/test_list.py | 1288 ++++++++++++++++++++++++
Documentation/tests/index.rst | 6 +
Documentation/tests/kunit.rst | 5 +
drivers/gpu/drm/tests/drm_buddy_test.c | 12 +
7 files changed, 1421 insertions(+), 2 deletions(-)
create mode 100644 Documentation/sphinx/test_kdoc.py
create mode 100644 Documentation/sphinx/test_list.py
create mode 100644 Documentation/tests/index.rst
create mode 100644 Documentation/tests/kunit.rst
--
2.40.1
This series decreases the pcm-test duration in order to avoid timeouts
by first moving the audio stream duration to a variable and subsequently
decreasing it.
Nícolas F. R. A. Prado (2):
kselftest/alsa: pcm-test: Move stream duration and margin to variables
kselftest/alsa: pcm-test: Decrease stream duration from 4 to 2 seconds
tools/testing/selftests/alsa/pcm-test.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
--
2.41.0
Since apparently enabling all the KUnit tests shouldn't enable any new
subsystems it is hard to enable the regmap KUnit tests in normal KUnit
testing scenarios that don't enable any drivers. Add a Kconfig option
to help with this and include it in the KUnit all tests config.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
drivers/base/regmap/Kconfig | 12 +++++++++++-
tools/testing/kunit/configs/all_tests.config | 2 ++
2 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/drivers/base/regmap/Kconfig b/drivers/base/regmap/Kconfig
index 0db2021f7477..b1affac70d5d 100644
--- a/drivers/base/regmap/Kconfig
+++ b/drivers/base/regmap/Kconfig
@@ -4,7 +4,7 @@
# subsystems should select the appropriate symbols.
config REGMAP
- bool "Register Map support" if KUNIT_ALL_TESTS
+ bool
default y if (REGMAP_I2C || REGMAP_SPI || REGMAP_SPMI || REGMAP_W1 || REGMAP_AC97 || REGMAP_MMIO || REGMAP_IRQ || REGMAP_SOUNDWIRE || REGMAP_SOUNDWIRE_MBQ || REGMAP_SCCB || REGMAP_I3C || REGMAP_SPI_AVMM || REGMAP_MDIO || REGMAP_FSI)
select IRQ_DOMAIN if REGMAP_IRQ
select MDIO_BUS if REGMAP_MDIO
@@ -23,6 +23,16 @@ config REGMAP_KUNIT
default KUNIT_ALL_TESTS
select REGMAP_RAM
+config REGMAP_BUILD
+ bool "Enable regmap build"
+ depends on KUNIT
+ select REGMAP
+ help
+ This option exists purely to allow the regmap KUnit tests to
+ be enabled without having to enable some driver that uses
+ regmap due to unfortunate issues with how KUnit tests are
+ normally enabled.
+
config REGMAP_AC97
tristate
diff --git a/tools/testing/kunit/configs/all_tests.config b/tools/testing/kunit/configs/all_tests.config
index 0393940c706a..873f3e06ccad 100644
--- a/tools/testing/kunit/configs/all_tests.config
+++ b/tools/testing/kunit/configs/all_tests.config
@@ -33,5 +33,7 @@ CONFIG_DAMON_PADDR=y
CONFIG_DEBUG_FS=y
CONFIG_DAMON_DBGFS=y
+CONFIG_REGMAP_BUILD=y
+
CONFIG_SECURITY=y
CONFIG_SECURITY_APPARMOR=y
---
base-commit: 06c2afb862f9da8dc5efa4b6076a0e48c3fbaaa5
change-id: 20230701-regmap-kunit-enable-a08718e77dd4
Best regards,
--
Mark Brown <broonie(a)kernel.org>
Delete a duplicate assignment from this function implementation.
The note means ppm is average of the two actual freq samples.
But ppm have a duplicate assignment.
Signed-off-by: Minjie Du <duminjie(a)vivo.com>
---
tools/testing/selftests/timers/raw_skew.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/tools/testing/selftests/timers/raw_skew.c b/tools/testing/selftests/timers/raw_skew.c
index 5beceeed0..6eba203f9 100644
--- a/tools/testing/selftests/timers/raw_skew.c
+++ b/tools/testing/selftests/timers/raw_skew.c
@@ -129,8 +129,7 @@ int main(int argc, char **argv)
printf("%lld.%i(est)", eppm/1000, abs((int)(eppm%1000)));
/* Avg the two actual freq samples adjtimex gave us */
- ppm = (tx1.freq + tx2.freq) * 1000 / 2;
- ppm = (long long)tx1.freq * 1000;
+ ppm = (long long)(tx1.freq + tx2.freq) * 1000 / 2;
ppm = shift_right(ppm, 16);
printf(" %lld.%i(act)", ppm/1000, abs((int)(ppm%1000)));
--
2.39.0
From: Jeff Xu <jeffxu(a)google.com>
Add documentation for sysctl vm.memfd_noexec
Link:https://lore.kernel.org/linux-mm/CABi2SkXUX_QqTQ10Yx9bBUGpN1wByOi_=gZU…
Reported-by: Dominique Martinet <asmadeus(a)codewreck.org>
Signed-off-by: Jeff Xu <jeffxu(a)google.com>
---
Documentation/admin-guide/sysctl/vm.rst | 29 +++++++++++++++++++++++++
1 file changed, 29 insertions(+)
diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
index 45ba1f4dc004..71923c3d7044 100644
--- a/Documentation/admin-guide/sysctl/vm.rst
+++ b/Documentation/admin-guide/sysctl/vm.rst
@@ -424,6 +424,35 @@ e.g., up to one or two maps per allocation.
The default value is 65530.
+memfd_noexec:
+=============
+This pid namespaced sysctl controls memfd_create().
+
+The new MFD_NOEXEC_SEAL and MFD_EXEC flags of memfd_create() allows
+application to set executable bit at creation time.
+
+When MFD_NOEXEC_SEAL is set, memfd is created without executable bit
+(mode:0666), and sealed with F_SEAL_EXEC, so it can't be chmod to
+be executable (mode: 0777) after creation.
+
+when MFD_EXEC flag is set, memfd is created with executable bit
+(mode:0777), this is the same as the old behavior of memfd_create.
+
+The new pid namespaced sysctl vm.memfd_noexec has 3 values:
+0: memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL acts like
+ MFD_EXEC was set.
+1: memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL acts like
+ MFD_NOEXEC_SEAL was set.
+2: memfd_create() without MFD_NOEXEC_SEAL will be rejected.
+
+The default value is 0.
+
+Once set, it can't be downgraded at runtime, i.e. 2=>1, 1=>0
+are denied.
+
+This is pid namespaced sysctl, child processes inherit the parent
+process's pid at the time of fork. Changes to the parent process
+after fork are not automatically propagated to the child process.
memory_failure_early_kill:
==========================
--
2.41.0.255.g8b1d071c50-goog
/proc/$PID/net currently allows the setting of file attributes,
in contrast to other /proc/$PID/ files and directories.
This would break the nolibc testsuite so the first patch in the series
removes the offending testcase.
The "fix" for nolibc-test is intentionally kept trivial as the series
will most likely go through the filesystem tree and if conflicts arise,
it is obvious on how to resolve them.
Technically this can lead to breakage of nolibc-test if an old
nolibc-test is used with a newer kernel containing the fix.
Note:
Except for /proc itself this is the only "struct inode_operations" in
fs/proc/ that is missing an implementation of setattr().
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
---
Thomas Weißschuh (2):
selftests/nolibc: drop test chmod_net
proc: use generic setattr() for /proc/$PID/net
fs/proc/proc_net.c | 1 +
tools/testing/selftests/nolibc/nolibc-test.c | 1 -
2 files changed, 1 insertion(+), 1 deletion(-)
---
base-commit: a92b7d26c743b9dc06d520f863d624e94978a1d9
change-id: 20230624-proc-net-setattr-8f0a6b8eb2f5
Best regards,
--
Thomas Weißschuh <linux(a)weissschuh.net>
In the case where a sysfs file cannot be opened the error return path
fcloses file pointer fpl, however, fpl has already been closed in the
previous stanza. Fix the double fclose by removing it.
Fixes: 10b98a4db11a ("selftests: ALSA: Add test for the 'pcmtest' driver")
Signed-off-by: Colin Ian King <colin.i.king(a)gmail.com>
---
tools/testing/selftests/alsa/test-pcmtest-driver.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/tools/testing/selftests/alsa/test-pcmtest-driver.c b/tools/testing/selftests/alsa/test-pcmtest-driver.c
index 71931b240a83..357adc722cba 100644
--- a/tools/testing/selftests/alsa/test-pcmtest-driver.c
+++ b/tools/testing/selftests/alsa/test-pcmtest-driver.c
@@ -47,10 +47,8 @@ static int read_patterns(void)
sprintf(pf, "/sys/kernel/debug/pcmtest/fill_pattern%d", i);
fp = fopen(pf, "r");
- if (!fp) {
- fclose(fpl);
+ if (!fp)
return -1;
- }
fread(patterns[i].buf, 1, patterns[i].len, fp);
fclose(fp);
}
--
2.39.2
The checksum_32 code was originally written to only handle 2-byte
aligned buffers, but was later extended to support arbitrary alignment.
However, the non-PPro variant doesn't apply the carry before jumping to
the 2- or 4-byte aligned versions, which clear CF.
This causes the new checksum_kunit test to fail, as it runs with a large
number of different possible alignments and both with and without
carries.
For example:
./tools/testing/kunit/kunit.py run --arch i386 --kconfig_add CONFIG_M486=y checksum
Gives:
KTAP version 1
# Subtest: checksum
1..3
ok 1 test_csum_fixed_random_inputs
# test_csum_all_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:267
Expected result == expec, but
result == 65281 (0xff01)
expec == 65280 (0xff00)
not ok 2 test_csum_all_carry_inputs
# test_csum_no_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:314
Expected result == expec, but
result == 65535 (0xffff)
expec == 65534 (0xfffe)
not ok 3 test_csum_no_carry_inputs
With this patch, it passes.
KTAP version 1
# Subtest: checksum
1..3
ok 1 test_csum_fixed_random_inputs
ok 2 test_csum_all_carry_inputs
ok 3 test_csum_no_carry_inputs
I also tested it on a real 486DX2, with the same results.
Signed-off-by: David Gow <davidgow(a)google.com>
---
This is a follow-up to the UML patch to use the common 32-bit x86
checksum implementations:
https://lore.kernel.org/linux-um/20230704083022.692368-2-davidgow@google.co…
---
arch/x86/lib/checksum_32.S | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/x86/lib/checksum_32.S b/arch/x86/lib/checksum_32.S
index 23318c338db0..128287cea42d 100644
--- a/arch/x86/lib/checksum_32.S
+++ b/arch/x86/lib/checksum_32.S
@@ -62,6 +62,7 @@ SYM_FUNC_START(csum_partial)
jl 8f
movzbl (%esi), %ebx
adcl %ebx, %eax
+ adcl $0, %eax
roll $8, %eax
inc %esi
testl $2, %esi
--
2.41.0.255.g8b1d071c50-goog