This cleans up the output of the tty_tstamp_update selftest to play a
bit more nicely with automated systems parsing the test output.
To do this I've also added a new helper ksft_test_result() which takes a
KSFT_ code as a report, this is something I've wanted on other occasions
but restructured things to avoid needing it. This time I figured I'd
just add it since it keeps coming up.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Mark Brown (2):
kselftest: Add mechanism for reporting a KSFT_ result code
kselftest/tty: Report a consistent test name for the one test we run
tools/testing/selftests/kselftest.h | 22 ++++++++++++
tools/testing/selftests/tty/tty_tstamp_update.c | 48 +++++++++++++++++--------
2 files changed, 55 insertions(+), 15 deletions(-)
---
base-commit: 54be6c6c5ae8e0d93a6c4641cb7528eb0b6ba478
change-id: 20240305-kselftest-tty-tname-5411444ce037
Best regards,
--
Mark Brown <broonie(a)kernel.org>
The seccomp benchmark runs five scenarios, one calibration run with no
seccomp filters enabled then four further runs each adding a filter. The
calibration run times itself for 15s and then each additional run executes
for the same number of times.
Currently the seccomp tests, including the benchmark, run with an extended
120s timeout but this is not sufficient to robustly run the tests on a lot
of platforms. Sample timings from some recent runs:
Platform Run 1 Run 2 Run 3 Run 4
--------- ----- ----- ----- -----
PowerEdge R200 16.6s 16.6s 31.6s 37.4s
BBB (arm) 20.4s 20.4s 54.5s
Synquacer (arm64) 20.7s 23.7s 40.3s
The x86 runs from the PowerEdge are quite marginal and routinely fail, for
the successful run reported here the timed portions of the run are at
117.2s leaving less than 3s of margin which is frequently breached. The
added overhead of adding filters on the other platforms is such that there
is no prospect of their runs fitting into the 120s timeout, especially
on 32 bit arm where there is no BPF JIT.
While we could lower the time we calibrate for I'm also already seeing the
currently completing runs reporting issues with the per filter overheads
not matching expectations:
Let's instead raise the timeout to 180s which is only a 50% increase on the
current timeout which is itself not *too* large given that there's only two
tests in this suite.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Changes in v2:
- Rebase onto v6.9-rc1.
- Link to v1: https://lore.kernel.org/r/20231219-b4-kselftest-seccomp-benchmark-timeout-v…
---
tools/testing/selftests/seccomp/settings | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/seccomp/settings b/tools/testing/selftests/seccomp/settings
index 6091b45d226b..a953c96aa16e 100644
--- a/tools/testing/selftests/seccomp/settings
+++ b/tools/testing/selftests/seccomp/settings
@@ -1 +1 @@
-timeout=120
+timeout=180
---
base-commit: 4cece764965020c22cff7665b18a012006359095
change-id: 20231219-b4-kselftest-seccomp-benchmark-timeout-05b66e7d29d1
Best regards,
--
Mark Brown <broonie(a)kernel.org>
The event filter function test has been failing in our internal test
farm:
| # not ok 33 event filter function - test event filtering on functions
Running the test in verbose mode indicates that this is because the test
erroneously determines that kmem_cache_free() is the most common caller
of kmem_cache_free():
# # + cut -d: -f3 trace
# # + sed s/call_site=([^+]*)+0x.*/1/
# # + sort
# # + uniq -c
# # + sort
# # + tail -n 1
# # + sed s/^[ 0-9]*//
# # + target_func=kmem_cache_free
... and as kmem_cache_free() doesn't call itself, setting this as the
filter function for kmem_cache_free() results in no hits, and
consequently the test fails:
# # + grep kmem_cache_free trace
# # + grep kmem_cache_free
# # + wc -l
# # + hitcnt=0
# # + grep kmem_cache_free trace
# # + grep -v kmem_cache_free
# # + wc -l
# # + misscnt=0
# # + [ 0 -eq 0 ]
# # + exit_fail
This seems to be because the system in question has tasks with ':' in
their name (which a number of kernel worker threads have). These show up
in the trace, e.g.
test:.sh-1299 [004] ..... 2886.040608: kmem_cache_free: call_site=putname+0xa4/0xc8 ptr=000000000f4d22f4 name=names_cache
... and so when we try to extact the call_site with:
cut -d: -f3 trace | sed 's/call_site=\([^+]*\)+0x.*/\1/'
... the 'cut' command will extrace the column containing
'kmem_cache_free' rather than the column containing 'call_site=...', and
the 'sed' command will leave this unchanged. Consequently, the test will
decide to use 'kmem_cache_free' as the filter function, resulting in the
failure seen above.
Fix this by matching the 'call_site=<func>' part specifically to extract
the function name.
Signed-off-by: Mark Rutland <mark.rutland(a)arm.com>
Reported-by: Aishwarya TCV <aishwarya.tcv(a)arm.com>
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Steven Rostedt <rostedt(a)goodmis.org>
Cc: linux-kernel(a)vger.kernel.org
Cc: linux-kselftest(a)vger.kernel.org
Cc: linux-trace-kernel(a)vger.kernel.org
---
.../selftests/ftrace/test.d/filter/event-filter-function.tc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/ftrace/test.d/filter/event-filter-function.tc b/tools/testing/selftests/ftrace/test.d/filter/event-filter-function.tc
index 2de7c61d1ae3..3f74c09c56b6 100644
--- a/tools/testing/selftests/ftrace/test.d/filter/event-filter-function.tc
+++ b/tools/testing/selftests/ftrace/test.d/filter/event-filter-function.tc
@@ -24,7 +24,7 @@ echo 0 > events/enable
echo "Get the most frequently calling function"
sample_events
-target_func=`cut -d: -f3 trace | sed 's/call_site=\([^+]*\)+0x.*/\1/' | sort | uniq -c | sort | tail -n 1 | sed 's/^[ 0-9]*//'`
+target_func=`cat trace | grep -o 'call_site=\([^+]*\)' | sed 's/call_site=//' | sort | uniq -c | sort | tail -n 1 | sed 's/^[ 0-9]*//'`
if [ -z "$target_func" ]; then
exit_fail
fi
--
2.39.2
As discussed on the bi-weekly call on Jan 30, and in mailing around
kernel CI effort, some changes are desirable in the suite of forwarding
selftests the better to work with the CI tooling. Namely:
- The forwarding selftests use a configuration file where names of
interfaces are defined and various variables can be overridden. There
is also forwarding.config.sample that users can use as a template to
refer to when creating the config file. What happens a fair bit is
that users either do not know about this at all, or simply forget, and
are confused by cryptic failures about interfaces that cannot be
created.
In patches #1 - #3 have lib.sh just be the single source of truth with
regards to which variables exist. That includes the topology variables
which were previously only in the sample file, and any "tweak
variables", such as what tools to use, sleep times, etc.
forwarding.config.sample then becomes just a placeholder with a couple
examples. Unless specific HW should be exercised, or specific tools
used, the defaults are usually just fine.
- Several net/forwarding/ selftests (and one net/ one) cannot be run on
veth pairs, they need an actual HW interface to run on. They are
generic in the sense that any capable HW should pass them, which is
why they have been put to net/forwarding/ as opposed to drivers/net/,
but they do not generalize to veth. The fact that these tests are in
net/forwarding/, but still complaining when run, is confusing.
In patches #4 - #6 move these tests to a new directory
drivers/net/hw.
- The following patches extend the codebase to handle well test results
other than pass and fail.
Patch #7 is preparatory. It converts several log_test_skip to XFAIL,
so that tests do not spuriously end up returning non-0 when they
are not supposed to.
In patches #8 - #10, introduce some missing ksft constants, then support
having those constants in RET, and then finally in EXIT_STATUS.
- The traffic scheduler tests generate a large amount of network traffic
to test the behavior of the scheduler. This demands a relatively
high-performance computer. On slow machines, such as with a debugging
kernel, the test would spuriously fail.
It can still be useful to "go through the motions" though, to possibly
catch bugs in setup of the scheduler graph and passing packets around.
Thus we still want to run the tests, just with lowered demands.
To that end, in patches #11 - #12, introduce an environment variable
KSFT_MACHINE_SLOW, with obvious meaning. Tests can then make checks
more lenient, such as mark failures as XFAIL. A helper, xfail_on_slow,
is provided to mark performance-sensitive parts of the selftest.
- In patch #13, use a similar mechanism to mark a NH group stats
selftest to XFAIL HW stats tests when run on VETH pairs.
- All these changes complicate the hitherto straightforward logging and
checking logic, so in patch #14, add a selftest that checks this
functionality in lib.sh.
Petr Machata (14):
selftests: net: libs: Change variable fallback syntax
selftests: forwarding.config.sample: Move overrides to lib.sh
selftests: forwarding: README: Document customization
selftests: forwarding: ipip_lib: Do not import lib.sh
selftests: forwarding: Move several selftests
selftests: forwarding: Ditch skip_on_veth()
selftests: forwarding: Change inappropriate log_test_skip() calls
selftests: lib: Define more kselftest exit codes
selftests: forwarding: Have RET track kselftest framework constants
selftests: forwarding: Convert log_test() to recognize RET values
selftests: forwarding: Support for performance sensitive tests
selftests: forwarding: Mark performance-sensitive tests
selftests: forwarding: router_mpath_nh_lib: Don't skip, xfail on veth
selftests: forwarding: Add a test for testing lib.sh functionality
.../testing/selftests/drivers/net/hw/Makefile | 25 ++
.../net/hw}/devlink_port_split.py | 0
.../forwarding => drivers/net/hw}/ethtool.sh | 5 +-
.../net/hw}/ethtool_extended_state.sh | 5 +-
.../net/hw}/ethtool_lib.sh | 0
.../net/hw}/ethtool_mm.sh | 3 +-
.../net/hw}/ethtool_rmon.sh | 7 +-
.../net/hw}/hw_stats_l3.sh | 19 +-
.../net/hw}/hw_stats_l3_gre.sh | 7 +-
.../forwarding => drivers/net/hw}/loopback.sh | 5 +-
.../testing/selftests/drivers/net/hw/settings | 1 +
.../selftests/drivers/net/mlxsw/mlxsw_lib.sh | 2 +-
.../net/mlxsw/spectrum-2/resource_scale.sh | 1 -
.../net/mlxsw/spectrum/resource_scale.sh | 1 -
tools/testing/selftests/net/Makefile | 1 -
.../testing/selftests/net/forwarding/Makefile | 9 +-
tools/testing/selftests/net/forwarding/README | 33 +++
.../net/forwarding/forwarding.config.sample | 53 ++--
.../selftests/net/forwarding/ipip_lib.sh | 1 -
tools/testing/selftests/net/forwarding/lib.sh | 255 +++++++++++++-----
.../selftests/net/forwarding/lib_sh_test.sh | 208 ++++++++++++++
.../net/forwarding/router_mpath_nh_lib.sh | 12 +-
.../selftests/net/forwarding/sch_ets_tests.sh | 19 +-
.../selftests/net/forwarding/sch_red.sh | 10 +-
.../selftests/net/forwarding/sch_tbf_core.sh | 2 +-
.../selftests/net/forwarding/tc_common.sh | 2 +-
.../selftests/net/forwarding/tc_tunnel_key.sh | 2 -
tools/testing/selftests/net/lib.sh | 48 +++-
28 files changed, 565 insertions(+), 171 deletions(-)
create mode 100644 tools/testing/selftests/drivers/net/hw/Makefile
rename tools/testing/selftests/{net => drivers/net/hw}/devlink_port_split.py (100%)
rename tools/testing/selftests/{net/forwarding => drivers/net/hw}/ethtool.sh (98%)
rename tools/testing/selftests/{net/forwarding => drivers/net/hw}/ethtool_extended_state.sh (96%)
rename tools/testing/selftests/{net/forwarding => drivers/net/hw}/ethtool_lib.sh (100%)
rename tools/testing/selftests/{net/forwarding => drivers/net/hw}/ethtool_mm.sh (99%)
rename tools/testing/selftests/{net/forwarding => drivers/net/hw}/ethtool_rmon.sh (92%)
rename tools/testing/selftests/{net/forwarding => drivers/net/hw}/hw_stats_l3.sh (96%)
rename tools/testing/selftests/{net/forwarding => drivers/net/hw}/hw_stats_l3_gre.sh (92%)
rename tools/testing/selftests/{net/forwarding => drivers/net/hw}/loopback.sh (92%)
create mode 100644 tools/testing/selftests/drivers/net/hw/settings
create mode 100755 tools/testing/selftests/net/forwarding/lib_sh_test.sh
--
2.43.0
The test is inspired by the pmu_event_filter_test which implemented by x86. On
the arm64 platform, there is the same ability to set the pmu_event_filter
through the KVM_ARM_VCPU_PMU_V3_FILTER attribute. So add the test for arm64.
The series first create the helper function which can be used
for the vpmu related tests. Then, it implement the test.
Changelog:
----------
v5->v6:
- Rebased to v6.9-rc1.
- Collect RB.
- Add multiple filter test.
v4->v5:
- Rebased to v6.8-rc6.
- Refactor the helper function, make it fine-grained and easy to be used.
- Namimg improvements.
- Use the kvm_device_attr_set() helper.
- Make the test descriptor array readable and clean.
- Delete the patch which moves the pmu related helper to vpmu.h.
- Remove the kvm_supports_pmu_event_filter() function since nobody will run
this on a old kernel.
v3->v4:
- Rebased to the v6.8-rc2.
v2->v3:
- Check the pmceid in guest code instead of pmu event count since different
hardware may have different event count result, check pmceid makes it stable
on different platform. [Eric]
- Some typo fixed and commit message improved.
v1->v2:
- Improve the commit message. [Eric]
- Fix the bug in [enable|disable]_counter. [Raghavendra & Marc]
- Add the check if kvm has attr KVM_ARM_VCPU_PMU_V3_FILTER.
- Add if host pmu support the test event throught pmceid0.
- Split the test_invalid_filter() to another patch. [Eric]
v1: https://lore.kernel.org/all/20231123063750.2176250-1-shahuang@redhat.com/
v2: https://lore.kernel.org/all/20231129072712.2667337-1-shahuang@redhat.com/
v3: https://lore.kernel.org/all/20240116060129.55473-1-shahuang@redhat.com/
v4: https://lore.kernel.org/all/20240202025659.5065-1-shahuang@redhat.com/
v5: https://lore.kernel.org/all/20240229065625.114207-1-shahuang@redhat.com/
Shaoqin Huang (3):
KVM: selftests: aarch64: Add helper function for the vpmu vcpu
creation
KVM: selftests: aarch64: Introduce pmu_event_filter_test
KVM: selftests: aarch64: Add invalid filter test in
pmu_event_filter_test
tools/testing/selftests/kvm/Makefile | 1 +
.../kvm/aarch64/pmu_event_filter_test.c | 336 ++++++++++++++++++
.../kvm/aarch64/vpmu_counter_access.c | 33 +-
.../selftests/kvm/include/aarch64/vpmu.h | 28 ++
4 files changed, 372 insertions(+), 26 deletions(-)
create mode 100644 tools/testing/selftests/kvm/aarch64/pmu_event_filter_test.c
create mode 100644 tools/testing/selftests/kvm/include/aarch64/vpmu.h
--
2.40.1
The test is inspired by the pmu_event_filter_test which implemented by x86. On
the arm64 platform, there is the same ability to set the pmu_event_filter
through the KVM_ARM_VCPU_PMU_V3_FILTER attribute. So add the test for arm64.
The series first create the helper function which can be used
for the vpmu related tests. Then, it implement the test.
Changelog:
----------
v4->v5:
- Rebased to v6.8-rc6.
- Refactor the helper function, make it fine-grained and easy to be used.
- Namimg improvements.
- Use the kvm_device_attr_set() helper.
- Make the test descriptor array readable and clean.
- Delete the patch which moves the pmu related helper to vpmu.h.
- Remove the kvm_supports_pmu_event_filter() function since nobody will run
this on a old kernel.
v3->v4:
- Rebased to the v6.8-rc2.
v2->v3:
- Check the pmceid in guest code instead of pmu event count since different
hardware may have different event count result, check pmceid makes it stable
on different platform. [Eric]
- Some typo fixed and commit message improved.
v1->v2:
- Improve the commit message. [Eric]
- Fix the bug in [enable|disable]_counter. [Raghavendra & Marc]
- Add the check if kvm has attr KVM_ARM_VCPU_PMU_V3_FILTER.
- Add if host pmu support the test event throught pmceid0.
- Split the test_invalid_filter() to another patch. [Eric]
v1: https://lore.kernel.org/all/20231123063750.2176250-1-shahuang@redhat.com/
v2: https://lore.kernel.org/all/20231129072712.2667337-1-shahuang@redhat.com/
v3: https://lore.kernel.org/all/20240116060129.55473-1-shahuang@redhat.com/
v4: https://lore.kernel.org/all/20240202025659.5065-1-shahuang@redhat.com/
Shaoqin Huang (3):
KVM: selftests: aarch64: Add helper function for the vpmu vcpu
creation
KVM: selftests: aarch64: Introduce pmu_event_filter_test
KVM: selftests: aarch64: Add invalid filter test in
pmu_event_filter_test
tools/testing/selftests/kvm/Makefile | 1 +
.../kvm/aarch64/pmu_event_filter_test.c | 325 ++++++++++++++++++
.../kvm/aarch64/vpmu_counter_access.c | 33 +-
.../selftests/kvm/include/aarch64/vpmu.h | 29 ++
4 files changed, 362 insertions(+), 26 deletions(-)
create mode 100644 tools/testing/selftests/kvm/aarch64/pmu_event_filter_test.c
create mode 100644 tools/testing/selftests/kvm/include/aarch64/vpmu.h
--
2.40.1
Hi all,
This series does a number of cleanups into resctrl_val() and
generalizes it by removing test name specific handling from the
function.
One of the changes improves MBA/MBM measurement by narrowing down the
period the resctrl FS derived memory bandwidth numbers are measured
over. My feel is it didn't cause noticeable difference into the numbers
because they're generally good anyway except for the small number of
outliers. To see the impact on outliers, I'd need to setup a test to
run large number of replications and do a statistical analysis, which
I've not spent my time on. Even without the statistical analysis, the
new way to measure seems obviously better and makes sense even if I
cannot see a major improvement with the setup I'm using.
This series has some conflicts with SNC series from Maciej and also
with the MBA/MBM series from Babu.
--
i.
v2:
- Resolved conflicts with kselftest/next
- Spaces -> tabs correction
Ilpo Järvinen (13):
selftests/resctrl: Convert get_mem_bw_imc() fd close to for loop
selftests/resctrl: Calculate resctrl FS derived mem bw over sleep(1)
only
selftests/resctrl: Consolidate get_domain_id() into resctrl_val()
selftests/resctrl: Use correct type for pids
selftests/resctrl: Cleanup bm_pid and ppid usage & limit scope
selftests/resctrl: Rename measure_vals() to measure_mem_bw_vals() &
document
selftests/resctrl: Add ->measure() callback to resctrl_val_param
selftests/resctrl: Add ->init() callback into resctrl_val_param
selftests/resctrl: Simplify bandwidth report type handling
selftests/resctrl: Make some strings passed to resctrlfs functions
const
selftests/resctrl: Convert ctrlgrp & mongrp to pointers
selftests/resctrl: Remove mongrp from MBA test
selftests/resctrl: Remove test name comparing from
write_bm_pid_to_resctrl()
tools/testing/selftests/resctrl/cache.c | 6 +-
tools/testing/selftests/resctrl/cat_test.c | 5 +-
tools/testing/selftests/resctrl/cmt_test.c | 21 +-
tools/testing/selftests/resctrl/mba_test.c | 34 ++-
tools/testing/selftests/resctrl/mbm_test.c | 33 ++-
tools/testing/selftests/resctrl/resctrl.h | 48 ++--
tools/testing/selftests/resctrl/resctrl_val.c | 269 ++++++------------
tools/testing/selftests/resctrl/resctrlfs.c | 55 ++--
8 files changed, 224 insertions(+), 247 deletions(-)
--
2.39.2
Currently, VA exhaustion is being checked by passing a hint to mmap() and
expecting it to fail. This patch makes a stricter test by successful
write() calls from /proc/self/maps to a dump file, confirming that a
free chunk is indeed not available.
Changes in v2:
- Replace SZ_1GB with MAP_CHUNK_SIZE, tidy-up
Signed-off-by: Dev Jain <dev.jain(a)arm.com>
---
Merge dependency: https://lore.kernel.org/all/20240314122250.68534-1-dev.jain@arm.com/
.../selftests/mm/virtual_address_range.c | 66 +++++++++++++++++++
1 file changed, 66 insertions(+)
diff --git a/tools/testing/selftests/mm/virtual_address_range.c b/tools/testing/selftests/mm/virtual_address_range.c
index 7bcf8d48256a..050e997e3be2 100644
--- a/tools/testing/selftests/mm/virtual_address_range.c
+++ b/tools/testing/selftests/mm/virtual_address_range.c
@@ -12,6 +12,8 @@
#include <errno.h>
#include <sys/mman.h>
#include <sys/time.h>
+#include <fcntl.h>
+
#include "../kselftest.h"
/*
@@ -93,6 +95,66 @@ static int validate_lower_address_hint(void)
return 1;
}
+static int validate_complete_va_space(void)
+{
+ unsigned long start_addr, end_addr, prev_end_addr;
+ char line[400];
+ char prot[6];
+ FILE *file;
+ int fd;
+
+ fd = open("va_dump", O_CREAT | O_WRONLY, 0600);
+ unlink("va_dump");
+ if (fd < 0) {
+ ksft_test_result_skip("cannot create or open dump file\n");
+ ksft_finished();
+ }
+
+ file = fopen("/proc/self/maps", "r");
+ if (file == NULL)
+ ksft_exit_fail_msg("cannot open /proc/self/maps\n");
+
+ prev_end_addr = 0;
+ while (fgets(line, sizeof(line), file)) {
+ unsigned long hop;
+
+ if (sscanf(line, "%lx-%lx %s[rwxp-]",
+ &start_addr, &end_addr, prot) != 3)
+ ksft_exit_fail_msg("cannot parse /proc/self/maps\n");
+
+ /* end of userspace mappings; ignore vsyscall mapping */
+ if (start_addr & (1UL << 63))
+ return 0;
+
+ /* /proc/self/maps must have gaps less than MAP_CHUNK_SIZE */
+ if (start_addr - prev_end_addr >= MAP_CHUNK_SIZE)
+ return 1;
+
+ prev_end_addr = end_addr;
+
+ if (prot[0] != 'r')
+ continue;
+
+ /*
+ * Confirm whether MAP_CHUNK_SIZE chunk can be found or not.
+ * If write succeeds, no need to check MAP_CHUNK_SIZE - 1
+ * addresses after that. If the address was not held by this
+ * process, write would fail with errno set to EFAULT.
+ * Anyways, if write returns anything apart from 1, exit the
+ * program since that would mean a bug in /proc/self/maps.
+ */
+ hop = 0;
+ while (start_addr + hop < end_addr) {
+ if (write(fd, (void *)(start_addr + hop), 1) != 1)
+ return 1;
+ lseek(fd, 0, SEEK_SET);
+
+ hop += MAP_CHUNK_SIZE;
+ }
+ }
+ return 0;
+}
+
int main(int argc, char *argv[])
{
char *ptr[NR_CHUNKS_LOW];
@@ -135,6 +197,10 @@ int main(int argc, char *argv[])
validate_addr(hptr[i], 1);
}
hchunks = i;
+ if (validate_complete_va_space()) {
+ ksft_test_result_fail("BUG in mmap() or /proc/self/maps\n");
+ ksft_finished();
+ }
for (i = 0; i < lchunks; i++)
munmap(ptr[i], MAP_CHUNK_SIZE);
--
2.34.1
Currently, VA exhaustion is being checked by passing a hint to mmap() and
expecting it to fail. This patch makes a stricter test by successful write()
calls from /proc/self/maps to a dump file, confirming that a free chunk is
indeed not available.
Signed-off-by: Dev Jain <dev.jain(a)arm.com>
---
Merge dependency: https://lore.kernel.org/all/20240314122250.68534-1-dev.jain@arm.com/
.../selftests/mm/virtual_address_range.c | 69 +++++++++++++++++++
1 file changed, 69 insertions(+)
diff --git a/tools/testing/selftests/mm/virtual_address_range.c b/tools/testing/selftests/mm/virtual_address_range.c
index 7bcf8d48256a..31063613dfd9 100644
--- a/tools/testing/selftests/mm/virtual_address_range.c
+++ b/tools/testing/selftests/mm/virtual_address_range.c
@@ -12,6 +12,8 @@
#include <errno.h>
#include <sys/mman.h>
#include <sys/time.h>
+#include <fcntl.h>
+
#include "../kselftest.h"
/*
@@ -93,6 +95,69 @@ static int validate_lower_address_hint(void)
return 1;
}
+static int validate_complete_va_space(void)
+{
+ unsigned long start_addr, end_addr, prev_end_addr;
+ char line[400];
+ char prot[6];
+ FILE *file;
+ int fd;
+
+ fd = open("va_dump", O_CREAT | O_WRONLY, 0600);
+ unlink("va_dump");
+ if (fd < 0) {
+ ksft_test_result_skip("cannot create or open dump file\n");
+ ksft_finished();
+ }
+
+ file = fopen("/proc/self/maps", "r");
+ if (file == NULL)
+ ksft_exit_fail_msg("cannot open /proc/self/maps\n");
+
+ prev_end_addr = 0;
+ while (fgets(line, sizeof(line), file)) {
+ unsigned long hop;
+ int ret;
+
+ ret = sscanf(line, "%lx-%lx %s[rwxp-]",
+ &start_addr, &end_addr, prot);
+ if (ret != 3)
+ ksft_exit_fail_msg("sscanf failed, cannot parse\n");
+
+ /* end of userspace mappings; ignore vsyscall mapping */
+ if (start_addr & (1UL << 63))
+ return 0;
+
+ /* /proc/self/maps must have gaps less than 1GB only */
+ if (start_addr - prev_end_addr >= SZ_1GB)
+ return 1;
+
+ prev_end_addr = end_addr;
+
+ if (prot[0] != 'r')
+ continue;
+
+ /*
+ * Confirm whether MAP_CHUNK_SIZE chunk can be found or not.
+ * If write succeeds, no need to check MAP_CHUNK_SIZE - 1
+ * addresses after that. If the address was not held by this
+ * process, write would fail with errno set to EFAULT.
+ * Anyways, if write returns anything apart from 1, exit the
+ * program since that would mean a bug in /proc/self/maps.
+ */
+ hop = 0;
+ while (start_addr + hop < end_addr) {
+ if (write(fd, (void *)(start_addr + hop), 1) != 1)
+ return 1;
+ else
+ lseek(fd, 0, SEEK_SET);
+
+ hop += MAP_CHUNK_SIZE;
+ }
+ }
+ return 0;
+}
+
int main(int argc, char *argv[])
{
char *ptr[NR_CHUNKS_LOW];
@@ -135,6 +200,10 @@ int main(int argc, char *argv[])
validate_addr(hptr[i], 1);
}
hchunks = i;
+ if (validate_complete_va_space()) {
+ ksft_test_result_fail("BUG in mmap() or /proc/self/maps\n");
+ ksft_finished();
+ }
for (i = 0; i < lchunks; i++)
munmap(ptr[i], MAP_CHUNK_SIZE);
--
2.34.1
The upcoming new Idle HLT Intercept feature allows for the HLT
instruction execution by a vCPU to be intercepted by the hypervisor
only if there are no pending V_INTR and V_NMI events for the vCPU.
When the vCPU is expected to service the pending V_INTR and V_NMI
events, the Idle HLT intercept won’t trigger. The feature allows the
hypervisor to determine if the vCPU is actually idle and reduces
wasteful VMEXITs.
Presence of the Idle HLT Intercept feature is indicated via CPUID
function Fn8000_000A_EDX[30].
Document for the Idle HLT intercept feature will be available in the
next version of "AMD64 Architecture Programmer’s Manual".
Testing Done:
Added a selftest to test the Idle HLT intercept functionality.
Tested SEV and SEV-ES guest for the Idle HLT intercept functionality.
Manali Shukla (5):
x86/cpufeatures: Add CPUID feature bit for Idle HLT intercept
KVM: SVM: Add Idle HLT intercept support
tools: Add KVM exit reason for the Idle HLT
selftests: Add an interface to read the data of named vcpu stat
selftests: KVM: SVM: Add Idle HLT intercept test
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/svm.h | 1 +
arch/x86/include/uapi/asm/svm.h | 2 +
arch/x86/kvm/svm/svm.c | 11 +-
tools/arch/x86/include/uapi/asm/svm.h | 2 +
tools/testing/selftests/kvm/Makefile | 1 +
.../selftests/kvm/include/kvm_util_base.h | 11 ++
tools/testing/selftests/kvm/lib/kvm_util.c | 41 ++++++
.../selftests/kvm/x86_64/svm_idlehlt_test.c | 119 ++++++++++++++++++
9 files changed, 186 insertions(+), 3 deletions(-)
create mode 100644 tools/testing/selftests/kvm/x86_64/svm_idlehlt_test.c
base-commit: fdd58834d132046149699b88a27a0db26829f4fb
--
2.34.1
Hi,
While running kselftest on vanilla torvalds tree kernel commit v6.8-11167-g4438a810f396,
the test suite reported a number of errors.
I was using the latest iproute2-next suite on an Ubuntu 22.04 LTS box.
# Tests passed: 558
# Tests failed: 84
not ok 90 selftests: net: test_vxlan_mdb.sh # exit=1
495:# TEST: Destination IP - match [FAIL]
496:# TEST: Destination IP - no match [FAIL]
497:# TEST: Default destination port - match [FAIL]
498:# TEST: Default destination port - no match [FAIL]
499:# TEST: Non-default destination port - match [FAIL]
500:# TEST: Non-default destination port - no match [FAIL]
501:# TEST: Default destination VNI - match [FAIL]
502:# TEST: Default destination VNI - no match [FAIL]
503:# TEST: Non-default destination VNI - match [FAIL]
504:# TEST: Non-default destination VNI - no match [FAIL]
521:# TEST: Destination IP - match [FAIL]
522:# TEST: Destination IP - no match [FAIL]
523:# TEST: Default destination port - match [FAIL]
524:# TEST: Default destination port - no match [FAIL]
525:# TEST: Non-default destination port - match [FAIL]
526:# TEST: Non-default destination port - no match [FAIL]
527:# TEST: Default destination VNI - match [FAIL]
528:# TEST: Default destination VNI - no match [FAIL]
529:# TEST: Non-default destination VNI - match [FAIL]
530:# TEST: Non-default destination VNI - no match [FAIL]
549:# TEST: Forward valid source - first VTEP [FAIL]
550:# TEST: Forward valid source - second VTEP [FAIL]
551:# TEST: Block excluded source after removal - first VTEP [FAIL]
552:# TEST: Block excluded source after removal - second VTEP [FAIL]
553:# TEST: Forward valid source after removal - first VTEP [FAIL]
554:# TEST: Forward valid source after removal - second VTEP [FAIL]
571:# TEST: Forward valid source - first VTEP [FAIL]
572:# TEST: Forward valid source - second VTEP [FAIL]
573:# TEST: Block excluded source after removal - first VTEP [FAIL]
574:# TEST: Block excluded source after removal - second VTEP [FAIL]
575:# TEST: Forward valid source after removal - first VTEP [FAIL]
576:# TEST: Forward valid source after removal - second VTEP [FAIL]
593:# TEST: Forward valid source - first VTEP [FAIL]
594:# TEST: Forward valid source - second VTEP [FAIL]
595:# TEST: Block excluded source after removal - first VTEP [FAIL]
596:# TEST: Block excluded source after removal - second VTEP [FAIL]
597:# TEST: Forward valid source after removal - first VTEP [FAIL]
598:# TEST: Forward valid source after removal - second VTEP [FAIL]
615:# TEST: Forward valid source - first VTEP [FAIL]
616:# TEST: Forward valid source - second VTEP [FAIL]
617:# TEST: Block excluded source after removal - first VTEP [FAIL]
618:# TEST: Block excluded source after removal - second VTEP [FAIL]
619:# TEST: Forward valid source after removal - first VTEP [FAIL]
620:# TEST: Forward valid source after removal - second VTEP [FAIL]
636:# TEST: Forward valid source [FAIL]
637:# TEST: Receive of valid source after removal from group [FAIL]
648:# TEST: Forward valid source [FAIL]
649:# TEST: Receive of valid source after removal from group [FAIL]
660:# TEST: Forward valid source [FAIL]
661:# TEST: Receive of valid source after removal from group [FAIL]
672:# TEST: Forward valid source [FAIL]
673:# TEST: Receive of valid source after removal from group [FAIL]
683:# TEST: Egress VNI translation - PVID configured [FAIL]
684:# TEST: Egress VNI translation - no PVID configured [FAIL]
685:# TEST: Egress VNI translation - PVID reconfigured [FAIL]
695:# TEST: Egress VNI translation - PVID configured [FAIL]
696:# TEST: Egress VNI translation - no PVID configured [FAIL]
697:# TEST: Egress VNI translation - PVID reconfigured [FAIL]
707:# TEST: Registered IPv4 multicast - first VTEP [FAIL]
709:# TEST: Unregistered IPv4 multicast - first VTEP [FAIL]
710:# TEST: Unregistered IPv4 multicast - second VTEP [FAIL]
711:# TEST: Link-local IPv4 multicast - first VTEP [FAIL]
712:# TEST: Link-local IPv4 multicast - second VTEP [FAIL]
713:# TEST: Registered IPv4 multicast with a unicast MAC - first VTEP [FAIL]
714:# TEST: Registered IPv4 multicast with a unicast MAC - second VTEP [FAIL]
715:# TEST: Registered IPv4 multicast with a broadcast MAC - first VTEP [FAIL]
716:# TEST: Registered IPv4 multicast with a broadcast MAC - second VTEP [FAIL]
734:# TEST: Registered IPv4 multicast - first VTEP [FAIL]
736:# TEST: Unregistered IPv4 multicast - first VTEP [FAIL]
737:# TEST: Unregistered IPv4 multicast - second VTEP [FAIL]
738:# TEST: Link-local IPv4 multicast - first VTEP [FAIL]
739:# TEST: Link-local IPv4 multicast - second VTEP [FAIL]
740:# TEST: Registered IPv4 multicast with a unicast MAC - first VTEP [FAIL]
741:# TEST: Registered IPv4 multicast with a unicast MAC - second VTEP [FAIL]
742:# TEST: Registered IPv4 multicast with a broadcast MAC - first VTEP [FAIL]
743:# TEST: Registered IPv4 multicast with a broadcast MAC - second VTEP [FAIL]
761:# TEST: IP multicast - first VTEP [FAIL]
763:# TEST: Broadcast - first VTEP [FAIL]
765:# TEST: IP multicast after removal - first VTEP [FAIL]
766:# TEST: IP multicast after removal - second VTEP [FAIL]
779:# TEST: IP multicast - first VTEP [FAIL]
781:# TEST: Broadcast - first VTEP [FAIL]
783:# TEST: IP multicast after removal - first VTEP [FAIL]
784:# TEST: IP multicast after removal - second VTEP [FAIL]
The problem is present at least since 6.8-rc7.
Please find attached the config and the full output of test_vxlan_mdb.sh.
Hope this helps.
Best regards,
Mirsad Todorovac
New version of the sleepable bpf_timer code, without the HID changes, as
they can now go through the HID tree independantly.
For reference, the use cases I have in mind:
---
Basically, I need to be able to defer a HID-BPF program for the
following reasons (from the aforementioned patch):
1. defer an event:
Sometimes we receive an out of proximity event, but the device can not
be trusted enough, and we need to ensure that we won't receive another
one in the following n milliseconds. So we need to wait those n
milliseconds, and eventually re-inject that event in the stack.
2. inject new events in reaction to one given event:
We might want to transform one given event into several. This is the
case for macro keys where a single key press is supposed to send
a sequence of key presses. But this could also be used to patch a
faulty behavior, if a device forgets to send a release event.
3. communicate with the device in reaction to one event:
We might want to communicate back to the device after a given event.
For example a device might send us an event saying that it came back
from sleeping state and needs to be re-initialized.
Currently we can achieve that by keeping a userspace program around,
raise a bpf event, and let that userspace program inject the events and
commands.
However, we are just keeping that program alive as a daemon for just
scheduling commands. There is no logic in it, so it doesn't really justify
an actual userspace wakeup. So a kernel workqueue seems simpler to handle.
bpf_timers are currently running in a soft IRQ context, this patch
series implements a sleppable context for them.
Cheers,
Benjamin
To: Alexei Starovoitov <ast(a)kernel.org>
To: Daniel Borkmann <daniel(a)iogearbox.net>
To: Andrii Nakryiko <andrii(a)kernel.org>
To: Martin KaFai Lau <martin.lau(a)linux.dev>
To: Eduard Zingerman <eddyz87(a)gmail.com>
To: Song Liu <song(a)kernel.org>
To: Yonghong Song <yonghong.song(a)linux.dev>
To: John Fastabend <john.fastabend(a)gmail.com>
To: KP Singh <kpsingh(a)kernel.org>
To: Stanislav Fomichev <sdf(a)google.com>
To: Hao Luo <haoluo(a)google.com>
To: Jiri Olsa <jolsa(a)kernel.org>
To: Mykola Lysenko <mykolal(a)fb.com>
To: Shuah Khan <shuah(a)kernel.org>
Cc: Benjamin Tissoires <bentiss(a)kernel.org>
Cc: <bpf(a)vger.kernel.org>
Cc: <linux-kernel(a)vger.kernel.org>
Cc: <linux-kselftest(a)vger.kernel.org>
---
Changes in v4:
- dropped the HID changes, they can go independently from bpf-core
- addressed Alexei's and Eduard's remarks
- added selftests
- Link to v3: https://lore.kernel.org/r/20240221-hid-bpf-sleepable-v3-0-1fb378ca6301@kern…
Changes in v3:
- fixed the crash from v2
- changed the API to have only BPF_F_TIMER_SLEEPABLE for
bpf_timer_start()
- split the new kfuncs/verifier patch into several sub-patches, for
easier reviews
- Link to v2: https://lore.kernel.org/r/20240214-hid-bpf-sleepable-v2-0-5756b054724d@kern…
Changes in v2:
- make use of bpf_timer (and dropped the custom HID handling)
- implemented bpf_timer_set_sleepable_cb as a kfunc
- still not implemented global subprogs
- no sleepable bpf_timer selftests yet
- Link to v1: https://lore.kernel.org/r/20240209-hid-bpf-sleepable-v1-0-4cc895b5adbd@kern…
---
Benjamin Tissoires (6):
bpf/helpers: introduce sleepable bpf_timers
bpf/verifier: add bpf_timer as a kfunc capable type
bpf/helpers: introduce bpf_timer_set_sleepable_cb() kfunc
bpf/helpers: mark the callback of bpf_timer_set_sleepable_cb() as sleepable
tools: sync include/uapi/linux/bpf.h
selftests/bpf: add sleepable timer tests
include/linux/bpf_verifier.h | 1 +
include/uapi/linux/bpf.h | 4 +
kernel/bpf/helpers.c | 132 ++++++++++++++++++++-
kernel/bpf/verifier.c | 92 +++++++++++++-
tools/include/uapi/linux/bpf.h | 4 +
tools/testing/selftests/bpf/bpf_experimental.h | 4 +
.../selftests/bpf/bpf_testmod/bpf_testmod.c | 5 +
.../selftests/bpf/bpf_testmod/bpf_testmod_kfunc.h | 1 +
tools/testing/selftests/bpf/prog_tests/timer.c | 1 +
tools/testing/selftests/bpf/progs/timer.c | 40 ++++++-
tools/testing/selftests/bpf/progs/timer_failure.c | 114 +++++++++++++++++-
11 files changed, 387 insertions(+), 11 deletions(-)
---
base-commit: 9187210eee7d87eea37b45ea93454a88681894a4
change-id: 20240205-hid-bpf-sleepable-c01260fd91c4
Best regards,
--
Benjamin Tissoires <bentiss(a)kernel.org>
PASID (Process Address Space ID) is a PCIe extension to tag the DMA
transactions out of a physical device, and most modern IOMMU hardware
have supported PASID granular address translation. So a PASID-capable
device can be attached to multiple hwpts (a.k.a. domains), each attachment
is tagged with a pasid.
This series first adds a missing iommu API to replace domain for a pasid,
then adds iommufd APIs for device drivers to attach/replace/detach pasid
to/from hwpt per userspace's request, and adds selftest to validate the
iommufd APIs.
pasid attach/replace is mandatory on Intel VT-d given the PASID table
locates in the physical address space hence must be managed by the kernel,
both for supporting vSVA and coming SIOV. But it's optional on ARM/AMD
which allow configuring the PASID/CD table either in host physical address
space or nested on top of an GPA address space. This series only add VT-d
support as the minimal requirement.
Complete code can be found in below link:
https://github.com/yiliu1765/iommufd/tree/iommufd_pasid
Change log:
v1:
- Implemnet iommu_replace_device_pasid() to fall back to the original domain
if this replacement failed (Kevin)
- Add check in do_attach() to check corressponding attach_fn per the pasid value.
rfc: https://lore.kernel.org/linux-iommu/20230926092651.17041-1-yi.l.liu@intel.c…
Regards,
Yi Liu
Kevin Tian (1):
iommufd: Support attach/replace hwpt per pasid
Lu Baolu (2):
iommu: Introduce a replace API for device pasid
iommu/vt-d: Add set_dev_pasid callback for nested domain
Yi Liu (5):
iommufd: replace attach_fn with a structure
iommufd/selftest: Add set_dev_pasid and remove_dev_pasid in mock iommu
iommufd/selftest: Add a helper to get test device
iommufd/selftest: Add test ops to test pasid attach/detach
iommufd/selftest: Add coverage for iommufd pasid attach/detach
drivers/iommu/intel/nested.c | 47 +++++
drivers/iommu/iommu-priv.h | 2 +
drivers/iommu/iommu.c | 82 ++++++--
drivers/iommu/iommufd/Makefile | 1 +
drivers/iommu/iommufd/device.c | 50 +++--
drivers/iommu/iommufd/iommufd_private.h | 23 +++
drivers/iommu/iommufd/iommufd_test.h | 24 +++
drivers/iommu/iommufd/pasid.c | 138 ++++++++++++++
drivers/iommu/iommufd/selftest.c | 176 ++++++++++++++++--
include/linux/iommufd.h | 6 +
tools/testing/selftests/iommu/iommufd.c | 172 +++++++++++++++++
.../selftests/iommu/iommufd_fail_nth.c | 28 ++-
tools/testing/selftests/iommu/iommufd_utils.h | 78 ++++++++
13 files changed, 785 insertions(+), 42 deletions(-)
create mode 100644 drivers/iommu/iommufd/pasid.c
--
2.34.1
This patch enhances the BPF helpers by adding a kfunc to retrieve the
cgroup v2 of a task, addressing a previous limitation where only
bpf_task_get_cgroup1 was available for cgroup v1. The new kfunc is
particularly useful for scenarios where obtaining the cgroup ID of a
task other than the "current" one is necessary, which the existing
bpf_get_current_cgroup_id helper cannot accommodate. A specific use
case at Netflix involved the sched_switch tracepoint, where we had to
get the cgroup IDs of both the prev and next tasks.
The bpf_task_get_cgroup kfunc acquires and returns a reference to a
task's default cgroup, ensuring thread-safe access by correctly
implementing RCU read locking and unlocking. It leverages the existing
cgroup.h helper, and cgroup_tryget to safely acquire a reference to it.
Signed-off-by: Jose Fernandez <josef(a)netflix.com>
Reviewed-by: Tycho Andersen <tycho(a)tycho.pizza>
Acked-by: Yonghong Song <yonghong.song(a)linux.dev>
Acked-by: Stanislav Fomichev <sdf(a)google.com>
---
V2 -> V3: No changes
V1 -> V2: Return a pointer to the cgroup instead of the cgroup ID
kernel/bpf/helpers.c | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index a89587859571..bbd19d5eedb6 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2266,6 +2266,31 @@ bpf_task_get_cgroup1(struct task_struct *task, int hierarchy_id)
return NULL;
return cgrp;
}
+
+/**
+ * bpf_task_get_cgroup - Acquire a reference to the default cgroup of a task.
+ * @task: The target task
+ *
+ * This function returns the task's default cgroup, primarily
+ * designed for use with cgroup v2. In cgroup v1, the concept of default
+ * cgroup varies by subsystem, and while this function will work with
+ * cgroup v1, it's recommended to use bpf_task_get_cgroup1 instead.
+ * A cgroup returned by this kfunc which is not subsequently stored in a
+ * map, must be released by calling bpf_cgroup_release().
+ *
+ * Return: On success, the cgroup is returned. On failure, NULL is returned.
+ */
+__bpf_kfunc struct cgroup *bpf_task_get_cgroup(struct task_struct *task)
+{
+ struct cgroup *cgrp;
+
+ rcu_read_lock();
+ cgrp = task_dfl_cgroup(task);
+ if (!cgroup_tryget(cgrp))
+ cgrp = NULL;
+ rcu_read_unlock();
+ return cgrp;
+}
#endif /* CONFIG_CGROUPS */
/**
@@ -2573,6 +2598,7 @@ BTF_ID_FLAGS(func, bpf_cgroup_ancestor, KF_ACQUIRE | KF_RCU | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_cgroup_from_id, KF_ACQUIRE | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_task_under_cgroup, KF_RCU)
BTF_ID_FLAGS(func, bpf_task_get_cgroup1, KF_ACQUIRE | KF_RCU | KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_task_get_cgroup, KF_ACQUIRE | KF_RCU | KF_RET_NULL)
#endif
BTF_ID_FLAGS(func, bpf_task_from_pid, KF_ACQUIRE | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_throw)
base-commit: c733239f8f530872a1f80d8c45dcafbaff368737
--
2.40.1
On riscv, mmap currently returns an address from the largest address
space that can fit entirely inside of the hint address. This makes it
such that the hint address is almost never returned. This patch raises
the mappable area up to and including the hint address. This allows mmap
to often return the hint address, which allows a performance improvement
over searching for a valid address as well as making the behavior more
similar to other architectures.
Note that a previous patch introduced stronger semantics compared to
other architectures for riscv mmap. On riscv, mmap will not use bits in
the upper bits of the virtual address depending on the hint address. On
other architectures, a random address is returned in the address space
requested. On all architectures the hint address will be returned if it
is available. This allows riscv applications to configure how many bits
in the virtual address should be left empty. This has the two benefits
of being able to request address spaces that are smaller than the
default and doesn't require the application to know the page table
layout of riscv.
Signed-off-by: Charlie Jenkins <charlie(a)rivosinc.com>
---
Changes in v2:
- Add back forgotten "mmap_end = STACK_TOP_MAX"
- Link to v1: https://lore.kernel.org/r/20240129-use_mmap_hint_address-v1-0-4c74da813ba1@…
---
Charlie Jenkins (3):
riscv: mm: Use hint address in mmap if available
selftests: riscv: Generalize mm selftests
docs: riscv: Define behavior of mmap
Documentation/arch/riscv/vm-layout.rst | 16 ++--
arch/riscv/include/asm/processor.h | 22 +++---
tools/testing/selftests/riscv/mm/mmap_bottomup.c | 20 +----
tools/testing/selftests/riscv/mm/mmap_default.c | 20 +----
tools/testing/selftests/riscv/mm/mmap_test.h | 93 +++++++++++++-----------
5 files changed, 67 insertions(+), 104 deletions(-)
---
base-commit: 556e2d17cae620d549c5474b1ece053430cd50bc
change-id: 20240119-use_mmap_hint_address-f9f4b1b6f5f1
--
- Charlie
v3:
- pull forward to v6.8
- style and small fixups recommended by jcameron
- update syscall number (will do all archs when RFC tag drops)
- update for new folio code
- added OCP link to device-tracked address hotness proposal
- kept void* over __u64 simply because it integrates cleanly with
existing migration code. If there's strong opinions, I can refactor.
This patch set is a proposal for a syscall analogous to move_pages,
that migrates pages between NUMA nodes using physical addressing.
The intent is to better enable user-land system-wide memory tiering
as CXL devices begin to provide memory resources on the PCIe bus.
For example, user-land software which is making decisions based on
data sources which expose physical address information no longer
must convert that information to virtual addressing to act upon it
(see background for info on how physical addresses are acquired).
The syscall requires CAP_SYS_ADMIN, since physical address source
information is typically protected by the same (or CAP_SYS_NICE).
This patch set broken into 3 patches:
1) refactor of existing migration code for code reuse
2) The sys_move_phys_pages system call.
3) ktest of the syscall
The sys_move_phys_pages system call validates the page may be
migrated by checking migratable-status of each vma mapping the page,
and the intersection of cpuset policies each vma's task.
Background:
Userspace job schedulers, memory managers, and tiering software
solutions depend on page migration syscalls to reallocate resources
across NUMA nodes. Currently, these calls enable movement of memory
associated with a specific PID. Moves can be requested in coarse,
process-sized strokes (as with migrate_pages), and on specific virtual
pages (via move_pages).
However, a number of profiling mechanisms provide system-wide information
that would benefit from a physical-addressing version move_pages.
There are presently at least 4 ways userland can acquire physical
address information for use with this interface, and 1 hardware offload
mechanism being proposed by opencompute.
1) /proc/pid/pagemap: can be used to do page table translations.
This is only really useful for testing, and the ktest was
written using this functionality.
2) X86: IBS (AMD) and PEBS (Intel) can be configured to return physical
and/or vitual address information.
3) zoneinfo: /proc/zoneinfo exposes the start PFN of zones
4) /sys/kernel/mm/page_idle: A way to query whether a PFN is idle.
So long as the page size is known, this can be used to identify
system-wide idle pages that could be migrated to lower tiers.
https://docs.kernel.org/admin-guide/mm/idle_page_tracking.html
5) CXL Offloaded Hotness Monitoring (Proposed): a CXL memory device
may provide hot/cold information about its memory. For example,
it may report the hottest device addresses (0-based) or a physical
address (if it has access to decoders for convert bases).
DPA can be cheaply converted to HPA by combining it with data
exposed by /sys/bus/cxl/ information (region address bases).
See: https://www.opencompute.org/documents/ocp-cms-hotness-tracking-requirements…
Information from these sources facilitates systemwide resource management,
but with the limitations of migrate_pages and move_pages applying to
individual tasks, their outputs must be converted back to virtual addresses
and re-associated with specific PIDs.
Doing this reverse-translation outside of the kernel requires considerable
space and compute, and it will have to be performed again by the existing
system calls. Much of this work can be avoided if the pages can be
migrated directly with physical memory addressing.
Gregory Price (3):
mm/migrate: refactor add_page_for_migration for code re-use
mm/migrate: Create move_phys_pages syscall
ktest: sys_move_phys_pages ktest
arch/x86/entry/syscalls/syscall_32.tbl | 1 +
arch/x86/entry/syscalls/syscall_64.tbl | 1 +
include/linux/syscalls.h | 5 +
include/uapi/asm-generic/unistd.h | 8 +-
kernel/sys_ni.c | 1 +
mm/migrate.c | 288 ++++++++++++++++++++----
tools/include/uapi/asm-generic/unistd.h | 8 +-
tools/testing/selftests/mm/migration.c | 99 ++++++++
8 files changed, 370 insertions(+), 41 deletions(-)
--
2.39.1
In some systems, the netcat server can incur in delay to start listening.
When this happens, the test can randomly fail in various points.
This is an example error message:
# ip gre none gso
# encap 192.168.1.1 to 192.168.1.2, type gre, mac none len 2000
# test basic connectivity
# Ncat: Connection refused.
The issue stems from a race condition between the netcat client and server.
The test author had addressed this problem by implementing a sleep, which
I have removed in this patch.
This patch introduces a function capable of sleeping for up to two seconds.
However, it can terminate the waiting period early if the port is reported
to be listening.
Signed-off-by: Alessandro Carminati (Red Hat) <alessandro.carminati(a)gmail.com>
---
tools/testing/selftests/bpf/test_tc_tunnel.sh | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh
index 910044f08908..7989ec608454 100755
--- a/tools/testing/selftests/bpf/test_tc_tunnel.sh
+++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh
@@ -72,7 +72,6 @@ cleanup() {
server_listen() {
ip netns exec "${ns2}" nc "${netcat_opt}" -l "${port}" > "${outfile}" &
server_pid=$!
- sleep 0.2
}
client_connect() {
@@ -93,6 +92,16 @@ verify_data() {
fi
}
+wait_for_port() {
+ for i in $(seq 20); do
+ if ip netns exec "${ns2}" ss ${2:--4}OHntl | grep -q "$1"; then
+ return 0
+ fi
+ sleep 0.1
+ done
+ return 1
+}
+
set -e
# no arguments: automated test, run all
@@ -193,6 +202,7 @@ setup
# basic communication works
echo "test basic connectivity"
server_listen
+wait_for_port ${port} ${netcat_opt}
client_connect
verify_data
@@ -204,6 +214,7 @@ ip netns exec "${ns1}" tc filter add dev veth1 egress \
section "encap_${tuntype}_${mac}"
echo "test bpf encap without decap (expect failure)"
server_listen
+wait_for_port ${port} ${netcat_opt}
! client_connect
if [[ "$tuntype" =~ "udp" ]]; then
--
2.34.1
Sub-Numa Clustering (SNC) allows splitting CPU cores, caches and memory
into multiple NUMA nodes. When enabled, NUMA-aware applications can
achieve better performance on bigger server platforms.
The series adding SNC support to the kernel is currently in review [1]
but the selftests for resctrl need NUMA-aware adjustments to use these
changes. Issues with resctrl selftests not working properly with SNC
enabled were originally reported by Shaopeng Tan [2][3] and the
following series resolves them.
The main concept currently missing from resctrl selftests is that while
resctrl tracks memory accesses on a single NUMA node (which normally is
the same as the CPU socket) on machines with SNC enabled memory accesses
can leak outside of the local NUMA node and into other NUMA nodes on the
same socket. In that case resctrl could report a diminished value in one
of its monitoring technologies: Cache Monitoring Technology (CMT) or
Memory Bandwidth Monitoring (MBM) .
Implemented solutions for both CMT and MBM follow the same idea which is
to simply sum values reported by different NUMA nodes for a single
Resource Monitoring ID (RMID).
Series was tested on Ice Lake server platforms with SNC disabled, SNC-2
and SNC-4. The tests were also ran with and without kernel support for
SNC.
Series applies cleanly on kselftest/next.
[1] https://lore.kernel.org/all/20240228112215.8044-tony.luck@intel.com/
[2] https://lore.kernel.org/all/TYAPR01MB6330B9B17686EF426D2C3F308B25A@TYAPR01M…
[3] https://lore.kernel.org/lkml/TYAPR01MB6330A4EB3633B791939EA45E8B39A@TYAPR01…
Maciej Wieczor-Retman (4):
selftests/resctrl: Adjust effective L3 cache size with SNC enabled
selftests/resctrl: SNC support for CMT
selftests/resctrl: SNC support for MBM
selftests/resctrl: Adjust SNC support messages
tools/testing/selftests/resctrl/cache.c | 17 ++-
tools/testing/selftests/resctrl/cat_test.c | 2 +-
tools/testing/selftests/resctrl/cmt_test.c | 6 +-
tools/testing/selftests/resctrl/mba_test.c | 3 +-
tools/testing/selftests/resctrl/mbm_test.c | 4 +-
tools/testing/selftests/resctrl/resctrl.h | 13 +-
tools/testing/selftests/resctrl/resctrl_val.c | 46 ++++---
tools/testing/selftests/resctrl/resctrlfs.c | 128 +++++++++++++++++-
8 files changed, 185 insertions(+), 34 deletions(-)
--
2.44.0
Hi,
With the commit v6.8-11167-g4438a810f396 in vanilla torvalds tree, there seem to be problems with
the icmp_redirect.sh tests.
The iproute2-next tools were used, commit 7a6d30c95da9.
# timeout set to 3600
# selftests: net: icmp_redirect.sh
#
# ###########################################################################
# Legacy routing
# ###########################################################################
#
# TEST: IPv4: redirect exception [FAIL]
# TEST: IPv6: redirect exception [ OK ]
# TEST: IPv4: redirect exception plus mtu [FAIL]
# TEST: IPv6: redirect exception plus mtu [ OK ]
# TEST: IPv4: routing reset [ OK ]
# TEST: IPv6: routing reset [ OK ]
# TEST: IPv4: mtu exception [ OK ]
# TEST: IPv6: mtu exception [ OK ]
# TEST: IPv4: mtu exception plus redirect [FAIL]
# TEST: IPv6: mtu exception plus redirect [ OK ]
#
# ###########################################################################
# Legacy routing with VRF
# ###########################################################################
#
# TEST: IPv4: redirect exception [FAIL]
# TEST: IPv6: redirect exception [ OK ]
# TEST: IPv4: redirect exception plus mtu [FAIL]
# TEST: IPv6: redirect exception plus mtu [ OK ]
# TEST: IPv4: routing reset [ OK ]
# TEST: IPv6: routing reset [ OK ]
# TEST: IPv4: mtu exception [ OK ]
# TEST: IPv6: mtu exception [ OK ]
# TEST: IPv4: mtu exception plus redirect [FAIL]
# TEST: IPv6: mtu exception plus redirect [ OK ]
#
# ###########################################################################
# Routing with nexthop objects
# ###########################################################################
#
# TEST: IPv4: redirect exception [FAIL]
# TEST: IPv6: redirect exception [ OK ]
# TEST: IPv4: redirect exception plus mtu [FAIL]
# TEST: IPv6: redirect exception plus mtu [ OK ]
# TEST: IPv4: routing reset [ OK ]
# TEST: IPv6: routing reset [ OK ]
# TEST: IPv4: mtu exception [ OK ]
# TEST: IPv6: mtu exception [ OK ]
# TEST: IPv4: mtu exception plus redirect [FAIL]
# TEST: IPv6: mtu exception plus redirect [ OK ]
#
# ###########################################################################
# Routing with nexthop objects and VRF
# ###########################################################################
#
# TEST: IPv4: redirect exception [FAIL]
# TEST: IPv6: redirect exception [ OK ]
# TEST: IPv4: redirect exception plus mtu [FAIL]
# TEST: IPv6: redirect exception plus mtu [ OK ]
# TEST: IPv4: routing reset [ OK ]
# TEST: IPv6: routing reset [ OK ]
# TEST: IPv4: mtu exception [ OK ]
# TEST: IPv6: mtu exception [ OK ]
# TEST: IPv4: mtu exception plus redirect [FAIL]
# TEST: IPv6: mtu exception plus redirect [ OK ]
#
# Tests passed: 28
# Tests failed: 12
# Tests xfailed: 0
not ok 45 selftests: net: icmp_redirect.sh # exit=1
These errors are not introduced with this commit, but were already present at least in 6.8-rc7.
Hope this helps.
Best regards,
Mirsad Todorovac
This patch enhances the BPF helpers by adding a kfunc to retrieve the
cgroup v2 of a task, addressing a previous limitation where only
bpf_task_get_cgroup1 was available for cgroup v1. The new kfunc is
particularly useful for scenarios where obtaining the cgroup ID of a
task other than the "current" one is necessary, which the existing
bpf_get_current_cgroup_id helper cannot accommodate. A specific use
case at Netflix involved the sched_switch tracepoint, where we had to
get the cgroup IDs of both the prev and next tasks.
The bpf_task_get_cgroup kfunc acquires and returns a reference to a
task's default cgroup, ensuring thread-safe access by correctly
implementing RCU read locking and unlocking. It leverages the existing
cgroup.h helper, and cgroup_tryget to safely acquire a reference to it.
Signed-off-by: Jose Fernandez <josef(a)netflix.com>
Reviewed-by: Tycho Andersen <tycho(a)tycho.pizza>
---
V1 -> V2: Return a pointer to the cgroup instead of the cgroup ID
kernel/bpf/helpers.c | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index a89587859571..bbd19d5eedb6 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2266,6 +2266,31 @@ bpf_task_get_cgroup1(struct task_struct *task, int hierarchy_id)
return NULL;
return cgrp;
}
+
+/**
+ * bpf_task_get_cgroup - Acquire a reference to the default cgroup of a task.
+ * @task: The target task
+ *
+ * This function returns the task's default cgroup, primarily
+ * designed for use with cgroup v2. In cgroup v1, the concept of default
+ * cgroup varies by subsystem, and while this function will work with
+ * cgroup v1, it's recommended to use bpf_task_get_cgroup1 instead.
+ * A cgroup returned by this kfunc which is not subsequently stored in a
+ * map, must be released by calling bpf_cgroup_release().
+ *
+ * Return: On success, the cgroup is returned. On failure, NULL is returned.
+ */
+__bpf_kfunc struct cgroup *bpf_task_get_cgroup(struct task_struct *task)
+{
+ struct cgroup *cgrp;
+
+ rcu_read_lock();
+ cgrp = task_dfl_cgroup(task);
+ if (!cgroup_tryget(cgrp))
+ cgrp = NULL;
+ rcu_read_unlock();
+ return cgrp;
+}
#endif /* CONFIG_CGROUPS */
/**
@@ -2573,6 +2598,7 @@ BTF_ID_FLAGS(func, bpf_cgroup_ancestor, KF_ACQUIRE | KF_RCU | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_cgroup_from_id, KF_ACQUIRE | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_task_under_cgroup, KF_RCU)
BTF_ID_FLAGS(func, bpf_task_get_cgroup1, KF_ACQUIRE | KF_RCU | KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_task_get_cgroup, KF_ACQUIRE | KF_RCU | KF_RET_NULL)
#endif
BTF_ID_FLAGS(func, bpf_task_from_pid, KF_ACQUIRE | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_throw)
base-commit: 4c8644f86c854c214aaabbcc24a27fa4c7e6a951
--
2.40.1
Add ability to parse multiple files. Additionally add the
ability to parse all results in the KUnit debugfs repository.
How to parse multiple files:
./tools/testing/kunit/kunit.py parse results.log results2.log
How to parse all files in directory:
./tools/testing/kunit/kunit.py parse directory_path/*
How to parse KUnit debugfs repository:
./tools/testing/kunit/kunit.py parse debugfs
For each file, the parser outputs the file name, results, and test
summary. At the end of all parsing, the parser outputs a total summary
line.
This feature can be easily tested on the tools/testing/kunit/test_data/
directory.
Signed-off-by: Rae Moar <rmoar(a)google.com>
---
Changes since v3:
- Changing from input() to stdin
- Add checking for non-regular files
- Spacing fix
- Small printing fix
tools/testing/kunit/kunit.py | 54 +++++++++++++++++++++++++-----------
1 file changed, 38 insertions(+), 16 deletions(-)
diff --git a/tools/testing/kunit/kunit.py b/tools/testing/kunit/kunit.py
index bc74088c458a..641b8ca83e3e 100755
--- a/tools/testing/kunit/kunit.py
+++ b/tools/testing/kunit/kunit.py
@@ -511,19 +511,40 @@ def exec_handler(cli_args: argparse.Namespace) -> None:
def parse_handler(cli_args: argparse.Namespace) -> None:
- if cli_args.file is None:
- sys.stdin.reconfigure(errors='backslashreplace') # type: ignore
- kunit_output = sys.stdin # type: Iterable[str]
- else:
- with open(cli_args.file, 'r', errors='backslashreplace') as f:
- kunit_output = f.read().splitlines()
- # We know nothing about how the result was created!
- metadata = kunit_json.Metadata()
- request = KunitParseRequest(raw_output=cli_args.raw_output,
- json=cli_args.json)
- result, _ = parse_tests(request, metadata, kunit_output)
- if result.status != KunitStatus.SUCCESS:
- sys.exit(1)
+ parsed_files = cli_args.files # type: List[str]
+ total_test = kunit_parser.Test()
+ total_test.status = kunit_parser.TestStatus.SUCCESS
+ if not parsed_files:
+ parsed_files.append('/dev/stdin')
+ elif len(parsed_files) == 1 and parsed_files[0] == "debugfs":
+ parsed_files.pop()
+ for (root, _, files) in os.walk("/sys/kernel/debug/kunit"):
+ parsed_files.extend(os.path.join(root, f) for f in files if f == "results")
+ if not parsed_files:
+ print("No files found.")
+
+ for file in parsed_files:
+ if os.path.isdir(file):
+ print(f'Ignoring directory "{file}"')
+ elif os.path.exists(file):
+ print(file)
+ with open(file, 'r', errors='backslashreplace') as f:
+ kunit_output = f.read().splitlines()
+ # We know nothing about how the result was created!
+ metadata = kunit_json.Metadata()
+ request = KunitParseRequest(raw_output=cli_args.raw_output,
+ json=cli_args.json)
+ _, test = parse_tests(request, metadata, kunit_output)
+ total_test.subtests.append(test)
+ else:
+ print(f'Could not find "{file}"')
+
+ if len(parsed_files) > 1: # if more than one file was parsed output total summary
+ print('All files parsed.')
+ if not request.raw_output:
+ stdout.print_with_timestamp(kunit_parser.DIVIDER)
+ kunit_parser.bubble_up_test_results(total_test)
+ kunit_parser.print_summary_line(total_test)
subcommand_handlers_map = {
@@ -569,9 +590,10 @@ def main(argv: Sequence[str]) -> None:
help='Parses KUnit results from a file, '
'and parses formatted results.')
add_parse_opts(parse_parser)
- parse_parser.add_argument('file',
- help='Specifies the file to read results from.',
- type=str, nargs='?', metavar='input_file')
+ parse_parser.add_argument('files',
+ help='List of file paths to read results from or keyword'
+ '"debugfs" to read all results from the debugfs directory.',
+ type=str, nargs='*', metavar='input_files')
cli_args = parser.parse_args(massage_argv(argv))
base-commit: 806cb2270237ce2ec672a407d66cee17a07d3aa2
--
2.44.0.291.gc1ea87d7ee-goog
Add ability to parse multiple files. Additionally add the
ability to parse all results in the KUnit debugfs repository.
How to parse multiple files:
./tools/testing/kunit/kunit.py parse results.log results2.log
How to parse all files in directory:
./tools/testing/kunit/kunit.py parse directory_path/*
How to parse KUnit debugfs repository:
./tools/testing/kunit/kunit.py parse debugfs
For each file, the parser outputs the file name, results, and test
summary. At the end of all parsing, the parser outputs a total summary
line.
This feature can be easily tested on the tools/testing/kunit/test_data/
directory.
Signed-off-by: Rae Moar <rmoar(a)google.com>
---
Changes since v2:
- Fixed bug with input from command line. I changed this to use
input(). Daniel, let me know if this works for you.
- Add more specific warning messages
tools/testing/kunit/kunit.py | 56 +++++++++++++++++++++++++-----------
1 file changed, 40 insertions(+), 16 deletions(-)
diff --git a/tools/testing/kunit/kunit.py b/tools/testing/kunit/kunit.py
index bc74088c458a..1aa3d736d80c 100755
--- a/tools/testing/kunit/kunit.py
+++ b/tools/testing/kunit/kunit.py
@@ -511,19 +511,42 @@ def exec_handler(cli_args: argparse.Namespace) -> None:
def parse_handler(cli_args: argparse.Namespace) -> None:
- if cli_args.file is None:
- sys.stdin.reconfigure(errors='backslashreplace') # type: ignore
- kunit_output = sys.stdin # type: Iterable[str]
- else:
- with open(cli_args.file, 'r', errors='backslashreplace') as f:
- kunit_output = f.read().splitlines()
- # We know nothing about how the result was created!
- metadata = kunit_json.Metadata()
- request = KunitParseRequest(raw_output=cli_args.raw_output,
- json=cli_args.json)
- result, _ = parse_tests(request, metadata, kunit_output)
- if result.status != KunitStatus.SUCCESS:
- sys.exit(1)
+ parsed_files = cli_args.files # type: List[str]
+ total_test = kunit_parser.Test()
+ total_test.status = kunit_parser.TestStatus.SUCCESS
+ if not parsed_files:
+ parsed_files.append(input("File path: "))
+
+ if parsed_files[0] == "debugfs" and len(parsed_files) == 1:
+ parsed_files.pop()
+ for (root, _, files) in os.walk("/sys/kernel/debug/kunit"):
+ parsed_files.extend(os.path.join(root, f) for f in files if f == "results")
+
+ if not parsed_files:
+ print("No files found.")
+
+ for file in parsed_files:
+ if os.path.isfile(file):
+ print(file)
+ with open(file, 'r', errors='backslashreplace') as f:
+ kunit_output = f.read().splitlines()
+ # We know nothing about how the result was created!
+ metadata = kunit_json.Metadata()
+ request = KunitParseRequest(raw_output=cli_args.raw_output,
+ json=cli_args.json)
+ _, test = parse_tests(request, metadata, kunit_output)
+ total_test.subtests.append(test)
+ elif os.path.isdir(file):
+ print("Ignoring directory ", file)
+ else:
+ print("Could not find ", file)
+
+ if len(parsed_files) > 1: # if more than one file was parsed output total summary
+ print('All files parsed.')
+ if not request.raw_output:
+ stdout.print_with_timestamp(kunit_parser.DIVIDER)
+ kunit_parser.bubble_up_test_results(total_test)
+ kunit_parser.print_summary_line(total_test)
subcommand_handlers_map = {
@@ -569,9 +592,10 @@ def main(argv: Sequence[str]) -> None:
help='Parses KUnit results from a file, '
'and parses formatted results.')
add_parse_opts(parse_parser)
- parse_parser.add_argument('file',
- help='Specifies the file to read results from.',
- type=str, nargs='?', metavar='input_file')
+ parse_parser.add_argument('files',
+ help='List of file paths to read results from or keyword'
+ '"debugfs" to read all results from the debugfs directory.',
+ type=str, nargs='*', metavar='input_files')
cli_args = parser.parse_args(massage_argv(argv))
base-commit: 806cb2270237ce2ec672a407d66cee17a07d3aa2
--
2.44.0.278.ge034bb2e1d-goog
Add missing flags argument to open(2) call with O_CREAT.
Some tests fail to compile if _FORTIFY_SOURCE is defined (to any valid
value) (together with -O), resulting in similar error messages such as:
In file included from /usr/include/fcntl.h:342,
from gup_test.c:1:
In function 'open',
inlined from 'main' at gup_test.c:206:10:
/usr/include/bits/fcntl2.h:50:11: error: call to '__open_missing_mode' declared with attribute error: open with O_CREAT or O_TMPFILE in second argument needs 3 arguments
50 | __open_missing_mode ();
| ^~~~~~~~~~~~~~~~~~~~~~
_FORTIFY_SOURCE is enabled by default in some distributions, so the
tests are not built by default and are skipped.
open(2) man-page warns about missing flags argument: "if it is not
supplied, some arbitrary bytes from the stack will be applied as the
file mode."
Fixes: aeb85ed4f41a ("tools/testing/selftests/vm/gup_benchmark.c: allow user specified file")
Fixes: fbe37501b252 ("mm: huge_memory: debugfs for file-backed THP split")
Fixes: c942f5bd17b3 ("selftests: soft-dirty: add test for mprotect")
Cc: Keith Busch <kbusch(a)kernel.org>
Cc: Zi Yan <ziy(a)nvidia.com>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Yang Shi <shy828301(a)gmail.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Vitaly Chikunov <vt(a)altlinux.org>
---
tools/testing/selftests/mm/gup_test.c | 2 +-
tools/testing/selftests/mm/soft-dirty.c | 2 +-
tools/testing/selftests/mm/split_huge_page_test.c | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/mm/gup_test.c b/tools/testing/selftests/mm/gup_test.c
index cbe99594d319..18a49c70d4c6 100644
--- a/tools/testing/selftests/mm/gup_test.c
+++ b/tools/testing/selftests/mm/gup_test.c
@@ -203,7 +203,7 @@ int main(int argc, char **argv)
ksft_print_header();
ksft_set_plan(nthreads);
- filed = open(file, O_RDWR|O_CREAT);
+ filed = open(file, O_RDWR|O_CREAT, 0664);
if (filed < 0)
ksft_exit_fail_msg("Unable to open %s: %s\n", file, strerror(errno));
diff --git a/tools/testing/selftests/mm/soft-dirty.c b/tools/testing/selftests/mm/soft-dirty.c
index cc5f144430d4..7dbfa53d93a0 100644
--- a/tools/testing/selftests/mm/soft-dirty.c
+++ b/tools/testing/selftests/mm/soft-dirty.c
@@ -137,7 +137,7 @@ static void test_mprotect(int pagemap_fd, int pagesize, bool anon)
if (!map)
ksft_exit_fail_msg("anon mmap failed\n");
} else {
- test_fd = open(fname, O_RDWR | O_CREAT);
+ test_fd = open(fname, O_RDWR | O_CREAT, 0664);
if (test_fd < 0) {
ksft_test_result_skip("Test %s open() file failed\n", __func__);
return;
diff --git a/tools/testing/selftests/mm/split_huge_page_test.c b/tools/testing/selftests/mm/split_huge_page_test.c
index 856662d2f87a..6c988bd2f335 100644
--- a/tools/testing/selftests/mm/split_huge_page_test.c
+++ b/tools/testing/selftests/mm/split_huge_page_test.c
@@ -223,7 +223,7 @@ void split_file_backed_thp(void)
ksft_exit_fail_msg("Fail to create file-backed THP split testing file\n");
}
- fd = open(testfile, O_CREAT|O_WRONLY);
+ fd = open(testfile, O_CREAT|O_WRONLY, 0664);
if (fd == -1) {
ksft_perror("Cannot open testing file");
goto cleanup;
--
2.42.1
Some unit tests intentionally trigger warning backtraces by passing bad
parameters to kernel API functions. Such unit tests typically check the
return value from such calls, not the existence of the warning backtrace.
Such intentionally generated warning backtraces are neither desirable
nor useful for a number of reasons.
- They can result in overlooked real problems.
- A warning that suddenly starts to show up in unit tests needs to be
investigated and has to be marked to be ignored, for example by
adjusting filter scripts. Such filters are ad-hoc because there is
no real standard format for warnings. On top of that, such filter
scripts would require constant maintenance.
One option to address problem would be to add messages such as "expected
warning backtraces start / end here" to the kernel log. However, that
would again require filter scripts, it might result in missing real
problematic warning backtraces triggered while the test is running, and
the irrelevant backtrace(s) would still clog the kernel log.
Solve the problem by providing a means to identify and suppress specific
warning backtraces while executing test code. Support suppressing multiple
backtraces while at the same time limiting changes to generic code to the
absolute minimum. Architecture specific changes are kept at minimum by
retaining function names only if both CONFIG_DEBUG_BUGVERBOSE and
CONFIG_KUNIT are enabled.
The first patch of the series introduces the necessary infrastructure.
The second patch introduces support for counting suppressed backtraces.
This capability is used in patch three to implement unit tests.
Patch four documents the new API.
The next two patches add support for suppressing backtraces in drm_rect
and dev_addr_lists unit tests. These patches are intended to serve as
examples for the use of the functionality introduced with this series.
The remaining patches implement the necessary changes for all
architectures with GENERIC_BUG support.
This series is based on the RFC patch and subsequent discussion at
https://patchwork.kernel.org/project/linux-kselftest/patch/02546e59-1afe-4b…
and offers a more comprehensive solution of the problem discussed there.
Design note:
Function pointers are only added to the __bug_table section if both
CONFIG_KUNIT and CONFIG_DEBUG_BUGVERBOSE are enabled to avoid image
size increases if CONFIG_KUNIT=n. There would be some benefits to
adding those pointers all the time (reduced complexity, ability to
display function names in BUG/WARNING messages). That change, if
desired, can be made later.
Checkpatch note:
Remaining checkpatch errors and warnings were deliberately ignored.
Some are triggered by matching coding style or by comments interpreted
as code, others by assembler macros which are disliked by checkpatch.
Suggestions for improvements are welcome.
Changes since RFC:
- Minor cleanups and bug fixes
- Added support for all affected architectures
- Added support for counting suppressed warnings
- Added unit tests using those counters
- Added patch to suppress warning backtraces in dev_addr_lists tests
This patch enhances the BPF helpers by adding a kfunc to retrieve the
cgroup v2 ID of a specific task, addressing a previous limitation where
only bpf_task_get_cgroup1 was available for cgroup v1. The new kfunc is
particularly useful for scenarios where obtaining the cgroup ID of a
task other than the "current" one is necessary, which the existing
bpf_get_current_cgroup_id helper cannot accommodate. A specific use case
at Netflix involved the sched_switch tracepoint, where we had to get
the cgroup IDs of both the previous and next tasks.
The bpf_task_get_cgroup_id kfunc returns a task's cgroup ID, correctly
implementing RCU read locking and unlocking for safe data access, and
leverages existing cgroup.h helpers to fetch the cgroup and its ID.
Signed-off-by: Jose Fernandez <josef(a)netflix.com>
Reviewed-by: Tycho Andersen <tycho(a)tycho.pizza>
---
kernel/bpf/helpers.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index a89587859571..8038b2bd3488 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2266,6 +2266,27 @@ bpf_task_get_cgroup1(struct task_struct *task, int hierarchy_id)
return NULL;
return cgrp;
}
+
+/**
+ * bpf_task_get_cgroup_id - Get the cgroup ID of a task.
+ * @task: The target task
+ *
+ * This function returns the ID of the task's default cgroup, primarily
+ * designed for use with cgroup v2. In cgroup v1, the concept of default
+ * cgroup varies by subsystem, and while this function will work with
+ * cgroup v1, it's recommended to use bpf_task_get_cgroup1 instead.
+ */
+__bpf_kfunc u64 bpf_task_get_cgroup_id(struct task_struct *task)
+{
+ struct cgroup *cgrp;
+ u64 cgrp_id;
+
+ rcu_read_lock();
+ cgrp = task_dfl_cgroup(task);
+ cgrp_id = cgroup_id(cgrp);
+ rcu_read_unlock();
+ return cgrp_id;
+}
#endif /* CONFIG_CGROUPS */
/**
@@ -2573,6 +2594,7 @@ BTF_ID_FLAGS(func, bpf_cgroup_ancestor, KF_ACQUIRE | KF_RCU | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_cgroup_from_id, KF_ACQUIRE | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_task_under_cgroup, KF_RCU)
BTF_ID_FLAGS(func, bpf_task_get_cgroup1, KF_ACQUIRE | KF_RCU | KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_task_get_cgroup_id, KF_RCU)
#endif
BTF_ID_FLAGS(func, bpf_task_from_pid, KF_ACQUIRE | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_throw)
--
2.40.1
There are statements with two semicolons. Remove the second one, it
is redundant.
Signed-off-by: Colin Ian King <colin.i.king(a)gmail.com>
---
tools/testing/selftests/bpf/benchs/bench_local_storage_create.c | 2 +-
tools/testing/selftests/bpf/progs/iters.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/bpf/benchs/bench_local_storage_create.c b/tools/testing/selftests/bpf/benchs/bench_local_storage_create.c
index b36de42ee4d9..e2ff8ea1cb79 100644
--- a/tools/testing/selftests/bpf/benchs/bench_local_storage_create.c
+++ b/tools/testing/selftests/bpf/benchs/bench_local_storage_create.c
@@ -186,7 +186,7 @@ static void *task_producer(void *input)
for (i = 0; i < batch_sz; i++) {
if (!pthd_results[i])
- pthread_join(pthds[i], NULL);;
+ pthread_join(pthds[i], NULL);
}
}
diff --git a/tools/testing/selftests/bpf/progs/iters.c b/tools/testing/selftests/bpf/progs/iters.c
index 3db416606f2f..fe65e0952a1e 100644
--- a/tools/testing/selftests/bpf/progs/iters.c
+++ b/tools/testing/selftests/bpf/progs/iters.c
@@ -673,7 +673,7 @@ static __noinline void fill(struct bpf_iter_num *it, int *arr, __u32 n, int mul)
static __noinline int sum(struct bpf_iter_num *it, int *arr, __u32 n)
{
- int *t, i, sum = 0;;
+ int *t, i, sum = 0;
while ((t = bpf_iter_num_next(it))) {
i = *t;
--
2.39.2
On 12/12/2023 12:47 PM, Shashar, Sagi wrote:
>
>
> -----Original Message-----
> From: Sagi Shahar <sagis(a)google.com>
> Sent: Tuesday, December 12, 2023 12:47 PM
> To: linux-kselftest(a)vger.kernel.org; Ackerley Tng <ackerleytng(a)google.com>; Afranji, Ryan <afranji(a)google.com>; Aktas, Erdem <erdemaktas(a)google.com>; Sagi Shahar <sagis(a)google.com>; Yamahata, Isaku <isaku.yamahata(a)intel.com>
> Cc: Sean Christopherson <seanjc(a)google.com>; Paolo Bonzini <pbonzini(a)redhat.com>; Shuah Khan <shuah(a)kernel.org>; Peter Gonda <pgonda(a)google.com>; Xu, Haibo1 <haibo1.xu(a)intel.com>; Chao Peng <chao.p.peng(a)linux.intel.com>; Annapurve, Vishal <vannapurve(a)google.com>; Roger Wang <runanwang(a)google.com>; Vipin Sharma <vipinsh(a)google.com>; jmattson(a)google.com; dmatlack(a)google.com; linux-kernel(a)vger.kernel.org; kvm(a)vger.kernel.org; linux-mm(a)kvack.org
> Subject: [RFC PATCH v5 27/29] KVM: selftests: Propagate KVM_EXIT_MEMORY_FAULT to userspace
>
> Allow userspace to handle KVM_EXIT_MEMORY_FAULT instead of triggering TEST_ASSERT.
>
> From the KVM_EXIT_MEMORY_FAULT documentation:
> Note! KVM_EXIT_MEMORY_FAULT is unique among all KVM exit reasons in that it accompanies a return code of '-1', not '0'! errno will always be set to EFAULT or EHWPOISON when KVM exits with KVM_EXIT_MEMORY_FAULT, userspace should assume kvm_run.exit_reason is stale/undefined for all other error numbers.
If KVM exits to userspace with KVM_EXIT_MEMORY_FAULT, most likely it's because the guest attempts to access the gfn in a way that is different from what the KVM is configured, in terms of private/shared property. I'd suggest to drop this patch and work on the selftests code to eliminate this exit.
If we need a testcase to catch this exit intentionally, we may call _vcpu_run() directly from the testcase and keep the common API vcpu_run() intact.
>
> Signed-off-by: Sagi Shahar <sagis(a)google.com>
> ---
> tools/testing/selftests/kvm/lib/kvm_util.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> index d024abc5379c..8fb041e51484 100644
> --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> @@ -1742,6 +1742,10 @@ void vcpu_run(struct kvm_vcpu *vcpu) {
> int ret = _vcpu_run(vcpu);
>
> + // Allow this scenario to be handled by the caller.
> + if (ret == -1 && errno == EFAULT)
> + return;
> +
> TEST_ASSERT(!ret, KVM_IOCTL_ERROR(KVM_RUN, ret)); }
>
> --
> 2.43.0.472.g3155946c3a-goog
>
The arm64 Guarded Control Stack (GCS) feature provides support for
hardware protected stacks of return addresses, intended to provide
hardening against return oriented programming (ROP) attacks and to make
it easier to gather call stacks for applications such as profiling.
When GCS is active a secondary stack called the Guarded Control Stack is
maintained, protected with a memory attribute which means that it can
only be written with specific GCS operations. The current GCS pointer
can not be directly written to by userspace. When a BL is executed the
value stored in LR is also pushed onto the GCS, and when a RET is
executed the top of the GCS is popped and compared to LR with a fault
being raised if the values do not match. GCS operations may only be
performed on GCS pages, a data abort is generated if they are not.
The combination of hardware enforcement and lack of extra instructions
in the function entry and exit paths should result in something which
has less overhead and is more difficult to attack than a purely software
implementation like clang's shadow stacks.
This series implements support for use of GCS by userspace, along with
support for use of GCS within KVM guests. It does not enable use of GCS
by either EL1 or EL2, this will be implemented separately. Executables
are started without GCS and must use a prctl() to enable it, it is
expected that this will be done very early in application execution by
the dynamic linker or other startup code. For dynamic linking this will
be done by checking that everything in the executable is marked as GCS
compatible.
x86 has an equivalent feature called shadow stacks, this series depends
on the x86 patches for generic memory management support for the new
guarded/shadow stack page type and shares APIs as much as possible. As
there has been extensive discussion with the wider community around the
ABI for shadow stacks I have as far as practical kept implementation
decisions close to those for x86, anticipating that review would lead to
similar conclusions in the absence of strong reasoning for divergence.
The main divergence I am concious of is that x86 allows shadow stack to
be enabled and disabled repeatedly, freeing the shadow stack for the
thread whenever disabled, while this implementation keeps the GCS
allocated after disable but refuses to reenable it. This is to avoid
races with things actively walking the GCS during a disable, we do
anticipate that some systems will wish to disable GCS at runtime but are
not aware of any demand for subsequently reenabling it.
x86 uses an arch_prctl() to manage enable and disable, since only x86
and S/390 use arch_prctl() a generic prctl() was proposed[1] as part of a
patch set for the equivalent RISC-V Zicfiss feature which I initially
adopted fairly directly but following review feedback has been revised
quite a bit.
We currently maintain the x86 pattern of implicitly allocating a shadow
stack for threads started with shadow stack enabled, there has been some
discussion of removing this support and requiring the use of clone3()
with explicit allocation of shadow stacks instead. I have no strong
feelings either way, implicit allocation is not really consistent with
anything else we do and creates the potential for errors around thread
exit but on the other hand it is existing ABI on x86 and minimises the
changes needed in userspace code.
There is an open issue with support for CRIU, on x86 this required the
ability to set the GCS mode via ptrace. This series supports
configuring mode bits other than enable/disable via ptrace but it needs
to be confirmed if this is sufficient.
The series depends on support for shadow stacks in clone3(), that series
includes the addition of ARCH_HAS_USER_SHADOW_STACK.
https://lore.kernel.org/r/20231120-clone3-shadow-stack-v3-0-a7b8ed3e2acc@ke…
It also depends on the addition of more waitpid() flags to nolibc:
https://lore.kernel.org/r/20231023-nolibc-waitpid-flags-v2-1-b09d096f091f@k…
You can see a branch with the full set of dependencies against Linus'
tree at:
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/misc.git arm64-gcs
[1] https://lore.kernel.org/lkml/20230213045351.3945824-1-debug@rivosinc.com/
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Changes in v8:
- Invalidate signal cap token on stack when consuming.
- Typo and other trivial fixes.
- Don't try to use process_vm_write() on GCS, it intentionally does not
work.
- Fix leak of thread GCSs.
- Rebase onto latest clone3() series.
- Link to v7: https://lore.kernel.org/r/20231122-arm64-gcs-v7-0-201c483bd775@kernel.org
Changes in v7:
- Rebase onto v6.7-rc2 via the clone3() patch series.
- Change the token used to cap the stack during signal handling to be
compatible with GCSPOPM.
- Fix flags for new page types.
- Fold in support for clone3().
- Replace copy_to_user_gcs() with put_user_gcs().
- Link to v6: https://lore.kernel.org/r/20231009-arm64-gcs-v6-0-78e55deaa4dd@kernel.org
Changes in v6:
- Rebase onto v6.6-rc3.
- Add some more gcsb_dsync() barriers following spec clarifications.
- Due to ongoing discussion around clone()/clone3() I've not updated
anything there, the behaviour is the same as on previous versions.
- Link to v5: https://lore.kernel.org/r/20230822-arm64-gcs-v5-0-9ef181dd6324@kernel.org
Changes in v5:
- Don't map any permissions for user GCSs, we always use EL0 accessors
or use a separate mapping of the page.
- Reduce the standard size of the GCS to RLIMIT_STACK/2.
- Enforce a PAGE_SIZE alignment requirement on map_shadow_stack().
- Clarifications and fixes to documentation.
- More tests.
- Link to v4: https://lore.kernel.org/r/20230807-arm64-gcs-v4-0-68cfa37f9069@kernel.org
Changes in v4:
- Implement flags for map_shadow_stack() allowing the cap and end of
stack marker to be enabled independently or not at all.
- Relax size and alignment requirements for map_shadow_stack().
- Add more blurb explaining the advantages of hardware enforcement.
- Link to v3: https://lore.kernel.org/r/20230731-arm64-gcs-v3-0-cddf9f980d98@kernel.org
Changes in v3:
- Rebase onto v6.5-rc4.
- Add a GCS barrier on context switch.
- Add a GCS stress test.
- Link to v2: https://lore.kernel.org/r/20230724-arm64-gcs-v2-0-dc2c1d44c2eb@kernel.org
Changes in v2:
- Rebase onto v6.5-rc3.
- Rework prctl() interface to allow each bit to be locked independently.
- map_shadow_stack() now places the cap token based on the size
requested by the caller not the actual space allocated.
- Mode changes other than enable via ptrace are now supported.
- Expand test coverage.
- Various smaller fixes and adjustments.
- Link to v1: https://lore.kernel.org/r/20230716-arm64-gcs-v1-0-bf567f93bba6@kernel.org
---
Mark Brown (38):
arm64/mm: Restructure arch_validate_flags() for extensibility
prctl: arch-agnostic prctl for shadow stack
mman: Add map_shadow_stack() flags
arm64: Document boot requirements for Guarded Control Stacks
arm64/gcs: Document the ABI for Guarded Control Stacks
arm64/sysreg: Add definitions for architected GCS caps
arm64/gcs: Add manual encodings of GCS instructions
arm64/gcs: Provide put_user_gcs()
arm64/cpufeature: Runtime detection of Guarded Control Stack (GCS)
arm64/mm: Allocate PIE slots for EL0 guarded control stack
mm: Define VM_SHADOW_STACK for arm64 when we support GCS
arm64/mm: Map pages for guarded control stack
KVM: arm64: Manage GCS registers for guests
arm64/gcs: Allow GCS usage at EL0 and EL1
arm64/idreg: Add overrride for GCS
arm64/hwcap: Add hwcap for GCS
arm64/traps: Handle GCS exceptions
arm64/mm: Handle GCS data aborts
arm64/gcs: Context switch GCS state for EL0
arm64/gcs: Ensure that new threads have a GCS
arm64/gcs: Implement shadow stack prctl() interface
arm64/mm: Implement map_shadow_stack()
arm64/signal: Set up and restore the GCS context for signal handlers
arm64/signal: Expose GCS state in signal frames
arm64/ptrace: Expose GCS via ptrace and core files
arm64: Add Kconfig for Guarded Control Stack (GCS)
kselftest/arm64: Verify the GCS hwcap
kselftest/arm64: Add GCS as a detected feature in the signal tests
kselftest/arm64: Add framework support for GCS to signal handling tests
kselftest/arm64: Allow signals tests to specify an expected si_code
kselftest/arm64: Always run signals tests with GCS enabled
kselftest/arm64: Add very basic GCS test program
kselftest/arm64: Add a GCS test program built with the system libc
kselftest/arm64: Add test coverage for GCS mode locking
selftests/arm64: Add GCS signal tests
kselftest/arm64: Add a GCS stress test
kselftest/arm64: Enable GCS for the FP stress tests
kselftest: Provide shadow stack enable helpers for arm64
Documentation/admin-guide/kernel-parameters.txt | 6 +
Documentation/arch/arm64/booting.rst | 22 +
Documentation/arch/arm64/elf_hwcaps.rst | 3 +
Documentation/arch/arm64/gcs.rst | 233 +++++++
Documentation/arch/arm64/index.rst | 1 +
Documentation/filesystems/proc.rst | 2 +-
arch/arm64/Kconfig | 20 +
arch/arm64/include/asm/cpufeature.h | 6 +
arch/arm64/include/asm/el2_setup.h | 17 +
arch/arm64/include/asm/esr.h | 28 +-
arch/arm64/include/asm/exception.h | 2 +
arch/arm64/include/asm/gcs.h | 107 +++
arch/arm64/include/asm/hwcap.h | 1 +
arch/arm64/include/asm/kvm_arm.h | 4 +-
arch/arm64/include/asm/kvm_host.h | 12 +
arch/arm64/include/asm/mman.h | 23 +-
arch/arm64/include/asm/pgtable-prot.h | 14 +-
arch/arm64/include/asm/processor.h | 7 +
arch/arm64/include/asm/sysreg.h | 20 +
arch/arm64/include/asm/uaccess.h | 40 ++
arch/arm64/include/uapi/asm/hwcap.h | 1 +
arch/arm64/include/uapi/asm/ptrace.h | 8 +
arch/arm64/include/uapi/asm/sigcontext.h | 9 +
arch/arm64/kernel/cpufeature.c | 19 +
arch/arm64/kernel/cpuinfo.c | 1 +
arch/arm64/kernel/entry-common.c | 23 +
arch/arm64/kernel/idreg-override.c | 2 +
arch/arm64/kernel/process.c | 85 +++
arch/arm64/kernel/ptrace.c | 59 ++
arch/arm64/kernel/signal.c | 242 ++++++-
arch/arm64/kernel/traps.c | 11 +
arch/arm64/kvm/emulate-nested.c | 4 +
arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h | 17 +
arch/arm64/kvm/sys_regs.c | 22 +
arch/arm64/mm/Makefile | 1 +
arch/arm64/mm/fault.c | 79 ++-
arch/arm64/mm/gcs.c | 300 +++++++++
arch/arm64/mm/mmap.c | 13 +-
arch/arm64/tools/cpucaps | 1 +
arch/x86/include/uapi/asm/mman.h | 3 -
fs/proc/task_mmu.c | 3 +
include/linux/mm.h | 16 +-
include/uapi/asm-generic/mman.h | 4 +
include/uapi/linux/elf.h | 1 +
include/uapi/linux/prctl.h | 22 +
kernel/sys.c | 30 +
tools/testing/selftests/arm64/Makefile | 2 +-
tools/testing/selftests/arm64/abi/hwcap.c | 19 +
tools/testing/selftests/arm64/fp/assembler.h | 15 +
tools/testing/selftests/arm64/fp/fpsimd-test.S | 2 +
tools/testing/selftests/arm64/fp/sve-test.S | 2 +
tools/testing/selftests/arm64/fp/za-test.S | 2 +
tools/testing/selftests/arm64/fp/zt-test.S | 2 +
tools/testing/selftests/arm64/gcs/.gitignore | 5 +
tools/testing/selftests/arm64/gcs/Makefile | 24 +
tools/testing/selftests/arm64/gcs/asm-offsets.h | 0
tools/testing/selftests/arm64/gcs/basic-gcs.c | 428 ++++++++++++
tools/testing/selftests/arm64/gcs/gcs-locking.c | 200 ++++++
.../selftests/arm64/gcs/gcs-stress-thread.S | 311 +++++++++
tools/testing/selftests/arm64/gcs/gcs-stress.c | 532 +++++++++++++++
tools/testing/selftests/arm64/gcs/gcs-util.h | 100 +++
tools/testing/selftests/arm64/gcs/libc-gcs.c | 736 +++++++++++++++++++++
tools/testing/selftests/arm64/signal/.gitignore | 1 +
.../testing/selftests/arm64/signal/test_signals.c | 17 +-
.../testing/selftests/arm64/signal/test_signals.h | 6 +
.../selftests/arm64/signal/test_signals_utils.c | 32 +-
.../selftests/arm64/signal/test_signals_utils.h | 39 ++
.../arm64/signal/testcases/gcs_exception_fault.c | 62 ++
.../selftests/arm64/signal/testcases/gcs_frame.c | 88 +++
.../arm64/signal/testcases/gcs_write_fault.c | 67 ++
.../selftests/arm64/signal/testcases/testcases.c | 7 +
.../selftests/arm64/signal/testcases/testcases.h | 1 +
tools/testing/selftests/ksft_shstk.h | 37 ++
73 files changed, 4241 insertions(+), 40 deletions(-)
---
base-commit: 50abefbf1bc07f5c4e403fd28f71dcee855100f7
change-id: 20230303-arm64-gcs-e311ab0d8729
Best regards,
--
Mark Brown <broonie(a)kernel.org>
Changes since v1:
- Rebased the series on top of next-20240202
Muhammad Usama Anjum (12):
selftests/mm: map_fixed_noreplace: conform test to TAP format output
selftests/mm: map_hugetlb: conform test to TAP format output
selftests/mm: map_populate: conform test to TAP format output
selftests/mm: mlock-random-test: conform test to TAP format output
selftests/mm: mlock2-tests: conform test to TAP format output
selftests/mm: mrelease_test: conform test to TAP format output
selftests/mm: mremap_dontunmap: conform test to TAP format output
selftests/mm: split_huge_page_test: conform test to TAP format output
selftests/mm: thp_settings: conform to TAP format output
selftests/mm: thuge-gen: conform to TAP format output
selftests/mm: transhuge-stress: conform to TAP format output
selftests/mm: virtual_address_range: conform to TAP format output
tools/testing/selftests/mm/khugepaged.c | 3 +-
.../selftests/mm/map_fixed_noreplace.c | 96 ++----
tools/testing/selftests/mm/map_hugetlb.c | 42 ++-
tools/testing/selftests/mm/map_populate.c | 37 ++-
.../testing/selftests/mm/mlock-random-test.c | 136 ++++-----
tools/testing/selftests/mm/mlock2-tests.c | 282 +++++++-----------
tools/testing/selftests/mm/mlock2.h | 11 +-
tools/testing/selftests/mm/mrelease_test.c | 80 ++---
tools/testing/selftests/mm/mremap_dontunmap.c | 32 +-
.../selftests/mm/split_huge_page_test.c | 161 +++++-----
tools/testing/selftests/mm/thp_settings.c | 123 +++-----
tools/testing/selftests/mm/thp_settings.h | 4 +-
tools/testing/selftests/mm/thuge-gen.c | 147 ++++-----
tools/testing/selftests/mm/transhuge-stress.c | 36 ++-
.../selftests/mm/virtual_address_range.c | 44 +--
tools/testing/selftests/mm/vm_util.c | 6 +-
16 files changed, 537 insertions(+), 703 deletions(-)
--
2.42.0
In this series, I'm trying to add 3 missing tests to vm_runtests.sh
which is used to run all the tests in mm suite. These tests weren't
running by CIs. While enabling them and through review feedback, I've
fixed some problems in tests as well. I've found more flakiness in more
tests which I'll be fixing with future patches.
hugetlb-read-hwpoison test is being added where it can only run with
newly added "-d" (destructive) flag only. Not sure why it is failing
again. So once it become stable, we can think of moving it to default
set of tests if it doesn't have any side-effect to them.
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
---
Changes in v3:
- Add cover letter
- Fix flakiness in tests found during enablement
- Move additional tests down in the file
- Add "-d" option which poisons the pages and aren't being useable after
the test
v2: https://lore.kernel.org/all/20240123073615.920324-1-usama.anjum@collabora.c…
Muhammad Usama Anjum (5):
selftests/mm: hugetlb_reparenting_test: do not unmount
selftests/mm: run_vmtests: remove sudo and conform to tap
selftests/mm: save and restore nr_hugepages value
selftests/mm: protection_keys: save/restore nr_hugepages settings
selftests/mm: run_vmtests.sh: add missing tests
tools/testing/selftests/mm/Makefile | 5 +++
.../selftests/mm/charge_reserved_hugetlb.sh | 4 +++
.../selftests/mm/hugetlb_reparenting_test.sh | 9 +++--
tools/testing/selftests/mm/on-fault-limit.c | 36 +++++++++----------
tools/testing/selftests/mm/protection_keys.c | 34 ++++++++++++++++++
tools/testing/selftests/mm/run_vmtests.sh | 10 +++++-
6 files changed, 76 insertions(+), 22 deletions(-)
--
2.42.0
Hello,
I've been running execveat (execveat.c) locally on v6.1 and next-20240228.
It has flaky test case. There are some test cases which fail consistently.
The comment (not very clear) on top of failing cases is as following:
/*
* Execute as a long pathname relative to "/". If this is a script,
* the interpreter will launch but fail to open the script because its
* name ("/dev/fd/5/xxx....") is bigger than PATH_MAX.
*
* The failure code is usually 127 (POSIX: "If a command is not found,
* the exit status shall be 127."), but some systems give 126 (POSIX:
* "If the command name is found, but it is not an executable utility,
* the exit status shall be 126."), so allow either.
*/
The file name is just less than PATH_MAX (4096) and we are expecting the
execveat() to fail with particular 99 or 127/128 error code. But kernel is
returning 1 error code. Snippet from full output:
# child 3493092 exited with 1 not 99 nor 99
# child 3493094 exited with 1 not 127 nor 126
I'm not sure if test is wrong or the kernel has changed the return error codes.
Full test run output:
./execveat
TAP version 13
1..51
ok 1 Check success of execveat(3, '../execveat', 0)...
ok 2 Check success of execveat(5, 'execveat', 0)...
ok 3 Check success of execveat(7, 'execveat', 0)...
ok 4 Check success of execveat(-100,
'/home/usama/repos/ke...ftests/exec/execveat', 0)...
ok 5 Check success of execveat(99,
'/home/usama/repos/ke...ftests/exec/execveat', 0)...
ok 6 Check success of execveat(9, '', 4096)...
ok 7 Check success of execveat(18, '', 4096)...
ok 8 Check success of execveat(10, '', 4096)...
ok 9 Check success of execveat(15, '', 4096)...
ok 10 Check success of execveat(15, '', 4096)...
ok 11 Check success of execveat(16, '', 4096)...
ok 12 Check failure of execveat(9, '', 0) with ENOENT
ok 13 Check failure of execveat(9, '(null)', 4096) with EFAULT
ok 14 Check success of execveat(5, 'execveat.symlink', 0)...
ok 15 Check success of execveat(7, 'execveat.symlink', 0)...
ok 16 Check success of execveat(-100,
'/home/usama/repos/ke...xec/execveat.symlink', 0)...
ok 17 Check success of execveat(11, '', 4096)...
ok 18 Check success of execveat(11, '', 4352)...
ok 19 Check failure of execveat(5, 'execveat.symlink', 256) with ELOOP
ok 20 Check failure of execveat(7, 'execveat.symlink', 256) with ELOOP
ok 21 Check failure of execveat(-100,
'/home/usama/repos/kernel/linux_mainline/tools/testing/selftests/exec/execveat.symlink',
256) with ELOOP
ok 22 Check failure of execveat(5, 'pipe', 0) with EACCES
ok 23 Check success of execveat(3, '../script', 0)...
ok 24 Check success of execveat(5, 'script', 0)...
ok 25 Check success of execveat(7, 'script', 0)...
ok 26 Check success of execveat(-100,
'/home/usama/repos/ke...elftests/exec/script', 0)...
ok 27 Check success of execveat(14, '', 4096)...
ok 28 Check success of execveat(14, '', 4352)...
ok 29 Check failure of execveat(19, '', 4096) with ENOENT
ok 30 Check failure of execveat(8, 'script', 0) with ENOENT
ok 31 Check success of execveat(17, '', 4096)...
ok 32 Check success of execveat(17, '', 4096)...
ok 33 Check success of execveat(4, '../script', 0)...
ok 34 Check success of execveat(4, 'script', 0)...
ok 35 Check success of execveat(4, '../script', 0)...
ok 36 Check failure of execveat(4, 'script', 0) with ENOENT
ok 37 Check failure of execveat(5, 'execveat', 65535) with EINVAL
ok 38 Check failure of execveat(5, 'no-such-file', 0) with ENOENT
ok 39 Check failure of execveat(7, 'no-such-file', 0) with ENOENT
ok 40 Check failure of execveat(-100, 'no-such-file', 0) with ENOENT
ok 41 Check failure of execveat(5, '', 4096) with EACCES
ok 42 Check failure of execveat(5, 'Makefile', 0) with EACCES
ok 43 Check failure of execveat(12, '', 4096) with EACCES
ok 44 Check failure of execveat(13, '', 4096) with EACCES
ok 45 Check failure of execveat(99, '', 4096) with EBADF
ok 46 Check failure of execveat(99, 'execveat', 0) with EBADF
ok 47 Check failure of execveat(9, 'execveat', 0) with ENOTDIR
# Invoke copy of 'execveat' via filename of length 4094:
ok 48 Check success of execveat(20, '', 4096)...
# execveat() failed, rc=-1 errno=2 (No such file or directory)
not ok 49 Check success of execveat(6,
'home/usama/repos/ker...yyyyyyyyyyyyyyyyyyyy', 0)...
# child 3493092 exited with 1 not 99 nor 99
not ok 49 Check success of execveat(6,
'home/usama/repos/ker...yyyyyyyyyyyyyyyyyyyy', 0)...
# Invoke copy of 'script' via filename of length 4094:
ok 50 Check success of execveat(21, '', 4096)...
# execveat() failed, rc=-1 errno=2 (No such file or directory)
not ok 51 Check success of execveat(6,
'home/usama/repos/ker...yyyyyyyyyyyyyyyyyyyy', 0)...
# child 3493094 exited with 1 not 127 nor 126
not ok 51 Check success of execveat(6,
'home/usama/repos/ker...yyyyyyyyyyyyyyyyyyyy', 0)...
2 tests failed
# Totals: pass:49 fail:2 xfail:0 xpass:0 skip:0 error:0
--
BR,
Muhammad Usama Anjum
This patchset adds KVM selftests for LoongArch system, currently only
some common test cases are supported and pass to run. These testcase
are listed as following:
demand_paging_test
dirty_log_perf_test
dirty_log_test
guest_print_test
hardware_disable_test
kvm_binary_stats_test
kvm_create_max_vcpus
kvm_page_table_test
memslot_modification_stress_test
memslot_perf_test
set_memory_region_test
This patchset originally is posted from zhaotianrui, I continue to work
on his efforts.
---
Changes in v7:
1. Refine code to add LoongArch support in test case
set_memory_region_test.
Changes in v6:
1. Refresh the patch based on latest kernel 6.8-rc1, add LoongArch
support about testcase set_memory_region_test.
2. Add hardware_disable_test test case.
3. Drop modification about macro DEFAULT_GUEST_TEST_MEM, it is problem
of LoongArch binutils, this issue is raised to LoongArch binutils owners.
Changes in v5:
1. In LoongArch kvm self tests, the DEFAULT_GUEST_TEST_MEM could be
0x130000000, it is different from the default value in memstress.h.
So we Move the definition of DEFAULT_GUEST_TEST_MEM into LoongArch
ucall.h, and add 'ifndef' condition for DEFAULT_GUEST_TEST_MEM
in memstress.h.
Changes in v4:
1. Remove the based-on flag, as the LoongArch KVM patch series
have been accepted by Linux kernel, so this can be applied directly
in kernel.
Changes in v3:
1. Improve implementation of LoongArch VM page walk.
2. Add exception handler for LoongArch.
3. Add dirty_log_test, dirty_log_perf_test, guest_print_test
test cases for LoongArch.
4. Add __ASSEMBLER__ macro to distinguish asm file and c file.
5. Move ucall_arch_do_ucall to the header file and make it as
static inline to avoid function calls.
6. Change the DEFAULT_GUEST_TEST_MEM base addr for LoongArch.
Changes in v2:
1. We should use ".balign 4096" to align the assemble code with 4K in
exception.S instead of "align 12".
2. LoongArch only supports 3 or 4 levels page tables, so we remove the
hanlders for 2-levels page table.
3. Remove the DEFAULT_LOONGARCH_GUEST_STACK_VADDR_MIN and use the common
DEFAULT_GUEST_STACK_VADDR_MIN to allocate stack memory in guest.
4. Reorganize the test cases supported by LoongArch.
5. Fix some code comments.
6. Add kvm_binary_stats_test test case into LoongArch KVM selftests.
---
Tianrui Zhao (4):
KVM: selftests: Add KVM selftests header files for LoongArch
KVM: selftests: Add core KVM selftests support for LoongArch
KVM: selftests: Add ucall test support for LoongArch
KVM: selftests: Add test cases for LoongArch
tools/testing/selftests/kvm/Makefile | 16 +
.../selftests/kvm/include/kvm_util_base.h | 5 +
.../kvm/include/loongarch/processor.h | 133 +++++++
.../selftests/kvm/include/loongarch/ucall.h | 20 ++
.../selftests/kvm/lib/loongarch/exception.S | 59 ++++
.../selftests/kvm/lib/loongarch/processor.c | 332 ++++++++++++++++++
.../selftests/kvm/lib/loongarch/ucall.c | 38 ++
.../selftests/kvm/set_memory_region_test.c | 2 +-
8 files changed, 604 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/kvm/include/loongarch/processor.h
create mode 100644 tools/testing/selftests/kvm/include/loongarch/ucall.h
create mode 100644 tools/testing/selftests/kvm/lib/loongarch/exception.S
create mode 100644 tools/testing/selftests/kvm/lib/loongarch/processor.c
create mode 100644 tools/testing/selftests/kvm/lib/loongarch/ucall.c
base-commit: 6764c317b6bb91bd806ef79adf6d9c0e428b191e
--
2.39.3