[Kevin has a done a great job to get through reviews on all these, and
Vasant/Ankit have been looking at it on AMD systems, I think we are close to
being done now!]
Currently each of the iommu page table formats duplicates all of the logic
to maintain the page table and perform map/unmap/etc operations. There are
several different versions of the algorithms between all the different
formats. The io-pgtable system provides an interface to help isolate the
page table code from the iommu driver, but doesn't provide tools to
implement the common algorithms.
This makes it very hard to improve the state of the pagetable code under
the iommu domains as any proposed improvement needs to alter a large
number of different driver code paths. Combined with a lack of software
based testing this makes improvement in this area very hard.
iommufd wants several new page table operations:
- More efficient map/unmap operations, using iommufd's batching logic
- unmap that returns the physical addresses into a batch as it progresses
- cut that allows splitting areas so large pages can have holes
poked in them dynamically (ie guestmemfd hitless shared/private
transitions)
- More agressive freeing of table memory to avoid waste
- Fragmenting large pages so that dirty tracking can be more granular
- Reassembling large pages so that VMs can run at full IO performance
in migration/dirty tracking error flows
- KHO integration for kernel live upgrade
Together these are algorithmically complex enough to be a very significant
task to go and implement in all the page table formats we support. Just
the "server" focused drivers use almost all the formats (ARMv8 S1&S2 / x86
PAE / AMDv1 / VT-d SS / RISCV)
Instead of doing the duplicated work, this series takes the first step to
consolidate the algorithms into one places. In spirit it is similar to the
work Christoph did a few years back to pull the redundant get_user_pages()
implementations out of the arch code into core MM. This unlocked a great
deal of improvement in that space in the following years. I would like to
see the same benefit in iommu as well.
My first RFC showed a bigger picture with all most all formats and more
algorithms. This series reorganizes that to be narrowly focused on just
enough to convert the AMD driver to use the new mechanism.
kunit tests are provided that allow good testing of the algorithms and all
formats on x86, nothing is arch specific.
AMD is one of the simpler options as the HW is quite uniform with few
different options/bugs while still requiring the complicated contiguous
pages support. The HW also has a very simple range based invalidation
approach that is easy to implement.
The AMD v1 and AMD v2 page table formats are implemented bit for bit
identical to the current code, tested using a compare kunit test that
checks against the io-pgtable version (on github, see below).
Updating the AMD driver to replace the io-pgtable layer with the new stuff
is fairly straightforward now. The layering is fixed up in the new version
so that all the invalidation goes through function pointers.
Several small fixing patches have come out of this as I've been fixing the
problems that the test suite uncovers in the current code, and
implementing the fixed version in iommupt.
On performance, there is a quite wide variety of implementation designs
across all the drivers. Looking at some key performance across
the main formats:
iommu_map():
pgsz ,avg new,old ns, min new,old ns , min % (+ve is better)
2^12, 53,66 , 51,63 , 19.19 (AMDV1)
256*2^12, 386,1909 , 367,1795 , 79.79
256*2^21, 362,1633 , 355,1556 , 77.77
2^12, 56,62 , 52,59 , 11.11 (AMDv2)
256*2^12, 405,1355 , 357,1292 , 72.72
256*2^21, 393,1160 , 358,1114 , 67.67
2^12, 55,65 , 53,62 , 14.14 (VT-d second stage)
256*2^12, 391,518 , 332,512 , 35.35
256*2^21, 383,635 , 336,624 , 46.46
2^12, 57,65 , 55,63 , 12.12 (ARM 64 bit)
256*2^12, 380,389 , 361,369 , 2.02
256*2^21, 358,419 , 345,400 , 13.13
iommu_unmap():
pgsz ,avg new,old ns, min new,old ns , min % (+ve is better)
2^12, 69,88 , 65,85 , 23.23 (AMDv1)
256*2^12, 353,6498 , 331,6029 , 94.94
256*2^21, 373,6014 , 360,5706 , 93.93
2^12, 71,72 , 66,69 , 4.04 (AMDv2)
256*2^12, 228,891 , 206,871 , 76.76
256*2^21, 254,721 , 245,711 , 65.65
2^12, 69,87 , 65,82 , 20.20 (VT-d second stage)
256*2^12, 210,321 , 200,315 , 36.36
256*2^21, 255,349 , 238,342 , 30.30
2^12, 72,77 , 68,74 , 8.08 (ARM 64 bit)
256*2^12, 521,357 , 447,346 , -29.29
256*2^21, 489,358 , 433,345 , -25.25
* Above numbers include additional patches to remove the iommu_pgsize()
overheads. gcc 13.3.0, i7-12700
This version provides fairly consistent performance across formats. ARM
unmap performance is quite different because this version supports
contiguous pages and uses a very different algorithm for unmapping. Though
why it is so worse compared to AMDv1 I haven't figured out yet.
The per-format commits include a more detailed chart.
There is a second branch:
https://github.com/jgunthorpe/linux/commits/iommu_pt_all
Containing supporting work and future steps:
- ARM short descriptor (32 bit), ARM long descriptor (64 bit) formats
- RISCV format and RISCV conversion
https://github.com/jgunthorpe/linux/commits/iommu_pt_riscv
- Support for a DMA incoherent HW page table walker
- VT-d second stage format and VT-d conversion
https://github.com/jgunthorpe/linux/commits/iommu_pt_vtd
- DART v1 & v2 format
- Draft of a iommufd 'cut' operation to break down huge pages
- A compare test that checks the iommupt formats against the iopgtable
interface, including updating AMD to have a working iopgtable and patches
to make VT-d have an iopgtable for testing.
- A performance test to micro-benchmark map and unmap against iogptable
My strategy is to go one by one for the drivers:
- AMD driver conversion
- RISCV page table and driver
- Intel VT-d driver and VTDSS page table
- Flushing improvements for RISCV
- ARM SMMUv3
And concurrently work on the algorithm side:
- debugfs content dump, like VT-d has
- Cut support
- Increase/Decrease page size support
- map/unmap batching
- KHO
As we make more algorithm improvements the value to convert the drivers
increases.
This is on github: https://github.com/jgunthorpe/linux/commits/iommu_pt
v7:
- Rebase to v6.18-rc2
- Improve comments and documentation
- Add a few missed __sme_sets() for AMD CC
- Rename pt_iommu_flush_ops -> pt_iommu_driver_ops
VT-D -> VT-d
pt_clear_entry -> pt_clear_entries
pt_entry_write_is_dirty -> pt_entry_is_write_dirty
pt_entry_set_write_clean -> pt_entry_make_write_clean
- Tidy some of the map flow into a new function do_map()
- Fix ffz64()
v6: https://patch.msgid.link/r/0-v6-0fb54a1d9850+36b-iommu_pt_jgg@nvidia.com
- Improve comments and documentation
- Rename pt_entry_oa_full -> pt_entry_oa_exact
pt_has_system_page -> pt_has_system_page_size
pt_max_output_address_lg2 -> pt_max_oa_lg2
log2_f*() -> vaf* / oaf* / f*_t
pt_item_fully_covered -> pt_entry_fully_covered
- Fix missed constant propogation causing division
- Consolidate debugging checks to pt_check_install_leaf_args()
- Change collect->ignore_mapped to check_mapped
- Shuffle some hunks around to more appropriate patches
- Two new mini kunit tests
v5: https://patch.msgid.link/r/0-v5-116c4948af3d+68091-iommu_pt_jgg@nvidia.com
- Text grammar updates and kdoc fixes
v4: https://patch.msgid.link/r/0-v4-0d6a6726a372+18959-iommu_pt_jgg@nvidia.com
- Rebase on v6.16-rc3
- Integrate the HATS/HATDis changes
- Remove 'default n' from kconfig
- Remove unused 'PT_FIXED_TOP_LEVEL'
- Improve comments and documentation
- Fix some compile warnings from kbuild robots
v3: https://patch.msgid.link/r/0-v3-a93aab628dbc+521-iommu_pt_jgg@nvidia.com
- Rebase on v6.16-rc2
- s/PT_ENTRY_WORD_SIZE/PT_ITEM_WORD_SIZE/s to follow the language better
- Comment and documentation updates
- Add PT_TOP_PHYS_MASK to help manage alignment restrictions on the top
pointer
- Add missed force_aperture = true
- Make pt_iommu_deinit() take care of the not-yet-inited error case
internally as AMD/RISCV/VTD all shared this logic
- Change gather_range() into gather_range_pages() so it also deals with
the page list. This makes the following cache flushing series simpler
- Fix missed update of unmap->unmapped in some error cases
- Change clear_contig() to order the gather more logically
- Remove goto from the error handling in __map_range_leaf()
- s/log2_/oalog2_/ in places where the argument is an oaddr_t
- Pass the pts to pt_table_install64/32()
- Do not use SIGN_EXTEND for the AMDv2 page table because of Vasant's
information on how PASID 0 works.
v2: https://patch.msgid.link/r/0-v2-5c26bde5c22d+58b-iommu_pt_jgg@nvidia.com
- AMD driver only, many code changes
RFC: https://lore.kernel.org/all/0-v1-01fa10580981+1d-iommu_pt_jgg@nvidia.com/
Cc: Michael Roth <michael.roth(a)amd.com>
Cc: Alexey Kardashevskiy <aik(a)amd.com>
Cc: Pasha Tatashin <pasha.tatashin(a)soleen.com>
Cc: James Gowans <jgowans(a)amazon.com>
Signed-off-by: Jason Gunthorpe <jgg(a)nvidia.com>
Alejandro Jimenez (1):
iommu/amd: Use the generic iommu page table
Jason Gunthorpe (14):
genpt: Generic Page Table base API
genpt: Add Documentation/ files
iommupt: Add the basic structure of the iommu implementation
iommupt: Add the AMD IOMMU v1 page table format
iommupt: Add iova_to_phys op
iommupt: Add unmap_pages op
iommupt: Add map_pages op
iommupt: Add read_and_clear_dirty op
iommupt: Add a kunit test for Generic Page Table
iommupt: Add a mock pagetable format for iommufd selftest to use
iommufd: Change the selftest to use iommupt instead of xarray
iommupt: Add the x86 64 bit page table format
iommu/amd: Remove AMD io_pgtable support
iommupt: Add a kunit test for the IOMMU implementation
.clang-format | 1 +
Documentation/driver-api/generic_pt.rst | 142 ++
Documentation/driver-api/index.rst | 1 +
drivers/iommu/Kconfig | 2 +
drivers/iommu/Makefile | 1 +
drivers/iommu/amd/Kconfig | 5 +-
drivers/iommu/amd/Makefile | 2 +-
drivers/iommu/amd/amd_iommu.h | 1 -
drivers/iommu/amd/amd_iommu_types.h | 110 +-
drivers/iommu/amd/io_pgtable.c | 577 --------
drivers/iommu/amd/io_pgtable_v2.c | 370 ------
drivers/iommu/amd/iommu.c | 538 ++++----
drivers/iommu/generic_pt/.kunitconfig | 13 +
drivers/iommu/generic_pt/Kconfig | 68 +
drivers/iommu/generic_pt/fmt/Makefile | 26 +
drivers/iommu/generic_pt/fmt/amdv1.h | 415 ++++++
drivers/iommu/generic_pt/fmt/defs_amdv1.h | 21 +
drivers/iommu/generic_pt/fmt/defs_x86_64.h | 21 +
drivers/iommu/generic_pt/fmt/iommu_amdv1.c | 15 +
drivers/iommu/generic_pt/fmt/iommu_mock.c | 10 +
drivers/iommu/generic_pt/fmt/iommu_template.h | 48 +
drivers/iommu/generic_pt/fmt/iommu_x86_64.c | 11 +
drivers/iommu/generic_pt/fmt/x86_64.h | 259 ++++
drivers/iommu/generic_pt/iommu_pt.h | 1162 +++++++++++++++++
drivers/iommu/generic_pt/kunit_generic_pt.h | 713 ++++++++++
drivers/iommu/generic_pt/kunit_iommu.h | 183 +++
drivers/iommu/generic_pt/kunit_iommu_pt.h | 487 +++++++
drivers/iommu/generic_pt/pt_common.h | 358 +++++
drivers/iommu/generic_pt/pt_defs.h | 329 +++++
drivers/iommu/generic_pt/pt_fmt_defaults.h | 233 ++++
drivers/iommu/generic_pt/pt_iter.h | 636 +++++++++
drivers/iommu/generic_pt/pt_log2.h | 122 ++
drivers/iommu/io-pgtable.c | 4 -
drivers/iommu/iommufd/Kconfig | 1 +
drivers/iommu/iommufd/iommufd_test.h | 11 +-
drivers/iommu/iommufd/selftest.c | 438 +++----
include/linux/generic_pt/common.h | 167 +++
include/linux/generic_pt/iommu.h | 271 ++++
include/linux/io-pgtable.h | 2 -
include/linux/irqchip/riscv-imsic.h | 3 +-
tools/testing/selftests/iommu/iommufd.c | 60 +-
tools/testing/selftests/iommu/iommufd_utils.h | 12 +
42 files changed, 6237 insertions(+), 1612 deletions(-)
create mode 100644 Documentation/driver-api/generic_pt.rst
delete mode 100644 drivers/iommu/amd/io_pgtable.c
delete mode 100644 drivers/iommu/amd/io_pgtable_v2.c
create mode 100644 drivers/iommu/generic_pt/.kunitconfig
create mode 100644 drivers/iommu/generic_pt/Kconfig
create mode 100644 drivers/iommu/generic_pt/fmt/Makefile
create mode 100644 drivers/iommu/generic_pt/fmt/amdv1.h
create mode 100644 drivers/iommu/generic_pt/fmt/defs_amdv1.h
create mode 100644 drivers/iommu/generic_pt/fmt/defs_x86_64.h
create mode 100644 drivers/iommu/generic_pt/fmt/iommu_amdv1.c
create mode 100644 drivers/iommu/generic_pt/fmt/iommu_mock.c
create mode 100644 drivers/iommu/generic_pt/fmt/iommu_template.h
create mode 100644 drivers/iommu/generic_pt/fmt/iommu_x86_64.c
create mode 100644 drivers/iommu/generic_pt/fmt/x86_64.h
create mode 100644 drivers/iommu/generic_pt/iommu_pt.h
create mode 100644 drivers/iommu/generic_pt/kunit_generic_pt.h
create mode 100644 drivers/iommu/generic_pt/kunit_iommu.h
create mode 100644 drivers/iommu/generic_pt/kunit_iommu_pt.h
create mode 100644 drivers/iommu/generic_pt/pt_common.h
create mode 100644 drivers/iommu/generic_pt/pt_defs.h
create mode 100644 drivers/iommu/generic_pt/pt_fmt_defaults.h
create mode 100644 drivers/iommu/generic_pt/pt_iter.h
create mode 100644 drivers/iommu/generic_pt/pt_log2.h
create mode 100644 include/linux/generic_pt/common.h
create mode 100644 include/linux/generic_pt/iommu.h
base-commit: bf3db0366052dcdf7dea89a07929b690aac59b15
--
2.43.0
Parsing KTAP is quite an inconvenience, but most of the time the thing
you really want to know is "did anything fail"?
Let's give the user the his information without them needing
to parse anything.
Because of the use of subshells and namespaces, this needs to be
communicated via a file. Just write arbitrary data into the file and
treat non-empty content as a signal that something failed.
In case any user depends on the current behaviour, such as running this
from a script with `set -e` and parsing the result for failures
afterwards, add a flag they can set to get the old behaviour, namely
--no-error-on-fail.
Signed-off-by: Brendan Jackman <jackmanb(a)google.com>
---
Changes in v3:
- Fixed quoting
- Link to v2: https://lore.kernel.org/r/20251014-b4-ksft-error-on-fail-v2-1-b3e2657237b8@…
Changes in v2:
- Fixed bug in report_failure()
- Made error-on-fail the default
- Link to v1: https://lore.kernel.org/r/20251007-b4-ksft-error-on-fail-v1-1-71bf058f5662@…
---
tools/testing/selftests/kselftest/runner.sh | 14 ++++++++++----
tools/testing/selftests/run_kselftest.sh | 14 ++++++++++++++
2 files changed, 24 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/kselftest/runner.sh b/tools/testing/selftests/kselftest/runner.sh
index 2c3c58e65a419f5ee8d7dc51a37671237a07fa0b..3a62039fa6217f3453423ff011575d0a1eb8c275 100644
--- a/tools/testing/selftests/kselftest/runner.sh
+++ b/tools/testing/selftests/kselftest/runner.sh
@@ -44,6 +44,12 @@ tap_timeout()
fi
}
+report_failure()
+{
+ echo "not ok $*"
+ echo "$*" >> "$kselftest_failures_file"
+}
+
run_one()
{
DIR="$1"
@@ -105,7 +111,7 @@ run_one()
echo "# $TEST_HDR_MSG"
if [ ! -e "$TEST" ]; then
echo "# Warning: file $TEST is missing!"
- echo "not ok $test_num $TEST_HDR_MSG"
+ report_failure "$test_num $TEST_HDR_MSG"
else
if [ -x /usr/bin/stdbuf ]; then
stdbuf="/usr/bin/stdbuf --output=L "
@@ -123,7 +129,7 @@ run_one()
interpreter=$(head -n 1 "$TEST" | cut -c 3-)
cmd="$stdbuf $interpreter ./$BASENAME_TEST"
else
- echo "not ok $test_num $TEST_HDR_MSG"
+ report_failure "$test_num $TEST_HDR_MSG"
return
fi
fi
@@ -137,9 +143,9 @@ run_one()
echo "ok $test_num $TEST_HDR_MSG # SKIP"
elif [ $rc -eq $timeout_rc ]; then \
echo "#"
- echo "not ok $test_num $TEST_HDR_MSG # TIMEOUT $kselftest_timeout seconds"
+ report_failure "$test_num $TEST_HDR_MSG # TIMEOUT $kselftest_timeout seconds"
else
- echo "not ok $test_num $TEST_HDR_MSG # exit=$rc"
+ report_failure "$test_num $TEST_HDR_MSG # exit=$rc"
fi)
cd - >/dev/null
fi
diff --git a/tools/testing/selftests/run_kselftest.sh b/tools/testing/selftests/run_kselftest.sh
index 0443beacf3621ae36cb12ffd57f696ddef3526b5..d4be97498b32e975c63a1167d3060bdeba674c8c 100755
--- a/tools/testing/selftests/run_kselftest.sh
+++ b/tools/testing/selftests/run_kselftest.sh
@@ -33,6 +33,7 @@ Usage: $0 [OPTIONS]
-c | --collection COLLECTION Run all tests from COLLECTION
-l | --list List the available collection:test entries
-d | --dry-run Don't actually run any tests
+ -f | --no-error-on-fail Don't exit with an error just because tests failed
-n | --netns Run each test in namespace
-h | --help Show this usage info
-o | --override-timeout Number of seconds after which we timeout
@@ -44,6 +45,7 @@ COLLECTIONS=""
TESTS=""
dryrun=""
kselftest_override_timeout=""
+ERROR_ON_FAIL=true
while true; do
case "$1" in
-s | --summary)
@@ -65,6 +67,9 @@ while true; do
-d | --dry-run)
dryrun="echo"
shift ;;
+ -f | --no-error-on-fail)
+ ERROR_ON_FAIL=false
+ shift ;;
-n | --netns)
RUN_IN_NETNS=1
shift ;;
@@ -105,9 +110,18 @@ if [ -n "$TESTS" ]; then
available="$(echo "$valid" | sed -e 's/ /\n/g')"
fi
+kselftest_failures_file="$(mktemp --tmpdir kselftest-failures-XXXXXX)"
+export kselftest_failures_file
+
collections=$(echo "$available" | cut -d: -f1 | sort | uniq)
for collection in $collections ; do
[ -w /dev/kmsg ] && echo "kselftest: Running tests in $collection" >> /dev/kmsg
tests=$(echo "$available" | grep "^$collection:" | cut -d: -f2)
($dryrun cd "$collection" && $dryrun run_many $tests)
done
+
+failures="$(cat "$kselftest_failures_file")"
+rm "$kselftest_failures_file"
+if "$ERROR_ON_FAIL" && [ "$failures" ]; then
+ exit 1
+fi
---
base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585
change-id: 20251007-b4-ksft-error-on-fail-0c2cb3246041
Best regards,
--
Brendan Jackman <jackmanb(a)google.com>
This patch set adds timer test case for LoongArch system, it is based
on common arch_timer test case. And it includes time counter function,
one-shot/period mode interrupt, and software emulated timer function
test.
Bibo Mao (6):
KVM: LoongArch: selftests: Add system registers save and restore on
exception
KVM: LoongArch: selftests: Add exception handler register interface
KVM: LoongArch: selftests: Add basic interfaces
KVM: LoongArch: selftests: Add timer test case with one-shot mode
KVM: LoongArch: selftests: Add period mode timer and time counter test
KVM: LoongArch: selftests: Add SW emulated timer test
tools/testing/selftests/kvm/Makefile.kvm | 10 +-
.../kvm/include/loongarch/arch_timer.h | 84 ++++++++
.../kvm/include/loongarch/processor.h | 81 +++++++-
.../selftests/kvm/lib/loongarch/exception.S | 6 +
.../selftests/kvm/lib/loongarch/processor.c | 38 +++-
.../selftests/kvm/loongarch/arch_timer.c | 187 ++++++++++++++++++
6 files changed, 400 insertions(+), 6 deletions(-)
create mode 100644 tools/testing/selftests/kvm/include/loongarch/arch_timer.h
create mode 100644 tools/testing/selftests/kvm/loongarch/arch_timer.c
base-commit: e53642b87a4f4b03a8d7e5f8507fc3cd0c595ea6
--
2.39.3
Hi Linus,
Please pull the following kselftest fixes update for Linux 6.18-rc4.
Fixes build warning in cachestat found during clang build and adds
tmpshmcstat to .gitignore.
diff is attached.
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit 3a8660878839faadb4f1a6dd72c3179c1df56787:
Linux 6.18-rc1 (2025-10-12 13:42:36 -0700)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest tags/linux_kselftest-fixes-6.18-rc4
for you to fetch changes up to 920aa3a7705a061cb3004572d8b7932b54463dbf:
selftests: cachestat: Fix warning on declaration under label (2025-10-22 09:23:18 -0600)
----------------------------------------------------------------
linux_kselftest-fixes-6.18-rc4
Fixes build warning in cachestat found during clang build and adds
tmpshmcstat to .gitignore.
----------------------------------------------------------------
Madhur Kumar (1):
selftests/cachestat: add tmpshmcstat file to .gitignore
Sidharth Seela (1):
selftests: cachestat: Fix warning on declaration under label
tools/testing/selftests/cachestat/.gitignore | 1 +
tools/testing/selftests/cachestat/test_cachestat.c | 4 ++--
2 files changed, 3 insertions(+), 2 deletions(-)
----------------------------------------------------------------
Hi Linus,
Please pull the following kunit fixes update for Linux 6.18-rc4.
Fixes log overwrite in param_tests and fixes incorrect cast of priv
pointer in test_dev_action(). Updates email address for Rae Moar in
MAINTAINERS KUnit entry.
diff is attached.
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit 3a8660878839faadb4f1a6dd72c3179c1df56787:
Linux 6.18-rc1 (2025-10-12 13:42:36 -0700)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest tags/linux_kselftest-kunit-fixes-6.18-rc4
for you to fetch changes up to f3903ec76ae6afcdba0347681d1dda005fb145cd:
MAINTAINERS: Update KUnit email address for Rae Moar (2025-10-29 14:57:54 -0600)
----------------------------------------------------------------
linux_kselftest-kunit-fixes-6.18-rc4
Fixes log overwrite in param_tests and fixes incorrect cast of priv
pointer in test_dev_action(). Updates email address for Rae Moar in
MAINTAINERS KUnit entry.
----------------------------------------------------------------
Carlos Llamas (1):
kunit: prevent log overwrite in param_tests
Florian Schmaus (1):
kunit: test_dev_action: Correctly cast 'priv' pointer to long*
Rae Moar (1):
MAINTAINERS: Update KUnit email address for Rae Moar
.mailmap | 1 +
MAINTAINERS | 2 +-
lib/kunit/kunit-test.c | 2 +-
lib/kunit/test.c | 3 ++-
4 files changed, 5 insertions(+), 3 deletions(-)
----------------------------------------------------------------
Prior to commit 9245fd6b8531 ("KVM: x86: model canonical checks more
precisely"), KVM_SET_NESTED_STATE would fail if the state was captured
with L2 active, L1 had CR4.LA57 set, L2 did not, and the
VMCS12.HOST_GSBASE (or other host-state field checked for canonicality)
had an address greater than 48 bits wide.
Add a regression test that reproduces the KVM_SET_NESTED_STATE failure
conditions. To do so, the first three patches add support for 5-level
paging in the selftest L1 VM.
v1 -> v2
Ended the page walking loops before visiting 4K mappings [Yosry]
Changed VM_MODE_PXXV48_4K into VM_MODE_PXXVYY_4K;
use 5-level paging when possible [Sean]
Removed the check for non-NULL vmx_pages in guest_code() [Yosry]
Jim Mattson (4):
KVM: selftests: Use a loop to create guest page tables
KVM: selftests: Use a loop to walk guest page tables
KVM: selftests: Change VM_MODE_PXXV48_4K to VM_MODE_PXXVYY_4K
KVM: selftests: Add a VMX test for LA57 nested state
tools/testing/selftests/kvm/Makefile.kvm | 1 +
.../testing/selftests/kvm/include/kvm_util.h | 4 +-
.../selftests/kvm/include/x86/processor.h | 2 +-
.../selftests/kvm/lib/arm64/processor.c | 2 +-
tools/testing/selftests/kvm/lib/kvm_util.c | 30 ++--
.../testing/selftests/kvm/lib/x86/processor.c | 80 +++++------
tools/testing/selftests/kvm/lib/x86/vmx.c | 6 +-
.../kvm/x86/vmx_la57_nested_state_test.c | 134 ++++++++++++++++++
8 files changed, 197 insertions(+), 62 deletions(-)
create mode 100644 tools/testing/selftests/kvm/x86/vmx_la57_nested_state_test.c
--
2.51.1.851.g4ebd6896fd-goog
Hello,
this is the v3 of test_tc_tunnel conversion into test_progs framework.
This new revision:
- fixes a few issues spotted by the bot reviewer
- removes any test ensuring connection failure (and so depending on a
timout) to keep the execution time reasonable
test_tc_tunnel.sh tests a variety of tunnels based on BPF: packets are
encapsulated by a BPF program on the client egress. We then check that
those packets can be decapsulated on server ingress side, either thanks
to kernel-based or BPF-based decapsulation. Those tests are run thanks
to two veths in two dedicated namespaces.
- patches 1 and 2 are preparatory patches
- patch 3 introduce tc_tunnel test into test_progs
- patch 4 gets rid of the test_tc_tunnel.sh script
The new test has been executed both in some x86 local qemu machine, as
well as in CI:
# ./test_progs -a tc_tunnel
#454/1 tc_tunnel/ipip_none:OK
#454/2 tc_tunnel/ipip6_none:OK
#454/3 tc_tunnel/ip6tnl_none:OK
#454/4 tc_tunnel/sit_none:OK
#454/5 tc_tunnel/vxlan_eth:OK
#454/6 tc_tunnel/ip6vxlan_eth:OK
#454/7 tc_tunnel/gre_none:OK
#454/8 tc_tunnel/gre_eth:OK
#454/9 tc_tunnel/gre_mpls:OK
#454/10 tc_tunnel/ip6gre_none:OK
#454/11 tc_tunnel/ip6gre_eth:OK
#454/12 tc_tunnel/ip6gre_mpls:OK
#454/13 tc_tunnel/udp_none:OK
#454/14 tc_tunnel/udp_eth:OK
#454/15 tc_tunnel/udp_mpls:OK
#454/16 tc_tunnel/ip6udp_none:OK
#454/17 tc_tunnel/ip6udp_eth:OK
#454/18 tc_tunnel/ip6udp_mpls:OK
#454 tc_tunnel:OK
Summary: 1/18 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore(a)bootlin.com>
---
Changes in v3:
- remove systematic "connection must fail" test part of each subtest
- also remove kernel-based decap test for subtests supposed to fail on
kernel side
- fix potential fd leak if connection structure allocation fails
- fix wrong early return in run_test
- Link to v2: https://lore.kernel.org/r/20251022-tc_tunnel-v2-0-a44a0bd52902@bootlin.com
Changes in v2:
- declare a single tc_prog_attach helper rather than multiple,
intermediate helpers
- move the new helper to network_helpers.c rather than a dedicated
file
- do not rename existing tc_helpers.c/h pair (drop patch)
- keep only the minimal set of needed NS switches
- Link to v1: https://lore.kernel.org/r/20251017-tc_tunnel-v1-0-2d86808d86b2@bootlin.com
---
Alexis Lothoré (eBPF Foundation) (4):
selftests/bpf: add tc helpers
selftests/bpf: make test_tc_tunnel.bpf.c compatible with big endian platforms
selftests/bpf: integrate test_tc_tunnel.sh tests into test_progs
selftests/bpf: remove test_tc_tunnel.sh
tools/testing/selftests/bpf/Makefile | 1 -
tools/testing/selftests/bpf/network_helpers.c | 45 ++
tools/testing/selftests/bpf/network_helpers.h | 16 +
.../selftests/bpf/prog_tests/test_tc_tunnel.c | 674 +++++++++++++++++++++
.../testing/selftests/bpf/prog_tests/test_tunnel.c | 107 +---
tools/testing/selftests/bpf/progs/test_tc_tunnel.c | 95 ++-
tools/testing/selftests/bpf/test_tc_tunnel.sh | 320 ----------
7 files changed, 790 insertions(+), 468 deletions(-)
---
base-commit: ecdeefe65eaeb82a1262e20401ba750b8c9e0b97
change-id: 20250811-tc_tunnel-c61342683f18
Best regards,
--
Alexis Lothoré, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com