PASID (Process Address Space ID) is a PCIe extension to tag the DMA
transactions out of a physical device, and most modern IOMMU hardware
have supported PASID granular address translation. So a PASID-capable
device can be attached to multiple hwpts (a.k.a. domains), each attachment
is tagged with a pasid.
This series is based on a preparation series [1], it first adds a missing
iommu API to replace domain for a pasid. Based on the iommu pasid attach/
replace/detach APIs, this series adds iommufd APIs for device drivers to
attach/replace/detach pasid to/from hwpt per userspace's request, and adds
selftest to validate the iommufd APIs.
The completed code can be found in below link [2]. Heads up! The existing
iommufd selftest was broken, there was a fix [3] to it, but not been
upstreamed yet. If want to run the iommufd selftest, please apply that fix.
Sorry for the inconvenience.
[1] https://lore.kernel.org/linux-iommu/20240628085538.47049-1-yi.l.liu@intel.c…
[2] https://github.com/yiliu1765/iommufd/tree/iommufd_pasid
[3] https://lore.kernel.org/linux-iommu/20240111073213.180020-1-baolu.lu@linux.…
Change log:
v3:
- Split the set_dev_pasid op enhancements for domain replacement to be a
separate series "Make set_dev_pasid op supportting domain replacement" [1].
The below changes are made in the separate series.
*) set_dev_pasid() callback should keep the old config if failed to attach to
a domain. This simplifies the caller a lot as caller does not need to attach
it back to old domain explicitly. This also avoids some corner cases in which
the core may do duplicated domain attachment as described in below link (Jason)
https://lore.kernel.org/linux-iommu/BN9PR11MB52768C98314A95AFCD2FA6478C0F2@…
*) Drop patch 10 of v2 as it's a bug fix and can be submitted separately (Kevin)
*) Rebase on top of Baolu's domain_alloc_paging refactor series (Jason)
- Drop the attach_data which includes attach_fn and pasid, insteadly passing the
pasid through the device attach path. (Jason)
- Add a pasid-num-bits property to mock dev to make pasid selftest work (Kevin)
v2: https://lore.kernel.org/linux-iommu/20240412081516.31168-1-yi.l.liu@intel.c…
- Domain replace for pasid should be handled in set_dev_pasid() callbacks
instead of remove_dev_pasid and call set_dev_pasid afteward in iommu
layer (Jason)
- Make xarray operations more self-contained in iommufd pasid attach/replace/detach
(Jason)
- Tweak the dev_iommu_get_max_pasids() to allow iommu driver to populate the
max_pasids. This makes the iommufd selftest simpler to meet the max_pasids
check in iommu_attach_device_pasid() (Jason)
v1: https://lore.kernel.org/kvm/20231127063428.127436-1-yi.l.liu@intel.com/#r
- Implemnet iommu_replace_device_pasid() to fall back to the original domain
if this replacement failed (Kevin)
- Add check in do_attach() to check corressponding attach_fn per the pasid value.
rfc: https://lore.kernel.org/linux-iommu/20230926092651.17041-1-yi.l.liu@intel.c…
Regards,
Yi Liu
Yi Liu (7):
iommu: Introduce a replace API for device pasid
iommufd: Pass pasid through the device attach/replace path
iommufd: Support attach/replace hwpt per pasid
iommufd/selftest: Add set_dev_pasid and remove_dev_pasid in mock iommu
iommufd/selftest: Add a helper to get test device
iommufd/selftest: Add test ops to test pasid attach/detach
iommufd/selftest: Add coverage for iommufd pasid attach/detach
drivers/iommu/iommu-priv.h | 3 +
drivers/iommu/iommu.c | 80 ++++++-
drivers/iommu/iommufd/Makefile | 1 +
drivers/iommu/iommufd/device.c | 31 +--
drivers/iommu/iommufd/iommufd_private.h | 15 ++
drivers/iommu/iommufd/iommufd_test.h | 30 +++
drivers/iommu/iommufd/pasid.c | 157 +++++++++++++
drivers/iommu/iommufd/selftest.c | 206 ++++++++++++++++-
include/linux/iommufd.h | 6 +
tools/testing/selftests/iommu/iommufd.c | 207 ++++++++++++++++++
.../selftests/iommu/iommufd_fail_nth.c | 28 ++-
tools/testing/selftests/iommu/iommufd_utils.h | 78 +++++++
12 files changed, 808 insertions(+), 34 deletions(-)
create mode 100644 drivers/iommu/iommufd/pasid.c
--
2.34.1
There are no maintainers specified for tools/testing/selftests/x86.
Shuah has mentioned [1] that the patches should go through x86 tree or
in special cases directly to Shuah's tree after getting ack-ed from x86
maintainers. Different people have been confused when sending patches as
correct maintainers aren't found by get_maintainer.pl script. Fix
this by adding entry to MAINTAINERS file.
[1] https://lore.kernel.org/all/90dc0dfc-4c67-4ea1-b705-0585d6e2ec47@linuxfound…
Signed-off-by: Muhammad Usama Anjum <usama.anjum(a)collabora.com>
---
MAINTAINERS | 1 +
1 file changed, 1 insertion(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index 523d84b2d6139..f3a17e5d954a3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -24378,6 +24378,7 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/core
F: Documentation/arch/x86/
F: Documentation/devicetree/bindings/x86/
F: arch/x86/
+F: tools/testing/selftests/x86
X86 ENTRY CODE
M: Andy Lutomirski <luto(a)kernel.org>
--
2.39.2
In a series posted a few years ago [1], a proposal was put forward to allow the
kernel to allocate memory local to a mm and thus push it out of reach for
current and future speculation-based cross-process attacks. We still believe
this is a nice thing to have.
However, in the time passed since that post Linux mm has grown quite a few new
goodies, so we'd like to explore possibilities to implement this functionality
with less effort and churn leveraging the now available facilities.
Specifically, this is a proof-of-concept attempt to implement mm-local
allocations piggy-backing on memfd_secret(), using regular user addressess but
pinning the pages and flipping the user/supervisor flag on the respective PTEs
to make them directly accessible from kernel, and sealing the VMA to prevent
userland from taking over the address range. The approach allowed to delegate
all the heavy lifting -- address management, interactions with the direct map,
cleanup on mm teardown -- to the existing infrastructure, and required zero
architecture-specific code.
Compared to the approach used in the orignal series, where a dedicated kernel
address range and thus a dedicated PGD was used for mm-local allocations, the
one proposed here may have certain drawbacks, in particular
- using user addresses for kernel memory may violate assumptions in various
parts of kernel code which we may not have identified with smoke tests we did
- the allocated addresses are guessable by the userland (ATM they are even
visible in /proc/PID/maps but that's fixable) which may weaken the security
posture
Also included is a simple test driver and selftest to smoke test and showcase
the feature.
The code is PoC RFC and lacks a lot of checks and special case handling, but
demonstrates the idea. We'd appreciate any feedback on whether it's a viable
approach or it should better be abandoned in favor of the one with dedicated
PGD / kernel address range or yet something else.
[1] https://lore.kernel.org/lkml/20190612170834.14855-1-mhillenb@amazon.de/
Fares Mehanna (2):
mseal: expose interface to seal / unseal user memory ranges
mm/secretmem: implement mm-local kernel allocations
Roman Kagan (1):
drivers/misc: add test driver and selftest for proclocal allocator
drivers/misc/Makefile | 1 +
tools/testing/selftests/proclocal/Makefile | 6 +
include/linux/secretmem.h | 8 +
mm/internal.h | 7 +
drivers/misc/proclocal-test.c | 200 +++++++++++++++++
mm/gup.c | 4 +-
mm/mseal.c | 81 ++++---
mm/secretmem.c | 208 ++++++++++++++++++
.../selftests/proclocal/proclocal-test.c | 150 +++++++++++++
drivers/misc/Kconfig | 15 ++
tools/testing/selftests/proclocal/.gitignore | 1 +
11 files changed, 649 insertions(+), 32 deletions(-)
create mode 100644 tools/testing/selftests/proclocal/Makefile
create mode 100644 drivers/misc/proclocal-test.c
create mode 100644 tools/testing/selftests/proclocal/proclocal-test.c
create mode 100644 tools/testing/selftests/proclocal/.gitignore
--
2.34.1
Amazon Web Services Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597
On riscv, mmap currently returns an address from the largest address
space that can fit entirely inside of the hint address. This makes it
such that the hint address is almost never returned. This patch raises
the mappable area up to and including the hint address. This allows mmap
to often return the hint address, which allows a performance improvement
over searching for a valid address as well as making the behavior more
similar to other architectures.
Note that a previous patch introduced stronger semantics compared to
other architectures for riscv mmap. On riscv, mmap will not use bits in
the upper bits of the virtual address depending on the hint address. On
other architectures, a random address is returned in the address space
requested. On all architectures the hint address will be returned if it
is available. This allows riscv applications to configure how many bits
in the virtual address should be left empty. This has the two benefits
of being able to request address spaces that are smaller than the
default and doesn't require the application to know the page table
layout of riscv.
Signed-off-by: Charlie Jenkins <charlie(a)rivosinc.com>
---
Changes in v3:
- Add back forgotten semi-colon
- Fix test cases
- Add support for rv32
- Change cover letter name so it's not the same as patch 1
- Link to v2: https://lore.kernel.org/r/20240130-use_mmap_hint_address-v2-0-f34ebfd33053@…
Changes in v2:
- Add back forgotten "mmap_end = STACK_TOP_MAX"
- Link to v1: https://lore.kernel.org/r/20240129-use_mmap_hint_address-v1-0-4c74da813ba1@…
---
Charlie Jenkins (3):
riscv: mm: Use hint address in mmap if available
selftests: riscv: Generalize mm selftests
docs: riscv: Define behavior of mmap
Documentation/arch/riscv/vm-layout.rst | 16 ++--
arch/riscv/include/asm/processor.h | 27 +++---
tools/testing/selftests/riscv/mm/mmap_bottomup.c | 23 +----
tools/testing/selftests/riscv/mm/mmap_default.c | 23 +----
tools/testing/selftests/riscv/mm/mmap_test.h | 107 ++++++++++++++---------
5 files changed, 83 insertions(+), 113 deletions(-)
---
base-commit: 556e2d17cae620d549c5474b1ece053430cd50bc
change-id: 20240119-use_mmap_hint_address-f9f4b1b6f5f1
--
- Charlie
In this series from Geliang, modifying MPTCP BPF selftests, we have:
- A new MPTCP subflow BPF program setting socket options per subflow: it
looks better to have this old test program in the BPF selftests to
track regressions and to serve as example.
Note: Nicolas is no longer working for Tessares, but he did this work
while working for them, and his email address is no longer available.
- A new MPTCP BPF subtest validating this new BPF program.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Changes in v2:
- Previous patches 1/4 and 2/4 have been dropped from this series:
- 1/4: "selftests/bpf: Handle SIGINT when creating netns":
- A new version, more generic and no longer specific to MPTCP BPF
selftest will be sent later, as part of a new series. (Alexei)
- 2/4: "selftests/bpf: Add RUN_MPTCP_TEST macro":
- Removed, not to hide helper functions in macros. (Alexei)
- The commit message of patch 1/2 has been clarified to avoid some
possible confusions spot by Alexei.
- Link to v1: https://lore.kernel.org/r/20240507-upstream-bpf-next-20240506-mptcp-subflow…
---
Geliang Tang (1):
selftests/bpf: Add mptcp subflow subtest
Nicolas Rybowski (1):
selftests/bpf: Add mptcp subflow example
tools/testing/selftests/bpf/prog_tests/mptcp.c | 109 ++++++++++++++++++++++
tools/testing/selftests/bpf/progs/mptcp_subflow.c | 70 ++++++++++++++
2 files changed, 179 insertions(+)
---
base-commit: 009367099eb61a4fc2af44d4eb06b6b4de7de6db
change-id: 20240506-upstream-bpf-next-20240506-mptcp-subflow-test-faef6654bfa3
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
The function "scheduler_tick" was renamed to "sched_tick" and a selftest
that used that function for testing function trace filtering used that
function as part of the test.
But the change causes it to fail when run on older kernels. As tests
should not fail on older kernels, add a check to see which name is
available before testing.
Fixes: 86dd6c04ef9f2 ("sched/balancing: Rename scheduler_tick() => sched_tick()")
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
.../ftrace/test.d/ftrace/func_set_ftrace_file.tc | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/ftrace/test.d/ftrace/func_set_ftrace_file.tc b/tools/testing/selftests/ftrace/test.d/ftrace/func_set_ftrace_file.tc
index 073a748b9380..263f6b798c85 100644
--- a/tools/testing/selftests/ftrace/test.d/ftrace/func_set_ftrace_file.tc
+++ b/tools/testing/selftests/ftrace/test.d/ftrace/func_set_ftrace_file.tc
@@ -19,7 +19,14 @@ fail() { # mesg
FILTER=set_ftrace_filter
FUNC1="schedule"
-FUNC2="sched_tick"
+if grep '^sched_tick\b' available_filter_functions; then
+ FUNC2="sched_tick"
+elif grep '^scheduler_tick\b' available_filter_functions; then
+ FUNC2="scheduler_tick"
+else
+ exit_unresolved
+fi
+
ALL_FUNCS="#### all functions enabled ####"
--
2.43.0
Hi,
This attempts to implement PT_LOAD p_align support for static PIE builds.
I intend this to go into -next after the coming merge window so we can
maximize bake time. In the past we've had regressions with both the
selftests and the ELF loader. Hopefully we can shake everything out over
a few months. :)
Thanks!
-Kees
Kees Cook (3):
selftests/exec: Build both static and non-static load_address tests
binfmt_elf: Calculate total_size earlier
binfmt_elf: Honor PT_LOAD alignment for static PIE
fs/binfmt_elf.c | 94 ++++++++++++++-------
tools/testing/selftests/exec/Makefile | 19 +++--
tools/testing/selftests/exec/load_address.c | 67 ++++++++++++---
3 files changed, 130 insertions(+), 50 deletions(-)
--
2.34.1