Use a filter function to skip the time namespace test on systems with
!CONFIG_TIME_NS. This reworks a fix originally done by Tiezhu Yang prior
to the refactoring in 34dce23f7e40 ("selftests/clone3: Report descriptive
test names"). The changelog from their fix explains the issue very clearly:
When execute the following command to test clone3 under !CONFIG_TIME_NS:
# make headers && cd tools/testing/selftests/clone3 && make && ./clone3
we can see the following error info:
# [7538] Trying clone3() with flags 0x80 (size 0)
# Invalid argument - Failed to create new process
# [7538] clone3() with flags says: -22 expected 0
not ok 18 [7538] Result (-22) is different than expected (0)
...
# Totals: pass:18 fail:1 xfail:0 xpass:0 skip:0 error:0
This is because if CONFIG_TIME_NS is not set, but the flag
CLONE_NEWTIME (0x80) is used to clone a time namespace, it
will return -EINVAL in copy_time_ns().
If kernel does not support CONFIG_TIME_NS, /proc/self/ns/time
will be not exist, and then we should skip clone3() test with
CLONE_NEWTIME.
With this patch under !CONFIG_TIME_NS:
# make headers && cd tools/testing/selftests/clone3 && make && ./clone3
...
# Time namespaces are not supported
ok 18 # SKIP Skipping clone3() with CLONE_NEWTIME
...
# Totals: pass:18 fail:0 xfail:0 xpass:0 skip:1 error:0
Fixes: 515bddf0ec41 ("selftests/clone3: test clone3 with CLONE_NEWTIME")
Suggested-by: Thomas Gleixner <tglx(a)linutronix.de>
Original-fix-from: Tiezhu Yang <yangtiezhu(a)loongson.cn>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
tools/testing/selftests/clone3/clone3.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/tools/testing/selftests/clone3/clone3.c b/tools/testing/selftests/clone3/clone3.c
index 9429d361059e..915846bc32cc 100644
--- a/tools/testing/selftests/clone3/clone3.c
+++ b/tools/testing/selftests/clone3/clone3.c
@@ -138,6 +138,16 @@ static bool not_root(void)
return false;
}
+static bool timens_unsupported(void)
+{
+ if (access("/proc/self/ns/time", F_OK) == 0) {
+ ksft_print_msg("Time namespaces are not supported\n");
+ return true;
+ }
+
+ return false;
+}
+
static size_t page_size_plus_8(void)
{
return getpagesize() + 8;
@@ -282,6 +292,7 @@ static const struct test tests[] = {
.size = 0,
.expected = 0,
.test_mode = CLONE3_ARGS_NO_TEST,
+ .filter = timens_unsupported,
},
{
.name = "exit signal (SIGCHLD) in flags",
---
base-commit: 8d4099dd0727acfc8b0f644eacaf852f9d5dc649
change-id: 20231019-kselftest-clone3-time-ns-730b6f4187c7
Best regards,
--
Mark Brown <broonie(a)kernel.org>
If name_show() is non unique, this test will try to install a kprobe on this
function which should fail returning EADDRNOTAVAIL.
On kernel where name_show() is not unique, this test is skipped.
Cc: stable(a)vger.kernel.org
Signed-off-by: Francis Laniel <flaniel(a)linux.microsoft.com>
---
.../ftrace/test.d/kprobe/kprobe_non_uniq_symbol.tc | 13 +++++++++++++
1 file changed, 13 insertions(+)
create mode 100644 tools/testing/selftests/ftrace/test.d/kprobe/kprobe_non_uniq_symbol.tc
diff --git a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_non_uniq_symbol.tc b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_non_uniq_symbol.tc
new file mode 100644
index 000000000000..bc9514428dba
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_non_uniq_symbol.tc
@@ -0,0 +1,13 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Test failure of registering kprobe on non unique symbol
+# requires: kprobe_events
+
+SYMBOL='name_show'
+
+# We skip this test on kernel where SYMBOL is unique or does not exist.
+if [ "$(grep -c -E "[[:alnum:]]+ t ${SYMBOL}" /proc/kallsyms)" -le '1' ]; then
+ exit_unsupported
+fi
+
+! echo "p:test_non_unique ${SYMBOL}" > kprobe_events
--
2.34.1
Nested translation is a hardware feature that is supported by many modern
IOMMU hardwares. It has two stages (stage-1, stage-2) address translation
to get access to the physical address. stage-1 translation table is owned
by userspace (e.g. by a guest OS), while stage-2 is owned by kernel. Changes
to stage-1 translation table should be followed by an IOTLB invalidation.
Take Intel VT-d as an example, the stage-1 translation table is I/O page
table. As the below diagram shows, guest I/O page table pointer in GPA
(guest physical address) is passed to host and be used to perform the stage-1
address translation. Along with it, modifications to present mappings in the
guest I/O page table should be followed with an IOTLB invalidation.
.-------------. .---------------------------.
| vIOMMU | | Guest I/O page table |
| | '---------------------------'
.----------------/
| PASID Entry |--- PASID cache flush --+
'-------------' |
| | V
| | I/O page table pointer in GPA
'-------------'
Guest
------| Shadow |---------------------------|--------
v v v
Host
.-------------. .------------------------.
| pIOMMU | | FS for GIOVA->GPA |
| | '------------------------'
.----------------/ |
| PASID Entry | V (Nested xlate)
'----------------\.----------------------------------.
| | | SS for GPA->HPA, unmanaged domain|
| | '----------------------------------'
'-------------'
Where:
- FS = First stage page tables
- SS = Second stage page tables
<Intel VT-d Nested translation>
This series adds the cache invalidation path for the userspace to invalidate
cache after modifying the stage-1 page table. This is based on the first part
of nesting [1]
Complete code can be found in [2], QEMU could can be found in [3].
At last, this is a team work together with Nicolin Chen, Lu Baolu. Thanks
them for the help. ^_^. Look forward to your feedbacks.
[1] https://lore.kernel.org/linux-iommu/20231020091946.12173-1-yi.l.liu@intel.c…
[2] https://github.com/yiliu1765/iommufd/tree/iommufd_nesting
[3] https://github.com/yiliu1765/qemu/tree/zhenzhong/wip/iommufd_nesting_rfcv1
Change log:
v5:
- Split the iommufd nesting series into two parts of alloc_user and
invalidation (Jason)
- Split IOMMUFD_OBJ_HW_PAGETABLE to IOMMUFD_OBJ_HWPT_PAGING/_NESTED, and
do the same with the structures/alloc()/abort()/destroy(). Reworked the
selftest accordingly too. (Jason)
- Move hwpt/data_type into struct iommu_user_data from standalone op
arguments. (Jason)
- Rename hwpt_type to be data_type, the HWPT_TYPE to be HWPT_ALLOC_DATA,
_TYPE_DEFAULT to be _ALLOC_DATA_NONE (Jason, Kevin)
- Rename iommu_copy_user_data() to iommu_copy_struct_from_user() (Kevin)
- Add macro to the iommu_copy_struct_from_user() to calculate min_size
(Jason)
- Fix two bugs spotted by ZhaoYan
v4: https://lore.kernel.org/linux-iommu/20230921075138.124099-1-yi.l.liu@intel.…
- Separate HWPT alloc/destroy/abort functions between user-managed HWPTs
and kernel-managed HWPTs
- Rework invalidate uAPI to be a multi-request array-based design
- Add a struct iommu_user_data_array and a helper for driver to sanitize
and copy the entry data from user space invalidation array
- Add a patch fixing TEST_LENGTH() in selftest program
- Drop IOMMU_RESV_IOVA_RANGES patches
- Update kdoc and inline comments
- Drop the code to add IOMMU_RESV_SW_MSI to kernel-managed HWPT in nested translation,
this does not change the rule that resv regions should only be added to the
kernel-managed HWPT. The IOMMU_RESV_SW_MSI stuff will be added in later series
as it is needed only by SMMU so far.
v3: https://lore.kernel.org/linux-iommu/20230724110406.107212-1-yi.l.liu@intel.…
- Add new uAPI things in alphabetical order
- Pass in "enum iommu_hwpt_type hwpt_type" to op->domain_alloc_user for
sanity, replacing the previous op->domain_alloc_user_data_len solution
- Return ERR_PTR from domain_alloc_user instead of NULL
- Only add IOMMU_RESV_SW_MSI to kernel-managed HWPT in nested translation (Kevin)
- Add IOMMU_RESV_IOVA_RANGES to report resv iova ranges to userspace hence
userspace is able to exclude the ranges in the stage-1 HWPT (e.g. guest I/O
page table). (Kevin)
- Add selftest coverage for the new IOMMU_RESV_IOVA_RANGES ioctl
- Minor changes per Kevin's inputs
v2: https://lore.kernel.org/linux-iommu/20230511143844.22693-1-yi.l.liu@intel.c…
- Add union iommu_domain_user_data to include all user data structures to avoid
passing void * in kernel APIs.
- Add iommu op to return user data length for user domain allocation
- Rename struct iommu_hwpt_alloc::data_type to be hwpt_type
- Store the invalidation data length in iommu_domain_ops::cache_invalidate_user_data_len
- Convert cache_invalidate_user op to be int instead of void
- Remove @data_type in struct iommu_hwpt_invalidate
- Remove out_hwpt_type_bitmap in struct iommu_hw_info hence drop patch 08 of v1
v1: https://lore.kernel.org/linux-iommu/20230309080910.607396-1-yi.l.liu@intel.…
Thanks,
Yi Liu
Lu Baolu (1):
iommu: Add cache_invalidate_user op
Nicolin Chen (4):
iommu: Add iommu_copy_struct_from_user_array helper
iommufd/selftest: Add mock_domain_cache_invalidate_user support
iommufd/selftest: Add IOMMU_TEST_OP_MD_CHECK_IOTLB test op
iommufd/selftest: Add coverage for IOMMU_HWPT_INVALIDATE ioctl
Yi Liu (1):
iommufd: Add IOMMU_HWPT_INVALIDATE
drivers/iommu/iommufd/hw_pagetable.c | 35 ++++++++
drivers/iommu/iommufd/iommufd_private.h | 9 ++
drivers/iommu/iommufd/iommufd_test.h | 22 +++++
drivers/iommu/iommufd/main.c | 3 +
drivers/iommu/iommufd/selftest.c | 69 +++++++++++++++
include/linux/iommu.h | 84 +++++++++++++++++++
include/uapi/linux/iommufd.h | 36 ++++++++
tools/testing/selftests/iommu/iommufd.c | 75 +++++++++++++++++
tools/testing/selftests/iommu/iommufd_utils.h | 63 ++++++++++++++
9 files changed, 396 insertions(+)
--
2.34.1
Nested translation is a hardware feature that is supported by many modern
IOMMU hardwares. It has two stages of address translations to get access
to the physical address. A stage-1 translation table is owned by userspace
(e.g. by a guest OS), while a stage-2 is owned by kernel. Any change to a
stage-1 translation table should be followed by an IOTLB invalidation.
Take Intel VT-d as an example, the stage-1 translation table is guest I/O
page table. As the below diagram shows, the guest I/O page table pointer
in GPA (guest physical address) is passed to host and be used to perform
a stage-1 translation. Along with it, a modification to present mappings
in the guest I/O page table should be followed by an IOTLB invalidation.
.-------------. .---------------------------.
| vIOMMU | | Guest I/O page table |
| | '---------------------------'
.----------------/
| PASID Entry |--- PASID cache flush --+
'-------------' |
| | V
| | I/O page table pointer in GPA
'-------------'
Guest
------| Shadow |---------------------------|--------
v v v
Host
.-------------. .------------------------.
| pIOMMU | | FS for GIOVA->GPA |
| | '------------------------'
.----------------/ |
| PASID Entry | V (Nested xlate)
'----------------\.----------------------------------.
| | | SS for GPA->HPA, unmanaged domain|
| | '----------------------------------'
'-------------'
Where:
- FS = First stage page tables
- SS = Second stage page tables
<Intel VT-d Nested translation>
In IOMMUFD, all the translation tables are tracked by hw_pagetable (hwpt)
and each hwpt is backed by an iommu_domain allocated from an iommu driver.
So in this series hw_pagetable and iommu_domain means the same thing if no
special note. IOMMUFD has already supported allocating hw_pagetable linked
with an IOAS. However, a nesting case requires IOMMUFD to allow allocating
hw_pagetable with driver specific parameters and interface to sync stage-1
IOTLB as user owns the stage-1 translation table.
This series is based on the iommu hw info reporting series [1] and nested
parent domain allocation [2]. It first extends domain_alloc_user to allocate
hwpt with user data by allowing the IOMMUFD internal infrastructure to accept
user_data and parent hwpt, relaying the user_data/parent to the iommu core
to allocate IOMMU_DOMAIN_NESTED. And it then extends the IOMMU_HWPT_ALLOC
ioctl to accept user data and a parent hwpt ID.
Note that this series is the part-1 set of a two-part nesting series. It
does not include the cache invalidation interface, which will be added in
the part 2.
Complete code can be found in [3], QEMU could can be found in [4].
At last, this is a team work together with Nicolin Chen, Lu Baolu. Thanks
them for the help. ^_^. Look forward to your feedbacks.
[1] https://lore.kernel.org/linux-iommu/20230818101033.4100-1-yi.l.liu@intel.co… - merged
[2] https://lore.kernel.org/linux-iommu/20230928071528.26258-1-yi.l.liu@intel.c… - merged
[3] https://github.com/yiliu1765/iommufd/tree/iommufd_nesting
[4] https://github.com/yiliu1765/qemu/tree/zhenzhong/wip/iommufd_nesting_rfcv1
Change log:
v5:
- Split the iommufd nesting series into two parts of alloc_user and
invalidation (Jason)
- Split IOMMUFD_OBJ_HW_PAGETABLE to IOMMUFD_OBJ_HWPT_PAGING/_NESTED, and
do the same with the structures/alloc()/abort()/destroy(). Reworked the
selftest accordingly too. (Jason)
- Move hwpt/data_type into struct iommu_user_data from standalone op
arguments. (Jason)
- Rename hwpt_type to be data_type, the HWPT_TYPE to be HWPT_ALLOC_DATA,
_TYPE_DEFAULT to be _ALLOC_DATA_NONE (Jason, Kevin)
- Rename iommu_copy_user_data() to iommu_copy_struct_from_user() (Kevin)
- Add macro to the iommu_copy_struct_from_user() to calculate min_size
(Jason)
- Fix two bugs spotted by ZhaoYan
v4: https://lore.kernel.org/linux-iommu/20230921075138.124099-1-yi.l.liu@intel.…
- Separate HWPT alloc/destroy/abort functions between user-managed HWPTs
and kernel-managed HWPTs
- Rework invalidate uAPI to be a multi-request array-based design
- Add a struct iommu_user_data_array and a helper for driver to sanitize
and copy the entry data from user space invalidation array
- Add a patch fixing TEST_LENGTH() in selftest program
- Drop IOMMU_RESV_IOVA_RANGES patches
- Update kdoc and inline comments
- Drop the code to add IOMMU_RESV_SW_MSI to kernel-managed HWPT in nested
translation, this does not change the rule that resv regions should only
be added to the kernel-managed HWPT. The IOMMU_RESV_SW_MSI stuff will be
added in later series as it is needed only by SMMU so far.
v3: https://lore.kernel.org/linux-iommu/20230724110406.107212-1-yi.l.liu@intel.…
- Add new uAPI things in alphabetical order
- Pass in "enum iommu_hwpt_type hwpt_type" to op->domain_alloc_user for
sanity, replacing the previous op->domain_alloc_user_data_len solution
- Return ERR_PTR from domain_alloc_user instead of NULL
- Only add IOMMU_RESV_SW_MSI to kernel-managed HWPT in nested translation
(Kevin)
- Add IOMMU_RESV_IOVA_RANGES to report resv iova ranges to userspace hence
userspace is able to exclude the ranges in the stage-1 HWPT (e.g. guest
I/O page table). (Kevin)
- Add selftest coverage for the new IOMMU_RESV_IOVA_RANGES ioctl
- Minor changes per Kevin's inputs
v2: https://lore.kernel.org/linux-iommu/20230511143844.22693-1-yi.l.liu@intel.c…
- Add union iommu_domain_user_data to include all user data structures to
avoid passing void * in kernel APIs.
- Add iommu op to return user data length for user domain allocation
- Rename struct iommu_hwpt_alloc::data_type to be hwpt_type
- Store the invalidation data length in
iommu_domain_ops::cache_invalidate_user_data_len
- Convert cache_invalidate_user op to be int instead of void
- Remove @data_type in struct iommu_hwpt_invalidate
- Remove out_hwpt_type_bitmap in struct iommu_hw_info hence drop patch 08
of v1
v1: https://lore.kernel.org/linux-iommu/20230309080910.607396-1-yi.l.liu@intel.…
Thanks,
Yi Liu
Jason Gunthorpe (2):
iommufd: Rename IOMMUFD_OBJ_HW_PAGETABLE to IOMMUFD_OBJ_HWPT_PAGING
iommufd/device: Wrap IOMMUFD_OBJ_HWPT_PAGING-only configurations
Lu Baolu (1):
iommu: Add IOMMU_DOMAIN_NESTED
Nicolin Chen (6):
iommufd: Derive iommufd_hwpt_paging from iommufd_hw_pagetable
iommufd: Share iommufd_hwpt_alloc with IOMMUFD_OBJ_HWPT_NESTED
iommufd: Add a nested HW pagetable object
iommu: Add iommu_copy_struct_from_user helper
iommufd/selftest: Add nested domain allocation for mock domain
iommufd/selftest: Add coverage for IOMMU_HWPT_ALLOC with nested HWPTs
Yi Liu (1):
iommu: Pass in parent domain with user_data to domain_alloc_user op
drivers/iommu/intel/iommu.c | 4 +-
drivers/iommu/iommufd/device.c | 184 +++++++++-----
drivers/iommu/iommufd/hw_pagetable.c | 237 ++++++++++++++----
drivers/iommu/iommufd/iommufd_private.h | 67 +++--
drivers/iommu/iommufd/iommufd_test.h | 18 ++
drivers/iommu/iommufd/main.c | 10 +-
drivers/iommu/iommufd/selftest.c | 137 ++++++++--
drivers/iommu/iommufd/vfio_compat.c | 6 +-
include/linux/iommu.h | 72 +++++-
include/uapi/linux/iommufd.h | 31 ++-
tools/testing/selftests/iommu/iommufd.c | 120 +++++++++
.../selftests/iommu/iommufd_fail_nth.c | 3 +-
tools/testing/selftests/iommu/iommufd_utils.h | 31 ++-
13 files changed, 752 insertions(+), 168 deletions(-)
--
2.34.1
hi, Akihiko Odaki,
sorry for sending again, the previous one has some problem that lost most
of CC part.
Hello,
kernel test robot noticed "kernel-selftests.net.make.fail" on:
commit: c04079dfb34c2f534f013408b12218c14b286b7d ("[RFC PATCH 6/7] selftest: tun: Add tests for virtio-net hashing")
url: https://github.com/intel-lab-lkp/linux/commits/Akihiko-Odaki/net-skbuff-Add…
base: https://git.kernel.org/cgit/linux/kernel/git/shuah/linux-kselftest.git next
patch link: https://lore.kernel.org/all/20231008052101.144422-7-akihiko.odaki@daynix.co…
patch subject: [RFC PATCH 6/7] selftest: tun: Add tests for virtio-net hashing
in testcase: kernel-selftests
version: kernel-selftests-x86_64-60acb023-1_20230329
with following parameters:
group: net
test: fcnal-test.sh
atomic_test: ipv6_runtime
compiler: gcc-12
test machine: 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 32G memory
(please refer to attached dmesg/kmsg for entire log/backtrace)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang(a)intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202310192236.fde97031-oliver.sang@intel.com
KERNEL SELFTESTS: linux_headers_dir is /usr/src/linux-headers-x86_64-rhel-8.3-kselftests-c04079dfb34c2f534f013408b12218c14b286b7d
2023-10-14 17:54:16 mount --bind /lib/modules/6.6.0-rc2-00023-gc04079dfb34c/kernel/lib /usr/src/perf_selftests-x86_64-rhel-8.3-kselftests-c04079dfb34c2f534f013408b12218c14b286b7d/lib
make: Entering directory '/usr/src/perf_selftests-x86_64-rhel-8.3-kselftests-c04079dfb34c2f534f013408b12218c14b286b7d/tools/bpf/resolve_btfids'
...
gcc -Wall -Wl,--no-as-needed -O2 -g -I../../../../usr/include/ -I../../../include/ -isystem /usr/src/perf_selftests-x86_64-rhel-8.3-kselftests-c04079dfb34c2f534f013408b12218c14b286b7d/usr/include -I../ txtimestamp.c -o /usr/src/perf_selftests-x86_64-rhel-8.3-kselftests-c04079dfb34c2f534f013408b12218c14b286b7d/tools/testing/selftests/net/txtimestamp
reuseport_bpf.c: In function ‘attach_cbpf’:
reuseport_bpf.c:133:28: error: array type has incomplete element type ‘struct sock_filter’
133 | struct sock_filter code[] = {
| ^~~~
reuseport_bpf.c:139:29: error: ‘BPF_A’ undeclared (first use in this function); did you mean ‘BPF_H’?
139 | { BPF_RET | BPF_A, 0, 0, 0 },
| ^~~~~
| BPF_H
reuseport_bpf.c:139:29: note: each undeclared identifier is reported only once for each function it appears in
reuseport_bpf.c:141:16: error: variable ‘p’ has initializer but incomplete type
141 | struct sock_fprog p = {
| ^~~~~~~~~~
reuseport_bpf.c:142:18: error: ‘struct sock_fprog’ has no member named ‘len’
142 | .len = ARRAY_SIZE(code),
| ^~~
In file included from reuseport_bpf.c:27:
../kselftest.h:56:25: warning: excess elements in struct initializer
56 | #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
| ^
reuseport_bpf.c:142:24: note: in expansion of macro ‘ARRAY_SIZE’
142 | .len = ARRAY_SIZE(code),
| ^~~~~~~~~~
../kselftest.h:56:25: note: (near initialization for ‘p’)
56 | #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
| ^
reuseport_bpf.c:142:24: note: in expansion of macro ‘ARRAY_SIZE’
142 | .len = ARRAY_SIZE(code),
| ^~~~~~~~~~
reuseport_bpf.c:143:18: error: ‘struct sock_fprog’ has no member named ‘filter’
143 | .filter = code,
| ^~~~~~
reuseport_bpf.c:143:27: warning: excess elements in struct initializer
143 | .filter = code,
| ^~~~
reuseport_bpf.c:143:27: note: (near initialization for ‘p’)
reuseport_bpf.c:141:27: error: storage size of ‘p’ isn’t known
141 | struct sock_fprog p = {
| ^
reuseport_bpf.c:141:27: warning: unused variable ‘p’ [-Wunused-variable]
reuseport_bpf.c:133:28: warning: unused variable ‘code’ [-Wunused-variable]
133 | struct sock_filter code[] = {
| ^~~~
reuseport_bpf.c: In function ‘test_filter_no_reuseport’:
reuseport_bpf.c:346:28: error: array type has incomplete element type ‘struct sock_filter’
346 | struct sock_filter ccode[] = {{ BPF_RET | BPF_A, 0, 0, 0 }};
| ^~~~~
reuseport_bpf.c:346:51: error: ‘BPF_A’ undeclared (first use in this function); did you mean ‘BPF_H’?
346 | struct sock_filter ccode[] = {{ BPF_RET | BPF_A, 0, 0, 0 }};
| ^~~~~
| BPF_H
reuseport_bpf.c:348:27: error: storage size of ‘cprog’ isn’t known
348 | struct sock_fprog cprog;
| ^~~~~
reuseport_bpf.c:348:27: warning: unused variable ‘cprog’ [-Wunused-variable]
reuseport_bpf.c:346:28: warning: unused variable ‘ccode’ [-Wunused-variable]
346 | struct sock_filter ccode[] = {{ BPF_RET | BPF_A, 0, 0, 0 }};
| ^~~~~
make: *** [../lib.mk:181: /usr/src/perf_selftests-x86_64-rhel-8.3-kselftests-c04079dfb34c2f534f013408b12218c14b286b7d/tools/testing/selftests/net/reuseport_bpf] Error 1
make: *** Waiting for unfinished jobs....
reuseport_bpf_cpu.c: In function ‘attach_bpf’:
reuseport_bpf_cpu.c:79:28: error: array type has incomplete element type ‘struct sock_filter’
79 | struct sock_filter code[] = {
| ^~~~
reuseport_bpf_cpu.c:81:52: error: ‘SKF_AD_OFF’ undeclared (first use in this function)
81 | { BPF_LD | BPF_W | BPF_ABS, 0, 0, SKF_AD_OFF + SKF_AD_CPU },
| ^~~~~~~~~~
reuseport_bpf_cpu.c:81:52: note: each undeclared identifier is reported only once for each function it appears in
reuseport_bpf_cpu.c:81:65: error: ‘SKF_AD_CPU’ undeclared (first use in this function)
81 | { BPF_LD | BPF_W | BPF_ABS, 0, 0, SKF_AD_OFF + SKF_AD_CPU },
| ^~~~~~~~~~
reuseport_bpf_cpu.c:83:29: error: ‘BPF_A’ undeclared (first use in this function); did you mean ‘BPF_H’?
83 | { BPF_RET | BPF_A, 0, 0, 0 },
| ^~~~~
| BPF_H
reuseport_bpf_cpu.c:85:16: error: variable ‘p’ has initializer but incomplete type
85 | struct sock_fprog p = {
| ^~~~~~~~~~
reuseport_bpf_cpu.c:86:18: error: ‘struct sock_fprog’ has no member named ‘len’
86 | .len = 2,
| ^~~
reuseport_bpf_cpu.c:86:24: warning: excess elements in struct initializer
86 | .len = 2,
| ^
reuseport_bpf_cpu.c:86:24: note: (near initialization for ‘p’)
reuseport_bpf_cpu.c:87:18: error: ‘struct sock_fprog’ has no member named ‘filter’
87 | .filter = code,
| ^~~~~~
reuseport_bpf_cpu.c:87:27: warning: excess elements in struct initializer
87 | .filter = code,
| ^~~~
reuseport_bpf_cpu.c:87:27: note: (near initialization for ‘p’)
reuseport_bpf_cpu.c:85:27: error: storage size of ‘p’ isn’t known
85 | struct sock_fprog p = {
| ^
reuseport_bpf_cpu.c:85:27: warning: unused variable ‘p’ [-Wunused-variable]
reuseport_bpf_cpu.c:79:28: warning: unused variable ‘code’ [-Wunused-variable]
79 | struct sock_filter code[] = {
| ^~~~
make: *** [../lib.mk:181: /usr/src/perf_selftests-x86_64-rhel-8.3-kselftests-c04079dfb34c2f534f013408b12218c14b286b7d/tools/testing/selftests/net/reuseport_bpf_cpu] Error 1
In file included from psock_fanout.c:55:
psock_lib.h: In function ‘pair_udp_setfilter’:
psock_lib.h:52:28: error: array type has incomplete element type ‘struct sock_filter’
52 | struct sock_filter bpf_filter[] = {
| ^~~~~~~~~~
psock_lib.h:65:27: error: storage size of ‘bpf_prog’ isn’t known
65 | struct sock_fprog bpf_prog;
| ^~~~~~~~
psock_lib.h:65:27: warning: unused variable ‘bpf_prog’ [-Wunused-variable]
psock_lib.h:52:28: warning: unused variable ‘bpf_filter’ [-Wunused-variable]
52 | struct sock_filter bpf_filter[] = {
| ^~~~~~~~~~
psock_fanout.c: In function ‘sock_fanout_set_cbpf’:
psock_fanout.c:114:28: error: array type has incomplete element type ‘struct sock_filter’
114 | struct sock_filter bpf_filter[] = {
| ^~~~~~~~~~
psock_fanout.c:115:17: warning: implicit declaration of function ‘BPF_STMT’; did you mean ‘BPF_STX’? [-Wimplicit-function-declaration]
115 | BPF_STMT(BPF_LD | BPF_B | BPF_ABS, 80), /* ldb [80] */
| ^~~~~~~~
| BPF_STX
psock_fanout.c:116:36: error: ‘BPF_A’ undeclared (first use in this function); did you mean ‘BPF_X’?
116 | BPF_STMT(BPF_RET | BPF_A, 0), /* ret A */
| ^~~~~
| BPF_X
psock_fanout.c:116:36: note: each undeclared identifier is reported only once for each function it appears in
psock_fanout.c:118:27: error: storage size of ‘bpf_prog’ isn’t known
118 | struct sock_fprog bpf_prog;
| ^~~~~~~~
psock_fanout.c:118:27: warning: unused variable ‘bpf_prog’ [-Wunused-variable]
psock_fanout.c:114:28: warning: unused variable ‘bpf_filter’ [-Wunused-variable]
114 | struct sock_filter bpf_filter[] = {
| ^~~~~~~~~~
make: *** [../lib.mk:181: /usr/src/perf_selftests-x86_64-rhel-8.3-kselftests-c04079dfb34c2f534f013408b12218c14b286b7d/tools/testing/selftests/net/psock_fanout] Error 1
In file included from psock_tpacket.c:47:
psock_lib.h: In function ‘pair_udp_setfilter’:
psock_lib.h:52:28: error: array type has incomplete element type ‘struct sock_filter’
52 | struct sock_filter bpf_filter[] = {
| ^~~~~~~~~~
psock_lib.h:65:27: error: storage size of ‘bpf_prog’ isn’t known
65 | struct sock_fprog bpf_prog;
| ^~~~~~~~
psock_lib.h:65:27: warning: unused variable ‘bpf_prog’ [-Wunused-variable]
psock_lib.h:52:28: warning: unused variable ‘bpf_filter’ [-Wunused-variable]
52 | struct sock_filter bpf_filter[] = {
| ^~~~~~~~~~
make: *** [../lib.mk:181: /usr/src/perf_selftests-x86_64-rhel-8.3-kselftests-c04079dfb34c2f534f013408b12218c14b286b7d/tools/testing/selftests/net/psock_tpacket] Error 1
In file included from psock_snd.c:32:
psock_lib.h: In function ‘pair_udp_setfilter’:
psock_lib.h:52:28: error: array type has incomplete element type ‘struct sock_filter’
52 | struct sock_filter bpf_filter[] = {
| ^~~~~~~~~~
psock_lib.h:65:27: error: storage size of ‘bpf_prog’ isn’t known
65 | struct sock_fprog bpf_prog;
| ^~~~~~~~
psock_lib.h:65:27: warning: unused variable ‘bpf_prog’ [-Wunused-variable]
psock_lib.h:52:28: warning: unused variable ‘bpf_filter’ [-Wunused-variable]
52 | struct sock_filter bpf_filter[] = {
| ^~~~~~~~~~
make: *** [../lib.mk:181: /usr/src/perf_selftests-x86_64-rhel-8.3-kselftests-c04079dfb34c2f534f013408b12218c14b286b7d/tools/testing/selftests/net/psock_snd] Error 1
make: Leaving directory '/usr/src/perf_selftests-x86_64-rhel-8.3-kselftests-c04079dfb34c2f534f013408b12218c14b286b7d/tools/testing/selftests/net'
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231019/202310192236.fde97031-oliv…
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
The arm64 Guarded Control Stack (GCS) feature provides support for
hardware protected stacks of return addresses, intended to provide
hardening against return oriented programming (ROP) attacks and to make
it easier to gather call stacks for applications such as profiling.
When GCS is active a secondary stack called the Guarded Control Stack is
maintained, protected with a memory attribute which means that it can
only be written with specific GCS operations. The current GCS pointer
can not be directly written to by userspace. When a BL is executed the
value stored in LR is also pushed onto the GCS, and when a RET is
executed the top of the GCS is popped and compared to LR with a fault
being raised if the values do not match. GCS operations may only be
performed on GCS pages, a data abort is generated if they are not.
The combination of hardware enforcement and lack of extra instructions
in the function entry and exit paths should result in something which
has less overhead and is more difficult to attack than a purely software
implementation like clang's shadow stacks.
This series implements support for use of GCS by userspace, along with
support for use of GCS within KVM guests. It does not enable use of GCS
by either EL1 or EL2, this will be implemented separately. Executables
are started without GCS and must use a prctl() to enable it, it is
expected that this will be done very early in application execution by
the dynamic linker or other startup code.
x86 has an equivalent feature called shadow stacks, this series depends
on the x86 patches for generic memory management support for the new
guarded/shadow stack page type and shares APIs as much as possible. As
there has been extensive discussion with the wider community around the
ABI for shadow stacks I have as far as practical kept implementation
decisions close to those for x86, anticipating that review would lead to
similar conclusions in the absence of strong reasoning for divergence.
The main divergence I am concious of is that x86 allows shadow stack to
be enabled and disabled repeatedly, freeing the shadow stack for the
thread whenever disabled, while this implementation keeps the GCS
allocated after disable but refuses to reenable it. This is to avoid
races with things actively walking the GCS during a disable, we do
anticipate that some systems will wish to disable GCS at runtime but are
not aware of any demand for subsequently reenabling it.
x86 uses an arch_prctl() to manage enable and disable, since only x86
and S/390 use arch_prctl() a generic prctl() was proposed[1] as part of a
patch set for the equivalent RISC-V zisslpcfi feature which I initially
adopted fairly directly but following review feedback has been revised
quite a bit.
There is an open issue with support for CRIU, on x86 this required the
ability to set the GCS mode via ptrace. This series supports
configuring mode bits other than enable/disable via ptrace but it needs
to be confirmed if this is sufficient.
There's a few bits where I'm not convinced with where I've placed
things, in particular the GCS write operation is in the GCS header not
in uaccess.h, I wasn't sure what was clearest there and am probably too
close to the code to have a clear opinion. The reporting of GCS in
/proc/PID/smaps is also a bit awkward.
The series depends on the x86 shadow stack support:
https://lore.kernel.org/lkml/20230227222957.24501-1-rick.p.edgecombe@intel.…
I've rebased this onto v6.5-rc4 but not included it in the series in
order to avoid confusion with Rick's work and cut down the size of the
series, you can see the branch at:
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/misc.git arm64-gcs
[1] https://lore.kernel.org/lkml/20230213045351.3945824-1-debug@rivosinc.com/
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Changes in v4:
- Implement flags for map_shadow_stack() allowing the cap and end of
stack marker to be enabled independently or not at all.
- Relax size and alignment requirements for map_shadow_stack().
- Add more blurb explaining the advantages of hardware enforcement.
- Link to v3: https://lore.kernel.org/r/20230731-arm64-gcs-v3-0-cddf9f980d98@kernel.org
Changes in v3:
- Rebase onto v6.5-rc4.
- Add a GCS barrier on context switch.
- Add a GCS stress test.
- Link to v2: https://lore.kernel.org/r/20230724-arm64-gcs-v2-0-dc2c1d44c2eb@kernel.org
Changes in v2:
- Rebase onto v6.5-rc3.
- Rework prctl() interface to allow each bit to be locked independently.
- map_shadow_stack() now places the cap token based on the size
requested by the caller not the actual space allocated.
- Mode changes other than enable via ptrace are now supported.
- Expand test coverage.
- Various smaller fixes and adjustments.
- Link to v1: https://lore.kernel.org/r/20230716-arm64-gcs-v1-0-bf567f93bba6@kernel.org
---
Mark Brown (36):
prctl: arch-agnostic prctl for shadow stack
arm64: Document boot requirements for Guarded Control Stacks
arm64/gcs: Document the ABI for Guarded Control Stacks
arm64/sysreg: Add new system registers for GCS
arm64/sysreg: Add definitions for architected GCS caps
arm64/gcs: Add manual encodings of GCS instructions
arm64/gcs: Provide copy_to_user_gcs()
arm64/cpufeature: Runtime detection of Guarded Control Stack (GCS)
arm64/mm: Allocate PIE slots for EL0 guarded control stack
mm: Define VM_SHADOW_STACK for arm64 when we support GCS
arm64/mm: Map pages for guarded control stack
KVM: arm64: Manage GCS registers for guests
arm64/gcs: Allow GCS usage at EL0 and EL1
arm64/idreg: Add overrride for GCS
arm64/hwcap: Add hwcap for GCS
arm64/traps: Handle GCS exceptions
arm64/mm: Handle GCS data aborts
arm64/gcs: Context switch GCS state for EL0
arm64/gcs: Allocate a new GCS for threads with GCS enabled
arm64/gcs: Implement shadow stack prctl() interface
arm64/mm: Implement map_shadow_stack()
arm64/signal: Set up and restore the GCS context for signal handlers
arm64/signal: Expose GCS state in signal frames
arm64/ptrace: Expose GCS via ptrace and core files
arm64: Add Kconfig for Guarded Control Stack (GCS)
kselftest/arm64: Verify the GCS hwcap
kselftest/arm64: Add GCS as a detected feature in the signal tests
kselftest/arm64: Add framework support for GCS to signal handling tests
kselftest/arm64: Allow signals tests to specify an expected si_code
kselftest/arm64: Always run signals tests with GCS enabled
kselftest/arm64: Add very basic GCS test program
kselftest/arm64: Add a GCS test program built with the system libc
kselftest/arm64: Add test coverage for GCS mode locking
selftests/arm64: Add GCS signal tests
kselftest/arm64: Add a GCS stress test
kselftest/arm64: Enable GCS for the FP stress tests
Documentation/admin-guide/kernel-parameters.txt | 3 +
Documentation/arch/arm64/booting.rst | 22 +
Documentation/arch/arm64/elf_hwcaps.rst | 3 +
Documentation/arch/arm64/gcs.rst | 228 +++++++++
Documentation/arch/arm64/index.rst | 1 +
Documentation/filesystems/proc.rst | 2 +-
arch/arm64/Kconfig | 19 +
arch/arm64/include/asm/cpufeature.h | 6 +
arch/arm64/include/asm/el2_setup.h | 17 +
arch/arm64/include/asm/esr.h | 28 +-
arch/arm64/include/asm/exception.h | 2 +
arch/arm64/include/asm/gcs.h | 106 ++++
arch/arm64/include/asm/hwcap.h | 1 +
arch/arm64/include/asm/kvm_arm.h | 4 +-
arch/arm64/include/asm/kvm_host.h | 12 +
arch/arm64/include/asm/pgtable-prot.h | 14 +-
arch/arm64/include/asm/processor.h | 7 +
arch/arm64/include/asm/sysreg.h | 20 +
arch/arm64/include/asm/uaccess.h | 42 ++
arch/arm64/include/uapi/asm/hwcap.h | 1 +
arch/arm64/include/uapi/asm/ptrace.h | 8 +
arch/arm64/include/uapi/asm/sigcontext.h | 9 +
arch/arm64/kernel/cpufeature.c | 19 +
arch/arm64/kernel/cpuinfo.c | 1 +
arch/arm64/kernel/entry-common.c | 23 +
arch/arm64/kernel/idreg-override.c | 2 +
arch/arm64/kernel/process.c | 85 ++++
arch/arm64/kernel/ptrace.c | 59 +++
arch/arm64/kernel/signal.c | 237 ++++++++-
arch/arm64/kernel/traps.c | 11 +
arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h | 17 +
arch/arm64/kvm/sys_regs.c | 22 +
arch/arm64/mm/Makefile | 1 +
arch/arm64/mm/fault.c | 78 ++-
arch/arm64/mm/gcs.c | 234 +++++++++
arch/arm64/mm/mmap.c | 12 +-
arch/arm64/tools/cpucaps | 1 +
arch/arm64/tools/sysreg | 55 +++
fs/proc/task_mmu.c | 3 +
include/linux/mm.h | 16 +-
include/linux/syscalls.h | 1 +
include/uapi/asm-generic/unistd.h | 5 +-
include/uapi/linux/elf.h | 1 +
include/uapi/linux/prctl.h | 22 +
kernel/sys.c | 30 ++
kernel/sys_ni.c | 1 +
tools/testing/selftests/arm64/Makefile | 2 +-
tools/testing/selftests/arm64/abi/hwcap.c | 19 +
tools/testing/selftests/arm64/fp/assembler.h | 15 +
tools/testing/selftests/arm64/fp/fpsimd-test.S | 2 +
tools/testing/selftests/arm64/fp/sve-test.S | 2 +
tools/testing/selftests/arm64/fp/za-test.S | 2 +
tools/testing/selftests/arm64/fp/zt-test.S | 2 +
tools/testing/selftests/arm64/gcs/.gitignore | 5 +
tools/testing/selftests/arm64/gcs/Makefile | 24 +
tools/testing/selftests/arm64/gcs/asm-offsets.h | 0
tools/testing/selftests/arm64/gcs/basic-gcs.c | 356 ++++++++++++++
tools/testing/selftests/arm64/gcs/gcs-locking.c | 200 ++++++++
.../selftests/arm64/gcs/gcs-stress-thread.S | 311 ++++++++++++
tools/testing/selftests/arm64/gcs/gcs-stress.c | 532 +++++++++++++++++++++
tools/testing/selftests/arm64/gcs/gcs-util.h | 87 ++++
tools/testing/selftests/arm64/gcs/libc-gcs.c | 500 +++++++++++++++++++
tools/testing/selftests/arm64/signal/.gitignore | 1 +
.../testing/selftests/arm64/signal/test_signals.c | 17 +-
.../testing/selftests/arm64/signal/test_signals.h | 6 +
.../selftests/arm64/signal/test_signals_utils.c | 32 +-
.../selftests/arm64/signal/test_signals_utils.h | 39 ++
.../arm64/signal/testcases/gcs_exception_fault.c | 59 +++
.../selftests/arm64/signal/testcases/gcs_frame.c | 78 +++
.../arm64/signal/testcases/gcs_write_fault.c | 67 +++
.../selftests/arm64/signal/testcases/testcases.c | 7 +
.../selftests/arm64/signal/testcases/testcases.h | 1 +
72 files changed, 3823 insertions(+), 34 deletions(-)
---
base-commit: ed0e1456f04be7a93c9a186e8e13aed78b555617
change-id: 20230303-arm64-gcs-e311ab0d8729
Best regards,
--
Mark Brown <broonie(a)kernel.org>
Commit 2810c1e99867 ("kunit: Fix wild-memory-access bug in
kunit_free_suite_set()") is causing all test suites to run (when
built as modules) while still in MODULE_STATE_COMING. In that state,
test modules are not fully initialized and lack sysfs kobjects.
This behavior can cause a crash if the test module tries to register
fake devices.
This patch restores the normal execution flow, waiting for the module
initialization to complete before running the test suites.
The issue reported in the commit mentioned above is addressed using
virt_addr_valid() to detect if the module loading has failed
and mod->kunit_suites has not been allocated using kmalloc_array().
Fixes: 2810c1e99867 ("kunit: Fix wild-memory-access bug in kunit_free_suite_set()")
Signed-off-by: Marco Pagani <marpagan(a)redhat.com>
---
lib/kunit/test.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/lib/kunit/test.c b/lib/kunit/test.c
index 421f13981412..1a49569186fc 100644
--- a/lib/kunit/test.c
+++ b/lib/kunit/test.c
@@ -769,12 +769,14 @@ static void kunit_module_exit(struct module *mod)
};
const char *action = kunit_action();
+ if (!suite_set.start || !virt_addr_valid(suite_set.start))
+ return;
+
if (!action)
__kunit_test_suites_exit(mod->kunit_suites,
mod->num_kunit_suites);
- if (suite_set.start)
- kunit_free_suite_set(suite_set);
+ kunit_free_suite_set(suite_set);
}
static int kunit_module_notify(struct notifier_block *nb, unsigned long val,
@@ -784,12 +786,12 @@ static int kunit_module_notify(struct notifier_block *nb, unsigned long val,
switch (val) {
case MODULE_STATE_LIVE:
+ kunit_module_init(mod);
break;
case MODULE_STATE_GOING:
kunit_module_exit(mod);
break;
case MODULE_STATE_COMING:
- kunit_module_init(mod);
break;
case MODULE_STATE_UNFORMED:
break;
--
2.41.0
Kselftests are kernel tests and must be build with kernel headers from
same source version. The kernel headers are already being included
correctly in futex selftest Makefile with the help of KHDR_INCLUDE. In
this patch, only the dead code is being removed. No functional change is
intended.
Signed-off-by: Muhammad Usama Anjum <usama.anjum(a)collabora.com>
---
Changes since v1:
- Make the explanation correct
---
.../selftests/futex/include/futextest.h | 22 -------------------
1 file changed, 22 deletions(-)
diff --git a/tools/testing/selftests/futex/include/futextest.h b/tools/testing/selftests/futex/include/futextest.h
index ddbcfc9b7bac4..59f66af3a6d10 100644
--- a/tools/testing/selftests/futex/include/futextest.h
+++ b/tools/testing/selftests/futex/include/futextest.h
@@ -25,28 +25,6 @@
typedef volatile u_int32_t futex_t;
#define FUTEX_INITIALIZER 0
-/* Define the newer op codes if the system header file is not up to date. */
-#ifndef FUTEX_WAIT_BITSET
-#define FUTEX_WAIT_BITSET 9
-#endif
-#ifndef FUTEX_WAKE_BITSET
-#define FUTEX_WAKE_BITSET 10
-#endif
-#ifndef FUTEX_WAIT_REQUEUE_PI
-#define FUTEX_WAIT_REQUEUE_PI 11
-#endif
-#ifndef FUTEX_CMP_REQUEUE_PI
-#define FUTEX_CMP_REQUEUE_PI 12
-#endif
-#ifndef FUTEX_WAIT_REQUEUE_PI_PRIVATE
-#define FUTEX_WAIT_REQUEUE_PI_PRIVATE (FUTEX_WAIT_REQUEUE_PI | \
- FUTEX_PRIVATE_FLAG)
-#endif
-#ifndef FUTEX_REQUEUE_PI_PRIVATE
-#define FUTEX_CMP_REQUEUE_PI_PRIVATE (FUTEX_CMP_REQUEUE_PI | \
- FUTEX_PRIVATE_FLAG)
-#endif
-
/**
* futex() - SYS_futex syscall wrapper
* @uaddr: address of first futex
--
2.40.1
Commit 20d96b25cc4c ("selftests/resctrl: Fix schemata write error
check") exposed a problem in feature detection logic in MBM selftest.
If schemata does not support MB:x=x entries, the schemata write to
initialize 100% memory bandwidth allocation in mbm_setup() will now
fail with -EINVAL due to the error handling corrected by the commit
20d96b25cc4c ("selftests/resctrl: Fix schemata write error check").
That commit just uncovers the failed write, it is not wrong itself.
If MB:x=x is not supported by schemata, it is safe to assume 100%
memory bandwidth is always set. Therefore, the previously ignored error
does not make the MBM test itself wrong.
Restore the previous behavior of MBM test by checking MB support before
attempting to write it into schemata which results in behavior
equivalent to ignoring the write error.
Fixes: 20d96b25cc4c ("selftests/resctrl: Fix schemata write error check")
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre(a)intel.com>
---
tools/testing/selftests/resctrl/mbm_test.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c
index d3c0d30c676a..741533f2b075 100644
--- a/tools/testing/selftests/resctrl/mbm_test.c
+++ b/tools/testing/selftests/resctrl/mbm_test.c
@@ -95,7 +95,7 @@ static int mbm_setup(struct resctrl_val_param *p)
return END_OF_TESTS;
/* Set up shemata with 100% allocation on the first run. */
- if (p->num_of_runs == 0)
+ if (p->num_of_runs == 0 && validate_resctrl_feature_request("MB", NULL))
ret = write_schemata(p->ctrlgrp, "100", p->cpu_no,
p->resctrl_val);
--
2.30.2
If name_show() is non unique, this test will try to install a kprobe on this
function which should fail returning EADDRNOTAVAIL.
On kernel where name_show() is not unique, this test is skipped.
Signed-off-by: Francis Laniel <flaniel(a)linux.microsoft.com>
---
.../ftrace/test.d/kprobe/kprobe_non_uniq_symbol.tc | 13 +++++++++++++
1 file changed, 13 insertions(+)
create mode 100644 tools/testing/selftests/ftrace/test.d/kprobe/kprobe_non_uniq_symbol.tc
diff --git a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_non_uniq_symbol.tc b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_non_uniq_symbol.tc
new file mode 100644
index 000000000000..bc9514428dba
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_non_uniq_symbol.tc
@@ -0,0 +1,13 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Test failure of registering kprobe on non unique symbol
+# requires: kprobe_events
+
+SYMBOL='name_show'
+
+# We skip this test on kernel where SYMBOL is unique or does not exist.
+if [ "$(grep -c -E "[[:alnum:]]+ t ${SYMBOL}" /proc/kallsyms)" -le '1' ]; then
+ exit_unsupported
+fi
+
+! echo "p:test_non_unique ${SYMBOL}" > kprobe_events
--
2.34.1
Introduce a limit on the amount of learned FDB entries on a bridge,
configured by netlink with a build time default on bridge creation in
the kernel config.
For backwards compatibility the kernel config default is disabling the
limit (0).
Without any limit a malicious actor may OOM a kernel by spamming packets
with changing MAC addresses on their bridge port, so allow the bridge
creator to limit the number of entries.
Currently the manual entries are identified by the bridge flags
BR_FDB_LOCAL or BR_FDB_ADDED_BY_USER, atomically bundled under the new
flag BR_FDB_DYNAMIC_LEARNED. This means the limit also applies to
entries created with BR_FDB_ADDED_BY_EXT_LEARN but none of BR_FDB_LOCAL
or BR_FDB_ADDED_BY_USER, e.g. ones added by SWITCHDEV_FDB_ADD_TO_BRIDGE.
Link to the corresponding iproute2 changes:
https://lore.kernel.org/r/20230919-fdb_limit-v4-1-b4d2dc4df30f@avm.de
Signed-off-by: Johannes Nixdorf <jnixdorf-oss(a)avm.de>
---
Changes in v5:
- Set IFLA_BR_FDB_N_LEARNED to NLA_REJECT (from review)
- Moved the strict_start_type-commit after the netlink change, used
the new attribute. (from review)
- Dropped the new build time config option. (from review)
- Link to v4: https://lore.kernel.org/r/20230919-fdb_limit-v4-0-39f0293807b8@avm.de
Changes in v4:
- Added the new test to the Makefile. (from review)
- Removed _entries from the names. (from iproute2 review, in some places
only for consistency)
- Wrapped the lines at 80 chars, except when longer lines are consistent
with neighbouring code. (from review)
- Fixed a race in fdb_delete. (from review)
- Link to v3: https://lore.kernel.org/r/20230905-fdb_limit-v3-0-7597cd500a82@avm.de
Changes in v3:
- Fixed the flags for fdb_create in fdb_add_entry to use
BIT(...). Previously we passed garbage. (from review)
- Set strict_start_type for br_policy. (from review)
- Split out the combined accounting and limit patch, and the netlink
patch from the combined patch in v2. (from review)
- Count atomically, remove the newly introduced lock. (from review)
- Added the new attributes to br_policy. (from review)
- Added a selftest for the new feature. (from review)
- Link to v2: https://lore.kernel.org/netdev/20230619071444.14625-1-jnixdorf-oss@avm.de/
Changes in v2:
- Added BR_FDB_ADDED_BY_USER earlier in fdb_add_entry to ensure the
limit is not applied.
- Do not initialize fdb_*_entries to 0. (from review)
- Do not skip decrementing on 0. (from review)
- Moved the counters to a conditional hole in struct net_bridge to
avoid growing the struct. (from review, it still grows the struct as
there are 2 32-bit values)
- Add IFLA_BR_FDB_CUR_LEARNED_ENTRIES (from review)
- Fix br_get_size() with the added attributes.
- Only limit learned entries, rename to
*_(CUR|MAX)_LEARNED_ENTRIES. (from review)
- Added a default limit in Kconfig. (deemed acceptable in review
comments, helps with embedded use-cases where a special purpose kernel
is built anyways)
- Added an iproute2 patch for easier testing.
- Link to v1: https://lore.kernel.org/netdev/20230515085046.4457-1-jnixdorf-oss@avm.de/
Obsolete v1 review comments:
- Return better errors to users: Due to limiting the limit to
automatically created entries, netlink fdb add requests and changing
bridge ports are never rejected, so they do not yet need a more
friendly error returned.
---
Johannes Nixdorf (5):
net: bridge: Set BR_FDB_ADDED_BY_USER early in fdb_add_entry
net: bridge: Track and limit dynamically learned FDB entries
net: bridge: Add netlink knobs for number / max learned FDB entries
net: bridge: Set strict_start_type for br_policy
selftests: forwarding: bridge_fdb_learning_limit: Add a new selftest
include/uapi/linux/if_link.h | 2 +
net/bridge/br_fdb.c | 42 ++-
net/bridge/br_netlink.c | 17 +-
net/bridge/br_private.h | 4 +
tools/testing/selftests/net/forwarding/Makefile | 3 +-
.../net/forwarding/bridge_fdb_learning_limit.sh | 283 +++++++++++++++++++++
6 files changed, 344 insertions(+), 7 deletions(-)
---
base-commit: 58720809f52779dc0f08e53e54b014209d13eebb
change-id: 20230904-fdb_limit-fae5bbf16c88
Best regards,
--
Johannes Nixdorf <jnixdorf-oss(a)avm.de>
On Tue, 15 Aug 2023 16:59:10 +0800
Yu Liao <liaoyu15(a)huawei.com> wrote:
> Yu Liao (2):
> selftests/ftrace: add loongarch support for kprobe args char tests
> selftests/ftrace: Add riscv support for kprobe arg tests
>
> .../selftests/ftrace/test.d/kprobe/kprobe_args_char.tc | 6 ++++++
> .../selftests/ftrace/test.d/kprobe/kprobe_args_string.tc | 3 +++
> .../selftests/ftrace/test.d/kprobe/kprobe_args_syntax.tc | 4 ++++
> 3 files changed, 13 insertions(+)
>
I noticed that this never got picked up, but that's probably because it
didn't also Cc linux-kselftest(a)vger.kernel.org (which I did here).
Shuah,
Can you add this? You can also add:
Acked-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
-- Steve
The abi_test currently uses a long sized test value for enablement
checks. On LE this works fine, however, on BE this results in inaccurate
assert checks due to a bit being used and assuming it's value is the
same on both LE and BE.
Use int type for 32-bit values and long type for 64-bit values to ensure
appropriate behavior on both LE and BE.
Fixes: 60b1af8de8c1 ("tracing/user_events: Add ABI self-test")
Signed-off-by: Beau Belgrave <beaub(a)linux.microsoft.com>
---
V3 Changes:
Fix missing cast in clone_check().
V2 Changes:
Rebase to linux-kselftest/fixes.
tools/testing/selftests/user_events/abi_test.c | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/tools/testing/selftests/user_events/abi_test.c b/tools/testing/selftests/user_events/abi_test.c
index 8202f1327c39..f5575ef2007c 100644
--- a/tools/testing/selftests/user_events/abi_test.c
+++ b/tools/testing/selftests/user_events/abi_test.c
@@ -47,7 +47,7 @@ static int change_event(bool enable)
return ret;
}
-static int reg_enable(long *enable, int size, int bit)
+static int reg_enable(void *enable, int size, int bit)
{
struct user_reg reg = {0};
int fd = open(data_file, O_RDWR);
@@ -69,7 +69,7 @@ static int reg_enable(long *enable, int size, int bit)
return ret;
}
-static int reg_disable(long *enable, int bit)
+static int reg_disable(void *enable, int bit)
{
struct user_unreg reg = {0};
int fd = open(data_file, O_RDWR);
@@ -90,7 +90,8 @@ static int reg_disable(long *enable, int bit)
}
FIXTURE(user) {
- long check;
+ int check;
+ long check_long;
bool umount;
};
@@ -99,6 +100,7 @@ FIXTURE_SETUP(user) {
change_event(false);
self->check = 0;
+ self->check_long = 0;
}
FIXTURE_TEARDOWN(user) {
@@ -136,9 +138,9 @@ TEST_F(user, bit_sizes) {
#if BITS_PER_LONG == 8
/* Allow 0-64 bits for 64-bit */
- ASSERT_EQ(0, reg_enable(&self->check, sizeof(long), 63));
- ASSERT_NE(0, reg_enable(&self->check, sizeof(long), 64));
- ASSERT_EQ(0, reg_disable(&self->check, 63));
+ ASSERT_EQ(0, reg_enable(&self->check_long, sizeof(long), 63));
+ ASSERT_NE(0, reg_enable(&self->check_long, sizeof(long), 64));
+ ASSERT_EQ(0, reg_disable(&self->check_long, 63));
#endif
/* Disallowed sizes (everything beside 4 and 8) */
@@ -200,7 +202,7 @@ static int clone_check(void *check)
for (i = 0; i < 10; ++i) {
usleep(100000);
- if (*(long *)check)
+ if (*(int *)check)
return 0;
}
base-commit: 6f874fa021dfc7bf37f4f37da3a5aaa41fe9c39c
--
2.34.1
When we dynamically generate a name for a configuration in get-reg-list
we use strcat() to append to a buffer allocated using malloc() but we
never initialise that buffer. Since malloc() offers no guarantees
regarding the contents of the memory it returns this can lead to us
corrupting, and likely overflowing, the buffer:
vregs: PASS
vregs+pmu: PASS
sve: PASS
sve+pmu: PASS
vregs+pauth_address+pauth_generic: PASS
X�vr+gspauth_addre+spauth_generi+pmu: PASS
Initialise the buffer to an empty string to avoid this.
Fixes: 2f9ace5d4557 ("KVM: arm64: selftests: get-reg-list: Introduce vcpu configs")
Reviewed-by: Andrew Jones <ajones(a)ventanamicro.com>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Changes in v2:
- Update Fixes: tag.
- Link to v1: https://lore.kernel.org/r/20231013-kvm-get-reg-list-str-init-v1-1-034f370ff…
---
tools/testing/selftests/kvm/get-reg-list.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/kvm/get-reg-list.c b/tools/testing/selftests/kvm/get-reg-list.c
index be7bf5224434..dd62a6976c0d 100644
--- a/tools/testing/selftests/kvm/get-reg-list.c
+++ b/tools/testing/selftests/kvm/get-reg-list.c
@@ -67,6 +67,7 @@ static const char *config_name(struct vcpu_reg_list *c)
c->name = malloc(len);
+ c->name[0] = '\0';
len = 0;
for_each_sublist(c, s) {
if (!strcmp(s->name, "base"))
---
base-commit: 6465e260f48790807eef06b583b38ca9789b6072
change-id: 20231012-kvm-get-reg-list-str-init-76c8ed4e19d6
Best regards,
--
Mark Brown <broonie(a)kernel.org>
Commit 20d96b25cc4c ("selftests/resctrl: Fix schemata write error
check") exposed a problem in feature detection logic in MBM selftest.
If schemata does not support MB:x=x entries, the schemata write to
initialize 100% memory bandwidth allocation in mbm_setup() will now
fail with -EINVAL due to the error handling corrected by 20d96b25cc4c.
Commit 20d96b25cc4c just uncovers the failed write, it is not wrong
itself.
If MB:x=x is not supported by schemata, it is safe to assume 100%
memory bandwidth is always set. Therefore, the previously ignored error
does not make the MBM test itself wrong.
Restore the previous behavior of MBM test by checking MB support before
attempting to write it into schemata which results in behavior
equivalent to ignoring the write error.
Fixes: 20d96b25cc4c ("selftests/resctrl: Fix schemata write error check")
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
---
tools/testing/selftests/resctrl/mbm_test.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c
index d3c0d30c676a..85987957e7f5 100644
--- a/tools/testing/selftests/resctrl/mbm_test.c
+++ b/tools/testing/selftests/resctrl/mbm_test.c
@@ -95,7 +95,7 @@ static int mbm_setup(struct resctrl_val_param *p)
return END_OF_TESTS;
/* Set up shemata with 100% allocation on the first run. */
- if (p->num_of_runs == 0)
+ if ((p->num_of_runs == 0) && validate_resctrl_feature_request("MB", NULL))
ret = write_schemata(p->ctrlgrp, "100", p->cpu_no,
p->resctrl_val);
--
2.30.2
IOMMU hardwares that support nested translation would have two stages
address translation (normally mentioned as stage-1 and stage-2). The page
table formats of the stage-1 and stage-2 can be different. e.g., VT-d has
different page table formats for stage-1 and stage-2.
Nested parent domain is the iommu domain used to represent the stage-2
translation. In IOMMUFD, both stage-1 and stage-2 translation are tracked
as HWPT (a.k.a. iommu domain). Stage-2 HWPT is parent of stage-1 HWPT as
stage-1 cannot work alone in nested translation. In the cases of stage-1 and
stage-2 page table format are different, the parent HWPT should use exactly
the stage-2 page table format. However, the existing kernel hides the format
selection in iommu drivers, so the domain allocated via IOMMU_HWPT_ALLOC can
use either stage-1 page table format or stage-2 page table format, there is
no guarantees for it.
To enforce the page table format of the nested parent domain, this series
introduces a new iommu op (domain_alloc_user) which can accept user flags
to allocate domain as userspace requires. It also converts IOMMUFD to use
the new domain_alloc_user op for domain allocation if supported, then extends
the IOMMU_HWPT_ALLOC ioctl to pass down a NEST_PARENT flag to allocate a HWPT
which can be used as parent. This series implements the new op in Intel iommu
driver to have a complete picture. It is a preparation for adding nesting
support in IOMMUFD/IOMMU.
Complete code can be found:
https://github.com/yiliu1765/iommufd/tree/iommufd_alloc_user_v2
Change log:
v2:
- Require domain_alloc_user op if IOMMU_HWPT_ALLOC passes non-zero flags (Kevin)
- IOMMUFD core should check kernel known flags while iommu driver needs
to check supported flags as well (Jason)
- Minor tweaks per Baolu's comment
v1: https://lore.kernel.org/linux-iommu/20230919092523.39286-1-yi.l.liu@intel.c…
Regards,
Yi Liu
Yi Liu (6):
iommu: Add new iommu op to create domains owned by userspace
iommufd/hw_pagetable: Use domain_alloc_user op for domain allocation
iommufd/hw_pagetable: Accepts user flags for domain allocation
iommufd/hw_pagetable: Support allocating nested parent domain
iommufd/selftest: Add domain_alloc_user() support in iommu mock
iommu/vt-d: Add domain_alloc_user op
drivers/iommu/intel/iommu.c | 28 +++++++++++++++++
drivers/iommu/iommufd/device.c | 2 +-
drivers/iommu/iommufd/hw_pagetable.c | 31 ++++++++++++++-----
drivers/iommu/iommufd/iommufd_private.h | 3 +-
drivers/iommu/iommufd/selftest.c | 19 ++++++++++++
include/linux/iommu.h | 11 ++++++-
include/uapi/linux/iommufd.h | 12 ++++++-
tools/testing/selftests/iommu/iommufd.c | 24 +++++++++++---
.../selftests/iommu/iommufd_fail_nth.c | 2 +-
tools/testing/selftests/iommu/iommufd_utils.h | 11 +++++--
10 files changed, 124 insertions(+), 19 deletions(-)
--
2.34.1
Hi all:
The core frequency is subjected to the process variation in semiconductors.
Not all cores are able to reach the maximum frequency respecting the
infrastructure limits. Consequently, AMD has redefined the concept of
maximum frequency of a part. This means that a fraction of cores can reach
maximum frequency. To find the best process scheduling policy for a given
scenario, OS needs to know the core ordering informed by the platform through
highest performance capability register of the CPPC interface.
Earlier implementations of amd-pstate preferred core only support a static
core ranking and targeted performance. Now it has the ability to dynamically
change the preferred core based on the workload and platform conditions and
accounting for thermals and aging.
Amd-pstate driver utilizes the functions and data structures provided by
the ITMT architecture to enable the scheduler to favor scheduling on cores
which can be get a higher frequency with lower voltage.
We call it amd-pstate preferred core.
Here sched_set_itmt_core_prio() is called to set priorities and
sched_set_itmt_support() is called to enable ITMT feature.
Amd-pstate driver uses the highest performance value to indicate
the priority of CPU. The higher value has a higher priority.
Amd-pstate driver will provide an initial core ordering at boot time.
It relies on the CPPC interface to communicate the core ranking to the
operating system and scheduler to make sure that OS is choosing the cores
with highest performance firstly for scheduling the process. When amd-pstate
driver receives a message with the highest performance change, it will
update the core ranking.
Changes form V8->V9:
- all:
- - pick up Tested-By flag added by Oleksandr.
- cpufreq: amd-pstate:
- - pick up Review-By flag added by Wyes.
- - ignore modification of bug.
- - add a attribute of prefcore_ranking.
- - modify data type conversion from u32 to int.
- Documentation: amd-pstate:
- - pick up Review-By flag added by Wyes.
Changes form V7->V8:
- all:
- - pick up Review-By flag added by Mario and Ray.
- cpufreq: amd-pstate:
- - use hw_prefcore embeds into cpudata structure.
- - delete preferred core init from cpu online/off.
Changes form V6->V7:
- x86:
- - Modify kconfig about X86_AMD_PSTATE.
- cpufreq: amd-pstate:
- - modify incorrect comments about scheduler_work().
- - convert highest_perf data type.
- - modify preferred core init when cpu init and online.
- acpi: cppc:
- - modify link of CPPC highest performance.
- cpufreq:
- - modify link of CPPC highest performance changed.
Changes form V5->V6:
- cpufreq: amd-pstate:
- - modify the wrong tag order.
- - modify warning about hw_prefcore sysfs attribute.
- - delete duplicate comments.
- - modify the variable name cppc_highest_perf to prefcore_ranking.
- - modify judgment conditions for setting highest_perf.
- - modify sysfs attribute for CPPC highest perf to pr_debug message.
- Documentation: amd-pstate:
- - modify warning: title underline too short.
Changes form V4->V5:
- cpufreq: amd-pstate:
- - modify sysfs attribute for CPPC highest perf.
- - modify warning about comments
- - rebase linux-next
- cpufreq:
- - Moidfy warning about function declarations.
- Documentation: amd-pstate:
- - align with ``amd-pstat``
Changes form V3->V4:
- Documentation: amd-pstate:
- - Modify inappropriate descriptions.
Changes form V2->V3:
- x86:
- - Modify kconfig and description.
- cpufreq: amd-pstate:
- - Add Co-developed-by tag in commit message.
- cpufreq:
- - Modify commit message.
- Documentation: amd-pstate:
- - Modify inappropriate descriptions.
Changes form V1->V2:
- acpi: cppc:
- - Add reference link.
- cpufreq:
- - Moidfy link error.
- cpufreq: amd-pstate:
- - Init the priorities of all online CPUs
- - Use a single variable to represent the status of preferred core.
- Documentation:
- - Default enabled preferred core.
- Documentation: amd-pstate:
- - Modify inappropriate descriptions.
- - Default enabled preferred core.
- - Use a single variable to represent the status of preferred core.
Meng Li (7):
x86: Drop CPU_SUP_INTEL from SCHED_MC_PRIO for the expansion.
acpi: cppc: Add get the highest performance cppc control
cpufreq: amd-pstate: Enable amd-pstate preferred core supporting.
cpufreq: Add a notification message that the highest perf has changed
cpufreq: amd-pstate: Update amd-pstate preferred core ranking
dynamically
Documentation: amd-pstate: introduce amd-pstate preferred core
Documentation: introduce amd-pstate preferrd core mode kernel command
line options
.../admin-guide/kernel-parameters.txt | 5 +
Documentation/admin-guide/pm/amd-pstate.rst | 59 ++++-
arch/x86/Kconfig | 5 +-
drivers/acpi/cppc_acpi.c | 13 ++
drivers/acpi/processor_driver.c | 6 +
drivers/cpufreq/amd-pstate.c | 204 ++++++++++++++++--
drivers/cpufreq/cpufreq.c | 13 ++
include/acpi/cppc_acpi.h | 5 +
include/linux/amd-pstate.h | 10 +
include/linux/cpufreq.h | 5 +
10 files changed, 305 insertions(+), 20 deletions(-)
--
2.34.1
Zero out the buffer for readlink() since readlink() does not append a
terminating null byte to the buffer. Also change the buffer length
passed to readlink() to 'PATH_MAX - 1' to ensure the resulting string
is always null terminated.
Fixes: 833c12ce0f430 ("selftests/x86/lam: Add inherit test cases for linear-address masking")
Signed-off-by: Binbin Wu <binbin.wu(a)linux.intel.com>
---
v1->v2:
- Change the buffer length passed to readlink() to 'PATH_MAX - 1' to ensure the
resulting string is always null terminated. [Kirill]
tools/testing/selftests/x86/lam.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/x86/lam.c b/tools/testing/selftests/x86/lam.c
index eb0e46905bf9..8f9b06d9ce03 100644
--- a/tools/testing/selftests/x86/lam.c
+++ b/tools/testing/selftests/x86/lam.c
@@ -573,7 +573,7 @@ int do_uring(unsigned long lam)
char path[PATH_MAX] = {0};
/* get current process path */
- if (readlink("/proc/self/exe", path, PATH_MAX) <= 0)
+ if (readlink("/proc/self/exe", path, PATH_MAX - 1) <= 0)
return 1;
int file_fd = open(path, O_RDONLY);
@@ -680,14 +680,14 @@ static int handle_execve(struct testcases *test)
perror("Fork failed.");
ret = 1;
} else if (pid == 0) {
- char path[PATH_MAX];
+ char path[PATH_MAX] = {0};
/* Set LAM mode in parent process */
if (set_lam(lam) != 0)
return 1;
/* Get current binary's path and the binary was run by execve */
- if (readlink("/proc/self/exe", path, PATH_MAX) <= 0)
+ if (readlink("/proc/self/exe", path, PATH_MAX - 1) <= 0)
exit(-1);
/* run binary to get LAM mode and return to parent process */
base-commit: 58720809f52779dc0f08e53e54b014209d13eebb
--
2.25.1
This series fixes issues observed with selftests/amd-pstate while
running performance comparison tests with different governors. First
patch changes relative paths with absolute paths and also change it
with correct paths wherever it is broken.
The second patch adds an option to provide perf binary path to
handle the case where distro perf does not work.
Changelog v3->v4:
* Addressed review comments from v3
Swapnil Sapkal (2):
selftests/amd-pstate: Fix broken paths to run workloads in
amd-pstate-ut
selftests/amd-pstate: Added option to provide perf binary path
.../x86/amd_pstate_tracer/amd_pstate_trace.py | 3 +--
.../testing/selftests/amd-pstate/gitsource.sh | 17 +++++++++------
tools/testing/selftests/amd-pstate/run.sh | 21 +++++++++++++------
tools/testing/selftests/amd-pstate/tbench.sh | 4 ++--
4 files changed, 29 insertions(+), 16 deletions(-)
--
2.34.1
TEST_LENGTH passing ".size = sizeof(struct _struct) - 1" expects -EINVAL
from "if (ucmd.user_size < op->min_size)" check in iommufd_fops_ioctl().
This has been working when min_size is exactly the size of the structure.
However, if the size of the structure becomes larger than min_size, i.e.
the passing size above is larger than min_size, that min_size sanity no
longer works.
Since the first test in TEST_LENGTH() was to test that min_size sanity
routine, rework it to support a min_size calculation, rather than using
the full size of the structure.
Signed-off-by: Nicolin Chen <nicolinc(a)nvidia.com>
---
Hi Jason/Kevin,
This was a part of the nesting series. Its link in v4:
https://lore.kernel.org/linux-iommu/20230921075138.124099-13-yi.l.liu@intel…
I just realized that this should go in prior to the nesting series.
One of the nesting patches changes the IOMMU_HWPT_ALLOC structure,
which would break the cmd_length test without this patch.
Thanks!
Nicolin
tools/testing/selftests/iommu/iommufd.c | 29 ++++++++++++++-----------
1 file changed, 16 insertions(+), 13 deletions(-)
diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c
index c5eca2fee42c..6323153d277b 100644
--- a/tools/testing/selftests/iommu/iommufd.c
+++ b/tools/testing/selftests/iommu/iommufd.c
@@ -86,12 +86,13 @@ TEST_F(iommufd, cmd_fail)
TEST_F(iommufd, cmd_length)
{
-#define TEST_LENGTH(_struct, _ioctl) \
+#define TEST_LENGTH(_struct, _ioctl, _last) \
{ \
+ size_t min_size = offsetofend(struct _struct, _last); \
struct { \
struct _struct cmd; \
uint8_t extra; \
- } cmd = { .cmd = { .size = sizeof(struct _struct) - 1 }, \
+ } cmd = { .cmd = { .size = min_size - 1 }, \
.extra = UINT8_MAX }; \
int old_errno; \
int rc; \
@@ -112,17 +113,19 @@ TEST_F(iommufd, cmd_length)
} \
}
- TEST_LENGTH(iommu_destroy, IOMMU_DESTROY);
- TEST_LENGTH(iommu_hw_info, IOMMU_GET_HW_INFO);
- TEST_LENGTH(iommu_hwpt_alloc, IOMMU_HWPT_ALLOC);
- TEST_LENGTH(iommu_ioas_alloc, IOMMU_IOAS_ALLOC);
- TEST_LENGTH(iommu_ioas_iova_ranges, IOMMU_IOAS_IOVA_RANGES);
- TEST_LENGTH(iommu_ioas_allow_iovas, IOMMU_IOAS_ALLOW_IOVAS);
- TEST_LENGTH(iommu_ioas_map, IOMMU_IOAS_MAP);
- TEST_LENGTH(iommu_ioas_copy, IOMMU_IOAS_COPY);
- TEST_LENGTH(iommu_ioas_unmap, IOMMU_IOAS_UNMAP);
- TEST_LENGTH(iommu_option, IOMMU_OPTION);
- TEST_LENGTH(iommu_vfio_ioas, IOMMU_VFIO_IOAS);
+ TEST_LENGTH(iommu_destroy, IOMMU_DESTROY, id);
+ TEST_LENGTH(iommu_hw_info, IOMMU_GET_HW_INFO, __reserved);
+ TEST_LENGTH(iommu_hwpt_alloc, IOMMU_HWPT_ALLOC, __reserved);
+ TEST_LENGTH(iommu_ioas_alloc, IOMMU_IOAS_ALLOC, out_ioas_id);
+ TEST_LENGTH(iommu_ioas_iova_ranges, IOMMU_IOAS_IOVA_RANGES,
+ out_iova_alignment);
+ TEST_LENGTH(iommu_ioas_allow_iovas, IOMMU_IOAS_ALLOW_IOVAS,
+ allowed_iovas);
+ TEST_LENGTH(iommu_ioas_map, IOMMU_IOAS_MAP, iova);
+ TEST_LENGTH(iommu_ioas_copy, IOMMU_IOAS_COPY, src_iova);
+ TEST_LENGTH(iommu_ioas_unmap, IOMMU_IOAS_UNMAP, length);
+ TEST_LENGTH(iommu_option, IOMMU_OPTION, val64);
+ TEST_LENGTH(iommu_vfio_ioas, IOMMU_VFIO_IOAS, __reserved);
#undef TEST_LENGTH
}
--
2.42.0
This is to add Intel VT-d nested translation based on IOMMUFD nesting
infrastructure. As the iommufd nesting infrastructure series[1], iommu
core supports new ops to report iommu hardware information, allocate
domains with user data and invalidate stage-1 IOTLB when there is mapping
changed in stage-1 page table. The data required in the three paths are
vendor-specific, so
1) IOMMU_HWPT_TYPE_VTD_S1 is defined for the Intel VT-d stage-1 page
table, it will be used in the stage-1 domain allocation and IOTLB
syncing path. struct iommu_hwpt_vtd_s1 is defined to pass user_data
for the Intel VT-d stage-1 domain allocation.
struct iommu_hwpt_vtd_s1_invalidate is defined to pass the data for
the Intel VT-d stage-1 IOTLB invalidation.
2) IOMMU_HW_INFO_TYPE_INTEL_VTD and struct iommu_hw_info_vtd are defined
to report iommu hardware information for Intel VT-d.
With above IOMMUFD extensions, the intel iommu driver implements the three
paths to support nested translation.
The first Intel platform supporting nested translation is Sapphire
Rapids which, unfortunately, has a hardware errata [2] requiring special
treatment. This errata happens when a stage-1 page table page (either
level) is located in a stage-2 read-only region. In that case the IOMMU
hardware may ignore the stage-2 RO permission and still set the A/D bit
in stage-1 page table entries during page table walking.
A flag IOMMU_HW_INFO_VTD_ERRATA_772415_SPR17 is introduced to report
this errata to userspace. With that restriction the user should either
disable nested translation to favor RO stage-2 mappings or ensure no
RO stage-2 mapping to enable nested translation.
Intel-iommu driver is armed with necessary checks to prevent such mix
in patch12 of this series.
Qemu currently does add RO mappings though. The vfio agent in Qemu
simply maps all valid regions in the GPA address space which certainly
includes RO regions e.g. vbios.
In reality we don't know a usage relying on DMA reads from the BIOS
region. Hence finding a way to skip RO regions (e.g. via a discard manager)
in Qemu might be an acceptable tradeoff. The actual change needs more
discussion in Qemu community. For now we just hacked Qemu to test.
Complete code can be found in [3], corresponding QEMU could can be found
in [4].
[1] https://lore.kernel.org/linux-iommu/20230921075138.124099-1-yi.l.liu@intel.…
[2] https://www.intel.com/content/www/us/en/content-details/772415/content-deta…
[3] https://github.com/yiliu1765/iommufd/tree/iommufd_nesting
[4] https://github.com/yiliu1765/qemu/tree/zhenzhong/wip/iommufd_nesting_rfcv1
Change log:
v5:
- Add Kevin's r-b for patch 2, 3 ,5 8, 10
- Drop enforce_cache_coherency callback from the nested type domain ops (Kevin)
- Remove duplicate agaw check in patch 04 (Kevin)
- Remove duplicate domain_update_iommu_cap() in patch 06 (Kevin)
- Check parent's force_snooping to set pgsnp in the pasid entry (Kevin)
- uapi data structure check (Kevin)
- Simplify the errata handling as user can allocate nested parent domain
v4: https://lore.kernel.org/linux-iommu/20230724111335.107427-1-yi.l.liu@intel.…
- Remove ascii art tables (Jason)
- Drop EMT (Tina, Jason)
- Drop MTS and related definitions (Kevin)
- Rename macro IOMMU_VTD_PGTBL_ to IOMMU_VTD_S1_ (Kevin)
- Rename struct iommu_hwpt_intel_vtd_ to iommu_hwpt_vtd_ (Kevin)
- Rename struct iommu_hwpt_intel_vtd to iommu_hwpt_vtd_s1 (Kevin)
- Put the vendor specific hwpt alloc data structure before enuma iommu_hwpt_type (Kevin)
- Do not trim the higher page levels of S2 domain in nested domain attachment as the
S2 domain may have been used independently. (Kevin)
- Remove the first-stage pgd check against the maximum address of s2_domain as hw
can check it anyhow. It makes sense to check every pfns used in the stage-1 page
table. But it cannot make it. So just leave it to hw. (Kevin)
- Split the iotlb flush part into an order of uapi, helper and callback implementation (Kevin)
- Change the policy of VT-d nesting errata, disallow RO mapping once a domain is used
as parent domain of a nested domain. This removes the nested_users counting. (Kevin)
- Minor fix for "make htmldocs"
v3: https://lore.kernel.org/linux-iommu/20230511145110.27707-1-yi.l.liu@intel.c…
- Further split the patches into an order of adding helpers for nested
domain, iotlb flush, nested domain attachment and nested domain allocation
callback, then report the hw_info to userspace.
- Add batch support in cache invalidation from userspace
- Disallow nested translation usage if RO mappings exists in stage-2 domain
due to errata on readonly mappings on Sapphire Rapids platform.
v2: https://lore.kernel.org/linux-iommu/20230309082207.612346-1-yi.l.liu@intel.…
- The iommufd infrastructure is split to be separate series.
v1: https://lore.kernel.org/linux-iommu/20230209043153.14964-1-yi.l.liu@intel.c…
Regards,
Yi Liu
Lu Baolu (5):
iommu/vt-d: Extend dmar_domain to support nested domain
iommu/vt-d: Add helper for nested domain allocation
iommu/vt-d: Add helper to setup pasid nested translation
iommu/vt-d: Add nested domain allocation
iommu/vt-d: Disallow read-only mappings to nest parent domain
Yi Liu (6):
iommufd: Add data structure for Intel VT-d stage-1 domain allocation
iommu/vt-d: Make domain attach helpers to be extern
iommu/vt-d: Set the nested domain to a device
iommufd: Add data structure for Intel VT-d stage-1 cache invalidation
iommu/vt-d: Make iotlb flush helpers to be extern
iommu/vt-d: Add iotlb flush for nested domain
drivers/iommu/intel/Makefile | 2 +-
drivers/iommu/intel/iommu.c | 60 +++++++++----
drivers/iommu/intel/iommu.h | 51 +++++++++--
drivers/iommu/intel/nested.c | 162 +++++++++++++++++++++++++++++++++++
drivers/iommu/intel/pasid.c | 125 +++++++++++++++++++++++++++
drivers/iommu/intel/pasid.h | 2 +
include/uapi/linux/iommufd.h | 76 +++++++++++++++-
7 files changed, 452 insertions(+), 26 deletions(-)
create mode 100644 drivers/iommu/intel/nested.c
--
2.34.1
Dzień dobry,
Pozwoliłem sobie na kontakt, ponieważ jestem zainteresowany weryfikacją możliwości nawiązania współpracy.
Wspieramy firmy w pozyskiwaniu nowych klientów biznesowych.
Czy możemy porozmawiać w celu przedstawienia szczegółowych informacji?
Pozdrawiam
Andrzej Polański
kselftest.h declares many variadic functions that can print some
formatted message while also executing selftest logic. These
declarations don't have any compiler mechanism to verify if passed
arguments are valid in comparison with format specifiers used in
printf() calls.
Attribute addition can make debugging easier, the code more consistent
and prevent mismatched or missing variables.
The first patch adds __printf() macro and applies it to all functions
in kselftest.h that use printf format specifiers. After compiling all
selftests using:
make -C tools/testing/selftests
many instances of format specifier mismatching are exposed in the form
of -Wformat warnings.
Fix the mismatched format specifiers caught by __printf() attribute in
multiple tests.
Series is based on kselftests next branch.
Changelog v6:
- Add methodology notes to all patches.
- No functional changes in the patches.
Changelog v5:
- Mention in the cover letter what methodology was used to find the
mismatched format specifiers.
- No functional changes in the patches.
Changelog v4:
- Fix patch 1/8 subject typo.
- Add Reinette's reviewed-by tags.
- Rebase onto new kselftest/next patches.
Changelog v3:
- Changed git signature from Wieczor-Retman Maciej to Maciej
Wieczor-Retman.
- Added one review tag.
- Rebased onto updated kselftests next branch.
Changelog v2:
- Add review and fixes tags to patches.
- Add two patches with mismatch fixes.
- Fix missed attribute in selftests/kvm. (Andrew)
- Fix previously missed issues in selftests/mm (Ilpo)
[v5] https://lore.kernel.org/all/cover.1697012398.git.maciej.wieczor-retman@inte…
[v4] https://lore.kernel.org/all/cover.1696846568.git.maciej.wieczor-retman@inte…
[v3] https://lore.kernel.org/all/cover.1695373131.git.maciej.wieczor-retman@inte…
[v2] https://lore.kernel.org/all/cover.1693829810.git.maciej.wieczor-retman@inte…
[v1] https://lore.kernel.org/all/cover.1693216959.git.maciej.wieczor-retman@inte…
Maciej Wieczor-Retman (8):
selftests: Add printf attribute to kselftest prints
selftests/cachestat: Fix print_cachestat format
selftests/openat2: Fix wrong format specifier
selftests/pidfd: Fix ksft print formats
selftests/sigaltstack: Fix wrong format specifier
selftests/kvm: Replace attribute with macro
selftests/mm: Substitute attribute with a macro
selftests/resctrl: Fix wrong format specifier
.../selftests/cachestat/test_cachestat.c | 2 +-
tools/testing/selftests/kselftest.h | 18 ++++++++++--------
.../testing/selftests/kvm/include/test_util.h | 8 ++++----
tools/testing/selftests/mm/mremap_test.c | 2 +-
tools/testing/selftests/mm/pkey-helpers.h | 2 +-
tools/testing/selftests/openat2/openat2_test.c | 2 +-
.../selftests/pidfd/pidfd_fdinfo_test.c | 2 +-
tools/testing/selftests/pidfd/pidfd_test.c | 12 ++++++------
tools/testing/selftests/resctrl/cache.c | 2 +-
tools/testing/selftests/sigaltstack/sas.c | 2 +-
10 files changed, 27 insertions(+), 25 deletions(-)
base-commit: 2531f374f922e77ba51f24d1aa6fa11c7f4c36b8
--
2.42.0
A number of corner cases were caught when trying to run the selftests on
older systems. Missed skip conditions, some error cases, and outdated
python setups would all report failures but the issue would actually be
related to some other condition rather than the selftest suite.
Address these individual cases.
Aaron Conole (4):
selftests: openvswitch: Add version check for pyroute2
selftests: openvswitch: Catch cases where the tests are killed
selftests: openvswitch: Skip drop testing on older kernels
selftests: openvswitch: Fix the ct_tuple for v4
.../selftests/net/openvswitch/openvswitch.sh | 21 +++++++-
.../selftests/net/openvswitch/ovs-dpctl.py | 48 ++++++++++++++++++-
2 files changed, 66 insertions(+), 3 deletions(-)
--
2.41.0
In order to use run_kselftest.sh the list of tests must be emitted to
populate kselftest-list.txt.
The powerpc Makefile is written to use EMIT_TESTS. But support for
EMIT_TESTS was dropped in commit d4e59a536f50 ("selftests: Use runner.sh
for emit targets"). Although prior to that commit a548de0fe8e1
("selftests: lib.mk: add test execute bit check to EMIT_TESTS") had
already broken run_kselftest.sh for powerpc due to the executable check
using the wrong path.
It can be fixed by replacing the EMIT_TESTS definitions with actual
emit_tests rules in the powerpc Makefiles. This makes run_kselftest.sh
able to run powerpc tests:
$ cd linux
$ export ARCH=powerpc
$ export CROSS_COMPILE=powerpc64le-linux-gnu-
$ make headers
$ make -j -C tools/testing/selftests install
$ grep -c "^powerpc" tools/testing/selftests/kselftest_install/kselftest-list.txt
182
Fixes: d4e59a536f50 ("selftests: Use runner.sh for emit targets")
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
---
tools/testing/selftests/powerpc/Makefile | 7 +++----
tools/testing/selftests/powerpc/pmu/Makefile | 11 ++++++-----
2 files changed, 9 insertions(+), 9 deletions(-)
I'll plan to merge this via the powerpc tree.
cheers
diff --git a/tools/testing/selftests/powerpc/Makefile b/tools/testing/selftests/powerpc/Makefile
index 49f2ad1793fd..7ea42fa02eab 100644
--- a/tools/testing/selftests/powerpc/Makefile
+++ b/tools/testing/selftests/powerpc/Makefile
@@ -59,12 +59,11 @@ override define INSTALL_RULE
done;
endef
-override define EMIT_TESTS
+emit_tests:
+@for TARGET in $(SUB_DIRS); do \
BUILD_TARGET=$(OUTPUT)/$$TARGET; \
- $(MAKE) OUTPUT=$$BUILD_TARGET -s -C $$TARGET emit_tests;\
+ $(MAKE) OUTPUT=$$BUILD_TARGET -s -C $$TARGET $@;\
done;
-endef
override define CLEAN
+@for TARGET in $(SUB_DIRS); do \
@@ -77,4 +76,4 @@ endef
tags:
find . -name '*.c' -o -name '*.h' | xargs ctags
-.PHONY: tags $(SUB_DIRS)
+.PHONY: tags $(SUB_DIRS) emit_tests
diff --git a/tools/testing/selftests/powerpc/pmu/Makefile b/tools/testing/selftests/powerpc/pmu/Makefile
index 2b95e44d20ff..a284fa874a9f 100644
--- a/tools/testing/selftests/powerpc/pmu/Makefile
+++ b/tools/testing/selftests/powerpc/pmu/Makefile
@@ -30,13 +30,14 @@ override define RUN_TESTS
+TARGET=event_code_tests; BUILD_TARGET=$$OUTPUT/$$TARGET; $(MAKE) OUTPUT=$$BUILD_TARGET -C $$TARGET run_tests
endef
-DEFAULT_EMIT_TESTS := $(EMIT_TESTS)
-override define EMIT_TESTS
- $(DEFAULT_EMIT_TESTS)
+emit_tests:
+ for TEST in $(TEST_GEN_PROGS); do \
+ BASENAME_TEST=`basename $$TEST`; \
+ echo "$(COLLECTION):$$BASENAME_TEST"; \
+ done
+TARGET=ebb; BUILD_TARGET=$$OUTPUT/$$TARGET; $(MAKE) OUTPUT=$$BUILD_TARGET -s -C $$TARGET emit_tests
+TARGET=sampling_tests; BUILD_TARGET=$$OUTPUT/$$TARGET; $(MAKE) OUTPUT=$$BUILD_TARGET -s -C $$TARGET emit_tests
+TARGET=event_code_tests; BUILD_TARGET=$$OUTPUT/$$TARGET; $(MAKE) OUTPUT=$$BUILD_TARGET -s -C $$TARGET emit_tests
-endef
DEFAULT_INSTALL_RULE := $(INSTALL_RULE)
override define INSTALL_RULE
@@ -64,4 +65,4 @@ sampling_tests:
event_code_tests:
TARGET=$@; BUILD_TARGET=$$OUTPUT/$$TARGET; mkdir -p $$BUILD_TARGET; $(MAKE) OUTPUT=$$BUILD_TARGET -k -C $$TARGET all
-.PHONY: all run_tests ebb sampling_tests event_code_tests
+.PHONY: all run_tests ebb sampling_tests event_code_tests emit_tests
--
2.41.0