This series attempts to reduce the parsing overhead of IPv6 extension
headers in GRO and GSO, by removing extension header specific code and
enabling the frag0 fast path.
The following changes were made:
- Removed some unnecessary HBH conditionals by adding HBH offload
to inet6_offloads
- Added a utility function to support frag0 fast path in ipv6_gro_receive
- Added selftests for IPv6 packets with extension headers in GRO
Richard
v2 -> v3:
- Removed previously added IPv6 extension header length constant and
using sizeof(*opth) instead.
- Removed unnecessary conditional in gro selftest framework
- v2:
https://lore.kernel.org/netdev/127b8199-1cd4-42d7-9b2b-875abaad93fe@gmail.c…
v1 -> v2:
- Added a minimum IPv6 extension header length constant to make code self
documenting.
- Added new selftest which checks that packets with different extension
header payloads do not coalesce.
- Added more info in the second commit message regarding the code changes.
- v1:
https://lore.kernel.org/netdev/f4eff69d-3917-4c42-8c6b-d09597ac4437@gmail.c…
Richard Gobert (3):
net: gso: add HBH extension header offload support
net: gro: parse ipv6 ext headers without frag0 invalidation
selftests/net: fix GRO coalesce test and add ext header coalesce tests
net/ipv6/exthdrs_offload.c | 11 ++++
net/ipv6/ip6_offload.c | 76 +++++++++++++++++--------
tools/testing/selftests/net/gro.c | 93 +++++++++++++++++++++++++++++--
3 files changed, 150 insertions(+), 30 deletions(-)
--
2.36.1
Hi,
for this v3 I changed the approach for identifying devices in a stable
way from the match fields back to the hardware topology (used in v1).
The match fields were proposed as a way to avoid the possible issue of
PCI topology being reconfigured, but that wasn't observed on any real
system so far. However using match fields does allow for a real issue if
an external device similar to an internal one is connected to the
system, which results in a change of the match count and therefore a
test failure. So using the HW topology was chosen as the most reliable
approach.
The per-platform device description file now uses YAML following a
suggestion from Chris Obbard, and the test script was re-written in
python to handle the new YAML format.
A second sample board file is also now included for an x86 platform,
which contains an USB controller behind a PCI controller, which wasn't
possible to describe in v1.
Thanks,
Nícolas
v2: https://lore.kernel.org/all/20231127233558.868365-1-nfraprado@collabora.com
v1: https://lore.kernel.org/all/20231024211818.365844-1-nfraprado@collabora.com
Original cover letter:
This is part of an effort to improve detection of regressions impacting
device probe on all platforms. The recently merged DT kselftest [3]
detects probe issues for all devices described statically in the DT.
That leaves out devices discovered at run-time from discoverable busses.
This is where this test comes in. All of the devices that are connected
through discoverable busses (ie USB and PCI), and which are internal and
therefore always present, can be described in a per-platform file so
they can be checked for. The test will check that the device has been
instantiated and bound to a driver.
Patch 1 introduces the test. Patch 2 and 3 add the device definitions
for the google,spherion machine (Acer Chromebook 514) and XPS 13 as
examples.
This is the output from the test running on Spherion:
TAP version 13
Using board file: boards/google,spherion.yaml
1..8
ok 1 /usb2-controller(a)11200000/1.4.1/camera.device
ok 2 /usb2-controller(a)11200000/1.4.1/camera.0.driver
ok 3 /usb2-controller(a)11200000/1.4.1/camera.1.driver
ok 4 /usb2-controller(a)11200000/1.4.2/bluetooth.device
ok 5 /usb2-controller(a)11200000/1.4.2/bluetooth.0.driver
ok 6 /usb2-controller(a)11200000/1.4.2/bluetooth.1.driver
ok 7 /pci-controller(a)11230000/0.0/0.0/wifi.device
ok 8 /pci-controller(a)11230000/0.0/0.0/wifi.driver
Totals: pass:8 fail:0 xfail:0 xpass:0 skip:0 error:0
[3] https://lore.kernel.org/all/20230828211424.2964562-1-nfraprado@collabora.co…
Changes in v3:
- Reverted approach of encoding stable device reference in test file
from device match fields (from modalias) back to HW topology (from v1)
- Changed board file description to YAML
- Rewrote test script in python to handle YAML and support x86 platforms
Changes in v2:
- Changed approach of encoding stable device reference in test file from
HW topology to device match fields (the ones from modalias)
- Better documented test format
Nícolas F. R. A. Prado (3):
kselftest: Add test to verify probe of devices from discoverable
busses
kselftest: devices: Add sample board file for google,spherion
kselftest: devices: Add sample board file for XPS 13 9300
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/devices/Makefile | 4 +
.../devices/boards/Dell Inc.,XPS 13 9300.yaml | 40 +++
.../devices/boards/google,spherion.yaml | 50 +++
tools/testing/selftests/devices/ksft.py | 90 +++++
.../devices/test_discoverable_devices.py | 318 ++++++++++++++++++
6 files changed, 503 insertions(+)
create mode 100644 tools/testing/selftests/devices/Makefile
create mode 100644 tools/testing/selftests/devices/boards/Dell Inc.,XPS 13 9300.yaml
create mode 100644 tools/testing/selftests/devices/boards/google,spherion.yaml
create mode 100644 tools/testing/selftests/devices/ksft.py
create mode 100755 tools/testing/selftests/devices/test_discoverable_devices.py
--
2.43.0
Add NULL checks to KUNIT_BINARY_STR_ASSERTION() so that it will fail
cleanly if either pointer is NULL, instead of causing a NULL pointer
dereference in the strcmp().
A test failure could be that a string is unexpectedly NULL. This could
be trapped by KUNIT_ASSERT_NOT_NULL() but that would terminate the test
at that point. It's preferable that the KUNIT_EXPECT_STR*() macros can
handle NULL pointers as a failure.
Signed-off-by: Richard Fitzgerald <rf(a)opensource.cirrus.com>
---
include/kunit/test.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/kunit/test.h b/include/kunit/test.h
index b163b9984b33..c2ce379c329b 100644
--- a/include/kunit/test.h
+++ b/include/kunit/test.h
@@ -758,7 +758,7 @@ do { \
.right_text = #right, \
}; \
\
- if (likely(strcmp(__left, __right) op 0)) \
+ if (likely((__left) && (__right) && (strcmp(__left, __right) op 0))) \
break; \
\
\
--
2.30.2
The RISC-V arch_timer selftests is used to validate Sstc timer
functionality in a guest, which sets up periodic timer interrupts
and check the basic interrupt status upon its receipt.
This KVM selftests was ported from aarch64 arch_timer and tested
with Linux v6.7-rc4 on a Qemu riscv64 virt machine.
---
Changed since v4:
* Rebased to Linux 6.7-rc4
* Included Paolo's patch(01/11) to fix issues with SPLIT_TESTS
* Droped the patch(KVM: selftests: Unify the makefile rule for split targets)
since Paolo's patch had included the fix
* Added new patch(05/11) to include header file vdso/processor.h from linux
source tree to leverage the cpu_relax() definition - Conor/Andrew
* Added new patch(11/11) to enable user configuration of timer error margin
parameter which alleviate the intermitent failure in stress test - Andrew
* Other minor fixes per Andrew's comments
Haibo Xu (10):
KVM: arm64: selftests: Split arch_timer test code
KVM: selftests: Add CONFIG_64BIT definition for the build
tools: riscv: Add header file csr.h
tools: riscv: Add header file vdso/processor.h
KVM: riscv: selftests: Switch to use macro from csr.h
KVM: riscv: selftests: Add exception handling support
KVM: riscv: selftests: Add guest helper to get vcpu id
KVM: riscv: selftests: Change vcpu_has_ext to a common function
KVM: riscv: selftests: Add sstc timer test
KVM: selftests: Enable tunning of err_margin_us in arch timer test
Paolo Bonzini (1):
selftests/kvm: Fix issues with $(SPLIT_TESTS)
tools/arch/riscv/include/asm/csr.h | 521 ++++++++++++++++++
tools/arch/riscv/include/asm/vdso/processor.h | 32 ++
tools/testing/selftests/kvm/Makefile | 27 +-
.../selftests/kvm/aarch64/arch_timer.c | 295 +---------
tools/testing/selftests/kvm/arch_timer.c | 259 +++++++++
.../selftests/kvm/include/aarch64/processor.h | 4 -
.../selftests/kvm/include/kvm_util_base.h | 9 +
.../selftests/kvm/include/riscv/arch_timer.h | 71 +++
.../selftests/kvm/include/riscv/processor.h | 65 ++-
.../testing/selftests/kvm/include/test_util.h | 2 +
.../selftests/kvm/include/timer_test.h | 45 ++
.../selftests/kvm/lib/riscv/handlers.S | 101 ++++
.../selftests/kvm/lib/riscv/processor.c | 87 +++
.../testing/selftests/kvm/riscv/arch_timer.c | 111 ++++
.../selftests/kvm/riscv/get-reg-list.c | 11 +-
15 files changed, 1333 insertions(+), 307 deletions(-)
create mode 100644 tools/arch/riscv/include/asm/csr.h
create mode 100644 tools/arch/riscv/include/asm/vdso/processor.h
create mode 100644 tools/testing/selftests/kvm/arch_timer.c
create mode 100644 tools/testing/selftests/kvm/include/riscv/arch_timer.h
create mode 100644 tools/testing/selftests/kvm/include/timer_test.h
create mode 100644 tools/testing/selftests/kvm/lib/riscv/handlers.S
create mode 100644 tools/testing/selftests/kvm/riscv/arch_timer.c
--
2.34.1
The patch set [1] added a general lib.sh in net selftests, and converted
several test scripts to source the lib.sh.
unicast_extensions.sh (converted in [1]) and pmtu.sh (converted in [2])
have a /bin/sh shebang which may point to various shells in different
distributions, but "source" is only available in some of them. For
example, "source" is a built-it function in bash, but it cannot be
used in dash.
Refer to other scripts that were converted together, simply change the
shebang to bash to fix the following issues when the default /bin/sh
points to other shells.
# selftests: net: unicast_extensions.sh
# ./unicast_extensions.sh: 31: source: not found
# ###########################################################################
# Unicast address extensions tests (behavior of reserved IPv4 addresses)
# ###########################################################################
# TEST: assign and ping within 240/4 (1 of 2) (is allowed) [FAIL]
# TEST: assign and ping within 240/4 (2 of 2) (is allowed) [FAIL]
# TEST: assign and ping within 0/8 (1 of 2) (is allowed) [FAIL]
# TEST: assign and ping within 0/8 (2 of 2) (is allowed) [FAIL]
# TEST: assign and ping inside 255.255/16 (is allowed) [FAIL]
# TEST: assign and ping inside 255.255.255/24 (is allowed) [FAIL]
# TEST: route between 240.5.6/24 and 255.1.2/24 (is allowed) [FAIL]
# TEST: route between 0.200/16 and 245.99/16 (is allowed) [FAIL]
# TEST: assign and ping lowest address (/24) [FAIL]
# TEST: assign and ping lowest address (/26) [FAIL]
# TEST: routing using lowest address [FAIL]
# TEST: assigning 0.0.0.0 (is forbidden) [ OK ]
# TEST: assigning 255.255.255.255 (is forbidden) [ OK ]
# TEST: assign and ping inside 127/8 (is forbidden) [ OK ]
# TEST: assign and ping class D address (is forbidden) [ OK ]
# TEST: routing using class D (is forbidden) [ OK ]
# TEST: routing using 127/8 (is forbidden) [ OK ]
not ok 51 selftests: net: unicast_extensions.sh # exit=1
v1 -> v2:
- Fix pmtu.sh which has the same issue as unicast_extensions.sh,
suggested by Hangbin
- Change the style of the "source" line to be consistent with other
tests, suggested by Hangbin
Link: https://lore.kernel.org/all/20231202020110.362433-1-liuhangbin@gmail.com/ [1]
Link: https://lore.kernel.org/all/20231219094856.1740079-1-liuhangbin@gmail.com/ [2]
Reported-by: kernel test robot <oliver.sang(a)intel.com>
Signed-off-by: Yujie Liu <yujie.liu(a)intel.com>
---
tools/testing/selftests/net/pmtu.sh | 4 ++--
tools/testing/selftests/net/unicast_extensions.sh | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/net/pmtu.sh b/tools/testing/selftests/net/pmtu.sh
index 175d3d1d773b..f10879788f61 100755
--- a/tools/testing/selftests/net/pmtu.sh
+++ b/tools/testing/selftests/net/pmtu.sh
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/bin/bash
# SPDX-License-Identifier: GPL-2.0
#
# Check that route PMTU values match expectations, and that initial device MTU
@@ -198,7 +198,7 @@
# - pmtu_ipv6_route_change
# Same as above but with IPv6
-source ./lib.sh
+source lib.sh
PAUSE_ON_FAIL=no
VERBOSE=0
diff --git a/tools/testing/selftests/net/unicast_extensions.sh b/tools/testing/selftests/net/unicast_extensions.sh
index b7a2cb9e7477..f52aa5f7da52 100755
--- a/tools/testing/selftests/net/unicast_extensions.sh
+++ b/tools/testing/selftests/net/unicast_extensions.sh
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/bin/bash
# SPDX-License-Identifier: GPL-2.0
#
# By Seth Schoen (c) 2021, for the IPv4 Unicast Extensions Project
@@ -28,7 +28,7 @@
# These tests provide an easy way to flip the expected result of any
# of these behaviors for testing kernel patches that change them.
-source ./lib.sh
+source lib.sh
# nettest can be run from PATH or from same directory as this selftest
if ! which nettest >/dev/null; then
base-commit: cd4d7263d58ab98fd4dee876776e4da6c328faa3
--
2.34.1
This series attempts to reduce the parsing overhead of IPv6 extension
headers in GRO and GSO, by removing extension header specific code and
enabling the frag0 fast path.
The following changes were made:
- Removed some unnecessary HBH conditionals by adding HBH offload
to inet6_offloads
- Added a utility function to support frag0 fast path in ipv6_gro_receive
- Added selftests for IPv6 packets with extension headers in GRO
Richard
v1 -> v2:
- Added a minimum IPv6 extension header length constant to make code self
documenting.
- Added new selftest which checks that packets with different extension
header payloads do not coalesce.
- Added more info in the second commit message regarding the code changes.
- v1:
https://lore.kernel.org/netdev/f4eff69d-3917-4c42-8c6b-d09597ac4437@gmail.c…
Richard Gobert (3):
net: gso: add HBH extension header offload support
net: gro: parse ipv6 ext headers without frag0 invalidation
selftests/net: fix GRO coalesce test and add ext header coalesce tests
include/net/ipv6.h | 1 +
net/ipv6/exthdrs_offload.c | 11 ++++
net/ipv6/ip6_offload.c | 76 +++++++++++++++++--------
tools/testing/selftests/net/gro.c | 94 +++++++++++++++++++++++++++++--
4 files changed, 152 insertions(+), 30 deletions(-)
--
2.36.1
Nested translation is a hardware feature that is supported by many modern
IOMMU hardwares. It has two stages (stage-1, stage-2) address translation
to get access to the physical address. stage-1 translation table is owned
by userspace (e.g. by a guest OS), while stage-2 is owned by kernel. Changes
to stage-1 translation table should be followed by an IOTLB invalidation.
Take Intel VT-d as an example, the stage-1 translation table is I/O page
table. As the below diagram shows, guest I/O page table pointer in GPA
(guest physical address) is passed to host and be used to perform the stage-1
address translation. Along with it, modifications to present mappings in the
guest I/O page table should be followed with an IOTLB invalidation.
.-------------. .---------------------------.
| vIOMMU | | Guest I/O page table |
| | '---------------------------'
.----------------/
| PASID Entry |--- PASID cache flush --+
'-------------' |
| | V
| | I/O page table pointer in GPA
'-------------'
Guest
------| Shadow |---------------------------|--------
v v v
Host
.-------------. .------------------------.
| pIOMMU | | FS for GIOVA->GPA |
| | '------------------------'
.----------------/ |
| PASID Entry | V (Nested xlate)
'----------------\.----------------------------------.
| | | SS for GPA->HPA, unmanaged domain|
| | '----------------------------------'
'-------------'
Where:
- FS = First stage page tables
- SS = Second stage page tables
<Intel VT-d Nested translation>
This series is based on the first part which was merged [1], this series is to
add the cache invalidation interface or the userspace to invalidate cache after
modifying the stage-1 page table. This includes both the iommufd changes and the
VT-d driver changes.
Complete code can be found in [2], QEMU could can be found in [3].
At last, this is a team work together with Nicolin Chen, Lu Baolu. Thanks
them for the help. ^_^. Look forward to your feedbacks.
[1] https://lore.kernel.org/linux-iommu/20231026044216.64964-1-yi.l.liu@intel.c… - merged
[2] https://github.com/yiliu1765/iommufd/tree/iommufd_nesting
[3] https://github.com/yiliu1765/qemu/tree/zhenzhong/wip/iommufd_nesting_rfcv1
Change log:
v10:
- Minor tweak to patch 07 (Kevin)
- Rebase on top of 6.7-rc8
v9: https://lore.kernel.org/linux-iommu/20231228150629.13149-1-yi.l.liu@intel.c…
- Add a test case which sets both IOMMU_TEST_INVALIDATE_FLAG_ALL and
IOMMU_TEST_INVALIDATE_FLAG_TRIGGER_ERROR in flags, and expect to succeed
and see an 'error'. (Kevin)
- Returns -ETIMEOUT in qi_check_fault() if caller is interested with the
fault when timeout happens. If not, the qi_submit_sync() will keep retry
hence unable to report the error back to user. For now, only the user cache
invalidation path has interest on the time out error. So this change only
affects the user cache invalidation path. Other path will still hang in
qi_submit_sync() when timeout happens. (Kevin)
v8: https://lore.kernel.org/linux-iommu/20231227161354.67701-1-yi.l.liu@intel.c…
- Pass invalidation hint to the cache invalidation helper in the cache_invalidate_user
op path (Kevin)
- Move the devTLB invalidation out of info->iommu loop (Kevin, Weijiang)
- Clear *fault per restart in qi_submit_sync() to avoid acroos submission error
accumulation. (Kevin)
- Define the vtd cache invalidation uapi structure in separate patch (Kevin)
- Rename inv_error to be hw_error (Kevin)
- Rename 'reqs_uptr', 'req_type', 'req_len' and 'req_num' to be 'data_uptr',
'data_type', "entry_len' and 'entry_num" (Kevin)
- Allow user to set IOMMU_TEST_INVALIDATE_FLAG_ALL and IOMMU_TEST_INVALIDATE_FLAG_TRIGGER_ERROR
in the same time (Kevin)
v7: https://lore.kernel.org/linux-iommu/20231221153948.119007-1-yi.l.liu@intel.…
- Remove domain->ops->cache_invalidate_user check in hwpt alloc path due
to failure in bisect (Baolu)
- Remove out_driver_error_code from struct iommu_hwpt_invalidate after
discussion in v6. Should expect per-entry error code.
- Rework the selftest cache invalidation part to report a per-entry error
- Allow user to pass in an empty array to have a try-and-fail mechanism for
user to check if a given req_type is supported by the kernel (Jason)
- Define a separate enum type for cache invalidation data (Jason)
- Fix the IOMMU_HWPT_INVALIDATE to always update the req_num field before
returning (Nicolin)
- Merge the VT-d nesting part 2/2
https://lore.kernel.org/linux-iommu/20231117131816.24359-1-yi.l.liu@intel.c…
into this series to avoid defining empty enum in the middle of the series.
The major difference is adding the VT-d related invalidation uapi structures
together with the generic data structures in patch 02 of this series.
- VT-d driver was refined to report ICE/ITE error from the bottom cache
invalidation submit helpers, hence the cache_invalidate_user op could
report such errors via the per-entry error field to user. VT-d driver
will not stop the invalidation array walking due to the ICE/ITE errors
as such errors are defined by VT-d spec, userspace should be able to
handle it and let the real user (say Virtual Machine) know about it.
But for other errors like invalid uapi data structure configuration,
memory copy failure, such errors should stop the array walking as it
may have more issues if go on.
- Minor fixes per Jason and Kevin's review comments
v6: https://lore.kernel.org/linux-iommu/20231117130717.19875-1-yi.l.liu@intel.c…
- No much change, just rebase on top of 6.7-rc1 as part 1/2 is merged
v5: https://lore.kernel.org/linux-iommu/20231020092426.13907-1-yi.l.liu@intel.c…
- Split the iommufd nesting series into two parts of alloc_user and
invalidation (Jason)
- Split IOMMUFD_OBJ_HW_PAGETABLE to IOMMUFD_OBJ_HWPT_PAGING/_NESTED, and
do the same with the structures/alloc()/abort()/destroy(). Reworked the
selftest accordingly too. (Jason)
- Move hwpt/data_type into struct iommu_user_data from standalone op
arguments. (Jason)
- Rename hwpt_type to be data_type, the HWPT_TYPE to be HWPT_ALLOC_DATA,
_TYPE_DEFAULT to be _ALLOC_DATA_NONE (Jason, Kevin)
- Rename iommu_copy_user_data() to iommu_copy_struct_from_user() (Kevin)
- Add macro to the iommu_copy_struct_from_user() to calculate min_size
(Jason)
- Fix two bugs spotted by ZhaoYan
v4: https://lore.kernel.org/linux-iommu/20230921075138.124099-1-yi.l.liu@intel.…
- Separate HWPT alloc/destroy/abort functions between user-managed HWPTs
and kernel-managed HWPTs
- Rework invalidate uAPI to be a multi-request array-based design
- Add a struct iommu_user_data_array and a helper for driver to sanitize
and copy the entry data from user space invalidation array
- Add a patch fixing TEST_LENGTH() in selftest program
- Drop IOMMU_RESV_IOVA_RANGES patches
- Update kdoc and inline comments
- Drop the code to add IOMMU_RESV_SW_MSI to kernel-managed HWPT in nested translation,
this does not change the rule that resv regions should only be added to the
kernel-managed HWPT. The IOMMU_RESV_SW_MSI stuff will be added in later series
as it is needed only by SMMU so far.
v3: https://lore.kernel.org/linux-iommu/20230724110406.107212-1-yi.l.liu@intel.…
- Add new uAPI things in alphabetical order
- Pass in "enum iommu_hwpt_type hwpt_type" to op->domain_alloc_user for
sanity, replacing the previous op->domain_alloc_user_data_len solution
- Return ERR_PTR from domain_alloc_user instead of NULL
- Only add IOMMU_RESV_SW_MSI to kernel-managed HWPT in nested translation (Kevin)
- Add IOMMU_RESV_IOVA_RANGES to report resv iova ranges to userspace hence
userspace is able to exclude the ranges in the stage-1 HWPT (e.g. guest I/O
page table). (Kevin)
- Add selftest coverage for the new IOMMU_RESV_IOVA_RANGES ioctl
- Minor changes per Kevin's inputs
v2: https://lore.kernel.org/linux-iommu/20230511143844.22693-1-yi.l.liu@intel.c…
- Add union iommu_domain_user_data to include all user data structures to avoid
passing void * in kernel APIs.
- Add iommu op to return user data length for user domain allocation
- Rename struct iommu_hwpt_alloc::data_type to be hwpt_type
- Store the invalidation data length in iommu_domain_ops::cache_invalidate_user_data_len
- Convert cache_invalidate_user op to be int instead of void
- Remove @data_type in struct iommu_hwpt_invalidate
- Remove out_hwpt_type_bitmap in struct iommu_hw_info hence drop patch 08 of v1
v1: https://lore.kernel.org/linux-iommu/20230309080910.607396-1-yi.l.liu@intel.…
Thanks,
Yi Liu
Lu Baolu (4):
iommu: Add cache_invalidate_user op
iommu/vt-d: Allow qi_submit_sync() to return the QI faults
iommu/vt-d: Convert stage-1 cache invalidation to return QI fault
iommu/vt-d: Add iotlb flush for nested domain
Nicolin Chen (4):
iommu: Add iommu_copy_struct_from_user_array helper
iommufd/selftest: Add mock_domain_cache_invalidate_user support
iommufd/selftest: Add IOMMU_TEST_OP_MD_CHECK_IOTLB test op
iommufd/selftest: Add coverage for IOMMU_HWPT_INVALIDATE ioctl
Yi Liu (2):
iommufd: Add IOMMU_HWPT_INVALIDATE
iommufd: Add data structure for Intel VT-d stage-1 cache invalidation
drivers/iommu/intel/dmar.c | 42 ++--
drivers/iommu/intel/iommu.c | 12 +-
drivers/iommu/intel/iommu.h | 8 +-
drivers/iommu/intel/irq_remapping.c | 2 +-
drivers/iommu/intel/nested.c | 107 ++++++++++
drivers/iommu/intel/pasid.c | 14 +-
drivers/iommu/intel/svm.c | 14 +-
drivers/iommu/iommufd/hw_pagetable.c | 41 ++++
drivers/iommu/iommufd/iommufd_private.h | 10 +
drivers/iommu/iommufd/iommufd_test.h | 39 ++++
drivers/iommu/iommufd/main.c | 3 +
drivers/iommu/iommufd/selftest.c | 86 ++++++++
include/linux/iommu.h | 100 +++++++++
include/uapi/linux/iommufd.h | 101 ++++++++++
tools/testing/selftests/iommu/iommufd.c | 190 ++++++++++++++++++
tools/testing/selftests/iommu/iommufd_utils.h | 57 ++++++
16 files changed, 787 insertions(+), 39 deletions(-)
--
2.34.1
For now, we have to call some helpers when we need to update the csum,
such as bpf_l4_csum_replace, bpf_l3_csum_replace, etc. These helpers are
not inlined, which causes poor performance.
In fact, we can define our own csum update functions in BPF program
instead of bpf_l3_csum_replace, which is totally inlined and efficient.
However, we can't do this for bpf_l4_csum_replace for now, as we can't
update skb->csum, which can cause skb->csum invalid in the rx path with
CHECKSUM_COMPLETE mode.
What's more, we can't use the direct data access and have to use
skb_store_bytes() with the BPF_F_RECOMPUTE_CSUM flag in some case, such
as modifing the vni in the vxlan header and the underlay udp header has
no checksum.
In the first patch, we make skb->csum readable and writable, and we make
skb->ip_summed readable. For now, for tc only. With these 2 fields, we
don't need to call bpf helpers for csum update any more.
In the second patch, we add some testcases for the read/write testing for
skb->csum and skb->ip_summed.
If this series is acceptable, we can define the inlined functions for csum
update in libbpf in the next step.
Menglong Dong (2):
bpf: add csum/ip_summed fields to __sk_buff
testcases/bpf: add testcases for skb->csum to ctx_skb.c
include/linux/skbuff.h | 2 +
include/uapi/linux/bpf.h | 2 +
net/core/filter.c | 22 ++++++++++
tools/include/uapi/linux/bpf.h | 2 +
.../testing/selftests/bpf/verifier/ctx_skb.c | 43 +++++++++++++++++++
5 files changed, 71 insertions(+)
--
2.39.2
While testing the split PMD path with lockdep enabled I've got an
"Invalid wait context" error caused by split_huge_page_to_list() trying
to lock anon_vma->rwsem while inside RCU read section. The issues is due
to move_pages_pte() calling split_folio() under RCU read lock. Fix this
by unmapping the PTEs and exiting RCU read section before splitting the
folio and then retrying. The same retry pattern is used when locking the
folio or anon_vma in this function. After splitting the large folio we
unlock and release it because after the split the old folio might not be
the one that contains the src_addr.
Fixes: 94b01c885131 ("userfaultfd: UFFDIO_MOVE uABI")
Signed-off-by: Suren Baghdasaryan <surenb(a)google.com>
---
Changes from v1 [1]:
1. Reset src_folio and src_folio_pte after folio is split, per Peter Xu
[1] https://lore.kernel.org/all/20231230025607.2476912-1-surenb@google.com/
mm/userfaultfd.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 5e718014e671..216ab4c8621f 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -1078,9 +1078,18 @@ static int move_pages_pte(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd,
/* at this point we have src_folio locked */
if (folio_test_large(src_folio)) {
+ /* split_folio() can block */
+ pte_unmap(&orig_src_pte);
+ pte_unmap(&orig_dst_pte);
+ src_pte = dst_pte = NULL;
err = split_folio(src_folio);
if (err)
goto out;
+ /* have to reacquire the folio after it got split */
+ folio_unlock(src_folio);
+ folio_put(src_folio);
+ src_folio = NULL;
+ goto retry;
}
if (!src_anon_vma) {
--
2.43.0.472.g3155946c3a-goog
While testing the split PMD path with lockdep enabled I've got an
"Invalid wait context" error caused by split_huge_page_to_list() trying
to lock anon_vma->rwsem while inside RCU read section. The issues is due
to move_pages_pte() calling split_folio() under RCU read lock. Fix this
by unmapping the PTEs and exiting RCU read section before splitting the
folio and then retrying. The same retry pattern is used when locking the
folio or anon_vma in this function.
Fixes: 94b01c885131 ("userfaultfd: UFFDIO_MOVE uABI")
Signed-off-by: Suren Baghdasaryan <surenb(a)google.com>
---
Patch applies over mm-unstable.
Please note that the SHA in Fixes tag is unstable.
mm/userfaultfd.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 5e718014e671..71393410e028 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -1078,9 +1078,14 @@ static int move_pages_pte(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd,
/* at this point we have src_folio locked */
if (folio_test_large(src_folio)) {
+ /* split_folio() can block */
+ pte_unmap(&orig_src_pte);
+ pte_unmap(&orig_dst_pte);
+ src_pte = dst_pte = NULL;
err = split_folio(src_folio);
if (err)
goto out;
+ goto retry;
}
if (!src_anon_vma) {
--
2.43.0.472.g3155946c3a-goog
From: Roberto Sassu <roberto.sassu(a)huawei.com>
IMA and EVM are not effectively LSMs, especially due to the fact that in
the past they could not provide a security blob while there is another LSM
active.
That changed in the recent years, the LSM stacking feature now makes it
possible to stack together multiple LSMs, and allows them to provide a
security blob for most kernel objects. While the LSM stacking feature has
some limitations being worked out, it is already suitable to make IMA and
EVM as LSMs.
The main purpose of this patch set is to remove IMA and EVM function calls,
hardcoded in the LSM infrastructure and other places in the kernel, and to
register them as LSM hook implementations, so that those functions are
called by the LSM infrastructure like other regular LSMs.
This patch set introduces two new LSMs 'ima' and 'evm', so that functions
can be registered to their respective LSM, and removes the 'integrity' LSM.
integrity_kernel_module_request() was moved to IMA, since it was related to
appraisal. integrity_inode_free() was replaced with
ima_inode_free_security() (EVM does not need to free memory).
In order to make 'ima' and 'evm' independent LSMs, it was necessary to
split integrity metadata used by both IMA and EVM, and to let them manage
their own. The special case of the IMA_NEW_FILE flag, managed by IMA and
used by EVM, was handled by introducing a new flag in EVM, EVM_NEW_FILE,
managed by two additional LSM hooks, evm_post_path_mknod() and
evm_file_free(), equivalent to their counterparts ima_post_path_mknod() and
ima_file_free().
In addition to splitting metadata, it was decided to embed the full
structure into the inode security blob, rather than using a cache of
objects and allocating them on demand. This opens for new possibilities,
such as improving locking in IMA.
Another follow-up change was removing the iint parameter from
evm_verifyxattr(), that IMA used to pass integrity metadata to EVM. After
splitting metadata, and aligning EVM_NEW_FILE with IMA_NEW_FILE, this
parameter was not necessary anymore.
The last part was to ensure that the order of IMA and EVM functions is
respected after they become LSMs. Since the order of lsm_info structures in
the .lsm_info.init section depends on the order object files containing
those structures are passed to the linker of the kernel image, and since
IMA is before EVM in the Makefile, that is sufficient to assert that IMA
functions are executed before EVM ones.
The patch set is organized as follows.
Patches 1-9 make IMA and EVM functions suitable to be registered to the LSM
infrastructure, by aligning function parameters.
Patches 10-18 add new LSM hooks in the same places where IMA and EVM
functions are called, if there is no LSM hook already.
Patches 19-21 introduce the new standalone LSMs 'ima' and 'evm', and move
hardcoded calls to IMA, EVM and integrity functions to those LSMs.
Patches 22-23 remove the dependency on the 'integrity' LSM by splitting
integrity metadata, so that the 'ima' and 'evm' LSMs can use their own.
They also duplicate iint_lockdep_annotate() in ima_main.c, since the mutex
field was moved from integrity_iint_cache to ima_iint_cache.
Patch 24 finally removes the 'integrity' LSM, since 'ima' and 'evm' are now
self-contained and independent.
The patch set applies on top of lsm/dev, commit 80b4ff1d2c9b ("selftests:
remove the LSM_ID_IMA check in lsm/lsm_list_modules_test"). The
linux-integrity/next-integrity-testing at commit f17167bea279 ("ima: Remove
EXPERIMENTAL from Kconfig") was merged.
Changelog:
v7:
- Use return instead of goto in __vfs_removexattr_locked() (suggested by
Casey)
- Clarify in security/integrity/Makefile that the order of 'ima' and 'evm'
LSMs depends on the order in which IMA and EVM are compiled
- Move integrity_iint_cache flags to ima.h and evm.h in security/ and
duplicate IMA_NEW_FILE to EVM_NEW_FILE
- Rename evm_inode_get_iint() to evm_iint_inode() and ima_inode_get_iint()
to ima_iint_inode(), check if inode->i_security is NULL, and just return
the pointer from the inode security blob
- Restore the non-NULL checks after ima_iint_inode() and evm_iint_inode()
(suggested by Casey)
- Introduce evm_file_free() to clear EVM_NEW_FILE
- Remove comment about LSM_ORDER_LAST not guaranteeing the order of 'ima'
and 'evm' LSMs
- Lock iint->mutex before reading IMA_COLLECTED flag in __ima_inode_hash()
and restored ima_policy_flag check
- Remove patch about the hardcoded ordering of 'ima' and 'evm' LSMs in
security.c
- Add missing ima_inode_free_security() to free iint->ima_hash
- Add the cases for LSM_ID_IMA and LSM_ID_EVM in lsm_list_modules_test.c
- Mention about the change in IMA and EVM post functions for private
inodes
v6:
- See v7
v5:
- Rename security_file_pre_free() to security_file_release() and the LSM
hook file_pre_free_security to file_release (suggested by Paul)
- Move integrity_kernel_module_request() to ima_main.c (renamed to
ima_kernel_module_request())
- Split the integrity_iint_cache structure into ima_iint_cache and
evm_iint_cache, so that IMA and EVM can use disjoint metadata and
reserve space with the LSM infrastructure
- Reserve space for the entire ima_iint_cache and evm_iint_cache
structures, not just the pointer (suggested by Paul)
- Introduce ima_inode_get_iint() and evm_inode_get_iint() to retrieve
respectively the ima_iint_cache and evm_iint_cache structure from the
security blob
- Remove the various non-NULL checks for the ima_iint_cache and
evm_iint_cache structures, since the LSM infrastructure ensure that they
always exist
- Remove the iint parameter from evm_verifyxattr() since IMA and EVM
use disjoint integrity metaddata
- Introduce the evm_post_path_mknod() to set the IMA_NEW_FILE flag
- Register the inode_alloc_security LSM hook in IMA and EVM to
initialize the respective integrity metadata structures
- Remove the 'integrity' LSM completely and instead make 'ima' and 'evm'
proper standalone LSMs
- Add the inode parameter to ima_get_verity_digest(), since the inode
field is not present in ima_iint_cache
- Move iint_lockdep_annotate() to ima_main.c (renamed to
ima_iint_lockdep_annotate())
- Remove ima_get_lsm_id() and evm_get_lsm_id(), since IMA and EVM directly
register the needed LSM hooks
- Enforce 'ima' and 'evm' LSM ordering at LSM infrastructure level
v4:
- Improve short and long description of
security_inode_post_create_tmpfile(), security_inode_post_set_acl(),
security_inode_post_remove_acl() and security_file_post_open()
(suggested by Mimi)
- Improve commit message of 'ima: Move to LSM infrastructure' (suggested
by Mimi)
v3:
- Drop 'ima: Align ima_post_path_mknod() definition with LSM
infrastructure' and 'ima: Align ima_post_create_tmpfile() definition
with LSM infrastructure', define the new LSM hooks with the same
IMA parameters instead (suggested by Mimi)
- Do IS_PRIVATE() check in security_path_post_mknod() and
security_inode_post_create_tmpfile() on the new inode rather than the
parent directory (in the post method it is available)
- Don't export ima_file_check() (suggested by Stefan)
- Remove redundant check of file mode in ima_post_path_mknod() (suggested
by Mimi)
- Mention that ima_post_path_mknod() is now conditionally invoked when
CONFIG_SECURITY_PATH=y (suggested by Mimi)
- Mention when a LSM hook will be introduced in the IMA/EVM alignment
patches (suggested by Mimi)
- Simplify the commit messages when introducing a new LSM hook
- Still keep the 'extern' in the function declaration, until the
declaration is removed (suggested by Mimi)
- Improve documentation of security_file_pre_free()
- Register 'ima' and 'evm' as standalone LSMs (suggested by Paul)
- Initialize the 'ima' and 'evm' LSMs from 'integrity', to keep the
original ordering of IMA and EVM functions as when they were hardcoded
- Return the IMA and EVM LSM IDs to 'integrity' for registration of the
integrity-specific hooks
- Reserve an xattr slot from the 'evm' LSM instead of 'integrity'
- Pass the LSM ID to init_ima_appraise_lsm()
v2:
- Add description for newly introduced LSM hooks (suggested by Casey)
- Clarify in the description of security_file_pre_free() that actions can
be performed while the file is still open
v1:
- Drop 'evm: Complete description of evm_inode_setattr()', 'fs: Fix
description of vfs_tmpfile()' and 'security: Introduce LSM_ORDER_LAST',
they were sent separately (suggested by Christian Brauner)
- Replace dentry with file descriptor parameter for
security_inode_post_create_tmpfile()
- Introduce mode_stripped and pass it as mode argument to
security_path_mknod() and security_path_post_mknod()
- Use goto in do_mknodat() and __vfs_removexattr_locked() (suggested by
Mimi)
- Replace __lsm_ro_after_init with __ro_after_init
- Modify short description of security_inode_post_create_tmpfile() and
security_inode_post_set_acl() (suggested by Stefan)
- Move security_inode_post_setattr() just after security_inode_setattr()
(suggested by Mimi)
- Modify short description of security_key_post_create_or_update()
(suggested by Mimi)
- Add back exported functions ima_file_check() and
evm_inode_init_security() respectively to ima.h and evm.h (reported by
kernel robot)
- Remove extern from prototype declarations and fix style issues
- Remove unnecessary include of linux/lsm_hooks.h in ima_main.c and
ima_appraise.c
Roberto Sassu (24):
ima: Align ima_inode_post_setattr() definition with LSM infrastructure
ima: Align ima_file_mprotect() definition with LSM infrastructure
ima: Align ima_inode_setxattr() definition with LSM infrastructure
ima: Align ima_inode_removexattr() definition with LSM infrastructure
ima: Align ima_post_read_file() definition with LSM infrastructure
evm: Align evm_inode_post_setattr() definition with LSM infrastructure
evm: Align evm_inode_setxattr() definition with LSM infrastructure
evm: Align evm_inode_post_setxattr() definition with LSM
infrastructure
security: Align inode_setattr hook definition with EVM
security: Introduce inode_post_setattr hook
security: Introduce inode_post_removexattr hook
security: Introduce file_post_open hook
security: Introduce file_release hook
security: Introduce path_post_mknod hook
security: Introduce inode_post_create_tmpfile hook
security: Introduce inode_post_set_acl hook
security: Introduce inode_post_remove_acl hook
security: Introduce key_post_create_or_update hook
ima: Move to LSM infrastructure
ima: Move IMA-Appraisal to LSM infrastructure
evm: Move to LSM infrastructure
evm: Make it independent from 'integrity' LSM
ima: Make it independent from 'integrity' LSM
integrity: Remove LSM
fs/attr.c | 5 +-
fs/file_table.c | 3 +-
fs/namei.c | 12 +-
fs/nfsd/vfs.c | 3 +-
fs/open.c | 1 -
fs/posix_acl.c | 5 +-
fs/xattr.c | 9 +-
include/linux/evm.h | 111 +-------
include/linux/fs.h | 2 -
include/linux/ima.h | 142 ----------
include/linux/integrity.h | 27 --
include/linux/lsm_hook_defs.h | 20 +-
include/linux/security.h | 59 ++++
include/uapi/linux/lsm.h | 2 +
security/integrity/Makefile | 1 +
security/integrity/digsig_asymmetric.c | 23 --
security/integrity/evm/evm.h | 19 ++
security/integrity/evm/evm_crypto.c | 4 +-
security/integrity/evm/evm_main.c | 195 ++++++++++---
security/integrity/iint.c | 197 +------------
security/integrity/ima/ima.h | 120 +++++++-
security/integrity/ima/ima_api.c | 15 +-
security/integrity/ima/ima_appraise.c | 64 +++--
security/integrity/ima/ima_init.c | 2 +-
security/integrity/ima/ima_main.c | 201 +++++++++++---
security/integrity/ima/ima_policy.c | 2 +-
security/integrity/integrity.h | 80 +-----
security/keys/key.c | 10 +-
security/security.c | 261 +++++++++++-------
security/selinux/hooks.c | 3 +-
security/smack/smack_lsm.c | 4 +-
.../selftests/lsm/lsm_list_modules_test.c | 6 +
32 files changed, 783 insertions(+), 825 deletions(-)
--
2.34.1
This MIB counter is similar to the one of TCP -- CurrEstab -- available
in /proc/net/snmp. This is useful to quickly list the number of MPTCP
connections without having to iterate over all of them.
Patch 1 prepares its support by adding new helper functions:
- MPTCP_DEC_STATS(): similar to MPTCP_INC_STATS(), but this time to
decrement a counter.
- mptcp_set_state(): similar to tcp_set_state(), to change the state of
an MPTCP socket, and to inc/decrement the new counter when needed.
Patch 2 uses mptcp_set_state() instead of directly calling
inet_sk_state_store() to change the state of MPTCP sockets.
Patch 3 and 4 validate the new feature in MPTCP "join" and "diag"
selftests.
Signed-off-by: Matthieu Baerts <matttbe(a)kernel.org>
---
Geliang Tang (4):
mptcp: add CurrEstab MIB counter support
mptcp: use mptcp_set_state
selftests: mptcp: join: check CURRESTAB counters
selftests: mptcp: diag: check CURRESTAB counters
net/mptcp/mib.c | 1 +
net/mptcp/mib.h | 8 ++++
net/mptcp/pm_netlink.c | 5 +++
net/mptcp/protocol.c | 56 ++++++++++++++++---------
net/mptcp/protocol.h | 1 +
net/mptcp/subflow.c | 2 +-
tools/testing/selftests/net/mptcp/diag.sh | 17 +++++++-
tools/testing/selftests/net/mptcp/mptcp_join.sh | 46 +++++++++++++++++---
8 files changed, 110 insertions(+), 26 deletions(-)
---
base-commit: 56794e5358542b7c652f202946e53bfd2373b5e0
change-id: 20231221-upstream-net-next-20231221-mptcp-currestab-5a2867b4020b
Best regards,
--
Matthieu Baerts <matttbe(a)kernel.org>
suite->log must be checked for NULL before passing it to
string_stream_clear(). This was done in kunit_init_test() but was missing
from kunit_init_suite().
Signed-off-by: Richard Fitzgerald <rf(a)opensource.cirrus.com>
Fixes: 6d696c4695c5 ("kunit: add ability to run tests after boot using debugfs")
---
lib/kunit/test.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/lib/kunit/test.c b/lib/kunit/test.c
index e803d998e855..ea7f0913e55a 100644
--- a/lib/kunit/test.c
+++ b/lib/kunit/test.c
@@ -658,7 +658,9 @@ static void kunit_init_suite(struct kunit_suite *suite)
kunit_debugfs_create_suite(suite);
suite->status_comment[0] = '\0';
suite->suite_init_err = 0;
- string_stream_clear(suite->log);
+
+ if (suite->log)
+ string_stream_clear(suite->log);
}
bool kunit_enabled(void)
--
2.30.2
This makes the uevent selftests build not write to the source tree
unconditionally, as that breaks out of tree builds when the source tree
is read-only. It also avoids leaving a git repository in a dirty state
after a build.
v2: drop spurious extra SPDX-License-Identifier
Signed-off-by: Antonio Terceiro <antonio.terceiro(a)linaro.org>
---
tools/testing/selftests/uevent/Makefile | 15 +++------------
1 file changed, 3 insertions(+), 12 deletions(-)
diff --git a/tools/testing/selftests/uevent/Makefile b/tools/testing/selftests/uevent/Makefile
index f7baa9aa2932..872969f42694 100644
--- a/tools/testing/selftests/uevent/Makefile
+++ b/tools/testing/selftests/uevent/Makefile
@@ -1,17 +1,8 @@
# SPDX-License-Identifier: GPL-2.0
all:
-include ../lib.mk
-
-.PHONY: all clean
-
-BINARIES := uevent_filtering
-CFLAGS += -Wl,-no-as-needed -Wall
+CFLAGS += -Wl,-no-as-needed -Wall $(KHDR_INCLUDES)
-uevent_filtering: uevent_filtering.c ../kselftest.h ../kselftest_harness.h
- $(CC) $(CFLAGS) $< -o $@
+TEST_GEN_PROGS = uevent_filtering
-TEST_PROGS += $(BINARIES)
-EXTRA_CLEAN := $(BINARIES)
-
-all: $(BINARIES)
+include ../lib.mk
--
2.43.0
Nested translation is a hardware feature that is supported by many modern
IOMMU hardwares. It has two stages (stage-1, stage-2) address translation
to get access to the physical address. stage-1 translation table is owned
by userspace (e.g. by a guest OS), while stage-2 is owned by kernel. Changes
to stage-1 translation table should be followed by an IOTLB invalidation.
Take Intel VT-d as an example, the stage-1 translation table is I/O page
table. As the below diagram shows, guest I/O page table pointer in GPA
(guest physical address) is passed to host and be used to perform the stage-1
address translation. Along with it, modifications to present mappings in the
guest I/O page table should be followed with an IOTLB invalidation.
.-------------. .---------------------------.
| vIOMMU | | Guest I/O page table |
| | '---------------------------'
.----------------/
| PASID Entry |--- PASID cache flush --+
'-------------' |
| | V
| | I/O page table pointer in GPA
'-------------'
Guest
------| Shadow |---------------------------|--------
v v v
Host
.-------------. .------------------------.
| pIOMMU | | FS for GIOVA->GPA |
| | '------------------------'
.----------------/ |
| PASID Entry | V (Nested xlate)
'----------------\.----------------------------------.
| | | SS for GPA->HPA, unmanaged domain|
| | '----------------------------------'
'-------------'
Where:
- FS = First stage page tables
- SS = Second stage page tables
<Intel VT-d Nested translation>
This series is based on the first part which was merged [1], this series is to
add the cache invalidation interface or the userspace to invalidate cache after
modifying the stage-1 page table. This includes both the iommufd changes and the
VT-d driver changes.
Complete code can be found in [2], QEMU could can be found in [3].
At last, this is a team work together with Nicolin Chen, Lu Baolu. Thanks
them for the help. ^_^. Look forward to your feedbacks.
[1] https://lore.kernel.org/linux-iommu/20231026044216.64964-1-yi.l.liu@intel.c… - merged
[2] https://github.com/yiliu1765/iommufd/tree/iommufd_nesting
[3] https://github.com/yiliu1765/qemu/tree/zhenzhong/wip/iommufd_nesting_rfcv1
Change log:
v9:
- Add a test case which sets both IOMMU_TEST_INVALIDATE_FLAG_ALL and
IOMMU_TEST_INVALIDATE_FLAG_TRIGGER_ERROR in flags, and expect to succeed
and see an 'error'. (Kevin)
- Returns -ETIMEOUT in qi_check_fault() if caller is interested with the
fault when timeout happens. If not, the qi_submit_sync() will keep retry
hence unable to report the error back to user. For now, only the user cache
invalidation path has interest on the time out error. So this change only
affects the user cache invalidation path. Other path will still hang in
qi_submit_sync() when timeout happens. (Kevin)
v8: https://lore.kernel.org/linux-iommu/20231227161354.67701-1-yi.l.liu@intel.c…
- Pass invalidation hint to the cache invalidation helper in the cache_invalidate_user
op path (Kevin)
- Move the devTLB invalidation out of info->iommu loop (Kevin, Weijiang)
- Clear *fault per restart in qi_submit_sync() to avoid acroos submission error
accumulation. (Kevin)
- Define the vtd cache invalidation uapi structure in separate patch (Kevin)
- Rename inv_error to be hw_error (Kevin)
- Rename 'reqs_uptr', 'req_type', 'req_len' and 'req_num' to be 'data_uptr',
'data_type', "entry_len' and 'entry_num" (Kevin)
- Allow user to set IOMMU_TEST_INVALIDATE_FLAG_ALL and IOMMU_TEST_INVALIDATE_FLAG_TRIGGER_ERROR
in the same time (Kevin)
v7: https://lore.kernel.org/linux-iommu/20231221153948.119007-1-yi.l.liu@intel.…
- Remove domain->ops->cache_invalidate_user check in hwpt alloc path due
to failure in bisect (Baolu)
- Remove out_driver_error_code from struct iommu_hwpt_invalidate after
discussion in v6. Should expect per-entry error code.
- Rework the selftest cache invalidation part to report a per-entry error
- Allow user to pass in an empty array to have a try-and-fail mechanism for
user to check if a given req_type is supported by the kernel (Jason)
- Define a separate enum type for cache invalidation data (Jason)
- Fix the IOMMU_HWPT_INVALIDATE to always update the req_num field before
returning (Nicolin)
- Merge the VT-d nesting part 2/2
https://lore.kernel.org/linux-iommu/20231117131816.24359-1-yi.l.liu@intel.c…
into this series to avoid defining empty enum in the middle of the series.
The major difference is adding the VT-d related invalidation uapi structures
together with the generic data structures in patch 02 of this series.
- VT-d driver was refined to report ICE/ITE error from the bottom cache
invalidation submit helpers, hence the cache_invalidate_user op could
report such errors via the per-entry error field to user. VT-d driver
will not stop the invalidation array walking due to the ICE/ITE errors
as such errors are defined by VT-d spec, userspace should be able to
handle it and let the real user (say Virtual Machine) know about it.
But for other errors like invalid uapi data structure configuration,
memory copy failure, such errors should stop the array walking as it
may have more issues if go on.
- Minor fixes per Jason and Kevin's review comments
v6: https://lore.kernel.org/linux-iommu/20231117130717.19875-1-yi.l.liu@intel.c…
- No much change, just rebase on top of 6.7-rc1 as part 1/2 is merged
v5: https://lore.kernel.org/linux-iommu/20231020092426.13907-1-yi.l.liu@intel.c…
- Split the iommufd nesting series into two parts of alloc_user and
invalidation (Jason)
- Split IOMMUFD_OBJ_HW_PAGETABLE to IOMMUFD_OBJ_HWPT_PAGING/_NESTED, and
do the same with the structures/alloc()/abort()/destroy(). Reworked the
selftest accordingly too. (Jason)
- Move hwpt/data_type into struct iommu_user_data from standalone op
arguments. (Jason)
- Rename hwpt_type to be data_type, the HWPT_TYPE to be HWPT_ALLOC_DATA,
_TYPE_DEFAULT to be _ALLOC_DATA_NONE (Jason, Kevin)
- Rename iommu_copy_user_data() to iommu_copy_struct_from_user() (Kevin)
- Add macro to the iommu_copy_struct_from_user() to calculate min_size
(Jason)
- Fix two bugs spotted by ZhaoYan
v4: https://lore.kernel.org/linux-iommu/20230921075138.124099-1-yi.l.liu@intel.…
- Separate HWPT alloc/destroy/abort functions between user-managed HWPTs
and kernel-managed HWPTs
- Rework invalidate uAPI to be a multi-request array-based design
- Add a struct iommu_user_data_array and a helper for driver to sanitize
and copy the entry data from user space invalidation array
- Add a patch fixing TEST_LENGTH() in selftest program
- Drop IOMMU_RESV_IOVA_RANGES patches
- Update kdoc and inline comments
- Drop the code to add IOMMU_RESV_SW_MSI to kernel-managed HWPT in nested translation,
this does not change the rule that resv regions should only be added to the
kernel-managed HWPT. The IOMMU_RESV_SW_MSI stuff will be added in later series
as it is needed only by SMMU so far.
v3: https://lore.kernel.org/linux-iommu/20230724110406.107212-1-yi.l.liu@intel.…
- Add new uAPI things in alphabetical order
- Pass in "enum iommu_hwpt_type hwpt_type" to op->domain_alloc_user for
sanity, replacing the previous op->domain_alloc_user_data_len solution
- Return ERR_PTR from domain_alloc_user instead of NULL
- Only add IOMMU_RESV_SW_MSI to kernel-managed HWPT in nested translation (Kevin)
- Add IOMMU_RESV_IOVA_RANGES to report resv iova ranges to userspace hence
userspace is able to exclude the ranges in the stage-1 HWPT (e.g. guest I/O
page table). (Kevin)
- Add selftest coverage for the new IOMMU_RESV_IOVA_RANGES ioctl
- Minor changes per Kevin's inputs
v2: https://lore.kernel.org/linux-iommu/20230511143844.22693-1-yi.l.liu@intel.c…
- Add union iommu_domain_user_data to include all user data structures to avoid
passing void * in kernel APIs.
- Add iommu op to return user data length for user domain allocation
- Rename struct iommu_hwpt_alloc::data_type to be hwpt_type
- Store the invalidation data length in iommu_domain_ops::cache_invalidate_user_data_len
- Convert cache_invalidate_user op to be int instead of void
- Remove @data_type in struct iommu_hwpt_invalidate
- Remove out_hwpt_type_bitmap in struct iommu_hw_info hence drop patch 08 of v1
v1: https://lore.kernel.org/linux-iommu/20230309080910.607396-1-yi.l.liu@intel.…
Thanks,
Yi Liu
Lu Baolu (4):
iommu: Add cache_invalidate_user op
iommu/vt-d: Allow qi_submit_sync() to return the QI faults
iommu/vt-d: Convert stage-1 cache invalidation to return QI fault
iommu/vt-d: Add iotlb flush for nested domain
Nicolin Chen (4):
iommu: Add iommu_copy_struct_from_user_array helper
iommufd/selftest: Add mock_domain_cache_invalidate_user support
iommufd/selftest: Add IOMMU_TEST_OP_MD_CHECK_IOTLB test op
iommufd/selftest: Add coverage for IOMMU_HWPT_INVALIDATE ioctl
Yi Liu (2):
iommufd: Add IOMMU_HWPT_INVALIDATE
iommufd: Add data structure for Intel VT-d stage-1 cache invalidation
drivers/iommu/intel/dmar.c | 49 +++--
drivers/iommu/intel/iommu.c | 12 +-
drivers/iommu/intel/iommu.h | 8 +-
drivers/iommu/intel/irq_remapping.c | 2 +-
drivers/iommu/intel/nested.c | 107 ++++++++++
drivers/iommu/intel/pasid.c | 14 +-
drivers/iommu/intel/svm.c | 14 +-
drivers/iommu/iommufd/hw_pagetable.c | 41 ++++
drivers/iommu/iommufd/iommufd_private.h | 10 +
drivers/iommu/iommufd/iommufd_test.h | 39 ++++
drivers/iommu/iommufd/main.c | 3 +
drivers/iommu/iommufd/selftest.c | 86 ++++++++
include/linux/iommu.h | 100 +++++++++
include/uapi/linux/iommufd.h | 101 ++++++++++
tools/testing/selftests/iommu/iommufd.c | 190 ++++++++++++++++++
tools/testing/selftests/iommu/iommufd_utils.h | 57 ++++++
16 files changed, 794 insertions(+), 39 deletions(-)
--
2.34.1
The patch set [1] added a general lib.sh in net selftests, and converted
several test scripts to source the lib.sh.
The shebang of unicast_extensions.sh is /bin/sh which may point to various
shells in different distributions, but "source" is only available in some
of them. For example, "source" is a built-it function in bash, but it
cannot be used in dash.
Refer to other scripts that were converted together, simply change the
shebang to bash to suppress the following errors when the default /bin/sh
points to other shells.
# selftests: net: unicast_extensions.sh
# ./unicast_extensions.sh: 31: source: not found
# ###########################################################################
# Unicast address extensions tests (behavior of reserved IPv4 addresses)
# ###########################################################################
# TEST: assign and ping within 240/4 (1 of 2) (is allowed) [FAIL]
# TEST: assign and ping within 240/4 (2 of 2) (is allowed) [FAIL]
# TEST: assign and ping within 0/8 (1 of 2) (is allowed) [FAIL]
# TEST: assign and ping within 0/8 (2 of 2) (is allowed) [FAIL]
# TEST: assign and ping inside 255.255/16 (is allowed) [FAIL]
# TEST: assign and ping inside 255.255.255/24 (is allowed) [FAIL]
# TEST: route between 240.5.6/24 and 255.1.2/24 (is allowed) [FAIL]
# TEST: route between 0.200/16 and 245.99/16 (is allowed) [FAIL]
# TEST: assign and ping lowest address (/24) [FAIL]
# TEST: assign and ping lowest address (/26) [FAIL]
# TEST: routing using lowest address [FAIL]
# TEST: assigning 0.0.0.0 (is forbidden) [ OK ]
# TEST: assigning 255.255.255.255 (is forbidden) [ OK ]
# TEST: assign and ping inside 127/8 (is forbidden) [ OK ]
# TEST: assign and ping class D address (is forbidden) [ OK ]
# TEST: routing using class D (is forbidden) [ OK ]
# TEST: routing using 127/8 (is forbidden) [ OK ]
not ok 51 selftests: net: unicast_extensions.sh # exit=1
Link: https://lore.kernel.org/all/20231202020110.362433-1-liuhangbin@gmail.com/ [1]
Fixes: 0f4765d0b48d ("selftests/net: convert unicast_extensions.sh to run it in unique namespace")
Reported-by: kernel test robot <oliver.sang(a)intel.com>
Signed-off-by: Yujie Liu <yujie.liu(a)intel.com>
---
tools/testing/selftests/net/unicast_extensions.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/unicast_extensions.sh b/tools/testing/selftests/net/unicast_extensions.sh
index b7a2cb9e7477..2766990c2b78 100755
--- a/tools/testing/selftests/net/unicast_extensions.sh
+++ b/tools/testing/selftests/net/unicast_extensions.sh
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/bin/bash
# SPDX-License-Identifier: GPL-2.0
#
# By Seth Schoen (c) 2021, for the IPv4 Unicast Extensions Project
--
2.34.1
This series attempts to reduce parsing overhead of IPv6 extension headers
in GRO and GSO, by removing extension header specific code and enabling
the frag0 fast path.
The following changes were made:
- Specific unnecessary HBH conditionals were removed by adding HBH offload
to inet6_offloads
- Added a utility function to support frag0 fast path in ipv6_gro_receive
- Added self-test for IPv6 packets with extension headers in GRO
Richard
Richard Gobert (3):
net: gso: add HBH extension header offload support
net: gro: parse ipv6 ext headers without frag0 invalidation
selftests/net: fix GRO coalesce test and add ext header coalesce test
net/ipv6/exthdrs_offload.c | 11 +++++
net/ipv6/ip6_offload.c | 76 ++++++++++++++++++++----------
tools/testing/selftests/net/gro.c | 78 ++++++++++++++++++++++++++++---
3 files changed, 134 insertions(+), 31 deletions(-)
--
2.36.1
Nested translation is a hardware feature that is supported by many modern
IOMMU hardwares. It has two stages (stage-1, stage-2) address translation
to get access to the physical address. stage-1 translation table is owned
by userspace (e.g. by a guest OS), while stage-2 is owned by kernel. Changes
to stage-1 translation table should be followed by an IOTLB invalidation.
Take Intel VT-d as an example, the stage-1 translation table is I/O page
table. As the below diagram shows, guest I/O page table pointer in GPA
(guest physical address) is passed to host and be used to perform the stage-1
address translation. Along with it, modifications to present mappings in the
guest I/O page table should be followed with an IOTLB invalidation.
.-------------. .---------------------------.
| vIOMMU | | Guest I/O page table |
| | '---------------------------'
.----------------/
| PASID Entry |--- PASID cache flush --+
'-------------' |
| | V
| | I/O page table pointer in GPA
'-------------'
Guest
------| Shadow |---------------------------|--------
v v v
Host
.-------------. .------------------------.
| pIOMMU | | FS for GIOVA->GPA |
| | '------------------------'
.----------------/ |
| PASID Entry | V (Nested xlate)
'----------------\.----------------------------------.
| | | SS for GPA->HPA, unmanaged domain|
| | '----------------------------------'
'-------------'
Where:
- FS = First stage page tables
- SS = Second stage page tables
<Intel VT-d Nested translation>
This series is based on the first part which was merged [1], this series is to
add the cache invalidation interface or the userspace to invalidate cache after
modifying the stage-1 page table. This includes both the iommufd changes and the
VT-d driver changes.
Complete code can be found in [2], QEMU could can be found in [3].
At last, this is a team work together with Nicolin Chen, Lu Baolu. Thanks
them for the help. ^_^. Look forward to your feedbacks.
[1] https://lore.kernel.org/linux-iommu/20231026044216.64964-1-yi.l.liu@intel.c… - merged
[2] https://github.com/yiliu1765/iommufd/tree/iommufd_nesting
[3] https://github.com/yiliu1765/qemu/tree/zhenzhong/wip/iommufd_nesting_rfcv1
Change log:
v7:
- Remove domain->ops->cache_invalidate_user check in hwpt alloc path due
to failure in bisect (Baolu)
- Remove out_driver_error_code from struct iommu_hwpt_invalidate after
discussion in v6. Should expect per-entry error code.
- Rework the selftest cache invalidation part to report a per-entry error
- Allow user to pass in an empty array to have a try-and-fail mechanism for
user to check if a given req_type is supported by the kernel (Jason)
- Define a separate enum type for cache invalidation data (Jason)
- Fix the IOMMU_HWPT_INVALIDATE to always update the req_num field before
returning (Nicolin)
- Merge the VT-d nesting part 2/2
https://lore.kernel.org/linux-iommu/20231117131816.24359-1-yi.l.liu@intel.c…
into this series to avoid defining empty enum in the middle of the series.
The major difference is adding the VT-d related invalidation uapi structures
together with the generic data structures in patch 02 of this series.
- VT-d driver was refined to report ICE/ITE error from the bottom cache
invalidation submit helpers, hence the cache_invalidate_user op could
report such errors via the per-entry error field to user. VT-d driver
will not stop the invalidation array walking due to the ICE/ITE errors
as such errors are defined by VT-d spec, userspace should be able to
handle it and let the real user (say Virtual Machine) know about it.
But for other errors like invalid uapi data structure configuration,
memory copy failure, such errors should stop the array walking as it
may have more issues if go on.
- Minor fixes per Jason and Kevin's review comments
v6: https://lore.kernel.org/linux-iommu/20231117130717.19875-1-yi.l.liu@intel.c…
- No much change, just rebase on top of 6.7-rc1 as part 1/2 is merged
v5: https://lore.kernel.org/linux-iommu/20231020092426.13907-1-yi.l.liu@intel.c…
- Split the iommufd nesting series into two parts of alloc_user and
invalidation (Jason)
- Split IOMMUFD_OBJ_HW_PAGETABLE to IOMMUFD_OBJ_HWPT_PAGING/_NESTED, and
do the same with the structures/alloc()/abort()/destroy(). Reworked the
selftest accordingly too. (Jason)
- Move hwpt/data_type into struct iommu_user_data from standalone op
arguments. (Jason)
- Rename hwpt_type to be data_type, the HWPT_TYPE to be HWPT_ALLOC_DATA,
_TYPE_DEFAULT to be _ALLOC_DATA_NONE (Jason, Kevin)
- Rename iommu_copy_user_data() to iommu_copy_struct_from_user() (Kevin)
- Add macro to the iommu_copy_struct_from_user() to calculate min_size
(Jason)
- Fix two bugs spotted by ZhaoYan
v4: https://lore.kernel.org/linux-iommu/20230921075138.124099-1-yi.l.liu@intel.…
- Separate HWPT alloc/destroy/abort functions between user-managed HWPTs
and kernel-managed HWPTs
- Rework invalidate uAPI to be a multi-request array-based design
- Add a struct iommu_user_data_array and a helper for driver to sanitize
and copy the entry data from user space invalidation array
- Add a patch fixing TEST_LENGTH() in selftest program
- Drop IOMMU_RESV_IOVA_RANGES patches
- Update kdoc and inline comments
- Drop the code to add IOMMU_RESV_SW_MSI to kernel-managed HWPT in nested translation,
this does not change the rule that resv regions should only be added to the
kernel-managed HWPT. The IOMMU_RESV_SW_MSI stuff will be added in later series
as it is needed only by SMMU so far.
v3: https://lore.kernel.org/linux-iommu/20230724110406.107212-1-yi.l.liu@intel.…
- Add new uAPI things in alphabetical order
- Pass in "enum iommu_hwpt_type hwpt_type" to op->domain_alloc_user for
sanity, replacing the previous op->domain_alloc_user_data_len solution
- Return ERR_PTR from domain_alloc_user instead of NULL
- Only add IOMMU_RESV_SW_MSI to kernel-managed HWPT in nested translation (Kevin)
- Add IOMMU_RESV_IOVA_RANGES to report resv iova ranges to userspace hence
userspace is able to exclude the ranges in the stage-1 HWPT (e.g. guest I/O
page table). (Kevin)
- Add selftest coverage for the new IOMMU_RESV_IOVA_RANGES ioctl
- Minor changes per Kevin's inputs
v2: https://lore.kernel.org/linux-iommu/20230511143844.22693-1-yi.l.liu@intel.c…
- Add union iommu_domain_user_data to include all user data structures to avoid
passing void * in kernel APIs.
- Add iommu op to return user data length for user domain allocation
- Rename struct iommu_hwpt_alloc::data_type to be hwpt_type
- Store the invalidation data length in iommu_domain_ops::cache_invalidate_user_data_len
- Convert cache_invalidate_user op to be int instead of void
- Remove @data_type in struct iommu_hwpt_invalidate
- Remove out_hwpt_type_bitmap in struct iommu_hw_info hence drop patch 08 of v1
v1: https://lore.kernel.org/linux-iommu/20230309080910.607396-1-yi.l.liu@intel.…
Thanks,
Yi Liu
Lu Baolu (4):
iommu: Add cache_invalidate_user op
iommu/vt-d: Allow qi_submit_sync() to return the QI faults
iommu/vt-d: Convert pasid based cache invalidation to return QI fault
iommu/vt-d: Add iotlb flush for nested domain
Nicolin Chen (4):
iommu: Add iommu_copy_struct_from_user_array helper
iommufd/selftest: Add mock_domain_cache_invalidate_user support
iommufd/selftest: Add IOMMU_TEST_OP_MD_CHECK_IOTLB test op
iommufd/selftest: Add coverage for IOMMU_HWPT_INVALIDATE ioctl
Yi Liu (1):
iommufd: Add IOMMU_HWPT_INVALIDATE
drivers/iommu/intel/dmar.c | 36 ++--
drivers/iommu/intel/iommu.c | 12 +-
drivers/iommu/intel/iommu.h | 8 +-
drivers/iommu/intel/irq_remapping.c | 2 +-
drivers/iommu/intel/nested.c | 116 ++++++++++++
drivers/iommu/intel/pasid.c | 14 +-
drivers/iommu/intel/svm.c | 14 +-
drivers/iommu/iommufd/hw_pagetable.c | 36 ++++
drivers/iommu/iommufd/iommufd_private.h | 10 ++
drivers/iommu/iommufd/iommufd_test.h | 39 ++++
drivers/iommu/iommufd/main.c | 3 +
drivers/iommu/iommufd/selftest.c | 93 ++++++++++
include/linux/iommu.h | 101 +++++++++++
include/uapi/linux/iommufd.h | 100 +++++++++++
tools/testing/selftests/iommu/iommufd.c | 170 ++++++++++++++++++
tools/testing/selftests/iommu/iommufd_utils.h | 57 ++++++
16 files changed, 773 insertions(+), 38 deletions(-)
--
2.34.1
Patch 1 is a cleanup one: mptcp_is_tcpsk() helper was modifying sock_ops
in some cases which is unexpected with that name.
Patch 2 to 4 add support for two socket options: IP_LOCAL_PORT_RANGE and
IP_BIND_ADDRESS_NO_PORT. The first one is a preparation patch, the
second one adds the support while the last one modifies an existing
selftest to validate the new features.
Signed-off-by: Matthieu Baerts <matttbe(a)kernel.org>
---
Davide Caratti (1):
mptcp: don't overwrite sock_ops in mptcp_is_tcpsk()
Maxim Galaganov (3):
mptcp: rename mptcp_setsockopt_sol_ip_set_transparent()
mptcp: sockopt: support IP_LOCAL_PORT_RANGE and IP_BIND_ADDRESS_NO_PORT
selftests/net: add MPTCP coverage for IP_LOCAL_PORT_RANGE
net/mptcp/protocol.c | 108 +++++++++-------------
net/mptcp/sockopt.c | 27 +++++-
tools/testing/selftests/net/ip_local_port_range.c | 12 +++
3 files changed, 79 insertions(+), 68 deletions(-)
---
base-commit: 62ed78f3baff396bd928ee77077580c5aa940149
change-id: 20231219-upstream-net-next-20231219-mptcp-sockopts-ephemeral-ports-645522e83161
Best regards,
--
Matthieu Baerts <matttbe(a)kernel.org>
From: Maxim Mikityanskiy <maxim(a)isovalent.com>
The goal of this series is to extend the verifier's capabilities of
tracking scalars when they are spilled to stack, especially when the
spill or fill is narrowing. It also contains a fix by Eduard for
infinite loop detection and a state pruning optimization by Eduard that
compensates for a verification complexity regression introduced by
tracking unbounded scalars. These improvements reduce the surface of
false rejections that I saw while working on Cilium codebase.
Patch 1 (Maxim): Fix for an existing test, it will matter later in the
series.
Patches 2-3 (Eduard): Fixes for false rejections in infinite loop
detection that happen in the selftests when my patches are applied.
Patches 4-5 (Maxim): Fix the inconsistency of find_equal_scalars that
was possible if 32-bit spills were made.
Patches 6-11 (Maxim): Support the case when boundary checks are first
performed after the register was spilled to the stack.
Patches 12-13 (Maxim): Support narrowing fills.
Patches 14-15 (Eduard): Optimization for state pruning in stacksafe() to
mitigate the verification complexity regression.
veristat -e file,prog,states -f '!states_diff<50' -f '!states_pct<10' -f '!states_a<10' -f '!states_b<10' -C ...
* Without patch 14:
File Program States (A) States (B) States (DIFF)
-------------------- ------------ ---------- ---------- ----------------
bpf_xdp.o tail_lb_ipv6 3877 2936 -941 (-24.27%)
pyperf180.bpf.o on_event 8422 10456 +2034 (+24.15%)
pyperf600.bpf.o on_event 22259 37319 +15060 (+67.66%)
pyperf600_iter.bpf.o on_event 400 540 +140 (+35.00%)
strobemeta.bpf.o on_event 4702 13435 +8733 (+185.73%)
* With patch 14:
File Program States (A) States (B) States (DIFF)
-------------------- ------------ ---------- ---------- --------------
bpf_xdp.o tail_lb_ipv6 3877 2937 -940 (-24.25%)
pyperf600_iter.bpf.o on_event 400 500 +100 (+25.00%)
Eduard Zingerman (4):
bpf: make infinite loop detection in is_state_visited() exact
selftests/bpf: check if imprecise stack spills confuse infinite loop
detection
bpf: Optimize state pruning for spilled scalars
selftests/bpf: states pruning checks for scalar vs STACK_{MISC,ZERO}
Maxim Mikityanskiy (11):
selftests/bpf: Fix the u64_offset_to_skb_data test
bpf: Make bpf_for_each_spilled_reg consider narrow spills
selftests/bpf: Add a test case for 32-bit spill tracking
bpf: Add the assign_scalar_id_before_mov function
bpf: Add the get_reg_width function
bpf: Assign ID to scalars on spill
selftests/bpf: Test assigning ID to scalars on spill
bpf: Track spilled unbounded scalars
selftests/bpf: Test tracking spilled unbounded scalars
bpf: Preserve boundaries and track scalars on narrowing fill
selftests/bpf: Add test cases for narrowing fill
include/linux/bpf_verifier.h | 2 +-
kernel/bpf/verifier.c | 160 +++++-
.../bpf/progs/verifier_direct_packet_access.c | 2 +-
.../selftests/bpf/progs/verifier_loops1.c | 24 +
.../selftests/bpf/progs/verifier_spill_fill.c | 529 +++++++++++++++++-
.../testing/selftests/bpf/verifier/precise.c | 6 +-
6 files changed, 677 insertions(+), 46 deletions(-)
--
2.42.1
The KUnit device helpers are documented with kerneldoc in their header
file, but also have short comments over their implementation. These were
mistakenly formatted as kerneldoc comments, even though they're not
valid kerneldoc. It shouldn't cause any serious problems -- this file
isn't included in the docs -- but it could be confusing, and causes
warnings.
Remove the extra '*' so that these aren't treated as kerneldoc.
Fixes: d03c720e03bd ("kunit: Add APIs for managing devices")
Reported-by: kernel test robot <lkp(a)intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202312181920.H4EPAH20-lkp@intel.com/
Signed-off-by: David Gow <davidgow(a)google.com>
---
lib/kunit/device.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/lib/kunit/device.c b/lib/kunit/device.c
index 1db4305b615a..f5371287b375 100644
--- a/lib/kunit/device.c
+++ b/lib/kunit/device.c
@@ -60,7 +60,7 @@ static void kunit_device_release(struct device *d)
kfree(to_kunit_device(d));
}
-/**
+/*
* Create and register a KUnit-managed struct device_driver on the kunit_bus.
* Returns an error pointer on failure.
*/
@@ -124,7 +124,7 @@ static struct kunit_device *kunit_device_register_internal(struct kunit *test,
return kunit_dev;
}
-/**
+/*
* Create and register a new KUnit-managed device, using the user-supplied device_driver.
* On failure, returns an error pointer.
*/
@@ -141,7 +141,7 @@ struct device *kunit_device_register_with_driver(struct kunit *test,
}
EXPORT_SYMBOL_GPL(kunit_device_register_with_driver);
-/**
+/*
* Create and register a new KUnit-managed device, including a matching device_driver.
* On failure, returns an error pointer.
*/
--
2.43.0.472.g3155946c3a-goog
Here is the last part of converting net selftests to run in unique namespace.
This part converts all left tests. After the conversion, we can run the net
sleftests in parallel. e.g.
# ./run_kselftest.sh -n -t net:reuseport_bpf
TAP version 13
1..1
# selftests: net: reuseport_bpf
ok 1 selftests: net: reuseport_bpf
mod 10...
# Socket 0: 0
# Socket 1: 1
...
# Socket 4: 19
# Testing filter add without bind...
# SUCCESS
# ./run_kselftest.sh -p -n -t net:cmsg_so_mark.sh -t net:cmsg_time.sh -t net:cmsg_ipv6.sh
TAP version 13
1..3
# selftests: net: cmsg_so_mark.sh
ok 1 selftests: net: cmsg_so_mark.sh
# selftests: net: cmsg_time.sh
ok 2 selftests: net: cmsg_time.sh
# selftests: net: cmsg_ipv6.sh
ok 3 selftests: net: cmsg_ipv6.sh
# ./run_kselftest.sh -p -n -c net
TAP version 13
1..95
# selftests: net: reuseport_bpf_numa
ok 3 selftests: net: reuseport_bpf_numa
# selftests: net: reuseport_bpf_cpu
ok 2 selftests: net: reuseport_bpf_cpu
# selftests: net: sk_bind_sendto_listen
ok 9 selftests: net: sk_bind_sendto_listen
# selftests: net: reuseaddr_conflict
ok 5 selftests: net: reuseaddr_conflict
...
Here is the part 1 link:
https://lore.kernel.org/netdev/20231202020110.362433-1-liuhangbin@gmail.com
part 2 link:
https://lore.kernel.org/netdev/20231206070801.1691247-1-liuhangbin@gmail.com
part 3 link:
https://lore.kernel.org/netdev/20231213060856.4030084-1-liuhangbin@gmail.com
Hangbin Liu (8):
selftests/net: convert gre_gso.sh to run it in unique namespace
selftests/net: convert netns-name.sh to run it in unique namespace
selftests/net: convert rtnetlink.sh to run it in unique namespace
selftests/net: convert stress_reuseport_listen.sh to run it in unique
namespace
selftests/net: convert xfrm_policy.sh to run it in unique namespace
selftests/net: use unique netns name for setup_loopback.sh
setup_veth.sh
selftests/net: convert pmtu.sh to run it in unique namespace
kselftest/runner.sh: add netns support
tools/testing/selftests/kselftest/runner.sh | 38 ++++-
tools/testing/selftests/net/gre_gso.sh | 18 +--
tools/testing/selftests/net/gro.sh | 4 +-
tools/testing/selftests/net/netns-name.sh | 44 +++---
tools/testing/selftests/net/pmtu.sh | 27 ++--
tools/testing/selftests/net/rtnetlink.sh | 34 +++--
tools/testing/selftests/net/setup_loopback.sh | 8 +-
tools/testing/selftests/net/setup_veth.sh | 9 +-
.../selftests/net/stress_reuseport_listen.sh | 6 +-
tools/testing/selftests/net/toeplitz.sh | 14 +-
tools/testing/selftests/net/xfrm_policy.sh | 138 +++++++++---------
tools/testing/selftests/run_kselftest.sh | 10 +-
12 files changed, 193 insertions(+), 157 deletions(-)
--
2.43.0
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
As there were bugs found with the ownership of eventfs dynamic file
creation. Add a test to test it.
It will remount tracefs with a different gid and check the ownership of
the eventfs directory, as well as the system and event directories. It
will also check the event file directories.
It then does a chgrp on each of these as well to see if they all get
updated as expected.
Then it remounts the tracefs file system back to the original group and
makes sure that all the updated files and directories were reset back to
the original ownership.
It does the same for instances that change the ownership of he instance
directory.
Note, because the uid is not reset by a remount, it is tested for every
file by switching it to a new owner and then back again.
Acked-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Tested-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
Changes since v3: https://lore.kernel.org/linux-trace-kernel/20231221211229.13398ef3@gandalf.…
- Added missing SPDX and removed exec permission from file (Shuah Khan)
.../ftrace/test.d/00basic/test_ownership.tc | 114 ++++++++++++++++++
1 file changed, 114 insertions(+)
create mode 100644 tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc
diff --git a/tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc b/tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc
new file mode 100644
index 000000000000..add7d5bf585d
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc
@@ -0,0 +1,114 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Test file and directory owership changes for eventfs
+
+original_group=`stat -c "%g" .`
+original_owner=`stat -c "%u" .`
+
+mount_point=`stat -c '%m' .`
+mount_options=`mount | grep "$mount_point" | sed -e 's/.*(\(.*\)).*/\1/'`
+
+# find another owner and group that is not the original
+other_group=`tac /etc/group | grep -v ":$original_group:" | head -1 | cut -d: -f3`
+other_owner=`tac /etc/passwd | grep -v ":$original_owner:" | head -1 | cut -d: -f3`
+
+# Remove any group ownership already
+new_options=`echo "$mount_options" | sed -e "s/gid=[0-9]*/gid=$other_group/"`
+
+if [ "$new_options" = "$mount_options" ]; then
+ new_options="$mount_options,gid=$other_group"
+ mount_options="$mount_options,gid=$original_group"
+fi
+
+canary="events/timer events/timer/timer_cancel events/timer/timer_cancel/format"
+
+test() {
+ file=$1
+ test_group=$2
+
+ owner=`stat -c "%u" $file`
+ group=`stat -c "%g" $file`
+
+ echo "testing $file $owner=$original_owner and $group=$test_group"
+ if [ $owner -ne $original_owner ]; then
+ exit_fail
+ fi
+ if [ $group -ne $test_group ]; then
+ exit_fail
+ fi
+
+ # Note, the remount does not update ownership so test going to and from owner
+ echo "test owner $file to $other_owner"
+ chown $other_owner $file
+ owner=`stat -c "%u" $file`
+ if [ $owner -ne $other_owner ]; then
+ exit_fail
+ fi
+
+ chown $original_owner $file
+ owner=`stat -c "%u" $file`
+ if [ $owner -ne $original_owner ]; then
+ exit_fail
+ fi
+
+}
+
+run_tests() {
+ for d in "." "events" "events/sched" "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events
+ test "events" $original_group
+ for d in "." "events/sched" "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events/sched
+ test "events/sched" $original_group
+ for d in "." "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events/sched/sched_switch
+ test "events/sched/sched_switch" $original_group
+ for d in "." "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events/sched/sched_switch/enable
+ test "events/sched/sched_switch/enable" $original_group
+ for d in "." $canary; do
+ test "$d" $other_group
+ done
+}
+
+mount -o remount,"$new_options" .
+
+run_tests
+
+mount -o remount,"$mount_options" .
+
+for d in "." "events" "events/sched" "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $original_group
+done
+
+# check instances as well
+
+chgrp $other_group instances
+
+instance="$(mktemp -u test-XXXXXX)"
+
+mkdir instances/$instance
+
+cd instances/$instance
+
+run_tests
+
+cd ../..
+
+rmdir instances/$instance
+
+chgrp $original_group instances
+
+exit 0
--
2.42.0
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
As there were bugs found with the ownership of eventfs dynamic file
creation. Add a test to test it.
It will remount tracefs with a different gid and check the ownership of
the eventfs directory, as well as the system and event directories. It
will also check the event file directories.
It then does a chgrp on each of these as well to see if they all get
updated as expected.
Then it remounts the tracefs file system back to the original group and
makes sure that all the updated files and directories were reset back to
the original ownership.
It does the same for instances that change the ownership of he instance
directory.
Note, because the uid is not reset by a remount, it is tested for every
file by switching it to a new owner and then back again.
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
Changes since v2: https://lore.kernel.org/linux-trace-kernel/20231221194516.53e1ee43@gandalf.…
- Changed the instance test name from "foo-$(mktemp -u XXXXX)" to
"$(mktemp -u test-XXXXXX)" as Masami reported that busybox mktemp only
works with 6 Xs and not 5. Also changed "foo" to "test" and placed it
into the mktemp format.
.../ftrace/test.d/00basic/test_ownership.tc | 113 ++++++++++++++++++
1 file changed, 113 insertions(+)
create mode 100755 tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc
diff --git a/tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc b/tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc
new file mode 100755
index 000000000000..4c20be3a714a
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc
@@ -0,0 +1,113 @@
+#!/bin/sh
+# description: Test file and directory owership changes for eventfs
+
+original_group=`stat -c "%g" .`
+original_owner=`stat -c "%u" .`
+
+mount_point=`stat -c '%m' .`
+mount_options=`mount | grep "$mount_point" | sed -e 's/.*(\(.*\)).*/\1/'`
+
+# find another owner and group that is not the original
+other_group=`tac /etc/group | grep -v ":$original_group:" | head -1 | cut -d: -f3`
+other_owner=`tac /etc/passwd | grep -v ":$original_owner:" | head -1 | cut -d: -f3`
+
+# Remove any group ownership already
+new_options=`echo "$mount_options" | sed -e "s/gid=[0-9]*/gid=$other_group/"`
+
+if [ "$new_options" = "$mount_options" ]; then
+ new_options="$mount_options,gid=$other_group"
+ mount_options="$mount_options,gid=$original_group"
+fi
+
+canary="events/timer events/timer/timer_cancel events/timer/timer_cancel/format"
+
+test() {
+ file=$1
+ test_group=$2
+
+ owner=`stat -c "%u" $file`
+ group=`stat -c "%g" $file`
+
+ echo "testing $file $owner=$original_owner and $group=$test_group"
+ if [ $owner -ne $original_owner ]; then
+ exit_fail
+ fi
+ if [ $group -ne $test_group ]; then
+ exit_fail
+ fi
+
+ # Note, the remount does not update ownership so test going to and from owner
+ echo "test owner $file to $other_owner"
+ chown $other_owner $file
+ owner=`stat -c "%u" $file`
+ if [ $owner -ne $other_owner ]; then
+ exit_fail
+ fi
+
+ chown $original_owner $file
+ owner=`stat -c "%u" $file`
+ if [ $owner -ne $original_owner ]; then
+ exit_fail
+ fi
+
+}
+
+run_tests() {
+ for d in "." "events" "events/sched" "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events
+ test "events" $original_group
+ for d in "." "events/sched" "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events/sched
+ test "events/sched" $original_group
+ for d in "." "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events/sched/sched_switch
+ test "events/sched/sched_switch" $original_group
+ for d in "." "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events/sched/sched_switch/enable
+ test "events/sched/sched_switch/enable" $original_group
+ for d in "." $canary; do
+ test "$d" $other_group
+ done
+}
+
+mount -o remount,"$new_options" .
+
+run_tests
+
+mount -o remount,"$mount_options" .
+
+for d in "." "events" "events/sched" "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $original_group
+done
+
+# check instances as well
+
+chgrp $other_group instances
+
+instance="$(mktemp -u test-XXXXXX)"
+
+mkdir instances/$instance
+
+cd instances/$instance
+
+run_tests
+
+cd ../..
+
+rmdir instances/$instance
+
+chgrp $original_group instances
+
+exit 0
--
2.42.0
This series implements support for SME use in non-protected KVM guests.
Much of this is very similar to SVE, the main additional challenge that
SME presents is that it introduces two new controls which change the
registers seen by guests:
- PSTATE.ZA enables the ZA matrix register and, if SME2 is supported,
the ZT0 LUT register.
- PSTATE.SM enables streaming mode, a new floating point mode which
uses the SVE register set with a separately configured vector length.
In streaming mode implementation of the FFR register is optional.
It is also permitted to build systems which support SME without SVE, in
this case when not in streaming mode no SVE registers or instructions
are available. Further, there is no requirement that there be any
overlap in the set of vector lengths supported by SVE and SME in a
system, this is expected to be a common situation in practical systems.
Since there is a new vector length to configure we introduce a new
feature parallel to the existing SVE one with a new pseudo register for
the streaming mode vector length. Due to the overlap with SVE caused by
streaming mode rather than finalising SME as a separate feature we use
the existing SVE finalisation to also finalise SME, a new define
KVM_ARM_VCPU_VEC is provided to help make user code clearer. Finalising
SVE and SME separately would introduce complication with register access
since finalising SVE makes the SVE regsiters writeable by userspace and
doing multiple finalisations results in an error being reported.
Dealing with a state where the SVE registers are writeable due to one of
SVE or SME being finalised but may have their VL changed by the other
being finalised seems like needless complexity with minimal practical
utility, it seems clearer to just express directly that only one
finalisation can be done in the ABI.
We represent the streaming mode registers to userspace by always using
the existing SVE registers to access the floating point state, using the
larger of the SME and (if enabled for the guest) SVE vector lengths.
There are a large number of subfeatures for SME, most of which only
offer additional instructions but some of which (SME2 and FA64) add
architectural state. The expectation is that these will be configured
via the ID registers but since the mechanism for doing this is still
unclear the current code enables SME2 and FA64 for the guest if the host
supports them regardless of what the ID registers say.
Since we do not yet have support for SVE in protected guests and SME is
very reliant on SVE this series does not implement support for SME in
protected guests. This will be added separately once SVE support is
merged into mainline (or along with merging that), there is code for
protected guests using SVE in the Android tree.
The new KVM_ARM_VCPU_VEC feature and ZA and ZT0 registers have not been
added to the get-reg-list selftest, the idea of supporting additional
features there without restructuring the program to generate all
possible feature combinations has been rejected. I will post a separate
series which does that restructuring.
I am seeing some test failures currently which I've not got to the
bottom of, at this point I'm reasonably sure these are preexisting
issues in the kernel which are more apparent in a guest.
To: Marc Zyngier <maz(a)kernel.org>
To: Oliver Upton <oliver.upton(a)linux.dev>
To: James Morse <james.morse(a)arm.com>
To: Suzuki K Poulose <suzuki.poulose(a)arm.com>
To: Catalin Marinas <catalin.marinas(a)arm.com>
To: Will Deacon <will(a)kernel.org>
Cc: <linux-arm-kernel(a)lists.infradead.org>
Cc: <kvmarm(a)lists.linux.dev>
Cc: <linux-kernel(a)vger.kernel.org>
To: Paolo Bonzini <pbonzini(a)redhat.com>
To: Jonathan Corbet <corbet(a)lwn.net>
Cc: <kvm(a)vger.kernel.org>
Cc: <linux-doc(a)vger.kernel.org>
To: Shuah Khan <shuah(a)kernel.org>
Cc: <linux-kselftest(a)vger.kernel.org>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Changes in v2:
- Rebase onto v6.7-rc3.
- Configure subfeatures based on host system only.
- Complete nVHE support.
- There was some snafu with sending v1 out, it didn't make it to the
lists but in case it hit people's inboxes I'm sending as v2.
---
Mark Brown (22):
KVM: arm64: Document why we trap SVE access from the host
arm64/fpsimd: Make SVE<->FPSIMD rewriting available to KVM
KVM: arm64: Move SVE state access macros after feature test macros
KVM: arm64: Store vector lengths in an array
KVM: arm64: Document the KVM ABI for SME
KVM: arm64: Make FFR restore optional in __sve_restore_state()
KVM: arm64: Define guest flags for SME
KVM: arm64: Rename SVE finalization constants to be more general
KVM: arm64: Basic SME system register descriptions
KVM: arm64: Add support for TPIDR2_EL0
KVM: arm64: Make SMPRI_EL1 RES0 for SME guests
KVM: arm64: Make SVCR a normal system register
KVM: arm64: Context switch SME state for guest
KVM: arm64: Manage and handle SME traps
KVM: arm64: Implement SME vector length configuration
KVM: arm64: Rename sve_state_reg_region
KVM: arm64: Support userspace access to streaming mode SVE registers
KVM: arm64: Expose ZA to userspace
KVM: arm64: Provide userspace access to ZT0
KVM: arm64: Support SME version configuration via ID registers
KVM: arm64: Provide userspace ABI for enabling SME
KVM: arm64: selftests: Add SME system registers to get-reg-list
Documentation/virt/kvm/api.rst | 104 +++++---
arch/arm64/include/asm/fpsimd.h | 5 +
arch/arm64/include/asm/kvm_emulate.h | 13 +-
arch/arm64/include/asm/kvm_host.h | 99 +++++---
arch/arm64/include/asm/kvm_hyp.h | 3 +-
arch/arm64/include/uapi/asm/kvm.h | 33 +++
arch/arm64/kernel/fpsimd.c | 51 +++-
arch/arm64/kvm/arm.c | 16 +-
arch/arm64/kvm/fpsimd.c | 266 ++++++++++++++++++---
arch/arm64/kvm/guest.c | 230 +++++++++++++++---
arch/arm64/kvm/handle_exit.c | 11 +
arch/arm64/kvm/hyp/fpsimd.S | 11 +-
arch/arm64/kvm/hyp/include/hyp/switch.h | 86 ++++++-
arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h | 16 ++
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 60 ++++-
arch/arm64/kvm/hyp/nvhe/switch.c | 13 +-
arch/arm64/kvm/hyp/vhe/switch.c | 3 +
arch/arm64/kvm/reset.c | 150 +++++++++---
arch/arm64/kvm/sys_regs.c | 67 +++++-
include/uapi/linux/kvm.h | 1 +
tools/testing/selftests/kvm/aarch64/get-reg-list.c | 32 ++-
21 files changed, 1063 insertions(+), 207 deletions(-)
---
base-commit: 4ae6e89253b387476c2ba0202c3a80f2e1284e91
change-id: 20230301-kvm-arm64-sme-06a1246d3636
Best regards,
--
Mark Brown <broonie(a)kernel.org>
Swap the arguments to typecheck_fn() in kunit_activate_static_stub()
so that real_fn_addr can be either the function itself or a pointer
to that function.
This is useful to simplify redirecting static functions in a module.
Having to pass the actual function meant that it must be exported
from the module. Either making the 'static' and EXPORT_SYMBOL*()
conditional (which makes the code messy), or change it to always
exported (which increases the export namespace and prevents the
compiler inlining a trivial stub function in non-test builds).
With the original definition of kunit_activate_static_stub() the
address of real_fn_addr was passed to typecheck_fn() as the type to
be passed. This meant that if real_fn_addr was a pointer-to-function
it would resolve to a ** instead of a *, giving an error like this:
error: initialization of ‘int (**)(int)’ from incompatible pointer
type ‘int (*)(int)’ [-Werror=incompatible-pointer-types]
kunit_activate_static_stub(test, add_one_fn_ptr, subtract_one);
| ^~~~~~~~~~~~
./include/linux/typecheck.h:21:25: note: in definition of macro
‘typecheck_fn’
21 | ({ typeof(type) __tmp = function; \
Swapping the arguments to typecheck_fn makes it take the type of a
pointer to the replacement function. Either a function or a pointer
to function can be assigned to that. For example:
static int some_function(int x)
{
/* whatever */
}
int (* some_function_ptr)(int) = some_function;
static int replacement(int x)
{
/* whatever */
}
Then:
kunit_activate_static_stub(test, some_function, replacement);
yields:
typecheck_fn(typeof(&replacement), some_function);
and:
kunit_activate_static_stub(test, some_function_ptr, replacement);
yields:
typecheck_fn(typeof(&replacement), some_function_ptr);
The two typecheck_fn() then resolve to:
int (*__tmp)(int) = some_function;
and
int (*__tmp)(int) = some_function_ptr;
Both of these are valid. In the first case the compiler inserts
an implicit '&' to take the address of the supplied function, and
in the second case the RHS is already a pointer to the same type.
Signed-off-by: Richard Fitzgerald <rf(a)opensource.cirrus.com>
Reviewed-by: Rae Moar <rmoar(a)google.com>
---
No changes since V1.
---
include/kunit/static_stub.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/kunit/static_stub.h b/include/kunit/static_stub.h
index 85315c80b303..bf940322dfc0 100644
--- a/include/kunit/static_stub.h
+++ b/include/kunit/static_stub.h
@@ -93,7 +93,7 @@ void __kunit_activate_static_stub(struct kunit *test,
* The redirection can be disabled again with kunit_deactivate_static_stub().
*/
#define kunit_activate_static_stub(test, real_fn_addr, replacement_addr) do { \
- typecheck_fn(typeof(&real_fn_addr), replacement_addr); \
+ typecheck_fn(typeof(&replacement_addr), real_fn_addr); \
__kunit_activate_static_stub(test, real_fn_addr, replacement_addr); \
} while (0)
--
2.30.2
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
As there were bugs found with the ownership of eventfs dynamic file
creation. Add a test to test it.
It will remount tracefs with a different gid and check the ownership of
the eventfs directory, as well as the system and event directories. It
will also check the event file directories.
It then does a chgrp on each of these as well to see if they all get
updated as expected.
Then it remounts the tracefs file system back to the original group and
makes sure that all the updated files and directories were reset back to
the original ownership.
It does the same for instances that change the ownership of he instance
directory.
Note, because the uid is not reset by a remount, it is tested for every
file by switching it to a new owner and then back again.
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
Changes since v1: https://lore.kernel.org/linux-trace-kernel/20231221193551.13a0b7bd@gandalf.…
- Fixed a cut and paste error of using $original_group for finding another uid
.../ftrace/test.d/00basic/test_ownership.tc | 113 ++++++++++++++++++
1 file changed, 113 insertions(+)
create mode 100755 tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc
diff --git a/tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc b/tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc
new file mode 100755
index 000000000000..83cbd116d06b
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc
@@ -0,0 +1,113 @@
+#!/bin/sh
+# description: Test file and directory owership changes for eventfs
+
+original_group=`stat -c "%g" .`
+original_owner=`stat -c "%u" .`
+
+mount_point=`stat -c '%m' .`
+mount_options=`mount | grep "$mount_point" | sed -e 's/.*(\(.*\)).*/\1/'`
+
+# find another owner and group that is not the original
+other_group=`tac /etc/group | grep -v ":$original_group:" | head -1 | cut -d: -f3`
+other_owner=`tac /etc/passwd | grep -v ":$original_owner:" | head -1 | cut -d: -f3`
+
+# Remove any group ownership already
+new_options=`echo "$mount_options" | sed -e "s/gid=[0-9]*/gid=$other_group/"`
+
+if [ "$new_options" = "$mount_options" ]; then
+ new_options="$mount_options,gid=$other_group"
+ mount_options="$mount_options,gid=$original_group"
+fi
+
+canary="events/timer events/timer/timer_cancel events/timer/timer_cancel/format"
+
+test() {
+ file=$1
+ test_group=$2
+
+ owner=`stat -c "%u" $file`
+ group=`stat -c "%g" $file`
+
+ echo "testing $file $owner=$original_owner and $group=$test_group"
+ if [ $owner -ne $original_owner ]; then
+ exit_fail
+ fi
+ if [ $group -ne $test_group ]; then
+ exit_fail
+ fi
+
+ # Note, the remount does not update ownership so test going to and from owner
+ echo "test owner $file to $other_owner"
+ chown $other_owner $file
+ owner=`stat -c "%u" $file`
+ if [ $owner -ne $other_owner ]; then
+ exit_fail
+ fi
+
+ chown $original_owner $file
+ owner=`stat -c "%u" $file`
+ if [ $owner -ne $original_owner ]; then
+ exit_fail
+ fi
+
+}
+
+run_tests() {
+ for d in "." "events" "events/sched" "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events
+ test "events" $original_group
+ for d in "." "events/sched" "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events/sched
+ test "events/sched" $original_group
+ for d in "." "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events/sched/sched_switch
+ test "events/sched/sched_switch" $original_group
+ for d in "." "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events/sched/sched_switch/enable
+ test "events/sched/sched_switch/enable" $original_group
+ for d in "." $canary; do
+ test "$d" $other_group
+ done
+}
+
+mount -o remount,"$new_options" .
+
+run_tests
+
+mount -o remount,"$mount_options" .
+
+for d in "." "events" "events/sched" "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $original_group
+done
+
+# check instances as well
+
+chgrp $other_group instances
+
+instance="foo-$(mktemp -u XXXXX)"
+
+mkdir instances/$instance
+
+cd instances/$instance
+
+run_tests
+
+cd ../..
+
+rmdir instances/$instance
+
+chgrp $original_group instances
+
+exit 0
--
2.42.0
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
As there were bugs found with the ownership of eventfs dynamic file
creation. Add a test to test it.
It will remount tracefs with a different gid and check the ownership of
the eventfs directory, as well as the system and event directories. It
will also check the event file directories.
It then does a chgrp on each of these as well to see if they all get
updated as expected.
Then it remounts the tracefs file system back to the original group and
makes sure that all the updated files and directories were reset back to
the original ownership.
It does the same for instances that change the ownership of he instance
directory.
Note, because the uid is not reset by a remount, it is tested for every
file by switching it to a new owner and then back again.
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
.../ftrace/test.d/00basic/test_ownership.tc | 113 ++++++++++++++++++
1 file changed, 113 insertions(+)
create mode 100755 tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc
diff --git a/tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc b/tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc
new file mode 100755
index 000000000000..de8cdf6f207b
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc
@@ -0,0 +1,113 @@
+#!/bin/sh
+# description: Test file and directory owership changes for eventfs
+
+original_group=`stat -c "%g" .`
+original_owner=`stat -c "%u" .`
+
+mount_point=`stat -c '%m' .`
+mount_options=`mount | grep "$mount_point" | sed -e 's/.*(\(.*\)).*/\1/'`
+
+# find another owner and group that is not the original
+other_group=`tac /etc/group | grep -v ":$original_group:" | head -1 | cut -d: -f3`
+other_owner=`tac /etc/passwd | grep -v ":$original_group:" | head -1 | cut -d: -f3`
+
+# Remove any group ownership already
+new_options=`echo "$mount_options" | sed -e "s/gid=[0-9]*/gid=$other_group/"`
+
+if [ "$new_options" = "$mount_options" ]; then
+ new_options="$mount_options,gid=$other_group"
+ mount_options="$mount_options,gid=$original_group"
+fi
+
+canary="events/timer events/timer/timer_cancel events/timer/timer_cancel/format"
+
+test() {
+ file=$1
+ test_group=$2
+
+ owner=`stat -c "%u" $file`
+ group=`stat -c "%g" $file`
+
+ echo "testing $file $owner=$original_owner and $group=$test_group"
+ if [ $owner -ne $original_owner ]; then
+ exit_fail
+ fi
+ if [ $group -ne $test_group ]; then
+ exit_fail
+ fi
+
+ # Note, the remount does not update ownership so test going to and from owner
+ echo "test owner $file to $other_owner"
+ chown $other_owner $file
+ owner=`stat -c "%u" $file`
+ if [ $owner -ne $other_owner ]; then
+ exit_fail
+ fi
+
+ chown $original_owner $file
+ owner=`stat -c "%u" $file`
+ if [ $owner -ne $original_owner ]; then
+ exit_fail
+ fi
+
+}
+
+run_tests() {
+ for d in "." "events" "events/sched" "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events
+ test "events" $original_group
+ for d in "." "events/sched" "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events/sched
+ test "events/sched" $original_group
+ for d in "." "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events/sched/sched_switch
+ test "events/sched/sched_switch" $original_group
+ for d in "." "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events/sched/sched_switch/enable
+ test "events/sched/sched_switch/enable" $original_group
+ for d in "." $canary; do
+ test "$d" $other_group
+ done
+}
+
+mount -o remount,"$new_options" .
+
+run_tests
+
+mount -o remount,"$mount_options" .
+
+for d in "." "events" "events/sched" "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $original_group
+done
+
+# check instances as well
+
+chgrp $other_group instances
+
+instance="foo-$(mktemp -u XXXXX)"
+
+mkdir instances/$instance
+
+cd instances/$instance
+
+run_tests
+
+cd ../..
+
+rmdir instances/$instance
+
+chgrp $original_group instances
+
+exit 0
--
2.42.0