- Linux-kselftest-mirror - lists.linaro.org

[PATCH v2 0/9] arm64: Support 2024 dpISA extensions

by Mark Brown

The 2024 architecture release includes a number of data processing extensions, mostly SVE and SME additions with a few others. These are all very straightforward extensions which add instructions but no architectural state so only need hwcaps and exposing of the ID registers to KVM guests and userspace. Signed-off-by: Mark Brown <broonie(a)kernel.org> --- Changes in v2: - Filter KVM guest visible bitfields in ID_AA64ISAR3_EL1 to only those we make writeable. - Link to v1: https://lore.kernel.org/r/20241028-arm64-2024-dpisa-v1-0-a38d08b008a8@kerne… --- Mark Brown (9): arm64/sysreg: Update ID_AA64PFR2_EL1 to DDI0601 2024-09 arm64/sysreg: Update ID_AA64ISAR3_EL1 to DDI0601 2024-09 arm64/sysreg: Update ID_AA64FPFR0_EL1 to DDI0601 2024-09 arm64/sysreg: Update ID_AA64ZFR0_EL1 to DDI0601 2024-09 arm64/sysreg: Update ID_AA64SMFR0_EL1 to DDI0601 2024-09 arm64/sysreg: Update ID_AA64ISAR2_EL1 to DDI0601 2024-09 arm64/hwcap: Describe 2024 dpISA extensions to userspace KVM: arm64: Allow control of dpISA extensions in ID_AA64ISAR3_EL1 kselftest/arm64: Add 2024 dpISA extensions to hwcap test Documentation/arch/arm64/elf_hwcaps.rst | 51 ++++++ arch/arm64/include/asm/hwcap.h | 17 ++ arch/arm64/include/uapi/asm/hwcap.h | 17 ++ arch/arm64/kernel/cpufeature.c | 35 ++++ arch/arm64/kernel/cpuinfo.c | 17 ++ arch/arm64/kvm/sys_regs.c | 6 +- arch/arm64/tools/sysreg | 87 +++++++++- tools/testing/selftests/arm64/abi/hwcap.c | 273 +++++++++++++++++++++++++++++- 8 files changed, 493 insertions(+), 10 deletions(-) --- base-commit: 8e929cb546ee42c9a61d24fae60605e9e3192354 change-id: 20241008-arm64-2024-dpisa-8091074a7f48 Best regards, -- Mark Brown <broonie(a)kernel.org>

1 year, 2 months

2
10
0 0

[PATCH v4 0/3] bpf: Using binary search to improve the performance of btf_find_by_name_kind

by Donglin Peng

Currently, we are only using the linear search method to find the type id by the name, which has a time complexity of O(n). This change involves sorting the names of btf types in ascending order and using binary search, which has a time complexity of O(log(n)). This idea was inspired by the following patch: 60443c88f3a8 ("kallsyms: Improve the performance of kallsyms_lookup_name()"). At present, this improvement is only for searching in vmlinux's and module's BTFs. Another change is the search direction, where we search the BTF first and then its base, the type id of the first matched btf_type will be returned. Here is a time-consuming result that finding 87590 type ids by their names in vmlinux's BTF. Before: 158426 ms After: 114 ms The average lookup performance has improved more than 1000x in the above scenario. v4: - Divide the patch into two parts: kernel and libbpf - Use Eduard's code to sort btf_types in the btf__dedup function - Correct some btf testcases due to modifications of the order of btf_types. v3: - Link: https://lore.kernel.org/all/20240608140835.965949-1-dolinux.peng@gmail.com/ - Sort btf_types during build process other than during boot, to reduce the overhead of memory and boot time. v2: - Link: https://lore.kernel.org/all/20230909091646.420163-1-pengdonglin@sangfor.com… Donglin Peng (3): libbpf: Sort btf_types in ascending order by name bpf: Using binary search to improve the performance of btf_find_by_name_kind libbpf: Using binary search to improve the performance of btf__find_by_name_kind include/linux/btf.h | 1 + kernel/bpf/btf.c | 157 +++++++++- tools/lib/bpf/btf.c | 274 +++++++++++++--- tools/testing/selftests/bpf/prog_tests/btf.c | 296 +++++++++--------- .../bpf/prog_tests/btf_dedup_split.c | 64 ++-- 5 files changed, 555 insertions(+), 237 deletions(-) -- 2.34.1

1 year, 2 months

2
12
0 0

[PATCH v3 0/4] vfio-pci support pasid attach/detach

by Yi Liu

This adds the pasid attach/detach uAPIs for userspace to attach/detach a PASID of a device to/from a given ioas/hwpt. Only vfio-pci driver is enabled in this series. After this series, PASID-capable devices bound with vfio-pci can report PASID capability to userspace and VM to enable PASID usages like Shared Virtual Addressing (SVA). Based on the discussion about reporting the vPASID to VM [1], it's agreed that we will let the userspace VMM to synthesize the vPASID capability. The VMM needs to figure out a hole to put the vPASID cap. This includes the hidden bits handling for some devices. While, it's up to the userspace, it's not the focus of this series. This series first adds the helpers for pasid attach in vfio core and then adds the device cdev ioctls for pasid attach/detach. In the end of this series, the IOMMU_GET_HW_INFO ioctl is extended to report the PCI PASID capability to the userspace. Userspace should check this before using any PASID related uAPIs provided by VFIO, which is the agreement in [2]. This series depends on the iommufd pasid attach/detach series [3]. The completed code can be found at [4], tested with a hacky Qemu branch [5]. [1] https://lore.kernel.org/kvm/BN9PR11MB5276318969A212AD0649C7BE8CBE2@BN9PR11M… [2] https://lore.kernel.org/kvm/4f2daf50-a5ad-4599-ab59-bcfc008688d8@intel.com/ [3] https://lore.kernel.org/linux-iommu/20240912131255.13305-1-yi.l.liu@intel.c… [4] https://github.com/yiliu1765/iommufd/tree/iommufd_pasid [5] https://github.com/yiliu1765/qemu/tree/wip/zhenzhong/iommufd_nesting_rfcv2-… Change log: v3: - Misc enhancement on patch 01 of v2 (Alex, Jason) - Add Jason's r-b to patch 03 of v2 - Drop the logic that report PASID via VFIO_DEVICE_FEATURE ioctl - Extend IOMMU_GET_HW_INFO to report PASID support (Kevin, Jason, Alex) v2: https://lore.kernel.org/kvm/20240412082121.33382-1-yi.l.liu@intel.com/ - Use IDA to track if PASID is attached or not in VFIO. (Jason) - Fix the issue of calling pasid_at[de]tach_ioas callback unconditionally (Alex) - Fix the wrong data copy in vfio_df_ioctl_pasid_detach_pt() (Zhenzhong) - Minor tweaks in comments (Kevin) v1: https://lore.kernel.org/kvm/20231127063909.129153-1-yi.l.liu@intel.com/ - Report PASID capability via VFIO_DEVICE_FEATURE (Alex) rfc: https://lore.kernel.org/linux-iommu/20230926093121.18676-1-yi.l.liu@intel.c… Regards, Yi Liu Yi Liu (4): ida: Add ida_find_first_range() vfio-iommufd: Support pasid [at|de]tach for physical VFIO devices vfio: Add VFIO_DEVICE_PASID_[AT|DE]TACH_IOMMUFD_PT iommufd: Extend IOMMU_GET_HW_INFO to report PASID capability drivers/iommu/iommufd/device.c | 27 +++++++++++++- drivers/pci/ats.c | 32 ++++++++++++++++ drivers/vfio/device_cdev.c | 51 ++++++++++++++++++++++++++ drivers/vfio/iommufd.c | 50 +++++++++++++++++++++++++ drivers/vfio/pci/vfio_pci.c | 2 + drivers/vfio/vfio.h | 4 ++ drivers/vfio/vfio_main.c | 8 ++++ include/linux/idr.h | 11 ++++++ include/linux/pci-ats.h | 3 ++ include/linux/vfio.h | 11 ++++++ include/uapi/linux/iommufd.h | 14 ++++++- include/uapi/linux/vfio.h | 55 ++++++++++++++++++++++++++++ lib/idr.c | 67 ++++++++++++++++++++++++++++++++++ 13 files changed, 333 insertions(+), 2 deletions(-) -- 2.34.1

1 year, 2 months

7
29
0 0

[PATCH net-next v6 0/3] Threads support in proc connector

by Anjali Kulkarni

Recently we committed a fix to allow processes to receive notifications for non-zero exits via the process connector module. Commit is a4c9a56e6a2c. However, for threads, when it does a pthread_exit(&exit_status) call, the kernel is not aware of the exit status with which pthread_exit is called. It is sent by child thread to the parent process, if it is waiting in pthread_join(). Hence, for a thread exiting abnormally, kernel cannot send notifications to any listening processes. The exception to this is if the thread is sent a signal which it has not handled, and dies along with it's process as a result; for eg. SIGSEGV or SIGKILL. In this case, kernel is aware of the non-zero exit and sends a notification for it. For our use case, we cannot have parent wait in pthread_join, one of the main reasons for this being that we do not want to track normal pthread_exit(), which could be a very large number. We only want to be notified of any abnormal exits. Hence, threads are created with pthread_attr_t set to PTHREAD_CREATE_DETACHED. To fix this problem, we add a new type PROC_CN_MCAST_NOTIFY to proc connector API, which allows a thread to send it's exit status to kernel either when it needs to call pthread_exit() with non-zero value to indicate some error or from signal handler before pthread_exit(). v5->v6 changes: - As suggested by Simon Horman, fixed style issues (some old) by running ./scripts/checkpatch.pl --strict --max-line-length=80 - Removed inline functions as suggested by Simon Horman - Added "depends on" in Kconfig.debug as suggested by Stanislav Fomichev - Removed warning while compilation with kernel space headers as suggested by Simon Horman - Removed "comm" field, will send separate patch for this. - Added kunit configs in tools/testing/kunit/configs/all_tests.config with it's dependencies. v4->v5 changes: - Handled comment by Stanislav Fomichev to fix a print format error. - Made thread.c completely automated by starting proc_filter program from within threads.c. - Changed name CONFIG_CN_HASH_KUNIT_TEST to CN_HASH_KUNIT_TEST in Kconfig.debug and changed display text. v3->v4 changes: - Reduce size of exit.log by removing unnecessary text. v2->v3 changes: - Handled comment by Liam Howlett to set hdev to NULL and add comment on it. - Handled comment by Liam Howlett to combine functions for deleting+get and deleting into one in cn_hash.c - Handled comment by Liam Howlett to remove extern in the functions defined in cn_hash_test.h - Some nits by Liam Howlett fixed. - Handled comment by Liam Howlett to make threads test automated. proc_filter.c creates exit.log, which is read by thread.c and checks the values reported. - Added "comm" field to struct proc_event, to copy the task's name to the packet to allow further filtering by packets. v1->v2 changes: - Handled comment by Peter Zijlstra to remove locking for PF_EXIT_NOTIFY task->flags. - Added error handling in thread.c v->v1 changes: - Handled comment by Simon Horman to remove unused err in cn_proc.c - Handled comment by Simon Horman to make adata and key_display static in cn_hash_test.c Anjali Kulkarni (3): connector/cn_proc: Add hash table for threads connector/cn_proc: Kunit tests for threads hash table connector/cn_proc: Selftest for threads drivers/connector/Makefile | 2 +- drivers/connector/cn_hash.c | 216 ++++++++++++++++ drivers/connector/cn_proc.c | 81 ++++-- drivers/connector/connector.c | 88 ++++++- include/linux/connector.h | 62 ++++- include/linux/sched.h | 2 +- include/uapi/linux/cn_proc.h | 4 +- lib/Kconfig.debug | 19 ++ lib/Makefile | 1 + lib/cn_hash_test.c | 169 +++++++++++++ lib/cn_hash_test.h | 10 + tools/testing/kunit/configs/all_tests.config | 5 + tools/testing/selftests/connector/Makefile | 25 ++ .../testing/selftests/connector/proc_filter.c | 63 +++-- tools/testing/selftests/connector/thread.c | 238 ++++++++++++++++++ .../selftests/connector/thread_filter.c | 96 +++++++ 16 files changed, 1027 insertions(+), 54 deletions(-) create mode 100644 drivers/connector/cn_hash.c create mode 100644 lib/cn_hash_test.c create mode 100644 lib/cn_hash_test.h create mode 100644 tools/testing/selftests/connector/thread.c create mode 100644 tools/testing/selftests/connector/thread_filter.c -- 2.46.0

1 year, 2 months

2
4
0 0

[PATCH RFT v11 0/8] fork: Support shadow stacks in clone3()

by Mark Brown

The kernel has recently added support for shadow stacks, currently x86 only using their CET feature but both arm64 and RISC-V have equivalent features (GCS and Zicfiss respectively), I am actively working on GCS[1]. With shadow stacks the hardware maintains an additional stack containing only the return addresses for branch instructions which is not generally writeable by userspace and ensures that any returns are to the recorded addresses. This provides some protection against ROP attacks and making it easier to collect call stacks. These shadow stacks are allocated in the address space of the userspace process. Our API for shadow stacks does not currently offer userspace any flexiblity for managing the allocation of shadow stacks for newly created threads, instead the kernel allocates a new shadow stack with the same size as the normal stack whenever a thread is created with the feature enabled. The stacks allocated in this way are freed by the kernel when the thread exits or shadow stacks are disabled for the thread. This lack of flexibility and control isn't ideal, in the vast majority of cases the shadow stack will be over allocated and the implicit allocation and deallocation is not consistent with other interfaces. As far as I can tell the interface is done in this manner mainly because the shadow stack patches were in development since before clone3() was implemented. Since clone3() is readily extensible let's add support for specifying a shadow stack when creating a new thread or process, keeping the current implicit allocation behaviour if one is not specified either with clone3() or through the use of clone(). The user must provide a shadow stack pointer, this must point to memory mapped for use as a shadow stackby map_shadow_stack() with an architecture specified shadow stack token at the top of the stack. Please note that the x86 portions of this code are build tested only, I don't appear to have a system that can run CET available to me. [1] https://lore.kernel.org/linux-arm-kernel/20241001-arm64-gcs-v13-0-222b78d87… Signed-off-by: Mark Brown <broonie(a)kernel.org> --- Changes in v11: - Rebase onto arm64 for-next/gcs, which is based on v6.12-rc1, and integrate arm64 support. - Rework the interface to specify a shadow stack pointer rather than a base and size like we do for the regular stack. - Link to v10: https://lore.kernel.org/r/20240821-clone3-shadow-stack-v10-0-06e8797b9445@k… Changes in v10: - Integrate fixes & improvements for the x86 implementation from Rick Edgecombe. - Require that the shadow stack be VM_WRITE. - Require that the shadow stack base and size be sizeof(void *) aligned. - Clean up trailing newline. - Link to v9: https://lore.kernel.org/r/20240819-clone3-shadow-stack-v9-0-962d74f99464@ke… Changes in v9: - Pull token validation earlier and report problems with an error return to parent rather than signal delivery to the child. - Verify that the top of the supplied shadow stack is VM_SHADOW_STACK. - Rework token validation to only do the page mapping once. - Drop no longer needed support for testing for signals in selftest. - Fix typo in comments. - Link to v8: https://lore.kernel.org/r/20240808-clone3-shadow-stack-v8-0-0acf37caf14c@ke… Changes in v8: - Fix token verification with user specified shadow stack. - Don't track user managed shadow stacks for child processes. - Link to v7: https://lore.kernel.org/r/20240731-clone3-shadow-stack-v7-0-a9532eebfb1d@ke… Changes in v7: - Rebase onto v6.11-rc1. - Typo fixes. - Link to v6: https://lore.kernel.org/r/20240623-clone3-shadow-stack-v6-0-9ee7783b1fb9@ke… Changes in v6: - Rebase onto v6.10-rc3. - Ensure we don't try to free the parent shadow stack in error paths of x86 arch code. - Spelling fixes in userspace API document. - Additional cleanups and improvements to the clone3() tests to support the shadow stack tests. - Link to v5: https://lore.kernel.org/r/20240203-clone3-shadow-stack-v5-0-322c69598e4b@ke… Changes in v5: - Rebase onto v6.8-rc2. - Rework ABI to have the user allocate the shadow stack memory with map_shadow_stack() and a token. - Force inlining of the x86 shadow stack enablement. - Move shadow stack enablement out into a shared header for reuse by other tests. - Link to v4: https://lore.kernel.org/r/20231128-clone3-shadow-stack-v4-0-8b28ffe4f676@ke… Changes in v4: - Formatting changes. - Use a define for minimum shadow stack size and move some basic validation to fork.c. - Link to v3: https://lore.kernel.org/r/20231120-clone3-shadow-stack-v3-0-a7b8ed3e2acc@ke… Changes in v3: - Rebase onto v6.7-rc2. - Remove stale shadow_stack in internal kargs. - If a shadow stack is specified unconditionally use it regardless of CLONE_ parameters. - Force enable shadow stacks in the selftest. - Update changelogs for RISC-V feature rename. - Link to v2: https://lore.kernel.org/r/20231114-clone3-shadow-stack-v2-0-b613f8681155@ke… Changes in v2: - Rebase onto v6.7-rc1. - Remove ability to provide preallocated shadow stack, just specify the desired size. - Link to v1: https://lore.kernel.org/r/20231023-clone3-shadow-stack-v1-0-d867d0b5d4d0@ke… --- Mark Brown (8): arm64/gcs: Return a success value from gcs_alloc_thread_stack() Documentation: userspace-api: Add shadow stack API documentation selftests: Provide helper header for shadow stack testing fork: Add shadow stack support to clone3() selftests/clone3: Remove redundant flushes of output streams selftests/clone3: Factor more of main loop into test_clone3() selftests/clone3: Allow tests to flag if -E2BIG is a valid error code selftests/clone3: Test shadow stack support Documentation/userspace-api/index.rst | 1 + Documentation/userspace-api/shadow_stack.rst | 41 ++++ arch/arm64/include/asm/gcs.h | 8 +- arch/arm64/kernel/process.c | 8 +- arch/arm64/mm/gcs.c | 62 +++++- arch/x86/include/asm/shstk.h | 11 +- arch/x86/kernel/process.c | 2 +- arch/x86/kernel/shstk.c | 57 +++++- include/asm-generic/cacheflush.h | 11 ++ include/linux/sched/task.h | 17 ++ include/uapi/linux/sched.h | 10 +- kernel/fork.c | 96 +++++++-- tools/testing/selftests/clone3/clone3.c | 226 ++++++++++++++++++---- tools/testing/selftests/clone3/clone3_selftests.h | 65 ++++++- tools/testing/selftests/ksft_shstk.h | 98 ++++++++++ 15 files changed, 632 insertions(+), 81 deletions(-) --- base-commit: d17cd7b7cc92d37ee8b2df8f975fc859a261f4dc change-id: 20231019-clone3-shadow-stack-15d40d2bf536 Best regards, -- Mark Brown <broonie(a)kernel.org>

1 year, 2 months

3
12
0 0

[PATCH bpf-next] selftests/bpf: Fix compile error when MPTCP not support

by Tao Chen

Fix compile error when MPTCP feature not support, though eBPF core check already done which seems invalid in this situation, the error info like: progs/mptcp_sock.c:49:40: error: no member named 'is_mptcp' in 'struct tcp_sock' 49 | is_mptcp = bpf_core_field_exists(tsk->is_mptcp) ? The filed created in new definitions with eBPF core feature to solve this build problem, and test case result still ok in MPTCP kernel. 176/1 mptcp/base:OK 176/2 mptcp/mptcpify:OK 176 mptcp:OK Summary: 1/2 PASSED, 0 SKIPPED, 0 FAILED Fixes: 8039d353217c ("selftests/bpf: Add MPTCP test base") Signed-off-by: Tao Chen <chen.dylane(a)gmail.com> --- .../testing/selftests/bpf/progs/mptcp_sock.c | 42 ++++++++++++++----- 1 file changed, 32 insertions(+), 10 deletions(-) diff --git a/tools/testing/selftests/bpf/progs/mptcp_sock.c b/tools/testing/selftests/bpf/progs/mptcp_sock.c index f3acb90588c7..2f80d042686a 100644 --- a/tools/testing/selftests/bpf/progs/mptcp_sock.c +++ b/tools/testing/selftests/bpf/progs/mptcp_sock.c @@ -25,13 +25,23 @@ struct { __type(value, struct mptcp_storage); } socket_storage_map SEC(".maps"); +struct tcp_sock___new { + bool is_mptcp; +} __attribute__((preserve_access_index)); + +struct mptcp_sock___new { + __u32 token; + struct sock *first; + char ca_name[TCP_CA_NAME_MAX]; +} __attribute__((preserve_access_index)); + SEC("sockops") int _sockops(struct bpf_sock_ops *ctx) { struct mptcp_storage *storage; - struct mptcp_sock *msk; + struct mptcp_sock___new *msk; int op = (int)ctx->op; - struct tcp_sock *tsk; + struct tcp_sock___new *tsk; struct bpf_sock *sk; bool is_mptcp; @@ -42,11 +52,16 @@ int _sockops(struct bpf_sock_ops *ctx) if (!sk) return 1; - tsk = bpf_skc_to_tcp_sock(sk); + /* recast pointer to capture new type for compiler */ + tsk = (void *)bpf_skc_to_tcp_sock(sk); if (!tsk) return 1; - is_mptcp = bpf_core_field_exists(tsk->is_mptcp) ? tsk->is_mptcp : 0; + if (bpf_core_field_exists(tsk->is_mptcp)) + is_mptcp = BPF_CORE_READ(tsk, is_mptcp); + else + is_mptcp = 0; + if (!is_mptcp) { storage = bpf_sk_storage_get(&socket_storage_map, sk, 0, BPF_SK_STORAGE_GET_F_CREATE); @@ -57,7 +72,7 @@ int _sockops(struct bpf_sock_ops *ctx) __builtin_memset(storage->ca_name, 0, TCP_CA_NAME_MAX); storage->first = NULL; } else { - msk = bpf_skc_to_mptcp_sock(sk); + msk = (void *)bpf_skc_to_mptcp_sock(sk); if (!msk) return 1; @@ -66,9 +81,9 @@ int _sockops(struct bpf_sock_ops *ctx) if (!storage) return 1; - storage->token = msk->token; - __builtin_memcpy(storage->ca_name, msk->ca_name, TCP_CA_NAME_MAX); - storage->first = msk->first; + storage->token = BPF_CORE_READ(msk, token); + BPF_CORE_READ_STR_INTO(&storage->ca_name, msk, ca_name); + storage->first = BPF_CORE_READ(msk, first); } storage->invoked++; storage->is_mptcp = is_mptcp; @@ -81,8 +96,15 @@ SEC("fentry/mptcp_pm_new_connection") int BPF_PROG(trace_mptcp_pm_new_connection, struct mptcp_sock *msk, const struct sock *ssk, int server_side) { - if (!server_side) - token = msk->token; + struct mptcp_sock___new *mskw; + + if (!server_side) { + mskw = (void *)msk; + if (bpf_core_field_exists(mskw->token)) + token = BPF_CORE_READ(mskw, token); + else + token = 0; + } return 0; } -- 2.43.0

1 year, 2 months

3
4
0 0

[PATCH 0/3] Tracefs gid mount option fixes

by Kalesh Singh

Hi all, This series is based on v6.12-rc4. It fixes an issue with the tracefs gid mount option. Adds test case to prevent future breakages and updates the tracefs readme to document the expected behavior of this option. Thanks, Kalesh Kalesh Singh (3): tracing: Document tracefs gid mount option tracing/selftests: Add tracefs mount options test tracing: Fix tracefs gid mount option fs/tracefs/inode.c | 12 ++- kernel/trace/trace.c | 4 + .../ftrace/test.d/00basic/mount_options.tc | 101 ++++++++++++++++++ .../ftrace/test.d/00basic/test_ownership.tc | 16 +-- .../testing/selftests/ftrace/test.d/functions | 25 +++++ 5 files changed, 142 insertions(+), 16 deletions(-) create mode 100644 tools/testing/selftests/ftrace/test.d/00basic/mount_options.tc base-commit: 42f7652d3eb527d03665b09edac47f85fb600924 -- 2.47.0.163.g1226f6d8fa-goog

1 year, 2 months

3
9
0 0

[PATCH v2 0/3] Tracefs mount option fixes

by Kalesh Singh

Hi all, This is v2 of the series fixing the tracefs mount options. Changes in v2: - Update the commit descriptions, per Eric and Steve - Fix ordering of the patches, per Steve v1 can be found at: - https://lore.kernel.org/r/20241028214550.2099923-1-kaleshsingh@google.com/ Thanks, Kalesh Kalesh Singh (3): tracing: Fix tracefs mount options tracing: Document tracefs gid mount option tracing/selftests: Add tracefs mount options test fs/tracefs/inode.c | 12 ++- kernel/trace/trace.c | 4 + .../ftrace/test.d/00basic/mount_options.tc | 101 ++++++++++++++++++ .../ftrace/test.d/00basic/test_ownership.tc | 16 +-- .../testing/selftests/ftrace/test.d/functions | 25 +++++ 5 files changed, 142 insertions(+), 16 deletions(-) create mode 100644 tools/testing/selftests/ftrace/test.d/00basic/mount_options.tc base-commit: 42f7652d3eb527d03665b09edac47f85fb600924 -- 2.47.0.163.g1226f6d8fa-goog

1 year, 2 months

1
3
0 0

[PATCH] selftests/lam: Test get_user() LAM pointer handling

by Maciej Wieczor-Retman

Recent change in how get_user() handles pointers [1] has a specific case for LAM. It assigns a different bitmask that's later used to check whether a pointer comes from userland in get_user(). While currently commented out (until LASS [2] is merged into the kernel) it's worth making changes to the LAM selftest ahead of time. Add test case to LAM that utilizes a ioctl (FIOASYNC) syscall which uses get_user() in its implementation. Execute the syscall with differently tagged pointers to verify that valid user pointers are passing through and invalid kernel/non-canonical pointers are not. Code was tested on a Sierra Forest Xeon machine that's LAM capable. The test was ran without issues with both the LAM lines from [1] untouched and commented out. The test was also ran without issues with LAM_SUP both enabled and disabled. [1] https://lore.kernel.org/all/20241024013214.129639-1-torvalds@linux-foundati… [2] https://lore.kernel.org/all/20240710160655.3402786-1-alexander.shishkin@lin… Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman(a)intel.com> --- tools/testing/selftests/x86/lam.c | 85 +++++++++++++++++++++++++++++++ 1 file changed, 85 insertions(+) diff --git a/tools/testing/selftests/x86/lam.c b/tools/testing/selftests/x86/lam.c index 0ea4f6813930..3c53d4b7aa61 100644 --- a/tools/testing/selftests/x86/lam.c +++ b/tools/testing/selftests/x86/lam.c @@ -4,6 +4,7 @@ #include <stdlib.h> #include <string.h> #include <sys/syscall.h> +#include <sys/ioctl.h> #include <time.h> #include <signal.h> #include <setjmp.h> @@ -43,10 +44,19 @@ #define FUNC_INHERITE 0x20 #define FUNC_PASID 0x40 +/* get_user() pointer test cases */ +#define GET_USER_USER 0 +#define GET_USER_KERNEL_TOP 1 +#define GET_USER_KERNEL_BOT 2 +#define GET_USER_KERNEL 3 + #define TEST_MASK 0x7f +#define L5_SIGN_EXT_MASK (0xFFUL << 56) +#define L4_SIGN_EXT_MASK (0x1FFFFUL << 47) #define LOW_ADDR (0x1UL << 30) #define HIGH_ADDR (0x3UL << 48) +#define L5_ADDR (0x1UL << 48) #define MALLOC_LEN 32 @@ -370,6 +380,54 @@ static int handle_syscall(struct testcases *test) return ret; } +static int get_user_syscall(struct testcases *test) +{ + int ret = 0; + int ptr_value = 0; + void *ptr = &ptr_value; + int fd; + + uint64_t bitmask = ((uint64_t)ptr & L5_ADDR) ? L5_SIGN_EXT_MASK : + L4_SIGN_EXT_MASK; + + if (test->lam != 0) + if (set_lam(test->lam) != 0) + return 2; + + fd = memfd_create("lam_ioctl", 0); + if (fd == -1) + exit(EXIT_FAILURE); + + switch (test->later) { + case GET_USER_USER: + /* Control group - properly tagger user pointer */ + ptr = (void *)set_metadata((uint64_t)ptr, test->lam); + break; + case GET_USER_KERNEL_TOP: + /* Kernel address with top bit cleared */ + bitmask &= (bitmask >> 1); + ptr = (void *)((uint64_t)ptr | bitmask); + break; + case GET_USER_KERNEL_BOT: + /* Kernel address with bottom sign-extension bit cleared */ + bitmask &= (bitmask << 1); + ptr = (void *)((uint64_t)ptr | bitmask); + break; + case GET_USER_KERNEL: + /* Try to pass a kernel address */ + ptr = (void *)((uint64_t)ptr | bitmask); + break; + default: + printf("Invalid test case value passed!\n"); + break; + } + + if (ioctl(fd, FIOASYNC, ptr) != 0) + ret = 1; + + return ret; +} + int sys_uring_setup(unsigned int entries, struct io_uring_params *p) { return (int)syscall(__NR_io_uring_setup, entries, p); @@ -883,6 +941,33 @@ static struct testcases syscall_cases[] = { .test_func = handle_syscall, .msg = "SYSCALL:[Negative] Disable LAM. Dereferencing pointer with metadata.\n", }, + { + .later = GET_USER_USER, + .lam = LAM_U57_BITS, + .test_func = get_user_syscall, + .msg = "GET_USER: get_user() and pass a properly tagged user pointer.\n", + }, + { + .later = GET_USER_KERNEL_TOP, + .expected = 1, + .lam = LAM_U57_BITS, + .test_func = get_user_syscall, + .msg = "GET_USER:[Negative] get_user() with a kernel pointer and the top bit cleared.\n", + }, + { + .later = GET_USER_KERNEL_BOT, + .expected = 1, + .lam = LAM_U57_BITS, + .test_func = get_user_syscall, + .msg = "GET_USER:[Negative] get_user() with a kernel pointer and the bottom sign-extension bit cleared.\n", + }, + { + .later = GET_USER_KERNEL, + .expected = 1, + .lam = LAM_U57_BITS, + .test_func = get_user_syscall, + .msg = "GET_USER:[Negative] get_user() and pass a kernel pointer.\n", + }, }; static struct testcases mmap_cases[] = { -- 2.46.2

1 year, 2 months

2
2
0 0

[PATCH v5 00/13] iommufd: Add vIOMMU infrastructure (Part-1)

by Nicolin Chen

This series introduces a new vIOMMU infrastructure and related ioctls. IOMMUFD has been using the HWPT infrastructure for all cases, including a nested IO page table support. Yet, there're limitations for an HWPT-based structure to support some advanced HW-accelerated features, such as CMDQV on NVIDIA Grace, and HW-accelerated vIOMMU on AMD. Even for a multi-IOMMU environment, it is not straightforward for nested HWPTs to share the same parent HWPT (stage-2 IO pagetable), with the HWPT infrastructure alone: a parent HWPT typically hold one stage-2 IO pagetable and tag it with only one ID in the cache entries. When sharing one large stage-2 IO pagetable across physical IOMMU instances, that one ID may not always be available across all the IOMMU instances. In other word, it's ideal for SW to have a different container for the stage-2 IO pagetable so it can hold another ID that's available. And this container will be able to hold some advanced feature too. For this "different container", add vIOMMU, an additional layer to hold extra virtualization information: _______________________________________________________________________ | iommufd (with vIOMMU) | | _____________ | | | | | | |----------------| vIOMMU | | | | ______ | | _____________ ________ | | | | | | | | | | | | | | | IOAS |<---|(HWPT_PAGING)|<---| HWPT_NESTED |<--| DEVICE | | | | |______| |_____________| |_____________| |________| | | | | | | | | |______|________|______________|__________________|_______________|_____| | | | | | ______v_____ | ______v_____ ______v_____ ___v__ | struct | | PFN | (paging) | | (nested) | |struct| |iommu_device| |------>|iommu_domain|<----|iommu_domain|<----|device| |____________| storage|____________| |____________| |______| The vIOMMU object should be seen as a slice of a physical IOMMU instance that is passed to or shared with a VM. That can be some HW/SW resources: - Security namespace for guest owned ID, e.g. guest-controlled cache tags - Access to a sharable nesting parent pagetable across physical IOMMUs - Non-affiliated event reporting (e.g. an invalidation queue error) - Virtualization of various platforms IDs, e.g. RIDs and others - Delivery of paravirtualized invalidation - Direct assigned invalidation queues - Direct assigned interrupts On a multi-IOMMU system, the vIOMMU object must be instanced to the number of the physical IOMMUs that have a slice passed to (via device) a guest VM, while being able to hold the shareable parent HWPT. Each vIOMMU then just needs to allocate its own individual ID to tag its own cache: ---------------------------- ---------------- | | paging_hwpt0 | | hwpt_nested0 |--->| viommu0 ------------------ ---------------- | | IDx | ---------------------------- ---------------------------- ---------------- | | paging_hwpt0 | | hwpt_nested1 |--->| viommu1 ------------------ ---------------- | | IDy | ---------------------------- As an initial part-1, add IOMMUFD_CMD_VIOMMU_ALLOC ioctl for an allocation only. And implement it in arm-smmu-v3 driver as a real world use case. More vIOMMU-based structs and ioctls will be introduced in the follow-up series to support vDEVICE, vIRQ (vEVENT) and vQUEUE objects. Although we repurposed the vIOMMU object from an earlier RFC, just for a referece: https://lore.kernel.org/all/cover.1712978212.git.nicolinc@nvidia.com/ This series is on Github: https://github.com/nicolinc/iommufd/commits/iommufd_viommu_p1-v5 (paring QEMU branch for testing will be provided with the part2 series) Changelog v5 * Added "Reviewed-by" from Kevin * Reworked iommufd_viommu_alloc helper * Revised the uAPI kdoc for vIOMMU object * Revised comments for pluggable iommu_dev * Added a couple of cleanup patches for selftest * Renamed domain_alloc_nested op to alloc_domain_nested * Updated a few commit messages to reflect the latest series * Renamed iommufd_hwpt_nested_alloc_for_viommu to iommufd_viommu_alloc_hwpt_nested, and added flag validation v4 https://lore.kernel.org/all/cover.1729553811.git.nicolinc@nvidia.com/ * Added "Reviewed-by" from Jason * Dropped IOMMU_VIOMMU_TYPE_DEFAULT support * Dropped iommufd_object_alloc_elm renamings * Renamed iommufd's viommu_api.c to driver.c * Reworked iommufd_viommu_alloc helper * Added a separate iommufd_hwpt_nested_alloc_for_viommu function for hwpt_nested allocations on a vIOMMU, and added comparison between viommu->iommu_dev->ops and dev_iommu_ops(idev->dev) * Replaced s2_parent with vsmmu in arm_smmu_nested_domain * Replaced domain_alloc_user in iommu_ops with domain_alloc_nested in viommu_ops * Replaced wait_queue_head_t with a completion, to delay the unplug of mock_iommu_dev * Corrected documentation graph that was missing struct iommu_device * Added an iommufd_verify_unfinalized_object helper to verify driver- allocated vIOMMU/vDEVICE objects * Added missing test cases for TEST_LENGTH and fail_nth v3 https://lore.kernel.org/all/cover.1728491453.git.nicolinc@nvidia.com/ * Rebased on top of Jason's nesting v3 series https://lore.kernel.org/all/0-v3-e2e16cd7467f+2a6a1-smmuv3_nesting_jgg@nvid… * Split the series into smaller parts * Added Jason's Reviewed-by * Added back viommu->iommu_dev * Added support for driver-allocated vIOMMU v.s. core-allocated * Dropped arm_smmu_cache_invalidate_user * Added an iommufd_test_wait_for_users() in selftest * Reworked test code to make viommu an individual FIXTURE * Added missing TEST_LENGTH case for the new ioctl command v2 https://lore.kernel.org/all/cover.1724776335.git.nicolinc@nvidia.com/ * Limited vdev_id to one per idev * Added a rw_sem to protect the vdev_id list * Reworked driver-level APIs with proper lockings * Added a new viommu_api file for IOMMUFD_DRIVER config * Dropped useless iommu_dev point from the viommu structure * Added missing index numnbers to new types in the uAPI header * Dropped IOMMU_VIOMMU_INVALIDATE uAPI; Instead, reuse the HWPT one * Reworked mock_viommu_cache_invalidate() using the new iommu helper * Reordered details of set/unset_vdev_id handlers for proper lockings v1 https://lore.kernel.org/all/cover.1723061377.git.nicolinc@nvidia.com/ Thanks! Nicolin Nicolin Chen (13): iommufd: Move struct iommufd_object to public iommufd header iommufd: Introduce IOMMUFD_OBJ_VIOMMU and its related struct iommufd: Add iommufd_verify_unfinalized_object iommufd/viommu: Add IOMMU_VIOMMU_ALLOC ioctl iommufd: Add alloc_domain_nested op to iommufd_viommu_ops iommufd: Allow pt_id to carry viommu_id for IOMMU_HWPT_ALLOC iommufd/selftest: Add container_of helpers iommufd/selftest: Prepare for mock_viommu_alloc_domain_nested() iommufd/selftest: Add refcount to mock_iommu_device iommufd/selftest: Add IOMMU_VIOMMU_TYPE_SELFTEST iommufd/selftest: Add IOMMU_VIOMMU_ALLOC test coverage Documentation: userspace-api: iommufd: Update vIOMMU iommu/arm-smmu-v3: Add IOMMU_VIOMMU_TYPE_ARM_SMMUV3 support drivers/iommu/iommufd/Makefile | 5 +- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 26 +- drivers/iommu/iommufd/iommufd_private.h | 36 +-- drivers/iommu/iommufd/iommufd_test.h | 2 + include/linux/iommu.h | 14 + include/linux/iommufd.h | 86 ++++++ include/uapi/linux/iommufd.h | 56 +++- tools/testing/selftests/iommu/iommufd_utils.h | 28 ++ .../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 80 ++++-- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 9 +- drivers/iommu/iommufd/driver.c | 38 +++ drivers/iommu/iommufd/hw_pagetable.c | 71 ++++- drivers/iommu/iommufd/main.c | 58 ++-- drivers/iommu/iommufd/selftest.c | 256 ++++++++++++------ drivers/iommu/iommufd/viommu.c | 89 ++++++ tools/testing/selftests/iommu/iommufd.c | 87 ++++++ .../selftests/iommu/iommufd_fail_nth.c | 11 + Documentation/userspace-api/iommufd.rst | 69 ++++- 18 files changed, 827 insertions(+), 194 deletions(-) create mode 100644 drivers/iommu/iommufd/driver.c create mode 100644 drivers/iommu/iommufd/viommu.c -- 2.43.0

1 year, 2 months

4
52
0 0

[PATCH] selftests: netfilter: remove unused parameter

by Liu Jing

Signed-off-by: Liu Jing <liujing(a)cmss.chinamobile.com> --- V1 -> V2: Delete more unused parameters, such as err, v1 only deleted rplnlh parameter Signed-off-by: Liu Jing <liujing(a)cmss.chinamobile.com> --- .../testing/selftests/net/netfilter/conntrack_dump_flush.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/tools/testing/selftests/net/netfilter/conntrack_dump_flush.c b/tools/testing/selftests/net/netfilter/conntrack_dump_flush.c index e03ddc60b5d4..63ae49f166a1 100644 --- a/tools/testing/selftests/net/netfilter/conntrack_dump_flush.c +++ b/tools/testing/selftests/net/netfilter/conntrack_dump_flush.c @@ -97,7 +97,7 @@ static int conntrack_data_insert(struct mnl_socket *sock, struct nlmsghdr *nlh, { char buf[MNL_SOCKET_BUFFER_SIZE]; unsigned int portid; - int err, ret; + int ret; portid = mnl_socket_get_portid(sock); @@ -215,7 +215,7 @@ static int conntracK_count_zone(struct mnl_socket *sock, uint16_t zone) struct nfgenmsg *nfh; struct nlattr *nest; unsigned int portid; - int err, ret; + int ret; portid = mnl_socket_get_portid(sock); @@ -262,7 +262,7 @@ static int conntrack_flush_zone(struct mnl_socket *sock, uint16_t zone) struct nfgenmsg *nfh; struct nlattr *nest; unsigned int portid; - int err, ret; + int ret; portid = mnl_socket_get_portid(sock); -- 2.27.0

1 year, 2 months

2
1
0 0

[PATCH][next] KVM: s390: selftests: Fix spelling mistake "avaialable" -> "available"

by Colin Ian King

There is a spelling mistake in a ksft_test_result_skip message. Fix it. Signed-off-by: Colin Ian King <colin.i.king(a)gmail.com> --- tools/testing/selftests/kvm/s390x/cpumodel_subfuncs_test.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/testing/selftests/kvm/s390x/cpumodel_subfuncs_test.c b/tools/testing/selftests/kvm/s390x/cpumodel_subfuncs_test.c index 222ba1cc3cac..34e0ceceaa74 100644 --- a/tools/testing/selftests/kvm/s390x/cpumodel_subfuncs_test.c +++ b/tools/testing/selftests/kvm/s390x/cpumodel_subfuncs_test.c @@ -276,7 +276,7 @@ int main(int argc, char *argv[]) ksft_test_result_pass("%s\n", testlist[idx].subfunc_name); free(array); } else { - ksft_test_result_skip("%s feature is not avaialable\n", + ksft_test_result_skip("%s feature is not available\n", testlist[idx].subfunc_name); } } -- 2.39.5

1 year, 2 months

1
0
0 0

[PATCH v2 0/3] rust: kunit: Support KUnit tests with a user-space like syntax

by David Gow

This series was originally written by José Expósito, and has been modified and updated by Matt Gilbride and myself. The original version can be found here: https://github.com/Rust-for-Linux/linux/pull/950 Add support for writing KUnit tests in Rust. While Rust doctests are already converted to KUnit tests and run, they're really better suited for examples, rather than as first-class unit tests. This series implements a series of direct Rust bindings for KUnit tests, as well as a new macro which allows KUnit tests to be written using a close variant of normal Rust unit test syntax. The only change required is replacing '#[cfg(test)]' with '#[kunit_tests(kunit_test_suite_name)]' An example test would look like: #[kunit_tests(rust_kernel_hid_driver)] mod tests { use super::*; use crate::{c_str, driver, hid, prelude::*}; use core::ptr; struct SimpleTestDriver; impl Driver for SimpleTestDriver { type Data = (); } #[test] fn rust_test_hid_driver_adapter() { let mut hid = bindings::hid_driver::default(); let name = c_str!("SimpleTestDriver"); static MODULE: ThisModule = unsafe { ThisModule::from_ptr(ptr::null_mut()) }; let res = unsafe { <hid::Adapter<SimpleTestDriver> as driver::DriverOps>::register(&mut hid, name, &MODULE) }; assert_eq!(res, Err(ENODEV)); // The mock returns -19 } } Please give this a go, and make sure I haven't broken it! There's almost certainly a lot of improvements which can be made -- and there's a fair case to be made for replacing some of this with generated C code which can use the C macros -- but this is hopefully an adequate implementation for now, and the interface can (with luck) remain the same even if the implementation changes. A few small notable missing features: - Attributes (like the speed of a test) are hardcoded to the default value. - Similarly, the module name attribute is hardcoded to NULL. In C, we use the KBUILD_MODNAME macro, but I couldn't find a way to use this from Rust which wasn't more ugly than just disabling it. - Assertions are not automatically rewritten to use KUnit assertions. --- Changes since v1: https://lore.kernel.org/lkml/20230720-rustbind-v1-0-c80db349e3b5@google.com… - Rebase on top of the latest rust-next (commit 718c4069896c) - Make kunit_case a const fn, rather than a macro (Thanks Boqun) - As a result, the null terminator is now created with kernel::kunit::kunit_case_null() - Use the C kunit_get_current_test() function to implement in_kunit_test(), rather than re-implementing it (less efficiently) ourselves. Changes since the GitHub PR: - Rebased on top of kselftest/kunit - Add const_mut_refs feature This may conflict with https://lore.kernel.org/lkml/20230503090708.2524310-6-nmi@metaspace.dk/ - Add rust/macros/kunit.rs to the KUnit MAINTAINERS entry --- José Expósito (3): rust: kunit: add KUnit case and suite macros rust: macros: add macro to easily run KUnit tests rust: kunit: allow to know if we are in a test MAINTAINERS | 1 + rust/kernel/kunit.rs | 191 +++++++++++++++++++++++++++++++++++++++++++ rust/kernel/lib.rs | 1 + rust/macros/lib.rs | 29 +++++++ 4 files changed, 222 insertions(+) -- 2.47.0.163.g1226f6d8fa-goog

1 year, 2 months

3
8
0 0

[PATCH] selftest/hid: increase timeout to 10 min

by Yun Lu

If running hid testcases with command "./run_kselftest.sh -c hid", the following tests will take longer than the kselftest framework timeout (now 200 seconds) to run and thus got terminated with TIMEOUT error: hid-multitouch.sh - took about 6min41s hid-tablet.sh - took about 6min30s Increase the timeout setting to 10 minutes to allow them have a chance to finish. Cc: stable(a)vger.kernel.org Signed-off-by: Yun Lu <luyun(a)kylinos.cn> --- tools/testing/selftests/hid/settings | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/testing/selftests/hid/settings b/tools/testing/selftests/hid/settings index b3cbfc521b10..dff0d947f9c2 100644 --- a/tools/testing/selftests/hid/settings +++ b/tools/testing/selftests/hid/settings @@ -1,3 +1,3 @@ # HID tests can be long, so give a little bit more time # to them -timeout=200 +timeout=600 -- 2.27.0

1 year, 2 months

3
2
0 0

[PATCH bpf-next v2] selftests/bpf: Drop netns helpers in mptcp

by Geliang Tang

From: Geliang Tang <tanggeliang(a)kylinos.cn> New netns selftest helpers netns_new() and netns_free() has been added in network_helpers.c, let's use them in mptcp selftests too instead of using MPTCP's own helpers create_netns() and cleanup_netns(). Signed-off-by: Geliang Tang <tanggeliang(a)kylinos.cn> --- v2: - Use netns_new/netns_free instead of create_netns/cleanup_netns as Martin suggested. v1: selftests/bpf: Use make/remove netns helpers in mptcp --- .../testing/selftests/bpf/prog_tests/mptcp.c | 42 ++++++------------- 1 file changed, 12 insertions(+), 30 deletions(-) diff --git a/tools/testing/selftests/bpf/prog_tests/mptcp.c b/tools/testing/selftests/bpf/prog_tests/mptcp.c index be3cad2aff77..f8eb7f9d4fd2 100644 --- a/tools/testing/selftests/bpf/prog_tests/mptcp.c +++ b/tools/testing/selftests/bpf/prog_tests/mptcp.c @@ -69,24 +69,6 @@ struct mptcp_storage { char ca_name[TCP_CA_NAME_MAX]; }; -static struct nstoken *create_netns(void) -{ - SYS(fail, "ip netns add %s", NS_TEST); - SYS(fail, "ip -net %s link set dev lo up", NS_TEST); - - return open_netns(NS_TEST); -fail: - return NULL; -} - -static void cleanup_netns(struct nstoken *nstoken) -{ - if (nstoken) - close_netns(nstoken); - - SYS_NOFAIL("ip netns del %s", NS_TEST); -} - static int start_mptcp_server(int family, const char *addr_str, __u16 port, int timeout_ms) { @@ -206,15 +188,15 @@ static int run_test(int cgroup_fd, int server_fd, bool is_mptcp) static void test_base(void) { - struct nstoken *nstoken = NULL; + struct netns_obj *netns = NULL; int server_fd, cgroup_fd; cgroup_fd = test__join_cgroup("/mptcp"); if (!ASSERT_GE(cgroup_fd, 0, "test__join_cgroup")) return; - nstoken = create_netns(); - if (!ASSERT_OK_PTR(nstoken, "create_netns")) + netns = netns_new(NS_TEST, true); + if (!ASSERT_OK_PTR(netns, "netns_new")) goto fail; /* without MPTCP */ @@ -237,7 +219,7 @@ static void test_base(void) close(server_fd); fail: - cleanup_netns(nstoken); + netns_free(netns); close(cgroup_fd); } @@ -322,21 +304,21 @@ static int run_mptcpify(int cgroup_fd) static void test_mptcpify(void) { - struct nstoken *nstoken = NULL; + struct netns_obj *netns = NULL; int cgroup_fd; cgroup_fd = test__join_cgroup("/mptcpify"); if (!ASSERT_GE(cgroup_fd, 0, "test__join_cgroup")) return; - nstoken = create_netns(); - if (!ASSERT_OK_PTR(nstoken, "create_netns")) + netns = netns_new(NS_TEST, true); + if (!ASSERT_OK_PTR(netns, "netns_new")) goto fail; ASSERT_OK(run_mptcpify(cgroup_fd), "run_mptcpify"); fail: - cleanup_netns(nstoken); + netns_free(netns); close(cgroup_fd); } @@ -414,7 +396,7 @@ static void run_subflow(void) static void test_subflow(void) { struct mptcp_subflow *skel; - struct nstoken *nstoken; + struct netns_obj *netns; int cgroup_fd; cgroup_fd = test__join_cgroup("/mptcp_subflow"); @@ -437,8 +419,8 @@ static void test_subflow(void) if (!ASSERT_OK_PTR(skel->links._getsockopt_subflow, "attach _getsockopt_subflow")) goto skel_destroy; - nstoken = create_netns(); - if (!ASSERT_OK_PTR(nstoken, "create_netns: mptcp_subflow")) + netns = netns_new(NS_TEST, true); + if (!ASSERT_OK_PTR(netns, "netns_new: mptcp_subflow")) goto skel_destroy; if (endpoint_init("subflow") < 0) @@ -447,7 +429,7 @@ static void test_subflow(void) run_subflow(); close_netns: - cleanup_netns(nstoken); + netns_free(netns); skel_destroy: mptcp_subflow__destroy(skel); close_cgroup: -- 2.45.2

1 year, 2 months

1
0
0 0

[PATCH v5 00/13] iommufd: Add vIOMMU infrastructure (Part-2: vDEVICE)

by Nicolin Chen

Following the previous vIOMMU series, this adds another vDEVICE structure, representing the association from an iommufd_device to an iommufd_viommu. This gives the whole architecture a new "v" layer: _______________________________________________________________________ | iommufd (with vIOMMU/vDEVICE) | | _____________ _____________ | | | | | | | | |----------------| vIOMMU |<---| vDEVICE |<------| | | | | | |_____________| | | | | ______ | | _____________ ___|____ | | | | | | | | | | | | | | | IOAS |<---|(HWPT_PAGING)|<---| HWPT_NESTED |<--| DEVICE | | | | |______| |_____________| |_____________| |________| | |______|________|______________|__________________|_______________|_____| | | | | | ______v_____ | ______v_____ ______v_____ ___v__ | struct | | PFN | (paging) | | (nested) | |struct| |iommu_device| |------>|iommu_domain|<----|iommu_domain|<----|device| |____________| storage|____________| |____________| |______| This vDEVICE object is used to collect and store all vIOMMU-related device information/attributes in a VM. As an initial series for vDEVICE, add only the virt_id to the vDEVICE, which is a vIOMMU specific device ID in a VM: e.g. vSID of ARM SMMUv3, vDeviceID of AMD IOMMU, and vID of Intel VT-d to a Context Table. This virt_id helps IOMMU drivers to link the vID to a pID of the device against the physical IOMMU instance. This is essential for a vIOMMU-based invalidation, where the request contains a device's vID for a device cache flush, e.g. ATC invalidation. Therefore, with this vDEVICE object, support a vIOMMU-based invalidation, by reusing IOMMUFD_CMD_HWPT_INVALIDATE for a vIOMMU object to flush cache with a given driver data. As for the implementation of the series, add driver support in ARM SMMUv3 for a real world use case. This series is on Github: https://github.com/nicolinc/iommufd/commits/iommufd_viommu_p2-v5 For testing, try this "with-rmr" branch: https://github.com/nicolinc/iommufd/commits/iommufd_viommu_p2-v5-with-rmr Paring QEMU branch for testing: https://github.com/nicolinc/qemu/commits/wip/for_iommufd_viommu_p2-v5 Changelog v5 * Dropped driver-allocated vDEVICE support * Changed vdev_to_dev helper to iommufd_viommu_find_dev v4 https://lore.kernel.org/all/cover.1729555967.git.nicolinc@nvidia.com/ * Added missing brackets in switch-case * Fixed the unreleased idev refcount issue * Reworked the iommufd_vdevice_alloc allocator * Dropped support for IOMMU_VIOMMU_TYPE_DEFAULT * Added missing TEST_LENGTH and fail_nth coverages * Added a verification to the driver-allocated vDEVICE object * Added an iommufd_vdevice_abort for a missing mutex protection * Added a u64 structure arm_vsmmu_invalidation_cmd for user command conversion v3 https://lore.kernel.org/all/cover.1728491532.git.nicolinc@nvidia.com/ * Added Jason's Reviewed-by * Split this invalidation part out of the part-1 series * Repurposed VDEV_ID ioctl to a wider vDEVICE structure and ioctl * Reduced viommu_api functions by allowing drivers to access viommu and vdevice structure directly * Dropped vdevs_rwsem by using xa_lock instead * Dropped arm_smmu_cache_invalidate_user v2 https://lore.kernel.org/all/cover.1724776335.git.nicolinc@nvidia.com/ * Limited vdev_id to one per idev * Added a rw_sem to protect the vdev_id list * Reworked driver-level APIs with proper lockings * Added a new viommu_api file for IOMMUFD_DRIVER config * Dropped useless iommu_dev point from the viommu structure * Added missing index numnbers to new types in the uAPI header * Dropped IOMMU_VIOMMU_INVALIDATE uAPI; Instead, reuse the HWPT one * Reworked mock_viommu_cache_invalidate() using the new iommu helper * Reordered details of set/unset_vdev_id handlers for proper lockings v1 https://lore.kernel.org/all/cover.1723061377.git.nicolinc@nvidia.com/ Thanks! Nicolin Jason Gunthorpe (2): iommu: Add iommu_copy_struct_from_full_user_array helper iommu/arm-smmu-v3: Allow ATS for IOMMU_DOMAIN_NESTED Nicolin Chen (11): iommufd/viommu: Add IOMMUFD_OBJ_VDEVICE and IOMMU_VDEVICE_ALLOC ioctl iommufd/selftest: Add IOMMU_VDEVICE_ALLOC test coverage iommu/viommu: Add cache_invalidate to iommufd_viommu_ops iommufd/hw_pagetable: Enforce invalidation op on vIOMMU-based hwpt_nested iommufd: Allow hwpt_id to carry viommu_id for IOMMU_HWPT_INVALIDATE iommufd/viommu: Add iommufd_viommu_find_dev helper iommufd/selftest: Add mock_viommu_cache_invalidate iommufd/selftest: Add IOMMU_TEST_OP_DEV_CHECK_CACHE test command iommufd/selftest: Add vIOMMU coverage for IOMMU_HWPT_INVALIDATE ioctl Documentation: userspace-api: iommufd: Update vDEVICE iommu/arm-smmu-v3: Add arm_vsmmu_cache_invalidate drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 9 +- drivers/iommu/iommufd/iommufd_private.h | 20 ++ drivers/iommu/iommufd/iommufd_test.h | 30 +++ include/linux/iommu.h | 48 ++++- include/linux/iommufd.h | 21 ++ include/uapi/linux/iommufd.h | 61 +++++- tools/testing/selftests/iommu/iommufd_utils.h | 83 +++++++ .../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 161 +++++++++++++- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 32 ++- drivers/iommu/iommufd/device.c | 11 + drivers/iommu/iommufd/driver.c | 13 ++ drivers/iommu/iommufd/hw_pagetable.c | 36 +++- drivers/iommu/iommufd/main.c | 7 + drivers/iommu/iommufd/selftest.c | 98 ++++++++- drivers/iommu/iommufd/viommu.c | 101 +++++++++ tools/testing/selftests/iommu/iommufd.c | 204 +++++++++++++++++- .../selftests/iommu/iommufd_fail_nth.c | 4 + Documentation/userspace-api/iommufd.rst | 41 +++- 18 files changed, 942 insertions(+), 38 deletions(-) -- 2.43.0

1 year, 2 months

4
43
0 0

[PATCH v3] selftests/net: Add missing gitignore file

by Li Zhijian

Compiled binary files should be added to .gitignore 'git status' complains: Untracked files: (use "git add <file>..." to include in what will be committed) net/netfilter/conntrack_reverse_clash Cc: Pablo Neira Ayuso <pablo(a)netfilter.org> Cc: Jozsef Kadlecsik <kadlec(a)netfilter.org> Cc: "David S. Miller" <davem(a)davemloft.net> Cc: Eric Dumazet <edumazet(a)google.com> Cc: Jakub Kicinski <kuba(a)kernel.org> Cc: Paolo Abeni <pabeni(a)redhat.com> Cc: Shuah Khan <shuah(a)kernel.org> Signed-off-by: Li Zhijian <lizhijian(a)fujitsu.com> --- Cc: netfilter-devel(a)vger.kernel.org Cc: coreteam(a)netfilter.org Cc: netdev(a)vger.kernel.org --- Hello, Cover letter is here. This patch set aims to make 'git status' clear after 'make' and 'make run_tests' for kselftests. --- V3: sort the files V2: split as a separate patch from a small one [0] [0] https://lore.kernel.org/linux-kselftest/20241015010817.453539-1-lizhijian@f… Signed-off-by: Li Zhijian <lizhijian(a)fujitsu.com> --- tools/testing/selftests/net/netfilter/.gitignore | 1 + 1 file changed, 1 insertion(+) diff --git a/tools/testing/selftests/net/netfilter/.gitignore b/tools/testing/selftests/net/netfilter/.gitignore index 0a64d6d0e29a..64c4f8d9aa6c 100644 --- a/tools/testing/selftests/net/netfilter/.gitignore +++ b/tools/testing/selftests/net/netfilter/.gitignore @@ -2,5 +2,6 @@ audit_logread connect_close conntrack_dump_flush +conntrack_reverse_clash sctp_collision nf_queue -- 2.44.0

1 year, 2 months

1
0
0 0

[PATCH for-next 1/7] selftests/alsa: Add a few missing gitignore files

by Li Zhijian

Compiled binary files should be added to .gitignore 'git status' complains: Untracked files: (use "git add <file>..." to include in what will be committed) alsa/global-timer alsa/utimer-test Cc: Mark Brown <broonie(a)kernel.org> Cc: Jaroslav Kysela <perex(a)perex.cz> Cc: Takashi Iwai <tiwai(a)suse.com> Cc: Shuah Khan <shuah(a)kernel.org> Signed-off-by: Li Zhijian <lizhijian(a)fujitsu.com> --- Cc: linux-sound(a)vger.kernel.org --- Hello, Cover letter is here. This patch set aims to make 'git status' clear after 'make' and 'make run_tests' for kselftests. --- V2: split as a separate patch from a small one [0] [0] https://lore.kernel.org/linux-kselftest/20241015010817.453539-1-lizhijian@f… --- tools/testing/selftests/alsa/.gitignore | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tools/testing/selftests/alsa/.gitignore b/tools/testing/selftests/alsa/.gitignore index 12dc3fcd3456..1407fd24a97b 100644 --- a/tools/testing/selftests/alsa/.gitignore +++ b/tools/testing/selftests/alsa/.gitignore @@ -1,3 +1,5 @@ mixer-test pcm-test test-pcmtest-driver +global-timer +utimer-test -- 2.44.0

1 year, 2 months

5
13
0 0

[PATCH net-next v5 00/12] selftests: ncdevmem: Add ncdevmem to ksft

by Stanislav Fomichev

The goal of the series is to simplify and make it possible to use ncdevmem in an automated way from the ksft python wrapper. ncdevmem is slowly mutated into a state where it uses stdout to print the payload and the python wrapper is added to make sure the arrived payload matches the expected one. v5: - properly handle errors from inet_pton() and socket() (Paolo) - remove unneeded import from python selftest (Paolo) v4: - keep usage example with validation (Mina) - fix compilation issue in one patch (s/start_queues/start_queue/) v3: - keep and refine the comment about ncdevmem invocation (Mina) - add the comment about not enforcing exit status for ntuple reset (Mina) - make configure_headersplit more robust (Mina) - use num_queues/2 in selftest and let the users override it (Mina) - remove memory_provider.memcpy_to_device (Mina) - keep ksft as is (don't use -v validate flags): we are gonna need a --debug-disable flag to make it less chatty; otherwise it times out when sending too much data; so leaving it as a separate follow up v2: - don't remove validation (Mina) - keep 5-tuple flow steering but use it only when -c is provided (Mina) - remove separate flag for probing (Mina) - move ncdevmem under drivers/net/hw, not drivers/net (Jakub) Cc: Mina Almasry <almasrymina(a)google.com> Stanislav Fomichev (12): selftests: ncdevmem: Redirect all non-payload output to stderr selftests: ncdevmem: Separate out dmabuf provider selftests: ncdevmem: Unify error handling selftests: ncdevmem: Make client_ip optional selftests: ncdevmem: Remove default arguments selftests: ncdevmem: Switch to AF_INET6 selftests: ncdevmem: Properly reset flow steering selftests: ncdevmem: Use YNL to enable TCP header split selftests: ncdevmem: Remove hard-coded queue numbers selftests: ncdevmem: Run selftest when none of the -s or -c has been provided selftests: ncdevmem: Move ncdevmem under drivers/net/hw selftests: ncdevmem: Add automated test .../selftests/drivers/net/hw/.gitignore | 1 + .../testing/selftests/drivers/net/hw/Makefile | 9 + .../selftests/drivers/net/hw/devmem.py | 45 + .../selftests/drivers/net/hw/ncdevmem.c | 773 ++++++++++++++++++ tools/testing/selftests/net/.gitignore | 1 - tools/testing/selftests/net/Makefile | 8 - tools/testing/selftests/net/ncdevmem.c | 570 ------------- 7 files changed, 828 insertions(+), 579 deletions(-) create mode 100644 tools/testing/selftests/drivers/net/hw/.gitignore create mode 100755 tools/testing/selftests/drivers/net/hw/devmem.py create mode 100644 tools/testing/selftests/drivers/net/hw/ncdevmem.c delete mode 100644 tools/testing/selftests/net/ncdevmem.c -- 2.47.0

1 year, 2 months

3
14
0 0

[PATCH v6 00/33] riscv control-flow integrity for usermode

by Deepak Gupta

Basics and overview =================== Software with larger attack surfaces (e.g. network facing apps like databases, browsers or apps relying on browser runtimes) suffer from memory corruption issues which can be utilized by attackers to bend control flow of the program to eventually gain control (by making their payload executable). Attackers are able to perform such attacks by leveraging call-sites which rely on indirect calls or return sites which rely on obtaining return address from stack memory. To mitigate such attacks, risc-v extension zicfilp enforces that all indirect calls must land on a landing pad instruction `lpad` else cpu will raise software check exception (a new cpu exception cause code on riscv). Similarly for return flow, risc-v extension zicfiss extends architecture with - `sspush` instruction to push return address on a shadow stack - `sspopchk` instruction to pop return address from shadow stack and compare with input operand (i.e. return address on stack) - `sspopchk` to raise software check exception if comparision above was a mismatch - Protection mechanism using which shadow stack is not writeable via regular store instructions More information an details can be found at extensions github repo [1]. Equivalent to landing pad (zicfilp) on x86 is `ENDBRANCH` instruction in Intel CET [3] and branch target identification (BTI) [4] on arm. Similarly x86's Intel CET has shadow stack [5] and arm64 has guarded control stack (GCS) [6] which are very similar to risc-v's zicfiss shadow stack. x86 already supports shadow stack for user mode and arm64 support for GCS in usermode [7] is ongoing. Kernel awareness for user control flow integrity ================================================ This series picks up Samuel Holland's envcfg changes [2] as well. So if those are being applied independently, they should be removed from this series. Enabling: In order to maintain compatibility and not break anything in user mode, kernel doesn't enable control flow integrity cpu extensions on binary by default. Instead exposes a prctl interface to enable, disable and lock the shadow stack or landing pad feature for a task. This allows userspace (loader) to enumerate if all objects in its address space are compiled with shadow stack and landing pad support and accordingly enable the feature. Additionally if a subsequent `dlopen` happens on a library, user mode can take a decision again to disable the feature (if incoming library is not compiled with support) OR terminate the task (if user mode policy is strict to have all objects in address space to be compiled with control flow integirty cpu feature). prctl to enable shadow stack results in allocating shadow stack from virtual memory and activating for user address space. x86 and arm64 are also following same direction due to similar reason(s). clone/fork: On clone and fork, cfi state for task is inherited by child. Shadow stack is part of virtual memory and is a writeable memory from kernel perspective (writeable via a restricted set of instructions aka shadow stack instructions) Thus kernel changes ensure that this memory is converted into read-only when fork/clone happens and COWed when fault is taken due to sspush, sspopchk or ssamoswap. In case `CLONE_VM` is specified and shadow stack is to be enabled, kernel will automatically allocate a shadow stack for that clone call. map_shadow_stack: x86 introduced `map_shadow_stack` system call to allow user space to explicitly map shadow stack memory in its address space. It is useful to allocate shadow for different contexts managed by a single thread (green threads or contexts) risc-v implements this system call as well. signal management: If shadow stack is enabled for a task, kernel performs an asynchronous control flow diversion to deliver the signal and eventually expects userspace to issue sigreturn so that original execution can be resumed. Even though resume context is prepared by kernel, it is in user space memory and is subject to memory corruption and corruption bugs can be utilized by attacker in this race window to perform arbitrary sigreturn and eventually bypass cfi mechanism. Another issue is how to ensure that cfi related state on sigcontext area is not trampled by legacy apps or apps compiled with old kernel headers. In order to mitigate control-flow hijacting, kernel prepares a token and place it on shadow stack before signal delivery and places address of token in sigcontext structure. During sigreturn, kernel obtains address of token from sigcontext struture, reads token from shadow stack and validates it and only then allow sigreturn to succeed. Compatiblity issue is solved by adopting dynamic sigcontext management introduced for vector extension. This series re-factor the code little bit to allow future sigcontext management easy (as proposed by Andy Chiu from SiFive) config and compilation: Introduce a new risc-v config option `CONFIG_RISCV_USER_CFI`. Selecting this config option picks the kernel support for user control flow integrity. This optin is presented only if toolchain has shadow stack and landing pad support. And is on purpose guarded by toolchain support. Reason being that eventually vDSO also needs to be compiled in with shadow stack and landing pad support. vDSO compile patches are not included as of now because landing pad labeling scheme is yet to settle for usermode runtime. To get more information on kernel interactions with respect to zicfilp and zicfiss, patch series adds documentation for `zicfilp` and `zicfiss` in following: Documentation/arch/riscv/zicfiss.rst Documentation/arch/riscv/zicfilp.rst How to test this series ======================= Toolchain --------- $ git clone git@github.com:sifive/riscv-gnu-toolchain.git -b cfi-dev $ riscv-gnu-toolchain/configure --prefix=<path-to-where-to-build> --with-arch=rv64gc_zicfilp_zicfiss --enable-linux --disable-gdb --with-extra-multilib-test="rv64gc_zicfilp_zicfiss-lp64d:-static" $ make -j$(nproc) Qemu ---- $ git clone git@github.com:deepak0414/qemu.git -b zicfilp_zicfiss_ratified_master_july11 $ cd qemu $ mkdir build $ cd build $ ../configure --target-list=riscv64-softmmu $ make -j$(nproc) Opensbi ------- $ git clone git@github.com:deepak0414/opensbi.git -b v6_cfi_spec_split_opensbi $ make CROSS_COMPILE=<your riscv toolchain> -j$(nproc) PLATFORM=generic Linux ----- Running defconfig is fine. CFI is enabled by default if the toolchain supports it. $ make ARCH=riscv CROSS_COMPILE=<path-to-cfi-riscv-gnu-toolchain>/build/bin/riscv64-unknown-linux-gnu- -j$(nproc) defconfig $ make ARCH=riscv CROSS_COMPILE=<path-to-cfi-riscv-gnu-toolchain>/build/bin/riscv64-unknown-linux-gnu- -j$(nproc) Branch where user cfi enabling patches are maintained https://github.com/deepak0414/linux-riscv-cfi/tree/vdso_user_cfi_v6.12-rc1 In case you're building your own rootfs using toolchain, please make sure you pick following patch to ensure that vDSO compiled with lpad and shadow stack. "arch/riscv: compile vdso with landing pad" Running ------- Modify your qemu command to have: -bios <path-to-cfi-opensbi>/build/platform/generic/firmware/fw_dynamic.bin -cpu rv64,zicfilp=true,zicfiss=true,zimop=true,zcmop=true vDSO related Opens (in the flux) ================================= I am listing these opens for laying out plan and what to expect in future patch sets. And of course for the sake of discussion. Shadow stack and landing pad enabling in vDSO ---------------------------------------------- vDSO must have shadow stack and landing pad support compiled in for task to have shadow stack and landing pad support. This patch series doesn't enable that (yet). Enabling shadow stack support in vDSO should be straight forward (intend to do that in next versions of patch set). Enabling landing pad support in vDSO requires some collaboration with toolchain folks to follow a single label scheme for all object binaries. This is necessary to ensure that all indirect call-sites are setting correct label and target landing pads are decorated with same label scheme. How many vDSOs --------------- Shadow stack instructions are carved out of zimop (may be operations) and if CPU doesn't implement zimop, they're illegal instructions. Kernel could be running on a CPU which may or may not implement zimop. And thus kernel will have to carry 2 different vDSOs and expose the appropriate one depending on whether CPU implements zimop or not. References ========== [1] - https://github.com/riscv/riscv-cfi [2] - https://lore.kernel.org/all/20240814081126.956287-1-samuel.holland@sifive.c… [3] - https://lwn.net/Articles/889475/ [4] - https://developer.arm.com/documentation/109576/0100/Branch-Target-Identific… [5] - https://www.intel.com/content/dam/develop/external/us/en/documents/catc17-i… [6] - https://lwn.net/Articles/940403/ [7] - https://lore.kernel.org/all/20241001-arm64-gcs-v13-0-222b78d87eee@kernel.or… To: Thomas Gleixner <tglx(a)linutronix.de> To: Ingo Molnar <mingo(a)redhat.com> To: Borislav Petkov <bp(a)alien8.de> To: Dave Hansen <dave.hansen(a)linux.intel.com> To: x86(a)kernel.org To: H. Peter Anvin <hpa(a)zytor.com> To: Andrew Morton <akpm(a)linux-foundation.org> To: Liam R. Howlett <Liam.Howlett(a)oracle.com> To: Vlastimil Babka <vbabka(a)suse.cz> To: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com> To: Paul Walmsley <paul.walmsley(a)sifive.com> To: Palmer Dabbelt <palmer(a)dabbelt.com> To: Albert Ou <aou(a)eecs.berkeley.edu> To: Conor Dooley <conor(a)kernel.org> To: Rob Herring <robh(a)kernel.org> To: Krzysztof Kozlowski <krzk+dt(a)kernel.org> To: Arnd Bergmann <arnd(a)arndb.de> To: Christian Brauner <brauner(a)kernel.org> To: Peter Zijlstra <peterz(a)infradead.org> To: Oleg Nesterov <oleg(a)redhat.com> To: Eric Biederman <ebiederm(a)xmission.com> To: Kees Cook <kees(a)kernel.org> To: Jonathan Corbet <corbet(a)lwn.net> To: Shuah Khan <shuah(a)kernel.org> Cc: linux-kernel(a)vger.kernel.org Cc: linux-fsdevel(a)vger.kernel.org Cc: linux-mm(a)kvack.org Cc: linux-riscv(a)lists.infradead.org Cc: devicetree(a)vger.kernel.org Cc: linux-arch(a)vger.kernel.org Cc: linux-doc(a)vger.kernel.org Cc: linux-kselftest(a)vger.kernel.org Cc: alistair.francis(a)wdc.com Cc: richard.henderson(a)linaro.org Cc: jim.shu(a)sifive.com Cc: andybnac(a)gmail.com Cc: kito.cheng(a)sifive.com Cc: charlie(a)rivosinc.com Cc: atishp(a)rivosinc.com Cc: evan(a)rivosinc.com Cc: cleger(a)rivosinc.com Cc: alexghiti(a)rivosinc.com Cc: samitolvanen(a)google.com Cc: broonie(a)kernel.org Cc: rick.p.edgecombe(a)intel.com Signed-off-by: Deepak Gupta <debug(a)rivosinc.com> --- changelog --------- v6: - Picked up Samuel Holland's changes as is with `envcfg` placed in `thread` instead of `thread_info` - fixed unaligned newline escapes in kselftest - cleaned up messages in kselftest and included test output in commit message - fixed a bug in clone path reported by Zong Li - fixed a build issue if CONFIG_RISCV_ISA_V is not selected (this was introduced due to re-factoring signal context management code) v5: - rebased on v6.12-rc1 - Fixed schema related issues in device tree file - Fixed some of the documentation related issues in zicfilp/ss.rst (style issues and added index) - added `SHADOW_STACK_SET_MARKER` so that implementation can define base of shadow stack. - Fixed warnings on definitions added in usercfi.h when CONFIG_RISCV_USER_CFI is not selected. - Adopted context header based signal handling as proposed by Andy Chiu - Added support for enabling kernel mode access to shadow stack using FWFT (https://github.com/riscv-non-isa/riscv-sbi-doc/blob/master/src/ext-firmware…) - Link to v5: https://lore.kernel.org/r/20241001-v5_user_cfi_series-v1-0-3ba65b6e550f@riv… (Note: I had an issue in my workflow due to which version number wasn't picked up correctly while sending out patches) v4: - rebased on 6.11-rc6 - envcfg: Converged with Samuel Holland's patches for envcfg management on per- thread basis. - vma_is_shadow_stack is renamed to is_vma_shadow_stack - picked up Mark Brown's `ARCH_HAS_USER_SHADOW_STACK` patch - signal context: using extended context management to maintain compatibility. - fixed `-Wmissing-prototypes` compiler warnings for prctl functions - Documentation fixes and amending typos. - Link to v4: https://lore.kernel.org/all/20240912231650.3740732-1-debug@rivosinc.com/ v3: - envcfg logic to pick up base envcfg had a bug where `ENVCFG_CBZE` could have been picked on per task basis, even though CPU didn't implement it. Fixed in this series. - dt-bindings As suggested, split into separate commit. fixed the messaging that spec is in public review - arch_is_shadow_stack change arch_is_shadow_stack changed to vma_is_shadow_stack - hwprobe zicfiss / zicfilp if present will get enumerated in hwprobe - selftests As suggested, added object and binary filenames to .gitignore Selftest binary anyways need to be compiled with cfi enabled compiler which will make sure that landing pad and shadow stack are enabled. Thus removed separate enable/disable tests. Cleaned up tests a bit. - Link to v3: https://lore.kernel.org/lkml/20240403234054.2020347-1-debug@rivosinc.com/ v2: - Using config `CONFIG_RISCV_USER_CFI`, kernel support for riscv control flow integrity for user mode programs can be compiled in the kernel. - Enabling of control flow integrity for user programs is left to user runtime - This patch series introduces arch agnostic `prctls` to enable shadow stack and indirect branch tracking. And implements them on riscv. --- Andy Chiu (1): riscv: signal: abstract header saving for setup_sigcontext Clément Léger (1): riscv: Add Firmware Feature SBI extensions definitions Deepak Gupta (26): mm: helper `is_shadow_stack_vma` to check shadow stack vma riscv/Kconfig: enable HAVE_EXIT_THREAD for riscv dt-bindings: riscv: zicfilp and zicfiss in dt-bindings (extensions.yaml) riscv: zicfiss / zicfilp enumeration riscv: zicfiss / zicfilp extension csr and bit definitions riscv: usercfi state for task and save/restore of CSR_SSP on trap entry/exit riscv/mm : ensure PROT_WRITE leads to VM_READ | VM_WRITE riscv mm: manufacture shadow stack pte riscv mmu: teach pte_mkwrite to manufacture shadow stack PTEs riscv mmu: write protect and shadow stack riscv/mm: Implement map_shadow_stack() syscall riscv/shstk: If needed allocate a new shadow stack on clone prctl: arch-agnostic prctl for indirect branch tracking riscv: Implements arch agnostic shadow stack prctls riscv: Implements arch agnostic indirect branch tracking prctls riscv/traps: Introduce software check exception riscv/signal: save and restore of shadow stack for signal riscv/kernel: update __show_regs to print shadow stack register riscv/ptrace: riscv cfi status and state via ptrace and in core files riscv/hwprobe: zicfilp / zicfiss enumeration in hwprobe riscv: enable kernel access to shadow stack memory via FWFT sbi call riscv: kernel command line option to opt out of user cfi riscv: create a config for shadow stack and landing pad instr support riscv: Documentation for landing pad / indirect branch tracking riscv: Documentation for shadow stack on riscv kselftest/riscv: kselftest for user mode cfi Mark Brown (2): mm: Introduce ARCH_HAS_USER_SHADOW_STACK prctl: arch-agnostic prctl for shadow stack Samuel Holland (3): riscv: Enable cbo.zero only when all harts support Zicboz riscv: Add support for per-thread envcfg CSR values riscv: Call riscv_user_isa_enable() only on the boot hart Documentation/arch/riscv/index.rst | 2 + Documentation/arch/riscv/zicfilp.rst | 115 +++++ Documentation/arch/riscv/zicfiss.rst | 176 +++++++ .../devicetree/bindings/riscv/extensions.yaml | 14 + arch/riscv/Kconfig | 21 + arch/riscv/include/asm/asm-prototypes.h | 1 + arch/riscv/include/asm/cpufeature.h | 15 +- arch/riscv/include/asm/csr.h | 16 + arch/riscv/include/asm/entry-common.h | 2 + arch/riscv/include/asm/hwcap.h | 2 + arch/riscv/include/asm/mman.h | 24 + arch/riscv/include/asm/pgtable.h | 30 +- arch/riscv/include/asm/processor.h | 3 + arch/riscv/include/asm/sbi.h | 27 ++ arch/riscv/include/asm/switch_to.h | 8 + arch/riscv/include/asm/thread_info.h | 3 + arch/riscv/include/asm/usercfi.h | 89 ++++ arch/riscv/include/asm/vector.h | 3 + arch/riscv/include/uapi/asm/hwprobe.h | 2 + arch/riscv/include/uapi/asm/ptrace.h | 22 + arch/riscv/include/uapi/asm/sigcontext.h | 1 + arch/riscv/kernel/Makefile | 2 + arch/riscv/kernel/asm-offsets.c | 8 + arch/riscv/kernel/cpufeature.c | 13 +- arch/riscv/kernel/entry.S | 31 +- arch/riscv/kernel/head.S | 12 + arch/riscv/kernel/process.c | 31 +- arch/riscv/kernel/ptrace.c | 83 ++++ arch/riscv/kernel/signal.c | 140 +++++- arch/riscv/kernel/smpboot.c | 2 - arch/riscv/kernel/suspend.c | 4 +- arch/riscv/kernel/sys_hwprobe.c | 2 + arch/riscv/kernel/sys_riscv.c | 10 + arch/riscv/kernel/traps.c | 42 ++ arch/riscv/kernel/usercfi.c | 526 +++++++++++++++++++++ arch/riscv/mm/init.c | 2 +- arch/riscv/mm/pgtable.c | 17 + arch/x86/Kconfig | 1 + fs/proc/task_mmu.c | 2 +- include/linux/cpu.h | 4 + include/linux/mm.h | 5 +- include/uapi/asm-generic/mman.h | 4 + include/uapi/linux/elf.h | 1 + include/uapi/linux/prctl.h | 48 ++ kernel/sys.c | 60 +++ mm/Kconfig | 6 + mm/gup.c | 2 +- mm/mmap.c | 1 + mm/vma.h | 10 +- tools/testing/selftests/riscv/Makefile | 2 +- tools/testing/selftests/riscv/cfi/.gitignore | 3 + tools/testing/selftests/riscv/cfi/Makefile | 10 + tools/testing/selftests/riscv/cfi/cfi_rv_test.h | 84 ++++ tools/testing/selftests/riscv/cfi/riscv_cfi_test.c | 78 +++ tools/testing/selftests/riscv/cfi/shadowstack.c | 373 +++++++++++++++ tools/testing/selftests/riscv/cfi/shadowstack.h | 37 ++ 56 files changed, 2190 insertions(+), 42 deletions(-) --- base-commit: 7d9923ee3960bdbfaa7f3a4e0ac2364e770c46ff change-id: 20240930-v5_user_cfi_series-3dc332f8f5b2 -- - debug

1 year, 2 months

6
51
0 0

[PATCH bpf-next] selftests/bpf: Use make/remove netns helpers in mptcp

by Geliang Tang

From: Geliang Tang <tanggeliang(a)kylinos.cn> New netns selftest helpers make_netns() and remove_netns() has been added in network_helpers.c, let's use them in mptcp selftests too. Signed-off-by: Geliang Tang <tanggeliang(a)kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org> --- tools/testing/selftests/bpf/prog_tests/mptcp.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/tools/testing/selftests/bpf/prog_tests/mptcp.c b/tools/testing/selftests/bpf/prog_tests/mptcp.c index d2ca32fa3b21..8276398f7d6a 100644 --- a/tools/testing/selftests/bpf/prog_tests/mptcp.c +++ b/tools/testing/selftests/bpf/prog_tests/mptcp.c @@ -66,12 +66,18 @@ struct mptcp_storage { static struct nstoken *create_netns(void) { - SYS(fail, "ip netns add %s", NS_TEST); - SYS(fail, "ip -net %s link set dev lo up", NS_TEST); + struct nstoken *nstoken; - return open_netns(NS_TEST); -fail: - return NULL; + if (make_netns(NS_TEST) < 0) + return NULL; + + nstoken = open_netns(NS_TEST); + if (!nstoken) { + log_err("open netns %s failed", NS_TEST); + remove_netns(NS_TEST); + } + + return nstoken; } static void cleanup_netns(struct nstoken *nstoken) @@ -79,7 +85,7 @@ static void cleanup_netns(struct nstoken *nstoken) if (nstoken) close_netns(nstoken); - SYS_NOFAIL("ip netns del %s", NS_TEST); + remove_netns(NS_TEST); } static int start_mptcp_server(int family, const char *addr_str, __u16 port, -- 2.45.2

1 year, 2 months

2
1
0 0

[PATCH 0/9] arm64: Support 2024 dpISA extensions

by Mark Brown

The 2024 architecture release includes a number of data processing extensions, mostly SVE and SME additions with a few others. These are all very straightforward extensions which add instructions but no architectural state so only need hwcaps and exposing of the ID registers to KVM guests and userspace. Signed-off-by: Mark Brown <broonie(a)kernel.org> --- Mark Brown (9): arm64/sysreg: Update ID_AA64PFR2_EL1 to DDI0601 2024-09 arm64/sysreg: Update ID_AA64ISAR3_EL1 to DDI0601 2024-09 arm64/sysreg: Update ID_AA64FPFR0_EL1 to DDI0601 2024-09 arm64/sysreg: Update ID_AA64ZFR0_EL1 to DDI0601 2024-09 arm64/sysreg: Update ID_AA64SMFR0_EL1 to DDI0601 2024-09 arm64/sysreg: Update ID_AA64ISAR2_EL1 to DDI0601 2024-09 arm64/hwcap: Describe 2024 dpISA extensions to userspace KVM: arm64: Allow control of dpISA extensions in ID_AA64ISAR3_EL1 kselftest/arm64: Add 2024 dpISA extensions to hwcap test Documentation/arch/arm64/elf_hwcaps.rst | 51 ++++++ arch/arm64/include/asm/hwcap.h | 17 ++ arch/arm64/include/uapi/asm/hwcap.h | 17 ++ arch/arm64/kernel/cpufeature.c | 35 ++++ arch/arm64/kernel/cpuinfo.c | 17 ++ arch/arm64/kvm/sys_regs.c | 3 +- arch/arm64/tools/sysreg | 87 +++++++++- tools/testing/selftests/arm64/abi/hwcap.c | 273 +++++++++++++++++++++++++++++- 8 files changed, 490 insertions(+), 10 deletions(-) --- base-commit: 8e929cb546ee42c9a61d24fae60605e9e3192354 change-id: 20241008-arm64-2024-dpisa-8091074a7f48 Best regards, -- Mark Brown <broonie(a)kernel.org>

1 year, 2 months

2
11
0 0

[PATCH v2] selftests: tc-testing: Fix typo error

by Karan Sanghavi

Correct the typo errors in json files - "diffferent" is corrected to "different". - "muliple" and "miltiple" is corrected to "multiple". Reviewed-by: Simon Horman <horms(a)kernel.org> Reviewed-by: Shuah Khan <skhan(a)linuxfoundation.org> Signed-off-by: Karan Sanghavi <karansanghvi98(a)gmail.com> --- Changes in v2: - Rewrote short log and commit message - Link to v1: https://lore.kernel.org/r/20241019-multiple_spell_error-v1-1-facff43b5610@g… --- tools/testing/selftests/tc-testing/tc-tests/filters/basic.json | 6 +++--- tools/testing/selftests/tc-testing/tc-tests/filters/cgroup.json | 6 +++--- tools/testing/selftests/tc-testing/tc-tests/filters/flow.json | 2 +- tools/testing/selftests/tc-testing/tc-tests/filters/route.json | 2 +- 4 files changed, 8 insertions(+), 8 deletions(-) diff --git a/tools/testing/selftests/tc-testing/tc-tests/filters/basic.json b/tools/testing/selftests/tc-testing/tc-tests/filters/basic.json index d1278de8ebc3..c9309a44a87e 100644 --- a/tools/testing/selftests/tc-testing/tc-tests/filters/basic.json +++ b/tools/testing/selftests/tc-testing/tc-tests/filters/basic.json @@ -67,7 +67,7 @@ }, { "id": "4943", - "name": "Add basic filter with cmp ematch u32/link layer and miltiple actions", + "name": "Add basic filter with cmp ematch u32/link layer and multiple actions", "category": [ "filter", "basic" @@ -155,7 +155,7 @@ }, { "id": "32d8", - "name": "Add basic filter with cmp ematch u32/network layer and miltiple actions", + "name": "Add basic filter with cmp ematch u32/network layer and multiple actions", "category": [ "filter", "basic" @@ -243,7 +243,7 @@ }, { "id": "62d7", - "name": "Add basic filter with cmp ematch u32/transport layer and miltiple actions", + "name": "Add basic filter with cmp ematch u32/transport layer and multiple actions", "category": [ "filter", "basic" diff --git a/tools/testing/selftests/tc-testing/tc-tests/filters/cgroup.json b/tools/testing/selftests/tc-testing/tc-tests/filters/cgroup.json index 03723cf84379..35c9a7dbe1c4 100644 --- a/tools/testing/selftests/tc-testing/tc-tests/filters/cgroup.json +++ b/tools/testing/selftests/tc-testing/tc-tests/filters/cgroup.json @@ -67,7 +67,7 @@ }, { "id": "0234", - "name": "Add cgroup filter with cmp ematch u32/link layer and miltiple actions", + "name": "Add cgroup filter with cmp ematch u32/link layer and multiple actions", "category": [ "filter", "cgroup" @@ -155,7 +155,7 @@ }, { "id": "2733", - "name": "Add cgroup filter with cmp ematch u32/network layer and miltiple actions", + "name": "Add cgroup filter with cmp ematch u32/network layer and multiple actions", "category": [ "filter", "cgroup" @@ -1189,7 +1189,7 @@ }, { "id": "4319", - "name": "Replace cgroup filter with diffferent match", + "name": "Replace cgroup filter with different match", "category": [ "filter", "cgroup" diff --git a/tools/testing/selftests/tc-testing/tc-tests/filters/flow.json b/tools/testing/selftests/tc-testing/tc-tests/filters/flow.json index 58189327f644..996448afe31b 100644 --- a/tools/testing/selftests/tc-testing/tc-tests/filters/flow.json +++ b/tools/testing/selftests/tc-testing/tc-tests/filters/flow.json @@ -507,7 +507,7 @@ }, { "id": "4341", - "name": "Add flow filter with muliple ops", + "name": "Add flow filter with multiple ops", "category": [ "filter", "flow" diff --git a/tools/testing/selftests/tc-testing/tc-tests/filters/route.json b/tools/testing/selftests/tc-testing/tc-tests/filters/route.json index 8d8de8f65aef..05cedca67cca 100644 --- a/tools/testing/selftests/tc-testing/tc-tests/filters/route.json +++ b/tools/testing/selftests/tc-testing/tc-tests/filters/route.json @@ -111,7 +111,7 @@ }, { "id": "7994", - "name": "Add route filter with miltiple actions", + "name": "Add route filter with multiple actions", "category": [ "filter", "route" --- base-commit: 8e929cb546ee42c9a61d24fae60605e9e3192354 change-id: 20241017-multiple_spell_error-8b267ffffe47 Best regards, -- Karan Sanghavi <karansanghvi98(a)gmail.com>

1 year, 2 months

2
1
0 0

[PATCH 0/2] kselftest/arm64: fp-stress signal delivery interval improvements

by Mark Brown

One of the things that fp-stress does to stress the floating point context switching is send signals to the test threads it spawns. Currently we do this once per second but as suggested by Mark Rutland if we increase this we can improve the chances of triggering any issues with context switching the signal handling code. Do a quick change to increase the rate of signal delivery, trying to avoid excessive impact on emulated platforms, and a further change to mitigate the impact that this creates during startup. Signed-off-by: Mark Brown <broonie(a)kernel.org> --- Mark Brown (2): kselftest/arm64: Increase frequency of signal delivery in fp-stress kselftest/arm64: Lower poll interval while waiting for fp-stress children tools/testing/selftests/arm64/fp/fp-stress.c | 26 ++++++++++++++++---------- 1 file changed, 16 insertions(+), 10 deletions(-) --- base-commit: 8e929cb546ee42c9a61d24fae60605e9e3192354 change-id: 20241028-arm64-fp-stress-interval-8f5e62c06e12 Best regards, -- Mark Brown <broonie(a)kernel.org>

1 year, 2 months

2
5
0 0

[PATCH bpf-next v2] bpf: handle implicit declaration of function gettid in bpf_iter.c

by Jason Xing

From: Jason Xing <kernelxing(a)tencent.com> As we can see from the title, when I compiled the selftests/bpf, I saw the error: implicit declaration of function ‘gettid’ ; did you mean ‘getgid’? [-Werror=implicit-function-declaration] skel->bss->tid = gettid(); ^~~~~~ getgid Directly call the syscall solves this issue. Signed-off-by: Jason Xing <kernelxing(a)tencent.com> --- v2 Link: https://lore.kernel.org/all/20241028034143.14675-1-kerneljasonxing@gmail.co… 1. directly call the syscall (Andrii) --- tools/testing/selftests/bpf/prog_tests/bpf_iter.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c index f0a3a9c18e9e..9006549a1294 100644 --- a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c +++ b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c @@ -226,7 +226,7 @@ static void test_task_common_nocheck(struct bpf_iter_attach_opts *opts, ASSERT_OK(pthread_create(&thread_id, NULL, &do_nothing_wait, NULL), "pthread_create"); - skel->bss->tid = gettid(); + skel->bss->tid = syscall(SYS_gettid); do_dummy_read_opts(skel->progs.dump_task, opts); @@ -255,10 +255,10 @@ static void *run_test_task_tid(void *arg) union bpf_iter_link_info linfo; int num_unknown_tid, num_known_tid; - ASSERT_NEQ(getpid(), gettid(), "check_new_thread_id"); + ASSERT_NEQ(getpid(), syscall(SYS_gettid), "check_new_thread_id"); memset(&linfo, 0, sizeof(linfo)); - linfo.task.tid = gettid(); + linfo.task.tid = syscall(SYS_gettid); opts.link_info = &linfo; opts.link_info_len = sizeof(linfo); test_task_common(&opts, 0, 1); -- 2.37.3

1 year, 2 months

3
2
0 0

[PATCH v4 0/5] implement lightweight guard pages

by Lorenzo Stoakes

Userland library functions such as allocators and threading implementations often require regions of memory to act as 'guard pages' - mappings which, when accessed, result in a fatal signal being sent to the accessing process. The current means by which these are implemented is via a PROT_NONE mmap() mapping, which provides the required semantics however incur an overhead of a VMA for each such region. With a great many processes and threads, this can rapidly add up and incur a significant memory penalty. It also has the added problem of preventing merges that might otherwise be permitted. This series takes a different approach - an idea suggested by Vlasimil Babka (and before him David Hildenbrand and Jann Horn - perhaps more - the provenance becomes a little tricky to ascertain after this - please forgive any omissions!) - rather than locating the guard pages at the VMA layer, instead placing them in page tables mapping the required ranges. Early testing of the prototype version of this code suggests a 5 times speed up in memory mapping invocations (in conjunction with use of process_madvise()) and a 13% reduction in VMAs on an entirely idle android system and unoptimised code. We expect with optimisation and a loaded system with a larger number of guard pages this could significantly increase, but in any case these numbers are encouraging. This way, rather than having separate VMAs specifying which parts of a range are guard pages, instead we have a VMA spanning the entire range of memory a user is permitted to access and including ranges which are to be 'guarded'. After mapping this, a user can specify which parts of the range should result in a fatal signal when accessed. By restricting the ability to specify guard pages to memory mapped by existing VMAs, we can rely on the mappings being torn down when the mappings are ultimately unmapped and everything works simply as if the memory were not faulted in, from the point of view of the containing VMAs. This mechanism in effect poisons memory ranges similar to hardware memory poisoning, only it is an entirely software-controlled form of poisoning. The mechanism is implemented via madvise() behaviour - MADV_GUARD_INSTALL which installs page table-level guard page markers - and MADV_GUARD_REMOVE - which clears them. Guard markers can be installed across multiple VMAs and any existing mappings will be cleared, that is zapped, before installing the guard page markers in the page tables. There is no concept of 'nested' guard markers, multiple attempts to install guard markers in a range will, after the first attempt, have no effect. Importantly, removing guard markers over a range that contains both guard markers and ordinary backed memory has no effect on anything but the guard markers (including leaving huge pages un-split), so a user can safely remove guard markers over a range of memory leaving the rest intact. The actual mechanism by which the page table entries are specified makes use of existing logic - PTE markers, which are used for the userfaultfd UFFDIO_POISON mechanism. Unfortunately PTE_MARKER_POISONED is not suited for the guard page mechanism as it results in VM_FAULT_HWPOISON semantics in the fault handler, so we add our own specific PTE_MARKER_GUARD and adapt existing logic to handle it. We also extend the generic page walk mechanism to allow for installation of PTEs (carefully restricted to memory management logic only to prevent unwanted abuse). We ensure that zapping performed by MADV_DONTNEED and MADV_FREE do not remove guard markers, nor does forking (except when VM_WIPEONFORK is specified for a VMA which implies a total removal of memory characteristics). It's important to note that the guard page implementation is emphatically NOT a security feature, so a user can remove the markers if they wish. We simply implement it in such a way as to provide the least surprising behaviour. An extensive set of self-tests are provided which ensure behaviour is as expected and additionally self-documents expected behaviour of guard ranges. Suggested-by: Vlastimil Babka <vbabka(a)suse.cz> Suggested-by: Jann Horn <jannh(a)google.com> Suggested-by: David Hildenbrand <david(a)redhat.com> v4 * Use restart_syscall() to implement -ERESTARTNOINTR to ensure correctly handled by kernel - tested this code path and confirmed it works correctly. Thanks to Vlastimil for pointing this issue out! * Updated the vector_madvise() handler to not unnecessarily invoke cond_resched() as suggested by Vlastimil. * Updated guard page tests to add a test for a vector operation which overwrites existing mappings. Tested this against the -ERESTARTNOINTR case and confirmed working. * Improved page walk logic further, refactoring handling logic as suggested by Vlastimil. * Moved MAX_MADVISE_GUARD_RETRIES to mm/madvise.c as suggested by Vlastimil. v3 * Cleaned up mm/pagewalk.c logic a bit to make things clearer, as suggested by Vlastiml. * Explicitly avoid splitting THP on PTE installation, as suggested by Vlastimil. Note this has no impact on the guard pages logic, which has page table entry handlers at PUD, PMD and PTE level. * Added WARN_ON_ONCE() to mm/hugetlb.c path where we don't expect a guard marker, as suggested by Vlastimil. * Reverted change to is_poisoned_swp_entry() to exclude guard pages which has the effect of MADV_FREE _not_ clearing guard pages. After discussion with Vlastimil, it became apparent that the ability to 'cancel' the freeing operation by writing to the mapping after having issued an MADV_FREE would mean that we would risk unexpected behaviour should the guard pages be removed, so we now do not remove markers here at all. * Added comment to PTE_MARKER_GUARD to highlight that memory tagged with the marker behaves as if it were a region mapped PROT_NONE, as highlighted by David. * Rename poison -> install, unpoison -> remove (i.e. MADV_GUARD_INSTALL / MADV_GUARD_REMOVE over MADV_GUARD_POISON / MADV_GUARD_REMOVE) at the request of David and John who both find the poison analogy confusing/overloaded. * After a lot of discussion, replace the looping behaviour should page faults race with guard page installation with a modest reattempt followed by returning -ERESTARTNOINTR to have the operation abort and re-enter, relieving lock contention and avoiding the possibility of allowing a malicious sandboxed process to impact the mmap lock or stall the overall process more than necessary, as suggested by Jann and Vlastimil having raised the issue. * Adjusted the page table walker so a populated huge PUD or PMD is correctly treated as being populated, necessitating a zap. In v2 we incorrectly skipped over these, which would cause the logic to wrongly proceed as if nothing were populated and the install succeeded. Instead, explicitly check to see if a huge page - if so, do not split but rather abort the operation and let zap take care of things. * Updated the guard remove logic to not unnecessarily split huge pages either. * Added a debug check to assert that the number of installed PTEs matches expectation, accounting for any existing guard pages. * Adapted vector_madvise() used by the process_madvise() system call to handle -ERESTARTNOINTR correctly. https://lore.kernel.org/all/cover.1729699916.git.lorenzo.stoakes@oracle.com/ v2 * The macros in kselftest_harness.h seem to be broken - __EXPECT() is terminated by '} while (0); OPTIONAL_HANDLER(_assert)' meaning it is not safe in single line if / else or for /which blocks, however working around this results in checkpatch producing invalid warnings, as reported by Shuah. * Fixing these macros is out of scope for this series, so compromise and instead rewrite test blocks so as to use multiple lines by separating out a decl in most cases. This has the side effect of, for the most part, making things more readable. * Heavily document the use of the volatile keyword - we can't avoid checkpatch complaining about this, so we explain it, as reported by Shuah. * Updated commit message to highlight that we skip tests we lack permissions for, as reported by Shuah. * Replaced a perror() with ksft_exit_fail_perror(), as reported by Shuah. * Added user friendly messages to cases where tests are skipped due to lack of permissions, as reported by Shuah. * Update the tool header to include the new MADV_GUARD_POISON/UNPOISON defines and directly include asm-generic/mman.h to get the platform-neutral versions to ensure we import them. * Finally fixed Vlastimil's email address in Suggested-by tags from suze to suse, as reported by Vlastimil. * Added linux-api to cc list, as reported by Vlastimil. https://lore.kernel.org/all/cover.1729440856.git.lorenzo.stoakes@oracle.com/ v1 * Un-RFC'd as appears no major objections to approach but rather debate on implementation. * Fixed issue with arches which need mmu_context.h and tlbfush.h. header imports in pagewalker logic to be able to use update_mmu_cache() as reported by the kernel test bot. * Added comments in page walker logic to clarify who can use ops->install_pte and why as well as adding a check_ops_valid() helper function, as suggested by Christoph. * Pass false in full parameter in pte_clear_not_present_full() as suggested by Jann. * Stopped erroneously requiring a write lock for the poison operation as suggested by Jann and Suren. * Moved anon_vma_prepare() to the start of madvise_guard_poison() to be consistent with how this is used elsewhere in the kernel as suggested by Jann. * Avoid returning -EAGAIN if we are raced on page faults, just keep looping and duck out if a fatal signal is pending or a conditional reschedule is needed, as suggested by Jann. * Avoid needlessly splitting huge PUDs and PMDs by specifying ACTION_CONTINUE, as suggested by Jann. https://lore.kernel.org/all/cover.1729196871.git.lorenzo.stoakes@oracle.com/ RFC https://lore.kernel.org/all/cover.1727440966.git.lorenzo.stoakes@oracle.com/ Lorenzo Stoakes (5): mm: pagewalk: add the ability to install PTEs mm: add PTE_MARKER_GUARD PTE marker mm: madvise: implement lightweight guard page mechanism tools: testing: update tools UAPI header for mman-common.h selftests/mm: add self tests for guard page feature arch/alpha/include/uapi/asm/mman.h | 3 + arch/mips/include/uapi/asm/mman.h | 3 + arch/parisc/include/uapi/asm/mman.h | 3 + arch/xtensa/include/uapi/asm/mman.h | 3 + include/linux/mm_inline.h | 2 +- include/linux/pagewalk.h | 18 +- include/linux/swapops.h | 24 +- include/uapi/asm-generic/mman-common.h | 3 + mm/hugetlb.c | 4 + mm/internal.h | 6 + mm/madvise.c | 239 ++++ mm/memory.c | 18 +- mm/mprotect.c | 6 +- mm/mseal.c | 1 + mm/pagewalk.c | 246 +++- tools/include/uapi/asm-generic/mman-common.h | 3 + tools/testing/selftests/mm/.gitignore | 1 + tools/testing/selftests/mm/Makefile | 1 + tools/testing/selftests/mm/guard-pages.c | 1243 ++++++++++++++++++ 19 files changed, 1751 insertions(+), 76 deletions(-) create mode 100644 tools/testing/selftests/mm/guard-pages.c -- 2.47.0

1 year, 2 months

3
8
0 0

[PATCH net-next v10 00/23] Introducing OpenVPN Data Channel Offload

by Antonio Quartulli

Notable changes from v9: * fixed potential deadlock between peer float and keepalive worker * simplified skb CB layout to avoid allocating large mem area per pkt * fixed double free along encryption/decryption error path * fixed crash due to race condition in keepalive worker for p2p mode * mitigated race condition between TCP socket init and first strp rcv * added CMD_OVPN_KEY_GET doit handler (it went missing in v9) * removed double 'select STREAM_PARSER' in Kconfig * removed double include of socket.h in io.c * sorted new OVPN section in MAINTAINERS alphabetically * sorted F: entries in MAINTAINERS alphabetically * switched to state "Supported" in MAINTAINERS * added ending newline to various error messages in ovpn-cli.c * rearranged ovpn-cli.c for better readability and maintenance * ovpn-cli.c now compiles with -Wextra -Wall -pedantic * renamed selftest scripts * test-float.sh is now a stub that just calls test.sh with appropriate params Please note that patches previously reviewed by Andrew Lunn have retained the Reviewed-by tag as they have been simply rebased without major modifications. The latest code can also be found at: https://github.com/OpenVPN/linux-kernel-ovpn Thanks a lot! Best Regards, Antonio Quartulli OpenVPN Inc. --- Antonio Quartulli (23): netlink: add NLA_POLICY_MAX_LEN macro net: introduce OpenVPN Data Channel Offload (ovpn) ovpn: add basic netlink support ovpn: add basic interface creation/destruction/management routines ovpn: keep carrier always on ovpn: introduce the ovpn_peer object ovpn: introduce the ovpn_socket object ovpn: implement basic TX path (UDP) ovpn: implement basic RX path (UDP) ovpn: implement packet processing ovpn: store tunnel and transport statistics ovpn: implement TCP transport ovpn: implement multi-peer support ovpn: implement peer lookup logic ovpn: implement keepalive mechanism ovpn: add support for updating local UDP endpoint ovpn: add support for peer floating ovpn: implement peer add/get/dump/delete via netlink ovpn: implement key add/get/del/swap via netlink ovpn: kill key and notify userspace in case of IV exhaustion ovpn: notify userspace when a peer is deleted ovpn: add basic ethtool support testing/selftests: add test tool and scripts for ovpn module Documentation/netlink/specs/ovpn.yaml | 362 +++ MAINTAINERS | 11 + drivers/net/Kconfig | 14 + drivers/net/Makefile | 1 + drivers/net/ovpn/Makefile | 22 + drivers/net/ovpn/bind.c | 54 + drivers/net/ovpn/bind.h | 117 + drivers/net/ovpn/crypto.c | 214 ++ drivers/net/ovpn/crypto.h | 145 ++ drivers/net/ovpn/crypto_aead.c | 386 ++++ drivers/net/ovpn/crypto_aead.h | 33 + drivers/net/ovpn/io.c | 462 ++++ drivers/net/ovpn/io.h | 25 + drivers/net/ovpn/main.c | 337 +++ drivers/net/ovpn/main.h | 24 + drivers/net/ovpn/netlink-gen.c | 212 ++ drivers/net/ovpn/netlink-gen.h | 41 + drivers/net/ovpn/netlink.c | 1135 ++++++++++ drivers/net/ovpn/netlink.h | 18 + drivers/net/ovpn/ovpnstruct.h | 61 + drivers/net/ovpn/packet.h | 40 + drivers/net/ovpn/peer.c | 1201 ++++++++++ drivers/net/ovpn/peer.h | 165 ++ drivers/net/ovpn/pktid.c | 130 ++ drivers/net/ovpn/pktid.h | 87 + drivers/net/ovpn/proto.h | 104 + drivers/net/ovpn/skb.h | 56 + drivers/net/ovpn/socket.c | 178 ++ drivers/net/ovpn/socket.h | 55 + drivers/net/ovpn/stats.c | 21 + drivers/net/ovpn/stats.h | 47 + drivers/net/ovpn/tcp.c | 506 +++++ drivers/net/ovpn/tcp.h | 44 + drivers/net/ovpn/udp.c | 406 ++++ drivers/net/ovpn/udp.h | 26 + include/net/netlink.h | 1 + include/uapi/linux/if_link.h | 15 + include/uapi/linux/ovpn.h | 109 + include/uapi/linux/udp.h | 1 + tools/net/ynl/ynl-gen-c.py | 4 +- tools/testing/selftests/Makefile | 1 + tools/testing/selftests/net/ovpn/.gitignore | 2 + tools/testing/selftests/net/ovpn/Makefile | 17 + tools/testing/selftests/net/ovpn/config | 10 + tools/testing/selftests/net/ovpn/data64.key | 5 + tools/testing/selftests/net/ovpn/ovpn-cli.c | 2370 ++++++++++++++++++++ tools/testing/selftests/net/ovpn/tcp_peers.txt | 5 + .../testing/selftests/net/ovpn/test-chachapoly.sh | 9 + tools/testing/selftests/net/ovpn/test-float.sh | 9 + tools/testing/selftests/net/ovpn/test-tcp.sh | 9 + tools/testing/selftests/net/ovpn/test.sh | 183 ++ tools/testing/selftests/net/ovpn/udp_peers.txt | 5 + 52 files changed, 9494 insertions(+), 1 deletion(-) --- base-commit: 03fc07a24735e0be8646563913abf5f5cb71ad19 change-id: 20241002-b4-ovpn-eeee35c694a2 Best regards, -- Antonio Quartulli <antonio(a)openvpn.net>

1 year, 2 months

4
30
0 0

[PATCH net-next 1/2] net: netconsole: selftests: Change the IP subnet

by Breno Leitao

Use a less populated IP range to run the tests, as suggested by Petr in Link: https://lore.kernel.org/netdev/87ikvukv3s.fsf@nvidia.com/. Suggested-by: Petr Machata <petrm(a)nvidia.com> Signed-off-by: Breno Leitao <leitao(a)debian.org> --- tools/testing/selftests/drivers/net/netcons_basic.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/drivers/net/netcons_basic.sh b/tools/testing/selftests/drivers/net/netcons_basic.sh index 06021b2059b7..4ad1e216c6b0 100755 --- a/tools/testing/selftests/drivers/net/netcons_basic.sh +++ b/tools/testing/selftests/drivers/net/netcons_basic.sh @@ -20,9 +20,9 @@ SCRIPTDIR=$(dirname "$(readlink -e "${BASH_SOURCE[0]}")") # Simple script to test dynamic targets in netconsole SRCIF="" # to be populated later -SRCIP=192.168.1.1 +SRCIP=192.168.2.1 DSTIF="" # to be populated later -DSTIP=192.168.1.2 +DSTIP=192.168.2.2 PORT="6666" MSG="netconsole selftest" -- 2.43.5

1 year, 2 months

2
3
0 0

kselftest/next kselftest-lib: 1 runs, 1 regressions (v6.12-rc3-8-gcecc795329fc)

by kernelci.org bot

kselftest/next kselftest-lib: 1 runs, 1 regressions (v6.12-rc3-8-gcecc795329fc) Regressions Summary ------------------- platform | arch | lab | compiler | defconfig | regressions ----------------------------+------+-------------+----------+------------------------------+------------ stm32mp157a-dhcor-avenger96 | arm | lab-broonie | gcc-12 | multi_v7_defconfig+kselftest | 1 Details: https://kernelci.org/test/job/kselftest/branch/next/kernel/v6.12-rc3-8-gcec… Test: kselftest-lib Tree: kselftest Branch: next Describe: v6.12-rc3-8-gcecc795329fc URL: https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git SHA: cecc795329fc3e0ea2e84567ee57570cc050cf6b Test Regressions ---------------- platform | arch | lab | compiler | defconfig | regressions ----------------------------+------+-------------+----------+------------------------------+------------ stm32mp157a-dhcor-avenger96 | arm | lab-broonie | gcc-12 | multi_v7_defconfig+kselftest | 1 Details: https://kernelci.org/test/plan/id/6720a485c2f02871e7c86857 Results: 0 PASS, 1 FAIL, 0 SKIP Full config: multi_v7_defconfig+kselftest Compiler: gcc-12 (arm-linux-gnueabihf-gcc (Debian 12.2.0-14) 12.2.0) Plain log: https://storage.kernelci.org//kselftest/next/v6.12-rc3-8-gcecc795329fc/arm/… HTML log: https://storage.kernelci.org//kselftest/next/v6.12-rc3-8-gcecc795329fc/arm/… Rootfs: http://storage.kernelci.org/images/rootfs/debian/bookworm-kselftest/2024031… * kselftest-lib.login: https://kernelci.org/test/case/id/6720a485c2f02871e7c86858 failing since 91 days (last pass: v6.10-rc7-29-gdf09b0bb09ea, first fail: v6.11-rc1)

1 year, 2 months

1
0
0 0

kselftest/next kselftest-lkdtm: 1 runs, 1 regressions (v6.12-rc3-8-gcecc795329fc)

by kernelci.org bot

kselftest/next kselftest-lkdtm: 1 runs, 1 regressions (v6.12-rc3-8-gcecc795329fc) Regressions Summary ------------------- platform | arch | lab | compiler | defconfig | regressions ----------------------------+------+-------------+----------+------------------------------+------------ stm32mp157a-dhcor-avenger96 | arm | lab-broonie | gcc-12 | multi_v7_defconfig+kselftest | 1 Details: https://kernelci.org/test/job/kselftest/branch/next/kernel/v6.12-rc3-8-gcec… Test: kselftest-lkdtm Tree: kselftest Branch: next Describe: v6.12-rc3-8-gcecc795329fc URL: https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git SHA: cecc795329fc3e0ea2e84567ee57570cc050cf6b Test Regressions ---------------- platform | arch | lab | compiler | defconfig | regressions ----------------------------+------+-------------+----------+------------------------------+------------ stm32mp157a-dhcor-avenger96 | arm | lab-broonie | gcc-12 | multi_v7_defconfig+kselftest | 1 Details: https://kernelci.org/test/plan/id/6720a2196a6ed469cec8688c Results: 0 PASS, 1 FAIL, 0 SKIP Full config: multi_v7_defconfig+kselftest Compiler: gcc-12 (arm-linux-gnueabihf-gcc (Debian 12.2.0-14) 12.2.0) Plain log: https://storage.kernelci.org//kselftest/next/v6.12-rc3-8-gcecc795329fc/arm/… HTML log: https://storage.kernelci.org//kselftest/next/v6.12-rc3-8-gcecc795329fc/arm/… Rootfs: http://storage.kernelci.org/images/rootfs/debian/bookworm-kselftest/2024031… * kselftest-lkdtm.login: https://kernelci.org/test/case/id/6720a2196a6ed469cec8688d failing since 91 days (last pass: v6.10-rc7-29-gdf09b0bb09ea, first fail: v6.11-rc1)

1 year, 2 months

1
0
0 0

[PATCH net-next v2 2/2] net: netconsole: selftests: Add userdata validation

by Breno Leitao

Extend netcons_basic selftest to verify the userdata functionality by: 1. Creating a test key in the userdata configfs directory 2. Writing a known value to the key 3. Validating the key-value pair appears in the captured network output This ensures the userdata feature is properly tested during selftests. Signed-off-by: Breno Leitao <leitao(a)debian.org> --- .../selftests/drivers/net/netcons_basic.sh | 29 +++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/tools/testing/selftests/drivers/net/netcons_basic.sh b/tools/testing/selftests/drivers/net/netcons_basic.sh index 8d28e5189e91..182eb1a97e59 100755 --- a/tools/testing/selftests/drivers/net/netcons_basic.sh +++ b/tools/testing/selftests/drivers/net/netcons_basic.sh @@ -26,10 +26,13 @@ DSTIP=192.0.2.2 PORT="6666" MSG="netconsole selftest" +USERDATA_KEY="key" +USERDATA_VALUE="value" TARGET=$(mktemp -u netcons_XXXXX) DEFAULT_PRINTK_VALUES=$(cat /proc/sys/kernel/printk) NETCONS_CONFIGFS="/sys/kernel/config/netconsole" NETCONS_PATH="${NETCONS_CONFIGFS}"/"${TARGET}" +KEY_PATH="${NETCONS_PATH}/userdata/${USERDATA_KEY}" # NAMESPACE will be populated by setup_ns with a random value NAMESPACE="" @@ -122,6 +125,8 @@ function cleanup() { # delete netconsole dynamic reconfiguration echo 0 > "${NETCONS_PATH}"/enabled + # Remove key + rmdir "${KEY_PATH}" # Remove the configfs entry rmdir "${NETCONS_PATH}" @@ -136,6 +141,18 @@ function cleanup() { echo "${DEFAULT_PRINTK_VALUES}" > /proc/sys/kernel/printk } +function set_user_data() { + if [[ ! -d "${NETCONS_PATH}""/userdata" ]] + then + echo "Userdata path not available in ${NETCONS_PATH}/userdata" + exit "${ksft_skip}" + fi + + mkdir -p "${KEY_PATH}" + VALUE_PATH="${KEY_PATH}""/value" + echo "${USERDATA_VALUE}" > "${VALUE_PATH}" +} + function listen_port_and_save_to() { local OUTPUT=${1} # Just wait for 2 seconds @@ -146,6 +163,10 @@ function listen_port_and_save_to() { function validate_result() { local TMPFILENAME="$1" + # TMPFILENAME will contain something like: + # 6.11.1-0_fbk0_rc13_509_g30d75cea12f7,13,1822,115075213798,-;netconsole selftest: netcons_gtJHM + # key=value + # Check if the file exists if [ ! -f "$TMPFILENAME" ]; then echo "FAIL: File was not generated." >&2 @@ -158,6 +179,12 @@ function validate_result() { exit "${ksft_fail}" fi + if ! grep -q "${USERDATA_KEY}=${USERDATA_VALUE}" "${TMPFILENAME}"; then + echo "FAIL: ${USERDATA_KEY}=${USERDATA_VALUE} not found in ${TMPFILENAME}" >&2 + cat "${TMPFILENAME}" >&2 + exit "${ksft_fail}" + fi + # Delete the file once it is validated, otherwise keep it # for debugging purposes rm "${TMPFILENAME}" @@ -220,6 +247,8 @@ trap cleanup EXIT set_network # Create a dynamic target for netconsole create_dynamic_target +# Set userdata "key" with the "value" value +set_user_data # Listed for netconsole port inside the namespace and destination interface listen_port_and_save_to "${OUTPUT_FILE}" & # Wait for socat to start and listen to the port. -- 2.43.5

1 year, 2 months

1
0
0 0

[PATCH net-next v2 1/2] net: netconsole: selftests: Change the IP subnet

by Breno Leitao

Use a less populated IP range to run the tests, as suggested by Petr in Link: https://lore.kernel.org/netdev/87ikvukv3s.fsf@nvidia.com/. Suggested-by: Petr Machata <petrm(a)nvidia.com> Signed-off-by: Breno Leitao <leitao(a)debian.org> --- tools/testing/selftests/drivers/net/netcons_basic.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/drivers/net/netcons_basic.sh b/tools/testing/selftests/drivers/net/netcons_basic.sh index 06021b2059b7..8d28e5189e91 100755 --- a/tools/testing/selftests/drivers/net/netcons_basic.sh +++ b/tools/testing/selftests/drivers/net/netcons_basic.sh @@ -20,9 +20,9 @@ SCRIPTDIR=$(dirname "$(readlink -e "${BASH_SOURCE[0]}")") # Simple script to test dynamic targets in netconsole SRCIF="" # to be populated later -SRCIP=192.168.1.1 +SRCIP=192.0.2.1 DSTIF="" # to be populated later -DSTIP=192.168.1.2 +DSTIP=192.0.2.2 PORT="6666" MSG="netconsole selftest" -- 2.43.5

1 year, 2 months

1
0
0 0

kselftest/next build: 6 builds: 2 failed, 4 passed, 1 warning (v6.12-rc3-8-gcecc795329fc)

by kernelci.org bot

kselftest/next build: 6 builds: 2 failed, 4 passed, 1 warning (v6.12-rc3-8-gcecc795329fc) Full Build Summary: https://kernelci.org/build/kselftest/branch/next/kernel/v6.12-rc3-8-gcecc79… Tree: kselftest Branch: next Git Describe: v6.12-rc3-8-gcecc795329fc Git Commit: cecc795329fc3e0ea2e84567ee57570cc050cf6b Git URL: https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git Built: 4 unique architectures Build Failures Detected: arm64: defconfig+kselftest+arm64-chromebook: (clang-16) FAIL defconfig+kselftest+arm64-chromebook: (gcc-12) FAIL Warnings Detected: arm64: arm: i386: x86_64: x86_64_defconfig+kselftest (clang-16): 1 warning Warnings summary: 1 vmlinux.o: warning: objtool: set_ftrace_ops_ro+0x23: relocation to !ENDBR: .text+0x14fd19 ================================================================================ Detailed per-defconfig build reports: -------------------------------------------------------------------------------- defconfig+kselftest+arm64-chromebook (arm64, gcc-12) — FAIL, 0 errors, 0 warnings, 0 section mismatches -------------------------------------------------------------------------------- defconfig+kselftest+arm64-chromebook (arm64, clang-16) — FAIL, 0 errors, 0 warnings, 0 section mismatches -------------------------------------------------------------------------------- i386_defconfig+kselftest (i386, gcc-12) — PASS, 0 errors, 0 warnings, 0 section mismatches -------------------------------------------------------------------------------- multi_v7_defconfig+kselftest (arm, gcc-12) — PASS, 0 errors, 0 warnings, 0 section mismatches -------------------------------------------------------------------------------- x86_64_defconfig+kselftest (x86_64, gcc-12) — PASS, 0 errors, 0 warnings, 0 section mismatches -------------------------------------------------------------------------------- x86_64_defconfig+kselftest (x86_64, clang-16) — PASS, 0 errors, 1 warning, 0 section mismatches Warnings: vmlinux.o: warning: objtool: set_ftrace_ops_ro+0x23: relocation to !ENDBR: .text+0x14fd19 --- For more info write to <info(a)kernelci.org>

1 year, 2 months

1
0
0 0

[RESEND] [PATCH v6 0/2] Add test to distinguish between thread's signal mask and ucontext_t

by Dev Jain

This patch series is motivated by the following observation: Raise a signal, jump to signal handler. The ucontext_t structure dumped by kernel to userspace has a uc_sigmask field having the mask of blocked signals. If you run a fresh minimalistic program doing this, this field is empty, even if you block some signals while registering the handler with sigaction(). Here is what the man-pages have to say: sigaction(2): "sa_mask specifies a mask of signals which should be blocked (i.e., added to the signal mask of the thread in which the signal handler is invoked) during execution of the signal handler. In addition, the signal which triggered the handler will be blocked, unless the SA_NODEFER flag is used." signal(7): Under "Execution of signal handlers", (1.3) implies: "The thread's current signal mask is accessible via the ucontext_t object that is pointed to by the third argument of the signal handler." But, (1.4) states: "Any signals specified in act->sa_mask when registering the handler with sigprocmask(2) are added to the thread's signal mask. The signal being delivered is also added to the signal mask, unless SA_NODEFER was specified when registering the handler. These signals are thus blocked while the handler executes." There clearly is no distinction being made in the man pages between "Thread's signal mask" and ucontext_t; this logically should imply that a signal blocked by populating struct sigaction should be visible in ucontext_t. Here is what the kernel code does (for Aarch64): do_signal() -> handle_signal() -> sigmask_to_save(), which returns &current->blocked, is passed to setup_rt_frame() -> setup_sigframe() -> __copy_to_user(). Hence, &current->blocked is copied to ucontext_t exposed to userspace. Returning back to handle_signal(), signal_setup_done() -> signal_delivered() -> sigorsets() and set_current_blocked() are responsible for using information from struct ksignal ksig, which was populated through the sigaction() system call in kernel/signal.c: copy_from_user(&new_sa.sa, act, sizeof(new_sa.sa)), to update &current->blocked; hence, the set of blocked signals for the current thread is updated AFTER the kernel dumps ucontext_t to userspace. Assuming that the above is indeed the intended behaviour, because it semantically makes sense, since the signals blocked using sigaction() remain blocked only till the execution of the handler, and not in the context present before jumping to the handler (but nothing can be confirmed from the man-pages), the series introduces a test for mangling with uc_sigmask. I will send a separate series to fix the man-pages. The proposed selftest has been tested out on Aarch32, Aarch64 and x86_64. v5->v6: - Drop renaming of sas.c - Include the explanation from the cover letter in the changelog for the second patch v4->v5: - Remove a redundant print statement v3->v4: - Allocate sigsets as automatic variables to avoid malloc() v2->v3: - ucontext describes current state -> ucontext describes interrupted context - Add a comment for blockage of USR2 even after return from handler - Describe blockage of signals in a better way v1->v2: - Replace all occurrences of SIGPIPE with SIGSEGV - Fixed a mismatch between code comment and ksft log - Add a testcase: Raise the same signal again; it must not be queued - Remove unneeded <assert.h>, <unistd.h> - Give a detailed test description in the comments; also describe the exact meaning of delivered and blocked - Handle errors for all libc functions/syscalls - Mention tests in Makefile and .gitignore in alphabetical order v1: - https://lore.kernel.org/all/20240607122319.768640-1-dev.jain@arm.com/ Dev Jain (2): selftests: Rename sigaltstack to generic signal selftests: Add a test mangling with uc_sigmask tools/testing/selftests/Makefile | 2 +- .../{sigaltstack => signal}/.gitignore | 1 + .../{sigaltstack => signal}/Makefile | 3 +- .../current_stack_pointer.h | 0 .../selftests/signal/mangle_uc_sigmask.c | 184 ++++++++++++++++++ .../selftests/{sigaltstack => signal}/sas.c | 0 6 files changed, 188 insertions(+), 2 deletions(-) rename tools/testing/selftests/{sigaltstack => signal}/.gitignore (70%) rename tools/testing/selftests/{sigaltstack => signal}/Makefile (56%) rename tools/testing/selftests/{sigaltstack => signal}/current_stack_pointer.h (100%) create mode 100644 tools/testing/selftests/signal/mangle_uc_sigmask.c rename tools/testing/selftests/{sigaltstack => signal}/sas.c (100%) -- 2.30.2

1 year, 2 months

2
4
0 0

kselftest/fixes build: 6 builds: 2 failed, 4 passed, 1 warning (linux_kselftest-fixes-6.12-rc4-4-gdc1308bee1ed)

by kernelci.org bot

kselftest/fixes build: 6 builds: 2 failed, 4 passed, 1 warning (linux_kselftest-fixes-6.12-rc4-4-gdc1308bee1ed) Full Build Summary: https://kernelci.org/build/kselftest/branch/fixes/kernel/linux_kselftest-fi… Tree: kselftest Branch: fixes Git Describe: linux_kselftest-fixes-6.12-rc4-4-gdc1308bee1ed Git Commit: dc1308bee1ed03b4d698d77c8bd670d399dcd04d Git URL: https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git Built: 4 unique architectures Build Failures Detected: arm64: defconfig+kselftest+arm64-chromebook: (clang-16) FAIL defconfig+kselftest+arm64-chromebook: (gcc-12) FAIL Warnings Detected: arm64: arm: i386: x86_64: x86_64_defconfig+kselftest (clang-16): 1 warning Warnings summary: 1 vmlinux.o: warning: objtool: set_ftrace_ops_ro+0x23: relocation to !ENDBR: .text+0x14fd19 ================================================================================ Detailed per-defconfig build reports: -------------------------------------------------------------------------------- defconfig+kselftest (arm64, gcc-12) — PASS, 0 errors, 0 warnings, 0 section mismatches -------------------------------------------------------------------------------- defconfig+kselftest+arm64-chromebook (arm64, gcc-12) — FAIL, 0 errors, 0 warnings, 0 section mismatches -------------------------------------------------------------------------------- defconfig+kselftest+arm64-chromebook (arm64, clang-16) — FAIL, 0 errors, 0 warnings, 0 section mismatches -------------------------------------------------------------------------------- i386_defconfig+kselftest (i386, gcc-12) — PASS, 0 errors, 0 warnings, 0 section mismatches -------------------------------------------------------------------------------- multi_v7_defconfig+kselftest (arm, gcc-12) — PASS, 0 errors, 0 warnings, 0 section mismatches -------------------------------------------------------------------------------- x86_64_defconfig+kselftest (x86_64, gcc-12) — PASS, 0 errors, 0 warnings, 0 section mismatches -------------------------------------------------------------------------------- x86_64_defconfig+kselftest (x86_64, clang-16) — PASS, 0 errors, 1 warning, 0 section mismatches Warnings: vmlinux.o: warning: objtool: set_ftrace_ops_ro+0x23: relocation to !ENDBR: .text+0x14fd19 --- For more info write to <info(a)kernelci.org>

1 year, 2 months

1
0
0 0

[PATCH] selftests/watchdog-test: Fix system accidentally reset after watchdog-test

by Li Zhijian

When running watchdog-test with 'make run_tests', the watchdog-test will be terminated by a timeout signal(SIGTERM) due to the test timemout. And then, a system reboot would happen due to watchdog not stop. see the dmesg as below: ``` [ 1367.185172] watchdog: watchdog0: watchdog did not stop! ``` Fix it by registering more signals(including SIGTERM) in watchdog-test, where its signal handler will stop the watchdog. After that # timeout 1 ./watchdog-test Watchdog Ticking Away! . Stopping watchdog ticks... Signed-off-by: Li Zhijian <lizhijian(a)fujitsu.com> --- tools/testing/selftests/watchdog/watchdog-test.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/tools/testing/selftests/watchdog/watchdog-test.c b/tools/testing/selftests/watchdog/watchdog-test.c index bc71cbca0dde..a1f506ba5578 100644 --- a/tools/testing/selftests/watchdog/watchdog-test.c +++ b/tools/testing/selftests/watchdog/watchdog-test.c @@ -334,7 +334,13 @@ int main(int argc, char *argv[]) printf("Watchdog Ticking Away!\n"); + /* + * Register the signals + */ signal(SIGINT, term); + signal(SIGTERM, term); + signal(SIGKILL, term); + signal(SIGQUIT, term); while (1) { keep_alive(); -- 2.44.0

1 year, 2 months

2
1
0 0

[PATCH 0/2] selftests/intel_pstate: fix arithmetic expression and cpupower

by Alessandro Zanni

Address issues related to arithmetic expression compatibility and cpupower operand expected. Command to test: make kselftest TARGETS=intel_pstate Alessandro Zanni (2): selftests/intel_pstate: fix operand expected selftests/intel_pstate: cpupower command not found tools/testing/selftests/intel_pstate/run.sh | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) -- 2.43.0

1 year, 2 months

2
3
0 0

[PATCH for-next 1/3] selftests/watchdog: add count parameter for watchdog-test

by Li Zhijian

Currently, watchdog-test keep running until it gets a SIGINT. However, when watchdog-test is executed from the kselftests framework, where it launches test via timeout which will send SIGTERM in time up. This could lead to 1. watchdog haven't stop, a watchdog reset is triggered to reboot the OS in silent. 2. kselftests gets an timeout exit code, and judge watchdog-test as 'not ok' This patch is prepare to fix above 2 issues Signed-off-by: Li Zhijian <lizhijian(a)fujitsu.com> --- Hey, Cover letter is here. It's notice that a OS reboot was triggerred after ran the watchdog-test in kselftests framwork 'make run_tests', that's because watchdog-test didn't stop feeding the watchdog after enable it. In addition, current watchdog-test didn't adapt to the kselftests framework which launchs the test with /usr/bin/timeout and no timeout is expected. --- tools/testing/selftests/watchdog/watchdog-test.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/watchdog/watchdog-test.c b/tools/testing/selftests/watchdog/watchdog-test.c index bc71cbca0dde..2f8fd2670897 100644 --- a/tools/testing/selftests/watchdog/watchdog-test.c +++ b/tools/testing/selftests/watchdog/watchdog-test.c @@ -27,7 +27,7 @@ int fd; const char v = 'V'; -static const char sopts[] = "bdehp:st:Tn:NLf:i"; +static const char sopts[] = "bdehp:st:Tn:NLf:c:i"; static const struct option lopts[] = { {"bootstatus", no_argument, NULL, 'b'}, {"disable", no_argument, NULL, 'd'}, @@ -42,6 +42,7 @@ static const struct option lopts[] = { {"gettimeleft", no_argument, NULL, 'L'}, {"file", required_argument, NULL, 'f'}, {"info", no_argument, NULL, 'i'}, + {"count", required_argument, NULL, 'c'}, {NULL, no_argument, NULL, 0x0} }; @@ -95,6 +96,7 @@ static void usage(char *progname) printf(" -n, --pretimeout=T\tSet the pretimeout to T seconds\n"); printf(" -N, --getpretimeout\tGet the pretimeout\n"); printf(" -L, --gettimeleft\tGet the time left until timer expires\n"); + printf(" -c, --count\tStop after feeding the watchdog count times\n"); printf("\n"); printf("Parameters are parsed left-to-right in real-time.\n"); printf("Example: %s -d -t 10 -p 5 -e\n", progname); @@ -174,7 +176,7 @@ int main(int argc, char *argv[]) unsigned int ping_rate = DEFAULT_PING_RATE; int ret; int c; - int oneshot = 0; + int oneshot = 0, stop = 1, count = 0; char *file = "/dev/watchdog"; struct watchdog_info info; int temperature; @@ -307,6 +309,9 @@ int main(int argc, char *argv[]) else printf("WDIOC_GETTIMELEFT error '%s'\n", strerror(errno)); break; + case 'c': + stop = 0; + count = strtoul(optarg, NULL, 0); case 'f': /* Handled above */ break; @@ -336,8 +341,8 @@ int main(int argc, char *argv[]) signal(SIGINT, term); - while (1) { - keep_alive(); + while (stop || count--) { + exit_code = keep_alive(); sleep(ping_rate); } end: -- 2.44.0

1 year, 2 months

3
15
0 0

[PATCH v3] selftests: tmpfs: Add kselftest support to tmpfs

by Shivam Chaudhary

Add kselftest support for openat, linkat, unshare, mount tests - Replace direct error handling with 'ksft_test_result_*' , 'ksft_print_msg' macros for better reporting. - Add `ksft_print_header()` and `ksft_set_plan()` to structure test outputs more effectively. - Improve the test flow by adding more detailed pass/fail reporting for unshare, mounting, file opening, and linking operations. - Skip the test if it's not run as root, providing an appropriate Warning. Test logs: Before change: - Without root error: unshare, errno 1 - With root No, output After change: - Without root TAP version 13 1..1 ok 1 # SKIP This test needs root to run - With root TAP version 13 1..1 unshare(): Creat new mount namespace: Success. mount(): Root filesystem private mount: Success mount(): Mounting tmpfs on /tmp: Success openat(): Open first temporary file: Success linkat(): Linking the temporary file: Success openat(): Opening the second temporary file: Success ok 1 Test : Success Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0 Signed-off-by: Shivam Chaudhary <cvam0000(a)gmail.com> --- Notes: Changes in v3: - Remove extra ksft_set_plan() - Remove function for unshare() - Fix the comment style link to v2: https://lore.kernel.org/all/20241026191621.2860376-1-cvam0000@gmail.com/ Changes in v2: - Make the commit message more clear. link to v1: https://lore.kernel.org/all/20241024200228.1075840-1-cvam0000@gmail.com/T/#u .../selftests/tmpfs/bug-link-o-tmpfile.c | 69 +++++++++++++++---- 1 file changed, 57 insertions(+), 12 deletions(-) diff --git a/tools/testing/selftests/tmpfs/bug-link-o-tmpfile.c b/tools/testing/selftests/tmpfs/bug-link-o-tmpfile.c index b5c3ddb90942..9ca1245784d9 100644 --- a/tools/testing/selftests/tmpfs/bug-link-o-tmpfile.c +++ b/tools/testing/selftests/tmpfs/bug-link-o-tmpfile.c @@ -23,45 +23,90 @@ #include <sys/mount.h> #include <unistd.h> +#include "../kselftest.h" + + int main(void) { int fd; + /* Setting up kselftest framework */ + ksft_print_header(); + ksft_set_plan(1); + + /* Check if test is run as root */ + if (geteuid()) { + ksft_test_result_skip("This test needs root to run!\n"); + return 1; + } + + if (unshare(CLONE_NEWNS) == -1) { if (errno == ENOSYS || errno == EPERM) { - fprintf(stderr, "error: unshare, errno %d\n", errno); - return 4; + ksft_test_result_fail("unshare() error: unshare, errno %d\n", errno); + return 1; } - fprintf(stderr, "error: unshare, errno %d\n", errno); - return 1; + else{ + fprintf(stderr, "unshare() error: unshare, errno %d\n", errno); + ksft_test_result_fail("unshare() error: unshare, errno %d\n", errno); + return 1; + + } + } + + else { + ksft_print_msg("unshare(): Creat new mount namespace: Success.\n"); + } - if (mount(NULL, "/", NULL, MS_PRIVATE|MS_REC, NULL) == -1) { - fprintf(stderr, "error: mount '/', errno %d\n", errno); + + + + if (mount(NULL, "/", NULL, MS_PRIVATE | MS_REC, NULL) == -1) { + ksft_test_result_fail("mount() error: Root filesystem private mount: Fail %d\n", errno); return 1; + } else { + ksft_print_msg("mount(): Root filesystem private mount: Success\n"); } + /* Our heroes: 1 root inode, 1 O_TMPFILE inode, 1 permanent inode. */ if (mount(NULL, "/tmp", "tmpfs", 0, "nr_inodes=3") == -1) { - fprintf(stderr, "error: mount tmpfs, errno %d\n", errno); + ksft_test_result_fail("mount() error: Mounting tmpfs on /tmp: Fail %d\n", errno); return 1; + } else { + ksft_print_msg("mount(): Mounting tmpfs on /tmp: Success\n"); } - fd = openat(AT_FDCWD, "/tmp", O_WRONLY|O_TMPFILE, 0600); + + fd = openat(AT_FDCWD, "/tmp", O_WRONLY | O_TMPFILE, 0600); if (fd == -1) { - fprintf(stderr, "error: open 1, errno %d\n", errno); + ksft_test_result_fail("openat() error: Open first temporary file: Fail %d\n", errno); return 1; + } else { + ksft_print_msg("openat(): Open first temporary file: Success\n"); } + + if (linkat(fd, "", AT_FDCWD, "/tmp/1", AT_EMPTY_PATH) == -1) { - fprintf(stderr, "error: linkat, errno %d\n", errno); + ksft_test_result_fail("linkat() error: Linking the temporary file: Fail %d\n", errno); + /* Ensure fd is closed on failure */ + close(fd); return 1; + } else { + ksft_print_msg("linkat(): Linking the temporary file: Success\n"); } close(fd); - fd = openat(AT_FDCWD, "/tmp", O_WRONLY|O_TMPFILE, 0600); + + fd = openat(AT_FDCWD, "/tmp", O_WRONLY | O_TMPFILE, 0600); if (fd == -1) { - fprintf(stderr, "error: open 2, errno %d\n", errno); + ksft_test_result_fail("openat() error: Opening the second temporary file: Fail %d\n", errno); return 1; + } else { + ksft_print_msg("openat(): Opening the second temporary file: Success\n"); } + ksft_test_result_pass("Test : Success\n"); + ksft_exit_pass(); return 0; } -- 2.45.2

1 year, 2 months

2
1
0 0

[PATCH v2] selftests/mount_setattr: fix idmap_mount_tree_invalid failed to run

by zhouyuhang

From: zhouyuhang <zhouyuhang(a)kylinos.cn> Test case idmap_mount_tree_invalid failed to run on the newer kernel with the following output: # RUN mount_setattr_idmapped.idmap_mount_tree_invalid ... # mount_setattr_test.c:1428:idmap_mount_tree_invalid:Expected sys_mount_setattr(open_tree_fd, "", AT_EMPTY_PATH, &attr, sizeof(attr)) (0) ! = 0 (0) # idmap_mount_tree_invalid: Test terminated by assertion This is because tmpfs is mounted at "/mnt/A", and tmpfs already contains the flag FS_ALLOW_IDMAP after the commit 7a80e5b8c6fa ("shmem: support idmapped mounts for tmpfs"). So calling sys_mount_setattr here returns 0 instead of -EINVAL as expected. Ramfs does not support idmap mounts, so we can use it here to test invalid mounts, which allows the test case to pass with the following output: # Starting 1 tests from 1 test cases. # RUN mount_setattr_idmapped.idmap_mount_tree_invalid ... # OK mount_setattr_idmapped.idmap_mount_tree_invalid ok 1 mount_setattr_idmapped.idmap_mount_tree_invalid # PASSED: 1 / 1 tests passed. Signed-off-by: zhouyuhang <zhouyuhang(a)kylinos.cn> --- .../testing/selftests/mount_setattr/mount_setattr_test.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/tools/testing/selftests/mount_setattr/mount_setattr_test.c b/tools/testing/selftests/mount_setattr/mount_setattr_test.c index c6a8c732b802..68801e1a9ec2 100644 --- a/tools/testing/selftests/mount_setattr/mount_setattr_test.c +++ b/tools/testing/selftests/mount_setattr/mount_setattr_test.c @@ -1414,6 +1414,13 @@ TEST_F(mount_setattr_idmapped, idmap_mount_tree_invalid) ASSERT_EQ(expected_uid_gid(-EBADF, "/tmp/B/b", 0, 0, 0), 0); ASSERT_EQ(expected_uid_gid(-EBADF, "/tmp/B/BB/b", 0, 0, 0), 0); + ASSERT_EQ(mount("testing", "/mnt/A", "ramfs", MS_NOATIME | MS_NODEV, + "size=100000,mode=700"), 0); + + ASSERT_EQ(mkdir("/mnt/A/AA", 0777), 0); + + ASSERT_EQ(mount("/tmp", "/mnt/A/AA", NULL, MS_BIND | MS_REC, NULL), 0); + open_tree_fd = sys_open_tree(-EBADF, "/mnt/A", AT_RECURSIVE | AT_EMPTY_PATH | @@ -1433,6 +1440,8 @@ TEST_F(mount_setattr_idmapped, idmap_mount_tree_invalid) ASSERT_EQ(expected_uid_gid(-EBADF, "/tmp/B/BB/b", 0, 0, 0), 0); ASSERT_EQ(expected_uid_gid(open_tree_fd, "B/b", 0, 0, 0), 0); ASSERT_EQ(expected_uid_gid(open_tree_fd, "B/BB/b", 0, 0, 0), 0); + + (void)umount2("/mnt/A", MNT_DETACH); } TEST_F(mount_setattr, mount_attr_nosymfollow) -- 2.27.0

1 year, 2 months

3
2
0 0

Re: Dear Respectable, I wait for your response to this my message

by Evan Bohdan

Good day, My name is Evan from Ukraine.I have tried to reach you but find it difficult due to internet scarcity here in this town. I am contacting you because I want to come over to your country,I have some huge funds I plan to invest in your country because I want to relocate my Late Father business due to ongoing war in my country Ukraine and visit your country to set up new investment business you may advice profitable in your area. I also have some millions of dollars belonging to my late father that i want to invest and 995kg of 24+carat 99.9% purity gold I want to ship to your country and establish gold jewelry manufacturing company. I will offer you good monetary rewards for your help,due to how russian drone bombards this town frequently,internet network is disrupted because of attacks on network installations,please for fast discussion, you can add me on whatsapp and let's talk faster as i want to leave this war zone as soon as possible,this is my whatsapp number below. +380-672575220 Email:evanbohdan@gmail.com I wait for your earliest response. Regards, Evan Whatsapp/Tel:+380-672575220 Email:evanbohdan@gmail.com

1 year, 2 months

1
0
0 0

[PATCH bpf-next] bpf: handle implicit declaration of function gettid in bpf_iter.c

by Jason Xing

From: Jason Xing <kernelxing(a)tencent.com> As we can see from the title, when I compiled the selftests/bpf, I saw the error: implicit declaration of function ‘gettid’ ; did you mean ‘getgid’? [-Werror=implicit-function-declaration] skel->bss->tid = gettid(); ^~~~~~ getgid Adding a define to fix it (referring to tools/perf/tests/shell/coresight/thread_loop/thread_loop.c file. Signed-off-by: Jason Xing <kernelxing(a)tencent.com> --- tools/testing/selftests/bpf/prog_tests/bpf_iter.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c index f0a3a9c18e9e..a105759f3dcf 100644 --- a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c +++ b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c @@ -34,6 +34,8 @@ #include "bpf_iter_ksym.skel.h" #include "bpf_iter_sockmap.skel.h" +#define gettid() syscall(SYS_gettid) + static void test_btf_id_or_null(void) { struct bpf_iter_test_kern3 *skel; -- 2.37.3

1 year, 2 months

2
2
0 0

[PATCH 0/6] damon/{self,kunit}tests: minor fixups for DAMON debugfs interface tests

by SeongJae Park

Fixup small broken window panes in DAMON selftests and kunit tests. First four patches clean up DAMON debugfs interface selftests output, by fixing segmentation fault of a test program (patch 1), removing unnecessary debugging messages (patch 2), and hiding error messages from expected failures (patches 3 and 4). Following two patches fix copy-paste mistakes in DAMON Kconfig help message that copied from debugfs kunit test (patch 5) and a comment on the debugfs kunit test code (patch 6). Signed-off-by: SeongJae Park <sj(a)kernel.org> Andrew Paniakin (1): selftests/damon/huge_count_read_write: provide sufficiently large buffer for DEPRECATED file read SeongJae Park (5): selftests/damon/huge_count_read_write: remove unnecessary debugging message selftests/damon/_debugfs_common: hide expected error message from test_write_result() selftests/damon/debugfs_duplicate_context_creation: hide errors from expected file write failures mm/damon/Kconfig: update DBGFS_KUNIT prompt copy for SYSFS_KUNIT mm/damon/tests/dbgfs-kunit: fix the header double inclusion guarding ifdef comment mm/damon/Kconfig | 2 +- mm/damon/tests/dbgfs-kunit.h | 2 +- tools/testing/selftests/damon/_debugfs_common.sh | 7 ++++++- .../selftests/damon/debugfs_duplicate_context_creation.sh | 2 +- tools/testing/selftests/damon/huge_count_read_write.c | 4 +--- 5 files changed, 10 insertions(+), 7 deletions(-) base-commit: 13583c750117b4e10cdaf5578dcc7723b305ce4e -- 2.39.5

1 year, 2 months

1
5
0 0

[PATCH net-next v3] selftest/tcp-ao: Add filter tests

by Leo Stone

Add tests that check if getsockopt(TCP_AO_GET_KEYS) returns the right keys when using different filters. Sample output: > # ok 114 filter keys: by sndid, rcvid, address > # ok 115 filter keys: by is_current > # ok 116 filter keys: by is_rnext > # ok 117 filter keys: by sndid, rcvid > # ok 118 filter keys: correct nkeys when in.nkeys < matches Acked-by: Dmitry Safonov <0x7f454c46(a)gmail.com> Signed-off-by: Leo Stone <leocstone(a)gmail.com> --- v3: - Ordered locals in reverse xmas tree order - Separated socket fd declaration from assignment - Broke lines longer than 80 columns v2: https://lore.kernel.org/netdev/20241016055823.21299-1-leocstone@gmail.com/ - Changed 2 unnecessary test_error calls to test_fail - Added another test to make sure getsockopt returns the right nkeys value when the input nkeys is smaller than the number of matching keys - Removed the TODO that this patch addresses v1: https://lore.kernel.org/netdev/20241014213313.15100-1-leocstone@gmail.com/ Thanks to the reviewers for their time and feedback! --- .../selftests/net/tcp_ao/setsockopt-closed.c | 186 +++++++++++++++++- 1 file changed, 181 insertions(+), 5 deletions(-) diff --git a/tools/testing/selftests/net/tcp_ao/setsockopt-closed.c b/tools/testing/selftests/net/tcp_ao/setsockopt-closed.c index 084db4ecdff6..0abb9807d742 100644 --- a/tools/testing/selftests/net/tcp_ao/setsockopt-closed.c +++ b/tools/testing/selftests/net/tcp_ao/setsockopt-closed.c @@ -6,6 +6,8 @@ static union tcp_addr tcp_md5_client; +#define FILTER_TEST_NKEYS 16 + static int test_port = 7788; static void make_listen(int sk) { @@ -813,23 +815,197 @@ static void duplicate_tests(void) setsockopt_checked(sk, TCP_AO_ADD_KEY, &ao, EEXIST, "duplicate: SendID differs"); } +static void fetch_all_keys(int sk, struct tcp_ao_getsockopt *keys) +{ + socklen_t optlen = sizeof(struct tcp_ao_getsockopt); + + memset(keys, 0, sizeof(struct tcp_ao_getsockopt) * FILTER_TEST_NKEYS); + keys[0].get_all = 1; + keys[0].nkeys = FILTER_TEST_NKEYS; + if (getsockopt(sk, IPPROTO_TCP, TCP_AO_GET_KEYS, &keys[0], &optlen)) + test_error("getsockopt"); +} + +static int prepare_test_keys(struct tcp_ao_getsockopt *keys) +{ + const char *test_password = "Test password number "; + struct tcp_ao_add test_ao[FILTER_TEST_NKEYS]; + char test_password_scratch[64] = {}; + u8 rcvid = 100, sndid = 100; + int sk; + + sk = socket(test_family, SOCK_STREAM, IPPROTO_TCP); + if (sk < 0) + test_error("socket()"); + + for (int i = 0; i < FILTER_TEST_NKEYS; i++) { + snprintf(test_password_scratch, 64, "%s %d", test_password, i); + test_prepare_key(&test_ao[i], DEFAULT_TEST_ALGO, this_ip_dest, + false, false, DEFAULT_TEST_PREFIX, 0, sndid++, + rcvid++, 0, 0, strlen(test_password_scratch), + test_password_scratch); + } + test_ao[0].set_current = 1; + test_ao[1].set_rnext = 1; + /* One key with a different addr and overlapping sndid, rcvid */ + tcp_addr_to_sockaddr_in(&test_ao[2].addr, &this_ip_addr, 0); + test_ao[2].sndid = 100; + test_ao[2].rcvid = 100; + + /* Add keys in a random order */ + for (int i = 0; i < FILTER_TEST_NKEYS; i++) { + int randidx = rand() % (FILTER_TEST_NKEYS - i); + + if (setsockopt(sk, IPPROTO_TCP, TCP_AO_ADD_KEY, + &test_ao[randidx], sizeof(struct tcp_ao_add))) + test_error("setsockopt()"); + memcpy(&test_ao[randidx], &test_ao[FILTER_TEST_NKEYS - 1 - i], + sizeof(struct tcp_ao_add)); + } + + fetch_all_keys(sk, keys); + + return sk; +} + +/* Assumes passwords are unique */ +static int compare_mkts(struct tcp_ao_getsockopt *expected, int nexpected, + struct tcp_ao_getsockopt *actual, int nactual) +{ + int matches = 0; + + for (int i = 0; i < nexpected; i++) { + for (int j = 0; j < nactual; j++) { + if (memcmp(expected[i].key, actual[j].key, + TCP_AO_MAXKEYLEN) == 0) + matches++; + } + } + return nexpected - matches; +} + +static void filter_keys_checked(int sk, struct tcp_ao_getsockopt *filter, + struct tcp_ao_getsockopt *expected, + unsigned int nexpected, const char *tst) +{ + struct tcp_ao_getsockopt filtered_keys[FILTER_TEST_NKEYS] = {}; + struct tcp_ao_getsockopt all_keys[FILTER_TEST_NKEYS] = {}; + socklen_t len = sizeof(struct tcp_ao_getsockopt); + + fetch_all_keys(sk, all_keys); + memcpy(&filtered_keys[0], filter, sizeof(struct tcp_ao_getsockopt)); + filtered_keys[0].nkeys = FILTER_TEST_NKEYS; + if (getsockopt(sk, IPPROTO_TCP, TCP_AO_GET_KEYS, filtered_keys, &len)) + test_error("getsockopt"); + if (filtered_keys[0].nkeys != nexpected) { + test_fail("wrong nr of keys, expected %u got %u", nexpected, + filtered_keys[0].nkeys); + goto out_close; + } + if (compare_mkts(expected, nexpected, filtered_keys, + filtered_keys[0].nkeys)) { + test_fail("got wrong keys back"); + goto out_close; + } + test_ok("filter keys: %s", tst); + +out_close: + close(sk); + memset(filter, 0, sizeof(struct tcp_ao_getsockopt)); +} + +static void filter_tests(void) +{ + struct tcp_ao_getsockopt original_keys[FILTER_TEST_NKEYS]; + struct tcp_ao_getsockopt expected_keys[FILTER_TEST_NKEYS]; + struct tcp_ao_getsockopt filter = {}; + int sk, f, nmatches; + socklen_t len; + + f = 2; + sk = prepare_test_keys(original_keys); + filter.rcvid = original_keys[f].rcvid; + filter.sndid = original_keys[f].sndid; + memcpy(&filter.addr, &original_keys[f].addr, + sizeof(original_keys[f].addr)); + filter.prefix = original_keys[f].prefix; + filter_keys_checked(sk, &filter, &original_keys[f], 1, + "by sndid, rcvid, address"); + + f = -1; + sk = prepare_test_keys(original_keys); + for (int i = 0; i < original_keys[0].nkeys; i++) { + if (original_keys[i].is_current) { + f = i; + break; + } + } + if (f < 0) + test_error("No current key after adding one"); + filter.is_current = 1; + filter_keys_checked(sk, &filter, &original_keys[f], 1, "by is_current"); + + f = -1; + sk = prepare_test_keys(original_keys); + for (int i = 0; i < original_keys[0].nkeys; i++) { + if (original_keys[i].is_rnext) { + f = i; + break; + } + } + if (f < 0) + test_error("No rnext key after adding one"); + filter.is_rnext = 1; + filter_keys_checked(sk, &filter, &original_keys[f], 1, "by is_rnext"); + + f = -1; + nmatches = 0; + sk = prepare_test_keys(original_keys); + for (int i = 0; i < original_keys[0].nkeys; i++) { + if (original_keys[i].sndid == 100) { + f = i; + memcpy(&expected_keys[nmatches], &original_keys[i], + sizeof(struct tcp_ao_getsockopt)); + nmatches++; + } + } + if (f < 0) + test_error("No key for sndid 100"); + if (nmatches != 2) + test_error("Should have 2 keys with sndid 100"); + filter.rcvid = original_keys[f].rcvid; + filter.sndid = original_keys[f].sndid; + filter.addr.ss_family = test_family; + filter_keys_checked(sk, &filter, expected_keys, nmatches, + "by sndid, rcvid"); + + sk = prepare_test_keys(original_keys); + filter.get_all = 1; + filter.nkeys = FILTER_TEST_NKEYS / 2; + len = sizeof(struct tcp_ao_getsockopt); + if (getsockopt(sk, IPPROTO_TCP, TCP_AO_GET_KEYS, &filter, &len)) + test_error("getsockopt"); + if (filter.nkeys == FILTER_TEST_NKEYS) + test_ok("filter keys: correct nkeys when in.nkeys < matches"); + else + test_fail("filter keys: wrong nkeys, expected %u got %u", + FILTER_TEST_NKEYS, filter.nkeys); +} + static void *client_fn(void *arg) { if (inet_pton(TEST_FAMILY, __TEST_CLIENT_IP(2), &tcp_md5_client) != 1) test_error("Can't convert ip address"); extend_tests(); einval_tests(); + filter_tests(); duplicate_tests(); - /* - * TODO: check getsockopt(TCP_AO_GET_KEYS) with different filters - * returning proper nr & keys; - */ return NULL; } int main(int argc, char *argv[]) { - test_init(121, client_fn, NULL); + test_init(126, client_fn, NULL); return 0; } -- 2.43.0

1 year, 2 months

3
2
0 0

[PATCH net 0/3] mptcp: sched: fix some lock issues

by Matthieu Baerts (NGI0)

Two small fixes related to the MPTCP packets scheduler: - Patch 1: add missing rcu_read_(un)lock(). A fix for >= 6.6. - Patch 2: remove unneeded lock when listing packets schedulers. A fix for >= 6.10. And some modifications in the MPTCP selftests: - Patch 3: a small addition to the MPTCP selftests to cover more code. Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org> --- Matthieu Baerts (NGI0) (3): mptcp: init: protect sched with rcu_read_lock mptcp: remove unneeded lock when listing scheds selftests: mptcp: list sysctl data net/mptcp/protocol.c | 2 ++ net/mptcp/sched.c | 2 -- tools/testing/selftests/net/mptcp/mptcp_connect.sh | 9 +++++++++ 3 files changed, 11 insertions(+), 2 deletions(-) --- base-commit: 3b05b9c36ddd01338e1352588f2ec1ea23f97d43 change-id: 20241021-net-mptcp-sched-lock-10dfc75d1e00 Best regards, -- Matthieu Baerts (NGI0) <matttbe(a)kernel.org>

1 year, 2 months

5
10
0 0

[PATCH v3 0/5] implement lightweight guard pages

by Lorenzo Stoakes

Userland library functions such as allocators and threading implementations often require regions of memory to act as 'guard pages' - mappings which, when accessed, result in a fatal signal being sent to the accessing process. The current means by which these are implemented is via a PROT_NONE mmap() mapping, which provides the required semantics however incur an overhead of a VMA for each such region. With a great many processes and threads, this can rapidly add up and incur a significant memory penalty. It also has the added problem of preventing merges that might otherwise be permitted. This series takes a different approach - an idea suggested by Vlasimil Babka (and before him David Hildenbrand and Jann Horn - perhaps more - the provenance becomes a little tricky to ascertain after this - please forgive any omissions!) - rather than locating the guard pages at the VMA layer, instead placing them in page tables mapping the required ranges. Early testing of the prototype version of this code suggests a 5 times speed up in memory mapping invocations (in conjunction with use of process_madvise()) and a 13% reduction in VMAs on an entirely idle android system and unoptimised code. We expect with optimisation and a loaded system with a larger number of guard pages this could significantly increase, but in any case these numbers are encouraging. This way, rather than having separate VMAs specifying which parts of a range are guard pages, instead we have a VMA spanning the entire range of memory a user is permitted to access and including ranges which are to be 'guarded'. After mapping this, a user can specify which parts of the range should result in a fatal signal when accessed. By restricting the ability to specify guard pages to memory mapped by existing VMAs, we can rely on the mappings being torn down when the mappings are ultimately unmapped and everything works simply as if the memory were not faulted in, from the point of view of the containing VMAs. This mechanism in effect poisons memory ranges similar to hardware memory poisoning, only it is an entirely software-controlled form of poisoning. The mechanism is implemented via madvise() behaviour - MADV_GUARD_INSTALL which installs page table-level guard page markers - and MADV_GUARD_REMOVE - which clears them. Guard markers can be installed across multiple VMAs and any existing mappings will be cleared, that is zapped, before installing the guard page markers in the page tables. There is no concept of 'nested' guard markers, multiple attempts to install guard markers in a range will, after the first attempt, have no effect. Importantly, removing guard markers over a range that contains both guard markers and ordinary backed memory has no effect on anything but the guard markers (including leaving huge pages un-split), so a user can safely remove guard markers over a range of memory leaving the rest intact. The actual mechanism by which the page table entries are specified makes use of existing logic - PTE markers, which are used for the userfaultfd UFFDIO_POISON mechanism. Unfortunately PTE_MARKER_POISONED is not suited for the guard page mechanism as it results in VM_FAULT_HWPOISON semantics in the fault handler, so we add our own specific PTE_MARKER_GUARD and adapt existing logic to handle it. We also extend the generic page walk mechanism to allow for installation of PTEs (carefully restricted to memory management logic only to prevent unwanted abuse). We ensure that zapping performed by MADV_DONTNEED and MADV_FREE do not remove guard markers, nor does forking (except when VM_WIPEONFORK is specified for a VMA which implies a total removal of memory characteristics). It's important to note that the guard page implementation is emphatically NOT a security feature, so a user can remove the markers if they wish. We simply implement it in such a way as to provide the least surprising behaviour. An extensive set of self-tests are provided which ensure behaviour is as expected and additionally self-documents expected behaviour of guard ranges. Suggested-by: Vlastimil Babka <vbabka(a)suse.cz> Suggested-by: Jann Horn <jannh(a)google.com> Suggested-by: David Hildenbrand <david(a)redhat.com> v3 * Cleaned up mm/pagewalk.c logic a bit to make things clearer, as suggested by Vlastiml. * Explicitly avoid splitting THP on PTE installation, as suggested by Vlastimil. Note this has no impact on the guard pages logic, which has page table entry handlers at PUD, PMD and PTE level. * Added WARN_ON_ONCE() to mm/hugetlb.c path where we don't expect a guard marker, as suggested by Vlastimil. * Reverted change to is_poisoned_swp_entry() to exclude guard pages which has the effect of MADV_FREE _not_ clearing guard pages. After discussion with Vlastimil, it became apparent that the ability to 'cancel' the freeing operation by writing to the mapping after having issued an MADV_FREE would mean that we would risk unexpected behaviour should the guard pages be removed, so we now do not remove markers here at all. * Added comment to PTE_MARKER_GUARD to highlight that memory tagged with the marker behaves as if it were a region mapped PROT_NONE, as highlighted by David. * Rename poison -> install, unpoison -> remove (i.e. MADV_GUARD_INSTALL / MADV_GUARD_REMOVE over MADV_GUARD_POISON / MADV_GUARD_REMOVE) at the request of David and John who both find the poison analogy confusing/overloaded. * After a lot of discussion, replace the looping behaviour should page faults race with guard page installation with a modest reattempt followed by returning -ERESTARTNOINTR to have the operation abort and re-enter, relieving lock contention and avoiding the possibility of allowing a malicious sandboxed process to impact the mmap lock or stall the overall process more than necessary, as suggested by Jann and Vlastimil having raised the issue. * Adjusted the page table walker so a populated huge PUD or PMD is correctly treated as being populated, necessitating a zap. In v2 we incorrectly skipped over these, which would cause the logic to wrongly proceed as if nothing were populated and the install succeeded. Instead, explicitly check to see if a huge page - if so, do not split but rather abort the operation and let zap take care of things. * Updated the guard remove logic to not unnecessarily split huge pages either. * Added a debug check to assert that the number of installed PTEs matches expectation, accounting for any existing guard pages. * Adapted vector_madvise() used by the process_madvise() system call to handle -ERESTARTNOINTR correctly. v2 * The macros in kselftest_harness.h seem to be broken - __EXPECT() is terminated by '} while (0); OPTIONAL_HANDLER(_assert)' meaning it is not safe in single line if / else or for /which blocks, however working around this results in checkpatch producing invalid warnings, as reported by Shuah. * Fixing these macros is out of scope for this series, so compromise and instead rewrite test blocks so as to use multiple lines by separating out a decl in most cases. This has the side effect of, for the most part, making things more readable. * Heavily document the use of the volatile keyword - we can't avoid checkpatch complaining about this, so we explain it, as reported by Shuah. * Updated commit message to highlight that we skip tests we lack permissions for, as reported by Shuah. * Replaced a perror() with ksft_exit_fail_perror(), as reported by Shuah. * Added user friendly messages to cases where tests are skipped due to lack of permissions, as reported by Shuah. * Update the tool header to include the new MADV_GUARD_POISON/UNPOISON defines and directly include asm-generic/mman.h to get the platform-neutral versions to ensure we import them. * Finally fixed Vlastimil's email address in Suggested-by tags from suze to suse, as reported by Vlastimil. * Added linux-api to cc list, as reported by Vlastimil. https://lore.kernel.org/all/cover.1729440856.git.lorenzo.stoakes@oracle.com/ v1 * Un-RFC'd as appears no major objections to approach but rather debate on implementation. * Fixed issue with arches which need mmu_context.h and tlbfush.h. header imports in pagewalker logic to be able to use update_mmu_cache() as reported by the kernel test bot. * Added comments in page walker logic to clarify who can use ops->install_pte and why as well as adding a check_ops_valid() helper function, as suggested by Christoph. * Pass false in full parameter in pte_clear_not_present_full() as suggested by Jann. * Stopped erroneously requiring a write lock for the poison operation as suggested by Jann and Suren. * Moved anon_vma_prepare() to the start of madvise_guard_poison() to be consistent with how this is used elsewhere in the kernel as suggested by Jann. * Avoid returning -EAGAIN if we are raced on page faults, just keep looping and duck out if a fatal signal is pending or a conditional reschedule is needed, as suggested by Jann. * Avoid needlessly splitting huge PUDs and PMDs by specifying ACTION_CONTINUE, as suggested by Jann. https://lore.kernel.org/all/cover.1729196871.git.lorenzo.stoakes@oracle.com/ RFC https://lore.kernel.org/all/cover.1727440966.git.lorenzo.stoakes@oracle.com/ Lorenzo Stoakes (5): mm: pagewalk: add the ability to install PTEs mm: add PTE_MARKER_GUARD PTE marker mm: madvise: implement lightweight guard page mechanism tools: testing: update tools UAPI header for mman-common.h selftests/mm: add self tests for guard page feature arch/alpha/include/uapi/asm/mman.h | 3 + arch/mips/include/uapi/asm/mman.h | 3 + arch/parisc/include/uapi/asm/mman.h | 3 + arch/xtensa/include/uapi/asm/mman.h | 3 + include/linux/mm_inline.h | 2 +- include/linux/pagewalk.h | 18 +- include/linux/swapops.h | 24 +- include/uapi/asm-generic/mman-common.h | 3 + mm/hugetlb.c | 4 + mm/internal.h | 12 + mm/madvise.c | 225 ++++ mm/memory.c | 18 +- mm/mprotect.c | 6 +- mm/mseal.c | 1 + mm/pagewalk.c | 227 +++- tools/include/uapi/asm-generic/mman-common.h | 3 + tools/testing/selftests/mm/.gitignore | 1 + tools/testing/selftests/mm/Makefile | 1 + tools/testing/selftests/mm/guard-pages.c | 1239 ++++++++++++++++++ 19 files changed, 1720 insertions(+), 76 deletions(-) create mode 100644 tools/testing/selftests/mm/guard-pages.c -- 2.47.0

1 year, 2 months

5
28
0 0

kselftest:arm64/FVP: arm64_check_buffer_fill - failed on Linux next-20241025

by Naresh Kamboju

The following kselftest arm64 and FVP failed with Linux next-20241025 on - Qemu-arm64 - FVP running Linux next-20241025 kernel. First seen on next-20241025 Good: next-20241024 BAD: next-20241025 kselftest-arm64, FVP * arm64_check_buffer_fill * arm64_check_mmap_options * arm64_check_child_memory Anyone have seen these failures ? Reported-by: Linux Kernel Functional Testing <lkft(a)linaro.org> Test log: ---------- # selftests: arm64: check_buffer_fill # 1..20 # ok 1 Check buffer correctness by byte with sync err mode and mmap memory # ok 2 Check buffer correctness by byte with async err mode and mmap memory # ok 3 Check buffer correctness by byte with sync err mode and mmap/mprotect memory # ok 4 Check buffer correctness by byte with async err mode and mmap/mprotect memory # ok 5 Check buffer write underflow by byte with sync mode and mmap memory # ok 6 Check buffer write underflow by byte with async mode and mmap memory # ok 7 Check buffer write underflow by byte with tag check fault ignore and mmap memory # ok 8 Check buffer write underflow by byte with sync mode and mmap memory # ok 9 Check buffer write underflow by byte with async mode and mmap memory # ok 10 Check buffer write underflow by byte with tag check fault ignore and mmap memory # ok 11 Check buffer write overflow by byte with sync mode and mmap memory # ok 12 Check buffer write overflow by byte with async mode and mmap memory # ok 13 Check buffer write overflow by byte with tag fault ignore mode and mmap memory # ok 14 Check buffer write correctness by block with sync mode and mmap memory # ok 15 Check buffer write correctness by block with async mode and mmap memory # ok 16 Check buffer write correctness by block with tag fault ignore and mmap memory # # FAIL: mmap allocation # # FAIL: memory allocation # not ok 17 Check initial tags with private mapping, sync error mode and mmap memory # ok 18 Check initial tags with private mapping, sync error mode and mmap/mprotect memory # # FAIL: mmap allocation # # FAIL: memory allocation # not ok 19 Check initial tags with shared mapping, sync error mode and mmap memory # ok 20 Check initial tags with shared mapping, sync error mode and mmap/mprotect memory # # Totals: pass:18 fail:2 xfail:0 xpass:0 skip:0 error:0 not ok 21 selftests: arm64: check_buffer_fill # exit=1 # timeout set to 45 # selftests: arm64: check_child_memory # 1..12 # ok 1 Check child anonymous memory with private mapping, precise mode and mmap memory # ok 2 Check child anonymous memory with shared mapping, precise mode and mmap memory # ok 3 Check child anonymous memory with private mapping, imprecise mode and mmap memory # ok 4 Check child anonymous memory with shared mapping, imprecise mode and mmap memory # ok 5 Check child anonymous memory with private mapping, precise mode and mmap/mprotect memory # ok 6 Check child anonymous memory with shared mapping, precise mode and mmap/mprotect memory # # FAIL: mmap allocation # # FAIL: memory allocation # not ok 7 Check child file memory with private mapping, precise mode and mmap memory # # FAIL: mmap allocation # # FAIL: memory allocation # not ok 8 Check child file memory with shared mapping, precise mode and mmap memory # ok 9 Check child file memory with private mapping, imprecise mode and mmap memory # ok 10 Check child file memory with shared mapping, imprecise mode and mmap memory # ok 11 Check child file memory with private mapping, precise mode and mmap/mprotect memory # ok 12 Check child file memory with shared mapping, precise mode and mmap/mprotect memory # # Totals: pass:10 fail:2 xfail:0 xpass:0 skip:0 error:0 not ok 22 selftests: arm64: check_child_memory # exit=1 boot Log links, -------- - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20241028/te… - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20241028/te… Test results history: ---------- - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20241028/te… - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20241028/te… metadata: ---- git describe: next-20241028 git repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next git sha: dec9255a128e19c5fcc3bdb18175d78094cc624d kernel config: https://storage.tuxsuite.com/public/linaro/lkft/builds/2o3tMqzOtHXYQjlvfR5t… build url: https://storage.tuxsuite.com/public/linaro/lkft/builds/2o3tMqzOtHXYQjlvfR5t… toolchain: gcc-13 Steps to reproduce: --------- - https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/2o3tON5kNi… - https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/2o3tON5kNi… -- Linaro LKFT https://lkft.linaro.org

1 year, 2 months

2
1
0 0

[PATCH v1 00/10] iommufd: Add VIOMMU infrastructure (Part-2 VIRQ)

by Nicolin Chen

As the part-2 of the VIOMMU infrastructure, this series introduces a VIRQ object after repurposing the existing FAULT object, which provides a nice notification pathway to the user space already. So, the first thing to do is reworking the FAULT object. Mimicing the HWPT structures, add a common EVENT structure to support its derivatives: EVENT_IOPF (the prior FAULT object) and EVENT_VIRQ (new one). IOMMUFD_CMD_VIRQ_ALLOC is introduced to allocate EVENT_VIRQ for a VIOMMU. One VIOMMU can have multiple VIRQs in different types but can not support multiple VIRQs with the same types. Drivers might need the VIOMMU's vdev_id list or the exact vdev_id link of the passthrough device's to forward IRQs/events via the VIOMMU framework. Thus, extend the set/unset_vdev_id ioctls down to the driver using VIOMMU ops. This allows drivers to take the control of a vdev_id's lifecycle. The forwarding part is fairly simple but might need to replace a physical device ID with a virtual device ID. So, there comes with some helpers for drivers to use. As usual, this series comes with the selftest coverage for this new VIRQ, and with a real world use case in the ARM SMMUv3 driver. This must be based on the VIOMMU Part-1 series. It's on Github: https://github.com/nicolinc/iommufd/commits/iommufd_virq-v1 Paring QEMU branch for testing: https://github.com/nicolinc/qemu/commits/wip/for_iommufd_virq-v1 Thanks! Nicolin Nicolin Chen (10): iommufd: Rename IOMMUFD_OBJ_FAULT to IOMMUFD_OBJ_EVENT_IOPF iommufd: Rename fault.c to event.c iommufd: Add IOMMUFD_OBJ_EVENT_VIRQ and IOMMUFD_CMD_VIRQ_ALLOC iommufd/viommu: Allow drivers to control vdev_id lifecycle iommufd/viommu: Add iommufd_vdev_id_to_dev helper iommufd/viommu: Add iommufd_viommu_report_irq helper iommufd/selftest: Implement mock_viommu_set/unset_vdev_id iommufd/selftest: Add IOMMU_TEST_OP_TRIGGER_VIRQ for VIRQ coverage iommufd/selftest: Add EVENT_VIRQ test coverage iommu/arm-smmu-v3: Report virtual IRQ for device in user space drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 109 +++- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 2 + drivers/iommu/iommufd/Makefile | 2 +- drivers/iommu/iommufd/device.c | 2 + drivers/iommu/iommufd/event.c | 613 ++++++++++++++++++ drivers/iommu/iommufd/fault.c | 443 ------------- drivers/iommu/iommufd/hw_pagetable.c | 12 +- drivers/iommu/iommufd/iommufd_private.h | 147 ++++- drivers/iommu/iommufd/iommufd_test.h | 10 + drivers/iommu/iommufd/main.c | 13 +- drivers/iommu/iommufd/selftest.c | 66 ++ drivers/iommu/iommufd/viommu.c | 25 +- drivers/iommu/iommufd/viommu_api.c | 54 ++ include/linux/iommufd.h | 28 + include/uapi/linux/iommufd.h | 46 ++ tools/testing/selftests/iommu/iommufd.c | 11 + tools/testing/selftests/iommu/iommufd_utils.h | 64 ++ 17 files changed, 1130 insertions(+), 517 deletions(-) create mode 100644 drivers/iommu/iommufd/event.c delete mode 100644 drivers/iommu/iommufd/fault.c -- 2.43.0

1 year, 2 months

2
16
0 0

[PATCH RFC 08/10] mm: page_frag: add testing for the newly added API

by Yunsheng Lin

Add testing for the newly added prepare API, for both aligned and non-aligned API, also probe API is also tested along with prepare API. CC: Alexander Duyck <alexander.duyck(a)gmail.com> CC: Andrew Morton <akpm(a)linux-foundation.org> CC: Linux-MM <linux-mm(a)kvack.org> Signed-off-by: Yunsheng Lin <linyunsheng(a)huawei.com> --- .../selftests/mm/page_frag/page_frag_test.c | 76 +++++++++++++++++-- tools/testing/selftests/mm/run_vmtests.sh | 4 + tools/testing/selftests/mm/test_page_frag.sh | 27 +++++++ 3 files changed, 102 insertions(+), 5 deletions(-) diff --git a/tools/testing/selftests/mm/page_frag/page_frag_test.c b/tools/testing/selftests/mm/page_frag/page_frag_test.c index e806c1866e36..3b3c32389def 100644 --- a/tools/testing/selftests/mm/page_frag/page_frag_test.c +++ b/tools/testing/selftests/mm/page_frag/page_frag_test.c @@ -32,6 +32,10 @@ static bool test_align; module_param(test_align, bool, 0); MODULE_PARM_DESC(test_align, "use align API for testing"); +static bool test_prepare; +module_param(test_prepare, bool, 0); +MODULE_PARM_DESC(test_prepare, "use prepare API for testing"); + static int test_alloc_len = 2048; module_param(test_alloc_len, int, 0); MODULE_PARM_DESC(test_alloc_len, "alloc len for testing"); @@ -74,6 +78,21 @@ static int page_frag_pop_thread(void *arg) return 0; } +static void frag_frag_test_commit(struct page_frag_cache *nc, + struct page_frag *prepare_pfrag, + struct page_frag *probe_pfrag, + unsigned int used_sz) +{ + if (prepare_pfrag->page != probe_pfrag->page || + prepare_pfrag->offset != probe_pfrag->offset || + prepare_pfrag->size != probe_pfrag->size) { + force_exit = true; + WARN_ONCE(true, TEST_FAILED_PREFIX "wrong probed info\n"); + } + + page_frag_refill_commit(nc, prepare_pfrag, used_sz); +} + static int page_frag_push_thread(void *arg) { struct ptr_ring *ring = arg; @@ -86,15 +105,61 @@ static int page_frag_push_thread(void *arg) int ret; if (test_align) { - va = page_frag_alloc_align(&test_nc, test_alloc_len, - GFP_KERNEL, SMP_CACHE_BYTES); + if (test_prepare) { + struct page_frag prepare_frag, probe_frag; + void *probe_va; + + va = page_frag_alloc_refill_prepare_align(&test_nc, + test_alloc_len, + &prepare_frag, + GFP_KERNEL, + SMP_CACHE_BYTES); + + probe_va = __page_frag_alloc_refill_probe_align(&test_nc, + test_alloc_len, + &probe_frag, + -SMP_CACHE_BYTES); + if (va != probe_va) { + force_exit = true; + WARN_ONCE(true, TEST_FAILED_PREFIX "wrong va\n"); + } + + if (likely(va)) + frag_frag_test_commit(&test_nc, &prepare_frag, + &probe_frag, test_alloc_len); + } else { + va = page_frag_alloc_align(&test_nc, + test_alloc_len, + GFP_KERNEL, + SMP_CACHE_BYTES); + } if ((unsigned long)va & (SMP_CACHE_BYTES - 1)) { force_exit = true; WARN_ONCE(true, TEST_FAILED_PREFIX "unaligned va returned\n"); } } else { - va = page_frag_alloc(&test_nc, test_alloc_len, GFP_KERNEL); + if (test_prepare) { + struct page_frag prepare_frag, probe_frag; + void *probe_va; + + va = page_frag_alloc_refill_prepare(&test_nc, test_alloc_len, + &prepare_frag, GFP_KERNEL); + + probe_va = page_frag_alloc_refill_probe(&test_nc, test_alloc_len, + &probe_frag); + + if (va != probe_va) { + force_exit = true; + WARN_ONCE(true, TEST_FAILED_PREFIX "wrong va\n"); + } + + if (likely(va)) + frag_frag_test_commit(&test_nc, &prepare_frag, + &probe_frag, test_alloc_len); + } else { + va = page_frag_alloc(&test_nc, test_alloc_len, GFP_KERNEL); + } } if (!va) @@ -176,8 +241,9 @@ static int __init page_frag_test_init(void) } duration = (u64)ktime_us_delta(ktime_get(), start); - pr_info("%d of iterations for %s testing took: %lluus\n", nr_test, - test_align ? "aligned" : "non-aligned", duration); + pr_info("%d of iterations for %s %s API testing took: %lluus\n", nr_test, + test_align ? "aligned" : "non-aligned", + test_prepare ? "prepare" : "alloc", duration); out: ptr_ring_cleanup(&ptr_ring, NULL); diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/selftests/mm/run_vmtests.sh index 2c5394584af4..f6ff9080a6f2 100755 --- a/tools/testing/selftests/mm/run_vmtests.sh +++ b/tools/testing/selftests/mm/run_vmtests.sh @@ -464,6 +464,10 @@ CATEGORY="page_frag" run_test ./test_page_frag.sh aligned CATEGORY="page_frag" run_test ./test_page_frag.sh nonaligned +CATEGORY="page_frag" run_test ./test_page_frag.sh aligned_prepare + +CATEGORY="page_frag" run_test ./test_page_frag.sh nonaligned_prepare + echo "SUMMARY: PASS=${count_pass} SKIP=${count_skip} FAIL=${count_fail}" | tap_prefix echo "1..${count_total}" | tap_output diff --git a/tools/testing/selftests/mm/test_page_frag.sh b/tools/testing/selftests/mm/test_page_frag.sh index f55b105084cf..1c757fd11844 100755 --- a/tools/testing/selftests/mm/test_page_frag.sh +++ b/tools/testing/selftests/mm/test_page_frag.sh @@ -43,6 +43,8 @@ check_test_failed_prefix() { SMOKE_PARAM="test_push_cpu=$TEST_CPU_0 test_pop_cpu=$TEST_CPU_1" NONALIGNED_PARAM="$SMOKE_PARAM test_alloc_len=75 nr_test=$NR_TEST" ALIGNED_PARAM="$NONALIGNED_PARAM test_align=1" +NONALIGNED_PREPARE_PARAM="$NONALIGNED_PARAM test_prepare=1" +ALIGNED_PREPARE_PARAM="$ALIGNED_PARAM test_prepare=1" check_test_requirements() { @@ -77,6 +79,20 @@ run_aligned_check() insmod $DRIVER $ALIGNED_PARAM > /dev/null 2>&1 } +run_nonaligned_prepare_check() +{ + echo "Run performance tests to evaluate how fast nonaligned prepare API is." + + insmod $DRIVER $NONALIGNED_PREPARE_PARAM > /dev/null 2>&1 +} + +run_aligned_prepare_check() +{ + echo "Run performance tests to evaluate how fast aligned prepare API is." + + insmod $DRIVER $ALIGNED_PREPARE_PARAM > /dev/null 2>&1 +} + run_smoke_check() { echo "Run smoke test." @@ -87,6 +103,7 @@ run_smoke_check() usage() { echo -n "Usage: $0 [ aligned ] | [ nonaligned ] | | [ smoke ] | " + echo "[ aligned_prepare ] | [ nonaligned_prepare ] | " echo "manual parameters" echo echo "Valid tests and parameters:" @@ -107,6 +124,12 @@ usage() echo "# Performance testing for aligned alloc API" echo "$0 aligned" echo + echo "# Performance testing for nonaligned prepare API" + echo "$0 nonaligned_prepare" + echo + echo "# Performance testing for aligned prepare API" + echo "$0 aligned_prepare" + echo exit 0 } @@ -158,6 +181,10 @@ function run_test() run_nonaligned_check elif [[ "$1" = "aligned" ]]; then run_aligned_check + elif [[ "$1" = "nonaligned_prepare" ]]; then + run_nonaligned_prepare_check + elif [[ "$1" = "aligned_prepare" ]]; then + run_aligned_prepare_check else run_manual_check $@ fi -- 2.33.0

1 year, 2 months

1
0
0 0

[PATCH net-next v23 4/7] mm: page_frag: avoid caller accessing 'page_frag_cache' directly

by Yunsheng Lin

Use appropriate frag_page API instead of caller accessing 'page_frag_cache' directly. CC: Alexander Duyck <alexander.duyck(a)gmail.com> CC: Andrew Morton <akpm(a)linux-foundation.org> CC: Linux-MM <linux-mm(a)kvack.org> Signed-off-by: Yunsheng Lin <linyunsheng(a)huawei.com> Reviewed-by: Alexander Duyck <alexanderduyck(a)fb.com> Acked-by: Chuck Lever <chuck.lever(a)oracle.com> --- drivers/vhost/net.c | 2 +- include/linux/page_frag_cache.h | 10 ++++++++++ net/core/skbuff.c | 6 +++--- net/rxrpc/conn_object.c | 4 +--- net/rxrpc/local_object.c | 4 +--- net/sunrpc/svcsock.c | 6 ++---- tools/testing/selftests/mm/page_frag/page_frag_test.c | 2 +- 7 files changed, 19 insertions(+), 15 deletions(-) diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index f16279351db5..9ad37c012189 100644 --- a/drivers/vhost/net.c +++ b/drivers/vhost/net.c @@ -1325,7 +1325,7 @@ static int vhost_net_open(struct inode *inode, struct file *f) vqs[VHOST_NET_VQ_RX]); f->private_data = n; - n->pf_cache.va = NULL; + page_frag_cache_init(&n->pf_cache); return 0; } diff --git a/include/linux/page_frag_cache.h b/include/linux/page_frag_cache.h index 67ac8626ed9b..0a52f7a179c8 100644 --- a/include/linux/page_frag_cache.h +++ b/include/linux/page_frag_cache.h @@ -7,6 +7,16 @@ #include <linux/mm_types_task.h> #include <linux/types.h> +static inline void page_frag_cache_init(struct page_frag_cache *nc) +{ + nc->va = NULL; +} + +static inline bool page_frag_cache_is_pfmemalloc(struct page_frag_cache *nc) +{ + return !!nc->pfmemalloc; +} + void page_frag_cache_drain(struct page_frag_cache *nc); void __page_frag_cache_drain(struct page *page, unsigned int count); void *__page_frag_alloc_align(struct page_frag_cache *nc, unsigned int fragsz, diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 00afeb90c23a..6841e61a6bd0 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -753,14 +753,14 @@ struct sk_buff *__netdev_alloc_skb(struct net_device *dev, unsigned int len, if (in_hardirq() || irqs_disabled()) { nc = this_cpu_ptr(&netdev_alloc_cache); data = page_frag_alloc(nc, len, gfp_mask); - pfmemalloc = nc->pfmemalloc; + pfmemalloc = page_frag_cache_is_pfmemalloc(nc); } else { local_bh_disable(); local_lock_nested_bh(&napi_alloc_cache.bh_lock); nc = this_cpu_ptr(&napi_alloc_cache.page); data = page_frag_alloc(nc, len, gfp_mask); - pfmemalloc = nc->pfmemalloc; + pfmemalloc = page_frag_cache_is_pfmemalloc(nc); local_unlock_nested_bh(&napi_alloc_cache.bh_lock); local_bh_enable(); @@ -850,7 +850,7 @@ struct sk_buff *napi_alloc_skb(struct napi_struct *napi, unsigned int len) len = SKB_HEAD_ALIGN(len); data = page_frag_alloc(&nc->page, len, gfp_mask); - pfmemalloc = nc->page.pfmemalloc; + pfmemalloc = page_frag_cache_is_pfmemalloc(&nc->page); } local_unlock_nested_bh(&napi_alloc_cache.bh_lock); diff --git a/net/rxrpc/conn_object.c b/net/rxrpc/conn_object.c index 1539d315afe7..694c4df7a1a3 100644 --- a/net/rxrpc/conn_object.c +++ b/net/rxrpc/conn_object.c @@ -337,9 +337,7 @@ static void rxrpc_clean_up_connection(struct work_struct *work) */ rxrpc_purge_queue(&conn->rx_queue); - if (conn->tx_data_alloc.va) - __page_frag_cache_drain(virt_to_page(conn->tx_data_alloc.va), - conn->tx_data_alloc.pagecnt_bias); + page_frag_cache_drain(&conn->tx_data_alloc); call_rcu(&conn->rcu, rxrpc_rcu_free_connection); } diff --git a/net/rxrpc/local_object.c b/net/rxrpc/local_object.c index f9623ace2201..2792d2304605 100644 --- a/net/rxrpc/local_object.c +++ b/net/rxrpc/local_object.c @@ -452,9 +452,7 @@ void rxrpc_destroy_local(struct rxrpc_local *local) #endif rxrpc_purge_queue(&local->rx_queue); rxrpc_purge_client_connections(local); - if (local->tx_alloc.va) - __page_frag_cache_drain(virt_to_page(local->tx_alloc.va), - local->tx_alloc.pagecnt_bias); + page_frag_cache_drain(&local->tx_alloc); } /* diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c index 825ec5357691..b785425c3315 100644 --- a/net/sunrpc/svcsock.c +++ b/net/sunrpc/svcsock.c @@ -1608,7 +1608,6 @@ static void svc_tcp_sock_detach(struct svc_xprt *xprt) static void svc_sock_free(struct svc_xprt *xprt) { struct svc_sock *svsk = container_of(xprt, struct svc_sock, sk_xprt); - struct page_frag_cache *pfc = &svsk->sk_frag_cache; struct socket *sock = svsk->sk_sock; trace_svcsock_free(svsk, sock); @@ -1618,8 +1617,7 @@ static void svc_sock_free(struct svc_xprt *xprt) sockfd_put(sock); else sock_release(sock); - if (pfc->va) - __page_frag_cache_drain(virt_to_head_page(pfc->va), - pfc->pagecnt_bias); + + page_frag_cache_drain(&svsk->sk_frag_cache); kfree(svsk); } diff --git a/tools/testing/selftests/mm/page_frag/page_frag_test.c b/tools/testing/selftests/mm/page_frag/page_frag_test.c index 13c44133e009..e806c1866e36 100644 --- a/tools/testing/selftests/mm/page_frag/page_frag_test.c +++ b/tools/testing/selftests/mm/page_frag/page_frag_test.c @@ -126,7 +126,7 @@ static int __init page_frag_test_init(void) u64 duration; int ret; - test_nc.va = NULL; + page_frag_cache_init(&test_nc); atomic_set(&nthreads, 2); init_completion(&wait); -- 2.33.0

1 year, 2 months

1
0
0 0

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-kselftest-mirror