July 2025 - Linux-kselftest-mirror

[PATCH 0/3] selftests/nolibc: enable qemu-system tests with LLVM builds

by Thomas Weißschuh

Currently the test setup does not support running nolibc-test built with LLVM in qemu-system. Enable this.

FYI, sparc32 on LLVM seems to be broken at the moment. To me this looks like an LLVM regression, emitting invalid object code.

Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
---
Thomas Weißschuh (3):
      selftests/nolibc: deduplicate invocations of toplevel Makefile
      selftests/nolibc: don't pass CC to toplevel Makefile
      selftests/nolibc: always compile the kernel with GCC

 tools/testing/selftests/nolibc/Makefile.nolibc | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)
---
base-commit: b9e50363178a40c76bebaf2f00faa2b0b6baf8d1
change-id: 20250719-nolibc-llvm-system-311762b62829

Best regards,
-- 
Thomas Weißschuh <linux(a)weissschuh.net>

1 day, 6 hours


[PATCH v2 0/2] rust: minor idiomatic fixes to doctest generator

by Tamir Duberstein

Please see individual commit messages.

Signed-off-by: Tamir Duberstein <tamird(a)gmail.com>
---
Changes in v2:
- rustfmt.
- Alice's RB.
- Add second patch to emit information in panic rather than separately to stderr.
- Link to v1: https://lore.kernel.org/r/20250527-idiomatic-match-slice-v1-1-34b0b1d1d58c@…
---
Tamir Duberstein (2):
      rust: replace length checks with match
      rust: emit path candidates in panic message

 scripts/rustdoc_test_gen.rs | 33 +++++++++++++++++----------------
 1 file changed, 17 insertions(+), 16 deletions(-)
---
base-commit: 1ce98bb2bb30713ec4374ef11ead0d7d3e856766
change-id: 20250527-idiomatic-match-slice-26a79d100e4d

Best regards,
-- 
Tamir Duberstein <tamird(a)gmail.com>

1 day, 10 hours


[PATCH v2 00/15] selftests/futex: Refactor tests to use kselftest_harness.h

by André Almeida

This patch series refactors all futex selftests to use kselftest_harness.h instead of futex's logging.h, as discussed here [1].

This allows removing a lot of boilerplate code and simplifying some parts of the test logic, mainly when a test needs to exit early. The result is more than 500 lines removed from tools/testing/selftests/futex/. It also enables new tests to use kselftest_harness.h features such as the ASSERT_*() macros.

There are some caveats around this refactor:

- logging.h had verbosity levels, while kselftest_harness.h doesn't. I created a new print function called ksft_print_dbg_msg() that prints the message if the user passes the -d flag, so there is now an equivalent of this feature.

- The futex_requeue_pi test accepted command line arguments to be used as test parameters (e.g. ./futex_requeue_pi -b -l -t 500000). This doesn't work with kselftest_harness.h because there's no straightforward way to pass command line arguments to the test. I used FIXTURE_VARIANT() to achieve the same result, but now the parameters live inside the test file instead of in functional/run.sh. This slightly increased the number of test cases for futex_requeue_pi, from 22 to 24.

- test_harness_run() calls mmap(MAP_SHARED) before running the test, and this has caused a side effect in the futex_numa_mpol.c test. This test also calls mmap() and then tries to access an address outside the boundaries of this mapped memory for a "Memory out of range" subtest, where the kernel should return -EACCES. After the refactor, the test address might fall inside the first mapped region, making it a valid address, so the syscall succeeds and the test fails. To fix that, I created a small "buffer zone" with mmap(PROT_NONE) between both mmaps.

I have compared the results of run.sh before and after this patchset and didn't find any regression in the test results.

Thanks,
André

[1] https://lore.kernel.org/lkml/87ecv6p364.ffs@tglx/
---
Changes in v2:
- Rebased on top of tip/master
- Dropped priv_hash global test variant now that this feature was dropped
- Added include <stdbool.h> in the first patch
- Link to v1: https://lore.kernel.org/r/20250704-tonyk-robust_test_cleanup-v1-0-c0ff4f24c…
---
André Almeida (15):
      selftests: kselftest: Create ksft_print_dbg_msg()
      selftests/futex: Refactor futex_requeue_pi with kselftest_harness.h
      selftests/futex: Refactor futex_requeue_pi_mismatched_ops with kselftest_harness.h
      selftests/futex: Refactor futex_requeue_pi_signal_restart with kselftest_harness.h
      selftests/futex: Refactor futex_wait_timeout with kselftest_harness.h
      selftests/futex: Refactor futex_wait_wouldblock with kselftest_harness.h
      selftests/futex: Refactor futex_wait_unitialized_heap with kselftest_harness.h
      selftests/futex: Refactor futex_wait_private_mapped_file with kselftest_harness.h
      selftests/futex: Refactor futex_wait with kselftest_harness.h
      selftests/futex: Refactor futex_requeue with kselftest_harness.h
      selftests/futex: Refactor futex_waitv with kselftest_harness.h
      selftests/futex: Refactor futex_priv_hash with kselftest_harness.h
      selftests/futex: Refactor futex_numa_mpol with kselftest_harness.h
      selftests/futex: Drop logging.h include from futex_numa
      selftests/futex: Remove logging.h file

 tools/testing/selftests/futex/functional/Makefile  |   3 +-
 .../selftests/futex/functional/futex_numa.c        |   3 +-
 .../selftests/futex/functional/futex_numa_mpol.c   |  57 ++---
 .../selftests/futex/functional/futex_priv_hash.c   |  49 +---
 .../selftests/futex/functional/futex_requeue.c     |  76 ++----
 .../selftests/futex/functional/futex_requeue_pi.c  | 261 ++++++++++-----------
 .../functional/futex_requeue_pi_mismatched_ops.c   |  80 ++-----
 .../functional/futex_requeue_pi_signal_restart.c   | 129 +++-------
 .../selftests/futex/functional/futex_wait.c        | 103 +++-----
 .../functional/futex_wait_private_mapped_file.c    |  83 ++-----
 .../futex/functional/futex_wait_timeout.c          | 139 +++++------
 .../functional/futex_wait_uninitialized_heap.c     |  76 ++----
 .../futex/functional/futex_wait_wouldblock.c       |  75 ++----
 .../selftests/futex/functional/futex_waitv.c       |  98 ++++----
 tools/testing/selftests/futex/functional/run.sh    |  62 +----
 tools/testing/selftests/futex/include/logging.h    | 148 ------------
 tools/testing/selftests/kselftest.h                |  14 ++
 tools/testing/selftests/kselftest_harness.h        |  13 +-
 18 files changed, 465 insertions(+), 1004 deletions(-)
---
base-commit: ed0272f0675f31642c3d445a596b544de9db405b
change-id: 20250703-tonyk-robust_test_cleanup-d1f3406365d9

Best regards,
-- 
André Almeida <andrealmeid(a)igalia.com>

1 day, 10 hours


[PATCH v2] selftests: firmware: Add details in error logging

by Harshal

Specify details in logs of failed cases

Signed-off-by: Harshal <embedkari167(a)gmail.com>
---
v2:
- revert back to exit() instead of die() to avoid modifying system behaviour
v1: https://lore.kernel.org/all/c7c071ed-6a4e-4a9c-ba9d-c745fd42c22f@linuxfound…

 tools/testing/selftests/firmware/fw_namespace.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/firmware/fw_namespace.c b/tools/testing/selftests/firmware/fw_namespace.c
index 04757dc7e546..5b0032498ede 100644
--- a/tools/testing/selftests/firmware/fw_namespace.c
+++ b/tools/testing/selftests/firmware/fw_namespace.c
@@ -38,7 +38,7 @@ static void trigger_fw(const char *fw_name, const char *sys_path)
 
 	fd = open(sys_path, O_WRONLY);
 	if (fd < 0)
-		die("open failed: %s\n",
+		die("open of sys_path failed: %s\n",
 		    strerror(errno));
 	if (write(fd, fw_name, strlen(fw_name)) != strlen(fw_name))
 		exit(EXIT_FAILURE);
@@ -52,10 +52,10 @@ static void setup_fw(const char *fw_path)
 
 	fd = open(fw_path, O_WRONLY | O_CREAT, 0600);
 	if (fd < 0)
-		die("open failed: %s\n",
+		die("open of firmware file failed: %s\n",
 		    strerror(errno));
 	if (write(fd, fw, sizeof(fw) -1) != sizeof(fw) -1)
-		die("write failed: %s\n",
+		die("write to firmware file failed: %s\n",
 		    strerror(errno));
 	close(fd);
 }
@@ -66,7 +66,7 @@ static bool test_fw_in_ns(const char *fw_name, const char *sys_path, bool block_
 
 	if (block_fw_in_parent_ns)
 		if (mount("test", "/lib/firmware", "tmpfs", MS_RDONLY, NULL) == -1)
-			die("blocking firmware in parent ns failed\n");
+			die("blocking firmware in parent namespace failed\n");
 
 	child = fork();
 	if (child == -1) {
@@ -99,11 +99,11 @@ static bool test_fw_in_ns(const char *fw_name, const char *sys_path, bool block_
 			strerror(errno));
 	}
 	if (mount(NULL, "/", NULL, MS_SLAVE|MS_REC, NULL) == -1)
-		die("remount root in child ns failed\n");
+		die("remount root in child namespace failed\n");
 
 	if (!block_fw_in_parent_ns) {
 		if (mount("test", "/lib/firmware", "tmpfs", MS_RDONLY, NULL) == -1)
-			die("blocking firmware in child ns failed\n");
+			die("blocking firmware in child namespace failed\n");
 	} else
 		umount("/lib/firmware");
 
@@ -129,8 +129,8 @@ int main(int argc, char **argv)
 		die("error: failed to build full fw_path\n");
 
 	setup_fw(fw_path);
-
 	setvbuf(stdout, NULL, _IONBF, 0);
+
 	/* Positive case: firmware in PID1 mount namespace */
 	printf("Testing with firmware in parent namespace (assumed to be same file system as PID1)\n");
 	if (!test_fw_in_ns(fw_name, sys_path, false))
-- 
2.43.0

1 day, 14 hours


[PATCH RFC 0/4] landlock: add LANDLOCK_SCOPE_MEMFD_EXEC execution

by Abhinav Saxena

This patch series introduces LANDLOCK_SCOPE_MEMFD_EXEC, a new Landlock scoping mechanism that restricts execution of anonymous memory file descriptors (memfd) created via memfd_create(2). This addresses security gaps where processes can bypass W^X policies and execute arbitrary code through anonymous memory objects.

Fixes: https://github.com/landlock-lsm/linux/issues/37

SECURITY PROBLEM
================

Current Landlock filesystem restrictions do not cover memfd objects, allowing processes to:

1. Read-to-execute bypass: Create writable memfd, inject code, then execute via mmap(PROT_EXEC) or direct execve()

2. Anonymous execution: Execute code without touching the filesystem via execve("/proc/self/fd/N") where N is a memfd descriptor

3. Cross-domain access violations: Pass memfd between processes to bypass domain restrictions

These scenarios can occur in sandboxed environments where filesystem access is restricted but memfd creation remains possible.

IMPLEMENTATION
==============

The implementation adds hierarchical execution control through domain scoping:

Core Components:
- is_memfd_file(): Reliable memfd detection via "memfd:" dentry prefix
- domain_is_scoped(): Cross-domain hierarchy checking (moved to domain.c)
- LSM hooks: mmap_file, file_mprotect, bprm_creds_for_exec
- Creation-time restrictions: hook_file_alloc_security

Security Matrix:
Execution decisions follow domain hierarchy rules preventing both same-domain bypass attempts and cross-domain access violations while preserving legitimate hierarchical access patterns.

Domain Hierarchy with LANDLOCK_SCOPE_MEMFD_EXEC:
===============================================

Root (no domain) - No restrictions
|
+-- Domain A [SCOPE_MEMFD_EXEC] Layer 1
|   +-- memfd_A (tagged with Domain A as creator)
|   |
|   +-- Domain A1 (child) [NO SCOPE] Layer 2
|   |   +-- Inherits Layer 1 restrictions from parent
|   |   +-- memfd_A1 (can create, inherits restrictions)
|   |   +-- Domain A1a [SCOPE_MEMFD_EXEC] Layer 3
|   |       +-- memfd_A1a (tagged with Domain A1a)
|   |
|   +-- Domain A2 (child) [SCOPE_MEMFD_EXEC] Layer 2
|       +-- memfd_A2 (tagged with Domain A2 as creator)
|       +-- CANNOT access memfd_A1 (different subtree)
|
+-- Domain B [SCOPE_MEMFD_EXEC] Layer 1
    +-- memfd_B (tagged with Domain B as creator)
    +-- CANNOT access ANY memfd from Domain A subtree

Execution Decision Matrix:
========================

Executor->  |  A  | A1 | A1a | A2 | B  | Root
Creator     |     |    |     |    |    |
------------|-----|----|-----|----|----|-----
Domain A    |  X  | X  |  X  | X  | X  |  Y
Domain A1   |  Y  | X  |  X  | X  | X  |  Y
Domain A1a  |  Y  | Y  |  X  | X  | X  |  Y
Domain A2   |  Y  | X  |  X  | X  | X  |  Y
Domain B    |  X  | X  |  X  | X  | X  |  Y
Root        |  Y  | Y  |  Y  | Y  | Y  |  Y

Legend: Y = Execution allowed, X = Execution denied

Scenarios Covered:
- Direct mmap(PROT_EXEC) on memfd files
- Two-stage mmap(PROT_READ) + mprotect(PROT_EXEC) bypass attempts
- execve("/proc/self/fd/N") anonymous execution
- execveat() and fexecve() file descriptor execution
- Cross-process memfd inheritance and IPC passing

TESTING
=======

All patches have been validated with:
- scripts/checkpatch.pl --strict (clean)
- Selftests covering same-domain restrictions, cross-domain hierarchy enforcement, and regular file isolation
- KUnit tests for memfd detection edge cases

DISCLAIMER
==========

My understanding of Landlock scoping semantics may be limited, but this implementation reflects my current understanding based on available documentation and code analysis. I welcome feedback and corrections regarding the scoping logic and domain hierarchy enforcement.

Signed-off-by: Abhinav Saxena <xandfury(a)gmail.com>
---
Abhinav Saxena (4):
      landlock: add LANDLOCK_SCOPE_MEMFD_EXEC scope
      landlock: implement memfd detection
      landlock: add memfd exec LSM hooks and scoping
      selftests/landlock: add memfd execution tests

 include/uapi/linux/landlock.h                      |   5 +
 security/landlock/.kunitconfig                     |   1 +
 security/landlock/audit.c                          |   4 +
 security/landlock/audit.h                          |   1 +
 security/landlock/cred.c                           |  14 -
 security/landlock/domain.c                         |  67 ++++
 security/landlock/domain.h                         |   4 +
 security/landlock/fs.c                             | 405 ++++++++++++++++++++-
 security/landlock/limits.h                         |   2 +-
 security/landlock/task.c                           |  67 ----
 .../selftests/landlock/scoped_memfd_exec_test.c    | 325 +++++++++++++++++
 11 files changed, 812 insertions(+), 83 deletions(-)
---
base-commit: 5b74b2eff1eeefe43584e5b7b348c8cd3b723d38
change-id: 20250716-memfd-exec-ac0d582018c3

Best regards,
-- 
Abhinav Saxena <xandfury(a)gmail.com>
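The "memfd:" name prefix mentioned under Core Components is also visible from userspace, which makes the detection idea easy to try out. The following is a rough userspace analogue only, not the kernel-side is_memfd_file() from the series: the helpers make_memfd() and fd_is_memfd() are hypothetical names, and the check reads the /proc/self/fd/N symlink rather than a dentry. A regular file whose path happens to start with "/memfd:" would fool this sketch, so it is not a security check.

```c
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an anonymous memory file through the raw
 * system call, so the sketch does not depend on the libc wrapper.
 * Returns a file descriptor, or -1 on failure.
 */
static int make_memfd(const char *name)
{
	return (int)syscall(SYS_memfd_create, name, 0u);
}

/*
 * Hypothetical helper: report whether fd looks like a memfd by checking
 * that its /proc/self/fd link target starts with "/memfd:".
 * Returns 1 for a memfd, 0 otherwise, -1 if the link cannot be read.
 */
static int fd_is_memfd(int fd)
{
	char link[64];
	char target[256];
	ssize_t n;

	snprintf(link, sizeof(link), "/proc/self/fd/%d", fd);
	n = readlink(link, target, sizeof(target) - 1);
	if (n < 0)
		return -1;
	target[n] = '\0';
	return strncmp(target, "/memfd:", 7) == 0;
}
```

For a descriptor returned by make_memfd("demo") the link target has the form "/memfd:demo (deleted)", so fd_is_memfd() reports 1, while a pipe or socket descriptor reports 0.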

2 days, 1 hour


[PATCH v8 0/6] use per-vma locks for /proc/pid/maps reads

by Suren Baghdasaryan

Reading /proc/pid/maps requires read-locking mmap_lock which prevents any other task from concurrently modifying the address space. This guarantees coherent reporting of virtual address ranges, however it can block important updates from happening. Oftentimes /proc/pid/maps readers are low priority monitoring tasks and them blocking high priority tasks results in priority inversion.

Locking the entire address space is required to present a fully coherent picture of the address space, however even the current implementation does not strictly guarantee that, because it outputs vmas in page-size chunks and drops mmap_lock in between each chunk. Address space modifications are possible while mmap_lock is dropped and userspace reading the content is expected to deal with possible concurrent address space modifications. Considering these relaxed rules, holding mmap_lock is not strictly needed as long as we can guarantee that a concurrently modified vma is reported either in its original form or after it was modified.

This patchset switches from holding mmap_lock while reading /proc/pid/maps to taking per-vma locks as we walk the vma tree. This reduces the contention with tasks modifying the address space because they would have to contend for the same vma as opposed to the entire address space. The previous version of this patchset [1] tried to perform /proc/pid/maps reading under RCU, however its implementation is quite complex and the results are worse than the new version because it still relied on mmap_lock speculation which retries if any part of the address space gets modified. The new implementation is both simpler and results in less contention. Note that a similar approach would not work for /proc/pid/smaps reading as it also walks the page table and that's not RCU-safe.

Paul McKenney designed a test [2] to measure mmap/munmap latencies while concurrently reading /proc/pid/maps. The test has a pair of processes scanning /proc/PID/maps, and another process unmapping and remapping 4K pages from a 128MB range of anonymous memory. At the end of each 10 second run, the latency of each mmap() or munmap() operation is measured, and for each run the maximum and mean latency is printed. The map/unmap process is started first, its PID is passed to the scanners, and then the map/unmap process waits until both scanners are running before starting its timed test. The scanners keep scanning until the specified /proc/PID/maps file disappears.

The latest results from Paul:

Stock mm-unstable, all of the runs had maximum latencies in excess of 0.5 milliseconds, and with 80% of the runs' latencies exceeding a full millisecond, and ranging up beyond 4 full milliseconds. In contrast, 99% of the runs with this patch series applied had maximum latencies of less than 0.5 milliseconds, with the single outlier at only 0.608 milliseconds.

From a median-performance (as opposed to maximum-latency) viewpoint, this patch series also looks good, with stock mm weighing in at 11 microseconds and patch series at 6 microseconds, better than a 2x improvement.

Before the change:
./run-proc-vs-map.sh --nsamples 100 --rawdata -- --busyduration 2
    0.011     0.008     0.521
    0.011     0.008     0.552
    0.011     0.008     0.590
    0.011     0.008     0.660
    ...
    0.011     0.015     2.987
    0.011     0.015     3.038
    0.011     0.016     3.431
    0.011     0.016     4.707

After the change:
./run-proc-vs-map.sh --nsamples 100 --rawdata -- --busyduration 2
    0.006     0.005     0.026
    0.006     0.005     0.029
    0.006     0.005     0.034
    0.006     0.005     0.035
    ...
    0.006     0.006     0.421
    0.006     0.006     0.423
    0.006     0.006     0.439
    0.006     0.006     0.608

The patchset also adds a number of tests to check for /proc/pid/maps data coherency. They are designed to detect any unexpected data tearing while performing some common address space modifications (vma split, resize and remap). Even before these changes, reading /proc/pid/maps might have inconsistent data because the file is read page-by-page with mmap_lock being dropped between the pages. An example of user-visible inconsistency can be that the same vma is printed twice: once before it was modified and then after the modifications. For example if vma was extended, it might be found and reported twice. What is not expected is to see a gap where there should have been a vma both before and after modification. This patchset increases the chances of such tearing, therefore it's even more important now to test for unexpected inconsistencies.

In [3] Lorenzo identified the following possible vma merging/splitting scenarios:

Merges with changes to existing vmas:
1. Merge both - mapping a vma over another one and between two vmas which can be merged after this replacement;
2. Merge left full - mapping a vma at the end of an existing one and completely over its right neighbor;
3. Merge left partial - mapping a vma at the end of an existing one and partially over its right neighbor;
4. Merge right full - mapping a vma before the start of an existing one and completely over its left neighbor;
5. Merge right partial - mapping a vma before the start of an existing one and partially over its left neighbor;

Merges without changes to existing vmas:
6. Merge both - mapping a vma into a gap between two vmas which can be merged after the insertion;
7. Merge left - mapping a vma at the end of an existing one;
8. Merge right - mapping a vma before the start of an existing one;

Splits:
9. Split with new vma at the lower address;
10. Split with new vma at the higher address;

If such merges or splits happen concurrently with the /proc/pid/maps reading we might report a vma twice, once before the modification and once after it is modified:

Case 1 might report overwritten and previous vma along with the final merged vma;
Case 2 might report previous and the final merged vma;
Case 3 might cause us to retry once we detect the temporary gap caused by shrinking of the right neighbor;
Case 4 might report overwritten and the final merged vma;
Case 5 might cause us to retry once we detect the temporary gap caused by shrinking of the left neighbor;
Case 6 might report previous vma and the gap along with the final merged vma;
Case 7 might report previous and the final merged vma;
Case 8 might report the original gap and the final merged vma covering the gap;
Case 9 might cause us to retry once we detect the temporary gap caused by shrinking of the original vma at the vma start;
Case 10 might cause us to retry once we detect the temporary gap caused by shrinking of the original vma at the vma end;

In all these cases the retry mechanism prevents us from reporting possible temporary gaps.

Changes since v7 [4]:
- Refactored tests to use kselftest harness, per David Hildenbrand and Lorenzo Stoakes
- Removed PROCMAP_QUERY selftest, per David Hildenbrand and Lorenzo Stoakes
- Added Acked-by, per David Hildenbrand
- Replaced sentinel values with named definitions, per Vlastimil Babka
- Added Reviewed-by, per Vlastimil Babka

!!! NOTES FOR APPLYING THE PATCHSET !!!

Applies cleanly over mm-unstable after reverting v7 version of this patchset (from 94951ab6fe6f to e47914e6c28f in mm-unstable).

[1] https://lore.kernel.org/all/20250418174959.1431962-1-surenb@google.com/
[2] https://github.com/paulmckrcu/proc-mmap_sem-test
[3] https://lore.kernel.org/all/e1863f40-39ab-4e5b-984a-c48765ffde1c@lucifer.lo…
[4] https://lore.kernel.org/all/20250716030557.1547501-1-surenb@google.com/

Suren Baghdasaryan (6):
  selftests/proc: add /proc/pid/maps tearing from vma split test
  selftests/proc: extend /proc/pid/maps tearing test to include vma resizing
  selftests/proc: extend /proc/pid/maps tearing test to include vma remapping
  selftests/proc: add verbose mode for /proc/pid/maps tearing tests
  fs/proc/task_mmu: remove conversion of seq_file position to unsigned
  fs/proc/task_mmu: read proc/pid/maps under per-vma lock

 fs/proc/internal.h                            |   5 +
 fs/proc/task_mmu.c                            | 158 +++-
 include/linux/mmap_lock.h                     |  11 +
 mm/madvise.c                                  |   3 +-
 mm/mmap_lock.c                                |  93 +++
 tools/testing/selftests/proc/.gitignore       |   1 +
 tools/testing/selftests/proc/Makefile         |   1 +
 tools/testing/selftests/proc/proc-maps-race.c | 741 ++++++++++++++++++
 8 files changed, 997 insertions(+), 16 deletions(-)
 create mode 100644 tools/testing/selftests/proc/proc-maps-race.c

-- 
2.50.0.727.gbf7dc18ff4-goog

2 days, 9 hours


[PATCH v2 0/6] VMM can handle guest SEA via KVM_EXIT_ARM_SEA

by Jiaqi Yan

Problem
=======

When host APEI is unable to claim a synchronous external abort (SEA) during a stage-2 guest abort, today KVM directly injects an async SError into the VCPU and then resumes it. The injected SError usually results in an unpleasant guest kernel panic.

One major source of guest SEAs is a VCPU consuming a recoverable uncorrected memory error (UER), which is not uncommon at all in modern datacenter servers with large amounts of physical memory. Although SError and guest panic are sufficient to stop the propagation of corrupted memory, there is room to recover from a UER in a more graceful manner.

Proposed Solution
=================

Alternatively KVM can replay the SEA to the faulting VCPU, via the existing KVM_SET_VCPU_EVENTS API. If the memory poison consumption or the fault that causes the SEA is not from the guest kernel, the blast radius can be limited to the consuming or faulting guest userspace process, so the VM can keep running.

In addition, rather than handling this under the hood without involving userspace, there are benefits to redirecting the SEA to the VMM:

- VM customers care about the disruptions caused by memory errors, and the VMM usually has the responsibility to start the process of notifying the customers of memory error events in their VMs. For example some cloud providers emit a critical log in their observability UI [1], and provide a playbook for customers on how to mitigate disruptions to their workloads.

- VMM can protect future memory error consumption by unmapping the poisoned pages from the stage-2 page table with KVM userfault, or by splitting the memslot that contains the poisoned guest pages [2].

- VMM can keep track of SEA events in the VM. When the VMM thinks the status on the host or the VM is bad enough, e.g. the number of distinct SEAs exceeds a threshold, it can restart the VM on another healthy host.

- Behavior parity with x86 architecture. When a machine check exception (MCE) is caused by a VCPU, the kernel or KVM signals userspace SIGBUS to let the VMM either recover from the MCE, or terminate itself with the VM. The prior RFC proposes to implement SIGBUS on arm64 as well, but Marc preferred VCPU exit over signal [3]. However, implementation aside, returning SEA to VMM is on par with returning MCE to VMM.

Once SEA is redirected to the VMM, among other actions, the VMM is encouraged to inject external aborts into the faulting VCPU, which is already supported by KVM on arm64. We noticed that injecting an instruction abort is not fully supported by KVM_SET_VCPU_EVENTS. This patchset completes that support.

New UAPIs
=========

This patchset introduces the following userspace-visible changes to empower the VMM to control what happens next for an SEA on guest memory:

- KVM_CAP_ARM_SEA_TO_USER. While taking an SEA, if userspace has enabled this new capability at VM creation, and the SEA is not caused by memory allocated for the stage-2 translation table, instead of injecting SError, return KVM_EXIT_ARM_SEA to userspace.

- KVM_EXIT_ARM_SEA. This is the VM exit reason the VMM gets. The details about the SEA are provided in arm_sea as far as possible, including the sanitized ESR value at EL2, whether the guest virtual and physical addresses (GVA and GPA) are available, and their values if so.

- KVM_CAP_ARM_INJECT_EXT_IABT. VMM today can inject an external data abort to a VCPU via the KVM_SET_VCPU_EVENTS API. However, in case of an instruction abort, the VMM cannot inject it via KVM_SET_VCPU_EVENTS. KVM_CAP_ARM_INJECT_EXT_IABT is just a natural extension of KVM_CAP_ARM_INJECT_EXT_DABT that tells the VMM that KVM_SET_VCPU_EVENTS now supports external instruction aborts.

* From v1 [4]:
  - Rebased on commit 4d62121ce9b5 ("KVM: arm64: vgic-debug: Avoid dereferencing NULL ITE pointer").
  - Sanitize ESR_EL2 before reporting it to userspace.
  - Do not do KVM_EXIT_ARM_SEA when SEA is caused by memory allocated to stage-2 translation table.

[1] https://cloud.google.com/solutions/sap/docs/manage-host-errors
[2] https://lore.kernel.org/kvm/20250109204929.1106563-1-jthoughton@google.com
[3] https://lore.kernel.org/kvm/86pljbqqh0.wl-maz@kernel.org
[4] https://lore.kernel.org/kvm/20250505161412.1926643-1-jiaqiyan@google.com

Jiaqi Yan (5):
  KVM: arm64: VM exit to userspace to handle SEA
  KVM: arm64: Set FnV for VCPU when FAR_EL2 is invalid
  KVM: selftests: Test for KVM_EXIT_ARM_SEA and KVM_CAP_ARM_SEA_TO_USER
  KVM: selftests: Test for KVM_CAP_INJECT_EXT_IABT
  Documentation: kvm: new uAPI for handling SEA

Raghavendra Rao Ananta (1):
  KVM: arm64: Allow userspace to inject external instruction aborts

 Documentation/virt/kvm/api.rst                | 128 ++++++-
 arch/arm64/include/asm/kvm_emulate.h          |  67 ++++
 arch/arm64/include/asm/kvm_host.h             |   8 +
 arch/arm64/include/asm/kvm_ras.h              |   2 +-
 arch/arm64/include/uapi/asm/kvm.h             |   3 +-
 arch/arm64/kvm/arm.c                          |   6 +
 arch/arm64/kvm/guest.c                        |  13 +-
 arch/arm64/kvm/inject_fault.c                 |   3 +
 arch/arm64/kvm/mmu.c                          |  59 ++-
 include/uapi/linux/kvm.h                      |  12 +
 tools/arch/arm64/include/asm/esr.h            |   2 +
 tools/arch/arm64/include/uapi/asm/kvm.h       |   3 +-
 tools/testing/selftests/kvm/Makefile.kvm      |   2 +
 .../testing/selftests/kvm/arm64/inject_iabt.c |  98 +++++
 .../testing/selftests/kvm/arm64/sea_to_user.c | 340 ++++++++++++++++++
 tools/testing/selftests/kvm/lib/kvm_util.c    |   1 +
 16 files changed, 718 insertions(+), 29 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/arm64/inject_iabt.c
 create mode 100644 tools/testing/selftests/kvm/arm64/sea_to_user.c

-- 
2.49.0.1266.g31b7d2e469-goog

2 days, 11 hours



[PATCH 0/7] Replace "__auto_type" with "auto"

by H. Peter Anvin

"auto" was defined as a keyword back in the K&R days, but as a storage class specifier. No one ever used it, since it was and is the default storage class for local variables.

C++11 recycled the keyword to allow a type to be declared based on the type of an initializer. This was finally adopted into standard C in C23.

gcc and clang provide the "__auto_type" alias keyword as an extension for pre-C23, however, there is no reason to pollute the bulk of the source base with this temporary keyword; instead define "auto" as a macro unless the compiler is running in C23+ mode.

This macro is added in <linux/compiler_types.h> because that header is included in some of the tools headers, whereas <linux/compiler.h> is not, as it has a bunch of very kernel-specific things in it.

---
 arch/nios2/include/asm/uaccess.h                        |  4 ++--
 arch/x86/include/asm/bug.h                              |  2 +-
 arch/x86/include/asm/string_64.h                        |  6 +++---
 arch/x86/include/asm/uaccess_64.h                       |  2 +-
 fs/proc/inode.c                                         | 16 ++++++++--------
 include/linux/cleanup.h                                 |  4 ++--
 include/linux/compiler.h                                |  2 +-
 include/linux/compiler_types.h                          | 13 +++++++++++++
 include/linux/minmax.h                                  |  6 +++---
 tools/testing/selftests/bpf/prog_tests/socket_helpers.h |  9 +++++++--
 tools/virtio/linux/compiler.h                           |  2 +-
 11 files changed, 42 insertions(+), 24 deletions(-)

2 days, 18 hours
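The "[PATCH 0/7] Replace "__auto_type" with "auto"" cover letter describes defining "auto" as a macro on pre-C23 compilers. A minimal sketch of that idea follows, assuming gcc or clang. The exact preprocessor condition used in <linux/compiler_types.h> is not shown in the cover letter, so the check below is an assumption, and max_of() and largest() are hypothetical names used only for illustration (they are not the kernel's minmax.h helpers).

```c
/*
 * Assumed condition: fall back to the __auto_type extension unless the
 * compiler reports C23 or later, where "auto" already infers the type.
 */
#if !defined(__STDC_VERSION__) || __STDC_VERSION__ < 202311L
# define auto __auto_type
#endif

/*
 * Hypothetical type-generic maximum. Each argument is evaluated once and
 * stored in a local whose type is deduced from the initializer. Uses the
 * GNU statement-expression extension, as kernel code commonly does.
 */
#define max_of(a, b) ({		\
	auto _a = (a);		\
	auto _b = (b);		\
	_a > _b ? _a : _b;	\
})

/* Plain function wrapper showing the macro used with long operands. */
static long largest(long a, long b)
{
	return max_of(a, b);
}
```

The same max_of() works for ints, longs and doubles without spelling out a type, which is the convenience the series wants to expose under the standard keyword rather than the compiler-specific one.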


[PATCH RFC 0/3] selftests/landlock: scoping abstractions

by Abhinav Saxena

Hi all,

I was starting to work on the memfd-exec[1] feature and observed that Landlock's scoped-IPC features (abstract UNIX sockets and signals) follow a consistent high-level model, which I'm calling a resource-accessor pattern:

    Resource Process <-> Accessor Process

- Resource process: owns or manages the asset
  - socket creator (bind/accept)
  - signal handler
  - memfd creator
- Accessor process: attempts to use the asset
  - socket client (connect/sendto)
  - signal sender
  - memfd executor

RESOURCE-ACCESSOR PATTERN FUNDAMENTALS
======================================

This pattern appears fundamental to Landlock scoping because:

1. Consistent enforcement model: Landlock restrictions are enforced only on the accessor side; the resource side remains unconstrained across all scope types.

2. Reflects actual security boundaries: In practice, sandboxed processes typically need to access resources created by other processes, not the reverse.

3. Scalable design: This model works consistently whether processes are in parent-child relationships or independent peer domains.

4. Real-world usage patterns: Container runtimes and sandbox orchestrators routinely start multiple workers that restrict themselves independently.

CURRENT TEST COVERAGE GAP
=========================

Existing self-tests cover hierarchical resource <-> accessor pairs but do not exercise the case where each task enters an independent domain. While 'sibling_domain' tests exist, they still use parent-child relationship patterns rather than true peer domains.

Current Coverage (Linear Hierarchies Only):
-------------------------------------------

Type 1: Parent-Child (scoped_domains)
    P1 ---- P2

Type 2: Three Generations (scoped_vs_unscoped)
    P1 ---- P2 ---- P3

Variations tested for both types:
- No domains
- Various scoped domain combinations
- Nested domains within inherited domains
- Mixed domain types (SCOPE vs OTHER vs NONE)

Missing Coverage (True Sibling Scenarios):
------------------------------------------

Root
 |
 +-- Child A [various domain types]
 |
 +-- Child B [various domain types]

Missing test scenarios:
- A <-> B cross-sibling communication
- Mixed sibling domain combinations
- Sibling isolation enforcement
- Parent -> A, Parent -> B differential access

SOLUTION
========

This series implements the missing sibling pattern using the resource-accessor model. The tests create a fork tree that looks like this:

coordinator (no domain)
 |
 +-- resource_proc (Domain X)  /* owns the resource */
 |
 +-- accessor_proc (Domain Y)  /* tries to access */

This directly addresses the missing coverage by creating two independent child processes that establish peer domains, rather than the hierarchical parent-child domains covered by existing tests.

Both children call landlock_restrict_self() for the first time, so their struct landlock_domain->parent pointers are NULL, creating true peer domains.

The harness exposes four test variants:

Variant name       | Resource domain | Accessor domain | Result
-------------------|-----------------|-----------------|----------
none_to_none       | none            | none            | ALLOW
none_to_scoped     | none            | scoped          | DENY
scoped_to_none     | scoped          | none            | ALLOW
scoped_to_scoped   | scoped          | scoped (peer)   | DENY

The scoped_to_scoped case was missing from current coverage.

TESTING
=======

All patches apply cleanly to v6.14-rc2 and pass on landlock/master. The helpers are small and re-use the existing kselftest_harness.h fixture/variant pattern. All patches have been validated with scripts/checkpatch.pl --strict and show no warnings.

This series introduces **no kernel changes**, only selftests additions.

Feedback very welcome.

Thanks,
Abhinav

[1] https://github.com/landlock-lsm/linux/issues/37

Links:
- Landlock documentation: https://docs.kernel.org/userspace-api/landlock.html
- Landlock LSM kernel docs: https://docs.kernel.org/security/landlock.html
- Existing tests: tools/testing/selftests/landlock/scoped_*

Signed-off-by: Abhinav Saxena <xandfury(a)gmail.com>
---
Abhinav Saxena (3):
      selftests/landlock: move sandbox_type to common
      selftests/landlock: add cross-domain variants
      selftests/landlock: add cross-domain signal tests

 tools/testing/selftests/landlock/scoped_common.h   |   7 +
 .../landlock/scoped_cross_domain_variants.h        |  54 +++++
 .../landlock/scoped_multiple_domain_variants.h     |   7 -
 .../selftests/landlock/scoped_signal_test.c        | 237 +++++++++++++++++++++
 4 files changed, 298 insertions(+), 7 deletions(-)
---
base-commit: 5b74b2eff1eeefe43584e5b7b348c8cd3b723d38
change-id: 20250715-landlock_abstractions-dbc0aabf1063

Best regards,
-- 
Abhinav Saxena <xandfury(a)gmail.com>
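The four-variant table in the cover letter, together with the statement that restrictions are enforced only on the accessor side, can be written down as a small data model. This is a plain C sketch for illustration and does not use kselftest_harness.h or the FIXTURE_VARIANT() definitions from the series; the type and function names (struct variant, predict(), count_mismatches()) are hypothetical, and the rule in predict() applies only to the peer-domain setup the table describes.

```c
#include <stdbool.h>
#include <stddef.h>

enum expected { ALLOW, DENY };

/* One row of the variant table from the cover letter. */
struct variant {
	const char *name;
	bool resource_scoped;
	bool accessor_scoped;
	enum expected result;
};

static const struct variant variants[] = {
	{ "none_to_none",     false, false, ALLOW },
	{ "none_to_scoped",   false, true,  DENY  },
	{ "scoped_to_none",   true,  false, ALLOW },
	{ "scoped_to_scoped", true,  true,  DENY  },
};

/*
 * Rule implied by the table for peer domains: only the accessor's own
 * scoping decides the outcome; the resource side is ignored.
 */
static enum expected predict(bool resource_scoped, bool accessor_scoped)
{
	(void)resource_scoped;
	return accessor_scoped ? DENY : ALLOW;
}

/* Count table rows that disagree with the accessor-only rule. */
static size_t count_mismatches(void)
{
	size_t i, bad = 0;

	for (i = 0; i < sizeof(variants) / sizeof(variants[0]); i++) {
		const struct variant *v = &variants[i];

		if (predict(v->resource_scoped, v->accessor_scoped) != v->result)
			bad++;
	}
	return bad;
}
```

count_mismatches() returning zero confirms that all four rows are consistent with the accessor-side enforcement model; it says nothing about hierarchical (parent-child) domains, which the table does not cover.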

2 days, 20 hours

