From: "Mike Rapoport (Microsoft)" <rppt(a)kernel.org>
Hi,
These patches allow guest_memfd to notify userspace about minor page
faults using userfaultfd and let userspace to resolve these page faults
using UFFDIO_CONTINUE.
To allow UFFDIO_CONTINUE outside of the core mm I added a get_shmem_folio()
callback to vm_ops that allows an address space backing a VMA to return a
folio that exists in it's page cache (patch 2)
In order for guest_memfd to notify userspace about page faults, there is a
new VM_FAULT_UFFD_MINOR that a ->fault() handler can return to inform the
page fault handler that it needs to call handle_userfault() to complete the
fault (patch 3).
Patch 4 plumbs these new goodies into guest_memfd.
This series is the minimal change I've been able to come up with to allow
integration of guest_memfd with uffd and while refactoring uffd and making
mfill_atomic() flow more linear would have been a nice improvement, it's
way out of the scope of enabling uffd with guest_memfd.
v2 changes:
* Introduce VM_FAULF_UFFD_MINOR to avoid exporting handle_userfault()
* Simplify vma_can_mfill_atomic()
* Rename get_pagecache_folio() to get_shared_folio() and use inode
instead of vma as its argument
v1: https://lore.kernel.org/all/20251117114631.2029447-1-rppt@kernel.org
Mike Rapoport (Microsoft) (4):
userfaultfd: move vma_can_userfault out of line
userfaultfd, shmem: use a VMA callback to handle UFFDIO_CONTINUE
mm: introduce VM_FAULT_UFFD_MINOR fault reason
guest_memfd: add support for userfaultfd minor mode
Nikita Kalyazin (1):
KVM: selftests: test userfaultfd minor for guest_memfd
include/linux/mm.h | 9 ++
include/linux/mm_types.h | 3 +
include/linux/userfaultfd_k.h | 36 +-----
mm/memory.c | 2 +
mm/shmem.c | 21 +++-
mm/userfaultfd.c | 80 +++++++++++---
.../testing/selftests/kvm/guest_memfd_test.c | 103 ++++++++++++++++++
virt/kvm/guest_memfd.c | 29 +++++
8 files changed, 232 insertions(+), 51 deletions(-)
base-commit: 6a23ae0a96a600d1d12557add110e0bb6e32730c
--
2.50.1
Currently the only way of excluding certain tests from a collection is by
passing all the other tests explicitly via `--test`. Therefore, if the user
wants to skip a single test the resulting command line might be too big,
depending on the collection. Add an option `--skip` that takes care of
that.
Signed-off-by: Ricardo B. Marlière <rbm(a)suse.com>
---
tools/testing/selftests/run_kselftest.sh | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/tools/testing/selftests/run_kselftest.sh b/tools/testing/selftests/run_kselftest.sh
index d4be97498b32..84d45254675c 100755
--- a/tools/testing/selftests/run_kselftest.sh
+++ b/tools/testing/selftests/run_kselftest.sh
@@ -30,6 +30,7 @@ Usage: $0 [OPTIONS]
-s | --summary Print summary with detailed log in output.log (conflict with -p)
-p | --per-test-log Print test log in /tmp with each test name (conflict with -s)
-t | --test COLLECTION:TEST Run TEST from COLLECTION
+ -S | --skip COLLECTION:TEST Skip TEST from COLLECTION
-c | --collection COLLECTION Run all tests from COLLECTION
-l | --list List the available collection:test entries
-d | --dry-run Don't actually run any tests
@@ -43,6 +44,7 @@ EOF
COLLECTIONS=""
TESTS=""
+SKIP=""
dryrun=""
kselftest_override_timeout=""
ERROR_ON_FAIL=true
@@ -58,6 +60,9 @@ while true; do
-t | --test)
TESTS="$TESTS $2"
shift 2 ;;
+ -S | --skip)
+ SKIP="$SKIP $2"
+ shift 2 ;;
-c | --collection)
COLLECTIONS="$COLLECTIONS $2"
shift 2 ;;
@@ -109,6 +114,12 @@ if [ -n "$TESTS" ]; then
done
available="$(echo "$valid" | sed -e 's/ /\n/g')"
fi
+# Remove tests to be skipped from available list
+if [ -n "$SKIP" ]; then
+ for skipped in $SKIP ; do
+ available="$(echo "$available" | grep -v "^${skipped}$")"
+ done
+fi
kselftest_failures_file="$(mktemp --tmpdir kselftest-failures-XXXXXX)"
export kselftest_failures_file
---
base-commit: a2f7990d330937a204b86b9cafbfef82f87a8693
change-id: 20251125-selftests-add_skip_opt-0f3fd24d7afa
Best regards,
--
Ricardo B. Marlière <rbm(a)suse.com>
Currently, x86, Riscv, Loongarch use the Generic Entry which makes
maintainers' work easier and codes more elegant. arm64 has already
successfully switched to the Generic IRQ Entry in commit
b3cf07851b6c ("arm64: entry: Switch to generic IRQ entry"), it is
time to completely convert arm64 to Generic Entry.
The goal is to bring arm64 in line with other architectures that already
use the generic entry infrastructure, reducing duplicated code and
making it easier to share future changes in entry/exit paths, such as
"Syscall User Dispatch".
This patch set is rebased on v6.18-rc6.
The performance benchmarks from perf bench basic syscall on
real hardware are below:
| Metric | W/O Generic Framework | With Generic Framework | Change |
| ---------- | --------------------- | ---------------------- | ------ |
| Total time | 2.813 [sec] | 2.930 [sec] | ↑4% |
| usecs/op | 0.281349 | 0.293006 | ↑4% |
| ops/sec | 3,554,299 | 3,412,894 | ↓4% |
Compared to earlier with arch specific handling, the performance decreased
by approximately 4%.
It was tested ok with following test cases on QEMU virt platform:
- Perf tests.
- Different `dynamic preempt` mode switch.
- Pseudo NMI tests.
- Stress-ng CPU stress test.
- MTE test case in Documentation/arch/arm64/memory-tagging-extension.rst
and all test cases in tools/testing/selftests/arm64/mte/*.
- "sud" selftest testcase.
- get_syscall_info, peeksiginfo in tools/testing/selftests/ptrace.
The test QEMU configuration is as follows:
qemu-system-aarch64 \
-M virt,gic-version=3,virtualization=on,mte=on \
-cpu max,pauth-impdef=on \
-kernel Image \
-smp 8,sockets=1,cores=4,threads=2 \
-m 512m \
-nographic \
-no-reboot \
-device virtio-rng-pci \
-append "root=/dev/vda rw console=ttyAMA0 kgdboc=ttyAMA0,115200 \
earlycon preempt=voluntary irqchip.gicv3_pseudo_nmi=1" \
-drive if=none,file=images/rootfs.ext4,format=raw,id=hd0 \
-device virtio-blk-device,drive=hd0 \
Chanegs in v7:
- Support "Syscall User Dispatch" by implementing
arch_syscall_is_vdso_sigreturn() as kemal suggested.
- Add aarch64 support for "sud" selftest testcase, which tested ok with
the patch series.
- Fix the kernel test robot warning for arch_ptrace_report_syscall_entry()
and arch_ptrace_report_syscall_exit() in asm/entry-common.h.
- Add perf syscall performance test.
- Link to v6: https://lore.kernel.org/all/20250916082611.2972008-1-ruanjinjie@huawei.com/
Changes in v6:
- Rebased on v6.17-rc5-next as arm64 generic irq entry has merged.
- Update the commit message.
- Link to v5: https://lore.kernel.org/all/20241206101744.4161990-1-ruanjinjie@huawei.com/
Changes in v5:
- Not change arm32 and keep inerrupts_enabled() macro for gicv3 driver.
- Move irqentry_state definition into arch/arm64/kernel/entry-common.c.
- Avoid removing the __enter_from_*() and __exit_to_*() wrappers.
- Update "irqentry_state_t ret/irq_state" to "state"
to keep it consistently.
- Use generic irq entry header for PREEMPT_DYNAMIC after split
the generic entry.
- Also refactor the ARM64 syscall code.
- Introduce arch_ptrace_report_syscall_entry/exit(), instead of
arch_pre/post_report_syscall_entry/exit() to simplify code.
- Make the syscall patches clear separation.
- Update the commit message.
- Link to v4: https://lore.kernel.org/all/20241025100700.3714552-1-ruanjinjie@huawei.com/
Changes in v4:
- Rework/cleanup split into a few patches as Mark suggested.
- Replace interrupts_enabled() macro with regs_irqs_disabled(), instead
of left it here.
- Remove rcu and lockdep state in pt_regs by using temporary
irqentry_state_t as Mark suggested.
- Remove some unnecessary intermediate functions to make it clear.
- Rework preempt irq and PREEMPT_DYNAMIC code
to make the switch more clear.
- arch_prepare_*_entry/exit() -> arch_pre_*_entry/exit().
- Expand the arch functions comment.
- Make arch functions closer to its caller.
- Declare saved_reg in for block.
- Remove arch_exit_to_kernel_mode_prepare(), arch_enter_from_kernel_mode().
- Adjust "Add few arch functions to use generic entry" patch to be
the penultimate.
- Update the commit message.
- Add suggested-by.
- Link to v3: https://lore.kernel.org/all/20240629085601.470241-1-ruanjinjie@huawei.com/
Changes in v3:
- Test the MTE test cases.
- Handle forget_syscall() in arch_post_report_syscall_entry()
- Make the arch funcs not use __weak as Thomas suggested, so move
the arch funcs to entry-common.h, and make arch_forget_syscall() folded
in arch_post_report_syscall_entry() as suggested.
- Move report_single_step() to thread_info.h for arm64
- Change __always_inline() to inline, add inline for the other arch funcs.
- Remove unused signal.h for entry-common.h.
- Add Suggested-by.
- Update the commit message.
Changes in v2:
- Add tested-by.
- Fix a bug that not call arch_post_report_syscall_entry() in
syscall_trace_enter() if ptrace_report_syscall_entry() return not zero.
- Refactor report_syscall().
- Add comment for arch_prepare_report_syscall_exit().
- Adjust entry-common.h header file inclusion to alphabetical order.
- Update the commit message.
Jinjie Ruan (10):
arm64/ptrace: Split report_syscall()
arm64/ptrace: Refactor syscall_trace_enter/exit()
arm64/ptrace: Refator el0_svc_common()
entry: Add syscall_exit_to_user_mode_prepare() helper
arm64/ptrace: Handle ptrace_report_syscall_entry() error
arm64/ptrace: Expand secure_computing() in place
arm64/ptrace: Use syscall_get_arguments() heleper
entry: Add arch_ptrace_report_syscall_entry/exit()
entry: Add has_syscall_work() helper
arm64: entry: Convert to generic entry
kemal (1):
selftests: sud_test: Support aarch64
arch/arm64/Kconfig | 2 +-
arch/arm64/include/asm/entry-common.h | 69 ++++++++++++++
arch/arm64/include/asm/syscall.h | 29 +++++-
arch/arm64/include/asm/thread_info.h | 22 +----
arch/arm64/kernel/debug-monitors.c | 7 ++
arch/arm64/kernel/ptrace.c | 90 -------------------
arch/arm64/kernel/signal.c | 2 +-
arch/arm64/kernel/syscall.c | 31 ++-----
include/linux/entry-common.h | 42 ++++++---
kernel/entry/syscall-common.c | 43 ++++++++-
.../syscall_user_dispatch/sud_test.c | 4 +
11 files changed, 188 insertions(+), 153 deletions(-)
--
2.34.1
syzkaller reported a bug [1] where a socket using sockmap, after being
unloaded, exposed incorrect copied_seq calculation. The selftest I
provided can be used to reproduce the issue reported by syzkaller.
TCP recvmsg seq # bug 2: copied E92C873, seq E68D125, rcvnxt E7CEB7C, fl 40
WARNING: CPU: 1 PID: 5997 at net/ipv4/tcp.c:2724 tcp_recvmsg_locked+0xb2f/0x2910 net/ipv4/tcp.c:2724
Call Trace:
<TASK>
receive_fallback_to_copy net/ipv4/tcp.c:1968 [inline]
tcp_zerocopy_receive+0x131a/0x2120 net/ipv4/tcp.c:2200
do_tcp_getsockopt+0xe28/0x26c0 net/ipv4/tcp.c:4713
tcp_getsockopt+0xdf/0x100 net/ipv4/tcp.c:4812
do_sock_getsockopt+0x34d/0x440 net/socket.c:2421
__sys_getsockopt+0x12f/0x260 net/socket.c:2450
__do_sys_getsockopt net/socket.c:2457 [inline]
__se_sys_getsockopt net/socket.c:2454 [inline]
__x64_sys_getsockopt+0xbd/0x160 net/socket.c:2454
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xcd/0xfa0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
A sockmap socket maintains its own receive queue (ingress_msg) which may
contain data from either its own protocol stack or forwarded from other
sockets.
FD1:read()
-- FD1->copied_seq++
| [read data]
|
[enqueue data] v
[sockmap] -> ingress to self -> ingress_msg queue
FD1 native stack ------> ^
-- FD1->rcv_nxt++ -> redirect to other | [enqueue data]
| |
| ingress to FD1
v ^
... | [sockmap]
FD2 native stack
The issue occurs when reading from ingress_msg: we update tp->copied_seq
by default, but if the data comes from other sockets (not the socket's
own protocol stack), tcp->rcv_nxt remains unchanged. Later, when
converting back to a native socket, reads may fail as copied_seq could
be significantly larger than rcv_nxt.
Additionally, FIONREAD calculation based on copied_seq and rcv_nxt is
insufficient for sockmap sockets, requiring separate field tracking.
[1] https://syzkaller.appspot.com/bug?extid=06dbd397158ec0ea4983
---
v1 -> v4: Use skmsg.sk instead of extending BPF_F_XXX macro and fix CI
failure reported by CI
v1: https://lore.kernel.org/bpf/20251117110736.293040-1-jiayuan.chen@linux.dev/
Jiayuan Chen (3):
bpf, sockmap: Fix incorrect copied_seq calculation
bpf, sockmap: Fix FIONREAD for sockmap
bpf, selftest: Add tests for FIONREAD and copied_seq
include/linux/skmsg.h | 58 ++++-
net/core/skmsg.c | 28 ++-
net/ipv4/tcp_bpf.c | 26 ++-
net/ipv4/udp_bpf.c | 25 ++-
.../selftests/bpf/prog_tests/sockmap_basic.c | 203 +++++++++++++++++-
.../bpf/progs/test_sockmap_pass_prog.c | 8 +
6 files changed, 331 insertions(+), 17 deletions(-)
--
2.43.0
Fix compilation error in UPROBE_setup caused by pointer type mismatch
in ternary expression. The probed_uretprobe and probed_uprobe function
pointers have different type attributes (__attribute__((nocf_check))),
which causes the conditional operator to fail with:
seccomp_bpf.c:5175:74: error: pointer type mismatch in conditional
expression [-Wincompatible-pointer-types]
Cast both function pointers to 'const void *' to match the expected
parameter type of get_uprobe_offset(), resolving the type mismatch
while preserving the function selection logic.
Signed-off-by: Nirbhay Sharma <nirbhay.lkd(a)gmail.com>
---
tools/testing/selftests/seccomp/seccomp_bpf.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 874f17763536..e13ffe18ef95 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -5172,7 +5172,8 @@ FIXTURE_SETUP(UPROBE)
ASSERT_GE(bit, 0);
}
- offset = get_uprobe_offset(variant->uretprobe ? probed_uretprobe : probed_uprobe);
+ offset = get_uprobe_offset(variant->uretprobe ?
+ (const void *)probed_uretprobe : (const void *)probed_uprobe);
ASSERT_GE(offset, 0);
if (variant->uretprobe)
--
2.48.1
Hi all,
This is v2 of a short series that adds kernel support for the ratified
Zilsd (Load/Store pair) and Zclsd (Compressed Load/Store pair) RISC-V
ISA extensions. The series enables kernel-side exposure so user-space
(for example glibc) can detect and use these extensions via hwprobe and
runtime checks.
Patches:
- Patch 1:Add device tree bindings documentation for Zilsd and Zclsd.
- Patch 2: Extend RISC-V ISA extension string parsing to recognize them.
- Patch 3: Export Zilsd and Zclsd via riscv_hwprobe.
- Patch 4: Allow KVM guests to use them.
- Patch 5: Add KVM selftests.
Changes in v2:
- Device-tree schema: simplified the rv64 validation for Zilsd by
removing a redundant `contais: const: zilsd` in the `if` clause; the
simpler `if (riscv, isa-base contains rv64i) then (riscv,
isa-extension not contains zilsd)` form is used instead. Behaviour is
unchanged, and the logic is cleaner.
- Device-tree schema: corrected Zclsd dependency to require both Zilsd
and Zca (previous `anyOf` was incorrect; now both are enforced).
- Commit message typo fixed: "dt-bidings" -> "dt-bindings" in the Patch
1 commit subject.
The v2 changes are documentation/schema corrections in extensions.yaml.
No functional changes were made to ISA parsing, hwprobe syscall, KVM
guest support or the selftests beyond ensuring the binding correctly
documents and validates the extension relationships.
Please review v2 and advise if futher changes are needed.
Thanks,
Pincheng Wang
Pincheng Wang (5):
dt-bindings: riscv: add Zilsd and Zclsd extension descriptions
riscv: add ISA extension parsing for Zilsd and Zclsd
riscv: hwprobe: export Zilsd and Zclsd ISA extensions
riscv: KVM: allow Zilsd and Zclsd extensions for Guest/VM
KVM: riscv: selftests: add Zilsd and Zclsd extension to get-reg-list
test
Documentation/arch/riscv/hwprobe.rst | 8 +++++
.../devicetree/bindings/riscv/extensions.yaml | 36 +++++++++++++++++++
arch/riscv/include/asm/hwcap.h | 2 ++
arch/riscv/include/uapi/asm/hwprobe.h | 2 ++
arch/riscv/include/uapi/asm/kvm.h | 2 ++
arch/riscv/kernel/cpufeature.c | 24 +++++++++++++
arch/riscv/kernel/sys_hwprobe.c | 2 ++
arch/riscv/kvm/vcpu_onereg.c | 2 ++
.../selftests/kvm/riscv/get-reg-list.c | 6 ++++
9 files changed, 84 insertions(+)
--
2.39.5
This series adds namespace support to vhost-vsock and loopback. It does
not add namespaces to any of the other guest transports (virtio-vsock,
hyperv, or vmci).
The current revision supports two modes: local and global. Local
mode is complete isolation of namespaces, while global mode is complete
sharing between namespaces of CIDs (the original behavior).
The mode is set using /proc/sys/net/vsock/ns_mode.
Modes are per-netns and write-once. This allows a system to configure
namespaces independently (some may share CIDs, others are completely
isolated). This also supports future possible mixed use cases, where
there may be namespaces in global mode spinning up VMs while there are
mixed mode namespaces that provide services to the VMs, but are not
allowed to allocate from the global CID pool (this mode is not
implemented in this series).
If a socket or VM is created when a namespace is global but the
namespace changes to local, the socket or VM will continue working
normally. That is, the socket or VM assumes the mode behavior of the
namespace at the time the socket/VM was created. The original mode is
captured in vsock_create() and so occurs at the time of socket(2) and
accept(2) for sockets and open(2) on /dev/vhost-vsock for VMs. This
prevents a socket/VM connection from suddenly breaking due to a
namespace mode change. Any new sockets/VMs created after the mode change
will adopt the new mode's behavior.
Additionally, added tests for the new namespace features:
tools/testing/selftests/vsock/vmtest.sh
1..29
ok 1 vm_server_host_client
ok 2 vm_client_host_server
ok 3 vm_loopback
ok 4 ns_vm_local_mode_rejected
ok 5 ns_host_vsock_ns_mode_ok
ok 6 ns_host_vsock_ns_mode_write_once_ok
ok 7 ns_global_same_cid_fails
ok 8 ns_local_same_cid_ok
ok 9 ns_global_local_same_cid_ok
ok 10 ns_local_global_same_cid_ok
ok 11 ns_diff_global_host_connect_to_global_vm_ok
ok 12 ns_diff_global_host_connect_to_local_vm_fails
ok 13 ns_diff_global_vm_connect_to_global_host_ok
ok 14 ns_diff_global_vm_connect_to_local_host_fails
ok 15 ns_diff_local_host_connect_to_local_vm_fails
ok 16 ns_diff_local_vm_connect_to_local_host_fails
ok 17 ns_diff_global_to_local_loopback_local_fails
ok 18 ns_diff_local_to_global_loopback_fails
ok 19 ns_diff_local_to_local_loopback_fails
ok 20 ns_diff_global_to_global_loopback_ok
ok 21 ns_same_local_loopback_ok
ok 22 ns_same_local_host_connect_to_local_vm_ok
ok 23 ns_same_local_vm_connect_to_local_host_ok
ok 24 ns_mode_change_connection_continue_vm_ok
ok 25 ns_mode_change_connection_continue_host_ok
ok 26 ns_mode_change_connection_continue_both_ok
ok 27 ns_delete_vm_ok
ok 28 ns_delete_host_ok
ok 29 ns_delete_both_ok
SUMMARY: PASS=29 SKIP=0 FAIL=0
Dependent on series:
https://lore.kernel.org/all/20251108-vsock-selftests-fixes-and-improvements…
Thanks again for everyone's help and reviews!
Suggested-by: Sargun Dhillon <sargun(a)sargun.me>
Signed-off-by: Bobby Eshleman <bobbyeshleman(a)gmail.com>
Changes in v11:
- vmtest: add a patch to use ss in wait_for_listener functions and
support vsock, tcp, and unix. Change all patches to use the new
functions.
- vmtest: add a patch to re-use vm dmesg / warn counting functions
- Link to v10: https://lore.kernel.org/r/20251117-vsock-vmtest-v10-0-df08f165bf3e@meta.com
Changes in v10:
- Combine virtio common patches into one (Stefano)
- Resolve vsock_loopback virtio_transport_reset_no_sock() issue
with info->vsk setting. This eliminates the need for skb->cb,
so remove skb->cb patches.
- many line width 80 fixes
- Link to v9: https://lore.kernel.org/all/20251111-vsock-vmtest-v9-0-852787a37bed@meta.com
Changes in v9:
- reorder loopback patch after patch for virtio transport common code
- remove module ordering tests patch because loopback no longer depends
on pernet ops
- major simplifications in vsock_loopback
- added a new patch for blocking local mode for guests, added test case
to check
- add net ref tracking to vsock_loopback patch
- Link to v8: https://lore.kernel.org/r/20251023-vsock-vmtest-v8-0-dea984d02bb0@meta.com
Changes in v8:
- Break generic cleanup/refactoring patches into standalone series,
remove those from this series
- Link to dependency: https://lore.kernel.org/all/20251022-vsock-selftests-fixes-and-improvements…
- Link to v7: https://lore.kernel.org/r/20251021-vsock-vmtest-v7-0-0661b7b6f081@meta.com
Changes in v7:
- fix hv_sock build
- break out vmtest patches into distinct, more well-scoped patches
- change `orig_net_mode` to `net_mode`
- many fixes and style changes in per-patch change sets (see individual
patches for specific changes)
- optimize `virtio_vsock_skb_cb` layout
- update commit messages with more useful descriptions
- vsock_loopback: use orig_net_mode instead of current net mode
- add tests for edge cases (ns deletion, mode changing, loopback module
load ordering)
- Link to v6: https://lore.kernel.org/r/20250916-vsock-vmtest-v6-0-064d2eb0c89d@meta.com
Changes in v6:
- define behavior when mode changes to local while socket/VM is alive
- af_vsock: clarify description of CID behavior
- af_vsock: use stronger langauge around CID rules (dont use "may")
- af_vsock: improve naming of buf/buffer
- af_vsock: improve string length checking on proc writes
- vsock_loopback: add space in struct to clarify lock protection
- vsock_loopback: do proper cleanup/unregister on vsock_loopback_exit()
- vsock_loopback: use virtio_vsock_skb_net() instead of sock_net()
- vsock_loopback: set loopback to NULL after kfree()
- vsock_loopback: use pernet_operations and remove callback mechanism
- vsock_loopback: add macros for "global" and "local"
- vsock_loopback: fix length checking
- vmtest.sh: check for namespace support in vmtest.sh
- Link to v5: https://lore.kernel.org/r/20250827-vsock-vmtest-v5-0-0ba580bede5b@meta.com
Changes in v5:
- /proc/net/vsock_ns_mode -> /proc/sys/net/vsock/ns_mode
- vsock_global_net -> vsock_global_dummy_net
- fix netns lookup in vhost_vsock to respect pid namespaces
- add callbacks for vsock_loopback to avoid circular dependency
- vmtest.sh loads vsock_loopback module
- remove vsock_net_mode_can_set()
- change vsock_net_write_mode() to return true/false based on success
- make vsock_net_mode enum instead of u8
- Link to v4: https://lore.kernel.org/r/20250805-vsock-vmtest-v4-0-059ec51ab111@meta.com
Changes in v4:
- removed RFC tag
- implemented loopback support
- renamed new tests to better reflect behavior
- completed suite of tests with permutations of ns modes and vsock_test
as guest/host
- simplified socat bridging with unix socket instead of tcp + veth
- only use vsock_test for success case, socat for failure case (context
in commit message)
- lots of cleanup
Changes in v3:
- add notion of "modes"
- add procfs /proc/net/vsock_ns_mode
- local and global modes only
- no /dev/vhost-vsock-netns
- vmtest.sh already merged, so new patch just adds new tests for NS
- Link to v2:
https://lore.kernel.org/kvm/20250312-vsock-netns-v2-0-84bffa1aa97a@gmail.com
Changes in v2:
- only support vhost-vsock namespaces
- all g2h namespaces retain old behavior, only common API changes
impacted by vhost-vsock changes
- add /dev/vhost-vsock-netns for "opt-in"
- leave /dev/vhost-vsock to old behavior
- removed netns module param
- Link to v1:
https://lore.kernel.org/r/20200116172428.311437-1-sgarzare@redhat.com
Changes in v1:
- added 'netns' module param to vsock.ko to enable the
network namespace support (disabled by default)
- added 'vsock_net_eq()' to check the "net" assigned to a socket
only when 'netns' support is enabled
- Link to RFC: https://patchwork.ozlabs.org/cover/1202235/
---
Bobby Eshleman (13):
vsock: a per-net vsock NS mode state
vsock: add netns to vsock core
vsock: reject bad VSOCK_NET_MODE_LOCAL configuration for G2H
virtio: set skb owner of virtio_transport_reset_no_sock() reply
vsock: add netns support to virtio transports
selftests/vsock: add namespace helpers to vmtest.sh
selftests/vsock: prepare vm management helpers for namespaces
selftests/vsock: add vm_dmesg_{warn,oops}_count() helpers
selftests/vsock: use ss to wait for listeners instead of /proc/net
selftests/vsock: add tests for proc sys vsock ns_mode
selftests/vsock: add namespace tests for CID collisions
selftests/vsock: add tests for host <-> vm connectivity with namespaces
selftests/vsock: add tests for namespace deletion and mode changes
MAINTAINERS | 1 +
drivers/vhost/vsock.c | 57 +-
include/linux/virtio_vsock.h | 8 +-
include/net/af_vsock.h | 64 +-
include/net/net_namespace.h | 4 +
include/net/netns/vsock.h | 17 +
net/vmw_vsock/af_vsock.c | 290 ++++++++-
net/vmw_vsock/hyperv_transport.c | 6 +
net/vmw_vsock/virtio_transport.c | 29 +-
net/vmw_vsock/virtio_transport_common.c | 69 +-
net/vmw_vsock/vmci_transport.c | 12 +
net/vmw_vsock/vsock_loopback.c | 20 +-
tools/testing/selftests/vsock/vmtest.sh | 1087 +++++++++++++++++++++++++++++--
13 files changed, 1560 insertions(+), 104 deletions(-)
---
base-commit: 962ac5ca99a5c3e7469215bf47572440402dfd59
change-id: 20250325-vsock-vmtest-b3a21d2102c2
prerequisite-message-id: <20251022-vsock-selftests-fixes-and-improvements-v1-0-edeb179d6463(a)meta.com>
prerequisite-patch-id: a2eecc3851f2509ed40009a7cab6990c6d7cfff5
prerequisite-patch-id: 501db2100636b9c8fcb3b64b8b1df797ccbede85
prerequisite-patch-id: ba1a2f07398a035bc48ef72edda41888614be449
prerequisite-patch-id: fd5cc5445aca9355ce678e6d2bfa89fab8a57e61
prerequisite-patch-id: 795ab4432ffb0843e22b580374782e7e0d99b909
prerequisite-patch-id: 1499d263dc933e75366c09e045d2125ca39f7ddd
prerequisite-patch-id: f92d99bb1d35d99b063f818a19dcda999152d74c
prerequisite-patch-id: e3296f38cdba6d903e061cff2bbb3e7615e8e671
prerequisite-patch-id: bc4662b4710d302d4893f58708820fc2a0624325
prerequisite-patch-id: f8991f2e98c2661a706183fde6b35e2b8d9aedcf
prerequisite-patch-id: 44bf9ed69353586d284e5ee63d6fffa30439a698
prerequisite-patch-id: d50621bc630eeaf608bbaf260370c8dabf6326df
Best regards,
--
Bobby Eshleman <bobbyeshleman(a)meta.com>
'head' is defined as 'static struct robust_list_head' that stores the
local variable of 'struct lock_struct a' raising the Wdangling-pointer
warning.
robust_list.c: In function ���child_circular_list���:
robust_list.c:522:24: warning: storing the address of local variable
���a��� in ���head.list.next��� [-Wdangling-pointer=]
522 | head.list.next = &a.list;
| ~~~~~~~~~~~~~~~^~~~~~~~~
robust_list.c:513:28: note: ���a��� declared here
513 | struct lock_struct a, b, c;
| ^
robust_list.c:512:40: note: ���head��� declared here
512 | static struct robust_list_head head;
| ^~~~
Since 'head' doesn't need static storge duration, removing the static
keyword of it to fix this.
Signed-off-by: Ankit Khushwaha <ankitkhushwaha.linux(a)gmail.com>
---
v3:
Updated the patch name and msg as suggested by Andr��.
v2: https://lore.kernel.org/all/20251118170907.108832-1-ankitkhushwaha.linux@gm…
Added changes suggested by Andr��.
v1: https://lore.kernel.org/all/20251118162619.50586-1-ankitkhushwaha.linux@gma…
---
tools/testing/selftests/futex/functional/robust_list.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/futex/functional/robust_list.c b/tools/testing/selftests/futex/functional/robust_list.c
index e7d1254e18ca..ef21a7ec9def 100644
--- a/tools/testing/selftests/futex/functional/robust_list.c
+++ b/tools/testing/selftests/futex/functional/robust_list.c
@@ -509,7 +509,7 @@ TEST(test_robust_list_multiple_elements)
static int child_circular_list(void *arg)
{
- static struct robust_list_head head;
+ struct robust_list_head head;
struct lock_struct a, b, c;
int ret;
--
2.52.0
netdev CI reserves SKIP in selftests for cases which can't be executed
due to setup issues, like missing or old commands. Tests which are
expected to fail must use XFAIL.
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
---
CC: kuniyu(a)google.com
CC: adelodunolaoluwa(a)yahoo.com
CC: shuah(a)kernel.org
CC: linux-kselftest(a)vger.kernel.org
---
tools/testing/selftests/net/af_unix/unix_connreset.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/net/af_unix/unix_connreset.c b/tools/testing/selftests/net/af_unix/unix_connreset.c
index bffef2b54bfd..6eb936207b31 100644
--- a/tools/testing/selftests/net/af_unix/unix_connreset.c
+++ b/tools/testing/selftests/net/af_unix/unix_connreset.c
@@ -161,8 +161,12 @@ TEST_F(unix_sock, reset_closed_embryo)
char buf[16] = {};
ssize_t n;
- if (variant->socket_type == SOCK_DGRAM)
- SKIP(return, "This test only applies to SOCK_STREAM and SOCK_SEQPACKET");
+ if (variant->socket_type == SOCK_DGRAM) {
+ snprintf(_metadata->results->reason,
+ sizeof(_metadata->results->reason),
+ "Test only applies to SOCK_STREAM and SOCK_SEQPACKET");
+ exit(KSFT_XFAIL);
+ }
/* Close server without accept()ing */
close(self->server);
--
2.51.1