September 2024 - Linux-kselftest-mirror

[PATCH] utimer-test: remove unused variables

by bajing

The variable i is never referenced in the code, just remove it. Signed-off-by: bajing <bajing(a)cmss.chinamobile.com> --- tools/testing/selftests/alsa/utimer-test.c | 1 - 1 file changed, 1 deletion(-) diff --git a/tools/testing/selftests/alsa/utimer-test.c b/tools/testing/selftests/alsa/utimer-test.c index 32ee3ce57721..9d2683c83ef3 100644 --- a/tools/testing/selftests/alsa/utimer-test.c +++ b/tools/testing/selftests/alsa/utimer-test.c @@ -140,7 +140,6 @@ TEST_F(timer_f, utimer) { TEST(wrong_timers_test) { int timer_dev_fd; int utimer_fd; - size_t i; struct snd_timer_uinfo wrong_timer = { .resolution = 0, .id = UTIMER_DEFAULT_ID, -- 2.33.0

10 months

1
0
0 0

[PATCH RFC 0/8] extensible syscalls: CHECK_FIELDS to allow for easier feature detection

by Aleksa Sarai

This is something that I've been thinking about for a while. We had a discussion at LPC 2020 about this[1] but the proposals suggested there never materialised. In short, it is quite difficult for userspace to detect the feature capability of syscalls at runtime. This is something a lot of programs want to do, but they are forced to create elaborate scenarios to try to figure out if a feature is supported without causing damage to the system. For the vast majority of cases, each individual feature also needs to be tested individually (because syscall results are all-or-nothing), so testing even a single syscall's feature set can easily inflate the startup time of programs. This patchset implements the fairly minimal design I proposed in this talk[2] and in some old LKML threads (though I can't find the exact references ATM). The general flow looks like: 1. Userspace will indicate to the kernel that a syscall should a be no-op by setting the top bit of the extensible struct size argument. We will almost certainly never support exabyte sized structs, so the top bits are free for us to use as makeshift flag bits. This is preferable to using the per-syscall flag field inside the structure because seccomp can easily detect the bit in the flag and allow the probe or forcefully return -EEXTSYS_NOOP. 2. The kernel will then fill the provided structure with every valid bit pattern that the current kernel understands. For flags or other bitflag-like fields, this is the set of valid flags or bits. For pointer fields or fields that take an arbitrary value, the field has every bit set (0xFF... to fill the field) to indicate that any value is valid in the field. 3. The syscall then returns -EEXTSYS_NOOP which is an errno that will only ever be used for this purpose (so userspace can be sure that the request succeeded). On older kernels, the syscall will return a different error (usually -E2BIG or -EFAULT) and userspace can do their old-fashioned checks. 4. Userspace can then check which flags and fields are supported by looking at the fields in the returned structure. Flags are checked by doing an AND with the flags field, and field support can checked by comparing to 0. In principle you could just AND the entire structure if you wanted to do this check generically without caring about the structure contents (this is what libraries might consider doing). Userspace can even find out the internal kernel structure size by passing a PAGE_SIZE buffer and seeing how many bytes are non-zero. As with copy_struct_from_user(), this is designed to be forward- and backwards- compatible. This allows programas to get a one-shot understanding of what features a syscall supports without having to do any elaborate setups or tricks to detect support for destructive features. Flags can simply be ANDed to check if they are in the supported set, and fields can just be checked to see if they are non-zero. This patchset is IMHO the simplest way we can add the ability to introspect the feature set of extensible struct (copy_struct_from_user) syscalls. It doesn't preclude the chance of a more generic mechanism being added later. The intended way of using this interface to get feature information looks something like the following (imagine that openat2 has gained a new field and a new flag in the future): static bool openat2_no_automount_supported; static bool openat2_cwd_fd_supported; int check_openat2_support(void) { int err; struct open_how how = {}; err = openat2(AT_FDCWD, ".", &how, CHECK_FIELDS | sizeof(how)); assert(err < 0); switch (errno) { case EFAULT: case E2BIG: /* Old kernel... */ check_support_the_old_way(); break; case EEXTSYS_NOOP: openat2_no_automount_supported = (how.flags & RESOLVE_NO_AUTOMOUNT); openat2_cwd_fd_supported = (how.cwd_fd != 0); break; } } [1]: https://lwn.net/Articles/830666/ [2]: https://youtu.be/ggD-eb3yPVs Signed-off-by: Aleksa Sarai <cyphar(a)cyphar.com> --- Aleksa Sarai (8): uaccess: add copy_struct_to_user helper sched_getattr: port to copy_struct_to_user openat2: explicitly return -E2BIG for (usize > PAGE_SIZE) openat2: add CHECK_FIELDS flag to usize argument clone3: add CHECK_FIELDS flag to usize argument selftests: openat2: add 0xFF poisoned data after misaligned struct selftests: openat2: add CHECK_FIELDS selftests selftests: clone3: add CHECK_FIELDS selftests fs/open.c | 17 ++ include/linux/uaccess.h | 98 +++++++++ include/uapi/asm-generic/errno.h | 3 + include/uapi/linux/openat2.h | 2 + kernel/fork.c | 33 ++- kernel/sched/syscalls.c | 42 +--- tools/testing/selftests/clone3/.gitignore | 1 + tools/testing/selftests/clone3/Makefile | 2 +- .../testing/selftests/clone3/clone3_check_fields.c | 229 +++++++++++++++++++++ tools/testing/selftests/openat2/openat2_test.c | 126 +++++++++++- 10 files changed, 504 insertions(+), 49 deletions(-) --- base-commit: 431c1646e1f86b949fa3685efc50b660a364c2b6 change-id: 20240803-extensible-structs-check_fields-a47e94cef691 Best regards, -- Aleksa Sarai <cyphar(a)cyphar.com>

10 months

2
13
0 0

[PATCH] selftests/bpf: Fix procmap_query()'s params mismatch and compilation warning

by Yuan Chen

From: Yuan Chen <chenyuan(a)kylinos.cn> When the PROCMAP_QUERY is not defined, a compilation error occurs due to the mismatch of the procmap_query()'s params, procmap_query() only be called in the file where the function is defined, modify the params so they can match. We get a warning when build samples/bpf: trace_helpers.c:252:5: warning: no previous prototype for ‘procmap_query’ [-Wmissing-prototypes] 252 | int procmap_query(int fd, const void *addr, __u32 query_flags, size_t *start, size_t *offset, int *flags) | ^~~~~~~~~~~~~ As this function is only used in the file, mark it as 'static'. Signed-off-by: Yuan Chen <chenyuan(a)kylinos.cn> --- tools/testing/selftests/bpf/trace_helpers.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/bpf/trace_helpers.c b/tools/testing/selftests/bpf/trace_helpers.c index 1bfd881c0e07..2d742fdac6b9 100644 --- a/tools/testing/selftests/bpf/trace_helpers.c +++ b/tools/testing/selftests/bpf/trace_helpers.c @@ -249,7 +249,7 @@ int kallsyms_find(const char *sym, unsigned long long *addr) #ifdef PROCMAP_QUERY int env_verbosity __weak = 0; -int procmap_query(int fd, const void *addr, __u32 query_flags, size_t *start, size_t *offset, int *flags) +static int procmap_query(int fd, const void *addr, __u32 query_flags, size_t *start, size_t *offset, int *flags) { char path_buf[PATH_MAX], build_id_buf[20]; struct procmap_query q; @@ -293,7 +293,7 @@ int procmap_query(int fd, const void *addr, __u32 query_flags, size_t *start, si return 0; } #else -int procmap_query(int fd, const void *addr, size_t *start, size_t *offset, int *flags) +static int procmap_query(int fd, const void *addr, __u32 query_flags, size_t *start, size_t *offset, int *flags) { return -EOPNOTSUPP; } -- 2.46.0

10 months

2
1
0 0

[PATCH 0/6] Extend pmu_counters_test to AMD CPUs

by Colton Lewis

(I was positive I had sent this already, but I couldn't find it on the mailing list to reply to and ask for reviews.) Extend pmu_counters_test to AMD CPUs. As the AMD PMU is quite different from Intel with different events and feature sets, this series introduces a new code path to test it, specifically focusing on the core counters including the PerfCtrExtCore and PerfMonV2 features. Northbridge counters and cache counters exist, but are not as important and can be deferred to a later series. The first patch is a bug fix that could be submitted separately. The series has been tested on both Intel and AMD machines, but I have not found an AMD machine old enough to lack PerfCtrExtCore. I have made efforts that no part of the code has any dependency on its presence. I am aware of similar work in this direction done by Jinrong Liang [1]. He told me he is not working on it currently and I am not intruding by making my own submission. [1] https://lore.kernel.org/kvm/20231121115457.76269-1-cloudliang@tencent.com/ Colton Lewis (6): KVM: x86: selftests: Fix typos in macro variable use KVM: x86: selftests: Define AMD PMU CPUID leaves KVM: x86: selftests: Set up AMD VM in pmu_counters_test KVM: x86: selftests: Test read/write core counters KVM: x86: selftests: Test core events KVM: x86: selftests: Test PerfMonV2 .../selftests/kvm/include/x86_64/processor.h | 7 + .../selftests/kvm/x86_64/pmu_counters_test.c | 267 ++++++++++++++++-- 2 files changed, 249 insertions(+), 25 deletions(-) -- 2.46.0.76.ge559c4bf1a-goog

10 months

3
20
0 0

[PATCH net-next] selftests: add selftest for UDP SO_PEEK_OFF support

by Jason Xing

From: Jason Xing <kernelxing(a)tencent.com> Add the SO_PEEK_OFF selftest for UDP. In this patch, I mainly do three things: 1. rename tcp_so_peek_off.c 2. adjust for UDP protocol 3. add selftests into it Suggested-by: Jon Maloy <jmaloy(a)redhat.com> Signed-off-by: Jason Xing <kernelxing(a)tencent.com> --- Link: https://lore.kernel.org/all/9f4dd14d-fbe3-4c61-b04c-f0e6b8096d7b@redhat.com/ --- tools/testing/selftests/net/Makefile | 2 +- .../{tcp_so_peek_off.c => sk_so_peek_off.c} | 91 +++++++++++-------- 2 files changed, 56 insertions(+), 37 deletions(-) rename tools/testing/selftests/net/{tcp_so_peek_off.c => sk_so_peek_off.c} (58%) diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile index 1179e3261bef..d5029f978aa9 100644 --- a/tools/testing/selftests/net/Makefile +++ b/tools/testing/selftests/net/Makefile @@ -80,7 +80,7 @@ TEST_PROGS += io_uring_zerocopy_tx.sh TEST_GEN_FILES += bind_bhash TEST_GEN_PROGS += sk_bind_sendto_listen TEST_GEN_PROGS += sk_connect_zero_addr -TEST_GEN_PROGS += tcp_so_peek_off +TEST_GEN_PROGS += sk_so_peek_off TEST_PROGS += test_ingress_egress_chaining.sh TEST_GEN_PROGS += so_incoming_cpu TEST_PROGS += sctp_vrf.sh diff --git a/tools/testing/selftests/net/tcp_so_peek_off.c b/tools/testing/selftests/net/sk_so_peek_off.c similarity index 58% rename from tools/testing/selftests/net/tcp_so_peek_off.c rename to tools/testing/selftests/net/sk_so_peek_off.c index df8a39d9d3c3..870a890138c4 100644 --- a/tools/testing/selftests/net/tcp_so_peek_off.c +++ b/tools/testing/selftests/net/sk_so_peek_off.c @@ -10,37 +10,41 @@ #include <arpa/inet.h> #include "../kselftest.h" -static char *afstr(int af) +static char *afstr(int af, int proto) { - return af == AF_INET ? "TCP/IPv4" : "TCP/IPv6"; + if (proto == IPPROTO_TCP) + return af == AF_INET ? "TCP/IPv4" : "TCP/IPv6"; + else + return af == AF_INET ? "UDP/IPv4" : "UDP/IPv6"; } -int tcp_peek_offset_probe(sa_family_t af) +int sk_peek_offset_probe(sa_family_t af, int proto) { + int type = (proto == IPPROTO_TCP ? SOCK_STREAM : SOCK_DGRAM); int optv = 0; int ret = 0; int s; - s = socket(af, SOCK_STREAM | SOCK_CLOEXEC, IPPROTO_TCP); + s = socket(af, type, proto); if (s < 0) { ksft_perror("Temporary TCP socket creation failed"); } else { if (!setsockopt(s, SOL_SOCKET, SO_PEEK_OFF, &optv, sizeof(int))) ret = 1; else - printf("%s does not support SO_PEEK_OFF\n", afstr(af)); + printf("%s does not support SO_PEEK_OFF\n", afstr(af, proto)); close(s); } return ret; } -static void tcp_peek_offset_set(int s, int offset) +static void sk_peek_offset_set(int s, int offset) { if (setsockopt(s, SOL_SOCKET, SO_PEEK_OFF, &offset, sizeof(offset))) ksft_perror("Failed to set SO_PEEK_OFF value\n"); } -static int tcp_peek_offset_get(int s) +static int sk_peek_offset_get(int s) { int offset; socklen_t len = sizeof(offset); @@ -50,8 +54,9 @@ static int tcp_peek_offset_get(int s) return offset; } -static int tcp_peek_offset_test(sa_family_t af) +static int sk_peek_offset_test(sa_family_t af, int proto) { + int type = (proto == IPPROTO_TCP ? SOCK_STREAM : SOCK_DGRAM); union { struct sockaddr sa; struct sockaddr_in a4; @@ -62,13 +67,13 @@ static int tcp_peek_offset_test(sa_family_t af) int recv_sock = 0; int offset = 0; ssize_t len; - char buf; + char buf[2]; memset(&a, 0, sizeof(a)); a.sa.sa_family = af; - s[0] = socket(af, SOCK_STREAM, IPPROTO_TCP); - s[1] = socket(af, SOCK_STREAM | SOCK_NONBLOCK, IPPROTO_TCP); + s[0] = recv_sock = socket(af, type, proto); + s[1] = socket(af, type, proto); if (s[0] < 0 || s[1] < 0) { ksft_perror("Temporary socket creation failed\n"); @@ -82,76 +87,78 @@ static int tcp_peek_offset_test(sa_family_t af) ksft_perror("Temporary socket getsockname() failed\n"); goto out; } - if (listen(s[0], 0) < 0) { + if (proto == IPPROTO_TCP && listen(s[0], 0) < 0) { ksft_perror("Temporary socket listen() failed\n"); goto out; } - if (connect(s[1], &a.sa, sizeof(a)) >= 0 || errno != EINPROGRESS) { + if (connect(s[1], &a.sa, sizeof(a))) { ksft_perror("Temporary socket connect() failed\n"); goto out; } - recv_sock = accept(s[0], NULL, NULL); - if (recv_sock <= 0) { - ksft_perror("Temporary socket accept() failed\n"); - goto out; + if (proto == IPPROTO_TCP) { + recv_sock = accept(s[0], NULL, NULL); + if (recv_sock <= 0) { + ksft_perror("Temporary socket accept() failed\n"); + goto out; + } } /* Some basic tests of getting/setting offset */ - offset = tcp_peek_offset_get(recv_sock); + offset = sk_peek_offset_get(recv_sock); if (offset != -1) { ksft_perror("Initial value of socket offset not -1\n"); goto out; } - tcp_peek_offset_set(recv_sock, 0); - offset = tcp_peek_offset_get(recv_sock); + sk_peek_offset_set(recv_sock, 0); + offset = sk_peek_offset_get(recv_sock); if (offset != 0) { ksft_perror("Failed to set socket offset to 0\n"); goto out; } /* Transfer a message */ - if (send(s[1], (char *)("ab"), 2, 0) <= 0 || errno != EINPROGRESS) { + if (send(s[1], (char *)("ab"), 2, 0) != 2) { ksft_perror("Temporary probe socket send() failed\n"); goto out; } /* Read first byte */ - len = recv(recv_sock, &buf, 1, MSG_PEEK); - if (len != 1 || buf != 'a') { + len = recv(recv_sock, buf, 1, MSG_PEEK); + if (len != 1 || buf[0] != 'a') { ksft_perror("Failed to read first byte of message\n"); goto out; } - offset = tcp_peek_offset_get(recv_sock); + offset = sk_peek_offset_get(recv_sock); if (offset != 1) { ksft_perror("Offset not forwarded correctly at first byte\n"); goto out; } /* Try to read beyond last byte */ - len = recv(recv_sock, &buf, 2, MSG_PEEK); - if (len != 1 || buf != 'b') { + len = recv(recv_sock, buf, 2, MSG_PEEK); + if (len != 1 || buf[0] != 'b') { ksft_perror("Failed to read last byte of message\n"); goto out; } - offset = tcp_peek_offset_get(recv_sock); + offset = sk_peek_offset_get(recv_sock); if (offset != 2) { ksft_perror("Offset not forwarded correctly at last byte\n"); goto out; } /* Flush message */ - len = recv(recv_sock, NULL, 2, MSG_TRUNC); + len = recv(recv_sock, buf, 2, MSG_TRUNC); if (len != 2) { ksft_perror("Failed to flush message\n"); goto out; } - offset = tcp_peek_offset_get(recv_sock); + offset = sk_peek_offset_get(recv_sock); if (offset != 0) { ksft_perror("Offset not reverted correctly after flush\n"); goto out; } - printf("%s with MSG_PEEK_OFF works correctly\n", afstr(af)); + printf("%s with MSG_PEEK_OFF works correctly\n", afstr(af, proto)); res = 1; out: - if (recv_sock >= 0) + if (proto == IPPROTO_TCP && recv_sock >= 0) close(recv_sock); if (s[1] >= 0) close(s[1]); @@ -160,24 +167,36 @@ static int tcp_peek_offset_test(sa_family_t af) return res; } -int main(void) +static int do_test(int proto) { int res4, res6; - res4 = tcp_peek_offset_probe(AF_INET); - res6 = tcp_peek_offset_probe(AF_INET6); + res4 = sk_peek_offset_probe(AF_INET, proto); + res6 = sk_peek_offset_probe(AF_INET6, proto); if (!res4 && !res6) return KSFT_SKIP; if (res4) - res4 = tcp_peek_offset_test(AF_INET); + res4 = sk_peek_offset_test(AF_INET, proto); if (res6) - res6 = tcp_peek_offset_test(AF_INET6); + res6 = sk_peek_offset_test(AF_INET6, proto); if (!res4 || !res6) return KSFT_FAIL; return KSFT_PASS; } + +int main(void) +{ + int restcp, resudp; + + restcp = do_test(IPPROTO_TCP); + resudp = do_test(IPPROTO_UDP); + if (restcp == KSFT_FAIL || resudp == KSFT_FAIL) + return KSFT_FAIL; + + return KSFT_PASS; +} -- 2.37.3

10 months

3
5
0 0

[PATCH v4 0/5] Wire up getrandom() vDSO implementation on powerpc

by Christophe Leroy

This series wires up getrandom() vDSO implementation on powerpc. Tested on PPC32 on real hardware. Tested on PPC64 (both BE and LE) on QEMU: Performance on powerpc 885: ~# ./vdso_test_getrandom bench-single vdso: 25000000 times in 62.938002291 seconds libc: 25000000 times in 535.581916866 seconds syscall: 25000000 times in 531.525042806 seconds Performance on powerpc 8321: ~# ./vdso_test_getrandom bench-single vdso: 25000000 times in 16.899318858 seconds libc: 25000000 times in 131.050596522 seconds syscall: 25000000 times in 129.794790389 seconds Performance on QEMU pseries: ~ # ./vdso_test_getrandom bench-single vdso: 25000000 times in 4.977777162 seconds libc: 25000000 times in 75.516749981 seconds syscall: 25000000 times in 86.842242014 seconds Changes in v4: - Rebased on recent random git tree (963233ff0133) (The new tree includes selftests fixes) - Read/write counter in native byte order - Don't use anymore compat macros to write output - Fixed selftests build failure with patch 4 (without patch 5) on little endian on PPC64 - Implement a __kernel_getrandom() stub returning ENOSYS on ppc64 in patch 4 (without patch 5) to make selftests happy. Changes in v3: - Rebased on recent random git tree (0c7e00e22c21) - Fixed build failures reported by robots around VM_DROPPABLE - Fixed crash on PPC64 due to clobbered r13 by not using r13 anymore (saving it was not enough for signals). - Split final patch in two, first for PPC32, second for PPC64 - Moved selftest fixes out of this series Changes in v2: - Define VM_DROPPABLE for powerpc/32 - Fixes generic vDSO getrandom headers to enable CONFIG_COMPAT build. - Fixed size of generation counter - Fixed selftests to work on non x86 architectures Christophe Leroy (5): mm: Define VM_DROPPABLE for powerpc/32 powerpc/vdso32: Add crtsavres powerpc/vdso: Refactor CFLAGS for CVDSO build powerpc/vdso: Wire up getrandom() vDSO implementation on PPC32 powerpc/vdso: Wire up getrandom() vDSO implementation on PPC64 arch/powerpc/Kconfig | 1 + arch/powerpc/include/asm/mman.h | 2 +- arch/powerpc/include/asm/vdso/getrandom.h | 54 ++++ arch/powerpc/include/asm/vdso/vsyscall.h | 6 + arch/powerpc/include/asm/vdso_datapage.h | 2 + arch/powerpc/kernel/asm-offsets.c | 1 + arch/powerpc/kernel/vdso/Makefile | 57 ++-- arch/powerpc/kernel/vdso/getrandom.S | 58 ++++ arch/powerpc/kernel/vdso/gettimeofday.S | 13 - arch/powerpc/kernel/vdso/vdso32.lds.S | 1 + arch/powerpc/kernel/vdso/vdso64.lds.S | 1 + arch/powerpc/kernel/vdso/vgetrandom-chacha.S | 320 +++++++++++++++++++ arch/powerpc/kernel/vdso/vgetrandom.c | 14 + fs/proc/task_mmu.c | 4 +- include/linux/mm.h | 4 +- include/trace/events/mmflags.h | 4 +- tools/testing/selftests/vDSO/Makefile | 2 +- 17 files changed, 501 insertions(+), 43 deletions(-) create mode 100644 arch/powerpc/include/asm/vdso/getrandom.h create mode 100644 arch/powerpc/kernel/vdso/getrandom.S create mode 100644 arch/powerpc/kernel/vdso/vgetrandom-chacha.S create mode 100644 arch/powerpc/kernel/vdso/vgetrandom.c -- 2.44.0

10 months

2
14
0 0

[PATCH net-next v17 11/14] mm: page_frag: add testing for the newly added prepare API

by Yunsheng Lin

Add testing for the newly added prepare API, for both aligned and non-aligned API, also probe API is also tested along with prepare API. CC: Alexander Duyck <alexander.duyck(a)gmail.com> Signed-off-by: Yunsheng Lin <linyunsheng(a)huawei.com> --- .../selftests/mm/page_frag/page_frag_test.c | 66 +++++++++++++++++-- tools/testing/selftests/mm/run_vmtests.sh | 4 ++ tools/testing/selftests/mm/test_page_frag.sh | 31 +++++++++ 3 files changed, 96 insertions(+), 5 deletions(-) diff --git a/tools/testing/selftests/mm/page_frag/page_frag_test.c b/tools/testing/selftests/mm/page_frag/page_frag_test.c index a4bd543d6950..7cfa896f69cb 100644 --- a/tools/testing/selftests/mm/page_frag/page_frag_test.c +++ b/tools/testing/selftests/mm/page_frag/page_frag_test.c @@ -27,6 +27,10 @@ static bool test_align; module_param(test_align, bool, 0); MODULE_PARM_DESC(test_align, "use align API for testing"); +static bool test_prepare; +module_param(test_prepare, bool, 0); +MODULE_PARM_DESC(test_prepare, "use prepare API for testing"); + static int test_alloc_len = 2048; module_param(test_alloc_len, int, 0); MODULE_PARM_DESC(test_alloc_len, "alloc len for testing"); @@ -67,6 +71,18 @@ static int page_frag_pop_thread(void *arg) return 0; } +static void frag_frag_test_commit(struct page_frag_cache *nc, + struct page_frag *prepare_pfrag, + struct page_frag *probe_pfrag, + unsigned int used_sz) +{ + WARN_ON_ONCE(prepare_pfrag->page != probe_pfrag->page || + prepare_pfrag->offset != probe_pfrag->offset || + prepare_pfrag->size != probe_pfrag->size); + + page_frag_commit(nc, prepare_pfrag, used_sz); +} + static int page_frag_push_thread(void *arg) { struct ptr_ring *ring = arg; @@ -80,13 +96,52 @@ static int page_frag_push_thread(void *arg) int ret; if (test_align) { - va = page_frag_alloc_align(&test_nc, test_alloc_len, - GFP_KERNEL, SMP_CACHE_BYTES); + if (test_prepare) { + struct page_frag prepare_frag, probe_frag; + void *probe_va; + + va = page_frag_alloc_refill_prepare_align(&test_nc, + test_alloc_len, + &prepare_frag, + GFP_KERNEL, + SMP_CACHE_BYTES); + + probe_va = __page_frag_alloc_refill_probe_align(&test_nc, + test_alloc_len, + &probe_frag, + -SMP_CACHE_BYTES); + WARN_ON_ONCE(va != probe_va); + + if (likely(va)) + frag_frag_test_commit(&test_nc, &prepare_frag, + &probe_frag, test_alloc_len); + } else { + va = page_frag_alloc_align(&test_nc, + test_alloc_len, + GFP_KERNEL, + SMP_CACHE_BYTES); + } WARN_ONCE((unsigned long)va & (SMP_CACHE_BYTES - 1), "unaligned va returned\n"); } else { - va = page_frag_alloc(&test_nc, test_alloc_len, GFP_KERNEL); + if (test_prepare) { + struct page_frag prepare_frag, probe_frag; + void *probe_va; + + va = page_frag_alloc_refill_prepare(&test_nc, test_alloc_len, + &prepare_frag, GFP_KERNEL); + + probe_va = page_frag_alloc_refill_probe(&test_nc, test_alloc_len, + &probe_frag); + + WARN_ON_ONCE(va != probe_va); + if (likely(va)) + frag_frag_test_commit(&test_nc, &prepare_frag, + &probe_frag, test_alloc_len); + } else { + va = page_frag_alloc(&test_nc, test_alloc_len, GFP_KERNEL); + } } if (!va) @@ -149,8 +204,9 @@ static int __init page_frag_test_init(void) wait_for_completion(&wait); duration = (u64)ktime_us_delta(ktime_get(), start); - pr_info("%d of iterations for %s testing took: %lluus\n", nr_test, - test_align ? "aligned" : "non-aligned", duration); + pr_info("%d of iterations for %s %s API testing took: %lluus\n", nr_test, + test_align ? "aligned" : "non-aligned", + test_prepare ? "prepare" : "alloc", duration); ptr_ring_cleanup(&ptr_ring, NULL); page_frag_cache_drain(&test_nc); diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/selftests/mm/run_vmtests.sh index 96fd470b9f51..e4a36231bbea 100755 --- a/tools/testing/selftests/mm/run_vmtests.sh +++ b/tools/testing/selftests/mm/run_vmtests.sh @@ -464,6 +464,10 @@ CATEGORY="page_frag" run_test ./test_page_frag.sh aligned CATEGORY="page_frag" run_test ./test_page_frag.sh nonaligned +CATEGORY="page_frag" run_test ./test_page_frag.sh aligned_prepare + +CATEGORY="page_frag" run_test ./test_page_frag.sh nonaligned_prepare + echo "SUMMARY: PASS=${count_pass} SKIP=${count_skip} FAIL=${count_fail}" | tap_prefix echo "1..${count_total}" | tap_output diff --git a/tools/testing/selftests/mm/test_page_frag.sh b/tools/testing/selftests/mm/test_page_frag.sh index d2b0734a90b5..3bc40a895d0d 100755 --- a/tools/testing/selftests/mm/test_page_frag.sh +++ b/tools/testing/selftests/mm/test_page_frag.sh @@ -36,6 +36,8 @@ ksft_skip=4 SMOKE_PARAM="test_push_cpu=$TEST_CPU_0 test_pop_cpu=$TEST_CPU_1" NONALIGNED_PARAM="$SMOKE_PARAM test_alloc_len=75 nr_test=$NR_TEST" ALIGNED_PARAM="$NONALIGNED_PARAM test_align=1" +NONALIGNED_PREPARE_PARAM="$NONALIGNED_PARAM test_prepare=1" +ALIGNED_PREPARE_PARAM="$ALIGNED_PARAM test_prepare=1" check_test_requirements() { @@ -74,6 +76,24 @@ run_aligned_check() echo "Check the kernel ring buffer to see the summary." } +run_nonaligned_prepare_check() +{ + echo "Run performance tests to evaluate how fast nonaligned prepare API is." + + insmod $DRIVER $NONALIGNED_PREPARE_PARAM > /dev/null 2>&1 + echo "Done." + echo "Ccheck the kernel ring buffer to see the summary." +} + +run_aligned_prepare_check() +{ + echo "Run performance tests to evaluate how fast aligned prepare API is." + + insmod $DRIVER $ALIGNED_PREPARE_PARAM > /dev/null 2>&1 + echo "Done." + echo "Check the kernel ring buffer to see the summary." +} + run_smoke_check() { echo "Run smoke test." @@ -86,6 +106,7 @@ run_smoke_check() usage() { echo -n "Usage: $0 [ aligned ] | [ nonaligned ] | | [ smoke ] | " + echo "[ aligned_prepare ] | [ nonaligned_prepare ] | " echo "manual parameters" echo echo "Valid tests and parameters:" @@ -106,6 +127,12 @@ usage() echo "# Performance testing for aligned alloc API" echo "$0 aligned" echo + echo "# Performance testing for nonaligned prepare API" + echo "$0 nonaligned_prepare" + echo + echo "# Performance testing for aligned prepare API" + echo "$0 aligned_prepare" + echo exit 0 } @@ -159,6 +186,10 @@ function run_test() run_nonaligned_check elif [[ "$1" = "aligned" ]]; then run_aligned_check + elif [[ "$1" = "nonaligned_prepare" ]]; then + run_nonaligned_prepare_check + elif [[ "$1" = "aligned_prepare" ]]; then + run_aligned_prepare_check else run_manual_check $@ fi -- 2.33.0

10 months

1
0
0 0

[PATCH net-next v17 04/14] mm: page_frag: avoid caller accessing 'page_frag_cache' directly

by Yunsheng Lin

Use appropriate frag_page API instead of caller accessing 'page_frag_cache' directly. CC: Alexander Duyck <alexander.duyck(a)gmail.com> Signed-off-by: Yunsheng Lin <linyunsheng(a)huawei.com> Reviewed-by: Alexander Duyck <alexanderduyck(a)fb.com> Acked-by: Chuck Lever <chuck.lever(a)oracle.com> --- drivers/vhost/net.c | 2 +- include/linux/page_frag_cache.h | 10 ++++++++++ net/core/skbuff.c | 6 +++--- net/rxrpc/conn_object.c | 4 +--- net/rxrpc/local_object.c | 4 +--- net/sunrpc/svcsock.c | 6 ++---- tools/testing/selftests/mm/page_frag/page_frag_test.c | 2 +- 7 files changed, 19 insertions(+), 15 deletions(-) diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index f16279351db5..9ad37c012189 100644 --- a/drivers/vhost/net.c +++ b/drivers/vhost/net.c @@ -1325,7 +1325,7 @@ static int vhost_net_open(struct inode *inode, struct file *f) vqs[VHOST_NET_VQ_RX]); f->private_data = n; - n->pf_cache.va = NULL; + page_frag_cache_init(&n->pf_cache); return 0; } diff --git a/include/linux/page_frag_cache.h b/include/linux/page_frag_cache.h index 67ac8626ed9b..0a52f7a179c8 100644 --- a/include/linux/page_frag_cache.h +++ b/include/linux/page_frag_cache.h @@ -7,6 +7,16 @@ #include <linux/mm_types_task.h> #include <linux/types.h> +static inline void page_frag_cache_init(struct page_frag_cache *nc) +{ + nc->va = NULL; +} + +static inline bool page_frag_cache_is_pfmemalloc(struct page_frag_cache *nc) +{ + return !!nc->pfmemalloc; +} + void page_frag_cache_drain(struct page_frag_cache *nc); void __page_frag_cache_drain(struct page *page, unsigned int count); void *__page_frag_alloc_align(struct page_frag_cache *nc, unsigned int fragsz, diff --git a/net/core/skbuff.c b/net/core/skbuff.c index a52638363ea5..a5f8e4e0c649 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -752,14 +752,14 @@ struct sk_buff *__netdev_alloc_skb(struct net_device *dev, unsigned int len, if (in_hardirq() || irqs_disabled()) { nc = this_cpu_ptr(&netdev_alloc_cache); data = page_frag_alloc(nc, len, gfp_mask); - pfmemalloc = nc->pfmemalloc; + pfmemalloc = page_frag_cache_is_pfmemalloc(nc); } else { local_bh_disable(); local_lock_nested_bh(&napi_alloc_cache.bh_lock); nc = this_cpu_ptr(&napi_alloc_cache.page); data = page_frag_alloc(nc, len, gfp_mask); - pfmemalloc = nc->pfmemalloc; + pfmemalloc = page_frag_cache_is_pfmemalloc(nc); local_unlock_nested_bh(&napi_alloc_cache.bh_lock); local_bh_enable(); @@ -849,7 +849,7 @@ struct sk_buff *napi_alloc_skb(struct napi_struct *napi, unsigned int len) len = SKB_HEAD_ALIGN(len); data = page_frag_alloc(&nc->page, len, gfp_mask); - pfmemalloc = nc->page.pfmemalloc; + pfmemalloc = page_frag_cache_is_pfmemalloc(&nc->page); } local_unlock_nested_bh(&napi_alloc_cache.bh_lock); diff --git a/net/rxrpc/conn_object.c b/net/rxrpc/conn_object.c index 1539d315afe7..694c4df7a1a3 100644 --- a/net/rxrpc/conn_object.c +++ b/net/rxrpc/conn_object.c @@ -337,9 +337,7 @@ static void rxrpc_clean_up_connection(struct work_struct *work) */ rxrpc_purge_queue(&conn->rx_queue); - if (conn->tx_data_alloc.va) - __page_frag_cache_drain(virt_to_page(conn->tx_data_alloc.va), - conn->tx_data_alloc.pagecnt_bias); + page_frag_cache_drain(&conn->tx_data_alloc); call_rcu(&conn->rcu, rxrpc_rcu_free_connection); } diff --git a/net/rxrpc/local_object.c b/net/rxrpc/local_object.c index 504453c688d7..a8cffe47cf01 100644 --- a/net/rxrpc/local_object.c +++ b/net/rxrpc/local_object.c @@ -452,9 +452,7 @@ void rxrpc_destroy_local(struct rxrpc_local *local) #endif rxrpc_purge_queue(&local->rx_queue); rxrpc_purge_client_connections(local); - if (local->tx_alloc.va) - __page_frag_cache_drain(virt_to_page(local->tx_alloc.va), - local->tx_alloc.pagecnt_bias); + page_frag_cache_drain(&local->tx_alloc); } /* diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c index 6b3f01beb294..dcfd84cf0694 100644 --- a/net/sunrpc/svcsock.c +++ b/net/sunrpc/svcsock.c @@ -1609,7 +1609,6 @@ static void svc_tcp_sock_detach(struct svc_xprt *xprt) static void svc_sock_free(struct svc_xprt *xprt) { struct svc_sock *svsk = container_of(xprt, struct svc_sock, sk_xprt); - struct page_frag_cache *pfc = &svsk->sk_frag_cache; struct socket *sock = svsk->sk_sock; trace_svcsock_free(svsk, sock); @@ -1619,8 +1618,7 @@ static void svc_sock_free(struct svc_xprt *xprt) sockfd_put(sock); else sock_release(sock); - if (pfc->va) - __page_frag_cache_drain(virt_to_head_page(pfc->va), - pfc->pagecnt_bias); + + page_frag_cache_drain(&svsk->sk_frag_cache); kfree(svsk); } diff --git a/tools/testing/selftests/mm/page_frag/page_frag_test.c b/tools/testing/selftests/mm/page_frag/page_frag_test.c index 5395a36e4030..a4bd543d6950 100644 --- a/tools/testing/selftests/mm/page_frag/page_frag_test.c +++ b/tools/testing/selftests/mm/page_frag/page_frag_test.c @@ -117,7 +117,7 @@ static int __init page_frag_test_init(void) u64 duration; int ret; - test_nc.va = NULL; + page_frag_cache_init(&test_nc); atomic_set(&nthreads, 2); init_completion(&wait); -- 2.33.0

10 months

1
0
0 0

[PATCH 0/2] Improve migration by backing off earlier

by Dev Jain

It was recently observed at [1] that during the folio unmapping stage of migration, when the PTEs are cleared, a racing thread faulting on that folio may increase the refcount of the folio, sleep on the folio lock (the migration path has the lock), and migration ultimately fails when asserting the actual refcount against the expected. Migration is a best effort service; the unmapping and the moving phase are wrapped around loops for retrying. The refcount of the folio is currently being asserted during the move stage; if it fails, we retry. But, if a racing thread changes the refcount, and ends up sleeping on the folio lock (which is mostly the case), there is no way the refcount would be decremented; as a result, this renders the retrying useless. In the first patch, we make the refcount check also during the unmap stage; if it fails, we restore the original state of the PTE, drop the folio lock, let the system make progress, and retry unmapping again. This improves the probability of migration winning the race. Given that migration is a best-effort service, it is wrong to fail the test for just a single failure; hence, fail the test after 100 consecutive failures (where 100 is still a subjective choice). [1] https://lore.kernel.org/all/20240801081657.1386743-1-dev.jain@arm.com/ Dev Jain (2): mm: Retry migration earlier upon refcount mismatch selftests/mm: Do not fail test for a single migration failure mm/migrate.c | 9 +++++++++ tools/testing/selftests/mm/migration.c | 17 +++++++++++------ 2 files changed, 20 insertions(+), 6 deletions(-) -- 2.30.2

10 months

6
29
0 0

[PATCH] selftests/futex: Create test for robust list

by André Almeida

Create a test for the robust list mechanism. Signed-off-by: André Almeida <andrealmeid(a)igalia.com> --- .../selftests/futex/functional/.gitignore | 1 + .../selftests/futex/functional/Makefile | 3 +- .../selftests/futex/functional/robust_list.c | 450 ++++++++++++++++++ 3 files changed, 453 insertions(+), 1 deletion(-) create mode 100644 tools/testing/selftests/futex/functional/robust_list.c diff --git a/tools/testing/selftests/futex/functional/.gitignore b/tools/testing/selftests/futex/functional/.gitignore index fbcbdb6963b3..4726e1be7497 100644 --- a/tools/testing/selftests/futex/functional/.gitignore +++ b/tools/testing/selftests/futex/functional/.gitignore @@ -9,3 +9,4 @@ futex_wait_wouldblock futex_wait futex_requeue futex_waitv +robust_list diff --git a/tools/testing/selftests/futex/functional/Makefile b/tools/testing/selftests/futex/functional/Makefile index f79f9bac7918..b8635a1ac7f6 100644 --- a/tools/testing/selftests/futex/functional/Makefile +++ b/tools/testing/selftests/futex/functional/Makefile @@ -17,7 +17,8 @@ TEST_GEN_PROGS := \ futex_wait_private_mapped_file \ futex_wait \ futex_requeue \ - futex_waitv + futex_waitv \ + robust_list TEST_PROGS := run.sh diff --git a/tools/testing/selftests/futex/functional/robust_list.c b/tools/testing/selftests/futex/functional/robust_list.c new file mode 100644 index 000000000000..5cc0edaaf028 --- /dev/null +++ b/tools/testing/selftests/futex/functional/robust_list.c @@ -0,0 +1,450 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright Igalia, 2024 + * + * Robust list test by André Almeida <andrealmeid(a)igalia.com> + * + * The robust list uAPI allows userspace to create "robust" locks, in the sense + * that if the lock holder thread dies, the remaining threads that are waiting + * for the lock won't block forever, waiting for a lock that will never be + * released. + * + * This is achieve by userspace setting a list where a thread can enter all the + * locks (futexes) that it is holding. The robust list is a linked list, and + * userspace register the start of the list with the syscall set_robust_list(). + * If such thread eventually dies, the kernel will walk this list, waking up one + * thread waiting for each futex and marking the futex word with the flag + * FUTEX_OWNER_DIED. + * + * See also + * man set_robust_list + * Documententation/locking/robust-futex-ABI.rst + * Documententation/locking/robust-futexes.rst + */ + +#define _GNU_SOURCE + +#include "../../kselftest_harness.h" + +#include "futextest.h" + +#include <pthread.h> +#include <stdatomic.h> +#include <stddef.h> + +#define STACK_SIZE (1024 * 1024) + +#define FUTEX_TIMEOUT 3 + +static pthread_barrier_t barrier, barrier2; + +int set_robust_list(struct robust_list_head *head, size_t len) +{ + return syscall(SYS_set_robust_list, head, len); +} + +int get_robust_list(int pid, struct robust_list_head **head, size_t *len_ptr) +{ + return syscall(SYS_get_robust_list, pid, head, len_ptr); +} + +int futex2_wait(void *futex, int val, struct timespec *timo) +{ + return syscall(SYS_futex_wait, futex, val, ~0U, FUTEX2_SIZE_U32, timo, CLOCK_MONOTONIC); +} + +/* + * Basic lock struct, contains just the futex word and the robust list element + * Real implementations have also a *prev to easily walk in the list + */ +struct lock_struct { + int futex; + struct robust_list list; +}; + +/* + * Helper function to spawn a child thread. Returns -1 on error, pid on success + */ +static int create_child(int (*fn)(void *arg), void *arg) +{ + char *stack; + pid_t pid; + + stack = mmap(NULL, STACK_SIZE, PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0); + if (stack == MAP_FAILED) + return -1; + + stack += STACK_SIZE; + + pid = clone(fn, stack, CLONE_VM | SIGCHLD, arg); + + if (pid == -1) + return -1; + + return pid; +} + +/* + * Helper function to prepare and register a robust list + */ +static int set_list(struct robust_list_head *head) +{ + int ret; + + ret = set_robust_list(head, sizeof(struct robust_list_head)); + if (ret) + return ret; + + head->futex_offset = (size_t) offsetof(struct lock_struct, futex) - + (size_t) offsetof(struct lock_struct, list); + head->list.next = &head->list; + head->list_op_pending = NULL; + + return 0; +} + +/* + * A basic (and incomplete) mutex lock function with robustness + */ +static int mutex_lock(struct lock_struct *lock, struct robust_list_head *head, bool error_inject) +{ + int *futex = &lock->futex, zero = 0, ret = -1; + pid_t tid = gettid(); + + /* + * Set list_op_pending before starting the lock, so the kernel can catch + * the case where the thread died during the lock operation + */ + head->list_op_pending = &lock->list; + + if (atomic_compare_exchange_strong(futex, &zero, tid)) { + /* + * We took the lock, insert it in the robust list + */ + struct robust_list *list = &head->list; + + /* Error injection to test list_op_pending */ + if (error_inject) + return 0; + + while (list->next != &head->list) + list = list->next; + + list->next = &lock->list; + lock->list.next = &head->list; + + ret = 0; + } else { + /* + * We didn't take the lock, wait until the owner wakes (or dies) + */ + struct timespec to; + + clock_gettime(CLOCK_MONOTONIC, &to); + to.tv_sec = to.tv_sec + FUTEX_TIMEOUT; + + tid = atomic_load(futex); + /* Kernel ignores futexes without the waiters flag */ + tid |= FUTEX_WAITERS; + atomic_store(futex, tid); + + ret = futex2_wait(futex, tid, &to); + + /* + * A real mutex_lock() implementation would loop here to finally + * take the lock. We don't care about that, so we stop here. + */ + } + + head->list_op_pending = NULL; + + return ret; +} + +/* + * This child thread will succeed taking the lock, and then will exit holding it + */ +static int child_fn_lock(void *arg) +{ + struct lock_struct *lock = (struct lock_struct *) arg; + struct robust_list_head head; + int ret; + + ret = set_list(&head); + if (ret) + ksft_test_result_fail("set_robust_list error\n"); + + ret = mutex_lock(lock, &head, false); + if (ret) + ksft_test_result_fail("mutex_lock error\n"); + + pthread_barrier_wait(&barrier); + + /* + * There's a race here: the parent thread needs to be inside + * futex_wait() before the child thread dies, otherwise it will miss the + * wakeup from handle_futex_death() that this child will emit. We wait a + * little bit just to make sure that this happens. + */ + sleep(1); + + return 0; +} + +/* + * Spawns a child thread that will set a robust list, take the lock, register it + * in the robust list and die. The parent thread will wait on this futex, and + * should be waken up when the child exits. + */ +TEST(robustness) +{ + struct lock_struct lock = { .futex = 0 }; + struct robust_list_head head; + int ret, *futex = &lock.futex; + + ret = set_list(&head); + ASSERT_EQ(ret, 0); + + /* + * Lets use a barrier to ensure that the child thread takes the lock + * before the parent + */ + ret = pthread_barrier_init(&barrier, NULL, 2); + ASSERT_EQ(ret, 0); + + ret = create_child(&child_fn_lock, &lock); + ASSERT_NE(ret, -1); + + pthread_barrier_wait(&barrier); + ret = mutex_lock(&lock, &head, false); + + /* + * futex_wait() should return 0 and the futex word should be marked with + * FUTEX_OWNER_DIED + */ + ASSERT_EQ(ret, 0) TH_LOG("futex wait returned %d", errno); + ASSERT_TRUE(*futex | FUTEX_OWNER_DIED); + + pthread_barrier_destroy(&barrier); +} + +/* + * The only valid value for len is sizeof(*head) + */ +TEST(set_robust_list_invalid_size) +{ + struct robust_list_head head; + size_t head_size = sizeof(struct robust_list_head); + int ret; + + ret = set_robust_list(&head, head_size); + ASSERT_EQ(ret, 0); + + ret = set_robust_list(&head, head_size * 2); + ASSERT_EQ(ret, -1); + ASSERT_EQ(errno, EINVAL); + + ret = set_robust_list(&head, head_size - 1); + ASSERT_EQ(ret, -1); + ASSERT_EQ(errno, EINVAL); + + ret = set_robust_list(&head, 0); + ASSERT_EQ(ret, -1); + ASSERT_EQ(errno, EINVAL); +} + +/* + * Test get_robust_list with pid = 0, getting the list of the running thread + */ +TEST(get_robust_list_self) +{ + struct robust_list_head head, head2, *get_head; + size_t head_size = sizeof(struct robust_list_head), len_ptr; + int ret; + + ret = set_robust_list(&head, head_size); + ASSERT_EQ(ret, 0); + + ret = get_robust_list(0, &get_head, &len_ptr); + ASSERT_EQ(ret, 0); + ASSERT_EQ(get_head, &head); + ASSERT_EQ(head_size, len_ptr); + + ret = set_robust_list(&head2, head_size); + ASSERT_EQ(ret, 0); + + ret = get_robust_list(0, &get_head, &len_ptr); + ASSERT_EQ(ret, 0); + ASSERT_EQ(get_head, &head2); + ASSERT_EQ(head_size, len_ptr); +} + +static int child_list(void *arg) +{ + struct robust_list_head *head = (struct robust_list_head *) arg; + int ret; + + ret = set_robust_list(head, sizeof(struct robust_list_head)); + if (ret) + ksft_test_result_fail("set_robust_list error\n"); + + pthread_barrier_wait(&barrier); + pthread_barrier_wait(&barrier2); + + return 0; +} + +/* + * Test get_robust_list from another thread. We use two barriers here to ensure + * that: + * 1) the child thread set the list before we try to get it from the + * parent + * 2) the child thread still alive when we try to get the list from it + */ +TEST(get_robust_list_child) +{ + pid_t tid; + int ret; + struct robust_list_head head, *get_head; + size_t len_ptr; + + ret = pthread_barrier_init(&barrier, NULL, 2); + ret = pthread_barrier_init(&barrier2, NULL, 2); + ASSERT_EQ(ret, 0); + + tid = create_child(&child_list, &head); + ASSERT_NE(tid, -1); + + pthread_barrier_wait(&barrier); + + ret = get_robust_list(tid, &get_head, &len_ptr); + ASSERT_EQ(ret, 0); + ASSERT_EQ(&head, get_head); + + pthread_barrier_wait(&barrier2); + + pthread_barrier_destroy(&barrier); + pthread_barrier_destroy(&barrier2); +} + +static int child_fn_lock_with_error(void *arg) +{ + struct lock_struct *lock = (struct lock_struct *) arg; + struct robust_list_head head; + int ret; + + ret = set_list(&head); + if (ret) + ksft_test_result_fail("set_robust_list error\n"); + + ret = mutex_lock(lock, &head, true); + if (ret) + ksft_test_result_fail("mutex_lock error\n"); + + pthread_barrier_wait(&barrier); + + sleep(1); + + return 0; +} + +/* + * Same as robustness test, but inject an error where the mutex_lock() exits + * earlier, just after setting list_op_pending and taking the lock, to test the + * list_op_pending mechanism + */ +TEST(set_list_op_pending) +{ + struct lock_struct lock = { .futex = 0 }; + struct robust_list_head head; + int ret, *futex = &lock.futex; + + ret = set_list(&head); + ASSERT_EQ(ret, 0); + + ret = pthread_barrier_init(&barrier, NULL, 2); + ASSERT_EQ(ret, 0); + + ret = create_child(&child_fn_lock_with_error, &lock); + ASSERT_NE(ret, -1); + + pthread_barrier_wait(&barrier); + ret = mutex_lock(&lock, &head, false); + + ASSERT_EQ(ret, 0) TH_LOG("futex wait returned %d", errno); + ASSERT_TRUE(*futex | FUTEX_OWNER_DIED); + + pthread_barrier_destroy(&barrier); +} + +#define CHILD_NR 10 + +static int child_lock_holder(void *arg) +{ + struct lock_struct *locks = (struct lock_struct *) arg; + struct robust_list_head head; + int i; + + set_list(&head); + + for (i = 0; i < CHILD_NR; i++) { + locks[i].futex = 0; + mutex_lock(&locks[i], &head, false); + } + + pthread_barrier_wait(&barrier); + pthread_barrier_wait(&barrier2); + + sleep(1); + return 0; +} + +static int child_wait_lock(void *arg) +{ + struct lock_struct *lock = (struct lock_struct *) arg; + struct robust_list_head head; + int ret; + + pthread_barrier_wait(&barrier2); + ret = mutex_lock(lock, &head, false); + + if (ret) + ksft_test_result_fail("mutex_lock error\n"); + + if (!(lock->futex | FUTEX_OWNER_DIED)) + ksft_test_result_fail("futex not marked with FUTEX_OWNER_DIED\n"); + + return 0; +} + +/* + * Test a robust list of more than one element. All the waiters should wake when + * the holder dies + */ +TEST(robust_list_multiple_elements) +{ + struct lock_struct locks[CHILD_NR]; + int i, ret; + + ret = pthread_barrier_init(&barrier, NULL, 2); + ASSERT_EQ(ret, 0); + ret = pthread_barrier_init(&barrier2, NULL, CHILD_NR + 1); + ASSERT_EQ(ret, 0); + + create_child(&child_lock_holder, &locks); + + /* Wait until the locker thread takes the look */ + pthread_barrier_wait(&barrier); + + for (i = 0; i < CHILD_NR; i++) + create_child(&child_wait_lock, &locks[i]); + + /* Wait for all children to return */ + while (wait(NULL) > 0); + + pthread_barrier_destroy(&barrier); + pthread_barrier_destroy(&barrier2); +} + +TEST_HARNESS_MAIN -- 2.46.0

10 months

2
2
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-kselftest-mirror September 2024