- Linux-kselftest-mirror - lists.linaro.org

[PATCH v9 0/4] RISC-V: mm: Make SV48 the default address space

by Charlie Jenkins

Make sv48 the default address space for mmap, as some applications currently depend on this assumption. Users can now select a desired address space using a non-zero hint address to mmap. Previously, requesting the default address space from mmap by passing zero as the hint address would result in using the largest address space possible. Some applications, like Go and Java, depend on empty bits in the virtual address space, so this patch provides more flexibility for application developers.

-Charlie

---
v9:
- Raise the mmap_end default to STACK_TOP_MAX to allow the address space to grow beyond the default of sv48 on sv57 machines, as suggested by Alexandre
- Some of the mmap macros had unnecessary conditionals that I have removed

v8:
- Fix RV32 and the RV32 compat mode of RV64 (suggested by Conor)
- Extract out addr and base from the mmap macros (suggested by Alexandre)

v7:
- Changing RLIMIT_STACK inside of an executing program does not trigger arch_pick_mmap_layout(), so rewrite tests to change RLIMIT_STACK from a script before executing tests. RLIMIT_STACK of infinity forces bottomup mmap allocation.
- Make arch_get_mmap_base macro more readable by extracting out the rnd calculation.
- Use MMAP_MIN_VA_BITS in TASK_UNMAPPED_BASE to support the case when mmap attempts to allocate an address smaller than DEFAULT_MAP_WINDOW.
- Fix incorrect wording in documentation.

v6:
- Rebase onto the correct base

v5:
- Minor wording change in documentation
- Change some parentheses in arch_get_mmap_ macros
- Added case for addr==0 in arch_get_mmap_ because without this, programs would crash if RLIMIT_STACK was modified before executing the program. This was tested using the libhugetlbfs tests.

v4:
- Split testcases/document patch into test cases, in-code documentation, and formal documentation patches
- Modified the mmap_base macro to be more legible and better represent memory layout
- Fixed documentation to better reflect the implementation
- Renamed DEFAULT_VA_BITS to MMAP_VA_BITS
- Added additional test case for rlimit changes

---
Charlie Jenkins (4):
  RISC-V: mm: Restrict address space for sv39,sv48,sv57
  RISC-V: mm: Add tests for RISC-V mm
  RISC-V: mm: Update pgtable comment documentation
  RISC-V: mm: Document mmap changes

 Documentation/riscv/vm-layout.rst                  | 22 +++++++
 arch/riscv/include/asm/elf.h                       |  2 +-
 arch/riscv/include/asm/pgtable.h                   | 29 +++++++--
 arch/riscv/include/asm/processor.h                 | 52 +++++++++++++--
 tools/testing/selftests/riscv/Makefile             |  2 +-
 tools/testing/selftests/riscv/mm/.gitignore        |  2 +
 tools/testing/selftests/riscv/mm/Makefile          | 15 +++++
 .../riscv/mm/testcases/mmap_bottomup.c             | 35 ++++++++++
 .../riscv/mm/testcases/mmap_default.c              | 35 ++++++++++
 .../selftests/riscv/mm/testcases/mmap_test.h       | 64 +++++++++++++++++++
 .../selftests/riscv/mm/testcases/run_mmap.sh       | 12 ++++
 11 files changed, 258 insertions(+), 12 deletions(-)
 create mode 100644 tools/testing/selftests/riscv/mm/.gitignore
 create mode 100644 tools/testing/selftests/riscv/mm/Makefile
 create mode 100644 tools/testing/selftests/riscv/mm/testcases/mmap_bottomup.c
 create mode 100644 tools/testing/selftests/riscv/mm/testcases/mmap_default.c
 create mode 100644 tools/testing/selftests/riscv/mm/testcases/mmap_test.h
 create mode 100755 tools/testing/selftests/riscv/mm/testcases/run_mmap.sh

--
2.34.1
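As an illustration of the hint-address mechanism described in this cover letter, here is a minimal userspace sketch. It is not taken from the series' selftests: the helper names map_with_hint() and va_bits_used() are hypothetical, and the exact rule that maps a hint value to an address space is defined by the patch's documentation, not by this sketch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Number of significant bits in a user virtual address (0 for NULL). */
static int va_bits_used(const void *addr)
{
	uintptr_t v = (uintptr_t)addr;
	int bits = 0;

	while (v) {
		bits++;
		v >>= 1;
	}
	return bits;
}

/*
 * Anonymous private mapping. A NULL hint asks for the default address
 * space; a non-zero hint is how a caller would opt in to a different one
 * under the behaviour this series describes (hypothetical helper).
 */
static void *map_with_hint(void *hint, size_t len)
{
	return mmap(hint, len, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}
```

A caller could compare va_bits_used() on the returned pointer for a NULL hint and for a large hint to observe which address space the kernel chose. On architectures other than RISC-V the hint carries only the generic mmap meaning.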

1 year, 10 months


[PATCH v2 0/6] kunit: Add dynamically-extending log

by Richard Fitzgerald

Replace the original fixed-size log buffer with a dynamically-extending log.

Patch 1 provides the basic implementation. The following patches add test cases, support for logging long strings, and an optimization to the string formatting that is now more thoroughly testable.

Richard Fitzgerald (6):
  kunit: Replace fixed-size log with dynamically-extending buffer
  kunit: kunit-test: Add test cases for extending log buffer
  kunit: Handle logging of lines longer than the fragment buffer size
  kunit: kunit-test: Add test cases for logging very long lines
  kunit: kunit-test: Add test of logging only a newline
  kunit: Don't waste first attempt to format string in kunit_log_append()

 include/kunit/test.h   |  25 +++-
 lib/kunit/debugfs.c    |  65 +++++++--
 lib/kunit/kunit-test.c | 321 ++++++++++++++++++++++++++++++++++++++++-
 lib/kunit/test.c       | 127 +++++++++++++---
 4 files changed, 489 insertions(+), 49 deletions(-)

--
2.30.2
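To make the idea of a dynamically-extending log concrete, here is a small userspace sketch that stores text in a chain of fixed-size fragments and splits long strings across them. It is an assumption-laden illustration, not the kunit implementation: FRAG_SIZE, struct dyn_log and the dyn_log_*() helpers are all hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define FRAG_SIZE 16	/* hypothetical fragment capacity, including the NUL */

struct log_frag {
	struct log_frag *next;
	char buf[FRAG_SIZE];
};

struct dyn_log {
	struct log_frag *head;
	struct log_frag *tail;
};

/* Append text, allocating a new fragment whenever the tail is full. */
static int dyn_log_append(struct dyn_log *dl, const char *text)
{
	size_t remaining = strlen(text);

	while (remaining) {
		struct log_frag *f = dl->tail;
		size_t used = f ? strlen(f->buf) : FRAG_SIZE - 1;
		size_t room = FRAG_SIZE - 1 - used;
		size_t n;

		if (!room) {
			f = calloc(1, sizeof(*f));
			if (!f)
				return -1;
			if (dl->tail)
				dl->tail->next = f;
			else
				dl->head = f;
			dl->tail = f;
			used = 0;
			room = FRAG_SIZE - 1;
		}
		n = remaining < room ? remaining : room;
		memcpy(f->buf + used, text, n);
		f->buf[used + n] = '\0';
		text += n;
		remaining -= n;
	}
	return 0;
}

/* Total number of characters stored across all fragments. */
static size_t dyn_log_length(const struct dyn_log *dl)
{
	size_t len = 0;

	for (const struct log_frag *f = dl->head; f; f = f->next)
		len += strlen(f->buf);
	return len;
}

/* Number of fragments currently allocated. */
static size_t dyn_log_frag_count(const struct dyn_log *dl)
{
	size_t count = 0;

	for (const struct log_frag *f = dl->head; f; f = f->next)
		count++;
	return count;
}

/* Release every fragment and reset the log to empty. */
static void dyn_log_free(struct dyn_log *dl)
{
	while (dl->head) {
		struct log_frag *next = dl->head->next;

		free(dl->head);
		dl->head = next;
	}
	dl->tail = NULL;
}
```

The log grows only when text is appended, so an empty log costs no fragment memory, and a line longer than one fragment is simply spread over several.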

1 year, 11 months


[PATCH v6 00/13] RISCV: Add KVM_GET_REG_LIST API

by Haibo Xu

KVM_GET_REG_LIST will dump all register IDs that are available to KVM_GET/SET_ONE_REG, which is very useful for identifying platform regression issues during VM migration.

Patches 1-7 restructure the get-reg-list test in aarch64 to turn some of the code into a common test framework that can be shared by riscv.
Patch 8 moves the reject_set check logic to a function so as to check for different errno values for different registers.
Patch 9 moves finalize_vcpu back to run_test so that riscv can implement its specific operation.
Patch 10 changes the get/set operation to run only on the present-blessed list.
Patch 11 adds the skip_set facility so that riscv can skip the set operation on some registers.
Patch 12 enables the KVM_GET_REG_LIST API in riscv.
Patch 13 adds the corresponding kselftest for checking possible register regressions.

The get-reg-list kvm selftest was ported from aarch64 and tested with Linux v6.5-rc3 on a Qemu riscv64 virt machine.

---
Changed since v5:
  * Rebase to v6.5-rc3
  * Minor fix for Andrew's comments

Andrew Jones (7):
  KVM: arm64: selftests: Replace str_with_index with strdup_printf
  KVM: arm64: selftests: Drop SVE cap check in print_reg
  KVM: arm64: selftests: Remove print_reg's dependency on vcpu_config
  KVM: arm64: selftests: Rename vcpu_config and add to kvm_util.h
  KVM: arm64: selftests: Delete core_reg_fixup
  KVM: arm64: selftests: Split get-reg-list test code
  KVM: arm64: selftests: Finish generalizing get-reg-list

Haibo Xu (6):
  KVM: arm64: selftests: Move reject_set check logic to a function
  KVM: arm64: selftests: Move finalize_vcpu back to run_test
  KVM: selftests: Only do get/set tests on present blessed list
  KVM: selftests: Add skip_set facility to get_reg_list test
  KVM: riscv: Add KVM_GET_REG_LIST API support
  KVM: riscv: selftests: Add get-reg-list test

 Documentation/virt/kvm/api.rst                     |   2 +-
 arch/riscv/kvm/vcpu.c                              | 375 +++++++++
 tools/testing/selftests/kvm/Makefile               |  13 +-
 .../selftests/kvm/aarch64/get-reg-list.c           | 554 ++-----------
 tools/testing/selftests/kvm/get-reg-list.c         | 401 +++++++++
 .../selftests/kvm/include/kvm_util_base.h          |  21 +
 .../selftests/kvm/include/riscv/processor.h        |   3 +
 .../testing/selftests/kvm/include/test_util.h      |   2 +
 tools/testing/selftests/kvm/lib/test_util.c        |  15 +
 .../selftests/kvm/riscv/get-reg-list.c             | 780 ++++++++++++++++++
 10 files changed, 1670 insertions(+), 496 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/get-reg-list.c
 create mode 100644 tools/testing/selftests/kvm/riscv/get-reg-list.c

--
2.34.1

1 year, 11 months


[PATCH 0/2] x86/BPF: Add new BPF helper call bpf_rdtsc

by Tero Kristo

Hello,

This patch series adds a new x86 arch specific BPF helper, bpf_rdtsc(), which can be used for reading the hardware time stamp counter (TSC). Currently the same counter is directly accessible from userspace (using the RDTSC instruction) and from kernel space (using various rdtsc_*() APIs), but eBPF lacks this support.

The main usage for the TSC counter is for various profiling and timing purposes, getting accurate cycle counter values. The counter can currently be read from BPF programs by using the existing perf subsystem services (bpf_perf_event_read()), but its usage is cumbersome at best. Additionally, the perf subsystem provides only a relative value for the counter, while absolute values are desired by some use cases like Wult [1]. The absolute value of TSC can currently be read from BPF programs via some kprobe / bpf_core_read() magic (see [2], [3], [4] for example), but this relies on accessing kernel internals, is not a stable API, and is pretty cumbersome. Thus, this patch proposes a new arch x86 specific BPF helper to avoid the above issues.

-Tero

[1] https://github.com/intel/wult
[2] https://github.com/intel/wult/blob/c92237c95b898498faf41e6644983102d1fe5156…
[3] https://github.com/intel/wult/blob/c92237c95b898498faf41e6644983102d1fe5156…
[4] https://github.com/intel/wult/blob/c92237c95b898498faf41e6644983102d1fe5156…
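For comparison, the userspace access path mentioned in this cover letter can be sketched as below. This shows only the existing RDTSC route through the compiler intrinsic __rdtsc(); it says nothing about the signature of the proposed bpf_rdtsc() helper, which the cover letter does not give. read_tsc() and HAVE_TSC are hypothetical names, and the non-x86 fallback returning 0 is an assumption made so the sketch builds everywhere.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
#define HAVE_TSC 1
#else
#define HAVE_TSC 0
#endif

/* Read the absolute TSC value; returns 0 where RDTSC is unavailable. */
static uint64_t read_tsc(void)
{
#if HAVE_TSC
	return __rdtsc();
#else
	return 0;
#endif
}
```

Subtracting two read_tsc() results gives an elapsed cycle count for the code in between, which is the profiling use the cover letter has in mind.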

1 year, 11 months


[PATCH 5.4 134/154] test_firmware: prevent race conditions by a correct implementation of locking

by Greg Kroah-Hartman

From: Mirsad Goran Todorovac <mirsad.todorovac(a)alu.unizg.hr>

commit 4acfe3dfde685a5a9eaec5555351918e2d7266a1 upstream.

Dan Carpenter spotted a race condition in a couple of situations like these in the test_firmware driver:

static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
        u8 val;
        int ret;

        ret = kstrtou8(buf, 10, &val);
        if (ret)
                return ret;

        mutex_lock(&test_fw_mutex);
        *(u8 *)cfg = val;
        mutex_unlock(&test_fw_mutex);

        /* Always return full write size even if we didn't consume all */
        return size;
}

static ssize_t config_num_requests_store(struct device *dev,
                                         struct device_attribute *attr,
                                         const char *buf, size_t count)
{
        int rc;

        mutex_lock(&test_fw_mutex);
        if (test_fw_config->reqs) {
                pr_err("Must call release_all_firmware prior to changing config\n");
                rc = -EINVAL;
                mutex_unlock(&test_fw_mutex);
                goto out;
        }
        mutex_unlock(&test_fw_mutex);

        rc = test_dev_config_update_u8(buf, count,
                                       &test_fw_config->num_requests);

out:
        return rc;
}

static ssize_t config_read_fw_idx_store(struct device *dev,
                                        struct device_attribute *attr,
                                        const char *buf, size_t count)
{
        return test_dev_config_update_u8(buf, count,
                                         &test_fw_config->read_fw_idx);
}

The function test_dev_config_update_u8() is called from both the locked and the unlocked context, function config_num_requests_store() and config_read_fw_idx_store() which can both be called asynchronously as they are driver's methods, while test_dev_config_update_u8() and siblings change their argument pointed to by u8 *cfg or similar pointer.

To avoid deadlock on test_fw_mutex, the lock is dropped before calling test_dev_config_update_u8() and re-acquired within test_dev_config_update_u8() itself, but alas this creates a race condition.

Having two locks wouldn't assure a race-proof mutual exclusion.

This situation is best avoided by the introduction of a new, unlocked function __test_dev_config_update_u8() which can be called from the locked context and reducing test_dev_config_update_u8() to:

static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
        int ret;

        mutex_lock(&test_fw_mutex);
        ret = __test_dev_config_update_u8(buf, size, cfg);
        mutex_unlock(&test_fw_mutex);

        return ret;
}

doing the locking and calling the unlocked primitive, which enables both locked and unlocked versions without duplication of code.

The similar approach was applied to all functions called from the locked and the unlocked context, which safely mitigates both deadlocks and race conditions in the driver.

__test_dev_config_update_bool(), __test_dev_config_update_u8() and __test_dev_config_update_size_t() unlocked versions of the functions were introduced to be called from the locked contexts as a workaround without releasing the main driver's lock and thereof causing a race condition.

The test_dev_config_update_bool(), test_dev_config_update_u8() and test_dev_config_update_size_t() locked versions of the functions are being called from driver methods without the unnecessary multiplying of the locking and unlocking code for each method, and complicating the code with saving of the return value across lock.

Fixes: 7feebfa487b92 ("test_firmware: add support for request_firmware_into_buf")
Cc: Luis Chamberlain <mcgrof(a)kernel.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Russ Weight <russell.h.weight(a)intel.com>
Cc: Takashi Iwai <tiwai(a)suse.de>
Cc: Tianfei Zhang <tianfei.zhang(a)intel.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Colin Ian King <colin.i.king(a)gmail.com>
Cc: Randy Dunlap <rdunlap(a)infradead.org>
Cc: linux-kselftest(a)vger.kernel.org
Cc: stable(a)vger.kernel.org # v5.4
Suggested-by: Dan Carpenter <error27(a)gmail.com>
Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac(a)alu.unizg.hr>
Link: https://lore.kernel.org/r/20230509084746.48259-1-mirsad.todorovac@alu.unizg…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
 lib/test_firmware.c | 37 ++++++++++++++++++++++++++++---------
 1 file changed, 28 insertions(+), 9 deletions(-)

--- a/lib/test_firmware.c
+++ b/lib/test_firmware.c
@@ -301,16 +301,26 @@ static ssize_t config_test_show_str(char
 	return len;
 }
 
-static int test_dev_config_update_bool(const char *buf, size_t size,
-				       bool *cfg)
+static inline int __test_dev_config_update_bool(const char *buf, size_t size,
+				       bool *cfg)
 {
 	int ret;
 
-	mutex_lock(&test_fw_mutex);
 	if (strtobool(buf, cfg) < 0)
 		ret = -EINVAL;
 	else
 		ret = size;
+
+	return ret;
+}
+
+static int test_dev_config_update_bool(const char *buf, size_t size,
+				       bool *cfg)
+{
+	int ret;
+
+	mutex_lock(&test_fw_mutex);
+	ret = __test_dev_config_update_bool(buf, size, cfg);
 	mutex_unlock(&test_fw_mutex);
 
 	return ret;
@@ -340,7 +350,7 @@ static ssize_t test_dev_config_show_int(
 	return snprintf(buf, PAGE_SIZE, "%d\n", val);
 }
 
-static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
+static inline int __test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
 {
 	int ret;
 	long new;
@@ -352,14 +362,23 @@ static int test_dev_config_update_u8(con
 	if (new > U8_MAX)
 		return -EINVAL;
 
-	mutex_lock(&test_fw_mutex);
 	*(u8 *)cfg = new;
-	mutex_unlock(&test_fw_mutex);
 
 	/* Always return full write size even if we didn't consume all */
 	return size;
 }
 
+static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
+{
+	int ret;
+
+	mutex_lock(&test_fw_mutex);
+	ret = __test_dev_config_update_u8(buf, size, cfg);
+	mutex_unlock(&test_fw_mutex);
+
+	return ret;
+}
+
 static ssize_t test_dev_config_show_u8(char *buf, u8 cfg)
 {
 	u8 val;
@@ -392,10 +411,10 @@ static ssize_t config_num_requests_store
 		mutex_unlock(&test_fw_mutex);
 		goto out;
 	}
-	mutex_unlock(&test_fw_mutex);
 
-	rc = test_dev_config_update_u8(buf, count,
-				       &test_fw_config->num_requests);
+	rc = __test_dev_config_update_u8(buf, count,
+					 &test_fw_config->num_requests);
+	mutex_unlock(&test_fw_mutex);
 
 out:
 	return rc;
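The locked-wrapper plus unlocked-primitive pattern from this patch can be tried outside the kernel. The sketch below is a userspace analogue under stated assumptions: it uses a pthread mutex instead of the kernel mutex, strtoul() instead of the kernel parsing helpers, and hypothetical names (cfg_mutex, struct cfg, update_u8_unlocked(), update_u8(), num_requests_store()). It is not the driver code.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

static pthread_mutex_t cfg_mutex = PTHREAD_MUTEX_INITIALIZER;

struct cfg {
	uint8_t num_requests;
	uint8_t read_fw_idx;
	void *reqs;		/* non-NULL while a batch is outstanding */
};

static struct cfg cfg;

/* Unlocked primitive: the caller must already hold cfg_mutex. */
static int update_u8_unlocked(const char *buf, size_t size, uint8_t *field)
{
	char *end;
	unsigned long val;

	errno = 0;
	val = strtoul(buf, &end, 10);
	if (errno || end == buf || val > UINT8_MAX)
		return -EINVAL;

	*field = (uint8_t)val;
	return (int)size;
}

/* Locked wrapper for callers that do not hold the mutex. */
static int update_u8(const char *buf, size_t size, uint8_t *field)
{
	int ret;

	pthread_mutex_lock(&cfg_mutex);
	ret = update_u8_unlocked(buf, size, field);
	pthread_mutex_unlock(&cfg_mutex);
	return ret;
}

/* The check and the update happen under one continuous critical section. */
static int num_requests_store(const char *buf, size_t count)
{
	int rc;

	pthread_mutex_lock(&cfg_mutex);
	if (cfg.reqs)
		rc = -EINVAL;
	else
		rc = update_u8_unlocked(buf, count, &cfg.num_requests);
	pthread_mutex_unlock(&cfg_mutex);
	return rc;
}
```

Because num_requests_store() never drops the mutex between testing cfg.reqs and writing cfg.num_requests, no other thread can start a batch in the gap, which is the race the patch closes.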

1 year, 11 months


[PATCH 4.19 312/323] test_firmware: fix a memory leak with reqs buffer

by Greg Kroah-Hartman

From: Mirsad Goran Todorovac <mirsad.todorovac(a)alu.unizg.hr>

commit be37bed754ed90b2655382f93f9724b3c1aae847 upstream.

Dan Carpenter spotted that test_fw_config->reqs will be leaked if trigger_batched_requests_store() is called two or more times. The same appears with trigger_batched_requests_async_store().

This bug wasn't triggered by the tests, but observed by Dan's visual inspection of the code.

The recommended workaround was to return -EBUSY if test_fw_config->reqs is already allocated.

Fixes: 7feebfa487b92 ("test_firmware: add support for request_firmware_into_buf")
Cc: Luis Chamberlain <mcgrof(a)kernel.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Russ Weight <russell.h.weight(a)intel.com>
Cc: Tianfei Zhang <tianfei.zhang(a)intel.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Colin Ian King <colin.i.king(a)gmail.com>
Cc: Randy Dunlap <rdunlap(a)infradead.org>
Cc: linux-kselftest(a)vger.kernel.org
Cc: stable(a)vger.kernel.org # v5.4
Suggested-by: Dan Carpenter <error27(a)gmail.com>
Suggested-by: Takashi Iwai <tiwai(a)suse.de>
Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac(a)alu.unizg.hr>
Reviewed-by: Dan Carpenter <dan.carpenter(a)linaro.org>
Acked-by: Luis Chamberlain <mcgrof(a)kernel.org>
Link: https://lore.kernel.org/r/20230509084746.48259-2-mirsad.todorovac@alu.unizg…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
 lib/test_firmware.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

--- a/lib/test_firmware.c
+++ b/lib/test_firmware.c
@@ -618,6 +618,11 @@ static ssize_t trigger_batched_requests_
 
 	mutex_lock(&test_fw_mutex);
 
+	if (test_fw_config->reqs) {
+		rc = -EBUSY;
+		goto out_bail;
+	}
+
 	test_fw_config->reqs =
 		vzalloc(array3_size(sizeof(struct test_batched_req),
 				    test_fw_config->num_requests, 2));
@@ -721,6 +726,11 @@ ssize_t trigger_batched_requests_async_s
 
 	mutex_lock(&test_fw_mutex);
 
+	if (test_fw_config->reqs) {
+		rc = -EBUSY;
+		goto out_bail;
+	}
+
 	test_fw_config->reqs =
 		vzalloc(array3_size(sizeof(struct test_batched_req),
 				    test_fw_config->num_requests, 2));
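The effect of the -EBUSY guard can be illustrated with a small userspace analogue. This is a sketch under assumptions, not the driver: the global pointer reqs stands in for test_fw_config->reqs, calloc() stands in for vzalloc(), and trigger_batch() and release_all() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t fw_mutex = PTHREAD_MUTEX_INITIALIZER;
static void *reqs;	/* stands in for test_fw_config->reqs */

/* Allocate the request buffer, refusing to overwrite a live one. */
static int trigger_batch(size_t num_requests)
{
	int rc = 0;

	pthread_mutex_lock(&fw_mutex);
	if (reqs) {
		/* Without this check the old buffer would be leaked. */
		rc = -EBUSY;
		goto out;
	}

	reqs = calloc(num_requests, 2 * sizeof(int));
	if (!reqs)
		rc = -ENOMEM;
out:
	pthread_mutex_unlock(&fw_mutex);
	return rc;
}

/* Free the buffer so that a new batch may be triggered. */
static void release_all(void)
{
	pthread_mutex_lock(&fw_mutex);
	free(reqs);
	reqs = NULL;
	pthread_mutex_unlock(&fw_mutex);
}
```

A second trigger_batch() call before release_all() fails with -EBUSY instead of silently replacing the pointer to the first allocation.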

1 year, 11 months


[PATCH 4.14 203/204] test_firmware: fix a memory leak with reqs buffer

by Greg Kroah-Hartman

From: Mirsad Goran Todorovac <mirsad.todorovac(a)alu.unizg.hr>

commit be37bed754ed90b2655382f93f9724b3c1aae847 upstream.

Dan Carpenter spotted that test_fw_config->reqs will be leaked if trigger_batched_requests_store() is called two or more times. The same appears with trigger_batched_requests_async_store().

This bug wasn't triggered by the tests, but observed by Dan's visual inspection of the code.

The recommended workaround was to return -EBUSY if test_fw_config->reqs is already allocated.

Fixes: 7feebfa487b92 ("test_firmware: add support for request_firmware_into_buf")
Cc: Luis Chamberlain <mcgrof(a)kernel.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Russ Weight <russell.h.weight(a)intel.com>
Cc: Tianfei Zhang <tianfei.zhang(a)intel.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Colin Ian King <colin.i.king(a)gmail.com>
Cc: Randy Dunlap <rdunlap(a)infradead.org>
Cc: linux-kselftest(a)vger.kernel.org
Cc: stable(a)vger.kernel.org # v5.4
Suggested-by: Dan Carpenter <error27(a)gmail.com>
Suggested-by: Takashi Iwai <tiwai(a)suse.de>
Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac(a)alu.unizg.hr>
Reviewed-by: Dan Carpenter <dan.carpenter(a)linaro.org>
Acked-by: Luis Chamberlain <mcgrof(a)kernel.org>
Link: https://lore.kernel.org/r/20230509084746.48259-2-mirsad.todorovac@alu.unizg…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
 lib/test_firmware.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

--- a/lib/test_firmware.c
+++ b/lib/test_firmware.c
@@ -621,6 +621,11 @@ static ssize_t trigger_batched_requests_
 
 	mutex_lock(&test_fw_mutex);
 
+	if (test_fw_config->reqs) {
+		rc = -EBUSY;
+		goto out_bail;
+	}
+
 	test_fw_config->reqs = vzalloc(sizeof(struct test_batched_req) *
 				       test_fw_config->num_requests * 2);
 	if (!test_fw_config->reqs) {
@@ -723,6 +728,11 @@ ssize_t trigger_batched_requests_async_s
 
 	mutex_lock(&test_fw_mutex);
 
+	if (test_fw_config->reqs) {
+		rc = -EBUSY;
+		goto out_bail;
+	}
+
 	test_fw_config->reqs = vzalloc(sizeof(struct test_batched_req) *
 				       test_fw_config->num_requests * 2);
 	if (!test_fw_config->reqs) {

1 year, 11 months


[PATCH 0/2] v2: F_OFD_GETLK extension to read lock info

by Stas Sergeev

This extension allows F_UNLCK to be used on query, which currently returns EINVAL. Instead it can be used to query the locks on a particular fd - something that is not currently possible. The basic idea is that on F_OFD_GETLK, F_UNLCK would "conflict" with (or query) any types of the lock on the same fd, and ignore any locks on other fds.

Use-cases:

1. CRIU-alike scenario when you want to read the locking info from an fd for the later reconstruction. This can now be done by setting l_start and l_len to 0 to cover the entire file range, and doing F_OFD_GETLK. In the loop you need to advance l_start past the returned lock ranges, to eventually collect all locked ranges.

2. Implementing the lock checking/enforcing policy. Say you want to implement an "auditor" module in your program, that checks that the I/O is done only after the proper locking is applied on a file region. In this case you need to know if the particular region is locked on that fd, and if so - with what type of the lock. If you would do that currently (without this extension) then you can only check for the write locks, and for that you need to probe the lock on your fd and then open the same file via another fd and probe there. That way you can identify the write lock on a particular fd, but such a trick is non-atomic and complex. As for finding out the read lock on a particular fd - impossible. This extension allows such queries to be done without any extra effort.

3. Implementing the mandatory locking policy. Suppose you want to make a policy where the write lock inhibits any unlocked readers and writers. Currently you need to check if the write lock is present on some other fd, and if it is not there - allow the I/O operation. But because the write lock can appear at any moment, you need to do that under some global lock, which can be released only when the I/O operation is finished. With the proposed extension you can instead just check the write lock on your own fd first, and if it is there - allow the I/O operation on that fd without using any global lock. Only if there is no write lock on this fd, then you need to take the global lock and check for a write lock on other fds.

The second patch adds a test-case for OFD locks. It tests both the generic things and the proposed extension.

The third patch is a proposed man page update for fcntl(2) (not for the linux source tree)

Changes in v2:
- Dropped the l_pid extension patch and updated test-case accordingly.

Stas Sergeev (2):
  fs/locks: F_UNLCK extension for F_OFD_GETLK
  selftests: add OFD lock tests

 fs/locks.c                                 |  23 +++-
 tools/testing/selftests/locking/Makefile   |   2 +
 tools/testing/selftests/locking/ofdlocks.c | 132 +++++++++++++++++++++
 3 files changed, 154 insertions(+), 3 deletions(-)
 create mode 100644 tools/testing/selftests/locking/ofdlocks.c

CC: Jeff Layton <jlayton(a)kernel.org>
CC: Chuck Lever <chuck.lever(a)oracle.com>
CC: Alexander Viro <viro(a)zeniv.linux.org.uk>
CC: Christian Brauner <brauner(a)kernel.org>
CC: linux-fsdevel(a)vger.kernel.org
CC: linux-kernel(a)vger.kernel.org
CC: Shuah Khan <shuah(a)kernel.org>
CC: linux-kselftest(a)vger.kernel.org
CC: linux-api(a)vger.kernel.org

--
2.39.2
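The query described in this cover letter can be sketched from userspace as follows. The F_OFD_SETLK and F_OFD_GETLK commands are the existing Linux interface; passing F_UNLCK as the query type is the behaviour proposed here, so on a kernel without the extension that call is expected to fail with EINVAL. The helper names ofd_lock() and ofd_query() are hypothetical and are not taken from the series' ofdlocks.c selftest.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Take a non-blocking OFD lock of `type` (F_RDLCK or F_WRLCK) on a range. */
static int ofd_lock(int fd, short type, off_t start, off_t len)
{
	struct flock fl = {
		.l_type = type,
		.l_whence = SEEK_SET,
		.l_start = start,
		.l_len = len,
		.l_pid = 0,	/* must be 0 for OFD commands */
	};

	return fcntl(fd, F_OFD_SETLK, &fl);
}

/*
 * Probe a byte range with F_OFD_GETLK (len 0 means "to end of file").
 * Returns the l_type reported by the kernel, F_UNLCK when no matching
 * lock was found, or -1 with errno set.
 * With type == F_UNLCK this is the proposed "query my own fd" form.
 */
static int ofd_query(int fd, short type, off_t start, off_t len)
{
	struct flock fl = {
		.l_type = type,
		.l_whence = SEEK_SET,
		.l_start = start,
		.l_len = len,
		.l_pid = 0,
	};

	if (fcntl(fd, F_OFD_GETLK, &fl) == -1)
		return -1;
	return fl.l_type;
}
```

For the CRIU-style use case, a caller would repeat ofd_query(fd, F_UNLCK, ...) while advancing the start offset past each reported range; that loop needs the returned l_start and l_len as well, which this sketch leaves out for brevity.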

1 year, 11 months


[PATCH] selftests/mm: FOLL_LONGTERM need to be updated to 0x100

by Ayush Jain

After commit 2c2241081f7d ("mm/gup: move private gup FOLL_ flags to internal.h") the FOLL_LONGTERM flag value got updated from 0x10000 to 0x100 at include/linux/mm_types.h.

As hmm.hmm_device_private.hmm_gup_test uses FOLL_LONGTERM, update the value here as well.

Before this change the test goes into an infinite assert loop in hmm.hmm_device_private.hmm_gup_test:

==========================================================
 RUN           hmm.hmm_device_private.hmm_gup_test ...
hmm-tests.c:1962:hmm_gup_test:Expected HMM_DMIRROR_PROT_WRITE..
..(2) == m[2] (34)
hmm-tests.c:157:hmm_gup_test:Expected ret (-1) == 0 (0)
hmm-tests.c:157:hmm_gup_test:Expected ret (-1) == 0 (0)
...
==========================================================

 Call Trace:
 <TASK>
 ? sched_clock+0xd/0x20
 ? __lock_acquire.constprop.0+0x120/0x6c0
 ? ktime_get+0x2c/0xd0
 ? sched_clock+0xd/0x20
 ? local_clock+0x12/0xd0
 ? lock_release+0x26e/0x3b0
 pin_user_pages_fast+0x4c/0x70
 gup_test_ioctl+0x4ff/0xbb0
 ? gup_test_ioctl+0x68c/0xbb0
 __x64_sys_ioctl+0x99/0xd0
 do_syscall_64+0x60/0x90
 ? syscall_exit_to_user_mode+0x2a/0x50
 ? do_syscall_64+0x6d/0x90
 ? syscall_exit_to_user_mode+0x2a/0x50
 ? do_syscall_64+0x6d/0x90
 ? irqentry_exit_to_user_mode+0xd/0x20
 ? irqentry_exit+0x3f/0x50
 ? exc_page_fault+0x96/0x200
 entry_SYSCALL_64_after_hwframe+0x72/0xdc
 RIP: 0033:0x7f6aaa31aaff

After this change the test passes successfully.

Signed-off-by: Ayush Jain <ayush.jain3(a)amd.com>
Reviewed-by: Raghavendra K T <raghavendra.kt(a)amd.com>
---
 tools/testing/selftests/mm/hmm-tests.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/mm/hmm-tests.c b/tools/testing/selftests/mm/hmm-tests.c
index 4adaad1b822f..20294553a5dd 100644
--- a/tools/testing/selftests/mm/hmm-tests.c
+++ b/tools/testing/selftests/mm/hmm-tests.c
@@ -57,9 +57,14 @@ enum {
 
 #define ALIGN(x, a) (((x) + (a - 1)) & (~((a) - 1)))
 /* Just the flags we need, copied from mm.h: */
+
+#ifndef FOLL_WRITE
 #define FOLL_WRITE	0x01	/* check pte is writable */
-#define FOLL_LONGTERM   0x10000 /* mapping lifetime is indefinite */
+#endif
 
+#ifndef FOLL_LONGTERM
+#define FOLL_LONGTERM   0x100 /* mapping lifetime is indefinite */
+#endif
 FIXTURE(hmm)
 {
 	int		fd;
--
2.39.3

1 year, 11 months


[PATCH v27 0/6] Implement IOCTL to get and optionally clear info about PTEs

by Muhammad Usama Anjum

*Changes in v27:*
- Handle review comments and minor improvements
- Add performance improvement patch on top with test for easy review

*Changes in v26:*
- Code restructuring and API changes in PAGEMAP_IOCTL

*Changes in v25:*
- Do proper filtering on hole as well (hole got missed earlier)

*Changes in v24:*
- Rebase on top of next-20230710
- Place WP markers in case of hole as well

*Changes in v23:*
- Set vec_buf_index in loop only when vec_buf_index is set
- Return -EFAULT instead of -EINVAL if vec is NULL
- Correctly return the walk ending address to the page granularity

*Changes in v22:*
- Interface change:
  - Replace [start, start + len) with [start, end)
  - Return the ending address of the address walk in start

*Changes in v21:*
- Abort walk instead of returning error if WP is to be performed on partial hugetlb

*Changes in v20:*
- Correct PAGE_IS_FILE and add PAGE_IS_PFNZERO

*Changes in v19:*
- Minor changes and interface updates

*Changes in v18:*
- Rebase on top of next-20230613
- Minor updates

*Changes in v17:*
- Rebase on top of next-20230606
- Minor improvements in PAGEMAP_SCAN IOCTL patch

*Changes in v16:*
- Fix a corner case
- Add exclusive PM_SCAN_OP_WP back

*Changes in v15:*
- Build fix (Add missed build fix in RESEND)

*Changes in v14:*
- Fix build error caused by #ifdef added at last minute in some configs

*Changes in v13:*
- Rebase on top of next-20230414
- Give up on using uffd_wp_range() and write new helpers, flush tlb only once

*Changes in v12:*
- Update and other memory types to UFFD_FEATURE_WP_ASYNC
- Rebase on top of next-20230406
- Review updates

*Changes in v11:*
- Rebase on top of next-20230307
- Base patches on UFFD_FEATURE_WP_UNPOPULATED
- Do a lot of cosmetic changes and review updates
- Remove ENGAGE_WP + !GET operation as it can be performed with UFFDIO_WRITEPROTECT

*Changes in v10:*
- Add specific condition to return error if hugetlb is used with wp async
- Move changes in tools/include/uapi/linux/fs.h to separate patch
- Add documentation

*Changes in v9:* - Correct fault resolution for userfaultfd wp async - Fix build warnings and errors which were happening on some configs - Simplify pagemap ioctl's code *Changes in v8:* - Update uffd async wp implementation - Improve PAGEMAP_IOCTL implementation *Changes in v7:* - Add uffd wp async - Update the IOCTL to use uffd under the hood instead of soft-dirty flags *Motivation* The real motivation for adding PAGEMAP_SCAN IOCTL is to emulate Windows GetWriteWatch() and ResetWriteWatch() syscalls [1]. The GetWriteWatch() retrieves the addresses of the pages that are written to in a region of virtual memory. This syscall is used in Windows applications and games etc. This syscall is being emulated in pretty slow manner in userspace. Our purpose is to enhance the kernel such that we translate it efficiently in a better way. Currently some out of tree hack patches are being used to efficiently emulate it in some kernels. We intend to replace those with these patches. So the whole gaming on Linux can effectively get benefit from this. It means there would be tons of users of this code. CRIU use case [2] was mentioned by Andrei and Danylo: > Use cases for migrating sparse VMAs are binaries sanitized with ASAN, > MSAN or TSAN [3]. All of these sanitizers produce sparse mappings of > shadow memory [4]. Being able to migrate such binaries allows to highly > reduce the amount of work needed to identify and fix post-migration > crashes, which happen constantly. Andrei's defines the following uses of this code: * it is more granular and allows us to track changed pages more effectively. The current interface can clear dirty bits for the entire process only. In addition, reading info about pages is a separate operation. It means we must freeze the process to read information about all its pages, reset dirty bits, only then we can start dumping pages. The information about pages becomes more and more outdated, while we are processing pages. 
The new interface solves both these downsides. First, it allows us to read pte bits and clear the soft-dirty bit atomically. It means that CRIU will not need to freeze processes to pre-dump their memory. Second, it clears soft-dirty bits for a specified region of memory. It means CRIU will have actual info about pages to the moment of dumping them. * The new interface has to be much faster because basic page filtering is happening in the kernel. With the old interface, we have to read pagemap for each page. *Implementation Evolution (Short Summary)* From the definition of GetWriteWatch(), we feel like kernel's soft-dirty feature can be used under the hood with some additions like: * reset soft-dirty flag for only a specific region of memory instead of clearing the flag for the entire process * get and clear soft-dirty flag for a specific region atomically So we decided to use ioctl on pagemap file to read or/and reset soft-dirty flag. But using soft-dirty flag, sometimes we get extra pages which weren't even written. They had become soft-dirty because of VMA merging and VM_SOFTDIRTY flag. This breaks the definition of GetWriteWatch(). We were able to by-pass this short coming by ignoring VM_SOFTDIRTY until David reported that mprotect etc messes up the soft-dirty flag while ignoring VM_SOFTDIRTY [5]. This wasn't happening until [6] got introduced. We discussed if we can revert these patches. But we could not reach to any conclusion. So at this point, I made couple of tries to solve this whole VM_SOFTDIRTY issue by correcting the soft-dirty implementation: * [7] Correct the bug fixed wrongly back in 2014. It had potential to cause regression. We left it behind. * [8] Keep a list of soft-dirty part of a VMA across splits and merges. I got the reply don't increase the size of the VMA by 8 bytes. At this point, we left soft-dirty considering it is too much delicate and userfaultfd [9] seemed like the only way forward. 
From there onward, we have been basing soft-dirty emulation on userfaultfd wp feature where kernel resolves the faults itself when WP_ASYNC feature is used. It was straight forward to add WP_ASYNC feature in userfautlfd. Now we get only those pages dirty or written-to which are really written in reality. (PS There is another WP_UNPOPULATED userfautfd feature is required which is needed to avoid pre-faulting memory before write-protecting [9].) All the different masks were added on the request of CRIU devs to create interface more generic and better. [1] https://learn.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-… [2] https://lore.kernel.org/all/20221014134802.1361436-1-mdanylo@google.com [3] https://github.com/google/sanitizers [4] https://github.com/google/sanitizers/wiki/AddressSanitizerAlgorithm#64-bit [5] https://lore.kernel.org/all/bfcae708-db21-04b4-0bbe-712badd03071@redhat.com [6] https://lore.kernel.org/all/20220725142048.30450-1-peterx@redhat.com/ [7] https://lore.kernel.org/all/20221122115007.2787017-1-usama.anjum@collabora.… [8] https://lore.kernel.org/all/20221220162606.1595355-1-usama.anjum@collabora.… [9] https://lore.kernel.org/all/20230306213925.617814-1-peterx@redhat.com [10] https://lore.kernel.org/all/20230125144529.1630917-1-mdanylo@google.com * Original Cover letter from v8* Hello, Note: Soft-dirty pages and pages which have been written-to are synonyms. As kernel already has soft-dirty feature inside which we have given up to use, we are using written-to terminology while using UFFD async WP under the hood. It is possible to find and clear soft-dirty pages entirely in userspace. But it isn't efficient: - The mprotect and SIGSEGV handler for bookkeeping - The userfaultfd wp (synchronous) with the handler for bookkeeping Some benchmarks can be seen here[1]. This series adds features that weren't present earlier: - There is no atomic get soft-dirty/Written-to status and clear present in the kernel. 
- The pages which have been written-to cannot be found in an accurate way. (The kernel's soft-dirty PTE bit + soft_dirty VMA bit show more soft-dirty pages than there actually are.)

Historically, soft-dirty PTE bit tracking has been used in the CRIU project. The procfs interface is enough for finding the soft-dirty bit status and clearing the soft-dirty bit of all the pages of a process. We have a use case where we need to track the soft-dirty PTE bit for only specific pages on demand. We need this tracking and clearing mechanism for a region of memory while the process is running to emulate the getWriteWatch() syscall of Windows.

*(Moved to using UFFD instead of soft-dirty feature to find pages which have been written-to from v7 patch series)*:
Stop using the soft-dirty flags for finding which pages have been written to. It is too delicate and wrong, as it shows more soft-dirty pages than there actually are. There is no interest in correcting it [2][3] as this is how the feature was written years ago. It shouldn't be updated to change its behaviour. Peter Xu has suggested using the async version of the UFFD WP [4] as it is based inherently on the PTEs.

So in this patch series, I've added a new mode to UFFD which is an asynchronous version of write protect. When this variant of UFFD WP is used, the page faults are resolved automatically by the kernel. The pages which have been written-to can be found by reading the pagemap file (!PM_UFFD_WP). This feature can be used successfully to find which pages have been written to since the time the pages were write-protected. This works just like the soft-dirty flag without showing any extra pages which aren't soft-dirty in reality.

The information about whether a page is file mapped, present or swapped is required for the CRIU project [5][6]. The required mask, any mask, excluded mask and return mask are also required for the CRIU project [5].

The IOCTL returns the addresses of the pages which match the specific masks. The page addresses are returned in struct page_region in a compact form. The max_pages is needed to support a use case where the user only wants to get a specific number of pages, so there is no need to find all the pages of interest in the range when max_pages is specified. The IOCTL returns when the maximum number of pages has been found. The max_pages is optional. If max_pages is specified, it must be equal to or greater than the vec_size. This restriction is needed to handle the worst case, when one page_region only contains info about one page and cannot be compacted. This is needed to emulate the Windows getWriteWatch() syscall.

The patch series includes a detailed selftest which can be used as an example for the uffd async wp test and PAGEMAP_IOCTL. It shows the interface usage as well.

[1] https://lore.kernel.org/lkml/54d4c322-cd6e-eefd-b161-2af2b56aae24@collabora…
[2] https://lore.kernel.org/all/20221220162606.1595355-1-usama.anjum@collabora.…
[3] https://lore.kernel.org/all/20221122115007.2787017-1-usama.anjum@collabora.…
[4] https://lore.kernel.org/all/Y6Hc2d+7eTKs7AiH@x1n
[5] https://lore.kernel.org/all/YyiDg79flhWoMDZB@gmail.com/
[6] https://lore.kernel.org/all/20221014134802.1361436-1-mdanylo@google.com/

Regards,
Muhammad Usama Anjum

Muhammad Usama Anjum (5):
  fs/proc/task_mmu: Implement IOCTL to get and optionally clear info about PTEs
  fs/proc/task_mmu: Add fast paths to get/clear PAGE_IS_WRITTEN flag
  tools headers UAPI: Update linux/fs.h with the kernel sources
  mm/pagemap: add documentation of PAGEMAP_SCAN IOCTL
  selftests: mm: add pagemap ioctl tests

Peter Xu (1):
  userfaultfd: UFFD_FEATURE_WP_ASYNC

 Documentation/admin-guide/mm/pagemap.rst     |   64 +
 Documentation/admin-guide/mm/userfaultfd.rst |   35 +
 fs/proc/task_mmu.c                           |  715 +++++++++
 fs/userfaultfd.c                             |   26 +-
 include/linux/hugetlb.h                      |    1 +
 include/linux/userfaultfd_k.h                |   21 +-
 include/uapi/linux/fs.h                      |   59 +
 include/uapi/linux/userfaultfd.h             |    9 +-
 mm/hugetlb.c                                 |   34 +-
 mm/memory.c                                  |   27 +-
 tools/include/uapi/linux/fs.h                |   59 +
 tools/testing/selftests/mm/.gitignore        |    2 +
 tools/testing/selftests/mm/Makefile          |    3 +-
 tools/testing/selftests/mm/config            |    1 +
 tools/testing/selftests/mm/pagemap_ioctl.c   | 1491 ++++++++++++++++++
 tools/testing/selftests/mm/run_vmtests.sh    |    4 +
 16 files changed, 2527 insertions(+), 24 deletions(-)
 create mode 100644 tools/testing/selftests/mm/pagemap_ioctl.c

--
2.39.2

kselftest/next kselftest-seccomp: 2 runs, 1 regressions (v6.5-rc3-26-g2b2fe6052dd01)

by kernelci.org bot

kselftest/next kselftest-seccomp: 2 runs, 1 regressions (v6.5-rc3-26-g2b2fe6052dd01)

Regressions Summary
-------------------

platform        | arch  | lab           | compiler | defconfig                    | regressions
----------------+-------+---------------+----------+------------------------------+------------
mt8173-elm-hana | arm64 | lab-collabora | gcc-10   | defconfig+kse...4-chromebook | 1

  Details:  https://kernelci.org/test/job/kselftest/branch/next/kernel/v6.5-rc3-26-g2b2…

  Test:     kselftest-seccomp
  Tree:     kselftest
  Branch:   next
  Describe: v6.5-rc3-26-g2b2fe6052dd01
  URL:      https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git
  SHA:      2b2fe6052dd01fdb4e9a31031c2c9d8f03cf7753

Test Regressions
----------------

platform        | arch  | lab           | compiler | defconfig                    | regressions
----------------+-------+---------------+----------+------------------------------+------------
mt8173-elm-hana | arm64 | lab-collabora | gcc-10   | defconfig+kse...4-chromebook | 1

  Details:     https://kernelci.org/test/plan/id/64d2fa2ce56d88a33435b1f6

  Results:     0 PASS, 1 FAIL, 0 SKIP
  Full config: defconfig+kselftest+arm64-chromebook
  Compiler:    gcc-10 (aarch64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110)
  Plain log:   https://storage.kernelci.org//kselftest/next/v6.5-rc3-26-g2b2fe6052dd01/arm…
  HTML log:    https://storage.kernelci.org//kselftest/next/v6.5-rc3-26-g2b2fe6052dd01/arm…
  Rootfs:      http://storage.kernelci.org/images/rootfs/debian/bullseye-kselftest/2023062…

  * kselftest-seccomp.login: https://kernelci.org/test/case/id/64d2fa2ce56d88a33435b1f7
        failing since 295 days (last pass: linux-kselftest-next-6.0-rc2-11-g144eeb2fc761, first fail: v6.1-rc1)

kselftest/next build: 6 builds: 0 failed, 6 passed, 1 warning (v6.5-rc3-26-g2b2fe6052dd01)

by kernelci.org bot

kselftest/next build: 6 builds: 0 failed, 6 passed, 1 warning (v6.5-rc3-26-g2b2fe6052dd01)

Full Build Summary: https://kernelci.org/build/kselftest/branch/next/kernel/v6.5-rc3-26-g2b2fe6…

Tree: kselftest
Branch: next
Git Describe: v6.5-rc3-26-g2b2fe6052dd01
Git Commit: 2b2fe6052dd01fdb4e9a31031c2c9d8f03cf7753
Git URL: https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git
Built: 4 unique architectures

Warnings Detected:

arm64:

arm:

i386:

x86_64:
    x86_64_defconfig+kselftest (clang-16): 1 warning

Warnings summary:

    1    vmlinux.o: warning: objtool: set_ftrace_ops_ro+0x39: relocation to !ENDBR: .text+0x13df86

================================================================================

Detailed per-defconfig build reports:

--------------------------------------------------------------------------------
defconfig+kselftest (arm64, gcc-10) — PASS, 0 errors, 0 warnings, 0 section mismatches

--------------------------------------------------------------------------------
defconfig+kselftest+arm64-chromebook (arm64, gcc-10) — PASS, 0 errors, 0 warnings, 0 section mismatches

--------------------------------------------------------------------------------
i386_defconfig+kselftest (i386, gcc-10) — PASS, 0 errors, 0 warnings, 0 section mismatches

--------------------------------------------------------------------------------
multi_v7_defconfig+kselftest (arm, gcc-10) — PASS, 0 errors, 0 warnings, 0 section mismatches

--------------------------------------------------------------------------------
x86_64_defconfig+kselftest (x86_64, gcc-10) — PASS, 0 errors, 0 warnings, 0 section mismatches

--------------------------------------------------------------------------------
x86_64_defconfig+kselftest (x86_64, clang-16) — PASS, 0 errors, 1 warning, 0 section mismatches

Warnings:
    vmlinux.o: warning: objtool: set_ftrace_ops_ro+0x39: relocation to !ENDBR: .text+0x13df86

---
For more info write to <info(a)kernelci.org>

[PATCH 0/4] RSEQ selftests updates

by Mathieu Desnoyers

Hi,

You will find in this series updates to the rseq selftests, mainly bringing fixes from librseq project back into the RSEQ selftests.

Thanks,

Mathieu

Mathieu Desnoyers (4):
  selftests/rseq: Fix CID_ID typo in Makefile
  selftests/rseq: Implement rseq_unqual_scalar_typeof
  selftests/rseq: Fix arm64 buggy load-acquire/store-release macros
  selftests/rseq: Use rseq_unqual_scalar_typeof in macros

 tools/testing/selftests/rseq/Makefile     |  2 +-
 tools/testing/selftests/rseq/compiler.h   | 26 ++++++++++
 tools/testing/selftests/rseq/rseq-arm.h   |  4 +-
 tools/testing/selftests/rseq/rseq-arm64.h | 58 ++++++++++++-----------
 tools/testing/selftests/rseq/rseq-mips.h  |  4 +-
 tools/testing/selftests/rseq/rseq-ppc.h   |  4 +-
 tools/testing/selftests/rseq/rseq-riscv.h |  6 +--
 tools/testing/selftests/rseq/rseq-s390.h  |  4 +-
 tools/testing/selftests/rseq/rseq-x86.h   |  4 +-
 9 files changed, 70 insertions(+), 42 deletions(-)

--
2.25.1

[PATCH v3 4.14 1/1] test_firmware: fix the memory leaks with the reqs buffer

by Mirsad Todorovac

[ commit be37bed754ed90b2655382f93f9724b3c1aae847 upstream ]

Dan Carpenter spotted that test_fw_config->reqs will be leaked if trigger_batched_requests_store() is called two or more times. The same appears with trigger_batched_requests_async_store().

This bug wasn't triggered by the tests, but observed by Dan's visual inspection of the code.

The recommended workaround was to return -EBUSY if test_fw_config->reqs is already allocated.

Fixes: c92316bf8e94 ("test_firmware: add batched firmware tests")
Cc: Luis Chamberlain <mcgrof(a)kernel.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Russ Weight <russell.h.weight(a)intel.com>
Cc: Tianfei Zhang <tianfei.zhang(a)intel.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Colin Ian King <colin.i.king(a)gmail.com>
Cc: Randy Dunlap <rdunlap(a)infradead.org>
Cc: linux-kselftest(a)vger.kernel.org
Cc: stable(a)vger.kernel.org # v4.14
Suggested-by: Dan Carpenter <error27(a)gmail.com>
Suggested-by: Takashi Iwai <tiwai(a)suse.de>
Link: https://lore.kernel.org/r/20230509084746.48259-2-mirsad.todorovac@alu.unizg…
Signed-off-by: Mirsad Todorovac <mirsad.todorovac(a)alu.unizg.hr>

[ This fix is applied against the 4.14 stable branch. There are no changes to the ]
[ fix in code when compared to the upstream, only the reformatting for backport. ]
---
v2 -> v3:
 minor clarifications in the versioning for the patchwork. no change to commit.
v1 -> v2:
 removed the Reviewed-by: and Acked-by tags, as this is a slightly different patch and those need to be reacquired

 lib/test_firmware.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/lib/test_firmware.c b/lib/test_firmware.c
index 1c5e5246bf10..5318c5e18acf 100644
--- a/lib/test_firmware.c
+++ b/lib/test_firmware.c
@@ -621,6 +621,11 @@ static ssize_t trigger_batched_requests_store(struct device *dev,
 
 	mutex_lock(&test_fw_mutex);
 
+	if (test_fw_config->reqs) {
+		rc = -EBUSY;
+		goto out_bail;
+	}
+
 	test_fw_config->reqs = vzalloc(sizeof(struct test_batched_req) *
 				       test_fw_config->num_requests * 2);
 	if (!test_fw_config->reqs) {
@@ -723,6 +728,11 @@ ssize_t trigger_batched_requests_async_store(struct device *dev,
 
 	mutex_lock(&test_fw_mutex);
 
+	if (test_fw_config->reqs) {
+		rc = -EBUSY;
+		goto out_bail;
+	}
+
 	test_fw_config->reqs = vzalloc(sizeof(struct test_batched_req) *
 				       test_fw_config->num_requests * 2);
 	if (!test_fw_config->reqs) {
--
2.34.1

[PATCH v2 1/2] selftests: mm: ksm: Fix incorrect evaluation of parameter

by Ayush Jain

A missing break in ksm_tests leads to a kselftest hang when the parameter -s is used.

In the current code flow, because of the missing break in -s, -t parses the args spilled from -s, and as -t accepts only 0 or 1 as valid values, any -s arg >1 or <0 results in a ksm_tests failure.

This went undetected since, before the addition of option -t, the next case -M would immediately break out of the switch statement, but that is no longer the case.

Add the missing break statement.

----Before----
./ksm_tests -H -s 100
Invalid merge type

----After----
./ksm_tests -H -s 100
Number of normal pages: 0
Number of huge pages: 50
Total size: 100 MiB
Total time: 0.401732682 s
Average speed: 248.922 MiB/s

Fixes: 07115fcc15b4 ("selftests/mm: add new selftests for KSM")
Signed-off-by: Ayush Jain <ayush.jain3(a)amd.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
---
v1 -> v2
collect Reviewed-by from David
Updated Fixes tag from commit 9e7cb94ca218 to 07115fcc15b4

 tools/testing/selftests/mm/ksm_tests.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/testing/selftests/mm/ksm_tests.c b/tools/testing/selftests/mm/ksm_tests.c
index 435acebdc325..380b691d3eb9 100644
--- a/tools/testing/selftests/mm/ksm_tests.c
+++ b/tools/testing/selftests/mm/ksm_tests.c
@@ -831,6 +831,7 @@ int main(int argc, char *argv[])
 				printf("Size must be greater than 0\n");
 				return KSFT_FAIL;
 			}
+			break;
 		case 't':
 			{
 				int tmp = atoi(optarg);
--
2.34.1

[PATCH bpf-next v6 0/8] Add SO_REUSEPORT support for TC bpf_sk_assign

by Lorenz Bauer

We want to replace iptables TPROXY with a BPF program at TC ingress. To make this work in all cases we need to assign a SO_REUSEPORT socket to an skb, which is currently prohibited. This series adds support for such sockets to bpf_sk_assign.

I did some refactoring to cut down on the amount of duplicate code. The key to this is to use INDIRECT_CALL in the reuseport helpers. To show that this approach is not just beneficial to TC sk_assign I removed duplicate code for bpf_sk_lookup as well.

Joint work with Daniel Borkmann.

Signed-off-by: Lorenz Bauer <lmb(a)isovalent.com>
---
Changes in v6:
- Reject unhashed UDP sockets in bpf_sk_assign to avoid ref leak
- Link to v5: https://lore.kernel.org/r/20230613-so-reuseport-v5-0-f6686a0dbce0@isovalent…

Changes in v5:
- Drop reuse_sk == sk check in inet[6]_steal_sock (Kuniyuki)
- Link to v4: https://lore.kernel.org/r/20230613-so-reuseport-v4-0-4ece76708bba@isovalent…

Changes in v4:
- WARN_ON_ONCE if reuseport socket is refcounted (Kuniyuki)
- Use inet[6]_ehashfn_t to shorten function declarations (Kuniyuki)
- Shuffle documentation patch around (Kuniyuki)
- Update commit message to explain why IPv6 needs EXPORT_SYMBOL
- Link to v3: https://lore.kernel.org/r/20230613-so-reuseport-v3-0-907b4cbb7b99@isovalent…

Changes in v3:
- Fix warning re udp_ehashfn and udp6_ehashfn (Simon)
- Return higher scoring connected UDP reuseport sockets (Kuniyuki)
- Fix ipv6 module builds
- Link to v2: https://lore.kernel.org/r/20230613-so-reuseport-v2-0-b7c69a342613@isovalent…

Changes in v2:
- Correct commit abbrev length (Kuniyuki)
- Reduce duplication (Kuniyuki)
- Add checks on sk_state (Martin)
- Split exporting inet[6]_lookup_reuseport into separate patch (Eric)

---
Daniel Borkmann (1):
  selftests/bpf: Test that SO_REUSEPORT can be used with sk_assign helper

Lorenz Bauer (7):
  udp: re-score reuseport groups when connected sockets are present
  bpf: reject unhashed sockets in bpf_sk_assign
  net: export inet_lookup_reuseport and inet6_lookup_reuseport
  net: remove duplicate reuseport_lookup functions
  net: document inet[6]_lookup_reuseport sk_state requirements
  net: remove duplicate sk_lookup helpers
  bpf, net: Support SO_REUSEPORT sockets with bpf_sk_assign

 include/net/inet6_hashtables.h                     |  81 ++++++++-
 include/net/inet_hashtables.h                      |  74 +++++++-
 include/net/sock.h                                 |   7 +-
 include/uapi/linux/bpf.h                           |   3 -
 net/core/filter.c                                  |   4 +-
 net/ipv4/inet_hashtables.c                         |  68 ++++---
 net/ipv4/udp.c                                     |  88 ++++-----
 net/ipv6/inet6_hashtables.c                        |  71 +++++---
 net/ipv6/udp.c                                     |  98 ++++------
 tools/include/uapi/linux/bpf.h                     |   3 -
 tools/testing/selftests/bpf/network_helpers.c      |   3 +
 .../selftests/bpf/prog_tests/assign_reuse.c        | 197 +++++++++++++++++++++
 .../selftests/bpf/progs/test_assign_reuse.c        | 142 +++++++++++++++
 13 files changed, 660 insertions(+), 179 deletions(-)
---
base-commit: 6f5a630d7c57cd79b1f526a95e757311e32d41e5
change-id: 20230613-so-reuseport-e92c526173ee

Best regards,
--
Lorenz Bauer <lmb(a)isovalent.com>

[PATCH v5 0/4] iommufd: Add iommu hardware info reporting

by Yi Liu

iommufd gives userspace the capability to manipulate the iommu subsystem, e.g. DMA map/unmap. In the near future, it will support iommu nested translation. Different platform vendors have different implementations of nested translation. For example, Intel VT-d supports using the guest I/O page table as the stage-1 translation table. This requires the guest I/O page table to be compatible with the hardware IOMMU. So before setting up nested translation, userspace needs to know the hardware iommu information to understand the nested translation requirements.

This series reports the iommu hardware information for a given device which has been bound to iommufd. It is preparation work for userspace to allocate a hwpt for a given device, as needed by the nested translation support[1].

This series introduces an iommu op to report the iommu hardware info, and an ioctl IOMMU_GET_HW_INFO is added to report such hardware info to the user. enum iommu_hw_info_type is defined to differentiate the iommu hardware info reported to the user, so that the user can decode it. This series only adds the framework for iommu hw info reporting; the complete reporting path needs vendor-specific definitions and driver support. The full code is available in [1] as well.

[1] https://github.com/yiliu1765/iommufd/tree/wip/iommufd_nesting_08032023-yi
(only the hw_info report path is the latest, other parts are wip)

Change log:

v5:
 - Return hw_info_type in the .hw_info op, hence drop hw_info_type field in iommu_ops (Kevin)
 - Add Jason's r-b for patch 01
 - Address coding style comments from Jason and Kevin w.r.t. patch 02, 03 and 04

v4: https://lore.kernel.org/linux-iommu/20230724105936.107042-1-yi.l.liu@intel.…
 - Rename ioctl to IOMMU_GET_HW_INFO and structure to iommu_hw_info
 - Move the iommufd_get_hw_info handler to main.c
 - Place iommu_hw_info prior to iommu_hwpt_alloc
 - Update the function namings accordingly
 - Update uapi kdocs

v3: https://lore.kernel.org/linux-iommu/20230511143024.19542-1-yi.l.liu@intel.c…
 - Add r-b from Baolu
 - Rename IOMMU_HW_INFO_TYPE_DEFAULT to be IOMMU_HW_INFO_TYPE_NONE to better suit what it means
 - Let IOMMU_DEVICE_GET_HW_INFO succeed even if the underlying iommu driver does not have driver-specific data to report, per the remark below.
   https://lore.kernel.org/kvm/ZAcwJSK%2F9UVI9LXu@nvidia.com/

v2: https://lore.kernel.org/linux-iommu/20230309075358.571567-1-yi.l.liu@intel.…
 - Drop patch 05 of v1 as it is already covered by other series
 - Rename the capability info to be iommu hardware info

v1: https://lore.kernel.org/linux-iommu/20230209041642.9346-1-yi.l.liu@intel.co…

Regards,
Yi Liu

Lu Baolu (1):
  iommu: Add new iommu op to get iommu hardware information

Nicolin Chen (1):
  iommufd/selftest: Add coverage for IOMMU_GET_HW_INFO ioctl

Yi Liu (2):
  iommu: Move dev_iommu_ops() to private header
  iommufd: Add IOMMU_GET_HW_INFO

 drivers/iommu/iommu-priv.h                    | 11 +++
 drivers/iommu/iommufd/iommufd_test.h          |  9 +++
 drivers/iommu/iommufd/main.c                  | 79 +++++++++++++++++++
 drivers/iommu/iommufd/selftest.c              | 16 ++++
 include/linux/iommu.h                         | 20 +++--
 include/uapi/linux/iommufd.h                  | 44 +++++++++++
 tools/testing/selftests/iommu/iommufd.c       | 17 +++-
 tools/testing/selftests/iommu/iommufd_utils.h | 26 ++++++
 8 files changed, 210 insertions(+), 12 deletions(-)

--
2.34.1

[PATCH v2 0/7] Split a folio to any lower order folios

by Zi Yan

From: Zi Yan <ziy(a)nvidia.com>

Hi all,

File folio supports any order and people would like to support flexible orders for anonymous folio[1] too. Currently, split_huge_page() only splits a huge page to order-0 pages, but splitting to orders higher than 0 is also useful. This patchset adds support for splitting a huge page to any lower order pages and uses it during folio truncate operations.

The patchset is on top of mm-everything-2023-03-27-21-20.

Changelog from v1
===
1. Changed split_page_memcg() and split_page_owner() parameter to use order
2. Used folio_test_pmd_mappable() in place of the equivalent code

Details
===

* Patch 1 changes split_page_memcg() to use order instead of nr_pages
* Patch 2 changes split_page_owner() to use order instead of nr_pages
* Patch 3 and 4 add a new_order parameter to split_page_memcg() and split_page_owner() and prepare for upcoming changes.
* Patch 5 adds split_huge_page_to_list_to_order() to split a huge page to any lower order. The original split_huge_page_to_list() calls split_huge_page_to_list_to_order() with new_order = 0.
* Patch 6 uses split_huge_page_to_list_to_order() in large pagecache folio truncation instead of splitting the large folio all the way down to order-0.
* Patch 7 adds a test API to debugfs and test cases in split_huge_page_test selftests.

Comments and/or suggestions are welcome.

[1] https://lore.kernel.org/linux-mm/Y%2FblF0GIunm+pRIC@casper.infradead.org/

Zi Yan (7):
  mm/memcg: use order instead of nr in split_page_memcg()
  mm/page_owner: use order instead of nr in split_page_owner()
  mm: memcg: make memcg huge page split support any order split.
  mm: page_owner: add support for splitting to any order in split page_owner.
  mm: thp: split huge page to any lower order pages.
  mm: truncate: split huge page cache page to a non-zero order if possible.
  mm: huge_memory: enable debugfs to split huge pages to any order.

 include/linux/huge_mm.h                       |  10 +-
 include/linux/memcontrol.h                    |   4 +-
 include/linux/page_owner.h                    |  10 +-
 mm/huge_memory.c                              | 137 ++++++++---
 mm/memcontrol.c                               |  10 +-
 mm/page_alloc.c                               |   8 +-
 mm/page_owner.c                               |  10 +-
 mm/truncate.c                                 |  21 +-
 .../selftests/mm/split_huge_page_test.c       | 225 +++++++++++++++++-
 9 files changed, 366 insertions(+), 69 deletions(-)

--
2.39.2

[net-next 0/2] seg6: add NEXT-C-SID support for SRv6 End.X behavior

by Andrea Mayer

In the Segment Routing (SR) architecture a list of instructions, called segments, can be added to the packet headers to influence the forwarding and processing of the packets in an SR enabled network.

Considering the Segment Routing over IPv6 data plane (SRv6) [1], the segment identifiers (SIDs) are IPv6 addresses (128 bits) and the segment list (SID List) is carried in the Segment Routing Header (SRH). A segment may correspond to a "behavior" that is executed by a node when the packet is received. The Linux kernel currently supports a large subset of the behaviors described in [2] (e.g., End, End.X, End.T and so on).

In some SRv6 scenarios, the number of segments carried by the SID List may increase dramatically, reducing the MTU (Maximum Transmission Unit) size and/or limiting the processing power of legacy hardware devices (due to longer IPv6 headers).

The NEXT-C-SID mechanism [3] extends the SRv6 architecture by providing several ways to efficiently represent the SID List. By leveraging NEXT-C-SID, it is possible to encode several SRv6 segments within a single 128-bit SID address (also referred to as a Compressed SID Container). In this way, the length of the SID List can be drastically reduced.

The NEXT-C-SID mechanism is built upon the "flavors" framework defined in [2]. This framework is already supported by the Linux SRv6 subsystem and is used to modify and/or extend a subset of existing behaviors.

In this patchset, we extend the SRv6 End.X behavior in order to support the NEXT-C-SID mechanism.

In detail, the patchset is made of:
 - patch 1/2: add NEXT-C-SID support for SRv6 End.X behavior;
 - patch 2/2: add selftest for NEXT-C-SID in SRv6 End.X behavior.

From the user space perspective, we do not need to change the iproute2 code to support the NEXT-C-SID flavor for the SRv6 End.X behavior. However, we will update the man page considering the NEXT-C-SID flavor applied to the SRv6 End.X behavior in a separate patch.

Comments, improvements and suggestions are always appreciated.

Thank you all,
Andrea

[1] - https://datatracker.ietf.org/doc/html/rfc8754
[2] - https://datatracker.ietf.org/doc/html/rfc8986
[3] - https://datatracker.ietf.org/doc/html/draft-ietf-spring-srv6-srh-compression

Andrea Mayer (1):
  seg6: add NEXT-C-SID support for SRv6 End.X behavior

Paolo Lungaroni (1):
  selftests: seg6: add selftest for NEXT-C-SID flavor in SRv6 End.X behavior

 net/ipv6/seg6_local.c                         |  108 +-
 tools/testing/selftests/net/Makefile          |    1 +
 .../net/srv6_end_x_next_csid_l3vpn_test.sh    | 1213 +++++++++++++++++
 3 files changed, 1302 insertions(+), 20 deletions(-)
 create mode 100755 tools/testing/selftests/net/srv6_end_x_next_csid_l3vpn_test.sh

--
2.20.1

[PATCH 1/5] kselftest/arm64: add float-point feature to hwcap test

by Zeng Heng

Add the FP feature check in the set of hwcap tests.

Signed-off-by: Zeng Heng <zengheng4(a)huawei.com>
---
 tools/testing/selftests/arm64/abi/hwcap.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/tools/testing/selftests/arm64/abi/hwcap.c b/tools/testing/selftests/arm64/abi/hwcap.c
index 6a0adf916028..eaf9881c2e43 100644
--- a/tools/testing/selftests/arm64/abi/hwcap.c
+++ b/tools/testing/selftests/arm64/abi/hwcap.c
@@ -39,6 +39,12 @@ static void cssc_sigill(void)
 	asm volatile(".inst 0xdac01c00" : : : "x0");
 }
 
+static void fp_sigill(void)
+{
+	/* FMOV S0, #1 */
+	asm volatile(".inst 0x1e2e1000" : : : );
+}
+
 static void ilrcpc_sigill(void)
 {
 	/* LDAPUR W0, [SP, #8] */
@@ -235,6 +241,13 @@ static const struct hwcap_data {
 		.cpuinfo = "cssc",
 		.sigill_fn = cssc_sigill,
 	},
+	{
+		.name = "FP",
+		.at_hwcap = AT_HWCAP,
+		.hwcap_bit = HWCAP_FP,
+		.cpuinfo = "fp",
+		.sigill_fn = fp_sigill,
+	},
 	{
 		.name = "LRCPC",
 		.at_hwcap = AT_HWCAP,
--
2.25.1

[PATCH V1 0/6] AMD Pstate Preferred Core

by Meng Li

Hi all:

The core frequency is subject to process variation in semiconductors. Not all cores are able to reach the maximum frequency while respecting the infrastructure limits. Consequently, AMD has redefined the concept of maximum frequency of a part. This means that only a fraction of cores can reach maximum frequency. To find the best process scheduling policy for a given scenario, the OS needs to know the core ordering reported by the platform through the highest performance capability register of the CPPC interface.

Earlier implementations of AMD Pstate Preferred Core only supported a static core ranking and targeted performance. Now it has the ability to dynamically change the preferred core based on the workload and platform conditions, accounting for thermals and aging.

The AMD Pstate driver utilizes the functions and data structures provided by the ITMT architecture to enable the scheduler to favor scheduling on cores which can reach a higher frequency with lower voltage. We call it AMD Pstate Preferred Core.

Here sched_set_itmt_core_prio() is called to set priorities and sched_set_itmt_support() is called to enable the ITMT feature. The AMD Pstate driver uses the highest performance value to indicate the priority of a CPU. A higher value means a higher priority.

The AMD Pstate driver will provide an initial core ordering at boot time. It relies on the CPPC interface to communicate the core ranking to the operating system and scheduler, to make sure that the OS chooses the cores with the highest performance first when scheduling processes. When the AMD Pstate driver receives a message that the highest performance has changed, it will update the core ranking.

Meng Li (6):
  ACPI: CPPC: Add get the highest performance cppc control
  cpufreq: amd-pstate: Enable AMD Pstate Preferred Core Supporting.
  cpufreq: Add a notification message that the highest perf has changed
  cpufreq: amd-pstate: Update AMD Pstate Preferred Core ranking dynamically
  Documentation: amd-pstate: introduce AMD Pstate Preferred Core
  Documentation: introduce AMD Pstate Preferrd Core mode kernel command line options

 .../admin-guide/kernel-parameters.txt         |   5 +
 Documentation/admin-guide/pm/amd-pstate.rst   |  55 ++++++
 drivers/acpi/cppc_acpi.c                      |  13 ++
 drivers/acpi/processor_driver.c               |   6 +
 drivers/cpufreq/amd-pstate.c                  | 181 ++++++++++++++++--
 drivers/cpufreq/cpufreq.c                     |  13 ++
 include/acpi/cppc_acpi.h                      |   5 +
 include/linux/amd-pstate.h                    |   1 +
 include/linux/cpufreq.h                       |   4 +
 9 files changed, 267 insertions(+), 16 deletions(-)

--
2.34.1

[PATCH v5 0/3] kunit: Expose some built-in features to modules

by Janusz Krzysztofik

Submit the top-level headers also from the kunit test module notifier initialization callback, so external tools that are parsing dmesg for kunit test output are able to tell how many test suites should be expected and whether to continue parsing after complete output from the first test suite is collected.

Extend kunit module notifier initialization callback with a processing path for only listing the tests provided by a module if the kunit action parameter is set to "list", so external tools can obtain a list of test cases to be executed in advance and can do a better job of assigning kernel messages interleaved with kunit output to specific tests.

Use test filtering functions in kunit module notifier callback functions, so external tools are able to execute individual test cases from kunit test modules in order to better isolate their potential impact on kernel messages that appear interleaved with output from other tests.

v5: Fix new name of a structure moved to kunit namespace not updated in executor_test functions (lkp(a)intel.com),
  - refresh on top of attributes filtering fix.
v4: Use kunit_exec_run_tests() (Mauro, Rae), but prevent it from emitting the headers when called on load of non-test modules,
  - don't use a different list format, use kunit_exec_list_tests() (Rae),
  - refresh on top of newly introduced attributes patches, handle newly introduced kunit.action=list_attr case (Rae).
v3: Fix CONFIG_GLOB, required by filtering functions, not selected when building as a module (lkp(a)intel.com).
v2: Fix new name of a structure moved to kunit namespace not updated across all uses (lkp(a)intel.com).

Janusz Krzysztofik (3):
  kunit: Report the count of test suites in a module
  kunit: Make 'list' action available to kunit test modules
  kunit: Allow kunit test modules to use test filtering

 include/kunit/test.h      |  21 +++++++
 lib/kunit/Kconfig         |   2 +-
 lib/kunit/executor.c      | 115 ++++++++++++++++++++++----------------
 lib/kunit/executor_test.c |  36 ++++++++----
 lib/kunit/test.c          |  37 +++++++++++-
 5 files changed, 149 insertions(+), 62 deletions(-)

base-commit: 1c9fd080dffe5e5ad763527fbc2aa3f6f8c653e9
--
2.41.0

[PATCH v8 0/4] RISC-V: mm: Make SV48 the default address space

by Charlie Jenkins

Make sv48 the default address space for mmap as some applications
currently depend on this assumption. Users can now select a desired
address space using a non-zero hint address to mmap. Previously,
requesting the default address space from mmap by passing zero as the hint
address would result in using the largest address space possible. Some
applications, like Go and Java, depend on empty bits in the virtual
address space, so this patch provides more flexibility for application
developers.

-Charlie

---
v8:
- Fix RV32 and the RV32 compat mode of RV64
- Extract out addr and base from the mmap macros

v7:
- Changing RLIMIT_STACK inside of an executing program does not trigger
  arch_pick_mmap_layout(), so rewrite tests to change RLIMIT_STACK from a
  script before executing tests. RLIMIT_STACK of infinity forces bottomup
  mmap allocation.
- Make arch_get_mmap_base macro more readable by extracting out the rnd
  calculation.
- Use MMAP_MIN_VA_BITS in TASK_UNMAPPED_BASE to support case when mmap
  attempts to allocate address smaller than DEFAULT_MAP_WINDOW.
- Fix incorrect wording in documentation.

v6:
- Rebase onto the correct base

v5:
- Minor wording change in documentation
- Change some parentheses in arch_get_mmap_ macros
- Added case for addr==0 in arch_get_mmap_ because without this, programs
  would crash if RLIMIT_STACK was modified before executing the program.
  This was tested using the libhugetlbfs tests.

v4:
- Split testcases/document patch into test cases, in-code documentation,
  and formal documentation patches
- Modified the mmap_base macro to be more legible and better represent
  memory layout
- Fixed documentation to better reflect the implementation
- Renamed DEFAULT_VA_BITS to MMAP_VA_BITS
- Added additional test case for rlimit changes

---
Charlie Jenkins (4):
  RISC-V: mm: Restrict address space for sv39,sv48,sv57
  RISC-V: mm: Add tests for RISC-V mm
  RISC-V: mm: Update pgtable comment documentation
  RISC-V: mm: Document mmap changes

 Documentation/riscv/vm-layout.rst            | 22 +++++++
 arch/riscv/include/asm/elf.h                 |  2 +-
 arch/riscv/include/asm/pgtable.h             | 28 ++++++--
 arch/riscv/include/asm/processor.h           | 52 +++++++++++++--
 tools/testing/selftests/riscv/Makefile       |  2 +-
 tools/testing/selftests/riscv/mm/.gitignore  |  2 +
 tools/testing/selftests/riscv/mm/Makefile    | 15 +++++
 .../riscv/mm/testcases/mmap_bottomup.c       | 35 ++++++++++
 .../riscv/mm/testcases/mmap_default.c        | 35 ++++++++++
 .../selftests/riscv/mm/testcases/mmap_test.h | 64 +++++++++++++++++++
 .../selftests/riscv/mm/testcases/run_mmap.sh | 12 ++++
 11 files changed, 257 insertions(+), 12 deletions(-)
 create mode 100644 tools/testing/selftests/riscv/mm/.gitignore
 create mode 100644 tools/testing/selftests/riscv/mm/Makefile
 create mode 100644 tools/testing/selftests/riscv/mm/testcases/mmap_bottomup.c
 create mode 100644 tools/testing/selftests/riscv/mm/testcases/mmap_default.c
 create mode 100644 tools/testing/selftests/riscv/mm/testcases/mmap_test.h
 create mode 100755 tools/testing/selftests/riscv/mm/testcases/run_mmap.sh

--
2.41.0
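
As a rough illustration of the hint-address behaviour described in the cover letter above, a userspace program might request mappings along these lines. This is only a sketch: the helper names map_with_hint() and addr_bits() are hypothetical and not part of the series, and how the kernel interprets a given hint is defined by the patches themselves, not by this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not from the series): map `len` bytes of anonymous
 * memory, passing `hint` as the mmap() address hint. A zero hint requests
 * the default address space; per the cover letter, a non-zero hint lets the
 * caller select a different one.
 */
static void *map_with_hint(uintptr_t hint, size_t len)
{
	assert(len > 0);
	return mmap((void *)hint, len, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}

/*
 * Hypothetical helper: number of significant bits in an address, which a
 * test could compare against the expected width of the address space.
 */
static int addr_bits(const void *p)
{
	uintptr_t v = (uintptr_t)p;
	int bits = 0;

	while (v) {
		bits++;
		v >>= 1;
	}
	return bits;
}
```

A caller would check the result against MAP_FAILED and could then use addr_bits() on the returned pointer to see which range the kernel picked for a given hint.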

1 year, 11 months


[PATCH v5 00/19] selftests/resctrl: Fixes and cleanups

by Ilpo Järvinen

Here is a series with some fixes and cleanups to resctrl selftests.

v5:
- Improve changelogs
- Close fd_lm only in cat_val()
- Improve unmount error handling

v4:
- Move resctrlfs (unconditional) umount after resctrl fs support check

v3:
- Don't include rewritten CAT test into this series!
- Tweak wildcard style in Makefile
- Fix many changelog typos, remove some wrong claims, and generally
  improve them.
- Add fix to PARENT_EXIT() to unmount resctrl FS
- Add unmounting resctrl FS before starting any tests
- Add fix for buf leak
- Add fix for perf fd closing
- Split mount/remount/umount patches differently
- Use size_t and %zu for span
- Keep MBM print as MB, only internally use span in bytes
- Drop start_buf global from fill_buf

v2 (was sent with CAT test rewrite which is no longer included in v3):
- Rebased on top of next to solve the conflicts
- Added 2 patches related to resctrl FS mount/umount (fix + cleanup)
- Consistently use "alloc" in cache_alloc_size()
- CAT test error handling tweaked
- Remove a spurious newline change from the CAT patch
- Small improvements to changelogs

Ilpo Järvinen (19):
  selftests/resctrl: Add resctrl.h into build deps
  selftests/resctrl: Don't leak buffer in fill_cache()
  selftests/resctrl: Unmount resctrl FS if child fails to run benchmark
  selftests/resctrl: Close perf value read fd on errors
  selftests/resctrl: Unmount resctrl FS before starting the first test
  selftests/resctrl: Move resctrl FS mount/umount to higher level
  selftests/resctrl: Refactor remount_resctrl(bool mum_resctrlfs) to
    mount_resctrl()
  selftests/resctrl: Remove mum_resctrlfs from struct resctrl_val_param
  selftests/resctrl: Convert span to size_t
  selftests/resctrl: Express span internally in bytes
  selftests/resctrl: Remove duplicated preparation for span arg
  selftests/resctrl: Remove "malloc_and_init_memory" param from
    run_fill_buf()
  selftests/resctrl: Remove unnecessary startptr global from fill_buf
  selftests/resctrl: Improve parameter consistency in fill_buf
  selftests/resctrl: Don't pass test name to fill_buf
  selftests/resctrl: Don't use variable argument list for ->setup()
  selftests/resctrl: Move CAT/CMT test global vars to function they are
    used in
  selftests/resctrl: Pass the real number of tests to show_cache_info()
  selftests/resctrl: Remove test type checks from cat_val()

 tools/testing/selftests/resctrl/Makefile      |  2 +-
 tools/testing/selftests/resctrl/cache.c       | 66 +++++++-------
 tools/testing/selftests/resctrl/cat_test.c    | 28 ++----
 tools/testing/selftests/resctrl/cmt_test.c    | 29 ++-----
 tools/testing/selftests/resctrl/fill_buf.c    | 87 +++++++------------
 tools/testing/selftests/resctrl/mba_test.c    |  9 +-
 tools/testing/selftests/resctrl/mbm_test.c    | 17 ++--
 tools/testing/selftests/resctrl/resctrl.h     | 17 ++--
 .../testing/selftests/resctrl/resctrl_tests.c | 83 ++++++++++++------
 tools/testing/selftests/resctrl/resctrl_val.c |  7 +-
 tools/testing/selftests/resctrl/resctrlfs.c   | 64 +++++++-------
 11 files changed, 178 insertions(+), 231 deletions(-)

--
2.30.2

1 year, 11 months


[PATCH] kunit: Replace fixed-size log with dynamically-extending buffer

by Richard Fitzgerald

Re-implement the log buffer as a list of buffer fragments that can
be extended as the size of the log info grows.

When using parameterization the test case can run many times and create
a large amount of log. It's not really practical to keep increasing the
size of the fixed buffer every time a test needs more space. And a big
fixed buffer wastes memory.

The original char *log pointer is replaced by a pointer to a list of
struct kunit_log_frag, each containing a fixed-size buffer.

kunit_log_append() now attempts to append to the last kunit_log_frag in
the list. If there isn't enough space it will append a new kunit_log_frag
to the list. This simple implementation does not attempt to completely
fill the buffer in every kunit_log_frag.

The 'log' member of kunit_suite, kunit_case and kunit must be a pointer
because the API of kunit_log() requires that it is the same type in all
three structs. As kunit.log is a pointer to the 'log' of the current
kunit_case, it must be a pointer in the other two structs.

The existing kunit-test.c log tests have been updated for the new log
buffer, and a new kunit_log_extend_test() case has been added to test
extending the buffer.

Signed-off-by: Richard Fitzgerald <rf(a)opensource.cirrus.com>
---
 include/kunit/test.h   | 25 +++++++++----
 lib/kunit/debugfs.c    | 65 ++++++++++++++++++++++++++-------
 lib/kunit/kunit-test.c | 82 +++++++++++++++++++++++++++++++++++++-----
 lib/kunit/test.c       | 51 ++++++++++++++++----------
 4 files changed, 177 insertions(+), 46 deletions(-)

diff --git a/include/kunit/test.h b/include/kunit/test.h
index 011e0d6bb506..907b30401669 100644
--- a/include/kunit/test.h
+++ b/include/kunit/test.h
@@ -33,8 +33,8 @@ DECLARE_STATIC_KEY_FALSE(kunit_running);

 struct kunit;

-/* Size of log associated with test. */
-#define KUNIT_LOG_SIZE 2048
+/* Initial size of log associated with test. */
+#define KUNIT_DEFAULT_LOG_SIZE 500

 /* Maximum size of parameter description string. */
 #define KUNIT_PARAM_DESC_SIZE 128
@@ -85,6 +85,11 @@ struct kunit_attributes {
 	enum kunit_speed speed;
 };

+struct kunit_log_frag {
+	struct list_head list;
+	char buf[KUNIT_DEFAULT_LOG_SIZE];
+};
+
 /**
  * struct kunit_case - represents an individual test case.
  *
@@ -132,7 +137,7 @@ struct kunit_case {
 	/* private: internal use only. */
 	enum kunit_status status;
 	char *module_name;
-	char *log;
+	struct list_head *log;
 };

 static inline char *kunit_status_to_ok_not_ok(enum kunit_status status)
@@ -252,7 +257,7 @@ struct kunit_suite {
 	/* private: internal use only */
 	char status_comment[KUNIT_STATUS_COMMENT_SIZE];
 	struct dentry *debugfs;
-	char *log;
+	struct list_head *log;
 	int suite_init_err;
 };

@@ -272,7 +277,7 @@ struct kunit {

 	/* private: internal use only. */
 	const char *name; /* Read only after initialization! */
-	char *log; /* Points at case log after initialization */
+	struct list_head *log; /* Points at case log after initialization */
 	struct kunit_try_catch try_catch;
 	/* param_value is the current parameter value for a test case. */
 	const void *param_value;
@@ -304,7 +309,7 @@ static inline void kunit_set_failure(struct kunit *test)

 bool kunit_enabled(void);

-void kunit_init_test(struct kunit *test, const char *name, char *log);
+void kunit_init_test(struct kunit *test, const char *name, struct list_head *log);

 int kunit_run_tests(struct kunit_suite *suite);

@@ -317,6 +322,12 @@ int __kunit_test_suites_init(struct kunit_suite * const * const suites, int num_

 void __kunit_test_suites_exit(struct kunit_suite **suites, int num_suites);

+static inline void kunit_init_log_frag(struct kunit_log_frag *frag)
+{
+	INIT_LIST_HEAD(&frag->list);
+	frag->buf[0] = '\0';
+}
+
 #if IS_BUILTIN(CONFIG_KUNIT)
 int kunit_run_all_tests(void);
 #else
@@ -451,7 +462,7 @@ static inline void *kunit_kcalloc(struct kunit *test, size_t n, size_t size, gfp

 void kunit_cleanup(struct kunit *test);

-void __printf(2, 3) kunit_log_append(char *log, const char *fmt, ...);
+void __printf(2, 3) kunit_log_append(struct list_head *log, const char *fmt, ...);

 /**
  * kunit_mark_skipped() - Marks @test_or_suite as skipped
diff --git a/lib/kunit/debugfs.c b/lib/kunit/debugfs.c
index 22c5c496a68f..a26b6d31bd2f 100644
--- a/lib/kunit/debugfs.c
+++ b/lib/kunit/debugfs.c
@@ -5,6 +5,7 @@
  */

 #include <linux/debugfs.h>
+#include <linux/list.h>
 #include <linux/module.h>

 #include <kunit/test.h>
@@ -37,14 +38,15 @@ void kunit_debugfs_init(void)
 		debugfs_rootdir = debugfs_create_dir(KUNIT_DEBUGFS_ROOT, NULL);
 }

-static void debugfs_print_result(struct seq_file *seq,
-				 struct kunit_suite *suite,
-				 struct kunit_case *test_case)
+static void debugfs_print_log(struct seq_file *seq, const struct list_head *log)
 {
-	if (!test_case || !test_case->log)
+	struct kunit_log_frag *frag;
+
+	if (!log)
 		return;

-	seq_printf(seq, "%s", test_case->log);
+	list_for_each_entry(frag, log, list)
+		seq_puts(seq, frag->buf);
 }

 /*
@@ -69,10 +71,9 @@ static int debugfs_print_results(struct seq_file *seq, void *v)
 	seq_printf(seq, KUNIT_SUBTEST_INDENT "1..%zd\n", kunit_suite_num_test_cases(suite));

 	kunit_suite_for_each_test_case(suite, test_case)
-		debugfs_print_result(seq, suite, test_case);
+		debugfs_print_log(seq, test_case->log);

-	if (suite->log)
-		seq_printf(seq, "%s", suite->log);
+	debugfs_print_log(seq, suite->log);

 	seq_printf(seq, "%s %d %s\n",
 		   kunit_status_to_ok_not_ok(success), 1, suite->name);
@@ -100,14 +101,53 @@ static const struct file_operations debugfs_results_fops = {
 	.release = debugfs_release,
 };

+static struct list_head *kunit_debugfs_alloc_log(void)
+{
+	struct list_head *log;
+	struct kunit_log_frag *frag;
+
+	log = kzalloc(sizeof(*log), GFP_KERNEL);
+	if (!log)
+		return NULL;
+
+	INIT_LIST_HEAD(log);
+
+	frag = kmalloc(sizeof(*frag), GFP_KERNEL);
+	if (!frag) {
+		kfree(log);
+		return NULL;
+	}
+
+	kunit_init_log_frag(frag);
+	list_add_tail(&frag->list, log);
+
+	return log;
+}
+
+static void kunit_debugfs_free_log(struct list_head *log)
+{
+	struct kunit_log_frag *frag, *n;
+
+	if (!log)
+		return;
+
+	list_for_each_entry_safe(frag, n, log, list) {
+		list_del(&frag->list);
+		kfree(frag);
+	}
+
+	kfree(log);
+}
+
 void kunit_debugfs_create_suite(struct kunit_suite *suite)
 {
 	struct kunit_case *test_case;

 	/* Allocate logs before creating debugfs representation. */
-	suite->log = kzalloc(KUNIT_LOG_SIZE, GFP_KERNEL);
+	suite->log = kunit_debugfs_alloc_log();
+
 	kunit_suite_for_each_test_case(suite, test_case)
-		test_case->log = kzalloc(KUNIT_LOG_SIZE, GFP_KERNEL);
+		test_case->log = kunit_debugfs_alloc_log();

 	suite->debugfs = debugfs_create_dir(suite->name, debugfs_rootdir);

@@ -121,7 +161,8 @@ void kunit_debugfs_destroy_suite(struct kunit_suite *suite)
 	struct kunit_case *test_case;

 	debugfs_remove_recursive(suite->debugfs);
-	kfree(suite->log);
+	kunit_debugfs_free_log(suite->log);
+
 	kunit_suite_for_each_test_case(suite, test_case)
-		kfree(test_case->log);
+		kunit_debugfs_free_log(test_case->log);
 }
diff --git a/lib/kunit/kunit-test.c b/lib/kunit/kunit-test.c
index 83d8e90ca7a2..abf659c6ded4 100644
--- a/lib/kunit/kunit-test.c
+++ b/lib/kunit/kunit-test.c
@@ -533,9 +533,16 @@ static struct kunit_suite kunit_resource_test_suite = {
 static void kunit_log_test(struct kunit *test)
 {
 	struct kunit_suite suite;
+	struct kunit_log_frag *frag;

-	suite.log = kunit_kzalloc(test, KUNIT_LOG_SIZE, GFP_KERNEL);
+	suite.log = kunit_kzalloc(test, sizeof(*suite.log), GFP_KERNEL);
 	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, suite.log);
+	INIT_LIST_HEAD(suite.log);
+	frag = kunit_kmalloc(test, sizeof(*frag), GFP_KERNEL);
+	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, frag);
+	kunit_init_log_frag(frag);
+	KUNIT_ASSERT_EQ(test, frag->buf[0], '\0');
+	list_add_tail(&frag->list, suite.log);

 	kunit_log(KERN_INFO, test, "put this in log.");
 	kunit_log(KERN_INFO, test, "this too.");
@@ -543,14 +550,17 @@ static void kunit_log_test(struct kunit *test)
 	kunit_log(KERN_INFO, &suite, "along with this.");

 #ifdef CONFIG_KUNIT_DEBUGFS
+	frag = list_first_entry(test->log, struct kunit_log_frag, list);
 	KUNIT_EXPECT_NOT_ERR_OR_NULL(test,
-				     strstr(test->log, "put this in log."));
+				     strstr(frag->buf, "put this in log."));
 	KUNIT_EXPECT_NOT_ERR_OR_NULL(test,
-				     strstr(test->log, "this too."));
+				     strstr(frag->buf, "this too."));
+
+	frag = list_first_entry(suite.log, struct kunit_log_frag, list);
 	KUNIT_EXPECT_NOT_ERR_OR_NULL(test,
-				     strstr(suite.log, "add to suite log."));
+				     strstr(frag->buf, "add to suite log."));
 	KUNIT_EXPECT_NOT_ERR_OR_NULL(test,
-				     strstr(suite.log, "along with this."));
+				     strstr(frag->buf, "along with this."));
 #else
 	KUNIT_EXPECT_NULL(test, test->log);
 #endif
@@ -558,19 +568,75 @@ static void kunit_log_test(struct kunit *test)

 static void kunit_log_newline_test(struct kunit *test)
 {
+	struct kunit_log_frag *frag;
+
 	kunit_info(test, "Add newline\n");
 	if (test->log) {
-		KUNIT_ASSERT_NOT_NULL_MSG(test, strstr(test->log, "Add newline\n"),
-			"Missing log line, full log:\n%s", test->log);
-		KUNIT_EXPECT_NULL(test, strstr(test->log, "Add newline\n\n"));
+		frag = list_first_entry(test->log, struct kunit_log_frag, list);
+		KUNIT_ASSERT_NOT_NULL_MSG(test, strstr(frag->buf, "Add newline\n"),
+			"Missing log line, full log:\n%s", frag->buf);
+		KUNIT_EXPECT_NULL(test, strstr(frag->buf, "Add newline\n\n"));
 	} else {
 		kunit_skip(test, "only useful when debugfs is enabled");
 	}
 }

+static void kunit_log_extend_test(struct kunit *test)
+{
+#ifdef CONFIG_KUNIT_DEBUGFS
+	struct kunit_suite suite;
+	struct kunit_log_frag *frag;
+	char *p, *pn;
+	size_t len;
+	int i, q;
+
+	suite.log = kunit_kzalloc(test, sizeof(*suite.log), GFP_KERNEL);
+	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, suite.log);
+	INIT_LIST_HEAD(suite.log);
+	frag = kunit_kmalloc(test, sizeof(*frag), GFP_KERNEL);
+	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, frag);
+	kunit_init_log_frag(frag);
+	KUNIT_ASSERT_EQ(test, frag->buf[0], '\0');
+	list_add_tail(&frag->list, suite.log);
+
+	for (i = 0; i < KUNIT_DEFAULT_LOG_SIZE; ++i) {
+		kunit_log(KERN_INFO, &suite,
+			  "The quick brown fox jumps over the lazy penguin %d\n", i);
+	}
+
+	/* There must be more than one buffer fragment now */
+	KUNIT_ASSERT_FALSE(test, list_is_singular(suite.log));
+
+	/* Copy all the data into a contiguous string for easier parsing */
+	len = 0;
+	list_for_each_entry(frag, suite.log, list)
+		len += strlen(frag->buf);
+
+	p = kunit_kmalloc(test, len + 1, GFP_KERNEL);
+	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, p);
+
+	list_for_each_entry(frag, suite.log, list)
+		strlcat(p, frag->buf, len);
+
+	i = 0;
+	while ((pn = strchr(p, '\n')) != NULL) {
+		*pn = '\0';
+		KUNIT_ASSERT_EQ(test, sscanf(p,
+					     "The quick brown fox jumps over the lazy penguin %d\n",
+					     &q), 1);
+		KUNIT_ASSERT_EQ(test, q, i);
+		p = pn + 1;
+		++i;
+	}
+#else
+	kunit_skip(test, "only useful when debugfs is enabled");
+#endif
+}
+
 static struct kunit_case kunit_log_test_cases[] = {
 	KUNIT_CASE(kunit_log_test),
 	KUNIT_CASE(kunit_log_newline_test),
+	KUNIT_CASE(kunit_log_extend_test),
 	{}
 };

diff --git a/lib/kunit/test.c b/lib/kunit/test.c
index cb9797fa6303..f90e395b3656 100644
--- a/lib/kunit/test.c
+++ b/lib/kunit/test.c
@@ -11,6 +11,7 @@
 #include <kunit/test-bug.h>
 #include <kunit/attributes.h>
 #include <linux/kernel.h>
+#include <linux/list.h>
 #include <linux/module.h>
 #include <linux/moduleparam.h>
 #include <linux/panic.h>
@@ -114,46 +115,54 @@ static void kunit_print_test_stats(struct kunit *test,
  * already present.
  * @log: The log to add the newline to.
  */
-static void kunit_log_newline(char *log)
+static void kunit_log_newline(struct kunit_log_frag *frag)
 {
 	int log_len, len_left;

-	log_len = strlen(log);
-	len_left = KUNIT_LOG_SIZE - log_len - 1;
+	log_len = strlen(frag->buf);
+	len_left = sizeof(frag->buf) - log_len - 1;

-	if (log_len > 0 && log[log_len - 1] != '\n')
-		strncat(log, "\n", len_left);
+	if (log_len > 0 && frag->buf[log_len - 1] != '\n')
+		strncat(frag->buf, "\n", len_left);
 }

-/*
- * Append formatted message to log, size of which is limited to
- * KUNIT_LOG_SIZE bytes (including null terminating byte).
- */
-void kunit_log_append(char *log, const char *fmt, ...)
+/* Append formatted message to log, extending the log buffer if necessary. */
+void kunit_log_append(struct list_head *log, const char *fmt, ...)
 {
 	va_list args;
+	struct kunit_log_frag *frag;
 	int len, log_len, len_left;

 	if (!log)
 		return;

-	log_len = strlen(log);
-	len_left = KUNIT_LOG_SIZE - log_len - 1;
-	if (len_left <= 0)
-		return;
+	frag = list_last_entry(log, struct kunit_log_frag, list);
+	log_len = strlen(frag->buf);
+	len_left = sizeof(frag->buf) - log_len - 1;

 	/* Evaluate length of line to add to log */
 	va_start(args, fmt);
 	len = vsnprintf(NULL, 0, fmt, args) + 1;
 	va_end(args);

+	if (len > len_left) {
+		frag = kmalloc(sizeof(*frag), GFP_KERNEL);
+		if (!frag)
+			return;
+
+		kunit_init_log_frag(frag);
+		list_add_tail(&frag->list, log);
+		len_left = sizeof(frag->buf) - 1;
+		log_len = 0;
+	}
+
 	/* Print formatted line to the log */
 	va_start(args, fmt);
-	vsnprintf(log + log_len, min(len, len_left), fmt, args);
+	vsnprintf(frag->buf + log_len, min(len, len_left), fmt, args);
 	va_end(args);

 	/* Add newline to end of log if not already present. */
-	kunit_log_newline(log);
+	kunit_log_newline(frag);
 }
 EXPORT_SYMBOL_GPL(kunit_log_append);

@@ -359,14 +368,18 @@ void __kunit_do_failed_assertion(struct kunit *test,
 }
 EXPORT_SYMBOL_GPL(__kunit_do_failed_assertion);

-void kunit_init_test(struct kunit *test, const char *name, char *log)
+void kunit_init_test(struct kunit *test, const char *name, struct list_head *log)
 {
 	spin_lock_init(&test->lock);
 	INIT_LIST_HEAD(&test->resources);
 	test->name = name;
 	test->log = log;
-	if (test->log)
-		test->log[0] = '\0';
+	if (test->log) {
+		struct kunit_log_frag *frag = list_first_entry(test->log,
+							       struct kunit_log_frag,
+							       list);
+		frag->buf[0] = '\0';
+	}
 	test->status = KUNIT_SUCCESS;
 	test->status_comment[0] = '\0';
 }
--
2.30.2
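
To make the append-or-extend logic in kunit_log_append() easier to follow outside the kernel, here is a minimal userspace sketch of the same idea. It is an illustration under stated assumptions, not the patch's code: it uses a hand-rolled singly linked list instead of struct list_head, a deliberately tiny FRAG_SIZE, malloc() instead of kmalloc(), and it omits the trailing-newline handling. All names (struct log, log_append() and so on) are hypothetical.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define FRAG_SIZE 32	/* tiny on purpose; the patch uses 500-byte fragments */

struct log_frag {
	struct log_frag *next;
	char buf[FRAG_SIZE];
};

struct log {
	struct log_frag *head, *tail;
};

static struct log_frag *frag_new(void)
{
	struct log_frag *f = malloc(sizeof(*f));

	if (f) {
		f->next = NULL;
		f->buf[0] = '\0';
	}
	return f;
}

/* Start the log with one empty fragment. Returns 0 on success. */
static int log_init(struct log *log)
{
	log->head = log->tail = frag_new();
	return log->head ? 0 : -1;
}

/* Append to the last fragment, or to a fresh one if the text won't fit. */
static void log_append(struct log *log, const char *fmt, ...)
{
	struct log_frag *frag = log->tail;
	size_t log_len = strlen(frag->buf);
	size_t len_left = sizeof(frag->buf) - log_len - 1;
	size_t len;
	va_list args;

	/* Length of the formatted text, including the terminating NUL. */
	va_start(args, fmt);
	len = (size_t)vsnprintf(NULL, 0, fmt, args) + 1;
	va_end(args);

	if (len > len_left) {
		frag = frag_new();
		if (!frag)
			return;
		log->tail->next = frag;
		log->tail = frag;
		len_left = sizeof(frag->buf) - 1;
		log_len = 0;
	}

	/* Text longer than a whole fragment is truncated, as in the patch. */
	va_start(args, fmt);
	vsnprintf(frag->buf + log_len, len < len_left ? len : len_left, fmt, args);
	va_end(args);
}

static int log_frag_count(const struct log *log)
{
	int n = 0;

	for (const struct log_frag *f = log->head; f; f = f->next)
		n++;
	return n;
}

static void log_free(struct log *log)
{
	struct log_frag *f = log->head, *next;

	for (; f; f = next) {
		next = f->next;
		free(f);
	}
	log->head = log->tail = NULL;
}
```

As in the patch, a message that does not fit in the remaining space starts a new fragment rather than being split, so fragments are not necessarily filled completely.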

1 year, 11 months


[PATCH] selftests/harness: Actually report SKIP for signal tests

by Kees Cook

Tests that were expecting a signal were not correctly checking for a
SKIP condition. Move the check before the signal checking when
processing test result.

Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Andy Lutomirski <luto(a)amacapital.net>
Cc: Will Drewry <wad(a)chromium.org>
Cc: linux-kselftest(a)vger.kernel.org
Fixes: 9847d24af95c ("selftests/harness: Refactor XFAIL into SKIP")
Signed-off-by: Kees Cook <keescook(a)chromium.org>
---
 tools/testing/selftests/kselftest_harness.h | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/kselftest_harness.h b/tools/testing/selftests/kselftest_harness.h
index 5fd49ad0c696..e05ac8261046 100644
--- a/tools/testing/selftests/kselftest_harness.h
+++ b/tools/testing/selftests/kselftest_harness.h
@@ -938,7 +938,11 @@ void __wait_for_test(struct __test_metadata *t)
 		fprintf(TH_LOG_STREAM,
 			"# %s: Test terminated by timeout\n", t->name);
 	} else if (WIFEXITED(status)) {
-		if (t->termsig != -1) {
+		if (WEXITSTATUS(status) == 255) {
+			/* SKIP */
+			t->passed = 1;
+			t->skip = 1;
+		} else if (t->termsig != -1) {
 			t->passed = 0;
 			fprintf(TH_LOG_STREAM,
 				"# %s: Test exited normally instead of by signal (code: %d)\n",
@@ -950,11 +954,6 @@ void __wait_for_test(struct __test_metadata *t)
 			case 0:
 				t->passed = 1;
 				break;
-			/* SKIP */
-			case 255:
-				t->passed = 1;
-				t->skip = 1;
-				break;
 			/* Other failure, assume step report. */
 			default:
 				t->passed = 0;
--
2.34.1
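
The effect of the reordering can be sketched in isolation. The snippet below is an assumption-laden illustration rather than harness code: classify_exit(), enum outcome and make_exit_status() are hypothetical names, only the WIFEXITED branch is modelled, and make_exit_status() relies on the Linux encoding of a normal exit in a wait status.

```c
#include <assert.h>
#include <sys/wait.h>

enum outcome { OUTCOME_PASS, OUTCOME_FAIL, OUTCOME_SKIP };

#define SKIP_EXIT_CODE 255	/* exit code the patch treats as SKIP */

/* Build a wait status for a normal exit (Linux encoding, for illustration). */
static int make_exit_status(int code)
{
	return (code & 0xff) << 8;
}

/*
 * Classify a child that exited normally. `termsig` is the signal the test
 * expected to die from, or -1 if none. The SKIP check comes first, so a test
 * that expected a signal but skipped is no longer reported as a failure.
 */
static enum outcome classify_exit(int status, int termsig)
{
	assert(WIFEXITED(status));	/* signal and timeout paths not modelled */

	if (WEXITSTATUS(status) == SKIP_EXIT_CODE)
		return OUTCOME_SKIP;
	if (termsig != -1)
		return OUTCOME_FAIL;	/* exited normally instead of by signal */
	return WEXITSTATUS(status) == 0 ? OUTCOME_PASS : OUTCOME_FAIL;
}
```

With the old ordering, the `termsig != -1` test ran first, so the second case in a call such as classify_exit(make_exit_status(255), 11) would have been reported as a failure rather than a skip.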

1 year, 11 months


[PATCH v3 0/7] smaps / mm/gup: fix gup_can_follow_protnone fallout

by David Hildenbrand

This is against mm/mm-unstable, but everything except patch #6 and #7
should apply on current master. Especially patch #1 and #2 should go
upstream first, so we can let the other stuff mature a bit longer.

Handle the fallout of 474098edac26 ("mm/gup: replace FOLL_NUMA by
gup_can_follow_protnone()") where I accidentally missed that
follow_page() and smaps implicitly kept the FOLL_NUMA flag clear by
not setting it if FOLL_FORCE is absent, to not trigger faults on
PROT_NONE-mapped PTEs.

Patch #1 fixes the known issues by reintroducing FOLL_NUMA as
FOLL_HONOR_NUMA_FAULT and decoupling it from FOLL_FORCE.

Patch #2 is a cleanup that I think actually fixes some corner cases, so
I added a Fixes: tag.

Patch #3 makes KVM explicitly set FOLL_HONOR_NUMA_FAULT in the single
case where it is required, and documents the situation.

Patch #4 then stops implicitly setting FOLL_HONOR_NUMA_FAULT. But note that
for FOLL_WRITE we always implicitly honor NUMA hinting faults.

Patch #5 cleans up a comment.

Patch #6 improves the KSM functional tests such that patch #7 can
actually check for one of the known issues: KSM no longer working on
PROT_NONE mappings on x86-64 with CONFIG_NUMA_BALANCING.

v2 -> v3:
* "mm/gup: reintroduce FOLL_NUMA as FOLL_HONOR_NUMA_FAULT"
 -> Squash one comment removal
 -> Adjust the KSM comment
* smaps: use vm_normal_page_pmd() instead of follow_trans_huge_pmd()
 -> Move follow_trans_huge_pmd() to mm/internal.h

Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: liubo <liubo254(a)huawei.com>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>

David Hildenbrand (7):
  mm/gup: reintroduce FOLL_NUMA as FOLL_HONOR_NUMA_FAULT
  smaps: use vm_normal_page_pmd() instead of follow_trans_huge_pmd()
  kvm: explicitly set FOLL_HONOR_NUMA_FAULT in hva_to_pfn_slow()
  mm/gup: don't implicitly set FOLL_HONOR_NUMA_FAULT
  pgtable: improve pte_protnone() comment
  selftest/mm: ksm_functional_tests: test in mmap_and_merge_range() if
    anything got merged
  selftest/mm: ksm_functional_tests: Add PROT_NONE test

 fs/proc/task_mmu.c                      |   3 +-
 include/linux/huge_mm.h                 |   3 -
 include/linux/mm.h                      |  21 +++-
 include/linux/mm_types.h                |   9 ++
 include/linux/pgtable.h                 |  16 ++-
 mm/gup.c                                |  23 +++-
 mm/huge_memory.c                        |   3 +-
 mm/internal.h                           |   7 ++
 .../selftests/mm/ksm_functional_tests.c | 106 ++++++++++++++++--
 virt/kvm/kvm_main.c                     |  13 ++-
 10 files changed, 171 insertions(+), 33 deletions(-)

--
2.41.0

1 year, 11 months


[PATCH v3] tools/nolibc: fix up size inflate regression

by Zhangjin Wu

As reported and suggested by Willy, the inline __sysret() helper
introduces three types of conversions and increases the size:

(1) the "unsigned long" argument to __sysret() forces a sign extension
from all sys_* functions that used to return 'int'

(2) the comparison with the error range now has to be performed on an
'unsigned long' instead of an 'int'

(3) the return value from __sysret() is a 'long' (note, a signed long)
which then has to be turned back to an 'int' before being returned by the
caller to satisfy the caller's prototype.

To fix this up, firstly, let's use a macro instead of an inline function
to preserve the input type and avoid the useless conversions (1) and (3).

Secondly, the comparison to -MAX_ERRNO is inflicted on all integer returns
where we could previously keep a simple sign comparison. Let's use a new
is_signed_type() macro from include/linux/compiler.h to limit the
comparison to -MAX_ERRNO (2) to the cases that need it and preserve a
simple sign comparison for most of the cases, as before.

Thirdly, fix up the following warning with an explicit conversion and let
__sysret() accept a (void *) argument and return a value of the same
(void *) type:

    sysroot/powerpc/include/sys.h: In function 'sbrk':
    sysroot/powerpc/include/sys.h:104:16: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
      104 |         return (void *)__sysret(-ENOMEM);

Fourthly, to further work around an argument type with 'const', we must
use __auto_type for new enough gcc versions and use 'long' for the old
gcc versions, as before.

Here are the size testing results with nolibc-test:

before:

    // ppc64le
    $ size nolibc-test
       text    data     bss     dec     hex filename
      27916       8      80   28004    6d64 nolibc-test

    // mips
    $ size nolibc-test
       text    data     bss     dec     hex filename
      23276      64      64   23404    5b6c nolibc-test

after:

    // ppc64le
    $ size nolibc-test
       text    data     bss     dec     hex filename
      27736       8      80   27824    6cb0 nolibc-test

    // mips
    $ size nolibc-test
       text    data     bss     dec     hex filename
      23036      64      64   23164    5a7c nolibc-test

Suggested-by: Willy Tarreau <w(a)1wt.eu>
Link: https://lore.kernel.org/lkml/20230806095846.GB10627@1wt.eu/
Link: https://lore.kernel.org/lkml/20230806134348.GA19145@1wt.eu/
Signed-off-by: Zhangjin Wu <falcon(a)tinylab.org>
---

Hi, Willy

To increase readability, v3 further defines a
__GXX_HAS_AUTO_TYPE_WITH_CONST_SUPPORT macro for gcc >= 11.0
(ABI_VERSION >= 1016), which has __auto_type with 'const' support.

When this macro is defined, a __sysret version with __auto_type is
provided; otherwise, a fixed 'long' type is used as a fallback.

Tested for all of the nolibc supported architectures with Arnd's 13.2.0
toolchains, and also for x86_64 with gcc-4.8 and gcc-9: no compile
failures, no compile warnings, no running failures.

Changes from v2 --> v3:

* define a __GXX_HAS_AUTO_TYPE_WITH_CONST_SUPPORT for gcc >= 11.0
  (ABI_VERSION >= 1016)
* split __sysret() to two versions by the macro instead of a mixed
  unified and unreadable version
* use shorter __ret instead of __sysret_arg

Changes from v1 --> v2:

* fix up argument with 'const' in the type
* support "void *" argument

v2: https://lore.kernel.org/lkml/95fe3e732f455fab653fe1427118d905e4d04257.16913…
v1: https://lore.kernel.org/lkml/20230806131921.52453-1-falcon@tinylab.org/

---
 tools/include/nolibc/sys.h | 66 +++++++++++++++++++++++++++++++-------
 1 file changed, 55 insertions(+), 11 deletions(-)

diff --git a/tools/include/nolibc/sys.h b/tools/include/nolibc/sys.h
index 56f63eb48a1b..b137f7771db9 100644
--- a/tools/include/nolibc/sys.h
+++ b/tools/include/nolibc/sys.h
@@ -35,15 +35,59 @@
  * (src/internal/syscall_ret.c) and glibc (sysdeps/unix/sysv/linux/sysdep.h)
  */

-static __inline__ __attribute__((unused, always_inline))
-long __sysret(unsigned long ret)
-{
-	if (ret >= (unsigned long)-MAX_ERRNO) {
-		SET_ERRNO(-(long)ret);
-		return -1;
-	}
-	return ret;
-}
+/*
+ * Whether 'type' is a signed type or an unsigned type. Supports scalar types,
+ * bool and also pointer types. (from include/linux/compiler.h)
+ */
+#define __is_signed_type(type) (((type)(-1)) < (type)1)
+
+/* __auto_type is used instead of __typeof__ to workaround the build error
+ * 'error: assignment of read-only variable' when the argument has 'const' in
+ * the type, but __auto_type is a new feature from newer gcc version and it
+ * only works with 'const' from gcc 11.0 (__GXX_ABI_VERSION = 1016)
+ * https://gcc.gnu.org/legacy-ml/gcc-patches/2013-11/msg01378.html
+ */
+
+#if __GXX_ABI_VERSION >= 1016
+#define __GXX_HAS_AUTO_TYPE_WITH_CONST_SUPPORT
+#endif
+
+#ifdef __GXX_HAS_AUTO_TYPE_WITH_CONST_SUPPORT
+#define __sysret(arg)                                                    \
+({                                                                       \
+	__auto_type __ret = (arg);                                       \
+	if (__is_signed_type(__typeof__(arg))) {                         \
+		if (__ret < 0) {                                         \
+			SET_ERRNO(-(long)__ret);                         \
+			__ret = (__typeof__(arg))(-1L);                  \
+		}                                                        \
+	} else {                                                         \
+		if ((unsigned long)__ret >= (unsigned long)-MAX_ERRNO) { \
+			SET_ERRNO(-(long)__ret);                         \
+			__ret = (__typeof__(arg))(-1L);                  \
+		}                                                        \
+	}                                                                \
+	__ret;                                                           \
+})
+
+#else  /* ! __GXX_HAS_AUTO_TYPE_WITH_CONST_SUPPORT */
+#define __sysret(arg)                                                    \
+({                                                                       \
+	long __ret = (long)(arg);                                        \
+	if (__is_signed_type(__typeof__(arg))) {                         \
+		if (__ret < 0) {                                         \
+			SET_ERRNO(-__ret);                               \
+			__ret = -1L;                                     \
+		}                                                        \
+	} else {                                                         \
+		if ((unsigned long)__ret >= (unsigned long)-MAX_ERRNO) { \
+			SET_ERRNO(-__ret);                               \
+			__ret = -1L;                                     \
+		}                                                        \
+	}                                                                \
+	(__typeof__(arg))__ret;                                          \
+})
+#endif /* ! __GXX_HAS_AUTO_TYPE_WITH_CONST_SUPPORT */

 /* Functions in this file only describe syscalls. They're declared static so
  * that the compiler usually decides to inline them while still being allowed
@@ -94,7 +138,7 @@ void *sbrk(intptr_t inc)
 	if (ret && sys_brk(ret + inc) == ret + inc)
 		return ret + inc;

-	return (void *)__sysret(-ENOMEM);
+	return __sysret((void *)-ENOMEM);
 }


@@ -682,7 +726,7 @@ void *sys_mmap(void *addr, size_t length, int prot, int flags, int fd,
 static __attribute__((unused))
 void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset)
 {
-	return (void *)__sysret((unsigned long)sys_mmap(addr, length, prot, flags, fd, offset));
+	return __sysret(sys_mmap(addr, length, prot, flags, fd, offset));
 }

 static __attribute__((unused))
--
2.25.1
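
The type-preserving behaviour of the macro can be tried in an ordinary userspace program. The following is a sketch modelled on the fallback ('long') variant above, under these assumptions: it sets the libc errno directly instead of using nolibc's SET_ERRNO(), defines MAX_ERRNO locally as 4095, relies on the GNU C statement-expression and __typeof__ extensions, and uses the hypothetical names is_signed_type() and sysret_sketch().

```c
#include <assert.h>
#include <errno.h>

#define MAX_ERRNO 4095	/* assumed to match the kernel's error range */

/* True for signed integer types; false for unsigned types and pointers. */
#define is_signed_type(type) (((type)(-1)) < (type)1)

/*
 * Convert a raw syscall-style return value into the libc convention:
 * on error set errno and yield -1, otherwise pass the value through.
 * The result keeps the type of `arg`, so no caller-side cast is needed.
 */
#define sysret_sketch(arg)						\
({									\
	long ret_ = (long)(arg);					\
	if (is_signed_type(__typeof__(arg))) {				\
		/* signed: a plain sign check is enough */		\
		if (ret_ < 0) {						\
			errno = (int)-ret_;				\
			ret_ = -1L;					\
		}							\
	} else {							\
		/* unsigned or pointer: check the -MAX_ERRNO range */	\
		if ((unsigned long)ret_ >= (unsigned long)-MAX_ERRNO) {	\
			errno = (int)-ret_;				\
			ret_ = -1L;					\
		}							\
	}								\
	(__typeof__(arg))ret_;						\
})
```

For an 'int' argument only the sign comparison is compiled in, while a pointer argument such as the mmap() return value takes the range comparison, which is the split the commit message describes.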

1 year, 11 months


[PATCH] tools/nolibc: silence ppc64 compile warnings

by Zhangjin Wu

Silence the following warnings reported by the new -Wall -Wextra options
with pure assembly code.

    In file included from sysroot/powerpc/include/stdio.h:13,
                     from nolibc-test.c:13:
    sysroot/powerpc/include/arch.h: In function '_start':
    sysroot/powerpc/include/arch.h:192:32: warning: unused variable 'r2' [-Wunused-variable]
      192 |         register volatile long r2 __asm__ ("r2") = (void *)&TOC - (void *)_start;
          |                                ^~
    sysroot/powerpc/include/arch.h:187:97: warning: optimization may eliminate reads and/or writes to register variables [-Wvolatile-register-var]
      187 | void __attribute__((weak, noreturn, optimize("Os", "omit-frame-pointer"))) __no_stack_protector _start(void)
          |                                                                                                 ^~~~~~

Since only the elfv2 ABI requires saving the TOC/GOT pointer to the r2
register, when using the elfv1 ABI the old C code is simply ignored by the
compiler, but the compiler cannot ignore the inline assembly code and
will introduce build failures or runtime segfaults.

So, let's only add the new assembly code for the elfv2 ABI, by checking
for _CALL_ELF == 2.

Link: https://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi.pdf
Link: https://www.llvm.org/devmtg/2014-04/PDFs/Talks/Euro-LLVM-2014-Weigand.pdf
Signed-off-by: Zhangjin Wu <falcon(a)tinylab.org>
---

Hi, Willy

When rebasing on the latest 20230806-for-6.6-1 branch, -Wall -Wextra
reported the above warnings.

This patch uses volatile inline assembly code instead of C code to silence
the unused-variable and optimization warnings.

And since only elfv2 requires saving the TOC pointer to the r2 register,
it only adds the assembly code for elfv2.

BR,
Zhangjin

---
 tools/include/nolibc/arch-powerpc.h | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/tools/include/nolibc/arch-powerpc.h b/tools/include/nolibc/arch-powerpc.h
index 76c3784f9dc7..ac212e6185b2 100644
--- a/tools/include/nolibc/arch-powerpc.h
+++ b/tools/include/nolibc/arch-powerpc.h
@@ -187,9 +187,17 @@
 void __attribute__((weak, noreturn, optimize("Os", "omit-frame-pointer"))) __no_stack_protector _start(void)
 {
 #ifdef __powerpc64__
-	/* On 64-bit PowerPC, save TOC/GOT pointer to r2 */
-	extern char TOC __asm__ (".TOC.");
-	register volatile long r2 __asm__ ("r2") = (void *)&TOC - (void *)_start;
+#if _CALL_ELF == 2
+	/* with -mabi=elfv2, save TOC/GOT pointer to r2
+	 * r12 is global entry pointer, we use it to compute TOC from r12
+	 * https://www.llvm.org/devmtg/2014-04/PDFs/Talks/Euro-LLVM-2014-Weigand.pdf
+	 * https://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi.pdf
+	 */
+	__asm__ volatile (
+		"addis  2, 12, .TOC. - _start@ha\n"
+		"addi   2,  2, .TOC. - _start@l\n"
+	);
+#endif /* _CALL_ELF == 2 */

 	__asm__ volatile (
 		"mr     3, 1\n"         /* save stack pointer to r3, as arg1 of _start_c */
--
2.25.1

1 year, 11 months


[PATCH v4 00/19] selftests/resctrl: Fixes and cleanups

by Ilpo Järvinen

Here is a series with some fixes and cleanups to resctrl selftests. Only has a minor change in code ordering in main() compared with v3.

v4:
- Move resctrlfs (unconditional) umount after resctrl fs support check

v3:
- Don't include rewritten CAT test into this series!
- Tweak wildcard style in Makefile
- Fix many changelog typos, remove some wrong claims, and generally improve them.
- Add fix to PARENT_EXIT() to unmount resctrl FS
- Add unmounting resctrl FS before starting any tests
- Add fix for buf leak
- Add fix for perf fd closing
- Split mount/remount/umount patches differently
- Use size_t and %zu for span
- Keep MBM print as MB, only internally use span in bytes
- Drop start_buf global from fill_buf

v2 (was sent with CAT test rewrite which is no longer included in v3):
- Rebased on top of next to solve the conflicts
- Added 2 patches related to resctrl FS mount/umount (fix + cleanup)
- Consistently use "alloc" in cache_alloc_size()
- CAT test error handling tweaked
- Remove a spurious newline change from the CAT patch
- Small improvements to changelogs

Ilpo Järvinen (19):
  selftests/resctrl: Add resctrl.h into build deps
  selftests/resctrl: Don't leak buffer in fill_cache()
  selftests/resctrl: Unmount resctrl FS if child fails to run benchmark
  selftests/resctrl: Close perf value read fd on errors
  selftests/resctrl: Unmount resctrl FS before starting the first test
  selftests/resctrl: Move resctrl FS mount/umount to higher level
  selftests/resctrl: Refactor remount_resctrl(bool mum_resctrlfs) to mount_resctrl()
  selftests/resctrl: Remove mum_resctrlfs from struct resctrl_val_param
  selftests/resctrl: Convert span to size_t
  selftests/resctrl: Express span internally in bytes
  selftests/resctrl: Remove duplicated preparation for span arg
  selftests/resctrl: Remove "malloc_and_init_memory" param from run_fill_buf()
  selftests/resctrl: Remove unnecessary startptr global from fill_buf
  selftests/resctrl: Improve parameter consistency in fill_buf
  selftests/resctrl: Don't pass test name to fill_buf
  selftests/resctrl: Don't use variable argument list for ->setup()
  selftests/resctrl: Move CAT/CMT test global vars to function they are used in
  selftests/resctrl: Pass the real number of tests to show_cache_info()
  selftests/resctrl: Remove test type checks from cat_val()

 tools/testing/selftests/resctrl/Makefile | 2 +-
 tools/testing/selftests/resctrl/cache.c | 64 +++++++-------
 tools/testing/selftests/resctrl/cat_test.c | 28 ++----
 tools/testing/selftests/resctrl/cmt_test.c | 29 ++-----
 tools/testing/selftests/resctrl/fill_buf.c | 87 +++++++------------
 tools/testing/selftests/resctrl/mba_test.c | 9 +-
 tools/testing/selftests/resctrl/mbm_test.c | 17 ++--
 tools/testing/selftests/resctrl/resctrl.h | 17 ++--
 .../testing/selftests/resctrl/resctrl_tests.c | 82 +++++++++++------
 tools/testing/selftests/resctrl/resctrl_val.c | 7 +-
 tools/testing/selftests/resctrl/resctrlfs.c | 57 ++++++------
 11 files changed, 169 insertions(+), 230 deletions(-)

--
2.30.2

1 year, 11 months


[PATCH v6 00/15] tools/nolibc: add a new syscall helper

by Zhangjin Wu

Hi, Willy

Here is the v6 of the __sysret series [1], which applies your suggestions. Additionally, sbrk() now also uses the __sysret helper.

These patches are tested (together with the coming v4 selftests/nolibc patches) for all of the supported architectures:

    arch/board            | result
    ----------------------|------------
    arm/vexpress-a9       | 142 test(s) passed, 1 skipped, 0 failed.
    arm/virt              | 142 test(s) passed, 1 skipped, 0 failed.
    aarch64/virt          | 142 test(s) passed, 1 skipped, 0 failed.
    ppc/g3beige           | not supported
    ppc/ppce500           | not supported
    i386/pc               | 142 test(s) passed, 1 skipped, 0 failed.
    x86_64/pc             | 142 test(s) passed, 1 skipped, 0 failed.
    mipsel/malta          | 142 test(s) passed, 1 skipped, 0 failed.
    loongarch64/virt      | 142 test(s) passed, 1 skipped, 0 failed.
    riscv64/virt          | 142 test(s) passed, 1 skipped, 0 failed.
    riscv32/virt          | 0 test(s) passed, 0 skipped, 0 failed.
    s390x/s390-ccw-virtio | 142 test(s) passed, 1 skipped, 0 failed.

Changes from v5 --> v6:

* tools/nolibc: arch-*.h: fix up code indent errors
  toolc/nolibc: arch-*.h: clean up whitespaces after __asm__

    Fix up the code indent errors and whitespaces between __asm__ and volatile. The post-whitespaces are reserved as before.

* tools/nolibc: arch-loongarch.h: shrink with _NOLIBC_SYSCALL_CLOBBERLIST
  tools/nolibc: arch-mips.h: shrink with _NOLIBC_SYSCALL_CLOBBERLIST

    Add _NOLIBC_ prefix for SYSCALL_CLOBBERLIST.

* tools/nolibc: add missing my_syscall6() for mips

    Use post-whitespaces instead of post-tab.

    The above 4 patches are preparation for this one.

* tools/nolibc: __sysret: support syscalls who return a pointer

    Add comments about the new errno range [-MAX_ERRNO, -1], add ref to the musl and glibc.

* tools/nolibc: clean up mmap() routine

    Comment the MAP_FAILED return info.

* tools/nolibc: clean up sbrk() routine

    New patch, applies the __sysret() helper too and also fixes up an error reported by scripts/checkpatch.pl.

* selftests/nolibc: export argv0 for some tests
  selftests/nolibc: prepare: create /dev/zero

    Prepare /dev/zero and argv0 for mmap test cases.

* selftests/nolibc: add EXPECT_PTREQ, EXPECT_PTRNE and EXPECT_PTRER
  selftests/nolibc: add sbrk_0 to test current brk getting

    No change.

* selftests/nolibc: add mmap_bad test case
  selftests/nolibc: add munmap_bad test case
  selftests/nolibc: add mmap_munmap_good test case

    Split the first two out to standalone patches.

    Add /dev/zero and argv0 to the file list and assign a file_size manually for /dev/zero.

Best regards,
Zhangjin
---
[1]: https://lore.kernel.org/lkml/cover.1687957589.git.falcon@tinylab.org/

Zhangjin Wu (15):
  tools/nolibc: arch-*.h: fix up code indent errors
  toolc/nolibc: arch-*.h: clean up whitespaces after __asm__
  tools/nolibc: arch-loongarch.h: shrink with _NOLIBC_SYSCALL_CLOBBERLIST
  tools/nolibc: arch-mips.h: shrink with _NOLIBC_SYSCALL_CLOBBERLIST
  tools/nolibc: add missing my_syscall6() for mips
  tools/nolibc: __sysret: support syscalls who return a pointer
  tools/nolibc: clean up mmap() routine
  tools/nolibc: clean up sbrk() routine
  selftests/nolibc: export argv0 for some tests
  selftests/nolibc: prepare: create /dev/zero
  selftests/nolibc: add EXPECT_PTREQ, EXPECT_PTRNE and EXPECT_PTRER
  selftests/nolibc: add sbrk_0 to test current brk getting
  selftests/nolibc: add mmap_bad test case
  selftests/nolibc: add munmap_bad test case
  selftests/nolibc: add mmap_munmap_good test case

 tools/include/nolibc/arch-aarch64.h | 28 ++--
 tools/include/nolibc/arch-arm.h | 28 ++--
 tools/include/nolibc/arch-i386.h | 24 ++--
 tools/include/nolibc/arch-loongarch.h | 37 +++---
 tools/include/nolibc/arch-mips.h | 73 +++++++----
 tools/include/nolibc/arch-riscv.h | 14 +-
 tools/include/nolibc/arch-s390.h | 14 +-
 tools/include/nolibc/arch-x86_64.h | 28 ++--
 tools/include/nolibc/nolibc.h | 9 +-
 tools/include/nolibc/sys.h | 55 ++++----
 tools/include/nolibc/types.h | 6 +
 tools/testing/selftests/nolibc/nolibc-test.c | 129 ++++++++++++++++++-
 12 files changed, 292 insertions(+), 153 deletions(-)

--
2.25.1
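The errno range [-MAX_ERRNO, -1] mentioned for pointer-returning syscalls can be sketched as follows. This is an illustration only, not the nolibc implementation: the names `sketch_is_error` and `sketch_ptr_ret` are hypothetical, and the value 4095 for MAX_ERRNO is assumed here rather than taken from this cover letter.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumed value: the top 4095 values of the address range encode errors. */
#define SKETCH_MAX_ERRNO 4095UL

/* Hypothetical helper: true when a raw syscall return value, viewed as an
 * unsigned long, lies in the range [-MAX_ERRNO, -1]. */
static int sketch_is_error(unsigned long raw)
{
	return raw >= (unsigned long)-SKETCH_MAX_ERRNO;
}

/* Hypothetical helper: convert a raw pointer-returning syscall result to the
 * libc convention. On error, store the positive errno value in *err and
 * return (void *)-1, the value MAP_FAILED uses; otherwise return the pointer
 * unchanged and store 0. */
static void *sketch_ptr_ret(unsigned long raw, int *err)
{
	assert(err != NULL);
	if (sketch_is_error(raw)) {
		*err = (int)-(long)raw;
		return (void *)-1L;
	}
	*err = 0;
	return (void *)raw;
}
```

A plain sign test would not work for pointers, because a valid mapping may sit in the upper half of the address space; comparing against the small reserved range at the very top avoids misreading such addresses as errors.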

1 year, 11 months


[PATCH v4] tools/nolibc: fix up size inflate regression

by Zhangjin Wu

As reported and suggested by Willy, the inline __sysret() helper introduces three types of conversions and increases the size:

(1) the "unsigned long" argument to __sysret() forces a sign extension from all sys_* functions that used to return 'int'

(2) the comparison with the error range now has to be performed on an 'unsigned long' instead of an 'int'

(3) the return value from __sysret() is a 'long' (note, a signed long) which then has to be turned back to an 'int' before being returned by the caller to satisfy the caller's prototype.

To fix this, firstly, use a macro instead of an inline function to preserve the input type and avoid the useless conversions (1) and (3).

Secondly, the comparison with -MAX_ERRNO is imposed on all integer returns where we could previously keep a simple sign comparison. Use a new is_signed_type() macro from include/linux/compiler.h to limit the comparison with -MAX_ERRNO (2) to the cases that need it, and keep a simple sign comparison for most cases as before.

Thirdly, fix the following warning with an explicit conversion and let __sysret() accept a (void *) argument:

    sysroot/powerpc/include/sys.h: In function 'sbrk':
    sysroot/powerpc/include/sys.h:104:16: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
      104 |         return (void *)__sysret(-ENOMEM);

Fourthly, to handle an argument whose type is qualified with 'const', use __auto_type when the compiler is new enough, or use 'long' as before.
Here reports the size testing result of nolibc-test with gcc 13.2.0: before: // ppc64le with powerpc64-linux-gcc $ size nolibc-test text data bss dec hex filename 28004 8 80 28092 6dbc nolibc-test // mips with mips64-linux-gcc (CFLAGS="-mabi=32 -EL") $ size nolibc-test text data bss dec hex filename 23164 64 64 23292 5afc nolibc-test after: // ppc64le with powerpc64-linux-gcc $ size nolibc-test text data bss dec hex filename 27828 8 80 27916 6d0c nolibc-test // mips with mips64-linux-gcc (CFLAGS="-mabi=32 -EL") $ size nolibc-test text data bss dec hex filename 22924 64 64 23052 5a0c nolibc-test Suggested-by: Willy Tarreau <w(a)1wt.eu> Link: https://lore.kernel.org/lkml/20230806095846.GB10627@1wt.eu/ Link: https://lore.kernel.org/lkml/20230806134348.GA19145@1wt.eu/ Signed-off-by: Zhangjin Wu <falcon(a)tinylab.org> --- Hi, Willy v4 rebases on latest 20230806-for-6.6-1 and fixes up a warning reported by the new -Wall -Wextra options. Changes from v3 --> v4: * fix up a new warning about 'ret < 0' when the input arg type is (void *) Changes from v2 --> v3: * define a __GXX_HAS_AUTO_TYPE_WITH_CONST_SUPPORT for gcc >= 11.0 (ABI_VERSION >= 1016) * split __sysret() to two versions by the macro instead of a mixed unified and unreadable version * use shorter __ret instead of __sysret_arg Changes from v1 --> v2: * fix up argument with 'const' in the type * support "void *" argument v2: https://lore.kernel.org/lkml/95fe3e732f455fab653fe1427118d905e4d04257.16913… v1: https://lore.kernel.org/lkml/20230806131921.52453-1-falcon@tinylab.org/ --- tools/include/nolibc/sys.h | 66 +++++++++++++++++++++++++++++++------- 1 file changed, 55 insertions(+), 11 deletions(-) diff --git a/tools/include/nolibc/sys.h b/tools/include/nolibc/sys.h index 833d6c5e86dc..565b4a295c11 100644 --- a/tools/include/nolibc/sys.h +++ b/tools/include/nolibc/sys.h @@ -35,15 +35,59 @@ * (src/internal/syscall_ret.c) and glibc (sysdeps/unix/sysv/linux/sysdep.h) */ -static __inline__ __attribute__((unused, 
always_inline)) -long __sysret(unsigned long ret) -{ - if (ret >= (unsigned long)-MAX_ERRNO) { - SET_ERRNO(-(long)ret); - return -1; - } - return ret; -} +/* + * Whether 'type' is a signed type or an unsigned type. Supports scalar types, + * bool and also pointer types. (from include/linux/compiler.h) + */ +#define __is_signed_type(type) (((type)(-1)) < (type)1) + +/* __auto_type is used instead of __typeof__ to workaround the build error + * 'error: assignment of read-only variable' when the argument has 'const' in + * the type, but __auto_type is a new feature from newer gcc version and it + * only works with 'const' from gcc 11.0 (__GXX_ABI_VERSION = 1016) + * https://gcc.gnu.org/legacy-ml/gcc-patches/2013-11/msg01378.html + */ + +#if __GXX_ABI_VERSION >= 1016 +#define __GXX_HAS_AUTO_TYPE_WITH_CONST_SUPPORT +#endif + +#ifdef __GXX_HAS_AUTO_TYPE_WITH_CONST_SUPPORT +#define __sysret(arg) \ +({ \ + __auto_type __ret = (arg); \ + if (__is_signed_type(__typeof__(arg))) { \ + if ((long)__ret < 0) { \ + SET_ERRNO(-(long)__ret); \ + __ret = (__typeof__(arg))(-1L); \ + } \ + } else { \ + if ((unsigned long)__ret >= (unsigned long)-MAX_ERRNO) { \ + SET_ERRNO(-(long)__ret); \ + __ret = (__typeof__(arg))(-1L); \ + } \ + } \ + __ret; \ +}) + +#else /* ! __GXX_HAS_AUTO_TYPE_WITH_CONST_SUPPORT */ +#define __sysret(arg) \ +({ \ + long __ret = (long)(arg); \ + if (__is_signed_type(__typeof__(arg))) { \ + if (__ret < 0) { \ + SET_ERRNO(-__ret); \ + __ret = -1L; \ + } \ + } else { \ + if ((unsigned long)__ret >= (unsigned long)-MAX_ERRNO) { \ + SET_ERRNO(-__ret); \ + __ret = -1L; \ + } \ + } \ + (__typeof__(arg))__ret; \ +}) +#endif /* ! __GXX_HAS_AUTO_TYPE_WITH_CONST_SUPPORT */ /* Functions in this file only describe syscalls. 
They're declared static so * that the compiler usually decides to inline them while still being allowed @@ -94,7 +138,7 @@ void *sbrk(intptr_t inc) if (ret && sys_brk(ret + inc) == ret + inc) return ret + inc; - return (void *)__sysret(-ENOMEM); + return __sysret((void *)-ENOMEM); } @@ -682,7 +726,7 @@ void *sys_mmap(void *addr, size_t length, int prot, int flags, int fd, static __attribute__((unused)) void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset) { - return (void *)__sysret((unsigned long)sys_mmap(addr, length, prot, flags, fd, offset)); + return __sysret(sys_mmap(addr, length, prot, flags, fd, offset)); } static __attribute__((unused)) -- 2.25.1
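The two ideas in this patch, detecting signedness at compile time and keeping the caller's type through a macro, can be tried in isolation. The sketch below mirrors the 'long' fallback variant in simplified form; `sketch_is_signed_type`, `sketch_sysret`, `sketch_errno` and `SKETCH_MAX_ERRNO` are hypothetical names introduced for illustration, and the code relies on the GNU C extensions `__typeof__` and statement expressions (GCC or Clang assumed).

```c
#include <assert.h>

/* Same trick as the macro quoted in the patch: -1 converted to an unsigned
 * type becomes the largest value, so the comparison with 1 is false. */
#define sketch_is_signed_type(type) (((type)(-1)) < (type)1)

/* Assumed error-range bound, and a plain variable standing in for errno. */
#define SKETCH_MAX_ERRNO 4095UL
static int sketch_errno;

/* Simplified sketch, not the nolibc macro: signed arguments get a plain sign
 * test, unsigned ones are compared with the [-MAX_ERRNO, -1] range. The
 * result is cast back to the argument's own type. */
#define sketch_sysret(arg)                                              \
({                                                                      \
	long r_ = (long)(arg);                                          \
	if (sketch_is_signed_type(__typeof__(arg)) ?                    \
	    (r_ < 0) :                                                  \
	    ((unsigned long)r_ >= (unsigned long)-SKETCH_MAX_ERRNO)) {  \
		sketch_errno = (int)-r_;                                \
		r_ = -1L;                                               \
	}                                                               \
	(__typeof__(arg))r_;                                            \
})
```

Because the signedness test is a constant expression, the compiler can drop the branch that does not apply, which is the size saving the commit message is after.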

1 year, 11 months


[PATCH mptcp-next v12 0/5] bpf: Force to MPTCP

by Geliang Tang

As is described in the "How to use MPTCP?" section in MPTCP wiki [1]:

"Your app should create sockets with IPPROTO_MPTCP as the proto: ( socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP); ). Legacy apps can be forced to create and use MPTCP sockets instead of TCP ones via the mptcpize command bundled with the mptcpd daemon."

But the mptcpize (LD_PRELOAD technique) command has some limitations [2]:

- it doesn't work if the application is not using libc (e.g. GoLang apps)
- in some envs, it might not be easy to set env vars / change the way apps are launched, e.g. on Android
- mptcpize needs to be launched with all apps that want MPTCP: we could have more control from BPF to enable MPTCP only for some apps or all the ones of a netns or a cgroup, etc.
- it is not in BPF, we cannot talk about it at netdev conf.

So this patchset attempts to use BPF to implement functions similar to mptcpize.

The main idea is to add a hook in sys_socket() to change the protocol id from IPPROTO_TCP (or 0) to IPPROTO_MPTCP.

[1] https://github.com/multipath-tcp/mptcp_net-next/wiki
[2] https://github.com/multipath-tcp/mptcp_net-next/issues/79

v12:
- update diag_* log of update_socket_protocol.
- add 'ip netns show' after 'ip netns del' to check whether a test did not clean up its netns.
- return libbpf_get_error() instead of -EIO for the error from open_and_load().
- Use getsockopt(SOL_PROTOCOL) to verify mptcp protocol instead of using 'ss -tOni'.

v11:
- add comments about outputs of 'ss' and 'nstat'.
- use "err = verify_mptcpify()" instead of using =+.

v10:
- drop "#ifdef CONFIG_BPF_JIT".
- include vmlinux.h and bpf_tracing_net.h to avoid defining some macros.
- drop unneeded checks for mptcp.

v9:
- update comment for 'update_socket_protocol'.

v8:
- drop the additional checks on the 'protocol' value after the 'update_socket_protocol()' call.

v7:
- add __weak and __diag_* for update_socket_protocol.

v6:
- add update_socket_protocol.

v5:
- add bpf_mptcpify helper.

v4:
- use lsm_cgroup/socket_create

v3:
- patch 8: char cmd[128]; -> char cmd[256];

v2:
- Fix build selftests errors reported by CI

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/79

Geliang Tang (5):
  bpf: Add update_socket_protocol hook
  selftests/bpf: Use random netns name for mptcp
  selftests/bpf: Add two mptcp netns helpers
  selftests/bpf: Fix error checks of mptcp open_and_load
  selftests/bpf: Add mptcpify test

 net/mptcp/bpf.c | 15 ++
 net/socket.c | 26 +++-
 .../testing/selftests/bpf/prog_tests/mptcp.c | 146 +++++++++++++++---
 tools/testing/selftests/bpf/progs/mptcpify.c | 20 +++
 4 files changed, 186 insertions(+), 21 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/mptcpify.c

--
2.35.3
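For comparison with the BPF approach, this is roughly what an application does when it opts in to MPTCP itself, as the wiki quote describes. It is a minimal userspace sketch, not code from the series: the helper name `sketch_stream_socket` and the fallback to plain TCP are assumptions made for illustration, and the numeric value 262 is only used when the libc headers do not define IPPROTO_MPTCP.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h> /* close(), for callers of the helper */

#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262 /* value from the Linux UAPI headers */
#endif

/* Hypothetical helper: ask for an MPTCP socket first and fall back to plain
 * TCP when the kernel refuses it. *is_mptcp reports which one was created.
 * Returns the socket fd, or -1 with errno set if both attempts fail. */
static int sketch_stream_socket(int *is_mptcp)
{
	int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP);

	if (fd >= 0) {
		*is_mptcp = 1;
		return fd;
	}
	*is_mptcp = 0;
	return socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
}
```

The hook proposed in this series makes the equivalent protocol substitution inside the kernel at socket creation time, so applications that cannot be modified or preloaded do not need code like this.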

1 year, 11 months


[PATCH 0/1] Possible bug in zram on ppc64le on vfat

by Petr Vorel

Hi all,

the following patch tries to work around an error on ppc64le, where the zram01.sh LTP test (there is also a kernel selftest, tools/testing/selftests/zram/zram01.sh, but the LTP test got further updates) often has mem_used_total 0 although zram is already filled.

The patch repeatedly reads /sys/block/zram*/mm_stat for 1 sec, waiting for mem_used_total > 0. The question is whether this is expected and should be worked around, or whether it is a bug which should be fixed.

REPRODUCE THE ISSUE

Quickest way to install only zram tests and their dependencies:

make autotools && ./configure && for i in testcases/lib/ testcases/kernel/device-drivers/zram/; do cd $i && make -j$(getconf _NPROCESSORS_ONLN) && make install && cd -; done

Run the test (only on vfat):

PATH="/opt/ltp/testcases/bin:$PATH" LTP_SINGLE_FS_TYPE=vfat zram01.sh

Petr Vorel (1):
  zram01.sh: Workaround division by 0 on vfat on ppc64le

 .../kernel/device-drivers/zram/zram01.sh | 27 +++++++++++++++++--
 1 file changed, 25 insertions(+), 2 deletions(-)

--
2.38.0
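The failure mode described here, dividing by a mem_used_total that still reads 0, can be shown with a small parser for one mm_stat line. This is a sketch in C rather than the shell used by the LTP test; the helper names are hypothetical, and the field order (orig_data_size, compr_data_size, mem_used_total first) is taken from the kernel's zram documentation, not from this cover letter.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read the first three fields of a zram mm_stat line
 * (orig_data_size, compr_data_size, mem_used_total). Returns 0 on success,
 * -1 if the line does not start with three unsigned numbers. */
static int sketch_parse_mm_stat(const char *line, unsigned long long *orig,
				unsigned long long *compr,
				unsigned long long *used)
{
	assert(line && orig && compr && used);
	return sscanf(line, "%llu %llu %llu", orig, compr, used) == 3 ? 0 : -1;
}

/* Hypothetical helper: ratio of original data size to memory used, scaled
 * by 100. Returns -1 instead of dividing when mem_used_total is still 0. */
static long long sketch_ratio_x100(unsigned long long orig,
				   unsigned long long used)
{
	if (used == 0)
		return -1;
	return (long long)(orig * 100 / used);
}
```

A caller that gets -1 back could re-read mm_stat and retry for a bounded time, which is the kind of wait the patch adds to the test.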

1 year, 11 months


[PATCH v26 0/5] Implement IOCTL to get and optionally clear info about PTEs

by Muhammad Usama Anjum

*Changes in v26:*
- Code restructuring and API changes in PAGEMAP_IOCTL

*Changes in v25*:
- Do proper filtering on hole as well (hole got missed earlier)

*Changes in v24*:
- Rebase on top of next-20230710
- Place WP markers in case of hole as well

*Changes in v23*:
- Set vec_buf_index in loop only when vec_buf_index is set
- Return -EFAULT instead of -EINVAL if vec is NULL
- Correctly return the walk ending address to the page granularity

*Changes in v22*:
- Interface change:
  - Replace [start start + len) with [start, end)
  - Return the ending address of the address walk in start

*Changes in v21*:
- Abort walk instead of returning error if WP is to be performed on partial hugetlb

*Changes in v20*
- Correct PAGE_IS_FILE and add PAGE_IS_PFNZERO

*Changes in v19*
- Minor changes and interface updates

*Changes in v18*
- Rebase on top of next-20230613
- Minor updates

*Changes in v17*
- Rebase on top of next-20230606
- Minor improvements in PAGEMAP_SCAN IOCTL patch

*Changes in v16*
- Fix a corner case
- Add exclusive PM_SCAN_OP_WP back

*Changes in v15*
- Build fix (Add missed build fix in RESEND)

*Changes in v14*
- Fix build error caused by #ifdef added at last minute in some configs

*Changes in v13*
- Rebase on top of next-20230414
- Give-up on using uffd_wp_range() and write new helpers, flush tlb only once

*Changes in v12*
- Update and other memory types to UFFD_FEATURE_WP_ASYNC
- Rebase on top of next-20230406
- Review updates

*Changes in v11*
- Rebase on top of next-20230307
- Base patches on UFFD_FEATURE_WP_UNPOPULATED
- Do a lot of cosmetic changes and review updates
- Remove ENGAGE_WP + !GET operation as it can be performed with UFFDIO_WRITEPROTECT

*Changes in v10*
- Add specific condition to return error if hugetlb is used with wp async
- Move changes in tools/include/uapi/linux/fs.h to separate patch
- Add documentation

*Changes in v9:*
- Correct fault resolution for userfaultfd wp async
- Fix build warnings and errors which were happening on some configs
- Simplify pagemap ioctl's code

*Changes in v8:*
- Update uffd async wp implementation
- Improve PAGEMAP_IOCTL implementation

*Changes in v7:*
- Add uffd wp async
- Update the IOCTL to use uffd under the hood instead of soft-dirty flags

*Motivation*

The real motivation for adding the PAGEMAP_SCAN IOCTL is to emulate the Windows GetWriteWatch() and ResetWriteWatch() syscalls [1]. GetWriteWatch() retrieves the addresses of the pages that are written to in a region of virtual memory.

This syscall is used in Windows applications and games etc. It is currently being emulated in a rather slow manner in userspace. Our purpose is to enhance the kernel so that it can be emulated efficiently. Currently some out of tree hack patches are being used to efficiently emulate it in some kernels. We intend to replace those with these patches. So the whole of gaming on Linux can effectively benefit from this. It means there would be tons of users of this code.

The CRIU use case [2] was mentioned by Andrei and Danylo:

> Use cases for migrating sparse VMAs are binaries sanitized with ASAN,
> MSAN or TSAN [3]. All of these sanitizers produce sparse mappings of
> shadow memory [4]. Being able to migrate such binaries allows to highly
> reduce the amount of work needed to identify and fix post-migration
> crashes, which happen constantly.

Andrei defines the following uses of this code:

* It is more granular and allows us to track changed pages more effectively. The current interface can clear dirty bits for the entire process only. In addition, reading info about pages is a separate operation. It means we must freeze the process to read information about all its pages, reset dirty bits, and only then can we start dumping pages. The information about pages becomes more and more outdated while we are processing pages. The new interface solves both these downsides. First, it allows us to read pte bits and clear the soft-dirty bit atomically. It means that CRIU will not need to freeze processes to pre-dump their memory. Second, it clears soft-dirty bits for a specified region of memory. It means CRIU will have actual info about pages at the moment of dumping them.
* The new interface has to be much faster because basic page filtering is happening in the kernel. With the old interface, we have to read pagemap for each page.

*Implementation Evolution (Short Summary)*

From the definition of GetWriteWatch(), we feel like the kernel's soft-dirty feature can be used under the hood with some additions like:

* reset soft-dirty flag for only a specific region of memory instead of clearing the flag for the entire process
* get and clear soft-dirty flag for a specific region atomically

So we decided to use an ioctl on the pagemap file to read and/or reset the soft-dirty flag. But using the soft-dirty flag, sometimes we get extra pages which weren't even written. They had become soft-dirty because of VMA merging and the VM_SOFTDIRTY flag. This breaks the definition of GetWriteWatch(). We were able to bypass this shortcoming by ignoring VM_SOFTDIRTY until David reported that mprotect etc. messes up the soft-dirty flag while ignoring VM_SOFTDIRTY [5]. This wasn't happening until [6] got introduced. We discussed whether we could revert these patches, but we could not reach any conclusion. So at this point, I made a couple of tries to solve this whole VM_SOFTDIRTY issue by correcting the soft-dirty implementation:

* [7] Correct the bug fixed wrongly back in 2014. It had potential to cause regression. We left it behind.
* [8] Keep a list of the soft-dirty parts of a VMA across splits and merges. I got the reply: don't increase the size of the VMA by 8 bytes.

At this point, we left soft-dirty, considering it too delicate, and userfaultfd [9] seemed like the only way forward. From there onward, we have been basing soft-dirty emulation on the userfaultfd wp feature, where the kernel resolves the faults itself when the WP_ASYNC feature is used. It was straightforward to add the WP_ASYNC feature in userfaultfd. Now only those pages which have really been written are reported as dirty or written-to. (PS: another userfaultfd feature, WP_UNPOPULATED, is required to avoid pre-faulting memory before write-protecting [9].)

All the different masks were added on the request of CRIU devs to make the interface more generic and better.

[1] https://learn.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-…
[2] https://lore.kernel.org/all/20221014134802.1361436-1-mdanylo@google.com
[3] https://github.com/google/sanitizers
[4] https://github.com/google/sanitizers/wiki/AddressSanitizerAlgorithm#64-bit
[5] https://lore.kernel.org/all/bfcae708-db21-04b4-0bbe-712badd03071@redhat.com
[6] https://lore.kernel.org/all/20220725142048.30450-1-peterx@redhat.com/
[7] https://lore.kernel.org/all/20221122115007.2787017-1-usama.anjum@collabora.…
[8] https://lore.kernel.org/all/20221220162606.1595355-1-usama.anjum@collabora.…
[9] https://lore.kernel.org/all/20230306213925.617814-1-peterx@redhat.com
[10] https://lore.kernel.org/all/20230125144529.1630917-1-mdanylo@google.com

*Original Cover letter from v8*

Hello,

Note: Soft-dirty pages and pages which have been written-to are synonyms. As the kernel already has a soft-dirty feature which we have given up on using, we use the written-to terminology while using UFFD async WP under the hood.

It is possible to find and clear soft-dirty pages entirely in userspace. But it isn't efficient:

- The mprotect and SIGSEGV handler for bookkeeping
- The userfaultfd wp (synchronous) with the handler for bookkeeping

Some benchmarks can be seen here [1]. This series adds features that weren't present earlier:

- There is no atomic "get soft-dirty/written-to status and clear" operation present in the kernel.
- The pages which have been written-to cannot be found in an accurate way. (Kernel's soft-dirty PTE bit + soft_dirty VMA bit shows more soft-dirty pages than there actually are.)

Historically, soft-dirty PTE bit tracking has been used in the CRIU project. The procfs interface is enough for finding the soft-dirty bit status and clearing the soft-dirty bit of all the pages of a process. We have the use case where we need to track the soft-dirty PTE bit for only specific pages on demand. We need this tracking and clear mechanism for a region of memory while the process is running, to emulate the GetWriteWatch() syscall of Windows.

*(Moved to using UFFD instead of soft-dirty feature to find pages which have been written-to from v7 patch series)*:

Stop using the soft-dirty flags for finding which pages have been written to. It is too delicate and wrong, as it shows more soft-dirty pages than the actual soft-dirty pages. There is no interest in correcting it [2][3] as this is how the feature was written years ago. Its behaviour shouldn't be changed now. Peter Xu has suggested using the async version of the UFFD WP [4] as it is based inherently on the PTEs.

So in this patch series, I've added a new mode to the UFFD which is an asynchronous version of the write protect. When this variant of the UFFD WP is used, the page faults are resolved automatically by the kernel. The pages which have been written-to can be found by reading the pagemap file (!PM_UFFD_WP). This feature can be used successfully to find which pages have been written to since the time the pages were write-protected. This works just like the soft-dirty flag without showing any extra pages which aren't soft-dirty in reality.

The information about whether a page is file mapped, present and swapped is required for the CRIU project [5][6]. The addition of the required mask, any mask, excluded mask and return masks is also required for the CRIU project [5].

The IOCTL returns the addresses of the pages which match the specific masks. The page addresses are returned in struct page_region in a compact form. The max_pages is needed to support a use case where the user only wants to get a specific number of pages, so there is no need to find all the pages of interest in the range when max_pages is specified. The IOCTL returns when the maximum number of pages is found. The max_pages is optional. If max_pages is specified, it must be equal to or greater than the vec_size. This restriction is needed to handle the worst case, when one page_region only contains info of one page and it cannot be compacted. This is needed to emulate the Windows GetWriteWatch() syscall.

The patch series includes a detailed selftest which can be used as an example for the uffd async wp test and PAGEMAP_IOCTL. It shows the interface usages as well.

[1] https://lore.kernel.org/lkml/54d4c322-cd6e-eefd-b161-2af2b56aae24@collabora…
[2] https://lore.kernel.org/all/20221220162606.1595355-1-usama.anjum@collabora.…
[3] https://lore.kernel.org/all/20221122115007.2787017-1-usama.anjum@collabora.…
[4] https://lore.kernel.org/all/Y6Hc2d+7eTKs7AiH@x1n
[5] https://lore.kernel.org/all/YyiDg79flhWoMDZB@gmail.com/
[6] https://lore.kernel.org/all/20221014134802.1361436-1-mdanylo@google.com/

Regards,
Muhammad Usama Anjum

Muhammad Usama Anjum (4):
  fs/proc/task_mmu: Implement IOCTL to get and optionally clear info about PTEs
  tools headers UAPI: Update linux/fs.h with the kernel sources
  mm/pagemap: add documentation of PAGEMAP_SCAN IOCTL
  selftests: mm: add pagemap ioctl tests

Peter Xu (1):
  userfaultfd: UFFD_FEATURE_WP_ASYNC

 Documentation/admin-guide/mm/pagemap.rst | 64 +
 Documentation/admin-guide/mm/userfaultfd.rst | 35 +
 fs/proc/task_mmu.c | 653 ++++++++
 fs/userfaultfd.c | 26 +-
 include/linux/hugetlb.h | 1 +
 include/linux/userfaultfd_k.h | 21 +-
 include/uapi/linux/fs.h | 58 +
 include/uapi/linux/userfaultfd.h | 9 +-
 mm/hugetlb.c | 34 +-
 mm/memory.c | 27 +-
 tools/include/uapi/linux/fs.h | 58 +
 tools/testing/selftests/mm/.gitignore | 2 +
 tools/testing/selftests/mm/Makefile | 3 +-
 tools/testing/selftests/mm/config | 1 +
 tools/testing/selftests/mm/pagemap_ioctl.c | 1485 ++++++++++++++++++
 tools/testing/selftests/mm/run_vmtests.sh | 4 +
 16 files changed, 2457 insertions(+), 24 deletions(-)
 create mode 100644 tools/testing/selftests/mm/pagemap_ioctl.c

--
2.39.2
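The "old interface" the cover letter contrasts against, reading one 64-bit pagemap entry per page and testing flag bits, can be sketched as below. This does not use the new ioctl, whose layout changed across revisions of the series. The helper names and `SKETCH_PM_*` macros are hypothetical; the bit positions are taken from the kernel's pagemap documentation, and treating a cleared uffd-wp bit as "written" only makes sense for ranges that were write-protected beforehand.

```c
#include <assert.h>
#include <stdint.h>

/* Flag bits of a /proc/<pid>/pagemap entry (positions as documented in
 * Documentation/admin-guide/mm/pagemap.rst). */
#define SKETCH_PM_SOFT_DIRTY (UINT64_C(1) << 55)
#define SKETCH_PM_UFFD_WP    (UINT64_C(1) << 57)
#define SKETCH_PM_FILE       (UINT64_C(1) << 61)
#define SKETCH_PM_SWAP       (UINT64_C(1) << 62)
#define SKETCH_PM_PRESENT    (UINT64_C(1) << 63)

/* Hypothetical helper: byte offset of the entry for vaddr in the pagemap
 * file; there is one 8-byte entry per virtual page. */
static uint64_t sketch_pagemap_offset(uintptr_t vaddr, uint64_t page_size)
{
	assert(page_size != 0);
	return (vaddr / page_size) * sizeof(uint64_t);
}

/* Hypothetical helper: soft-dirty status of one entry. */
static int sketch_soft_dirty(uint64_t entry)
{
	return (entry & SKETCH_PM_SOFT_DIRTY) != 0;
}

/* Hypothetical helper: with async uffd-wp, a page counts as written once
 * its write-protect bit has been cleared (the "!PM_UFFD_WP" test). */
static int sketch_written_since_wp(uint64_t entry)
{
	return !(entry & SKETCH_PM_UFFD_WP);
}
```

A scan over a large range with these helpers needs one read and one test per page in userspace, which is the per-page cost the in-kernel filtering of the new IOCTL is meant to remove.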

1 year, 11 months


[PATCH v2] tools/nolibc: fix up size inflate regression

by Zhangjin Wu

As reported and suggested by Willy, the inline __sysret() helper introduces three types of conversions and increases the size:

(1) the "unsigned long" argument to __sysret() forces a sign extension from all sys_* functions that used to return 'int'

(2) the comparison with the error range now has to be performed on an 'unsigned long' instead of an 'int'

(3) the return value from __sysret() is a 'long' (note, a signed long) which then has to be turned back to an 'int' before being returned by the caller to satisfy the caller's prototype.

To fix this, firstly, use a macro instead of an inline function to preserve the input type and avoid the useless conversions (1) and (3).

Secondly, the comparison with -MAX_ERRNO is imposed on all integer returns where we could previously keep a simple sign comparison. Use a new is_signed_type() macro from include/linux/compiler.h to limit the comparison with -MAX_ERRNO (2) to the cases that need it, and keep a simple sign comparison for most cases as before.

Thirdly, fix the following warning with an explicit conversion and let __sysret() accept a (void *) argument:

    sysroot/powerpc/include/sys.h: In function 'sbrk':
    sysroot/powerpc/include/sys.h:104:16: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
      104 |         return (void *)__sysret(-ENOMEM);

Fourthly, to handle an argument whose type is qualified with 'const', use __auto_type when the compiler is new enough, or use 'long' as before.
Here reports the size testing result with nolibc-test: before: // ppc64le $ size nolibc-test text data bss dec hex filename 27916 8 80 28004 6d64 nolibc-test // mips $ size nolibc-test text data bss dec hex filename 23276 64 64 23404 5b6c nolibc-test after: // ppc64le $ size nolibc-test text data bss dec hex filename 27736 8 80 27824 6cb0 nolibc-test // mips $ size nolibc-test text data bss dec hex filename 23036 64 64 23164 5a7c nolibc-test Suggested-by: Willy Tarreau <w(a)1wt.eu> Link: https://lore.kernel.org/lkml/20230806095846.GB10627@1wt.eu/ Link: https://lore.kernel.org/lkml/20230806134348.GA19145@1wt.eu/ Signed-off-by: Zhangjin Wu <falcon(a)tinylab.org> --- v2 here is further fix up argument with 'const' in the type and also support "void *" argument, v1 is [1]. Tested on many architectures (i386, x86_64, mips, ppc64) and gcc version (from gcc 4.8-13.1.0), compiles well without any warning and errors and also with smaller size. [1]: https://lore.kernel.org/lkml/20230806131921.52453-1-falcon@tinylab.org/ --- tools/include/nolibc/sys.h | 52 ++++++++++++++++++++++++++++++-------- 1 file changed, 41 insertions(+), 11 deletions(-) diff --git a/tools/include/nolibc/sys.h b/tools/include/nolibc/sys.h index 56f63eb48a1b..9c7448ae19e2 100644 --- a/tools/include/nolibc/sys.h +++ b/tools/include/nolibc/sys.h @@ -35,15 +35,45 @@ * (src/internal/syscall_ret.c) and glibc (sysdeps/unix/sysv/linux/sysdep.h) */ -static __inline__ __attribute__((unused, always_inline)) -long __sysret(unsigned long ret) -{ - if (ret >= (unsigned long)-MAX_ERRNO) { - SET_ERRNO(-(long)ret); - return -1; - } - return ret; -} +/* + * Whether 'type' is a signed type or an unsigned type. Supports scalar types, + * bool and also pointer types. 
(from include/linux/compiler.h) + */ +#define __is_signed_type(type) (((type)(-1)) < (type)1) + +/* __auto_type is used instead of __typeof__ to workaround the build error + * 'error: assignment of read-only variable' when the argument has 'const' in + * the type, but __auto_type is a new feature from newer version and it only + * work with 'const' from gcc 11.0 (__GXX_ABI_VERSION = 1016) + * https://gcc.gnu.org/legacy-ml/gcc-patches/2013-11/msg01378.html + */ + +#if __GXX_ABI_VERSION < 1016 +#define __typeofdecl(arg) long +#define __typeofconv1(arg) (long) +#define __typeofconv2(arg) (long) +#else +#define __typeofdecl(arg) __auto_type +#define __typeofconv1(arg) +#define __typeofconv2(arg) (__typeof__(arg)) +#endif + +#define __sysret(arg) \ +({ \ + __typeofdecl(arg) __sysret_arg = __typeofconv1(arg)(arg); \ + if (__is_signed_type(__typeof__(arg))) { \ + if (__sysret_arg < 0) { \ + SET_ERRNO(-(long)__sysret_arg); \ + __sysret_arg = __typeofconv2(arg)(-1L); \ + } \ + } else { \ + if ((unsigned long)__sysret_arg >= (unsigned long)-MAX_ERRNO) { \ + SET_ERRNO(-(long)__sysret_arg); \ + __sysret_arg = __typeofconv2(arg)(-1L); \ + } \ + } \ + (__typeof__(arg))__sysret_arg; \ +}) /* Functions in this file only describe syscalls. They're declared static so * that the compiler usually decides to inline them while still being allowed @@ -94,7 +124,7 @@ void *sbrk(intptr_t inc) if (ret && sys_brk(ret + inc) == ret + inc) return ret + inc; - return (void *)__sysret(-ENOMEM); + return __sysret((void *)-ENOMEM); } @@ -682,7 +712,7 @@ void *sys_mmap(void *addr, size_t length, int prot, int flags, int fd, static __attribute__((unused)) void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset) { - return (void *)__sysret((unsigned long)sys_mmap(addr, length, prot, flags, fd, offset)); + return __sysret(sys_mmap(addr, length, prot, flags, fd, offset)); } static __attribute__((unused)) -- 2.25.1

1 year, 11 months


[PATCH] tools/nolibc: fix up size inflate regression

by Zhangjin Wu

As reported and suggested by Willy, the inline __sysret() helper introduces three types of conversions and increases the size:

(1) the "unsigned long" argument to __sysret() forces a sign extension from all sys_* functions that used to return 'int'

(2) the comparison with the error range now has to be performed on an 'unsigned long' instead of an 'int'

(3) the return value from __sysret() is a 'long' (note, a signed long) which then has to be turned back to an 'int' before being returned by the caller to satisfy the caller's prototype.

To fix this up, firstly, let's use a macro instead of an inline function to preserve the input type and avoid the useless conversions (1) and (3).

Secondly, the comparison to -MAX_ERRNO is inflicted on all integer returns where we could previously keep a simple sign comparison, so let's use a new is_signed_type() macro from include/linux/compiler.h to limit the comparison to -MAX_ERRNO (2) to the cases that need it and preserve a simple sign comparison for most of the cases as before.

Thirdly, fix up the following warning with an explicit conversion:

    sysroot/powerpc/include/sys.h: In function 'sbrk':
    sysroot/powerpc/include/sys.h:104:16: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
      104 |         return (void *)__sysret(-ENOMEM);

Here are the size results with nolibc-test:

before:

    // ppc64le
    $ size nolibc-test
       text    data     bss     dec     hex filename
      27916       8      80   28004    6d64 nolibc-test

    // mips
    $ size nolibc-test
       text    data     bss     dec     hex filename
      23276      64      64   23404    5b6c nolibc-test

after:

    // ppc64le
    $ size nolibc-test
       text    data     bss     dec     hex filename
      27736       8      80   27824    6cb0 nolibc-test

    // mips
    $ size nolibc-test
       text    data     bss     dec     hex filename
      23036      64      64   23164    5a7c nolibc-test

Suggested-by: Willy Tarreau <w(a)1wt.eu>
Link: https://lore.kernel.org/lkml/20230806095846.GB10627@1wt.eu/#R
Signed-off-by: Zhangjin Wu <falcon(a)tinylab.org>
---
 tools/include/nolibc/compiler.h |  9 +++++++++
 tools/include/nolibc/sys.h      | 27 +++++++++++++++++----------
 2 files changed, 26 insertions(+), 10 deletions(-)

diff --git a/tools/include/nolibc/compiler.h b/tools/include/nolibc/compiler.h
index beddc3665d69..360dfc533814 100644
--- a/tools/include/nolibc/compiler.h
+++ b/tools/include/nolibc/compiler.h
@@ -22,4 +22,13 @@
 # define __no_stack_protector __attribute__((__optimize__("-fno-stack-protector")))
 #endif /* defined(__has_attribute) */
 
+/*
+ * from include/linux/compiler.h
+ *
+ * Whether 'type' is a signed type or an unsigned type. Supports scalar types,
+ * bool and also pointer types.
+ */
+#define is_signed_type(type) (((type)(-1)) < (type)1)
+#define is_unsigned_type(type) (!is_signed_type(type))
+
 #endif /* _NOLIBC_COMPILER_H */
diff --git a/tools/include/nolibc/sys.h b/tools/include/nolibc/sys.h
index 56f63eb48a1b..8271302f79c4 100644
--- a/tools/include/nolibc/sys.h
+++ b/tools/include/nolibc/sys.h
@@ -35,15 +35,22 @@
  * (src/internal/syscall_ret.c) and glibc (sysdeps/unix/sysv/linux/sysdep.h)
  */
 
-static __inline__ __attribute__((unused, always_inline))
-long __sysret(unsigned long ret)
-{
-	if (ret >= (unsigned long)-MAX_ERRNO) {
-		SET_ERRNO(-(long)ret);
-		return -1;
-	}
-	return ret;
-}
+#define __sysret(arg) \
+({ \
+	__typeof__(arg) __sysret_arg = (arg); \
+	if (is_signed_type(__typeof__(arg))) { \
+		if (__sysret_arg < 0) { \
+			SET_ERRNO(-(int)__sysret_arg); \
+			__sysret_arg = -1L; \
+		} \
+	} else { \
+		if ((unsigned long)__sysret_arg >= (unsigned long)-MAX_ERRNO) { \
+			SET_ERRNO(-(int)__sysret_arg); \
+			__sysret_arg = -1L; \
+		} \
+	} \
+	__sysret_arg; \
+})
 
 /* Functions in this file only describe syscalls. They're declared static so
  * that the compiler usually decides to inline them while still being allowed
@@ -94,7 +101,7 @@ void *sbrk(intptr_t inc)
 	if (ret && sys_brk(ret + inc) == ret + inc)
 		return ret + inc;
 
-	return (void *)__sysret((unsigned long)-ENOMEM);
+	return (void *)__sysret((unsigned long)-ENOMEM);
 }
 
-- 
2.25.1
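The type-preserving idea in this patch can be tried outside nolibc. What follows is only a sketch under stated assumptions, not the nolibc code: the name sysret_demo is hypothetical, MAX_ERRNO is assumed to be 4095, plain errno stands in for SET_ERRNO(), and only integer arguments are handled (pointer arguments are what the later v2 addresses). It needs GNU C statement expressions, so gcc or clang.

```c
#include <errno.h>

/* Assumed error range; the kernel uses 4095 for MAX_ERRNO. */
#define MAX_ERRNO 4095

/* True when 'type' is signed: (type)-1 compares below (type)1. */
#define is_signed_type(type) (((type)(-1)) < (type)1)

/*
 * Hypothetical stand-in for the patch's __sysret(): the result keeps the
 * type of the argument, so an 'int' syscall wrapper never widens to long.
 * Integer arguments only in this sketch.
 */
#define sysret_demo(arg) \
({ \
	__typeof__(arg) ret_ = (arg); \
	if (is_signed_type(__typeof__(arg))) { \
		/* signed: a plain sign test is enough */ \
		if (ret_ < 0) { \
			errno = -(int)ret_; \
			ret_ = -1L; \
		} \
	} else { \
		/* unsigned: errors live in the top MAX_ERRNO values */ \
		if ((unsigned long)ret_ >= (unsigned long)-MAX_ERRNO) { \
			errno = -(int)ret_; \
			ret_ = -1L; \
		} \
	} \
	ret_; \
})
```

Inside a function, `int r = sysret_demo(-ENOENT);` takes the signed branch, while `sysret_demo((unsigned long)-ENOMEM)` takes the range comparison. The branch is selected by a compile-time constant, so an optimizing compiler can drop the unused one.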

1 year, 11 months


[PATCH v6 0/8] tools/nolibc: add 32/64-bit powerpc support

by Zhangjin Wu

Hi, Willy

Now, the dependent pmac32_defconfig patch has been merged into the powerpc next-test branch [1] ;-)

v6 here cleans up the CFLAGS for the ppc variants, removes the redundant -Wl options and calls cc-option to check the -mmultiple option for llvm as the kernel does. v5 is [2].

Tests were run with local toolchains and the latest toolchains.

    $ for arch in ppc ppc64 ppc64le; do \
        make run-user XARCH=$arch | grep "status: "; \
      done
    166 test(s): 158 passed, 8 skipped, 0 failed => status: warning
    166 test(s): 158 passed, 8 skipped, 0 failed => status: warning
    166 test(s): 158 passed, 8 skipped, 0 failed => status: warning

    $ for arch in ppc ppc64 ppc64le; do \
        make run-user XARCH=$arch CC=/labs/linux-lab/prebuilt/toolchains/ppc64/gcc-13.1.0-nolibc/powerpc64-linux/bin/powerpc64-linux-gcc | grep "status: "; \
      done
    166 test(s): 158 passed, 8 skipped, 0 failed => status: warning
    166 test(s): 158 passed, 8 skipped, 0 failed => status: warning
    166 test(s): 158 passed, 8 skipped, 0 failed => status: warning

Changes from v5 --> v6:

* selftests/nolibc: add test support for ppc
  selftests/nolibc: add test support for ppc64le
  selftests/nolibc: add test support for ppc64

    Removed the -Wl options.

    As commented in arch/powerpc/Makefile, use -mmultiple with cc-option, since llvm has no such option.

* tools/nolibc: add support for powerpc
  tools/nolibc: add support for powerpc64
  selftests/nolibc: add XARCH and ARCH mapping support
  selftests/nolibc: allow customize CROSS_COMPILE by architecture
  selftests/nolibc: customize CROSS_COMPILE for 32/64-bit powerpc

    No changes.

BR,
Zhangjin Wu
---
[1]: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?h…
[2]: https://lore.kernel.org/lkml/cover.1691062722.git.falcon@tinylab.org/

Zhangjin Wu (8):
  tools/nolibc: add support for powerpc
  tools/nolibc: add support for powerpc64
  selftests/nolibc: add XARCH and ARCH mapping support
  selftests/nolibc: add test support for ppc
  selftests/nolibc: add test support for ppc64le
  selftests/nolibc: add test support for ppc64
  selftests/nolibc: allow customize CROSS_COMPILE by architecture
  selftests/nolibc: customize CROSS_COMPILE for 32/64-bit powerpc

 tools/include/nolibc/arch-powerpc.h     | 213 ++++++++++++++++++++++++
 tools/include/nolibc/arch.h             |   2 +
 tools/testing/selftests/nolibc/Makefile |  74 ++++++--
 3 files changed, 277 insertions(+), 12 deletions(-)
 create mode 100644 tools/include/nolibc/arch-powerpc.h

-- 
2.25.1

1 year, 11 months


[PATCH v1 0/4] selftests/nolibc: customize CROSS_COMPILE for all supported architectures

by Zhangjin Wu

Hi, Willy

Based on the CROSS_COMPILE customization support [1] from the last ppc patchset, and to make the run-user/run targets work for all of the architectures nolibc supports, let's customize CROSS_COMPILE for all of them.

Besides loongarch, all of the other architectures have local toolchains. Let's use the one from [2] for loongarch; it has a different prefix.

Also, as you suggested in our previous discussion, let's add some notes about the toolchains and firmware instead of downloading them automatically.

Now, the test iteration becomes very simple:

    $ ARCHS="i386 x86_64 arm64 arm mips ppc ppc64 ppc64le riscv s390"
    $ for arch in ${ARCHS[@]}; do printf "%9s: " $arch; make run-user XARCH=$arch | grep status; done
         i386: 165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
       x86_64: 165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
        arm64: 165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
          arm: 165 test(s): 156 passed, 9 skipped, 0 failed => status: warning
         mips: 165 test(s): 156 passed, 9 skipped, 0 failed => status: warning
          ppc: 165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
        ppc64: 165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
      ppc64le: 165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
        riscv: 165 test(s): 156 passed, 9 skipped, 0 failed => status: warning
         s390: 165 test(s): 156 passed, 9 skipped, 0 failed => status: warning

(I currently have no qemu-user for loongarch, so there is no test result for it above.)

Best regards,
Zhangjin
---
[1] https://lore.kernel.org/lkml/cover.1691259983.git.falcon@tinylab.org/
[2] https://mirrors.edge.kernel.org/pub/tools/crosstool/

Zhangjin Wu (4):
  selftests/nolibc: allow use x86_64 toolchain for i386
  selftests/nolibc: customize CROSS_COMPILE for many architectures
  selftests/nolibc: customize CROSS_COMPILE for loongarch
  selftests/nolibc: add some notes about qemu tools

 tools/testing/selftests/nolibc/Makefile | 32 ++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)

-- 
2.25.1

1 year, 11 months


[PATCH] selftests: prctl: Add prctl test for PR_GET_NAME

by Osama Muhammad

This patch covers the testing of PR_GET_NAME by reading its value from /proc/self/task/<pid>/comm and matching it against the value returned by PR_GET_NAME.

Signed-off-by: Osama Muhammad <osmtendev(a)gmail.com>
---
 .../selftests/prctl/set-process-name.c | 25 +++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/tools/testing/selftests/prctl/set-process-name.c b/tools/testing/selftests/prctl/set-process-name.c
index 3bc5e0e09..41f4b105d 100644
--- a/tools/testing/selftests/prctl/set-process-name.c
+++ b/tools/testing/selftests/prctl/set-process-name.c
@@ -47,6 +47,28 @@ int check_null_pointer(char *check_name)
 	return res;
 }
 
+int check_name(void)
+{
+
+	int pid;
+
+	pid = getpid();
+	FILE *fptr;
+	char path[50] = {};
+	int j;
+
+	j = snprintf(path, 50, "/proc/self/task/%d/comm", pid);
+	fptr = fopen(path, "r");
+	char name[TASK_COMM_LEN] = {};
+	int res = prctl(PR_GET_NAME, name, NULL, NULL, NULL);
+	char output[TASK_COMM_LEN] = {};
+
+	fscanf(fptr, "%s", output);
+
+	return !strcmp(output, name);
+
+}
+
 TEST(rename_process) {
 	EXPECT_GE(set_name(CHANGE_NAME), 0);
@@ -57,6 +79,9 @@ TEST(rename_process) {
 	EXPECT_GE(set_name(CHANGE_NAME), 0);
 	EXPECT_LT(check_null_pointer(CHANGE_NAME), 0);
+
+	EXPECT_TRUE(check_name());
+
 }
 
 TEST_HARNESS_MAIN
-- 
2.34.1
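For readers who want to try the same comparison outside the selftest harness, here is a minimal sketch, not the patch's code. The helper name comm_matches_prctl and the COMM_LEN constant are hypothetical; it assumes Linux with /proc mounted and a call from the main thread, so that /proc/self/comm names the calling thread. Unlike the patch, it checks every return value and reads the file with fgets(), since a task name may contain spaces that would cut a "%s" scan short.

```c
#include <stdio.h>
#include <string.h>
#include <sys/prctl.h>

#define COMM_LEN 16	/* assumed to match the kernel's TASK_COMM_LEN */

/*
 * Hypothetical helper: returns 1 if PR_GET_NAME matches /proc/self/comm,
 * 0 on mismatch, and -1 if either source could not be read.
 */
static int comm_matches_prctl(void)
{
	char from_prctl[COMM_LEN] = {0};
	char from_proc[COMM_LEN + 1] = {0};	/* name + '\n' + NUL */
	FILE *f;

	if (prctl(PR_GET_NAME, from_prctl, 0, 0, 0) != 0)
		return -1;

	f = fopen("/proc/self/comm", "r");
	if (!f)
		return -1;
	if (!fgets(from_proc, sizeof(from_proc), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	/* The comm file ends with a newline; strip it before comparing. */
	from_proc[strcspn(from_proc, "\n")] = '\0';

	return strcmp(from_proc, from_prctl) == 0;
}
```

Returning -1 separately from 0 lets a caller tell "could not check" apart from "names differ", which the boolean result in the patch cannot express.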

1 year, 11 months


[PATCH v3 00/14] tools/nolibc: enable compiler warnings

by Thomas Weißschuh

To help developers avoid mistakes and keep the code smaller, let's enable compiler warnings.

I stuck with __attribute__((unused)) over __maybe_unused in nolibc-test.c for consistency with nolibc proper. If we want to add a define, it needs to be added twice, once for nolibc proper and once for nolibc-test, otherwise nolibc-test wouldn't build anymore.

Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
---
Changes in v3:
- Make getpagesize() return "int"
- Simplify validation of read() return value
- Don't make functions static that are to be used as breakpoints
- Drop -s from LDFLAGS
- Use proper types for read()/write() return values
- Fix unused parameter warnings in new setvbuf()
- Link to v2: https://lore.kernel.org/r/20230801-nolibc-warnings-v2-0-1ba5ca57bd9b@weisss…

Changes in v2:
- Don't drop unused test helpers, mark them as __attribute__((unused))
- Make some functions in nolibc-test static
- Also handle -W and -Wextra
- Link to v1: https://lore.kernel.org/r/20230731-nolibc-warnings-v1-0-74973d2a52d7@weisss…

---
Thomas Weißschuh (14):
  tools/nolibc: drop unused variables
  tools/nolibc: fix return type of getpagesize()
  tools/nolibc: setvbuf: avoid unused parameter warnings
  tools/nolibc: sys: avoid implicit sign cast
  tools/nolibc: stdint: use int for size_t on 32bit
  selftests/nolibc: drop unused variables
  selftests/nolibc: mark test helpers as potentially unused
  selftests/nolibc: make functions static if possible
  selftests/nolibc: avoid unused parameter warnings
  selftests/nolibc: avoid sign-compare warnings
  selftests/nolibc: use correct return type for read() and write()
  selftests/nolibc: prevent out of bounds access in expect_vfprintf
  selftests/nolibc: don't strip nolibc-test
  selftests/nolibc: enable compiler warnings

 tools/include/nolibc/stdint.h                |   4 +
 tools/include/nolibc/stdio.h                 |   5 +-
 tools/include/nolibc/sys.h                   |   7 +-
 tools/testing/selftests/nolibc/Makefile      |   4 +-
 tools/testing/selftests/nolibc/nolibc-test.c | 111 ++++++++++++++++-----------
 5 files changed, 80 insertions(+), 51 deletions(-)
---
base-commit: bc87f9562af7b2b4cb07dcaceccfafcf05edaff8
change-id: 20230731-nolibc-warnings-c6e47284ac03

Best regards,
-- 
Thomas Weißschuh <linux(a)weissschuh.net>

1 year, 11 months


[PATCH v2 0/2] tracing: Fix cpu buffers unavailable problem and add its testcase

by Zheng Yejian

Hi,

This is v2 of the fix for the cpu buffers becoming unavailable after some operations on the files 'tracing_cpumask' and 'snapshot'; it also adds a testcase for the problem. Changes are shown below.

v2:
- Fix compile issue reported by kernel test robot <lkp(a)intel.com>, with suggestion from Steve:
  - Link: https://lore.kernel.org/all/202308050731.PQutr3r0-lkp@intel.com/
  - Link: https://lore.kernel.org/all/20230804125107.41d6cdb1@gandalf.local.home/
- Add a step to set tracing_on in the testcase (see patch 2) and add descriptions of each step.

v1:
- Link: https://lore.kernel.org/all/20230804124549.2562977-1-zhengyejian1@huawei.co…

Zheng Yejian (2):
  tracing: Fix cpu buffers unavailable due to 'record_disabled' messed
  selftests/ftrace: Add a basic testcase for snapshot

 kernel/trace/trace.c                          |  6 ++++
 .../ftrace/test.d/00basic/snapshot1.tc        | 31 +++++++++++++++++++
 2 files changed, 37 insertions(+)
 create mode 100644 tools/testing/selftests/ftrace/test.d/00basic/snapshot1.tc

-- 
2.25.1

1 year, 11 months


[PATCH 0/2] tracing: Fix cpu buffers unavailable problem and add its testcase

by Zheng Yejian

Hi, Steve,

After some operations on the files 'tracing_cpumask' and 'snapshot', the trace ring buffer can no longer record anything. This series contains a patch to fix it and a contrived testcase to reproduce it.

I think the reproduction testcase is useful to help others check whether their version has this problem and to verify the bugfix. However, there currently seems to be no appropriate subdirectory in "tools/testing/selftests/ftrace/test.d" for this kind of reproduction, so for now I put the testcase in "00basic" because it is basically simple. Or should there be a new directory to collect simple reproduction testcases?

Zheng Yejian (2):
  tracing: Fix cpu buffers unavailable due to 'record_disabled' messed
  selftests/ftrace: Add a basic testcase for snapshot

 kernel/trace/trace.c                          |  2 ++
 .../ftrace/test.d/00basic/snapshot1.tc        | 17 +++++++++++++++++
 2 files changed, 19 insertions(+)
 create mode 100644 tools/testing/selftests/ftrace/test.d/00basic/snapshot1.tc

-- 
2.25.1

1 year, 11 months


[PATCH net 0/4] mptcp: more fixes for v6.5

by Matthieu Baerts

Here is a new batch of fixes related to MPTCP for v6.5 and older.

Patches 1 and 2 fix issues with MPTCP Join selftest when manually launched with '-i' parameter to use 'ip mptcp' tool instead of the dedicated one (pm_nl_ctl). The issues have been there since v5.18. Thank you Andrea for your first contributions to MPTCP code in the upstream kernel!

Patch 3 avoids corrupting the data stream when trying to reset connections that have fallen back to TCP. This can happen from v6.1.

Patch 4 fixes a race when doing a disconnect() and an accept() in parallel on a listener socket. The issue only happens in rare cases if the user is really unlucky since a fix that landed in v6.3 but backported up to v6.1.

Signed-off-by: Matthieu Baerts <matthieu.baerts(a)tessares.net>
---
Andrea Claudi (2):
  selftests: mptcp: join: fix 'delete and re-add' test
  selftests: mptcp: join: fix 'implicit EP' test

Paolo Abeni (2):
  mptcp: avoid bogus reset on fallback close
  mptcp: fix disconnect vs accept race

 net/mptcp/protocol.c                            |  2 +-
 net/mptcp/protocol.h                            |  1 -
 net/mptcp/subflow.c                             | 60 ++++++++++++-------------
 tools/testing/selftests/net/mptcp/mptcp_join.sh |  6 ++-
 4 files changed, 35 insertions(+), 34 deletions(-)
---
base-commit: 0f71c9caf26726efea674646f566984e735cc3b9
change-id: 20230803-upstream-net-20230803-misc-fixes-6-5-6046c6ca74b6

Best regards,
-- 
Matthieu Baerts <matthieu.baerts(a)tessares.net>

1 year, 11 months


[PATCH v4 0/3] kunit: Expose some built-in features to modules

by Janusz Krzysztofik

Submit the top-level headers also from the kunit test module notifier initialization callback, so external tools that are parsing dmesg for kunit test output are able to tell how many test suites should be expected and whether to continue parsing after complete output from the first test suite is collected.

Extend the kunit module notifier initialization callback with a processing path that only lists the tests provided by a module if the kunit action parameter is set to "list", so external tools can obtain a list of test cases to be executed in advance and can do a better job of assigning kernel messages interleaved with kunit output to specific tests.

Use test filtering functions in kunit module notifier callback functions, so external tools are able to execute individual test cases from kunit test modules in order to better isolate their potential impact on kernel messages that appear interleaved with output from other tests.

v4: Use kunit_exec_run_tests() (Mauro, Rae), but prevent it from emitting the headers when called on load of non-test modules,
  - don't use a different list format, use kunit_exec_list_tests() (Rae),
  - refresh on top of newly introduced attributes patches, handle newly introduced kunit.action=list_attr case (Rae).
v3: Fix CONFIG_GLOB, required by filtering functions, not selected when building as a module.
v2: Fix new name of a structure moved to kunit namespace not updated across all uses.

Janusz Krzysztofik (3):
  kunit: Report the count of test suites in a module
  kunit: Make 'list' action available to kunit test modules
  kunit: Allow kunit test modules to use test filtering

 include/kunit/test.h |  21 ++++++++
 lib/kunit/Kconfig    |   2 +-
 lib/kunit/executor.c | 115 +++++++++++++++++++++++++------------------
 lib/kunit/test.c     |  40 ++++++++++++++-
 4 files changed, 128 insertions(+), 50 deletions(-)

base-commit: 5a175d369c702ce08c9feb630125c9fc7a9e1370
-- 
2.41.0

1 year, 11 months


[PATCH] selftests/rseq: Fix build with undefined __weak

by Mark Brown

Commit 3bcbc20942db ("selftests/rseq: Play nice with binaries statically linked against glibc 2.35+") which is now in Linus' tree introduced uses of __weak but did nothing to ensure that a definition is provided for it resulting in build failures for the rseq tests:

rseq.c:41:1: error: unknown type name '__weak'
__weak ptrdiff_t __rseq_offset;
^
rseq.c:41:17: error: expected ';' after top level declarator
__weak ptrdiff_t __rseq_offset;
                ^
                ;
rseq.c:42:1: error: unknown type name '__weak'
__weak unsigned int __rseq_size;
^
rseq.c:43:1: error: unknown type name '__weak'
__weak unsigned int __rseq_flags;

Fix this by using the definition from tools/include compiler.h.

Fixes: 3bcbc20942db ("selftests/rseq: Play nice with binaries statically linked against glibc 2.35+")
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
It'd be good if the KVM testing could include builds of the rseq selftests, the KVM tests pull in code from rseq but not the build system which has resulted in multiple failures like this.
---
 tools/testing/selftests/rseq/Makefile | 4 +++-
 tools/testing/selftests/rseq/rseq.c   | 2 ++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/rseq/Makefile b/tools/testing/selftests/rseq/Makefile
index b357ba24af06..7a957c7d459a 100644
--- a/tools/testing/selftests/rseq/Makefile
+++ b/tools/testing/selftests/rseq/Makefile
@@ -4,8 +4,10 @@ ifneq ($(shell $(CC) --version 2>&1 | head -n 1 | grep clang),)
 CLANG_FLAGS += -no-integrated-as
 endif
 
+top_srcdir = ../../../..
+
 CFLAGS += -O2 -Wall -g -I./ $(KHDR_INCLUDES) -L$(OUTPUT) -Wl,-rpath=./ \
-	  $(CLANG_FLAGS)
+	  $(CLANG_FLAGS) -I$(top_srcdir)/tools/include
 LDLIBS += -lpthread -ldl
 
 # Own dependencies because we only want to build against 1st prerequisite, but
diff --git a/tools/testing/selftests/rseq/rseq.c b/tools/testing/selftests/rseq/rseq.c
index a723da253244..96e812bdf8a4 100644
--- a/tools/testing/selftests/rseq/rseq.c
+++ b/tools/testing/selftests/rseq/rseq.c
@@ -31,6 +31,8 @@
 #include <sys/auxv.h>
 #include <linux/auxvec.h>
 
+#include <linux/compiler.h>
+
 #include "../kselftest.h"
 #include "rseq.h"
 
---
base-commit: 5d0c230f1de8c7515b6567d9afba1f196fb4e2f4
change-id: 20230804-kselftest-rseq-build-9d537942b1de

Best regards,
-- 
Mark Brown <broonie(a)kernel.org>
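The build error arises because __weak is not a C keyword; it only works when some header defines it. The sketch below shows what such a definition typically looks like and what weak symbols buy you. It assumes GCC or Clang on an ELF target, assumes the tools/include definition expands to __attribute__((weak)), and uses hypothetical demo_* names rather than the real __rseq_* symbols.

```c
#include <stddef.h>

/* Fallback in the spirit of tools/include/linux/compiler.h (assumed form). */
#ifndef __weak
#define __weak __attribute__((weak))
#endif

/*
 * Hypothetical weak definitions: if another object (for example a libc)
 * provides strong symbols with these names, the linker prefers those;
 * otherwise these zero-initialized copies are used.
 */
__weak ptrdiff_t demo_offset;
__weak unsigned int demo_size;

/* Weak declaration with no definition anywhere: its address is NULL. */
extern int demo_optional_hook(void) __weak;

/* Calls the hook only if something actually linked it in. */
static int call_hook_if_present(void)
{
	if (demo_optional_hook)
		return demo_optional_hook();
	return -1;	/* hook not linked in */
}
```

With no strong definitions linked, demo_offset stays 0 and call_hook_if_present() returns -1, which is the "feature absent" case a caller can detect at run time.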

1 year, 11 months


[PATCH v3] selftests: cgroup: fix test_kmem_basic less than error

by Lucas Karpinski

test_kmem_basic creates 100,000 negative dentries, with each one mapping to a slab object. After memory.high is set, these are reclaimed through the shrink_slab function call which reclaims all 100,000 entries. The test passes the majority of the time because when slab1 or current is calculated, it is often above 0, however, 0 is also an acceptable value.

Signed-off-by: Lucas Karpinski <lkarpins(a)redhat.com>
---
In the previous patch, I missed a change to the variable 'current' even after some testing as the issue was so sporadic. Current takes the slab size into account and can also face the same issue where it fails since the reported value is 0, which is an acceptable value.

Drop: b4abfc19 in mm-unstable
V2: https://lore.kernel.org/all/ix6vzgjqay2x7bskle7pypoint4nj66fwq7odvd5hektatv…

 tools/testing/selftests/cgroup/test_kmem.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/cgroup/test_kmem.c b/tools/testing/selftests/cgroup/test_kmem.c
index 1b2cec9d18a4..ed2e50bb1e76 100644
--- a/tools/testing/selftests/cgroup/test_kmem.c
+++ b/tools/testing/selftests/cgroup/test_kmem.c
@@ -75,11 +75,11 @@ static int test_kmem_basic(const char *root)
 	sleep(1);
 
 	slab1 = cg_read_key_long(cg, "memory.stat", "slab ");
-	if (slab1 <= 0)
+	if (slab1 < 0)
 		goto cleanup;
 
 	current = cg_read_long(cg, "memory.current");
-	if (current <= 0)
+	if (current < 0)
 		goto cleanup;
 
 	if (slab1 < slab0 / 2 && current < slab0 / 2)
-- 
2.41.0

1 year, 11 months


[PATCH v1 0/3] selftests/nolibc: add misc improvments

by Zhangjin Wu

Hi, Willy

Here are the last 3 patches for v6.6 from me.

They include two generic patches from the tinyconfig part1 series and one static-related patch derived from Thomas' series.

Best regards,
Zhangjin

Zhangjin Wu (3):
  selftests/nolibc: allow report with existing test log
  selftests/nolibc: fix up O= option support
  tools/nolibc: stackprotector.h: make __stack_chk_init static

 tools/include/nolibc/crt.h              |  2 +-
 tools/include/nolibc/stackprotector.h   |  5 ++---
 tools/testing/selftests/nolibc/Makefile | 11 +++++++++--
 3 files changed, 12 insertions(+), 6 deletions(-)

-- 
2.25.1

1 year, 11 months


[PATCH v2] kselftest/arm64: add RCpc load-acquire to hwcap test

by Zeng Heng

Add the RCpc and various features check in the set of hwcap tests.

Signed-off-by: Zeng Heng <zengheng4(a)huawei.com>
---
v1 -> v2:
- sort features by name

 tools/testing/selftests/arm64/abi/hwcap.c | 26 +++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/tools/testing/selftests/arm64/abi/hwcap.c b/tools/testing/selftests/arm64/abi/hwcap.c
index d4ad813fed10..6a0adf916028 100644
--- a/tools/testing/selftests/arm64/abi/hwcap.c
+++ b/tools/testing/selftests/arm64/abi/hwcap.c
@@ -39,6 +39,18 @@ static void cssc_sigill(void)
 	asm volatile(".inst 0xdac01c00" : : : "x0");
 }
 
+static void ilrcpc_sigill(void)
+{
+	/* LDAPUR W0, [SP, #8] */
+	asm volatile(".inst 0x994083e0" : : : );
+}
+
+static void lrcpc_sigill(void)
+{
+	/* LDAPR W0, [SP, #0] */
+	asm volatile(".inst 0xb8bfc3e0" : : : );
+}
+
 static void mops_sigill(void)
 {
 	char dst[1], src[1];
@@ -223,6 +235,20 @@ static const struct hwcap_data {
 		.cpuinfo = "cssc",
 		.sigill_fn = cssc_sigill,
 	},
+	{
+		.name = "LRCPC",
+		.at_hwcap = AT_HWCAP,
+		.hwcap_bit = HWCAP_LRCPC,
+		.cpuinfo = "lrcpc",
+		.sigill_fn = lrcpc_sigill,
+	},
+	{
+		.name = "LRCPC2",
+		.at_hwcap = AT_HWCAP,
+		.hwcap_bit = HWCAP_ILRCPC,
+		.cpuinfo = "ilrcpc",
+		.sigill_fn = ilrcpc_sigill,
+	},
 	{
 		.name = "MOPS",
 		.at_hwcap = AT_HWCAP2,
-- 
2.25.1
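Each table entry above pairs an auxiliary-vector key (.at_hwcap) with a bit (.hwcap_bit). How a program might query such a bit for itself can be sketched as follows; the helper name hwcap_has is hypothetical, and getauxval() from <sys/auxv.h> is assumed to be available (glibc 2.16 or later).

```c
#include <sys/auxv.h>

/*
 * Hypothetical helper: returns 1 if 'bit' is set in the auxiliary vector
 * entry 'type' (for example AT_HWCAP or AT_HWCAP2), 0 otherwise.
 * getauxval() returns 0 for entries the kernel did not provide, so a
 * missing entry simply reads as "feature absent".
 */
static int hwcap_has(unsigned long type, unsigned long bit)
{
	return (getauxval(type) & bit) != 0;
}
```

On arm64, with <asm/hwcap.h> included, `hwcap_has(AT_HWCAP, HWCAP_LRCPC)` would report the first capability the patch adds. The selftest goes further and cross-checks the bit against /proc/cpuinfo and against whether the probe instruction raises SIGILL.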

1 year, 11 months


[PATCH v4 0/6] kselfest/arm64: Fix a SVE memcpy() issue and use tools/include

by Mark Brown

Will noticed that with newer toolchains memcpy() ends up being implemented with SVE instructions, breaking the signals tests when in streaming mode. We fixed this by using an open coded version of OPTIMIZER_HIDE_VAR(), but in the process it was noticed that some of the selftests are using the tools/include headers and it might be nice to share things there. We also have a custom compiler.h in the BTI tests.

Update the tools/include headers to have what we need, pull them into the arm64 selftests build and make use of them in the signals and BTI tests. Since the resulting changes are a bit invasive for a fix we keep an initial patch using the open coding, updating and replacing that later.

Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Changes in v4:
- Roll in a refactoring to include and use the tools/include headers.
- Link to v3: https://lore.kernel.org/r/20230720-arm64-signal-memcpy-fix-v3-1-08aed2385d6…

Changes in v3:
- Open code OPTIMIZER_HIDE_VAR() instead of the memory clobber.
- Link to v2: https://lore.kernel.org/r/20230712-arm64-signal-memcpy-fix-v2-1-494f7025caf…

Changes in v2:
- Rebase onto v6.5-rc1.
- Link to v1: https://lore.kernel.org/r/20230628-arm64-signal-memcpy-fix-v1-1-db3e0300829…

---
Mark Brown (6):
  kselftest/arm64: Exit streaming mode after collecting signal context
  tools compiler.h: Add OPTIMIZER_HIDE_VAR()
  tools include: Add some common function attributes
  kselftest/arm64: Make the tools/include headers available
  kselftest/arm64: Use shared OPTIMZER_HIDE_VAR() definiton
  kselftest/arm64: Use the tools/include compiler.h rather than our own

 tools/include/linux/compiler.h                     | 18 +++++++++++++++
 tools/testing/selftests/arm64/Makefile             |  2 ++
 tools/testing/selftests/arm64/bti/compiler.h       | 21 -----------------
 tools/testing/selftests/arm64/bti/system.c         |  4 +---
 tools/testing/selftests/arm64/bti/system.h         |  4 ++--
 tools/testing/selftests/arm64/bti/test.c           |  1 -
 .../selftests/arm64/signal/test_signals_utils.h    | 27 +++++++++++++++++++++-
 7 files changed, 49 insertions(+), 28 deletions(-)
---
base-commit: 6eaae198076080886b9e7d57f4ae06fa782f90ef
change-id: 20230628-arm64-signal-memcpy-fix-7de3b3c8fa10

Best regards,
-- 
Mark Brown <broonie(a)kernel.org>
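To illustrate the kind of open coding the cover letter refers to, here is a sketch of a byte loop that avoids a library call. It is not the code from the series: the function name is hypothetical, the macro body is assumed to match the kernel's generic OPTIMIZER_HIDE_VAR() definition, and the empty asm only discourages the compiler from turning the loop back into a memset() call rather than guaranteeing it.

```c
#include <stddef.h>

/* Assumed to mirror the kernel's generic definition. */
#define OPTIMIZER_HIDE_VAR(var) __asm__ ("" : "=r" (var) : "0" (var))

/*
 * Hypothetical helper: zero a buffer byte by byte. The empty asm makes
 * buf[0] opaque on every iteration, which gets in the way of the
 * compiler's loop-to-memset() pattern matching.
 */
static void zero_bytes_no_libcall(unsigned char *buf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		buf[i] = 0;
		OPTIMIZER_HIDE_VAR(buf[0]);
	}
}
```

The asm emits no instructions; it only tells the compiler that the value may have changed, so the stored bytes are unaffected. Whether a given compiler still emits a library call is best confirmed by inspecting the generated assembly.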

1 year, 11 months

