October 2023 - Linux-kselftest-mirror

[PATCH] selftests: sud_test: return correct emulated syscall value on RISC-V

by Clément Léger

Currently, the sud_test expects the emulated syscall to return the emulated syscall number. This assumption only works on architectures were the syscall calling convention use the same register for syscall number/syscall return value. This is not the case for RISC-V and thus the return value must be also emulated using the provided ucontext. Signed-off-by: Clément Léger <cleger(a)rivosinc.com> --- tools/testing/selftests/syscall_user_dispatch/sud_test.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/tools/testing/selftests/syscall_user_dispatch/sud_test.c b/tools/testing/selftests/syscall_user_dispatch/sud_test.c index b5d592d4099e..1b5553c19700 100644 --- a/tools/testing/selftests/syscall_user_dispatch/sud_test.c +++ b/tools/testing/selftests/syscall_user_dispatch/sud_test.c @@ -158,6 +158,14 @@ static void handle_sigsys(int sig, siginfo_t *info, void *ucontext) /* In preparation for sigreturn. */ SYSCALL_DISPATCH_OFF(glob_sel); + + /* + * Modify interrupted context returned value according to syscall + * calling convention + */ +#if defined(__riscv) + ((ucontext_t*)ucontext)->uc_mcontext.__gregs[REG_A0] = MAGIC_SYSCALL_1; +#endif } TEST(dispatch_and_return) -- 2.40.1

1 year, 8 months

2
3
0 0

[PATCH 00/24] selftests/resctrl: CAT test improvements & generalized test framework

by Ilpo Järvinen

Hi all, Here's a series to improve resctrl selftests. It contains following improvements: - Excludes shareable bits from CAT test allocation to avoid interference - Alters read pattern to defeat HW prefetcher optimizations - Rewrites CAT test to make the CAT test reliable and truly measure if CAT is working or not - Introduces generalized test framework making easier to add new tests - Adds L2 CAT test - Lots of other cleanups & refactoring The patches up to CAT test rewrite have been earlier on the mailing list. I've tried to address all the comments made against them back then. This series have been tested across a large number of systems from different generations. Ilpo Järvinen (24): selftests/resctrl: Split fill_buf to allow tests finer-grained control selftests/resctrl: Refactor fill_buf functions selftests/resctrl: Refactor get_cbm_mask() selftests/resctrl: Mark get_cache_size() cache_type const selftests/resctrl: Create cache_size() helper selftests/resctrl: Exclude shareable bits from schemata in CAT test selftests/resctrl: Split measure_cache_vals() function selftests/resctrl: Split show_cache_info() to test specific and generic parts selftests/resctrl: Remove unnecessary __u64 -> unsigned long conversion selftests/resctrl: Remove nested calls in perf event handling selftests/resctrl: Consolidate naming of perf event related things selftests/resctrl: Improve perf init selftests/resctrl: Convert perf related globals to locals selftests/resctrl: Move cat_val() to cat_test.c and rename to cat_test() selftests/resctrl: Read in less obvious order to defeat prefetch optimizations selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test selftests/resctrl: Create struct for input parameter selftests/resctrl: Introduce generalized test framework selftests/resctrl: Pass write_schemata() resource instead of test name selftests/resctrl: Add helper to convert L2/3 to integer selftests/resctrl: Get resource id from cache id selftests/resctrl: Add test groups and name L3 CAT test L3_CAT selftests/resctrl: Add L2 CAT test selftests/resctrl: Ignore failures from L2 CAT test with <= 2 bits tools/testing/selftests/resctrl/cache.c | 263 +++--------- tools/testing/selftests/resctrl/cat_test.c | 386 ++++++++++++------ tools/testing/selftests/resctrl/cmt_test.c | 72 +++- tools/testing/selftests/resctrl/fill_buf.c | 114 +++--- tools/testing/selftests/resctrl/mba_test.c | 24 +- tools/testing/selftests/resctrl/mbm_test.c | 26 +- tools/testing/selftests/resctrl/resctrl.h | 102 ++++- .../testing/selftests/resctrl/resctrl_tests.c | 202 ++++----- tools/testing/selftests/resctrl/resctrl_val.c | 6 +- tools/testing/selftests/resctrl/resctrlfs.c | 234 +++++++---- 10 files changed, 807 insertions(+), 622 deletions(-) -- 2.30.2

1 year, 8 months

3
78
0 0

[PATCH v2 2/2] selftests/mm: Add a new test for madv and hugetlb

by Breno Leitao

Create a selftest that exercises the race between page faults and madvise(MADV_DONTNEED) in the same huge page. Do it by running two threads that touches the huge page and madvise(MADV_DONTNEED) at the same time. In case of a SIGBUS coming at pagefault, the test should fail, since we hit the bug. The test doesn't have a signal handler, and if it fails, it fails like the following ---------------------------------- running ./hugetlb_fault_after_madv ---------------------------------- ./run_vmtests.sh: line 186: 595563 Bus error (core dumped) "$@" [FAIL] This selftest goes together with the fix of the bug[1] itself. [1] https://lore.kernel.org/all/20231001005659.2185316-1-riel@surriel.com/#r Signed-off-by: Breno Leitao <leitao(a)debian.org> --- tools/testing/selftests/mm/Makefile | 1 + .../selftests/mm/hugetlb_fault_after_madv.c | 73 +++++++++++++++++++ tools/testing/selftests/mm/run_vmtests.sh | 4 + 3 files changed, 78 insertions(+) create mode 100644 tools/testing/selftests/mm/hugetlb_fault_after_madv.c diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile index 6a9fc5693145..e71ec9910c62 100644 --- a/tools/testing/selftests/mm/Makefile +++ b/tools/testing/selftests/mm/Makefile @@ -68,6 +68,7 @@ TEST_GEN_FILES += split_huge_page_test TEST_GEN_FILES += ksm_tests TEST_GEN_FILES += ksm_functional_tests TEST_GEN_FILES += mdwe_test +TEST_GEN_FILES += hugetlb_fault_after_madv ifneq ($(ARCH),arm64) TEST_GEN_PROGS += soft-dirty diff --git a/tools/testing/selftests/mm/hugetlb_fault_after_madv.c b/tools/testing/selftests/mm/hugetlb_fault_after_madv.c new file mode 100644 index 000000000000..73b81c632366 --- /dev/null +++ b/tools/testing/selftests/mm/hugetlb_fault_after_madv.c @@ -0,0 +1,73 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <pthread.h> +#include <stdio.h> +#include <stdlib.h> +#include <sys/mman.h> +#include <sys/types.h> +#include <unistd.h> + +#include "vm_util.h" +#include "../kselftest.h" + +#define MMAP_SIZE (1 << 21) +#define INLOOP_ITER 100 + +char *huge_ptr; + +/* Touch the memory while it is being madvised() */ +void *touch(void *unused) +{ + char *ptr = (char *)huge_ptr; + + for (int i = 0; i < INLOOP_ITER; i++) + ptr[0] = '.'; + + return NULL; +} + +void *madv(void *unused) +{ + usleep(rand() % 10); + + for (int i = 0; i < INLOOP_ITER; i++) + madvise(huge_ptr, MMAP_SIZE, MADV_DONTNEED); + + return NULL; +} + +int main(void) +{ + unsigned long free_hugepages; + pthread_t thread1, thread2; + /* + * On kernel 6.4, we are able to reproduce the problem with ~1000 + * interactions + */ + int max = 10000; + + srand(getpid()); + + free_hugepages = get_free_hugepages(); + if (free_hugepages != 1) { + ksft_exit_skip("This test needs one and only one page to execute. Got %lu\n", + free_hugepages); + } + + while (max--) { + huge_ptr = mmap(NULL, MMAP_SIZE, PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, + -1, 0); + + if ((unsigned long)huge_ptr == -1) + ksft_exit_skip("Failed to allocated huge page\n"); + + pthread_create(&thread1, NULL, madv, NULL); + pthread_create(&thread2, NULL, touch, NULL); + + pthread_join(thread1, NULL); + pthread_join(thread2, NULL); + munmap(huge_ptr, MMAP_SIZE); + } + + return KSFT_PASS; +} diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/selftests/mm/run_vmtests.sh index 3e2bc818d566..9f53f7318a38 100755 --- a/tools/testing/selftests/mm/run_vmtests.sh +++ b/tools/testing/selftests/mm/run_vmtests.sh @@ -221,6 +221,10 @@ CATEGORY="hugetlb" run_test ./hugepage-mremap CATEGORY="hugetlb" run_test ./hugepage-vmemmap CATEGORY="hugetlb" run_test ./hugetlb-madvise +# For this test, we need one and just one huge page +echo 1 > /proc/sys/vm/nr_hugepages +CATEGORY="hugetlb" run_test ./hugetlb_fault_after_madv + if test_selected "hugetlb"; then echo "NOTE: These hugetlb tests provide minimal coverage. Use" echo " https://github.com/libhugetlbfs/libhugetlbfs.git for" -- 2.34.1

1 year, 8 months

3
5
0 0

[PATCHv2 net] selftests: pmtu.sh: fix result checking

by Hangbin Liu

In the PMTU test, when all previous tests are skipped and the new test passes, the exit code is set to 0. However, the current check mistakenly treats this as an assignment, causing the check to pass every time. Consequently, regardless of how many tests have failed, if the latest test passes, the PMTU test will report a pass. Fixes: 2a9d3716b810 ("selftests: pmtu.sh: improve the test result processing") Signed-off-by: Hangbin Liu <liuhangbin(a)gmail.com> --- v2: use "-eq" instead of "=" to make less error-prone --- tools/testing/selftests/net/pmtu.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/testing/selftests/net/pmtu.sh b/tools/testing/selftests/net/pmtu.sh index f838dd370f6a..b3b2dc5a630c 100755 --- a/tools/testing/selftests/net/pmtu.sh +++ b/tools/testing/selftests/net/pmtu.sh @@ -2048,7 +2048,7 @@ run_test() { case $ret in 0) all_skipped=false - [ $exitcode=$ksft_skip ] && exitcode=0 + [ $exitcode -eq $ksft_skip ] && exitcode=0 ;; $ksft_skip) [ $all_skipped = true ] && exitcode=$ksft_skip -- 2.41.0

1 year, 8 months

3
2
0 0

[PATCH v4 0/5] userfaultfd move option

by Suren Baghdasaryan

This patch series introduces UFFDIO_MOVE feature to userfaultfd, which has long been implemented and maintained by Andrea in his local tree [1], but was not upstreamed due to lack of use cases where this approach would be better than allocating a new page and copying the contents. Previous upstraming attempts could be found at [6] and [7]. UFFDIO_COPY performs ~20% better than UFFDIO_MOVE when the application needs pages to be allocated [2]. However, with UFFDIO_MOVE, if pages are available (in userspace) for recycling, as is usually the case in heap compaction algorithms, then we can avoid the page allocation and memcpy (done by UFFDIO_COPY). Also, since the pages are recycled in the userspace, we avoid the need to release (via madvise) the pages back to the kernel [3]. We see over 40% reduction (on a Google pixel 6 device) in the compacting thread’s completion time by using UFFDIO_MOVE vs. UFFDIO_COPY. This was measured using a benchmark that emulates a heap compaction implementation using userfaultfd (to allow concurrent accesses by application threads). More details of the usecase are explained in [3]. Furthermore, UFFDIO_MOVE enables moving swapped-out pages without touching them within the same vma. Today, it can only be done by mremap, however it forces splitting the vma. TODOs for follow-up improvements: - cross-mm support. Known differences from single-mm and missing pieces: - memcg recharging (might need to isolate pages in the process) - mm counters - cross-mm deposit table moves - cross-mm test - document the address space where src and dest reside in struct uffdio_move - TLB flush batching. Will require extensive changes to PTL locking in move_pages_pte(). OTOH that might let us reuse parts of mremap code. Changes since v3 [8]: - changed retry path in folio_lock_anon_vma_read() to unlock and then relock RCU, per Peter Xu - removed cross-mm support from initial patchset, per David Hildenbrand - replaced BUG_ONs with VM_WARN_ON or WARN_ON_ONCE, per David Hildenbrand - added missing cache flushing, per Lokesh Gidra and Peter Xu - updated manpage text in the patch description, per Peter Xu - renamed internal functions from "remap" to "move", per Peter Xu - added mmap_changing check after taking mmap_lock, per Peter Xu - changed uffd context check to ensure dst_mm is registered onto uffd we are operating on, Peter Xu and David Hildenbrand - changed to non-maybe variants of maybe*_mkwrite(), per David Hildenbrand - fixed warning for CONFIG_TRANSPARENT_HUGEPAGE=n, per kernel test robot - comments cleanup, per David Hildenbrand and Peter Xu - checks for VM_IO,VM_PFNMAP,VM_HUGETLB,..., per David Hildenbrand - prevent moving pinned pages, per Peter Xu - changed uffd tests to call move uffd_test_ctx_clear() at the end of the test run instead of in the beginning of the next run - added support for testcase-specific ops - added test for moving PMD-aligned blocks Changes since v2 [5]: - renamed UFFDIO_REMAP to UFFDIO_MOVE, per David Hildenbrand - rebase over mm-unstable to use folio_move_anon_rmap(), per David Hildenbrand - added text for manpage explaining DONTFORK and KSM requirements for this feature, per David Hildenbrand - check for anon_vma changes in the fast path of folio_lock_anon_vma_read, per Peter Xu - updated the title and description of the first patch, per David Hildenbrand - updating comments in folio_lock_anon_vma_read() explaining the need for anon_vma checks, per David Hildenbrand - changed all mapcount checks to PageAnonExclusive, per Jann Horn and David Hildenbrand - changed counters in remap_swap_pte() from MM_ANONPAGES to MM_SWAPENTS, per Jann Horn - added a check for PTE change after folio is locked in remap_pages_pte(), per Jann Horn - added handling of PMD migration entries and bailout when pmd_devmap(), per Jann Horn - added checks to ensure both src and dst VMAs are writable, per Peter Xu - added UFFD_FEATURE_MOVE, per Peter Xu - removed obsolete comments, per Peter Xu - renamed remap_anon_pte to remap_present_pte, per Peter Xu - added a comment for folio_get_anon_vma() explaining the need for anon_vma checks, per Peter Xu - changed error handling in remap_pages() to make it more clear, per Peter Xu - changed EFAULT to EAGAIN to retry when a hugepage appears or disappears from under us, per Peter Xu - added links to previous upstreaming attempts, per David Hildenbrand Changes since v1 [4]: - add mmget_not_zero in userfaultfd_remap, per Jann Horn - removed extern from function definitions, per Matthew Wilcox - converted to folios in remap_pages_huge_pmd, per Matthew Wilcox - use PageAnonExclusive in remap_pages_huge_pmd, per David Hildenbrand - handle pgtable transfers between MMs, per Jann Horn - ignore concurrent A/D pte bit changes, per Jann Horn - split functions into smaller units, per David Hildenbrand - test for folio_test_large in remap_anon_pte, per Matthew Wilcox - use pte_swp_exclusive for swapcount check, per David Hildenbrand - eliminated use of mmu_notifier_invalidate_range_start_nonblock, per Jann Horn - simplified THP alignment checks, per Jann Horn - refactored the loop inside remap_pages, per Jann Horn - additional clarifying comments, per Jann Horn Main changes since Andrea's last version [1]: - Trivial translations from page to folio, mmap_sem to mmap_lock - Replace pmd_trans_unstable() with pte_offset_map_nolock() and handle its possible failure - Move pte mapping into remap_pages_pte to allow for retries when source page or anon_vma is contended. Since pte_offset_map_nolock() start RCU read section, we can't block anymore after mapping a pte, so have to unmap the ptesm do the locking and retry. - Add and use anon_vma_trylock_write() to avoid blocking while in RCU read section. - Accommodate changes in mmu_notifier_range_init() API, switch to mmu_notifier_invalidate_range_start_nonblock() to avoid blocking while in RCU read section. - Open-code now removed __swp_swapcount() - Replace pmd_read_atomic() with pmdp_get_lockless() - Add new selftest for UFFDIO_MOVE [1] https://gitlab.com/aarcange/aa/-/commit/2aec7aea56b10438a3881a20a411aa4b1fc… [2] https://lore.kernel.org/all/1425575884-2574-1-git-send-email-aarcange@redha… [3] https://lore.kernel.org/linux-mm/CA+EESO4uO84SSnBhArH4HvLNhaUQ5nZKNKXqxRCyj… [4] https://lore.kernel.org/all/20230914152620.2743033-1-surenb@google.com/ [5] https://lore.kernel.org/all/20230923013148.1390521-1-surenb@google.com/ [6] https://lore.kernel.org/all/1425575884-2574-21-git-send-email-aarcange@redh… [7] https://lore.kernel.org/all/cover.1547251023.git.blake.caldwell@colorado.ed… [8] https://lore.kernel.org/all/20231009064230.2952396-1-surenb@google.com/ The patchset applies over mm-unstable. Andrea Arcangeli (2): mm/rmap: support move to different root anon_vma in folio_move_anon_rmap() userfaultfd: UFFDIO_MOVE uABI Suren Baghdasaryan (3): selftests/mm: call uffd_test_ctx_clear at the end of the test selftests/mm: add uffd_test_case_ops to allow test case-specific operations selftests/mm: add UFFDIO_MOVE ioctl test Documentation/admin-guide/mm/userfaultfd.rst | 3 + fs/userfaultfd.c | 72 +++ include/linux/rmap.h | 5 + include/linux/userfaultfd_k.h | 11 + include/uapi/linux/userfaultfd.h | 29 +- mm/huge_memory.c | 122 ++++ mm/khugepaged.c | 3 + mm/rmap.c | 30 + mm/userfaultfd.c | 596 +++++++++++++++++++ tools/testing/selftests/mm/uffd-common.c | 51 +- tools/testing/selftests/mm/uffd-common.h | 11 + tools/testing/selftests/mm/uffd-stress.c | 5 +- tools/testing/selftests/mm/uffd-unit-tests.c | 144 +++++ 13 files changed, 1078 insertions(+), 4 deletions(-) -- 2.42.0.820.g83a721a137-goog

1 year, 8 months

4
13
0 0

[PATCH 0/3] selftests/nolibc: various build improvements

by Thomas Weißschuh

With the out-of-tree builds it's possible do incremental tests fairly fast: time ./run-tests.sh i386: 162 test(s): 162 passed, 0 skipped, 0 failed => status: success x86_64: 162 test(s): 162 passed, 0 skipped, 0 failed => status: success arm64: 162 test(s): 162 passed, 0 skipped, 0 failed => status: success arm: 162 test(s): 162 passed, 0 skipped, 0 failed => status: success mips: 162 test(s): 161 passed, 1 skipped, 0 failed => status: warning ppc: 162 test(s): 162 passed, 0 skipped, 0 failed => status: success ppc64: 162 test(s): 162 passed, 0 skipped, 0 failed => status: success ppc64le: 162 test(s): 162 passed, 0 skipped, 0 failed => status: success riscv: 162 test(s): 162 passed, 0 skipped, 0 failed => status: success s390: 162 test(s): 161 passed, 1 skipped, 0 failed => status: warning loongarch: 162 test(s): 161 passed, 1 skipped, 0 failed => status: warning real 1m56.226s user 2m42.457s sys 0m57.979s This is with an incremental kernel rebuild and testrun inside qemu. --- Changes in v2: - Drop already applied qemu-system-ppc64le patch - Drop config generation patch - Add Co-developed-by for out-of-tree patch - Link to v1: https://lore.kernel.org/lkml/20231010-nolibc-out-of-tree-v1-0-b6a263859596@… --- Thomas Weißschuh (3): selftests/nolibc: use EFI -bios for LoongArch qemu selftests/nolibc: anchor paths in $(srcdir) if possible selftests/nolibc: support out-of-tree builds tools/testing/selftests/nolibc/Makefile | 31 ++++++++++++++++++++++++------- 1 file changed, 24 insertions(+), 7 deletions(-) --- base-commit: 5a6a09e97199d6600d31383055f9d43fbbcbe86f change-id: 20231010-nolibc-out-of-tree-b6684c6cf0e3 Best regards, -- Thomas Weißschuh <linux(a)weissschuh.net>

1 year, 8 months

2
6
0 0

[PATCH v33 0/6] Implement IOCTL to get and optionally clear info about PTEs

by Muhammad Usama Anjum

*Changes in v33*: - Add PAGE_IS_FILE support for THPs *Changes in v31 and v32*: - Minor updates *Changes in v30*: - Rebase on top of next-20230815 - Minor nitpicks *Changes in v29:* - Polish IOCTL and improve documentation *Changes in v28:* - Fix walk_end and add 17 test cases in selftests patch *Changes in v27:* - Handle review comments and minor improvements - Add performance improvement patch on top with test for easy review *Changes in v26:* - Code re-structurring and API changes in PAGEMAP_IOCTL *Changes in v25*: - Do proper filtering on hole as well (hole got missed earlier) *Changes in v24*: - Rebase on top of next-20230710 - Place WP markers in case of hole as well *Changes in v23*: - Set vec_buf_index in loop only when vec_buf_index is set - Return -EFAULT instead of -EINVAL if vec is NULL - Correctly return the walk ending address to the page granularity *Changes in v22*: - Interface change: - Replace [start start + len) with [start, end) - Return the ending address of the address walk in start *Changes in v21*: - Abort walk instead of returning error if WP is to be performed on partial hugetlb *Changes in v20* - Correct PAGE_IS_FILE and add PAGE_IS_PFNZERO *Changes in v19* - Minor changes and interface updates *Changes in v18* - Rebase on top of next-20230613 - Minor updates *Changes in v17* - Rebase on top of next-20230606 - Minor improvements in PAGEMAP_SCAN IOCTL patch *Changes in v16* - Fix a corner case - Add exclusive PM_SCAN_OP_WP back *Changes in v15* - Build fix (Add missed build fix in RESEND) *Changes in v14* - Fix build error caused by #ifdef added at last minute in some configs *Changes in v13* - Rebase on top of next-20230414 - Give-up on using uffd_wp_range() and write new helpers, flush tlb only once *Changes in v12* - Update and other memory types to UFFD_FEATURE_WP_ASYNC - Rebaase on top of next-20230406 - Review updates *Changes in v11* - Rebase on top of next-20230307 - Base patches on UFFD_FEATURE_WP_UNPOPULATED - Do a lot of cosmetic changes and review updates - Remove ENGAGE_WP + !GET operation as it can be performed with UFFDIO_WRITEPROTECT *Changes in v10* - Add specific condition to return error if hugetlb is used with wp async - Move changes in tools/include/uapi/linux/fs.h to separate patch - Add documentation *Changes in v9:* - Correct fault resolution for userfaultfd wp async - Fix build warnings and errors which were happening on some configs - Simplify pagemap ioctl's code *Changes in v8:* - Update uffd async wp implementation - Improve PAGEMAP_IOCTL implementation *Changes in v7:* - Add uffd wp async - Update the IOCTL to use uffd under the hood instead of soft-dirty flags *Motivation* The real motivation for adding PAGEMAP_SCAN IOCTL is to emulate Windows GetWriteWatch() and ResetWriteWatch() syscalls [1]. The GetWriteWatch() retrieves the addresses of the pages that are written to in a region of virtual memory. This syscall is used in Windows applications and games etc. This syscall is being emulated in pretty slow manner in userspace. Our purpose is to enhance the kernel such that we translate it efficiently in a better way. Currently some out of tree hack patches are being used to efficiently emulate it in some kernels. We intend to replace those with these patches. So the whole gaming on Linux can effectively get benefit from this. It means there would be tons of users of this code. CRIU use case [2] was mentioned by Andrei and Danylo: > Use cases for migrating sparse VMAs are binaries sanitized with ASAN, > MSAN or TSAN [3]. All of these sanitizers produce sparse mappings of > shadow memory [4]. Being able to migrate such binaries allows to highly > reduce the amount of work needed to identify and fix post-migration > crashes, which happen constantly. Andrei's defines the following uses of this code: * it is more granular and allows us to track changed pages more effectively. The current interface can clear dirty bits for the entire process only. In addition, reading info about pages is a separate operation. It means we must freeze the process to read information about all its pages, reset dirty bits, only then we can start dumping pages. The information about pages becomes more and more outdated, while we are processing pages. The new interface solves both these downsides. First, it allows us to read pte bits and clear the soft-dirty bit atomically. It means that CRIU will not need to freeze processes to pre-dump their memory. Second, it clears soft-dirty bits for a specified region of memory. It means CRIU will have actual info about pages to the moment of dumping them. * The new interface has to be much faster because basic page filtering is happening in the kernel. With the old interface, we have to read pagemap for each page. *Implementation Evolution (Short Summary)* From the definition of GetWriteWatch(), we feel like kernel's soft-dirty feature can be used under the hood with some additions like: * reset soft-dirty flag for only a specific region of memory instead of clearing the flag for the entire process * get and clear soft-dirty flag for a specific region atomically So we decided to use ioctl on pagemap file to read or/and reset soft-dirty flag. But using soft-dirty flag, sometimes we get extra pages which weren't even written. They had become soft-dirty because of VMA merging and VM_SOFTDIRTY flag. This breaks the definition of GetWriteWatch(). We were able to by-pass this short coming by ignoring VM_SOFTDIRTY until David reported that mprotect etc messes up the soft-dirty flag while ignoring VM_SOFTDIRTY [5]. This wasn't happening until [6] got introduced. We discussed if we can revert these patches. But we could not reach to any conclusion. So at this point, I made couple of tries to solve this whole VM_SOFTDIRTY issue by correcting the soft-dirty implementation: * [7] Correct the bug fixed wrongly back in 2014. It had potential to cause regression. We left it behind. * [8] Keep a list of soft-dirty part of a VMA across splits and merges. I got the reply don't increase the size of the VMA by 8 bytes. At this point, we left soft-dirty considering it is too much delicate and userfaultfd [9] seemed like the only way forward. From there onward, we have been basing soft-dirty emulation on userfaultfd wp feature where kernel resolves the faults itself when WP_ASYNC feature is used. It was straight forward to add WP_ASYNC feature in userfautlfd. Now we get only those pages dirty or written-to which are really written in reality. (PS There is another WP_UNPOPULATED userfautfd feature is required which is needed to avoid pre-faulting memory before write-protecting [9].) All the different masks were added on the request of CRIU devs to create interface more generic and better. [1] https://learn.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-… [2] https://lore.kernel.org/all/20221014134802.1361436-1-mdanylo@google.com [3] https://github.com/google/sanitizers [4] https://github.com/google/sanitizers/wiki/AddressSanitizerAlgorithm#64-bit [5] https://lore.kernel.org/all/bfcae708-db21-04b4-0bbe-712badd03071@redhat.com [6] https://lore.kernel.org/all/20220725142048.30450-1-peterx@redhat.com/ [7] https://lore.kernel.org/all/20221122115007.2787017-1-usama.anjum@collabora.… [8] https://lore.kernel.org/all/20221220162606.1595355-1-usama.anjum@collabora.… [9] https://lore.kernel.org/all/20230306213925.617814-1-peterx@redhat.com [10] https://lore.kernel.org/all/20230125144529.1630917-1-mdanylo@google.com * Original Cover letter from v8* Hello, Note: Soft-dirty pages and pages which have been written-to are synonyms. As kernel already has soft-dirty feature inside which we have given up to use, we are using written-to terminology while using UFFD async WP under the hood. It is possible to find and clear soft-dirty pages entirely in userspace. But it isn't efficient: - The mprotect and SIGSEGV handler for bookkeeping - The userfaultfd wp (synchronous) with the handler for bookkeeping Some benchmarks can be seen here[1]. This series adds features that weren't present earlier: - There is no atomic get soft-dirty/Written-to status and clear present in the kernel. - The pages which have been written-to can not be found in accurate way. (Kernel's soft-dirty PTE bit + sof_dirty VMA bit shows more soft-dirty pages than there actually are.) Historically, soft-dirty PTE bit tracking has been used in the CRIU project. The procfs interface is enough for finding the soft-dirty bit status and clearing the soft-dirty bit of all the pages of a process. We have the use case where we need to track the soft-dirty PTE bit for only specific pages on-demand. We need this tracking and clear mechanism of a region of memory while the process is running to emulate the getWriteWatch() syscall of Windows. *(Moved to using UFFD instead of soft-dirty feature to find pages which have been written-to from v7 patch series)*: Stop using the soft-dirty flags for finding which pages have been written to. It is too delicate and wrong as it shows more soft-dirty pages than the actual soft-dirty pages. There is no interest in correcting it [2][3] as this is how the feature was written years ago. It shouldn't be updated to changed behaviour. Peter Xu has suggested using the async version of the UFFD WP [4] as it is based inherently on the PTEs. So in this patch series, I've added a new mode to the UFFD which is asynchronous version of the write protect. When this variant of the UFFD WP is used, the page faults are resolved automatically by the kernel. The pages which have been written-to can be found by reading pagemap file (!PM_UFFD_WP). This feature can be used successfully to find which pages have been written to from the time the pages were write protected. This works just like the soft-dirty flag without showing any extra pages which aren't soft-dirty in reality. The information related to pages if the page is file mapped, present and swapped is required for the CRIU project [5][6]. The addition of the required mask, any mask, excluded mask and return masks are also required for the CRIU project [5]. The IOCTL returns the addresses of the pages which match the specific masks. The page addresses are returned in struct page_region in a compact form. The max_pages is needed to support a use case where user only wants to get a specific number of pages. So there is no need to find all the pages of interest in the range when max_pages is specified. The IOCTL returns when the maximum number of the pages are found. The max_pages is optional. If max_pages is specified, it must be equal or greater than the vec_size. This restriction is needed to handle worse case when one page_region only contains info of one page and it cannot be compacted. This is needed to emulate the Windows getWriteWatch() syscall. The patch series include the detailed selftest which can be used as an example for the uffd async wp test and PAGEMAP_IOCTL. It shows the interface usages as well. [1] https://lore.kernel.org/lkml/54d4c322-cd6e-eefd-b161-2af2b56aae24@collabora… [2] https://lore.kernel.org/all/20221220162606.1595355-1-usama.anjum@collabora.… [3] https://lore.kernel.org/all/20221122115007.2787017-1-usama.anjum@collabora.… [4] https://lore.kernel.org/all/Y6Hc2d+7eTKs7AiH@x1n [5] https://lore.kernel.org/all/YyiDg79flhWoMDZB@gmail.com/ [6] https://lore.kernel.org/all/20221014134802.1361436-1-mdanylo@google.com/ Regards, Muhammad Usama Anjum Muhammad Usama Anjum (5): fs/proc/task_mmu: Implement IOCTL to get and optionally clear info about PTEs fs/proc/task_mmu: Add fast paths to get/clear PAGE_IS_WRITTEN flag tools headers UAPI: Update linux/fs.h with the kernel sources mm/pagemap: add documentation of PAGEMAP_SCAN IOCTL selftests: mm: add pagemap ioctl tests Peter Xu (1): userfaultfd: UFFD_FEATURE_WP_ASYNC Documentation/admin-guide/mm/pagemap.rst | 89 + Documentation/admin-guide/mm/userfaultfd.rst | 35 + fs/proc/task_mmu.c | 722 ++++++++ fs/userfaultfd.c | 26 +- include/linux/hugetlb.h | 1 + include/linux/userfaultfd_k.h | 28 +- include/uapi/linux/fs.h | 59 + include/uapi/linux/userfaultfd.h | 9 +- mm/hugetlb.c | 34 +- mm/memory.c | 28 +- tools/include/uapi/linux/fs.h | 59 + tools/testing/selftests/mm/.gitignore | 2 + tools/testing/selftests/mm/Makefile | 3 +- tools/testing/selftests/mm/config | 1 + tools/testing/selftests/mm/pagemap_ioctl.c | 1660 ++++++++++++++++++ tools/testing/selftests/mm/run_vmtests.sh | 4 + 16 files changed, 2736 insertions(+), 24 deletions(-) create mode 100644 tools/testing/selftests/mm/pagemap_ioctl.c -- 2.40.1

1 year, 8 months

3
12
0 0

[PATCH bpf-next v3 1/2] selftests/bpf: Convert CHECK macros to ASSERT_* macros in bpf_iter

by Yuran Pereira

As it was pointed out by Yonghong Song [1], in the bpf selftests the use of the ASSERT_* series of macros is preferred over the CHECK macro. This patch replaces all CHECK calls in bpf_iter with the appropriate ASSERT_* macros. [1] https://lore.kernel.org/lkml/0a142924-633c-44e6-9a92-2dc019656bf2@linux.dev Suggested-by: Yonghong Song <yonghong.song(a)linux.dev> Signed-off-by: Yuran Pereira <yuran.pereira(a)hotmail.com> --- .../selftests/bpf/prog_tests/bpf_iter.c | 79 ++++++++----------- 1 file changed, 35 insertions(+), 44 deletions(-) diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c index 1f02168103dd..123a3502b8f0 100644 --- a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c +++ b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c @@ -34,8 +34,6 @@ #include "bpf_iter_ksym.skel.h" #include "bpf_iter_sockmap.skel.h" -static int duration; - static void test_btf_id_or_null(void) { struct bpf_iter_test_kern3 *skel; @@ -64,7 +62,7 @@ static void do_dummy_read_opts(struct bpf_program *prog, struct bpf_iter_attach_ /* not check contents, but ensure read() ends without error */ while ((len = read(iter_fd, buf, sizeof(buf))) > 0) ; - CHECK(len < 0, "read", "read failed: %s\n", strerror(errno)); + ASSERT_GE(len, 0, "read"); close(iter_fd); @@ -413,7 +411,7 @@ static int do_btf_read(struct bpf_iter_task_btf *skel) goto free_link; } - if (CHECK(err < 0, "read", "read failed: %s\n", strerror(errno))) + if (!ASSERT_GE(err, 0, "read")) goto free_link; ASSERT_HAS_SUBSTR(taskbuf, "(struct task_struct)", @@ -526,11 +524,11 @@ static int do_read_with_fd(int iter_fd, const char *expected, start = 0; while ((len = read(iter_fd, buf + start, read_buf_len)) > 0) { start += len; - if (CHECK(start >= 16, "read", "read len %d\n", len)) + if (!ASSERT_LT(start, 16, "read")) return -1; read_buf_len = read_one_char ? 1 : 16 - start; } - if (CHECK(len < 0, "read", "read failed: %s\n", strerror(errno))) + if (!ASSERT_GE(len, 0, "read")) return -1; if (!ASSERT_STREQ(buf, expected, "read")) @@ -571,8 +569,7 @@ static int do_read(const char *path, const char *expected) int err, iter_fd; iter_fd = open(path, O_RDONLY); - if (CHECK(iter_fd < 0, "open", "open %s failed: %s\n", - path, strerror(errno))) + if (!ASSERT_GE(iter_fd, 0, "open")) return -1; err = do_read_with_fd(iter_fd, expected, false); @@ -600,7 +597,7 @@ static void test_file_iter(void) unlink(path); err = bpf_link__pin(link, path); - if (CHECK(err, "pin_iter", "pin_iter to %s failed: %d\n", path, err)) + if (!ASSERT_OK(err, "pin_iter")) goto free_link; err = do_read(path, "abcd"); @@ -651,12 +648,10 @@ static void test_overflow(bool test_e2big_overflow, bool ret1) * overflow and needs restart. */ map1_fd = bpf_map_create(BPF_MAP_TYPE_ARRAY, NULL, 4, 8, 1, NULL); - if (CHECK(map1_fd < 0, "bpf_map_create", - "map_creation failed: %s\n", strerror(errno))) + if (!ASSERT_GE(map1_fd, 0, "bpf_map_create")) goto out; map2_fd = bpf_map_create(BPF_MAP_TYPE_ARRAY, NULL, 4, 8, 1, NULL); - if (CHECK(map2_fd < 0, "bpf_map_create", - "map_creation failed: %s\n", strerror(errno))) + if (!ASSERT_GE(map2_fd, 0, "bpf_map_create")) goto free_map1; /* bpf_seq_printf kernel buffer is 8 pages, so one map @@ -685,14 +680,12 @@ static void test_overflow(bool test_e2big_overflow, bool ret1) /* setup filtering map_id in bpf program */ map_info_len = sizeof(map_info); err = bpf_map_get_info_by_fd(map1_fd, &map_info, &map_info_len); - if (CHECK(err, "get_map_info", "get map info failed: %s\n", - strerror(errno))) + if (!ASSERT_OK(err, "get_map_info")) goto free_map2; skel->bss->map1_id = map_info.id; err = bpf_map_get_info_by_fd(map2_fd, &map_info, &map_info_len); - if (CHECK(err, "get_map_info", "get map info failed: %s\n", - strerror(errno))) + if (!ASSERT_OK(err, "get_map_info")) goto free_map2; skel->bss->map2_id = map_info.id; @@ -714,16 +707,14 @@ static void test_overflow(bool test_e2big_overflow, bool ret1) while ((len = read(iter_fd, buf, expected_read_len)) > 0) total_read_len += len; - CHECK(len != -1 || errno != E2BIG, "read", - "expected ret -1, errno E2BIG, but get ret %d, error %s\n", - len, strerror(errno)); + ASSERT_EQ(len, -1, "read"); + ASSERT_EQ(errno, E2BIG, "read"); goto free_buf; } else if (!ret1) { while ((len = read(iter_fd, buf, expected_read_len)) > 0) total_read_len += len; - if (CHECK(len < 0, "read", "read failed: %s\n", - strerror(errno))) + if (!ASSERT_GE(len, 0, "read")) goto free_buf; } else { do { @@ -732,8 +723,7 @@ static void test_overflow(bool test_e2big_overflow, bool ret1) total_read_len += len; } while (len > 0 || len == -EAGAIN); - if (CHECK(len < 0, "read", "read failed: %s\n", - strerror(errno))) + if (!ASSERT_GE(len, 0, "read")) goto free_buf; } @@ -836,7 +826,7 @@ static void test_bpf_hash_map(void) /* do some tests */ while ((len = read(iter_fd, buf, sizeof(buf))) > 0) ; - if (CHECK(len < 0, "read", "read failed: %s\n", strerror(errno))) + if (!ASSERT_GE(len, 0, "read")) goto close_iter; /* test results */ @@ -917,7 +907,7 @@ static void test_bpf_percpu_hash_map(void) /* do some tests */ while ((len = read(iter_fd, buf, sizeof(buf))) > 0) ; - if (CHECK(len < 0, "read", "read failed: %s\n", strerror(errno))) + if (!ASSERT_GE(len, 0, "read")) goto close_iter; /* test results */ @@ -983,17 +973,14 @@ static void test_bpf_array_map(void) start = 0; while ((len = read(iter_fd, buf + start, sizeof(buf) - start)) > 0) start += len; - if (CHECK(len < 0, "read", "read failed: %s\n", strerror(errno))) + if (!ASSERT_GE(len, 0, "read")) goto close_iter; /* test results */ res_first_key = *(__u32 *)buf; res_first_val = *(__u64 *)(buf + sizeof(__u32)); - if (CHECK(res_first_key != 0 || res_first_val != first_val, - "bpf_seq_write", - "seq_write failure: first key %u vs expected 0, " - " first value %llu vs expected %llu\n", - res_first_key, res_first_val, first_val)) + if (!ASSERT_EQ(res_first_key, 0, "bpf_seq_write") || + !ASSERT_EQ(res_first_val, first_val, "bpf_seq_write")) goto close_iter; if (!ASSERT_EQ(skel->bss->key_sum, expected_key, "key_sum")) @@ -1092,7 +1079,7 @@ static void test_bpf_percpu_array_map(void) /* do some tests */ while ((len = read(iter_fd, buf, sizeof(buf))) > 0) ; - if (CHECK(len < 0, "read", "read failed: %s\n", strerror(errno))) + if (!ASSERT_GE(len, 0, "read")) goto close_iter; /* test results */ @@ -1131,6 +1118,7 @@ static void test_bpf_sk_storage_delete(void) sock_fd = socket(AF_INET6, SOCK_STREAM, 0); if (!ASSERT_GE(sock_fd, 0, "socket")) goto out; + err = bpf_map_update_elem(map_fd, &sock_fd, &val, BPF_NOEXIST); if (!ASSERT_OK(err, "map_update")) goto out; @@ -1151,14 +1139,19 @@ static void test_bpf_sk_storage_delete(void) /* do some tests */ while ((len = read(iter_fd, buf, sizeof(buf))) > 0) ; - if (CHECK(len < 0, "read", "read failed: %s\n", strerror(errno))) + if (!ASSERT_GE(len, 0, "read")) goto close_iter; /* test results */ err = bpf_map_lookup_elem(map_fd, &sock_fd, &val); - if (CHECK(!err || errno != ENOENT, "bpf_map_lookup_elem", - "map value wasn't deleted (err=%d, errno=%d)\n", err, errno)) - goto close_iter; + + /* Note: The following assertions serve to ensure + * the value was deleted. It does so by asserting + * that bpf_map_lookup_elem has failed. This might + * seem counterintuitive at first. + */ + ASSERT_ERR(err, "bpf_map_lookup_elem"); + ASSERT_EQ(errno, ENOENT, "bpf_map_lookup_elem"); close_iter: close(iter_fd); @@ -1203,17 +1196,15 @@ static void test_bpf_sk_storage_get(void) do_dummy_read(skel->progs.fill_socket_owner); err = bpf_map_lookup_elem(map_fd, &sock_fd, &val); - if (CHECK(err || val != getpid(), "bpf_map_lookup_elem", - "map value wasn't set correctly (expected %d, got %d, err=%d)\n", - getpid(), val, err)) + if (!ASSERT_OK(err, "bpf_map_lookup_elem") || + !ASSERT_EQ(val, getpid(), "bpf_map_lookup_elem")) goto close_socket; do_dummy_read(skel->progs.negate_socket_local_storage); err = bpf_map_lookup_elem(map_fd, &sock_fd, &val); - CHECK(err || val != -getpid(), "bpf_map_lookup_elem", - "map value wasn't set correctly (expected %d, got %d, err=%d)\n", - -getpid(), val, err); + ASSERT_OK(err, "bpf_map_lookup_elem"); + ASSERT_EQ(val, -getpid(), "bpf_map_lookup_elem"); close_socket: close(sock_fd); @@ -1290,7 +1281,7 @@ static void test_bpf_sk_storage_map(void) /* do some tests */ while ((len = read(iter_fd, buf, sizeof(buf))) > 0) ; - if (CHECK(len < 0, "read", "read failed: %s\n", strerror(errno))) + if (!ASSERT_GE(len, 0, "read")) goto close_iter; /* test results */ -- 2.25.1

1 year, 8 months

4
3
0 0

[PATCH v2] selftests/net: synchronize udpgso_bench rx and tx

by Lucas Karpinski

The sockets used by udpgso_bench_tx aren't always ready when udpgso_bench_tx transmits packets. This issue is more prevalent in -rt kernels, but can occur in both. Replace the hacky sleep calls with a function that checks whether the ports in the namespace are ready for use. Suggested-by: Paolo Abeni <pabeni(a)redhat.com> Signed-off-by: Lucas Karpinski <lkarpins(a)redhat.com> --- https://lore.kernel.org/all/t7v6mmuobrbucyfpwqbcujtvpa3wxnsrc36cz5rr6kzzrzk… Changelog v2: - applied synchronization method suggested by Pablo - changed commit message to code tools/testing/selftests/net/udpgro.sh | 27 ++++++++++++++----- tools/testing/selftests/net/udpgro_bench.sh | 19 +++++++++++-- tools/testing/selftests/net/udpgro_frglist.sh | 19 +++++++++++-- 3 files changed, 54 insertions(+), 11 deletions(-) diff --git a/tools/testing/selftests/net/udpgro.sh b/tools/testing/selftests/net/udpgro.sh index 0c743752669a..04792a315729 100755 --- a/tools/testing/selftests/net/udpgro.sh +++ b/tools/testing/selftests/net/udpgro.sh @@ -24,6 +24,22 @@ cleanup() { } trap cleanup EXIT +wait_local_port_listen() +{ + local port="${1}" + + local port_hex + port_hex="$(printf "%04X" "${port}")" + + local i + for i in $(seq 10); do + ip netns exec "${PEER_NS}" cat /proc/net/udp* | \ + awk "BEGIN {rc=1} {if (\$2 ~ /:${port_hex}\$/) {rc=0; exit}} END {exit rc}" && + break + sleep 0.1 + done +} + cfg_veth() { ip netns add "${PEER_NS}" ip -netns "${PEER_NS}" link set lo up @@ -51,8 +67,7 @@ run_one() { echo "ok" || \ echo "failed" & - # Hack: let bg programs complete the startup - sleep 0.2 + wait_local_port_listen 8000 ./udpgso_bench_tx ${tx_args} ret=$? wait $(jobs -p) @@ -97,7 +112,7 @@ run_one_nat() { echo "ok" || \ echo "failed"& - sleep 0.1 + wait_local_port_listen 8000 ./udpgso_bench_tx ${tx_args} ret=$? kill -INT $pid @@ -118,11 +133,9 @@ run_one_2sock() { echo "ok" || \ echo "failed" & - # Hack: let bg programs complete the startup - sleep 0.2 + wait_local_port_listen 12345 ./udpgso_bench_tx ${tx_args} -p 12345 - sleep 0.1 - # first UDP GSO socket should be closed at this point + wait_local_port_listen 8000 ./udpgso_bench_tx ${tx_args} ret=$? wait $(jobs -p) diff --git a/tools/testing/selftests/net/udpgro_bench.sh b/tools/testing/selftests/net/udpgro_bench.sh index 894972877e8b..91833518e80b 100755 --- a/tools/testing/selftests/net/udpgro_bench.sh +++ b/tools/testing/selftests/net/udpgro_bench.sh @@ -16,6 +16,22 @@ cleanup() { } trap cleanup EXIT +wait_local_port_listen() +{ + local port="${1}" + + local port_hex + port_hex="$(printf "%04X" "${port}")" + + local i + for i in $(seq 10); do + ip netns exec "${PEER_NS}" cat /proc/net/udp* | \ + awk "BEGIN {rc=1} {if (\$2 ~ /:${port_hex}\$/) {rc=0; exit}} END {exit rc}" && + break + sleep 0.1 + done +} + run_one() { # use 'rx' as separator between sender args and receiver args local -r all="$@" @@ -40,8 +56,7 @@ run_one() { ip netns exec "${PEER_NS}" ./udpgso_bench_rx ${rx_args} -r & ip netns exec "${PEER_NS}" ./udpgso_bench_rx -t ${rx_args} -r & - # Hack: let bg programs complete the startup - sleep 0.2 + wait_local_port_listen 8000 ./udpgso_bench_tx ${tx_args} } diff --git a/tools/testing/selftests/net/udpgro_frglist.sh b/tools/testing/selftests/net/udpgro_frglist.sh index 0a6359bed0b9..0aa2068f122c 100755 --- a/tools/testing/selftests/net/udpgro_frglist.sh +++ b/tools/testing/selftests/net/udpgro_frglist.sh @@ -16,6 +16,22 @@ cleanup() { } trap cleanup EXIT +wait_local_port_listen() +{ + local port="${1}" + + local port_hex + port_hex="$(printf "%04X" "${port}")" + + local i + for i in $(seq 10); do + ip netns exec "${PEER_NS}" cat /proc/net/udp* | \ + awk "BEGIN {rc=1} {if (\$2 ~ /:${port_hex}\$/) {rc=0; exit}} END {exit rc}" && + break + sleep 0.1 + done +} + run_one() { # use 'rx' as separator between sender args and receiver args local -r all="$@" @@ -45,8 +61,7 @@ run_one() { echo ${rx_args} ip netns exec "${PEER_NS}" ./udpgso_bench_rx ${rx_args} -r & - # Hack: let bg programs complete the startup - sleep 0.2 + wait_local_port_listen 8000 ./udpgso_bench_tx ${tx_args} } -- 2.41.0

1 year, 8 months

3
6
0 0

selftests: user_events: ftrace_test - RIP: 0010:tracing_update_buffers (kernel/trace/trace.c:6470)

by Naresh Kamboju

Following kernel crash noticed on x86_64 while running selftests: user_events: ftrace_test running 6.6.0-rc7-next-20231026. Reported-by: Linux Kernel Functional Testing <lkft(a)linaro.org> kselftest: Running tests in user_events TAP version 13 1..4 # timeout set to 90 # selftests: user_events: ftrace_test [ 2391.606817] general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b8a83: 0000 [#1] PREEMPT SMP PTI [ 2391.617519] CPU: 1 PID: 34662 Comm: ftrace_test Not tainted 6.6.0-rc7-next-20231026 #1 [ 2391.625428] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS 2.7 12/07/2021 [ 2391.632811] RIP: 0010:tracing_update_buffers (kernel/trace/trace.c:6470) [ 2391.637952] Code: 90 90 90 90 90 90 90 90 90 90 90 90 66 0f 1f 00 55 31 f6 48 89 e5 41 55 41 54 53 48 89 fb 48 c7 c7 40 8c 61 94 e8 92 d3 5a 01 <44> 0f b6 a3 18 1f 00 00 41 80 fc 01 0f 87 c8 dc 4e 01 45 31 ed 41 All code ======== 0: 90 nop 1: 90 nop 2: 90 nop 3: 90 nop 4: 90 nop 5: 90 nop 6: 90 nop 7: 90 nop 8: 90 nop 9: 90 nop a: 90 nop b: 90 nop c: 66 0f 1f 00 nopw (%rax) 10: 55 push %rbp 11: 31 f6 xor %esi,%esi 13: 48 89 e5 mov %rsp,%rbp 16: 41 55 push %r13 18: 41 54 push %r12 1a: 53 push %rbx 1b: 48 89 fb mov %rdi,%rbx 1e: 48 c7 c7 40 8c 61 94 mov $0xffffffff94618c40,%rdi 25: e8 92 d3 5a 01 callq 0x15ad3bc 2a:* 44 0f b6 a3 18 1f 00 movzbl 0x1f18(%rbx),%r12d <-- trapping instruction 31: 00 32: 41 80 fc 01 cmp $0x1,%r12b 36: 0f 87 c8 dc 4e 01 ja 0x14edd04 3c: 45 31 ed xor %r13d,%r13d 3f: 41 rex.B Code starting with the faulting instruction =========================================== 0: 44 0f b6 a3 18 1f 00 movzbl 0x1f18(%rbx),%r12d 7: 00 8: 41 80 fc 01 cmp $0x1,%r12b c: 0f 87 c8 dc 4e 01 ja 0x14edcda 12: 45 31 ed xor %r13d,%r13d 15: 41 rex.B [ 2391.656696] RSP: 0018:ffffb36e0a477d80 EFLAGS: 00010246 [ 2391.661937] RAX: 0000000000000000 RBX: 6b6b6b6b6b6b6b6b RCX: 0000000080000000 [ 2391.669064] RDX: 0000000000000000 RSI: ffffffff9299b722 RDI: ffffffff9299b722 [ 2391.676195] RBP: ffffb36e0a477d98 R08: 000000000000002f R09: 0000000000000002 [ 2391.683321] R10: ffffb36e0a477d70 R11: 0000000000000000 R12: 0000000000000002 [ 2391.690453] R13: ffffb36e0a477e88 R14: ffff99c5803a2230 R15: ffff99c581c39000 [ 2391.697586] FS: 00007fb4b9681740(0000) GS:ffff99c6efa80000(0000) knlGS:0000000000000000 [ 2391.705670] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2391.711410] CR2: 00007fb4b96ab5e0 CR3: 000000010635c002 CR4: 00000000003706f0 [ 2391.718540] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 2391.725665] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 2391.732797] Call Trace: [ 2391.735240] <TASK> [ 2391.737339] ? show_regs (arch/x86/kernel/dumpstack.c:479) [ 2391.740744] ? die_addr (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:460) [ 2391.744056] ? exc_general_protection (arch/x86/kernel/traps.c:697 arch/x86/kernel/traps.c:642) [ 2391.748766] ? asm_exc_general_protection (arch/x86/include/asm/idtentry.h:564) [ 2391.753652] ? __mutex_lock (kernel/locking/mutex.c:613 (discriminator 3) kernel/locking/mutex.c:747 (discriminator 3)) [ 2391.757487] ? __mutex_lock (kernel/locking/mutex.c:613 (discriminator 3) kernel/locking/mutex.c:747 (discriminator 3)) [ 2391.761318] ? tracing_update_buffers (kernel/trace/trace.c:6470) [ 2391.765851] event_enable_write (kernel/trace/trace_events.c:1408) [ 2391.769976] vfs_write (fs/read_write.c:582) [ 2391.773296] ? close_fd_get_file (fs/file.c:821) [ 2391.777396] ? preempt_count_sub (kernel/sched/core.c:5857 kernel/sched/core.c:5853 kernel/sched/core.c:5875) [ 2391.781496] ksys_write (fs/read_write.c:638) [ 2391.784918] __x64_sys_write (fs/read_write.c:646) [ 2391.788671] do_syscall_64 (arch/x86/entry/common.c:51 arch/x86/entry/common.c:82) [ 2391.792248] ? do_syscall_64 (arch/x86/entry/common.c:101) [ 2391.795995] ? syscall_exit_to_user_mode (kernel/entry/common.c:299) [ 2391.800785] ? do_syscall_64 (arch/x86/entry/common.c:101) [ 2391.804529] ? do_syscall_64 (arch/x86/entry/common.c:101) [ 2391.808275] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129) [ 2391.813327] RIP: 0033:0x7fb4b977c140 [ 2391.816920] Code: 40 00 48 8b 15 c1 9c 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 80 3d a1 24 0e 00 00 74 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 48 89 All code ======== 0: 40 00 48 8b add %cl,-0x75(%rax) 4: 15 c1 9c 0d 00 adc $0xd9cc1,%eax 9: f7 d8 neg %eax b: 64 89 02 mov %eax,%fs:(%rdx) e: 48 c7 c0 ff ff ff ff mov $0xffffffffffffffff,%rax 15: eb b7 jmp 0xffffffffffffffce 17: 0f 1f 00 nopl (%rax) 1a: 80 3d a1 24 0e 00 00 cmpb $0x0,0xe24a1(%rip) # 0xe24c2 21: 74 17 je 0x3a 23: b8 01 00 00 00 mov $0x1,%eax 28: 0f 05 syscall 2a:* 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax <-- trapping instruction 30: 77 58 ja 0x8a 32: c3 retq 33: 0f 1f 80 00 00 00 00 nopl 0x0(%rax) 3a: 48 83 ec 28 sub $0x28,%rsp 3e: 48 rex.W 3f: 89 .byte 0x89 Code starting with the faulting instruction =========================================== 0: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax 6: 77 58 ja 0x60 8: c3 retq 9: 0f 1f 80 00 00 00 00 nopl 0x0(%rax) 10: 48 83 ec 28 sub $0x28,%rsp 14: 48 rex.W 15: 89 .byte 0x89 [ 2391.835660] RSP: 002b:00007ffc43b05b38 EFLAGS: 00000202 ORIG_RAX: 0000000000000001 [ 2391.843225] RAX: ffffffffffffffda RBX: 00007ffc43b05d88 RCX: 00007fb4b977c140 [ 2391.850350] RDX: 0000000000000002 RSI: 000056376b59b7d4 RDI: 0000000000000007 [ 2391.857482] RBP: 00007ffc43b05b60 R08: 0000000000000000 R09: 00007fb4b9681740 [ 2391.864615] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000 [ 2391.871747] R13: 00007ffc43b05d98 R14: 000056376b59ddc8 R15: 00007fb4b9981020 [ 2391.878907] </TASK> [ 2391.881106] Modules linked in: x86_pkg_temp_thermal fuse configfs [ 2391.887288] ---[ end trace 0000000000000000 ]--- [ 2391.891915] RIP: 0010:tracing_update_buffers (kernel/trace/trace.c:6470) [ 2391.897231] Code: 90 90 90 90 90 90 90 90 90 90 90 90 66 0f 1f 00 55 31 f6 48 89 e5 41 55 41 54 53 48 89 fb 48 c7 c7 40 8c 61 94 e8 92 d3 5a 01 <44> 0f b6 a3 18 1f 00 00 41 80 fc 01 0f 87 c8 dc 4e 01 45 31 ed 41 All code ======== 0: 90 nop 1: 90 nop 2: 90 nop 3: 90 nop 4: 90 nop 5: 90 nop 6: 90 nop 7: 90 nop 8: 90 nop 9: 90 nop a: 90 nop b: 90 nop c: 66 0f 1f 00 nopw (%rax) 10: 55 push %rbp 11: 31 f6 xor %esi,%esi 13: 48 89 e5 mov %rsp,%rbp 16: 41 55 push %r13 18: 41 54 push %r12 1a: 53 push %rbx 1b: 48 89 fb mov %rdi,%rbx 1e: 48 c7 c7 40 8c 61 94 mov $0xffffffff94618c40,%rdi 25: e8 92 d3 5a 01 callq 0x15ad3bc 2a:* 44 0f b6 a3 18 1f 00 movzbl 0x1f18(%rbx),%r12d <-- trapping instruction 31: 00 32: 41 80 fc 01 cmp $0x1,%r12b 36: 0f 87 c8 dc 4e 01 ja 0x14edd04 3c: 45 31 ed xor %r13d,%r13d 3f: 41 rex.B Code starting with the faulting instruction =========================================== 0: 44 0f b6 a3 18 1f 00 movzbl 0x1f18(%rbx),%r12d 7: 00 8: 41 80 fc 01 cmp $0x1,%r12b c: 0f 87 c8 dc 4e 01 ja 0x14edcda 12: 45 31 ed xor %r13d,%r13d 15: 41 rex.B [ 2391.916120] RSP: 0018:ffffb36e0a477d80 EFLAGS: 00010246 [ 2391.921569] RAX: 0000000000000000 RBX: 6b6b6b6b6b6b6b6b RCX: 0000000080000000 [ 2391.928872] RDX: 0000000000000000 RSI: ffffffff9299b722 RDI: ffffffff9299b722 [ 2391.936237] RBP: ffffb36e0a477d98 R08: 000000000000002f R09: 0000000000000002 [ 2391.943388] R10: ffffb36e0a477d70 R11: 0000000000000000 R12: 0000000000000002 [ 2391.950527] R13: ffffb36e0a477e88 R14: ffff99c5803a2230 R15: ffff99c581c39000 [ 2391.957670] FS: 00007fb4b9681740(0000) GS:ffff99c6efa80000(0000) knlGS:0000000000000000 [ 2391.965822] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2391.971579] CR2: 00007fb4b96ab5e0 CR3: 000000010635c002 CR4: 00000000003706f0 [ 2391.978721] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 2391.985879] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 2391.993028] Kernel panic - not syncing: Fatal exception [ 2391.998287] Kernel Offset: 0x10000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) [ 2392.009066] ---[ end Kernel panic - not syncing: Fatal exception ]--- Links: - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20231026/te… - https://lkft.validation.linaro.org/scheduler/job/6974179#L5053 metadata: git_ref: master git_repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next git_sha: 2ef7141596eed0b4b45ef18b3626f428a6b0a822 git_describe: next-20231026 kernel_version: 6.6.0-rc7 kernel-config: https://storage.tuxsuite.com/public/linaro/lkft/builds/2XHt24sNSdog7DYY3FLK… artifact-location: https://storage.tuxsuite.com/public/linaro/lkft/builds/2XHt24sNSdog7DYY3FLK… toolchain: gcc-13 -- Linaro LKFT https://lkft.linaro.org

1 year, 8 months

3
12
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-kselftest-mirror October 2023