From: Jeff Xu <jeffxu@chromium.org>
mremap doesn't allow relocating, expanding, or shrinking across VMA boundaries. Refactor the code to check the src address range before doing anything to the destination, i.e. the destination won't be unmapped if the src address fails the boundary check.
This also allows us to remove can_modify_mm from mremap.c: since the src address range must be a single VMA, can_modify_vma is used instead.
This is likely to improve mremap performance: previously the code did the sealing check on the src address range using can_modify_mm, and the new code removes the loop used by can_modify_mm.
In order to verify this patch doesn't regress mremap, I added tests in mseal_test. The test patch can be applied before the mremap refactor patch, or checked in independently.
Also, this patch doesn't change mseal's existing semantics: if sealing fails, user space can expect the src/dst addresses to be unchanged. So this patch can be applied regardless of whether we go with the current out-of-loop approach or the in-loop approach currently under discussion.
Regarding the perf test report from stress-ng [1], titled "8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression":
The test uses the following command: stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --pagemove 64
I can't reproduce this on ChromeOS; the pagemove test shows large stddev and stderr values and can't reasonably reflect the performance impact.
For example, I wrote a C program [2] to run the above pagemove test 10 times and calculate the stddev and stderr for 3 commits:
1> before the mseal feature was added: Ops/sec: Mean: 3564.40, Std Dev: 2737.35 (76.80% of Mean), Std Err: 865.63 (24.29% of Mean)
2> after the mseal feature was added: Ops/sec: Mean: 2703.84, Std Dev: 2085.13 (77.12% of Mean), Std Err: 659.38 (24.39% of Mean)
3> after the current patch (mremap refactor): Ops/sec: Mean: 3603.67, Std Dev: 2422.22 (67.22% of Mean), Std Err: 765.97 (21.26% of Mean)
The results show 21%-24% stderr, which means whatever perf improvement/impact there might be won't be measured correctly by this test.
This test machine has 32G memory and an Intel(R) Celeron(R) 7305 with 5 CPUs. I rebooted the machine before each test and took the first 10 runs with run_stress_ng 10.
(I will run a longer duration to see if the test still shows a large StdDev/StdErr.)
[1] https://lore.kernel.org/lkml/202408041602.caa0372-oliver.sang@intel.com/ [2] https://github.com/peaktocreek/mmperf/blob/main/run_stress_ng.c
Jeff Xu (2): mseal: selftest mremap across VMA boundaries. mseal: refactor mremap to remove can_modify_mm
 mm/internal.h                           |  24 ++
 mm/mremap.c                             |  77 +++----
 mm/mseal.c                              |  17 --
 tools/testing/selftests/mm/mseal_test.c | 293 +++++++++++++++++++++++-
 4 files changed, 353 insertions(+), 58 deletions(-)
From: Jeff Xu <jeffxu@chromium.org>
Add selftests for mremap across VMA boundaries, i.e. cases where mremap will fail.
Signed-off-by: Jeff Xu <jeffxu@chromium.org>
---
 tools/testing/selftests/mm/mseal_test.c | 293 +++++++++++++++++++++++-
 1 file changed, 292 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/mm/mseal_test.c b/tools/testing/selftests/mm/mseal_test.c index 5bce2fe102ab..422cf90fb56c 100644 --- a/tools/testing/selftests/mm/mseal_test.c +++ b/tools/testing/selftests/mm/mseal_test.c @@ -1482,6 +1482,47 @@ static void test_seal_mremap_move_dontunmap_anyaddr(bool seal) REPORT_TEST_PASS(); }
+static void test_seal_mremap_move_dontunmap_allocated(bool seal) +{ + void *ptr, *ptr2; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + void *ret2; + + setup_single_address(size, &ptr); + FAIL_TEST_IF_FALSE(ptr != (void *)-1); + + if (seal) { + ret = sys_mseal(ptr, size); + FAIL_TEST_IF_FALSE(!ret); + } + + /* + * The new address is allocated. + */ + setup_single_address(size, &ptr2); + FAIL_TEST_IF_FALSE(ptr2 != (void *)-1); + + /* + * remap to allocated address. + */ + ret2 = sys_mremap(ptr, size, size, MREMAP_MAYMOVE | MREMAP_DONTUNMAP, + (void *) ptr2); + if (seal) { + FAIL_TEST_IF_FALSE(ret2 == MAP_FAILED); + FAIL_TEST_IF_FALSE(errno == EPERM); + } else { + /* remap succeeds, but it won't be ptr2 */ + FAIL_TEST_IF_FALSE(!(ret2 == MAP_FAILED)); + FAIL_TEST_IF_FALSE(ret2 != ptr2); + } + + REPORT_TEST_PASS(); +} + + + static void test_seal_merge_and_split(void) { void *ptr; @@ -1746,6 +1787,239 @@ static void test_seal_discard_ro_anon(bool seal) REPORT_TEST_PASS(); }
+static void test_seal_mremap_shrink_multiple_vmas(bool seal) +{ + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 12 * page_size; + int ret; + void *ret2; + int prot; + + setup_single_address(size, &ptr); + FAIL_TEST_IF_FALSE(ptr != (void *)-1); + + ret = sys_mprotect(ptr + 4 * page_size, 4 * page_size, PROT_NONE); + FAIL_TEST_IF_FALSE(!ret); + + size = get_vma_size(ptr, &prot); + FAIL_TEST_IF_FALSE(size == 4 * page_size); + + size = get_vma_size(ptr + 4 * page_size, &prot); + FAIL_TEST_IF_FALSE(size == 4 * page_size); + + if (seal) { + ret = sys_mseal(ptr + 4 * page_size, 4 * page_size); + FAIL_TEST_IF_FALSE(!ret); + } + + ret2 = sys_mremap(ptr, 12 * page_size, 6 * page_size, 0, 0); + if (seal) { + FAIL_TEST_IF_FALSE(ret2 == (void *) MAP_FAILED); + FAIL_TEST_IF_FALSE(errno == EPERM); + + size = get_vma_size(ptr, &prot); + FAIL_TEST_IF_FALSE(size == 4 * page_size); + FAIL_TEST_IF_FALSE(prot == 0x4); + + size = get_vma_size(ptr + 4 * page_size, &prot); + FAIL_TEST_IF_FALSE(size == 4 * page_size); + FAIL_TEST_IF_FALSE(prot == 0x0); + } else { + FAIL_TEST_IF_FALSE(ret2 == ptr); + + size = get_vma_size(ptr, &prot); + FAIL_TEST_IF_FALSE(size == 4 * page_size); + + size = get_vma_size(ptr + 4 * page_size, &prot); + FAIL_TEST_IF_FALSE(size == 2 * page_size); + } + + REPORT_TEST_PASS(); +} + +static void test_seal_mremap_expand_multiple_vmas(bool seal) +{ + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 12 * page_size; + int ret; + void *ret2; + int prot; + + setup_single_address(size, &ptr); + FAIL_TEST_IF_FALSE(ptr != (void *)-1); + + ret = sys_mprotect(ptr + 4 * page_size, 4 * page_size, PROT_NONE); + FAIL_TEST_IF_FALSE(!ret); + + /* unmap last 4 pages. 
*/ + ret = sys_munmap(ptr + 8 * page_size, 4 * page_size); + FAIL_TEST_IF_FALSE(!ret); + + size = get_vma_size(ptr, &prot); + FAIL_TEST_IF_FALSE(size == 4 * page_size); + + size = get_vma_size(ptr + 4 * page_size, &prot); + FAIL_TEST_IF_FALSE(size == 4 * page_size); + + if (seal) { + ret = sys_mseal(ptr + 4 * page_size, 4 * page_size); + FAIL_TEST_IF_FALSE(!ret); + } + + ret2 = sys_mremap(ptr, 8 * page_size, 12 * page_size, 0, 0); + if (seal) { + FAIL_TEST_IF_FALSE(ret2 == (void *) MAP_FAILED); + + size = get_vma_size(ptr, &prot); + FAIL_TEST_IF_FALSE(size == 4 * page_size); + FAIL_TEST_IF_FALSE(prot == 0x4); + + size = get_vma_size(ptr + 4 * page_size, &prot); + FAIL_TEST_IF_FALSE(size == 4 * page_size); + FAIL_TEST_IF_FALSE(prot == 0x0); + } else { + FAIL_TEST_IF_FALSE(ret2 == (void *) MAP_FAILED); + + size = get_vma_size(ptr, &prot); + FAIL_TEST_IF_FALSE(size == 4 * page_size); + + size = get_vma_size(ptr + 4 * page_size, &prot); + FAIL_TEST_IF_FALSE(size == 4 * page_size); + + } + + REPORT_TEST_PASS(); +} + +static void test_seal_mremap_move_expand_multiple_vmas(bool seal) +{ + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 12 * page_size; + int ret; + void *ret2; + int prot; + void *ptr2; + + setup_single_address(size, &ptr); + FAIL_TEST_IF_FALSE(ptr != (void *)-1); + + setup_single_address(size, &ptr2); + FAIL_TEST_IF_FALSE(ptr2 != (void *)-1); + + ret = sys_munmap(ptr2, 12 * page_size); + FAIL_TEST_IF_FALSE(!ret); + + ret = sys_mprotect(ptr + 4 * page_size, 4 * page_size, PROT_NONE); + FAIL_TEST_IF_FALSE(!ret); + + /* unmap last 4 pages. 
*/ + ret = sys_munmap(ptr + 8 * page_size, 4 * page_size); + FAIL_TEST_IF_FALSE(!ret); + + size = get_vma_size(ptr, &prot); + FAIL_TEST_IF_FALSE(size == 4 * page_size); + + size = get_vma_size(ptr + 4 * page_size, &prot); + FAIL_TEST_IF_FALSE(size == 4 * page_size); + + if (seal) { + ret = sys_mseal(ptr + 4 * page_size, 4 * page_size); + FAIL_TEST_IF_FALSE(!ret); + } + + /* move and expand cross VMA boundary will fail */ + ret2 = sys_mremap(ptr, 8 * page_size, 10 * page_size, MREMAP_FIXED | MREMAP_MAYMOVE, ptr2); + if (seal) { + FAIL_TEST_IF_FALSE(ret2 == (void *) MAP_FAILED); + + size = get_vma_size(ptr, &prot); + FAIL_TEST_IF_FALSE(size == 4 * page_size); + FAIL_TEST_IF_FALSE(prot == 0x4); + + size = get_vma_size(ptr + 4 * page_size, &prot); + FAIL_TEST_IF_FALSE(size == 4 * page_size); + FAIL_TEST_IF_FALSE(prot == 0x0); + } else { + FAIL_TEST_IF_FALSE(ret2 == (void *) MAP_FAILED); + + size = get_vma_size(ptr, &prot); + FAIL_TEST_IF_FALSE(size == 4 * page_size); + FAIL_TEST_IF_FALSE(prot == 0x4); + + size = get_vma_size(ptr + 4 * page_size, &prot); + FAIL_TEST_IF_FALSE(size == 4 * page_size); + FAIL_TEST_IF_FALSE(prot == 0x0); + } + + REPORT_TEST_PASS(); +} + +static void test_seal_mremap_move_shrink_multiple_vmas(bool seal) +{ + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 12 * page_size; + int ret; + void *ret2; + int prot; + void *ptr2; + + setup_single_address(size, &ptr); + FAIL_TEST_IF_FALSE(ptr != (void *)-1); + + setup_single_address(size, &ptr2); + FAIL_TEST_IF_FALSE(ptr2 != (void *)-1); + + ret = sys_munmap(ptr2, 12 * page_size); + FAIL_TEST_IF_FALSE(!ret); + + ret = sys_mprotect(ptr + 4 * page_size, 4 * page_size, PROT_NONE); + FAIL_TEST_IF_FALSE(!ret); + + size = get_vma_size(ptr, &prot); + FAIL_TEST_IF_FALSE(size == 4 * page_size); + FAIL_TEST_IF_FALSE(prot == 4); + + size = get_vma_size(ptr + 4 * page_size, &prot); + FAIL_TEST_IF_FALSE(size == 4 * page_size); + FAIL_TEST_IF_FALSE(prot == 0); + + if (seal) { + ret = 
sys_mseal(ptr + 4 * page_size, 4 * page_size); + FAIL_TEST_IF_FALSE(!ret); + } + + /* move and shrink cross VMA boundary is NOK */ + ret2 = sys_mremap(ptr, 12 * page_size, 8 * page_size, MREMAP_FIXED | MREMAP_MAYMOVE, ptr2); + if (seal) { + FAIL_TEST_IF_FALSE(ret2 == (void *) MAP_FAILED); + //FAIL_TEST_IF_FALSE(errno == EPERM); + + size = get_vma_size(ptr, &prot); + FAIL_TEST_IF_FALSE(size == 4 * page_size); + FAIL_TEST_IF_FALSE(prot == 4); + + size = get_vma_size(ptr + 4 * page_size, &prot); + FAIL_TEST_IF_FALSE(size == 4 * page_size); + FAIL_TEST_IF_FALSE(prot == 0); + } else { + FAIL_TEST_IF_FALSE(ret2 == (void *) MAP_FAILED); + + size = get_vma_size(ptr, &prot); + FAIL_TEST_IF_FALSE(size == 4 * page_size); + FAIL_TEST_IF_FALSE(prot == 4); + + size = get_vma_size(ptr + 4 * page_size, &prot); + FAIL_TEST_IF_FALSE(size == 4 * page_size); + FAIL_TEST_IF_FALSE(prot == 0); + } + + REPORT_TEST_PASS(); +} + int main(int argc, char **argv) { bool test_seal = seal_support(); @@ -1758,7 +2032,7 @@ int main(int argc, char **argv) if (!pkey_supported()) ksft_print_msg("PKEY not supported\n");
- ksft_set_plan(80); + ksft_set_plan(90);
test_seal_addseal(); test_seal_unmapped_start(); @@ -1835,8 +2109,12 @@ int main(int argc, char **argv) test_seal_mremap_move_dontunmap(true); test_seal_mremap_move_fixed_zero(false); test_seal_mremap_move_fixed_zero(true); + test_seal_mremap_move_dontunmap_anyaddr(false); test_seal_mremap_move_dontunmap_anyaddr(true); + test_seal_mremap_move_dontunmap_allocated(false); + test_seal_mremap_move_dontunmap_allocated(true); + test_seal_discard_ro_anon(false); test_seal_discard_ro_anon(true); test_seal_discard_ro_anon_on_rw(false); @@ -1858,5 +2136,18 @@ int main(int argc, char **argv) test_seal_discard_ro_anon_on_pkey(false); test_seal_discard_ro_anon_on_pkey(true);
+ test_seal_mremap_shrink_multiple_vmas(false); + test_seal_mremap_shrink_multiple_vmas(true); + + test_seal_mremap_expand_multiple_vmas(false); + test_seal_mremap_expand_multiple_vmas(true); + + test_seal_mremap_move_expand_multiple_vmas(false); + test_seal_mremap_move_expand_multiple_vmas(true); + + test_seal_mremap_move_shrink_multiple_vmas(false); + test_seal_mremap_move_shrink_multiple_vmas(true); + ksft_finished(); }
From: Jeff Xu <jeffxu@chromium.org>
mremap doesn't allow relocating, expanding, or shrinking across VMA boundaries. Refactor the code to check the src address range before doing anything to the destination.
This also allows us to remove can_modify_mm from mremap: since the src address must be a single VMA, use can_modify_vma instead.
Signed-off-by: Jeff Xu <jeffxu@chromium.org>
---
 mm/internal.h | 24 ++++++++++++++++
 mm/mremap.c   | 77 +++++++++++++++++++++++++--------------------------
 mm/mseal.c    | 17 ------------
 3 files changed, 61 insertions(+), 57 deletions(-)
diff --git a/mm/internal.h b/mm/internal.h index b4d86436565b..53f0bbbc6449 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -1501,6 +1501,24 @@ bool can_modify_mm(struct mm_struct *mm, unsigned long start, unsigned long end); bool can_modify_mm_madv(struct mm_struct *mm, unsigned long start, unsigned long end, int behavior); + +static inline bool vma_is_sealed(struct vm_area_struct *vma) +{ + return (vma->vm_flags & VM_SEALED); +} + +/* + * check if a vma is sealed for modification. + * return true, if modification is allowed. + */ +static inline bool can_modify_vma(struct vm_area_struct *vma) +{ + if (unlikely(vma_is_sealed(vma))) + return false; + + return true; +} + #else static inline int can_do_mseal(unsigned long flags) { @@ -1518,6 +1536,12 @@ static inline bool can_modify_mm_madv(struct mm_struct *mm, unsigned long start, { return true; } + +static inline bool can_modify_vma(struct vm_area_struct *vma) +{ + return true; +} + #endif
#ifdef CONFIG_SHRINKER_DEBUG diff --git a/mm/mremap.c b/mm/mremap.c index e7ae140fc640..3c5bb671a280 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -904,28 +904,7 @@ static unsigned long mremap_to(unsigned long addr, unsigned long old_len,
/* * In mremap_to(). - * Move a VMA to another location, check if src addr is sealed. - * - * Place can_modify_mm here because mremap_to() - * does its own checking for address range, and we only - * check the sealing after passing those checks. - * - * can_modify_mm assumes we have acquired the lock on MM. */ - if (unlikely(!can_modify_mm(mm, addr, addr + old_len))) - return -EPERM; - - if (flags & MREMAP_FIXED) { - /* - * In mremap_to(). - * VMA is moved to dst address, and munmap dst first. - * do_munmap will check if dst is sealed. - */ - ret = do_munmap(mm, new_addr, new_len, uf_unmap_early); - if (ret) - goto out; - } - if (old_len > new_len) { ret = do_munmap(mm, addr+new_len, old_len - new_len, uf_unmap); if (ret) @@ -939,6 +918,26 @@ static unsigned long mremap_to(unsigned long addr, unsigned long old_len, goto out; }
+ /* + * Since we can't remap across vma boundaries, + * check single vma instead of src address range. + */ + if (unlikely(!can_modify_vma(vma))) { + ret = -EPERM; + goto out; + } + + if (flags & MREMAP_FIXED) { + /* + * In mremap_to(). + * VMA is moved to dst address, and munmap dst first. + * do_munmap will check if dst is sealed. + */ + ret = do_munmap(mm, new_addr, new_len, uf_unmap_early); + if (ret) + goto out; + } + /* MREMAP_DONTUNMAP expands by old_len since old_len == new_len */ if (flags & MREMAP_DONTUNMAP && !may_expand_vm(mm, vma->vm_flags, old_len >> PAGE_SHIFT)) { @@ -1079,19 +1078,6 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len, goto out; }
- /* - * Below is shrink/expand case (not mremap_to()) - * Check if src address is sealed, if so, reject. - * In other words, prevent shrinking or expanding a sealed VMA. - * - * Place can_modify_mm here so we can keep the logic related to - * shrink/expand together. - */ - if (unlikely(!can_modify_mm(mm, addr, addr + old_len))) { - ret = -EPERM; - goto out; - } - /* * Always allow a shrinking remap: that just unmaps * the unnecessary pages.. @@ -1107,7 +1093,7 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len, }
ret = do_vmi_munmap(&vmi, mm, addr + new_len, old_len - new_len, - &uf_unmap, true); + &uf_unmap, true); if (ret) goto out;
@@ -1124,6 +1110,15 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len, goto out; }
+ /* + * Since we can't remap across vma boundaries, + * check single vma instead of src address range. + */ + if (unlikely(!can_modify_vma(vma))) { + ret = -EPERM; + goto out; + } + /* old_len exactly to the end of the area.. */ if (old_len == vma->vm_end - addr) { @@ -1132,9 +1127,10 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len, /* can we just expand the current mapping? */ if (vma_expandable(vma, delta)) { long pages = delta >> PAGE_SHIFT; - VMA_ITERATOR(vmi, mm, vma->vm_end); long charged = 0;
+ VMA_ITERATOR(vmi, mm, vma->vm_end); + if (vma->vm_flags & VM_ACCOUNT) { if (security_vm_enough_memory_mm(mm, pages)) { ret = -ENOMEM; @@ -1177,20 +1173,21 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len, ret = -ENOMEM; if (flags & MREMAP_MAYMOVE) { unsigned long map_flags = 0; + if (vma->vm_flags & VM_MAYSHARE) map_flags |= MAP_SHARED;
new_addr = get_unmapped_area(vma->vm_file, 0, new_len, - vma->vm_pgoff + - ((addr - vma->vm_start) >> PAGE_SHIFT), - map_flags); + vma->vm_pgoff + + ((addr - vma->vm_start) >> PAGE_SHIFT), + map_flags); if (IS_ERR_VALUE(new_addr)) { ret = new_addr; goto out; }
ret = move_vma(vma, addr, old_len, new_len, new_addr, - &locked, flags, &uf, &uf_unmap); + &locked, flags, &uf, &uf_unmap); } out: if (offset_in_page(ret)) diff --git a/mm/mseal.c b/mm/mseal.c index bf783bba8ed0..4591ae8d29c2 100644 --- a/mm/mseal.c +++ b/mm/mseal.c @@ -16,28 +16,11 @@ #include <linux/sched.h> #include "internal.h"
-static inline bool vma_is_sealed(struct vm_area_struct *vma) -{ - return (vma->vm_flags & VM_SEALED); -} - static inline void set_vma_sealed(struct vm_area_struct *vma) { vm_flags_set(vma, VM_SEALED); }
-/* - * check if a vma is sealed for modification. - * return true, if modification is allowed. - */ -static bool can_modify_vma(struct vm_area_struct *vma) -{ - if (unlikely(vma_is_sealed(vma))) - return false; - - return true; -} - static bool is_madv_discard(int behavior) { return behavior &
* jeffxu@chromium.org jeffxu@chromium.org [240814 03:14]:
From: Jeff Xu jeffxu@chromium.org
mremap doesn't allow relocating, expanding, or shrinking across VMA boundaries. Refactor the code to check the src address range before doing anything to the destination, i.e. the destination won't be unmapped if the src address fails the boundary check.
This also allows us to remove can_modify_mm from mremap.c: since the src address range must be a single VMA, can_modify_vma is used instead.
I don't think sending out a separate patch to address the same thing as the patch you said you were testing [1] is the correct approach. You had already sent suggestions on mremap changes - why send this patch set instead of making another suggestion?
Maybe sending your selftest to be included with the initial patch set would work? Does this test pass with the other patch set?
[1] https://lore.kernel.org/all/CALmYWFs0v07z5vheDt1h3hD+3--yr6Va0ZuQeaATo+-8MuR...
On Wed, Aug 14, 2024 at 7:40 AM Liam R. Howlett Liam.Howlett@oracle.com wrote:
- jeffxu@chromium.org jeffxu@chromium.org [240814 03:14]:
From: Jeff Xu jeffxu@chromium.org
mremap doesn't allow relocating, expanding, or shrinking across VMA boundaries. Refactor the code to check the src address range before doing anything to the destination, i.e. the destination won't be unmapped if the src address fails the boundary check.
This also allows us to remove can_modify_mm from mremap.c: since the src address range must be a single VMA, can_modify_vma is used instead.
I don't think sending out a separate patch to address the same thing as the patch you said you were testing [1] is the correct approach. You had already sent suggestions on mremap changes - why send this patch set instead of making another suggestion?
As indicated in the cover letter, this patch aims to improve mremap performance while preserving the existing mseal semantics. And this patch can go in independently, regardless of the in-loop vs. out-of-loop discussion.
The [1] link in your email is broken, but I assume you meant Pedro's V1/V2 of the in-loop change. The in-loop change has a semantic/regression risk to mseal, and will take a longer time to review/test/prove and bake.
We can leave the in-loop discussion in Pedro's thread. I hope V3 of Pedro's patch adds more testing coverage and addresses the existing comments on V2.
Thanks -Jeff
* Jeff Xu jeffxu@chromium.org [240814 12:57]:
On Wed, Aug 14, 2024 at 7:40 AM Liam R. Howlett Liam.Howlett@oracle.com wrote:
- jeffxu@chromium.org jeffxu@chromium.org [240814 03:14]:
From: Jeff Xu jeffxu@chromium.org
mremap doesn't allow relocating, expanding, or shrinking across VMA boundaries. Refactor the code to check the src address range before doing anything to the destination, i.e. the destination won't be unmapped if the src address fails the boundary check.
This also allows us to remove can_modify_mm from mremap.c: since the src address range must be a single VMA, can_modify_vma is used instead.
I don't think sending out a separate patch to address the same thing as the patch you said you were testing [1] is the correct approach. You had already sent suggestions on mremap changes - why send this patch set instead of making another suggestion?
As indicated in the cover letter, this patch aims to improve mremap performance while preserving the existing mseal semantics.
They are not worth preserving.
And this patch can go in independently, regardless of the in-loop vs. out-of-loop discussion.
No, it conflicts with the other mremap patch as it changes the same code - in a very similar way.
The [1] link in your email is broken, but I assume you meant Pedro's V1/V2 of the in-loop change.
Yes, the email where you delayed discussing the fix so that you could test it. Which brings up the question you didn't answer and deleted: Does your testing pass on those patches?
The in-loop change has a semantic/regression risk to mseal, and will take a longer time to review/test/prove and bake.
There are no uses, so the risk is minimal.
We can leave the in-loop discussion in Pedro's thread,
No, it is directly linked to these patches as this should have just been a comment on a patch in that series.
I hope V3 of Pedro's patch adds more testing coverage and addresses the existing comments on V2.
The majority of the comments to V2 are mine, you only told us that splitting a sealed vma is wrong (after I asked you directly to answer) and then you made a comment about testing of the patch set. Besides the direct responses to me, your comment was "wait for me to test".
You are holding us hostage by asking for more testing but not sharing what is and is not valid for mseal() - or even answering questions on tests you run. Splitting a vma doesn't change the memory, but that's not allowed for some reason.
These patches should be rejected in favour of fixing the feature like it should have been written in the first place. Anything less is just to simplify backports and avoiding testing - "avoiding the business logic".
Liam
[1] https://lore.kernel.org/all/CALmYWFvURJBgyFw7x5qrL4CqoZjy92NeFAS750XaLxO7o7C...
On Wed, Aug 14, 2024 at 12:55 PM Liam R. Howlett Liam.Howlett@oracle.com wrote:
The majority of the comments to V2 are mine, you only told us that splitting a sealed vma is wrong (after I asked you directly to answer) and then you made a comment about testing of the patch set. Besides the direct responses to me, your comment was "wait for me to test".
Please share the link for "Besides the direct responses to me, your comment was 'wait for me to test'", or pop up that email by responding to it to remind me. Thanks.
You are holding us hostage by asking for more testing but not sharing what is and is not valid for mseal() - or even answering questions on tests you run.
https://docs.kernel.org/process/submitting-patches.html#don-t-get-discourage...
These patches should be rejected in favour of fixing the feature like it should have been written in the first place.
This is not true.
Without removing arch_unmap, it is impossible to implement the in-loop approach. I mentioned this during the initial discussion of the mseal patch, as well as when Pedro expressed interest in the in-loop approach. If you would like references, I can find the links for you.
I'm glad that arch_unmap is now removed, resulting in much cleaner code; it has always been a mystery to me ever since I read that code. Thanks to Linus's leadership and Michael Ellerman's quick response, this is now resolved.
Best regards, -Jeff
* Jeff Xu jeffxu@chromium.org [240814 23:46]:
On Wed, Aug 14, 2024 at 12:55 PM Liam R. Howlett Liam.Howlett@oracle.com wrote:
The majority of the comments to V2 are mine, you only told us that splitting a sealed vma is wrong (after I asked you directly to answer) and then you made a comment about testing of the patch set. Besides the direct responses to me, your comment was "wait for me to test".
Please share this link for " Besides the direct responses to me, your comment was "wait for me to test". Or pop up that email by responding to it, to remind me. Thanks.
[1].
You are holding us hostage by asking for more testing but not sharing what is and is not valid for mseal() - or even answering questions on tests you run.
https://docs.kernel.org/process/submitting-patches.html#don-t-get-discourage...
If you are implying that I'm impatient, I can assure you that is not the feeling driving these emails.
You are just trying to push a patch through that changes the exact code that you said you would test but didn't say how, and you said the testing of another patch was insufficient but didn't say why. Then you sent out this fix.
These patches should be rejected in favour of fixing the feature like it should have been written in the first place.
This is not true.
Yes, it is.
Without removing arch_unmap, it is impossible to implement the in-loop approach.
arch_unmap() is going away, besides..
arch_unmap() could fail today and leave the ppc vdso pointing to NULL; mseal() would introduce an even less likely case of this happening. I asked you about this in v10 [2]. I elaborated in my response, but I doubt you got that far in the email.
I mentioned this during the initial discussion of the mseal patch, as well as when Pedro expressed interest in the in-loop approach. If you would like references, I can find the links for you.
So the main concern is that ppc is going to mseal the vdso, then fail to unmap it?
It would have been better to put a check in the arch_unmap() code in ppc to avoid that - but it will never happen.
I'm glad that arch_unmap is now removed, resulting in much cleaner code,
If you care at all about cleaner code, please move the mseal check to where it should be - or stop getting in the way of others moving it.
it has always been a mystery to me ever since I read that code.
You could have also looked into what arch_unmap() did, or why it was where it is today. If you had, you would have found that arch_unmap() could be moved lower in the function and allowed in-loop approach - but you didn't bother to find out what it was about.
Liam
...
[1]. https://lore.kernel.org/all/CALmYWFs0v07z5vheDt1h3hD+3--yr6Va0ZuQeaATo+-8MuR... [2]. https://lore.kernel.org/lkml/3rpmzsxiwo5t2uq7xy5inizbtaasotjtzocxbayw5ntgk5a...
On Thu, Aug 15, 2024 at 9:50 AM Liam R. Howlett Liam.Howlett@oracle.com wrote:
- Jeff Xu jeffxu@chromium.org [240814 23:46]:
On Wed, Aug 14, 2024 at 12:55 PM Liam R. Howlett Liam.Howlett@oracle.com wrote:
The majority of the comments to V2 are mine, you only told us that splitting a sealed vma is wrong (after I asked you directly to answer) and then you made a comment about testing of the patch set. Besides the direct responses to me, your comment was "wait for me to test".
Please share this link for " Besides the direct responses to me, your comment was "wait for me to test". Or pop up that email by responding to it, to remind me. Thanks.
[1].
That was responding to Andrew, to indicate that the V2 patch has a dependency on arch_unmap in PPC, and that I will review/test the code and respond to Andrew directly.
PS Your statement above is entirely false, and out of context.
" You only told us that splitting a sealed vma is wrong (after I asked you directly to answer) and then you made a comment about testing of the patch set. Besides the direct responses to me, your comment was "wait for me to test".
If you will excuse me, I would rather spend time on code/tests and other duties than respond to your false accusations.
Best regards, -Jeff
Liam
...
[1]. https://lore.kernel.org/all/CALmYWFs0v07z5vheDt1h3hD+3--yr6Va0ZuQeaATo+-8MuR... [2]. https://lore.kernel.org/lkml/3rpmzsxiwo5t2uq7xy5inizbtaasotjtzocxbayw5ntgk5a...
* Jeff Xu jeffxu@google.com [240815 13:23]:
On Thu, Aug 15, 2024 at 9:50 AM Liam R. Howlett Liam.Howlett@oracle.com wrote:
- Jeff Xu jeffxu@chromium.org [240814 23:46]:
On Wed, Aug 14, 2024 at 12:55 PM Liam R. Howlett Liam.Howlett@oracle.com wrote:
The majority of the comments to V2 are mine, you only told us that splitting a sealed vma is wrong (after I asked you directly to answer) and then you made a comment about testing of the patch set. Besides the direct responses to me, your comment was "wait for me to test".
Please share this link for " Besides the direct responses to me, your comment was "wait for me to test". Or pop up that email by responding to it, to remind me. Thanks.
[1].
That was responding to Andrew, to indicate that the V2 patch has a dependency on arch_unmap in PPC, and that I will review/test the code and respond to Andrew directly.
PS Your statement above is entirely false, and out of context.
" You only told us that splitting a sealed vma is wrong (after I asked you directly to answer) and then you made a comment about testing of the patch set. Besides the direct responses to me, your comment was "wait for me to test".
[1] has your "wait for me to test" to hold up a patch set, [2] has you answering my direct question to you and making the untested comment to someone else.
So, entirely true.
Liam
[1]. https://lore.kernel.org/all/CALmYWFs0v07z5vheDt1h3hD+3--yr6Va0ZuQeaATo+-8MuR... [2]. https://lore.kernel.org/all/CALmYWFvURJBgyFw7x5qrL4CqoZjy92NeFAS750XaLxO7o7C...
On Thu, Aug 15, 2024 at 1:14 PM Liam R. Howlett Liam.Howlett@oracle.com wrote:
- Jeff Xu jeffxu@google.com [240815 13:23]:
On Thu, Aug 15, 2024 at 9:50 AM Liam R. Howlett Liam.Howlett@oracle.com wrote:
- Jeff Xu jeffxu@chromium.org [240814 23:46]:
On Wed, Aug 14, 2024 at 12:55 PM Liam R. Howlett Liam.Howlett@oracle.com wrote:
The majority of the comments to V2 are mine, you only told us that splitting a sealed vma is wrong (after I asked you directly to answer) and then you made a comment about testing of the patch set. Besides the direct responses to me, your comment was "wait for me to test".
Please share this link for " Besides the direct responses to me, your comment was "wait for me to test". Or pop up that email by responding to it, to remind me. Thanks.
[1].
That was responding to Andrew, to indicate that the V2 patch has a dependency on arch_unmap in PPC, and that I will review/test the code and respond to Andrew directly.
PS Your statement above is entirely false, and out of context.
" You only told us that splitting a sealed vma is wrong (after I asked you directly to answer) and then you made a comment about testing of the patch set. Besides the direct responses to me, your comment was "wait for me to test".
[1] has your "wait for me to test" to hold up a patch set, [2] has you answering my direct question to you and making the untested comment to someone else.
This is the last time that I'm trying to clarify this. [1] is my response to Andrew and Pedro. [2] is my comment about V2's lack of tests, i.e. no selftest changes, no extra tests added.
-Jeff
So, entirely true.
Liam
[1]. https://lore.kernel.org/all/CALmYWFs0v07z5vheDt1h3hD+3--yr6Va0ZuQeaATo+-8MuR... [2]. https://lore.kernel.org/all/CALmYWFvURJBgyFw7x5qrL4CqoZjy92NeFAS750XaLxO7o7C...
* Jeff Xu jeffxu@chromium.org [240815 16:23]:
[...]
This is the last time that I'm trying to clarify this. [1] is my response to Andrew and Pedro.
That doesn't change what you said, or what you are doing.
[2] is my comment about V2's lack of tests, i.e. no selftest changes, no extra tests added.
But they pass the tests that exist.
Maybe you should take a step back, and look at both solutions. There is a competing set of patches that fixes the same problem in a similar way that was sent out before these patches, and those patches address the entire problem with the mseal() approach.
Instead of helping make the complete solution work as you think it should, you are making the design problem worse and can't seem to verify your patches actually fix the regression.
Liam
On Wed, Aug 14, 2024 at 12:14 AM jeffxu@chromium.org wrote:
From: Jeff Xu jeffxu@chromium.org
mremap doesn't allow relocating, expanding, or shrinking across VMA boundaries. Refactor the code to check the src address range before doing anything to the destination, i.e. the destination won't be unmapped if the src address fails the boundary check.
This also allows us to remove can_modify_mm from mremap.c: since the src address range must be a single VMA, can_modify_vma is used instead.
This will likely improve mremap performance: previously the code did the sealing check with can_modify_mm over the src address range, and the new code removes the loop that can_modify_mm used.
To verify this patch doesn't regress mremap, I added tests in mseal_test; the test patch can be applied before the mremap refactor patch or checked in independently.
This patch also doesn't change mseal's existing semantics: if sealing fails, the user can expect the src/dst addresses to be unchanged. So this patch can be applied regardless of whether we decide to go with the current out-of-loop approach or the in-loop approach under discussion.
Regarding the perf test report by stress-ng [1] title: 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
The test is using below for testing: stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --pagemove 64
I can't repro this on ChromeOS; the pagemove test shows large stddev and stderr values and can't reasonably reflect the performance impact.
For example, I wrote a C program [2] to run the above pagemove test 10 times and calculate the stddev and stderr for 3 commits:
1> before mseal feature is added: Ops/sec: Mean: 3564.40, Std Dev: 2737.35 (76.80% of Mean), Std Err: 865.63 (24.29% of Mean)
2> after mseal feature is added: Ops/sec: Mean: 2703.84, Std Dev: 2085.13 (77.12% of Mean), Std Err: 659.38 (24.39% of Mean)
3> after current patch (mremap refactor): Ops/sec: Mean: 3603.67, Std Dev: 2422.22 (67.22% of Mean), Std Err: 765.97 (21.26% of Mean)
The results show 21%-24% stderr, which means whatever perf improvement/impact there might be won't be measured correctly by this test.
This test machine has 32G memory, an Intel(R) Celeron(R) 7305, and 5 CPUs. I rebooted the machine before each test and took the first 10 runs with run_stress_ng 10.
(I will run a longer duration to see if the test still shows large stddev/stderr.)
I took more samples (100 runs); the stddev/stderr is smaller, but still not in a range that can reasonably measure the perf improvement here.
The tests were taken on the same machine as the 10-run test above, following exactly the same steps: i.e. check out the given kernel commit, reboot the test device, take the first test result.
1> Before mseal feature is added: Ops/sec: Mean: 1733.26, Std Dev: 842.13 (48.59% of Mean), Std Err: 84.21 (4.86% of Mean)
2> After mseal feature is added: Ops/sec: Mean: 1701.53, Std Dev: 1017.29 (59.79% of Mean), Std Err: 101.73 (5.98% of Mean)
3> After mremap refactor (this patch): Ops/sec: Mean: 1097.04, Std Dev: 860.67 (78.45% of Mean), Std Err: 86.07 (7.85% of Mean)
Summary: even when the stderr is down to the 4%-8% range, the stddev is still too big.
Hence, there are other unknown, random variables that impact this test.
-Jeff
[1] https://lore.kernel.org/lkml/202408041602.caa0372-oliver.sang@intel.com/ [2] https://github.com/peaktocreek/mmperf/blob/main/run_stress_ng.c
Jeff Xu (2):
  mseal: selftest mremap across VMA boundaries
  mseal: refactor mremap to remove can_modify_mm
 mm/internal.h                           |  24 ++
 mm/mremap.c                             |  77 +++----
 mm/mseal.c                              |  17 --
 tools/testing/selftests/mm/mseal_test.c | 293 +++++++++++++++++++++++-
 4 files changed, 353 insertions(+), 58 deletions(-)
-- 2.46.0.76.ge559c4bf1a-goog
Hi Oliver,
On Thu, Aug 15, 2024 at 11:16 AM Jeff Xu jeffxu@chromium.org wrote:
On Wed, Aug 14, 2024 at 12:14 AM jeffxu@chromium.org wrote:
[...]
I could not repro the 4% degradation on my test machine (Chromebook); this could be entirely due to the specific test and this test machine.
Do you think it is possible to do a few more tests? This time I'd like to have a larger sample size (100 runs):
stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --pagemove 64
Please run the test for each commit following the exact same steps, e.g. reboot the machine, run the test, take the first 100 results as the sample. Please don't select or drop any unstable result, because then the data will be biased. If possible, please include stddev and stderr for the data (or the raw data if not possible, and I will do the post-processing)
for 3 commits:
-> this patch
-> after mseal feature
-> before mseal feature
Thank you for your time and assistance in helping me understand this issue.
Best regards, -Jeff
hi, Jeff,
On Thu, Aug 15, 2024 at 01:19:06PM -0700, Jeff Xu wrote:
Hi Oliver,
On Thu, Aug 15, 2024 at 11:16 AM Jeff Xu jeffxu@chromium.org wrote:
On Wed, Aug 14, 2024 at 12:14 AM jeffxu@chromium.org wrote:
[...]
I could not repro the 4% degradation with my test machine (Chromebook), this can be entirely due to the specific test and this test machine.
Do you think it is possible to do a few more tests ? This time I like to have a larger sample size (100 run)
stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --pagemove 64
Please run the test for each commit following the exact steps, e.g. reboot the machine, run the test, get the first 100 results for sample. Please don't select or drop any unstable report because then the data will be biased. If possible, please includes stddiv and stderr for the data (or raw data if not possible, and I will do post-processing)
for 3 commits: -> this patch.
what's the base of it? could I directly apply this patch upon the commit that you called "after mseal feature" below?
-> after mseal feature -> before mseal feature
could you explicitly point to the two commit-ids?
Thank you for your time and assistance in helping me on understanding this issue.
due to resource constraints, please expect that we will need several days to finish this test request.
Hi Oliver
On Thu, Aug 15, 2024 at 7:39 PM Oliver Sang oliver.sang@intel.com wrote:
[...]
could you explicitly point to the two commit-ids?
sure
this patch
8be7258a: mseal: add mseal syscall
ff388fe5c: mseal: wire up mseal syscall
Thank you for your time and assistance in helping me on understanding this issue.
due to resource constraint, please expect that we need several days to finish this test request.
No problem.
Thanks for your help! -Jeff
hi, Jeff,
On Thu, Aug 15, 2024 at 07:58:57PM -0700, Jeff Xu wrote:
Hi Oliver
[...]
could you exlictly point to two commit-id?
sure
this patch 8be7258a: mseal: add mseal syscall ff388fe5c: mseal: wire up mseal syscall
I failed to apply this patch set on top of "8be7258a: mseal: add mseal syscall"
to avoid the impact of other changes, it's better to apply the patch upon 8be7258a directly.
if you prefer another base for this patch, please let us know. otherwise we will supply results for 4 commits:
this patch
the base of this patch
8be7258a: mseal: add mseal syscall
ff388fe5c: mseal: wire up mseal syscall
hi, Jeff,
On Sun, Aug 18, 2024 at 05:28:41PM +0800, Oliver Sang wrote:
[...]
I failed to apply this patch set to "8be7258a: mseal: add mseal syscall"
looking at your patch set again, [PATCH v1 1/2] "mseal: selftest mremap across VMA boundaries" is just for kselftests,
and I can apply [PATCH v1 2/2] "mseal: refactor mremap to remove can_modify_mm" upon "8be7258a: mseal: add mseal syscall" cleanly,
so I will start a test for this [PATCH v1 2/2]
BTW, I will first use our default setting - "60s testtime; reboot between each run; run 10 times" - since we already have the data for 8be7258a and ff388fe5c, so we can give you an update fairly quickly.
as discussed in some private mail, you want some special run method; could you elaborate it here? thanks
hi, Jeff,
On Mon, Aug 19, 2024 at 09:38:19AM +0800, Oliver Sang wrote:
hi, Jeff,
[...]
BTW, I will firstly use our default setting - "60s testtime; reboot between each run; run 10 times", since we've already have the data for 8be7258a and ff388fe5c then we could give you an update kind of quickly.
as some private mail discussed, you want some special run method, could you elaborate them here? thanks
here is a quick update, before you give us more details about the special run method.
by our default run method (60s testtime; reboot between each run; run 10 times), your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm" resolves the regression partially.
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/60s

commit:
  ff388fe5c4 ("mseal: wire up mseal syscall")
  8be7258aad ("mseal: add mseal syscall")
  2a78ece39f <-- your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm"

ff388fe5c481d39c  8be7258aad44b5e25977a98db13  2a78ece39f13ea6f3f9679a6c66
----------------  ---------------------------  ---------------------------
      4957            +1.3%       5023         +1.0%       5008        time.percent_of_cpu_this_job_got
      2915            +1.5%       2959         +1.2%       2949        time.system_time
     65.96            -7.3%      61.16         -5.5%      62.30        time.user_time
  41535878            -4.0%   39873501         -2.6%   40452264        proc-vmstat.numa_hit
  41466104            -4.0%   39806121         -2.6%   40384854        proc-vmstat.numa_local
  77297398            -4.1%   74165258         -2.6%   75286134        proc-vmstat.pgalloc_normal
  77016866            -4.1%   73886027         -2.6%   75012630        proc-vmstat.pgfree
  18386219            -5.0%   17474214         -2.9%   17850959        stress-ng.pagemove.ops
    306421            -5.0%     291207         -2.9%     297490        stress-ng.pagemove.ops_per_sec
      4957            +1.3%       5023         +1.0%       5008        stress-ng.time.percent_of_cpu_this_job_got
      2915            +1.5%       2959         +1.2%       2949        stress-ng.time.system_time
 3.349e+10 ± 4%       +3.0%  3.447e+10 ± 2%    +4.1%  3.484e+10        perf-stat.i.branch-instructions
      1.13            -2.1%       1.10         -2.2%       1.10        perf-stat.i.cpi
      0.89            +2.2%       0.91         +2.0%       0.91        perf-stat.i.ipc
      1.04            -6.9%       0.97         -4.9%       0.99        perf-stat.overall.MPKI
      1.13            -2.3%       1.10         -2.0%       1.10        perf-stat.overall.cpi
      1081            +5.0%       1136         +3.0%       1114        perf-stat.overall.cycles-between-cache-misses
      0.89            +2.3%       0.91         +2.0%       0.91        perf-stat.overall.ipc
 3.295e+10 ± 3%       +2.9%  3.392e+10 ± 2%    +4.0%  3.427e+10        perf-stat.ps.branch-instructions
 1.674e+11 ± 3%       +1.8%  1.704e+11 ± 2%    +3.3%   1.73e+11        perf-stat.ps.instructions
 1.046e+13            +2.7%  1.074e+13         +1.7%  1.064e+13        perf-stat.total.instructions
     75.05            -2.0       73.02         -0.9       74.18        perf-profile.calltrace.cycles-pp.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
     36.83            -1.6       35.19         -1.2       35.62        perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64
     25.02            -1.4       23.65         -0.9       24.12        perf-profile.calltrace.cycles-pp.copy_vma.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
     19.94            -1.1       18.87         -0.8       19.19        perf-profile.calltrace.cycles-pp.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
     14.78            -0.8       14.01         -0.5       14.28        perf-profile.calltrace.cycles-pp.vma_merge.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
[... remaining perf-profile.calltrace entries, all with smaller deltas, elided ...]
perf-profile.calltrace.cycles-pp.thp_get_unmapped_area_vmflags.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64 2.76 +0.0 2.78 ± 2% -0.1 2.67 perf-profile.calltrace.cycles-pp.unlink_anon_vmas.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap 3.47 +0.0 3.51 -0.1 3.37 perf-profile.calltrace.cycles-pp.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma 0.76 +0.1 0.83 +0.1 0.85 perf-profile.calltrace.cycles-pp.__madvise 0.66 +0.1 0.73 +0.1 0.75 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise 0.67 +0.1 0.74 +0.1 0.76 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__madvise 0.63 +0.1 0.70 +0.1 0.72 perf-profile.calltrace.cycles-pp.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise 0.62 +0.1 0.70 +0.1 0.71 perf-profile.calltrace.cycles-pp.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise 0.00 +0.9 0.86 +0.9 0.92 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.do_munmap 0.00 +0.9 0.88 +0.0 0.00 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.mremap_to.__do_sys_mremap 83.81 +0.9 84.69 +0.6 84.44 perf-profile.calltrace.cycles-pp.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 0.00 +0.9 0.90 ± 2% +0.9 0.91 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.move_vma 0.00 +1.1 1.10 +0.0 0.00 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64 0.00 +1.2 1.21 +1.3 1.28 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to 2.10 +1.5 3.60 +1.7 3.79 perf-profile.calltrace.cycles-pp.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.00 +1.5 1.52 +1.5 1.52 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap 1.59 +1.5 3.12 +1.7 3.31 
perf-profile.calltrace.cycles-pp.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64 0.00 +1.6 1.61 +0.0 0.00 perf-profile.calltrace.cycles-pp.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.00 +1.7 1.73 +1.8 1.83 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap 0.00 +2.0 2.01 +2.0 2.04 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 5.34 +3.0 8.38 +1.6 6.92 perf-profile.calltrace.cycles-pp.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 75.22 -2.0 73.18 -0.9 74.34 perf-profile.children.cycles-pp.move_vma 37.04 -1.6 35.40 -1.2 35.83 perf-profile.children.cycles-pp.do_vmi_align_munmap 25.09 -1.4 23.72 -0.9 24.20 perf-profile.children.cycles-pp.copy_vma 20.04 -1.1 18.96 -0.8 19.28 perf-profile.children.cycles-pp.__split_vma 19.87 -1.0 18.84 -0.6 19.24 perf-profile.children.cycles-pp.rcu_core 19.85 -1.0 18.82 -0.6 19.22 perf-profile.children.cycles-pp.rcu_do_batch 19.89 -1.0 18.86 -0.6 19.26 perf-profile.children.cycles-pp.handle_softirqs 17.55 -0.9 16.67 -0.5 17.02 perf-profile.children.cycles-pp.kmem_cache_free 15.32 -0.8 14.49 -0.5 14.78 perf-profile.children.cycles-pp.kmem_cache_alloc_noprof 15.17 -0.8 14.39 -0.5 14.66 perf-profile.children.cycles-pp.vma_merge 12.12 -0.6 11.48 -0.4 11.70 perf-profile.children.cycles-pp.__slab_free 12.19 -0.6 11.56 -0.5 11.73 perf-profile.children.cycles-pp.mas_wr_store_entry 11.99 -0.6 11.36 -0.5 11.53 perf-profile.children.cycles-pp.mas_store_prealloc 10.88 -0.6 10.28 -0.4 10.50 perf-profile.children.cycles-pp.vm_area_dup 9.90 -0.5 9.41 -0.4 9.53 perf-profile.children.cycles-pp.mas_wr_node_store 8.39 -0.5 7.92 -0.3 8.13 perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook 7.99 -0.4 7.58 -0.3 7.73 perf-profile.children.cycles-pp.move_page_tables 6.70 -0.4 6.33 -0.3 6.43 perf-profile.children.cycles-pp.vma_complete 5.87 -0.3 5.55 -0.2 5.66 
perf-profile.children.cycles-pp.move_ptes 5.12 -0.3 4.81 -0.2 4.90 perf-profile.children.cycles-pp.mas_preallocate 6.05 -0.3 5.74 -0.2 5.85 perf-profile.children.cycles-pp.vm_area_free_rcu_cb 2.98 -0.3 2.69 ± 4% -0.2 2.80 ± 6% perf-profile.children.cycles-pp.__memcpy 3.46 ± 2% -0.2 3.25 -0.1 3.36 ± 3% perf-profile.children.cycles-pp.mod_objcg_state 3.47 -0.2 3.26 -0.2 3.32 perf-profile.children.cycles-pp.___slab_alloc 2.44 -0.2 2.25 -0.1 2.33 perf-profile.children.cycles-pp.find_vma_prev 2.92 -0.2 2.73 -0.1 2.79 perf-profile.children.cycles-pp.mas_alloc_nodes 3.46 -0.2 3.27 -0.1 3.34 perf-profile.children.cycles-pp.flush_tlb_mm_range 3.47 -0.2 3.29 -0.2 3.32 ± 2% perf-profile.children.cycles-pp.down_write 3.33 -0.2 3.16 -0.1 3.25 perf-profile.children.cycles-pp.__memcg_slab_free_hook 4.23 -0.2 4.07 -0.1 4.08 ± 2% perf-profile.children.cycles-pp.anon_vma_clone 8.33 -0.2 8.17 -0.2 8.13 perf-profile.children.cycles-pp.unmap_region 3.35 -0.1 3.20 -0.1 3.26 perf-profile.children.cycles-pp.mas_store_gfp 2.21 -0.1 2.07 -0.1 2.10 perf-profile.children.cycles-pp.__cond_resched 3.19 -0.1 3.05 -0.1 3.11 perf-profile.children.cycles-pp.unmap_vmas 2.12 -0.1 1.99 -0.1 2.04 perf-profile.children.cycles-pp.__call_rcu_common 2.66 -0.1 2.54 -0.1 2.60 perf-profile.children.cycles-pp.mtree_load 2.24 -0.1 2.12 ± 2% -0.1 2.13 ± 3% perf-profile.children.cycles-pp.vma_prepare 2.50 -0.1 2.38 -0.1 2.42 perf-profile.children.cycles-pp.flush_tlb_func 2.04 ± 2% -0.1 1.93 -0.1 1.96 ± 2% perf-profile.children.cycles-pp.allocate_slab 2.46 -0.1 2.35 -0.1 2.41 perf-profile.children.cycles-pp.rcu_cblist_dequeue 2.48 -0.1 2.38 -0.1 2.42 perf-profile.children.cycles-pp.unmap_page_range 2.23 -0.1 2.12 -0.1 2.16 perf-profile.children.cycles-pp.native_flush_tlb_one_user 1.77 -0.1 1.67 -0.1 1.70 perf-profile.children.cycles-pp.mas_wr_walk 1.88 -0.1 1.78 -0.1 1.80 perf-profile.children.cycles-pp.vma_link 1.84 -0.1 1.75 -0.1 1.77 perf-profile.children.cycles-pp.up_write 0.97 ± 2% -0.1 0.88 -0.1 0.89 
perf-profile.children.cycles-pp.rcu_all_qs 1.40 -0.1 1.32 -0.1 1.34 ± 2% perf-profile.children.cycles-pp.shuffle_freelist 1.03 -0.1 0.95 -0.0 0.99 perf-profile.children.cycles-pp.mas_prev 0.92 -0.1 0.85 -0.0 0.88 perf-profile.children.cycles-pp.mas_prev_setup 1.58 -0.1 1.51 -0.1 1.53 perf-profile.children.cycles-pp.zap_pmd_range 1.24 -0.1 1.17 -0.0 1.20 perf-profile.children.cycles-pp.mas_prev_slot 1.57 -0.1 1.49 -0.1 1.49 perf-profile.children.cycles-pp.mas_update_gap 0.62 -0.1 0.56 -0.0 0.60 perf-profile.children.cycles-pp.security_mmap_addr 0.90 -0.1 0.84 -0.0 0.86 perf-profile.children.cycles-pp.percpu_counter_add_batch 0.86 -0.1 0.80 -0.0 0.81 perf-profile.children.cycles-pp._raw_spin_lock_irqsave 0.98 -0.1 0.92 -0.0 0.95 perf-profile.children.cycles-pp.mas_pop_node 1.68 -0.1 1.62 -0.1 1.62 perf-profile.children.cycles-pp.__get_unmapped_area 1.23 -0.1 1.18 -0.0 1.20 perf-profile.children.cycles-pp.__pte_offset_map_lock 0.49 ± 2% -0.1 0.43 -0.1 0.43 ± 2% perf-profile.children.cycles-pp.setup_object 1.09 -0.1 1.03 -0.0 1.05 perf-profile.children.cycles-pp.zap_pte_range 1.07 ± 2% -0.1 1.02 ± 2% -0.1 1.00 perf-profile.children.cycles-pp.mas_leaf_max_gap 0.70 ± 2% -0.0 0.65 -0.0 0.67 perf-profile.children.cycles-pp.syscall_return_via_sysret 1.18 -0.0 1.14 -0.0 1.15 perf-profile.children.cycles-pp.clear_bhb_loop 0.51 ± 3% -0.0 0.47 -0.0 0.49 ± 3% perf-profile.children.cycles-pp.anon_vma_interval_tree_insert 1.04 -0.0 1.00 -0.0 1.01 perf-profile.children.cycles-pp.vma_to_resize 0.57 -0.0 0.53 -0.0 0.54 perf-profile.children.cycles-pp.mas_wr_end_piv 0.44 ± 2% -0.0 0.40 ± 2% -0.0 0.40 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath 1.14 -0.0 1.10 -0.0 1.12 perf-profile.children.cycles-pp.mt_find 0.90 -0.0 0.87 -0.0 0.87 perf-profile.children.cycles-pp.userfaultfd_unmap_complete 0.62 -0.0 0.59 -0.0 0.60 perf-profile.children.cycles-pp.__put_partials 0.45 ± 6% -0.0 0.42 -0.0 0.43 perf-profile.children.cycles-pp._raw_spin_lock 0.48 -0.0 0.45 ± 2% -0.0 
0.46 perf-profile.children.cycles-pp.mas_prev_range 0.61 -0.0 0.58 -0.0 0.59 perf-profile.children.cycles-pp.entry_SYSCALL_64 0.31 ± 3% -0.0 0.28 ± 3% -0.0 0.31 perf-profile.children.cycles-pp.security_vm_enough_memory_mm 0.33 ± 3% -0.0 0.30 ± 2% -0.0 0.31 ± 4% perf-profile.children.cycles-pp.mas_put_in_tree 0.32 ± 2% -0.0 0.29 ± 2% -0.0 0.30 perf-profile.children.cycles-pp.tlb_finish_mmu 0.46 -0.0 0.44 ± 2% -0.0 0.46 perf-profile.children.cycles-pp.rcu_segcblist_enqueue 0.33 -0.0 0.31 -0.0 0.32 perf-profile.children.cycles-pp.mas_destroy 0.36 -0.0 0.34 -0.0 0.34 perf-profile.children.cycles-pp.__rb_insert_augmented 0.39 -0.0 0.37 -0.0 0.38 ± 2% perf-profile.children.cycles-pp.down_write_killable 0.29 -0.0 0.27 ± 2% -0.0 0.28 perf-profile.children.cycles-pp.tlb_gather_mmu 0.26 -0.0 0.24 ± 2% -0.0 0.25 ± 2% perf-profile.children.cycles-pp.syscall_exit_to_user_mode 0.16 ± 2% -0.0 0.14 ± 3% -0.0 0.14 ± 3% perf-profile.children.cycles-pp.mas_wr_append 0.30 ± 2% -0.0 0.28 ± 2% -0.0 0.29 ± 2% perf-profile.children.cycles-pp.__vm_enough_memory 0.32 -0.0 0.30 ± 2% -0.0 0.31 perf-profile.children.cycles-pp.pte_offset_map_nolock 2.83 +0.0 2.85 ± 2% -0.1 2.74 perf-profile.children.cycles-pp.unlink_anon_vmas 0.84 +0.0 0.86 -0.0 0.81 perf-profile.children.cycles-pp.thp_get_unmapped_area_vmflags 0.08 ± 5% +0.0 0.10 ± 3% -0.0 0.08 ± 6% perf-profile.children.cycles-pp.mm_get_unmapped_area_vmflags 3.52 +0.0 3.56 -0.1 3.42 perf-profile.children.cycles-pp.free_pgtables 0.78 +0.1 0.85 +0.1 0.86 perf-profile.children.cycles-pp.__madvise 0.63 +0.1 0.70 +0.1 0.72 perf-profile.children.cycles-pp.__x64_sys_madvise 0.63 +0.1 0.70 +0.1 0.71 perf-profile.children.cycles-pp.do_madvise 0.00 +0.1 0.09 ± 3% +0.1 0.10 ± 5% perf-profile.children.cycles-pp.can_modify_mm_madv 1.31 +0.2 1.46 +0.2 1.50 perf-profile.children.cycles-pp.mas_next_slot 83.90 +0.9 84.79 +0.6 84.53 perf-profile.children.cycles-pp.__do_sys_mremap 40.45 +1.4 41.90 +2.1 42.57 perf-profile.children.cycles-pp.do_vmi_munmap 2.12 
+1.5 3.62 +1.7 3.82 perf-profile.children.cycles-pp.do_munmap 3.63 +2.4 5.98 +1.7 5.29 perf-profile.children.cycles-pp.mas_walk 5.40 +3.0 8.44 +1.6 6.97 perf-profile.children.cycles-pp.mremap_to 5.26 +3.2 8.48 +2.3 7.58 perf-profile.children.cycles-pp.mas_find 0.00 +5.5 5.46 +3.9 3.93 perf-profile.children.cycles-pp.can_modify_mm 11.49 -0.6 10.89 -0.4 11.10 perf-profile.self.cycles-pp.__slab_free 4.32 -0.3 4.06 -0.2 4.16 perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook 1.96 -0.2 1.77 ± 4% -0.1 1.84 ± 6% perf-profile.self.cycles-pp.__memcpy 2.36 -0.1 2.25 ± 2% -0.1 2.25 ± 3% perf-profile.self.cycles-pp.down_write 2.42 -0.1 2.31 -0.0 2.38 perf-profile.self.cycles-pp.rcu_cblist_dequeue 2.33 -0.1 2.23 -0.1 2.28 perf-profile.self.cycles-pp.mtree_load 2.21 -0.1 2.10 -0.1 2.14 perf-profile.self.cycles-pp.native_flush_tlb_one_user 1.62 -0.1 1.54 -0.0 1.57 perf-profile.self.cycles-pp.__memcg_slab_free_hook 1.52 -0.1 1.44 -0.1 1.46 perf-profile.self.cycles-pp.mas_wr_walk 1.44 -0.1 1.36 -0.1 1.38 ± 2% perf-profile.self.cycles-pp.__call_rcu_common 1.53 -0.1 1.45 -0.0 1.48 perf-profile.self.cycles-pp.up_write 1.72 -0.1 1.65 -0.0 1.70 perf-profile.self.cycles-pp.mod_objcg_state 0.69 ± 2% -0.1 0.63 -0.1 0.63 perf-profile.self.cycles-pp.rcu_all_qs 1.14 ± 2% -0.1 1.08 -0.0 1.09 ± 2% perf-profile.self.cycles-pp.shuffle_freelist 1.18 -0.1 1.12 -0.0 1.17 perf-profile.self.cycles-pp.vma_merge 1.38 -0.1 1.33 -0.0 1.35 perf-profile.self.cycles-pp.do_vmi_align_munmap 0.51 ± 2% -0.1 0.45 -0.0 0.49 perf-profile.self.cycles-pp.security_mmap_addr 0.62 -0.1 0.56 ± 2% -0.1 0.56 perf-profile.self.cycles-pp.mremap 0.89 -0.1 0.83 -0.0 0.85 perf-profile.self.cycles-pp.___slab_alloc 0.99 -0.1 0.94 -0.0 0.96 perf-profile.self.cycles-pp.mas_prev_slot 1.00 -0.0 0.95 -0.0 0.96 perf-profile.self.cycles-pp.mas_preallocate 0.98 -0.0 0.93 -0.0 0.95 perf-profile.self.cycles-pp.move_ptes 0.85 -0.0 0.80 -0.0 0.82 perf-profile.self.cycles-pp.mas_pop_node 0.94 -0.0 0.90 -0.0 0.91 ± 2% 
perf-profile.self.cycles-pp.vm_area_free_rcu_cb 1.09 -0.0 1.04 -0.0 1.06 perf-profile.self.cycles-pp.__cond_resched 0.77 -0.0 0.72 -0.0 0.74 perf-profile.self.cycles-pp.percpu_counter_add_batch 0.94 ± 2% -0.0 0.89 ± 2% -0.1 0.87 perf-profile.self.cycles-pp.mas_leaf_max_gap 1.17 -0.0 1.12 -0.0 1.14 perf-profile.self.cycles-pp.clear_bhb_loop 0.68 -0.0 0.63 -0.0 0.65 perf-profile.self.cycles-pp.__split_vma 0.79 -0.0 0.75 -0.0 0.77 perf-profile.self.cycles-pp.mas_wr_store_entry 1.22 -0.0 1.18 -0.0 1.18 perf-profile.self.cycles-pp.move_vma 0.43 ± 2% -0.0 0.40 ± 2% -0.0 0.40 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath 1.49 -0.0 1.45 +0.0 1.49 perf-profile.self.cycles-pp.kmem_cache_free 0.44 -0.0 0.40 -0.0 0.40 perf-profile.self.cycles-pp.do_munmap 0.45 -0.0 0.42 -0.0 0.43 perf-profile.self.cycles-pp.mas_wr_end_piv 0.89 -0.0 0.86 -0.0 0.88 perf-profile.self.cycles-pp.mas_store_gfp 0.78 -0.0 0.75 -0.0 0.76 perf-profile.self.cycles-pp.userfaultfd_unmap_complete 0.66 -0.0 0.62 -0.0 0.64 perf-profile.self.cycles-pp.mas_store_prealloc 0.60 -0.0 0.58 -0.0 0.59 perf-profile.self.cycles-pp.unmap_region 0.36 ± 4% -0.0 0.33 ± 3% -0.0 0.34 ± 2% perf-profile.self.cycles-pp.syscall_return_via_sysret 0.55 -0.0 0.52 -0.0 0.53 perf-profile.self.cycles-pp.get_old_pud 0.99 -0.0 0.97 -0.0 0.98 perf-profile.self.cycles-pp.mt_find 0.61 -0.0 0.58 -0.0 0.60 perf-profile.self.cycles-pp.copy_vma 0.43 ± 3% -0.0 0.40 -0.0 0.41 ± 4% perf-profile.self.cycles-pp.anon_vma_interval_tree_insert 0.49 -0.0 0.47 -0.0 0.48 perf-profile.self.cycles-pp.find_vma_prev 0.71 -0.0 0.68 -0.0 0.70 perf-profile.self.cycles-pp.unmap_page_range 0.27 -0.0 0.25 -0.0 0.26 perf-profile.self.cycles-pp.mas_prev_setup 0.47 -0.0 0.45 -0.0 0.46 ± 2% perf-profile.self.cycles-pp.flush_tlb_mm_range 0.37 ± 6% -0.0 0.35 -0.0 0.35 perf-profile.self.cycles-pp._raw_spin_lock 0.41 -0.0 0.39 -0.0 0.40 perf-profile.self.cycles-pp._raw_spin_lock_irqsave 0.40 -0.0 0.37 -0.0 0.38 
perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack 0.27 -0.0 0.25 ± 2% -0.0 0.25 ± 3% perf-profile.self.cycles-pp.mas_put_in_tree 0.49 -0.0 0.47 -0.0 0.49 perf-profile.self.cycles-pp.refill_obj_stock 0.48 -0.0 0.46 -0.0 0.47 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe 0.27 ± 2% -0.0 0.25 -0.0 0.26 perf-profile.self.cycles-pp.tlb_finish_mmu 0.24 ± 2% -0.0 0.22 -0.0 0.23 perf-profile.self.cycles-pp.mas_prev 0.28 -0.0 0.26 -0.0 0.27 ± 2% perf-profile.self.cycles-pp.mas_alloc_nodes 0.40 -0.0 0.39 -0.0 0.40 perf-profile.self.cycles-pp.__pte_offset_map_lock 0.14 ± 3% -0.0 0.12 ± 2% -0.0 0.13 ± 3% perf-profile.self.cycles-pp.syscall_exit_to_user_mode 0.26 -0.0 0.24 ± 2% -0.0 0.25 perf-profile.self.cycles-pp.__rb_insert_augmented 0.28 -0.0 0.26 -0.0 0.27 perf-profile.self.cycles-pp.alloc_new_pud 0.28 -0.0 0.26 -0.0 0.27 ± 2% perf-profile.self.cycles-pp.flush_tlb_func 0.20 ± 2% -0.0 0.19 -0.0 0.19 ± 2% perf-profile.self.cycles-pp.__get_unmapped_area 0.47 -0.0 0.46 -0.0 0.45 perf-profile.self.cycles-pp.arch_get_unmapped_area_topdown_vmflags 0.06 -0.0 0.05 ± 5% -0.0 0.05 perf-profile.self.cycles-pp.vma_dup_policy 0.06 ± 6% +0.0 0.07 -0.0 0.06 ± 8% perf-profile.self.cycles-pp.mm_get_unmapped_area_vmflags 0.11 ± 4% +0.0 0.12 ± 4% +0.0 0.12 ± 4% perf-profile.self.cycles-pp.free_pgd_range 0.21 +0.0 0.22 ± 2% -0.0 0.20 ± 2% perf-profile.self.cycles-pp.thp_get_unmapped_area_vmflags 0.45 +0.0 0.48 +0.0 0.50 perf-profile.self.cycles-pp.do_vmi_munmap 0.27 +0.0 0.32 -0.0 0.26 perf-profile.self.cycles-pp.free_pgtables 0.36 ± 2% +0.1 0.44 -0.0 0.35 perf-profile.self.cycles-pp.unlink_anon_vmas 1.07 +0.1 1.19 +0.2 1.22 perf-profile.self.cycles-pp.mas_next_slot 1.49 +0.5 2.01 +0.4 1.86 perf-profile.self.cycles-pp.mas_find 0.00 +1.4 1.37 +0.9 0.93 perf-profile.self.cycles-pp.can_modify_mm 3.14 +2.1 5.23 +1.5 4.60 perf-profile.self.cycles-pp.mas_walk
To avoid the impact of other changes, it is better to apply the patch on top of 8be7258a directly.
If you prefer another base for this patch, please let us know; then we will supply results for 4 commits:
  - this patch
  - the base of this patch
  - 8be7258a: mseal: add mseal syscall
  - ff388fe5c: mseal: wire up mseal syscall
Thank you for your time and assistance in helping me understand this issue.
Due to resource constraints, please expect that we will need several days to finish this test request.
No problem.
Thanks for your help!

Best regards,
-Jeff
> [1] https://lore.kernel.org/lkml/202408041602.caa0372-oliver.sang@intel.com/
> [2] https://github.com/peaktocreek/mmperf/blob/main/run_stress_ng.c
>
> Jeff Xu (2):
>   mseal:selftest mremap across VMA boundaries.
>   mseal: refactor mremap to remove can_modify_mm
>
>  mm/internal.h                           |  24 ++
>  mm/mremap.c                             |  77 +++----
>  mm/mseal.c                              |  17 --
>  tools/testing/selftests/mm/mseal_test.c | 293 +++++++++++++++++++++++-
>  4 files changed, 353 insertions(+), 58 deletions(-)
>
> --
> 2.46.0.76.ge559c4bf1a-goog
Hi, Jeff,
Here is an update per your test request.
We extended the runtime to 600 seconds and ran the test 10 times for each commit.
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/***600s***
commit:
  ff388fe5c4 ("mseal: wire up mseal syscall")
  8be7258aad ("mseal: add mseal syscall")
  2a78ece39f <-- your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm"
ff388fe5c481d39c 8be7258aad44b5e25977a98db13 2a78ece39f13ea6f3f9679a6c66
---------------- --------------------------- ---------------------------
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \
 1.886e+08 ± 0%      -5.0%  1.792e+08 ± 0%      -3.4%  1.821e+08 ± 0%  stress-ng.pagemove.ops
    314345 ± 0%      -5.0%     298656 ± 0%      -3.4%     303565 ± 0%  stress-ng.pagemove.ops_per_sec
The stress-ng.pagemove.ops_per_sec score differs somewhat from the 60s run (listed below for comparison), but the trend is similar.
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/***60s***
commit:
  ff388fe5c4 ("mseal: wire up mseal syscall")
  8be7258aad ("mseal: add mseal syscall")
  2a78ece39f <-- your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm"
ff388fe5c481d39c 8be7258aad44b5e25977a98db13 2a78ece39f13ea6f3f9679a6c66
---------------- --------------------------- ---------------------------
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \
  18386219 ± 0%      -5.0%   17474214 ± 0%      -2.9%   17850959 ± 0%  stress-ng.pagemove.ops
    306421 ± 0%      -5.0%     291207 ± 0%      -2.9%     297490 ± 0%  stress-ng.pagemove.ops_per_sec
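As a quick sanity check on the two tables, ops_per_sec should be approximately ops divided by the nominal test time; the small residual comes from stress-ng dividing by the measured elapsed time, which runs slightly past the nominal timeout. A minimal Python sketch using the aggregate base-commit (ff388fe5c4) numbers from the tables:

```python
# Sanity check: ops_per_sec ~= ops / testtime for the aggregate numbers
# reported above (60s and 600s runs of the ff388fe5c4 base commit).
runs = [
    # (ops, nominal seconds, reported ops_per_sec)
    (18386219, 60, 306421),   # 60s run
    (1.886e8, 600, 314345),   # 600s run
]

for ops, secs, reported in runs:
    derived = ops / secs
    # Allow a small tolerance for the elapsed-vs-nominal time difference.
    rel_err = abs(derived - reported) / reported
    print(f"{ops:.4g} ops / {secs}s = {derived:.0f} "
          f"vs reported {reported} ({100 * rel_err:.3f}% off)")
```

Both runs come out well under 0.1% off, so the per-second scores are internally consistent with the raw op counts.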
Since the data is stable, %stddev shows as "± 0%" in both tables above. Let me give the detailed data for the 600s runs.
for ff388fe5c4 ("mseal: wire up mseal syscall")
"stress-ng.pagemove.ops": [
  188545955, 188681834, 188907282, 188345009, 188729465,
  188312187, 188897283, 188209713, 188425965, 189026136
],
"stress-ng.pagemove.ops_per_sec": [
  314242.1, 314467.13, 314841.5, 313907.19, 314548.11,
  313852.5, 314827.84, 313680.74, 314042.14, 315042.79
],
for 8be7258aad ("mseal: add mseal syscall")
"stress-ng.pagemove.ops": [
  179127848, 179401350, 179350278, 179023817, 179106624,
  179535213, 178936504, 178870141, 179462171, 179136065
],
"stress-ng.pagemove.ops_per_sec": [
  298545.54, 299000.95, 298915.62, 298371.45, 298509.15,
  299223.65, 298226.74, 298115.08, 299101.23, 298558.74
],
for 2a78ece39f <-- your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm"
"stress-ng.pagemove.ops": [
  182188207, 182288813, 182483678, 181980233, 182249440,
  181837961, 182155893, 181699445, 182347580, 182174597
],
"stress-ng.pagemove.ops_per_sec": [
  303643.28, 303814.05, 304138.38, 303298.9, 303747.33,
  303060.84, 303592.48, 302831.56, 303909.81, 303622.07
],
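The summary statistics in the tables above can be reproduced from these raw per-run numbers. A minimal Python sketch (the short commit labels are just shorthand for the three kernels listed above); note the relative stddev comes out around 0.1-0.2%, which is why the tables round it to "± 0%":

```python
import statistics

# Per-run ops_per_sec from the 600s runs above (10 runs per commit).
runs = {
    "ff388fe5c4 (wire up mseal)": [
        314242.1, 314467.13, 314841.5, 313907.19, 314548.11,
        313852.5, 314827.84, 313680.74, 314042.14, 315042.79],
    "8be7258aad (add mseal)": [
        298545.54, 299000.95, 298915.62, 298371.45, 298509.15,
        299223.65, 298226.74, 298115.08, 299101.23, 298558.74],
    "2a78ece39f (mremap refactor)": [
        303643.28, 303814.05, 304138.38, 303298.9, 303747.33,
        303060.84, 303592.48, 302831.56, 303909.81, 303622.07],
}

base = statistics.mean(runs["ff388fe5c4 (wire up mseal)"])
for name, vals in runs.items():
    mean = statistics.mean(vals)
    stdev = statistics.stdev(vals)          # sample standard deviation
    stderr = stdev / len(vals) ** 0.5       # standard error of the mean
    change = 100.0 * (mean - base) / base   # %change vs the base commit
    print(f"{name}: mean={mean:.0f} stddev={stdev:.0f} "
          f"({100 * stdev / mean:.2f}% of mean) stderr={stderr:.0f} "
          f"change={change:+.1f}%")
```

This reproduces the -5.0% (mseal) and -3.4% (mremap refactor) changes reported above, with far tighter spread than the 60-second ChromeOS runs quoted earlier in the thread.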
For the 600s run, below is the full comparison.
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/***600s***
commit:
  ff388fe5c4 ("mseal: wire up mseal syscall")
  8be7258aad ("mseal: add mseal syscall")
  2a78ece39f <-- your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm"
ff388fe5c481d39c 8be7258aad44b5e25977a98db13 2a78ece39f13ea6f3f9679a6c66 ---------------- --------------------------- --------------------------- %stddev %change %stddev %change %stddev \ | \ | \ 4667 ± 0% -2.4% 4553 ± 0% -1.6% 4593 ± 0% vmstat.system.cs 4.192e+08 ± 0% -4.3% 4.012e+08 ± 0% -2.8% 4.075e+08 ± 0% proc-vmstat.numa_hit 4.192e+08 ± 0% -4.3% 4.011e+08 ± 0% -2.8% 4.074e+08 ± 0% proc-vmstat.numa_local 7.843e+08 ± 0% -4.3% 7.504e+08 ± 0% -2.8% 7.623e+08 ± 0% proc-vmstat.pgalloc_normal 7.836e+08 ± 0% -4.3% 7.498e+08 ± 0% -2.8% 7.616e+08 ± 0% proc-vmstat.pgfree 1174825 ± 0% -2.6% 1143891 ± 0% -1.7% 1155336 ± 0% time.involuntary_context_switches 5082 ± 0% +1.3% 5147 ± 0% +0.9% 5126 ± 0% time.percent_of_cpu_this_job_got 29840 ± 0% +1.4% 30267 ± 0% +1.0% 30133 ± 0% time.system_time 663.58 ± 1% -5.7% 625.54 ± 1% -4.3% 635.17 ± 0% time.user_time 1.886e+08 ± 0% -5.0% 1.792e+08 ± 0% -3.4% 1.821e+08 ± 0% stress-ng.pagemove.ops 314345 ± 0% -5.0% 298656 ± 0% -3.4% 303565 ± 0% stress-ng.pagemove.ops_per_sec 212508 ± 0% -4.3% 203280 ± 0% -3.1% 205831 ± 0% stress-ng.pagemove.page_remaps_per_sec 1174825 ± 0% -2.6% 1143891 ± 0% -1.7% 1155336 ± 0% stress-ng.time.involuntary_context_switches 5082 ± 0% +1.3% 5147 ± 0% +0.9% 5126 ± 0% stress-ng.time.percent_of_cpu_this_job_got 29840 ± 0% +1.4% 30267 ± 0% +1.0% 30133 ± 0% stress-ng.time.system_time 663.58 ± 1% -5.7% 625.54 ± 1% -4.3% 635.17 ± 0% stress-ng.time.user_time 1.00 ± 0% -7.1% 0.93 ± 0% -4.9% 0.95 ± 0% perf-stat.i.MPKI 3.487e+10 ± 0% +3.5% 3.607e+10 ± 0% +2.4% 3.57e+10 ± 0% perf-stat.i.branch-instructions 0.21 ± 0% -0.0 0.19 ± 3% -0.0 0.20 ± 0% perf-stat.i.branch-miss-rate% 1.763e+08 ± 0% -5.0% 1.675e+08 ± 0% -3.4% 1.704e+08 ± 0% perf-stat.i.cache-misses 2.342e+08 ± 0% -4.9% 2.228e+08 ± 0% -3.3% 2.264e+08 ± 0% perf-stat.i.cache-references 4650 ± 0% -2.4% 4537 ± 0% -1.5% 4578 ± 0% perf-stat.i.context-switches 1.11 ± 0% -2.2% 1.09 ± 0% -1.6% 1.10 ± 0% perf-stat.i.cpi 172.66 ± 0% -2.8% 167.77 ± 0% -1.8% 169.52 ± 0% 
perf-stat.i.cpu-migrations 1121 ± 0% +5.2% 1180 ± 0% +3.5% 1160 ± 0% perf-stat.i.cycles-between-cache-misses 1.772e+11 ± 0% +2.2% 1.812e+11 ± 0% +1.6% 1.801e+11 ± 0% perf-stat.i.instructions 0.90 ± 0% +2.3% 0.92 ± 0% +1.6% 0.91 ± 0% perf-stat.i.ipc 0.99 ± 0% -7.1% 0.92 ± 0% -4.9% 0.95 ± 0% perf-stat.overall.MPKI 0.21 ± 0% -0.0 0.19 ± 3% -0.0 0.20 ± 0% perf-stat.overall.branch-miss-rate% 1.11 ± 0% -2.2% 1.09 ± 0% -1.6% 1.10 ± 0% perf-stat.overall.cpi 1120 ± 0% +5.2% 1179 ± 0% +3.5% 1159 ± 0% perf-stat.overall.cycles-between-cache-misses 0.90 ± 0% +2.3% 0.92 ± 0% +1.6% 0.91 ± 0% perf-stat.overall.ipc 3.48e+10 ± 0% +3.5% 3.6e+10 ± 0% +2.4% 3.563e+10 ± 0% perf-stat.ps.branch-instructions 1.759e+08 ± 0% -5.0% 1.672e+08 ± 0% -3.4% 1.7e+08 ± 0% perf-stat.ps.cache-misses 2.338e+08 ± 0% -4.9% 2.224e+08 ± 0% -3.3% 2.26e+08 ± 0% perf-stat.ps.cache-references 4642 ± 0% -2.4% 4529 ± 0% -1.5% 4570 ± 0% perf-stat.ps.context-switches 172.30 ± 0% -2.8% 167.43 ± 0% -1.8% 169.17 ± 0% perf-stat.ps.cpu-migrations 1.769e+11 ± 0% +2.3% 1.808e+11 ± 0% +1.6% 1.797e+11 ± 0% perf-stat.ps.instructions 1.063e+14 ± 0% +2.3% 1.087e+14 ± 0% +1.7% 1.081e+14 ± 0% perf-stat.total.instructions 74.86 ± 0% -2.1 72.76 ± 0% -0.8 74.06 ± 0% perf-profile.calltrace.cycles-pp.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 36.72 ± 0% -1.7 35.04 ± 0% -1.2 35.54 ± 0% perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 24.93 ± 0% -1.4 23.54 ± 0% -0.8 24.12 ± 0% perf-profile.calltrace.cycles-pp.copy_vma.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 19.91 ± 0% -1.1 18.79 ± 0% -0.7 19.17 ± 0% perf-profile.calltrace.cycles-pp.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 14.71 ± 0% -0.8 13.90 ± 0% -0.4 14.30 ± 0% perf-profile.calltrace.cycles-pp.vma_merge.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 10.82 ± 2% -0.6 10.22 ± 2% -0.6 10.25 ± 2% 
perf-profile.calltrace.cycles-pp.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm 10.81 ± 2% -0.6 10.21 ± 2% -0.6 10.24 ± 2% perf-profile.calltrace.cycles-pp.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm 10.81 ± 2% -0.6 10.21 ± 2% -0.6 10.24 ± 2% perf-profile.calltrace.cycles-pp.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork 10.80 ± 2% -0.6 10.21 ± 2% -0.6 10.23 ± 2% perf-profile.calltrace.cycles-pp.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread 10.85 ± 2% -0.6 10.26 ± 2% -0.6 10.28 ± 2% perf-profile.calltrace.cycles-pp.kthread.ret_from_fork.ret_from_fork_asm 10.85 ± 2% -0.6 10.26 ± 2% -0.6 10.28 ± 2% perf-profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm 10.85 ± 2% -0.6 10.26 ± 2% -0.6 10.28 ± 2% perf-profile.calltrace.cycles-pp.ret_from_fork_asm 10.76 ± 2% -0.6 10.17 ± 2% -0.6 10.20 ± 2% perf-profile.calltrace.cycles-pp.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn 1.49 ± 1% -0.5 0.98 ± 0% -0.5 1.00 ± 0% perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 7.86 ± 0% -0.4 7.48 ± 0% -0.3 7.59 ± 0% perf-profile.calltrace.cycles-pp.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 6.72 ± 0% -0.4 6.37 ± 0% -0.2 6.49 ± 0% perf-profile.calltrace.cycles-pp.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 6.06 ± 2% -0.3 5.71 ± 2% -0.3 5.73 ± 2% perf-profile.calltrace.cycles-pp.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd 6.11 ± 0% -0.3 5.77 ± 0% -0.2 5.90 ± 0% perf-profile.calltrace.cycles-pp.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 6.11 ± 0% -0.3 5.78 ± 1% -0.2 5.90 ± 0% perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap 5.50 ± 0% -0.3 5.19 ± 0% -0.2 5.31 ± 0% 
perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap 5.52 ± 0% -0.3 5.22 ± 0% -0.2 5.35 ± 0% perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_merge.copy_vma.move_vma.__do_sys_mremap 5.15 ± 0% -0.3 4.86 ± 0% -0.2 4.97 ± 0% perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap 5.77 ± 0% -0.3 5.48 ± 0% -0.2 5.58 ± 0% perf-profile.calltrace.cycles-pp.move_ptes.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64 5.16 ± 0% -0.3 4.88 ± 0% -0.1 5.01 ± 0% perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma.move_vma 4.72 ± 2% -0.3 4.44 ± 2% -0.3 4.45 ± 2% perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs 4.64 ± 0% -0.3 4.38 ± 0% -0.1 4.51 ± 1% perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma 4.07 ± 0% -0.2 3.84 ± 0% -0.2 3.92 ± 0% perf-profile.calltrace.cycles-pp.vm_area_dup.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 3.96 ± 1% -0.2 3.76 ± 1% -0.1 3.88 ± 1% perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma 3.54 ± 0% -0.2 3.34 ± 0% -0.1 3.41 ± 1% perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma.__do_sys_mremap 38.68 ± 0% -0.2 38.49 ± 0% +0.4 39.05 ± 0% perf-profile.calltrace.cycles-pp.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.55 ± 1% -0.2 0.36 ± 65% -0.0 0.52 ± 1% perf-profile.calltrace.cycles-pp.mas_find.find_vma_prev.copy_vma.move_vma.__do_sys_mremap 3.41 ± 0% -0.2 3.22 ± 0% -0.1 3.28 ± 0% perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma.__do_sys_mremap 1.35 ± 0% -0.2 1.17 ± 0% -0.1 1.23 ± 0% perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap 2.22 ± 0% -0.2 2.05 ± 0% -0.1 2.12 
± 0% perf-profile.calltrace.cycles-pp.find_vma_prev.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 2.27 ± 0% -0.2 2.10 ± 0% -0.1 2.15 ± 0% perf-profile.calltrace.cycles-pp.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 3.25 ± 0% -0.2 3.08 ± 0% -0.1 3.14 ± 0% perf-profile.calltrace.cycles-pp.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 3.12 ± 2% -0.2 2.97 ± 2% -0.1 3.04 ± 2% perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap 0.96 ± 0% -0.1 0.82 ± 1% -0.1 0.87 ± 1% perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.do_munmap.mremap_to 2.98 ± 1% -0.1 2.84 ± 1% -0.1 2.89 ± 2% perf-profile.calltrace.cycles-pp.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 8.19 ± 0% -0.1 8.05 ± 0% -0.1 8.04 ± 0% perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 3.13 ± 0% -0.1 3.00 ± 0% -0.1 3.06 ± 0% perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma 0.53 ± 1% -0.1 0.41 ± 50% -0.2 0.30 ± 81% perf-profile.calltrace.cycles-pp.arch_get_unmapped_area_topdown_vmflags.thp_get_unmapped_area_vmflags.__get_unmapped_area.mremap_to.__do_sys_mremap 1.73 ± 2% -0.1 1.61 ± 2% -0.0 1.70 ± 3% perf-profile.calltrace.cycles-pp.vma_prepare.vma_merge.copy_vma.move_vma.__do_sys_mremap 2.14 ± 2% -0.1 2.02 ± 2% -0.0 2.09 ± 2% perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap 2.46 ± 0% -0.1 2.34 ± 0% -0.1 2.38 ± 0% perf-profile.calltrace.cycles-pp.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma 2.04 ± 0% -0.1 1.93 ± 0% -0.1 1.96 ± 0% perf-profile.calltrace.cycles-pp.mas_preallocate.vma_merge.copy_vma.move_vma.__do_sys_mremap 1.85 ± 0% -0.1 1.74 ± 0% -0.1 1.78 ± 0% perf-profile.calltrace.cycles-pp.vma_link.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 
2.22 ± 0% -0.1 2.12 ± 0% -0.1 2.15 ± 0% perf-profile.calltrace.cycles-pp.native_flush_tlb_one_user.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables 1.40 ± 0% -0.1 1.30 ± 0% -0.1 1.33 ± 0% perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap 0.56 ± 1% -0.1 0.46 ± 33% -0.0 0.54 ± 2% perf-profile.calltrace.cycles-pp.mas_walk.mas_prev_setup.mas_prev.vma_merge.copy_vma 1.80 ± 2% -0.1 1.70 ± 2% -0.1 1.74 ± 2% perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma 2.43 ± 0% -0.1 2.33 ± 0% -0.1 2.37 ± 0% perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap 1.25 ± 0% -0.1 1.15 ± 1% -0.1 1.19 ± 0% perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap 0.94 ± 1% -0.1 0.86 ± 0% -0.1 0.87 ± 0% perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap 1.38 ± 0% -0.1 1.30 ± 0% -0.1 1.33 ± 1% perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma.move_vma 1.22 ± 0% -0.1 1.14 ± 0% -0.1 1.17 ± 1% perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma 1.28 ± 0% -0.1 1.21 ± 0% -0.0 1.23 ± 0% perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma 1.54 ± 1% -0.1 1.46 ± 0% -0.0 1.49 ± 0% perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap 1.15 ± 0% -0.1 1.08 ± 1% -0.1 1.09 ± 0% perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap 0.73 ± 1% -0.1 0.67 ± 1% -0.0 0.69 ± 1% perf-profile.calltrace.cycles-pp.mas_walk.find_vma_prev.copy_vma.move_vma.__do_sys_mremap 0.72 ± 0% -0.1 0.66 ± 1% -0.0 0.69 ± 1% 
perf-profile.calltrace.cycles-pp.mas_prev.vma_merge.copy_vma.move_vma.__do_sys_mremap 1.64 ± 1% -0.1 1.58 ± 0% -0.1 1.58 ± 0% perf-profile.calltrace.cycles-pp.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.78 ± 1% -0.1 0.72 ± 1% -0.0 0.75 ± 1% perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma 0.63 ± 1% -0.1 0.57 ± 1% -0.0 0.60 ± 1% perf-profile.calltrace.cycles-pp.mas_prev_setup.mas_prev.vma_merge.copy_vma.move_vma 0.69 ± 2% -0.1 0.63 ± 4% -0.0 0.66 ± 2% perf-profile.calltrace.cycles-pp.mod_objcg_state.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.copy_vma 0.60 ± 1% -0.1 0.54 ± 1% -0.0 0.58 ± 1% perf-profile.calltrace.cycles-pp.security_mmap_addr.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64 0.79 ± 2% -0.1 0.74 ± 3% -0.0 0.75 ± 2% perf-profile.calltrace.cycles-pp.__call_rcu_common.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge 1.12 ± 0% -0.0 1.08 ± 0% -0.0 1.09 ± 1% perf-profile.calltrace.cycles-pp.clear_bhb_loop.mremap 0.67 ± 1% -0.0 0.62 ± 1% -0.0 0.63 ± 1% perf-profile.calltrace.cycles-pp.percpu_counter_add_batch.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 0.77 ± 1% -0.0 0.72 ± 1% -0.0 0.73 ± 1% perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge 1.01 ± 1% -0.0 0.96 ± 0% -0.0 0.98 ± 0% perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region 0.86 ± 0% -0.0 0.81 ± 1% -0.0 0.83 ± 1% perf-profile.calltrace.cycles-pp.mtree_load.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64 0.82 ± 1% -0.0 0.78 ± 1% -0.0 0.79 ± 1% perf-profile.calltrace.cycles-pp.mtree_load.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 1.01 ± 0% -0.0 0.97 ± 0% -0.0 0.98 ± 0% perf-profile.calltrace.cycles-pp.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 
0.98 ± 1% -0.0 0.94 ± 0% -0.0 0.94 ± 1% perf-profile.calltrace.cycles-pp.mas_find.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 0.78 ± 0% -0.0 0.74 ± 1% -0.0 0.75 ± 1% perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_link.copy_vma.move_vma.__do_sys_mremap 0.68 ± 0% -0.0 0.64 ± 1% -0.0 0.65 ± 0% perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma 0.68 ± 1% -0.0 0.64 ± 1% -0.0 0.64 ± 1% perf-profile.calltrace.cycles-pp.mas_preallocate.vma_link.copy_vma.move_vma.__do_sys_mremap 0.89 ± 1% -0.0 0.85 ± 1% -0.0 0.86 ± 1% perf-profile.calltrace.cycles-pp.mtree_load.vma_merge.copy_vma.move_vma.__do_sys_mremap 0.62 ± 1% -0.0 0.58 ± 2% -0.0 0.59 ± 1% perf-profile.calltrace.cycles-pp.get_old_pud.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64 0.62 ± 1% -0.0 0.58 ± 1% -0.0 0.59 ± 1% perf-profile.calltrace.cycles-pp.mas_prev_slot.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 0.76 ± 1% -0.0 0.72 ± 1% -0.0 0.73 ± 1% perf-profile.calltrace.cycles-pp.allocate_slab.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma 1.01 ± 0% -0.0 0.97 ± 1% -0.0 0.98 ± 1% perf-profile.calltrace.cycles-pp.mt_find.vma_merge.copy_vma.move_vma.__do_sys_mremap 0.64 ± 1% -0.0 0.60 ± 1% -0.0 0.61 ± 1% perf-profile.calltrace.cycles-pp.mas_update_gap.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma 0.88 ± 1% -0.0 0.85 ± 0% -0.0 0.85 ± 0% perf-profile.calltrace.cycles-pp.userfaultfd_unmap_complete.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 0.69 ± 1% -0.0 0.66 ± 1% -0.0 0.67 ± 0% perf-profile.calltrace.cycles-pp.__call_rcu_common.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 0.59 ± 1% -0.0 0.56 ± 1% -0.0 0.56 ± 0% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.mremap 0.82 ± 1% -0.0 0.82 ± 1% -0.0 0.79 ± 1% perf-profile.calltrace.cycles-pp.thp_get_unmapped_area_vmflags.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64 0.76 ± 1% +0.1 0.83 ± 0% +0.1 0.84 ± 
0% perf-profile.calltrace.cycles-pp.__madvise 0.67 ± 1% +0.1 0.73 ± 1% +0.1 0.75 ± 1% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__madvise 0.63 ± 1% +0.1 0.70 ± 1% +0.1 0.71 ± 0% perf-profile.calltrace.cycles-pp.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise 0.62 ± 1% +0.1 0.69 ± 1% +0.1 0.71 ± 0% perf-profile.calltrace.cycles-pp.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise 0.66 ± 1% +0.1 0.73 ± 1% +0.1 0.74 ± 0% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise 87.57 ± 0% +0.6 88.14 ± 0% +0.5 88.09 ± 0% perf-profile.calltrace.cycles-pp.mremap 84.74 ± 0% +0.7 85.47 ± 0% +0.6 85.37 ± 0% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.mremap 84.58 ± 0% +0.7 85.32 ± 0% +0.6 85.22 ± 0% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 83.64 ± 0% +0.8 84.41 ± 0% +0.7 84.30 ± 0% perf-profile.calltrace.cycles-pp.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 0.00 ± -1% +0.9 0.86 ± 0% +0.9 0.92 ± 0% perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.do_munmap 0.00 ± -1% +0.9 0.87 ± 0% +0.0 0.00 ± -1% perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.mremap_to.__do_sys_mremap 0.00 ± -1% +0.9 0.91 ± 2% +0.9 0.92 ± 1% perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.move_vma 0.00 ± -1% +1.1 1.09 ± 0% +0.0 0.00 ± -1% perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64 0.00 ± -1% +1.2 1.21 ± 0% +1.3 1.29 ± 0% perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to 2.10 ± 0% +1.5 3.61 ± 0% +1.7 3.79 ± 0% perf-profile.calltrace.cycles-pp.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.00 ± -1% +1.5 1.51 ± 1% +1.5 1.52 ± 0% 
perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap 1.60 ± 0% +1.5 3.13 ± 0% +1.7 3.31 ± 0% perf-profile.calltrace.cycles-pp.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64 0.00 ± -1% +1.6 1.60 ± 0% +0.0 0.00 ± -1% perf-profile.calltrace.cycles-pp.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.00 ± -1% +1.7 1.73 ± 0% +1.8 1.84 ± 0% perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap 0.00 ± -1% +2.0 2.00 ± 1% +2.0 2.04 ± 0% perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 5.35 ± 0% +3.0 8.37 ± 0% +1.6 6.92 ± 0% perf-profile.calltrace.cycles-pp.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 75.03 ± 0% -2.1 72.92 ± 0% -0.8 74.22 ± 0% perf-profile.children.cycles-pp.move_vma 36.94 ± 0% -1.7 35.25 ± 0% -1.2 35.75 ± 0% perf-profile.children.cycles-pp.do_vmi_align_munmap 25.01 ± 0% -1.4 23.61 ± 0% -0.8 24.19 ± 0% perf-profile.children.cycles-pp.copy_vma 20.00 ± 0% -1.1 18.88 ± 0% -0.7 19.26 ± 0% perf-profile.children.cycles-pp.__split_vma 19.92 ± 0% -1.1 18.84 ± 0% -0.8 19.14 ± 0% perf-profile.children.cycles-pp.handle_softirqs 19.90 ± 0% -1.1 18.82 ± 0% -0.8 19.12 ± 0% perf-profile.children.cycles-pp.rcu_core 19.88 ± 0% -1.1 18.80 ± 0% -0.8 19.10 ± 0% perf-profile.children.cycles-pp.rcu_do_batch 17.57 ± 0% -0.9 16.66 ± 0% -0.6 16.94 ± 0% perf-profile.children.cycles-pp.kmem_cache_free 15.29 ± 0% -0.9 14.43 ± 0% -0.5 14.75 ± 0% perf-profile.children.cycles-pp.kmem_cache_alloc_noprof 15.11 ± 0% -0.8 14.27 ± 0% -0.4 14.68 ± 0% perf-profile.children.cycles-pp.vma_merge 12.15 ± 0% -0.7 11.46 ± 0% -0.5 11.65 ± 0% perf-profile.children.cycles-pp.__slab_free 12.11 ± 0% -0.7 11.43 ± 0% -0.4 11.71 ± 0% perf-profile.children.cycles-pp.mas_wr_store_entry 11.90 ± 0% -0.7 11.24 ± 0% -0.4 11.50 ± 0% perf-profile.children.cycles-pp.mas_store_prealloc 10.82 ± 2% 
-0.6 10.22 ± 2% -0.6 10.25 ± 2% perf-profile.children.cycles-pp.smpboot_thread_fn 10.81 ± 2% -0.6 10.21 ± 2% -0.6 10.24 ± 2% perf-profile.children.cycles-pp.run_ksoftirqd 10.85 ± 2% -0.6 10.26 ± 2% -0.6 10.28 ± 2% perf-profile.children.cycles-pp.kthread 10.85 ± 2% -0.6 10.26 ± 2% -0.6 10.28 ± 2% perf-profile.children.cycles-pp.ret_from_fork 10.85 ± 2% -0.6 10.26 ± 2% -0.6 10.28 ± 2% perf-profile.children.cycles-pp.ret_from_fork_asm 10.85 ± 0% -0.6 10.26 ± 0% -0.4 10.47 ± 0% perf-profile.children.cycles-pp.vm_area_dup 9.81 ± 0% -0.5 9.28 ± 0% -0.3 9.52 ± 0% perf-profile.children.cycles-pp.mas_wr_node_store 8.38 ± 1% -0.5 7.90 ± 1% -0.2 8.13 ± 1% perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook 7.98 ± 0% -0.4 7.58 ± 0% -0.3 7.70 ± 0% perf-profile.children.cycles-pp.move_page_tables 6.66 ± 0% -0.4 6.29 ± 0% -0.2 6.43 ± 0% perf-profile.children.cycles-pp.vma_complete 5.12 ± 0% -0.3 4.79 ± 0% -0.2 4.88 ± 0% perf-profile.children.cycles-pp.mas_preallocate 6.05 ± 0% -0.3 5.72 ± 0% -0.2 5.82 ± 0% perf-profile.children.cycles-pp.vm_area_free_rcu_cb 5.85 ± 0% -0.3 5.56 ± 0% -0.2 5.66 ± 0% perf-profile.children.cycles-pp.move_ptes 3.51 ± 1% -0.2 3.28 ± 2% -0.1 3.37 ± 1% perf-profile.children.cycles-pp.mod_objcg_state 3.45 ± 0% -0.2 3.24 ± 0% -0.2 3.30 ± 0% perf-profile.children.cycles-pp.___slab_alloc 2.91 ± 0% -0.2 2.71 ± 0% -0.1 2.78 ± 0% perf-profile.children.cycles-pp.mas_alloc_nodes 3.47 ± 0% -0.2 3.27 ± 0% -0.1 3.34 ± 0% perf-profile.children.cycles-pp.flush_tlb_mm_range 3.43 ± 1% -0.2 3.24 ± 1% -0.1 3.35 ± 2% perf-profile.children.cycles-pp.down_write 2.44 ± 0% -0.2 2.25 ± 0% -0.1 2.32 ± 0% perf-profile.children.cycles-pp.find_vma_prev 4.24 ± 1% -0.2 4.06 ± 1% -0.1 4.11 ± 1% perf-profile.children.cycles-pp.anon_vma_clone 3.35 ± 0% -0.2 3.18 ± 0% -0.1 3.24 ± 0% perf-profile.children.cycles-pp.mas_store_gfp 2.21 ± 1% -0.2 2.05 ± 0% -0.1 2.10 ± 0% perf-profile.children.cycles-pp.__cond_resched 3.32 ± 0% -0.2 3.17 ± 1% -0.1 3.24 ± 0% 
perf-profile.children.cycles-pp.__memcg_slab_free_hook 8.26 ± 0% -0.1 8.12 ± 0% -0.1 8.11 ± 0% perf-profile.children.cycles-pp.unmap_region 2.22 ± 1% -0.1 2.08 ± 1% -0.1 2.16 ± 3% perf-profile.children.cycles-pp.vma_prepare 2.67 ± 0% -0.1 2.54 ± 0% -0.1 2.58 ± 0% perf-profile.children.cycles-pp.mtree_load 3.18 ± 0% -0.1 3.05 ± 0% -0.1 3.11 ± 0% perf-profile.children.cycles-pp.unmap_vmas 2.46 ± 0% -0.1 2.34 ± 0% -0.1 2.38 ± 0% perf-profile.children.cycles-pp.rcu_cblist_dequeue 2.50 ± 0% -0.1 2.39 ± 0% -0.1 2.43 ± 0% perf-profile.children.cycles-pp.flush_tlb_func 2.11 ± 1% -0.1 2.00 ± 1% -0.1 2.02 ± 1% perf-profile.children.cycles-pp.__call_rcu_common 2.04 ± 1% -0.1 1.93 ± 1% -0.1 1.95 ± 1% perf-profile.children.cycles-pp.allocate_slab 1.77 ± 1% -0.1 1.66 ± 0% -0.1 1.69 ± 1% perf-profile.children.cycles-pp.mas_wr_walk 1.87 ± 0% -0.1 1.77 ± 0% -0.1 1.80 ± 0% perf-profile.children.cycles-pp.vma_link 2.24 ± 0% -0.1 2.13 ± 0% -0.1 2.17 ± 0% perf-profile.children.cycles-pp.native_flush_tlb_one_user 1.85 ± 1% -0.1 1.74 ± 0% -0.1 1.79 ± 2% perf-profile.children.cycles-pp.up_write 2.48 ± 0% -0.1 2.38 ± 0% -0.1 2.42 ± 0% perf-profile.children.cycles-pp.unmap_page_range 0.97 ± 2% -0.1 0.88 ± 1% -0.1 0.90 ± 1% perf-profile.children.cycles-pp.rcu_all_qs 1.04 ± 0% -0.1 0.95 ± 1% -0.0 0.99 ± 1% perf-profile.children.cycles-pp.mas_prev 1.24 ± 0% -0.1 1.16 ± 0% -0.1 1.19 ± 0% perf-profile.children.cycles-pp.mas_prev_slot 0.93 ± 0% -0.1 0.85 ± 1% -0.0 0.88 ± 1% perf-profile.children.cycles-pp.mas_prev_setup 1.39 ± 1% -0.1 1.31 ± 1% -0.1 1.33 ± 1% perf-profile.children.cycles-pp.shuffle_freelist 1.52 ± 0% -0.1 1.45 ± 0% -0.0 1.48 ± 0% perf-profile.children.cycles-pp.mas_update_gap 1.58 ± 1% -0.1 1.50 ± 0% -0.0 1.53 ± 0% perf-profile.children.cycles-pp.zap_pmd_range 0.87 ± 1% -0.1 0.80 ± 0% -0.1 0.82 ± 1% perf-profile.children.cycles-pp._raw_spin_lock_irqsave 1.68 ± 1% -0.1 1.62 ± 0% -0.1 1.62 ± 0% perf-profile.children.cycles-pp.__get_unmapped_area 0.90 ± 1% -0.1 0.84 ± 0% -0.0 0.86 ± 
1% perf-profile.children.cycles-pp.percpu_counter_add_batch 0.62 ± 1% -0.1 0.56 ± 1% -0.0 0.60 ± 1% perf-profile.children.cycles-pp.security_mmap_addr 0.49 ± 1% -0.1 0.44 ± 1% -0.1 0.44 ± 1% perf-profile.children.cycles-pp.setup_object 1.02 ± 0% -0.1 0.97 ± 1% -0.0 0.99 ± 0% perf-profile.children.cycles-pp.mas_leaf_max_gap 0.98 ± 1% -0.0 0.93 ± 1% -0.0 0.94 ± 1% perf-profile.children.cycles-pp.mas_pop_node 1.22 ± 1% -0.0 1.18 ± 1% -0.0 1.19 ± 1% perf-profile.children.cycles-pp.__pte_offset_map_lock 0.45 ± 2% -0.0 0.40 ± 2% -0.0 0.41 ± 1% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath 1.18 ± 0% -0.0 1.13 ± 0% -0.0 1.15 ± 1% perf-profile.children.cycles-pp.clear_bhb_loop 1.08 ± 1% -0.0 1.03 ± 0% -0.0 1.05 ± 0% perf-profile.children.cycles-pp.zap_pte_range 1.04 ± 0% -0.0 1.00 ± 0% -0.0 1.01 ± 0% perf-profile.children.cycles-pp.vma_to_resize 0.58 ± 1% -0.0 0.53 ± 1% -0.0 0.54 ± 1% perf-profile.children.cycles-pp.mas_wr_end_piv 0.34 ± 2% -0.0 0.30 ± 5% -0.0 0.31 ± 4% perf-profile.children.cycles-pp.get_partial_node 0.64 ± 1% -0.0 0.61 ± 2% -0.0 0.61 ± 1% perf-profile.children.cycles-pp.get_old_pud 0.62 ± 0% -0.0 0.59 ± 0% -0.0 0.59 ± 1% perf-profile.children.cycles-pp.__put_partials 1.14 ± 0% -0.0 1.10 ± 1% -0.0 1.12 ± 1% perf-profile.children.cycles-pp.mt_find 0.90 ± 0% -0.0 0.87 ± 0% -0.0 0.87 ± 0% perf-profile.children.cycles-pp.userfaultfd_unmap_complete 0.61 ± 1% -0.0 0.58 ± 1% -0.0 0.59 ± 0% perf-profile.children.cycles-pp.entry_SYSCALL_64 0.32 ± 2% -0.0 0.29 ± 3% -0.0 0.30 ± 4% perf-profile.children.cycles-pp.security_vm_enough_memory_mm 0.54 ± 1% -0.0 0.52 ± 1% -0.0 0.52 ± 1% perf-profile.children.cycles-pp.arch_get_unmapped_area_topdown_vmflags 0.55 ± 1% -0.0 0.52 ± 1% -0.0 0.54 ± 1% perf-profile.children.cycles-pp.refill_obj_stock 0.45 ± 1% -0.0 0.43 ± 2% -0.0 0.43 ± 2% perf-profile.children.cycles-pp.__alloc_pages_noprof 0.43 ± 1% -0.0 0.41 ± 2% -0.0 0.41 ± 2% perf-profile.children.cycles-pp.get_page_from_freelist 0.17 ± 1% -0.0 0.15 ± 3% 
-0.0 0.16 ± 1% perf-profile.children.cycles-pp.get_any_partial 0.32 ± 1% -0.0 0.30 ± 1% -0.0 0.30 ± 1% perf-profile.children.cycles-pp.pte_offset_map_nolock 0.40 ± 0% -0.0 0.38 ± 1% -0.0 0.39 ± 1% perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack 0.28 ± 2% -0.0 0.26 ± 2% -0.0 0.27 ± 1% perf-profile.children.cycles-pp.khugepaged_enter_vma 0.32 ± 1% -0.0 0.30 ± 1% -0.0 0.30 ± 2% perf-profile.children.cycles-pp.mas_wr_store_setup 0.19 ± 4% -0.0 0.17 ± 4% -0.0 0.18 ± 6% perf-profile.children.cycles-pp.cap_vm_enough_memory 0.29 ± 1% -0.0 0.27 ± 2% -0.0 0.28 ± 3% perf-profile.children.cycles-pp.tlb_gather_mmu 0.09 ± 4% -0.0 0.07 ± 6% -0.0 0.08 ± 5% perf-profile.children.cycles-pp.vma_dup_policy 0.16 ± 3% -0.0 0.14 ± 2% -0.0 0.14 ± 2% perf-profile.children.cycles-pp.mas_wr_append 0.22 ± 2% -0.0 0.20 ± 3% -0.0 0.20 ± 3% perf-profile.children.cycles-pp.__rmqueue_pcplist 0.20 ± 2% -0.0 0.18 ± 2% -0.0 0.19 ± 3% perf-profile.children.cycles-pp.__thp_vma_allowable_orders 0.24 ± 2% -0.0 0.23 ± 2% -0.0 0.23 ± 2% perf-profile.children.cycles-pp.free_pcppages_bulk 0.44 ± 1% +0.0 0.45 ± 1% +0.0 0.46 ± 1% perf-profile.children.cycles-pp.mremap_userfaultfd_prep 0.85 ± 1% +0.0 0.85 ± 1% -0.0 0.81 ± 1% perf-profile.children.cycles-pp.thp_get_unmapped_area_vmflags 0.13 ± 3% +0.0 0.14 ± 3% +0.0 0.15 ± 2% perf-profile.children.cycles-pp.free_pgd_range 0.08 ± 8% +0.0 0.10 ± 3% -0.0 0.08 ± 6% perf-profile.children.cycles-pp.mm_get_unmapped_area_vmflags 0.78 ± 1% +0.1 0.84 ± 0% +0.1 0.86 ± 0% perf-profile.children.cycles-pp.__madvise 0.63 ± 1% +0.1 0.70 ± 1% +0.1 0.72 ± 0% perf-profile.children.cycles-pp.__x64_sys_madvise 0.63 ± 1% +0.1 0.70 ± 0% +0.1 0.71 ± 0% perf-profile.children.cycles-pp.do_madvise 0.00 ± -1% +0.1 0.09 ± 0% +0.1 0.09 ± 5% perf-profile.children.cycles-pp.can_modify_mm_madv 1.32 ± 1% +0.1 1.46 ± 0% +0.2 1.50 ± 0% perf-profile.children.cycles-pp.mas_next_slot 87.96 ± 0% +0.6 88.52 ± 0% +0.5 88.48 ± 0% perf-profile.children.cycles-pp.mremap 85.91 ± 0% +0.8 86.69 ± 
0% +0.7 86.61 ± 0% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe 83.74 ± 0% +0.8 84.52 ± 0% +0.7 84.40 ± 0% perf-profile.children.cycles-pp.__do_sys_mremap 85.42 ± 0% +0.8 86.23 ± 0% +0.7 86.14 ± 0% perf-profile.children.cycles-pp.do_syscall_64 40.36 ± 0% +1.4 41.74 ± 0% +2.1 42.49 ± 0% perf-profile.children.cycles-pp.do_vmi_munmap 2.12 ± 0% +1.5 3.63 ± 0% +1.7 3.81 ± 0% perf-profile.children.cycles-pp.do_munmap 3.62 ± 0% +2.3 5.97 ± 0% +1.7 5.29 ± 0% perf-profile.children.cycles-pp.mas_walk 5.41 ± 0% +3.0 8.44 ± 0% +1.6 6.98 ± 0% perf-profile.children.cycles-pp.mremap_to 5.28 ± 0% +3.2 8.48 ± 0% +2.3 7.56 ± 0% perf-profile.children.cycles-pp.mas_find 0.00 ± -1% +5.4 5.45 ± 0% +3.9 3.94 ± 0% perf-profile.children.cycles-pp.can_modify_mm 11.51 ± 0% -0.6 10.86 ± 0% -0.5 11.04 ± 0% perf-profile.self.cycles-pp.__slab_free 4.23 ± 2% -0.2 4.00 ± 2% -0.1 4.13 ± 2% perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook 2.34 ± 1% -0.1 2.21 ± 1% -0.0 2.30 ± 3% perf-profile.self.cycles-pp.down_write 2.43 ± 0% -0.1 2.31 ± 0% -0.1 2.34 ± 0% perf-profile.self.cycles-pp.rcu_cblist_dequeue 2.34 ± 0% -0.1 2.24 ± 0% -0.1 2.27 ± 0% perf-profile.self.cycles-pp.mtree_load 2.21 ± 0% -0.1 2.11 ± 0% -0.1 2.14 ± 0% perf-profile.self.cycles-pp.native_flush_tlb_one_user 1.75 ± 0% -0.1 1.67 ± 0% -0.0 1.70 ± 0% perf-profile.self.cycles-pp.mod_objcg_state 1.54 ± 1% -0.1 1.46 ± 0% -0.0 1.50 ± 1% perf-profile.self.cycles-pp.up_write 1.52 ± 0% -0.1 1.44 ± 0% -0.1 1.46 ± 0% perf-profile.self.cycles-pp.mas_wr_walk 0.70 ± 3% -0.1 0.63 ± 1% -0.1 0.64 ± 1% perf-profile.self.cycles-pp.rcu_all_qs 1.43 ± 1% -0.1 1.36 ± 1% -0.1 1.36 ± 1% perf-profile.self.cycles-pp.__call_rcu_common 1.01 ± 0% -0.1 0.95 ± 0% -0.0 0.96 ± 0% perf-profile.self.cycles-pp.mas_preallocate 1.40 ± 1% -0.1 1.33 ± 1% -0.0 1.35 ± 0% perf-profile.self.cycles-pp.do_vmi_align_munmap 1.00 ± 0% -0.1 0.94 ± 0% -0.0 0.96 ± 0% perf-profile.self.cycles-pp.mas_prev_slot 1.14 ± 1% -0.1 1.08 ± 1% -0.0 1.10 ± 1% 
perf-profile.self.cycles-pp.shuffle_freelist 1.18 ± 0% -0.1 1.13 ± 0% -0.0 1.16 ± 0% perf-profile.self.cycles-pp.vma_merge 0.94 ± 1% -0.1 0.89 ± 2% -0.0 0.91 ± 1% perf-profile.self.cycles-pp.vm_area_free_rcu_cb 0.88 ± 0% -0.1 0.83 ± 1% -0.0 0.84 ± 0% perf-profile.self.cycles-pp.___slab_alloc 0.50 ± 1% -0.0 0.45 ± 2% -0.0 0.50 ± 1% perf-profile.self.cycles-pp.security_mmap_addr 0.77 ± 1% -0.0 0.72 ± 1% -0.0 0.74 ± 1% perf-profile.self.cycles-pp.percpu_counter_add_batch 0.45 ± 2% -0.0 0.40 ± 2% -0.0 0.41 ± 1% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath 1.17 ± 0% -0.0 1.12 ± 0% -0.0 1.14 ± 1% perf-profile.self.cycles-pp.clear_bhb_loop 1.08 ± 1% -0.0 1.04 ± 1% -0.0 1.06 ± 1% perf-profile.self.cycles-pp.__cond_resched 1.50 ± 2% -0.0 1.46 ± 0% -0.0 1.48 ± 0% perf-profile.self.cycles-pp.kmem_cache_free 1.23 ± 0% -0.0 1.18 ± 0% -0.1 1.18 ± 0% perf-profile.self.cycles-pp.move_vma 0.68 ± 1% -0.0 0.64 ± 0% -0.0 0.65 ± 1% perf-profile.self.cycles-pp.__split_vma 0.80 ± 0% -0.0 0.76 ± 1% -0.0 0.77 ± 0% perf-profile.self.cycles-pp.mas_wr_store_entry 0.61 ± 2% -0.0 0.57 ± 2% -0.0 0.57 ± 6% perf-profile.self.cycles-pp.mremap 0.85 ± 1% -0.0 0.80 ± 1% -0.0 0.81 ± 1% perf-profile.self.cycles-pp.mas_pop_node 0.44 ± 0% -0.0 0.40 ± 1% -0.0 0.40 ± 1% perf-profile.self.cycles-pp.do_munmap 0.98 ± 0% -0.0 0.94 ± 1% -0.0 0.95 ± 0% perf-profile.self.cycles-pp.move_ptes 0.89 ± 0% -0.0 0.86 ± 0% -0.0 0.87 ± 0% perf-profile.self.cycles-pp.mas_leaf_max_gap 0.46 ± 1% -0.0 0.42 ± 1% -0.0 0.43 ± 1% perf-profile.self.cycles-pp.mas_wr_end_piv 0.89 ± 0% -0.0 0.86 ± 0% -0.0 0.87 ± 0% perf-profile.self.cycles-pp.mas_store_gfp 0.79 ± 0% -0.0 0.76 ± 1% -0.0 0.76 ± 0% perf-profile.self.cycles-pp.userfaultfd_unmap_complete 0.99 ± 0% -0.0 0.97 ± 0% -0.0 0.98 ± 0% perf-profile.self.cycles-pp.mt_find 0.87 ± 0% -0.0 0.84 ± 0% -0.0 0.84 ± 0% perf-profile.self.cycles-pp.move_page_tables 0.55 ± 2% -0.0 0.52 ± 1% -0.0 0.52 ± 1% perf-profile.self.cycles-pp.get_old_pud 0.50 ± 0% -0.0 0.47 ± 1% -0.0 
0.48 ± 0% perf-profile.self.cycles-pp.find_vma_prev 0.61 ± 0% -0.0 0.58 ± 1% -0.0 0.59 ± 0% perf-profile.self.cycles-pp.unmap_region 0.66 ± 0% -0.0 0.63 ± 1% -0.0 0.64 ± 0% perf-profile.self.cycles-pp.mas_store_prealloc 0.27 ± 1% -0.0 0.25 ± 1% -0.0 0.26 ± 1% perf-profile.self.cycles-pp.mas_prev_setup 0.61 ± 1% -0.0 0.59 ± 1% -0.0 0.60 ± 1% perf-profile.self.cycles-pp.copy_vma 0.48 ± 0% -0.0 0.45 ± 1% -0.0 0.46 ± 1% perf-profile.self.cycles-pp.flush_tlb_mm_range 0.41 ± 1% -0.0 0.39 ± 1% -0.0 0.40 ± 1% perf-profile.self.cycles-pp._raw_spin_lock_irqsave 0.48 ± 1% -0.0 0.46 ± 1% -0.0 0.47 ± 0% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe 0.50 ± 1% -0.0 0.48 ± 1% -0.0 0.48 ± 1% perf-profile.self.cycles-pp.refill_obj_stock 0.47 ± 1% -0.0 0.46 ± 1% -0.0 0.45 ± 1% perf-profile.self.cycles-pp.arch_get_unmapped_area_topdown_vmflags 0.71 ± 0% -0.0 0.69 ± 1% -0.0 0.69 ± 1% perf-profile.self.cycles-pp.unmap_page_range 0.17 ± 4% -0.0 0.15 ± 4% -0.0 0.16 ± 3% perf-profile.self.cycles-pp.get_partial_node 0.24 ± 1% -0.0 0.22 ± 1% -0.0 0.23 ± 0% perf-profile.self.cycles-pp.mas_prev 0.45 ± 1% -0.0 0.43 ± 0% -0.0 0.44 ± 1% perf-profile.self.cycles-pp.mas_update_gap 0.53 ± 1% -0.0 0.51 ± 0% -0.0 0.51 ± 1% perf-profile.self.cycles-pp.mremap_to 0.21 ± 2% -0.0 0.19 ± 2% -0.0 0.19 ± 2% perf-profile.self.cycles-pp.__get_unmapped_area 0.27 ± 1% -0.0 0.26 ± 1% -0.0 0.25 ± 1% perf-profile.self.cycles-pp.tlb_finish_mmu 0.18 ± 2% -0.0 0.17 ± 2% -0.0 0.18 ± 2% perf-profile.self.cycles-pp.rcu_do_batch 0.06 ± 0% -0.0 0.05 ± 0% -0.0 0.05 ± 0% perf-profile.self.cycles-pp.vma_dup_policy 0.12 ± 0% -0.0 0.11 ± 0% -0.0 0.11 ± 3% perf-profile.self.cycles-pp.mas_wr_append 0.14 ± 3% -0.0 0.13 ± 3% -0.0 0.12 ± 3% perf-profile.self.cycles-pp.x64_sys_call 0.11 ± 0% +0.0 0.12 ± 0% +0.0 0.12 ± 3% perf-profile.self.cycles-pp.free_pgd_range 0.06 ± 5% +0.0 0.07 ± 0% +0.0 0.06 ± 5% perf-profile.self.cycles-pp.mm_get_unmapped_area_vmflags 0.21 ± 0% +0.0 0.22 ± 2% -0.0 0.21 ± 2% 
perf-profile.self.cycles-pp.thp_get_unmapped_area_vmflags 0.45 ± 1% +0.0 0.48 ± 2% +0.0 0.50 ± 1% perf-profile.self.cycles-pp.do_vmi_munmap 0.27 ± 1% +0.0 0.32 ± 2% -0.0 0.26 ± 1% perf-profile.self.cycles-pp.free_pgtables 0.36 ± 2% +0.1 0.44 ± 1% -0.0 0.35 ± 4% perf-profile.self.cycles-pp.unlink_anon_vmas 1.07 ± 1% +0.1 1.19 ± 0% +0.1 1.22 ± 0% perf-profile.self.cycles-pp.mas_next_slot 1.50 ± 0% +0.5 2.02 ± 0% +0.4 1.85 ± 0% perf-profile.self.cycles-pp.mas_find 0.00 ± -1% +1.4 1.38 ± 0% +0.9 0.92 ± 0% perf-profile.self.cycles-pp.can_modify_mm 3.15 ± 0% +2.1 5.26 ± 0% +1.5 4.62 ± 0% perf-profile.self.cycles-pp.mas_walk
On Mon, Aug 19, 2024 at 02:35:40PM +0800, Oliver Sang wrote:
hi, Jeff,
On Mon, Aug 19, 2024 at 09:38:19AM +0800, Oliver Sang wrote:
hi, Jeff,
On Sun, Aug 18, 2024 at 05:28:41PM +0800, Oliver Sang wrote:
hi, Jeff,
On Thu, Aug 15, 2024 at 07:58:57PM -0700, Jeff Xu wrote:
Hi Oliver
[...]
could you explicitly point to the two commit IDs?
sure
the two commits are:
8be7258a: mseal: add mseal syscall
ff388fe5c: mseal: wire up mseal syscall
I failed to apply this patch set on top of "8be7258a: mseal: add mseal syscall".
Looking at your patch set again, [PATCH v1 1/2] "mseal: selftest mremap across VMA boundaries" is just for kselftests,
and I can apply [PATCH v1 2/2] "mseal: refactor mremap to remove can_modify_mm" on top of "8be7258a: mseal: add mseal syscall" cleanly,
so I will start testing this [PATCH v1 2/2].
BTW, I will first use our default setting - "60s testtime; reboot between each run; run 10 times" - since we already have the data for 8be7258a and ff388fe5c, so we can give you an update fairly quickly.
As discussed in some private mail, you want a special run method; could you elaborate on it here? Thanks.
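[Editorial note: the per-run statistics discussed in this thread (mean, stddev, and stderr over repeated stress-ng pagemove runs) can be reproduced with a short script. This is a minimal sketch; the sample values below are hypothetical placeholders, not measured data - in practice they would come from 10 repeated runs of the stress-ng command line quoted earlier in the thread.]

```python
import math
import statistics

# Hypothetical per-run throughput samples (Ops/sec). In practice, collect
# these from 10 repeated runs of:
#   stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --pagemove 64
samples = [3100.0, 2900.0, 3500.0, 2700.0, 4100.0,
           3300.0, 2500.0, 3900.0, 3000.0, 3600.0]

mean = statistics.mean(samples)
stddev = statistics.stdev(samples)          # sample standard deviation (n-1)
stderr = stddev / math.sqrt(len(samples))   # standard error of the mean

print(f"Mean    : {mean:.2f}")
print(f"Std Dev : {stddev:.2f} ({stddev / mean * 100:.2f}% of Mean)")
print(f"Std Err : {stderr:.2f} ({stderr / mean * 100:.2f}% of Mean)")
```

When stderr is a large fraction of the mean (the thread reports 21%-24%), the confidence interval around each commit's mean overlaps the others, which is why a single-digit percent change cannot be measured reliably from this test.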
Here is a quick update before you give us more details about the special run method.
With our default run method (60s testtime; reboot between each run; run 10 times), your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm" partially resolves the regression.
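[Editorial note: "partially resolves" can be checked directly against the stress-ng.pagemove.ops_per_sec figures in the comparison table that follows. A quick calculation, using the values quoted from the report, gives the relative change against the pre-mseal baseline:]

```python
# ops_per_sec values quoted from the lkp comparison table in this thread
baseline   = 306421  # ff388fe5c4 ("mseal: wire up mseal syscall")
with_mseal = 291207  # 8be7258aad ("mseal: add mseal syscall")
refactored = 297490  # 2a78ece39f (the mremap refactor patch)

def pct_change(new, base):
    """Percent change of `new` relative to `base`."""
    return (new - base) / base * 100

print(f"mseal vs baseline   : {pct_change(with_mseal, baseline):+.1f}%")   # -5.0%
print(f"refactor vs baseline: {pct_change(refactored, baseline):+.1f}%")  # -2.9%
```

That is, the refactor recovers roughly 2 percentage points of the 5% throughput drop, matching the -5.0% and -2.9% columns in the table below.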
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/60s
commit:
  ff388fe5c4 ("mseal: wire up mseal syscall")
  8be7258aad ("mseal: add mseal syscall")
  2a78ece39f <-- your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm"
ff388fe5c481d39c 8be7258aad44b5e25977a98db13 2a78ece39f13ea6f3f9679a6c66
%stddev %change %stddev %change %stddev \ | \ | \ 4957 +1.3% 5023 +1.0% 5008 time.percent_of_cpu_this_job_got 2915 +1.5% 2959 +1.2% 2949 time.system_time 65.96 -7.3% 61.16 -5.5% 62.30 time.user_time
41535878 -4.0% 39873501 -2.6% 40452264 proc-vmstat.numa_hit 41466104 -4.0% 39806121 -2.6% 40384854 proc-vmstat.numa_local 77297398 -4.1% 74165258 -2.6% 75286134 proc-vmstat.pgalloc_normal 77016866 -4.1% 73886027 -2.6% 75012630 proc-vmstat.pgfree 18386219 -5.0% 17474214 -2.9% 17850959 stress-ng.pagemove.ops 306421 -5.0% 291207 -2.9% 297490 stress-ng.pagemove.ops_per_sec 4957 +1.3% 5023 +1.0% 5008 stress-ng.time.percent_of_cpu_this_job_got 2915 +1.5% 2959 +1.2% 2949 stress-ng.time.system_time 3.349e+10 ± 4% +3.0% 3.447e+10 ± 2% +4.1% 3.484e+10 perf-stat.i.branch-instructions 1.13 -2.1% 1.10 -2.2% 1.10 perf-stat.i.cpi 0.89 +2.2% 0.91 +2.0% 0.91 perf-stat.i.ipc 1.04 -6.9% 0.97 -4.9% 0.99 perf-stat.overall.MPKI 1.13 -2.3% 1.10 -2.0% 1.10 perf-stat.overall.cpi 1081 +5.0% 1136 +3.0% 1114 perf-stat.overall.cycles-between-cache-misses 0.89 +2.3% 0.91 +2.0% 0.91 perf-stat.overall.ipc 3.295e+10 ± 3% +2.9% 3.392e+10 ± 2% +4.0% 3.427e+10 perf-stat.ps.branch-instructions 1.674e+11 ± 3% +1.8% 1.704e+11 ± 2% +3.3% 1.73e+11 perf-stat.ps.instructions 1.046e+13 +2.7% 1.074e+13 +1.7% 1.064e+13 perf-stat.total.instructions 75.05 -2.0 73.02 -0.9 74.18 perf-profile.calltrace.cycles-pp.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 36.83 -1.6 35.19 -1.2 35.62 perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 25.02 -1.4 23.65 -0.9 24.12 perf-profile.calltrace.cycles-pp.copy_vma.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 19.94 -1.1 18.87 -0.8 19.19 perf-profile.calltrace.cycles-pp.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 14.78 -0.8 14.01 -0.5 14.28 perf-profile.calltrace.cycles-pp.vma_merge.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 1.48 -0.5 0.99 -0.5 1.00 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 7.88 -0.4 7.47 -0.3 7.62 
perf-profile.calltrace.cycles-pp.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 6.73 -0.4 6.37 -0.2 6.51 perf-profile.calltrace.cycles-pp.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 6.16 -0.3 5.82 -0.3 5.90 perf-profile.calltrace.cycles-pp.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 6.12 -0.3 5.79 -0.2 5.93 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap 5.79 -0.3 5.48 -0.2 5.59 perf-profile.calltrace.cycles-pp.move_ptes.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64 5.54 -0.3 5.25 -0.2 5.32 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap 5.56 -0.3 5.28 -0.2 5.36 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_merge.copy_vma.move_vma.__do_sys_mremap 5.19 -0.3 4.92 -0.2 4.98 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap 5.21 -0.3 4.95 -0.2 5.02 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma.move_vma 4.09 -0.2 3.85 -0.2 3.93 perf-profile.calltrace.cycles-pp.vm_area_dup.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 4.69 -0.2 4.46 -0.2 4.51 perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma 3.56 -0.2 3.36 -0.1 3.43 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma.__do_sys_mremap 3.40 -0.2 3.22 -0.1 3.29 perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma.__do_sys_mremap 1.35 -0.2 1.16 -0.1 1.24 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap 4.00 -0.2 3.82 -0.1 3.86 perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma 2.23 -0.2 2.05 -0.1 2.12 
perf-profile.calltrace.cycles-pp.find_vma_prev.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 8.26 -0.2 8.10 -0.2 8.06 perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 1.97 ± 3% -0.2 1.81 ± 3% -0.1 1.88 ± 4% perf-profile.calltrace.cycles-pp.mod_objcg_state.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma 3.11 ± 2% -0.2 2.96 -0.1 3.05 perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap 0.97 -0.2 0.81 -0.1 0.87 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.do_munmap.mremap_to 2.27 -0.2 2.11 -0.1 2.16 perf-profile.calltrace.cycles-pp.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 3.25 -0.1 3.10 -0.1 3.17 perf-profile.calltrace.cycles-pp.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 3.14 -0.1 3.00 -0.1 3.06 perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma 2.98 -0.1 2.85 -0.1 2.87 ± 2% perf-profile.calltrace.cycles-pp.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 1.27 ± 2% -0.1 1.15 ± 4% -0.1 1.19 ± 6% perf-profile.calltrace.cycles-pp.__memcpy.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge 2.45 -0.1 2.34 -0.1 2.38 perf-profile.calltrace.cycles-pp.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma 2.05 -0.1 1.94 -0.1 1.97 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_merge.copy_vma.move_vma.__do_sys_mremap 2.44 -0.1 2.33 -0.1 2.38 perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap 2.22 -0.1 2.11 -0.1 2.15 perf-profile.calltrace.cycles-pp.native_flush_tlb_one_user.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables 1.76 ± 2% -0.1 1.65 ± 2% -0.1 1.66 ± 4% perf-profile.calltrace.cycles-pp.vma_prepare.vma_merge.copy_vma.move_vma.__do_sys_mremap 1.86 -0.1 1.75 -0.1 
1.78 perf-profile.calltrace.cycles-pp.vma_link.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 1.40 -0.1 1.30 -0.1 1.34 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap 1.39 -0.1 1.30 -0.1 1.33 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma.move_vma 0.55 -0.1 0.46 ± 30% -0.0 0.52 perf-profile.calltrace.cycles-pp.mas_find.find_vma_prev.copy_vma.move_vma.__do_sys_mremap 1.25 -0.1 1.16 -0.1 1.20 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap 0.94 -0.1 0.86 -0.1 0.87 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap 1.23 -0.1 1.15 -0.1 1.17 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma 1.54 -0.1 1.47 -0.0 1.49 perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap 0.73 -0.1 0.66 -0.0 0.69 perf-profile.calltrace.cycles-pp.mas_walk.find_vma_prev.copy_vma.move_vma.__do_sys_mremap 1.15 -0.1 1.09 -0.1 1.10 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap 0.60 ± 2% -0.1 0.54 -0.0 0.58 perf-profile.calltrace.cycles-pp.security_mmap_addr.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64 1.27 -0.1 1.21 -0.0 1.24 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma 0.80 ± 2% -0.1 0.74 ± 2% -0.0 0.76 ± 2% perf-profile.calltrace.cycles-pp.__call_rcu_common.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge 0.72 -0.1 0.66 -0.0 0.69 perf-profile.calltrace.cycles-pp.mas_prev.vma_merge.copy_vma.move_vma.__do_sys_mremap 0.78 -0.1 0.73 -0.0 0.75 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma 0.69 ± 2% -0.1 0.64 ± 3% -0.0 0.66 ± 4% 
perf-profile.calltrace.cycles-pp.mod_objcg_state.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.copy_vma 1.63 -0.1 1.58 -0.1 1.57 perf-profile.calltrace.cycles-pp.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 1.02 -0.1 0.97 -0.0 0.98 perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region 0.77 -0.0 0.72 -0.0 0.74 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge 0.62 -0.0 0.57 -0.0 0.60 perf-profile.calltrace.cycles-pp.mas_prev_setup.mas_prev.vma_merge.copy_vma.move_vma 0.67 -0.0 0.62 -0.0 0.64 perf-profile.calltrace.cycles-pp.percpu_counter_add_batch.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 0.86 -0.0 0.81 -0.0 0.83 perf-profile.calltrace.cycles-pp.mtree_load.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64 1.12 -0.0 1.08 -0.0 1.09 perf-profile.calltrace.cycles-pp.clear_bhb_loop.mremap 0.56 -0.0 0.51 -0.0 0.53 perf-profile.calltrace.cycles-pp.mas_walk.mas_prev_setup.mas_prev.vma_merge.copy_vma 0.68 ± 2% -0.0 0.63 -0.0 0.65 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.mremap 0.81 -0.0 0.77 -0.0 0.80 perf-profile.calltrace.cycles-pp.mtree_load.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 1.02 -0.0 0.97 -0.0 0.98 perf-profile.calltrace.cycles-pp.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.95 ± 2% -0.0 0.90 ± 2% -0.0 0.93 perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.unlink_anon_vmas.free_pgtables.unmap_region 0.98 -0.0 0.94 -0.0 0.95 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 0.78 -0.0 0.74 -0.0 0.75 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_link.copy_vma.move_vma.__do_sys_mremap 0.70 -0.0 0.66 -0.0 0.67 
perf-profile.calltrace.cycles-pp.__call_rcu_common.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 0.69 -0.0 0.65 -0.0 0.66 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma 0.69 -0.0 0.65 -0.0 0.65 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_link.copy_vma.move_vma.__do_sys_mremap 0.62 -0.0 0.59 -0.0 0.60 perf-profile.calltrace.cycles-pp.mas_prev_slot.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 1.16 -0.0 1.12 -0.0 1.13 perf-profile.calltrace.cycles-pp.anon_vma_clone.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 0.76 ± 2% -0.0 0.72 -0.0 0.72 ± 2% perf-profile.calltrace.cycles-pp.allocate_slab.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma 1.01 -0.0 0.97 -0.0 0.99 perf-profile.calltrace.cycles-pp.mt_find.vma_merge.copy_vma.move_vma.__do_sys_mremap 0.60 -0.0 0.57 -0.0 0.58 perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas 0.88 -0.0 0.85 -0.0 0.85 perf-profile.calltrace.cycles-pp.userfaultfd_unmap_complete.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 0.62 ± 2% -0.0 0.59 ± 2% -0.0 0.60 perf-profile.calltrace.cycles-pp.get_old_pud.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64 0.59 -0.0 0.56 -0.0 0.56 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.mremap 0.65 -0.0 0.62 ± 2% -0.0 0.63 perf-profile.calltrace.cycles-pp.mas_update_gap.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma 0.81 +0.0 0.82 -0.0 0.79 perf-profile.calltrace.cycles-pp.thp_get_unmapped_area_vmflags.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64 2.76 +0.0 2.78 ± 2% -0.1 2.67 perf-profile.calltrace.cycles-pp.unlink_anon_vmas.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap 3.47 +0.0 3.51 -0.1 3.37 perf-profile.calltrace.cycles-pp.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma 0.76 +0.1 0.83 +0.1 0.85 perf-profile.calltrace.cycles-pp.__madvise 0.66 +0.1 
0.73 +0.1 0.75 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise 0.67 +0.1 0.74 +0.1 0.76 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__madvise 0.63 +0.1 0.70 +0.1 0.72 perf-profile.calltrace.cycles-pp.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise 0.62 +0.1 0.70 +0.1 0.71 perf-profile.calltrace.cycles-pp.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise 0.00 +0.9 0.86 +0.9 0.92 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.do_munmap 0.00 +0.9 0.88 +0.0 0.00 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.mremap_to.__do_sys_mremap 83.81 +0.9 84.69 +0.6 84.44 perf-profile.calltrace.cycles-pp.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 0.00 +0.9 0.90 ± 2% +0.9 0.91 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.move_vma 0.00 +1.1 1.10 +0.0 0.00 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64 0.00 +1.2 1.21 +1.3 1.28 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to 2.10 +1.5 3.60 +1.7 3.79 perf-profile.calltrace.cycles-pp.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.00 +1.5 1.52 +1.5 1.52 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap 1.59 +1.5 3.12 +1.7 3.31 perf-profile.calltrace.cycles-pp.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64 0.00 +1.6 1.61 +0.0 0.00 perf-profile.calltrace.cycles-pp.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.00 +1.7 1.73 +1.8 1.83 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap 0.00 +2.0 2.01 +2.0 2.04 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 5.34 +3.0 8.38 +1.6 6.92 
perf-profile.calltrace.cycles-pp.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 75.22 -2.0 73.18 -0.9 74.34 perf-profile.children.cycles-pp.move_vma 37.04 -1.6 35.40 -1.2 35.83 perf-profile.children.cycles-pp.do_vmi_align_munmap 25.09 -1.4 23.72 -0.9 24.20 perf-profile.children.cycles-pp.copy_vma 20.04 -1.1 18.96 -0.8 19.28 perf-profile.children.cycles-pp.__split_vma 19.87 -1.0 18.84 -0.6 19.24 perf-profile.children.cycles-pp.rcu_core 19.85 -1.0 18.82 -0.6 19.22 perf-profile.children.cycles-pp.rcu_do_batch 19.89 -1.0 18.86 -0.6 19.26 perf-profile.children.cycles-pp.handle_softirqs 17.55 -0.9 16.67 -0.5 17.02 perf-profile.children.cycles-pp.kmem_cache_free 15.32 -0.8 14.49 -0.5 14.78 perf-profile.children.cycles-pp.kmem_cache_alloc_noprof 15.17 -0.8 14.39 -0.5 14.66 perf-profile.children.cycles-pp.vma_merge 12.12 -0.6 11.48 -0.4 11.70 perf-profile.children.cycles-pp.__slab_free 12.19 -0.6 11.56 -0.5 11.73 perf-profile.children.cycles-pp.mas_wr_store_entry 11.99 -0.6 11.36 -0.5 11.53 perf-profile.children.cycles-pp.mas_store_prealloc 10.88 -0.6 10.28 -0.4 10.50 perf-profile.children.cycles-pp.vm_area_dup 9.90 -0.5 9.41 -0.4 9.53 perf-profile.children.cycles-pp.mas_wr_node_store 8.39 -0.5 7.92 -0.3 8.13 perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook 7.99 -0.4 7.58 -0.3 7.73 perf-profile.children.cycles-pp.move_page_tables 6.70 -0.4 6.33 -0.3 6.43 perf-profile.children.cycles-pp.vma_complete 5.87 -0.3 5.55 -0.2 5.66 perf-profile.children.cycles-pp.move_ptes 5.12 -0.3 4.81 -0.2 4.90 perf-profile.children.cycles-pp.mas_preallocate 6.05 -0.3 5.74 -0.2 5.85 perf-profile.children.cycles-pp.vm_area_free_rcu_cb 2.98 -0.3 2.69 ± 4% -0.2 2.80 ± 6% perf-profile.children.cycles-pp.__memcpy 3.46 ± 2% -0.2 3.25 -0.1 3.36 ± 3% perf-profile.children.cycles-pp.mod_objcg_state 3.47 -0.2 3.26 -0.2 3.32 perf-profile.children.cycles-pp.___slab_alloc 2.44 -0.2 2.25 -0.1 2.33 perf-profile.children.cycles-pp.find_vma_prev 2.92 -0.2 2.73 -0.1 2.79 
perf-profile.children.cycles-pp.mas_alloc_nodes 3.46 -0.2 3.27 -0.1 3.34 perf-profile.children.cycles-pp.flush_tlb_mm_range 3.47 -0.2 3.29 -0.2 3.32 ± 2% perf-profile.children.cycles-pp.down_write 3.33 -0.2 3.16 -0.1 3.25 perf-profile.children.cycles-pp.__memcg_slab_free_hook 4.23 -0.2 4.07 -0.1 4.08 ± 2% perf-profile.children.cycles-pp.anon_vma_clone 8.33 -0.2 8.17 -0.2 8.13 perf-profile.children.cycles-pp.unmap_region 3.35 -0.1 3.20 -0.1 3.26 perf-profile.children.cycles-pp.mas_store_gfp 2.21 -0.1 2.07 -0.1 2.10 perf-profile.children.cycles-pp.__cond_resched 3.19 -0.1 3.05 -0.1 3.11 perf-profile.children.cycles-pp.unmap_vmas 2.12 -0.1 1.99 -0.1 2.04 perf-profile.children.cycles-pp.__call_rcu_common 2.66 -0.1 2.54 -0.1 2.60 perf-profile.children.cycles-pp.mtree_load 2.24 -0.1 2.12 ± 2% -0.1 2.13 ± 3% perf-profile.children.cycles-pp.vma_prepare 2.50 -0.1 2.38 -0.1 2.42 perf-profile.children.cycles-pp.flush_tlb_func 2.04 ± 2% -0.1 1.93 -0.1 1.96 ± 2% perf-profile.children.cycles-pp.allocate_slab 2.46 -0.1 2.35 -0.1 2.41 perf-profile.children.cycles-pp.rcu_cblist_dequeue 2.48 -0.1 2.38 -0.1 2.42 perf-profile.children.cycles-pp.unmap_page_range 2.23 -0.1 2.12 -0.1 2.16 perf-profile.children.cycles-pp.native_flush_tlb_one_user 1.77 -0.1 1.67 -0.1 1.70 perf-profile.children.cycles-pp.mas_wr_walk 1.88 -0.1 1.78 -0.1 1.80 perf-profile.children.cycles-pp.vma_link 1.84 -0.1 1.75 -0.1 1.77 perf-profile.children.cycles-pp.up_write 0.97 ± 2% -0.1 0.88 -0.1 0.89 perf-profile.children.cycles-pp.rcu_all_qs 1.40 -0.1 1.32 -0.1 1.34 ± 2% perf-profile.children.cycles-pp.shuffle_freelist 1.03 -0.1 0.95 -0.0 0.99 perf-profile.children.cycles-pp.mas_prev 0.92 -0.1 0.85 -0.0 0.88 perf-profile.children.cycles-pp.mas_prev_setup 1.58 -0.1 1.51 -0.1 1.53 perf-profile.children.cycles-pp.zap_pmd_range 1.24 -0.1 1.17 -0.0 1.20 perf-profile.children.cycles-pp.mas_prev_slot 1.57 -0.1 1.49 -0.1 1.49 perf-profile.children.cycles-pp.mas_update_gap 0.62 -0.1 0.56 -0.0 0.60 
perf-profile.children.cycles-pp.security_mmap_addr 0.90 -0.1 0.84 -0.0 0.86 perf-profile.children.cycles-pp.percpu_counter_add_batch 0.86 -0.1 0.80 -0.0 0.81 perf-profile.children.cycles-pp._raw_spin_lock_irqsave 0.98 -0.1 0.92 -0.0 0.95 perf-profile.children.cycles-pp.mas_pop_node 1.68 -0.1 1.62 -0.1 1.62 perf-profile.children.cycles-pp.__get_unmapped_area 1.23 -0.1 1.18 -0.0 1.20 perf-profile.children.cycles-pp.__pte_offset_map_lock 0.49 ± 2% -0.1 0.43 -0.1 0.43 ± 2% perf-profile.children.cycles-pp.setup_object 1.09 -0.1 1.03 -0.0 1.05 perf-profile.children.cycles-pp.zap_pte_range 1.07 ± 2% -0.1 1.02 ± 2% -0.1 1.00 perf-profile.children.cycles-pp.mas_leaf_max_gap 0.70 ± 2% -0.0 0.65 -0.0 0.67 perf-profile.children.cycles-pp.syscall_return_via_sysret 1.18 -0.0 1.14 -0.0 1.15 perf-profile.children.cycles-pp.clear_bhb_loop 0.51 ± 3% -0.0 0.47 -0.0 0.49 ± 3% perf-profile.children.cycles-pp.anon_vma_interval_tree_insert 1.04 -0.0 1.00 -0.0 1.01 perf-profile.children.cycles-pp.vma_to_resize 0.57 -0.0 0.53 -0.0 0.54 perf-profile.children.cycles-pp.mas_wr_end_piv 0.44 ± 2% -0.0 0.40 ± 2% -0.0 0.40 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath 1.14 -0.0 1.10 -0.0 1.12 perf-profile.children.cycles-pp.mt_find 0.90 -0.0 0.87 -0.0 0.87 perf-profile.children.cycles-pp.userfaultfd_unmap_complete 0.62 -0.0 0.59 -0.0 0.60 perf-profile.children.cycles-pp.__put_partials 0.45 ± 6% -0.0 0.42 -0.0 0.43 perf-profile.children.cycles-pp._raw_spin_lock 0.48 -0.0 0.45 ± 2% -0.0 0.46 perf-profile.children.cycles-pp.mas_prev_range 0.61 -0.0 0.58 -0.0 0.59 perf-profile.children.cycles-pp.entry_SYSCALL_64 0.31 ± 3% -0.0 0.28 ± 3% -0.0 0.31 perf-profile.children.cycles-pp.security_vm_enough_memory_mm 0.33 ± 3% -0.0 0.30 ± 2% -0.0 0.31 ± 4% perf-profile.children.cycles-pp.mas_put_in_tree 0.32 ± 2% -0.0 0.29 ± 2% -0.0 0.30 perf-profile.children.cycles-pp.tlb_finish_mmu 0.46 -0.0 0.44 ± 2% -0.0 0.46 perf-profile.children.cycles-pp.rcu_segcblist_enqueue 0.33 -0.0 0.31 -0.0 0.32 
perf-profile.children.cycles-pp.mas_destroy 0.36 -0.0 0.34 -0.0 0.34 perf-profile.children.cycles-pp.__rb_insert_augmented 0.39 -0.0 0.37 -0.0 0.38 ± 2% perf-profile.children.cycles-pp.down_write_killable 0.29 -0.0 0.27 ± 2% -0.0 0.28 perf-profile.children.cycles-pp.tlb_gather_mmu 0.26 -0.0 0.24 ± 2% -0.0 0.25 ± 2% perf-profile.children.cycles-pp.syscall_exit_to_user_mode 0.16 ± 2% -0.0 0.14 ± 3% -0.0 0.14 ± 3% perf-profile.children.cycles-pp.mas_wr_append 0.30 ± 2% -0.0 0.28 ± 2% -0.0 0.29 ± 2% perf-profile.children.cycles-pp.__vm_enough_memory 0.32 -0.0 0.30 ± 2% -0.0 0.31 perf-profile.children.cycles-pp.pte_offset_map_nolock 2.83 +0.0 2.85 ± 2% -0.1 2.74 perf-profile.children.cycles-pp.unlink_anon_vmas 0.84 +0.0 0.86 -0.0 0.81 perf-profile.children.cycles-pp.thp_get_unmapped_area_vmflags 0.08 ± 5% +0.0 0.10 ± 3% -0.0 0.08 ± 6% perf-profile.children.cycles-pp.mm_get_unmapped_area_vmflags 3.52 +0.0 3.56 -0.1 3.42 perf-profile.children.cycles-pp.free_pgtables 0.78 +0.1 0.85 +0.1 0.86 perf-profile.children.cycles-pp.__madvise 0.63 +0.1 0.70 +0.1 0.72 perf-profile.children.cycles-pp.__x64_sys_madvise 0.63 +0.1 0.70 +0.1 0.71 perf-profile.children.cycles-pp.do_madvise 0.00 +0.1 0.09 ± 3% +0.1 0.10 ± 5% perf-profile.children.cycles-pp.can_modify_mm_madv 1.31 +0.2 1.46 +0.2 1.50 perf-profile.children.cycles-pp.mas_next_slot 83.90 +0.9 84.79 +0.6 84.53 perf-profile.children.cycles-pp.__do_sys_mremap 40.45 +1.4 41.90 +2.1 42.57 perf-profile.children.cycles-pp.do_vmi_munmap 2.12 +1.5 3.62 +1.7 3.82 perf-profile.children.cycles-pp.do_munmap 3.63 +2.4 5.98 +1.7 5.29 perf-profile.children.cycles-pp.mas_walk 5.40 +3.0 8.44 +1.6 6.97 perf-profile.children.cycles-pp.mremap_to 5.26 +3.2 8.48 +2.3 7.58 perf-profile.children.cycles-pp.mas_find 0.00 +5.5 5.46 +3.9 3.93 perf-profile.children.cycles-pp.can_modify_mm 11.49 -0.6 10.89 -0.4 11.10 perf-profile.self.cycles-pp.__slab_free 4.32 -0.3 4.06 -0.2 4.16 perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook 1.96 -0.2 1.77 ± 4% 
-0.1 1.84 ± 6% perf-profile.self.cycles-pp.__memcpy 2.36 -0.1 2.25 ± 2% -0.1 2.25 ± 3% perf-profile.self.cycles-pp.down_write 2.42 -0.1 2.31 -0.0 2.38 perf-profile.self.cycles-pp.rcu_cblist_dequeue 2.33 -0.1 2.23 -0.1 2.28 perf-profile.self.cycles-pp.mtree_load 2.21 -0.1 2.10 -0.1 2.14 perf-profile.self.cycles-pp.native_flush_tlb_one_user 1.62 -0.1 1.54 -0.0 1.57 perf-profile.self.cycles-pp.__memcg_slab_free_hook 1.52 -0.1 1.44 -0.1 1.46 perf-profile.self.cycles-pp.mas_wr_walk 1.44 -0.1 1.36 -0.1 1.38 ± 2% perf-profile.self.cycles-pp.__call_rcu_common 1.53 -0.1 1.45 -0.0 1.48 perf-profile.self.cycles-pp.up_write 1.72 -0.1 1.65 -0.0 1.70 perf-profile.self.cycles-pp.mod_objcg_state 0.69 ± 2% -0.1 0.63 -0.1 0.63 perf-profile.self.cycles-pp.rcu_all_qs 1.14 ± 2% -0.1 1.08 -0.0 1.09 ± 2% perf-profile.self.cycles-pp.shuffle_freelist 1.18 -0.1 1.12 -0.0 1.17 perf-profile.self.cycles-pp.vma_merge 1.38 -0.1 1.33 -0.0 1.35 perf-profile.self.cycles-pp.do_vmi_align_munmap 0.51 ± 2% -0.1 0.45 -0.0 0.49 perf-profile.self.cycles-pp.security_mmap_addr 0.62 -0.1 0.56 ± 2% -0.1 0.56 perf-profile.self.cycles-pp.mremap 0.89 -0.1 0.83 -0.0 0.85 perf-profile.self.cycles-pp.___slab_alloc 0.99 -0.1 0.94 -0.0 0.96 perf-profile.self.cycles-pp.mas_prev_slot 1.00 -0.0 0.95 -0.0 0.96 perf-profile.self.cycles-pp.mas_preallocate 0.98 -0.0 0.93 -0.0 0.95 perf-profile.self.cycles-pp.move_ptes 0.85 -0.0 0.80 -0.0 0.82 perf-profile.self.cycles-pp.mas_pop_node 0.94 -0.0 0.90 -0.0 0.91 ± 2% perf-profile.self.cycles-pp.vm_area_free_rcu_cb 1.09 -0.0 1.04 -0.0 1.06 perf-profile.self.cycles-pp.__cond_resched 0.77 -0.0 0.72 -0.0 0.74 perf-profile.self.cycles-pp.percpu_counter_add_batch 0.94 ± 2% -0.0 0.89 ± 2% -0.1 0.87 perf-profile.self.cycles-pp.mas_leaf_max_gap 1.17 -0.0 1.12 -0.0 1.14 perf-profile.self.cycles-pp.clear_bhb_loop 0.68 -0.0 0.63 -0.0 0.65 perf-profile.self.cycles-pp.__split_vma 0.79 -0.0 0.75 -0.0 0.77 perf-profile.self.cycles-pp.mas_wr_store_entry 1.22 -0.0 1.18 -0.0 1.18 
perf-profile.self.cycles-pp.move_vma 0.43 ± 2% -0.0 0.40 ± 2% -0.0 0.40 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath 1.49 -0.0 1.45 +0.0 1.49 perf-profile.self.cycles-pp.kmem_cache_free 0.44 -0.0 0.40 -0.0 0.40 perf-profile.self.cycles-pp.do_munmap 0.45 -0.0 0.42 -0.0 0.43 perf-profile.self.cycles-pp.mas_wr_end_piv 0.89 -0.0 0.86 -0.0 0.88 perf-profile.self.cycles-pp.mas_store_gfp 0.78 -0.0 0.75 -0.0 0.76 perf-profile.self.cycles-pp.userfaultfd_unmap_complete 0.66 -0.0 0.62 -0.0 0.64 perf-profile.self.cycles-pp.mas_store_prealloc 0.60 -0.0 0.58 -0.0 0.59 perf-profile.self.cycles-pp.unmap_region 0.36 ± 4% -0.0 0.33 ± 3% -0.0 0.34 ± 2% perf-profile.self.cycles-pp.syscall_return_via_sysret 0.55 -0.0 0.52 -0.0 0.53 perf-profile.self.cycles-pp.get_old_pud 0.99 -0.0 0.97 -0.0 0.98 perf-profile.self.cycles-pp.mt_find 0.61 -0.0 0.58 -0.0 0.60 perf-profile.self.cycles-pp.copy_vma 0.43 ± 3% -0.0 0.40 -0.0 0.41 ± 4% perf-profile.self.cycles-pp.anon_vma_interval_tree_insert 0.49 -0.0 0.47 -0.0 0.48 perf-profile.self.cycles-pp.find_vma_prev 0.71 -0.0 0.68 -0.0 0.70 perf-profile.self.cycles-pp.unmap_page_range 0.27 -0.0 0.25 -0.0 0.26 perf-profile.self.cycles-pp.mas_prev_setup 0.47 -0.0 0.45 -0.0 0.46 ± 2% perf-profile.self.cycles-pp.flush_tlb_mm_range 0.37 ± 6% -0.0 0.35 -0.0 0.35 perf-profile.self.cycles-pp._raw_spin_lock 0.41 -0.0 0.39 -0.0 0.40 perf-profile.self.cycles-pp._raw_spin_lock_irqsave 0.40 -0.0 0.37 -0.0 0.38 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack 0.27 -0.0 0.25 ± 2% -0.0 0.25 ± 3% perf-profile.self.cycles-pp.mas_put_in_tree 0.49 -0.0 0.47 -0.0 0.49 perf-profile.self.cycles-pp.refill_obj_stock 0.48 -0.0 0.46 -0.0 0.47 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe 0.27 ± 2% -0.0 0.25 -0.0 0.26 perf-profile.self.cycles-pp.tlb_finish_mmu 0.24 ± 2% -0.0 0.22 -0.0 0.23 perf-profile.self.cycles-pp.mas_prev 0.28 -0.0 0.26 -0.0 0.27 ± 2% perf-profile.self.cycles-pp.mas_alloc_nodes 0.40 -0.0 0.39 -0.0 0.40 
perf-profile.self.cycles-pp.__pte_offset_map_lock 0.14 ± 3% -0.0 0.12 ± 2% -0.0 0.13 ± 3% perf-profile.self.cycles-pp.syscall_exit_to_user_mode 0.26 -0.0 0.24 ± 2% -0.0 0.25 perf-profile.self.cycles-pp.__rb_insert_augmented 0.28 -0.0 0.26 -0.0 0.27 perf-profile.self.cycles-pp.alloc_new_pud 0.28 -0.0 0.26 -0.0 0.27 ± 2% perf-profile.self.cycles-pp.flush_tlb_func 0.20 ± 2% -0.0 0.19 -0.0 0.19 ± 2% perf-profile.self.cycles-pp.__get_unmapped_area 0.47 -0.0 0.46 -0.0 0.45 perf-profile.self.cycles-pp.arch_get_unmapped_area_topdown_vmflags 0.06 -0.0 0.05 ± 5% -0.0 0.05 perf-profile.self.cycles-pp.vma_dup_policy 0.06 ± 6% +0.0 0.07 -0.0 0.06 ± 8% perf-profile.self.cycles-pp.mm_get_unmapped_area_vmflags 0.11 ± 4% +0.0 0.12 ± 4% +0.0 0.12 ± 4% perf-profile.self.cycles-pp.free_pgd_range 0.21 +0.0 0.22 ± 2% -0.0 0.20 ± 2% perf-profile.self.cycles-pp.thp_get_unmapped_area_vmflags 0.45 +0.0 0.48 +0.0 0.50 perf-profile.self.cycles-pp.do_vmi_munmap 0.27 +0.0 0.32 -0.0 0.26 perf-profile.self.cycles-pp.free_pgtables 0.36 ± 2% +0.1 0.44 -0.0 0.35 perf-profile.self.cycles-pp.unlink_anon_vmas 1.07 +0.1 1.19 +0.2 1.22 perf-profile.self.cycles-pp.mas_next_slot 1.49 +0.5 2.01 +0.4 1.86 perf-profile.self.cycles-pp.mas_find 0.00 +1.4 1.37 +0.9 0.93 perf-profile.self.cycles-pp.can_modify_mm 3.14 +2.1 5.23 +1.5 4.60 perf-profile.self.cycles-pp.mas_walk
To avoid the impact of other changes, it is better to apply the patch on top of 8be7258a directly.
If you prefer another base for this patch, please let us know; we will then supply the results for four commits:
  this patch
  the base of this patch
  8be7258a: mseal: add mseal syscall
  ff388fe5c: mseal: wire up mseal syscall
Thank you for your time and assistance in helping me understand this issue.
Due to resource constraints, please expect that we will need several days to finish this test request.
No problem.
Thanks for your help! -Jeff
Best regards, -Jeff
> -Jeff
>
> [1] https://lore.kernel.org/lkml/202408041602.caa0372-oliver.sang@intel.com/
> [2] https://github.com/peaktocreek/mmperf/blob/main/run_stress_ng.c
>
> Jeff Xu (2):
>   mseal:selftest mremap across VMA boundaries.
>   mseal: refactor mremap to remove can_modify_mm
>
>  mm/internal.h                           | 24 ++
>  mm/mremap.c                             | 77 +++---
>  mm/mseal.c                              | 17 --
>  tools/testing/selftests/mm/mseal_test.c | 293 +++++++++++++++++++++++-
>  4 files changed, 353 insertions(+), 58 deletions(-)
>
> --
> 2.46.0.76.ge559c4bf1a-goog
Hi Oliver
On Tue, Aug 20, 2024 at 11:19 PM Oliver Sang oliver.sang@intel.com wrote:
hi, Jeff,
here is an update per your test request.
we extended the runtime to 600 seconds and ran each commit 10 times.
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/***600s***

commit:
  ff388fe5c4 ("mseal: wire up mseal syscall")
  8be7258aad ("mseal: add mseal syscall")
  2a78ece39f <-- your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm"

ff388fe5c481d39c      8be7258aad44b5e25977a98db13    2a78ece39f13ea6f3f9679a6c66
     %stddev              %change %stddev               %change %stddev
1.886e+08 ± 0%           -5.0% 1.792e+08 ± 0%           -3.4% 1.821e+08 ± 0%    stress-ng.pagemove.ops
   314345 ± 0%           -5.0%    298656 ± 0%           -3.4%    303565 ± 0%    stress-ng.pagemove.ops_per_sec
Thanks for testing with more samples. The result is reasonable and consistent with the 60-second result. The -3.4% reflects the impact from munmap, which isn't covered by this patch.
the stress-ng.pagemove.ops_per_sec score differs somewhat from the 60s run (listed below for comparison), but the trend is similar.
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/***60s***

commit:
  ff388fe5c4 ("mseal: wire up mseal syscall")
  8be7258aad ("mseal: add mseal syscall")
  2a78ece39f <-- your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm"

ff388fe5c481d39c      8be7258aad44b5e25977a98db13    2a78ece39f13ea6f3f9679a6c66
     %stddev              %change %stddev               %change %stddev
 18386219 ± 0%           -5.0%  17474214 ± 0%           -2.9%  17850959 ± 0%    stress-ng.pagemove.ops
   306421 ± 0%           -5.0%    291207 ± 0%           -2.9%    297490 ± 0%    stress-ng.pagemove.ops_per_sec
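The percent changes in the table above come from comparing each commit against the ff388fe5c4 base; as a sanity check, they can be recomputed from the quoted ops_per_sec values (a sketch; the variable names are just labels for the three commits):

```python
# Recompute the %change columns of the 60s table from the quoted
# stress-ng.pagemove.ops_per_sec values.
base = 306421       # ff388fe5c4 ("mseal: wire up mseal syscall")
mseal = 291207      # 8be7258aad ("mseal: add mseal syscall")
refactor = 297490   # 2a78ece39f (mremap refactor patch)

def pct_change(new, old):
    """Percent change of `new` relative to `old`."""
    return (new - old) / old * 100

print(f"mseal vs base:    {pct_change(mseal, base):+.1f}%")     # about -5.0%
print(f"refactor vs base: {pct_change(refactor, base):+.1f}%")  # about -2.9%
```

This matches the -5.0% / -2.9% shown by lkp, so the refactor recovers part of the regression introduced by the original mseal commit.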
since the data is stable, %stddev shows as "± 0%" in both tables above. let me give the detailed data for the 600s runs.
for ff388fe5c4 ("mseal: wire up mseal syscall")
"stress-ng.pagemove.ops": [ 188545955, 188681834, 188907282, 188345009, 188729465, 188312187, 188897283, 188209713, 188425965, 189026136 ], "stress-ng.pagemove.ops_per_sec": [ 314242.1, 314467.13, 314841.5, 313907.19, 314548.11, 313852.5, 314827.84, 313680.74, 314042.14, 315042.79 ],
for 8be7258aad ("mseal: add mseal syscall")
"stress-ng.pagemove.ops": [ 179127848, 179401350, 179350278, 179023817, 179106624, 179535213, 178936504, 178870141, 179462171, 179136065 ], "stress-ng.pagemove.ops_per_sec": [ 298545.54, 299000.95, 298915.62, 298371.45, 298509.15, 299223.65, 298226.74, 298115.08, 299101.23, 298558.74 ],
for 2a78ece39f <-- your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm"
"stress-ng.pagemove.ops": [ 182188207, 182288813, 182483678, 181980233, 182249440, 181837961, 182155893, 181699445, 182347580, 182174597 ], "stress-ng.pagemove.ops_per_sec": [ 303643.28, 303814.05, 304138.38, 303298.9, 303747.33, 303060.84, 303592.48, 302831.56, 303909.81, 303622.07 ],
for the 600s run, below is the full comparison.
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/***600s***

commit:
  ff388fe5c4 ("mseal: wire up mseal syscall")
  8be7258aad ("mseal: add mseal syscall")
  2a78ece39f <-- your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm"

ff388fe5c481d39c      8be7258aad44b5e25977a98db13    2a78ece39f13ea6f3f9679a6c66
     %stddev              %change %stddev               %change %stddev
4667 ± 0% -2.4% 4553 ± 0% -1.6% 4593 ± 0% vmstat.system.cs
  4.192e+08 ± 0%  -4.3%  4.012e+08 ± 0%  -2.8%  4.075e+08 ± 0%  proc-vmstat.numa_hit
  4.192e+08 ± 0%  -4.3%  4.011e+08 ± 0%  -2.8%  4.074e+08 ± 0%  proc-vmstat.numa_local
  7.843e+08 ± 0%  -4.3%  7.504e+08 ± 0%  -2.8%  7.623e+08 ± 0%  proc-vmstat.pgalloc_normal
  7.836e+08 ± 0%  -4.3%  7.498e+08 ± 0%  -2.8%  7.616e+08 ± 0%  proc-vmstat.pgfree
    1174825 ± 0%  -2.6%    1143891 ± 0%  -1.7%    1155336 ± 0%  time.involuntary_context_switches
       5082 ± 0%  +1.3%       5147 ± 0%  +0.9%       5126 ± 0%  time.percent_of_cpu_this_job_got
      29840 ± 0%  +1.4%      30267 ± 0%  +1.0%      30133 ± 0%  time.system_time
     663.58 ± 1%  -5.7%     625.54 ± 1%  -4.3%     635.17 ± 0%  time.user_time
  1.886e+08 ± 0%  -5.0%  1.792e+08 ± 0%  -3.4%  1.821e+08 ± 0%  stress-ng.pagemove.ops
     314345 ± 0%  -5.0%     298656 ± 0%  -3.4%     303565 ± 0%  stress-ng.pagemove.ops_per_sec
     212508 ± 0%  -4.3%     203280 ± 0%  -3.1%     205831 ± 0%  stress-ng.pagemove.page_remaps_per_sec
    1174825 ± 0%  -2.6%    1143891 ± 0%  -1.7%    1155336 ± 0%  stress-ng.time.involuntary_context_switches
       5082 ± 0%  +1.3%       5147 ± 0%  +0.9%       5126 ± 0%  stress-ng.time.percent_of_cpu_this_job_got
      29840 ± 0%  +1.4%      30267 ± 0%  +1.0%      30133 ± 0%  stress-ng.time.system_time
     663.58 ± 1%  -5.7%     625.54 ± 1%  -4.3%     635.17 ± 0%  stress-ng.time.user_time
       1.00 ± 0%  -7.1%       0.93 ± 0%  -4.9%       0.95 ± 0%  perf-stat.i.MPKI
  3.487e+10 ± 0%  +3.5%  3.607e+10 ± 0%  +2.4%   3.57e+10 ± 0%  perf-stat.i.branch-instructions
       0.21 ± 0%   -0.0       0.19 ± 3%   -0.0       0.20 ± 0%  perf-stat.i.branch-miss-rate%
  1.763e+08 ± 0%  -5.0%  1.675e+08 ± 0%  -3.4%  1.704e+08 ± 0%  perf-stat.i.cache-misses
  2.342e+08 ± 0%  -4.9%  2.228e+08 ± 0%  -3.3%  2.264e+08 ± 0%  perf-stat.i.cache-references
       4650 ± 0%  -2.4%       4537 ± 0%  -1.5%       4578 ± 0%  perf-stat.i.context-switches
       1.11 ± 0%  -2.2%       1.09 ± 0%  -1.6%       1.10 ± 0%  perf-stat.i.cpi
     172.66 ± 0%  -2.8%     167.77 ± 0%  -1.8%     169.52 ± 0%  perf-stat.i.cpu-migrations
       1121 ± 0%  +5.2%       1180 ± 0%  +3.5%       1160 ± 0%  perf-stat.i.cycles-between-cache-misses
  1.772e+11 ± 0%  +2.2%  1.812e+11 ± 0%  +1.6%  1.801e+11 ± 0%  perf-stat.i.instructions
       0.90 ± 0%  +2.3%       0.92 ± 0%  +1.6%       0.91 ± 0%  perf-stat.i.ipc
       0.99 ± 0%  -7.1%       0.92 ± 0%  -4.9%       0.95 ± 0%  perf-stat.overall.MPKI
       0.21 ± 0%   -0.0       0.19 ± 3%   -0.0       0.20 ± 0%  perf-stat.overall.branch-miss-rate%
       1.11 ± 0%  -2.2%       1.09 ± 0%  -1.6%       1.10 ± 0%  perf-stat.overall.cpi
       1120 ± 0%  +5.2%       1179 ± 0%  +3.5%       1159 ± 0%  perf-stat.overall.cycles-between-cache-misses
       0.90 ± 0%  +2.3%       0.92 ± 0%  +1.6%       0.91 ± 0%  perf-stat.overall.ipc
   3.48e+10 ± 0%  +3.5%    3.6e+10 ± 0%  +2.4%  3.563e+10 ± 0%  perf-stat.ps.branch-instructions
  1.759e+08 ± 0%  -5.0%  1.672e+08 ± 0%  -3.4%    1.7e+08 ± 0%  perf-stat.ps.cache-misses
  2.338e+08 ± 0%  -4.9%  2.224e+08 ± 0%  -3.3%   2.26e+08 ± 0%  perf-stat.ps.cache-references
       4642 ± 0%  -2.4%       4529 ± 0%  -1.5%       4570 ± 0%  perf-stat.ps.context-switches
     172.30 ± 0%  -2.8%     167.43 ± 0%  -1.8%     169.17 ± 0%  perf-stat.ps.cpu-migrations
  1.769e+11 ± 0%  +2.3%  1.808e+11 ± 0%  +1.6%  1.797e+11 ± 0%  perf-stat.ps.instructions
  1.063e+14 ± 0%  +2.3%  1.087e+14 ± 0%  +1.7%  1.081e+14 ± 0%  perf-stat.total.instructions
      74.86 ± 0%   -2.1      72.76 ± 0%   -0.8      74.06 ± 0%  perf-profile.calltrace.cycles-pp.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
[...]
      87.57 ± 0%   +0.6      88.14 ± 0%   +0.5      88.09 ± 0%  perf-profile.calltrace.cycles-pp.mremap
      83.64 ± 0%   +0.8      84.41 ± 0%   +0.7      84.30 ± 0%  perf-profile.calltrace.cycles-pp.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
       0.00 ± -1%  +0.9       0.86 ± 0%   +0.9       0.92 ± 0%  perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.do_munmap
       0.00 ± -1%  +0.9       0.87 ± 0%   +0.0       0.00 ± -1%  perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.mremap_to.__do_sys_mremap
       0.00 ± -1%  +0.9       0.91 ± 2%   +0.9       0.92 ± 1%  perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.move_vma
       0.00 ± -1%  +1.1       1.09 ± 0%   +0.0       0.00 ± -1%  perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64
       0.00 ± -1%  +1.2       1.21 ± 0%   +1.3       1.29 ± 0%  perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to
       2.10 ± 0%   +1.5       3.61 ± 0%   +1.7       3.79 ± 0%  perf-profile.calltrace.cycles-pp.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
       0.00 ± -1%  +1.5       1.51 ± 1%   +1.5       1.52 ± 0%  perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap
       1.60 ± 0%   +1.5       3.13 ± 0%   +1.7       3.31 ± 0%  perf-profile.calltrace.cycles-pp.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64
       0.00 ± -1%  +1.6       1.60 ± 0%   +0.0       0.00 ± -1%  perf-profile.calltrace.cycles-pp.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
       0.00 ± -1%  +1.7       1.73 ± 0%   +1.8       1.84 ± 0%  perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap
       0.00 ± -1%  +2.0       2.00 ± 1%   +2.0       2.04 ± 0%  perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64
       5.35 ± 0%   +3.0       8.37 ± 0%   +1.6       6.92 ± 0%  perf-profile.calltrace.cycles-pp.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
      75.03 ± 0%   -2.1      72.92 ± 0%   -0.8      74.22 ± 0%  perf-profile.children.cycles-pp.move_vma
[...]
       0.00 ± -1%  +0.1       0.09 ± 0%   +0.1       0.09 ± 5%  perf-profile.children.cycles-pp.can_modify_mm_madv
      40.36 ± 0%   +1.4      41.74 ± 0%   +2.1      42.49 ± 0%  perf-profile.children.cycles-pp.do_vmi_munmap
       2.12 ± 0%   +1.5       3.63 ± 0%   +1.7       3.81 ± 0%  perf-profile.children.cycles-pp.do_munmap
       3.62 ± 0%   +2.3       5.97 ± 0%   +1.7       5.29 ± 0%  perf-profile.children.cycles-pp.mas_walk
       5.41 ± 0%   +3.0       8.44 ± 0%   +1.6       6.98 ± 0%  perf-profile.children.cycles-pp.mremap_to
       5.28 ± 0%   +3.2       8.48 ± 0%   +2.3       7.56 ± 0%  perf-profile.children.cycles-pp.mas_find
       0.00 ± -1%  +5.4       5.45 ± 0%   +3.9       3.94 ± 0%  perf-profile.children.cycles-pp.can_modify_mm
[...]
       1.50 ± 0%   +0.5       2.02 ± 0%   +0.4       1.85 ± 0%  perf-profile.self.cycles-pp.mas_find
       0.00 ± -1%  +1.4       1.38 ± 0%   +0.9       0.92 ± 0%  perf-profile.self.cycles-pp.can_modify_mm
       3.15 ± 0%   +2.1       5.26 ± 0%   +1.5       4.62 ± 0%  perf-profile.self.cycles-pp.mas_walk
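The profile deltas above show can_modify_mm, together with the mas_find/mas_walk lookups it issues, appearing only in the mseal'd kernels and costing several percent of cycles. The refactor under discussion replaces that per-VMA loop with a check on the single source VMA. A minimal sketch of the two shapes, using made-up stand-in types (struct vma_stub, the linked-list walk, and both function names are illustrative, not the kernel's actual implementation):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA; not the kernel's struct vm_area_struct. */
struct vma_stub {
	unsigned long start, end;
	bool sealed;
	struct vma_stub *next;
};

/*
 * Old shape: can_modify_mm() walks every VMA overlapping [start, end),
 * one tree lookup per VMA -- the mas_find/mas_walk cycles in the profile.
 */
static bool can_modify_mm_sketch(struct vma_stub *head,
				 unsigned long start, unsigned long end)
{
	for (struct vma_stub *v = head; v; v = v->next)
		if (v->start < end && v->end > start && v->sealed)
			return false;
	return true;
}

/*
 * New shape: mremap's source must lie inside a single VMA, so once that
 * VMA has been located anyway, one flag test replaces the whole loop.
 */
static bool can_modify_vma_sketch(const struct vma_stub *v)
{
	return !v->sealed;
}
```

The point is only the asymptotics: the old check costs one lookup per VMA in the range, the new one costs a single flag test on the already-located source VMA.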
On Mon, Aug 19, 2024 at 02:35:40PM +0800, Oliver Sang wrote:
hi, Jeff,
On Mon, Aug 19, 2024 at 09:38:19AM +0800, Oliver Sang wrote:
hi, Jeff,
On Sun, Aug 18, 2024 at 05:28:41PM +0800, Oliver Sang wrote:
hi, Jeff,
On Thu, Aug 15, 2024 at 07:58:57PM -0700, Jeff Xu wrote:
Hi Oliver
[...]
could you explicitly point to the two commit-ids?
sure
this patch:
8be7258a: mseal: add mseal syscall
ff388fe5c: mseal: wire up mseal syscall
I failed to apply this patch set on top of "8be7258a: mseal: add mseal syscall".
Looking at your patch set again, [PATCH v1 1/2] "mseal: selftest mremap across VMA boundaries" is just for kselftests,
and I can apply [PATCH v1 2/2] "mseal: refactor mremap to remove can_modify_mm" on top of "8be7258a: mseal: add mseal syscall" cleanly,
so I will start testing this [PATCH v1 2/2].
BTW, I will first use our default setting - "60s testtime; reboot between each run; run 10 times" - since we already have the data for 8be7258a and ff388fe5c, so we can give you an update fairly quickly.
As discussed in some private mail, you want a special run method; could you elaborate on it here? Thanks.
Here is a quick update before you give us more details about the special run method.
By our default run method (60s testtime; reboot between each run; run 10 times), your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm" partially resolves the regression.
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/60s
commit:
  ff388fe5c4 ("mseal: wire up mseal syscall")
  8be7258aad ("mseal: add mseal syscall")
  2a78ece39f <-- your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm"

ff388fe5c481d39c 8be7258aad44b5e25977a98db13 2a78ece39f13ea6f3f9679a6c66
         %stddev     %change      %stddev     %change      %stddev
               \           |            \           |            \
      4957        +1.3%       5023        +1.0%       5008   time.percent_of_cpu_this_job_got
      2915        +1.5%       2959        +1.2%       2949   time.system_time
     65.96        -7.3%      61.16        -5.5%      62.30   time.user_time
  41535878        -4.0%   39873501        -2.6%   40452264   proc-vmstat.numa_hit
  41466104        -4.0%   39806121        -2.6%   40384854   proc-vmstat.numa_local
  77297398        -4.1%   74165258        -2.6%   75286134   proc-vmstat.pgalloc_normal
  77016866        -4.1%   73886027        -2.6%   75012630   proc-vmstat.pgfree
  18386219        -5.0%   17474214        -2.9%   17850959   stress-ng.pagemove.ops
    306421        -5.0%     291207        -2.9%     297490   stress-ng.pagemove.ops_per_sec
      4957        +1.3%       5023        +1.0%       5008   stress-ng.time.percent_of_cpu_this_job_got
      2915        +1.5%       2959        +1.2%       2949   stress-ng.time.system_time
 3.349e+10 ± 4%   +3.0%  3.447e+10 ± 2%   +4.1%  3.484e+10   perf-stat.i.branch-instructions
      1.13        -2.1%       1.10        -2.2%       1.10   perf-stat.i.cpi
      0.89        +2.2%       0.91        +2.0%       0.91   perf-stat.i.ipc
      1.04        -6.9%       0.97        -4.9%       0.99   perf-stat.overall.MPKI
      1.13        -2.3%       1.10        -2.0%       1.10   perf-stat.overall.cpi
      1081        +5.0%       1136        +3.0%       1114   perf-stat.overall.cycles-between-cache-misses
      0.89        +2.3%       0.91        +2.0%       0.91   perf-stat.overall.ipc
 3.295e+10 ± 3%   +2.9%  3.392e+10 ± 2%   +4.0%  3.427e+10   perf-stat.ps.branch-instructions
 1.674e+11 ± 3%   +1.8%  1.704e+11 ± 2%   +3.3%   1.73e+11   perf-stat.ps.instructions
 1.046e+13        +2.7%  1.074e+13        +1.7%  1.064e+13   perf-stat.total.instructions
     75.05        -2.0       73.02        -0.9      74.18    perf-profile.calltrace.cycles-pp.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
     36.83        -1.6       35.19        -1.2      35.62    perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64
     25.02        -1.4       23.65        -0.9      24.12    perf-profile.calltrace.cycles-pp.copy_vma.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
     19.94        -1.1       18.87        -0.8      19.19    perf-profile.calltrace.cycles-pp.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
     14.78        -0.8       14.01        -0.5      14.28    perf-profile.calltrace.cycles-pp.vma_merge.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
      1.48        -0.5        0.99        -0.5       1.00    perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64
      7.88        -0.4        7.47        -0.3       7.62
perf-profile.calltrace.cycles-pp.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 6.73 -0.4 6.37 -0.2 6.51 perf-profile.calltrace.cycles-pp.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 6.16 -0.3 5.82 -0.3 5.90 perf-profile.calltrace.cycles-pp.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 6.12 -0.3 5.79 -0.2 5.93 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap 5.79 -0.3 5.48 -0.2 5.59 perf-profile.calltrace.cycles-pp.move_ptes.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64 5.54 -0.3 5.25 -0.2 5.32 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap 5.56 -0.3 5.28 -0.2 5.36 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_merge.copy_vma.move_vma.__do_sys_mremap 5.19 -0.3 4.92 -0.2 4.98 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap 5.21 -0.3 4.95 -0.2 5.02 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma.move_vma 4.09 -0.2 3.85 -0.2 3.93 perf-profile.calltrace.cycles-pp.vm_area_dup.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 4.69 -0.2 4.46 -0.2 4.51 perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma 3.56 -0.2 3.36 -0.1 3.43 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma.__do_sys_mremap 3.40 -0.2 3.22 -0.1 3.29 perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma.__do_sys_mremap 1.35 -0.2 1.16 -0.1 1.24 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap 4.00 -0.2 3.82 -0.1 3.86 perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma 2.23 -0.2 2.05 -0.1 2.12 
perf-profile.calltrace.cycles-pp.find_vma_prev.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 8.26 -0.2 8.10 -0.2 8.06 perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 1.97 ą 3% -0.2 1.81 ą 3% -0.1 1.88 ą 4% perf-profile.calltrace.cycles-pp.mod_objcg_state.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma 3.11 ą 2% -0.2 2.96 -0.1 3.05 perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap 0.97 -0.2 0.81 -0.1 0.87 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.do_munmap.mremap_to 2.27 -0.2 2.11 -0.1 2.16 perf-profile.calltrace.cycles-pp.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 3.25 -0.1 3.10 -0.1 3.17 perf-profile.calltrace.cycles-pp.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 3.14 -0.1 3.00 -0.1 3.06 perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma 2.98 -0.1 2.85 -0.1 2.87 ą 2% perf-profile.calltrace.cycles-pp.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 1.27 ą 2% -0.1 1.15 ą 4% -0.1 1.19 ą 6% perf-profile.calltrace.cycles-pp.__memcpy.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge 2.45 -0.1 2.34 -0.1 2.38 perf-profile.calltrace.cycles-pp.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma 2.05 -0.1 1.94 -0.1 1.97 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_merge.copy_vma.move_vma.__do_sys_mremap 2.44 -0.1 2.33 -0.1 2.38 perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap 2.22 -0.1 2.11 -0.1 2.15 perf-profile.calltrace.cycles-pp.native_flush_tlb_one_user.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables 1.76 ą 2% -0.1 1.65 ą 2% -0.1 1.66 ą 4% perf-profile.calltrace.cycles-pp.vma_prepare.vma_merge.copy_vma.move_vma.__do_sys_mremap 1.86 -0.1 1.75 -0.1 
1.78 perf-profile.calltrace.cycles-pp.vma_link.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 1.40 -0.1 1.30 -0.1 1.34 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap 1.39 -0.1 1.30 -0.1 1.33 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma.move_vma 0.55 -0.1 0.46 ą 30% -0.0 0.52 perf-profile.calltrace.cycles-pp.mas_find.find_vma_prev.copy_vma.move_vma.__do_sys_mremap 1.25 -0.1 1.16 -0.1 1.20 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap 0.94 -0.1 0.86 -0.1 0.87 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap 1.23 -0.1 1.15 -0.1 1.17 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma 1.54 -0.1 1.47 -0.0 1.49 perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap 0.73 -0.1 0.66 -0.0 0.69 perf-profile.calltrace.cycles-pp.mas_walk.find_vma_prev.copy_vma.move_vma.__do_sys_mremap 1.15 -0.1 1.09 -0.1 1.10 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap 0.60 ą 2% -0.1 0.54 -0.0 0.58 perf-profile.calltrace.cycles-pp.security_mmap_addr.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64 1.27 -0.1 1.21 -0.0 1.24 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma 0.80 ą 2% -0.1 0.74 ą 2% -0.0 0.76 ą 2% perf-profile.calltrace.cycles-pp.__call_rcu_common.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge 0.72 -0.1 0.66 -0.0 0.69 perf-profile.calltrace.cycles-pp.mas_prev.vma_merge.copy_vma.move_vma.__do_sys_mremap 0.78 -0.1 0.73 -0.0 0.75 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma 0.69 ą 2% -0.1 0.64 ą 3% -0.0 0.66 ą 4% 
perf-profile.calltrace.cycles-pp.mod_objcg_state.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.copy_vma 1.63 -0.1 1.58 -0.1 1.57 perf-profile.calltrace.cycles-pp.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 1.02 -0.1 0.97 -0.0 0.98 perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region 0.77 -0.0 0.72 -0.0 0.74 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge 0.62 -0.0 0.57 -0.0 0.60 perf-profile.calltrace.cycles-pp.mas_prev_setup.mas_prev.vma_merge.copy_vma.move_vma 0.67 -0.0 0.62 -0.0 0.64 perf-profile.calltrace.cycles-pp.percpu_counter_add_batch.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 0.86 -0.0 0.81 -0.0 0.83 perf-profile.calltrace.cycles-pp.mtree_load.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64 1.12 -0.0 1.08 -0.0 1.09 perf-profile.calltrace.cycles-pp.clear_bhb_loop.mremap 0.56 -0.0 0.51 -0.0 0.53 perf-profile.calltrace.cycles-pp.mas_walk.mas_prev_setup.mas_prev.vma_merge.copy_vma 0.68 ą 2% -0.0 0.63 -0.0 0.65 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.mremap 0.81 -0.0 0.77 -0.0 0.80 perf-profile.calltrace.cycles-pp.mtree_load.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 1.02 -0.0 0.97 -0.0 0.98 perf-profile.calltrace.cycles-pp.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.95 ą 2% -0.0 0.90 ą 2% -0.0 0.93 perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.unlink_anon_vmas.free_pgtables.unmap_region 0.98 -0.0 0.94 -0.0 0.95 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 0.78 -0.0 0.74 -0.0 0.75 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_link.copy_vma.move_vma.__do_sys_mremap 0.70 -0.0 0.66 -0.0 0.67 
perf-profile.calltrace.cycles-pp.__call_rcu_common.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 0.69 -0.0 0.65 -0.0 0.66 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma 0.69 -0.0 0.65 -0.0 0.65 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_link.copy_vma.move_vma.__do_sys_mremap 0.62 -0.0 0.59 -0.0 0.60 perf-profile.calltrace.cycles-pp.mas_prev_slot.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 1.16 -0.0 1.12 -0.0 1.13 perf-profile.calltrace.cycles-pp.anon_vma_clone.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 0.76 ą 2% -0.0 0.72 -0.0 0.72 ą 2% perf-profile.calltrace.cycles-pp.allocate_slab.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma 1.01 -0.0 0.97 -0.0 0.99 perf-profile.calltrace.cycles-pp.mt_find.vma_merge.copy_vma.move_vma.__do_sys_mremap 0.60 -0.0 0.57 -0.0 0.58 perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas 0.88 -0.0 0.85 -0.0 0.85 perf-profile.calltrace.cycles-pp.userfaultfd_unmap_complete.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 0.62 ą 2% -0.0 0.59 ą 2% -0.0 0.60 perf-profile.calltrace.cycles-pp.get_old_pud.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64 0.59 -0.0 0.56 -0.0 0.56 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.mremap 0.65 -0.0 0.62 ą 2% -0.0 0.63 perf-profile.calltrace.cycles-pp.mas_update_gap.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma 0.81 +0.0 0.82 -0.0 0.79 perf-profile.calltrace.cycles-pp.thp_get_unmapped_area_vmflags.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64 2.76 +0.0 2.78 ą 2% -0.1 2.67 perf-profile.calltrace.cycles-pp.unlink_anon_vmas.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap 3.47 +0.0 3.51 -0.1 3.37 perf-profile.calltrace.cycles-pp.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma 0.76 +0.1 0.83 +0.1 0.85 perf-profile.calltrace.cycles-pp.__madvise 0.66 +0.1 
0.73 +0.1 0.75 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise 0.67 +0.1 0.74 +0.1 0.76 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__madvise 0.63 +0.1 0.70 +0.1 0.72 perf-profile.calltrace.cycles-pp.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise 0.62 +0.1 0.70 +0.1 0.71 perf-profile.calltrace.cycles-pp.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise 0.00 +0.9 0.86 +0.9 0.92 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.do_munmap 0.00 +0.9 0.88 +0.0 0.00 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.mremap_to.__do_sys_mremap 83.81 +0.9 84.69 +0.6 84.44 perf-profile.calltrace.cycles-pp.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 0.00 +0.9 0.90 ą 2% +0.9 0.91 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.move_vma 0.00 +1.1 1.10 +0.0 0.00 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64 0.00 +1.2 1.21 +1.3 1.28 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to 2.10 +1.5 3.60 +1.7 3.79 perf-profile.calltrace.cycles-pp.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.00 +1.5 1.52 +1.5 1.52 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap 1.59 +1.5 3.12 +1.7 3.31 perf-profile.calltrace.cycles-pp.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64 0.00 +1.6 1.61 +0.0 0.00 perf-profile.calltrace.cycles-pp.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.00 +1.7 1.73 +1.8 1.83 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap 0.00 +2.0 2.01 +2.0 2.04 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 5.34 +3.0 8.38 +1.6 6.92 
perf-profile.calltrace.cycles-pp.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 75.22 -2.0 73.18 -0.9 74.34 perf-profile.children.cycles-pp.move_vma 37.04 -1.6 35.40 -1.2 35.83 perf-profile.children.cycles-pp.do_vmi_align_munmap 25.09 -1.4 23.72 -0.9 24.20 perf-profile.children.cycles-pp.copy_vma 20.04 -1.1 18.96 -0.8 19.28 perf-profile.children.cycles-pp.__split_vma 19.87 -1.0 18.84 -0.6 19.24 perf-profile.children.cycles-pp.rcu_core 19.85 -1.0 18.82 -0.6 19.22 perf-profile.children.cycles-pp.rcu_do_batch 19.89 -1.0 18.86 -0.6 19.26 perf-profile.children.cycles-pp.handle_softirqs 17.55 -0.9 16.67 -0.5 17.02 perf-profile.children.cycles-pp.kmem_cache_free 15.32 -0.8 14.49 -0.5 14.78 perf-profile.children.cycles-pp.kmem_cache_alloc_noprof 15.17 -0.8 14.39 -0.5 14.66 perf-profile.children.cycles-pp.vma_merge 12.12 -0.6 11.48 -0.4 11.70 perf-profile.children.cycles-pp.__slab_free 12.19 -0.6 11.56 -0.5 11.73 perf-profile.children.cycles-pp.mas_wr_store_entry 11.99 -0.6 11.36 -0.5 11.53 perf-profile.children.cycles-pp.mas_store_prealloc 10.88 -0.6 10.28 -0.4 10.50 perf-profile.children.cycles-pp.vm_area_dup 9.90 -0.5 9.41 -0.4 9.53 perf-profile.children.cycles-pp.mas_wr_node_store 8.39 -0.5 7.92 -0.3 8.13 perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook 7.99 -0.4 7.58 -0.3 7.73 perf-profile.children.cycles-pp.move_page_tables 6.70 -0.4 6.33 -0.3 6.43 perf-profile.children.cycles-pp.vma_complete 5.87 -0.3 5.55 -0.2 5.66 perf-profile.children.cycles-pp.move_ptes 5.12 -0.3 4.81 -0.2 4.90 perf-profile.children.cycles-pp.mas_preallocate 6.05 -0.3 5.74 -0.2 5.85 perf-profile.children.cycles-pp.vm_area_free_rcu_cb 2.98 -0.3 2.69 ą 4% -0.2 2.80 ą 6% perf-profile.children.cycles-pp.__memcpy 3.46 ą 2% -0.2 3.25 -0.1 3.36 ą 3% perf-profile.children.cycles-pp.mod_objcg_state 3.47 -0.2 3.26 -0.2 3.32 perf-profile.children.cycles-pp.___slab_alloc 2.44 -0.2 2.25 -0.1 2.33 perf-profile.children.cycles-pp.find_vma_prev 2.92 -0.2 2.73 -0.1 2.79 
perf-profile.children.cycles-pp.mas_alloc_nodes 3.46 -0.2 3.27 -0.1 3.34 perf-profile.children.cycles-pp.flush_tlb_mm_range 3.47 -0.2 3.29 -0.2 3.32 ą 2% perf-profile.children.cycles-pp.down_write 3.33 -0.2 3.16 -0.1 3.25 perf-profile.children.cycles-pp.__memcg_slab_free_hook 4.23 -0.2 4.07 -0.1 4.08 ą 2% perf-profile.children.cycles-pp.anon_vma_clone 8.33 -0.2 8.17 -0.2 8.13 perf-profile.children.cycles-pp.unmap_region 3.35 -0.1 3.20 -0.1 3.26 perf-profile.children.cycles-pp.mas_store_gfp 2.21 -0.1 2.07 -0.1 2.10 perf-profile.children.cycles-pp.__cond_resched 3.19 -0.1 3.05 -0.1 3.11 perf-profile.children.cycles-pp.unmap_vmas 2.12 -0.1 1.99 -0.1 2.04 perf-profile.children.cycles-pp.__call_rcu_common 2.66 -0.1 2.54 -0.1 2.60 perf-profile.children.cycles-pp.mtree_load 2.24 -0.1 2.12 ą 2% -0.1 2.13 ą 3% perf-profile.children.cycles-pp.vma_prepare 2.50 -0.1 2.38 -0.1 2.42 perf-profile.children.cycles-pp.flush_tlb_func 2.04 ą 2% -0.1 1.93 -0.1 1.96 ą 2% perf-profile.children.cycles-pp.allocate_slab 2.46 -0.1 2.35 -0.1 2.41 perf-profile.children.cycles-pp.rcu_cblist_dequeue 2.48 -0.1 2.38 -0.1 2.42 perf-profile.children.cycles-pp.unmap_page_range 2.23 -0.1 2.12 -0.1 2.16 perf-profile.children.cycles-pp.native_flush_tlb_one_user 1.77 -0.1 1.67 -0.1 1.70 perf-profile.children.cycles-pp.mas_wr_walk 1.88 -0.1 1.78 -0.1 1.80 perf-profile.children.cycles-pp.vma_link 1.84 -0.1 1.75 -0.1 1.77 perf-profile.children.cycles-pp.up_write 0.97 ą 2% -0.1 0.88 -0.1 0.89 perf-profile.children.cycles-pp.rcu_all_qs 1.40 -0.1 1.32 -0.1 1.34 ą 2% perf-profile.children.cycles-pp.shuffle_freelist 1.03 -0.1 0.95 -0.0 0.99 perf-profile.children.cycles-pp.mas_prev 0.92 -0.1 0.85 -0.0 0.88 perf-profile.children.cycles-pp.mas_prev_setup 1.58 -0.1 1.51 -0.1 1.53 perf-profile.children.cycles-pp.zap_pmd_range 1.24 -0.1 1.17 -0.0 1.20 perf-profile.children.cycles-pp.mas_prev_slot 1.57 -0.1 1.49 -0.1 1.49 perf-profile.children.cycles-pp.mas_update_gap 0.62 -0.1 0.56 -0.0 0.60 
perf-profile.children.cycles-pp.security_mmap_addr 0.90 -0.1 0.84 -0.0 0.86 perf-profile.children.cycles-pp.percpu_counter_add_batch 0.86 -0.1 0.80 -0.0 0.81 perf-profile.children.cycles-pp._raw_spin_lock_irqsave 0.98 -0.1 0.92 -0.0 0.95 perf-profile.children.cycles-pp.mas_pop_node 1.68 -0.1 1.62 -0.1 1.62 perf-profile.children.cycles-pp.__get_unmapped_area 1.23 -0.1 1.18 -0.0 1.20 perf-profile.children.cycles-pp.__pte_offset_map_lock 0.49 ą 2% -0.1 0.43 -0.1 0.43 ą 2% perf-profile.children.cycles-pp.setup_object 1.09 -0.1 1.03 -0.0 1.05 perf-profile.children.cycles-pp.zap_pte_range 1.07 ą 2% -0.1 1.02 ą 2% -0.1 1.00 perf-profile.children.cycles-pp.mas_leaf_max_gap 0.70 ą 2% -0.0 0.65 -0.0 0.67 perf-profile.children.cycles-pp.syscall_return_via_sysret 1.18 -0.0 1.14 -0.0 1.15 perf-profile.children.cycles-pp.clear_bhb_loop 0.51 ą 3% -0.0 0.47 -0.0 0.49 ą 3% perf-profile.children.cycles-pp.anon_vma_interval_tree_insert 1.04 -0.0 1.00 -0.0 1.01 perf-profile.children.cycles-pp.vma_to_resize 0.57 -0.0 0.53 -0.0 0.54 perf-profile.children.cycles-pp.mas_wr_end_piv 0.44 ą 2% -0.0 0.40 ą 2% -0.0 0.40 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath 1.14 -0.0 1.10 -0.0 1.12 perf-profile.children.cycles-pp.mt_find 0.90 -0.0 0.87 -0.0 0.87 perf-profile.children.cycles-pp.userfaultfd_unmap_complete 0.62 -0.0 0.59 -0.0 0.60 perf-profile.children.cycles-pp.__put_partials 0.45 ą 6% -0.0 0.42 -0.0 0.43 perf-profile.children.cycles-pp._raw_spin_lock 0.48 -0.0 0.45 ą 2% -0.0 0.46 perf-profile.children.cycles-pp.mas_prev_range 0.61 -0.0 0.58 -0.0 0.59 perf-profile.children.cycles-pp.entry_SYSCALL_64 0.31 ą 3% -0.0 0.28 ą 3% -0.0 0.31 perf-profile.children.cycles-pp.security_vm_enough_memory_mm 0.33 ą 3% -0.0 0.30 ą 2% -0.0 0.31 ą 4% perf-profile.children.cycles-pp.mas_put_in_tree 0.32 ą 2% -0.0 0.29 ą 2% -0.0 0.30 perf-profile.children.cycles-pp.tlb_finish_mmu 0.46 -0.0 0.44 ą 2% -0.0 0.46 perf-profile.children.cycles-pp.rcu_segcblist_enqueue 0.33 -0.0 0.31 -0.0 0.32 
perf-profile.children.cycles-pp.mas_destroy 0.36 -0.0 0.34 -0.0 0.34 perf-profile.children.cycles-pp.__rb_insert_augmented 0.39 -0.0 0.37 -0.0 0.38 ą 2% perf-profile.children.cycles-pp.down_write_killable 0.29 -0.0 0.27 ą 2% -0.0 0.28 perf-profile.children.cycles-pp.tlb_gather_mmu 0.26 -0.0 0.24 ą 2% -0.0 0.25 ą 2% perf-profile.children.cycles-pp.syscall_exit_to_user_mode 0.16 ą 2% -0.0 0.14 ą 3% -0.0 0.14 ą 3% perf-profile.children.cycles-pp.mas_wr_append 0.30 ą 2% -0.0 0.28 ą 2% -0.0 0.29 ą 2% perf-profile.children.cycles-pp.__vm_enough_memory 0.32 -0.0 0.30 ą 2% -0.0 0.31 perf-profile.children.cycles-pp.pte_offset_map_nolock 2.83 +0.0 2.85 ą 2% -0.1 2.74 perf-profile.children.cycles-pp.unlink_anon_vmas 0.84 +0.0 0.86 -0.0 0.81 perf-profile.children.cycles-pp.thp_get_unmapped_area_vmflags 0.08 ą 5% +0.0 0.10 ą 3% -0.0 0.08 ą 6% perf-profile.children.cycles-pp.mm_get_unmapped_area_vmflags 3.52 +0.0 3.56 -0.1 3.42 perf-profile.children.cycles-pp.free_pgtables 0.78 +0.1 0.85 +0.1 0.86 perf-profile.children.cycles-pp.__madvise 0.63 +0.1 0.70 +0.1 0.72 perf-profile.children.cycles-pp.__x64_sys_madvise 0.63 +0.1 0.70 +0.1 0.71 perf-profile.children.cycles-pp.do_madvise 0.00 +0.1 0.09 ą 3% +0.1 0.10 ą 5% perf-profile.children.cycles-pp.can_modify_mm_madv 1.31 +0.2 1.46 +0.2 1.50 perf-profile.children.cycles-pp.mas_next_slot 83.90 +0.9 84.79 +0.6 84.53 perf-profile.children.cycles-pp.__do_sys_mremap 40.45 +1.4 41.90 +2.1 42.57 perf-profile.children.cycles-pp.do_vmi_munmap 2.12 +1.5 3.62 +1.7 3.82 perf-profile.children.cycles-pp.do_munmap 3.63 +2.4 5.98 +1.7 5.29 perf-profile.children.cycles-pp.mas_walk 5.40 +3.0 8.44 +1.6 6.97 perf-profile.children.cycles-pp.mremap_to 5.26 +3.2 8.48 +2.3 7.58 perf-profile.children.cycles-pp.mas_find 0.00 +5.5 5.46 +3.9 3.93 perf-profile.children.cycles-pp.can_modify_mm 11.49 -0.6 10.89 -0.4 11.10 perf-profile.self.cycles-pp.__slab_free 4.32 -0.3 4.06 -0.2 4.16 perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook 1.96 -0.2 1.77 ą 4% 
-0.1 1.84 ą 6% perf-profile.self.cycles-pp.__memcpy 2.36 -0.1 2.25 ą 2% -0.1 2.25 ą 3% perf-profile.self.cycles-pp.down_write 2.42 -0.1 2.31 -0.0 2.38 perf-profile.self.cycles-pp.rcu_cblist_dequeue 2.33 -0.1 2.23 -0.1 2.28 perf-profile.self.cycles-pp.mtree_load 2.21 -0.1 2.10 -0.1 2.14 perf-profile.self.cycles-pp.native_flush_tlb_one_user 1.62 -0.1 1.54 -0.0 1.57 perf-profile.self.cycles-pp.__memcg_slab_free_hook 1.52 -0.1 1.44 -0.1 1.46 perf-profile.self.cycles-pp.mas_wr_walk 1.44 -0.1 1.36 -0.1 1.38 ą 2% perf-profile.self.cycles-pp.__call_rcu_common 1.53 -0.1 1.45 -0.0 1.48 perf-profile.self.cycles-pp.up_write 1.72 -0.1 1.65 -0.0 1.70 perf-profile.self.cycles-pp.mod_objcg_state 0.69 ą 2% -0.1 0.63 -0.1 0.63 perf-profile.self.cycles-pp.rcu_all_qs 1.14 ą 2% -0.1 1.08 -0.0 1.09 ą 2% perf-profile.self.cycles-pp.shuffle_freelist 1.18 -0.1 1.12 -0.0 1.17 perf-profile.self.cycles-pp.vma_merge 1.38 -0.1 1.33 -0.0 1.35 perf-profile.self.cycles-pp.do_vmi_align_munmap 0.51 ą 2% -0.1 0.45 -0.0 0.49 perf-profile.self.cycles-pp.security_mmap_addr 0.62 -0.1 0.56 ą 2% -0.1 0.56 perf-profile.self.cycles-pp.mremap 0.89 -0.1 0.83 -0.0 0.85 perf-profile.self.cycles-pp.___slab_alloc 0.99 -0.1 0.94 -0.0 0.96 perf-profile.self.cycles-pp.mas_prev_slot 1.00 -0.0 0.95 -0.0 0.96 perf-profile.self.cycles-pp.mas_preallocate 0.98 -0.0 0.93 -0.0 0.95 perf-profile.self.cycles-pp.move_ptes 0.85 -0.0 0.80 -0.0 0.82 perf-profile.self.cycles-pp.mas_pop_node 0.94 -0.0 0.90 -0.0 0.91 ą 2% perf-profile.self.cycles-pp.vm_area_free_rcu_cb 1.09 -0.0 1.04 -0.0 1.06 perf-profile.self.cycles-pp.__cond_resched 0.77 -0.0 0.72 -0.0 0.74 perf-profile.self.cycles-pp.percpu_counter_add_batch 0.94 ą 2% -0.0 0.89 ą 2% -0.1 0.87 perf-profile.self.cycles-pp.mas_leaf_max_gap 1.17 -0.0 1.12 -0.0 1.14 perf-profile.self.cycles-pp.clear_bhb_loop 0.68 -0.0 0.63 -0.0 0.65 perf-profile.self.cycles-pp.__split_vma 0.79 -0.0 0.75 -0.0 0.77 perf-profile.self.cycles-pp.mas_wr_store_entry 1.22 -0.0 1.18 -0.0 1.18 
perf-profile.self.cycles-pp.move_vma 0.43 ą 2% -0.0 0.40 ą 2% -0.0 0.40 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath 1.49 -0.0 1.45 +0.0 1.49 perf-profile.self.cycles-pp.kmem_cache_free 0.44 -0.0 0.40 -0.0 0.40 perf-profile.self.cycles-pp.do_munmap 0.45 -0.0 0.42 -0.0 0.43 perf-profile.self.cycles-pp.mas_wr_end_piv 0.89 -0.0 0.86 -0.0 0.88 perf-profile.self.cycles-pp.mas_store_gfp 0.78 -0.0 0.75 -0.0 0.76 perf-profile.self.cycles-pp.userfaultfd_unmap_complete 0.66 -0.0 0.62 -0.0 0.64 perf-profile.self.cycles-pp.mas_store_prealloc 0.60 -0.0 0.58 -0.0 0.59 perf-profile.self.cycles-pp.unmap_region 0.36 ą 4% -0.0 0.33 ą 3% -0.0 0.34 ą 2% perf-profile.self.cycles-pp.syscall_return_via_sysret 0.55 -0.0 0.52 -0.0 0.53 perf-profile.self.cycles-pp.get_old_pud 0.99 -0.0 0.97 -0.0 0.98 perf-profile.self.cycles-pp.mt_find 0.61 -0.0 0.58 -0.0 0.60 perf-profile.self.cycles-pp.copy_vma 0.43 ą 3% -0.0 0.40 -0.0 0.41 ą 4% perf-profile.self.cycles-pp.anon_vma_interval_tree_insert 0.49 -0.0 0.47 -0.0 0.48 perf-profile.self.cycles-pp.find_vma_prev 0.71 -0.0 0.68 -0.0 0.70 perf-profile.self.cycles-pp.unmap_page_range 0.27 -0.0 0.25 -0.0 0.26 perf-profile.self.cycles-pp.mas_prev_setup 0.47 -0.0 0.45 -0.0 0.46 ą 2% perf-profile.self.cycles-pp.flush_tlb_mm_range 0.37 ą 6% -0.0 0.35 -0.0 0.35 perf-profile.self.cycles-pp._raw_spin_lock 0.41 -0.0 0.39 -0.0 0.40 perf-profile.self.cycles-pp._raw_spin_lock_irqsave 0.40 -0.0 0.37 -0.0 0.38 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack 0.27 -0.0 0.25 ą 2% -0.0 0.25 ą 3% perf-profile.self.cycles-pp.mas_put_in_tree 0.49 -0.0 0.47 -0.0 0.49 perf-profile.self.cycles-pp.refill_obj_stock 0.48 -0.0 0.46 -0.0 0.47 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe 0.27 ą 2% -0.0 0.25 -0.0 0.26 perf-profile.self.cycles-pp.tlb_finish_mmu 0.24 ą 2% -0.0 0.22 -0.0 0.23 perf-profile.self.cycles-pp.mas_prev 0.28 -0.0 0.26 -0.0 0.27 ą 2% perf-profile.self.cycles-pp.mas_alloc_nodes 0.40 -0.0 0.39 -0.0 0.40 
perf-profile.self.cycles-pp.__pte_offset_map_lock 0.14 ą 3% -0.0 0.12 ą 2% -0.0 0.13 ą 3% perf-profile.self.cycles-pp.syscall_exit_to_user_mode 0.26 -0.0 0.24 ą 2% -0.0 0.25 perf-profile.self.cycles-pp.__rb_insert_augmented 0.28 -0.0 0.26 -0.0 0.27 perf-profile.self.cycles-pp.alloc_new_pud 0.28 -0.0 0.26 -0.0 0.27 ą 2% perf-profile.self.cycles-pp.flush_tlb_func 0.20 ą 2% -0.0 0.19 -0.0 0.19 ą 2% perf-profile.self.cycles-pp.__get_unmapped_area 0.47 -0.0 0.46 -0.0 0.45 perf-profile.self.cycles-pp.arch_get_unmapped_area_topdown_vmflags 0.06 -0.0 0.05 ą 5% -0.0 0.05 perf-profile.self.cycles-pp.vma_dup_policy 0.06 ą 6% +0.0 0.07 -0.0 0.06 ą 8% perf-profile.self.cycles-pp.mm_get_unmapped_area_vmflags 0.11 ą 4% +0.0 0.12 ą 4% +0.0 0.12 ą 4% perf-profile.self.cycles-pp.free_pgd_range 0.21 +0.0 0.22 ą 2% -0.0 0.20 ą 2% perf-profile.self.cycles-pp.thp_get_unmapped_area_vmflags 0.45 +0.0 0.48 +0.0 0.50 perf-profile.self.cycles-pp.do_vmi_munmap 0.27 +0.0 0.32 -0.0 0.26 perf-profile.self.cycles-pp.free_pgtables 0.36 ą 2% +0.1 0.44 -0.0 0.35 perf-profile.self.cycles-pp.unlink_anon_vmas 1.07 +0.1 1.19 +0.2 1.22 perf-profile.self.cycles-pp.mas_next_slot 1.49 +0.5 2.01 +0.4 1.86 perf-profile.self.cycles-pp.mas_find 0.00 +1.4 1.37 +0.9 0.93 perf-profile.self.cycles-pp.can_modify_mm 3.14 +2.1 5.23 +1.5 4.60 perf-profile.self.cycles-pp.mas_walk
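For reference, the %change columns in the comparison above are relative differences against the first commit (ff388fe5c4). A minimal sketch of that computation, using the stress-ng.pagemove.ops_per_sec row from the table (the variable names are illustrative, not lkp code):

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` vs `base`, as in the %change columns."""
    return (new - base) / base * 100.0

base = 306421     # ff388fe5c4 ("mseal: wire up mseal syscall")
mseal = 291207    # 8be7258aad ("mseal: add mseal syscall")
patched = 297490  # 2a78ece39f (mremap refactor patch)

print(f"{pct_change(base, mseal):+.1f}%")    # -5.0%, the reported regression
print(f"{pct_change(base, patched):+.1f}%")  # -2.9%, i.e. partially resolved
```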
To avoid the impact of other changes, it is better to apply the patch on top of 8be7258a directly.
If you prefer another base for this patch, please let us know. Then we will in fact supply results for 4 commits:
  this patch
  the base of this patch
  8be7258a: mseal: add mseal syscall
  ff388fe5c: mseal: wire up mseal syscall
> Thank you for your time and assistance in helping me understand this issue.
Due to resource constraints, please expect that we will need several days to finish this test request.
No problem.
Thanks for your help! -Jeff
> Best regards,
> -Jeff
>
> [1] https://lore.kernel.org/lkml/202408041602.caa0372-oliver.sang@intel.com/
> [2] https://github.com/peaktocreek/mmperf/blob/main/run_stress_ng.c
>
> Jeff Xu (2):
>   mseal:selftest mremap across VMA boundaries.
>   mseal: refactor mremap to remove can_modify_mm
>
>  mm/internal.h                           |  24 ++
>  mm/mremap.c                             |  77 +++---
>  mm/mseal.c                              |  17 --
>  tools/testing/selftests/mm/mseal_test.c | 293 +++++++++++++++++++++++-
>  4 files changed, 353 insertions(+), 58 deletions(-)
>
> --
> 2.46.0.76.ge559c4bf1a-goog