From: "Liam R. Howlett" Liam.Howlett@Oracle.com
Use the maple tree in RCU mode for VMA tracking.
The maple tree tracks the stack and is able to update the pivot (lower/upper boundary) in place, which allows the page fault handler to write to the tree while holding just the mmap read lock. This is safe because stack expansion is bounded by a guard VMA, which ensures there will always be a NULL entry in the direction of the growth, so the write only updates a pivot.
It is possible, but not recommended, to have VMAs that grow up/down without guard VMAs. syzbot has constructed a testcase which sets up a VMA to grow and consume the empty space. Overwriting the entire NULL entry causes the tree to be altered in a way that is not safe for concurrent readers; the readers may see a node being rewritten or one that does not match the maple state they are using.
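To make the two cases concrete, here is a minimal sketch against the plain maple tree API (illustrative only; this is not from the patch or the syzbot reproducer, the ranges and xa_mk_value() entries are made up, and error handling is elided):

	#include <linux/gfp.h>
	#include <linux/maple_tree.h>
	#include <linux/xarray.h>	/* xa_mk_value() */

	static DEFINE_MTREE(mt);

	static void sketch_guarded_vs_unguarded_growth(void)
	{
		/* Stack-style entry at [0x7000, 0x7fff]; below it is a NULL gap. */
		mtree_store_range(&mt, 0x7000, 0x7fff, xa_mk_value(1), GFP_KERNEL);

		/*
		 * Guarded growth: extending down to 0x6000 only shrinks the
		 * NULL gap, i.e. a single pivot in the leaf node moves, and a
		 * concurrent reader sees either the old or the new boundary.
		 */
		mtree_store_range(&mt, 0x6000, 0x7fff, xa_mk_value(1), GFP_KERNEL);

		/*
		 * Unguarded growth: consuming the NULL entry entirely cannot
		 * be done by moving a pivot; nodes get rewritten, which is the
		 * situation the syzbot test case provokes.
		 */
		mtree_store_range(&mt, 0, 0x7fff, xa_mk_value(1), GFP_KERNEL);
	}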
Enabling RCU mode ensures that concurrent readers see a stable node and that lookups return the expected result.
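For reference, the reader-side pattern RCU mode makes safe, as a generic sketch (plain maple tree API, not the mm code path; mtree_load() also takes the RCU read lock internally, so the explicit critical section here matters for using the entry afterwards):

	#include <linux/maple_tree.h>
	#include <linux/rcupdate.h>

	/*
	 * With MT_FLAGS_USE_RCU set, nodes replaced by a concurrent writer
	 * are freed only after a grace period, so this walk never lands on a
	 * node that has already been reused.
	 */
	static void *sketch_rcu_lookup(struct maple_tree *mt, unsigned long index)
	{
		void *entry;

		rcu_read_lock();
		entry = mtree_load(mt, index);	/* sees old or new node, never a freed one */
		rcu_read_unlock();

		return entry;
	}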
Link: https://lkml.kernel.org/r/20230227173632.3292573-9-surenb@google.com
Link: https://lore.kernel.org/linux-mm/000000000000b0a65805f663ace6@google.com/
Cc: stable@vger.kernel.org
Fixes: d4af56c5c7c6 ("mm: start tracking VMAs with maple tree")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: syzbot+8d95422d3537159ca390@syzkaller.appspotmail.com
---
 include/linux/mm_types.h | 3 ++-
 kernel/fork.c            | 3 +++
 mm/mmap.c                | 3 ++-
 3 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 0722859c3647..a57e6ae78e65 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -774,7 +774,8 @@ struct mm_struct {
 	unsigned long cpu_bitmap[];
 };
-#define MM_MT_FLAGS	(MT_FLAGS_ALLOC_RANGE | MT_FLAGS_LOCK_EXTERN)
+#define MM_MT_FLAGS	(MT_FLAGS_ALLOC_RANGE | MT_FLAGS_LOCK_EXTERN | \
+			 MT_FLAGS_USE_RCU)
 
 extern struct mm_struct init_mm;
 /* Pointer magic because the dynamic array size confuses some compilers. */
diff --git a/kernel/fork.c b/kernel/fork.c
index d8cda4c6de6c..1bf31ba07e85 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -617,6 +617,7 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
 	if (retval)
 		goto out;
 
+	mt_clear_in_rcu(vmi.mas.tree);
 	for_each_vma(old_vmi, mpnt) {
 		struct file *file;
@@ -700,6 +701,8 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
 	retval = arch_dup_mmap(oldmm, mm);
 loop_out:
 	vma_iter_free(&vmi);
+	if (!retval)
+		mt_set_in_rcu(vmi.mas.tree);
 out:
 	mmap_write_unlock(mm);
 	flush_tlb_mm(oldmm);
diff --git a/mm/mmap.c b/mm/mmap.c
index 740b54be3ed4..16cbb83b3ec6 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2277,7 +2277,7 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
 	int count = 0;
 	int error = -ENOMEM;
 	MA_STATE(mas_detach, &mt_detach, 0, 0);
-	mt_init_flags(&mt_detach, MT_FLAGS_LOCK_EXTERN);
+	mt_init_flags(&mt_detach, vmi->mas.tree->ma_flags & MT_FLAGS_LOCK_MASK);
 	mt_set_external_lock(&mt_detach, &mm->mmap_lock);
 
 	/*
@@ -3042,6 +3042,7 @@ void exit_mmap(struct mm_struct *mm)
 	 */
 	set_bit(MMF_OOM_SKIP, &mm->flags);
 	mmap_write_lock(mm);
+	mt_clear_in_rcu(&mm->mm_mt);
 	free_pgtables(&tlb, &mm->mm_mt, vma, FIRST_USER_ADDRESS,
 		      USER_PGTABLES_CEILING);
 	tlb_finish_mmu(&tlb);
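One pattern in the fork.c hunks is worth spelling out: until dup_mmap() completes, no reader can see the new tree, so RCU mode can be dropped for the bulk copy and restored on success. A self-contained sketch of that idea (illustrative ranges and values; not the dup_mmap() code itself):

	#include <linux/gfp.h>
	#include <linux/maple_tree.h>
	#include <linux/xarray.h>

	/*
	 * Bulk-populate a tree no reader can reach yet.  Clearing RCU mode
	 * lets the writer reuse freed nodes immediately; setting it again on
	 * success restores deferred (RCU) node freeing before the tree
	 * becomes visible to readers.
	 */
	static int sketch_bulk_populate(struct maple_tree *dst)
	{
		unsigned long i;
		int err = 0;

		mt_clear_in_rcu(dst);		/* no readers exist yet */

		for (i = 0; i < 16; i++) {
			err = mtree_store_range(dst, i << 12, (i << 12) | 0xfff,
						xa_mk_value(i), GFP_KERNEL);
			if (err)
				break;
		}

		if (!err)
			mt_set_in_rcu(dst);	/* back to RCU-safe node freeing */

		return err;
	}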
On Mon, 27 Mar 2023 14:55:32 -0400 "Liam R. Howlett" <Liam.Howlett@oracle.com> wrote:
> Use the maple tree in RCU mode for VMA tracking.
> 
> The maple tree tracks the stack and is able to update the pivot (lower/upper boundary) in place, which allows the page fault handler to write to the tree while holding just the mmap read lock. This is safe because stack expansion is bounded by a guard VMA, which ensures there will always be a NULL entry in the direction of the growth, so the write only updates a pivot.
> 
> It is possible, but not recommended, to have VMAs that grow up/down without guard VMAs. syzbot has constructed a testcase which sets up a VMA to grow and consume the empty space. Overwriting the entire NULL entry causes the tree to be altered in a way that is not safe for concurrent readers; the readers may see a node being rewritten or one that does not match the maple state they are using.
> 
> Enabling RCU mode ensures that concurrent readers see a stable node and
This differs from what we had. Intended?
--- a/mm/mmap.c~mm-enable-maple-tree-rcu-mode-by-default-v8
+++ a/mm/mmap.c
@@ -2277,8 +2277,7 @@ do_vmi_align_munmap(struct vma_iterator
 	int count = 0;
 	int error = -ENOMEM;
 	MA_STATE(mas_detach, &mt_detach, 0, 0);
-	mt_init_flags(&mt_detach, vmi->mas.tree->ma_flags &
-		      (MT_FLAGS_LOCK_MASK | MT_FLAGS_USE_RCU));
+	mt_init_flags(&mt_detach, vmi->mas.tree->ma_flags & MT_FLAGS_LOCK_MASK);
 	mt_set_external_lock(&mt_detach, &mm->mmap_lock);
 
 	/*
_
* Andrew Morton <akpm@linux-foundation.org> [230327 15:38]:
> On Mon, 27 Mar 2023 14:55:32 -0400 "Liam R. Howlett" <Liam.Howlett@oracle.com> wrote:
> 
> > Use the maple tree in RCU mode for VMA tracking.
> > 
> > The maple tree tracks the stack and is able to update the pivot (lower/upper boundary) in place, which allows the page fault handler to write to the tree while holding just the mmap read lock. This is safe because stack expansion is bounded by a guard VMA, which ensures there will always be a NULL entry in the direction of the growth, so the write only updates a pivot.
> > 
> > It is possible, but not recommended, to have VMAs that grow up/down without guard VMAs. syzbot has constructed a testcase which sets up a VMA to grow and consume the empty space. Overwriting the entire NULL entry causes the tree to be altered in a way that is not safe for concurrent readers; the readers may see a node being rewritten or one that does not match the maple state they are using.
> > 
> > Enabling RCU mode ensures that concurrent readers see a stable node and
> 
> This differs from what we had. Intended?
Yes, this is not necessary. The scope of this tree is limited to the function do_vmi_align_munmap() and so we don't need to free the nodes with RCU.
Thanks,
Liam
> --- a/mm/mmap.c~mm-enable-maple-tree-rcu-mode-by-default-v8
> +++ a/mm/mmap.c
> @@ -2277,8 +2277,7 @@ do_vmi_align_munmap(struct vma_iterator
>  	int count = 0;
>  	int error = -ENOMEM;
>  	MA_STATE(mas_detach, &mt_detach, 0, 0);
> -	mt_init_flags(&mt_detach, vmi->mas.tree->ma_flags &
> -		      (MT_FLAGS_LOCK_MASK | MT_FLAGS_USE_RCU));
> +	mt_init_flags(&mt_detach, vmi->mas.tree->ma_flags & MT_FLAGS_LOCK_MASK);
>  	mt_set_external_lock(&mt_detach, &mm->mmap_lock);
>  
>  	/*
> _
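To restate that in code, the lifecycle of mt_detach is roughly the following (a sketch assuming the hunk above; the caller holds the mmap write lock, and splitting/storing the detached VMAs is elided):

	#include <linux/maple_tree.h>
	#include <linux/mm_types.h>

	static void sketch_detach_lifecycle(struct mm_struct *mm)
	{
		struct maple_tree mt_detach;

		/* Inherit only the locking flags; MT_FLAGS_USE_RCU stays clear. */
		mt_init_flags(&mt_detach, mm->mm_mt.ma_flags & MT_FLAGS_LOCK_MASK);
		mt_set_external_lock(&mt_detach, &mm->mmap_lock);

		/* ... store the VMAs being unmapped, then free them ... */

		/* Function-local tree: free its nodes directly, no grace period. */
		__mt_destroy(&mt_detach);
	}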
Hello,
kernel test robot noticed a -8.5% regression of stress-ng.mmapaddr.ops_per_sec on:
commit: b5768f8ae36fd0c218838c88b114a9978db05c91 ("[PATCH 8/8] mm: enable maple tree RCU mode by default.")
url: https://github.com/intel-lab-lkp/linux/commits/Liam-R-Howlett/maple_tree-be-...
base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git 3a93e40326c8f470e71d20b4c42d36767450f38f
patch link: https://lore.kernel.org/all/20230327185532.2354250-9-Liam.Howlett@oracle.com...
patch subject: [PATCH 8/8] mm: enable maple tree RCU mode by default.
testcase: stress-ng
test machine: 96 threads 2 sockets (Ice Lake) with 256G memory
parameters:

	nr_threads: 10%
	disk: 1HDD
	testtime: 60s
	fs: ext4
	class: vm
	test: mmapaddr
	cpufreq_governor: performance
If you fix the issue, kindly add following tag
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Link: https://lore.kernel.org/oe-lkp/202304110907.a7339b10-oliver.sang@intel.com
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if you come across any failure that blocks the test,
        # please remove ~/.lkp and /lkp dir to run from a clean state.
=========================================================================================
class/compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  vm/gcc-11/performance/1HDD/ext4/x86_64-rhel-8.3/10%/debian-11.1-x86_64-20220510.cgz/lkp-icl-2sp1/mmapaddr/stress-ng/60s
commit:
  5ae51d7b1f ("maple_tree: add RCU lock checking to rcu callback functions")
  b5768f8ae3 ("mm: enable maple tree RCU mode by default.")
5ae51d7b1fb2fba9 b5768f8ae36fd0c218838c88b11
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
  53853773            -8.4%   49313594        stress-ng.mmapaddr.ops
    898492            -8.5%     821889        stress-ng.mmapaddr.ops_per_sec
 1.077e+08            -8.4%   98633700        stress-ng.time.minor_page_faults
    863.25            -2.5%     841.62        stress-ng.time.percent_of_cpu_this_job_got
    491.00            -1.9%     481.58        stress-ng.time.system_time
    221487           +12.8%     249928        meminfo.SUnreclaim
      0.04 ±  2%      +0.2        0.27        mpstat.cpu.all.soft%
     93053 ±  6%     +20.0%     111624 ±  7%  numa-meminfo.node1.SUnreclaim
     23263 ±  6%     +20.0%      27913 ±  7%  numa-vmstat.node1.nr_slab_unreclaimable
      1947           +18.1%       2299        vmstat.system.cs
     55371           +12.8%      62436        proc-vmstat.nr_slab_unreclaimable
 4.898e+08            -6.6%  4.573e+08        proc-vmstat.numa_hit
  4.89e+08            -6.7%  4.561e+08        proc-vmstat.numa_local
 4.813e+08            -5.5%  4.548e+08        proc-vmstat.pgalloc_normal
  1.08e+08            -8.4%   98948247        proc-vmstat.pgfault
 4.812e+08            -5.6%  4.544e+08        proc-vmstat.pgfree
      3.35           +39.9%       4.68 ± 18%  perf-stat.i.MPKI
 8.691e+09            -2.9%  8.441e+09        perf-stat.i.branch-instructions
      0.57            +0.2        0.76 ±  7%  perf-stat.i.branch-miss-rate%
  43345846           +30.2%   56428345        perf-stat.i.branch-misses
     20.37 ±  2%      -2.7       17.66        perf-stat.i.cache-miss-rate%
  26387649 ±  2%     +18.0%   31141545        perf-stat.i.cache-misses
 1.286e+08           +35.4%  1.741e+08        perf-stat.i.cache-references
      1540           +24.1%       1912        perf-stat.i.context-switches
      0.71            +4.5%       0.74 ±  4%  perf-stat.i.cpi
      1212           -12.7%       1058 ±  5%  perf-stat.i.cycles-between-cache-misses
 1.163e+10            -2.0%  1.139e+10        perf-stat.i.dTLB-loads
      0.05 ±  2%      +0.0        0.05 ± 12%  perf-stat.i.dTLB-store-miss-rate%
  7.19e+09            -2.3%  7.023e+09        perf-stat.i.dTLB-stores
 4.587e+10            -2.6%  4.468e+10        perf-stat.i.instructions
      1.42            -3.0%       1.38        perf-stat.i.ipc
    287.84            -2.2%     281.56        perf-stat.i.metric.M/sec
     89.27            -8.4       80.91        perf-stat.i.node-load-miss-rate%
    417672 ±  5%    +102.1%     844069 ±  3%  perf-stat.i.node-loads
     85.67           -18.1       67.58        perf-stat.i.node-store-miss-rate%
    708025 ±  4%    +187.6%    2036086 ±  3%  perf-stat.i.node-stores
      2.80           +39.0%       3.90        perf-stat.overall.MPKI
      0.50            +0.2        0.67        perf-stat.overall.branch-miss-rate%
     20.52 ±  2%      -2.6       17.89        perf-stat.overall.cache-miss-rate%
      0.70            +3.0%       0.72        perf-stat.overall.cpi
      1209 ±  2%     -15.0%       1028        perf-stat.overall.cycles-between-cache-misses
      0.04 ±  3%      +0.0        0.05 ± 13%  perf-stat.overall.dTLB-store-miss-rate%
      1.44            -2.9%       1.40        perf-stat.overall.ipc
     90.25            -8.7       81.54        perf-stat.overall.node-load-miss-rate%
     86.37           -18.3       68.08        perf-stat.overall.node-store-miss-rate%
 8.553e+09            -2.9%  8.308e+09        perf-stat.ps.branch-instructions
  42646399           +30.2%   55522635        perf-stat.ps.branch-misses
  25968476 ±  2%     +18.0%   30648177        perf-stat.ps.cache-misses
 1.265e+08           +35.4%  1.714e+08        perf-stat.ps.cache-references
      1515           +24.2%       1881        perf-stat.ps.context-switches
 1.144e+10            -2.0%  1.121e+10        perf-stat.ps.dTLB-loads
 7.076e+09            -2.3%  6.912e+09        perf-stat.ps.dTLB-stores
 4.514e+10            -2.6%  4.398e+10        perf-stat.ps.instructions
    411017 ±  5%    +102.1%     830672 ±  3%  perf-stat.ps.node-loads
    696326 ±  4%    +187.7%    2003485 ±  3%  perf-stat.ps.node-stores
 2.847e+12            -2.5%  2.777e+12        perf-stat.total.instructions

perf-profile.children.cycles-pp, selected largest shifts (full calltrace/children/self breakdown in the original report):

     25.28            -4.0       21.25 ± 37%  perf-profile.children.cycles-pp.__munmap
     11.80            -2.3        9.46 ± 37%  perf-profile.children.cycles-pp.mincore
     18.01            -1.6       16.43 ±  2%  perf-profile.children.cycles-pp.unmap_region
     12.05            +3.0       15.08 ±  2%  perf-profile.children.cycles-pp.do_mmap
      9.91            +3.0       12.92 ±  2%  perf-profile.children.cycles-pp.mmap_region
      2.33            +3.0        5.33 ±  3%  perf-profile.children.cycles-pp.mas_alloc_nodes
      1.01            +2.4        3.38 ±  3%  perf-profile.children.cycles-pp.kmem_cache_alloc_bulk
      1.86            +2.3        4.19 ±  3%  perf-profile.children.cycles-pp.mas_preallocate
      0.44 ±  4%      +2.3        2.74 ±  2%  perf-profile.children.cycles-pp.__irq_exit_rcu
      0.36 ±  5%      +2.3        2.67 ±  2%  perf-profile.children.cycles-pp.__do_softirq
      0.00            +2.3        2.30 ±  3%  perf-profile.children.cycles-pp.___slab_alloc
      0.00            +2.3        2.28 ±  3%  perf-profile.children.cycles-pp.rcu_core
      0.00            +2.2        2.24 ±  3%  perf-profile.children.cycles-pp.rcu_do_batch
      5.27            +2.1        7.35 ±  2%  perf-profile.children.cycles-pp.mas_store_prealloc
      0.00            +1.6        1.62 ±  3%  perf-profile.children.cycles-pp.__slab_free
      1.79            +1.4        3.18 ±  2%  perf-profile.children.cycles-pp.mas_destroy
      0.00            +1.2        1.22 ±  2%  perf-profile.children.cycles-pp.__call_rcu_common
Disclaimer: Results have been estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance.
On Tue, Apr 11, 2023 at 09:25:16AM +0800, kernel test robot wrote:
> kernel test robot noticed a -8.5% regression of stress-ng.mmapaddr.ops_per_sec on:
Assuming this is the test in question:
https://github.com/ColinIanKing/stress-ng/blob/master/stress-mmapaddr.c
then yes, this is expected. The test calls mmap() and munmap() a lot, and we've made those slower in order to fix a bug.
While it does take page faults (which makes it a better test than some microbenchmarks), it takes only one page fault per mmap()/munmap() cycle, which is not representative of real workloads.
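For context, the measured pattern is roughly the loop below (a made-up minimal userspace approximation, not the stress-ng source; stress-mmapaddr also randomizes the hint address):

	#include <stdlib.h>
	#include <sys/mman.h>

	int main(void)
	{
		const size_t len = 8 * 4096;
		char *p;

		/*
		 * Each iteration pays for one mmap(), one page fault and one
		 * munmap(), so extra cost in the VMA tree write path shows up
		 * almost directly in ops/sec.
		 */
		for (int i = 0; i < 1000000; i++) {
			p = mmap(NULL, len, PROT_READ | PROT_WRITE,
				 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
			if (p == MAP_FAILED)
				return EXIT_FAILURE;
			p[0] = 1;	/* the single page fault */
			if (munmap(p, len))
				return EXIT_FAILURE;
		}
		return EXIT_SUCCESS;
	}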
Thanks for running the test.