[based on mm-unstable, 651c8c1d7359]
Optimize mseal checks by removing the separate can_modify_mm() step and instead checking each individual vma while the various operations are themselves iterating through the tree. This provides a nice speedup and restores performance parity with pre-mseal[3].
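To make the idea concrete, here is a minimal userspace model of the two shapes. It is a sketch, not kernel code: struct toy_vma, toy_walk(), toy_lookups and both toy_unmap_*() helpers are hypothetical names invented for illustration, an array stands in for the VMA tree, and the toy assumes every entry starts out mapped. Only the per-VMA check is meant to echo can_modify_vma().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a VMA: only the fields this model needs. */
struct toy_vma {
	bool sealed;	/* models a sealed mapping */
	bool mapped;	/* cleared when the toy operation "unmaps" it */
};

/* Counts simulated tree lookups, a rough proxy for traversal cost. */
static size_t toy_lookups;

/* Models one step of a VMA tree walk (here: plain array indexing). */
static struct toy_vma *toy_walk(struct toy_vma *v, size_t i)
{
	toy_lookups++;
	return &v[i];
}

/* Per-VMA check, in the spirit of can_modify_vma(). */
static bool toy_can_modify_vma(const struct toy_vma *vma)
{
	return !vma->sealed;
}

/* Old shape: a dedicated checking walk, then the operation's own walk. */
static int toy_unmap_two_pass(struct toy_vma *v, size_t lo, size_t hi)
{
	assert(lo <= hi);
	for (size_t i = lo; i < hi; i++)	/* models can_modify_mm() */
		if (!toy_can_modify_vma(toy_walk(v, i)))
			return -1;		/* stands in for -EPERM */
	for (size_t i = lo; i < hi; i++)	/* the real work */
		toy_walk(v, i)->mapped = false;
	return 0;
}

/* New shape: the check rides along with the operation's own walk. */
static int toy_unmap_one_pass(struct toy_vma *v, size_t lo, size_t hi)
{
	assert(lo <= hi);
	for (size_t i = lo; i < hi; i++) {
		struct toy_vma *vma = toy_walk(v, i);

		if (!toy_can_modify_vma(vma)) {
			/* Undo what this call already touched. */
			while (i-- > lo)
				v[i].mapped = true;
			return -1;
		}
		vma->mapped = false;
	}
	return 0;
}
```

In this toy, both variants leave the array in the same state, but the two-pass variant performs twice as many lookups on the success path. How each real operation behaves when it hits a sealed vma partway through is not modelled here.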
This series also depends on the powerpc series that removes arch_unmap[2]; that series is already in mm-unstable.
will-it-scale mmap1_process[1] -t 1 results:
commit 3450fe2b574b4345e4296ccae395149e1a357fee:
min:277605 max:277605 total:277605
min:281784 max:281784 total:281784
min:277238 max:277238 total:277238
min:281761 max:281761 total:281761
min:274279 max:274279 total:274279
min:254854 max:254854 total:254854
measurement
min:269143 max:269143 total:269143
min:270454 max:270454 total:270454
min:243523 max:243523 total:243523
min:251148 max:251148 total:251148
min:209669 max:209669 total:209669
min:190426 max:190426 total:190426
min:231219 max:231219 total:231219
min:275364 max:275364 total:275364
min:266540 max:266540 total:266540
min:242572 max:242572 total:242572
min:284469 max:284469 total:284469
min:278882 max:278882 total:278882
min:283269 max:283269 total:283269
min:281204 max:281204 total:281204
After this patch set:
min:280580 max:280580 total:280580
min:290514 max:290514 total:290514
min:291006 max:291006 total:291006
min:290352 max:290352 total:290352
min:294582 max:294582 total:294582
min:293075 max:293075 total:293075
measurement
min:295613 max:295613 total:295613
min:294070 max:294070 total:294070
min:293193 max:293193 total:293193
min:291631 max:291631 total:291631
min:295278 max:295278 total:295278
min:293782 max:293782 total:293782
min:290361 max:290361 total:290361
min:294517 max:294517 total:294517
min:293750 max:293750 total:293750
min:293572 max:293572 total:293572
min:295239 max:295239 total:295239
min:292932 max:292932 total:292932
min:293319 max:293319 total:293319
min:294954 max:294954 total:294954
This was a Completely Unscientific test, but it seems to show gains of around 5-10% in ops per second.
Oliver performed his own tests and showed[3] a similar ~5% gain in them.
[1]: mmap1_process does mmap and munmap in a loop. I didn't bother testing
     multithreading cases.
[2]: https://lore.kernel.org/all/20240807124103.85644-1-mpe@ellerman.id.au/
[3]: https://lore.kernel.org/all/ZrMMJfe9aXSWxJz6@xsang-OptiPlex-9020/
Link: https://lore.kernel.org/all/202408041602.caa0372-oliver.sang@intel.com/
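For readers unfamiliar with the benchmark in [1], its hot loop looks roughly like the sketch below. This is a simplified illustration, not the will-it-scale source: toy_mmap1_loop() is a hypothetical name, and the mapping length, protection and flags are assumptions left to the caller or hard-coded here.

```c
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Run up to `iters` mmap/munmap pairs of `len` bytes each and return how
 * many pairs completed. The mapping is never touched, so the loop mostly
 * exercises the VMA setup and teardown paths.
 */
static unsigned long toy_mmap1_loop(size_t len, unsigned long iters)
{
	unsigned long done = 0;

	assert(len > 0);	/* mmap() rejects zero-length mappings */

	while (done < iters) {
		void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		if (p == MAP_FAILED)
			break;
		if (munmap(p, len) != 0)
			break;
		done++;
	}
	return done;
}
```

Timing a fixed number of iterations of such a loop gives an ops-per-second figure comparable in spirit to the totals above; every munmap() call goes through the path where the seal check used to be a separate walk.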
Changes in v3:
- Moved can_modify_vma to vma.h instead of internal.h (Lorenzo)
- Fixed a bug in munmap where we used the wrong VMA pointer
- Added tests for the previous munmap bug
- Moved the mremap source vma check upwards, to stop us from unmapping
  dest while the source is sealed (Liam)

Changes in v2:
- Rebased on top of mm-unstable
- Removed a superfluous check in mremap (Jeff Xu)
Signed-off-by: Pedro Falcato <pedro.falcato@gmail.com>
---
Pedro Falcato (7):
      mm: Move can_modify_vma to mm/vma.h
      mm/munmap: Replace can_modify_mm with can_modify_vma
      mm/mprotect: Replace can_modify_mm with can_modify_vma
      mm/mremap: Replace can_modify_mm with can_modify_vma
      mseal: Replace can_modify_mm_madv with a vma variant
      mm: Remove can_modify_mm()
      selftests/mm: add more mseal traversal tests
 mm/internal.h                           |  16 -----
 mm/madvise.c                            |  13 +---
 mm/mmap.c                               |  11 +---
 mm/mprotect.c                           |  12 +---
 mm/mremap.c                             |  32 ++-------
 mm/mseal.c                              |  55 ++--------------
 mm/vma.c                                |  19 ++++--
 mm/vma.h                                |  35 ++++++++++
 tools/testing/selftests/mm/mseal_test.c | 111 +++++++++++++++++++++++++++++++-
 9 files changed, 174 insertions(+), 130 deletions(-)
---
base-commit: 651c8c1d735983040bec4f71d0e2e690f3c0fc2d
change-id: 20240816-mseal-depessimize-f39d9f4c32c6
Best regards,