mremap time can be optimized by moving entries at the PMD/PUD level if both the source and destination addresses are PMD/PUD-aligned and the region being moved is PMD/PUD-sized. Enable moving at the PMD and PUD levels on arm64 and x86. Other architectures where this type of move is supported and known to be safe can also opt in to these optimizations by enabling HAVE_MOVE_PMD and HAVE_MOVE_PUD.
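For illustration only (not part of this series), the sketch below shows the kind of remap these optimizations target: a PUD-aligned, 1GB-sized anonymous mapping moved with mremap(MREMAP_MAYMOVE | MREMAP_FIXED). The alignment helper and the timing code are assumptions made for the example and do not reproduce the selftest's exact logic.

/*
 * Minimal userspace sketch: move a PUD-aligned 1GB region with mremap()
 * and time the move. Illustrative only; error handling is minimal.
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>

#define GB (1UL << 30)

/* Reserve len + align bytes and return an address aligned to 'align'. */
static void *mmap_aligned(size_t len, size_t align)
{
	char *p = mmap(NULL, len + align, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;
	return (void *)(((uintptr_t)p + align - 1) & ~(align - 1));
}

int main(void)
{
	void *src = mmap_aligned(GB, GB);	/* PUD-aligned 1GB source */
	void *dst = mmap_aligned(GB, GB);	/* PUD-aligned destination */
	struct timespec t0, t1;
	void *moved;
	long ns;

	if (!src || !dst)
		return 1;

	memset(src, 0xa5, GB);			/* fault in the source pages */

	clock_gettime(CLOCK_MONOTONIC, &t0);
	/* MREMAP_FIXED unmaps whatever is currently at dst and moves src there. */
	moved = mremap(src, GB, GB, MREMAP_MAYMOVE | MREMAP_FIXED, dst);
	clock_gettime(CLOCK_MONOTONIC, &t1);
	if (moved == MAP_FAILED)
		return 1;

	ns = (t1.tv_sec - t0.tv_sec) * 1000000000L +
	     (t1.tv_nsec - t0.tv_nsec);
	printf("1GB mremap took %ld ns\n", ns);
	return 0;
}

Because both addresses are 1GB-aligned and the region is 1GB-sized, the kernel can (with HAVE_MOVE_PUD) move a single PUD entry instead of walking and moving the individual PTEs.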
Observed Performance Improvements for remapping a PUD-aligned 1GB-sized region on x86 and arm64:
- HAVE_MOVE_PMD is already enabled on x86 : N/A
- Enabling HAVE_MOVE_PUD on x86 : ~13x speed up
- Enabling HAVE_MOVE_PMD on arm64 : ~8x speed up
- Enabling HAVE_MOVE_PUD on arm64 : ~19x speed up
Altogether, HAVE_MOVE_PMD and HAVE_MOVE_PUD give a total of ~150x speed up (~8x x ~19x) on arm64 relative to having neither enabled.
Kalesh Singh (5):
  kselftests: vm: Add mremap tests
  arm64: mremap speedup - Enable HAVE_MOVE_PMD
  mm: Speedup mremap on 1GB or larger regions
  arm64: mremap speedup - Enable HAVE_MOVE_PUD
  x86: mremap speedup - Enable HAVE_MOVE_PUD
 arch/Kconfig                             |   7 +
 arch/arm64/Kconfig                       |   2 +
 arch/arm64/include/asm/pgtable.h         |   1 +
 arch/x86/Kconfig                         |   1 +
 mm/mremap.c                              | 211 +++++++++++++++++---
 tools/testing/selftests/vm/.gitignore    |   1 +
 tools/testing/selftests/vm/Makefile      |   1 +
 tools/testing/selftests/vm/mremap_test.c | 243 +++++++++++++++++++++++
 tools/testing/selftests/vm/run_vmtests   |  11 +
 9 files changed, 448 insertions(+), 30 deletions(-)
 create mode 100644 tools/testing/selftests/vm/mremap_test.c