On Wed, Oct 14, 2020 at 12:53:07AM +0000, Kalesh Singh wrote:
HAVE_MOVE_PMD enables remapping pages at the PMD level if both the source and destination addresses are PMD-aligned.
HAVE_MOVE_PMD is already enabled on x86. The original patch [1] that introduced this config did not enable it on arm64 at the time because of performance issues with flushing the TLB on every PMD move. These issues have since been addressed in more recent releases with improvements to the arm64 TLB invalidation and core mmu_gather code as Will Deacon mentioned in [2].
From the data below, it can be inferred that there is approximately 8x improvement in performance when HAVE_MOVE_PMD is enabled on arm64.
--------- Test Results ----------
The following results were obtained on an arm64 device running a 5.4 kernel, by remapping a PMD-aligned, 1GB sized region to a PMD-aligned destination. The results from 10 iterations of the test are given below. All times are in nanoseconds.
Control HAVE_MOVE_PMD
9220833 1247761 9002552 1219896 9254115 1094792 8725885 1227760 9308646 1043698 9001667 1101771 8793385 1159896 8774636 1143594 9553125 1025833 9374010 1078125
9100885.4 1134312.6 <-- Mean Time in nanoseconds
Total mremap time for a 1GB sized PMD-aligned region drops from ~9.1 milliseconds to ~1.1 milliseconds. (~8x speedup).
[1] https://lore.kernel.org/r/20181108181201.88826-3-joelaf@google.com [2] https://www.mail-archive.com/linuxppc-dev@lists.ozlabs.org/msg140837.html
Signed-off-by: Kalesh Singh kaleshsingh@google.com Acked-by: Kirill A. Shutemov kirill.shutemov@linux.intel.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org Cc: Andrew Morton akpm@linux-foundation.org
Changes in v4:
- Add Kirill's Acked-by.
Argh, I thought we already enabled this for PMDs back in 2018! Looks like that we forgot to actually do that after I improved the performance of the TLB invalidation.
I'll pick this one patch up for 5.10.
Will