Re: [PATCH v6 5/6] mm: secretmem: use PMD-size pages to amortize direct map fragmentation

29 Sep 2020


On Tue, Sep 29, 2020 at 04:05:29PM +0300, Mike Rapoport wrote:
...
On Fri, Sep 25, 2020 at 09:41:25AM +0200, Peter Zijlstra wrote:
...
On Thu, Sep 24, 2020 at 04:29:03PM +0300, Mike Rapoport wrote:
...
From: Mike Rapoport <rppt@linux.ibm.com>
Removing a PAGE_SIZE page from the direct map every time such a page is
allocated for a secret memory mapping will cause severe fragmentation of
the direct map. This fragmentation can be reduced by using PMD-size pages
as a pool for small pages for secret memory mappings.
Add a gen_pool per secretmem inode and lazily populate this pool with
PMD-size pages.
What's the actual efficacy of this? Since the pmd is per inode, all I
need is a lot of inodes and we're in business to destroy the directmap,
no?
Afaict there's no privs needed to use this, all a process needs is to
stay below the mlock limit, so a 'fork-bomb' that maps a single secret
page will utterly destroy the direct map.
This will indeed cause 1G pages in the direct map to be split into 2M
chunks, but I disagree with the term 'destroy' here. Citing the cover
letter of an earlier version of this series:
It will drop them down to 4k pages. Given enough inodes, and allocating
only a single sekrit page per pmd, we'll shatter the directmap into 4k.
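The arithmetic behind this objection can be sketched with a toy model. It is an illustration under stated assumptions, not the patch's implementation: ToyInodePool, pmd_chunks_touched and inodes_to_shatter are hypothetical names, the sizes are the x86-64 ones (4K, 2M, 1G), and every PMD-size chunk a pool takes is assumed to end up mapped with 4K entries in the direct map, as claimed above.

```python
PAGE_SIZE = 4 * 1024            # 4K base page
PMD_SIZE = 2 * 1024 * 1024      # 2M page on x86-64
PUD_SIZE = 1024 * 1024 * 1024   # 1G page on x86-64
PAGES_PER_PMD = PMD_SIZE // PAGE_SIZE


class ToyInodePool:
    """Hypothetical stand-in for the per-inode pool (not the kernel gen_pool API)."""

    def __init__(self):
        self.free_pages = 0   # 4K pages still available in the current chunk
        self.pmd_chunks = 0   # PMD-size chunks this inode has taken so far

    def alloc_page(self):
        # Lazy population: take a new PMD-size chunk only when the pool is empty.
        if self.free_pages == 0:
            self.pmd_chunks += 1
            self.free_pages = PAGES_PER_PMD
        self.free_pages -= 1


def pmd_chunks_touched(inodes, pages_per_inode):
    """Total PMD-size chunks taken when each inode allocates pages_per_inode pages."""
    total = 0
    for _ in range(inodes):
        pool = ToyInodePool()
        for _ in range(pages_per_inode):
            pool.alloc_page()
        total += pool.pmd_chunks
    return total


def inodes_to_shatter(direct_map_bytes):
    """Single-page inodes needed to touch every PMD-size chunk of a region.

    Assumption of this model: one secret page per inode, one chunk per inode.
    """
    return direct_map_bytes // PMD_SIZE


if __name__ == "__main__":
    print("1 inode, 512 pages :", pmd_chunks_touched(1, PAGES_PER_PMD), "chunk(s)")
    print("512 inodes, 1 page :", pmd_chunks_touched(PAGES_PER_PMD, 1), "chunk(s)")
    print("inodes per 1G page :", inodes_to_shatter(PUD_SIZE))
```

In this model the same 512 secret pages cost one PMD-size chunk when they come from a single inode and 512 chunks when they come from 512 inodes, so the amortization only helps if allocations are concentrated in few inodes.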
...
I've tried to find some numbers that show the benefit of using larger
pages in the direct map, but I couldn't find anything, so I've run a
couple of benchmarks from phoronix-test-suite on my laptop (i7-8650U
with 32G RAM).
Existing benchmarks suck at this, but FB had a load that had a
deterministic enough performance regression to bisect to a directmap
issue, fixed by:
7af0145067bc ("x86/mm/cpa: Prevent large page split when ftrace flips RW on kernel text")
...
I've tested three variants: the default with 28G of the physical
memory covered with 1G pages, then I disabled 1G pages using
"nogbpages" in the kernel command line, and finally I forced the
entire direct map to use 4K pages using a simple patch to
arch/x86/mm/init.c. I've made runs of the benchmarks with SSD and
tmpfs.

Surprisingly, the results do not show a huge advantage for large
pages. For instance, here are the results for a kernel build with
'make -j8', in seconds:
Your benchmark should stress the TLB of your uarch, such that additional
pressure added by the shattered directmap shows up.
And no, I don't have one either.
...
                      |  1G    |  2M    |  4K
----------------------+--------+--------+---------
ssd, mitigations=on   | 308.75 | 317.37 | 314.9
ssd, mitigations=off  | 305.25 | 295.32 | 304.92
ram, mitigations=on   | 301.58 | 322.49 | 306.54
ram, mitigations=off  | 299.32 | 288.44 | 310.65
These results lack error data, but assuming the results are significant,
then this very much makes a case for 1G mappings. 5s on a kernel build
is pretty good.
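To make the comparison easier to read, the per-row differences against the 1G baseline can be computed directly from the table. This is only a convenience sketch: the numbers are copied from the table above, the helper name delta_vs_1g is hypothetical, and since the table has no error data, nothing here says whether any difference is statistically significant.

```python
# Kernel build times in seconds ('make -j8'), copied from the table above.
RESULTS = {
    "ssd, mitigations=on":  {"1G": 308.75, "2M": 317.37, "4K": 314.90},
    "ssd, mitigations=off": {"1G": 305.25, "2M": 295.32, "4K": 304.92},
    "ram, mitigations=on":  {"1G": 301.58, "2M": 322.49, "4K": 306.54},
    "ram, mitigations=off": {"1G": 299.32, "2M": 288.44, "4K": 310.65},
}


def delta_vs_1g(row):
    """Seconds gained (+) or lost (-) relative to the 1G direct map run."""
    base = row["1G"]
    return {size: round(t - base, 2) for size, t in row.items() if size != "1G"}


if __name__ == "__main__":
    for name, row in RESULTS.items():
        d = delta_vs_1g(row)
        print(f"{name:22s} 2M: {d['2M']:+7.2f}s   4K: {d['4K']:+7.2f}s")
```

A positive value means the run with smaller direct map pages was slower than the 1G run. The sign is not consistent across rows, which is one reason the missing error data matters.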
