On Wed, 2020-09-30 at 16:45 +0200, David Hildenbrand wrote:
On 30.09.20 16:39, James Bottomley wrote:
On Wed, 2020-09-30 at 13:27 +0300, Mike Rapoport wrote:
On Tue, Sep 29, 2020 at 05:15:52PM +0200, Peter Zijlstra wrote:
On Tue, Sep 29, 2020 at 05:58:13PM +0300, Mike Rapoport wrote:
On Tue, Sep 29, 2020 at 04:12:16PM +0200, Peter Zijlstra wrote:
It will drop them down to 4k pages. Given enough inodes, and allocating only a single sekrit page per pmd, we'll shatter the directmap into 4k.
Why? Secretmem allocates a PMD-size page per inode and uses it as a pool of 4K pages for that inode. This way it ensures that __kernel_map_pages() is always called on PMD boundaries.
Oh, you unmap the 2m page upfront? I read it like you did the unmap at the sekrit page alloc, not the pool alloc side of things.
Then yes, but you're wasting gobs of memory. Basically you can pin 2M per inode while only accounting a single page.
Right, quite like THP :)
I considered using a global pool of 2M pages for secretmem and handing 4K pages to each inode from that global pool. But I've decided to waste memory in favor of simplicity.
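[Editorial aside: below is a minimal user-space sketch of the per-inode pooling idea discussed above, purely for illustration. It is not the secretmem kernel code; the names are made up, and a plain mmap() stands in for the "reserve one PMD-size region up front" step, so that the costly unmap-from-the-direct-map operation would only ever happen on 2M boundaries.]

```c
/*
 * Illustrative sketch of a per-inode 2M pool that hands out 4K pages.
 * Not the kernel implementation; names and the mmap() stand-in are
 * assumptions for illustration only.
 */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#define POOL_SIZE   (2UL * 1024 * 1024)   /* one PMD-sized pool */
#define PAGE_SIZE_  4096UL
#define POOL_PAGES  (POOL_SIZE / PAGE_SIZE_)

struct secret_pool {
    unsigned char *base;                  /* start of the 2M region */
    unsigned char  used[POOL_PAGES];      /* simple per-page bitmap */
};

/* Reserve the whole 2M region once, at "inode" creation time. */
static int pool_init(struct secret_pool *pool)
{
    pool->base = mmap(NULL, POOL_SIZE, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (pool->base == MAP_FAILED)
        return -1;
    memset(pool->used, 0, sizeof(pool->used));
    return 0;
}

/* Hand out single 4K pages from the already-reserved pool. */
static void *pool_alloc_page(struct secret_pool *pool)
{
    for (size_t i = 0; i < POOL_PAGES; i++) {
        if (!pool->used[i]) {
            pool->used[i] = 1;
            return pool->base + i * PAGE_SIZE_;
        }
    }
    return NULL;  /* pool exhausted; a real pool would grow by another 2M */
}

int main(void)
{
    struct secret_pool pool;

    if (pool_init(&pool))
        return 1;

    void *a = pool_alloc_page(&pool);
    void *b = pool_alloc_page(&pool);
    printf("page a=%p page b=%p (both within one 2M pool)\n", a, b);
    return 0;
}
```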
I can also add that the user space consumer of this we wrote does its user pool allocation at a 2M granularity, so nothing is actually wasted.
... for that specific user space consumer. (or am I missing something?)
I'm not sure I understand what you mean? It's designed to be either the standard wrapper or an example of how to do the standard wrapper for the syscall. It uses the same allocator system glibc uses for malloc/free ... which pretty much everyone uses instead of calling sys_brk directly. If you look at the granularity glibc uses for sys_brk, it's not 4k either.
James
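[Editorial aside: below is a rough sketch of the kind of user-space wrapper James describes, which reserves secret memory from the kernel at 2M granularity and carves smaller allocations out of that, much as malloc() grows the heap in large steps rather than one page at a time. The memfd_secret() call shown follows the interface as it later landed upstream; the syscall number and supported flags for the patch series under discussion may differ, so treat those details as assumptions.]

```c
/*
 * Sketch of a user-space consumer that grows its secret pool 2M at a
 * time and bump-allocates small buffers from it. The memfd_secret()
 * syscall number below is the mainline x86_64 one and is an assumption
 * relative to the patch series being discussed.
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/syscall.h>

#ifndef __NR_memfd_secret
#define __NR_memfd_secret 447           /* x86_64 number in mainline */
#endif

#define CHUNK_SIZE (2UL * 1024 * 1024)  /* grow the pool 2M at a time */

static unsigned char *chunk;            /* current 2M secretmem chunk  */
static size_t chunk_off;                /* bump-allocation offset      */

/* Ask the kernel for another 2M of secret memory. */
static int secret_grow(void)
{
    int fd = syscall(__NR_memfd_secret, 0);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, CHUNK_SIZE) < 0)
        return -1;
    chunk = mmap(NULL, CHUNK_SIZE, PROT_READ | PROT_WRITE,
                 MAP_SHARED, fd, 0);
    close(fd);                          /* the mapping keeps the memory alive */
    if (chunk == MAP_FAILED)
        return -1;
    chunk_off = 0;
    return 0;
}

/* Hand out small secret buffers; assumes size <= CHUNK_SIZE. */
static void *secret_alloc(size_t size)
{
    if (!chunk || chunk_off + size > CHUNK_SIZE)
        if (secret_grow() < 0)
            return NULL;
    void *p = chunk + chunk_off;
    chunk_off += size;
    return p;
}

int main(void)
{
    void *key = secret_alloc(64);
    if (!key) {
        perror("secret_alloc");
        return 1;
    }
    printf("secret buffer at %p, carved from a 2M chunk\n", key);
    return 0;
}
```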