On Wed, 2020-09-30 at 16:45 +0200, David Hildenbrand wrote:
On 30.09.20 16:39, James Bottomley wrote:
On Wed, 2020-09-30 at 13:27 +0300, Mike Rapoport wrote:
On Tue, Sep 29, 2020 at 05:15:52PM +0200, Peter Zijlstra wrote:
On Tue, Sep 29, 2020 at 05:58:13PM +0300, Mike Rapoport wrote:
On Tue, Sep 29, 2020 at 04:12:16PM +0200, Peter Zijlstra wrote:
It will drop them down to 4k pages. Given enough inodes, and allocating only a single sekrit page per pmd, we'll shatter the directmap into 4k.
Why? Secretmem allocates a PMD-size page per inode and uses it as a pool of 4K pages for that inode. This way it ensures that __kernel_map_pages() is always called on PMD boundaries.
Oh, you unmap the 2m page upfront? I read it like you did the unmap at the sekrit page alloc, not the pool alloc side of things.
Then yes, but you're wasting gobs of memory. Basically you can pin 2M per inode while only accounting a single page.
Right, quite like THP :)
I considered using a global pool of 2M pages for secretmem and handing 4K pages to each inode from that global pool. But I've decided to waste memory in favor of simplicity.
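[Editorial aside: below is a minimal user-space sketch of the per-inode pooling idea discussed above, purely for illustration. It is not the secretmem kernel code; the names are made up, and a plain mmap() stands in for the "reserve one PMD-size region up front" step, so that the costly unmap-from-the-direct-map operation would only ever happen on 2M boundaries.]

```c
/*
 * Illustrative sketch of a per-inode 2M pool that hands out 4K pages.
 * Not the kernel implementation; names and the mmap() stand-in are
 * assumptions for illustration only.
 */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#define POOL_SIZE   (2UL * 1024 * 1024)   /* one PMD-sized pool */
#define PAGE_SIZE_  4096UL
#define POOL_PAGES  (POOL_SIZE / PAGE_SIZE_)

struct secret_pool {
    unsigned char *base;                  /* start of the 2M region */
    unsigned char  used[POOL_PAGES];      /* simple per-page bitmap */
};

/* Reserve the whole 2M region once, at "inode" creation time. */
static int pool_init(struct secret_pool *pool)
{
    pool->base = mmap(NULL, POOL_SIZE, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (pool->base == MAP_FAILED)
        return -1;
    memset(pool->used, 0, sizeof(pool->used));
    return 0;
}

/* Hand out single 4K pages from the already-reserved pool. */
static void *pool_alloc_page(struct secret_pool *pool)
{
    for (size_t i = 0; i < POOL_PAGES; i++) {
        if (!pool->used[i]) {
            pool->used[i] = 1;
            return pool->base + i * PAGE_SIZE_;
        }
    }
    return NULL;  /* pool exhausted; a real pool would grow by another 2M */
}

int main(void)
{
    struct secret_pool pool;

    if (pool_init(&pool))
        return 1;

    void *a = pool_alloc_page(&pool);
    void *b = pool_alloc_page(&pool);
    printf("page a=%p page b=%p (both within one 2M pool)\n", a, b);
    return 0;
}
```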
I can also add that the user space consumer of this we wrote does its user pool allocation at a 2M granularity, so nothing is actually wasted.
... for that specific user space consumer. (or am I missing something?)
I'm not sure I understand what you mean? It's designed to be either the standard wrapper or an example of how to do the standard wrapper for the syscall. It uses the same allocator system glibc uses for malloc/free ... which pretty much everyone uses instead of calling sys_brk directly. If you look at the granularity glibc uses for sys_brk, it's not 4k either.
James
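[Editorial aside: below is a rough sketch of the kind of user-space wrapper James describes, which reserves secret memory from the kernel at 2M granularity and carves smaller allocations out of that, much as malloc() grows the heap in large steps rather than one page at a time. The memfd_secret() call shown follows the interface as it later landed upstream; the syscall number and supported flags for the patch series under discussion may differ, so treat those details as assumptions.]

```c
/*
 * Sketch of a user-space consumer that grows its secret pool 2M at a
 * time and bump-allocates small buffers from it. The memfd_secret()
 * syscall number below is the mainline x86_64 one and is an assumption
 * relative to the patch series being discussed.
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/syscall.h>

#ifndef __NR_memfd_secret
#define __NR_memfd_secret 447           /* x86_64 number in mainline */
#endif

#define CHUNK_SIZE (2UL * 1024 * 1024)  /* grow the pool 2M at a time */

static unsigned char *chunk;            /* current 2M secretmem chunk  */
static size_t chunk_off;                /* bump-allocation offset      */

/* Ask the kernel for another 2M of secret memory. */
static int secret_grow(void)
{
    int fd = syscall(__NR_memfd_secret, 0);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, CHUNK_SIZE) < 0)
        return -1;
    chunk = mmap(NULL, CHUNK_SIZE, PROT_READ | PROT_WRITE,
                 MAP_SHARED, fd, 0);
    close(fd);                          /* the mapping keeps the memory alive */
    if (chunk == MAP_FAILED)
        return -1;
    chunk_off = 0;
    return 0;
}

/* Hand out small secret buffers; assumes size <= CHUNK_SIZE. */
static void *secret_alloc(size_t size)
{
    if (!chunk || chunk_off + size > CHUNK_SIZE)
        if (secret_grow() < 0)
            return NULL;
    void *p = chunk + chunk_off;
    chunk_off += size;
    return p;
}

int main(void)
{
    void *key = secret_alloc(64);
    if (!key) {
        perror("secret_alloc");
        return 1;
    }
    printf("secret buffer at %p, carved from a 2M chunk\n", key);
    return 0;
}
```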