On Fri, Sep 25, 2020 at 11:50:29AM +0200, Peter Zijlstra wrote:
On Fri, Sep 25, 2020 at 11:00:30AM +0200, David Hildenbrand wrote:
On 25.09.20 09:41, Peter Zijlstra wrote:
On Thu, Sep 24, 2020 at 04:29:03PM +0300, Mike Rapoport wrote:
From: Mike Rapoport <rppt@linux.ibm.com>
Removing a PAGE_SIZE page from the direct map every time such a page is allocated for a secret memory mapping will cause severe fragmentation of the direct map. This fragmentation can be reduced by using PMD-size pages as a pool for small pages for secret memory mappings.
Add a gen_pool per secretmem inode and lazily populate this pool with PMD-size pages.
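As a rough illustration of the pooling idea described above, here is a minimal user-space sketch. It is not the kernel's gen_pool API and not the patch's code; struct toy_pool, toy_pool_alloc_page() and the TOY_* constants are hypothetical names, and the sizes assume x86-64 with 4 KiB base pages and 2 MiB PMD-size pages.

```c
#include <stddef.h>

/* Assumed x86-64 geometry: 4 KiB base pages, 2 MiB PMD-size pages. */
#define TOY_PAGE_SIZE      4096UL
#define TOY_PMD_SIZE       (2UL * 1024 * 1024)
#define TOY_PAGES_PER_PMD  (TOY_PMD_SIZE / TOY_PAGE_SIZE)   /* 512 */

/* Hypothetical stand-in for a per-inode pool; not the kernel's gen_pool. */
struct toy_pool {
	size_t free_pages;   /* small pages still available in the pool */
	size_t pmd_chunks;   /* PMD-size chunks taken from the direct map */
};

/*
 * Hand out one small page. The pool is populated lazily: a new
 * PMD-size chunk is accounted only when no small page is left.
 */
static void toy_pool_alloc_page(struct toy_pool *pool)
{
	if (pool->free_pages == 0) {
		pool->pmd_chunks++;
		pool->free_pages = TOY_PAGES_PER_PMD;
	}
	pool->free_pages--;
}
```

Under these assumptions, the first 512 small-page allocations from one pool cost a single PMD-size hole in the direct map rather than 512 scattered PAGE_SIZE holes.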
What's the actual efficacy of this? Since the pmd is per inode, all I need is a lot of inodes and we're in business to destroy the directmap, no?
Afaict there's no privs needed to use this, all a process needs is to stay below the mlock limit, so a 'fork-bomb' that maps a single secret page will utterly destroy the direct map.
I really don't like this, at all.
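To put rough numbers on the concern above, the following sketch compares what is charged against the mlock limit with what leaves the direct map when every inode maps a single secret page. It assumes each per-inode pool takes one full PMD-size chunk on its first allocation, and x86-64 sizes (4 KiB pages, 2 MiB PMD); the function and macro names are hypothetical.

```c
/* Assumed x86-64 geometry: 4 KiB base pages, 2 MiB PMD-size pages. */
#define SKETCH_PAGE_SIZE  4096ULL
#define SKETCH_PMD_SIZE   (2ULL * 1024 * 1024)

/* Bytes of secret memory actually mapped: one small page per inode. */
static unsigned long long secret_bytes_mapped(unsigned long long nr_inodes)
{
	return nr_inodes * SKETCH_PAGE_SIZE;
}

/*
 * Bytes removed from the direct map, assuming every inode's pool
 * pulled in one PMD-size chunk to satisfy that single page.
 */
static unsigned long long directmap_bytes_removed(unsigned long long nr_inodes)
{
	return nr_inodes * SKETCH_PMD_SIZE;
}
```

With 1024 inodes, this model gives 4 MiB of mapped secret memory against 2 GiB taken out of the direct map, a factor of 512.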
As I expressed earlier, I would prefer allowing allocation of secretmem only from a previously defined CMA area. This would physically locally limit the pain.
Given that this thing doesn't have a migrate hook, that seems like an eminently reasonable constraint. Because not only will it mess up the directmap, it will also destroy the ability of the page-allocator / compaction to re-form high order blocks by sprinkling holes throughout.
Also, this is all very close to XPFO, yet I don't see that mentioned anywhere.
It's close to XPFO in the sense that it removes pages from the kernel page table. But unlike XPFO, memfd_secret() does not allow access to these pages in the kernel until they are freed by the user. And, unlike XPFO, it does not require TLB flushing all over the place.
Further still, it has this HAVE_SECRETMEM_UNCACHED nonsense which is completely unused. I'm not at all sure exposing UNCACHED to random userspace is a sane idea.
The uncached mappings were originally proposed as a means "... to prevent or considerably restrict speculation on such pages" [1] as a comment to my initial proposal to use mmap(MAP_EXCLUSIVE).
I've added the ability to create uncached mappings into the fd-based implementation of the exclusive mappings, as it can indeed reduce the availability of side channels and the implementation was quite straightforward.
[1] https://lore.kernel.org/linux-mm/2236FBA76BA1254E88B949DDB74E612BA4EEC0CE@IR...