Re: [PATCH v2] x86/sgx: Set active memcg prior to shmem allocation

20 May 2022

On Thu, May 19, 2022 at 02:04:45PM -0700, Kristen Carlson Accardi wrote:
...
When the system runs out of enclave memory, SGX can reclaim EPC pages
by swapping to normal RAM. These backing pages are allocated via a
per-enclave shared memory area. Since SGX allows unlimited over
commit on EPC memory, the reclaimer thread can allocate a large
number of backing RAM pages in response to EPC memory pressure.
When the shared memory backing RAM allocation occurs during
the reclaimer thread context, the shared memory is charged to
the root memory control group, and the shmem usage of the enclave
is not properly accounted for, making cgroups ineffective at
limiting the amount of RAM an enclave can consume.
For example, when using a cgroup to launch a set of test
enclaves, the kernel does not properly account for 50% - 75% of
shmem page allocations on average. In the worst case, when
nearly all allocations occur during the reclaimer thread, the
kernel accounts less than a percent of the amount of shmem used
by the enclave's cgroup to the correct cgroup.
SGX stores a list of mm_structs that are associated with
an enclave. Pick one of them during reclaim and charge that
mm's memcg with the shmem allocation. The one that gets picked
is arbitrary, but this list almost always only has one mm. The
cases where there is more than one mm with different memcg's
are not worth considering.
Create a new function - sgx_encl_alloc_backing(). This function
is used whenever a new backing storage page needs to be
allocated. Previously the same function was used for page
allocation as well as retrieving a previously allocated page.
Prior to backing page allocation, if there is a mm_struct associated
with the enclave that is requesting the allocation, it is set
as the active memory control group.
Signed-off-by: Kristen Carlson Accardi kristen@linux.intel.com
V1 -> V2:
 Changed sgx_encl_set_active_memcg() to simply return the correct
 memcg for the enclave and renamed to sgx_encl_get_mem_cgroup().
Created helper function current_is_ksgxd() to improve readability.
Use mmget_not_zero()/mmput_async() when searching mm_list.
Move call to set_active_memcg() to sgx_encl_alloc_backing() and
 use mem_cgroup_put() to avoid leaking a memcg reference.
Address review feedback regarding comments and commit log.
arch/x86/kernel/cpu/sgx/encl.c | 109 ++++++++++++++++++++++++++++++++-
 arch/x86/kernel/cpu/sgx/encl.h |  11 +++-
 arch/x86/kernel/cpu/sgx/main.c |   4 +-
 3 files changed, 118 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 001808e3901c..6d10202612d6 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -32,7 +32,7 @@ static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
   else
   	page_index = PFN_DOWN(encl->size);

ret = sgx_encl_get_backing(encl, page_index, &b);


ret = sgx_encl_lookup_backing(encl, page_index, &b);
if (ret)
return ret;

@@ -574,7 +574,7 @@ static struct page *sgx_encl_get_backing_page(struct sgx_encl *encl,

0 on success,
-errno otherwise.

*/
-int sgx_encl_get_backing(struct sgx_encl *encl, unsigned long page_index,
+static int sgx_encl_get_backing(struct sgx_encl *encl, unsigned long page_index,
   		 struct sgx_backing *backing)
 {
   pgoff_t pcmd_index = PFN_DOWN(encl->size) + 1 + (page_index >> 5);
@@ -601,6 +601,111 @@ int sgx_encl_get_backing(struct sgx_encl *encl, unsigned long page_index,
   return 0;
 }
 
+/*


When called from ksgxd, returns the mem_cgroup of a struct mm stored



in the enclave's mm_list. When not called from ksgxd, just returns



the mem_cgroup of the current task.


*/

+static struct mem_cgroup *sgx_encl_get_mem_cgroup(struct sgx_encl *encl)
+{

struct mem_cgroup *memcg = NULL;
struct sgx_encl_mm *encl_mm;
int idx;

/*
* If called from normal task context, return the mem_cgroup


* of the current task's mm. The remainder of the handling is for


* ksgxd.


*/


if (!current_is_ksgxd())
return get_mem_cgroup_from_mm(current->mm);



/*
* Search the enclave's mm_list to find an mm associated with


* this enclave to charge the allocation to.


*/


idx = srcu_read_lock(&encl->srcu);

list_for_each_entry_rcu(encl_mm, &encl->mm_list, list) {
if (!mmget_not_zero(encl_mm->mm))


	continue;



memcg = get_mem_cgroup_from_mm(encl_mm->mm);



mmput_async(encl_mm->mm);



break;


}

srcu_read_unlock(&encl->srcu, idx);

/*
* In the rare case that there isn't an mm associated with


* the enclave, set memcg to the current active mem_cgroup.


* This will be the root mem_cgroup if there is no active


* mem_cgroup.


*/


if (!memcg)
return get_mem_cgroup_from_mm(NULL);



return memcg;

+}
You can simplify the function a bit. But it's up to you, not a strong
opinion.
static struct mem_cgroup *sgx_encl_get_mem_cgroup(struct sgx_encl *encl)
{
    struct mem_cgroup *memcg = NULL;
    struct sgx_encl_mm *encl_mm;
    int idx;
if (current_is_ksgxd()) {
    	/*
    	 * Search the enclave's mm_list to find an mm associated with
    	 * this enclave to charge the allocation to.
    	 */
    	idx = srcu_read_lock(&encl->srcu);
    	list_for_each_entry_rcu(encl_mm, &encl->mm_list, list) {
    		if (!mmget_not_zero(encl_mm->mm))
    			continue;
memcg = get_mem_cgroup_from_mm(encl_mm->mm);
    		mmput_async(encl_mm->mm);
    		break;
    	}
    	srcu_read_unlock(&encl->srcu, idx);
    }
return memcg ? memcg : get_mem_cgroup_from_mm(current->mm);
}
--
The rest of the patch looks good to me. Please, feel free to add:
Acked-by: Roman Gushchin roman.gushchin@linux.dev
Thanks!

    

2025

2024

2023

2022

2021

2020

2019

2018

2017

Re: [PATCH v2] x86/sgx: Set active memcg prior to shmem allocation

Signed-off-by: Kristen Carlson Accardi kristen@linux.intel.com