On Wed, 2023-07-26 at 18:02 -0700, Haitao Huang wrote:
Under heavy load, the SGX EPC reclaimer (ksgxd) may reclaim the SECS EPC
If I read correctly, Dave suggested to not use "high" (heavy in this sentence) or "low" pressure:
https://lore.kernel.org/lkml/op.179a4xs0wjvjmi@hhuan26-mobl.amr.corp.intel.c...
And I agree. For instance, consider this happens to one extremely "small" enclave, while there's a new "big" enclave starts to run. I don't think we should say this is "under heavy load". Just stick to the fact that the reclaimer may reclaim the SECS page.
page for an enclave and set encl->secs.epc_page to NULL. But the SECS EPC page is used for EAUG in the SGX page fault handler without checking for NULL and reloading.
Fix this by checking if SECS is loaded before EAUG and loading it if it was reclaimed.
The SECS page holds global enclave metadata. It can only be reclaimed when there are no other enclave pages remaining. At that point, virtually nothing can be done with the enclave until the SECS page is paged back in.
An enclave can not run nor generate page faults without a resident SECS page.
I am not sure whether "nor generate page faults without a resident SECS page" is accurate? When SECS is swapped out, I suppose the first EENTER should trigger a #PF on the TSC page and in the #PF handler the SECS will be swapped in first.
I guess you can just remove this sentence?
But it is still possible for a #PF for a non-SECS page to race with paging out the SECS page: when the last resident non-SECS page A triggers a #PF in a non-resident page B, and then page A and the SECS both are paged out before the #PF on B is handled.
Hitting this bug requires that race triggered with a #PF for EAUG.
The above race can happen for the normal ELDU path too, thus I suppose it will be better to mention why the normal ELDU path doesn't have this issue: it already does what this fix does.
Following is a trace when it happens.
BUG: kernel NULL pointer dereference, address: 0000000000000000 RIP: 0010:sgx_encl_eaug_page+0xc7/0x210 Call Trace: ? __kmem_cache_alloc_node+0x16a/0x440 ? xa_load+0x6e/0xa0 sgx_vma_fault+0x119/0x230 __do_fault+0x36/0x140 do_fault+0x12f/0x400 __handle_mm_fault+0x728/0x1110 handle_mm_fault+0x105/0x310 do_user_addr_fault+0x1ee/0x750 ? __this_cpu_preempt_check+0x13/0x20 exc_page_fault+0x76/0x180 asm_exc_page_fault+0x27/0x30
Fixes: 5a90d2c3f5ef ("x86/sgx: Support adding of pages to an initialized enclave") Cc: stable@vger.kernel.org # v6.0+ Signed-off-by: Haitao Huang haitao.huang@linux.intel.com Reviewed-by: Jarkko Sakkinen jarkko@kernel.org Acked-by: Reinette Chatre reinette.chatre@intel.com
With above addressed, feel free to add:
Reviewed-by: Kai Huang kai.huang@intel.com