Hello,
This is the second version for maturing the OP-TEE mediator.
Changes also can be pulled from [2].
Changes from v1: - Added patch that updates SUPPORT.md - Instead of removing "experimental" status I changed it to "Tech Preview" - Other changes are described in the corresponding patches
Cover letter for v1:
This patch series fixes various unfinished items in the OP-TEE mediator. Mostly this is about limiting resources that guest can consume. This includes both memory and time - how many buffers guest can share with OP-TEE (this uses Xen memory) and when mediator should preempt itself, to make sure that guest does not stress scheduling.
Apart from this, there were one case, when mediator's actions might lead to memory leak in a good-behaving guest. To fix this issue I had to extend mediator logic, so now it can issue RPC requests to guest in the same way, as OP-TEE does this. This is useful feature, because it allows to preempt mediator during long operations. So, in the future it will be possible to remove shared buffer size limitation, because mediator can preempt self during buffer translation.
This patch series can be pulled from [1].
[1] https://github.com/lorc/xen/tree/optee3_v1 [2] https://github.com/lorc/xen/tree/optee3_v2
Volodymyr Babchuk (6): xen/arm: optee: impose limit on shared buffer size xen/arm: optee: check for preemption while freeing shared buffers xen/arm: optee: limit number of shared buffers xen/arm: optee: handle shared buffer translation error SUPPORT.md: Describe OP-TEE mediator xen/arm: optee: update description in Kconfig
SUPPORT.md | 4 + xen/arch/arm/tee/Kconfig | 12 +- xen/arch/arm/tee/optee.c | 259 ++++++++++++++++++++++++++++++--------- 3 files changed, 213 insertions(+), 62 deletions(-)
We want to limit number of calls to lookup_and_pin_guest_ram_addr() per one request. There are two ways to do this: either preempt translate_noncontig() or limit size of one shared buffer size.
It is quite hard to preempt translate_noncontig(), because it is deep nested. So we chose the second option. We will allow 129 pages per one shared buffer. This corresponds to the GP standard, as it requires that size limit for shared buffer should be at least 512kB. One extra page (129th) is needed to cope with the fact that user's buffer is not necessary aligned with page boundary.
Also, with this limitation OP-TEE still passes own "xtest" test suite, so this is okay for now.
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com
---
Changes from v1: - Added comment before BUILD_BUG_ON(PAGE_SIZE != 4096); - Fixed typo in the commit message - Decreased MAX_SHM_BUFFER_PG to 129 --- xen/arch/arm/tee/optee.c | 44 +++++++++++++++++++++++++++++----------- 1 file changed, 32 insertions(+), 12 deletions(-)
diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c index ec5402e89b..d64e9c3b85 100644 --- a/xen/arch/arm/tee/optee.c +++ b/xen/arch/arm/tee/optee.c @@ -72,6 +72,19 @@ */ #define MAX_TOTAL_SMH_BUF_PG 16384
+/* + * Limit for shared buffer size. Please note that this define limits + * number of pages. But user buffer can be not aligned to a page + * boundary. So it is possible that user would not be able to share + * exactly MAX_SHM_BUFFER_PG * PAGE_SIZE bytes with OP-TEE. + * + * Global Platform specification for TEE requires that any TEE + * implementation should allow to share buffers with size of at least + * 512KB, which equals to 128 4KB pages. Due to align issue mentioned + * above, we need to increase this value to 129. + */ +#define MAX_SHM_BUFFER_PG 129 + #define OPTEE_KNOWN_NSEC_CAPS OPTEE_SMC_NSEC_CAP_UNIPROCESSOR #define OPTEE_KNOWN_SEC_CAPS (OPTEE_SMC_SEC_CAP_HAVE_RESERVED_SHM | \ OPTEE_SMC_SEC_CAP_UNREGISTERED_SHM | \ @@ -697,16 +710,29 @@ static int translate_noncontig(struct optee_domain *ctx, size = ROUNDUP(param->u.tmem.size + offset, OPTEE_MSG_NONCONTIG_PAGE_SIZE);
pg_count = DIV_ROUND_UP(size, OPTEE_MSG_NONCONTIG_PAGE_SIZE); + if ( pg_count > MAX_SHM_BUFFER_PG ) + return -ENOMEM; + order = get_order_from_bytes(get_pages_list_size(pg_count));
/* - * In the worst case we will want to allocate 33 pages, which is - * MAX_TOTAL_SMH_BUF_PG/511 rounded up. This gives order 6 or at - * most 64 pages allocated. This buffer will be freed right after - * the end of the call and there can be no more than + * In the worst case we will want to allocate 1 page, which is + * MAX_SHM_BUFFER_PG/511 rounded up. This buffer will be freed + * right after the end of the call and there can be no more than * max_optee_threads calls simultaneously. So in the worst case - * guest can trick us to allocate 64 * max_optee_threads pages in + * guest can trick us to allocate 1 * max_optee_threads pages in * total. + * + * It may seem strange to have such complex calculations if we + * always will allocate exactly one page. Those calculations exist + * in the first place because earlier there were bigger limit for + * shared buffer size, so there were cases, when we needed more + * that one page there. Right now this is not true, but this code + * remains for two reasons: + * - Users can change MAX_SHM_BUFFER_PG to a higher value, in which + * case they will need this code. + * - There is a plan to implement preemption in the code below, which + * will allow use to increase default MAX_SHM_BUFFER_PG value. */ xen_pgs = alloc_domheap_pages(current->domain, order, 0); if ( !xen_pgs ) @@ -747,13 +773,7 @@ static int translate_noncontig(struct optee_domain *ctx, xen_data = __map_domain_page(xen_pgs); }
- /* - * TODO: That function can pin up to 64MB of guest memory by - * calling lookup_and_pin_guest_ram_addr() 16384 times - * (assuming that PAGE_SIZE equals to 4096). - * This should be addressed before declaring OP-TEE security - * supported. - */ + /* Only 4k pages are supported right now */ BUILD_BUG_ON(PAGE_SIZE != 4096); page = get_domain_ram_page(gaddr_to_gfn(guest_data->pages_list[idx])); if ( !page )
Hi Volodymyr,
On 18/09/2019 19:50, Volodymyr Babchuk wrote:
We want to limit number of calls to lookup_and_pin_guest_ram_addr() per one request. There are two ways to do this: either preempt translate_noncontig() or limit size of one shared buffer size.
It is quite hard to preempt translate_noncontig(), because it is deep nested. So we chose the second option. We will allow 129 pages per one shared buffer. This corresponds to the GP standard, as it requires that size limit for shared buffer should be at least 512kB. One extra page (129th) is needed to cope with the fact that user's buffer is not necessary aligned with page boundary.
Also, with this limitation OP-TEE still passes own "xtest" test suite, so this is okay for now.
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com
Changes from v1:
- Added comment before BUILD_BUG_ON(PAGE_SIZE != 4096);
- Fixed typo in the commit message
- Decreased MAX_SHM_BUFFER_PG to 129
xen/arch/arm/tee/optee.c | 44 +++++++++++++++++++++++++++++----------- 1 file changed, 32 insertions(+), 12 deletions(-)
diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c index ec5402e89b..d64e9c3b85 100644 --- a/xen/arch/arm/tee/optee.c +++ b/xen/arch/arm/tee/optee.c @@ -72,6 +72,19 @@ */ #define MAX_TOTAL_SMH_BUF_PG 16384 +/*
- Limit for shared buffer size. Please note that this define limits
- number of pages. But user buffer can be not aligned to a page
- boundary. So it is possible that user would not be able to share
- exactly MAX_SHM_BUFFER_PG * PAGE_SIZE bytes with OP-TEE.
- Global Platform specification for TEE requires that any TEE
- implementation should allow to share buffers with size of at least
- 512KB, which equals to 128 4KB pages. Due to align issue mentioned
- above, we need to increase this value to 129.
- */
+#define MAX_SHM_BUFFER_PG 129
- #define OPTEE_KNOWN_NSEC_CAPS OPTEE_SMC_NSEC_CAP_UNIPROCESSOR #define OPTEE_KNOWN_SEC_CAPS (OPTEE_SMC_SEC_CAP_HAVE_RESERVED_SHM | \ OPTEE_SMC_SEC_CAP_UNREGISTERED_SHM | \
@@ -697,16 +710,29 @@ static int translate_noncontig(struct optee_domain *ctx, size = ROUNDUP(param->u.tmem.size + offset, OPTEE_MSG_NONCONTIG_PAGE_SIZE); pg_count = DIV_ROUND_UP(size, OPTEE_MSG_NONCONTIG_PAGE_SIZE);
- if ( pg_count > MAX_SHM_BUFFER_PG )
return -ENOMEM;
order = get_order_from_bytes(get_pages_list_size(pg_count));
/*
* In the worst case we will want to allocate 33 pages, which is
* MAX_TOTAL_SMH_BUF_PG/511 rounded up. This gives order 6 or at
* most 64 pages allocated. This buffer will be freed right after
* the end of the call and there can be no more than
* In the worst case we will want to allocate 1 page, which is
* MAX_SHM_BUFFER_PG/511 rounded up. This buffer will be freed
* right after the end of the call and there can be no more than * max_optee_threads calls simultaneously. So in the worst case
* guest can trick us to allocate 64 * max_optee_threads pages in
* guest can trick us to allocate 1 * max_optee_threads pages in * total.
*
* It may seem strange to have such complex calculations if we
* always will allocate exactly one page. Those calculations exist
* in the first place because earlier there were bigger limit for
* shared buffer size, so there were cases, when we needed more
* that one page there. Right now this is not true, but this code
* remains for two reasons:
* - Users can change MAX_SHM_BUFFER_PG to a higher value, in which
* case they will need this code.
* - There is a plan to implement preemption in the code below, which
* will allow use to increase default MAX_SHM_BUFFER_PG value. */ xen_pgs = alloc_domheap_pages(current->domain, order, 0); if ( !xen_pgs )
@@ -747,13 +773,7 @@ static int translate_noncontig(struct optee_domain *ctx, xen_data = __map_domain_page(xen_pgs); }
/*
* TODO: That function can pin up to 64MB of guest memory by
* calling lookup_and_pin_guest_ram_addr() 16384 times
* (assuming that PAGE_SIZE equals to 4096).
* This should be addressed before declaring OP-TEE security
* supported.
*/
/* Only 4k pages are supported right now */
NIT: s/4k/4KB/ to stay consistent.
BUILD_BUG_ON(PAGE_SIZE != 4096); page = get_domain_ram_page(gaddr_to_gfn(guest_data->pages_list[idx])); if ( !page )
Cheers,
We can check for hypercall_preempt_check() in the loop inside optee_relinquish_resources() to increase hypervisor responsiveness in case if preemption is required.
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com
---
Changes from v1: - Removed extra hypercall_preempt_check() - Updated the commit message --- xen/arch/arm/tee/optee.c | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-)
diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c index d64e9c3b85..55d11b91a9 100644 --- a/xen/arch/arm/tee/optee.c +++ b/xen/arch/arm/tee/optee.c @@ -633,17 +633,14 @@ static int optee_relinquish_resources(struct domain *d) list_for_each_entry_safe( shm_rpc, shm_rpc_tmp, &ctx->shm_rpc_list, list ) free_shm_rpc(ctx, shm_rpc->cookie);
- if ( hypercall_preempt_check() ) - return -ERESTART; - - /* - * TODO: Guest can pin up to MAX_TOTAL_SMH_BUF_PG pages and all of - * them will be put in this loop. It is worth considering to - * check for preemption inside the loop. - */ list_for_each_entry_safe( optee_shm_buf, optee_shm_buf_tmp, &ctx->optee_shm_buf_list, list ) + { + if ( hypercall_preempt_check() ) + return -ERESTART; + free_optee_shm_buf(ctx, optee_shm_buf->cookie); + }
if ( hypercall_preempt_check() ) return -ERESTART;
Hi Volodymyr,
On 18/09/2019 19:50, Volodymyr Babchuk wrote:
We can check for hypercall_preempt_check() in the loop inside optee_relinquish_resources() to increase hypervisor responsiveness in case if preemption is required.
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com
Acked-by: Julien Grall julien.grall@arm.com
Cheers,
Changes from v1:
- Removed extra hypercall_preempt_check()
- Updated the commit message
xen/arch/arm/tee/optee.c | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-)
diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c index d64e9c3b85..55d11b91a9 100644 --- a/xen/arch/arm/tee/optee.c +++ b/xen/arch/arm/tee/optee.c @@ -633,17 +633,14 @@ static int optee_relinquish_resources(struct domain *d) list_for_each_entry_safe( shm_rpc, shm_rpc_tmp, &ctx->shm_rpc_list, list ) free_shm_rpc(ctx, shm_rpc->cookie);
- if ( hypercall_preempt_check() )
return -ERESTART;
- /*
* TODO: Guest can pin up to MAX_TOTAL_SMH_BUF_PG pages and all of
* them will be put in this loop. It is worth considering to
* check for preemption inside the loop.
*/ list_for_each_entry_safe( optee_shm_buf, optee_shm_buf_tmp, &ctx->optee_shm_buf_list, list )
- {
if ( hypercall_preempt_check() )
return -ERESTART;
free_optee_shm_buf(ctx, optee_shm_buf->cookie);
- }
if ( hypercall_preempt_check() ) return -ERESTART;
We want to limit number of shared buffers that guest can register in OP-TEE. Every such buffer consumes XEN resources and we don't want guest to exhaust XEN. So we choose arbitrary limit for shared buffers.
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com --- xen/arch/arm/tee/optee.c | 30 +++++++++++++++++++++++------- 1 file changed, 23 insertions(+), 7 deletions(-)
diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c index 55d11b91a9..88be959819 100644 --- a/xen/arch/arm/tee/optee.c +++ b/xen/arch/arm/tee/optee.c @@ -85,6 +85,14 @@ */ #define MAX_SHM_BUFFER_PG 129
+/* + * Limits the number of shared buffers that guest can have at once. + * This is to prevent case, when guests tricks XEN into exhausting + * own memory by allocating zillions of one-byte buffers. Value is + * chosen arbitrary. + */ +#define MAX_SHM_BUFFER_COUNT 16 + #define OPTEE_KNOWN_NSEC_CAPS OPTEE_SMC_NSEC_CAP_UNIPROCESSOR #define OPTEE_KNOWN_SEC_CAPS (OPTEE_SMC_SEC_CAP_HAVE_RESERVED_SHM | \ OPTEE_SMC_SEC_CAP_UNREGISTERED_SHM | \ @@ -146,6 +154,7 @@ struct optee_domain { struct list_head optee_shm_buf_list; atomic_t call_count; atomic_t optee_shm_buf_pages; + atomic_t optee_shm_buf_count; spinlock_t lock; };
@@ -233,6 +242,7 @@ static int optee_domain_init(struct domain *d) INIT_LIST_HEAD(&ctx->optee_shm_buf_list); atomic_set(&ctx->call_count, 0); atomic_set(&ctx->optee_shm_buf_pages, 0); + atomic_set(&ctx->optee_shm_buf_count, 0); spin_lock_init(&ctx->lock);
d->arch.tee = ctx; @@ -481,23 +491,26 @@ static struct optee_shm_buf *allocate_optee_shm_buf(struct optee_domain *ctx, struct optee_shm_buf *optee_shm_buf, *optee_shm_buf_tmp; int old, new; int err_code; + int count; + + count = atomic_add_unless(&ctx->optee_shm_buf_count, 1, + MAX_SHM_BUFFER_COUNT); + if ( count == MAX_SHM_BUFFER_COUNT ) + return ERR_PTR(-ENOMEM);
do { old = atomic_read(&ctx->optee_shm_buf_pages); new = old + pages_cnt; if ( new >= MAX_TOTAL_SMH_BUF_PG ) - return ERR_PTR(-ENOMEM); + { + err_code = -ENOMEM; + goto err_dec_cnt; + } } while ( unlikely(old != atomic_cmpxchg(&ctx->optee_shm_buf_pages, old, new)) );
- /* - * TODO: Guest can try to register many small buffers, thus, forcing - * XEN to allocate context for every buffer. Probably we need to - * limit not only total number of pages pinned but also number - * of buffer objects. - */ optee_shm_buf = xzalloc_bytes(sizeof(struct optee_shm_buf) + pages_cnt * sizeof(struct page *)); if ( !optee_shm_buf ) @@ -533,6 +546,8 @@ static struct optee_shm_buf *allocate_optee_shm_buf(struct optee_domain *ctx, err: xfree(optee_shm_buf); atomic_sub(pages_cnt, &ctx->optee_shm_buf_pages); +err_dec_cnt: + atomic_dec(&ctx->optee_shm_buf_count);
return ERR_PTR(err_code); } @@ -575,6 +590,7 @@ static void free_optee_shm_buf(struct optee_domain *ctx, uint64_t cookie) free_pg_list(optee_shm_buf);
atomic_sub(optee_shm_buf->page_cnt, &ctx->optee_shm_buf_pages); + atomic_dec(&ctx->optee_shm_buf_count);
xfree(optee_shm_buf); }
Hi Volodymyr,
On 18/09/2019 19:50, Volodymyr Babchuk wrote:
We want to limit number of shared buffers that guest can register in OP-TEE. Every such buffer consumes XEN resources and we don't want guest to exhaust XEN. So we choose arbitrary limit for shared buffers.
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com
Acked-by: Julien Grall julien.grall@arm.com
Cheers,
There is a case possible, when OP-TEE asks guest to allocate shared buffer, but Xen for some reason can't translate buffer's addresses. In this situation we should do two things:
1. Tell guest to free allocated buffer, so there will be no memory leak for guest.
2. Tell OP-TEE that buffer allocation failed.
To ask guest to free allocated buffer we should perform the same thing, as OP-TEE does - issue RPC request. This is done by filling request buffer (luckily we can reuse the same buffer, that OP-TEE used to issue original request) and then return to guest with special return code.
Then we need to handle next call from guest in a special way: as RPC was issued by Xen, not by OP-TEE, it should be handled by Xen. Basically, this is the mechanism to preempt OP-TEE mediator.
The same mechanism can be used in the future to preempt mediator during translation large (>512 pages) shared buffers.
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com
---
Changes from v1: - Renamed OPTEEM_CALL_* to OPTEE_CALL_* - Fixed comments - Added ASSERT() in handle_xen_rpc_return() --- xen/arch/arm/tee/optee.c | 172 ++++++++++++++++++++++++++++++++------- 1 file changed, 141 insertions(+), 31 deletions(-)
diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c index 88be959819..0a3205f9e8 100644 --- a/xen/arch/arm/tee/optee.c +++ b/xen/arch/arm/tee/optee.c @@ -98,6 +98,11 @@ OPTEE_SMC_SEC_CAP_UNREGISTERED_SHM | \ OPTEE_SMC_SEC_CAP_DYNAMIC_SHM)
+enum optee_call_state { + OPTEE_CALL_NORMAL, + OPTEE_CALL_XEN_RPC, +}; + static unsigned int __read_mostly max_optee_threads;
/* @@ -114,6 +119,9 @@ struct optee_std_call { paddr_t guest_arg_ipa; int optee_thread_id; int rpc_op; + /* Saved buffer type for the current buffer allocate request */ + unsigned int rpc_buffer_type; + enum optee_call_state state; uint64_t rpc_data_cookie; bool in_flight; register_t rpc_params[2]; @@ -301,6 +309,7 @@ static struct optee_std_call *allocate_std_call(struct optee_domain *ctx)
call->optee_thread_id = -1; call->in_flight = true; + call->state = OPTEE_CALL_NORMAL;
spin_lock(&ctx->lock); list_add_tail(&call->list, &ctx->call_list); @@ -1086,6 +1095,10 @@ static int handle_rpc_return(struct optee_domain *ctx, ret = -ERESTART; }
+ /* Save the buffer type in case we will want to free it */ + if ( shm_rpc->xen_arg->cmd == OPTEE_RPC_CMD_SHM_ALLOC ) + call->rpc_buffer_type = shm_rpc->xen_arg->params[0].u.value.a; + unmap_domain_page(shm_rpc->xen_arg); }
@@ -1250,18 +1263,107 @@ err: return; }
+/* + * Prepare RPC request to free shared buffer in the same way, as + * OP-TEE does this. + * + * Return values: + * true - successfully prepared RPC request + * false - there was an error + */ +static bool issue_rpc_cmd_free(struct optee_domain *ctx, + struct cpu_user_regs *regs, + struct optee_std_call *call, + struct shm_rpc *shm_rpc, + uint64_t cookie) +{ + register_t r1, r2; + + /* In case if guest will forget to update it with meaningful value */ + shm_rpc->xen_arg->ret = TEEC_ERROR_GENERIC; + shm_rpc->xen_arg->cmd = OPTEE_RPC_CMD_SHM_FREE; + shm_rpc->xen_arg->num_params = 1; + shm_rpc->xen_arg->params[0].attr = OPTEE_MSG_ATTR_TYPE_VALUE_INPUT; + shm_rpc->xen_arg->params[0].u.value.a = call->rpc_buffer_type; + shm_rpc->xen_arg->params[0].u.value.b = cookie; + + if ( access_guest_memory_by_ipa(current->domain, + gfn_to_gaddr(shm_rpc->gfn), + shm_rpc->xen_arg, + OPTEE_MSG_GET_ARG_SIZE(1), + true) ) + { + /* + * Well, this is quite bad. We have error in the error + * path. This can happen only if guest behaves badly, so all + * we can do is to return error to OP-TEE and leave guest's + * memory leaked. We already have freed all resources + * allocated for this buffer, but guest will never receive + * OPTEE_RPC_CMD_SHM_FREE request, so it will not know that it + * can release allocated buffer. + */ + shm_rpc->xen_arg->ret = TEEC_ERROR_GENERIC; + shm_rpc->xen_arg->num_params = 0; + + return false; + } + + uint64_to_regpair(&r1, &r2, shm_rpc->cookie); + + call->state = OPTEE_CALL_XEN_RPC; + call->rpc_op = OPTEE_SMC_RPC_FUNC_CMD; + call->rpc_params[0] = r1; + call->rpc_params[1] = r2; + call->optee_thread_id = get_user_reg(regs, 3); + + set_user_reg(regs, 0, OPTEE_SMC_RETURN_RPC_CMD); + set_user_reg(regs, 1, r1); + set_user_reg(regs, 2, r2); + + return true; +} + +/* Handles return from Xen-issued RPC */ +static void handle_xen_rpc_return(struct optee_domain *ctx, + struct cpu_user_regs *regs, + struct optee_std_call *call, + struct shm_rpc *shm_rpc) +{ + call->state = OPTEE_CALL_NORMAL; + + /* + * Right now we have only one reason to be there - we asked guest + * to free shared buffer and it did it. Now we can tell OP-TEE + * that buffer allocation failed. We are not storing exact command + * type, only type of RPC return. So, this is the only check we + * can perform there. + */ + ASSERT(call->rpc_op == OPTEE_SMC_RPC_FUNC_CMD); + + /* + * We are not checking return value from a guest because we assume + * that OPTEE_RPC_CMD_SHM_FREE never fails. + */ + shm_rpc->xen_arg->ret = TEEC_ERROR_GENERIC; + shm_rpc->xen_arg->num_params = 0; +} + /* * This function is called when guest is finished processing RPC * request from OP-TEE and wished to resume the interrupted standard * call. + * + * Return values: + * false - there was an error, do not call OP-TEE + * true - success, proceed as normal */ -static void handle_rpc_cmd_alloc(struct optee_domain *ctx, +static bool handle_rpc_cmd_alloc(struct optee_domain *ctx, struct cpu_user_regs *regs, struct optee_std_call *call, struct shm_rpc *shm_rpc) { if ( shm_rpc->xen_arg->ret || shm_rpc->xen_arg->num_params != 1 ) - return; + return true;
if ( shm_rpc->xen_arg->params[0].attr != (OPTEE_MSG_ATTR_TYPE_TMEM_OUTPUT | OPTEE_MSG_ATTR_NONCONTIG) ) @@ -1269,7 +1371,7 @@ static void handle_rpc_cmd_alloc(struct optee_domain *ctx, gdprintk(XENLOG_WARNING, "Invalid attrs for shared mem buffer: %"PRIx64"\n", shm_rpc->xen_arg->params[0].attr); - return; + return true; }
/* Free pg list for buffer */ @@ -1285,21 +1387,14 @@ static void handle_rpc_cmd_alloc(struct optee_domain *ctx, { call->rpc_data_cookie = 0; /* - * Okay, so there was problem with guest's buffer and we need - * to tell about this to OP-TEE. - */ - shm_rpc->xen_arg->ret = TEEC_ERROR_GENERIC; - shm_rpc->xen_arg->num_params = 0; - /* - * TODO: With current implementation, OP-TEE will not issue - * RPC to free this buffer. Guest and OP-TEE will be out of - * sync: guest believes that it provided buffer to OP-TEE, - * while OP-TEE thinks of opposite. Ideally, we need to - * emulate RPC with OPTEE_MSG_RPC_CMD_SHM_FREE command. + * We are unable to translate guest's buffer, so we need tell guest + * to free it, before reporting an error to OP-TEE. */ - gprintk(XENLOG_WARNING, - "translate_noncontig() failed, OP-TEE/guest state is out of sync.\n"); + return !issue_rpc_cmd_free(ctx, regs, call, shm_rpc, + shm_rpc->xen_arg->params[0].u.tmem.shm_ref); } + + return true; }
static void handle_rpc_cmd(struct optee_domain *ctx, struct cpu_user_regs *regs, @@ -1349,22 +1444,37 @@ static void handle_rpc_cmd(struct optee_domain *ctx, struct cpu_user_regs *regs, goto out; }
- switch (shm_rpc->xen_arg->cmd) + if ( call->state == OPTEE_CALL_NORMAL ) { - case OPTEE_RPC_CMD_GET_TIME: - case OPTEE_RPC_CMD_WAIT_QUEUE: - case OPTEE_RPC_CMD_SUSPEND: - break; - case OPTEE_RPC_CMD_SHM_ALLOC: - handle_rpc_cmd_alloc(ctx, regs, call, shm_rpc); - break; - case OPTEE_RPC_CMD_SHM_FREE: - free_optee_shm_buf(ctx, shm_rpc->xen_arg->params[0].u.value.b); - if ( call->rpc_data_cookie == shm_rpc->xen_arg->params[0].u.value.b ) - call->rpc_data_cookie = 0; - break; - default: - break; + switch (shm_rpc->xen_arg->cmd) + { + case OPTEE_RPC_CMD_GET_TIME: + case OPTEE_RPC_CMD_WAIT_QUEUE: + case OPTEE_RPC_CMD_SUSPEND: + break; + case OPTEE_RPC_CMD_SHM_ALLOC: + if ( !handle_rpc_cmd_alloc(ctx, regs, call, shm_rpc) ) + { + /* We failed to translate buffer, report back to guest */ + unmap_domain_page(shm_rpc->xen_arg); + put_std_call(ctx, call); + + return; + } + break; + case OPTEE_RPC_CMD_SHM_FREE: + free_optee_shm_buf(ctx, shm_rpc->xen_arg->params[0].u.value.b); + if ( call->rpc_data_cookie == + shm_rpc->xen_arg->params[0].u.value.b ) + call->rpc_data_cookie = 0; + break; + default: + break; + } + } + else + { + handle_xen_rpc_return(ctx, regs, call, shm_rpc); }
out:
Hi,
On 18/09/2019 19:51, Volodymyr Babchuk wrote:
+/* Handles return from Xen-issued RPC */ +static void handle_xen_rpc_return(struct optee_domain *ctx,
struct cpu_user_regs *regs,
struct optee_std_call *call,
struct shm_rpc *shm_rpc)
+{
- call->state = OPTEE_CALL_NORMAL;
- /*
* Right now we have only one reason to be there - we asked guest
* to free shared buffer and it did it. Now we can tell OP-TEE
* that buffer allocation failed. We are not storing exact command
* type, only type of RPC return. So, this is the only check we
* can perform there.
*/
- ASSERT(call->rpc_op == OPTEE_SMC_RPC_FUNC_CMD);
As I pointed out in v1, ASSERT() is probably the less desirable solution here as this is an error path.
Can you explain why you chose that over the 3 solutions I suggested?
Cheers,
Hi Julien,
Julien Grall writes:
Hi,
On 18/09/2019 19:51, Volodymyr Babchuk wrote:
+/* Handles return from Xen-issued RPC */ +static void handle_xen_rpc_return(struct optee_domain *ctx,
struct cpu_user_regs *regs,
struct optee_std_call *call,
struct shm_rpc *shm_rpc)
+{
- call->state = OPTEE_CALL_NORMAL;
- /*
* Right now we have only one reason to be there - we asked guest
* to free shared buffer and it did it. Now we can tell OP-TEE
* that buffer allocation failed. We are not storing exact command
* type, only type of RPC return. So, this is the only check we
* can perform there.
*/
- ASSERT(call->rpc_op == OPTEE_SMC_RPC_FUNC_CMD);
As I pointed out in v1, ASSERT() is probably the less desirable solution here as this is an error path.
Looks like I misunderstood you :(
Can you explain why you chose that over the 3 solutions I suggested?
I think I need some clarification there. ASSERT() is adopted way to tell about invariant. Clearly, this is programming error if ASSERT() fails.
But I understand, that ASSERT() is available only in debug build. So, in release it will never fire. As this is very unlikely error path, bug there can live forever. Okay, so ASSERT() is the least desirable way.
Warning is not enough, as we are already in incorrect state.
So, best way is to crash guest, it this correct? I'll do this in the next version then.
On 24/09/2019 12:37, Volodymyr Babchuk wrote:
Hi Julien,
Julien Grall writes:
Hi,
On 18/09/2019 19:51, Volodymyr Babchuk wrote:
+/* Handles return from Xen-issued RPC */ +static void handle_xen_rpc_return(struct optee_domain *ctx,
struct cpu_user_regs *regs,
struct optee_std_call *call,
struct shm_rpc *shm_rpc)
+{
- call->state = OPTEE_CALL_NORMAL;
- /*
* Right now we have only one reason to be there - we asked guest
* to free shared buffer and it did it. Now we can tell OP-TEE
* that buffer allocation failed. We are not storing exact command
* type, only type of RPC return. So, this is the only check we
* can perform there.
*/
- ASSERT(call->rpc_op == OPTEE_SMC_RPC_FUNC_CMD);
As I pointed out in v1, ASSERT() is probably the less desirable solution here as this is an error path.
Looks like I misunderstood you :(
Can you explain why you chose that over the 3 solutions I suggested?
I think I need some clarification there. ASSERT() is adopted way to tell about invariant. Clearly, this is programming error if ASSERT() fails.
That's correct.
But I understand, that ASSERT() is available only in debug build. So, in release it will never fire. As this is very unlikely error path, bug there can live forever. Okay, so ASSERT() is the least desirable way.
This is my concern, ASSERT() are fine in path that can be exercised quite well. By experience, error path as not so well tested, so any verbosity in non-debug build is always helpful.
Warning is not enough, as we are already in incorrect state.
Incorrect state for who? The guest?
So, best way is to crash guest, it this correct? I'll do this in the next version then.
A rule of thumb is - BUG_ON can be replaced by domain_crash() as this is a fatal error you can't recover (the scope depends on the function call).
- ASSERT() can be replaced by WARN_ON() as the former will be a NOP in non-debug build. In both case, the error is not fatal and continue will not result So it means the error is not fatal.
You used ASSERT() in your code, so it feels to me that WARN_ON() would be a suitable replacement.
However, if the state inconsistency is for Xen, then domain_crash() is the best. If this is for the guest, then this is not really our business and it may be best to let him continue as it could provide more debug (domain_crash() will only dump the stack and registers).
Cheers,
Julien Grall writes:
On 24/09/2019 12:37, Volodymyr Babchuk wrote:
Hi Julien,
Julien Grall writes:
Hi,
On 18/09/2019 19:51, Volodymyr Babchuk wrote:
+/* Handles return from Xen-issued RPC */ +static void handle_xen_rpc_return(struct optee_domain *ctx,
struct cpu_user_regs *regs,
struct optee_std_call *call,
struct shm_rpc *shm_rpc)
+{
- call->state = OPTEE_CALL_NORMAL;
- /*
* Right now we have only one reason to be there - we asked guest
* to free shared buffer and it did it. Now we can tell OP-TEE
* that buffer allocation failed. We are not storing exact command
* type, only type of RPC return. So, this is the only check we
* can perform there.
*/
- ASSERT(call->rpc_op == OPTEE_SMC_RPC_FUNC_CMD);
As I pointed out in v1, ASSERT() is probably the less desirable solution here as this is an error path.
Looks like I misunderstood you :(
Can you explain why you chose that over the 3 solutions I suggested?
I think I need some clarification there. ASSERT() is adopted way to tell about invariant. Clearly, this is programming error if ASSERT() fails.
That's correct.
But I understand, that ASSERT() is available only in debug build. So, in release it will never fire. As this is very unlikely error path, bug there can live forever. Okay, so ASSERT() is the least desirable way.
This is my concern, ASSERT() are fine in path that can be exercised quite well. By experience, error path as not so well tested, so any verbosity in non-debug build is always helpful.
Yep, I see.
Warning is not enough, as we are already in incorrect state.
Incorrect state for who? The guest?
Yes, for the current call of the current guest. State of other calls and other guests should not be affected. But it is possible that our view of OP-TEE state for that guest is not correct also.
So, best way is to crash guest, it this correct? I'll do this in the next version then.
A rule of thumb is
- BUG_ON can be replaced by domain_crash() as this is a fatal error
you can't recover (the scope depends on the function call).
This seems correct.
- ASSERT() can be replaced by WARN_ON() as the former will be a NOP
in non-debug build. In both case, the error is not fatal and continue will not result So it means the error is not fatal.
I can't agree with this in general. But maybe this makes sense for Xen. As I said, generally ASSERT() is used to, well, assert that condition is true for any correct state of a program. So it cant' be replaced with WARN_ON(). If we detected invalid state we should either try to correct it (if know how) or to immediately stop the program.
But I can see, why this behavior is not desired for hypervisor. Especially in production use.
You used ASSERT() in your code, so it feels to me that WARN_ON() would be a suitable replacement.
Well, honestly I believe that it is better to crash guest to prevent any additional harm. Look, we already detected that something is wrong with mediator state for certain guest. We can pretend that all is fine and try to continue the call. But who knows which consequences it will have?
However, if the state inconsistency is for Xen, then domain_crash() is the best. If this is for the guest, then this is not really our business and it may be best to let him continue as it could provide more debug (domain_crash() will only dump the stack and registers).
I'm looking at this from different point: we promised to provide some service for a guest and screwed up. It is not guest's fault. Now we know that we can't provide reliable service for a guest anymore. From safety point of view we should shut down the service. (But this is job for another patch) For now, we at least should crash the guest. This is the safest way. What do you think?
-- Volodymyr Babchuk at EPAM
On 24/09/2019 14:30, Volodymyr Babchuk wrote:
Julien Grall writes:
On 24/09/2019 12:37, Volodymyr Babchuk wrote:
Hi Julien,
Julien Grall writes:
Hi,
On 18/09/2019 19:51, Volodymyr Babchuk wrote:
+/* Handles return from Xen-issued RPC */ +static void handle_xen_rpc_return(struct optee_domain *ctx,
struct cpu_user_regs *regs,
struct optee_std_call *call,
struct shm_rpc *shm_rpc)
+{
- call->state = OPTEE_CALL_NORMAL;
- /*
* Right now we have only one reason to be there - we asked guest
* to free shared buffer and it did it. Now we can tell OP-TEE
* that buffer allocation failed. We are not storing exact command
* type, only type of RPC return. So, this is the only check we
* can perform there.
*/
- ASSERT(call->rpc_op == OPTEE_SMC_RPC_FUNC_CMD);
As I pointed out in v1, ASSERT() is probably the less desirable solution here as this is an error path.
Looks like I misunderstood you :(
Can you explain why you chose that over the 3 solutions I suggested?
I think I need some clarification there. ASSERT() is adopted way to tell about invariant. Clearly, this is programming error if ASSERT() fails.
That's correct.
But I understand, that ASSERT() is available only in debug build. So, in release it will never fire. As this is very unlikely error path, bug there can live forever. Okay, so ASSERT() is the least desirable way.
This is my concern, ASSERT() are fine in path that can be exercised quite well. By experience, error path as not so well tested, so any verbosity in non-debug build is always helpful.
Yep, I see.
Warning is not enough, as we are already in incorrect state.
Incorrect state for who? The guest?
Yes, for the current call of the current guest. State of other calls and other guests should not be affected. But it is possible that our view of OP-TEE state for that guest is not correct also.
So, best way is to crash guest, it this correct? I'll do this in the next version then.
A rule of thumb is
- BUG_ON can be replaced by domain_crash() as this is a fatal error
you can't recover (the scope depends on the function call).
This seems correct.
- ASSERT() can be replaced by WARN_ON() as the former will be a NOP
in non-debug build. In both case, the error is not fatal and continue will not result So it means the error is not fatal.
I can't agree with this in general. But maybe this makes sense for Xen. As I said, generally ASSERT() is used to, well, assert that condition is true for any correct state of a program. So it cant' be replaced with WARN_ON(). If we detected invalid state we should either try to correct it (if know how) or to immediately stop the program.
I agree that assert are here to catch that any condition is true. However, as I pointed out, ASSERTs are turned to NOP in production build. So if there were a problem in that path, you would have happily continued with the error.
In other words, ASSERT() is only here to help you catch any mistake during development by crashing the hypervisor so you get information on what happen. But your code should be able to cope of any mistake in production build (or you know this can't happen thanks to code coverage).
This is very different from domain_crash() where you know that your code is not able to cope with it and you don't want the guest to continue. Note that domain_crash() is asynchronous, so you will still execute some part of the hypervisor (until leave_hypervisor_head() is called).
But I can see, why this behavior is not desired for hypervisor. Especially in production use.
You used ASSERT() in your code, so it feels to me that WARN_ON() would be a suitable replacement.
Well, honestly I believe that it is better to crash guest to prevent any additional harm. Look, we already detected that something is wrong with mediator state for certain guest. We can pretend that all is fine and try to continue the call. But who knows which consequences it will have?
However, if the state inconsistency is for Xen, then domain_crash() is the best. If this is for the guest, then this is not really our business and it may be best to let him continue as it could provide more debug (domain_crash() will only dump the stack and registers).
I'm looking at this from different point: we promised to provide some service for a guest and screwed up. It is not guest's fault. Now we know that we can't provide reliable service for a guest anymore. From safety point of view we should shut down the service. (But this is job for another patch) For now, we at least should crash the guest. This is the safest way. What do you think?
I am happy with domain_crash().
Cheers,
With the latest patches to the mediator, it can be considered as Technological Preview feature.
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com
---
Note for commiter: Obviously this patch should be merged after all other patches in this series. --- SUPPORT.md | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/SUPPORT.md b/SUPPORT.md index 375473a456..8d50a72dcb 100644 --- a/SUPPORT.md +++ b/SUPPORT.md @@ -660,6 +660,10 @@ No support for QEMU backends in a 16K or 64K domain.
Status: Supported
+### ARM: OP-TEE Mediator + + Status: Tech Preview + ## Virtual Hardware, QEMU
This section describes supported devices available in HVM mode using a
Hi,
On 18/09/2019 19:51, Volodymyr Babchuk wrote:
With the latest patches to the mediator, it can be considered as Technological Preview feature.
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com
With one change below:
Acked-by: Julien Grall julien.grall@arm.com
Note for commiter: Obviously this patch should be merged after all other patches in this series.
SUPPORT.md | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/SUPPORT.md b/SUPPORT.md index 375473a456..8d50a72dcb 100644 --- a/SUPPORT.md +++ b/SUPPORT.md @@ -660,6 +660,10 @@ No support for QEMU backends in a 16K or 64K domain. Status: Supported +### ARM: OP-TEE Mediator
NIT: s/ARM/Arm/
- Status: Tech Preview
- ## Virtual Hardware, QEMU
This section describes supported devices available in HVM mode using a
Cheers,
OP-TEE mediator now is "Tech Preview" state, and we want to update it's description in Kconfig accordingly.
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com
---
Note to commiter: this patch depends on first 4 patches in the series. --- xen/arch/arm/tee/Kconfig | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/xen/arch/arm/tee/Kconfig b/xen/arch/arm/tee/Kconfig index b4b6aa2610..a4a598191e 100644 --- a/xen/arch/arm/tee/Kconfig +++ b/xen/arch/arm/tee/Kconfig @@ -3,7 +3,11 @@ config OPTEE default n depends on TEE help - Enable experimental OP-TEE mediator. It allows guests to access - OP-TEE running on your platform. This requires virtualization-enabled - OP-TEE present. You can learn more about virtualization for OP-TEE - at https://optee.readthedocs.io/architecture/virtualization.html + Enable the OP-TEE mediator. It allows guests to access + OP-TEE running on your platform. This requires + virtualization-enabled OP-TEE present. You can learn more + about virtualization for OP-TEE at + https://optee.readthedocs.io/architecture/virtualization.html + + Right now OP-TEE mediator is "Tech Preview" state, so it is + not good idea to use it in production.
Hi Volodymyr,
On 18/09/2019 19:51, Volodymyr Babchuk wrote:
OP-TEE mediator now is "Tech Preview" state, and we want to update it's description in Kconfig accordingly.
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com
Note to commiter: this patch depends on first 4 patches in the series.
xen/arch/arm/tee/Kconfig | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/xen/arch/arm/tee/Kconfig b/xen/arch/arm/tee/Kconfig index b4b6aa2610..a4a598191e 100644 --- a/xen/arch/arm/tee/Kconfig +++ b/xen/arch/arm/tee/Kconfig @@ -3,7 +3,11 @@ config OPTEE default n depends on TEE help
Enable experimental OP-TEE mediator. It allows guests to access
OP-TEE running on your platform. This requires virtualization-enabled
OP-TEE present. You can learn more about virtualization for OP-TEE
at https://optee.readthedocs.io/architecture/virtualization.html
Enable the OP-TEE mediator. It allows guests to access
OP-TEE running on your platform. This requires
virtualization-enabled OP-TEE present. You can learn more
about virtualization for OP-TEE at
https://optee.readthedocs.io/architecture/virtualization.html
Right now OP-TEE mediator is "Tech Preview" state, so it is
not good idea to use it in production.
Well, the whole TEE support is under "EXPERT" so someone selecting this option already know this should not be used in production. We also have SUPPORT.MD describing the state of the feature.
So I would drop completely the last sentence and potentially gate OPTEE with "EXPERT" as well. Note that the last bits is optional.
Cheers,
On 18/09/2019 19:50, Volodymyr Babchuk wrote:
Hello,
Hi,
This is the second version for maturing the OP-TEE mediator.
Changes also can be pulled from [2].
Changes from v1:
- Added patch that updates SUPPORT.md
- Instead of removing "experimental" status I changed it to "Tech Preview"
- Other changes are described in the corresponding patches
Cover letter for v1:
This patch series fixes various unfinished items in the OP-TEE mediator. Mostly this is about limiting resources that guest can consume. This includes both memory and time - how many buffers guest can share with OP-TEE (this uses Xen memory) and when mediator should preempt itself, to make sure that guest does not stress scheduling.
Apart from this, there were one case, when mediator's actions might lead to memory leak in a good-behaving guest. To fix this issue I had to extend mediator logic, so now it can issue RPC requests to guest in the same way, as OP-TEE does this. This is useful feature, because it allows to preempt mediator during long operations. So, in the future it will be possible to remove shared buffer size limitation, because mediator can preempt self during buffer translation.
This patch series can be pulled from [1].
[1] https://github.com/lorc/xen/tree/optee3_v1 [2] https://github.com/lorc/xen/tree/optee3_v2
Volodymyr Babchuk (6): xen/arm: optee: impose limit on shared buffer size xen/arm: optee: check for preemption while freeing shared buffers xen/arm: optee: limit number of shared buffers
I have committed patch 1-3.
Cheers,