On 24/09/2019 14:30, Volodymyr Babchuk wrote:
Julien Grall writes:
On 24/09/2019 12:37, Volodymyr Babchuk wrote:
Hi Julien,
Julien Grall writes:
Hi,
On 18/09/2019 19:51, Volodymyr Babchuk wrote:
+/* Handles return from Xen-issued RPC */ +static void handle_xen_rpc_return(struct optee_domain *ctx,
struct cpu_user_regs *regs,
struct optee_std_call *call,
struct shm_rpc *shm_rpc)
+{
- call->state = OPTEE_CALL_NORMAL;
- /*
* Right now we have only one reason to be there - we asked guest
* to free shared buffer and it did it. Now we can tell OP-TEE
* that buffer allocation failed. We are not storing exact command
* type, only type of RPC return. So, this is the only check we
* can perform there.
*/
- ASSERT(call->rpc_op == OPTEE_SMC_RPC_FUNC_CMD);
As I pointed out in v1, ASSERT() is probably the less desirable solution here as this is an error path.
Looks like I misunderstood you :(
Can you explain why you chose that over the 3 solutions I suggested?
I think I need some clarification there. ASSERT() is adopted way to tell about invariant. Clearly, this is programming error if ASSERT() fails.
That's correct.
But I understand, that ASSERT() is available only in debug build. So, in release it will never fire. As this is very unlikely error path, bug there can live forever. Okay, so ASSERT() is the least desirable way.
This is my concern, ASSERT() are fine in path that can be exercised quite well. By experience, error path as not so well tested, so any verbosity in non-debug build is always helpful.
Yep, I see.
Warning is not enough, as we are already in incorrect state.
Incorrect state for who? The guest?
Yes, for the current call of the current guest. State of other calls and other guests should not be affected. But it is possible that our view of OP-TEE state for that guest is not correct also.
So, best way is to crash guest, it this correct? I'll do this in the next version then.
A rule of thumb is
- BUG_ON can be replaced by domain_crash() as this is a fatal error
you can't recover (the scope depends on the function call).
This seems correct.
- ASSERT() can be replaced by WARN_ON() as the former will be a NOP
in non-debug build. In both case, the error is not fatal and continue will not result So it means the error is not fatal.
I can't agree with this in general. But maybe this makes sense for Xen. As I said, generally ASSERT() is used to, well, assert that condition is true for any correct state of a program. So it cant' be replaced with WARN_ON(). If we detected invalid state we should either try to correct it (if know how) or to immediately stop the program.
I agree that assert are here to catch that any condition is true. However, as I pointed out, ASSERTs are turned to NOP in production build. So if there were a problem in that path, you would have happily continued with the error.
In other words, ASSERT() is only here to help you catch any mistake during development by crashing the hypervisor so you get information on what happen. But your code should be able to cope of any mistake in production build (or you know this can't happen thanks to code coverage).
This is very different from domain_crash() where you know that your code is not able to cope with it and you don't want the guest to continue. Note that domain_crash() is asynchronous, so you will still execute some part of the hypervisor (until leave_hypervisor_head() is called).
But I can see, why this behavior is not desired for hypervisor. Especially in production use.
You used ASSERT() in your code, so it feels to me that WARN_ON() would be a suitable replacement.
Well, honestly I believe that it is better to crash guest to prevent any additional harm. Look, we already detected that something is wrong with mediator state for certain guest. We can pretend that all is fine and try to continue the call. But who knows which consequences it will have?
However, if the state inconsistency is for Xen, then domain_crash() is the best. If this is for the guest, then this is not really our business and it may be best to let him continue as it could provide more debug (domain_crash() will only dump the stack and registers).
I'm looking at this from different point: we promised to provide some service for a guest and screwed up. It is not guest's fault. Now we know that we can't provide reliable service for a guest anymore. From safety point of view we should shut down the service. (But this is job for another patch) For now, we at least should crash the guest. This is the safest way. What do you think?
I am happy with domain_crash().
Cheers,