On 6 November 2013 13:51, Peter Maydell peter.maydell@linaro.org wrote:
On 6 November 2013 11:22, Ard Biesheuvel ard.biesheuvel@linaro.org wrote:
This adds support for the AESE/AESD/AESMC/AESIMC instructions that are available on some v8 implementations of Aarch32.
Thanks for this patch. Please could you send QEMU patches to qemu-devel@nongnu.org? (we do all our work on upstream qemu and in the upstream mailing list). You might also like to read http://wiki.qemu.org/Contribute/SubmitAPatch which discusses some minor issues of formatting and so on (for instance it tells you about the coding style checking script, which would tell you about some minor errors in your patch).
Well, to be honest, my primary motivation for posting it to the list was so I could refer to it from the JIRA issue you told me about yesterday, but I would be happy to fix these minor issues and do a proper submission as well. However, considering that this is just a minor slice of the crypto extensions, and that I should probably also implement disassemble support (?) for a complete implementation, I did not think the coverage of the patch would be sufficient for a proper upstream submission. (I mainly wrote it to unblock my own work)
I'll do a first pass review here but you should definitely send subsequent versions to the qemu list.
OK, thanks.
How much testing have you done on this patch?
This is actually a port of my AES emulation for the v8 kernel which I submitted a while ago
http://marc.info/?l=linux-arm-kernel&m=138114828031956&w=2
(which was not in fact intended for upstreaming, but would be required by anyone who wants to test the other patches in the series and does not have access to the Fast Model plugin [which is most of us]). I have tested that extensively, and this patch was tested with some C code that compares against OpenSSL AES.
- This program is free software; you can redistribute it and/or modify
- it under the terms of the GNU General Public License version 2 as
- published by the Free Software Foundation.
Upstream prefers 2-or-later.
I am not sure what Linaro prefers (do you?) but my personal preference would be this. No show stopper though.
+static void mix_columns(union AES_STATE *out, union AES_STATE *in) +{
- int i;
- for (i = 0; i < 16; i++)
out->bytes[i] =
gf8_mul_x(in->bytes[i]) ^
gf8_mul_x(in->bytes[((i + 1) % 4) | (i & ~3)]) ^
in->bytes[((i + 1) % 4) | (i & ~3)] ^
in->bytes[((i + 2) % 4) | (i & ~3)] ^
in->bytes[((i + 3) % 4) | (i & ~3)];
Coding style demands braces on for() and if() statements even if the body is a single statement (checkpatch.pl will tell you this).
OK.
--- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -4346,6 +4346,8 @@ static const uint8_t neon_3r_sizes[] = { #define NEON_2RM_VREV16 2 #define NEON_2RM_VPADDL 4 #define NEON_2RM_VPADDL_U 5 +#define NEON_2RM_AESE 6 /* Includes AESD */ +#define NEON_2RM_AESMC 7 /* Includes AESIMC */ #define NEON_2RM_VCLS 8 #define NEON_2RM_VCLZ 9 #define NEON_2RM_VCNT 10 @@ -4403,6 +4405,8 @@ static const uint8_t neon_2rm_sizes[] = { [NEON_2RM_VREV16] = 0x1, [NEON_2RM_VPADDL] = 0x7, [NEON_2RM_VPADDL_U] = 0x7,
- [NEON_2RM_AESE] = 0x1,
- [NEON_2RM_AESMC] = 0x1, [NEON_2RM_VCLS] = 0x7, [NEON_2RM_VCLZ] = 0x7, [NEON_2RM_VCNT] = 0x1,
@@ -5925,6 +5929,24 @@ static int disas_neon_data_insn(CPUARMState * env, DisasContext *s, uint32_t ins tcg_temp_free_i32(tmp2); tcg_temp_free_i32(tmp3); break;
case NEON_2RM_AESE:
You need a check here: if (!ENABLE_ARCH_8) { return 1; }
so we don't decode these instructions on cores where they don't exist.
Thanks, I was wondering about that. Should there be another check? Not all v8 cores will implement these extensions, so we might also want to emulate one that doesn't, I suppose?
tmp = tcg_const_i32(rd);
tmp2 = tcg_const_i32(rm);
if (insn & (1 << 6))
gen_helper_crypto_aesd(cpu_env, tmp, tmp2);
else
gen_helper_crypto_aese(cpu_env, tmp, tmp2);
break;
case NEON_2RM_AESMC:
tmp = tcg_const_i32(rd);
tmp2 = tcg_const_i32(rm);
if (insn & (1 << 6))
gen_helper_crypto_aesimc(cpu_env, tmp, tmp2);
else
gen_helper_crypto_aesmc(cpu_env, tmp, tmp2);
break; default: elementwise: for (pass = 0; pass < (q ? 4 : 2); pass++) {
regards, Ard.