On Mon, May 20, 2013 at 08:34:11PM +0200, Ard Biesheuvel wrote:
On 20 May 2013 18:19, Vladimir Murzin murzin.v@gmail.com wrote:
AFAIK NEON intrinsics requires -mfpu=neon GCC option. I wonder if you have any approach to prevent code outside kernel_fpu_begin/kernel_fpu_end guards being optimized for VFP/NEON?
That is a good point. First of all, I don't think there is a point to compiling the whole kernel with -mfpu=neon, so setting the flag per compilation unit is probably a better approach (if you use -mfloat-abi=softfp, the resulting code should link fine against the other soft-float objects) Inside a compilation unit, it is a bit more tricky, as you correctly pointed out.
I am preparing a couple of patches that use kernel-mode NEON in various ways:
- RAID 6 syndrome calculations using NEON intrinsics -> same as the
Intel code, i.e., use a noinline function that uses the intrinsics and call it from another function that does the kernel_vfp_begin/end pair;
Hmmm.. Intel code uses inline asm for all MXX, SSE, SSE2 and AVX optimizations. PPC uses intrinsics for its ALTIVEC optimizations, may be you mean this?
as long as you don't perform any floating point arithmetic in the same compilation unit, the compiler shouldn't emit any other NEON/VFP code;
NEON optimization might be done on integer operations by GCC easily. It isn't a good practice to make assumptions, especially speaking about compiler - one day it might hurt you.
- XOR_BLOCKS NEON implementation using -ftree-vectorize -> a bit
trickier, but I think with a bit of care, the above approach should work as well; if not, having a separate compilation unit containing the vectorized functions and calling those from another unit that does the kernel_vfp_begin/end pair should do the trick;
- bit sliced AES using NEON assembler -> no need so set the -mpfu=neon
flag, you can just set '.fpu neon' in the asm code, and the assembler emits the code exactly as you typed it.
Intrinsics is a good technology which supposed to make life easier. From the kernel side perspective, I see we have to worry a lot while using them. I'll wait for your rest code.
Vladimir Murzin
Regards, Ard.
Personally I end up with two options while doing NEON optimized syndrome generation for RAID:
- consolidate all NEON specific code into separate compilation unit, i.e. S-file
- inline bare opcodes into C source
Vladimir Murzin
Signed-off-by: Ard Biesheuvel ard.biesheuvel@linaro.org
arch/arm/include/asm/neon.h | 44 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 44 insertions(+) create mode 100644 arch/arm/include/asm/neon.h
diff --git a/arch/arm/include/asm/neon.h b/arch/arm/include/asm/neon.h new file mode 100644 index 0000000..0f76dc3 --- /dev/null +++ b/arch/arm/include/asm/neon.h @@ -0,0 +1,44 @@ +/*
- linux/arch/arm/include/asm/neon.h
- Copyright (C) 2013 Linaro Ltd ard.biesheuvel@linaro.org
- This program is free software; you can redistribute it and/or modify
- it under the terms of the GNU General Public License version 2 as
- published by the Free Software Foundation.
- */
+#ifndef _ASM_NEON_H +#define _ASM_NEON_H
+/*
- The GCC support header file for NEON intrinsics, <arm_neon.h>, does an
- unconditional #include of <stdint.h>, assuming it will never be used outside
- a C99 conformant environment. Sadly, this is not the case for the kernel.
- The only dependencies <arm_neon.h> has on <stdint.h> are the
- uint[8|16|32|64]_t types, which the kernel defines in <linux/types.h>.
- */
+#include <linux/types.h>
+/*
- The GCC option -ffreestanding prevents GCC's internal <stdint.h> from
- including the <stdint.h> system header, it will #include "stdint-gcc.h"
- instead.
- */
+#if __STDC_HOSTED__ != 0 +#error You must compile with -ffreestanding to use NEON intrinsics +#endif
+/*
- The type uintptr_t is typedef'ed to __UINTPTR_TYPE__ by "stdint-gcc.h".
- However, the bare metal and GLIBC versions of GCC don't agree on the
- definition of __UINTPTR_TYPE__. Bare metal agrees with the kernel
- (unsigned long), but GCC for GLIBC uses 'unsigned int' instead.
- */
+#ifdef __linux__ +#undef __UINTPTR_TYPE__ +#endif
+#include <arm_neon.h>
+#endif
1.8.1.2
linaro-kernel mailing list linaro-kernel@lists.linaro.org http://lists.linaro.org/mailman/listinfo/linaro-kernel