On 20 May 2013 18:19, Vladimir Murzin murzin.v@gmail.com wrote:
AFAIK NEON intrinsics requires -mfpu=neon GCC option. I wonder if you have any approach to prevent code outside kernel_fpu_begin/kernel_fpu_end guards being optimized for VFP/NEON?
That is a good point. First of all, I don't think there is a point to compiling the whole kernel with -mfpu=neon, so setting the flag per compilation unit is probably a better approach (if you use -mfloat-abi=softfp, the resulting code should link fine against the other soft-float objects) Inside a compilation unit, it is a bit more tricky, as you correctly pointed out.
I am preparing a couple of patches that use kernel-mode NEON in various ways: - RAID 6 syndrome calculations using NEON intrinsics -> same as the Intel code, i.e., use a noinline function that uses the intrinsics and call it from another function that does the kernel_vfp_begin/end pair; as long as you don't perform any floating point arithmetic in the same compilation unit, the compiler shouldn't emit any other NEON/VFP code; - XOR_BLOCKS NEON implementation using -ftree-vectorize -> a bit trickier, but I think with a bit of care, the above approach should work as well; if not, having a separate compilation unit containing the vectorized functions and calling those from another unit that does the kernel_vfp_begin/end pair should do the trick; - bit sliced AES using NEON assembler -> no need so set the -mpfu=neon flag, you can just set '.fpu neon' in the asm code, and the assembler emits the code exactly as you typed it.
Regards, Ard.
Personally I end up with two options while doing NEON optimized syndrome generation for RAID:
- consolidate all NEON specific code into separate compilation unit, i.e. S-file
- inline bare opcodes into C source
Vladimir Murzin
Signed-off-by: Ard Biesheuvel ard.biesheuvel@linaro.org
arch/arm/include/asm/neon.h | 44 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 44 insertions(+) create mode 100644 arch/arm/include/asm/neon.h
diff --git a/arch/arm/include/asm/neon.h b/arch/arm/include/asm/neon.h new file mode 100644 index 0000000..0f76dc3 --- /dev/null +++ b/arch/arm/include/asm/neon.h @@ -0,0 +1,44 @@ +/*
- linux/arch/arm/include/asm/neon.h
- Copyright (C) 2013 Linaro Ltd ard.biesheuvel@linaro.org
- This program is free software; you can redistribute it and/or modify
- it under the terms of the GNU General Public License version 2 as
- published by the Free Software Foundation.
- */
+#ifndef _ASM_NEON_H +#define _ASM_NEON_H
+/*
- The GCC support header file for NEON intrinsics, <arm_neon.h>, does an
- unconditional #include of <stdint.h>, assuming it will never be used outside
- a C99 conformant environment. Sadly, this is not the case for the kernel.
- The only dependencies <arm_neon.h> has on <stdint.h> are the
- uint[8|16|32|64]_t types, which the kernel defines in <linux/types.h>.
- */
+#include <linux/types.h>
+/*
- The GCC option -ffreestanding prevents GCC's internal <stdint.h> from
- including the <stdint.h> system header, it will #include "stdint-gcc.h"
- instead.
- */
+#if __STDC_HOSTED__ != 0 +#error You must compile with -ffreestanding to use NEON intrinsics +#endif
+/*
- The type uintptr_t is typedef'ed to __UINTPTR_TYPE__ by "stdint-gcc.h".
- However, the bare metal and GLIBC versions of GCC don't agree on the
- definition of __UINTPTR_TYPE__. Bare metal agrees with the kernel
- (unsigned long), but GCC for GLIBC uses 'unsigned int' instead.
- */
+#ifdef __linux__ +#undef __UINTPTR_TYPE__ +#endif
+#include <arm_neon.h>
+#endif
1.8.1.2
linaro-kernel mailing list linaro-kernel@lists.linaro.org http://lists.linaro.org/mailman/listinfo/linaro-kernel