On Fri, Jul 19, 2019 at 1:17 AM Peter Zijlstra <peterz@infradead.org> wrote:
On Thu, Jul 18, 2019 at 02:34:44PM -0700, Nick Desaulniers wrote:
On Wed, Jul 17, 2019 at 5:02 PM Vaibhav Rustagi <vaibhavrustagi@google.com> wrote:
Compiling the purgatory code with clang results in the use of xmm (SSE) registers.
$ objdump -d arch/x86/purgatory/purgatory.ro | grep xmm
     112: 0f 28 00       movaps (%rax),%xmm0
     115: 0f 11 07       movups %xmm0,(%rdi)
     122: 0f 28 00       movaps (%rax),%xmm0
     125: 0f 11 47 10    movups %xmm0,0x10(%rdi)
Add -mno-sse, -mno-mmx, -mno-sse2 to avoid generating SSE instructions.
Signed-off-by: Vaibhav Rustagi <vaibhavrustagi@google.com>
 arch/x86/purgatory/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/purgatory/Makefile b/arch/x86/purgatory/Makefile
index 3cf302b26332..3589ec4a28c7 100644
--- a/arch/x86/purgatory/Makefile
+++ b/arch/x86/purgatory/Makefile
@@ -20,6 +20,7 @@ KCOV_INSTRUMENT := n
 # sure how to relocate those. Like kexec-tools, use custom flags.
 KBUILD_CFLAGS := -fno-strict-aliasing -Wall -Wstrict-prototypes -fno-zero-initialized-in-bss -fno-builtin -ffreestanding -c -Os -mcmodel=large
+KBUILD_CFLAGS += -mno-mmx -mno-sse -mno-sse2
Yep, this is a commonly recurring bug in the kernel, observed again and again for Clang builds. The top level Makefile carefully sets KBUILD_CFLAGS, then lower subdirs in the kernel wipe them away with `:=` assignment. Inevitably, some important flags don't get re-added. In this case, these flags are used in arch/x86/Makefile but not here, and IMO they should be. Thanks for the patch.
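To make the `:=` problem concrete, here is a minimal stand-alone GNU make sketch. It is not the kernel's build system: the first assignment is a hypothetical stand-in for whatever the top-level and arch Makefiles set, and the `all` target exists only so that make has something to do. The second and third assignments mirror the shape of the purgatory Makefile in the patch above.

```make
# Hypothetical stand-in for flags inherited from the top-level and
# arch Makefiles (the real list is much longer).
KBUILD_CFLAGS := -O2 -mno-mmx -mno-sse -mno-sse2

# A subdirectory Makefile that assigns with ":=" replaces the whole
# value, so the inherited -mno-* flags are gone after this line.
KBUILD_CFLAGS := -fno-builtin -ffreestanding -Os -mcmodel=large
$(info after := : $(KBUILD_CFLAGS))

# "+=" appends to the current value, so the dropped flags have to be
# re-added explicitly, as the patch does.
KBUILD_CFLAGS += -mno-mmx -mno-sse -mno-sse2
$(info after += : $(KBUILD_CFLAGS))

# Empty rule so that "make" has a default target.
all: ;
```

Running `make` on this file prints the flag list twice. The first line lacks the `-mno-*` flags and the second has them again, which is the failure mode and the fix described above.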
Should we then not fix/remove these := assignments?
Good point, it's actually pretty straightforward to do so. It will just invert the order of patches in the series, since the memcpy/memset infinite recursion is then guaranteed with CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y (without the other patch in this series). Did the x86 maintainers have thoughts, from the thread on the other patch in the series, on their favorite implementation of memset/memcpy for me to use? I'll just resend with this fix, and maybe we can discuss there and spin a v3 if needed.