linaro-toolchain

linaro-toolchain@lists.linaro.org

9 participants
5726 discussions

by Stefan Johansson A

Hello, We have been using Linaro GCC 7.5-2019.12 for the A53. As we move on to new tech there seems to be no support for "- mcpu=cortex-a55". Today, we use the aarch64-elf- toolchain. What GCC do you suggest we start using for A55 ? Thanks, Stefan

4 years, 7 months

[TCWG CI] 473.astar:[.] wayobj::makebound2 grew in size by 14% after llvm: [Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default

by ci_notify＠linaro.org

After llvm commit 7584ef766a7219b6ee5a400637206d26e0fa98ac Author: Juneyoung Lee <aqjune(a)gmail.com> [Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default the following hot functions grew in size by more than 10% (but their benchmarks grew in size by less than 1%): - 473.astar:[.] wayobj::makebound2 grew in size by 14% from 404 to 462 bytes Below reproducer instructions can be used to re-build both "first_bad" and "last_good" cross-toolchains used in this bisection. Naturally, the scripts will fail when triggerring benchmarking jobs if you don't have access to Linaro TCWG CI. For your convenience, we have uploaded tarballs with pre-processed source and assembly files at: - First_bad save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… - Last_good save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… - Baseline save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Configuration: - Benchmark: SPEC CPU2006 - Toolchain: Clang + Glibc + LLVM Linker - Version: all components were built from their tip of trunk - Target: arm-linux-gnueabihf - Compiler flags: -Oz -mthumb - Hardware: APM Mustang 8x X-Gene1 This benchmarking CI is work-in-progress, and we welcome feedback and suggestions at linaro-toolchain(a)lists.linaro.org . In our improvement plans is to add support for SPEC CPU2017 benchmarks and provide "perf report/annotate" data behind these reports. THIS IS THE END OF INTERESTING STUFF. BELOW ARE LINKS TO BUILDS, REPRODUCTION INSTRUCTIONS, AND THE RAW COMMIT. This commit has regressed these CI configurations: - tcwg_bmk_llvm_apm/llvm-master-arm-spec2k6-Oz First_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Baseline build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Even more details: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Reproduce builds: <cut> mkdir investigate-llvm-7584ef766a7219b6ee5a400637206d26e0fa98ac cd investigate-llvm-7584ef766a7219b6ee5a400637206d26e0fa98ac # Fetch scripts git clone https://git.linaro.org/toolchain/jenkins-scripts # Fetch manifests and test.sh script mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach 7584ef766a7219b6ee5a400637206d26e0fa98ac ../artifacts/test.sh # Reproduce last_good build git checkout --detach 1ab9a2906e19cca87cafac25cc31231a36de4843 ../artifacts/test.sh cd .. </cut> Full commit (up to 1000 lines): <cut> commit 7584ef766a7219b6ee5a400637206d26e0fa98ac Author: Juneyoung Lee <aqjune(a)gmail.com> Date: Sat Nov 6 15:34:49 2021 +0900 [Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions. I modified clang by renaming `enable_noundef_analysis` flag to `disable-noundef-analysis` and turning it off by default. Test updates are made as a separate patch: D108453 Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D105169 --- clang/include/clang/Basic/CodeGenOptions.def | 2 +- clang/include/clang/Driver/Options.td | 6 +- clang/lib/CodeGen/CGCall.cpp | 4 +- clang/test/CXX/except/except.spec/p14-ir.cpp | 4 +- .../expr.prim/expr.prim.lambda/blocks-irgen.mm | 4 +- clang/test/CodeGen/2005-01-02-ConstantInits.c | 10 +- clang/test/CodeGen/2006-05-19-SingleEltReturn.c | 2 +- clang/test/CodeGen/2007-06-18-SextAttrAggregate.c | 2 +- .../test/CodeGen/2009-02-13-zerosize-union-field.c | 2 +- clang/test/CodeGen/2009-05-04-EnumInreg.c | 2 +- clang/test/CodeGen/64bit-swiftcall.c | 8 +- clang/test/CodeGen/RISCV/riscv-inline-asm.c | 2 +- clang/test/CodeGen/RISCV/riscv32-ilp32-abi.c | 8 +- .../test/CodeGen/RISCV/riscv32-ilp32-ilp32f-abi.c | 8 +- .../RISCV/riscv32-ilp32-ilp32f-ilp32d-abi.c | 48 +- clang/test/CodeGen/RISCV/riscv32-ilp32d-abi.c | 24 +- clang/test/CodeGen/RISCV/riscv32-ilp32f-abi.c | 6 +- .../test/CodeGen/RISCV/riscv32-ilp32f-ilp32d-abi.c | 16 +- clang/test/CodeGen/RISCV/riscv64-lp64-abi.c | 6 +- clang/test/CodeGen/RISCV/riscv64-lp64-lp64f-abi.c | 4 +- .../CodeGen/RISCV/riscv64-lp64-lp64f-lp64d-abi.c | 58 +- clang/test/CodeGen/RISCV/riscv64-lp64d-abi.c | 12 +- clang/test/CodeGen/RISCV/riscv64-lp64f-lp64d-abi.c | 16 +- clang/test/CodeGen/SystemZ/systemz-abi-vector.c | 18 +- clang/test/CodeGen/SystemZ/systemz-abi.c | 22 +- clang/test/CodeGen/SystemZ/systemz-inline-asm.c | 24 +- clang/test/CodeGen/WebAssembly/wasm-arguments.c | 38 +- .../test/CodeGen/WebAssembly/wasm-main_argc_argv.c | 2 +- clang/test/CodeGen/X86/avx-union.c | 6 +- clang/test/CodeGen/X86/avx512fp16-complex-abi.c | 2 +- clang/test/CodeGen/X86/ms-x86-intrinsics.c | 32 +- clang/test/CodeGen/X86/strictfp_builtins.c | 8 +- clang/test/CodeGen/X86/x86-atomic-long_double.c | 36 +- .../CodeGen/X86/x86-inline-asm-min-vector-width.c | 12 +- clang/test/CodeGen/X86/x86-long-double.cpp | 6 +- clang/test/CodeGen/X86/x86-soft-float.c | 4 +- clang/test/CodeGen/X86/x86-vec-i128.c | 22 +- clang/test/CodeGen/X86/x86_32-arguments-darwin.c | 62 +- clang/test/CodeGen/X86/x86_32-arguments-iamcu.c | 24 +- clang/test/CodeGen/X86/x86_32-arguments-linux.c | 30 +- clang/test/CodeGen/X86/x86_32-arguments-nommx.c | 4 +- clang/test/CodeGen/X86/x86_32-arguments-realign.c | 2 +- clang/test/CodeGen/X86/x86_32-arguments-win32.c | 12 +- clang/test/CodeGen/X86/x86_64-arguments-nacl.c | 6 +- clang/test/CodeGen/X86/x86_64-arguments-win32.c | 12 +- clang/test/CodeGen/X86/x86_64-arguments.c | 82 +- clang/test/CodeGen/X86/x86_64-longdouble.c | 36 +- clang/test/CodeGen/aapcs-align.cpp | 56 +- clang/test/CodeGen/aapcs64-align.cpp | 34 +- clang/test/CodeGen/aarch64-args.cpp | 18 +- clang/test/CodeGen/aarch64-byval-temp.c | 8 +- clang/test/CodeGen/aarch64-neon-3v.c | 160 +- clang/test/CodeGen/aarch64-neon-across.c | 88 +- clang/test/CodeGen/aarch64-neon-dot-product.c | 24 +- clang/test/CodeGen/aarch64-neon-extract.c | 48 +- clang/test/CodeGen/aarch64-neon-fcvt-intrinsics.c | 42 +- clang/test/CodeGen/aarch64-neon-fma.c | 44 +- clang/test/CodeGen/aarch64-neon-ldst-one.c | 540 ++-- clang/test/CodeGen/aarch64-neon-scalar-copy.c | 48 +- .../CodeGen/aarch64-neon-scalar-x-indexed-elem.c | 80 +- clang/test/CodeGen/aarch64-neon-tbl.c | 144 +- clang/test/CodeGen/aarch64-neon-vcombine.c | 28 +- clang/test/CodeGen/aarch64-neon-vget-hilo.c | 56 +- clang/test/CodeGen/aarch64-neon-vget.c | 96 +- clang/test/CodeGen/aarch64-poly128.c | 62 +- clang/test/CodeGen/aarch64-poly64.c | 96 +- clang/test/CodeGen/aarch64-strictfp-builtins.c | 8 +- ...4-sve-acle-__ARM_FEATURE_SVE_VECTOR_OPERATORS.c | 16 +- ...sve-acle-__ARM_FEATURE_SVE_VECTOR_OPERATORS.cpp | 8 +- clang/test/CodeGen/aarch64-varargs.c | 2 +- clang/test/CodeGen/address-space-field1.c | 2 +- clang/test/CodeGen/address-space.c | 2 +- clang/test/CodeGen/aix-alignment.c | 8 +- clang/test/CodeGen/aix-altivec.c | 10 +- clang/test/CodeGen/aix-ignore-xcoff-visibility.cpp | 12 +- clang/test/CodeGen/aix-return.c | 16 +- clang/test/CodeGen/aix-struct-arg.c | 44 +- clang/test/CodeGen/aix-vaargs.c | 4 +- clang/test/CodeGen/alias.c | 12 +- clang/test/CodeGen/align_value.cpp | 63 +- clang/test/CodeGen/alloc-align-attr.c | 46 +- clang/test/CodeGen/alloc-fns-alignment.c | 2 +- clang/test/CodeGen/alloc-size-fnptr.c | 12 +- clang/test/CodeGen/arc/arguments.c | 26 +- clang/test/CodeGen/arithmetic-fence-builtin.c | 10 +- clang/test/CodeGen/arm-aapcs-vfp.c | 24 +- clang/test/CodeGen/arm-abi-vector.c | 48 +- clang/test/CodeGen/arm-arguments.c | 10 +- clang/test/CodeGen/arm-bf16-params-returns.c | 10 +- clang/test/CodeGen/arm-byval-align.c | 2 +- clang/test/CodeGen/arm-cmse-attr.c | 4 +- clang/test/CodeGen/arm-cmse-call.c | 4 +- clang/test/CodeGen/arm-float-helpers.c | 76 +- clang/test/CodeGen/arm-fp16-arguments.c | 12 +- clang/test/CodeGen/arm-homogenous.c | 2 +- clang/test/CodeGen/arm-mangle-bf16.cpp | 2 +- clang/test/CodeGen/arm-neon-directed-rounding.c | 30 +- clang/test/CodeGen/arm-neon-dot-product.c | 16 +- clang/test/CodeGen/arm-neon-fma.c | 8 +- clang/test/CodeGen/arm-neon-numeric-maxmin.c | 8 +- clang/test/CodeGen/arm-neon-vcvtX.c | 32 +- clang/test/CodeGen/arm-swiftcall.c | 6 +- clang/test/CodeGen/arm-varargs.c | 2 +- clang/test/CodeGen/arm-vector-arguments.c | 10 +- clang/test/CodeGen/arm-vfp16-arguments.c | 12 +- clang/test/CodeGen/arm64-aapcs-arguments.c | 12 +- clang/test/CodeGen/arm64-abi-vector.c | 42 +- clang/test/CodeGen/arm64-arguments.c | 96 +- clang/test/CodeGen/arm64-microsoft-arguments.cpp | 6 +- clang/test/CodeGen/arm64_32.c | 2 +- clang/test/CodeGen/arm64_vcopy.c | 20 +- clang/test/CodeGen/arm64_vdupq_n_f64.c | 12 +- clang/test/CodeGen/armv7k-abi.c | 6 +- clang/test/CodeGen/asm-label.c | 12 +- .../assume-aligned-and-alloc-align-attributes.c | 12 +- clang/test/CodeGen/atomic-arm64.c | 2 +- clang/test/CodeGen/atomic-ops-libcall.c | 34 +- clang/test/CodeGen/atomic-ops.c | 44 +- clang/test/CodeGen/atomic_ops.c | 10 +- clang/test/CodeGen/atomics-inlining.c | 52 +- clang/test/CodeGen/attr-func-def.c | 4 +- clang/test/CodeGen/attr-naked.c | 2 +- clang/test/CodeGen/attr-no-tail.c | 8 +- clang/test/CodeGen/attr-nomerge.cpp | 20 +- clang/test/CodeGen/attr-noundef.cpp | 4 +- clang/test/CodeGen/attr-target-mv-func-ptrs.c | 4 +- clang/test/CodeGen/attr-target-mv-va-args.c | 24 +- clang/test/CodeGen/attr-target-mv.c | 28 +- clang/test/CodeGen/attr-x86-interrupt.c | 16 +- clang/test/CodeGen/attributes.c | 2 +- clang/test/CodeGen/available-externally-hidden.cpp | 2 +- clang/test/CodeGen/available-externally-suppress.c | 2 +- clang/test/CodeGen/avr/struct.c | 4 +- clang/test/CodeGen/big-atomic-ops.c | 30 +- clang/test/CodeGen/bittest-intrin.c | 8 +- clang/test/CodeGen/blocks.c | 6 +- clang/test/CodeGen/bool-convert.c | 2 +- clang/test/CodeGen/builtin-align-array.c | 8 +- clang/test/CodeGen/builtin-align.c | 24 +- clang/test/CodeGen/builtin-assume-aligned.c | 31 +- clang/test/CodeGen/builtin-attributes.c | 20 +- clang/test/CodeGen/builtin-memfns.c | 4 +- clang/test/CodeGen/builtin-sqrt.c | 2 +- clang/test/CodeGen/builtins-arm.c | 24 +- clang/test/CodeGen/builtins-memcpy-inline.c | 8 +- clang/test/CodeGen/builtins-ms.c | 4 +- clang/test/CodeGen/builtins-multiprecision.c | 4 +- clang/test/CodeGen/builtins-overflow.c | 12 +- clang/test/CodeGen/builtins-ppc-xlcompat-macros.c | 4 +- clang/test/CodeGen/builtins.c | 44 +- clang/test/CodeGen/c-strings.c | 2 +- clang/test/CodeGen/c11atomics-ios.c | 8 +- clang/test/CodeGen/c11atomics.c | 52 +- clang/test/CodeGen/calling-conv-ignored.c | 32 +- ...-assumption-attribute-align_value-on-lvalue.cpp | 2 +- ...ssumption-attribute-align_value-on-paramvar.cpp | 4 +- ...-attribute-alloc_align-on-function-variable.cpp | 6 +- ...ssumption-attribute-alloc_align-on-function.cpp | 8 +- ...ibute-assume_aligned-on-function-two-params.cpp | 6 +- ...mption-attribute-assume_aligned-on-function.cpp | 8 +- ...uiltin_assume_aligned-three-params-variable.cpp | 2 +- ...umption-builtin_assume_aligned-three-params.cpp | 2 +- ...ssumption-builtin_assume_aligned-two-params.cpp | 2 +- .../CodeGen/catch-alignment-assumption-openmp.cpp | 2 +- .../CodeGen/catch-implicit-integer-sign-changes.c | 18 +- ...icit-signed-integer-truncation-or-sign-change.c | 10 +- ...tr-and-nonzero-offset-when-nullptr-is-defined.c | 2 +- .../CodeGen/catch-nullptr-and-nonzero-offset.c | 14 +- .../test/CodeGen/catch-pointer-overflow-volatile.c | 2 +- clang/test/CodeGen/catch-pointer-overflow.c | 16 +- clang/test/CodeGen/cfi-check-fail.c | 2 +- clang/test/CodeGen/cfi-check-fail2.c | 2 +- clang/test/CodeGen/cmse-clear-arg.c | 2 +- clang/test/CodeGen/complex-builtins.c | 228 +- clang/test/CodeGen/complex-indirect.c | 2 +- clang/test/CodeGen/complex-libcalls.c | 228 +- clang/test/CodeGen/complex-math.c | 12 +- clang/test/CodeGen/complex-strictfp.c | 42 +- clang/test/CodeGen/constructor-attribute.c | 2 +- clang/test/CodeGen/debug-info-block-vars.c | 2 +- clang/test/CodeGen/debug-info-pseudo-probe.cpp | 4 +- clang/test/CodeGen/decl.c | 2 +- clang/test/CodeGen/default-address-space.c | 4 +- clang/test/CodeGen/exceptions-seh-finally.c | 14 +- clang/test/CodeGen/exceptions-seh-leave.c | 30 +- clang/test/CodeGen/exceptions-seh-nested-finally.c | 4 +- clang/test/CodeGen/exceptions-seh.c | 26 +- clang/test/CodeGen/exceptions.c | 2 +- clang/test/CodeGen/ext-int-cc.c | 58 +- clang/test/CodeGen/extend-arg-64.c | 2 +- clang/test/CodeGen/fp-function-attrs.cpp | 6 +- clang/test/CodeGen/fp-options-to-fast-math-flags.c | 18 +- clang/test/CodeGen/fpconstrained-cmp-double.c | 24 +- clang/test/CodeGen/fpconstrained-cmp-float.c | 24 +- clang/test/CodeGen/function-attributes.c | 20 +- clang/test/CodeGen/functions.c | 4 +- clang/test/CodeGen/hexagon-hvx-abi.c | 8 +- clang/test/CodeGen/incomplete-function-type-2.c | 2 +- clang/test/CodeGen/indirect-noundef.cpp | 2 +- clang/test/CodeGen/inline.c | 4 +- clang/test/CodeGen/lanai-arguments.c | 12 +- clang/test/CodeGen/lanai-regparm.c | 12 +- clang/test/CodeGen/libcall-declarations.c | 636 ++-- clang/test/CodeGen/libcalls.c | 54 +- clang/test/CodeGen/long_double_fp128.cpp | 14 +- clang/test/CodeGen/malign-double-x86-nacl.c | 6 +- clang/test/CodeGen/mangle-blocks.c | 6 +- clang/test/CodeGen/mangle-windows.c | 2 +- clang/test/CodeGen/math-builtins-long.c | 386 +-- clang/test/CodeGen/math-builtins.c | 648 ++-- clang/test/CodeGen/math-libcalls.c | 474 +-- clang/test/CodeGen/matrix-cast.c | 26 +- clang/test/CodeGen/matrix-type-builtins.c | 4 +- .../test/CodeGen/matrix-type-operators-fast-math.c | 12 +- clang/test/CodeGen/matrix-type-operators.c | 84 +- clang/test/CodeGen/memcmp-inline-builtin-to-asm.c | 2 +- clang/test/CodeGen/memcpy-inline-builtin.c | 2 +- clang/test/CodeGen/microsoft-call-conv-x64.c | 2 +- clang/test/CodeGen/microsoft-call-conv.c | 2 +- clang/test/CodeGen/mingw-long-double.c | 12 +- clang/test/CodeGen/mips-unsigned-ext-var.c | 6 +- clang/test/CodeGen/mips-unsigned-extend.c | 6 +- clang/test/CodeGen/mips-vector-arg.c | 16 +- clang/test/CodeGen/mips-zero-sized-struct.c | 6 +- clang/test/CodeGen/mips64-padding-arg.c | 24 +- clang/test/CodeGen/mrtd.c | 6 +- clang/test/CodeGen/ms-inline-asm.c | 2 +- clang/test/CodeGen/ms-intrinsics-cpuid.c | 4 +- clang/test/CodeGen/ms-intrinsics-other.c | 2 +- clang/test/CodeGen/ms-mixed-ptr-sizes.c | 20 +- clang/test/CodeGen/ms_abi.c | 4 +- clang/test/CodeGen/ms_abi_aarch64.c | 4 +- clang/test/CodeGen/named_reg_global.c | 2 +- clang/test/CodeGen/no-bitfield-type-align.c | 2 +- clang/test/CodeGen/no-builtin.cpp | 12 +- clang/test/CodeGen/no-prototype.c | 2 +- clang/test/CodeGen/noduplicate-cxx11-test.cpp | 2 +- .../CodeGen/non-power-of-2-alignment-assumptions.c | 10 +- clang/test/CodeGen/nonnull.c | 28 +- clang/test/CodeGen/nrvo-tracking.cpp | 2 +- clang/test/CodeGen/nvptx-abi.c | 10 +- clang/test/CodeGen/object-size.c | 4 +- clang/test/CodeGen/padding-init.c | 6 +- clang/test/CodeGen/pass-by-value-noalias.c | 4 +- clang/test/CodeGen/pass-object-size.c | 114 +- clang/test/CodeGen/pch-dllexport.cpp | 4 +- clang/test/CodeGen/powerpc-c99complex.c | 14 +- clang/test/CodeGen/ppc-emmintrin.c | 750 ++--- clang/test/CodeGen/ppc-mm-malloc-le.c | 8 +- clang/test/CodeGen/ppc-mm-malloc.c | 8 +- clang/test/CodeGen/ppc-mmintrin.c | 124 +- clang/test/CodeGen/ppc-pmmintrin.c | 177 +- clang/test/CodeGen/ppc-signbit.c | 2 +- clang/test/CodeGen/ppc-smmintrin.c | 32 +- clang/test/CodeGen/ppc-tmmintrin.c | 290 +- clang/test/CodeGen/ppc-xmmintrin.c | 400 +-- clang/test/CodeGen/ppc64-align-struct.c | 26 +- clang/test/CodeGen/ppc64-complex-parms.c | 38 +- clang/test/CodeGen/ppc64-complex-return.c | 20 +- clang/test/CodeGen/ppc64-extend.c | 4 +- clang/test/CodeGen/ppc64-inline-asm.c | 14 +- clang/test/CodeGen/ppc64-long-double.cpp | 6 +- clang/test/CodeGen/ppc64-soft-float.c | 6 +- clang/test/CodeGen/ppc64-vector.c | 10 +- clang/test/CodeGen/ppc64le-aggregates.c | 8 +- clang/test/CodeGen/ppc64le-f128Aggregates.c | 4 +- clang/test/CodeGen/ppc64le-varargs-f128.c | 12 +- clang/test/CodeGen/pr25786.c | 4 +- clang/test/CodeGen/pr5406.c | 2 +- clang/test/CodeGen/pr9614.c | 4 +- clang/test/CodeGen/pragma-weak.c | 2 +- clang/test/CodeGen/ps4-dllimport-dllexport.c | 2 +- clang/test/CodeGen/regcall.c | 100 +- clang/test/CodeGen/regparm-flag.c | 12 +- clang/test/CodeGen/regparm-struct.c | 36 +- clang/test/CodeGen/regparm.c | 6 +- clang/test/CodeGen/renderscript.c | 14 +- clang/test/CodeGen/restrict.c | 10 +- .../sanitize-thread-no-checking-at-run-time.m | 2 +- clang/test/CodeGen/sparc-arguments.c | 4 +- clang/test/CodeGen/sparcv8-abi.c | 6 +- clang/test/CodeGen/sparcv8-inline-asm.c | 2 +- clang/test/CodeGen/sparcv9-abi.c | 16 +- clang/test/CodeGen/spir-half-type.cpp | 2 +- clang/test/CodeGen/stack-protector.c | 4 +- clang/test/CodeGen/stdcall-fastcall.c | 24 +- clang/test/CodeGen/strictfp_builtins.c | 26 +- clang/test/CodeGen/swift-async-call-conv.c | 22 +- clang/test/CodeGen/switch-dce.c | 4 +- clang/test/CodeGen/sysv_abi.c | 8 +- clang/test/CodeGen/temporary-lifetime.cpp | 4 +- clang/test/CodeGen/transparent-union-redecl.c | 8 +- clang/test/CodeGen/transparent-union.c | 8 +- clang/test/CodeGen/ubsan-function.cpp | 2 +- .../CodeGen/unique-internal-linkage-names-dwarf.c | 4 +- .../unique-internal-linkage-names-dwarf.cpp | 12 +- .../test/CodeGen/unique-internal-linkage-names.cpp | 16 +- clang/test/CodeGen/variadic-null-win64.c | 12 +- clang/test/CodeGen/ve-abi.c | 34 +- clang/test/CodeGen/vectorcall.c | 86 +- clang/test/CodeGen/vla.c | 22 +- clang/test/CodeGen/win64-i128.c | 4 +- clang/test/CodeGen/windows-itanium.c | 2 +- .../CodeGen/windows-on-arm-dllimport-dllexport.c | 2 +- .../CodeGen/windows-seh-EHa-CppCatchDotDotDot.cpp | 2 +- .../test/CodeGen/windows-seh-EHa-CppCondiTemps.cpp | 18 +- clang/test/CodeGen/windows-seh-EHa-CppDtors01.cpp | 2 +- .../test/CodeGen/windows-seh-EHa-TryInFinally.cpp | 4 +- clang/test/CodeGen/windows-seh-abnormal-exits.c | 2 +- clang/test/CodeGen/windows-swiftcall.c | 22 +- clang/test/CodeGen/x86_32-align-linux.c | 6 +- clang/test/CodeGen/xcore-abi.c | 14 +- clang/test/CodeGen/xray-log-args.cpp | 4 +- clang/test/CodeGenCUDA/address-spaces.cu | 2 +- .../CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu | 10 +- clang/test/CodeGenCUDA/builtins-amdgcn.cu | 2 +- clang/test/CodeGenCUDA/cuda-builtin-vars.cu | 2 +- clang/test/CodeGenCUDA/kernel-args-alignment.cu | 2 +- clang/test/CodeGenCUDA/kernel-args.cu | 8 +- clang/test/CodeGenCUDA/kernel-stub-name.cu | 4 +- clang/test/CodeGenCUDA/lambda.cu | 8 +- clang/test/CodeGenCUDA/redux-builtins.cu | 2 +- clang/test/CodeGenCUDA/surface.cu | 4 +- clang/test/CodeGenCUDA/texture.cu | 6 +- clang/test/CodeGenCUDA/unnamed-types.cu | 8 +- clang/test/CodeGenCUDA/usual-deallocators.cu | 36 +- clang/test/CodeGenCUDA/vtbl.cu | 2 +- .../CodeGenCXX/2009-05-04-PureConstNounwind.cpp | 10 +- .../test/CodeGenCXX/2011-12-19-init-list-ctor.cpp | 6 +- .../diamond-virtual-inheritance.cpp | 2 +- .../CodeGenCXX/RelativeVTablesABI/dynamic-cast.cpp | 8 +- .../RelativeVTablesABI/member-function-pointer.cpp | 2 +- .../RelativeVTablesABI/multiple-inheritance.cpp | 2 +- .../parent-and-child-in-comdats.cpp | 2 +- .../CodeGenCXX/RelativeVTablesABI/type-info.cpp | 2 +- .../CodeGenCXX/RelativeVTablesABI/vbase-offset.cpp | 2 +- .../RelativeVTablesABI/virtual-function-call.cpp | 2 +- clang/test/CodeGenCXX/address-space-cast.cpp | 14 +- clang/test/CodeGenCXX/address-space-ref.cpp | 8 +- clang/test/CodeGenCXX/aix-alignment.cpp | 6 +- .../aix-static-init-temp-spec-and-inline-var.cpp | 14 +- clang/test/CodeGenCXX/aix-static-init.cpp | 4 +- .../test/CodeGenCXX/align-avx-complete-objects.cpp | 4 +- clang/test/CodeGenCXX/alignment.cpp | 20 +- clang/test/CodeGenCXX/alloc-size.cpp | 16 +- .../test/CodeGenCXX/amdgcn-automatic-variable.cpp | 10 +- clang/test/CodeGenCXX/amdgcn-func-arg.cpp | 24 +- clang/test/CodeGenCXX/amdgcn_declspec_get.cpp | 2 +- clang/test/CodeGenCXX/anonymous-namespaces.cpp | 4 +- .../test/CodeGenCXX/apple-kext-indirect-call-2.cpp | 8 +- clang/test/CodeGenCXX/apple-kext-linkage.cpp | 4 +- clang/test/CodeGenCXX/arm-cc.cpp | 4 +- clang/test/CodeGenCXX/arm-swiftcall.cpp | 2 +- clang/test/CodeGenCXX/arm.cpp | 4 +- clang/test/CodeGenCXX/arm64-constructor-return.cpp | 4 +- clang/test/CodeGenCXX/arm64-darwinpcs.cpp | 4 +- clang/test/CodeGenCXX/atomic-dllexport.cpp | 4 +- clang/test/CodeGenCXX/atomic-inline.cpp | 2 +- clang/test/CodeGenCXX/atomicinit.cpp | 8 +- .../CodeGenCXX/attr-cpuspecific-outoflinedefs.cpp | 28 +- clang/test/CodeGenCXX/attr-disable-tail-calls.cpp | 12 +- clang/test/CodeGenCXX/attr-musttail.cpp | 40 +- clang/test/CodeGenCXX/attr-notail.cpp | 10 +- clang/test/CodeGenCXX/attr-target-mv-diff-ns.cpp | 42 +- clang/test/CodeGenCXX/attr-target-mv-func-ptrs.cpp | 6 +- clang/test/CodeGenCXX/attr-target-mv-inalloca.cpp | 16 +- .../CodeGenCXX/attr-target-mv-member-funcs.cpp | 96 +- .../CodeGenCXX/attr-target-mv-out-of-line-defs.cpp | 22 +- clang/test/CodeGenCXX/attr-target-mv-overloads.cpp | 36 +- ...used-member-function-implicit-instantiation.cpp | 2 +- clang/test/CodeGenCXX/attr-x86-interrupt.cpp | 24 +- clang/test/CodeGenCXX/blocks-cxx11.cpp | 16 +- clang/test/CodeGenCXX/blocks.cpp | 4 +- clang/test/CodeGenCXX/builtin-calling-conv.cpp | 18 +- .../CodeGenCXX/builtin-is-constant-evaluated.cpp | 8 +- .../CodeGenCXX/builtin-operator-new-delete.cpp | 20 +- clang/test/CodeGenCXX/builtin-source-location.cpp | 20 +- clang/test/CodeGenCXX/builtin_FUNCTION.cpp | 6 +- clang/test/CodeGenCXX/builtin_LINE.cpp | 24 +- clang/test/CodeGenCXX/builtins.cpp | 4 +- clang/test/CodeGenCXX/call-with-static-chain.cpp | 16 +- clang/test/CodeGenCXX/catch-undef-behavior.cpp | 10 +- clang/test/CodeGenCXX/cfi-cast.cpp | 4 +- clang/test/CodeGenCXX/cfi-multiple-inheritance.cpp | 2 +- .../test/CodeGenCXX/cfi-vcall-check-after-args.cpp | 2 +- clang/test/CodeGenCXX/clang-sections.cpp | 2 +- clang/test/CodeGenCXX/compound-literals.cpp | 6 +- clang/test/CodeGenCXX/condition.cpp | 30 +- clang/test/CodeGenCXX/conditional-gnu-ext.cpp | 14 +- clang/test/CodeGenCXX/conditional-temporaries.cpp | 44 +- clang/test/CodeGenCXX/const-init-cxx11.cpp | 16 +- .../constructor-destructor-return-this.cpp | 100 +- clang/test/CodeGenCXX/constructor-direct-call.cpp | 14 +- clang/test/CodeGenCXX/constructor-init.cpp | 10 +- clang/test/CodeGenCXX/constructors.cpp | 24 +- clang/test/CodeGenCXX/convert-to-fptr.cpp | 4 +- clang/test/CodeGenCXX/copy-assign-synthesis-1.cpp | 2 +- clang/test/CodeGenCXX/copy-constructor-elim-2.cpp | 2 +- .../CodeGenCXX/copy-constructor-synthesis-2.cpp | 2 +- .../test/CodeGenCXX/copy-constructor-synthesis.cpp | 6 +- clang/test/CodeGenCXX/copy-elision.cpp | 2 +- clang/test/CodeGenCXX/copy-initialization.cpp | 2 +- clang/test/CodeGenCXX/cxx-abi-switch.cpp | 4 +- clang/test/CodeGenCXX/cxx0x-delegating-ctors.cpp | 2 +- .../CodeGenCXX/cxx0x-initializer-constructors.cpp | 14 +- .../CodeGenCXX/cxx0x-initializer-references.cpp | 4 +- .../CodeGenCXX/cxx11-initializer-aggregate.cpp | 4 +- .../CodeGenCXX/cxx11-initializer-array-new.cpp | 30 +- .../CodeGenCXX/cxx11-thread-local-reference.cpp | 6 +- .../CodeGenCXX/cxx11-thread-local-visibility.cpp | 8 +- clang/test/CodeGenCXX/cxx11-thread-local.cpp | 38 +- .../test/CodeGenCXX/cxx11-user-defined-literal.cpp | 20 +- clang/test/CodeGenCXX/cxx1y-init-captures.cpp | 12 +- .../CodeGenCXX/cxx1y-initializer-aggregate.cpp | 6 +- clang/test/CodeGenCXX/cxx1y-sized-deallocation.cpp | 48 +- .../CodeGenCXX/cxx1y-variable-template-linkage.cpp | 10 +- clang/test/CodeGenCXX/cxx1y-variable-template.cpp | 2 +- clang/test/CodeGenCXX/cxx1z-aligned-allocation.cpp | 68 +- clang/test/CodeGenCXX/cxx1z-copy-omission.cpp | 8 +- clang/test/CodeGenCXX/cxx1z-decomposition.cpp | 4 +- clang/test/CodeGenCXX/cxx1z-init-statement.cpp | 4 +- .../CodeGenCXX/cxx1z-initializer-aggregate.cpp | 20 +- clang/test/CodeGenCXX/cxx1z-inline-variables.cpp | 8 +- clang/test/CodeGenCXX/cxx2a-consteval.cpp | 11 +- clang/test/CodeGenCXX/cxx2a-destroying-delete.cpp | 38 +- .../debug-info-codeview-heapallocsite.cpp | 6 +- .../test/CodeGenCXX/debug-info-destroy-helper.cpp | 48 +- clang/test/CodeGenCXX/debug-info-globalinit.cpp | 6 +- clang/test/CodeGenCXX/debug-info-line.cpp | 4 +- clang/test/CodeGenCXX/debug-info-nested-exprs.cpp | 84 +- clang/test/CodeGenCXX/debug-info-static-fns.cpp | 2 +- clang/test/CodeGenCXX/debug-info-thunk-msabi.cpp | 2 +- clang/test/CodeGenCXX/decl-ref-init.cpp | 4 +- clang/test/CodeGenCXX/default-arg-temps.cpp | 4 +- clang/test/CodeGenCXX/default-arguments.cpp | 2 +- clang/test/CodeGenCXX/default_calling_conv.cpp | 24 +- clang/test/CodeGenCXX/delete-two-arg.cpp | 8 +- clang/test/CodeGenCXX/delete.cpp | 6 +- clang/test/CodeGenCXX/derived-to-base-conv.cpp | 6 +- clang/test/CodeGenCXX/derived-to-base.cpp | 4 +- clang/test/CodeGenCXX/destructors.cpp | 8 +- clang/test/CodeGenCXX/devirtualize-ms-dtor.cpp | 2 +- .../devirtualize-virtual-function-calls-final.cpp | 34 +- .../devirtualize-virtual-function-calls.cpp | 2 +- clang/test/CodeGenCXX/dllexport-ctor-closure.cpp | 10 +- clang/test/CodeGenCXX/dllexport-dtor-thunks.cpp | 2 +- clang/test/CodeGenCXX/dllexport-members.cpp | 12 +- .../CodeGenCXX/dllexport-no-dllexport-inlines.cpp | 18 +- clang/test/CodeGenCXX/dllexport.cpp | 12 +- clang/test/CodeGenCXX/dllimport-members.cpp | 12 +- clang/test/CodeGenCXX/dllimport-runtime-fns.cpp | 6 +- clang/test/CodeGenCXX/dllimport.cpp | 18 +- clang/test/CodeGenCXX/eh.cpp | 10 +- .../CodeGenCXX/empty-nontrivially-copyable.cpp | 6 +- clang/test/CodeGenCXX/exceptions-cxx-new.cpp | 10 +- .../CodeGenCXX/exceptions-seh-filter-captures.cpp | 24 +- .../CodeGenCXX/exceptions-seh-filter-uwtable.cpp | 2 +- clang/test/CodeGenCXX/exceptions-seh.cpp | 16 +- clang/test/CodeGenCXX/exceptions.cpp | 4 +- clang/test/CodeGenCXX/explicit-instantiation.cpp | 32 +- clang/test/CodeGenCXX/ext-int.cpp | 16 +- clang/test/CodeGenCXX/fastcall.cpp | 2 +- clang/test/CodeGenCXX/float128-declarations.cpp | 20 +- clang/test/CodeGenCXX/float16-declarations.cpp | 8 +- clang/test/CodeGenCXX/for-cond-var.cpp | 16 +- clang/test/CodeGenCXX/for-range-temporaries.cpp | 2 +- clang/test/CodeGenCXX/for-range.cpp | 20 +- clang/test/CodeGenCXX/forward-enum.cpp | 2 +- clang/test/CodeGenCXX/fp16-mangle-arg-return.cpp | 4 +- clang/test/CodeGenCXX/fp16-mangle.cpp | 4 +- clang/test/CodeGenCXX/fp16-overload.cpp | 4 +- clang/test/CodeGenCXX/global-init.cpp | 2 +- clang/test/CodeGenCXX/goto.cpp | 6 +- clang/test/CodeGenCXX/homogeneous-aggregates.cpp | 28 +- clang/test/CodeGenCXX/ibm128-declarations.cpp | 24 +- .../CodeGenCXX/implicit-copy-assign-operator.cpp | 2 +- .../test/CodeGenCXX/implicit-copy-constructor.cpp | 2 +- clang/test/CodeGenCXX/inalloca-overaligned.cpp | 38 +- clang/test/CodeGenCXX/inalloca-stmtexpr.cpp | 2 +- clang/test/CodeGenCXX/inalloca-vector.cpp | 40 +- .../CodeGenCXX/inheriting-constructor-cleanup.cpp | 4 +- clang/test/CodeGenCXX/inheriting-constructor.cpp | 10 +- clang/test/CodeGenCXX/init-invariant.cpp | 14 +- clang/test/CodeGenCXX/init-priority-attr.cpp | 10 +- .../CodeGenCXX/initializer-list-ctor-order.cpp | 2 +- clang/test/CodeGenCXX/inline-functions.cpp | 2 +- clang/test/CodeGenCXX/lambda-conversion-op-cc.cpp | 56 +- .../lambda-expressions-inside-auto-functions.cpp | 8 +- .../lambda-expressions-nested-linkage.cpp | 10 +- clang/test/CodeGenCXX/lambda-expressions.cpp | 30 +- clang/test/CodeGenCXX/lifetime-sanitizer.cpp | 2 +- clang/test/CodeGenCXX/linkage.cpp | 2 +- clang/test/CodeGenCXX/mangle-abi-tag.cpp | 2 +- clang/test/CodeGenCXX/mangle-exprs.cpp | 8 +- clang/test/CodeGenCXX/mangle-extern-local.cpp | 6 +- clang/test/CodeGenCXX/mangle-lambdas.cpp | 102 +- clang/test/CodeGenCXX/mangle-ms-cxx11.cpp | 4 +- .../CodeGenCXX/mangle-ms-templates-memptrs-2.cpp | 2 +- clang/test/CodeGenCXX/mangle-ms-vector-types.cpp | 14 +- clang/test/CodeGenCXX/mangle-ms.cpp | 10 +- clang/test/CodeGenCXX/mangle-this-cxx11.cpp | 4 +- clang/test/CodeGenCXX/mangle-win-ccs.cpp | 24 +- clang/test/CodeGenCXX/mangle-win64-ccs.cpp | 14 +- clang/test/CodeGenCXX/mangle.cpp | 32 +- clang/test/CodeGenCXX/matrix-casts.cpp | 8 +- clang/test/CodeGenCXX/matrix-type-builtins.cpp | 56 +- clang/test/CodeGenCXX/matrix-type-operators.cpp | 48 +- clang/test/CodeGenCXX/matrix-type.cpp | 2 +- .../CodeGenCXX/member-expr-references-variable.cpp | 40 +- clang/test/CodeGenCXX/member-expressions.cpp | 2 +- .../CodeGenCXX/member-function-pointer-calls.cpp | 8 +- clang/test/CodeGenCXX/member-init-assignment.cpp | 2 +- clang/test/CodeGenCXX/member-templates.cpp | 4 +- clang/test/CodeGenCXX/microsoft-abi-arg-order.cpp | 16 +- .../CodeGenCXX/microsoft-abi-array-cookies.cpp | 8 +- clang/test/CodeGenCXX/microsoft-abi-byval-sret.cpp | 8 +- .../test/CodeGenCXX/microsoft-abi-byval-thunks.cpp | 16 +- .../test/CodeGenCXX/microsoft-abi-byval-vararg.cpp | 12 +- .../CodeGenCXX/microsoft-abi-cdecl-method-sret.cpp | 8 +- .../test/CodeGenCXX/microsoft-abi-dynamic-cast.cpp | 22 +- clang/test/CodeGenCXX/microsoft-abi-eh-catch.cpp | 6 +- .../test/CodeGenCXX/microsoft-abi-eh-cleanups.cpp | 56 +- .../CodeGenCXX/microsoft-abi-extern-template.cpp | 8 +- .../CodeGenCXX/microsoft-abi-member-pointers.cpp | 42 +- clang/test/CodeGenCXX/microsoft-abi-methods.cpp | 10 +- ...crosoft-abi-multiple-nonvirtual-inheritance.cpp | 10 +- .../CodeGenCXX/microsoft-abi-sret-and-byval.cpp | 78 +- .../microsoft-abi-static-initializers.cpp | 24 +- clang/test/CodeGenCXX/microsoft-abi-structors.cpp | 2 +- .../CodeGenCXX/microsoft-abi-this-nullable.cpp | 2 +- .../microsoft-abi-thread-safe-statics.cpp | 2 +- clang/test/CodeGenCXX/microsoft-abi-throw.cpp | 4 +- clang/test/CodeGenCXX/microsoft-abi-thunks.cpp | 14 +- clang/test/CodeGenCXX/microsoft-abi-typeid.cpp | 16 +- .../test/CodeGenCXX/microsoft-abi-unknown-arch.cpp | 2 +- clang/test/CodeGenCXX/microsoft-abi-vbase-dtor.cpp | 2 +- ...microsoft-abi-virtual-inheritance-vtordisps.cpp | 6 +- .../microsoft-abi-virtual-inheritance.cpp | 54 +- .../microsoft-abi-virtual-member-pointers.cpp | 56 +- .../CodeGenCXX/microsoft-abi-vmemptr-conflicts.cpp | 34 +- .../CodeGenCXX/microsoft-abi-vmemptr-fastcall.cpp | 4 +- ...iple-nonvirtual-inheritance-this-adjustment.cpp | 4 +- clang/test/CodeGenCXX/microsoft-compatibility.cpp | 2 +- .../CodeGenCXX/microsoft-inaccessible-base.cpp | 4 +- clang/test/CodeGenCXX/microsoft-interface.cpp | 10 +- clang/test/CodeGenCXX/microsoft-new.cpp | 8 +- clang/test/CodeGenCXX/mips-size_t-ptrdiff_t.cpp | 12 +- clang/test/CodeGenCXX/ms-inline-asm-fields.cpp | 2 +- clang/test/CodeGenCXX/ms-inline-asm-return.cpp | 2 +- clang/test/CodeGenCXX/ms-property.cpp | 48 +- clang/test/CodeGenCXX/ms-thunks-ehspec.cpp | 4 +- clang/test/CodeGenCXX/ms-thunks-unprototyped.cpp | 18 +- clang/test/CodeGenCXX/ms-union-member-ref.cpp | 6 +- .../test/CodeGenCXX/msabi-ctor-abstract-vbase.cpp | 8 +- clang/test/CodeGenCXX/multi-dim-operator-new.cpp | 6 +- clang/test/CodeGenCXX/new-alias.cpp | 2 +- clang/test/CodeGenCXX/new-array-init.cpp | 18 +- clang/test/CodeGenCXX/new-infallible.cpp | 4 +- clang/test/CodeGenCXX/new-overflow.cpp | 30 +- clang/test/CodeGenCXX/new.cpp | 56 +- clang/test/CodeGenCXX/noescape.cpp | 22 +- clang/test/CodeGenCXX/nonconst-init.cpp | 2 +- clang/test/CodeGenCXX/nrvo.cpp | 4 +- clang/test/CodeGenCXX/observe-noexcept.cpp | 4 +- clang/test/CodeGenCXX/operator-new.cpp | 8 +- clang/test/CodeGenCXX/partial-destruction.cpp | 22 +- clang/test/CodeGenCXX/pass-by-value-noalias.cpp | 16 +- clang/test/CodeGenCXX/pass-object-size.cpp | 8 +- clang/test/CodeGenCXX/pod-member-memcpys.cpp | 4 +- clang/test/CodeGenCXX/powerpc-byval.cpp | 2 +- clang/test/CodeGenCXX/pr13396.cpp | 12 +- clang/test/CodeGenCXX/pr20897.cpp | 4 +- clang/test/CodeGenCXX/pr24097.cpp | 2 +- clang/test/CodeGenCXX/pr28360.cpp | 2 +- clang/test/CodeGenCXX/pr9130.cpp | 2 +- clang/test/CodeGenCXX/pragma-visibility.cpp | 2 +- clang/test/CodeGenCXX/redefine_extname.cpp | 2 +- clang/test/CodeGenCXX/reference-cast.cpp | 12 +- clang/test/CodeGenCXX/references.cpp | 2 +- clang/test/CodeGenCXX/regcall.cpp | 42 +- clang/test/CodeGenCXX/regparm.cpp | 6 +- clang/test/CodeGenCXX/runtime-dllstorage.cpp | 14 +- clang/test/CodeGenCXX/runtimecc.cpp | 2 +- clang/test/CodeGenCXX/rvalue-references.cpp | 12 +- clang/test/CodeGenCXX/split-stacks.cpp | 12 +- clang/test/CodeGenCXX/stack-reuse-miscompile.cpp | 8 +- clang/test/CodeGenCXX/stack-reuse.cpp | 2 +- clang/test/CodeGenCXX/static-data-member.cpp | 4 +- clang/test/CodeGenCXX/static-destructor.cpp | 4 +- clang/test/CodeGenCXX/static-init-1.cpp | 8 +- clang/test/CodeGenCXX/static-init-wasm.cpp | 4 +- clang/test/CodeGenCXX/static-init.cpp | 14 +- .../CodeGenCXX/static-local-in-local-class.cpp | 20 +- clang/test/CodeGenCXX/stmtexpr.cpp | 16 +- clang/test/CodeGenCXX/switch-case-folding-2.cpp | 2 +- clang/test/CodeGenCXX/temp-order.cpp | 18 +- clang/test/CodeGenCXX/template-anonymous-types.cpp | 12 +- clang/test/CodeGenCXX/temporaries.cpp | 48 +- clang/test/CodeGenCXX/this-nonnull.cpp | 8 +- clang/test/CodeGenCXX/thunk-linkonce-odr.cpp | 4 +- clang/test/CodeGenCXX/thunk-returning-memptr.cpp | 2 +- clang/test/CodeGenCXX/thunks-ehspec.cpp | 6 +- clang/test/CodeGenCXX/thunks.cpp | 20 +- clang/test/CodeGenCXX/tls-init-funcs.cpp | 10 +- clang/test/CodeGenCXX/trivial_abi.cpp | 46 +- clang/test/CodeGenCXX/ubsan-suppress-checks.cpp | 16 +- clang/test/CodeGenCXX/ubsan-vtable-checks.cpp | 4 +- clang/test/CodeGenCXX/uncopyable-args.cpp | 48 +- clang/test/CodeGenCXX/unknown-anytype.cpp | 28 +- clang/test/CodeGenCXX/value-init.cpp | 4 +- clang/test/CodeGenCXX/varargs.cpp | 2 +- clang/test/CodeGenCXX/variadic-templates.cpp | 2 +- .../CodeGenCXX/virtual-base-destructor-call.cpp | 4 +- clang/test/CodeGenCXX/virtual-bases.cpp | 8 +- clang/test/CodeGenCXX/virtual-operator-call.cpp | 4 +- .../visibility-inlines-hidden-staticvar.cpp | 44 +- .../test/CodeGenCXX/visibility-inlines-hidden.cpp | 4 +- clang/test/CodeGenCXX/vla-consruct.cpp | 4 +- clang/test/CodeGenCXX/vla-lambda-capturing.cpp | 6 +- clang/test/CodeGenCXX/vla.cpp | 4 +- clang/test/CodeGenCXX/volatile.cpp | 2 +- clang/test/CodeGenCXX/vtable-assume-load.cpp | 2 +- .../CodeGenCXX/vtable-available-externally.cpp | 16 +- clang/test/CodeGenCXX/wasm-args-returns.cpp | 4 +- clang/test/CodeGenCXX/wasm-eh.cpp | 8 +- .../windows-on-arm-itanium-thread-local.cpp | 2 +- clang/test/CodeGenCXX/windows-x86-swiftcall.cpp | 6 +- clang/test/CodeGenCXX/x86_32-arguments.cpp | 8 +- clang/test/CodeGenCXX/x86_64-arguments-avx.cpp | 2 +- .../test/CodeGenCXX/x86_64-arguments-nacl-x32.cpp | 2 +- clang/test/CodeGenCXX/x86_64-arguments.cpp | 2 +- .../CodeGenCoroutines/coro-alloc-exp-namespace.cpp | 26 +- clang/test/CodeGenCoroutines/coro-alloc.cpp | 26 +- .../CodeGenCoroutines/coro-await-exp-namespace.cpp | 2 +- clang/test/CodeGenCoroutines/coro-await.cpp | 2 +- clang/test/CodeGenCoroutines/coro-builtins.c | 2 +- .../coro-cleanup-exp-namespace.cpp | 6 +- clang/test/CodeGenCoroutines/coro-cleanup.cpp | 6 +- .../CodeGenCoroutines/coro-gro-exp-namespace.cpp | 6 +- .../coro-gro-nrvo-exp-namespace.cpp | 8 +- clang/test/CodeGenCoroutines/coro-gro-nrvo.cpp | 8 +- clang/test/CodeGenCoroutines/coro-gro.cpp | 6 +- .../coro-params-exp-namespace.cpp | 22 +- clang/test/CodeGenCoroutines/coro-params.cpp | 22 +- .../coro-promise-dtor-exp-namespace.cpp | 2 +- clang/test/CodeGenCoroutines/coro-promise-dtor.cpp | 2 +- .../coro-ret-void-exp-namespace.cpp | 2 +- clang/test/CodeGenCoroutines/coro-ret-void.cpp | 2 +- .../coro-return-exp-namespace.cpp | 6 +- clang/test/CodeGenCoroutines/coro-return.cpp | 6 +- .../coro-symmetric-transfer-01-exp-namespace.cpp | 4 +- .../coro-symmetric-transfer-01.cpp | 2 +- clang/test/CodeGenObjC/arc-blocks.m | 44 +- clang/test/CodeGenObjC/arc-foreach.m | 4 +- clang/test/CodeGenObjC/arc-literals.m | 16 +- clang/test/CodeGenObjC/arc-no-arc-exceptions.m | 6 +- clang/test/CodeGenObjC/arc-precise-lifetime.m | 4 +- clang/test/CodeGenObjC/arc-property.m | 10 +- clang/test/CodeGenObjC/arc-ternary-op.m | 4 +- clang/test/CodeGenObjC/arc.m | 44 +- .../CodeGenObjC/arm-atomic-scalar-setter-getter.m | 4 +- clang/test/CodeGenObjC/atomic-aggregate-property.m | 4 +- .../test/CodeGenObjC/availability-cf-link-guard.m | 2 +- clang/test/CodeGenObjC/blocks.m | 4 +- clang/test/CodeGenObjC/builtin-constant-p.m | 4 +- clang/test/CodeGenObjC/class-stubs.m | 10 +- clang/test/CodeGenObjC/debug-info-blocks.m | 2 +- clang/test/CodeGenObjC/debug-info-nested-blocks.m | 2 +- clang/test/CodeGenObjC/exceptions.m | 16 +- clang/test/CodeGenObjC/for-in.m | 2 +- clang/test/CodeGenObjC/fragile-arc.m | 8 +- clang/test/CodeGenObjC/gnu-exceptions.m | 4 +- clang/test/CodeGenObjC/implicit-objc_msgSend.m | 2 +- clang/test/CodeGenObjC/ivar-invariant.m | 2 +- clang/test/CodeGenObjC/local-static-block.m | 2 +- clang/test/CodeGenObjC/mangle-blocks.m | 6 +- clang/test/CodeGenObjC/matrix-type-builtins.m | 16 +- clang/test/CodeGenObjC/matrix-type-operators.m | 10 +- clang/test/CodeGenObjC/noescape.m | 10 +- .../CodeGenObjC/nontrivial-c-struct-exception.m | 2 +- .../nontrivial-c-struct-within-struct-name.m | 6 +- .../CodeGenObjC/nsvalue-objc-boxable-ios-arc.m | 12 +- clang/test/CodeGenObjC/nsvalue-objc-boxable-ios.m | 12 +- .../CodeGenObjC/nsvalue-objc-boxable-mac-arc.m | 12 +- clang/test/CodeGenObjC/nsvalue-objc-boxable-mac.m | 12 +- .../CodeGenObjC/objc-container-subscripting-1.m | 8 +- clang/test/CodeGenObjC/objc-literal-tests.m | 26 +- .../CodeGenObjC/objc-non-trivial-struct-nrvo.m | 6 +- clang/test/CodeGenObjC/objfw.m | 2 +- clang/test/CodeGenObjC/optimize-ivar-offset-load.m | 2 +- clang/test/CodeGenObjC/os_log.m | 12 +- clang/test/CodeGenObjC/parameterized_classes.m | 2 +- clang/test/CodeGenObjC/pass-by-value-noalias.m | 4 +- clang/test/CodeGenObjC/property-array-type.m | 2 +- clang/test/CodeGenObjC/property-atomic-bool.m | 4 +- clang/test/CodeGenObjC/property-ref-cast-to-void.m | 4 +- clang/test/CodeGenObjC/property.m | 10 +- clang/test/CodeGenObjC/return-objc-object.mm | 4 +- clang/test/CodeGenObjC/stret_lookup.m | 4 +- clang/test/CodeGenObjC/strong-in-c-struct.m | 54 +- .../test/CodeGenObjC/tentative-cfconstantstring.m | 2 +- clang/test/CodeGenObjC/terminate.m | 8 +- clang/test/CodeGenObjC/ubsan-bool.m | 6 +- clang/test/CodeGenObjC/ubsan-nonnull.m | 12 +- clang/test/CodeGenObjC/ubsan-nullability.m | 4 +- clang/test/CodeGenObjC/weak-in-c-struct.m | 30 +- clang/test/CodeGenObjCXX/arc-attrs.mm | 18 +- clang/test/CodeGenObjCXX/arc-blocks.mm | 6 +- clang/test/CodeGenObjCXX/arc-cxx11-init-list.mm | 2 +- clang/test/CodeGenObjCXX/arc-cxx11-member-init.mm | 4 +- clang/test/CodeGenObjCXX/arc-exceptions.mm | 8 +- .../CodeGenObjCXX/arc-forwarded-lambda-call.mm | 8 +- clang/test/CodeGenObjCXX/arc-globals.mm | 4 +- clang/test/CodeGenObjCXX/arc-list-init-destruct.mm | 2 +- clang/test/CodeGenObjCXX/arc-mangle.mm | 22 +- clang/test/CodeGenObjCXX/arc-marker-funclet.mm | 2 +- clang/test/CodeGenObjCXX/arc-move.mm | 6 +- clang/test/CodeGenObjCXX/arc-new-delete.mm | 16 +- clang/test/CodeGenObjCXX/arc-references.mm | 6 +- clang/test/CodeGenObjCXX/arc-rv-attr.mm | 2 +- .../CodeGenObjCXX/arc-special-member-functions.mm | 2 +- clang/test/CodeGenObjCXX/arc.mm | 44 +- .../CodeGenObjCXX/auto-release-result-assert.mm | 8 +- clang/test/CodeGenObjCXX/block-default-arg.mm | 4 +- clang/test/CodeGenObjCXX/block-nested-in-lambda.mm | 4 +- clang/test/CodeGenObjCXX/copy.mm | 2 +- .../CodeGenObjCXX/implicit-copy-assign-operator.mm | 2 +- .../CodeGenObjCXX/implicit-copy-constructor.mm | 2 +- .../inheriting-constructor-cleanup.mm | 2 +- clang/test/CodeGenObjCXX/lambda-expressions.mm | 20 +- clang/test/CodeGenObjCXX/lambda-to-block.mm | 18 +- clang/test/CodeGenObjCXX/literals.mm | 8 +- .../test/CodeGenObjCXX/lvalue-reference-getter.mm | 4 +- clang/test/CodeGenObjCXX/mangle-blocks.mm | 8 +- clang/test/CodeGenObjCXX/message-reference.mm | 2 +- clang/test/CodeGenObjCXX/message.mm | 4 +- .../CodeGenObjCXX/objc-container-subscripting.mm | 2 +- clang/test/CodeGenObjCXX/objc-struct-cxx-abi.mm | 54 +- clang/test/CodeGenObjCXX/objc-weak.mm | 4 +- .../CodeGenObjCXX/property-dot-copy-elision.mm | 6 +- clang/test/CodeGenObjCXX/property-dot-reference.mm | 22 +- .../test/CodeGenObjCXX/property-lvalue-capture.mm | 6 +- clang/test/CodeGenObjCXX/property-lvalue-lambda.mm | 2 +- .../CodeGenObjCXX/property-object-reference-1.mm | 2 +- .../CodeGenObjCXX/property-object-reference-2.mm | 14 +- clang/test/CodeGenObjCXX/property-objects.mm | 14 +- clang/test/CodeGenObjCXX/property-reference.mm | 6 +- clang/test/CodeGenObjCXX/selector-expr-lvalue.mm | 2 +- .../CodeGenObjCXX/synthesized-property-cleanup.mm | 2 +- .../ubsan-nullability-return-notypeloc.mm | 2 +- clang/test/CodeGenOpenCL/addr-space-struct-arg.cl | 20 +- clang/test/CodeGenOpenCL/address-spaces.cl | 10 +- .../CodeGenOpenCL/amdgcn-automatic-variable.cl | 8 +- .../test/CodeGenOpenCL/amdgpu-abi-struct-coerce.cl | 48 +- clang/test/CodeGenOpenCL/amdgpu-call-kernel.cl | 2 +- clang/test/CodeGenOpenCL/amdgpu-nullptr.cl | 8 +- clang/test/CodeGenOpenCL/as_type.cl | 26 +- clang/test/CodeGenOpenCL/atomic-ops-libcall.cl | 54 +- clang/test/CodeGenOpenCL/blocks.cl | 12 +- clang/test/CodeGenOpenCL/byval.cl | 4 +- .../test/CodeGenOpenCL/cl20-device-side-enqueue.cl | 6 +- clang/test/CodeGenOpenCL/const-str-array-decay.cl | 2 +- .../CodeGenOpenCL/constant-addr-space-globals.cl | 2 +- clang/test/CodeGenOpenCL/convergent.cl | 4 +- clang/test/CodeGenOpenCL/fpmath.cl | 4 +- clang/test/CodeGenOpenCL/half.cl | 8 +- .../kernels-have-spir-cc-by-default.cl | 8 +- clang/test/CodeGenOpenCL/no-half.cl | 4 +- clang/test/CodeGenOpenCL/overload.cl | 20 +- clang/test/CodeGenOpenCL/printf.cl | 12 +- clang/test/CodeGenOpenCL/size_t.cl | 60 +- clang/test/CodeGenOpenCL/spir-calling-conv.cl | 10 +- .../CodeGenOpenCLCXX/address-space-deduction.clcpp | 2 +- .../CodeGenOpenCLCXX/addrspace-derived-base.clcpp | 4 +- .../CodeGenOpenCLCXX/addrspace-new-delete.clcpp | 2 +- .../test/CodeGenOpenCLCXX/addrspace-of-this.clcpp | 32 +- .../CodeGenOpenCLCXX/addrspace-operators.clcpp | 4 +- .../CodeGenOpenCLCXX/addrspace-references.clcpp | 2 +- .../CodeGenOpenCLCXX/addrspace-with-class.clcpp | 22 +- .../CodeGenOpenCLCXX/template-address-spaces.clcpp | 6 +- .../test/CodeGenSYCL/address-space-conversions.cpp | 52 +- clang/test/CodeGenSYCL/address-space-mangling.cpp | 16 +- clang/test/CodeGenSYCL/unique_stable_name.cpp | 40 +- clang/test/Headers/ms-arm64-intrin.cpp | 6 +- clang/test/Headers/stdarg.cpp | 28 +- clang/test/Modules/codegen-extern-template.cpp | 2 +- clang/test/Modules/codegen.test | 2 +- clang/test/Modules/cxx-irgen.cpp | 2 +- clang/test/Modules/initializers.cpp | 4 +- clang/test/Modules/templates.mm | 8 +- clang/test/OpenMP/allocate_codegen.cpp | 2 +- clang/test/OpenMP/allocate_codegen_attr.cpp | 2 +- clang/test/OpenMP/assumes_include_nvptx.cpp | 6 +- clang/test/OpenMP/atomic_capture_codegen.cpp | 28 +- clang/test/OpenMP/atomic_codegen.cpp | 8 +- clang/test/OpenMP/atomic_read_codegen.c | 14 +- clang/test/OpenMP/atomic_update_codegen.cpp | 28 +- clang/test/OpenMP/atomic_write_codegen.c | 18 +- clang/test/OpenMP/cancel_codegen.cpp | 104 +- clang/test/OpenMP/cancellation_point_codegen.cpp | 28 +- clang/test/OpenMP/debug-info-complex-byval.cpp | 49 +- clang/test/OpenMP/debug-info-openmp-array.cpp | 6 +- clang/test/OpenMP/declare_mapper_codegen.cpp | 20 +- clang/test/OpenMP/declare_reduction_codegen.c | 48 +- clang/test/OpenMP/declare_reduction_codegen.cpp | 46 +- .../declare_reduction_codegen_in_templates.cpp | 2 +- clang/test/OpenMP/declare_target_codegen.cpp | 4 +- .../declare_target_codegen_globalization.cpp | 12 +- clang/test/OpenMP/declare_target_link_codegen.cpp | 4 +- clang/test/OpenMP/declare_variant_mixed_codegen.c | 12 +- clang/test/OpenMP/distribute_codegen.cpp | 304 +- .../OpenMP/distribute_firstprivate_codegen.cpp | 329 +- .../test/OpenMP/distribute_lastprivate_codegen.cpp | 361 ++- .../OpenMP/distribute_parallel_for_codegen.cpp | 576 ++-- ...istribute_parallel_for_firstprivate_codegen.cpp | 385 ++- .../OpenMP/distribute_parallel_for_if_codegen.cpp | 320 +- ...distribute_parallel_for_lastprivate_codegen.cpp | 449 ++- ...distribute_parallel_for_num_threads_codegen.cpp | 481 ++- .../distribute_parallel_for_private_codegen.cpp | 425 ++- .../distribute_parallel_for_proc_bind_codegen.cpp | 29 +- ...tribute_parallel_for_reduction_task_codegen.cpp | 44 +- .../distribute_parallel_for_simd_codegen.cpp | 592 ++-- ...bute_parallel_for_simd_firstprivate_codegen.cpp | 1362 ++++----- .../distribute_parallel_for_simd_if_codegen.cpp | 3192 ++++++++++---------- ...ibute_parallel_for_simd_lastprivate_codegen.cpp | 1336 ++++---- ...ibute_parallel_for_simd_num_threads_codegen.cpp | 2640 ++++++++-------- ...istribute_parallel_for_simd_private_codegen.cpp | 1288 ++++---- ...tribute_parallel_for_simd_proc_bind_codegen.cpp | 236 +- clang/test/OpenMP/distribute_private_codegen.cpp | 345 ++- clang/test/OpenMP/distribute_simd_codegen.cpp | 512 ++-- .../distribute_simd_firstprivate_codegen.cpp | 944 +++--- .../OpenMP/distribute_simd_lastprivate_codegen.cpp | 1008 +++---- .../OpenMP/distribute_simd_private_codegen.cpp | 1056 +++---- .../OpenMP/distribute_simd_reduction_codegen.cpp | 272 +- clang/test/OpenMP/for_codegen.cpp | 16 +- clang/test/OpenMP/for_firstprivate_codegen.cpp | 313 +- clang/test/OpenMP/for_lastprivate_codegen.cpp | 601 ++-- clang/test/OpenMP/for_linear_codegen.cpp | 165 +- clang/test/OpenMP/for_private_codegen.cpp | 177 +- clang/test/OpenMP/for_reduction_codegen.cpp | 760 ++--- clang/test/OpenMP/for_reduction_codegen_UDR.cpp | 936 +++--- clang/test/OpenMP/for_reduction_task_codegen.cpp | 36 +- clang/test/OpenMP/for_scan_codegen.cpp | 2 +- clang/test/OpenMP/for_simd_codegen.cpp | 6 +- clang/test/OpenMP/for_simd_scan_codegen.cpp | 2 +- clang/test/OpenMP/function-attr.cpp | 8 +- clang/test/OpenMP/irbuilder_for_iterator.cpp | 24 +- clang/test/OpenMP/irbuilder_for_rangefor.cpp | 28 +- clang/test/OpenMP/irbuilder_for_unsigned.c | 6 +- ...builder_unroll_partial_heuristic_constant_for.c | 2 +- ...builder_unroll_partial_heuristic_for_collapse.c | 380 ++- ...rbuilder_unroll_partial_heuristic_runtime_for.c | 2 +- clang/test/OpenMP/master_taskloop_codegen.cpp | 10 +- .../master_taskloop_firstprivate_codegen.cpp | 22 +- .../master_taskloop_in_reduction_codegen.cpp | 12 +- .../OpenMP/master_taskloop_lastprivate_codegen.cpp | 22 +- .../OpenMP/master_taskloop_private_codegen.cpp | 22 +- .../OpenMP/master_taskloop_reduction_codegen.cpp | 22 +- clang/test/OpenMP/master_taskloop_simd_codegen.cpp | 8 +- .../master_taskloop_simd_firstprivate_codegen.cpp | 22 +- .../master_taskloop_simd_in_reduction_codegen.cpp | 12 +- .../master_taskloop_simd_lastprivate_codegen.cpp | 22 +- .../master_taskloop_simd_private_codegen.cpp | 22 +- .../master_taskloop_simd_reduction_codegen.cpp | 22 +- clang/test/OpenMP/nvptx_allocate_codegen.cpp | 8 +- clang/test/OpenMP/nvptx_data_sharing.cpp | 8 +- .../nvptx_declare_target_var_ctor_dtor_codegen.cpp | 28 +- .../OpenMP/nvptx_declare_variant_name_mangling.cpp | 4 +- ...tx_distribute_parallel_generic_mode_codegen.cpp | 48 +- clang/test/OpenMP/nvptx_lambda_capturing.cpp | 122 +- .../OpenMP/nvptx_multi_target_parallel_codegen.cpp | 18 +- .../test/OpenMP/nvptx_nested_parallel_codegen.cpp | 72 +- clang/test/OpenMP/nvptx_parallel_codegen.cpp | 52 +- clang/test/OpenMP/nvptx_parallel_for_codegen.cpp | 6 +- clang/test/OpenMP/nvptx_target_codegen.cpp | 10 +- .../OpenMP/nvptx_target_firstprivate_codegen.cpp | 8 +- .../test/OpenMP/nvptx_target_parallel_codegen.cpp | 48 +- .../nvptx_target_parallel_num_threads_codegen.cpp | 48 +- .../nvptx_target_parallel_reduction_codegen.cpp | 18 +- ...get_parallel_reduction_codegen_tbaa_PR46146.cpp | 10 +- clang/test/OpenMP/nvptx_target_printf_codegen.c | 4 +- clang/test/OpenMP/nvptx_target_teams_codegen.cpp | 48 +- .../nvptx_target_teams_distribute_codegen.cpp | 18 +- ...arget_teams_distribute_parallel_for_codegen.cpp | 144 +- ...istribute_parallel_for_generic_mode_codegen.cpp | 72 +- ..._teams_distribute_parallel_for_simd_codegen.cpp | 72 +- .../nvptx_target_teams_distribute_simd_codegen.cpp | 22 +- clang/test/OpenMP/nvptx_teams_codegen.cpp | 32 +- .../test/OpenMP/nvptx_teams_reduction_codegen.cpp | 162 +- .../test/OpenMP/nvptx_unsupported_type_codegen.cpp | 4 +- clang/test/OpenMP/openmp_offload_codegen.cpp | 2 +- clang/test/OpenMP/openmp_win_codegen.cpp | 7 +- clang/test/OpenMP/ordered_codegen.cpp | 76 +- clang/test/OpenMP/parallel_codegen.cpp | 100 +- clang/test/OpenMP/parallel_copyin_codegen.cpp | 613 ++-- .../test/OpenMP/parallel_firstprivate_codegen.cpp | 44 +- clang/test/OpenMP/parallel_for_codegen.cpp | 224 +- .../parallel_for_lastprivate_conditional.cpp | 17 +- clang/test/OpenMP/parallel_for_linear_codegen.cpp | 93 +- .../OpenMP/parallel_for_reduction_task_codegen.cpp | 36 +- clang/test/OpenMP/parallel_for_scan_codegen.cpp | 2 +- .../OpenMP/parallel_for_simd_aligned_codegen.cpp | 72 +- clang/test/OpenMP/parallel_for_simd_codegen.cpp | 6 +- .../test/OpenMP/parallel_for_simd_scan_codegen.cpp | 2 +- clang/test/OpenMP/parallel_if_codegen.cpp | 100 +- clang/test/OpenMP/parallel_if_codegen_PR51349.cpp | 2 +- clang/test/OpenMP/parallel_master_codegen.cpp | 63 +- .../parallel_master_reduction_task_codegen.cpp | 36 +- .../OpenMP/parallel_master_taskloop_codegen.cpp | 60 +- ...rallel_master_taskloop_firstprivate_codegen.cpp | 20 +- ...arallel_master_taskloop_lastprivate_codegen.cpp | 282 +- .../parallel_master_taskloop_private_codegen.cpp | 20 +- .../parallel_master_taskloop_reduction_codegen.cpp | 22 +- .../parallel_master_taskloop_simd_codegen.cpp | 160 +- ...l_master_taskloop_simd_firstprivate_codegen.cpp | 20 +- ...el_master_taskloop_simd_lastprivate_codegen.cpp | 470 +-- ...rallel_master_taskloop_simd_private_codegen.cpp | 20 +- ...llel_master_taskloop_simd_reduction_codegen.cpp | 22 +- clang/test/OpenMP/parallel_num_threads_codegen.cpp | 4 +- clang/test/OpenMP/parallel_private_codegen.cpp | 261 +- clang/test/OpenMP/parallel_reduction_codegen.cpp | 501 ++- .../OpenMP/parallel_reduction_task_codegen.cpp | 36 +- clang/test/OpenMP/parallel_sections_codegen.cpp | 13 +- .../parallel_sections_reduction_task_codegen.cpp | 36 +- clang/test/OpenMP/reduction_compound_op.cpp | 12 +- .../test/OpenMP/sections_firstprivate_codegen.cpp | 321 +- clang/test/OpenMP/sections_lastprivate_codegen.cpp | 433 ++- clang/test/OpenMP/sections_private_codegen.cpp | 189 +- clang/test/OpenMP/sections_reduction_codegen.cpp | 353 ++- .../OpenMP/sections_reduction_task_codegen.cpp | 36 +- clang/test/OpenMP/simd_codegen.cpp | 8 +- clang/test/OpenMP/single_codegen.cpp | 597 ++-- clang/test/OpenMP/single_firstprivate_codegen.cpp | 321 +- clang/test/OpenMP/single_private_codegen.cpp | 189 +- clang/test/OpenMP/target_codegen.cpp | 12 +- .../test/OpenMP/target_codegen_global_capture.cpp | 104 +- clang/test/OpenMP/target_defaultmap_codegen_01.cpp | 676 ++--- clang/test/OpenMP/target_depend_codegen.cpp | 14 +- clang/test/OpenMP/target_enter_data_codegen.cpp | 2 +- .../OpenMP/target_enter_data_depend_codegen.cpp | 8 +- clang/test/OpenMP/target_exit_data_codegen.cpp | 2 +- .../OpenMP/target_exit_data_depend_codegen.cpp | 8 +- clang/test/OpenMP/target_firstprivate_codegen.cpp | 12 +- clang/test/OpenMP/target_map_codegen_00.cpp | 2 +- clang/test/OpenMP/target_map_codegen_01.cpp | 4 +- clang/test/OpenMP/target_map_codegen_02.cpp | 2 +- clang/test/OpenMP/target_map_codegen_03.cpp | 96 +- clang/test/OpenMP/target_map_codegen_04.cpp | 2 +- clang/test/OpenMP/target_map_codegen_05.cpp | 2 +- clang/test/OpenMP/target_map_codegen_06.cpp | 2 +- clang/test/OpenMP/target_map_codegen_07.cpp | 2 +- clang/test/OpenMP/target_map_codegen_11.cpp | 2 +- clang/test/OpenMP/target_map_codegen_12.cpp | 2 +- clang/test/OpenMP/target_map_codegen_13.cpp | 2 +- clang/test/OpenMP/target_map_codegen_14.cpp | 4 +- clang/test/OpenMP/target_map_codegen_15.cpp | 2 +- clang/test/OpenMP/target_map_codegen_17.cpp | 2 +- clang/test/OpenMP/target_map_codegen_24.cpp | 2 +- clang/test/OpenMP/target_map_names.cpp | 2 +- clang/test/OpenMP/target_map_names_attr.cpp | 2 +- clang/test/OpenMP/target_parallel_codegen.cpp | 608 ++-- .../test/OpenMP/target_parallel_debug_codegen.cpp | 24 +- .../test/OpenMP/target_parallel_depend_codegen.cpp | 12 +- clang/test/OpenMP/target_parallel_for_codegen.cpp | 672 ++--- .../OpenMP/target_parallel_for_debug_codegen.cpp | 24 +- .../OpenMP/target_parallel_for_depend_codegen.cpp | 12 +- .../target_parallel_for_reduction_task_codegen.cpp | 40 +- .../OpenMP/target_parallel_for_simd_codegen.cpp | 1008 +++---- .../target_parallel_for_simd_depend_codegen.cpp | 12 +- clang/test/OpenMP/target_parallel_if_codegen.cpp | 464 +-- .../OpenMP/target_parallel_num_threads_codegen.cpp | 464 +-- .../target_parallel_reduction_task_codegen.cpp | 40 +- clang/test/OpenMP/target_private_codegen.cpp | 4 +- clang/test/OpenMP/target_reduction_codegen.cpp | 2 +- clang/test/OpenMP/target_simd_codegen.cpp | 6 +- clang/test/OpenMP/target_simd_depend_codegen.cpp | 12 +- clang/test/OpenMP/target_teams_codegen.cpp | 928 +++--- clang/test/OpenMP/target_teams_depend_codegen.cpp | 12 +- .../OpenMP/target_teams_distribute_codegen.cpp | 656 ++-- .../target_teams_distribute_collapse_codegen.cpp | 89 +- .../target_teams_distribute_depend_codegen.cpp | 12 +- ...rget_teams_distribute_dist_schedule_codegen.cpp | 184 +- ...arget_teams_distribute_firstprivate_codegen.cpp | 573 ++-- ...target_teams_distribute_lastprivate_codegen.cpp | 361 ++- ...arget_teams_distribute_parallel_for_codegen.cpp | 118 +- </cut>

4 years, 7 months

[TCWG CI] 464.h264ref slowed down by 6% after llvm: [Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default

by ci_notify＠linaro.org

After llvm commit aacfbb953eb705af2ecfeb95a6262818fa85dd92 Author: hyeongyukim <gusrb406(a)snu.ac.kr> [Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default the following benchmarks slowed down by more than 2%: - 464.h264ref slowed down by 6% from 10889 to 11584 perf samples Below reproducer instructions can be used to re-build both "first_bad" and "last_good" cross-toolchains used in this bisection. Naturally, the scripts will fail when triggerring benchmarking jobs if you don't have access to Linaro TCWG CI. For your convenience, we have uploaded tarballs with pre-processed source and assembly files at: - First_bad save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… - Last_good save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… - Baseline save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Configuration: - Benchmark: SPEC CPU2006 - Toolchain: Clang + Glibc + LLVM Linker - Version: all components were built from their tip of trunk - Target: aarch64-linux-gnu - Compiler flags: -O2 -flto - Hardware: NVidia TX1 4x Cortex-A57 This benchmarking CI is work-in-progress, and we welcome feedback and suggestions at linaro-toolchain(a)lists.linaro.org . In our improvement plans is to add support for SPEC CPU2017 benchmarks and provide "perf report/annotate" data behind these reports. THIS IS THE END OF INTERESTING STUFF. BELOW ARE LINKS TO BUILDS, REPRODUCTION INSTRUCTIONS, AND THE RAW COMMIT. This commit has regressed these CI configurations: - tcwg_bmk_llvm_tx1/llvm-master-aarch64-spec2k6-O2_LTO First_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Baseline build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Even more details: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Reproduce builds: <cut> mkdir investigate-llvm-aacfbb953eb705af2ecfeb95a6262818fa85dd92 cd investigate-llvm-aacfbb953eb705af2ecfeb95a6262818fa85dd92 # Fetch scripts git clone https://git.linaro.org/toolchain/jenkins-scripts # Fetch manifests and test.sh script mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach aacfbb953eb705af2ecfeb95a6262818fa85dd92 ../artifacts/test.sh # Reproduce last_good build git checkout --detach b5aef90d4656c5188759d03e2c5c3dc3d8bb398b ../artifacts/test.sh cd .. </cut> Full commit (up to 1000 lines): <cut> commit aacfbb953eb705af2ecfeb95a6262818fa85dd92 Author: hyeongyukim <gusrb406(a)snu.ac.kr> Date: Fri Oct 15 19:26:07 2021 +0900 [Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions. I modified clang by renaming `enable_noundef_analysis` flag to `disable-noundef-analysis` and turning it off by default. Test updates are made as a separate patch: D108453 Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D105169 [Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default (2) This patch updates test files after D105169. Autogenerated test codes are changed by `utils/update_cc_test_checks.py,` and non-autogenerated test codes are changed as follows: (1) I wrote a python script that (partially) updates the tests using regex: {F18594904} The script is not perfect, but I believe it gives hints about which patterns are updated to have `noundef` attached. (2) The remaining tests are updated manually. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D108453 Resolve lit failures in clang after 8ca4b3e's land Fix lit test failures in clang-ppc* and clang-x64-windows-msvc Fix missing failures in clang-ppc64be* and retry fixing clang-x64-windows-msvc Fix internal_clone(aarch64) inline assembly --- clang/include/clang/Basic/CodeGenOptions.def | 2 +- clang/include/clang/Driver/Options.td | 6 +- clang/lib/CodeGen/CGCall.cpp | 4 +- clang/test/CXX/except/except.spec/p14-ir.cpp | 4 +- .../expr.prim/expr.prim.lambda/blocks-irgen.mm | 4 +- clang/test/CodeGen/2005-01-02-ConstantInits.c | 10 +- clang/test/CodeGen/2006-05-19-SingleEltReturn.c | 2 +- clang/test/CodeGen/2007-06-18-SextAttrAggregate.c | 2 +- .../test/CodeGen/2009-02-13-zerosize-union-field.c | 2 +- clang/test/CodeGen/2009-05-04-EnumInreg.c | 2 +- clang/test/CodeGen/64bit-swiftcall.c | 8 +- clang/test/CodeGen/RISCV/riscv-inline-asm.c | 2 +- clang/test/CodeGen/RISCV/riscv32-ilp32-abi.c | 8 +- .../test/CodeGen/RISCV/riscv32-ilp32-ilp32f-abi.c | 8 +- .../RISCV/riscv32-ilp32-ilp32f-ilp32d-abi.c | 48 +- clang/test/CodeGen/RISCV/riscv32-ilp32d-abi.c | 24 +- clang/test/CodeGen/RISCV/riscv32-ilp32f-abi.c | 6 +- .../test/CodeGen/RISCV/riscv32-ilp32f-ilp32d-abi.c | 16 +- clang/test/CodeGen/RISCV/riscv64-lp64-abi.c | 6 +- clang/test/CodeGen/RISCV/riscv64-lp64-lp64f-abi.c | 4 +- .../CodeGen/RISCV/riscv64-lp64-lp64f-lp64d-abi.c | 58 +- clang/test/CodeGen/RISCV/riscv64-lp64d-abi.c | 12 +- clang/test/CodeGen/RISCV/riscv64-lp64f-lp64d-abi.c | 16 +- clang/test/CodeGen/SystemZ/systemz-abi-vector.c | 18 +- clang/test/CodeGen/SystemZ/systemz-abi.c | 22 +- clang/test/CodeGen/SystemZ/systemz-inline-asm.c | 24 +- clang/test/CodeGen/WebAssembly/wasm-arguments.c | 38 +- .../test/CodeGen/WebAssembly/wasm-main_argc_argv.c | 2 +- clang/test/CodeGen/X86/avx-union.c | 6 +- clang/test/CodeGen/X86/avx512fp16-complex-abi.c | 2 +- clang/test/CodeGen/X86/ms-x86-intrinsics.c | 32 +- clang/test/CodeGen/X86/strictfp_builtins.c | 8 +- clang/test/CodeGen/X86/x86-atomic-long_double.c | 36 +- .../CodeGen/X86/x86-inline-asm-min-vector-width.c | 12 +- clang/test/CodeGen/X86/x86-long-double.cpp | 6 +- clang/test/CodeGen/X86/x86-soft-float.c | 4 +- clang/test/CodeGen/X86/x86-vec-i128.c | 22 +- clang/test/CodeGen/X86/x86_32-arguments-darwin.c | 62 +- clang/test/CodeGen/X86/x86_32-arguments-iamcu.c | 24 +- clang/test/CodeGen/X86/x86_32-arguments-linux.c | 30 +- clang/test/CodeGen/X86/x86_32-arguments-nommx.c | 4 +- clang/test/CodeGen/X86/x86_32-arguments-realign.c | 2 +- clang/test/CodeGen/X86/x86_32-arguments-win32.c | 12 +- clang/test/CodeGen/X86/x86_64-arguments-nacl.c | 6 +- clang/test/CodeGen/X86/x86_64-arguments-win32.c | 12 +- clang/test/CodeGen/X86/x86_64-arguments.c | 82 +- clang/test/CodeGen/X86/x86_64-longdouble.c | 36 +- clang/test/CodeGen/aapcs-align.cpp | 56 +- clang/test/CodeGen/aapcs64-align.cpp | 34 +- clang/test/CodeGen/aarch64-args.cpp | 18 +- clang/test/CodeGen/aarch64-byval-temp.c | 8 +- clang/test/CodeGen/aarch64-neon-3v.c | 160 +- clang/test/CodeGen/aarch64-neon-across.c | 88 +- clang/test/CodeGen/aarch64-neon-dot-product.c | 24 +- clang/test/CodeGen/aarch64-neon-extract.c | 48 +- clang/test/CodeGen/aarch64-neon-fcvt-intrinsics.c | 42 +- clang/test/CodeGen/aarch64-neon-fma.c | 44 +- clang/test/CodeGen/aarch64-neon-ldst-one.c | 540 ++-- clang/test/CodeGen/aarch64-neon-scalar-copy.c | 48 +- .../CodeGen/aarch64-neon-scalar-x-indexed-elem.c | 80 +- clang/test/CodeGen/aarch64-neon-tbl.c | 144 +- clang/test/CodeGen/aarch64-neon-vcombine.c | 28 +- clang/test/CodeGen/aarch64-neon-vget-hilo.c | 56 +- clang/test/CodeGen/aarch64-neon-vget.c | 96 +- clang/test/CodeGen/aarch64-poly128.c | 62 +- clang/test/CodeGen/aarch64-poly64.c | 96 +- clang/test/CodeGen/aarch64-strictfp-builtins.c | 8 +- ...4-sve-acle-__ARM_FEATURE_SVE_VECTOR_OPERATORS.c | 16 +- ...sve-acle-__ARM_FEATURE_SVE_VECTOR_OPERATORS.cpp | 8 +- clang/test/CodeGen/aarch64-varargs.c | 2 +- clang/test/CodeGen/address-space-field1.c | 2 +- clang/test/CodeGen/address-space.c | 2 +- clang/test/CodeGen/aix-alignment.c | 8 +- clang/test/CodeGen/aix-altivec.c | 10 +- clang/test/CodeGen/aix-ignore-xcoff-visibility.cpp | 12 +- clang/test/CodeGen/aix-return.c | 16 +- clang/test/CodeGen/aix-struct-arg.c | 44 +- clang/test/CodeGen/aix-vaargs.c | 4 +- clang/test/CodeGen/alias.c | 12 +- clang/test/CodeGen/align_value.cpp | 63 +- clang/test/CodeGen/alloc-align-attr.c | 46 +- clang/test/CodeGen/alloc-fns-alignment.c | 2 +- clang/test/CodeGen/alloc-size-fnptr.c | 12 +- clang/test/CodeGen/arc/arguments.c | 26 +- clang/test/CodeGen/arithmetic-fence-builtin.c | 10 +- clang/test/CodeGen/arm-aapcs-vfp.c | 24 +- clang/test/CodeGen/arm-abi-vector.c | 48 +- clang/test/CodeGen/arm-arguments.c | 10 +- clang/test/CodeGen/arm-bf16-params-returns.c | 10 +- clang/test/CodeGen/arm-byval-align.c | 2 +- clang/test/CodeGen/arm-cmse-attr.c | 4 +- clang/test/CodeGen/arm-cmse-call.c | 4 +- clang/test/CodeGen/arm-float-helpers.c | 76 +- clang/test/CodeGen/arm-fp16-arguments.c | 12 +- clang/test/CodeGen/arm-homogenous.c | 2 +- clang/test/CodeGen/arm-mangle-bf16.cpp | 2 +- clang/test/CodeGen/arm-neon-directed-rounding.c | 30 +- clang/test/CodeGen/arm-neon-dot-product.c | 16 +- clang/test/CodeGen/arm-neon-fma.c | 8 +- clang/test/CodeGen/arm-neon-numeric-maxmin.c | 8 +- clang/test/CodeGen/arm-neon-vcvtX.c | 32 +- clang/test/CodeGen/arm-swiftcall.c | 6 +- clang/test/CodeGen/arm-varargs.c | 2 +- clang/test/CodeGen/arm-vector-arguments.c | 10 +- clang/test/CodeGen/arm-vfp16-arguments.c | 12 +- clang/test/CodeGen/arm64-aapcs-arguments.c | 12 +- clang/test/CodeGen/arm64-abi-vector.c | 42 +- clang/test/CodeGen/arm64-arguments.c | 96 +- clang/test/CodeGen/arm64-microsoft-arguments.cpp | 6 +- clang/test/CodeGen/arm64_32.c | 2 +- clang/test/CodeGen/arm64_vcopy.c | 20 +- clang/test/CodeGen/arm64_vdupq_n_f64.c | 12 +- clang/test/CodeGen/armv7k-abi.c | 6 +- clang/test/CodeGen/asm-label.c | 12 +- .../assume-aligned-and-alloc-align-attributes.c | 12 +- clang/test/CodeGen/atomic-arm64.c | 2 +- clang/test/CodeGen/atomic-ops-libcall.c | 34 +- clang/test/CodeGen/atomic-ops.c | 44 +- clang/test/CodeGen/atomic_ops.c | 10 +- clang/test/CodeGen/atomics-inlining.c | 52 +- clang/test/CodeGen/attr-func-def.c | 4 +- clang/test/CodeGen/attr-naked.c | 2 +- clang/test/CodeGen/attr-no-tail.c | 8 +- clang/test/CodeGen/attr-nomerge.cpp | 20 +- clang/test/CodeGen/attr-noundef.cpp | 4 +- clang/test/CodeGen/attr-target-mv-func-ptrs.c | 4 +- clang/test/CodeGen/attr-target-mv-va-args.c | 24 +- clang/test/CodeGen/attr-target-mv.c | 28 +- clang/test/CodeGen/attr-x86-interrupt.c | 16 +- clang/test/CodeGen/attributes.c | 2 +- clang/test/CodeGen/available-externally-hidden.cpp | 2 +- clang/test/CodeGen/available-externally-suppress.c | 2 +- clang/test/CodeGen/avr/struct.c | 4 +- clang/test/CodeGen/big-atomic-ops.c | 30 +- clang/test/CodeGen/bittest-intrin.c | 8 +- clang/test/CodeGen/blocks.c | 6 +- clang/test/CodeGen/bool-convert.c | 2 +- clang/test/CodeGen/builtin-align-array.c | 8 +- clang/test/CodeGen/builtin-align.c | 24 +- clang/test/CodeGen/builtin-assume-aligned.c | 31 +- clang/test/CodeGen/builtin-attributes.c | 20 +- clang/test/CodeGen/builtin-memfns.c | 4 +- clang/test/CodeGen/builtin-sqrt.c | 2 +- clang/test/CodeGen/builtins-arm.c | 24 +- clang/test/CodeGen/builtins-memcpy-inline.c | 8 +- clang/test/CodeGen/builtins-ms.c | 4 +- clang/test/CodeGen/builtins-multiprecision.c | 4 +- clang/test/CodeGen/builtins-overflow.c | 12 +- clang/test/CodeGen/builtins-ppc-xlcompat-macros.c | 4 +- clang/test/CodeGen/builtins.c | 44 +- clang/test/CodeGen/c-strings.c | 2 +- clang/test/CodeGen/c11atomics-ios.c | 8 +- clang/test/CodeGen/c11atomics.c | 52 +- clang/test/CodeGen/calling-conv-ignored.c | 32 +- ...-assumption-attribute-align_value-on-lvalue.cpp | 2 +- ...ssumption-attribute-align_value-on-paramvar.cpp | 4 +- ...-attribute-alloc_align-on-function-variable.cpp | 6 +- ...ssumption-attribute-alloc_align-on-function.cpp | 8 +- ...ibute-assume_aligned-on-function-two-params.cpp | 6 +- ...mption-attribute-assume_aligned-on-function.cpp | 8 +- ...uiltin_assume_aligned-three-params-variable.cpp | 2 +- ...umption-builtin_assume_aligned-three-params.cpp | 2 +- ...ssumption-builtin_assume_aligned-two-params.cpp | 2 +- .../CodeGen/catch-alignment-assumption-openmp.cpp | 2 +- .../CodeGen/catch-implicit-integer-sign-changes.c | 18 +- ...icit-signed-integer-truncation-or-sign-change.c | 10 +- ...tr-and-nonzero-offset-when-nullptr-is-defined.c | 2 +- .../CodeGen/catch-nullptr-and-nonzero-offset.c | 14 +- .../test/CodeGen/catch-pointer-overflow-volatile.c | 2 +- clang/test/CodeGen/catch-pointer-overflow.c | 16 +- clang/test/CodeGen/cfi-check-fail.c | 2 +- clang/test/CodeGen/cfi-check-fail2.c | 2 +- clang/test/CodeGen/cmse-clear-arg.c | 2 +- clang/test/CodeGen/complex-builtins.c | 228 +- clang/test/CodeGen/complex-indirect.c | 2 +- clang/test/CodeGen/complex-libcalls.c | 228 +- clang/test/CodeGen/complex-math.c | 12 +- clang/test/CodeGen/complex-strictfp.c | 42 +- clang/test/CodeGen/constructor-attribute.c | 2 +- clang/test/CodeGen/debug-info-block-vars.c | 2 +- clang/test/CodeGen/debug-info-pseudo-probe.cpp | 4 +- clang/test/CodeGen/decl.c | 2 +- clang/test/CodeGen/default-address-space.c | 4 +- clang/test/CodeGen/exceptions-seh-finally.c | 14 +- clang/test/CodeGen/exceptions-seh-leave.c | 30 +- clang/test/CodeGen/exceptions-seh-nested-finally.c | 4 +- clang/test/CodeGen/exceptions-seh.c | 26 +- clang/test/CodeGen/exceptions.c | 2 +- clang/test/CodeGen/ext-int-cc.c | 58 +- clang/test/CodeGen/extend-arg-64.c | 2 +- clang/test/CodeGen/fp-function-attrs.cpp | 6 +- clang/test/CodeGen/fp-options-to-fast-math-flags.c | 18 +- clang/test/CodeGen/fpconstrained-cmp-double.c | 24 +- clang/test/CodeGen/fpconstrained-cmp-float.c | 24 +- clang/test/CodeGen/function-attributes.c | 20 +- clang/test/CodeGen/functions.c | 4 +- clang/test/CodeGen/hexagon-hvx-abi.c | 8 +- clang/test/CodeGen/incomplete-function-type-2.c | 2 +- clang/test/CodeGen/indirect-noundef.cpp | 2 +- clang/test/CodeGen/inline.c | 4 +- clang/test/CodeGen/lanai-arguments.c | 12 +- clang/test/CodeGen/lanai-regparm.c | 12 +- clang/test/CodeGen/libcall-declarations.c | 636 ++-- clang/test/CodeGen/libcalls.c | 54 +- clang/test/CodeGen/long_double_fp128.cpp | 14 +- clang/test/CodeGen/malign-double-x86-nacl.c | 6 +- clang/test/CodeGen/mangle-blocks.c | 6 +- clang/test/CodeGen/mangle-windows.c | 2 +- clang/test/CodeGen/math-builtins-long.c | 386 +-- clang/test/CodeGen/math-builtins.c | 648 ++-- clang/test/CodeGen/math-libcalls.c | 474 +-- clang/test/CodeGen/matrix-cast.c | 26 +- clang/test/CodeGen/matrix-type-builtins.c | 4 +- .../test/CodeGen/matrix-type-operators-fast-math.c | 12 +- clang/test/CodeGen/matrix-type-operators.c | 84 +- clang/test/CodeGen/memcmp-inline-builtin-to-asm.c | 2 +- clang/test/CodeGen/memcpy-inline-builtin.c | 2 +- clang/test/CodeGen/microsoft-call-conv-x64.c | 2 +- clang/test/CodeGen/microsoft-call-conv.c | 2 +- clang/test/CodeGen/mingw-long-double.c | 12 +- clang/test/CodeGen/mips-unsigned-ext-var.c | 6 +- clang/test/CodeGen/mips-unsigned-extend.c | 6 +- clang/test/CodeGen/mips-vector-arg.c | 16 +- clang/test/CodeGen/mips-zero-sized-struct.c | 6 +- clang/test/CodeGen/mips64-padding-arg.c | 24 +- clang/test/CodeGen/mrtd.c | 6 +- clang/test/CodeGen/ms-inline-asm.c | 2 +- clang/test/CodeGen/ms-intrinsics-cpuid.c | 4 +- clang/test/CodeGen/ms-intrinsics-other.c | 2 +- clang/test/CodeGen/ms-mixed-ptr-sizes.c | 20 +- clang/test/CodeGen/ms_abi.c | 4 +- clang/test/CodeGen/ms_abi_aarch64.c | 4 +- clang/test/CodeGen/named_reg_global.c | 2 +- clang/test/CodeGen/no-bitfield-type-align.c | 2 +- clang/test/CodeGen/no-builtin.cpp | 12 +- clang/test/CodeGen/no-prototype.c | 2 +- clang/test/CodeGen/noduplicate-cxx11-test.cpp | 2 +- .../CodeGen/non-power-of-2-alignment-assumptions.c | 10 +- clang/test/CodeGen/nonnull.c | 28 +- clang/test/CodeGen/nrvo-tracking.cpp | 2 +- clang/test/CodeGen/nvptx-abi.c | 10 +- clang/test/CodeGen/object-size.c | 4 +- clang/test/CodeGen/padding-init.c | 6 +- clang/test/CodeGen/pass-by-value-noalias.c | 4 +- clang/test/CodeGen/pass-object-size.c | 114 +- clang/test/CodeGen/pch-dllexport.cpp | 4 +- clang/test/CodeGen/powerpc-c99complex.c | 14 +- clang/test/CodeGen/ppc-emmintrin.c | 750 ++--- clang/test/CodeGen/ppc-mm-malloc-le.c | 8 +- clang/test/CodeGen/ppc-mm-malloc.c | 8 +- clang/test/CodeGen/ppc-mmintrin.c | 124 +- clang/test/CodeGen/ppc-pmmintrin.c | 177 +- clang/test/CodeGen/ppc-signbit.c | 2 +- clang/test/CodeGen/ppc-smmintrin.c | 32 +- clang/test/CodeGen/ppc-tmmintrin.c | 290 +- clang/test/CodeGen/ppc-xmmintrin.c | 400 +-- clang/test/CodeGen/ppc64-align-struct.c | 26 +- clang/test/CodeGen/ppc64-complex-parms.c | 38 +- clang/test/CodeGen/ppc64-complex-return.c | 20 +- clang/test/CodeGen/ppc64-extend.c | 4 +- clang/test/CodeGen/ppc64-inline-asm.c | 14 +- clang/test/CodeGen/ppc64-long-double.cpp | 6 +- clang/test/CodeGen/ppc64-soft-float.c | 6 +- clang/test/CodeGen/ppc64-vector.c | 10 +- clang/test/CodeGen/ppc64le-aggregates.c | 8 +- clang/test/CodeGen/ppc64le-f128Aggregates.c | 4 +- clang/test/CodeGen/ppc64le-varargs-f128.c | 12 +- clang/test/CodeGen/pr25786.c | 4 +- clang/test/CodeGen/pr5406.c | 2 +- clang/test/CodeGen/pr9614.c | 4 +- clang/test/CodeGen/pragma-weak.c | 2 +- clang/test/CodeGen/ps4-dllimport-dllexport.c | 2 +- clang/test/CodeGen/regcall.c | 100 +- clang/test/CodeGen/regparm-flag.c | 12 +- clang/test/CodeGen/regparm-struct.c | 36 +- clang/test/CodeGen/regparm.c | 6 +- clang/test/CodeGen/renderscript.c | 14 +- clang/test/CodeGen/restrict.c | 10 +- .../sanitize-thread-no-checking-at-run-time.m | 2 +- clang/test/CodeGen/sparc-arguments.c | 4 +- clang/test/CodeGen/sparcv8-abi.c | 6 +- clang/test/CodeGen/sparcv8-inline-asm.c | 2 +- clang/test/CodeGen/sparcv9-abi.c | 16 +- clang/test/CodeGen/spir-half-type.cpp | 2 +- clang/test/CodeGen/stack-protector.c | 4 +- clang/test/CodeGen/stdcall-fastcall.c | 24 +- clang/test/CodeGen/strictfp_builtins.c | 26 +- clang/test/CodeGen/swift-async-call-conv.c | 22 +- clang/test/CodeGen/switch-dce.c | 4 +- clang/test/CodeGen/sysv_abi.c | 8 +- clang/test/CodeGen/temporary-lifetime.cpp | 4 +- clang/test/CodeGen/transparent-union-redecl.c | 8 +- clang/test/CodeGen/transparent-union.c | 8 +- clang/test/CodeGen/ubsan-function.cpp | 2 +- .../CodeGen/unique-internal-linkage-names-dwarf.c | 4 +- .../unique-internal-linkage-names-dwarf.cpp | 12 +- .../test/CodeGen/unique-internal-linkage-names.cpp | 16 +- clang/test/CodeGen/variadic-null-win64.c | 12 +- clang/test/CodeGen/ve-abi.c | 34 +- clang/test/CodeGen/vectorcall.c | 86 +- clang/test/CodeGen/vla.c | 22 +- clang/test/CodeGen/win64-i128.c | 4 +- clang/test/CodeGen/windows-itanium.c | 2 +- .../CodeGen/windows-on-arm-dllimport-dllexport.c | 2 +- .../CodeGen/windows-seh-EHa-CppCatchDotDotDot.cpp | 2 +- .../test/CodeGen/windows-seh-EHa-CppCondiTemps.cpp | 18 +- clang/test/CodeGen/windows-seh-EHa-CppDtors01.cpp | 2 +- .../test/CodeGen/windows-seh-EHa-TryInFinally.cpp | 4 +- clang/test/CodeGen/windows-seh-abnormal-exits.c | 2 +- clang/test/CodeGen/windows-swiftcall.c | 22 +- clang/test/CodeGen/x86_32-align-linux.c | 6 +- clang/test/CodeGen/xcore-abi.c | 14 +- clang/test/CodeGen/xray-log-args.cpp | 4 +- clang/test/CodeGenCUDA/address-spaces.cu | 2 +- .../CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu | 10 +- clang/test/CodeGenCUDA/builtins-amdgcn.cu | 2 +- clang/test/CodeGenCUDA/cuda-builtin-vars.cu | 2 +- clang/test/CodeGenCUDA/kernel-args-alignment.cu | 2 +- clang/test/CodeGenCUDA/kernel-args.cu | 8 +- clang/test/CodeGenCUDA/kernel-stub-name.cu | 4 +- clang/test/CodeGenCUDA/lambda.cu | 8 +- clang/test/CodeGenCUDA/redux-builtins.cu | 2 +- clang/test/CodeGenCUDA/surface.cu | 4 +- clang/test/CodeGenCUDA/texture.cu | 6 +- clang/test/CodeGenCUDA/unnamed-types.cu | 8 +- clang/test/CodeGenCUDA/usual-deallocators.cu | 36 +- clang/test/CodeGenCUDA/vtbl.cu | 2 +- .../CodeGenCXX/2009-05-04-PureConstNounwind.cpp | 10 +- .../test/CodeGenCXX/2011-12-19-init-list-ctor.cpp | 6 +- .../diamond-virtual-inheritance.cpp | 2 +- .../CodeGenCXX/RelativeVTablesABI/dynamic-cast.cpp | 8 +- .../RelativeVTablesABI/member-function-pointer.cpp | 2 +- .../RelativeVTablesABI/multiple-inheritance.cpp | 2 +- .../parent-and-child-in-comdats.cpp | 2 +- .../CodeGenCXX/RelativeVTablesABI/type-info.cpp | 2 +- .../CodeGenCXX/RelativeVTablesABI/vbase-offset.cpp | 2 +- .../RelativeVTablesABI/virtual-function-call.cpp | 2 +- clang/test/CodeGenCXX/address-space-cast.cpp | 14 +- clang/test/CodeGenCXX/address-space-ref.cpp | 8 +- clang/test/CodeGenCXX/aix-alignment.cpp | 6 +- .../aix-static-init-temp-spec-and-inline-var.cpp | 14 +- clang/test/CodeGenCXX/aix-static-init.cpp | 4 +- .../test/CodeGenCXX/align-avx-complete-objects.cpp | 4 +- clang/test/CodeGenCXX/alignment.cpp | 20 +- clang/test/CodeGenCXX/alloc-size.cpp | 16 +- .../test/CodeGenCXX/amdgcn-automatic-variable.cpp | 10 +- clang/test/CodeGenCXX/amdgcn-func-arg.cpp | 24 +- clang/test/CodeGenCXX/amdgcn_declspec_get.cpp | 2 +- clang/test/CodeGenCXX/anonymous-namespaces.cpp | 4 +- .../test/CodeGenCXX/apple-kext-indirect-call-2.cpp | 8 +- clang/test/CodeGenCXX/apple-kext-linkage.cpp | 4 +- clang/test/CodeGenCXX/arm-cc.cpp | 4 +- clang/test/CodeGenCXX/arm-swiftcall.cpp | 2 +- clang/test/CodeGenCXX/arm.cpp | 4 +- clang/test/CodeGenCXX/arm64-constructor-return.cpp | 4 +- clang/test/CodeGenCXX/arm64-darwinpcs.cpp | 4 +- clang/test/CodeGenCXX/atomic-dllexport.cpp | 4 +- clang/test/CodeGenCXX/atomic-inline.cpp | 2 +- clang/test/CodeGenCXX/atomicinit.cpp | 8 +- .../CodeGenCXX/attr-cpuspecific-outoflinedefs.cpp | 28 +- clang/test/CodeGenCXX/attr-disable-tail-calls.cpp | 12 +- clang/test/CodeGenCXX/attr-musttail.cpp | 40 +- clang/test/CodeGenCXX/attr-notail.cpp | 10 +- clang/test/CodeGenCXX/attr-target-mv-diff-ns.cpp | 42 +- clang/test/CodeGenCXX/attr-target-mv-func-ptrs.cpp | 6 +- clang/test/CodeGenCXX/attr-target-mv-inalloca.cpp | 16 +- .../CodeGenCXX/attr-target-mv-member-funcs.cpp | 96 +- .../CodeGenCXX/attr-target-mv-out-of-line-defs.cpp | 22 +- clang/test/CodeGenCXX/attr-target-mv-overloads.cpp | 36 +- ...used-member-function-implicit-instantiation.cpp | 2 +- clang/test/CodeGenCXX/attr-x86-interrupt.cpp | 24 +- clang/test/CodeGenCXX/blocks-cxx11.cpp | 16 +- clang/test/CodeGenCXX/blocks.cpp | 4 +- clang/test/CodeGenCXX/builtin-calling-conv.cpp | 18 +- .../CodeGenCXX/builtin-is-constant-evaluated.cpp | 8 +- .../CodeGenCXX/builtin-operator-new-delete.cpp | 20 +- clang/test/CodeGenCXX/builtin-source-location.cpp | 20 +- clang/test/CodeGenCXX/builtin_FUNCTION.cpp | 6 +- clang/test/CodeGenCXX/builtin_LINE.cpp | 24 +- clang/test/CodeGenCXX/builtins.cpp | 4 +- clang/test/CodeGenCXX/call-with-static-chain.cpp | 16 +- clang/test/CodeGenCXX/catch-undef-behavior.cpp | 10 +- clang/test/CodeGenCXX/cfi-cast.cpp | 4 +- clang/test/CodeGenCXX/cfi-multiple-inheritance.cpp | 2 +- .../test/CodeGenCXX/cfi-vcall-check-after-args.cpp | 2 +- clang/test/CodeGenCXX/clang-sections.cpp | 2 +- clang/test/CodeGenCXX/compound-literals.cpp | 6 +- clang/test/CodeGenCXX/condition.cpp | 30 +- clang/test/CodeGenCXX/conditional-gnu-ext.cpp | 14 +- clang/test/CodeGenCXX/conditional-temporaries.cpp | 44 +- clang/test/CodeGenCXX/const-init-cxx11.cpp | 16 +- .../constructor-destructor-return-this.cpp | 100 +- clang/test/CodeGenCXX/constructor-direct-call.cpp | 14 +- clang/test/CodeGenCXX/constructor-init.cpp | 10 +- clang/test/CodeGenCXX/constructors.cpp | 24 +- clang/test/CodeGenCXX/convert-to-fptr.cpp | 4 +- clang/test/CodeGenCXX/copy-assign-synthesis-1.cpp | 2 +- clang/test/CodeGenCXX/copy-constructor-elim-2.cpp | 2 +- .../CodeGenCXX/copy-constructor-synthesis-2.cpp | 2 +- .../test/CodeGenCXX/copy-constructor-synthesis.cpp | 6 +- clang/test/CodeGenCXX/copy-elision.cpp | 2 +- clang/test/CodeGenCXX/copy-initialization.cpp | 2 +- clang/test/CodeGenCXX/cxx-abi-switch.cpp | 4 +- clang/test/CodeGenCXX/cxx0x-delegating-ctors.cpp | 2 +- .../CodeGenCXX/cxx0x-initializer-constructors.cpp | 14 +- .../CodeGenCXX/cxx0x-initializer-references.cpp | 4 +- .../CodeGenCXX/cxx11-initializer-aggregate.cpp | 4 +- .../CodeGenCXX/cxx11-initializer-array-new.cpp | 30 +- .../CodeGenCXX/cxx11-thread-local-reference.cpp | 6 +- .../CodeGenCXX/cxx11-thread-local-visibility.cpp | 8 +- clang/test/CodeGenCXX/cxx11-thread-local.cpp | 38 +- .../test/CodeGenCXX/cxx11-user-defined-literal.cpp | 20 +- clang/test/CodeGenCXX/cxx1y-init-captures.cpp | 12 +- .../CodeGenCXX/cxx1y-initializer-aggregate.cpp | 6 +- clang/test/CodeGenCXX/cxx1y-sized-deallocation.cpp | 48 +- .../CodeGenCXX/cxx1y-variable-template-linkage.cpp | 10 +- clang/test/CodeGenCXX/cxx1y-variable-template.cpp | 2 +- clang/test/CodeGenCXX/cxx1z-aligned-allocation.cpp | 68 +- clang/test/CodeGenCXX/cxx1z-copy-omission.cpp | 8 +- clang/test/CodeGenCXX/cxx1z-decomposition.cpp | 4 +- clang/test/CodeGenCXX/cxx1z-init-statement.cpp | 4 +- .../CodeGenCXX/cxx1z-initializer-aggregate.cpp | 20 +- clang/test/CodeGenCXX/cxx1z-inline-variables.cpp | 8 +- clang/test/CodeGenCXX/cxx2a-consteval.cpp | 11 +- clang/test/CodeGenCXX/cxx2a-destroying-delete.cpp | 38 +- .../debug-info-codeview-heapallocsite.cpp | 6 +- .../test/CodeGenCXX/debug-info-destroy-helper.cpp | 48 +- clang/test/CodeGenCXX/debug-info-globalinit.cpp | 6 +- clang/test/CodeGenCXX/debug-info-line.cpp | 4 +- clang/test/CodeGenCXX/debug-info-nested-exprs.cpp | 84 +- clang/test/CodeGenCXX/debug-info-static-fns.cpp | 2 +- clang/test/CodeGenCXX/debug-info-thunk-msabi.cpp | 2 +- clang/test/CodeGenCXX/decl-ref-init.cpp | 4 +- clang/test/CodeGenCXX/default-arg-temps.cpp | 4 +- clang/test/CodeGenCXX/default-arguments.cpp | 2 +- clang/test/CodeGenCXX/default_calling_conv.cpp | 24 +- clang/test/CodeGenCXX/delete-two-arg.cpp | 8 +- clang/test/CodeGenCXX/delete.cpp | 6 +- clang/test/CodeGenCXX/derived-to-base-conv.cpp | 6 +- clang/test/CodeGenCXX/derived-to-base.cpp | 4 +- clang/test/CodeGenCXX/destructors.cpp | 8 +- clang/test/CodeGenCXX/devirtualize-ms-dtor.cpp | 2 +- .../devirtualize-virtual-function-calls-final.cpp | 34 +- .../devirtualize-virtual-function-calls.cpp | 2 +- clang/test/CodeGenCXX/dllexport-ctor-closure.cpp | 10 +- clang/test/CodeGenCXX/dllexport-dtor-thunks.cpp | 2 +- clang/test/CodeGenCXX/dllexport-members.cpp | 12 +- .../CodeGenCXX/dllexport-no-dllexport-inlines.cpp | 18 +- clang/test/CodeGenCXX/dllexport.cpp | 12 +- clang/test/CodeGenCXX/dllimport-members.cpp | 12 +- clang/test/CodeGenCXX/dllimport-runtime-fns.cpp | 6 +- clang/test/CodeGenCXX/dllimport.cpp | 18 +- clang/test/CodeGenCXX/eh.cpp | 10 +- .../CodeGenCXX/empty-nontrivially-copyable.cpp | 6 +- clang/test/CodeGenCXX/exceptions-cxx-new.cpp | 10 +- .../CodeGenCXX/exceptions-seh-filter-captures.cpp | 24 +- .../CodeGenCXX/exceptions-seh-filter-uwtable.cpp | 2 +- clang/test/CodeGenCXX/exceptions-seh.cpp | 16 +- clang/test/CodeGenCXX/exceptions.cpp | 4 +- clang/test/CodeGenCXX/explicit-instantiation.cpp | 32 +- clang/test/CodeGenCXX/ext-int.cpp | 16 +- clang/test/CodeGenCXX/fastcall.cpp | 2 +- clang/test/CodeGenCXX/float128-declarations.cpp | 20 +- clang/test/CodeGenCXX/float16-declarations.cpp | 8 +- clang/test/CodeGenCXX/for-cond-var.cpp | 16 +- clang/test/CodeGenCXX/for-range-temporaries.cpp | 2 +- clang/test/CodeGenCXX/for-range.cpp | 20 +- clang/test/CodeGenCXX/forward-enum.cpp | 2 +- clang/test/CodeGenCXX/fp16-mangle-arg-return.cpp | 4 +- clang/test/CodeGenCXX/fp16-mangle.cpp | 4 +- clang/test/CodeGenCXX/fp16-overload.cpp | 4 +- clang/test/CodeGenCXX/global-init.cpp | 2 +- clang/test/CodeGenCXX/goto.cpp | 6 +- clang/test/CodeGenCXX/homogeneous-aggregates.cpp | 28 +- clang/test/CodeGenCXX/ibm128-declarations.cpp | 24 +- .../CodeGenCXX/implicit-copy-assign-operator.cpp | 2 +- .../test/CodeGenCXX/implicit-copy-constructor.cpp | 2 +- clang/test/CodeGenCXX/inalloca-overaligned.cpp | 38 +- clang/test/CodeGenCXX/inalloca-stmtexpr.cpp | 2 +- clang/test/CodeGenCXX/inalloca-vector.cpp | 40 +- .../CodeGenCXX/inheriting-constructor-cleanup.cpp | 4 +- clang/test/CodeGenCXX/inheriting-constructor.cpp | 10 +- clang/test/CodeGenCXX/init-invariant.cpp | 14 +- clang/test/CodeGenCXX/init-priority-attr.cpp | 10 +- .../CodeGenCXX/initializer-list-ctor-order.cpp | 2 +- clang/test/CodeGenCXX/inline-functions.cpp | 2 +- clang/test/CodeGenCXX/lambda-conversion-op-cc.cpp | 56 +- .../lambda-expressions-inside-auto-functions.cpp | 8 +- .../lambda-expressions-nested-linkage.cpp | 10 +- clang/test/CodeGenCXX/lambda-expressions.cpp | 30 +- clang/test/CodeGenCXX/lifetime-sanitizer.cpp | 2 +- clang/test/CodeGenCXX/linkage.cpp | 2 +- clang/test/CodeGenCXX/mangle-abi-tag.cpp | 2 +- clang/test/CodeGenCXX/mangle-exprs.cpp | 8 +- clang/test/CodeGenCXX/mangle-extern-local.cpp | 6 +- clang/test/CodeGenCXX/mangle-lambdas.cpp | 102 +- clang/test/CodeGenCXX/mangle-ms-cxx11.cpp | 4 +- .../CodeGenCXX/mangle-ms-templates-memptrs-2.cpp | 2 +- clang/test/CodeGenCXX/mangle-ms-vector-types.cpp | 14 +- clang/test/CodeGenCXX/mangle-ms.cpp | 10 +- clang/test/CodeGenCXX/mangle-this-cxx11.cpp | 4 +- clang/test/CodeGenCXX/mangle-win-ccs.cpp | 24 +- clang/test/CodeGenCXX/mangle-win64-ccs.cpp | 14 +- clang/test/CodeGenCXX/mangle.cpp | 32 +- clang/test/CodeGenCXX/matrix-casts.cpp | 8 +- clang/test/CodeGenCXX/matrix-type-builtins.cpp | 56 +- clang/test/CodeGenCXX/matrix-type-operators.cpp | 48 +- clang/test/CodeGenCXX/matrix-type.cpp | 2 +- .../CodeGenCXX/member-expr-references-variable.cpp | 40 +- clang/test/CodeGenCXX/member-expressions.cpp | 2 +- .../CodeGenCXX/member-function-pointer-calls.cpp | 8 +- clang/test/CodeGenCXX/member-init-assignment.cpp | 2 +- clang/test/CodeGenCXX/member-templates.cpp | 4 +- clang/test/CodeGenCXX/microsoft-abi-arg-order.cpp | 16 +- .../CodeGenCXX/microsoft-abi-array-cookies.cpp | 8 +- clang/test/CodeGenCXX/microsoft-abi-byval-sret.cpp | 8 +- .../test/CodeGenCXX/microsoft-abi-byval-thunks.cpp | 16 +- .../test/CodeGenCXX/microsoft-abi-byval-vararg.cpp | 12 +- .../CodeGenCXX/microsoft-abi-cdecl-method-sret.cpp | 8 +- .../test/CodeGenCXX/microsoft-abi-dynamic-cast.cpp | 22 +- clang/test/CodeGenCXX/microsoft-abi-eh-catch.cpp | 6 +- .../test/CodeGenCXX/microsoft-abi-eh-cleanups.cpp | 56 +- .../CodeGenCXX/microsoft-abi-extern-template.cpp | 8 +- .../CodeGenCXX/microsoft-abi-member-pointers.cpp | 42 +- clang/test/CodeGenCXX/microsoft-abi-methods.cpp | 10 +- ...crosoft-abi-multiple-nonvirtual-inheritance.cpp | 10 +- .../CodeGenCXX/microsoft-abi-sret-and-byval.cpp | 78 +- .../microsoft-abi-static-initializers.cpp | 24 +- clang/test/CodeGenCXX/microsoft-abi-structors.cpp | 2 +- .../CodeGenCXX/microsoft-abi-this-nullable.cpp | 2 +- .../microsoft-abi-thread-safe-statics.cpp | 2 +- clang/test/CodeGenCXX/microsoft-abi-throw.cpp | 4 +- clang/test/CodeGenCXX/microsoft-abi-thunks.cpp | 14 +- clang/test/CodeGenCXX/microsoft-abi-typeid.cpp | 16 +- .../test/CodeGenCXX/microsoft-abi-unknown-arch.cpp | 2 +- clang/test/CodeGenCXX/microsoft-abi-vbase-dtor.cpp | 2 +- ...microsoft-abi-virtual-inheritance-vtordisps.cpp | 6 +- .../microsoft-abi-virtual-inheritance.cpp | 54 +- .../microsoft-abi-virtual-member-pointers.cpp | 56 +- .../CodeGenCXX/microsoft-abi-vmemptr-conflicts.cpp | 34 +- .../CodeGenCXX/microsoft-abi-vmemptr-fastcall.cpp | 4 +- ...iple-nonvirtual-inheritance-this-adjustment.cpp | 4 +- clang/test/CodeGenCXX/microsoft-compatibility.cpp | 2 +- .../CodeGenCXX/microsoft-inaccessible-base.cpp | 4 +- clang/test/CodeGenCXX/microsoft-interface.cpp | 10 +- clang/test/CodeGenCXX/microsoft-new.cpp | 8 +- clang/test/CodeGenCXX/mips-size_t-ptrdiff_t.cpp | 12 +- clang/test/CodeGenCXX/ms-inline-asm-fields.cpp | 2 +- clang/test/CodeGenCXX/ms-inline-asm-return.cpp | 2 +- clang/test/CodeGenCXX/ms-property.cpp | 48 +- clang/test/CodeGenCXX/ms-thunks-ehspec.cpp | 4 +- clang/test/CodeGenCXX/ms-thunks-unprototyped.cpp | 18 +- clang/test/CodeGenCXX/ms-union-member-ref.cpp | 6 +- .../test/CodeGenCXX/msabi-ctor-abstract-vbase.cpp | 8 +- clang/test/CodeGenCXX/multi-dim-operator-new.cpp | 6 +- clang/test/CodeGenCXX/new-alias.cpp | 2 +- clang/test/CodeGenCXX/new-array-init.cpp | 18 +- clang/test/CodeGenCXX/new-infallible.cpp | 4 +- clang/test/CodeGenCXX/new-overflow.cpp | 30 +- clang/test/CodeGenCXX/new.cpp | 56 +- clang/test/CodeGenCXX/noescape.cpp | 22 +- clang/test/CodeGenCXX/nonconst-init.cpp | 2 +- clang/test/CodeGenCXX/nrvo.cpp | 4 +- clang/test/CodeGenCXX/observe-noexcept.cpp | 4 +- clang/test/CodeGenCXX/operator-new.cpp | 8 +- clang/test/CodeGenCXX/partial-destruction.cpp | 22 +- clang/test/CodeGenCXX/pass-by-value-noalias.cpp | 16 +- clang/test/CodeGenCXX/pass-object-size.cpp | 8 +- clang/test/CodeGenCXX/pod-member-memcpys.cpp | 4 +- clang/test/CodeGenCXX/powerpc-byval.cpp | 2 +- clang/test/CodeGenCXX/pr13396.cpp | 12 +- clang/test/CodeGenCXX/pr20897.cpp | 4 +- clang/test/CodeGenCXX/pr24097.cpp | 2 +- clang/test/CodeGenCXX/pr28360.cpp | 2 +- clang/test/CodeGenCXX/pr9130.cpp | 2 +- clang/test/CodeGenCXX/pragma-visibility.cpp | 2 +- clang/test/CodeGenCXX/redefine_extname.cpp | 2 +- clang/test/CodeGenCXX/reference-cast.cpp | 12 +- clang/test/CodeGenCXX/references.cpp | 2 +- clang/test/CodeGenCXX/regcall.cpp | 42 +- clang/test/CodeGenCXX/regparm.cpp | 6 +- clang/test/CodeGenCXX/runtime-dllstorage.cpp | 14 +- clang/test/CodeGenCXX/runtimecc.cpp | 2 +- clang/test/CodeGenCXX/rvalue-references.cpp | 12 +- clang/test/CodeGenCXX/split-stacks.cpp | 12 +- clang/test/CodeGenCXX/stack-reuse-miscompile.cpp | 8 +- clang/test/CodeGenCXX/stack-reuse.cpp | 2 +- clang/test/CodeGenCXX/static-data-member.cpp | 4 +- clang/test/CodeGenCXX/static-destructor.cpp | 4 +- clang/test/CodeGenCXX/static-init-1.cpp | 8 +- clang/test/CodeGenCXX/static-init-wasm.cpp | 4 +- clang/test/CodeGenCXX/static-init.cpp | 14 +- .../CodeGenCXX/static-local-in-local-class.cpp | 20 +- clang/test/CodeGenCXX/stmtexpr.cpp | 16 +- clang/test/CodeGenCXX/switch-case-folding-2.cpp | 2 +- clang/test/CodeGenCXX/temp-order.cpp | 18 +- clang/test/CodeGenCXX/template-anonymous-types.cpp | 12 +- clang/test/CodeGenCXX/temporaries.cpp | 48 +- clang/test/CodeGenCXX/this-nonnull.cpp | 8 +- clang/test/CodeGenCXX/thunk-linkonce-odr.cpp | 4 +- clang/test/CodeGenCXX/thunk-returning-memptr.cpp | 2 +- clang/test/CodeGenCXX/thunks-ehspec.cpp | 6 +- clang/test/CodeGenCXX/thunks.cpp | 20 +- clang/test/CodeGenCXX/tls-init-funcs.cpp | 10 +- clang/test/CodeGenCXX/trivial_abi.cpp | 46 +- clang/test/CodeGenCXX/ubsan-suppress-checks.cpp | 16 +- clang/test/CodeGenCXX/ubsan-vtable-checks.cpp | 4 +- clang/test/CodeGenCXX/uncopyable-args.cpp | 48 +- clang/test/CodeGenCXX/unknown-anytype.cpp | 28 +- clang/test/CodeGenCXX/value-init.cpp | 4 +- clang/test/CodeGenCXX/varargs.cpp | 2 +- clang/test/CodeGenCXX/variadic-templates.cpp | 2 +- .../CodeGenCXX/virtual-base-destructor-call.cpp | 4 +- clang/test/CodeGenCXX/virtual-bases.cpp | 8 +- clang/test/CodeGenCXX/virtual-operator-call.cpp | 4 +- .../visibility-inlines-hidden-staticvar.cpp | 44 +- .../test/CodeGenCXX/visibility-inlines-hidden.cpp | 4 +- clang/test/CodeGenCXX/vla-consruct.cpp | 4 +- clang/test/CodeGenCXX/vla-lambda-capturing.cpp | 6 +- clang/test/CodeGenCXX/vla.cpp | 4 +- clang/test/CodeGenCXX/volatile.cpp | 2 +- clang/test/CodeGenCXX/vtable-assume-load.cpp | 2 +- .../CodeGenCXX/vtable-available-externally.cpp | 16 +- clang/test/CodeGenCXX/wasm-args-returns.cpp | 4 +- clang/test/CodeGenCXX/wasm-eh.cpp | 8 +- .../windows-on-arm-itanium-thread-local.cpp | 2 +- clang/test/CodeGenCXX/windows-x86-swiftcall.cpp | 6 +- clang/test/CodeGenCXX/x86_32-arguments.cpp | 8 +- clang/test/CodeGenCXX/x86_64-arguments-avx.cpp | 2 +- .../test/CodeGenCXX/x86_64-arguments-nacl-x32.cpp | 2 +- clang/test/CodeGenCXX/x86_64-arguments.cpp | 2 +- .../CodeGenCoroutines/coro-alloc-exp-namespace.cpp | 26 +- clang/test/CodeGenCoroutines/coro-alloc.cpp | 26 +- .../CodeGenCoroutines/coro-await-exp-namespace.cpp | 2 +- clang/test/CodeGenCoroutines/coro-await.cpp | 4 + clang/test/CodeGenCoroutines/coro-builtins.c | 2 +- .../coro-cleanup-exp-namespace.cpp | 6 +- clang/test/CodeGenCoroutines/coro-cleanup.cpp | 6 +- .../CodeGenCoroutines/coro-gro-exp-namespace.cpp | 6 +- .../coro-gro-nrvo-exp-namespace.cpp | 8 +- clang/test/CodeGenCoroutines/coro-gro-nrvo.cpp | 8 +- clang/test/CodeGenCoroutines/coro-gro.cpp | 6 +- .../coro-params-exp-namespace.cpp | 22 +- clang/test/CodeGenCoroutines/coro-params.cpp | 22 +- .../coro-promise-dtor-exp-namespace.cpp | 2 +- clang/test/CodeGenCoroutines/coro-promise-dtor.cpp | 2 +- .../coro-ret-void-exp-namespace.cpp | 2 +- clang/test/CodeGenCoroutines/coro-ret-void.cpp | 5 + .../coro-return-exp-namespace.cpp | 6 +- clang/test/CodeGenCoroutines/coro-return.cpp | 6 +- .../coro-symmetric-transfer-01.cpp | 26 +- clang/test/CodeGenObjC/arc-blocks.m | 44 +- clang/test/CodeGenObjC/arc-foreach.m | 4 +- clang/test/CodeGenObjC/arc-literals.m | 16 +- clang/test/CodeGenObjC/arc-no-arc-exceptions.m | 6 +- clang/test/CodeGenObjC/arc-precise-lifetime.m | 4 +- clang/test/CodeGenObjC/arc-property.m | 10 +- clang/test/CodeGenObjC/arc-ternary-op.m | 4 +- clang/test/CodeGenObjC/arc.m | 44 +- .../CodeGenObjC/arm-atomic-scalar-setter-getter.m | 4 +- clang/test/CodeGenObjC/atomic-aggregate-property.m | 4 +- .../test/CodeGenObjC/availability-cf-link-guard.m | 2 +- clang/test/CodeGenObjC/blocks.m | 4 +- clang/test/CodeGenObjC/builtin-constant-p.m | 4 +- clang/test/CodeGenObjC/class-stubs.m | 10 +- clang/test/CodeGenObjC/debug-info-blocks.m | 2 +- clang/test/CodeGenObjC/debug-info-nested-blocks.m | 2 +- clang/test/CodeGenObjC/exceptions.m | 16 +- clang/test/CodeGenObjC/for-in.m | 2 +- clang/test/CodeGenObjC/fragile-arc.m | 8 +- clang/test/CodeGenObjC/gnu-exceptions.m | 4 +- clang/test/CodeGenObjC/implicit-objc_msgSend.m | 2 +- clang/test/CodeGenObjC/ivar-invariant.m | 2 +- clang/test/CodeGenObjC/local-static-block.m | 2 +- clang/test/CodeGenObjC/mangle-blocks.m | 6 +- clang/test/CodeGenObjC/matrix-type-builtins.m | 16 +- clang/test/CodeGenObjC/matrix-type-operators.m | 10 +- clang/test/CodeGenObjC/noescape.m | 10 +- .../CodeGenObjC/nontrivial-c-struct-exception.m | 2 +- .../nontrivial-c-struct-within-struct-name.m | 6 +- .../CodeGenObjC/nsvalue-objc-boxable-ios-arc.m | 12 +- clang/test/CodeGenObjC/nsvalue-objc-boxable-ios.m | 12 +- .../CodeGenObjC/nsvalue-objc-boxable-mac-arc.m | 12 +- clang/test/CodeGenObjC/nsvalue-objc-boxable-mac.m | 12 +- .../CodeGenObjC/objc-container-subscripting-1.m | 8 +- clang/test/CodeGenObjC/objc-literal-tests.m | 26 +- .../CodeGenObjC/objc-non-trivial-struct-nrvo.m | 6 +- clang/test/CodeGenObjC/objfw.m | 2 +- clang/test/CodeGenObjC/optimize-ivar-offset-load.m | 2 +- clang/test/CodeGenObjC/os_log.m | 12 +- clang/test/CodeGenObjC/parameterized_classes.m | 2 +- clang/test/CodeGenObjC/pass-by-value-noalias.m | 4 +- clang/test/CodeGenObjC/property-array-type.m | 2 +- clang/test/CodeGenObjC/property-atomic-bool.m | 4 +- clang/test/CodeGenObjC/property-ref-cast-to-void.m | 4 +- clang/test/CodeGenObjC/property.m | 10 +- clang/test/CodeGenObjC/return-objc-object.mm | 4 +- clang/test/CodeGenObjC/stret_lookup.m | 4 +- clang/test/CodeGenObjC/strong-in-c-struct.m | 54 +- .../test/CodeGenObjC/tentative-cfconstantstring.m | 2 +- clang/test/CodeGenObjC/terminate.m | 8 +- clang/test/CodeGenObjC/ubsan-bool.m | 6 +- clang/test/CodeGenObjC/ubsan-nonnull.m | 12 +- clang/test/CodeGenObjC/ubsan-nullability.m | 4 +- clang/test/CodeGenObjC/weak-in-c-struct.m | 30 +- clang/test/CodeGenObjCXX/arc-attrs.mm | 18 +- clang/test/CodeGenObjCXX/arc-blocks.mm | 6 +- clang/test/CodeGenObjCXX/arc-cxx11-init-list.mm | 2 +- clang/test/CodeGenObjCXX/arc-cxx11-member-init.mm | 4 +- clang/test/CodeGenObjCXX/arc-exceptions.mm | 8 +- .../CodeGenObjCXX/arc-forwarded-lambda-call.mm | 8 +- clang/test/CodeGenObjCXX/arc-globals.mm | 4 +- clang/test/CodeGenObjCXX/arc-list-init-destruct.mm | 2 +- clang/test/CodeGenObjCXX/arc-mangle.mm | 22 +- clang/test/CodeGenObjCXX/arc-marker-funclet.mm | 2 +- clang/test/CodeGenObjCXX/arc-move.mm | 6 +- clang/test/CodeGenObjCXX/arc-new-delete.mm | 16 +- clang/test/CodeGenObjCXX/arc-references.mm | 6 +- clang/test/CodeGenObjCXX/arc-rv-attr.mm | 2 +- .../CodeGenObjCXX/arc-special-member-functions.mm | 2 +- clang/test/CodeGenObjCXX/arc.mm | 44 +- .../CodeGenObjCXX/auto-release-result-assert.mm | 8 +- clang/test/CodeGenObjCXX/block-default-arg.mm | 4 +- clang/test/CodeGenObjCXX/block-nested-in-lambda.mm | 4 +- clang/test/CodeGenObjCXX/copy.mm | 2 +- .../CodeGenObjCXX/implicit-copy-assign-operator.mm | 2 +- .../CodeGenObjCXX/implicit-copy-constructor.mm | 2 +- .../inheriting-constructor-cleanup.mm | 2 +- clang/test/CodeGenObjCXX/lambda-expressions.mm | 20 +- clang/test/CodeGenObjCXX/lambda-to-block.mm | 18 +- clang/test/CodeGenObjCXX/literals.mm | 8 +- .../test/CodeGenObjCXX/lvalue-reference-getter.mm | 4 +- clang/test/CodeGenObjCXX/mangle-blocks.mm | 8 +- clang/test/CodeGenObjCXX/message-reference.mm | 2 +- clang/test/CodeGenObjCXX/message.mm | 4 +- .../CodeGenObjCXX/objc-container-subscripting.mm | 2 +- clang/test/CodeGenObjCXX/objc-struct-cxx-abi.mm | 54 +- clang/test/CodeGenObjCXX/objc-weak.mm | 4 +- .../CodeGenObjCXX/property-dot-copy-elision.mm | 6 +- clang/test/CodeGenObjCXX/property-dot-reference.mm | 22 +- .../test/CodeGenObjCXX/property-lvalue-capture.mm | 6 +- clang/test/CodeGenObjCXX/property-lvalue-lambda.mm | 2 +- .../CodeGenObjCXX/property-object-reference-1.mm | 2 +- .../CodeGenObjCXX/property-object-reference-2.mm | 14 +- clang/test/CodeGenObjCXX/property-objects.mm | 14 +- clang/test/CodeGenObjCXX/property-reference.mm | 6 +- clang/test/CodeGenObjCXX/selector-expr-lvalue.mm | 2 +- .../CodeGenObjCXX/synthesized-property-cleanup.mm | 2 +- .../ubsan-nullability-return-notypeloc.mm | 2 +- clang/test/CodeGenOpenCL/addr-space-struct-arg.cl | 20 +- clang/test/CodeGenOpenCL/address-spaces.cl | 10 +- .../CodeGenOpenCL/amdgcn-automatic-variable.cl | 8 +- .../test/CodeGenOpenCL/amdgpu-abi-struct-coerce.cl | 48 +- clang/test/CodeGenOpenCL/amdgpu-call-kernel.cl | 2 +- clang/test/CodeGenOpenCL/amdgpu-nullptr.cl | 8 +- clang/test/CodeGenOpenCL/as_type.cl | 26 +- clang/test/CodeGenOpenCL/atomic-ops-libcall.cl | 54 +- clang/test/CodeGenOpenCL/blocks.cl | 12 +- clang/test/CodeGenOpenCL/byval.cl | 4 +- .../test/CodeGenOpenCL/cl20-device-side-enqueue.cl | 6 +- clang/test/CodeGenOpenCL/const-str-array-decay.cl | 2 +- .../CodeGenOpenCL/constant-addr-space-globals.cl | 2 +- clang/test/CodeGenOpenCL/convergent.cl | 4 +- clang/test/CodeGenOpenCL/fpmath.cl | 4 +- clang/test/CodeGenOpenCL/half.cl | 8 +- .../kernels-have-spir-cc-by-default.cl | 8 +- clang/test/CodeGenOpenCL/no-half.cl | 4 +- clang/test/CodeGenOpenCL/overload.cl | 20 +- clang/test/CodeGenOpenCL/printf.cl | 12 +- clang/test/CodeGenOpenCL/size_t.cl | 60 +- clang/test/CodeGenOpenCL/spir-calling-conv.cl | 10 +- .../CodeGenOpenCLCXX/address-space-deduction.clcpp | 2 +- .../CodeGenOpenCLCXX/addrspace-derived-base.clcpp | 4 +- .../CodeGenOpenCLCXX/addrspace-new-delete.clcpp | 2 +- .../test/CodeGenOpenCLCXX/addrspace-of-this.clcpp | 32 +- .../CodeGenOpenCLCXX/addrspace-operators.clcpp | 4 +- .../CodeGenOpenCLCXX/addrspace-references.clcpp | 2 +- .../CodeGenOpenCLCXX/addrspace-with-class.clcpp | 22 +- .../CodeGenOpenCLCXX/template-address-spaces.clcpp | 6 +- .../test/CodeGenSYCL/address-space-conversions.cpp | 52 +- clang/test/CodeGenSYCL/address-space-mangling.cpp | 16 +- clang/test/CodeGenSYCL/unique_stable_name.cpp | 40 +- clang/test/Headers/ms-arm64-intrin.cpp | 6 +- clang/test/Headers/stdarg.cpp | 28 +- clang/test/Modules/codegen-extern-template.cpp | 2 +- clang/test/Modules/codegen.test | 2 +- clang/test/Modules/cxx-irgen.cpp | 2 +- clang/test/Modules/initializers.cpp | 4 +- clang/test/Modules/templates.mm | 8 +- clang/test/OpenMP/allocate_codegen.cpp | 2 +- clang/test/OpenMP/allocate_codegen_attr.cpp | 2 +- clang/test/OpenMP/assumes_include_nvptx.cpp | 6 +- clang/test/OpenMP/atomic_capture_codegen.cpp | 28 +- clang/test/OpenMP/atomic_codegen.cpp | 8 +- clang/test/OpenMP/atomic_read_codegen.c | 14 +- clang/test/OpenMP/atomic_update_codegen.cpp | 28 +- clang/test/OpenMP/atomic_write_codegen.c | 18 +- clang/test/OpenMP/cancel_codegen.cpp | 104 +- clang/test/OpenMP/cancellation_point_codegen.cpp | 28 +- clang/test/OpenMP/debug-info-complex-byval.cpp | 49 +- clang/test/OpenMP/debug-info-openmp-array.cpp | 6 +- clang/test/OpenMP/declare_mapper_codegen.cpp | 20 +- clang/test/OpenMP/declare_reduction_codegen.c | 48 +- clang/test/OpenMP/declare_reduction_codegen.cpp | 46 +- .../declare_reduction_codegen_in_templates.cpp | 2 +- clang/test/OpenMP/declare_target_codegen.cpp | 4 +- .../declare_target_codegen_globalization.cpp | 12 +- clang/test/OpenMP/declare_target_link_codegen.cpp | 4 +- clang/test/OpenMP/declare_variant_mixed_codegen.c | 12 +- clang/test/OpenMP/distribute_codegen.cpp | 304 +- .../OpenMP/distribute_firstprivate_codegen.cpp | 329 +- .../test/OpenMP/distribute_lastprivate_codegen.cpp | 361 ++- .../OpenMP/distribute_parallel_for_codegen.cpp | 576 ++-- ...istribute_parallel_for_firstprivate_codegen.cpp | 385 ++- .../OpenMP/distribute_parallel_for_if_codegen.cpp | 320 +- ...distribute_parallel_for_lastprivate_codegen.cpp | 449 ++- ...distribute_parallel_for_num_threads_codegen.cpp | 481 ++- .../distribute_parallel_for_private_codegen.cpp | 425 ++- .../distribute_parallel_for_proc_bind_codegen.cpp | 29 +- ...tribute_parallel_for_reduction_task_codegen.cpp | 44 +- .../distribute_parallel_for_simd_codegen.cpp | 592 ++-- ...bute_parallel_for_simd_firstprivate_codegen.cpp | 1362 ++++----- .../distribute_parallel_for_simd_if_codegen.cpp | 3192 ++++++++++---------- ...ibute_parallel_for_simd_lastprivate_codegen.cpp | 1336 ++++---- ...ibute_parallel_for_simd_num_threads_codegen.cpp | 2640 ++++++++-------- ...istribute_parallel_for_simd_private_codegen.cpp | 1288 ++++---- ...tribute_parallel_for_simd_proc_bind_codegen.cpp | 236 +- clang/test/OpenMP/distribute_private_codegen.cpp | 345 ++- clang/test/OpenMP/distribute_simd_codegen.cpp | 512 ++-- .../distribute_simd_firstprivate_codegen.cpp | 944 +++--- .../OpenMP/distribute_simd_lastprivate_codegen.cpp | 1008 +++---- .../OpenMP/distribute_simd_private_codegen.cpp | 1056 +++---- .../OpenMP/distribute_simd_reduction_codegen.cpp | 272 +- clang/test/OpenMP/for_codegen.cpp | 16 +- clang/test/OpenMP/for_firstprivate_codegen.cpp | 313 +- clang/test/OpenMP/for_lastprivate_codegen.cpp | 601 ++-- clang/test/OpenMP/for_linear_codegen.cpp | 165 +- clang/test/OpenMP/for_private_codegen.cpp | 177 +- clang/test/OpenMP/for_reduction_codegen.cpp | 760 ++--- clang/test/OpenMP/for_reduction_codegen_UDR.cpp | 936 +++--- clang/test/OpenMP/for_reduction_task_codegen.cpp | 36 +- clang/test/OpenMP/for_scan_codegen.cpp | 2 +- clang/test/OpenMP/for_simd_codegen.cpp | 6 +- clang/test/OpenMP/for_simd_scan_codegen.cpp | 2 +- clang/test/OpenMP/function-attr.cpp | 8 +- clang/test/OpenMP/irbuilder_for_iterator.cpp | 24 +- clang/test/OpenMP/irbuilder_for_rangefor.cpp | 28 +- clang/test/OpenMP/irbuilder_for_unsigned.c | 6 +- ...builder_unroll_partial_heuristic_constant_for.c | 2 +- ...builder_unroll_partial_heuristic_for_collapse.c | 380 ++- ...rbuilder_unroll_partial_heuristic_runtime_for.c | 2 +- clang/test/OpenMP/master_taskloop_codegen.cpp | 10 +- .../master_taskloop_firstprivate_codegen.cpp | 22 +- .../master_taskloop_in_reduction_codegen.cpp | 12 +- .../OpenMP/master_taskloop_lastprivate_codegen.cpp | 22 +- .../OpenMP/master_taskloop_private_codegen.cpp | 22 +- .../OpenMP/master_taskloop_reduction_codegen.cpp | 22 +- clang/test/OpenMP/master_taskloop_simd_codegen.cpp | 8 +- .../master_taskloop_simd_firstprivate_codegen.cpp | 22 +- .../master_taskloop_simd_in_reduction_codegen.cpp | 12 +- .../master_taskloop_simd_lastprivate_codegen.cpp | 22 +- .../master_taskloop_simd_private_codegen.cpp | 22 +- .../master_taskloop_simd_reduction_codegen.cpp | 22 +- clang/test/OpenMP/nvptx_allocate_codegen.cpp | 8 +- clang/test/OpenMP/nvptx_data_sharing.cpp | 8 +- .../nvptx_declare_target_var_ctor_dtor_codegen.cpp | 28 +- .../OpenMP/nvptx_declare_variant_name_mangling.cpp | 4 +- ...tx_distribute_parallel_generic_mode_codegen.cpp | 48 +- clang/test/OpenMP/nvptx_lambda_capturing.cpp | 122 +- .../OpenMP/nvptx_multi_target_parallel_codegen.cpp | 18 +- .../test/OpenMP/nvptx_nested_parallel_codegen.cpp | 72 +- clang/test/OpenMP/nvptx_parallel_codegen.cpp | 52 +- clang/test/OpenMP/nvptx_parallel_for_codegen.cpp | 6 +- clang/test/OpenMP/nvptx_target_codegen.cpp | 10 +- .../OpenMP/nvptx_target_firstprivate_codegen.cpp | 8 +- .../test/OpenMP/nvptx_target_parallel_codegen.cpp | 48 +- .../nvptx_target_parallel_num_threads_codegen.cpp | 48 +- .../nvptx_target_parallel_reduction_codegen.cpp | 18 +- ...get_parallel_reduction_codegen_tbaa_PR46146.cpp | 10 +- clang/test/OpenMP/nvptx_target_printf_codegen.c | 4 +- clang/test/OpenMP/nvptx_target_teams_codegen.cpp | 48 +- .../nvptx_target_teams_distribute_codegen.cpp | 18 +- ...arget_teams_distribute_parallel_for_codegen.cpp | 144 +- ...istribute_parallel_for_generic_mode_codegen.cpp | 72 +- ..._teams_distribute_parallel_for_simd_codegen.cpp | 72 +- .../nvptx_target_teams_distribute_simd_codegen.cpp | 22 +- clang/test/OpenMP/nvptx_teams_codegen.cpp | 32 +- .../test/OpenMP/nvptx_teams_reduction_codegen.cpp | 162 +- .../test/OpenMP/nvptx_unsupported_type_codegen.cpp | 4 +- clang/test/OpenMP/openmp_offload_codegen.cpp | 2 +- clang/test/OpenMP/openmp_win_codegen.cpp | 7 +- clang/test/OpenMP/ordered_codegen.cpp | 76 +- clang/test/OpenMP/parallel_codegen.cpp | 100 +- clang/test/OpenMP/parallel_copyin_codegen.cpp | 613 ++-- .../test/OpenMP/parallel_firstprivate_codegen.cpp | 44 +- clang/test/OpenMP/parallel_for_codegen.cpp | 224 +- .../parallel_for_lastprivate_conditional.cpp | 17 +- clang/test/OpenMP/parallel_for_linear_codegen.cpp | 93 +- .../OpenMP/parallel_for_reduction_task_codegen.cpp | 36 +- clang/test/OpenMP/parallel_for_scan_codegen.cpp | 2 +- .../OpenMP/parallel_for_simd_aligned_codegen.cpp | 72 +- clang/test/OpenMP/parallel_for_simd_codegen.cpp | 6 +- .../test/OpenMP/parallel_for_simd_scan_codegen.cpp | 2 +- clang/test/OpenMP/parallel_if_codegen.cpp | 100 +- clang/test/OpenMP/parallel_if_codegen_PR51349.cpp | 2 +- clang/test/OpenMP/parallel_master_codegen.cpp | 63 +- .../parallel_master_reduction_task_codegen.cpp | 36 +- .../OpenMP/parallel_master_taskloop_codegen.cpp | 60 +- ...rallel_master_taskloop_firstprivate_codegen.cpp | 20 +- ...arallel_master_taskloop_lastprivate_codegen.cpp | 282 +- .../parallel_master_taskloop_private_codegen.cpp | 20 +- .../parallel_master_taskloop_reduction_codegen.cpp | 22 +- .../parallel_master_taskloop_simd_codegen.cpp | 160 +- ...l_master_taskloop_simd_firstprivate_codegen.cpp | 20 +- ...el_master_taskloop_simd_lastprivate_codegen.cpp | 470 +-- ...rallel_master_taskloop_simd_private_codegen.cpp | 20 +- ...llel_master_taskloop_simd_reduction_codegen.cpp | 22 +- clang/test/OpenMP/parallel_num_threads_codegen.cpp | 4 +- clang/test/OpenMP/parallel_private_codegen.cpp | 261 +- clang/test/OpenMP/parallel_reduction_codegen.cpp | 501 ++- .../OpenMP/parallel_reduction_task_codegen.cpp | 36 +- clang/test/OpenMP/parallel_sections_codegen.cpp | 13 +- .../parallel_sections_reduction_task_codegen.cpp | 36 +- clang/test/OpenMP/reduction_compound_op.cpp | 12 +- .../test/OpenMP/sections_firstprivate_codegen.cpp | 321 +- clang/test/OpenMP/sections_lastprivate_codegen.cpp | 433 ++- clang/test/OpenMP/sections_private_codegen.cpp | 189 +- clang/test/OpenMP/sections_reduction_codegen.cpp | 353 ++- .../OpenMP/sections_reduction_task_codegen.cpp | 36 +- clang/test/OpenMP/simd_codegen.cpp | 8 +- clang/test/OpenMP/single_codegen.cpp | 597 ++-- clang/test/OpenMP/single_firstprivate_codegen.cpp | 321 +- clang/test/OpenMP/single_private_codegen.cpp | 189 +- clang/test/OpenMP/target_codegen.cpp | 12 +- .../test/OpenMP/target_codegen_global_capture.cpp | 104 +- clang/test/OpenMP/target_defaultmap_codegen_01.cpp | 676 ++--- clang/test/OpenMP/target_depend_codegen.cpp | 14 +- clang/test/OpenMP/target_enter_data_codegen.cpp | 2 +- .../OpenMP/target_enter_data_depend_codegen.cpp | 8 +- clang/test/OpenMP/target_exit_data_codegen.cpp | 2 +- .../OpenMP/target_exit_data_depend_codegen.cpp | 8 +- clang/test/OpenMP/target_firstprivate_codegen.cpp | 12 +- clang/test/OpenMP/target_map_codegen_00.cpp | 2 +- clang/test/OpenMP/target_map_codegen_01.cpp | 4 +- clang/test/OpenMP/target_map_codegen_02.cpp | 2 +- clang/test/OpenMP/target_map_codegen_03.cpp | 96 +- clang/test/OpenMP/target_map_codegen_04.cpp | 2 +- clang/test/OpenMP/target_map_codegen_05.cpp | 2 +- clang/test/OpenMP/target_map_codegen_06.cpp | 2 +- clang/test/OpenMP/target_map_codegen_07.cpp | 2 +- clang/test/OpenMP/target_map_codegen_11.cpp | 2 +- clang/test/OpenMP/target_map_codegen_12.cpp | 2 +- clang/test/OpenMP/target_map_codegen_13.cpp | 2 +- clang/test/OpenMP/target_map_codegen_14.cpp | 4 +- clang/test/OpenMP/target_map_codegen_15.cpp | 2 +- clang/test/OpenMP/target_map_codegen_17.cpp | 2 +- clang/test/OpenMP/target_map_codegen_24.cpp | 2 +- clang/test/OpenMP/target_map_names.cpp | 2 +- clang/test/OpenMP/target_map_names_attr.cpp | 2 +- clang/test/OpenMP/target_parallel_codegen.cpp | 608 ++-- .../test/OpenMP/target_parallel_debug_codegen.cpp | 24 +- .../test/OpenMP/target_parallel_depend_codegen.cpp | 12 +- clang/test/OpenMP/target_parallel_for_codegen.cpp | 672 ++--- .../OpenMP/target_parallel_for_debug_codegen.cpp | 24 +- </cut>

4 years, 7 months

[TCWG CI] 464.h264ref slowed down by 7% after llvm: [PassManager] `buildModuleOptimizationPipeline()`: schedule `LoopDeletion` pass run before vectorization passes

by ci_notify＠linaro.org

After llvm commit 9c2469c1ddb34517de8dafd83d1940deada3fc22 Author: Roman Lebedev <lebedev.ri(a)gmail.com> [PassManager] `buildModuleOptimizationPipeline()`: schedule `LoopDeletion` pass run before vectorization passes the following benchmarks slowed down by more than 2%: - 464.h264ref slowed down by 7% from 10836 to 11596 perf samples - 464.h264ref:[.] FastFullPelBlockMotionSearch slowed down by 46% from 1525 to 2231 perf samples Below reproducer instructions can be used to re-build both "first_bad" and "last_good" cross-toolchains used in this bisection. Naturally, the scripts will fail when triggerring benchmarking jobs if you don't have access to Linaro TCWG CI. For your convenience, we have uploaded tarballs with pre-processed source and assembly files at: - First_bad save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… - Last_good save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… - Baseline save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Configuration: - Benchmark: SPEC CPU2006 - Toolchain: Clang + Glibc + LLVM Linker - Version: all components were built from their tip of trunk - Target: aarch64-linux-gnu - Compiler flags: -O3 - Hardware: NVidia TX1 4x Cortex-A57 This benchmarking CI is work-in-progress, and we welcome feedback and suggestions at linaro-toolchain(a)lists.linaro.org . In our improvement plans is to add support for SPEC CPU2017 benchmarks and provide "perf report/annotate" data behind these reports. THIS IS THE END OF INTERESTING STUFF. BELOW ARE LINKS TO BUILDS, REPRODUCTION INSTRUCTIONS, AND THE RAW COMMIT. This commit has regressed these CI configurations: - tcwg_bmk_llvm_tx1/llvm-master-aarch64-spec2k6-O3 First_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Baseline build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Even more details: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Reproduce builds: <cut> mkdir investigate-llvm-9c2469c1ddb34517de8dafd83d1940deada3fc22 cd investigate-llvm-9c2469c1ddb34517de8dafd83d1940deada3fc22 # Fetch scripts git clone https://git.linaro.org/toolchain/jenkins-scripts # Fetch manifests and test.sh script mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach 9c2469c1ddb34517de8dafd83d1940deada3fc22 ../artifacts/test.sh # Reproduce last_good build git checkout --detach 4bef0304e153c757c9f42c2001d4c56e8f99929e ../artifacts/test.sh cd .. </cut> Full commit (up to 1000 lines): <cut> commit 9c2469c1ddb34517de8dafd83d1940deada3fc22 Author: Roman Lebedev <lebedev.ri(a)gmail.com> Date: Wed Nov 3 19:23:25 2021 +0300 [PassManager] `buildModuleOptimizationPipeline()`: schedule `LoopDeletion` pass run before vectorization passes Test thanks to Michael Kuklinski from `#llvm`: https://godbolt.org/z/bdrah5Goo originally inspired by Daniel Lemire's https://lemire.me/blog/2021/10/26/in-c-is-empty-faster-than-comparing-the-s… We manage to deduce that the answer does not require looping, but we do that after the last `LoopDeletion` pass run, so we end up being stuck with a dead loop. Now, as with all things SCEV, this has a very expected ~`+0.12%` compile time performance regression: https://llvm-compile-time-tracker.com/compare.php?from=0ae7bf124a9bca76dd9a… (for comparison, doing that in function simplification pipeline would have been ~`+0.5` compile time performance regression, D112840) Looking at the transformation stats over vanilla test-suite, i think it's rather expected: ``` | statistic name | baseline | proposed | Δ | % | |%| | |--------------------------------------------------|----------:|----------:|------:|-------:|-------:| | scalar-evolution.NumBruteForceTripCountsComputed | 789 | 888 | 99 | 12.55% | 12.55% | | scalar-evolution.NumTripCountsNotComputed | 105592 | 117900 | 12308 | 11.66% | 11.66% | | loop-delete.NumBackedgesBroken | 542 | 559 | 17 | 3.14% | 3.14% | | regalloc.numExtends | 81 | 79 | -2 | -2.47% | 2.47% | | indvars.NumFoldedUser | 408 | 400 | -8 | -1.96% | 1.96% | | indvars.NumElimCmp | 3831 | 3758 | -73 | -1.91% | 1.91% | | scalar-evolution.NumTripCountsComputed | 299759 | 304278 | 4519 | 1.51% | 1.51% | | loop-delete.NumDeleted | 8055 | 8128 | 73 | 0.91% | 0.91% | | machine-cse.NumCommutes | 111 | 110 | -1 | -0.90% | 0.90% | | globaldce.NumFunctions | 1187 | 1192 | 5 | 0.42% | 0.42% | | codegenprepare.NumSelectsExpanded | 277 | 278 | 1 | 0.36% | 0.36% | | loop-unroll.NumRuntimeUnrolled | 13841 | 13791 | -50 | -0.36% | 0.36% | | machinelicm.NumPostRAHoisted | 1168 | 1172 | 4 | 0.34% | 0.34% | | phi-node-elimination.NumCriticalEdgesSplit | 83054 | 82879 | -175 | -0.21% | 0.21% | | machine-cse.NumPREs | 3085 | 3079 | -6 | -0.19% | 0.19% | | branch-folder.NumBranchOpts | 108122 | 107942 | -180 | -0.17% | 0.17% | | loop-unroll.NumUnrolled | 40136 | 40067 | -69 | -0.17% | 0.17% | | branch-folder.NumDeadBlocks | 130818 | 130607 | -211 | -0.16% | 0.16% | | codegenprepare.NumBlocksElim | 92856 | 92714 | -142 | -0.15% | 0.15% | | instsimplify.NumSimplified | 103263 | 103129 | -134 | -0.13% | 0.13% | | instcombine.NumConstProp | 26070 | 26102 | 32 | 0.12% | 0.12% | | instsimplify.NumExpand | 1716 | 1718 | 2 | 0.12% | 0.12% | | loop-unroll.NumCompletelyUnrolled | 9236 | 9225 | -11 | -0.12% | 0.12% | | branch-folder.NumHoist | 2773 | 2770 | -3 | -0.11% | 0.11% | | regalloc.NumReloadsRemoved | 10822 | 10834 | 12 | 0.11% | 0.11% | | regalloc.NumSnippets | 11394 | 11406 | 12 | 0.11% | 0.11% | | machine-cse.NumCrossBBCSEs | 1052 | 1053 | 1 | 0.10% | 0.10% | | machinelicm.NumCSEed | 99887 | 99784 | -103 | -0.10% | 0.10% | | branch-folder.NumTailMerge | 72501 | 72435 | -66 | -0.09% | 0.09% | | codegenprepare.NumExtUses | 22007 | 21987 | -20 | -0.09% | 0.09% | | local.NumRemoved | 68232 | 68294 | 62 | 0.09% | 0.09% | | loop-vectorize.LoopsAnalyzed | 75483 | 75413 | -70 | -0.09% | 0.09% | ``` Note that i'm only changing current PM, and not touching obsolete PM. This is an alternative to the function simplification pipeline variant of the same change, D112840. It has both less compile time impact (since the additional number of SCEV trip count calculations is way lass less than with the D112840), and it is much more powerful/impactful (almost 2x more loops deleted). I have checked, and doing this after loop rotation is favorable (more loops deleted). Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D112851 --- llvm/lib/Passes/PassBuilderPipelines.cpp | 9 +++- llvm/test/Other/new-pm-defaults.ll | 1 + llvm/test/Other/new-pm-thinlto-defaults.ll | 1 + .../Other/new-pm-thinlto-postlink-pgo-defaults.ll | 1 + .../new-pm-thinlto-postlink-samplepgo-defaults.ll | 1 + ...letion-of-loops-that-became-side-effect-free.ll | 49 ++++------------------ 6 files changed, 18 insertions(+), 44 deletions(-) diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp index 2009a687ae7d..f0f7803ed3ae 100644 --- a/llvm/lib/Passes/PassBuilderPipelines.cpp +++ b/llvm/lib/Passes/PassBuilderPipelines.cpp @@ -1093,11 +1093,16 @@ PassBuilder::buildModuleOptimizationPipeline(OptimizationLevel Level, for (auto &C : VectorizerStartEPCallbacks) C(OptimizePM, Level); + LoopPassManager LPM; // First rotate loops that may have been un-rotated by prior passes. // Disable header duplication at -Oz. + LPM.addPass(LoopRotatePass(Level != OptimizationLevel::Oz, LTOPreLink)); + // Some loops may have become dead by now. Try to delete them. + // FIXME: see disscussion in https://reviews.llvm.org/D112851 + // this may need to be revisited once GVN is more powerful. + LPM.addPass(LoopDeletionPass()); OptimizePM.addPass(createFunctionToLoopPassAdaptor( - LoopRotatePass(Level != OptimizationLevel::Oz, LTOPreLink), - /*UseMemorySSA=*/false, /*UseBlockFrequencyInfo=*/false)); + std::move(LPM), /*UseMemorySSA=*/false, /*UseBlockFrequencyInfo=*/false)); // Distribute loops to allow partial vectorization. I.e. isolate dependences // into separate loop that would otherwise inhibit vectorization. This is diff --git a/llvm/test/Other/new-pm-defaults.ll b/llvm/test/Other/new-pm-defaults.ll index 5067b6fbdd18..b9f90dad8224 100644 --- a/llvm/test/Other/new-pm-defaults.ll +++ b/llvm/test/Other/new-pm-defaults.ll @@ -216,6 +216,7 @@ ; CHECK-O-NEXT: Running pass: LoopSimplifyPass ; CHECK-O-NEXT: Running pass: LCSSAPass ; CHECK-O-NEXT: Running pass: LoopRotatePass +; CHECK-O-NEXT: Running pass: LoopDeletionPass ; CHECK-O-NEXT: Running pass: LoopDistributePass ; CHECK-O-NEXT: Running pass: InjectTLIMappings ; CHECK-O-NEXT: Running pass: LoopVectorizePass diff --git a/llvm/test/Other/new-pm-thinlto-defaults.ll b/llvm/test/Other/new-pm-thinlto-defaults.ll index 1f52fe47ae73..7836de5c6cce 100644 --- a/llvm/test/Other/new-pm-thinlto-defaults.ll +++ b/llvm/test/Other/new-pm-thinlto-defaults.ll @@ -196,6 +196,7 @@ ; CHECK-POSTLINK-O-NEXT: Running pass: LoopSimplifyPass ; CHECK-POSTLINK-O-NEXT: Running pass: LCSSAPass ; CHECK-POSTLINK-O-NEXT: Running pass: LoopRotatePass +; CHECK-POSTLINK-O-NEXT: Running pass: LoopDeletionPass ; CHECK-POSTLINK-O-NEXT: Running pass: LoopDistributePass ; CHECK-POSTLINK-O-NEXT: Running pass: InjectTLIMappings ; CHECK-POSTLINK-O-NEXT: Running pass: LoopVectorizePass diff --git a/llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll b/llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll index 3a80efba3c56..e66e8672358c 100644 --- a/llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll +++ b/llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll @@ -167,6 +167,7 @@ ; CHECK-O-NEXT: Running pass: LoopSimplifyPass on foo ; CHECK-O-NEXT: Running pass: LCSSAPass on foo ; CHECK-O-NEXT: Running pass: LoopRotatePass +; CHECK-O-NEXT: Running pass: LoopDeletionPass ; CHECK-O-NEXT: Running pass: LoopDistributePass ; CHECK-O-NEXT: Running pass: InjectTLIMappings ; CHECK-O-NEXT: Running pass: LoopVectorizePass diff --git a/llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll b/llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll index 2e822b21f8a1..410841124c8e 100644 --- a/llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll +++ b/llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll @@ -179,6 +179,7 @@ ; CHECK-O-NEXT: Running pass: LoopSimplifyPass ; CHECK-O-NEXT: Running pass: LCSSAPass ; CHECK-O-NEXT: Running pass: LoopRotatePass +; CHECK-O-NEXT: Running pass: LoopDeletionPass ; CHECK-O-NEXT: Running pass: LoopDistributePass ; CHECK-O-NEXT: Running pass: InjectTLIMappings ; CHECK-O-NEXT: Running pass: LoopVectorizePass diff --git a/llvm/test/Transforms/PhaseOrdering/deletion-of-loops-that-became-side-effect-free.ll b/llvm/test/Transforms/PhaseOrdering/deletion-of-loops-that-became-side-effect-free.ll index ec8db3cceeb1..99a52acd3b2b 100644 --- a/llvm/test/Transforms/PhaseOrdering/deletion-of-loops-that-became-side-effect-free.ll +++ b/llvm/test/Transforms/PhaseOrdering/deletion-of-loops-that-became-side-effect-free.ll @@ -11,17 +11,8 @@ define dso_local zeroext i1 @is_not_empty_variant1(%struct.node* %p) { ; ALL-LABEL: @is_not_empty_variant1( ; ALL-NEXT: entry: -; ALL-NEXT: [[TOBOOL_NOT3_I:%.*]] = icmp eq %struct.node* [[P:%.*]], null -; ALL-NEXT: br i1 [[TOBOOL_NOT3_I]], label [[COUNT_NODES_VARIANT1_EXIT:%.*]], label [[WHILE_BODY_I:%.*]] -; ALL: while.body.i: -; ALL-NEXT: [[P_ADDR_04_I:%.*]] = phi %struct.node* [ [[TMP0:%.*]], [[WHILE_BODY_I]] ], [ [[P]], [[ENTRY:%.*]] ] -; ALL-NEXT: [[NEXT_I:%.*]] = getelementptr inbounds [[STRUCT_NODE:%.*]], %struct.node* [[P_ADDR_04_I]], i64 0, i32 0 -; ALL-NEXT: [[TMP0]] = load %struct.node*, %struct.node** [[NEXT_I]], align 8 -; ALL-NEXT: [[TOBOOL_NOT_I:%.*]] = icmp eq %struct.node* [[TMP0]], null -; ALL-NEXT: br i1 [[TOBOOL_NOT_I]], label [[COUNT_NODES_VARIANT1_EXIT]], label [[WHILE_BODY_I]], !llvm.loop [[LOOP0:![0-9]+]] -; ALL: count_nodes_variant1.exit: -; ALL-NEXT: [[TMP1:%.*]] = xor i1 [[TOBOOL_NOT3_I]], true -; ALL-NEXT: ret i1 [[TMP1]] +; ALL-NEXT: [[TOBOOL_NOT3_I:%.*]] = icmp ne %struct.node* [[P:%.*]], null +; ALL-NEXT: ret i1 [[TOBOOL_NOT3_I]] ; entry: %p.addr = alloca %struct.node*, align 8 @@ -113,39 +104,13 @@ while.end: define dso_local zeroext i1 @is_not_empty_variant3(%struct.node* %p) { ; O3-LABEL: @is_not_empty_variant3( ; O3-NEXT: entry: -; O3-NEXT: [[TOBOOL_NOT4_I:%.*]] = icmp eq %struct.node* [[P:%.*]], null -; O3-NEXT: br i1 [[TOBOOL_NOT4_I]], label [[COUNT_NODES_VARIANT3_EXIT:%.*]], label [[WHILE_BODY_I:%.*]] -; O3: while.body.i: -; O3-NEXT: [[SIZE_06_I:%.*]] = phi i64 [ [[INC_I:%.*]], [[WHILE_BODY_I]] ], [ 0, [[ENTRY:%.*]] ] -; O3-NEXT: [[P_ADDR_05_I:%.*]] = phi %struct.node* [ [[TMP0:%.*]], [[WHILE_BODY_I]] ], [ [[P]], [[ENTRY]] ] -; O3-NEXT: [[CMP_I:%.*]] = icmp ne i64 [[SIZE_06_I]], -1 -; O3-NEXT: tail call void @llvm.assume(i1 [[CMP_I]]) #[[ATTR3:[0-9]+]] -; O3-NEXT: [[NEXT_I:%.*]] = getelementptr inbounds [[STRUCT_NODE:%.*]], %struct.node* [[P_ADDR_05_I]], i64 0, i32 0 -; O3-NEXT: [[TMP0]] = load %struct.node*, %struct.node** [[NEXT_I]], align 8 -; O3-NEXT: [[INC_I]] = add nuw i64 [[SIZE_06_I]], 1 -; O3-NEXT: [[TOBOOL_NOT_I:%.*]] = icmp eq %struct.node* [[TMP0]], null -; O3-NEXT: br i1 [[TOBOOL_NOT_I]], label [[COUNT_NODES_VARIANT3_EXIT]], label [[WHILE_BODY_I]], !llvm.loop [[LOOP2:![0-9]+]] -; O3: count_nodes_variant3.exit: -; O3-NEXT: [[TMP1:%.*]] = xor i1 [[TOBOOL_NOT4_I]], true -; O3-NEXT: ret i1 [[TMP1]] +; O3-NEXT: [[TOBOOL_NOT4_I:%.*]] = icmp ne %struct.node* [[P:%.*]], null +; O3-NEXT: ret i1 [[TOBOOL_NOT4_I]] ; ; O2-LABEL: @is_not_empty_variant3( ; O2-NEXT: entry: -; O2-NEXT: [[TOBOOL_NOT4_I:%.*]] = icmp eq %struct.node* [[P:%.*]], null -; O2-NEXT: br i1 [[TOBOOL_NOT4_I]], label [[COUNT_NODES_VARIANT3_EXIT:%.*]], label [[WHILE_BODY_I:%.*]] -; O2: while.body.i: -; O2-NEXT: [[SIZE_06_I:%.*]] = phi i64 [ [[INC_I:%.*]], [[WHILE_BODY_I]] ], [ 0, [[ENTRY:%.*]] ] -; O2-NEXT: [[P_ADDR_05_I:%.*]] = phi %struct.node* [ [[TMP0:%.*]], [[WHILE_BODY_I]] ], [ [[P]], [[ENTRY]] ] -; O2-NEXT: [[CMP_I:%.*]] = icmp ne i64 [[SIZE_06_I]], -1 -; O2-NEXT: tail call void @llvm.assume(i1 [[CMP_I]]) #[[ATTR3:[0-9]+]] -; O2-NEXT: [[NEXT_I:%.*]] = getelementptr inbounds [[STRUCT_NODE:%.*]], %struct.node* [[P_ADDR_05_I]], i64 0, i32 0 -; O2-NEXT: [[TMP0]] = load %struct.node*, %struct.node** [[NEXT_I]], align 8 -; O2-NEXT: [[INC_I]] = add nuw i64 [[SIZE_06_I]], 1 -; O2-NEXT: [[TOBOOL_NOT_I:%.*]] = icmp eq %struct.node* [[TMP0]], null -; O2-NEXT: br i1 [[TOBOOL_NOT_I]], label [[COUNT_NODES_VARIANT3_EXIT]], label [[WHILE_BODY_I]], !llvm.loop [[LOOP2:![0-9]+]] -; O2: count_nodes_variant3.exit: -; O2-NEXT: [[TMP1:%.*]] = xor i1 [[TOBOOL_NOT4_I]], true -; O2-NEXT: ret i1 [[TMP1]] +; O2-NEXT: [[TOBOOL_NOT4_I:%.*]] = icmp ne %struct.node* [[P:%.*]], null +; O2-NEXT: ret i1 [[TOBOOL_NOT4_I]] ; ; O1-LABEL: @is_not_empty_variant3( ; O1-NEXT: entry: @@ -160,7 +125,7 @@ define dso_local zeroext i1 @is_not_empty_variant3(%struct.node* %p) { ; O1-NEXT: [[TMP0]] = load %struct.node*, %struct.node** [[NEXT_I]], align 8 ; O1-NEXT: [[INC_I]] = add i64 [[SIZE_06_I]], 1 ; O1-NEXT: [[TOBOOL_NOT_I:%.*]] = icmp eq %struct.node* [[TMP0]], null -; O1-NEXT: br i1 [[TOBOOL_NOT_I]], label [[COUNT_NODES_VARIANT3_EXIT_LOOPEXIT:%.*]], label [[WHILE_BODY_I]], !llvm.loop [[LOOP2:![0-9]+]] +; O1-NEXT: br i1 [[TOBOOL_NOT_I]], label [[COUNT_NODES_VARIANT3_EXIT_LOOPEXIT:%.*]], label [[WHILE_BODY_I]], !llvm.loop [[LOOP0:![0-9]+]] ; O1: count_nodes_variant3.exit.loopexit: ; O1-NEXT: [[PHI_CMP:%.*]] = icmp ne i64 [[INC_I]], 0 ; O1-NEXT: br label [[COUNT_NODES_VARIANT3_EXIT]] </cut>

4 years, 7 months

[ACTIVITY] week ending Nov. 7 2021

by Alex Bennée

VirtIO Initiative ([STR-9]) =========================== - various rust-vmm discussions - [upstream rust-vmm sync meeting] - how to deal with vhost-device/vm-virtio split: [proposal] - synced with ARM on their interests - got update on Fwd: FW: [App-services] Slides from the hypervisor-less virtio status meeting Message-Id: <CAHDbmO2G4hUyfxtaxwnbxsrMk+P41zbL-7VNe=Aa6DshxC-5zQ(a)mail.gmail.com> [STR-9] <https://linaro.atlassian.net/browse/STR-9> [upstream rust-vmm sync meeting] <https://etherpad.opendev.org/p/rust-vmm-sync-2021&sa=D&source=calendar&ust=…> [proposal] <https://github.com/rust-vmm/vhost-device/pull/57> QEMU Upstream Work ([UM-2]) =========================== - did some bug triage and investigated [555] and [690] which might intersect with earlier changes I made - spent time on the PR from hell [PULL 00/30] testing, gdbstub and semihosting Message-Id: <20210115130828.23968-1-alex.bennee(a)linaro.org> [UM-2] <https://linaro.atlassian.net/browse/UM-2> [555] <https://gitlab.com/qemu-project/qemu/-/issues/555> [690] <https://gitlab.com/qemu-project/qemu/-/issues/690> Other ===== - TSC report preparation for QEMU and Stratos Completed Reviews [1/1] ======================= [XEN PATCH v7 00/51] xen: Build system improvements, now with out-of-tree build! Message-Id: <20210824105038.1257926-1-anthony.perard(a)citrix.com> Absences ======== ,---- | (save-excursion | (goto-char (point-min)) | (when (re-search-forward "* Absences") | (goto-char (match-beginning 0)) | (org-export-as 'ascii t nil t ))) `---- Current Review Queue ==================== TODO [PATCH v2 00/48] tcg: optimize redundant sign extensions Message-Id: <20211007195456.1168070-1-richard.henderson(a)linaro.org> ================================================================================================================================ TODO [PATCH] cpu-models-x86.rst: Tidy up a couple of things Message-Id: <20211015100718.17828-1-pbonzini(a)redhat.com> =================================================================================================================== TODO [PATCH 00/16] fdt: Make OF_BOARD a boolean option Message-Id: <20211013010120.96851-1-sjg(a)chromium.org> =========================================================================================================== TODO [PATCH v4 00/41] linux-user: Streamline handling of SIGSEGV Message-Id: <20211006172307.780893-1-richard.henderson(a)linaro.org> ================================================================================================================================== -- Alex Bennée

4 years, 7 months

[ACTIVITY] report week ending 5 Nov

by Peter Maydell

Progress * UM-2 [QEMU upstream maintainership] + worked through the big pile of email that had built up while I was on holiday... + some long-delayed sysadmin tasks on my work machines now I have an opportunity to go into the office and do things that would be too risky with only remote access + triaged a bunch of Coverity issues * QEMU-406 [QEMU support for MVE (M-profile Vector Extension; Helium)] + All work here has now gone upstream; closed! -- PMM

4 years, 7 months

[TCWG CI] 401.bzip2 grew in size by 4% after llvm: Revert "Revert "Recommit "Revert "[CVP] processSwitch: Remove default case when switch cover all possible values.""""

by ci_notify＠linaro.org

After llvm commit c93f93b2e3f28997f794265089fb8138dd5b5f13 Author: Jun Ma <JunMa(a)linux.alibaba.com> Revert "Revert "Recommit "Revert "[CVP] processSwitch: Remove default case when switch cover all possible values."""" the following benchmarks grew in size by more than 1%: - 401.bzip2 grew in size by 4% from 36134 to 37534 bytes - 401.bzip2:[.] BZ2_decompress grew in size by 19% from 7256 to 8656 bytes Below reproducer instructions can be used to re-build both "first_bad" and "last_good" cross-toolchains used in this bisection. Naturally, the scripts will fail when triggerring benchmarking jobs if you don't have access to Linaro TCWG CI. For your convenience, we have uploaded tarballs with pre-processed source and assembly files at: - First_bad save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… - Last_good save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… - Baseline save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Configuration: - Benchmark: SPEC CPU2006 - Toolchain: Clang + Glibc + LLVM Linker - Version: all components were built from their tip of trunk - Target: arm-linux-gnueabihf - Compiler flags: -Oz -mthumb - Hardware: APM Mustang 8x X-Gene1 This benchmarking CI is work-in-progress, and we welcome feedback and suggestions at linaro-toolchain(a)lists.linaro.org . In our improvement plans is to add support for SPEC CPU2017 benchmarks and provide "perf report/annotate" data behind these reports. THIS IS THE END OF INTERESTING STUFF. BELOW ARE LINKS TO BUILDS, REPRODUCTION INSTRUCTIONS, AND THE RAW COMMIT. This commit has regressed these CI configurations: - tcwg_bmk_llvm_apm/llvm-master-arm-spec2k6-Oz First_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Baseline build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Even more details: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Reproduce builds: <cut> mkdir investigate-llvm-c93f93b2e3f28997f794265089fb8138dd5b5f13 cd investigate-llvm-c93f93b2e3f28997f794265089fb8138dd5b5f13 # Fetch scripts git clone https://git.linaro.org/toolchain/jenkins-scripts # Fetch manifests and test.sh script mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach c93f93b2e3f28997f794265089fb8138dd5b5f13 ../artifacts/test.sh # Reproduce last_good build git checkout --detach b4fb42300e39c99ac5bb9d02b304b713fabdec4d ../artifacts/test.sh cd .. </cut> Full commit (up to 1000 lines): <cut> commit c93f93b2e3f28997f794265089fb8138dd5b5f13 Author: Jun Ma <JunMa(a)linux.alibaba.com> Date: Tue Sep 28 09:44:00 2021 +0800 Revert "Revert "Recommit "Revert "[CVP] processSwitch: Remove default case when switch cover all possible values."""" This reverts commit 3a998c06a8e93989319238e12b56a731198cc1c2. --- llvm/include/llvm/Transforms/Utils/Local.h | 5 ++++ .../Scalar/CorrelatedValuePropagation.cpp | 27 +++++++++++++++++++++- llvm/lib/Transforms/Utils/Local.cpp | 20 ++++++++++++++++ llvm/lib/Transforms/Utils/SimplifyCFG.cpp | 20 ---------------- .../Transforms/CorrelatedValuePropagation/basic.ll | 11 +++++---- 5 files changed, 57 insertions(+), 26 deletions(-) diff --git a/llvm/include/llvm/Transforms/Utils/Local.h b/llvm/include/llvm/Transforms/Utils/Local.h index 3c529abce85a..72cb606eb51a 100644 --- a/llvm/include/llvm/Transforms/Utils/Local.h +++ b/llvm/include/llvm/Transforms/Utils/Local.h @@ -55,6 +55,7 @@ class MDNode; class MemorySSAUpdater; class PHINode; class StoreInst; +class SwitchInst; class TargetLibraryInfo; class TargetTransformInfo; @@ -237,6 +238,10 @@ CallInst *createCallMatchingInvoke(InvokeInst *II); /// This function converts the specified invoek into a normall call. void changeToCall(InvokeInst *II, DomTreeUpdater *DTU = nullptr); +/// This function removes the default destination from the specified switch. +void createUnreachableSwitchDefault(SwitchInst *Switch, + DomTreeUpdater *DTU = nullptr); + ///===---------------------------------------------------------------------===// /// Dbg Intrinsic utilities /// diff --git a/llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp b/llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp index 6dbd3da24059..4b8392db9628 100644 --- a/llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp +++ b/llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp @@ -341,7 +341,13 @@ static bool processSwitch(SwitchInst *I, LazyValueInfo *LVI, // ConstantFoldTerminator() as the underlying SwitchInst can be changed. SwitchInstProfUpdateWrapper SI(*I); - for (auto CI = SI->case_begin(), CE = SI->case_end(); CI != CE;) { + APInt Low = + APInt::getSignedMaxValue(Cond->getType()->getScalarSizeInBits()); + APInt High = + APInt::getSignedMinValue(Cond->getType()->getScalarSizeInBits()); + + SwitchInst::CaseIt CI = SI->case_begin(); + for (auto CE = SI->case_end(); CI != CE;) { ConstantInt *Case = CI->getCaseValue(); LazyValueInfo::Tristate State = LVI->getPredicateAt(CmpInst::ICMP_EQ, Cond, Case, I, @@ -374,9 +380,28 @@ static bool processSwitch(SwitchInst *I, LazyValueInfo *LVI, break; } + // Get Lower/Upper bound from switch cases. + Low = APIntOps::smin(Case->getValue(), Low); + High = APIntOps::smax(Case->getValue(), High); + // Increment the case iterator since we didn't delete it. ++CI; } + + // Try to simplify default case as unreachable + if (CI == SI->case_end() && SI->getNumCases() != 0 && + !isa<UnreachableInst>(SI->getDefaultDest()->getFirstNonPHIOrDbg())) { + const ConstantRange SIRange = + LVI->getConstantRange(SI->getCondition(), SI); + + // If the numbered switch cases cover the entire range of the condition, + // then the default case is not reachable. + if (SIRange.getSignedMin() == Low && SIRange.getSignedMax() == High && + SI->getNumCases() == High - Low + 1) { + createUnreachableSwitchDefault(SI, &DTU); + Changed = true; + } + } } if (Changed) diff --git a/llvm/lib/Transforms/Utils/Local.cpp b/llvm/lib/Transforms/Utils/Local.cpp index 3e36f498523d..74ab37fadf36 100644 --- a/llvm/lib/Transforms/Utils/Local.cpp +++ b/llvm/lib/Transforms/Utils/Local.cpp @@ -2190,6 +2190,26 @@ void llvm::changeToCall(InvokeInst *II, DomTreeUpdater *DTU) { DTU->applyUpdates({{DominatorTree::Delete, BB, UnwindDestBB}}); } +void llvm::createUnreachableSwitchDefault(SwitchInst *Switch, + DomTreeUpdater *DTU) { + LLVM_DEBUG(dbgs() << "SimplifyCFG: switch default is dead.\n"); + auto *BB = Switch->getParent(); + auto *OrigDefaultBlock = Switch->getDefaultDest(); + OrigDefaultBlock->removePredecessor(BB); + BasicBlock *NewDefaultBlock = BasicBlock::Create( + BB->getContext(), BB->getName() + ".unreachabledefault", BB->getParent(), + OrigDefaultBlock); + new UnreachableInst(Switch->getContext(), NewDefaultBlock); + Switch->setDefaultDest(&*NewDefaultBlock); + if (DTU) { + SmallVector<DominatorTree::UpdateType, 2> Updates; + Updates.push_back({DominatorTree::Insert, BB, &*NewDefaultBlock}); + if (!is_contained(successors(BB), OrigDefaultBlock)) + Updates.push_back({DominatorTree::Delete, BB, &*OrigDefaultBlock}); + DTU->applyUpdates(Updates); + } +} + BasicBlock *llvm::changeToInvokeAndSplitBasicBlock(CallInst *CI, BasicBlock *UnwindEdge, DomTreeUpdater *DTU) { diff --git a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp index 7b49f47778e0..3eab293b433e 100644 --- a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp +++ b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp @@ -4782,26 +4782,6 @@ static bool CasesAreContiguous(SmallVectorImpl<ConstantInt *> &Cases) { return true; } -static void createUnreachableSwitchDefault(SwitchInst *Switch, - DomTreeUpdater *DTU) { - LLVM_DEBUG(dbgs() << "SimplifyCFG: switch default is dead.\n"); - auto *BB = Switch->getParent(); - auto *OrigDefaultBlock = Switch->getDefaultDest(); - OrigDefaultBlock->removePredecessor(BB); - BasicBlock *NewDefaultBlock = BasicBlock::Create( - BB->getContext(), BB->getName() + ".unreachabledefault", BB->getParent(), - OrigDefaultBlock); - new UnreachableInst(Switch->getContext(), NewDefaultBlock); - Switch->setDefaultDest(&*NewDefaultBlock); - if (DTU) { - SmallVector<DominatorTree::UpdateType, 2> Updates; - Updates.push_back({DominatorTree::Insert, BB, &*NewDefaultBlock}); - if (!is_contained(successors(BB), OrigDefaultBlock)) - Updates.push_back({DominatorTree::Delete, BB, &*OrigDefaultBlock}); - DTU->applyUpdates(Updates); - } -} - /// Turn a switch with two reachable destinations into an integer range /// comparison and branch. bool SimplifyCFGOpt::TurnSwitchRangeIntoICmp(SwitchInst *SI, diff --git a/llvm/test/Transforms/CorrelatedValuePropagation/basic.ll b/llvm/test/Transforms/CorrelatedValuePropagation/basic.ll index 5abbcbc90e01..a620c8468d4d 100644 --- a/llvm/test/Transforms/CorrelatedValuePropagation/basic.ll +++ b/llvm/test/Transforms/CorrelatedValuePropagation/basic.ll @@ -382,7 +382,7 @@ define i32 @switch_range(i32 %cond) { ; CHECK-NEXT: entry: ; CHECK-NEXT: [[S:%.*]] = urem i32 [[COND:%.*]], 3 ; CHECK-NEXT: [[S1:%.*]] = add nuw nsw i32 [[S]], 1 -; CHECK-NEXT: switch i32 [[S1]], label [[UNREACHABLE:%.*]] [ +; CHECK-NEXT: switch i32 [[S1]], label [[ENTRY_UNREACHABLEDEFAULT:%.*]] [ ; CHECK-NEXT: i32 1, label [[EXIT1:%.*]] ; CHECK-NEXT: i32 2, label [[EXIT2:%.*]] ; CHECK-NEXT: i32 3, label [[EXIT1]] @@ -391,6 +391,8 @@ define i32 @switch_range(i32 %cond) { ; CHECK-NEXT: ret i32 1 ; CHECK: exit2: ; CHECK-NEXT: ret i32 2 +; CHECK: entry.unreachabledefault: +; CHECK-NEXT: unreachable ; CHECK: unreachable: ; CHECK-NEXT: ret i32 0 ; @@ -453,10 +455,9 @@ define i8 @switch_defaultdest_multipleuse(i8 %t0) { ; CHECK-NEXT: entry: ; CHECK-NEXT: [[O:%.*]] = or i8 [[T0:%.*]], 1 ; CHECK-NEXT: [[R:%.*]] = srem i8 1, [[O]] -; CHECK-NEXT: switch i8 [[R]], label [[EXIT:%.*]] [ -; CHECK-NEXT: i8 0, label [[EXIT]] -; CHECK-NEXT: i8 1, label [[EXIT]] -; CHECK-NEXT: ] +; CHECK-NEXT: br label [[EXIT:%.*]] +; CHECK: entry.unreachabledefault: +; CHECK-NEXT: unreachable ; CHECK: exit: ; CHECK-NEXT: ret i8 0 ; </cut>

4 years, 7 months

[TCWG CI] 464.h264ref slowed down by 7% after llvm: [BasicAA] Handle known bits as ranges

by ci_notify＠linaro.org

After llvm commit fbc0c308d599fe3300ab6516650b65b41979446d Author: Nikita Popov <nikita.ppv(a)gmail.com> [BasicAA] Handle known bits as ranges the following benchmarks slowed down by more than 2%: - 464.h264ref slowed down by 7% from 10899 to 11610 perf samples - 464.h264ref:libc.so.6 slowed down by 11% from 3538 to 3922 perf samples Below reproducer instructions can be used to re-build both "first_bad" and "last_good" cross-toolchains used in this bisection. Naturally, the scripts will fail when triggerring benchmarking jobs if you don't have access to Linaro TCWG CI. For your convenience, we have uploaded tarballs with pre-processed source and assembly files at: - First_bad save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… - Last_good save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… - Baseline save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Configuration: - Benchmark: SPEC CPU2006 - Toolchain: Clang + Glibc + LLVM Linker - Version: all components were built from their tip of trunk - Target: aarch64-linux-gnu - Compiler flags: -O2 -flto - Hardware: NVidia TX1 4x Cortex-A57 This benchmarking CI is work-in-progress, and we welcome feedback and suggestions at linaro-toolchain(a)lists.linaro.org . In our improvement plans is to add support for SPEC CPU2017 benchmarks and provide "perf report/annotate" data behind these reports. THIS IS THE END OF INTERESTING STUFF. BELOW ARE LINKS TO BUILDS, REPRODUCTION INSTRUCTIONS, AND THE RAW COMMIT. This commit has regressed these CI configurations: - tcwg_bmk_llvm_tx1/llvm-master-aarch64-spec2k6-O2_LTO First_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Baseline build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Even more details: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Reproduce builds: <cut> mkdir investigate-llvm-fbc0c308d599fe3300ab6516650b65b41979446d cd investigate-llvm-fbc0c308d599fe3300ab6516650b65b41979446d # Fetch scripts git clone https://git.linaro.org/toolchain/jenkins-scripts # Fetch manifests and test.sh script mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach fbc0c308d599fe3300ab6516650b65b41979446d ../artifacts/test.sh # Reproduce last_good build git checkout --detach 30a3652b6ade43504087f6e3acd8dc879055f501 ../artifacts/test.sh cd .. </cut> Full commit (up to 1000 lines): <cut> commit fbc0c308d599fe3300ab6516650b65b41979446d Author: Nikita Popov <nikita.ppv(a)gmail.com> Date: Mon Oct 25 15:47:21 2021 +0200 [BasicAA] Handle known bits as ranges BasicAA currently tries to determine that the offset is positive by checking whether all variable indices are positive based on known bits, multiplied by a positive scale. However, this is incorrect if the scale multiplication might overflow. In the modified test case the original value is positive, but may be negative after a left shift. Fix this by converting known bits into a constant range and reusing the range-based logic, which handles overflow correctly. Differential Revision: https://reviews.llvm.org/D112611 --- llvm/lib/Analysis/BasicAliasAnalysis.cpp | 51 +++++----------------- .../test/Analysis/BasicAA/assume-index-positive.ll | 4 +- 2 files changed, 12 insertions(+), 43 deletions(-) diff --git a/llvm/lib/Analysis/BasicAliasAnalysis.cpp b/llvm/lib/Analysis/BasicAliasAnalysis.cpp index 0305732ca5d5..8cf947c43bf4 100644 --- a/llvm/lib/Analysis/BasicAliasAnalysis.cpp +++ b/llvm/lib/Analysis/BasicAliasAnalysis.cpp @@ -318,15 +318,6 @@ struct CastedValue { return N; } - KnownBits evaluateWith(KnownBits N) const { - assert(N.getBitWidth() == V->getType()->getPrimitiveSizeInBits() && - "Incompatible bit width"); - if (TruncBits) N = N.trunc(N.getBitWidth() - TruncBits); - if (SExtBits) N = N.sext(N.getBitWidth() + SExtBits); - if (ZExtBits) N = N.zext(N.getBitWidth() + ZExtBits); - return N; - } - ConstantRange evaluateWith(ConstantRange N) const { assert(N.getBitWidth() == V->getType()->getPrimitiveSizeInBits() && "Incompatible bit width"); @@ -1250,8 +1241,6 @@ AliasResult BasicAAResult::aliasGEP( if (!DecompGEP1.VarIndices.empty()) { APInt GCD; - bool AllNonNegative = DecompGEP1.Offset.isNonNegative(); - bool AllNonPositive = DecompGEP1.Offset.isNonPositive(); ConstantRange OffsetRange = ConstantRange(DecompGEP1.Offset); for (unsigned i = 0, e = DecompGEP1.VarIndices.size(); i != e; ++i) { const VariableGEPIndex &Index = DecompGEP1.VarIndices[i]; @@ -1266,24 +1255,19 @@ AliasResult BasicAAResult::aliasGEP( else GCD = APIntOps::GreatestCommonDivisor(GCD, ScaleForGCD.abs()); - if (AllNonNegative || AllNonPositive) { - KnownBits Known = Index.Val.evaluateWith( - computeKnownBits(Index.Val.V, DL, 0, &AC, Index.CxtI, DT)); - bool SignKnownZero = Known.isNonNegative(); - bool SignKnownOne = Known.isNegative(); - AllNonNegative &= (SignKnownZero && Scale.isNonNegative()) || - (SignKnownOne && Scale.isNonPositive()); - AllNonPositive &= (SignKnownZero && Scale.isNonPositive()) || - (SignKnownOne && Scale.isNonNegative()); - } + ConstantRange CR = + computeConstantRange(Index.Val.V, true, &AC, Index.CxtI); + KnownBits Known = + computeKnownBits(Index.Val.V, DL, 0, &AC, Index.CxtI, DT); + CR = CR.intersectWith( + ConstantRange::fromKnownBits(Known, /* Signed */ true), + ConstantRange::Signed); assert(OffsetRange.getBitWidth() == Scale.getBitWidth() && "Bit widths are normalized to MaxPointerSize"); - OffsetRange = OffsetRange.add(Index.Val - .evaluateWith(computeConstantRange( - Index.Val.V, true, &AC, Index.CxtI)) - .sextOrTrunc(OffsetRange.getBitWidth()) - .smul_fast(ConstantRange(Scale))); + OffsetRange = OffsetRange.add( + Index.Val.evaluateWith(CR).sextOrTrunc(OffsetRange.getBitWidth()) + .smul_fast(ConstantRange(Scale))); } // We now have accesses at two offsets from the same base: @@ -1300,21 +1284,6 @@ AliasResult BasicAAResult::aliasGEP( (GCD - ModOffset).uge(V1Size.getValue())) return AliasResult::NoAlias; - // If we know all the variables are non-negative, then the total offset is - // also non-negative and >= DecompGEP1.Offset. We have the following layout: - // [0, V2Size) ... [TotalOffset, TotalOffer+V1Size] - // If DecompGEP1.Offset >= V2Size, the accesses don't alias. - if (AllNonNegative && V2Size.hasValue() && - DecompGEP1.Offset.uge(V2Size.getValue())) - return AliasResult::NoAlias; - // Similarly, if the variables are non-positive, then the total offset is - // also non-positive and <= DecompGEP1.Offset. We have the following layout: - // [TotalOffset, TotalOffset+V1Size) ... [0, V2Size) - // If -DecompGEP1.Offset >= V1Size, the accesses don't alias. - if (AllNonPositive && V1Size.hasValue() && - (-DecompGEP1.Offset).uge(V1Size.getValue())) - return AliasResult::NoAlias; - if (V1Size.hasValue() && V2Size.hasValue()) { // Compute ranges of potentially accessed bytes for both accesses. If the // interseciton is empty, there can be no overlap. diff --git a/llvm/test/Analysis/BasicAA/assume-index-positive.ll b/llvm/test/Analysis/BasicAA/assume-index-positive.ll index 451592067f4b..a53fff2c6009 100644 --- a/llvm/test/Analysis/BasicAA/assume-index-positive.ll +++ b/llvm/test/Analysis/BasicAA/assume-index-positive.ll @@ -130,12 +130,12 @@ define void @symmetry([0 x i8]* %ptr, i32 %a, i32 %b, i32 %c) { ret void } -; TODO: %ptr.neg and %ptr.shl may alias, as the shl renders the previously +; %ptr.neg and %ptr.shl may alias, as the shl renders the previously ; non-negative value potentially negative. define void @shl_of_non_negative(i8* %ptr, i64 %a) { ; CHECK-LABEL: Function: shl_of_non_negative ; CHECK: NoAlias: i8* %ptr.a, i8* %ptr.neg -; CHECK: NoAlias: i8* %ptr.neg, i8* %ptr.shl +; CHECK: MayAlias: i8* %ptr.neg, i8* %ptr.shl %a.cmp = icmp sge i64 %a, 0 call void @llvm.assume(i1 %a.cmp) %ptr.neg = getelementptr i8, i8* %ptr, i64 -2 </cut>

4 years, 7 months

[TCWG CI] 400.perlbench:[.] S_find_byclass slowed down by 12% after llvm: [ORC] Call ExecutorProcessControl::disconnect in unit tests that require it.

by ci_notify＠linaro.org

After llvm commit adf55ac6657693f7bfbe3087b599b4031a765a44 Author: Lang Hames <lhames(a)gmail.com> [ORC] Call ExecutorProcessControl::disconnect in unit tests that require it. the following hot functions slowed down by more than 10% (but their benchmarks slowed down by less than 2%): - 400.perlbench:[.] S_find_byclass slowed down by 12% from 644 to 721 perf samples Below reproducer instructions can be used to re-build both "first_bad" and "last_good" cross-toolchains used in this bisection. Naturally, the scripts will fail when triggerring benchmarking jobs if you don't have access to Linaro TCWG CI. For your convenience, we have uploaded tarballs with pre-processed source and assembly files at: - First_bad save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… - Last_good save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… - Baseline save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Configuration: - Benchmark: SPEC CPU2006 - Toolchain: Clang + Glibc + LLVM Linker - Version: all components were built from their tip of trunk - Target: aarch64-linux-gnu - Compiler flags: -O2 - Hardware: NVidia TX1 4x Cortex-A57 This benchmarking CI is work-in-progress, and we welcome feedback and suggestions at linaro-toolchain(a)lists.linaro.org . In our improvement plans is to add support for SPEC CPU2017 benchmarks and provide "perf report/annotate" data behind these reports. THIS IS THE END OF INTERESTING STUFF. BELOW ARE LINKS TO BUILDS, REPRODUCTION INSTRUCTIONS, AND THE RAW COMMIT. This commit has regressed these CI configurations: - tcwg_bmk_llvm_tx1/llvm-master-aarch64-spec2k6-O2 First_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Baseline build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Even more details: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Reproduce builds: <cut> mkdir investigate-llvm-adf55ac6657693f7bfbe3087b599b4031a765a44 cd investigate-llvm-adf55ac6657693f7bfbe3087b599b4031a765a44 # Fetch scripts git clone https://git.linaro.org/toolchain/jenkins-scripts # Fetch manifests and test.sh script mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach adf55ac6657693f7bfbe3087b599b4031a765a44 ../artifacts/test.sh # Reproduce last_good build git checkout --detach f526ee5b8517b60620cd03bb3e5945ed69d6bfaa ../artifacts/test.sh cd .. </cut> Full commit (up to 1000 lines): <cut> commit adf55ac6657693f7bfbe3087b599b4031a765a44 Author: Lang Hames <lhames(a)gmail.com> Date: Tue Oct 12 14:55:49 2021 -0700 [ORC] Call ExecutorProcessControl::disconnect in unit tests that require it. Another follow-up to 2815ed57e3c and 19b4e3cfc6a. For unit tests that don't use an ExecutionSession we need to call ExecutorProcessControl::disconnect directly to wait for the dispatcher to shut down. https://llvm.org/PR52153 --- .../ExecutionEngine/Orc/EPCGenericJITLinkMemoryManagerTest.cpp | 2 ++ llvm/unittests/ExecutionEngine/Orc/EPCGenericMemoryAccessTest.cpp | 2 ++ 2 files changed, 4 insertions(+) diff --git a/llvm/unittests/ExecutionEngine/Orc/EPCGenericJITLinkMemoryManagerTest.cpp b/llvm/unittests/ExecutionEngine/Orc/EPCGenericJITLinkMemoryManagerTest.cpp index f2b157e424b6..a95435aec2a3 100644 --- a/llvm/unittests/ExecutionEngine/Orc/EPCGenericJITLinkMemoryManagerTest.cpp +++ b/llvm/unittests/ExecutionEngine/Orc/EPCGenericJITLinkMemoryManagerTest.cpp @@ -134,6 +134,8 @@ TEST(EPCGenericJITLinkMemoryManagerTest, AllocFinalizeFree) { auto Err2 = MemMgr->deallocate(std::move(*FA)); EXPECT_THAT_ERROR(std::move(Err2), Succeeded()); + + cantFail(SelfEPC->disconnect()); } } // namespace diff --git a/llvm/unittests/ExecutionEngine/Orc/EPCGenericMemoryAccessTest.cpp b/llvm/unittests/ExecutionEngine/Orc/EPCGenericMemoryAccessTest.cpp index 78024644ca8b..beb0fefa094a 100644 --- a/llvm/unittests/ExecutionEngine/Orc/EPCGenericMemoryAccessTest.cpp +++ b/llvm/unittests/ExecutionEngine/Orc/EPCGenericMemoryAccessTest.cpp @@ -93,6 +93,8 @@ TEST(EPCGenericMemoryAccessTest, MemWrites) { {{pointerToJITTargetAddress(&Test_Buffer), TestMsg}}); EXPECT_THAT_ERROR(std::move(Err5), Succeeded()); EXPECT_EQ(StringRef(Test_Buffer, TestMsg.size()), TestMsg); + + cantFail(SelfEPC->disconnect()); } } // namespace </cut>

4 years, 7 months

[TCWG CI] 400.perlbench slowed down by 6% after llvm: [AArch64] Remove redundant ORRWrs which is generated by zero-extend

by ci_notify＠linaro.org

After llvm commit a502436259307f95e9c95437d8a1d2d07294341c Author: Jingu Kang <jingu.kang(a)arm.com> [AArch64] Remove redundant ORRWrs which is generated by zero-extend the following benchmarks slowed down by more than 2%: - 400.perlbench slowed down by 6% from 9792 to 10354 perf samples - 464.h264ref slowed down by 4% from 11023 to 11509 perf samples - 464.h264ref:[.] FastFullPelBlockMotionSearch slowed down by 33% from 1634 to 2180 perf samples Below reproducer instructions can be used to re-build both "first_bad" and "last_good" cross-toolchains used in this bisection. Naturally, the scripts will fail when triggerring benchmarking jobs if you don't have access to Linaro TCWG CI. For your convenience, we have uploaded tarballs with pre-processed source and assembly files at: - First_bad save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… - Last_good save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… - Baseline save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Configuration: - Benchmark: SPEC CPU2006 - Toolchain: Clang + Glibc + LLVM Linker - Version: all components were built from their tip of trunk - Target: aarch64-linux-gnu - Compiler flags: -O3 - Hardware: NVidia TX1 4x Cortex-A57 This benchmarking CI is work-in-progress, and we welcome feedback and suggestions at linaro-toolchain(a)lists.linaro.org . In our improvement plans is to add support for SPEC CPU2017 benchmarks and provide "perf report/annotate" data behind these reports. THIS IS THE END OF INTERESTING STUFF. BELOW ARE LINKS TO BUILDS, REPRODUCTION INSTRUCTIONS, AND THE RAW COMMIT. This commit has regressed these CI configurations: - tcwg_bmk_llvm_tx1/llvm-master-aarch64-spec2k6-O3 First_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Baseline build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Even more details: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Reproduce builds: <cut> mkdir investigate-llvm-a502436259307f95e9c95437d8a1d2d07294341c cd investigate-llvm-a502436259307f95e9c95437d8a1d2d07294341c # Fetch scripts git clone https://git.linaro.org/toolchain/jenkins-scripts # Fetch manifests and test.sh script mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach a502436259307f95e9c95437d8a1d2d07294341c ../artifacts/test.sh # Reproduce last_good build git checkout --detach 6fa1b4ff4b05b9b9a432f7310802255c160c8f4f ../artifacts/test.sh cd .. </cut> Full commit (up to 1000 lines): <cut> commit a502436259307f95e9c95437d8a1d2d07294341c Author: Jingu Kang <jingu.kang(a)arm.com> Date: Thu Sep 30 15:39:10 2021 +0100 [AArch64] Remove redundant ORRWrs which is generated by zero-extend %3:gpr32 = ORRWrs $wzr, %2, 0 %4:gpr64 = SUBREG_TO_REG 0, %3, %subreg.sub_32 If AArch64's 32-bit form of instruction defines the source operand of ORRWrs, we can remove the ORRWrs because the upper 32 bits of the source operand are set to zero. Differential Revision: https://reviews.llvm.org/D110841 --- llvm/lib/Target/AArch64/AArch64MIPeepholeOpt.cpp | 57 ++++++++++++++++ .../test/CodeGen/AArch64/arm64-assert-zext-sext.ll | 51 +++++++++++--- .../AArch64/redundant-mov-from-zero-extend.ll | 79 ++++++++++++++++++++++ .../AArch64/redundant-orrwrs-from-zero-extend.mir | 69 +++++++++++++++++++ 4 files changed, 248 insertions(+), 8 deletions(-) diff --git a/llvm/lib/Target/AArch64/AArch64MIPeepholeOpt.cpp b/llvm/lib/Target/AArch64/AArch64MIPeepholeOpt.cpp index 42f683613698..42db18332f1c 100644 --- a/llvm/lib/Target/AArch64/AArch64MIPeepholeOpt.cpp +++ b/llvm/lib/Target/AArch64/AArch64MIPeepholeOpt.cpp @@ -15,6 +15,16 @@ // later. In this case, we could try to split the constant operand of mov // instruction into two bitmask immediates. It makes two AND instructions // intead of multiple `mov` + `and` instructions. +// +// 2. Remove redundant ORRWrs which is generated by zero-extend. +// +// %3:gpr32 = ORRWrs $wzr, %2, 0 +// %4:gpr64 = SUBREG_TO_REG 0, %3, %subreg.sub_32 +// +// If AArch64's 32-bit form of instruction defines the source operand of +// ORRWrs, we can remove the ORRWrs because the upper 32 bits of the source +// operand are set to zero. +// //===----------------------------------------------------------------------===// #include "AArch64ExpandImm.h" @@ -44,6 +54,8 @@ struct AArch64MIPeepholeOpt : public MachineFunctionPass { template <typename T> bool visitAND(MachineInstr &MI, SmallSetVector<MachineInstr *, 8> &ToBeRemoved); + bool visitORR(MachineInstr &MI, + SmallSetVector<MachineInstr *, 8> &ToBeRemoved); bool runOnMachineFunction(MachineFunction &MF) override; StringRef getPassName() const override { @@ -196,6 +208,49 @@ bool AArch64MIPeepholeOpt::visitAND( return true; } +bool AArch64MIPeepholeOpt::visitORR( + MachineInstr &MI, SmallSetVector<MachineInstr *, 8> &ToBeRemoved) { + // Check this ORR comes from below zero-extend pattern. + // + // def : Pat<(i64 (zext GPR32:$src)), + // (SUBREG_TO_REG (i32 0), (ORRWrs WZR, GPR32:$src, 0), sub_32)>; + if (MI.getOperand(3).getImm() != 0) + return false; + + if (MI.getOperand(1).getReg() != AArch64::WZR) + return false; + + MachineInstr *SrcMI = MRI->getUniqueVRegDef(MI.getOperand(2).getReg()); + if (!SrcMI) + return false; + + // From https://developer.arm.com/documentation/dui0801/b/BABBGCAC + // + // When you use the 32-bit form of an instruction, the upper 32 bits of the + // source registers are ignored and the upper 32 bits of the destination + // register are set to zero. + // + // If AArch64's 32-bit form of instruction defines the source operand of + // zero-extend, we do not need the zero-extend. Let's check the MI's opcode is + // real AArch64 instruction and if it is not, do not process the opcode + // conservatively. + if (SrcMI->getOpcode() <= TargetOpcode::GENERIC_OP_END) + return false; + + Register DefReg = MI.getOperand(0).getReg(); + Register SrcReg = MI.getOperand(2).getReg(); + MRI->replaceRegWith(DefReg, SrcReg); + MRI->clearKillFlags(SrcReg); + // replaceRegWith changes MI's definition register. Keep it for SSA form until + // deleting MI. + MI.getOperand(0).setReg(DefReg); + ToBeRemoved.insert(&MI); + + LLVM_DEBUG({ dbgs() << "Removed: " << MI << "\n"; }); + + return true; +} + bool AArch64MIPeepholeOpt::runOnMachineFunction(MachineFunction &MF) { if (skipFunction(MF.getFunction())) return false; @@ -221,6 +276,8 @@ bool AArch64MIPeepholeOpt::runOnMachineFunction(MachineFunction &MF) { case AArch64::ANDXrr: Changed = visitAND<uint64_t>(MI, ToBeRemoved); break; + case AArch64::ORRWrs: + Changed = visitORR(MI, ToBeRemoved); } } } diff --git a/llvm/test/CodeGen/AArch64/arm64-assert-zext-sext.ll b/llvm/test/CodeGen/AArch64/arm64-assert-zext-sext.ll index 07dd0b4ec56b..df4a9010dfa9 100644 --- a/llvm/test/CodeGen/AArch64/arm64-assert-zext-sext.ll +++ b/llvm/test/CodeGen/AArch64/arm64-assert-zext-sext.ll @@ -1,9 +1,32 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py ; RUN: llc -O2 -mtriple=aarch64-linux-gnu < %s | FileCheck %s declare i32 @test(i32) local_unnamed_addr declare i32 @test1(i64) local_unnamed_addr define i32 @assertzext(i32 %n, i1 %a, i32* %b) local_unnamed_addr { +; CHECK-LABEL: assertzext: +; CHECK: // %bb.0: // %entry +; CHECK-NEXT: stp x30, x19, [sp, #-16]! // 16-byte Folded Spill +; CHECK-NEXT: .cfi_def_cfa_offset 16 +; CHECK-NEXT: .cfi_offset w19, -8 +; CHECK-NEXT: .cfi_offset w30, -16 +; CHECK-NEXT: mov w8, #33066 +; CHECK-NEXT: tst w1, #0x1 +; CHECK-NEXT: movk w8, #28567, lsl #16 +; CHECK-NEXT: csel w19, wzr, w8, ne +; CHECK-NEXT: cbnz w0, .LBB0_2 +; CHECK-NEXT: // %bb.1: // %if.then +; CHECK-NEXT: mov w19, wzr +; CHECK-NEXT: str wzr, [x2] +; CHECK-NEXT: .LBB0_2: // %if.end +; CHECK-NEXT: mov w0, w19 +; CHECK-NEXT: bl test +; CHECK-NEXT: mov w0, w19 +; CHECK-NEXT: bl test1 +; CHECK-NEXT: mov w0, wzr +; CHECK-NEXT: ldp x30, x19, [sp], #16 // 16-byte Folded Reload +; CHECK-NEXT: ret entry: %i = select i1 %a, i64 0, i64 66296709418 %conv.i = trunc i64 %i to i32 @@ -20,14 +43,29 @@ if.end: ; preds = %if.then, %entry %i2 = sext i32 %i1 to i64 %call1.i = tail call i32 @test1(i64 %i2) ret i32 0 -; CHECK: // %if.end -; CHECK: mov w{{[0-9]+}}, w{{[0-9]+}} -; CHECK: bl test -; CHECK: mov w{{[0-9]+}}, w{{[0-9]+}} -; CHECK: bl test1 } define i32 @assertsext(i32 %n, i8 %a) local_unnamed_addr { +; CHECK-LABEL: assertsext: +; CHECK: // %bb.0: // %entry +; CHECK-NEXT: cbz w0, .LBB1_2 +; CHECK-NEXT: // %bb.1: +; CHECK-NEXT: mov x0, xzr +; CHECK-NEXT: b .LBB1_3 +; CHECK-NEXT: .LBB1_2: // %if.then +; CHECK-NEXT: mov x9, #24575 +; CHECK-NEXT: sxtb w8, w1 +; CHECK-NEXT: movk x9, #15873, lsl #16 +; CHECK-NEXT: movk x9, #474, lsl #32 +; CHECK-NEXT: udiv x0, x9, x8 +; CHECK-NEXT: .LBB1_3: // %if.end +; CHECK-NEXT: str x30, [sp, #-16]! // 8-byte Folded Spill +; CHECK-NEXT: .cfi_def_cfa_offset 16 +; CHECK-NEXT: .cfi_offset w30, -16 +; CHECK-NEXT: bl test1 +; CHECK-NEXT: mov w0, wzr +; CHECK-NEXT: ldr x30, [sp], #16 // 8-byte Folded Reload +; CHECK-NEXT: ret entry: %conv.i = sext i8 %a to i32 %cmp = icmp eq i32 %n, 0 @@ -37,9 +75,6 @@ if.then: ; preds = %entry %conv1 = zext i32 %conv.i to i64 %div = udiv i64 2036854775807, %conv1 br label %if.end -; CHECK: // %if.then -; CHECK: mov w{{[0-9]+}}, w{{[0-9]+}} -; CHECK: udiv x{{[0-9]+}}, x{{[0-9]+}}, x{{[0-9]+}} if.end: ; preds = %if.then, %entry %i1 = phi i64 [ %div, %if.then ], [ 0, %entry ] diff --git a/llvm/test/CodeGen/AArch64/redundant-mov-from-zero-extend.ll b/llvm/test/CodeGen/AArch64/redundant-mov-from-zero-extend.ll new file mode 100644 index 000000000000..42b9838acef2 --- /dev/null +++ b/llvm/test/CodeGen/AArch64/redundant-mov-from-zero-extend.ll @@ -0,0 +1,79 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py +; RUN: llc -O3 -mtriple=aarch64-linux-gnu < %s | FileCheck %s + +define i32 @test(i32 %input, i32 %n, i32 %a) { +; CHECK-LABEL: test: +; CHECK: // %bb.0: // %entry +; CHECK-NEXT: cbz w1, .LBB0_2 +; CHECK-NEXT: // %bb.1: +; CHECK-NEXT: mov w0, wzr +; CHECK-NEXT: ret +; CHECK-NEXT: .LBB0_2: // %bb.0 +; CHECK-NEXT: add w8, w0, w1 +; CHECK-NEXT: mov w0, #100 +; CHECK-NEXT: cmp w8, #4 +; CHECK-NEXT: b.hi .LBB0_5 +; CHECK-NEXT: // %bb.3: // %bb.0 +; CHECK-NEXT: adrp x9, .LJTI0_0 +; CHECK-NEXT: add x9, x9, :lo12:.LJTI0_0 +; CHECK-NEXT: adr x10, .LBB0_4 +; CHECK-NEXT: ldrb w11, [x9, x8] +; CHECK-NEXT: add x10, x10, x11, lsl #2 +; CHECK-NEXT: br x10 +; CHECK-NEXT: .LBB0_4: // %sw.bb +; CHECK-NEXT: add w0, w2, #1 +; CHECK-NEXT: ret +; CHECK-NEXT: .LBB0_5: // %bb.0 +; CHECK-NEXT: cmp w8, #200 +; CHECK-NEXT: b.ne .LBB0_10 +; CHECK-NEXT: // %bb.6: // %sw.bb7 +; CHECK-NEXT: add w0, w2, #7 +; CHECK-NEXT: ret +; CHECK-NEXT: .LBB0_7: // %sw.bb1 +; CHECK-NEXT: add w0, w2, #3 +; CHECK-NEXT: ret +; CHECK-NEXT: .LBB0_8: // %sw.bb3 +; CHECK-NEXT: add w0, w2, #4 +; CHECK-NEXT: ret +; CHECK-NEXT: .LBB0_9: // %sw.bb5 +; CHECK-NEXT: add w0, w2, #5 +; CHECK-NEXT: .LBB0_10: // %return +; CHECK-NEXT: ret +entry: + %b = add nsw i32 %input, %n + %cmp = icmp eq i32 %n, 0 + br i1 %cmp, label %bb.0, label %return + +bb.0: + switch i32 %b, label %return [ + i32 0, label %sw.bb + i32 1, label %sw.bb1 + i32 2, label %sw.bb3 + i32 4, label %sw.bb5 + i32 200, label %sw.bb7 + ] + +sw.bb: + %add = add nsw i32 %a, 1 + br label %return + +sw.bb1: + %add2 = add nsw i32 %a, 3 + br label %return + +sw.bb3: + %add4 = add nsw i32 %a, 4 + br label %return + +sw.bb5: + %add6 = add nsw i32 %a, 5 + br label %return + +sw.bb7: + %add8 = add nsw i32 %a, 7 + br label %return + +return: + %retval.0 = phi i32 [ %add8, %sw.bb7 ], [ %add6, %sw.bb5 ], [ %add4, %sw.bb3 ], [ %add2, %sw.bb1 ], [ %add, %sw.bb ], [ 100, %bb.0 ], [ 0, %entry ] + ret i32 %retval.0 +} diff --git a/llvm/test/CodeGen/AArch64/redundant-orrwrs-from-zero-extend.mir b/llvm/test/CodeGen/AArch64/redundant-orrwrs-from-zero-extend.mir new file mode 100644 index 000000000000..37540dde048f --- /dev/null +++ b/llvm/test/CodeGen/AArch64/redundant-orrwrs-from-zero-extend.mir @@ -0,0 +1,69 @@ +# RUN: llc -mtriple=aarch64 -run-pass aarch64-mi-peephole-opt -verify-machineinstrs -o - %s | FileCheck %s +--- +name: test1 +# CHECK-LABEL: name: test1 +alignment: 4 +tracksRegLiveness: true +registers: + - { id: 0, class: gpr32 } + - { id: 1, class: gpr32 } + - { id: 2, class: gpr32 } + - { id: 3, class: gpr32 } + - { id: 4, class: gpr64 } +body: | + bb.0: + liveins: $w0, $w1 + + %0:gpr32 = COPY $w0 + %1:gpr32 = COPY $w1 + B %bb.1 + + bb.1: + %2:gpr32 = nsw ADDWrr %0, %1 + B %bb.2 + + bb.2: + ; CHECK-LABEL: bb.2: + ; CHECK-NOT: %3:gpr32 = ORRWrs $wzr, %2, 0 + ; The ORRWrs should be removed. + %3:gpr32 = ORRWrs $wzr, %2, 0 + %4:gpr64 = SUBREG_TO_REG 0, %3, %subreg.sub_32 + B %bb.3 + + bb.3: + $x0 = COPY %4 + RET_ReallyLR implicit $x0 +... +--- +name: test2 +# CHECK-LABEL: name: test2 +alignment: 4 +tracksRegLiveness: true +registers: + - { id: 0, class: gpr64 } + - { id: 1, class: gpr32 } + - { id: 2, class: gpr32 } + - { id: 3, class: gpr64 } +body: | + bb.0: + liveins: $x0 + + %0:gpr64 = COPY $x0 + B %bb.1 + + bb.1: + %1:gpr32 = EXTRACT_SUBREG %0, %subreg.sub_32 + B %bb.2 + + bb.2: + ; CHECK-LABEL: bb.2: + ; CHECK: %2:gpr32 = ORRWrs $wzr, %1, 0 + ; The ORRWrs should not be removed. + %2:gpr32 = ORRWrs $wzr, %1, 0 + %3:gpr64 = SUBREG_TO_REG 0, %2, %subreg.sub_32 + B %bb.3 + + bb.3: + $x0 = COPY %3 + RET_ReallyLR implicit $x0 +... </cut>

4 years, 7 months

Jump to page:

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

linaro-toolchain