- linaro-toolchain - lists.linaro.org

by Peter Maydell

Progress * UM-2 [QEMU upstream maintainership] + Respin of a linux-user cleanup patchset + Code review, as usual * QEMU-406 [QEMU support for MVE (M-profile Vector Extension; Helium)] + Working on version 2 of the "optimized code gen for MVE" patchset; this now covers all the insns that have an easy optimized version. -- PMM

3 years, 11 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_kernel/gnu-master-aarch64-lts-allyesconfig - Build # 13 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *gcc* in CI configuration tcwg_kernel/gnu-master-aarch64-lts-allyesconfig. So far, this commit has regressed CI configurations: - tcwg_kernel/gnu-master-aarch64-lts-allyesconfig Culprit: <cut> commit a25e0b5e6ac8a77a71c229e0a7b744603365b0e9 Author: qing zhao <qing.zhao(a)oracle.com> Date: Thu Sep 9 15:44:49 2021 -0700 Add -ftrivial-auto-var-init option and uninitialized variable attribute. Initialize automatic variables with either a pattern or with zeroes to increase the security and predictability of a program by preventing uninitialized memory disclosure and use. GCC still considers an automatic variable that doesn't have an explicit initializer as uninitialized, -Wuninitialized will still report warning messages on such automatic variables. With this option, GCC will also initialize any padding of automatic variables that have structure or union types to zeroes. You can control this behavior for a specific variable by using the variable attribute "uninitialized" to control runtime overhead. gcc/ChangeLog: 2021-09-09 qing zhao <qing.zhao(a)oracle.com> * builtins.c (expand_builtin_memset): Make external visible. * builtins.h (expand_builtin_memset): Declare extern. * common.opt (ftrivial-auto-var-init=): New option. * doc/extend.texi: Document the uninitialized attribute. * doc/invoke.texi: Document -ftrivial-auto-var-init. * flag-types.h (enum auto_init_type): New enumerated type auto_init_type. * gimple-fold.c (clear_padding_type): Add one new parameter. (clear_padding_union): Likewise. (clear_padding_emit_loop): Likewise. (clear_type_padding_in_mask): Likewise. (gimple_fold_builtin_clear_padding): Handle this new parameter. * gimplify.c (gimple_add_init_for_auto_var): New function. (gimple_add_padding_init_for_auto_var): New function. (is_var_need_auto_init): New function. (gimplify_decl_expr): Add initialization to automatic variables per users' requests. (gimplify_call_expr): Add one new parameter for call to __builtin_clear_padding. (gimplify_init_constructor): Add padding initialization in the end. * internal-fn.c (INIT_PATTERN_VALUE): New macro. (expand_DEFERRED_INIT): New function. * internal-fn.def (DEFERRED_INIT): New internal function. * tree-cfg.c (verify_gimple_call): Verify calls to .DEFERRED_INIT. * tree-sra.c (generate_subtree_deferred_init): New function. (scan_function): Avoid setting cannot_scalarize_away_bitmap for calls to .DEFERRED_INIT. (sra_modify_deferred_init): New function. (sra_modify_function_body): Handle calls to DEFERRED_INIT specially. * tree-ssa-structalias.c (find_func_aliases_for_call): Likewise. * tree-ssa-uninit.c (warn_uninit): Handle calls to DEFERRED_INIT specially. (check_defs): Likewise. (warn_uninitialized_vars): Likewise. * tree-ssa.c (ssa_undefined_value_p): Likewise. * tree.c (build_common_builtin_nodes): Build tree node for BUILT_IN_CLEAR_PADDING when needed. gcc/c-family/ChangeLog: 2021-09-09 qing zhao <qing.zhao(a)oracle.com> * c-attribs.c (handle_uninitialized_attribute): New function. (c_common_attribute_table): Add "uninitialized" attribute. gcc/testsuite/ChangeLog: 2021-09-09 qing zhao <qing.zhao(a)oracle.com> * c-c++-common/auto-init-1.c: New test. * c-c++-common/auto-init-10.c: New test. * c-c++-common/auto-init-11.c: New test. * c-c++-common/auto-init-12.c: New test. * c-c++-common/auto-init-13.c: New test. * c-c++-common/auto-init-14.c: New test. * c-c++-common/auto-init-15.c: New test. * c-c++-common/auto-init-16.c: New test. * c-c++-common/auto-init-2.c: New test. * c-c++-common/auto-init-3.c: New test. * c-c++-common/auto-init-4.c: New test. * c-c++-common/auto-init-5.c: New test. * c-c++-common/auto-init-6.c: New test. * c-c++-common/auto-init-7.c: New test. * c-c++-common/auto-init-8.c: New test. * c-c++-common/auto-init-9.c: New test. * c-c++-common/auto-init-esra.c: New test. * c-c++-common/auto-init-padding-1.c: New test. * c-c++-common/auto-init-padding-2.c: New test. * c-c++-common/auto-init-padding-3.c: New test. * g++.dg/auto-init-uninit-pred-1_a.C: New test. * g++.dg/auto-init-uninit-pred-2_a.C: New test. * g++.dg/auto-init-uninit-pred-3_a.C: New test. * g++.dg/auto-init-uninit-pred-4.C: New test. * gcc.dg/auto-init-sra-1.c: New test. * gcc.dg/auto-init-sra-2.c: New test. * gcc.dg/auto-init-uninit-1.c: New test. * gcc.dg/auto-init-uninit-12.c: New test. * gcc.dg/auto-init-uninit-13.c: New test. * gcc.dg/auto-init-uninit-14.c: New test. * gcc.dg/auto-init-uninit-15.c: New test. * gcc.dg/auto-init-uninit-16.c: New test. * gcc.dg/auto-init-uninit-17.c: New test. * gcc.dg/auto-init-uninit-18.c: New test. * gcc.dg/auto-init-uninit-19.c: New test. * gcc.dg/auto-init-uninit-2.c: New test. * gcc.dg/auto-init-uninit-20.c: New test. * gcc.dg/auto-init-uninit-21.c: New test. * gcc.dg/auto-init-uninit-22.c: New test. * gcc.dg/auto-init-uninit-23.c: New test. * gcc.dg/auto-init-uninit-24.c: New test. * gcc.dg/auto-init-uninit-25.c: New test. * gcc.dg/auto-init-uninit-26.c: New test. * gcc.dg/auto-init-uninit-3.c: New test. * gcc.dg/auto-init-uninit-34.c: New test. * gcc.dg/auto-init-uninit-36.c: New test. * gcc.dg/auto-init-uninit-37.c: New test. * gcc.dg/auto-init-uninit-4.c: New test. * gcc.dg/auto-init-uninit-5.c: New test. * gcc.dg/auto-init-uninit-6.c: New test. * gcc.dg/auto-init-uninit-8.c: New test. * gcc.dg/auto-init-uninit-9.c: New test. * gcc.dg/auto-init-uninit-A.c: New test. * gcc.dg/auto-init-uninit-B.c: New test. * gcc.dg/auto-init-uninit-C.c: New test. * gcc.dg/auto-init-uninit-H.c: New test. * gcc.dg/auto-init-uninit-I.c: New test. * gcc.target/aarch64/auto-init-1.c: New test. * gcc.target/aarch64/auto-init-2.c: New test. * gcc.target/aarch64/auto-init-3.c: New test. * gcc.target/aarch64/auto-init-4.c: New test. * gcc.target/aarch64/auto-init-5.c: New test. * gcc.target/aarch64/auto-init-6.c: New test. * gcc.target/aarch64/auto-init-7.c: New test. * gcc.target/aarch64/auto-init-8.c: New test. * gcc.target/aarch64/auto-init-padding-1.c: New test. * gcc.target/aarch64/auto-init-padding-10.c: New test. * gcc.target/aarch64/auto-init-padding-11.c: New test. * gcc.target/aarch64/auto-init-padding-12.c: New test. * gcc.target/aarch64/auto-init-padding-2.c: New test. * gcc.target/aarch64/auto-init-padding-3.c: New test. * gcc.target/aarch64/auto-init-padding-4.c: New test. * gcc.target/aarch64/auto-init-padding-5.c: New test. * gcc.target/aarch64/auto-init-padding-6.c: New test. * gcc.target/aarch64/auto-init-padding-7.c: New test. * gcc.target/aarch64/auto-init-padding-8.c: New test. * gcc.target/aarch64/auto-init-padding-9.c: New test. * gcc.target/i386/auto-init-1.c: New test. * gcc.target/i386/auto-init-2.c: New test. * gcc.target/i386/auto-init-21.c: New test. * gcc.target/i386/auto-init-22.c: New test. * gcc.target/i386/auto-init-23.c: New test. * gcc.target/i386/auto-init-24.c: New test. * gcc.target/i386/auto-init-3.c: New test. * gcc.target/i386/auto-init-4.c: New test. * gcc.target/i386/auto-init-5.c: New test. * gcc.target/i386/auto-init-6.c: New test. * gcc.target/i386/auto-init-7.c: New test. * gcc.target/i386/auto-init-8.c: New test. * gcc.target/i386/auto-init-padding-1.c: New test. * gcc.target/i386/auto-init-padding-10.c: New test. * gcc.target/i386/auto-init-padding-11.c: New test. * gcc.target/i386/auto-init-padding-12.c: New test. * gcc.target/i386/auto-init-padding-2.c: New test. * gcc.target/i386/auto-init-padding-3.c: New test. * gcc.target/i386/auto-init-padding-4.c: New test. * gcc.target/i386/auto-init-padding-5.c: New test. * gcc.target/i386/auto-init-padding-6.c: New test. * gcc.target/i386/auto-init-padding-7.c: New test. * gcc.target/i386/auto-init-padding-8.c: New test. * gcc.target/i386/auto-init-padding-9.c: New test. </cut> Results regressed to (for first_bad == a25e0b5e6ac8a77a71c229e0a7b744603365b0e9) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1: -5 # build_abe qemu: -2 # linux_n_obj: 29 # First few build errors in logs: from (for last_good == 5fe0865ab788bdc387b284a3ad57e5a95a767b18) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1: -5 # build_abe qemu: -2 # linux_n_obj: 19270 # linux build successful: all Artifacts of last_good build: https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-master-aarch64-lts-all… Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-master-aarch64-lts-all… Build top page/logs: https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-master-aarch64-lts-all… Configuration details: Reproduce builds: <cut> mkdir investigate-gcc-a25e0b5e6ac8a77a71c229e0a7b744603365b0e9 cd investigate-gcc-a25e0b5e6ac8a77a71c229e0a7b744603365b0e9 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-master-aarch64-lts-all… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-master-aarch64-lts-all… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-master-aarch64-lts-all… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_kernel-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /gcc/ ./ ./bisect/baseline/ cd gcc # Reproduce first_bad build git checkout --detach a25e0b5e6ac8a77a71c229e0a7b744603365b0e9 ../artifacts/test.sh # Reproduce last_good build git checkout --detach 5fe0865ab788bdc387b284a3ad57e5a95a767b18 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-master-aarch64-lts-all… Build log: https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-master-aarch64-lts-all… Full commit (up to 1000 lines): <cut> commit a25e0b5e6ac8a77a71c229e0a7b744603365b0e9 Author: qing zhao <qing.zhao(a)oracle.com> Date: Thu Sep 9 15:44:49 2021 -0700 Add -ftrivial-auto-var-init option and uninitialized variable attribute. Initialize automatic variables with either a pattern or with zeroes to increase the security and predictability of a program by preventing uninitialized memory disclosure and use. GCC still considers an automatic variable that doesn't have an explicit initializer as uninitialized, -Wuninitialized will still report warning messages on such automatic variables. With this option, GCC will also initialize any padding of automatic variables that have structure or union types to zeroes. You can control this behavior for a specific variable by using the variable attribute "uninitialized" to control runtime overhead. gcc/ChangeLog: 2021-09-09 qing zhao <qing.zhao(a)oracle.com> * builtins.c (expand_builtin_memset): Make external visible. * builtins.h (expand_builtin_memset): Declare extern. * common.opt (ftrivial-auto-var-init=): New option. * doc/extend.texi: Document the uninitialized attribute. * doc/invoke.texi: Document -ftrivial-auto-var-init. * flag-types.h (enum auto_init_type): New enumerated type auto_init_type. * gimple-fold.c (clear_padding_type): Add one new parameter. (clear_padding_union): Likewise. (clear_padding_emit_loop): Likewise. (clear_type_padding_in_mask): Likewise. (gimple_fold_builtin_clear_padding): Handle this new parameter. * gimplify.c (gimple_add_init_for_auto_var): New function. (gimple_add_padding_init_for_auto_var): New function. (is_var_need_auto_init): New function. (gimplify_decl_expr): Add initialization to automatic variables per users' requests. (gimplify_call_expr): Add one new parameter for call to __builtin_clear_padding. (gimplify_init_constructor): Add padding initialization in the end. * internal-fn.c (INIT_PATTERN_VALUE): New macro. (expand_DEFERRED_INIT): New function. * internal-fn.def (DEFERRED_INIT): New internal function. * tree-cfg.c (verify_gimple_call): Verify calls to .DEFERRED_INIT. * tree-sra.c (generate_subtree_deferred_init): New function. (scan_function): Avoid setting cannot_scalarize_away_bitmap for calls to .DEFERRED_INIT. (sra_modify_deferred_init): New function. (sra_modify_function_body): Handle calls to DEFERRED_INIT specially. * tree-ssa-structalias.c (find_func_aliases_for_call): Likewise. * tree-ssa-uninit.c (warn_uninit): Handle calls to DEFERRED_INIT specially. (check_defs): Likewise. (warn_uninitialized_vars): Likewise. * tree-ssa.c (ssa_undefined_value_p): Likewise. * tree.c (build_common_builtin_nodes): Build tree node for BUILT_IN_CLEAR_PADDING when needed. gcc/c-family/ChangeLog: 2021-09-09 qing zhao <qing.zhao(a)oracle.com> * c-attribs.c (handle_uninitialized_attribute): New function. (c_common_attribute_table): Add "uninitialized" attribute. gcc/testsuite/ChangeLog: 2021-09-09 qing zhao <qing.zhao(a)oracle.com> * c-c++-common/auto-init-1.c: New test. * c-c++-common/auto-init-10.c: New test. * c-c++-common/auto-init-11.c: New test. * c-c++-common/auto-init-12.c: New test. * c-c++-common/auto-init-13.c: New test. * c-c++-common/auto-init-14.c: New test. * c-c++-common/auto-init-15.c: New test. * c-c++-common/auto-init-16.c: New test. * c-c++-common/auto-init-2.c: New test. * c-c++-common/auto-init-3.c: New test. * c-c++-common/auto-init-4.c: New test. * c-c++-common/auto-init-5.c: New test. * c-c++-common/auto-init-6.c: New test. * c-c++-common/auto-init-7.c: New test. * c-c++-common/auto-init-8.c: New test. * c-c++-common/auto-init-9.c: New test. * c-c++-common/auto-init-esra.c: New test. * c-c++-common/auto-init-padding-1.c: New test. * c-c++-common/auto-init-padding-2.c: New test. * c-c++-common/auto-init-padding-3.c: New test. * g++.dg/auto-init-uninit-pred-1_a.C: New test. * g++.dg/auto-init-uninit-pred-2_a.C: New test. * g++.dg/auto-init-uninit-pred-3_a.C: New test. * g++.dg/auto-init-uninit-pred-4.C: New test. * gcc.dg/auto-init-sra-1.c: New test. * gcc.dg/auto-init-sra-2.c: New test. * gcc.dg/auto-init-uninit-1.c: New test. * gcc.dg/auto-init-uninit-12.c: New test. * gcc.dg/auto-init-uninit-13.c: New test. * gcc.dg/auto-init-uninit-14.c: New test. * gcc.dg/auto-init-uninit-15.c: New test. * gcc.dg/auto-init-uninit-16.c: New test. * gcc.dg/auto-init-uninit-17.c: New test. * gcc.dg/auto-init-uninit-18.c: New test. * gcc.dg/auto-init-uninit-19.c: New test. * gcc.dg/auto-init-uninit-2.c: New test. * gcc.dg/auto-init-uninit-20.c: New test. * gcc.dg/auto-init-uninit-21.c: New test. * gcc.dg/auto-init-uninit-22.c: New test. * gcc.dg/auto-init-uninit-23.c: New test. * gcc.dg/auto-init-uninit-24.c: New test. * gcc.dg/auto-init-uninit-25.c: New test. * gcc.dg/auto-init-uninit-26.c: New test. * gcc.dg/auto-init-uninit-3.c: New test. * gcc.dg/auto-init-uninit-34.c: New test. * gcc.dg/auto-init-uninit-36.c: New test. * gcc.dg/auto-init-uninit-37.c: New test. * gcc.dg/auto-init-uninit-4.c: New test. * gcc.dg/auto-init-uninit-5.c: New test. * gcc.dg/auto-init-uninit-6.c: New test. * gcc.dg/auto-init-uninit-8.c: New test. * gcc.dg/auto-init-uninit-9.c: New test. * gcc.dg/auto-init-uninit-A.c: New test. * gcc.dg/auto-init-uninit-B.c: New test. * gcc.dg/auto-init-uninit-C.c: New test. * gcc.dg/auto-init-uninit-H.c: New test. * gcc.dg/auto-init-uninit-I.c: New test. * gcc.target/aarch64/auto-init-1.c: New test. * gcc.target/aarch64/auto-init-2.c: New test. * gcc.target/aarch64/auto-init-3.c: New test. * gcc.target/aarch64/auto-init-4.c: New test. * gcc.target/aarch64/auto-init-5.c: New test. * gcc.target/aarch64/auto-init-6.c: New test. * gcc.target/aarch64/auto-init-7.c: New test. * gcc.target/aarch64/auto-init-8.c: New test. * gcc.target/aarch64/auto-init-padding-1.c: New test. * gcc.target/aarch64/auto-init-padding-10.c: New test. * gcc.target/aarch64/auto-init-padding-11.c: New test. * gcc.target/aarch64/auto-init-padding-12.c: New test. * gcc.target/aarch64/auto-init-padding-2.c: New test. * gcc.target/aarch64/auto-init-padding-3.c: New test. * gcc.target/aarch64/auto-init-padding-4.c: New test. * gcc.target/aarch64/auto-init-padding-5.c: New test. * gcc.target/aarch64/auto-init-padding-6.c: New test. * gcc.target/aarch64/auto-init-padding-7.c: New test. * gcc.target/aarch64/auto-init-padding-8.c: New test. * gcc.target/aarch64/auto-init-padding-9.c: New test. * gcc.target/i386/auto-init-1.c: New test. * gcc.target/i386/auto-init-2.c: New test. * gcc.target/i386/auto-init-21.c: New test. * gcc.target/i386/auto-init-22.c: New test. * gcc.target/i386/auto-init-23.c: New test. * gcc.target/i386/auto-init-24.c: New test. * gcc.target/i386/auto-init-3.c: New test. * gcc.target/i386/auto-init-4.c: New test. * gcc.target/i386/auto-init-5.c: New test. * gcc.target/i386/auto-init-6.c: New test. * gcc.target/i386/auto-init-7.c: New test. * gcc.target/i386/auto-init-8.c: New test. * gcc.target/i386/auto-init-padding-1.c: New test. * gcc.target/i386/auto-init-padding-10.c: New test. * gcc.target/i386/auto-init-padding-11.c: New test. * gcc.target/i386/auto-init-padding-12.c: New test. * gcc.target/i386/auto-init-padding-2.c: New test. * gcc.target/i386/auto-init-padding-3.c: New test. * gcc.target/i386/auto-init-padding-4.c: New test. * gcc.target/i386/auto-init-padding-5.c: New test. * gcc.target/i386/auto-init-padding-6.c: New test. * gcc.target/i386/auto-init-padding-7.c: New test. * gcc.target/i386/auto-init-padding-8.c: New test. * gcc.target/i386/auto-init-padding-9.c: New test. --- gcc/builtins.c | 3 +- gcc/builtins.h | 1 + gcc/c-family/c-attribs.c | 27 +++ gcc/common.opt | 16 ++ gcc/doc/extend.texi | 16 ++ gcc/doc/invoke.texi | 41 +++- gcc/flag-types.h | 7 + gcc/gimple-fold.c | 54 +++-- gcc/gimplify.c | 151 ++++++++++++- gcc/internal-fn.c | 99 +++++++++ gcc/internal-fn.def | 4 + gcc/testsuite/c-c++-common/auto-init-1.c | 39 ++++ gcc/testsuite/c-c++-common/auto-init-10.c | 22 ++ gcc/testsuite/c-c++-common/auto-init-11.c | 14 ++ gcc/testsuite/c-c++-common/auto-init-12.c | 14 ++ gcc/testsuite/c-c++-common/auto-init-13.c | 23 ++ gcc/testsuite/c-c++-common/auto-init-14.c | 23 ++ gcc/testsuite/c-c++-common/auto-init-15.c | 13 ++ gcc/testsuite/c-c++-common/auto-init-16.c | 13 ++ gcc/testsuite/c-c++-common/auto-init-2.c | 39 ++++ gcc/testsuite/c-c++-common/auto-init-3.c | 19 ++ gcc/testsuite/c-c++-common/auto-init-4.c | 19 ++ gcc/testsuite/c-c++-common/auto-init-5.c | 21 ++ gcc/testsuite/c-c++-common/auto-init-6.c | 21 ++ gcc/testsuite/c-c++-common/auto-init-7.c | 35 +++ gcc/testsuite/c-c++-common/auto-init-8.c | 35 +++ gcc/testsuite/c-c++-common/auto-init-9.c | 20 ++ gcc/testsuite/c-c++-common/auto-init-esra.c | 35 +++ gcc/testsuite/c-c++-common/auto-init-padding-1.c | 23 ++ gcc/testsuite/c-c++-common/auto-init-padding-2.c | 114 ++++++++++ gcc/testsuite/c-c++-common/auto-init-padding-3.c | 114 ++++++++++ gcc/testsuite/g++.dg/auto-init-uninit-pred-1_a.C | 3 + gcc/testsuite/g++.dg/auto-init-uninit-pred-2_a.C | 3 + gcc/testsuite/g++.dg/auto-init-uninit-pred-3_a.C | 3 + gcc/testsuite/g++.dg/auto-init-uninit-pred-4.C | 3 + gcc/testsuite/gcc.dg/auto-init-sra-1.c | 24 +++ gcc/testsuite/gcc.dg/auto-init-sra-2.c | 24 +++ gcc/testsuite/gcc.dg/auto-init-uninit-1.c | 5 + gcc/testsuite/gcc.dg/auto-init-uninit-12.c | 4 + gcc/testsuite/gcc.dg/auto-init-uninit-13.c | 10 + gcc/testsuite/gcc.dg/auto-init-uninit-14.c | 4 + gcc/testsuite/gcc.dg/auto-init-uninit-15.c | 26 +++ gcc/testsuite/gcc.dg/auto-init-uninit-16.c | 25 +++ gcc/testsuite/gcc.dg/auto-init-uninit-17.c | 15 ++ gcc/testsuite/gcc.dg/auto-init-uninit-18.c | 3 + gcc/testsuite/gcc.dg/auto-init-uninit-19.c | 26 +++ gcc/testsuite/gcc.dg/auto-init-uninit-2.c | 5 + gcc/testsuite/gcc.dg/auto-init-uninit-20.c | 4 + gcc/testsuite/gcc.dg/auto-init-uninit-21.c | 4 + gcc/testsuite/gcc.dg/auto-init-uninit-22.c | 3 + gcc/testsuite/gcc.dg/auto-init-uninit-23.c | 27 +++ gcc/testsuite/gcc.dg/auto-init-uninit-24.c | 3 + gcc/testsuite/gcc.dg/auto-init-uninit-25.c | 23 ++ gcc/testsuite/gcc.dg/auto-init-uninit-26.c | 23 ++ gcc/testsuite/gcc.dg/auto-init-uninit-3.c | 5 + gcc/testsuite/gcc.dg/auto-init-uninit-34.c | 60 ++++++ gcc/testsuite/gcc.dg/auto-init-uninit-36.c | 238 +++++++++++++++++++++ gcc/testsuite/gcc.dg/auto-init-uninit-37.c | 156 ++++++++++++++ gcc/testsuite/gcc.dg/auto-init-uninit-4.c | 10 + gcc/testsuite/gcc.dg/auto-init-uninit-5.c | 6 + gcc/testsuite/gcc.dg/auto-init-uninit-6.c | 7 + gcc/testsuite/gcc.dg/auto-init-uninit-8.c | 8 + gcc/testsuite/gcc.dg/auto-init-uninit-9.c | 8 + gcc/testsuite/gcc.dg/auto-init-uninit-A.c | 7 + gcc/testsuite/gcc.dg/auto-init-uninit-B.c | 17 ++ gcc/testsuite/gcc.dg/auto-init-uninit-C.c | 5 + gcc/testsuite/gcc.dg/auto-init-uninit-H.c | 5 + gcc/testsuite/gcc.dg/auto-init-uninit-I.c | 3 + gcc/testsuite/gcc.target/aarch64/auto-init-1.c | 32 +++ gcc/testsuite/gcc.target/aarch64/auto-init-2.c | 35 +++ gcc/testsuite/gcc.target/aarch64/auto-init-3.c | 19 ++ gcc/testsuite/gcc.target/aarch64/auto-init-4.c | 19 ++ gcc/testsuite/gcc.target/aarch64/auto-init-5.c | 19 ++ gcc/testsuite/gcc.target/aarch64/auto-init-6.c | 18 ++ gcc/testsuite/gcc.target/aarch64/auto-init-7.c | 32 +++ gcc/testsuite/gcc.target/aarch64/auto-init-8.c | 32 +++ .../gcc.target/aarch64/auto-init-padding-1.c | 17 ++ .../gcc.target/aarch64/auto-init-padding-10.c | 22 ++ .../gcc.target/aarch64/auto-init-padding-11.c | 27 +++ .../gcc.target/aarch64/auto-init-padding-12.c | 27 +++ .../gcc.target/aarch64/auto-init-padding-2.c | 18 ++ .../gcc.target/aarch64/auto-init-padding-3.c | 27 +++ .../gcc.target/aarch64/auto-init-padding-4.c | 27 +++ .../gcc.target/aarch64/auto-init-padding-5.c | 22 ++ .../gcc.target/aarch64/auto-init-padding-6.c | 20 ++ .../gcc.target/aarch64/auto-init-padding-7.c | 20 ++ .../gcc.target/aarch64/auto-init-padding-8.c | 22 ++ .../gcc.target/aarch64/auto-init-padding-9.c | 21 ++ gcc/testsuite/gcc.target/i386/auto-init-1.c | 32 +++ gcc/testsuite/gcc.target/i386/auto-init-2.c | 36 ++++ gcc/testsuite/gcc.target/i386/auto-init-21.c | 14 ++ gcc/testsuite/gcc.target/i386/auto-init-22.c | 14 ++ gcc/testsuite/gcc.target/i386/auto-init-23.c | 13 ++ gcc/testsuite/gcc.target/i386/auto-init-24.c | 13 ++ gcc/testsuite/gcc.target/i386/auto-init-3.c | 17 ++ gcc/testsuite/gcc.target/i386/auto-init-4.c | 20 ++ gcc/testsuite/gcc.target/i386/auto-init-5.c | 20 ++ gcc/testsuite/gcc.target/i386/auto-init-6.c | 19 ++ gcc/testsuite/gcc.target/i386/auto-init-7.c | 33 +++ gcc/testsuite/gcc.target/i386/auto-init-8.c | 35 +++ .../gcc.target/i386/auto-init-padding-1.c | 19 ++ .../gcc.target/i386/auto-init-padding-10.c | 21 ++ .../gcc.target/i386/auto-init-padding-11.c | 26 +++ .../gcc.target/i386/auto-init-padding-12.c | 26 +++ .../gcc.target/i386/auto-init-padding-2.c | 19 ++ .../gcc.target/i386/auto-init-padding-3.c | 30 +++ .../gcc.target/i386/auto-init-padding-4.c | 28 +++ .../gcc.target/i386/auto-init-padding-5.c | 21 ++ .../gcc.target/i386/auto-init-padding-6.c | 22 ++ .../gcc.target/i386/auto-init-padding-7.c | 22 ++ .../gcc.target/i386/auto-init-padding-8.c | 22 ++ .../gcc.target/i386/auto-init-padding-9.c | 22 ++ gcc/tree-cfg.c | 47 +++- gcc/tree-sra.c | 124 ++++++++++- gcc/tree-ssa-structalias.c | 3 + gcc/tree-ssa-uninit.c | 48 +++++ gcc/tree-ssa.c | 40 ++++ gcc/tree.c | 13 ++ 118 files changed, 3131 insertions(+), 44 deletions(-) diff --git a/gcc/builtins.c b/gcc/builtins.c index 99548627761..3e57eb03af0 100644 --- a/gcc/builtins.c +++ b/gcc/builtins.c @@ -142,7 +142,6 @@ static rtx expand_builtin_strcpy (tree, rtx); static rtx expand_builtin_strcpy_args (tree, tree, tree, rtx); static rtx expand_builtin_stpcpy (tree, rtx, machine_mode); static rtx expand_builtin_strncpy (tree, rtx); -static rtx expand_builtin_memset (tree, rtx, machine_mode); static rtx expand_builtin_memset_args (tree, tree, tree, rtx, machine_mode, tree); static rtx expand_builtin_bzero (tree); static rtx expand_builtin_strlen (tree, rtx, machine_mode); @@ -3872,7 +3871,7 @@ builtin_memset_gen_str (void *data, void *prev, try to get the result in TARGET, if convenient (and in mode MODE if that's convenient). */ -static rtx +rtx expand_builtin_memset (tree exp, rtx target, machine_mode mode) { if (!validate_arglist (exp, diff --git a/gcc/builtins.h b/gcc/builtins.h index 16b47ac1a7b..d330b78e591 100644 --- a/gcc/builtins.h +++ b/gcc/builtins.h @@ -114,6 +114,7 @@ extern rtx builtin_strncpy_read_str (void *, void *, HOST_WIDE_INT, fixed_size_mode); extern rtx builtin_memset_read_str (void *, void *, HOST_WIDE_INT, fixed_size_mode); +extern rtx expand_builtin_memset (tree, rtx, machine_mode); extern rtx expand_builtin_saveregs (void); extern tree std_build_builtin_va_list (void); extern tree std_fn_abi_va_list (tree); diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c index d14e9c441b3..007b928c54b 100644 --- a/gcc/c-family/c-attribs.c +++ b/gcc/c-family/c-attribs.c @@ -83,6 +83,7 @@ static tree handle_artificial_attribute (tree *, tree, tree, int, bool *); static tree handle_flatten_attribute (tree *, tree, tree, int, bool *); static tree handle_error_attribute (tree *, tree, tree, int, bool *); static tree handle_used_attribute (tree *, tree, tree, int, bool *); +static tree handle_uninitialized_attribute (tree *, tree, tree, int, bool *); static tree handle_externally_visible_attribute (tree *, tree, tree, int, bool *); static tree handle_no_reorder_attribute (tree *, tree, tree, int, @@ -333,6 +334,8 @@ const struct attribute_spec c_common_attribute_table[] = handle_used_attribute, NULL }, { "unused", 0, 0, false, false, false, false, handle_unused_attribute, NULL }, + { "uninitialized", 0, 0, true, false, false, false, + handle_uninitialized_attribute, NULL }, { "retain", 0, 0, true, false, false, false, handle_retain_attribute, NULL }, { "externally_visible", 0, 0, true, false, false, false, @@ -1617,6 +1620,30 @@ handle_retain_attribute (tree *pnode, tree name, tree ARG_UNUSED (args), return NULL_TREE; } +/* Handle an "uninitialized" attribute; arguments as in + struct attribute_spec.handler. */ + +static tree +handle_uninitialized_attribute (tree *node, tree name, tree ARG_UNUSED (args), + int ARG_UNUSED (flags), bool *no_add_attrs) +{ + tree decl = *node; + if (!VAR_P (decl)) + { + warning (OPT_Wattributes, "%qE attribute ignored because %qD " + "is not a variable", name, decl); + *no_add_attrs = true; + } + else if (TREE_STATIC (decl) || DECL_EXTERNAL (decl)) + { + warning (OPT_Wattributes, "%qE attribute ignored because %qD " + "is not a local variable", name, decl); + *no_add_attrs = true; + } + + return NULL_TREE; +} + /* Handle a "externally_visible" attribute; arguments as in struct attribute_spec.handler. */ diff --git a/gcc/common.opt b/gcc/common.opt index f103a7de004..b921f5e3b25 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -3081,6 +3081,22 @@ ftree-scev-cprop Common Var(flag_tree_scev_cprop) Init(1) Optimization Enable copy propagation of scalar-evolution information. +ftrivial-auto-var-init= +Common Joined RejectNegative Enum(auto_init_type) Var(flag_auto_var_init) Init(AUTO_INIT_UNINITIALIZED) Optimization +-ftrivial-auto-var-init=[uninitialized|pattern|zero] Add initializations to automatic variables. + +Enum +Name(auto_init_type) Type(enum auto_init_type) UnknownError(unrecognized automatic variable initialization type %qs) + +EnumValue +Enum(auto_init_type) String(uninitialized) Value(AUTO_INIT_UNINITIALIZED) + +EnumValue +Enum(auto_init_type) String(pattern) Value(AUTO_INIT_PATTERN) + +EnumValue +Enum(auto_init_type) String(zero) Value(AUTO_INIT_ZERO) + ; -fverbose-asm causes extra commentary information to be produced in ; the generated assembly code (to make it more readable). This option ; is generally only of use to those who actually need to read the diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index 52bc4e5b76e..8b324a097a4 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -7610,6 +7610,22 @@ will be placed in new, unique sections. This additional functionality requires Binutils version 2.36 or later. +@item uninitialized +@cindex @code{uninitialized} variable attribute +This attribute, attached to a variable with automatic storage, means that +the variable should not be automatically initialized by the compiler when +the option @code{-ftrivial-auto-var-init} presents. + +With the option @code{-ftrivial-auto-var-init}, all the automatic variables +that do not have explicit initializers will be initialized by the compiler. +These additional compiler initializations might incur run-time overhead, +sometimes dramatically. This attribute can be used to mark some variables +to be excluded from such automatical initialization in order to reduce runtime +overhead. + +This attribute has no effect when the option @code{-ftrivial-auto-var-init} +does not present. + @item vector_size (@var{bytes}) @cindex @code{vector_size} variable attribute This attribute specifies the vector size for the type of the declared diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index d4b3a66ee4f..b08a5eb4d73 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -573,9 +573,9 @@ Objective-C and Objective-C++ Dialects}. -ftree-parallelize-loops=@var{n} -ftree-pre -ftree-partial-pre -ftree-pta @gol -ftree-reassoc -ftree-scev-cprop -ftree-sink -ftree-slsr -ftree-sra @gol -ftree-switch-conversion -ftree-tail-merge @gol --ftree-ter -ftree-vectorize -ftree-vrp -funconstrained-commons @gol --funit-at-a-time -funroll-all-loops -funroll-loops @gol --funsafe-math-optimizations -funswitch-loops @gol +-ftree-ter -ftree-vectorize -ftree-vrp -ftrivial-auto-var-init @gol +-funconstrained-commons -funit-at-a-time -funroll-all-loops @gol +-funroll-loops -funsafe-math-optimizations -funswitch-loops @gol -fipa-ra -fvariable-expansion-in-unroller -fvect-cost-model -fvpt @gol -fweb -fwhole-program -fwpa -fuse-linker-plugin -fzero-call-used-regs @gol --param @var{name}=@var{value} @@ -11843,6 +11843,41 @@ Perform basic block vectorization on trees. This flag is enabled by default at @option{-O3} and by @option{-ftree-vectorize}, @option{-fprofile-use}, and @option{-fauto-profile}. +@item -ftrivial-auto-var-init=@var{choice} +@opindex ftrivial-auto-var-init +Initialize automatic variables with either a pattern or with zeroes to increase +the security and predictability of a program by preventing uninitialized memory +disclosure and use. +GCC still considers an automatic variable that doesn't have an explicit +initializer as uninitialized, -Wuninitialized will still report warning messages +on such automatic variables. +With this option, GCC will also initialize any padding of automatic variables +that have structure or union types to zeroes. + +The three values of @var{choice} are: + +@itemize @bullet +@item +@samp{uninitialized} doesn't initialize any automatic variables. +This is C and C++'s default. + +@item +@samp{pattern} Initialize automatic variables with values which will likely +transform logic bugs into crashes down the line, are easily recognized in a +crash dump and without being values that programmers can rely on for useful +program semantics. +The current value is byte-repeatable pattern with byte "0xFE". +The values used for pattern initialization might be changed in the future. + +@item +@samp{zero} Initialize automatic variables with zeroes. +@end itemize + +The default is @samp{uninitialized}. + +You can control this behavior for a specific variable by using the variable +attribute @code{uninitialized} (@pxref{Variable Attributes}). + @item -fvect-cost-model=@var{model} @opindex fvect-cost-model Alter the cost model used for vectorization. The @var{model} argument diff --git a/gcc/flag-types.h b/gcc/flag-types.h index 45a2338d5f6..5bd1f771c8b 100644 --- a/gcc/flag-types.h +++ b/gcc/flag-types.h @@ -281,6 +281,13 @@ enum vect_cost_model { VECT_COST_MODEL_DEFAULT = 1 }; +/* Automatic variable initialization type. */ +enum auto_init_type { + AUTO_INIT_UNINITIALIZED = 0, + AUTO_INIT_PATTERN = 1, + AUTO_INIT_ZERO = 2 +}; + /* Different instrumentation modes. */ enum sanitize_code { /* AddressSanitizer. */ diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c index 3f2c176cff6..dd0e6b5daff 100644 --- a/gcc/gimple-fold.c +++ b/gcc/gimple-fold.c @@ -4518,12 +4518,14 @@ clear_padding_add_padding (clear_padding_struct *buf, } } -static void clear_padding_type (clear_padding_struct *, tree, HOST_WIDE_INT); +static void clear_padding_type (clear_padding_struct *, tree, + HOST_WIDE_INT, bool); /* Clear padding bits of union type TYPE. */ static void -clear_padding_union (clear_padding_struct *buf, tree type, HOST_WIDE_INT sz) +clear_padding_union (clear_padding_struct *buf, tree type, + HOST_WIDE_INT sz, bool for_auto_init) { clear_padding_struct *union_buf; HOST_WIDE_INT start_off = 0, next_off = 0; @@ -4568,7 +4570,7 @@ clear_padding_union (clear_padding_struct *buf, tree type, HOST_WIDE_INT sz) continue; gcc_assert (TREE_CODE (TREE_TYPE (field)) == ARRAY_TYPE && !COMPLETE_TYPE_P (TREE_TYPE (field))); - if (!buf->clear_in_mask) + if (!buf->clear_in_mask && !for_auto_init) error_at (buf->loc, "flexible array member %qD does not have " "well defined padding bits for %qs", field, "__builtin_clear_padding"); @@ -4579,7 +4581,7 @@ clear_padding_union (clear_padding_struct *buf, tree type, HOST_WIDE_INT sz) union_buf->off = start_off; union_buf->size = start_size; memset (union_buf->buf, ~0, start_size); - clear_padding_type (union_buf, TREE_TYPE (field), fldsz); + clear_padding_type (union_buf, TREE_TYPE (field), fldsz, for_auto_init); clear_padding_add_padding (union_buf, sz - fldsz); clear_padding_flush (union_buf, true); } @@ -4649,7 +4651,8 @@ clear_padding_type_may_have_padding_p (tree type) __builtin_clear_padding (buf.base); */ static void -clear_padding_emit_loop (clear_padding_struct *buf, tree type, tree end) +clear_padding_emit_loop (clear_padding_struct *buf, tree type, + tree end, bool for_auto_init) { tree l1 = create_artificial_label (buf->loc); tree l2 = create_artificial_label (buf->loc); @@ -4660,7 +4663,7 @@ clear_padding_emit_loop (clear_padding_struct *buf, tree type, tree end) g = gimple_build_label (l1); gimple_set_location (g, buf->loc); gsi_insert_before (buf->gsi, g, GSI_SAME_STMT); - clear_padding_type (buf, type, buf->sz); + clear_padding_type (buf, type, buf->sz, for_auto_init); clear_padding_flush (buf, true); g = gimple_build_assign (buf->base, POINTER_PLUS_EXPR, buf->base, size_int (buf->sz)); @@ -4678,10 +4681,16 @@ clear_padding_emit_loop (clear_padding_struct *buf, tree type, tree end) } /* Clear padding bits for TYPE. Called recursively from - gimple_fold_builtin_clear_padding. */ + gimple_fold_builtin_clear_padding. If FOR_AUTO_INIT is true, + the __builtin_clear_padding is not called by the end user, + instead, it's inserted by the compiler to initialize the + paddings of automatic variable. Therefore, we should not + emit the error messages for flexible array members to confuse + the end user. */ static void -clear_padding_type (clear_padding_struct *buf, tree type, HOST_WIDE_INT sz) +clear_padding_type (clear_padding_struct *buf, tree type, + HOST_WIDE_INT sz, bool for_auto_init) { switch (TREE_CODE (type)) { @@ -4765,7 +4774,7 @@ clear_padding_type (clear_padding_struct *buf, tree type, HOST_WIDE_INT sz) continue; gcc_assert (TREE_CODE (ftype) == ARRAY_TYPE && !COMPLETE_TYPE_P (ftype)); - if (!buf->clear_in_mask) + if (!buf->clear_in_mask && !for_auto_init) error_at (buf->loc, "flexible array member %qD does not " "have well defined padding bits for %qs", field, "__builtin_clear_padding"); @@ -4781,7 +4790,8 @@ clear_padding_type (clear_padding_struct *buf, tree type, HOST_WIDE_INT sz) gcc_assert (pos >= 0 && fldsz >= 0 && pos >= cur_pos); clear_padding_add_padding (buf, pos - cur_pos); cur_pos = pos; - clear_padding_type (buf, TREE_TYPE (field), fldsz); + clear_padding_type (buf, TREE_TYPE (field), + fldsz, for_auto_init); cur_pos += fldsz; } } @@ -4821,7 +4831,7 @@ clear_padding_type (clear_padding_struct *buf, tree type, HOST_WIDE_INT sz) buf->align = TYPE_ALIGN (elttype); buf->off = 0; buf->size = 0; - clear_padding_emit_loop (buf, elttype, end); + clear_padding_emit_loop (buf, elttype, end, for_auto_init); buf->base = base; buf->sz = prev_sz; buf->align = prev_align; @@ -4831,10 +4841,10 @@ clear_padding_type (clear_padding_struct *buf, tree type, HOST_WIDE_INT sz) break; } for (HOST_WIDE_INT i = 0; i < nelts; i++) - clear_padding_type (buf, TREE_TYPE (type), fldsz); + clear_padding_type (buf, TREE_TYPE (type), fldsz, for_auto_init); break; case UNION_TYPE: - clear_padding_union (buf, type, sz); + clear_padding_union (buf, type, sz, for_auto_init); break; case REAL_TYPE: gcc_assert ((size_t) sz <= clear_padding_unit); @@ -4858,14 +4868,14 @@ clear_padding_type (clear_padding_struct *buf, tree type, HOST_WIDE_INT sz) break; case COMPLEX_TYPE: fldsz = int_size_in_bytes (TREE_TYPE (type)); - clear_padding_type (buf, TREE_TYPE (type), fldsz); - clear_padding_type (buf, TREE_TYPE (type), fldsz); + clear_padding_type (buf, TREE_TYPE (type), fldsz, for_auto_init); + clear_padding_type (buf, TREE_TYPE (type), fldsz, for_auto_init); break; case VECTOR_TYPE: nelts = TYPE_VECTOR_SUBPARTS (type).to_constant (); fldsz = int_size_in_bytes (TREE_TYPE (type)); for (HOST_WIDE_INT i = 0; i < nelts; i++) - clear_padding_type (buf, TREE_TYPE (type), fldsz); + clear_padding_type (buf, TREE_TYPE (type), fldsz, for_auto_init); break; case NULLPTR_TYPE: gcc_assert ((size_t) sz <= clear_padding_unit); @@ -4901,7 +4911,7 @@ clear_type_padding_in_mask (tree type, unsigned char *mask) buf.sz = int_size_in_bytes (type); buf.size = 0; buf.union_ptr = mask; - clear_padding_type (&buf, type, buf.sz); + clear_padding_type (&buf, type, buf.sz, false); clear_padding_flush (&buf, true); } @@ -4911,9 +4921,13 @@ static bool gimple_fold_builtin_clear_padding (gimple_stmt_iterator *gsi) { gimple *stmt = gsi_stmt (*gsi); - gcc_assert (gimple_call_num_args (stmt) == 2); + gcc_assert (gimple_call_num_args (stmt) == 3); tree ptr = gimple_call_arg (stmt, 0); tree typearg = gimple_call_arg (stmt, 1); + /* the 3rd argument of __builtin_clear_padding is to distinguish whether + this call is made by the user or by the compiler for automatic variable + initialization. */ + bool for_auto_init = (bool) TREE_INT_CST_LOW (gimple_call_arg (stmt, 2)); tree type = TREE_TYPE (TREE_TYPE (typearg)); location_t loc = gimple_location (stmt); clear_padding_struct buf; @@ -4970,7 +4984,7 @@ gimple_fold_builtin_clear_padding (gimple_stmt_iterator *gsi) buf.sz = eltsz; buf.align = TYPE_ALIGN (elttype); buf.alias_type = build_pointer_type (elttype); - clear_padding_emit_loop (&buf, elttype, end); + clear_padding_emit_loop (&buf, elttype, end, for_auto_init); } } else @@ -4983,7 +4997,7 @@ gimple_fold_builtin_clear_padding (gimple_stmt_iterator *gsi) gsi_insert_before (gsi, g, GSI_SAME_STMT); } buf.alias_type = build_pointer_type (type); - clear_padding_type (&buf, type, buf.sz); + clear_padding_type (&buf, type, buf.sz, for_auto_init); clear_padding_flush (&buf, true); } diff --git a/gcc/gimplify.c b/gcc/gimplify.c index 99d1c7fcce4..3314f76cf3f 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -1743,6 +1743,94 @@ force_labels_r (tree *tp, int *walk_subtrees, void *data ATTRIBUTE_UNUSED) return NULL_TREE; } +/* Generate an initialization to automatic variable DECL based on INIT_TYPE. + Build a call to internal const function DEFERRED_INIT: + 1st argument: SIZE of the DECL; + 2nd argument: INIT_TYPE; + 3rd argument: IS_VLA, 0 NO, 1 YES; + + as LHS = DEFERRED_INIT (SIZE of the DECL, INIT_TYPE, IS_VLA) + if IS_VLA is false, the LHS is the DECL itself, + if IS_VLA is true, the LHS is a MEM_REF whose address is the pointer + to this DECL. */ +static void +gimple_add_init_for_auto_var (tree decl, + enum auto_init_type init_type, + bool is_vla, + gimple_seq *seq_p) +{ + gcc_assert (auto_var_p (decl)); + gcc_assert (init_type > AUTO_INIT_UNINITIALIZED); + location_t loc = EXPR_LOCATION (decl); + tree decl_size = TYPE_SIZE_UNIT (TREE_TYPE (decl)); + + tree init_type_node + = build_int_cst (integer_type_node, (int) init_type); + tree is_vla_node + = build_int_cst (integer_type_node, (int) is_vla); + + tree call = build_call_expr_internal_loc (loc, IFN_DEFERRED_INIT, + TREE_TYPE (decl), 3, + decl_size, init_type_node, + is_vla_node); + + gimplify_assign (decl, call, seq_p); +} + +/* Generate padding initialization for automatic vairable DECL. + C guarantees that brace-init with fewer initializers than members + aggregate will initialize the rest of the aggregate as-if it were + static initialization. In turn static initialization guarantees + that padding is initialized to zero. So, we always initialize paddings + to zeroes regardless INIT_TYPE. + To do the padding initialization, we insert a call to + __BUILTIN_CLEAR_PADDING (&decl, 0, for_auto_init = true). + Note, we add an additional dummy argument for __BUILTIN_CLEAR_PADDING, + 'for_auto_init' to distinguish whether this call is for automatic + variable initialization or not. + */ +static void +gimple_add_padding_init_for_auto_var (tree decl, bool is_vla, + gimple_seq *seq_p) +{ + tree addr_of_decl = NULL_TREE; + bool for_auto_init = true; + tree fn = builtin_decl_explicit (BUILT_IN_CLEAR_PADDING); + + if (is_vla) + { + /* The temporary address variable for this vla should be + created in gimplify_vla_decl. */ + gcc_assert (DECL_HAS_VALUE_EXPR_P (decl)); + gcc_assert (TREE_CODE (DECL_VALUE_EXPR (decl)) == INDIRECT_REF); + addr_of_decl = TREE_OPERAND (DECL_VALUE_EXPR (decl), 0); + } + else + { + mark_addressable (decl); + addr_of_decl = build_fold_addr_expr (decl); + } + + gimple *call = gimple_build_call (fn, + 3, addr_of_decl, + build_zero_cst (TREE_TYPE (addr_of_decl)), + build_int_cst (integer_type_node, + (int) for_auto_init)); + gimplify_seq_add_stmt (seq_p, call); +} + +/* Return true if the DECL need to be automaticly initialized by the + compiler. */ +static bool +is_var_need_auto_init (tree decl) +{ + if (auto_var_p (decl) + && (flag_auto_var_init > AUTO_INIT_UNINITIALIZED) + && (!lookup_attribute ("uninitialized", DECL_ATTRIBUTES (decl)))) + return true; + return false; +} + /* Gimplify a DECL_EXPR node *STMT_P by making any necessary allocation and initialization explicit. */ @@ -1840,6 +1928,26 @@ gimplify_decl_expr (tree *stmt_p, gimple_seq *seq_p) as they may contain a label address. */ walk_tree (&init, force_labels_r, NULL, NULL); } + /* When there is no explicit initializer, if the user requested, + We should insert an artifical initializer for this automatic + variable. */ + else if (is_var_need_auto_init (decl)) + { + gimple_add_init_for_auto_var (decl, + flag_auto_var_init, + is_vla, + seq_p); + /* The expanding of a call to the above .DEFERRED_INIT will apply + block initialization to the whole space covered by this variable. + As a result, all the paddings will be initialized to zeroes + for zero initialization and 0xFE byte-repeatable patterns for + pattern initialization. + In order to make the paddings as zeroes for pattern init, We + should add a call to __builtin_clear_padding to clear the + paddings to zero in compatiple with CLANG. */ + if (flag_auto_var_init == AUTO_INIT_PATTERN) + gimple_add_padding_init_for_auto_var (decl, is_vla, seq_p); + } } return GS_ALL_DONE; @@ -3411,11 +3519,15 @@ gimplify_call_expr (tree *expr_p, gimple_seq *pre_p, bool want_value) { /* Remember the original type of the argument in an internal dummy second argument, as in GIMPLE pointer conversions are - useless. */ + useless. also mark this call as not for automatic initialization + in the internal dummy third argument. */ p = CALL_EXPR_ARG (*expr_p, 0); + bool for_auto_init = false; *expr_p - = build_call_expr_loc (EXPR_LOCATION (*expr_p), fndecl, 2, p, - build_zero_cst (TREE_TYPE (p))); + = build_call_expr_loc (EXPR_LOCATION (*expr_p), fndecl, 3, p, + build_zero_cst (TREE_TYPE (p)), + build_int_cst (integer_type_node, + (int) for_auto_init)); return GS_OK; } break; @@ -4872,6 +4984,9 @@ gimplify_init_constructor (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p, tree object, ctor, type; enum gimplify_status ret; vec<constructor_elt, va_gc> *elts; + bool cleared = false; + bool is_empty_ctor = false; + bool is_init_expr = (TREE_CODE (*expr_p) == INIT_EXPR); gcc_assert (TREE_CODE (TREE_OPERAND (*expr_p, 1)) == CONSTRUCTOR); @@ -4914,7 +5029,7 @@ gimplify_init_constructor (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p, struct gimplify_init_ctor_preeval_data preeval_data; HOST_WIDE_INT num_ctor_elements, num_nonzero_elements; HOST_WIDE_INT num_unique_nonzero_elements; - bool cleared, complete_p, valid_const_initializer; + bool complete_p, valid_const_initializer; /* Aggregate types must lower constructors to initialization of individual elements. The exception is that a CONSTRUCTOR node @@ -4923,6 +5038,7 @@ gimplify_init_constructor (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p, { if (notify_temp_creation) return GS_OK; + is_empty_ctor = true; break; } @@ -5248,13 +5364,28 @@ gimplify_init_constructor (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p, if (want_value) { *expr_p = object; - return GS_OK; + ret = GS_OK; } else { *expr_p = NULL; - return GS_ALL_DONE; - } + ret = GS_ALL_DONE; + } + + /* If the user requests to initialize automatic variables, we + should initialize paddings inside the variable. Add a call to + __BUILTIN_CLEAR_PADDING (&object, 0, for_auto_init = true) to + initialize paddings of object always to zero regardless of + INIT_TYPE. Note, we will not insert this call if the aggregate + variable has be completely cleared already or it's initialized + with an empty constructor. */ + if (is_init_expr + && ((AGGREGATE_TYPE_P (type) && !cleared && !is_empty_ctor) + || !AGGREGATE_TYPE_P (type)) + && is_var_need_auto_init (object)) + gimple_add_padding_init_for_auto_var (object, false, pre_p); + + return ret; } /* Given a pointer value OP0, return a simplified version of an @@ -5395,10 +5526,12 @@ gimplify_modify_expr_rhs (tree *expr_p, tree *from_p, tree *to_p, crack at this before we break it down. */ if (ret != GS_UNHANDLED) break; + /* If we're initializing from a CONSTRUCTOR, break this into individual MODIFY_EXPRs. */ - return gimplify_init_constructor (expr_p, pre_p, post_p, want_value, - false); + ret = gimplify_init_constructor (expr_p, pre_p, post_p, want_value, + false); + return ret; case COND_EXPR: /* If we're assigning to a non-register type, push the assignment diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c index 1360a00f0b9..ada2a820ff1 100644 --- a/gcc/internal-fn.c +++ b/gcc/internal-fn.c @@ -53,6 +53,9 @@ along with GCC; see the file COPYING3. If not see #include "rtl-iter.h" #include "gimple-range.h" +/* For lang_hooks.types.type_for_mode. */ +#include "langhooks.h" + /* The names of each internal function, indexed by function number. */ const char *const internal_fn_name_array[] = { #define DEF_INTERNAL_FN(CODE, FLAGS, FNSPEC) #CODE, @@ -2977,6 +2980,102 @@ expand_UNIQUE (internal_fn, gcall *stmt) emit_insn (pattern); } +/* Expand the IFN_DEFERRED_INIT function: + LHS = DEFERRED_INIT (SIZE of the DECL, INIT_TYPE, IS_VLA); + + if IS_VLA is false, the LHS is the DECL itself, + if IS_VLA is true, the LHS is a MEM_REF whose address is the pointer + to this DECL. + + Initialize the LHS with zero/pattern according to its second argument + INIT_TYPE: + if INIT_TYPE is AUTO_INIT_ZERO, use zeroes to initialize; + if INIT_TYPE is AUTO_INIT_PATTERN, use 0xFE byte-repeatable pattern + to initialize; + The LHS variable is initialized including paddings. + The reasons to choose 0xFE for pattern initialization are: + 1. It is a non-canonical virtual address on x86_64, and at the + high end of the i386 kernel address space. + 2. It is a very large float value (-1.694739530317379e+38). + 3. It is also an unusual number for integers. */ +#define INIT_PATTERN_VALUE 0xFE +static void +expand_DEFERRED_INIT (internal_fn, gcall *stmt) +{ + tree lhs = gimple_call_lhs (stmt); + tree var_size = gimple_call_arg (stmt, 0); + enum auto_init_type init_type + = (enum auto_init_type) TREE_INT_CST_LOW (gimple_call_arg (stmt, 1)); + bool is_vla = (bool) TREE_INT_CST_LOW (gimple_call_arg (stmt, 2)); + bool reg_lhs = true; + + tree var_type = TREE_TYPE (lhs); + gcc_assert (init_type > AUTO_INIT_UNINITIALIZED); + + if (DECL_P (lhs)) + { + rtx tem = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE); + reg_lhs = !MEM_P (tem); + } + else if (TREE_CODE (lhs) == SSA_NAME) + reg_lhs = true; + else + { + gcc_assert (is_vla); + reg_lhs = false; + } + + + if (!reg_lhs) + { + /* If this is a VLA or the variable is not in register, + expand to a memset to initialize it. */ + + mark_addressable (lhs); + tree var_addr = build_fold_addr_expr (lhs); + + tree value = (init_type == AUTO_INIT_PATTERN) ? + build_int_cst (integer_type_node, + INIT_PATTERN_VALUE) : + integer_zero_node; + tree m_call = build_call_expr (builtin_decl_implicit (BUILT_IN_MEMSET), + 3, var_addr, value, var_size); + /* Expand this memset call. */ + expand_builtin_memset (m_call, NULL_RTX, TYPE_MODE (var_type)); + } + else + { + /* If this variable is in a register, use expand_assignment might + generate better code. */ + tree init = build_zero_cst (var_type); + unsigned HOST_WIDE_INT total_bytes + = tree_to_uhwi (TYPE_SIZE_UNIT (var_type)); + + if (init_type == AUTO_INIT_PATTERN) + { + tree alt_type = NULL_TREE; + if (!can_native_interpret_type_p (var_type)) + { + alt_type </cut>

3 years, 11 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tx1/llvm-master-aarch64-spec2k6-O3 - Build # 22 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_tx1/llvm-master-aarch64-spec2k6-O3. So far, this commit has regressed CI configurations: - tcwg_bmk_llvm_tx1/llvm-master-aarch64-spec2k6-O3 Culprit: <cut> commit 10c982e0b3e6d46d1fe288d7dbe0a393c65a640f Author: Simon Pilgrim <llvm-dev(a)redking.me.uk> Date: Mon Aug 23 21:06:06 2021 +0100 Revert rG1c9bec727ab5c53fa060560dc8d346a911142170 : [InstCombine] Fold (gep (oneuse(gep Ptr, Idx0)), Idx1) -> (gep Ptr, (add Idx0, Idx1)) (PR51069) Reverted (manually due to merge conflicts) while regressions reported on PR51540 are investigated As noticed on D106352, after we've folded "(select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (select C, Idx, 0))" if the inner Ptr was also a (now one use) gep we could then merge the geps, using the sum of the indices instead. I've limited this to basic 2-op geps - a more general case further down InstCombinerImpl.visitGetElementPtrInst doesn't have the one-use limitation but only creates the add if it can be created via SimplifyAddInst. https://alive2.llvm.org/ce/z/f8pLfD (Thanks Roman!) Differential Revision: https://reviews.llvm.org/D106450 </cut> Results regressed to (for first_bad == 10c982e0b3e6d46d1fe288d7dbe0a393c65a640f) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O3 artifacts/build-10c982e0b3e6d46d1fe288d7dbe0a393c65a640f/results_id: 1 # 447.dealII,[.] _ZNK13LaplaceSolver6SolverILi3EE15assemble_mat regressed by 120 # 464.h264ref,h264ref_base.default regressed by 104 # 464.h264ref,[.] FastFullPelBlockMotionSearch regressed by 135 from (for last_good == 50f4ae58eb136bc9d802cb98f02b6ff237eb61e0) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O3 artifacts/build-50f4ae58eb136bc9d802cb98f02b6ff237eb61e0/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Results ID of last_good: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-master-aarch64-spec2k6-O3/4963 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Results ID of first_bad: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-master-aarch64-spec2k6-O3/4956 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Configuration details: Reproduce builds: <cut> mkdir investigate-llvm-10c982e0b3e6d46d1fe288d7dbe0a393c65a640f cd investigate-llvm-10c982e0b3e6d46d1fe288d7dbe0a393c65a640f git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach 10c982e0b3e6d46d1fe288d7dbe0a393c65a640f ../artifacts/test.sh # Reproduce last_good build git checkout --detach 50f4ae58eb136bc9d802cb98f02b6ff237eb61e0 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Full commit (up to 1000 lines): <cut> commit 10c982e0b3e6d46d1fe288d7dbe0a393c65a640f Author: Simon Pilgrim <llvm-dev(a)redking.me.uk> Date: Mon Aug 23 21:06:06 2021 +0100 Revert rG1c9bec727ab5c53fa060560dc8d346a911142170 : [InstCombine] Fold (gep (oneuse(gep Ptr, Idx0)), Idx1) -> (gep Ptr, (add Idx0, Idx1)) (PR51069) Reverted (manually due to merge conflicts) while regressions reported on PR51540 are investigated As noticed on D106352, after we've folded "(select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (select C, Idx, 0))" if the inner Ptr was also a (now one use) gep we could then merge the geps, using the sum of the indices instead. I've limited this to basic 2-op geps - a more general case further down InstCombinerImpl.visitGetElementPtrInst doesn't have the one-use limitation but only creates the add if it can be created via SimplifyAddInst. https://alive2.llvm.org/ce/z/f8pLfD (Thanks Roman!) Differential Revision: https://reviews.llvm.org/D106450 --- .../InstCombine/InstructionCombining.cpp | 21 ----- .../InstCombine/gep-combine-loop-invariant.ll | 12 +-- llvm/test/Transforms/InstCombine/gep-custom-dl.ll | 4 +- llvm/test/Transforms/InstCombine/getelementptr.ll | 4 +- llvm/test/Transforms/InstCombine/select-gep.ll | 12 +-- llvm/test/Transforms/InstCombine/shift.ll | 4 +- .../LoopVectorize/AArch64/sve-vector-reverse.ll | 100 ++++++++++----------- .../LoopVectorize/AArch64/vector-reverse-mask4.ll | 54 +++++------ .../Transforms/LoopVectorize/ARM/mve-reductions.ll | 26 +++--- .../X86/x86-interleaved-accesses-masked-group.ll | 60 +++++++------ .../x86-interleaved-store-accesses-with-gaps.ll | 58 ++++++------ .../LoopVectorize/consecutive-ptr-uniforms.ll | 4 +- .../LoopVectorize/interleaved-accesses.ll | 62 +++++++------ 13 files changed, 210 insertions(+), 211 deletions(-) diff --git a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp index 1026b9da44e9..48645b484fd2 100644 --- a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp +++ b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp @@ -2131,27 +2131,6 @@ Instruction *InstCombinerImpl::visitGetElementPtrInst(GetElementPtrInst &GEP) { } } } - - // Guard the gep(gep) fold so we don't create an add inside a loop - // when there wasn't an equivalent instruction there before. - bool DifferentLoops = false; - if (LI) - if (auto *GEPLoop = LI->getLoopFor(GEP.getParent())) - if (auto *SrcOpI = dyn_cast<Instruction>(Src)) - if (LI->getLoopFor(SrcOpI->getParent()) != GEPLoop) - DifferentLoops = true; - - // Fold (gep(gep(Ptr,Idx0),Idx1) -> gep(Ptr,add(Idx0,Idx1)) - if (!DifferentLoops && GO1->getType() == SO1->getType()) { - bool NewInBounds = GEP.isInBounds() && Src->isInBounds(); - auto *NewIdx = - Builder.CreateAdd(GO1, SO1, GEP.getName() + ".idx", - /*HasNUW*/ false, /*HasNSW*/ NewInBounds); - auto *NewGEP = GetElementPtrInst::Create( - GEPEltType, Src->getPointerOperand(), {NewIdx}); - NewGEP->setIsInBounds(NewInBounds); - return NewGEP; - } } // Note that if our source is a gep chain itself then we wait for that diff --git a/llvm/test/Transforms/InstCombine/gep-combine-loop-invariant.ll b/llvm/test/Transforms/InstCombine/gep-combine-loop-invariant.ll index dfa664fde208..f9aac12cfb1f 100644 --- a/llvm/test/Transforms/InstCombine/gep-combine-loop-invariant.ll +++ b/llvm/test/Transforms/InstCombine/gep-combine-loop-invariant.ll @@ -8,10 +8,10 @@ define i32 @foo(i8* nocapture readnone %match, i32 %cur_match, i32 %best_len, i3 ; CHECK-LABEL: @foo( ; CHECK-NEXT: entry: ; CHECK-NEXT: [[IDX_EXT2:%.*]] = zext i32 [[CUR_MATCH:%.*]] to i64 +; CHECK-NEXT: [[ADD_PTR4:%.*]] = getelementptr inbounds i8, i8* [[WIN:%.*]], i64 [[IDX_EXT2]] ; CHECK-NEXT: [[IDX_EXT1:%.*]] = zext i32 [[BEST_LEN:%.*]] to i64 -; CHECK-NEXT: [[ADD_PTR25_IDX:%.*]] = add nuw nsw i64 [[IDX_EXT1]], [[IDX_EXT2]] -; CHECK-NEXT: [[ADD_PTR36_IDX:%.*]] = add nsw i64 [[ADD_PTR25_IDX]], -1 -; CHECK-NEXT: [[ADD_PTR36:%.*]] = getelementptr inbounds i8, i8* [[WIN:%.*]], i64 [[ADD_PTR36_IDX]] +; CHECK-NEXT: [[ADD_PTR25:%.*]] = getelementptr inbounds i8, i8* [[ADD_PTR4]], i64 [[IDX_EXT1]] +; CHECK-NEXT: [[ADD_PTR36:%.*]] = getelementptr inbounds i8, i8* [[ADD_PTR25]], i64 -1 ; CHECK-NEXT: [[TMP0:%.*]] = bitcast i8* [[ADD_PTR36]] to i32* ; CHECK-NEXT: [[TMP1:%.*]] = load i32, i32* [[TMP0]], align 4 ; CHECK-NEXT: [[CMP7:%.*]] = icmp eq i32 [[TMP1]], [[SCAN_END:%.*]] @@ -20,9 +20,9 @@ define i32 @foo(i8* nocapture readnone %match, i32 %cur_match, i32 %best_len, i3 ; CHECK-NEXT: br label [[IF_THEN:%.*]] ; CHECK: do.body: ; CHECK-NEXT: [[IDX_EXT:%.*]] = zext i32 [[TMP4:%.*]] to i64 -; CHECK-NEXT: [[ADD_PTR2_IDX:%.*]] = add nuw nsw i64 [[IDX_EXT]], [[IDX_EXT1]] -; CHECK-NEXT: [[ADD_PTR3_IDX:%.*]] = add nsw i64 [[ADD_PTR2_IDX]], -1 -; CHECK-NEXT: [[ADD_PTR3:%.*]] = getelementptr inbounds i8, i8* [[WIN]], i64 [[ADD_PTR3_IDX]] +; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[WIN]], i64 [[IDX_EXT1]] +; CHECK-NEXT: [[ADD_PTR2:%.*]] = getelementptr inbounds i8, i8* [[ADD_PTR]], i64 -1 +; CHECK-NEXT: [[ADD_PTR3:%.*]] = getelementptr inbounds i8, i8* [[ADD_PTR2]], i64 [[IDX_EXT]] ; CHECK-NEXT: [[TMP2:%.*]] = bitcast i8* [[ADD_PTR3]] to i32* ; CHECK-NEXT: [[TMP3:%.*]] = load i32, i32* [[TMP2]], align 4 ; CHECK-NEXT: [[CMP:%.*]] = icmp eq i32 [[TMP3]], [[SCAN_END]] diff --git a/llvm/test/Transforms/InstCombine/gep-custom-dl.ll b/llvm/test/Transforms/InstCombine/gep-custom-dl.ll index 0980451d8ec7..3de70f3c151c 100644 --- a/llvm/test/Transforms/InstCombine/gep-custom-dl.ll +++ b/llvm/test/Transforms/InstCombine/gep-custom-dl.ll @@ -75,8 +75,8 @@ define void @test_evaluate_gep_as_ptrs_array(i8 addrspace(2)* %B) { define i32* @test4(i32* %I, i32 %C, i32 %D) { ; CHECK-LABEL: @test4( -; CHECK-NEXT: [[B_IDX:%.*]] = add i32 [[D:%.*]], [[C:%.*]] -; CHECK-NEXT: [[B:%.*]] = getelementptr i32, i32* [[I:%.*]], i32 [[B_IDX]] +; CHECK-NEXT: [[A:%.*]] = getelementptr i32, i32* [[I:%.*]], i32 [[C:%.*]] +; CHECK-NEXT: [[B:%.*]] = getelementptr i32, i32* [[A]], i32 [[D:%.*]] ; CHECK-NEXT: ret i32* [[B]] ; %A = getelementptr i32, i32* %I, i32 %C diff --git a/llvm/test/Transforms/InstCombine/getelementptr.ll b/llvm/test/Transforms/InstCombine/getelementptr.ll index 688303d308c1..f2a336767fda 100644 --- a/llvm/test/Transforms/InstCombine/getelementptr.ll +++ b/llvm/test/Transforms/InstCombine/getelementptr.ll @@ -115,8 +115,8 @@ define void @test_evaluate_gep_as_ptrs_array(i8 addrspace(2)* %B) { define i32* @test7(i32* %I, i64 %C, i64 %D) { ; CHECK-LABEL: @test7( -; CHECK-NEXT: [[B_IDX:%.*]] = add i64 [[D:%.*]], [[C:%.*]] -; CHECK-NEXT: [[B:%.*]] = getelementptr i32, i32* [[I:%.*]], i64 [[B_IDX]] +; CHECK-NEXT: [[A:%.*]] = getelementptr i32, i32* [[I:%.*]], i64 [[C:%.*]] +; CHECK-NEXT: [[B:%.*]] = getelementptr i32, i32* [[A]], i64 [[D:%.*]] ; CHECK-NEXT: ret i32* [[B]] ; %A = getelementptr i32, i32* %I, i64 %C diff --git a/llvm/test/Transforms/InstCombine/select-gep.ll b/llvm/test/Transforms/InstCombine/select-gep.ll index 2e112fe93a4c..519f0a94a136 100644 --- a/llvm/test/Transforms/InstCombine/select-gep.ll +++ b/llvm/test/Transforms/InstCombine/select-gep.ll @@ -102,10 +102,10 @@ define i32* @test2b(i32* %p, i64 %x, i64 %y) { ; PR51069 define i32* @test2c(i32* %p, i64 %x, i64 %y) { ; CHECK-LABEL: @test2c( -; CHECK-NEXT: [[ICMP:%.*]] = icmp ugt i64 [[X:%.*]], [[Y:%.*]] +; CHECK-NEXT: [[GEP1:%.*]] = getelementptr inbounds i32, i32* [[P:%.*]], i64 [[X:%.*]] +; CHECK-NEXT: [[ICMP:%.*]] = icmp ugt i64 [[X]], [[Y:%.*]] ; CHECK-NEXT: [[SEL_IDX:%.*]] = select i1 [[ICMP]], i64 0, i64 6 -; CHECK-NEXT: [[SEL_IDX1:%.*]] = add i64 [[SEL_IDX]], [[X]] -; CHECK-NEXT: [[SEL:%.*]] = getelementptr i32, i32* [[P:%.*]], i64 [[SEL_IDX1]] +; CHECK-NEXT: [[SEL:%.*]] = getelementptr i32, i32* [[GEP1]], i64 [[SEL_IDX]] ; CHECK-NEXT: ret i32* [[SEL]] ; %gep1 = getelementptr inbounds i32, i32* %p, i64 %x @@ -118,10 +118,10 @@ define i32* @test2c(i32* %p, i64 %x, i64 %y) { ; PR51069 define i32* @test2d(i32* %p, i64 %x, i64 %y) { ; CHECK-LABEL: @test2d( -; CHECK-NEXT: [[ICMP:%.*]] = icmp ugt i64 [[X:%.*]], [[Y:%.*]] +; CHECK-NEXT: [[GEP1:%.*]] = getelementptr inbounds i32, i32* [[P:%.*]], i64 [[X:%.*]] +; CHECK-NEXT: [[ICMP:%.*]] = icmp ugt i64 [[X]], [[Y:%.*]] ; CHECK-NEXT: [[SEL_IDX:%.*]] = select i1 [[ICMP]], i64 6, i64 0 -; CHECK-NEXT: [[SEL_IDX1:%.*]] = add i64 [[SEL_IDX]], [[X]] -; CHECK-NEXT: [[SEL:%.*]] = getelementptr i32, i32* [[P:%.*]], i64 [[SEL_IDX1]] +; CHECK-NEXT: [[SEL:%.*]] = getelementptr i32, i32* [[GEP1]], i64 [[SEL_IDX]] ; CHECK-NEXT: ret i32* [[SEL]] ; %gep1 = getelementptr inbounds i32, i32* %p, i64 %x diff --git a/llvm/test/Transforms/InstCombine/shift.ll b/llvm/test/Transforms/InstCombine/shift.ll index f87de574bc99..2c5c4a7dbe1c 100644 --- a/llvm/test/Transforms/InstCombine/shift.ll +++ b/llvm/test/Transforms/InstCombine/shift.ll @@ -1774,10 +1774,10 @@ define void @ashr_out_of_range(i177* %A) { define void @ashr_out_of_range_1(i177* %A) { ; CHECK-LABEL: @ashr_out_of_range_1( ; CHECK-NEXT: [[L:%.*]] = load i177, i177* [[A:%.*]], align 4 +; CHECK-NEXT: [[G11:%.*]] = getelementptr i177, i177* [[A]], i64 -1 ; CHECK-NEXT: [[B24_LOBIT:%.*]] = ashr i177 [[L]], 175 ; CHECK-NEXT: [[TMP1:%.*]] = trunc i177 [[B24_LOBIT]] to i64 -; CHECK-NEXT: [[G62_IDX:%.*]] = add i64 [[TMP1]], -1 -; CHECK-NEXT: [[G62:%.*]] = getelementptr i177, i177* [[A]], i64 [[G62_IDX]] +; CHECK-NEXT: [[G62:%.*]] = getelementptr i177, i177* [[G11]], i64 [[TMP1]] ; CHECK-NEXT: store i177 0, i177* [[G62]], align 4 ; CHECK-NEXT: ret void ; diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/sve-vector-reverse.ll b/llvm/test/Transforms/LoopVectorize/AArch64/sve-vector-reverse.ll index d406c6de1571..5cd5af5dd9e6 100644 --- a/llvm/test/Transforms/LoopVectorize/AArch64/sve-vector-reverse.ll +++ b/llvm/test/Transforms/LoopVectorize/AArch64/sve-vector-reverse.ll @@ -34,30 +34,30 @@ define void @vector_reverse_f64(i64 %N, double* %a, double* %b) #0{ ; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ] ; CHECK-NEXT: [[TMP4:%.*]] = xor i64 [[INDEX]], -1 ; CHECK-NEXT: [[TMP5:%.*]] = add i64 [[TMP4]], [[N]] -; CHECK-NEXT: [[TMP6:%.*]] = call i32 @llvm.vscale.i32() -; CHECK-NEXT: [[DOTNEG:%.*]] = mul i32 [[TMP6]], -8 -; CHECK-NEXT: [[TMP7:%.*]] = or i32 [[DOTNEG]], 1 -; CHECK-NEXT: [[TMP8:%.*]] = sext i32 [[TMP7]] to i64 -; CHECK-NEXT: [[DOTIDX:%.*]] = add nsw i64 [[TMP5]], [[TMP8]] -; CHECK-NEXT: [[TMP9:%.*]] = getelementptr inbounds double, double* [[B]], i64 [[DOTIDX]] -; CHECK-NEXT: [[TMP10:%.*]] = bitcast double* [[TMP9]] to <vscale x 8 x double>* -; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <vscale x 8 x double>, <vscale x 8 x double>* [[TMP10]], align 8, !alias.scope !0 +; CHECK-NEXT: [[TMP6:%.*]] = getelementptr inbounds double, double* [[B]], i64 [[TMP5]] +; CHECK-NEXT: [[TMP7:%.*]] = call i32 @llvm.vscale.i32() +; CHECK-NEXT: [[DOTNEG:%.*]] = mul i32 [[TMP7]], -8 +; CHECK-NEXT: [[TMP8:%.*]] = or i32 [[DOTNEG]], 1 +; CHECK-NEXT: [[TMP9:%.*]] = sext i32 [[TMP8]] to i64 +; CHECK-NEXT: [[TMP10:%.*]] = getelementptr inbounds double, double* [[TMP6]], i64 [[TMP9]] +; CHECK-NEXT: [[TMP11:%.*]] = bitcast double* [[TMP10]] to <vscale x 8 x double>* +; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <vscale x 8 x double>, <vscale x 8 x double>* [[TMP11]], align 8, !alias.scope !0 ; CHECK-NEXT: [[REVERSE:%.*]] = call <vscale x 8 x double> @llvm.experimental.vector.reverse.nxv8f64(<vscale x 8 x double> [[WIDE_LOAD]]) -; CHECK-NEXT: [[TMP11:%.*]] = fadd <vscale x 8 x double> [[REVERSE]], shufflevector (<vscale x 8 x double> insertelement (<vscale x 8 x double> poison, double 1.000000e+00, i32 0), <vscale x 8 x double> poison, <vscale x 8 x i32> zeroinitializer) -; CHECK-NEXT: [[REVERSE6:%.*]] = call <vscale x 8 x double> @llvm.experimental.vector.reverse.nxv8f64(<vscale x 8 x double> [[TMP11]]) -; CHECK-NEXT: [[TMP12:%.*]] = call i32 @llvm.vscale.i32() -; CHECK-NEXT: [[DOTNEG7:%.*]] = mul i32 [[TMP12]], -8 -; CHECK-NEXT: [[TMP13:%.*]] = or i32 [[DOTNEG7]], 1 -; CHECK-NEXT: [[TMP14:%.*]] = sext i32 [[TMP13]] to i64 -; CHECK-NEXT: [[DOTIDX8:%.*]] = add nsw i64 [[TMP5]], [[TMP14]] -; CHECK-NEXT: [[TMP15:%.*]] = getelementptr inbounds double, double* [[A]], i64 [[DOTIDX8]] -; CHECK-NEXT: [[TMP16:%.*]] = bitcast double* [[TMP15]] to <vscale x 8 x double>* -; CHECK-NEXT: store <vscale x 8 x double> [[REVERSE6]], <vscale x 8 x double>* [[TMP16]], align 8, !alias.scope !3, !noalias !0 -; CHECK-NEXT: [[TMP17:%.*]] = call i64 @llvm.vscale.i64() -; CHECK-NEXT: [[TMP18:%.*]] = shl i64 [[TMP17]], 3 -; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], [[TMP18]] -; CHECK-NEXT: [[TMP19:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]] -; CHECK-NEXT: br i1 [[TMP19]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP5:![0-9]+]] +; CHECK-NEXT: [[TMP12:%.*]] = fadd <vscale x 8 x double> [[REVERSE]], shufflevector (<vscale x 8 x double> insertelement (<vscale x 8 x double> poison, double 1.000000e+00, i32 0), <vscale x 8 x double> poison, <vscale x 8 x i32> zeroinitializer) +; CHECK-NEXT: [[TMP13:%.*]] = getelementptr inbounds double, double* [[A]], i64 [[TMP5]] +; CHECK-NEXT: [[REVERSE6:%.*]] = call <vscale x 8 x double> @llvm.experimental.vector.reverse.nxv8f64(<vscale x 8 x double> [[TMP12]]) +; CHECK-NEXT: [[TMP14:%.*]] = call i32 @llvm.vscale.i32() +; CHECK-NEXT: [[DOTNEG7:%.*]] = mul i32 [[TMP14]], -8 +; CHECK-NEXT: [[TMP15:%.*]] = or i32 [[DOTNEG7]], 1 +; CHECK-NEXT: [[TMP16:%.*]] = sext i32 [[TMP15]] to i64 +; CHECK-NEXT: [[TMP17:%.*]] = getelementptr inbounds double, double* [[TMP13]], i64 [[TMP16]] +; CHECK-NEXT: [[TMP18:%.*]] = bitcast double* [[TMP17]] to <vscale x 8 x double>* +; CHECK-NEXT: store <vscale x 8 x double> [[REVERSE6]], <vscale x 8 x double>* [[TMP18]], align 8, !alias.scope !3, !noalias !0 +; CHECK-NEXT: [[TMP19:%.*]] = call i64 @llvm.vscale.i64() +; CHECK-NEXT: [[TMP20:%.*]] = shl i64 [[TMP19]], 3 +; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], [[TMP20]] +; CHECK-NEXT: [[TMP21:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]] +; CHECK-NEXT: br i1 [[TMP21]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP5:![0-9]+]] ; CHECK: middle.block: ; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[N_MOD_VF]], 0 ; CHECK-NEXT: br i1 [[CMP_N]], label [[FOR_COND_CLEANUP_LOOPEXIT:%.*]], label [[SCALAR_PH]] @@ -72,8 +72,8 @@ define void @vector_reverse_f64(i64 %N, double* %a, double* %b) #0{ ; CHECK-NEXT: [[I_08_IN:%.*]] = phi i64 [ [[I_08:%.*]], [[FOR_BODY]] ], [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ] ; CHECK-NEXT: [[I_08]] = add nsw i64 [[I_08_IN]], -1 ; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds double, double* [[B]], i64 [[I_08]] -; CHECK-NEXT: [[TMP20:%.*]] = load double, double* [[ARRAYIDX]], align 8 -; CHECK-NEXT: [[ADD:%.*]] = fadd double [[TMP20]], 1.000000e+00 +; CHECK-NEXT: [[TMP22:%.*]] = load double, double* [[ARRAYIDX]], align 8 +; CHECK-NEXT: [[ADD:%.*]] = fadd double [[TMP22]], 1.000000e+00 ; CHECK-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds double, double* [[A]], i64 [[I_08]] ; CHECK-NEXT: store double [[ADD]], double* [[ARRAYIDX1]], align 8 ; CHECK-NEXT: [[CMP:%.*]] = icmp sgt i64 [[I_08_IN]], 1 @@ -126,30 +126,30 @@ define void @vector_reverse_i64(i64 %N, i64* %a, i64* %b) #0 { ; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ] ; CHECK-NEXT: [[TMP4:%.*]] = xor i64 [[INDEX]], -1 ; CHECK-NEXT: [[TMP5:%.*]] = add i64 [[TMP4]], [[N]] -; CHECK-NEXT: [[TMP6:%.*]] = call i32 @llvm.vscale.i32() -; CHECK-NEXT: [[DOTNEG:%.*]] = mul i32 [[TMP6]], -8 -; CHECK-NEXT: [[TMP7:%.*]] = or i32 [[DOTNEG]], 1 -; CHECK-NEXT: [[TMP8:%.*]] = sext i32 [[TMP7]] to i64 -; CHECK-NEXT: [[DOTIDX:%.*]] = add nsw i64 [[TMP5]], [[TMP8]] -; CHECK-NEXT: [[TMP9:%.*]] = getelementptr inbounds i64, i64* [[B]], i64 [[DOTIDX]] -; CHECK-NEXT: [[TMP10:%.*]] = bitcast i64* [[TMP9]] to <vscale x 8 x i64>* -; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <vscale x 8 x i64>, <vscale x 8 x i64>* [[TMP10]], align 8, !alias.scope !9 +; CHECK-NEXT: [[TMP6:%.*]] = getelementptr inbounds i64, i64* [[B]], i64 [[TMP5]] +; CHECK-NEXT: [[TMP7:%.*]] = call i32 @llvm.vscale.i32() +; CHECK-NEXT: [[DOTNEG:%.*]] = mul i32 [[TMP7]], -8 +; CHECK-NEXT: [[TMP8:%.*]] = or i32 [[DOTNEG]], 1 +; CHECK-NEXT: [[TMP9:%.*]] = sext i32 [[TMP8]] to i64 +; CHECK-NEXT: [[TMP10:%.*]] = getelementptr inbounds i64, i64* [[TMP6]], i64 [[TMP9]] +; CHECK-NEXT: [[TMP11:%.*]] = bitcast i64* [[TMP10]] to <vscale x 8 x i64>* +; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <vscale x 8 x i64>, <vscale x 8 x i64>* [[TMP11]], align 8, !alias.scope !9 ; CHECK-NEXT: [[REVERSE:%.*]] = call <vscale x 8 x i64> @llvm.experimental.vector.reverse.nxv8i64(<vscale x 8 x i64> [[WIDE_LOAD]]) -; CHECK-NEXT: [[TMP11:%.*]] = add <vscale x 8 x i64> [[REVERSE]], shufflevector (<vscale x 8 x i64> insertelement (<vscale x 8 x i64> poison, i64 1, i32 0), <vscale x 8 x i64> poison, <vscale x 8 x i32> zeroinitializer) -; CHECK-NEXT: [[REVERSE6:%.*]] = call <vscale x 8 x i64> @llvm.experimental.vector.reverse.nxv8i64(<vscale x 8 x i64> [[TMP11]]) -; CHECK-NEXT: [[TMP12:%.*]] = call i32 @llvm.vscale.i32() -; CHECK-NEXT: [[DOTNEG7:%.*]] = mul i32 [[TMP12]], -8 -; CHECK-NEXT: [[TMP13:%.*]] = or i32 [[DOTNEG7]], 1 -; CHECK-NEXT: [[TMP14:%.*]] = sext i32 [[TMP13]] to i64 -; CHECK-NEXT: [[DOTIDX8:%.*]] = add nsw i64 [[TMP5]], [[TMP14]] -; CHECK-NEXT: [[TMP15:%.*]] = getelementptr inbounds i64, i64* [[A]], i64 [[DOTIDX8]] -; CHECK-NEXT: [[TMP16:%.*]] = bitcast i64* [[TMP15]] to <vscale x 8 x i64>* -; CHECK-NEXT: store <vscale x 8 x i64> [[REVERSE6]], <vscale x 8 x i64>* [[TMP16]], align 8, !alias.scope !12, !noalias !9 -; CHECK-NEXT: [[TMP17:%.*]] = call i64 @llvm.vscale.i64() -; CHECK-NEXT: [[TMP18:%.*]] = shl i64 [[TMP17]], 3 -; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], [[TMP18]] -; CHECK-NEXT: [[TMP19:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]] -; CHECK-NEXT: br i1 [[TMP19]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP14:![0-9]+]] +; CHECK-NEXT: [[TMP12:%.*]] = add <vscale x 8 x i64> [[REVERSE]], shufflevector (<vscale x 8 x i64> insertelement (<vscale x 8 x i64> poison, i64 1, i32 0), <vscale x 8 x i64> poison, <vscale x 8 x i32> zeroinitializer) +; CHECK-NEXT: [[TMP13:%.*]] = getelementptr inbounds i64, i64* [[A]], i64 [[TMP5]] +; CHECK-NEXT: [[REVERSE6:%.*]] = call <vscale x 8 x i64> @llvm.experimental.vector.reverse.nxv8i64(<vscale x 8 x i64> [[TMP12]]) +; CHECK-NEXT: [[TMP14:%.*]] = call i32 @llvm.vscale.i32() +; CHECK-NEXT: [[DOTNEG7:%.*]] = mul i32 [[TMP14]], -8 +; CHECK-NEXT: [[TMP15:%.*]] = or i32 [[DOTNEG7]], 1 +; CHECK-NEXT: [[TMP16:%.*]] = sext i32 [[TMP15]] to i64 +; CHECK-NEXT: [[TMP17:%.*]] = getelementptr inbounds i64, i64* [[TMP13]], i64 [[TMP16]] +; CHECK-NEXT: [[TMP18:%.*]] = bitcast i64* [[TMP17]] to <vscale x 8 x i64>* +; CHECK-NEXT: store <vscale x 8 x i64> [[REVERSE6]], <vscale x 8 x i64>* [[TMP18]], align 8, !alias.scope !12, !noalias !9 +; CHECK-NEXT: [[TMP19:%.*]] = call i64 @llvm.vscale.i64() +; CHECK-NEXT: [[TMP20:%.*]] = shl i64 [[TMP19]], 3 +; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], [[TMP20]] +; CHECK-NEXT: [[TMP21:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]] +; CHECK-NEXT: br i1 [[TMP21]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP14:![0-9]+]] ; CHECK: middle.block: ; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[N_MOD_VF]], 0 ; CHECK-NEXT: br i1 [[CMP_N]], label [[FOR_COND_CLEANUP_LOOPEXIT:%.*]], label [[SCALAR_PH]] @@ -164,8 +164,8 @@ define void @vector_reverse_i64(i64 %N, i64* %a, i64* %b) #0 { ; CHECK-NEXT: [[I_09_IN:%.*]] = phi i64 [ [[I_09:%.*]], [[FOR_BODY]] ], [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ] ; CHECK-NEXT: [[I_09]] = add nsw i64 [[I_09_IN]], -1 ; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i64, i64* [[B]], i64 [[I_09]] -; CHECK-NEXT: [[TMP20:%.*]] = load i64, i64* [[ARRAYIDX]], align 8 -; CHECK-NEXT: [[ADD:%.*]] = add i64 [[TMP20]], 1 +; CHECK-NEXT: [[TMP22:%.*]] = load i64, i64* [[ARRAYIDX]], align 8 +; CHECK-NEXT: [[ADD:%.*]] = add i64 [[TMP22]], 1 ; CHECK-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds i64, i64* [[A]], i64 [[I_09]] ; CHECK-NEXT: store i64 [[ADD]], i64* [[ARRAYIDX2]], align 8 ; CHECK-NEXT: [[CMP:%.*]] = icmp sgt i64 [[I_09_IN]], 1 diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/vector-reverse-mask4.ll b/llvm/test/Transforms/LoopVectorize/AArch64/vector-reverse-mask4.ll index 4233760333ac..077d3c1f71b3 100644 --- a/llvm/test/Transforms/LoopVectorize/AArch64/vector-reverse-mask4.ll +++ b/llvm/test/Transforms/LoopVectorize/AArch64/vector-reverse-mask4.ll @@ -44,30 +44,32 @@ define void @vector_reverse_mask_v4i1(double* %a, double* %cond, i64 %N) #0 { ; CHECK-NEXT: [[TMP4:%.*]] = bitcast double* [[TMP3]] to <4 x double>* ; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <4 x double>, <4 x double>* [[TMP4]], align 8, !alias.scope !0 ; CHECK-NEXT: [[REVERSE:%.*]] = shufflevector <4 x double> [[WIDE_LOAD]], <4 x double> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0> -; CHECK-NEXT: [[TMP5:%.*]] = getelementptr inbounds double, double* [[TMP2]], i64 -7 -; CHECK-NEXT: [[TMP6:%.*]] = bitcast double* [[TMP5]] to <4 x double>* -; CHECK-NEXT: [[WIDE_LOAD6:%.*]] = load <4 x double>, <4 x double>* [[TMP6]], align 8, !alias.scope !0 +; CHECK-NEXT: [[TMP5:%.*]] = getelementptr inbounds double, double* [[TMP2]], i64 -4 +; CHECK-NEXT: [[TMP6:%.*]] = getelementptr inbounds double, double* [[TMP5]], i64 -3 +; CHECK-NEXT: [[TMP7:%.*]] = bitcast double* [[TMP6]] to <4 x double>* +; CHECK-NEXT: [[WIDE_LOAD6:%.*]] = load <4 x double>, <4 x double>* [[TMP7]], align 8, !alias.scope !0 ; CHECK-NEXT: [[REVERSE7:%.*]] = shufflevector <4 x double> [[WIDE_LOAD6]], <4 x double> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0> -; CHECK-NEXT: [[TMP7:%.*]] = fcmp une <4 x double> [[REVERSE]], zeroinitializer -; CHECK-NEXT: [[TMP8:%.*]] = fcmp une <4 x double> [[REVERSE7]], zeroinitializer -; CHECK-NEXT: [[TMP9:%.*]] = getelementptr inbounds double, double* [[A]], i64 [[TMP1]] -; CHECK-NEXT: [[TMP10:%.*]] = getelementptr inbounds double, double* [[TMP9]], i64 -3 -; CHECK-NEXT: [[REVERSE8:%.*]] = shufflevector <4 x i1> [[TMP7]], <4 x i1> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0> -; CHECK-NEXT: [[TMP11:%.*]] = bitcast double* [[TMP10]] to <4 x double>* -; CHECK-NEXT: [[WIDE_MASKED_LOAD:%.*]] = call <4 x double> @llvm.masked.load.v4f64.p0v4f64(<4 x double>* nonnull [[TMP11]], i32 8, <4 x i1> [[REVERSE8]], <4 x double> poison), !alias.scope !3, !noalias !0 -; CHECK-NEXT: [[TMP12:%.*]] = getelementptr inbounds double, double* [[TMP9]], i64 -7 -; CHECK-NEXT: [[REVERSE10:%.*]] = shufflevector <4 x i1> [[TMP8]], <4 x i1> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0> -; CHECK-NEXT: [[TMP13:%.*]] = bitcast double* [[TMP12]] to <4 x double>* -; CHECK-NEXT: [[WIDE_MASKED_LOAD11:%.*]] = call <4 x double> @llvm.masked.load.v4f64.p0v4f64(<4 x double>* nonnull [[TMP13]], i32 8, <4 x i1> [[REVERSE10]], <4 x double> poison), !alias.scope !3, !noalias !0 -; CHECK-NEXT: [[TMP14:%.*]] = fadd <4 x double> [[WIDE_MASKED_LOAD]], <double 1.000000e+00, double 1.000000e+00, double 1.000000e+00, double 1.000000e+00> -; CHECK-NEXT: [[TMP15:%.*]] = fadd <4 x double> [[WIDE_MASKED_LOAD11]], <double 1.000000e+00, double 1.000000e+00, double 1.000000e+00, double 1.000000e+00> -; CHECK-NEXT: [[TMP16:%.*]] = bitcast double* [[TMP10]] to <4 x double>* -; CHECK-NEXT: call void @llvm.masked.store.v4f64.p0v4f64(<4 x double> [[TMP14]], <4 x double>* [[TMP16]], i32 8, <4 x i1> [[REVERSE8]]), !alias.scope !3, !noalias !0 -; CHECK-NEXT: [[TMP17:%.*]] = bitcast double* [[TMP12]] to <4 x double>* -; CHECK-NEXT: call void @llvm.masked.store.v4f64.p0v4f64(<4 x double> [[TMP15]], <4 x double>* [[TMP17]], i32 8, <4 x i1> [[REVERSE10]]), !alias.scope !3, !noalias !0 +; CHECK-NEXT: [[TMP8:%.*]] = fcmp une <4 x double> [[REVERSE]], zeroinitializer +; CHECK-NEXT: [[TMP9:%.*]] = fcmp une <4 x double> [[REVERSE7]], zeroinitializer +; CHECK-NEXT: [[TMP10:%.*]] = getelementptr inbounds double, double* [[A]], i64 [[TMP1]] +; CHECK-NEXT: [[TMP11:%.*]] = getelementptr inbounds double, double* [[TMP10]], i64 -3 +; CHECK-NEXT: [[REVERSE8:%.*]] = shufflevector <4 x i1> [[TMP8]], <4 x i1> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0> +; CHECK-NEXT: [[TMP12:%.*]] = bitcast double* [[TMP11]] to <4 x double>* +; CHECK-NEXT: [[WIDE_MASKED_LOAD:%.*]] = call <4 x double> @llvm.masked.load.v4f64.p0v4f64(<4 x double>* nonnull [[TMP12]], i32 8, <4 x i1> [[REVERSE8]], <4 x double> poison), !alias.scope !3, !noalias !0 +; CHECK-NEXT: [[TMP13:%.*]] = getelementptr inbounds double, double* [[TMP10]], i64 -4 +; CHECK-NEXT: [[TMP14:%.*]] = getelementptr inbounds double, double* [[TMP13]], i64 -3 +; CHECK-NEXT: [[REVERSE10:%.*]] = shufflevector <4 x i1> [[TMP9]], <4 x i1> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0> +; CHECK-NEXT: [[TMP15:%.*]] = bitcast double* [[TMP14]] to <4 x double>* +; CHECK-NEXT: [[WIDE_MASKED_LOAD11:%.*]] = call <4 x double> @llvm.masked.load.v4f64.p0v4f64(<4 x double>* nonnull [[TMP15]], i32 8, <4 x i1> [[REVERSE10]], <4 x double> poison), !alias.scope !3, !noalias !0 +; CHECK-NEXT: [[TMP16:%.*]] = fadd <4 x double> [[WIDE_MASKED_LOAD]], <double 1.000000e+00, double 1.000000e+00, double 1.000000e+00, double 1.000000e+00> +; CHECK-NEXT: [[TMP17:%.*]] = fadd <4 x double> [[WIDE_MASKED_LOAD11]], <double 1.000000e+00, double 1.000000e+00, double 1.000000e+00, double 1.000000e+00> +; CHECK-NEXT: [[TMP18:%.*]] = bitcast double* [[TMP11]] to <4 x double>* +; CHECK-NEXT: call void @llvm.masked.store.v4f64.p0v4f64(<4 x double> [[TMP16]], <4 x double>* [[TMP18]], i32 8, <4 x i1> [[REVERSE8]]), !alias.scope !3, !noalias !0 +; CHECK-NEXT: [[TMP19:%.*]] = bitcast double* [[TMP14]] to <4 x double>* +; CHECK-NEXT: call void @llvm.masked.store.v4f64.p0v4f64(<4 x double> [[TMP17]], <4 x double>* [[TMP19]], i32 8, <4 x i1> [[REVERSE10]]), !alias.scope !3, !noalias !0 ; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 8 -; CHECK-NEXT: [[TMP18:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]] -; CHECK-NEXT: br i1 [[TMP18]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP5:![0-9]+]] +; CHECK-NEXT: [[TMP20:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]] +; CHECK-NEXT: br i1 [[TMP20]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP5:![0-9]+]] ; CHECK: middle.block: ; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[N_VEC]], [[N]] ; CHECK-NEXT: br i1 [[CMP_N]], label [[FOR_COND_CLEANUP_LOOPEXIT:%.*]], label [[SCALAR_PH]] @@ -82,13 +84,13 @@ define void @vector_reverse_mask_v4i1(double* %a, double* %cond, i64 %N) #0 { ; CHECK-NEXT: [[I_08_IN:%.*]] = phi i64 [ [[I_08:%.*]], [[FOR_INC:%.*]] ], [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ] ; CHECK-NEXT: [[I_08]] = add nsw i64 [[I_08_IN]], -1 ; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds double, double* [[COND]], i64 [[I_08]] -; CHECK-NEXT: [[TMP19:%.*]] = load double, double* [[ARRAYIDX]], align 8 -; CHECK-NEXT: [[TOBOOL:%.*]] = fcmp une double [[TMP19]], 0.000000e+00 +; CHECK-NEXT: [[TMP21:%.*]] = load double, double* [[ARRAYIDX]], align 8 +; CHECK-NEXT: [[TOBOOL:%.*]] = fcmp une double [[TMP21]], 0.000000e+00 ; CHECK-NEXT: br i1 [[TOBOOL]], label [[IF_THEN:%.*]], label [[FOR_INC]] ; CHECK: if.then: ; CHECK-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds double, double* [[A]], i64 [[I_08]] -; CHECK-NEXT: [[TMP20:%.*]] = load double, double* [[ARRAYIDX1]], align 8 -; CHECK-NEXT: [[ADD:%.*]] = fadd double [[TMP20]], 1.000000e+00 +; CHECK-NEXT: [[TMP22:%.*]] = load double, double* [[ARRAYIDX1]], align 8 +; CHECK-NEXT: [[ADD:%.*]] = fadd double [[TMP22]], 1.000000e+00 ; CHECK-NEXT: store double [[ADD]], double* [[ARRAYIDX1]], align 8 ; CHECK-NEXT: br label [[FOR_INC]] ; CHECK: for.inc: diff --git a/llvm/test/Transforms/LoopVectorize/ARM/mve-reductions.ll b/llvm/test/Transforms/LoopVectorize/ARM/mve-reductions.ll index 3e8ac1bad93c..e66fbede57b7 100644 --- a/llvm/test/Transforms/LoopVectorize/ARM/mve-reductions.ll +++ b/llvm/test/Transforms/LoopVectorize/ARM/mve-reductions.ll @@ -1367,26 +1367,28 @@ define i32 @reduction_interleave_group(i32 %n, i32* %arr) #0 { ; CHECK-NEXT: br label [[VECTOR_BODY:%.*]] ; CHECK: vector.body: ; CHECK-NEXT: [[INDEX:%.*]] = phi i32 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ] -; CHECK-NEXT: [[VEC_PHI:%.*]] = phi i32 [ 0, [[VECTOR_PH]] ], [ [[TMP8:%.*]], [[VECTOR_BODY]] ] +; CHECK-NEXT: [[VEC_PHI:%.*]] = phi i32 [ 0, [[VECTOR_PH]] ], [ [[TMP10:%.*]], [[VECTOR_BODY]] ] ; CHECK-NEXT: [[OFFSET_IDX:%.*]] = shl i32 [[INDEX]], 1 -; CHECK-NEXT: [[TMP3:%.*]] = getelementptr inbounds i32, i32* [[ARR:%.*]], i32 [[OFFSET_IDX]] -; CHECK-NEXT: [[TMP4:%.*]] = bitcast i32* [[TMP3]] to <8 x i32>* -; CHECK-NEXT: [[WIDE_VEC:%.*]] = load <8 x i32>, <8 x i32>* [[TMP4]], align 4 +; CHECK-NEXT: [[TMP3:%.*]] = or i32 [[OFFSET_IDX]], 1 +; CHECK-NEXT: [[TMP4:%.*]] = getelementptr inbounds i32, i32* [[ARR:%.*]], i32 -1 +; CHECK-NEXT: [[TMP5:%.*]] = getelementptr inbounds i32, i32* [[TMP4]], i32 [[TMP3]] +; CHECK-NEXT: [[TMP6:%.*]] = bitcast i32* [[TMP5]] to <8 x i32>* +; CHECK-NEXT: [[WIDE_VEC:%.*]] = load <8 x i32>, <8 x i32>* [[TMP6]], align 4 ; CHECK-NEXT: [[STRIDED_VEC:%.*]] = shufflevector <8 x i32> [[WIDE_VEC]], <8 x i32> poison, <4 x i32> <i32 0, i32 2, i32 4, i32 6> ; CHECK-NEXT: [[STRIDED_VEC1:%.*]] = shufflevector <8 x i32> [[WIDE_VEC]], <8 x i32> poison, <4 x i32> <i32 1, i32 3, i32 5, i32 7> -; CHECK-NEXT: [[TMP5:%.*]] = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> [[STRIDED_VEC1]]) -; CHECK-NEXT: [[TMP6:%.*]] = add i32 [[TMP5]], [[VEC_PHI]] -; CHECK-NEXT: [[TMP7:%.*]] = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> [[STRIDED_VEC]]) -; CHECK-NEXT: [[TMP8]] = add i32 [[TMP7]], [[TMP6]] +; CHECK-NEXT: [[TMP7:%.*]] = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> [[STRIDED_VEC1]]) +; CHECK-NEXT: [[TMP8:%.*]] = add i32 [[TMP7]], [[VEC_PHI]] +; CHECK-NEXT: [[TMP9:%.*]] = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> [[STRIDED_VEC]]) +; CHECK-NEXT: [[TMP10]] = add i32 [[TMP9]], [[TMP8]] ; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i32 [[INDEX]], 4 -; CHECK-NEXT: [[TMP9:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]] -; CHECK-NEXT: br i1 [[TMP9]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP26:![0-9]+]] +; CHECK-NEXT: [[TMP11:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]] +; CHECK-NEXT: br i1 [[TMP11]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP30:![0-9]+]] ; CHECK: middle.block: ; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i32 [[TMP2]], [[N_VEC]] ; CHECK-NEXT: br i1 [[CMP_N]], label [[EXIT]], label [[SCALAR_PH]] ; CHECK: scalar.ph: ; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i32 [ [[IND_END]], [[MIDDLE_BLOCK]] ], [ 0, [[FOR_BODY_PREHEADER]] ] -; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi i32 [ [[TMP8]], [[MIDDLE_BLOCK]] ], [ 0, [[FOR_BODY_PREHEADER]] ] +; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi i32 [ [[TMP10]], [[MIDDLE_BLOCK]] ], [ 0, [[FOR_BODY_PREHEADER]] ] ; CHECK-NEXT: br label [[FOR_BODY:%.*]] ; CHECK: for.body: ; CHECK-NEXT: [[IV:%.*]] = phi i32 [ [[IV_NEXT:%.*]], [[FOR_BODY]] ], [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ] @@ -1402,7 +1404,7 @@ define i32 @reduction_interleave_group(i32 %n, i32* %arr) #0 { ; CHECK-NEXT: [[CMP:%.*]] = icmp slt i32 [[IV_NEXT]], [[N]] ; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[EXIT]], !llvm.loop [[LOOP31:![0-9]+]] ; CHECK: exit: -; CHECK-NEXT: [[RET_LCSSA:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[RED_2]], [[FOR_BODY]] ], [ [[TMP8]], [[MIDDLE_BLOCK]] ] +; CHECK-NEXT: [[RET_LCSSA:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[RED_2]], [[FOR_BODY]] ], [ [[TMP10]], [[MIDDLE_BLOCK]] ] ; CHECK-NEXT: ret i32 [[RET_LCSSA]] ; entry: diff --git a/llvm/test/Transforms/LoopVectorize/X86/x86-interleaved-accesses-masked-group.ll b/llvm/test/Transforms/LoopVectorize/X86/x86-interleaved-accesses-masked-group.ll index a80140fea413..884d743a1bad 100644 --- a/llvm/test/Transforms/LoopVectorize/X86/x86-interleaved-accesses-masked-group.ll +++ b/llvm/test/Transforms/LoopVectorize/X86/x86-interleaved-accesses-masked-group.ll @@ -1439,17 +1439,19 @@ define dso_local void @masked_strided2(i8* noalias nocapture readonly %p, i8* no ; ENABLED_MASKED_STRIDED-NEXT: [[WIDE_MASKED_VEC:%.*]] = call <16 x i8> @llvm.masked.load.v16i8.p0v16i8(<16 x i8>* [[TMP3]], i32 1, <16 x i1> [[INTERLEAVED_MASK]], <16 x i8> poison) ; ENABLED_MASKED_STRIDED-NEXT: [[STRIDED_VEC:%.*]] = shufflevector <16 x i8> [[WIDE_MASKED_VEC]], <16 x i8> poison, <8 x i32> <i32 0, i32 2, i32 4, i32 6, i32 8, i32 10, i32 12, i32 14> ; ENABLED_MASKED_STRIDED-NEXT: [[STRIDED_VEC1:%.*]] = shufflevector <16 x i8> [[WIDE_MASKED_VEC]], <16 x i8> poison, <8 x i32> <i32 1, i32 3, i32 5, i32 7, i32 9, i32 11, i32 13, i32 15> -; ENABLED_MASKED_STRIDED-NEXT: [[TMP4:%.*]] = icmp slt <8 x i8> [[STRIDED_VEC]], [[STRIDED_VEC1]] -; ENABLED_MASKED_STRIDED-NEXT: [[TMP5:%.*]] = select <8 x i1> [[TMP4]], <8 x i8> [[STRIDED_VEC1]], <8 x i8> [[STRIDED_VEC]] -; ENABLED_MASKED_STRIDED-NEXT: [[TMP6:%.*]] = sub <8 x i8> zeroinitializer, [[TMP5]] -; ENABLED_MASKED_STRIDED-NEXT: [[TMP7:%.*]] = getelementptr inbounds i8, i8* [[Q:%.*]], i32 [[TMP1]] -; ENABLED_MASKED_STRIDED-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to <16 x i8>* -; ENABLED_MASKED_STRIDED-NEXT: [[INTERLEAVED_VEC:%.*]] = shufflevector <8 x i8> [[TMP5]], <8 x i8> [[TMP6]], <16 x i32> <i32 0, i32 8, i32 1, i32 9, i32 2, i32 10, i32 3, i32 11, i32 4, i32 12, i32 5, i32 13, i32 6, i32 14, i32 7, i32 15> -; ENABLED_MASKED_STRIDED-NEXT: call void @llvm.masked.store.v16i8.p0v16i8(<16 x i8> [[INTERLEAVED_VEC]], <16 x i8>* [[TMP8]], i32 1, <16 x i1> [[INTERLEAVED_MASK]]) +; ENABLED_MASKED_STRIDED-NEXT: [[TMP4:%.*]] = or i32 [[TMP1]], 1 +; ENABLED_MASKED_STRIDED-NEXT: [[TMP5:%.*]] = icmp slt <8 x i8> [[STRIDED_VEC]], [[STRIDED_VEC1]] +; ENABLED_MASKED_STRIDED-NEXT: [[TMP6:%.*]] = select <8 x i1> [[TMP5]], <8 x i8> [[STRIDED_VEC1]], <8 x i8> [[STRIDED_VEC]] +; ENABLED_MASKED_STRIDED-NEXT: [[TMP7:%.*]] = sub <8 x i8> zeroinitializer, [[TMP6]] +; ENABLED_MASKED_STRIDED-NEXT: [[TMP8:%.*]] = getelementptr inbounds i8, i8* [[Q:%.*]], i32 -1 +; ENABLED_MASKED_STRIDED-NEXT: [[TMP9:%.*]] = getelementptr inbounds i8, i8* [[TMP8]], i32 [[TMP4]] +; ENABLED_MASKED_STRIDED-NEXT: [[TMP10:%.*]] = bitcast i8* [[TMP9]] to <16 x i8>* +; ENABLED_MASKED_STRIDED-NEXT: [[INTERLEAVED_VEC:%.*]] = shufflevector <8 x i8> [[TMP6]], <8 x i8> [[TMP7]], <16 x i32> <i32 0, i32 8, i32 1, i32 9, i32 2, i32 10, i32 3, i32 11, i32 4, i32 12, i32 5, i32 13, i32 6, i32 14, i32 7, i32 15> +; ENABLED_MASKED_STRIDED-NEXT: call void @llvm.masked.store.v16i8.p0v16i8(<16 x i8> [[INTERLEAVED_VEC]], <16 x i8>* [[TMP10]], i32 1, <16 x i1> [[INTERLEAVED_MASK]]) ; ENABLED_MASKED_STRIDED-NEXT: [[INDEX_NEXT]] = add nuw i32 [[INDEX]], 8 ; ENABLED_MASKED_STRIDED-NEXT: [[VEC_IND_NEXT]] = add <8 x i32> [[VEC_IND]], <i32 8, i32 8, i32 8, i32 8, i32 8, i32 8, i32 8, i32 8> -; ENABLED_MASKED_STRIDED-NEXT: [[TMP9:%.*]] = icmp eq i32 [[INDEX_NEXT]], 1024 -; ENABLED_MASKED_STRIDED-NEXT: br i1 [[TMP9]], label [[FOR_END:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP9:![0-9]+]] +; ENABLED_MASKED_STRIDED-NEXT: [[TMP11:%.*]] = icmp eq i32 [[INDEX_NEXT]], 1024 +; ENABLED_MASKED_STRIDED-NEXT: br i1 [[TMP11]], label [[FOR_END:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP9:![0-9]+]] ; ENABLED_MASKED_STRIDED: for.end: ; ENABLED_MASKED_STRIDED-NEXT: ret void ; @@ -1875,17 +1877,19 @@ define dso_local void @masked_strided2_unknown_tc(i8* noalias nocapture readonly ; ENABLED_MASKED_STRIDED-NEXT: [[WIDE_MASKED_VEC:%.*]] = call <16 x i8> @llvm.masked.load.v16i8.p0v16i8(<16 x i8>* [[TMP5]], i32 1, <16 x i1> [[INTERLEAVED_MASK]], <16 x i8> poison) ; ENABLED_MASKED_STRIDED-NEXT: [[STRIDED_VEC:%.*]] = shufflevector <16 x i8> [[WIDE_MASKED_VEC]], <16 x i8> poison, <8 x i32> <i32 0, i32 2, i32 4, i32 6, i32 8, i32 10, i32 12, i32 14> ; ENABLED_MASKED_STRIDED-NEXT: [[STRIDED_VEC3:%.*]] = shufflevector <16 x i8> [[WIDE_MASKED_VEC]], <16 x i8> poison, <8 x i32> <i32 1, i32 3, i32 5, i32 7, i32 9, i32 11, i32 13, i32 15> -; ENABLED_MASKED_STRIDED-NEXT: [[TMP6:%.*]] = icmp slt <8 x i8> [[STRIDED_VEC]], [[STRIDED_VEC3]] -; ENABLED_MASKED_STRIDED-NEXT: [[TMP7:%.*]] = select <8 x i1> [[TMP6]], <8 x i8> [[STRIDED_VEC3]], <8 x i8> [[STRIDED_VEC]] -; ENABLED_MASKED_STRIDED-NEXT: [[TMP8:%.*]] = sub <8 x i8> zeroinitializer, [[TMP7]] -; ENABLED_MASKED_STRIDED-NEXT: [[TMP9:%.*]] = getelementptr inbounds i8, i8* [[Q:%.*]], i32 [[TMP2]] -; ENABLED_MASKED_STRIDED-NEXT: [[TMP10:%.*]] = bitcast i8* [[TMP9]] to <16 x i8>* -; ENABLED_MASKED_STRIDED-NEXT: [[INTERLEAVED_VEC:%.*]] = shufflevector <8 x i8> [[TMP7]], <8 x i8> [[TMP8]], <16 x i32> <i32 0, i32 8, i32 1, i32 9, i32 2, i32 10, i32 3, i32 11, i32 4, i32 12, i32 5, i32 13, i32 6, i32 14, i32 7, i32 15> -; ENABLED_MASKED_STRIDED-NEXT: call void @llvm.masked.store.v16i8.p0v16i8(<16 x i8> [[INTERLEAVED_VEC]], <16 x i8>* [[TMP10]], i32 1, <16 x i1> [[INTERLEAVED_MASK]]) +; ENABLED_MASKED_STRIDED-NEXT: [[TMP6:%.*]] = or i32 [[TMP2]], 1 +; ENABLED_MASKED_STRIDED-NEXT: [[TMP7:%.*]] = icmp slt <8 x i8> [[STRIDED_VEC]], [[STRIDED_VEC3]] +; ENABLED_MASKED_STRIDED-NEXT: [[TMP8:%.*]] = select <8 x i1> [[TMP7]], <8 x i8> [[STRIDED_VEC3]], <8 x i8> [[STRIDED_VEC]] +; ENABLED_MASKED_STRIDED-NEXT: [[TMP9:%.*]] = sub <8 x i8> zeroinitializer, [[TMP8]] +; ENABLED_MASKED_STRIDED-NEXT: [[TMP10:%.*]] = getelementptr inbounds i8, i8* [[Q:%.*]], i32 -1 +; ENABLED_MASKED_STRIDED-NEXT: [[TMP11:%.*]] = getelementptr inbounds i8, i8* [[TMP10]], i32 [[TMP6]] +; ENABLED_MASKED_STRIDED-NEXT: [[TMP12:%.*]] = bitcast i8* [[TMP11]] to <16 x i8>* +; ENABLED_MASKED_STRIDED-NEXT: [[INTERLEAVED_VEC:%.*]] = shufflevector <8 x i8> [[TMP8]], <8 x i8> [[TMP9]], <16 x i32> <i32 0, i32 8, i32 1, i32 9, i32 2, i32 10, i32 3, i32 11, i32 4, i32 12, i32 5, i32 13, i32 6, i32 14, i32 7, i32 15> +; ENABLED_MASKED_STRIDED-NEXT: call void @llvm.masked.store.v16i8.p0v16i8(<16 x i8> [[INTERLEAVED_VEC]], <16 x i8>* [[TMP12]], i32 1, <16 x i1> [[INTERLEAVED_MASK]]) ; ENABLED_MASKED_STRIDED-NEXT: [[INDEX_NEXT]] = add i32 [[INDEX]], 8 ; ENABLED_MASKED_STRIDED-NEXT: [[VEC_IND_NEXT]] = add <8 x i32> [[VEC_IND]], <i32 8, i32 8, i32 8, i32 8, i32 8, i32 8, i32 8, i32 8> -; ENABLED_MASKED_STRIDED-NEXT: [[TMP11:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]] -; ENABLED_MASKED_STRIDED-NEXT: br i1 [[TMP11]], label [[FOR_END]], label [[VECTOR_BODY]], !llvm.loop [[LOOP10:![0-9]+]] +; ENABLED_MASKED_STRIDED-NEXT: [[TMP13:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]] +; ENABLED_MASKED_STRIDED-NEXT: br i1 [[TMP13]], label [[FOR_END]], label [[VECTOR_BODY]], !llvm.loop [[LOOP10:![0-9]+]] ; ENABLED_MASKED_STRIDED: for.end: ; ENABLED_MASKED_STRIDED-NEXT: ret void ; @@ -2311,16 +2315,18 @@ define dso_local void @unconditional_masked_strided2_unknown_tc(i8* noalias noca ; ENABLED_MASKED_STRIDED-NEXT: [[WIDE_MASKED_VEC:%.*]] = call <16 x i8> @llvm.masked.load.v16i8.p0v16i8(<16 x i8>* [[TMP3]], i32 1, <16 x i1> [[INTERLEAVED_MASK]], <16 x i8> poison) ; ENABLED_MASKED_STRIDED-NEXT: [[STRIDED_VEC:%.*]] = shufflevector <16 x i8> [[WIDE_MASKED_VEC]], <16 x i8> poison, <8 x i32> <i32 0, i32 2, i32 4, i32 6, i32 8, i32 10, i32 12, i32 14> ; ENABLED_MASKED_STRIDED-NEXT: [[STRIDED_VEC3:%.*]] = shufflevector <16 x i8> [[WIDE_MASKED_VEC]], <16 x i8> poison, <8 x i32> <i32 1, i32 3, i32 5, i32 7, i32 9, i32 11, i32 13, i32 15> -; ENABLED_MASKED_STRIDED-NEXT: [[TMP4:%.*]] = icmp slt <8 x i8> [[STRIDED_VEC]], [[STRIDED_VEC3]] -; ENABLED_MASKED_STRIDED-NEXT: [[TMP5:%.*]] = select <8 x i1> [[TMP4]], <8 x i8> [[STRIDED_VEC3]], <8 x i8> [[STRIDED_VEC]] -; ENABLED_MASKED_STRIDED-NEXT: [[TMP6:%.*]] = sub <8 x i8> zeroinitializer, [[TMP5]] -; ENABLED_MASKED_STRIDED-NEXT: [[TMP7:%.*]] = getelementptr inbounds i8, i8* [[Q:%.*]], i32 [[TMP1]] -; ENABLED_MASKED_STRIDED-NEXT: [[TMP8:%.*]] = bitcast i8* [[TMP7]] to <16 x i8>* -; ENABLED_MASKED_STRIDED-NEXT: [[INTERLEAVED_VEC:%.*]] = shufflevector <8 x i8> [[TMP5]], <8 x i8> [[TMP6]], <16 x i32> <i32 0, i32 8, i32 1, i32 9, i32 2, i32 10, i32 3, i32 11, i32 4, i32 12, i32 5, i32 13, i32 6, i32 14, i32 7, i32 15> -; ENABLED_MASKED_STRIDED-NEXT: call void @llvm.masked.store.v16i8.p0v16i8(<16 x i8> [[INTERLEAVED_VEC]], <16 x i8>* [[TMP8]], i32 1, <16 x i1> [[INTERLEAVED_MASK]]) +; ENABLED_MASKED_STRIDED-NEXT: [[TMP4:%.*]] = or i32 [[TMP1]], 1 +; ENABLED_MASKED_STRIDED-NEXT: [[TMP5:%.*]] = icmp slt <8 x i8> [[STRIDED_VEC]], [[STRIDED_VEC3]] +; ENABLED_MASKED_STRIDED-NEXT: [[TMP6:%.*]] = select <8 x i1> [[TMP5]], <8 x i8> [[STRIDED_VEC3]], <8 x i8> [[STRIDED_VEC]] +; ENABLED_MASKED_STRIDED-NEXT: [[TMP7:%.*]] = sub <8 x i8> zeroinitializer, [[TMP6]] +; ENABLED_MASKED_STRIDED-NEXT: [[TMP8:%.*]] = getelementptr inbounds i8, i8* [[Q:%.*]], i32 -1 +; ENABLED_MASKED_STRIDED-NEXT: [[TMP9:%.*]] = getelementptr inbounds i8, i8* [[TMP8]], i32 [[TMP4]] +; ENABLED_MASKED_STRIDED-NEXT: [[TMP10:%.*]] = bitcast i8* [[TMP9]] to <16 x i8>* +; ENABLED_MASKED_STRIDED-NEXT: [[INTERLEAVED_VEC:%.*]] = shufflevector <8 x i8> [[TMP6]], <8 x i8> [[TMP7]], <16 x i32> <i32 0, i32 8, i32 1, i32 9, i32 2, i32 10, i32 3, i32 11, i32 4, i32 12, i32 5, i32 13, i32 6, i32 14, i32 7, i32 15> +; ENABLED_MASKED_STRIDED-NEXT: call void @llvm.masked.store.v16i8.p0v16i8(<16 x i8> [[INTERLEAVED_VEC]], <16 x i8>* [[TMP10]], i32 1, <16 x i1> [[INTERLEAVED_MASK]]) ; ENABLED_MASKED_STRIDED-NEXT: [[INDEX_NEXT]] = add i32 [[INDEX]], 8 -; ENABLED_MASKED_STRIDED-NEXT: [[TMP9:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]] -; ENABLED_MASKED_STRIDED-NEXT: br i1 [[TMP9]], label [[FOR_END]], label [[VECTOR_BODY]], !llvm.loop [[LOOP11:![0-9]+]] +; ENABLED_MASKED_STRIDED-NEXT: [[TMP11:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]] +; ENABLED_MASKED_STRIDED-NEXT: br i1 [[TMP11]], label [[FOR_END]], label [[VECTOR_BODY]], !llvm.loop [[LOOP11:![0-9]+]] ; ENABLED_MASKED_STRIDED: for.end: ; ENABLED_MASKED_STRIDED-NEXT: ret void ; diff --git a/llvm/test/Transforms/LoopVectorize/X86/x86-interleaved-store-accesses-with-gaps.ll b/llvm/test/Transforms/LoopVectorize/X86/x86-interleaved-store-accesses-with-gaps.ll index 65838c1f4b02..24bedad51ae1 100644 --- a/llvm/test/Transforms/LoopVectorize/X86/x86-interleaved-store-accesses-with-gaps.ll +++ b/llvm/test/Transforms/LoopVectorize/X86/x86-interleaved-store-accesses-with-gaps.ll @@ -74,23 +74,25 @@ define dso_local void @test1(i16* noalias nocapture %points, i16* noalias nocapt ; ; ENABLED_MASKED_STRIDED-LABEL: @test1( ; ENABLED_MASKED_STRIDED-NEXT: entry: +; ENABLED_MASKED_STRIDED-NEXT: [[TMP0:%.*]] = getelementptr inbounds i16, i16* [[POINTS:%.*]], i64 -1 ; ENABLED_MASKED_STRIDED-NEXT: br label [[VECTOR_BODY:%.*]] ; ENABLED_MASKED_STRIDED: vector.body: ; ENABLED_MASKED_STRIDED-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ] -; ENABLED_MASKED_STRIDED-NEXT: [[TMP0:%.*]] = getelementptr inbounds i16, i16* [[X:%.*]], i64 [[INDEX]] -; ENABLED_MASKED_STRIDED-NEXT: [[TMP1:%.*]] = bitcast i16* [[TMP0]] to <4 x i16>* -; ENABLED_MASKED_STRIDED-NEXT: [[WIDE_LOAD:%.*]] = load <4 x i16>, <4 x i16>* [[TMP1]], align 2 -; ENABLED_MASKED_STRIDED-NEXT: [[TMP2:%.*]] = shl nuw nsw i64 [[INDEX]], 2 -; ENABLED_MASKED_STRIDED-NEXT: [[TMP3:%.*]] = getelementptr inbounds i16, i16* [[Y:%.*]], i64 [[INDEX]] -; ENABLED_MASKED_STRIDED-NEXT: [[TMP4:%.*]] = bitcast i16* [[TMP3]] to <4 x i16>* -; ENABLED_MASKED_STRIDED-NEXT: [[WIDE_LOAD1:%.*]] = load <4 x i16>, <4 x i16>* [[TMP4]], align 2 -; ENABLED_MASKED_STRIDED-NEXT: [[TMP5:%.*]] = getelementptr inbounds i16, i16* [[POINTS:%.*]], i64 [[TMP2]] -; ENABLED_MASKED_STRIDED-NEXT: [[TMP6:%.*]] = bitcast i16* [[TMP5]] to <16 x i16>* +; ENABLED_MASKED_STRIDED-NEXT: [[TMP1:%.*]] = getelementptr inbounds i16, i16* [[X:%.*]], i64 [[INDEX]] +; ENABLED_MASKED_STRIDED-NEXT: [[TMP2:%.*]] = bitcast i16* [[TMP1]] to <4 x i16>* +; ENABLED_MASKED_STRIDED-NEXT: [[WIDE_LOAD:%.*]] = load <4 x i16>, <4 x i16>* [[TMP2]], align 2 +; ENABLED_MASKED_STRIDED-NEXT: [[TMP3:%.*]] = shl nuw nsw i64 [[INDEX]], 2 +; ENABLED_MASKED_STRIDED-NEXT: [[TMP4:%.*]] = getelementptr inbounds i16, i16* [[Y:%.*]], i64 [[INDEX]] +; ENABLED_MASKED_STRIDED-NEXT: [[TMP5:%.*]] = bitcast i16* [[TMP4]] to <4 x i16>* +; ENABLED_MASKED_STRIDED-NEXT: [[WIDE_LOAD1:%.*]] = load <4 x i16>, <4 x i16>* [[TMP5]], align 2 +; ENABLED_MASKED_STRIDED-NEXT: [[TMP6:%.*]] = or i64 [[TMP3]], 1 +; ENABLED_MASKED_STRIDED-NEXT: [[TMP7:%.*]] = getelementptr inbounds i16, i16* [[TMP0]], i64 [[TMP6]] +; ENABLED_MASKED_STRIDED-NEXT: [[TMP8:%.*]] = bitcast i16* [[TMP7]] to <16 x i16>* ; ENABLED_MASKED_STRIDED-NEXT: [[INTERLEAVED_VEC:%.*]] = shufflevector <4 x i16> [[WIDE_LOAD]], <4 x i16> [[WIDE_LOAD1]], <16 x i32> <i32 0, i32 4, i32 undef, i32 undef, i32 1, i32 5, i32 undef, i32 undef, i32 2, i32 6, i32 undef, i32 undef, i32 3, i32 7, i32 undef, i32 undef> -; ENABLED_MASKED_STRIDED-NEXT: call void @llvm.masked.store.v16i16.p0v16i16(<16 x i16> [[INTERLEAVED_VEC]], <16 x i16>* [[TMP6]], i32 2, <16 x i1> <i1 true, i1 true, i1 false, i1 false, i1 true, i1 true, i1 false, i1 false, i1 true, i1 true, i1 false, i1 false, i1 true, i1 true, i1 false, i1 false>) +; ENABLED_MASKED_STRIDED-NEXT: call void @llvm.masked.store.v16i16.p0v16i16(<16 x i16> [[INTERLEAVED_VEC]], <16 x i16>* [[TMP8]], i32 2, <16 x i1> <i1 true, i1 true, i1 false, i1 false, i1 true, i1 true, i1 false, i1 false, i1 true, i1 true, i1 false, i1 false, i1 true, i1 true, i1 false, i1 false>) ; ENABLED_MASKED_STRIDED-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4 -; ENABLED_MASKED_STRIDED-NEXT: [[TMP7:%.*]] = icmp eq i64 [[INDEX_NEXT]], 1024 -; ENABLED_MASKED_STRIDED-NEXT: br i1 [[TMP7]], label [[FOR_END:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]] +; ENABLED_MASKED_STRIDED-NEXT: [[TMP9:%.*]] = icmp eq i64 [[INDEX_NEXT]], 1024 +; ENABLED_MASKED_STRIDED-NEXT: br i1 [[TMP9]], label [[FOR_END:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]] ; ENABLED_MASKED_STRIDED: for.end: ; ENABLED_MASKED_STRIDED-NEXT: ret void ; @@ -244,29 +246,31 @@ define dso_local void @test2(i16* noalias nocapture %points, i32 %numPoints, i16 ; ENABLED_MASKED_STRIDED-NEXT: [[TRIP_COUNT_MINUS_1:%.*]] = add nsw i64 [[WIDE_TRIP_COUNT]], -1 ; ENABLED_MASKED_STRIDED-NEXT: [[BROADCAST_SPLATINSERT:%.*]] = insertelement <4 x i64> poison, i64 [[TRIP_COUNT_MINUS_1]], i32 0 ; ENABLED_MASKED_STRIDED-NEXT: [[BROADCAST_SPLAT:%.*]] = shufflevector <4 x i64> [[BROADCAST_SPLATINSERT]], <4 x i64> poison, <4 x i32> zeroinitializer +; ENABLED_MASKED_STRIDED-NEXT: [[TMP0:%.*]] = getelementptr inbounds i16, i16* [[POINTS:%.*]], i64 -1 ; ENABLED_MASKED_STRIDED-NEXT: br label [[VECTOR_BODY:%.*]] ; ENABLED_MASKED_STRIDED: vector.body: ; ENABLED_MASKED_STRIDED-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[FOR_BODY_PREHEADER]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ] ; ENABLED_MASKED_STRIDED-NEXT: [[BROADCAST_SPLATINSERT1:%.*]] = insertelement <4 x i64> poison, i64 [[INDEX]], i32 0 ; ENABLED_MASKED_STRIDED-NEXT: [[BROADCAST_SPLAT2:%.*]] = shufflevector <4 x i64> [[BROADCAST_SPLATINSERT1]], <4 x i64> poison, <4 x i32> zeroinitializer ; ENABLED_MASKED_STRIDED-NEXT: [[INDUCTION:%.*]] = or <4 x i64> [[BROADCAST_SPLAT2]], <i64 0, i64 1, i64 2, i64 3> -; ENABLED_MASKED_STRIDED-NEXT: [[TMP0:%.*]] = icmp ule <4 x i64> [[INDUCTION]], [[BROADCAST_SPLAT]] -; ENABLED_MASKED_STRIDED-NEXT: [[TMP1:%.*]] = getelementptr inbounds i16, i16* [[X:%.*]], i64 [[INDEX]] -; ENABLED_MASKED_STRIDED-NEXT: [[TMP2:%.*]] = bitcast i16* [[TMP1]] to <4 x i16>* -; ENABLED_MASKED_STRIDED-NEXT: [[WIDE_MASKED_LOAD:%.*]] = call <4 x i16> @llvm.masked.load.v4i16.p0v4i16(<4 x i16>* [[TMP2]], i32 2, <4 x i1> [[TMP0]], <4 x i16> poison) -; ENABLED_MASKED_STRIDED-NEXT: [[TMP3:%.*]] = shl nsw i64 [[INDEX]], 2 -; ENABLED_MASKED_STRIDED-NEXT: [[TMP4:%.*]] = getelementptr inbounds i16, i16* [[Y:%.*]], i64 [[INDEX]] -; ENABLED_MASKED_STRIDED-NEXT: [[TMP5:%.*]] = bitcast i16* [[TMP4]] to <4 x i16>* -; ENABLED_MASKED_STRIDED-NEXT: [[WIDE_MASKED_LOAD3:%.*]] = call <4 x i16> @llvm.masked.load.v4i16.p0v4i16(<4 x i16>* [[TMP5]], i32 2, <4 x i1> [[TMP0]], <4 x i16> poison) -; ENABLED_MASKED_STRIDED-NEXT: [[TMP6:%.*]] = getelementptr inbounds i16, i16* [[POINTS:%.*]], i64 [[TMP3]] -; ENABLED_MASKED_STRIDED-NEXT: [[TMP7:%.*]] = bitcast i16* [[TMP6]] to <16 x i16>* +; ENABLED_MASKED_STRIDED-NEXT: [[TMP1:%.*]] = icmp ule <4 x i64> [[INDUCTION]], [[BROADCAST_SPLAT]] +; ENABLED_MASKED_STRIDED-NEXT: [[TMP2:%.*]] = getelementptr inbounds i16, i16* [[X:%.*]], i64 [[INDEX]] +; ENABLED_MASKED_STRIDED-NEXT: [[TMP3:%.*]] = bitcast i16* [[TMP2]] to <4 x i16>* +; ENABLED_MASKED_STRIDED-NEXT: [[WIDE_MASKED_LOAD:%.*]] = call <4 x i16> @llvm.masked.load.v4i16.p0v4i16(<4 x i16>* [[TMP3]], i32 2, <4 x i1> [[TMP1]], <4 x i16> poison) +; ENABLED_MASKED_STRIDED-NEXT: [[TMP4:%.*]] = shl nsw i64 [[INDEX]], 2 +; ENABLED_MASKED_STRIDED-NEXT: [[TMP5:%.*]] = getelementptr inbounds i16, i16* [[Y:%.*]], i64 [[INDEX]] +; ENABLED_MASKED_STRIDED-NEXT: [[TMP6:%.*]] = bitcast i16* [[TMP5]] to <4 x i16>* +; ENABLED_MASKED_STRIDED-NEXT: [[WIDE_MASKED_LOAD3:%.*]] = call <4 x i16> @llvm.masked.load.v4i16.p0v4i16(<4 x i16>* [[TMP6]], i32 2, <4 x i1> [[TMP1]], <4 x i16> poison) +; ENABLED_MASKED_STRIDED-NEXT: [[TMP7:%.*]] = or i64 [[TMP4]], 1 +; ENABLED_MASKED_STRIDED-NEXT: [[TMP8:%.*]] = getelementptr inbounds i16, i16* [[TMP0]], i64 [[TMP7]] +; ENABLED_MASKED_STRIDED-NEXT: [[TMP9:%.*]] = bitcast i16* [[TMP8]] to <16 x i16>* ; ENABLED_MASKED_STRIDED-NEXT: [[INTERLEAVED_VEC:%.*]] = shufflevector <4 x i16> [[WIDE_MASKED_LOAD]], <4 x i16> [[WIDE_MASKED_LOAD3]], <16 x i32> <i32 0, i32 4, i32 undef, i32 undef, i32 1, i32 5, i32 undef, i32 undef, i32 2, i32 6, i32 undef, i32 undef, i32 3, i32 7, i32 undef, i32 undef> -; ENABLED_MASKED_STRIDED-NEXT: [[INTERLEAVED_MASK:%.*]] = shufflevector <4 x i1> [[TMP0]], <4 x i1> poison, <16 x i32> <i32 0, i32 0, i32 0, i32 0, i32 1, i32 1, i32 1, i32 1, i32 2, i32 2, i32 2, i32 2, i32 3, i32 3, i32 3, i32 3> -; ENABLED_MASKED_STRIDED-NEXT: [[TMP8:%.*]] = and <16 x i1> [[INTERLEAVED_MASK]], <i1 true, i1 true, i1 false, i1 false, i1 true, i1 true, i1 false, i1 false, i1 true, i1 true, i1 false, i1 false, i1 true, i1 true, i1 false, i1 false> -; ENABLED_MASKED_STRIDED-NEXT: call void @llvm.masked.store.v16i16.p0v16i16(<16 x i16> [[INTERLEAVED_VEC]], <16 x i16>* [[TMP7]], i32 2, <16 x i1> [[TMP8]]) +; ENABLED_MASKED_STRIDED-NEXT: [[INTERLEAVED_MASK:%.*]] = shufflevector <4 x i1> [[TMP1]], <4 x i1> poison, <16 x i32> <i32 0, i32 0, i32 0, i32 0, i32 1, i32 1, i32 1, i32 1, i32 2, i32 2, i32 2, i32 2, i32 3, i32 3, i32 3, i32 3> +; ENABLED_MASKED_STRIDED-NEXT: [[TMP10:%.*]] = and <16 x i1> [[INTERLEAVED_MASK]], <i1 true, i1 true, i1 false, i1 false, i1 true, i1 true, i1 false, i1 false, i1 true, i1 true, i1 false, i1 false, i1 true, i1 true, i1 false, i1 false> +; ENABLED_MASKED_STRIDED-NEXT: call void @llvm.masked.store.v16i16.p0v16i16(<16 x i16> [[INTERLEAVED_VEC]], <16 x i16>* [[TMP9]], i32 2, <16 x i1> [[TMP10]]) ; ENABLED_MASKED_STRIDED-NEXT: [[INDEX_NEXT]] = add i64 [[INDEX]], 4 -; ENABLED_MASKED_STRIDED-NEXT: [[TMP9:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]] -; ENABLED_MASKED_STRIDED-NEXT: br i1 [[TMP9]], label [[FOR_END_LOOPEXIT:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP2:![0-9]+]] +; ENABLED_MASKED_STRIDED-NEXT: [[TMP11:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]] +; ENABLED_MASKED_STRIDED-NEXT: br i1 [[TMP11]], label [[FOR_END_LOOPEXIT:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP2:![0-9]+]] ; ENABLED_MASKED_STRIDED: for.end.loopexit: ; ENABLED_MASKED_STRIDED-NEXT: br label [[FOR_END]] ; ENABLED_MASKED_STRIDED: for.end: diff --git a/llvm/test/Transforms/LoopVectorize/consecutive-ptr-uniforms.ll b/llvm/test/Transforms/LoopVectorize/consecutive-ptr-uniforms.ll index 89c6efa6945c..0a127ad4ef88 100644 --- a/llvm/test/Transforms/LoopVectorize/consecutive-ptr-uniforms.ll +++ b/llvm/test/Transforms/LoopVectorize/consecutive-ptr-uniforms.ll @@ -50,8 +50,8 @@ for.end: ; CHECK: %index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ] ; CHECK: %offset.idx = sub i64 %n, %index ; CHECK-NOT: getelementptr -; CHECK: %[[G0IDX:.+]] = add nsw i64 %offset.idx, -3 -; CHECK: getelementptr inbounds i32, i32* %a, i64 %[[G0IDX]] +; CHECK: %[[G0:.+]] = getelementptr inbounds i32, i32* %a, i64 -3 +; CHECK: getelementptr inbounds i32, i32* %[[G0]], i64 %offset.idx ; CHECK-NOT: getelementptr ; CHECK: br i1 {{.*}}, label %middle.block, label %vector.body ; diff --git a/llvm/test/Transforms/LoopVectorize/interleaved-accesses.ll b/llvm/test/Transforms/LoopVectorize/interleaved-accesses.ll index e56b607342e6..3e77d76a26a7 100644 --- a/llvm/test/Transforms/LoopVectorize/interleaved-accesses.ll +++ b/llvm/test/Transforms/LoopVectorize/interleaved-accesses.ll @@ -686,17 +686,19 @@ define void @mixed_load2_store2(i32* noalias nocapture readonly %A, i32* noalias ; CHECK-NEXT: [[WIDE_VEC:%.*]] = load <8 x i32>, <8 x i32>* [[TMP1]], align 4 ; CHECK-NEXT: [[STRIDED_VEC:%.*]] = shufflevector <8 x i32> [[WIDE_VEC]], <8 x i32> poison, <4 x i32> <i32 0, i32 2, i32 4, i32 6> ; CHECK-NEXT: [[STRIDED_VEC1:%.*]] = shufflevector <8 x i32> [[WIDE_VEC]], <8 x i32> poison, <4 x i32> <i32 1, i32 3, i32 5, i32 7> -; CHECK-NEXT: [[TMP2:%.*]] = mul nsw <4 x i32> [[STRIDED_VEC1]], [[STRIDED_VEC]] +; CHECK-NEXT: [[TMP2:%.*]] = or i64 [[OFFSET_IDX]], 1 +; CHECK-NEXT: [[TMP3:%.*]] = mul nsw <4 x i32> [[STRIDED_VEC1]], [[STRIDED_VEC]] ; CHECK-NEXT: [[STRIDED_VEC3:%.*]] = shufflevector <8 x i32> [[WIDE_VEC]], <8 x i32> poison, <4 x i32> <i32 0, i32 2, i32 4, i32 6> ; CHECK-NEXT: [[STRIDED_VEC4:%.*]] = shufflevector <8 x i32> [[WIDE_VEC]], <8 x i32> poison, <4 x i32> <i32 1, i32 3, i32 5, i32 7> -; CHECK-NEXT: [[TMP3:%.*]] = add nsw <4 x i32> [[STRIDED_VEC4]], [[STRIDED_VEC3]] -; CHECK-NEXT: [[TMP4:%.*]] = getelementptr inbounds i32, i32* [[B:%.*]], i64 [[OFFSET_IDX]] -; CHECK-NEXT: [[TMP5:%.*]] = bitcast i32* [[TMP4]] to <8 x i32>* -; CHECK-NEXT: [[INTERLEAVED_VEC:%.*]] = shufflevector <4 x i32> [[TMP2]], <4 x i32> [[TMP3]], <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7> -; CHECK-NEXT: store <8 x i32> [[INTERLEAVED_VEC]], <8 x i32>* [[TMP5]], align 4 +; CHECK-NEXT: [[TMP4:%.*]] = add nsw <4 x i32> [[STRIDED_VEC4]], [[STRIDED_VEC3]] +; CHECK-NEXT: [[TMP5:%.*]] = getelementptr inbounds i32, i32* [[B:%.*]], i64 -1 +; CHECK-NEXT: [[TMP6:%.*]] = getelementptr inbounds i32, i32* [[TMP5]], i64 [[TMP2]] +; CHECK-NEXT: [[TMP7:%.*]] = bitcast i32* [[TMP6]] to <8 x i32>* +; CHECK-NEXT: [[INTERLEAVED_VEC:%.*]] = shufflevector <4 x i32> [[TMP3]], <4 x i32> [[TMP4]], <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7> +; CHECK-NEXT: store <8 x i32> [[INTERLEAVED_VEC]], <8 x i32>* [[TMP7]], align 4 ; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4 -; CHECK-NEXT: [[TMP6:%.*]] = icmp eq i64 [[INDEX_NEXT]], 512 -; CHECK-NEXT: br i1 [[TMP6]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP18:![0-9]+]] +; CHECK-NEXT: [[TMP8:%.*]] = icmp eq i64 [[INDEX_NEXT]], 512 +; CHECK-NEXT: br i1 [[TMP8]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP18:![0-9]+]] ; CHECK: middle.block: ; CHECK-NEXT: br i1 true, label [[FOR_COND_CLEANUP:%.*]], label [[SCALAR_PH]] ; CHECK: scalar.ph: @@ -760,17 +762,19 @@ define void @mixed_load3_store3(i32* nocapture %A) { ; CHECK-NEXT: [[STRIDED_VEC2:%.*]] = shufflevector <12 x i32> [[WIDE_VEC]], <12 x i32> poison, <4 x i32> <i32 1, i32 4, i32 7, i32 10> ; CHECK-NEXT: [[STRIDED_VEC3:%.*]] = shufflevector <12 x i32> [[WIDE_VEC]], <12 x i32> poison, <4 x i32> <i32 2, i32 5, i32 8, i32 11> ; CHECK-NEXT: [[TMP2:%.*]] = add <4 x i32> [[STRIDED_VEC]], [[VEC_IND]] -; CHECK-NEXT: [[TMP3:%.*]] = add <4 x i32> [[STRIDED_VEC2]], [[VEC_IND]] -; CHECK-NEXT: [[TMP4:%.*]] = add <4 x i32> [[STRIDED_VEC3]], [[VEC_IND]] -; CHECK-NEXT: [[TMP5:%.*]] = bitcast i32* [[NEXT_GEP]] to <12 x i32>* -; CHECK-NEXT: [[TMP6:%.*]] = shufflevector <4 x i32> [[TMP2]], <4 x i32> [[TMP3]], <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7> -; CHECK-NEXT: [[TMP7:%.*]] = shufflevector <4 x i32> [[TMP4]], <4 x i32> poison, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 undef, i32 undef, i32 undef, i32 undef> -; CHECK-NEXT: [[INTERLEAVED_VEC:%.*]] = shufflevector <8 x i32> [[TMP6]], <8 x i32> [[TMP7]], <12 x i32> <i32 0, i32 4, i32 8, i32 1, i32 5, i32 9, i32 2, i32 6, i32 10, i32 3, i32 7, i32 11> -; CHECK-NEXT: store <12 x i32> [[INTERLEAVED_VEC]], <12 x i32>* [[TMP5]], align 4 +; CHECK-NEXT: [[TMP3:%.*]] = getelementptr inbounds i32, i32* [[NEXT_GEP]], i64 2 +; CHECK-NEXT: [[TMP4:%.*]] = add <4 x i32> [[STRIDED_VEC2]], [[VEC_IND]] +; CHECK-NEXT: [[TMP5:%.*]] = add <4 x i32> [[STRIDED_VEC3]], [[VEC_IND]] +; CHECK-NEXT: [[TMP6:%.*]] = getelementptr inbounds i32, i32* [[TMP3]], i64 -2 +; CHECK-NEXT: [[TMP7:%.*]] = bitcast i32* [[TMP6]] to <12 x i32>* +; CHECK-NEXT: [[TMP8:%.*]] = shufflevector <4 x i32> [[TMP2]], <4 x i32> [[TMP4]], <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7> +; CHECK-NEXT: [[TMP9:%.*]] = shufflevector <4 x i32> [[TMP5]], <4 x i32> poison, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 undef, i32 undef, i32 undef, i32 undef> +; CHECK-NEXT: [[INTERLEAVED_VEC:%.*]] = shufflevector <8 x i32> [[TMP8]], <8 x i32> [[TMP9]], <12 x i32> <i32 0, i32 4, i32 8, i32 1, i32 5, i32 9, i32 2, i32 6, i32 10, i32 3, i32 7, i32 11> +; CHECK-NEXT: store <12 x i32> [[INTERLEAVED_VEC]], <12 x i32>* [[TMP7]], align 4 ; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4 ; CHECK-NEXT: [[VEC_IND_NEXT]] = add <4 x i32> [[VEC_IND]], <i32 4, i32 4, i32 4, i32 4> -; CHECK-NEXT: [[TMP8:%.*]] = icmp eq i64 [[INDEX_NEXT]], 1024 -; CHECK-NEXT: br i1 [[TMP8]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP20:![0-9]+]] +; CHECK-NEXT: [[TMP10:%.*]] = icmp eq i64 [[INDEX_NEXT]], 1024 +; CHECK-NEXT: br i1 [[TMP10]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP20:![0-9]+]] ; CHECK: middle.block: ; CHECK-NEXT: br i1 true, label [[FOR_COND_CLEANUP:%.*]], label [[SCALAR_PH]] ; CHECK: scalar.ph: @@ -1315,21 +1319,23 @@ define void @PR27626_4(i32 *%a, i32 %x, i32 %y, i32 %z, i64 %n) { ; CHECK-NEXT: [[TMP3:%.*]] = or i64 [[OFFSET_IDX]], 2 ; CHECK-NEXT: [[TMP4:%.*]] = or i64 [[OFFSET_IDX]], 4 ; CHECK-NEXT: [[TMP5:%.*]] = or i64 [[OFFSET_IDX]], 6 -; CHECK-NEXT: [[TMP6:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i64 [[OFFSET_IDX]] -; CHECK-NEXT: [[TMP7:%.*]] = getelementptr inbounds i32, i32* [[A]], i64 [[TMP3]] -; CHECK-NEXT: [[TMP8:%.*]] = getelementptr inbounds i32, i32* [[A]], i64 [[TMP4]] -; CHECK-NEXT: [[TMP9:%.*]] = getelementptr inbounds i32, i32* [[A]], i64 [[TMP5]] -; CHECK-NEXT: store i32 [[X:%.*]], i32* [[TMP6]], align 4 -; CHECK-NEXT: store i32 [[X]], i32* [[TMP7]], align 4 +; CHECK-NEXT: [[TMP6:%.*]] = or i64 [[OFFSET_IDX]], 1 +; CHECK-NEXT: [[TMP7:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i64 [[OFFSET_IDX]] +; CHECK-NEXT: [[TMP8:%.*]] = getelementptr inbounds i32, i32* [[A]], i64 [[TMP3]] +; CHECK-NEXT: [[TMP9:%.*]] = getelementptr inbounds i32, i32* [[A]], i64 [[TMP4]] +; CHECK-NEXT: [[TMP10:%.*]] = getelementptr inbounds i32, i32* [[A]], i64 [[TMP5]] +; CHECK-NEXT: [[TMP11:%.*]] = getelementptr inbounds i32, i32* [[A]], i64 -1 +; CHECK-NEXT: store i32 [[X:%.*]], i32* [[TMP7]], align 4 ; CHECK-NEXT: store i32 [[X]], i32* [[TMP8]], align 4 ; CHECK-NEXT: store i32 [[X]], i32* [[TMP9]], align 4 -; CHECK-NEXT: [[TMP10:%.*]] = getelementptr inbounds i32, i32* [[A]], i64 [[OFFSET_IDX]] -; CHECK-NEXT: [[TMP11:%.*]] = bitcast i32* [[TMP10]] to <8 x i32>* +; CHECK-NEXT: store i32 [[X]], i32* [[TMP10]], align 4 +; CHECK-NEXT: [[TMP12:%.*]] = getelementptr inbounds i32, i32* [[TMP11]], i64 [[TMP6]] +; CHECK-NEXT: [[TMP13:%.*]] = bitcast i32* [[TMP12]] to <8 x i32>* ; CHECK-NEXT: [[INTERLEAVED_VEC:%.*]] = shufflevector <4 x i32> [[BROADCAST_SPLAT]], <4 x i32> [[BROADCAST_SPLAT2]], <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7> -; CHECK-NEXT: store <8 x i32> [[INTERLEAVED_VEC]], <8 x i32>* [[TMP11]], align 4 +; CHECK-NEXT: store <8 x i32> [[INTERLEAVED_VEC]], <8 x i32>* [[TMP13]], align 4 ; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4 -; CHECK-NEXT: [[TMP12:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]] -; CHECK-NEXT: br i1 [[TMP12]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP32:![0-9]+]] +; CHECK-NEXT: [[TMP14:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]] +; CHECK-NEXT: br i1 [[TMP14]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP32:![0-9]+]] ; CHECK: middle.block: ; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[TMP2]], [[N_VEC]] ; CHECK-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]] </cut>

3 years, 11 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_kernel/gnu-release-arm-next-allyesconfig - Build # 25 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *linux* in CI configuration tcwg_kernel/gnu-release-arm-next-allyesconfig. So far, this commit has regressed CI configurations: - tcwg_kernel/gnu-release-arm-next-allyesconfig Culprit: <cut> commit ee1ba5beab143f3afcc89720bb18ac438c3241b3 Merge: 2fdba6b39b9b 54f9cb2466e1 Author: Stephen Rothwell <sfr(a)canb.auug.org.au> Date: Wed Sep 8 10:07:28 2021 +1000 Merge remote-tracking branch 'pm/linux-next' </cut> Results regressed to (for first_bad == ee1ba5beab143f3afcc89720bb18ac438c3241b3) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1: -5 # build_abe qemu: -2 # linux_n_obj: 19801 # First few build errors in logs: from (for last_good == 2fdba6b39b9bd8deafe182764414eb075032c31d) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1: -5 # build_abe qemu: -2 # linux_n_obj: 19887 # linux build successful: all Artifacts of last_good build: https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-release-arm-next-allye… Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-release-arm-next-allye… Build top page/logs: https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-release-arm-next-allye… Configuration details: rr[linux_git]="https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git#626bf91…" Reproduce builds: <cut> mkdir investigate-linux-ee1ba5beab143f3afcc89720bb18ac438c3241b3 cd investigate-linux-ee1ba5beab143f3afcc89720bb18ac438c3241b3 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-release-arm-next-allye… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-release-arm-next-allye… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-release-arm-next-allye… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_kernel-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /linux/ ./ ./bisect/baseline/ cd linux # Reproduce first_bad build git checkout --detach ee1ba5beab143f3afcc89720bb18ac438c3241b3 ../artifacts/test.sh # Reproduce last_good build git checkout --detach 2fdba6b39b9bd8deafe182764414eb075032c31d ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-release-arm-next-allye… Build log: https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-release-arm-next-allye… Full commit (up to 1000 lines): <cut> commit ee1ba5beab143f3afcc89720bb18ac438c3241b3 Merge: 2fdba6b39b9b 54f9cb2466e1 Author: Stephen Rothwell <sfr(a)canb.auug.org.au> Date: Wed Sep 8 10:07:28 2021 +1000 Merge remote-tracking branch 'pm/linux-next' Documentation/admin-guide/acpi/ssdt-overlays.rst | 49 +- Documentation/cpu-freq/cpu-drivers.rst | 3 - .../devicetree/bindings/cpufreq/cpufreq-dt.txt | 2 +- .../bindings/cpufreq/cpufreq-mediatek-hw.yaml | 70 +++ .../bindings/cpufreq/cpufreq-mediatek.txt | 2 +- .../devicetree/bindings/cpufreq/cpufreq-st.txt | 6 +- .../bindings/cpufreq/nvidia,tegra20-cpufreq.txt | 2 +- .../devicetree/bindings/devfreq/rk3399_dmc.txt | 2 +- .../devicetree/bindings/gpu/arm,mali-bifrost.yaml | 2 +- .../devicetree/bindings/gpu/arm,mali-midgard.yaml | 2 +- .../bindings/interconnect/fsl,imx8m-noc.yaml | 4 +- .../opp/allwinner,sun50i-h6-operating-points.yaml | 4 + Documentation/devicetree/bindings/opp/opp-v1.yaml | 51 ++ .../devicetree/bindings/opp/opp-v2-base.yaml | 214 +++++++ Documentation/devicetree/bindings/opp/opp-v2.yaml | 475 ++++++++++++++++ Documentation/devicetree/bindings/opp/opp.txt | 622 --------------------- Documentation/devicetree/bindings/opp/qcom-opp.txt | 2 +- .../bindings/opp/ti-omap5-opp-supply.txt | 2 +- .../devicetree/bindings/power/power-domain.yaml | 2 +- .../translations/zh_CN/cpu-freq/cpu-drivers.rst | 2 - arch/arm/boot/dts/omap34xx.dtsi | 1 - arch/arm/boot/dts/omap36xx.dtsi | 1 - drivers/acpi/x86/s2idle.c | 67 ++- drivers/base/arch_topology.c | 2 + drivers/cpufreq/Kconfig.arm | 12 + drivers/cpufreq/Makefile | 1 + drivers/cpufreq/acpi-cpufreq.c | 14 +- drivers/cpufreq/cpufreq-dt-platdev.c | 4 + drivers/cpufreq/cpufreq-dt.c | 3 +- drivers/cpufreq/cpufreq.c | 17 +- drivers/cpufreq/imx6q-cpufreq.c | 2 +- drivers/cpufreq/intel_pstate.c | 39 -- drivers/cpufreq/mediatek-cpufreq-hw.c | 308 ++++++++++ drivers/cpufreq/mediatek-cpufreq.c | 3 +- drivers/cpufreq/omap-cpufreq.c | 2 +- drivers/cpufreq/qcom-cpufreq-hw.c | 151 ++++- drivers/cpufreq/scmi-cpufreq.c | 65 ++- drivers/cpufreq/scpi-cpufreq.c | 3 +- drivers/cpufreq/sh-cpufreq.c | 11 - drivers/cpufreq/vexpress-spc-cpufreq.c | 25 +- drivers/pci/controller/vmd.c | 55 ++ drivers/pci/host-bridge.c | 1 + drivers/pci/pci-acpi.c | 74 +++ include/linux/cpufreq.h | 75 ++- include/linux/pci-acpi.h | 3 + 45 files changed, 1638 insertions(+), 819 deletions(-) </cut>

3 years, 11 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_kernel/gnu-master-arm-mainline-allmodconfig - Build # 28 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *linux* in CI configuration tcwg_kernel/gnu-master-arm-mainline-allmodconfig. So far, this commit has regressed CI configurations: - tcwg_kernel/gnu-master-arm-mainline-allmodconfig Culprit: <cut> commit 3fe617ccafd6f5bb33c2391d6f4eeb41c1fd0151 Author: Linus Torvalds <torvalds(a)linux-foundation.org> Date: Sun Sep 5 11:24:05 2021 -0700 Enable '-Werror' by default for all kernel builds ... but make it a config option so that broken environments can disable it when required. We really should always have a clean build, and will disable specific over-eager warnings as required, if we can't fix them. But while I fairly religiously enforce that in my own tree, it doesn't get enforced by various build robots that don't necessarily report warnings. So this just makes '-Werror' a default compiler flag, but allows people to disable it for their configuration if they have some particular issues. Occasionally, new compiler versions end up enabling new warnings, and it can take a while before we have them fixed (or the warnings disabled if that is what it takes), so the config option allows for that situation. Hopefully this will mean that I get fewer pull requests that have new warnings that were not noticed by various automation we have in place. Knock wood. Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org> </cut> Results regressed to (for first_bad == 3fe617ccafd6f5bb33c2391d6f4eeb41c1fd0151) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1: -5 # build_abe qemu: -2 # linux_n_obj: 21769 # First few build errors in logs: from (for last_good == fd47ff55c9c31101fcc06d20cb381da3d4089bd5) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1: -5 # build_abe qemu: -2 # linux_n_obj: 29880 # linux build successful: all Artifacts of last_good build: https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-master-arm-mainline-al… Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-master-arm-mainline-al… Build top page/logs: https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-master-arm-mainline-al… Configuration details: Reproduce builds: <cut> mkdir investigate-linux-3fe617ccafd6f5bb33c2391d6f4eeb41c1fd0151 cd investigate-linux-3fe617ccafd6f5bb33c2391d6f4eeb41c1fd0151 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-master-arm-mainline-al… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-master-arm-mainline-al… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-master-arm-mainline-al… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_kernel-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /linux/ ./ ./bisect/baseline/ cd linux # Reproduce first_bad build git checkout --detach 3fe617ccafd6f5bb33c2391d6f4eeb41c1fd0151 ../artifacts/test.sh # Reproduce last_good build git checkout --detach fd47ff55c9c31101fcc06d20cb381da3d4089bd5 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-master-arm-mainline-al… Build log: https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-master-arm-mainline-al… Full commit (up to 1000 lines): <cut> commit 3fe617ccafd6f5bb33c2391d6f4eeb41c1fd0151 Author: Linus Torvalds <torvalds(a)linux-foundation.org> Date: Sun Sep 5 11:24:05 2021 -0700 Enable '-Werror' by default for all kernel builds ... but make it a config option so that broken environments can disable it when required. We really should always have a clean build, and will disable specific over-eager warnings as required, if we can't fix them. But while I fairly religiously enforce that in my own tree, it doesn't get enforced by various build robots that don't necessarily report warnings. So this just makes '-Werror' a default compiler flag, but allows people to disable it for their configuration if they have some particular issues. Occasionally, new compiler versions end up enabling new warnings, and it can take a while before we have them fixed (or the warnings disabled if that is what it takes), so the config option allows for that situation. Hopefully this will mean that I get fewer pull requests that have new warnings that were not noticed by various automation we have in place. Knock wood. Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org> --- Makefile | 3 +++ init/Kconfig | 14 ++++++++++++++ 2 files changed, 17 insertions(+) diff --git a/Makefile b/Makefile index 6bc1c5b17a62..d45fc2edf186 100644 --- a/Makefile +++ b/Makefile @@ -785,6 +785,9 @@ stackp-flags-$(CONFIG_STACKPROTECTOR_STRONG) := -fstack-protector-strong KBUILD_CFLAGS += $(stackp-flags-y) +KBUILD_CFLAGS-$(CONFIG_WERROR) += -Werror +KBUILD_CFLAGS += $(KBUILD_CFLAGS-y) + ifdef CONFIG_CC_IS_CLANG KBUILD_CPPFLAGS += -Qunused-arguments # The kernel builds with '-std=gnu89' so use of GNU extensions is acceptable. diff --git a/init/Kconfig b/init/Kconfig index e708180e9a59..8cb97f141b70 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -137,6 +137,20 @@ config COMPILE_TEST here. If you are a user/distributor, say N here to exclude useless drivers to be distributed. +config WERROR + bool "Compile the kernel with warnings as errors" + default y + help + A kernel build should not cause any compiler warnings, and this + enables the '-Werror' flag to enforce that rule by default. + + However, if you have a new (or very old) compiler with odd and + unusual warnings, or you have some architecture with problems, + you may need to disable this config option in order to + successfully build the kernel. + + If in doubt, say Y. + config UAPI_HEADER_TEST bool "Compile test UAPI headers" depends on HEADERS_INSTALL && CC_CAN_LINK </cut>

3 years, 11 months

3
5
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_apm/llvm-master-arm-spec2k6-Oz - Build # 7 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *binutils* in CI configuration tcwg_bmk_llvm_apm/llvm-master-arm-spec2k6-Oz. So far, this commit has regressed CI configurations: - tcwg_bmk_llvm_apm/llvm-master-arm-spec2k6-Oz Culprit: <cut> commit f947f96797f8ec33aabf9cd7234c850778068445 Author: Tom de Vries <tdevries(a)suse.de> Date: Mon Aug 30 14:34:03 2021 +0200 [gdb/cli] Don't assert on empty string for core-file With current gdb we run into: ... $ gdb -batch '' '' : No such file or directory. pathstuff.cc:132: internal-error: \ gdb::unique_xmalloc_ptr<char> gdb_abspath(const char*): \ Assertion `path != NULL && path[0] != '\0'' failed. ... Fix this by skipping the call to gdb_abspath in core_target_open in the empty-string case, such that we have instead: ... $ gdb -batch '' '' : No such file or directory. : No such file or directory. $ ... Tested on x86_64-linux. gdb/ChangeLog: 2021-08-30 Tom de Vries <tdevries(a)suse.de> PR cli/28290 * gdb/corelow.c (core_target_open): Skip call to gdb_abspath in the empty-string case. gdb/testsuite/ChangeLog: 2021-08-30 Tom de Vries <tdevries(a)suse.de> PR cli/28290 * gdb.base/batch-exit-status.exp: Add gdb '' and gdb '' '' tests. </cut> Results regressed to (for first_bad == f947f96797f8ec33aabf9cd7234c850778068445) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=thumb --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=thumb --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -Oz_mthumb artifacts/build-f947f96797f8ec33aabf9cd7234c850778068445/results_id: 1 # 447.dealII,[.] contract<3> regressed by 200 from (for last_good == 9b9b1092f0a8e6b7d240ea05a74968a883b8a05c) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=thumb --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=thumb --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -Oz_mthumb artifacts/build-9b9b1092f0a8e6b7d240ea05a74968a883b8a05c/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Results ID of last_good: apm_32/tcwg_bmk_llvm_apm/bisect-llvm-master-arm-spec2k6-Oz/4909 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Results ID of first_bad: apm_32/tcwg_bmk_llvm_apm/bisect-llvm-master-arm-spec2k6-Oz/4905 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Configuration details: Reproduce builds: <cut> mkdir investigate-binutils-f947f96797f8ec33aabf9cd7234c850778068445 cd investigate-binutils-f947f96797f8ec33aabf9cd7234c850778068445 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /binutils/ ./ ./bisect/baseline/ cd binutils # Reproduce first_bad build git checkout --detach f947f96797f8ec33aabf9cd7234c850778068445 ../artifacts/test.sh # Reproduce last_good build git checkout --detach 9b9b1092f0a8e6b7d240ea05a74968a883b8a05c ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Full commit (up to 1000 lines): <cut> commit f947f96797f8ec33aabf9cd7234c850778068445 Author: Tom de Vries <tdevries(a)suse.de> Date: Mon Aug 30 14:34:03 2021 +0200 [gdb/cli] Don't assert on empty string for core-file With current gdb we run into: ... $ gdb -batch '' '' : No such file or directory. pathstuff.cc:132: internal-error: \ gdb::unique_xmalloc_ptr<char> gdb_abspath(const char*): \ Assertion `path != NULL && path[0] != '\0'' failed. ... Fix this by skipping the call to gdb_abspath in core_target_open in the empty-string case, such that we have instead: ... $ gdb -batch '' '' : No such file or directory. : No such file or directory. $ ... Tested on x86_64-linux. gdb/ChangeLog: 2021-08-30 Tom de Vries <tdevries(a)suse.de> PR cli/28290 * gdb/corelow.c (core_target_open): Skip call to gdb_abspath in the empty-string case. gdb/testsuite/ChangeLog: 2021-08-30 Tom de Vries <tdevries(a)suse.de> PR cli/28290 * gdb.base/batch-exit-status.exp: Add gdb '' and gdb '' '' tests. --- gdb/corelow.c | 3 ++- gdb/testsuite/gdb.base/batch-exit-status.exp | 4 ++++ 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/gdb/corelow.c b/gdb/corelow.c index eb785a08633..711e86c4cd4 100644 --- a/gdb/corelow.c +++ b/gdb/corelow.c @@ -428,7 +428,8 @@ core_target_open (const char *arg, int from_tty) } gdb::unique_xmalloc_ptr<char> filename (tilde_expand (arg)); - if (!IS_ABSOLUTE_PATH (filename.get ())) + if (strlen (filename.get ()) != 0 + && !IS_ABSOLUTE_PATH (filename.get ())) filename = gdb_abspath (filename.get ()); flags = O_BINARY | O_LARGEFILE; diff --git a/gdb/testsuite/gdb.base/batch-exit-status.exp b/gdb/testsuite/gdb.base/batch-exit-status.exp index 085dfc6ad56..9a080196bd6 100644 --- a/gdb/testsuite/gdb.base/batch-exit-status.exp +++ b/gdb/testsuite/gdb.base/batch-exit-status.exp @@ -76,3 +76,7 @@ test_exit_status 1 "-batch -x $good_commands -x $bad_commands" \ "-batch -x good-commands -x bad-commands" test_exit_status 1 "-batch -x $good_commands -ex \"set not-a-thing 4\"" \ "-batch -x good-commands -ex \"set not-a-thing 4\"" + +set no_such_re ": No such file or directory\\." +test_exit_status 1 "-batch \"\"" $no_such_re +test_exit_status 1 "-batch \"\" \"\"" [multi_line $no_such_re $no_such_re] </cut>

3 years, 11 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_kernel/gnu-release-arm-next-allmodconfig - Build # 20 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *linux* in CI configuration tcwg_kernel/gnu-release-arm-next-allmodconfig. So far, this commit has regressed CI configurations: - tcwg_kernel/gnu-release-arm-next-allmodconfig Culprit: <cut> commit 4d3b252a0a3aed2f6fc70aec3c37275a9ca179a4 Merge: 907f2745370d 6f65d2319f21 Author: Stephen Rothwell <sfr(a)canb.auug.org.au> Date: Tue Sep 7 10:00:35 2021 +1000 Merge remote-tracking branch 'pm/linux-next' </cut> Results regressed to (for first_bad == 4d3b252a0a3aed2f6fc70aec3c37275a9ca179a4) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1: -5 # build_abe qemu: -2 # linux_n_obj: 21778 # First few build errors in logs: from (for last_good == 907f2745370d3cfcc6efe7772def37c4eee4b960) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1: -5 # build_abe qemu: -2 # linux_n_obj: 29889 # linux build successful: all Artifacts of last_good build: https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-release-arm-next-allmo… Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-release-arm-next-allmo… Build top page/logs: https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-release-arm-next-allmo… Configuration details: rr[linux_git]="https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git#4b93c54…" Reproduce builds: <cut> mkdir investigate-linux-4d3b252a0a3aed2f6fc70aec3c37275a9ca179a4 cd investigate-linux-4d3b252a0a3aed2f6fc70aec3c37275a9ca179a4 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-release-arm-next-allmo… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-release-arm-next-allmo… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-release-arm-next-allmo… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_kernel-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /linux/ ./ ./bisect/baseline/ cd linux # Reproduce first_bad build git checkout --detach 4d3b252a0a3aed2f6fc70aec3c37275a9ca179a4 ../artifacts/test.sh # Reproduce last_good build git checkout --detach 907f2745370d3cfcc6efe7772def37c4eee4b960 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-release-arm-next-allmo… Build log: https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-release-arm-next-allmo… Full commit (up to 1000 lines): <cut> commit 4d3b252a0a3aed2f6fc70aec3c37275a9ca179a4 Merge: 907f2745370d 6f65d2319f21 Author: Stephen Rothwell <sfr(a)canb.auug.org.au> Date: Tue Sep 7 10:00:35 2021 +1000 Merge remote-tracking branch 'pm/linux-next' Documentation/admin-guide/acpi/ssdt-overlays.rst | 49 +- Documentation/cpu-freq/cpu-drivers.rst | 3 - .../devicetree/bindings/cpufreq/cpufreq-dt.txt | 2 +- .../bindings/cpufreq/cpufreq-mediatek.txt | 2 +- .../devicetree/bindings/cpufreq/cpufreq-st.txt | 6 +- .../bindings/cpufreq/nvidia,tegra20-cpufreq.txt | 2 +- .../devicetree/bindings/devfreq/rk3399_dmc.txt | 2 +- .../devicetree/bindings/gpu/arm,mali-bifrost.yaml | 2 +- .../devicetree/bindings/gpu/arm,mali-midgard.yaml | 2 +- .../bindings/interconnect/fsl,imx8m-noc.yaml | 4 +- .../opp/allwinner,sun50i-h6-operating-points.yaml | 4 + Documentation/devicetree/bindings/opp/opp-v1.yaml | 51 ++ .../devicetree/bindings/opp/opp-v2-base.yaml | 214 +++++++ Documentation/devicetree/bindings/opp/opp-v2.yaml | 475 ++++++++++++++++ Documentation/devicetree/bindings/opp/opp.txt | 622 --------------------- Documentation/devicetree/bindings/opp/qcom-opp.txt | 2 +- .../bindings/opp/ti-omap5-opp-supply.txt | 2 +- .../devicetree/bindings/power/power-domain.yaml | 2 +- .../translations/zh_CN/cpu-freq/cpu-drivers.rst | 2 - arch/arm/boot/dts/omap34xx.dtsi | 1 - arch/arm/boot/dts/omap36xx.dtsi | 1 - drivers/acpi/x86/s2idle.c | 67 ++- drivers/base/arch_topology.c | 2 + drivers/cpufreq/acpi-cpufreq.c | 14 +- drivers/cpufreq/cpufreq-dt-platdev.c | 4 + drivers/cpufreq/cpufreq-dt.c | 3 +- drivers/cpufreq/cpufreq.c | 17 +- drivers/cpufreq/imx6q-cpufreq.c | 2 +- drivers/cpufreq/mediatek-cpufreq.c | 3 +- drivers/cpufreq/omap-cpufreq.c | 2 +- drivers/cpufreq/qcom-cpufreq-hw.c | 151 ++++- drivers/cpufreq/scmi-cpufreq.c | 65 ++- drivers/cpufreq/scpi-cpufreq.c | 3 +- drivers/cpufreq/sh-cpufreq.c | 11 - drivers/cpufreq/vexpress-spc-cpufreq.c | 25 +- drivers/pci/controller/vmd.c | 55 ++ drivers/pci/host-bridge.c | 1 + drivers/pci/pci-acpi.c | 74 +++ include/linux/cpufreq.h | 17 +- include/linux/pci-acpi.h | 3 + 40 files changed, 1190 insertions(+), 779 deletions(-) </cut>

3 years, 11 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tx1/llvm-release-aarch64-spec2k6-O3 - Build # 8 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_tx1/llvm-release-aarch64-spec2k6-O3. So far, this commit has regressed CI configurations: - tcwg_bmk_llvm_tx1/llvm-release-aarch64-spec2k6-O3 Culprit: <cut> commit 34f839fc9d4c0638e09c81e9981d4dacf69c3ed6 Author: Zahira Ammarguellat <zahira.ammarguellat(a)intel.com> Date: Fri Aug 6 12:01:47 2021 -0700 Revert "[clang][fpenv][patch] Change clang option -ffp-model=precise to select ffp-contract=on" This reverts commit 48ad446a0fb2c9b98cb7047e4daf8a84c29cef8f. (cherry picked from commit 4389a413e2129d7d55ee779638b649aa852b6f8a) </cut> Results regressed to (for first_bad == 34f839fc9d4c0638e09c81e9981d4dacf69c3ed6) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O3 artifacts/build-34f839fc9d4c0638e09c81e9981d4dacf69c3ed6/results_id: 1 # 470.lbm,lbm_base.default regressed by 109 # 444.namd,namd_base.default regressed by 104 # 447.dealII,dealII_base.default regressed by 106 # 447.dealII,[.] _ZNK12SparseMatrixIdE5vmultI6VectorIdES3_EEvRT regressed by 115 # 447.dealII,[.] _ZN16ConstraintMatrix8add_lineEj regressed by 112 # 433.milc,milc_base.default regressed by 104 # 433.milc,[.] mult_su3_mat_vec regressed by 115 from (for last_good == b643ee1b9c1a8e0b81e31908a066c71851292890) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O3 artifacts/build-b643ee1b9c1a8e0b81e31908a066c71851292890/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… Results ID of last_good: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-release-aarch64-spec2k6-O3/4881 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… Results ID of first_bad: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-release-aarch64-spec2k6-O3/4887 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… Configuration details: Reproduce builds: <cut> mkdir investigate-llvm-34f839fc9d4c0638e09c81e9981d4dacf69c3ed6 cd investigate-llvm-34f839fc9d4c0638e09c81e9981d4dacf69c3ed6 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach 34f839fc9d4c0638e09c81e9981d4dacf69c3ed6 ../artifacts/test.sh # Reproduce last_good build git checkout --detach b643ee1b9c1a8e0b81e31908a066c71851292890 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… Full commit (up to 1000 lines): <cut> commit 34f839fc9d4c0638e09c81e9981d4dacf69c3ed6 Author: Zahira Ammarguellat <zahira.ammarguellat(a)intel.com> Date: Fri Aug 6 12:01:47 2021 -0700 Revert "[clang][fpenv][patch] Change clang option -ffp-model=precise to select ffp-contract=on" This reverts commit 48ad446a0fb2c9b98cb7047e4daf8a84c29cef8f. (cherry picked from commit 4389a413e2129d7d55ee779638b649aa852b6f8a) --- clang/docs/UsersManual.rst | 48 ++----------------------- clang/lib/Driver/ToolChains/Clang.cpp | 33 ++++++++--------- clang/test/CodeGen/ffp-contract-option.c | 47 +++--------------------- clang/test/CodeGen/ppc-emmintrin.c | 4 +-- clang/test/CodeGen/ppc-xmmintrin.c | 4 +-- clang/test/Driver/fp-model.c | 61 +++++++++++++++----------------- 6 files changed, 58 insertions(+), 139 deletions(-) diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst index aecd28e5e12a..20be01a5f40a 100644 --- a/clang/docs/UsersManual.rst +++ b/clang/docs/UsersManual.rst @@ -1260,50 +1260,8 @@ installed. Controlling Floating Point Behavior ----------------------------------- -Clang provides a number of ways to control floating point behavior, including -with command line options and source pragmas. This section -describes the various floating point semantic modes and the corresponding options. - -.. csv-table:: Floating Point Semantic Modes - :header: "Mode", "Values" - :widths: 15, 30, 30 - - "except_behavior", "{ignore, strict, may_trap}", "ffp-exception-behavior" - "fenv_access", "{off, on}", "(none)" - "rounding_mode", "{dynamic, tonearest, downward, upward, towardzero}", "frounding-math" - "contract", "{on, off, fast}", "ffp-contract" - "denormal_fp_math", "{IEEE, PreserveSign, PositiveZero}", "fdenormal-fp-math" - "denormal_fp32_math", "{IEEE, PreserveSign, PositiveZero}", "fdenormal-fp-math-fp32" - "support_math_errno", "{on, off}", "fmath-errno" - "no_honor_nans", "{on, off}", "fhonor-nans" - "no_honor_infinities", "{on, off}", "fhonor-infinities" - "no_signed_zeros", "{on, off}", "fsigned-zeros" - "allow_reciprocal", "{on, off}", "freciprocal-math" - "allow_approximate_fns", "{on, off}", "(none)" - "allow_reassociation", "{on, off}", "fassociative-math" - - -This table describes the option settings that correspond to the three -floating point semantic models: precise (the default), strict, and fast. - - -.. csv-table:: Floating Point Models - :header: "Mode", "Precise", "Strict", "Fast" - :widths: 25, 15, 15, 15 - - "except_behavior", "ignore", "strict", "ignore" - "fenv_access", "off", "on", "off" - "rounding_mode", "tonearest", "dynamic", "tonearest" - "contract", "on", "off", "fast" - "denormal_fp_math", "IEEE", "IEEE", "PreserveSign" - "denormal_fp32_math", "IEEE","IEEE", "PreserveSign" - "support_math_errno", "on", "on", "off" - "no_honor_nans", "off", "off", "on" - "no_honor_infinities", "off", "off", "on" - "no_signed_zeros", "off", "off", "on" - "allow_reciprocal", "off", "off", "on" - "allow_approximate_fns", "off", "off", "on" - "allow_reassociation", "off", "off", "on" +Clang provides a number of ways to control floating point behavior. The options +are listed below. .. option:: -ffast-math @@ -1498,7 +1456,7 @@ Note that floating-point operations performed as part of constant initialization and ``fast``. Details: - * ``precise`` Disables optimizations that are not value-safe on floating-point data, although FP contraction (FMA) is enabled (``-ffp-contract=on``). This is the default behavior. + * ``precise`` Disables optimizations that are not value-safe on floating-point data, although FP contraction (FMA) is enabled (``-ffp-contract=fast``). This is the default behavior. * ``strict`` Enables ``-frounding-math`` and ``-ffp-exception-behavior=strict``, and disables contractions (FMA). All of the ``-ffast-math`` enablements are disabled. Enables ``STDC FENV_ACCESS``: by default ``FENV_ACCESS`` is disabled. This option setting behaves as though ``#pragma STDC FENV_ACESS ON`` appeared at the top of the source file. * ``fast`` Behaves identically to specifying both ``-ffast-math`` and ``ffp-contract=fast`` diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 0e129e6f2fac..4c8ba8cdcd29 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -2637,7 +2637,7 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D, llvm::DenormalMode DenormalFPMath = DefaultDenormalFPMath; llvm::DenormalMode DenormalFP32Math = DefaultDenormalFP32Math; - StringRef FPContract = "on"; + StringRef FPContract = ""; bool StrictFPModel = false; @@ -2662,7 +2662,7 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D, ReciprocalMath = false; SignedZeros = true; // -fno_fast_math restores default denormal and fpcontract handling - FPContract = "on"; + FPContract = ""; DenormalFPMath = llvm::DenormalMode::getIEEE(); // FIXME: The target may have picked a non-IEEE default mode here based on @@ -2682,18 +2682,20 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D, // ffp-model= is a Driver option, it is entirely rewritten into more // granular options before being passed into cc1. // Use the gcc option in the switch below. - if (!FPModel.empty() && !FPModel.equals(Val)) + if (!FPModel.empty() && !FPModel.equals(Val)) { D.Diag(clang::diag::warn_drv_overriding_flag_option) << Args.MakeArgString("-ffp-model=" + FPModel) << Args.MakeArgString("-ffp-model=" + Val); + FPContract = ""; + } if (Val.equals("fast")) { optID = options::OPT_ffast_math; FPModel = Val; - FPContract = Val; + FPContract = "fast"; } else if (Val.equals("precise")) { optID = options::OPT_ffp_contract; FPModel = Val; - FPContract = "on"; + FPContract = "fast"; PreciseFPModel = true; } else if (Val.equals("strict")) { StrictFPModel = true; @@ -2779,11 +2781,9 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D, case options::OPT_ffp_contract: { StringRef Val = A->getValue(); if (PreciseFPModel) { - // When -ffp-model=precise is seen on the command line, - // the boolean PreciseFPModel is set to true which indicates - // "the current option is actually PreciseFPModel". The optID - // is changed to OPT_ffp_contract and FPContract is set to "on". - // the argument Val string is "precise": it shouldn't be checked. + // -ffp-model=precise enables ffp-contract=fast as a side effect + // the FPContract value has already been set to a string literal + // and the Val string isn't a pertinent value. ; } else if (Val.equals("fast") || Val.equals("on") || Val.equals("off")) FPContract = Val; @@ -2881,17 +2881,18 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D, // -fno_fast_math restores default denormal and fpcontract handling DenormalFPMath = DefaultDenormalFPMath; DenormalFP32Math = llvm::DenormalMode::getIEEE(); - FPContract = "on"; + FPContract = ""; break; } if (StrictFPModel) { // If -ffp-model=strict has been specified on command line but // subsequent options conflict then emit warning diagnostic. - if (HonorINFs && HonorNaNs && !AssociativeMath && !ReciprocalMath && - SignedZeros && TrappingMath && RoundingFPMath && - DenormalFPMath == llvm::DenormalMode::getIEEE() && - DenormalFP32Math == llvm::DenormalMode::getIEEE() && - FPContract.equals("off")) + if (HonorINFs && HonorNaNs && + !AssociativeMath && !ReciprocalMath && + SignedZeros && TrappingMath && RoundingFPMath && + (FPContract.equals("off") || FPContract.empty()) && + DenormalFPMath == llvm::DenormalMode::getIEEE() && + DenormalFP32Math == llvm::DenormalMode::getIEEE()) // OK: Current Arg doesn't conflict with -ffp-model=strict ; else { diff --git a/clang/test/CodeGen/ffp-contract-option.c b/clang/test/CodeGen/ffp-contract-option.c index efc72c2b5461..52b750795940 100644 --- a/clang/test/CodeGen/ffp-contract-option.c +++ b/clang/test/CodeGen/ffp-contract-option.c @@ -1,46 +1,9 @@ -// RUN: %clang_cc1 -O3 -ffp-contract=fast -triple=aarch64-apple-darwin -S -o - %s | FileCheck --check-prefix=CHECK-FMADD %s +// RUN: %clang_cc1 -O3 -ffp-contract=fast -triple=aarch64-apple-darwin -S -o - %s | FileCheck %s // REQUIRES: aarch64-registered-target float fma_test1(float a, float b, float c) { -// CHECK-FMADD: fmadd - float x = a * b; - float y = x + c; - return y; -} - -// RUN: %clang_cc1 -triple=x86_64 %s -emit-llvm -o - \ -// RUN:| FileCheck --check-prefix=CHECK-DEFAULT %s -// -// RUN: %clang_cc1 -triple=x86_64 -ffp-contract=off %s -emit-llvm -o - \ -// RUN:| FileCheck --check-prefix=CHECK-DEFAULT %s -// RUN: %clang_cc1 -triple=x86_64 -ffp-contract=on %s -emit-llvm -o - \ -// RUN:| FileCheck --check-prefix=CHECK-ON %s -// RUN: %clang_cc1 -triple=x86_64 -ffp-contract=fast %s -emit-llvm -o - \ -// RUN:| FileCheck --check-prefix=CHECK-CONTRACTFAST %s -// -// RUN: %clang_cc1 -triple=x86_64 -ffast-math %s -emit-llvm -o - \ -// RUN:| FileCheck --check-prefix=CHECK-DEFAULTFAST %s -// RUN: %clang_cc1 -triple=x86_64 -ffast-math -ffp-contract=off %s -emit-llvm -o - \ -// RUN:| FileCheck --check-prefix=CHECK-DEFAULTFAST %s -// RUN: %clang_cc1 -triple=x86_64 -ffast-math -ffp-contract=on %s -emit-llvm -o - \ -// RUN:| FileCheck --check-prefix=CHECK-ONFAST %s -// RUN: %clang_cc1 -triple=x86_64 -ffast-math -ffp-contract=fast %s -emit-llvm -o - \ -// RUN:| FileCheck --check-prefix=CHECK-FASTFAST %s -float mymuladd( float x, float y, float z ) { - return x * y + z; - // CHECK-DEFAULT: = fmul float - // CHECK-DEFAULT: = fadd float - - // CHECK-ON: = call float @llvm.fmuladd.f32 - - // CHECK-CONTRACTFAST: = fmul contract float - // CHECK-CONTRACTFAST: = fadd contract float - - // CHECK-DEFAULTFAST: = fmul reassoc nnan ninf nsz arcp afn float - // CHECK-DEFAULTFAST: = fadd reassoc nnan ninf nsz arcp afn float - - // CHECK-ONFAST: = call reassoc nnan ninf nsz arcp afn float @llvm.fmuladd.f32 - - // CHECK-FASTFAST: = fmul fast float - // CHECK-FASTFAST: = fadd fast float +// CHECK: fmadd + float x = a * b; + float y = x + c; + return y; } diff --git a/clang/test/CodeGen/ppc-emmintrin.c b/clang/test/CodeGen/ppc-emmintrin.c index 4a246ff92d76..fa3801f50a01 100644 --- a/clang/test/CodeGen/ppc-emmintrin.c +++ b/clang/test/CodeGen/ppc-emmintrin.c @@ -2,9 +2,9 @@ // REQUIRES: powerpc-registered-target // RUN: %clang -S -emit-llvm -target powerpc64-unknown-linux-gnu -mcpu=pwr8 -ffreestanding -DNO_WARN_X86_INTRINSICS %s \ -// RUN: -ffp-contract=off -fno-discard-value-names -mllvm -disable-llvm-optzns -o - | llvm-cxxfilt -n | FileCheck %s --check-prefixes=CHECK,CHECK-BE +// RUN: -fno-discard-value-names -mllvm -disable-llvm-optzns -o - | llvm-cxxfilt -n | FileCheck %s --check-prefixes=CHECK,CHECK-BE // RUN: %clang -S -emit-llvm -target powerpc64le-unknown-linux-gnu -mcpu=pwr8 -ffreestanding -DNO_WARN_X86_INTRINSICS %s \ -// RUN: -ffp-contract=off -fno-discard-value-names -mllvm -disable-llvm-optzns -o - | llvm-cxxfilt -n | FileCheck %s --check-prefixes=CHECK,CHECK-LE +// RUN: -fno-discard-value-names -mllvm -disable-llvm-optzns -o - | llvm-cxxfilt -n | FileCheck %s --check-prefixes=CHECK,CHECK-LE // CHECK-BE-DAG: @_mm_movemask_pd.perm_mask = internal constant <4 x i32> <i32 -2139062144, i32 -2139062144, i32 -2139062144, i32 -2139078656>, align 16 // CHECK-BE-DAG: @_mm_shuffle_epi32.permute_selectors = internal constant [4 x i32] [i32 66051, i32 67438087, i32 134810123, i32 202182159], align 4 diff --git a/clang/test/CodeGen/ppc-xmmintrin.c b/clang/test/CodeGen/ppc-xmmintrin.c index a7f6ed6e0e67..d3f18bfbb1e5 100644 --- a/clang/test/CodeGen/ppc-xmmintrin.c +++ b/clang/test/CodeGen/ppc-xmmintrin.c @@ -2,11 +2,11 @@ // REQUIRES: powerpc-registered-target // RUN: %clang -S -emit-llvm -target powerpc64-unknown-linux-gnu -mcpu=pwr8 -ffreestanding -DNO_WARN_X86_INTRINSICS %s \ -// RUN: -ffp-contract=off -fno-discard-value-names -mllvm -disable-llvm-optzns -o - | llvm-cxxfilt -n | FileCheck %s --check-prefixes=CHECK,CHECK-BE +// RUN: -fno-discard-value-names -mllvm -disable-llvm-optzns -o - | llvm-cxxfilt -n | FileCheck %s --check-prefixes=CHECK,CHECK-BE // RUN: %clang -x c++ -fsyntax-only -target powerpc64-unknown-linux-gnu -mcpu=pwr8 -ffreestanding -DNO_WARN_X86_INTRINSICS %s \ // RUN: -fno-discard-value-names -mllvm -disable-llvm-optzns // RUN: %clang -S -emit-llvm -target powerpc64le-unknown-linux-gnu -mcpu=pwr8 -ffreestanding -DNO_WARN_X86_INTRINSICS %s \ -// RUN: -ffp-contract=off -fno-discard-value-names -mllvm -disable-llvm-optzns -o - | llvm-cxxfilt -n | FileCheck %s --check-prefixes=CHECK,CHECK-LE +// RUN: -fno-discard-value-names -mllvm -disable-llvm-optzns -o - | llvm-cxxfilt -n | FileCheck %s --check-prefixes=CHECK,CHECK-LE // RUN: %clang -x c++ -fsyntax-only -target powerpc64le-unknown-linux-gnu -mcpu=pwr8 -ffreestanding -DNO_WARN_X86_INTRINSICS %s \ // RUN: -fno-discard-value-names -mllvm -disable-llvm-optzns diff --git a/clang/test/Driver/fp-model.c b/clang/test/Driver/fp-model.c index c6d683e25c0b..5fa9d110dd83 100644 --- a/clang/test/Driver/fp-model.c +++ b/clang/test/Driver/fp-model.c @@ -1,90 +1,88 @@ // Test that incompatible combinations of -ffp-model= options // and other floating point options get a warning diagnostic. +// +// REQUIRES: clang-driver -// RUN: %clang -target x86_64 -### -ffp-model=fast -ffp-contract=off -c %s 2>&1 \ +// RUN: %clang -### -ffp-model=fast -ffp-contract=off -c %s 2>&1 \ // RUN: | FileCheck --check-prefix=WARN %s // WARN: warning: overriding '-ffp-model=fast' option with '-ffp-contract=off' [-Woverriding-t-option] -// RUN: %clang -target x86_64 -### -ffp-model=fast -ffp-contract=on -c %s 2>&1 \ +// RUN: %clang -### -ffp-model=fast -ffp-contract=on -c %s 2>&1 \ // RUN: | FileCheck --check-prefix=WARN1 %s // WARN1: warning: overriding '-ffp-model=fast' option with '-ffp-contract=on' [-Woverriding-t-option] -// RUN: %clang -target x86_64 -### -ffp-model=strict -fassociative-math -c %s 2>&1 \ +// RUN: %clang -### -ffp-model=strict -fassociative-math -c %s 2>&1 \ // RUN: | FileCheck --check-prefix=WARN2 %s // WARN2: warning: overriding '-ffp-model=strict' option with '-fassociative-math' [-Woverriding-t-option] -// RUN: %clang -target x86_64 -### -ffp-model=strict -ffast-math -c %s 2>&1 \ +// RUN: %clang -### -ffp-model=strict -ffast-math -c %s 2>&1 \ // RUN: | FileCheck --check-prefix=WARN3 %s // WARN3: warning: overriding '-ffp-model=strict' option with '-ffast-math' [-Woverriding-t-option] -// RUN: %clang -target x86_64 -### -ffp-model=strict -ffinite-math-only -c %s 2>&1 \ +// RUN: %clang -### -ffp-model=strict -ffinite-math-only -c %s 2>&1 \ // RUN: | FileCheck --check-prefix=WARN4 %s // WARN4: warning: overriding '-ffp-model=strict' option with '-ffinite-math-only' [-Woverriding-t-option] -// RUN: %clang -target x86_64 -### -ffp-model=strict -ffp-contract=fast -c %s 2>&1 \ +// RUN: %clang -### -ffp-model=strict -ffp-contract=fast -c %s 2>&1 \ // RUN: | FileCheck --check-prefix=WARN5 %s // WARN5: warning: overriding '-ffp-model=strict' option with '-ffp-contract=fast' [-Woverriding-t-option] -// RUN: %clang -target x86_64 -### -ffp-model=strict -ffp-contract=fast -c %s 2>&1 \ -// RUN: | FileCheck --check-prefix=WARN6 %s -// WARN6: warning: overriding '-ffp-model=strict' option with '-ffp-contract=fast' [-Woverriding-t-option] - -// RUN: %clang -target x86_64 -### -ffp-model=strict -ffp-contract=on -c %s 2>&1 \ +// RUN: %clang -### -ffp-model=strict -ffp-contract=on -c %s 2>&1 \ // RUN: | FileCheck --check-prefix=WARN7 %s // WARN7: warning: overriding '-ffp-model=strict' option with '-ffp-contract=on' [-Woverriding-t-option] -// RUN: %clang -target x86_64 -### -ffp-model=strict -fno-honor-infinities -c %s 2>&1 \ +// RUN: %clang -### -ffp-model=strict -fno-honor-infinities -c %s 2>&1 \ // RUN: | FileCheck --check-prefix=WARN8 %s // WARN8: warning: overriding '-ffp-model=strict' option with '-fno-honor-infinities' [-Woverriding-t-option] -// RUN: %clang -target x86_64 -### -ffp-model=strict -fno-honor-nans -c %s 2>&1 \ +// RUN: %clang -### -ffp-model=strict -fno-honor-nans -c %s 2>&1 \ // RUN: | FileCheck --check-prefix=WARN9 %s // WARN9: warning: overriding '-ffp-model=strict' option with '-fno-honor-nans' [-Woverriding-t-option] -// RUN: %clang -target x86_64 -### -ffp-model=strict -fno-rounding-math -c %s 2>&1 \ +// RUN: %clang -### -ffp-model=strict -fno-rounding-math -c %s 2>&1 \ // RUN: | FileCheck --check-prefix=WARNa %s // WARNa: warning: overriding '-ffp-model=strict' option with '-fno-rounding-math' [-Woverriding-t-option] -// RUN: %clang -target x86_64 -### -ffp-model=strict -fno-signed-zeros -c %s 2>&1 \ +// RUN: %clang -### -ffp-model=strict -fno-signed-zeros -c %s 2>&1 \ // RUN: | FileCheck --check-prefix=WARNb %s // WARNb: warning: overriding '-ffp-model=strict' option with '-fno-signed-zeros' [-Woverriding-t-option] -// RUN: %clang -target x86_64 -### -ffp-model=strict -fno-trapping-math -c %s 2>&1 \ +// RUN: %clang -### -ffp-model=strict -fno-trapping-math -c %s 2>&1 \ // RUN: | FileCheck --check-prefix=WARNc %s // WARNc: warning: overriding '-ffp-model=strict' option with '-fno-trapping-math' [-Woverriding-t-option] -// RUN: %clang -target x86_64 -### -ffp-model=strict -freciprocal-math -c %s 2>&1 \ +// RUN: %clang -### -ffp-model=strict -freciprocal-math -c %s 2>&1 \ // RUN: | FileCheck --check-prefix=WARNd %s // WARNd: warning: overriding '-ffp-model=strict' option with '-freciprocal-math' [-Woverriding-t-option] -// RUN: %clang -target x86_64 -### -ffp-model=strict -funsafe-math-optimizations -c %s 2>&1 \ +// RUN: %clang -### -ffp-model=strict -funsafe-math-optimizations -c %s 2>&1 \ // RUN: | FileCheck --check-prefix=WARNe %s // WARNe: warning: overriding '-ffp-model=strict' option with '-funsafe-math-optimizations' [-Woverriding-t-option] -// RUN: %clang -target x86_64 -### -ffp-model=strict -Ofast -c %s 2>&1 \ +// RUN: %clang -### -ffp-model=strict -Ofast -c %s 2>&1 \ // RUN: | FileCheck --check-prefix=WARNf %s // WARNf: warning: overriding '-ffp-model=strict' option with '-Ofast' [-Woverriding-t-option] -// RUN: %clang -target x86_64 -### -ffp-model=strict -fdenormal-fp-math=preserve-sign,preserve-sign -c %s 2>&1 \ +// RUN: %clang -### -ffp-model=strict -fdenormal-fp-math=preserve-sign,preserve-sign -c %s 2>&1 \ // RUN: | FileCheck --check-prefix=WARN10 %s // WARN10: warning: overriding '-ffp-model=strict' option with '-fdenormal-fp-math=preserve-sign,preserve-sign' [-Woverriding-t-option] -// RUN: %clang -target x86_64 -### -c %s 2>&1 \ +// RUN: %clang -### -c %s 2>&1 \ // RUN: | FileCheck --check-prefix=CHECK-NOROUND %s // CHECK-NOROUND: "-cc1" // CHECK-NOROUND: "-fno-rounding-math" -// RUN: %clang -target x86_64 -### -frounding-math -c %s 2>&1 \ +// RUN: %clang -### -frounding-math -c %s 2>&1 \ // RUN: | FileCheck --check-prefix=CHECK-ROUND --implicit-check-not ffp-exception-behavior=strict %s // CHECK-ROUND: "-cc1" // CHECK-ROUND: "-frounding-math" -// RUN: %clang -target x86_64 -### -ftrapping-math -c %s 2>&1 \ +// RUN: %clang -### -ftrapping-math -c %s 2>&1 \ // RUN: | FileCheck --check-prefix=CHECK-TRAP %s // CHECK-TRAP: "-cc1" // CHECK-TRAP: "-ffp-exception-behavior=strict" -// RUN: %clang -target x86_64 -### -nostdinc -ffp-model=fast -c %s 2>&1 \ +// RUN: %clang -### -nostdinc -ffp-model=fast -c %s 2>&1 \ // RUN: | FileCheck --check-prefix=CHECK-FPM-FAST %s // CHECK-FPM-FAST: "-cc1" // CHECK-FPM-FAST: "-menable-no-infs" @@ -98,35 +96,34 @@ // CHECK-FPM-FAST: "-ffast-math" // CHECK-FPM-FAST: "-ffinite-math-only" -// RUN: %clang -target x86_64 -### -nostdinc -ffp-model=precise -c %s 2>&1 \ +// RUN: %clang -### -nostdinc -ffp-model=precise -c %s 2>&1 \ // RUN: | FileCheck --check-prefix=CHECK-FPM-PRECISE %s // CHECK-FPM-PRECISE: "-cc1" -// CHECK-FPM-PRECISE: "-ffp-contract=on" +// CHECK-FPM-PRECISE: "-ffp-contract=fast" // CHECK-FPM-PRECISE: "-fno-rounding-math" -// RUN: %clang -target x86_64 -### -nostdinc -ffp-model=strict -c %s 2>&1 \ +// RUN: %clang -### -nostdinc -ffp-model=strict -c %s 2>&1 \ // RUN: | FileCheck --check-prefix=CHECK-FPM-STRICT %s // CHECK-FPM-STRICT: "-cc1" -// CHECK-FPM-STRICT: "-fmath-errno" -// CHECK-FPM-STRICT: "-ffp-contract=off" // CHECK-FPM-STRICT: "-frounding-math" // CHECK-FPM-STRICT: "-ffp-exception-behavior=strict" -// RUN: %clang -target x86_64 -### -nostdinc -ffp-exception-behavior=strict -c %s 2>&1 \ +// RUN: %clang -### -nostdinc -ffp-exception-behavior=strict -c %s 2>&1 \ // RUN: | FileCheck --check-prefix=CHECK-FEB-STRICT %s // CHECK-FEB-STRICT: "-cc1" // CHECK-FEB-STRICT: "-fno-rounding-math" // CHECK-FEB-STRICT: "-ffp-exception-behavior=strict" -// RUN: %clang -target x86_64 -### -nostdinc -ffp-exception-behavior=maytrap -c %s 2>&1 \ +// RUN: %clang -### -nostdinc -ffp-exception-behavior=maytrap -c %s 2>&1 \ // RUN: | FileCheck --check-prefix=CHECK-FEB-MAYTRAP %s // CHECK-FEB-MAYTRAP: "-cc1" // CHECK-FEB-MAYTRAP: "-fno-rounding-math" // CHECK-FEB-MAYTRAP: "-ffp-exception-behavior=maytrap" -// RUN: %clang -target x86_64 -### -nostdinc -ffp-exception-behavior=ignore -c %s 2>&1 \ +// RUN: %clang -### -nostdinc -ffp-exception-behavior=ignore -c %s 2>&1 \ // RUN: | FileCheck --check-prefix=CHECK-FEB-IGNORE %s // CHECK-FEB-IGNORE: "-cc1" // CHECK-FEB-IGNORE: "-fno-rounding-math" // CHECK-FEB-IGNORE: "-ffp-exception-behavior=ignore" + </cut>

3 years, 11 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_apm/llvm-master-aarch64-spec2k6-Oz_LTO - Build # 6 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_apm/llvm-master-aarch64-spec2k6-Oz_LTO. So far, this commit has regressed CI configurations: - tcwg_bmk_llvm_apm/llvm-master-aarch64-spec2k6-Oz_LTO Culprit: <cut> commit 131b4620ee7847102479f399ce3e35a3c1cb5461 Author: Corentin Jabot <corentin.jabot(a)gmail.com> Date: Fri Aug 6 10:29:28 2021 -0400 Implement P1937 consteval in unevaluated contexts In an unevaluated contexts, consteval functions should not be immediately evaluated. </cut> Results regressed to (for first_bad == 131b4620ee7847102479f399ce3e35a3c1cb5461) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -Oz_LTO artifacts/build-131b4620ee7847102479f399ce3e35a3c1cb5461/results_id: 1 # 470.lbm,lbm_base.default regressed by 104 from (for last_good == 3c8e94bc20e5829ab5167d21d242b6b624dd934e) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -Oz_LTO artifacts/build-3c8e94bc20e5829ab5167d21d242b6b624dd934e/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Results ID of last_good: apm_64/tcwg_bmk_llvm_apm/bisect-llvm-master-aarch64-spec2k6-Oz_LTO/4879 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Results ID of first_bad: apm_64/tcwg_bmk_llvm_apm/bisect-llvm-master-aarch64-spec2k6-Oz_LTO/4868 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Configuration details: Reproduce builds: <cut> mkdir investigate-llvm-131b4620ee7847102479f399ce3e35a3c1cb5461 cd investigate-llvm-131b4620ee7847102479f399ce3e35a3c1cb5461 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach 131b4620ee7847102479f399ce3e35a3c1cb5461 ../artifacts/test.sh # Reproduce last_good build git checkout --detach 3c8e94bc20e5829ab5167d21d242b6b624dd934e ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Full commit (up to 1000 lines): <cut> commit 131b4620ee7847102479f399ce3e35a3c1cb5461 Author: Corentin Jabot <corentin.jabot(a)gmail.com> Date: Fri Aug 6 10:29:28 2021 -0400 Implement P1937 consteval in unevaluated contexts In an unevaluated contexts, consteval functions should not be immediately evaluated. --- clang/lib/Sema/SemaExpr.cpp | 7 ++--- clang/test/CXX/basic/basic.def.odr/p2-typeid.cpp | 33 +++++++++++++++++++++++- clang/test/SemaCXX/cxx2a-consteval.cpp | 18 +++++++++++++ clang/www/cxx_status.html | 3 ++- 4 files changed, 56 insertions(+), 5 deletions(-) diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp index d316687c4cd8..8ef4a9d96320 100644 --- a/clang/lib/Sema/SemaExpr.cpp +++ b/clang/lib/Sema/SemaExpr.cpp @@ -16641,7 +16641,8 @@ void Sema::CheckUnusedVolatileAssignment(Expr *E) { } ExprResult Sema::CheckForImmediateInvocation(ExprResult E, FunctionDecl *Decl) { - if (!E.isUsable() || !Decl || !Decl->isConsteval() || isConstantEvaluated() || + if (isUnevaluatedContext() || !E.isUsable() || !Decl || + !Decl->isConsteval() || isConstantEvaluated() || RebuildingImmediateInvocation) return E; @@ -18758,8 +18759,8 @@ void Sema::MarkDeclRefReferenced(DeclRefExpr *E, const Expr *Base) { OdrUse = false; if (auto *FD = dyn_cast<FunctionDecl>(E->getDecl())) - if (!isConstantEvaluated() && FD->isConsteval() && - !RebuildingImmediateInvocation) + if (!isUnevaluatedContext() && !isConstantEvaluated() && + FD->isConsteval() && !RebuildingImmediateInvocation) ExprEvalContexts.back().ReferenceToConsteval.insert(E); MarkExprReferenced(*this, E->getLocation(), E->getDecl(), E, OdrUse, RefsMinusAssignments); diff --git a/clang/test/CXX/basic/basic.def.odr/p2-typeid.cpp b/clang/test/CXX/basic/basic.def.odr/p2-typeid.cpp index 55debe3ca731..fafcd127feec 100644 --- a/clang/test/CXX/basic/basic.def.odr/p2-typeid.cpp +++ b/clang/test/CXX/basic/basic.def.odr/p2-typeid.cpp @@ -1,4 +1,5 @@ // RUN: %clang_cc1 -fsyntax-only -verify %s +// RUN: %clang_cc1 -std=c++20 -fsyntax-only -verify %s // C++ [basic.def.odr]p2: // An expression is potentially evaluated unless it [...] is the @@ -16,7 +17,7 @@ struct Poly { struct NonPoly { }; -template<typename T, typename Result = T> +template<typename T, typename Result = T> struct X { Result f(T t) { return t + t; } // expected-error{{invalid operands}} @@ -34,3 +35,33 @@ void test(X<Poly> xp, X<Poly, Poly&> xpr, X<NonPoly> xnp, X<NonPoly, NonPoly&> x // Triggers an error (as it should); xpr.g(Poly()); // expected-note{{instantiation of member function}} } + +#if __cplusplus >= 202002L + +namespace unevaluated { + +struct S { + void f(); +}; +struct T { + virtual void f(); +}; + +consteval S *null_s() { return nullptr; } +consteval S *make_s() { return new S; } +consteval T *null_t() { return nullptr; } +consteval T *make_t() { return new T; } // #alloc + +void func() { + (void)typeid(*null_s()); + (void)typeid(*make_s()); + (void)typeid(*null_t()); // expected-warning {{expression with side effects will be evaluated despite being used as an operand to 'typeid'}} + (void)typeid(*make_t()); // expected-error {{call to consteval function 'unevaluated::make_t' is not a constant expression}} \ + expected-note {{pointer to heap-allocated object is not a constant expression}} \ + expected-note@#alloc {{heap allocation performed here}} \ + expected-warning {{expression with side effects will be evaluated despite being used as an operand to 'typeid'}} +} + +} // namespace unevaluated + +#endif diff --git a/clang/test/SemaCXX/cxx2a-consteval.cpp b/clang/test/SemaCXX/cxx2a-consteval.cpp index ecf8c1e0f5bd..04c8898aa5ba 100644 --- a/clang/test/SemaCXX/cxx2a-consteval.cpp +++ b/clang/test/SemaCXX/cxx2a-consteval.cpp @@ -594,3 +594,21 @@ void test() { } } // namespace special_ctor + +namespace unevaluated { + +template <typename T, typename U> struct is_same { static const bool value = false; }; +template <typename T> struct is_same<T, T> { static const bool value = true; }; + +long f(); // expected-note {{declared here}} +auto consteval g(auto a) { + return a; +} + +auto e = g(f()); // expected-error {{is not a constant expression}} + // expected-note@-1 {{non-constexpr function 'f' cannot be used in a constant expression}} + +using T = decltype(g(f())); +static_assert(is_same<long, T>::value); + +} // namespace unevaluated diff --git a/clang/www/cxx_status.html b/clang/www/cxx_status.html index 60ce69db9922..3cbee7026c5c 100755 --- a/clang/www/cxx_status.html +++ b/clang/www/cxx_status.html @@ -1105,10 +1105,11 @@ code. This issue is expected to be rectified soon. <tr> <td rowspan=2>Immediate functions (<tt>consteval</tt>)</td> <td><a href="https://wg21.link/p1073r3">P1073R3</a></td> - <td rowspan=2 class="none" align="center">No</td> + <td class="partial" align="center">Partial</td> </tr> <tr>  <td><a href="https://wg21.link/p1937r2">P1937R2</a></td> + <td class="unreleased" align="center">Clang 14</td> </tr> <tr> <td><tt>std::is_constant_evaluated</tt></td> </cut>

3 years, 11 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tx1/gnu-release-aarch64-spec2k6-O3 - Build # 32 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *gcc* in CI configuration tcwg_bmk_gnu_tx1/gnu-release-aarch64-spec2k6-O3. So far, this commit has regressed CI configurations: - tcwg_bmk_gnu_tx1/gnu-release-aarch64-spec2k6-O3 Culprit: <cut> commit ee875b63b22e30a0dcb4b05f7532c2c416ba6cd0 Author: Richard Biener <rguenther(a)suse.de> Date: Tue Aug 17 08:38:35 2021 +0200 tree-optimization/101868 - avoid PRE of trapping mems across calls This backports a fix for the omission of a check of trapping mems when hoisting them across calls that might not return. This was originally done as part of a fix to handle const functions that throw properly. 2021-08-17 Richard Biener <rguenther(a)suse.de> PR tree-optimization/101373 PR tree-optimization/101868 * tree-ssa-pre.c (prune_clobbered_mems): Also prune trapping references when the BB may not return. * gcc.dg/lto/pr101868_0.c: New testcase. * gcc.dg/lto/pr101868_1.c: Likewise. * gcc.dg/lto/pr101868_2.c: Likewise. * gcc.dg/lto/pr101868_3.c: Likewise. </cut> Results regressed to (for first_bad == ee875b63b22e30a0dcb4b05f7532c2c416ba6cd0) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # true: 0 # benchmark -- -O3 artifacts/build-ee875b63b22e30a0dcb4b05f7532c2c416ba6cd0/results_id: 1 # 429.mcf,mcf_base.default regressed by 106 # 429.mcf,[.] price_out_impl regressed by 174 from (for last_good == a0a0499b8bb920fdd98e791804812f001f0b4fe8) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # true: 0 # benchmark -- -O3 artifacts/build-a0a0499b8bb920fdd98e791804812f001f0b4fe8/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… Results ID of last_good: tx1_64/tcwg_bmk_gnu_tx1/bisect-gnu-release-aarch64-spec2k6-O3/4846 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… Results ID of first_bad: tx1_64/tcwg_bmk_gnu_tx1/bisect-gnu-release-aarch64-spec2k6-O3/4851 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… Configuration details: Reproduce builds: <cut> mkdir investigate-gcc-ee875b63b22e30a0dcb4b05f7532c2c416ba6cd0 cd investigate-gcc-ee875b63b22e30a0dcb4b05f7532c2c416ba6cd0 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /gcc/ ./ ./bisect/baseline/ cd gcc # Reproduce first_bad build git checkout --detach ee875b63b22e30a0dcb4b05f7532c2c416ba6cd0 ../artifacts/test.sh # Reproduce last_good build git checkout --detach a0a0499b8bb920fdd98e791804812f001f0b4fe8 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… Full commit (up to 1000 lines): <cut> commit ee875b63b22e30a0dcb4b05f7532c2c416ba6cd0 Author: Richard Biener <rguenther(a)suse.de> Date: Tue Aug 17 08:38:35 2021 +0200 tree-optimization/101868 - avoid PRE of trapping mems across calls This backports a fix for the omission of a check of trapping mems when hoisting them across calls that might not return. This was originally done as part of a fix to handle const functions that throw properly. 2021-08-17 Richard Biener <rguenther(a)suse.de> PR tree-optimization/101373 PR tree-optimization/101868 * tree-ssa-pre.c (prune_clobbered_mems): Also prune trapping references when the BB may not return. * gcc.dg/lto/pr101868_0.c: New testcase. * gcc.dg/lto/pr101868_1.c: Likewise. * gcc.dg/lto/pr101868_2.c: Likewise. * gcc.dg/lto/pr101868_3.c: Likewise. --- gcc/testsuite/gcc.dg/lto/pr101868_0.c | 33 +++++++++++++++++++++++++++++++++ gcc/testsuite/gcc.dg/lto/pr101868_1.c | 23 +++++++++++++++++++++++ gcc/testsuite/gcc.dg/lto/pr101868_2.c | 11 +++++++++++ gcc/testsuite/gcc.dg/lto/pr101868_3.c | 8 ++++++++ gcc/tree-ssa-pre.c | 7 +++++++ 5 files changed, 82 insertions(+) diff --git a/gcc/testsuite/gcc.dg/lto/pr101868_0.c b/gcc/testsuite/gcc.dg/lto/pr101868_0.c new file mode 100644 index 00000000000..c84d19b0267 --- /dev/null +++ b/gcc/testsuite/gcc.dg/lto/pr101868_0.c @@ -0,0 +1,33 @@ +/* { dg-lto-do run } */ +/* { dg-lto-options { "-O2 -fno-strict-aliasing -flto" } } */ + +typedef unsigned long VALUE; + +__attribute__ ((cold)) +void rb_check_type(VALUE, int); + +static VALUE +repro(VALUE dummy, VALUE hash) +{ + if (hash == 0) { + rb_check_type(hash, 1); + } + else if (*(long *)hash) { + rb_check_type(hash, 1); + } + + + return *(long *)hash; +} + +static VALUE (*that)(VALUE dummy, VALUE hash) = repro; + +int +main(int argc, char **argv) +{ + argc--; + that(0, argc); + + rb_check_type(argc, argc); + +} diff --git a/gcc/testsuite/gcc.dg/lto/pr101868_1.c b/gcc/testsuite/gcc.dg/lto/pr101868_1.c new file mode 100644 index 00000000000..146c14abc76 --- /dev/null +++ b/gcc/testsuite/gcc.dg/lto/pr101868_1.c @@ -0,0 +1,23 @@ +typedef unsigned long VALUE; + + +__attribute__ ((noreturn)) void rexc_raise(VALUE mesg); + +VALUE rb_donothing(VALUE klass); + +static void +funexpected_type(VALUE x, int xt, int t) +{ + rexc_raise(rb_donothing(0)); +} + +__attribute__ ((cold)) +void +rb_check_type(VALUE x, int t) +{ + int xt; + + if (x == 0) { + funexpected_type(x, xt, t); + } +} diff --git a/gcc/testsuite/gcc.dg/lto/pr101868_2.c b/gcc/testsuite/gcc.dg/lto/pr101868_2.c new file mode 100644 index 00000000000..e6f01b23f45 --- /dev/null +++ b/gcc/testsuite/gcc.dg/lto/pr101868_2.c @@ -0,0 +1,11 @@ +typedef unsigned long VALUE; + +static void thing(void) {} +static void (*ptr)(void) = &thing; + +VALUE +rb_donothing(VALUE klass) +{ + ptr(); + return 0; +} diff --git a/gcc/testsuite/gcc.dg/lto/pr101868_3.c b/gcc/testsuite/gcc.dg/lto/pr101868_3.c new file mode 100644 index 00000000000..61217625be7 --- /dev/null +++ b/gcc/testsuite/gcc.dg/lto/pr101868_3.c @@ -0,0 +1,8 @@ +typedef unsigned long VALUE; + +__attribute__((noreturn)) +void +rexc_raise(VALUE mesg) +{ + __builtin_exit(0); +} diff --git a/gcc/tree-ssa-pre.c b/gcc/tree-ssa-pre.c index 04ec4fbaeec..2aedc31e1d7 100644 --- a/gcc/tree-ssa-pre.c +++ b/gcc/tree-ssa-pre.c @@ -2070,6 +2070,13 @@ prune_clobbered_mems (bitmap_set_t set, basic_block block) && value_dies_in_block_x (expr, block)))) to_remove = i; } + /* If the REFERENCE may trap make sure the block does not contain + a possible exit point. + ??? This is overly conservative if we translate AVAIL_OUT + as the available expression might be after the exit point. */ + if (BB_MAY_NOTRETURN (block) + && vn_reference_may_trap (ref)) + to_remove = i; } else if (expr->kind == NARY) { </cut>

3 years, 11 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tx1/llvm-master-aarch64-spec2k6-O3_LTO - Build # 32 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_tx1/llvm-master-aarch64-spec2k6-O3_LTO. So far, this commit has regressed CI configurations: - tcwg_bmk_llvm_tx1/llvm-master-aarch64-spec2k6-O3_LTO Culprit: <cut> commit 19dc02e99f802922a3af69e802465bee0723b57a Author: Nikita Popov <nikita.ppv(a)gmail.com> Date: Sun Aug 22 18:15:55 2021 +0200 [MergeICmps] Allow sinking past non-load/store This is a followup to D106591. MergeICmps currently only allows sinking the loads past either instructions that don't write to memory at all, or simple loads/stores that don't modify the memory the loads access. The "simple loads/stores" part of this check doesn't seem necessary to me -- AA isModRef() already accurately models any operation that may clobber the memory. For example, in the adjusted test case the transform is still fine if the call to @foo() isn't readonly, but inaccessiblememonly -- in both cases, the call cannot modify the loaded memory. Differential Revision: https://reviews.llvm.org/D108517 </cut> Results regressed to (for first_bad == 19dc02e99f802922a3af69e802465bee0723b57a) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O3_LTO artifacts/build-19dc02e99f802922a3af69e802465bee0723b57a/results_id: 1 # 464.h264ref,h264ref_base.default regressed by 105 from (for last_good == da12d88b1c5fc42b49b92fcf94917ca489dd677f) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O3_LTO artifacts/build-da12d88b1c5fc42b49b92fcf94917ca489dd677f/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Results ID of last_good: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-master-aarch64-spec2k6-O3_LTO/4822 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Results ID of first_bad: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-master-aarch64-spec2k6-O3_LTO/4807 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Configuration details: Reproduce builds: <cut> mkdir investigate-llvm-19dc02e99f802922a3af69e802465bee0723b57a cd investigate-llvm-19dc02e99f802922a3af69e802465bee0723b57a git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach 19dc02e99f802922a3af69e802465bee0723b57a ../artifacts/test.sh # Reproduce last_good build git checkout --detach da12d88b1c5fc42b49b92fcf94917ca489dd677f ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Full commit (up to 1000 lines): <cut> commit 19dc02e99f802922a3af69e802465bee0723b57a Author: Nikita Popov <nikita.ppv(a)gmail.com> Date: Sun Aug 22 18:15:55 2021 +0200 [MergeICmps] Allow sinking past non-load/store This is a followup to D106591. MergeICmps currently only allows sinking the loads past either instructions that don't write to memory at all, or simple loads/stores that don't modify the memory the loads access. The "simple loads/stores" part of this check doesn't seem necessary to me -- AA isModRef() already accurately models any operation that may clobber the memory. For example, in the adjusted test case the transform is still fine if the call to @foo() isn't readonly, but inaccessiblememonly -- in both cases, the call cannot modify the loaded memory. Differential Revision: https://reviews.llvm.org/D108517 --- llvm/lib/Transforms/Scalar/MergeICmps.cpp | 14 +------------- .../Transforms/MergeICmps/X86/split-block-does-work.ll | 2 +- 2 files changed, 2 insertions(+), 14 deletions(-) diff --git a/llvm/lib/Transforms/Scalar/MergeICmps.cpp b/llvm/lib/Transforms/Scalar/MergeICmps.cpp index f13f24ad2027..34465c76dd3d 100644 --- a/llvm/lib/Transforms/Scalar/MergeICmps.cpp +++ b/llvm/lib/Transforms/Scalar/MergeICmps.cpp @@ -66,15 +66,6 @@ namespace { #define DEBUG_TYPE "mergeicmps" -// Returns true if the instruction is a simple load or a simple store -static bool isSimpleLoadOrStore(const Instruction *I) { - if (const LoadInst *LI = dyn_cast<LoadInst>(I)) - return LI->isSimple(); - if (const StoreInst *SI = dyn_cast<StoreInst>(I)) - return SI->isSimple(); - return false; -} - // A BCE atom "Binary Compare Expression Atom" represents an integer load // that is a constant offset from a base value, e.g. `a` or `o.c` in the example // at the top. @@ -244,10 +235,7 @@ bool BCECmpBlock::canSinkBCECmpInst(const Instruction *Inst, // If this instruction may clobber the loads and is in middle of the BCE cmp // block instructions, then bail for now. if (Inst->mayWriteToMemory()) { - // Bail if this is not a simple load or store - if (!isSimpleLoadOrStore(Inst)) - return false; - // Disallow stores that might alias the BCE operands + // Disallow instructions that might modify the BCE operands MemoryLocation LLoc = MemoryLocation::get(Cmp.Lhs.LoadI); MemoryLocation RLoc = MemoryLocation::get(Cmp.Rhs.LoadI); if (isModSet(AA.getModRefInfo(Inst, LLoc)) || diff --git a/llvm/test/Transforms/MergeICmps/X86/split-block-does-work.ll b/llvm/test/Transforms/MergeICmps/X86/split-block-does-work.ll index 0b9663f44980..1e341b92918d 100644 --- a/llvm/test/Transforms/MergeICmps/X86/split-block-does-work.ll +++ b/llvm/test/Transforms/MergeICmps/X86/split-block-does-work.ll @@ -3,7 +3,7 @@ %S = type { i32, i32, i32, i32 } -declare void @foo(...) readonly +declare void @foo(...) inaccessiblememonly ; We can split %entry and create a memcmp(16 bytes). define zeroext i1 @opeq1( </cut>

3 years, 11 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_apm/llvm-release-aarch64-spec2k6-Os - Build # 1 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_apm/llvm-release-aarch64-spec2k6-Os. So far, this commit has regressed CI configurations: - tcwg_bmk_llvm_apm/llvm-release-aarch64-spec2k6-Os Culprit: <cut> commit 1828e57eb58685a6a7f6d4f4f698dfebf98ef789 Author: Sami Tolvanen <samitolvanen(a)google.com> Date: Tue Aug 3 10:56:56 2021 -0700 ThinLTO: Fix inline assembly references to static functions with CFI Create an internal alias with the original name for static functions that are renamed in promoteInternals to avoid breaking inline assembly references to them. Relands 700d07f8ce6f2879610fd6b6968b05c6f17bb915 with -msvc targets fixed. Link: https://github.com/ClangBuiltLinux/linux/issues/1354 Reviewed By: nickdesaulniers, pcc Differential Revision: https://reviews.llvm.org/D104058 (cherry picked from commit 7ce1c4da7726577986535cb7766d782f325145fe) </cut> Results regressed to (for first_bad == 1828e57eb58685a6a7f6d4f4f698dfebf98ef789) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -Os artifacts/build-1828e57eb58685a6a7f6d4f4f698dfebf98ef789/results_id: 1 # 453.povray,povray_base.default regressed by 102 # 470.lbm,lbm_base.default regressed by 103 # 470.lbm,[.] LBM_performStreamCollide regressed by 118 from (for last_good == 7161e4f3345fda1b640a8250a4b34d23c74b0489) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -Os artifacts/build-7161e4f3345fda1b640a8250a4b34d23c74b0489/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-release… Results ID of last_good: apm_64/tcwg_bmk_llvm_apm/bisect-llvm-release-aarch64-spec2k6-Os/4779 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-release… Results ID of first_bad: apm_64/tcwg_bmk_llvm_apm/bisect-llvm-release-aarch64-spec2k6-Os/4788 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-release… Configuration details: Reproduce builds: <cut> mkdir investigate-llvm-1828e57eb58685a6a7f6d4f4f698dfebf98ef789 cd investigate-llvm-1828e57eb58685a6a7f6d4f4f698dfebf98ef789 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-release… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-release… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-release… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach 1828e57eb58685a6a7f6d4f4f698dfebf98ef789 ../artifacts/test.sh # Reproduce last_good build git checkout --detach 7161e4f3345fda1b640a8250a4b34d23c74b0489 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-release… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-release… Full commit (up to 1000 lines): <cut> commit 1828e57eb58685a6a7f6d4f4f698dfebf98ef789 Author: Sami Tolvanen <samitolvanen(a)google.com> Date: Tue Aug 3 10:56:56 2021 -0700 ThinLTO: Fix inline assembly references to static functions with CFI Create an internal alias with the original name for static functions that are renamed in promoteInternals to avoid breaking inline assembly references to them. Relands 700d07f8ce6f2879610fd6b6968b05c6f17bb915 with -msvc targets fixed. Link: https://github.com/ClangBuiltLinux/linux/issues/1354 Reviewed By: nickdesaulniers, pcc Differential Revision: https://reviews.llvm.org/D104058 (cherry picked from commit 7ce1c4da7726577986535cb7766d782f325145fe) --- llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp | 21 +++++++++++++++++++++ llvm/test/ThinLTO/X86/devirt2.ll | 4 ++++ .../cfi-icall-static-inline-asm.ll | 22 ++++++++++++++++++++++ .../ThinLTOBitcodeWriter/split-internal2.ll | 3 +++ .../ThinLTOBitcodeWriter/split-vfunc-internal.ll | 3 +++ 5 files changed, 53 insertions(+) diff --git a/llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp b/llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp index 37329b489555..eea848d3eb2f 100644 --- a/llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp +++ b/llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp @@ -33,6 +33,19 @@ using namespace llvm; namespace { +// Determine if a promotion alias should be created for a symbol name. +static bool allowPromotionAlias(const std::string &Name) { + // Promotion aliases are used only in inline assembly. It's safe to + // simply skip unusual names. Subset of MCAsmInfo::isAcceptableChar() + // and MCAsmInfoXCOFF::isAcceptableChar(). + for (const char &C : Name) { + if (isAlnum(C) || C == '_' || C == '.') + continue; + return false; + } + return true; +} + // Promote each local-linkage entity defined by ExportM and used by ImportM by // changing visibility and appending the given ModuleId. void promoteInternals(Module &ExportM, Module &ImportM, StringRef ModuleId, @@ -55,6 +68,7 @@ void promoteInternals(Module &ExportM, Module &ImportM, StringRef ModuleId, } } + std::string OldName = Name.str(); std::string NewName = (Name + ModuleId).str(); if (const auto *C = ExportGV.getComdat()) @@ -69,6 +83,13 @@ void promoteInternals(Module &ExportM, Module &ImportM, StringRef ModuleId, ImportGV->setName(NewName); ImportGV->setVisibility(GlobalValue::HiddenVisibility); } + + if (isa<Function>(&ExportGV) && allowPromotionAlias(OldName)) { + // Create a local alias with the original name to avoid breaking + // references from inline assembly. + std::string Alias = ".set " + OldName + "," + NewName + "\n"; + ExportM.appendModuleInlineAsm(Alias); + } } if (!RenamedComdats.empty()) diff --git a/llvm/test/ThinLTO/X86/devirt2.ll b/llvm/test/ThinLTO/X86/devirt2.ll index 42c15f1c1df5..6501a01a39df 100644 --- a/llvm/test/ThinLTO/X86/devirt2.ll +++ b/llvm/test/ThinLTO/X86/devirt2.ll @@ -131,10 +131,12 @@ ; RUN: -r=%t1.o,_ZN1D1mEi, \ ; RUN: -r=%t1.o,test2, \ ; RUN: -r=%t2.o,_ZN1A1nEi,p \ +; RUN: -r=%t2.o,_ZN1A1nEi, \ ; RUN: -r=%t2.o,_ZN1B1fEi,p \ ; RUN: -r=%t2.o,_ZN1C1fEi,p \ ; RUN: -r=%t2.o,_ZN1D1mEi,p \ ; RUN: -r=%t2.o,_ZN1E1mEi,p \ +; RUN: -r=%t2.o,_ZN1E1mEi, \ ; RUN: -r=%t2.o,_ZTV1B, \ ; RUN: -r=%t2.o,_ZTV1C, \ ; RUN: -r=%t2.o,_ZTV1D, \ @@ -167,10 +169,12 @@ ; RUN: -r=%t1.o,_ZN1D1mEi, \ ; RUN: -r=%t1.o,test2, \ ; RUN: -r=%t2.o,_ZN1A1nEi,p \ +; RUN: -r=%t2.o,_ZN1A1nEi, \ ; RUN: -r=%t2.o,_ZN1B1fEi,p \ ; RUN: -r=%t2.o,_ZN1C1fEi,p \ ; RUN: -r=%t2.o,_ZN1D1mEi,p \ ; RUN: -r=%t2.o,_ZN1E1mEi,p \ +; RUN: -r=%t2.o,_ZN1E1mEi, \ ; RUN: -r=%t2.o,_ZTV1B, \ ; RUN: -r=%t2.o,_ZTV1C, \ ; RUN: -r=%t2.o,_ZTV1D, \ diff --git a/llvm/test/Transforms/ThinLTOBitcodeWriter/cfi-icall-static-inline-asm.ll b/llvm/test/Transforms/ThinLTOBitcodeWriter/cfi-icall-static-inline-asm.ll new file mode 100644 index 000000000000..c2de21ed4562 --- /dev/null +++ b/llvm/test/Transforms/ThinLTOBitcodeWriter/cfi-icall-static-inline-asm.ll @@ -0,0 +1,22 @@ +; REQUIRES: x86-registered-target +; RUN: opt -thinlto-bc -thinlto-split-lto-unit -o - %s | llvm-modextract -b -n 0 -o - | llvm-dis | FileCheck %s + +target triple = "x86_64-unknown-linux-gnu" + +; CHECK: module asm ".set a,a.[[HASH:[0-9a-f]+]]" + +define void @b() { + %f = alloca void ()*, align 8 + ; CHECK: store{{.*}} @a.[[HASH]],{{.*}} %f + store void ()* @a, void ()** %f, align 8 + ; CHECK: %1 = call void ()* asm sideeffect "leaq a(%rip) + %1 = call void ()* asm sideeffect "leaq a(%rip), $0\0A\09", "=r,~{dirflag},~{fpsr},~{flags}"() + ret void +} + +; CHECK: define{{.*}} @a.[[HASH]](){{.*}} !type +define internal void @a() !type !0 { + ret void +} + +!0 = !{i64 0, !"typeid1"} diff --git a/llvm/test/Transforms/ThinLTOBitcodeWriter/split-internal2.ll b/llvm/test/Transforms/ThinLTOBitcodeWriter/split-internal2.ll index 98cc80e557f9..f50fe3f93b08 100644 --- a/llvm/test/Transforms/ThinLTOBitcodeWriter/split-internal2.ll +++ b/llvm/test/Transforms/ThinLTOBitcodeWriter/split-internal2.ll @@ -1,3 +1,4 @@ +; REQUIRES: x86-registered-target ; RUN: opt -thinlto-bc -thinlto-split-lto-unit -o %t %s ; RUN: llvm-modextract -b -n 0 -o %t0 %t ; RUN: llvm-modextract -b -n 1 -o %t1 %t @@ -7,6 +8,8 @@ ; RUN: llvm-bcanalyzer -dump %t0 | FileCheck --check-prefix=BCA0 %s ; RUN: llvm-bcanalyzer -dump %t1 | FileCheck --check-prefix=BCA1 %s +target triple = "x86_64-unknown-linux-gnu" + ; ERROR: llvm-modextract: error: module index out of range; bitcode file contains 2 module(s) ; BCA0: <GLOBALVAL_SUMMARY_BLOCK diff --git a/llvm/test/Transforms/ThinLTOBitcodeWriter/split-vfunc-internal.ll b/llvm/test/Transforms/ThinLTOBitcodeWriter/split-vfunc-internal.ll index d17cbefb0fb1..0d67b74ca5fc 100644 --- a/llvm/test/Transforms/ThinLTOBitcodeWriter/split-vfunc-internal.ll +++ b/llvm/test/Transforms/ThinLTOBitcodeWriter/split-vfunc-internal.ll @@ -1,7 +1,10 @@ +; REQUIRES: x86-registered-target ; RUN: opt -thinlto-bc -thinlto-split-lto-unit -o %t %s ; RUN: llvm-modextract -b -n 0 -o - %t | llvm-dis | FileCheck --check-prefix=M0 %s ; RUN: llvm-modextract -b -n 1 -o - %t | llvm-dis | FileCheck --check-prefix=M1 %s +target triple = "x86_64-unknown-linux-gnu" + define [1 x i8*]* @source() { ret [1 x i8*]* @g } </cut>

3 years, 11 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tx1/llvm-master-aarch64-spec2k6-O2 - Build # 17 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_tx1/llvm-master-aarch64-spec2k6-O2. So far, this commit has regressed CI configurations: - tcwg_bmk_llvm_tx1/llvm-master-aarch64-spec2k6-O2 Culprit: <cut> commit d39d3a327b1303012370e47d991459ffbfce45ef Author: Peyton, Jonathan L <jonathan.l.peyton(a)intel.com> Date: Fri Aug 20 16:06:13 2021 -0500 [OpenMP][test] fix omp_get_wtime.c test to be more accommodating The omp_get_wtime.c test fails intermittently if the recorded times are off by too much which can happen when many tests are run in parallel. Instead of failing if one timing is a little off, take average of 100 timings minus the 10 worst. Differential Revision: https://reviews.llvm.org/D108488 </cut> Results regressed to (for first_bad == d39d3a327b1303012370e47d991459ffbfce45ef) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O2 artifacts/build-d39d3a327b1303012370e47d991459ffbfce45ef/results_id: 1 # 447.dealII,dealII_base.default regressed by 105 from (for last_good == f77174d4b8cfba3c0a53c78e53edbbaf57e37fc5) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O2 artifacts/build-f77174d4b8cfba3c0a53c78e53edbbaf57e37fc5/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Results ID of last_good: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-master-aarch64-spec2k6-O2/4734 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Results ID of first_bad: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-master-aarch64-spec2k6-O2/4757 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Configuration details: Reproduce builds: <cut> mkdir investigate-llvm-d39d3a327b1303012370e47d991459ffbfce45ef cd investigate-llvm-d39d3a327b1303012370e47d991459ffbfce45ef git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach d39d3a327b1303012370e47d991459ffbfce45ef ../artifacts/test.sh # Reproduce last_good build git checkout --detach f77174d4b8cfba3c0a53c78e53edbbaf57e37fc5 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Full commit (up to 1000 lines): <cut> commit d39d3a327b1303012370e47d991459ffbfce45ef Author: Peyton, Jonathan L <jonathan.l.peyton(a)intel.com> Date: Fri Aug 20 16:06:13 2021 -0500 [OpenMP][test] fix omp_get_wtime.c test to be more accommodating The omp_get_wtime.c test fails intermittently if the recorded times are off by too much which can happen when many tests are run in parallel. Instead of failing if one timing is a little off, take average of 100 timings minus the 10 worst. Differential Revision: https://reviews.llvm.org/D108488 --- openmp/runtime/test/api/omp_get_wtime.c | 75 ++++++++++++++++++++++++++------- 1 file changed, 59 insertions(+), 16 deletions(-) diff --git a/openmp/runtime/test/api/omp_get_wtime.c b/openmp/runtime/test/api/omp_get_wtime.c index e2bb211e0ce4..a862e07fc5a2 100644 --- a/openmp/runtime/test/api/omp_get_wtime.c +++ b/openmp/runtime/test/api/omp_get_wtime.c @@ -4,30 +4,73 @@ #include "omp_testsuite.h" #include "omp_my_sleep.h" -int test_omp_get_wtime() -{ +#define NTIMES 100 + +// This is the error % threshold. Be generous with the error threshold since +// this test may be run in parallel with many other tests it may throw off the +// sleep timing. +#define THRESHOLD 33.0 + +double test_omp_get_wtime(double desired_wait_time) { double start; double end; - double measured_time; - double wait_time = 0.2; start = 0; end = 0; start = omp_get_wtime(); - my_sleep (wait_time); + my_sleep(desired_wait_time); end = omp_get_wtime(); - measured_time = end-start; - return ((measured_time > 0.97 * wait_time) && (measured_time < 1.03 * wait_time)) ; + return end - start; } -int main() -{ - int i; - int num_failed=0; +int compare_times(const void *lhs, const void *rhs) { + const double *a = (const double *)lhs; + const double *b = (const double *)rhs; + return *a - *b; +} + +int main() { + int i, final_count; + double percent_off; + double *begin, *end, *ptr; + double wait_time = 0.01; + double average = 0.0; + double n = 0.0; + double *times = (double *)malloc(sizeof(double) * NTIMES); + + // Get each timing + for (i = 0; i < NTIMES; i++) { + times[i] = test_omp_get_wtime(wait_time); + } + + // Remove approx the "worst" tenth of the timings + qsort(times, NTIMES, sizeof(double), compare_times); + begin = times; + end = times + NTIMES; + for (i = 0; i < NTIMES / 10; ++i) { + if (i % 2 == 0) + begin++; + else + end--; + } + + // Get the average of the remaining timings + for (ptr = begin, final_count = 0; ptr != end; ++ptr, ++final_count) + average += times[i]; + average /= (double)final_count; + free(times); + + // Calculate the percent off of desired wait time + percent_off = (average - wait_time) / wait_time * 100.0; + // Should always be positive, but just in case + if (percent_off < 0) + percent_off = -percent_off; - for(i = 0; i < REPETITIONS; i++) { - if(!test_omp_get_wtime()) { - num_failed++; - } + if (percent_off > (double)THRESHOLD) { + fprintf(stderr, "error: average of %d runs (%lf) is of by %lf%%\n", NTIMES, + average, percent_off); + return EXIT_FAILURE; } - return num_failed; + printf("pass: average of %d runs (%lf) is only off by %lf%%\n", NTIMES, + average, percent_off); + return EXIT_SUCCESS; } </cut>

3 years, 11 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_apm/llvm-release-arm-spec2k6-Oz - Build # 4 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_apm/llvm-release-arm-spec2k6-Oz. So far, this commit has regressed CI configurations: - tcwg_bmk_llvm_apm/llvm-release-arm-spec2k6-Oz Culprit: <cut> commit 876de062f94650f9ded56a22b062236f711fcd18 Author: Marius Brehler <marius.brehler(a)iml.fraunhofer.de> Date: Wed Jun 9 13:38:10 2021 +0000 [mlir] Add EmitC dialect This upstreams the EmitC dialect and the corresponding Cpp target, both initially presented with [1], from [2] to MLIR core. For the related discussion, see [3]. [1] https://reviews.llvm.org/D76571 [2] https://github.com/iml130/mlir-emitc [3] https://llvm.discourse.group/t/emitc-generating-c-c-from-mlir/3388 Co-authored-by: Jacques Pienaar <jpienaar(a)google.com> Co-authored-by: Simon Camphausen <simon.camphausen(a)iml.fraunhofer.de> Co-authored-by: Oliver Scherf <oliver.scherf(a)iml.fraunhofer.de> Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D103969 </cut> Results regressed to (for first_bad == 876de062f94650f9ded56a22b062236f711fcd18) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=thumb --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=thumb --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -Oz_mthumb artifacts/build-876de062f94650f9ded56a22b062236f711fcd18/results_id: 1 # 470.lbm,lbm_base.default regressed by 107 from (for last_good == 1bd4085e0bbc14ec61ab69c83464098622b2df56) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=thumb --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=thumb --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -Oz_mthumb artifacts/build-1bd4085e0bbc14ec61ab69c83464098622b2df56/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-release… Results ID of last_good: apm_32/tcwg_bmk_llvm_apm/bisect-llvm-release-arm-spec2k6-Oz/4752 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-release… Results ID of first_bad: apm_32/tcwg_bmk_llvm_apm/bisect-llvm-release-arm-spec2k6-Oz/4718 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-release… Configuration details: Reproduce builds: <cut> mkdir investigate-llvm-876de062f94650f9ded56a22b062236f711fcd18 cd investigate-llvm-876de062f94650f9ded56a22b062236f711fcd18 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-release… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-release… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-release… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach 876de062f94650f9ded56a22b062236f711fcd18 ../artifacts/test.sh # Reproduce last_good build git checkout --detach 1bd4085e0bbc14ec61ab69c83464098622b2df56 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-release… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-release… Full commit (up to 1000 lines): <cut> commit 876de062f94650f9ded56a22b062236f711fcd18 Author: Marius Brehler <marius.brehler(a)iml.fraunhofer.de> Date: Wed Jun 9 13:38:10 2021 +0000 [mlir] Add EmitC dialect This upstreams the EmitC dialect and the corresponding Cpp target, both initially presented with [1], from [2] to MLIR core. For the related discussion, see [3]. [1] https://reviews.llvm.org/D76571 [2] https://github.com/iml130/mlir-emitc [3] https://llvm.discourse.group/t/emitc-generating-c-c-from-mlir/3388 Co-authored-by: Jacques Pienaar <jpienaar(a)google.com> Co-authored-by: Simon Camphausen <simon.camphausen(a)iml.fraunhofer.de> Co-authored-by: Oliver Scherf <oliver.scherf(a)iml.fraunhofer.de> Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D103969 --- mlir/include/mlir/Dialect/CMakeLists.txt | 1 + mlir/include/mlir/Dialect/EmitC/CMakeLists.txt | 1 + mlir/include/mlir/Dialect/EmitC/IR/CMakeLists.txt | 7 + mlir/include/mlir/Dialect/EmitC/IR/EmitC.h | 32 ++++ mlir/include/mlir/Dialect/EmitC/IR/EmitC.td | 148 ++++++++++++++ .../mlir/Dialect/EmitC/IR/EmitCAttributes.td | 45 +++++ mlir/include/mlir/Dialect/EmitC/IR/EmitCBase.td | 28 +++ mlir/include/mlir/Dialect/EmitC/IR/EmitCTypes.td | 46 +++++ mlir/include/mlir/InitAllDialects.h | 2 + mlir/lib/Dialect/CMakeLists.txt | 1 + mlir/lib/Dialect/EmitC/CMakeLists.txt | 1 + mlir/lib/Dialect/EmitC/IR/CMakeLists.txt | 14 ++ mlir/lib/Dialect/EmitC/IR/EmitC.cpp | 212 +++++++++++++++++++++ mlir/test/Dialect/EmitC/invalid_ops.mlir | 79 ++++++++ mlir/test/Dialect/EmitC/ops.mlir | 24 +++ mlir/test/Dialect/EmitC/types.mlir | 18 ++ mlir/test/mlir-opt/commandline.mlir | 1 + 17 files changed, 660 insertions(+) diff --git a/mlir/include/mlir/Dialect/CMakeLists.txt b/mlir/include/mlir/Dialect/CMakeLists.txt index 2d6d04a52a9d..44a9249cef83 100644 --- a/mlir/include/mlir/Dialect/CMakeLists.txt +++ b/mlir/include/mlir/Dialect/CMakeLists.txt @@ -5,6 +5,7 @@ add_subdirectory(ArmSVE) add_subdirectory(AMX) add_subdirectory(Complex) add_subdirectory(DLTI) +add_subdirectory(EmitC) add_subdirectory(GPU) add_subdirectory(Math) add_subdirectory(Linalg) diff --git a/mlir/include/mlir/Dialect/EmitC/CMakeLists.txt b/mlir/include/mlir/Dialect/EmitC/CMakeLists.txt new file mode 100644 index 000000000000..f33061b2d87c --- /dev/null +++ b/mlir/include/mlir/Dialect/EmitC/CMakeLists.txt @@ -0,0 +1 @@ +add_subdirectory(IR) diff --git a/mlir/include/mlir/Dialect/EmitC/IR/CMakeLists.txt b/mlir/include/mlir/Dialect/EmitC/IR/CMakeLists.txt new file mode 100644 index 000000000000..09a9f7a2ec1c --- /dev/null +++ b/mlir/include/mlir/Dialect/EmitC/IR/CMakeLists.txt @@ -0,0 +1,7 @@ +add_mlir_dialect(EmitC emitc) +add_mlir_doc(EmitC EmitC Dialects/ -gen-dialect-doc) + +set(LLVM_TARGET_DEFINITIONS EmitCAttributes.td) +mlir_tablegen(EmitCAttributes.h.inc -gen-attrdef-decls) +mlir_tablegen(EmitCAttributes.cpp.inc -gen-attrdef-defs) +add_public_tablegen_target(MLIREmitCAttributesIncGen) diff --git a/mlir/include/mlir/Dialect/EmitC/IR/EmitC.h b/mlir/include/mlir/Dialect/EmitC/IR/EmitC.h new file mode 100644 index 000000000000..857d1430f941 --- /dev/null +++ b/mlir/include/mlir/Dialect/EmitC/IR/EmitC.h @@ -0,0 +1,32 @@ +//===- EmitC.h - EmitC Dialect ----------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file declares EmitC in MLIR. +// +//===----------------------------------------------------------------------===// + +#ifndef MLIR_DIALECT_EMITC_IR_EMITC_H +#define MLIR_DIALECT_EMITC_IR_EMITC_H + +#include "mlir/IR/BuiltinOps.h" +#include "mlir/IR/BuiltinTypes.h" +#include "mlir/IR/Dialect.h" +#include "mlir/Interfaces/SideEffectInterfaces.h" + +#include "mlir/Dialect/EmitC/IR/EmitCDialect.h.inc" + +#define GET_ATTRDEF_CLASSES +#include "mlir/Dialect/EmitC/IR/EmitCAttributes.h.inc" + +#define GET_TYPEDEF_CLASSES +#include "mlir/Dialect/EmitC/IR/EmitCTypes.h.inc" + +#define GET_OP_CLASSES +#include "mlir/Dialect/EmitC/IR/EmitC.h.inc" + +#endif // MLIR_DIALECT_EMITC_IR_EMITC_H diff --git a/mlir/include/mlir/Dialect/EmitC/IR/EmitC.td b/mlir/include/mlir/Dialect/EmitC/IR/EmitC.td new file mode 100644 index 000000000000..78c682a80671 --- /dev/null +++ b/mlir/include/mlir/Dialect/EmitC/IR/EmitC.td @@ -0,0 +1,148 @@ +//===- EmitC.td - EmitC operations--------------------------*- tablegen -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// Defines the MLIR EmitC operations. +// +//===----------------------------------------------------------------------===// + +#ifndef MLIR_DIALECT_EMITC_IR_EMITC +#define MLIR_DIALECT_EMITC_IR_EMITC + +include "mlir/Dialect/EmitC/IR/EmitCAttributes.td" +include "mlir/Dialect/EmitC/IR/EmitCTypes.td" + +include "mlir/Interfaces/SideEffectInterfaces.td" + +//===----------------------------------------------------------------------===// +// EmitC op definitions +//===----------------------------------------------------------------------===// + +// Base class for EmitC dialect ops. +class EmitC_Op<string mnemonic, list<OpTrait> traits = []> + : Op<EmitC_Dialect, mnemonic, traits> { + let verifier = "return ::verify(*this);"; +} + +def EmitC_ApplyOp : EmitC_Op<"apply", []> { + let summary = "Apply operation"; + let description = [{ + With the `apply` operation the operators & (address of) and * (contents of) + can be applied to a single operand. + + Example: + + ```mlir + // Custom form of applying the & operator. + %0 = emitc.apply "&"(%arg0) : (i32) -> !emitc.opaque<"int32_t*"> + + // Generic form of the same operation. + %0 = "emitc.apply"(%arg0) {applicableOperator = "&"} + : (i32) -> !emitc.opaque<"int32_t*"> + + ``` + }]; + let arguments = (ins + Arg<StrAttr, "the operator to apply">:$applicableOperator, + AnyType:$operand + ); + let results = (outs AnyType:$result); + let assemblyFormat = [{ + $applicableOperator `(` $operand `)` attr-dict `:` functional-type($operand, results) + }]; +} + +def EmitC_CallOp : EmitC_Op<"call", []> { + let summary = "Call operation"; + let description = [{ + The `call` operation represents a C++ function call. The call allows + specifying order of operands and attributes in the call as follows: + + - integer value of index type refers to an operand; + - attribute which will get lowered to constant value in call; + + Example: + + ```mlir + // Custom form defining a call to `foo()`. + %0 = emitc.call "foo" () : () -> i32 + + // Generic form of the same operation. + %0 = "emitc.call"() {callee = "foo"} : () -> i32 + ``` + }]; + let arguments = (ins + Arg<StrAttr, "the C++ function to call">:$callee, + Arg<OptionalAttr<ArrayAttr>, "the order of operands and further attributes">:$args, + Arg<OptionalAttr<ArrayAttr>, "template arguments">:$template_args, + Variadic<AnyType>:$operands + ); + let results = (outs Variadic<AnyType>); + let assemblyFormat = [{ + $callee `(` $operands `)` attr-dict `:` functional-type($operands, results) + }]; +} + +def EmitC_ConstantOp : EmitC_Op<"constant", [ConstantLike]> { + let summary = "Constant operation"; + let description = [{ + The `constant` operation produces an SSA value equal to some constant + specified by an attribute. This can be used to form simple integer and + floating point constants, as well as more exotic things like tensor + constants. The `constant` operation also supports the EmitC opaque + attribute and the EmitC opaque type. + + Example: + + ```mlir + // Integer constant + %0 = "emitc.constant"(){value = 42 : i32} : () -> i32 + + // Constant emitted as `int32_t* = NULL;` + %1 = "emitc.constant"() + {value = #emitc.opaque<"NULL"> : !emitc.opaque<"int32_t*">} + : () -> !emitc.opaque<"int32_t*"> + ``` + }]; + + let arguments = (ins AnyAttr:$value); + let results = (outs AnyType); + + let hasFolder = 1; +} + +def EmitC_IncludeOp + : EmitC_Op<"include", [NoSideEffect, HasParent<"ModuleOp">]> { + let summary = "Include operation"; + let description = [{ + The `include` operation allows to define a source file inclusion via the + `#include` directive. + + Example: + + ```mlir + // Custom form defining the inclusion of `<myheader>`. + emitc.include "myheader.h" is_standard_include + + // Generic form of the same operation. + "emitc.include" (){include = "myheader.h", is_standard_include} : () -> () + + // Generic form defining the inclusion of `"myheader"`. + "emitc.include" (){include = "myheader.h"} : () -> () + ``` + }]; + let arguments = (ins + Arg<StrAttr, "source file to include">:$include, + UnitAttr:$is_standard_include + ); + let assemblyFormat = [{ + $include attr-dict (`is_standard_include` $is_standard_include^)? + }]; + let verifier = ?; +} + +#endif // MLIR_DIALECT_EMITC_IR_EMITC diff --git a/mlir/include/mlir/Dialect/EmitC/IR/EmitCAttributes.td b/mlir/include/mlir/Dialect/EmitC/IR/EmitCAttributes.td new file mode 100644 index 000000000000..2dd782ba49bf --- /dev/null +++ b/mlir/include/mlir/Dialect/EmitC/IR/EmitCAttributes.td @@ -0,0 +1,45 @@ +//===- EmitCAttributes.td - EmitC attributes ---------------*- tablegen -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// Defines the MLIR EmitC attributes. +// +//===----------------------------------------------------------------------===// + +#ifndef MLIR_DIALECT_EMITC_IR_EMITCATTRIBUTES +#define MLIR_DIALECT_EMITC_IR_EMITCATTRIBUTES + +include "mlir/Dialect/EmitC/IR/EmitCBase.td" + +//===----------------------------------------------------------------------===// +// EmitC attribute definitions +//===----------------------------------------------------------------------===// + +class EmitC_Attr<string name, string attrMnemonic> + : AttrDef<EmitC_Dialect, name> { + let mnemonic = attrMnemonic; +} + +def EmitC_OpaqueAttr : EmitC_Attr<"Opaque", "opaque"> { + let summary = "An opaque attribute"; + + let description = [{ + An opaque attribute of which the value gets emitted as is. + + Example: + + ```mlir + #emitc.opaque<""> + #emitc.opaque<"NULL"> + #emitc.opaque<"nullptr"> + ``` + }]; + + let parameters = (ins StringRefParameter<"the opaque value">:$value); +} + +#endif // MLIR_DIALECT_EMITC_IR_EMITCATTRIBUTES diff --git a/mlir/include/mlir/Dialect/EmitC/IR/EmitCBase.td b/mlir/include/mlir/Dialect/EmitC/IR/EmitCBase.td new file mode 100644 index 000000000000..5b7e81e2833a --- /dev/null +++ b/mlir/include/mlir/Dialect/EmitC/IR/EmitCBase.td @@ -0,0 +1,28 @@ +//===- EmitCBase.td - EmitC dialect ------------------------*- tablegen -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// Defines the MLIR EmitC dialect. +// +//===----------------------------------------------------------------------===// + +#ifndef MLIR_DIALECT_EMITC_IR_EMITCBASE +#define MLIR_DIALECT_EMITC_IR_EMITCBASE + +include "mlir/IR/OpBase.td" + +//===----------------------------------------------------------------------===// +// EmitC dialect definition +//===----------------------------------------------------------------------===// + +def EmitC_Dialect : Dialect { + let name = "emitc"; + let cppNamespace = "::mlir::emitc"; + let hasConstantMaterializer = 1; +} + +#endif // MLIR_DIALECT_EMITC_IR_EMITCBASE diff --git a/mlir/include/mlir/Dialect/EmitC/IR/EmitCTypes.td b/mlir/include/mlir/Dialect/EmitC/IR/EmitCTypes.td new file mode 100644 index 000000000000..d6fdd0fbf82d --- /dev/null +++ b/mlir/include/mlir/Dialect/EmitC/IR/EmitCTypes.td @@ -0,0 +1,46 @@ +//===- EmitCTypes.td - EmitC types -------------------------*- tablegen -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// Defines the MLIR EmitC types. +// +//===----------------------------------------------------------------------===// + + +#ifndef MLIR_DIALECT_EMITC_IR_EMITCTYPES +#define MLIR_DIALECT_EMITC_IR_EMITCTYPES + +include "mlir/Dialect/EmitC/IR/EmitCBase.td" + +//===----------------------------------------------------------------------===// +// EmitC type definitions +//===----------------------------------------------------------------------===// + +class EmitC_Type<string name, string typeMnemonic> + : TypeDef<EmitC_Dialect, name> { + let mnemonic = typeMnemonic; +} + +def EmitC_OpaqueType : EmitC_Type<"Opaque", "opaque"> { + let summary = "An opaque type"; + + let description = [{ + An opaque data type of which the value gets emitted as is. + + Example: + + ```mlir + !emitc.opaque<"int"> + !emitc.opaque<"float *"> + !emitc.opaque<"std::vector<std::string>"> + ``` + }]; + + let parameters = (ins StringRefParameter<"the opaque value">:$value); +} + +#endif // MLIR_DIALECT_EMITC_IR_EMITCTYPES diff --git a/mlir/include/mlir/InitAllDialects.h b/mlir/include/mlir/InitAllDialects.h index e44f2e8f1ae0..c52dae3fd1b5 100644 --- a/mlir/include/mlir/InitAllDialects.h +++ b/mlir/include/mlir/InitAllDialects.h @@ -21,6 +21,7 @@ #include "mlir/Dialect/Async/IR/Async.h" #include "mlir/Dialect/Complex/IR/Complex.h" #include "mlir/Dialect/DLTI/DLTI.h" +#include "mlir/Dialect/EmitC/IR/EmitC.h" #include "mlir/Dialect/GPU/GPUDialect.h" #include "mlir/Dialect/LLVMIR/LLVMDialect.h" #include "mlir/Dialect/LLVMIR/NVVMDialect.h" @@ -57,6 +58,7 @@ inline void registerAllDialects(DialectRegistry &registry) { async::AsyncDialect, complex::ComplexDialect, DLTIDialect, + emitc::EmitCDialect, gpu::GPUDialect, LLVM::LLVMDialect, linalg::LinalgDialect, diff --git a/mlir/lib/Dialect/CMakeLists.txt b/mlir/lib/Dialect/CMakeLists.txt index f5124f7d138f..de946beef0d9 100644 --- a/mlir/lib/Dialect/CMakeLists.txt +++ b/mlir/lib/Dialect/CMakeLists.txt @@ -5,6 +5,7 @@ add_subdirectory(Async) add_subdirectory(AMX) add_subdirectory(Complex) add_subdirectory(DLTI) +add_subdirectory(EmitC) add_subdirectory(GPU) add_subdirectory(Linalg) add_subdirectory(LLVMIR) diff --git a/mlir/lib/Dialect/EmitC/CMakeLists.txt b/mlir/lib/Dialect/EmitC/CMakeLists.txt new file mode 100644 index 000000000000..f33061b2d87c --- /dev/null +++ b/mlir/lib/Dialect/EmitC/CMakeLists.txt @@ -0,0 +1 @@ +add_subdirectory(IR) diff --git a/mlir/lib/Dialect/EmitC/IR/CMakeLists.txt b/mlir/lib/Dialect/EmitC/IR/CMakeLists.txt new file mode 100644 index 000000000000..6283441fdadf --- /dev/null +++ b/mlir/lib/Dialect/EmitC/IR/CMakeLists.txt @@ -0,0 +1,14 @@ +add_mlir_dialect_library(MLIREmitC + EmitC.cpp + + ADDITIONAL_HEADER_DIRS + ${MLIR_MAIN_INCLUDE_DIR}/mlir/Dialect/EmitC + + DEPENDS + MLIREmitCIncGen + MLIREmitCAttributesIncGen + + LINK_LIBS PUBLIC + MLIRIR + MLIRSideEffectInterfaces + ) diff --git a/mlir/lib/Dialect/EmitC/IR/EmitC.cpp b/mlir/lib/Dialect/EmitC/IR/EmitC.cpp new file mode 100644 index 000000000000..364c247f75e4 --- /dev/null +++ b/mlir/lib/Dialect/EmitC/IR/EmitC.cpp @@ -0,0 +1,212 @@ +//===- EmitC.cpp - EmitC Dialect ------------------------------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "mlir/Dialect/EmitC/IR/EmitC.h" +#include "mlir/IR/Builders.h" +#include "mlir/IR/DialectImplementation.h" +#include "llvm/ADT/TypeSwitch.h" + +using namespace mlir; +using namespace mlir::emitc; + +//===----------------------------------------------------------------------===// +// EmitCDialect +//===----------------------------------------------------------------------===// + +void EmitCDialect::initialize() { + addOperations< +#define GET_OP_LIST +#include "mlir/Dialect/EmitC/IR/EmitC.cpp.inc" + >(); + addTypes< +#define GET_TYPEDEF_LIST +#include "mlir/Dialect/EmitC/IR/EmitCTypes.cpp.inc" + >(); + addAttributes< +#define GET_ATTRDEF_LIST +#include "mlir/Dialect/EmitC/IR/EmitCAttributes.cpp.inc" + >(); +} + +/// Materialize a single constant operation from a given attribute value with +/// the desired resultant type. +Operation *EmitCDialect::materializeConstant(OpBuilder &builder, + Attribute value, Type type, + Location loc) { + return builder.create<ConstantOp>(loc, type, value); +} + +//===----------------------------------------------------------------------===// +// ApplyOp +//===----------------------------------------------------------------------===// + +static LogicalResult verify(ApplyOp op) { + StringRef applicableOperator = op.applicableOperator(); + + // Applicable operator must not be empty. + if (applicableOperator.empty()) + return op.emitOpError("applicable operator must not be empty"); + + // Only `*` and `&` are supported. + if (applicableOperator != "&" && applicableOperator != "*") + return op.emitOpError("applicable operator is illegal"); + + return success(); +} + +//===----------------------------------------------------------------------===// +// CallOp +//===----------------------------------------------------------------------===// + +static LogicalResult verify(emitc::CallOp op) { + // Callee must not be empty. + if (op.callee().empty()) + return op.emitOpError("callee must not be empty"); + + if (Optional<ArrayAttr> argsAttr = op.args()) { + for (Attribute arg : argsAttr.getValue()) { + if (arg.getType().isa<IndexType>()) { + int64_t index = arg.cast<IntegerAttr>().getInt(); + // Args with elements of type index must be in range + // [0..operands.size). + if ((index < 0) || (index >= static_cast<int64_t>(op.getNumOperands()))) + return op.emitOpError("index argument is out of range"); + + // Args with elements of type ArrayAttr must have a type. + } else if (arg.isa<ArrayAttr>() && arg.getType().isa<NoneType>()) { + return op.emitOpError("array argument has no type"); + } + } + } + + if (Optional<ArrayAttr> templateArgsAttr = op.template_args()) { + for (Attribute tArg : templateArgsAttr.getValue()) { + if (!tArg.isa<TypeAttr>() && !tArg.isa<IntegerAttr>() && + !tArg.isa<FloatAttr>() && !tArg.isa<emitc::OpaqueAttr>()) + return op.emitOpError("template argument has invalid type"); + } + } + + return success(); +} + +//===----------------------------------------------------------------------===// +// ConstantOp +//===----------------------------------------------------------------------===// + +/// The constant op requires that the attribute's type matches the return type. +static LogicalResult verify(emitc::ConstantOp &op) { + Attribute value = op.value(); + Type type = op.getType(); + if (!value.getType().isa<NoneType>() && type != value.getType()) + return op.emitOpError() << "requires attribute's type (" << value.getType() + << ") to match op's return type (" << type << ")"; + return success(); +} + +OpFoldResult emitc::ConstantOp::fold(ArrayRef<Attribute> operands) { + assert(operands.empty() && "constant has no operands"); + return value(); +} + +//===----------------------------------------------------------------------===// +// TableGen'd op method definitions +//===----------------------------------------------------------------------===// + +#define GET_OP_CLASSES +#include "mlir/Dialect/EmitC/IR/EmitC.cpp.inc" + +//===----------------------------------------------------------------------===// +// EmitC Attributes +//===----------------------------------------------------------------------===// + +#define GET_ATTRDEF_CLASSES +#include "mlir/Dialect/EmitC/IR/EmitCAttributes.cpp.inc" + +Attribute emitc::OpaqueAttr::parse(MLIRContext *context, + DialectAsmParser &parser, Type type) { + if (parser.parseLess()) + return Attribute(); + StringRef value; + llvm::SMLoc loc = parser.getCurrentLocation(); + if (parser.parseOptionalString(&value)) { + parser.emitError(loc) << "expected string"; + return Attribute(); + } + if (parser.parseGreater()) + return Attribute(); + return get(context, value); +} + +Attribute EmitCDialect::parseAttribute(DialectAsmParser &parser, + Type type) const { + llvm::SMLoc typeLoc = parser.getCurrentLocation(); + StringRef mnemonic; + if (parser.parseKeyword(&mnemonic)) + return Attribute(); + Attribute genAttr; + OptionalParseResult parseResult = + generatedAttributeParser(getContext(), parser, mnemonic, type, genAttr); + if (parseResult.hasValue()) + return genAttr; + parser.emitError(typeLoc, "unknown attribute in EmitC dialect"); + return Attribute(); +} + +void EmitCDialect::printAttribute(Attribute attr, DialectAsmPrinter &os) const { + if (failed(generatedAttributePrinter(attr, os))) + llvm_unreachable("unexpected 'EmitC' attribute kind"); +} + +void emitc::OpaqueAttr::print(DialectAsmPrinter &printer) const { + printer << "opaque<\"" << getValue() << "\">"; +} + +//===----------------------------------------------------------------------===// +// EmitC Types +//===----------------------------------------------------------------------===// + +#define GET_TYPEDEF_CLASSES +#include "mlir/Dialect/EmitC/IR/EmitCTypes.cpp.inc" + +Type emitc::OpaqueType::parse(MLIRContext *context, DialectAsmParser &parser) { + if (parser.parseLess()) + return Type(); + StringRef value; + llvm::SMLoc loc = parser.getCurrentLocation(); + if (parser.parseOptionalString(&value) || value.empty()) { + parser.emitError(loc) << "expected non empty string"; + return Type(); + } + if (parser.parseGreater()) + return Type(); + return get(context, value); +} + +Type EmitCDialect::parseType(DialectAsmParser &parser) const { + llvm::SMLoc typeLoc = parser.getCurrentLocation(); + StringRef mnemonic; + if (parser.parseKeyword(&mnemonic)) + return Type(); + Type genType; + OptionalParseResult parseResult = + generatedTypeParser(getContext(), parser, mnemonic, genType); + if (parseResult.hasValue()) + return genType; + parser.emitError(typeLoc, "unknown type in EmitC dialect"); + return Type(); +} + +void EmitCDialect::printType(Type type, DialectAsmPrinter &os) const { + if (failed(generatedTypePrinter(type, os))) + llvm_unreachable("unexpected 'EmitC' type kind"); +} + +void emitc::OpaqueType::print(DialectAsmPrinter &printer) const { + printer << "opaque<\"" << getValue() << "\">"; +} diff --git a/mlir/test/Dialect/EmitC/invalid_ops.mlir b/mlir/test/Dialect/EmitC/invalid_ops.mlir new file mode 100644 index 000000000000..e86664627c36 --- /dev/null +++ b/mlir/test/Dialect/EmitC/invalid_ops.mlir @@ -0,0 +1,79 @@ +// RUN: mlir-opt %s -split-input-file -verify-diagnostics + +func @const_attribute_return_type_1() { + // expected-error @+1 {{'emitc.constant' op requires attribute's type ('i64') to match op's return type ('i32')}} + %c0 = "emitc.constant"(){value = 42: i64} : () -> i32 + return +} + +// ----- + +func @const_attribute_return_type_2() { + // expected-error @+1 {{'emitc.constant' op requires attribute's type ('!emitc.opaque<"int32_t*">') to match op's return type ('!emitc.opaque<"int32_t">')}} + %c0 = "emitc.constant"(){value = "nullptr" : !emitc.opaque<"int32_t*">} : () -> !emitc.opaque<"int32_t"> + return +} + +// ----- + +func @index_args_out_of_range_1() { + // expected-error @+1 {{'emitc.call' op index argument is out of range}} + emitc.call "test" () {args = [0 : index]} : () -> () + return +} + +// ----- + +func @index_args_out_of_range_2(%arg : i32) { + // expected-error @+1 {{'emitc.call' op index argument is out of range}} + emitc.call "test" (%arg, %arg) {args = [2 : index]} : (i32, i32) -> () + return +} + +// ----- + +func @empty_callee() { + // expected-error @+1 {{'emitc.call' op callee must not be empty}} + emitc.call "" () : () -> () + return +} + +// ----- + +func @nonetype_arg(%arg : i32) { + // expected-error @+1 {{'emitc.call' op array argument has no type}} + emitc.call "nonetype_arg"(%arg) {args = [0 : index, [0, 1, 2]]} : (i32) -> i32 + return +} + +// ----- + +func @array_template_arg(%arg : i32) { + // expected-error @+1 {{'emitc.call' op template argument has invalid type}} + emitc.call "nonetype_template_arg"(%arg) {template_args = [[0, 1, 2]]} : (i32) -> i32 + return +} + +// ----- + +func @dense_template_argument(%arg : i32) { + // expected-error @+1 {{'emitc.call' op template argument has invalid type}} + emitc.call "dense_template_argument"(%arg) {template_args = [dense<[1.0, 1.0]> : tensor<2xf32>]} : (i32) -> i32 + return +} + +// ----- + +func @empty_operator(%arg : i32) { + // expected-error @+1 {{'emitc.apply' op applicable operator must not be empty}} + %2 = emitc.apply ""(%arg) : (i32) -> !emitc.opaque<"int32_t*"> + return +} + +// ----- + +func @illegal_operator(%arg : i32) { + // expected-error @+1 {{'emitc.apply' op applicable operator is illegal}} + %2 = emitc.apply "+"(%arg) : (i32) -> !emitc.opaque<"int32_t*"> + return +} diff --git a/mlir/test/Dialect/EmitC/ops.mlir b/mlir/test/Dialect/EmitC/ops.mlir new file mode 100644 index 000000000000..3a48ff447e1c --- /dev/null +++ b/mlir/test/Dialect/EmitC/ops.mlir @@ -0,0 +1,24 @@ +// RUN: mlir-opt -verify-diagnostics %s | FileCheck %s + +"emitc.include" (){include = "test.h", is_standard_include} : () -> () +emitc.include "test.h" is_standard_include + +// CHECK-LABEL: func @f(%{{.*}}: i32, %{{.*}}: !emitc.opaque<"int32_t">) { +func @f(%arg0: i32, %f: !emitc.opaque<"int32_t">) { + %1 = "emitc.call"() {callee = "blah"} : () -> i64 + emitc.call "foo" (%1) {args = [ + 0 : index, dense<[0, 1]> : tensor<2xi32>, 0 : index + ]} : (i64) -> () + return +} + +func @c(%arg0: i32) { + %1 = "emitc.constant"(){value = 42 : i32} : () -> i32 + return +} + +func @a(%arg0: i32, %arg1: i32) { + %1 = "emitc.apply"(%arg0) {applicableOperator = "&"} : (i32) -> !emitc.opaque<"int32_t*"> + %2 = emitc.apply "&"(%arg1) : (i32) -> !emitc.opaque<"int32_t*"> + return +} diff --git a/mlir/test/Dialect/EmitC/types.mlir b/mlir/test/Dialect/EmitC/types.mlir new file mode 100644 index 000000000000..f1ffce74e4c2 --- /dev/null +++ b/mlir/test/Dialect/EmitC/types.mlir @@ -0,0 +1,18 @@ +// RUN: mlir-opt -verify-diagnostics %s | FileCheck %s +// check parser +// RUN: mlir-opt -verify-diagnostics %s | mlir-opt -verify-diagnostics | FileCheck %s + +// CHECK-LABEL: func @opaque_types() { +func @opaque_types() { + // CHECK-NEXT: !emitc.opaque<"int"> + emitc.call "f"() {args = [!emitc<"opaque<\"int\">">]} : () -> () + // CHECK-NEXT: !emitc.opaque<"byte"> + emitc.call "f"() {args = [!emitc<"opaque<\"byte\">">]} : () -> () + // CHECK-NEXT: !emitc.opaque<"unsigned"> + emitc.call "f"() {args = [!emitc<"opaque<\"unsigned\">">]} : () -> () + // CHECK-NEXT: !emitc.opaque<"status_t"> + emitc.call "f"() {args = [!emitc<"opaque<\"status_t\">">]} : () -> () + // CHECK-NEXT: !emitc.opaque<"std::vector<std::string>"> + emitc.call "f"() {args = [!emitc.opaque<"std::vector<std::string>">]} : () -> () + return +} diff --git a/mlir/test/mlir-opt/commandline.mlir b/mlir/test/mlir-opt/commandline.mlir index 125d6b1d950b..95c476a84163 100644 --- a/mlir/test/mlir-opt/commandline.mlir +++ b/mlir/test/mlir-opt/commandline.mlir @@ -8,6 +8,7 @@ // CHECK-NEXT: async // CHECK-NEXT: complex // CHECK-NEXT: dlti +// CHECK-NEXT: emitc // CHECK-NEXT: gpu // CHECK-NEXT: linalg // CHECK-NEXT: llvm </cut>

3 years, 11 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tx1/gnu-release-aarch64-spec2k6-O3_LTO - Build # 35 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *gcc* in CI configuration tcwg_bmk_gnu_tx1/gnu-release-aarch64-spec2k6-O3_LTO. So far, this commit has regressed CI configurations: - tcwg_bmk_gnu_tx1/gnu-release-aarch64-spec2k6-O3_LTO Culprit: <cut> commit 7a3248463c2095ba112a31809f2965d04bed03b3 Author: Mark Eggleston <markeggleston(a)gcc.gnu.org> Date: Mon Oct 7 09:13:16 2019 +0000 Delete auto-in_equiv.f90 forgot to use svn delete the first time. From-SVN: r276651 </cut> Results regressed to (for first_bad == 7a3248463c2095ba112a31809f2965d04bed03b3) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # true: 0 # benchmark -- -O3_LTO artifacts/build-7a3248463c2095ba112a31809f2965d04bed03b3/results_id: 1 # 436.cactusADM,cactusADM_base.default regressed by 104 from (for last_good == 9b0365879b3c4917f5a2485a1fca8bb678484bfe) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # true: 0 # benchmark -- -O3_LTO artifacts/build-9b0365879b3c4917f5a2485a1fca8bb678484bfe/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… Results ID of last_good: tx1_64/tcwg_bmk_gnu_tx1/bisect-gnu-release-aarch64-spec2k6-O3_LTO/4731 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… Results ID of first_bad: tx1_64/tcwg_bmk_gnu_tx1/bisect-gnu-release-aarch64-spec2k6-O3_LTO/4733 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… Configuration details: Reproduce builds: <cut> mkdir investigate-gcc-7a3248463c2095ba112a31809f2965d04bed03b3 cd investigate-gcc-7a3248463c2095ba112a31809f2965d04bed03b3 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /gcc/ ./ ./bisect/baseline/ cd gcc # Reproduce first_bad build git checkout --detach 7a3248463c2095ba112a31809f2965d04bed03b3 ../artifacts/test.sh # Reproduce last_good build git checkout --detach 9b0365879b3c4917f5a2485a1fca8bb678484bfe ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… Full commit (up to 1000 lines): <cut> commit 7a3248463c2095ba112a31809f2965d04bed03b3 Author: Mark Eggleston <markeggleston(a)gcc.gnu.org> Date: Mon Oct 7 09:13:16 2019 +0000 Delete auto-in_equiv.f90 forgot to use svn delete the first time. From-SVN: r276651 --- gcc/testsuite/gfortran.dg/auto_in_equiv_3.f90 | 63 --------------------------- 1 file changed, 63 deletions(-) diff --git a/gcc/testsuite/gfortran.dg/auto_in_equiv_3.f90 b/gcc/testsuite/gfortran.dg/auto_in_equiv_3.f90 deleted file mode 100644 index 57c384d1772..00000000000 --- a/gcc/testsuite/gfortran.dg/auto_in_equiv_3.f90 +++ /dev/null @@ -1,63 +0,0 @@ -! { dg-do run } -! { dg-options "-fdec-static -fno-automatic" } - -! Contributed by Mark Eggleston <mark.eggleston(a)codethink.com> - -! Storage is NOT on the static unless explicitly specified using the -! DEC extension "automatic". The address of the first local variable -! is used to determine that storage for the automatic local variable -! is different to that of a local variable with no attributes. The -! contents of the local variable in suba should be overwritten by the -! call to subb. -! -program test - integer :: dummy - integer, parameter :: address = kind(loc(dummy)) - integer(address) :: ad1 - integer(address) :: ad2 - integer(address) :: ad3 - logical :: ok - - call suba(0, ad1) - call subb(0, ad2) - call suba(1, ad1) - call subc(0, ad3) - ok = (ad1.eq.ad3).and.(ad1.ne.ad2) - if (.not.ok) stop 4 - -contains - subroutine suba(option, addr) - integer, intent(in) :: option - integer(address), intent(out) :: addr - integer, automatic :: a - integer :: b - equivalence (a, b) - addr = loc(a) - if (option.eq.0) then - ! initialise a and c - a = 9 - if (a.ne.b) stop 1 - if (loc(a).ne.loc(b)) stop 2 - else - ! a should've been overwritten - if (a.eq.9) stop 3 - end if - end subroutine suba - - subroutine subb(dummy, addr) - integer, intent(in) :: dummy - integer(address), intent(out) :: addr - integer :: x - addr = loc(x) - x = 77 - end subroutine subb - - subroutine subc(dummy, addr) - integer, intent(in) :: dummy - integer(address), intent(out) :: addr - integer, automatic :: y - addr = loc(y) - y = 77 - end subroutine subc - -end program test </cut>

3 years, 11 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tx1/llvm-master-aarch64-spec2k6-O3 - Build # 21 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_tx1/llvm-master-aarch64-spec2k6-O3. So far, this commit has regressed CI configurations: - tcwg_bmk_llvm_tx1/llvm-master-aarch64-spec2k6-O3 Culprit: <cut> commit 4aeeb91a9249282231cdd35773c17110e05a870d Author: MaheshRavishankar <ravishankarm(a)google.com> Date: Mon Aug 23 10:15:35 2021 -0700 [mlir][Linalg] Allow all build methods of Structured ops to specify additional attributes. Differential Revision: https://reviews.llvm.org/D108338 </cut> Results regressed to (for first_bad == 4aeeb91a9249282231cdd35773c17110e05a870d) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O3 artifacts/build-4aeeb91a9249282231cdd35773c17110e05a870d/results_id: 1 # 464.h264ref,[.] FastFullPelBlockMotionSearch regressed by 111 from (for last_good == 19dc02e99f802922a3af69e802465bee0723b57a) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O3 artifacts/build-19dc02e99f802922a3af69e802465bee0723b57a/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Results ID of last_good: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-master-aarch64-spec2k6-O3/4687 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Results ID of first_bad: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-master-aarch64-spec2k6-O3/4669 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Configuration details: Reproduce builds: <cut> mkdir investigate-llvm-4aeeb91a9249282231cdd35773c17110e05a870d cd investigate-llvm-4aeeb91a9249282231cdd35773c17110e05a870d git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach 4aeeb91a9249282231cdd35773c17110e05a870d ../artifacts/test.sh # Reproduce last_good build git checkout --detach 19dc02e99f802922a3af69e802465bee0723b57a ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Full commit (up to 1000 lines): <cut> commit 4aeeb91a9249282231cdd35773c17110e05a870d Author: MaheshRavishankar <ravishankarm(a)google.com> Date: Mon Aug 23 10:15:35 2021 -0700 [mlir][Linalg] Allow all build methods of Structured ops to specify additional attributes. Differential Revision: https://reviews.llvm.org/D108338 --- .../mlir/Dialect/Linalg/IR/LinalgStructuredOps.td | 12 ++++++++---- mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp | 19 ++++++++++++------- mlir/test/mlir-linalg-ods-gen/test-linalg-ods-gen.tc | 3 ++- .../tools/mlir-linalg-ods-gen/mlir-linalg-ods-gen.cpp | 12 +++++++++--- .../mlir-linalg-ods-gen/mlir-linalg-ods-yaml-gen.cpp | 8 ++++++-- 5 files changed, 37 insertions(+), 17 deletions(-) diff --git a/mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td b/mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td index 33f4992e41f9..1d4e6d546067 100644 --- a/mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td +++ b/mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td @@ -620,18 +620,22 @@ def GenericOp : LinalgStructuredBase_Op<"generic", [ "ValueRange":$outputs, "ArrayRef<AffineMap>":$indexingMaps, "ArrayRef<StringRef>":$iteratorTypes, "StringRef":$doc, "StringRef":$libraryCall, - CArg<"function_ref<void(OpBuilder &, Location, ValueRange)>", "nullptr">)>, + CArg<"function_ref<void(OpBuilder &, Location, ValueRange)>", "nullptr">, + CArg<"ArrayRef<NamedAttribute>", "{}">:$attributes)>, OpBuilder<(ins "ValueRange":$inputs, "ValueRange":$outputBuffers, "ArrayRef<AffineMap>":$indexingMaps, "ArrayRef<StringRef>":$iteratorTypes, "StringRef":$doc, "StringRef":$libraryCall, - CArg<"function_ref<void(OpBuilder &, Location, ValueRange)>", "nullptr">)>, + CArg<"function_ref<void(OpBuilder &, Location, ValueRange)>", "nullptr">, + CArg<"ArrayRef<NamedAttribute>", "{}">:$attributes)>, OpBuilder<(ins "TypeRange":$resultTensorTypes, "ValueRange":$inputs, "ValueRange":$outputs, "ArrayRef<AffineMap>":$indexingMaps, "ArrayRef<StringRef>":$iteratorTypes, - CArg<"function_ref<void(OpBuilder &, Location, ValueRange)>", "nullptr">)>, + CArg<"function_ref<void(OpBuilder &, Location, ValueRange)>", "nullptr">, + CArg<"ArrayRef<NamedAttribute>", "{}">:$attributes)>, OpBuilder<(ins "ValueRange":$inputs, "ValueRange":$outputBuffers, "ArrayRef<AffineMap>":$indexingMaps, "ArrayRef<StringRef>":$iteratorTypes, - CArg<"function_ref<void(OpBuilder &, Location, ValueRange)>", "nullptr">)> + CArg<"function_ref<void(OpBuilder &, Location, ValueRange)>", "nullptr">, + CArg<"ArrayRef<NamedAttribute>", "{}">:$attributes)> ]; let extraClassDeclaration = structuredOpsBaseDecls # [{ diff --git a/mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp b/mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp index 6b65a9ecd9e5..f4750ca390a8 100644 --- a/mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp +++ b/mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp @@ -502,13 +502,15 @@ void GenericOp::build( OpBuilder &builder, OperationState &result, TypeRange resultTensorTypes, ValueRange inputs, ValueRange outputs, ArrayRef<AffineMap> indexingMaps, ArrayRef<StringRef> iteratorTypes, StringRef doc, StringRef libraryCall, - function_ref<void(OpBuilder &, Location, ValueRange)> bodyBuild) { + function_ref<void(OpBuilder &, Location, ValueRange)> bodyBuild, + ArrayRef<NamedAttribute> attributes) { build(builder, result, resultTensorTypes, inputs, outputs, builder.getAffineMapArrayAttr(indexingMaps), builder.getStrArrayAttr(iteratorTypes), doc.empty() ? StringAttr() : builder.getStringAttr(doc), libraryCall.empty() ? StringAttr() : builder.getStringAttr(libraryCall)); + result.addAttributes(attributes); if (!bodyBuild) return; @@ -527,30 +529,33 @@ void GenericOp::build( OpBuilder &builder, OperationState &result, ValueRange inputs, ValueRange outputs, ArrayRef<AffineMap> indexingMaps, ArrayRef<StringRef> iteratorTypes, StringRef doc, StringRef libraryCall, - function_ref<void(OpBuilder &, Location, ValueRange)> bodyBuild) { + function_ref<void(OpBuilder &, Location, ValueRange)> bodyBuild, + ArrayRef<NamedAttribute> attributes) { build(builder, result, TypeRange{}, inputs, outputs, indexingMaps, - iteratorTypes, doc, libraryCall, bodyBuild); + iteratorTypes, doc, libraryCall, bodyBuild, attributes); } void GenericOp::build( OpBuilder &builder, OperationState &result, ValueRange inputs, ValueRange outputs, ArrayRef<AffineMap> indexingMaps, ArrayRef<StringRef> iteratorTypes, - function_ref<void(OpBuilder &, Location, ValueRange)> bodyBuild) { + function_ref<void(OpBuilder &, Location, ValueRange)> bodyBuild, + ArrayRef<NamedAttribute> attributes) { build(builder, result, inputs, outputs, indexingMaps, iteratorTypes, /*doc=*/"", - /*libraryCall=*/"", bodyBuild); + /*libraryCall=*/"", bodyBuild, attributes); } void GenericOp::build( OpBuilder &builder, OperationState &result, TypeRange resultTensorTypes, ValueRange inputs, ValueRange outputs, ArrayRef<AffineMap> indexingMaps, ArrayRef<StringRef> iteratorTypes, - function_ref<void(OpBuilder &, Location, ValueRange)> bodyBuild) { + function_ref<void(OpBuilder &, Location, ValueRange)> bodyBuild, + ArrayRef<NamedAttribute> attributes) { build(builder, result, resultTensorTypes, inputs, outputs, indexingMaps, iteratorTypes, /*doc=*/"", - /*libraryCall=*/"", bodyBuild); + /*libraryCall=*/"", bodyBuild, attributes); } static void print(OpAsmPrinter &p, GenericOp op) { diff --git a/mlir/test/mlir-linalg-ods-gen/test-linalg-ods-gen.tc b/mlir/test/mlir-linalg-ods-gen/test-linalg-ods-gen.tc index 471961f837bf..743bdbdb12d6 100644 --- a/mlir/test/mlir-linalg-ods-gen/test-linalg-ods-gen.tc +++ b/mlir/test/mlir-linalg-ods-gen/test-linalg-ods-gen.tc @@ -169,7 +169,8 @@ It has one output. // ODS-LABEL: def Test7Op // ODS: OpBuilder< // ODS: (ins "TypeRange":$resultTensorTypes, "ValueRange":$inputs, -// ODS: "ValueRange":$outputs, "Attribute":$attr_a, "Attribute":$attr_b) +// ODS: "ValueRange":$outputs, "Attribute":$attr_a, "Attribute":$attr_b, +// ODS: CArg<"ArrayRef<NamedAttribute>", "{}">:$attributes) // ODS: $_state.addAttribute("attr_a", attr_a); // ODS: $_state.addAttribute("attr_b", attr_b); // diff --git a/mlir/tools/mlir-linalg-ods-gen/mlir-linalg-ods-gen.cpp b/mlir/tools/mlir-linalg-ods-gen/mlir-linalg-ods-gen.cpp index 1bdb5b8806d0..590f17fdedfa 100644 --- a/mlir/tools/mlir-linalg-ods-gen/mlir-linalg-ods-gen.cpp +++ b/mlir/tools/mlir-linalg-ods-gen/mlir-linalg-ods-gen.cpp @@ -1910,7 +1910,8 @@ void TCParser::printODS(llvm::raw_ostream &os, StringRef cppOpName, let skipDefaultBuilders = 1; let builders = [ OpBuilder< - (ins "ValueRange":$inputs, "ValueRange":$outputs), + (ins "ValueRange":$inputs, "ValueRange":$outputs, + CArg<"ArrayRef<NamedAttribute>", "{{}">:$attributes), [{{ $_state.addOperands(inputs); $_state.addOperands(outputs); @@ -1919,6 +1920,7 @@ void TCParser::printODS(llvm::raw_ostream &os, StringRef cppOpName, $_builder.getI32VectorAttr({{ static_cast<int32_t>(inputs.size()), static_cast<int32_t>(outputs.size())})); + $_state.addAttributes(attributes); createAndFillStructuredOpRegion<{0}>( $_builder, $_state, @@ -1927,7 +1929,8 @@ void TCParser::printODS(llvm::raw_ostream &os, StringRef cppOpName, }]>, OpBuilder< (ins "TypeRange":$resultTensorTypes, "ValueRange":$inputs, - "ValueRange":$outputs), + "ValueRange":$outputs, + CArg<"ArrayRef<NamedAttribute>", "{{}">:$attributes), [{{ $_state.addOperands(inputs); $_state.addOperands(outputs); @@ -1937,6 +1940,7 @@ void TCParser::printODS(llvm::raw_ostream &os, StringRef cppOpName, $_builder.getI32VectorAttr({{ static_cast<int32_t>(inputs.size()), static_cast<int32_t>(outputs.size())})); + $_state.addAttributes(attributes); createAndFillStructuredOpRegion<{0}>( $_builder, $_state, @@ -2020,7 +2024,8 @@ void TCParser::printODS(llvm::raw_ostream &os, StringRef cppOpName, const char *builderFmt = R"FMT( , OpBuilder< (ins "TypeRange":$resultTensorTypes, "ValueRange":$inputs, - "ValueRange":$outputs, {1}), + "ValueRange":$outputs, {1}, + CArg<"ArrayRef<NamedAttribute>", "{{}">:$attributes), [{{ $_state.addOperands(inputs); $_state.addOperands(outputs); @@ -2030,6 +2035,7 @@ void TCParser::printODS(llvm::raw_ostream &os, StringRef cppOpName, $_builder.getI32VectorAttr({{ static_cast<int32_t>(inputs.size()), static_cast<int32_t>(outputs.size())})); + $_state.addAttributes(attributes); createAndFillStructuredOpRegion<{0}>( $_builder, $_state, diff --git a/mlir/tools/mlir-linalg-ods-gen/mlir-linalg-ods-yaml-gen.cpp b/mlir/tools/mlir-linalg-ods-gen/mlir-linalg-ods-yaml-gen.cpp index a0eb1dea8860..98e90b69d631 100644 --- a/mlir/tools/mlir-linalg-ods-gen/mlir-linalg-ods-yaml-gen.cpp +++ b/mlir/tools/mlir-linalg-ods-gen/mlir-linalg-ods-yaml-gen.cpp @@ -457,7 +457,8 @@ def {0} : LinalgStructuredBase_Op<"{1}", !listconcat([ let skipDefaultBuilders = 1; let builders = [ OpBuilder< - (ins "ValueRange":$inputs, "ValueRange":$outputs), + (ins "ValueRange":$inputs, "ValueRange":$outputs, + CArg<"ArrayRef<NamedAttribute>", "{{}">:$attributes), [{{ $_state.addOperands(inputs); $_state.addOperands(outputs); @@ -471,6 +472,7 @@ def {0} : LinalgStructuredBase_Op<"{1}", !listconcat([ $_builder.getI32VectorAttr({{ static_cast<int32_t>(inputs.size()), static_cast<int32_t>(outputs.size())})); + $_state.addAttributes(attributes); createAndFillStructuredOpRegion<{0}>( $_builder, $_state, @@ -539,7 +541,8 @@ def {0} : LinalgStructuredBase_Op<"{1}", !listconcat([ static const char structuredOpBuilderFormat[] = R"FMT( , OpBuilder< (ins "TypeRange":$resultTensorTypes, "ValueRange":$inputs, - "ValueRange":$outputs, {1}), + "ValueRange":$outputs, {1}, + CArg<"ArrayRef<NamedAttribute>", "{{}">:$attributes), [{{ $_state.addOperands(inputs); $_state.addOperands(outputs); @@ -555,6 +558,7 @@ static const char structuredOpBuilderFormat[] = R"FMT( TypeRange(inputs), TypeRange(outputs)); {2} + $_state.addAttributes(attributes); }]> )FMT"; </cut>

3 years, 11 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tx1/gnu-release-aarch64-spec2k6-O3_LTO - Build # 34 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *gcc* in CI configuration tcwg_bmk_gnu_tx1/gnu-release-aarch64-spec2k6-O3_LTO. So far, this commit has regressed CI configurations: - tcwg_bmk_gnu_tx1/gnu-release-aarch64-spec2k6-O3_LTO Culprit: <cut> commit 3694418a6d57c5b48383af8a5c6d1b1c2e3cec9b Author: Ian Lance Taylor <ian(a)gcc.gnu.org> Date: Fri Oct 4 13:50:07 2019 +0000 compiler: adjust code to avoid shadowing local variables Also add a couple of missing calls to free after mpz_get_str. This should make the code clean with respect to -Wshadow=local. Based on patch by Bernd Edlinger. Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/198837 From-SVN: r276579 </cut> Results regressed to (for first_bad == 3694418a6d57c5b48383af8a5c6d1b1c2e3cec9b) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # true: 0 # benchmark -- -O3_LTO artifacts/build-3694418a6d57c5b48383af8a5c6d1b1c2e3cec9b/results_id: 1 # 456.hmmer,hmmer_base.default regressed by 103 from (for last_good == 0ced79bc4c9925c574177cb6345c26e4aad4155f) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # true: 0 # benchmark -- -O3_LTO artifacts/build-0ced79bc4c9925c574177cb6345c26e4aad4155f/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… Results ID of last_good: tx1_64/tcwg_bmk_gnu_tx1/bisect-gnu-release-aarch64-spec2k6-O3_LTO/4662 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… Results ID of first_bad: tx1_64/tcwg_bmk_gnu_tx1/bisect-gnu-release-aarch64-spec2k6-O3_LTO/4665 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… Configuration details: Reproduce builds: <cut> mkdir investigate-gcc-3694418a6d57c5b48383af8a5c6d1b1c2e3cec9b cd investigate-gcc-3694418a6d57c5b48383af8a5c6d1b1c2e3cec9b git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /gcc/ ./ ./bisect/baseline/ cd gcc # Reproduce first_bad build git checkout --detach 3694418a6d57c5b48383af8a5c6d1b1c2e3cec9b ../artifacts/test.sh # Reproduce last_good build git checkout --detach 0ced79bc4c9925c574177cb6345c26e4aad4155f ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… Full commit (up to 1000 lines): <cut> commit 3694418a6d57c5b48383af8a5c6d1b1c2e3cec9b Author: Ian Lance Taylor <ian(a)gcc.gnu.org> Date: Fri Oct 4 13:50:07 2019 +0000 compiler: adjust code to avoid shadowing local variables Also add a couple of missing calls to free after mpz_get_str. This should make the code clean with respect to -Wshadow=local. Based on patch by Bernd Edlinger. Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/198837 From-SVN: r276579 --- gcc/go/gofrontend/MERGE | 2 +- gcc/go/gofrontend/ast-dump.cc | 8 +++--- gcc/go/gofrontend/escape.cc | 1 - gcc/go/gofrontend/expressions.cc | 54 ++++++++++++++++++---------------------- gcc/go/gofrontend/gogo.cc | 28 ++++++++++----------- gcc/go/gofrontend/parse.cc | 26 +++++++++---------- gcc/go/gofrontend/statements.cc | 17 +++++++------ gcc/go/gofrontend/types.cc | 40 ++++++++++++++--------------- 8 files changed, 85 insertions(+), 91 deletions(-) diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE index 54c682a8b78..bb509943d6e 100644 --- a/gcc/go/gofrontend/MERGE +++ b/gcc/go/gofrontend/MERGE @@ -1,4 +1,4 @@ -a3aef6b6df932ea6c7094d074695bc0b033a3d17 +441f3f1f350b532707c48273d7f454cf1c4e959f The first line of this file holds the git revision number of the last merge done from the gofrontend repository. diff --git a/gcc/go/gofrontend/ast-dump.cc b/gcc/go/gofrontend/ast-dump.cc index b20f7e4e725..a3cbda9debc 100644 --- a/gcc/go/gofrontend/ast-dump.cc +++ b/gcc/go/gofrontend/ast-dump.cc @@ -135,11 +135,11 @@ Ast_dump_traverse_blocks_and_functions::function(Named_object* no) { if (it != res->begin()) this->ast_dump_context_->ostream() << ","; - Named_object* no = (*it); + Named_object* rno = (*it); - this->ast_dump_context_->ostream() << no->name() << " "; - go_assert(no->is_result_variable()); - Result_variable* resvar = no->result_var_value(); + this->ast_dump_context_->ostream() << rno->name() << " "; + go_assert(rno->is_result_variable()); + Result_variable* resvar = rno->result_var_value(); this->ast_dump_context_->dump_type(resvar->type()); diff --git a/gcc/go/gofrontend/escape.cc b/gcc/go/gofrontend/escape.cc index bfd1a39d7e4..f8e07f73cd2 100644 --- a/gcc/go/gofrontend/escape.cc +++ b/gcc/go/gofrontend/escape.cc @@ -1541,7 +1541,6 @@ Escape_analysis_assign::expression(Expression** pexpr) if (debug_level > 1) { - Node* n = Node::make_node(*pexpr); std::string fn_name = this->context_->current_function_name(); go_debug((*pexpr)->location(), "[%d] %s esc: %s", this->context_->loop_depth(), fn_name.c_str(), diff --git a/gcc/go/gofrontend/expressions.cc b/gcc/go/gofrontend/expressions.cc index a72ba243f37..9babc348595 100644 --- a/gcc/go/gofrontend/expressions.cc +++ b/gcc/go/gofrontend/expressions.cc @@ -4104,9 +4104,11 @@ Type_conversion_expression::do_get_backend(Translate_context* context) x = mpz_get_ui(intval); else { - char* s = mpz_get_str(NULL, 16, intval); + char* ms = mpz_get_str(NULL, 16, intval); go_warning_at(loc, 0, - "unicode code point 0x%s out of range in string", s); + "unicode code point 0x%s out of range in string", + ms); + free(ms); x = 0xfffd; } Lex::append_char(x, true, &s, loc); @@ -8016,14 +8018,14 @@ Bound_method_expression::do_flatten(Gogo* gogo, Named_object*, Expression* ret = Expression::make_struct_composite_literal(st, vals, loc); ret = Expression::make_heap_expression(ret, loc); - Node* n = Node::make_node(this); - if ((n->encoding() & ESCAPE_MASK) == Node::ESCAPE_NONE) + Node* node = Node::make_node(this); + if ((node->encoding() & ESCAPE_MASK) == Node::ESCAPE_NONE) ret->heap_expression()->set_allocate_on_stack(); else if (gogo->compiling_runtime() && gogo->package_name() == "runtime" && !saw_errors()) go_error_at(loc, "%s escapes to heap, not allowed in runtime", - n->ast_format(gogo).c_str()); + node->ast_format(gogo).c_str()); // If necessary, check whether the expression or any embedded // pointers are nil. @@ -8741,8 +8743,6 @@ Builtin_call_expression::lower_make(Statement_inserter* inserter) Expression::make_nil(loc)); else { - Numeric_constant nclen; - unsigned long vlen; if (len_arg->numeric_constant_value(&nclen) && nclen.to_unsigned_long(&vlen) == Numeric_constant::NC_UL_VALID && vlen <= Map_type::bucket_size) @@ -9053,8 +9053,7 @@ Builtin_call_expression::flatten_append(Gogo* gogo, Named_object* function, else { Type* int32_type = Type::lookup_integer_type("int32"); - Expression* zero = - Expression::make_integer_ul(0, int32_type, loc); + zero = Expression::make_integer_ul(0, int32_type, loc); call = Runtime::make_call(Runtime::BUILTIN_MEMSET, loc, 3, a1, zero, a2); } @@ -9064,15 +9063,12 @@ Builtin_call_expression::flatten_append(Gogo* gogo, Named_object* function, // For a slice containing pointers, growslice already zeroed // the memory. We only need to zero in non-growing case. // Note: growslice does not zero the memory in non-pointer case. - Expression* left = - Expression::make_temporary_reference(ntmp, loc); - left = Expression::make_cast(uint_type, left, loc); - Expression* right = - Expression::make_temporary_reference(c1tmp, loc); - right = Expression::make_cast(uint_type, right, loc); - Expression* cond = - Expression::make_binary(OPERATOR_GT, left, right, loc); - Expression* zero = Expression::make_integer_ul(0, int_type, loc); + ref = Expression::make_temporary_reference(ntmp, loc); + ref = Expression::make_cast(uint_type, ref, loc); + ref2 = Expression::make_temporary_reference(c1tmp, loc); + ref2 = Expression::make_cast(uint_type, ref2, loc); + cond = Expression::make_binary(OPERATOR_GT, ref, ref2, loc); + zero = Expression::make_integer_ul(0, int_type, loc); call = Expression::make_conditional(cond, call, zero, loc); } } @@ -10877,9 +10873,7 @@ Call_expression::do_lower(Gogo* gogo, Named_object* function, if (this->result_count() > 1 && this->call_temp_ == NULL) { Struct_field_list* sfl = new Struct_field_list(); - Function_type* fntype = this->get_function_type(); const Typed_identifier_list* results = fntype->results(); - Location loc = this->location(); int i = 0; char buf[20]; @@ -12295,10 +12289,10 @@ Call_expression::do_get_backend(Translate_context* context) } else { - Expression* first_arg; - fn = this->interface_method_function(interface_method, &first_arg, + Expression* arg0; + fn = this->interface_method_function(interface_method, &arg0, location); - fn_args[0] = first_arg->get_backend(context); + fn_args[0] = arg0->get_backend(context); } Bexpression* bclosure = NULL; @@ -16453,11 +16447,11 @@ Composite_literal_expression::lower_array(Type* type) traverse_order = new std::vector<unsigned long>(); traverse_order->reserve(v.size()); - for (V::const_iterator p = v.begin(); p != v.end(); ++p) + for (V::const_iterator pv = v.begin(); pv != v.end(); ++pv) { - indexes->push_back(p->index); - vals->push_back(p->expr); - traverse_order->push_back(p->traversal_order); + indexes->push_back(pv->index); + vals->push_back(pv->expr); + traverse_order->push_back(pv->traversal_order); } } @@ -17771,9 +17765,9 @@ Interface_info_expression::do_type() Interface_type* itype = this->iface_->type()->interface_type(); - Hashtable::const_iterator p = result_types.find(itype); - if (p != result_types.end()) - return p->second; + Hashtable::const_iterator pr = result_types.find(itype); + if (pr != result_types.end()) + return pr->second; Type* pdt = Type::make_type_descriptor_ptr_type(); if (itype->is_empty()) diff --git a/gcc/go/gofrontend/gogo.cc b/gcc/go/gofrontend/gogo.cc index e7af673c8df..a79cfc3a9a7 100644 --- a/gcc/go/gofrontend/gogo.cc +++ b/gcc/go/gofrontend/gogo.cc @@ -518,11 +518,11 @@ Gogo::import_package(const std::string& filename, else if (ln == ".") { Bindings* bindings = package->bindings(); - for (Bindings::const_declarations_iterator p = + for (Bindings::const_declarations_iterator pd = bindings->begin_declarations(); - p != bindings->end_declarations(); - ++p) - this->add_dot_import_object(p->second); + pd != bindings->end_declarations(); + ++pd) + this->add_dot_import_object(pd->second); std::string dot_alias = "." + package->package_name(); package->add_alias(dot_alias, location); } @@ -678,8 +678,8 @@ Gogo::recompute_init_priorities() pci != ii->precursors().end(); ++pci) { - Import_init* ii = this->lookup_init(*pci); - nonroots.insert(ii); + Import_init* ii_init = this->lookup_init(*pci); + nonroots.insert(ii_init); } } @@ -2613,11 +2613,11 @@ Gogo::define_global_names() { if (no->type_declaration_value()->has_methods()) { - for (std::vector<Named_object*>::const_iterator p = + for (std::vector<Named_object*>::const_iterator pm = no->type_declaration_value()->methods()->begin(); - p != no->type_declaration_value()->methods()->end(); - p++) - go_error_at((*p)->location(), + pm != no->type_declaration_value()->methods()->end(); + pm++) + go_error_at((*pm)->location(), "may not define methods on non-local type"); } no->set_type_value(global_no->type_value()); @@ -6550,8 +6550,8 @@ Function::build(Gogo* gogo, Named_object* named_function) // Build the backend representation for all the statements in the // function. - Translate_context context(gogo, named_function, NULL, NULL); - Bblock* code_block = this->block_->get_backend(&context); + Translate_context bcontext(gogo, named_function, NULL, NULL); + Bblock* code_block = this->block_->get_backend(&bcontext); // Initialize variables if necessary. Translate_context icontext(gogo, named_function, this->block_, @@ -6608,8 +6608,8 @@ Function::build(Gogo* gogo, Named_object* named_function) // If we created a descriptor for the function, make sure we emit it. if (this->descriptor_ != NULL) { - Translate_context context(gogo, NULL, NULL, NULL); - this->descriptor_->get_backend(&context); + Translate_context dcontext(gogo, NULL, NULL, NULL); + this->descriptor_->get_backend(&dcontext); } } diff --git a/gcc/go/gofrontend/parse.cc b/gcc/go/gofrontend/parse.cc index 52371b2b032..e50af616421 100644 --- a/gcc/go/gofrontend/parse.cc +++ b/gcc/go/gofrontend/parse.cc @@ -836,7 +836,7 @@ Parse::parameter_list(bool* is_varargs) { std::string name = token->identifier(); bool is_exported = token->is_identifier_exported(); - Location location = token->location(); + Location id_location = token->location(); token = this->advance_token(); if (!token->is_op(OPERATOR_COMMA)) { @@ -861,7 +861,7 @@ Parse::parameter_list(bool* is_varargs) } this->unget_token(Token::make_identifier_token(name, is_exported, - location)); + id_location)); } else { @@ -872,15 +872,15 @@ Parse::parameter_list(bool* is_varargs) // commas as we can. std::string id_name = this->gogo_->pack_hidden_name(name, is_exported); - ret->push_back(Typed_identifier(id_name, NULL, location)); + ret->push_back(Typed_identifier(id_name, NULL, id_location)); bool just_saw_comma = true; while (this->advance_token()->is_identifier()) { name = this->peek_token()->identifier(); is_exported = this->peek_token()->is_identifier_exported(); - location = this->peek_token()->location(); + id_location = this->peek_token()->location(); id_name = this->gogo_->pack_hidden_name(name, is_exported); - ret->push_back(Typed_identifier(id_name, NULL, location)); + ret->push_back(Typed_identifier(id_name, NULL, id_location)); if (!this->advance_token()->is_op(OPERATOR_COMMA)) { just_saw_comma = false; @@ -909,7 +909,7 @@ Parse::parameter_list(bool* is_varargs) // names. parameters_have_names = false; this->unget_token(Token::make_identifier_token(name, is_exported, - location)); + id_location)); ret->pop_back(); just_saw_comma = true; } @@ -2808,7 +2808,7 @@ Parse::composite_lit(Type* type, int depth, Location location) { std::string identifier = token->identifier(); bool is_exported = token->is_identifier_exported(); - Location location = token->location(); + Location id_location = token->location(); if (this->advance_token()->is_op(OPERATOR_COLON)) { @@ -2820,14 +2820,14 @@ Parse::composite_lit(Type* type, int depth, Location location) Gogo* gogo = this->gogo_; val = this->id_to_expression(gogo->pack_hidden_name(identifier, is_exported), - location, false); + id_location, false); is_name = true; } else { this->unget_token(Token::make_identifier_token(identifier, is_exported, - location)); + id_location)); val = this->expression(PRECEDENCE_NORMAL, false, true, NULL, NULL); } @@ -2923,14 +2923,14 @@ Parse::composite_lit(Type* type, int depth, Location location) go_error_at(this->location(), "expected %<,%> or %<}%>"); this->gogo_->mark_locals_used(); - int depth = 0; + int edepth = 0; while (!token->is_eof() - && (depth > 0 || !token->is_op(OPERATOR_RCURLY))) + && (edepth > 0 || !token->is_op(OPERATOR_RCURLY))) { if (token->is_op(OPERATOR_LCURLY)) - ++depth; + ++edepth; else if (token->is_op(OPERATOR_RCURLY)) - --depth; + --edepth; token = this->advance_token(); } if (token->is_op(OPERATOR_RCURLY)) diff --git a/gcc/go/gofrontend/statements.cc b/gcc/go/gofrontend/statements.cc index 3dc394ab32b..f52b33d665c 100644 --- a/gcc/go/gofrontend/statements.cc +++ b/gcc/go/gofrontend/statements.cc @@ -2938,7 +2938,7 @@ Thunk_statement::build_thunk(Gogo* gogo, const std::string& thunk_name) go_assert(call_statement->classification() == STATEMENT_EXPRESSION); Expression_statement* es = static_cast<Expression_statement*>(call_statement); - Call_expression* ce = es->expr()->call_expression(); + ce = es->expr()->call_expression(); if (ce == NULL) go_assert(saw_errors()); else @@ -5972,10 +5972,11 @@ Select_statement::lower_two_case(Block* b) // if selectnbrecv2(&lhs, &ok, chan) { body } else { default body } Type* booltype = Type::make_boolean_type(); - Temporary_statement* ts = Statement::make_temporary(booltype, NULL, loc); - b->add_statement(ts); + Temporary_statement* okts = Statement::make_temporary(booltype, NULL, + loc); + b->add_statement(okts); - okref = Expression::make_temporary_reference(ts, loc); + okref = Expression::make_temporary_reference(okts, loc); Expression* okaddr = Expression::make_unary(OPERATOR_AND, okref, loc); call = Runtime::make_call(Runtime::SELECTNBRECV2, loc, 3, addr, okaddr, chanref); @@ -6595,7 +6596,7 @@ For_range_statement::lower_range_array(Gogo* gogo, iter_init = new Block(body_block, loc); ref = Expression::make_temporary_reference(range_temp, loc); - Expression* ref2 = Expression::make_temporary_reference(index_temp, loc); + ref2 = Expression::make_temporary_reference(index_temp, loc); Expression* index = Expression::make_index(ref, ref2, NULL, NULL, loc); tref = Expression::make_temporary_reference(value_temp, loc); @@ -6693,7 +6694,7 @@ For_range_statement::lower_range_slice(Gogo* gogo, iter_init = new Block(body_block, loc); ref = Expression::make_temporary_reference(for_temp, loc); - Expression* ref2 = Expression::make_temporary_reference(index_temp, loc); + ref2 = Expression::make_temporary_reference(index_temp, loc); Expression* index = Expression::make_index(ref, ref2, NULL, NULL, loc); tref = Expression::make_temporary_reference(value_temp, loc); @@ -7179,9 +7180,9 @@ For_range_statement::lower_array_range_clear(Gogo* gogo, else { Type* int32_type = Type::lookup_integer_type("int32"); - Expression* zero = Expression::make_integer_ul(0, int32_type, loc); + Expression* zero32 = Expression::make_integer_ul(0, int32_type, loc); call = Runtime::make_call(Runtime::BUILTIN_MEMSET, loc, 3, ptr_arg, - zero, sz_arg); + zero32, sz_arg); } Statement* cs3 = Statement::make_statement(call, true); b->add_statement(cs3); diff --git a/gcc/go/gofrontend/types.cc b/gcc/go/gofrontend/types.cc index eeae9fa4c0e..e02b832df14 100644 --- a/gcc/go/gofrontend/types.cc +++ b/gcc/go/gofrontend/types.cc @@ -6410,12 +6410,11 @@ Struct_type::do_type_descriptor(Gogo* gogo, Named_type* name) fvals->push_back(Expression::make_nil(bloc)); else { - std::string n; if (is_embedded_builtin) n = gogo->package_name(); else n = Gogo::hidden_name_pkgpath(pf->field_name()); - Expression* s = Expression::make_string(n, bloc); + s = Expression::make_string(n, bloc); fvals->push_back(Expression::make_unary(OPERATOR_AND, s, bloc)); } @@ -6429,7 +6428,7 @@ Struct_type::do_type_descriptor(Gogo* gogo, Named_type* name) fvals->push_back(Expression::make_nil(bloc)); else { - Expression* s = Expression::make_string(pf->tag(), bloc); + s = Expression::make_string(pf->tag(), bloc); fvals->push_back(Expression::make_unary(OPERATOR_AND, s, bloc)); } @@ -6635,22 +6634,22 @@ Struct_type::do_reflection(Gogo* gogo, std::string* ret) const { const std::string& tag(p->tag()); ret->append(" \""); - for (std::string::const_iterator p = tag.begin(); - p != tag.end(); - ++p) + for (std::string::const_iterator pt = tag.begin(); + pt != tag.end(); + ++pt) { - if (*p == '\0') + if (*pt == '\0') ret->append("\\x00"); - else if (*p == '\n') + else if (*pt == '\n') ret->append("\\n"); - else if (*p == '\t') + else if (*pt == '\t') ret->append("\\t"); - else if (*p == '"') + else if (*pt == '"') ret->append("\\\""); - else if (*p == '\\') + else if (*pt == '\\') ret->append("\\\\"); else - ret->push_back(*p); + ret->push_back(*pt); } ret->push_back('"'); } @@ -7197,11 +7196,11 @@ Array_type::verify_length() return false; case Numeric_constant::NC_UL_BIG: { - mpz_t val; - if (!nc.to_int(&val)) + mpz_t mval; + if (!nc.to_int(&mval)) go_unreachable(); - unsigned int bits = mpz_sizeinbase(val, 2); - mpz_clear(val); + unsigned int bits = mpz_sizeinbase(mval, 2); + mpz_clear(mval); if (bits >= tbits) { go_error_at(this->length_->location(), "array bound overflows"); @@ -7704,6 +7703,7 @@ Array_type::do_export(Export* exp) const } char* s = mpz_get_str(NULL, 10, val); exp->write_string(s); + free(s); exp->write_string(" "); mpz_clear(val); } @@ -9752,7 +9752,7 @@ Interface_type::do_import(Import* imp) parameters = new Typed_identifier_list; while (true) { - std::string name = imp->read_name(); + std::string pname = imp->read_name(); imp->require_c_string(" "); if (imp->match_c_string("...")) @@ -9764,7 +9764,7 @@ Interface_type::do_import(Import* imp) Type* ptype = imp->read_type(); if (is_varargs) ptype = Type::make_array_type(ptype, NULL); - parameters->push_back(Typed_identifier(name, ptype, + parameters->push_back(Typed_identifier(pname, ptype, imp->location())); if (imp->peek_char() != ',') break; @@ -9791,10 +9791,10 @@ Interface_type::do_import(Import* imp) imp->advance(1); while (true) { - std::string name = imp->read_name(); + std::string rname = imp->read_name(); imp->require_c_string(" "); Type* rtype = imp->read_type(); - results->push_back(Typed_identifier(name, rtype, + results->push_back(Typed_identifier(rname, rtype, imp->location())); if (imp->peek_char() != ',') break; </cut>

3 years, 11 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tx1/llvm-release-aarch64-spec2k6-O2 - Build # 13 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_tx1/llvm-release-aarch64-spec2k6-O2. So far, this commit has regressed CI configurations: - tcwg_bmk_llvm_tx1/llvm-release-aarch64-spec2k6-O2 Culprit: <cut> commit 50b523cb2ceee4ca7279b4ce22ddb0d0b05df313 Author: Stephen Kelly <steveire(a)gmail.com> Date: Mon Apr 26 18:28:50 2021 +0100 [AST] Fix DeclarationNameInfo introspection Some AST classes return `const DeclarationNameInfo &` instead of returning by value (eg CXXDependentScopeMemberExpr). </cut> Results regressed to (for first_bad == 50b523cb2ceee4ca7279b4ce22ddb0d0b05df313) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O2 artifacts/build-50b523cb2ceee4ca7279b4ce22ddb0d0b05df313/results_id: 1 # 433.milc,[.] mult_su3_mat_vec regressed by 115 from (for last_good == 10038d0b3dfcfa6abf8a710612899f859ef1534b) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O2 artifacts/build-10038d0b3dfcfa6abf8a710612899f859ef1534b/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… Results ID of last_good: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-release-aarch64-spec2k6-O2/4634 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… Results ID of first_bad: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-release-aarch64-spec2k6-O2/4631 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… Configuration details: Reproduce builds: <cut> mkdir investigate-llvm-50b523cb2ceee4ca7279b4ce22ddb0d0b05df313 cd investigate-llvm-50b523cb2ceee4ca7279b4ce22ddb0d0b05df313 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach 50b523cb2ceee4ca7279b4ce22ddb0d0b05df313 ../artifacts/test.sh # Reproduce last_good build git checkout --detach 10038d0b3dfcfa6abf8a710612899f859ef1534b ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… Full commit (up to 1000 lines): <cut> commit 50b523cb2ceee4ca7279b4ce22ddb0d0b05df313 Author: Stephen Kelly <steveire(a)gmail.com> Date: Mon Apr 26 18:28:50 2021 +0100 [AST] Fix DeclarationNameInfo introspection Some AST classes return `const DeclarationNameInfo &` instead of returning by value (eg CXXDependentScopeMemberExpr). --- clang/lib/Tooling/DumpTool/ASTSrcLocProcessor.cpp | 3 + .../unittests/Introspection/IntrospectionTest.cpp | 66 ++++++++++++++++++++++ 2 files changed, 69 insertions(+) diff --git a/clang/lib/Tooling/DumpTool/ASTSrcLocProcessor.cpp b/clang/lib/Tooling/DumpTool/ASTSrcLocProcessor.cpp index 0aeb3a7703f7..0a7fb9b52f23 100644 --- a/clang/lib/Tooling/DumpTool/ASTSrcLocProcessor.cpp +++ b/clang/lib/Tooling/DumpTool/ASTSrcLocProcessor.cpp @@ -225,6 +225,9 @@ void ASTSrcLocProcessor::run(const MatchFinder::MatchResult &Result) { CaptureMethods("class clang::NestedNameSpecifierLoc", ASTClass, Result); CD.DeclNameInfos = CaptureMethods("struct clang::DeclarationNameInfo", ASTClass, Result); + auto DI = CaptureMethods("const struct clang::DeclarationNameInfo &", + ASTClass, Result); + CD.DeclNameInfos.insert(CD.DeclNameInfos.end(), DI.begin(), DI.end()); if (const auto *DerivedFrom = Result.Nodes.getNodeAs<clang::CXXRecordDecl>("derivedFrom")) { diff --git a/clang/unittests/Introspection/IntrospectionTest.cpp b/clang/unittests/Introspection/IntrospectionTest.cpp index 521520c9a7c7..d4f626bfeb74 100644 --- a/clang/unittests/Introspection/IntrospectionTest.cpp +++ b/clang/unittests/Introspection/IntrospectionTest.cpp @@ -1456,6 +1456,72 @@ getNamedTypeInfo()->getTypeLoc().getAs<clang::TypeSpecTypeLoc>().getNameLoc()), STRING_LOCATION_PAIR((&NI), getSourceRange()))); } +TEST(Introspection, SourceLocations_DeclarationNameInfo_CRef) { + if (!NodeIntrospection::hasIntrospectionSupport()) + return; + + auto AST = buildASTFromCodeWithArgs( + R"cpp( +template<typename T> +struct MyContainer +{ + template <typename U> + void pushBack(); +}; + +template<typename T> +void foo() +{ + MyContainer<T> mc; + mc.template pushBack<int>(); +} +)cpp", + {"-fno-delayed-template-parsing"}, "foo.cpp", "clang-tool", + std::make_shared<PCHContainerOperations>()); + + auto &Ctx = AST->getASTContext(); + auto &TU = *Ctx.getTranslationUnitDecl(); + + auto BoundNodes = ast_matchers::match( + decl(hasDescendant(cxxDependentScopeMemberExpr(hasMemberName("pushBack")).bind("member"))), TU, + Ctx); + + EXPECT_EQ(BoundNodes.size(), 1u); + + const auto *Member = BoundNodes[0].getNodeAs<CXXDependentScopeMemberExpr>("member"); + auto Result = NodeIntrospection::GetLocations(Member); + + auto ExpectedLocations = + FormatExpected<SourceLocation>(Result.LocationAccessors); + + llvm::sort(ExpectedLocations); + + EXPECT_EQ( + llvm::makeArrayRef(ExpectedLocations), + (ArrayRef<std::pair<std::string, SourceLocation>>{ + STRING_LOCATION_STDPAIR(Member, getBeginLoc()), + STRING_LOCATION_STDPAIR(Member, getEndLoc()), + STRING_LOCATION_STDPAIR(Member, getExprLoc()), + STRING_LOCATION_STDPAIR(Member, getLAngleLoc()), + STRING_LOCATION_STDPAIR(Member, getMemberLoc()), + STRING_LOCATION_STDPAIR(Member, getMemberNameInfo().getBeginLoc()), + STRING_LOCATION_STDPAIR(Member, getMemberNameInfo().getEndLoc()), + STRING_LOCATION_STDPAIR(Member, getMemberNameInfo().getLoc()), + STRING_LOCATION_STDPAIR(Member, getOperatorLoc()), + STRING_LOCATION_STDPAIR(Member, getRAngleLoc()), + STRING_LOCATION_STDPAIR(Member, getTemplateKeywordLoc()) + })); + + auto ExpectedRanges = FormatExpected<SourceRange>(Result.RangeAccessors); + + EXPECT_THAT( + ExpectedRanges, + UnorderedElementsAre( + STRING_LOCATION_PAIR(Member, getMemberNameInfo().getSourceRange()), + STRING_LOCATION_PAIR(Member, getSourceRange()) + )); +} + TEST(Introspection, SourceLocations_DeclarationNameInfo_ConvOp) { if (!NodeIntrospection::hasIntrospectionSupport()) return; </cut>

3 years, 11 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_apm/llvm-master-aarch64-spec2k6-Oz_LTO - Build # 5 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_apm/llvm-master-aarch64-spec2k6-Oz_LTO. So far, this commit has regressed CI configurations: - tcwg_bmk_llvm_apm/llvm-master-aarch64-spec2k6-Oz_LTO Culprit: <cut> commit 02b1c3f0529e525a4ffa671478050f4704b3f472 Author: Dmitry Preobrazhensky <dmitry.preobrazhensky(a)amd.com> Date: Fri Aug 6 15:49:52 2021 +0300 [AMDGPU][MC][NFC][DOC] Updated AMD GPU assembler syntax description. Corrected sendmsg description (bug https://bugs.llvm.org/show_bug.cgi?id=49648). </cut> Results regressed to (for first_bad == 02b1c3f0529e525a4ffa671478050f4704b3f472) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -Oz_LTO artifacts/build-02b1c3f0529e525a4ffa671478050f4704b3f472/results_id: 1 # 470.lbm,lbm_base.default regressed by 104 from (for last_good == 4aafd5f00c2a772337ec065d4542ef158453a343) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -Oz_LTO artifacts/build-baseline/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Results ID of last_good: apm_64/tcwg_bmk_llvm_apm/baseline-llvm-master-aarch64-spec2k6-Oz_LTO/4560 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Results ID of first_bad: apm_64/tcwg_bmk_llvm_apm/bisect-llvm-master-aarch64-spec2k6-Oz_LTO/4618 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Configuration details: Reproduce builds: <cut> mkdir investigate-llvm-02b1c3f0529e525a4ffa671478050f4704b3f472 cd investigate-llvm-02b1c3f0529e525a4ffa671478050f4704b3f472 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach 02b1c3f0529e525a4ffa671478050f4704b3f472 ../artifacts/test.sh # Reproduce last_good build git checkout --detach 4aafd5f00c2a772337ec065d4542ef158453a343 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Full commit (up to 1000 lines): <cut> commit 02b1c3f0529e525a4ffa671478050f4704b3f472 Author: Dmitry Preobrazhensky <dmitry.preobrazhensky(a)amd.com> Date: Fri Aug 6 15:49:52 2021 +0300 [AMDGPU][MC][NFC][DOC] Updated AMD GPU assembler syntax description. Corrected sendmsg description (bug https://bugs.llvm.org/show_bug.cgi?id=49648). --- llvm/docs/AMDGPU/gfx10_msg.rst | 41 +++++++++++++++++++++++------------------ llvm/docs/AMDGPU/gfx8_msg.rst | 1 + llvm/docs/AMDGPU/gfx90a_msg.rst | 41 +++++++++++++++++++++++------------------ llvm/docs/AMDGPU/gfx9_msg.rst | 41 +++++++++++++++++++++++------------------ 4 files changed, 70 insertions(+), 54 deletions(-) diff --git a/llvm/docs/AMDGPU/gfx10_msg.rst b/llvm/docs/AMDGPU/gfx10_msg.rst index 3e6c532dd85a..c0774d85a62e 100644 --- a/llvm/docs/AMDGPU/gfx10_msg.rst +++ b/llvm/docs/AMDGPU/gfx10_msg.rst @@ -47,24 +47,29 @@ or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`. Each message type supports specific operations: - ================= ========== ============================== ============ ========== - Message name Message Id Supported Operations Operation Id Stream Id - ================= ========== ============================== ============ ========== - MSG_INTERRUPT 1 \- \- \- - MSG_GS 2 GS_OP_CUT 1 Optional - \ GS_OP_EMIT 2 Optional - \ GS_OP_EMIT_CUT 3 Optional - MSG_GS_DONE 3 GS_OP_NOP 0 \- - \ GS_OP_CUT 1 Optional - \ GS_OP_EMIT 2 Optional - \ GS_OP_EMIT_CUT 3 Optional - MSG_GS_ALLOC_REQ 9 \- \- \- - MSG_GET_DOORBELL 10 \- \- \- - MSG_SYSMSG 15 SYSMSG_OP_ECC_ERR_INTERRUPT 1 \- - \ SYSMSG_OP_REG_RD 2 \- - \ SYSMSG_OP_HOST_TRAP_ACK 3 \- - \ SYSMSG_OP_TTRACE_PC 4 \- - ================= ========== ============================== ============ ========== + =================== ========== ============================== ============ ========== + Message name Message Id Supported Operations Operation Id Stream Id + =================== ========== ============================== ============ ========== + MSG_INTERRUPT 1 \- \- \- + MSG_GS 2 GS_OP_CUT 1 Optional + \ GS_OP_EMIT 2 Optional + \ GS_OP_EMIT_CUT 3 Optional + MSG_GS_DONE 3 GS_OP_NOP 0 \- + \ GS_OP_CUT 1 Optional + \ GS_OP_EMIT 2 Optional + \ GS_OP_EMIT_CUT 3 Optional + MSG_SAVEWAVE 4 \- \- \- + MSG_STALL_WAVE_GEN 5 \- \- \- + MSG_HALT_WAVES 6 \- \- \- + MSG_ORDERED_PS_DONE 7 \- \- \- + MSG_GS_ALLOC_REQ 9 \- \- \- + MSG_GET_DOORBELL 10 \- \- \- + MSG_GET_DDID 11 \- \- \- + MSG_SYSMSG 15 SYSMSG_OP_ECC_ERR_INTERRUPT 1 \- + \ SYSMSG_OP_REG_RD 2 \- + \ SYSMSG_OP_HOST_TRAP_ACK 3 \- + \ SYSMSG_OP_TTRACE_PC 4 \- + =================== ========== ============================== ============ ========== *Sendmsg* arguments are validated depending on how *type* value is specified: diff --git a/llvm/docs/AMDGPU/gfx8_msg.rst b/llvm/docs/AMDGPU/gfx8_msg.rst index 0b0b2f307482..f32033dd944c 100644 --- a/llvm/docs/AMDGPU/gfx8_msg.rst +++ b/llvm/docs/AMDGPU/gfx8_msg.rst @@ -58,6 +58,7 @@ Each message type supports specific operations: \ GS_OP_CUT 1 Optional \ GS_OP_EMIT 2 Optional \ GS_OP_EMIT_CUT 3 Optional + MSG_SAVEWAVE 4 \- \- \- MSG_SYSMSG 15 SYSMSG_OP_ECC_ERR_INTERRUPT 1 \- \ SYSMSG_OP_REG_RD 2 \- \ SYSMSG_OP_HOST_TRAP_ACK 3 \- diff --git a/llvm/docs/AMDGPU/gfx90a_msg.rst b/llvm/docs/AMDGPU/gfx90a_msg.rst index aa44d3b64f49..37f945464e58 100644 --- a/llvm/docs/AMDGPU/gfx90a_msg.rst +++ b/llvm/docs/AMDGPU/gfx90a_msg.rst @@ -47,24 +47,29 @@ or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`. Each message type supports specific operations: - ================= ========== ============================== ============ ========== - Message name Message Id Supported Operations Operation Id Stream Id - ================= ========== ============================== ============ ========== - MSG_INTERRUPT 1 \- \- \- - MSG_GS 2 GS_OP_CUT 1 Optional - \ GS_OP_EMIT 2 Optional - \ GS_OP_EMIT_CUT 3 Optional - MSG_GS_DONE 3 GS_OP_NOP 0 \- - \ GS_OP_CUT 1 Optional - \ GS_OP_EMIT 2 Optional - \ GS_OP_EMIT_CUT 3 Optional - MSG_GS_ALLOC_REQ 9 \- \- \- - MSG_GET_DOORBELL 10 \- \- \- - MSG_SYSMSG 15 SYSMSG_OP_ECC_ERR_INTERRUPT 1 \- - \ SYSMSG_OP_REG_RD 2 \- - \ SYSMSG_OP_HOST_TRAP_ACK 3 \- - \ SYSMSG_OP_TTRACE_PC 4 \- - ================= ========== ============================== ============ ========== + ====================== ========== ============================== ============ ========== + Message name Message Id Supported Operations Operation Id Stream Id + ====================== ========== ============================== ============ ========== + MSG_INTERRUPT 1 \- \- \- + MSG_GS 2 GS_OP_CUT 1 Optional + \ GS_OP_EMIT 2 Optional + \ GS_OP_EMIT_CUT 3 Optional + MSG_GS_DONE 3 GS_OP_NOP 0 \- + \ GS_OP_CUT 1 Optional + \ GS_OP_EMIT 2 Optional + \ GS_OP_EMIT_CUT 3 Optional + MSG_SAVEWAVE 4 \- \- \- + MSG_STALL_WAVE_GEN 5 \- \- \- + MSG_HALT_WAVES 6 \- \- \- + MSG_ORDERED_PS_DONE 7 \- \- \- + MSG_EARLY_PRIM_DEALLOC 8 \- \- \- + MSG_GS_ALLOC_REQ 9 \- \- \- + MSG_GET_DOORBELL 10 \- \- \- + MSG_SYSMSG 15 SYSMSG_OP_ECC_ERR_INTERRUPT 1 \- + \ SYSMSG_OP_REG_RD 2 \- + \ SYSMSG_OP_HOST_TRAP_ACK 3 \- + \ SYSMSG_OP_TTRACE_PC 4 \- + ====================== ========== ============================== ============ ========== *Sendmsg* arguments are validated depending on how *type* value is specified: diff --git a/llvm/docs/AMDGPU/gfx9_msg.rst b/llvm/docs/AMDGPU/gfx9_msg.rst index efb95e5a97db..34be1c8a24c5 100644 --- a/llvm/docs/AMDGPU/gfx9_msg.rst +++ b/llvm/docs/AMDGPU/gfx9_msg.rst @@ -47,24 +47,29 @@ or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`. Each message type supports specific operations: - ================= ========== ============================== ============ ========== - Message name Message Id Supported Operations Operation Id Stream Id - ================= ========== ============================== ============ ========== - MSG_INTERRUPT 1 \- \- \- - MSG_GS 2 GS_OP_CUT 1 Optional - \ GS_OP_EMIT 2 Optional - \ GS_OP_EMIT_CUT 3 Optional - MSG_GS_DONE 3 GS_OP_NOP 0 \- - \ GS_OP_CUT 1 Optional - \ GS_OP_EMIT 2 Optional - \ GS_OP_EMIT_CUT 3 Optional - MSG_GS_ALLOC_REQ 9 \- \- \- - MSG_GET_DOORBELL 10 \- \- \- - MSG_SYSMSG 15 SYSMSG_OP_ECC_ERR_INTERRUPT 1 \- - \ SYSMSG_OP_REG_RD 2 \- - \ SYSMSG_OP_HOST_TRAP_ACK 3 \- - \ SYSMSG_OP_TTRACE_PC 4 \- - ================= ========== ============================== ============ ========== + ====================== ========== ============================== ============ ========== + Message name Message Id Supported Operations Operation Id Stream Id + ====================== ========== ============================== ============ ========== + MSG_INTERRUPT 1 \- \- \- + MSG_GS 2 GS_OP_CUT 1 Optional + \ GS_OP_EMIT 2 Optional + \ GS_OP_EMIT_CUT 3 Optional + MSG_GS_DONE 3 GS_OP_NOP 0 \- + \ GS_OP_CUT 1 Optional + \ GS_OP_EMIT 2 Optional + \ GS_OP_EMIT_CUT 3 Optional + MSG_SAVEWAVE 4 \- \- \- + MSG_STALL_WAVE_GEN 5 \- \- \- + MSG_HALT_WAVES 6 \- \- \- + MSG_ORDERED_PS_DONE 7 \- \- \- + MSG_EARLY_PRIM_DEALLOC 8 \- \- \- + MSG_GS_ALLOC_REQ 9 \- \- \- + MSG_GET_DOORBELL 10 \- \- \- + MSG_SYSMSG 15 SYSMSG_OP_ECC_ERR_INTERRUPT 1 \- + \ SYSMSG_OP_REG_RD 2 \- + \ SYSMSG_OP_HOST_TRAP_ACK 3 \- + \ SYSMSG_OP_TTRACE_PC 4 \- + ====================== ========== ============================== ============ ========== *Sendmsg* arguments are validated depending on how *type* value is specified: </cut>

3 years, 11 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tx1/gnu-release-aarch64-spec2k6-O3_LTO - Build # 33 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *gcc* in CI configuration tcwg_bmk_gnu_tx1/gnu-release-aarch64-spec2k6-O3_LTO. So far, this commit has regressed CI configurations: - tcwg_bmk_gnu_tx1/gnu-release-aarch64-spec2k6-O3_LTO Culprit: <cut> commit 53329d29274fa4af5af7ab155947fe84b9684e39 Author: Rainer Orth <ro(a)CeBiTec.Uni-Bielefeld.DE> Date: Tue May 21 16:59:39 2019 +0000 Fix dg-require-* syntax * gcc.dg/Wattribute-alias.c: Pass emtpy arg to dg-require-ifunc. * gcc.c-torture/execute/20030125-1.c: Pass emtpy arg to dg-require-weak. * gcc.dg/torture/ftrapv-2.c: Pass empty arg to dg-require-fork. * gcc.target/i386/pr84723-1.c: Remove dg-require-ifunc. * gcc.target/i386/pr84723-2.c: Likewise. * gcc.target/i386/pr84723-3.c: Likewise. * gcc.target/i386/pr84723-4.c: Likewise. * gcc.target/i386/pr84723-5.c: Likewise. From-SVN: r271476 </cut> Results regressed to (for first_bad == 53329d29274fa4af5af7ab155947fe84b9684e39) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # true: 0 # benchmark -- -O3_LTO artifacts/build-53329d29274fa4af5af7ab155947fe84b9684e39/results_id: 1 # 456.hmmer,hmmer_base.default regressed by 104 from (for last_good == b33a3c6451ecc09ac5f1c7ccdac9b19eb0bd1a48) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # true: 0 # benchmark -- -O3_LTO artifacts/build-b33a3c6451ecc09ac5f1c7ccdac9b19eb0bd1a48/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… Results ID of last_good: tx1_64/tcwg_bmk_gnu_tx1/bisect-gnu-release-aarch64-spec2k6-O3_LTO/4568 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… Results ID of first_bad: tx1_64/tcwg_bmk_gnu_tx1/bisect-gnu-release-aarch64-spec2k6-O3_LTO/4564 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… Configuration details: Reproduce builds: <cut> mkdir investigate-gcc-53329d29274fa4af5af7ab155947fe84b9684e39 cd investigate-gcc-53329d29274fa4af5af7ab155947fe84b9684e39 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /gcc/ ./ ./bisect/baseline/ cd gcc # Reproduce first_bad build git checkout --detach 53329d29274fa4af5af7ab155947fe84b9684e39 ../artifacts/test.sh # Reproduce last_good build git checkout --detach b33a3c6451ecc09ac5f1c7ccdac9b19eb0bd1a48 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_tx1-gnu-release-a… Full commit (up to 1000 lines): <cut> commit 53329d29274fa4af5af7ab155947fe84b9684e39 Author: Rainer Orth <ro(a)CeBiTec.Uni-Bielefeld.DE> Date: Tue May 21 16:59:39 2019 +0000 Fix dg-require-* syntax * gcc.dg/Wattribute-alias.c: Pass emtpy arg to dg-require-ifunc. * gcc.c-torture/execute/20030125-1.c: Pass emtpy arg to dg-require-weak. * gcc.dg/torture/ftrapv-2.c: Pass empty arg to dg-require-fork. * gcc.target/i386/pr84723-1.c: Remove dg-require-ifunc. * gcc.target/i386/pr84723-2.c: Likewise. * gcc.target/i386/pr84723-3.c: Likewise. * gcc.target/i386/pr84723-4.c: Likewise. * gcc.target/i386/pr84723-5.c: Likewise. From-SVN: r271476 --- gcc/testsuite/ChangeLog | 14 ++++++++++++++ gcc/testsuite/gcc.c-torture/execute/20030125-1.c | 2 +- gcc/testsuite/gcc.dg/Wattribute-alias.c | 2 +- gcc/testsuite/gcc.dg/torture/ftrapv-2.c | 2 +- gcc/testsuite/gcc.target/i386/pr84723-1.c | 1 - gcc/testsuite/gcc.target/i386/pr84723-2.c | 1 - gcc/testsuite/gcc.target/i386/pr84723-3.c | 1 - gcc/testsuite/gcc.target/i386/pr84723-4.c | 1 - gcc/testsuite/gcc.target/i386/pr84723-5.c | 1 - 9 files changed, 17 insertions(+), 8 deletions(-) diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index 8ec3ed1a513..4e8e73cb52f 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,17 @@ +2019-05-21 Rainer Orth <ro(a)CeBiTec.Uni-Bielefeld.DE> + + * gcc.dg/Wattribute-alias.c: Pass emtpy arg to dg-require-ifunc. + + * gcc.c-torture/execute/20030125-1.c: Pass emtpy arg to dg-require-weak. + + * gcc.dg/torture/ftrapv-2.c: Pass empty arg to dg-require-fork. + + * gcc.target/i386/pr84723-1.c: Remove dg-require-ifunc. + * gcc.target/i386/pr84723-2.c: Likewise. + * gcc.target/i386/pr84723-3.c: Likewise. + * gcc.target/i386/pr84723-4.c: Likewise. + * gcc.target/i386/pr84723-5.c: Likewise. + 2019-05-21 Iain Sandoe <iain(a)sandoe.co.uk> PR testsuite/67958 diff --git a/gcc/testsuite/gcc.c-torture/execute/20030125-1.c b/gcc/testsuite/gcc.c-torture/execute/20030125-1.c index 960552c3c3a..39578e51d15 100644 --- a/gcc/testsuite/gcc.c-torture/execute/20030125-1.c +++ b/gcc/testsuite/gcc.c-torture/execute/20030125-1.c @@ -1,6 +1,6 @@ /* Verify whether math functions are simplified. */ /* { dg-require-effective-target c99_runtime } */ -/* { dg-require-weak } */ +/* { dg-require-weak "" } */ double sin(double); double floor(double); float diff --git a/gcc/testsuite/gcc.dg/Wattribute-alias.c b/gcc/testsuite/gcc.dg/Wattribute-alias.c index 228c1be82fc..12774e82834 100644 --- a/gcc/testsuite/gcc.dg/Wattribute-alias.c +++ b/gcc/testsuite/gcc.dg/Wattribute-alias.c @@ -1,6 +1,6 @@ /* PR middle-end/81824 - Warn for missing attributes with function aliases { dg-do compile } - { dg-require-ifunc "require ifunc support" } + { dg-require-ifunc "" } { dg-options "-Wall -Wattribute-alias=2" } */ #define ATTR(...) __attribute__ ((__VA_ARGS__)) diff --git a/gcc/testsuite/gcc.dg/torture/ftrapv-2.c b/gcc/testsuite/gcc.dg/torture/ftrapv-2.c index 8065ee0461a..75e464fe557 100644 --- a/gcc/testsuite/gcc.dg/torture/ftrapv-2.c +++ b/gcc/testsuite/gcc.dg/torture/ftrapv-2.c @@ -3,7 +3,7 @@ /* { dg-skip-if "" { *-*-* } { "-flto" } { "" } } */ /* { dg-additional-options "-ftrapv" } */ /* { dg-require-effective-target trapping } */ -/* { dg-require-fork unused } */ +/* { dg-require-fork "" } */ #include <stdlib.h> #include <unistd.h> diff --git a/gcc/testsuite/gcc.target/i386/pr84723-1.c b/gcc/testsuite/gcc.target/i386/pr84723-1.c index 0264ecb1159..1357b1d5f46 100644 --- a/gcc/testsuite/gcc.target/i386/pr84723-1.c +++ b/gcc/testsuite/gcc.target/i386/pr84723-1.c @@ -1,6 +1,5 @@ /* PR middle-end/84723 */ /* { dg-do compile } */ -/* { dg-require-ifunc } */ /* { dg-options "-O2" } */ __attribute__((target_clones ("avx", "default"))) diff --git a/gcc/testsuite/gcc.target/i386/pr84723-2.c b/gcc/testsuite/gcc.target/i386/pr84723-2.c index 6456d6d256f..d092e676b62 100644 --- a/gcc/testsuite/gcc.target/i386/pr84723-2.c +++ b/gcc/testsuite/gcc.target/i386/pr84723-2.c @@ -1,6 +1,5 @@ /* PR middle-end/84723 */ /* { dg-do compile } */ -/* { dg-require-ifunc } */ /* { dg-options "-O2" } */ __attribute__((target_clones ("avx", "default"))) diff --git a/gcc/testsuite/gcc.target/i386/pr84723-3.c b/gcc/testsuite/gcc.target/i386/pr84723-3.c index bb8e7cabc88..7bb8eb29815 100644 --- a/gcc/testsuite/gcc.target/i386/pr84723-3.c +++ b/gcc/testsuite/gcc.target/i386/pr84723-3.c @@ -1,6 +1,5 @@ /* PR middle-end/84723 */ /* { dg-do compile } */ -/* { dg-require-ifunc } */ /* { dg-options "-O2" } */ __attribute__((target_clones ("avx", "default"))) diff --git a/gcc/testsuite/gcc.target/i386/pr84723-4.c b/gcc/testsuite/gcc.target/i386/pr84723-4.c index 9df1008497c..f30567dfae3 100644 --- a/gcc/testsuite/gcc.target/i386/pr84723-4.c +++ b/gcc/testsuite/gcc.target/i386/pr84723-4.c @@ -1,6 +1,5 @@ /* PR middle-end/84723 */ /* { dg-do compile } */ -/* { dg-require-ifunc } */ /* { dg-options "-O2" } */ __attribute__((target_clones ("avx", "default"))) diff --git a/gcc/testsuite/gcc.target/i386/pr84723-5.c b/gcc/testsuite/gcc.target/i386/pr84723-5.c index c7aa92804fa..0167df39850 100644 --- a/gcc/testsuite/gcc.target/i386/pr84723-5.c +++ b/gcc/testsuite/gcc.target/i386/pr84723-5.c @@ -1,6 +1,5 @@ /* PR middle-end/84723 */ /* { dg-do compile } */ -/* { dg-require-ifunc } */ /* { dg-options "-O2" } */ __attribute__((target_clones ("avx", "default"))) </cut>

3 years, 11 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tx1/llvm-master-aarch64-spec2k6-O2 - Build # 16 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_tx1/llvm-master-aarch64-spec2k6-O2. So far, this commit has regressed CI configurations: - tcwg_bmk_llvm_tx1/llvm-master-aarch64-spec2k6-O2 Culprit: <cut> commit 7de439b2be4a046da541b625812f2fe34c54c4b9 Author: Rob Suderman <rob.suderman(a)gmail.com> Date: Wed Aug 11 11:05:08 2021 -0700 [mlir][tosa] Migrate tosa to more efficient linalg.conv Existing linalg.conv2d is not well optimized for performance. Changed to a version that is more aligned for optimziation. Include the corresponding transposes to use this optimized version. This also splits the conv and depthwise conv into separate implementations to avoid overly complex lowerings. Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D107504 </cut> Results regressed to (for first_bad == 7de439b2be4a046da541b625812f2fe34c54c4b9) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O2 artifacts/build-7de439b2be4a046da541b625812f2fe34c54c4b9/results_id: 1 # 447.dealII,[.] _ZNK12SparseMatrixIdE5vmultI6VectorIdES3_EEvRT regressed by 111 # 433.milc,[.] mult_su3_mat_vec regressed by 112 from (for last_good == c1a8f12873783e8f4827437f6b2dddadfc58109d) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O2 artifacts/build-c1a8f12873783e8f4827437f6b2dddadfc58109d/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Results ID of last_good: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-master-aarch64-spec2k6-O2/4556 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Results ID of first_bad: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-master-aarch64-spec2k6-O2/4553 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Configuration details: Reproduce builds: <cut> mkdir investigate-llvm-7de439b2be4a046da541b625812f2fe34c54c4b9 cd investigate-llvm-7de439b2be4a046da541b625812f2fe34c54c4b9 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach 7de439b2be4a046da541b625812f2fe34c54c4b9 ../artifacts/test.sh # Reproduce last_good build git checkout --detach c1a8f12873783e8f4827437f6b2dddadfc58109d ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Full commit (up to 1000 lines): <cut> commit 7de439b2be4a046da541b625812f2fe34c54c4b9 Author: Rob Suderman <rob.suderman(a)gmail.com> Date: Wed Aug 11 11:05:08 2021 -0700 [mlir][tosa] Migrate tosa to more efficient linalg.conv Existing linalg.conv2d is not well optimized for performance. Changed to a version that is more aligned for optimziation. Include the corresponding transposes to use this optimized version. This also splits the conv and depthwise conv into separate implementations to avoid overly complex lowerings. Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D107504 --- .../Linalg/IR/LinalgNamedStructuredOps.yaml | 261 +++++++++--------- mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp | 303 +++++++++++++-------- .../dialects/linalg/opdsl/ops/core_named_ops.py | 79 +++--- .../Conversion/TosaToLinalg/tosa-to-linalg.mlir | 40 ++- mlir/test/Dialect/Linalg/named-ops.mlir | 14 - 5 files changed, 378 insertions(+), 319 deletions(-) diff --git a/mlir/include/mlir/Dialect/Linalg/IR/LinalgNamedStructuredOps.yaml b/mlir/include/mlir/Dialect/Linalg/IR/LinalgNamedStructuredOps.yaml index 53b54e1bff9f..3e1fcabc8cb9 100644 --- a/mlir/include/mlir/Dialect/Linalg/IR/LinalgNamedStructuredOps.yaml +++ b/mlir/include/mlir/Dialect/Linalg/IR/LinalgNamedStructuredOps.yaml @@ -628,10 +628,10 @@ structured_op: !LinalgStructuredOpConfig scalar_arg: B --- !LinalgOpConfig metadata: !LinalgOpMetadata - name: conv_2d_input_nhwc_filter_ohwi_poly - cpp_class_name: Conv2DInputNhwcFilterOhwiPolyOp + name: conv_2d_nchw + cpp_class_name: Conv2DNchwOp doc: |- - Performs a 2-D convolution. + Performs 2-D convolution. Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output. @@ -648,13 +648,13 @@ structured_op: !LinalgStructuredOpConfig usage: InputOperand type_var: T2 shape_map: affine_map<()[s0, s1, s2, s3, s4, s5, s6, s7, s8, s9, s10, s11, s12] - -> (s4, s5, s6, s3)> + -> (s4, s1, s5, s6)> - !LinalgOperandDefConfig name: O usage: OutputOperand type_var: U shape_map: affine_map<()[s0, s1, s2, s3, s4, s5, s6, s7, s8, s9, s10, s11, s12] - -> (s0, s7, s8, s4)> + -> (s0, s4, s7, s8, s1)> - !LinalgOperandDefConfig name: strides usage: IndexAttribute @@ -670,18 +670,18 @@ structured_op: !LinalgStructuredOpConfig indexing_maps: !LinalgIndexingMapsConfig static_indexing_maps: - affine_map<(d0, d1, d2, d3, d4, d5, d6)[s0, s1, s2, s3, s4, s5, s6, s7, s8, - s9, s10, s11, s12] -> (d0, d1 * s9 + d3 * s11, d2 * s10 + d4 * s12, d6)> + s9, s10, s11, s12] -> (d0, d4, d2 * s9 + d5 * s11, d3 * s10 + d6 * s12)> - affine_map<(d0, d1, d2, d3, d4, d5, d6)[s0, s1, s2, s3, s4, s5, s6, s7, s8, - s9, s10, s11, s12] -> (d5, d3, d4, d6)> + s9, s10, s11, s12] -> (d1, d4, d5, d6)> - affine_map<(d0, d1, d2, d3, d4, d5, d6)[s0, s1, s2, s3, s4, s5, s6, s7, s8, - s9, s10, s11, s12] -> (d0, d1, d2, d5)> + s9, s10, s11, s12] -> (d0, d1, d2, d3)> iterator_types: - parallel - parallel - parallel + - parallel - reduction - reduction - - parallel - reduction assignments: - !ScalarAssign @@ -710,14 +710,13 @@ structured_op: !LinalgStructuredOpConfig scalar_arg: K --- !LinalgOpConfig metadata: !LinalgOpMetadata - name: conv_2d_input_nhwc_filter_ohwi_poly_q - cpp_class_name: Conv2DInputNhwcFilterOhwiPolyQOp + name: conv_2d_nhwc_hwcf + cpp_class_name: Conv2DNhwcHwcfOp doc: |- - Performs a 2-D quantized convolution. + Performs 2-D convolution. Numeric casting is performed on the operands to the inner multiply, promoting - them to the same data type as the accumulator/output. Includes zero point - adjustment for quantization. + them to the same data type as the accumulator/output. structured_op: !LinalgStructuredOpConfig args: - !LinalgOperandDefConfig @@ -731,21 +730,13 @@ structured_op: !LinalgStructuredOpConfig usage: InputOperand type_var: T2 shape_map: affine_map<()[s0, s1, s2, s3, s4, s5, s6, s7, s8, s9, s10, s11, s12] - -> (s4, s5, s6, s3)> - - !LinalgOperandDefConfig - name: IZp - usage: InputOperand - type_var: I32 - - !LinalgOperandDefConfig - name: KZp - usage: InputOperand - type_var: I32 + -> (s4, s5, s3, s6)> - !LinalgOperandDefConfig name: O usage: OutputOperand type_var: U shape_map: affine_map<()[s0, s1, s2, s3, s4, s5, s6, s7, s8, s9, s10, s11, s12] - -> (s0, s7, s8, s4)> + -> (s0, s7, s8, s6)> - !LinalgOperandDefConfig name: strides usage: IndexAttribute @@ -761,22 +752,18 @@ structured_op: !LinalgStructuredOpConfig indexing_maps: !LinalgIndexingMapsConfig static_indexing_maps: - affine_map<(d0, d1, d2, d3, d4, d5, d6)[s0, s1, s2, s3, s4, s5, s6, s7, s8, - s9, s10, s11, s12] -> (d0, d1 * s9 + d3 * s11, d2 * s10 + d4 * s12, d6)> - - affine_map<(d0, d1, d2, d3, d4, d5, d6)[s0, s1, s2, s3, s4, s5, s6, s7, s8, - s9, s10, s11, s12] -> (d5, d3, d4, d6)> + s9, s10, s11, s12] -> (d0, d1 * s9 + d4 * s11, d2 * s10 + d5 * s12, d6)> - affine_map<(d0, d1, d2, d3, d4, d5, d6)[s0, s1, s2, s3, s4, s5, s6, s7, s8, - s9, s10, s11, s12] -> ()> - - affine_map<(d0, d1, d2, d3, d4, d5, d6)[s0, s1, s2, s3, s4, s5, s6, s7, s8, - s9, s10, s11, s12] -> ()> + s9, s10, s11, s12] -> (d4, d5, d6, d3)> - affine_map<(d0, d1, d2, d3, d4, d5, d6)[s0, s1, s2, s3, s4, s5, s6, s7, s8, - s9, s10, s11, s12] -> (d0, d1, d2, d5)> + s9, s10, s11, s12] -> (d0, d1, d2, d3)> iterator_types: - parallel - parallel - parallel + - parallel - reduction - reduction - - parallel - reduction assignments: - !ScalarAssign @@ -792,37 +779,17 @@ structured_op: !LinalgStructuredOpConfig fn_name: mul operands: - !ScalarExpression - scalar_apply: - fn_name: sub + symbolic_cast: + type_var: U operands: - !ScalarExpression - symbolic_cast: - type_var: U - operands: - - !ScalarExpression - scalar_arg: I - - !ScalarExpression - symbolic_cast: - type_var: U - operands: - - !ScalarExpression - scalar_arg: IZp + scalar_arg: I - !ScalarExpression - scalar_apply: - fn_name: sub + symbolic_cast: + type_var: U operands: - !ScalarExpression - symbolic_cast: - type_var: U - operands: - - !ScalarExpression - scalar_arg: K - - !ScalarExpression - symbolic_cast: - type_var: U - operands: - - !ScalarExpression - scalar_arg: KZp + scalar_arg: K --- !LinalgOpConfig metadata: !LinalgOpMetadata name: depthwise_conv_2d_input_nhwc_filter_hwc_poly @@ -906,13 +873,14 @@ structured_op: !LinalgStructuredOpConfig scalar_arg: K --- !LinalgOpConfig metadata: !LinalgOpMetadata - name: depthwise_conv_2D_nchw - cpp_class_name: DepthwiseConv2DNchwOp + name: conv_2d_nhwc_hwcf_q + cpp_class_name: Conv2DNhwcHwcfQOp doc: |- - Performs depth-wise 2-D convolution. + Performs 2-D convolution with zero point offsets. Numeric casting is performed on the operands to the inner multiply, promoting - them to the same data type as the accumulator/output. + them to the same data type as the accumulator/output. This includes the zero + point offsets common to quantized operations. structured_op: !LinalgStructuredOpConfig args: - !LinalgOperandDefConfig @@ -927,12 +895,20 @@ structured_op: !LinalgStructuredOpConfig type_var: T2 shape_map: affine_map<()[s0, s1, s2, s3, s4, s5, s6, s7, s8, s9, s10, s11, s12] -> (s4, s5, s3, s6)> + - !LinalgOperandDefConfig + name: IZp + usage: InputOperand + type_var: I32 + - !LinalgOperandDefConfig + name: KZp + usage: InputOperand + type_var: I32 - !LinalgOperandDefConfig name: O usage: OutputOperand type_var: U shape_map: affine_map<()[s0, s1, s2, s3, s4, s5, s6, s7, s8, s9, s10, s11, s12] - -> (s0, s7, s8, s3, s6)> + -> (s0, s7, s8, s6)> - !LinalgOperandDefConfig name: strides usage: IndexAttribute @@ -948,19 +924,23 @@ structured_op: !LinalgStructuredOpConfig indexing_maps: !LinalgIndexingMapsConfig static_indexing_maps: - affine_map<(d0, d1, d2, d3, d4, d5, d6)[s0, s1, s2, s3, s4, s5, s6, s7, s8, - s9, s10, s11, s12] -> (d0, d1 * s9 + d3 * s11, d2 * s10 + d4 * s12, d5)> + s9, s10, s11, s12] -> (d0, d1 * s9 + d4 * s11, d2 * s10 + d5 * s12, d6)> - affine_map<(d0, d1, d2, d3, d4, d5, d6)[s0, s1, s2, s3, s4, s5, s6, s7, s8, - s9, s10, s11, s12] -> (d3, d4, d5, d6)> + s9, s10, s11, s12] -> (d4, d5, d6, d3)> - affine_map<(d0, d1, d2, d3, d4, d5, d6)[s0, s1, s2, s3, s4, s5, s6, s7, s8, - s9, s10, s11, s12] -> (d0, d1, d2, d5, d6)> + s9, s10, s11, s12] -> ()> + - affine_map<(d0, d1, d2, d3, d4, d5, d6)[s0, s1, s2, s3, s4, s5, s6, s7, s8, + s9, s10, s11, s12] -> ()> + - affine_map<(d0, d1, d2, d3, d4, d5, d6)[s0, s1, s2, s3, s4, s5, s6, s7, s8, + s9, s10, s11, s12] -> (d0, d1, d2, d3)> iterator_types: - parallel - parallel - parallel + - parallel + - reduction - reduction - reduction - - parallel - - parallel assignments: - !ScalarAssign arg: O @@ -975,21 +955,41 @@ structured_op: !LinalgStructuredOpConfig fn_name: mul operands: - !ScalarExpression - symbolic_cast: - type_var: U + scalar_apply: + fn_name: sub operands: - !ScalarExpression - scalar_arg: I + symbolic_cast: + type_var: U + operands: + - !ScalarExpression + scalar_arg: I + - !ScalarExpression + symbolic_cast: + type_var: U + operands: + - !ScalarExpression + scalar_arg: IZp - !ScalarExpression - symbolic_cast: - type_var: U + scalar_apply: + fn_name: sub operands: - !ScalarExpression - scalar_arg: K + symbolic_cast: + type_var: U + operands: + - !ScalarExpression + scalar_arg: K + - !ScalarExpression + symbolic_cast: + type_var: U + operands: + - !ScalarExpression + scalar_arg: KZp --- !LinalgOpConfig metadata: !LinalgOpMetadata - name: depthwise_conv2D_nchw_q - cpp_class_name: DepthwiseConv2DNchwQOp + name: depthwise_conv2D_nchw + cpp_class_name: DepthwiseConv2DNchwOp doc: |- Performs depth-wise 2-D convolution. @@ -1009,14 +1009,6 @@ structured_op: !LinalgStructuredOpConfig type_var: T2 shape_map: affine_map<()[s0, s1, s2, s3, s4, s5, s6, s7, s8, s9, s10, s11, s12] -> (s4, s5, s3, s6)> - - !LinalgOperandDefConfig - name: IZp - usage: InputOperand - type_var: I32 - - !LinalgOperandDefConfig - name: KZp - usage: InputOperand - type_var: I32 - !LinalgOperandDefConfig name: O usage: OutputOperand @@ -1041,10 +1033,6 @@ structured_op: !LinalgStructuredOpConfig s9, s10, s11, s12] -> (d0, d1 * s9 + d3 * s11, d2 * s10 + d4 * s12, d5)> - affine_map<(d0, d1, d2, d3, d4, d5, d6)[s0, s1, s2, s3, s4, s5, s6, s7, s8, s9, s10, s11, s12] -> (d3, d4, d5, d6)> - - affine_map<(d0, d1, d2, d3, d4, d5, d6)[s0, s1, s2, s3, s4, s5, s6, s7, s8, - s9, s10, s11, s12] -> ()> - - affine_map<(d0, d1, d2, d3, d4, d5, d6)[s0, s1, s2, s3, s4, s5, s6, s7, s8, - s9, s10, s11, s12] -> ()> - affine_map<(d0, d1, d2, d3, d4, d5, d6)[s0, s1, s2, s3, s4, s5, s6, s7, s8, s9, s10, s11, s12] -> (d0, d1, d2, d5, d6)> iterator_types: @@ -1069,43 +1057,23 @@ structured_op: !LinalgStructuredOpConfig fn_name: mul operands: - !ScalarExpression - scalar_apply: - fn_name: sub + symbolic_cast: + type_var: U operands: - !ScalarExpression - symbolic_cast: - type_var: U - operands: - - !ScalarExpression - scalar_arg: I - - !ScalarExpression - symbolic_cast: - type_var: U - operands: - - !ScalarExpression - scalar_arg: IZp + scalar_arg: I - !ScalarExpression - scalar_apply: - fn_name: sub + symbolic_cast: + type_var: U operands: - !ScalarExpression - symbolic_cast: - type_var: U - operands: - - !ScalarExpression - scalar_arg: K - - !ScalarExpression - symbolic_cast: - type_var: U - operands: - - !ScalarExpression - scalar_arg: KZp + scalar_arg: K --- !LinalgOpConfig metadata: !LinalgOpMetadata - name: conv_2d_nchw - cpp_class_name: Conv2DNchwOp + name: depthwise_conv2D_nchw_q + cpp_class_name: DepthwiseConv2DNchwQOp doc: |- - Performs 2-D convolution. + Performs depth-wise 2-D convolution. Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output. @@ -1122,13 +1090,21 @@ structured_op: !LinalgStructuredOpConfig usage: InputOperand type_var: T2 shape_map: affine_map<()[s0, s1, s2, s3, s4, s5, s6, s7, s8, s9, s10, s11, s12] - -> (s4, s1, s5, s6)> + -> (s4, s5, s3, s6)> + - !LinalgOperandDefConfig + name: IZp + usage: InputOperand + type_var: I32 + - !LinalgOperandDefConfig + name: KZp + usage: InputOperand + type_var: I32 - !LinalgOperandDefConfig name: O usage: OutputOperand type_var: U shape_map: affine_map<()[s0, s1, s2, s3, s4, s5, s6, s7, s8, s9, s10, s11, s12] - -> (s0, s4, s7, s8, s1)> + -> (s0, s7, s8, s3, s6)> - !LinalgOperandDefConfig name: strides usage: IndexAttribute @@ -1144,19 +1120,23 @@ structured_op: !LinalgStructuredOpConfig indexing_maps: !LinalgIndexingMapsConfig static_indexing_maps: - affine_map<(d0, d1, d2, d3, d4, d5, d6)[s0, s1, s2, s3, s4, s5, s6, s7, s8, - s9, s10, s11, s12] -> (d0, d4, d2 * s9 + d5 * s11, d3 * s10 + d6 * s12)> + s9, s10, s11, s12] -> (d0, d1 * s9 + d3 * s11, d2 * s10 + d4 * s12, d5)> - affine_map<(d0, d1, d2, d3, d4, d5, d6)[s0, s1, s2, s3, s4, s5, s6, s7, s8, - s9, s10, s11, s12] -> (d1, d4, d5, d6)> + s9, s10, s11, s12] -> (d3, d4, d5, d6)> - affine_map<(d0, d1, d2, d3, d4, d5, d6)[s0, s1, s2, s3, s4, s5, s6, s7, s8, - s9, s10, s11, s12] -> (d0, d1, d2, d3)> + s9, s10, s11, s12] -> ()> + - affine_map<(d0, d1, d2, d3, d4, d5, d6)[s0, s1, s2, s3, s4, s5, s6, s7, s8, + s9, s10, s11, s12] -> ()> + - affine_map<(d0, d1, d2, d3, d4, d5, d6)[s0, s1, s2, s3, s4, s5, s6, s7, s8, + s9, s10, s11, s12] -> (d0, d1, d2, d5, d6)> iterator_types: - parallel - parallel - parallel - - parallel - - reduction - reduction - reduction + - parallel + - parallel assignments: - !ScalarAssign arg: O @@ -1171,17 +1151,37 @@ structured_op: !LinalgStructuredOpConfig fn_name: mul operands: - !ScalarExpression - symbolic_cast: - type_var: U + scalar_apply: + fn_name: sub operands: - !ScalarExpression - scalar_arg: I + symbolic_cast: + type_var: U + operands: + - !ScalarExpression + scalar_arg: I + - !ScalarExpression + symbolic_cast: + type_var: U + operands: + - !ScalarExpression + scalar_arg: IZp - !ScalarExpression - symbolic_cast: - type_var: U + scalar_apply: + fn_name: sub operands: - !ScalarExpression - scalar_arg: K + symbolic_cast: + type_var: U + operands: + - !ScalarExpression + scalar_arg: K + - !ScalarExpression + symbolic_cast: + type_var: U + operands: + - !ScalarExpression + scalar_arg: KZp --- !LinalgOpConfig metadata: !LinalgOpMetadata name: pooling_nhwc_sum @@ -1896,3 +1896,4 @@ structured_op: !LinalgStructuredOpConfig operands: - !ScalarExpression scalar_arg: I + diff --git a/mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp b/mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp index 37687337e10b..8e24f03a0dac 100644 --- a/mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp +++ b/mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp @@ -849,104 +849,213 @@ static LogicalResult reduceMatchAndRewriteHelper(Operation *op, uint64_t axis, return success(); } -static LogicalResult -convolutionMatchAndRewriterHelper(Operation *op, - ConversionPatternRewriter &rewriter) { - Location loc = op->getLoc(); - Value input = op->getOperand(0); - Value weight = op->getOperand(1); - Value bias = op->getOperand(2); +namespace { - ShapedType inputTy = input.getType().cast<ShapedType>(); - ShapedType weightTy = weight.getType().cast<ShapedType>(); - ShapedType biasTy = bias.getType().cast<ShapedType>(); - ShapedType resultTy = op->getResult(0).getType().cast<ShapedType>(); +template <typename SrcOp> +class PointwiseConverter : public OpRewritePattern<SrcOp> { +public: + using OpRewritePattern<SrcOp>::OpRewritePattern; - Type inputETy = inputTy.getElementType(); - Type resultETy = resultTy.getElementType(); - - auto padAttr = op->getAttr("pad").cast<ArrayAttr>(); - auto strideTosaAttr = op->getAttr("stride").cast<ArrayAttr>(); - auto dilationTosaAttr = op->getAttr("dilation").cast<ArrayAttr>(); - - bool isQuantized = op->hasAttr("quantization_info"); - IntegerAttr iZp; - IntegerAttr kZp; - if (isQuantized) { - auto quantizationInfo = - op->getAttr("quantization_info").cast<tosa::ConvOpQuantizationAttr>(); - iZp = rewriter.getI32IntegerAttr( - quantizationInfo.input_zp().getValue().getSExtValue()); - kZp = rewriter.getI32IntegerAttr( - quantizationInfo.weight_zp().getValue().getSExtValue()); + LogicalResult matchAndRewrite(SrcOp op, + PatternRewriter &rewriter) const final { + return elementwiseMatchAndRewriteHelper(op, rewriter); } +}; - if (!inputTy.hasStaticShape() || !weightTy.hasStaticShape() || - !biasTy.hasStaticShape() || !resultTy.hasStaticShape()) - return rewriter.notifyMatchFailure(op, - "tosa.conv ops require static shapes"); +class ConvConverter : public OpConversionPattern<tosa::Conv2DOp> { +public: + using OpConversionPattern<tosa::Conv2DOp>::OpConversionPattern; + LogicalResult + matchAndRewrite(tosa::Conv2DOp op, ArrayRef<Value> args, + ConversionPatternRewriter &rewriter) const final { + Location loc = op->getLoc(); + Value input = op->getOperand(0); + Value weight = op->getOperand(1); + Value bias = op->getOperand(2); - auto weightShape = weightTy.getShape(); - auto resultShape = resultTy.getShape(); + ShapedType inputTy = input.getType().cast<ShapedType>(); + ShapedType weightTy = weight.getType().cast<ShapedType>(); + ShapedType biasTy = bias.getType().cast<ShapedType>(); + ShapedType resultTy = op->getResult(0).getType().cast<ShapedType>(); - // Apply padding as necessary. - Attribute zeroAttr = rewriter.getZeroAttr(inputETy); - llvm::SmallVector<int64_t> pad; - pad.resize(2, 0); - getValuesFromIntArrayAttribute(padAttr, pad); - pad.resize(pad.size() + 2, 0); + Type inputETy = inputTy.getElementType(); + Type resultETy = resultTy.getElementType(); - input = applyPad(loc, input, pad, zeroAttr, rewriter); + auto padAttr = op->getAttr("pad").cast<ArrayAttr>(); + auto strideTosaAttr = op->getAttr("stride").cast<ArrayAttr>(); + auto dilationTosaAttr = op->getAttr("dilation").cast<ArrayAttr>(); + bool isQuantized = op->hasAttr("quantization_info"); - // Broadcast the initial value to the output tensor before convolving. - SmallVector<AffineMap, 4> indexingMaps; - indexingMaps.push_back(AffineMap::get( - /*dimCount=*/resultTy.getRank(), /*symbolCount=*/0, - {rewriter.getAffineDimExpr(3)}, rewriter.getContext())); - indexingMaps.push_back(rewriter.getMultiDimIdentityMap(resultTy.getRank())); + if (!inputTy.hasStaticShape() || !weightTy.hasStaticShape() || + !biasTy.hasStaticShape() || !resultTy.hasStaticShape()) + return rewriter.notifyMatchFailure(op, + "tosa.conv ops require static shapes"); - Value initTensor = rewriter.create<linalg::InitTensorOp>( - loc, resultTy.getShape(), resultTy.getElementType()); + auto weightShape = weightTy.getShape(); - Value biasBroadcast = - rewriter - .create<linalg::GenericOp>( - loc, resultTy, bias, initTensor, indexingMaps, - getNParallelLoopsAttrs(resultTy.getRank()), - [&](OpBuilder &nestedBuilder, Location nestedLoc, - ValueRange args) { - nestedBuilder.create<linalg::YieldOp>(nestedLoc, args[0]); - }) - .getResult(0); - - // Extract the attributes for convolution. - llvm::SmallVector<int64_t> stride, dilation; - getValuesFromIntArrayAttribute(strideTosaAttr, stride); - getValuesFromIntArrayAttribute(dilationTosaAttr, dilation); - - // Create the convolution op. - auto strideAttr = DenseIntElementsAttr::get( - RankedTensorType::get({2}, rewriter.getI64Type()), stride); - auto dilationAttr = DenseIntElementsAttr::get( - RankedTensorType::get({2}, rewriter.getI64Type()), dilation); - - if (isa<tosa::Conv2DOp>(op) && !isQuantized) { - rewriter.replaceOpWithNewOp<linalg::Conv2DInputNhwcFilterOhwiPolyOp>( + // Apply padding as necessary. + Attribute zeroAttr = rewriter.getZeroAttr(inputETy); + llvm::SmallVector<int64_t> pad; + pad.resize(2, 0); + getValuesFromIntArrayAttribute(padAttr, pad); + pad.resize(pad.size() + 2, 0); + input = applyPad(loc, input, pad, zeroAttr, rewriter); + + // Transpose the kernel to match dimension ordering of the linalg + // convolution operation. + // TODO(suderman): See if this can be efficiently folded - check whether + // the input is used anywhere else, if not fold the constant. + SmallVector<int64_t> weightPerm{1, 2, 3, 0}; + SmallVector<int64_t> newWeightShape{weightShape[1], weightShape[2], + weightShape[3], weightShape[0]}; + auto weightPermAttr = DenseIntElementsAttr::get( + RankedTensorType::get({4}, rewriter.getI64Type()), weightPerm); + Value weightPermValue = rewriter.create<ConstantOp>(loc, weightPermAttr); + Type newWeightTy = + RankedTensorType::get(newWeightShape, weightTy.getElementType()); + weight = rewriter.create<tosa::TransposeOp>(loc, newWeightTy, weight, + weightPermValue); + + // Broadcast the initial value to the output tensor before convolving. + SmallVector<AffineMap, 4> indexingMaps; + indexingMaps.push_back(AffineMap::get( + /*dimCount=*/resultTy.getRank(), /*symbolCount=*/0, + {rewriter.getAffineDimExpr(3)}, rewriter.getContext())); + indexingMaps.push_back(rewriter.getMultiDimIdentityMap(resultTy.getRank())); + + Value initTensor = rewriter.create<linalg::InitTensorOp>( + loc, resultTy.getShape(), resultETy); + + Value biasBroadcast = + rewriter + .create<linalg::GenericOp>( + loc, resultTy, bias, initTensor, indexingMaps, + getNParallelLoopsAttrs(resultTy.getRank()), + [&](OpBuilder &nestedBuilder, Location nestedLoc, + ValueRange args) { + nestedBuilder.create<linalg::YieldOp>(nestedLoc, args[0]); + }) + .getResult(0); + + // Extract the attributes for convolution. + llvm::SmallVector<int64_t> stride, dilation; + getValuesFromIntArrayAttribute(strideTosaAttr, stride); + getValuesFromIntArrayAttribute(dilationTosaAttr, dilation); + + // Create the convolution op. + auto strideAttr = DenseIntElementsAttr::get( + RankedTensorType::get({2}, rewriter.getI64Type()), stride); + auto dilationAttr = DenseIntElementsAttr::get( + RankedTensorType::get({2}, rewriter.getI64Type()), dilation); + + Value conv; + if (isQuantized) { + auto quantizationInfo = + op->getAttr("quantization_info").cast<tosa::ConvOpQuantizationAttr>(); + auto iZp = rewriter.getI32IntegerAttr( + quantizationInfo.input_zp().getValue().getSExtValue()); + auto kZp = rewriter.getI32IntegerAttr( + quantizationInfo.weight_zp().getValue().getSExtValue()); + + auto iZpVal = rewriter.create<ConstantOp>(loc, iZp); + auto kZpVal = rewriter.create<ConstantOp>(loc, kZp); + rewriter.replaceOpWithNewOp<linalg::Conv2DNhwcHwcfQOp>( + op, resultTy, ValueRange{input, weight, iZpVal, kZpVal}, + ValueRange{biasBroadcast}, strideAttr, dilationAttr); + return success(); + } + + rewriter.replaceOpWithNewOp<linalg::Conv2DNhwcHwcfOp>( op, resultTy, ValueRange{input, weight}, ValueRange{biasBroadcast}, strideAttr, dilationAttr); return success(); } +}; - if (isa<tosa::Conv2DOp>(op) && isQuantized) { - auto iZpVal = rewriter.create<ConstantOp>(loc, iZp); - auto kZpVal = rewriter.create<ConstantOp>(loc, kZp); - rewriter.replaceOpWithNewOp<linalg::Conv2DInputNhwcFilterOhwiPolyQOp>( - op, resultTy, ValueRange{input, weight, iZpVal, kZpVal}, - ValueRange{biasBroadcast}, strideAttr, dilationAttr); - return success(); - } +class DepthwiseConvConverter + : public OpConversionPattern<tosa::DepthwiseConv2DOp> { +public: + using OpConversionPattern<tosa::DepthwiseConv2DOp>::OpConversionPattern; + LogicalResult + matchAndRewrite(tosa::DepthwiseConv2DOp op, ArrayRef<Value> args, + ConversionPatternRewriter &rewriter) const final { + Location loc = op->getLoc(); + Value input = op->getOperand(0); + Value weight = op->getOperand(1); + Value bias = op->getOperand(2); + + ShapedType inputTy = input.getType().cast<ShapedType>(); + ShapedType weightTy = weight.getType().cast<ShapedType>(); + ShapedType biasTy = bias.getType().cast<ShapedType>(); + ShapedType resultTy = op->getResult(0).getType().cast<ShapedType>(); - if (isa<tosa::DepthwiseConv2DOp>(op)) { + Type inputETy = inputTy.getElementType(); + Type resultETy = resultTy.getElementType(); + + auto padAttr = op->getAttr("pad").cast<ArrayAttr>(); + auto strideTosaAttr = op->getAttr("stride").cast<ArrayAttr>(); + auto dilationTosaAttr = op->getAttr("dilation").cast<ArrayAttr>(); + + bool isQuantized = op->hasAttr("quantization_info"); + IntegerAttr iZp; + IntegerAttr kZp; + if (isQuantized) { + auto quantizationInfo = + op->getAttr("quantization_info").cast<tosa::ConvOpQuantizationAttr>(); + iZp = rewriter.getI32IntegerAttr( + quantizationInfo.input_zp().getValue().getSExtValue()); + kZp = rewriter.getI32IntegerAttr( + quantizationInfo.weight_zp().getValue().getSExtValue()); + } + + if (!inputTy.hasStaticShape() || !weightTy.hasStaticShape() || + !biasTy.hasStaticShape() || !resultTy.hasStaticShape()) + return rewriter.notifyMatchFailure(op, + "tosa.conv ops require static shapes"); + + auto weightShape = weightTy.getShape(); + auto resultShape = resultTy.getShape(); + + // Apply padding as necessary. + Attribute zeroAttr = rewriter.getZeroAttr(inputETy); + llvm::SmallVector<int64_t> pad; + pad.resize(2, 0); + getValuesFromIntArrayAttribute(padAttr, pad); + pad.resize(pad.size() + 2, 0); + + input = applyPad(loc, input, pad, zeroAttr, rewriter); + + // Broadcast the initial value to the output tensor before convolving. + SmallVector<AffineMap, 4> indexingMaps; + indexingMaps.push_back(AffineMap::get( + /*dimCount=*/resultTy.getRank(), /*symbolCount=*/0, + {rewriter.getAffineDimExpr(3)}, rewriter.getContext())); + indexingMaps.push_back(rewriter.getMultiDimIdentityMap(resultTy.getRank())); + + Value initTensor = + rewriter.create<linalg::InitTensorOp>(loc, resultShape, resultETy); + + Value biasBroadcast = + rewriter + .create<linalg::GenericOp>( + loc, resultTy, bias, initTensor, indexingMaps, + getNParallelLoopsAttrs(resultTy.getRank()), + [&](OpBuilder &nestedBuilder, Location nestedLoc, + ValueRange args) { + nestedBuilder.create<linalg::YieldOp>(nestedLoc, args[0]); + }) + .getResult(0); + + // Extract the attributes for convolution. + llvm::SmallVector<int64_t> stride, dilation; + getValuesFromIntArrayAttribute(strideTosaAttr, stride); + getValuesFromIntArrayAttribute(dilationTosaAttr, dilation); + + // Create the convolution op. + auto strideAttr = DenseIntElementsAttr::get( + RankedTensorType::get({2}, rewriter.getI64Type()), stride); + auto dilationAttr = DenseIntElementsAttr::get( + RankedTensorType::get({2}, rewriter.getI64Type()), dilation); ShapedType linalgConvTy = RankedTensorType::get({resultShape[0], resultShape[1], resultShape[2], weightShape[2], weightShape[3]}, @@ -976,32 +1085,6 @@ convolutionMatchAndRewriterHelper(Operation *op, rewriter.replaceOp(op, reshape); return success(); } - - return failure(); -} - -namespace { - -template <typename SrcOp> -class PointwiseConverter : public OpRewritePattern<SrcOp> { -public: - using OpRewritePattern<SrcOp>::OpRewritePattern; - - LogicalResult matchAndRewrite(SrcOp op, - PatternRewriter &rewriter) const final { - return elementwiseMatchAndRewriteHelper(op, rewriter); - } -}; - -template <typename T> -class ConvConverter : public OpConversionPattern<T> { -public: - using OpConversionPattern<T>::OpConversionPattern; - LogicalResult - matchAndRewrite(T op, ArrayRef<Value> args, - ConversionPatternRewriter &rewriter) const final { - return convolutionMatchAndRewriterHelper(op, rewriter); - } }; class TransposeConvConverter @@ -2528,8 +2611,8 @@ void mlir::tosa::populateTosaToLinalgOnTensorsConversionPatterns( ReduceConverter<tosa::ReduceProdOp>, ArgMaxConverter, ConcatConverter, - ConvConverter<tosa::Conv2DOp>, - ConvConverter<tosa::DepthwiseConv2DOp>, + ConvConverter, + DepthwiseConvConverter, TransposeConvConverter, GatherConverter, PadConverter, diff --git a/mlir/python/mlir/dialects/linalg/opdsl/ops/core_named_ops.py b/mlir/python/mlir/dialects/linalg/opdsl/ops/core_named_ops.py index fc92c196a059..b9faeeb831df 100644 --- a/mlir/python/mlir/dialects/linalg/opdsl/ops/core_named_ops.py +++ b/mlir/python/mlir/dialects/linalg/opdsl/ops/core_named_ops.py @@ -144,49 +144,39 @@ def dot( implements(ContractionOpInterface) C[None] += cast(U, A[D.m]) * cast(U, B[D.m]) - @linalg_structured_op -def conv_2d_input_nhwc_filter_ohwi_poly( - I=TensorDef(T1, S.N, S.IH, S.IW, S.IC), - K=TensorDef(T2, S.OC, S.KH, S.KW, S.IC), - O=TensorDef(U, S.N, S.OH, S.OW, S.OC, output=True), +def conv_2d_nchw( + I=TensorDef(T1, S.N, S.C, S.IH, S.IW), + K=TensorDef(T2, S.F, S.C, S.KH, S.KW), + O=TensorDef(U, S.N, S.F, S.OH, S.OW, S.C, output=True), strides=AttributeDef(S.SH, S.SW), dilations=AttributeDef(S.DH, S.DW)): - """Performs a 2-D convolution. + """Performs 2-D convolution. Numeric casting is performed on the operands to the inner multiply, promoting them to the same data type as the accumulator/output. """ - domain(D.n, D.oh, D.ow, D.kh, D.kw, D.oc, D.ic) - O[D.n, D.oh, D.ow, D.oc] += cast( - U, I[D.n, - D.oh * S.SH + D.kh * S.DH, - D.ow * S.SW + D.kw * S.DW, - D.ic]) * cast(U, K[D.oc, D.kh, D.kw, D.ic]) + domain(D.n, D.f, D.oh, D.ow, D.c, D.kh, D.kw) + O[D.n, D.f, D.oh, D.ow] += cast( + U, I[D.n, D.c, D.oh * S.SH + D.kh * S.DH, D.ow * S.SW + D.kw * S.DW, + ]) * cast(U, K[D.f, D.c, D.kh, D.kw]) @linalg_structured_op -def conv_2d_input_nhwc_filter_ohwi_poly_q( - I=TensorDef(T1, S.N, S.IH, S.IW, S.IC), - K=TensorDef(T2, S.OC, S.KH, S.KW, S.IC), - IZp=ScalarDef(I32), - KZp=ScalarDef(I32), - O=TensorDef(U, S.N, S.OH, S.OW, S.OC, output=True), +def conv_2d_nhwc_hwcf( + I=TensorDef(T1, S.N, S.IH, S.IW, S.C), + K=TensorDef(T2, S.KH, S.KW, S.C, S.F), + O=TensorDef(U, S.N, S.OH, S.OW, S.F, output=True), strides=AttributeDef(S.SH, S.SW), dilations=AttributeDef(S.DH, S.DW)): - """Performs a 2-D quantized convolution. + """Performs 2-D convolution. Numeric casting is performed on the operands to the inner multiply, promoting - them to the same data type as the accumulator/output. Includes zero point - adjustment for quantization. + them to the same data type as the accumulator/output. """ - domain(D.n, D.oh, D.ow, D.kh, D.kw, D.oc, D.ic) - O[D.n, D.oh, D.ow, D.oc] += ((cast( - U, I[D.n, - D.oh * S.SH + D.kh * S.DH, - D.ow * S.SW + D.kw * S.DW, - D.ic]) - cast(U, IZp)) * - (cast(U, K[D.oc, D.kh, D.kw, D.ic]) - cast(U, KZp))) - + domain(D.n, D.oh, D.ow, D.f, D.kh, D.kw, D.c) + O[D.n, D.oh, D.ow, D.f] += cast( + U, I[D.n, D.oh * S.SH + D.kh * S.DH, D.ow * S.SW + D.kw * S.DW, D.c + ]) * cast(U, K[D.kh, D.kw, D.c, D.f]) @linalg_structured_op def depthwise_conv_2d_input_nhwc_filter_hwc_poly( @@ -206,24 +196,27 @@ def depthwise_conv_2d_input_nhwc_filter_hwc_poly( D.c]) * cast(U, K[D.kh, D.kw, D.c]) @linalg_structured_op -def conv_2d_nchw( - I=TensorDef(T1, S.N, S.C, S.IH, S.IW), - K=TensorDef(T2, S.F, S.C, S.KH, S.KW), - O=TensorDef(U, S.N, S.F, S.OH, S.OW, S.C, output=True), +def conv_2d_nhwc_hwcf_q( + I=TensorDef(T1, S.N, S.IH, S.IW, S.C), + K=TensorDef(T2, S.KH, S.KW, S.C, S.F), + IZp=ScalarDef(I32), + KZp=ScalarDef(I32), + O=TensorDef(U, S.N, S.OH, S.OW, S.F, output=True), strides=AttributeDef(S.SH, S.SW), dilations=AttributeDef(S.DH, S.DW)): - """Performs 2-D convolution. + """Performs 2-D convolution with zero point offsets. Numeric casting is performed on the operands to the inner multiply, promoting - them to the same data type as the accumulator/output. + them to the same data type as the accumulator/output. This includes the zero + point offsets common to quantized operations. """ - domain(D.n, D.f, D.oh, D.ow, D.c, D.kh, D.kw) - O[D.n, D.f, D.oh, D.ow] += cast( - U, I[D.n, D.c, D.oh * S.SH + D.kh * S.DH, D.ow * S.SW + D.kw * S.DW, - ]) * cast(U, K[D.f, D.c, D.kh, D.kw]) + domain(D.n, D.oh, D.ow, D.f, D.kh, D.kw, D.c) + O[D.n, D.oh, D.ow, D.f] += (cast( + U, I[D.n, D.oh * S.SH + D.kh * S.DH, D.ow * S.SW + D.kw * S.DW, D.c + ]) - cast(U, IZp)) * (cast(U, K[D.kh, D.kw, D.c, D.f]) - cast(U, KZp)) - -def depthwise_conv2D_nchw( #TODO: Fix name +@linalg_structured_op +def depthwise_conv2D_nchw( I=TensorDef(T1, S.N, S.IH, S.IW, S.IC), K=TensorDef(T2, S.KH, S.KW, S.IC, S.CM), O=TensorDef(U, S.N, S.OH, S.OW, S.IC, S.CM, output=True), @@ -239,8 +232,8 @@ def depthwise_conv2D_nchw( #TODO: Fix name U, I[D.n, D.oh * S.SH + D.kh * S.DH, D.ow * S.SW + D.kw * S.DW, D.ic]) * cast(U, K[D.kh, D.kw, D.ic, D.cm]) - -def depthwise_conv2D_nchw_q( #TODO: Fix name +@linalg_structured_op +def depthwise_conv2D_nchw_q( I=TensorDef(T1, S.N, S.IH, S.IW, S.IC), K=TensorDef(T2, S.KH, S.KW, S.IC, S.CM), IZp=ScalarDef(I32), diff --git a/mlir/test/Conversion/TosaToLinalg/tosa-to-linalg.mlir b/mlir/test/Conversion/TosaToLinalg/tosa-to-linalg.mlir index 309846d66c94..3c89de395187 100644 --- a/mlir/test/Conversion/TosaToLinalg/tosa-to-linalg.mlir +++ b/mlir/test/Conversion/TosaToLinalg/tosa-to-linalg.mlir @@ -1176,14 +1176,19 @@ func @avg_pool(%arg0: tensor<1x6x34x62xf32>) -> (tensor<1x5x33x62xf32>) { // ----- -// CHECK-DAG: #[[$MAP1:.*]] = affine_map<(d0, d1, d2, d3) -> (d3)> -// CHECK-DAG: #[[$MAP2:.*]] = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)> +// CHECK: #[[MAP0:.+]] = affine_map<(d0, d1, d2, d3) -> (d3, d0, d1, d2)> +// CHECK: #[[MAP1:.+]] = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)> +// CHECK: #[[MAP2:.+]] = affine_map<(d0, d1, d2, d3) -> (d3)> -// CHECK-LABEL: @conv2d_f32 +// CHECK-LABEL @conv2d_f32 func @conv2d_f32(%input: tensor<1x49x42x27xf32>, %weights: tensor<28x3x3x27xf32>, %bias: tensor<28xf32>) -> () { - // CHECK: %[[INIT:.+]] = linalg.init_tensor [1, 45, 40, 28] - // CHECK: %[[BROADCAST:.+]] = linalg.generic {indexing_maps = [#[[$MAP1]], #[[$MAP2]]], iterator_types = ["parallel", "parallel", "parallel", "parallel"]} ins(%arg2 : tensor<28xf32>) outs(%[[INIT]] : tensor<1x45x40x28xf32>) - // CHECK: linalg.conv_2d_input_nhwc_filter_ohwi_poly {dilations = dense<[2, 1]> : tensor<2xi64>, strides = dense<1> : tensor<2xi64>} ins(%arg0, %arg1 : tensor<1x49x42x27xf32>, tensor<28x3x3x27xf32>) outs(%[[BROADCAST]] : tensor<1x45x40x28xf32>) + // CHECK: %[[W_IN:.+]] = linalg.init_tensor [3, 3, 27, 28] + // CHECK: %[[W:.+]] = linalg.generic {indexing_maps = [#[[MAP0]], #[[MAP1]]], iterator_types = ["parallel", "parallel", "parallel", "parallel"]} ins(%arg1 : tensor<28x3x3x27xf32>) outs(%[[W_IN]] : tensor<3x3x27x28xf32>) + // CHECK: linalg.yield %arg3 : f32 + // CHECK: %[[B_IN:.+]] = linalg.init_tensor [1, 45, 40, 28] + // CHECK: %[[B:.+]] = linalg.generic {indexing_maps = [#[[MAP2]], #[[MAP1]]], iterator_types = ["parallel", "parallel", "parallel", "parallel"]} ins(%arg2 : tensor<28xf32>) outs(%[[B_IN]] : tensor<1x45x40x28xf32>) + // CHECK: linalg.yield %arg3 : f32 + // CHECK: %[[CONV:.+]] = linalg.conv_2d_nhwc_hwcf {dilations = dense<[2, 1]> : tensor<2xi64>, strides = dense<1> : tensor<2xi64>} ins(%arg0, %1 : tensor<1x49x42x27xf32>, tensor<3x3x27x28xf32>) outs(%[[B]] : tensor<1x45x40x28xf32>) %0 = "tosa.conv2d"(%input, %weights, %bias) {pad = [0, 0, 0, 0], stride = [1, 1], dilation = [2, 1]} : (tensor<1x49x42x27xf32>, tensor<28x3x3x27xf32>, tensor<28xf32>) -> (tensor<1x45x40x28xf32>) return } @@ -1192,26 +1197,17 @@ func @conv2d_f32(%input: tensor<1x49x42x27xf32>, %weights: tensor<28x3x3x27xf32> // CHECK-LABEL: @conv2d_padded_f32 func @conv2d_padded_f32(%input: tensor<1x47x40x28xf32>, %weights: tensor<28x3x3x28xf32>, %bias: tensor<28xf32>) -> () { - // CHECK: linalg.pad_tensor %arg0 - // CHECK: linalg.conv_2d_input_nhwc_filter_ohwi_poly </cut>

3 years, 11 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_kernel/llvm-master-aarch64-mainline-allyesconfig - Build # 16 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *linux* in CI configuration tcwg_kernel/llvm-master-aarch64-mainline-allyesconfig. So far, this commit has regressed CI configurations: - tcwg_kernel/llvm-master-aarch64-mainline-allyesconfig Culprit: <cut> commit 342f43af70dbc74f8629381998f92c060e1763a2 Author: Maurizio Lombardi <mlombard(a)redhat.com> Date: Thu Jul 29 15:52:50 2021 +0200 iscsi_ibft: fix crash due to KASLR physical memory remapping Starting with commit a799c2bd29d1 ("x86/setup: Consolidate early memory reservations") memory reservations have been moved earlier during the boot process, before the execution of the Kernel Address Space Layout Randomization code. setup_arch() calls the iscsi_ibft's find_ibft_region() function to find and reserve the memory dedicated to the iBFT and this function also saves a virtual pointer to the iBFT table for later use. The problem is that if KALSR is active, the physical memory gets remapped somewhere else in the virtual address space and the pointer is no longer valid, this will cause a kernel panic when the iscsi driver tries to dereference it. iBFT detected. BUG: unable to handle page fault for address: ffff888000099fd8 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI ..snip.. Call Trace: ? ibft_create_kobject+0x1d2/0x1d2 [iscsi_ibft] do_one_initcall+0x44/0x1d0 ? kmem_cache_alloc_trace+0x119/0x220 do_init_module+0x5c/0x270 __do_sys_init_module+0x12e/0x1b0 do_syscall_64+0x40/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae Fix this bug by saving the address of the physical location of the ibft; later the driver will use isa_bus_to_virt() to get the correct virtual address. N.B. On each reboot KASLR randomizes the virtual addresses so assuming phys_to_virt before KASLR does its deed is incorrect. Simplify the code by renaming find_ibft_region() to reserve_ibft_region() and remove all the wrappers. Signed-off-by: Maurizio Lombardi <mlombard(a)redhat.com> Reviewed-by: Mike Rapoport <rppt(a)linux.ibm.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad(a)kernel.org> </cut> Results regressed to (for first_bad == 342f43af70dbc74f8629381998f92c060e1763a2) # reset_artifacts: -10 # build_abe binutils: -9 # build_llvm: -5 # build_abe qemu: -2 # linux_n_obj: 19722 # First few build errors in logs: from (for last_good == 62fb9874f5da54fdb243003b386128037319b219) # reset_artifacts: -10 # build_abe binutils: -9 # build_llvm: -5 # build_abe qemu: -2 # linux_n_obj: 19795 # linux build successful: all Artifacts of last_good build: https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-master-aarch64-mainl… Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-master-aarch64-mainl… Build top page/logs: https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-master-aarch64-mainl… Configuration details: Reproduce builds: <cut> mkdir investigate-linux-342f43af70dbc74f8629381998f92c060e1763a2 cd investigate-linux-342f43af70dbc74f8629381998f92c060e1763a2 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-master-aarch64-mainl… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-master-aarch64-mainl… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-master-aarch64-mainl… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_kernel-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /linux/ ./ ./bisect/baseline/ cd linux # Reproduce first_bad build git checkout --detach 342f43af70dbc74f8629381998f92c060e1763a2 ../artifacts/test.sh # Reproduce last_good build git checkout --detach 62fb9874f5da54fdb243003b386128037319b219 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-master-aarch64-mainl… Build log: https://ci.linaro.org/job/tcwg_kernel-llvm-bisect-llvm-master-aarch64-mainl… Full commit (up to 1000 lines): <cut> commit 342f43af70dbc74f8629381998f92c060e1763a2 Author: Maurizio Lombardi <mlombard(a)redhat.com> Date: Thu Jul 29 15:52:50 2021 +0200 iscsi_ibft: fix crash due to KASLR physical memory remapping Starting with commit a799c2bd29d1 ("x86/setup: Consolidate early memory reservations") memory reservations have been moved earlier during the boot process, before the execution of the Kernel Address Space Layout Randomization code. setup_arch() calls the iscsi_ibft's find_ibft_region() function to find and reserve the memory dedicated to the iBFT and this function also saves a virtual pointer to the iBFT table for later use. The problem is that if KALSR is active, the physical memory gets remapped somewhere else in the virtual address space and the pointer is no longer valid, this will cause a kernel panic when the iscsi driver tries to dereference it. iBFT detected. BUG: unable to handle page fault for address: ffff888000099fd8 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI ..snip.. Call Trace: ? ibft_create_kobject+0x1d2/0x1d2 [iscsi_ibft] do_one_initcall+0x44/0x1d0 ? kmem_cache_alloc_trace+0x119/0x220 do_init_module+0x5c/0x270 __do_sys_init_module+0x12e/0x1b0 do_syscall_64+0x40/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae Fix this bug by saving the address of the physical location of the ibft; later the driver will use isa_bus_to_virt() to get the correct virtual address. N.B. On each reboot KASLR randomizes the virtual addresses so assuming phys_to_virt before KASLR does its deed is incorrect. Simplify the code by renaming find_ibft_region() to reserve_ibft_region() and remove all the wrappers. Signed-off-by: Maurizio Lombardi <mlombard(a)redhat.com> Reviewed-by: Mike Rapoport <rppt(a)linux.ibm.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad(a)kernel.org> --- arch/x86/kernel/setup.c | 10 -------- drivers/firmware/iscsi_ibft.c | 10 +++++--- drivers/firmware/iscsi_ibft_find.c | 48 ++++++++++++++------------------------ include/linux/iscsi_ibft.h | 18 ++++++-------- 4 files changed, 32 insertions(+), 54 deletions(-) diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c index 1e720626069a..b6a62af06a9f 100644 --- a/arch/x86/kernel/setup.c +++ b/arch/x86/kernel/setup.c @@ -571,16 +571,6 @@ void __init reserve_standard_io_resources(void) } -static __init void reserve_ibft_region(void) -{ - unsigned long addr, size = 0; - - addr = find_ibft_region(&size); - - if (size) - memblock_reserve(addr, size); -} - static bool __init snb_gfx_workaround_needed(void) { #ifdef CONFIG_PCI diff --git a/drivers/firmware/iscsi_ibft.c b/drivers/firmware/iscsi_ibft.c index 7127a04bca19..612a59e213df 100644 --- a/drivers/firmware/iscsi_ibft.c +++ b/drivers/firmware/iscsi_ibft.c @@ -84,8 +84,10 @@ MODULE_DESCRIPTION("sysfs interface to BIOS iBFT information"); MODULE_LICENSE("GPL"); MODULE_VERSION(IBFT_ISCSI_VERSION); +static struct acpi_table_ibft *ibft_addr; + #ifndef CONFIG_ISCSI_IBFT_FIND -struct acpi_table_ibft *ibft_addr; +phys_addr_t ibft_phys_addr; #endif struct ibft_hdr { @@ -858,11 +860,13 @@ static int __init ibft_init(void) int rc = 0; /* - As on UEFI systems the setup_arch()/find_ibft_region() + As on UEFI systems the setup_arch()/reserve_ibft_region() is called before ACPI tables are parsed and it only does legacy finding. */ - if (!ibft_addr) + if (ibft_phys_addr) + ibft_addr = isa_bus_to_virt(ibft_phys_addr); + else acpi_find_ibft_region(); if (ibft_addr) { diff --git a/drivers/firmware/iscsi_ibft_find.c b/drivers/firmware/iscsi_ibft_find.c index 64bb94523281..a0594590847d 100644 --- a/drivers/firmware/iscsi_ibft_find.c +++ b/drivers/firmware/iscsi_ibft_find.c @@ -31,8 +31,8 @@ /* * Physical location of iSCSI Boot Format Table. */ -struct acpi_table_ibft *ibft_addr; -EXPORT_SYMBOL_GPL(ibft_addr); +phys_addr_t ibft_phys_addr; +EXPORT_SYMBOL_GPL(ibft_phys_addr); static const struct { char *sign; @@ -47,13 +47,24 @@ static const struct { #define VGA_MEM 0xA0000 /* VGA buffer */ #define VGA_SIZE 0x20000 /* 128kB */ -static int __init find_ibft_in_mem(void) +/* + * Routine used to find and reserve the iSCSI Boot Format Table + */ +void __init reserve_ibft_region(void) { unsigned long pos; unsigned int len = 0; void *virt; int i; + ibft_phys_addr = 0; + + /* iBFT 1.03 section 1.4.3.1 mandates that UEFI machines will + * only use ACPI for this + */ + if (efi_enabled(EFI_BOOT)) + return; + for (pos = IBFT_START; pos < IBFT_END; pos += 16) { /* The table can't be inside the VGA BIOS reserved space, * so skip that area */ @@ -70,35 +81,12 @@ static int __init find_ibft_in_mem(void) /* if the length of the table extends past 1M, * the table cannot be valid. */ if (pos + len <= (IBFT_END-1)) { - ibft_addr = (struct acpi_table_ibft *)virt; - pr_info("iBFT found at 0x%lx.\n", pos); - goto done; + ibft_phys_addr = pos; + memblock_reserve(ibft_phys_addr, PAGE_ALIGN(len)); + pr_info("iBFT found at 0x%lx.\n", ibft_phys_addr); + return; } } } } -done: - return len; -} -/* - * Routine used to find the iSCSI Boot Format Table. The logical - * kernel address is set in the ibft_addr global variable. - */ -unsigned long __init find_ibft_region(unsigned long *sizep) -{ - ibft_addr = NULL; - - /* iBFT 1.03 section 1.4.3.1 mandates that UEFI machines will - * only use ACPI for this */ - - if (!efi_enabled(EFI_BOOT)) - find_ibft_in_mem(); - - if (ibft_addr) { - *sizep = PAGE_ALIGN(ibft_addr->header.length); - return (u64)virt_to_phys(ibft_addr); - } - - *sizep = 0; - return 0; } diff --git a/include/linux/iscsi_ibft.h b/include/linux/iscsi_ibft.h index b7b45ca82bea..790e7fcfc1a6 100644 --- a/include/linux/iscsi_ibft.h +++ b/include/linux/iscsi_ibft.h @@ -13,26 +13,22 @@ #ifndef ISCSI_IBFT_H #define ISCSI_IBFT_H -#include <linux/acpi.h> +#include <linux/types.h> /* - * Logical location of iSCSI Boot Format Table. - * If the value is NULL there is no iBFT on the machine. + * Physical location of iSCSI Boot Format Table. + * If the value is 0 there is no iBFT on the machine. */ -extern struct acpi_table_ibft *ibft_addr; +extern phys_addr_t ibft_phys_addr; /* * Routine used to find and reserve the iSCSI Boot Format Table. The - * mapped address is set in the ibft_addr variable. + * physical address is set in the ibft_phys_addr variable. */ #ifdef CONFIG_ISCSI_IBFT_FIND -unsigned long find_ibft_region(unsigned long *sizep); +void reserve_ibft_region(void); #else -static inline unsigned long find_ibft_region(unsigned long *sizep) -{ - *sizep = 0; - return 0; -} +static inline void reserve_ibft_region(void) {} #endif #endif /* ISCSI_IBFT_H */ </cut>

3 years, 11 months

2
1
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tx1/llvm-master-aarch64-spec2k6-O3 - Build # 20 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_tx1/llvm-master-aarch64-spec2k6-O3. So far, this commit has regressed CI configurations: - tcwg_bmk_llvm_tx1/llvm-master-aarch64-spec2k6-O3 Culprit: <cut> commit 4cd8dd3fe05e099792e1494dedd074eb5ba289b6 Author: Amy Kwan <amy.kwan1(a)ibm.com> Date: Sun Aug 22 13:46:52 2021 -0500 [scudo][standalone] Link tests against libatomic if libatomic exists It is possible that libatomic does not exist on some systems. This patch updates the scudo standalone tests to link against libatomic if the library exists. This is an update to the original patch: https://reviews.llvm.org/D64134 and aims to resolve https://bugs.llvm.org/show_bug.cgi?id=51431. Differential Revision: https://reviews.llvm.org/D108503 </cut> Results regressed to (for first_bad == 4cd8dd3fe05e099792e1494dedd074eb5ba289b6) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O3 artifacts/build-4cd8dd3fe05e099792e1494dedd074eb5ba289b6/results_id: 1 # 447.dealII,dealII_base.default regressed by 103 from (for last_good == d8d84c9df82fc114f2b22a533a8183065ca1a2e0) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O3 artifacts/build-d8d84c9df82fc114f2b22a533a8183065ca1a2e0/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Results ID of last_good: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-master-aarch64-spec2k6-O3/4515 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Results ID of first_bad: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-master-aarch64-spec2k6-O3/4510 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Configuration details: Reproduce builds: <cut> mkdir investigate-llvm-4cd8dd3fe05e099792e1494dedd074eb5ba289b6 cd investigate-llvm-4cd8dd3fe05e099792e1494dedd074eb5ba289b6 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach 4cd8dd3fe05e099792e1494dedd074eb5ba289b6 ../artifacts/test.sh # Reproduce last_good build git checkout --detach d8d84c9df82fc114f2b22a533a8183065ca1a2e0 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Full commit (up to 1000 lines): <cut> commit 4cd8dd3fe05e099792e1494dedd074eb5ba289b6 Author: Amy Kwan <amy.kwan1(a)ibm.com> Date: Sun Aug 22 13:46:52 2021 -0500 [scudo][standalone] Link tests against libatomic if libatomic exists It is possible that libatomic does not exist on some systems. This patch updates the scudo standalone tests to link against libatomic if the library exists. This is an update to the original patch: https://reviews.llvm.org/D64134 and aims to resolve https://bugs.llvm.org/show_bug.cgi?id=51431. Differential Revision: https://reviews.llvm.org/D108503 --- compiler-rt/lib/scudo/standalone/tests/CMakeLists.txt | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/compiler-rt/lib/scudo/standalone/tests/CMakeLists.txt b/compiler-rt/lib/scudo/standalone/tests/CMakeLists.txt index f4186eba1688..eaa47a04a179 100644 --- a/compiler-rt/lib/scudo/standalone/tests/CMakeLists.txt +++ b/compiler-rt/lib/scudo/standalone/tests/CMakeLists.txt @@ -39,7 +39,10 @@ foreach(lib ${SANITIZER_TEST_CXX_LIBRARIES}) endforeach() list(APPEND LINK_FLAGS -pthread) # Linking against libatomic is required with some compilers -list(APPEND LINK_FLAGS -latomic) +check_library_exists(atomic __atomic_load_8 "" COMPILER_RT_HAS_LIBATOMIC) +if (COMPILER_RT_HAS_LIBATOMIC) + list(APPEND LINK_FLAGS -latomic) +endif() set(SCUDO_TEST_HEADERS scudo_unit_test.h </cut>

3 years, 11 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_apm/gnu-release-aarch64-spec2k6-Os_LTO - Build # 3 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *gcc* in CI configuration tcwg_bmk_gnu_apm/gnu-release-aarch64-spec2k6-Os_LTO. So far, this commit has regressed CI configurations: - tcwg_bmk_gnu_apm/gnu-release-aarch64-spec2k6-Os_LTO Culprit: <cut> commit ee875b63b22e30a0dcb4b05f7532c2c416ba6cd0 Author: Richard Biener <rguenther(a)suse.de> Date: Tue Aug 17 08:38:35 2021 +0200 tree-optimization/101868 - avoid PRE of trapping mems across calls This backports a fix for the omission of a check of trapping mems when hoisting them across calls that might not return. This was originally done as part of a fix to handle const functions that throw properly. 2021-08-17 Richard Biener <rguenther(a)suse.de> PR tree-optimization/101373 PR tree-optimization/101868 * tree-ssa-pre.c (prune_clobbered_mems): Also prune trapping references when the BB may not return. * gcc.dg/lto/pr101868_0.c: New testcase. * gcc.dg/lto/pr101868_1.c: Likewise. * gcc.dg/lto/pr101868_2.c: Likewise. * gcc.dg/lto/pr101868_3.c: Likewise. </cut> Results regressed to (for first_bad == ee875b63b22e30a0dcb4b05f7532c2c416ba6cd0) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # true: 0 # benchmark -- -Os_LTO artifacts/build-ee875b63b22e30a0dcb4b05f7532c2c416ba6cd0/results_id: 1 # 450.soplex,soplex_base.default regressed by 102 from (for last_good == a0a0499b8bb920fdd98e791804812f001f0b4fe8) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # true: 0 # benchmark -- -Os_LTO artifacts/build-a0a0499b8bb920fdd98e791804812f001f0b4fe8/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-release-a… Results ID of last_good: apm_64/tcwg_bmk_gnu_apm/bisect-gnu-release-aarch64-spec2k6-Os_LTO/4497 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-release-a… Results ID of first_bad: apm_64/tcwg_bmk_gnu_apm/bisect-gnu-release-aarch64-spec2k6-Os_LTO/4482 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-release-a… Configuration details: Reproduce builds: <cut> mkdir investigate-gcc-ee875b63b22e30a0dcb4b05f7532c2c416ba6cd0 cd investigate-gcc-ee875b63b22e30a0dcb4b05f7532c2c416ba6cd0 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-release-a… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-release-a… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-release-a… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /gcc/ ./ ./bisect/baseline/ cd gcc # Reproduce first_bad build git checkout --detach ee875b63b22e30a0dcb4b05f7532c2c416ba6cd0 ../artifacts/test.sh # Reproduce last_good build git checkout --detach a0a0499b8bb920fdd98e791804812f001f0b4fe8 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-release-a… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-release-a… Full commit (up to 1000 lines): <cut> commit ee875b63b22e30a0dcb4b05f7532c2c416ba6cd0 Author: Richard Biener <rguenther(a)suse.de> Date: Tue Aug 17 08:38:35 2021 +0200 tree-optimization/101868 - avoid PRE of trapping mems across calls This backports a fix for the omission of a check of trapping mems when hoisting them across calls that might not return. This was originally done as part of a fix to handle const functions that throw properly. 2021-08-17 Richard Biener <rguenther(a)suse.de> PR tree-optimization/101373 PR tree-optimization/101868 * tree-ssa-pre.c (prune_clobbered_mems): Also prune trapping references when the BB may not return. * gcc.dg/lto/pr101868_0.c: New testcase. * gcc.dg/lto/pr101868_1.c: Likewise. * gcc.dg/lto/pr101868_2.c: Likewise. * gcc.dg/lto/pr101868_3.c: Likewise. --- gcc/testsuite/gcc.dg/lto/pr101868_0.c | 33 +++++++++++++++++++++++++++++++++ gcc/testsuite/gcc.dg/lto/pr101868_1.c | 23 +++++++++++++++++++++++ gcc/testsuite/gcc.dg/lto/pr101868_2.c | 11 +++++++++++ gcc/testsuite/gcc.dg/lto/pr101868_3.c | 8 ++++++++ gcc/tree-ssa-pre.c | 7 +++++++ 5 files changed, 82 insertions(+) diff --git a/gcc/testsuite/gcc.dg/lto/pr101868_0.c b/gcc/testsuite/gcc.dg/lto/pr101868_0.c new file mode 100644 index 00000000000..c84d19b0267 --- /dev/null +++ b/gcc/testsuite/gcc.dg/lto/pr101868_0.c @@ -0,0 +1,33 @@ +/* { dg-lto-do run } */ +/* { dg-lto-options { "-O2 -fno-strict-aliasing -flto" } } */ + +typedef unsigned long VALUE; + +__attribute__ ((cold)) +void rb_check_type(VALUE, int); + +static VALUE +repro(VALUE dummy, VALUE hash) +{ + if (hash == 0) { + rb_check_type(hash, 1); + } + else if (*(long *)hash) { + rb_check_type(hash, 1); + } + + + return *(long *)hash; +} + +static VALUE (*that)(VALUE dummy, VALUE hash) = repro; + +int +main(int argc, char **argv) +{ + argc--; + that(0, argc); + + rb_check_type(argc, argc); + +} diff --git a/gcc/testsuite/gcc.dg/lto/pr101868_1.c b/gcc/testsuite/gcc.dg/lto/pr101868_1.c new file mode 100644 index 00000000000..146c14abc76 --- /dev/null +++ b/gcc/testsuite/gcc.dg/lto/pr101868_1.c @@ -0,0 +1,23 @@ +typedef unsigned long VALUE; + + +__attribute__ ((noreturn)) void rexc_raise(VALUE mesg); + +VALUE rb_donothing(VALUE klass); + +static void +funexpected_type(VALUE x, int xt, int t) +{ + rexc_raise(rb_donothing(0)); +} + +__attribute__ ((cold)) +void +rb_check_type(VALUE x, int t) +{ + int xt; + + if (x == 0) { + funexpected_type(x, xt, t); + } +} diff --git a/gcc/testsuite/gcc.dg/lto/pr101868_2.c b/gcc/testsuite/gcc.dg/lto/pr101868_2.c new file mode 100644 index 00000000000..e6f01b23f45 --- /dev/null +++ b/gcc/testsuite/gcc.dg/lto/pr101868_2.c @@ -0,0 +1,11 @@ +typedef unsigned long VALUE; + +static void thing(void) {} +static void (*ptr)(void) = &thing; + +VALUE +rb_donothing(VALUE klass) +{ + ptr(); + return 0; +} diff --git a/gcc/testsuite/gcc.dg/lto/pr101868_3.c b/gcc/testsuite/gcc.dg/lto/pr101868_3.c new file mode 100644 index 00000000000..61217625be7 --- /dev/null +++ b/gcc/testsuite/gcc.dg/lto/pr101868_3.c @@ -0,0 +1,8 @@ +typedef unsigned long VALUE; + +__attribute__((noreturn)) +void +rexc_raise(VALUE mesg) +{ + __builtin_exit(0); +} diff --git a/gcc/tree-ssa-pre.c b/gcc/tree-ssa-pre.c index 04ec4fbaeec..2aedc31e1d7 100644 --- a/gcc/tree-ssa-pre.c +++ b/gcc/tree-ssa-pre.c @@ -2070,6 +2070,13 @@ prune_clobbered_mems (bitmap_set_t set, basic_block block) && value_dies_in_block_x (expr, block)))) to_remove = i; } + /* If the REFERENCE may trap make sure the block does not contain + a possible exit point. + ??? This is overly conservative if we translate AVAIL_OUT + as the available expression might be after the exit point. */ + if (BB_MAY_NOTRETURN (block) + && vn_reference_may_trap (ref)) + to_remove = i; } else if (expr->kind == NARY) { </cut>

3 years, 11 months

1
0
0 0

[ACTIVITY] report week ending 3 Sep

by Peter Maydell

Progress (short week, 2 days) * UM-2 [QEMU upstream maintainership] + Lots of code review and getting things upstream after trunk reopened + Wrote up a first draft of how to handle merging pullreqs, so that other people can share this job with me + Sent a patchset that allows board models to mark some buses as not suitable for plugging in user-created devices -- this avoids problems with i2c devices appearing on buses that are supposed to be for on-board devices only in the MPS2/MPS3 machines * QEMU-406 [QEMU support for MVE (M-profile Vector Extension; Helium)] + MVE is now enabled upstream. (There are still some loose ends to do under this JIRA task, though.) + Sent a patchset that makes some of the easier codegen optimizations for the no-predication case. (Code review spotted an issue which might be painful to sort out -- we'll see next week...) -- PMM

3 years, 11 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_apm/llvm-master-arm-spec2k6-Oz - Build # 4 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *binutils* in CI configuration tcwg_bmk_llvm_apm/llvm-master-arm-spec2k6-Oz. So far, this commit has regressed CI configurations: - tcwg_bmk_llvm_apm/llvm-master-arm-spec2k6-Oz Culprit: <cut> commit 590d3faada8a12bf0937bbf68413956dc6a339a9 Author: Tom de Vries <tdevries(a)suse.de> Date: Mon Aug 30 10:30:26 2021 +0200 [gdb/testsuite] Improve argument syntax of proc arange The current syntax of proc arange is: ... proc arange { arange_start arange_length {comment ""} {seg_sel ""} } { ... and a typical call looks like: ... arange $start $len ... This style is somewhat annoying because if you want to specify the last parameter, you need to give the default values of all the other optional ones before as well: ... arange $start $len "" $seg_sel ... Update the syntax to: ... proc arange { options arange_start arange_length } { parse_options { { comment "" } { seg_sel "" } } ... such that a typical call looks like: ... arange {} $start $len ... and a call using seg_sel looks like: ... arange { seg_sel $seg_sel } $start $len ... Also update proc aranges, which already has an options argument, to use the new proc parse_options. Tested on x86_64-linux. Co-Authored-By: Simon Marchi <simon.marchi(a)polymtl.ca> </cut> Results regressed to (for first_bad == 590d3faada8a12bf0937bbf68413956dc6a339a9) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=thumb --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=thumb --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -Oz_mthumb artifacts/build-590d3faada8a12bf0937bbf68413956dc6a339a9/results_id: 1 # 447.dealII,[.] contract<3> regressed by 200 from (for last_good == cb03dd22b36b7bd21a81137005ec42dab8355b62) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=thumb --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=thumb --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -Oz_mthumb artifacts/build-cb03dd22b36b7bd21a81137005ec42dab8355b62/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Results ID of last_good: apm_32/tcwg_bmk_llvm_apm/bisect-llvm-master-arm-spec2k6-Oz/4418 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Results ID of first_bad: apm_32/tcwg_bmk_llvm_apm/bisect-llvm-master-arm-spec2k6-Oz/4431 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Configuration details: Reproduce builds: <cut> mkdir investigate-binutils-590d3faada8a12bf0937bbf68413956dc6a339a9 cd investigate-binutils-590d3faada8a12bf0937bbf68413956dc6a339a9 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /binutils/ ./ ./bisect/baseline/ cd binutils # Reproduce first_bad build git checkout --detach 590d3faada8a12bf0937bbf68413956dc6a339a9 ../artifacts/test.sh # Reproduce last_good build git checkout --detach cb03dd22b36b7bd21a81137005ec42dab8355b62 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Full commit (up to 1000 lines): <cut> commit 590d3faada8a12bf0937bbf68413956dc6a339a9 Author: Tom de Vries <tdevries(a)suse.de> Date: Mon Aug 30 10:30:26 2021 +0200 [gdb/testsuite] Improve argument syntax of proc arange The current syntax of proc arange is: ... proc arange { arange_start arange_length {comment ""} {seg_sel ""} } { ... and a typical call looks like: ... arange $start $len ... This style is somewhat annoying because if you want to specify the last parameter, you need to give the default values of all the other optional ones before as well: ... arange $start $len "" $seg_sel ... Update the syntax to: ... proc arange { options arange_start arange_length } { parse_options { { comment "" } { seg_sel "" } } ... such that a typical call looks like: ... arange {} $start $len ... and a call using seg_sel looks like: ... arange { seg_sel $seg_sel } $start $len ... Also update proc aranges, which already has an options argument, to use the new proc parse_options. Tested on x86_64-linux. Co-Authored-By: Simon Marchi <simon.marchi(a)polymtl.ca> --- gdb/testsuite/gdb.dlang/watch-loc.exp | 2 +- gdb/testsuite/gdb.dwarf2/dw2-ranges-base.exp | 6 +- .../gdb.dwarf2/frame-inlined-in-outer-frame.exp | 2 +- .../template-specification-full-name.exp | 2 +- gdb/testsuite/gdb.testsuite/parse_options_args.exp | 59 ++++++++++++ gdb/testsuite/lib/dwarf.exp | 31 +++--- gdb/testsuite/lib/gdb.exp | 104 ++++++++++++++------- 7 files changed, 150 insertions(+), 56 deletions(-) diff --git a/gdb/testsuite/gdb.dlang/watch-loc.exp b/gdb/testsuite/gdb.dlang/watch-loc.exp index 6e8b26e3109..e13400ed479 100644 --- a/gdb/testsuite/gdb.dlang/watch-loc.exp +++ b/gdb/testsuite/gdb.dlang/watch-loc.exp @@ -68,7 +68,7 @@ Dwarf::assemble $asm_file { } aranges {} cu_start { - arange $dmain_start $dmain_length + arange {} $dmain_start $dmain_length } } diff --git a/gdb/testsuite/gdb.dwarf2/dw2-ranges-base.exp b/gdb/testsuite/gdb.dwarf2/dw2-ranges-base.exp index e65b4c8610a..d55b7fd150e 100644 --- a/gdb/testsuite/gdb.dwarf2/dw2-ranges-base.exp +++ b/gdb/testsuite/gdb.dwarf2/dw2-ranges-base.exp @@ -125,9 +125,9 @@ Dwarf::assemble $asm_file { } aranges {} cu_label { - arange [lindex $main_func 0] [lindex $main_func 1] - arange [lindex $frame2_func 0] [lindex $frame2_func 1] - arange [lindex $frame3_func 0] [lindex $frame3_func 1] + arange {} [lindex $main_func 0] [lindex $main_func 1] + arange {} [lindex $frame2_func 0] [lindex $frame2_func 1] + arange {} [lindex $frame3_func 0] [lindex $frame3_func 1] } } diff --git a/gdb/testsuite/gdb.dwarf2/frame-inlined-in-outer-frame.exp b/gdb/testsuite/gdb.dwarf2/frame-inlined-in-outer-frame.exp index ff12cd79f19..f95558dffef 100644 --- a/gdb/testsuite/gdb.dwarf2/frame-inlined-in-outer-frame.exp +++ b/gdb/testsuite/gdb.dwarf2/frame-inlined-in-outer-frame.exp @@ -95,7 +95,7 @@ Dwarf::assemble $dwarf_asm { } aranges {} cu_label { - arange __cu_low_pc __cu_high_pc + arange {} __cu_low_pc __cu_high_pc } } diff --git a/gdb/testsuite/gdb.dwarf2/template-specification-full-name.exp b/gdb/testsuite/gdb.dwarf2/template-specification-full-name.exp index 5c59777e1b6..6e736f2c8ef 100644 --- a/gdb/testsuite/gdb.dwarf2/template-specification-full-name.exp +++ b/gdb/testsuite/gdb.dwarf2/template-specification-full-name.exp @@ -69,7 +69,7 @@ Dwarf::assemble $asm_file { } aranges {} cu_start { - arange "$main_start" "$main_length" + arange {} "$main_start" "$main_length" } } diff --git a/gdb/testsuite/gdb.testsuite/parse_options_args.exp b/gdb/testsuite/gdb.testsuite/parse_options_args.exp new file mode 100644 index 00000000000..ce14fc3cd7c --- /dev/null +++ b/gdb/testsuite/gdb.testsuite/parse_options_args.exp @@ -0,0 +1,59 @@ +# Copyright 2021 Free Software Foundation, Inc. +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see <http://www.gnu.org/licenses/>. + +# Testsuite self-tests for parse_options and parse_args. + +with_test_prefix parse_options { + proc test1 { options a b } { + set v2 "defval2" + parse_options { + { opt1 defval1 } + { opt2 $v2 } + { opt3 } + { opt4 } + } + + gdb_assert { [string equal $a "vala"] } + gdb_assert { [string equal $b "valb"] } + gdb_assert { [string equal $opt1 "val1"] } + gdb_assert { [string equal $opt2 "defval2"] } + gdb_assert { $opt3 == 1 } + gdb_assert { $opt4 == 0 } + } + + set v1 "val1" + test1 { opt1 $v1 opt3 } "vala" "valb" +} + +with_test_prefix parse_args { + proc test2 { args } { + parse_args { + { opt1 defval1 } + { opt2 defval2 } + { opt3 } + { opt4 } + } + gdb_assert { [llength $args] == 2 } + lassign $args a b + gdb_assert { [string equal $a "vala"] } + gdb_assert { [string equal $b "valb"] } + gdb_assert { [string equal $opt1 "val1"] } + gdb_assert { [string equal $opt2 "defval2"] } + gdb_assert { $opt3 == 1 } + gdb_assert { $opt4 == 0 } + } + + set v1 "val1" + test2 -opt1 $v1 -opt3 "vala" "valb" +} diff --git a/gdb/testsuite/lib/dwarf.exp b/gdb/testsuite/lib/dwarf.exp index 120fa418201..7fb3561a443 100644 --- a/gdb/testsuite/lib/dwarf.exp +++ b/gdb/testsuite/lib/dwarf.exp @@ -2212,7 +2212,12 @@ namespace eval Dwarf { # Emit a DWARF .debug_aranges entry. - proc arange { arange_start arange_length {comment ""} {seg_sel ""} } { + proc arange { options arange_start arange_length } { + parse_options { + { comment "" } + { seg_sel "" } + } + if { $comment != "" } { # Wrap set comment " ($comment)" @@ -2270,22 +2275,14 @@ namespace eval Dwarf { variable _addr_size variable _seg_size - # Establish the defaults. - set is_64 0 - set cu_is_64 0 - set section_version 2 - set _seg_size 0 - # Handle options. - foreach { name value } $options { - switch -exact -- $name { - is_64 { set is_64 $value } - cu_is_64 { set cu_is_64 $value } - section_version {set section_version $value } - seg_size { set _seg_size $value } - default { error "unknown option $name" } - } + parse_options { + { is_64 0 } + { cu_is_64 0 } + { section_version 2 } + { seg_size 0 } } + set _seg_size $seg_size if { [is_64_target] } { set _addr_size 8 @@ -2354,9 +2351,9 @@ namespace eval Dwarf { # Terminator tuple. set comment "Terminator" if { $_seg_size == 0 } { - arange 0 0 $comment + arange {comment $comment} 0 0 } else { - arange 0 0 $comment 0 + arange {comment $comment seg_sel 0} 0 0 } # End label. diff --git a/gdb/testsuite/lib/gdb.exp b/gdb/testsuite/lib/gdb.exp index 093392709b4..3aea7baaab0 100644 --- a/gdb/testsuite/lib/gdb.exp +++ b/gdb/testsuite/lib/gdb.exp @@ -7293,8 +7293,8 @@ proc using_fission { } { return [regexp -- "-gsplit-dwarf" $debug_flags] } -# Search the caller's ARGS list and set variables according to the list of -# valid options described by ARGSET. +# Search LISTNAME in uplevel LEVEL caller and set variables according to the +# list of valid options with prefix PREFIX described by ARGSET. # # The first member of each one- or two-element list in ARGSET defines the # name of a variable that will be added to the caller's scope. @@ -7305,13 +7305,15 @@ proc using_fission { } { # # If two elements are given, the second element is the default value of # the variable. This is then overwritten if the option exists in ARGS. +# If EVAL, then subst is called on the value, which allows variables +# to be used. # # Any parse_args elements in (the caller's) ARGS will be removed, leaving # any optional components. - +# # Example: # proc myproc {foo args} { -# parse_args {{bar} {baz "abc"} {qux}} +# parse_list args 1 {{bar} {baz "abc"} {qux}} "-" false # # ... # } # myproc ABC -bar -baz DEF peanut butter @@ -7319,43 +7321,79 @@ proc using_fission { } { # foo (=ABC), bar (=1), baz (=DEF), and qux (=0) # args will be the list {peanut butter} -proc parse_args { argset } { - upvar args args +proc parse_list { level listname argset prefix eval } { + upvar $level $listname args foreach argument $argset { - if {[llength $argument] == 1} { - # No default specified, so we assume that we should set - # the value to 1 if the arg is present and 0 if it's not. - # It is assumed that no value is given with the argument. - set result [lsearch -exact $args "-$argument"] - if {$result != -1} then { - uplevel 1 [list set $argument 1] - set args [lreplace $args $result $result] - } else { - uplevel 1 [list set $argument 0] - } - } elseif {[llength $argument] == 2} { - # There are two items in the argument. The second is a - # default value to use if the item is not present. - # Otherwise, the variable is set to whatever is provided - # after the item in the args. - set arg [lindex $argument 0] - set result [lsearch -exact $args "-[lindex $arg 0]"] - if {$result != -1} then { - uplevel 1 [list set $arg [lindex $args [expr $result+1]]] - set args [lreplace $args $result [expr $result+1]] - } else { - uplevel 1 [list set $arg [lindex $argument 1]] - } - } else { - error "Badly formatted argument \"$argument\" in argument set" - } + if {[llength $argument] == 1} { + # Normalize argument, strip leading/trailing whitespace. + # Allows us to treat {foo} and { foo } the same. + set argument [string trim $argument] + + # No default specified, so we assume that we should set + # the value to 1 if the arg is present and 0 if it's not. + # It is assumed that no value is given with the argument. + set pattern "$prefix$argument" + set result [lsearch -exact $args $pattern] + + if {$result != -1} then { + set value 1 + set args [lreplace $args $result $result] + } else { + set value 0 + } + uplevel $level [list set $argument $value] + } elseif {[llength $argument] == 2} { + # There are two items in the argument. The second is a + # default value to use if the item is not present. + # Otherwise, the variable is set to whatever is provided + # after the item in the args. + set arg [lindex $argument 0] + set pattern "$prefix[lindex $arg 0]" + set result [lsearch -exact $args $pattern] + + if {$result != -1} then { + set value [lindex $args [expr $result+1]] + if { $eval } { + set value [uplevel [expr $level + 1] [list subst $value]] + } + set args [lreplace $args $result [expr $result+1]] + } else { + set value [lindex $argument 1] + if { $eval } { + set value [uplevel $level [list subst $value]] + } + } + uplevel $level [list set $arg $value] + } else { + error "Badly formatted argument \"$argument\" in argument set" + } } +} + +# Search the caller's args variable and set variables according to the list of +# valid options described by ARGSET. + +proc parse_args { argset } { + parse_list 2 args $argset "-" false # The remaining args should be checked to see that they match the # number of items expected to be passed into the procedure... } +# Process the caller's options variable and set variables according +# to the list of valid options described by OPTIONSET. + +proc parse_options { optionset } { + parse_list 2 options $optionset "" true + + # Require no remaining options. + upvar 1 options options + if { [llength $options] != 0 } { + error "Options left unparsed: $options" + } +} + # Capture the output of COMMAND in a string ignoring PREFIX (a regexp); # return that string. </cut>

3 years, 11 months

1
0
0 0

[ACTIVITY] week ending 29 Aug 2021

by Richard Henderson

(PSA: On holiday through 11 September.) [ UM-2 ] * Some patch review * Revise riscv tcg_constant cleanup * Cleanup tcg/optimize.c * Optimize repeat sign-extensions. r~

3 years, 11 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_gcc_bootstrap/master-arm-bootstrap_O3 - Build # 2 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *gcc* in CI configuration tcwg_gcc_bootstrap/master-arm-bootstrap_O3. So far, this commit has regressed CI configurations: - tcwg_gcc_bootstrap/master-arm-bootstrap_O3 Culprit: <cut> commit cad36f38576a6a781e3c62ab061c68f5b8dab13a Author: Roger Sayle <roger(a)nextmovesoftware.com> Date: Tue Aug 31 11:45:07 2021 +0100 Preserve SUBREG_PROMOTED_VAR_P on (extend:HI (subreg/s:QI (reg:SI))). SUBREG_PROMOTED_VAR_P is a mechanism for tracking that a partial subreg is correctly zero-extended or sign-extended in the parent register. For example, the RTL (subreg/s/v:QI (reg/v:SI 23 [ x ]) 0) indicates that the byte x is zero extended in reg:SI 23, which is useful for optimization. An example is that zero extending the above QImode value to HImode can simply use a wider subreg, i.e. (subreg:HI (reg/v:SI 23 [ x ]) 0). This patch addresses the oversight/missed optimization opportunity that the new HImode subreg above should retain its SUBREG_PROMOTED_VAR_P annotation as its value is guaranteed to be correctly extended in the SImode parent. The code below to preserve SUBREG_PROMOTED_VAR_P is already present in the middle-end (e.g. simplify-rtx.c:7232-7242) but missing from one or two (precisely three) places that (accidentally) strip it. Whilst there I also added another optimization. If we need to extend the above QImode value beyond the SImode register holding it, say to DImode, we can eliminate the SUBREG and simply extend from the SImode register to DImode. 2021-08-31 Roger Sayle <roger(a)nextmovesoftware.com> gcc/ChangeLog * expr.c (convert_modes): Preserve SUBREG_PROMOTED_VAR_P when creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P subreg. * simplify-rtx.c (simplify_unary_operation_1) [SIGN_EXTEND]: Likewise, preserve SUBREG_PROMOTED_VAR_P when creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P subreg. Generate SIGN_EXTEND of the SUBREG_REG when a subreg would be paradoxical. [ZERO_EXTEND]: Likewise, preserve SUBREG_PROMOTED_VAR_P when creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P subreg. Generate ZERO_EXTEND of the SUBREG_REG when a subreg would be paradoxical. </cut> Results regressed to (for first_bad == cad36f38576a6a781e3c62ab061c68f5b8dab13a) # reset_artifacts: -10 # true: 0 # build_abe binutils: 1 # First few build errors in logs: from (for last_good == 0960d937d9bee3c831d0b64a9c828c263a58ff89) # reset_artifacts: -10 # true: 0 # build_abe binutils: 1 # build_abe bootstrap_O3: 2 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_O3… Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_O3… Build top page/logs: https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_O3… Configuration details: Reproduce builds: <cut> mkdir investigate-gcc-cad36f38576a6a781e3c62ab061c68f5b8dab13a cd investigate-gcc-cad36f38576a6a781e3c62ab061c68f5b8dab13a git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_O3… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_O3… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_O3… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_gnu-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /gcc/ ./ ./bisect/baseline/ cd gcc # Reproduce first_bad build git checkout --detach cad36f38576a6a781e3c62ab061c68f5b8dab13a ../artifacts/test.sh # Reproduce last_good build git checkout --detach 0960d937d9bee3c831d0b64a9c828c263a58ff89 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_O3… Build log: https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_O3… Full commit (up to 1000 lines): <cut> commit cad36f38576a6a781e3c62ab061c68f5b8dab13a Author: Roger Sayle <roger(a)nextmovesoftware.com> Date: Tue Aug 31 11:45:07 2021 +0100 Preserve SUBREG_PROMOTED_VAR_P on (extend:HI (subreg/s:QI (reg:SI))). SUBREG_PROMOTED_VAR_P is a mechanism for tracking that a partial subreg is correctly zero-extended or sign-extended in the parent register. For example, the RTL (subreg/s/v:QI (reg/v:SI 23 [ x ]) 0) indicates that the byte x is zero extended in reg:SI 23, which is useful for optimization. An example is that zero extending the above QImode value to HImode can simply use a wider subreg, i.e. (subreg:HI (reg/v:SI 23 [ x ]) 0). This patch addresses the oversight/missed optimization opportunity that the new HImode subreg above should retain its SUBREG_PROMOTED_VAR_P annotation as its value is guaranteed to be correctly extended in the SImode parent. The code below to preserve SUBREG_PROMOTED_VAR_P is already present in the middle-end (e.g. simplify-rtx.c:7232-7242) but missing from one or two (precisely three) places that (accidentally) strip it. Whilst there I also added another optimization. If we need to extend the above QImode value beyond the SImode register holding it, say to DImode, we can eliminate the SUBREG and simply extend from the SImode register to DImode. 2021-08-31 Roger Sayle <roger(a)nextmovesoftware.com> gcc/ChangeLog * expr.c (convert_modes): Preserve SUBREG_PROMOTED_VAR_P when creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P subreg. * simplify-rtx.c (simplify_unary_operation_1) [SIGN_EXTEND]: Likewise, preserve SUBREG_PROMOTED_VAR_P when creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P subreg. Generate SIGN_EXTEND of the SUBREG_REG when a subreg would be paradoxical. [ZERO_EXTEND]: Likewise, preserve SUBREG_PROMOTED_VAR_P when creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P subreg. Generate ZERO_EXTEND of the SUBREG_REG when a subreg would be paradoxical. --- gcc/expr.c | 19 ++++++++++++++++++- gcc/simplify-rtx.c | 52 ++++++++++++++++++++++++++++++++++++++++++---------- 2 files changed, 60 insertions(+), 11 deletions(-) diff --git a/gcc/expr.c b/gcc/expr.c index 096c0315ecc..5dd98a9bccc 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -688,7 +688,24 @@ convert_modes (machine_mode mode, machine_mode oldmode, rtx x, int unsignedp) && (GET_MODE_PRECISION (subreg_promoted_mode (x)) >= GET_MODE_PRECISION (int_mode)) && SUBREG_CHECK_PROMOTED_SIGN (x, unsignedp)) - x = gen_lowpart (int_mode, SUBREG_REG (x)); + { + scalar_int_mode int_orig_mode; + machine_mode orig_mode = GET_MODE (x); + x = gen_lowpart (int_mode, SUBREG_REG (x)); + + /* Preserve SUBREG_PROMOTED_VAR_P if the new mode is wider than + the original mode, but narrower than the inner mode. */ + if (GET_CODE (x) == SUBREG + && GET_MODE_PRECISION (subreg_promoted_mode (x)) + > GET_MODE_PRECISION (int_mode) + && is_a <scalar_int_mode> (orig_mode, &int_orig_mode) + && GET_MODE_PRECISION (int_mode) + > GET_MODE_PRECISION (int_orig_mode)) + { + SUBREG_PROMOTED_VAR_P (x) = 1; + SUBREG_PROMOTED_SET (x, unsignedp); + } + } if (GET_MODE (x) != VOIDmode) oldmode = GET_MODE (x); diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c index e431e0c19d7..ebad5cb5a79 100644 --- a/gcc/simplify-rtx.c +++ b/gcc/simplify-rtx.c @@ -1512,12 +1512,28 @@ simplify_context::simplify_unary_operation_1 (rtx_code code, machine_mode mode, target mode is the same as the variable's promotion. */ if (GET_CODE (op) == SUBREG && SUBREG_PROMOTED_VAR_P (op) - && SUBREG_PROMOTED_SIGNED_P (op) - && !paradoxical_subreg_p (mode, GET_MODE (SUBREG_REG (op)))) + && SUBREG_PROMOTED_SIGNED_P (op)) { - temp = rtl_hooks.gen_lowpart_no_emit (mode, SUBREG_REG (op)); - if (temp) - return temp; + rtx subreg = SUBREG_REG (op); + machine_mode subreg_mode = GET_MODE (subreg); + if (!paradoxical_subreg_p (mode, subreg_mode)) + { + temp = rtl_hooks.gen_lowpart_no_emit (mode, subreg); + if (temp) + { + /* Preserve SUBREG_PROMOTED_VAR_P. */ + if (partial_subreg_p (temp)) + { + SUBREG_PROMOTED_VAR_P (temp) = 1; + SUBREG_PROMOTED_SET (temp, 1); + } + return temp; + } + } + else + /* Sign-extending a sign-extended subreg. */ + return simplify_gen_unary (SIGN_EXTEND, mode, + subreg, subreg_mode); } /* (sign_extend:M (sign_extend:N <X>)) is (sign_extend:M <X>). @@ -1631,12 +1647,28 @@ simplify_context::simplify_unary_operation_1 (rtx_code code, machine_mode mode, target mode is the same as the variable's promotion. */ if (GET_CODE (op) == SUBREG && SUBREG_PROMOTED_VAR_P (op) - && SUBREG_PROMOTED_UNSIGNED_P (op) - && !paradoxical_subreg_p (mode, GET_MODE (SUBREG_REG (op)))) + && SUBREG_PROMOTED_UNSIGNED_P (op)) { - temp = rtl_hooks.gen_lowpart_no_emit (mode, SUBREG_REG (op)); - if (temp) - return temp; + rtx subreg = SUBREG_REG (op); + machine_mode subreg_mode = GET_MODE (subreg); + if (!paradoxical_subreg_p (mode, subreg_mode)) + { + temp = rtl_hooks.gen_lowpart_no_emit (mode, subreg); + if (temp) + { + /* Preserve SUBREG_PROMOTED_VAR_P. */ + if (partial_subreg_p (temp)) + { + SUBREG_PROMOTED_VAR_P (temp) = 1; + SUBREG_PROMOTED_SET (temp, 0); + } + return temp; + } + } + else + /* Zero-extending a zero-extended subreg. */ + return simplify_gen_unary (ZERO_EXTEND, mode, + subreg, subreg_mode); } /* Extending a widening multiplication should be canonicalized to </cut>

3 years, 11 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tk1/llvm-master-arm-spec2k6-O2 - Build # 11 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_tk1/llvm-master-arm-spec2k6-O2. So far, this commit has regressed CI configurations: - tcwg_bmk_llvm_tk1/llvm-master-arm-spec2k6-O2 Culprit: <cut> commit 92c1fd19abb15bc68b1127a26137a69e033cdb39 Author: Stanislav Mekhanoshin <Stanislav.Mekhanoshin(a)amd.com> Date: Thu Aug 19 11:42:09 2021 -0700 Allow rematerialization of virtual reg uses Currently isReallyTriviallyReMaterializableGeneric() implementation prevents rematerialization on any virtual register use on the grounds that is not a trivial rematerialization and that we do not want to extend liveranges. It appears that LRE logic does not attempt to extend a liverange of a source register for rematerialization so that is not an issue. That is checked in the LiveRangeEdit::allUsesAvailableAt(). The only non-trivial aspect of it is accounting for tied-defs which normally represent a read-modify-write operation and not rematerializable. The test for a tied-def situation already exists in the /CodeGen/AMDGPU/remat-vop.mir, test_no_remat_v_cvt_f32_i32_sdwa_dst_unused_preserve. The change has affected ARM/Thumb, Mips, RISCV, and x86. For the targets where I more or less understand the asm it seems to reduce spilling (as expected) or be neutral. However, it needs a review by all targets' specialists. Differential Revision: https://reviews.llvm.org/D106408 </cut> Results regressed to (for first_bad == 92c1fd19abb15bc68b1127a26137a69e033cdb39) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O2_marm artifacts/build-92c1fd19abb15bc68b1127a26137a69e033cdb39/results_id: 1 # 456.hmmer,hmmer_base.default regressed by 103 from (for last_good == 1d02a8bcd393ea9c50f0212797059888efc78002) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O2_marm artifacts/build-1d02a8bcd393ea9c50f0212797059888efc78002/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-… Results ID of last_good: tk1_32/tcwg_bmk_llvm_tk1/bisect-llvm-master-arm-spec2k6-O2/4381 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-… Results ID of first_bad: tk1_32/tcwg_bmk_llvm_tk1/bisect-llvm-master-arm-spec2k6-O2/4378 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-… Configuration details: Reproduce builds: <cut> mkdir investigate-llvm-92c1fd19abb15bc68b1127a26137a69e033cdb39 cd investigate-llvm-92c1fd19abb15bc68b1127a26137a69e033cdb39 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach 92c1fd19abb15bc68b1127a26137a69e033cdb39 ../artifacts/test.sh # Reproduce last_good build git checkout --detach 1d02a8bcd393ea9c50f0212797059888efc78002 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-… Full commit (up to 1000 lines): <cut> commit 92c1fd19abb15bc68b1127a26137a69e033cdb39 Author: Stanislav Mekhanoshin <Stanislav.Mekhanoshin(a)amd.com> Date: Thu Aug 19 11:42:09 2021 -0700 Allow rematerialization of virtual reg uses Currently isReallyTriviallyReMaterializableGeneric() implementation prevents rematerialization on any virtual register use on the grounds that is not a trivial rematerialization and that we do not want to extend liveranges. It appears that LRE logic does not attempt to extend a liverange of a source register for rematerialization so that is not an issue. That is checked in the LiveRangeEdit::allUsesAvailableAt(). The only non-trivial aspect of it is accounting for tied-defs which normally represent a read-modify-write operation and not rematerializable. The test for a tied-def situation already exists in the /CodeGen/AMDGPU/remat-vop.mir, test_no_remat_v_cvt_f32_i32_sdwa_dst_unused_preserve. The change has affected ARM/Thumb, Mips, RISCV, and x86. For the targets where I more or less understand the asm it seems to reduce spilling (as expected) or be neutral. However, it needs a review by all targets' specialists. Differential Revision: https://reviews.llvm.org/D106408 --- llvm/include/llvm/CodeGen/TargetInstrInfo.h | 12 +- llvm/lib/CodeGen/TargetInstrInfo.cpp | 9 +- llvm/test/CodeGen/AMDGPU/remat-sop.mir | 60 + llvm/test/CodeGen/ARM/arm-shrink-wrapping-linux.ll | 28 +- llvm/test/CodeGen/ARM/funnel-shift-rot.ll | 32 +- llvm/test/CodeGen/ARM/funnel-shift.ll | 30 +- .../test/CodeGen/ARM/illegal-bitfield-loadstore.ll | 30 +- llvm/test/CodeGen/ARM/neon-copy.ll | 10 +- llvm/test/CodeGen/Mips/llvm-ir/ashr.ll | 227 +- llvm/test/CodeGen/Mips/llvm-ir/lshr.ll | 206 +- llvm/test/CodeGen/Mips/llvm-ir/shl.ll | 95 +- llvm/test/CodeGen/Mips/llvm-ir/sub.ll | 31 +- llvm/test/CodeGen/Mips/tls.ll | 4 +- llvm/test/CodeGen/RISCV/atomic-rmw.ll | 120 +- llvm/test/CodeGen/RISCV/atomic-signext.ll | 24 +- llvm/test/CodeGen/RISCV/bswap-ctlz-cttz-ctpop.ll | 96 +- llvm/test/CodeGen/RISCV/rv32i-rv64i-half.ll | 12 +- llvm/test/CodeGen/RISCV/rv32zbb-zbp.ll | 526 +-- llvm/test/CodeGen/RISCV/rv32zbb.ll | 94 +- llvm/test/CodeGen/RISCV/rv32zbp.ll | 282 +- llvm/test/CodeGen/RISCV/rv32zbt.ll | 348 +- .../CodeGen/RISCV/rvv/fixed-vectors-bitreverse.ll | 324 +- llvm/test/CodeGen/RISCV/rvv/fixed-vectors-bswap.ll | 146 +- llvm/test/CodeGen/RISCV/rvv/fixed-vectors-ctlz.ll | 3540 ++++++++++---------- llvm/test/CodeGen/RISCV/rvv/fixed-vectors-cttz.ll | 720 ++-- llvm/test/CodeGen/RISCV/srem-vector-lkk.ll | 208 +- llvm/test/CodeGen/RISCV/urem-vector-lkk.ll | 190 +- llvm/test/CodeGen/Thumb/dyn-stackalloc.ll | 7 +- .../tail-pred-disabled-in-loloops.ll | 14 +- .../LowOverheadLoops/varying-outer-2d-reduction.ll | 64 +- .../CodeGen/Thumb2/LowOverheadLoops/while-loops.ll | 67 +- llvm/test/CodeGen/Thumb2/ldr-str-imm12.ll | 30 +- llvm/test/CodeGen/Thumb2/mve-float16regloops.ll | 82 +- llvm/test/CodeGen/Thumb2/mve-float32regloops.ll | 98 +- llvm/test/CodeGen/Thumb2/mve-postinc-dct.ll | 529 ++- llvm/test/CodeGen/X86/addcarry.ll | 20 +- llvm/test/CodeGen/X86/callbr-asm-blockplacement.ll | 12 +- llvm/test/CodeGen/X86/dag-update-nodetomatch.ll | 17 +- llvm/test/CodeGen/X86/inalloca-invoke.ll | 2 +- llvm/test/CodeGen/X86/licm-regpressure.ll | 28 +- llvm/test/CodeGen/X86/ragreedy-hoist-spill.ll | 40 +- llvm/test/CodeGen/X86/sdiv_fix.ll | 5 +- 42 files changed, 4217 insertions(+), 4202 deletions(-) diff --git a/llvm/include/llvm/CodeGen/TargetInstrInfo.h b/llvm/include/llvm/CodeGen/TargetInstrInfo.h index 2f853a2c6f9f..1c05afba730d 100644 --- a/llvm/include/llvm/CodeGen/TargetInstrInfo.h +++ b/llvm/include/llvm/CodeGen/TargetInstrInfo.h @@ -117,10 +117,11 @@ public: const MachineFunction &MF) const; /// Return true if the instruction is trivially rematerializable, meaning it - /// has no side effects and requires no operands that aren't always available. - /// This means the only allowed uses are constants and unallocatable physical - /// registers so that the instructions result is independent of the place - /// in the function. + /// has no side effects. Uses of constants and unallocatable physical + /// registers are always trivial to rematerialize so that the instructions + /// result is independent of the place in the function. Uses of virtual + /// registers are allowed but it is caller's responsility to ensure these + /// operands are valid at the point the instruction is beeing moved. bool isTriviallyReMaterializable(const MachineInstr &MI, AAResults *AA = nullptr) const { return MI.getOpcode() == TargetOpcode::IMPLICIT_DEF || @@ -140,8 +141,7 @@ protected: /// set, this hook lets the target specify whether the instruction is actually /// trivially rematerializable, taking into consideration its operands. This /// predicate must return false if the instruction has any side effects other - /// than producing a value, or if it requres any address registers that are - /// not always available. + /// than producing a value. /// Requirements must be check as stated in isTriviallyReMaterializable() . virtual bool isReallyTriviallyReMaterializable(const MachineInstr &MI, AAResults *AA) const { diff --git a/llvm/lib/CodeGen/TargetInstrInfo.cpp b/llvm/lib/CodeGen/TargetInstrInfo.cpp index 1eab8e7443a7..fe7d60e0b7e2 100644 --- a/llvm/lib/CodeGen/TargetInstrInfo.cpp +++ b/llvm/lib/CodeGen/TargetInstrInfo.cpp @@ -921,7 +921,8 @@ bool TargetInstrInfo::isReallyTriviallyReMaterializableGeneric( const MachineRegisterInfo &MRI = MF.getRegInfo(); // Remat clients assume operand 0 is the defined register. - if (!MI.getNumOperands() || !MI.getOperand(0).isReg()) + if (!MI.getNumOperands() || !MI.getOperand(0).isReg() || + MI.getOperand(0).isTied()) return false; Register DefReg = MI.getOperand(0).getReg(); @@ -983,12 +984,6 @@ bool TargetInstrInfo::isReallyTriviallyReMaterializableGeneric( // same virtual register, though. if (MO.isDef() && Reg != DefReg) return false; - - // Don't allow any virtual-register uses. Rematting an instruction with - // virtual register uses would length the live ranges of the uses, which - // is not necessarily a good idea, certainly not "trivial". - if (MO.isUse()) - return false; } // Everything checked out. diff --git a/llvm/test/CodeGen/AMDGPU/remat-sop.mir b/llvm/test/CodeGen/AMDGPU/remat-sop.mir index ed799bfca028..c9915aaabfde 100644 --- a/llvm/test/CodeGen/AMDGPU/remat-sop.mir +++ b/llvm/test/CodeGen/AMDGPU/remat-sop.mir @@ -51,6 +51,66 @@ body: | S_NOP 0, implicit %2 S_ENDPGM 0 ... +# The liverange of %0 covers a point of rematerialization, source value is +# availabe. +--- +name: test_remat_s_mov_b32_vreg_src_long_lr +tracksRegLiveness: true +machineFunctionInfo: + stackPtrOffsetReg: $sgpr32 +body: | + bb.0: + ; GCN-LABEL: name: test_remat_s_mov_b32_vreg_src_long_lr + ; GCN: renamable $sgpr0 = IMPLICIT_DEF + ; GCN: renamable $sgpr1 = S_MOV_B32 renamable $sgpr0 + ; GCN: S_NOP 0, implicit killed renamable $sgpr1 + ; GCN: renamable $sgpr1 = S_MOV_B32 renamable $sgpr0 + ; GCN: S_NOP 0, implicit killed renamable $sgpr1 + ; GCN: renamable $sgpr1 = S_MOV_B32 renamable $sgpr0 + ; GCN: S_NOP 0, implicit killed renamable $sgpr1 + ; GCN: S_NOP 0, implicit killed renamable $sgpr0 + ; GCN: S_ENDPGM 0 + %0:sreg_32 = IMPLICIT_DEF + %1:sreg_32 = S_MOV_B32 %0:sreg_32 + %2:sreg_32 = S_MOV_B32 %0:sreg_32 + %3:sreg_32 = S_MOV_B32 %0:sreg_32 + S_NOP 0, implicit %1 + S_NOP 0, implicit %2 + S_NOP 0, implicit %3 + S_NOP 0, implicit %0 + S_ENDPGM 0 +... +# The liverange of %0 does not cover a point of rematerialization, source value is +# unavailabe and we do not want to artificially extend the liverange. +--- +name: test_no_remat_s_mov_b32_vreg_src_short_lr +tracksRegLiveness: true +machineFunctionInfo: + stackPtrOffsetReg: $sgpr32 +body: | + bb.0: + ; GCN-LABEL: name: test_no_remat_s_mov_b32_vreg_src_short_lr + ; GCN: renamable $sgpr0 = IMPLICIT_DEF + ; GCN: renamable $sgpr1 = S_MOV_B32 renamable $sgpr0 + ; GCN: SI_SPILL_S32_SAVE killed renamable $sgpr1, %stack.1, implicit $exec, implicit $sgpr32 :: (store (s32) into %stack.1, addrspace 5) + ; GCN: renamable $sgpr1 = S_MOV_B32 renamable $sgpr0 + ; GCN: SI_SPILL_S32_SAVE killed renamable $sgpr1, %stack.0, implicit $exec, implicit $sgpr32 :: (store (s32) into %stack.0, addrspace 5) + ; GCN: renamable $sgpr0 = S_MOV_B32 killed renamable $sgpr0 + ; GCN: renamable $sgpr1 = SI_SPILL_S32_RESTORE %stack.1, implicit $exec, implicit $sgpr32 :: (load (s32) from %stack.1, addrspace 5) + ; GCN: S_NOP 0, implicit killed renamable $sgpr1 + ; GCN: renamable $sgpr1 = SI_SPILL_S32_RESTORE %stack.0, implicit $exec, implicit $sgpr32 :: (load (s32) from %stack.0, addrspace 5) + ; GCN: S_NOP 0, implicit killed renamable $sgpr1 + ; GCN: S_NOP 0, implicit killed renamable $sgpr0 + ; GCN: S_ENDPGM 0 + %0:sreg_32 = IMPLICIT_DEF + %1:sreg_32 = S_MOV_B32 %0:sreg_32 + %2:sreg_32 = S_MOV_B32 %0:sreg_32 + %3:sreg_32 = S_MOV_B32 %0:sreg_32 + S_NOP 0, implicit %1 + S_NOP 0, implicit %2 + S_NOP 0, implicit %3 + S_ENDPGM 0 +... --- name: test_remat_s_mov_b64 tracksRegLiveness: true diff --git a/llvm/test/CodeGen/ARM/arm-shrink-wrapping-linux.ll b/llvm/test/CodeGen/ARM/arm-shrink-wrapping-linux.ll index a4243276c70a..175a2069a441 100644 --- a/llvm/test/CodeGen/ARM/arm-shrink-wrapping-linux.ll +++ b/llvm/test/CodeGen/ARM/arm-shrink-wrapping-linux.ll @@ -29,20 +29,20 @@ define fastcc i8* @wrongUseOfPostDominate(i8* readonly %s, i32 %off, i8* readnon ; ENABLE-NEXT: pophs {r11, pc} ; ENABLE-NEXT: .LBB0_3: @ %while.body.preheader ; ENABLE-NEXT: movw r12, :lower16:skip -; ENABLE-NEXT: sub r1, r1, #1 +; ENABLE-NEXT: sub r3, r1, #1 ; ENABLE-NEXT: movt r12, :upper16:skip ; ENABLE-NEXT: .LBB0_4: @ %while.body ; ENABLE-NEXT: @ =>This Inner Loop Header: Depth=1 -; ENABLE-NEXT: ldrb r3, [r0] -; ENABLE-NEXT: ldrb r3, [r12, r3] -; ENABLE-NEXT: add r0, r0, r3 -; ENABLE-NEXT: sub r3, r1, #1 -; ENABLE-NEXT: cmp r3, r1 +; ENABLE-NEXT: ldrb r1, [r0] +; ENABLE-NEXT: ldrb r1, [r12, r1] +; ENABLE-NEXT: add r0, r0, r1 +; ENABLE-NEXT: sub r1, r3, #1 +; ENABLE-NEXT: cmp r1, r3 ; ENABLE-NEXT: bhs .LBB0_6 ; ENABLE-NEXT: @ %bb.5: @ %while.body ; ENABLE-NEXT: @ in Loop: Header=BB0_4 Depth=1 ; ENABLE-NEXT: cmp r0, r2 -; ENABLE-NEXT: mov r1, r3 +; ENABLE-NEXT: mov r3, r1 ; ENABLE-NEXT: blo .LBB0_4 ; ENABLE-NEXT: .LBB0_6: @ %if.end29 ; ENABLE-NEXT: pop {r11, pc} @@ -119,20 +119,20 @@ define fastcc i8* @wrongUseOfPostDominate(i8* readonly %s, i32 %off, i8* readnon ; DISABLE-NEXT: pophs {r11, pc} ; DISABLE-NEXT: .LBB0_3: @ %while.body.preheader ; DISABLE-NEXT: movw r12, :lower16:skip -; DISABLE-NEXT: sub r1, r1, #1 +; DISABLE-NEXT: sub r3, r1, #1 ; DISABLE-NEXT: movt r12, :upper16:skip ; DISABLE-NEXT: .LBB0_4: @ %while.body ; DISABLE-NEXT: @ =>This Inner Loop Header: Depth=1 -; DISABLE-NEXT: ldrb r3, [r0] -; DISABLE-NEXT: ldrb r3, [r12, r3] -; DISABLE-NEXT: add r0, r0, r3 -; DISABLE-NEXT: sub r3, r1, #1 -; DISABLE-NEXT: cmp r3, r1 +; DISABLE-NEXT: ldrb r1, [r0] +; DISABLE-NEXT: ldrb r1, [r12, r1] +; DISABLE-NEXT: add r0, r0, r1 +; DISABLE-NEXT: sub r1, r3, #1 +; DISABLE-NEXT: cmp r1, r3 ; DISABLE-NEXT: bhs .LBB0_6 ; DISABLE-NEXT: @ %bb.5: @ %while.body ; DISABLE-NEXT: @ in Loop: Header=BB0_4 Depth=1 ; DISABLE-NEXT: cmp r0, r2 -; DISABLE-NEXT: mov r1, r3 +; DISABLE-NEXT: mov r3, r1 ; DISABLE-NEXT: blo .LBB0_4 ; DISABLE-NEXT: .LBB0_6: @ %if.end29 ; DISABLE-NEXT: pop {r11, pc} diff --git a/llvm/test/CodeGen/ARM/funnel-shift-rot.ll b/llvm/test/CodeGen/ARM/funnel-shift-rot.ll index 55157875d355..ea15fcc5c824 100644 --- a/llvm/test/CodeGen/ARM/funnel-shift-rot.ll +++ b/llvm/test/CodeGen/ARM/funnel-shift-rot.ll @@ -73,13 +73,13 @@ define i64 @rotl_i64(i64 %x, i64 %z) { ; SCALAR-NEXT: push {r4, r5, r11, lr} ; SCALAR-NEXT: rsb r3, r2, #0 ; SCALAR-NEXT: and r4, r2, #63 -; SCALAR-NEXT: and lr, r3, #63 -; SCALAR-NEXT: rsb r3, lr, #32 +; SCALAR-NEXT: and r12, r3, #63 +; SCALAR-NEXT: rsb r3, r12, #32 ; SCALAR-NEXT: lsl r2, r0, r4 -; SCALAR-NEXT: lsr r12, r0, lr -; SCALAR-NEXT: orr r3, r12, r1, lsl r3 -; SCALAR-NEXT: subs r12, lr, #32 -; SCALAR-NEXT: lsrpl r3, r1, r12 +; SCALAR-NEXT: lsr lr, r0, r12 +; SCALAR-NEXT: orr r3, lr, r1, lsl r3 +; SCALAR-NEXT: subs lr, r12, #32 +; SCALAR-NEXT: lsrpl r3, r1, lr ; SCALAR-NEXT: subs r5, r4, #32 ; SCALAR-NEXT: movwpl r2, #0 ; SCALAR-NEXT: cmp r5, #0 @@ -88,8 +88,8 @@ define i64 @rotl_i64(i64 %x, i64 %z) { ; SCALAR-NEXT: lsr r3, r0, r3 ; SCALAR-NEXT: orr r3, r3, r1, lsl r4 ; SCALAR-NEXT: lslpl r3, r0, r5 -; SCALAR-NEXT: lsr r0, r1, lr -; SCALAR-NEXT: cmp r12, #0 +; SCALAR-NEXT: lsr r0, r1, r12 +; SCALAR-NEXT: cmp lr, #0 ; SCALAR-NEXT: movwpl r0, #0 ; SCALAR-NEXT: orr r1, r3, r0 ; SCALAR-NEXT: mov r0, r2 @@ -245,15 +245,15 @@ define i64 @rotr_i64(i64 %x, i64 %z) { ; CHECK: @ %bb.0: ; CHECK-NEXT: .save {r4, r5, r11, lr} ; CHECK-NEXT: push {r4, r5, r11, lr} -; CHECK-NEXT: and lr, r2, #63 +; CHECK-NEXT: and r12, r2, #63 ; CHECK-NEXT: rsb r2, r2, #0 -; CHECK-NEXT: rsb r3, lr, #32 +; CHECK-NEXT: rsb r3, r12, #32 ; CHECK-NEXT: and r4, r2, #63 -; CHECK-NEXT: lsr r12, r0, lr -; CHECK-NEXT: orr r3, r12, r1, lsl r3 -; CHECK-NEXT: subs r12, lr, #32 +; CHECK-NEXT: lsr lr, r0, r12 +; CHECK-NEXT: orr r3, lr, r1, lsl r3 +; CHECK-NEXT: subs lr, r12, #32 ; CHECK-NEXT: lsl r2, r0, r4 -; CHECK-NEXT: lsrpl r3, r1, r12 +; CHECK-NEXT: lsrpl r3, r1, lr ; CHECK-NEXT: subs r5, r4, #32 ; CHECK-NEXT: movwpl r2, #0 ; CHECK-NEXT: cmp r5, #0 @@ -262,8 +262,8 @@ define i64 @rotr_i64(i64 %x, i64 %z) { ; CHECK-NEXT: lsr r3, r0, r3 ; CHECK-NEXT: orr r3, r3, r1, lsl r4 ; CHECK-NEXT: lslpl r3, r0, r5 -; CHECK-NEXT: lsr r0, r1, lr -; CHECK-NEXT: cmp r12, #0 +; CHECK-NEXT: lsr r0, r1, r12 +; CHECK-NEXT: cmp lr, #0 ; CHECK-NEXT: movwpl r0, #0 ; CHECK-NEXT: orr r1, r0, r3 ; CHECK-NEXT: mov r0, r2 diff --git a/llvm/test/CodeGen/ARM/funnel-shift.ll b/llvm/test/CodeGen/ARM/funnel-shift.ll index 54c93b493c98..6372f9be2ca3 100644 --- a/llvm/test/CodeGen/ARM/funnel-shift.ll +++ b/llvm/test/CodeGen/ARM/funnel-shift.ll @@ -224,31 +224,31 @@ define i37 @fshr_i37(i37 %x, i37 %y, i37 %z) { ; CHECK-NEXT: mov r3, #0 ; CHECK-NEXT: bl __aeabi_uldivmod ; CHECK-NEXT: add r0, r2, #27 -; CHECK-NEXT: lsl r6, r6, #27 -; CHECK-NEXT: and r1, r0, #63 ; CHECK-NEXT: lsl r2, r7, #27 +; CHECK-NEXT: and r12, r0, #63 +; CHECK-NEXT: lsl r6, r6, #27 ; CHECK-NEXT: orr r7, r6, r7, lsr #5 +; CHECK-NEXT: rsb r3, r12, #32 +; CHECK-NEXT: lsr r2, r2, r12 ; CHECK-NEXT: mov r6, #63 -; CHECK-NEXT: rsb r3, r1, #32 -; CHECK-NEXT: lsr r2, r2, r1 -; CHECK-NEXT: subs r12, r1, #32 -; CHECK-NEXT: bic r6, r6, r0 ; CHECK-NEXT: orr r2, r2, r7, lsl r3 +; CHECK-NEXT: subs r3, r12, #32 +; CHECK-NEXT: bic r6, r6, r0 ; CHECK-NEXT: lsl r5, r9, #1 -; CHECK-NEXT: lsrpl r2, r7, r12 +; CHECK-NEXT: lsrpl r2, r7, r3 +; CHECK-NEXT: subs r1, r6, #32 ; CHECK-NEXT: lsl r0, r5, r6 -; CHECK-NEXT: subs r4, r6, #32 -; CHECK-NEXT: lsl r3, r8, #1 +; CHECK-NEXT: lsl r4, r8, #1 ; CHECK-NEXT: movwpl r0, #0 -; CHECK-NEXT: orr r3, r3, r9, lsr #31 +; CHECK-NEXT: orr r4, r4, r9, lsr #31 ; CHECK-NEXT: orr r0, r0, r2 ; CHECK-NEXT: rsb r2, r6, #32 -; CHECK-NEXT: cmp r4, #0 -; CHECK-NEXT: lsr r1, r7, r1 +; CHECK-NEXT: cmp r1, #0 ; CHECK-NEXT: lsr r2, r5, r2 -; CHECK-NEXT: orr r2, r2, r3, lsl r6 -; CHECK-NEXT: lslpl r2, r5, r4 -; CHECK-NEXT: cmp r12, #0 +; CHECK-NEXT: orr r2, r2, r4, lsl r6 +; CHECK-NEXT: lslpl r2, r5, r1 +; CHECK-NEXT: lsr r1, r7, r12 +; CHECK-NEXT: cmp r3, #0 ; CHECK-NEXT: movwpl r1, #0 ; CHECK-NEXT: orr r1, r2, r1 ; CHECK-NEXT: pop {r4, r5, r6, r7, r8, r9, r11, pc} diff --git a/llvm/test/CodeGen/ARM/illegal-bitfield-loadstore.ll b/llvm/test/CodeGen/ARM/illegal-bitfield-loadstore.ll index 2922e0ed5423..0a0bb62b0a09 100644 --- a/llvm/test/CodeGen/ARM/illegal-bitfield-loadstore.ll +++ b/llvm/test/CodeGen/ARM/illegal-bitfield-loadstore.ll @@ -91,17 +91,17 @@ define void @i56_or(i56* %a) { ; BE-LABEL: i56_or: ; BE: @ %bb.0: ; BE-NEXT: mov r1, r0 -; BE-NEXT: ldr r12, [r0] ; BE-NEXT: ldrh r2, [r1, #4]! ; BE-NEXT: ldrb r3, [r1, #2] ; BE-NEXT: orr r2, r3, r2, lsl #8 -; BE-NEXT: orr r2, r2, r12, lsl #24 -; BE-NEXT: orr r2, r2, #384 -; BE-NEXT: strb r2, [r1, #2] -; BE-NEXT: lsr r3, r2, #8 -; BE-NEXT: strh r3, [r1] -; BE-NEXT: bic r1, r12, #255 -; BE-NEXT: orr r1, r1, r2, lsr #24 +; BE-NEXT: ldr r3, [r0] +; BE-NEXT: orr r2, r2, r3, lsl #24 +; BE-NEXT: orr r12, r2, #384 +; BE-NEXT: strb r12, [r1, #2] +; BE-NEXT: lsr r2, r12, #8 +; BE-NEXT: strh r2, [r1] +; BE-NEXT: bic r1, r3, #255 +; BE-NEXT: orr r1, r1, r12, lsr #24 ; BE-NEXT: str r1, [r0] ; BE-NEXT: mov pc, lr %aa = load i56, i56* %a @@ -127,13 +127,13 @@ define void @i56_and_or(i56* %a) { ; BE-NEXT: ldrb r3, [r1, #2] ; BE-NEXT: strb r2, [r1, #2] ; BE-NEXT: orr r2, r3, r12, lsl #8 -; BE-NEXT: ldr r12, [r0] -; BE-NEXT: orr r2, r2, r12, lsl #24 -; BE-NEXT: orr r2, r2, #384 -; BE-NEXT: lsr r3, r2, #8 -; BE-NEXT: strh r3, [r1] -; BE-NEXT: bic r1, r12, #255 -; BE-NEXT: orr r1, r1, r2, lsr #24 +; BE-NEXT: ldr r3, [r0] +; BE-NEXT: orr r2, r2, r3, lsl #24 +; BE-NEXT: orr r12, r2, #384 +; BE-NEXT: lsr r2, r12, #8 +; BE-NEXT: strh r2, [r1] +; BE-NEXT: bic r1, r3, #255 +; BE-NEXT: orr r1, r1, r12, lsr #24 ; BE-NEXT: str r1, [r0] ; BE-NEXT: mov pc, lr diff --git a/llvm/test/CodeGen/ARM/neon-copy.ll b/llvm/test/CodeGen/ARM/neon-copy.ll index 09a991da2e59..46490efb6631 100644 --- a/llvm/test/CodeGen/ARM/neon-copy.ll +++ b/llvm/test/CodeGen/ARM/neon-copy.ll @@ -1340,16 +1340,16 @@ define <4 x i16> @test_extracts_inserts_varidx_insert(<8 x i16> %x, i32 %idx) { ; CHECK-NEXT: .pad #8 ; CHECK-NEXT: sub sp, sp, #8 ; CHECK-NEXT: vmov.u16 r1, d0[1] -; CHECK-NEXT: and r0, r0, #3 +; CHECK-NEXT: and r12, r0, #3 ; CHECK-NEXT: vmov.u16 r2, d0[2] -; CHECK-NEXT: mov r3, sp -; CHECK-NEXT: vmov.u16 r12, d0[3] -; CHECK-NEXT: orr r0, r3, r0, lsl #1 +; CHECK-NEXT: mov r0, sp +; CHECK-NEXT: vmov.u16 r3, d0[3] +; CHECK-NEXT: orr r0, r0, r12, lsl #1 ; CHECK-NEXT: vst1.16 {d0[0]}, [r0:16] ; CHECK-NEXT: vldr d0, [sp] ; CHECK-NEXT: vmov.16 d0[1], r1 ; CHECK-NEXT: vmov.16 d0[2], r2 -; CHECK-NEXT: vmov.16 d0[3], r12 +; CHECK-NEXT: vmov.16 d0[3], r3 ; CHECK-NEXT: add sp, sp, #8 ; CHECK-NEXT: bx lr %tmp = extractelement <8 x i16> %x, i32 0 diff --git a/llvm/test/CodeGen/Mips/llvm-ir/ashr.ll b/llvm/test/CodeGen/Mips/llvm-ir/ashr.ll index 8be7100d368b..a125446b27c3 100644 --- a/llvm/test/CodeGen/Mips/llvm-ir/ashr.ll +++ b/llvm/test/CodeGen/Mips/llvm-ir/ashr.ll @@ -766,79 +766,85 @@ define signext i128 @ashr_i128(i128 signext %a, i128 signext %b) { ; MMR3-NEXT: .cfi_offset 17, -4 ; MMR3-NEXT: .cfi_offset 16, -8 ; MMR3-NEXT: move $8, $7 -; MMR3-NEXT: sw $6, 32($sp) # 4-byte Folded Spill -; MMR3-NEXT: sw $5, 36($sp) # 4-byte Folded Spill -; MMR3-NEXT: sw $4, 8($sp) # 4-byte Folded Spill +; MMR3-NEXT: move $2, $6 +; MMR3-NEXT: sw $5, 0($sp) # 4-byte Folded Spill +; MMR3-NEXT: sw $4, 12($sp) # 4-byte Folded Spill ; MMR3-NEXT: lw $16, 76($sp) -; MMR3-NEXT: srlv $4, $7, $16 -; MMR3-NEXT: not16 $3, $16 -; MMR3-NEXT: sw $3, 24($sp) # 4-byte Folded Spill -; MMR3-NEXT: sll16 $2, $6, 1 -; MMR3-NEXT: sllv $3, $2, $3 -; MMR3-NEXT: li16 $2, 64 -; MMR3-NEXT: or16 $3, $4 -; MMR3-NEXT: srlv $6, $6, $16 -; MMR3-NEXT: sw $6, 12($sp) # 4-byte Folded Spill -; MMR3-NEXT: subu16 $7, $2, $16 +; MMR3-NEXT: srlv $3, $7, $16 +; MMR3-NEXT: not16 $6, $16 +; MMR3-NEXT: sw $6, 24($sp) # 4-byte Folded Spill +; MMR3-NEXT: move $4, $2 +; MMR3-NEXT: sw $2, 32($sp) # 4-byte Folded Spill +; MMR3-NEXT: sll16 $2, $2, 1 +; MMR3-NEXT: sllv $2, $2, $6 +; MMR3-NEXT: li16 $6, 64 +; MMR3-NEXT: or16 $2, $3 +; MMR3-NEXT: srlv $4, $4, $16 +; MMR3-NEXT: sw $4, 16($sp) # 4-byte Folded Spill +; MMR3-NEXT: subu16 $7, $6, $16 ; MMR3-NEXT: sllv $9, $5, $7 -; MMR3-NEXT: andi16 $2, $7, 32 -; MMR3-NEXT: sw $2, 28($sp) # 4-byte Folded Spill -; MMR3-NEXT: andi16 $5, $16, 32 -; MMR3-NEXT: sw $5, 16($sp) # 4-byte Folded Spill -; MMR3-NEXT: move $4, $9 +; MMR3-NEXT: andi16 $5, $7, 32 +; MMR3-NEXT: sw $5, 28($sp) # 4-byte Folded Spill +; MMR3-NEXT: andi16 $6, $16, 32 +; MMR3-NEXT: sw $6, 36($sp) # 4-byte Folded Spill +; MMR3-NEXT: move $3, $9 ; MMR3-NEXT: li16 $17, 0 -; MMR3-NEXT: movn $4, $17, $2 -; MMR3-NEXT: movn $3, $6, $5 -; MMR3-NEXT: addiu $2, $16, -64 -; MMR3-NEXT: lw $5, 36($sp) # 4-byte Folded Reload -; MMR3-NEXT: srlv $5, $5, $2 -; MMR3-NEXT: sw $5, 20($sp) # 4-byte Folded Spill -; MMR3-NEXT: lw $17, 8($sp) # 4-byte Folded Reload -; MMR3-NEXT: sll16 $6, $17, 1 -; MMR3-NEXT: sw $6, 4($sp) # 4-byte Folded Spill -; MMR3-NEXT: not16 $5, $2 -; MMR3-NEXT: sllv $5, $6, $5 -; MMR3-NEXT: or16 $3, $4 -; MMR3-NEXT: lw $4, 20($sp) # 4-byte Folded Reload -; MMR3-NEXT: or16 $5, $4 -; MMR3-NEXT: srav $1, $17, $2 -; MMR3-NEXT: andi16 $2, $2, 32 -; MMR3-NEXT: sw $2, 20($sp) # 4-byte Folded Spill -; MMR3-NEXT: movn $5, $1, $2 -; MMR3-NEXT: sllv $2, $17, $7 -; MMR3-NEXT: not16 $4, $7 -; MMR3-NEXT: lw $7, 36($sp) # 4-byte Folded Reload -; MMR3-NEXT: srl16 $6, $7, 1 -; MMR3-NEXT: srlv $6, $6, $4 +; MMR3-NEXT: movn $3, $17, $5 +; MMR3-NEXT: movn $2, $4, $6 +; MMR3-NEXT: addiu $4, $16, -64 +; MMR3-NEXT: lw $17, 0($sp) # 4-byte Folded Reload +; MMR3-NEXT: srlv $4, $17, $4 +; MMR3-NEXT: sw $4, 20($sp) # 4-byte Folded Spill +; MMR3-NEXT: lw $6, 12($sp) # 4-byte Folded Reload +; MMR3-NEXT: sll16 $4, $6, 1 +; MMR3-NEXT: sw $4, 8($sp) # 4-byte Folded Spill +; MMR3-NEXT: addiu $5, $16, -64 +; MMR3-NEXT: not16 $5, $5 +; MMR3-NEXT: sllv $5, $4, $5 +; MMR3-NEXT: or16 $2, $3 +; MMR3-NEXT: lw $3, 20($sp) # 4-byte Folded Reload +; MMR3-NEXT: or16 $5, $3 +; MMR3-NEXT: addiu $3, $16, -64 +; MMR3-NEXT: srav $1, $6, $3 +; MMR3-NEXT: andi16 $3, $3, 32 +; MMR3-NEXT: sw $3, 20($sp) # 4-byte Folded Spill +; MMR3-NEXT: movn $5, $1, $3 +; MMR3-NEXT: sllv $3, $6, $7 +; MMR3-NEXT: sw $3, 4($sp) # 4-byte Folded Spill +; MMR3-NEXT: not16 $3, $7 +; MMR3-NEXT: srl16 $4, $17, 1 +; MMR3-NEXT: srlv $3, $4, $3 ; MMR3-NEXT: sltiu $10, $16, 64 -; MMR3-NEXT: movn $5, $3, $10 -; MMR3-NEXT: or16 $6, $2 -; MMR3-NEXT: srlv $2, $7, $16 -; MMR3-NEXT: lw $3, 24($sp) # 4-byte Folded Reload -; MMR3-NEXT: lw $4, 4($sp) # 4-byte Folded Reload -; MMR3-NEXT: sllv $3, $4, $3 +; MMR3-NEXT: movn $5, $2, $10 +; MMR3-NEXT: lw $2, 4($sp) # 4-byte Folded Reload ; MMR3-NEXT: or16 $3, $2 -; MMR3-NEXT: srav $11, $17, $16 -; MMR3-NEXT: lw $4, 16($sp) # 4-byte Folded Reload -; MMR3-NEXT: movn $3, $11, $4 -; MMR3-NEXT: sra $2, $17, 31 +; MMR3-NEXT: srlv $2, $17, $16 +; MMR3-NEXT: lw $4, 24($sp) # 4-byte Folded Reload +; MMR3-NEXT: lw $7, 8($sp) # 4-byte Folded Reload +; MMR3-NEXT: sllv $17, $7, $4 +; MMR3-NEXT: or16 $17, $2 +; MMR3-NEXT: srav $11, $6, $16 +; MMR3-NEXT: lw $2, 36($sp) # 4-byte Folded Reload +; MMR3-NEXT: movn $17, $11, $2 +; MMR3-NEXT: sra $2, $6, 31 ; MMR3-NEXT: movz $5, $8, $16 -; MMR3-NEXT: move $8, $2 -; MMR3-NEXT: movn $8, $3, $10 -; MMR3-NEXT: lw $3, 28($sp) # 4-byte Folded Reload -; MMR3-NEXT: movn $6, $9, $3 -; MMR3-NEXT: li16 $3, 0 -; MMR3-NEXT: lw $7, 12($sp) # 4-byte Folded Reload -; MMR3-NEXT: movn $7, $3, $4 -; MMR3-NEXT: or16 $7, $6 +; MMR3-NEXT: move $4, $2 +; MMR3-NEXT: movn $4, $17, $10 +; MMR3-NEXT: lw $6, 28($sp) # 4-byte Folded Reload +; MMR3-NEXT: movn $3, $9, $6 +; MMR3-NEXT: lw $6, 36($sp) # 4-byte Folded Reload +; MMR3-NEXT: li16 $17, 0 +; MMR3-NEXT: lw $7, 16($sp) # 4-byte Folded Reload +; MMR3-NEXT: movn $7, $17, $6 +; MMR3-NEXT: or16 $7, $3 ; MMR3-NEXT: lw $3, 20($sp) # 4-byte Folded Reload ; MMR3-NEXT: movn $1, $2, $3 ; MMR3-NEXT: movn $1, $7, $10 ; MMR3-NEXT: lw $3, 32($sp) # 4-byte Folded Reload ; MMR3-NEXT: movz $1, $3, $16 -; MMR3-NEXT: movn $11, $2, $4 +; MMR3-NEXT: movn $11, $2, $6 ; MMR3-NEXT: movn $2, $11, $10 -; MMR3-NEXT: move $3, $8 +; MMR3-NEXT: move $3, $4 ; MMR3-NEXT: move $4, $1 ; MMR3-NEXT: lwp $16, 40($sp) ; MMR3-NEXT: addiusp 48 @@ -852,79 +858,80 @@ define signext i128 @ashr_i128(i128 signext %a, i128 signext %b) { ; MMR6-NEXT: sw $16, 8($sp) # 4-byte Folded Spill ; MMR6-NEXT: .cfi_offset 17, -4 ; MMR6-NEXT: .cfi_offset 16, -8 -; MMR6-NEXT: move $1, $7 +; MMR6-NEXT: move $12, $7 ; MMR6-NEXT: lw $3, 44($sp) ; MMR6-NEXT: li16 $2, 64 -; MMR6-NEXT: subu16 $7, $2, $3 -; MMR6-NEXT: sllv $8, $5, $7 -; MMR6-NEXT: andi16 $2, $7, 32 -; MMR6-NEXT: selnez $9, $8, $2 -; MMR6-NEXT: sllv $10, $4, $7 -; MMR6-NEXT: not16 $7, $7 -; MMR6-NEXT: srl16 $16, $5, 1 -; MMR6-NEXT: srlv $7, $16, $7 -; MMR6-NEXT: or $7, $10, $7 -; MMR6-NEXT: seleqz $7, $7, $2 -; MMR6-NEXT: or $7, $9, $7 -; MMR6-NEXT: srlv $9, $1, $3 -; MMR6-NEXT: not16 $16, $3 -; MMR6-NEXT: sw $16, 4($sp) # 4-byte Folded Spill +; MMR6-NEXT: subu16 $16, $2, $3 +; MMR6-NEXT: sllv $1, $5, $16 +; MMR6-NEXT: andi16 $2, $16, 32 +; MMR6-NEXT: selnez $8, $1, $2 +; MMR6-NEXT: sllv $9, $4, $16 +; MMR6-NEXT: not16 $16, $16 +; MMR6-NEXT: srl16 $17, $5, 1 +; MMR6-NEXT: srlv $10, $17, $16 +; MMR6-NEXT: or $9, $9, $10 +; MMR6-NEXT: seleqz $9, $9, $2 +; MMR6-NEXT: or $8, $8, $9 +; MMR6-NEXT: srlv $9, $7, $3 +; MMR6-NEXT: not16 $7, $3 +; MMR6-NEXT: sw $7, 4($sp) # 4-byte Folded Spill ; MMR6-NEXT: sll16 $17, $6, 1 -; MMR6-NEXT: sllv $10, $17, $16 +; MMR6-NEXT: sllv $10, $17, $7 ; MMR6-NEXT: or $9, $10, $9 ; MMR6-NEXT: andi16 $17, $3, 32 ; MMR6-NEXT: seleqz $9, $9, $17 ; MMR6-NEXT: srlv $10, $6, $3 ; MMR6-NEXT: selnez $11, $10, $17 ; MMR6-NEXT: seleqz $10, $10, $17 -; MMR6-NEXT: or $10, $10, $7 -; MMR6-NEXT: seleqz $12, $8, $2 -; MMR6-NEXT: or $8, $11, $9 +; MMR6-NEXT: or $8, $10, $8 +; MMR6-NEXT: seleqz $1, $1, $2 +; MMR6-NEXT: or $9, $11, $9 ; MMR6-NEXT: addiu $2, $3, -64 -; MMR6-NEXT: srlv $9, $5, $2 +; MMR6-NEXT: srlv $10, $5, $2 ; MMR6-NEXT: sll16 $7, $4, 1 ; MMR6-NEXT: not16 $16, $2 ; MMR6-NEXT: sllv $11, $7, $16 ; MMR6-NEXT: sltiu $13, $3, 64 -; MMR6-NEXT: or $8, $8, $12 -; MMR6-NEXT: selnez $10, $10, $13 -; MMR6-NEXT: or $9, $11, $9 -; MMR6-NEXT: srav $11, $4, $2 +; MMR6-NEXT: or $1, $9, $1 +; MMR6-NEXT: selnez $8, $8, $13 +; MMR6-NEXT: or $9, $11, $10 +; MMR6-NEXT: srav $10, $4, $2 ; MMR6-NEXT: andi16 $2, $2, 32 -; MMR6-NEXT: seleqz $12, $11, $2 +; MMR6-NEXT: seleqz $11, $10, $2 ; MMR6-NEXT: sra $14, $4, 31 ; MMR6-NEXT: selnez $15, $14, $2 ; MMR6-NEXT: seleqz $9, $9, $2 -; MMR6-NEXT: or $12, $15, $12 -; MMR6-NEXT: seleqz $12, $12, $13 -; MMR6-NEXT: selnez $2, $11, $2 -; MMR6-NEXT: seleqz $11, $14, $13 -; MMR6-NEXT: or $10, $10, $12 -; MMR6-NEXT: selnez $10, $10, $3 -; MMR6-NEXT: selnez $8, $8, $13 +; MMR6-NEXT: or $11, $15, $11 +; MMR6-NEXT: seleqz $11, $11, $13 +; MMR6-NEXT: selnez $2, $10, $2 +; MMR6-NEXT: seleqz $10, $14, $13 +; MMR6-NEXT: or $8, $8, $11 +; MMR6-NEXT: selnez $8, $8, $3 +; MMR6-NEXT: selnez $1, $1, $13 ; MMR6-NEXT: or $2, $2, $9 ; MMR6-NEXT: srav $9, $4, $3 ; MMR6-NEXT: seleqz $4, $9, $17 -; MMR6-NEXT: selnez $12, $14, $17 -; MMR6-NEXT: or $4, $12, $4 -; MMR6-NEXT: selnez $12, $4, $13 +; MMR6-NEXT: selnez $11, $14, $17 +; MMR6-NEXT: or $4, $11, $4 +; MMR6-NEXT: selnez $11, $4, $13 ; MMR6-NEXT: seleqz $2, $2, $13 ; MMR6-NEXT: seleqz $4, $6, $3 -; MMR6-NEXT: seleqz $1, $1, $3 -; MMR6-NEXT: or $2, $8, $2 -; MMR6-NEXT: selnez $2, $2, $3 +; MMR6-NEXT: seleqz $6, $12, $3 ; MMR6-NEXT: or $1, $1, $2 -; MMR6-NEXT: or $4, $4, $10 -; MMR6-NEXT: or $2, $12, $11 -; MMR6-NEXT: srlv $3, $5, $3 -; MMR6-NEXT: lw $5, 4($sp) # 4-byte Folded Reload -; MMR6-NEXT: sllv $5, $7, $5 -; MMR6-NEXT: or $3, $5, $3 -; MMR6-NEXT: seleqz $3, $3, $17 -; MMR6-NEXT: selnez $5, $9, $17 -; MMR6-NEXT: or $3, $5, $3 -; MMR6-NEXT: selnez $3, $3, $13 -; MMR6-NEXT: or $3, $3, $11 +; MMR6-NEXT: selnez $1, $1, $3 +; MMR6-NEXT: or $1, $6, $1 +; MMR6-NEXT: or $4, $4, $8 +; MMR6-NEXT: or $6, $11, $10 +; MMR6-NEXT: srlv $2, $5, $3 +; MMR6-NEXT: lw $3, 4($sp) # 4-byte Folded Reload +; MMR6-NEXT: sllv $3, $7, $3 +; MMR6-NEXT: or $2, $3, $2 +; MMR6-NEXT: seleqz $2, $2, $17 +; MMR6-NEXT: selnez $3, $9, $17 +; MMR6-NEXT: or $2, $3, $2 +; MMR6-NEXT: selnez $2, $2, $13 +; MMR6-NEXT: or $3, $2, $10 +; MMR6-NEXT: move $2, $6 ; MMR6-NEXT: move $5, $1 ; MMR6-NEXT: lw $16, 8($sp) # 4-byte Folded Reload ; MMR6-NEXT: lw $17, 12($sp) # 4-byte Folded Reload diff --git a/llvm/test/CodeGen/Mips/llvm-ir/lshr.ll b/llvm/test/CodeGen/Mips/llvm-ir/lshr.ll index ed2bfc9fcf60..e4b4b3ae1d0f 100644 --- a/llvm/test/CodeGen/Mips/llvm-ir/lshr.ll +++ b/llvm/test/CodeGen/Mips/llvm-ir/lshr.ll @@ -776,76 +776,77 @@ define signext i128 @lshr_i128(i128 signext %a, i128 signext %b) { ; MMR3-NEXT: .cfi_offset 17, -4 ; MMR3-NEXT: .cfi_offset 16, -8 ; MMR3-NEXT: move $8, $7 -; MMR3-NEXT: sw $6, 24($sp) # 4-byte Folded Spill +; MMR3-NEXT: sw $5, 4($sp) # 4-byte Folded Spill ; MMR3-NEXT: sw $4, 28($sp) # 4-byte Folded Spill ; MMR3-NEXT: lw $16, 68($sp) ; MMR3-NEXT: li16 $2, 64 -; MMR3-NEXT: subu16 $7, $2, $16 -; MMR3-NEXT: sllv $9, $5, $7 -; MMR3-NEXT: move $17, $5 -; MMR3-NEXT: sw $5, 0($sp) # 4-byte Folded Spill -; MMR3-NEXT: andi16 $3, $7, 32 +; MMR3-NEXT: subu16 $17, $2, $16 +; MMR3-NEXT: sllv $9, $5, $17 +; MMR3-NEXT: andi16 $3, $17, 32 ; MMR3-NEXT: sw $3, 20($sp) # 4-byte Folded Spill ; MMR3-NEXT: li16 $2, 0 ; MMR3-NEXT: move $4, $9 ; MMR3-NEXT: movn $4, $2, $3 -; MMR3-NEXT: srlv $5, $8, $16 +; MMR3-NEXT: srlv $5, $7, $16 ; MMR3-NEXT: not16 $3, $16 ; MMR3-NEXT: sw $3, 16($sp) # 4-byte Folded Spill ; MMR3-NEXT: sll16 $2, $6, 1 +; MMR3-NEXT: sw $6, 24($sp) # 4-byte Folded Spill ; MMR3-NEXT: sllv $2, $2, $3 ; MMR3-NEXT: or16 $2, $5 -; MMR3-NEXT: srlv $5, $6, $16 -; MMR3-NEXT: sw $5, 4($sp) # 4-byte Folded Spill +; MMR3-NEXT: srlv $7, $6, $16 ; MMR3-NEXT: andi16 $3, $16, 32 ; MMR3-NEXT: sw $3, 12($sp) # 4-byte Folded Spill -; MMR3-NEXT: movn $2, $5, $3 +; MMR3-NEXT: movn $2, $7, $3 ; MMR3-NEXT: addiu $3, $16, -64 ; MMR3-NEXT: or16 $2, $4 -; MMR3-NEXT: srlv $4, $17, $3 -; MMR3-NEXT: sw $4, 8($sp) # 4-byte Folded Spill -; MMR3-NEXT: lw $4, 28($sp) # 4-byte Folded Reload -; MMR3-NEXT: sll16 $6, $4, 1 -; MMR3-NEXT: not16 $5, $3 -; MMR3-NEXT: sllv $5, $6, $5 -; MMR3-NEXT: lw $17, 8($sp) # 4-byte Folded Reload -; MMR3-NEXT: or16 $5, $17 -; MMR3-NEXT: srlv $1, $4, $3 -; MMR3-NEXT: andi16 $3, $3, 32 +; MMR3-NEXT: lw $6, 4($sp) # 4-byte Folded Reload +; MMR3-NEXT: srlv $3, $6, $3 ; MMR3-NEXT: sw $3, 8($sp) # 4-byte Folded Spill -; MMR3-NEXT: movn $5, $1, $3 +; MMR3-NEXT: lw $3, 28($sp) # 4-byte Folded Reload +; MMR3-NEXT: sll16 $4, $3, 1 +; MMR3-NEXT: sw $4, 0($sp) # 4-byte Folded Spill +; MMR3-NEXT: addiu $5, $16, -64 +; MMR3-NEXT: not16 $5, $5 +; MMR3-NEXT: sllv $5, $4, $5 +; MMR3-NEXT: lw $4, 8($sp) # 4-byte Folded Reload +; MMR3-NEXT: or16 $5, $4 +; MMR3-NEXT: addiu $4, $16, -64 +; MMR3-NEXT: srlv $1, $3, $4 +; MMR3-NEXT: andi16 $4, $4, 32 +; MMR3-NEXT: sw $4, 8($sp) # 4-byte Folded Spill +; MMR3-NEXT: movn $5, $1, $4 ; MMR3-NEXT: sltiu $10, $16, 64 ; MMR3-NEXT: movn $5, $2, $10 -; MMR3-NEXT: sllv $2, $4, $7 -; MMR3-NEXT: not16 $3, $7 -; MMR3-NEXT: lw $7, 0($sp) # 4-byte Folded Reload -; MMR3-NEXT: srl16 $4, $7, 1 +; MMR3-NEXT: sllv $2, $3, $17 +; MMR3-NEXT: not16 $3, $17 +; MMR3-NEXT: srl16 $4, $6, 1 ; MMR3-NEXT: srlv $4, $4, $3 ; MMR3-NEXT: or16 $4, $2 -; MMR3-NEXT: srlv $2, $7, $16 +; MMR3-NEXT: srlv $2, $6, $16 ; MMR3-NEXT: lw $3, 16($sp) # 4-byte Folded Reload +; MMR3-NEXT: lw $6, 0($sp) # 4-byte Folded Reload ; MMR3-NEXT: sllv $3, $6, $3 ; MMR3-NEXT: or16 $3, $2 ; MMR3-NEXT: lw $2, 28($sp) # 4-byte Folded Reload ; MMR3-NEXT: srlv $2, $2, $16 -; MMR3-NEXT: lw $17, 12($sp) # 4-byte Folded Reload -; MMR3-NEXT: movn $3, $2, $17 +; MMR3-NEXT: lw $6, 12($sp) # 4-byte Folded Reload +; MMR3-NEXT: movn $3, $2, $6 ; MMR3-NEXT: movz $5, $8, $16 -; MMR3-NEXT: li16 $6, 0 -; MMR3-NEXT: movz $3, $6, $10 -; MMR3-NEXT: lw $7, 20($sp) # 4-byte Folded Reload -; MMR3-NEXT: movn $4, $9, $7 -; MMR3-NEXT: lw $6, 4($sp) # 4-byte Folded Reload -; MMR3-NEXT: li16 $7, 0 -; MMR3-NEXT: movn $6, $7, $17 -; MMR3-NEXT: or16 $6, $4 +; MMR3-NEXT: li16 $17, 0 +; MMR3-NEXT: movz $3, $17, $10 +; MMR3-NEXT: lw $17, 20($sp) # 4-byte Folded Reload +; MMR3-NEXT: movn $4, $9, $17 +; MMR3-NEXT: li16 $17, 0 +; MMR3-NEXT: movn $7, $17, $6 +; MMR3-NEXT: or16 $7, $4 ; MMR3-NEXT: lw $4, 8($sp) # 4-byte Folded Reload -; MMR3-NEXT: movn $1, $7, $4 -; MMR3-NEXT: li16 $7, 0 -; MMR3-NEXT: movn $1, $6, $10 +; MMR3-NEXT: movn $1, $17, $4 +; MMR3-NEXT: li16 $17, 0 +; MMR3-NEXT: movn $1, $7, $10 ; MMR3-NEXT: lw $4, 24($sp) # 4-byte Folded Reload ; MMR3-NEXT: movz $1, $4, $16 -; MMR3-NEXT: movn $2, $7, $17 +; MMR3-NEXT: movn $2, $17, $6 ; MMR3-NEXT: li16 $4, 0 ; MMR3-NEXT: movz $2, $4, $10 ; MMR3-NEXT: move $4, $1 @@ -855,98 +856,91 @@ define signext i128 @lshr_i128(i128 signext %a, i128 signext %b) { ; ; MMR6-LABEL: lshr_i128: ; MMR6: # %bb.0: # %entry -; MMR6-NEXT: addiu $sp, $sp, -32 -; MMR6-NEXT: .cfi_def_cfa_offset 32 -; MMR6-NEXT: sw $17, 28($sp) # 4-byte Folded Spill -; MMR6-NEXT: sw $16, 24($sp) # 4-byte Folded Spill +; MMR6-NEXT: addiu $sp, $sp, -24 +; MMR6-NEXT: .cfi_def_cfa_offset 24 +; MMR6-NEXT: sw $17, 20($sp) # 4-byte Folded Spill +; MMR6-NEXT: sw $16, 16($sp) # 4-byte Folded Spill ; MMR6-NEXT: .cfi_offset 17, -4 ; MMR6-NEXT: .cfi_offset 16, -8 ; MMR6-NEXT: move $1, $7 -; MMR6-NEXT: move $7, $5 -; MMR6-NEXT: lw $3, 60($sp) +; MMR6-NEXT: move $7, $4 +; MMR6-NEXT: lw $3, 52($sp) ; MMR6-NEXT: srlv $2, $1, $3 -; MMR6-NEXT: not16 $5, $3 -; MMR6-NEXT: sw $5, 12($sp) # 4-byte Folded Spill -; MMR6-NEXT: move $17, $6 -; MMR6-NEXT: sw $6, 16($sp) # 4-byte Folded Spill +; MMR6-NEXT: not16 $16, $3 +; MMR6-NEXT: sw $16, 8($sp) # 4-byte Folded Spill +; MMR6-NEXT: move $4, $6 +; MMR6-NEXT: sw $6, 12($sp) # 4-byte Folded Spill ; MMR6-NEXT: sll16 $6, $6, 1 -; MMR6-NEXT: sllv $6, $6, $5 +; MMR6-NEXT: sllv $6, $6, $16 ; MMR6-NEXT: or $8, $6, $2 -; MMR6-NEXT: addiu $5, $3, -64 -; MMR6-NEXT: srlv $9, $7, $5 -; MMR6-NEXT: move $6, $4 -; MMR6-NEXT: sll16 $2, $4, 1 -; MMR6-NEXT: sw $2, 8($sp) # 4-byte Folded Spill -; MMR6-NEXT: not16 $16, $5 +; MMR6-NEXT: addiu $6, $3, -64 +; MMR6-NEXT: srlv $9, $5, $6 +; MMR6-NEXT: sll16 $2, $7, 1 +; MMR6-NEXT: sw $2, 4($sp) # 4-byte Folded Spill +; MMR6-NEXT: not16 $16, $6 ; MMR6-NEXT: sllv $10, $2, $16 ; MMR6-NEXT: andi16 $16, $3, 32 ; MMR6-NEXT: seleqz $8, $8, $16 ; MMR6-NEXT: or $9, $10, $9 -; MMR6-NEXT: srlv $10, $17, $3 +; MMR6-NEXT: srlv $10, $4, $3 ; MMR6-NEXT: selnez $11, $10, $16 ; MMR6-NEXT: li16 $17, 64 ; MMR6-NEXT: subu16 $2, $17, $3 -; MMR6-NEXT: sllv $12, $7, $2 -; MMR6-NEXT: move $17, $7 +; MMR6-NEXT: sllv $12, $5, $2 ; MMR6-NEXT: andi16 $4, $2, 32 -; MMR6-NEXT: andi16 $7, $5, 32 -; MMR6-NEXT: sw $7, 20($sp) # 4-byte Folded Spill -; MMR6-NEXT: seleqz $9, $9, $7 +; MMR6-NEXT: andi16 $17, $6, 32 +; MMR6-NEXT: seleqz $9, $9, $17 ; MMR6-NEXT: seleqz $13, $12, $4 ; MMR6-NEXT: or $8, $11, $8 ; MMR6-NEXT: selnez $11, $12, $4 -; MMR6-NEXT: sllv $12, $6, $2 -; MMR6-NEXT: move $7, $6 -; MMR6-NEXT: sw $6, 4($sp) # 4-byte Folded Spill +; MMR6-NEXT: sllv $12, $7, $2 ; MMR6-NEXT: not16 $2, $2 -; MMR6-NEXT: srl16 $6, $17, 1 +; MMR6-NEXT: srl16 $6, $5, 1 ; MMR6-NEXT: srlv $2, $6, $2 ; MMR6-NEXT: or $2, $12, $2 ; MMR6-NEXT: seleqz $2, $2, $4 -; MMR6-NEXT: srlv $4, $7, $5 -; MMR6-NEXT: or $11, $11, $2 -; MMR6-NEXT: or $5, $8, $13 -; MMR6-NEXT: srlv $6, $17, $3 -; MMR6-NEXT: lw $2, 20($sp) # 4-byte Folded Reload -; MMR6-NEXT: selnez $7, $4, $2 -; MMR6-NEXT: sltiu $8, $3, 64 -; MMR6-NEXT: selnez $12, $5, $8 -; MMR6-NEXT: or $7, $7, $9 -; MMR6-NEXT: lw $5, 12($sp) # 4-byte Folded Reload +; MMR6-NEXT: addiu $4, $3, -64 +; MMR6-NEXT: srlv $4, $7, $4 +; MMR6-NEXT: or $12, $11, $2 +; MMR6-NEXT: or $6, $8, $13 +; MMR6-NEXT: srlv $5, $5, $3 +; MMR6-NEXT: selnez $8, $4, $17 +; MMR6-NEXT: sltiu $11, $3, 64 +; MMR6-NEXT: selnez $13, $6, $11 +; MMR6-NEXT: or $8, $8, $9 ; MMR6-NEXT: lw $2, 8($sp) # 4-byte Folded Reload -; MMR6-NEXT: sllv $9, $2, $5 +; MMR6-NEXT: lw $6, 4($sp) # 4-byte Folded Reload +; MMR6-NEXT: sllv $9, $6, $2 ; MMR6-NEXT: seleqz $10, $10, $16 -; MMR6-NEXT: li16 $5, 0 -; MMR6-NEXT: or $10, $10, $11 -; MMR6-NEXT: or $6, $9, $6 -; MMR6-NEXT: seleqz $2, $7, $8 -; MMR6-NEXT: seleqz $7, $5, $8 -; MMR6-NEXT: lw $5, 4($sp) # 4-byte Folded Reload -; MMR6-NEXT: srlv $9, $5, $3 -; MMR6-NEXT: seleqz $11, $9, $16 -; MMR6-NEXT: selnez $11, $11, $8 +; MMR6-NEXT: li16 $2, 0 +; MMR6-NEXT: or $10, $10, $12 +; MMR6-NEXT: or $9, $9, $5 +; MMR6-NEXT: seleqz $5, $8, $11 +; MMR6-NEXT: seleqz $8, $2, $11 +; MMR6-NEXT: srlv $7, $7, $3 +; MMR6-NEXT: seleqz $2, $7, $16 +; MMR6-NEXT: selnez $2, $2, $11 ; MMR6-NEXT: seleqz $1, $1, $3 -; MMR6-NEXT: or $2, $12, $2 -; MMR6-NEXT: selnez $2, $2, $3 -; MMR6-NEXT: or $5, $1, $2 -; MMR6-NEXT: or $2, $7, $11 -; MMR6-NEXT: seleqz $1, $6, $16 -; MMR6-NEXT: selnez $6, $9, $16 -; MMR6-NEXT: lw $16, 16($sp) # 4-byte Folded Reload -; MMR6-NEXT: seleqz $9, $16, $3 -; MMR6-NEXT: selnez $10, $10, $8 -; MMR6-NEXT: lw $16, 20($sp) # 4-byte Folded Reload -; MMR6-NEXT: seleqz $4, $4, $16 -; MMR6-NEXT: seleqz $4, $4, $8 -; MMR6-NEXT: or $4, $10, $4 +; MMR6-NEXT: or $5, $13, $5 +; MMR6-NEXT: selnez $5, $5, $3 +; MMR6-NEXT: or $5, $1, $5 +; MMR6-NEXT: or $2, $8, $2 +; MMR6-NEXT: seleqz $1, $9, $16 +; MMR6-NEXT: selnez $6, $7, $16 +; MMR6-NEXT: lw $7, 12($sp) # 4-byte Folded Reload +; MMR6-NEXT: seleqz $7, $7, $3 +; MMR6-NEXT: selnez $9, $10, $11 +; MMR6-NEXT: seleqz $4, $4, $17 +; MMR6-NEXT: seleqz $4, $4, $11 </cut>

3 years, 11 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_gnu_native_build/master-aarch64 - Build # 1 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *gcc* in CI configuration tcwg_gnu_native_build/master-aarch64. So far, this commit has regressed CI configurations: - tcwg_gnu_native_build/master-aarch64 Culprit: <cut> commit cad36f38576a6a781e3c62ab061c68f5b8dab13a Author: Roger Sayle <roger(a)nextmovesoftware.com> Date: Tue Aug 31 11:45:07 2021 +0100 Preserve SUBREG_PROMOTED_VAR_P on (extend:HI (subreg/s:QI (reg:SI))). SUBREG_PROMOTED_VAR_P is a mechanism for tracking that a partial subreg is correctly zero-extended or sign-extended in the parent register. For example, the RTL (subreg/s/v:QI (reg/v:SI 23 [ x ]) 0) indicates that the byte x is zero extended in reg:SI 23, which is useful for optimization. An example is that zero extending the above QImode value to HImode can simply use a wider subreg, i.e. (subreg:HI (reg/v:SI 23 [ x ]) 0). This patch addresses the oversight/missed optimization opportunity that the new HImode subreg above should retain its SUBREG_PROMOTED_VAR_P annotation as its value is guaranteed to be correctly extended in the SImode parent. The code below to preserve SUBREG_PROMOTED_VAR_P is already present in the middle-end (e.g. simplify-rtx.c:7232-7242) but missing from one or two (precisely three) places that (accidentally) strip it. Whilst there I also added another optimization. If we need to extend the above QImode value beyond the SImode register holding it, say to DImode, we can eliminate the SUBREG and simply extend from the SImode register to DImode. 2021-08-31 Roger Sayle <roger(a)nextmovesoftware.com> gcc/ChangeLog * expr.c (convert_modes): Preserve SUBREG_PROMOTED_VAR_P when creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P subreg. * simplify-rtx.c (simplify_unary_operation_1) [SIGN_EXTEND]: Likewise, preserve SUBREG_PROMOTED_VAR_P when creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P subreg. Generate SIGN_EXTEND of the SUBREG_REG when a subreg would be paradoxical. [ZERO_EXTEND]: Likewise, preserve SUBREG_PROMOTED_VAR_P when creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P subreg. Generate ZERO_EXTEND of the SUBREG_REG when a subreg would be paradoxical. </cut> Results regressed to (for first_bad == cad36f38576a6a781e3c62ab061c68f5b8dab13a) # reset_artifacts: -10 # true: 0 # build_abe binutils: 1 # First few build errors in logs: # 00:05:59 /home/tcwg-buildslave/workspace/tcwg_gnu_6/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-2.h:249:37: internal compiler error: in subreg_promoted_mode, at rtl.h:3132 # 00:05:59 /home/tcwg-buildslave/workspace/tcwg_gnu_6/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-2.h:249:37: internal compiler error: in subreg_promoted_mode, at rtl.h:3132 # 00:05:59 /home/tcwg-buildslave/workspace/tcwg_gnu_6/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-2.h:249:37: internal compiler error: in subreg_promoted_mode, at rtl.h:3132 # 00:05:59 /home/tcwg-buildslave/workspace/tcwg_gnu_6/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-1.h:127:36: internal compiler error: in subreg_promoted_mode, at rtl.h:3132 # 00:05:59 /home/tcwg-buildslave/workspace/tcwg_gnu_6/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-1.h:127:36: internal compiler error: in subreg_promoted_mode, at rtl.h:3132 # 00:05:59 /home/tcwg-buildslave/workspace/tcwg_gnu_6/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-2.h:249:37: internal compiler error: in subreg_promoted_mode, at rtl.h:3132 # 00:05:59 /home/tcwg-buildslave/workspace/tcwg_gnu_6/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-2.h:249:37: internal compiler error: in subreg_promoted_mode, at rtl.h:3132 # 00:05:59 /home/tcwg-buildslave/workspace/tcwg_gnu_6/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-1.h:127:36: internal compiler error: in subreg_promoted_mode, at rtl.h:3132 # 00:05:59 /home/tcwg-buildslave/workspace/tcwg_gnu_6/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-2.h:249:37: internal compiler error: in subreg_promoted_mode, at rtl.h:3132 # 00:05:59 make[2]: *** [/home/tcwg-buildslave/workspace/tcwg_gnu_6/abe/snapshots/gcc.git~master/libgcc/shared-object.mk:14: trunctfhf2.o] Error 1 from (for last_good == 0960d937d9bee3c831d0b64a9c828c263a58ff89) # reset_artifacts: -10 # true: 0 # build_abe binutils: 1 # build_abe gcc: 2 # build_abe linux: 4 # build_abe glibc: 5 # build_abe gdb: 6 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-aarch64/1/art… Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-aarch64/1/art… Build top page/logs: https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-aarch64/1/ Configuration details: Reproduce builds: <cut> mkdir investigate-gcc-cad36f38576a6a781e3c62ab061c68f5b8dab13a cd investigate-gcc-cad36f38576a6a781e3c62ab061c68f5b8dab13a git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-aarch64/1/art… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-aarch64/1/art… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-aarch64/1/art… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_gnu-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /gcc/ ./ ./bisect/baseline/ cd gcc # Reproduce first_bad build git checkout --detach cad36f38576a6a781e3c62ab061c68f5b8dab13a ../artifacts/test.sh # Reproduce last_good build git checkout --detach 0960d937d9bee3c831d0b64a9c828c263a58ff89 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-aarch64/1/art… Build log: https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-aarch64/1/con… Full commit (up to 1000 lines): <cut> commit cad36f38576a6a781e3c62ab061c68f5b8dab13a Author: Roger Sayle <roger(a)nextmovesoftware.com> Date: Tue Aug 31 11:45:07 2021 +0100 Preserve SUBREG_PROMOTED_VAR_P on (extend:HI (subreg/s:QI (reg:SI))). SUBREG_PROMOTED_VAR_P is a mechanism for tracking that a partial subreg is correctly zero-extended or sign-extended in the parent register. For example, the RTL (subreg/s/v:QI (reg/v:SI 23 [ x ]) 0) indicates that the byte x is zero extended in reg:SI 23, which is useful for optimization. An example is that zero extending the above QImode value to HImode can simply use a wider subreg, i.e. (subreg:HI (reg/v:SI 23 [ x ]) 0). This patch addresses the oversight/missed optimization opportunity that the new HImode subreg above should retain its SUBREG_PROMOTED_VAR_P annotation as its value is guaranteed to be correctly extended in the SImode parent. The code below to preserve SUBREG_PROMOTED_VAR_P is already present in the middle-end (e.g. simplify-rtx.c:7232-7242) but missing from one or two (precisely three) places that (accidentally) strip it. Whilst there I also added another optimization. If we need to extend the above QImode value beyond the SImode register holding it, say to DImode, we can eliminate the SUBREG and simply extend from the SImode register to DImode. 2021-08-31 Roger Sayle <roger(a)nextmovesoftware.com> gcc/ChangeLog * expr.c (convert_modes): Preserve SUBREG_PROMOTED_VAR_P when creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P subreg. * simplify-rtx.c (simplify_unary_operation_1) [SIGN_EXTEND]: Likewise, preserve SUBREG_PROMOTED_VAR_P when creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P subreg. Generate SIGN_EXTEND of the SUBREG_REG when a subreg would be paradoxical. [ZERO_EXTEND]: Likewise, preserve SUBREG_PROMOTED_VAR_P when creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P subreg. Generate ZERO_EXTEND of the SUBREG_REG when a subreg would be paradoxical. --- gcc/expr.c | 19 ++++++++++++++++++- gcc/simplify-rtx.c | 52 ++++++++++++++++++++++++++++++++++++++++++---------- 2 files changed, 60 insertions(+), 11 deletions(-) diff --git a/gcc/expr.c b/gcc/expr.c index 096c0315ecc..5dd98a9bccc 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -688,7 +688,24 @@ convert_modes (machine_mode mode, machine_mode oldmode, rtx x, int unsignedp) && (GET_MODE_PRECISION (subreg_promoted_mode (x)) >= GET_MODE_PRECISION (int_mode)) && SUBREG_CHECK_PROMOTED_SIGN (x, unsignedp)) - x = gen_lowpart (int_mode, SUBREG_REG (x)); + { + scalar_int_mode int_orig_mode; + machine_mode orig_mode = GET_MODE (x); + x = gen_lowpart (int_mode, SUBREG_REG (x)); + + /* Preserve SUBREG_PROMOTED_VAR_P if the new mode is wider than + the original mode, but narrower than the inner mode. */ + if (GET_CODE (x) == SUBREG + && GET_MODE_PRECISION (subreg_promoted_mode (x)) + > GET_MODE_PRECISION (int_mode) + && is_a <scalar_int_mode> (orig_mode, &int_orig_mode) + && GET_MODE_PRECISION (int_mode) + > GET_MODE_PRECISION (int_orig_mode)) + { + SUBREG_PROMOTED_VAR_P (x) = 1; + SUBREG_PROMOTED_SET (x, unsignedp); + } + } if (GET_MODE (x) != VOIDmode) oldmode = GET_MODE (x); diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c index e431e0c19d7..ebad5cb5a79 100644 --- a/gcc/simplify-rtx.c +++ b/gcc/simplify-rtx.c @@ -1512,12 +1512,28 @@ simplify_context::simplify_unary_operation_1 (rtx_code code, machine_mode mode, target mode is the same as the variable's promotion. */ if (GET_CODE (op) == SUBREG && SUBREG_PROMOTED_VAR_P (op) - && SUBREG_PROMOTED_SIGNED_P (op) - && !paradoxical_subreg_p (mode, GET_MODE (SUBREG_REG (op)))) + && SUBREG_PROMOTED_SIGNED_P (op)) { - temp = rtl_hooks.gen_lowpart_no_emit (mode, SUBREG_REG (op)); - if (temp) - return temp; + rtx subreg = SUBREG_REG (op); + machine_mode subreg_mode = GET_MODE (subreg); + if (!paradoxical_subreg_p (mode, subreg_mode)) + { + temp = rtl_hooks.gen_lowpart_no_emit (mode, subreg); + if (temp) + { + /* Preserve SUBREG_PROMOTED_VAR_P. */ + if (partial_subreg_p (temp)) + { + SUBREG_PROMOTED_VAR_P (temp) = 1; + SUBREG_PROMOTED_SET (temp, 1); + } + return temp; + } + } + else + /* Sign-extending a sign-extended subreg. */ + return simplify_gen_unary (SIGN_EXTEND, mode, + subreg, subreg_mode); } /* (sign_extend:M (sign_extend:N <X>)) is (sign_extend:M <X>). @@ -1631,12 +1647,28 @@ simplify_context::simplify_unary_operation_1 (rtx_code code, machine_mode mode, target mode is the same as the variable's promotion. */ if (GET_CODE (op) == SUBREG && SUBREG_PROMOTED_VAR_P (op) - && SUBREG_PROMOTED_UNSIGNED_P (op) - && !paradoxical_subreg_p (mode, GET_MODE (SUBREG_REG (op)))) + && SUBREG_PROMOTED_UNSIGNED_P (op)) { - temp = rtl_hooks.gen_lowpart_no_emit (mode, SUBREG_REG (op)); - if (temp) - return temp; + rtx subreg = SUBREG_REG (op); + machine_mode subreg_mode = GET_MODE (subreg); + if (!paradoxical_subreg_p (mode, subreg_mode)) + { + temp = rtl_hooks.gen_lowpart_no_emit (mode, subreg); + if (temp) + { + /* Preserve SUBREG_PROMOTED_VAR_P. */ + if (partial_subreg_p (temp)) + { + SUBREG_PROMOTED_VAR_P (temp) = 1; + SUBREG_PROMOTED_SET (temp, 0); + } + return temp; + } + } + else + /* Zero-extending a zero-extended subreg. */ + return simplify_gen_unary (ZERO_EXTEND, mode, + subreg, subreg_mode); } /* Extending a widening multiplication should be canonicalized to </cut>

3 years, 11 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_gnu_cross_build/master-aarch64 - Build # 1 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *gcc* in CI configuration tcwg_gnu_cross_build/master-aarch64. So far, this commit has regressed CI configurations: - tcwg_gnu_cross_build/master-aarch64 Culprit: <cut> commit cad36f38576a6a781e3c62ab061c68f5b8dab13a Author: Roger Sayle <roger(a)nextmovesoftware.com> Date: Tue Aug 31 11:45:07 2021 +0100 Preserve SUBREG_PROMOTED_VAR_P on (extend:HI (subreg/s:QI (reg:SI))). SUBREG_PROMOTED_VAR_P is a mechanism for tracking that a partial subreg is correctly zero-extended or sign-extended in the parent register. For example, the RTL (subreg/s/v:QI (reg/v:SI 23 [ x ]) 0) indicates that the byte x is zero extended in reg:SI 23, which is useful for optimization. An example is that zero extending the above QImode value to HImode can simply use a wider subreg, i.e. (subreg:HI (reg/v:SI 23 [ x ]) 0). This patch addresses the oversight/missed optimization opportunity that the new HImode subreg above should retain its SUBREG_PROMOTED_VAR_P annotation as its value is guaranteed to be correctly extended in the SImode parent. The code below to preserve SUBREG_PROMOTED_VAR_P is already present in the middle-end (e.g. simplify-rtx.c:7232-7242) but missing from one or two (precisely three) places that (accidentally) strip it. Whilst there I also added another optimization. If we need to extend the above QImode value beyond the SImode register holding it, say to DImode, we can eliminate the SUBREG and simply extend from the SImode register to DImode. 2021-08-31 Roger Sayle <roger(a)nextmovesoftware.com> gcc/ChangeLog * expr.c (convert_modes): Preserve SUBREG_PROMOTED_VAR_P when creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P subreg. * simplify-rtx.c (simplify_unary_operation_1) [SIGN_EXTEND]: Likewise, preserve SUBREG_PROMOTED_VAR_P when creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P subreg. Generate SIGN_EXTEND of the SUBREG_REG when a subreg would be paradoxical. [ZERO_EXTEND]: Likewise, preserve SUBREG_PROMOTED_VAR_P when creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P subreg. Generate ZERO_EXTEND of the SUBREG_REG when a subreg would be paradoxical. </cut> Results regressed to (for first_bad == cad36f38576a6a781e3c62ab061c68f5b8dab13a) # reset_artifacts: -10 # true: 0 # build_abe binutils: 1 # First few build errors in logs: # 00:04:40 cc1: error: no include path in which to search for stdc-predef.h # 00:04:48 /home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-2.h:249:37: internal compiler error: in subreg_promoted_mode, at rtl.h:3132 # 00:04:48 /home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-2.h:249:37: internal compiler error: in subreg_promoted_mode, at rtl.h:3132 # 00:04:48 /home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-2.h:249:37: internal compiler error: in subreg_promoted_mode, at rtl.h:3132 # 00:04:48 /home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-1.h:127:36: internal compiler error: in subreg_promoted_mode, at rtl.h:3132 # 00:04:48 /home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-1.h:127:36: internal compiler error: in subreg_promoted_mode, at rtl.h:3132 # 00:04:48 /home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-1.h:127:36: internal compiler error: in subreg_promoted_mode, at rtl.h:3132 # 00:04:48 /home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-2.h:249:37: internal compiler error: in subreg_promoted_mode, at rtl.h:3132 # 00:04:48 make[2]: *** [/home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/snapshots/gcc.git~master/libgcc/static-object.mk:17: floatsitf.o] Error 1 # 00:04:48 /home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/snapshots/gcc.git~master/libgcc/soft-fp/op-2.h:249:37: internal compiler error: in subreg_promoted_mode, at rtl.h:3132 from (for last_good == 0960d937d9bee3c831d0b64a9c828c263a58ff89) # reset_artifacts: -10 # true: 0 # build_abe binutils: 1 # build_abe stage1: 2 # build_abe linux: 3 # build_abe glibc: 4 # build_abe stage2: 5 # build_abe gdb: 6 # build_abe qemu: 7 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-aarch64/1/arti… Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-aarch64/1/arti… Build top page/logs: https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-aarch64/1/ Configuration details: Reproduce builds: <cut> mkdir investigate-gcc-cad36f38576a6a781e3c62ab061c68f5b8dab13a cd investigate-gcc-cad36f38576a6a781e3c62ab061c68f5b8dab13a git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-aarch64/1/arti… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-aarch64/1/arti… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-aarch64/1/arti… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_gnu-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /gcc/ ./ ./bisect/baseline/ cd gcc # Reproduce first_bad build git checkout --detach cad36f38576a6a781e3c62ab061c68f5b8dab13a ../artifacts/test.sh # Reproduce last_good build git checkout --detach 0960d937d9bee3c831d0b64a9c828c263a58ff89 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-aarch64/1/arti… Build log: https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-aarch64/1/cons… Full commit (up to 1000 lines): <cut> commit cad36f38576a6a781e3c62ab061c68f5b8dab13a Author: Roger Sayle <roger(a)nextmovesoftware.com> Date: Tue Aug 31 11:45:07 2021 +0100 Preserve SUBREG_PROMOTED_VAR_P on (extend:HI (subreg/s:QI (reg:SI))). SUBREG_PROMOTED_VAR_P is a mechanism for tracking that a partial subreg is correctly zero-extended or sign-extended in the parent register. For example, the RTL (subreg/s/v:QI (reg/v:SI 23 [ x ]) 0) indicates that the byte x is zero extended in reg:SI 23, which is useful for optimization. An example is that zero extending the above QImode value to HImode can simply use a wider subreg, i.e. (subreg:HI (reg/v:SI 23 [ x ]) 0). This patch addresses the oversight/missed optimization opportunity that the new HImode subreg above should retain its SUBREG_PROMOTED_VAR_P annotation as its value is guaranteed to be correctly extended in the SImode parent. The code below to preserve SUBREG_PROMOTED_VAR_P is already present in the middle-end (e.g. simplify-rtx.c:7232-7242) but missing from one or two (precisely three) places that (accidentally) strip it. Whilst there I also added another optimization. If we need to extend the above QImode value beyond the SImode register holding it, say to DImode, we can eliminate the SUBREG and simply extend from the SImode register to DImode. 2021-08-31 Roger Sayle <roger(a)nextmovesoftware.com> gcc/ChangeLog * expr.c (convert_modes): Preserve SUBREG_PROMOTED_VAR_P when creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P subreg. * simplify-rtx.c (simplify_unary_operation_1) [SIGN_EXTEND]: Likewise, preserve SUBREG_PROMOTED_VAR_P when creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P subreg. Generate SIGN_EXTEND of the SUBREG_REG when a subreg would be paradoxical. [ZERO_EXTEND]: Likewise, preserve SUBREG_PROMOTED_VAR_P when creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P subreg. Generate ZERO_EXTEND of the SUBREG_REG when a subreg would be paradoxical. --- gcc/expr.c | 19 ++++++++++++++++++- gcc/simplify-rtx.c | 52 ++++++++++++++++++++++++++++++++++++++++++---------- 2 files changed, 60 insertions(+), 11 deletions(-) diff --git a/gcc/expr.c b/gcc/expr.c index 096c0315ecc..5dd98a9bccc 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -688,7 +688,24 @@ convert_modes (machine_mode mode, machine_mode oldmode, rtx x, int unsignedp) && (GET_MODE_PRECISION (subreg_promoted_mode (x)) >= GET_MODE_PRECISION (int_mode)) && SUBREG_CHECK_PROMOTED_SIGN (x, unsignedp)) - x = gen_lowpart (int_mode, SUBREG_REG (x)); + { + scalar_int_mode int_orig_mode; + machine_mode orig_mode = GET_MODE (x); + x = gen_lowpart (int_mode, SUBREG_REG (x)); + + /* Preserve SUBREG_PROMOTED_VAR_P if the new mode is wider than + the original mode, but narrower than the inner mode. */ + if (GET_CODE (x) == SUBREG + && GET_MODE_PRECISION (subreg_promoted_mode (x)) + > GET_MODE_PRECISION (int_mode) + && is_a <scalar_int_mode> (orig_mode, &int_orig_mode) + && GET_MODE_PRECISION (int_mode) + > GET_MODE_PRECISION (int_orig_mode)) + { + SUBREG_PROMOTED_VAR_P (x) = 1; + SUBREG_PROMOTED_SET (x, unsignedp); + } + } if (GET_MODE (x) != VOIDmode) oldmode = GET_MODE (x); diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c index e431e0c19d7..ebad5cb5a79 100644 --- a/gcc/simplify-rtx.c +++ b/gcc/simplify-rtx.c @@ -1512,12 +1512,28 @@ simplify_context::simplify_unary_operation_1 (rtx_code code, machine_mode mode, target mode is the same as the variable's promotion. */ if (GET_CODE (op) == SUBREG && SUBREG_PROMOTED_VAR_P (op) - && SUBREG_PROMOTED_SIGNED_P (op) - && !paradoxical_subreg_p (mode, GET_MODE (SUBREG_REG (op)))) + && SUBREG_PROMOTED_SIGNED_P (op)) { - temp = rtl_hooks.gen_lowpart_no_emit (mode, SUBREG_REG (op)); - if (temp) - return temp; + rtx subreg = SUBREG_REG (op); + machine_mode subreg_mode = GET_MODE (subreg); + if (!paradoxical_subreg_p (mode, subreg_mode)) + { + temp = rtl_hooks.gen_lowpart_no_emit (mode, subreg); + if (temp) + { + /* Preserve SUBREG_PROMOTED_VAR_P. */ + if (partial_subreg_p (temp)) + { + SUBREG_PROMOTED_VAR_P (temp) = 1; + SUBREG_PROMOTED_SET (temp, 1); + } + return temp; + } + } + else + /* Sign-extending a sign-extended subreg. */ + return simplify_gen_unary (SIGN_EXTEND, mode, + subreg, subreg_mode); } /* (sign_extend:M (sign_extend:N <X>)) is (sign_extend:M <X>). @@ -1631,12 +1647,28 @@ simplify_context::simplify_unary_operation_1 (rtx_code code, machine_mode mode, target mode is the same as the variable's promotion. */ if (GET_CODE (op) == SUBREG && SUBREG_PROMOTED_VAR_P (op) - && SUBREG_PROMOTED_UNSIGNED_P (op) - && !paradoxical_subreg_p (mode, GET_MODE (SUBREG_REG (op)))) + && SUBREG_PROMOTED_UNSIGNED_P (op)) { - temp = rtl_hooks.gen_lowpart_no_emit (mode, SUBREG_REG (op)); - if (temp) - return temp; + rtx subreg = SUBREG_REG (op); + machine_mode subreg_mode = GET_MODE (subreg); + if (!paradoxical_subreg_p (mode, subreg_mode)) + { + temp = rtl_hooks.gen_lowpart_no_emit (mode, subreg); + if (temp) + { + /* Preserve SUBREG_PROMOTED_VAR_P. */ + if (partial_subreg_p (temp)) + { + SUBREG_PROMOTED_VAR_P (temp) = 1; + SUBREG_PROMOTED_SET (temp, 0); + } + return temp; + } + } + else + /* Zero-extending a zero-extended subreg. */ + return simplify_gen_unary (ZERO_EXTEND, mode, + subreg, subreg_mode); } /* Extending a widening multiplication should be canonicalized to </cut>

3 years, 11 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_gnu_native_build/master-arm - Build # 1 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *gdb* in CI configuration tcwg_gnu_native_build/master-arm. So far, this commit has regressed CI configurations: - tcwg_gnu_native_build/master-arm Culprit: <cut> commit 282aa4f7d292eb4bc213d028465a3b96f5af2f22 Author: Tom Tromey <tom(a)tromey.com> Date: Sat Aug 28 13:16:50 2021 -0600 Add some parallel_for_each tests Tom de Vries noticed that a patch in the DWARF scanner rewrite series caused a regression in parallel_for_each -- it started crashing in the case where the number of threads is 0 (there was an unchecked use of "n-1" that was used to size an array). He also pointed out that there were no tests of parallel_for_each. This adds a few tests of parallel_for_each, primarily testing that different settings for the number of threads will work. This test catches the bug that he found in that series. </cut> Results regressed to (for first_bad == 282aa4f7d292eb4bc213d028465a3b96f5af2f22) # reset_artifacts: -10 # true: 0 # build_abe binutils: 1 # build_abe gcc: 2 # build_abe linux: 4 # build_abe glibc: 5 # First few build errors in logs: # 00:03:45 ../../../../../../gdb/gdb/unittests/parallel-for-selftests.c:53:30: error: use of deleted function ‘std::atomic<int>::atomic(const std::atomic<int>&)’ # 00:03:45 make[1]: *** [unittests/parallel-for-selftests.o] Error 1 # 00:03:46 make: *** [all-gdb] Error 2 from (for last_good == ee8b88452c1cb1be97199942aee7a76bbca210ee) # reset_artifacts: -10 # true: 0 # build_abe binutils: 1 # build_abe gcc: 2 # build_abe linux: 4 # build_abe glibc: 5 # build_abe gdb: 6 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-arm/1/artifac… Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-arm/1/artifac… Build top page/logs: https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-arm/1/ Configuration details: Reproduce builds: <cut> mkdir investigate-gdb-282aa4f7d292eb4bc213d028465a3b96f5af2f22 cd investigate-gdb-282aa4f7d292eb4bc213d028465a3b96f5af2f22 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-arm/1/artifac… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-arm/1/artifac… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-arm/1/artifac… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_gnu-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /gdb/ ./ ./bisect/baseline/ cd gdb # Reproduce first_bad build git checkout --detach 282aa4f7d292eb4bc213d028465a3b96f5af2f22 ../artifacts/test.sh # Reproduce last_good build git checkout --detach ee8b88452c1cb1be97199942aee7a76bbca210ee ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-arm/1/artifac… Build log: https://ci.linaro.org/job/tcwg_gnu_native_build-bisect-master-arm/1/console… Full commit (up to 1000 lines): <cut> commit 282aa4f7d292eb4bc213d028465a3b96f5af2f22 Author: Tom Tromey <tom(a)tromey.com> Date: Sat Aug 28 13:16:50 2021 -0600 Add some parallel_for_each tests Tom de Vries noticed that a patch in the DWARF scanner rewrite series caused a regression in parallel_for_each -- it started crashing in the case where the number of threads is 0 (there was an unchecked use of "n-1" that was used to size an array). He also pointed out that there were no tests of parallel_for_each. This adds a few tests of parallel_for_each, primarily testing that different settings for the number of threads will work. This test catches the bug that he found in that series. --- gdb/Makefile.in | 1 + gdb/unittests/parallel-for-selftests.c | 86 ++++++++++++++++++++++++++++++++++ 2 files changed, 87 insertions(+) diff --git a/gdb/Makefile.in b/gdb/Makefile.in index 73a1bf83c85..320d3326a81 100644 --- a/gdb/Makefile.in +++ b/gdb/Makefile.in @@ -456,6 +456,7 @@ SELFTESTS_SRCS = \ unittests/offset-type-selftests.c \ unittests/observable-selftests.c \ unittests/optional-selftests.c \ + unittests/parallel-for-selftests.c \ unittests/parse-connection-spec-selftests.c \ unittests/ptid-selftests.c \ unittests/main-thread-selftests.c \ diff --git a/gdb/unittests/parallel-for-selftests.c b/gdb/unittests/parallel-for-selftests.c new file mode 100644 index 00000000000..7f61b709fa7 --- /dev/null +++ b/gdb/unittests/parallel-for-selftests.c @@ -0,0 +1,86 @@ +/* Self tests for parallel_for_each + + Copyright (C) 2021 Free Software Foundation, Inc. + + This file is part of GDB. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see <http://www.gnu.org/licenses/>. */ + +#include "defs.h" +#include "gdbsupport/selftest.h" +#include "gdbsupport/parallel-for.h" +#include "gdbsupport/thread-pool.h" + +#if CXX_STD_THREAD + +namespace selftests { +namespace parallel_for { + +struct save_restore_n_threads +{ + save_restore_n_threads () + : n_threads (gdb::thread_pool::g_thread_pool->thread_count ()) + { + } + + ~save_restore_n_threads () + { + gdb::thread_pool::g_thread_pool->set_thread_count (n_threads); + } + + int n_threads; +}; + +static void +test (int n_threads) +{ + save_restore_n_threads saver; + gdb::thread_pool::g_thread_pool->set_thread_count (n_threads); + +#define NUMBER 10000 + + std::atomic<int> counter = 0; + gdb::parallel_for_each (0, NUMBER, + [&] (int start, int end) + { + counter += end - start; + }); + + SELF_CHECK (counter == NUMBER); + +#undef NUMBER +} + +static void +test_n_threads () +{ + test (0); + test (1); + test (3); +} + +} +} + +#endif /* CXX_STD_THREAD */ + +void _initialize_parallel_for_selftests (); +void +_initialize_parallel_for_selftests () +{ +#ifdef CXX_STD_THREAD + selftests::register_test ("parallel_for", + selftests::parallel_for::test_n_threads); +#endif /* CXX_STD_THREAD */ +} </cut>

3 years, 11 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_gnu_cross_build/master-arm - Build # 1 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *gcc* in CI configuration tcwg_gnu_cross_build/master-arm. So far, this commit has regressed CI configurations: - tcwg_gnu_cross_build/master-arm Culprit: <cut> commit caf81d3b57501b1f58dcd9b1ef9d7b4bc76f4ab1 Author: Sebastian Huber <sebastian.huber(a)embedded-brains.de> Date: Tue Aug 17 09:53:43 2021 +0200 Use __builtin_trap() for abort() if inhibit_libc abort() is used in gcc_assert() and gcc_unreachable() which is used by target libraries such as libgcov.a. This patch changes the abort() definition under certain conditions. If inhibit_libc is defined and abort is not already defined, then abort() is defined to __builtin_trap(). The inhibit_libc define is usually defined if GCC is built for targets running in embedded systems which may optionally use a C standard library. If inhibit_libc is defined, then there may be still a full featured abort() available. abort() is a heavy weight function which depends on signals and file streams. For statically linked applications, this means that a dependency on gcc_assert() pulls in the support for signals and file streams. This could prevent using gcov to test low end targets for example. Using __builtin_trap() avoids these dependencies if the target implements a "trap" instruction. The application or operating system could use a trap handler to react to failed GCC runtime checks which caused a trap. gcc/ * tsystem.h (abort): Define abort() if inhibit_libc is defined and it is not already defined. </cut> Results regressed to (for first_bad == caf81d3b57501b1f58dcd9b1ef9d7b4bc76f4ab1) # reset_artifacts: -10 # true: 0 # build_abe binutils: 1 # First few build errors in logs: # 00:01:44 cc1: error: no include path in which to search for stdc-predef.h # 00:02:05 /home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/snapshots/gcc.git~master/libgcc/unwind-arm-common.inc:55:24: error: macro passed 1 arguments, but takes just 0 # 00:02:05 make[2]: *** [/home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/snapshots/gcc.git~master/libgcc/static-object.mk:17: unwind-arm.o] Error 1 # 00:02:06 make[1]: *** [Makefile:12484: all-target-libgcc] Error 2 # 00:02:06 make: *** [Makefile:953: all] Error 2 from (for last_good == d7e56b084d0b230ae5ee280f569d679fa0f09f4d) # reset_artifacts: -10 # true: 0 # build_abe binutils: 1 # build_abe stage1: 2 # build_abe linux: 3 # build_abe glibc: 4 # build_abe stage2: 5 # build_abe gdb: 6 # build_abe qemu: 7 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-arm/1/artifact… Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-arm/1/artifact… Build top page/logs: https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-arm/1/ Configuration details: Reproduce builds: <cut> mkdir investigate-gcc-caf81d3b57501b1f58dcd9b1ef9d7b4bc76f4ab1 cd investigate-gcc-caf81d3b57501b1f58dcd9b1ef9d7b4bc76f4ab1 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-arm/1/artifact… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-arm/1/artifact… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-arm/1/artifact… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_gnu-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /gcc/ ./ ./bisect/baseline/ cd gcc # Reproduce first_bad build git checkout --detach caf81d3b57501b1f58dcd9b1ef9d7b4bc76f4ab1 ../artifacts/test.sh # Reproduce last_good build git checkout --detach d7e56b084d0b230ae5ee280f569d679fa0f09f4d ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-arm/1/artifact… Build log: https://ci.linaro.org/job/tcwg_gnu_cross_build-bisect-master-arm/1/consoleT… Full commit (up to 1000 lines): <cut> commit caf81d3b57501b1f58dcd9b1ef9d7b4bc76f4ab1 Author: Sebastian Huber <sebastian.huber(a)embedded-brains.de> Date: Tue Aug 17 09:53:43 2021 +0200 Use __builtin_trap() for abort() if inhibit_libc abort() is used in gcc_assert() and gcc_unreachable() which is used by target libraries such as libgcov.a. This patch changes the abort() definition under certain conditions. If inhibit_libc is defined and abort is not already defined, then abort() is defined to __builtin_trap(). The inhibit_libc define is usually defined if GCC is built for targets running in embedded systems which may optionally use a C standard library. If inhibit_libc is defined, then there may be still a full featured abort() available. abort() is a heavy weight function which depends on signals and file streams. For statically linked applications, this means that a dependency on gcc_assert() pulls in the support for signals and file streams. This could prevent using gcov to test low end targets for example. Using __builtin_trap() avoids these dependencies if the target implements a "trap" instruction. The application or operating system could use a trap handler to react to failed GCC runtime checks which caused a trap. gcc/ * tsystem.h (abort): Define abort() if inhibit_libc is defined and it is not already defined. --- gcc/tsystem.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/tsystem.h b/gcc/tsystem.h index e1e6a96a4f4..5c72c69ff3e 100644 --- a/gcc/tsystem.h +++ b/gcc/tsystem.h @@ -59,7 +59,7 @@ extern int atexit (void (*)(void)); #endif #ifndef abort -extern void abort (void) __attribute__ ((__noreturn__)); +#define abort() __builtin_trap () #endif #ifndef strlen </cut>

3 years, 11 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_gcc_bootstrap/master-aarch64-bootstrap_profiled - Build # 3 - Fixed!

by ci_notify＠linaro.org

Successfully identified regression in *binutils* in CI configuration tcwg_gcc_bootstrap/master-aarch64-bootstrap_profiled. So far, this commit has regressed CI configurations: - tcwg_gcc_bootstrap/master-aarch64-bootstrap_profiled Culprit: <cut> commit a12ea97b9dab8eedf411fc5052ffaa8be29f5d36 Author: GDB Administrator <gdbadmin(a)sourceware.org> Date: Mon Aug 23 00:00:07 2021 +0000 Automatic date update in version.in </cut> Results regressed to (for first_bad == a12ea97b9dab8eedf411fc5052ffaa8be29f5d36) # reset_artifacts: -10 # true: 0 # First few build errors in logs: from (for last_good == fe7f0b013526b30ef657c5ad34a3c622a54499ac) # reset_artifacts: -10 # true: 0 # build_abe binutils: 1 # build_abe bootstrap_profiled: 2 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-aarch64-bootstra… Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-aarch64-bootstra… Build top page/logs: https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-aarch64-bootstra… Configuration details: Reproduce builds: <cut> mkdir investigate-binutils-a12ea97b9dab8eedf411fc5052ffaa8be29f5d36 cd investigate-binutils-a12ea97b9dab8eedf411fc5052ffaa8be29f5d36 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-aarch64-bootstra… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-aarch64-bootstra… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-aarch64-bootstra… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_gnu-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /binutils/ ./ ./bisect/baseline/ cd binutils # Reproduce first_bad build git checkout --detach a12ea97b9dab8eedf411fc5052ffaa8be29f5d36 ../artifacts/test.sh # Reproduce last_good build git checkout --detach fe7f0b013526b30ef657c5ad34a3c622a54499ac ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-aarch64-bootstra… Build log: https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-aarch64-bootstra… Full commit (up to 1000 lines): <cut> commit a12ea97b9dab8eedf411fc5052ffaa8be29f5d36 Author: GDB Administrator <gdbadmin(a)sourceware.org> Date: Mon Aug 23 00:00:07 2021 +0000 Automatic date update in version.in --- bfd/version.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/bfd/version.h b/bfd/version.h index d22e02c7fd4..66d43d6f3a1 100644 --- a/bfd/version.h +++ b/bfd/version.h @@ -16,7 +16,7 @@ In releases, the date is not included in either version strings or sonames. */ -#define BFD_VERSION_DATE 20210822 +#define BFD_VERSION_DATE 20210823 #define BFD_VERSION @bfd_version@ #define BFD_VERSION_STRING @bfd_version_package@ @bfd_version_string@ #define REPORT_BUGS_TO @report_bugs_to@ </cut>

3 years, 12 months

1
0
0 0

[ACTIVITY] report week ending 26 Aug

by Peter Maydell

Progress (short week, 3 days): * UM-2 [QEMU upstream maintainership] + QEMU 6.1.0 has now been released + Sent out the first arm pullreq for the 6.2 cycle, including another slice of the MVE patches + tried to work through some of the codereview backlog -- PMM

3 years, 12 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_gcc_bootstrap/master-arm-bootstrap_debug - Build # 2 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *gcc* in CI configuration tcwg_gcc_bootstrap/master-arm-bootstrap_debug. So far, this commit has regressed CI configurations: - tcwg_gcc_bootstrap/master-arm-bootstrap_debug Culprit: <cut> commit 1d244020246cb155e4de62ca3b302b920a1f513f Author: Roger Sayle <roger(a)nextmovesoftware.com> Date: Mon Aug 23 12:37:04 2021 +0100 Fold sign of LSHIFT_EXPR to eliminate no-op conversions. This short patch teaches fold that it is "safe" to change the sign of a left shift, to reduce the number of type conversions in gimple. As an example: unsigned int foo(unsigned int i) { return (int)i << 8; } is currently optimized to: unsigned int foo (unsigned int i) { int i.0_1; int _2; unsigned int _4; <bb 2> [local count: 1073741824]: i.0_1 = (int) i_3(D); _2 = i.0_1 << 8; _4 = (unsigned int) _2; return _4; } with this patch, this now becomes: unsigned int foo (unsigned int i) { unsigned int _2; <bb 2> [local count: 1073741824]: _2 = i_1(D) << 8; return _2; } which generates exactly the same assembly language. Aside from the reduced memory usage, the real benefit is that no-op conversions tend to interfere with many folding optimizations. For example, unsigned int bar(unsigned char i) { return (i ^ (i<<16)) | (i<<8); } currently gets (tangled in conversions and) optimized to: unsigned int bar (unsigned char i) { unsigned int _1; unsigned int _2; int _3; int _4; unsigned int _6; unsigned int _8; <bb 2> [local count: 1073741824]: _1 = (unsigned int) i_5(D); _2 = _1 * 65537; _3 = (int) i_5(D); _4 = _3 << 8; _8 = (unsigned int) _4; _6 = _2 | _8; return _6; } but with this patch, bar now optimizes down to: unsigned int bar(unsigned char i) { unsigned int _1; unsigned int _4; <bb 2> [local count: 1073741824]: _1 = (unsigned int) i_3(D); _4 = _1 * 65793; return _4; } 2021-08-23 Roger Sayle <roger(a)nextmovesoftware.com> gcc/ChangeLog * match.pd (shift transformations): Change the sign of an LSHIFT_EXPR if it reduces the number of explicit conversions. gcc/testsuite/ChangeLog * gcc.dg/fold-convlshift-1.c: New test case. * gcc.dg/fold-convlshift-2.c: New test case. </cut> Results regressed to (for first_bad == 1d244020246cb155e4de62ca3b302b920a1f513f) # reset_artifacts: -10 # true: 0 # build_abe binutils: 1 # First few build errors in logs: # 00:06:26 make[3]: [armv8l-unknown-linux-gnueabihf/bits/largefile-config.h] Error 1 (ignored) # 00:25:39 make[3]: [armv8l-unknown-linux-gnueabihf/bits/largefile-config.h] Error 1 (ignored) # 00:29:38 /home/tcwg-buildslave/workspace/tcwg_gnu_8/abe/snapshots/gcc.git~master/gcc/bitmap.h:357:13: error: type mismatch in ‘lshift_expr’ # 00:29:38 /home/tcwg-buildslave/workspace/tcwg_gnu_8/abe/snapshots/gcc.git~master/gcc/bitmap.h:357:13: internal compiler error: ‘verify_gimple’ failed # 00:29:38 make[3]: *** [bitmap.o] Error 1 # 00:34:06 make[2]: *** [all-stage3-gcc] Error 2 # 00:34:06 make[1]: *** [stage3-bubble] Error 2 # 00:34:07 make: *** [all] Error 2 from (for last_good == b320edc0c29c838b0090c3c9be14187d132f73f2) # reset_artifacts: -10 # true: 0 # build_abe binutils: 1 # build_abe bootstrap_debug: 2 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_de… Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_de… Build top page/logs: https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_de… Configuration details: Reproduce builds: <cut> mkdir investigate-gcc-1d244020246cb155e4de62ca3b302b920a1f513f cd investigate-gcc-1d244020246cb155e4de62ca3b302b920a1f513f git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_de… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_de… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_de… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_gnu-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /gcc/ ./ ./bisect/baseline/ cd gcc # Reproduce first_bad build git checkout --detach 1d244020246cb155e4de62ca3b302b920a1f513f ../artifacts/test.sh # Reproduce last_good build git checkout --detach b320edc0c29c838b0090c3c9be14187d132f73f2 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_de… Build log: https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_de… Full commit (up to 1000 lines): <cut> commit 1d244020246cb155e4de62ca3b302b920a1f513f Author: Roger Sayle <roger(a)nextmovesoftware.com> Date: Mon Aug 23 12:37:04 2021 +0100 Fold sign of LSHIFT_EXPR to eliminate no-op conversions. This short patch teaches fold that it is "safe" to change the sign of a left shift, to reduce the number of type conversions in gimple. As an example: unsigned int foo(unsigned int i) { return (int)i << 8; } is currently optimized to: unsigned int foo (unsigned int i) { int i.0_1; int _2; unsigned int _4; <bb 2> [local count: 1073741824]: i.0_1 = (int) i_3(D); _2 = i.0_1 << 8; _4 = (unsigned int) _2; return _4; } with this patch, this now becomes: unsigned int foo (unsigned int i) { unsigned int _2; <bb 2> [local count: 1073741824]: _2 = i_1(D) << 8; return _2; } which generates exactly the same assembly language. Aside from the reduced memory usage, the real benefit is that no-op conversions tend to interfere with many folding optimizations. For example, unsigned int bar(unsigned char i) { return (i ^ (i<<16)) | (i<<8); } currently gets (tangled in conversions and) optimized to: unsigned int bar (unsigned char i) { unsigned int _1; unsigned int _2; int _3; int _4; unsigned int _6; unsigned int _8; <bb 2> [local count: 1073741824]: _1 = (unsigned int) i_5(D); _2 = _1 * 65537; _3 = (int) i_5(D); _4 = _3 << 8; _8 = (unsigned int) _4; _6 = _2 | _8; return _6; } but with this patch, bar now optimizes down to: unsigned int bar(unsigned char i) { unsigned int _1; unsigned int _4; <bb 2> [local count: 1073741824]: _1 = (unsigned int) i_3(D); _4 = _1 * 65793; return _4; } 2021-08-23 Roger Sayle <roger(a)nextmovesoftware.com> gcc/ChangeLog * match.pd (shift transformations): Change the sign of an LSHIFT_EXPR if it reduces the number of explicit conversions. gcc/testsuite/ChangeLog * gcc.dg/fold-convlshift-1.c: New test case. * gcc.dg/fold-convlshift-2.c: New test case. --- gcc/match.pd | 9 +++++++++ gcc/testsuite/gcc.dg/fold-convlshift-1.c | 20 ++++++++++++++++++++ gcc/testsuite/gcc.dg/fold-convlshift-2.c | 20 ++++++++++++++++++++ 3 files changed, 49 insertions(+) diff --git a/gcc/match.pd b/gcc/match.pd index 0fcfd0ea62c..978a1b0172e 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -3385,6 +3385,15 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (if (integer_zerop (@2) || integer_all_onesp (@2)) (cmp @0 @2))))) +/* Both signed and unsigned lshift produce the same result, so use + the form that minimizes the number of conversions. */ +(simplify + (convert (lshift:s@0 (convert:s@1 @2) INTEGER_CST@3)) + (if (tree_nop_conversion_p (type, TREE_TYPE (@0)) + && INTEGRAL_TYPE_P (TREE_TYPE (@2)) + && TYPE_PRECISION (TREE_TYPE (@2)) <= TYPE_PRECISION (type)) + (lshift (convert @2) @3))) + /* Simplifications of conversions. */ /* Basic strip-useless-type-conversions / strip_nops. */ diff --git a/gcc/testsuite/gcc.dg/fold-convlshift-1.c b/gcc/testsuite/gcc.dg/fold-convlshift-1.c new file mode 100644 index 00000000000..b6f57f81e72 --- /dev/null +++ b/gcc/testsuite/gcc.dg/fold-convlshift-1.c @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-optimized" } */ + +unsigned int foo(unsigned int i) +{ + int t1 = i; + int t2 = t1 << 8; + return t2; +} + +int bar(int i) +{ + unsigned int t1 = i; + unsigned int t2 = t1 << 8; + return t2; +} + +/* { dg-final { scan-tree-dump-not "\$int\$" "optimized" } } */ +/* { dg-final { scan-tree-dump-not "\$unsigned int\$" "optimized" } } */ + diff --git a/gcc/testsuite/gcc.dg/fold-convlshift-2.c b/gcc/testsuite/gcc.dg/fold-convlshift-2.c new file mode 100644 index 00000000000..f21358c4584 --- /dev/null +++ b/gcc/testsuite/gcc.dg/fold-convlshift-2.c @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-optimized" } */ + +unsigned int foo(unsigned char c) +{ + int t1 = c; + int t2 = t1 << 8; + return t2; +} + +int bar(unsigned char c) +{ + unsigned int t1 = c; + unsigned int t2 = t1 << 8; + return t2; +} + +/* { dg-final { scan-tree-dump-times "\$int\$" 1 "optimized" } } */ +/* { dg-final { scan-tree-dump-times "\$unsigned int\$" 1 "optimized" } } */ + </cut>

3 years, 12 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tk1/llvm-release-arm-spec2k6-O2_LTO - Build # 14 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_tk1/llvm-release-arm-spec2k6-O2_LTO. So far, this commit has regressed CI configurations: - tcwg_bmk_llvm_tk1/llvm-release-arm-spec2k6-O2_LTO Culprit: <cut> commit efa7df1682c2859dabe3646ee7dc01e68629417f Author: Gabor Marton <gabor.marton(a)ericsson.com> Date: Thu Mar 25 15:40:26 2021 +0100 [Analyzer] Track RValue expressions It makes sense to track rvalue expressions in the case of special concrete integer values. The most notable special value is zero (later we may find other values). By tracking the origin of 0, we can provide a better explanation for users e.g. in case of division by 0 warnings. When the divisor is a product of a multiplication then now we can show which operand (or both) was (were) zero and why. Differential Revision: https://reviews.llvm.org/D99344 </cut> Results regressed to (for first_bad == efa7df1682c2859dabe3646ee7dc01e68629417f) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O2_LTO_marm artifacts/build-efa7df1682c2859dabe3646ee7dc01e68629417f/results_id: 1 # 456.hmmer,hmmer_base.default regressed by 103 from (for last_good == 1696b8ae96b2d8bcbf90894bd344a8a090f43c84) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O2_LTO_marm artifacts/build-1696b8ae96b2d8bcbf90894bd344a8a090f43c84/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Results ID of last_good: tk1_32/tcwg_bmk_llvm_tk1/bisect-llvm-release-arm-spec2k6-O2_LTO/4320 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Results ID of first_bad: tk1_32/tcwg_bmk_llvm_tk1/bisect-llvm-release-arm-spec2k6-O2_LTO/4322 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Configuration details: Reproduce builds: <cut> mkdir investigate-llvm-efa7df1682c2859dabe3646ee7dc01e68629417f cd investigate-llvm-efa7df1682c2859dabe3646ee7dc01e68629417f git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach efa7df1682c2859dabe3646ee7dc01e68629417f ../artifacts/test.sh # Reproduce last_good build git checkout --detach 1696b8ae96b2d8bcbf90894bd344a8a090f43c84 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Full commit (up to 1000 lines): <cut> commit efa7df1682c2859dabe3646ee7dc01e68629417f Author: Gabor Marton <gabor.marton(a)ericsson.com> Date: Thu Mar 25 15:40:26 2021 +0100 [Analyzer] Track RValue expressions It makes sense to track rvalue expressions in the case of special concrete integer values. The most notable special value is zero (later we may find other values). By tracking the origin of 0, we can provide a better explanation for users e.g. in case of division by 0 warnings. When the divisor is a product of a multiplication then now we can show which operand (or both) was (were) zero and why. Differential Revision: https://reviews.llvm.org/D99344 --- .../StaticAnalyzer/Core/BugReporterVisitors.cpp | 43 ++++++++++ clang/test/Analysis/division-by-zero-track-zero.c | 11 +++ .../test/Analysis/division-by-zero-track-zero.cpp | 98 ++++++++++++++++++++++ clang/test/Analysis/nullptr.cpp | 2 +- 4 files changed, 153 insertions(+), 1 deletion(-) diff --git a/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp b/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp index 0edd6e3f731b..fd334b0bc9c3 100644 --- a/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp +++ b/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp @@ -1924,6 +1924,44 @@ static const ExplodedNode* findNodeForExpression(const ExplodedNode *N, return N; } +/// Attempts to add visitors to track an RValue expression back to its point of +/// origin. Works similarly to trackExpressionValue, but accepts only RValues. +static void trackRValueExpression(const ExplodedNode *InputNode, const Expr *E, + PathSensitiveBugReport &report, + bugreporter::TrackingKind TKind, + bool EnableNullFPSuppression) { + assert(E->isRValue() && "The expression is not an rvalue!"); + const ExplodedNode *RVNode = findNodeForExpression(InputNode, E); + if (!RVNode) + return; + ProgramStateRef RVState = RVNode->getState(); + SVal V = RVState->getSValAsScalarOrLoc(E, RVNode->getLocationContext()); + const auto *BO = dyn_cast<BinaryOperator>(E); + if (!BO) + return; + if (!V.isZeroConstant()) + return; + if (!BO->isMultiplicativeOp()) + return; + + SVal RHSV = RVState->getSVal(BO->getRHS(), RVNode->getLocationContext()); + SVal LHSV = RVState->getSVal(BO->getLHS(), RVNode->getLocationContext()); + + // Track both LHS and RHS of a multiplication. + if (BO->getOpcode() == BO_Mul) { + if (LHSV.isZeroConstant()) + trackExpressionValue(InputNode, BO->getLHS(), report, TKind, + EnableNullFPSuppression); + if (RHSV.isZeroConstant()) + trackExpressionValue(InputNode, BO->getRHS(), report, TKind, + EnableNullFPSuppression); + } else { // Track only the LHS of a division or a modulo. + if (LHSV.isZeroConstant()) + trackExpressionValue(InputNode, BO->getLHS(), report, TKind, + EnableNullFPSuppression); + } +} + bool bugreporter::trackExpressionValue(const ExplodedNode *InputNode, const Expr *E, PathSensitiveBugReport &report, @@ -2069,6 +2107,11 @@ bool bugreporter::trackExpressionValue(const ExplodedNode *InputNode, loc::MemRegionVal(RegionRVal), /*assumption=*/false)); } } + + if (Inner->isRValue()) + trackRValueExpression(LVNode, Inner, report, TKind, + EnableNullFPSuppression); + return true; } diff --git a/clang/test/Analysis/division-by-zero-track-zero.c b/clang/test/Analysis/division-by-zero-track-zero.c new file mode 100644 index 000000000000..f6b2a78ed701 --- /dev/null +++ b/clang/test/Analysis/division-by-zero-track-zero.c @@ -0,0 +1,11 @@ +// RUN: %clang_analyze_cc1 -analyzer-checker=core \ +// RUN: -analyzer-output=text \ +// RUN: -verify %s + +int track_mul_lhs_0(int x, int y) { + int p0 = x < 0; // expected-note {{Assuming 'x' is >= 0}} \ + // expected-note {{'p0' initialized to 0}} + int div = p0 * y; // expected-note {{'div' initialized to 0}} + return 1 / div; // expected-note {{Division by zero}} \ + // expected-warning {{Division by zero}} +} diff --git a/clang/test/Analysis/division-by-zero-track-zero.cpp b/clang/test/Analysis/division-by-zero-track-zero.cpp new file mode 100644 index 000000000000..c4b9550c76c0 --- /dev/null +++ b/clang/test/Analysis/division-by-zero-track-zero.cpp @@ -0,0 +1,98 @@ +// RUN: %clang_analyze_cc1 -analyzer-checker=core \ +// RUN: -analyzer-output=text \ +// RUN: -verify %s + +namespace test_tracking_of_lhs_multiplier { + int f(int x, int y) { + bool p0 = x < 0; // expected-note {{Assuming 'x' is >= 0}} \ + // expected-note {{'p0' initialized to 0}} + int div = p0 * y; // expected-note {{'div' initialized to 0}} + return 1 / div; // expected-note {{Division by zero}} \ + // expected-warning {{Division by zero}} + } +} // namespace test_tracking_of_lhs_multiplier + +namespace test_tracking_of_rhs_multiplier { + int f(int x, int y) { + bool p0 = x < 0; // expected-note {{Assuming 'x' is >= 0}} \ + // expected-note {{'p0' initialized to 0}} + int div = y * p0; // expected-note {{'div' initialized to 0}} + return 1 / div; // expected-note {{Division by zero}} \ + // expected-warning {{Division by zero}} + } +} // namespace test_tracking_of_rhs_multiplier + +namespace test_tracking_of_nested_multiplier { + int f(int x, int y, int z) { + bool p0 = x < 0; // expected-note {{Assuming 'x' is >= 0}} \ + // expected-note {{'p0' initialized to 0}} + int div = y*z*p0; // expected-note {{'div' initialized to 0}} + return 1 / div; // expected-note {{Division by zero}} \ + // expected-warning {{Division by zero}} + } +} // namespace test_tracking_of_nested_multiplier + +namespace test_tracking_through_multiple_stmts { + int f(int x, int y) { + bool p0 = x < 0; // expected-note {{Assuming 'x' is >= 0}} + bool p1 = p0 ? 0 : 1; // expected-note {{'p0' is false}} \ + // expected-note {{'?' condition is false}} + bool p2 = 1 - p1; // expected-note {{'p2' initialized to 0}} + int div = p2 * y; // expected-note {{'div' initialized to 0}} + return 1 / div; // expected-note {{Division by zero}} \ + // expected-warning {{Division by zero}} + } +} // namespace test_tracking_through_multiple_stmts + +namespace test_tracking_both_lhs_and_rhs { + int f(int x, int y) { + bool p0 = x < 0; // expected-note {{Assuming 'x' is >= 0}} \ + // expected-note {{'p0' initialized to 0}} + bool p1 = y < 0; // expected-note {{Assuming 'y' is >= 0}} \ + // expected-note {{'p1' initialized to 0}} + int div = p0 * p1; // expected-note {{'div' initialized to 0}} + return 1 / div; // expected-note {{Division by zero}} \ + // expected-warning {{Division by zero}} + } +} // namespace test_tracking_both_lhs_and_rhs + +namespace test_tracking_of_multiplier_and_parens { + int f(int x, int y, int z) { + bool p0 = x < 0; // expected-note {{Assuming 'x' is >= 0}} \ + // expected-note {{'p0' initialized to 0}} + int div = y*(z*p0); // expected-note {{'div' initialized to 0}} + return 1 / div; // expected-note {{Division by zero}} \ + // expected-warning {{Division by zero}} + } +} // namespace test_tracking_of_multiplier_and_parens + +namespace test_tracking_of_divisible { + int f(int x, int y) { + bool p0 = x < 0; // expected-note {{Assuming 'x' is >= 0}} \ + // expected-note {{'p0' initialized to 0}} + int div = p0 / y; // expected-note {{'div' initialized to 0}} + return 1 / div; // expected-note {{Division by zero}} \ + // expected-warning {{Division by zero}} + } +} // namespace test_tracking_of_divisible + +namespace test_tracking_of_modulo { + int f(int x, int y) { + bool p0 = x < 0; // expected-note {{Assuming 'x' is >= 0}} \ + // expected-note {{'p0' initialized to 0}} + int div = p0 % y; // expected-note {{'div' initialized to 0}} + return 1 / div; // expected-note {{Division by zero}} \ + // expected-warning {{Division by zero}} + } +} // namespace test_tracking_of_modulo + +namespace test_tracking_of_assignment { + int f(int x) { + bool p0 = x < 0; // expected-note {{Assuming 'x' is >= 0}} \ + // expected-note {{'p0' initialized to 0}} + int div = 1; + div *= p0; // expected-note {{The value 0 is assigned to 'div'}} + return 1 / div; // expected-note {{Division by zero}} \ + // expected-warning {{Division by zero}} + } +} // namespace test_tracking_of_assignment diff --git a/clang/test/Analysis/nullptr.cpp b/clang/test/Analysis/nullptr.cpp index e9b975c148aa..24b574a4ccfe 100644 --- a/clang/test/Analysis/nullptr.cpp +++ b/clang/test/Analysis/nullptr.cpp @@ -64,7 +64,7 @@ void zoo1backwards() { typedef __INTPTR_TYPE__ intptr_t; void zoo1multiply() { - char **p = 0; // FIXME-should-be-note:{{'p' initialized to a null pointer value}} + char **p = 0; // expected-note{{'p' initialized to a null pointer value}} delete *((char **)((intptr_t)p * 2)); // expected-warning{{Dereference of null pointer}} // expected-note@-1{{Dereference of null pointer}} } </cut>

3 years, 12 months

1
0
0 0

Moving to llvm@lists.linux.dev

by Nathan Chancellor

Hi everyone, We are shifting the ClangBuiltLinux mailing list from clang-built-linux(a)googlegroups.com to llvm(a)lists.linux.dev. Google Groups has served us well but moving to lists.linux.dev allows for easier archival (as we will be on lore.kernel.org automatically) and allows for people to subscribe to us easier, as they only need an email address, rather than a Google account. Please follow these directions to subscribe to the new mailing list: https://subspace.kernel.org/index.html#subscribing Some more information about lists.linux.dev: https://www.kernel.org/lists-linux-dev.html https://subspace.kernel.org/lists.linux.dev.html I have added CI maintainers/mailing lists that send us regular reports to this announcement. Please continue to send us emails about build results, just switch the email from clang-built-linux(a)googlegroups.com to llvm(a)lists.linux.dev so that they get archived as a part of lore and can be easily searched, especially with the upcoming https://x-lore.kernel.org/all/. I will send a patch shortly to update MAINTAINERS. Cheers, Nathan

3 years, 12 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tk1/llvm-release-arm-spec2k6-O2_LTO - Build # 13 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_tk1/llvm-release-arm-spec2k6-O2_LTO. So far, this commit has regressed CI configurations: - tcwg_bmk_llvm_tk1/llvm-release-arm-spec2k6-O2_LTO Culprit: <cut> commit 9be8f8b34d9b150cd1811e3556fe9d0cd735ae29 Author: Fangrui Song <i(a)maskray.me> Date: Thu Mar 25 21:55:27 2021 -0700 [sanitizer] Simplify GetTls with dl_iterate_phdr GetTls is the range of * thread control block and optional TLS_PRE_TCB_SIZE * static TLS blocks plus static TLS surplus On glibc, lsan requires the range to include `pthread::{specific_1stblock,specific}` so that allocations only referenced by `pthread_setspecific` can be scanned. This patch uses `dl_iterate_phdr` to collect TLS ranges. Find the one with `dlpi_tls_modid==1` as one of the initially loaded module, then find consecutive ranges. The boundaries give us addr and size. This allows us to drop the glibc internal `_dl_get_tls_static_info` and `InitTlsSize` entirely. Use the simplified method with non-Android Linux for now, but in theory this can be used with *BSD and potentially other ELF OSes. In the future, we can move `ThreadDescriptorSize` code to lsan (and consider intercepting `pthread_setspecific`) to avoid hacks in generic code. See https://reviews.llvm.org/D93972#2480556 for analysis on GetTls usage across various sanitizers. Differential Revision: https://reviews.llvm.org/D98926 </cut> Results regressed to (for first_bad == 9be8f8b34d9b150cd1811e3556fe9d0cd735ae29) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O2_LTO_marm artifacts/build-9be8f8b34d9b150cd1811e3556fe9d0cd735ae29/results_id: 1 # 456.hmmer,hmmer_base.default regressed by 103 from (for last_good == 9d375a40c3df90dd48edc0e1b1115c702c55d716) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O2_LTO_marm artifacts/build-9d375a40c3df90dd48edc0e1b1115c702c55d716/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Results ID of last_good: tk1_32/tcwg_bmk_llvm_tk1/bisect-llvm-release-arm-spec2k6-O2_LTO/4304 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Results ID of first_bad: tk1_32/tcwg_bmk_llvm_tk1/bisect-llvm-release-arm-spec2k6-O2_LTO/4302 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Configuration details: Reproduce builds: <cut> mkdir investigate-llvm-9be8f8b34d9b150cd1811e3556fe9d0cd735ae29 cd investigate-llvm-9be8f8b34d9b150cd1811e3556fe9d0cd735ae29 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach 9be8f8b34d9b150cd1811e3556fe9d0cd735ae29 ../artifacts/test.sh # Reproduce last_good build git checkout --detach 9d375a40c3df90dd48edc0e1b1115c702c55d716 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Full commit (up to 1000 lines): <cut> commit 9be8f8b34d9b150cd1811e3556fe9d0cd735ae29 Author: Fangrui Song <i(a)maskray.me> Date: Thu Mar 25 21:55:27 2021 -0700 [sanitizer] Simplify GetTls with dl_iterate_phdr GetTls is the range of * thread control block and optional TLS_PRE_TCB_SIZE * static TLS blocks plus static TLS surplus On glibc, lsan requires the range to include `pthread::{specific_1stblock,specific}` so that allocations only referenced by `pthread_setspecific` can be scanned. This patch uses `dl_iterate_phdr` to collect TLS ranges. Find the one with `dlpi_tls_modid==1` as one of the initially loaded module, then find consecutive ranges. The boundaries give us addr and size. This allows us to drop the glibc internal `_dl_get_tls_static_info` and `InitTlsSize` entirely. Use the simplified method with non-Android Linux for now, but in theory this can be used with *BSD and potentially other ELF OSes. In the future, we can move `ThreadDescriptorSize` code to lsan (and consider intercepting `pthread_setspecific`) to avoid hacks in generic code. See https://reviews.llvm.org/D93972#2480556 for analysis on GetTls usage across various sanitizers. Differential Revision: https://reviews.llvm.org/D98926 --- compiler-rt/lib/asan/asan_rtl.cpp | 5 +- compiler-rt/lib/asan/asan_thread.cpp | 2 +- compiler-rt/lib/hwasan/hwasan.cpp | 2 - compiler-rt/lib/lsan/lsan.cpp | 1 - compiler-rt/lib/memprof/memprof_rtl.cpp | 3 - compiler-rt/lib/msan/msan.cpp | 1 - .../lib/sanitizer_common/sanitizer_common.h | 1 - .../lib/sanitizer_common/sanitizer_fuchsia.cpp | 1 - compiler-rt/lib/sanitizer_common/sanitizer_linux.h | 1 - .../sanitizer_common/sanitizer_linux_libcdep.cpp | 231 ++++++++------------- compiler-rt/lib/sanitizer_common/sanitizer_mac.cpp | 3 - .../lib/sanitizer_common/sanitizer_rtems.cpp | 1 - compiler-rt/lib/sanitizer_common/sanitizer_win.cpp | 3 - .../tests/sanitizer_common_test.cpp | 2 - .../tests/sanitizer_linux_test.cpp | 17 +- compiler-rt/lib/tsan/rtl/tsan_platform_linux.cpp | 1 - 16 files changed, 91 insertions(+), 184 deletions(-) diff --git a/compiler-rt/lib/asan/asan_rtl.cpp b/compiler-rt/lib/asan/asan_rtl.cpp index 7b5a929963c6..106a52607631 100644 --- a/compiler-rt/lib/asan/asan_rtl.cpp +++ b/compiler-rt/lib/asan/asan_rtl.cpp @@ -490,9 +490,6 @@ static void AsanInitInternal() { if (flags()->start_deactivated) AsanDeactivate(); - // interceptors - InitTlsSize(); - // Create main thread. AsanThread *main_thread = CreateMainThread(); CHECK_EQ(0, main_thread->tid()); @@ -568,7 +565,7 @@ void UnpoisonStack(uptr bottom, uptr top, const char *type) { type, top, bottom, top - bottom, top - bottom); return; } - PoisonShadow(bottom, top - bottom, 0); + PoisonShadow(bottom, RoundUpTo(top - bottom, SHADOW_GRANULARITY), 0); } static void UnpoisonDefaultStack() { diff --git a/compiler-rt/lib/asan/asan_thread.cpp b/compiler-rt/lib/asan/asan_thread.cpp index ae3bcba204c6..f7778c0f1e34 100644 --- a/compiler-rt/lib/asan/asan_thread.cpp +++ b/compiler-rt/lib/asan/asan_thread.cpp @@ -307,7 +307,7 @@ void AsanThread::SetThreadStackAndTls(const InitOptions *options) { uptr stack_size = 0; GetThreadStackAndTls(tid() == 0, &stack_bottom_, &stack_size, &tls_begin_, &tls_size); - stack_top_ = stack_bottom_ + stack_size; + stack_top_ = RoundDownTo(stack_bottom_ + stack_size, SHADOW_GRANULARITY); tls_end_ = tls_begin_ + tls_size; dtls_ = DTLS_Get(); diff --git a/compiler-rt/lib/hwasan/hwasan.cpp b/compiler-rt/lib/hwasan/hwasan.cpp index 5c0d804561d2..ce08ec3508c4 100644 --- a/compiler-rt/lib/hwasan/hwasan.cpp +++ b/compiler-rt/lib/hwasan/hwasan.cpp @@ -265,8 +265,6 @@ void __hwasan_init() { hwasan_init_is_running = 1; SanitizerToolName = "HWAddressSanitizer"; - InitTlsSize(); - CacheBinaryName(); InitializeFlags(); diff --git a/compiler-rt/lib/lsan/lsan.cpp b/compiler-rt/lib/lsan/lsan.cpp index 2c0a3bf0787c..b264be0ba792 100644 --- a/compiler-rt/lib/lsan/lsan.cpp +++ b/compiler-rt/lib/lsan/lsan.cpp @@ -98,7 +98,6 @@ extern "C" void __lsan_init() { InitCommonLsan(); InitializeAllocator(); ReplaceSystemMalloc(); - InitTlsSize(); InitializeInterceptors(); InitializeThreadRegistry(); InstallDeadlySignalHandlers(LsanOnDeadlySignal); diff --git a/compiler-rt/lib/memprof/memprof_rtl.cpp b/compiler-rt/lib/memprof/memprof_rtl.cpp index d6d606f666ee..05759e406f7a 100644 --- a/compiler-rt/lib/memprof/memprof_rtl.cpp +++ b/compiler-rt/lib/memprof/memprof_rtl.cpp @@ -214,9 +214,6 @@ static void MemprofInitInternal() { InitializeCoverage(common_flags()->coverage, common_flags()->coverage_dir); - // interceptors - InitTlsSize(); - // Create main thread. MemprofThread *main_thread = CreateMainThread(); CHECK_EQ(0, main_thread->tid()); diff --git a/compiler-rt/lib/msan/msan.cpp b/compiler-rt/lib/msan/msan.cpp index 4be1630cd302..4ee7e2ec4dd6 100644 --- a/compiler-rt/lib/msan/msan.cpp +++ b/compiler-rt/lib/msan/msan.cpp @@ -436,7 +436,6 @@ void __msan_init() { InitializeInterceptors(); CheckASLR(); - InitTlsSize(); InstallDeadlySignalHandlers(MsanOnDeadlySignal); InstallAtExitHandler(); // Needs __cxa_atexit interceptor. diff --git a/compiler-rt/lib/sanitizer_common/sanitizer_common.h b/compiler-rt/lib/sanitizer_common/sanitizer_common.h index dcd625d30f77..2b2629fc12dd 100644 --- a/compiler-rt/lib/sanitizer_common/sanitizer_common.h +++ b/compiler-rt/lib/sanitizer_common/sanitizer_common.h @@ -284,7 +284,6 @@ void SetSandboxingCallback(void (*f)()); void InitializeCoverage(bool enabled, const char *coverage_dir); -void InitTlsSize(); uptr GetTlsSize(); // Other diff --git a/compiler-rt/lib/sanitizer_common/sanitizer_fuchsia.cpp b/compiler-rt/lib/sanitizer_common/sanitizer_fuchsia.cpp index 4f692f99c207..5d68ad8ee8e4 100644 --- a/compiler-rt/lib/sanitizer_common/sanitizer_fuchsia.cpp +++ b/compiler-rt/lib/sanitizer_common/sanitizer_fuchsia.cpp @@ -103,7 +103,6 @@ void DisableCoreDumperIfNecessary() {} void InstallDeadlySignalHandlers(SignalHandlerType handler) {} void SetAlternateSignalStack() {} void UnsetAlternateSignalStack() {} -void InitTlsSize() {} bool SignalContext::IsStackOverflow() const { return false; } void SignalContext::DumpAllRegisters(void *context) { UNIMPLEMENTED(); } diff --git a/compiler-rt/lib/sanitizer_common/sanitizer_linux.h b/compiler-rt/lib/sanitizer_common/sanitizer_linux.h index 41ae072d6cac..9a23fcfb3b93 100644 --- a/compiler-rt/lib/sanitizer_common/sanitizer_linux.h +++ b/compiler-rt/lib/sanitizer_common/sanitizer_linux.h @@ -98,7 +98,6 @@ class ThreadLister { // Exposed for testing. uptr ThreadDescriptorSize(); uptr ThreadSelf(); -uptr ThreadSelfOffset(); // Matches a library's file name against a base name (stripping path and version // information). diff --git a/compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cpp b/compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cpp index 613658147bbd..1177a1ceb14f 100644 --- a/compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cpp +++ b/compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cpp @@ -184,80 +184,8 @@ __attribute__((unused)) static bool GetLibcVersion(int *major, int *minor, #endif } -#if SANITIZER_GLIBC && !SANITIZER_GO -static uptr g_tls_size; - -#ifdef __i386__ -#define CHECK_GET_TLS_STATIC_INFO_VERSION (!__GLIBC_PREREQ(2, 27)) -#else -#define CHECK_GET_TLS_STATIC_INFO_VERSION 0 -#endif - -#if CHECK_GET_TLS_STATIC_INFO_VERSION -#define DL_INTERNAL_FUNCTION __attribute__((regparm(3), stdcall)) -#else -#define DL_INTERNAL_FUNCTION -#endif - -namespace { -struct GetTlsStaticInfoCall { - typedef void (*get_tls_func)(size_t*, size_t*); -}; -struct GetTlsStaticInfoRegparmCall { - typedef void (*get_tls_func)(size_t*, size_t*) DL_INTERNAL_FUNCTION; -}; - -template <typename T> -void CallGetTls(void* ptr, size_t* size, size_t* align) { - typename T::get_tls_func get_tls; - CHECK_EQ(sizeof(get_tls), sizeof(ptr)); - internal_memcpy(&get_tls, &ptr, sizeof(ptr)); - CHECK_NE(get_tls, 0); - get_tls(size, align); -} - -bool CmpLibcVersion(int major, int minor, int patch) { - int ma; - int mi; - int pa; - if (!GetLibcVersion(&ma, &mi, &pa)) - return false; - if (ma > major) - return true; - if (ma < major) - return false; - if (mi > minor) - return true; - if (mi < minor) - return false; - return pa >= patch; -} - -} // namespace - -void InitTlsSize() { - // all current supported platforms have 16 bytes stack alignment - const size_t kStackAlign = 16; - void *get_tls_static_info_ptr = dlsym(RTLD_NEXT, "_dl_get_tls_static_info"); - size_t tls_size = 0; - size_t tls_align = 0; - // On i?86, _dl_get_tls_static_info used to be internal_function, i.e. - // __attribute__((regparm(3), stdcall)) before glibc 2.27 and is normal - // function in 2.27 and later. - if (CHECK_GET_TLS_STATIC_INFO_VERSION && !CmpLibcVersion(2, 27, 0)) - CallGetTls<GetTlsStaticInfoRegparmCall>(get_tls_static_info_ptr, - &tls_size, &tls_align); - else - CallGetTls<GetTlsStaticInfoCall>(get_tls_static_info_ptr, - &tls_size, &tls_align); - if (tls_align < kStackAlign) - tls_align = kStackAlign; - g_tls_size = RoundUpTo(tls_size, tls_align); -} -#else -void InitTlsSize() { } -#endif // SANITIZER_GLIBC && !SANITIZER_GO - +// ThreadDescriptorSize() is only used by lsan to get the pointer to +// thread-specific data keys in the thread control block. #if (defined(__x86_64__) || defined(__i386__) || defined(__mips__) || \ defined(__aarch64__) || defined(__powerpc64__) || defined(__s390__) || \ defined(__arm__) || SANITIZER_RISCV64) && \ @@ -330,13 +258,6 @@ uptr ThreadDescriptorSize() { return val; } -// The offset at which pointer to self is located in the thread descriptor. -const uptr kThreadSelfOffset = FIRST_32_SECOND_64(8, 16); - -uptr ThreadSelfOffset() { - return kThreadSelfOffset; -} - #if defined(__mips__) || defined(__powerpc64__) || SANITIZER_RISCV64 // TlsPreTcbSize includes size of struct pthread_descr and size of tcb // head structure. It lies before the static tls blocks. @@ -355,48 +276,61 @@ static uptr TlsPreTcbSize() { } #endif -uptr ThreadSelf() { - uptr descr_addr; -#if defined(__i386__) - asm("mov %%gs:%c1,%0" : "=r"(descr_addr) : "i"(kThreadSelfOffset)); -#elif defined(__x86_64__) - asm("mov %%fs:%c1,%0" : "=r"(descr_addr) : "i"(kThreadSelfOffset)); -#elif defined(__mips__) - // MIPS uses TLS variant I. The thread pointer (in hardware register $29) - // points to the end of the TCB + 0x7000. The pthread_descr structure is - // immediately in front of the TCB. TlsPreTcbSize() includes the size of the - // TCB and the size of pthread_descr. - const uptr kTlsTcbOffset = 0x7000; - uptr thread_pointer; - asm volatile(".set push;\ - .set mips64r2;\ - rdhwr %0,$29;\ - .set pop" : "=r" (thread_pointer)); - descr_addr = thread_pointer - kTlsTcbOffset - TlsPreTcbSize(); -#elif defined(__aarch64__) || defined(__arm__) - descr_addr = reinterpret_cast<uptr>(__builtin_thread_pointer()) - - ThreadDescriptorSize(); -#elif SANITIZER_RISCV64 - // https://github.com/riscv/riscv-elf-psabi-doc/issues/53 - uptr thread_pointer = reinterpret_cast<uptr>(__builtin_thread_pointer()); - descr_addr = thread_pointer - TlsPreTcbSize(); -#elif defined(__s390__) - descr_addr = reinterpret_cast<uptr>(__builtin_thread_pointer()); -#elif defined(__powerpc64__) - // PPC64LE uses TLS variant I. The thread pointer (in GPR 13) - // points to the end of the TCB + 0x7000. The pthread_descr structure is - // immediately in front of the TCB. TlsPreTcbSize() includes the size of the - // TCB and the size of pthread_descr. - const uptr kTlsTcbOffset = 0x7000; - uptr thread_pointer; - asm("addi %0,13,%1" : "=r"(thread_pointer) : "I"(-kTlsTcbOffset)); - descr_addr = thread_pointer - TlsPreTcbSize(); -#else -#error "unsupported CPU arch" -#endif - return descr_addr; +#if !SANITIZER_GO +namespace { +struct TlsRange { + uptr begin, end, align; + size_t tls_modid; + bool operator<(const TlsRange &rhs) const { return begin < rhs.begin; } +}; +} // namespace + +static int CollectStaticTlsRanges(struct dl_phdr_info *info, size_t size, + void *data) { + if (!info->dlpi_tls_data) + return 0; + const uptr begin = (uptr)info->dlpi_tls_data; + for (unsigned i = 0; i != info->dlpi_phnum; ++i) + if (info->dlpi_phdr[i].p_type == PT_TLS) { + static_cast<InternalMmapVector<TlsRange> *>(data)->push_back( + TlsRange{begin, begin + info->dlpi_phdr[i].p_memsz, + info->dlpi_phdr[i].p_align, info->dlpi_tls_modid}); + break; + } + return 0; } -#endif // (x86_64 || i386 || MIPS) && SANITIZER_LINUX + +static void GetStaticTlsRange(uptr *addr, uptr *size) { + InternalMmapVector<TlsRange> ranges; + dl_iterate_phdr(CollectStaticTlsRanges, &ranges); + uptr len = ranges.size(); + Sort(ranges.begin(), len); + // Find the range with tls_modid=1. For glibc, because libc.so uses PT_TLS, + // this module is guaranteed to exist and is one of the initially loaded + // modules. + uptr one = 0; + while (one != len && ranges[one].tls_modid != 1) ++one; + if (one == len) { + // This may happen with musl if no module uses PT_TLS. + *addr = 0; + *size = 0; + return; + } + // Find the maximum consecutive ranges. We consider two modules consecutive if + // the gap is smaller than the alignment. The dynamic loader places static TLS + // blocks this way not to waste space. + uptr l = one; + while (l != 0 && ranges[l].begin < ranges[l - 1].end + ranges[l - 1].align) + --l; + uptr r = one + 1; + while (r != len && ranges[r].begin < ranges[r - 1].end + ranges[r - 1].align) + ++r; + *addr = ranges[l].begin; + *size = ranges[r - 1].end - ranges[l].begin; +} +#endif // !SANITIZER_GO +#endif // (x86_64 || i386 || mips || ...) && SANITIZER_LINUX && + // !SANITIZER_ANDROID #if SANITIZER_FREEBSD static void **ThreadSelfSegbase() { @@ -468,18 +402,36 @@ static void GetTls(uptr *addr, uptr *size) { *size = 0; } #elif SANITIZER_LINUX + GetStaticTlsRange(addr, size); #if defined(__x86_64__) || defined(__i386__) || defined(__s390__) - *addr = ThreadSelf(); - *size = GetTlsSize(); - *addr -= *size; - *addr += ThreadDescriptorSize(); -#elif defined(__mips__) || defined(__aarch64__) || defined(__powerpc64__) || \ - defined(__arm__) || SANITIZER_RISCV64 - *addr = ThreadSelf(); - *size = GetTlsSize(); + // lsan requires the range to additionally cover the static TLS surplus + // (elf/dl-tls.c defines 1664). Otherwise there may be false positives for + // allocations only referenced by tls in dynamically loaded modules. + if (SANITIZER_GLIBC) { + *addr -= 1664; + *size += 1664; + } + // Extend the range to include the thread control block. On glibc, lsan needs + // the range to include pthread::{specific_1stblock,specific} so that + // allocations only referenced by pthread_setspecific can be scanned. This may + // underestimate by at most TLS_TCB_ALIGN-1 bytes but it should be fine + // because the number of bytes after pthread::specific is larger. + *size += ThreadDescriptorSize(); #else - *addr = 0; - *size = 0; + if (SANITIZER_GLIBC) + *size += 1664; +#if defined(__mips__) || defined(__powerpc64__) || SANITIZER_RISCV64 + const uptr pre_tcb_size = TlsPreTcbSize(); + *addr -= pre_tcb_size; + *size += pre_tcb_size; +#else + // arm and aarch64 reserve two words at TP, so this underestimates the range. + // However, this is sufficient for the purpose of finding the pointers to + // thread-specific data keys. + const uptr tcb_size = ThreadDescriptorSize(); + *addr -= tcb_size; + *size += tcb_size; +#endif #endif #elif SANITIZER_FREEBSD void** segbase = ThreadSelfSegbase(); @@ -520,17 +472,11 @@ static void GetTls(uptr *addr, uptr *size) { #if !SANITIZER_GO uptr GetTlsSize() { -#if SANITIZER_FREEBSD || SANITIZER_ANDROID || SANITIZER_NETBSD || \ +#if SANITIZER_FREEBSD || SANITIZER_LINUX || SANITIZER_NETBSD || \ SANITIZER_SOLARIS uptr addr, size; GetTls(&addr, &size); return size; -#elif SANITIZER_GLIBC -#if defined(__mips__) || defined(__powerpc64__) || SANITIZER_RISCV64 - return RoundUpTo(g_tls_size + TlsPreTcbSize(), 16); -#else - return g_tls_size; -#endif #else return 0; #endif @@ -553,10 +499,9 @@ void GetThreadStackAndTls(bool main, uptr *stk_addr, uptr *stk_size, if (!main) { // If stack and tls intersect, make them non-intersecting. if (*tls_addr > *stk_addr && *tls_addr < *stk_addr + *stk_size) { - CHECK_GT(*tls_addr + *tls_size, *stk_addr); - CHECK_LE(*tls_addr + *tls_size, *stk_addr + *stk_size); - *stk_size -= *tls_size; - *tls_addr = *stk_addr + *stk_size; + if (*stk_addr + *stk_size < *tls_addr + *tls_size) + *tls_size = *stk_addr + *stk_size - *tls_addr; + *stk_size = *tls_addr - *stk_addr; } } #endif diff --git a/compiler-rt/lib/sanitizer_common/sanitizer_mac.cpp b/compiler-rt/lib/sanitizer_common/sanitizer_mac.cpp index d7b0bde173c8..5055df1ec29a 100644 --- a/compiler-rt/lib/sanitizer_common/sanitizer_mac.cpp +++ b/compiler-rt/lib/sanitizer_common/sanitizer_mac.cpp @@ -548,9 +548,6 @@ uptr GetTlsSize() { return 0; } -void InitTlsSize() { -} - uptr TlsBaseAddr() { uptr segbase = 0; #if defined(__x86_64__) diff --git a/compiler-rt/lib/sanitizer_common/sanitizer_rtems.cpp b/compiler-rt/lib/sanitizer_common/sanitizer_rtems.cpp index d58bd08fb1a8..01554349cc04 100644 --- a/compiler-rt/lib/sanitizer_common/sanitizer_rtems.cpp +++ b/compiler-rt/lib/sanitizer_common/sanitizer_rtems.cpp @@ -106,7 +106,6 @@ void DisableCoreDumperIfNecessary() {} void InstallDeadlySignalHandlers(SignalHandlerType handler) {} void SetAlternateSignalStack() {} void UnsetAlternateSignalStack() {} -void InitTlsSize() {} void SignalContext::DumpAllRegisters(void *context) {} const char *DescribeSignalOrException(int signo) { UNIMPLEMENTED(); } diff --git a/compiler-rt/lib/sanitizer_common/sanitizer_win.cpp b/compiler-rt/lib/sanitizer_common/sanitizer_win.cpp index f383e130fa59..d47ccad1764d 100644 --- a/compiler-rt/lib/sanitizer_common/sanitizer_win.cpp +++ b/compiler-rt/lib/sanitizer_common/sanitizer_win.cpp @@ -846,9 +846,6 @@ uptr GetTlsSize() { return 0; } -void InitTlsSize() { -} - void GetThreadStackAndTls(bool main, uptr *stk_addr, uptr *stk_size, uptr *tls_addr, uptr *tls_size) { #if SANITIZER_GO diff --git a/compiler-rt/lib/sanitizer_common/tests/sanitizer_common_test.cpp b/compiler-rt/lib/sanitizer_common/tests/sanitizer_common_test.cpp index 80df9b497b2d..21c6b036b956 100644 --- a/compiler-rt/lib/sanitizer_common/tests/sanitizer_common_test.cpp +++ b/compiler-rt/lib/sanitizer_common/tests/sanitizer_common_test.cpp @@ -210,12 +210,10 @@ static void *WorkerThread(void *arg) { } TEST(SanitizerCommon, ThreadStackTlsMain) { - InitTlsSize(); TestThreadInfo(true); } TEST(SanitizerCommon, ThreadStackTlsWorker) { - InitTlsSize(); pthread_t t; PTHREAD_CREATE(&t, 0, WorkerThread, 0); PTHREAD_JOIN(t, 0); diff --git a/compiler-rt/lib/sanitizer_common/tests/sanitizer_linux_test.cpp b/compiler-rt/lib/sanitizer_common/tests/sanitizer_linux_test.cpp index cb6c0724ac88..025cba922d2d 100644 --- a/compiler-rt/lib/sanitizer_common/tests/sanitizer_linux_test.cpp +++ b/compiler-rt/lib/sanitizer_common/tests/sanitizer_linux_test.cpp @@ -188,24 +188,9 @@ TEST(SanitizerCommon, SetEnvTest) { } #if (defined(__x86_64__) || defined(__i386__)) && !SANITIZER_ANDROID -void *thread_self_offset_test_func(void *arg) { - bool result = - *(uptr *)((char *)ThreadSelf() + ThreadSelfOffset()) == ThreadSelf(); - return (void *)result; -} - -TEST(SanitizerLinux, ThreadSelfOffset) { - EXPECT_TRUE((bool)thread_self_offset_test_func(0)); - pthread_t tid; - void *result; - ASSERT_EQ(0, pthread_create(&tid, 0, thread_self_offset_test_func, 0)); - ASSERT_EQ(0, pthread_join(tid, &result)); - EXPECT_TRUE((bool)result); -} - // libpthread puts the thread descriptor at the end of stack space. void *thread_descriptor_size_test_func(void *arg) { - uptr descr_addr = ThreadSelf(); + uptr descr_addr = (uptr)pthread_self(); pthread_attr_t attr; pthread_getattr_np(pthread_self(), &attr); void *stackaddr; diff --git a/compiler-rt/lib/tsan/rtl/tsan_platform_linux.cpp b/compiler-rt/lib/tsan/rtl/tsan_platform_linux.cpp index 45acfe66ff3f..0d26f497f2bd 100644 --- a/compiler-rt/lib/tsan/rtl/tsan_platform_linux.cpp +++ b/compiler-rt/lib/tsan/rtl/tsan_platform_linux.cpp @@ -318,7 +318,6 @@ void InitializePlatform() { } CheckAndProtect(); - InitTlsSize(); #endif // !SANITIZER_GO } </cut>

3 years, 12 months

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tk1/llvm-release-arm-spec2k6-O3_LTO - Build # 9 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_tk1/llvm-release-arm-spec2k6-O3_LTO. So far, this commit has regressed CI configurations: - tcwg_bmk_llvm_tk1/llvm-release-arm-spec2k6-O3_LTO Culprit: <cut> commit a26f1bf67ec70f72e64101cf483b26466928fc38 Author: Roman Lebedev <lebedev.ri(a)gmail.com> Date: Fri Apr 2 10:40:12 2021 +0300 [PassManager] Run additional LICM before LoopRotate Loop rotation often has to perform code duplication from header into preheader, which introduces PHI nodes. >>! In D99204, @thopre wrote: > > With loop peeling, it is important that unnecessary PHIs be avoided or > it will leads to spurious peeling. One source of such PHIs is loop > rotation which creates PHIs for invariant loads. Those PHIs are > particularly problematic since loop peeling is now run as part of simple > loop unrolling before GVN is run, and are thus a source of spurious > peeling. > > Note that while some of the load can be hoisted and eventually > eliminated by instruction combine, this is not always possible due to > alignment issue. In particular, the motivating example [1] was a load > inside a class instance which cannot be hoisted because the `this' > pointer has an alignment of 1. > > [1] http://lists.llvm.org/pipermail/llvm-dev/attachments/20210312/4ce73c47/atta… Now, we could enhance LoopRotate to avoid duplicating code when not needed, but instead hoist loop-invariant code, but isn't that a code duplication? (*sic*) We have LICM, and in fact we already run it right after LoopRotation. We could try to move it to before LoopRotation, that is basically free from compile-time perspective: https://llvm-compile-time-tracker.com/compare.php?from=6c93eb4477d88af046b9… But, looking at stats, i think it isn't great that we would no longer do LICM after LoopRotation, in particular: | statistic name | LoopRotate-LICM | LICM-LoopRotate | Δ | % | abs(%) | | asm-printer.EmittedInsts | 9015930 | 9015799 | -131 | 0.00% | 0.00% | | indvars.NumElimCmp | 3536 | 3544 | 8 | 0.23% | 0.23% | | indvars.NumElimExt | 36725 | 36580 | -145 | -0.39% | 0.39% | | indvars.NumElimIV | 1197 | 1187 | -10 | -0.84% | 0.84% | | indvars.NumElimIdentity | 143 | 136 | -7 | -4.90% | 4.90% | | indvars.NumElimRem | 4 | 5 | 1 | 25.00% | 25.00% | | indvars.NumLFTR | 29842 | 29890 | 48 | 0.16% | 0.16% | | indvars.NumReplaced | 2293 | 2227 | -66 | -2.88% | 2.88% | | indvars.NumSimplifiedSDiv | 6 | 8 | 2 | 33.33% | 33.33% | | indvars.NumWidened | 26438 | 26329 | -109 | -0.41% | 0.41% | | instcount.TotalBlocks | 1178338 | 1173840 | -4498 | -0.38% | 0.38% | | instcount.TotalFuncs | 111825 | 111829 | 4 | 0.00% | 0.00% | | instcount.TotalInsts | 9905442 | 9896139 | -9303 | -0.09% | 0.09% | | lcssa.NumLCSSA | 425871 | 423961 | -1910 | -0.45% | 0.45% | | licm.NumHoisted | 378357 | 378753 | 396 | 0.10% | 0.10% | | licm.NumMovedCalls | 2193 | 2208 | 15 | 0.68% | 0.68% | | licm.NumMovedLoads | 35899 | 31821 | -4078 | -11.36% | 11.36% | | licm.NumPromoted | 11178 | 11154 | -24 | -0.21% | 0.21% | | licm.NumSunk | 13359 | 13587 | 228 | 1.71% | 1.71% | | loop-delete.NumDeleted | 8547 | 8402 | -145 | -1.70% | 1.70% | | loop-instsimplify.NumSimplified | 12876 | 11890 | -986 | -7.66% | 7.66% | | loop-peel.NumPeeled | 1008 | 925 | -83 | -8.23% | 8.23% | | loop-rotate.NumNotRotatedDueToHeaderSize | 368 | 365 | -3 | -0.82% | 0.82% | | loop-rotate.NumRotated | 42015 | 42003 | -12 | -0.03% | 0.03% | | loop-simplifycfg.NumLoopBlocksDeleted | 240 | 242 | 2 | 0.83% | 0.83% | | loop-simplifycfg.NumLoopExitsDeleted | 497 | 20 | -477 | -95.98% | 95.98% | | loop-simplifycfg.NumTerminatorsFolded | 618 | 336 | -282 | -45.63% | 45.63% | | loop-unroll.NumCompletelyUnrolled | 11028 | 11032 | 4 | 0.04% | 0.04% | | loop-unroll.NumUnrolled | 12608 | 12529 | -79 | -0.63% | 0.63% | | mem2reg.NumDeadAlloca | 10222 | 10221 | -1 | -0.01% | 0.01% | | mem2reg.NumPHIInsert | 192110 | 192106 | -4 | 0.00% | 0.00% | | mem2reg.NumSingleStore | 637650 | 637643 | -7 | 0.00% | 0.00% | | scalar-evolution.NumBruteForceTripCountsComputed | 814 | 812 | -2 | -0.25% | 0.25% | | scalar-evolution.NumTripCountsComputed | 283108 | 282934 | -174 | -0.06% | 0.06% | | scalar-evolution.NumTripCountsNotComputed | 106712 | 106718 | 6 | 0.01% | 0.01% | | simple-loop-unswitch.NumBranches | 5178 | 4752 | -426 | -8.23% | 8.23% | | simple-loop-unswitch.NumCostMultiplierSkipped | 914 | 503 | -411 | -44.97% | 44.97% | | simple-loop-unswitch.NumSwitches | 20 | 18 | -2 | -10.00% | 10.00% | | simple-loop-unswitch.NumTrivial | 183 | 95 | -88 | -48.09% | 48.09% | ... but that actually regresses LICM (-12% `licm.NumMovedLoads`), loop-simplifycfg (`NumLoopExitsDeleted`, `NumTerminatorsFolded`), simple-loop-unswitch (`NumTrivial`). What if we instead have LICM both before and after LoopRotate? | statistic name | LoopRotate-LICM | LICM-LoopRotate-LICM | Δ | % | abs(%) | | asm-printer.EmittedInsts | 9015930 | 9014474 | -1456 | -0.02% | 0.02% | | indvars.NumElimCmp | 3536 | 3546 | 10 | 0.28% | 0.28% | | indvars.NumElimExt | 36725 | 36681 | -44 | -0.12% | 0.12% | | indvars.NumElimIV | 1197 | 1185 | -12 | -1.00% | 1.00% | | indvars.NumElimIdentity | 143 | 146 | 3 | 2.10% | 2.10% | | indvars.NumElimRem | 4 | 5 | 1 | 25.00% | 25.00% | | indvars.NumLFTR | 29842 | 29899 | 57 | 0.19% | 0.19% | | indvars.NumReplaced | 2293 | 2299 | 6 | 0.26% | 0.26% | | indvars.NumSimplifiedSDiv | 6 | 8 | 2 | 33.33% | 33.33% | | indvars.NumWidened | 26438 | 26404 | -34 | -0.13% | 0.13% | | instcount.TotalBlocks | 1178338 | 1173652 | -4686 | -0.40% | 0.40% | | instcount.TotalFuncs | 111825 | 111829 | 4 | 0.00% | 0.00% | | instcount.TotalInsts | 9905442 | 9895452 | -9990 | -0.10% | 0.10% | | lcssa.NumLCSSA | 425871 | 425373 | -498 | -0.12% | 0.12% | | licm.NumHoisted | 378357 | 383352 | 4995 | 1.32% | 1.32% | | licm.NumMovedCalls | 2193 | 2204 | 11 | 0.50% | 0.50% | | licm.NumMovedLoads | 35899 | 35755 | -144 | -0.40% | 0.40% | | licm.NumPromoted | 11178 | 11163 | -15 | -0.13% | 0.13% | | licm.NumSunk | 13359 | 14321 | 962 | 7.20% | 7.20% | | loop-delete.NumDeleted | 8547 | 8538 | -9 | -0.11% | 0.11% | | loop-instsimplify.NumSimplified | 12876 | 12041 | -835 | -6.48% | 6.48% | | loop-peel.NumPeeled | 1008 | 924 | -84 | -8.33% | 8.33% | | loop-rotate.NumNotRotatedDueToHeaderSize | 368 | 365 | -3 | -0.82% | 0.82% | | loop-rotate.NumRotated | 42015 | 42005 | -10 | -0.02% | 0.02% | | loop-simplifycfg.NumLoopBlocksDeleted | 240 | 241 | 1 | 0.42% | 0.42% | | loop-simplifycfg.NumTerminatorsFolded | 618 | 619 | 1 | 0.16% | 0.16% | | loop-unroll.NumCompletelyUnrolled | 11028 | 11029 | 1 | 0.01% | 0.01% | | loop-unroll.NumUnrolled | 12608 | 12525 | -83 | -0.66% | 0.66% | | mem2reg.NumPHIInsert | 192110 | 192073 | -37 | -0.02% | 0.02% | | mem2reg.NumSingleStore | 637650 | 637652 | 2 | 0.00% | 0.00% | | scalar-evolution.NumTripCountsComputed | 283108 | 282998 | -110 | -0.04% | 0.04% | | scalar-evolution.NumTripCountsNotComputed | 106712 | 106691 | -21 | -0.02% | 0.02% | | simple-loop-unswitch.NumBranches | 5178 | 5185 | 7 | 0.14% | 0.14% | | simple-loop-unswitch.NumCostMultiplierSkipped | 914 | 925 | 11 | 1.20% | 1.20% | | simple-loop-unswitch.NumTrivial | 183 | 179 | -4 | -2.19% | 2.19% | | simple-loop-unswitch.NumBranches | 5178 | 4752 | -426 | -8.23% | 8.23% | | simple-loop-unswitch.NumCostMultiplierSkipped | 914 | 503 | -411 | -44.97% | 44.97% | | simple-loop-unswitch.NumSwitches | 20 | 18 | -2 | -10.00% | 10.00% | | simple-loop-unswitch.NumTrivial | 183 | 95 | -88 | -48.09% | 48.09% | I.e. we end up with less instructions, less peeling, more LICM activity, also note how none of those 4 regressions are here. Namely: | statistic name | LICM-LoopRotate | LICM-LoopRotate-LICM | Δ | % | abs(%) | | asm-printer.EmittedInsts | 9015799 | 9014474 | -1325 | -0.01% | 0.01% | | indvars.NumElimCmp | 3544 | 3546 | 2 | 0.06% | 0.06% | | indvars.NumElimExt | 36580 | 36681 | 101 | 0.28% | 0.28% | | indvars.NumElimIV | 1187 | 1185 | -2 | -0.17% | 0.17% | | indvars.NumElimIdentity | 136 | 146 | 10 | 7.35% | 7.35% | | indvars.NumLFTR | 29890 | 29899 | 9 | 0.03% | 0.03% | | indvars.NumReplaced | 2227 | 2299 | 72 | 3.23% | 3.23% | | indvars.NumWidened | 26329 | 26404 | 75 | 0.28% | 0.28% | | instcount.TotalBlocks | 1173840 | 1173652 | -188 | -0.02% | 0.02% | | instcount.TotalInsts | 9896139 | 9895452 | -687 | -0.01% | 0.01% | | lcssa.NumLCSSA | 423961 | 425373 | 1412 | 0.33% | 0.33% | | licm.NumHoisted | 378753 | 383352 | 4599 | 1.21% | 1.21% | | licm.NumMovedCalls | 2208 | 2204 | -4 | -0.18% | 0.18% | | licm.NumMovedLoads | 31821 | 35755 | 3934 | 12.36% | 12.36% | | licm.NumPromoted | 11154 | 11163 | 9 | 0.08% | 0.08% | | licm.NumSunk | 13587 | 14321 | 734 | 5.40% | 5.40% | | loop-delete.NumDeleted | 8402 | 8538 | 136 | 1.62% | 1.62% | | loop-instsimplify.NumSimplified | 11890 | 12041 | 151 | 1.27% | 1.27% | | loop-peel.NumPeeled | 925 | 924 | -1 | -0.11% | 0.11% | | loop-rotate.NumRotated | 42003 | 42005 | 2 | 0.00% | 0.00% | | loop-simplifycfg.NumLoopBlocksDeleted | 242 | 241 | -1 | -0.41% | 0.41% | | loop-simplifycfg.NumLoopExitsDeleted | 20 | 497 | 477 | 2385.00% | 2385.00% | | loop-simplifycfg.NumTerminatorsFolded | 336 | 619 | 283 | 84.23% | 84.23% | | loop-unroll.NumCompletelyUnrolled | 11032 | 11029 | -3 | -0.03% | 0.03% | | loop-unroll.NumUnrolled | 12529 | 12525 | -4 | -0.03% | 0.03% | | mem2reg.NumDeadAlloca | 10221 | 10222 | 1 | 0.01% | 0.01% | | mem2reg.NumPHIInsert | 192106 | 192073 | -33 | -0.02% | 0.02% | | mem2reg.NumSingleStore | 637643 | 637652 | 9 | 0.00% | 0.00% | | scalar-evolution.NumBruteForceTripCountsComputed | 812 | 814 | 2 | 0.25% | 0.25% | | scalar-evolution.NumTripCountsComputed | 282934 | 282998 | 64 | 0.02% | 0.02% | | scalar-evolution.NumTripCountsNotComputed | 106718 | 106691 | -27 | -0.03% | 0.03% | | simple-loop-unswitch.NumBranches | 4752 | 5185 | 433 | 9.11% | 9.11% | | simple-loop-unswitch.NumCostMultiplierSkipped | 503 | 925 | 422 | 83.90% | 83.90% | | simple-loop-unswitch.NumSwitches | 18 | 20 | 2 | 11.11% | 11.11% | | simple-loop-unswitch.NumTrivial | 95 | 179 | 84 | 88.42% | 88.42% | {F15983613} {F15983615} {F15983616} (this is vanilla llvm testsuite + rawspeed + darktable) As an example of the code where early LICM only is bad, see: https://godbolt.org/z/GzEbacs4K This does have an observable compile-time regression of +~0.5% geomean https://llvm-compile-time-tracker.com/compare.php?from=7c5222e4d1a3a14f029e… but i think that's basically nothing, and there's potential that it might be avoidable in the future by fixing clang to produce alignment information on function arguments, thus making the second run unneeded. Differential Revision: https://reviews.llvm.org/D99249 </cut> Results regressed to (for first_bad == a26f1bf67ec70f72e64101cf483b26466928fc38) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O3_LTO_marm artifacts/build-a26f1bf67ec70f72e64101cf483b26466928fc38/results_id: 1 # 462.libquantum,libquantum_base.default regressed by 104 from (for last_good == bb1e5399e4586239d6424f5eea5a9f06c52ebe9b) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O3_LTO_marm artifacts/build-baseline/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Results ID of last_good: tk1_32/tcwg_bmk_llvm_tk1/baseline-llvm-release-arm-spec2k6-O3_LTO/4271 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Results ID of first_bad: tk1_32/tcwg_bmk_llvm_tk1/bisect-llvm-release-arm-spec2k6-O3_LTO/4289 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Configuration details: Reproduce builds: <cut> mkdir investigate-llvm-a26f1bf67ec70f72e64101cf483b26466928fc38 cd investigate-llvm-a26f1bf67ec70f72e64101cf483b26466928fc38 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach a26f1bf67ec70f72e64101cf483b26466928fc38 ../artifacts/test.sh # Reproduce last_good build git checkout --detach bb1e5399e4586239d6424f5eea5a9f06c52ebe9b ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Full commit (up to 1000 lines): <cut> commit a26f1bf67ec70f72e64101cf483b26466928fc38 Author: Roman Lebedev <lebedev.ri(a)gmail.com> Date: Fri Apr 2 10:40:12 2021 +0300 [PassManager] Run additional LICM before LoopRotate Loop rotation often has to perform code duplication from header into preheader, which introduces PHI nodes. >>! In D99204, @thopre wrote: > > With loop peeling, it is important that unnecessary PHIs be avoided or > it will leads to spurious peeling. One source of such PHIs is loop > rotation which creates PHIs for invariant loads. Those PHIs are > particularly problematic since loop peeling is now run as part of simple > loop unrolling before GVN is run, and are thus a source of spurious > peeling. > > Note that while some of the load can be hoisted and eventually > eliminated by instruction combine, this is not always possible due to > alignment issue. In particular, the motivating example [1] was a load > inside a class instance which cannot be hoisted because the `this' > pointer has an alignment of 1. > > [1] http://lists.llvm.org/pipermail/llvm-dev/attachments/20210312/4ce73c47/atta… Now, we could enhance LoopRotate to avoid duplicating code when not needed, but instead hoist loop-invariant code, but isn't that a code duplication? (*sic*) We have LICM, and in fact we already run it right after LoopRotation. We could try to move it to before LoopRotation, that is basically free from compile-time perspective: https://llvm-compile-time-tracker.com/compare.php?from=6c93eb4477d88af046b9… But, looking at stats, i think it isn't great that we would no longer do LICM after LoopRotation, in particular: | statistic name | LoopRotate-LICM | LICM-LoopRotate | Δ | % | abs(%) | | asm-printer.EmittedInsts | 9015930 | 9015799 | -131 | 0.00% | 0.00% | | indvars.NumElimCmp | 3536 | 3544 | 8 | 0.23% | 0.23% | | indvars.NumElimExt | 36725 | 36580 | -145 | -0.39% | 0.39% | | indvars.NumElimIV | 1197 | 1187 | -10 | -0.84% | 0.84% | | indvars.NumElimIdentity | 143 | 136 | -7 | -4.90% | 4.90% | | indvars.NumElimRem | 4 | 5 | 1 | 25.00% | 25.00% | | indvars.NumLFTR | 29842 | 29890 | 48 | 0.16% | 0.16% | | indvars.NumReplaced | 2293 | 2227 | -66 | -2.88% | 2.88% | | indvars.NumSimplifiedSDiv | 6 | 8 | 2 | 33.33% | 33.33% | | indvars.NumWidened | 26438 | 26329 | -109 | -0.41% | 0.41% | | instcount.TotalBlocks | 1178338 | 1173840 | -4498 | -0.38% | 0.38% | | instcount.TotalFuncs | 111825 | 111829 | 4 | 0.00% | 0.00% | | instcount.TotalInsts | 9905442 | 9896139 | -9303 | -0.09% | 0.09% | | lcssa.NumLCSSA | 425871 | 423961 | -1910 | -0.45% | 0.45% | | licm.NumHoisted | 378357 | 378753 | 396 | 0.10% | 0.10% | | licm.NumMovedCalls | 2193 | 2208 | 15 | 0.68% | 0.68% | | licm.NumMovedLoads | 35899 | 31821 | -4078 | -11.36% | 11.36% | | licm.NumPromoted | 11178 | 11154 | -24 | -0.21% | 0.21% | | licm.NumSunk | 13359 | 13587 | 228 | 1.71% | 1.71% | | loop-delete.NumDeleted | 8547 | 8402 | -145 | -1.70% | 1.70% | | loop-instsimplify.NumSimplified | 12876 | 11890 | -986 | -7.66% | 7.66% | | loop-peel.NumPeeled | 1008 | 925 | -83 | -8.23% | 8.23% | | loop-rotate.NumNotRotatedDueToHeaderSize | 368 | 365 | -3 | -0.82% | 0.82% | | loop-rotate.NumRotated | 42015 | 42003 | -12 | -0.03% | 0.03% | | loop-simplifycfg.NumLoopBlocksDeleted | 240 | 242 | 2 | 0.83% | 0.83% | | loop-simplifycfg.NumLoopExitsDeleted | 497 | 20 | -477 | -95.98% | 95.98% | | loop-simplifycfg.NumTerminatorsFolded | 618 | 336 | -282 | -45.63% | 45.63% | | loop-unroll.NumCompletelyUnrolled | 11028 | 11032 | 4 | 0.04% | 0.04% | | loop-unroll.NumUnrolled | 12608 | 12529 | -79 | -0.63% | 0.63% | | mem2reg.NumDeadAlloca | 10222 | 10221 | -1 | -0.01% | 0.01% | | mem2reg.NumPHIInsert | 192110 | 192106 | -4 | 0.00% | 0.00% | | mem2reg.NumSingleStore | 637650 | 637643 | -7 | 0.00% | 0.00% | | scalar-evolution.NumBruteForceTripCountsComputed | 814 | 812 | -2 | -0.25% | 0.25% | | scalar-evolution.NumTripCountsComputed | 283108 | 282934 | -174 | -0.06% | 0.06% | | scalar-evolution.NumTripCountsNotComputed | 106712 | 106718 | 6 | 0.01% | 0.01% | | simple-loop-unswitch.NumBranches | 5178 | 4752 | -426 | -8.23% | 8.23% | | simple-loop-unswitch.NumCostMultiplierSkipped | 914 | 503 | -411 | -44.97% | 44.97% | | simple-loop-unswitch.NumSwitches | 20 | 18 | -2 | -10.00% | 10.00% | | simple-loop-unswitch.NumTrivial | 183 | 95 | -88 | -48.09% | 48.09% | ... but that actually regresses LICM (-12% `licm.NumMovedLoads`), loop-simplifycfg (`NumLoopExitsDeleted`, `NumTerminatorsFolded`), simple-loop-unswitch (`NumTrivial`). What if we instead have LICM both before and after LoopRotate? | statistic name | LoopRotate-LICM | LICM-LoopRotate-LICM | Δ | % | abs(%) | | asm-printer.EmittedInsts | 9015930 | 9014474 | -1456 | -0.02% | 0.02% | | indvars.NumElimCmp | 3536 | 3546 | 10 | 0.28% | 0.28% | | indvars.NumElimExt | 36725 | 36681 | -44 | -0.12% | 0.12% | | indvars.NumElimIV | 1197 | 1185 | -12 | -1.00% | 1.00% | | indvars.NumElimIdentity | 143 | 146 | 3 | 2.10% | 2.10% | | indvars.NumElimRem | 4 | 5 | 1 | 25.00% | 25.00% | | indvars.NumLFTR | 29842 | 29899 | 57 | 0.19% | 0.19% | | indvars.NumReplaced | 2293 | 2299 | 6 | 0.26% | 0.26% | | indvars.NumSimplifiedSDiv | 6 | 8 | 2 | 33.33% | 33.33% | | indvars.NumWidened | 26438 | 26404 | -34 | -0.13% | 0.13% | | instcount.TotalBlocks | 1178338 | 1173652 | -4686 | -0.40% | 0.40% | | instcount.TotalFuncs | 111825 | 111829 | 4 | 0.00% | 0.00% | | instcount.TotalInsts | 9905442 | 9895452 | -9990 | -0.10% | 0.10% | | lcssa.NumLCSSA | 425871 | 425373 | -498 | -0.12% | 0.12% | | licm.NumHoisted | 378357 | 383352 | 4995 | 1.32% | 1.32% | | licm.NumMovedCalls | 2193 | 2204 | 11 | 0.50% | 0.50% | | licm.NumMovedLoads | 35899 | 35755 | -144 | -0.40% | 0.40% | | licm.NumPromoted | 11178 | 11163 | -15 | -0.13% | 0.13% | | licm.NumSunk | 13359 | 14321 | 962 | 7.20% | 7.20% | | loop-delete.NumDeleted | 8547 | 8538 | -9 | -0.11% | 0.11% | | loop-instsimplify.NumSimplified | 12876 | 12041 | -835 | -6.48% | 6.48% | | loop-peel.NumPeeled | 1008 | 924 | -84 | -8.33% | 8.33% | | loop-rotate.NumNotRotatedDueToHeaderSize | 368 | 365 | -3 | -0.82% | 0.82% | | loop-rotate.NumRotated | 42015 | 42005 | -10 | -0.02% | 0.02% | | loop-simplifycfg.NumLoopBlocksDeleted | 240 | 241 | 1 | 0.42% | 0.42% | | loop-simplifycfg.NumTerminatorsFolded | 618 | 619 | 1 | 0.16% | 0.16% | | loop-unroll.NumCompletelyUnrolled | 11028 | 11029 | 1 | 0.01% | 0.01% | | loop-unroll.NumUnrolled | 12608 | 12525 | -83 | -0.66% | 0.66% | | mem2reg.NumPHIInsert | 192110 | 192073 | -37 | -0.02% | 0.02% | | mem2reg.NumSingleStore | 637650 | 637652 | 2 | 0.00% | 0.00% | | scalar-evolution.NumTripCountsComputed | 283108 | 282998 | -110 | -0.04% | 0.04% | | scalar-evolution.NumTripCountsNotComputed | 106712 | 106691 | -21 | -0.02% | 0.02% | | simple-loop-unswitch.NumBranches | 5178 | 5185 | 7 | 0.14% | 0.14% | | simple-loop-unswitch.NumCostMultiplierSkipped | 914 | 925 | 11 | 1.20% | 1.20% | | simple-loop-unswitch.NumTrivial | 183 | 179 | -4 | -2.19% | 2.19% | | simple-loop-unswitch.NumBranches | 5178 | 4752 | -426 | -8.23% | 8.23% | | simple-loop-unswitch.NumCostMultiplierSkipped | 914 | 503 | -411 | -44.97% | 44.97% | | simple-loop-unswitch.NumSwitches | 20 | 18 | -2 | -10.00% | 10.00% | | simple-loop-unswitch.NumTrivial | 183 | 95 | -88 | -48.09% | 48.09% | I.e. we end up with less instructions, less peeling, more LICM activity, also note how none of those 4 regressions are here. Namely: | statistic name | LICM-LoopRotate | LICM-LoopRotate-LICM | Δ | % | abs(%) | | asm-printer.EmittedInsts | 9015799 | 9014474 | -1325 | -0.01% | 0.01% | | indvars.NumElimCmp | 3544 | 3546 | 2 | 0.06% | 0.06% | | indvars.NumElimExt | 36580 | 36681 | 101 | 0.28% | 0.28% | | indvars.NumElimIV | 1187 | 1185 | -2 | -0.17% | 0.17% | | indvars.NumElimIdentity | 136 | 146 | 10 | 7.35% | 7.35% | | indvars.NumLFTR | 29890 | 29899 | 9 | 0.03% | 0.03% | | indvars.NumReplaced | 2227 | 2299 | 72 | 3.23% | 3.23% | | indvars.NumWidened | 26329 | 26404 | 75 | 0.28% | 0.28% | | instcount.TotalBlocks | 1173840 | 1173652 | -188 | -0.02% | 0.02% | | instcount.TotalInsts | 9896139 | 9895452 | -687 | -0.01% | 0.01% | | lcssa.NumLCSSA | 423961 | 425373 | 1412 | 0.33% | 0.33% | | licm.NumHoisted | 378753 | 383352 | 4599 | 1.21% | 1.21% | | licm.NumMovedCalls | 2208 | 2204 | -4 | -0.18% | 0.18% | | licm.NumMovedLoads | 31821 | 35755 | 3934 | 12.36% | 12.36% | | licm.NumPromoted | 11154 | 11163 | 9 | 0.08% | 0.08% | | licm.NumSunk | 13587 | 14321 | 734 | 5.40% | 5.40% | | loop-delete.NumDeleted | 8402 | 8538 | 136 | 1.62% | 1.62% | | loop-instsimplify.NumSimplified | 11890 | 12041 | 151 | 1.27% | 1.27% | | loop-peel.NumPeeled | 925 | 924 | -1 | -0.11% | 0.11% | | loop-rotate.NumRotated | 42003 | 42005 | 2 | 0.00% | 0.00% | | loop-simplifycfg.NumLoopBlocksDeleted | 242 | 241 | -1 | -0.41% | 0.41% | | loop-simplifycfg.NumLoopExitsDeleted | 20 | 497 | 477 | 2385.00% | 2385.00% | | loop-simplifycfg.NumTerminatorsFolded | 336 | 619 | 283 | 84.23% | 84.23% | | loop-unroll.NumCompletelyUnrolled | 11032 | 11029 | -3 | -0.03% | 0.03% | | loop-unroll.NumUnrolled | 12529 | 12525 | -4 | -0.03% | 0.03% | | mem2reg.NumDeadAlloca | 10221 | 10222 | 1 | 0.01% | 0.01% | | mem2reg.NumPHIInsert | 192106 | 192073 | -33 | -0.02% | 0.02% | | mem2reg.NumSingleStore | 637643 | 637652 | 9 | 0.00% | 0.00% | | scalar-evolution.NumBruteForceTripCountsComputed | 812 | 814 | 2 | 0.25% | 0.25% | | scalar-evolution.NumTripCountsComputed | 282934 | 282998 | 64 | 0.02% | 0.02% | | scalar-evolution.NumTripCountsNotComputed | 106718 | 106691 | -27 | -0.03% | 0.03% | | simple-loop-unswitch.NumBranches | 4752 | 5185 | 433 | 9.11% | 9.11% | | simple-loop-unswitch.NumCostMultiplierSkipped | 503 | 925 | 422 | 83.90% | 83.90% | | simple-loop-unswitch.NumSwitches | 18 | 20 | 2 | 11.11% | 11.11% | | simple-loop-unswitch.NumTrivial | 95 | 179 | 84 | 88.42% | 88.42% | {F15983613} {F15983615} {F15983616} (this is vanilla llvm testsuite + rawspeed + darktable) As an example of the code where early LICM only is bad, see: https://godbolt.org/z/GzEbacs4K This does have an observable compile-time regression of +~0.5% geomean https://llvm-compile-time-tracker.com/compare.php?from=7c5222e4d1a3a14f029e… but i think that's basically nothing, and there's potential that it might be avoidable in the future by fixing clang to produce alignment information on function arguments, thus making the second run unneeded. Differential Revision: https://reviews.llvm.org/D99249 --- llvm/lib/Passes/PassBuilder.cpp | 10 +++ llvm/lib/Transforms/IPO/PassManagerBuilder.cpp | 4 + llvm/test/CodeGen/AMDGPU/opt-pipeline.ll | 30 +++++--- llvm/test/Other/new-pm-defaults.ll | 7 +- llvm/test/Other/new-pm-thinlto-defaults.ll | 7 +- .../Other/new-pm-thinlto-postlink-pgo-defaults.ll | 9 ++- .../new-pm-thinlto-postlink-samplepgo-defaults.ll | 7 +- .../Other/new-pm-thinlto-prelink-pgo-defaults.ll | 9 ++- .../new-pm-thinlto-prelink-samplepgo-defaults.ll | 5 +- llvm/test/Other/opt-O2-pipeline.ll | 10 ++- llvm/test/Other/opt-O3-pipeline-enable-matrix.ll | 10 ++- llvm/test/Other/opt-O3-pipeline.ll | 10 ++- llvm/test/Other/opt-Os-pipeline.ll | 10 ++- llvm/test/Other/pass-pipelines.ll | 3 + llvm/test/Transforms/IndVarSimplify/X86/pr45360.ll | 25 ++++--- .../PhaseOrdering/X86/spurious-peeling.ll | 87 +++++++++------------- llvm/test/Transforms/PhaseOrdering/X86/vdiv.ll | 78 +++++++++---------- .../loop-rotation-vs-common-code-hoisting.ll | 22 +++--- 18 files changed, 193 insertions(+), 150 deletions(-) diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp index 3a325277e370..5a2285215769 100644 --- a/llvm/lib/Passes/PassBuilder.cpp +++ b/llvm/lib/Passes/PassBuilder.cpp @@ -568,6 +568,11 @@ PassBuilder::buildO1FunctionSimplificationPipeline(OptimizationLevel Level, LPM1.addPass(LoopInstSimplifyPass()); LPM1.addPass(LoopSimplifyCFGPass()); + // Try to remove as much code from the loop header as possible, + // to reduce amount of IR that will have to be duplicated. + // TODO: Investigate promotion cap for O1. + LPM1.addPass(LICMPass(PTO.LicmMssaOptCap, PTO.LicmMssaNoAccForPromotionCap)); + LPM1.addPass(LoopRotatePass(/* Disable header duplication */ true, isLTOPreLink(Phase))); // TODO: Investigate promotion cap for O1. @@ -736,6 +741,11 @@ PassBuilder::buildFunctionSimplificationPipeline(OptimizationLevel Level, LPM1.addPass(LoopInstSimplifyPass()); LPM1.addPass(LoopSimplifyCFGPass()); + // Try to remove as much code from the loop header as possible, + // to reduce amount of IR that will have to be duplicated. + // TODO: Investigate promotion cap for O1. + LPM1.addPass(LICMPass(PTO.LicmMssaOptCap, PTO.LicmMssaNoAccForPromotionCap)); + // Disable header duplication in loop rotation at -Oz. LPM1.addPass( LoopRotatePass(Level != OptimizationLevel::Oz, isLTOPreLink(Phase))); diff --git a/llvm/lib/Transforms/IPO/PassManagerBuilder.cpp b/llvm/lib/Transforms/IPO/PassManagerBuilder.cpp index 109e7c97ff1b..2c80a16febef 100644 --- a/llvm/lib/Transforms/IPO/PassManagerBuilder.cpp +++ b/llvm/lib/Transforms/IPO/PassManagerBuilder.cpp @@ -431,6 +431,10 @@ void PassManagerBuilder::addFunctionSimplificationPasses( MPM.add(createLoopInstSimplifyPass()); MPM.add(createLoopSimplifyCFGPass()); } + // Try to remove as much code from the loop header as possible, + // to reduce amount of IR that will have to be duplicated. + // TODO: Investigate promotion cap for O1. + MPM.add(createLICMPass(LicmMssaOptCap, LicmMssaNoAccForPromotionCap)); // Rotate Loop - disable header duplication at -Oz MPM.add(createLoopRotatePass(SizeLevel == 2 ? 0 : -1, PrepareForLTO)); // TODO: Investigate promotion cap for O1. diff --git a/llvm/test/CodeGen/AMDGPU/opt-pipeline.ll b/llvm/test/CodeGen/AMDGPU/opt-pipeline.ll index 34e5e6c647da..5e33d968c710 100644 --- a/llvm/test/CodeGen/AMDGPU/opt-pipeline.ll +++ b/llvm/test/CodeGen/AMDGPU/opt-pipeline.ll @@ -129,16 +129,20 @@ ; GCN-O1-NEXT: Simplify the CFG ; GCN-O1-NEXT: Reassociate expressions ; GCN-O1-NEXT: Dominator Tree Construction +; GCN-O1-NEXT: Basic Alias Analysis (stateless AA impl) +; GCN-O1-NEXT: Function Alias Analysis Results +; GCN-O1-NEXT: Memory SSA ; GCN-O1-NEXT: Natural Loop Information ; GCN-O1-NEXT: Canonicalize natural loops ; GCN-O1-NEXT: LCSSA Verifier ; GCN-O1-NEXT: Loop-Closed SSA Form Pass -; GCN-O1-NEXT: Basic Alias Analysis (stateless AA impl) -; GCN-O1-NEXT: Function Alias Analysis Results ; GCN-O1-NEXT: Scalar Evolution Analysis +; GCN-O1-NEXT: Lazy Branch Probability Analysis +; GCN-O1-NEXT: Lazy Block Frequency Analysis +; GCN-O1-NEXT: Loop Pass Manager +; GCN-O1-NEXT: Loop Invariant Code Motion ; GCN-O1-NEXT: Loop Pass Manager ; GCN-O1-NEXT: Rotate Loops -; GCN-O1-NEXT: Memory SSA ; GCN-O1-NEXT: Lazy Branch Probability Analysis ; GCN-O1-NEXT: Lazy Block Frequency Analysis ; GCN-O1-NEXT: Loop Pass Manager @@ -451,16 +455,20 @@ ; GCN-O2-NEXT: Simplify the CFG ; GCN-O2-NEXT: Reassociate expressions ; GCN-O2-NEXT: Dominator Tree Construction +; GCN-O2-NEXT: Basic Alias Analysis (stateless AA impl) +; GCN-O2-NEXT: Function Alias Analysis Results +; GCN-O2-NEXT: Memory SSA ; GCN-O2-NEXT: Natural Loop Information ; GCN-O2-NEXT: Canonicalize natural loops ; GCN-O2-NEXT: LCSSA Verifier ; GCN-O2-NEXT: Loop-Closed SSA Form Pass -; GCN-O2-NEXT: Basic Alias Analysis (stateless AA impl) -; GCN-O2-NEXT: Function Alias Analysis Results ; GCN-O2-NEXT: Scalar Evolution Analysis +; GCN-O2-NEXT: Lazy Branch Probability Analysis +; GCN-O2-NEXT: Lazy Block Frequency Analysis +; GCN-O2-NEXT: Loop Pass Manager +; GCN-O2-NEXT: Loop Invariant Code Motion ; GCN-O2-NEXT: Loop Pass Manager ; GCN-O2-NEXT: Rotate Loops -; GCN-O2-NEXT: Memory SSA ; GCN-O2-NEXT: Lazy Branch Probability Analysis ; GCN-O2-NEXT: Lazy Block Frequency Analysis ; GCN-O2-NEXT: Loop Pass Manager @@ -810,16 +818,20 @@ ; GCN-O3-NEXT: Simplify the CFG ; GCN-O3-NEXT: Reassociate expressions ; GCN-O3-NEXT: Dominator Tree Construction +; GCN-O3-NEXT: Basic Alias Analysis (stateless AA impl) +; GCN-O3-NEXT: Function Alias Analysis Results +; GCN-O3-NEXT: Memory SSA ; GCN-O3-NEXT: Natural Loop Information ; GCN-O3-NEXT: Canonicalize natural loops ; GCN-O3-NEXT: LCSSA Verifier ; GCN-O3-NEXT: Loop-Closed SSA Form Pass -; GCN-O3-NEXT: Basic Alias Analysis (stateless AA impl) -; GCN-O3-NEXT: Function Alias Analysis Results ; GCN-O3-NEXT: Scalar Evolution Analysis +; GCN-O3-NEXT: Lazy Branch Probability Analysis +; GCN-O3-NEXT: Lazy Block Frequency Analysis +; GCN-O3-NEXT: Loop Pass Manager +; GCN-O3-NEXT: Loop Invariant Code Motion ; GCN-O3-NEXT: Loop Pass Manager ; GCN-O3-NEXT: Rotate Loops -; GCN-O3-NEXT: Memory SSA ; GCN-O3-NEXT: Lazy Branch Probability Analysis ; GCN-O3-NEXT: Lazy Block Frequency Analysis ; GCN-O3-NEXT: Loop Pass Manager diff --git a/llvm/test/Other/new-pm-defaults.ll b/llvm/test/Other/new-pm-defaults.ll index 01b02b8fd482..337a0857701c 100644 --- a/llvm/test/Other/new-pm-defaults.ll +++ b/llvm/test/Other/new-pm-defaults.ll @@ -113,9 +113,9 @@ ; CHECK-O-NEXT: Running analysis: CallGraphAnalysis ; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}ProfileSummaryAnalysis ; CHECK-O-NEXT: Running analysis: ProfileSummaryAnalysis -; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy -; CHECK-O-NEXT: Running analysis: LazyCallGraphAnalysis -; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy +; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy +; CHECK-O-NEXT: Running analysis: LazyCallGraphAnalysis +; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy ; CHECK-O-NEXT: Running analysis: OuterAnalysisManagerProxy<{{.*}}LazyCallGraph::SCC{{.*}}> ; CHECK-O-NEXT: Running pass: DevirtSCCRepeatedPass ; CHECK-O-NEXT: Starting CGSCC pass manager run. @@ -156,6 +156,7 @@ ; CHECK-O-NEXT: Starting Loop pass manager run. ; CHECK-O-NEXT: Running pass: LoopInstSimplifyPass ; CHECK-O-NEXT: Running pass: LoopSimplifyCFGPass +; CHECK-O-NEXT: Running pass: LICM ; CHECK-O-NEXT: Running pass: LoopRotatePass ; CHECK-O-NEXT: Running pass: LICM ; CHECK-O-NEXT: Running pass: SimpleLoopUnswitchPass diff --git a/llvm/test/Other/new-pm-thinlto-defaults.ll b/llvm/test/Other/new-pm-thinlto-defaults.ll index fbf47de87eeb..bba43dd50e7a 100644 --- a/llvm/test/Other/new-pm-thinlto-defaults.ll +++ b/llvm/test/Other/new-pm-thinlto-defaults.ll @@ -98,9 +98,9 @@ ; CHECK-O-NEXT: Running analysis: CallGraphAnalysis ; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}ProfileSummaryAnalysis ; CHECK-PRELINK-O-NEXT: Running analysis: ProfileSummaryAnalysis -; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy -; CHECK-O-NEXT: Running analysis: LazyCallGraphAnalysis -; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy +; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy +; CHECK-O-NEXT: Running analysis: LazyCallGraphAnalysis +; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy ; CHECK-O-NEXT: Running analysis: OuterAnalysisManagerProxy ; CHECK-O-NEXT: Running pass: DevirtSCCRepeatedPass ; CHECK-O-NEXT: Starting CGSCC pass manager run. @@ -139,6 +139,7 @@ ; CHECK-O-NEXT: Starting Loop pass manager run. ; CHECK-O-NEXT: Running pass: LoopInstSimplifyPass ; CHECK-O-NEXT: Running pass: LoopSimplifyCFGPass +; CHECK-O-NEXT: Running pass: LICM ; CHECK-O-NEXT: Running pass: LoopRotatePass ; CHECK-O-NEXT: Running pass: LICM ; CHECK-O-NEXT: Running pass: SimpleLoopUnswitchPass diff --git a/llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll b/llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll index 4bcf70e15a5b..57f0e0da73b6 100644 --- a/llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll +++ b/llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll @@ -68,10 +68,10 @@ ; CHECK-O-NEXT: Running pass: ModuleInlinerWrapperPass ; CHECK-O-NEXT: Running analysis: InlineAdvisorAnalysis ; CHECK-O-NEXT: Starting {{.*}}Module pass manager run. -; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}GlobalsAA -; CHECK-O-NEXT: Running analysis: GlobalsAA -; CHECK-O-NEXT: Running analysis: CallGraphAnalysis -; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}ProfileSummaryAnalysis +; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}GlobalsAA +; CHECK-O-NEXT: Running analysis: GlobalsAA +; CHECK-O-NEXT: Running analysis: CallGraphAnalysis +; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}ProfileSummaryAnalysis ; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy ; CHECK-O-NEXT: Running analysis: LazyCallGraphAnalysis ; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy @@ -112,6 +112,7 @@ ; CHECK-O-NEXT: Starting Loop pass manager run. ; CHECK-O-NEXT: Running pass: LoopInstSimplifyPass ; CHECK-O-NEXT: Running pass: LoopSimplifyCFGPass +; CHECK-O-NEXT: Running pass: LICM ; CHECK-O-NEXT: Running pass: LoopRotatePass ; CHECK-O-NEXT: Running pass: LICM ; CHECK-O-NEXT: Running pass: SimpleLoopUnswitchPass diff --git a/llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll b/llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll index 1071d28432b9..0e0e2854b8df 100644 --- a/llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll +++ b/llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll @@ -78,9 +78,9 @@ ; CHECK-O-NEXT: Running pass: ModuleInlinerWrapperPass ; CHECK-O-NEXT: Running analysis: InlineAdvisorAnalysis ; CHECK-O-NEXT: Starting {{.*}}Module pass manager run. -; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}GlobalsAA -; CHECK-O-NEXT: Running analysis: GlobalsAA -; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}ProfileSummaryAnalysis +; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}GlobalsAA +; CHECK-O-NEXT: Running analysis: GlobalsAA +; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}ProfileSummaryAnalysis ; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy ; CHECK-O-NEXT: Running analysis: LazyCallGraphAnalysis ; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy @@ -121,6 +121,7 @@ ; CHECK-O-NEXT: Starting Loop pass manager run. ; CHECK-O-NEXT: Running pass: LoopInstSimplifyPass ; CHECK-O-NEXT: Running pass: LoopSimplifyCFGPass +; CHECK-O-NEXT: Running pass: LICM ; CHECK-O-NEXT: Running pass: LoopRotatePass ; CHECK-O-NEXT: Running pass: LICM ; CHECK-O-NEXT: Running pass: SimpleLoopUnswitchPass diff --git a/llvm/test/Other/new-pm-thinlto-prelink-pgo-defaults.ll b/llvm/test/Other/new-pm-thinlto-prelink-pgo-defaults.ll index e2f1385cf52b..4cfb9825c97e 100644 --- a/llvm/test/Other/new-pm-thinlto-prelink-pgo-defaults.ll +++ b/llvm/test/Other/new-pm-thinlto-prelink-pgo-defaults.ll @@ -93,10 +93,10 @@ ; CHECK-O-NEXT: Running analysis: OptimizationRemarkEmitterAnalysis on foo ; CHECK-O-NEXT: Running pass: ModuleInlinerWrapperPass ; CHECK-O-NEXT: Starting {{.*}}Module pass manager run. -; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}GlobalsAA -; CHECK-O-NEXT: Running analysis: GlobalsAA -; CHECK-O-NEXT: Running analysis: CallGraphAnalysis -; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}ProfileSummaryAnalysis +; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}GlobalsAA +; CHECK-O-NEXT: Running analysis: GlobalsAA +; CHECK-O-NEXT: Running analysis: CallGraphAnalysis +; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}ProfileSummaryAnalysis ; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy ; CHECK-O-NEXT: Running analysis: LazyCallGraphAnalysis ; CHECK-O-NEXT: Running analysis: TargetLibraryAnalysis on foo @@ -158,6 +158,7 @@ ; CHECK-O-NEXT: Starting Loop pass manager run. ; CHECK-O-NEXT: Running pass: LoopInstSimplifyPass ; CHECK-O-NEXT: Running pass: LoopSimplifyCFGPass +; CHECK-O-NEXT: Running pass: LICM ; CHECK-O-NEXT: Running pass: LoopRotatePass ; CHECK-O-NEXT: Running pass: LICM ; CHECK-O-NEXT: Running pass: SimpleLoopUnswitchPass diff --git a/llvm/test/Other/new-pm-thinlto-prelink-samplepgo-defaults.ll b/llvm/test/Other/new-pm-thinlto-prelink-samplepgo-defaults.ll index d4dc552aea01..a05555c57003 100644 --- a/llvm/test/Other/new-pm-thinlto-prelink-samplepgo-defaults.ll +++ b/llvm/test/Other/new-pm-thinlto-prelink-samplepgo-defaults.ll @@ -73,8 +73,8 @@ ; CHECK-O-NEXT: Running pass: ModuleInlinerWrapperPass ; CHECK-O-NEXT: Running analysis: InlineAdvisorAnalysis ; CHECK-O-NEXT: Starting {{.*}}Module pass manager run. -; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}GlobalsAA -; CHECK-O-NEXT: Running analysis: GlobalsAA +; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}GlobalsAA +; CHECK-O-NEXT: Running analysis: GlobalsAA ; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}ProfileSummaryAnalysis ; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy ; CHECK-O-NEXT: Running analysis: LazyCallGraphAnalysis @@ -116,6 +116,7 @@ ; CHECK-O-NEXT: Starting Loop pass manager run. ; CHECK-O-NEXT: Running pass: LoopInstSimplifyPass ; CHECK-O-NEXT: Running pass: LoopSimplifyCFGPass +; CHECK-O-NEXT: Running pass: LICM ; CHECK-O-NEXT: Running pass: LoopRotatePass ; CHECK-O-NEXT: Running pass: LICM ; CHECK-O-NEXT: Running pass: SimpleLoopUnswitchPass diff --git a/llvm/test/Other/opt-O2-pipeline.ll b/llvm/test/Other/opt-O2-pipeline.ll index f7217c122fdb..a3b01e5464d4 100644 --- a/llvm/test/Other/opt-O2-pipeline.ll +++ b/llvm/test/Other/opt-O2-pipeline.ll @@ -101,16 +101,20 @@ ; CHECK-NEXT: Simplify the CFG ; CHECK-NEXT: Reassociate expressions ; CHECK-NEXT: Dominator Tree Construction +; CHECK-NEXT: Basic Alias Analysis (stateless AA impl) +; CHECK-NEXT: Function Alias Analysis Results +; CHECK-NEXT: Memory SSA ; CHECK-NEXT: Natural Loop Information ; CHECK-NEXT: Canonicalize natural loops ; CHECK-NEXT: LCSSA Verifier ; CHECK-NEXT: Loop-Closed SSA Form Pass -; CHECK-NEXT: Basic Alias Analysis (stateless AA impl) -; CHECK-NEXT: Function Alias Analysis Results ; CHECK-NEXT: Scalar Evolution Analysis +; CHECK-NEXT: Lazy Branch Probability Analysis +; CHECK-NEXT: Lazy Block Frequency Analysis +; CHECK-NEXT: Loop Pass Manager +; CHECK-NEXT: Loop Invariant Code Motion ; CHECK-NEXT: Loop Pass Manager ; CHECK-NEXT: Rotate Loops -; CHECK-NEXT: Memory SSA ; CHECK-NEXT: Lazy Branch Probability Analysis ; CHECK-NEXT: Lazy Block Frequency Analysis ; CHECK-NEXT: Loop Pass Manager diff --git a/llvm/test/Other/opt-O3-pipeline-enable-matrix.ll b/llvm/test/Other/opt-O3-pipeline-enable-matrix.ll index 6b98c1f80d9e..fafd5c8fdcb8 100644 --- a/llvm/test/Other/opt-O3-pipeline-enable-matrix.ll +++ b/llvm/test/Other/opt-O3-pipeline-enable-matrix.ll @@ -106,16 +106,20 @@ ; CHECK-NEXT: Simplify the CFG ; CHECK-NEXT: Reassociate expressions ; CHECK-NEXT: Dominator Tree Construction +; CHECK-NEXT: Basic Alias Analysis (stateless AA impl) +; CHECK-NEXT: Function Alias Analysis Results +; CHECK-NEXT: Memory SSA ; CHECK-NEXT: Natural Loop Information ; CHECK-NEXT: Canonicalize natural loops ; CHECK-NEXT: LCSSA Verifier ; CHECK-NEXT: Loop-Closed SSA Form Pass -; CHECK-NEXT: Basic Alias Analysis (stateless AA impl) -; CHECK-NEXT: Function Alias Analysis Results ; CHECK-NEXT: Scalar Evolution Analysis +; CHECK-NEXT: Lazy Branch Probability Analysis +; CHECK-NEXT: Lazy Block Frequency Analysis +; CHECK-NEXT: Loop Pass Manager +; CHECK-NEXT: Loop Invariant Code Motion ; CHECK-NEXT: Loop Pass Manager ; CHECK-NEXT: Rotate Loops -; CHECK-NEXT: Memory SSA ; CHECK-NEXT: Lazy Branch Probability Analysis ; CHECK-NEXT: Lazy Block Frequency Analysis ; CHECK-NEXT: Loop Pass Manager diff --git a/llvm/test/Other/opt-O3-pipeline.ll b/llvm/test/Other/opt-O3-pipeline.ll index 00a1d61ac058..103d49bbbbab 100644 --- a/llvm/test/Other/opt-O3-pipeline.ll +++ b/llvm/test/Other/opt-O3-pipeline.ll @@ -106,16 +106,20 @@ ; CHECK-NEXT: Simplify the CFG ; CHECK-NEXT: Reassociate expressions ; CHECK-NEXT: Dominator Tree Construction +; CHECK-NEXT: Basic Alias Analysis (stateless AA impl) +; CHECK-NEXT: Function Alias Analysis Results +; CHECK-NEXT: Memory SSA ; CHECK-NEXT: Natural Loop Information ; CHECK-NEXT: Canonicalize natural loops ; CHECK-NEXT: LCSSA Verifier ; CHECK-NEXT: Loop-Closed SSA Form Pass -; CHECK-NEXT: Basic Alias Analysis (stateless AA impl) -; CHECK-NEXT: Function Alias Analysis Results ; CHECK-NEXT: Scalar Evolution Analysis +; CHECK-NEXT: Lazy Branch Probability Analysis +; CHECK-NEXT: Lazy Block Frequency Analysis +; CHECK-NEXT: Loop Pass Manager +; CHECK-NEXT: Loop Invariant Code Motion ; CHECK-NEXT: Loop Pass Manager ; CHECK-NEXT: Rotate Loops -; CHECK-NEXT: Memory SSA ; CHECK-NEXT: Lazy Branch Probability Analysis ; CHECK-NEXT: Lazy Block Frequency Analysis ; CHECK-NEXT: Loop Pass Manager diff --git a/llvm/test/Other/opt-Os-pipeline.ll b/llvm/test/Other/opt-Os-pipeline.ll index 21f9b8c6009e..508c21edbc68 100644 --- a/llvm/test/Other/opt-Os-pipeline.ll +++ b/llvm/test/Other/opt-Os-pipeline.ll @@ -87,16 +87,20 @@ ; CHECK-NEXT: Simplify the CFG ; CHECK-NEXT: Reassociate expressions ; CHECK-NEXT: Dominator Tree Construction +; CHECK-NEXT: Basic Alias Analysis (stateless AA impl) +; CHECK-NEXT: Function Alias Analysis Results +; CHECK-NEXT: Memory SSA ; CHECK-NEXT: Natural Loop Information ; CHECK-NEXT: Canonicalize natural loops ; CHECK-NEXT: LCSSA Verifier ; CHECK-NEXT: Loop-Closed SSA Form Pass -; CHECK-NEXT: Basic Alias Analysis (stateless AA impl) -; CHECK-NEXT: Function Alias Analysis Results ; CHECK-NEXT: Scalar Evolution Analysis +; CHECK-NEXT: Lazy Branch Probability Analysis +; CHECK-NEXT: Lazy Block Frequency Analysis +; CHECK-NEXT: Loop Pass Manager +; CHECK-NEXT: Loop Invariant Code Motion ; CHECK-NEXT: Loop Pass Manager ; CHECK-NEXT: Rotate Loops -; CHECK-NEXT: Memory SSA ; CHECK-NEXT: Lazy Branch Probability Analysis ; CHECK-NEXT: Lazy Block Frequency Analysis ; CHECK-NEXT: Loop Pass Manager diff --git a/llvm/test/Other/pass-pipelines.ll b/llvm/test/Other/pass-pipelines.ll index ccd364d5d740..768e8343529e 100644 --- a/llvm/test/Other/pass-pipelines.ll +++ b/llvm/test/Other/pass-pipelines.ll @@ -53,6 +53,9 @@ ; CHECK-O2-NEXT: FunctionPass Manager ; CHECK-O2-NOT: Manager ; CHECK-O2: Loop Pass Manager +; CHECK-O2-NOT: Manager +; CHECK-O2: Loop Pass Manager +; CHECK-O2-NOT: Manager ; CHECK-O2: Loop Pass Manager ; CHECK-O2-NOT: Manager ; FIXME: We shouldn't be pulling out to simplify-cfg and instcombine and diff --git a/llvm/test/Transforms/IndVarSimplify/X86/pr45360.ll b/llvm/test/Transforms/IndVarSimplify/X86/pr45360.ll index 82deee9f367b..8f43029fa303 100644 --- a/llvm/test/Transforms/IndVarSimplify/X86/pr45360.ll +++ b/llvm/test/Transforms/IndVarSimplify/X86/pr45360.ll @@ -22,30 +22,33 @@ define dso_local i32 @main() { ; CHECK-NEXT: bb: ; CHECK-NEXT: [[I6:%.*]] = load i32, i32* @a, align 4 ; CHECK-NEXT: [[I24:%.*]] = load i32, i32* @b, align 4 -; CHECK-NEXT: [[D_PROMOTED9:%.*]] = load i32, i32* @d, align 4 -; CHECK-NEXT: [[TMP0:%.*]] = and i32 [[D_PROMOTED9]], [[I6]] +; CHECK-NEXT: [[D_PROMOTED7:%.*]] = load i32, i32* @d, align 4 +; CHECK-NEXT: [[TMP0:%.*]] = and i32 [[D_PROMOTED7]], [[I6]] ; CHECK-NEXT: [[I21:%.*]] = icmp eq i32 [[TMP0]], 0 -; CHECK-NEXT: br label [[BB1:%.*]] -; CHECK: bb1: -; CHECK-NEXT: br i1 [[I21]], label [[BB13_PREHEADER_BB27_THREAD_SPLIT_CRIT_EDGE:%.*]], label [[BB19_PREHEADER:%.*]] -; CHECK: bb19.preheader: +; CHECK-NEXT: br i1 [[I21]], label [[BB27_THREAD:%.*]], label [[BB27_PREHEADER:%.*]] +; CHECK: bb27.preheader: ; CHECK-NEXT: [[I26:%.*]] = urem i32 [[I24]], [[TMP0]] ; CHECK-NEXT: store i32 [[I26]], i32* @e, align 4 ; CHECK-NEXT: [[I30_NOT:%.*]] = icmp eq i32 [[I26]], 0 -; CHECK-NEXT: br i1 [[I30_NOT]], label [[BB32_LOOPEXIT:%.*]], label [[BB1]] -; CHECK: bb13.preheader.bb27.thread.split_crit_edge: -; CHECK-NEXT: store i32 -1, i32* @f, align 4 +; CHECK-NEXT: br label [[BB27:%.*]] +; CHECK: bb27.thread: ; CHECK-NEXT: store i32 0, i32* @d, align 4 +; CHECK-NEXT: store i32 -1, i32* @f, align 4 ; CHECK-NEXT: store i32 0, i32* @c, align 4 ; CHECK-NEXT: br label [[BB32:%.*]] +; CHECK: bb27: +; CHECK-NEXT: br i1 [[I30_NOT]], label [[BB32_LOOPEXIT:%.*]], label [[BB36:%.*]] ; CHECK: bb32.loopexit: -; CHECK-NEXT: store i32 -1, i32* @f, align 4 ; CHECK-NEXT: store i32 [[TMP0]], i32* @d, align 4 +; CHECK-NEXT: store i32 -1, i32* @f, align 4 ; CHECK-NEXT: br label [[BB32]] ; CHECK: bb32: -; CHECK-NEXT: [[C_SINK:%.*]] = phi i32* [ @c, [[BB32_LOOPEXIT]] ], [ @e, [[BB13_PREHEADER_BB27_THREAD_SPLIT_CRIT_EDGE]] ] +; CHECK-NEXT: [[C_SINK:%.*]] = phi i32* [ @c, [[BB32_LOOPEXIT]] ], [ @e, [[BB27_THREAD]] ] ; CHECK-NEXT: store i32 0, i32* [[C_SINK]], align 4 ; CHECK-NEXT: ret i32 0 +; CHECK: bb36: +; CHECK-NEXT: store i32 1, i32* @c, align 4 +; CHECK-NEXT: br i1 [[I21]], label [[BB27_THREAD]], label [[BB27]] ; bb: %i = alloca i32, align 4 diff --git a/llvm/test/Transforms/PhaseOrdering/X86/spurious-peeling.ll b/llvm/test/Transforms/PhaseOrdering/X86/spurious-peeling.ll index 3e659414d982..4661bd8a36cc 100644 --- a/llvm/test/Transforms/PhaseOrdering/X86/spurious-peeling.ll +++ b/llvm/test/Transforms/PhaseOrdering/X86/spurious-peeling.ll @@ -16,32 +16,28 @@ define dso_local void @_Z13vecIncFromPtrP12FloatVecPair(%class.FloatVecPair* %FV ; OLDPM-NEXT: entry: ; OLDPM-NEXT: [[BASE_I_I:%.*]] = getelementptr inbounds [[CLASS_FLOATVECPAIR:%.*]], %class.FloatVecPair* [[FVP:%.*]], i64 0, i32 1, i32 0 ; OLDPM-NEXT: [[TMP0:%.*]] = load %class.HomemadeVector.0*, %class.HomemadeVector.0** [[BASE_I_I]], align 8, !tbaa [[TBAA0:![0-9]+]] -; OLDPM-NEXT: [[SIZE410_I:%.*]] = getelementptr inbounds [[CLASS_HOMEMADEVECTOR_0:%.*]], %class.HomemadeVector.0* [[TMP0]], i64 undef, i32 1 -; OLDPM-NEXT: [[TMP1:%.*]] = load i32, i32* [[SIZE410_I]], align 8, !tbaa [[TBAA6:![0-9]+]] -; OLDPM-NEXT: [[CMP511_NOT_I:%.*]] = icmp eq i32 [[TMP1]], 0 -; OLDPM-NEXT: br i1 [[CMP511_NOT_I]], label [[_ZN12FLOATVECPAIR6VECINCEV_EXIT:%.*]], label [[FOR_BODY7_LR_PH_I:%.*]] +; OLDPM-NEXT: [[SIZE4_I:%.*]] = getelementptr inbounds [[CLASS_HOMEMADEVECTOR_0:%.*]], %class.HomemadeVector.0* [[TMP0]], i64 undef, i32 1 +; OLDPM-NEXT: [[TMP1:%.*]] = load i32, i32* [[SIZE4_I]], align 8, !tbaa [[TBAA6:![0-9]+]] +; OLDPM-NEXT: [[CMP510_NOT_I:%.*]] = icmp eq i32 [[TMP1]], 0 +; OLDPM-NEXT: br i1 [[CMP510_NOT_I]], label [[_ZN12FLOATVECPAIR6VECINCEV_EXIT:%.*]], label [[FOR_BODY7_LR_PH_I:%.*]] ; OLDPM: for.body7.lr.ph.i: ; OLDPM-NEXT: [[BASE_I4_I:%.*]] = getelementptr inbounds [[CLASS_FLOATVECPAIR]], %class.FloatVecPair* [[FVP]], i64 0, i32 0, i32 0 -; OLDPM-NEXT: [[TMP2:%.*]] = load %class.HomemadeVector.0*, %class.HomemadeVector.0** [[BASE_I4_I]], align 8, !tbaa [[TBAA0]] -; OLDPM-NEXT: [[BASE_I2_I:%.*]] = getelementptr inbounds [[CLASS_HOMEMADEVECTOR_0]], %class.HomemadeVector.0* [[TMP2]], i64 undef, i32 0 -; OLDPM-NEXT: [[TMP3:%.*]] = load float*, float** [[BASE_I2_I]], align 8, !tbaa [[TBAA8:![0-9]+]] -; OLDPM-NEXT: [[ARRAYIDX_I3_I:%.*]] = getelementptr inbounds float, float* [[TMP3]], i64 undef -; OLDPM-NEXT: [[BASE_I6_PEEL_I:%.*]] = getelementptr inbounds [[CLASS_HOMEMADEVECTOR_0]], %class.HomemadeVector.0* [[TMP0]], i64 undef, i32 0 -; OLDPM-NEXT: [[TMP4:%.*]] = load float*, float** [[BASE_I6_PEEL_I]], align 8, !tbaa [[TBAA8]] -; OLDPM-NEXT: [[ARRAYIDX_I7_PEEL_I:%.*]] = getelementptr inbounds float, float* [[TMP4]], i64 undef -; OLDPM-NEXT: [[TMP5:%.*]] = load float, float* [[ARRAYIDX_I7_PEEL_I]], align 4, !tbaa [[TBAA9:![0-9]+]] -; OLDPM-NEXT: [[TMP6:%.*]] = load float, float* [[ARRAYIDX_I3_I]], align 4, !tbaa [[TBAA9]] -; OLDPM-NEXT: [[ADD_PEEL_I:%.*]] = fadd float [[TMP5]], [[TMP6]] -; OLDPM-NEXT: store float [[ADD_PEEL_I]], float* [[ARRAYIDX_I3_I]], align 4, !tbaa [[TBAA9]] -; OLDPM-NEXT: [[EXITCOND_PEEL_NOT_I:%.*]] = icmp eq i32 [[TMP1]], 1 -; OLDPM-NEXT: br i1 [[EXITCOND_PEEL_NOT_I]], label [[_ZN12FLOATVECPAIR6VECINCEV_EXIT]], label [[FOR_BODY7_I:%.*]] +; OLDPM-NEXT: [[BASE_I6_I:%.*]] = getelementptr inbounds [[CLASS_HOMEMADEVECTOR_0]], %class.HomemadeVector.0* [[TMP0]], i64 undef, i32 0 +; OLDPM-NEXT: [[TMP2:%.*]] = load float*, float** [[BASE_I6_I]], align 8, !tbaa [[TBAA8:![0-9]+]] +; OLDPM-NEXT: [[ARRAYIDX_I7_I:%.*]] = getelementptr inbounds float, float* [[TMP2]], i64 undef +; OLDPM-NEXT: [[TMP3:%.*]] = load %class.HomemadeVector.0*, %class.HomemadeVector.0** [[BASE_I4_I]], align 8, !tbaa [[TBAA0]] +; OLDPM-NEXT: [[BASE_I2_I:%.*]] = getelementptr inbounds [[CLASS_HOMEMADEVECTOR_0]], %class.HomemadeVector.0* [[TMP3]], i64 undef, i32 0 +; OLDPM-NEXT: [[TMP4:%.*]] = load float*, float** [[BASE_I2_I]], align 8, !tbaa [[TBAA8]] +; OLDPM-NEXT: [[ARRAYIDX_I3_I:%.*]] = getelementptr inbounds float, float* [[TMP4]], i64 undef +; OLDPM-NEXT: [[DOTPRE_I:%.*]] = load float, float* [[ARRAYIDX_I3_I]], align 4, !tbaa [[TBAA9:![0-9]+]] +; OLDPM-NEXT: br label [[FOR_BODY7_I:%.*]] ; OLDPM: for.body7.i: -; OLDPM-NEXT: [[TMP7:%.*]] = phi float [ [[ADD_I:%.*]], [[FOR_BODY7_I]] ], [ [[ADD_PEEL_I]], [[FOR_BODY7_LR_PH_I]] ] -; OLDPM-NEXT: [[J_012_I:%.*]] = phi i32 [ [[INC_I:%.*]], [[FOR_BODY7_I]] ], [ 1, [[FOR_BODY7_LR_PH_I]] ] -; OLDPM-NEXT: [[TMP8:%.*]] = load float, float* [[ARRAYIDX_I7_PEEL_I]], align 4, !tbaa [[TBAA9]] -; OLDPM-NEXT: [[ADD_I]] = fadd float [[TMP7]], [[TMP8]] +; OLDPM-NEXT: [[TMP5:%.*]] = phi float [ [[DOTPRE_I]], [[FOR_BODY7_LR_PH_I]] ], [ [[ADD_I:%.*]], [[FOR_BODY7_I]] ] +; OLDPM-NEXT: [[J_011_I:%.*]] = phi i32 [ 0, [[FOR_BODY7_LR_PH_I]] ], [ [[INC_I:%.*]], [[FOR_BODY7_I]] ] +; OLDPM-NEXT: [[TMP6:%.*]] = load float, float* [[ARRAYIDX_I7_I]], align 4, !tbaa [[TBAA9]] +; OLDPM-NEXT: [[ADD_I]] = fadd float [[TMP5]], [[TMP6]] ; OLDPM-NEXT: store float [[ADD_I]], float* [[ARRAYIDX_I3_I]], align 4, !tbaa [[TBAA9]] -; OLDPM-NEXT: [[INC_I]] = add nuw i32 [[J_012_I]], 1 +; OLDPM-NEXT: [[INC_I]] = add nuw i32 [[J_011_I]], 1 ; OLDPM-NEXT: [[EXITCOND_NOT_I:%.*]] = icmp eq i32 [[INC_I]], [[TMP1]] ; OLDPM-NEXT: br i1 [[EXITCOND_NOT_I]], label [[_ZN12FLOATVECPAIR6VECINCEV_EXIT]], label [[FOR_BODY7_I]], !llvm.loop [[LOOP11:![0-9]+]] ; OLDPM: _ZN12FloatVecPair6vecIncEv.exit: @@ -51,39 +47,30 @@ define dso_local void @_Z13vecIncFromPtrP12FloatVecPair(%class.FloatVecPair* %FV ; NEWPM-NEXT: entry: ; NEWPM-NEXT: [[BASE_I_I:%.*]] = getelementptr inbounds [[CLASS_FLOATVECPAIR:%.*]], %class.FloatVecPair* [[FVP:%.*]], i64 0, i32 1, i32 0 ; NEWPM-NEXT: [[TMP0:%.*]] = load %class.HomemadeVector.0*, %class.HomemadeVector.0** [[BASE_I_I]], align 8, !tbaa [[TBAA0:![0-9]+]] -; NEWPM-NEXT: [[SIZE410_I:%.*]] = getelementptr inbounds [[CLASS_HOMEMADEVECTOR_0:%.*]], %class.HomemadeVector.0* [[TMP0]], i64 undef, i32 1 -; NEWPM-NEXT: [[TMP1:%.*]] = load i32, i32* [[SIZE410_I]], align 8, !tbaa [[TBAA6:![0-9]+]] -; NEWPM-NEXT: [[CMP511_NOT_I:%.*]] = icmp eq i32 [[TMP1]], 0 -; NEWPM-NEXT: br i1 [[CMP511_NOT_I]], label [[_ZN12FLOATVECPAIR6VECINCEV_EXIT:%.*]], label [[FOR_BODY7_LR_PH_I:%.*]] +; NEWPM-NEXT: [[SIZE4_I:%.*]] = getelementptr inbounds [[CLASS_HOMEMADEVECTOR_0:%.*]], %class.HomemadeVector.0* [[TMP0]], i64 undef, i32 1 +; NEWPM-NEXT: [[TMP1:%.*]] = load i32, i32* [[SIZE4_I]], align 8, !tbaa [[TBAA6:![0-9]+]] +; NEWPM-NEXT: [[CMP510_NOT_I:%.*]] = icmp eq i32 [[TMP1]], 0 +; NEWPM-NEXT: br i1 [[CMP510_NOT_I]], label [[_ZN12FLOATVECPAIR6VECINCEV_EXIT:%.*]], label [[FOR_BODY7_LR_PH_I:%.*]] ; NEWPM: for.body7.lr.ph.i: ; NEWPM-NEXT: [[BASE_I6_I:%.*]] = getelementptr inbounds [[CLASS_FLOATVECPAIR]], %class.FloatVecPair* [[FVP]], i64 0, i32 0, i32 0 -; NEWPM-NEXT: [[TMP2:%.*]] = load %class.HomemadeVector.0*, %class.HomemadeVector.0** [[BASE_I6_I]], align 8, !tbaa [[TBAA0]] -; NEWPM-NEXT: [[BASE_I8_I:%.*]] = getelementptr inbounds [[CLASS_HOMEMADEVECTOR_0]], %class.HomemadeVector.0* [[TMP2]], i64 undef, i32 0 -; NEWPM-NEXT: [[TMP3:%.*]] = load float*, float** [[BASE_I8_I]], align 8, !tbaa [[TBAA8:![0-9]+]] -; NEWPM-NEXT: [[ARRAYIDX_I9_I:%.*]] = getelementptr inbounds float, float* [[TMP3]], i64 undef -; NEWPM-NEXT: [[BASE_I4_PEEL_I:%.*]] = getelementptr inbounds [[CLASS_HOMEMADEVECTOR_0]], %class.HomemadeVector.0* [[TMP0]], i64 undef, i32 0 -; NEWPM-NEXT: [[TMP4:%.*]] = load float*, float** [[BASE_I4_PEEL_I]], align 8, !tbaa [[TBAA8]] -; NEWPM-NEXT: [[ARRAYIDX_I5_PEEL_I:%.*]] = getelementptr inbounds float, float* [[TMP4]], i64 undef -; NEWPM-NEXT: [[TMP5:%.*]] = load float, float* [[ARRAYIDX_I5_PEEL_I]], align 4, !tbaa [[TBAA9:![0-9]+]] -; NEWPM-NEXT: [[TMP6:%.*]] = load float, float* [[ARRAYIDX_I9_I]], align 4, !tbaa [[TBAA9]] -; NEWPM-NEXT: [[ADD_PEEL_I:%.*]] = fadd float [[TMP5]], [[TMP6]] -; NEWPM-NEXT: store float [[ADD_PEEL_I]], float* [[ARRAYIDX_I9_I]], align 4, !tbaa [[TBAA9]] -; NEWPM-NEXT: [[EXITCOND_PEEL_NOT_I:%.*]] = icmp eq i32 [[TMP1]], 1 -; NEWPM-NEXT: br i1 [[EXITCOND_PEEL_NOT_I]], label [[_ZN12FLOATVECPAIR6VECINCEV_EXIT]], label [[FOR_BODY7_LR_PH_I_FOR_BODY7_I_CRIT_EDGE:%.*]] -; NEWPM: for.body7.lr.ph.i.for.body7.i_crit_edge: -; NEWPM-NEXT: [[INC_I_1:%.*]] = add nuw i32 1, 1 +; NEWPM-NEXT: [[BASE_I4_I:%.*]] = getelementptr inbounds [[CLASS_HOMEMADEVECTOR_0]], %class.HomemadeVector.0* [[TMP0]], i64 undef, i32 0 +; NEWPM-NEXT: [[TMP2:%.*]] = load float*, float** [[BASE_I4_I]], align 8, !tbaa [[TBAA8:![0-9]+]] +; NEWPM-NEXT: [[ARRAYIDX_I5_I:%.*]] = getelementptr inbounds float, float* [[TMP2]], i64 undef +; NEWPM-NEXT: [[TMP3:%.*]] = load %class.HomemadeVector.0*, %class.HomemadeVector.0** [[BASE_I6_I]], align 8, !tbaa [[TBAA0]] +; NEWPM-NEXT: [[BASE_I8_I:%.*]] = getelementptr inbounds [[CLASS_HOMEMADEVECTOR_0]], %class.HomemadeVector.0* [[TMP3]], i64 undef, i32 0 +; NEWPM-NEXT: [[TMP4:%.*]] = load float*, float** [[BASE_I8_I]], align 8, !tbaa [[TBAA8]] +; NEWPM-NEXT: [[ARRAYIDX_I9_I:%.*]] = getelementptr inbounds float, float* [[TMP4]], i64 undef +; NEWPM-NEXT: [[DOTPRE_I:%.*]] = load float, float* [[ARRAYIDX_I9_I]], align 4, !tbaa [[TBAA9:![0-9]+]] ; NEWPM-NEXT: br label [[FOR_BODY7_I:%.*]] ; NEWPM: for.body7.i: -; NEWPM-NEXT: [[TMP7:%.*]] = phi float [ [[ADD_I:%.*]], [[FOR_BODY7_I_FOR_BODY7_I_CRIT_EDGE:%.*]] ], [ [[ADD_PEEL_I]], [[FOR_BODY7_LR_PH_I_FOR_BODY7_I_CRIT_EDGE]] ] -; NEWPM-NEXT: [[INC_I_PHI:%.*]] = phi i32 [ [[INC_I_0:%.*]], [[FOR_BODY7_I_FOR_BODY7_I_CRIT_EDGE]] ], [ [[INC_I_1]], [[FOR_BODY7_LR_PH_I_FOR_BODY7_I_CRIT_EDGE]] ] -; NEWPM-NEXT: [[TMP8:%.*]] = load float, float* [[ARRAYIDX_I5_PEEL_I]], align 4, !tbaa [[TBAA9]] -; NEWPM-NEXT: [[ADD_I]] = fadd float [[TMP7]], [[TMP8]] +; NEWPM-NEXT: [[TMP5:%.*]] = phi float [ [[DOTPRE_I]], [[FOR_BODY7_LR_PH_I]] ], [ [[ADD_I:%.*]], [[FOR_BODY7_I]] ] +; NEWPM-NEXT: [[J_011_I:%.*]] = phi i32 [ 0, [[FOR_BODY7_LR_PH_I]] ], [ [[INC_I:%.*]], [[FOR_BODY7_I]] ] +; NEWPM-NEXT: [[TMP6:%.*]] = load float, float* [[ARRAYIDX_I5_I]], align 4, !tbaa [[TBAA9]] +; NEWPM-NEXT: [[ADD_I]] = fadd float [[TMP5]], [[TMP6]] ; NEWPM-NEXT: store float [[ADD_I]], float* [[ARRAYIDX_I9_I]], align 4, !tbaa [[TBAA9]] -; NEWPM-NEXT: [[EXITCOND_NOT_I:%.*]] = icmp eq i32 [[INC_I_PHI]], [[TMP1]] -; NEWPM-NEXT: br i1 [[EXITCOND_NOT_I]], label [[_ZN12FLOATVECPAIR6VECINCEV_EXIT]], label [[FOR_BODY7_I_FOR_BODY7_I_CRIT_EDGE]], !llvm.loop [[LOOP11:![0-9]+]] -; NEWPM: for.body7.i.for.body7.i_crit_edge: -; NEWPM-NEXT: [[INC_I_0]] = add nuw i32 [[INC_I_PHI]], 1 -; NEWPM-NEXT: br label [[FOR_BODY7_I]] +; NEWPM-NEXT: [[INC_I]] = add nuw i32 [[J_011_I]], 1 +; NEWPM-NEXT: [[EXITCOND_NOT_I:%.*]] = icmp eq i32 [[INC_I]], [[TMP1]] +; NEWPM-NEXT: br i1 [[EXITCOND_NOT_I]], label [[_ZN12FLOATVECPAIR6VECINCEV_EXIT]], label [[FOR_BODY7_I]], !llvm.loop [[LOOP11:![0-9]+]] ; NEWPM: _ZN12FloatVecPair6vecIncEv.exit: ; NEWPM-NEXT: ret void ; diff --git a/llvm/test/Transforms/PhaseOrdering/X86/vdiv.ll b/llvm/test/Transforms/PhaseOrdering/X86/vdiv.ll index 280f849dbb35..8b8b535f1a77 100644 --- a/llvm/test/Transforms/PhaseOrdering/X86/vdiv.ll +++ b/llvm/test/Transforms/PhaseOrdering/X86/vdiv.ll @@ -15,18 +15,18 @@ define void @vdiv(double* %x, double* %y, double %a, i32 %N) #0 { ; CHECK-LABEL: @vdiv( ; CHECK-NEXT: entry: ; CHECK-NEXT: [[CMP1:%.*]] = icmp sgt i32 [[N:%.*]], 0 -; CHECK-NEXT: br i1 [[CMP1]], label [[FOR_BODY_LR_PH:%.*]], label [[FOR_END:%.*]] -; CHECK: for.body.lr.ph: +; CHECK-NEXT: br i1 [[CMP1]], label [[FOR_BODY_PREHEADER:%.*]], label [[FOR_END:%.*]] +; CHECK: for.body.preheader: ; CHECK-NEXT: [[WIDE_TRIP_COUNT:%.*]] = zext i32 [[N]] to i64 ; CHECK-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i32 [[N]], 4 -; CHECK-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[FOR_BODY_PREHEADER:%.*]], label [[VECTOR_MEMCHECK:%.*]] +; CHECK-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[FOR_BODY_PREHEADER8:%.*]], label [[VECTOR_MEMCHECK:%.*]] ; CHECK: vector.memcheck: ; CHECK-NEXT: [[SCEVGEP:%.*]] = getelementptr double, double* [[X:%.*]], i64 [[WIDE_TRIP_COUNT]] ; CHECK-NEXT: [[SCEVGEP6:%.*]] = getelementptr double, double* [[Y:%.*]], i64 [[WIDE_TRIP_COUNT]] ; CHECK-NEXT: [[BOUND0:%.*]] = icmp ugt double* [[SCEVGEP6]], [[X]] ; CHECK-NEXT: [[BOUND1:%.*]] = icmp ugt double* [[SCEVGEP]], [[Y]] ; CHECK-NEXT: [[FOUND_CONFLICT:%.*]] = and i1 [[BOUND0]], [[BOUND1]] -; CHECK-NEXT: br i1 [[FOUND_CONFLICT]], label [[FOR_BODY_PREHEADER]], label [[VECTOR_PH:%.*]] +; CHECK-NEXT: br i1 [[FOUND_CONFLICT]], label [[FOR_BODY_PREHEADER8]], label [[VECTOR_PH:%.*]] ; CHECK: vector.ph: ; CHECK-NEXT: [[N_VEC:%.*]] = and i64 [[WIDE_TRIP_COUNT]], 4294967292 ; CHECK-NEXT: [[BROADCAST_SPLATINSERT:%.*]] = insertelement <4 x double> poison, double [[A:%.*]], i32 0 @@ -49,39 +49,39 @@ define void @vdiv(double* %x, double* %y, double %a, i32 %N) #0 { ; CHECK-NEXT: [[NITER:%.*]] = phi i64 [ [[UNROLL_ITER]], [[VECTOR_PH_NEW]] ], [ [[NITER_NSUB_3:%.*]], [[VECTOR_BODY]] ] ; CHECK-NEXT: [[TMP8:%.*]] = getelementptr inbounds double, double* [[Y]], i64 [[INDEX]] ; CHECK-NEXT: [[TMP9:%.*]] = bitcast double* [[TMP8]] to <4 x double>* -; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <4 x double>, <4 x double>* [[TMP9]], align 8, [[TBAA3:!tbaa !.*]], !alias.scope !7 +; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <4 x double>, <4 x double>* [[TMP9]], align 8, !tbaa [[TBAA3:![0-9]+]], !alias.scope !7 ; CHECK-NEXT: [[TMP10:%.*]] = fmul fast <4 x double> [[WIDE_LOAD]], [[TMP4]] ; CHECK-NEXT: [[TMP11:%.*]] = getelementptr inbounds double, double* [[X]], i64 [[INDEX]] ; CHECK-NEXT: [[TMP12:%.*]] = bitcast double* [[TMP11]] to <4 x double>* -; CHECK-NEXT: store <4 x double> [[TMP10]], <4 x double>* [[TMP12]], align 8, [[TBAA3]], !alias.scope !10, !noalias !7 +; CHECK-NEXT: store <4 x double> [[TMP10]], <4 x double>* [[TMP12]], align 8, !tbaa [[TBAA3]], !alias.scope !10, !noalias !7 ; CHECK-NEXT: [[INDEX_NEXT:%.*]] = or i64 [[INDEX]], 4 ; CHECK-NEXT: [[TMP13:%.*]] = getelementptr inbounds double, double* [[Y]], i64 [[INDEX_NEXT]] ; CHECK-NEXT: [[TMP14:%.*]] = bitcast double* [[TMP13]] to <4 x double>* -; CHECK-NEXT: [[WIDE_LOAD_1:%.*]] = load <4 x double>, <4 x double>* [[TMP14]], align 8, [[TBAA3]], !alias.scope !7 +; CHECK-NEXT: [[WIDE_LOAD_1:%.*]] = load <4 x double>, <4 x double>* [[TMP14]], align 8, !tbaa [[TBAA3]], !alias.scope !7 ; CHECK-NEXT: [[TMP15:%.*]] = fmul fast <4 x double> [[WIDE_LOAD_1]], [[TMP5]] ; CHECK-NEXT: [[TMP16:%.*]] = getelementptr inbounds double, double* [[X]], i64 [[INDEX_NEXT]] ; CHECK-NEXT: [[TMP17:%.*]] = bitcast double* [[TMP16]] to <4 x double>* -; CHECK-NEXT: store <4 x double> [[TMP15]], <4 x double>* [[TMP17]], align 8, [[TBAA3]], !alias.scope !10, !noalias !7 +; CHECK-NEXT: store <4 x double> [[TMP15]], <4 x double>* [[TMP17]], align 8, !tbaa [[TBAA3]], !alias.scope !10, !noalias !7 ; CHECK-NEXT: [[INDEX_NEXT_1:%.*]] = or i64 [[INDEX]], 8 ; CHECK-NEXT: [[TMP18:%.*]] = getelementptr inbounds double, double* [[Y]], i64 [[INDEX_NEXT_1]] ; CHECK-NEXT: [[TMP19:%.*]] = bitcast double* [[TMP18]] to <4 x double>* -; CHECK-NEXT: [[WIDE_LOAD_2:%.*]] = load <4 x double>, <4 x double>* [[TMP19]], align 8, [[TBAA3]], !alias.scope !7 +; CHECK-NEXT: [[WIDE_LOAD_2:%.*]] = load <4 x double>, <4 x double>* [[TMP19]], align 8, !tbaa [[TBAA3]], !alias.scope !7 ; CHECK-NEXT: [[TMP20:%.*]] = fmul fast <4 x double> [[WIDE_LOAD_2]], [[TMP6]] ; CHECK-NEXT: [[TMP21:%.*]] = getelementptr inbounds double, double* [[X]], i64 [[INDEX_NEXT_1]] ; CHECK-NEXT: [[TMP22:%.*]] = bitcast double* [[TMP21]] to <4 x double>* -; CHECK-NEXT: store <4 x double> [[TMP20]], <4 x double>* [[TMP22]], align 8, [[TBAA3]], !alias.scope !10, !noalias !7 +; CHECK-NEXT: store <4 x double> [[TMP20]], <4 x double>* [[TMP22]], align 8, !tbaa [[TBAA3]], !alias.scope !10, !noalias !7 ; CHECK-NEXT: [[INDEX_NEXT_2:%.*]] = or i64 [[INDEX]], 12 ; CHECK-NEXT: [[TMP23:%.*]] = getelementptr inbounds double, double* [[Y]], i64 [[INDEX_NEXT_2]] ; CHECK-NEXT: [[TMP24:%.*]] = bitcast double* [[TMP23]] to <4 x double>* -; CHECK-NEXT: [[WIDE_LOAD_3:%.*]] = load <4 x double>, <4 x double>* [[TMP24]], align 8, [[TBAA3]], !alias.scope !7 +; CHECK-NEXT: [[WIDE_LOAD_3:%.*]] = load <4 x double>, <4 x double>* [[TMP24]], align 8, !tbaa [[TBAA3]], !alias.scope !7 ; CHECK-NEXT: [[TMP25:%.*]] = fmul fast <4 x double> [[WIDE_LOAD_3]], [[TMP7]] ; CHECK-NEXT: [[TMP26:%.*]] = getelementptr inbounds double, double* [[X]], i64 [[INDEX_NEXT_2]] ; CHECK-NEXT: [[TMP27:%.*]] = bitcast double* [[TMP26]] to <4 x double>* -; CHECK-NEXT: store <4 x double> [[TMP25]], <4 x double>* [[TMP27]], align 8, [[TBAA3]], !alias.scope !10, !noalias !7 +; CHECK-NEXT: store <4 x double> [[TMP25]], <4 x double>* [[TMP27]], align 8, !tbaa [[TBAA3]], !alias.scope !10, !noalias !7 ; CHECK-NEXT: [[INDEX_NEXT_3]] = add i64 [[INDEX]], 16 ; CHECK-NEXT: [[NITER_NSUB_3]] = add i64 [[NITER]], -4 ; CHECK-NEXT: [[NITER_NCMP_3:%.*]] = icmp eq i64 [[NITER_NSUB_3]], 0 -; CHECK-NEXT: br i1 [[NITER_NCMP_3]], label [[MIDDLE_BLOCK_UNR_LCSSA]], label [[VECTOR_BODY]], [[LOOP12:!llvm.loop !.*]] +; CHECK-NEXT: br i1 [[NITER_NCMP_3]], label [[MIDDLE_BLOCK_UNR_LCSSA]], label [[VECTOR_BODY]], !llvm.loop [[LOOP12:![0-9]+]] ; CHECK: middle.block.unr-lcssa: ; CHECK-NEXT: [[INDEX_UNR:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT_3]], [[VECTOR_BODY]] ] ; CHECK-NEXT: [[LCMP_MOD_NOT:%.*]] = icmp eq i64 [[XTRAITER]], 0 @@ -94,78 +94,78 @@ define void @vdiv(double* %x, double* %y, double %a, i32 %N) #0 { ; CHECK-NEXT: [[EPIL_ITER:%.*]] = phi i64 [ [[XTRAITER]], [[VECTOR_BODY_EPIL_PREHEADER]] ], [ [[EPIL_ITER_SUB:%.*]], [[VECTOR_BODY_EPIL]] ] ; CHECK-NEXT: [[TMP29:%.*]] = getelementptr inbounds double, double* [[Y]], i64 [[INDEX_EPIL]] ; CHECK-NEXT: [[TMP30:%.*]] = bitcast double* [[TMP29]] to <4 x double>* -; CHECK-NEXT: [[WIDE_LOAD_EPIL:%.*]] = load <4 x double>, <4 x double>* [[TMP30]], align 8, [[TBAA3]], !alias.scope !7 +; CHECK-NEXT: [[WIDE_LOAD_EPIL:%.*]] = load <4 x double>, <4 x double>* [[TMP30]], align 8, !tbaa [[TBAA3]], !alias.scope !7 ; CHECK-NEXT: [[TMP31:%.*]] = fmul fast <4 x double> [[WIDE_LOAD_EPIL]], [[TMP28]] ; CHECK-NEXT: [[TMP32:%.*]] = getelementptr inbounds double, double* [[X]], i64 [[INDEX_EPIL]] ; CHECK-NEXT: [[TMP33:%.*]] = bitcast double* [[TMP32]] to <4 x double>* -; CHECK-NEXT: store <4 x double> [[TMP31]], <4 x double>* [[TMP33]], align 8, [[TBAA3]], !alias.scope !10, !noalias !7 +; CHECK-NEXT: store <4 x double> [[TMP31]], <4 x double>* [[TMP33]], align 8, !tbaa [[TBAA3]], !alias.scope !10, !noalias !7 ; CHECK-NEXT: [[INDEX_NEXT_EPIL]] = add i64 [[INDEX_EPIL]], 4 ; CHECK-NEXT: [[EPIL_ITER_SUB]] = add i64 [[EPIL_ITER]], -1 ; CHECK-NEXT: [[EPIL_ITER_CMP_NOT:%.*]] = icmp eq i64 [[EPIL_ITER_SUB]], 0 -; CHECK-NEXT: br i1 [[EPIL_ITER_CMP_NOT]], label [[MIDDLE_BLOCK]], label [[VECTOR_BODY_EPIL]], [[LOOP14:!llvm.loop !.*]] +; CHECK-NEXT: br i1 [[EPIL_ITER_CMP_NOT]], label [[MIDDLE_BLOCK]], label [[VECTOR_BODY_EPIL]], !llvm.loop [[LOOP14:![0-9]+]] ; CHECK: middle.block: ; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[N_VEC]], [[WIDE_TRIP_COUNT]] -; CHECK-NEXT: br i1 [[CMP_N]], label [[FOR_END]], label [[FOR_BODY_PREHEADER]] -; CHECK: for.body.preheader: -; CHECK-NEXT: [[INDVARS_IV_PH:%.*]] = phi i64 [ 0, [[VECTOR_MEMCHECK]] ], [ 0, [[FOR_BODY_LR_PH]] ], [ [[N_VEC]], [[MIDDLE_BLOCK]] ] +; CHECK-NEXT: br i1 [[CMP_N]], label [[FOR_END]], label [[FOR_BODY_PREHEADER8]] +; CHECK: for.body.preheader8: +; CHECK-NEXT: [[INDVARS_IV_PH:%.*]] = phi i64 [ 0, [[VECTOR_MEMCHECK]] ], [ 0, [[FOR_BODY_PREHEADER]] ], [ [[N_VEC]], [[MIDDLE_BLOCK]] ] ; CHECK-NEXT: [[TMP34:%.*]] = xor i64 [[INDVARS_IV_PH]], -1 ; CHECK-NEXT: [[TMP35:%.*]] = add nsw i64 [[TMP34]], [[WIDE_TRIP_COUNT]] -; CHECK-NEXT: [[XTRAITER8:%.*]] = and i64 [[WIDE_TRIP_COUNT]], 3 -; CHECK-NEXT: [[LCMP_MOD9_NOT:%.*]] = icmp eq i64 [[XTRAITER8]], 0 -; CHECK-NEXT: br i1 [[LCMP_MOD9_NOT]], label [[FOR_BODY_PROL_LOOPEXIT:%.*]], label [[FOR_BODY_PROL_PREHEADER:%.*]] +; CHECK-NEXT: [[XTRAITER9:%.*]] = and i64 [[WIDE_TRIP_COUNT]], 3 +; CHECK-NEXT: [[LCMP_MOD10_NOT:%.*]] = icmp eq i64 [[XTRAITER9]], 0 +; CHECK-NEXT: br i1 [[LCMP_MOD10_NOT]], label [[FOR_BODY_PROL_LOOPEXIT:%.*]], label [[FOR_BODY_PROL_PREHEADER:%.*]] ; CHECK: for.body.prol.preheader: ; CHECK-NEXT: [[TMP36:%.*]] = fdiv fast double 1.000000e+00, [[A]] ; CHECK-NEXT: br label [[FOR_BODY_PROL:%.*]] ; CHECK: for.body.prol: ; CHECK-NEXT: [[INDVARS_IV_PROL:%.*]] = phi i64 [ [[INDVARS_IV_NEXT_PROL:%.*]], [[FOR_BODY_PROL]] ], [ [[INDVARS_IV_PH]], [[FOR_BODY_PROL_PREHEADER]] ] -; CHECK-NEXT: [[PROL_ITER:%.*]] = phi i64 [ [[PROL_ITER_SUB:%.*]], [[FOR_BODY_PROL]] ], [ [[XTRAITER8]], [[FOR_BODY_PROL_PREHEADER]] ] +; CHECK-NEXT: [[PROL_ITER:%.*]] = phi i64 [ [[PROL_ITER_SUB:%.*]], [[FOR_BODY_PROL]] ], [ [[XTRAITER9]], [[FOR_BODY_PROL_PREHEADER]] ] ; CHECK-NEXT: [[ARRAYIDX_PROL:%.*]] = getelementptr inbounds double, double* [[Y]], i64 [[INDVARS_IV_PROL]] -; CHECK-NEXT: [[T0_PROL:%.*]] = load double, double* [[ARRAYIDX_PROL]], align 8, [[TBAA3]] +; CHECK-NEXT: [[T0_PROL:%.*]] = load double, double* [[ARRAYIDX_PROL]], align 8, !tbaa [[TBAA3]] ; CHECK-NEXT: [[TMP37:%.*]] = fmul fast double [[T0_PROL]], [[TMP36]] ; CHECK-NEXT: [[ARRAYIDX2_PROL:%.*]] = getelementptr inbounds double, double* [[X]], i64 [[INDVARS_IV_PROL]] -; CHECK-NEXT: store double [[TMP37]], double* [[ARRAYIDX2_PROL]], align 8, [[TBAA3]] +; CHECK-NEXT: store double [[TMP37]], double* [[ARRAYIDX2_PROL]], align 8, !tbaa [[TBAA3]] ; CHECK-NEXT: [[INDVARS_IV_NEXT_PROL]] = add nuw nsw i64 [[INDVARS_IV_PROL]], 1 ; CHECK-NEXT: [[PROL_ITER_SUB]] = add i64 [[PROL_ITER]], -1 ; CHECK-NEXT: [[PROL_ITER_CMP_NOT:%.*]] = icmp eq i64 [[PROL_ITER_SUB]], 0 -; CHECK-NEXT: br i1 [[PROL_ITER_CMP_NOT]], label [[FOR_BODY_PROL_LOOPEXIT]], label [[FOR_BODY_PROL]], [[LOOP16:!llvm.loop !.*]] +; CHECK-NEXT: br i1 [[PROL_ITER_CMP_NOT]], label [[FOR_BODY_PROL_LOOPEXIT]], label [[FOR_BODY_PROL]], !llvm.loop [[LOOP16:![0-9]+]] ; CHECK: for.body.prol.loopexit: -; CHECK-NEXT: [[INDVARS_IV_UNR:%.*]] = phi i64 [ [[INDVARS_IV_PH]], [[FOR_BODY_PREHEADER]] ], [ [[INDVARS_IV_NEXT_PROL]], [[FOR_BODY_PROL]] ] +; CHECK-NEXT: [[INDVARS_IV_UNR:%.*]] = phi i64 [ [[INDVARS_IV_PH]], [[FOR_BODY_PREHEADER8]] ], [ [[INDVARS_IV_NEXT_PROL]], [[FOR_BODY_PROL]] ] ; CHECK-NEXT: [[TMP38:%.*]] = icmp ult i64 [[TMP35]], 3 -; CHECK-NEXT: br i1 [[TMP38]], label [[FOR_END]], label [[FOR_BODY_PREHEADER_NEW:%.*]] -; CHECK: for.body.preheader.new: +; CHECK-NEXT: br i1 [[TMP38]], label [[FOR_END]], label [[FOR_BODY_PREHEADER8_NEW:%.*]] +; CHECK: for.body.preheader8.new: ; CHECK-NEXT: [[TMP39:%.*]] = fdiv fast double 1.000000e+00, [[A]] ; CHECK-NEXT: [[TMP40:%.*]] = fdiv fast double 1.000000e+00, [[A]] ; CHECK-NEXT: [[TMP41:%.*]] = fdiv fast double 1.000000e+00, [[A]] ; CHECK-NEXT: [[TMP42:%.*]] = fdiv fast double 1.000000e+00, [[A]] ; CHECK-NEXT: br label [[FOR_BODY:%.*]] ; CHECK: for.body: -; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[INDVARS_IV_UNR]], [[FOR_BODY_PREHEADER_NEW]] ], [ [[INDVARS_IV_NEXT_3:%.*]], [[FOR_BODY]] ] +; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[INDVARS_IV_UNR]], [[FOR_BODY_PREHEADER8_NEW]] ], [ [[INDVARS_IV_NEXT_3:%.*]], [[FOR_BODY]] ] ; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds double, double* [[Y]], i64 [[INDVARS_IV]] -; CHECK-NEXT: [[T0:%.*]] = load double, double* [[ARRAYIDX]], align 8, [[TBAA3]] +; CHECK-NEXT: [[T0:%.*]] = load double, double* [[ARRAYIDX]], align 8, !tbaa [[TBAA3]] ; CHECK-NEXT: [[TMP43:%.*]] = fmul fast double [[T0]], [[TMP39]] ; CHECK-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds double, double* [[X]], i64 [[INDVARS_IV]] -; CHECK-NEXT: store double [[TMP43]], double* [[ARRAYIDX2]], align 8, [[TBAA3]] +; CHECK-NEXT: store double [[TMP43]], double* [[ARRAYIDX2]], align 8, !tbaa [[TBAA3]] ; CHECK-NEXT: [[INDVARS_IV_NEXT:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 1 ; CHECK-NEXT: [[ARRAYIDX_1:%.*]] = getelementptr inbounds double, double* [[Y]], i64 [[INDVARS_IV_NEXT]] -; CHECK-NEXT: [[T0_1:%.*]] = load double, double* [[ARRAYIDX_1]], align 8, [[TBAA3]] +; CHECK-NEXT: [[T0_1:%.*]] = load double, double* [[ARRAYIDX_1]], align 8, !tbaa [[TBAA3]] ; CHECK-NEXT: [[TMP44:%.*]] = fmul fast double [[T0_1]], [[TMP40]] ; CHECK-NEXT: [[ARRAYIDX2_1:%.*]] = getelementptr inbounds double, double* [[X]], i64 [[INDVARS_IV_NEXT]] -; CHECK-NEXT: store double [[TMP44]], double* [[ARRAYIDX2_1]], align 8, [[TBAA3]] +; CHECK-NEXT: store double [[TMP44]], double* [[ARRAYIDX2_1]], align 8, !tbaa [[TBAA3]] ; CHECK-NEXT: [[INDVARS_IV_NEXT_1:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 2 ; CHECK-NEXT: [[ARRAYIDX_2:%.*]] = getelementptr inbounds double, double* [[Y]], i64 [[INDVARS_IV_NEXT_1]] -; CHECK-NEXT: [[T0_2:%.*]] = load double, double* [[ARRAYIDX_2]], align 8, [[TBAA3]] +; CHECK-NEXT: [[T0_2:%.*]] = load double, double* [[ARRAYIDX_2]], align 8, !tbaa [[TBAA3]] ; CHECK-NEXT: [[TMP45:%.*]] = fmul fast double [[T0_2]], [[TMP41]] ; CHECK-NEXT: [[ARRAYIDX2_2:%.*]] = getelementptr inbounds double, double* [[X]], i64 [[INDVARS_IV_NEXT_1]] -; CHECK-NEXT: store double [[TMP45]], double* [[ARRAYIDX2_2]], align 8, [[TBAA3]] +; CHECK-NEXT: store double [[TMP45]], double* [[ARRAYIDX2_2]], align 8, !tbaa [[TBAA3]] ; CHECK-NEXT: [[INDVARS_IV_NEXT_2:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 3 ; CHECK-NEXT: [[ARRAYIDX_3:%.*]] = getelementptr inbounds double, double* [[Y]], i64 [[INDVARS_IV_NEXT_2]] -; CHECK-NEXT: [[T0_3:%.*]] = load double, double* [[ARRAYIDX_3]], align 8, [[TBAA3]] +; CHECK-NEXT: [[T0_3:%.*]] = load double, double* [[ARRAYIDX_3]], align 8, !tbaa [[TBAA3]] ; CHECK-NEXT: [[TMP46:%.*]] = fmul fast double [[T0_3]], [[TMP42]] ; CHECK-NEXT: [[ARRAYIDX2_3:%.*]] = getelementptr inbounds double, double* [[X]], i64 [[INDVARS_IV_NEXT_2]] -; CHECK-NEXT: store double [[TMP46]], double* [[ARRAYIDX2_3]], align 8, [[TBAA3]] +; CHECK-NEXT: store double [[TMP46]], double* [[ARRAYIDX2_3]], align 8, !tbaa [[TBAA3]] ; CHECK-NEXT: [[INDVARS_IV_NEXT_3]] = add nuw nsw i64 [[INDVARS_IV]], 4 ; CHECK-NEXT: [[EXITCOND_NOT_3:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT_3]], [[WIDE_TRIP_COUNT]] -; CHECK-NEXT: br i1 [[EXITCOND_NOT_3]], label [[FOR_END]], label [[FOR_BODY]], [[LOOP17:!llvm.loop !.*]] +; CHECK-NEXT: br i1 [[EXITCOND_NOT_3]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP17:![0-9]+]] ; CHECK: for.end: ; CHECK-NEXT: ret void ; diff --git a/llvm/test/Transforms/PhaseOrdering/loop-rotation-vs-common-code-hoisting.ll b/llvm/test/Transforms/PhaseOrdering/loop-rotation-vs-common-code-hoisting.ll index 7d7d18a5247d..bb320af193e3 100644 --- a/llvm/test/Transforms/PhaseOrdering/loop-rotation-vs-common-code-hoisting.ll +++ b/llvm/test/Transforms/PhaseOrdering/loop-rotation-vs-common-code-hoisting.ll @@ -76,18 +76,20 @@ define void @_Z4loopi(i32 %width) { ; ROTATED_LATER_OLDPM-NEXT: [[CMP:%.*]] = icmp slt i32 [[WIDTH:%.*]], 1 ; ROTATED_LATER_OLDPM-NEXT: br i1 [[CMP]], label [[RETURN:%.*]], label [[FOR_COND_PREHEADER:%.*]] ; ROTATED_LATER_OLDPM: for.cond.preheader: +; ROTATED_LATER_OLDPM-NEXT: [[CMP13_NOT:%.*]] = icmp eq i32 [[WIDTH]], 1 +; ROTATED_LATER_OLDPM-NEXT: br i1 [[CMP13_NOT]], label [[FOR_COND_CLEANUP:%.*]], label [[FOR_BODY_PREHEADER:%.*]] +; ROTATED_LATER_OLDPM: for.body.preheader: ; ROTATED_LATER_OLDPM-NEXT: [[TMP0:%.*]] = add nsw i32 [[WIDTH]], -1 -; ROTATED_LATER_OLDPM-NEXT: [[EXITCOND_NOT3:%.*]] = icmp eq i32 [[TMP0]], 0 -; ROTATED_LATER_OLDPM-NEXT: br i1 [[EXITCOND_NOT3]], label [[FOR_COND_CLEANUP:%.*]], label [[FOR_BODY:%.*]] +; ROTATED_LATER_OLDPM-NEXT: br label [[FOR_BODY:%.*]] ; ROTATED_LATER_OLDPM: for.cond.cleanup: ; ROTATED_LATER_OLDPM-NEXT: tail call void @f0() ; ROTATED_LATER_OLDPM-NEXT: tail call void @f2() ; ROTATED_LATER_OLDPM-NEXT: br label [[RETURN]] ; ROTATED_LATER_OLDPM: for.body: -; ROTATED_LATER_OLDPM-NEXT: [[I_04:%.*]] = phi i32 [ [[INC:%.*]], [[FOR_BODY]] ], [ 0, [[FOR_COND_PREHEADER]] ] +; ROTATED_LATER_OLDPM-NEXT: [[I_04:%.*]] = phi i32 [ [[INC:%.*]], [[FOR_BODY]] ], [ 0, [[FOR_BODY_PREHEADER]] ] ; ROTATED_LATER_OLDPM-NEXT: tail call void @f0() ; ROTATED_LATER_OLDPM-NEXT: tail call void @f1() -; ROTATED_LATER_OLDPM-NEXT: [[INC]] = add nuw i32 [[I_04]], 1 +; ROTATED_LATER_OLDPM-NEXT: [[INC]] = add nuw nsw i32 [[I_04]], 1 ; ROTATED_LATER_OLDPM-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i32 [[INC]], [[TMP0]] ; ROTATED_LATER_OLDPM-NEXT: br i1 [[EXITCOND_NOT]], label [[FOR_COND_CLEANUP]], label [[FOR_BODY]] ; ROTATED_LATER_OLDPM: return: @@ -98,24 +100,24 @@ define void @_Z4loopi(i32 %width) { ; ROTATED_LATER_NEWPM-NEXT: [[CMP:%.*]] = icmp slt i32 [[WIDTH:%.*]], 1 ; ROTATED_LATER_NEWPM-NEXT: br i1 [[CMP]], label [[RETURN:%.*]], label [[FOR_COND_PREHEADER:%.*]] ; ROTATED_LATER_NEWPM: for.cond.preheader: +; ROTATED_LATER_NEWPM-NEXT: [[CMP13_NOT:%.*]] = icmp eq i32 [[WIDTH]], 1 +; ROTATED_LATER_NEWPM-NEXT: br i1 [[CMP13_NOT]], label [[FOR_COND_CLEANUP:%.*]], label [[FOR_BODY_PREHEADER:%.*]] +; ROTATED_LATER_NEWPM: for.body.preheader: ; ROTATED_LATER_NEWPM-NEXT: [[TMP0:%.*]] = add nsw i32 [[WIDTH]], -1 -; ROTATED_LATER_NEWPM-NEXT: [[EXITCOND_NOT3:%.*]] = icmp eq i32 [[TMP0]], 0 -; ROTATED_LATER_NEWPM-NEXT: br i1 [[EXITCOND_NOT3]], label [[FOR_COND_CLEANUP:%.*]], label [[FOR_COND_PREHEADER_FOR_BODY_CRIT_EDGE:%.*]] -; ROTATED_LATER_NEWPM: for.cond.preheader.for.body_crit_edge: -; ROTATED_LATER_NEWPM-NEXT: [[INC_1:%.*]] = add nuw i32 0, 1 +; ROTATED_LATER_NEWPM-NEXT: [[INC_1:%.*]] = add nuw nsw i32 0, 1 ; ROTATED_LATER_NEWPM-NEXT: br label [[FOR_BODY:%.*]] ; ROTATED_LATER_NEWPM: for.cond.cleanup: ; ROTATED_LATER_NEWPM-NEXT: tail call void @f0() ; ROTATED_LATER_NEWPM-NEXT: tail call void @f2() ; ROTATED_LATER_NEWPM-NEXT: br label [[RETURN]] ; ROTATED_LATER_NEWPM: for.body: -; ROTATED_LATER_NEWPM-NEXT: [[INC_PHI:%.*]] = phi i32 [ [[INC_0:%.*]], [[FOR_BODY_FOR_BODY_CRIT_EDGE:%.*]] ], [ [[INC_1]], [[FOR_COND_PREHEADER_FOR_BODY_CRIT_EDGE]] ] +; ROTATED_LATER_NEWPM-NEXT: [[INC_PHI:%.*]] = phi i32 [ [[INC_0:%.*]], [[FOR_BODY_FOR_BODY_CRIT_EDGE:%.*]] ], [ [[INC_1]], [[FOR_BODY_PREHEADER]] ] ; ROTATED_LATER_NEWPM-NEXT: tail call void @f0() ; ROTATED_LATER_NEWPM-NEXT: tail call void @f1() ; ROTATED_LATER_NEWPM-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i32 [[INC_PHI]], [[TMP0]] ; ROTATED_LATER_NEWPM-NEXT: br i1 [[EXITCOND_NOT]], label [[FOR_COND_CLEANUP]], label [[FOR_BODY_FOR_BODY_CRIT_EDGE]] ; ROTATED_LATER_NEWPM: for.body.for.body_crit_edge: -; ROTATED_LATER_NEWPM-NEXT: [[INC_0]] = add nuw i32 [[INC_PHI]], 1 +; ROTATED_LATER_NEWPM-NEXT: [[INC_0]] = add nuw nsw i32 [[INC_PHI]], 1 ; ROTATED_LATER_NEWPM-NEXT: br label [[FOR_BODY]] ; ROTATED_LATER_NEWPM: return: ; ROTATED_LATER_NEWPM-NEXT: ret void </cut>

4 years

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tk1/llvm-release-arm-spec2k6-O2_LTO - Build # 12 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_tk1/llvm-release-arm-spec2k6-O2_LTO. So far, this commit has regressed CI configurations: - tcwg_bmk_llvm_tk1/llvm-release-arm-spec2k6-O2_LTO Culprit: <cut> commit 880822255e21179e9706ebaf77fff9111d9d3844 Author: Tobias Gysi <gysit(a)google.com> Date: Wed Mar 24 14:22:17 2021 +0000 [mlir][linalg] Do not call region builder during vectorization. All linalg operations having a region builder shall call it during op creation. Calling it during vectorization is obsolete. Differential Revision: https://reviews.llvm.org/D99168 </cut> Results regressed to (for first_bad == 880822255e21179e9706ebaf77fff9111d9d3844) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O2_LTO_marm artifacts/build-880822255e21179e9706ebaf77fff9111d9d3844/results_id: 1 # 456.hmmer,hmmer_base.default regressed by 104 from (for last_good == 92417ebbd10382436136ed5e755be567304ac139) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O2_LTO_marm artifacts/build-92417ebbd10382436136ed5e755be567304ac139/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Results ID of last_good: tk1_32/tcwg_bmk_llvm_tk1/bisect-llvm-release-arm-spec2k6-O2_LTO/4267 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Results ID of first_bad: tk1_32/tcwg_bmk_llvm_tk1/bisect-llvm-release-arm-spec2k6-O2_LTO/4268 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Configuration details: Reproduce builds: <cut> mkdir investigate-llvm-880822255e21179e9706ebaf77fff9111d9d3844 cd investigate-llvm-880822255e21179e9706ebaf77fff9111d9d3844 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach 880822255e21179e9706ebaf77fff9111d9d3844 ../artifacts/test.sh # Reproduce last_good build git checkout --detach 92417ebbd10382436136ed5e755be567304ac139 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Full commit (up to 1000 lines): <cut> commit 880822255e21179e9706ebaf77fff9111d9d3844 Author: Tobias Gysi <gysit(a)google.com> Date: Wed Mar 24 14:22:17 2021 +0000 [mlir][linalg] Do not call region builder during vectorization. All linalg operations having a region builder shall call it during op creation. Calling it during vectorization is obsolete. Differential Revision: https://reviews.llvm.org/D99168 --- .../Dialect/Linalg/Transforms/Vectorization.cpp | 31 ++++++---------------- 1 file changed, 8 insertions(+), 23 deletions(-) diff --git a/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp b/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp index d4581013ae69..10562d68a9e0 100644 --- a/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp +++ b/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp @@ -288,7 +288,7 @@ static AffineMap getTransferReadMap(LinalgOp linalgOp, unsigned argIndex) { /// Generic vectorization function that rewrites the body of a `linalgOp` into /// vector form. Generic vectorization proceeds as follows: -/// 1. The region for the linalg op is created if necessary. +/// 1. Verify the `linalgOp` has one non-empty region. /// 2. Values defined above the region are mapped to themselves and will be /// broadcasted on a per-need basis by their consumers. /// 3. Each region argument is vectorized into a vector.transfer_read (or 0-d @@ -299,36 +299,21 @@ static AffineMap getTransferReadMap(LinalgOp linalgOp, unsigned argIndex) { LogicalResult vectorizeAsLinalgGeneric( OpBuilder &builder, LinalgOp linalgOp, SmallVectorImpl<Value> &newResults, ArrayRef<CustomVectorizationHook> customVectorizationHooks = {}) { - // 1. Certain Linalg ops do not have a region but only a region builder. - // If so, build the region so we can vectorize. - std::unique_ptr<Region> owningRegion; - Region *region; - if (linalgOp->getNumRegions() > 0) { - region = &linalgOp->getRegion(0); - } else { - // RAII avoid remaining in block. - OpBuilder::InsertionGuard g(builder); - owningRegion = std::make_unique<Region>(); - region = owningRegion.get(); - Block *block = builder.createBlock(region); - auto elementTypes = llvm::to_vector<4>( - llvm::map_range(linalgOp.getShapedOperandTypes(), - [](ShapedType t) { return t.getElementType(); })); - block->addArguments(elementTypes); - linalgOp.getRegionBuilder()(*block, /*captures=*/{}); - } - Block *block = &region->front(); + // 1. Fail to vectorize if the operation does not have one non-empty region. + if (linalgOp->getNumRegions() != 1 || linalgOp->getRegion(0).empty()) + return failure(); + auto &block = linalgOp->getRegion(0).front(); BlockAndValueMapping bvm; // 2. Values defined above the region can only be broadcast for now. Make them // map to themselves. llvm::SetVector<Value> valuesSet; - mlir::getUsedValuesDefinedAbove(*region, valuesSet); + mlir::getUsedValuesDefinedAbove(linalgOp->getRegion(0), valuesSet); bvm.map(valuesSet.getArrayRef(), valuesSet.getArrayRef()); // 3. Turn all BBArgs into vector.transfer_read / load. SmallVector<AffineMap> indexings; - for (auto bbarg : block->getArguments()) { + for (auto bbarg : block.getArguments()) { Value vectorArg = linalgOp.getShapedOperand(bbarg.getArgNumber()); AffineMap map; VectorType vectorType = extractVectorTypeFromShapedValue(vectorArg); @@ -360,7 +345,7 @@ LogicalResult vectorizeAsLinalgGeneric( hooks.push_back(vectorizeYield); // 5. Iteratively call `vectorizeOneOp` to each op in the slice. - for (Operation &op : block->getOperations()) { + for (Operation &op : block.getOperations()) { VectorizationResult result = vectorizeOneOp(builder, &op, bvm, hooks); if (result.status == VectorizationStatus::Failure) { LLVM_DEBUG(dbgs() << "\n[" DEBUG_TYPE "]: failed to vectorize: " << op); </cut>

4 years

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tx1/llvm-release-aarch64-spec2k6-O3_LTO - Build # 9 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_tx1/llvm-release-aarch64-spec2k6-O3_LTO. So far, this commit has regressed CI configurations: - tcwg_bmk_llvm_tx1/llvm-release-aarch64-spec2k6-O3_LTO Culprit: <cut> commit 0237dbfdd38053cc190f814b6f92e311ae3509c6 Author: Chuanqi Xu <yedeng.yd(a)linux.alibaba.com> Date: Tue Jul 27 13:13:39 2021 +0800 [Coroutine] Record the elided coroutines Reviewed By: lxfind Differential Revision: https://reviews.llvm.org/D105606 </cut> Results regressed to (for first_bad == 0237dbfdd38053cc190f814b6f92e311ae3509c6) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O3_LTO artifacts/build-0237dbfdd38053cc190f814b6f92e311ae3509c6/results_id: 1 # 458.sjeng,sjeng_base.default regressed by 104 from (for last_good == dbefcde6da1b58eb181dcbd8d7913175b2ec8350) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O3_LTO artifacts/build-dbefcde6da1b58eb181dcbd8d7913175b2ec8350/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… Results ID of last_good: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-release-aarch64-spec2k6-O3_LTO/4264 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… Results ID of first_bad: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-release-aarch64-spec2k6-O3_LTO/4262 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… Configuration details: Reproduce builds: <cut> mkdir investigate-llvm-0237dbfdd38053cc190f814b6f92e311ae3509c6 cd investigate-llvm-0237dbfdd38053cc190f814b6f92e311ae3509c6 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach 0237dbfdd38053cc190f814b6f92e311ae3509c6 ../artifacts/test.sh # Reproduce last_good build git checkout --detach dbefcde6da1b58eb181dcbd8d7913175b2ec8350 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… Full commit (up to 1000 lines): <cut> commit 0237dbfdd38053cc190f814b6f92e311ae3509c6 Author: Chuanqi Xu <yedeng.yd(a)linux.alibaba.com> Date: Tue Jul 27 13:13:39 2021 +0800 [Coroutine] Record the elided coroutines Reviewed By: lxfind Differential Revision: https://reviews.llvm.org/D105606 --- llvm/lib/Transforms/Coroutines/CoroElide.cpp | 28 ++++++++++++++++++++++ .../{coro-elide-count.ll => coro-elide-stat.ll} | 7 ++++++ 2 files changed, 35 insertions(+) diff --git a/llvm/lib/Transforms/Coroutines/CoroElide.cpp b/llvm/lib/Transforms/Coroutines/CoroElide.cpp index d35a5de6d4bd..84bebb7bf42d 100644 --- a/llvm/lib/Transforms/Coroutines/CoroElide.cpp +++ b/llvm/lib/Transforms/Coroutines/CoroElide.cpp @@ -17,6 +17,7 @@ #include "llvm/InitializePasses.h" #include "llvm/Pass.h" #include "llvm/Support/ErrorHandling.h" +#include "llvm/Support/FileSystem.h" using namespace llvm; @@ -24,6 +25,12 @@ using namespace llvm; STATISTIC(NumOfCoroElided, "The # of coroutine get elided."); +#ifndef NDEBUG +static cl::opt<std::string> CoroElideInfoOutputFilename( + "coro-elide-info-output-file", cl::value_desc("filename"), + cl::desc("File to record the coroutines got elided"), cl::Hidden); +#endif + namespace { // Created on demand if the coro-elide pass has work to do. struct Lowerer : coro::LowererBase { @@ -121,6 +128,21 @@ static Instruction *getFirstNonAllocaInTheEntryBlock(Function *F) { llvm_unreachable("no terminator in the entry block"); } +#ifndef NDEBUG +static std::unique_ptr<raw_fd_ostream> getOrCreateLogFile() { + assert(!CoroElideInfoOutputFilename.empty() && + "coro-elide-info-output-file shouldn't be empty"); + std::error_code EC; + auto Result = std::make_unique<raw_fd_ostream>(CoroElideInfoOutputFilename, + EC, sys::fs::OF_Append); + if (!EC) + return Result; + llvm::errs() << "Error opening coro-elide-info-output-file '" + << CoroElideInfoOutputFilename << " for appending!\n"; + return std::make_unique<raw_fd_ostream>(2, false); // stderr. +} +#endif + // To elide heap allocations we need to suppress code blocks guarded by // llvm.coro.alloc and llvm.coro.free instructions. void Lowerer::elideHeapAllocations(Function *F, uint64_t FrameSize, @@ -344,6 +366,12 @@ bool Lowerer::processCoroId(CoroIdInst *CoroId, AAResults &AA, FrameSizeAndAlign.second, AA); coro::replaceCoroFree(CoroId, /*Elide=*/true); NumOfCoroElided++; +#ifndef NDEBUG + if (!CoroElideInfoOutputFilename.empty()) + *getOrCreateLogFile() + << "Elide " << CoroId->getCoroutine()->getName() << " in " + << CoroId->getFunction()->getName() << "\n"; +#endif } return true; diff --git a/llvm/test/Transforms/Coroutines/coro-elide-count.ll b/llvm/test/Transforms/Coroutines/coro-elide-stat.ll similarity index 92% rename from llvm/test/Transforms/Coroutines/coro-elide-count.ll rename to llvm/test/Transforms/Coroutines/coro-elide-stat.ll index ae40a74f41d5..e92d484786c5 100644 --- a/llvm/test/Transforms/Coroutines/coro-elide-count.ll +++ b/llvm/test/Transforms/Coroutines/coro-elide-stat.ll @@ -4,8 +4,15 @@ ; RUN: opt < %s -S \ ; RUN: -passes='cgscc(repeat<2>(inline,function(coro-elide,dce)))' -stats 2>&1 \ ; RUN: | FileCheck %s +; RUN: opt < %s --disable-output \ +; RUN: -passes='cgscc(repeat<2>(inline,function(coro-elide,dce)))' \ +; RUN: -coro-elide-info-output-file=%t && \ +; RUN: cat %t \ +; RUN: | FileCheck %s --check-prefix=FILE ; CHECK: 2 coro-elide - The # of coroutine get elided. +; FILE: Elide f in callResume +; FILE: Elide f in callResumeMultiRetDommmed declare void @print(i32) nounwind </cut>

4 years

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tk1/llvm-release-arm-spec2k6-O3_LTO - Build # 8 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_tk1/llvm-release-arm-spec2k6-O3_LTO. So far, this commit has regressed CI configurations: - tcwg_bmk_llvm_tk1/llvm-release-arm-spec2k6-O3_LTO Culprit: <cut> commit bb1e5399e4586239d6424f5eea5a9f06c52ebe9b Author: Vitaly Buka <vitalybuka(a)google.com> Date: Fri Apr 2 00:58:09 2021 -0700 [NFC][scudo] Inline some functions into ScudoPrimaryTest </cut> Results regressed to (for first_bad == bb1e5399e4586239d6424f5eea5a9f06c52ebe9b) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O3_LTO_marm artifacts/build-bb1e5399e4586239d6424f5eea5a9f06c52ebe9b/results_id: 1 # 462.libquantum,libquantum_base.default regressed by 105 from (for last_good == f343a730596b6b02039a91d71dc16c113d09cfe6) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O3_LTO_marm artifacts/build-baseline/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Results ID of last_good: tk1_32/tcwg_bmk_llvm_tk1/baseline-llvm-release-arm-spec2k6-O3_LTO/4201 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Results ID of first_bad: tk1_32/tcwg_bmk_llvm_tk1/bisect-llvm-release-arm-spec2k6-O3_LTO/4234 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Configuration details: Reproduce builds: <cut> mkdir investigate-llvm-bb1e5399e4586239d6424f5eea5a9f06c52ebe9b cd investigate-llvm-bb1e5399e4586239d6424f5eea5a9f06c52ebe9b git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach bb1e5399e4586239d6424f5eea5a9f06c52ebe9b ../artifacts/test.sh # Reproduce last_good build git checkout --detach f343a730596b6b02039a91d71dc16c113d09cfe6 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Full commit (up to 1000 lines): <cut> commit bb1e5399e4586239d6424f5eea5a9f06c52ebe9b Author: Vitaly Buka <vitalybuka(a)google.com> Date: Fri Apr 2 00:58:09 2021 -0700 [NFC][scudo] Inline some functions into ScudoPrimaryTest --- .../lib/scudo/standalone/tests/primary_test.cpp | 173 +++++++++------------ 1 file changed, 71 insertions(+), 102 deletions(-) diff --git a/compiler-rt/lib/scudo/standalone/tests/primary_test.cpp b/compiler-rt/lib/scudo/standalone/tests/primary_test.cpp index 07f3d6b77c17..2707985b5c72 100644 --- a/compiler-rt/lib/scudo/standalone/tests/primary_test.cpp +++ b/compiler-rt/lib/scudo/standalone/tests/primary_test.cpp @@ -21,37 +21,6 @@ // 32-bit architectures. It's not something we want to encourage, but we still // should ensure the tests pass. -template <typename Primary> static void testPrimary() { - const scudo::uptr NumberOfAllocations = 32U; - auto Deleter = [](Primary *P) { - P->unmapTestOnly(); - delete P; - }; - std::unique_ptr<Primary, decltype(Deleter)> Allocator(new Primary, Deleter); - Allocator->init(/*ReleaseToOsInterval=*/-1); - typename Primary::CacheT Cache; - Cache.init(nullptr, Allocator.get()); - for (scudo::uptr I = 0; I <= 16U; I++) { - const scudo::uptr Size = 1UL << I; - if (!Primary::canAllocate(Size)) - continue; - const scudo::uptr ClassId = Primary::SizeClassMap::getClassIdBySize(Size); - void *Pointers[NumberOfAllocations]; - for (scudo::uptr J = 0; J < NumberOfAllocations; J++) { - void *P = Cache.allocate(ClassId); - memset(P, 'B', Size); - Pointers[J] = P; - } - for (scudo::uptr J = 0; J < NumberOfAllocations; J++) - Cache.deallocate(ClassId, Pointers[J]); - } - Cache.destroy(nullptr); - Allocator->releaseToOS(); - scudo::ScopedString Str(1024); - Allocator->getStats(&Str); - Str.output(); -} - struct TestConfig1 { static const scudo::uptr PrimaryRegionSizeLog = 18U; static const scudo::s32 PrimaryMinReleaseToOsIntervalMs = INT32_MIN; @@ -84,13 +53,16 @@ struct Config : public BaseConfig { using SizeClassMap = SizeClassMapT; }; -template <typename BaseConfig, typename SizeClassMapT> struct MakeAllocator { - using Value = scudo::SizeClassAllocator64<Config<BaseConfig, SizeClassMapT>>; -}; - +template <typename BaseConfig, typename SizeClassMapT> +struct SizeClassAllocator + : public scudo::SizeClassAllocator64<Config<BaseConfig, SizeClassMapT>> {}; template <typename SizeClassMapT> -struct MakeAllocator<TestConfig1, SizeClassMapT> { - using Value = scudo::SizeClassAllocator32<Config<TestConfig1, SizeClassMapT>>; +struct SizeClassAllocator<TestConfig1, SizeClassMapT> + : public scudo::SizeClassAllocator32<Config<TestConfig1, SizeClassMapT>> {}; + +template <typename BaseConfig, typename SizeClassMapT> +struct TestAllocator : public SizeClassAllocator<BaseConfig, SizeClassMapT> { + ~TestAllocator() { this->unmapTestOnly(); } }; namespace testing { @@ -114,8 +86,31 @@ using ScudoPrimaryTestTypes = testing::Types< TYPED_TEST_CASE(ScudoPrimaryTest, ScudoPrimaryTestTypes); TYPED_TEST(ScudoPrimaryTest, BasicPrimary) { - using SizeClassMap = scudo::DefaultSizeClassMap; - testPrimary<typename MakeAllocator<TypeParam, SizeClassMap>::Value>(); + using Primary = TestAllocator<TypeParam, scudo::DefaultSizeClassMap>; + std::unique_ptr<Primary> Allocator(new Primary); + Allocator->init(/*ReleaseToOsInterval=*/-1); + typename Primary::CacheT Cache; + Cache.init(nullptr, Allocator.get()); + const scudo::uptr NumberOfAllocations = 32U; + for (scudo::uptr I = 0; I <= 16U; I++) { + const scudo::uptr Size = 1UL << I; + if (!Primary::canAllocate(Size)) + continue; + const scudo::uptr ClassId = Primary::SizeClassMap::getClassIdBySize(Size); + void *Pointers[NumberOfAllocations]; + for (scudo::uptr J = 0; J < NumberOfAllocations; J++) { + void *P = Cache.allocate(ClassId); + memset(P, 'B', Size); + Pointers[J] = P; + } + for (scudo::uptr J = 0; J < NumberOfAllocations; J++) + Cache.deallocate(ClassId, Pointers[J]); + } + Cache.destroy(nullptr); + Allocator->releaseToOS(); + scudo::ScopedString Str(1024); + Allocator->getStats(&Str); + Str.output(); } struct SmallRegionsConfig { @@ -166,12 +161,9 @@ TEST(ScudoPrimaryTest, Primary64OOM) { Allocator.unmapTestOnly(); } -template <typename Primary> static void testIteratePrimary() { - auto Deleter = [](Primary *P) { - P->unmapTestOnly(); - delete P; - }; - std::unique_ptr<Primary, decltype(Deleter)> Allocator(new Primary, Deleter); +TYPED_TEST(ScudoPrimaryTest, PrimaryIterate) { + using Primary = TestAllocator<TypeParam, scudo::DefaultSizeClassMap>; + std::unique_ptr<Primary> Allocator(new Primary); Allocator->init(/*ReleaseToOsInterval=*/-1); typename Primary::CacheT Cache; Cache.init(nullptr, Allocator.get()); @@ -205,50 +197,40 @@ template <typename Primary> static void testIteratePrimary() { Str.output(); } -TYPED_TEST(ScudoPrimaryTest, PrimaryIterate) { - using SizeClassMap = scudo::DefaultSizeClassMap; - testIteratePrimary<typename MakeAllocator<TypeParam, SizeClassMap>::Value>(); -} - -static std::mutex Mutex; -static std::condition_variable Cv; -static bool Ready; - -template <typename Primary> static void performAllocations(Primary *Allocator) { - static thread_local typename Primary::CacheT Cache; - Cache.init(nullptr, Allocator); - std::vector<std::pair<scudo::uptr, void *>> V; - { - std::unique_lock<std::mutex> Lock(Mutex); - while (!Ready) - Cv.wait(Lock); - } - for (scudo::uptr I = 0; I < 256U; I++) { - const scudo::uptr Size = std::rand() % Primary::SizeClassMap::MaxSize / 4; - const scudo::uptr ClassId = Primary::SizeClassMap::getClassIdBySize(Size); - void *P = Cache.allocate(ClassId); - if (P) - V.push_back(std::make_pair(ClassId, P)); - } - while (!V.empty()) { - auto Pair = V.back(); - Cache.deallocate(Pair.first, Pair.second); - V.pop_back(); - } - Cache.destroy(nullptr); -} - -template <typename Primary> static void testPrimaryThreaded() { - Ready = false; - auto Deleter = [](Primary *P) { - P->unmapTestOnly(); - delete P; - }; - std::unique_ptr<Primary, decltype(Deleter)> Allocator(new Primary, Deleter); +TYPED_TEST(ScudoPrimaryTest, PrimaryThreaded) { + using Primary = TestAllocator<TypeParam, scudo::SvelteSizeClassMap>; + std::unique_ptr<Primary> Allocator(new Primary); Allocator->init(/*ReleaseToOsInterval=*/-1); + std::mutex Mutex; + std::condition_variable Cv; + bool Ready = false; std::thread Threads[32]; for (scudo::uptr I = 0; I < ARRAY_SIZE(Threads); I++) - Threads[I] = std::thread(performAllocations<Primary>, Allocator.get()); + Threads[I] = std::thread([&]() { + static thread_local typename Primary::CacheT Cache; + Cache.init(nullptr, Allocator.get()); + std::vector<std::pair<scudo::uptr, void *>> V; + { + std::unique_lock<std::mutex> Lock(Mutex); + while (!Ready) + Cv.wait(Lock); + } + for (scudo::uptr I = 0; I < 256U; I++) { + const scudo::uptr Size = + std::rand() % Primary::SizeClassMap::MaxSize / 4; + const scudo::uptr ClassId = + Primary::SizeClassMap::getClassIdBySize(Size); + void *P = Cache.allocate(ClassId); + if (P) + V.push_back(std::make_pair(ClassId, P)); + } + while (!V.empty()) { + auto Pair = V.back(); + Cache.deallocate(Pair.first, Pair.second); + V.pop_back(); + } + Cache.destroy(nullptr); + }); { std::unique_lock<std::mutex> Lock(Mutex); Ready = true; @@ -262,20 +244,12 @@ template <typename Primary> static void testPrimaryThreaded() { Str.output(); } -TYPED_TEST(ScudoPrimaryTest, PrimaryThreaded) { - using SizeClassMap = scudo::SvelteSizeClassMap; - testPrimaryThreaded<typename MakeAllocator<TypeParam, SizeClassMap>::Value>(); -} - // Through a simple allocation that spans two pages, verify that releaseToOS // actually releases some bytes (at least one page worth). This is a regression // test for an error in how the release criteria were computed. -template <typename Primary> static void testReleaseToOS() { - auto Deleter = [](Primary *P) { - P->unmapTestOnly(); - delete P; - }; - std::unique_ptr<Primary, decltype(Deleter)> Allocator(new Primary, Deleter); +TYPED_TEST(ScudoPrimaryTest, ReleaseToOS) { + using Primary = TestAllocator<TypeParam, scudo::DefaultSizeClassMap>; + std::unique_ptr<Primary> Allocator(new Primary); Allocator->init(/*ReleaseToOsInterval=*/-1); typename Primary::CacheT Cache; Cache.init(nullptr, Allocator.get()); @@ -288,8 +262,3 @@ template <typename Primary> static void testReleaseToOS() { Cache.destroy(nullptr); EXPECT_GT(Allocator->releaseToOS(), 0U); } - -TYPED_TEST(ScudoPrimaryTest, ReleaseToOS) { - using SizeClassMap = scudo::DefaultSizeClassMap; - testReleaseToOS<typename MakeAllocator<TypeParam, SizeClassMap>::Value>(); -} </cut>

4 years

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tx1/llvm-release-aarch64-spec2k6-O3_LTO - Build # 8 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_tx1/llvm-release-aarch64-spec2k6-O3_LTO. So far, this commit has regressed CI configurations: - tcwg_bmk_llvm_tx1/llvm-release-aarch64-spec2k6-O3_LTO Culprit: <cut> commit ad558a4ff7cd61081cfeaabff1dbc8c0a9afa92b Author: Carl Ritson <carl.ritson(a)amd.com> Date: Tue May 11 12:14:01 2021 +0900 [AMDGPU] Pre-commit tests for D102211 </cut> Results regressed to (for first_bad == ad558a4ff7cd61081cfeaabff1dbc8c0a9afa92b) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O3_LTO artifacts/build-ad558a4ff7cd61081cfeaabff1dbc8c0a9afa92b/results_id: 1 # 473.astar,astar_base.default regressed by 104 from (for last_good == d8ec2b183e9243366e3a0cd1116dbe879856b333) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O3_LTO artifacts/build-d8ec2b183e9243366e3a0cd1116dbe879856b333/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… Results ID of last_good: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-release-aarch64-spec2k6-O3_LTO/4218 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… Results ID of first_bad: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-release-aarch64-spec2k6-O3_LTO/4219 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… Configuration details: Reproduce builds: <cut> mkdir investigate-llvm-ad558a4ff7cd61081cfeaabff1dbc8c0a9afa92b cd investigate-llvm-ad558a4ff7cd61081cfeaabff1dbc8c0a9afa92b git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach ad558a4ff7cd61081cfeaabff1dbc8c0a9afa92b ../artifacts/test.sh # Reproduce last_good build git checkout --detach d8ec2b183e9243366e3a0cd1116dbe879856b333 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… Full commit (up to 1000 lines): <cut> commit ad558a4ff7cd61081cfeaabff1dbc8c0a9afa92b Author: Carl Ritson <carl.ritson(a)amd.com> Date: Tue May 11 12:14:01 2021 +0900 [AMDGPU] Pre-commit tests for D102211 --- llvm/test/CodeGen/AMDGPU/hard-clauses.mir | 36 +++++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+) diff --git a/llvm/test/CodeGen/AMDGPU/hard-clauses.mir b/llvm/test/CodeGen/AMDGPU/hard-clauses.mir index 506f9a77c177..e6ca33341bfb 100644 --- a/llvm/test/CodeGen/AMDGPU/hard-clauses.mir +++ b/llvm/test/CodeGen/AMDGPU/hard-clauses.mir @@ -209,3 +209,39 @@ body: | $vgpr79 = BUFFER_LOAD_DWORD_OFFEN $vgpr0, $sgpr0_sgpr1_sgpr2_sgpr3, 0, 316, 0, 0, 0, implicit $exec $vgpr80 = BUFFER_LOAD_DWORD_OFFEN $vgpr0, $sgpr0_sgpr1_sgpr2_sgpr3, 0, 320, 0, 0, 0, implicit $exec ... + +--- +name: mimg_nsa +tracksRegLiveness: true +body: | + bb.0: + liveins: $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7, $sgpr8_sgpr9_sgpr10_sgpr11, $vgpr0, $vgpr1, $vgpr2, $vgpr3, $vgpr4, $vgpr5, $vgpr6, $vgpr7, $vgpr8 + ; CHECK-LABEL: name: mimg_nsa + ; CHECK: liveins: $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7, $sgpr8_sgpr9_sgpr10_sgpr11, $vgpr0, $vgpr1, $vgpr2, $vgpr3, $vgpr4, $vgpr5, $vgpr6, $vgpr7, $vgpr8 + ; CHECK: BUNDLE implicit-def $vgpr10_vgpr11_vgpr12_vgpr13, implicit-def $vgpr10, implicit-def $vgpr10_lo16, implicit-def $vgpr10_hi16, implicit-def $vgpr11, implicit-def $vgpr11_lo16, implicit-def $vgpr11_hi16, implicit-def $vgpr12, implicit-def $vgpr12_lo16, implicit-def $vgpr12_hi16, implicit-def $vgpr13, implicit-def $vgpr13_lo16, implicit-def $vgpr13_hi16, implicit-def $vgpr10_vgpr11, implicit-def $vgpr10_vgpr11_vgpr12, implicit-def $vgpr11_vgpr12, implicit-def $vgpr11_vgpr12_vgpr13, implicit-def $vgpr12_vgpr13, implicit-def $vgpr20_vgpr21_vgpr22_vgpr23, implicit-def $vgpr20, implicit-def $vgpr20_lo16, implicit-def $vgpr20_hi16, implicit-def $vgpr21, implicit-def $vgpr21_lo16, implicit-def $vgpr21_hi16, implicit-def $vgpr22, implicit-def $vgpr22_lo16, implicit-def $vgpr22_hi16, implicit-def $vgpr23, implicit-def $vgpr23_lo16, implicit-def $vgpr23_hi16, implicit-def $vgpr20_vgpr21, implicit-def $vgpr20_vgpr21_vgpr22, implicit-def $vgpr21_vgpr22, implicit-def $vgpr21_vgpr22_vgpr23, implicit-def $vgpr22_vgpr23, implicit $vgpr3, implicit $vgpr8, implicit $vgpr7, implicit $vgpr5, implicit $vgpr4, implicit $vgpr6, implicit $vgpr0, implicit $vgpr2, implicit $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7, implicit $sgpr8_sgpr9_sgpr10_sgpr11, implicit $exec { + ; CHECK: S_CLAUSE 1 + ; CHECK: $vgpr10_vgpr11_vgpr12_vgpr13 = IMAGE_SAMPLE_D_V4_V9_nsa_gfx10 $vgpr3, $vgpr8, $vgpr7, $vgpr5, $vgpr4, $vgpr6, $vgpr0, $vgpr2, $vgpr2, $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7, $sgpr8_sgpr9_sgpr10_sgpr11, 15, 2, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (load 16) + ; CHECK: $vgpr20_vgpr21_vgpr22_vgpr23 = IMAGE_SAMPLE_D_V4_V9_nsa_gfx10 $vgpr3, $vgpr8, $vgpr7, $vgpr5, $vgpr4, $vgpr6, $vgpr0, $vgpr2, $vgpr2, $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7, $sgpr8_sgpr9_sgpr10_sgpr11, 15, 2, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (load 16) + ; CHECK: } + $vgpr10_vgpr11_vgpr12_vgpr13 = IMAGE_SAMPLE_D_V4_V9_nsa_gfx10 $vgpr3, $vgpr8, $vgpr7, $vgpr5, $vgpr4, $vgpr6, $vgpr0, $vgpr2, $vgpr2, $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7, $sgpr8_sgpr9_sgpr10_sgpr11, 15, 2, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (load 16) + $vgpr20_vgpr21_vgpr22_vgpr23 = IMAGE_SAMPLE_D_V4_V9_nsa_gfx10 $vgpr3, $vgpr8, $vgpr7, $vgpr5, $vgpr4, $vgpr6, $vgpr0, $vgpr2, $vgpr2, $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7, $sgpr8_sgpr9_sgpr10_sgpr11, 15, 2, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (load 16) +... + +--- +name: mimg_nsa_mixed +tracksRegLiveness: true +body: | + bb.0: + liveins: $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7, $sgpr8_sgpr9_sgpr10_sgpr11, $vgpr0, $vgpr1, $vgpr2, $vgpr3, $vgpr4, $vgpr5, $vgpr6, $vgpr7, $vgpr8 + ; CHECK-LABEL: name: mimg_nsa_mixed + ; CHECK: liveins: $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7, $sgpr8_sgpr9_sgpr10_sgpr11, $vgpr0, $vgpr1, $vgpr2, $vgpr3, $vgpr4, $vgpr5, $vgpr6, $vgpr7, $vgpr8 + ; CHECK: BUNDLE implicit-def $vgpr10_vgpr11_vgpr12_vgpr13, implicit-def $vgpr10, implicit-def $vgpr10_lo16, implicit-def $vgpr10_hi16, implicit-def $vgpr11, implicit-def $vgpr11_lo16, implicit-def $vgpr11_hi16, implicit-def $vgpr12, implicit-def $vgpr12_lo16, implicit-def $vgpr12_hi16, implicit-def $vgpr13, implicit-def $vgpr13_lo16, implicit-def $vgpr13_hi16, implicit-def $vgpr10_vgpr11, implicit-def $vgpr10_vgpr11_vgpr12, implicit-def $vgpr11_vgpr12, implicit-def $vgpr11_vgpr12_vgpr13, implicit-def $vgpr12_vgpr13, implicit-def $vgpr14, implicit-def $vgpr14_lo16, implicit-def $vgpr14_hi16, implicit-def $vgpr20_vgpr21_vgpr22_vgpr23, implicit-def $vgpr20, implicit-def $vgpr20_lo16, implicit-def $vgpr20_hi16, implicit-def $vgpr21, implicit-def $vgpr21_lo16, implicit-def $vgpr21_hi16, implicit-def $vgpr22, implicit-def $vgpr22_lo16, implicit-def $vgpr22_hi16, implicit-def $vgpr23, implicit-def $vgpr23_lo16, implicit-def $vgpr23_hi16, implicit-def $vgpr20_vgpr21, implicit-def $vgpr20_vgpr21_vgpr22, implicit-def $vgpr21_vgpr22, implicit-def $vgpr21_vgpr22_vgpr23, implicit-def $vgpr22_vgpr23, implicit $vgpr3, implicit $vgpr8, implicit $vgpr7, implicit $vgpr5, implicit $vgpr4, implicit $vgpr6, implicit $vgpr0, implicit $vgpr2, implicit $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7, implicit $sgpr8_sgpr9_sgpr10_sgpr11, implicit $exec, implicit $vgpr5_vgpr6 { + ; CHECK: S_CLAUSE 2 + ; CHECK: $vgpr10_vgpr11_vgpr12_vgpr13 = IMAGE_SAMPLE_D_V4_V9_nsa_gfx10 $vgpr3, $vgpr8, $vgpr7, $vgpr5, $vgpr4, $vgpr6, $vgpr0, $vgpr2, $vgpr2, $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7, $sgpr8_sgpr9_sgpr10_sgpr11, 15, 2, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (load 16) + ; CHECK: $vgpr14 = IMAGE_SAMPLE_LZ_V1_V2_gfx10 $vgpr5_vgpr6, $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7, $sgpr8_sgpr9_sgpr10_sgpr11, 1, 1, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom "ImageResource") + ; CHECK: $vgpr20_vgpr21_vgpr22_vgpr23 = IMAGE_SAMPLE_D_V4_V9_nsa_gfx10 $vgpr3, $vgpr8, $vgpr7, $vgpr5, $vgpr4, $vgpr6, $vgpr0, $vgpr2, $vgpr2, $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7, $sgpr8_sgpr9_sgpr10_sgpr11, 15, 2, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (load 16) + ; CHECK: } + $vgpr10_vgpr11_vgpr12_vgpr13 = IMAGE_SAMPLE_D_V4_V9_nsa_gfx10 $vgpr3, $vgpr8, $vgpr7, $vgpr5, $vgpr4, $vgpr6, $vgpr0, $vgpr2, $vgpr2, $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7, $sgpr8_sgpr9_sgpr10_sgpr11, 15, 2, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (load 16) + $vgpr14 = IMAGE_SAMPLE_LZ_V1_V2_gfx10 $vgpr5_vgpr6, $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7, $sgpr8_sgpr9_sgpr10_sgpr11, 1, 1, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom "ImageResource") + $vgpr20_vgpr21_vgpr22_vgpr23 = IMAGE_SAMPLE_D_V4_V9_nsa_gfx10 $vgpr3, $vgpr8, $vgpr7, $vgpr5, $vgpr4, $vgpr6, $vgpr0, $vgpr2, $vgpr2, $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7, $sgpr8_sgpr9_sgpr10_sgpr11, 15, 2, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (load 16) +... </cut>

4 years

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tk1/llvm-release-arm-spec2k6-O3 - Build # 11 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_tk1/llvm-release-arm-spec2k6-O3. So far, this commit has regressed CI configurations: - tcwg_bmk_llvm_tk1/llvm-release-arm-spec2k6-O3 Culprit: <cut> commit 99203f2004d031f2ef22f01e3c569d2775de1836 Author: Alexey Bataev <a.bataev(a)outlook.com> Date: Tue Mar 23 13:22:58 2021 -0700 [Analysis]Add getPointersDiff function to improve compile time. Added getPointersDiff function to LoopAccessAnalysis and used it instead direct calculatoin of the distance between pointers and/or isConsecutiveAccess function in SLP vectorizer to improve compile time and detection of stores consecutive chains. Part of D57059 Differential Revision: https://reviews.llvm.org/D98967 </cut> Results regressed to (for first_bad == 99203f2004d031f2ef22f01e3c569d2775de1836) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O3_marm artifacts/build-99203f2004d031f2ef22f01e3c569d2775de1836/results_id: 1 # 458.sjeng,sjeng_base.default regressed by 103 from (for last_good == 4157a079afbf7fa5c3ce3ac0e9f4541f89188ae2) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O3_marm artifacts/build-4157a079afbf7fa5c3ce3ac0e9f4541f89188ae2/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Results ID of last_good: tk1_32/tcwg_bmk_llvm_tk1/bisect-llvm-release-arm-spec2k6-O3/4192 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Results ID of first_bad: tk1_32/tcwg_bmk_llvm_tk1/bisect-llvm-release-arm-spec2k6-O3/4194 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Configuration details: Reproduce builds: <cut> mkdir investigate-llvm-99203f2004d031f2ef22f01e3c569d2775de1836 cd investigate-llvm-99203f2004d031f2ef22f01e3c569d2775de1836 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach 99203f2004d031f2ef22f01e3c569d2775de1836 ../artifacts/test.sh # Reproduce last_good build git checkout --detach 4157a079afbf7fa5c3ce3ac0e9f4541f89188ae2 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Full commit (up to 1000 lines): <cut> commit 99203f2004d031f2ef22f01e3c569d2775de1836 Author: Alexey Bataev <a.bataev(a)outlook.com> Date: Tue Mar 23 13:22:58 2021 -0700 [Analysis]Add getPointersDiff function to improve compile time. Added getPointersDiff function to LoopAccessAnalysis and used it instead direct calculatoin of the distance between pointers and/or isConsecutiveAccess function in SLP vectorizer to improve compile time and detection of stores consecutive chains. Part of D57059 Differential Revision: https://reviews.llvm.org/D98967 --- llvm/include/llvm/Analysis/LoopAccessAnalysis.h | 9 + llvm/lib/Analysis/LoopAccessAnalysis.cpp | 198 ++++++++++------------ llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | 79 ++++++--- llvm/test/Transforms/SLPVectorizer/X86/pr35497.ll | 17 +- 4 files changed, 163 insertions(+), 140 deletions(-) diff --git a/llvm/include/llvm/Analysis/LoopAccessAnalysis.h b/llvm/include/llvm/Analysis/LoopAccessAnalysis.h index 13fbe884eddf..39acfd5bbbee 100644 --- a/llvm/include/llvm/Analysis/LoopAccessAnalysis.h +++ b/llvm/include/llvm/Analysis/LoopAccessAnalysis.h @@ -679,6 +679,15 @@ int64_t getPtrStride(PredicatedScalarEvolution &PSE, Value *Ptr, const Loop *Lp, const ValueToValueMap &StridesMap = ValueToValueMap(), bool Assume = false, bool ShouldCheckWrap = true); +/// Returns the distance between the pointers \p PtrA and \p PtrB iff they are +/// compatible and it is possible to calculate the distance between them. This +/// is a simple API that does not depend on the analysis pass. +/// \param StrictCheck Ensure that the calculated distance matches the +/// type-based one after all the bitcasts removal in the provided pointers. +Optional<int> getPointersDiff(Value *PtrA, Value *PtrB, const DataLayout &DL, + ScalarEvolution &SE, bool StrictCheck = false, + bool CheckType = true); + /// Attempt to sort the pointers in \p VL and return the sorted indices /// in \p SortedIndices, if reordering is required. /// diff --git a/llvm/lib/Analysis/LoopAccessAnalysis.cpp b/llvm/lib/Analysis/LoopAccessAnalysis.cpp index e632fe25c24c..997d4474a448 100644 --- a/llvm/lib/Analysis/LoopAccessAnalysis.cpp +++ b/llvm/lib/Analysis/LoopAccessAnalysis.cpp @@ -1124,139 +1124,123 @@ int64_t llvm::getPtrStride(PredicatedScalarEvolution &PSE, Value *Ptr, return Stride; } +Optional<int> llvm::getPointersDiff(Value *PtrA, Value *PtrB, + const DataLayout &DL, ScalarEvolution &SE, + bool StrictCheck, bool CheckType) { + assert(PtrA && PtrB && "Expected non-nullptr pointers."); + // Make sure that A and B are different pointers. + if (PtrA == PtrB) + return 0; + + // Make sure that PtrA and PtrB have the same type if required + if (CheckType && PtrA->getType() != PtrB->getType()) + return None; + + unsigned ASA = PtrA->getType()->getPointerAddressSpace(); + unsigned ASB = PtrB->getType()->getPointerAddressSpace(); + + // Check that the address spaces match. + if (ASA != ASB) + return None; + unsigned IdxWidth = DL.getIndexSizeInBits(ASA); + + APInt OffsetA(IdxWidth, 0), OffsetB(IdxWidth, 0); + Value *PtrA1 = PtrA->stripAndAccumulateInBoundsConstantOffsets(DL, OffsetA); + Value *PtrB1 = PtrB->stripAndAccumulateInBoundsConstantOffsets(DL, OffsetB); + + int Val; + if (PtrA1 == PtrB1) { + // Retrieve the address space again as pointer stripping now tracks through + // `addrspacecast`. + ASA = cast<PointerType>(PtrA1->getType())->getAddressSpace(); + ASB = cast<PointerType>(PtrB1->getType())->getAddressSpace(); + // Check that the address spaces match and that the pointers are valid. + if (ASA != ASB) + return None; + + IdxWidth = DL.getIndexSizeInBits(ASA); + OffsetA = OffsetA.sextOrTrunc(IdxWidth); + OffsetB = OffsetB.sextOrTrunc(IdxWidth); + + OffsetB -= OffsetA; + Val = OffsetB.getSExtValue(); + } else { + // Otherwise compute the distance with SCEV between the base pointers. + const SCEV *PtrSCEVA = SE.getSCEV(PtrA); + const SCEV *PtrSCEVB = SE.getSCEV(PtrB); + const auto *Diff = + dyn_cast<SCEVConstant>(SE.getMinusSCEV(PtrSCEVB, PtrSCEVA)); + if (!Diff) + return None; + Val = Diff->getAPInt().getSExtValue(); + } + Type *Ty = cast<PointerType>(PtrA->getType())->getElementType(); + int Size = DL.getTypeStoreSize(Ty); + int Dist = Val / Size; + + // Ensure that the calculated distance matches the type-based one after all + // the bitcasts removal in the provided pointers. + if (!StrictCheck || Dist * Size == Val) + return Dist; + return None; +} + bool llvm::sortPtrAccesses(ArrayRef<Value *> VL, const DataLayout &DL, ScalarEvolution &SE, SmallVectorImpl<unsigned> &SortedIndices) { assert(llvm::all_of( VL, [](const Value *V) { return V->getType()->isPointerTy(); }) && "Expected list of pointer operands."); - SmallVector<std::pair<int64_t, Value *>, 4> OffValPairs; - OffValPairs.reserve(VL.size()); - // Walk over the pointers, and map each of them to an offset relative to // first pointer in the array. Value *Ptr0 = VL[0]; - const SCEV *Scev0 = SE.getSCEV(Ptr0); - Value *Obj0 = getUnderlyingObject(Ptr0); - - llvm::SmallSet<int64_t, 4> Offsets; - for (auto *Ptr : VL) { - // TODO: Outline this code as a special, more time consuming, version of - // computeConstantDifference() function. - if (Ptr->getType()->getPointerAddressSpace() != - Ptr0->getType()->getPointerAddressSpace()) - return false; - // If a pointer refers to a different underlying object, bail - the - // pointers are by definition incomparable. - Value *CurrObj = getUnderlyingObject(Ptr); - if (CurrObj != Obj0) - return false; - const SCEV *Scev = SE.getSCEV(Ptr); - const auto *Diff = dyn_cast<SCEVConstant>(SE.getMinusSCEV(Scev, Scev0)); - // The pointers may not have a constant offset from each other, or SCEV - // may just not be smart enough to figure out they do. Regardless, - // there's nothing we can do. + using DistOrdPair = std::pair<int64_t, int>; + auto Compare = [](const DistOrdPair &L, const DistOrdPair &R) { + return L.first < R.first; + }; + std::set<DistOrdPair, decltype(Compare)> Offsets(Compare); + Offsets.emplace(0, 0); + int Cnt = 1; + bool IsConsecutive = true; + for (auto *Ptr : VL.drop_front()) { + Optional<int> Diff = getPointersDiff(Ptr0, Ptr, DL, SE); if (!Diff) return false; // Check if the pointer with the same offset is found. - int64_t Offset = Diff->getAPInt().getSExtValue(); - if (!Offsets.insert(Offset).second) + int64_t Offset = *Diff; + auto Res = Offsets.emplace(Offset, Cnt); + if (!Res.second) return false; - OffValPairs.emplace_back(Offset, Ptr); + // Consecutive order if the inserted element is the last one. + IsConsecutive = IsConsecutive && std::next(Res.first) == Offsets.end(); + ++Cnt; } SortedIndices.clear(); - SortedIndices.resize(VL.size()); - std::iota(SortedIndices.begin(), SortedIndices.end(), 0); - - // Sort the memory accesses and keep the order of their uses in UseOrder. - llvm::stable_sort(SortedIndices, [&](unsigned Left, unsigned Right) { - return OffValPairs[Left].first < OffValPairs[Right].first; - }); - - // Check if the order is consecutive already. - if (llvm::all_of(SortedIndices, [&SortedIndices](const unsigned I) { - return I == SortedIndices[I]; - })) - SortedIndices.clear(); - + if (!IsConsecutive) { + // Fill SortedIndices array only if it is non-consecutive. + SortedIndices.resize(VL.size()); + Cnt = 0; + for (const std::pair<int64_t, int> &Pair : Offsets) { + IsConsecutive = IsConsecutive && Cnt == Pair.second; + SortedIndices[Cnt] = Pair.second; + ++Cnt; + } + } return true; } -/// Take the address space operand from the Load/Store instruction. -/// Returns -1 if this is not a valid Load/Store instruction. -static unsigned getAddressSpaceOperand(Value *I) { - if (LoadInst *L = dyn_cast<LoadInst>(I)) - return L->getPointerAddressSpace(); - if (StoreInst *S = dyn_cast<StoreInst>(I)) - return S->getPointerAddressSpace(); - return -1; -} - /// Returns true if the memory operations \p A and \p B are consecutive. bool llvm::isConsecutiveAccess(Value *A, Value *B, const DataLayout &DL, ScalarEvolution &SE, bool CheckType) { Value *PtrA = getLoadStorePointerOperand(A); Value *PtrB = getLoadStorePointerOperand(B); - unsigned ASA = getAddressSpaceOperand(A); - unsigned ASB = getAddressSpaceOperand(B); - - // Check that the address spaces match and that the pointers are valid. - if (!PtrA || !PtrB || (ASA != ASB)) - return false; - - // Make sure that A and B are different pointers. - if (PtrA == PtrB) - return false; - - // Make sure that A and B have the same type if required. - if (CheckType && PtrA->getType() != PtrB->getType()) + if (!PtrA || !PtrB) return false; - - unsigned IdxWidth = DL.getIndexSizeInBits(ASA); - Type *Ty = cast<PointerType>(PtrA->getType())->getElementType(); - - APInt OffsetA(IdxWidth, 0), OffsetB(IdxWidth, 0); - PtrA = PtrA->stripAndAccumulateInBoundsConstantOffsets(DL, OffsetA); - PtrB = PtrB->stripAndAccumulateInBoundsConstantOffsets(DL, OffsetB); - - // Retrieve the address space again as pointer stripping now tracks through - // `addrspacecast`. - ASA = cast<PointerType>(PtrA->getType())->getAddressSpace(); - ASB = cast<PointerType>(PtrB->getType())->getAddressSpace(); - // Check that the address spaces match and that the pointers are valid. - if (ASA != ASB) - return false; - - IdxWidth = DL.getIndexSizeInBits(ASA); - OffsetA = OffsetA.sextOrTrunc(IdxWidth); - OffsetB = OffsetB.sextOrTrunc(IdxWidth); - - APInt Size(IdxWidth, DL.getTypeStoreSize(Ty)); - - // OffsetDelta = OffsetB - OffsetA; - const SCEV *OffsetSCEVA = SE.getConstant(OffsetA); - const SCEV *OffsetSCEVB = SE.getConstant(OffsetB); - const SCEV *OffsetDeltaSCEV = SE.getMinusSCEV(OffsetSCEVB, OffsetSCEVA); - const APInt &OffsetDelta = cast<SCEVConstant>(OffsetDeltaSCEV)->getAPInt(); - - // Check if they are based on the same pointer. That makes the offsets - // sufficient. - if (PtrA == PtrB) - return OffsetDelta == Size; - - // Compute the necessary base pointer delta to have the necessary final delta - // equal to the size. - // BaseDelta = Size - OffsetDelta; - const SCEV *SizeSCEV = SE.getConstant(Size); - const SCEV *BaseDelta = SE.getMinusSCEV(SizeSCEV, OffsetDeltaSCEV); - - // Otherwise compute the distance with SCEV between the base pointers. - const SCEV *PtrSCEVA = SE.getSCEV(PtrA); - const SCEV *PtrSCEVB = SE.getSCEV(PtrB); - const SCEV *X = SE.getAddExpr(PtrSCEVA, BaseDelta); - return X == PtrSCEVB; + Optional<int> Diff = + getPointersDiff(PtrA, PtrB, DL, SE, /*StrictCheck=*/true, CheckType); + return Diff && *Diff == 1; } MemoryDepChecker::VectorizationSafetyStatus diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp index 385b6f30dc0f..78d2ea0032db 100644 --- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp +++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp @@ -941,10 +941,16 @@ public: ScalarEvolution &SE) { auto *LI1 = dyn_cast<LoadInst>(V1); auto *LI2 = dyn_cast<LoadInst>(V2); - if (LI1 && LI2) - return isConsecutiveAccess(LI1, LI2, DL, SE) - ? VLOperands::ScoreConsecutiveLoads - : VLOperands::ScoreFail; + if (LI1 && LI2) { + if (LI1->getParent() != LI2->getParent()) + return VLOperands::ScoreFail; + + Optional<int> Dist = + getPointersDiff(LI1->getPointerOperand(), LI2->getPointerOperand(), + DL, SE, /*StrictCheck=*/true); + return (Dist && *Dist == 1) ? VLOperands::ScoreConsecutiveLoads + : VLOperands::ScoreFail; + } auto *C1 = dyn_cast<Constant>(V1); auto *C2 = dyn_cast<Constant>(V2); @@ -2871,13 +2877,9 @@ void BoUpSLP::buildTree_rec(ArrayRef<Value *> VL, unsigned Depth, Ptr0 = PointerOps[CurrentOrder.front()]; PtrN = PointerOps[CurrentOrder.back()]; } - const SCEV *Scev0 = SE->getSCEV(Ptr0); - const SCEV *ScevN = SE->getSCEV(PtrN); - const auto *Diff = - dyn_cast<SCEVConstant>(SE->getMinusSCEV(ScevN, Scev0)); - uint64_t Size = DL->getTypeAllocSize(ScalarTy); + Optional<int> Diff = getPointersDiff(Ptr0, PtrN, *DL, *SE); // Check that the sorted loads are consecutive. - if (Diff && Diff->getAPInt() == (VL.size() - 1) * Size) { + if (static_cast<unsigned>(*Diff) == VL.size() - 1) { if (CurrentOrder.empty()) { // Original loads are consecutive and does not require reordering. ++NumOpsWantToKeepOriginalOrder; @@ -3150,13 +3152,9 @@ void BoUpSLP::buildTree_rec(ArrayRef<Value *> VL, unsigned Depth, Ptr0 = PointerOps[CurrentOrder.front()]; PtrN = PointerOps[CurrentOrder.back()]; } - const SCEV *Scev0 = SE->getSCEV(Ptr0); - const SCEV *ScevN = SE->getSCEV(PtrN); - const auto *Diff = - dyn_cast<SCEVConstant>(SE->getMinusSCEV(ScevN, Scev0)); - uint64_t Size = DL->getTypeAllocSize(ScalarTy); + Optional<int> Dist = getPointersDiff(Ptr0, PtrN, *DL, *SE); // Check that the sorted pointer operands are consecutive. - if (Diff && Diff->getAPInt() == (VL.size() - 1) * Size) { + if (static_cast<unsigned>(*Dist) == VL.size() - 1) { if (CurrentOrder.empty()) { // Original stores are consecutive and does not require reordering. ++NumOpsWantToKeepOriginalOrder; @@ -6107,20 +6105,41 @@ bool SLPVectorizerPass::vectorizeStores(ArrayRef<StoreInst *> Stores, int E = Stores.size(); SmallBitVector Tails(E, false); - SmallVector<int, 16> ConsecutiveChain(E, E + 1); int MaxIter = MaxStoreLookup.getValue(); + SmallVector<std::pair<int, int>, 16> ConsecutiveChain( + E, std::make_pair(E, INT_MAX)); + SmallVector<SmallBitVector, 4> CheckedPairs(E, SmallBitVector(E, false)); int IterCnt; auto &&FindConsecutiveAccess = [this, &Stores, &Tails, &IterCnt, MaxIter, + &CheckedPairs, &ConsecutiveChain](int K, int Idx) { if (IterCnt >= MaxIter) return true; + if (CheckedPairs[Idx].test(K)) + return ConsecutiveChain[K].second == 1 && + ConsecutiveChain[K].first == Idx; ++IterCnt; - if (!isConsecutiveAccess(Stores[K], Stores[Idx], *DL, *SE)) + CheckedPairs[Idx].set(K); + CheckedPairs[K].set(Idx); + Optional<int> Diff = getPointersDiff(Stores[K]->getPointerOperand(), + Stores[Idx]->getPointerOperand(), *DL, + *SE, /*StrictCheck=*/true); + if (!Diff || *Diff == 0) + return false; + int Val = *Diff; + if (Val < 0) { + if (ConsecutiveChain[Idx].second > -Val) { + Tails.set(K); + ConsecutiveChain[Idx] = std::make_pair(K, -Val); + } + return false; + } + if (ConsecutiveChain[K].second <= Val) return false; Tails.set(Idx); - ConsecutiveChain[K] = Idx; - return true; + ConsecutiveChain[K] = std::make_pair(Idx, Val); + return Val == 1; }; // Do a quadratic search on all of the given stores in reverse order and find // all of the pairs of stores that follow each other. @@ -6140,17 +6159,31 @@ bool SLPVectorizerPass::vectorizeStores(ArrayRef<StoreInst *> Stores, // For stores that start but don't end a link in the chain: for (int Cnt = E; Cnt > 0; --Cnt) { int I = Cnt - 1; - if (ConsecutiveChain[I] == E + 1 || Tails.test(I)) + if (ConsecutiveChain[I].first == E || Tails.test(I)) continue; // We found a store instr that starts a chain. Now follow the chain and try // to vectorize it. BoUpSLP::ValueList Operands; // Collect the chain into a list. - while (I != E + 1 && !VectorizedStores.count(Stores[I])) { + while (I != E && !VectorizedStores.count(Stores[I])) { Operands.push_back(Stores[I]); + Tails.set(I); + if (ConsecutiveChain[I].second != 1) { + // Mark the new end in the chain and go back, if required. It might be + // required if the original stores come in reversed order, for example. + if (ConsecutiveChain[I].first != E && + Tails.test(ConsecutiveChain[I].first) && + !VectorizedStores.count(Stores[ConsecutiveChain[I].first])) { + Tails.reset(ConsecutiveChain[I].first); + if (Cnt < ConsecutiveChain[I].first + 2) + Cnt = ConsecutiveChain[I].first + 2; + } + break; + } // Move to the next value in the chain. - I = ConsecutiveChain[I]; + I = ConsecutiveChain[I].first; } + assert(!Operands.empty() && "Expected non-empty list of stores."); unsigned MaxVecRegSize = R.getMaxVecRegSize(); unsigned EltSize = R.getVectorElementSize(Operands[0]); diff --git a/llvm/test/Transforms/SLPVectorizer/X86/pr35497.ll b/llvm/test/Transforms/SLPVectorizer/X86/pr35497.ll index 267cf1a02c29..e28362894910 100644 --- a/llvm/test/Transforms/SLPVectorizer/X86/pr35497.ll +++ b/llvm/test/Transforms/SLPVectorizer/X86/pr35497.ll @@ -1,7 +1,7 @@ ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py -; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -slp-vectorizer -slp-vectorizer -mattr=+sse2 -S | FileCheck %s --check-prefix=SSE -; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -slp-vectorizer -slp-vectorizer -mattr=+avx -S | FileCheck %s --check-prefix=AVX -; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -slp-vectorizer -slp-vectorizer -mattr=+avx2 -S | FileCheck %s --check-prefix=AVX +; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -slp-vectorizer -mattr=+sse2 -S | FileCheck %s --check-prefix=SSE +; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -slp-vectorizer -mattr=+avx -S | FileCheck %s --check-prefix=AVX +; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -slp-vectorizer -mattr=+avx2 -S | FileCheck %s --check-prefix=AVX %class.1 = type { %class.2 } %class.2 = type { %"class.3" } @@ -117,13 +117,10 @@ define void @pr35497() local_unnamed_addr #0 { ; AVX-NEXT: [[ARRAYIDX2_6:%.*]] = getelementptr inbounds [0 x i64], [0 x i64]* undef, i64 0, i64 0 ; AVX-NEXT: [[TMP10:%.*]] = bitcast i64* [[ARRAYIDX2_6]] to <2 x i64>* ; AVX-NEXT: store <2 x i64> [[TMP4]], <2 x i64>* [[TMP10]], align 1 -; AVX-NEXT: [[TMP11:%.*]] = extractelement <2 x i64> [[TMP4]], i32 0 -; AVX-NEXT: [[TMP12:%.*]] = insertelement <2 x i64> poison, i64 [[TMP11]], i32 0 -; AVX-NEXT: [[TMP13:%.*]] = insertelement <2 x i64> [[TMP12]], i64 [[TMP5]], i32 1 -; AVX-NEXT: [[TMP14:%.*]] = lshr <2 x i64> [[TMP13]], <i64 6, i64 6> -; AVX-NEXT: [[TMP15:%.*]] = add nuw nsw <2 x i64> [[TMP9]], [[TMP14]] -; AVX-NEXT: [[TMP16:%.*]] = bitcast i64* [[ARRAYIDX2_2]] to <2 x i64>* -; AVX-NEXT: store <2 x i64> [[TMP15]], <2 x i64>* [[TMP16]], align 1 +; AVX-NEXT: [[TMP11:%.*]] = lshr <2 x i64> [[TMP4]], <i64 6, i64 6> +; AVX-NEXT: [[TMP12:%.*]] = add nuw nsw <2 x i64> [[TMP9]], [[TMP11]] +; AVX-NEXT: [[TMP13:%.*]] = bitcast i64* [[ARRAYIDX2_2]] to <2 x i64>* +; AVX-NEXT: store <2 x i64> [[TMP12]], <2 x i64>* [[TMP13]], align 1 ; AVX-NEXT: ret void ; entry: </cut>

4 years

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tx1/llvm-master-aarch64-spec2k6-O2_LTO - Build # 24 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_tx1/llvm-master-aarch64-spec2k6-O2_LTO. So far, this commit has regressed CI configurations: - tcwg_bmk_llvm_tx1/llvm-master-aarch64-spec2k6-O2_LTO Culprit: <cut> commit 643ce61fb3c2c730b7ecead4a489eaeef3f053ea Author: Akira Hatanaka <ahatanaka(a)apple.com> Date: Wed Aug 11 12:55:28 2021 -0700 [ObjC][ARC] Don't form a StoreStrong call if it is unsafe to move the release call findSafeStoreForStoreStrongContraction checks whether it's safe to move the release call to the store by inspecting all instructions between the two, but was ignoring retain instructions. This was causing objects to be released and deallocated before they were retained. rdar://81668577 </cut> Results regressed to (for first_bad == 643ce61fb3c2c730b7ecead4a489eaeef3f053ea) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O2_LTO artifacts/build-643ce61fb3c2c730b7ecead4a489eaeef3f053ea/results_id: 1 # 433.milc,milc_base.default regressed by 103 from (for last_good == 767496d19cb9a1fbba57ff08095faa161998ee36) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O2_LTO artifacts/build-767496d19cb9a1fbba57ff08095faa161998ee36/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Results ID of last_good: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-master-aarch64-spec2k6-O2_LTO/4172 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Results ID of first_bad: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-master-aarch64-spec2k6-O2_LTO/4171 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Configuration details: Reproduce builds: <cut> mkdir investigate-llvm-643ce61fb3c2c730b7ecead4a489eaeef3f053ea cd investigate-llvm-643ce61fb3c2c730b7ecead4a489eaeef3f053ea git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach 643ce61fb3c2c730b7ecead4a489eaeef3f053ea ../artifacts/test.sh # Reproduce last_good build git checkout --detach 767496d19cb9a1fbba57ff08095faa161998ee36 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… Full commit (up to 1000 lines): <cut> commit 643ce61fb3c2c730b7ecead4a489eaeef3f053ea Author: Akira Hatanaka <ahatanaka(a)apple.com> Date: Wed Aug 11 12:55:28 2021 -0700 [ObjC][ARC] Don't form a StoreStrong call if it is unsafe to move the release call findSafeStoreForStoreStrongContraction checks whether it's safe to move the release call to the store by inspecting all instructions between the two, but was ignoring retain instructions. This was causing objects to be released and deallocated before they were retained. rdar://81668577 --- llvm/lib/Transforms/ObjCARC/ObjCARCContract.cpp | 21 ++++++++++++--------- .../test/Transforms/ObjCARC/contract-storestrong.ll | 19 +++++++++++++++++++ 2 files changed, 31 insertions(+), 9 deletions(-) diff --git a/llvm/lib/Transforms/ObjCARC/ObjCARCContract.cpp b/llvm/lib/Transforms/ObjCARC/ObjCARCContract.cpp index 62161b5b6b40..577973c80601 100644 --- a/llvm/lib/Transforms/ObjCARC/ObjCARCContract.cpp +++ b/llvm/lib/Transforms/ObjCARC/ObjCARCContract.cpp @@ -226,13 +226,6 @@ static StoreInst *findSafeStoreForStoreStrongContraction(LoadInst *Load, // of Inst. ARCInstKind Class = GetBasicARCInstKind(Inst); - // If Inst is an unrelated retain, we don't care about it. - // - // TODO: This is one area where the optimization could be made more - // aggressive. - if (IsRetain(Class)) - continue; - // If we have seen the store, but not the release... if (Store) { // We need to make sure that it is safe to move the release from its @@ -248,8 +241,18 @@ static StoreInst *findSafeStoreForStoreStrongContraction(LoadInst *Load, return nullptr; } - // Ok, now we know we have not seen a store yet. See if Inst can write to - // our load location, if it can not, just ignore the instruction. + // Ok, now we know we have not seen a store yet. + + // If Inst is a retain, we don't care about it as it doesn't prevent moving + // the load to the store. + // + // TODO: This is one area where the optimization could be made more + // aggressive. + if (IsRetain(Class)) + continue; + + // See if Inst can write to our load location, if it can not, just ignore + // the instruction. if (!isModSet(AA->getModRefInfo(Inst, Loc))) continue; diff --git a/llvm/test/Transforms/ObjCARC/contract-storestrong.ll b/llvm/test/Transforms/ObjCARC/contract-storestrong.ll index eff0a6fdf900..9c45e3334d83 100644 --- a/llvm/test/Transforms/ObjCARC/contract-storestrong.ll +++ b/llvm/test/Transforms/ObjCARC/contract-storestrong.ll @@ -256,6 +256,25 @@ define i8* @test13(i8* %a0, i8* %a1, i8** %addr, i8* %new) { ret i8* %retained } +; Cannot form a storeStrong call because it's unsafe to move the release call to +; the store. + +; CHECK-LABEL: define void @test14( +; CHECK: %[[V0:.*]] = load i8*, i8** %a +; CHECK: %[[V1:.*]] = call i8* @llvm.objc.retain(i8* %p) +; CHECK: store i8* %[[V1]], i8** %a +; CHECK: %[[V2:.*]] = call i8* @llvm.objc.retain(i8* %[[V0]]) +; CHECK: call void @llvm.objc.release(i8* %[[V2]]) + +define void @test14(i8** %a, i8* %p) { + %v0 = load i8*, i8** %a, align 8 + %v1 = call i8* @llvm.objc.retain(i8* %p) + store i8* %p, i8** %a, align 8 + %v2 = call i8* @llvm.objc.retain(i8* %v0) + call void @llvm.objc.release(i8* %v0) + ret void +} + !0 = !{} ; CHECK: attributes [[NUW]] = { nounwind } </cut>

4 years

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tk1/llvm-release-arm-spec2k6-O2_LTO - Build # 11 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_tk1/llvm-release-arm-spec2k6-O2_LTO. So far, this commit has regressed CI configurations: - tcwg_bmk_llvm_tk1/llvm-release-arm-spec2k6-O2_LTO Culprit: <cut> commit 176379e0c8f9dbde2b357fb3b6a6802b83282e71 Author: Alex Zinenko <zinenko(a)google.com> Date: Fri Feb 12 12:53:27 2021 +0100 [mlir] Use the interface-based translation for LLVM "intrinsic" dialects Port the translation of five dialects that define LLVM IR intrinsics (LLVMAVX512, LLVMArmNeon, LLVMArmSVE, NVVM, ROCDL) to the new dialect interface-based mechanism. This allows us to remove individual translations that were created for each of these dialects and just use one common MLIR-to-LLVM-IR translation that potentially supports all dialects instead, based on what is registered and including any combination of translatable dialects. This removal was one of the main goals of the refactoring. To support the addition of GPU-related metadata, the translation interface is extended with the `amendOperation` function that allows the interface implementation to post-process any translated operation with dialect attributes from the dialect for which the interface is implemented regardless of the operation's dialect. This is currently applied to "kernel" functions, but can be used to construct other metadata in dialect-specific ways without necessarily affecting operations. Depends On D96591, D96504 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D96592 </cut> Results regressed to (for first_bad == 176379e0c8f9dbde2b357fb3b6a6802b83282e71) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O2_LTO_marm artifacts/build-176379e0c8f9dbde2b357fb3b6a6802b83282e71/results_id: 1 # 482.sphinx3,sphinx_livepretend_base.default regressed by 109 from (for last_good == 2d728bbff5c688284b8b8306ecfd3000b0ab8bb1) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O2_LTO_marm artifacts/build-2d728bbff5c688284b8b8306ecfd3000b0ab8bb1/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Results ID of last_good: tk1_32/tcwg_bmk_llvm_tk1/bisect-llvm-release-arm-spec2k6-O2_LTO/4128 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Results ID of first_bad: tk1_32/tcwg_bmk_llvm_tk1/bisect-llvm-release-arm-spec2k6-O2_LTO/4125 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Configuration details: Reproduce builds: <cut> mkdir investigate-llvm-176379e0c8f9dbde2b357fb3b6a6802b83282e71 cd investigate-llvm-176379e0c8f9dbde2b357fb3b6a6802b83282e71 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach 176379e0c8f9dbde2b357fb3b6a6802b83282e71 ../artifacts/test.sh # Reproduce last_good build git checkout --detach 2d728bbff5c688284b8b8306ecfd3000b0ab8bb1 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Full commit (up to 1000 lines): <cut> commit 176379e0c8f9dbde2b357fb3b6a6802b83282e71 Author: Alex Zinenko <zinenko(a)google.com> Date: Fri Feb 12 12:53:27 2021 +0100 [mlir] Use the interface-based translation for LLVM "intrinsic" dialects Port the translation of five dialects that define LLVM IR intrinsics (LLVMAVX512, LLVMArmNeon, LLVMArmSVE, NVVM, ROCDL) to the new dialect interface-based mechanism. This allows us to remove individual translations that were created for each of these dialects and just use one common MLIR-to-LLVM-IR translation that potentially supports all dialects instead, based on what is registered and including any combination of translatable dialects. This removal was one of the main goals of the refactoring. To support the addition of GPU-related metadata, the translation interface is extended with the `amendOperation` function that allows the interface implementation to post-process any translated operation with dialect attributes from the dialect for which the interface is implemented regardless of the operation's dialect. This is currently applied to "kernel" functions, but can be used to construct other metadata in dialect-specific ways without necessarily affecting operations. Depends On D96591, D96504 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D96592 --- .../test/Standalone/standalone-translate.mlir | 3 - mlir/include/mlir/Dialect/LLVMIR/LLVMOpBase.td | 13 ++- mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td | 3 +- mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td | 2 +- mlir/include/mlir/InitAllTranslations.h | 10 -- mlir/include/mlir/Target/LLVMIR.h | 4 +- .../LLVMAVX512/LLVMAVX512ToLLVMIRTranslation.h | 37 ++++++ .../LLVMArmNeon/LLVMArmNeonToLLVMIRTranslation.h | 37 ++++++ .../LLVMArmSVE/LLVMArmSVEToLLVMIRTranslation.h | 37 ++++++ .../Dialect/LLVMIR/LLVMToLLVMIRTranslation.h | 4 +- .../LLVMIR/Dialect/NVVM/NVVMToLLVMIRTranslation.h | 42 +++++++ .../Dialect/OpenMP/OpenMPToLLVMIRTranslation.h | 4 +- .../Dialect/ROCDL/ROCDLToLLVMIRTranslation.h | 42 +++++++ .../mlir/Target/LLVMIR/LLVMTranslationInterface.h | 28 ++++- .../include/mlir/Target/LLVMIR/ModuleTranslation.h | 32 ++++-- mlir/include/mlir/Target/NVVMIR.h | 39 ------- mlir/include/mlir/Target/ROCDLIR.h | 40 ------- mlir/lib/Target/CMakeLists.txt | 107 +---------------- mlir/lib/Target/LLVMIR/ConvertToLLVMIR.cpp | 35 +++++- mlir/lib/Target/LLVMIR/ConvertToNVVMIR.cpp | 124 -------------------- mlir/lib/Target/LLVMIR/ConvertToROCDLIR.cpp | 127 --------------------- mlir/lib/Target/LLVMIR/Dialect/CMakeLists.txt | 5 + .../LLVMIR/Dialect/LLVMAVX512/CMakeLists.txt | 16 +++ .../LLVMAVX512/LLVMAVX512ToLLVMIRTranslation.cpp | 33 ++++++ .../LLVMIR/Dialect/LLVMArmNeon/CMakeLists.txt | 16 +++ .../LLVMArmNeon/LLVMArmNeonToLLVMIRTranslation.cpp | 33 ++++++ .../LLVMIR/Dialect/LLVMArmSVE/CMakeLists.txt | 16 +++ .../LLVMArmSVE/LLVMArmSVEToLLVMIRTranslation.cpp | 33 ++++++ .../Dialect/LLVMIR/LLVMToLLVMIRTranslation.cpp | 71 ++++-------- mlir/lib/Target/LLVMIR/Dialect/NVVM/CMakeLists.txt | 16 +++ .../Dialect/NVVM/NVVMToLLVMIRTranslation.cpp | 67 +++++++++++ .../lib/Target/LLVMIR/Dialect/ROCDL/CMakeLists.txt | 16 +++ .../Dialect/ROCDL/ROCDLToLLVMIRTranslation.cpp | 69 +++++++++++ mlir/lib/Target/LLVMIR/LLVMAVX512Intr.cpp | 65 ----------- mlir/lib/Target/LLVMIR/LLVMArmNeonIntr.cpp | 65 ----------- mlir/lib/Target/LLVMIR/LLVMArmSVEIntr.cpp | 65 ----------- mlir/lib/Target/LLVMIR/ModuleTranslation.cpp | 65 +++++++---- mlir/test/Target/arm-neon.mlir | 2 +- mlir/test/Target/arm-sve.mlir | 2 +- mlir/test/Target/avx512.mlir | 2 +- mlir/test/Target/nvvmir.mlir | 2 +- mlir/test/Target/rocdl.mlir | 2 +- mlir/test/lib/Transforms/CMakeLists.txt | 13 ++- .../lib/Transforms/TestConvertGPUKernelToCubin.cpp | 25 +++- .../lib/Transforms/TestConvertGPUKernelToHsaco.cpp | 22 +++- mlir/tools/mlir-cuda-runner/mlir-cuda-runner.cpp | 6 +- mlir/tools/mlir-rocm-runner/mlir-rocm-runner.cpp | 3 + mlir/tools/mlir-tblgen/LLVMIRConversionGen.cpp | 10 +- 48 files changed, 750 insertions(+), 760 deletions(-) diff --git a/mlir/examples/standalone/test/Standalone/standalone-translate.mlir b/mlir/examples/standalone/test/Standalone/standalone-translate.mlir index 2a096c38e128..16d49785ee16 100644 --- a/mlir/examples/standalone/test/Standalone/standalone-translate.mlir +++ b/mlir/examples/standalone/test/Standalone/standalone-translate.mlir @@ -1,8 +1,5 @@ // RUN: standalone-translate --help | FileCheck %s -// CHECK: --avx512-mlir-to-llvmir // CHECK: --deserialize-spirv // CHECK: --import-llvm // CHECK: --mlir-to-llvmir -// CHECK: --mlir-to-nvvmir -// CHECK: --mlir-to-rocdlir // CHECK: --serialize-spirv diff --git a/mlir/include/mlir/Dialect/LLVMIR/LLVMOpBase.td b/mlir/include/mlir/Dialect/LLVMIR/LLVMOpBase.td index ad886e55b4e6..541f7ebfadfa 100644 --- a/mlir/include/mlir/Dialect/LLVMIR/LLVMOpBase.td +++ b/mlir/include/mlir/Dialect/LLVMIR/LLVMOpBase.td @@ -219,12 +219,13 @@ class ListIntSubst<string pattern, list<int> values> { // or result in the operation. def LLVM_IntrPatterns { string operand = - [{convertType(opInst.getOperand($0).getType())}]; + [{moduleTranslation.convertType(opInst.getOperand($0).getType())}]; string result = - [{convertType(opInst.getResult($0).getType())}]; + [{moduleTranslation.convertType(opInst.getResult($0).getType())}]; string structResult = - [{convertType(opInst.getResult(0).getType().cast<LLVM::LLVMStructType>() - .getBody()[$0])}]; + [{moduleTranslation.convertType( + opInst.getResult(0).getType().cast<LLVM::LLVMStructType>() + .getBody()[$0])}]; } @@ -259,7 +260,7 @@ class LLVM_IntrOpBase<Dialect dialect, string opName, string enumName, ListIntSubst<LLVM_IntrPatterns.operand, overloadedOperands>.lst), ", ") # [{ }); - auto operands = lookupValues(opInst.getOperands()); + auto operands = moduleTranslation.lookupValues(opInst.getOperands()); }] # !if(!gt(numResults, 0), "$res = ", "") # [{builder.CreateCall(fn, operands); }]; @@ -325,7 +326,7 @@ class LLVM_VectorReductionAcc<string mnem> { }] # !interleave(ListIntSubst<LLVM_IntrPatterns.operand, [1]>.lst, ", ") # [{ }); - auto operands = lookupValues(opInst.getOperands()); + auto operands = moduleTranslation.lookupValues(opInst.getOperands()); llvm::FastMathFlags origFM = builder.getFastMathFlags(); llvm::FastMathFlags tempFM = origFM; tempFM.setAllowReassoc($reassoc); diff --git a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td index 7a2152b9a481..7dca08a43f7a 100644 --- a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td +++ b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td @@ -1083,7 +1083,8 @@ def LLVM_UndefOp : LLVM_Op<"mlir.undef", [NoSideEffect]>, def LLVM_ConstantOp : LLVM_Op<"mlir.constant", [NoSideEffect]>, - LLVM_Builder<"$res = getLLVMConstant($_resultType, $value, $_location);"> + LLVM_Builder<[{$res = getLLVMConstant($_resultType, $value, $_location, + moduleTranslation);}]> { let summary = "Defines a constant of LLVM type."; let description = [{ diff --git a/mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td b/mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td index d5eec3cebb4a..cfb08ff465a2 100644 --- a/mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td +++ b/mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td @@ -175,7 +175,7 @@ def ROCDL_MubufStoreOp : LLVM_Type:$glc, LLVM_Type:$slc)>{ string llvmBuilder = [{ - auto vdataType = convertType(op.vdata().getType()); + auto vdataType = moduleTranslation.convertType(op.vdata().getType()); createIntrinsicCall(builder, llvm::Intrinsic::amdgcn_buffer_store, {$vdata, $rsrc, $vindex, $offset, $glc, $slc}, {vdataType}); diff --git a/mlir/include/mlir/InitAllTranslations.h b/mlir/include/mlir/InitAllTranslations.h index 16dd113d14cd..fc319c09a8c8 100644 --- a/mlir/include/mlir/InitAllTranslations.h +++ b/mlir/include/mlir/InitAllTranslations.h @@ -20,11 +20,6 @@ void registerFromLLVMIRTranslation(); void registerFromSPIRVTranslation(); void registerToLLVMIRTranslation(); void registerToSPIRVTranslation(); -void registerToNVVMIRTranslation(); -void registerToROCDLIRTranslation(); -void registerArmNeonToLLVMIRTranslation(); -void registerAVX512ToLLVMIRTranslation(); -void registerArmSVEToLLVMIRTranslation(); // This function should be called before creating any MLIRContext if one // expects all the possible translations to be made available to the context @@ -35,11 +30,6 @@ inline void registerAllTranslations() { registerFromSPIRVTranslation(); registerToLLVMIRTranslation(); registerToSPIRVTranslation(); - registerToNVVMIRTranslation(); - registerToROCDLIRTranslation(); - registerArmNeonToLLVMIRTranslation(); - registerAVX512ToLLVMIRTranslation(); - registerArmSVEToLLVMIRTranslation(); return true; }(); (void)initOnce; diff --git a/mlir/include/mlir/Target/LLVMIR.h b/mlir/include/mlir/Target/LLVMIR.h index 2050c63df73f..10bec79f3506 100644 --- a/mlir/include/mlir/Target/LLVMIR.h +++ b/mlir/include/mlir/Target/LLVMIR.h @@ -28,14 +28,14 @@ namespace mlir { class DialectRegistry; class OwningModuleRef; class MLIRContext; -class ModuleOp; +class Operation; /// Convert the given MLIR module into LLVM IR. The LLVM context is extracted /// from the registered LLVM IR dialect. In case of error, report it /// to the error handler registered with the MLIR context, if any (obtained from /// the MLIR module), and return `nullptr`. std::unique_ptr<llvm::Module> -translateModuleToLLVMIR(ModuleOp m, llvm::LLVMContext &llvmContext, +translateModuleToLLVMIR(Operation *op, llvm::LLVMContext &llvmContext, StringRef name = "LLVMDialectModule"); /// Convert the given LLVM module into MLIR's LLVM dialect. The LLVM context is diff --git a/mlir/include/mlir/Target/LLVMIR/Dialect/LLVMAVX512/LLVMAVX512ToLLVMIRTranslation.h b/mlir/include/mlir/Target/LLVMIR/Dialect/LLVMAVX512/LLVMAVX512ToLLVMIRTranslation.h new file mode 100644 index 000000000000..e591a95f0357 --- /dev/null +++ b/mlir/include/mlir/Target/LLVMIR/Dialect/LLVMAVX512/LLVMAVX512ToLLVMIRTranslation.h @@ -0,0 +1,37 @@ +//===- LLVMAVX512ToLLVMIRTranslation.h - LLVMAVX512 to LLVM IR --*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file implements the dialect interface for translating the LLVMAVX512 +// dialect to LLVM IR. +// +//===----------------------------------------------------------------------===// + +#ifndef MLIR_TARGET_LLVMIR_DIALECT_LLVMAVX512_LLVMAVX512TOLLVMIRTRANSLATION_H +#define MLIR_TARGET_LLVMIR_DIALECT_LLVMAVX512_LLVMAVX512TOLLVMIRTRANSLATION_H + +#include "mlir/Target/LLVMIR/LLVMTranslationInterface.h" + +namespace mlir { + +/// Implementation of the dialect interface that converts operations belonging +/// to the LLVMAVX512 dialect to LLVM IR. +class LLVMAVX512DialectLLVMIRTranslationInterface + : public LLVMTranslationDialectInterface { +public: + using LLVMTranslationDialectInterface::LLVMTranslationDialectInterface; + + /// Translates the given operation to LLVM IR using the provided IR builder + /// and saving the state in `moduleTranslation`. + LogicalResult + convertOperation(Operation *op, llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation) const final; +}; + +} // namespace mlir + +#endif // MLIR_TARGET_LLVMIR_DIALECT_LLVMAVX512_LLVMAVX512TOLLVMIRTRANSLATION_H diff --git a/mlir/include/mlir/Target/LLVMIR/Dialect/LLVMArmNeon/LLVMArmNeonToLLVMIRTranslation.h b/mlir/include/mlir/Target/LLVMIR/Dialect/LLVMArmNeon/LLVMArmNeonToLLVMIRTranslation.h new file mode 100644 index 000000000000..7d268d155083 --- /dev/null +++ b/mlir/include/mlir/Target/LLVMIR/Dialect/LLVMArmNeon/LLVMArmNeonToLLVMIRTranslation.h @@ -0,0 +1,37 @@ +//===- LLVMArmNeonToLLVMIRTranslation.h - LLVMArmNeon to LLVMIR -*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file implements the dialect interface for translating the LLVMArmNeon +// dialect to LLVM IR. +// +//===----------------------------------------------------------------------===// + +#ifndef MLIR_TARGET_LLVMIR_DIALECT_LLVMARMNEON_LLVMARMNEONTOLLVMIRTRANSLATION_H +#define MLIR_TARGET_LLVMIR_DIALECT_LLVMARMNEON_LLVMARMNEONTOLLVMIRTRANSLATION_H + +#include "mlir/Target/LLVMIR/LLVMTranslationInterface.h" + +namespace mlir { + +/// Implementation of the dialect interface that converts operations belonging +/// to the LLVMArmNeon dialect to LLVM IR. +class LLVMArmNeonDialectLLVMIRTranslationInterface + : public LLVMTranslationDialectInterface { +public: + using LLVMTranslationDialectInterface::LLVMTranslationDialectInterface; + + /// Translates the given operation to LLVM IR using the provided IR builder + /// and saving the state in `moduleTranslation`. + LogicalResult + convertOperation(Operation *op, llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation) const final; +}; + +} // namespace mlir + +#endif // MLIR_TARGET_LLVMIR_DIALECT_LLVMARMNEON_LLVMARMNEONTOLLVMIRTRANSLATION_H diff --git a/mlir/include/mlir/Target/LLVMIR/Dialect/LLVMArmSVE/LLVMArmSVEToLLVMIRTranslation.h b/mlir/include/mlir/Target/LLVMIR/Dialect/LLVMArmSVE/LLVMArmSVEToLLVMIRTranslation.h new file mode 100644 index 000000000000..9d4d05b9b9bd --- /dev/null +++ b/mlir/include/mlir/Target/LLVMIR/Dialect/LLVMArmSVE/LLVMArmSVEToLLVMIRTranslation.h @@ -0,0 +1,37 @@ +//===- LLVMArmSVEToLLVMIRTranslation.h - LLVMArmSVE to LLVM IR --*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file implements the dialect interface for translating the LLVMArmSVE +// dialect to LLVM IR. +// +//===----------------------------------------------------------------------===// + +#ifndef MLIR_TARGET_LLVMIR_DIALECT_LLVMARMSVE_LLVMARMSVETOLLVMIRTRANSLATION_H +#define MLIR_TARGET_LLVMIR_DIALECT_LLVMARMSVE_LLVMARMSVETOLLVMIRTRANSLATION_H + +#include "mlir/Target/LLVMIR/LLVMTranslationInterface.h" + +namespace mlir { + +/// Implementation of the dialect interface that converts operations belonging +/// to the LLVMArmSVE dialect to LLVM IR. +class LLVMArmSVEDialectLLVMIRTranslationInterface + : public LLVMTranslationDialectInterface { +public: + using LLVMTranslationDialectInterface::LLVMTranslationDialectInterface; + + /// Translates the given operation to LLVM IR using the provided IR builder + /// and saving the state in `moduleTranslation`. + LogicalResult + convertOperation(Operation *op, llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation) const final; +}; + +} // namespace mlir + +#endif // MLIR_TARGET_LLVMIR_DIALECT_LLVMARMSVE_LLVMARMSVETOLLVMIRTRANSLATION_H diff --git a/mlir/include/mlir/Target/LLVMIR/Dialect/LLVMIR/LLVMToLLVMIRTranslation.h b/mlir/include/mlir/Target/LLVMIR/Dialect/LLVMIR/LLVMToLLVMIRTranslation.h index 8b72cedf1ff2..2af76e092917 100644 --- a/mlir/include/mlir/Target/LLVMIR/Dialect/LLVMIR/LLVMToLLVMIRTranslation.h +++ b/mlir/include/mlir/Target/LLVMIR/Dialect/LLVMIR/LLVMToLLVMIRTranslation.h @@ -18,8 +18,8 @@ namespace mlir { -/// Implementation of the dialect interface that converts operations beloning to -/// the LLVM dialect to LLVM IR. +/// Implementation of the dialect interface that converts operations belonging +/// to the LLVM dialect to LLVM IR. class LLVMDialectLLVMIRTranslationInterface : public LLVMTranslationDialectInterface { public: diff --git a/mlir/include/mlir/Target/LLVMIR/Dialect/NVVM/NVVMToLLVMIRTranslation.h b/mlir/include/mlir/Target/LLVMIR/Dialect/NVVM/NVVMToLLVMIRTranslation.h new file mode 100644 index 000000000000..3a8a01df84b0 --- /dev/null +++ b/mlir/include/mlir/Target/LLVMIR/Dialect/NVVM/NVVMToLLVMIRTranslation.h @@ -0,0 +1,42 @@ +//===- NVVMToLLVMIRTranslation.h - NVVM to LLVM IR --------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file implements the dialect interface for translating the NVVM +// dialect to LLVM IR. +// +//===----------------------------------------------------------------------===// + +#ifndef MLIR_TARGET_LLVMIR_DIALECT_NVVM_NVVMTOLLVMIRTRANSLATION_H +#define MLIR_TARGET_LLVMIR_DIALECT_NVVM_NVVMTOLLVMIRTRANSLATION_H + +#include "mlir/Target/LLVMIR/LLVMTranslationInterface.h" + +namespace mlir { + +/// Implementation of the dialect interface that converts operations belonging +/// to the NVVM dialect to LLVM IR. +class NVVMDialectLLVMIRTranslationInterface + : public LLVMTranslationDialectInterface { +public: + using LLVMTranslationDialectInterface::LLVMTranslationDialectInterface; + + /// Translates the given operation to LLVM IR using the provided IR builder + /// and saving the state in `moduleTranslation`. + LogicalResult + convertOperation(Operation *op, llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation) const final; + + /// Attaches module-level metadata for functions marked as kernels. + LogicalResult + amendOperation(Operation *op, NamedAttribute attribute, + LLVM::ModuleTranslation &moduleTranslation) const final; +}; + +} // namespace mlir + +#endif // MLIR_TARGET_LLVMIR_DIALECT_NVVM_NVVMTOLLVMIRTRANSLATION_H diff --git a/mlir/include/mlir/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.h b/mlir/include/mlir/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.h index 7d9eeea9462e..07721d089689 100644 --- a/mlir/include/mlir/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.h +++ b/mlir/include/mlir/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.h @@ -18,8 +18,8 @@ namespace mlir { -/// Implementation of the dialect interface that converts operations beloning to -/// the OpenMP dialect to LLVM IR. +/// Implementation of the dialect interface that converts operations belonging +/// to the OpenMP dialect to LLVM IR. class OpenMPDialectLLVMIRTranslationInterface : public LLVMTranslationDialectInterface { public: diff --git a/mlir/include/mlir/Target/LLVMIR/Dialect/ROCDL/ROCDLToLLVMIRTranslation.h b/mlir/include/mlir/Target/LLVMIR/Dialect/ROCDL/ROCDLToLLVMIRTranslation.h new file mode 100644 index 000000000000..e2211a59098f --- /dev/null +++ b/mlir/include/mlir/Target/LLVMIR/Dialect/ROCDL/ROCDLToLLVMIRTranslation.h @@ -0,0 +1,42 @@ +//===- ROCDLToLLVMIRTranslation.h - ROCDL to LLVM IR ------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file implements the dialect interface for translating the ROCDL +// dialect to LLVM IR. +// +//===----------------------------------------------------------------------===// + +#ifndef MLIR_TARGET_LLVMIR_DIALECT_ROCDL_ROCDLTOLLVMIRTRANSLATION_H +#define MLIR_TARGET_LLVMIR_DIALECT_ROCDL_ROCDLTOLLVMIRTRANSLATION_H + +#include "mlir/Target/LLVMIR/LLVMTranslationInterface.h" + +namespace mlir { + +/// Implementation of the dialect interface that converts operations belonging +/// to the ROCDL dialect to LLVM IR. +class ROCDLDialectLLVMIRTranslationInterface + : public LLVMTranslationDialectInterface { +public: + using LLVMTranslationDialectInterface::LLVMTranslationDialectInterface; + + /// Translates the given operation to LLVM IR using the provided IR builder + /// and saving the state in `moduleTranslation`. + LogicalResult + convertOperation(Operation *op, llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation) const final; + + /// Attaches module-level metadata for functions marked as kernels. + LogicalResult + amendOperation(Operation *op, NamedAttribute attribute, + LLVM::ModuleTranslation &moduleTranslation) const final; +}; + +} // namespace mlir + +#endif // MLIR_TARGET_LLVMIR_DIALECT_ROCDL_ROCDLTOLLVMIRTRANSLATION_H diff --git a/mlir/include/mlir/Target/LLVMIR/LLVMTranslationInterface.h b/mlir/include/mlir/Target/LLVMIR/LLVMTranslationInterface.h index 0063beac2977..0c563e6e7d39 100644 --- a/mlir/include/mlir/Target/LLVMIR/LLVMTranslationInterface.h +++ b/mlir/include/mlir/Target/LLVMIR/LLVMTranslationInterface.h @@ -13,12 +13,14 @@ #ifndef MLIR_TARGET_LLVMIR_LLVMTRANSLATIONINTERFACE_H #define MLIR_TARGET_LLVMIR_LLVMTRANSLATIONINTERFACE_H +#include "mlir/IR/Attributes.h" #include "mlir/IR/DialectInterface.h" +#include "mlir/IR/Identifier.h" #include "mlir/Support/LogicalResult.h" namespace llvm { class IRBuilderBase; -} +} // namespace llvm namespace mlir { namespace LLVM { @@ -43,6 +45,18 @@ public: LLVM::ModuleTranslation &moduleTranslation) const { return failure(); } + + /// Hook for derived dialect interface to act on an operation that has dialect + /// attributes from the derived dialect (the operation itself may be from a + /// different dialect). This gets called after the operation has been + /// translated. The hook is expected to use moduleTranslation to look up the + /// translation results and amend the corresponding IR constructs. Does + /// nothing and succeeds by default. + virtual LogicalResult + amendOperation(Operation *op, NamedAttribute attribute, + LLVM::ModuleTranslation &moduleTranslation) const { + return success(); + } }; /// Interface collection for translation to LLVM IR, dispatches to a concrete @@ -61,6 +75,18 @@ public: return iface->convertOperation(op, builder, moduleTranslation); return failure(); } + + /// Acts on the given operation using the interface implemented by the dialect + /// of one of the operation's dialect attributes. + virtual LogicalResult + amendOperation(Operation *op, NamedAttribute attribute, + LLVM::ModuleTranslation &moduleTranslation) const { + if (const LLVMTranslationDialectInterface *iface = + getInterfaceFor(attribute.first.getDialect())) { + return iface->amendOperation(op, attribute, moduleTranslation); + } + return success(); + } }; } // namespace mlir diff --git a/mlir/include/mlir/Target/LLVMIR/ModuleTranslation.h b/mlir/include/mlir/Target/LLVMIR/ModuleTranslation.h index 03b7f5336461..004524f33fa4 100644 --- a/mlir/include/mlir/Target/LLVMIR/ModuleTranslation.h +++ b/mlir/include/mlir/Target/LLVMIR/ModuleTranslation.h @@ -142,18 +142,11 @@ public: /// Looks up remapped a list of remapped values. SmallVector<llvm::Value *, 8> lookupValues(ValueRange values); - /// Create an LLVM IR constant of `llvmType` from the MLIR attribute `attr`. - /// This currently supports integer, floating point, splat and dense element - /// attributes and combinations thereof. In case of error, report it to `loc` - /// and return nullptr. - llvm::Constant *getLLVMConstant(llvm::Type *llvmType, Attribute attr, - Location loc); - /// Returns the MLIR context of the module being translated. MLIRContext &getContext() { return *mlirModule->getContext(); } /// Returns the LLVM context in which the IR is being constructed. - llvm::LLVMContext &getLLVMContext() { return llvmModule->getContext(); } + llvm::LLVMContext &getLLVMContext() const { return llvmModule->getContext(); } /// Finds an LLVM IR global value that corresponds to the given MLIR operation /// defining a global value. @@ -184,6 +177,10 @@ public: LogicalResult convertBlock(Block &bb, bool ignoreArguments, llvm::IRBuilder<> &builder); + /// Gets the named metadata in the LLVM IR module being constructed, creating + /// it if it does not exist. + llvm::NamedMDNode *getOrInsertNamedModuleMetadata(StringRef name); + protected: /// Translate the given MLIR module expressed in MLIR LLVM IR dialect into an /// LLVM IR module. The MLIR LLVM IR dialect holds a pointer to an @@ -208,6 +205,9 @@ private: LogicalResult convertGlobals(); LogicalResult convertOneFunction(LLVMFuncOp func); + /// Translates dialect attributes attached to the given operation. + LogicalResult convertDialectAttributes(Operation *op); + /// Original and translated module. Operation *mlirModule; std::unique_ptr<llvm::Module> llvmModule; @@ -228,6 +228,8 @@ private: /// A stateful object used to translate types. TypeToLLVMIRTranslator typeTranslator; + /// A dialect interface collection used for dispatching the translation to + /// specific dialects. LLVMTranslationInterface iface; /// Mappings between original and translated values, used for lookups. @@ -249,6 +251,20 @@ void connectPHINodes(Region &region, const ModuleTranslation &state); /// Get a topologically sorted list of blocks of the given region. llvm::SetVector<Block *> getTopologicallySortedBlocks(Region &region); + +/// Create an LLVM IR constant of `llvmType` from the MLIR attribute `attr`. +/// This currently supports integer, floating point, splat and dense element +/// attributes and combinations thereof. In case of error, report it to `loc` +/// and return nullptr. +llvm::Constant *getLLVMConstant(llvm::Type *llvmType, Attribute attr, + Location loc, + const ModuleTranslation &moduleTranslation); + +/// Creates a call to an LLVM IR intrinsic function with the given arguments. +llvm::Value *createIntrinsicCall(llvm::IRBuilderBase &builder, + llvm::Intrinsic::ID intrinsic, + ArrayRef<llvm::Value *> args = {}, + ArrayRef<llvm::Type *> tys = {}); } // namespace detail } // namespace LLVM diff --git a/mlir/include/mlir/Target/NVVMIR.h b/mlir/include/mlir/Target/NVVMIR.h deleted file mode 100644 index 0cd7688e275b..000000000000 --- a/mlir/include/mlir/Target/NVVMIR.h +++ /dev/null @@ -1,39 +0,0 @@ -//===- NVVMIR.h - MLIR to LLVM + NVVM IR conversion -------------*- C++ -*-===// -// -// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. -// See https://llvm.org/LICENSE.txt for license information. -// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception -// -//===----------------------------------------------------------------------===// -// -// This file declares the entry point for the MLIR to LLVM + NVVM IR conversion. -// -//===----------------------------------------------------------------------===// - -#ifndef MLIR_TARGET_NVVMIR_H -#define MLIR_TARGET_NVVMIR_H - -#include "llvm/ADT/StringRef.h" -#include <memory> - -// Forward-declare LLVM classes. -namespace llvm { -class LLVMContext; -class Module; -} // namespace llvm - -namespace mlir { -class Operation; - -/// Convert the given LLVM-module-like operation into NVVM IR. This conversion -/// requires the registration of the LLVM IR dialect and will extract the LLVM -/// context from the registered LLVM IR dialect. In case of error, report it to -/// the error handler registered with the MLIR context, if any (obtained from -/// the MLIR module), and return `nullptr`. -std::unique_ptr<llvm::Module> -translateModuleToNVVMIR(Operation *m, llvm::LLVMContext &llvmContext, - llvm::StringRef name = "LLVMDialectModule"); - -} // namespace mlir - -#endif // MLIR_TARGET_NVVMIR_H diff --git a/mlir/include/mlir/Target/ROCDLIR.h b/mlir/include/mlir/Target/ROCDLIR.h deleted file mode 100644 index e2cb812a173d..000000000000 --- a/mlir/include/mlir/Target/ROCDLIR.h +++ /dev/null @@ -1,40 +0,0 @@ -//===- ROCDLIR.h - MLIR to LLVM + ROCDL IR conversion -----------*- C++ -*-===// -// -// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. -// See https://llvm.org/LICENSE.txt for license information. -// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception -// -//===----------------------------------------------------------------------===// -// -// This file declares the entry point for the MLIR to LLVM + ROCDL IR -// conversion. -// -//===----------------------------------------------------------------------===// - -#ifndef MLIR_TARGET_ROCDLIR_H -#define MLIR_TARGET_ROCDLIR_H - -#include "llvm/ADT/StringRef.h" -#include <memory> - -// Forward-declare LLVM classes. -namespace llvm { -class LLVMContext; -class Module; -} // namespace llvm - -namespace mlir { -class Operation; - -/// Convert the given LLVM-module-like operation into ROCDL IR. This conversion -/// requires the registration of the LLVM IR dialect and will extract the LLVM -/// context from the registered LLVM IR dialect. In case of error, report it to -/// the error handler registered with the MLIR context, if any (obtained from -/// the MLIR module), and return `nullptr`. -std::unique_ptr<llvm::Module> -translateModuleToROCDLIR(Operation *m, llvm::LLVMContext &llvmContext, - llvm::StringRef name = "LLVMDialectModule"); - -} // namespace mlir - -#endif // MLIR_TARGET_ROCDLIR_H diff --git a/mlir/lib/Target/CMakeLists.txt b/mlir/lib/Target/CMakeLists.txt index e951ffade6aa..a23222d37ede 100644 --- a/mlir/lib/Target/CMakeLists.txt +++ b/mlir/lib/Target/CMakeLists.txt @@ -24,26 +24,6 @@ add_mlir_translation_library(MLIRTargetLLVMIRModuleTranslation MLIRTranslation ) -add_mlir_translation_library(MLIRTargetAVX512 - LLVMIR/LLVMAVX512Intr.cpp - - ADDITIONAL_HEADER_DIRS - ${MLIR_MAIN_INCLUDE_DIR}/mlir/Target/LLVMIR - - DEPENDS - MLIRLLVMAVX512ConversionsIncGen - - LINK_COMPONENTS - Core - - LINK_LIBS PUBLIC - MLIRIR - MLIRLLVMAVX512 - MLIRLLVMIR - MLIRTargetLLVMIR - MLIRTargetLLVMIRModuleTranslation - ) - add_mlir_translation_library(MLIRTargetLLVMIR LLVMIR/ConvertFromLLVMIR.cpp LLVMIR/ConvertToLLVMIR.cpp @@ -56,89 +36,12 @@ add_mlir_translation_library(MLIRTargetLLVMIR IRReader LINK_LIBS PUBLIC + MLIRLLVMArmNeonToLLVMIRTranslation + MLIRLLVMArmSVEToLLVMIRTranslation + MLIRLLVMAVX512ToLLVMIRTranslation MLIRLLVMToLLVMIRTranslation + MLIRNVVMToLLVMIRTranslation MLIROpenMPToLLVMIRTranslation - MLIRTargetLLVMIRModuleTranslation - ) - -add_mlir_translation_library(MLIRTargetArmNeon - LLVMIR/LLVMArmNeonIntr.cpp - - ADDITIONAL_HEADER_DIRS - ${MLIR_MAIN_INCLUDE_DIR}/mlir/Target/LLVMIR - - DEPENDS - MLIRLLVMArmNeonConversionsIncGen - - LINK_COMPONENTS - Core - - LINK_LIBS PUBLIC - MLIRIR - MLIRLLVMArmNeon - MLIRLLVMIR - MLIRTargetLLVMIR - MLIRTargetLLVMIRModuleTranslation - ) - -add_mlir_translation_library(MLIRTargetArmSVE - LLVMIR/LLVMArmSVEIntr.cpp - - ADDITIONAL_HEADER_DIRS - ${MLIR_MAIN_INCLUDE_DIR}/mlir/Target/LLVMIR - - DEPENDS - MLIRLLVMArmSVEConversionsIncGen - - LINK_COMPONENTS - Core - - LINK_LIBS PUBLIC - MLIRIR - MLIRLLVMArmSVE - MLIRLLVMIR - MLIRTargetLLVMIR - MLIRTargetLLVMIRModuleTranslation - ) - -add_mlir_translation_library(MLIRTargetNVVMIR - LLVMIR/ConvertToNVVMIR.cpp - - ADDITIONAL_HEADER_DIRS - ${MLIR_MAIN_INCLUDE_DIR}/mlir/Target/LLVMIR - - DEPENDS - intrinsics_gen - - LINK_COMPONENTS - Core - - LINK_LIBS PUBLIC - MLIRGPU - MLIRIR - MLIRLLVMIR - MLIRNVVMIR - MLIRTargetLLVMIR - MLIRTargetLLVMIRModuleTranslation - ) - -add_mlir_translation_library(MLIRTargetROCDLIR - LLVMIR/ConvertToROCDLIR.cpp - - ADDITIONAL_HEADER_DIRS - ${MLIR_MAIN_INCLUDE_DIR}/mlir/Target/LLVMIR - - DEPENDS - intrinsics_gen - - LINK_COMPONENTS - Core - - LINK_LIBS PUBLIC - MLIRGPU - MLIRIR - MLIRLLVMIR - MLIRROCDLIR - MLIRTargetLLVMIR + MLIRROCDLToLLVMIRTranslation MLIRTargetLLVMIRModuleTranslation ) diff --git a/mlir/lib/Target/LLVMIR/ConvertToLLVMIR.cpp b/mlir/lib/Target/LLVMIR/ConvertToLLVMIR.cpp index 6b30748cc79b..42391513bacf 100644 --- a/mlir/lib/Target/LLVMIR/ConvertToLLVMIR.cpp +++ b/mlir/lib/Target/LLVMIR/ConvertToLLVMIR.cpp @@ -12,9 +12,19 @@ #include "mlir/Target/LLVMIR.h" +#include "mlir/Dialect/LLVMIR/LLVMAVX512Dialect.h" +#include "mlir/Dialect/LLVMIR/LLVMArmNeonDialect.h" +#include "mlir/Dialect/LLVMIR/LLVMArmSVEDialect.h" +#include "mlir/Dialect/LLVMIR/NVVMDialect.h" +#include "mlir/Dialect/LLVMIR/ROCDLDialect.h" #include "mlir/Dialect/OpenMP/OpenMPDialect.h" +#include "mlir/Target/LLVMIR/Dialect/LLVMAVX512/LLVMAVX512ToLLVMIRTranslation.h" +#include "mlir/Target/LLVMIR/Dialect/LLVMArmNeon/LLVMArmNeonToLLVMIRTranslation.h" +#include "mlir/Target/LLVMIR/Dialect/LLVMArmSVE/LLVMArmSVEToLLVMIRTranslation.h" #include "mlir/Target/LLVMIR/Dialect/LLVMIR/LLVMToLLVMIRTranslation.h" +#include "mlir/Target/LLVMIR/Dialect/NVVM/NVVMToLLVMIRTranslation.h" #include "mlir/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.h" +#include "mlir/Target/LLVMIR/Dialect/ROCDL/ROCDLToLLVMIRTranslation.h" #include "mlir/Target/LLVMIR/ModuleTranslation.h" #include "mlir/Translation.h" @@ -26,14 +36,14 @@ using namespace mlir; std::unique_ptr<llvm::Module> -mlir::translateModuleToLLVMIR(ModuleOp m, llvm::LLVMContext &llvmContext, +mlir::translateModuleToLLVMIR(Operation *op, llvm::LLVMContext &llvmContext, StringRef name) { auto llvmModule = - LLVM::ModuleTranslation::translateModule<>(m, llvmContext, name); + LLVM::ModuleTranslation::translateModule<>(op, llvmContext, name); if (!llvmModule) - emitError(m.getLoc(), "Fail to convert MLIR to LLVM IR"); + emitError(op->getLoc(), "Fail to convert MLIR to LLVM IR"); else if (verifyModule(*llvmModule)) - emitError(m.getLoc(), "LLVM IR fails to verify"); + emitError(op->getLoc(), "LLVM IR fails to verify"); return llvmModule; } @@ -70,9 +80,24 @@ void registerToLLVMIRTranslation() { return success(); }, [](DialectRegistry &registry) { - registry.insert<omp::OpenMPDialect>(); + registry.insert<omp::OpenMPDialect, LLVM::LLVMAVX512Dialect, + LLVM::LLVMArmSVEDialect, LLVM::LLVMArmNeonDialect, + NVVM::NVVMDialect, ROCDL::ROCDLDialect>(); registry.addDialectInterface<omp::OpenMPDialect, OpenMPDialectLLVMIRTranslationInterface>(); + registry + .addDialectInterface<LLVM::LLVMAVX512Dialect, + LLVMAVX512DialectLLVMIRTranslationInterface>(); + registry.addDialectInterface< + LLVM::LLVMArmNeonDialect, + LLVMArmNeonDialectLLVMIRTranslationInterface>(); + registry + .addDialectInterface<LLVM::LLVMArmSVEDialect, + LLVMArmSVEDialectLLVMIRTranslationInterface>(); + registry.addDialectInterface<NVVM::NVVMDialect, + NVVMDialectLLVMIRTranslationInterface>(); + registry.addDialectInterface<ROCDL::ROCDLDialect, + ROCDLDialectLLVMIRTranslationInterface>(); registerLLVMDialectTranslation(registry); }); } diff --git a/mlir/lib/Target/LLVMIR/ConvertToNVVMIR.cpp b/mlir/lib/Target/LLVMIR/ConvertToNVVMIR.cpp deleted file mode 100644 index 7aee913a27d7..000000000000 --- a/mlir/lib/Target/LLVMIR/ConvertToNVVMIR.cpp +++ /dev/null @@ -1,124 +0,0 @@ -//===- ConvertToNVVMIR.cpp - MLIR to LLVM IR conversion -------------------===// -// -// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. -// See https://llvm.org/LICENSE.txt for license information. -// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception -// -//===----------------------------------------------------------------------===// -// -// This file implements a translation between the MLIR LLVM + NVVM dialects and -// LLVM IR with NVVM intrinsics and metadata. -// -//===----------------------------------------------------------------------===// - -#include "mlir/Target/NVVMIR.h" - -#include "mlir/Dialect/GPU/GPUDialect.h" -#include "mlir/Dialect/LLVMIR/LLVMDialect.h" -#include "mlir/Dialect/LLVMIR/NVVMDialect.h" -#include "mlir/IR/BuiltinOps.h" -#include "mlir/Target/LLVMIR.h" -#include "mlir/Target/LLVMIR/ModuleTranslation.h" -#include "mlir/Translation.h" - -#include "llvm/ADT/StringRef.h" -#include "llvm/IR/IntrinsicsNVPTX.h" -#include "llvm/IR/Module.h" -#include "llvm/Support/ToolOutputFile.h" - -using namespace mlir; - -static llvm::Value *createIntrinsicCall(llvm::IRBuilder<> &builder, - llvm::Intrinsic::ID intrinsic, - ArrayRef<llvm::Value *> args = {}) { - llvm::Module *module = builder.GetInsertBlock()->getModule(); - llvm::Function *fn = llvm::Intrinsic::getDeclaration(module, intrinsic); - return builder.CreateCall(fn, args); -} - -static llvm::Intrinsic::ID getShflBflyIntrinsicId(llvm::Type *resultType, - bool withPredicate) { - if (withPredicate) { - resultType = cast<llvm::StructType>(resultType)->getElementType(0); - return resultType->isFloatTy() ? llvm::Intrinsic::nvvm_shfl_sync_bfly_f32p - : llvm::Intrinsic::nvvm_shfl_sync_bfly_i32p; - } - return resultType->isFloatTy() ? llvm::Intrinsic::nvvm_shfl_sync_bfly_f32 - : llvm::Intrinsic::nvvm_shfl_sync_bfly_i32; -} - -namespace { -class ModuleTranslation : public LLVM::ModuleTranslation { -public: - using LLVM::ModuleTranslation::ModuleTranslation; - -protected: - LogicalResult convertOperation(Operation &opInst, - llvm::IRBuilder<> &builder) override { - -#include "mlir/Dialect/LLVMIR/NVVMConversions.inc" - - return LLVM::ModuleTranslation::convertOperation(opInst, builder); - } - - /// Allow access to the constructor. - friend LLVM::ModuleTranslation; -}; -} // namespace - -std::unique_ptr<llvm::Module> -mlir::translateModuleToNVVMIR(Operation *m, llvm::LLVMContext &llvmContext, - StringRef name) { - // Register the translation to LLVM IR if nobody else did before. This may - // happen if this translation is called inside a pass pipeline that converts - // GPU dialects to binary blobs without translating the rest of the code. - registerLLVMDialectTranslation(*m->getContext()); - - auto llvmModule = LLVM::ModuleTranslation::translateModule<ModuleTranslation>( - m, llvmContext, name); - if (!llvmModule) - return llvmModule; - - // Insert the nvvm.annotations kernel so that the NVVM backend recognizes the - // function as a kernel. - for (auto func : - ModuleTranslation::getModuleBody(m).getOps<LLVM::LLVMFuncOp>()) { - if (!func->getAttrOfType<UnitAttr>( - NVVM::NVVMDialect::getKernelFuncAttrName())) - continue; - - auto *llvmFunc = llvmModule->getFunction(func.getName()); - - llvm::Metadata *llvmMetadata[] = { - llvm::ValueAsMetadata::get(llvmFunc), - llvm::MDString::get(llvmModule->getContext(), "kernel"), - llvm::ValueAsMetadata::get(llvm::ConstantInt::get( - llvm::Type::getInt32Ty(llvmModule->getContext()), 1))}; - llvm::MDNode *llvmMetadataNode = - llvm::MDNode::get(llvmModule->getContext(), llvmMetadata); - llvmModule->getOrInsertNamedMetadata("nvvm.annotations") - ->addOperand(llvmMetadataNode); - } - - return llvmModule; -} - -namespace mlir { -void registerToNVVMIRTranslation() { - TranslateFromMLIRRegistration registration( - "mlir-to-nvvmir", - [](ModuleOp module, raw_ostream &output) { - llvm::LLVMContext llvmContext; - auto llvmModule = mlir::translateModuleToNVVMIR(module, llvmContext); - if (!llvmModule) </cut>

4 years

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_apm/llvm-master-aarch64-spec2k6-Oz - Build # 8 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *gcc* in CI configuration tcwg_bmk_llvm_apm/llvm-master-aarch64-spec2k6-Oz. So far, this commit has regressed CI configurations: - tcwg_bmk_llvm_apm/llvm-master-aarch64-spec2k6-Oz Culprit: <cut> commit 844105d912a4337635001fc2077404aefb90c4c6 Author: GCC Administrator <gccadmin(a)gcc.gnu.org> Date: Mon Aug 9 00:16:32 2021 +0000 Daily bump. </cut> Results regressed to (for first_bad == 844105d912a4337635001fc2077404aefb90c4c6) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -Oz artifacts/build-844105d912a4337635001fc2077404aefb90c4c6/results_id: 1 # 482.sphinx3,[.] OUTLINED_FUNCTION_4 regressed by 117 from (for last_good == 5f564fd013327977a4d04aa520237d17f641f01a) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -Oz artifacts/build-5f564fd013327977a4d04aa520237d17f641f01a/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Results ID of last_good: apm_64/tcwg_bmk_llvm_apm/bisect-llvm-master-aarch64-spec2k6-Oz/4114 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Results ID of first_bad: apm_64/tcwg_bmk_llvm_apm/bisect-llvm-master-aarch64-spec2k6-Oz/4109 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Configuration details: Reproduce builds: <cut> mkdir investigate-gcc-844105d912a4337635001fc2077404aefb90c4c6 cd investigate-gcc-844105d912a4337635001fc2077404aefb90c4c6 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /gcc/ ./ ./bisect/baseline/ cd gcc # Reproduce first_bad build git checkout --detach 844105d912a4337635001fc2077404aefb90c4c6 ../artifacts/test.sh # Reproduce last_good build git checkout --detach 5f564fd013327977a4d04aa520237d17f641f01a ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-… Full commit (up to 1000 lines): <cut> commit 844105d912a4337635001fc2077404aefb90c4c6 Author: GCC Administrator <gccadmin(a)gcc.gnu.org> Date: Mon Aug 9 00:16:32 2021 +0000 Daily bump. --- gcc/ChangeLog | 4 ++++ gcc/DATESTAMP | 2 +- gcc/testsuite/ChangeLog | 4 ++++ libstdc++-v3/ChangeLog | 18 ++++++++++++++++++ 4 files changed, 27 insertions(+), 1 deletion(-) diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 3b0d1b06e9c..9d39f0f0064 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,7 @@ +2021-08-08 Sergei Trofimovich <siarheit(a)google.com> + + * lra-constraints.c: Fix s/otput/output/ typo. + 2021-08-06 Martin Sebor <msebor(a)redhat.com> * builtins.c (expand_builtin_memchr): Move to gimple-ssa-warn-access.cc. diff --git a/gcc/DATESTAMP b/gcc/DATESTAMP index 3b0c65fa4e8..859da5af87c 100644 --- a/gcc/DATESTAMP +++ b/gcc/DATESTAMP @@ -1 +1 @@ -20210808 +20210809 diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index fa9298d30df..034fc300a63 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,7 @@ +2021-08-08 Jeff Law <jlaw(a)localhost.localdomain> + + * gcc.target/tic6x/rotdi16-scan.c: Pull rotate into its own function. + 2021-08-07 H.J. Lu <hjl.tools(a)gmail.com> PR tree-optimization/88531 diff --git a/libstdc++-v3/ChangeLog b/libstdc++-v3/ChangeLog index da75afb8ebc..70fb007885a 100644 --- a/libstdc++-v3/ChangeLog +++ b/libstdc++-v3/ChangeLog @@ -1,3 +1,21 @@ +2021-08-08 François Dumont <fdumont(a)gcc.gnu.org> + + * testsuite/25_algorithms/copy/debug/constexpr_neg.cc: Replace 'failed_assertion' + dg-prune-output reason with 'builtin_unreachable'. + * testsuite/25_algorithms/copy_backward/debug/constexpr_neg.cc: Likewise. + * testsuite/25_algorithms/equal/debug/constexpr_neg.cc: Likewise. + * testsuite/25_algorithms/lower_bound/debug/constexpr_partitioned_neg.cc: Likewise. + * testsuite/25_algorithms/lower_bound/debug/constexpr_partitioned_pred_neg.cc: Likewise. + * testsuite/25_algorithms/lower_bound/debug/constexpr_valid_range_neg.cc: Likewise. + * testsuite/25_algorithms/upper_bound/debug/constexpr_partitioned_neg.cc: Likewise. + * testsuite/25_algorithms/upper_bound/debug/constexpr_partitioned_pred_neg.cc: Likewise. + * testsuite/25_algorithms/upper_bound/debug/constexpr_valid_range_neg.cc: Likewise. + +2021-08-08 Hans-Peter Nilsson <hp(a)bitrange.com> + + * testsuite/std/ranges/iota/max_size_type.cc: Set + dg-timeout-factor to 4. + 2021-08-06 Jonathan Wakely <jwakely(a)redhat.com> * libsupc++/compare (compare_three_way, strong_order) </cut>

4 years

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tx1/llvm-release-aarch64-spec2k6-O2_LTO - Build # 6 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_tx1/llvm-release-aarch64-spec2k6-O2_LTO. So far, this commit has regressed CI configurations: - tcwg_bmk_llvm_tx1/llvm-release-aarch64-spec2k6-O2_LTO Culprit: <cut> commit f8ed31cd991bf84372369409595707f6e3cebbce Author: Jonny Farley <jonny.farley(a)sony.com> Date: Tue Feb 16 16:34:35 2021 +0000 [Fuzzer][Test] Use %python substitution for trace-malloc-unbalanced.test This test was found to fail for some of our downstream builds, on computers where python was not on the default $PATH. Therefore add a %python substitution to use sys.executable, based on similar solutions for python calls in tests elsewhere in LLVM. Differential Revision: https://reviews.llvm.org/D96799 </cut> Results regressed to (for first_bad == f8ed31cd991bf84372369409595707f6e3cebbce) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O2_LTO artifacts/build-f8ed31cd991bf84372369409595707f6e3cebbce/results_id: 1 # 400.perlbench,perlbench_base.default regressed by 103 from (for last_good == cb2876800cc827431f4143c4fc5595c6c1191269) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O2_LTO artifacts/build-cb2876800cc827431f4143c4fc5595c6c1191269/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… Results ID of last_good: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-release-aarch64-spec2k6-O2_LTO/4095 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… Results ID of first_bad: tx1_64/tcwg_bmk_llvm_tx1/bisect-llvm-release-aarch64-spec2k6-O2_LTO/4070 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… Configuration details: Reproduce builds: <cut> mkdir investigate-llvm-f8ed31cd991bf84372369409595707f6e3cebbce cd investigate-llvm-f8ed31cd991bf84372369409595707f6e3cebbce git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach f8ed31cd991bf84372369409595707f6e3cebbce ../artifacts/test.sh # Reproduce last_good build git checkout --detach cb2876800cc827431f4143c4fc5595c6c1191269 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-release… Full commit (up to 1000 lines): <cut> commit f8ed31cd991bf84372369409595707f6e3cebbce Author: Jonny Farley <jonny.farley(a)sony.com> Date: Tue Feb 16 16:34:35 2021 +0000 [Fuzzer][Test] Use %python substitution for trace-malloc-unbalanced.test This test was found to fail for some of our downstream builds, on computers where python was not on the default $PATH. Therefore add a %python substitution to use sys.executable, based on similar solutions for python calls in tests elsewhere in LLVM. Differential Revision: https://reviews.llvm.org/D96799 --- compiler-rt/test/fuzzer/lit.cfg.py | 2 ++ compiler-rt/test/fuzzer/trace-malloc-unbalanced.test | 4 ++-- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/compiler-rt/test/fuzzer/lit.cfg.py b/compiler-rt/test/fuzzer/lit.cfg.py index 00871b8a94a3..d114ae7937fa 100644 --- a/compiler-rt/test/fuzzer/lit.cfg.py +++ b/compiler-rt/test/fuzzer/lit.cfg.py @@ -61,6 +61,8 @@ config.substitutions.append(('%build_dir', config.cmake_binary_dir)) libfuzzer_src_root = os.path.join(config.compiler_rt_src_root, "lib", "fuzzer") config.substitutions.append(('%libfuzzer_src', libfuzzer_src_root)) +config.substitutions.append(('%python', '"%s"' % (sys.executable))) + def generate_compiler_cmd(is_cpp=True, fuzzer_enabled=True, msan_enabled=False): compiler_cmd = config.clang extra_cmd = config.target_flags diff --git a/compiler-rt/test/fuzzer/trace-malloc-unbalanced.test b/compiler-rt/test/fuzzer/trace-malloc-unbalanced.test index c7b4632140cb..395386ab5cd2 100644 --- a/compiler-rt/test/fuzzer/trace-malloc-unbalanced.test +++ b/compiler-rt/test/fuzzer/trace-malloc-unbalanced.test @@ -8,10 +8,10 @@ RUN: %cpp_compiler %S/TraceMallocTest.cpp -o %t-TraceMallocTest # Specify python because we can't use the shebang line on Windows. RUN: %run %t-TraceMallocTest -seed=1 -trace_malloc=1 -runs=200 2>&1 | \ -RUN: python %libfuzzer_src/scripts/unbalanced_allocs.py --skip=5 | FileCheck %s +RUN: %python %libfuzzer_src/scripts/unbalanced_allocs.py --skip=5 | FileCheck %s RUN: %run %t-TraceMallocTest -seed=1 -trace_malloc=2 -runs=200 2>&1 | \ -RUN: python %libfuzzer_src/scripts/unbalanced_allocs.py --skip=5 | FileCheck %s --check-prefixes=CHECK,CHECK2 +RUN: %python %libfuzzer_src/scripts/unbalanced_allocs.py --skip=5 | FileCheck %s --check-prefixes=CHECK,CHECK2 CHECK: MallocFreeTracer: START # Behavior of the format string "%p" is implementation defined. Account for the </cut>

4 years

1
0
0 0

[CI-NOTIFY]: TCWG Bisect tcwg_bmk_tk1/llvm-release-arm-spec2k6-O3_LTO - Build # 7 - Successful!

by ci_notify＠linaro.org

Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_tk1/llvm-release-arm-spec2k6-O3_LTO. So far, this commit has regressed CI configurations: - tcwg_bmk_llvm_tk1/llvm-release-arm-spec2k6-O3_LTO Culprit: <cut> commit f343a730596b6b02039a91d71dc16c113d09cfe6 Author: Vitaly Buka <vitalybuka(a)google.com> Date: Fri Apr 2 00:17:45 2021 -0700 [NFC][scudo] Convert ScudoPrimaryTest into TYPED_TEST </cut> Results regressed to (for first_bad == f343a730596b6b02039a91d71dc16c113d09cfe6) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O3_LTO_marm artifacts/build-f343a730596b6b02039a91d71dc16c113d09cfe6/results_id: 1 # 462.libquantum,libquantum_base.default regressed by 104 from (for last_good == 28ea218417d713bcb399e9428e4c3f8f7960feb2) # reset_artifacts: -10 # build_abe binutils: -9 # build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -8 # build_abe linux: -7 # build_abe glibc: -6 # build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer: -5 # build_llvm true: -3 # true: 0 # benchmark -- -O3_LTO_marm artifacts/build-28ea218417d713bcb399e9428e4c3f8f7960feb2/results_id: 1 Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Results ID of last_good: tk1_32/tcwg_bmk_llvm_tk1/bisect-llvm-release-arm-spec2k6-O3_LTO/4078 Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Results ID of first_bad: tk1_32/tcwg_bmk_llvm_tk1/bisect-llvm-release-arm-spec2k6-O3_LTO/4076 Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Configuration details: Reproduce builds: <cut> mkdir investigate-llvm-f343a730596b6b02039a91d71dc16c113d09cfe6 cd investigate-llvm-f343a730596b6b02039a91d71dc16c113d09cfe6 git clone https://git.linaro.org/toolchain/jenkins-scripts mkdir -p artifacts/manifests curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail chmod +x artifacts/test.sh # Reproduce the baseline build (build all pre-requisites) ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh # Save baseline build state (which is then restored in artifacts/test.sh) mkdir -p ./bisect rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/ cd llvm # Reproduce first_bad build git checkout --detach f343a730596b6b02039a91d71dc16c113d09cfe6 ../artifacts/test.sh # Reproduce last_good build git checkout --detach 28ea218417d713bcb399e9428e4c3f8f7960feb2 ../artifacts/test.sh cd .. </cut> History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/… Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… Full commit (up to 1000 lines): <cut> commit f343a730596b6b02039a91d71dc16c113d09cfe6 Author: Vitaly Buka <vitalybuka(a)google.com> Date: Fri Apr 2 00:17:45 2021 -0700 [NFC][scudo] Convert ScudoPrimaryTest into TYPED_TEST --- .../lib/scudo/standalone/tests/primary_test.cpp | 73 +++++++++++++--------- 1 file changed, 44 insertions(+), 29 deletions(-) diff --git a/compiler-rt/lib/scudo/standalone/tests/primary_test.cpp b/compiler-rt/lib/scudo/standalone/tests/primary_test.cpp index 38bf67150853..07f3d6b77c17 100644 --- a/compiler-rt/lib/scudo/standalone/tests/primary_test.cpp +++ b/compiler-rt/lib/scudo/standalone/tests/primary_test.cpp @@ -52,8 +52,7 @@ template <typename Primary> static void testPrimary() { Str.output(); } -template <typename SizeClassMapT> struct TestConfig1 { - using SizeClassMap = SizeClassMapT; +struct TestConfig1 { static const scudo::uptr PrimaryRegionSizeLog = 18U; static const scudo::s32 PrimaryMinReleaseToOsIntervalMs = INT32_MIN; static const scudo::s32 PrimaryMaxReleaseToOsIntervalMs = INT32_MAX; @@ -62,8 +61,7 @@ template <typename SizeClassMapT> struct TestConfig1 { static const scudo::uptr PrimaryCompactPtrScale = 0; }; -template <typename SizeClassMapT> struct TestConfig2 { - using SizeClassMap = SizeClassMapT; +struct TestConfig2 { static const scudo::uptr PrimaryRegionSizeLog = 24U; static const scudo::s32 PrimaryMinReleaseToOsIntervalMs = INT32_MIN; static const scudo::s32 PrimaryMaxReleaseToOsIntervalMs = INT32_MAX; @@ -72,8 +70,7 @@ template <typename SizeClassMapT> struct TestConfig2 { static const scudo::uptr PrimaryCompactPtrScale = 0; }; -template <typename SizeClassMapT> struct TestConfig3 { - using SizeClassMap = SizeClassMapT; +struct TestConfig3 { static const scudo::uptr PrimaryRegionSizeLog = 24U; static const scudo::s32 PrimaryMinReleaseToOsIntervalMs = INT32_MIN; static const scudo::s32 PrimaryMaxReleaseToOsIntervalMs = INT32_MAX; @@ -82,13 +79,43 @@ template <typename SizeClassMapT> struct TestConfig3 { static const scudo::uptr PrimaryCompactPtrScale = 0; }; -TEST(ScudoPrimaryTest, BasicPrimary) { - using SizeClassMap = scudo::DefaultSizeClassMap; +template <typename BaseConfig, typename SizeClassMapT> +struct Config : public BaseConfig { + using SizeClassMap = SizeClassMapT; +}; + +template <typename BaseConfig, typename SizeClassMapT> struct MakeAllocator { + using Value = scudo::SizeClassAllocator64<Config<BaseConfig, SizeClassMapT>>; +}; + +template <typename SizeClassMapT> +struct MakeAllocator<TestConfig1, SizeClassMapT> { + using Value = scudo::SizeClassAllocator32<Config<TestConfig1, SizeClassMapT>>; +}; + +namespace testing { +namespace internal { +#define SCUDO_DEFINE_GTEST_TYPE_NAME(TYPE) \ + template <> std::string GetTypeName<TYPE>() { return #TYPE; } +SCUDO_DEFINE_GTEST_TYPE_NAME(TestConfig1) +SCUDO_DEFINE_GTEST_TYPE_NAME(TestConfig2) +SCUDO_DEFINE_GTEST_TYPE_NAME(TestConfig3) +#undef SCUDO_DEFINE_GTEST_TYPE_NAME +} // namespace internal +} // namespace testing + +template <class BaseConfig> struct ScudoPrimaryTest : public ::testing::Test {}; + +using ScudoPrimaryTestTypes = testing::Types< #if !SCUDO_FUCHSIA - testPrimary<scudo::SizeClassAllocator32<TestConfig1<SizeClassMap>>>(); + TestConfig1, #endif - testPrimary<scudo::SizeClassAllocator64<TestConfig2<SizeClassMap>>>(); - testPrimary<scudo::SizeClassAllocator64<TestConfig3<SizeClassMap>>>(); + TestConfig2, TestConfig3>; +TYPED_TEST_CASE(ScudoPrimaryTest, ScudoPrimaryTestTypes); + +TYPED_TEST(ScudoPrimaryTest, BasicPrimary) { + using SizeClassMap = scudo::DefaultSizeClassMap; + testPrimary<typename MakeAllocator<TypeParam, SizeClassMap>::Value>(); } struct SmallRegionsConfig { @@ -178,13 +205,9 @@ template <typename Primary> static void testIteratePrimary() { Str.output(); } -TEST(ScudoPrimaryTest, PrimaryIterate) { +TYPED_TEST(ScudoPrimaryTest, PrimaryIterate) { using SizeClassMap = scudo::DefaultSizeClassMap; -#if !SCUDO_FUCHSIA - testIteratePrimary<scudo::SizeClassAllocator32<TestConfig1<SizeClassMap>>>(); -#endif - testIteratePrimary<scudo::SizeClassAllocator64<TestConfig2<SizeClassMap>>>(); - testIteratePrimary<scudo::SizeClassAllocator64<TestConfig3<SizeClassMap>>>(); + testIteratePrimary<typename MakeAllocator<TypeParam, SizeClassMap>::Value>(); } static std::mutex Mutex; @@ -239,13 +262,9 @@ template <typename Primary> static void testPrimaryThreaded() { Str.output(); } -TEST(ScudoPrimaryTest, PrimaryThreaded) { +TYPED_TEST(ScudoPrimaryTest, PrimaryThreaded) { using SizeClassMap = scudo::SvelteSizeClassMap; -#if !SCUDO_FUCHSIA - testPrimaryThreaded<scudo::SizeClassAllocator32<TestConfig1<SizeClassMap>>>(); -#endif - testPrimaryThreaded<scudo::SizeClassAllocator64<TestConfig2<SizeClassMap>>>(); - testPrimaryThreaded<scudo::SizeClassAllocator64<TestConfig3<SizeClassMap>>>(); + testPrimaryThreaded<typename MakeAllocator<TypeParam, SizeClassMap>::Value>(); } // Through a simple allocation that spans two pages, verify that releaseToOS @@ -270,11 +289,7 @@ template <typename Primary> static void testReleaseToOS() { EXPECT_GT(Allocator->releaseToOS(), 0U); } -TEST(ScudoPrimaryTest, ReleaseToOS) { +TYPED_TEST(ScudoPrimaryTest, ReleaseToOS) { using SizeClassMap = scudo::DefaultSizeClassMap; -#if !SCUDO_FUCHSIA - testReleaseToOS<scudo::SizeClassAllocator32<TestConfig1<SizeClassMap>>>(); -#endif - testReleaseToOS<scudo::SizeClassAllocator64<TestConfig2<SizeClassMap>>>(); - testReleaseToOS<scudo::SizeClassAllocator64<TestConfig3<SizeClassMap>>>(); + testReleaseToOS<typename MakeAllocator<TypeParam, SizeClassMap>::Value>(); } </cut>

4 years

1
0
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

linaro-toolchain