Progress (short week, 3 days):
* UM-2 [QEMU upstream maintainership]
+ QEMU 6.1.0 has now been released
+ Sent out the first arm pullreq for the 6.2 cycle, including
another slice of the MVE patches
+ Tried to work through some of the code review backlog
-- PMM
Successfully identified regression in *gcc* in CI configuration tcwg_gcc_bootstrap/master-arm-bootstrap_debug. So far, this commit has regressed CI configurations:
- tcwg_gcc_bootstrap/master-arm-bootstrap_debug
Culprit:
<cut>
commit 1d244020246cb155e4de62ca3b302b920a1f513f
Author: Roger Sayle <roger(a)nextmovesoftware.com>
Date: Mon Aug 23 12:37:04 2021 +0100
Fold sign of LSHIFT_EXPR to eliminate no-op conversions.
This short patch teaches fold that it is "safe" to change the sign
of a left shift, to reduce the number of type conversions in gimple.
As an example:
unsigned int foo(unsigned int i) {
return (int)i << 8;
}
is currently optimized to:
unsigned int foo (unsigned int i)
{
int i.0_1;
int _2;
unsigned int _4;
<bb 2> [local count: 1073741824]:
i.0_1 = (int) i_3(D);
_2 = i.0_1 << 8;
_4 = (unsigned int) _2;
return _4;
}
with this patch, this now becomes:
unsigned int foo (unsigned int i)
{
unsigned int _2;
<bb 2> [local count: 1073741824]:
_2 = i_1(D) << 8;
return _2;
}
which generates exactly the same assembly language. Aside from the
reduced memory usage, the real benefit is that no-op conversions tend
to interfere with many folding optimizations. For example,
unsigned int bar(unsigned char i) {
return (i ^ (i<<16)) | (i<<8);
}
currently gets (tangled in conversions and) optimized to:
unsigned int bar (unsigned char i)
{
unsigned int _1;
unsigned int _2;
int _3;
int _4;
unsigned int _6;
unsigned int _8;
<bb 2> [local count: 1073741824]:
_1 = (unsigned int) i_5(D);
_2 = _1 * 65537;
_3 = (int) i_5(D);
_4 = _3 << 8;
_8 = (unsigned int) _4;
_6 = _2 | _8;
return _6;
}
but with this patch, bar now optimizes down to:
unsigned int bar(unsigned char i)
{
unsigned int _1;
unsigned int _4;
<bb 2> [local count: 1073741824]:
_1 = (unsigned int) i_3(D);
_4 = _1 * 65793;
return _4;
}
2021-08-23 Roger Sayle <roger(a)nextmovesoftware.com>
gcc/ChangeLog
* match.pd (shift transformations): Change the sign of an
LSHIFT_EXPR if it reduces the number of explicit conversions.
gcc/testsuite/ChangeLog
* gcc.dg/fold-convlshift-1.c: New test case.
* gcc.dg/fold-convlshift-2.c: New test case.
</cut>
Results regressed to (for first_bad == 1d244020246cb155e4de62ca3b302b920a1f513f)
# reset_artifacts:
-10
# true:
0
# build_abe binutils:
1
# First few build errors in logs:
# 00:06:26 make[3]: [armv8l-unknown-linux-gnueabihf/bits/largefile-config.h] Error 1 (ignored)
# 00:25:39 make[3]: [armv8l-unknown-linux-gnueabihf/bits/largefile-config.h] Error 1 (ignored)
# 00:29:38 /home/tcwg-buildslave/workspace/tcwg_gnu_8/abe/snapshots/gcc.git~master/gcc/bitmap.h:357:13: error: type mismatch in ‘lshift_expr’
# 00:29:38 /home/tcwg-buildslave/workspace/tcwg_gnu_8/abe/snapshots/gcc.git~master/gcc/bitmap.h:357:13: internal compiler error: ‘verify_gimple’ failed
# 00:29:38 make[3]: *** [bitmap.o] Error 1
# 00:34:06 make[2]: *** [all-stage3-gcc] Error 2
# 00:34:06 make[1]: *** [stage3-bubble] Error 2
# 00:34:07 make: *** [all] Error 2
from (for last_good == b320edc0c29c838b0090c3c9be14187d132f73f2)
# reset_artifacts:
-10
# true:
0
# build_abe binutils:
1
# build_abe bootstrap_debug:
2
Artifacts of last_good build: https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_de…
Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_de…
Build top page/logs: https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_de…
Configuration details:
Reproduce builds:
<cut>
mkdir investigate-gcc-1d244020246cb155e4de62ca3b302b920a1f513f
cd investigate-gcc-1d244020246cb155e4de62ca3b302b920a1f513f
git clone https://git.linaro.org/toolchain/jenkins-scripts
mkdir -p artifacts/manifests
curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_de… --fail
curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_de… --fail
curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_de… --fail
chmod +x artifacts/test.sh
# Reproduce the baseline build (build all pre-requisites)
./jenkins-scripts/tcwg_gnu-build.sh @@ artifacts/manifests/build-baseline.sh
# Save baseline build state (which is then restored in artifacts/test.sh)
mkdir -p ./bisect
rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /gcc/ ./ ./bisect/baseline/
cd gcc
# Reproduce first_bad build
git checkout --detach 1d244020246cb155e4de62ca3b302b920a1f513f
../artifacts/test.sh
# Reproduce last_good build
git checkout --detach b320edc0c29c838b0090c3c9be14187d132f73f2
../artifacts/test.sh
cd ..
</cut>
History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/…
Artifacts: https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_de…
Build log: https://ci.linaro.org/job/tcwg_gcc_bootstrap-bisect-master-arm-bootstrap_de…
Full commit (up to 1000 lines):
<cut>
commit 1d244020246cb155e4de62ca3b302b920a1f513f
Author: Roger Sayle <roger(a)nextmovesoftware.com>
Date: Mon Aug 23 12:37:04 2021 +0100
Fold sign of LSHIFT_EXPR to eliminate no-op conversions.
This short patch teaches fold that it is "safe" to change the sign
of a left shift, to reduce the number of type conversions in gimple.
As an example:
unsigned int foo(unsigned int i) {
return (int)i << 8;
}
is currently optimized to:
unsigned int foo (unsigned int i)
{
int i.0_1;
int _2;
unsigned int _4;
<bb 2> [local count: 1073741824]:
i.0_1 = (int) i_3(D);
_2 = i.0_1 << 8;
_4 = (unsigned int) _2;
return _4;
}
with this patch, this now becomes:
unsigned int foo (unsigned int i)
{
unsigned int _2;
<bb 2> [local count: 1073741824]:
_2 = i_1(D) << 8;
return _2;
}
which generates exactly the same assembly language. Aside from the
reduced memory usage, the real benefit is that no-op conversions tend
to interfere with many folding optimizations. For example,
unsigned int bar(unsigned char i) {
return (i ^ (i<<16)) | (i<<8);
}
currently gets (tangled in conversions and) optimized to:
unsigned int bar (unsigned char i)
{
unsigned int _1;
unsigned int _2;
int _3;
int _4;
unsigned int _6;
unsigned int _8;
<bb 2> [local count: 1073741824]:
_1 = (unsigned int) i_5(D);
_2 = _1 * 65537;
_3 = (int) i_5(D);
_4 = _3 << 8;
_8 = (unsigned int) _4;
_6 = _2 | _8;
return _6;
}
but with this patch, bar now optimizes down to:
unsigned int bar(unsigned char i)
{
unsigned int _1;
unsigned int _4;
<bb 2> [local count: 1073741824]:
_1 = (unsigned int) i_3(D);
_4 = _1 * 65793;
return _4;
}
2021-08-23 Roger Sayle <roger(a)nextmovesoftware.com>
gcc/ChangeLog
* match.pd (shift transformations): Change the sign of an
LSHIFT_EXPR if it reduces the number of explicit conversions.
gcc/testsuite/ChangeLog
* gcc.dg/fold-convlshift-1.c: New test case.
* gcc.dg/fold-convlshift-2.c: New test case.
---
gcc/match.pd | 9 +++++++++
gcc/testsuite/gcc.dg/fold-convlshift-1.c | 20 ++++++++++++++++++++
gcc/testsuite/gcc.dg/fold-convlshift-2.c | 20 ++++++++++++++++++++
3 files changed, 49 insertions(+)
diff --git a/gcc/match.pd b/gcc/match.pd
index 0fcfd0ea62c..978a1b0172e 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3385,6 +3385,15 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(if (integer_zerop (@2) || integer_all_onesp (@2))
(cmp @0 @2)))))
+/* Both signed and unsigned lshift produce the same result, so use
+ the form that minimizes the number of conversions. */
+(simplify
+ (convert (lshift:s@0 (convert:s@1 @2) INTEGER_CST@3))
+ (if (tree_nop_conversion_p (type, TREE_TYPE (@0))
+ && INTEGRAL_TYPE_P (TREE_TYPE (@2))
+ && TYPE_PRECISION (TREE_TYPE (@2)) <= TYPE_PRECISION (type))
+ (lshift (convert @2) @3)))
+
/* Simplifications of conversions. */
/* Basic strip-useless-type-conversions / strip_nops. */
diff --git a/gcc/testsuite/gcc.dg/fold-convlshift-1.c b/gcc/testsuite/gcc.dg/fold-convlshift-1.c
new file mode 100644
index 00000000000..b6f57f81e72
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/fold-convlshift-1.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+unsigned int foo(unsigned int i)
+{
+ int t1 = i;
+ int t2 = t1 << 8;
+ return t2;
+}
+
+int bar(int i)
+{
+ unsigned int t1 = i;
+ unsigned int t2 = t1 << 8;
+ return t2;
+}
+
+/* { dg-final { scan-tree-dump-not "\\(int\\)" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "\\(unsigned int\\)" "optimized" } } */
+
diff --git a/gcc/testsuite/gcc.dg/fold-convlshift-2.c b/gcc/testsuite/gcc.dg/fold-convlshift-2.c
new file mode 100644
index 00000000000..f21358c4584
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/fold-convlshift-2.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+unsigned int foo(unsigned char c)
+{
+ int t1 = c;
+ int t2 = t1 << 8;
+ return t2;
+}
+
+int bar(unsigned char c)
+{
+ unsigned int t1 = c;
+ unsigned int t2 = t1 << 8;
+ return t2;
+}
+
+/* { dg-final { scan-tree-dump-times "\\(int\\)" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\\(unsigned int\\)" 1 "optimized" } } */
+
</cut>
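The commit message above states that bar() folds down to a single multiply by 65793, and that a left shift gives the same result whether it is done on a signed or an unsigned operand. The following is a minimal sketch that checks both claims by brute force. It assumes a 32-bit int and an 8-bit unsigned char. The names bar_original, bar_folded, bar_forms_agree, foo_signed and foo_unsigned are made up for this illustration and do not appear in the GCC tree. It says nothing about the verify_gimple failure reported in the build log.

```c
#include <assert.h>

/* Source form of bar() as written in the commit message. */
unsigned int bar_original(unsigned char i)
{
    return (i ^ (i << 16)) | (i << 8);
}

/* Folded form: the three copies of i occupy disjoint bit ranges
   (bits 0-7, 8-15 and 16-23), so XOR and OR behave like addition and
   the expression equals i * (1 + 256 + 65536) = i * 65793. */
unsigned int bar_folded(unsigned char i)
{
    return (unsigned int)i * 65793u;
}

/* Returns 1 if both forms agree for every value from 0 to 255. */
int bar_forms_agree(void)
{
    for (unsigned int v = 0; v <= 255; v++) {
        if (bar_original((unsigned char)v) != bar_folded((unsigned char)v))
            return 0;
    }
    return 1;
}

/* foo() from the commit message: shift performed on a signed operand.
   ISO C only defines this when the result fits in int, so callers in
   this sketch keep i below 2^23. */
unsigned int foo_signed(unsigned int i)
{
    return (int)i << 8;
}

/* The same shift performed directly on the unsigned operand. */
unsigned int foo_unsigned(unsigned int i)
{
    return i << 8;
}
```

Calling bar_forms_agree() from a small main() and comparing foo_signed() with foo_unsigned() on inputs below 2^23 is enough to exercise the sketch. Larger inputs are left out on purpose, because the signed shift would then overflow, which ISO C leaves undefined.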
Hi everyone,
We are shifting the ClangBuiltLinux mailing list from
clang-built-linux(a)googlegroups.com to llvm(a)lists.linux.dev. Google
Groups has served us well but moving to lists.linux.dev allows for
easier archival (as we will be on lore.kernel.org automatically) and
makes it easier for people to subscribe, as they only need an email
address, rather than a Google account.
Please follow these directions to subscribe to the new mailing list:
https://subspace.kernel.org/index.html#subscribing
Some more information about lists.linux.dev:
https://www.kernel.org/lists-linux-dev.html
https://subspace.kernel.org/lists.linux.dev.html
I have added CI maintainers/mailing lists that send us regular reports
to this announcement. Please continue to send us emails about build
results, just switch the email from clang-built-linux(a)googlegroups.com
to llvm(a)lists.linux.dev so that they get archived as a part of lore and
can be easily searched, especially with the upcoming
https://x-lore.kernel.org/all/.
I will send a patch shortly to update MAINTAINERS.
Cheers,
Nathan
Successfully identified regression in *llvm* in CI configuration tcwg_bmk_llvm_tk1/llvm-release-arm-spec2k6-O2_LTO. So far, this commit has regressed CI configurations:
- tcwg_bmk_llvm_tk1/llvm-release-arm-spec2k6-O2_LTO
Culprit:
<cut>
commit 880822255e21179e9706ebaf77fff9111d9d3844
Author: Tobias Gysi <gysit(a)google.com>
Date: Wed Mar 24 14:22:17 2021 +0000
[mlir][linalg] Do not call region builder during vectorization.
All linalg operations having a region builder shall call it during op creation. Calling it during vectorization is obsolete.
Differential Revision: https://reviews.llvm.org/D99168
</cut>
Results regressed to (for first_bad == 880822255e21179e9706ebaf77fff9111d9d3844)
# reset_artifacts:
-10
# build_abe binutils:
-9
# build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer:
-8
# build_abe linux:
-7
# build_abe glibc:
-6
# build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer:
-5
# build_llvm true:
-3
# true:
0
# benchmark -- -O2_LTO_marm artifacts/build-880822255e21179e9706ebaf77fff9111d9d3844/results_id:
1
# 456.hmmer,hmmer_base.default regressed by 104
from (for last_good == 92417ebbd10382436136ed5e755be567304ac139)
# reset_artifacts:
-10
# build_abe binutils:
-9
# build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer:
-8
# build_abe linux:
-7
# build_abe glibc:
-6
# build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer:
-5
# build_llvm true:
-3
# true:
0
# benchmark -- -O2_LTO_marm artifacts/build-92417ebbd10382436136ed5e755be567304ac139/results_id:
1
Artifacts of last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release…
Results ID of last_good: tk1_32/tcwg_bmk_llvm_tk1/bisect-llvm-release-arm-spec2k6-O2_LTO/4267
Artifacts of first_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release…
Results ID of first_bad: tk1_32/tcwg_bmk_llvm_tk1/bisect-llvm-release-arm-spec2k6-O2_LTO/4268
Build top page/logs: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release…
Configuration details:
Reproduce builds:
<cut>
mkdir investigate-llvm-880822255e21179e9706ebaf77fff9111d9d3844
cd investigate-llvm-880822255e21179e9706ebaf77fff9111d9d3844
git clone https://git.linaro.org/toolchain/jenkins-scripts
mkdir -p artifacts/manifests
curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail
curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail
curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release… --fail
chmod +x artifacts/test.sh
# Reproduce the baseline build (build all pre-requisites)
./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh
# Save baseline build state (which is then restored in artifacts/test.sh)
mkdir -p ./bisect
rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/
cd llvm
# Reproduce first_bad build
git checkout --detach 880822255e21179e9706ebaf77fff9111d9d3844
../artifacts/test.sh
# Reproduce last_good build
git checkout --detach 92417ebbd10382436136ed5e755be567304ac139
../artifacts/test.sh
cd ..
</cut>
History of pending regressions and results: https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/…
Artifacts: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release…
Build log: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release…
Full commit (up to 1000 lines):
<cut>
commit 880822255e21179e9706ebaf77fff9111d9d3844
Author: Tobias Gysi <gysit(a)google.com>
Date: Wed Mar 24 14:22:17 2021 +0000
[mlir][linalg] Do not call region builder during vectorization.
All linalg operations having a region builder shall call it during op creation. Calling it during vectorization is obsolete.
Differential Revision: https://reviews.llvm.org/D99168
---
.../Dialect/Linalg/Transforms/Vectorization.cpp | 31 ++++++----------------
1 file changed, 8 insertions(+), 23 deletions(-)
diff --git a/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp b/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
index d4581013ae69..10562d68a9e0 100644
--- a/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
+++ b/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
@@ -288,7 +288,7 @@ static AffineMap getTransferReadMap(LinalgOp linalgOp, unsigned argIndex) {
/// Generic vectorization function that rewrites the body of a `linalgOp` into
/// vector form. Generic vectorization proceeds as follows:
-/// 1. The region for the linalg op is created if necessary.
+/// 1. Verify the `linalgOp` has one non-empty region.
/// 2. Values defined above the region are mapped to themselves and will be
/// broadcasted on a per-need basis by their consumers.
/// 3. Each region argument is vectorized into a vector.transfer_read (or 0-d
@@ -299,36 +299,21 @@ static AffineMap getTransferReadMap(LinalgOp linalgOp, unsigned argIndex) {
LogicalResult vectorizeAsLinalgGeneric(
OpBuilder &builder, LinalgOp linalgOp, SmallVectorImpl<Value> &newResults,
ArrayRef<CustomVectorizationHook> customVectorizationHooks = {}) {
- // 1. Certain Linalg ops do not have a region but only a region builder.
- // If so, build the region so we can vectorize.
- std::unique_ptr<Region> owningRegion;
- Region *region;
- if (linalgOp->getNumRegions() > 0) {
- region = &linalgOp->getRegion(0);
- } else {
- // RAII avoid remaining in block.
- OpBuilder::InsertionGuard g(builder);
- owningRegion = std::make_unique<Region>();
- region = owningRegion.get();
- Block *block = builder.createBlock(region);
- auto elementTypes = llvm::to_vector<4>(
- llvm::map_range(linalgOp.getShapedOperandTypes(),
- [](ShapedType t) { return t.getElementType(); }));
- block->addArguments(elementTypes);
- linalgOp.getRegionBuilder()(*block, /*captures=*/{});
- }
- Block *block = &region->front();
+ // 1. Fail to vectorize if the operation does not have one non-empty region.
+ if (linalgOp->getNumRegions() != 1 || linalgOp->getRegion(0).empty())
+ return failure();
+ auto &block = linalgOp->getRegion(0).front();
BlockAndValueMapping bvm;
// 2. Values defined above the region can only be broadcast for now. Make them
// map to themselves.
llvm::SetVector<Value> valuesSet;
- mlir::getUsedValuesDefinedAbove(*region, valuesSet);
+ mlir::getUsedValuesDefinedAbove(linalgOp->getRegion(0), valuesSet);
bvm.map(valuesSet.getArrayRef(), valuesSet.getArrayRef());
// 3. Turn all BBArgs into vector.transfer_read / load.
SmallVector<AffineMap> indexings;
- for (auto bbarg : block->getArguments()) {
+ for (auto bbarg : block.getArguments()) {
Value vectorArg = linalgOp.getShapedOperand(bbarg.getArgNumber());
AffineMap map;
VectorType vectorType = extractVectorTypeFromShapedValue(vectorArg);
@@ -360,7 +345,7 @@ LogicalResult vectorizeAsLinalgGeneric(
hooks.push_back(vectorizeYield);
// 5. Iteratively call `vectorizeOneOp` to each op in the slice.
- for (Operation &op : block->getOperations()) {
+ for (Operation &op : block.getOperations()) {
VectorizationResult result = vectorizeOneOp(builder, &op, bvm, hooks);
if (result.status == VectorizationStatus::Failure) {
LLVM_DEBUG(dbgs() << "\n[" DEBUG_TYPE "]: failed to vectorize: " << op);
</cut>