VirtIO Initiative ([STR-9])
===========================
- various rust-vmm discussions
- [upstream rust-vmm sync meeting]
- how to deal with vhost-device/vm-virtio split: [proposal]
- synced with ARM on their interests
- got update on Fwd: FW: [App-services] Slides from the
hypervisor-less virtio status meeting Message-Id:
<CAHDbmO2G4hUyfxtaxwnbxsrMk+P41zbL-7VNe=Aa6DshxC-5zQ(a)mail.gmail.com>
[STR-9] <https://linaro.atlassian.net/browse/STR-9>
[upstream rust-vmm sync meeting]
<https://etherpad.opendev.org/p/rust-vmm-sync-2021&sa=D&source=calendar&ust=…>
[proposal] <https://github.com/rust-vmm/vhost-device/pull/57>
QEMU Upstream Work ([UM-2])
===========================
- did some bug triage and investigated [555] and [690] which might
intersect with earlier changes I made
- spent time on the PR from hell [PULL 00/30] testing, gdbstub and
semihosting Message-Id:
<20210115130828.23968-1-alex.bennee(a)linaro.org>
[UM-2] <https://linaro.atlassian.net/browse/UM-2>
[555] <https://gitlab.com/qemu-project/qemu/-/issues/555>
[690] <https://gitlab.com/qemu-project/qemu/-/issues/690>
Other
=====
- TSC report preparation for QEMU and Stratos
Completed Reviews [1/1]
=======================
[XEN PATCH v7 00/51] xen: Build system improvements, now with out-of-tree build!
Message-Id: <20210824105038.1257926-1-anthony.perard(a)citrix.com>
Absences
========
,----
| (save-excursion
| (goto-char (point-min))
| (when (re-search-forward "* Absences")
| (goto-char (match-beginning 0))
| (org-export-as 'ascii t nil t )))
`----
Current Review Queue
====================
TODO [PATCH v2 00/48] tcg: optimize redundant sign extensions
Message-Id: <20211007195456.1168070-1-richard.henderson(a)linaro.org>
================================================================================================================================
TODO [PATCH] cpu-models-x86.rst: Tidy up a couple of things
Message-Id: <20211015100718.17828-1-pbonzini(a)redhat.com>
===================================================================================================================
TODO [PATCH 00/16] fdt: Make OF_BOARD a boolean option
Message-Id: <20211013010120.96851-1-sjg(a)chromium.org>
===========================================================================================================
TODO [PATCH v4 00/41] linux-user: Streamline handling of SIGSEGV
Message-Id: <20211006172307.780893-1-richard.henderson(a)linaro.org>
==================================================================================================================================
--
Alex Bennée
Progress
* UM-2 [QEMU upstream maintainership]
+ worked through the big pile of email that had built up while
I was on holiday...
+ some long-delayed sysadmin tasks on my work machines now I have
an opportunity to go into the office and do things that would be
too risky with only remote access
+ triaged a bunch of Coverity issues
* QEMU-406 [QEMU support for MVE (M-profile Vector Extension; Helium)]
+ All work here has now gone upstream; closed!
-- PMM
After llvm commit fbc0c308d599fe3300ab6516650b65b41979446d
Author: Nikita Popov <nikita.ppv(a)gmail.com>
[BasicAA] Handle known bits as ranges
the following benchmarks slowed down by more than 2%:
- 464.h264ref slowed down by 7% from 10899 to 11610 perf samples
- 464.h264ref:libc.so.6 slowed down by 11% from 3538 to 3922 perf samples
Below reproducer instructions can be used to re-build both "first_bad" and "last_good" cross-toolchains used in this bisection. Naturally, the scripts will fail when triggerring benchmarking jobs if you don't have access to Linaro TCWG CI.
For your convenience, we have uploaded tarballs with pre-processed source and assembly files at:
- First_bad save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-…
- Last_good save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-…
- Baseline save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-…
Configuration:
- Benchmark: SPEC CPU2006
- Toolchain: Clang + Glibc + LLVM Linker
- Version: all components were built from their tip of trunk
- Target: aarch64-linux-gnu
- Compiler flags: -O2 -flto
- Hardware: NVidia TX1 4x Cortex-A57
This benchmarking CI is work-in-progress, and we welcome feedback and suggestions at linaro-toolchain(a)lists.linaro.org . In our improvement plans is to add support for SPEC CPU2017 benchmarks and provide "perf report/annotate" data behind these reports.
THIS IS THE END OF INTERESTING STUFF. BELOW ARE LINKS TO BUILDS, REPRODUCTION INSTRUCTIONS, AND THE RAW COMMIT.
This commit has regressed these CI configurations:
- tcwg_bmk_llvm_tx1/llvm-master-aarch64-spec2k6-O2_LTO
First_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-…
Last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-…
Baseline build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-…
Even more details: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-…
Reproduce builds:
<cut>
mkdir investigate-llvm-fbc0c308d599fe3300ab6516650b65b41979446d
cd investigate-llvm-fbc0c308d599fe3300ab6516650b65b41979446d
# Fetch scripts
git clone https://git.linaro.org/toolchain/jenkins-scripts
# Fetch manifests and test.sh script
mkdir -p artifacts/manifests
curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail
curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail
curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail
chmod +x artifacts/test.sh
# Reproduce the baseline build (build all pre-requisites)
./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh
# Save baseline build state (which is then restored in artifacts/test.sh)
mkdir -p ./bisect
rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/
cd llvm
# Reproduce first_bad build
git checkout --detach fbc0c308d599fe3300ab6516650b65b41979446d
../artifacts/test.sh
# Reproduce last_good build
git checkout --detach 30a3652b6ade43504087f6e3acd8dc879055f501
../artifacts/test.sh
cd ..
</cut>
Full commit (up to 1000 lines):
<cut>
commit fbc0c308d599fe3300ab6516650b65b41979446d
Author: Nikita Popov <nikita.ppv(a)gmail.com>
Date: Mon Oct 25 15:47:21 2021 +0200
[BasicAA] Handle known bits as ranges
BasicAA currently tries to determine that the offset is positive by
checking whether all variable indices are positive based on known
bits, multiplied by a positive scale. However, this is incorrect
if the scale multiplication might overflow. In the modified test
case the original value is positive, but may be negative after a
left shift.
Fix this by converting known bits into a constant range and reusing
the range-based logic, which handles overflow correctly.
Differential Revision: https://reviews.llvm.org/D112611
---
llvm/lib/Analysis/BasicAliasAnalysis.cpp | 51 +++++-----------------
.../test/Analysis/BasicAA/assume-index-positive.ll | 4 +-
2 files changed, 12 insertions(+), 43 deletions(-)
diff --git a/llvm/lib/Analysis/BasicAliasAnalysis.cpp b/llvm/lib/Analysis/BasicAliasAnalysis.cpp
index 0305732ca5d5..8cf947c43bf4 100644
--- a/llvm/lib/Analysis/BasicAliasAnalysis.cpp
+++ b/llvm/lib/Analysis/BasicAliasAnalysis.cpp
@@ -318,15 +318,6 @@ struct CastedValue {
return N;
}
- KnownBits evaluateWith(KnownBits N) const {
- assert(N.getBitWidth() == V->getType()->getPrimitiveSizeInBits() &&
- "Incompatible bit width");
- if (TruncBits) N = N.trunc(N.getBitWidth() - TruncBits);
- if (SExtBits) N = N.sext(N.getBitWidth() + SExtBits);
- if (ZExtBits) N = N.zext(N.getBitWidth() + ZExtBits);
- return N;
- }
-
ConstantRange evaluateWith(ConstantRange N) const {
assert(N.getBitWidth() == V->getType()->getPrimitiveSizeInBits() &&
"Incompatible bit width");
@@ -1250,8 +1241,6 @@ AliasResult BasicAAResult::aliasGEP(
if (!DecompGEP1.VarIndices.empty()) {
APInt GCD;
- bool AllNonNegative = DecompGEP1.Offset.isNonNegative();
- bool AllNonPositive = DecompGEP1.Offset.isNonPositive();
ConstantRange OffsetRange = ConstantRange(DecompGEP1.Offset);
for (unsigned i = 0, e = DecompGEP1.VarIndices.size(); i != e; ++i) {
const VariableGEPIndex &Index = DecompGEP1.VarIndices[i];
@@ -1266,24 +1255,19 @@ AliasResult BasicAAResult::aliasGEP(
else
GCD = APIntOps::GreatestCommonDivisor(GCD, ScaleForGCD.abs());
- if (AllNonNegative || AllNonPositive) {
- KnownBits Known = Index.Val.evaluateWith(
- computeKnownBits(Index.Val.V, DL, 0, &AC, Index.CxtI, DT));
- bool SignKnownZero = Known.isNonNegative();
- bool SignKnownOne = Known.isNegative();
- AllNonNegative &= (SignKnownZero && Scale.isNonNegative()) ||
- (SignKnownOne && Scale.isNonPositive());
- AllNonPositive &= (SignKnownZero && Scale.isNonPositive()) ||
- (SignKnownOne && Scale.isNonNegative());
- }
+ ConstantRange CR =
+ computeConstantRange(Index.Val.V, true, &AC, Index.CxtI);
+ KnownBits Known =
+ computeKnownBits(Index.Val.V, DL, 0, &AC, Index.CxtI, DT);
+ CR = CR.intersectWith(
+ ConstantRange::fromKnownBits(Known, /* Signed */ true),
+ ConstantRange::Signed);
assert(OffsetRange.getBitWidth() == Scale.getBitWidth() &&
"Bit widths are normalized to MaxPointerSize");
- OffsetRange = OffsetRange.add(Index.Val
- .evaluateWith(computeConstantRange(
- Index.Val.V, true, &AC, Index.CxtI))
- .sextOrTrunc(OffsetRange.getBitWidth())
- .smul_fast(ConstantRange(Scale)));
+ OffsetRange = OffsetRange.add(
+ Index.Val.evaluateWith(CR).sextOrTrunc(OffsetRange.getBitWidth())
+ .smul_fast(ConstantRange(Scale)));
}
// We now have accesses at two offsets from the same base:
@@ -1300,21 +1284,6 @@ AliasResult BasicAAResult::aliasGEP(
(GCD - ModOffset).uge(V1Size.getValue()))
return AliasResult::NoAlias;
- // If we know all the variables are non-negative, then the total offset is
- // also non-negative and >= DecompGEP1.Offset. We have the following layout:
- // [0, V2Size) ... [TotalOffset, TotalOffer+V1Size]
- // If DecompGEP1.Offset >= V2Size, the accesses don't alias.
- if (AllNonNegative && V2Size.hasValue() &&
- DecompGEP1.Offset.uge(V2Size.getValue()))
- return AliasResult::NoAlias;
- // Similarly, if the variables are non-positive, then the total offset is
- // also non-positive and <= DecompGEP1.Offset. We have the following layout:
- // [TotalOffset, TotalOffset+V1Size) ... [0, V2Size)
- // If -DecompGEP1.Offset >= V1Size, the accesses don't alias.
- if (AllNonPositive && V1Size.hasValue() &&
- (-DecompGEP1.Offset).uge(V1Size.getValue()))
- return AliasResult::NoAlias;
-
if (V1Size.hasValue() && V2Size.hasValue()) {
// Compute ranges of potentially accessed bytes for both accesses. If the
// interseciton is empty, there can be no overlap.
diff --git a/llvm/test/Analysis/BasicAA/assume-index-positive.ll b/llvm/test/Analysis/BasicAA/assume-index-positive.ll
index 451592067f4b..a53fff2c6009 100644
--- a/llvm/test/Analysis/BasicAA/assume-index-positive.ll
+++ b/llvm/test/Analysis/BasicAA/assume-index-positive.ll
@@ -130,12 +130,12 @@ define void @symmetry([0 x i8]* %ptr, i32 %a, i32 %b, i32 %c) {
ret void
}
-; TODO: %ptr.neg and %ptr.shl may alias, as the shl renders the previously
+; %ptr.neg and %ptr.shl may alias, as the shl renders the previously
; non-negative value potentially negative.
define void @shl_of_non_negative(i8* %ptr, i64 %a) {
; CHECK-LABEL: Function: shl_of_non_negative
; CHECK: NoAlias: i8* %ptr.a, i8* %ptr.neg
-; CHECK: NoAlias: i8* %ptr.neg, i8* %ptr.shl
+; CHECK: MayAlias: i8* %ptr.neg, i8* %ptr.shl
%a.cmp = icmp sge i64 %a, 0
call void @llvm.assume(i1 %a.cmp)
%ptr.neg = getelementptr i8, i8* %ptr, i64 -2
</cut>
After llvm commit adf55ac6657693f7bfbe3087b599b4031a765a44
Author: Lang Hames <lhames(a)gmail.com>
[ORC] Call ExecutorProcessControl::disconnect in unit tests that require it.
the following hot functions slowed down by more than 10% (but their benchmarks slowed down by less than 2%):
- 400.perlbench:[.] S_find_byclass slowed down by 12% from 644 to 721 perf samples
Below reproducer instructions can be used to re-build both "first_bad" and "last_good" cross-toolchains used in this bisection. Naturally, the scripts will fail when triggerring benchmarking jobs if you don't have access to Linaro TCWG CI.
For your convenience, we have uploaded tarballs with pre-processed source and assembly files at:
- First_bad save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-…
- Last_good save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-…
- Baseline save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-…
Configuration:
- Benchmark: SPEC CPU2006
- Toolchain: Clang + Glibc + LLVM Linker
- Version: all components were built from their tip of trunk
- Target: aarch64-linux-gnu
- Compiler flags: -O2
- Hardware: NVidia TX1 4x Cortex-A57
This benchmarking CI is work-in-progress, and we welcome feedback and suggestions at linaro-toolchain(a)lists.linaro.org . In our improvement plans is to add support for SPEC CPU2017 benchmarks and provide "perf report/annotate" data behind these reports.
THIS IS THE END OF INTERESTING STUFF. BELOW ARE LINKS TO BUILDS, REPRODUCTION INSTRUCTIONS, AND THE RAW COMMIT.
This commit has regressed these CI configurations:
- tcwg_bmk_llvm_tx1/llvm-master-aarch64-spec2k6-O2
First_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-…
Last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-…
Baseline build: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-…
Even more details: https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-…
Reproduce builds:
<cut>
mkdir investigate-llvm-adf55ac6657693f7bfbe3087b599b4031a765a44
cd investigate-llvm-adf55ac6657693f7bfbe3087b599b4031a765a44
# Fetch scripts
git clone https://git.linaro.org/toolchain/jenkins-scripts
# Fetch manifests and test.sh script
mkdir -p artifacts/manifests
curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail
curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail
curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tx1-llvm-master-… --fail
chmod +x artifacts/test.sh
# Reproduce the baseline build (build all pre-requisites)
./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh
# Save baseline build state (which is then restored in artifacts/test.sh)
mkdir -p ./bisect
rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/
cd llvm
# Reproduce first_bad build
git checkout --detach adf55ac6657693f7bfbe3087b599b4031a765a44
../artifacts/test.sh
# Reproduce last_good build
git checkout --detach f526ee5b8517b60620cd03bb3e5945ed69d6bfaa
../artifacts/test.sh
cd ..
</cut>
Full commit (up to 1000 lines):
<cut>
commit adf55ac6657693f7bfbe3087b599b4031a765a44
Author: Lang Hames <lhames(a)gmail.com>
Date: Tue Oct 12 14:55:49 2021 -0700
[ORC] Call ExecutorProcessControl::disconnect in unit tests that require it.
Another follow-up to 2815ed57e3c and 19b4e3cfc6a. For unit tests that don't use
an ExecutionSession we need to call ExecutorProcessControl::disconnect directly
to wait for the dispatcher to shut down.
https://llvm.org/PR52153
---
.../ExecutionEngine/Orc/EPCGenericJITLinkMemoryManagerTest.cpp | 2 ++
llvm/unittests/ExecutionEngine/Orc/EPCGenericMemoryAccessTest.cpp | 2 ++
2 files changed, 4 insertions(+)
diff --git a/llvm/unittests/ExecutionEngine/Orc/EPCGenericJITLinkMemoryManagerTest.cpp b/llvm/unittests/ExecutionEngine/Orc/EPCGenericJITLinkMemoryManagerTest.cpp
index f2b157e424b6..a95435aec2a3 100644
--- a/llvm/unittests/ExecutionEngine/Orc/EPCGenericJITLinkMemoryManagerTest.cpp
+++ b/llvm/unittests/ExecutionEngine/Orc/EPCGenericJITLinkMemoryManagerTest.cpp
@@ -134,6 +134,8 @@ TEST(EPCGenericJITLinkMemoryManagerTest, AllocFinalizeFree) {
auto Err2 = MemMgr->deallocate(std::move(*FA));
EXPECT_THAT_ERROR(std::move(Err2), Succeeded());
+
+ cantFail(SelfEPC->disconnect());
}
} // namespace
diff --git a/llvm/unittests/ExecutionEngine/Orc/EPCGenericMemoryAccessTest.cpp b/llvm/unittests/ExecutionEngine/Orc/EPCGenericMemoryAccessTest.cpp
index 78024644ca8b..beb0fefa094a 100644
--- a/llvm/unittests/ExecutionEngine/Orc/EPCGenericMemoryAccessTest.cpp
+++ b/llvm/unittests/ExecutionEngine/Orc/EPCGenericMemoryAccessTest.cpp
@@ -93,6 +93,8 @@ TEST(EPCGenericMemoryAccessTest, MemWrites) {
{{pointerToJITTargetAddress(&Test_Buffer), TestMsg}});
EXPECT_THAT_ERROR(std::move(Err5), Succeeded());
EXPECT_EQ(StringRef(Test_Buffer, TestMsg.size()), TestMsg);
+
+ cantFail(SelfEPC->disconnect());
}
} // namespace
</cut>
== This Week ==
* GCC
- Committed a clean up patch to gimple-isel
- PR93183: Committed fix
- PR102376: Patch approved upstream
- PR83750: Patch approved upstream but it regresses one test-case.
== Next Week ==
- Continue with ongoing tasks