== Progress ==
o BKK16 remote (5/10)
* Followed TCWG sessions
* Extended validation:
- worked with Kugan
- implemented job for native validation
o GCC dev. (4/10)
* Remote validation sanitizing:
- iterate on the output pattern fix
- testing a fix for stderr/stdin ordering issue
* Gave some support on __sync builtins, preparing a fix for armv8.1
o Misc (1/10)
* Various meetings
== Plan ==
o GCC 5 branch merge, and 2016.03 snapshot
o Continue on-going tasks
Port to microinstance - TCWG-432 [17/10]
* Investigating difference between LAVA and 'desktop Juno' runtimes
** Some of this was down to piles of /dev/console output - redirecting
to file improved SPEC build time by 75%!
** Some cases make sense, others remain unexplained
** Might just go away if we update the Juno image
* Wrote up how to do benchmarking for minimal-trust cases
** Needs both lab and development work
* Merged another large tranche of changes back to benchmarking branch
** Microinstance more or less functional, main instance benchmarking
seems unbroken
** But some more tweaks to make as Lab work happens
* Prepared backport benchmarking for merge
Misc [3/10]
=Plan=
* Submit backport benchmarking for review
* Tweak uinstance in reaction to lab work
* Implement the non-lab side of minimal-trust benchmarking
* Return to looking at Juno image generation
* Look some more at LAVA/desktop runtime differences
* Implement small improvements, if time
== This week ==
* Bugzilla 69663 - [ARM] Implement overflow arithmetic standard names (3/10)
- Resolved thumb2 failures
- Negdi2 was not generating instruction to set condition codes
* Bugzilla 70008 - [ARM] Reverse subtract with carry can be generated in
thumb2 mode (1/10)
- Created new patch using new predicate that matches arm and thumb2
constraints
- Received approval to GCC 7 stage 1
* Bugzilla 70014 - [ARM] Predicate does not match constraint
(*subsi3_carryin_const) (1/10)
- Fix checked into trunk
* Linaro Connect meetings (5/10)
== Next week ==
* Bugzilla 69663 - [ARM] Implement overflow arithmetic standard names
- Create compile only test cases and re-run validation testing
- Post new patch upstream
Hey,
Regarding the GCC ABI 5 issue, I was wondering what's the policy
behind updating packages on stable updates for both Debian and Ubuntu.
Our time frame is a bit constrained, and we definitely will have to
take some hard decisions in the next six months, so I'd like to
understand everything that is at stake before I have my own opinion.
LLVM has a 6 month major cycle, releasing around February / August.
Major releases are allowed to break the ABI. Major breakages need one
release warning period.
Ubuntu has a 6 month release cycle, around April / October. IIUC,
major releases are allowed to have new versions of packages, but
updates for the next few years have to keep within the same major
release.
Debian has a -1 years release cycle (heh), and has the same major /
minor policy, which makes it a lot harder to update major versions.
However, I believe unstable is still not closed, nor will be in August
this year, so updating to LLVM 3.9 will not be a problem, but it will
mean users will have to wait a bit more to get a working LLVM.
The time frame is then:
3.8.0 released March (without the fix)
Ubuntu X released April
3.9.0 releases August (hopefully with a fix)
Ubuntu X+1 released October
Debian freezes ??
LLVM 3.8.1 ??
If we don't back-port GCC ABI 5 into 3.8.1, Ubuntu users will not have
the fix ever, unless you *can* update to 3.9.0 in August.
Ubuntu X+1 will be fine using 3.9, as will Debian after August, unless
you guys freeze before that.
I believe both Debian and Ubuntu have a trunk-based LLVM package for
experimental use only, and it would be bad, but not completely broken,
to recommend users to use that meanwhile.
If Debian freezes *before* 3.9.0 is out, or if Ubuntu can't update to
3.9.0 on April's release, then we'll have a strong reason to back-port
the change to 3.8.x. If not, even though it will be uncomfortable for
users until August, the argument is not that strong and will be hard
to get it through.
Any comments? Ideas? Does any of that make sense?
cheers,
--renato
Hi,
I have been comparing the stock gcc 5.2 and the Linaro 5.2 (Linaro GCC
5.2-2015.11-1) and have noticed a difference with the __sync
intrinsics.
Here is the simple test case
--- cut here ---
int add_int(int add_value, int *dest)
{
return __sync_add_and_fetch(dest, add_value);
}
--- cut here ---
Compiling with the stock gcc 5.2 (-S -O3) I get
---------
add_int:
.L2:
ldaxr w2, [x1]
add w2, w2, w0
stlxr w3, w2, [x1]
cbnz w3, .L2
mov w0, w2
ret
---------
Wheras with Linaro gcc 5.2 I get
---------
add_int:
.L2:
ldxr w2, [x1]
add w2, w2, w0
stlxr w3, w2, [x1]
cbnz w3, .L2
dmb ish
mov w0, w2
ret
---------
Why the extra (unnecessary?) memory barrier?
Also, is it worthwhile putting a prfm before the ldaxr. EG
add_int:
prfm pst1strm, [x1]
.L2:
ldaxr w2, [x1]
See the following thread
http://lists.infradead.org/pipermail/linux-arm-kernel/2015-July/355996.html
All the best,
Ed
== Progress ==
o GCC dev. (7/10)
* Remote validation sanitizing:
- fixed last issues in dejagnu patch and submitted it uptsream
- 2 more cleanup/fix dejagnu patches submitted and merged upstream
- proposed a fix/workaround for the output pattern issues (>400
failures removed with this patch)
o Misc (3/10)
* Various meetings
* internal discussions
== Plan ==
o Try to follow connect remotely
o Extended validation work
== Progress ==
* GCC bugs:
- #2073 tried to reproduce it with a manually-built toolchain. No luck
* GCC validation:
- added support to choose simulated cpu (different from --with-cpu)
* GCC:
- completing Neon intrinsics tests, to prepare cleanup
* Validation:
- small improvements
* Misc (conf calls, meetings, emails, ...)
== Next ==
Remote Connect
== Progress ==
* Support (5/10)
- Working on PR17193
- Continue review on D17141
* Background (5/10)
- Code review, meetings, discussions, general support, etc.
- Connect preparations
- GCC ABI 5 discussions
- Assessing Swift calling convention impact ARM back-end
- Interviews
# Progress #
* TCWG-545, Handle "branch-to-self" instruction in single stepping.
[5/10] Patches are posted upstream for review.
* TCWG-532, one patch is committed and one patch is posted for review.
[2/10]
* Tweak ARM process record. [2/10]
Two patches are pushed in. Many test fails are fixed.
* FSF patches review. [1/10].
# Plan #
* Linaro Connect.
--
Yao
Hi,
I have just switched to gcc 5.2 from 4.9.2 and the code quality does seem to have improved significantly. For example, it now seems much better at using ldp/stp and it seems to has stopped gratuitous use of the SIMD registers.
However, I still have a few whinges:-)
See attached copy.c / copy.s (This is a performance critical function from OpenJDK)
pd_disjoint_words:
cmp x2, 8 <<< (1)
sub sp, sp, #64 <<< (2)
bhi .L2
cmp w2, 8 <<< (1)
bls .L15
.L2:
add sp, sp, 64 <<< (2)
(1) If count as a 64 bit unsigned is <= 8 then it is probably still <= 8 as a 32 bit unsigned.
(2) Nowhere in the function does it store anything on the stack, so why
drop and restore the stack every time. Also, minor quibble in the
disass, why does sub use #64 whereas add uses just '64' (appreciate this
is probably binutils, not gcc).
.L15:
adrp x3, .L4
add x3, x3, :lo12:.L4
ldrb w2, [x3,w2,uxtw] <<< (3)
adr x3, .Lrtx4
add x2, x3, w2, sxtb #2
br x2
(3) Why use a byte table, this is not some sort of embedded system. Use
a word table and this becomes.
.L15:
adrp x3, .L4
add x3, x3, :lo12:.L4
ldr x2, [x3, x2, lsl #3]
br x2
An aligned word load takes exactly the same time as a byte load and we
save the faffing about calculating the address.
.L10:
ldp x6, x7, [x0]
ldp x4, x5, [x0, 16]
ldp x2, x3, [x0, 32] <<< (4)
stp x2, x3, [x1, 32] <<< (4)
stp x6, x7, [x1]
stp x4, x5, [x1, 16]
(4) Seems to be something wrong with the load scheduler here? Why not
move the stp x2, x3 to the end. It does this repeatedly.
Unfortunately as this function is performance critical it means I will
probably end up doing it in inline assembler which is time consuming,
error prone and non portable.
* Whinge mode off
Ed
== Progress ==
o GCC dev. (7/10)
* Remote validation sanitizing:
- Implemented and tested a pure dejagnu fix (the actual
implementation works fine for GCC but might be an issue in a different
context, a cleaner fix almost done)
- Found a latent issue in GCC profiling test harness
* ARM and AArch64 backends LRA cleanup:
- Looked at the remaining artifacts, will prepare a patch for GCC 7
o Misc (3/10)
* Various meetings
* internal discussions
== Plan ==
o Finalize and submit dejagnu fix
Port to microinstance - TCWG-432 [7/10]
* Merged last few months of development back to benchmarking branch
* Restored support for multiple targets per builder
* Updated builder landed, altered jobs to work with it
** Removed assumption that host filesystem is non-persistent
** Stacked up test runs for the weekend
Transfer secret management to LAVA [1/10]
* LAVA jobs now use a within-LAVA key to access sources
Misc [2/10]
* Unsuccessful fiddling with heat-monitoring tools on Juno
* Usual background of mail and meetings
=Plan=
* Fallout from weekend test runs
** Some failure is going on, need to investigate
* Update docs and Jenkins configs w.r.t. last week's activity
* Further investigation on a couple of LAVA issues that are causing me pain
** Un-deserializable bundles
** Inaccessible image reports
* Continue assessing target stability/looking at inconsistent results
== This week ==
* Bugzilla 69663 - [ARM] Implement overflow arithmetic standard names (6/10)
- Tested and posted SImode and DImode patch upstream
- Feedback recommended supporting thumb2 in addition to arm architectures
- Patch to support thumb2 fails on all thumb architectures;
investigating failures
* Bugzilla 70008 - [ARM] Reverse subtract with carry can be generated in
thumb2 mode (2/10)
- Created new bug, developed and successfully tested patch
- Fix posted upstream
* Bugzilla 70014 - [ARM] Predicate does not match constraint
(*subsi3_carryin_const) (1/10)
- Created new bug and patch
* Misc (1/10)
== Next week ==
* Bugzilla 69663 - Cleanup by merging patterns using mode iterators,
submit upstream
* Bugzilla 70008 - Respond to upstream comments as appropriate
* Bugzilla 70014 - Post patch and respond to upstream comments
* Travel to Linaro Connect beginning March 3rd
== This Week ==
* LTO (6/10)
- TCWG 528:
a) reduced test-case for the case when decl node gets visited multiple times
b) updated patch not to walk artificial record decls (typeinfo
objects) as per Richard's suggestion.
submitted upstream, waiting for review.
- benchmarking: Aarch64 SPEC2006-int benchmarks complete
- looked at pr57703
- Slides
* setting up perf on chromebook (2/10)
- perf doc
- got perf running on chromebook by manually building it and set of
(clumsy) workarounds.
- perf annotate shows no output and perf stat shows "not supported" for almost
all entires except "page faults"
- will give a try to dual boot chrubuntu on chromebook
* half-day sick leave (1/10)
- doctor's appointment for eye inflammation
* Misc (1/10)
- Meetings
== Next Week ==
- LTO
- tcwg-310
- look at jenkins tutorial in collaborate wiki
== Progress ==
* Support (4/10)
- Updating patch D17141 for Darwin, resubmitting, discussions.
- Understanding PR21778, may need changes to SLP
- Benchmarking some scheduler choices for A17
* Release (1/10)
- 3.8.0 RC3 validation
* Background (5/10)
- Code review, meetings, discussions, general support, etc.
- Sifting through CVs, interviews, etc.
# Progress #
* Support range stepping on arm-linux. TCWG-518. [5/10]
Post patch series about "the thread is stepping over breakpoint but
it spawns child thread". The fix is OK but the test case changes are
being reviewed.
The more I test my range stepping patches, the more existing bugs I
find. Looking at the bug "software single step the instruction
branch to self."
* AArch64 linux syscall for record/replay. TCWG-532. [1/10]
Patch is out for review.
* Fix some ARM reverse debugging bugs. TCWG-183. [1/10]
Patch is pushed in. The original implementation wasn't carefully
reviewed, so I am sure there are bugs somewhere else.
* Patch review on arm tracepoint support. [1/10]
One patch is approved but I insist that another patch should be done
in generic part instead of ARM specific part, but the author wants do
it in ARM specific part because he things it is simpler.
* Misc [2/10]
** Go through the Linux kernel awareness GDB patches quickly, the first
reaction is "split your patch, please".
** Go to London to collect my passport.
# Plan #
* Support range stepping on arm-linux. TCWG-518.
* TCWG-167, TCWG-532.
* Prepare for the Linaro Connect travel.
--
Yao
Hi All,
Does linaro distributes arm-gcc as a pre-built static tool chain
distribution? If yes, where can i download them from. Please point me some
location from where i can download.
--
Thanks & Regards,
M.Srikanth Kumar.
Bug with compiler flag handling - (no ticket) [2/10]
* Coremark-Pro was ignoring compiler flags
* Fixed that, made flag handling consistent across all benchmarks
Release benchmarking via Jenkins - TCWG-348 [1/10]
* Seems to work with test workload
Port to microinstance - TCG-432 [3/10]
* Looked at some inconsistent results
* Worried that one of the Junos may be sick, but unproven for now
Backport benchmarking via Jenkins - TCWG-352 [1/10]
* Finished 'general benchmarking' job
* Switched backport job to build/test cross-compilers
* Recent backport results bundles are corrupted, unable to work out why
Document benchmarking infrastructure - TCWG-496 [1/10]
* Documented Jenkins interface
Misc [2/10]
=Plan=
If updated builder becomes available, convert uinstance job to use it
Continue assessing target stability/looking at inconsistent results
Rework LAVA scripts to permit multiple targets per builder
Hi all,
I download the pre-built toolchain for one of our armv6 board.
https://releases.linaro.org/14.04/components/toolchain/binaries/gcc-linaro-…
After plug it into Yocto as an external toolchain, it failed to install it
correctly.
../meta-linaro/meta-linaro-toolchain/recipes-devtools/external-linaro-toolchain/
external-linaro-toolchain.bb, do_install
| DEBUG: Executing shell function do_install
| cp: cannot stat
`/opt/gcc-linaro-arm-none-eabi-4.8-2014.04_linux/arm-none-eabi/libc/lib/*':
No such file or directory
Any suggestion?
Thanks,
Joel