== Progress ==
- Fix EPILOGUE_USES regression in CoreMark
- CBUILD tasks had to be respawned. Zhenqiang sent me links and
instructions. Build tasks are still in the queue.
- Tried with some simple test cases mostly to understand dataflow
and register allocator modules in gcc.
- Remove Unnecessary Zero/Sign Extensions
- Looked in detail at a patch
http://old.nabble.com/new-sign-zero-extension-elimination-pass-to29991676.h…
which does extension elimination on pseudo registers. This patch is
not part of mainline gcc. The discussion raised concerns about
runtime, about using existing passes to handle this instead, and about
the pass not catching all extensions.
- Looked at gcc pass ree which does elimination after register
allocation. As it is, it does not eliminate some of the cases I tried.
Looking into it.
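As an illustration of the kind of extension such a pass targets, here is a
minimal sketch. The functions are hypothetical test cases, not taken from
the patch or from the ree pass testsuite. In both, the widening to 32 bits
cannot change the value, so an explicit extension instruction after the
load or the mask would be redundant.

```c
#include <stdint.h>

/* Hypothetical test case. Each p[i] is in 0..255, so on a target whose
 * byte load already zero-extends (for example ldrb on ARM), a separate
 * zero-extension before the addition is unnecessary. */
uint32_t sum_bytes(const uint8_t *p, uint32_t n)
{
    uint32_t sum = 0;
    for (uint32_t i = 0; i < n; i++)
        sum += p[i];            /* implicit uint8_t -> uint32_t widening */
    return sum;
}

/* Hypothetical test case. The mask already limits the value to 8 bits,
 * so narrowing to uint8_t and widening again on return is a no-op. */
uint32_t low_byte(uint32_t x)
{
    return (uint8_t)(x & 0xffu);
}
```

Compiling such functions with -O2 -S and looking for uxtb/sxtb-style
instructions is one way to see whether an extension survived.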
== Plan for next week ==
- Work on Remove Unnecessary Zero/Sign Extensions and get ready to
discuss this at Connect. It might not be possible to run this with
CBUILD, so I am going to try simple test cases already discussed
and considered by others.
== Progress ==
* February 4.7 release:
- Released a respin of 4.7 2013.02, which fixes an issue with
multiarch on x32 and kfreebsd builds.
* Boehm GC AArch64 support:
- Fixed 128-bit atomic load/store and 'compare and swap' functions
- Testsuite is now OK
* AArch64 porting meeting:
- No new requirements
* Infrastructure:
- Wifi now usable on my laptop
== Next ==
* vacation
* Connect
== Progress ==
* smin-umin: waiting for benchmark results with 'coalesce-vars' patch
reverted on trunk.
* libsanitizer: its backtrace printing facility relies on unwinding
info not present by default in binaries. Adding -funwind-tables
improves the results in GCC testsuite.
There is still an interaction between runtest/qemu and isatty() which
confuses dejagnu. Forcing libsanitizer's internal_isatty to return 0
fixes it. TBC.
* vectorizer cost model: the backport to 4.7 required removing one part,
because 4.7 lacks the new vectorizer infrastructure (arm_add_stmt_cost).
* 'turnoff 64bits ops in Neon': waiting for benchmark results after
backporting on 4.7.
* internal tasks
== Next ==
* holidays next week
* Connect week after
Progress:
* updated various Virtualization category cards in cards.linaro.org
with my comments and clarifications
* rebased qemu-linaro on upstream 1.4.0
* upstream code review: pl330 and others
* prompted by LP:1129571 into another look at the linux-user threading
issues. I dusted off an ancient patch I'd written to address one
part of these, rebased it and rewrote it to work properly.
* KVM/ARM kernel patches are now upstream, so we can submit the QEMU
patches; started on a final rebase, polish and test
Absences:
* NB: I now work a 4 day week, excluding Wednesdays
* 4-8 March: Linaro Connect Asia (Hong Kong)
-- PMM
Hi folks,
Attached is the Linpack benchmark, which I ran with GCC and Clang, with and
without vectorization (though most of the loops are not vectorized).
Reading the output of the LLVM loop vectorizer, it also doesn't do much, but
the net gain is due to the basic-block vectorizer. Does GCC have a similar
concept?
The results are also attached.
cheers,
--renato
The Linaro Toolchain Working Group announces the 2013.02-01 Linaro GCC 4.7
release. This is a respin of the 2013.02 release because of an issue with
multiarch for x32 and kfreebsd builds in the previous one.
Please find the original 2013.02 announcement below.
The Linaro Toolchain Working Group is pleased to announce the 2013.02
release of both Linaro GCC 4.7 and Linaro GCC 4.6.
Linaro GCC 4.7 2013.02 is the eleventh release in the 4.7 series. Based
off the latest GCC 4.7.2+svn195745 release, it includes ARM-focused
performance improvements and bug fixes.
Interesting changes include:
* Updates to GCC 4.7.2+svn195745
* Includes arm/aarch64-4.7-branch up to svn revision 195716
* Support for Cortex-A7 backported from trunk
Linaro GCC 4.6 2013.02 is the 24th release in the 4.6 series. Based
off the latest GCC 4.6.3+svn195744 release, this is the eleventh release
after entering maintenance.
Interesting changes include:
* Updates to 4.6.3+svn195744
The source tarballs are available from:
https://launchpad.net/gcc-linaro/+milestone/4.7-2013.02
https://launchpad.net/gcc-linaro/+milestone/4.6-2013.02
Downloads are available from the Linaro GCC page on Launchpad:
https://launchpad.net/gcc-linaro
More information on the features and issues are available from the
release pages:
https://launchpad.net/gcc-linaro/4.7/4.7-2013.02
https://launchpad.net/gcc-linaro/4.6/4.6-2013.02
Mailing list: http://lists.linaro.org/mailman/listinfo/linaro-toolchain
Bugs: https://bugs.launchpad.net/gcc-linaro/
Questions? https://ask.linaro.org/
On 6 November 2012 02:48, Rob Herring <robherring2(a)gmail.com> wrote:
>
> On 11/05/2012 05:13 AM, Russell King - ARM Linux wrote:
> > On Mon, Nov 05, 2012 at 10:48:50AM +0000, Dave Martin wrote:
> >> On Thu, Oct 25, 2012 at 05:08:16PM +0200, Johannes Stezenbach wrote:
> >>> On Thu, Oct 25, 2012 at 09:25:06AM -0500, Rob Herring wrote:
> >>>> On 10/25/2012 09:16 AM, Johannes Stezenbach wrote:
> >>>>> On Thu, Oct 25, 2012 at 07:41:45AM -0500, Rob Herring wrote:
> >>>>>> On 10/25/2012 04:34 AM, Johannes Stezenbach wrote:
> >>>>>>> On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
> >>>>>>>
> >>>>>>>> While v6 can support unaligned accesses, it is optional and current
> >>>>>>>> compilers won't emit unaligned accesses. So we don't clear the A bit for
> >>>>>>>> v6.
> >>>>>>>
> >>>>>>> not true according to the gcc changes page
> >>>>>>
> >>>>>> What are you going to believe: documentation or what the compiler
> >>>>>> emitted? At least for ubuntu/linaro 4.6.3 which has the unaligned access
> >>>>>> support backported and 4.7.2, unaligned accesses are emitted for v7
> >>>>>> only. I guess default here means it is the default unless you change the
> >>>>>> default in your build of gcc.
> >>>>>
> >>>>> Since ARMv6 can handle unaligned access in the same way as ARMv7
> >>>>> it seems a clear bug in gcc which might hopefully get fixed.
> >>>>> Thus in this case I think it is reasonable to follow the
> >>>>> gcc documentation, otherwise the code would break for ARMv6
> >>>>> when gcc gets fixed.
> >>>>
> >>>> But the compiler can't assume the state of the U bit. I think it is
> >>>> still legal on v6 to not support unaligned accesses, but on v7 it is
> >>>> required. All the standard v6 ARM cores support it, but I'm not sure
> >>>> about custom cores or if there are SOCs with buses that don't support
> >>>> unaligned accesses properly.
> >>>
> >>> Well, I read the "...since Linux version 2.6.28" comment
> >>> in the gcc changes page in the way that they assume the
> >>> U-bit is set. (Although I'm not sure it really is???)
> >>
> >> Actually, the kernel checks the arch version and the U bit on boot,
> >> and chooses the appropriate setting for the A bit depending on the
> >> result. (See arch/arm/mm/alignment.c:alignment_init().)
> >
> > That is in the kernel itself, _after_ the decompressor has run. It is
> > not relevant to any discussion about the decompressor.
> >
> >> Currently, we depend on the CPU reset behaviour or firmware/
> >> bootloader to set the U bit for v6, but the behaviour should be
> >> correct either way, though unaligned accesses will obviously
> >> perform (much) better with U=1.
> >
> > Will someone _PLEASE_ address my initial comments against this patch
> > in light of the fact that it's now been proven _NOT_ to be just a V7
> > issue, rather than everyone seemingly buring their heads in the sand
> > over this.
>
> I tried adding -munaligned-accesses on a v6 build and still get byte
> accesses rather than unaligned word accesses. So this does seem to be a
> v7 only issue based on what gcc will currently produce. Copying Michael
> Hope who can hopefully provide some insight on why v6 unaligned accesses
> are not enabled.
This looks like a bug. Unaligned access is enabled for armv6 but
seems to only take effect for cores with Thumb-2. Here's a test case
both with unaligned field access and unaligned block copy:
struct foo
{
  char a;
  int b;
  struct
  {
    int x[3];
  } c;
} __attribute__((packed));

int get_field(struct foo *p)
{
  return p->b;
}

void copy_block(struct foo *p, struct foo *q)
{
  p->c = q->c;
}
With -march=armv7-a you get the correct:
bar:
ldr r0, [r0, #1] @ unaligned @ 11 unaligned_loadsi/2 [length = 4]
bx lr @ 21 *arm_return [length = 12]
baz:
str r4, [sp, #-4]! @ 25 *push_multi [length = 4]
mov r2, r0 @ 2 *arm_movsi_vfp/1 [length = 4]
ldr r4, [r1, #5]! @ unaligned @ 9 unaligned_loadsi/2 [length = 4]
ldr ip, [r1, #4] @ unaligned @ 10 unaligned_loadsi/2 [length = 4]
ldr r1, [r1, #8] @ unaligned @ 11 unaligned_loadsi/2 [length = 4]
str r4, [r2, #5] @ unaligned @ 12 unaligned_storesi/2 [length = 4]
str ip, [r2, #9] @ unaligned @ 13 unaligned_storesi/2 [length = 4]
str r1, [r2, #13] @ unaligned @ 14 unaligned_storesi/2 [length = 4]
ldmfd sp!, {r4}
bx lr
With -march=armv6 you get a byte-by-byte field access and a correct
unaligned block copy:
bar:
ldrb r1, [r0, #2] @ zero_extendqisi2
ldrb r3, [r0, #1] @ zero_extendqisi2
ldrb r2, [r0, #3] @ zero_extendqisi2
ldrb r0, [r0, #4] @ zero_extendqisi2
orr r3, r3, r1, asl #8
orr r3, r3, r2, asl #16
orr r0, r3, r0, asl #24
bx lr
baz:
str r4, [sp, #-4]!
mov r2, r0
ldr r4, [r1, #5]! @ unaligned
ldr ip, [r1, #4] @ unaligned
ldr r1, [r1, #8] @ unaligned
str r4, [r2, #5] @ unaligned
str ip, [r2, #9] @ unaligned
str r1, [r2, #13] @ unaligned
ldmfd sp!, {r4}
bx lr
readelf -A shows that the compiler planned to use unaligned access in
both. My suspicion is that GCC is using the extv pattern to extract
the field from memory, and that pattern is only enabled for Thumb-2
capable cores.
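For comparison, the two code shapes above can be written out by hand in C.
This is only a sketch under assumptions: the helper names load_le32_bytes
and load_u32_memcpy are hypothetical, int32_t stands in for int, and the
byte-wise version assumes the little-endian order seen in the armv6 output.
The first helper mirrors the ldrb/orr sequence; the second leaves the
compiler free to use a single unaligned word load where the target allows it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout as the test case above: b sits at byte offset 1. */
struct foo {
    char a;
    int32_t b;
    struct { int32_t x[3]; } c;
} __attribute__((packed));

/* Hypothetical helper: byte-by-byte little-endian load, the C equivalent
 * of the four ldrb instructions combined with orr ... asl #8/#16/#24. */
uint32_t load_le32_bytes(const unsigned char *p)
{
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}

/* Hypothetical helper: alignment-safe load in host byte order. The
 * compiler may turn the memcpy into one unaligned ldr if it is allowed
 * to emit unaligned accesses for the selected architecture. */
uint32_t load_u32_memcpy(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

Building both helpers with -march=armv6 and -march=armv7-a and comparing
the generated assembly is one way to check which form a given compiler
picks.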
I've logged PR55218. We'll discuss it at our next meeting.
-- Michael
== Progress ==
* smin-umin: spawned build jobs for gcc-trunk with 'coalesce-vars'
patch reverted (from A.Oliva), so that I can then run benchmarks to
compare its effect with the one observed on gcc-4.7.
* libasan: thanks to Peter, I am able to run sample programs under
qemu. Ran GCC testsuite, observed some failures, to be investigated.
* vectorizer cost model: committed upstream in 4.8.
* Released gcc-linaro-4.6-2013.02 and gcc-linaro-4.7-2013.02.
* backported 'turnoff 64 bits ops in Neon' from upstream to
gcc-linaro-4.7. Waiting for builds to complete to launch benchmark
jobs.
* internal tasks
== Next ==
* smin-umin: look at benchmark results if available
* libasan: analyse testsuite results
* 64-bit ops in Neon: look at benchmark results if available
* get more codec samples
Christophe.
== Progress ==
* February merge 4.6 and 4.7
- Another issue raised by Matthias, fix merged.
- The merge request exposed a new failure on i686 (ld seg. fault)
which looks like the ones we had on i686 before the release,
re-spawned the job to see if we can reproduce.
* Boehm GC AArch64 support:
- Fixed some defects, identified some cases not implementable by
atomic builtins, worked on inline asm versions.
- re-installed working environment to be able to test the gcc
integration (thanks to Matt)
* Aarch64 porting meeting:
- No new requirements.
== Next ==
* Review roster.
* Boehm GC AArch64 support:
- fix libatomic_ops
- validate GCC's Boehm gc integration
* libunwind aarch64 support:
- continue.
== This Week ==
* Got the remote Pandaboard RootFS updates, Thanks to Dave again.
* Got the SSH tunnels going for remote GDB test suite executions, though a
single remote run takes an eternity to complete.
* Analysed the GDB test suite log files and separated out the failures and
other test cases that did not pass, for further analysis.
* Preliminary analysis done on GDB logs for remote configuration on ARM in
comparison with native configuration.
== Next week ==
* Try to complete log file analysis and post results.
* Try to run GDB test suite using QEMU.
* Follow up on Hong Kong visa application
--
*** Sorry about spamming I got the subject wrong in the first email.
== This Week ==
* Tried running GDB test suite with QEMU linux user-mode with not much
success
* Configured Pandaboard and ran GDB test suite in native and
native-gdbserver configurations.
* More of office setup and purchase of some bits and pieces
* Gateway set up and Pandaboard access finally worked after some help from
Dave and Matt.
* Finally purchased Ticket for the Connect and also filled visa application
* Attended Weekly toolchain call
* Still suffering from a bad run of cough n flu, done with antibiotics now.
* Public Holiday on 5th February
== Next week ==
* Continue with Analysis of GDB status on ARM in different configurations
possible.
* SSH tunnels and port forwarding setup to gateway and lab for smooth
access.
* Follow up on Hong Kong visa application
== Progress ==
* Implementing GC sections support in binutils
Wrote a patch to update some AARCH64 GOT and PLT symbol ref counts
while sweeping sections.
During "make ld-check" some tests are not running. Checking the issue.
Discussed with Matt and decided to use the default gc_mark hook. Most
ports handle BFD_RELOC_VTABLE_INHERIT
and BFD_RELOC_VTABLE_ENTRY relocs during gc section marking. Currently
these relocs are not supported on AArch64.
* Analysis on jump threading in GCC
Switch statements are not lowered in GCC until the RTL expand stage.
Looking at a patch that lowers switch statements in GIMPLE to see how jump
threading behaves.
Misc
------
* Attended the toolchain weekly meeting and stand-up call
* Had a look at one x86_64 related GCC build error.
== Next week ==
* Continue Implementing GC sections support in binutils and test the patch.
* Continue analysis on jump threading in GCC. Planning to look at Tree
VRP as per Andrew Pinski's suggestion.
Hello,
I am currently trying to determine how much I can optimize OpenCV using the
ARM's VFP and the Linaro nano image. I have downloaded the
arm-linux-gnueabihf-gcc & arm-linux-gnueabihf-g++ compilers, successfully
cross-compiled OpenCV using those compilers (with -O3 -mfloat-abi=hard
-ftree-vectorize -funroll-loops), and compiled an example OpenCV program
with the arm-linux-gnueabihf-g++ compiler (with the same flags). However,
when I try and run my executable within Linaro (running on a gumstix overo)
I get "/lib/arm-linux-gnueabihf/libm.so.6: version `GLIBC_2.15' not found
(required by /usr/lib/libopencv_core.so.2.2)".
I assume that this is because on my build machine (x86 Ubuntu 10.04) where
I cross compiled the OpenCV libraries I am using libc-2.15, and on the
target (gumstix) the loader can't find libc-2.15. However, within the
gcc-linaro-arm-linux-gnueabihf-4.7 tar ball I downloaded today, there is a
libc-2.15.so file in there that I copied into /lib/ on the gumstix. Still
no success though when I try running my executable.
Any recommendations on things to try? I am hesitant to just put a newer
version of glibc in Linaro as it seems like that could screw a lot of
things up. Should I attempt to re-build OpenCV somehow with an older
version of glibc (namely libc-2.13)?
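One way to confirm the mismatch described above is to ask the target which
glibc it is actually running and compare that with the GLIBC_2.15 symbol
version the error message asks for. The following is only a sketch: the
helper names glibc_at_least and runtime_glibc_version are hypothetical, and
gnu_get_libc_version() is a glibc-specific call, so the code returns NULL
when built against another C library.

```c
#include <stdio.h>
#ifdef __GLIBC__
#include <gnu/libc-version.h>
#endif

/* Hypothetical helper: returns 1 if the version string `have`
 * (for example "2.13") is at least major.minor, 0 if it is older,
 * and -1 if the string cannot be parsed. */
int glibc_at_least(const char *have, int major, int minor)
{
    int hmaj, hmin;
    if (sscanf(have, "%d.%d", &hmaj, &hmin) != 2)
        return -1;
    return hmaj > major || (hmaj == major && hmin >= minor);
}

/* Hypothetical helper: version of the glibc this process is running
 * against, or NULL when the program was not built against glibc. */
const char *runtime_glibc_version(void)
{
#ifdef __GLIBC__
    return gnu_get_libc_version();
#else
    return NULL;
#endif
}
```

Built for the target and run there, glibc_at_least(runtime_glibc_version(),
2, 15) returning 0 would indicate that the installed C library is older
than the one the cross-compiled OpenCV was linked against.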
Also, please let me know if this is the wrong forum for this question and
where I should post it instead.
Thanks,
Derek
== Progress ==
-Spawned CBUILD tasks with and without the patch proposed by Chung-Lin
Tang to test current status of "Fix EPILOGUE_USES regression in
CoreMark". Job is still in the queue.
-Read references listed in gcc/ira.c to understand gcc IRA and went
through the source code.
-Experimented with simple test cases to analyse IRA debug output.
== Plan for next week ==
- Analyse CoreMark results if ready and work on improvements.
- Start with research for Removing unnecessary zero/sign extends when
not working on "Fix EPILOGUE_USES regression in CoreMark".
== Progress ==
* Vectorizer
- Progressing on the global structure alias analysis (discussions, drafts,
failed attempts)
- Interesting Linpack results for vectorizer, investigating performance
- http://www.systemcall.org/blog/2013/02/llvm-vectorizer/
- http://llvm.org/bugs/show_bug.cgi?id=15247
* Buildbots
- Took Chromebook out, since Pandas are fast enough
- Creating LNT test-suite buildbot
- http://llvm.org/viewvc/llvm-project?view=revision&revision=175081
- http://lab.llvm.org:8011/builders/clang-native-arm-lnt
- Still a few rounds until it works properly
* Distributed Builds
- Distributed compilation works somewhat on Pandas
-
http://www.systemcall.org/blog/2013/02/distributed-compilation-on-a-pandabo…
- It's better, cheaper and easier to use Chromebooks / Arndales, though...
* LAVA
- Tracking LLVM git repo in git.linaro.org
- Trying to build from that repo, LAVA doesn't like it... :/
* EuroLLVM
- We're not sponsoring it any more, will still help organize
== Plan ==
* Continue Global Alias Analysis, I seem to be getting close to something
worth sending (and learning a lot in the process)
* Make sure LNT buildbot is up and running with vectorizer (O3)
* Check with Dave Piggot why git.linaro doesn't work with LAVA
* Inspect why some (loop+bb) vectorized results are poor on ARM
Progress:
* rebased qemu-linaro and sorted out with Serge what he requires
in the way of patches for Ubuntu's upcoming code freeze. I plan
to set up and test the 2013.03 release a week or two early both
so that there is a patchset ready for the Ubuntu freeze and also
to avoid clashing with Connect week.
* some patches to clean up QEMU's logging functions so they can
be used more widely in place of ad-hoc printf
* cleanup patchset to get rid of sysbus_add_memory() function
* LP:1079080 -- fixed serious bug in QEMU's Thumb-mode srs insn
* upstream discussion about what to do about QEMU's disassembler
license-clash issues:
http://lists.gnu.org/archive/html/qemu-devel/2013-02/msg02122.html
* NB: I now work a 4 day week, excluding Wednesdays
-- PMM
I downloaded the aarch64 binaries to a ubuntu machine:
wink@ssi-primary:~$ uname -a
Linux ssi-primary 3.5.0-21-generic #32-Ubuntu SMP Tue Dec 11 18:51:59 UTC
2012 x86_64 x86_64 x86_64 GNU/Linux
And when I try to run gcc-4.7.3:
wink@ssi-primary:~$ ls -al
~/aarch64-toolchain/gcc-linaro-aarch64-linux-gnu-4.7+bzr115029-20121015+bzr2506_linux/bin/aarch64-linux-gnu-gcc-4.7.3
-rwxr-xr-x 1 wink wink 553068 Oct 18 14:21
/home/wink/aarch64-toolchain/gcc-linaro-aarch64-linux-gnu-4.7+bzr115029-20121015+bzr2506_linux/bin/aarch64-linux-gnu-gcc-4.7.3
I get a file not found:
wink@ssi-primary:~$ strace
/home/wink/aarch64-toolchain/gcc-linaro-aarch64-linux-gnu-4.7+bzr115029-20121015+bzr2506_linux/bin/aarch64-linux-gnu-gcc-4.7.3
-v
execve("/home/wink/aarch64-toolchain/gcc-linaro-aarch64-linux-gnu-4.7+bzr115029-20121015+bzr2506_linux/bin/aarch64-linux-gnu-gcc-4.7.3",
["/home/wink/aarch64-toolchain/gcc"..., "-v"], [/* 19 vars */]) = -1
ENOENT (No such file or directory)
dup(2) = 3
fcntl(3, F_GETFL) = 0x8002 (flags
O_RDWR|O_LARGEFILE)
fstat(3, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
= 0x7fa688be5000
lseek(3, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
write(3, "strace: exec: No such file or di"..., 40strace: exec: No such
file or directory
) = 40
close(3) = 0
munmap(0x7fa688be5000, 4096) = 0
exit_group(1) = ?
I must have done something real stupid, any help appreciated.
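A possible explanation, which nothing in this message confirms: execve()
also reports ENOENT when the ELF interpreter (dynamic loader) named inside
the binary does not exist, which can happen when a 32-bit executable is run
on a 64-bit host that lacks the 32-bit loader. A minimal sketch for checking
the class of a binary from its first bytes follows; the function name
elf_class_bits is hypothetical, and the caller is assumed to have read the
start of the file into a buffer.

```c
#include <stddef.h>

/* Hypothetical helper: returns 32 or 64 for a buffer that starts with a
 * valid ELF identification, 0 otherwise. `ident` holds the first `n`
 * bytes of the file. Byte 4 (EI_CLASS) is 1 for 32-bit objects and 2 for
 * 64-bit objects. */
int elf_class_bits(const unsigned char *ident, size_t n)
{
    if (n < 5 || ident[0] != 0x7f || ident[1] != 'E' ||
        ident[2] != 'L' || ident[3] != 'F')
        return 0;
    if (ident[4] == 1)
        return 32;              /* ELFCLASS32 */
    if (ident[4] == 2)
        return 64;              /* ELFCLASS64 */
    return 0;
}
```

If the toolchain binary turns out to be 32-bit, the next thing to check
would be whether the loader path it requests is present on the host.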
-- Wink
== Progress ==
* Very short working week.
* Discussions about next steps in Lava integration.
* Followed up on Arndale/GCC build issues.
== Next week ==
* Review pending Backports
* Investigate bootstrap ICE PR56184 further
== Future ==
* Run HOT/COLD partitioning benchmarks
* Analyse ARM results
* On x86_64, to see what actual benefit we could get
* fix-gcc-multiarch-testing
* Come up with strawman proposal for updating testsuite to handle
testing with varying command-line options.
--
Matthew Gretton-Dann
Linaro Toolchain Working Group
matthew.gretton-dann(a)linaro.org
== Progress ==
* 64-bits ops in Neon: upstream accepted the patch for gcc-4.9-stage1
* smin-umin: several benchmarks runs necessary to confirm that the
generic patch I suspected actually caused the regression. However, it
also brings improvements in other benches (office-type).
* looked at libasan: very little configuration seems necessary, but
running a sample program under qemu leads to asan runtime complaints.
* internal tasks
== Next ==
* smin-umin: run benchmarks on trunk to confirm the regression is also present.
* vectorizer cost model: handle Richard's feedback.
* get more codec code samples.
* libasan: understand error messages
Christophe.
== Progress ==
* February merge 4.6 and 4.7
- backport a patch from upstream to resolve conflicts
- this backport introduced some merge issues
- fix pushed and merge request ongoing
* Boehm GC AArch64 support:
- libatomic_ops maintainer reworked the patch to make the usage of
gcc atomic builtins available on the other targets
- tests exhibit some failures
* libunwind aarch64 support
- bug status moved to critical
- configuration and machine description done
- implementation ongoing
* Aarch64 porting meeting:
- Cancelled this week
== Next ==
* February merge 4.6 and 4.7
- log new testsuite failures in launchpad
* Boehm GC AArch64 support:
- fix libatomic_ops
- validate GCC's Boehm gc integration
* libunwind aarch64 support
- continue.
== Progress ==
* Implementing GC sections support in binutils
Looking at aarch64 relocations to consider while sweeping.
* Cbuild experiments to test gcc svn aarch64 4.7 branch.
Connection problems to the cbuild machines are solved now.
Thanks to Matt. Spawned the build.
Misc
------
* Attended the toolchain weekly meeting and stand-up call
* Had a look at one x86_64 related patch that went into GCC trunk.
== Next week ==
* Continue Implementing GC sections support in binutils.
* Start analysis on jump threading in GCC.
== Progress ==
-Attended 1:1 and other team meetings
-Read Linaro wiki and onboarding instructions
-Backported Cortex A7 pipeline description to Linaro 4.7 gcc from fsf
trunk and got it reviewed
== Plan for next week ==
- Look into why CoreMark regresses in Thumb-2 mode when using the LR regnum.
- Start with research for Removing unnecessary zero/sign extends