== Progress ==
* ARM Process Record: made the changes pointed out in the maintainers'
comments. [3/10] [TCWG-336] [TCWG-338] [TCWG-339]
- Put the new bug-fix patches on hold as they can be sent in the
same series. [TCWG-317] [TCWG-315]
* Time off for setting up new office space. [4/10]
- Found a good room for an office in a central location; need to get
internet and other stuff working before shifting. I will get started
on that once I get possession of the space next week.
* Time off for Macau visa information [1/10]
* 25th December Public Holiday [2/10]
== Plan ==
* ARM Process Record:
- Create new patches as per the maintainers' suggestions, incorporating bug
fixes as separate patches in the list.
- Test and send all patches upstream.
* New office space setup, will be away from my desk for a few hours.
Hi,
seen in a segfault running the tests in the coinor-osi package,
https://launchpad.net/bugs/1263576, both in saucy and trusty, version 0.106.4
and 0.106.5. Version 0.103 doesn't show the issue.
both the 4.7 and 4.8 linaro branches show this behaviour, and trunk 20131121
(didn't build a newer one yet).
William Grant tracked that down to a bug with very negative vcall_offsets in
aarch64 multiple inheritance thunks. The example below has two consecutive
thunks, with the second adding 263 instead of subtracting 264.
aarch64_build_constant seems to not handle negative integers. He tried a quick
gcc patch to avoid using aarch64_build_constant, and the coinor-osi tests succeed.
0000000000401ca4 <_ZTv0_n256_N1C2adEv>:
401ca4: f9400010 ldr x16, [x0]
401ca8: f8500211 ldr x17, [x16,#-256]
401cac: 8b110000 add x0, x0, x17
401cb0: 17fffff9 b 401c94 <_ZN1C2adEv>
[...]
0000000000401cc4 <_ZTv0_n264_N1C2aeEv>:
401cc4: f9400010 ldr x16, [x0]
401cc8: d28020f1 mov x17, #0x107 // #263
401ccc: f8716a11 ldr x17, [x16,x17]
401cd0: 8b110000 add x0, x0, x17
401cd4: 17fffff8 b 401cb4 <_ZN1C2aeEv>
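The arithmetic behind the wrong thunk can be checked directly. The sketch below is not GCC code, and the helper names are made up for illustration. It shows that the requested offset -264 and the emitted immediate 263 (0x107) are bitwise complements in 64-bit two's complement, which is consistent with, though not proof of, a constant builder that lost the inversion for a negative value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the 64-bit two's-complement bit pattern of an offset. */
static uint64_t bit_pattern(int64_t offset)
{
    return (uint64_t)offset;
}

/* Hypothetical helper: true when `emitted` is the bitwise complement of the
   bit pattern of `wanted`, i.e. the immediate an inverted move would need. */
static int is_complement_of(int64_t wanted, uint64_t emitted)
{
    return bit_pattern(wanted) == ~emitted;
}

/* Usage: the values from the second thunk in the disassembly above. */
static void check_second_thunk(void)
{
    assert(is_complement_of(-264, 0x107));
}
```

Loading 0x107 without inverting it makes the thunk read from x16 + 263 rather than x16 - 264, matching the description above.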
Any chance for a quick 2013 review?
Thanks, Matthias
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -2540,8 +2540,8 @@
addr = plus_constant (Pmode, temp0, vcall_offset);
else
{
- aarch64_build_constant (IP1_REGNUM, vcall_offset);
- addr = gen_rtx_PLUS (Pmode, temp0, temp1);
+ aarch64_add_constant (IP0_REGNUM, IP1_REGNUM, vcall_offset);
+ addr = temp0;
}
aarch64_emit_move (temp1, gen_rtx_MEM (Pmode,addr));
== Progress ==
* Lots of little changes to the Jenkins configure files and
Cbuildv2 so Jenkins can do builds in a 32-bit chroot. (4/10)
* Misc - meetings and misc tasks (2/10).
* Helped with binutils & GCC releases. (4/10)
* New Arndale Octa arrived.
== Plan ==
* Get back to adding the neon intrinsics tests to the GCC
testsuite.
* More Jenkins hacking for 32bit chroot builds.
* More hacking on binary release automation.
* Review Kugan's benchmarking support in Cbuildv2.
== Issues ==
* Losing ssh access to many of the machines we work on is a
problem...
== Leave ==
* A bit complicated. On the road till Jan 13, offline some, mixed with
some work days depending on the weather. Should have email
access most nights.
Hi All,
I am getting a weird error while building binutils from
http://cbuild.validation.linaro.org/snapshots/Latest/binutils-linaro-2.23.2…
GCC version gcc version 4.8.1 (Ubuntu/Linaro 4.8.1-10ubuntu9)
Ubuntu 13.10
Steps:
1)
/work/sources/binutils/configure --target=aarch64-unknown-linux-gnu
--prefix=/work/builds/gcc-fsf-trunk/tools
--with-sysroot=/work/builds/gcc-fsf-trunk/sysroot-aarch64-unknown-linux-gnu
2)
make
(Snip)
gcc -c -DHAVE_CONFIG_H -g -O2 -I.
-I/work/sources/binutils/libiberty/../include -W -Wall
-Wwrite-strings -Wc++-compat -Wstrict-prototypes -pedantic
/work/sources/binutils/libiberty/regex.c -o regex.o
In file included from /work/sources/binutils/libiberty/regex.c:128:0:
/usr/include/stdlib.h:510:35: error: expected ‘,’ or ‘;’ before
‘__attribute_alloc_size__’
__THROW __attribute_malloc__ __attribute_alloc_size__ ((2)) __wur;
^
make[2]: *** [regex.o] Error 1
make[2]: Leaving directory
`/work/builds/gcc-fsf-trunk/obj-aarch64-unknown-linux-gnu/binutils/libiberty'
(Snip)
Can anyone point out what is going wrong here?
regards,
Venkat
### About Linaro binutils
Linaro binutils is a release of the GNU binutils with bug fixes and
enhancements for ARM platforms. GNU binutils is a collection of tools
including the ld linker and as assembler.
### Linaro binutils 2.24 2013.12
The Linaro Toolchain Working Group is pleased to announce the 2013.12
release of Linaro binutils 2.24.
This release is based on the latest GNU binutils 2.24 stable branch, but
with additional features and bug fixes.
### Additional Features
* Support for GNU indirect functions
### Bug Fixes
* Fixed miscalculation of GOTPLT offset for ifunc syms
* Handle static links with ifunc correctly
* Fixup IFUNC tests to work on all targets
### Source
### Release Tarball
* https://releases.linaro.org/13.12/components/toolchain/binutils-linaro
### Development Tree
* git://git.linaro.org/toolchain/binutils-gdb.git
This release was built from the linaro_binutils-2_24-2013_12_release tag.
### Feedback and Support
Subscribe to the important Linaro mailing lists and join our IRC channels to
stay on top of Linaro development.
* Linaro Toolchain Development [mailing
list](http://lists.linaro.org/mailman/listinfo/linaro-toolchain)
* Linaro Toolchain IRC channel on irc.freenode.net at `#linaro-tcwg`
* Questions? [ask Linaro](http://ask.linaro.org/).
* Interested in commercial support? inquire at [Linaro
support](mailto:support@linaro.org)
== Issues ==
* none
== Progress ==
* LRA on AArch32:
o TCWG-343 : Make LRA the default for the ARM backend (8/10)
- Validated and committed a fix from Vladimir for Thumb1 issues.
- iWMMXT issue : Tried a fix without success; still working on it.
o TCWG-345 : Analyse performance of LRA for ARM. (0/10)
- No progress this week.
* Various meetings. (2/10)
== Next ==
* Vacation
== Progress ==
* Debugged and Fixed process record memory corruption problem.
[TCWG-315][TCWG-317][8/10]
* Sick Time off [2/10]
== Plan ==
* Send patches for bug fixes and look into remaining arm-native gdb issues.
* Respond to maintainer's suggestion on process record patches.
* Public Holiday on 25th
* Time off for setting up new office space.
== Progress ==
* Libssp GCC patch
Replied to Marcus's comments on libssp machine description support for
stack protect and test. Analyzed how other ports clear the register
that loaded the canary value. Waiting for his feedback.
* Pointer mangling AArch64 glibc.
Investigated mangling support and implemented a patch. Testing with the
glibc test suites on the V8 Foundation Model is in progress.
* Attended Linaro Toolchain status meeting.
* Attended 1-1 with Christophe (Linaro).
* Attended 1-1 with Matt (Linaro).
== Plan ==
- Pointer Guard support in AArch64 glibc
- Continue testing Cbuildv2
== Issues ==
* None.
== Progress ==
* Rebase aarch64 build scripts to crosstool-ng upstream, test and send
out the patch for community review (2/10).
* Investigate https://ci.linaro.org/jenkins/job/openembedded-armv8b-rootfs/gcc_version=4.….
It seems to be a build configuration issue: some macro is not correctly
defined. Cannot follow up on it without access to the build system.
* Continue on "uninit warning testsuite failures" (CARD 304 7/10)
- Identified another reason why uninit-pred-8_b.c FAILs: the control
flow is too complex, so the condition at line 22 cannot be normalized to
( n < 10 || m > 100 || r < 10 ).
- Work on patch to fix PHI issue to make uninit-pred-9_b.c PASS.
* Test builds for backporting "ftruncate() and truncate() stubs"
related patches in Linaro newlib.
== Plan ==
* Linaro toolchain binaries 2013.12 release.
== Progress ==
- Integrate benchmarking into Cbuildv2 (TCWG-360 7/10)
- Implementation mostly complete
- Started testing to ensure compatibility with cbuild1
- Code available for comments at
https://git.linaro.org/toolchain/cbuild2.git/shortlog/refs/heads/benchmarki…
- Binutils Bug 16340 (1/10)
- Posted the patch after regression testing and analysing the results
- Misc (2/10)
- Read relocation handling of tls and its implementation for aarch64
== Plan ==
- Complete integrating benchmarking into Cbuildv2
- Address comments for Binutils Bug 16340 and look to come up with a
simple testcase
== Progress ==
* Android LLVM
- Discussions on progress, trying to line up kernel+AOSP together
- Google has bailed on Clang/LLVM for the L release, will consider for next one
* Vectorizer
- Progressing on the implementation of the pragma parser
- http://llvm.org/PR18086
- Discussions about introduction of generic function vectorizer (ARM)
* Release 3.4
- Tested RC3, no regressions on tests or benchmarks
- http://people.linaro.org/~rengolin/llvm/
- http://llvm.org/pre-releases/3.4/rc3/
- Looked at a bug on the vectorizer for pentium3/freebsd
- Workaround found, but not simple enough to get it into RC4
* Background
- Many discussions, many support requests, many patch reviews
- Adding BOF notes to dev meeting site
- Booking train and hotel for FOSDEM 14
* Time
- CARD-862 8/10
- Others 2/10
* Happy Holidays! And see you in January!
== Issues ==
* Running benchmarks on my Chromebook is very unstable.
- Even though the standard deviation is small in two different moments,
the two results are statistically incompatible.
- The wireless network on the Chromebook, as widely known,
is unstable and unpredictable.
- I need a graphical interface, so I can do stuff during Connects,
or to see Phoronix results, and that is probably responsible
for all the instability
- Next release, I'll use an ODroid (or Arndale) for benchmarks
== Plan ==
* Holidays!
== Progress ==
- 2013.12 releases (4/10)
* Handover to Michael
* Committed remaining backports/branch merges
* Unexpected regression in 4.7 branch narrowed to a linker bug, now fixed.
- cross validations (2/10)
* stabilized armeb+qemu validations
- misc (4/10): misc conf-calls and meetings; internal meetings
== Next ==
Next 2 weeks off (Dec 23rd-Jan 3rd)
Merry Christmas and happy new year to all of you.
Hello,
I am using the pre-built toolchain gcc-arm-none-eabi-4_6-2012q2 from Linaro
to compile u-boot (u-boot-linaro-stable) and to compile my standalone
applications to run on the target (PandaBoard ES rev b2).
The hello_world standalone application which comes with u-boot executes
fine on the target when I disable CONFIG_SYS_THUMB_BUILD, but when I enable it,
the target resets with the following output:
Panda # go 82000000 hello
## Starting application at 0x82000000 ...
undefined instruction
pc : [<8200000c>] lr : [<bff83147>]
sp : bfeffe40 ip : bfeffc10 fp : 00000000
r10: 00000003 r9 : bffac954 r8 : bfefff68
r7 : bff01d88 r6 : 82000000 r5 : bff01d8c r4 : 00000003
r3 : 82000000 r2 : bff01d8c r1 : bff01d8c r0 : 00000002
Flags: nzCv IRQs off FIQs off Mode SVC_32
Resetting CPU ...
resetting ...
U-Boot SPL 2013.01.-rc1-g0f45941 (Dec 17 2013 - 14:23:41)
OMAP4460 <http://www.ti.com/product/OMAP4460> ES1.1
OMAP SD/MMC: 0
reading u-boot.img
reading u-boot.bin
reading u-boot.bin
......
Can anyone please help me understand why the Thumb-mode build is failing?
On 18/12/13 05:06, Jonathan S. Shapiro wrote:
> At the risk of sticking my nose in, this isn't a startup code issue.
> It's a contract issue.
>
> First, I don't buy Richard's argument about memcpy() startup costs and
> hard-to-predict branches. We do those tests on essentially every
> *other* RISC platform without complaint, and it's very easy to order
> those branches so that the currently efficient cases run well. Perhaps
> more to the point, I haven't seen anybody put forward quantitative
> data that using the MMU for unaligned references is any better than
> executing those branches. Speaking as a recovering processor
> architect, that assumption needs to be validated quantitatively. My
> guess is that the branches are faster if properly arranged.
>
> Second, this is a contract issue. If newlib intends to support
> embedded platforms, then it needs to implement algorithms that are
> functionally correct without relying on an MMU. By all means use
> simpler or smarter algorithms when an MMU can be assumed to be
> available in a given configuration, but provide an algorithm that is
> functionally correct when no MMU is available. "Good overall
> performance in memcpy" is a fine thing, but it is subject to the
> requirement of meeting functional specifications. As Jochen Liedtke
> famously put it (read this in a heavy German accent): "Fast, ya. But
> correct? (shrug) Eh!"
>
> So: we need a normative statement saying what the contract is. The
> rest of the answer will fall out from that.
>
> I do agree with Richard that startup code is special. I've built
> deeply embedded runtimes of one form or another for 25 years now, and
> I have yet to see a system where optimizing a simplistic byte-wise
> memcpy during bootstrap would have made any difference in anything
> overall. That said, if the specification of memcpy requires it to
> handle incompatibly aligned pointers (and it does), and the contract
> for newlib requires it to operate in MMU-less scenarios in a given
> configuration (which, at least in some cases, it does), it's
> completely legitimate to expect that bootstrap code can call memcpy()
> and expect behavior that meets specifications.
>
> So what's the contract?
>
I disagree with your assertion that newlib *requires* it to operate in
an MMU-less scenario for all targets; it only does so when the target
can reasonably be expected to not have an MMU.
The only contract that exists is the one written in the C standard:
7.23.2.1#2 The memcpy function copies n characters from the object
pointed to by s2 into the object pointed to by s1. If copying takes
place between objects that overlap, the behavior is undefined.
But that is written on the assumption that we're in a normal execution
environment, not in some special case.
What you're missing is that AArch64 is (in ARM ARM terms) an A-profile
only environment where an MMU is mandated in the system. Furthermore,
processors implementing the architecture will *expect* that the MMU be
turned on as soon as possible after boot, since without this the caches
cannot be used and without those the performance will be truly horrible.
Once the caches are enabled, it's perfectly reasonable to assume that
memcpy will only be used for copies to and from NORMAL memory, since
other types of memory have potential side effects, which means that use
of memcpy would be unsafe.
If you want to write an MMU-less memcpy, then feel free to write one;
but please install it with a different interface -- something like
__memcpy_nommu(). Don't penalise the standard case for the non-standard
exceptional one.
R.
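A byte-wise copy of the kind suggested above might look like the following sketch. The name memcpy_nommu is hypothetical (the double-underscore spelling is left out because such names are reserved for the implementation). The volatile qualifiers are there to discourage the compiler from merging the byte accesses into wider ones or turning the loop back into a memcpy call; whether that is sufficient for a given toolchain and memory type is an assumption to verify, not a guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical byte-at-a-time copy for use before the MMU is enabled.
   Every access is a single byte, so no access can be misaligned. */
static void *memcpy_nommu(void *dst, const void *src, size_t n)
{
    volatile unsigned char *d = dst;
    const volatile unsigned char *s = src;

    while (n--)
        *d++ = *s++;
    return dst;
}

/* Usage: copy between deliberately odd addresses. */
static void example_copy(void)
{
    char src[] = "unaligned";
    char dst[16] = {0};

    memcpy_nommu(dst + 1, src + 1, 8);
    assert(dst[1] == 'n' && dst[8] == 'd' && dst[9] == 0);
}
```

Keeping this under its own name leaves the standard memcpy free to use wider, faster accesses on Normal memory.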
Hi all,
I have a bit of a strange one. I'm not after a full solution, just any
hints that quickly come to mind :)
After a few simple patches I have a build of mongodb for aarch64 (built
with gcc-4.8). However, all of the test binaries that the build spits
out immediately segfault. gdb-ing shows that they segfault inside this
macro:
TSP_DECLARE(OwnedOstreamVector, threadOstreamCache);
This expands to:
# define TSP_DECLARE(T,p) \
extern __thread T* _ ## p; \
template<> inline T* TSP<T>::get() const { return _ ## p; } \
extern TSP<T> p;
And indeed, it's mongo::TSP<mongo::OwnedPointerVector<...> >::get()
const that we're segfaulting in. This is the disassembly of this
function (at -O0) with the faulting instruction marked:
0x00000000004b4b6c <+0>: stp x29, x30, [sp,#-32]!
0x00000000004b4b70 <+4>: mov x29, sp
0x00000000004b4b74 <+8>: str x0, [x29,#16]
0x00000000004b4b78 <+12>: adrp x0, 0x64c000
0x00000000004b4b7c <+16>: ldr x0, [x0,#776]
0x00000000004b4b80 <+20>: nop
0x00000000004b4b84 <+24>: nop
0x00000000004b4b88 <+28>: mrs x1, tpidr_el0
0x00000000004b4b8c <+32>: add x0, x1, x0
=> 0x00000000004b4b90 <+36>: ldr x0, [x0]
0x00000000004b4b94 <+40>: ldp x29, x30, [sp],#32
0x00000000004b4b98 <+44>: ret
And the registers:
(gdb) info registers
x0 0x7fb863fd70 548554407280
x1 0x7fb7ff76f0 548547819248
x2 0x0 0
x3 0x7fb7fc11b8 548547596728
x4 0x1 1
x5 0x0 0
x6 0x50 80
x7 0x0 0
x8 0x0 0
x9 0x6165727473676f4c 7018141438804717388
x10 0x0 0
x11 0x0 0
x12 0x2 2
x13 0x10 16
x14 0x0 0
x15 0x7fb7e5e590 548546143632
x16 0x64b3d8 6599640
x17 0x7fb7f667d0 548547225552
x18 0x7fffffdab0 549755804336
x19 0x7fffffed50 549755809104
x20 0xb 11
x21 0xb 11
x22 0x6500b0 6619312
x23 0x650070 6619248
x24 0x7fffffff 2147483647
x25 0x64db40 6609728
x26 0x7fffffeda0 549755809184
x27 0x653d00 6634752
x28 0x7fffffe750 549755807568
x29 0x7fffffe4d0 549755806928
x30 0x4b4ed4 4935380
sp 0x7fffffe4d0 0x7fffffe4d0
pc 0x4b4b90 0x4b4b90 <mongo::TSP<mongo::OwnedPointerVector<std::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> > > >::get() const+36>
cpsr 0x20000000 536870912
fpsr 0x0 0
fpcr 0x0 0
If I recompile this object file without -fPIC, it works.
I guess I see three things that could be wrong:
1) The operand to "adrp x0, 0x64c000"[1]
2) The operand to "ldr x0, [x0,#776]"
3) The value of tpidr_el0
Oh, and I guess:
4) The setup of tls has gone wrong and the address in x0 _ought_ to be
accessible but isn't for some reason.
Any hints on which of these seems most likely to be the culprit?
Cheers,
mwh
[1] FWIW, objdump reports 0x64c000 as "_GLOBAL_OFFSET_TABLE_+0x2d0" (not
sure why that doesn't show up in gdb's disassembly).
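The register dump already pins down some of the numbers in the list above. The sketch below only redoes that arithmetic, with illustrative function names. The faulting address in x0 is the thread pointer in x1 plus the value loaded from the GOT slot at 0x64c000 + 776, so subtracting recovers what that slot held.

```c
#include <assert.h>
#include <stdint.h>

/* Address of the GOT slot read by "ldr x0, [x0,#776]" after the adrp. */
static uint64_t got_slot_address(uint64_t page, uint64_t imm)
{
    return page + imm;
}

/* Undo "add x0, x1, x0": recover the offset that was added to tpidr_el0. */
static uint64_t tls_offset_used(uint64_t faulting_addr, uint64_t thread_pointer)
{
    return faulting_addr - thread_pointer;
}

/* Usage: the values from the disassembly and "info registers" output. */
static void check_register_dump(void)
{
    assert(got_slot_address(0x64c000, 776) == 0x64c308);
    assert(tls_offset_used(0x7fb863fd70ULL, 0x7fb7ff76f0ULL) == 0x648680);
}
```

The recovered value, 0x648680, is of the same magnitude as the executable's own data addresses (x16, x22 and x23 hold 0x64b3d8, 0x6500b0 and 0x650070), which may point towards the value in the GOT slot rather than tpidr_el0. That is a reading of the numbers, not a diagnosis.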
== Progress ==
* Bugfixing and testing QEMU AArch64 FP patches (3/10, VIRT-183)
* Debugging and submitting a patch for ARM gdb ifunc test failures (1/10)
* Two day week due to holidays
== Issues ==
* None
== Plan ==
* Back on the 9th January, have a good Christmas and New Year everybody!
--
Will Newton
Toolchain Working Group, Linaro
Hi,
We've noticed an issue trying to use the Linaro AArch64 binary bare metal
toolchain release with the MMU turned off for some low-level tests.
Anytime puts, sprintf, etc. gets called, a reent structure gets created with
references to STDIN, STDOUT, STDERR FILE types. A member in the __sFile
struct, _mbstate, is an 8 byte struct, but is not aligned on an 8 byte
boundary. This means that when memset (or a similar function) gets called on
this struct, and doesn't operate one byte at a time, a data alignment fault
will be generated when operating out of device memory, such as on a system
where the MMU has not yet been turned on.
I'm still examining possible fixes (I'll probably look at building with
-mstrict-align first), but I wanted to check if anyone had thoughts on the
subject and if Newlib upstream or Linaro consider using Newlib with the MMU
turned off to be a valid use case or if running the code that turns on the MMU
is considered a prerequisite to everything else.
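The layout problem described above can be reproduced with a small check. The struct definitions below are hypothetical stand-ins, not newlib's actual declarations: an 8-byte struct built only from 4-byte members needs just 4-byte alignment, so the compiler is free to place it at an offset that is not a multiple of 8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an 8-byte multibyte-state struct. */
struct demo_mbstate {
    int32_t  count;
    uint32_t value;
};

/* Hypothetical stand-in for a FILE-like struct that embeds it. */
struct demo_file {
    int32_t             flags;
    struct demo_mbstate mbstate;
};

/* True when a member at `offset` inside an 8-byte-aligned parent would
   itself sit on an 8-byte boundary. */
static int on_8_byte_boundary(size_t offset)
{
    return offset % 8 == 0;
}

/* Usage: the member is 8 bytes long but lands on a 4-byte boundary. */
static void check_layout(void)
{
    assert(sizeof(struct demo_mbstate) == 8);
    assert(!on_8_byte_boundary(offsetof(struct demo_file, mbstate)));
}
```

A memset that clears such a member with a single 8-byte store issues an unaligned access, which is the kind of access that faults on device memory.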
Thanks,
Christopher
--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by the Linux Foundation.
== Progress ==
TCWG-293 (9/10)
- wrote and tested 64bit division code
- it seems to work
- still need to do performance testing
TCWG-347 Fix PR59142 (1/10)
- split into series of 3 patches
- patch almost ready, was held up by non-availability of the lab
- need to bootstrap on Thumb-1 to prove change made in response to
review comments
TCWG-346 AArch64 Benchmarking: CoreMark & Dhrystone
- no significant progress, no access to the lab
== Next ==
Pick up aarch64 benchmarking when the board becomes accessible again
Submit PR59142
== Progress ==
- 2013.12 releases (4/10):
* stalled due to lab unavailability.
* A couple of backports are waiting for approval, another one is
being debugged.
- cross-validation (4/10): fixed armeb+qemu validations.
- misc (2/10): misc conf-calls and meetings
== Next ==
- Make 2013.12 releases
- cbuild2: continue testing, try to make 4.7 source release
- libsanitizer on AArch64: resume work
== Future ==
Next 2 weeks off (Dec 23rd-Jan 3rd)
== Issues ==
* 1.5 days off due to a car issue. (3/10)
* Calxedas are down after lab maintenance.
== Progress ==
* LRA on AArch32:
o TCWG-343 : Make LRA the default for the ARM backend (5/10)
- Turn LRA on by default committed as rev205887
http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01088.html
- New Thumb regressions reported (Cortex-m0 and bootstrap),
analysis ongoing.
- Analysed last week's regressions and reported them upstream;
Vladimir fixed them at rev205974.
- iWMMXT issue : work ongoing.
o TCWG-345 : Analyse performance of LRA for ARM. (0/10)
- No progress this week.
* Reviewed some merge requests. (1/10)
* Various meetings. (1/10)
== Next ==
* Continue LRA, merge and patch reviews.
== Progress ==
* Debugging and analysis of various gdb test suite failures [TCWG-34] [5/10]
Updated googledoc sheet with action items and comments on different failures.
Investigated remote core file generation issues.
Prepared a patch to turn off corefile dependent tests in remote configs.
* Debugged gdb.reverse testsuite failures [TCWG-197] [4/10]
Found a memory corruption issue where execution log is being corrupted
in memory.
* Time off for dentist appointment and office relocation stuff [1/10]
== Plan ==
* Figure out the cause of, and a fix for, the process record memory corruption problem.
* Further analysis of test suite failures in arm-native vs. x86-native
and arm-remote vs. x86-remote test results.
* Send patch to disable corefile tests in remote mode. Ping process
record and other previous patches.
== Progress ==
- Libssp GCC (4/10)
- Rebased GCC source and added patch for stack protect and test
based on global stack guard. Discussing with Marcus on
generic stack protect set and test versus machine descriptions.
Discussed with ARM and glibc maintainers; dropped my patches
for TLS-based stack guard.
- Cbuildv2 experiments (3/10)
- Built cross compiler with Cbuildv2.
- Discussing with Ryan on building tool chain without
cbuild.validation.linaro.org dependency
- PGO support for aarch64 (1/10)
Read a paper on PGO optimization in GCC
- Cross build some benchmarks (2/10). There were missing omp.h file
errors when the Linaro tool chain was used. The issue is that the tool chain is
not built with the libgomp library. Rebuilt the tool chain after checking
configuration changes with Zhenqiang Chen.
== Plan ==
- Investigate Pointer Guard support in AArch64 glibc
- Continue testing Cbuildv2
- Continue PGO investigations
== Issues ==
* None.
== Progress ==
* Enable libomp for aarch64*-linux-gnu builds in Linaro crosstool-ng.
* Backporting r200103 and r205509 to Linaro 4.8.
* Try to enable lra and test Spec2k with -fno-move-loop-invariants and
-fira-loop-pressure. But still no overall performance improvement.
(2/10)
* Try conditional compare related changes (CARD 313: 3/10)
- Set LOGICAL_NON_SHORT_CIRCUIT to false in fold-const.c.
- Do ifcombine twice.
- Logs show lots of new FAILs in vrp related cases and no
performance improvement in Spec2k INT.
* Identified the root causes of "uninit warning testsuite failures"
(CARD 304: 3/10)
- Some values are from PHI, which is not handled when checking subset.
- Function is_included_in is conservative. Here is its comment:
/* ... It returns false if ONE_PRED's domain is
not a subset of any of the sub-domains of PREDS (
corresponding to each individual chains in it), even
though it may be still be a subset of whole domain
of PREDS which is the union (ORed) of all its subdomains.
In other words, the result is conservative. */
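The conservativeness described in that comment can be illustrated with plain integer intervals standing in for predicate domains. This is an analogy, not GCC's data structures: the type and function names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Closed integer interval [lo, hi], standing in for a predicate's domain. */
struct range { int lo, hi; };

static int contains(struct range outer, struct range inner)
{
    return outer.lo <= inner.lo && inner.hi <= outer.hi;
}

/* Conservative check: is `one` inside any single sub-domain? */
static int in_some_subdomain(struct range one, const struct range *subs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (contains(subs[i], one))
            return 1;
    return 0;
}

/* Exact check: is every point of `one` covered by the union of sub-domains? */
static int in_union(struct range one, const struct range *subs, size_t n)
{
    for (int x = one.lo; x <= one.hi; x++) {
        int covered = 0;
        for (size_t i = 0; i < n; i++)
            if (subs[i].lo <= x && x <= subs[i].hi)
                covered = 1;
        if (!covered)
            return 0;
    }
    return 1;
}

/* Usage: [3, 8] straddles two sub-domains that together cover it. */
static void check_example(void)
{
    const struct range subs[] = { {0, 5}, {6, 10} };
    const struct range one = {3, 8};

    assert(!in_some_subdomain(one, subs, 2)); /* conservative answer: no */
    assert(in_union(one, subs, 2));           /* exact answer: yes */
}
```

A per-sub-domain check like the first one can only report "not included" in such a case, even though the union does cover the domain.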
== Plans ==
* 2013.12 toolchain binaries release.
* Continue on CARD 313 and 304.