Hi Linaro Toolchain Group,
I have a few questions about glibc and libm on aarch64.
If possible, please provide some insight, otherwise kindly redirect me to
the concerned person/forum.
1. From the community patches, it seems that ARM/Linaro is optimizing glibc
functions such as memcpy/memmove and the string functions for aarch64.
However, some of these patches (e.g. memcpy/memmove) do not appear to be
merged in glibc yet. Any comment on their availability in glibc?
e.g. https://www.sourceware.org/ml/libc-alpha/2015-12/msg00341.html
2. On the same note, is there any plan for optimizing/tuning libm functions
(e.g. trigonometric) for aarch64?
I could not find any matching patches on the review board. Please correct me
if I am wrong.
3. It looks like ARM has released an independent version of libm for certain
trigonometric functions:
https://github.com/ARM-software/optimized-routines
Are there plans for these optimizations to go into glibc's libm? Any comment
on their performance improvement over GNU libm?
Thanks in advance for your time.
--
with regards,
Virendra Kumar Pathak
== Progress ==
- LTO and TCWG480 (6/10)
* Read and experimented with GCC's LTO codebase.
* Set up Arch Linux on a Chromebook and ran CoreMark with perf
* Read referenced publications
- PR66726 (2/10)
* Rebased the patch
* Regression tested on x86_64 and using Christophe's setup
* getting ready to post upstream
- Misc (2/10)
* gcc/bug list
* abstract for connect presentation
* Meetings
== Plan ==
* LTO
* bugs
o Teaching activity (2/10)
== Progress ==
o Linaro GCC (4/10)
* Completed backports for 2016.01
* Merged FSF 5 branch into Linaro 5 one
* Delivered 2016.01 Snapshot
* Tried to reproduce Linaro PR #1988 with 2016.01:
issue not reproducible
o GCC dev. (3/10)
* Fix armv8.1-a handling at configure time
patch committed upstream in trunk and backported in our branch.
* Investigate TCWG-128 (Thumb2 branch out of range with LTO)
o Misc (1/10)
* Various meetings
== Plan ==
o Look at diff between ABE branches in our validation context
o Continue on-going tasks
Port to microinstance - TCWG-432 [3/10]
* Fallout from attempts to fix race condition
* Various minor fixes - simplifications, better reporting
Backport benchmarking - TCWG-352 [1/10]
* Decoupled 'target triple' from 'toolchain' name
** Immediately, to let me benchmark with Juno-built native gcc
** But this has been causing minor pain for a while
Benchmarking infrastructure documentation - TCWG-496 [1/10]
* Finished drafts of LAVA wrapper and microinstance access docs
Unexpected 1/2 day off [1/10]
Misc [4/10]
* Unusual number of meetings
* Fiddling with my git workflow
* Some thinking about Benchmark-102 for Connect
* Figured out why our SPEC objects were disappearing
== This week ==
* Bugzilla 68543 - [AArch64] Implement overflow arithmetic standard
names (6/10)
- Re-implementing with sign/zero extends; the need for them was not clear
from the documentation
* Bugzilla 69008 - gcc emits unneeded memory access when passing trivial
structs by value (1/10)
- Determined to be structure alignment issue
* Code review of patch posted for Bugzilla 68532 (1/10)
- Completed code review and successfully tested with my vaddw patches
* Bugzilla 67323 - Tested patch for TCWG-318 (non-unit stride loads) (1/10)
- Patch did not generate vld3, so the bug has been reopened
* TCWG-317 - No review feedback yet
* Misc (1/10)
== Next week ==
* Bugzilla 68543 - Complete re-implementation with zero/sign extends and
test
* Bugzilla 69008 - Determine where in compiler issue should be addressed
* USA Holiday (January 18th, MLK Day)
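For readers unfamiliar with the Bugzilla 68543 items above: one common way to
detect integer overflow is to redo the operation in wider arithmetic and check
whether the result survives truncation followed by sign or zero extension. The
Python sketch below only models that arithmetic. The helper names are made up
for illustration, and it does not reflect the actual patterns in the patch.

```python
import operator


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value


def zero_extend(value, bits):
    """Interpret the low `bits` bits of value as an unsigned number."""
    return value & ((1 << bits) - 1)


def overflows(op, a, b, bits=32, signed=True):
    """Return True if op(a, b) does not fit in `bits` bits.

    The exact result is computed with Python's unbounded integers, which
    stand in for the wider machine mode, and is then compared with its
    truncated and re-extended form.
    """
    extend = sign_extend if signed else zero_extend
    wide = op(extend(a, bits), extend(b, bits))
    return extend(wide, bits) != wide
```

For example, overflows(operator.add, 2**31 - 1, 1) reports an overflow for a
signed 32-bit add, while overflows(operator.add, -1, 1) does not.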
== Progress ==
* Support (5/10)
- Bug triage: reviewing the 300+ bugs on ARM
- closing fixed ones (+35)
- adding deps for meta bugs (kernel, android, ias, chromium) (~15)
- moving to ARM/AArch64 components, etc. (~30)
- Checking a few more kernel patches for Clang support with Arnd
* Plan (3/10)
- Still planning roadmap for 2016/2017
* Background (2/10)
- Code review, meetings, discussions, general support, etc.
# Progress #
* Estimate the effort of GDB kernel-awareness work. Give comments
on formal slides. [1/10]
* GDB inserts breakpoints in the wrong place when source files share the
same basename. TCWG-491. At least, post a patch to avoid the
GDB crash. Reproduced with a simpler case and opened PR 19474. [3/10]
* Update document about input interrupt. [1/10]
Done. Patch is committed.
* TCWG-503, fix GDB test case by @progbits -> %progbits. Patch is
posted. [1/10]
* Clean up arm software single step code. [3/10]
In progress. Some patches are committed, but some are still in my
queue. The recent arm software single step change causes a
regression on stepping out of signal handler.
* Think about the GDB slides for Linaro Connect [1/10]
# Plan #
* Look into the arm software single step regression.
* Post the rest of patches for arm software single step clean up.
--
Yao
== Progress ==
* Validation
- a couple of fixes in the validation jobs
- looking at timeout problems with buildfarm-master
* GCC
- TCWG-485/PR68620: sent patch for review
- TCWG-484/target attributes problems in the testsuite
Rebased, but need more pending updates from Christian
* Misc (conf calls, meetings, emails, ...)
== Next ==
* Validation: more debug. Hopefully start extending validation scope
* GCC:
- bug fixing
- check trunk regressions reported during the holidays
The Linaro Toolchain Working Group (TCWG) is pleased to announce the
2016.01 snapshot of the Linaro GCC 5 source package.
This monthly snapshot[2] is based on FSF GCC 5.3+svn232321 and
includes performance improvements and bug fixes backported from
mainline GCC. This snapshot's contents will be part of the 2016.02
stable[1] quarterly release.
This snapshot tarball is available on:
http://snapshots.linaro.org/components/toolchain/gcc-linaro/5.3-2016.01/
Interesting changes in this GCC source package snapshot include:
* Updates to GCC 5.3+svn232321
* Fixes Linaro Bugzillas: #1982, #1980
* Backport of [BugFix] [AArch32] PR 68149 Fix ICE in unaligned_loaddi split
* Backport of [BugFix] [AArch32] PR target/68214: Delete
IP-reg-clobbering call-through-mem patterns
* Backport of [BugFix] [AArch32] PR target/68390
* Backport of [Bugfix] [AArch64] PR rtl-optimization/68796 Add
compare-of-zero_extract pattern
* Backport of [BugFix] [AArch64] PR target/68696 FAIL:
gcc.target/aarch64/vbslq_u64_1.c scan-assembler-times bif\tv 1
* Backport of [BugFix] PR rtl-optimization/68381
* Backport of [ARMv8.1] [AArch32] Add ACLE feature macro for ARMv8.1
instructions
* Backport of [ARMv8.1] [AArch32] Add ACLE intrinsics vqrdmlah and vqrdmlsh
* Backport of [ARMv8.1] [AArch32] Add ACLE intrinsics vqrdmlah_lane
and vqrdmlsh_lane
* Backport of [ARMv8.1] [AArch32] Add patterns for new instructions
* Backport of [ARMv8.1] [AArch32] Add support for ARMv8.1
* Backport of [ARMv8.1] [AArch32] Multilib support for ARMv8.1.
* Backport of [ARMv8.1] [AArch32] Support ARMv8.1 ARM tests
* Backport of [AArch32] [AArch32] Fix armv8.1 support at configure time
* Backport of [AArch32] Add attribute for compatibility with ARM pipeline models
* Backport of [AArch32] Fix vector TYPE_MODE in streaming-out
* Backport of [AArch64] 1/7 Add support for ARMv8.1 Adv.SIMD instructions
* Backport of [AArch64] 2/7 Add sqrdmah, sqrdmsh instructions
* Backport of [AArch64] 3/7 Add builtins for ARMv8.1 Adv.SIMD instructions
* Backport of [AArch64] 4/7 Add ACLE feature macro for ARMv8.1
Adv.SIMD instructions
* Backport of [AArch64] 5/7 Dejagnu support for ARMv8.1 Adv.SIMD
* Backport of [AArch64] 6/7 Add NEON intrinsics vqrdmlah and vqrdmlsh
* Backport of [AArch64] 7/7 Add NEON intrinsics vqrdmlah_lane and vqrdmlsh_lane
* Backport of [AArch64] Documentation fix for -fpic
* Backport of [AArch64] fix 3/7 Add builtins for ARMv8.1 Adv.SIMD instructions
* Backport of [AArch64] Fix a few failures with LSE enabled
* Backport of [AArch64] Improve add immediate expansion
* Backport of [AArch64] Improve comparison with complex immediates
followed by branch/cset
* Backport of [AArch64] Rework ARMv8.1 command line options
* Backport of [AArch64] Update patterns to support FP zero
* Backport of [AArch64] Use aarch64_sync_memory_operand in
atomic_store<mode> pattern
* Backport of [AArch64] Use vector wide add for mixed-mode adds
* Backport of [Misc] Only restrict pure simplification in mult-extend
subst case, allow other substitutions
* Backport of [Testsuite] [AArch64] Skip big-endian as well for
gcc.target/aarch64/got_mem_hoist_1.c
* Backport of [Testsuite] Make stackalign test LTO proof
* Backport of [Testsuite] Testcase for PR rtl-optimization/68381
Feedback and Support
Subscribe to the important Linaro mailing lists and join our IRC
channels to stay on top of Linaro development.
** Linaro Toolchain Development mailing list:
http://lists.linaro.org/mailman/listinfo/linaro-toolchain
** Linaro Toolchain IRC channel on irc.freenode.net at #linaro-tcwg
* Bug reports should be filed in bugzilla against GCC product:
http://bugs.linaro.org/enter_bug.cgi?product=GCC
* Interested in commercial support? Inquire at Linaro support:
mailto:support@linaro.org
[1]. Stable source package releases are defined as releases where the
full Linaro Toolchain validation plan is executed.
[2]. Source package snapshots are defined when the compiler is only
put through unit-testing and full validation is not performed.
o Teaching activity (6/10)
== Progress ==
o Linaro GCC (4/10)
* Restart validation blocked by the last infrastructure issues
* Discussed and reviewed Jenkins jobs fixes.
* Tested potential upstream fixes for linaro BZ 1980 and 1982,
but reverting the offending backports is a better solution.
== Plan ==
o Complete backports and deliver 2016.01 GCC snapshot.
Centralized benchmark source - TCWG-354 [1/10]
* Fixed some bugs in Coremark reporting
* Experimented some more with clang builds, set this aside for now
Port to microinstance - TCWG-432 [3/10]
* Fixed some false assumptions exposed by change to uinstance
* Worked on LAVA side of coremark reporting
Backport benchmarking - TCWG-352 [4/10]
* Fixed some bugs that appeared over the holiday
* Got job dispatching benchmark run to microinstance, with pointer to
backport-containing toolchain
* But resulting job fails
Benchmarking infra documentation - TCWG-496 [1/10]
* Started documenting LAVA wrapper
Misc [1/10]
=Plan=
LAVA side of coremark reporting
Fix backport-dispatched benchmark job
Much more documentation
== Progress ==
LLDB development
-- Fixed thumb mode issues if user triggers changes to PC register
[TCWG-228] [3/10]
-- Triage of testsuite failures on armel chromebook [TCWG-228] [2/10]
-- Updated xfail decorators for arm-linux targets. Marked triaged
failures as xfail. [TCWG-494] [3/10]
Miscellaneous [1/10]
-- Meetings, emails, discussions etc.
Half Day Leave [1/10]
-- Tooth extraction surgery.
== Plan ==
LLDB development
-- Triage more failures on armel chromebook [TCWG-228]
-- Submit an updated xfail decorator for arm-linux. [TCWG-494]
Miscellaneous
-- Connect visa follow up.
== This Week ==
* tcwg-72 (2/10)
- Following Jim's suggestion, using XEXP (remainder, 0) resolved the
segfault for the DImode case
- Resolved ICEs due to (silly) mistakes.
* PR69133 (4/10)
- Reduced test-case following Markus's suggestions.
- I think it happens because node->lto_file_data is set to NULL in
lto_free_function_in_decl_state_for_node ().
Commenting out the following calls in get_untransformed_body():
lto_free_section_data (file_data, LTO_section_function_body, name,
data, len, decl_state->compressed);
lto_free_function_in_decl_state_for_node (this);
prevents the ICE. I suppose symbols with the same name in the same
partition share state (?), so setting lto_file_data = NULL affects the
state of the second occurrence of the symbol and we hit the assert.
If this is true, gating on lto_file_data doesn't seem unreasonable to me
as a way to avoid the ICE (the code has undefined behavior anyway, since
it violates the ODR).
I am still not sure why this "works" when partitioning is enabled (one,
balanced, 1to1).
* 447.dealII bug (1/10)
- Bug not reproducible (verified with Kugan).
* TCWG-319 benchmarking (1/10)
- first job submission failed due to kernel panic
- a53: base fp run completed, with-patch in progress
* Misc (2/10)
- Meetings
- ipa
== Progress ==
* Validation
- a couple of fixes in the validation jobs
- improvements to the comparison scripts
* GCC
- investigated bugs #1980 and #1982.
Reverted the backports that introduced the problem.
- TCWG-485/PR68620: resumed
- TCWG-484: target attributes problems in the testsuite.
Sent an updated patch, but Christian has also submitted
a patch which modifies the compiler behavior. I'll have
to update mine when his patch is accepted.
* Misc (conf calls, meetings, emails, ....)
== Next ==
* Validation: more cleanup. Hopefully start extending validation scope
* GCC:
- bug fixing
- check trunk regressions reported during the holidays
== Progress ==
* Support (2/10)
- Having a go at PR25722, too hacky for a feature that can
be easily worked around.
- Reviewing some kernel issues with Arnd
* Planning (6/10)
- Drafting 2 year plans for LLVM
* Background (2/10)
- Code review, meetings, discussions, general support, etc.
- Catching up from long holidays
# Progress #
* Handle input interrupt in GDB. TCWG-424. Done. [3/10]
Need to update GDB manual to clarify the expected behaviour of GDB.
* Estimate the effort of GDB kernel-awareness work. Done. [1/10]
* TCWG-491. Ongoing. [2/10]. Understand symbol handling
in GDB.
* Various patch review upstream. [2/10].
* Clean up code on arm software single step after some changes from
Ericsson. Ongoing. [2/10].
# Plan #
* TCWG-491.
* Follow up of TCWG-424 to update GDB manual.
* Clean up code on arm software single step.
* Assess the ARM and AArch64 GDB test result, as 7.11 release is
coming soon.
--
Yao
Is it in the 2015.11-1 release ?
- rob -
-------- Original message --------
From: Jim Wilson <jim.wilson(a)linaro.org>
Date: 01/05/2016 19:45 (GMT-07:00)
To: Xiaofeng Ren <xiaofeng.ren(a)nxp.com>
Cc: linaro-toolchain(a)lists.linaro.org, Zhenhua Luo <zhenhua.luo(a)nxp.com>
Subject: Re: gcc-linaro-5.1 vs gcc-linaro-4.8
On Tue, Jan 5, 2016 at 4:19 PM, Xiaofeng Ren <xiaofeng.ren(a)nxp.com> wrote:
> Hello Jim,
> Appreciate for your comments.
> I will try to manually apply that patch on my side and try it.
> BTW, may I know which released Linaro gcc version include that patch? Maybe I can download it and try it quickly.
> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00025.html
It was backported to our gcc-5 branch on Nov 24 by Yvan. This is
after the latest release 2015-11 was made. The patch is in the
December snapshot, but I think that is a source only release.
http://snapshots.linaro.org/components/toolchain/gcc-linaro/5.3-2015.12/
You would have to build your own toolchain from that, perhaps by using abe.
Jim
Linaro TCWG,
In newer toolchains that are built with ABE, libc.a contains a lot of debugging information, including the paths to the source files on the build machine. I think that's because ABE builds the libraries with -g and never strips out the debug information. I verified this with both the 4.9-2015.05 and 5.2-2015.11 binary releases with the command:
arm-linux-gnueabihf-objdump -g libc.a | grep '\.c'
In older toolchains that were built with crosstool-ng, libc.a did not contain the paths to the source files. I guess crosstool-ng either didn't build the libraries with -g, or it stripped out the debug information later. I verified this with the 4.9-2014.09 binary release.
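The same check can be approximated without objdump. The Python sketch below walks the members of an ar archive such as libc.a and reports those whose contents include the section name .debug_info. It is a heuristic under stated assumptions: it looks for the section-name string rather than parsing ELF headers, the function name is made up for illustration, and members with GNU long names are reported by their "/<index>" reference. It is not a replacement for objdump -g.

```python
AR_MAGIC = b"!<arch>\n"
HEADER_SIZE = 60
# Special members: symbol tables and the GNU long-name table.
SPECIAL_MEMBERS = ("/", "//", "/SYM64/")


def members_with_debug_info(archive_bytes):
    """Return names of ar members whose bytes mention a .debug_info section."""
    if not archive_bytes.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    found = []
    pos = len(AR_MAGIC)
    while pos + HEADER_SIZE <= len(archive_bytes):
        header = archive_bytes[pos:pos + HEADER_SIZE]
        if header[58:60] != b"`\n":
            raise ValueError("bad member header at offset %d" % pos)
        name = header[0:16].decode("ascii").rstrip()
        size = int(header[48:58].decode("ascii").strip())
        data = archive_bytes[pos + HEADER_SIZE:pos + HEADER_SIZE + size]
        if name not in SPECIAL_MEMBERS and b".debug_info" in data:
            # GNU ar terminates short member names with '/'.
            found.append(name.rstrip("/") or name)
        # Member data is padded to an even offset.
        pos += HEADER_SIZE + size + (size & 1)
    return found
```

Typical use would be members_with_debug_info(open("libc.a", "rb").read()); an empty list suggests the library was built without -g or was stripped afterwards.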
I'm not sure whether this change was intentional, or just an oversight during the switchover to ABE. Regardless, it makes the libraries a lot bigger, and it potentially affects the end user during debugging.
The source files of libc, etc. are not typically included with the binary releases. So, when a user of an ABE-built binary release chooses to step into an extern function of libc, gdb will search for the source file. It likely won't be able to access the source file along the same path that worked for the build machine, so it will search its list of source directories. Ultimately, unless the user has downloaded the source files, gdb will likely display a message like "printf.c: No such file or directory".
In contrast, when a user of a crosstool-ng-built toolchain tries to step into an extern function of libc, gdb will be unaware of the name of the source file. As a result, the user will not get a message about a missing file.
So, should the toolchains' libraries really contain debug information? I think it could be useful for a theoretical multilib folder that covers a -g option. On the other hand, for the usual release builds, isn't the debug information a waste of space and a source of confusion?
Thanks,
Fred Peterson
Engineer - Developer Tools
NXP Semiconductors
Hello All,
I found a difference between gcc-linaro-5.1 and gcc-linaro-4.8 while running the lmbench benchmark on our LS1043 (Cortex-A53).
With gcc-linaro-4.8, gcc generates Advanced SIMD instructions (such as ld1), whereas gcc-linaro-5.1 does not. This causes a big performance gap between gcc-4.8 and gcc-5.1 in the lmbench memory bandwidth "fcp" test (the bw_mem program).
My compiler flags are "-O3 -mcpu=cortex-a53". I also tried several other flag combinations ("-O3 -mcpu=cortex-a53+fp+simd", "-O2 -ftree-vectorize -mcpu=cortex-a53", "-O3 -ftree-vectorize -mcpu=cortex-a53"); none of them made a difference.
The gcc-5.1 toolchain was downloaded from the following link:
https://snapshots.linaro.org/openembedded/sources/gcc-linaro-5.1-snapshot-2…
Can I have your comments on this?
Thanks
Ron
Hello toolchain gurus,
In the course of Linaro's kernel tinification project, the ability to
compile the Linux kernel using LTO is a frequent requirement. However
the kernel makes heavy usage of 'ld -r' with .o files resulting from LTO
build of .c files as well as .o files resulting from pure assembly code.
This mix of LTO and non-LTO object files is not supported by upstream
binutils unless a patch from H.J. Lu is applied. That patch has been
available since 2013 and was last refreshed in his 2.25.51.0.4 branch
last September. It is accessible here:
https://git.linaro.org/toolchain/binutils-gdb.git/commit/6da5456971
I've attached a very simple test case demonstrating the problem. With
the binutils-lto-mixed.patch applied, this test case compiles to a
working executable. Otherwise compilation fails at the 'ld -r' step.
One question and one request:
- What, if anything, has prevented this patch from being merged in the
master branch upstream?
- In the meantime, could we include this patch in the Linaro binutils
package and releases?
Having this available in our toolchain releases would greatly simplify
the LTO related work on the kernel. It was included in all binutils
releases from H.J. Lu since 2013 and therefore has obtained significant
exposure already.
Thanks.
Nicolas
== This Week ==
* TCWG-72 (2/10)
- Trying to address segfault with DImode
* LTO ICE with 483.xalancbmk (6/10)
- fighting with benchmark scripts
- segfaults with ICE with -flto-partition=none
on symbol _ZThn8_N10xalanc_1_819XercesParserLiaison11resetErrorsEv
demangler.com says the symbol is:
non-virtual thunk to xalanc_1_8::XercesParserLiaison::resetErrors()
./-lm.res (resolution map file) says PREVAILING_DEF_IRONLY
- Trace: http://pastebin.com/vxzmQFHg
- Possible fix: http://pastebin.com/rwq8z1N1
Patch passes test and bootstrap, not sure if that's correct approach however.
* Target hook conversion (2/10)
- Unconditionalizing ASM_OUTPUT_DEF on SET_ASM_OP and converting
both to hooks.
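The thunk symbol in the xalancbmk item above can also be decoded by hand.
Under the Itanium C++ ABI mangling scheme, _ZTh introduces a non-virtual
thunk, followed by the this-pointer adjustment (an 'n' prefix marks a negative
value), an underscore, and the encoding of the target function. The Python
sketch below handles only this simple shape (a nested name made of plain
identifiers). It is an illustration with a made-up function name, not a
general demangler.

```python
import re


def decode_nonvirtual_thunk(symbol):
    """Split a _ZTh symbol into (this adjustment, qualified name, parameters)."""
    match = re.match(r"_ZTh(n?)(\d+)_N(.+)$", symbol)
    if match is None:
        raise ValueError("not a non-virtual thunk to a nested name")
    negative, digits, rest = match.groups()
    offset = -int(digits) if negative else int(digits)

    parts = []
    pos = 0
    # A nested name is a run of <length><identifier> components closed by 'E'.
    while pos < len(rest) and rest[pos] != "E":
        end = pos
        while end < len(rest) and rest[end].isdigit():
            end += 1
        if end == pos:
            raise ValueError("unsupported name component at %r" % rest[pos:])
        length = int(rest[pos:end])
        parts.append(rest[end:end + length])
        pos = end + length
    if pos >= len(rest):
        raise ValueError("nested name is not terminated by 'E'")
    # Whatever follows 'E' is the still-mangled parameter list ('v' is void).
    return offset, "::".join(parts), rest[pos + 1:]
```

Applied to the symbol above, this returns
(-8, 'xalanc_1_8::XercesParserLiaison::resetErrors', 'v'): a thunk that adjusts
the this pointer by -8 before calling resetErrors(), which takes no arguments.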
NB Last _6_ working days.
Centralised benchmark source - TCWG-354 [6/10]
* Abe integration
* Wrote up some notes on collaborate
* Enabled clang build, which flushed out some bugs, now fixed
* LAVA reporting script
Port to microinstance - TCWG-432 [1/10]
* Investigating issues with Juno boot
** One turns out to be user error
** The other is a failed health check, passed to the admins
Backport benchmarking - TCWG-352 [2/10]
* Difficulties with passing information from matrix children to parent
** Came up with an ugly workaround, may have thought of a better one
Misc [3/10]
=Plan=
Holidays until 4th January
# Progress #
* TCWG-156, GDB Test-Suite Parity Between Aarch64 and x86_64. Done. [4/10]
After two patches were committed, the test results for aarch64 and x86_64
show no difference, apart from some tests that are unnecessarily written
for x86_64 only.
* TCWG-424, timeout when interrupting the inferior in remote debugging. [3/10]
The failure is caused by two different problems. Two patches are ready
and are being regression tested.
* TCWG-171, Enable gdb core file tests when testing remotely. [1/10]
Wrote down my conclusion that it can't be fixed.
* Upstream review, [2/10]
** Review patch about handling ada aarch64 HVA array in GDB.
** Discuss target description of GDB for cortex-m device with openocd.
# Plan #
* Post patches upstream for TCWG-424,
* Patches review.
* On holiday from Wed.
--
Yao
Hi All,
I am interested in understanding Linaro LLVM activity.
I have already gone through
https://wiki.linaro.org/WorkingGroups/ToolChain/LLVM.
Could you please help me with the questions below?
1. Which LLVM and clang versions is Linaro actively working on now?
2. Where can I find the latest "linaro-llvm" source code and binaries? I could
not find any official git repo for "linaro-llvm" at https://git.linaro.org/.
3. Could you please explain the Linaro LLVM working model? How
similar or different is it compared with the Linaro GCC engagement?
4. Certain links (e.g. Roadmap) at
https://wiki.linaro.org/WorkingGroups/ToolChain/LLVM ask for login
credentials. Any comment on how to obtain permission?
Thanks in advance for your time.
--
with regards,
Virendra Kumar Pathak
* 1 day off (Friday) (2/10)
== Progress ==
* Validation: (2/10)
- improvements to the comparison scripts
- updated ABE stable and staging branches
* GCC: (3/10)
- TCWG-484: sent a new version of the patch to address problems with
the testsuite and target attributes.
- TCWG-485/PR68620: WIP, trying to understand the regressions
introduced by my patch
* Misc (conf calls, meetings, emails, ...) (3/10)
== Next ==
Holidays until next year
Hi All,
As per the release notes, linaro_binutils-2_25-2015_01-2_release should be
present at http://git.linaro.org/toolchain/binutils-gdb.git.
But I am not able to see any "linaro_binutils" tags at
https://git.linaro.org/toolchain/binutils-gdb.git/tags.
Am I making a mistake, or has it been renamed to something else in git (the
same as the FSF binutils tags)?
Apologies for asking such a trivial question.
Thanks for your time.
--
with regards,
Virendra Kumar Pathak
The Linaro Toolchain Working Group (TCWG) is pleased to announce the
2015.12 snapshot of the Linaro GCC 5 source package.
This monthly snapshot[2] is based on FSF GCC 5.3+svn231642 and
includes performance improvements and bug fixes backported from
mainline GCC. This snapshot's contents will be part of the 2015.11
stable[1] quarterly release.
This snapshot tarball is available on:
http://snapshots.linaro.org/components/toolchain/gcc-linaro/5.3-2015.12/
Interesting changes in this GCC source package snapshot include:
* Updates to GCC 5.3+svn231642
* Backport of [Bugfix] [AArch32] 1/4 PR63870 Add qualifiers for NEON builtins
* Backport of [Bugfix] [AArch32] 2/4 PR63870 Mark lane indices of
vldN/vstN with appropriate qualifier
* Backport of [Bugfix] [AArch32] 3/4 PR63870 Remove error for invalid
lane numbers
* Backport of [Bugfix] [AArch32] 4/4 PR63870 Remove xfails for ARM targets
* Backport of [Bugfix] [AArch32] PR67305, tighten
neon_vector_mem_operand on eliminable registers
* Backport of [Bugfix] [AArch32] remove unused variable
* Backport of [Bugfix] [AArch64] PR 63304 Fix issue with global state
* Backport of [Bugfix] [AArch64] PR63304 Handle literal pools for
functions > 1 MiB in size
* Backport of [Bugfix] [AArch64] PR 68088: Fix RTL checking ICE due to
subregs inside accumulator forwarding check
* Backport of [Bugfix] [AArch64] PR rtl-optimization/67218
* Backport of [Bugfix] PR 56036 fix typo in doc
* Backport of [Bugfix] PR rtl-optimization/68236: Exit early from
autoprefetcher lookahead if not in haifa sched
* Backport of [Bugfix] PR tree-optimization/68234 Improve range info
for loop Phi node
* Backport of [AArch32] 1/4 Change GET_MODE_INNER to always return a
non-void mode
* Backport of [AArch32] 2/3 Make if_neg_move and if_move_neg into insn_and_split
* Backport of [AArch32] 2/4 Control the FMA steering pass in tuning
structures rather than as core property
* Backport of [AArch32] 3/3 Expand mod by power of 2
* Backport of [AArch32] 3/4 Replace the pattern GET_MODE_BITSIZE
(GET_MODE_INNER (m)) with GET_MODE_UNIT_BITSIZE (m)
* Backport of [AArch32] 4/4 Fix - Introduce new inline functions for
GET_MODE_UNIT_SIZE and GET_MODE_UNIT_PRECISION
* Backport of [AArch32] 4/4 Introduce new inline functions for
GET_MODE_UNIT_SIZE and GET_MODE_UNIT_PRECISION
* Backport of [AArch32/AArch64] 2/2 Add a new Cortex-A53 scheduling model
* Backport of [AArch32] Add missing v8a cpus to the t-aprofile file
* Backport of [AArch32] Fix checking RTL error in cortex_a9_sched_adjust_cost
* Backport of [AArch32] Fix for testcase after r228661
* Backport of [AArch32] Initialise cost to COSTS_N_INSNS (1) and
increment in arm rtx costs
* Backport of [AArch32] libgcc: include crtfastmath
* Backport of [AArch32] Unified assembler in ARM state
* Backport of [AArch64] 1/2 Give AArch64 ROR (Immediate) a new type attribute
* Backport of [AArch64] 1/2 Rename SYMBOL_SMALL_GOT to SYMBOL_SMALL_GOT_4G
* Backport of [AArch64] 1/2 Rename test source file for reuse
* Backport of [AArch64] 1/3 Add the option -mtls-size
* Backport of [AArch64] 1/3 Expand signed mod by power of 2 using CSNEG
* Backport of [AArch64] 2/2 Implement -fpic for -mcmodel=small
* Backport of [AArch64] 2/2 Implement TLS IE for tiny model
* Backport of [AArch64] 2/3 Rename SYMBOL_TLSLE to SYMBOL_TLSLE24
* Backport of [AArch64] 3/3 Implement local executable mode for all memory model
* Backport of [AArch64] Add initial qualcomm support
* Backport of [AArch64] Add missing entries in iterator vwcore
* Backport of [AArch64] Cleanup whitespace in aarch64.c
* Backport of [AArch64] Cortex-A57 Choose some new branch costs
* Backport of [AArch64] Define TARGET_UNSPEC_MAY_TRAP_P
* Backport of [AArch64] Distinct costs for sign and zero extension
* Backport of [AArch64] Do not ICE after apologising for -mcmodel=large -fPIC
* Backport of [AArch64] Don't allow -mgeneral-regs-only to change the
.arch assembler directives
* Backport of [AArch64] Don't transform sign and zero extends inside mults
* Backport of [AArch64] Fall back to -fPIC if no support of -fpic in binutils
* Backport of [AArch64] Fix for branch offsets over 1 MiB
* Backport of [AArch64] Fix ICE on (const_double:HF 0.0)
* Backport of [AArch64] Fix insn types
* Backport of [AArch64] Fix output assembly bug under TLSIE ILP32
* Backport of [AArch64] Handle vector float modes properly in
aarch64_output_simd_mov_immediate
* Backport of [AArch64] Mark GOT related MEM rtx as const to help RTL loop IV
* Backport of [AArch64] Properly handle simple arith+extend ops in rtx costs
* Backport of [AArch64] Rename SYMBOL_SMALL_GOTTPREL to SYMBOL_SMALL_TLSIE
* Backport of [AArch64] Restrict pic-small.c by new test directive
* Backport of [AArch64] Tighten direct call pattern for sibcall to
repair -fno-plt
* Backport of [AArch64] Tighten direct call pattern to repair -fno-plt
* Backport of [Testsuite] [AArch32] Fix thumb2-slow-flash-data.c failures
* Backport of [Testsuite] [AArch32] Switch ARM to unified asm
* Backport of [Testsuite] [AArch64] Add more TLS local executable testcases
* Backport of [Testsuite] [AArch64] Check branch types for noplt testcases
* Backport of [Testsuite] [AArch64] Fix gcc.target/aarch64/vclz.c
* Backport of [Testsuite] [AArch64] Fix some target attribute inlining
tests for -fPIC
* Backport of [Testsuite] [AArch64] Restrict got_mem_hoist_1.c with
small memory model
* Backport of [Testsuite] [AArch64] Skip tiny and large code model on
gcc.target/aarch64/pic-small.c
* Backport of [Testsuite] Add --param
sra-max-scalarization-size-Ospeed to sra-12.c
* Backport of [Testsuite] Fix target selector in
gcc.target/i386/noplt-[1234].c testcases
* Backport of [Misc] Allow sibcalls in no-PLT PIC
* Backport of [Misc] Eliminate PLT stubs for specified external
functions via -fno-plt
* Backport of [Misc] Fix for ICE with -g on testcase with incomplete types
* Backport of [Misc] Fix memory leak and wrong invariant dependence
computation in IVOPT
* Backport of [Misc] Improve rtl loop inv cost by checking if the inv
can be propagated to address uses
* Backport of [Cleanup] [AArch32/AArch64] fix ChangeLog
* Backport of [cleanup] [AArch32] Remove uses of CONST_DOUBLE_HIGH/LOW
* Backport of [Cleanup] [AArch64] Delete aarch64_symbol_context which
is not used
* Backport of [cleanup] [AArch64] Move iterators from atomics.md to iterators.md
* Backport of [cleanup] [AArch64] Remove uses of CONST_DOUBLE_HIGH,
CONST_DOUBLE_LOW
* Backport of [Doc] [AArch64] Document several AArch64-specific test directives
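As background for the "Expand signed mod by power of 2" backports listed
above: a signed remainder by a power of two can be computed without a divide
by masking both the value and its negation, then selecting (and negating) the
right one depending on the sign. The Python sketch below models that
arithmetic only. The function name is made up for illustration, Python
integers are unbounded so fixed-width corner cases are not modelled, and this
is not the code from the backported patches.

```python
def signed_mod_pow2(x, k):
    """C-style x % (1 << k): the result takes the sign of the dividend."""
    mask = (1 << k) - 1
    pos = x & mask       # remainder when x is non-negative
    neg = (-x) & mask    # remainder of |x| when x is negative
    # Conditional select-and-negate: keep pos, or use -neg for negative x.
    return pos if x >= 0 else -neg
```

For example, signed_mod_pow2(-7, 2) gives -3 and signed_mod_pow2(7, 2) gives 3,
matching the C results of -7 % 4 and 7 % 4.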
Feedback and Support
Subscribe to the important Linaro mailing lists and join our IRC
channels to stay on top of Linaro development.
** Linaro Toolchain Development mailing list:
http://lists.linaro.org/mailman/listinfo/linaro-toolchain
** Linaro Toolchain IRC channel on irc.freenode.net at #linaro-tcwg
* Bug reports should be filed in bugzilla against GCC product:
http://bugs.linaro.org/enter_bug.cgi?product=GCC
* Interested in commercial support? Inquire at Linaro support:
mailto:support@linaro.org
[1]. Stable source package releases are defined as releases where the
full Linaro Toolchain validation plan is executed.
[2]. Source package snapshots are defined when the compiler is only
put through unit-testing and full validation is not performed.
Centralised benchmark source - TCWG-354 [6/10]
* Understood CoremarkPro run rules/behaviour better
* Experimented with shortening runs while still giving meaningful results
* Cleaned up build/run scaffolding
Port to microinstance - TCWG-432 [1/10]
* Got access to microinstance, learned a bit about SSL in the process
Backport benchmarking - TCWG-352 [2/10]
* Fixed 'build with sysroot' code for current deliverable shape
* Fixed a problem with multiple manifests
=Plan=
Review, test, debug build-triggers-benchmark job
Finish off Coremark Pro integration
Review security with shared uinstance/main instance code
Expose more data, benchmarks to bundles
Debug/test Jenkins job in microinstance
Create bootable image for at least 1 target, or know what the problems are
Write up noise control report (if time)
More support for SPEC-on-Android?
=Absence=
Holiday 22nd December - 1st January
== This Week ==
* TCWG-72 (2/10)
- Another iteration.
* LTO (4/10)
- Building spec with different combinations of -flto -flto-partition
-flto-compression-level
- Builds cleanly on AArch64, ICEs on ARM for 447.dealII with -flto
- Looked at segfault with 483.xalancbmk
* TCWG-319 benchmarking (1/10)
- fp benchmark results on:
r1-a12: -0.04 %
cortex-a53: -0.09 %
cortex-a57: -0.14%
- int benchmarks completed running for reference revision on cortex-a15
- int benchmarks with-patch running on cortex-a15
* TCWG-310 benchmarking (1/10)
- r1-a12: 0.11%
- Benchmark jobs failed on a53, a57.
* Misc (2/10)
- Submitted patch to fix building 450.soplex
- Looked at how to write ipa pass, ipa-pure-const and ipa-cp.
- Meetings
== Next Week ==
- tcwg-319, tcwg-310 benchmarking
- tcwg-72
- LTO
== This week ==
* Bugzilla 68543 - [AArch64] Implement overflow arithmetic standard
names (7/10)
- Implemented signed and unsigned add, subtract, and multiply
overflow standard patterns
- Investigated testing of overflow patterns
* Misc (3/10)
- Conference calls
- Misc ARM Housekeeping
- Lost connectivity due to old ARM Unix VPN Client
== Next week ==
* TCWG-317 - Exploit wide add operations when appropriate for AArch32
- Minor code update to address upstream comment
* Vacation for the rest of the year
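For readers unfamiliar with the overflow work in the report above: at the source level, overflow-checked arithmetic is requested through the GCC `__builtin_*_overflow` builtins (available since GCC 5), and the standard pattern names tracked in Bugzilla 68543 are presumably what lets the AArch64 back end expand them efficiently. The sketch below only shows the source-level side; the helper names `checked_add` and `checked_umul` are hypothetical and not taken from the patch.

```c
#include <limits.h>
#include <stdbool.h>

/* Signed add that reports overflow instead of invoking undefined
   behaviour. Returns true when *sum holds the exact result.
   checked_add is a hypothetical helper name. */
bool checked_add(int a, int b, int *sum)
{
    /* The builtin returns true when the operation overflowed. */
    return !__builtin_add_overflow(a, b, sum);
}

/* Unsigned multiply with the same convention.
   checked_umul is a hypothetical helper name. */
bool checked_umul(unsigned a, unsigned b, unsigned *prod)
{
    return !__builtin_mul_overflow(a, b, prod);
}
```

A caller would test the return value before using the result, for example rejecting `checked_add(INT_MAX, 1, &s)`.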
== Progress ==
* GCC (4/10)
- pr68620 (fp16 transfers in big-endian mode)
- checking regressions observed on trunk
- had to revert my cleanup patch for target
attributes tests, not sure how to handle all
possible combinations of options/defaults
* Validation (1/10)
- small improvements in reporting
* Misc (conf calls, meetings, emails, ...) (3/10)
* Internal training (2/10)
== Next ==
* Validation
* GCC: bug fixes/cleanup
* One day off on Wed. [2/10]
# Progress #
* Enable gdb core file tests when testing remotely, TCWG-171.
I have almost concluded that core file remote testing can't be
done due to limitations in dejagnu and the NFS mount testing
infrastructure. Need to write them down. [2/10]
* Multi-arch follow-up work, teach AArch64 GDBserver to understand ARM
breakpoint instructions. TCWG-460. Done. [3/10]
* Review ARM GDBserver software single step patch V7. Almost OK, except
some small things. [2/10]
* Misc, meetings, email. [1/10]
# Plan #
* TCWG-424, Draft a fix for the failure in random-signal.exp.
* TCWG-156, GDB test parity between AArch64 and X86_64.
* One day off on Wed. or Fri.
Planned absence:
* Dec 24-Jan 3.
--
Yao
== Progress ==
- PR66726 (2/10)
* Testing a patch
- PR63586 (2/10)
* Posted a patch
* Revised the patch based on testing
- LuaJIT (2/10)
* Set up nginx
* Still haven't figured out how to use mongodb with nginx (config
required).
- Misc (2/10)
* gcc/bug list
* LTO
- sick (2/10)
== Plan ==
* bug reports
* LTO
Hi,
when working with the Linaro patches I found that a particular commit
breaks our aarch64 kernel build.
The patch in question is this one:
commit be09330da9d0777c4a58568d137e3f8a3dbe0a0b
Author: Yvan Roux <yvan.roux(a)linaro.org>
Date: Tue Oct 27 21:18:19 2015 +0100
One of the things it attempts to change apparently is moving the .arch
specifiers in the assembler file from a global scope to individual
functions. What also happens though is that they seem to lose some
information after that transformation.
I observed that when building arch/arm64/crypto/aes-ce-cipher.c from
the Linux kernel. This code contains inline assembly like this:
static void aes_cipher_decrypt(struct crypto_tfm *tfm, u8 dst[], u8 const src[])
{
struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
struct aes_block *out = (struct aes_block *)dst;
struct aes_block const *in = (struct aes_block *)src;
void *dummy0;
int dummy1;
kernel_neon_begin_partial(4);
__asm__(" ld1 {v0.16b}, %[in] ;"
" ld1 {v1.2d}, [%[key]], #16 ;"
" cmp %w[rounds], #10 ;"
" bmi 0f ;"
" bne 3f ;"
" mov v3.16b, v1.16b ;"
" b 2f ;"
"0: mov v2.16b, v1.16b ;"
" ld1 {v3.2d}, [%[key]], #16 ;"
"1: aesd v0.16b, v2.16b ;"
" aesimc v0.16b, v0.16b ;"
"2: ld1 {v1.2d}, [%[key]], #16 ;"
" aesd v0.16b, v3.16b ;"
" aesimc v0.16b, v0.16b ;"
"3: ld1 {v2.2d}, [%[key]], #16 ;"
" subs %w[rounds], %w[rounds], #3 ;"
" aesd v0.16b, v1.16b ;"
" aesimc v0.16b, v0.16b ;"
" ld1 {v3.2d}, [%[key]], #16 ;"
" bpl 1b ;"
" aesd v0.16b, v2.16b ;"
" eor v0.16b, v0.16b, v3.16b ;"
" st1 {v0.16b}, %[out] ;"
: [out] "=Q"(*out),
[key] "=r"(dummy0),
[rounds] "=r"(dummy1)
: [in] "Q"(*in),
"1"(ctx->key_dec),
"2"(num_rounds(ctx) - 2)
: "cc");
kernel_neon_end();
}
Without this patch the compiler behaved as follows. It was
invoked with:
aarch64-linux-gnu-gcc -Wp,-MD,arch/arm64/crypto/.aes-ce-cipher.o.d
-nostdinc -isystem
/var/fpwork/rschiele/crossbuild/builds/aarch64-linux-gnu/linux-next/srcdir/bin/../lib/gcc/aarch64-linux-gnu/5.2.1/include
-I/var/fpwork/rschiele/crossbuild/builds/aarch64-linux-gnu/linux-next/srcdir/src/linux/arch/arm64/include
-Iarch/arm64/include/generated/uapi -Iarch/arm64/include/generated
-I/var/fpwork/rschiele/crossbuild/builds/aarch64-linux-gnu/linux-next/srcdir/src/linux/include
-Iinclude -I/var/fpwork/rschiele/crossbuild/builds/aarch64-linux-gnu/linux-next/srcdir/src/linux/arch/arm64/include/uapi
-Iarch/arm64/include/generated/uapi
-I/var/fpwork/rschiele/crossbuild/builds/aarch64-linux-gnu/linux-next/srcdir/src/linux/include/uapi
-Iinclude/generated/uapi -include
/var/fpwork/rschiele/crossbuild/builds/aarch64-linux-gnu/linux-next/srcdir/src/linux/include/linux/kconfig.h
-I/var/fpwork/rschiele/crossbuild/builds/aarch64-linux-gnu/linux-next/srcdir/src/linux/arch/arm64/crypto
-Iarch/arm64/crypto -D__KERNEL__ -mlittle-endian -Wall -Wundef
-Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common
-Werror-implicit-function-declaration -Wno-format-security -std=gnu89
-mgeneral-regs-only -fno-delete-null-pointer-checks -O2
--param=allow-store-data-races=0 -Wframe-larger-than=2048
-fno-stack-protector -Wno-unused-but-set-variable
-fno-omit-frame-pointer -fno-optimize-sibling-calls
-fno-var-tracking-assignments -g -Wdeclaration-after-statement
-Wno-pointer-sign -fno-strict-overflow -fconserve-stack
-Werror=implicit-int -Werror=strict-prototypes -Werror=date-time
-DCC_HAVE_ASM_GOTO -Werror -march=armv8-a+crypto
-D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(aes_ce_cipher)"
-D"KBUILD_MODNAME=KBUILD_STR(aes_ce_cipher)" -c -o
arch/arm64/crypto/aes-ce-cipher.o
/var/fpwork/rschiele/crossbuild/builds/aarch64-linux-gnu/linux-next/srcdir/src/linux/arch/arm64/crypto/aes-ce-cipher.c
As a result it created a file for the assembler with the global
.arch armv8-a+fp+simd+crypto
at the beginning of the file.
After the patch it created individual
.arch armv8-a
at individual places.
It is not clear to me why the extensions (fp+simd+crypto) got lost.
Is that intended, so that code using those extensions in inline
assembly needs special adaptation, or is the loss of the extensions a
bug in that patch?
Greetings!
Robert
== Progress ==
o Linaro GCC (6/10)
* More backports
* Compared our various flavors of validation results
* Looked at bug reports and mailing list questions on our last snapshot.
o Upstream work (2/10)
* Continue on sanitizing gfortran testsuite
o Misc (2/10)
* Various meetings
* Discussed benchmarking infra with Bernie
== Plan ==
o GCC 5.3 branch merge
o Continue on-going tasks
o Tuesday Off
Holiday [2/10]
Port to microinstance - TCWG-432 [1/10]
* Set up reporting for CPU2006
* Learned how to generate metadata, but not how to use it
Trigger benchmarks on backports - TCWG-352 [2/10]
* Figured out the rough shape
* Created, didn't test, rough implementation
TCWG-354 [3/10]
* Build/run scaffolding for CoreMark Pro
* Working for manual runs
Misc [2/10]
* Input validation for dispatcher script
* Meeting with Ade on LAVA benchmarking
* Meetings/mail/etc
=Plan=
Review, test, debug build-triggers-benchmark job
Check CoreMark Pro run configuration, enable for automatic runs
Review security with shared uinstance/main instance code
Expose more data, benchmarks to bundles
Debug/test Jenkins job in microinstance
Create bootable image for at least 1 target, or know what the problems are
Write up noise control report (if time)
More support for SPEC-on-Android?
* TCWG-72 (2/10)
- Added new target hook to generate target-specific divmod libfunc
- Builds cleanly now on x86_64, arm and arm-linux-gnueabihf
- Sent to tcwg list for review
* LTO spec2k6 build (2/10)
- Built spec2k6 with LTO
* Target hook (4/10)
- Completed with ASM_OUTPUT_LABEL_REF
- In progress - SIZE_ASM_OP to data hook, ASM_OUTPUT_SIZE_DIRECTIVE,
ASM_OUTPUT_MEASURED_SIZE
* Benchmarking (1/10)
- tcwg-319: Job in progress for fp benchmarks with patch
- tcwg-310 (loop peeling): Submitted job for running 252.eon
- Both the jobs failed due to lab downtime, need to be re-run.
- Received job template from Bernie for running benchmarks on cortex-a15.
* Misc (1/10)
- Meetings
== Next Week ==
- Continue benchmarking spec2k6 with LTO
- Look at bugs exposed by spec2k6 LTO build
- Benchmarking tcwg-319, tcwg-310
== This week ==
* TCWG-317 - Exploit wide add operations when appropriate for AArch32 (0/10)
- No comments/review upstream; will ping for an update
* TCWG-316 - Exploit vector multiply by scalar instructions (4/10)
- Code improvements will require standard name for vectorizer and new
patterns
- On hold until GCC 6 is released
* Bugzilla 68543 - [AArch64] Implement overflow arithmetic standard
names (3/10)
- Initial investigation
* Bugzilla 68532 - [ARM] Incorrect code result on arm big endian (2/10)
- Investigation into understanding how vectorizer represents lanes vs
arm big endian back end
- Solution suspended until I can coordinate with Charlie
* Misc (1/10)
- Conference calls
== Next week ==
* Bugzilla 68543 - Implement add and subtract overflow operations and test
* TCWG-317 - Ping upstream and respond to upstream feedback
* Bugzilla 68532 - Coordinate with Charlie
== Progress ==
* Ill (4/10)
* Support (1/10)
- Bugzilla issues (PR20490, PR24635, PR24350, PR20025, PR25720, PR25722)
* Benchmarks (1/10)
- Checking some previous benchmark results on A57
* Buildbots (2/10)
- Getting AArch64 full bot back to rotation, since it's stable now
- Re-enabling libc++ prototypes on local master
- Bisecting broken test-suite
- Another power cut in the office sent all the bots down... :(
* Background (2/10)
- Code review, meetings, discussions, general support, etc.
- Validating some old sanitizer bugs
- FOSDEM admin
* One day off on Monday.
# Progress #
* Answer ST questions about supporting multi-arch with ST jtag probe.
[1/10]
* TCWG-171, Enable gdb core file tests when testing remotely, [3/10].
Ongoing.
* Run gdb.base/sizeof.exp with board having gdb,noinferiorio. Done.
[1/10]
* TCWG-460, multi-arch follow-up work, teach AArch64 GDBserver to
understand ARM breakpoint instructions. Patch is approved. [2/10].
* TCWG-424, fail in gdb.base/random-signal.exp. [1/10] Root cause is
identified, need to figure out how to fix it in next step.
* Review ARM GDBserver software single step patch V4.
# Plan #
* TCWG-171, TCWG-156, TCWG-424.
* Review ARM GDBserver software single step patch V5, which should be
the final version, I hope.
--
Yao
== Progress ==
* Validation
- a few cleanup patches in the comparison scripts
- contributed to debugging pty allocation problems:
the tests pass when executed outside of our schroots.
- improvements in the reports from the validation
done in the ST Compute Farm
- reported a few regressions
* GCC
- cleanup patch for target attributes tests,
- pr68620 (fp16 transfers in big-endian mode)
* Misc (conf calls, meetings, emails, ...)
== Next ==
* Validation: monitoring, improvements
* GCC: bug fixes, cleanup
Hi Linaro Toolchain Group,
I am new to GCC development and have some basic questions about its
development process.
Could you please give some insight on the questions below? (Apologies if
they are very trivial.)
I have read https://gcc.gnu.org/develop.html
If I am correct, gcc trunk is on gcc 6.0.0 (stage 3) at present and will
become 6.0.1 (regression fixes only) in January 2016.
gcc 6.0.1 will be released as gcc 6.1.0 in April, 2016 and from there
onwards gcc 6 release branch will start.
However, there is also a gcc 5 branch in parallel whose current version is
5.2.1 and will be released as gcc 5.3 soon.
Hopefully gcc 5.3 would be the last release in gcc 5 series. (Please
correct me if I am wrong).
[Questions]
1. What is the difference between experimental(gcc 6.0.0 stage 3) & gcc
release branch (gcc 5.2.1)?
Is there any rule which decides which changes will go where?
If I have some patches for a new aarch64 processor at present, into
which branch would these changes be merged (assuming they pass review)?
2. How are the sub-versions of release branches decided? Is it correct to
say that there will always be 3 sub-versions of any release branch (e.g. gcc
5.1, gcc 5.2 & gcc 5.3)?
3. What is the working model between GNU GCC and Linaro GCC? Does Linaro
directly accept patches? or they need to go to GNU GCC first?
Thanks in advance for your time.
with regards,
Virendra Kumar Pathak
Keeping linaro-toolchain in the loop.
Robert
---------- Forwarded message ----------
From: Robert Schiele <rschiele(a)gmail.com>
Date: Thu, Dec 3, 2015 at 4:44 PM
Subject: Re: Lost upstream patch in merge from gcc-5-branch to
linaro/gcc-5-branch
To: Yvan Roux <yvan.roux(a)linaro.org>
Hi Yvan,
On Thu, Dec 3, 2015 at 1:25 PM, Yvan Roux <yvan.roux(a)linaro.org> wrote:
> This fix on gcc-5-branch doesn't apply on Linaro 5 branch, because we
> have backported trunk revision 222624 (which renames maybe_fma to
> compound_p) into it. So, our branch has the same code as trunk
> regarding aarch64_rtx_costs. Do you experience any issues related to
> this change?
No issues. This was just a theoretical thought and through our CI
build I learned exactly what you just told me the hard way now.
Sorry for the noise.
Robert
abe.sh in the ABE framework accepts a parameter to set the wget timeout
when it fetches snapshots (default 10s); however that parameter has an
upper threshold of 10 seconds (condition at line 996 only sets timeout
to specified value if < 11). Is this intentional? It seems like it would
make more sense to give it a floor instead of a ceiling or perhaps not
limit the range of potential values at all.
Best regards,
Chris Roberts
Hi guys,
Sorry if my question is a stupid one; I am not a toolchain guy.
I have no idea why ld.so searches so many paths. For example, pass
"-rpath" with /home/cnb1szh/test when linking a simple test program. Then,
during dynamic linking at runtime, we get the following debug
output:
30693: find library=libmytest.so [0]; searching
30693: search path=/home/cnb1szh/test/tls/v7l/neon/vfp:/home/cnb1szh/test/tls/v7l/neon:/home/cnb1szh/test/tls/v7l/vfp:/home/cnb1szh/test/tls/v7l:/home/cnb1szh/test/tls/neon/vfp:/home/cnb1szh/test/tls/neon:/home/cnb1szh/test/tls/vfp:/home/cnb1szh/test/tls:/home/cnb1szh/test/v7l/neon/vfp:/home/cnb1szh/test/v7l/neon:/home/cnb1szh/test/v7l/vfp:/home/cnb1szh/test/v7l:/home/cnb1szh/test/neon/vfp:/home/cnb1szh/test/neon:/home/cnb1szh/test/vfp:/home/cnb1szh/test
(RPATH from file ./hello)
30693: trying file=/home/cnb1szh/test/tls/v7l/neon/vfp/libmytest.so
30693: trying file=/home/cnb1szh/test/tls/v7l/neon/libmytest.so
30693: trying file=/home/cnb1szh/test/tls/v7l/vfp/libmytest.so
30693: trying file=/home/cnb1szh/test/tls/v7l/libmytest.so
30693: trying file=/home/cnb1szh/test/tls/neon/vfp/libmytest.so
30693: trying file=/home/cnb1szh/test/tls/neon/libmytest.so
30693: trying file=/home/cnb1szh/test/tls/vfp/libmytest.so
30693: trying file=/home/cnb1szh/test/tls/libmytest.so
30693: trying file=/home/cnb1szh/test/v7l/neon/vfp/libmytest.so
30693: trying file=/home/cnb1szh/test/v7l/neon/libmytest.so
30693: trying file=/home/cnb1szh/test/v7l/vfp/libmytest.so
30693: trying file=/home/cnb1szh/test/v7l/libmytest.so
30693: trying file=/home/cnb1szh/test/neon/vfp/libmytest.so
30693: trying file=/home/cnb1szh/test/neon/libmytest.so
30693: trying file=/home/cnb1szh/test/vfp/libmytest.so
30693: trying file=/home/cnb1szh/test/libmytest.so
But we don't have /home/cnb1szh/test/tls/, /home/cnb1szh/test/v7l/,
/home/cnb1szh/test/vfp/ or /home/cnb1szh/test/neon/. Why does ld.so
search so many paths?
-barry
Hi,
I found that with the merge
commit ac19ac6481a3f326d9f41403f5dadab548b2c8a6
Author: Yvan Roux <yvan.roux(a)linaro.org>
Date: Wed Sep 16 10:57:42 2015 +0200
Merge branches/gcc-5-branch rev 227732.
Change-Id: I2f59904b28323b1c72a8cf1bd62c9e460d95bcea
the following commit, which was within the merge range on gcc-5-branch,
was lost on the Linaro branch:
commit b45a5cf7c1544f95578e823e25402b58fb3fbedd
Author: nsz <nsz@138bc75d-0d04-0410-961f-82ee72b054a4>
Date: Tue Aug 4 16:49:54 2015 +0000
Fix broken backport patch.
gcc:
Backport from mainline:
2015-08-04 Szabolcs Nagy <szabolcs.nagy(a)arm.com>
PR target/66731
* config/aarch64/aarch64.c (aarch64_rtx_costs): Fix NEG cost for FNMUL.
(aarch64_rtx_mult_cost): Fix MULT cost with -frounding-math.
git-svn-id:
svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-5-branch@226588
138bc75d-0d04-0410-961f-82ee72b054a4
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 691874b..eebc9c3 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -5250,7 +5250,7 @@ aarch64_rtx_mult_cost (rtx x, int code, int outer, bool speed)
which case FNMUL is different than FMUL with operand negation. */
bool neg0 = GET_CODE (op0) == NEG;
bool neg1 = GET_CODE (op1) == NEG;
- if (compound_p || !flag_rounding_math || (neg0 && neg1))
+ if (maybe_fma || !flag_rounding_math || (neg0 && neg1))
{
if (neg0)
op0 = XEXP (op0, 0);
Since this was a fix to the patch one commit before it, which was merged
in the same operation, and there is no further comment on why this fix
was skipped, may I assume that this happened by accident and that you
probably want to fix that merge flaw by reapplying the missing patch? Or
is there some detail I am missing that requires this fix to be skipped
on the Linaro branch?
Robert
Controlled image builds - TCWG-360 [1/10]
* Tried, failed to generate bootable images
Jenkins benchmarking job - TCWG-348 [5/10]
* Jenkins job functional on kvms in main instance
* Wrote job-dispatch script for non-Jenkins use cases
Juno crashdump [1/10]
* Struggled with, worked around network problems
* Juno running, waiting for it to crash
SPEC-on-Android [1/10]
* Helped Qian to find root cause of a problem
Misc [2/10]
=Plan=
Review security with shared uinstance/main instance code
Expose more data, benchmarks to bundles
Debug/test Jenkins job in microinstance
Create bootable image for at least 1 target, or know what the problems are
Write up noise control report (if time)
Probably more support for SPEC-on-Android
# Progress #
* Fix timeout in random-signal.exp. TCWG-424. Ongoing. [4/10]
I know the cause of the problem but can't figure out why it can
happen. Can only reproduce the failure once in every 20-30 runs.
* TCWG-423, support gnu vector in inferior call in AArch64 GDB. Done.
[1/10]
* Build native ARM and AArch64 GDB in C++. No regressions in testing.
TCWG-446. Done. [1/10]
* Fix GDB internal error in gdb.thread/watchpoint-fork.exp on AArch64.
[2/10]
* Upstream patches review [2/10]
** Review patches on arm gdbserver software single step V3. Some
patches are approved, but V4 is also needed for the rest.
** Review all-stop on non-stop patches.
** Discuss whether we need single step in fast tracepoint. Ongoing.
# Plan #
* One day off on Monday.
* Understand ST's jtag probe and help them to make use of multi-arch
in GDB.
* Look into timeout in random-signal.exp.
* TCWG-156, GDB test parity between AArch64 and X86_64.
* TCWG-460, follow up on the AArch64 GDBserver multi-arch work.
--
Yao
== Progress ==
- tree re-assoc regression (2/10)
- Found a test-case to reproduce it.
- working on a patch
- LuaJIT issue (6/10)
* Set up luajit on aarch64 and tried it
* tried to reproduce nginx issue with various configs without success
- LTO (1/10)
* aarch64 bootstrap
* ran into an uninitialized-variable warning issue; looking into it
- Misc (1/10)
* gcc/bug list
== Plan ==
* tree re-assoc
* LTO
== Progress ==
* Support (6/10)
- Trying to add ADRL support in the assembler: http://llvm.org/PR24350
* Buildbots (1/10)
- Some breakages, nothing serious
* Background (3/10)
- Code review, meetings, discussions, general support, etc.
- Working on 2016's plan
== Progress ==
o Linaro GCC (4/10)
* Found and backported missing dependencies
* More backports
o Upstream work (4/10)
* Reviewed some patches
* Tried to reproduce an LRA issue (fixed in the meantime)
* Continue on sanitizing gfortran testsuite
o Misc (2/10)
* Various meetings
== Plan ==
o Continue on-going tasks
== This Week ==
* Target conversion to hook (2/10)
- Completed build and test for ASM_FORMAT_PRIVATE_NAME, ASM_LABEL_OUTPUT_LABEL
to hook
- Converted ASM_OUTPUT_LABEL_REF to hook
* Holidays (8/10)
- 3 days leave and public holiday (Guru Nanak Jayanti)
== Next Week ==
- Send updated patch upstream for tcwg-72
- LTO benchmarking with SPEC for correctness on arm and aarch64
- Target hook conversion
The Linaro Toolchain Working Group is pleased to announce the availability
of the Linaro Stable Binary Toolchain GCC 5.2-2015.11 Release Archives.
http://releases.linaro.org/components/toolchain/binaries/5.2-2015.11/
http://releases.linaro.org/components/toolchain/gcc-linaro/5.2-2015.11/
These archives provide cross-toolchain executables (compiler, debugger,
linker, etc.) and shared libraries (libstdc++, libc, etc.) that target ARM
or AArch64 GNU/Linux and bare-metal environments. The cross-toolchain
binaries execute on a Linux or MS Windows (under mingw32) host
operating-system.
Please evaluate this release-candidate for correctness. Linaro will
shortly spin the Linaro GCC 5.2-2015.11 release if this release-candidate
passes stakeholder validation.
For bugs related to this release-candidate please email
linaro-toolchain(a)lists.linaro.org or file a bug at
https://bugs.linaro.org/enter_bug.cgi?product=Linux%20Binary%20toolchain
NEWS
* GCC 5.2 2015.11
The Linaro GCC 5.2 2015.11 binary toolchain release is built from the
Linaro GCC-5.2-2015.11 release source archive. The Linaro GCC-5.2-2015.11
release source archive is derived from the same sources as the Linaro
GCC-5.2-2015.10 snapshot source archive.
* GCC 5.2 2015.11-rc1
The Linaro GCC 5.2 2015.11-rc1 binary toolchain release-candidate is built
from the Linaro GCC-5.2-2015.11 release-candidate source archive. The
Linaro GCC-5.2-2015.11-rc1 release-candidate source archive is derived from
the same sources as the Linaro GCC-5.2-2015.10 snapshot source archive.
--
Ryan S. Arnold
Linaro Toolchain Working Group - Engineering Manager
www.linaro.org
Hello,
I am trying to create gcc 4.9.x toolchains for ARM v7 and v8 based on Linaro's sources. At first Linaro's 4.9-2015.05 binary release looked suitable, but then one of my colleagues noticed that it had an incompatibility with Red Hat Enterprise Linux 6. Linaro has decided not to fix this incompatibility (see https://bugs.linaro.org/show_bug.cgi?id=1869 ).
So, I tried to work around that bug by rebuilding the toolchains myself on RHEL6 using Linaro's new ABE script. I initially tried to recreate the builds by using ABE's --manifest <manifest_file> command line option. I experienced problems with that, though, including it building gcc version 6.x instead of 4.9.x. I eventually gave up on that approach. Instead, I extracted the required branches and revisions from the manifest files and put them into ABE command line options, like this:
abe.sh --target aarch64-elf --build all --parallel --dump --tarball --release fsl-2015.11.16 --set libc=newlib binutils=binutils-gdb.git~linaro_binutils-2_24-branch@a93e252ee5250dba831e54f98336b40c7210dac7 gcc=gcc-linaro-4.9-2015.05 gmp=5.1.3 gdb=binutils-gdb.git~gdb-7.10-branch@ef5fa52ac9ab68b505b52acb2d2068b366ba8bf2 mpfr=3.1.2 mpc=1.0.1 newlib=newlib.git~linaro_newlib-branch@136b66e404df41435bdec4630c0787b0bc7e7580
abe.sh --target aarch64-linux-gnu --build all --parallel --dump --tarball --release fsl-2015.11.16 --set libc=glibc binutils=binutils-gdb.git~linaro_binutils-2_24-branch@a93e252ee5250dba831e54f98336b40c7210dac7 gcc=gcc-linaro-4.9-2015.05 gmp=5.1.3 gdb=binutils-gdb.git~gdb-7.10-branch@ef5fa52ac9ab68b505b52acb2d2068b366ba8bf2 mpfr=3.1.2 mpc=1.0.1 glibc=glibc-linaro-2.20-2014.11.tar.xz
abe.sh --target arm-eabi --build all --parallel --dump --tarball --release fsl-2015.11.16 --set libc=newlib binutils=binutils-gdb.git~linaro_binutils-2_24-branch@a93e252ee5250dba831e54f98336b40c7210dac7 gcc=gcc-linaro-4.9-2015.05 gmp=5.1.3 gdb=binutils-gdb.git~gdb-7.10-branch@ef5fa52ac9ab68b505b52acb2d2068b366ba8bf2 mpfr=3.1.2 mpc=1.0.1 newlib=newlib.git~linaro_newlib-branch@136b66e404df41435bdec4630c0787b0bc7e7580
abe.sh --target arm-linux-gnueabihf --build all --parallel --dump --tarball --release fsl-2015.11.16 --set libc=glibc binutils=binutils-gdb.git~linaro_binutils-2_24-branch@a93e252ee5250dba831e54f98336b40c7210dac7 gcc=gcc-linaro-4.9-2015.05 gmp=5.1.3 gdb=binutils-gdb.git~gdb-7.10-branch@ef5fa52ac9ab68b505b52acb2d2068b366ba8bf2 mpfr=3.1.2 mpc=1.0.1 glibc=glibc-linaro-2.20-2014.11.tar.xz
That worked, and the resulting toolchains ran without error under RHEL6. Note that I deliberately chose to switch to glibc in the *-linux-* toolchains, whereas the manifest files had them using eglibc.
At least one serious problem remained. The toolchains supported different multilibs than previous releases. For example, arm-eabi-gcc reported that it supported only three sets of libraries:
$ arm-eabi-gcc -print-multi-lib
.;
thumb;@mthumb
fpu;@mfloat-abi=hard
Linaro's 2015.05 build of the toolchain gives the same output. However, previous releases of this toolchain supported a much larger set of multilibs. A build from 2014.08 reports:
$ arm-none-eabi-gcc --print-multi-lib
.;
thumb;@mthumb
v7-a;@march=armv7-a
v7ve;@march=armv7ve
v8-a;@march=armv8-a
v7-a/fpv3/softfp;@march=armv7-a@mfpu=vfpv3-d16@mfloat-abi=softfp
v7-a/fpv3/hard;@march=armv7-a@mfpu=vfpv3-d16@mfloat-abi=hard
v7-a/simdv1/softfp;@march=armv7-a@mfpu=neon@mfloat-abi=softfp
v7-a/simdv1/hard;@march=armv7-a@mfpu=neon@mfloat-abi=hard
v7ve/fpv4/softfp;@march=armv7ve@mfpu=vfpv4-d16@mfloat-abi=softfp
v7ve/fpv4/hard;@march=armv7ve@mfpu=vfpv4-d16@mfloat-abi=hard
v7ve/simdvfpv4/softfp;@march=armv7ve@mfpu=neon-vfpv4@mfloat-abi=softfp
v7ve/simdvfpv4/hard;@march=armv7ve@mfpu=neon-vfpv4@mfloat-abi=hard
v8-a/simdv8/softfp;@march=armv8-a@mfpu=neon-fp-armv8@mfloat-abi=softfp
v8-a/simdv8/hard;@march=armv8-a@mfpu=neon-fp-armv8@mfloat-abi=hard
thumb/v7-a;@mthumb@march=armv7-a
thumb/v7ve;@mthumb@march=armv7ve
thumb/v8-a;@mthumb@march=armv8-a
thumb/v7-a/fpv3/softfp;@mthumb@march=armv7-a@mfpu=vfpv3-d16@mfloat-abi=softfp
thumb/v7-a/fpv3/hard;@mthumb@march=armv7-a@mfpu=vfpv3-d16@mfloat-abi=hard
thumb/v7-a/simdv1/softfp;@mthumb@march=armv7-a@mfpu=neon@mfloat-abi=softfp
thumb/v7-a/simdv1/hard;@mthumb@march=armv7-a@mfpu=neon@mfloat-abi=hard
thumb/v7ve/fpv4/softfp;@mthumb@march=armv7ve@mfpu=vfpv4-d16@mfloat-abi=softfp
thumb/v7ve/fpv4/hard;@mthumb@march=armv7ve@mfpu=vfpv4-d16@mfloat-abi=hard
thumb/v7ve/simdvfpv4/softfp;@mthumb@march=armv7ve@mfpu=neon-vfpv4@mfloat-abi=softfp
thumb/v7ve/simdvfpv4/hard;@mthumb@march=armv7ve@mfpu=neon-vfpv4@mfloat-abi=hard
thumb/v8-a/simdv8/softfp;@mthumb@march=armv8-a@mfpu=neon-fp-armv8@mfloat-abi=softfp
thumb/v8-a/simdv8/hard;@mthumb@march=armv8-a@mfpu=neon-fp-armv8@mfloat-abi=hard
I found that the file that encodes this older set of multilib mappings is gcc-linaro-4.9-2015.05/gcc/config/arm/t-aprofile. Based on some comments in gcc-linaro-4.9-2015.05/gcc/config.gcc, I guessed that ABE should have configured gcc with "--with-multilib-list=aprofile", and without "--with-arch=armv7-a" or "--with-fpu=vfpv3-d16". I quickly hacked these changes into abe/config/gcc.conf like this:
diff --git a/config/gcc.conf b/config/gcc.conf
index 19c44ca..4cc5eaf
--- a/config/gcc.conf
+++ b/config/gcc.conf
@@ -111,9 +111,9 @@ if test x"${build}" != x"${target}"; then
default_configure_flags="${default_configure_flags} --with-tune=cortex-a9"
fi
if test x"${override_arch}" = x -a x"${override_cpu}" = x; then
- default_configure_flags="${default_configure_flags} --with-arch=armv7-a"
+ default_configure_flags="${default_configure_flags}"
fi
- default_configure_flags="${default_configure_flags} --enable-threads=no --with-fpu=vfpv3-d16 --enable-multilib --disable-multiarch"
+ default_configure_flags="${default_configure_flags} --enable-threads=no --with-multilib-list=aprofile --enable-multilib --disable-multiarch"
languages="c,c++,lto"
;;
aarch64*-*elf)
After rebuilding the toolchain, I found it had the desired older set of multilibs.
I hope that this mail will help anyone who experiences similar problems. I have filed a bug report for the multilib issue. See https://bugs.linaro.org/show_bug.cgi?id=1920 .
While validating the toolchains, dejagnu reports a few unexpected failures. Does the TCWG publish their validation results anywhere for comparison? That would be very helpful.
Thanks,
Fred Peterson
Freescale Developer Tools
== Progress ==
o Validation and Infra (2/10)
* Some fixes in our release script
* look at refactoring our publishing snapshot job
o Linaro GCC (4/10)
* Start backports for 2015.12
* Tracking dependencies
o Upstream work (1/10)
* Continue on sanitizing gfortran testsuite
o Misc (3/10)
* Various meetings
* Internal support
== Plan ==
o Continue on-going tasks
Controlled image builds - TCWG-360 [2/10]
* A few more test/debug cycles with ci-loop-built image
Jenkins benchmarking job - TCWG-348 [3/10]
* YAML-ised Jenkins job, more test/debug cycles
Juno crashdump [1/10]
* Got a usable dump (via alt-sysrq-c) with latest patches plus some fiddling
SPEC-on-Android [1/10]
* Looked at Qian's work to date, didn't come up with any bright ideas
Misc [3/10]
=Plan=
Review security with shared uinstance/main instance code
Expose more data, benchmarks to bundles
Continue debug/test of Jenkins job
Create bootable image for at least 1 target, or know what the problems are
Write up noise control report (if time)
Set Juno off, try to get a dump of my crash
Probably more support for SPEC-on-Android
=Absences=
'ARM Day' next Monday (30th)
== This week ==
* TCWG-317 - Exploit wide add operations when appropriate for AArch32 (5/10)
- Blocked as I have not yet determined why the pattern fails on big
endian targets
* TCWG-369 - Exploit wide add operations when appropriate for AArch64 (1/10)
- Modified code based on minor code style comments
* TCWG-316 - Exploit vector multiply by scalar instructions (3/10)
- Discovered relevant previous RFC:
https://gcc.gnu.org/ml/gcc/2013-09/msg00061.html
- Coded subset of vector patterns
- Debugging combine phase to determine why patterns are not
being utilized
* Misc (1/10)
- Conference calls
== Next week ==
- TCWG-369 - Submit modified patch upstream for final approval
- TCWG-316 - Determine if rtl patterns can be used by combine
- TCWG-317 - Need feedback
- USA Thanksgiving Holidays (November 26-27)
== This Week ==
* TCWG-72 (2/10)
- Rebased patch
- Fixed ICE for x86-gcc with -m32 following Jim's suggestions.
* Target hook conversion (6/10)
- Converted ASM_FORMAT_PRIVATE_NAME, ASM_LABEL_OUTPUT_LABEL,
ASM_OUTPUT_LABELREF to hook
* TCWG-319 (1/10)
- Benchmark jobs for fp in progress on a53, a57.
* Misc (1/10)
- Meetings
== Next Week ==
- Test and send updated patch upstream for tcwg-72
- TCWG-319 benchmarking on cortex-a15
- Holidays from 23-25th November (Mon-Wed).
# Progress #
* TCWG-332, done. [1/10]
Fix GDB bug on stepping over breakpoint on ARM. Patch is pushed in.
* TCWG-423, patches are posted. [5/10].
Support gnu vector in inferior call in AArch64 GDB.
Also correctly handle HVA (homogeneous vector aggregate) in inferior
call.
* TCWG-433, done. [2/10]
All memory issues found by -fsanitize=address in GDB are fixed.
* TCWG-447, done. [1/10]
Fix GDB mainline build warnings and errors in C++ mode on ARM and
AArch64.
* Discussion on the approach of building GDB in C++. Need to test GDB
built in C++ on both ARM and AArch64, from my side. [1/10]
# Plan #
* Understand ST's jtag probe and help them to make use of multi-arch
in GDB.
* Fix GDB internal error in gdb.thread/watchpoint-fork.exp on AArch64.
* TCWG-156, GDB test parity between AArch64 and X86_64.
--
Yao
== Progress ==
* Validation (6/10)
- a few improvements in the validations using the ST compute farm
- thinking about appropriate ways of sharing validation
reports with the GCC community without flooding gcc-testresults
- moved results comparison scripts to a dedicated repo
and updated Jenkins jobs accordingly
* GCC (1/10)
- bug #1869 / glibc dependency on RHEL6
proof of concept to force use of old memcpy
but it will be much safer to build the toolchain
in a suitable container with the right distro
* Misc (conf calls, meetings, emails, ...) (3/10)
- patches and backports reviews
== Next ==
* Validation
- continue preparation of switch, as dev-01 is now back
- improve reporting
* GCC:
- check Neon tests cleanup
- bug #1869
- look at how to send valuable reports to gcc-regression
Hi,
We're currently running into issues with the OE builds due to OE-core
having moved to 2.22. So what's the plan for glibc-linaro 2.22?
--
Koen Kooi
Builds and Baselines | Release Manager
Linaro.org | Open source software for ARM SoCs
Hi,
This question has arisen in the ODP project and the thought is that a 'best
practices' answer would be more likely to be found on this list.
We have a component that wants to make use of specialized instructions for
performing CRC and/or AES computations and was wondering what is the
recommended way for an application to determine whether such instructions
are available in the toolchain and whether the user has overruled their use?
Thanks for any insight you can provide.
Bill
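One possible approach, sketched under assumptions: the ACLE feature macros __ARM_FEATURE_CRC32 and __ARM_FEATURE_CRYPTO tell the code whether the compiler options (for example -march=armv8-a+crc+crypto) allowed those instructions, and on AArch64 Linux getauxval(AT_HWCAP) reports whether the running CPU has them. The function names below are hypothetical, and on other targets the run-time probe simply reports "not available".

```c
#include <stdbool.h>

#if defined(__aarch64__) && defined(__linux__)
#include <sys/auxv.h>   /* getauxval, AT_HWCAP */
#include <asm/hwcap.h>  /* HWCAP_CRC32, HWCAP_AES */
#endif

/* Build time: did the user's compiler options enable CRC32 instructions? */
bool crc32_enabled_at_build(void)
{
#if defined(__ARM_FEATURE_CRC32)
    return true;
#else
    return false;
#endif
}

/* Build time: did the user's compiler options enable the crypto extension? */
bool aes_enabled_at_build(void)
{
#if defined(__ARM_FEATURE_CRYPTO)
    return true;
#else
    return false;
#endif
}

/* Run time: does the CPU we are running on implement CRC32? */
bool crc32_present_at_runtime(void)
{
#if defined(__aarch64__) && defined(__linux__)
    return (getauxval(AT_HWCAP) & HWCAP_CRC32) != 0;
#else
    return false; /* not probed on this target */
#endif
}

/* Run time: does the CPU we are running on implement AES? */
bool aes_present_at_runtime(void)
{
#if defined(__aarch64__) && defined(__linux__)
    return (getauxval(AT_HWCAP) & HWCAP_AES) != 0;
#else
    return false; /* not probed on this target */
#endif
}

/* Use the hardware path only if both the build and the CPU allow it. */
bool use_hw_crc32(void)
{
    return crc32_enabled_at_build() && crc32_present_at_runtime();
}
```

Checking the build-time macro first respects a user who deliberately compiled without the extension, while the run-time check protects against running on a core that lacks it.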
I think there are many issues with binary compatibility beyond
function inlining. An ODP application cannot expect all ODP
implementations to support the same number of ODP queues or
classification rules or even which classification terms (fields) are
supported (efficiently/in HW) etc. Is there some kind of lowest common
denominator an application should expect? Do we want to make
guarantees of an ODP implementation stricter? What are the
consequences of such strict functional guarantees?
I think an application that requires binary compatibility over ARMv8.1
platforms should compile and link against a specific ODP SW
implementation (possibly with some well-defined HW offloads where the
underlying platform can provide the relevant drivers). I.e. more of a
(user-space) Linux architecture than standard ODP (as influenced by
OpenGL). The important binary interfaces then become the interfaces
to these offloads/drivers.
On 16 November 2015 at 14:23, Nicolas Morey-Chaisemartin
<nmorey(a)kalray.eu> wrote:
>
>
> On 11/11/2015 09:45 AM, Savolainen, Petri (Nokia - FI/Espoo) wrote:
>>
>>> -----Original Message-----
>>> From: lng-odp [mailto:lng-odp-bounces@lists.linaro.org] On Behalf Of
>>> EXT Nicolas Morey-Chaisemartin
>>> Sent: Tuesday, November 10, 2015 5:13 PM
>>> To: Zoltan Kiss; linaro-toolchain(a)lists.linaro.org
>>> Cc: lng-odp
>>> Subject: Re: [lng-odp] Runtime inlining
>>>
>>> As I said in the call last week, the problem is wider than that.
>>>
>>> ODP specifies a lot of types but not their sizes, a lot of
>>> enums/defines (things like ODP_PKTIO_INVALID) but not their value
>>> either.
>>> For our port a lot of those values were changed for
>>> performance/implementation reasons. So I'm not even compatible between
>>> one version of our ODP port and another one.
>>>
>>> The only way I can see to solve this is for ODP to fix the size of all
>>> these types.
>>> Default/Invalid values are not that easy, as a pointer would have a
>>> completely different behaviour from structs/bitfields
>>>
>>> Nicolas
>>>
>> Type sizes do not need to be fixed in general, but only when an application is built for binary compatibility (the use case we are talking about here). Binary compatibility and thus the fixed type sizes are defined per ISA.
>>
>> We can e.g. define a configure target (for our reference implementation == linux-generic) "--binary-compatible=armv8.x" or "--binary-compatible=x86_64". When you build your application with that option, "platform dependent" types and constants would be fixed to pre-defined values specified in (new) ODP API arch files.
>>
>> So instead of building against odp/platform/linux-generic/include/odp/plat/queue_types.h ...
>>
>> typedef ODP_HANDLE_T(odp_queue_t);
>> #define ODP_QUEUE_INVALID _odp_cast_scalar(odp_queue_t, 0)
>> #define ODP_QUEUE_NAME_LEN 32
>>
>>
>> ... you'd build against odp/arch/armv8.x/include/odp/queue_types.h ...
>>
>> typedef uintptr_t odp_queue_t;
>> #define ODP_QUEUE_INVALID ((uintptr_t)0)
>> #define ODP_QUEUE_NAME_LEN 64
>>
>>
>> ... or odp/arch/x86_64/include/odp/queue_types.h
>>
>> typedef uint64_t odp_queue_t;
>> #define ODP_QUEUE_INVALID ((uint64_t)0xffffffffffffffff)
>> #define ODP_QUEUE_NAME_LEN 32
>>
>>
>> For highest performance on a fixed target platform, you'd still build against the platform directly
>>
>> odp/platform/<soc_vendor_xyz>/include/odp/plat/queue_types.h
>>
>> typedef xyz_queue_desc_t * odp_queue_t;
>> #define ODP_QUEUE_INVALID ((xyz_queue_desc_t *)0xdeadbeef)
>> #define ODP_QUEUE_NAME_LEN 20
>>
>>
>> -Petri
>>
>
> It still means that you need to enforce a type for all ODP implementations on a given arch, which could be problematic.
> As a precise example: the way handles are used now for odp_packet_t brings some useful features for checks and memory savings, but performance-wise, they are a "disaster". One of the first things I did was to switch them to pointers. And if I wanted a high perf linux x86_64 implementation, I'd probably do the same.
>
> Nicolas
> _______________________________________________
> lng-odp mailing list
> lng-odp(a)lists.linaro.org
> https://lists.linaro.org/mailman/listinfo/lng-odp
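The proposal quoted above amounts to choosing between a frozen, fixed-width handle definition per ISA and a free-form platform definition. A minimal sketch of that selection, using hypothetical demo_/DEMO_ names and a hypothetical DEMO_BINARY_COMPATIBLE switch rather than any real ODP header:

```c
#include <stdint.h>

#if defined(DEMO_BINARY_COMPATIBLE)
/* ABI mode: handle width and invalid value are frozen for the ISA. */
typedef uint64_t demo_queue_t;
#define DEMO_QUEUE_INVALID ((demo_queue_t)UINT64_MAX)
_Static_assert(sizeof(demo_queue_t) == 8, "ABI handle must stay 64 bits");
#else
/* Platform mode: the implementation is free to use a raw pointer. */
struct demo_queue_desc;
typedef struct demo_queue_desc *demo_queue_t;
#define DEMO_QUEUE_INVALID ((demo_queue_t)0)
#endif

/* Application code only ever compares against the named constant, so it
   compiles unchanged in either mode. */
static inline int demo_queue_is_valid(demo_queue_t q)
{
    return q != DEMO_QUEUE_INVALID;
}
```

An application built in ABI mode would run on any implementation honouring the same frozen definitions, at the cost Nicolas describes: the implementation can no longer pick the representation that is fastest for it.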
== Progress ==
LLDB development
-- Root Google Nexus devices and read debug module configuration with
kernel module [TCWG-429] [7/10]
-- Figure out steps to unlock and root Nexus S
-- Figure out steps to build kernel and kernel module for Nexus S
-- Tried out lldb watchpoints with custom kernel on Nexus S
-- Tried out reaching debug co processors without ptrace using kernel module.
-- Identify mix-mode debugging problems (ARM & Thumb) [TCWG-229] [2/10]
-- Ongoing initial investigation and identifying code areas needing changes
Miscellaneous [1/10]
-- Meetings, emails, discussions etc.
== Plan ==
-- Root Google Nexus devices and read debug module configuration with
kernel module [TCWG-429]
-- Complete app and kernel module to read debug coprocessor registers.
-- Try them out on remaining Android devices.
-- Identify mix-mode debugging problems (ARM & Thumb) [TCWG-229]
-- Further investigation and testing a mix mode application.
== Progress ==
* Buildbots (4/10)
- Found culprit for self-hosting breakages
- Bot didn't get right because of dirty builds
- Moving all self-hosting bots to clean builds (~3h)
- More work on MIPS patch breaking self-hosting
- Several breakages and bisections
- Adding first cloud (Scaleway) buildbot to local master
- No NEON, so we can't replace the Chromebooks
* Infrastructure (4/10)
- Power cut in Cambridge Lab, no generator yet
- Chromebooks fail at the time of the cut, even with the UPS
batteries still holding. I'm guessing the power regulator
depends on the internal battery to work (and we removed them)
- Bringing all bots up, etc.
- Setting up an HiKey/AMD for benchmarks (APMs are too different)
- Running EEMBC and SPEC on AMD
* Background (2/10)
- Code review, meetings, discussions, general support, etc.
- Upstreaming -meabi, which may fix builds of kernel, android, bsd
- Compiling aarch64-linux-gnu-gcc by hand because Arch pkg didn't work
o 1 day off (2/10)
== Progress ==
o Linaro GCC (6/10)
* FSF branch merge into linaro GCC 5 branch
* Troubleshot various regressions after the merge
* Delivered GCC 5.2 2015.11 snapshot
o Upstream work (1/10)
* Sanitizing gfortran testsuite
o Misc (1/10)
* Various meetings
== Plan ==
o Continue on sanitizing testsuite
o Backports, infra, ...
Implement LAVA jobs for microinstance - TCWG-432 [6/10]
* Refactoring to permit sharing of code between uinstance & main
instance, as far as possible
* Further refactoring for sane submission of bundles without inserting
LAVA assumptions in the wrong places
* Tested as far as possible in main instance, using light hacks and fakebench
Jenkins benchmarking job - TCWG-348 [1/10]
* Converted pbl hacks into a sane patch for yaml-to-json.py
Controlled image builds - TCWG-360 [1/10]
* Submitted aarch64 filesystem build for review
* Generated armhf and amd64 filesystems
* Started learning how to generate hwpack
Misc [2/10]
=Plan=
Review security with shared uinstance/main instance code
Expose more data, benchmarks to bundles
Create YAML definition for Jenkins benchmarking job
Generate (controlled) hwpack for at least one target, or know what the
problems are
Write up noise control report (if time)
Have another go at crashdump (if time, if new kexec patches)
* TCWG-72 (3/10)
- divmod transform approved by Richard
- builds cleanly on arm-linux-gnueabihf, aarch64-linux-gnu
- Investigating segfault with bid64_div.c
happens when mode == DImode and libval_mode == TImode
- Found another segfault on x86 with TImode, on arm
TImode is not supported and compiler aborts. Perhaps we should
not do the transform when mode is TImode ?
- Had a look at expand_binop_twoval_libfunc().
Wrote a similar function to obtain both results but this resulted
in infinite loop in emit_libcall_block_1
- Strangely the bug is reproducible only during the build and doesn't
trigger when compiled with preprocessed version of bid64_div.c
(passing the same set of options).
- waiting for upstream comments
* TCWG-319 (1/10)
- Submitted jobs for fp benchmark on a53, a57
* Misc:
- PR66214 appears to have gone (fixed or became latent), that was
blocking firefox LTO build with trunk
- PR65837 still appears to be present after r230327
* Public Holidays (6/10)
- Diwali festival
== Next Week ==
- Continue with TCWG-72, TCWG-319 benchmarking, target hook conversion
- Run SPEC2k6 with LTO
== Progress ==
- Widening pass (TCWG-547) - 6/10
* Bootstrapped latest patch on ppc64-linux-gnu, aarch64-linux-gnu and
x86_64-linux-gnu.
* Regression testing on ppc64-linux-gnu,
aarch64-linux-gnu, arm64-linux-gnu and x86_64-linux-gnu.
* Fixed all of the execution issues
* Posted updated patch to the list
- Misc (4/10)
* Linaro bug 1900
* Continued Looking at LuaJIT code-base
* gcc/bug list
== Plan ==
* bug 1900
* Look at implementing LuaJIT for aarch64
* LTO
Hi,
We have a packaging/linking/optimization problem at LNG, I hope you guys
can give us some advice on that. (Cc'ing ODP list in case someone wants
to add something)
We have OpenDataPlane (ODP), an API stretching between userspace
applications and hardware SDKs. It's defined in the form of C headers,
and we already have several implementations to face SDKs (or whatever
is actually controlling the hardware), e.g. linux-generic, a DPDK one etc.
And we have applications, like Open vSwitch (OVS), which now is able to
work with any ODP platform implementation which implements this API.
When it comes to packaging, the ideal scenario would be to create one
package for the application, e.g. openvswitch.deb, and one for each
platform, e.g odp-generic.deb, odp-dpdk.deb. The latter would contain
the implementations in the form of a libodp.so file, so the application
can dynamically load the actually installed platform's library at runtime,
with all the benefits of dynamic linking.
The trouble is that we have several accessor functions in the API which
are very short and __very__ frequently used. The best example is
"uint32_t odp_packet_len(odp_packet_t pkt)", which returns the length of
the packet. odp_packet_t is an opaque type defined by the
implementation, often a pointer to the packet's actual metadata, so the
actual function call boils down to a simple load from that metadata pointer
(+offset). Having it wrapped into a function call brings a significant
performance decrease: when forwarding 64 byte packets at 10 Gbps, I got
13.2 Mpps with function calls. When I inlined that function I got
13.8 Mpps, a ~5% difference. And there are a lot of other
frequently used short accessor functions with the same problem.
But obviously if I inline these functions I break the ABI, and I need to
compile the application for each platform (and create packages like
openvswitch-odp-dpdk.deb, containing the platform statically linked).
I've tried to look around on Google and in the GCC manual, but I couldn't
find a good solution for this kind of problem.
I've checked link time optimization (-flto), but it only helps with
static linking. Is there any way to keep the ODP application and
platform implementation binaries in separate files while having the
performance benefit of inlining?
Regards,
Zoltan
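The trade-off described above can be sketched as follows. All names (demo_pkt_hdr, demo_packet_len, demo_packet_len_inline) are hypothetical stand-ins, not the real ODP definitions; the point is only where the knowledge of the metadata layout ends up.

```c
#include <stdint.h>

/* Hypothetical packet metadata; each real implementation defines its own. */
struct demo_pkt_hdr {
    uint32_t frame_len;
};

typedef struct demo_pkt_hdr *demo_packet_t;

/* ABI-stable variant: the application only sees this prototype, so every
   call crosses into the shared library and the layout stays private. */
uint32_t demo_packet_len(demo_packet_t pkt);

/* Inlined variant: a single load, but the field offset is now compiled
   into the application, which ties it to one implementation. */
static inline uint32_t demo_packet_len_inline(demo_packet_t pkt)
{
    return pkt->frame_len;
}

/* In a real build this definition would live inside libodp.so. */
uint32_t demo_packet_len(demo_packet_t pkt)
{
    return pkt->frame_len;
}
```

Both variants return the same value; only the inlined one lets the compiler fold the load into the caller, and only the out-of-line one survives a change of struct layout without recompiling the application.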
The Linaro Toolchain Working Group (TCWG) is pleased to announce the
2015.11 snapshot of the Linaro GCC 5 source package.
This monthly snapshot[2] is based on FSF GCC 5.2+svn230068 and
includes performance improvements and bug fixes backported from
mainline GCC. This snapshot's contents will be part of the 2015.11
stable[1] quarterly release.
This snapshot tarball is available on:
http://snapshots.linaro.org/components/toolchain/gcc-linaro/5.2-2015.11/
Interesting changes in this GCC source package snapshot include:
* Updates to GCC 5.2+svn230068
* Backport of [Bugfix] [AArch32] fp16 Fix PR 67624 - Incorrect
conversion of float Infinity to __fp16
* Backport of [Bugfix] [AArch64] PR 66776 Add cmovdi_insn_uxtw pattern
* Backport of [Bugfix] [AArch64] PR rtl-optimization/68106 LRA
* Backport of [Bugfix] PR48052 fix testcase
* Backport of [Bugfix] PR other/57195
* Backport of [Bugfix] PR rtl-optim/67421 Cost instruction sequences
when doing left wide shift
* Backport of [Bugfix] PR rtl-optimization/67103 Improve conditional
select ops on immediates
* Backport of [Bugfix] PR rtl-optimization/67756
* Backport of [Bugfix] PR target/61578
* Backport of [Bugfix] PR target/61578
* Backport of [Bugfix] PR target/61578
* Backport of [Bugfix] PR tree-optimization/48052 IVOPTS
* Backport of [Bugfix] PR tree-optimization/52563 and 62173 IVOPTS
* Backport of [Bugfix] PR tree-optimization/64454
* Backport of [Bugfix] PR tree-optimization/66449
* Backport of [AArch32] 1/2 Record FPU features as a bit-set
* Backport of [AArch32] 2/2 Use new FPU features representation
* Backport of [AArch32] 1/5 Make room for more CPU feature flags
* Backport of [AArch32] 2/5 Add feature set definitions
* Backport of [AArch32] 3/5 Use new feature set representation
* Backport of [AArch32] 4/5 Use features sets for builtins
* Backport of [AArch32] 5/5 Move initializer into arm-cores.def and
arm-arches.def
* Backport of [AArch32] Add earlyclobber modifier for neon_(vtrn,
vuzp, vzip)<mode>_insn rtx pattern
* Backport of [AArch32] Add missing is_neon_type types
* Backport of [AArch32] arm memcpy of aligned data
* Backport of [AArch32] Fix arm bootstrap failure due to
-Werror=shift-negative-value
* Backport of [AArch32] fix vget_lane on big-endian
* Backport of [AArch32] Use %wd format for lane printing in bounds_check
* Backport of [AArch32/AArch64] 1/15 [FP16] Hide existing float16
intrinsics unless we have a scalar __fp16 type
* Backport of [AArch32/AArch64] 2/15 [fp16] float16x4_t intrinsics in arm_neon.h
* Backport of [AArch32/AArch64] 3/15 Add V8HFmode and float16x8_t type
* Backport of [AArch32/AArch64] 4/15 float16x8_t intrinsics in arm_neon.h
* Backport of [AArch32/AArch64] 5/15 Remaining intrinsics
* Backport of [AArch32/AArch64] 6/15 Add basic FP16 support
* Backport of [AArch32/AArch64] 8/15 Add support for float16x{4,8}_t
vectors/builtins
* Backport of [AArch32/AArch64] 9/15 vld{2,3,4}{,_lane,_dup}, vcombine, vcreate
* Backport of [AArch32/AArch64] 10/15 Implement vcvt_{,high_}f16_f32
* Backport of [AArch32/AArch64] 11/15 vreinterpret(q?),
vget_(low|high), vld1(q?)_dup
* Backport of [AArch32/AArch64] 12/15 Add vcvt(_high)?_f32_f16
intrinsics, with BE RTL fix
* Backport of [AArch32/AArch64] 13/15 Add float16 tests to
advsimd-intrinsics testsuite
* Backport of [AArch32/AArch64] 14/15 Add test of
vcvt{,_high}_i{f32_f16,f16_f32}
* Backport of [AArch32/AArch64] 15/15 Update sourcebuild.texi with
testsuite/effective-target hooks
* Backport of [AArch64] 1/5 Reimplement aarch64_bitmask_imm
* Backport of [AArch64] 2/5 Improve aarch64_internal_mov_immediate by
using faster algorithm
* Backport of [AArch64] 3/5 Remove dead code
* Backport of [AArch64] 4/5 Remove redundant code
* Backport of [AArch64] 5/5 Cleanup immediate generation code in
aarch64_internal_mov_immediate
* Backport of [AArch64] 1/14 Add ident field to struct processor
* Backport of [AArch64] 2/14 Refactor arches handling, add arch enum identifier
* Backport of [AArch64] 3/14 Refactor option override code
* Backport of [AArch64] 4/14 Create TARGET_FIX_ERR_A53_835769 and use
that instead of aarch64_fix_a53_err835769
* Backport of [AArch64] 5/14 Make flag_omit_leaf_frame_pointer
initialize to 2. Define and use TARGET_OMIT_LEAF_FRAME
* Backport of [AArch64] 6/14 Implement TARGET_OPTION_SAVE/TARGET_OPTION_RESTORE
* Backport of [AArch64] 7/14 Implement TARGET_SET_CURRENT_FUNCTION
* Backport of [AArch64] 8/14 Implement TARGET_OPTION_VALID_ATTRIBUTE_P
* Backport of [AArch64] 9/14 Implement TARGET_CAN_INLINE_P
* Backport of [AArch64] 10/14 Implement target pragmas
* Backport of [AArch64] 11/14 Re-layout SIMD builtin types on builtin expansion
* Backport of [AArch64] 12/14 Target attributes and target pragmas tests
* Backport of [AArch64] 13/14 Document AArch64 target attributes and pragmas
* Backport of [AArch64] 14/14 Reuse target_option_current_node when
passing pragma string to target attribute
* Backport of [AArch64] vtbl[34] and vtbx4
* Backport of [AArch64] Add backend aarch64_bfi pattern
* Backport of [AArch64] Add csneg3_uxtw_insn pattern
* Backport of [AArch64] Add support for 64-bit vector-mode ldp/stp
* Backport of [AArch64] Adjust tests to take LSE extension into account
* Backport of [AArch64] [array_mode 1/8] Rename
vec_store_lanes<mode>_lane to aarch64_vec_store_lanes<mode>_lane
* Backport of [AArch64] [array_mode 2/8] Remove VSTRUCT_DREG, use
BLKmode for d-reg aarch64_st/ld expands
* Backport of [AArch64] [array_mode 3/8] Stop using EImode in
aarch64-simd.md and iterators.md
* Backport of [AArch64] [array_mode 4/8] Remove EImode
* Backport of [AArch64] [array_mode 5/8] Remove V_FOUR_ELEM, again
using BLKmode + set_mem_size.
* Backport of [AArch64] [array_mode 6/8] Remove V_TWO_ELEM, again
using BLKmode + set_mem_size.
* Backport of [AArch64] [array_mode 7/8] Combine the expanders using
VSTRUCT:nregs
* Backport of [AArch64] [array_mode 8/8] Add d-registers to
TARGET_ARRAY_MODE_SUPPORTED_P
* Backport of [AArch64] Break -mcpu tie between the compiler and assembler
* Backport of [AArch64] [expand] Check gimple statement to improve
LSHIFT_EXP expand
* Backport of [AArch64] Fix FAIL:
gcc.target/aarch64/target_attr_crypto_ice_1.c (internal compiler
error)
* Backport of [AArch64] Fix vcvt_high_f64_f32 and vcvt_high_f32_f64 intrinsics
* Backport of [AArch64] Fix vldX/vstX AdvSIMD intrinsics
* Backport of [AArch64] Followup to [AArch64_be] Fix vtbl[34] and vtbx4
* Backport of [AArch64] Force __builtin_aarch64_fp[sc]r argument into a REG
* Backport of [AArch64] Handle const address in aarch64_print_operand
* Backport of [AArch64] Implement copysign[ds]f3
* Backport of [AArch64] Improve code generation for float16 vector code
* Backport of [AArch64] Improve SIMD concatenation with zeroes
* Backport of [AArch64] Remove index from AARCH64_FUSION_PAIR
* Backport of [AArch64] Remove obsolete comment in aarch64-option-extensions.def
* Backport of [AArch64] Remove separate movtf pattern - Use an
iterator for all FP modes
* Backport of [AArch64] Remove the hack for AARCH64_EXTRA_TUNE_ALL
* Backport of [AArch64] TLSLE 1,2 and 3/N
* Backport of [AArch64] Use default_elf_asm_named_section instead of
special cased hook
* Backport of [AArch64] Use default_elf_asm_named_section instead of
special cased hook
* Backport of [AArch64] Use logics_imm type for 2nd alternative of
*and<mode>3nr_compare0
* Backport of [AArch64] Use popcount_hwi instead of homebrew version
* Backport of [Testsuite] Fix race on temp file in gfortran streamio_*.f90 tests
* Backport of [Testsuite] Fix race on temp file in gfortran tests
* Backport of [Testsuite] Fix typo in vcvt_f16.c testcase
* Backport of [Testsuite] Adjust compiling options for
gcc.target/arm/unsigned-float.c
* Backport of [Testsuite] [AArch32] gcc.target/arm/pr67756.c: Fixed warnings
* Backport of [Testsuite] [AArch64] 7/15 Add basic fp16 tests
* Backport of [Testsuite] [AArch64] Adjust some arith+compare tests
for potentially more aggressive if-conversion
* Backport of [Testsuite] [AArch64] Make arm_align_max_stack_pwr.c and
arm_align_max_pwr.c compile testcase, instead of execution
* Backport of [Testsuite] [AArch64] Mark target_attr_1.c as compile-only
* Backport of [testsuite] [AArch64] Remove divisions-to-produce-NaN
from vdiv_f.c
* Backport of [Testsuite] Add float16 lane_f16_indices tests
* Backport of [Testsuite] auto-wipe dump files
* Backport of [Testsuite] Clean up effective_target cache
* Backport of [Testsuite] Clean up effective_target cache
* Backport of [Testsuite] Fix order of dg-do and
dg-require-effective-target directives
* Backport of [testsuite] gcc.dg/builtins-20.c: Remove undefined behavior
* Backport of [Testsuite] gcc.dg/tree-ssa/pr65447.c: Increase searching number
* Backport of [Misc] add separate insn sched class for vector LDP & STP
* Backport of [Misc] correct ChangeLog dates+address
* Backport of [Misc] fix typo in 223858 1/2
* Backport of [Misc] fix typo in 223858 2/2
* Backport of [Misc] Fix bigendian HFmode in native_interpret_real
* Backport of [Misc] model load/store multiples properly in
autoprefetcher scheduling
* Backport of [Misc] Improve auto-increment addressing mode support in
IVO by refactoring add candidate logic
* Backport of [Misc] Improve bound information in loop niter analysis
* Backport of [Misc] Improve conditional select ops on immediates
* Backport of [Misc] Improve loop bound info by simplifying
conversions in iv base
* Backport of [Misc] IVOPS
* Backport of [Misc] Look into unnecessary conversion when checking
mult_op in get_shiftadd_cost
* Backport of [Misc] Allow REG_EQUAL for ZERO_EXTRACT
* Backport of [Misc] mark libstdc++ tests unsupported if they fail
with relocation truncated
* Backport of [Misc] Rerun loop-header-copying just before vectorization
* Backport of [Misc] Allow PLUS+immediate expression in
noce_try_store_flag_constants
* Backport of [Doc] Clarify feature modifiers {no,}{fp,simd,crypto}
Feedback and Support
Subscribe to the important Linaro mailing lists and join our IRC
channels to stay on top of Linaro development.
** Linaro Toolchain Development mailing list:
http://lists.linaro.org/mailman/listinfo/linaro-toolchain
** Linaro Toolchain IRC channel on irc.freenode.net: #linaro-tcwg
* Bug reports should be filed in bugzilla against GCC product:
http://bugs.linaro.org/enter_bug.cgi?product=GCC
* Interested in commercial support? Inquire at Linaro support:
mailto:support@linaro.org
[1]. Stable source package releases are defined as releases where the
full Linaro Toolchain validation plan is executed.
[2]. Source package snapshots are defined when the compiler is only
put through unit-testing and full validation is not performed.
1 day off (Wednesday) (2/10)
== Progress ==
* Validation
- Jenkins jobs maintenance & cleanup
- comparison of build times between old & new lab
- dedicated slave for results comparison works well
* GCC
- trunk monitoring, reported a few new failures.
- high rate of commits before e/o stage1 means
lots of patches to check
- infrastructure problems in the ST compute farm
mean a few false errors needed analysis
- looked at bug #1869, (problem with binary toolsets
on RHEL6). Made some progress
== Next ==
* Validation:
- continue preparation of switch, as dev-01 is now back
- improve reporting
* GCC:
- check Neon tests cleanup
- bug #1869
- look at how to send valuable reports to gcc-regression
* Off on Wed afternoon [1/10].
# Progress #
* Fails in gdb.threads/multiple-step-overs.exp, (TCWG-332) [1/10]
Patch V2 is posted, pending review.
* TCWG-422, patch is committed. Done. [2/10].
* TCWG-423, patches are ready, being regression tested. [2/10]
* TCWG-433, building GDB with -fsanitize=address exposes many memory
issues. Some of them are fixed. [2/10].
* Upstream patch review, [1/10]
* Misc, meeting, [1/10]
# Plan #
* TCWG-423, Post patches upstream.
* Understand ST's jtag probe and help them to make use of multi-arch
with GDB.
* TCWG-433, Continue fixing memory issues exposed by
-fsanitize=address.
--
Yao
Hi Albert,
On Thu, Nov 12, 2015 at 08:20:18AM +0100, Albert ARIBAUD wrote:
> Can you provide the target name and commit ID that you are building,
> as well as the version of the toolchain that you are building with?
> Without being able to reproduce your issue, it's kind of hard to
> diagnose it.
With the explanation from Ard, I understand the thing now. But thanks
for the reply anyway.
Shawn
On 11 November 2015 at 00:45, Savolainen, Petri (Nokia - FI/Espoo) <
petri.savolainen(a)nokia.com> wrote:
>
>
> > -----Original Message-----
> > From: lng-odp [mailto:lng-odp-bounces@lists.linaro.org] On Behalf Of
> > EXT Nicolas Morey-Chaisemartin
> > Sent: Tuesday, November 10, 2015 5:13 PM
> > To: Zoltan Kiss; linaro-toolchain(a)lists.linaro.org
> > Cc: lng-odp
> > Subject: Re: [lng-odp] Runtime inlining
> >
> > As I said in the call last week, the problem is wider than that.
> >
> > ODP specifies a lot of types but not their sizes, a lot of
> > enums/defines (things like ODP_PKTIO_INVALID) but not their value
> > either.
> > For our port a lot of those values were changed for
> > performance/implementation reasons. So I'm not even compatible between
> > one version of our ODP port and another one.
> >
> > The only way I can see to solve this is for ODP to fix the size of all
> > these types.
> > Default/Invalid values are not that easy, as a pointer would have a
> > completely different behaviour from structs/bitfields
> >
> > Nicolas
> >
>
> Type sizes do not need to be fixed in general, but only when an
> application is built for binary compatibility (the use case we are talking
> about here). Binary compatibility and thus the fixed type sizes are defined
> per ISA.
>
> We can e.g. define a configure target (for our reference implementation ==
> linux-generic) "--binary-compatible=armv8.x" or
> "--binary-compatible=x86_64". When you build your application with that
> option, "platform dependent" types and constants would be fixed to
> pre-defined values specified in (new) ODP API arch files.
>
> So instead of building against
> odp/platform/linux-generic/include/odp/plat/queue_types.h ...
>
> typedef ODP_HANDLE_T(odp_queue_t);
> #define ODP_QUEUE_INVALID _odp_cast_scalar(odp_queue_t, 0)
> #define ODP_QUEUE_NAME_LEN 32
>
>
> ... you'd build against odp/arch/armv8.x/include/odp/queue_types.h ...
>
With the introduction of odp/arch at the top level I think we should also
move platform/linux-generic/arch to the same location
> typedef uintptr_t odp_queue_t;
> #define ODP_QUEUE_INVALID ((uintptr_t)0)
> #define ODP_QUEUE_NAME_LEN 64
>
>
> ... or odp/arch/x86_64/include/odp/queue_types.h
>
> typedef uint64_t odp_queue_t;
> #define ODP_QUEUE_INVALID ((uint64_t)0xffffffffffffffff)
> #define ODP_QUEUE_NAME_LEN 32
>
>
> For highest performance on a fixed target platform, you'd still build
> against the platform directly
>
> odp/platform/<soc_vendor_xyz>/include/odp/plat/queue_types.h
>
> typedef xyz_queue_desc_t * odp_queue_t;
> #define ODP_QUEUE_INVALID ((xyz_queue_desc_t *)0xdeadbeef)
> #define ODP_QUEUE_NAME_LEN 20
>
>
> -Petri
--
Mike Holmes
Technical Manager - Linaro Networking Group
Linaro.org <http://www.linaro.org/> *│ *Open source software for ARM SoCs
Holiday [2/10]
Juno crash analysis [2/10]
* Spent some time fiddling with kexec on AArch64
* Worked in one very specific case
* Another patch series is (apparently) coming, will look out for it
and try again
SPEC-on-Android [2/10]
* Supporting Qian on getting this working
* Wrote a readme for the repository, fixed a Makefile bug that Qian's
cross-compiler happened to tickle
Jenkins benchmarking job - TCWG-348 [1/10]
* Tested, tidied up pbl hacks to generate JSON
* Tested my pbl with Jenkins prototype jobs
* A few minor bug fixes/enhancements for pbl
LAVA jobs for uinstance - TCWG-432 [1/10]
* Reworked jobs to support uinstance, maintaining backward
compatibility as far as possible
* Started adding support to submit results to bundle stream
Misc [2/10]
* Debian FS ready to submit
* Usual meetings/mail/etc background
=Plan=
Look at doing pbl hacks properly in Fathi's in-development refactored p-b-l
Pull together Jenkins/LAVA/pbl, ready to test when uinstance is available
Write up noise control report
(If time, if patches land) have another go at crashdump
== Progress ==
o Linaro GCC (4/10)
* Delivered GCC 4.9 2015.10 snapshot
* More backports for GCC 5 2015.11
* Many instabilities on Hetzner this week
o Upstream work (2/10)
* Sanitizing gfortran testsuite
o Release tools (2/10)
* Added RCs and binaries support to our snapshot.linaro.org
publishing job
o Misc (2/10)
* Various meetings
* Some support
== Plan ==
o Track missing backports dependencies
o Continue ongoing tasks.
== This week ==
* TCWG-369 - Exploit wide add operations when appropriate for Aarch64 (4/10)
- Determined that the vectorizer is failing for all targets that have
widening adds with V8HI to V4SI support (aarch64, ia64, PowerPC).
- Modified test cases to indicate expected failure with wide add
V8HI to V4SI support
- Patch sent upstream for approval
* Bugzilla 68223 - arm_[su]min_cmp pattern fails
- Resolved by reverting the patch for TCWG-146, as the pattern fails in
some corner cases. (3/10)
- Reverted patch checked in upstream
* Misc (1/10)
- Conference calls
* Illness, November 2nd (2/10)
== Next week ==
- TCWG-317 - Resolve lto big endian failures
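For reference, TCWG-369 above concerns widening adds from V8HI (eight 16-bit lanes) to V4SI (four 32-bit lanes). A scalar loop of the following shape is the kind of source such instructions serve. Whether a given GCC version actually vectorizes it with widening adds is not something this sketch claims, and widen_add is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Each 16-bit input pair is widened to 32 bits before the add, so the sum
   cannot overflow the 16-bit range of the inputs. */
void widen_add(int32_t *restrict out, const int16_t *restrict a,
               const int16_t *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (int32_t)a[i] + (int32_t)b[i];
}
```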
== Progress ==
- Leave (2/10)
- Widening pass (TCWG-547) - 5/10
* Made the latest changes requested in the review
* Fixed bootstrap and bootstrap mis-compare for ppc64-linux-gnu
* Making uninitialized variable as anonymous ssa (as asked in review)
results in a few ICEs.
* Posted updated patch for feedback
- Misc (3/10)
* started looking into LTO status
* Looked at LuaJIT for arm
* gcc/bug list
== Plan ==
* continue with widening pass based on feedback
* Look at implementing LuaJIT for aarch64
* LTO
== This week ==
* TCWG-72 (6/10)
- 5 iterations since the original patch. Changes include:
a) Integration into widening_mul pass
b) Rewriting the divmod transform so DIVMOD() is placed before the topmost
div/mod stmt
c) Removed check for widening mode and optab handler check in expand_DIVMOD
d) Fixed ICE when constant is one of the operands to div/mod stmt.
e) Fixed mis-compilation with a test-case when operands matched but in
opposite order.
f) Formatting nits and fixed test-cases.
- Richard suggested no need to check for post-domination conditions.
- Not sure on what condition to gate the transform.
Checking for availability of divmod/div/mod is not sufficient, because arm
defines an optab handler for mod which only matches r0 % n where n is a
constant and a power of 2; for other cases it's expanded via the divmod
libcall through expand_divmod. We would rather need to check whether the
template for mod/div gets matched than just check whether an optab handler
exists. AFAIK this cannot be done during tree-ssa passes.
I can think of two approaches:
a) Do the transform to DIVMOD representation unconditionally in the
widening_mul pass, and then in expand_DIVMOD check if the template for mod
can be matched. If it does match, undo the transform from DIVMOD to the
original representation and expand. I am not sure how feasible it is to
undo the transform at expansion time and start expanding the modified cfg.
b) Define a new target hook combine_divmod. The default implementation
could check for an optab handler for div/mod/divmod, and I could override
it for the arm backend to additionally check if the second operand is a
constant and a power of 2, and fail for this case (since we want this to be
expanded from the modsi3 pattern). Not sure if this is a good idea, since I
would be replicating the information from the modsi3 pattern: if the
pattern changes, the hook would also need to be changed.
* Convert ASM_FORMAT_PRIVATE_NAME to hook (2/10)
* TCWG-319 (1/10)
- Benchmarking for patch in progress
* Misc (1/10)
- Meetings
- Sync with Kugan
== Next Week ==
- Continue with TCWG-72
- Complete the patch with build, test and config-builds for
ASM_FORMAT_PRIVATE_NAME and submit upstream
- Continue benchmarking TCWG-319, TCWG-310
== Progress ==
* Buildbots (5/10)
- Some broken bots, bisecting, etc
- Helping a MIPS patch pass on ARM bot
* Maintenance (2/10)
- SciMark2 no longer seems to be unstable or slow on ARM64
- Some more investigations on Loop Load Elimination
- Profiling bigfib on APM and HiKey
* Background (3/10)
- Code review, meetings, discussions, general support, etc.
- Some FOSDEM fiddling
- Some power issues
== Progress ==
* Validation
- moved list of unstable tests to a separate repo, to make
maintenance easier (TCWG-425)
- Jenkins jobs maintenance & cleanup
- a few ABE reporting patches
- comparison of results between old & new lab
* GCC
- trunk monitoring, reported a few new failures.
- Sent patch to fix vqtb[lx][34] intrinsics on aarch64_be
* Binutils
- Added a Jenkins job to build+check binutils on
a variety of configurations:
https://ci.linaro.org/view/tcwg-ci/job/tcwg-binutils/
- sent a small patch to fix a bug in the recent STM32L4XX erratum patch
== Next ==
* Validation:
- work on the switch to the new lab, once dev-01 is back online
- more tuning to avoid deadlocks
- re-measure build time on dev-01, to better tune other build jobs
* Two half days off. [2/10]
# Progress #
* TCWG-332, fails in gdb.threads/multiple-step-overs.exp. [1/10]
Testing the simpler approach suggested during the review.
* TCWG-387, done. [1/10] GDB patches are pushed in.
* TCWG-422, GNU vector extension support in ARM GDB. [2/10]
Patches are done, and being tested.
* TCWG-423, GNU vector extension support in AArch64 GDB. [2/10]
Writing patches. Found more cases on AArch64 where GDB doesn't
fully understand the AArch64 calling convention. Need more work here.
* Review ARM GDBserver software single step patch. [1/10]
* Misc, meeting, email, [1/10]
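As a small illustration of what the GNU vector extension items (TCWG-422, TCWG-423) deal with, the sketch below declares a vector type and passes and returns it by value. It is an assumption-laden example rather than one of the GDB test cases: v4si, add_v4si and sum_lanes are hypothetical names, and the code relies on GCC or Clang vector extension support.

```c
/* A 16-byte vector of four ints, declared with the GNU vector extension
   (supported by GCC and Clang). */
typedef int v4si __attribute__((vector_size(16)));

/* Element-wise addition. The vectors are passed and returned by value,
   which is the part of the calling convention a debugger has to model
   in order to call such a function or display its return value. */
v4si add_v4si(v4si a, v4si b)
{
    return a + b;
}

/* Sum the four lanes using vector subscripting. */
int sum_lanes(v4si v)
{
    return v[0] + v[1] + v[2] + v[3];
}
```

Stopping in add_v4si under a debugger and printing a, or using "finish" to inspect the returned value, exercises the argument-passing and return-value handling mentioned above.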
# Plan #
* Off on Wed afternoon.
* TCWG-422, post patches
* TCWG-423, continue.
--
Yao
The Linaro Toolchain Working Group is pleased to announce the availability
of the Linaro Stable Binary Toolchain GCC 5.2-2015.11-rc1
Release-Candidate Archives.
http://snapshots.linaro.org/components/toolchain/binaries/5.2-2015.11-rc1/
http://snapshots.linaro.org/components/toolchain/gcc-linaro/5.2-2015.11-rc1/
These archives provide cross-toolchain executables (compiler, debugger,
linker, etc.) and shared libraries (libstdc++, libc, etc.) that target ARM
or AArch64 GNU/Linux and bare-metal environments. The cross-toolchain
binaries execute on a Linux or MS Windows (under mingw32) host
operating-system.
Please evaluate this release-candidate for correctness. Linaro will
shortly spin the Linaro GCC 5.2-2015.11 release if this release-candidate
passes stakeholder validation.
For bugs related to this release-candidate please email
linaro-toolchain(a)lists.linaro.org or file a bug at
https://bugs.linaro.org/enter_bug.cgi?product=Linux%20Binary%20toolchain
NEWS
* GCC 5.2 2015.11-rc1
The Linaro GCC 5.2 2015.11-rc1 binary toolchain release-candidate is
built from the Linaro GCC-5.2-2015.11 release-candidate source archive.
The Linaro GCC-5.2-2015.11 release source archive is derived from the same
sources as the Linaro GCC-5.2-2015.10 snapshot source archive.
--
Ryan S. Arnold
Linaro Toolchain Working Group - Engineering Manager
www.linaro.org
Dear List,
I'm new to this list and have some questions.
Looking at the code GCC generates for ARMv8, we noticed some areas where there is room for performance improvement.
I assume that you may already have noticed these items.
For example:
1) We noticed that when writing typical DGEMM-like code, GCC includes unnecessary DUP instructions.
2) GCC seems unwilling to use LDP loads.
3) For optimal FPU performance on some A57, it is necessary to interleave instructions working on odd and even registers.
GCC does not seem to support this properly. Here a 100% performance increase can sometimes be reached by different instruction interleaving.
4) Some work loops benefit greatly from interleaving FPU instructions and loads.
GCC seems to like to rearrange the code so that most or all loads are put at the top of the loop.
This can significantly reduce the performance of a well-written work loop.
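To make "DGEMM-like code" in item 1 concrete, a minimal sketch is given here. It is an illustrative assumption only, not the code that was measured: dgemm_naive is a hypothetical name, and the comment about broadcasting describes where a vectorizing compiler would typically need the scalar replicated across lanes, not a verified observation of GCC output.

```c
#include <stddef.h>

/* C[m x n] += A[m x k] * B[k x n], all matrices row-major. */
void dgemm_naive(size_t m, size_t n, size_t k,
                 const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < m; i++) {
        for (size_t p = 0; p < k; p++) {
            /* One scalar of A is reused across a whole row of B. When the
               inner loop is vectorized, this scalar has to be available in
               every lane, which is where a broadcast (DUP) can appear. */
            double aip = a[i * k + p];
            for (size_t j = 0; j < n; j++)
                c[i * n + j] += aip * b[p * n + j];
        }
    }
}
```

Compiling such a kernel with optimization enabled and inspecting the assembly output of the inner loop is one way to look at the DUP, LDP and load-placement behaviour listed above.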
I have no patches to fix this.
But I can produce C code and ASM output which will show these performance issues.
Please tell me what the recommended next step is.
Are all these items known already, or shall I provide code examples to further explain them?
Kind regards
Gunnar von Boehn
== Progress ==
* Validation
- comparing results and build times between the 2 labs
- tuning jobs scheduling to avoid deadlocks
* GCC trunk monitoring
- lots of validation results to check after 1 week of holidays :-)
- a few regressions/new failures/wrong tests reported
* Backports
- a few reviews
== Next ==
* Infrastructure/Validation
* GCC dev: try to fix vqtbl intrinsics for aarch64_be before the end of stage1
The Linaro Toolchain Working Group (TCWG) is pleased to announce the
2015.10 snapshot of the Linaro GCC 4.9 source package.
This snapshot[2] is based on FSF GCC 4.9.4-pre+svn229467 and includes
performance improvements and bug fixes backported from mainline GCC.
The contents of this snapshot will be part of the 2015.11 stable [1]
quarterly release.
This snapshot tarball is available on:
http://snapshots.linaro.org/components/toolchain/gcc-linaro/4.9-2015.10/
Interesting changes in this GCC source package snapshot include:
* Updates to GCC 4.9.4-pre+svn229467
* Backport of [Bugfix] PR tree-optimization/65735
* Backport of [Bugfix] PR tree-optimization/65177
* Backport of [Bugfix] PR tree-optimization/65048
Feedback and Support
Subscribe to the important Linaro mailing lists and join our IRC
channels to stay on top of Linaro development.
** Linaro Toolchain Development mailing list:
http://lists.linaro.org/mailman/listinfo/linaro-toolchain
** Linaro Toolchain IRC channel on irc.freenode.net at #linaro-tcwg
* Bug reports should be filed in bugzilla against GCC product:
http://bugs.linaro.org/enter_bug.cgi?product=GCC
* Interested in commercial support? Inquire at Linaro support:
mailto:support@linaro.org
[1]. Stable source package releases are defined as releases where the
full Linaro Toolchain validation plan is executed.
[2]. Source package snapshots are defined when the compiler is only
put through unit-testing and full validation is not performed.
== Progress ==
o Linaro GCC (9/10)
* Backports and reviews for our GCC 5 2015.11 snapshot
* FSF branch merge and needed backports for our GCC 4.9 2015.10 snapshot
o Misc (1/10)
* Various meetings
== Plan ==
o Complete 4.9 snapshot
Noise control experiments - TCWG-358 [3/10]
* Some analysis of data to date
Debian filesystem - TCWG-360 [3/10]
* Got stuck on LAVA interactions
* Now boots far enough to be usable from LAVA; needs some cleanup and testing with
real benchmark runs
Benchmarking-via-Jenkins - TCWG-348 [1/10]
* Picked this back up on the understanding that a LAVA uinstance is coming
* Hacked pbl.py (post-build-lava) to generate suitable JSON
** As a bonus, this can work as a CLI job-submission tool
=Plan=
Holiday Friday (pending approval)
Set up crashdumping on my Juno, try to learn why it crashes
Finish Debian filesystem
Get Jenkins generating and submitting jobs suitable for uinstance
Write up noise control report (probably will get bumped to next week)