- linaro-toolchain - lists.linaro.org

by Peter Maydell

RAG:
Red:
Amber:
Green:

Current Milestones:
|                     | Planned    | Estimate   | Actual |
| qemu-linaro 2011-04 | 2011-04-21 | 2011-04-21 |        |

Historical Milestones:
| finish qemu-cont-integration | 2011-01-25 | 2011-01-25 | handed off |
| first qemu-linaro release    | 2011-02-08 | 2011-02-08 | 2011-02-08 |
| qemu-linaro 2011-03          | 2011-03-08 | 2011-03-08 | 2011-03-08 |

== maintain-beagle-models ==
* I spent a couple of days on initial cleanup of the omap3 patchstack in qemu-linaro. It's still some way from being upstreamable but at least now every patch in the stack compiles; this should make rebasing on upstream a bit less painful.
* the board-ram-limits patchset is still stalled with upstream :-(

== merge-correctness-fixes ==
* Aurelien applied lots of patches so the pipeline has drained again
* cleaning up/reworking patches which fix handling of Neon UNDEF cases. Not very exciting but it will get a large set of patches out of the qemu-linaro patchstack.

== other ==
* meetings: toolchain, standup

Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus

Absences:
Holiday: 22 Apr - 2 May
9-13 May: UDS, Budapest
(maybe) 15-16 August: QEMU/KVM strand at LinuxCon NA, Vancouver [LinuxCon proper follows on 17-19th]

14 years, 3 months


[Merge] lp:~rsandifo/gcc-linaro/lp-714921-4.5 into lp:gcc-linaro

by noreply＠launchpad.net

The proposal to merge lp:~rsandifo/gcc-linaro/lp-714921-4.5 into lp:gcc-linaro has been updated.

Status: Needs review => Merged

For more details, see:
https://code.launchpad.net/~rsandifo/gcc-linaro/lp-714921-4.5/+merge/56525
--
https://code.launchpad.net/~rsandifo/gcc-linaro/lp-714921-4.5/+merge/56525
Your team Linaro Toolchain Developers is subscribed to branch lp:gcc-linaro.

14 years, 3 months


Launchpad: Validate your team's contact email address

by Launchpad Email Validator

Hello The Launchpad user named 'Michael Hope (michaelh1)' requested the registration of 'linaro-toolchain(a)lists.linaro.org' as the contact email address of team 'Linaro Toolchain Developers'. This request can only be made by a team owner/administrator, so if this change request was unexpected or was not requested by one of the team's administrators, please contact system-error(a)launchpad.net. If you want to make this email address the contact email of 'Linaro Toolchain Developers', please click on the link below and follow the instructions. https://launchpad.net/token/6sQQKQ6kx3XP9MlWMwmX Thanks, The Launchpad Team

14 years, 3 months


New porter boxes

by Michael Hope

Hi there. The new porter boxes are now available for use. See:
https://wiki.linaro.org/WorkingGroups/ToolChain/Hardware
for details.

These are PandaBoards with 768 MB of memory, a USB HDD, and a good internet connection. They can be used for day to day jobs like building programs, triaging bugs, and running benchmarks.

Use 'dchroot natty' to switch into the chroot. Use 'sudo apt-get install yyy' to install packages. The build dependencies for GCC, GDB, and binutils should already be installed.

-- Michael

14 years, 3 months


[ACTIVITY] April 3-7

by Ira Rosen

Hi,

* continued bringing patches upstream
  - changing default vector size to 128 - resubmitted with changes according to comments, awaiting review
  - if-conversion improvement - committed
* PR 48252 - bug in vzip/vuzp/vtrn implementation - patch submitted
* opened PR 48454 - a test failure with -mvectorize-with-neon-quad

Next week - vacation.
April 18-27 - Passover Holiday, I'll only work half days on April 18 and April 24. And possibly half days on April 20 and 21.

Ira

14 years, 3 months


Tegra2 errata bug

by David Gilbert

https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/739374 is the bug relating to the Tegra errata mentioned today. Dave

14 years, 3 months


[ACTIVITY] Apr 04 - Apr 06

by Ulrich Weigand

== GDB ==
* Ongoing work to fix single-stepping over signal handlers (bug #615978).
* Posted patch to support NEON registers in core files (bug #615972).
* Failure to disable address space randomization (bug #616001) is shown to be a kernel problem; created stand-alone test case and opened bug against kernel team.

== Schedule ==
* On vacation 04/07 - 04/15.

Best Regards
Ulrich Weigand

--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Chairman of the Supervisory Board: Martin Jetter | Management: Dirk Wittkopp
Registered office: Böblingen | Court of registration: Amtsgericht Stuttgart, HRB 243294

14 years, 3 months


Re: Linaro 4.5.2 v. CodeSourcery 4.5.1

by Andrew Stubbs

On 25/03/11 21:48, Diane Holt wrote:
> I hope you don't mind me sending you mail, but I'm a bit stuck...I've
> been told I need the Linaro 4.5.2 toolchain because it has some "neon
> optimizations" that the CS 4.5.1 doesn't have.

In general, you'd be better addressing these questions on the Linaro Toolchain mailing list: linaro-toolchain(a)lists.linaro.org (I've copied it in). Not least because I'm on vacation for the next week. :)

> Unfortunately, the Linaro
> 4.5.2 that's available for download (already built) won't work in my
> Scratchbox environment, since it was compiled against a glibc that's too
> new. The CS 4.5.1 works fine -- but I'm not allowed to use it, because
> of the neon stuff.

The CS and Linaro compilers are really very similar, but CodeSourcery has not made a release since the autumn, so Linaro will have some extra features.

> Do you know whether CS actually does have (or will have) the same neon
> optimizations Linaro has?

It depends which optimizations you are referring to. The existing CS release had the latest improvements at the time it was released, and I believe that the upcoming release will probably be very similar to Linaro (at least, with respect to ARMv7 - there'll be many differences for other architecture variants), but I'm not promising that.

Sorry if that's a bit vague, but the contents of the next CS release are still not finalised.

> If it doesn't (and won't), then I'm going to have to build the Linaro
> one from source. Unfortunately, I've not been able to find any detailed
> information on how to go about doing that. Do you know if that's
> documented anywhere?

Are you talking about building a native compiler, or a cross-compiler? The former is very simple (provided you have all the dependencies), while the latter is more involved.

Here's the recipe to build a native compiler:

    tar xf gcc-linaro.....tar.bz2
    mkdir objdir
    cd objdir
    ../gcc-linaro....../configure --prefix=<your-install-path> <opts>
    make bootstrap
    make install

You can copy the configure <opts> from another compiler using 'gcc -v', and './configure --help' in the source tree should tell you what they mean.

If you want to build a cross compiler, I suggest you look at crosstool or crosstool-ng, or OpenEmbedded. Building cross-toolchains is non-trivial.

Hope that helps.

Andrew

14 years, 3 months


CoreMark over releases

by Michael Hope

Here's some pretty pictures of the relative CoreMark performance on a Cortex-A9. Note that in accordance with the CoreMark rules, all results are estimates.

Linaro GCC 4.5 and mainline GCC vs 4.5.2:
http://people.linaro.org/~michaelh/incoming/all-o3-vs-gcc-4.5.2.png

Linaro GCC 4.5 releases vs our first:
http://people.linaro.org/~michaelh/incoming/releases-o3-vs-gcc-linaro-4.5-2…

Linaro GCC 4.5 builds between 2010.08 and 2010.10:
http://people.linaro.org/~michaelh/incoming/releases-o3-vs-gcc-linaro-4.5-2…

The 2.5 % drop in performance for 2010.09 is probably due to r99379. The further 2 % drop in 2010.10 is due to re-syncing with mainline 4.5.2. Performance has been moving solidly upwards but it has to start from the -4.5 % from these two updates.

Chung-Lin's unsigned extension changes should be interesting...

-- Michael
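The -4.5 % figure adds the two drops together; compounding them gives almost the same number. A minimal sketch in C of that arithmetic, assuming the two percentages quoted above are the only inputs. The function name compound_change_pct is made up for illustration and is not part of CoreMark or any Linaro tooling.

```c
#include <stddef.h>

/* Combine successive relative changes (given in percent) multiplicatively.
 * Hypothetical helper for illustration only. */
double compound_change_pct(const double *steps_pct, size_t n)
{
    double factor = 1.0;
    for (size_t i = 0; i < n; i++)
        factor *= 1.0 + steps_pct[i] / 100.0;   /* e.g. -2.5 % -> 0.975 */
    return (factor - 1.0) * 100.0;              /* back to percent */
}
```

Called with the two steps -2.5 and -2.0, this comes out at roughly -4.45, so the simple sum of -4.5 % is a fair approximation.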

14 years, 3 months


Linaro toolchain integration with ChromiumOS build

by PANKAJ KUMAR DUBEY

14 years, 3 months


[ACTIVITY] Weekly status

by Richard Sandiford

== Last week ==

* Finished the patch that I was working on last week to use memory operands rather than register operands in neon.md. Submitted upstream:
  http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01996.html
  Among other things, this allows the intrinsics to use post-modified addresses.

* Submitted patches to make the number of rtl generator arguments (as opposed to insn operands) available to the expand-time code:
  http://gcc.gnu.org/ml/gcc-patches/2011-03/msg02227.html
  http://gcc.gnu.org/ml/gcc-patches/2011-03/msg02228.html
  http://gcc.gnu.org/ml/gcc-patches/2011-03/msg02229.html
  This is part of the tree-rtl expansion "cleanups" that I've been doing in preparation for the vectoriser work.

* More discussion about the handling of type modes vs. per-function target switching. I think we've agreed what the right approach is, although it's probably outside the scope of this project. The discussion was still useful because it meant I could submit & defend the next patch.

* Submitted a patch to use non-BLK modes for arrays of vectors (like uint32x2x2_t & co. in arm_neon.h):
  http://gcc.gnu.org/ml/gcc-patches/2011-03/msg02192.html
  This avoids the stack spilling that was discussed during the week. Richard Guenther seemed happy with the patch in principle, but understandably wanted to see how the optabs stuff worked out first. Also, the testcase he asked me to try exposed another instance of:
  http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46329
  so that needs to be fixed first.

* Started writing & testing a fix for that PR (46329).

== This week ==

* Finish fix for PR46329.
* More vld & vst stuff.

Richard

14 years, 3 months


[ACTIVITY] Mar.28 -- Apr.03

by Chung-Lin Tang

== Last week ==

* PR48250 / CS Issue #9845 / Launchpad #723185. Unaligned DImode reload under NEON. Went back and forth with Richard Earnshaw on gcc-patches for most of the week. The issues should finally be clear, and I think it would be better to modify the significant parts of arm_legitimize_reload_address() to do the right thing rather than just fixing the bug. I have a new patch done over the weekend, though it still shows a few regressions after some testing. I hope this gets done by this week...

* PR48325 / Launchpad #744754, another NEON ICE in postreload. This appears to be because the IA/DB modes for VLDM/VSTM for NEON struct modes were not enabled. This ICE actually does not happen currently on upstream trunk, but sent patch anyway. Pending review.

* Spent some time on Launchpad #736661 (C++ ICE in expr.c), and looked at upstream testsuite regressions of gcc.dg/pr17957.c and gcc.dg/torture/pr47975.c under -mfpu=neon (ICE on OImode const0_rtx assignment).

* Call with Ramana on ARM optimization work.

== This week ==

* Get PR48250/Launchpad #723185 nailed.
* Other pending GCC issues.
* TW Public Holiday, Mon. and Tue. (Apr.4-5)

14 years, 3 months


Improvements to bzr

by Michael Hope

The Bazaar team have been working on improving the performance of bzr on the gcc-linaro tree. Here's how long the steps take on my machine with the current 2.4 development version:

Update tip before branching:
  bzr pull                                                 20.4 s (no revisions)

Make the branch:
  bzr branch --hardlink 4.5 optspace                       26.8 s

Do some work and commit it:
  ...change two files
  bzr status                                               1:05
  (again) bzr status                                       1.7 s
  bzr commit .                                             3.6 s

Push the changes up:
  bzr push lp:~michaelh1/gcc-linaro/optspace               3:47   ~9 MB (~40 kB/s which is saturating my uplink)

Later, the merge master pulls the branch down and merges:
  bzr branch --no-tree lp:~michaelh1/gcc-linaro/optspace   36 s   ~900 k
  bzr merge ../optspace                                    3:26

The bzr status and bzr commit are quite good. I've asked them to look into bzr merge.

-- Michael

14 years, 3 months


can linaro toolchain compile ARM earlier than Cortex A8?

by Barry Song

Hi All,

After downloading the Linaro toolchain with apt-get on Ubuntu, I compiled uboot for an ARM1136 SoC with the -march=armv5 option, and it compiled successfully. When I ran that uboot on the target boards, the system failed with "undefined instructions".

I checked how the Linaro toolchain is configured:

    #arm-linux-gnueabi-gcc -v
    Using built-in specs.
    COLLECT_GCC=arm-linux-gnueabi-gcc
    COLLECT_LTO_WRAPPER=/usr/lib/gcc/arm-linux-gnueabi/4.5.2/lto-wrapper
    Target: arm-linux-gnueabi
    Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 4.5.2-5ubuntu2~ppa1' --with-bugurl=file:///usr/share/doc/gcc-4.5/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.5 --enable-shared --enable-multiarch --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/arm-linux-gnueabi/include/c++/4.5.2 --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-plugin --enable-gold --enable-ld=default --with-plugin-ld=ld.gold --enable-objc-gc --disable-sjlj-exceptions --with-arch=armv7-a --with-float=softfp --with-fpu=vfpv3-d16 --with-mode=thumb --disable-werror --enable-checking=release --program-prefix=arm-linux-gnueabi- --includedir=/usr/arm-linux-gnueabi/include --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=arm-linux-gnueabi --with-headers=/usr/arm-linux-gnueabi/include --with-libs=/usr/arm-linux-gnueabi/lib
    Thread model: posix
    gcc version 4.5.2 (Ubuntu/Linaro 4.5.2-5ubuntu2~ppa1)

The important options are "--with-arch=armv7-a --with-float=softfp --with-fpu=vfpv3-d16". I just want to ask whether these options stop arm-linux-gnueabi-gcc from supporting older architectures.

According to the GCC documentation at http://gcc.gnu.org/install/configure.html:

    --with-cpu=cpu
    --with-cpu-32=cpu
    --with-cpu-64=cpu
        Specify which cpu variant the compiler should generate code for by default. cpu will be used as the default value of the -mcpu= switch. This option is only supported on some targets, including ARM, i386, M68k, PowerPC, and SPARC. The --with-cpu-32 and --with-cpu-64 options specify separate default CPUs for 32-bit and 64-bit modes; these options are only supported for i386, x86-64 and PowerPC.

    --with-schedule=cpu
    --with-arch=cpu
    --with-arch-32=cpu
    --with-arch-64=cpu
    --with-tune=cpu
    --with-tune-32=cpu
    --with-tune-64=cpu
    --with-abi=abi
    --with-fpu=type
    --with-float=type
        These configure options provide default values for the -mschedule=, -march=, -mtune=, -mabi=, and -mfpu= options and for -mhard-float or -msoft-float. As with --with-cpu, which switches will be accepted and acceptable values of the arguments depend on the target.

So these are only default values for later compiles, and users should be able to switch to other values by setting other options. Why, then, did arm-linux-gnueabi-gcc still generate instructions that are undefined on the ARM1136 when given -march=armv5? (The ARM1136 is in fact ARMv6.)

Then I built a toolchain myself from the Linaro gcc-linaro-4.4-2011.02-0 sources. The options are simple:

    #arm-none-linux-gnueabi-gcc -v
    Using built-in specs.
    Target: arm-none-linux-gnueabi
    Configured with: ../gcc-linaro-4.4-2011.02-0/configure --target=arm-none-linux-gnueabi --prefix=/home/vmuser/development/toolchain/build-toolchain/tools --enable-languages=c,c++ --disable-libgomp
    Thread model: posix
    gcc version 4.4.5 (Linaro GCC 4.4-2011.02-0)

I compiled uboot again with this toolchain, and it works. So why does the toolchain I built myself support more architectures? And what performance do I lose with my build?

Thanks
Barry
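One way to check what the compiler targets for your own sources under a given -march is to look at the architecture macros it predefines. The sketch below is a hedged illustration: the helper names target_arch_name and compiled_as_thumb are made up, and only a few of GCC's __ARM_ARCH_*__ macros are listed. It only describes code compiled from your own files. Prebuilt libraries that get linked in, such as libgcc, were compiled with the toolchain's configured defaults (armv7-a and Thumb mode in the -v output above), which is one plausible source of the undefined instructions. That is an assumption and is not confirmed in this message.

```c
/* Name the ARM architecture macro predefined for this translation unit.
 * Only a handful of GCC's __ARM_ARCH_*__ macros are checked here; other
 * -march values define macros this sketch does not list. */
const char *target_arch_name(void)
{
#if defined(__ARM_ARCH_7A__)
    return "armv7-a";
#elif defined(__ARM_ARCH_6__)
    return "armv6";
#elif defined(__ARM_ARCH_5TE__)
    return "armv5te";
#elif defined(__ARM_ARCH_5__)
    return "armv5";
#elif defined(__arm__)
    return "arm (architecture version not listed in this sketch)";
#else
    return "not an ARM target";
#endif
}

/* Return 1 when this unit is compiled to Thumb code, 0 otherwise. */
int compiled_as_thumb(void)
{
#if defined(__thumb__)
    return 1;
#else
    return 0;
#endif
}
```

Compiling a small file that prints these two values with the same flags as the uboot build shows whether -march=armv5 took effect for that file, and whether the configured Thumb default is still in force.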

14 years, 3 months


[Activity] March 28 - April 1

by Ramana Radhakrishnan

== GCC ==

Progress:
* Investigated excessive VFP moves. Partially investigating ways forward.
* Polished up my divmodsi4 patch. Discussed it during the call. Looking for ways to do it properly at the tree level.
* Got Panda board on Friday.
* Off on Wednesday.
* Conversations with Revital and Chung-Lin. Need to sync up with Andrew next week.
* Found an issue with binutils and Neon and this is now LP:747837

Plans:
* Continue looking at excessive VFP moves.
* Finish working through Thumb2 speed tickets.
* Set up new Panda board.
* Conversation with Andrew sometime this week.

Meetings:
* 1-1s
* Linaro toolchain meeting

Absences:
* April 15 – 26 -> Booked Holiday.
* May 9-14 - LDS Budapest

14 years, 3 months


[ACTIVITY] Mar 28 - Apr 01

by Ulrich Weigand

== GDB ==
* Committed patch to fix single-stepping across bad ARM/Thumb boundary (bug #667309) to mainline and Linaro GDB.
* Committed patch to fix accessing "fpscr" register to mainline.
* Ongoing work to fix single-stepping over signal handlers (bug #615978). Posted yet another updated patch to gdb-patches for comments.
* Implemented patch to support NEON registers in core files (bug #615972).
* Investigated failure to disable address space randomization (bug #616001).

Best Regards
Ulrich Weigand

14 years, 3 months


[ACTIVITY] weekly status

by Ken Werner

Hi,

== pandaboard ==
* noticed that hw perf events are not working on 2.6.38-1001-linaro-omap
  * it seems that the omap kernel has not configured its PMU properly
  * perf_event_open syscall returns ENODEV
  * started discussion with agreen (#744458)
* noticed that natty puts its glibc into a multilib path
  * prevents linaro gcc (and upstream) from being built

== libunwind ==
* created a generic and local variant of the extbl parser
* ran the test suite a few times using different unwind methods
* started to look into the test suite failures
* started to fix a couple of the failures on ARM

Regards
Ken

14 years, 3 months


[ACTIVITY] report week 13

by Peter Maydell

RAG:
Red:
Amber:
Green: the aircon has been fixed; blessed quiet again

Current Milestones:
|                     | Planned    | Estimate   | Actual |
| qemu-linaro 2011-04 | 2011-04-21 | 2011-04-21 |        |

Historical Milestones:
| finish qemu-cont-integration | 2011-01-25 | 2011-01-25 | handed off |
| first qemu-linaro release    | 2011-02-08 | 2011-02-08 | 2011-02-08 |
| qemu-linaro 2011-03          | 2011-03-08 | 2011-03-08 | 2011-03-08 |

== maintain-beagle-models ==
* the board-ram-limits patchset has been expanded significantly to address upstream suggestions; it now includes a lot of refactoring of sun4m (sparc) board code to use the new generic max-ram functionality instead of a sun4m-specific bit of code. Unfortunately there is still some pushback upstream on the grounds that a simple max-ram limit doesn't cater for complicated NUMA situations :-(

== merge-correctness-fixes ==
* working on moving implementation of VLD/VST "multiple structures" forms into qemu helper functions; the current implementation is correct but can expand to hundreds of TCG ops which is well beyond the maximum permitted value, so could potentially overrun a TCG buffer

== other ==
* wrote up some technical/engineering input into what we ought to be doing with qemu next cycle
* review of a patch by Dmitry Eremin-Solenikov adding ARMv4/v4T support
* some review of s390 TCG patches (not because we have a direct interest in s390 but as part of being a good citizen upstream)
* sent a pull request for some neon patches that had been on the list a few weeks; hopefully this will help drain the patch pipeline
* meetings: toolchain, standup, pdsw-tools

Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus

Absences:
Holiday: 22 Apr - 2 May
9-13 May: UDS, Budapest
(maybe) ~17-19 August: QEMU/KVM strand at LinuxCon NA, Vancouver

14 years, 3 months


[ACTIVITY] March 27-31

by Revital Eres

Hello,

* Submitted merge requests for SMS patch to gcc-linaro and gcc-linaro/4.6.
* Testing SMS patch which extends the current implementation to consider loops that contain instructions with REG_INC_NOTE.
* Filed PRs 48336 48380 for recent fails of trunk on ARM.
* Had a chat with Ramana about the DENbench benchmarks, directions and findings.
* Filed PR 745743 in linaro gcc-bugzilla

Thanks,
Revital

14 years, 3 months


[ACTIVITY] March 27-31

by Ira Rosen

Hi,

* continued bringing patches upstream
  - auto-detection of vector size - committed
  - changing default vector size to 128 - submitted and testing the final version
  - if-conversion improvement - submitted and now testing the final version
* gcc-linaro-4.6 - submitted a merge request for store sink patch (this patch is already upstream)

Ira

14 years, 3 months


Stand up call

by Michael Hope

Hi there. Just a reminder that the standup call is now at 0900 UTC on Thursdays. The first is in about half an hour. -- Michael

14 years, 3 months


NEON intrinsics and stack access

by Michael Hope

For reference. We know that the NEON intrinsics in GCC have issues. I came across this page:
http://hilbert-space.de/?p=22
which has a colour to greyscale conversion done using intrinsics.

gcc-linaro-4.5-2011.03-0 does poorly through saving intermediate values on the stack. The core of the loop is:

    .L3:
        mov       ip, r4
        vld3.8    {d16-d18}, [r6]
        vstmia    r4, {d16-d18}
        ldmia     ip!, {r0, r1, r2, r3}
        mov       sl, r9
        adds      r7, r7, #1
        adds      r6, r6, #24
        stmia     sl!, {r0, r1, r2, r3}
        fldd      d16, [sp, #24]
        fldd      d18, [sp, #32]
        ldmia     ip, {r0, r1}
        vmull.u8  q8, d16, d19
        stmia     sl, {r0, r1}
        vmlal.u8  q8, d18, d20
        fldd      d18, [sp, #40]
        vmlal.u8  q8, d18, d21
        vshrn.i16 d16, q8, #8
        vst1.8    {d16}, [r5]
        adds      r5, r5, #8
        cmp       r8, r7
        bgt       .L3

llvm-2.9~svn128540 does much better:

        vld3.8    {d20, d21, d22}, [r1]!
        add       r3, r3, #1
        cmp       r3, r2
        vmull.u8  q12, d21, d16
        vmlal.u8  q12, d20, d17
        vmlal.u8  q12, d22, d18
        vshrn.i16 d19, q12, #8
        vst1.8    {d19}, [r0]!
        blt       .LBB0_1

and may actually be better than the hand-written assembler on Nils's page due to scheduling the loop comparison earlier.

Richard S, were you looking into this?

-- Michael
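For readers without the linked page to hand, the kind of intrinsics loop that produces the vld3.8 / vmull.u8 / vmlal.u8 / vshrn.i16 / vst1.8 sequence in both listings can be sketched as follows. This is an illustrative reconstruction, not the code from that page: the function name rgb_to_grey and the weights 77/151/28 are assumptions, and a scalar path is included so the sketch also builds on machines without NEON.

```c
#include <stdint.h>
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Convert n interleaved RGB pixels (3 bytes each) to 8-bit grey.
 * The weights 77/151/28 sum to 256 and are assumed for illustration. */
void rgb_to_grey(uint8_t *dest, const uint8_t *src, int n)
{
    int i = 0;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    const uint8x8_t rfac = vdup_n_u8(77);
    const uint8x8_t gfac = vdup_n_u8(151);
    const uint8x8_t bfac = vdup_n_u8(28);

    /* Eight pixels per iteration; vld3 de-interleaves R, G and B. */
    for (; i + 8 <= n; i += 8) {
        uint8x8x3_t rgb = vld3_u8(src + 3 * i);
        uint16x8_t acc = vmull_u8(rgb.val[0], rfac);   /* R * 77   */
        acc = vmlal_u8(acc, rgb.val[1], gfac);         /* + G * 151 */
        acc = vmlal_u8(acc, rgb.val[2], bfac);         /* + B * 28  */
        vst1_u8(dest + i, vshrn_n_u16(acc, 8));        /* >> 8, narrow */
    }
#endif
    /* Scalar tail, and the whole loop when NEON is unavailable. */
    for (; i < n; i++) {
        unsigned acc = 77u * src[3 * i] + 151u * src[3 * i + 1]
                     + 28u * src[3 * i + 2];
        dest[i] = (uint8_t)(acc >> 8);
    }
}
```

The uint8x8x3_t value returned by vld3_u8 is the sort of array-of-vectors type that the GCC listing above appears to be bouncing through the stack between the load and the multiplies.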

14 years, 3 months


Feature branch for var-tracking OOM

by Richard Sandiford

I've put the 4.5 version of the var-tracking OOM patch here: lp:~rsandifo/gcc-linaro/lp-714921-4.5 Richard

14 years, 3 months


Toolchain today

by Michael Hope

Hi there. A reminder that today's call has shifted due to the European daylight savings change. It's now at 0800 UTC which is 9 am in the UK, 10 am in central Europe, and 10 am in Israel. -- Michael

14 years, 3 months


[ACTIVITY] Mar.21 -- Mar.27

by Chung-Lin Tang

== Last week ==

* PR46934: Thumb-1 ICE, small fix in the "casesi" jump-table expand code. Quickly approved and committed upstream.

* Enhance XOR patch for gcc/simplify-rtx.c. Updated comments and committed upstream.

* PR48250 / CS Issue #9845 / Launchpad #723185. Unaligned DImode reload under NEON. Submitted patch upstream, but still need to do some more verification that older pre-ARMv5TE cases are safe. Should complete this week.

* Working on a type of ICE seen currently on upstream trunk, a few testcases failing under '-O3 -g'. It seems VTA related, but also might have something to do with register elimination not fully done for (var_location (entry_value ...)) expressions, leaving [afp+#num] memory addresses existing in debug insns after reload. Still investigating.

* Launchpad #689887, ICE in get_arm_condition_code(). Pushed a merge request to Linaro 4.5 for this patch. Also another LP#742961 appeared as another case of this ICE...

* Still working on (what I think should be) the last of the CoreMark ARMv6 regressions. The problem is to combine uxtb+cmp into ands #255. This could be done by adding (set (cc) (compare (zero_extend...))) patterns, implemented by ands assembly, but still looking if this can be done (probably more elegantly) by something like CANONICALIZE_COMPARISON (replacing compare operands) in the ARM backend.

* Launchpad #736007, ICE immed_double_const under -mfpu=neon -g. Some discussion on gcc-patches about this, still unclear on what should be done...

== This week ==

* Push forward on above issues.
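As a hedged illustration of the uxtb+cmp item, the source shape in question looks roughly like the first function below: a value is zero-extended from 8 bits and then compared against zero. The second function is the equivalent mask-and-test form, which is what a single flag-setting ands #255 computes. Both function names are made up, and whether a particular GCC version emits uxtb followed by cmp for the first one is an assumption, not something stated in the report.

```c
/* Zero-extend the low byte, then compare with zero.
 * On ARMv6 this can come out as uxtb followed by cmp. */
int low_byte_is_zero(unsigned int x)
{
    unsigned char b = (unsigned char)x;   /* zero_extend from 8 bits */
    return b == 0;                        /* compare against zero */
}

/* The same test as a mask: (x & 255) == 0, i.e. the ands #255 form,
 * where the AND itself sets the condition flags. */
int low_byte_is_zero_masked(unsigned int x)
{
    return (x & 255u) == 0;
}
```

The two functions return the same result for every input, which is why folding the extension into the comparison is a safe transformation when the extended value is not needed afterwards.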

14 years, 3 months


[ACTIVITY] 21st - 26th March

by Andrew Stubbs

Committed Dan's RVCT interoperation patch, both upstream and to Linaro GCC 4.6.

Adjusted Bernd's "Discourage NEON on Cortex-A8" patch following Richard Earnshaw's comments, and reposted upstream. The new version was approved, and committed. I've also submitted a merge proposal to Linaro GCC 4.6.

Dropped Tom's patch for marking small strings read-only. This optimization seems to have no visible effect for ARM in GCC 4.6. I'll leave it to Tom to forward-port, if it's still meaningful for MIPS.

Julian has committed the patch for lp:675347, so I've submitted merge requests to both Linaro GCC 4.5 and 4.6.

Bernd has posted the shrink wrapping patches upstream. I've posted this info in all the relevant Linaro tracking tickets.

Talked Revital Eres through the Bazaar/Launchpad merge request system.

Tried to understand why GCC 4.6 does not use multiply-and-accumulate efficiently, when used with 64-bit values. It seems that the compiler sometimes uses (subreg:SI (reg:DI ...)) and sometimes just uses a plain (reg:SI ..) and those don't combine to give useful patterns, but I haven't got to the bottom of it yet.

Tested an FSF GCC 4.6 snapshot from the 23rd. All well, so I've merged it to the Linaro GCC 4.6 branch.

* Future Absence

Away Monday 28th to Friday 1st April.

----

Upstream patches requiring review:

* Thumb2 constants:
  http://gcc.gnu.org/ml/gcc-patches/2010-12/msg00652.html

* ARM EABI half-precision functions
  http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00874.html

* ARM Thumb2 Spill Likely tweak
  http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00880.html

* NEON scheduling patch
  http://gcc.gnu.org/ml/gcc-patches/2011-02/msg01431.html
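For context on the multiply-and-accumulate item, the C shape involved is a 32x32 multiply widened to 64 bits and added to a 64-bit accumulator. A minimal sketch, with made-up function names mac64 and dot; on ARM this shape is a candidate for the smlal instruction, but whether a given compiler version actually emits it is exactly the open question in the report, so nothing here should be read as a statement about GCC 4.6's output.

```c
#include <stdint.h>

/* 32x32 -> 64-bit multiply, accumulated into a 64-bit sum.
 * Hypothetical example; a candidate for a single smlal on ARM. */
int64_t mac64(int64_t acc, int32_t a, int32_t b)
{
    return acc + (int64_t)a * (int64_t)b;
}

/* Typical use: a dot product with a 64-bit accumulator. */
int64_t dot(const int32_t *x, const int32_t *y, int n)
{
    int64_t sum = 0;
    for (int i = 0; i < n; i++)
        sum = mac64(sum, x[i], y[i]);
    return sum;
}
```

Compiling a function like dot with -O2 -S and looking for smlal versus a separate multiply and 64-bit add is a quick way to see which pattern the compiler picked.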

14 years, 3 months


[ACTIVITY] weekly status

by Ken Werner

Hi,

== libunwind ==
* modified the extbl parser to operate on the DWARF model directly
  * this adds support for unwinding call stacks with mixed (DWARF and extbl) frames on ARM
* did a few other fixes and cleanups
* posted the patches on the libunwind ml
* set up a tree on git.linaro.org
* attended a class on Friday

Regards
Ken

14 years, 3 months


[ACTIVITY] Mar 21 - Mar 25

by Ulrich Weigand

== GDB ==
* Completed glibc patch to add ARM unwind tables to system call stubs (bug #684218), patch committed upstream and backported to Ubuntu glibc.
* Posted kernel patch to fix GDB inferior calls while stopped in a restartable system call (bug #615974); waiting for review.
* Ongoing work to fix single-stepping over signal handlers (bug #615978).
* Implemented patch to fix single-stepping across bad ARM/Thumb boundary (bug #667309); posted to mailing list for comments.
* Contributed two fixes for valgrind on ARM (to enable running GDB under valgrind); both now accepted into mainline.

Best Regards
Ulrich Weigand

14 years, 3 months


[ACTIVITY] Weekly status

by Richard Sandiford

== This week ==

* Moved the discussion about the RTL and gimple representation of strided loads/stores to the gcc@ list. Got some good feedback:
  http://gcc.gnu.org/ml/gcc/2011-03/msg00322.html

* Started a subdiscussion about the handling of modes:
  http://gcc.gnu.org/ml/gcc/2011-03/msg00342.html
  This is a tricky one. I'll add more fuel to the fire next week.

* Committed two GCC patches to clean up the expand interface. Dealt with the fallout (some expected, but unfortunately some not).

* Submitted two of the patches to improve code generation for strided load/store intrinsics:
  http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01631.html
  http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01634.html

* Spent a lot of the week reworking the way the load/store intrinsics are handled, to fix both correctness and performance bugs. The new rtl patterns should have the right form for the vectoriser. Made what feels like good progress, but it's not complete yet.

* Sent separate R_ARM_IRELATIVE patch to glibc, after feedback from glibc-ports.

* Booked flight and hotel for Budapest summit.

* Pinged unreviewed patches.

== Next week ==

* More intrinsics improvements. I think these are necessary to get good code out of the vectoriser too.

Richard

14 years, 3 months


[ACTIVITY] 2011-03-25

by David Gilbert

== String routines ==

* Wrote a thumb optimised strchr
  - As expected it's got nice performance for longer runs but at sizes <16 bytes it's slower, and a lot of the strchr calls are very short, so it's probably not of benefit in most cases
    ( https://wiki.linaro.org/WorkingGroups/ToolChain/Benchmarks/InitialStrchr?ac… )

* Wrote a neon-memcpy
  - As previously found with memset, it performs well on A8 but poorly on A9
  - it does however do the case where the source/destination isn't aligned quite well even on A9; the vld1 unaligned case works with relatively little penalty. (It performs comparably to the Bionic implementation - mine is a bit faster on shorter calls, Bionic is better on longer uses - I think that's because they've got some careful use of preloads where I have so far got none.)

I'm on holiday up to and including 5th April.

Dave

14 years, 3 months


[Activity] March 21-25

by Ramana Radhakrishnan

== GCC ==

Progress:
* Investigated excessive VFP moves. Investigating ways forward.
* Went through some of the test results with 4.6 RC2 upstream - looking through test results etc.
* Set up SPEC2k6 cross on my Linaro machine.
* Waiting for my new Panda board sometime next week.
* Some small bug fixes upstream. Need to rework a couple of documentation patches after review.

Plans:
* Continue looking at excessive VFP moves.
* Continue to look at some patches upstream.
* Finish working through Thumb2 speed tickets.
* Set up new Panda board.
* Start looking at DENBench results and identify potential speed up areas.

Meetings:
* 1-1s
* Linaro toolchain meeting

Absences:
* March 30th (maybe): WC Cricket Semi-final. (Ind v Pak)
* April 15 – 26 -> Booked Holiday.
* May 9-14 - LDS Budapest

14 years, 3 months


[ACTIVITY] report week 12

by Peter Maydell

RAG:
Red:
Amber:
Green:

Current Milestones:
|                     | Planned    | Estimate   | Actual |
| qemu-linaro 2011-04 | 2011-04-21 | 2011-04-21 |        |

Historical Milestones:
| finish qemu-cont-integration | 2011-01-25 | 2011-01-25 | handed off |
| first qemu-linaro release    | 2011-02-08 | 2011-02-08 | 2011-02-08 |
| qemu-linaro 2011-03          | 2011-03-08 | 2011-03-08 | 2011-03-08 |

== maintain-beagle-models ==
* benchmarking/testing of the TCG locking fix: oddly benchmarks seem to come out with less slowdown (1% or less) than a system mode bootup/shutdown. (I used scimark and dhrystone. scimark is the same speed, which is to be expected because we spend all our time doing floating point emulation. I was expecting a bigger perf hit on dhrystone, though.)
* submitted patches to make qemu fail cleanly if you ask for more RAM than a board supports

== merge-correctness-fixes ==
* tested the Neon element load/store instructions; wrote patches to fix UNDEF handling (which are blocked waiting for the patch pipeline to be drained) and confirmed there aren't any other bugs. There is a meego patch to use helper functions for multi-element load/store which is apparently to avoid overflowing a TCG buffer: need to test and upstream this.
* investigated android qemu tree for any missing correctness fixes: looking through the changelog I think we have fixes upstream for everything that was fixed in the android tree.

== other ==
* patch: fix versatilepb/realview handling of multiple nic options
* patch: better diagnosis of more nics requested than board supports [this is needed to get the vexpress patch committed]
* reviewed a patch to add ARMv4/v4T support to qemu (mostly consists of making sure we UNDEF in the right places)
* meetings: toolchain, standup, pdsw-tools, 1-2-1

Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus

Absences:
Holiday: 22 Apr - 2 May
9-13 May: UDS, Budapest
(maybe) ~17-19 August: QEMU/KVM strand at LinuxCon NA, Vancouver

14 years, 3 months

[ACTIVITY] March 20-24

by Revital Eres

Hello, Implemented a patch to apply SMS in the presence of instructions with REG_INC_NOTE. (this occurs in telecom/autocor thus SMS needs to be run with -fno-auto-inc-dec flag to be applied) Sent a merge request to gcc-linaro for the SMS patches. Thanks to Andrew Stubbs for his help. https://code.launchpad.net/~eres/gcc-linaro/SMS_doloop_for_ARM I intend to send a request to gcc-linaro.4.6 as well. Thanks, Revital

14 years, 3 months

[ACTIVITY] March 22-24

by Ira Rosen

Hi,

* resubmitted and committed store sink patch to trunk, I'll commit it to gcc-linaro-4.6 next week
* submitted autodetection of vector size patch to gcc-patches, I'll commit it next week
* started testing a patch that makes -mvectorize-with-neon-quad the default
* DenBench: found some more cases where vectorization of strided accesses using vzip/vuzp causes degradation. Since Richard is making a lot of progress with vld/vst, I think it doesn't make sense to spend too much time on vzip/vuzp, and I am going to run DenBench without this patch.

Ira

14 years, 3 months

Re: Cross compiler ITP (armel)

by Goswin von Brederlow

Philipp Kern <trash(a)philkern.de> writes:

> On 2011-03-23, Goswin von Brederlow <goswin-v-b(a)web.de> wrote:
>> Also does the testing transition consider the Built-Using? If I specify
>> 'Built-Using: gcc-4.5 (= 4.5.2-5)' will the package be blocked from
>> entering testing until gcc-4.5 (= 4.5.2-5) has entered and block gcc-4.5
>> (= 4.5.2-5) from being replaced from testing?
>
> It doesn't need to. All we want is compliance on the archive side so that the
> sources are not expired away, as long as that binary is carried in a suite.
> No need to involve britney at that point.
>
> Kind regards
> Philipp Kern

Not quite. For ia32-libs it would be nice if ia32-libs could be blocked from testing as long as the source packages it includes aren't in testing. Currently that is solved by building against testing in the first place. But that is something we can live with.

As a side note the debian-cd package needs to also consider Built-Using when creating source images. Will the Sources.gz file list multiple entries for a source if multiple versions are relevant?

Regards
Goswin

14 years, 3 months

Re: Cross compiler ITP (armel)

by Hector Oron

Hi,

2009/11/2 Mark Hymers <mhy(a)debian.org>:
> On Mon, 02, Nov, 2009 at 12:43:42PM +0000, Philipp Kern spoke thus..
>> Of course it is a sane approach but very special care needs to be taken when
>> releasing to ensure GPL compliance. So what we should get is support in the
>> toolchain to declare against what source package the upload was built to
>> keep that around.
> We haven't implemented that yet for the archive software but it's on the
> TODO list (and not that difficult). None of us have had time to do the
> d-d-a post from the ftpteam meeting yet, but I'll make sure information
> is in there about it.
>
> I'm hoping to the archive-side support done in the next week or so.

Squeeze has already been released; cross toolchains were not released along with Debian main, but can be found in the Emdebian repository.

Marcin Juszkiewicz has been working on cross compiler packages for armel as part of his work for Linaro, which I am attempting to include in the Debian main archive.

As a result of that work, linux-armel, binutils-armel and eglibc-armel are merged into a single source package named `cross-toolchain-base'. The package is not optimal, but once we have multiarch support it should be renamed to `binutils-armel' (or a similar name) and use the linux and eglibc libraries and headers provided by multiarch.

Along with this package I also plan to upload `gcc-4.5-cross' (#590465).

At the moment we are targeting one target architecture on two build hosts ('{amd64,i386}->armel'); I am not sure whether support on more build hosts is desired. Target architecture support might grow in future, but right now it is not a priority.

Is that an issue for anyone? Comments?

Best regards,
--
Héctor Orón

"Our Sun unleashes tremendous flares expelling hot gas into the Solar System, which one day will disconnect us."
-- Day DVB-T stop working nicely
Video flare: http://antwrp.gsfc.nasa.gov/apod/ap100510.html

14 years, 3 months

[ACTIVITY] Weekly status

by Richard Sandiford

== Last week ==
* Committed STT_GNU_IFUNC changes to binutils.
* Submitted the STT_GNU_IFUNC changes to GLIBC ports. Got feedback on Friday, which I'll deal with this week.
* Worked on the expand and rtl-level parts of the load/store lane representation, with new optabs for each operation. This seems to be working pretty well, but I still need to make some changes to the way the existing intrinsics work.
* Wrote a patch to clean up the way we handle optabs during expand, so that the new optabs mentioned above will need a bit less cut-&-paste. Submitted upstream. Got some positive feedback.
* Committed testcase for PR rtl-optimization/47166 upstream.

== This week ==
* Deal with GLIBC feedback.
* More load/store lanes.

Richard

14 years, 3 months

[ACTIVITY] 21st - 25th March

by Andrew Stubbs

* Linaro GCC

Tested and merged both the latest Linaro merge requests, and various bug fixes to the Shrink Wrap optimization from CS, into Linaro GCC 4.5.

Merged and tested from FSF GCC 4.6.

Richard and Ramana have approved some of my upstream patches! I just need to wait for stage one so I can commit them upstream. I'll commit them internally when I get time to do the final integration test.

Continued benchmarking GCC 4.6 with the patches merged from GCC 4.5. Decided to discard a couple of extra patches since they don't appear to be of any value.

* Other

On leave Wednesday to Friday playing daddy. :)

* Future Absence

Away Monday 28th to Friday 1st April.

----

Upstream patches requiring review:

* Thumb2 constants: http://gcc.gnu.org/ml/gcc-patches/2010-12/msg00652.html
* ARM EABI half-precision functions http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00874.html
* ARM Thumb2 Spill Likely tweak http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00880.html
* NEON scheduling patch http://gcc.gnu.org/ml/gcc-patches/2011-02/msg01431.html

14 years, 3 months

Substituting -msoft-float/-mfloat-abi=* in the proper order in spec file

by Loïc Minier

Hey

I'm trying to extend the *link: specs to pass a different -dynamic-linker depending on the float ABI. But I didn't manage to build a construct which would preserve the order of the flags; if I do something like:

    %{msoft-float:-dynamic-linker V1} %{mfloat-abi=softfp:-dynamic-linker V2}

then I get V2 for "-mfloat-abi=softfp -msoft-float" instead of V1.

In gcc/gcc.c I found some docs on spec file syntax; I see one can use %{S*&T*} and %{S*:X}, but apparently %{S*&T*:X} isn't allowed, so I can't manipulate the value.

I tried to use

    %{msoft-float*:-dynamic-linker V1} %{mfloat-abi=softfp*:-dynamic-linker V2}

but that gives the same effect (the msoft-float flags are grouped together in the original order and put first, then the mfloat-abi=softfp are grouped together in the original order and put second).

I didn't manage to get %{msoft-float*:%<msoft-float -dynamic-linker V1} to work; in fact I didn't get suppressions to work.

Any idea?

Thanks!

PS: float-abi=softfp/soft-float are just convenient examples; the actual target is to use different -dynamic-linker for hard vs soft float-abi

--
Loïc Minier
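
The behaviour being asked for in the message above ("the last float-ABI flag on the command line decides the dynamic linker") can be written down outside the spec language as a small sketch. This is only a statement of the desired semantics in Python, not a spec-file solution; `pick_dynamic_linker` and `link_args` are hypothetical names, and V1/V2 are the placeholders used in the message.

```python
# Hypothetical mapping taken from the example in the message; V1 and V2 stand
# in for real dynamic linker paths.
LINKER_FOR_FLAG = {
    "-msoft-float": "V1",
    "-mfloat-abi=softfp": "V2",
}


def pick_dynamic_linker(args, default=None):
    """Return the linker chosen by the LAST matching flag in args.

    This is the order-preserving behaviour the message asks for: a later
    flag overrides an earlier one, whatever order the flags are listed in
    the mapping above.
    """
    chosen = default
    for arg in args:
        if arg in LINKER_FOR_FLAG:
            chosen = LINKER_FOR_FLAG[arg]
    return chosen


def link_args(args):
    """Build the extra linker arguments, or nothing if no flag matched."""
    linker = pick_dynamic_linker(args)
    if linker is None:
        return []
    return ["-dynamic-linker", linker]
```

With this definition "-mfloat-abi=softfp -msoft-float" selects V1 and the reverse order selects V2, which is the result the two-rule spec in the message fails to give.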

14 years, 3 months

Trip Report -- 1st QEMU Users Forum, Grenoble, 18 March 2011

by Peter Maydell

I went to the first QEMU Users Forum in Grenoble last week; these are my impressions and summary of what happened. Sorry if it's a bit TLDR...

== Summary and general observations ==

This was a day long set of talks tacked onto the end of the DATE conference. There were about 40 attendees; the focus of the talks was mostly industrial and academic research QEMU users/hackers (a set of people who use and modify QEMU but who aren't very well represented on the qemu-devel list).

A lot of the talks related to SystemC; at the moment people are rolling their own SystemC<->QEMU bridges. In addition to the usual problems when you try to put two simulation engines together (each of which thinks it should be in control of the world) QEMU doesn't make this easy because it is not very modular and makes the assumption that only one QEMU exists in a process (lots of global variables, no locking, etc).

There was a general perception from attendees that QEMU "development community" is biased towards KVM rather than TCG. I tend to agree with this, but think this is simply because (a) that's where the bulk of the contributors are and (b) people doing TCG related work don't always appear on the mailing list. (The "quick throwaway prototype" approach often used for research doesn't really mesh well with upstream's desire for solid long-term maintainable code, I guess.)

QEMU could certainly be made more convenient for this group of users: greater modularisation and provision of "just the instruction set simulator" as a pluggable library, for instance. Also the work by STMicroelectronics on tracing/instrumentation plugins looks like it should be useful to reduce the need to hack extra instrumentation directly into QEMU's frontends.

People generally seemed to think the forum was useful, but it hasn't been decided yet whether to repeat it next year, or perhaps to have some sort of joint event with the open-source qemu community.

More detailed notes on each of the talks are below; the proceedings/slides should also appear at http://adt.cs.upb.de/quf within a few weeks. Of particular Linaro/ARM interest are:
* the STMicroelectronics plugin framework so your DLL can get callbacks on interesting events and/or insert tracing or instrumentation into generated code
* Nokia's work on getting useful timing/power type estimates out of QEMU by measuring key events (insn exec, cache miss, TLB miss, etc) and calibrating against real hardware to see how to weight these
* a talk on parallelising QEMU, ie "multicore on multicore"
* speeding up Neon by adding SIMD IR ops and translating to SSE

The forum started with a brief introduction by the organiser, followed by an informal Q&A session with Nathan Froyd from CodeSourcery (...since his laptop with his presentation slides had died on the journey over from the US...)

== Talk 1: QEMU and SystemC ==

M. Monton from GreenSocs presented a couple of approaches to using QEMU with SystemC. "QEMU-SC" is for systems which are mostly QEMU based with one or two SystemC devices -- QEMU is the master. Speed penalty is 8-14% over implementing the device natively. "QBox" makes the SystemC simulation the master, and QEMU is implemented as a TLM2 Initiator; this works for systems which are almost all SystemC and which you just want to add a QEMU core to. Speed penalty 100% (!) although they suspect this is an artifact of the current implementation and could be reduced to more like 25-30%. They'd like to see a unified effort to do SystemC and QEMU integration (you'll note that there are several talks here where the presenters had rolled their own integration). Source available from www.greensocs.com.

== Talk 2: Combined Use of Dynamic Binary Translation and SystemC for Fast and Accurate MPSoc Simulation ==

Description of a system where QEMU is used as the core model in a SystemC simulation of a multiprocessor ARM system. The SystemC side includes models of caches, write buffers and so on; this looked like quite a low level detailed (high overhead) simulation. They simulate multiple clusters of multiple cores, which is tricky with QEMU because it has a design assumption of only one QEMU per process address space (lots of global variables, no locking, etc); they handle this by saving and restoring globals at SystemC synchronisation points, which sounded rather hacky to me. They get timing information out of their model by annotating the TCG intermediate representation ops with new ops indicating number of cycles used, whether to check for Icache/Dcache hit/miss, and so on. Clearly they've put a lot of work into this. They'd like a standalone, reentrant ISS, basically so it's easier to plug into other frameworks like SystemC.

== Talk 3: QEMU/SystemC Cosimulation at Different Abstraction Levels ==

This talk was about modelling an RTOS in SystemC; I have to say I didn't really understand the motivation for doing this. Rather than running an RTOS under emulation, they have a SystemC component which provides the scheduler/mutex type APIs an RTOS would, and then model RTOS tasks as other SystemC components. Some of these SystemC components embed user-mode QEMU, so you can have a combination of native and target-binary RTOS tasks. They're estimating time usage by annotating QEMU translation blocks (but not doing any accounting for cache effects).

== Talk 4: Timing Aspects in QEMU/SystemC Synchronisation ==

Slightly academic-feeling talk about how to handle the problem of trying to run several separate simulations in parallel and keep their timing in sync. (In particular, QEMU and a SystemC world.) If you just alternate running each simulation there is no problem but it's not making best use of the host CPU. If you run them in parallel you can have the problem that sim A wants to send an event to sim B at time T, but sim B has already run past time T. He described a couple of possible approaches, but they were all "if you do this you might still hit the problem but there's a tunable parameter to reduce the probability of something going wrong"; also they only actually implemented the simplest one. In some sense this is really all workarounds for the fact that SystemC is being retrofitted/bolted onto the outside of a QEMU simulation.

== Talk 5: Program Instrumentation with QEMU ==

Presentation by STMicroelectronics, about work they'd done adding instrumentation to QEMU so you can use it for execution trace generation, performance analysis, and profiling-driven optimisation when compiling. It's basically a plugin architecture so you can register hooks to be called at various interesting points (eg every time a TB is executed); there are also translation time hooks so plugins can insert extra code into the IR stream. Because it works at the IR level it's CPU-agnostic. They've used this to do real work like optimising/debugging of the Adobe Flash JIT for ARM. They're hoping to be able to submit this upstream.

I liked this; I think it's a reasonably maintainable approach, and it ought to alleviate the need for hacking extra ops directly into QEMU for instrumentation (which is the approach you see in some of the other presentations). In particular it ought to work well with the Nokia work described in the next talk...

== Talk 6: Using QEMU in Timing Estimation for Mobile Software Development ==

Work by Nokia's research division and Aalto university. This was about getting useful timing estimates out of a QEMU model by adding some instrumentation (instructions executed, cache misses, etc) and then calibrating against real hardware to identify what weightings to apply to each of these (weightings differ for different cores/devices; eg on A8 your estimates are very poor if you don't account for L2 cache misses, but for some other cores TLB misses are more important and adding L2 cache miss instrumentation gives only a small improvement in accuracy.) The cache model is not a proper functional cache model, it's just enough to be able to give cache hit/miss stats. They reckon that three or four key statistics (cache miss, TLB miss, a basic classification of insns into slow or fast) give estimated execution times with about 10% level of inaccuracy; the claim was that this is "feasible for practical usage". Git tree available.

This would be useful in conjunction with the STMicroelectronics instrumentation plugin work; alternatively it might be interesting to do this as a Valgrind plugin, since Valgrind has much more mature support for arbitrary plugins. (Of course as a Valgrind plugin you'd be restricted to running on an ARM host, and you're only measuring one process, not whole-system effects.)

== Talk 7: QEMU in Digital Preservation Strategies ==

A less technical talk from a researcher who's working on the problems of how museums should deal with preserving and conserving "digital artifacts" (OSes, applications, games). There are a lot of reasons why "just run natively" becomes infeasible: media decay, the connector conspiracy, old and dying hardware, APIs and environments becoming unsupported, proprietary file formats and on and on. If you emulate hardware (with QEMU) then you only have to deal with emulating a few (tens of) hardware platforms, rather than hundreds of operating systems or thousands of file formats, so it's the most practical approach. They're working on web interfaces for non-technical users. Most interesting for the QEMU dev community is that they're effectively building up a large set of regression tests (ie images of old OSes and applications) which they are going to be able to run automatic testing on.

== Talk 8: MARSS-x86: QEMU-based Micro-Architectural and Systems Simulator for x86 Multicore Processors ==

This is about using QEMU for microarchitectural level modelling (branch predictor, load/store unit, etc); their target audience is academic researchers. There's an existing x86 pipeline level simulator (PTLsim) but it has problems: it uses Xen for its system simulation so it's hard to get installed (need a custom kernel on the host!), and it doesn't cope with multicore. So they've basically taken PTLsim's pipeline model and ported it into the QEMU system emulation environment. When enabled it replaces the TCG dynamic translation implementation; since the core state is stored in the same structures it is possible to "fast forward" a simulation running under TCG and then switch to "full microarchitecture simulation" for the interesting parts of a benchmark. They get 200-400KIPS.

== Talk 9: Showing and Debugging Haiku with QEMU ==

Haiku is an x86 OS inspired by BeOS. The speaker talked about how they use QEMU for demos and also for kernel and bootloader debugging.

== Talk 10: PQEMU : A parallel system emulator based on QEMU ==

This was a group from a Taiwan university who were essentially claiming to have solved the "multicore on multicore" problem, so you can run a simulated MPx4 ARM core on a quad-core x86 box and have it actually use all the cores. They had some benchmarking graphs which indicated that you do indeed get ~3.x times speedup over emulated single-core, ie your scaling gain isn't swamped in locking overhead. However, the presentation concentrated on the locking required for code generation (which is in my opinion the easy part) and I wasn't really convinced that they'd actually solved all the hard problems in getting the whole system to be multithreaded. ("It only crashes once every hundred runs...") Also their work is based on QEMU 0.12, which is now quite old. We should definitely have a look at the source which they hope to make available in a few months.

== Talk 11: PRoot: A Step Forward for QEMU User-Mode ==

STMicroelectronics again, presenting an alternative to the usual "chroot plus binfmt_misc" approach for running target binaries seamlessly under qemu's linux-user mode. It's a wrapper around qemu which uses ptrace to intercept the syscalls qemu makes to the host; in particular it can add the target-directory prefix to all filesystem access syscalls, and can turn an attempt to exec "/bin/ls" into an exec of "qemu-linux-arm /bin/ls". The advantage over chroot is that it's more flexible and doesn't need root access to set up. They didn't give figures for how much overhead the syscall interception adds, though.

== Talk 12: QEMU TCG Enhancements for Speeding up Emulation of SIMD ==

Simple idea -- make emulation of Neon instructions faster by adding some new SIMD IR ops and then implementing them with SSE instructions in the x86 backend. Some basic benchmarking shows that they can be ten times faster this way. Issues:
* what is the best set of "generic" SIMD ops to add to the QEMU IR?
* is making Neon faster the best use of resource for speeding up QEMU overall, or should we be looking at parallelism or other problems first?
* are there nasty edge cases (flags, corner case input values etc) which would be a pain to handle?

Interesting, though, and I think it takes the right general approach (ie not horrifically Neon specific). My feeling is that for this to go upstream it would need uses in two different QEMU front ends (to demonstrate that the ops are generic) and implementations in at least the x86 backend, plus fallback code so backends need not implement the ops; that's a fair bit of work beyond what they've currently implemented.

== Talk 13: A SysML-based Framework with QEMU-SystemC Code Generation ==

This was the last talk, and the speaker ran through it very fast as we were running out of time. They have a code generator for taking a UML description of a device and turning it into SystemC (for VHDL) and C++ (for a QEMU device) and then cosimulating them for verification.

-- PMM
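
As a rough illustration of the calibration idea summarised under Talk 6 (count a few key events in the emulator, then fit per-event weights against times measured on real hardware), here is a minimal least-squares sketch. It is not the Nokia/Aalto implementation, whose details are not given in the report; the function names, the choice of a plain linear model and the use of the normal equations are all assumptions made for the example.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: calibration runs are not independent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def calibrate(counts, measured):
    """Fit one weight per event type by ordinary least squares.

    counts:   one row per calibration run, one column per event type
              (for example fast insns, slow insns, cache misses, TLB misses).
    measured: time measured on real hardware for each run.
    Solves the normal equations (A^T A) w = A^T t.
    """
    k = len(counts[0])
    ata = [[sum(row[i] * row[j] for row in counts) for j in range(k)]
           for i in range(k)]
    atb = [sum(row[i] * t for row, t in zip(counts, measured))
           for i in range(k)]
    return solve_linear(ata, atb)


def estimate(weights, event_counts):
    """Estimated execution time for one run: weighted sum of event counts."""
    return sum(w * c for w, c in zip(weights, event_counts))
```

Given at least as many independent calibration runs as event types, `calibrate(counts, measured)` returns the weights and `estimate(weights, new_counts)` then predicts the time of a run that was only emulated. A separate set of weights would be fitted per core, which matches the observation that the important events differ between cores.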

14 years, 3 months

Evaluate Link Time Optimization (LTO) in linaro-gcc 4.5

by Jim Huang

Hello list,

Recently, the Android team has been working on integrating the Linaro toolchain for Android and the NDK. According to the initial benchmark results[1], Linaro GCC is competitive with the Google toolchain. Meanwhile, we are trying to enable gcc-4.5 specific features such as Graphite and LTO (Link Time Optimization) in order to make the best choice for the Android build system and NDK usage.

However, I encountered a problem with LTO and would like to ask the toolchain WG for help.

Assuming Linaro Toolchain for Android is installed in directory /tmp/android-toolchain-eabi, you can obtain Google's toolchain benchmark suite by git:

    # git clone git://android.git.kernel.org/toolchain/benchmark.git

You have to apply the attached patch in order to make the benchmark suite work[2]. Then, change directory to skia:

    # cd benchmark/skia

And build skia bench with LTO enabled:

    # ../scripts/bench.py --action=build --toolchain=/tmp/android-toolchain-eabi --add_cflags="-flto -user-linker-plugin"

The build process is interrupted by gcc:

    make -j4 --warn-undefined-variables -f ../scripts/build/main.mk TOOLCHAIN=/tmp/android-toolchain-eabi ADD_CFLAGS="-flto -user-linker-plugin" build
    CPP ARM obj/src/core/Sk64.o <= src/src/core/Sk64.cpp
    CPP ARM obj/src/core/SkAlphaRuns.o <= src/src/core/SkAlphaRuns.cpp
    CPP ARM obj/src/core/SkBitmap.o <= src/src/core/SkBitmap.cpp
    CPP ARM obj/src/core/SkBitmapProcShader.o <= src/src/core/SkBitmapProcShader.cpp
    CPP ARM obj/src/core/SkBitmapProcState.o <= src/src/core/SkBitmapProcState.cpp
    CPP ARM obj/src/core/SkBitmapProcState_matrixProcs.o <= src/src/core/SkBitmapProcState_matrixProcs.cpp
    src/src/core/SkBitmapProcShader.cpp: In function 'SkShader::CreateBitmapShader(SkBitmap const&, SkShader::TileMode, SkShader::TileMode, void*, unsigned int)':
    src/src/core/SkBitmapProcShader.cpp:243:13: warning: 'color' may be used uninitialized in this function
    CPP ARM obj/src/core/SkBitmapSampler.o <= src/src/core/SkBitmapSampler.cpp
    src/src/core/SkBitmapProcState_matrixProcs.cpp:530:1: sorry, unimplemented: gimple bytecode streams do not support machine specific builtin functions on this target
    ...

However, the other bench items such as cximage, gcstone, gnugo, mpeg4, webkit, and python build successfully.

Can anyone give me some hints to resolve the LTO problem?

Thanks in advance.

Sincerely,
-jserv

[1] https://wiki.linaro.org/Platform/Android/Toolchain#Reference%20Benchmark
    We use the same toolchain benchmark suite as the Google compiler team.
[2] https://wiki.linaro.org/Platform/Android/UpstreamToolchain

14 years, 3 months

[ACTIVITY] Mar.14 -- Mar.20

by Chung-Lin Tang

== Last week ==
* CoreMark ARMv6/v7 regressions: posted another combine patch upstream, which was quickly approved and committed. The XOR simplification one is now approved too, but needs a little more revising of comments before committing.
* The above two patches now bring CoreMark under -march=armv7-a very close to the performance of -march=armv5te. However, a regression where uxtb+cmp cannot be combined into 'ands ... #255' still causes v7 to lose slightly. This should be the final issue to solve...
* Launchpad #736007/GCC Bugzilla PR48183: NEON ICE in emit-rtl.c:immed_double_const() under -g. Posted patch upstream, but looks like more discussion is needed before we know if this is the "right" way to do it.
* Launchpad #736661, armel FTBFS (G++ ICE in expand_expr_real_1()). Looking at this.
* Pinged a few upstream patch submissions.

== This week ==
* Launchpad #723185/CS issue #9845 now assigned to me, start looking at this.
* Get the XOR patch committed upstream, and the above described uxtb+cmp issue solved.
* Work on other GCC issues.

14 years, 3 months

Tracking tickets as they go by

by Michael Hope

Hi there. I have a custom report on top of the Launchpad tickets that shows how old they are and if they need attention: http://ex.seabright.co.nz/helpers/tickets/gcc-linaro?group_by=lint I check this once a day to see how we're doing. It's useful when deciding which bug to attack next. -- Michael

14 years, 3 months

[ACTIVITY] weekly status

by Ken Werner

== libunwind ==
* Had a few discussions with Uli with regard to unwinding.
* Continued to learn about libunwind internals.
* The .ARM.exidx and .ARM.extab section parser is functional but the integration into libunwind needs to be improved. Currently there are two separate models that hold the information about the current frame. Since they are not synchronized, the behavior of libunwind is quite unexpected to the user.
* I started on eliminating the redundancy by removing the model that was introduced for the extab support. My goal is to have the parser operate on the DWARF model directly. In theory this should also allow mixing DWARF and extab frames.

Regards
Ken

14 years, 3 months

[Activity] March 7 - 18

by Ramana Radhakrishnan

== GCC ==
* Started looking at performance regressions. Setting up builds with EEMBC Denbench and other benchmarks.
* Looked at PR47719 in some detail this week.
* Set up environment on laptop. Fixed PR46788 in 4.6 branch and trunk.
* Discussions regarding armhf, how to maintain Linaro branches - upstreaming patches etc.
* Looked at a case of performance improvements with VFP stores. I think it's because we end up allowing PRE_INC and POST_DEC for floating point mode values because of which there end up being more transfers to and from the integer core registers.
* Off sick on Monday 14th March 2011.

== Misc ==
* Sorted out travel arrangements for LDS. Waiting for visa now.

14 years, 3 months

[ACTIVITY] Mar 14 - Mar 18

by Ulrich Weigand

== GDB ==
* Ongoing work on glibc patch to add ARM unwind tables to system call stubs (bug #684218).
* Implemented initial version of a kernel patch that fixes GDB inferior calls while stopped in a restartable system call (bug #615974); started discussion with kernel folks.
* Implemented new version of patch to fix single-stepping over signal handlers (bug #615978) that addresses review comments; posted to mailing list.
* Verified Linaro GDB patch set can be applied to Ubuntu package.

Mit freundlichen Gruessen / Best Regards

Ulrich Weigand

--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294

14 years, 3 months

[ACTIVITY] report week 11

by Peter Maydell

(sent early this week since I'll be in the conference all Friday)

RAG:
Red:
Amber:
Green:

Current Milestones:
|                              | Planned    | Estimate   | Actual     |

Historical Milestones: [trimmed the 2010 ones]
| finish qemu-cont-integration | 2011-01-25 | 2011-01-25 | handed off |
| first qemu-linaro release    | 2011-02-08 | 2011-02-08 | 2011-02-08 |
| qemu-linaro 2011-03          | 2011-03-08 | 2011-03-08 | 2011-03-08 |

== maintain-beagle-models ==
* wrote most of a patch to properly implement prlimit64 syscall in usermode -- needs checking/testing
* wrote most of a patch to allow boards to specify their min/max/default RAM size rather than having a single qemu-wide default; this would let us specify RAM size on the beagle model (currently hardwired)
* retested the patch I wrote late last year which fixes TCG's locking problems by having interrupting signals/threads just set a flag which we check in every TB (rather than trying to mess with the TCG graph in a totally unsafe and crashprone way). Slowdown: about 3.5% on "boot Linaro nano on vexpress to rootprompt and shutdown again". I shall see if I can persuade people that this is a price worth paying for not randomly crashing in thread-heavy code :-)

== merge-correctness-fixes ==
* Wrote/submitted patchset which fixes VRECPS edge case handling (mostly NaN related)
* Wrote/submitted patchset which fixes Neon VLD of single element to all lanes
* Wrote/submitted patch which fixes qemu to work on an ARM host where the host C code has been built in Thumb mode

== other ==
* attended QEMU Users Forum in Grenoble
* meetings: toolchain, standup, pdsw-tools, arch q&a

Current qemu patch status is tracked here: https://wiki.linaro.org/PeterMaydell/QemuPatchStatus

Absences:
17/18 March: QEMU Users Forum, Grenoble
Holiday: 22 Apr - 2 May
9-13 May: UDS, Budapest
(maybe) ~17-19 August: QEMU/KVM strand at LinuxCon NA, Vancouver
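
The "set a flag which we check in every TB" approach mentioned in the report above can be pictured with a toy execution loop. This is only an illustration of the pattern, written in Python rather than QEMU's C; `CpuLoop`, `request_exit` and the block callables are hypothetical names and nothing here is taken from the actual patch.

```python
import threading


class CpuLoop:
    """Toy execution loop that polls an exit flag before every block."""

    def __init__(self, blocks):
        # Hypothetical stand-ins for translated blocks: each is a callable
        # that takes the loop and returns the index of the next block.
        self.blocks = blocks
        # The only state an interrupting thread or signal handler touches.
        self.exit_request = threading.Event()
        self.executed = 0

    def request_exit(self):
        """Safe to call from another thread: it only sets the flag."""
        self.exit_request.set()

    def run(self, start=0):
        """Run blocks until an exit is requested; return the next block index."""
        pc = start
        while True:
            # The per-block check: this is where the small constant cost
            # (the slowdown measured in the report) would come from.
            if self.exit_request.is_set():
                self.exit_request.clear()
                return pc
            pc = self.blocks[pc](self)
            self.executed += 1
```

The design point is that the interrupting side never modifies the chain of blocks being executed; it pays for that safety with one flag test per block, so an exit request is noticed at the next block boundary rather than immediately.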

14 years, 3 months

[ACTIVITY] 2011-03-17

by David Gilbert

Short week

* libffi patch accepted upstream
* eglibc integration of string routine changes - I have something that works but it's more complex than I'd like (to get it to fall back to the C code on stuff I haven't optimised for).
* Trying a neon memchr; tried a really simple 8-bytes-per-loop version - it's quite slow on both A8 and A9; branching on the result of comparisons done in the neon is not simple.
* Porting jam bug 735877 chromium using d32 float; it was passing vfpv3 rather than using the default when configured without neon.

On holiday tomorrow (Friday).

Dave

14 years, 3 months

[ACTIVITY] March 13-17

by Revital1 Eres

Hello, Experimented with the aes benchmark from DENbench. Continued my experiments with SMS, which include re-implementing an old patch to insert reg-moves in free slots rather than greedily before the definition as is done in the current implementation. Thanks, Revital

14 years, 3 months

[ACTIVITY] March 13-17

by Ira Rosen

Hi,

* submitted store sinking patch to mainline
* started testing auto-detection of vector size patch
* DENBench - some benchmarks are still unstable, I am looking into stable regressions, adjusting and fixing the cost model for them

Next week: Sunday and Monday - holidays

Ira

14 years, 3 months

Versatile Express write-up

by Michael Hope

Here's an end-of-cycle write up for Versatile Express support in QEMU: https://wiki.linaro.org/WorkingGroups/ToolChain/Outputs/QEMUVersatileExpress Most of it is taken from Peter's page: https://wiki.linaro.org/PeterMaydell/QemuVersatileExpress which is the place to go if you want the current state and more detail on the steps involved. While writing this up I had a seamless experience from the first linaro-image-create until seeing the alpha3 greeter come up and wobbling the mouse around. It was awesome. Some ideas for other write-ups at: https://wiki.linaro.org/WorkingGroups/ToolChain/Outputs -- Michael

14 years, 3 months

RealView PBX write-up

by Michael Hope

Dave did an investigation earlier in the year into Cortex-A9 and RealView PBX support in QEMU. The write-up is available here: https://wiki.linaro.org/WorkingGroups/ToolChain/Outputs/QEMURealViewPBX Dave and Peter: could you please review it? I've now closed out the blueprint. I'd like to do similar reports on other outputs and will attack vexpress next. -- Michael

14 years, 3 months


Work-item tracking for the "gcc-linaro-tracking" LP project

by Ulrich Weigand

Hi Michael, Andrew, Mounir just pointed out that our non-Ubuntu LP projects (like gcc-linaro, gdb-linaro etc.) are now also included in the LP work-item tracking statistics (http://status.linaro.org/linaro-toolchain-wg.html). This didn't happen in the past due to a Launchpad issue that has now been fixed. This seems to be working out nicely, except for one issue: what about the gcc-linaro-tracking project? I have a couple of bugs that are fixed in Linaro GCC, and are also fixed in mainline GCC, but they still show up as an "in-progress" work-item in the status tracker (there are a whole bunch more of those assigned to Andrew as well). The reason for this is the LP records have an associated gcc-linaro-toolchain project entry, and this is set to "Fix Committed", but not "Fix Released" ... probably because GCC 4.6.0 is not yet released? Now, on the one hand it does make sense to include the -tracking project in the work-item statistics, because they *do* reflect important tasks: namely, to make sure that the changes indeed land in the upstream repository. However, having them all show up as "in progress" until the community makes a new GCC release does not seem very helpful: this is not in our control, and our work is in fact done once the patch is committed upstream. Therefore my suggestion: we should immediately mark -tracking bugs as "Fix Released" (not "Fix Committed"), as soon as the corresponding patch is committed upstream (and thus our work on the problem is completed). Thoughts? Does this make sense? Will this mess up any of the other purposes for which we currently use the -tracking project? Mit freundlichen Gruessen / Best Regards Ulrich Weigand -- Dr. Ulrich Weigand | Phone: +49-7031/16-3727 STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E. 
IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294

14 years, 3 months


cortex-strings benchmarks

by Michael Hope

Hi Dave. I had a little play with cortex-strings and did some benchmarks on my Tegra 2. Images are attached.

I've added two scripts to cortex-strings:

  scripts/bench-all.sh runs all the routines on all variants and records them
  scripts/plot.py plots the results from above

plot.py corrects for the benchmark overhead by doing a linear fit to the null 'bounce' results and subtracting this fit.

You should be able to do an autogen; configure; make; bash scripts/bench-all.sh | tee log.txt; python scripts/plot.py log.txt. I'm sure you have your own favourite tools though.

The string routines look good. Lumpy in funny ways though...

-- Michael
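For anyone curious what the overhead correction amounts to, here is a rough sketch in plain Python. It is not the code in scripts/plot.py; the function names and the sample numbers are made up for illustration. It fits a straight line to the timings of the null 'bounce' routine and subtracts the fitted value from each measured time.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def subtract_overhead(sizes, times, null_sizes, null_times):
    """Remove the fitted cost of the null routine from the measured times."""
    slope, intercept = fit_line(null_sizes, null_times)
    return [t - (slope * s + intercept) for s, t in zip(sizes, times)]


# Hypothetical data: the null routine costs 2 time units plus 0.5 per byte.
null_sizes = [8, 16, 32, 64]
null_times = [6.0, 10.0, 18.0, 34.0]
routine_times = [10.0, 18.0, 34.0, 66.0]

corrected = subtract_overhead(null_sizes, routine_times, null_sizes, null_times)
```

With these made-up numbers the corrected times come out at roughly 4, 8, 16 and 32, i.e. what is left once the harness cost is taken away.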

14 years, 3 months


Fwd: Representing interleaving and lane load/stores at the tree level

by Richard Sandiford

[Sorry, forgot to CC: the list]

Hi Ira,

Thanks for the feedback.

On 6 March 2011 09:20, Ira Rosen <IRAR(a)il.ibm.com> wrote:
> > So how about the following functions? (Forgive the pascally syntax.)
> >
> > __builtin_load_lanes (REF : array N*M of X)
> >   returns array N of vector M of X
> >   maps to vldN
> >   in practice, the result would be used in assignments of the form:
> >     vectorX = ARRAY_REF <result, X>
> >
> > __builtin_store_lanes (VECTORS : array N of vector M of X)
> >   returns array N*M of X
> >   maps to vstN
> >   in practice, the argument would be populated by assignments of the form:
> >     vectorX = ARRAY_REF <result, X>
> >
> > __builtin_load_lane (REF : array N of X,
> >                      VECTORS : array N of vector M of X,
> >                      LANE : integer)
> >   returns array N of vector M of X
> >   maps to vldN_lane
> >
> > __builtin_store_lane (VECTORS : array N of vector M of X,
> >                       LANE : integer)
> >   returns array N of X
> >   maps to vstN_lane
> >
>
> How do you distinguish between "multiple structures" and "single structure
> to all lanes"?

Sorry, I'm not sure I understand the question. Could you give a couple of examples?

The idea is that the arrays above really are array types, regardless of the actual type of the thing we're accessing (which might be a larger array than the bounds above say, or which might be an array of structures or a structure of arrays). That should be OK because arrays alias their elements.

Richard

14 years, 3 months


Linaro GDB patch for natty

by Ulrich Weigand

Hi Matthias,

in last week's meeting you raised the question what, if any, code from the Linaro GDB repository could be useful for inclusion into the natty GDB package.

I've now reviewed the contents of the repository, and my suggestion would be to use everything in Linaro GDB 7.2, except for this commit (which changes the branding to "Linaro GDB"):

  revno: 32969
  committer: Ulrich Weigand <uweigand(a)de.ibm.com>
  branch nick: 7.2
  timestamp: Wed 2010-09-22 19:18:38 +0200
  message:
    2010-09-22  Ulrich Weigand  <uweigand(a)de.ibm.com>

        * src-release: Support gdb-linaro packages.

    gdb/
        * version.in: Set to Linaro GDB version number.
        * configure.ac (PKGVERSION, BUGURL): Refer to Linaro.
        * configure: Regenerate.

    gdb/gdbserver/
        * configure.ac (PKGVERSION, BUGURL): Refer to Linaro.
        * configure: Regenerate.

    gdb/doc/
        * configure.ac (PKGVERSION, BUGURL): Refer to Linaro.
        * configure: Regenerate.

(Instead, the branding ought to be set as appropriate for the Ubuntu package. Maybe with an additional reference to Linaro, just as with GCC?)

I've created a snapshot of the Linaro GDB 7.2 branch using the command

  bzr diff --prefix a/:b/ -r32965..

and then manually removed changes to

  src-release
  gdb/version.in
  gdb/configure.ac
  gdb/configure
  gdb/gdbserver/configure.ac
  gdb/gdbserver/configure
  gdb/doc/configure.ac
  gdb/doc/configure

I've left in the new file ChangeLog.linaro for documentation purposes, but if you prefer this could of course be removed as well.

The resulting patch is appended here. (Note that I'd recommend to continue updating the patch from Linaro GDB as further changes make it in.)

(See attached file: linaro-gdb.patch)

I've then added the patch to the natty GDB package. Since it touches a completely distinct set of files compared to the existing list of patches in the package, it can be added to the series file at any arbitrary point.
I've built the resulting compiler on i386, arm, and ppc64, and it strictly improved the test results on all three platforms:

i386 without patch:
# of expected passes            16161
# of unexpected failures        114
# of expected failures          72
# of untested testcases         9
# of unresolved testcases       1
# of unsupported tests          69

i386 with patch:
# of expected passes            16331
# of unexpected failures        24
# of expected failures          72
# of untested testcases         9
# of unresolved testcases       1
# of unsupported tests          69

Fixed test case failures are from:
gdb.base/break-interp.exp
gdb.base/foll-fork.exp
gdb.base/printcmds.exp
(These are just test suite cleanups, no actual code changes.)

ppc without patch:
# of expected passes            15350
# of unexpected failures        74
# of expected failures          53
# of untested testcases         15
# of unresolved testcases       1
# of unsupported tests          63

ppc with patch:
# of expected passes            15350
# of unexpected failures        55
# of expected failures          53
# of untested testcases         15
# of unresolved testcases       1
# of unsupported tests          63

Fixed test case failures are from:
gdb.base/printcmds.exp
gdb.threads/local-watch-wrong-thread.exp
gdb.threads/watchthreads.exp
(These are just test suite cleanups, no actual code changes.)
arm without patch:
# of expected passes            15343
# of unexpected failures        270
# of unexpected successes       1
# of expected failures          65
# of untested testcases         11
# of unresolved testcases       2
# of unsupported tests          70

arm with patch:
# of expected passes            15686
# of unexpected failures        46
# of unexpected successes       3
# of expected failures          63
# of untested testcases         11
# of unresolved testcases       1
# of unsupported tests          69

Fixed test case failures are from:
gdb.base/break-interp.exp
gdb.base/corefile.exp
gdb.base/foll-fork.exp
gdb.base/gcore.exp
gdb.base/gdb1555.exp
gdb.base/pr11022.exp
gdb.base/printcmds.exp
gdb.base/recurse.exp
gdb.base/relativedebug.exp
gdb.base/step-test.exp
gdb.base/watch-cond.exp
gdb.base/watch-read.exp
gdb.base/watch_thread_num.exp
gdb.base/watch-vfork.exp
gdb.gdb/selftest.exp
gdb.mi/gdb792.exp
gdb.mi/mi2-syn-frame.exp
gdb.mi/mi2-var-display.exp
gdb.mi/mi2-watch.exp
gdb.mi/mi-syn-frame.exp
gdb.mi/mi-var-display.exp
gdb.mi/mi-watch.exp
gdb.pie/corefile.exp
gdb.server/ext-attach.exp
gdb.threads/attachstop-mt.exp
gdb.threads/attach-stopped.exp
gdb.threads/linux-dp.exp
gdb.threads/local-watch-wrong-thread.exp
gdb.threads/pthread_cond_wait.exp
(This represents much of the bug fix work that went into Linaro GDB.)

Let me know if there's any further information you need, or anything else I can do to help get the Linaro changes into natty GDB.

Mit freundlichen Gruessen / Best Regards

Ulrich Weigand

--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294

14 years, 3 months


[ACTIVITY] 7th - 13th March

by Andrew Stubbs

Merged fixes for several bugs into Linaro GCC 4.5, both from Linaro (Richard, Matthias and Ramana) and from CS (the shrink wrap problems).

Continued working on benchmarking the patches I've merged to 4.6. Spent quite some time trying to figure out why EEMBC and Spec2K weren't working properly. I've got this sorted now.

Confirmed that the patch to discourage NEON use for integer operations is still profitable on Cortex-A8. Posted the patch upstream.

Merged upstream GCC 4.6 into Linaro GCC 4.6.

Booked travel to Budapest for Linaro @ UDS.

Followed up on Ramana's questions about the RVCT interoperability patch. Paul Brook helped explain what it was about, and pointed me at the proper section in the proper ARM manual.

Continued forward porting patches to 4.6. Mostly I need to convince myself that they still do something useful. I have posted one new patch upstream - the "Discourage A8 NEON" patch.

* Future Absence

Away Wednesday 16th to Friday 18th.
Away Monday 28th to Friday 1st April.

----

Upstream patches requiring review:

* Thumb2 constants: http://gcc.gnu.org/ml/gcc-patches/2010-12/msg00652.html
* ARM EABI half-precision functions http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00874.html
* ARM Thumb2 Spill Likely tweak http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00880.html
* NEON scheduling patch http://gcc.gnu.org/ml/gcc-patches/2011-02/msg01431.html
* RVCT Interoperability patch http://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg00059.html
* Discourage NEON on A8 http://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg00576.html

14 years, 4 months


[ACTIVITY] Mar.07 -- Mar.13

by Chung-Lin Tang

== Last week ==

* Working on Coremark ARMv6 regressions. Identified a major cause being RTL ifcvt failing on one of the crc routines, due to the combine pass failing to optimize a particular sequence, causing the if-conversion estimates to give up on conditional execution (too many insns). The combine pass failed on ARMv6 and above, due to the existence of true zero_extend insns. On ARMv5, the use of two shifts actually allowed combine to reduce the shifts one by one, thus producing better code. On ARMv6, combine produced a (xor (and ...) <mask>) which did not match any insn. Analyzed and sent a patch upstream which should work on such XOR cases. Patch is due for upstream commit for 4.7-stage1. (http://gcc.gnu.org/ml/gcc-patches/2011-03/msg00609.html)

* Another situation of un-optimized uxth insns still exists; trying to solve this with another combine patch that I am currently testing, will send upstream later.

== This week ==

* Verify the improvements the above patches should have on Coremark for ARMv6/v7.
* Work on sending them to Linaro and SG++ branches.
* Other bug issues.
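As a small aside on the "two shifts" versus "true zero_extend" point: both forms compute the same 16-bit zero extension, which is why the choice only affects what combine gets to pattern-match. The sketch below is plain Python with made-up function names; it models 32-bit register arithmetic and says nothing about how GCC's combine pass itself works.

```python
MASK32 = 0xFFFFFFFF


def zext16_shifts(x):
    """ARMv5-style sequence: shift left by 16, then logical shift right by 16,
    with the intermediate value truncated to 32 bits like a register."""
    return ((x << 16) & MASK32) >> 16


def zext16_mask(x):
    """What a single zero_extend (uxth-style) operation computes:
    keep the low 16 bits, clear the rest."""
    return x & 0xFFFF


samples = [0, 1, 0x7FFF, 0x8000, 0xFFFF, 0x12345678, 0xFFFFFFFF]
same = all(zext16_shifts(v) == zext16_mask(v) for v in samples)
```

The two-shift form exposes two separate operations that can each be folded into a neighbouring insn, while the mask form is a single AND that has to match as a whole.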

14 years, 4 months


[ACTIVITY] Mar 07 - Mar 11

by Ulrich Weigand

== GDB ==

* Ongoing work on glibc patch to add ARM unwind tables to system call stubs; ran into design problems that look difficult to fix.
* As an alternative, started work on a GDB patch to recognize glibc system call assembler stubs via code-scanning; this should allow unwinding in the absence of debug info for current libc code.
* Analyzed bug #728216 (GDB fails to get a valid backtrace while debugging a Webkit SIGSEGV) and resolved as invalid; the fault occurs within JIT-generated code where unwinding is impossible.

== Misc ==

* Made travel arrangements for Linaro Summit in Budapest

Mit freundlichen Gruessen / Best Regards

Ulrich Weigand

--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294

14 years, 4 months


[ACTIVITY] report week 10

by Peter Maydell

RAG:
Red:
Amber:
Green: another qemu-linaro release out the door on time

Current Milestones:
                              | Planned    | Estimate   | Actual     |
qemu-linaro 2011-03           | 2011-03-08 | 2011-03-08 | 2011-03-08 |

Historical Milestones:
finish virtio-system          | 2010-08-27 | postponed  |            |
finish testing PCI patches    | 2010-10-01 | 2010-10-22 | 2010-10-18 |
successful ARM qemu pull req  | 2010-12-16 | 2010-12-16 | 2010-12-16 |
finish qemu-cont-integration  | 2011-01-25 | 2011-01-25 | handed off |
first qemu-linaro release     | 2011-02-08 | 2011-02-08 | 2011-02-08 |

== maintain-beagle-models ==
* released qemu-linaro 2011-03
* had to do a 2011-03-1 reroll of the tarball on day of release to fix a "versatilepb model crashes on startup" bug found at last minute
* Paul Larson is working on having automated test image boots on qemu built from git, so we can catch this much earlier in the cycle

== merge-correctness-fixes ==
* added support to risu for testing of load and store instructions
* used this to test a patch which cleans up Thumb load/store decode and makes us UNDEF in the right places
* wrote/submitted patch to fix GE bits for signed modulo arithmetic
* wrote/submitted patch to get SMUAD/SMLAD Q bit right in an edge case
* started on a patchset which will fix various minor qemu Neon bugs detected by test programs from the valgrind source tree

== other ==
* meetings: toolchain, standup, pdsw-tools

Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus

Absences:
17/18 March: QEMU Users Forum, Grenoble
Holiday: 22 Apr - 2 May
9-13 May: UDS, Budapest
(maybe) ~17-19 August: QEMU/KVM strand at LinuxCon NA, Vancouver
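For readers not familiar with the GE bits mentioned above, here is a reference-style sketch of one signed modulo instruction, SADD16, written from the architectural description rather than from the QEMU patch (the report does not say which instructions or cases were wrong). The function names are made up. The point to notice is that the GE flags are set from the full-width signed sums, before the results are truncated to 16 bits.

```python
def sint16(value):
    """Interpret the low 16 bits of value as a signed halfword."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def sadd16(rn, rm):
    """Model of SADD16: two independent signed 16-bit additions.

    Returns (result, ge) where ge is the 4-bit APSR.GE field:
    GE[1:0] reflects the low-halfword sum, GE[3:2] the high-halfword sum.
    """
    low_sum = sint16(rn) + sint16(rm)
    high_sum = sint16(rn >> 16) + sint16(rm >> 16)
    result = ((high_sum & 0xFFFF) << 16) | (low_sum & 0xFFFF)
    ge = (0b0011 if low_sum >= 0 else 0) | (0b1100 if high_sum >= 0 else 0)
    return result, ge


# Low halves: 1 + (-2) = -1 (negative); high halves: 1 + 1 = 2 (non-negative).
mixed = sadd16(0x00010001, 0x0001FFFE)

# 0x7FFF + 1 wraps to 0x8000 in the result, but the untruncated sum is
# positive, so both GE pairs are still set.
wrapped = sadd16(0x7FFF7FFF, 0x00010001)
```

A test generator such as risu can then compare results like these against real hardware for many random operand pairs.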

14 years, 4 months


[ACTIVITY] weekly status

by Ken Werner

Hi,

== libunwind ==
* the patches posted last week are now upstream
* continued to study the Exception Handling ABI for the ARM Architecture
* looked into the structure of libunwind (lib interdependencies)
* documented at: https://wiki.linaro.org/KenWerner/Sandbox/libunwind
* The work on the local unwinding appears to be quite complete. If the generic unwind model is used the code assumes the GCC personality routine. We should either check the name of the symbol (may be difficult) or just call the pers function. I'm in contact with Zach on this.

Regards
Ken

14 years, 4 months


Bad code generation due to shrink-wrap optimisation

by Michael Hope

LP: #731665 is a silent bad code generation bug at least on functions which are empty except for inline assembly: https://bugs.launchpad.net/ubuntu/+source/gcc-4.5/+bug/731665 It was introduced in the shrink-wrap patch and is due to using an uninitialised variable. Andrew, can you please address this urgently either in Linaro or CSL. -- Michael

14 years, 4 months


[ACTIVITY] 2011-03-10

by David Gilbert

== hard-float ==
* Updated the libffi variadic patch and sent it to the ffi mailing list.

== String routines ==
* Got a big endian build environment going
* Patched up memchr and strlen for big endian; turned out to be a very small change in the end; and tested it on qemu-armeb - note that it didn't work on an older version, but it did on a newer one; I'll assume the newer one is correct.
* Fixed a couple of build issues in the cortex strings test harness

== Other ==
* Kicked off a SPEC2006 train run on canis using the 2011.03 compilers

I'm on holiday tomorrow (Friday) and Monday.

Dave

14 years, 4 months


[ACTIVITY] March 6-10

by Revital1 Eres

Hello,

* Sent the patch to support targets whose doloop part is not decoupled from the rest of the loop's instructions (as is the case for ARM) to @gcc-patches: http://gcc.gnu.org/ml/gcc-patches/2011-03/msg00350.html
* Continue looking into DENbench benchmarks.

Thanks,
Revital

14 years, 4 months


[ACTIVITY] March 6-10

by Ira Rosen

Hi,

* continued working on cost model tuning. I don't see much difference running EEMBC DenBench with and without vectorization enabled (and, therefore, also with and without cost model). Also, I have to say that the results are not stable, and I sometimes get a 10% difference just running the same executable two times in a row.
* the only benchmark where I see a consistent degradation (5%) with vectorization is DenBench aes, both with GCC trunk and gcc-linaro 4.5. I found one of the responsible loops; if it is not vectorized I see only 1.8% degradation. The problem there is that the loop bound is unknown at compile time, so the vectorizer attempts to vectorize the loop using runtime guards to verify that there are enough iterations to vectorize. The actual number of iterations is 4, so the scalar version of the loop is chosen at run time, but I guess the guards cause the degradation. I'll continue looking into this next week.
* prepared the conditional-store-sink patch (one of the patches that helps to vectorize Telecom Viterbi) for submission to gcc-patches.

Ira
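To illustrate the shape of the problem with the aes loop, here is a toy model in Python of a loop versioned with a runtime guard. It is only a sketch: VF and MIN_ITERS are made-up values, and the real guard and threshold are whatever the vectorizer's cost model emits. The guard is evaluated on every entry to the loop, so a short trip count pays for the check and then runs the scalar code anyway.

```python
VF = 4          # hypothetical vectorization factor (elements per vector op)
MIN_ITERS = 8   # hypothetical profitability threshold tested at run time


def add_arrays(a, b, n):
    """Element-wise add of the first n items, versioned like a guarded loop.

    Returns (result, path) where path records which version did the work.
    """
    out = [0] * n
    i = 0
    path = "scalar"
    if n >= MIN_ITERS:               # runtime guard: bound unknown at compile time
        path = "vector"
        while i + VF <= n:           # vector body, VF elements at a time
            out[i:i + VF] = [x + y for x, y in zip(a[i:i + VF], b[i:i + VF])]
            i += VF
    while i < n:                     # scalar loop, also serves as the epilogue
        out[i] = a[i] + b[i]
        i += 1
    return out, path


short_result, short_path = add_arrays([1, 2, 3, 4], [10, 20, 30, 40], 4)
long_result, long_path = add_arrays(list(range(10)), list(range(10)), 10)
```

With a trip count of 4, as in the loop described above, the guard fails and the scalar path is taken, so vectorization adds the cost of the check without any benefit.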

14 years, 4 months


Linaro QEMU 2011.03-1 released

by Peter Maydell

The Linaro Toolchain Working Group is pleased to announce the release of Linaro QEMU 2011.03-1. Linaro QEMU 2011.03-1 is the second release of qemu-linaro. Based off upstream (trunk) qemu, it includes a number of ARM-focused bug fixes and enhancements. This release includes a model of the ARM Versatile Express platform. This is still experimental but may be of use to people who want a model supporting up to 1GB of RAM with graphics and networking. Instructions for getting started with it are on the wiki: https://wiki.linaro.org/PeterMaydell/QemuVersatileExpress Other interesting changes include: - The OMAP emulation bug which was causing hangs if Linux tried to enable a swapfile is fixed - The OMAP UART model has been improved; this fixes the problem where kernels using the new omap-hsuart serial drivers stopped serial output halfway through boot. - As usual, various minor correctness fixes and other upstream changes Known issues: - The beagle and beaglexm models do not support USB, so there is no keyboard, mouse or networking (#708703) The only change over the shortlived 2011.03-0 is that the last minute bug #731093 has been fixed (versatilepb models would crash on startup.) The source tarball is available at: https://launchpad.net/qemu-linaro/+milestone/2011.03-1 Binary builds of this qemu-linaro release are being prepared and will be available shortly for users of Ubuntu. When ready, Natty packages of qemu-linaro 2011.03-1 will be in the Ubuntu archive. Packages for users of Ubuntu 10.04 LTS and Ubuntu 10.10 will be in the linaro-maintainers tools ppa: https://launchpad.net/~linaro-maintainers/+archive/tools/ More information on Linaro QEMU is available at: https://launchpad.net/qemu-linaro

14 years, 4 months


Getting linaro toolchain binaries

by Dave Martin

Hi all,

I've had comments that getting hold of binaries for the linaro toolchain can be tricky for people unfamiliar with the linaro tools.

One reason is that we don't release binaries as such -- but a visitor browsing in through http://www.linaro.org/downloads/ won't discover this, and may waste a lot of time trying to understand launchpad etc. before coming to the conclusion that binaries either aren't available or are not easily findable.

On the other hand, the cross toolchain packages are likely to be of interest to such visitors, but aren't obviously advertised -- maybe I'm looking in the wrong place, but if so then new visitors to the linaro pages are likely to look in the wrong place too.

Would it make sense to explain the situation more prominently so that visitors know what to expect? Something along the lines of "if you use distro x revision y, these cross-compiler packages are available" and "if you need the tools for some other environment, you need to download the source and build it for yourself".

Cheers
---Dave

14 years, 4 months


Linaro GDB 7.2 2011-03 released

by Michael Hope

The Linaro Toolchain Working Group is pleased to announce the release of Linaro GDB 7.2. Linaro GDB 7.2 2011.03-0 is the fourth release in the 7.2 series. Based off the latest GDB 7.2, it includes ARM-focused bug fixes and enhancements. Interesting changes include: * Hardware watchpoint support * Backtracing while in the Linux kernel trampoline frame Hardware watchpoints use the support built into ARM devices to watch for changes in values in memory with little performance impact. A 2.6.37 or later kernel is required. The source tarball is available at: https://launchpad.net/gdb-linaro/+milestone/7.2-2011.03-0 More information on Linaro GDB is available at: https://launchpad.net/gdb-linaro -- Michael

14 years, 4 months


[ACTIVITY] 28th February - 5th March

by Andrew Stubbs

Committed Kazu's VFP testcases patch upstream.

Merged the latest from upstream GCC 4.6.

Merged all the outstanding launchpad merge requests against both GCC 4.5 and 4.6.

Spun the 4.5-2011.03-0 and 4.6-2011.03-0 releases. Passed the tarballs to Michael H for final testing.

Brought the patch tracker up to date w.r.t. new merges.

Posted one of Dan's patches upstream for review.

Decided to drop Julian's A8 alignment patch completely. I had previously discovered it provided no measurable benefit on A8, and now I've found the same for A9 (Pandaboard). There's no real improvement for any combination of -falign-* options in EEMBC.

Bernd's "Discourage NEON on A8" patch also doesn't show any value in the benchmark results, but I think I've forward ported it wrong, because it should at least change the binary size, and it doesn't. I need to look into this further.

I also decided I don't know enough about ARMv7, so I spent some time reading a few chapters from the ARM A.R.M.

----

Upstream patches requiring review:

* Thumb2 constants: http://gcc.gnu.org/ml/gcc-patches/2010-12/msg00652.html
* ARM EABI half-precision functions http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00874.html
* ARM Thumb2 Spill Likely tweak http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00880.html
* NEON scheduling patch http://gcc.gnu.org/ml/gcc-patches/2011-02/msg01431.html
* RVCT Interoperability patch http://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg00059.html

14 years, 4 months


[ACTIVITY] Feb.28 -- Mar.06

by Chung-Lin Tang

Last week: * Launchpad #711819 / PR47719: ARM minipool ICE. Followed up on discussion with Bernd and Ramana. Later posted discussion results on gcc-patches, where Richard Earnshaw took it over with a final fix. * Coremark ARMv7/v6 regressions: mostly pinpointed the exact cases where RTL simplification fails to optimize away ZERO_EXTEND expressions. Still working on how to enhance it. * TW Public Holiday on Feb.28 (Mon), was off for one day. This week: * Try to turn Coremark regression investigation into code form. * Other GCC issues.

14 years, 4 months


Representing interleaving and lane load/stores at the tree level

by Richard Sandiford

I've been spending this week playing around with various representations of the v{ld,st}{1,2,3,4}{,_lane} operations. I agree with Ira that the best representation would be to use built-in functions.

One concern in the original discussion was that the optimisers might move the original MEM_REFs away from the call. I don't think that's a problem though. For loads, we can simply treat the whole of the accessed memory as an array, and pass the array by value. If we do that, then the call would just look like:

    __builtin_load_lanes (MEM_REF[(elem[N] *)ADDR])

(where, despite the C notation, the MEM_REF accesses the whole of elem[N]). It is of course possible in principle for the tree optimisers to replace this MEM_REF with another, equivalent, one, but that's OK semantically. It isn't possible for the optimisers to replace it with something like an SSA name, because arrays can't be stored in gimple registers.

__builtin_load_lanes would then be used like this:

    combined_vectors = __builtin_load_lanes (...);
    vector1 = ...extract first vector from combined_vectors...
    vector2 = ...extract second vector from combined_vectors...
    ....

So combined_vectors only exists for load and extract operations. The question then is: what type should it have? (At this point I'm just talking about types, not modes.) The main possibilities seemed to be:

1. an integer type

   Pros
   * Gimple registers can store integers.

   Cons
   * As Julian points out, GCC doesn't really support integer types that are wider than 2 HOST_WIDE_INTs. It would be good to remove that restriction, but it might be a lot of work, and it isn't something we'd want to take on as part of this project.
   * We're not really using the type as an integer.
   * The combination of the integer type and the __builtin_load_lanes array argument wouldn't be enough to determine the correct load operation. __builtin_load_lanes would need something like a vector count (N => vldN) argument as well.

2. a combined vector type

   Pros
   * Gimple registers can store vectors.

   Cons
   * For vld3, this would mean creating vector types with non-power-of-two vectors. GCC doesn't support those yet, and you get ICEs as soon as you try to use them. (Remember that this is all about types, not modes.) It _might_ be interesting to implement this support, but as above, it would be a lot of work. It also raises some semantic questions, such as: what is the alignment of the new vectors? Which leads to...
   * The alignment of the type would be strange. E.g. suppose we're loading N*2 uint32_ts into N vectors of 2 elements each. The types and alignments would be:

         N=2 uint32x4_t, alignment 16
         N=3 uint32x6_t, alignment 8 (if we follow the convention for modes)
         N=4 uint32x8_t, alignment 32

     We don't need alignments greater than 8 in our intended use; 16 and 32 are overkill.
   * We're not really using the type as a single vector, but as a collection of vectors.
   * The combination of the vector type and the __builtin_load_lanes array argument wouldn't be enough to determine the correct load operation. __builtin_load_lanes would need something like a vector count (N => vldN) argument as well.

3. an array of vectors type

   Pros
   * No support for new GCC features (large integers or non-power-of-two vectors) is needed.
   * The alignment of the type would be taken from the alignment of the individual vectors, which is correct.
   * It accurately reflects how the loaded value is going to be used.
   * The type uniquely identifies the correct load operation, without need for additional arguments. (This is minor.)

   Cons
   * Gimple registers can't store array values.

So I think the only disadvantage of using an array of vectors is that the result can never be a gimple register. But that isn't much of a disadvantage really; the things we care about are the individual vectors, which can of course be treated as gimple registers. I think our tracking of memory values is good enough for combined_vectors to be treated as such (even though, with the back-end changes we talked about earlier, they will actually be stored in RTL registers).

So how about the following functions? (Forgive the pascally syntax.)

    __builtin_load_lanes (REF : array N*M of X)
      returns array N of vector M of X
      maps to vldN
      in practice, the result would be used in assignments of the form:
        vectorX = ARRAY_REF <result, X>

    __builtin_store_lanes (VECTORS : array N of vector M of X)
      returns array N*M of X
      maps to vstN
      in practice, the argument would be populated by assignments of the form:
        vectorX = ARRAY_REF <result, X>

    __builtin_load_lane (REF : array N of X,
                         VECTORS : array N of vector M of X,
                         LANE : integer)
      returns array N of vector M of X
      maps to vldN_lane

    __builtin_store_lane (VECTORS : array N of vector M of X,
                          LANE : integer)
      returns array N of X
      maps to vstN_lane

Note that each operation can be expanded independently. The expansion doesn't rely on preceding or following statements.

I've hacked up the prototype below as a proof of concept. It includes changes to the C parser to allow these functions to be created in the original source code. This is throw-away code though; it would never be submitted.

I've also included a simple test case and the output I get from it. The output looks pretty good; there's not even the stray VMOV that I saw with the intrinsics earlier in the week.

(Note that if you'd like to try this yourself, you'll need the patch I posted on Monday as well.)

What do you think? Obviously this discussion needs to move to gcc@ at some point, but I wanted to make sure this was vaguely sane first.

Richard
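For readers who want to see the intended data movement, the signatures above can be modelled with plain Python lists. This is only an illustration of the semantics as described in the mail (vldN de-interleaves, vstN interleaves, the _lane forms touch a single lane); the Python function names and the list representation are assumptions, and none of this is the prototype patch.

```python
def load_lanes(ref, n):
    """Model of __builtin_load_lanes: array N*M of X -> array N of vector M of X.
    Element i of the flat array goes to vector i % n (vldN-style de-interleave)."""
    assert len(ref) % n == 0
    return [ref[i::n] for i in range(n)]


def store_lanes(vectors):
    """Model of __builtin_store_lanes: array N of vector M of X -> array N*M of X.
    The inverse interleave (vstN-style)."""
    n, m = len(vectors), len(vectors[0])
    return [vectors[j][i] for i in range(m) for j in range(n)]


def load_lane(ref, vectors, lane):
    """Model of __builtin_load_lane: put ref[j] into lane `lane` of vector j,
    leaving every other lane unchanged."""
    return [v[:lane] + [ref[j]] + v[lane + 1:] for j, v in enumerate(vectors)]


def store_lane(vectors, lane):
    """Model of __builtin_store_lane: read lane `lane` of each vector."""
    return [v[lane] for v in vectors]


# Interleaved RGB data: two pixels, three channels (N=3, M=2).
pixels = ["r0", "g0", "b0", "r1", "g1", "b1"]
channels = load_lanes(pixels, 3)
patched = load_lane(["R", "G", "B"], channels, 1)
```

Here channels is [['r0', 'r1'], ['g0', 'g1'], ['b0', 'b1']], and store_lanes(channels) gives back the original interleaved list, which is the round trip a vld3/vst3 pair performs.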

14 years, 4 months


A question about disabling -gtoggle in bootstrap run

by Revital1 Eres

Hello,

I am looking for a way to disable the '-gtoggle' flag in the run of stage 2 in bootstrap, when configuring ARM with (*). The flag seems to be applied in stage 2 but not in stage 3, which seems to cause a bootstrap failure when testing SMS: in stage 2 SMS fails because of a debug_insn caused by -gtoggle disturbing the do-loop, while in stage 3 SMS succeeds, resulting in different .o files and a bootstrap failure.

(*) This is the configure line I used:
../gcc/configure --prefix=/home/eres/mainline/build --enable-checking --enable-languages=c --enable-bootstrap

Thanks,
Revital

14 years, 4 months


[ACTIVITY] Feb 28 - Mar 04

by Ulrich Weigand

== GDB == * Committed fix for the GDB part of #620611 (Unable to backtrace out of vector page 0xffff0000) to mainline and Linaro GDB 7.2. * Ran into GDB crashes due to memory corruption in tests involving multiple inferiors. Tracked down root cause (using valgrind) to long-standing double free bug in GDB terminal state handling code. Committed fix to mainline and Linaro GDB 7.2. * While using valgrind (see above), ran into problems: * ptrace system call is unsupported on ARM * certain variants of the "SUB from SP" Thumb-2 instruction are not handled by the VEX compiler Fixed both problems locally, and was then able to successfully valgrind GDB on ARM. * Created Linaro GDB 7.2-2011.03-0 release. * Worked on glibc patch to add ARM unwind tables to system call stubs; this will help unwinding in the absence of debug info for libc, and in particular fix #684218 (Failures in gdb.base/call-signal-resume.exp) Mit freundlichen Gruessen / Best Regards Ulrich Weigand -- Dr. Ulrich Weigand | Phone: +49-7031/16-3727 STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E. IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294

14 years, 4 months


[ACTIVITY] weekly status

by Ken Werner

Hi, == PandaBoard == * upgraded my ARM dev environment from Ubuntu to Linaro snapshot (20110303) * found another kernel bug on the panda (#728565) == libunwind == * resolved build issues on ARM (when using the linaro snapshot) * allows the testsuite to work with linkers that do not pull in indirect shared libs * fix build of the test-static-link test case on ARM * link libunwind-setjmp.so against libunwind-elf * posted some first patches on the libunwind ml * learned about the Exception Handling ABI for the ARM Architecture Regards Ken

14 years, 4 months

[Activity] 2011-03-04 week 9

by Ramana Radhakrishnan

Starting back in Linaro land after a gap of 3-4 weeks where I've been away on ARM internal tasks.

== GCC ==
- Setting up a new machine that I received for Linaro work.
- Spent some time reviewing upstream patches. Spent some time on the P1 PR47719 upstream to get this fixed.
- Starting to read up on the benchmarking report and recreating the environments.
- Looked through some of the speed tickets and spent some time on them.
- Put in a hardware request for a Panda board.

== Next Week ==
- Set up environment properly for some amount of benchmarking.
- Look at some of the performance regressions and work on some things that need to be done.
- Continue looking at PR47719.

14 years, 4 months

[ACTIVITY] 2011-03-04

by David Gilbert

* Investigated and fixed sqlite3 testsuite failure on ARM (bug 725052)
* Discussing libffi API changes with maintainer; hopefully he's going to send out his comments today.
* Looking at how to upstream the string routine changes
* Need to look at big endian testing
* Testing QEMU pre-release for Peter; looking very nice.

Dave

14 years, 4 months

[ACTIVITY] report week 09

by Peter Maydell

RAG:
Red:
Amber:
Green:

Current Milestones:
|                              | Planned    | Estimate   | Actual     |
| qemu-linaro 2011-03          | 2011-03-08 | 2011-03-08 |            |

Historical Milestones:
| finish virtio-system         | 2010-08-27 | postponed  |            |
| finish testing PCI patches   | 2010-10-01 | 2010-10-22 | 2010-10-18 |
| successful ARM qemu pull req | 2010-12-16 | 2010-12-16 | 2010-12-16 |
| finish qemu-cont-integration | 2011-01-25 | 2011-01-25 | handed off |
| first qemu-linaro release    | 2011-02-08 | 2011-02-08 | 2011-02-08 |

== maintain-beagle-models ==
* preparation and test for next week's qemu-linaro 2011-03 release
* put in a temporary fix for bug 723630 (apt/glibc now try prlimit64 syscall, so silence qemu warnings about not implementing it)
* investigated qemu warnings about bad 16 bit writes: this is a kernel bug: https://bugs.launchpad.net/linux-linaro/+bug/727781

== vexpress model ==
* sent vexpress patches upstream, put into qemu-linaro

== merge-correctness-fixes ==
* more work on performance counter registers: proper cycle counter implementation; now just needs a bit of tidying before upstreaming
* ran valgrind's test cases on qemu; added revealed issues to https://blueprints.launchpad.net/qemu-linaro/+spec/merge-correctness-fixes
* sent patch fixing broken VMOV s0,s1,r0,r1 implementation
* sent patch fixing inverted carry bit on ORNS

== other ==
* meetings: toolchain, standup, architecture q&a, pdsw-tools, team brief

Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus

Absences:
17/18 March: QEMU Users Forum, Grenoble
Holiday: 22 Apr - 2 May
9-13 May: UDS, Budapest
(maybe) ~17-19 August: QEMU/KVM strand at LinuxCon NA, Vancouver
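
For readers unfamiliar with the ORNS item in this report, here is a minimal sketch, not QEMU code, of how ORN with flag setting is expected to behave for a register operand shifted left by a constant. It assumes the usual architectural rule that C takes the shifter carry-out unchanged, so an emulator that inverts it produces the kind of bug described. The function names `lsl_c` and `orns` are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def lsl_c(value, amount, carry_in):
    """Logical shift left with carry-out, for shift amounts 0..31."""
    if amount == 0:
        # No shift: the carry flag passes through unchanged.
        return value & MASK32, carry_in
    carry_out = (value >> (32 - amount)) & 1  # last bit shifted out
    return (value << amount) & MASK32, carry_out

def orns(rn, rm, shift, carry_in):
    """Model of ORNS Rd, Rn, Rm, LSL #shift (illustrative only)."""
    shifted, carry = lsl_c(rm, shift, carry_in)
    result = (rn | (~shifted & MASK32)) & MASK32
    flags = {"N": result >> 31, "Z": int(result == 0), "C": carry}
    return result, flags

if __name__ == "__main__":
    print(orns(0, 0x80000000, 1, 0))
```

The V flag is left out because this form of the instruction does not change it.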

14 years, 4 months

[ACTIVITY] Weekly status

by Richard Sandiford

== This week ==
* Submitted the fix for the Qt miscompilation upstream. Applied after approval.
* Submitted a patch for the Thumb LDR problem that Dave Martin hit. This was rejected.
* Ended up spending a few days on the "unreasonable amount of memory while compiling qemu" bug due to unfamiliarity with the DWARF 2 code. I realise the original idea was that I'd just file this upstream, but it was one of those cases where I kept finding out more info for the bug report until the problem became obvious. I've now submitted two patches for this upstream. The first was trivial and is now in. I was asked to add a bit of extra code to the second, which I hope to do next week.
* Looked at the MIPS bug that was reported against the Linaro toolchain. This turned out to be a problem in our extension elimination pass. Submitted a merge request for that.
* Got confirmation from ARM that we should use relocation number 160 for R_ARM_IRELATIVE, and that it was OK to make the changes public (thanks!). I've now submitted the binutils patches upstream. I'll do the eglibc ones when I get back.

== Next week ==
Holiday!

Richard

14 years, 4 months

[ACTIVITY] February 27- March 3

by Revital1 Eres

Hello,

Testing the patch for SMS to support targets whose doloop part is not decoupled from the rest of the loop's instructions, as SMS currently requires. The testing includes bootstrapping on ARM machines for the C language, configured with and without the --with-arch=armv7-a option, and using the "-O2 -fmodulo-sched -fmodulo-sched-allow-regmoves -fno-auto-inc-dec -funsafe-math-optimizations -mthumb" flags.

Thanks,
Revital

14 years, 4 months

GCC PR43137

by Andrew Stubbs

On Monday, I was asked to find out whether the fix for GCC Bugzilla PR43137 was present in our source base. I can confirm that it is *not* present. Apologies for the delay. Andrew

14 years, 4 months

What configuration(s) to test?

by Andrew Stubbs

Hi All, Up until now, I have had no choice but to test toolchain correctness on A8 hardware. It made sense to use the same -mfpu settings as the Linaro/Ubuntu package builds use. This did not match the policy that the interesting platform was A9-NEON, but I didn't have that option. That's changed now - our Panda boards have arrived! Yay! :) But, it seems to me that if I change to using the Pandas for correctness testing (not performance testing) then I won't be testing what Ubuntu will actually use. So what should I test on? I'd rather not double my test load by testing on both, but that is an option .... Any suggestions? Andrew

14 years, 4 months

Re: Thumb-2 kernel on imx51/efikamx

by Dave Martin

Hi,

On Mon, Feb 28, 2011 at 9:19 PM, Nicolas Pitre <nicolas.pitre(a)linaro.org> wrote:
> On Thu, 24 Feb 2011, John Rigby wrote:
>
>> The resulting kernel builds and boots but some modules have problems:
>>
>> $ modprobe fat
>> fat: unknown relocation: 102
>> FATAL: Error inserting vfat
>
> A workaround for what appears to be a binutils bug has been merged in
> linaro-2.6.38. So the Thumb2 kernel testing may resume on trusted
> targets.

Thanks for merging it. It's a bit ugly to turn off compiler optimisations to work around this though, so we might encounter resistance upstream to that patch.

In any case, we still need someone to take a look at the possible tools issue -- CC'ing linaro-toolchain in case people aren't aware:
https://bugs.launchpad.net/binutils-linaro/+bug/725126

Cheers
---Dave

14 years, 4 months

Re: [RFC] Linaro Toolchain for Android and NDK

by Jim Huang

On 25 February 2011 22:28, Alexander Sack <asac(a)linaro.org> wrote: > On Wed, Feb 23, 2011 at 8:28 AM, Jim Huang <jserv(a)0xlab.org> wrote: >> I would like to make a proposal about utilizing Linaro toolchain for >> Android and NDK (Native Development Kit)[1]. Added linaro-toolchai list in Cc. >> ** Motivation >> >> There are some different perspectives between Linaro toolchain and >> Google Android toolchain including technical and >> non-technical considerations. It doesn't really work if we only >> replace prebuilt toolchain with Linaro toolchain because >> of the compatibility of Android system utilities such as ELF >> prelinker. Also, since Android is developed in relatively closed > > I don't have enough background to understand this "ELF prelinker" > stuff. Are you saying that because of the way how android links stuff > we cannot have one code base for gcc that works for both, android and > "normal libc linux"? > Take Bug #707487 for example: https://bugs.launchpad.net/binutils-linaro/+bug/707487 It is evident that Android's system utilities like soslim ("strip" implementation) and apriori ("prelink" implementation) expect the specific output of GNU Toolchain, but it sometimes varies since we would take Linaro's toolchain. >> environment (Google style open source model), a great amount of >> software components are not always verified by different >> toolchain or build configurations. This proposal attempts to > > ack. thats what we want to do. Of course, we cannot really verify what > is going on behind closed walls, but we can continuously build android > with our toolchain and fix issues due that in android public master > and if even that doesn't work we can ensure that our android trees > always work nicely with both, our gcc and android gcc. > Android team is known to work on this field already. 
> Another thing is to make our toolchain easily consumable (like the NDK > you mention at the bottom); this will increase chances that someone > from google can eventually take a look at what we are doing etc. and > also helps the community to use linaro toolchain to built their > android distributions. Agree. >> establish the compact development flow to enable Linaro >> optimized ARM toolchain to build Android from scratch and verify it >> transparently. Eventually, Android can be the reference >> indicator as Linaro toolchain performance and reliability. >> >> >> ** Brief introduction to Google Android toolchain >> >> Inside Google, there is a dedicated compiler team working on GNU >> Toolchain for various purposes including server-side >> computing, Android, Chrome OS, etc. Google engineers submit patches to >> upstream for public review and maintain the >> toolchain for Android. Along with each Android Open Source Prokect >> (AOSP) release, there is a special branch in korg >> GIT [2] for hosting the GPL'd toolchain source code modified by >> Google. Usually, file "README.google" mentioned the >> summary, but it is not developer friendly because several changes were >> done within one GIT commit. >> >> Please refer to wiki for details: >> https://wiki.linaro.org/Platform/Android/UpstreamToolchain >> > > thats a good wiki page. thanks for the content. If I read the skia > example correctly, we could add a test to our "normal" abrek testsuite > that uses our daily android toolchain and run the skia benchmark? e.g. > we could start doing this benchmarking even without having a > validation solution ready for android targets? > If adb is supposed to work well on target, then you can easily use "bench.py" script mentioned in the above wiki to do several benchmarking. > Please let's talk to Paul how we can get the android toolchain to > /opt/android as part of abrek and lets try to add this to our abrek > testsuites. 
Until we have daily toolchain builds it would be OK to > download the android toolchain tarball from a fixed place from > people.linaro.org I guess. > Ok! >> ** What's wrong with Android upstream Toolchain? >> >> In my opinion, list as following: >> >> (1) Few information about Google improvement: Sometimes, we have to >> guess something from implicit GIT commitlog >> such as "commit gcc-4.4.3 which is used to build gcc-4.4.3 Android >> toolchain in master"[3]. It is hard to track and get >> verified carefully. > > yes, that feels like a messy situation. Do we know why they don't > commit the changes as individual commits but then in next step > document what they changed? I have no exact idea since I am just an observer regarding Android's GIT tree. Google engineers do send patches to FSF/GNU, but it is not always related to the GIT activities we have seen in korg. You can search the keyword, "submit", in file gcc/gcc-4.4.3/README.google , and you will see some descriptions as following: gcc/cp/cp-lang.c gcc/gimple.c gcc/langhooks-def.h gcc/langhooks.h gcc/langhooks.c gcc/tree-flow.h gcc/tree-ssa-dce.c gcc/testsuite/g++.dg/tree-ssa/vptr_init_dse.C gcc/testsuite/g++.dg/tree-ssa/vptr_init_dse2.C Enhancing dead code elimination to eliminate useless vptr field initialization. Owner: davidxl Status: not submitted gcc/fold-const.c gcc/Makefile.in Fix 2045297 Owner: davidxl Status: Not submitted The information is too few to track for us since the above "Fix 2045297" tends to indicate Google bug database number instead of FSF's. >> (2) Google specific improvements are absent in recent release, only >> enabled months later. For example, Google Compiler >> Team Lead, Dr. Shih-wei Liao, presented the improvements against GNU >> Toolchain in the middle of 2009.[4]. The report >> came with several impressive improvements like FDO (Feedback Directed >> Optimizations) and IPO (Inter-Procedure >> Optimizations). 
However, only some of them are public to AOSP and be >> integrated late in the middle of 2010 (Android >> Froyo; 2.2). Even FDO was merged in Android Froyo already, but there >> is few documentation and no robust method to verify >> by community members such as Linaro engineers. > > you say that they don't publish the code for lets say the > "gingerbread" toolchain in a timely fashion when they release > gingerbread? Or do they ship a separate "fast" NDK/prebuilt for > partners through secret channels? I have no idea. >> (3) For some reasons, Google tends to deliver stable (old) toolchain >> plus mainline backport. It is a safe and workable approach, >> but sometimes developers would expect to use the latest technologies >> as Linaro aims to bring to the world. >> >> (4) Few readable documentation. For example, Google already open its >> toolchain benchmark suite in early 2010, but there was >> no document specific to such important components. Furthermore, there >> was one file gone in public kog GIT, required by >> automated benchmark process. One year later, Google engineer finally >> put back the one to public. This implies the unusual way >> Google developed and delivered software. >> > > Assuming good faith I would think this might just have been an oversight. > > Do you know if anyone from community pointed this out to google using > official android mailing lists/groups or a bug? > Google engineers sometimes pick up the issues from Google Code: http://code.google.com/p/android/issues/list And, they do discuss on mailing-list: http://developer.android.com/community/ >> ** Linaro's Approach to enable latest technologies >> >> Linaro android team tries to do: >> (1) Document Android toolchain and related utilities in korg GIT as >> possible as we can. > > That's good stuff and I think your wiki page is already a great > contribution in that direction. What we should do though is run this > through google eyes early by using official android mailing lists. 
> Got it. >> (2) Early adaptation of Linaro toolchain to Android build system and >> verify these output systematically. > > ack. Do you know if those changes would be conflicting with what we do > on "normal" linux side? e.g. do we need to maintain special android > patches or can we merge those into our main trees? > In fact, GCC 4.6 already merges Android specific patches with the help of CodeSourcery. We would initially backport these patches to linaro-gcc-4.5 branch for review. Luse Cheng already did it. However, other parts are not related to Android directly, and they might be too aggressive to generic GCC optimization, that can be the reason why Google didn't submit first. >> (3) Backport Google changes to Linaro GCC and review in public. > > This is really tricky as you said. Here again, we should propose this > on android mailing lists to maybe get feedback from google team and > maybe improve the way we work on that. Untangling a big patch based > just on changelog feels really unefficient. Ok, I got your point. However, what we need is to create workable combination of Linaro kernel + Linaro toolchain for Android integration engineers. Alexander, I need your help to catch the attention of someone at Google. > Also, we have to remember that if we pick changes out of _their_ tree, > we cannot upstream those to fsf because we don't own copyright to > those. Of course, for stuff they already pushed to 4.6 its not a > problem to backport them from fsf trunk. Thanks for notice. >> (4) Improve the deployment and validation flow by means of Linaro >> infrastructure. > > my understanding is this: > > 1. we add support to build android toolchain from linaro branches to > our cloud build service > 2. we do this so that we either produce a full toolchain tarball that > can be installed under /opt/android or a NDK tarball (or both) NDK doesn't need admin permission to install. > 3. 
we improve our android platform build infrastructure to allow > using latest daily toolchain tarball and then we build android with a) > google toolchain and b) linaro toolchain; in this way we get daily > android builds for both toolchains that can go into the linaro > validation farm and get the typical validation/testing and > benchmarking done. Yes, it would be great. >> (5) Build and test Android system with Linaro tools. Then, figure out >> the regressions caused by Linaro Toolchain and/or >> aggressive optimizations > > right. I think that's covered with the point above, no? The android > builds done with our toolchain would also be available in public, so > you can do whatever you want on top of what we already > test/validate/measure automatically in the validation farm with them. Agree. >> (6) measure performance gain by Linaro tools > > right. for this we need to define a set of open-source benchmarks to > run and ensure that those are supported in our validation framework. > >> The detailed specification in wiki: >> https://wiki.linaro.org/Platform/Android/Specs/LinaroAndroidToolchain >> >> ** Implementation of Linaro toolchain for Android >> >> We started from Android style toolchain build and move to Linaro GCC + >> ARM specific optimizations in mind. The initial work >> can be obtained by wiki: >> https://wiki.linaro.org/Platform/Android/Toolchain >> >> We plan to maintain the following GIT repositories at least: >> * android/toolchain/build.git : Linaro-aware build system. Derived >> from Android toolchain build system, it can handle Linaro-GCC >> and Linaro snapshot/bzr. >> * android/toolchain/gcc-patches.git : Patchset to be applied on top >> of Linaro-GCC release/snapshots > > I think thats fine. however, how do we ensure that we have patches > that always apply to both release/snapshots? do we maintain branches > for gcc-patches.git in case you need two versions of patch X if the > linaro gcc codebase diverged? 
I might need help from toolchain WG.

>> The reference builder script output:
>> $ ./linaro-build.sh --help
>> --prefix-dir= Specify where to install (default:
>> /tmp/android-toolchain-eabi)
>> --gcc-src-dir= Specify where linaro gcc source is (in <toolchain>/gcc)
>> --apply-gcc-patch=(yes|no) Apply-patch which in
>> <toolchain>/gcc-patches directory (default: no)
>>
>> Current verified combinations:
>> * gcc-linaro: 4.5-2011.02-0
>> * binutils: 2.20.1
>> * gmp: 4.2.4
>> * mpfr: 2.4.1
>>
>> Only gcc is replaced by gcc-linaro: 4.5-2011.02-0 and others are
>> checked out from korg GIT.
>
> do we need to do something like --gcc-src-dir and -patches for
> binutils, gmp and mpfr as well? or would we be only interested in
> improving/fixing gcc for now?
>

I think focusing on linaro-gcc is pretty good. We can follow the original combination of Google.

> Maybe we also want to support protocol schemes like git: http: and
> bzr+ssh:/lp: for the --gcc-src= argument. this would then
> automatically download/branch the source tree from the given location.
> What do you think?

Agree.

>> ** Summary of gcc-patches
>>
>> "gcc-patches" are used as "backport" from Google changes into Linaro
>> gcc base. Here is the summary at present:
>>
>> 0001-Add-linux-android.patch
>> Add linux-android
>>
>> 0002-Add-support-for-Bionic-C-library.patch
>> Add support for Bionic C library
>>
>> 0003-Support-compilation-for-Android-platform.patch
>> Support compilation for Android platform
>>
>> 0004-Add-multilib-configuration-for-arm-linux-androideabi.patch
>> Add multilib configuration for arm-linux-androideabi
>>
>> 0005-Fix-gthr-posix.h-to-support-Bionic.patch
>> Fix gthr-posix.h to support Bionic
>>
>> 0006-Add-untested-support-for-Bionic-to-libstdc.patch
>> Add [untested] support for Bionic to libstdc++
>>
>> These patches are taken from Maxim Kuvyrkov of CodeSourcery in gcc-4.6
>> branch. Of course, we can always add changes by
>> Google or other Android specific adaptation by this model.
>
> Can we get a toolchain example tarball done and uploaded to
> people.linaro.org? I would like to verify that those work out of the
> box with gingerbread and if so, i would like to see those land in the
> main toolchain WG branch rather than adding them to our gcc-patches
> tree.

Yes, I would like to do that later.

>> ** Planned improvements over Linaro toolchain for Android
>>
>> (1) GCC multilib setting
>> Default: arm, fpu and thumb. The prebuilt google toolchain use:
>> armv5te and mandroid. We should focus on ARMv7.
>> (2) HardFP-ABI Support for Android.
>> (3) Patch management: Better to get the Android patches into
>> Linaro-GCC tree eventually.
>> (4) Build system improvement. Don't have to build gmp, mpfr every time,
>> and provide option to build without gdb.
>> (5) Enable LTO (Link Time Optimization, introduced since gcc-4.5) in
>> Android TARGET_GLOBAL_CFLAGS
>> (6) Verify the functionality of FDO (Feedback Directed Optimization)
>> and introduce the approaches to integrate.
>
> I really think those topics should be executed by the toolchain WG
> rather than in platform. I am happy that we give them guidance and
> support them by providing them with easy to use tools to get their job
> done. Also feeding them with topics is great. Please talk to Michael
> Hope and ask him how he wants to collect those android toolchain
> optimization topic ideas. Could be good input for our 11.11
> requirements gathering process.

Agree.

>> ** Toward Android NDK
>>
>> Once Linaro toolchain for Android is ready to use, it is time to
>> re-package Android NDK by Linaro toolchain. To do that, extra
>> build configuration, sysroot, is required. According to Android
>> Release Cycle & Phases[5], the repacked NDK should be verified
>> one month after Android public release.
>
> That sounds like a great idea. What's the benefit/difference of
> shipping an NDK compared to just shipping a "normal" toolchain binary
> tarball for this purpose?

NDK consists of some architecture specific helper scripts/headers to indicate the optimization flags and some combinations such as ARMv7 with/without NEON, etc. If we provide NDK directly, users don't have to consider the above integration issues as far as I know. Sincerely, Jim Huang (jserv) Android Team
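
The suggestion in this thread to let the gcc source argument accept git:, http: and bzr+ssh:/lp: locations could be sketched as below. This is only an illustration of the idea, not part of linaro-build.sh: the function name `fetch_command`, the default destination `gcc-src`, and the choice to treat http locations as tarballs fetched with wget are all assumptions.

```python
def fetch_command(location, dest="gcc-src"):
    """Map a source location to the command that would fetch it.

    Returns a command as a list of arguments, or None when the location
    looks like a plain local directory that needs no download.
    """
    if location.startswith(("lp:", "bzr+ssh://")):
        # Launchpad shorthand and bzr over ssh are both handled by bzr.
        return ["bzr", "branch", location, dest]
    if location.startswith("git://") or location.endswith(".git"):
        return ["git", "clone", location, dest]
    if location.startswith(("http://", "https://")):
        # Assumption: a plain http(s) location points at a source tarball.
        return ["wget", "-O", dest + ".tar", location]
    return None

if __name__ == "__main__":
    for loc in ("lp:gcc-linaro/4.5", "git://example.org/gcc.git", "/src/gcc"):
        print(loc, "->", fetch_command(loc))
```

Returning the command rather than running it keeps the sketch testable; a real script would also have to unpack tarballs and handle failures.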

14 years, 4 months

[ACTIVITY] 21st - 26th February

by Andrew Stubbs

Temporarily took over Tech Lead of the Toolchain Working Group while Michael Hope recovers from the Christchurch earthquake. (He's fine, but unable to work.) This didn't actually require any action, in the end. Michael returned to work towards the end of the week.

Forward ported, benchmarked, and posted one of Mark Shinwell's NEON patches upstream. Further benchmarking was not possible as the Panda board I was using is located in Christchurch, NZ.

Merged and tested the FSF GCC 4.5 branch into Linaro GCC. There were a couple of test regressions in the fortran testsuite, so I've filed bug lp:723086. The other test results were either the same or better.

Benchmarked the ARM A8 function/jump alignment patch to see what effect it has in GCC 4.6. Found no measurable improvement in EEMBC. I suggest dropping this patch.

Brought the patch tracker up-to-date, and entered tracking tickets for all outstanding patches.

Merged FSF trunk to Linaro GCC 4.6.

Committed Jie's Thumb2 testcase fix to FSF GCC trunk. Thanks to Ramana for using his newfound authority to approve it.

Investigated the suitability of several of the patches for forward-porting. Corresponded with Bernd and Julian.

----

Upstream patches requiring review:
* Thumb2 constants: http://gcc.gnu.org/ml/gcc-patches/2010-12/msg00652.html
* Kazu's VFP testcases: http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00128.html
* ARM EABI half-precision functions http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00874.html
* ARM Thumb2 Spill Likely tweak http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00880.html
* NEON scheduling patch http://gcc.gnu.org/ml/gcc-patches/2011-02/msg01431.html

14 years, 4 months

[ACTIVITY] Feb 21 -- Feb 27

by Chung-Lin Tang

== Last week ==
* Launchpad #721021 GCC ICE on ARM/XScale: identified as case of upstream PR45177; backported and pushed to Linaro.
* Launchpad #709453/CS Issue #7122: Neon vmov 0.0 issues; some progress on my current WIP patch, but tests showed another 3 regressions, still on-going.
* Launchpad #711819/GCC PR47719: ICE in push_minipool_fix. Ramana reminded me that the pool range attributes my patch adds were actually removed earlier by Bernd in the fix for PR43137. Discussed and mostly concluded that we should add them back for now. Will re-submit patch with testcase to gcc-patches this week.
* Coremark ARMv7-A regressions: still work in progress.

== This week ==
* TW Public Holiday Feb.28 (Mon).
* Ping some of my upstream patch submissions.
* Get incomplete issues done.
* Coremark regression investigation.

14 years, 4 months

GCC M4 DSP support

by Christian ANDERSSON2

Hello Linaro toolchain guys,

I have a few questions about how fully GCC supports the ARM Cortex-M4. I'm especially thinking of the additional DSP instructions: are these supported, and how optimal is the code being produced?

Thanks for your support,
Best Regards
Christian (ST-Ericsson)

14 years, 4 months

[ACTIVITY] weekly status

by Ken Werner

Hi,

== Investigate developer tools ==
* Finished latrace investigation.

== PandaBoard ==
* The defective PandaBoard that was sent back in December is now repaired and on my desk again. It doesn't show the behaviour of #708883 and works flawlessly so far. :)

== libunwind ==
* Did some debugging of the test-async-sig testcase to get started with libunwind. It will dead-lock if you add "--enable-debug" since libunwind does printfs in this case which are not signal safe.
* Sorted out which of Zach's patches are upstream and which are not.
* Started to learn about the different unwind methods that libunwind provides on ARM.

Regards
Ken

14 years, 4 months

[ACTIVITY] 2011-02-25

by David Gilbert

== ffi ==
* Sent variadic patch for libffi to libffi-discuss
* Worked through some suggestions from Chung-Lin, need to do some rework

== string routines ==
* memchr & strchr patch sent for inclusion in ubuntu packages
* tried sqlite's benchmarks - they don't spend too much time in the C library; although a few % in memcpy, and ~1% in memset (also seem to have found an sqlite test case failure on ARM and filed as bug 725052)

== porting jam ==
* There wasn't much traffic on #linaro during this related to the jam
* I closed bug 635850 (fastdep FTBFS) which was already fixed with an explicit fix for ARM in the changelog and bug 492336 (eglibc's tst-eintr1 failing) which seems to work now but it's not clear when it was fixed.
* Looking at eglibc's test log there seem to be a bunch of others that are failing and may well be worth investigating.
* bug 372121 (qemu/xargs stack/number of arguments limit) seems to work ok, however the reporter did say it was quite a fragile test; that needs more investigation to see whether the original reason has actually been fixed.

== misc ==
* swapping notes with Peter on the PBX SD card investigation

Dave

14 years, 4 months

[ACTIVITY] report week 08

by Peter Maydell

RAG:
Red:
Amber:
Green:

Current Milestones:
|                              | Planned    | Estimate   | Actual     |
| qemu-linaro 2011-03          | 2011-03-08 | 2011-03-08 |            |

Historical Milestones:
| finish virtio-system         | 2010-08-27 | postponed  |            |
| finish testing PCI patches   | 2010-10-01 | 2010-10-22 | 2010-10-18 |
| successful ARM qemu pull req | 2010-12-16 | 2010-12-16 | 2010-12-16 |
| finish qemu-cont-integration | 2011-01-25 | 2011-01-25 | handed off |
| first qemu-linaro release    | 2011-02-08 | 2011-02-08 | 2011-02-08 |

== maintain-beagle-models ==
* rebased qemu-linaro on upstream
* checked omap_uart model for any issues with enabling the extended (non-16550A) features which the new Linux drivers need. Sent meego merge request for patchset which turns on the features, and does a little cleanup. Now in meego, qemu-linaro.

== merge-correctness-fixes ==
* reviewed versions 5 and 6 of Christophe's vrecpe/vsqrte patchset; v6 was good and has now been committed
* sent a version of "dummy cp14 debug registers" patch upstream; however I've realised it triggers a false positive in the temp-leak debugging code in target-arm/translate.c
* wrote/sent a patch which moves this temp-leak debugging code into TCG proper (which I think makes it much simpler and cleaner and avoids the false positives mentioned above)
* some work on the cp15 performance counter registers. I now have some code which I think is a fully architecturally valid implementation of an "implements no events" core, except that we don't implement the cycle count register.
* started testing/review of Adam's VA-to-PA translation regs patch. In the course of this discovered that qemu unconditionally implements an ARM940 cp15 WFI register which clashes with these; submitted patch to add correct not-for-v6/v7 feature gating.
* sent out patch fixing usermode seeks by 32 bit guest on 64 bit host (based on a diagnosis and suggested fix by Eoghan Sherry)
* sent patch fixing compile error in vnc code

== vexpress model ==
* sent a patchset for fixing the MMC card detect wiring on PBX upstream; this is needed for vexpress too
* finished vexpress cleanup and cross-checking against the docs; I now have a patchset I'm happy to upstream and will post next week

== other ==
* took part in pgp keysigning event with emdebian folks
* meetings: toolchain, PDSW-tools

Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus

Absences:
17/18 March: QEMU Users Forum, Grenoble
Holiday: 22 Apr - 2 May
9-13 May: UDS, Budapest
(maybe) ~17-19 August: QEMU/KVM strand at LinuxCon NA, Vancouver
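
The report does not give the details of the usermode seek fix, so the following is only a sketch of the general class of problem, under the assumption that it involves a 64-bit seek offset passed by a 32-bit guest as two 32-bit halves. If the low half is treated as a signed 32-bit value and widened to 64 bits before being combined, a low word with its top bit set corrupts the high half. All names here are hypothetical and none of this is taken from the actual patch.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def sign_extend32(x):
    """Interpret the low 32 bits of x as a signed value."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x

def offset_sign_extended(high, low):
    """Combine the halves with the low word widened as a signed value."""
    return ((high << 32) | sign_extend32(low)) & MASK64

def offset_zero_extended(high, low):
    """Combine the halves with the low word treated as unsigned."""
    return ((high & 0xFFFFFFFF) << 32) | (low & 0xFFFFFFFF)

if __name__ == "__main__":
    print(hex(offset_sign_extended(0, 0x80000000)))
    print(hex(offset_zero_extended(0, 0x80000000)))
```

The two helpers agree whenever the low word is below 0x80000000, which is why a problem of this kind can stay hidden until a seek lands in the upper half of a 4 GiB window.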

14 years, 4 months

[ACTIVITY] Feb 21 - Feb 25

by Ulrich Weigand

== GDB ==
* Worked with Will Deacon and the Linaro kernel team to make sure HW watchpoint and Versatile Express errata fixes are included in the upcoming Linaro kernel release.
* Committed GDB HW watchpoint patches to mainline, and backport to Linaro GDB. This completes work on the HW watchpoint blueprint.
* Worked on fixing the GDB part of #620611 (Unable to backtrace out of vector page 0xffff0000). Posted (two versions of) mainline patch for discussion.
* Worked on kernel patch for #615974 (Interrupted system call handling).

Mit freundlichen Gruessen / Best Regards
Ulrich Weigand

--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294

14 years, 4 months

[ACTIVITY] Weekly status

by Richard Sandiford

== This week ==
* Looked at the poor code generated for Neon load/store intrinsics. Looked into the history behind the treatment of VFP registers by CANNOT_CHANGE_MODE_CLASS. Peter confirmed that the restrictions apply only to VFPv1. Wrote a patch to improve the code, which partly overlapped with Julian's.
* Looked at how the operations should be represented at the tree level. Experimented with various combinations of tree codes and types to see which felt right. Wrote this up in the message I sent today.

== Next week ==
* More vectorisation.
* Submit some queued patches.
* Maybe some bug fixing. (I see there's a reload bug just waiting to be claimed by a lucky developer.)

On holiday the following week.

Richard

14 years, 4 months

Re: Unavailable for a bit

by Michael Hope

Services at ex.seabright.co.nz are back up. On Tue, Feb 22, 2011 at 10:06 PM, Michael Hope <michael.hope(a)linaro.org> wrote: > Hi there. We've had an earthquake. Family and friends are fine but i'll be > unavailable for a few days. Services on ex.seabright.co.nz are down. I'll > cancel Wednesdays standup call. > > See you soon, > > -- Michael

14 years, 4 months


[ACTIVITY] February 20-24

by Revital1 Eres

Hello,

Implemented a patch for SMS to support targets whose doloop part is not decoupled from the rest of the loop's instructions (SMS currently assumes that it is decoupled). ARM is an example of such a target, where the loop's instructions might use the CC reg, which is also used in the doloop part. Now testing the patch on ARM and other targets that have do-loop.

Thanks,
Revital
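
As an illustration only, here is a minimal sketch of the kind of loop the report describes: a counted loop whose body also performs a comparison. The function name count_above is hypothetical and is not taken from the patch or its testsuite. Whether GCC actually uses a doloop pattern for this loop, or modulo-schedules it, depends on the compiler version and options.

```c
#include <stddef.h>

/* Hypothetical example, not taken from the patch or its testsuite.
   The trip count is known on entry, so the loop control can be
   expressed as a decrement-and-branch ("doloop") sequence.  On ARM
   that sequence sets and tests the condition codes, and the comparison
   in the body may want the same CC register.  */
int count_above(const int *v, size_t n, int limit)
{
    int count = 0;
    for (size_t i = 0; i < n; i++) {
        if (v[i] > limit)   /* comparison inside the loop body */
            count++;
    }
    return count;
}
```

The point of the sketch is only that the loop-control part and the loop body can compete for the condition codes, which is the coupling the patch has to handle.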

14 years, 4 months


[ACTIVITY] February 20-24

by Ira Rosen

Hi,

* vectorizer cost model
  - implemented builtin_vectorization_cost for NEON
  - added register spilling considerations to the cost model
  - started testing/tuning on EEMBC Telecom and DenBench (for now I have only two examples for spilling: fdct_int32 mp4encode that shouldn't get vectorized and viterbi that should)
* measured vectorization impact on Telecom autcor
  - it's about 5x (initially I got run time segfault, but the bug is already fixed on GCC trunk, I'll have to check gcc-linaro-4.5 as well)
* NEON-vs.non-NEON degradation
  - started to look at aes. There are 6 loops that get vectorized with 4.6 (due to this patch http://gcc.gnu.org/ml/gcc-patches/2010-05/msg01927.html that allows cond_expr in number of loop iterations expressions) and vzip/vuzp patch, but not with gcc-linaro-4.5. But it doesn't explain the degradation of course.
  - I don't understand mp4decodepsnr improvement, since I don't see any loops or basic blocks vectorized.

Ira

14 years, 4 months


Weekly Activity report

by Mounir Bsaibes

Please find the compilation of the weekly activity reports at the following link. If I have missed anyone, let me know.
https://wiki.linaro.org/WorkingGroups/ToolChain/ActivityReports/2011-02-18

In your report please try to stick to the following format, to help make the report consistent:

== Topic ==
* item 1
* item 2

See https://wiki.linaro.org/Process/Reporting

Thanks & Regards,
Mounir

14 years, 4 months


Improving the code generated for vld and vst intrinsics

by Richard Sandiford

One of the vectorisation discussions from last year was about the poor code GCC generates for vld{2,3,4}_*() and vst{2,3,4}_*(). It forces the result of the loads onto the stack, then loads the individual pieces from there. It does the same thing in reverse for stores.

I think there are two major problems here:

1. The result of the vld*() is a record type such as:

     typedef struct int16x4x3_t
     {
       int16x4_t val[3];
     } int16x4x3_t;

   Ideally, we'd like one of these structures to be stored in a pseudo register. However, the ARM port currently limits in-register record types to 64 bits, so something this big is always given BLKmode and stored on the stack.

   A simple "fix" for this is to increase MAX_FIXED_MODE_SIZE. That would do the right thing for the structures in arm_neon.h, but wouldn't be safe in general.

2. The vld*() returns values as a single integer (such as EI mode), while uses of the value will typically be in a vector mode such as V4SI. CANNOT_CHANGE_MODE_CLASS doesn't allow direct "mode-punning" between the two in VFP_REGS, so this again forces the punning to be done on the stack. The code in question is:

     /* FPA registers can't do subreg as all values are reformatted to internal
        precision.  VFP registers may only be accessed in the mode they
        were set.  */
     #define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS)   \
       (GET_MODE_SIZE (FROM) != GET_MODE_SIZE (TO)       \
        ? reg_classes_intersect_p (FPA_REGS, (CLASS))    \
          || reg_classes_intersect_p (VFP_REGS, (CLASS)) \
        : 0)

   However, the VFP restriction appears to be specific to VFPv1 -- thanks to Peter for the archaeology -- and isn't a problem for v6+. In that case, removing this restriction is an important optimisation.

I tried the patch below on the following simple testcase:

    #include "arm_neon.h"

    void
    foo (uint16_t *a)
    {
      uint16x4x3_t x, y;

      x = vld3_u16 (a);
      y = vld3_u16 (a + 12);
      x.val[0] = vadd_u16 (x.val[0], y.val[0]);
      x.val[1] = vadd_u16 (x.val[1], y.val[1]);
      x.val[2] = vadd_u16 (x.val[2], y.val[2]);
      vst3_u16 (a, x);
    }

(not necessarily sensible!). Before the patch, -O2 produced:

    sub       sp, sp, #48
    add       r3, r0, #24
    vld3.16   {d16-d18}, [r3]
    vld3.16   {d20-d22}, [r0]
    add       r3, sp, #24
    vstmia    sp, {d20-d22}
    vstmia    r3, {d16-d18}
    fldd      d19, [sp, #8]
    fldd      d16, [sp, #0]
    fldd      d17, [sp, #24]
    fldd      d20, [sp, #32]
    vadd.i16  d18, d16, d17
    vadd.i16  d17, d19, d20
    fldd      d19, [sp, #16]
    fldd      d20, [sp, #40]
    vadd.i16  d16, d19, d20
    fstd      d18, [sp, #0]
    fstd      d17, [sp, #8]
    fstd      d16, [sp, #16]
    vldmia    sp, {d16-d18}
    vst3.16   {d16-d18}, [r0]
    add       sp, sp, #48
    bx        lr

After the patch we get:

    vld3.16   {d24-d26}, [r0]
    add       r3, r0, #24
    vld3.16   {d20-d22}, [r3]
    vmov      q8, q12  @ ti
    vadd.i16  d17, d17, d21
    vadd.i16  d16, d24, d20
    vadd.i16  d18, d26, d22
    vst3.16   {d16-d18}, [r0]
    bx        lr

The VMOV is a bit disappointing, and needs further investigation.

The first hunk fixes (2), and I think is correct. The second hunk hacks (1), and isn't suitable in itself. I'll next try to make arm_neon.h use built-in record types that are explicitly EImode, which should remove the need to change MAX_FIXED_MODE_SIZE.

Richard

Index: gcc/gcc/config/arm/arm.h
===================================================================
--- gcc.orig/gcc/config/arm/arm.h
+++ gcc/gcc/config/arm/arm.h
@@ -1171,10 +1171,12 @@ enum reg_class
 /* FPA registers can't do subreg as all values are reformatted to internal
    precision.  VFP registers may only be accessed in the mode they
    were set.  */
-#define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS)	\
-  (GET_MODE_SIZE (FROM) != GET_MODE_SIZE (TO)		\
-   ? reg_classes_intersect_p (FPA_REGS, (CLASS))	\
-     || reg_classes_intersect_p (VFP_REGS, (CLASS))	\
+#define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS)		\
+  (GET_MODE_SIZE (FROM) != GET_MODE_SIZE (TO)			\
+   ? (reg_classes_intersect_p (FPA_REGS, (CLASS))		\
+      || (TARGET_VFP						\
+	  && reg_classes_intersect_p (VFP_REGS, (CLASS))	\
+	  && arm_fpu_desc->rev == 1))				\
    : 0)
 
 /* The class value for index registers, and the one for base regs.  */
@@ -2458,4 +2460,6 @@ enum arm_builtins
    instruction.  */
 #define MAX_LDM_STM_OPS 4
 
+#define MAX_FIXED_MODE_SIZE GET_MODE_BITSIZE (XImode)
+
 #endif /* ! GCC_ARM_H */
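
For readers without NEON hardware to hand, the testcase's behaviour can be sketched in portable scalar C. This is only a model under stated assumptions: the model_* types and functions below are hypothetical stand-ins, not the arm_neon.h definitions, and they assume the usual de-interleaving behaviour of vld3/vst3 (element 3*i+k goes to lane i of val[k]).

```c
#include <stdint.h>

/* Hypothetical scalar stand-ins for the arm_neon.h types; the real
   definitions use vector types.  */
typedef struct { uint16_t lane[4]; } model_u16x4;
typedef struct { model_u16x4 val[3]; } model_u16x4x3;

/* Model of vld3_u16: read 12 consecutive elements and de-interleave
   them, so that val[k].lane[i] holds p[3 * i + k].  */
static model_u16x4x3 model_vld3_u16(const uint16_t *p)
{
    model_u16x4x3 r;
    for (int i = 0; i < 4; i++)
        for (int k = 0; k < 3; k++)
            r.val[k].lane[i] = p[3 * i + k];
    return r;
}

/* Model of vadd_u16: lane-wise addition, wrapping modulo 2^16.  */
static model_u16x4 model_vadd_u16(model_u16x4 a, model_u16x4 b)
{
    model_u16x4 r;
    for (int i = 0; i < 4; i++)
        r.lane[i] = (uint16_t)(a.lane[i] + b.lane[i]);
    return r;
}

/* Model of vst3_u16: re-interleave the three vectors and store
   12 consecutive elements.  */
static void model_vst3_u16(uint16_t *p, model_u16x4x3 v)
{
    for (int i = 0; i < 4; i++)
        for (int k = 0; k < 3; k++)
            p[3 * i + k] = v.val[k].lane[i];
}

/* Same shape as foo() in the testcase: a[j] += a[j + 12] for j < 12.  */
void model_foo(uint16_t *a)
{
    model_u16x4x3 x = model_vld3_u16(a);
    model_u16x4x3 y = model_vld3_u16(a + 12);
    for (int k = 0; k < 3; k++)
        x.val[k] = model_vadd_u16(x.val[k], y.val[k]);
    model_vst3_u16(a, x);
}
```

In this model the three-vector structure is passed and returned by value, which is the record type discussed in problem (1): three 64-bit vectors, so larger than the 64-bit limit the message mentions for in-register record types.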

14 years, 4 months


Unavailable for a bit

by Michael Hope

Hi there. We've had an earthquake. Family and friends are fine but I'll be unavailable for a few days. Services on ex.seabright.co.nz are down. I'll cancel Wednesday's standup call.

See you soon,

-- Michael

14 years, 4 months


[ACTIVITY] Feb 14 - Feb 17

by Ulrich Weigand

== GDB ==

* Working with Will Deacon, identified the root cause of GDB problems running on Versatile Express in SMP mode, and verified that the errata workaround fixes the problem
* Finished testing GDB HW watchpoints patch on vexpress, submitted complete patch set for mainline inclusion
* Reviewed Yao's mainline patch to enable displaced stepping in Thumb mode

Best Regards

Ulrich Weigand

--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Chairman of the Supervisory Board: Martin Jetter | Management: Dirk Wittkopp
Registered office: Böblingen | Court of registration: Amtsgericht Stuttgart, HRB 243294

14 years, 4 months


[ACTIVITY] Feb 14 -- Feb 20

by Chung-Lin Tang

== Last week ==

* PR46178, PR46002: both upstream issues related to the priority coloring mode of IRA. Both patches submitted, the first already approved and committed. Vladimir M. did mention that the priority algorithm would be removed once his newer "cover class-less" patches go in during stage1. Anyways, I got more familiar with IRA during the process, and the patches will still be applicable to 4.5/4.6.
* PR43872: incorrectly aligned VLAs under ARM. This turned out to be a one-liner fix. Submitted upstream, awaiting approval.
* Discussed on email/IRC with Revital Eres on SMS and ARM doloop pattern issues.
* Launchpad #721021: Linaro GCC ICE under -mtune=xscale. Investigated a bit; did not see the ICE immediately, but GCC went into an infinite loop (Khem Raj, the reporter, says it runs for a while then ICEs).
* Coremark ARMv5TE vs ARMv7-A performance regression: reproduced consistently using our own Tegra boards. Investigated and seem to have found something, will post more detailed findings later.

== This week ==

* Coremark investigation.
* More GCC issues.
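
The report above does not include a testcase for the VLA alignment issue (PR43872), so the following is only a sketch of how such a problem could be checked at run time. The function name vla_is_aligned and the choice of int64_t as the element type are assumptions made for illustration, not details from the PR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical check, not the PR's testcase: returns 1 if a
   variable-length array of n 64-bit elements starts at an address
   aligned as its element type requires, 0 otherwise.  */
int vla_is_aligned(size_t n)
{
    assert(n > 0);              /* a zero-length VLA is undefined behaviour */
    volatile int64_t vla[n];    /* volatile keeps the array from being optimised away */
    vla[0] = 0;
    return (uintptr_t)vla % _Alignof(int64_t) == 0;
}
```

With a correct compiler the function should return 1 for every positive n. A result of 0 would indicate a misaligned VLA of the kind the report describes.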

14 years, 4 months


Continuous build results

by Michael Hope

Hi there. I've created a new mailing list for the automated build results of Linaro GCC, Linaro GDB, and (once-weekly) FSF 4.5 and 4.6. You can subscribe by going to:
 https://launchpad.net/~linaro-toolchain-builds

I've subscribed everyone in the toolchain working group. If you'd like to get these emails, go to:
 https://launchpad.net/people/+me/+editemails

and set 'Linaro Toolchain Builds' to 'Preferred address'. Any build failures will also go to linaro-toolchain(a)l.linaro.org.

The archive is here:
 https://lists.launchpad.net/linaro-toolchain-builds/

You can also track the crude state of the build hosts here:
 http://ex.seabright.co.nz/helpers/scheduler

and see status updates scroll by on #linaro-cbuild on Freenode.

This was kicked off due to the problems with the 2011.01 release. The follow-up is tracked here:
 https://blueprints.launchpad.net/gcc-linaro/+spec/incident-followup-1

-- Michael

14 years, 4 months

