- linaro-toolchain - lists.linaro.org

by Revital1 Eres

Hello, I experimented with the aes benchmark from DENbench. I am continuing my experiments with SMS, which include re-implementing an old patch to insert reg-moves in free slots rather than greedily before the definition, as is done in the current implementation. Thanks, Revital

15 years, 2 months


[ACTIVITY] March 13-17

by Ira Rosen

Hi, * submitted store sinking patch to mainline * started testing auto-detection of vector size patch * DENBench - some benchmarks are still unstable, I am looking into stable regressions, adjusting and fixing the cost model for them Next week: Sunday and Monday - holidays Ira

15 years, 2 months


Versatile Express write-up

by Michael Hope

Here's an end-of-cycle write-up for Versatile Express support in QEMU: https://wiki.linaro.org/WorkingGroups/ToolChain/Outputs/QEMUVersatileExpress Most of it is taken from Peter's page: https://wiki.linaro.org/PeterMaydell/QemuVersatileExpress which is the place to go if you want the current state and more detail on the steps involved. While writing this up I had a seamless experience from the first linaro-image-create until seeing the alpha3 greeter come up and wobbling the mouse around. It was awesome. Some ideas for other write-ups are at: https://wiki.linaro.org/WorkingGroups/ToolChain/Outputs -- Michael

15 years, 2 months


RealView PBX write-up

by Michael Hope

Dave did an investigation earlier in the year into Cortex-A9 and RealView PBX support in QEMU. The write-up is available here: https://wiki.linaro.org/WorkingGroups/ToolChain/Outputs/QEMURealViewPBX Dave and Peter: could you please review it? I've now closed out the blueprint. I'd like to do similar reports on other outputs and will attack vexpress next. -- Michael

15 years, 2 months


Work-item tracking for the "gcc-linaro-tracking" LP project

by Ulrich Weigand

Hi Michael, Andrew, Mounir just pointed out that our non-Ubuntu LP projects (like gcc-linaro, gdb-linaro etc.) are now also included in the LP work-item tracking statistics (http://status.linaro.org/linaro-toolchain-wg.html). This didn't happen in the past due to a Launchpad issue that has now been fixed. This seems to be working out nicely, except for one issue: what about the gcc-linaro-tracking project? I have a couple of bugs that are fixed in Linaro GCC, and are also fixed in mainline GCC, but they still show up as an "in-progress" work-item in the status tracker (there are a whole bunch more of those assigned to Andrew as well). The reason for this is the LP records have an associated gcc-linaro-toolchain project entry, and this is set to "Fix Committed", but not "Fix Released" ... probably because GCC 4.6.0 is not yet released? Now, on the one hand it does make sense to include the -tracking project in the work-item statistics, because they *do* reflect important tasks: namely, to make sure that the changes indeed land in the upstream repository. However, having them all show up as "in progress" until the community makes a new GCC release does not seem very helpful: this is not in our control, and our work is in fact done once the patch is committed upstream. Therefore my suggestion: we should immediately mark -tracking bugs as "Fix Released" (not "Fix Committed"), as soon as the corresponding patch is committed upstream (and thus our work on the problem is completed). Thoughts? Does this make sense? Will this mess up any of the other purposes for which we currently use the -tracking project? Mit freundlichen Gruessen / Best Regards Ulrich Weigand -- Dr. Ulrich Weigand | Phone: +49-7031/16-3727 STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E. 
IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294

15 years, 2 months


cortex-strings benchmarks

by Michael Hope

Hi Dave. I had a little play with cortex-strings and did some benchmarks on my Tegra 2. Images are attached. I've added two scripts to cortex-strings:

 * scripts/bench-all.sh runs all the routines on all variants and records them
 * scripts/plot.py plots the results from above

plot.py corrects for the benchmark overhead by doing a linear fit to the null 'bounce' results and subtracting this fit. You should be able to do an autogen; configure; make; bash scripts/bench-all.sh | tee log.txt; python scripts/plot.py log.txt. I'm sure you have your own favourite tools though. The string routines look good. Lumpy in funny ways though... -- Michael
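
The overhead correction described above can be sketched roughly as follows. This is only an illustration of the idea (fit a line to the null 'bounce' timings, then subtract the fitted value from each measurement); it is not the code in scripts/plot.py, and the function names, the choice of block size as the x axis, and the sample numbers are all assumptions.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def subtract_overhead(sizes, times, bounce_sizes, bounce_times):
    """Remove the fitted 'bounce' (null routine) cost from each timing."""
    slope, intercept = linear_fit(bounce_sizes, bounce_times)
    return [t - (slope * s + intercept) for s, t in zip(sizes, times)]


# Hypothetical data: block size in bytes and elapsed seconds.
bounce_sizes = [8, 16, 32, 64]
bounce_times = [1.0, 1.0, 1.0, 1.0]   # flat harness overhead of 1.0 s
routine_times = [1.5, 2.0, 3.0, 5.0]  # raw timings of some string routine
corrected = subtract_overhead(bounce_sizes, routine_times,
                              bounce_sizes, bounce_times)
```

Fitting a line instead of subtracting the raw bounce samples point by point smooths out noise in the null measurements, which is presumably why a fit is used.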

15 years, 2 months


Fwd: Representing interleaving and lane load/stores at the tree level

by Richard Sandiford

[Sorry, forgot to CC: the list]

Hi Ira,

Thanks for the feedback.

On 6 March 2011 09:20, Ira Rosen <IRAR(a)il.ibm.com> wrote:
> > So how about the following functions? (Forgive the pascally syntax.)
> >
> > __builtin_load_lanes (REF : array N*M of X)
> >   returns array N of vector M of X
> >   maps to vldN
> >   in practice, the result would be used in assignments of the form:
> >     vectorX = ARRAY_REF <result, X>
> >
> > __builtin_store_lanes (VECTORS : array N of vector M of X)
> >   returns array N*M of X
> >   maps to vstN
> >   in practice, the argument would be populated by assignments of the form:
> >     vectorX = ARRAY_REF <result, X>
> >
> > __builtin_load_lane (REF : array N of X,
> >                      VECTORS : array N of vector M of X,
> >                      LANE : integer)
> >   returns array N of vector M of X
> >   maps to vldN_lane
> >
> > __builtin_store_lane (VECTORS : array N of vector M of X,
> >                       LANE : integer)
> >   returns array N of X
> >   maps to vstN_lane
> >
>
> How do you distinguish between "multiple structures" and "single structure
> to all lanes"?

Sorry, I'm not sure I understand the question. Could you give a couple of examples? The idea is that the arrays above really are array types, regardless of the actual type of the thing we're accessing (which might be a larger array than the bounds above say, or which might be an array of structures or a structure of arrays). That should be OK because arrays alias their elements.

Richard

15 years, 2 months


Linaro GDB patch for natty

by Ulrich Weigand

Hi Matthias, in last week's meeting you raised the question what, if any, code from the Linaro GDB repository could be useful for inclusion into the natty GDB package. I've now reviewed the contents of the repository, and my suggestion would be to use everything in Linaro GDB 7.2, except for this commit (which changes the branding to "Linaro GDB"): revno: 32969 committer: Ulrich Weigand <uweigand(a)de.ibm.com> branch nick: 7.2 timestamp: Wed 2010-09-22 19:18:38 +0200 message: 2010-09-22 Ulrich Weigand <uweigand(a)de.ibm.com> * src-release: Support gdb-linaro packages. gdb/ * version.in: Set to Linaro GDB version number. * configure.ac (PKGVERSION, BUGURL): Refer to Linaro. * configure: Regenerate. gdb/gdbserver/ * configure.ac (PKGVERSION, BUGURL): Refer to Linaro. * configure: Regenerate. gdb/doc/ * configure.ac (PKGVERSION, BUGURL): Refer to Linaro. * configure: Regenerate. (Instead, the branding ought to be set as appropriate for the Ubuntu package. Maybe with an additional reference to Linaro, just as with GCC?) I've created a snapshot of the Linaro GDB 7.2 branch using the command bzr diff --prefix a/:b/ -r32965.. and then manually removed changes to src-release gdb/version.in gdb/configure.ac gdb/configure gdb/gdbserver/configure.ac gdb/gdbserver/configure gdb/doc/configure.ac gdb/doc/configure I've left in the new file ChangeLog.linaro for documentation purposes, but if you prefer this could of course be removed as well. The resulting patch is appended here. (Note that I'd recommend to continue updating the patch from Linaro GDB as further changes make it in.) (See attached file: linaro-gdb.patch) I've then added the patch to the natty GDB package. Since it touches a completely distinct set of files compared to the existing list of patches in the package, it can be added to the series file at any arbitrary point. 
I've built the resulting compiler on i386, arm, and ppc64, and it strictly improved the test results on all three platforms:

i386                             without patch   with patch
# of expected passes                     16161        16331
# of unexpected failures                   114           24
# of expected failures                      72           72
# of untested testcases                      9            9
# of unresolved testcases                    1            1
# of unsupported tests                      69           69

Fixed test case failures are from:
  gdb.base/break-interp.exp
  gdb.base/foll-fork.exp
  gdb.base/printcmds.exp
(These are just test suite cleanups, no actual code changes.)

ppc                              without patch   with patch
# of expected passes                     15350        15350
# of unexpected failures                    74           55
# of expected failures                      53           53
# of untested testcases                     15           15
# of unresolved testcases                    1            1
# of unsupported tests                      63           63

Fixed test case failures are from:
  gdb.base/printcmds.exp
  gdb.threads/local-watch-wrong-thread.exp
  gdb.threads/watchthreads.exp
(These are just test suite cleanups, no actual code changes.)
arm                              without patch   with patch
# of expected passes                     15343        15686
# of unexpected failures                   270           46
# of unexpected successes                    1            3
# of expected failures                      65           63
# of untested testcases                     11           11
# of unresolved testcases                    2            1
# of unsupported tests                      70           69

Fixed test case failures are from:
  gdb.base/break-interp.exp
  gdb.base/corefile.exp
  gdb.base/foll-fork.exp
  gdb.base/gcore.exp
  gdb.base/gdb1555.exp
  gdb.base/pr11022.exp
  gdb.base/printcmds.exp
  gdb.base/recurse.exp
  gdb.base/relativedebug.exp
  gdb.base/step-test.exp
  gdb.base/watch-cond.exp
  gdb.base/watch-read.exp
  gdb.base/watch_thread_num.exp
  gdb.base/watch-vfork.exp
  gdb.gdb/selftest.exp
  gdb.mi/gdb792.exp
  gdb.mi/mi2-syn-frame.exp
  gdb.mi/mi2-var-display.exp
  gdb.mi/mi2-watch.exp
  gdb.mi/mi-syn-frame.exp
  gdb.mi/mi-var-display.exp
  gdb.mi/mi-watch.exp
  gdb.pie/corefile.exp
  gdb.server/ext-attach.exp
  gdb.threads/attachstop-mt.exp
  gdb.threads/attach-stopped.exp
  gdb.threads/linux-dp.exp
  gdb.threads/local-watch-wrong-thread.exp
  gdb.threads/pthread_cond_wait.exp
(This represents much of the bug fix work that went into Linaro GDB.)

Let me know if there's any further information you need, or anything else I can do to help get the Linaro changes into natty GDB.

Mit freundlichen Gruessen / Best Regards Ulrich Weigand -- Dr. Ulrich Weigand | Phone: +49-7031/16-3727 STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E. IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294

15 years, 2 months


[ACTIVITY] 7th - 13th March

by Andrew Stubbs

Merged fixes for several bugs into Linaro GCC 4.5, both from Linaro (Richard, Matthias and Ramana) and from CS (the shrink wrap problems). Continued working on benchmarking the patches I've merged to 4.6. Spent quite some time trying to figure out why EEMBC and the Spec2K weren't working properly. I've got this sorted now. Confirmed that the patch to discourage NEON use for integer operations is still profitable on Cortex-A8. Posted the patch upstream. Merged upstream GCC 4.6 into Linaro GCC 4.6. Booked travel to Budapest for Linaro @ UDS. Followed up on Ramana's questions about the RVCT interoperability patch. Paul Brook helped explain what it was about, and pointed me at the proper section in the proper ARM manual. Continued forward porting patches to 4.6. Mostly I need to convince myself that they still do something useful. I have posted one new patch upstream: the "Discourage A8 NEON" patch.

* Future Absence

Away Wednesday 16th to Friday 18th. Away Monday 28th to Friday 1st April.

----

Upstream patches requiring review:
* Thumb2 constants: http://gcc.gnu.org/ml/gcc-patches/2010-12/msg00652.html
* ARM EABI half-precision functions http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00874.html
* ARM Thumb2 Spill Likely tweak http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00880.html
* NEON scheduling patch http://gcc.gnu.org/ml/gcc-patches/2011-02/msg01431.html
* RVCT Interoperability patch http://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg00059.html
* Discourage NEON on A8 http://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg00576.html

15 years, 2 months


[ACTIVITY] Mar.07 -- Mar.13

by Chung-Lin Tang

== Last week ==
* Working on Coremark ARMv6 regressions. Identified a major cause: RTL ifcvt fails on one of the crc routines because the combine pass fails to optimize a particular sequence, which makes the if-conversion estimates give up on conditional execution (too many insns). The combine pass failed on ARMv6 and above due to the existence of true zero_extend insns. On ARMv5, the use of two shifts actually allowed combine to reduce the shifts one by one, thus producing better code. On ARMv6, combine produced a (xor (and ...) <mask>) which did not match any insn. Analyzed and sent a patch upstream which should work on such XOR cases. Patch is due for upstream commit for 4.7-stage1. (http://gcc.gnu.org/ml/gcc-patches/2011-03/msg00609.html)
* Another situation of un-optimized uxth insns still exists; I am trying to solve this with another combine patch that I am currently testing, and will send it upstream later.

== This week ==
* Verify the improvements the above patches should have on Coremark for ARMv6/v7.
* Work on sending them to Linaro and SG++ branches.
* Other bug issues.
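
For readers unfamiliar with the two zero-extension forms mentioned above, the following small model shows that the ARMv5-style pair of shifts and the ARMv6-style single uxth compute the same 16-bit zero extension on a 32-bit register. It is only a sketch: the function names are made up, and crc_step is a generic textbook CRC-16 bit step used as a stand-in for "a crc routine with a conditional XOR", not Coremark's source or the sequence the patch actually targets.

```python
MASK32 = 0xFFFFFFFF  # model a 32-bit register


def zext16_shifts(x):
    """ARMv5-style zero extension: shift left by 16, then logical right by 16."""
    return ((x << 16) & MASK32) >> 16


def zext16_uxth(x):
    """ARMv6-style zero extension, as a single uxth would compute it."""
    return x & 0xFFFF


def crc_step(crc, data, poly=0xA001):
    """One bit of a reflected CRC-16 update (generic form, hypothetical stand-in)."""
    crc = zext16_uxth(crc)
    bit = (crc ^ data) & 1
    crc >>= 1
    if bit:          # the conditional XOR that if-conversion would like to predicate
        crc ^= poly
    return crc
```

Because the two forms are equivalent, which one the compiler sees only affects how easily later passes can simplify the surrounding expression, not the result.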

15 years, 2 months


[ACTIVITY] Mar 07 - Mar 11

by Ulrich Weigand

== GDB ==
* Ongoing work on glibc patch to add ARM unwind tables to system call stubs; ran into design problems that look difficult to fix.
* As an alternative, started work on a GDB patch to recognize glibc system call assembler stubs via code-scanning; this should allow unwinding in the absence of debug info for current libc code.
* Analyzed bug #728216 (GDB fails to get a valid backtrace while debugging a Webkit SIGSEGV) and resolved as invalid; the fault occurs within JIT-generated code where unwinding is impossible.

== Misc ==
* Made travel arrangements for Linaro Summit in Budapest

Mit freundlichen Gruessen / Best Regards Ulrich Weigand -- Dr. Ulrich Weigand | Phone: +49-7031/16-3727 STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E. IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294

15 years, 2 months


[ACTIVITY] report week 10

by Peter Maydell

RAG:
Red:
Amber:
Green: another qemu-linaro release out the door on time

Current Milestones:
                              | Planned    | Estimate   | Actual     |
qemu-linaro 2011-03           | 2011-03-08 | 2011-03-08 | 2011-03-08 |

Historical Milestones:
finish virtio-system          | 2010-08-27 | postponed  |            |
finish testing PCI patches    | 2010-10-01 | 2010-10-22 | 2010-10-18 |
successful ARM qemu pull req  | 2010-12-16 | 2010-12-16 | 2010-12-16 |
finish qemu-cont-integration  | 2011-01-25 | 2011-01-25 | handed off |
first qemu-linaro release     | 2011-02-08 | 2011-02-08 | 2011-02-08 |

== maintain-beagle-models ==
* released qemu-linaro 2011-03
* had to do a 2011-03-1 reroll of the tarball on day of release to fix a "versatilepb model crashes on startup" bug found at last minute
* Paul Larson is working on having automated test image boots on qemu built from git, so we can catch this much earlier in the cycle

== merge-correctness-fixes ==
* added support to risu for testing of load and store instructions
* used this to test a patch which cleans up Thumb load/store decode and makes us UNDEF in the right places
* wrote/submitted patch to fix GE bits for signed modulo arithmetic
* wrote/submitted patch to get SMUAD/SMLAD Q bit right in an edge case
* started on a patchset which will fix various minor qemu Neon bugs detected by test programs from the valgrind source tree

== other ==
* meetings: toolchain, standup, pdsw-tools

Current qemu patch status is tracked here: https://wiki.linaro.org/PeterMaydell/QemuPatchStatus

Absences:
17/18 March: QEMU Users Forum, Grenoble
Holiday: 22 Apr - 2 May
9-13 May: UDS, Budapest
(maybe) ~17-19 August: QEMU/KVM strand at LinuxCon NA, Vancouver

15 years, 2 months


[ACTIVITY] weekly status

by Ken Werner

Hi, == libunwind == * the patches posted last week are now upstream * continued to study the Exception Handling ABI for the ARM Architecture * looked into the structure of libunwind (lib interdependencies) * documented at: https://wiki.linaro.org/KenWerner/Sandbox/libunwind * The work on the local unwinding appears to be quite complete. If the generic unwind model is used, the code assumes the GCC personality routine. We should either check the name of the symbol (which may be difficult) or just call the pers function. I'm in contact with Zach on this. Regards Ken

15 years, 2 months


Bad code generation due to shrink-wrap optimisation

by Michael Hope

LP: #731665 is a silent bad code generation bug at least on functions which are empty except for inline assembly: https://bugs.launchpad.net/ubuntu/+source/gcc-4.5/+bug/731665 It was introduced in the shrink-wrap patch and is due to using an uninitialised variable. Andrew, can you please address this urgently either in Linaro or CSL. -- Michael

15 years, 2 months


[ACTIVITY] 2011-03-10

by David Gilbert

== hard-float ==
* Updated the libffi variadic patch and sent it to the ffi mailing list.

== String routines ==
* Got a big endian build environment going
* Patched up memchr and strlen for big endian; this turned out to be a very small change in the end. Tested it on qemu-armeb; note that it didn't work on an older version but did on a newer one, so I'll assume the newer one is correct.
* Fixed a couple of build issues in the cortex strings test harness

== Other ==
* Kicked off a SPEC2006 train run on canis using the 2011.03 compilers

I'm on holiday tomorrow (Friday) and Monday.

Dave

15 years, 2 months


[ACTIVITY] March 6-10

by Revital1 Eres

Hello, * Sent the patch to support targets whose doloop part is not decoupled from the rest of the loop's instructions (as is the case for ARM) to @gcc-patches: http://gcc.gnu.org/ml/gcc-patches/2011-03/msg00350.html * Continued looking into DENbench benchmarks. Thanks, Revital

15 years, 2 months


[ACTIVITY] March 6-10

by Ira Rosen

Hi,

* continued working on cost model tuning. I don't see much difference running EEMBC DenBench with and without vectorization enabled (and, therefore, also with and without the cost model). Also, I have to say that the results are not stable, and I sometimes get a 10% difference just running the same executable two times in a row.
* the only benchmark where I see a consistent degradation (5%) with vectorization is DenBench aes, both with GCC trunk and gcc-linaro 4.5. I found one of the responsible loops; if it is not vectorized I see only 1.8% degradation. The problem there is that the loop bound is unknown at compile time, so the vectorizer attempts to vectorize the loop using runtime guards to verify that there are enough iterations to vectorize. The actual number of iterations is 4, so the scalar version of the loop is chosen at run time, but I guess the guards cause the degradation. I'll continue looking into this next week.
* prepared the conditional-store-sink patch (one of the patches that helps to vectorize Telecom Viterbi) for submission to gcc-patches.

Ira
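
The runtime-guard situation described above can be pictured with a small model of a versioned loop. This is a simplified sketch, not GCC's generated code: the threshold, the vector width and the function name are assumptions, and real guards may also cover aliasing and alignment.

```python
VECTOR_WIDTH = 4          # elements per "vector" operation (assumed)
MIN_PROFITABLE_ITERS = 8  # hypothetical cost-model threshold


def add_arrays(a, b, n):
    """Model of a versioned loop: guard first, then vector or scalar path."""
    out = [0] * n
    path = "scalar"
    i = 0
    # Runtime guard: evaluated on every call, even when the scalar path wins.
    if n >= MIN_PROFITABLE_ITERS:
        path = "vector"
        while i + VECTOR_WIDTH <= n:
            # One "vector" operation handles VECTOR_WIDTH elements at once.
            out[i:i + VECTOR_WIDTH] = [x + y for x, y in
                                       zip(a[i:i + VECTOR_WIDTH],
                                           b[i:i + VECTOR_WIDTH])]
            i += VECTOR_WIDTH
    # Scalar loop: either the whole loop or the remainder after the vector part.
    while i < n:
        out[i] = a[i] + b[i]
        i += 1
    return out, path
```

With only 4 iterations, as in the aes loop, a call pays for the guard and then runs the scalar loop anyway, which matches the suspicion that the guards themselves account for the slowdown.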

15 years, 2 months


Linaro QEMU 2011.03-1 released

by Peter Maydell

The Linaro Toolchain Working Group is pleased to announce the release of Linaro QEMU 2011.03-1. Linaro QEMU 2011.03-1 is the second release of qemu-linaro. Based off upstream (trunk) qemu, it includes a number of ARM-focused bug fixes and enhancements. This release includes a model of the ARM Versatile Express platform. This is still experimental but may be of use to people who want a model supporting up to 1GB of RAM with graphics and networking. Instructions for getting started with it are on the wiki: https://wiki.linaro.org/PeterMaydell/QemuVersatileExpress Other interesting changes include: - The OMAP emulation bug which was causing hangs if Linux tried to enable a swapfile is fixed - The OMAP UART model has been improved; this fixes the problem where kernels using the new omap-hsuart serial drivers stopped serial output halfway through boot. - As usual, various minor correctness fixes and other upstream changes Known issues: - The beagle and beaglexm models do not support USB, so there is no keyboard, mouse or networking (#708703) The only change over the shortlived 2011.03-0 is that the last minute bug #731093 has been fixed (versatilepb models would crash on startup.) The source tarball is available at: https://launchpad.net/qemu-linaro/+milestone/2011.03-1 Binary builds of this qemu-linaro release are being prepared and will be available shortly for users of Ubuntu. When ready, Natty packages of qemu-linaro 2011.03-1 will be in the Ubuntu archive. Packages for users of Ubuntu 10.04 LTS and Ubuntu 10.10 will be in the linaro-maintainers tools ppa: https://launchpad.net/~linaro-maintainers/+archive/tools/ More information on Linaro QEMU is available at: https://launchpad.net/qemu-linaro

15 years, 2 months


Getting linaro toolchain binaries

by Dave Martin

Hi all, I've had comments that getting hold of binaries for the linaro toolchain can be tricky for people unfamiliar with the linaro tools. One reason is that we don't release binaries as such -- but a visitor browsing in through http://www.linaro.org/downloads/ won't discover this, and may waste a lot of time trying to understand launchpad etc. before coming to the conclusion that binaries either aren't available or are not easily findable. On the other hand, the cross toolchain packages are likely to be of interest to such visitors, but aren't obviously advertised -- maybe I'm looking in the wrong place, but if so then new visitors to the linaro pages are likely to look in the wrong place too. Would it make sense to explain the situation more prominently so that visitors know what to expect? Something along the lines of "if you use distro x revision y, these cross-compiler packages are available" and "if you need the tools for some other environment, you need to download the source and build it for yourself". Cheers ---Dave

15 years, 2 months


Linaro GDB 7.2 2011-03 released

by Michael Hope

The Linaro Toolchain Working Group is pleased to announce the release of Linaro GDB 7.2. Linaro GDB 7.2 2011.03-0 is the fourth release in the 7.2 series. Based off the latest GDB 7.2, it includes ARM-focused bug fixes and enhancements. Interesting changes include: * Hardware watchpoint support * Backtracing while in the Linux kernel trampoline frame Hardware watchpoints use the support built into ARM devices to watch for changes in values in memory with little performance impact. A 2.6.37 or later kernel is required. The source tarball is available at: https://launchpad.net/gdb-linaro/+milestone/7.2-2011.03-0 More information on Linaro GDB is available at: https://launchpad.net/gdb-linaro -- Michael

15 years, 2 months


[ACTIVITY] 28th February - 5th March

by Andrew Stubbs

Committed Kazu's VFP testcases patch upstream. Merged the latest from upstream GCC 4.6. Merged all the outstanding launchpad merge requests against both GCC 4.5 and 4.6. Spun the 4.5-2011.03-0 and 4.6-2011.03-0 releases. Passed the tarballs to Michael H for final testing. Brought the patch tracker up to date w.r.t. new merges. Posted one of Dan's patches upstream for review. Decided to drop Julian's A8 alignment patch completely. I had previously discovered it provided no measurable benefit on A8, and now I've found the same for A9 (Pandaboard). There's no real improvement for any combination of -falign-* options in EEMBC. Bernd's "Discourage NEON on A8" patch also doesn't show any value in the benchmark results, but I think I've forward ported it wrong, because it should at least change the binary size, and it doesn't. I need to look into this further. I also decided I don't know enough about ARMv7, so I spent some time reading a few chapters from the ARM A.R.M.

----

Upstream patches requiring review:
* Thumb2 constants: http://gcc.gnu.org/ml/gcc-patches/2010-12/msg00652.html
* ARM EABI half-precision functions http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00874.html
* ARM Thumb2 Spill Likely tweak http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00880.html
* NEON scheduling patch http://gcc.gnu.org/ml/gcc-patches/2011-02/msg01431.html
* RVCT Interoperability patch http://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg00059.html

15 years, 2 months


[ACTIVITY] Feb.28 -- Mar.06

by Chung-Lin Tang

Last week: * Launchpad #711819 / PR47719: ARM minipool ICE. Followed up on discussion with Bernd and Ramana. Later posted discussion results on gcc-patches, where Richard Earnshaw took it over with a final fix. * Coremark ARMv7/v6 regressions: mostly pinpointed the exact cases where RTL simplification fails to optimize away ZERO_EXTEND expressions. Still working on how to enhance it. * TW Public Holiday on Feb.28 (Mon), was off for one day. This week: * Try to turn Coremark regression investigation into code form. * Other GCC issues.

15 years, 2 months


Representing interleaving and lane load/stores at the tree level

by Richard Sandiford

I've been spending this week playing around with various representations of the v{ld,st}{1,2,3,4}{,_lane} operations. I agree with Ira that the best representation would be to use built-in functions.

One concern in the original discussion was that the optimisers might move the original MEM_REFs away from the call. I don't think that's a problem though. For loads, we can simply treat the whole of the accessed memory as an array, and pass the array by value. If we do that, then the call would just look like:

    __builtin_load_lanes (MEM_REF[(elem[N] *)ADDR])

(where, despite the C notation, the MEM_REF accesses the whole of elem[N]). It is of course possible in principle for the tree optimisers to replace this MEM_REF with another, equivalent, one, but that's OK semantically. It isn't possible for the optimisers to replace it with something like an SSA name, because arrays can't be stored in gimple registers.

__builtin_load_lanes would then be used like this:

    combined_vectors = __builtin_load_lanes (...);
    vector1 = ...extract first vector from combined_vectors...
    vector2 = ...extract second vector from combined_vectors...
    ....

So combined_vectors only exists for load and extract operations. The question then is: what type should it have? (At this point I'm just talking about types, not modes.) The main possibilities seemed to be:

1. an integer type

   Pros
   * Gimple registers can store integers.

   Cons
   * As Julian points out, GCC doesn't really support integer types that are wider than 2 HOST_WIDE_INTs. It would be good to remove that restriction, but it might be a lot of work, and it isn't something we'd want to take on as part of this project.
   * We're not really using the type as an integer.
   * The combination of the integer type and the __builtin_load_lanes array argument wouldn't be enough to determine the correct load operation. __builtin_load_lanes would need something like a vector count (N => vldN) argument as well.

2. a combined vector type

   Pros
   * Gimple registers can store vectors.

   Cons
   * For vld3, this would mean creating vector types with non-power-of-two vectors. GCC doesn't support those yet, and you get ICEs as soon as you try to use them. (Remember that this is all about types, not modes.) It _might_ be interesting to implement this support, but as above, it would be a lot of work. It also raises some semantic questions, such as: what is the alignment of the new vectors? Which leads to...
   * The alignment of the type would be strange. E.g. suppose we're loading N*2 uint32_ts into N vectors of 2 elements each. The types and alignments would be:

       N=2 uint32x4_t, alignment 16
       N=3 uint32x6_t, alignment 8 (if we follow the convention for modes)
       N=4 uint32x8_t, alignment 32

     We don't need alignments greater than 8 in our intended use; 16 and 32 are overkill.
   * We're not really using the type as a single vector, but as a collection of vectors.
   * The combination of the vector type and the __builtin_load_lanes array argument wouldn't be enough to determine the correct load operation. __builtin_load_lanes would need something like a vector count (N => vldN) argument as well.

3. an array of vectors type

   Pros
   * No support for new GCC features (large integers or non-power-of-two vectors) is needed.
   * The alignment of the type would be taken from the alignment of the individual vectors, which is correct.
   * It accurately reflects how the loaded value is going to be used.
   * The type uniquely identifies the correct load operation, without need for additional arguments. (This is minor.)

   Cons
   * Gimple registers can't store array values.

So I think the only disadvantage of using an array of vectors is that the result can never be a gimple register. But that isn't much of a disadvantage really; the things we care about are the individual vectors, which can of course be treated as gimple registers. I think our tracking of memory values is good enough for combined_vectors to be treated as such (even though, with the back-end changes we talked about earlier, they will actually be stored in RTL registers).

So how about the following functions? (Forgive the pascally syntax.)

    __builtin_load_lanes (REF : array N*M of X)
      returns array N of vector M of X
      maps to vldN
      in practice, the result would be used in assignments of the form:
        vectorX = ARRAY_REF <result, X>

    __builtin_store_lanes (VECTORS : array N of vector M of X)
      returns array N*M of X
      maps to vstN
      in practice, the argument would be populated by assignments of the form:
        vectorX = ARRAY_REF <result, X>

    __builtin_load_lane (REF : array N of X,
                         VECTORS : array N of vector M of X,
                         LANE : integer)
      returns array N of vector M of X
      maps to vldN_lane

    __builtin_store_lane (VECTORS : array N of vector M of X,
                          LANE : integer)
      returns array N of X
      maps to vstN_lane

Note that each operation can be expanded independently. The expansion doesn't rely on preceding or following statements.

I've hacked up the prototype below as a proof of concept. It includes changes to the C parser to allow these functions to be created in the original source code. This is throw-away code though; it would never be submitted. I've also included a simple test case and the output I get from it. The output looks pretty good; there's not even the stray VMOV that I saw with the intrinsics earlier in the week. (Note that if you'd like to try this yourself, you'll need the patch I posted on Monday as well.)

What do you think? Obviously this discussion needs to move to gcc@ at some point, but I wanted to make sure this was vaguely sane first.

Richard
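
To make the data movement behind the four proposed functions concrete, here is a small model using plain Python lists for arrays and vectors. It is only a sketch of the intended semantics as I read them from the descriptions above (de-interleaving for vldN, interleaving for vstN, one element per vector for the _lane forms); the Python function names are hypothetical and nothing here reflects the actual prototype patch.

```python
def load_lanes(ref, n):
    """Model of vldN: split an interleaved array of N*M elements into N vectors."""
    if len(ref) % n:
        raise ValueError("array length must be a multiple of N")
    # Vector k collects elements k, k+N, k+2N, ... (one field of each structure).
    return [ref[k::n] for k in range(n)]


def store_lanes(vectors):
    """Model of vstN: interleave N vectors of M elements into an array of N*M."""
    return [x for group in zip(*vectors) for x in group]


def load_lane(ref, vectors, lane):
    """Model of vldN_lane: write ref[k] into lane `lane` of vector k."""
    result = [list(v) for v in vectors]  # copy, the inputs are left untouched
    for k, value in enumerate(ref):
        result[k][lane] = value
    return result


def store_lane(vectors, lane):
    """Model of vstN_lane: read lane `lane` of each of the N vectors."""
    return [v[lane] for v in vectors]
```

For example, load_lanes on a six-element array with N=3 yields three two-element vectors, and store_lanes applied to that result gives back the original array.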

15 years, 2 months


A question about disabling -gtoggle in bootstrap run

by Revital1 Eres

Hello, I am looking for a way to disable the '-gtoggle' flag in the stage 2 run of bootstrap when configuring ARM with (*). The flag seems to be applied in stage 2 but not in stage 3, which seems to cause a bootstrap failure when testing SMS: in stage 2 SMS fails because of a debug_insn caused by -gtoggle disturbing the do-loop, while in stage 3 SMS succeeds, resulting in different .o files and a bootstrap failure. (*) This is the configure line I used: ../gcc/configure --prefix=/home/eres/mainline/build --enable-checking --enable-languages=c --enable-bootstrap Thanks, Revital

15 years, 2 months


[ACTIVITY] Feb 28 - Mar 04

by Ulrich Weigand

== GDB == * Committed fix for the GDB part of #620611 (Unable to backtrace out of vector page 0xffff0000) to mainline and Linaro GDB 7.2. * Ran into GDB crashes due to memory corruption in tests involving multiple inferiors. Tracked down root cause (using valgrind) to long-standing double free bug in GDB terminal state handling code. Committed fix to mainline and Linaro GDB 7.2. * While using valgrind (see above), ran into problems: * ptrace system call is unsupported on ARM * certain variants of the "SUB from SP" Thumb-2 instruction are not handled by the VEX compiler Fixed both problems locally, and was then able to successfully valgrind GDB on ARM. * Created Linaro GDB 7.2-2011.03-0 release. * Worked on glibc patch to add ARM unwind tables to system call stubs; this will help unwinding in the absence of debug info for libc, and in particular fix #684218 (Failures in gdb.base/call-signal-resume.exp) Mit freundlichen Gruessen / Best Regards Ulrich Weigand -- Dr. Ulrich Weigand | Phone: +49-7031/16-3727 STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E. IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294

15 years, 2 months

