- linaro-toolchain - lists.linaro.org

by Ira Rosen

Hi,

* resubmitted and committed store sink patch to trunk; I'll commit it to gcc-linaro-4.6 next week
* submitted autodetection of vector size patch to gcc-patches; I'll commit it next week
* started testing a patch that makes -mvectorize-with-neon-quad the default
* DenBench: found some more cases where vectorization of strided accesses using vzip/vuzp causes degradation. Since Richard is making a lot of progress with vld/vst, I think it doesn't make sense to spend too much time on vzip/vuzp, and I am going to run DenBench without this patch.

Ira

14 years, 5 months

Re: Cross compiler ITP (armel)

by Goswin von Brederlow

Philipp Kern <trash(a)philkern.de> writes:

> On 2011-03-23, Goswin von Brederlow <goswin-v-b(a)web.de> wrote:
>> Also does the testing transition consider the Built-Using? If I specify
>> 'Built-Using: gcc-4.5 (= 4.5.2-5)' will the package be blocked from
>> entering testing until gcc-4.5 (= 4.5.2-5) has entered and block gcc-4.5
>> (= 4.5.2-5) from being replaced from testing?
>
> It doesn't need to. All we want is compliance on the archive side so that the
> sources are not expired away, as long as that binary is carried in a suite.
> No need to involve britney at that point.
>
> Kind regards
> Philipp Kern

Not quite. For ia32-libs it would be nice if ia32-libs could be blocked from testing as long as the source packages it includes aren't in testing. Currently that is solved by building against testing in the first place. But that is something we can live with.

As a side note, the debian-cd package also needs to consider Built-Using when creating source images. Will the Sources.gz file list multiple entries for a source if multiple versions are relevant?

Kind regards
Goswin

14 years, 5 months

Re: Cross compiler ITP (armel)

by Hector Oron

Hi,

2009/11/2 Mark Hymers <mhy(a)debian.org>:
> On Mon, 02, Nov, 2009 at 12:43:42PM +0000, Philipp Kern spoke thus..
>> Of course it is a sane approach but very special care needs to be taken when
>> releasing to ensure GPL compliance. So what we should get is support in the
>> toolchain to declare against what source package the upload was built to
>> keep that around.
> We haven't implemented that yet for the archive software but it's on the
> TODO list (and not that difficult). None of us have had time to do the
> d-d-a post from the ftpteam meeting yet, but I'll make sure information
> is in there about it.
>
> I'm hoping to the archive-side support done in the next week or so.

Squeeze has already been released; cross toolchains were not released alongside Debian main, but can be found in the Emdebian repository.

Marcin Juszkiewicz has been working on cross compiler packages for armel as part of his work for Linaro, which I am attempting to get into the Debian main archive.

As a result of the work done, linux-armel, binutils-armel and eglibc-armel are merged into a single source package named `cross-toolchain-base'. The package is not optimal, but once we have multiarch support it should be renamed to `binutils-armel' (or a similar name) and use the linux and eglibc libraries and headers provided by multiarch.

Along with this package I also plan to upload `gcc-4.5-cross' (#590465).

At the moment we are targeting one target architecture on two build hosts ('{amd64,i386}->armel'); I am not sure whether it is desirable to support more build hosts. Target architecture support might grow in future, but right now it is not a priority.

Not sure if that is an issue for someone? Comments?

Best regards,
--
Héctor Orón

"Our Sun unleashes tremendous flares expelling hot gas into the Solar System, which one day will disconnect us."
-- Day DVB-T stop working nicely
Video flare: http://antwrp.gsfc.nasa.gov/apod/ap100510.html

14 years, 5 months

[ACTIVITY] Weekly status

by Richard Sandiford

== Last week ==

* Committed STT_GNU_IFUNC changes to binutils.
* Submitted the STT_GNU_IFUNC changes to GLIBC ports. Got feedback on Friday, which I'll deal with this week.
* Worked on the expand and rtl-level parts of the load/store lane representation, with new optabs for each operation. This seems to be working pretty well, but I still need to make some changes to the way the existing intrinsics work.
* Wrote a patch to clean up the way we handle optabs during expand, so that the new optabs mentioned above will need a bit less cut-&-paste. Submitted upstream. Got some positive feedback.
* Committed testcase for PR rtl-optimization/47166 upstream.

== This week ==

* Deal with GLIBC feedback.
* More load/store lanes.

Richard

14 years, 5 months

[ACTIVITY] 21st - 25th March

by Andrew Stubbs

* Linaro GCC

Tested and merged both the latest Linaro merge requests, and various bug fixes to the Shrink Wrap optimization from CS, into Linaro GCC 4.5.

Merged and tested from FSF GCC 4.6.

Richard and Ramana have approved some of my upstream patches! I just need to wait for stage one so I can commit them upstream. I'll commit them internally when I get time to do the final integration test.

Continued benchmarking GCC 4.6 with the patches merged from GCC 4.5. Decided to discard a couple of extra patches since they don't appear to be of any value.

* Other

On leave Wednesday to Friday playing daddy. :)

* Future Absence

Away Monday 28th to Friday 1st April.

----

Upstream patches requiring review:

* Thumb2 constants:
  http://gcc.gnu.org/ml/gcc-patches/2010-12/msg00652.html
* ARM EABI half-precision functions
  http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00874.html
* ARM Thumb2 Spill Likely tweak
  http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00880.html
* NEON scheduling patch
  http://gcc.gnu.org/ml/gcc-patches/2011-02/msg01431.html

14 years, 5 months

Substituting -msoft-float/-mfloat-abi=* in the proper order in spec file

by Loïc Minier

Hey,

I'm trying to extend the *link: specs to pass a different -dynamic-linker depending on the float ABI. But I didn't manage to build a construct which would preserve the order of the flags; if I do something like:

    %{msoft-float:-dynamic-linker V1} %{mfloat-abi=softfp:-dynamic-linker V2}

then I get V2 for "-mfloat-abi=softfp -msoft-float" instead of V1.

In gcc/gcc.c I found some docs on spec file syntax; I see one can use %{S*&T*} and %{S*:X}, but apparently %{S*&T*:X} isn't allowed, so I can't manipulate the value. I tried to use:

    %{msoft-float*:-dynamic-linker V1} %{mfloat-abi=softfp*:-dynamic-linker V2}

but that gives the same effect (the msoft-float flags are grouped together in the original order and put first, then the mfloat-abi=softfp flags are grouped together in the original order and put second).

I didn't manage to get

    %{msoft-float*:%<msoft-float -dynamic-linker V1}

to work; in fact I didn't get suppressions to work.

Any idea?

Thanks!

PS: float-abi=softfp/soft-float are just convenient examples; the actual target is to use a different -dynamic-linker for hard vs soft float-abi

--
Loïc Minier
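
The behaviour asked for above amounts to "the last float-ABI flag on the command line wins". As a minimal sketch of that selection rule, written in plain Python rather than spec syntax (so it does not answer the spec-file question itself): V1 and V2 are the placeholder linker names from the message, and the function name and flag table are hypothetical.

```python
def pick_dynamic_linker(argv, default=None):
    """Return the linker selected by the last float-ABI flag in argv."""
    # Hypothetical mapping for illustration; V1/V2 are the placeholders
    # used in the message, not real dynamic linker paths.
    linker_for_flag = {
        "-msoft-float": "V1",
        "-mfloat-abi=softfp": "V2",
    }
    chosen = default
    for arg in argv:
        if arg in linker_for_flag:
            # A later flag overrides any earlier one, preserving
            # command-line order instead of spec-file order.
            chosen = linker_for_flag[arg]
    return chosen
```

With this rule, "-mfloat-abi=softfp -msoft-float" selects V1 and the reverse order selects V2, which is the ordering the two-clause spec above fails to preserve.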

14 years, 5 months

Trip Report -- 1st QEMU Users Forum, Grenoble, 18 March 2011

by Peter Maydell

I went to the first QEMU Users Forum in Grenoble last week; these are my impressions and a summary of what happened. Sorry if it's a bit TLDR...

== Summary and general observations ==

This was a day long set of talks tacked onto the end of the DATE conference. There were about 40 attendees; the focus of the talks was mostly industrial and academic research QEMU users/hackers (a set of people who use and modify QEMU but who aren't very well represented on the qemu-devel list).

A lot of the talks related to SystemC; at the moment people are rolling their own SystemC<->QEMU bridges. In addition to the usual problems when you try to put two simulation engines together (each of which thinks it should be in control of the world) QEMU doesn't make this easy because it is not very modular and makes the assumption that only one QEMU exists in a process (lots of global variables, no locking, etc).

There was a general perception from attendees that the QEMU "development community" is biased towards KVM rather than TCG. I tend to agree with this, but think this is simply because (a) that's where the bulk of the contributors are and (b) people doing TCG related work don't always appear on the mailing list. (The "quick throwaway prototype" approach often used for research doesn't really mesh well with upstream's desire for solid long-term maintainable code, I guess.)

QEMU could certainly be made more convenient for this group of users: greater modularisation and provision of "just the instruction set simulator" as a pluggable library, for instance. Also the work by STMicroelectronics on tracing/instrumentation plugins looks like it should be useful to reduce the need to hack extra instrumentation directly into QEMU's frontends.

People generally seemed to think the forum was useful, but it hasn't been decided yet whether to repeat it next year, or perhaps to have some sort of joint event with the open-source qemu community.

More detailed notes on each of the talks are below; the proceedings/slides should also appear at http://adt.cs.upb.de/quf within a few weeks. Of particular Linaro/ARM interest are:

* the STMicroelectronics plugin framework so your DLL can get callbacks on interesting events and/or insert tracing or instrumentation into generated code
* Nokia's work on getting useful timing/power type estimates out of QEMU by measuring key events (insn exec, cache miss, TLB miss, etc) and calibrating against real hardware to see how to weight these
* a talk on parallelising QEMU, ie "multicore on multicore"
* speeding up Neon by adding SIMD IR ops and translating to SSE

The forum started with a brief introduction by the organiser, followed by an informal Q&A session with Nathan Froyd from CodeSourcery (...since his laptop with his presentation slides had died on the journey over from the US...)

== Talk 1: QEMU and SystemC ==

M. Monton from GreenSocs presented a couple of approaches to using QEMU with SystemC. "QEMU-SC" is for systems which are mostly QEMU based with one or two SystemC devices -- QEMU is the master. Speed penalty is 8-14% over implementing the device natively. "QBox" makes the SystemC simulation the master, and QEMU is implemented as a TLM2 Initiator; this works for systems which are almost all SystemC and which you just want to add a QEMU core to. Speed penalty 100% (!) although they suspect this is an artifact of the current implementation and could be reduced to more like 25-30%. They'd like to see a unified effort to do SystemC and QEMU integration (you'll note that there are several talks here where the presenters had rolled their own integration). Source available from www.greensocs.com.

== Talk 2: Combined Use of Dynamic Binary Translation and SystemC for Fast and Accurate MPSoc Simulation ==

Description of a system where QEMU is used as the core model in a SystemC simulation of a multiprocessor ARM system.
The SystemC side includes models of caches, write buffers and so on; this looked like quite a low level detailed (high overhead) simulation. They simulate multiple clusters of multiple cores, which is tricky with QEMU because it has a design assumption of only one QEMU per process address space (lots of global variables, no locking, etc); they handle this by saving and restoring globals at SystemC synchronisation points, which sounded rather hacky to me. They get timing information out of their model by annotating the TCG intermediate representation ops with new ops indicating number of cycles used, whether to check for Icache/Dcache hit/miss, and so on. Clearly they've put a lot of work into this. They'd like a standalone, reentrant ISS, basically so it's easier to plug into other frameworks like SystemC.

== Talk 3: QEMU/SystemC Cosimulation at Different Abstraction Levels ==

This talk was about modelling an RTOS in SystemC; I have to say I didn't really understand the motivation for doing this. Rather than running an RTOS under emulation, they have a SystemC component which provides the scheduler/mutex type APIs an RTOS would, and then model RTOS tasks as other SystemC components. Some of these SystemC components embed user-mode QEMU, so you can have a combination of native and target-binary RTOS tasks. They're estimating time usage by annotating QEMU translation blocks (but not doing any accounting for cache effects).

== Talk 4: Timing Aspects in QEMU/SystemC Synchronisation ==

Slightly academic-feeling talk about how to handle the problem of trying to run several separate simulations in parallel and keep their timing in sync. (In particular, QEMU and a SystemC world.) If you just alternate running each simulation there is no problem but it's not making best use of the host CPU. If you run them in parallel you can have the problem that sim A wants to send an event to sim B at time T, but sim B has already run past time T.
He described a couple of possible approaches, but they were all "if you do this you might still hit the problem but there's a tunable parameter to reduce the probability of something going wrong"; also they only actually implemented the simplest one. In some sense this is really all workarounds for the fact that SystemC is being retrofitted/bolted onto the outside of a QEMU simulation.

== Talk 5: Program Instrumentation with QEMU ==

Presentation by STMicroelectronics, about work they'd done adding instrumentation to QEMU so you can use it for execution trace generation, performance analysis, and profiling-driven optimisation when compiling. It's basically a plugin architecture so you can register hooks to be called at various interesting points (eg every time a TB is executed); there are also translation time hooks so plugins can insert extra code into the IR stream. Because it works at the IR level it's CPU-agnostic. They've used this to do real work like optimising/debugging of the Adobe Flash JIT for ARM. They're hoping to be able to submit this upstream.

I liked this; I think it's a reasonably maintainable approach, and it ought to alleviate the need for hacking extra ops directly into QEMU for instrumentation (which is the approach you see in some of the other presentations). In particular it ought to work well with the Nokia work described in the next talk...

== Talk 6: Using QEMU in Timing Estimation for Mobile Software Development ==

Work by Nokia's research division and Aalto university.
This was about getting useful timing estimates out of a QEMU model by adding some instrumentation (instructions executed, cache misses, etc) and then calibrating against real hardware to identify what weightings to apply to each of these (weightings differ for different cores/devices; eg on A8 your estimates are very poor if you don't account for L2 cache misses, but for some other cores TLB misses are more important and adding L2 cache miss instrumentation gives only a small improvement in accuracy.) The cache model is not a proper functional cache model, it's just enough to be able to give cache hit/miss stats. They reckon that three or four key statistics (cache miss, TLB miss, a basic classification of insns into slow or fast) give estimated execution times with about 10% level of inaccuracy; the claim was that this is "feasible for practical usage". Git tree available.

This would be useful in conjunction with the STMicroelectronics instrumentation plugin work; alternatively it might be interesting to do this as a Valgrind plugin, since Valgrind has much more mature support for arbitrary plugins. (Of course as a Valgrind plugin you'd be restricted to running on an ARM host, and you're only measuring one process, not whole-system effects.)

== Talk 7: QEMU in Digital Preservation Strategies ==

A less technical talk from a researcher who's working on the problems of how museums should deal with preserving and conserving "digital artifacts" (OSes, applications, games). There are a lot of reasons why "just run natively" becomes infeasible: media decay, the connector conspiracy, old and dying hardware, APIs and environments becoming unsupported, proprietary file formats and on and on. If you emulate hardware (with QEMU) then you only have to deal with emulating a few (tens of) hardware platforms, rather than hundreds of operating systems or thousands of file formats, so it's the most practical approach. They're working on web interfaces for non-technical users.
Most interesting for the QEMU dev community is that they're effectively building up a large set of regression tests (ie images of old OSes and applications) which they are going to be able to run automatic testing on.

== Talk 8: MARSS-x86: QEMU-based Micro-Architectural and Systems Simulator for x86 Multicore Processors ==

This is about using QEMU for microarchitectural level modelling (branch predictor, load/store unit, etc); their target audience is academic researchers. There's an existing x86 pipeline level simulator (PTLsim) but it has problems: it uses Xen for its system simulation so it's hard to get installed (need a custom kernel on the host!), and it doesn't cope with multicore. So they've basically taken PTLsim's pipeline model and ported it into the QEMU system emulation environment. When enabled it replaces the TCG dynamic translation implementation; since the core state is stored in the same structures it is possible to "fast forward" a simulation running under TCG and then switch to "full microarchitecture simulation" for the interesting parts of a benchmark. They get 200-400KIPS.

== Talk 9: Showing and Debugging Haiku with QEMU ==

Haiku is an x86 OS inspired by BeOS. The speaker talked about how they use QEMU for demos and also for kernel and bootloader debugging.

== Talk 10: PQEMU : A parallel system emulator based on QEMU ==

This was a group from a Taiwan university who were essentially claiming to have solved the "multicore on multicore" problem, so you can run a simulated MPx4 ARM core on a quad-core x86 box and have it actually use all the cores. They had some benchmarking graphs which indicated that you do indeed get ~3.x times speedup over emulated single-core, ie your scaling gain isn't swamped in locking overhead.
However, the presentation concentrated on the locking required for code generation (which is in my opinion the easy part) and I wasn't really convinced that they'd actually solved all the hard problems in getting the whole system to be multithreaded. ("It only crashes once every hundred runs...") Also their work is based on QEMU 0.12, which is now quite old. We should definitely have a look at the source which they hope to make available in a few months.

== Talk 11: PRoot: A Step Forward for QEMU User-Mode ==

STMicroelectronics again, presenting an alternative to the usual "chroot plus binfmt_misc" approach for running target binaries seamlessly under qemu's linux-user mode. It's a wrapper around qemu which uses ptrace to intercept the syscalls qemu makes to the host; in particular it can add the target-directory prefix to all filesystem access syscalls, and can turn an attempt to exec "/bin/ls" into an exec of "qemu-linux-arm /bin/ls". The advantage over chroot is that it's more flexible and doesn't need root access to set up. They didn't give figures for how much overhead the syscall interception adds, though.

== Talk 12: QEMU TCG Enhancements for Speeding up Emulation of SIMD ==

Simple idea -- make emulation of Neon instructions faster by adding some new SIMD IR ops and then implementing them with SSE instructions in the x86 backend. Some basic benchmarking shows that they can be ten times faster this way. Issues:

* what is the best set of "generic" SIMD ops to add to the QEMU IR?
* is making Neon faster the best use of resource for speeding up QEMU overall, or should we be looking at parallelism or other problems first?
* are there nasty edge cases (flags, corner case input values etc) which would be a pain to handle?

Interesting, though, and I think it takes the right general approach (ie not horrifically Neon specific).
My feeling is that for this to go upstream it would need uses in two different QEMU front ends (to demonstrate that the ops are generic) and implementations in at least the x86 backend, plus fallback code so backends need not implement the ops; that's a fair bit of work beyond what they've currently implemented.

== Talk 13: A SysML-based Framework with QEMU-SystemC Code Generation ==

This was the last talk, and the speaker ran through it very fast as we were running out of time. They have a code generator for taking a UML description of a device and turning it into SystemC (for VHDL) and C++ (for a QEMU device) and then cosimulating them for verification.

-- PMM
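
The calibration idea from Talk 6 (count a few key events, then find per-event weights by comparing against times measured on real hardware) can be illustrated with an ordinary least-squares fit. This is only a sketch under assumptions: the talk summary does not say how the weights were actually derived, and the function names and the choice of events below are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations apply to both sides at once.
    m = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: calibration runs too similar")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def calibrate_weights(event_counts, measured_times):
    """Least-squares weights: one row of event counts per calibration run."""
    n = len(event_counts[0])
    # Normal equations (A^T A) w = A^T t for the overdetermined system A w = t.
    ata = [[sum(run[i] * run[j] for run in event_counts) for j in range(n)]
           for i in range(n)]
    att = [sum(run[i] * t for run, t in zip(event_counts, measured_times))
           for i in range(n)]
    return solve_linear_system(ata, att)


def estimate_time(weights, counts):
    """Estimated execution time as a weighted sum of event counts."""
    return sum(w * c for w, c in zip(weights, counts))
```

Each calibration run would supply a row such as (instructions executed, cache misses, TLB misses) together with the time measured on the device; the fitted weights are then specific to that core, which matches the observation that different events dominate on different cores.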

14 years, 5 months

Evaluate Link Time Optimization (LTO) in linaro-gcc 4.5

by Jim Huang

Hello list,

Recently, the Android team has been working on integrating the Linaro toolchain for Android and the NDK. According to the initial benchmark results[1], Linaro GCC is competitive compared to Google's toolchain. In the meantime, we are trying to enable gcc-4.5 specific features such as Graphite and LTO (Link Time Optimization) in order to make the best choice for the Android build system and NDK usage.

However, I encountered a problem with LTO and would like to ask the toolchain WG for help. Assuming the Linaro Toolchain for Android is installed in directory /tmp/android-toolchain-eabi, you can obtain Google's toolchain benchmark suite by git:

  # git clone git://android.git.kernel.org/toolchain/benchmark.git

You have to apply the attached patch in order to make the benchmark suite work[2]. Then, change directory to skia:

  # cd benchmark/skia

And build the skia bench with LTO enabled:

  # ../scripts/bench.py --action=build --toolchain=/tmp/android-toolchain-eabi --add_cflags="-flto -user-linker-plugin"

The build process is then interrupted by gcc:

make -j4 --warn-undefined-variables -f ../scripts/build/main.mk TOOLCHAIN=/tmp/android-toolchain-eabi ADD_CFLAGS="-flto -user-linker-plugin" build
CPP ARM obj/src/core/Sk64.o <= src/src/core/Sk64.cpp
CPP ARM obj/src/core/SkAlphaRuns.o <= src/src/core/SkAlphaRuns.cpp
CPP ARM obj/src/core/SkBitmap.o <= src/src/core/SkBitmap.cpp
CPP ARM obj/src/core/SkBitmapProcShader.o <= src/src/core/SkBitmapProcShader.cpp
CPP ARM obj/src/core/SkBitmapProcState.o <= src/src/core/SkBitmapProcState.cpp
CPP ARM obj/src/core/SkBitmapProcState_matrixProcs.o <= src/src/core/SkBitmapProcState_matrixProcs.cpp
src/src/core/SkBitmapProcShader.cpp: In function 'SkShader::CreateBitmapShader(SkBitmap const&, SkShader::TileMode, SkShader::TileMode, void*, unsigned int)':
src/src/core/SkBitmapProcShader.cpp:243:13: warning: 'color' may be used uninitialized in this function
CPP ARM obj/src/core/SkBitmapSampler.o <= src/src/core/SkBitmapSampler.cpp
src/src/core/SkBitmapProcState_matrixProcs.cpp:530:1: sorry, unimplemented: gimple bytecode streams do not support machine specific builtin functions on this target
...

However, the other bench items (cximage, gcstone, gnugo, mpeg4, webkit, and python) build successfully.

Can anyone give me some hints to resolve the LTO problem? Thanks in advance.

Sincerely,
-jserv

[1] https://wiki.linaro.org/Platform/Android/Toolchain#Reference%20Benchmark
    We use the same toolchain benchmark suite that the Google compiler team uses.
[2] https://wiki.linaro.org/Platform/Android/UpstreamToolchain

14 years, 5 months

[ACTIVITY] Mar.14 -- Mar.20

by Chung-Lin Tang

== Last week ==

* CoreMark ARMv6/v7 regressions: posted another combine patch upstream, which was quickly approved and committed. The XOR simplification one is now approved too, but needs a little more revising of comments before committing.
* The above two patches now bring CoreMark under -march=armv7-a very close to the performance of -march=armv5te. However, a regression where uxtb+cmp cannot be combined into 'ands ... #255' still causes v7 to lose slightly. This should be the final issue to solve...
* Launchpad #736007/GCC Bugzilla PR48183: NEON ICE in emit-rtl.c:immed_double_const() under -g. Posted patch upstream, but it looks like more discussion is needed before we know if this is the "right" way to do it.
* Launchpad #736661, armel FTBFS (G++ ICE in expand_expr_real_1()). Looking at this.
* Pinged a few upstream patch submissions.

== This week ==

* Launchpad #723185/CS issue #9845 now assigned to me, start looking at this.
* Get the XOR patch committed upstream, and the uxtb+cmp issue described above solved.
* Work on other GCC issues.

14 years, 5 months

Tracking tickets as they go by

by Michael Hope

Hi there. I have a custom report on top of the Launchpad tickets that shows how old they are and if they need attention:

http://ex.seabright.co.nz/helpers/tickets/gcc-linaro?group_by=lint

I check this once a day to see how we're doing. It's useful when deciding which bug to attack next.

-- Michael

14 years, 5 months

[ACTIVITY] weekly status

by Ken Werner

== libunwind ==

* Had a few discussions with Uli with regard to unwinding.
* Continued to learn about libunwind internals.
* The .ARM.exidx and .ARM.extab section parser is functional but the integration into libunwind needs to be improved. Currently there are two separate models that hold the information about the current frame. Since they are not synchronized, the behavior of libunwind is quite unexpected to the user.
* I started on eliminating the redundancy by removing the model that was introduced for the extab support. My goal is to have the parser operate on the DWARF model directly. In theory this should also allow mixing DWARF and extab frames.

Regards
Ken

14 years, 5 months

[Activity] March 7 - 18

by Ramana Radhakrishnan

== GCC ==

* Started looking at performance regressions. Setting up builds with EEMBC Denbench and other benchmarks.
* Looked at PR47719 in some detail this week.
* Set up environment on laptop. Fixed PR46788 in 4.6 branch and trunk.
* Discussions regarding armhf, how to maintain Linaro branches, upstreaming patches etc.
* Looked at a case of performance improvements with VFP stores. I think it's because we end up allowing PRE_INC and POST_DEC for floating point mode values, which results in more transfers to and from the integer core registers.
* Off sick on Monday 14th March 2011.

== Misc ==

* Sorted out travel arrangements for LDS. Waiting for visa now.

14 years, 5 months

[ACTIVITY] Mar 14 - Mar 18

by Ulrich Weigand

== GDB ==

* Ongoing work on glibc patch to add ARM unwind tables to system call stubs (bug #684218).
* Implemented initial version of a kernel patch that fixes GDB inferior calls while stopped in a restartable system call (bug #615974); started discussion with kernel folks.
* Implemented new version of patch to fix single-stepping over signal handlers (bug #615978) that addresses review comments; posted to mailing list.
* Verified Linaro GDB patch set can be applied to Ubuntu package.

Mit freundlichen Gruessen / Best Regards

Ulrich Weigand

--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294

14 years, 5 months

[ACTIVITY] report week 11

by Peter Maydell

(sent early this week since I'll be at the conference all Friday)

RAG:
Red:
Amber:
Green:

Current Milestones:
                              | Planned    | Estimate   | Actual     |

Historical Milestones: [trimmed the 2010 ones]
finish qemu-cont-integration  | 2011-01-25 | 2011-01-25 | handed off |
first qemu-linaro release     | 2011-02-08 | 2011-02-08 | 2011-02-08 |
qemu-linaro 2011-03           | 2011-03-08 | 2011-03-08 | 2011-03-08 |

== maintain-beagle-models ==

* wrote most of a patch to properly implement prlimit64 syscall in usermode -- needs checking/testing
* wrote most of a patch to allow boards to specify their min/max/default RAM size rather than having a single qemu-wide default; this would let us specify RAM size on the beagle model (currently hardwired)
* retested the patch I wrote late last year which fixes TCG's locking problems by having interrupting signals/threads just set a flag which we check in every TB (rather than trying to mess with the TCG graph in a totally unsafe and crashprone way). Slowdown: about 3.5% on "boot Linaro nano on vexpress to root prompt and shutdown again". I shall see if I can persuade people that this is a price worth paying for not randomly crashing in thread-heavy code :-)

== merge-correctness-fixes ==

* Wrote/submitted patchset which fixes VRECPS edge case handling (mostly NaN related)
* Wrote/submitted patchset which fixes Neon VLD of single element to all lanes
* Wrote/submitted patch which fixes qemu to work on an ARM host where the host C code has been built in Thumb mode

== other ==

* attended QEMU Users Forum in Grenoble
* meetings: toolchain, standup, pdsw-tools, arch q&a

Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus

Absences:
17/18 March: QEMU Users Forum, Grenoble
Holiday: 22 Apr - 2 May
9-13 May: UDS, Budapest
(maybe) ~17-19 August: QEMU/KVM strand at LinuxCon NA, Vancouver
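
The "set a flag which we check in every TB" approach mentioned under maintain-beagle-models can be pictured roughly as below. This is not QEMU code: the class, the function and the use of Python callables as stand-ins for translation blocks are all hypothetical, and the sketch only shows the shape of the idea, namely that the interrupter never touches the execution structures and the execution loop polls at block boundaries.

```python
import threading


class Cpu:
    """Hypothetical stand-in for per-CPU state."""

    def __init__(self):
        # Set by an interrupting thread or signal handler; never anything else.
        self.exit_request = threading.Event()
        self.blocks_run = 0


def run_blocks(cpu, blocks):
    """Run blocks in order, checking the exit flag before each one."""
    for block in blocks:
        if cpu.exit_request.is_set():
            # Acknowledge the request and leave at a safe point.
            cpu.exit_request.clear()
            return "interrupted"
        block(cpu)
        cpu.blocks_run += 1
    return "finished"
```

The per-block check is the price paid for safety, which is consistent with a small, steady slowdown of the kind reported above.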

14 years, 5 months

[ACTIVITY] 2011-03-17

by David Gilbert

Short week

* libffi patch accepted upstream
* eglibc integration of string routine changes - I have something that works but it's more complex than I'd like (to get it to fall back to the C code on stuff I haven't optimised for).
* Trying a neon memchr; tried a really simple 8-bytes-per-loop version - it's quite slow on both A8 and A9; branching on the result of comparisons done in the neon unit is not simple.
* Porting jam bug 735877 (chromium using d32 float); it was passing vfpv3 rather than using the default when configured without neon.

On holiday tomorrow (Friday).

Dave

14 years, 5 months

[ACTIVITY] March 13-17

by Revital1 Eres

Hello,

Experimented with the aes benchmark from DENBench.

Continued my experiments with SMS, which include re-implementing an old patch to insert reg-moves in free slots rather than greedily before the definition, as is done in the current implementation.

Thanks,
Revital

14 years, 5 months

[ACTIVITY] March 13-17

by Ira Rosen

Hi,

* submitted store sinking patch to mainline
* started testing auto-detection of vector size patch
* DENBench - some benchmarks are still unstable; I am looking into stable regressions, adjusting and fixing the cost model for them

Next week: Sunday and Monday - holidays

Ira

14 years, 5 months

Versatile Express write-up

by Michael Hope

Here's an end-of-cycle write up for Versatile Express support in QEMU:

https://wiki.linaro.org/WorkingGroups/ToolChain/Outputs/QEMUVersatileExpress

Most of it is taken from Peter's page:

https://wiki.linaro.org/PeterMaydell/QemuVersatileExpress

which is the place to go if you want the current state and more detail on the steps involved.

While writing this up I had a seamless experience from the first linaro-image-create until seeing the alpha3 greeter come up and wobbling the mouse around. It was awesome.

Some ideas for other write-ups at:

https://wiki.linaro.org/WorkingGroups/ToolChain/Outputs

-- Michael

14 years, 5 months

RealView PBX write-up

by Michael Hope

Dave did an investigation earlier in the year into Cortex-A9 and RealView PBX support in QEMU. The write-up is available here:

https://wiki.linaro.org/WorkingGroups/ToolChain/Outputs/QEMURealViewPBX

Dave and Peter: could you please review it? I've now closed out the blueprint.

I'd like to do similar reports on other outputs and will attack vexpress next.

-- Michael

14 years, 5 months

Work-item tracking for the "gcc-linaro-tracking" LP project

by Ulrich Weigand

Hi Michael, Andrew,

Mounir just pointed out that our non-Ubuntu LP projects (like gcc-linaro, gdb-linaro etc.) are now also included in the LP work-item tracking statistics (http://status.linaro.org/linaro-toolchain-wg.html). This didn't happen in the past due to a Launchpad issue that has now been fixed.

This seems to be working out nicely, except for one issue: what about the gcc-linaro-tracking project? I have a couple of bugs that are fixed in Linaro GCC, and are also fixed in mainline GCC, but they still show up as an "in-progress" work-item in the status tracker (there are a whole bunch more of those assigned to Andrew as well).

The reason for this is the LP records have an associated gcc-linaro-toolchain project entry, and this is set to "Fix Committed", but not "Fix Released" ... probably because GCC 4.6.0 is not yet released?

Now, on the one hand it does make sense to include the -tracking project in the work-item statistics, because they *do* reflect important tasks: namely, to make sure that the changes indeed land in the upstream repository. However, having them all show up as "in progress" until the community makes a new GCC release does not seem very helpful: this is not in our control, and our work is in fact done once the patch is committed upstream.

Therefore my suggestion: we should immediately mark -tracking bugs as "Fix Released" (not "Fix Committed"), as soon as the corresponding patch is committed upstream (and thus our work on the problem is completed).

Thoughts? Does this make sense? Will this mess up any of the other purposes for which we currently use the -tracking project?

Mit freundlichen Gruessen / Best Regards

Ulrich Weigand

--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294

14 years, 5 months


cortex-strings benchmarks

by Michael Hope

Hi Dave.

I had a little play with cortex-strings and did some benchmarks on my Tegra 2. Images are attached.

I've added two scripts to cortex-strings:

 scripts/bench-all.sh runs all the routines on all variants and records them
 scripts/plot.py plots the results from above

plot.py corrects for the benchmark overhead by doing a linear fit to the null 'bounce' results and subtracting this fit.

You should be able to do an autogen; configure; make; bash scripts/bench-all.sh | tee log.txt; python scripts/plot.py log.txt. I'm sure you have your own favourite tools though.

The string routines look good. Lumpy in funny ways though...

-- Michael
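The overhead correction described above (fit a line to the null 'bounce' timings, then subtract that line from every measurement) can be sketched roughly as follows. This is not the code in scripts/plot.py; the function names and the sample numbers are made up for illustration, and it assumes the overhead grows linearly with the buffer size.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def subtract_overhead(sizes, times, bounce_sizes, bounce_times):
    """Subtract the fitted 'bounce' (do-nothing) cost from each timing."""
    slope, intercept = linear_fit(bounce_sizes, bounce_times)
    return [t - (slope * s + intercept) for s, t in zip(sizes, times)]


# Hypothetical data: the null routine costs exactly 2.0 + 0.5 * size.
bounce_sizes = [0, 8, 16, 32]
bounce_times = [2.0, 6.0, 10.0, 18.0]

# Hypothetical raw timings for a real routine at two sizes.
sizes = [8, 16]
times = [7.0, 12.0]

corrected = subtract_overhead(sizes, times, bounce_sizes, bounce_times)
print(corrected)
```

Fitting the null results rather than subtracting them point by point smooths out noise in the individual bounce measurements, at the cost of assuming the overhead really is linear.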

14 years, 5 months


Fwd: Representing interleaving and lane load/stores at the tree level

by Richard Sandiford

[Sorry, forgot to CC: the list]

Hi Ira,

Thanks for the feedback.

On 6 March 2011 09:20, Ira Rosen <IRAR(a)il.ibm.com> wrote:
> > So how about the following functions? (Forgive the pascally syntax.)
> >
> > __builtin_load_lanes (REF : array N*M of X)
> >   returns array N of vector M of X
> >   maps to vldN
> >   in practice, the result would be used in assignments of the form:
> >     vectorX = ARRAY_REF <result, X>
> >
> > __builtin_store_lanes (VECTORS : array N of vector M of X)
> >   returns array N*M of X
> >   maps to vstN
> >   in practice, the argument would be populated by assignments of the
> form:
> >     vectorX = ARRAY_REF <result, X>
> >
> > __builtin_load_lane (REF : array N of X,
> >                      VECTORS : array N of vector M of X,
> >                      LANE : integer)
> >   returns array N of vector M of X
> >   maps to vldN_lane
> >
> > __builtin_store_lane (VECTORS : array N of vector M of X,
> >                       LANE : integer)
> >   returns array N of X
> >   maps to vstN_lane
> >
>
> How do you distinguish between "multiple structures" and "single structure
> to all lanes"?

Sorry, I'm not sure I understand the question. Could you give a couple of examples? The idea is that the arrays above really are array types, regardless of the actual type of the thing we're accessing (which might be a larger array than the bounds above say, or which might be an array of structures or a structure of arrays). That should be OK because arrays alias their elements.

Richard
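As an illustration of the array shapes in the proposal, here is a small Python model of the two whole-array functions. It is only a sketch: it assumes the de-interleaving behaviour of NEON vldN/vstN (element i of each N-element structure goes to vector i), uses plain lists in place of vectors, and the Python names simply mirror the proposed builtins rather than any existing GCC interface.

```python
def load_lanes(ref, n, m):
    """Model of __builtin_load_lanes: array N*M of X -> array N of vector M.

    Vector i collects element i of each of the M consecutive structures.
    """
    if len(ref) != n * m:
        raise ValueError("ref must contain exactly n * m elements")
    return [[ref[j * n + i] for j in range(m)] for i in range(n)]


def store_lanes(vectors):
    """Model of __builtin_store_lanes: array N of vector M -> array N*M of X.

    Interleaves the vectors again, so it is the inverse of load_lanes.
    """
    n = len(vectors)
    m = len(vectors[0])
    return [vectors[i][j] for j in range(m) for i in range(n)]


# Four hypothetical RGB pixels stored as interleaved structures (N=3, M=4).
pixels = ["r0", "g0", "b0", "r1", "g1", "b1",
          "r2", "g2", "b2", "r3", "g3", "b3"]

reds, greens, blues = load_lanes(pixels, 3, 4)
print(reds)    # the four red components
print(store_lanes([reds, greens, blues]) == pixels)
```

In this model "ARRAY_REF <result, X>" corresponds to indexing the returned list, which is why a single call can feed several vector assignments.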

14 years, 5 months


Linaro GDB patch for natty

by Ulrich Weigand

Hi Matthias,

in last week's meeting you raised the question what, if any, code from the Linaro GDB repository could be useful for inclusion into the natty GDB package.

I've now reviewed the contents of the repository, and my suggestion would be to use everything in Linaro GDB 7.2, except for this commit (which changes the branding to "Linaro GDB"):

  revno: 32969
  committer: Ulrich Weigand <uweigand(a)de.ibm.com>
  branch nick: 7.2
  timestamp: Wed 2010-09-22 19:18:38 +0200
  message:
    2010-09-22  Ulrich Weigand  <uweigand(a)de.ibm.com>

        * src-release: Support gdb-linaro packages.

        gdb/
        * version.in: Set to Linaro GDB version number.
        * configure.ac (PKGVERSION, BUGURL): Refer to Linaro.
        * configure: Regenerate.

        gdb/gdbserver/
        * configure.ac (PKGVERSION, BUGURL): Refer to Linaro.
        * configure: Regenerate.

        gdb/doc/
        * configure.ac (PKGVERSION, BUGURL): Refer to Linaro.
        * configure: Regenerate.

(Instead, the branding ought to be set as appropriate for the Ubuntu package. Maybe with an additional reference to Linaro, just as with GCC?)

I've created a snapshot of the Linaro GDB 7.2 branch using the command

  bzr diff --prefix a/:b/ -r32965..

and then manually removed changes to

  src-release
  gdb/version.in
  gdb/configure.ac
  gdb/configure
  gdb/gdbserver/configure.ac
  gdb/gdbserver/configure
  gdb/doc/configure.ac
  gdb/doc/configure

I've left in the new file ChangeLog.linaro for documentation purposes, but if you prefer this could of course be removed as well.

The resulting patch is appended here. (Note that I'd recommend to continue updating the patch from Linaro GDB as further changes make it in.)

(See attached file: linaro-gdb.patch)

I've then added the patch to the natty GDB package. Since it touches a completely distinct set of files compared to the existing list of patches in the package, it can be added to the series file at any arbitrary point.

I've built the resulting compiler on i386, arm, and ppc64, and it strictly improved the test results on all three platforms:

i386 without patch:
  # of expected passes            16161
  # of unexpected failures        114
  # of expected failures          72
  # of untested testcases         9
  # of unresolved testcases       1
  # of unsupported tests          69

i386 with patch:
  # of expected passes            16331
  # of unexpected failures        24
  # of expected failures          72
  # of untested testcases         9
  # of unresolved testcases       1
  # of unsupported tests          69

Fixed test case failures are from:
  gdb.base/break-interp.exp
  gdb.base/foll-fork.exp
  gdb.base/printcmds.exp
(These are just test suite cleanups, no actual code changes.)

ppc without patch:
  # of expected passes            15350
  # of unexpected failures        74
  # of expected failures          53
  # of untested testcases         15
  # of unresolved testcases       1
  # of unsupported tests          63

ppc with patch:
  # of expected passes            15350
  # of unexpected failures        55
  # of expected failures          53
  # of untested testcases         15
  # of unresolved testcases       1
  # of unsupported tests          63

Fixed test case failures are from:
  gdb.base/printcmds.exp
  gdb.threads/local-watch-wrong-thread.exp
  gdb.threads/watchthreads.exp
(These are just test suite cleanups, no actual code changes.)

arm without patch:
  # of expected passes            15343
  # of unexpected failures        270
  # of unexpected successes       1
  # of expected failures          65
  # of untested testcases         11
  # of unresolved testcases       2
  # of unsupported tests          70

arm with patch:
  # of expected passes            15686
  # of unexpected failures        46
  # of unexpected successes       3
  # of expected failures          63
  # of untested testcases         11
  # of unresolved testcases       1
  # of unsupported tests          69

Fixed test case failures are from:
  gdb.base/break-interp.exp
  gdb.base/corefile.exp
  gdb.base/foll-fork.exp
  gdb.base/gcore.exp
  gdb.base/gdb1555.exp
  gdb.base/pr11022.exp
  gdb.base/printcmds.exp
  gdb.base/recurse.exp
  gdb.base/relativedebug.exp
  gdb.base/step-test.exp
  gdb.base/watch-cond.exp
  gdb.base/watch-read.exp
  gdb.base/watch_thread_num.exp
  gdb.base/watch-vfork.exp
  gdb.gdb/selftest.exp
  gdb.mi/gdb792.exp
  gdb.mi/mi2-syn-frame.exp
  gdb.mi/mi2-var-display.exp
  gdb.mi/mi2-watch.exp
  gdb.mi/mi-syn-frame.exp
  gdb.mi/mi-var-display.exp
  gdb.mi/mi-watch.exp
  gdb.pie/corefile.exp
  gdb.server/ext-attach.exp
  gdb.threads/attachstop-mt.exp
  gdb.threads/attach-stopped.exp
  gdb.threads/linux-dp.exp
  gdb.threads/local-watch-wrong-thread.exp
  gdb.threads/pthread_cond_wait.exp
(This represents much of the bug fix work that went into Linaro GDB.)

Let me know if there's any further information you need, or anything else I can do to help get the Linaro changes into natty GDB.

Mit freundlichen Gruessen / Best Regards

Ulrich Weigand

--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294
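The manual step of dropping the branding-related files from the snapshot could also be scripted. A minimal sketch in Python, assuming the patch contains the "=== modified file 'path'" separator lines that bzr diff writes before each file's changes; drop_files and the sample patch text are hypothetical and not part of any Linaro GDB tooling.

```python
import re

# Separator line bzr diff emits before each file, e.g.
#   === modified file 'gdb/version.in'
HEADER = re.compile(r"^=== \w+ file '([^']+)'")


def drop_files(patch_text, excluded):
    """Return patch_text without the sections for the paths in excluded."""
    kept = []
    keep = True
    for line in patch_text.splitlines(keepends=True):
        match = HEADER.match(line)
        if match:
            # A new per-file section starts here; decide whether to keep it.
            keep = match.group(1) not in excluded
        if keep:
            kept.append(line)
    return "".join(kept)


# Made-up two-file patch, only to exercise the filter.
SAMPLE = (
    "=== modified file 'gdb/version.in'\n"
    "--- a/gdb/version.in\n"
    "+++ b/gdb/version.in\n"
    "@@ -1 +1 @@\n"
    "-old version\n"
    "+new version\n"
    "=== modified file 'gdb/arm-tdep.c'\n"
    "--- a/gdb/arm-tdep.c\n"
    "+++ b/gdb/arm-tdep.c\n"
    "@@ -1 +1 @@\n"
    "-old line\n"
    "+new line\n"
)

filtered = drop_files(SAMPLE, {"gdb/version.in"})
print(filtered)
```

Scripting the filter would make it easier to keep refreshing the patch from Linaro GDB as further changes land, as recommended above.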

14 years, 5 months


[ACTIVITY] 7th - 13th March

by Andrew Stubbs

Merged fixes for several bugs into Linaro GCC 4.5. Both from Linaro (Richard, Matthias and Ramana), and from CS (the shrink wrap problems).

Continued working on benchmarking the patches I've merged to 4.6. Spent quite some time trying to figure out why EEMBC and the Spec2K weren't working properly. I've got this sorted now.

Confirmed that the patch to discourage NEON use for integer operations is still profitable on Cortex-A8. Posted the patch upstream.

Merged upstream GCC 4.6 into Linaro GCC 4.6.

Booked travel to Budapest for Linaro @ UDS.

Followed up on Ramana's questions about the RVCT interoperability patch. Paul Brook helped explain what it was about, and pointed me at the proper section in the proper ARM manual.

Continued forward porting patches to 4.6. Mostly I need to convince myself that they still do something useful. I have posted one new patch to upstream - the "Discourage A8 NEON" patch.

* Future Absence

Away Wednesday 16th to Friday 18th.
Away Monday 28th to Friday 1st April.

----

Upstream patches requiring review:

* Thumb2 constants:
  http://gcc.gnu.org/ml/gcc-patches/2010-12/msg00652.html

* ARM EABI half-precision functions
  http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00874.html

* ARM Thumb2 Spill Likely tweak
  http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00880.html

* NEON scheduling patch
  http://gcc.gnu.org/ml/gcc-patches/2011-02/msg01431.html

* RVCT Interoperability patch
  http://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg00059.html

* Discourage NEON on A8
  http://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg00576.html

14 years, 5 months


[ACTIVITY] Mar.07 -- Mar.13

by Chung-Lin Tang

== Last week ==

* Working on Coremark ARMv6 regressions. Identified a major cause being RTL ifcvt failing on one of the crc routines, due to the combine pass failing to optimize a particular sequence, causing the if-conversion estimates to give up on conditional execution (too many insns). The combine pass failed on ARMv6 and above, due to the existence of true zero_extend insns. On ARMv5, the use of two shifts actually allowed combine to reduce the shifts one by one, thus producing better code. On ARMv6, combine produced a (xor (and ...) <mask>) which did not match any insn. Analyzed and sent a patch upstream which should work on such XOR cases. Patch is due for upstream commit for 4.7-stage1. (http://gcc.gnu.org/ml/gcc-patches/2011-03/msg00609.html)

* Another situation of un-optimized uxth insns still exists; trying to solve this by another combine patch I am currently testing, will send upstream later.

== This week ==

* Verify the improvements the above patches should have on Coremark for ARMv6/v7.
* Work on sending them to Linaro and SG++ branches.
* Other bug issues.
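The difference between the two zero-extension forms mentioned above can be illustrated with plain integer arithmetic. A rough Python model, assuming 32-bit registers and a halfword (16-bit) extension as done by uxth; the function names are made up for illustration, and this does not model GCC's RTL or the combine pass itself.

```python
MASK32 = 0xFFFFFFFF


def zext16_by_shifts(x):
    """ARMv5-style zero-extension: shift left by 16, then logical right by 16."""
    shifted = (x << 16) & MASK32   # keep the result within a 32-bit register
    return shifted >> 16


def zext16_by_mask(x):
    """Zero-extension written as an AND with 0xffff, the value uxth computes."""
    return x & 0xFFFF


# Both forms agree on every 32-bit input tried here.
samples = [0, 1, 0xFFFF, 0x10000, 0x12345678, 0xFFFFFFFF]
results = [(zext16_by_shifts(x), zext16_by_mask(x)) for x in samples]
for x, (by_shifts, by_mask) in zip(samples, results):
    print(hex(x), hex(by_shifts), hex(by_mask))
```

The values are identical either way; the report above is about which form the combine pass manages to simplify together with the surrounding XOR, not about any difference in the result.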

14 years, 5 months

