Thanks Bero. Sending this extremely useful information out to a wider audience.
Alex,
I think you'll probably be very interested in this for your Mozilla work.
>> -O3
>> * What it is, does, available on
>
> -O3 enables several additional compiler optimizations such as tree
> vectorizing and loop unswitching, and optimizes for speed over code
> size somewhat more aggressively than -O2, e.g. by inlining all calls
> to small static functions.
> It is available on any platform supported by gcc.
>
>> OpenMP
>> * What it is, does, available on
>
> OpenMP is a simple API that makes it easier for a programmer to make
> use of multi-core or multi-processor systems, e.g. by automatically
> splitting marked loops into several threads.
> Example:
>
> #pragma omp parallel for
> for(int i=0; i<100; i++)
> do_something(i);
>
> Would split the 100 iterations across a team of threads (by default,
> typically one thread per available CPU core).
>
>
> It is available on platforms supported by gcc that can use libgomp,
> gcc's OpenMP library. This includes most platforms that support POSIX
> threads - but -- initially -- not Android.
>
>
>> Loop parallelization
>> * What it is, does, available on
>
> Loop parallelization takes OpenMP a step further by automatically
> determining which loops are suitable for "#pragma omp parallel for"
> and similar constructs. This allows code that was written without
> multiprocessing in mind (such as most code written specifically for
> ARM platforms - multicore/SMP ARM systems are quite new) to take
> advantage of multicore/SMP systems (to some extent) without having to
> modify the code.
>
> Compiler flag: -ftree-parallelize-loops=X (where X is the number of
> threads to be optimized for - typically the number of CPU cores in the
> target system)
>
> Available on anything supported by gcc that has both libgomp and
> graphite (incl. CLooG, PPL or ISL) - the original Android toolchain
> has neither of those.
>
>> ...and any other optimizations that you've done.
>
> None of the following is enabled yet (but the support in the toolchain
> is there now), but I'm planning to enable them step by step once we
> have systems built w/ the new toolchain that actually boot:
>
> binutils: --hash-style=gnu
> By default, ld creates SysV style hash tables for function tables
> in shared libraries. With --hash-style=gnu, we switch to GNU style
> hashes, making symbol lookup a lot faster. (details:
> http://sourceware.org/ml/binutils/2006-10/msg00377.html)
>
> binutils: -Bsymbolic-functions
> Speed up the dynamic linker by binding references to global
> functions in shared libraries where it is known that this doesn't
> break things (it's safe for libraries that don't have any users trying
> to override their symbols - it's probably safe to assume e.g. skia and
> opengl could benefit).
> (details: http://www.fkf.mpg.de/edv/docs/intel_composer/Documentation/en_US/compiler_…)
>
> binutils/gcc: -flto, -fwhole-program
> Link-Time Optimization - causes code to be optimized again at link
> time, when the compiler knows what functions are called from what
> parts of the code, what functions are only called with constant
> parameters, etc.
>
> gcc: -mtune=cortex-a9 (or whatever the actual target CPU is)
> The Android build system uses -march=armv7-a, which is good -- but
> it doesn't do any tuning for the specific CPU type (e.g. cortex-a8 vs.
> cortex-a9).
>
> gcc: -fvisibility-inlines-hidden
> Don't export C++ inline methods in shared libraries. Makes the
> symbol table smaller, improving startup time and disk space efficiency.
>
> gcc: -fstrict-aliasing -Werror=strict-aliasing
> Currently, Android uses -fno-strict-aliasing unconditionally for
> thumb code, to work around some pieces of code that violate strict
> aliasing rules. Using -Werror=strict-aliasing, we can determine what
> pieces of code are affected, and fix them, or limit the use of
> -fno-strict-aliasing to the specific files that need it - enabling the
> rather useful strict-aliasing optimization for the rest of the build.
>
> gcc: Investigate Graphite optimizations that aren't even enabled at -O3:
> -fgraphite-identity -floop-block -floop-interchange
> -floop-strip-mine -ftree-loop-distribution -ftree-loop-linear
>
Hi everyone,
As per our release plan, the 11.08 release candidate of the Linaro
Android image for the Samsung Origen board is now available:
https://android-build.linaro.org/builds/~linaro-android/leb-origen-11.08-re…
After you download these 3 files (if you download them with wget from the
console, you may need to pass the "--no-check-certificate" option),
https://android-build.linaro.org/jenkins/job/linaro-android_leb-origen-11.0…
https://android-build.linaro.org/jenkins/job/linaro-android_leb-origen-11.0…
https://android-build.linaro.org/jenkins/job/linaro-android_leb-origen-11.0…
flash them onto an SD card with the Linaro image tools (bzr branch
lp:linaro-image-tools, if you're already a launch member):
sudo ./linaro-image-tools/linaro-android-media-create --mmc /dev/sdx \
    --dev smdkv310 --system system.tar.bz2 --boot boot.tar.bz2 \
    --userdata userdata.tar.bz2
You may also need to set the boot arguments if you can't boot into the system
(normally you don't need to do this):
setenv bootargs "console=ttySAC2,115200n8 root=/dev/ram init=/init rootwait ro"
setenv bootcmd "fatload mmc 0:2 0x40007000 uImage; fatload mmc 0:2 0x42000000 uInitrd; bootm 0x40007000 0x42000000"
saveenv
This image does not light up the LCD screen yet. The issue is related to the
hardware driver, and I'm working with my colleagues to solve it. I'm
afraid the fix will miss the 11.08 release, sorry.
The kernel version in this image is 3.0.0, and the busybox version is
1.19.0; you can find busybox under /system/bin. This RC image was compiled
with our 11.08 tool chain RC build:
https://android-build.linaro.org/builds/~linaro-android/toolchain-4.6-2011.….
I will continue to work on the official release to ensure it can be done
before 25th August 2011, 16:00 (UTC).
Thank you all for your great efforts!
BR
Botao Sun
This series is posted for posterity. It has been NAK'd by the community
since CPU hotplug has been deemed an inappropriate mechanism for power
capping.
CPUoffline is a framework for taking CPUs offline via the hotplug
mechanism. The framework itself is quite straightforward: a driver
arranges the CPUs into partitions. Each partition is associated with a
governor thread, and that thread implements a policy for taking CPUs in
that partition offline or online, based on some heuristic.
The CPUoffline core code includes a default driver that places all
possible CPUs into a single partition, requiring no code to be written
for a new platform. There is also a single governor named "avgload"
which looks at the average load of all of the *online* CPUs in a
partition and makes a hotplug decision based on defined thresholds.
This framework owes a lot to CPUfreq and CPUidle, from which CPUoffline
stole^H^H^H^H^H borrowed lots of code.
Note: since development was cut short due to the community response, there
are some missing infrastructure bits such as module unregistration and
dynamic governor switching. The code does work fine as-is for the
curious-minded who want to test on an SMP system that supports hotplug.
Mike Turquette (6):
ARM: do not mark CPU 0 as hotpluggable
cpumask: introduce cpumask for hotpluggable CPUs
cpu: update cpu_hotpluggable_mask in register_cpu
cpuoffline core
governors
arm kconfig
arch/arm/Kconfig | 2 +
arch/arm/kernel/setup.c | 3 +-
drivers/Makefile | 1 +
drivers/base/cpu.c | 4 +-
drivers/cpuoffline/Kconfig | 26 ++
drivers/cpuoffline/Makefile | 2 +
drivers/cpuoffline/cpuoffline.c | 488 ++++++++++++++++++++++++++++++++
drivers/cpuoffline/governors/Kconfig | 9 +
drivers/cpuoffline/governors/Makefile | 2 +
drivers/cpuoffline/governors/avgload.c | 255 +++++++++++++++++
include/linux/cpumask.h | 27 ++-
include/linux/cpuoffline.h | 82 ++++++
kernel/cpu.c | 18 ++
13 files changed, 912 insertions(+), 7 deletions(-)
create mode 100644 drivers/cpuoffline/Kconfig
create mode 100644 drivers/cpuoffline/Makefile
create mode 100644 drivers/cpuoffline/cpuoffline.c
create mode 100644 drivers/cpuoffline/governors/Kconfig
create mode 100644 drivers/cpuoffline/governors/Makefile
create mode 100644 drivers/cpuoffline/governors/avgload.c
create mode 100644 include/linux/cpuoffline.h
--
1.7.4.1
In support of the Linaro 11.08 release, the libjpeg-turbo package has
been updated substantially from 1.1.1 to 1.1.90, which closely tracks
the upcoming 1.2 community release. On ARM, "make test" passes and image
quality appears to be good. (Of note to the Android WG: bugs #823960
and #826642 are not present.) This new version of libjpeg-turbo is
included in the ubuntu-desktop and alip reference images.
ChangeLog since 1.1.1
1.1.90 (1.2 beta1)
==================
[1] Added a JNI wrapper for TurboJPEG/OSS. See java/README for more details.
[2] TurboJPEG/OSS can now scale down images during decompression.
[3] Added SIMD routines for RGB-to-grayscale color conversion, which
significantly improves the performance of grayscale JPEG compression from an
RGB source image.
[4] Improved performance for non-x86 machines.
[5] Added a function to the TurboJPEG API which performs lossless transforms.
This function uses the same back end as jpegtran, but it performs transcoding
entirely in memory and allows multiple transforms and/or crop operations to be
batched together, so the source coefficients only need to be read once. This
is useful when generating image tiles from a single source JPEG.
[6] Modified jpgtest to benchmark the new scaled decompression and lossless
transform features in TurboJPEG/OSS.
[7] Added support for 4:4:0 (transposed 4:2:2) subsampling in TurboJPEG, which
was necessary in order for it to read 4:2:2 JPEG files that had been losslessly
transposed or rotated 90 degrees.
[8] All legacy VirtualGL code has been re-factored, and this has allowed
libjpeg-turbo, in its entirety, to be re-licensed under a BSD-style license.
[9] libjpeg-turbo can now be built with YASM.
[10] Added SIMD acceleration for ARM Linux and iOS platforms that support
NEON instructions.
[11] Refactored the TurboJPEG C API so that it uses pixel formats to define the
size and component order of the uncompressed source/destination images as well
as uses the libjpeg memory source and destination managers. The latter allows
the TurboJPEG compressor to grow the JPEG buffer as necessary.
[12] Eliminated errors in the output of jpegtran on Windows that occurred when
the application was invoked using I/O redirection
(jpegtran <input.jpg >output.jpg).
[13] The inclusion of libjpeg v7 and v8 emulation as well as arithmetic coding
support in libjpeg-turbo v1.1.0 introduced several new error constants in
jerror.h, and these were mistakenly enabled for all emulation modes, causing
the error enum in libjpeg-turbo to sometimes have different values than the
same enum in libjpeg. This represents an ABI incompatibility, and it caused
problems with rare applications that took specific action based on a particular
error value. The fix was to include the new error constants conditionally
based on whether libjpeg v7 or v8 emulation was enabled.
[14] Fixed an issue whereby Windows applications that used libjpeg-turbo would
fail to compile if the Windows system headers were included before jpeglib.h.
This issue was caused by a conflict in the definition of the INT32 type.
[15] Implemented a more efficient version of TJBUFSIZE() which computes a
worst-case JPEG size based on the level of chrominance subsampling.
[16] Fixed 32-bit supplementary package for amd64 Debian systems which was
broken by enhancements to the packaging system in 1.1.
[17] Support for decoding JPEG images that use the CMYK or YCCK colorspaces.
--
Regards,
Tom
"We want great men who, when fortune frowns will not be discouraged."
- Colonel Henry Knox
Linaro.org │ Open source software for ARM SoCs
w) tom.gall att linaro.org
w) tom_gall att vnet.ibm.com
h) tom_gall att mac.com
On 19 August 2011 09:36, Botao Sun <botao.sun(a)linaro.org> wrote:
> Hi Guys,
>
> Currently, we have 2 different situations according to our 2 different tool
> chains - old and new.
>
> For the old tool chain, use branch linaro_android_2.3.4 of my git
> repository: git://git.linaro.org/people/botaosun/busybox.git. This is a
> pre-built version that contains only a busybox binary.
>
> For the new tool chain, use branch linaro_android_2.3.5 of my git
> repository: git://git.linaro.org/people/botaosun/busybox.git. This version
> contains the source code of busybox 1.19.0 release, and the source code will
> be compiled with the other components of entire platform.
>
> I did this because there are some tricky issues related to our old
> tool chain and the Android build system. I have discussed the details
> with Bero. It's not impossible to solve them, but time matters. In addition,
> we will move to the new tool chain sooner or later, so there is little
> benefit in working on support for the old version, and there is already
> a workable busybox binary.
>
> If you have more suggestions, feel free to reply to this mail.
>
> Thank you.
>
>
> BR
> Botao Sun
>