Re: The Value of Thumb-2

20 Oct 2011


On 20 October 2011 18:27, Christian Robottom Reis kiko@linaro.org wrote:
...
- Do we know how much better Thumb-2 actually is, in practice? It's
     easy for us to confirm this on Android; what do the numbers and
     feel of the system tell us?
I did some tests comparing Libav built for ARM and Thumb-2.  The
Thumb-2 build has 18% smaller code size than the ARM build.  Data
size is of course unchanged.  The overall size reduction of
text+data+bss is 10%.  Benchmarking on a Cortex-A9 (Panda), the
Thumb-2 build is 1-3% slower in most of my test cases, only one test
being faster by 1%.  In these tests, the hand-written assembly code
was enabled (it can be built as ARM or Thumb-2).
This is of course highly specialised code so the results are not
generally applicable.  Nevertheless, I would expect similar results
from other compute-intensive applications.
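
As a rough cross-check of the size figures above: if data and bss are the same in both builds, the 18% text reduction and the 10% overall reduction together imply how much of the ARM build is code. The sketch below only rearranges those two quoted percentages; the function name is made up for illustration.

```python
def code_fraction(text_reduction, total_reduction):
    """Share of text+data+bss that is code in the ARM build.

    Assumes data and bss are byte-for-byte the same size in both builds,
    so the whole saving comes from text:
        total_reduction = code_fraction * text_reduction
    """
    return total_reduction / text_reduction


# Figures quoted above for Libav: 18% smaller text, 10% smaller overall.
share = code_fraction(0.18, 0.10)
print(f"code share of the ARM build: {share:.1%}")  # prints 55.6%
```

So code would make up a little over half of the static image in this case, which is why the overall saving is noticeably smaller than the text saving.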
...
- What are the downsides to using Thumb-2 in general? Do we have
     anecdotes or threads that talk about bad experiences or blockers
     in the transition?
The r1pX versions of Cortex-A8 had a few Thumb-2 related errata that
caused a bit of grief until they were properly understood and worked
around.  Some of the workarounds required on these core revisions have
a negative impact on performance (extra invalidations of some branch
prediction buffers etc).
...
- If it's so great, how could we lead a wide-ranging transition to
     Thumb2 becoming the standard ISA for modern v7 applications,
     including Android, Yocto and anything else relevant that runs on a
     Cortex A?
The space savings provided by Thumb-2 only matter if the available memory
(either RAM or non-volatile storage) is almost fully utilised, which is
not typically the case on the type of systems we are focusing on
(Android and desktop distributions), where I strongly doubt code size
is the major contributor to memory usage.
One real benefit not mentioned in the quoted blurb is possibly reduced
startup times for applications when they are loaded from disk/flash.
This could make a case for building things started on system boot as
Thumb-2 in order to speed the boot process.
The promised speed gains from Thumb-2 are only possible as a result of
better I-cache utilisation due to reduced code size.  As seen in the
Libav case, a 20% reduction in code size is realistic.  For this to give a
significant speed boost, the execution pattern would have to be such
that reducing the instruction working set by 20% allows it to fit within
the 16-32k (typical) I-cache thus reducing thrashing.  For instruction
working sets outside this fairly narrow range, switching to Thumb-2
would have little impact on performance.  I have no numbers to go by
here, but my feeling is that the number of realistic workloads that
would benefit here is fairly small.
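
To make the "fairly narrow range" concrete, here is a minimal sketch of the arithmetic under a deliberately crude model: the instruction working set either fits in the I-cache or it does not, and associativity, line size and prefetching are ignored. The function names are hypothetical, and the 20% reduction is the figure used above, not a measurement of any particular workload.

```python
def benefit_window(cache_bytes, reduction):
    """Range of ARM working-set sizes (bytes) that overflow the I-cache
    as ARM code but fit once shrunk by `reduction` as Thumb-2 code."""
    return cache_bytes, cache_bytes / (1.0 - reduction)


def fits_only_as_thumb2(arm_ws, cache_bytes, reduction=0.20):
    """True if the ARM working set misses the cache but the Thumb-2 one fits."""
    thumb_ws = arm_ws * (1.0 - reduction)
    return arm_ws > cache_bytes and thumb_ws <= cache_bytes


for kib in (16, 32):
    low, high = benefit_window(kib * 1024, 0.20)
    print(f"{kib}k I-cache: ARM working set between "
          f"{low / 1024:.0f}k and {high / 1024:.0f}k")
```

Under these assumptions a 32k I-cache only helps ARM working sets between 32k and 40k, and a 16k cache only those between 16k and 20k.  Anything smaller already fits as ARM code, and anything larger still overflows as Thumb-2.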
In light of these observations, I do not think pushing for either
instruction set to be applied system-wide is proper.  Instead, each
application/library should be built using whichever gives it best
performance.  There is no problem mixing the instruction sets.
-- 
Mans Rullgard / mru
