On 28 March 2011 07:52, Jim Huang <jim.huang@linaro.org> wrote:
- zlib
Using SIMD, we can optimize 'copy / repeat an existing sequence' in LZ-style encoding. The reference Intel SSE2 optimization patch is attached in this mail.
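To make the 'copy / repeat an existing sequence' step concrete, here is a minimal sketch in plain C. It is not the attached SSE2 patch; the function name lz_copy_match is hypothetical, and the 16-byte memcpy only stands in for a vector load/store pair (compilers commonly lower it to one, but that is an assumption about the toolchain). The point it shows is that wide copies are only safe when the match distance is at least the vector width; shorter distances overlap and have to repeat the pattern byte by byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy `len` bytes of an LZ match to `out`, reading from `dist` bytes
 * behind `out`. When dist < len the source and destination overlap, and
 * the copy must replicate the pattern (e.g. "abc" -> "abcabcabc").
 * Hypothetical helper, for illustration only.
 */
static void lz_copy_match(unsigned char *out, size_t dist, size_t len)
{
    const unsigned char *src = out - dist;

    assert(dist > 0);

    if (dist >= 16) {
        /* Each 16-byte chunk ends before its destination begins, so a
         * wide (vector-sized) copy cannot read bytes it has not written. */
        while (len >= 16) {
            memcpy(out, src, 16);
            out += 16;
            src += 16;
            len -= 16;
        }
    }

    /* Short distances and the tail: byte-wise copy repeats the pattern. */
    while (len--)
        *out++ = *src++;
}
```

A real SIMD version would typically also handle small distances by broadcasting the pattern into a vector register; that part is left out here.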
Regarding zlib in particular: in 2005 I did an AltiVec port of it. Apart from vectorizing the Adler32 hashing function (which was ~2x faster than the C version [1]), there are ~6 functions that are worth optimizing, as I found out while profiling the code. These functions are in deflate.c and inflate.c, IIRC; I have to search for the old tarball, it's here somewhere. The performance increase was from 20% to 50%, using plain C AltiVec code. I guess it should be similar with NEON. IMHO, it's worth it, but:
The problem is the zlib license: it forbids distributing compiled versions that are modified from the original source. Such optimizations can go in the contrib folder, but that's of little use to the average user.
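For readers who have not looked at Adler-32, here is a scalar sketch of the checksum mentioned above. It is not the AltiVec code from [1]; the name adler32_sketch and the macro names are made up for this example. It shows the structure a vectorized version would build on: two running sums whose modulo reduction is deferred to once per block, so the inner loop is pure additions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ADLER_BASE 65521u /* largest prime below 2^16 */
#define ADLER_NMAX 5552u  /* max bytes per block before the 32-bit sums could overflow */

/*
 * Scalar Adler-32 with deferred modulo. Pass adler = 1 to start a new
 * checksum, or a previous result to continue one.
 * Hypothetical name, for illustration only.
 */
static uint32_t adler32_sketch(uint32_t adler, const unsigned char *buf, size_t len)
{
    uint32_t a = adler & 0xffffu;         /* sum of bytes (plus 1) */
    uint32_t b = (adler >> 16) & 0xffffu; /* sum of the running values of a */

    assert(buf != NULL || len == 0);

    while (len > 0) {
        size_t n = len < ADLER_NMAX ? len : ADLER_NMAX;
        len -= n;
        /* Inner loop is additions only; this is the part worth vectorizing. */
        while (n--) {
            a += *buf++;
            b += a;
        }
        a %= ADLER_BASE;
        b %= ADLER_BASE;
    }
    return (b << 16) | a;
}
```

Usage would look like adler32_sketch(1, data, data_len).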
Konstantinos
[1]: http://www.freevec.org/old/whitepapers/Adler32-Altivec.pdf