Hi there. I'm looking for areas where the toolchain could generate faster code, and a good way of doing that is seeing how compiled code does against the best hand-written code. I know of skia, ffmpeg, pixman, Orc, and efl - what others are out there?
Thanks for any input,
-- Michael
On 28 March 2011 05:09, Michael Hope michael.hope@linaro.org wrote:
Hi there. I'm looking for areas where the toolchain could generate faster code, and a good way of doing that is seeing how compiled code does against the best hand-written code. I know of skia, ffmpeg, pixman, Orc, and efl - what others are out there?
hi Michael,
Great motivation to optimize the existing libraries with NEON!
As far as I know, Android depends on several libraries, and some of them are compute-bound:
- libpixelflinger -- a bit like pixman
  There is no official document about PixelFlinger, but you can always check out its source:
  http://android.git.kernel.org/?p=platform/system/core.git%3Ba=summary
  I submitted one NEON optimization patch for libpixelflinger to AOSP before:
  https://review.source.android.com//#change,16358
- zlib
  Using SIMD, we can optimize 'copy / repeat an existing sequence' in LZ-style encoding. The reference Intel SSE2 optimization patch is attached in this mail.
Sincerely, -jserv
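As a rough illustration of the 'copy / repeat an existing sequence' step mentioned above, here is a minimal C sketch. It is not taken from zlib or from the attached SSE2 patch; `copy_match` and its parameters are hypothetical names. What it tries to show is that an overlapping LZ match can be expanded with a handful of non-overlapping block copies, which is the kind of work wide SIMD loads and stores handle well.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not zlib's code.
 * Expands an LZ-style match: writes `len` bytes at `out`, each equal to
 * the byte `dist` positions earlier in the same buffer.
 * Returns the new end of the output. */
unsigned char *copy_match(unsigned char *out, size_t dist, size_t len)
{
    const unsigned char *from = out - dist; /* start of the repeating pattern */
    size_t avail = dist;                    /* valid bytes between from and out */

    assert(dist > 0);

    while (len > 0) {
        /* chunk never exceeds avail, so source and destination never
         * overlap and memcpy (typically a wide block copy) is legal. */
        size_t chunk = len < avail ? len : avail;
        memcpy(out, from, chunk);
        out += chunk;
        len -= chunk;
        avail += chunk; /* the pattern we can copy from has grown */
    }
    return out;
}
```

For example, with "abc" already in the output buffer, `copy_match(out, 3, 7)` appends "abcabca". When `dist >= len` the loop runs once and the whole match is a single block copy.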
On 28 March 2011 07:52, Jim Huang jim.huang@linaro.org wrote:
- zlib
Using SIMD, we can optimize 'copy / repeat an existing sequence' in LZ-style encoding. The reference Intel SSE2 optimization patch is attached in this mail.
Regarding zlib in particular: in 2005 I did an AltiVec port of it. Apart from vectorizing the Adler32 hashing function (which was ~2x faster than the C version [1]), there are ~6 functions that are worth optimizing, as I found out while profiling the code. These functions are in deflate.c and inflate.c, IIRC; I have to search for the old tarball, it's here somewhere. The performance increase was from 20% to 50%, using plain C AltiVec code. I guess it should be similar with NEON. IMHO, it's worth it, but:
The problem is the zlib license, it forbids distributing compiled versions that are modified from the original source, such optimizations can go in the contrib folder, but it's of little use to the average user.
Konstantinos
[1]: http://www.freevec.org/old/whitepapers/Adler32-Altivec.pdf
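For readers who have not looked at Adler32, the following is a minimal scalar sketch of the checksum, written for this note and not taken from the AltiVec port in [1]. The function name `adler32_scalar` is an assumption; the constants 65521 and 5552 are the ones zlib's adler32.c uses. The point is that the inner loop is only two running sums over consecutive bytes, with the modulo deferred to once per block, and that inner loop is what a SIMD version would replace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ADLER_BASE 65521u /* largest prime below 2^16 */
#define ADLER_NMAX 5552u  /* bytes that can be summed before 32 bits overflow */

/* Hypothetical name; same calling convention as zlib's adler32()
 * (start with adler = 1 for a fresh checksum). */
uint32_t adler32_scalar(uint32_t adler, const unsigned char *buf, size_t len)
{
    uint32_t a = adler & 0xffffu;         /* low half: sum of bytes */
    uint32_t b = (adler >> 16) & 0xffffu; /* high half: sum of the a values */

    assert(buf != NULL || len == 0);

    while (len > 0) {
        size_t n = len < ADLER_NMAX ? len : ADLER_NMAX;
        len -= n;
        /* Hot loop: no modulo, just two accumulators. */
        while (n--) {
            a += *buf++;
            b += a;
        }
        /* Reduce once per block instead of once per byte. */
        a %= ADLER_BASE;
        b %= ADLER_BASE;
    }
    return (b << 16) | a;
}
```

As a sanity check, `adler32_scalar(1, (const unsigned char *)"Wikipedia", 9)` gives 0x11E60398.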
Hi Konstantinos,
On Tue, Mar 29, 2011 at 10:21:53AM +0300, Konstantinos Margaritis wrote:
On 28 March 2011 07:52, Jim Huang jim.huang@linaro.org wrote:
The problem is the zlib license, it forbids distributing compiled versions that are modified from the original source, such optimizations can go in the contrib folder, but it's of little use to the average user.
There must be some misunderstanding here; no license that prohibited distribution of binaries built from modified source would be considered a Free Software license, and zlib is certainly considered free. :)
The only relevant requirements in the license (according to /usr/share/doc/zlib1g/copyright) are:
1. The origin of this software must not be misrepresented; you must not claim that you wrote the original software. If you use this software in a product, an acknowledgment in the product documentation would be appreciated but is not required.
2. Altered source versions must be plainly marked as such, and must not be misrepresented as being the original software.
Are you looking at a different zlib license than this one?
On 29 March 2011 10:53, Steve Langasek steve.langasek@linaro.org wrote:
Hi Konstantinos,
There must be some misunderstanding here; no license that prohibited distribution of binaries built from modified source would be considered a Free Software license, and zlib is certainly considered free. :)
Yes, you're right. The problem is that a modified zlib would have to be clearly marked as different, i.e. the package name would have to be different. This would be easily solved by means of a Provides: field, but I'm unsure whether the differentiation should also include the libz.so filename. I was probably wrong in my license interpretation in 2005, but I seem to remember it was something like that which basically made me stop my work on vectorizing zlib :)
I'd love to be corrected if it meant having a NEON-optimized zlib in 2011 :)
The only relevant requirements in the license (according to /usr/share/doc/zlib1g/copyright) are:
1. The origin of this software must not be misrepresented; you must not claim that you wrote the original software. If you use this software in a product, an acknowledgment in the product documentation would be appreciated but is not required.
2. Altered source versions must be plainly marked as such, and must not be misrepresented as being the original software.
Yes, 2 is the problem, I think this was interpreted as having to rename the package and possibly the .so name.
Are you looking at a different zlib license than this one?
No, it's the same.
Konstantinos
On Tue, Mar 29, 2011 at 11:07:05AM +0300, Konstantinos Margaritis wrote:
On 29 March 2011 10:53, Steve Langasek steve.langasek@linaro.org wrote:
Hi Konstantinos,
There must be some misunderstanding here; no license that prohibited distribution of binaries built from modified source would be considered a Free Software license, and zlib is certainly considered free. :)
Yes, you're right, the problem is that a modified zlib would have to be clearly marked as different -ie the package name would have to be different.
Well, Debian zlib is already modified, hence the "dfsg" in the source and binary versions.
On Tue, Mar 29, 2011 at 11:07:05AM +0300, Konstantinos Margaritis wrote:
On 29 March 2011 10:53, Steve Langasek steve.langasek@linaro.org wrote:
Hi Konstantinos,
There must be some misunderstanding here; no license that prohibited distribution of binaries built from modified source would be considered a Free Software license, and zlib is certainly considered free. :)
Yes, you're right, the problem is that a modified zlib would have to be clearly marked as different -ie the package name would have to be different.
I don't think this is a correct interpretation of the license. You don't have to change a package name to "plainly mark" the source as modified; debian/copyright, changelogs, notices in the source files accomplish this. This is done for packages all the time, not just for zlib.
I was probably wrong in my license interpretation in 2005, but I seem to remember it was something like that that basically made me stop my work in vectorizing zlib :)
What a shame! I think you could have gone ahead in good conscience :)
I'd love to be corrected if it meant having a NEON-optimized zlib in 2011 :)
And I don't see any reason we can't go ahead with this now!
On 30 March 2011 01:45, Steve Langasek steve.langasek@linaro.org wrote:
I don't think this is a correct interpretation of the license. You don't have to change a package name to "plainly mark" the source as modified; debian/copyright, changelogs, notices in the source files accomplish this. This is done for packages all the time, not just for zlib.
from http://www.gzip.org/zlib/zlib_license.html
2. Altered source versions must be plainly marked as such, and must not be misrepresented as being the original software.
I read this then as "you cannot distribute it as a replacement of the original zlib library". I'll take your word that it's not the case, but it still is confusing to me.
Konstantinos
Konstantinos, Steve, I think that it depends on how you interpret "plainly mark". I can imagine several ways of doing this:
- naming the (binary) package explicitly
- installing an additional README file / including one in the binary package
- explicitly named source tar files
I think that the original authors did not want derived works being represented as their original work. So we need to avoid any confusion of the NEON-enabled version with the original. I'm with Steve that marking the sources, adding another notice, etc. is enough.
Are the original authors still involved? It might be worth asking them...
Dave
Sent from yet another ARM powered mobile device
Konstantinos is right with regards to the package naming, we discussed this with the zlib developers at the time.
It is easily solved with a packaging solution though: Provides:, as he said. You will have the same problems with something like libjpegturbo if it forces NEON, or with the libmatrix shipped with the GLES test utilities we've seen in our bug reports (btw, that is a prime target for some NEON optimization; Konstantinos, can you throw some code in there?), or in fact with any library that has NEON code which is not properly inserted/overridden at runtime based on NEON hwcaps. The concern here is i.MX515 TO2, Marvell and nVidia chips, which have broken NEON or no NEON at all.
Shipping a libturboz.deb would not be a huge imposition. Given that Genesi provides system images and installers for Ubuntu we can install it by default (TO2 support for installation is going away with Natty). For Debian main, and other distributions which need to figure on supporting more platforms than ours, and for Ubuntu in the future if they ever get their act together on supporting real consumer products instead of just dev boards (looking at you too, Linaro!) then it will have to be a user installable option, but this might not be any more difficult than supplying a metapackage for the platform (like omap-extra) with some Recommends: line, which can only be resolved using an external repository (partner, or so) which is not enabled by default. As soon as someone enables that repo they will have the option at next update to "upgrade" their system to these new libraries.
Unfortunately doing it from a distribution point of view takes away all the easiest potential for performance optimization, but I think the benefit of having it "standardized" is worth it.
Speaking of standardization, we have had a LOT of customer complaints about xscreensaver-gl being installed by default on ARM platforms. In what world does the common ARM SoC ship with a full OpenGL implementation bolted on? Users are clicking some random 3D screensaver and complaining there is no acceleration; users do not understand the difference here between GL and GLES. As well as making new packages (libturboz in some other repo), the installation will have to be automated, or it will have to automatically educate users about why they need this package and why, in some cases, they may not actually need it.
I forgot, btw: the true solution here is to adapt zlib so that the functions are pluggable and the appropriate calls are made at the appropriate times based on HW capabilities.
Then throw it at mainline, they may accept it.
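A minimal sketch of what "pluggable" could look like in C, assuming a function pointer that is resolved once from the hardware capabilities. Everything here is hypothetical and not zlib's API: `adler32_dispatch`, `adler32_generic`, `adler32_neon` and `cpu_has_neon` are made-up names, and the NEON variant is only a placeholder that forwards to the portable code so the sketch builds anywhere. The detection assumes 32-bit ARM Linux with a C library that provides `getauxval()`; on any other target it simply reports no NEON.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__linux__) && defined(__arm__)
#include <sys/auxv.h>  /* getauxval, AT_HWCAP */
#include <asm/hwcap.h> /* HWCAP_NEON */
#endif

typedef uint32_t (*checksum_fn)(uint32_t, const unsigned char *, size_t);

/* Portable fallback: plain byte-at-a-time Adler32. */
uint32_t adler32_generic(uint32_t adler, const unsigned char *buf, size_t len)
{
    uint32_t a = adler & 0xffffu, b = adler >> 16;
    while (len--) {
        a = (a + *buf++) % 65521u;
        b = (b + a) % 65521u;
    }
    return (b << 16) | a;
}

/* Placeholder for a NEON version; forwards to the generic code here. */
uint32_t adler32_neon(uint32_t adler, const unsigned char *buf, size_t len)
{
    return adler32_generic(adler, buf, len);
}

/* Assumption: 32-bit ARM Linux exposes NEON through AT_HWCAP. */
int cpu_has_neon(void)
{
#if defined(__linux__) && defined(__arm__)
    return (getauxval(AT_HWCAP) & HWCAP_NEON) != 0;
#else
    return 0;
#endif
}

static checksum_fn adler32_impl; /* NULL until the first call */

/* Public entry point: picks an implementation on first use. */
uint32_t adler32_dispatch(uint32_t adler, const unsigned char *buf, size_t len)
{
    if (adler32_impl == NULL)
        adler32_impl = cpu_has_neon() ? adler32_neon : adler32_generic;
    assert(adler32_impl != NULL);
    return adler32_impl(adler, buf, len);
}
```

Callers only ever see `adler32_dispatch`, so one binary can run on chips with and without working NEON. The lazy initialisation above is not synchronised; a real library would want to resolve the pointer at load time or guard it.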
Thanks all for your replies. I mixed these in with a bit of Googling and recorded them here: https://wiki.linaro.org/MichaelHope/Sandbox/LibrariesWithNeon
-- Michael
On 31 March 2011 08:23, Michael Hope michael.hope@linaro.org wrote:
Thanks all for your replies. I mixed these in with a bit of Googling and recorded them here: https://wiki.linaro.org/MichaelHope/Sandbox/LibrariesWithNeon
hi Michael,
Jan Seiffert implemented a series of adler32 vectorizations for zlib: http://blackfin.uclinux.org/git/?p=users/vapier/zlib.git%3Ba=summary
ARM NEON and ARMv6 SIMD are included. It looks great and is being reviewed on the zlib mailing list: http://mail.madler.net/pipermail/zlib-devel_madler.net/2011-April/date.html
Regards, -jserv
On Sun, Apr 10, 2011 at 5:47 AM, Jim Huang jim.huang@linaro.org wrote:
On 31 March 2011 08:23, Michael Hope michael.hope@linaro.org wrote:
Thanks all for your replies. I mixed these in with a bit of Googling and recorded them here: https://wiki.linaro.org/MichaelHope/Sandbox/LibrariesWithNeon
hi Michael,
Jan Seiffert implemented a series of adler32 vectorization for zlib: http://blackfin.uclinux.org/git/?p=users/vapier/zlib.git%3Ba=summary
ARM NEON and ARMv6 SIMD are included. It looks great and is being reviewed in zlib mailing-list: http://mail.madler.net/pipermail/zlib-devel_madler.net/2011-April/date.html
Hi jserv. I had a quick play with this on one of my machines. It looks promising but is a bit broken at the moment:
michaelh@ursa1:/scratch/michaelh/zlib$ gdb ./example
...
Starting program: /scratch/michaelh/zlib/example
zlib version 1.2.5 = 0x1250, compile flags = 0x155
uncompress(): hello, hello!
gzread(): hello, hello!
gzgets() after gzseek: hello!
inflate(): hello, hello!

Program received signal SIGSEGV, Segmentation fault.
0x00015c48 in adler32_vec (adler=2363950230, buf=0x7b000 <Address 0x7b000 out of bounds>, len=0) at adler32_arm.c:162
162     in16 = *(const uint8x16_t *)buf;
(gdb) back
#0  0x00015c48 in adler32_vec (adler=2363950230, buf=0x7b000 <Address 0x7b000 out of bounds>, len=0) at adler32_arm.c:162
#1  0x00016446 in adler32 (adler=2363950230, buf=0x26008 "x\001\354\320\261\r", len=20000) at adler32.c:418
#2  0x0000b81c in read_buf (strm=0x7ebf3634, buf=0x44ba8 "", size=25536) at deflate.c:1005
#3  0x0000be7a in fill_window (s=0x39898) at deflate.c:1380
#4  0x0000c06c in deflate_stored (s=0x39898, flush=0) at deflate.c:1484
#5  0x0000b252 in deflate (strm=0x7ebf3634, flush=0) at deflate.c:822
#6  0x0000922e in test_large_deflate (compr=0x26008 "x\001\354\320\261\r", comprLen=40000, uncompr=0x2fc50 "hello, hello!", uncomprLen=40000) at example.c:281
#7  0x00009ca6 in main (argc=1, argv=0x7ebf37f4) at example.c:551
Richard, the implementation uses NEON intrinsics so it'd be interesting to see if your pack/unpack patches apply to it.
I'll mention this on the zlib-devel list.
-- Michael
Michael Hope michael.hope@linaro.org writes:
Richard, the implementation uses NEON intrinsics so it'd be interesting to see if your pack/unpack patches apply to it.
Thanks for the heads up. FWIW, though, I don't think my changes help here, because there are no strided loads and stores involved. Jan's version doesn't use the intrinsics associated with the vldN and vstN instructions that I'm working on.
Richard
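To make the "strided" distinction concrete, here is a small scalar C sketch written for this note; it is not from Jan's code or from Richard's patches, and `split_rgb` and `sum_bytes` are hypothetical names. The first loop reads its input with a stride of three elements, which is the de-interleaving pattern the vldN/vstN family (for example vld3) is designed for. The second reads consecutive bytes, like the plain 16-byte load in the backtrace above, so it has nothing for those instructions to do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Strided access: each output stream reads every third input byte.
 * This is the de-interleave shape that maps onto vld3-style loads. */
void split_rgb(const uint8_t *rgb, uint8_t *r, uint8_t *g, uint8_t *b, size_t n)
{
    assert(n == 0 || (rgb && r && g && b));
    for (size_t i = 0; i < n; i++) {
        r[i] = rgb[3 * i + 0];
        g[i] = rgb[3 * i + 1];
        b[i] = rgb[3 * i + 2];
    }
}

/* Unit-stride access: consecutive bytes, as in a checksum loop.
 * An ordinary contiguous vector load is all this needs. */
uint32_t sum_bytes(const uint8_t *buf, size_t n)
{
    uint32_t sum = 0;
    assert(n == 0 || buf);
    for (size_t i = 0; i < n; i++)
        sum += buf[i];
    return sum;
}
```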