The following tests are based on the latest libav git, built with the same configuration. The results are even stranger than those from gst-ffmpeg with sync.

    time avconv -i nfsroot2/bitstreams/720p/Source.Code.2011.BluRay.720p.DTS.x264-CHD.mkv -t 0:1:0 -f s16le a.pcm
    panda -- 8.345s
    mx53 -- 54.678s

    time avconv -i nfsroot2/bitstreams/720p/Source.Code.2011.BluRay.720p.DTS.x264-CHD.mkv -t 0:1:0 -f s32le b.pcm
    panda -- 9.419s
    mx53 -- 63.294s

mru, how can I enable profiling instrumentation in libav? Are there any configure options for it? And why can't the native gcc in the Linaro releases support NEON instructions? I think both of the platforms above support NEON.
-- Wei.Feng (irc wei_feng), Linaro Multimedia Team, Linaro.org │ Open source software for ARM SoCs
Additional test:

    time avconv -i nfsroot2/bitstreams/720p/Source.Code.2011.BluRay.720p.DTS.x264-CHD.mkv -t 0:1:0 -codec:a copy SourceCode.dts
    panda -- 3.418s
    mx53 -- 6.969s

    time avconv -i SourceCode.dts -f s16le a.pcm
    panda -- 7.891s
    mx53 -- 54.215s
I made a mistake in the previous tests (the timings included my own interaction). The correct results are:

    time avconv -i nfsroot2/bitstreams/720p/Source.Code.2011.BluRay.720p.DTS.x264-CHD.mkv -t 0:1:0 -f s16le a.pcm
    panda -- 7.005s
    mx53 -- 53.903s

    time avconv -i nfsroot2/bitstreams/720p/Source.Code.2011.BluRay.720p.DTS.x264-CHD.mkv -t 0:1:0 -f s32le b.pcm
    panda -- 7.862s
    mx53 -- 60.235s

    time avconv -i nfsroot2/bitstreams/720p/Source.Code.2011.BluRay.720p.DTS.x264-CHD.mkv -t 0:1:0 -codec:a copy SourceCode.dts
    panda -- 1.568s
    mx53 -- 5.155s

    time avconv -i SourceCode.dts -f s16le a.pcm
    panda -- 6.320s
    mx53 -- 50.526s
I took the liberty of optimising a loop that seems to have been added after I did the original NEON work on DTS. It is now ~13% faster on Cortex-A8 for streams using that code path. I didn't benchmark it on A9 but the difference there should be smaller.
2011/9/28 Feng Wei feng.wei@linaro.org:
> mru, how can I enable profiling instrumentation in libav? Are there any configure options for it?

Use perf or oprofile.

> Why can't the native gcc in the Linaro releases support NEON instructions?

How do you mean? GCC isn't very good at generating NEON instructions itself, but it allows using them in hand-written assembler.
When I configure libav, I get:

    ...
    C compiler                gcc
    ARCH                      arm (generic)
    big-endian                no
    runtime cpu detection     no
    ARMv5TE enabled           yes
    ARMv6 enabled             yes
    ARMv6T2 enabled           yes
    ARM VFP enabled           yes
    IWMMXT enabled            no
    NEON enabled              no
    debug symbols             yes
    optimize for size         no
    optimizations             yes
    ...

and I found this in the log:

    ...
    check_asm neon "vadd.i16 q0, q0, q0"
    check_as
    BEGIN /tmp/ffconf.55xwYuP7.c
        1   void foo(void){ __asm__ volatile("vadd.i16 q0, q0, q0"); }
    END /tmp/ffconf.55xwYuP7.c
    gcc -D_ISOC99_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_POSIX_C_SOURCE=200112 -D_XOPEN_SOURCE=600 -c -o /tmp/ffconf.3j1rIVOW.o /tmp/ffconf.55xwYuP7.c
    /tmp/cc9mWCWX.s: Assembler messages:
    /tmp/cc9mWCWX.s:29: Error: selected FPU does not support instruction -- `vadd.i16 q0,q0,q0'
    ...
This means your compiler doesn't default to NEON, and you didn't add -mfpu=neon manually. Try passing --extra-cflags=-mfpu=neon to configure.
Were your benchmarks done without NEON enabled? I get almost exactly the same speed on Beagle-xm and Panda, both at 1GHz.
-- Mans Rullgard / mru
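The suggestion above could be tried roughly as follows. This is only a sketch: the helper name `neon_configure` is made up for illustration, and it assumes it is run from the top of a libav source tree. Only the `--extra-cflags=-mfpu=neon` option comes from the reply itself.

```shell
# Hypothetical helper: re-run configure with NEON allowed by the assembler
# and print the NEON line from the configure summary.
neon_configure() {
    # Assumption: the current directory is the top of a libav source tree.
    if [ ! -x ./configure ]; then
        echo "run this from the libav source tree" >&2
        return 1
    fi
    # -mfpu=neon lets the assembler accept NEON instructions such as vadd.i16
    ./configure --extra-cflags=-mfpu=neon | grep 'NEON enabled'
}
```

Calling `neon_configure` should then print the `NEON enabled` summary line, which can be compared with the `no` shown in the earlier configure output.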
I rebuilt libav with NEON enabled and got the new benchmark below:

    time avconv -i SourceCode.dts -f s16le a.pcm
    panda -- 4.135s (53% better than the non-NEON version, 6.320s)
    mx53 -- 19.054s (165% better than the non-NEON version, 50.526s)

So, as mru said, the DTS decoder is mostly NEON-optimized. Although the result on the A8 CPU still doesn't look quite reasonable, I don't think we need to carry this into the next cycle.
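For reference, the percentages quoted above appear to be computed as old time divided by new time, minus one. A small sketch that reproduces them (the helper name `speedup_pct` is hypothetical, and it assumes awk is available):

```shell
# Hypothetical helper: percent improvement of a new time over an old time,
# computed as (old / new - 1) * 100 and rounded to a whole number.
speedup_pct() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.0f\n", (old / new - 1) * 100 }'
}

speedup_pct 6.320 4.135     # panda: prints 53
speedup_pct 50.526 19.054   # mx53:  prints 165
```

Read this way, "165% better" means the NEON build is about 2.65 times as fast on the mx53, not that the run time dropped by 165%.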
linaro-multimedia@lists.linaro.org