Saw this on the linaro-multimedia list: http://lists.linaro.org/pipermail/linaro-multimedia/2011-September/000074.ht...
libpng spends a significant amount of time in memcpy(). This might tie in with Ramana's investigation or the unaligned access work by allowing more memcpy()s to be inlined.
-- Michael
On 26 September 2011 21:51, Michael Hope michael.hope@linaro.org wrote:
Saw this on the linaro-multimedia list: http://lists.linaro.org/pipermail/linaro-multimedia/2011-September/000074.ht...
libpng spends a significant amount of time in memcpy(). This might tie in with Ramana's investigation or the unaligned access work by allowing more memcpy()s to be inlined.
It's the unaligned access work and the changes / improvements to memcpy that *might* help in this case. But that of course depends on the compiler knowing when it can do such a thing. Of course, what might be more interesting is the kind of workload analysis that Dave's done in the past with memcpy, to find out the alignment and size of the buffers being copied.
cheers Ramana
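To illustrate the point about the compiler needing to know when it can inline a copy, here is a minimal sketch. The function names are hypothetical and are not taken from libpng; whether a given compiler actually inlines the first call depends on the target, the optimization level and whether unaligned access is permitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: the size is a compile-time constant (4 bytes),
 * so a compiler is free to replace this memcpy() with a plain load,
 * provided it knows an unaligned access is acceptable on the target. */
uint32_t load_pixel_fixed(const uint8_t *src)
{
    uint32_t v;
    memcpy(&v, src, sizeof v);
    return v;
}

/* Hypothetical helper: the size is only known at run time, so the
 * compiler typically has to call the library memcpy(), which then
 * inspects size and alignment itself. */
void copy_row_variable(uint8_t *dst, const uint8_t *src, size_t row_bytes)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, row_bytes);
}
```

Comparing the assembly for the two functions (for example with `gcc -O2 -S`) is one way to see whether the fixed-size copy was expanded inline.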
linaro-toolchain mailing list linaro-toolchain@lists.linaro.org http://lists.linaro.org/mailman/listinfo/linaro-toolchain
On Tue, Sep 27, 2011 at 09:47:33AM +0100, Ramana Radhakrishnan wrote:
It's the unaligned access work and the changes / improvements to memcpy that *might* help in this case. But that of course depends on the compiler knowing when it can do such a thing. Of course, what might be more interesting is the kind of workload analysis that Dave's done in the past with memcpy, to find out the alignment and size of the buffers being copied.
If you guys could take a look at this, there is a potential requirement for the MMWG around libpng optimization; we could fit this in along with other work (possibly vectorizing, etc.) on that component.
On 27 September 2011 14:16, Christian Robottom Reis kiko@linaro.org wrote:
If you guys could take a look at this, there is a potential requirement for the MMWG around libpng optimization; we could fit this in along with other work (possibly vectorizing, etc.) on that component.
It wouldn't take long to analyse the memcpy calls; life would be easier if we had the test program and some details, such as what size images were used in these benchmarks.
Dave
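As a rough idea of what such an analysis of memcpy calls could record, here is a minimal sketch of a counting wrapper. It is not the tooling used in the earlier analysis; the names `profiled_memcpy`, `memcpy_stats` and `size_bucket` are made up for illustration, and the code under test would still need its copies routed through the wrapper (for example with a macro).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SIZE_BUCKETS 16

/* Hypothetical statistics record: copy counts by size and by the
 * low three address bits of source and destination. */
struct copy_stats {
    unsigned long by_size[SIZE_BUCKETS];
    unsigned long src_align[8];
    unsigned long dst_align[8];
};

struct copy_stats memcpy_stats;

/* Bucket 0 holds n == 0; bucket b >= 1 holds 2^(b-1) <= n < 2^b.
 * Anything too large lands in the last bucket. */
unsigned size_bucket(size_t n)
{
    unsigned b = 0;
    while (n > 0 && b < SIZE_BUCKETS - 1) {
        n >>= 1;
        b++;
    }
    return b;
}

/* Record size and alignment, then perform the real copy. */
void *profiled_memcpy(void *dst, const void *src, size_t n)
{
    memcpy_stats.by_size[size_bucket(n)]++;
    memcpy_stats.src_align[(uintptr_t)src & 7]++;
    memcpy_stats.dst_align[(uintptr_t)dst & 7]++;
    return memcpy(dst, src, n);
}

/* Print the non-empty size buckets and all alignment counters. */
void dump_copy_stats(FILE *out)
{
    assert(out != NULL);
    for (unsigned i = 0; i < SIZE_BUCKETS; i++)
        if (memcpy_stats.by_size[i])
            fprintf(out, "size bucket %u: %lu\n", i, memcpy_stats.by_size[i]);
    for (unsigned i = 0; i < 8; i++)
        fprintf(out, "align %u: src %lu dst %lu\n", i,
                memcpy_stats.src_align[i], memcpy_stats.dst_align[i]);
}
```

Calling `dump_copy_stats(stderr)` at the end of a benchmark run would give a first view of which sizes and alignments dominate.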
If you guys could take a look at this, there is a potential requirement for the MMWG around libpng optimization; we could fit this in along with other work (possibly vectorizing, etc.) on that component.
Getting better block operations out of the compiler is something we are interested in, and if we can feed back some of the work that's happening in this area then great! We found a few cases where the compiler could do a better job with memset, especially where you have largish constant structure initializations.
We can do some tests and play with things, but at the end of the day the more specific places where improvements are likely have to come from the MMWG, or whoever else spots that the compiler isn't behaving as expected, in the form of distilled testcases that we can look at. If not, this just becomes a Friday afternoon project for someone in the group. Also, knowing what the workload was that showed this kind of behaviour would be interesting, as Dave points out later in this thread.
cheers Ramana
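For reference, a distilled testcase for the structure-initialization case might look something like the sketch below. The structure and function are hypothetical and are not one of the cases actually found; they only show the shape of a small, self-contained file whose generated code can be inspected.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical structure, large enough that zero-filling it is likely
 * to be done as a block operation rather than field by field. */
struct decoder_state {
    int width;
    int height;
    unsigned char palette[256];
    int flags;
};

/* Mostly-zero constant initializer: the compiler has to choose between
 * a memset() call, a memcpy() from a constant image, or inline stores. */
void init_state(struct decoder_state *s)
{
    assert(s != NULL);
    struct decoder_state tmp = { .width = 640, .height = 480 };
    *s = tmp;
}
```

Compiling this with something like `gcc -O2 -S` and reading the output for `init_state` shows which strategy the compiler picked.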
-- Christian Robottom Reis, Engineering VP Brazil (GMT-3) | [+55] 16 9112 6430 | [+1] 612 216 4935 Linaro.org: Open Source Software for ARM SoCs