While investigating why copying the kernel from NOR was so slow on our platform, I realized that U-Boot was needlessly reading it from 32-bit NOR in 8-bit chunks.
bootm uses memmove(), and the default implementation simply moves u8s around one at a time.
This optimization prefers the memcpy() implementation (done mostly in 32-bit reads and writes) whenever the source and destination do not overlap. On our platform this cuts the copy from 32-bit NOR from 480ms to 140ms.
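For illustration only, the same idea can be sketched as a standalone function outside the U-Boot tree. The names memmove_sketch() and regions_disjoint() are hypothetical and are not part of this patch. The sketch also compares the distance between the two pointers instead of computing dest + count as the patch does; that is a choice made here to sidestep address wraparound, not something the patch relies on.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: returns nonzero when [dest, dest + count) and
 * [src, src + count) share no bytes.
 */
int regions_disjoint(const void *dest, const void *src, size_t count)
{
	uintptr_t d = (uintptr_t)dest;
	uintptr_t s = (uintptr_t)src;

	if (d < s)
		return s - d >= count;
	return d - s >= count;
}

/*
 * Hypothetical stand-in for memmove(): hand disjoint regions to
 * memcpy(), which may copy in wider units, and fall back to a
 * direction-aware byte loop only when the regions overlap.
 */
void *memmove_sketch(void *dest, const void *src, size_t count)
{
	char *tmp;
	const char *s;

	if (src == dest || count == 0)
		return dest;

	if (regions_disjoint(dest, src, count))
		return memcpy(dest, src, count);

	if ((uintptr_t)dest < (uintptr_t)src) {
		/* dest below src: copy forwards */
		tmp = dest;
		s = src;
		while (count--)
			*tmp++ = *s++;
	} else {
		/* dest above src: copy backwards */
		tmp = (char *)dest + count;
		s = (const char *)src + count;
		while (count--)
			*--tmp = *--s;
	}
	return dest;
}
```

Whether the memcpy() path is actually faster depends on the memcpy() implementation in use; the numbers above are for our platform only.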
Signed-off-by: Andy Green <andy.green@linaro.org>
---
 lib/string.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/lib/string.c b/lib/string.c
index c3ad055..96d66e0 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -542,13 +542,21 @@ void * memmove(void * dest,const void *src,size_t count)
 	if (src == dest)
 		return dest;
 
-	if (dest <= src) {
+	if (dest < src) {
+
+		if ((unsigned long)dest + count <= (unsigned long)src)
+			return memcpy(dest, src, count);
+
 		tmp = (char *) dest;
 		s = (char *) src;
 		while (count--)
 			*tmp++ = *s++;
 	} else {
+
+		if ((unsigned long)src + count <= (unsigned long)dest)
+			return memcpy(dest, src, count);
+
 		tmp = (char *) dest + count;
 		s = (char *) src + count;
 		while (count--)