On Sun, Nov 27, 2011 at 8:18 PM, Woodruff, Richard r-woodruff2@ti.com wrote:
From: linaro-dev-bounces@lists.linaro.org [mailto:linaro-dev-bounces@lists.linaro.org] On Behalf Of Mans Rullgard
Do you have an erratum number for this?
This was a very recent bug and has not yet made it into the public errata numbers. Most likely the next PL310 errata update will have this one documented.
Do you have _any_ identifier for it?
ARM expanded erratum 752271 to cover DLF (double linefill) not working until r3p2, in errata version 13.1 (21 Nov 11). The 4460 is r3p1-50rel0 and is impacted.
Thanks a lot. Your posts are very informative as usual.
By the way, do you know whether it is safe to use "SCU Speculative linefills" with Cortex-A9 r2pX and PL310 r3pX? http://infocenter.arm.com/help/topic/com.arm.doc.ddi0407f/BABEBFBH.html
As a quick and dirty test, it can be enabled in 'arch/arm/kernel/smp_scu.c' by setting the extra (1 << 3) bit in the SCU Control Register from the 'scu_enable' function. In my tests, this seems to reduce L2 cache access latency a bit, with an overall ~1.5% performance improvement, at least for 7zip data compression. This is not a huge boost, but it would still be nice to have unless some erratum prevents this feature from being enabled.
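The bit manipulation described above can be modelled on a plain integer, as a rough sketch only. This is not the kernel code and it touches no hardware. The names SCU_ENABLE, SCU_SPEC_LINEFILL and scu_ctrl_with_spec_linefills are hypothetical. Bit 3 comes from the text above, and bit 0 as the SCU enable bit is an assumption taken from the Cortex-A9 MPCore SCU Control Register layout.

```python
# Model of the SCU Control Register value as a plain integer (no MMIO here).
SCU_ENABLE = 1 << 0         # assumed: bit 0 enables the SCU
SCU_SPEC_LINEFILL = 1 << 3  # bit 3: speculative linefills, as in the text


def scu_ctrl_with_spec_linefills(scu_ctrl: int) -> int:
    """Return the control value with the SCU and speculative linefills enabled.

    All other bits of the incoming value are preserved.
    """
    return scu_ctrl | SCU_ENABLE | SCU_SPEC_LINEFILL


if __name__ == "__main__":
    # Starting from an all-zero register, bits 0 and 3 end up set.
    print(hex(scu_ctrl_with_spec_linefills(0x0)))  # prints 0x9
```

In the real 'scu_enable' the equivalent change would be OR-ing the extra bit into the value that is written back to the control register.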
Results of a synthetic random read latency benchmark (the extra overhead caused by L1 cache misses) and of a somewhat more realistic p7zip benchmark on the Origen board (Exynos 4210 @ 1.2GHz) are listed below.
=== SCU Speculative linefills disabled ===
block size : random read access time
      1024 :    0.0 ns
      2048 :    0.0 ns
      4096 :    0.0 ns
      8192 :    0.0 ns
     16384 :    0.0 ns
     32768 :    0.1 ns
     65536 :    9.1 ns
    131072 :   13.8 ns
    262144 :   19.0 ns
    524288 :   21.6 ns
   1048576 :   31.0 ns
   2097152 :   86.2 ns
   4194304 :  117.2 ns
   8388608 :  134.5 ns
  16777216 :  146.4 ns
  33554432 :  155.9 ns
  67108864 :  164.7 ns
# ./7za b -mmt=1
7-Zip (A) 9.20  Copyright (c) 1999-2010 Igor Pavlov  2010-11-18
p7zip Version 9.20 (locale=C,Utf16=off,HugeFiles=on,2 CPUs)
RAM size:     477 MB,  # CPU hardware threads:   2
RAM usage:    419 MB,  # Benchmark threads:      1
Dict        Compressing          |        Decompressing
      Speed Usage    R/U Rating  |    Speed Usage    R/U Rating
       KB/s     %   MIPS   MIPS  |     KB/s     %   MIPS   MIPS
22:     752   100    730    732  |    12665   100   1146   1143
23:     736   100    750    750  |    12469   100   1141   1141
24:     717   100    770    771  |    12289   100   1139   1140
25:     694   100    793    792  |    12103   100   1138   1138
----------------------------------------------------------------
Avr:          100    761    761               100   1141   1141
Tot:          100    951    951
7-Zip (A) 9.20  Copyright (c) 1999-2010 Igor Pavlov  2010-11-18
p7zip Version 9.20 (locale=C,Utf16=off,HugeFiles=on,2 CPUs)
RAM size:     477 MB,  # CPU hardware threads:   2
RAM usage:    419 MB,  # Benchmark threads:      1
Dict        Compressing          |        Decompressing
      Speed Usage    R/U Rating  |    Speed Usage    R/U Rating
       KB/s     %   MIPS   MIPS  |     KB/s     %   MIPS   MIPS
22:     751   100    730    730  |    12675   100   1144   1144
23:     736   100    750    750  |    12483   100   1143   1143
24:     716   100    770    770  |    12300   100   1142   1141
25:     693   100    792    792  |    12113   100   1139   1139
----------------------------------------------------------------
Avr:          100    760    760               100   1142   1142
Tot:          100    951    951
=== SCU Speculative linefills enabled ===
block size : random read access time
      1024 :    0.0 ns
      2048 :    0.0 ns
      4096 :    0.0 ns
      8192 :    0.0 ns
     16384 :    0.0 ns
     32768 :    0.1 ns
     65536 :    7.5 ns
    131072 :   11.1 ns
    262144 :   16.3 ns
    524288 :   19.0 ns
   1048576 :   33.2 ns
   2097152 :   82.9 ns
   4194304 :  113.1 ns
   8388608 :  130.5 ns
  16777216 :  143.2 ns
  33554432 :  151.6 ns
  67108864 :  160.3 ns
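To make the two latency tables easier to compare, here is a small sketch that computes the per-size difference (enabled minus disabled). The numbers are copied from the tables above for block sizes of 64 KiB and larger. The helper name latency_delta is hypothetical.

```python
# Random read access time in ns, keyed by block size in bytes,
# copied from the "disabled" and "enabled" tables above.
disabled = {
    65536: 9.1, 131072: 13.8, 262144: 19.0, 524288: 21.6,
    1048576: 31.0, 2097152: 86.2, 4194304: 117.2, 8388608: 134.5,
    16777216: 146.4, 33554432: 155.9, 67108864: 164.7,
}
enabled = {
    65536: 7.5, 131072: 11.1, 262144: 16.3, 524288: 19.0,
    1048576: 33.2, 2097152: 82.9, 4194304: 113.1, 8388608: 130.5,
    16777216: 143.2, 33554432: 151.6, 67108864: 160.3,
}


def latency_delta(before, after):
    """Return after - before in ns for every block size, rounded to 0.1 ns."""
    return {size: round(after[size] - before[size], 1) for size in before}


deltas = latency_delta(disabled, enabled)
for size, delta in deltas.items():
    print(f"{size:>9} : {delta:+.1f} ns")
```

With these inputs every listed size gets faster except the 1048576 byte point, which is 2.2 ns slower. Single runs like these say nothing about run-to-run noise, so small differences should be read with care.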
# ./7za b -mmt=1
7-Zip (A) 9.20  Copyright (c) 1999-2010 Igor Pavlov  2010-11-18
p7zip Version 9.20 (locale=C,Utf16=off,HugeFiles=on,2 CPUs)
RAM size:     479 MB,  # CPU hardware threads:   2
RAM usage:    419 MB,  # Benchmark threads:      1
Dict        Compressing          |        Decompressing
      Speed Usage    R/U Rating  |    Speed Usage    R/U Rating
       KB/s     %   MIPS   MIPS  |     KB/s     %   MIPS   MIPS
22:     764   100    742    743  |    12721   100   1150   1148
23:     746   100    761    760  |    12535   100   1147   1147
24:     726   100    781    781  |    12362   100   1145   1147
25:     704   100    804    804  |    12170   100   1145   1144
----------------------------------------------------------------
Avr:          100    772    772               100   1147   1147
Tot:          100    959    959
7-Zip (A) 9.20  Copyright (c) 1999-2010 Igor Pavlov  2010-11-18
p7zip Version 9.20 (locale=C,Utf16=off,HugeFiles=on,2 CPUs)
RAM size:     479 MB,  # CPU hardware threads:   2
RAM usage:    419 MB,  # Benchmark threads:      1
Dict        Compressing          |        Decompressing
      Speed Usage    R/U Rating  |    Speed Usage    R/U Rating
       KB/s     %   MIPS   MIPS  |     KB/s     %   MIPS   MIPS
22:     764   100    745    744  |    12732   100   1148   1149
23:     747   100    761    762  |    12542   100   1149   1148
24:     728   100    783    783  |    12366   100   1147   1147
25:     706   100    806    806  |    12179   100   1145   1145
----------------------------------------------------------------
Avr:          100    774    773               100   1147   1147
Tot:          100    960    960