sascha.silbe@flatty:~$ find /sys/block/mmcblk0/device/ -maxdepth 1 -type f -printf '\n%f:\n' -exec cat {} \;
uevent: DRIVER=mmcblk MMC_TYPE=SD MMC_NAME=SD04G MODALIAS=mmc:block
cid: 0353445344303447800218eb1900a900
csd: 400e00325b5900001d8a7f800a404000
scr: 0235800000000000
date: 09/2010
fwrev: 0x0
hwrev: 0x8
manfid: 0x000003
name: SD04G
oemid: 0x5344
serial: 0x0218eb19
type: SD
Size: 3965190144B = 3872256 KiB
flashtest@flatty:~/flashbench$ ./flashbench -a /dev/mmcblk0 --blocksize=1024 --count=1000
align 1073741824 pre 3.05ms on 3.83ms post 3.03ms diff 791µs
align 536870912 pre 2.92ms on 3.79ms post 2.97ms diff 840µs
align 268435456 pre 2.93ms on 3.85ms post 3.03ms diff 871µs
align 134217728 pre 2.91ms on 3.78ms post 2.97ms diff 839µs
align 67108864 pre 2.94ms on 3.84ms post 3.04ms diff 855µs
align 33554432 pre 2.88ms on 3.78ms post 2.97ms diff 853µs
align 16777216 pre 2.94ms on 3.85ms post 3.03ms diff 862µs
align 8388608 pre 2.62ms on 3.5ms post 2.66ms diff 858µs
align 4194304 pre 2.61ms on 3.49ms post 2.65ms diff 857µs
align 2097152 pre 2.61ms on 3.51ms post 2.66ms diff 868µs
align 1048576 pre 2.62ms on 3.5ms post 2.66ms diff 862µs
align 524288 pre 2.63ms on 3.51ms post 2.67ms diff 859µs
align 262144 pre 2.66ms on 2.94ms post 2.67ms diff 280µs
align 131072 pre 2.66ms on 2.94ms post 2.66ms diff 281µs
align 65536 pre 2.67ms on 2.97ms post 2.67ms diff 300µs
align 32768 pre 2.66ms on 2.96ms post 2.67ms diff 293µs
align 16384 pre 2.7ms on 2.95ms post 2.67ms diff 264µs
align 8192 pre 2.69ms on 2.95ms post 2.67ms diff 272µs
align 4096 pre 2.66ms on 2.71ms post 2.66ms diff 52.5µs
align 2048 pre 2.67ms on 2.67ms post 2.67ms diff 2.51µs
Trying to determine max. number of open AUs in non-random mode, 5 samples each:
EBS=524288 ; for NUMAU in $(seq 1 10) ; do echo "Trying $NUMAU open AUs:" ; ./flashbench --open-au --open-au-nr=$NUMAU --erasesize=$EBS /dev/mmcblk0 |tr -s ' ' |tr ' ' '\t' > out.1 ; for nr in $(seq 2 5) ; do ./flashbench --open-au --open-au-nr=$NUMAU --erasesize=$EBS /dev/mmcblk0 |tr -s ' '|cut -d ' ' -f 2 > out.$nr ; done ; paste out.* ; done
Trying 1 open AUs:
512KiB 5.95M/s 4.96M/s 4.99M/s 4.92M/s 5M/s
256KiB 6.47M/s 4.25M/s 5.15M/s 4.63M/s 3.21M/s
128KiB 6.48M/s 4.99M/s 4.83M/s 4.83M/s 3.34M/s
64KiB 6.5M/s 5.5M/s 4.95M/s 4.95M/s 3.19M/s
32KiB 6.13M/s 5.38M/s 5.01M/s 5.06M/s 3.85M/s
16KiB 5.35M/s 5.32M/s 5.29M/s 5.34M/s 3.35M/s
Trying 2 open AUs:
512KiB 4.6M/s 4.06M/s 1.64M/s 4.68M/s 4.4M/s
256KiB 4.4M/s 3.71M/s 4.32M/s 3.7M/s 4.24M/s
128KiB 3.71M/s 4.03M/s 4M/s 3.65M/s 4.19M/s
64KiB 3.89M/s 4.14M/s 4.07M/s 3.63M/s 4.45M/s
32KiB 4.38M/s 4.01M/s 3.92M/s 4.34M/s 3.81M/s
16KiB 3.25M/s 2.7M/s 3.68M/s 3.05M/s 3.72M/s
Trying 3 open AUs:
512KiB 4.56M/s 4.68M/s 4.36M/s 4.48M/s 4.13M/s
256KiB 4.37M/s 4.39M/s 4.18M/s 3.8M/s 3.99M/s
128KiB 4.3M/s 4.12M/s 4.16M/s 3.72M/s 2.3M/s
64KiB 5.33M/s 4.21M/s 4.82M/s 4.28M/s 4.7M/s
32KiB 3.82M/s 4.14M/s 3.8M/s 3.7M/s 4.2M/s
16KiB 3.38M/s 3.15M/s 3.52M/s 3.45M/s 3.19M/s
Trying 4 open AUs:
512KiB 4.48M/s 3.86M/s 4.72M/s 4.43M/s 4.25M/s
256KiB 4.29M/s 2.84M/s 3.64M/s 3.85M/s 3.6M/s
128KiB 4.54M/s 1.69M/s 1.47M/s 4.06M/s 2.41M/s
64KiB 4.58M/s 4.2M/s 4.74M/s 1.93M/s 4.55M/s
32KiB 4.11M/s 4.03M/s 3.71M/s 4.5M/s 1.46M/s
16KiB 3.32M/s 2.64M/s 2.14M/s 3.2M/s 3.13M/s
Trying 5 open AUs:
512KiB 4.19M/s 1.4M/s 4.26M/s 4.21M/s 4.53M/s
256KiB 4.35M/s 4.31M/s 2.07M/s 1.92M/s 3.85M/s
128KiB 4.8M/s 4.65M/s 4.55M/s 3.71M/s 3.89M/s
64KiB 3.54M/s 4.2M/s 4.05M/s 4.33M/s 4.48M/s
32KiB 1.87M/s 3.95M/s 4.07M/s 3.91M/s 3.85M/s
16KiB 2.71M/s 2.63M/s 1.7M/s 1.17M/s 873K/s
Trying 6 open AUs:
512KiB 4.31M/s 4.02M/s 3.91M/s 4.03M/s 4.51M/s
256KiB 4.13M/s 3.87M/s 3.92M/s 3.97M/s 3.99M/s
128KiB 4.1M/s 4.24M/s 4.13M/s 4.09M/s 4.06M/s
64KiB 4.25M/s 4.52M/s 4.47M/s 4.09M/s 4.13M/s
32KiB 4.03M/s 4.09M/s 2.41M/s 4.13M/s 4.4M/s
16KiB 879K/s 1.11M/s 826K/s 889K/s 1.06M/s
Trying 7 open AUs:
512KiB 4.19M/s 4.06M/s 4.11M/s 4.11M/s 4.18M/s
256KiB 4.04M/s 4.38M/s 4.39M/s 4.03M/s 4M/s
128KiB 4.35M/s 4.39M/s 4.49M/s 4.18M/s 4.2M/s
64KiB 4.17M/s 3.87M/s 4.17M/s 3.84M/s 4.17M/s
32KiB 1.15M/s 1M/s 3.82M/s 1.76M/s 1.03M/s
16KiB 1.54M/s 1.44M/s 868K/s 1.29M/s 1.99M/s
Trying 8 open AUs:
512KiB 4.14M/s 4.24M/s 4.33M/s 4.41M/s 4.2M/s
256KiB 4.03M/s 4.12M/s 4.14M/s 3.07M/s 4.04M/s
128KiB 4.28M/s 4.33M/s 3.89M/s 2.19M/s 4.46M/s
64KiB 2.07M/s 1.05M/s 1.49M/s 1.53M/s 1.91M/s
32KiB 1.24M/s 3.85M/s 1.01M/s 2.17M/s 1.56M/s
16KiB 1.59M/s 1.67M/s 3.3M/s 1.03M/s 1.36M/s
Trying 9 open AUs:
512KiB 4.15M/s 4.19M/s 4.06M/s 3.34M/s 4.07M/s
256KiB 4.07M/s 4.19M/s 4.02M/s 4.09M/s 3.97M/s
128KiB 2.55M/s 1.59M/s 4.38M/s 2.62M/s 2.78M/s
64KiB 1.42M/s 2.97M/s 2.88M/s 1.72M/s 2.72M/s
32KiB 1.46M/s 1.05M/s 995K/s 1.42M/s 1.13M/s
16KiB 3.3M/s 1.27M/s 1.98M/s 1.73M/s 1.54M/s
Trying 10 open AUs:
512KiB 4.18M/s 3.57M/s 4.04M/s 4.07M/s 4.15M/s
256KiB 4.18M/s 4.15M/s 3.9M/s 4.1M/s 4.16M/s
128KiB 3.53M/s 2.73M/s 2.48M/s 2.38M/s 3.05M/s
64KiB 1.66M/s 1.72M/s 2.6M/s 2.09M/s 2.63M/s
32KiB 2.71M/s 1.22M/s 1.37M/s 1.19M/s 977K/s
16KiB 906K/s 1.39M/s 1.21M/s 1.44M/s 1.55M/s
Trying to determine max. number of open AUs in random mode, 5 samples each:
EBS=524288 ; for NUMAU in $(seq 1 10) ; do echo "Trying $NUMAU open AUs:" ; ./flashbench --open-au --random --open-au-nr=$NUMAU --erasesize=$EBS /dev/mmcblk0 |tr -s ' ' |tr ' ' '\t' > out.1 ; for nr in $(seq 2 5) ; do ./flashbench --open-au --random --open-au-nr=$NUMAU --erasesize=$EBS /dev/mmcblk0 |tr -s ' '|cut -d ' ' -f 2 > out.$nr ; done ; paste out.* ; done
Trying 1 open AUs:
512KiB 6.03M/s 4.23M/s 3.99M/s 3.82M/s 4.49M/s
256KiB 6.47M/s 3.83M/s 3.74M/s 3.61M/s 4.14M/s
128KiB 6.48M/s 3.69M/s 3.51M/s 3.62M/s 3.56M/s
64KiB 6.46M/s 3.36M/s 3.14M/s 3.54M/s 3.33M/s
32KiB 5.95M/s 3.94M/s 3.87M/s 3.76M/s 3.76M/s
16KiB 5.15M/s 3.77M/s 2.68M/s 3.8M/s 3.72M/s
Trying 2 open AUs:
512KiB 4.58M/s 4.7M/s 4.58M/s 5.04M/s 4.56M/s
256KiB 4.39M/s 3.98M/s 4.19M/s 4.08M/s 4.3M/s
128KiB 4.07M/s 4.14M/s 4.1M/s 3.99M/s 3.47M/s
64KiB 3.44M/s 4.55M/s 3.43M/s 4.05M/s 3.67M/s
32KiB 4.42M/s 4.34M/s 4.38M/s 3.85M/s 4.39M/s
16KiB 3.33M/s 3.44M/s 3.28M/s 3.28M/s 3.3M/s
Trying 3 open AUs:
512KiB 3.89M/s 4.2M/s 3.93M/s 4.17M/s 4.43M/s
256KiB 4.59M/s 3.86M/s 3.89M/s 4.07M/s 4.09M/s
128KiB 4.07M/s 4.17M/s 3.71M/s 4.15M/s 4.17M/s
64KiB 3.5M/s 4.81M/s 3.86M/s 4.7M/s 4.64M/s
32KiB 1.05M/s 3.92M/s 4.23M/s 4.2M/s 3.96M/s
16KiB 3.35M/s 3.46M/s 3.15M/s 3.19M/s 3.44M/s
Trying 4 open AUs:
512KiB 3.99M/s 3.88M/s 4.13M/s 4.1M/s 2.56M/s
256KiB 4.53M/s 2.87M/s 4M/s 3.8M/s 4.29M/s
128KiB 4.44M/s 4.04M/s 3.11M/s 4.29M/s 4.19M/s
64KiB 4.92M/s 1.45M/s 4.85M/s 3.24M/s 3.58M/s
32KiB 3.89M/s 4.01M/s 1.44M/s 1.56M/s 4.44M/s
16KiB 3.42M/s 3.36M/s 3.4M/s 3.35M/s 2.61M/s
Trying 5 open AUs:
512KiB 3.97M/s 2.08M/s 1.5M/s 3.9M/s 4.03M/s
256KiB 2.45M/s 2.96M/s 4.5M/s 2.28M/s 1.66M/s
128KiB 4.26M/s 4.39M/s 4.56M/s 1.45M/s 2.16M/s
64KiB 3.46M/s 3.88M/s 4.42M/s 4.74M/s 4.21M/s
32KiB 2.28M/s 3.19M/s 4.27M/s 4.46M/s 3.77M/s
16KiB 2.06M/s 1.81M/s 3.26M/s 3.34M/s 2.69M/s
Trying 6 open AUs:
512KiB 4.23M/s 3.91M/s 4.05M/s 4.46M/s 4.2M/s
256KiB 1.65M/s 3.89M/s 4.51M/s 4.29M/s 3.98M/s
128KiB 4.29M/s 4.27M/s 4.27M/s 4.31M/s 4.09M/s
64KiB 2.13M/s 4.38M/s 4.13M/s 4.11M/s 4.28M/s
32KiB 3.82M/s 3.84M/s 3.96M/s 4.33M/s 1.78M/s
16KiB 1.07M/s 646K/s 975K/s 1.13M/s 1.17M/s
Trying 7 open AUs:
512KiB 4.19M/s 4.29M/s 4.39M/s 4.13M/s 4.03M/s
256KiB 3.97M/s 3.98M/s 4.08M/s 4.23M/s 4.13M/s
128KiB 4.14M/s 4.31M/s 4.37M/s 4.54M/s 4.25M/s
64KiB 4.35M/s 4.4M/s 4.27M/s 4.58M/s 3.18M/s
32KiB 3.09M/s 1.44M/s 2.32M/s 1.12M/s 1.04M/s
16KiB 1.03M/s 966K/s 787K/s 1.84M/s 3.23M/s
Trying 8 open AUs:
512KiB 4.33M/s 3.81M/s 4.02M/s 3.99M/s 4.02M/s
256KiB 4.08M/s 4.13M/s 4.06M/s 4.03M/s 3.87M/s
128KiB 4.35M/s 4.33M/s 4.69M/s 4.31M/s 1.85M/s
64KiB 2.83M/s 1.52M/s 1.38M/s 2.55M/s 3.81M/s
32KiB 1.54M/s 3.25M/s 1.7M/s 1.01M/s 1.43M/s
16KiB 883K/s 1.29M/s 1.04M/s 1.7M/s 1.21M/s
Trying 9 open AUs:
512KiB 3.92M/s 2.78M/s 4.01M/s 4.01M/s 3.7M/s
256KiB 4.13M/s 3.77M/s 4.11M/s 4.04M/s 4.05M/s
128KiB 4.31M/s 3.25M/s 1.88M/s 4.4M/s 2.81M/s
64KiB 4.42M/s 1.8M/s 2.05M/s 2.8M/s 1.84M/s
32KiB 1.51M/s 1.46M/s 1.37M/s 1.2M/s 1.32M/s
16KiB 1.24M/s 1.68M/s 1.31M/s 1.71M/s 1.46M/s
Trying 10 open AUs:
512KiB 4.03M/s 4.4M/s 4.08M/s 4.22M/s 4.14M/s
256KiB 3.98M/s 3.79M/s 3.93M/s 4M/s 4.07M/s
128KiB 3M/s 1.71M/s 4.23M/s 4.04M/s 1.62M/s
64KiB 2.87M/s 1.99M/s 2.01M/s 1.89M/s 3.44M/s
32KiB 1.02M/s 1.01M/s 1.23M/s 1.75M/s 1.17M/s
16KiB 1.67M/s 972K/s 1.11M/s 970K/s 1.58M/s
flashtest@flatty:~/flashbench$ ./flashbench --findfat --random --erasesize=524288 /dev/mmcblk0
512KiB 5.33M/s 6.24M/s 6.23M/s 6.24M/s 6.23M/s 6.23M/s
256KiB 6.21M/s 6.23M/s 6.22M/s 6.25M/s 6.25M/s 6.08M/s
128KiB 6.23M/s 6.25M/s 6.25M/s 6.26M/s 6M/s 5.75M/s
64KiB 6.02M/s 6.04M/s 6.15M/s 4M/s 4.01M/s 4.5M/s
32KiB 5.2M/s 5.9M/s 5.9M/s 5.88M/s 5.91M/s 5.91M/s
16KiB 5.2M/s 5.2M/s 5.19M/s 5.17M/s 5.21M/s 5.19M/s
flashtest@flatty:~/flashbench$
I hope you can read more into that data than I do. I have a feeling 5 samples are not enough to reliably determine the number of open AUs for this card.
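To put a number on the worry about five samples, here is a minimal Python sketch that summarises one row of the results with its median and relative spread. The helper names `parse_rate` and `summarize` are made up for this illustration, and treating `K` and `M` as decimal multipliers is an assumption (only the ratios matter here). The two rows are the 16KiB lines for 5 and 6 open AUs from the non-random run above.

```python
import statistics

# Assumption: K and M are decimal multipliers; only ratios matter for this check.
MULTIPLIERS = {"K": 1e3, "M": 1e6}

def parse_rate(text):
    """Convert a throughput string such as '4.31M/s' or '873K/s' to bytes per second."""
    if not text.endswith("/s"):
        raise ValueError(f"unexpected rate: {text!r}")
    body = text[:-2]                      # e.g. '4.31M'
    return float(body[:-1]) * MULTIPLIERS[body[-1]]

def summarize(samples):
    """Return (median, relative spread) for a list of throughput strings."""
    rates = [parse_rate(s) for s in samples]
    median = statistics.median(rates)
    spread = (max(rates) - min(rates)) / median
    return median, spread

# 16KiB rows copied from the non-random run with --erasesize=524288.
rows = {
    5: "2.71M/s 2.63M/s 1.7M/s 1.17M/s 873K/s",
    6: "879K/s 1.11M/s 826K/s 889K/s 1.06M/s",
}

for num_au, line in rows.items():
    median, spread = summarize(line.split())
    print(f"{num_au} open AUs, 16KiB: median {median / 1e6:.2f}M/s, "
          f"relative spread {spread:.2f}")
```

A row whose spread is close to or above 1 (as with 5 open AUs here) probably needs more samples before any conclusion is drawn from it.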
Sascha
Hi Sascha,
Thanks for your measurements. The results are highly unusual and definitely require more research. The 512 KB erase block in particular is strange, so it's quite likely that there is something more going on.
On Saturday 02 April 2011, Sascha Silbe wrote:
Size: 3965190144B = 3872256 KiB
The size is a very good first indication: It is not a multiple of full megabytes, but a multiple of 1.5 MB (2521 times 1.5 MiB). For a card produced in 2010, that means it's quite likely to actually be based on 1.5 MB erase blocks.
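The arithmetic behind this observation can be checked with a few lines of Python. This is only a sanity check of the numbers quoted in the thread; the constant names are chosen for the illustration.

```python
SIZE_BYTES = 3965190144        # "Size:" line from the card dump
MIB = 1024 * 1024
CANDIDATE = 3 * MIB // 2       # 1.5 MiB = 1572864 bytes

# Not a whole number of MiB: half a MiB is left over.
print(SIZE_BYTES % MIB)                 # 524288

# But an exact multiple of 1.5 MiB.
print(divmod(SIZE_BYTES, CANDIDATE))    # (2521, 0)
```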
This is not well supported in flashbench, but what you can try is to pass a blocksize and erasesize that are multiples of three:
flashbench -a /dev/mmcblk0 --blocksize=1536 --count=100
flashbench --open-au --open-au-nr=5 --erasesize=$[1536 * 1024] --blocksize=1536 /dev/mmcblk0
When you have the correct erase block size, the results of the other tests should make much more sense.
Arnd
Excerpts from Arnd Bergmann's message of Mon Apr 04 04:27:52 +0200 2011:
This is not well supported in flashbench, but what you can try is pass a blocksize and erasesize that are a multiple of three:
flashbench -a /dev/mmcblk0 --blocksize=1536 --count=100
I'm afraid that doesn't work:
flashtest@flatty:~/flashbench$ time ./flashbench -a /dev/mmcblk0 --blocksize=1536 --count=100
time_read: Invalid argument
time_read: Invalid argument
time_read: Invalid argument
[...]
physical_block_size, hw_sector_size, logical_block_size and minimum_io_size all read 512, so I don't understand why the pread() fails with odd multiples of 512 (512, 1536, 2560). Multiples of 1024 (1024, 2048, 3072) work fine.
Sascha
On Monday 04 April 2011, Sascha Silbe wrote:
Excerpts from Arnd Bergmann's message of Mon Apr 04 04:27:52 +0200 2011:
This is not well supported in flashbench, but what you can try is pass a blocksize and erasesize that are a multiple of three:
flashbench -a /dev/mmcblk0 --blocksize=1536 --count=100
I'm afraid that doesn't work:
flashtest@flatty:~/flashbench$ time ./flashbench -a /dev/mmcblk0 --blocksize=1536 --count=100
time_read: Invalid argument
time_read: Invalid argument
time_read: Invalid argument
[...]
physical_block_size, hw_sector_size, logical_block_size and minimum_io_size all read 512, so I don't understand why the pread() fails with odd multiples of 512 (512, 1536, 2560). Multiples of 1024 (1024, 2048, 3072) work fine.
Ah, right, sorry about that. It needs to be 3072.
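One plausible reading of why 1536 fails and 3072 works, sketched in Python. It rests on two assumptions that the thread does not state: that the -a test issues a direct-I/O read starting half a block before each boundary, and that direct I/O needs offsets aligned to the 512-byte logical block size. The function name is hypothetical.

```python
SECTOR = 512  # logical_block_size reported in the thread

def straddling_read_is_aligned(blocksize, sector=SECTOR):
    """True if a read that starts blocksize/2 before a boundary is sector-aligned.

    Assumption: the boundary itself is sector-aligned, so only the
    half-block offset and the read length need checking.
    """
    return blocksize % sector == 0 and (blocksize // 2) % sector == 0

for bs in (512, 1024, 1536, 2048, 2560, 3072):
    verdict = "ok" if straddling_read_is_aligned(bs) else "EINVAL expected"
    print(bs, verdict)
```

Under these assumptions the sketch reproduces the pattern reported above: 512, 1536 and 2560 fail, while 1024, 2048 and 3072 work.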
Also, better apply this patch to make sure that everything starts at multiples of three.
Arnd
diff --git a/flashbench.c b/flashbench.c
index 0a1016f..c95e916 100644
--- a/flashbench.c
+++ b/flashbench.c
@@ -322,17 +322,19 @@ static int try_read_alignment(struct device *dev, int tries, int count,

 static int try_read_alignments(struct device *dev, int tries, int blocksize)
 {
-	const int count = 7;
+	const int count = 10;
 	int ret;
 	off_t align, maxalign;

-	/* make sure we can fit eight power-of-two blocks in the device */
-	for (maxalign = blocksize * 2; maxalign < dev->size / count; maxalign *= 2)
+	/* make sure we can fit ten 3*power-of-two blocks in the device */
+	for (maxalign = blocksize * 2 * 3; maxalign < dev->size / count; maxalign *= 2)
 		;

 	for (align = maxalign; align >= blocksize * 2; align /= 2) {
 		ret = try_read_alignment(dev, tries, count, maxalign, align, blocksize);
 		returnif (ret);
+		ret = try_read_alignment(dev, tries, count, maxalign, align / 3 * 2, blocksize);
+		returnif (ret);
 	}

 	return 0;
@@ -538,7 +540,7 @@ static int try_open_au(struct device *dev, unsigned int erasesize,
 		{ (random ? O_OFF_RAND : O_OFF_LIN), erasesize / blocksize, -1},
 		{O_REDUCE, .aggregate = A_AVERAGE},
-		{O_OFF_RAND, count, 4 * erasesize}, {O_WRITE_RAND},
+		{O_OFF_RAND, count, 6 * erasesize}, {O_WRITE_RAND},
 		{O_NEWLINE},
 		{O_END},
 		{O_END},
Excerpts from Arnd Bergmann's message of Mon Apr 04 17:12:49 +0200 2011:
Ah, right, sorry about that. It needs to be 3072.
Also, better apply this patch to make sure that everything starts at multiples of three.
Without your patch 1.5MB EBS seems like a good bet:
align 805306368 pre 5.42ms on 6.23ms post 2.87ms diff 2.08ms
align 402653184 pre 5.41ms on 5.98ms post 2.86ms diff 1.85ms
align 201326592 pre 5.58ms on 6.19ms post 2.87ms diff 1.96ms
align 100663296 pre 4.6ms on 5.89ms post 2.87ms diff 2.16ms
align 50331648 pre 5.08ms on 6.23ms post 2.87ms diff 2.26ms
align 25165824 pre 5.32ms on 6.23ms post 2.84ms diff 2.15ms
align 12582912 pre 4.3ms on 6.23ms post 2.86ms diff 2.65ms
align 6291456 pre 4.77ms on 6.18ms post 2.58ms diff 2.5ms
align 3145728 pre 3.9ms on 5.56ms post 2.58ms diff 2.32ms
align 1572864 pre 5.36ms on 6.03ms post 2.59ms diff 2.05ms
align 786432 pre 3.26ms on 3.47ms post 2.6ms diff 536µs
align 393216 pre 3.35ms on 3.42ms post 2.58ms diff 456µs
align 196608 pre 3.34ms on 3.43ms post 2.57ms diff 478µs
align 98304 pre 3.26ms on 3.42ms post 2.57ms diff 506µs
align 49152 pre 3.32ms on 3.45ms post 2.57ms diff 509µs
align 24576 pre 3.31ms on 3.4ms post 2.57ms diff 462µs
align 12288 pre 3.25ms on 2.9ms post 2.75ms diff -106663
align 6144 pre 2.6ms on 2.55ms post 2.68ms diff -86041n
With your patch applied however the results are a bit odd:
align 603979776 pre 5.46ms on 5.85ms post 2.86ms diff 1.69ms
align 402653184 pre 5.54ms on 5.85ms post 2.85ms diff 1.66ms
align 301989888 pre 5.09ms on 6.09ms post 2.86ms diff 2.12ms
align 201326592 pre 5.48ms on 6.12ms post 2.87ms diff 1.95ms
align 150994944 pre 5.46ms on 6.08ms post 2.85ms diff 1.93ms
align 100663296 pre 5.19ms on 6.12ms post 2.87ms diff 2.09ms
align 75497472 pre 4.56ms on 5.94ms post 2.85ms diff 2.24ms
align 50331648 pre 5.69ms on 6.03ms post 2.87ms diff 1.75ms
align 37748736 pre 4.69ms on 6.12ms post 2.86ms diff 2.35ms
align 25165824 pre 4.94ms on 6.14ms post 2.86ms diff 2.24ms
align 18874368 pre 5.45ms on 6.15ms post 2.86ms diff 2ms
align 12582912 pre 4.97ms on 6.08ms post 2.85ms diff 2.17ms
align 9437184 pre 5.01ms on 5.59ms post 2.85ms diff 1.66ms
align 6291456 pre 5.45ms on 6.07ms post 2.67ms diff 2.01ms
align 4718592 pre 5.18ms on 5.67ms post 2.67ms diff 1.74ms
align 3145728 pre 5.2ms on 5.66ms post 2.67ms diff 1.73ms
align 2359296 pre 3.39ms on 3.53ms post 2.67ms diff 500µs
align 1572864 pre 5.32ms on 5.95ms post 2.68ms diff 1.95ms
align 1179648 pre 3.5ms on 3.53ms post 2.68ms diff 442µs
align 786432 pre 3.42ms on 3.57ms post 2.68ms diff 522µs
align 589824 pre 3.4ms on 3.56ms post 2.65ms diff 535µs
align 393216 pre 3.47ms on 3.57ms post 2.67ms diff 500µs
align 294912 pre 3.4ms on 3.54ms post 2.66ms diff 511µs
align 196608 pre 3.36ms on 3.55ms post 2.66ms diff 538µs
align 147456 pre 3.38ms on 3.6ms post 2.67ms diff 575µs
align 98304 pre 3.44ms on 3.57ms post 2.67ms diff 515µs
align 73728 pre 3.46ms on 3.51ms post 2.68ms diff 445µs
align 49152 pre 3.32ms on 3.55ms post 2.66ms diff 560µs
align 36864 pre 3.4ms on 2.98ms post 2.85ms diff -143939
align 24576 pre 3.41ms on 3.51ms post 2.66ms diff 484µs
align 18432 pre 3.43ms on 3.49ms post 3.48ms diff 35.3µs
align 12288 pre 3.32ms on 2.98ms post 2.85ms diff -105911
align 9216 pre 2.75ms on 2.84ms post 3.14ms diff -104004
align 6144 pre 2.68ms on 2.64ms post 2.75ms diff -69256n
Note the times for 1.5, 2.25 and 3 MB. Around 24k it's also a bit off.
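One way to read the 1.5, 2.25 and 3 MB entries, sketched in Python under the 1.5 MiB erase block hypothesis from earlier in the thread: alignments that are whole multiples of 1572864 bytes should show a large "diff", all others a small one. The function name is made up for the illustration, and the sample values are copied from the patched run in this message, converted to microseconds.

```python
ERASE_BLOCK = 1536 * 1024  # 1.5 MiB hypothesis, 1572864 bytes

# (alignment in bytes, "diff" in microseconds) from the patched run.
samples = [
    (4718592, 1740.0),   # 4.5 MiB
    (3145728, 1730.0),   # 3 MiB
    (2359296, 500.0),    # 2.25 MiB
    (1572864, 1950.0),   # 1.5 MiB
    (1179648, 442.0),
    (786432, 522.0),
]

def is_erase_block_multiple(align, erase_block=ERASE_BLOCK):
    """True if reads straddling `align` would cross a hypothetical erase block edge."""
    return align % erase_block == 0

slow = [diff for align, diff in samples if is_erase_block_multiple(align)]
fast = [diff for align, diff in samples if not is_erase_block_multiple(align)]

print("multiples of 1.5 MiB:", slow)
print("other alignments:    ", fast)
print("smallest 'slow' diff vs largest 'fast' diff:", min(slow), max(fast))
```

On these six rows the split is clean: 2.25 MiB is not a multiple of 1.5 MiB and falls into the fast group, which fits the hypothesis rather than contradicting it.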
Sascha
On Monday 04 April 2011, Sascha Silbe wrote:
Excerpts from Arnd Bergmann's message of Mon Apr 04 17:12:49 +0200 2011:
Ah, right, sorry about that. It needs to be 3072.
Also, better apply this patch to make sure that everything starts at multiples of three.
Without your patch 1.5MB EBS seems like a good bet:
align 805306368 pre 5.42ms on 6.23ms post 2.87ms diff 2.08ms
align 402653184 pre 5.41ms on 5.98ms post 2.86ms diff 1.85ms
align 201326592 pre 5.58ms on 6.19ms post 2.87ms diff 1.96ms
align 100663296 pre 4.6ms on 5.89ms post 2.87ms diff 2.16ms
align 50331648 pre 5.08ms on 6.23ms post 2.87ms diff 2.26ms
align 25165824 pre 5.32ms on 6.23ms post 2.84ms diff 2.15ms
align 12582912 pre 4.3ms on 6.23ms post 2.86ms diff 2.65ms
align 6291456 pre 4.77ms on 6.18ms post 2.58ms diff 2.5ms
align 3145728 pre 3.9ms on 5.56ms post 2.58ms diff 2.32ms
align 1572864 pre 5.36ms on 6.03ms post 2.59ms diff 2.05ms
align 786432 pre 3.26ms on 3.47ms post 2.6ms diff 536µs
align 393216 pre 3.35ms on 3.42ms post 2.58ms diff 456µs
align 196608 pre 3.34ms on 3.43ms post 2.57ms diff 478µs
align 98304 pre 3.26ms on 3.42ms post 2.57ms diff 506µs
align 49152 pre 3.32ms on 3.45ms post 2.57ms diff 509µs
align 24576 pre 3.31ms on 3.4ms post 2.57ms diff 462µs
align 12288 pre 3.25ms on 2.9ms post 2.75ms diff -106663
align 6144 pre 2.6ms on 2.55ms post 2.68ms diff -86041n
Ok, that is pretty clear.
With your patch applied however the results are a bit odd:
align 603979776 pre 5.46ms on 5.85ms post 2.86ms diff 1.69ms
align 402653184 pre 5.54ms on 5.85ms post 2.85ms diff 1.66ms
align 301989888 pre 5.09ms on 6.09ms post 2.86ms diff 2.12ms
align 201326592 pre 5.48ms on 6.12ms post 2.87ms diff 1.95ms
align 150994944 pre 5.46ms on 6.08ms post 2.85ms diff 1.93ms
align 100663296 pre 5.19ms on 6.12ms post 2.87ms diff 2.09ms
align 75497472 pre 4.56ms on 5.94ms post 2.85ms diff 2.24ms
align 50331648 pre 5.69ms on 6.03ms post 2.87ms diff 1.75ms
align 37748736 pre 4.69ms on 6.12ms post 2.86ms diff 2.35ms
align 25165824 pre 4.94ms on 6.14ms post 2.86ms diff 2.24ms
align 18874368 pre 5.45ms on 6.15ms post 2.86ms diff 2ms
align 12582912 pre 4.97ms on 6.08ms post 2.85ms diff 2.17ms
align 9437184 pre 5.01ms on 5.59ms post 2.85ms diff 1.66ms
align 6291456 pre 5.45ms on 6.07ms post 2.67ms diff 2.01ms
align 4718592 pre 5.18ms on 5.67ms post 2.67ms diff 1.74ms
align 3145728 pre 5.2ms on 5.66ms post 2.67ms diff 1.73ms
align 2359296 pre 3.39ms on 3.53ms post 2.67ms diff 500µs
align 1572864 pre 5.32ms on 5.95ms post 2.68ms diff 1.95ms
align 1179648 pre 3.5ms on 3.53ms post 2.68ms diff 442µs
align 786432 pre 3.42ms on 3.57ms post 2.68ms diff 522µs
align 589824 pre 3.4ms on 3.56ms post 2.65ms diff 535µs
align 393216 pre 3.47ms on 3.57ms post 2.67ms diff 500µs
align 294912 pre 3.4ms on 3.54ms post 2.66ms diff 511µs
align 196608 pre 3.36ms on 3.55ms post 2.66ms diff 538µs
align 147456 pre 3.38ms on 3.6ms post 2.67ms diff 575µs
align 98304 pre 3.44ms on 3.57ms post 2.67ms diff 515µs
align 73728 pre 3.46ms on 3.51ms post 2.68ms diff 445µs
align 49152 pre 3.32ms on 3.55ms post 2.66ms diff 560µs
align 36864 pre 3.4ms on 2.98ms post 2.85ms diff -143939
align 24576 pre 3.41ms on 3.51ms post 2.66ms diff 484µs
align 18432 pre 3.43ms on 3.49ms post 3.48ms diff 35.3µs
align 12288 pre 3.32ms on 2.98ms post 2.85ms diff -105911
align 9216 pre 2.75ms on 2.84ms post 3.14ms diff -104004
align 6144 pre 2.68ms on 2.64ms post 2.75ms diff -69256n
Note the times for 1.5, 2.25 and 3 MB. Around 24k it's also a bit off.
I should have made clearer that with the patch applied, you no longer need to pass an odd blocksize, but again use 1024 byte blocks with the -a test, though not the --open-au test. The test above tries both power-of-two multiples of the block size and three times those values.
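To make the sequence of tested alignments concrete, here is a Python transliteration of the loop in `try_read_alignments()` as it appears in the patch. It is a sketch for illustration, not flashbench itself: the function name `alignments` and the `patched` switch are made up, and the device size is the one from the card dump earlier in the thread.

```python
def alignments(dev_size, blocksize, patched=True):
    """List the alignments the -a test would try, before or after the patch."""
    count = 10 if patched else 7
    maxalign = blocksize * 2 * (3 if patched else 1)
    # Same stop condition as the C loop: double until dev_size / count is reached.
    while maxalign < dev_size // count:
        maxalign *= 2

    result = []
    align = maxalign
    while align >= blocksize * 2:
        result.append(align)                 # 3 * power-of-two multiple (patched)
        if patched:
            result.append(align // 3 * 2)    # the plain power-of-two neighbour
        align //= 2
    return result

DEV_SIZE = 3965190144  # bytes, from the "Size:" line

seq = alignments(DEV_SIZE, 1024, patched=True)
print(seq[:4])    # largest alignments tried with --blocksize=1024
print(seq[-2:])   # smallest alignments tried
```

With the patch and `--blocksize=1024`, every second entry is a power of two and the entries in between are three times a power of two, so both erase block candidates are probed in one run. With `--blocksize=3072` the same loop yields 9 and 6 times powers of two, which is where the 2.25 MB entry comes from.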
Arnd
Excerpts from Arnd Bergmann's message of Tue Apr 05 00:17:14 +0200 2011:
I should have made clearer that with the patch applied, you no longer need to pass an odd blocksize, but again use 1024 byte blocks with the -a test, though not the --open-au test. The test above tries both power-of-two multiples of the block size and three times those values.
Ah, I see. Here we go:
flashtest@flatty:~/flashbench$ time ./flashbench -a /dev/mmcblk[0-9] --blocksize=1024 --count=100 | tee ~/sandisk_a_1k.result
align 402653184 pre 5.06ms on 5.99ms post 2.74ms diff 2.09ms
align 268435456 pre 3.44ms on 3.8ms post 3.67ms diff 244µs
align 201326592 pre 4.74ms on 5.83ms post 2.77ms diff 2.07ms
align 134217728 pre 2.78ms on 2.89ms post 2.92ms diff 32.5µs
align 100663296 pre 5.04ms on 5.65ms post 2.74ms diff 1.76ms
align 67108864 pre 3.48ms on 3.71ms post 3.68ms diff 135µs
align 50331648 pre 5.53ms on 5.93ms post 2.73ms diff 1.8ms
align 33554432 pre 2.74ms on 2.89ms post 2.93ms diff 52.1µs
align 25165824 pre 5.29ms on 5.66ms post 2.78ms diff 1.62ms
align 16777216 pre 3.5ms on 3.79ms post 3.67ms diff 204µs
align 12582912 pre 5.15ms on 5.77ms post 2.75ms diff 1.82ms
align 8388608 pre 2.52ms on 2.69ms post 2.71ms diff 72.9µs
align 6291456 pre 4.69ms on 5.46ms post 2.54ms diff 1.85ms
align 4194304 pre 3.23ms on 3.46ms post 3.31ms diff 189µs
align 3145728 pre 4.79ms on 5.41ms post 2.54ms diff 1.74ms
align 2097152 pre 2.57ms on 2.72ms post 2.71ms diff 77.8µs
align 1572864 pre 4.63ms on 5.81ms post 2.61ms diff 2.2ms
align 1048576 pre 3.26ms on 3.49ms post 3.29ms diff 215µs
align 786432 pre 3.28ms on 3.45ms post 2.58ms diff 524µs
align 524288 pre 2.52ms on 2.69ms post 2.69ms diff 88µs
align 393216 pre 3.3ms on 3.5ms post 2.55ms diff 576µs
align 262144 pre 3.35ms on 3.32ms post 3.32ms diff -17551n
align 196608 pre 3.31ms on 3.48ms post 2.54ms diff 560µs
align 131072 pre 2.55ms on 2.7ms post 2.7ms diff 76.4µs
align 98304 pre 3.39ms on 3.48ms post 2.57ms diff 496µs
align 65536 pre 3.12ms on 3.49ms post 3.33ms diff 259µs
align 49152 pre 3.31ms on 3.51ms post 2.57ms diff 575µs
align 32768 pre 2.52ms on 2.69ms post 2.69ms diff 83.3µs
align 24576 pre 3.24ms on 3.46ms post 2.55ms diff 570µs
align 16384 pre 2.96ms on 3.38ms post 3.29ms diff 256µs
align 12288 pre 2.69ms on 2.76ms post 2.67ms diff 75.5µs
align 8192 pre 2.53ms on 2.68ms post 2.68ms diff 74.9µs
align 6144 pre 2.47ms on 2.5ms post 2.52ms diff 2.79µs
align 4096 pre 2.51ms on 2.58ms post 2.5ms diff 80.4µs
align 3072 pre 2.55ms on 2.49ms post 2.55ms diff -54414n
align 2048 pre 2.5ms on 2.47ms post 2.47ms diff -12031n
flashtest@flatty:~/flashbench$ EBS=1572864 ; for NUMAU in $(seq 1 10) ; do echo "Trying $NUMAU open AUs:" ; ./flashbench --open-au --random --open-au-nr=$NUMAU --erasesize=$EBS --blocksize=1536 /dev/mmcblk[0-9] |tr -s ' ' |tr ' ' '\t' > out.1 ; for nr in $(seq 2 5) ; do ./flashbench --open-au --random --open-au-nr=$NUMAU --erasesize=$EBS --blocksize=1536 /dev/mmcblk[0-9] |tr -s ' '|cut -d ' ' -f 2 > out.$nr ; done ; paste out.* ; done | tee ~/sandisk_openau_random_1.5M_1-10AUs.result
Trying 1 open AUs:
1.5MiB 6.12M/s 4.2M/s 3.81M/s 6.31M/s 6.33M/s
768KiB 6.31M/s 3.97M/s 4.03M/s 6.34M/s 6.26M/s
384KiB 6.31M/s 4.11M/s 3.83M/s 4.54M/s 5.8M/s
192KiB 6.3M/s 3.88M/s 4.44M/s 4.68M/s 6.13M/s
96KiB 4.62M/s 4.19M/s 4.28M/s 4.17M/s 5.18M/s
48KiB 3.64M/s 3.34M/s 3.66M/s 4.26M/s 4.27M/s
24KiB 3.32M/s 3.18M/s 3.4M/s 4.26M/s 3.95M/s
12KiB 2.5M/s 2.45M/s 2.42M/s 4.53M/s 2.97M/s
6KiB 970K/s 1.02M/s 997K/s 2.03M/s 1.8M/s
3KiB 499K/s 478K/s 964K/s 793K/s 1.07M/s
1.5KiB 297K/s 296K/s 516K/s 598K/s 515K/s
Trying 2 open AUs:
1.5MiB 6.01M/s 4.62M/s 4.55M/s 4.66M/s 4.33M/s
768KiB 6.42M/s 5.02M/s 4.43M/s 4.62M/s 4.44M/s
384KiB 6.12M/s 4.81M/s 4.36M/s 4.55M/s 4.49M/s
192KiB 5.33M/s 4.81M/s 4.39M/s 4.52M/s 4.46M/s
96KiB 1.89M/s 4.34M/s 4.37M/s 4.29M/s 4.29M/s
48KiB 3.98M/s 4.1M/s 3.05M/s 4.11M/s 4.12M/s
24KiB 3.96M/s 2.9M/s 3.95M/s 3.16M/s 3.34M/s
12KiB 2.93M/s 2.91M/s 2.89M/s 2.87M/s 2.91M/s
6KiB 879K/s 769K/s 719K/s 773K/s 747K/s
3KiB 358K/s 337K/s 355K/s 338K/s 338K/s
1.5KiB 195K/s 202K/s 199K/s 201K/s 201K/s
Trying 3 open AUs:
1.5MiB 5.76M/s 4.92M/s 5.88M/s 6.15M/s 4.93M/s
768KiB 5.04M/s 5.06M/s 5.31M/s 5.79M/s 4.91M/s
384KiB 4.56M/s 4.8M/s 4.78M/s 4.69M/s 4.77M/s
192KiB 4.16M/s 5.07M/s 4.06M/s 4.72M/s 5.2M/s
96KiB 2.71M/s 2.69M/s 2.02M/s 1.84M/s 1.73M/s
48KiB 3.57M/s 2.32M/s 3.45M/s 4M/s 4.04M/s
24KiB 2.53M/s 3.87M/s 2.85M/s 2.46M/s 3.86M/s
12KiB 2.81M/s 1.82M/s 2.12M/s 2.33M/s 1.77M/s
6KiB 626K/s 637K/s 627K/s 630K/s 631K/s
3KiB 289K/s 293K/s 279K/s 287K/s 290K/s
1.5KiB 165K/s 166K/s 165K/s 165K/s 164K/s
Trying 4 open AUs:
1.5MiB 4.56M/s 4.78M/s 4.81M/s 4.45M/s 4.62M/s
768KiB 4.52M/s 4.8M/s 4.95M/s 4.5M/s 4.53M/s
384KiB 4.47M/s 4.79M/s 4.81M/s 4.42M/s 4.6M/s
192KiB 4.61M/s 4.21M/s 4.41M/s 4.43M/s 4.34M/s
96KiB 2.42M/s 2.59M/s 1.54M/s 1.97M/s 1.85M/s
48KiB 3.24M/s 2.24M/s 4.11M/s 3.17M/s 3.6M/s
24KiB 2.11M/s 3.86M/s 2.22M/s 1.93M/s 1.9M/s
12KiB 1.83M/s 1.8M/s 2.46M/s 2.72M/s 2.59M/s
6KiB 554K/s 561K/s 566K/s 546K/s 561K/s
3KiB 244K/s 246K/s 245K/s 255K/s 246K/s
1.5KiB 139K/s 139K/s 139K/s 151K/s 139K/s
Trying 5 open AUs:
1.5MiB 4.75M/s 4.57M/s 4.8M/s 4.47M/s 4.67M/s
768KiB 4.99M/s 4.54M/s 4.88M/s 4.56M/s 4.84M/s
384KiB 4.72M/s 4.52M/s 4.74M/s 4.52M/s 4.7M/s
192KiB 4.19M/s 4.24M/s 4.19M/s 4.2M/s 4.24M/s
96KiB 2.2M/s 2.22M/s 2.52M/s 1.84M/s 1.49M/s
48KiB 2.85M/s 2.53M/s 2.28M/s 2.61M/s 4.11M/s
24KiB 1.92M/s 1.9M/s 1.71M/s 1.83M/s 1.52M/s
12KiB 1.37M/s 1.47M/s 2.74M/s 1.65M/s 2.82M/s
6KiB 552K/s 545K/s 470K/s 534K/s 466K/s
3KiB 244K/s 244K/s 243K/s 223K/s 244K/s
1.5KiB 139K/s 139K/s 139K/s 125K/s 139K/s
Trying 6 open AUs:
1.5MiB 4.74M/s 4.57M/s 4.53M/s 4.86M/s 4.86M/s
768KiB 4.88M/s 4.67M/s 4.55M/s 5.02M/s 4.95M/s
384KiB 3.82M/s 4.25M/s 4.5M/s 4.05M/s 4.71M/s
192KiB 3.44M/s 3.87M/s 3.81M/s 3.82M/s 3.39M/s
96KiB 1.89M/s 1.68M/s 1.61M/s 1.66M/s 2.42M/s
48KiB 1.8M/s 1.46M/s 1.99M/s 2.95M/s 2.15M/s
24KiB 2.44M/s 2.69M/s 1.92M/s 1.44M/s 1.44M/s
12KiB 1.48M/s 1.57M/s 1.43M/s 1.55M/s 1.44M/s
6KiB 447K/s 441K/s 476K/s 449K/s 455K/s
3KiB 219K/s 207K/s 205K/s 208K/s 203K/s
1.5KiB 129K/s 129K/s 128K/s 116K/s 132K/s
Trying 7 open AUs:
1.5MiB 4.71M/s 2.93M/s 4.72M/s 4.91M/s 2.99M/s
768KiB 4.24M/s 3.65M/s 3.09M/s 4.1M/s 4.5M/s
384KiB 3.88M/s 4.51M/s 4.7M/s 3.84M/s 3.67M/s
192KiB 3.38M/s 2.96M/s 3.17M/s 3.8M/s 3.59M/s
96KiB 2.15M/s 2.42M/s 2.19M/s 1.84M/s 1.89M/s
48KiB 1.67M/s 1.71M/s 2.07M/s 2.56M/s 1.9M/s
24KiB 1.5M/s 2.05M/s 1.47M/s 1.95M/s 1.41M/s
12KiB 1.55M/s 1.31M/s 1.37M/s 1.48M/s 1.94M/s
6KiB 423K/s 422K/s 406K/s 442K/s 424K/s
3KiB 193K/s 189K/s 189K/s 182K/s 185K/s
1.5KiB 115K/s 109K/s 109K/s 111K/s 117K/s
Trying 8 open AUs:
1.5MiB 4.59M/s 4.27M/s 2.65M/s 4.26M/s 4.52M/s
768KiB 3.96M/s 3.68M/s 4.32M/s 3.72M/s 2.94M/s
384KiB 3.65M/s 3.3M/s 3.11M/s 3.78M/s 3.45M/s
192KiB 2.54M/s 2.94M/s 2.91M/s 2.31M/s 3.41M/s
96KiB 2.56M/s 2.26M/s 1.99M/s 2.78M/s 1.95M/s
48KiB 1.41M/s 1.48M/s 1.5M/s 1.55M/s 1.5M/s
24KiB 1.29M/s 1.53M/s 1.48M/s 1.54M/s 1.64M/s
12KiB 1.64M/s 1.32M/s 1.38M/s 1.55M/s 1.39M/s
6KiB 399K/s 373K/s 411K/s 384K/s 374K/s
3KiB 180K/s 191K/s 181K/s 180K/s 176K/s
1.5KiB 106K/s 106K/s 109K/s 108K/s 111K/s
Trying 9 open AUs:
1.5MiB 2.32M/s 3.9M/s 4.05M/s 4.65M/s 4.97M/s
768KiB 3.53M/s 3.62M/s 3.8M/s 3.67M/s 3.63M/s
384KiB 3.87M/s 3.51M/s 3.37M/s 3.1M/s 3.38M/s
192KiB 2.77M/s 3.29M/s 3.07M/s 3.37M/s 3.15M/s
96KiB 2.33M/s 2.06M/s 1.98M/s 1.94M/s 2.2M/s
48KiB 1.41M/s 1.38M/s 1.45M/s 1.5M/s 1.37M/s
24KiB 1.38M/s 1.32M/s 1.49M/s 1.17M/s 1.15M/s
12KiB 1.39M/s 1.32M/s 1.5M/s 1.42M/s 1.25M/s
6KiB 357K/s 370K/s 346K/s 377K/s 379K/s
3KiB 171K/s 178K/s 176K/s 178K/s 179K/s
1.5KiB 107K/s 99.5K/s 98.8K/s 98.4K/s 102K/s
Trying 10 open AUs:
1.5MiB 3.82M/s 2.9M/s 3.16M/s 2.39M/s 2.08M/s
768KiB 3.86M/s 3.13M/s 3.52M/s 4.11M/s 3.65M/s
384KiB 3.11M/s 3.57M/s 2.75M/s 2.72M/s 3.17M/s
192KiB 3.14M/s 2.98M/s 2.94M/s 3.08M/s 2.52M/s
96KiB 2.39M/s 2.15M/s 2.64M/s 2.03M/s 2.26M/s
48KiB 1.37M/s 1.32M/s 1.22M/s 1.18M/s 1.47M/s
24KiB 1.03M/s 1.29M/s 1.24M/s 1.48M/s 1.1M/s
12KiB 1.23M/s 1.36M/s 1.36M/s 1.14M/s 1.22M/s
6KiB 336K/s 334K/s 342K/s 354K/s 345K/s
3KiB 173K/s 166K/s 172K/s 175K/s 161K/s
1.5KiB 100K/s 96.4K/s 98.6K/s 96.9K/s 95.8K/s
flashtest@flatty:~/flashbench$ EBS=1572864 ; for NUMAU in $(seq 1 10) ; do echo "Trying $NUMAU open AUs:" ; ./flashbench --open-au --open-au-nr=$NUMAU --erasesize=$EBS --blocksize=1536 /dev/mmcblk[0-9] |tr -s ' ' |tr ' ' '\t' > out.1 ; for nr in $(seq 2 5) ; do ./flashbench --open-au --open-au-nr=$NUMAU --erasesize=$EBS --blocksize=1536 /dev/mmcblk[0-9] |tr -s ' '|cut -d ' ' -f 2 > out.$nr ; done ; paste out.* ; done | tee ~/sandisk_openau_norandom_1.5M_1-10AUs.result
Trying 1 open AUs:
1.5MiB 1.66M/s 4.44M/s 4.74M/s 4.4M/s 4.44M/s
768KiB 4.75M/s 3.72M/s 3.73M/s 4.14M/s 4.54M/s
384KiB 4.72M/s 3.99M/s 3.83M/s 3.84M/s 3.91M/s
192KiB 4.32M/s 3.85M/s 3.97M/s 3.75M/s 3.81M/s
96KiB 4.05M/s 3.85M/s 5.03M/s 3.76M/s 4.05M/s
48KiB 3.97M/s 3.41M/s 4.33M/s 3.35M/s 4.02M/s
24KiB 4.02M/s 3.55M/s 3.84M/s 3.25M/s 3.96M/s
12KiB 2.67M/s 2.75M/s 2.98M/s 2.54M/s 2.94M/s
6KiB 1.14M/s 1.1M/s 973K/s 1.09M/s 1.03M/s
3KiB 554K/s 522K/s 526K/s 820K/s 516K/s
1.5KiB 309K/s 310K/s 303K/s 403K/s 320K/s
Trying 2 open AUs:
1.5MiB 4.51M/s 3.98M/s 5.87M/s 4.41M/s 4.87M/s
768KiB 4.05M/s 4.45M/s 6.05M/s 5.46M/s 4.95M/s
384KiB 4.31M/s 5.61M/s 5M/s 5.48M/s 4.8M/s
192KiB 4.83M/s 4.59M/s 4.79M/s 4.47M/s 4.6M/s
96KiB 2.01M/s 1.92M/s 2.15M/s 2.02M/s 4.34M/s
48KiB 3.99M/s 3.36M/s 4.03M/s 4.06M/s 4.14M/s
24KiB 4.05M/s 3.29M/s 3.92M/s 3.9M/s 3.1M/s
12KiB 2.93M/s 2.94M/s 2.92M/s 2.92M/s 2.89M/s
6KiB 688K/s 764K/s 903K/s 947K/s 834K/s
3KiB 288K/s 275K/s 348K/s 348K/s 332K/s
1.5KiB 183K/s 238K/s 242K/s 227K/s 250K/s
Trying 3 open AUs:
1.5MiB 4.8M/s 4.64M/s 4.72M/s 4.56M/s 4.5M/s
768KiB 4.89M/s 4.6M/s 4.86M/s 4.56M/s 4.57M/s
384KiB 4.8M/s 4.59M/s 4.76M/s 4.54M/s 4.51M/s
192KiB 5.06M/s 4.7M/s 4.95M/s 4.65M/s 4.7M/s
96KiB 1.71M/s 2.62M/s 2.27M/s 2.49M/s 1.77M/s
48KiB 4.24M/s 2.29M/s 4.18M/s 2.37M/s 4.06M/s
24KiB 3.92M/s 4.01M/s 2.74M/s 3.94M/s 3.86M/s
12KiB 2.84M/s 2.9M/s 2.84M/s 1.78M/s 1.75M/s
6KiB 680K/s 692K/s 666K/s 712K/s 701K/s
3KiB 292K/s 277K/s 292K/s 294K/s 291K/s
1.5KiB 227K/s 224K/s 223K/s 225K/s 222K/s
Trying 4 open AUs:
1.5MiB 4.69M/s 4.89M/s 4.53M/s 4.68M/s 4.61M/s
768KiB 4.45M/s 5.01M/s 4.56M/s 4.68M/s 4.61M/s
384KiB 4.55M/s 4.68M/s 4.52M/s 4.52M/s 4.46M/s
192KiB 4.41M/s 4.58M/s 4.21M/s 4.43M/s 4.52M/s
96KiB 2M/s 1.69M/s 2.06M/s 1.82M/s 2.39M/s
48KiB 3.22M/s 4.11M/s 4.12M/s 3.7M/s 2.88M/s
24KiB 1.91M/s 2.19M/s 1.82M/s 1.36M/s 2.19M/s
12KiB 2.24M/s 2.49M/s 2.9M/s 2.85M/s 2.32M/s
6KiB 607K/s 637K/s 629K/s 655K/s 610K/s
3KiB 250K/s 249K/s 251K/s 287K/s 253K/s
1.5KiB 192K/s 192K/s 193K/s 215K/s 193K/s
Trying 5 open AUs:
1.5MiB 4.86M/s 4.58M/s 4.93M/s 4.71M/s 4.55M/s
768KiB 4.97M/s 4.48M/s 3.52M/s 4.88M/s 4.64M/s
384KiB 4.76M/s 4.49M/s 3.43M/s 4.25M/s 4.47M/s
192KiB 3.87M/s 4.17M/s 4.37M/s 4.34M/s 4.24M/s
96KiB 1.93M/s 1.91M/s 1.89M/s 1.92M/s 1.83M/s
48KiB 3.36M/s 2.68M/s 2.49M/s 2.61M/s 3.27M/s
24KiB 1.58M/s 1.97M/s 1.59M/s 2.14M/s 1.67M/s
12KiB 1.47M/s 1.48M/s 2.22M/s 1.46M/s 1.48M/s
6KiB 637K/s 621K/s 559K/s 637K/s 641K/s
3KiB 241K/s 233K/s 271K/s 271K/s 237K/s
1.5KiB 205K/s 178K/s 206K/s 206K/s 178K/s
Trying 6 open AUs:
1.5MiB 4.48M/s 4.98M/s 4.67M/s 4.59M/s 4.9M/s
768KiB 4.62M/s 4.94M/s 4.61M/s 4.58M/s 5.01M/s
384KiB 4.17M/s 4.27M/s 3.87M/s 4.45M/s 4.52M/s
192KiB 3.83M/s 3.7M/s 3.3M/s 3.81M/s 3.34M/s
96KiB 1.8M/s 1.48M/s 1.92M/s 1.97M/s 1.63M/s
48KiB 2.71M/s 4.1M/s 2.08M/s 2.21M/s 2.62M/s
24KiB 1.71M/s 1.66M/s 1.71M/s 1.73M/s 1.83M/s
12KiB 1.64M/s 1.32M/s 1.57M/s 1.71M/s 1.33M/s
6KiB 523K/s 583K/s 522K/s 523K/s 594K/s
3KiB 233K/s 237K/s 259K/s 223K/s 241K/s
1.5KiB 196K/s 196K/s 196K/s 178K/s 196K/s
Trying 7 open AUs:
1.5MiB 4.72M/s 4.6M/s 4.55M/s 4.91M/s 4.83M/s
768KiB 4.32M/s 4.25M/s 3.55M/s 4.02M/s 4.27M/s
384KiB 3.41M/s 3.77M/s 4.73M/s 4.69M/s 3.56M/s
192KiB 3.77M/s 3.19M/s 3.25M/s 3.16M/s 3.41M/s
96KiB 1.8M/s 1.94M/s 1.6M/s 1.81M/s 1.88M/s
48KiB 1.67M/s 2.59M/s 1.69M/s 2.13M/s 2.3M/s
24KiB 2.06M/s 1.68M/s 2.26M/s 1.48M/s 1.46M/s
12KiB 1.36M/s 1.56M/s 1.5M/s 1.25M/s 1.3M/s
6KiB 498K/s 505K/s 492K/s 525K/s 499K/s
3KiB 219K/s 214K/s 204K/s 222K/s 205K/s
1.5KiB 163K/s 166K/s 169K/s 168K/s 161K/s
Trying 8 open AUs:
1.5MiB 4.85M/s 2.82M/s 4.56M/s 4.87M/s 3.38M/s
768KiB 3.79M/s 3.51M/s 3.75M/s 4.72M/s 3.62M/s
384KiB 3.58M/s 3.05M/s 4.12M/s 3.52M/s 4.17M/s
192KiB 2.71M/s 3.65M/s 2.96M/s 2.71M/s 2.75M/s
96KiB 2.22M/s 2.34M/s 1.98M/s 2.33M/s 2.54M/s
48KiB 1.51M/s 1.51M/s 1.68M/s 1.52M/s 1.44M/s
24KiB 1.4M/s 1.62M/s 1.48M/s 1.51M/s 1.64M/s
12KiB 1.37M/s 1.39M/s 1.21M/s 1.31M/s 1.25M/s
6KiB 488K/s 480K/s 492K/s 488K/s 483K/s
3KiB 200K/s 204K/s 201K/s 207K/s 199K/s
1.5KiB 166K/s 157K/s 155K/s 156K/s 156K/s
Trying 9 open AUs:
1.5MiB 4.55M/s 3.24M/s 4.63M/s 4.88M/s 3.81M/s
768KiB 3.68M/s 3.6M/s 3.5M/s 3.41M/s 4.07M/s
384KiB 3.62M/s 3.15M/s 3.06M/s 3.36M/s 3M/s
192KiB 2.61M/s 2.98M/s 3.28M/s 3.16M/s 2.86M/s
96KiB 2.49M/s 2.27M/s 2.25M/s 1.85M/s 2.85M/s
48KiB 1.16M/s 1.42M/s 1.42M/s 1.67M/s 1.33M/s
24KiB 1.41M/s 1.4M/s 1.33M/s 1.33M/s 1.05M/s
12KiB 1.43M/s 1.34M/s 980K/s 1.3M/s 1.36M/s
6KiB 470K/s 433K/s 510K/s 417K/s 493K/s
3KiB 191K/s 197K/s 196K/s 194K/s 200K/s
1.5KiB 156K/s 150K/s 148K/s 153K/s 153K/s
Trying 10 open AUs:
1.5MiB 1.89M/s 3.13M/s 3.87M/s 4.99M/s 3.59M/s
768KiB 3.2M/s 3.91M/s 3.74M/s 3.07M/s 3.29M/s
384KiB 3.78M/s 2.61M/s 2.55M/s 3.17M/s 2.63M/s
192KiB 2.49M/s 3.1M/s 2.98M/s 3.09M/s 3.29M/s
96KiB 2.09M/s 2.45M/s 2.12M/s 2.07M/s 2.51M/s
48KiB 1.31M/s 1.35M/s 1.27M/s 1.35M/s 1.25M/s
24KiB 1.18M/s 982K/s 1.4M/s 1.33M/s 1.06M/s
12KiB 1.14M/s 1.33M/s 1.09M/s 1.06M/s 1.14M/s
6KiB 424K/s 488K/s 500K/s 444K/s 421K/s
3KiB 190K/s 185K/s 182K/s 200K/s 190K/s
1.5KiB 145K/s 144K/s 145K/s 143K/s 147K/s
I'm afraid the numbers don't make much more sense to me than before. :-/
Sascha
On Wednesday 06 April 2011, Sascha Silbe wrote:
Excerpts from Arnd Bergmann's message of Tue Apr 05 00:17:14 +0200 2011:
Ah, I see. Here we go:
flashtest@flatty:~/flashbench$ time ./flashbench -a /dev/mmcblk[0-9] --blocksize=1024 --count=100 | tee ~/sandisk_a_1k.result
align 402653184 pre 5.06ms on 5.99ms post 2.74ms diff 2.09ms
align 268435456 pre 3.44ms on 3.8ms post 3.67ms diff 244µs
align 201326592 pre 4.74ms on 5.83ms post 2.77ms diff 2.07ms
align 134217728 pre 2.78ms on 2.89ms post 2.92ms diff 32.5µs
align 100663296 pre 5.04ms on 5.65ms post 2.74ms diff 1.76ms
align 67108864 pre 3.48ms on 3.71ms post 3.68ms diff 135µs
align 50331648 pre 5.53ms on 5.93ms post 2.73ms diff 1.8ms
align 33554432 pre 2.74ms on 2.89ms post 2.93ms diff 52.1µs
align 25165824 pre 5.29ms on 5.66ms post 2.78ms diff 1.62ms
align 16777216 pre 3.5ms on 3.79ms post 3.67ms diff 204µs
align 12582912 pre 5.15ms on 5.77ms post 2.75ms diff 1.82ms
align 8388608 pre 2.52ms on 2.69ms post 2.71ms diff 72.9µs
align 6291456 pre 4.69ms on 5.46ms post 2.54ms diff 1.85ms
align 4194304 pre 3.23ms on 3.46ms post 3.31ms diff 189µs
align 3145728 pre 4.79ms on 5.41ms post 2.54ms diff 1.74ms
align 2097152 pre 2.57ms on 2.72ms post 2.71ms diff 77.8µs
align 1572864 pre 4.63ms on 5.81ms post 2.61ms diff 2.2ms
align 1048576 pre 3.26ms on 3.49ms post 3.29ms diff 215µs
align 786432 pre 3.28ms on 3.45ms post 2.58ms diff 524µs
align 524288 pre 2.52ms on 2.69ms post 2.69ms diff 88µs
Ok, so all multiples of 1.5 MB are an order of magnitude slower in the "diff" column than the others, which makes it clear that there is something going on at that point. I should probably modify the test so that the 2 MB alignment actually does 6 MB, which would make it a superset of 1.5 and of 2.
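The comparison of the "diff" column can also be done mechanically. The following is a minimal Python sketch, assuming the `flashbench -a` output is available as text in the "align ... pre ... on ... post ... diff ..." form quoted in this thread. The helper names `parse_align_output` and `split_by_multiple` are made up for illustration and are not part of flashbench, and only the "ms" and "µs" suffixes are known from the output here; the other suffixes are guesses.

```python
import re
from statistics import mean

# Timing suffixes converted to microseconds. Only "ms" and "µs" occur in the
# output quoted in this thread; the remaining suffixes are assumptions.
UNITS_US = {"ns": 1e-3, "µs": 1.0, "μs": 1.0, "us": 1.0, "ms": 1e3, "s": 1e6}

LINE_RE = re.compile(
    r"align (\d+) pre \S+ on \S+ post \S+ diff (-?[\d.]+)(ns|µs|μs|us|ms|s)"
)


def parse_align_output(text):
    """Return a list of (alignment in bytes, diff in microseconds)."""
    return [(int(a), float(v) * UNITS_US[u]) for a, v, u in LINE_RE.findall(text)]


def split_by_multiple(rows, unit=1572864):
    """Split the diffs into those at multiples of `unit` and all others."""
    multiples = [d for a, d in rows if a % unit == 0]
    others = [d for a, d in rows if a % unit != 0]
    return multiples, others


# Two lines copied from the run above, just to show the call sequence.
sample = (
    "align 1572864 pre 4.63ms on 5.81ms post 2.61ms diff 2.2ms\n"
    "align 1048576 pre 3.26ms on 3.49ms post 3.29ms diff 215µs\n"
)
multiples, others = split_by_multiple(parse_align_output(sample))
print("mean diff at multiples of 1.5 MiB: %.0f µs" % mean(multiples))
print("mean diff elsewhere:               %.0f µs" % mean(others))
```

Feeding the whole saved result file through the same two functions gives one mean per group, which makes the order-of-magnitude gap easy to compare between cards.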
(--random)
Trying 10 open AUs:
1.5MiB 3.82M/s 2.9M/s 3.16M/s 2.39M/s 2.08M/s
768KiB 3.86M/s 3.13M/s 3.52M/s 4.11M/s 3.65M/s
384KiB 3.11M/s 3.57M/s 2.75M/s 2.72M/s 3.17M/s
192KiB 3.14M/s 2.98M/s 2.94M/s 3.08M/s 2.52M/s
96KiB 2.39M/s 2.15M/s 2.64M/s 2.03M/s 2.26M/s
48KiB 1.37M/s 1.32M/s 1.22M/s 1.18M/s 1.47M/s
24KiB 1.03M/s 1.29M/s 1.24M/s 1.48M/s 1.1M/s
12KiB 1.23M/s 1.36M/s 1.36M/s 1.14M/s 1.22M/s
6KiB 336K/s 334K/s 342K/s 354K/s 345K/s
3KiB 173K/s 166K/s 172K/s 175K/s 161K/s
1.5KiB 100K/s 96.4K/s 98.6K/s 96.9K/s 95.8K/s
(not random)
Trying 10 open AUs:
1.5MiB 1.89M/s 3.13M/s 3.87M/s 4.99M/s 3.59M/s
768KiB 3.2M/s 3.91M/s 3.74M/s 3.07M/s 3.29M/s
384KiB 3.78M/s 2.61M/s 2.55M/s 3.17M/s 2.63M/s
192KiB 2.49M/s 3.1M/s 2.98M/s 3.09M/s 3.29M/s
96KiB 2.09M/s 2.45M/s 2.12M/s 2.07M/s 2.51M/s
48KiB 1.31M/s 1.35M/s 1.27M/s 1.35M/s 1.25M/s
24KiB 1.18M/s 982K/s 1.4M/s 1.33M/s 1.06M/s
12KiB 1.14M/s 1.33M/s 1.09M/s 1.06M/s 1.14M/s
6KiB 424K/s 488K/s 500K/s 444K/s 421K/s
3KiB 190K/s 185K/s 182K/s 200K/s 190K/s
1.5KiB 145K/s 144K/s 145K/s 143K/s 147K/s
I'm afraid the numbers don't make much more sense to me than before. :-/
There are a few things that are notable here:
* The numbers for --random and not random are roughly the same, so you don't really need to do both. I'd suggest running only --random tests.
* anything under 12 KB blocks is awfully slow, so you can probably speed up the test process by running --blocksize=12288 (12 KB, since the option takes bytes)
* The variance between the test runs has significantly gone down, which is a good sign.
* You did not try any larger values for --open-au-nr=. Since the blocksize is relatively small, it is possible that the number of open AUs is much higher on this card. Try a much larger value (e.g. 32) once to see if you get a radical change, then go smaller from there. You don't really need to try every value for open-au-nr, finding the cut-off is the interesting one.
* It is possible that this card has a fully log-structured approach, which would mean that there actually is no cut-off but that it simply gets gradually slower with more open AUs. I have no programmatic way to find these yet.
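The "try a large value once, then go smaller" search can be written down as a bisection. This is only a sketch under stated assumptions: `measure` is a hypothetical callable that runs one open-AU test for a given `--open-au-nr` value and returns a throughput number (it is not a flashbench function), throughput is assumed not to recover once it has dropped, and `drop=0.5` is an arbitrary threshold for what counts as a "radical change". On a fully log-structured card with a gradual slowdown the result would mostly reflect the chosen threshold rather than a real cut-off.

```python
def find_cutoff(measure, lo=1, hi=32, drop=0.5):
    """Return the largest open-AU count in [lo, hi] that is still 'fast'.

    'Fast' means measure(nr) >= drop * measure(lo). Assumes that once the
    throughput has dropped it stays low for all larger counts.
    """
    baseline = measure(lo)

    def is_fast(nr):
        return measure(nr) >= drop * baseline

    if is_fast(hi):
        return hi  # no radical change up to hi; retry with a larger hi

    # Invariant: lo is fast, hi is slow.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_fast(mid):
            lo = mid
        else:
            hi = mid
    return lo
```

With `hi=32` this needs about half a dozen measurements instead of 32. Since single runs on this card are noisy, `measure` would in practice average several runs before returning.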
Arnd
Excerpts from Arnd Bergmann's message of Thu Apr 07 16:23:05 +0200 2011:
- It is possible that this card has a fully log-structured approach, which would mean that there actually is no cut-off but that it simply gets gradually slower with more open AUs. I have no programmatic way to find these yet.
A quick test with logarithmic scale suggests this might indeed be the case:
flashtest@flatty:~$ for NUMAU in 2 2 4 4 8 8 16 16 32 32 64 64 128 128 256 256 ; do echo Trying $NUMAU open AUs: ; ~/flashbench/flashbench --open-au --open-au-nr=$NUMAU --erasesize=1572864 --blocksize=1572864 --random /dev/mmcblk[0-9] ; done
Trying 2 open AUs:
1.5MiB 6.42M/s
Trying 2 open AUs:
1.5MiB 5.97M/s
Trying 4 open AUs:
1.5MiB 6.23M/s
Trying 4 open AUs:
1.5MiB 4.93M/s
Trying 8 open AUs:
1.5MiB 5.07M/s
Trying 8 open AUs:
1.5MiB 5.03M/s
Trying 16 open AUs:
1.5MiB 3.65M/s
Trying 16 open AUs:
1.5MiB 3.78M/s
Trying 32 open AUs:
1.5MiB 2.59M/s
Trying 32 open AUs:
1.5MiB 2.86M/s
Trying 64 open AUs:
1.5MiB 2.33M/s
Trying 64 open AUs:
1.5MiB 2.51M/s
Trying 128 open AUs:
1.5MiB 2.35M/s
Trying 128 open AUs:
1.5MiB 2.36M/s
Trying 256 open AUs:
1.5MiB 2.33M/s
Trying 256 open AUs:
1.5MiB 2.33M/s
For runtime reasons I've restricted the test to full EBS writes and only two runs per number. I can do a more extensive test if you think it might be worth it (e.g. to do a plot with error bars).
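For anyone who wants to tabulate a run like this, here is a minimal Python sketch that groups the throughput values by open-AU count and averages them. It assumes the "N open AUs:" header followed by "size speed" rows seen in this thread, and that only one block size is tested per run (as here, where --blocksize equals --erasesize), so all speeds under a header belong together. `parse_open_au` and `mean_by_count` are illustrative names, and treating "K" as 1/1024 of "M" is an assumption about flashbench's units.

```python
import re
from collections import defaultdict
from statistics import mean

SPEED_RE = re.compile(r"([\d.]+)([KM])/s")
# Everything is expressed in flashbench's "M/s"; the K factor is an assumption.
SCALE = {"K": 1.0 / 1024, "M": 1.0}


def parse_open_au(text):
    """Map open-AU count -> list of throughputs (in M/s), in order of appearance."""
    results = defaultdict(list)
    nr = None
    tokens = text.split()
    for i, tok in enumerate(tokens):
        # A header looks like "... <number> open AUs:".
        if tok == "open" and i > 0 and tokens[i - 1].isdigit():
            nr = int(tokens[i - 1])
            continue
        m = SPEED_RE.fullmatch(tok)
        if m and nr is not None:
            results[nr].append(float(m.group(1)) * SCALE[m.group(2)])
    return dict(results)


def mean_by_count(results):
    """Mean throughput per open-AU count, sorted by count."""
    return {nr: mean(speeds) for nr, speeds in sorted(results.items())}
```

Because the parser works on whitespace-separated tokens, it does not matter whether the output is one row per line or has been pasted into a single line.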
BTW, is there a way to ensure that the individual tests are independent (other than mechanically ejecting and reinserting the card which I was too lazy to do so far)?
Sascha
Excerpts from Sascha Silbe's message of Fri Apr 08 00:34:15 +0200 2011:
Excerpts from Arnd Bergmann's message of Thu Apr 07 16:23:05 +0200 2011:
- It is possible that this card has a fully log-structured approach, which would mean that there actually is no cut-off but that it simply gets gradually slower with more open AUs. I have no programmatic way to find these yet.
A quick test with logarithmic scale suggests this might indeed be the case:
[...]
A more extensive set of tests (50 samples each for 1-16 open AUs, only EBS sized blocks) post-processed and plotted to show mean and standard deviation painted a different picture. For 1-3 open AUs there's "high" speed with high deviation; for 4-7 open AUs mean speed is a bit less with almost no deviation. There's a clear cut after 10 AUs (change in mean value greater than standard deviation).
Not sure why there's so much deviation for 1-3 open AUs; maybe doing a complete erase + fill cycle would give better numbers. However I'd prefer to avoid that, because a) it would wear the card significantly (I don't expect it to survive more than 1k cycles) and b) the test would take two full days (for 10 samples each at 1-16 open AUs) during which I can use neither the card nor the card slot for anything else.
I've attached raw data, post-processing script and plot. The invocations were:
for NUMAU in $(seq 1 16) ; do echo $NUMAU open AUs: ; for numtry in $(seq 1 50) ; do ~/flashbench/flashbench --open-au --open-au-nr=$NUMAU --erasesize=1572864 --blocksize=1572864 --random /dev/mmcblk[0-9] ; done ; done | tee ~/sandisk_openau_random_1.5M_1.5M_1-16_2.result
./result2plot.py ~/sandisk_openau_random_1.5M_1.5M_1-16_2.result
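The attached result2plot.py is not reproduced in the archive. As a rough illustration of the criterion described above (a change in mean larger than the standard deviation), here is a separate Python sketch; it is not the attached script. It assumes the samples have already been collected into a dict mapping the open-AU count to a list of throughput values, and the function names `summarize` and `find_cuts` are invented for this example.

```python
from statistics import mean, stdev


def summarize(samples):
    """samples: {open_au_count: [throughput, ...]} -> {count: (mean, stdev)}.

    Each count needs at least two samples, otherwise stdev() raises
    StatisticsError.
    """
    return {nr: (mean(v), stdev(v)) for nr, v in sorted(samples.items())}


def find_cuts(summary):
    """Return the counts at which the mean moved by more than the standard
    deviation, compared with the next smaller count that was measured.

    The larger of the two neighbouring deviations is used as the yardstick.
    """
    items = sorted(summary.items())
    cuts = []
    for (_, (prev_mean, prev_sd)), (nr, (cur_mean, cur_sd)) in zip(items, items[1:]):
        if abs(cur_mean - prev_mean) > max(prev_sd, cur_sd):
            cuts.append(nr)
    return cuts
```

Using the larger of the two deviations is a deliberately conservative choice: with the noisy 1-3 AU results it avoids reporting a cut where the difference is within the scatter of either neighbour.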
Sascha
On Friday 08 April 2011, Sascha Silbe wrote:
A more extensive set of tests (50 samples each for 1-16 open AUs, only EBS sized blocks) post-processed and plotted to show mean and standard deviation painted a different picture. For 1-3 open AUs there's "high" speed with high deviation; for 4-7 open AUs mean speed is a bit less with almost no deviation. There's a clear cut after 10 AUs (change in mean value greater than standard deviation).
What you found is clearly interesting, but I fear you are going down a much different route from what I'm normally looking at. Writing a full erase block normally always gives the exact same performance, because the card is not forced to do garbage collection.
It looks like the card actually does garbage collection, which could hint that the erase block size is actually 3 MB (or possibly 4.5 MB, though I have never seen such a strange value).
It's also possible that most erase blocks are 1.5 MB, but that every 64 MB it starts out with a 64 MB aligned block, which would coincide with the 11th erase block in your test.
You can try doing a run with
flashbench --open-au --open-au-nr=3 --erasesize=1572864 --blocksize=393216 --offset=67108864 --random /dev/mmcblk0
to see if that is fast or slow. 67108864 is 64 MB, so it is not aligned to multiples of 1.5 MB, while 67109376 is the next multiple that flashbench would normally try.
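The alignment arithmetic can be checked with a few lines of Python. This is just a worked calculation using the erase size and offset from this thread; `next_multiple` is a helper written for the example. Note that 67109376 comes out as the next multiple of 1536 bytes (the 1.5 KiB block size used in the earlier runs), whereas the next multiple of the full 1.5 MiB erase size is 67633152.

```python
EBS = 1572864       # 1.5 MiB, the --erasesize used in this thread
OFFSET = 67108864   # 64 MiB, the --offset suggested above


def next_multiple(value, unit):
    """Smallest multiple of `unit` that is greater than or equal to `value`."""
    return -(-value // unit) * unit


print(OFFSET % EBS)                 # 1048576: 64 MiB sits 1 MiB into a 1.5 MiB unit
print(next_multiple(OFFSET, EBS))   # 67633152: next multiple of 1.5 MiB
print(next_multiple(OFFSET, 1536))  # 67109376: next multiple of 1.5 KiB
```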
In either case, you really need the numbers for the smaller block sizes; those will be much more interesting.
Not sure why there's so much deviation for 1-3 open AUs; maybe doing a complete erase + fill cycle would give better numbers.
The deviation in these cases is probably based on what the previous test run did: It takes longer to write one erase block when there is need to garbage-collect another one.
However I'd prefer to avoid that, because a) it would wear the card significantly (I don't expect it to survive more than 1k cycles)
Doing a high-level erase would only mark the erase blocks as free, it does not actually age the drive. In fact, doing an erase after each test run is a good idea.
and b) the test would take two full days (for 10 samples each at 1-16 open AUs) during which I can use neither the card nor the card slot for anything else.
As mentioned before, the cards tend to be completely deterministic with flashbench, as long as the assumptions are correct. If you get any significant deviation between test runs, that means you are not hitting the cases that flashbench was designed for.
Arnd
flashbench-results@lists.linaro.org