Possibly 0.5 MiB erase block based on factored size. Doing open-au tests nets no useful data, it's always fast, so just a few examples at the bottom.
andrew@lati:~/git/flashbench$ sudo fdisk -l /dev/mmcblk0 | grep Disk
Disk /dev/mmcblk0: 15.7 GB, 15719727104 bytes
Disk identifier: 0x00000000
andrew@lati:~/git/flashbench$ factor 15719727104
15719727104: 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 29983
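The 0.5 MiB guess comes from the largest power of two that divides the reported card size. A minimal sketch of that arithmetic in shell, using the byte count from fdisk; the function name largest_pow2_factor is only illustrative, and a power-of-two factor of the capacity is a hint at the erase block size, not proof of it:

```shell
# Hypothetical helper: print the largest power of two that divides $1.
largest_pow2_factor() {
    n=$1
    p=1
    # Strip factors of two from n, accumulating them in p.
    while [ "$n" -gt 0 ] && [ $((n % 2)) -eq 0 ]; do
        n=$((n / 2))
        p=$((p * 2))
    done
    echo "$p"
}

largest_pow2_factor 15719727104   # card size in bytes, from fdisk
```

For this card the result is 524288 bytes, i.e. 2^19 or 512 KiB, matching the nineteen 2s in the factor output.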
andrew@lati:~/git/flashbench$ head /sys/block/mmcblk0/device/* 2>/dev/null | grep -v ^$
==> /sys/block/mmcblk0/device/block <==
==> /sys/block/mmcblk0/device/cid <==
1b534d3030303030107b156f8000d900
==> /sys/block/mmcblk0/device/csd <==
400e00325b590000751e7f800a404000
==> /sys/block/mmcblk0/device/date <==
09/2013
==> /sys/block/mmcblk0/device/driver <==
==> /sys/block/mmcblk0/device/erase_size <==
512
==> /sys/block/mmcblk0/device/fwrev <==
0x0
==> /sys/block/mmcblk0/device/hwrev <==
0x1
==> /sys/block/mmcblk0/device/manfid <==
0x00001b
==> /sys/block/mmcblk0/device/name <==
00000
==> /sys/block/mmcblk0/device/oemid <==
0x534d
==> /sys/block/mmcblk0/device/power <==
==> /sys/block/mmcblk0/device/preferred_erase_size <==
4194304
==> /sys/block/mmcblk0/device/scr <==
02b5800200000000
==> /sys/block/mmcblk0/device/serial <==
0x7b156f80
==> /sys/block/mmcblk0/device/subsystem <==
==> /sys/block/mmcblk0/device/type <==
SD
==> /sys/block/mmcblk0/device/uevent <==
DRIVER=mmcblk
MMC_TYPE=SD
MMC_NAME=00000
MODALIAS=mmc:block
andrew@lati:~/git/flashbench$ sudo ./flashbench /dev/mmcblk0 -a --blocksize=1024
align 4294967296  pre 292µs  on 260µs  post 245µs  diff -8046ns
align 2147483648  pre 308µs  on 257µs  post 231µs  diff -12097n
align 1073741824  pre 287µs  on 251µs  post 239µs  diff -11574n
align 536870912   pre 302µs  on 253µs  post 226µs  diff -11129n
align 268435456   pre 299µs  on 263µs  post 246µs  diff -9463ns
align 134217728   pre 309µs  on 261µs  post 234µs  diff -10506n
align 67108864    pre 293µs  on 263µs  post 244µs  diff -5672ns
align 33554432    pre 307µs  on 257µs  post 228µs  diff -9952ns
align 16777216    pre 289µs  on 253µs  post 241µs  diff -12326n
align 8388608     pre 227µs  on 267µs  post 231µs  diff 38.4µs
align 4194304     pre 232µs  on 275µs  post 230µs  diff 43.3µs
align 2097152     pre 229µs  on 269µs  post 225µs  diff 42µs
align 1048576     pre 223µs  on 258µs  post 222µs  diff 35.8µs
align 524288      pre 228µs  on 266µs  post 227µs  diff 39.4µs
align 262144      pre 226µs  on 260µs  post 223µs  diff 35.2µs
align 131072      pre 231µs  on 266µs  post 227µs  diff 37µs
align 65536       pre 220µs  on 258µs  post 220µs  diff 38µs
align 32768       pre 229µs  on 266µs  post 226µs  diff 38.4µs
align 16384       pre 232µs  on 270µs  post 235µs  diff 36.2µs
align 8192        pre 231µs  on 231µs  post 227µs  diff 1.72µs
align 4096        pre 221µs  on 221µs  post 220µs  diff 319ns
align 2048        pre 221µs  on 224µs  post 221µs  diff 2.6µs
andrew@lati:~/git/flashbench$ sudo ./flashbench /dev/mmcblk0 --open-au --blocksize=$[16*1024] --erasesize=$[8*1024*1024] --open-au-nr=1
8MiB    14.8M/s
4MiB    17.2M/s
2MiB    17.3M/s
1MiB    17.3M/s
512KiB  17.4M/s
256KiB  17.3M/s
128KiB  17.1M/s
64KiB   17M/s
32KiB   9.5M/s
16KiB   5.52M/s

andrew@lati:~/git/flashbench$ sudo ./flashbench /dev/mmcblk0 --open-au --blocksize=$[16*1024] --erasesize=$[8*1024*1024] --open-au-nr=7
8MiB    11.1M/s
4MiB    10.8M/s
2MiB    11M/s
1MiB    12.3M/s
512KiB  12.3M/s
256KiB  15.8M/s
128KiB  17.1M/s
64KiB   16.9M/s
32KiB   7.48M/s
16KiB   6.54M/s

andrew@lati:~/git/flashbench$ sudo ./flashbench /dev/mmcblk0 --open-au --blocksize=$[16*1024] --erasesize=$[8*1024*1024] --random --open-au-nr=1
8MiB    17.2M/s
4MiB    17.5M/s
2MiB    17.5M/s
1MiB    17M/s
512KiB  17.4M/s
256KiB  17.4M/s
128KiB  17.3M/s
64KiB   17.1M/s
32KiB   13.9M/s
16KiB   9.76M/s

andrew@lati:~/git/flashbench$ sudo ./flashbench /dev/mmcblk0 --open-au --blocksize=$[16*1024] --erasesize=$[8*1024*1024] --random --open-au-nr=7
8MiB    17.4M/s
4MiB    17.4M/s
2MiB    17.4M/s
1MiB    17.3M/s
512KiB  17.3M/s
256KiB  17.2M/s
128KiB  17.1M/s
64KiB   16.9M/s
32KiB   13.7M/s
16KiB   9.62M/s
On Mon, Mar 10, 2014 at 10:52 AM, Andrew Bradford andrew@bradfordembedded.com wrote:
> Doing open-au tests nets no useful data, it's always fast, so just a few examples at the bottom.
Any chance this test was bottlenecked on the SDIO interface link speed?
# Exynos5 ARM system
fgrep mmc_host /var/log/messages
...
2014-03-10T10:38:59.069638-07:00 localhost kernel: [ 0.927858] mmc_host mmc0: Bus speed (slot 0) = 100000000Hz (slot req 52000000Hz, actual 50000000HZ div = 1)
2014-03-10T10:38:59.069901-07:00 localhost kernel: [ 0.970145] mmc_host mmc1: Bus speed (slot 0) = 200000000Hz (slot req 200000000Hz, actual 200000000HZ div = 0)
means 50 MHz * 4 bits * (80% efficiency) == ~20 MB/s (200 Mbit/s is 25 MB/s raw) ... this is really a rough estimate.
(Offhand, I don't know where "width" of the SDIO link is advertised ... just happen to know what it is for this platform).
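The rough estimate above is clock rate times bus width, converted from bits to bytes and scaled by an assumed efficiency. A small sketch of the same integer arithmetic; the function name is made up for illustration, and the 80% efficiency figure is the guess from this message, not a measured value:

```shell
# Hypothetical helper: estimate usable bus throughput in MB/s.
#   $1 = bus clock in Hz
#   $2 = bus width in bits
#   $3 = assumed efficiency in percent
bus_throughput_mbs() {
    echo $(( $1 * $2 / 8 * $3 / 100 / 1000000 ))
}

bus_throughput_mbs 50000000 4 80   # 50 MHz, 4-bit bus, 80% efficiency
```

With those inputs the result is 20, which is in the same ballpark as the roughly 17M/s plateau in the open-au runs, so a link-limited test is at least plausible.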
cheers, grant
On 03/10/2014 04:21 PM, Grant Grundler wrote:
> On Mon, Mar 10, 2014 at 10:52 AM, Andrew Bradford andrew@bradfordembedded.com wrote:
>> Doing open-au tests nets no useful data, it's always fast, so just a few examples at the bottom.
> Any chance this test was bottlenecked on the SDIO interface link speed?
Could very well be. Was tested with a reader that does not support faster than 50 MHz data rate connected via PCMCIA.
But I would expect that at high "open-au-nr" the controller would eventually become the bottleneck and drop speeds to kB/s levels, which didn't seem to happen. I was able to start an --open-au-nr=31 test (which takes quite a while and so I gave up as I'm impatient and was running on battery) with performance the same as the --open-au-nr=7 tests. So either I've gotten the erase block size very wrong, I'm doing something else bone-headed, or the controller is quite good.
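One way to look for the point where the controller runs out of open erase blocks is to sweep --open-au-nr upward and watch for a throughput collapse. A sketch that only prints the commands, since the open-au tests write to the card; the device path, block size and erase size are copied from the earlier runs, the list of counts is an arbitrary choice, and the function name is hypothetical:

```shell
# Hypothetical helper: print flashbench open-au commands for a range of
# open erase block counts. Nothing is executed here; pipe the output to
# sh to actually run the tests (this overwrites data on the device).
print_open_au_sweep() {
    dev=$1
    for nr in 1 2 4 8 16 32; do
        echo "sudo ./flashbench $dev --open-au --blocksize=$((16*1024))" \
             "--erasesize=$((8*1024*1024)) --random --open-au-nr=$nr"
    done
}

print_open_au_sweep /dev/mmcblk0
```

If the erase block guess is wrong, repeating the sweep with a different --erasesize (for example the 4 MiB preferred_erase_size reported in sysfs) would be the next thing to try.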
Thanks, Andrew
On Monday 10 March 2014 17:09:22 Andrew Bradford wrote:
> On 03/10/2014 04:21 PM, Grant Grundler wrote:
>> On Mon, Mar 10, 2014 at 10:52 AM, Andrew Bradford andrew@bradfordembedded.com wrote:
>>> Doing open-au tests nets no useful data, it's always fast, so just a few examples at the bottom.
>> Any chance this test was bottlenecked on the SDIO interface link speed?
> Could very well be. Was tested with a reader that does not support faster than 50 MHz data rate connected via PCMCIA.
> But I would expect that at high "open-au-nr" the controller would eventually become the bottleneck and drop speeds to kB/s levels, which didn't seem to happen. I was able to start an --open-au-nr=31 test (which takes quite a while and so I gave up as I'm impatient and was running on battery) with performance the same as the --open-au-nr=7 tests. So either I've gotten the erase block size very wrong, I'm doing something else bone-headed, or the controller is quite good.
I'm pretty sure it's the last of these. Samsung has in the past used sophisticated controller chips with this behavior. It's essentially what you'd find in a decent eMMC. The likely tradeoff is that a controller can get good random write performance out of cheap flash but needs more embedded RAM to manage it, and performance will degrade with fragmentation, whereas a classic SD card controller (or low-end eMMC) needs very little memory and starts out slow but does not get slower with fragmentation.
flashbench is not good at analysing this kind of device. I've tried to get behind it, but I could not understand exactly what the controller does.
Arnd
flashbench-results@lists.linaro.org