On Tuesday 08 March 2011, Philippe De Muyter wrote:
On Tue, Mar 08, 2011 at 02:41:03PM +0100, Arnd Bergmann wrote:
The -a tests are still inconclusive. What kind of device is this? What is the total size (in /proc/partitions)? For a USB device, what is the output of lsusb? For an SD card, can you put the card into a built-in SD card reader and do 'head /sys/block/mmcblk0/device/*'?
It is a 2GB compactflash card. I do not remember the brand and model. I'll look them up tomorrow.
From `dmesg':
[ 1.202196] ata3.00: CFA: PIO, 20100202, max PIO4
[ 1.202241] ata3.00: 3980592 sectors, multi 0: LBA
[ 1.202284] ata3.00: applying bridge limits
[ 1.208129] ata3.00: configured for PIO4
[ 1.243481] scsi 2:0:0:0: Direct-Access ATA PIO 2010 PQ: 0 ANSI: 5
[ 1.243754] sd 2:0:0:0: [sdb] 3980592 512-byte logical blocks: (2.03 GB/1.89 GiB)

From /proc/partitions:
   8   16   1990296 sdb
   8   17   1989760 sdb1
Fascinating, so the size is actually not a multiple of any significant block, which means that it either uses a completely different algorithm or it has a segment size that is slightly smaller or bigger.
I noticed that the partition starts at sector 514, which is just slightly over 512, and the drive size is a multiple of 504 sectors (3980592 = 7898 * 504), which is just slightly less than 512. Flashbench does not handle this yet, but I plan to add support for it because I have recently encountered a USB stick that uses 4128 KB segments.
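The divisibility argument can be checked with a few lines of Python. This is only a sketch: the sector count is the one from the dmesg output above, while the helper names (`divides`, `largest_pow2_divisor`) and the list of candidate sizes are my own choices for illustration, not anything flashbench provides.

```python
# Sector count reported by dmesg for the card (512-byte sectors).
TOTAL_SECTORS = 3980592
SECTOR_BYTES = 512


def divides(total_sectors, unit_sectors):
    """True if the device size is a whole number of units."""
    return total_sectors % unit_sectors == 0


def largest_pow2_divisor(n):
    """Largest power of two that divides n (lowest set bit of n)."""
    return n & -n


# Typical power-of-two segment candidates, expressed in KiB.
for kib in (64, 128, 256, 512, 1024, 2048, 4096):
    unit = kib * 1024 // SECTOR_BYTES
    print(f"{kib:5d} KiB: multiple = {divides(TOTAL_SECTORS, unit)}")

# The odd unit mentioned above: 504 sectors (252 KiB).
print("504 sectors:", divides(TOTAL_SECTORS, 504), TOTAL_SECTORS // 504)

# Largest power-of-two unit that still divides the size, in sectors.
print("largest power-of-two divisor:", largest_pow2_divisor(TOTAL_SECTORS), "sectors")
```

None of the power-of-two candidates divides the size evenly, while 504 sectors does, which is what makes the card look unusual.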
One thing that you can try is to run
flashbench -s --scatter-span=3 --scatter-order=14 --blocksize=8192 --count=100 /dev/sdb -o output.plot
gnuplot -p -e 'plot "output.plot"'
This will show you the time to read 3 blocks of 8 KB starting at 2^14 addresses 8KB apart. Normally, the boundaries between segments will show up as periodic dots above one or two relatively straight lines. Looking at the offsets in the output.plot text file will show you exactly where these are.
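To make the "periodic dots" idea concrete, here is a rough Python sketch of how the outliers could be picked out once the measurements are available as (offset, time) pairs. It assumes nothing about the layout of output.plot, so reading the file is left out. The function names, the median-based threshold and the factor of 1.5 are assumptions of mine, and the sample data at the end is synthetic, not measured.

```python
from statistics import median

BLOCK = 8192   # --blocksize=8192
ORDER = 14     # --scatter-order=14


def covered_span_bytes(blocksize=BLOCK, order=ORDER):
    """Address range covered by 2^order probes spaced one block apart."""
    return (1 << order) * blocksize


def slow_offsets(samples, factor=1.5):
    """Offsets whose read time exceeds factor * median time.

    samples is a list of (offset_bytes, seconds) pairs.
    """
    base = median(t for _, t in samples)
    return [off for off, t in samples if t > factor * base]


def gaps(offsets):
    """Distances between consecutive slow offsets."""
    return [b - a for a, b in zip(offsets, offsets[1:])]


# Synthetic example only: every 4th probe is made artificially slow.
fake = [(i * BLOCK, 2.0 if i % 4 == 0 else 1.0) for i in range(16)]
print(covered_span_bytes() // (1024 * 1024), "MiB covered")
print(gaps(slow_offsets(fake)))
```

If the gaps between slow offsets come out constant on real data, that constant is a candidate for the segment size.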
When you pass the correct size, you should see the same result that you see without --random, i.e. it's always fast with --open-au-nr=3 but much slower with --open-au-nr=4.
Here are some more results:
tmp179:~ # ./flashbench --open-au --open-au-nr=3 --erasesize=$[2048 * 1024] /dev/sdb --random
sched_setscheduler: Operation not permitted
2MiB    5.31M/s
1MiB    5.44M/s
512KiB  5.13M/s
256KiB  4.45M/s
128KiB  4.45M/s
64KiB   4.03M/s
32KiB   3.56M/s
16KiB   3.08M/s

tmp179:~ # ./flashbench --open-au --open-au-nr=4 --erasesize=$[2048 * 1024] /dev/sdb --random
sched_setscheduler: Operation not permitted
2MiB    5.29M/s
1MiB    5.43M/s
512KiB  4.85M/s
256KiB  3.91M/s
128KiB  3.35M/s
64KiB   3.61M/s
32KiB   3.08M/s
16KiB   2.62M/s

tmp179:~ # ./flashbench --open-au --open-au-nr=4 --erasesize=$[1024 * 1024] /dev/sdb --random
sched_setscheduler: Operation not permitted
1MiB    5.29M/s
512KiB  5.1M/s
256KiB  4.28M/s
128KiB  4.55M/s
64KiB   4.94M/s
32KiB   4.62M/s
16KiB   4.34M/s
Ok, so at 2 MB you see a noticeable slowdown with nr=4, but not at smaller sizes. Since the test areas are always a segment apart, that would indicate that 2 MB is the correct segment size, but it's still guesswork because the effect is not very strong.
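For reference, the size of the slowdown between the two 2 MiB runs can be put into numbers with a short Python sketch. The throughput figures are copied from the nr=3 and nr=4 runs with --erasesize=$[2048 * 1024] above; the dictionary and function names are just mine.

```python
# Throughput in M/s from the two runs with a 2 MiB erase size.
NR3 = {"2MiB": 5.31, "1MiB": 5.44, "512KiB": 5.13, "256KiB": 4.45,
       "128KiB": 4.45, "64KiB": 4.03, "32KiB": 3.56, "16KiB": 3.08}
NR4 = {"2MiB": 5.29, "1MiB": 5.43, "512KiB": 4.85, "256KiB": 3.91,
       "128KiB": 3.35, "64KiB": 3.61, "32KiB": 3.08, "16KiB": 2.62}


def ratios(baseline, other):
    """Per-block-size throughput of `other` relative to `baseline`."""
    return {size: other[size] / baseline[size] for size in baseline}


r = ratios(NR3, NR4)
for size, ratio in r.items():
    print(f"{size:>7}: nr=4 runs at {ratio:.0%} of nr=3")

# Block size with the largest relative drop.
worst = min(r, key=r.get)
print("largest drop at", worst)
```

The large writes are practically unchanged, and the drop only shows up at 512 KiB and below, which fits the remark that the effect is not very strong.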
Thanks for your persistence, I very much appreciate your help and I hope we get down to the real design of this card.
Arnd