I can't interpret the numbers for the AUs. Can you help me?
# ./flashbench -a /dev/sde --blocksize=1024
sched_setscheduler: Operation not permitted
align 536870912 pre 745µs on 756µs post 696µs diff 36.2µs
align 268435456 pre 698µs on 882µs post 670µs diff 198µs
align 134217728 pre 722µs on 899µs post 715µs diff 180µs
align 67108864 pre 720µs on 712µs post 669µs diff 17.1µs
align 33554432 pre 750µs on 765µs post 639µs diff 70.1µs
align 16777216 pre 727µs on 777µs post 684µs diff 71.8µs
align 8388608 pre 691µs on 727µs post 646µs diff 58.3µs
align 4194304 pre 689µs on 769µs post 675µs diff 87.3µs
align 2097152 pre 697µs on 879µs post 715µs diff 173µs
align 1048576 pre 716µs on 899µs post 789µs diff 147µs
align 524288 pre 659µs on 775µs post 724µs diff 83.4µs
align 262144 pre 657µs on 735µs post 685µs diff 63.6µs
align 131072 pre 656µs on 706µs post 756µs diff -248ns
align 65536 pre 655µs on 715µs post 650µs diff 62.3µs
align 32768 pre 732µs on 792µs post 743µs diff 54.1µs
align 16384 pre 685µs on 808µs post 767µs diff 82.6µs
align 8192 pre 732µs on 764µs post 697µs diff 49.8µs
align 4096 pre 661µs on 692µs post 681µs diff 20.7µs
align 2048 pre 671µs on 657µs post 653µs diff -5465ns
# ./flashbench -O --erasesize=$[4 * 1024 * 1024] --blocksize=$[256 * 1024] /dev/sde --open-au-nr=2
sched_setscheduler: Operation not permitted
4MiB 6.69M/s
2MiB 11.1M/s
1MiB 13.2M/s
512KiB 14.4M/s
256KiB 14.6M/s
# ./flashbench -O --erasesize=$[4 * 1024 * 1024] --blocksize=$[256 * 1024] /dev/sde --open-au-nr=3
sched_setscheduler: Operation not permitted
4MiB 6.68M/s
2MiB 11.2M/s
1MiB 12.9M/s
512KiB 14.5M/s
256KiB 14.8M/s
# ./flashbench -O --erasesize=$[4 * 1024 * 1024] --blocksize=$[256 * 1024] /dev/sde --open-au-nr=4
sched_setscheduler: Operation not permitted
4MiB 6.8M/s
2MiB 10.9M/s
1MiB 13M/s
512KiB 14.4M/s
256KiB 14.6M/s
# ./flashbench -O --erasesize=$[4 * 1024 * 1024] --blocksize=$[256 * 1024] /dev/sde --open-au-nr=6
sched_setscheduler: Operation not permitted
4MiB 7.45M/s
2MiB 9.24M/s
1MiB 16.4M/s
512KiB 17.4M/s
256KiB 7.98M/s
--
Best regards,
Michael Monnerie, Ing. BSc
it-management Internet Services: Protéger
http://proteger.at [pronounced: Prot-e-schee]
Tel: +43 660 / 415 6531
root@netboot:~# (cd /sys/block/mmcblk0/device/ ; for i in * ; do if [ -f
$i ] ; then echo -n "$i: " ; cat $i ; fi ; done)
cid: 02544d534431364760dcbf359a00a800
csd: 400e00325b590000777f7f800a400000
date: 08/2010
fwrev: 0x0
hwrev: 0x6
manfid: 0x000002
name: SD16G
oemid: 0x544d
scr: 02b5800026027102
serial: 0xdcbf359a
type: SD
uevent: DRIVER=mmcblk
MMC_TYPE=SD
MMC_NAME=SD16G
MODALIAS=mmc:block
root@netboot:~# ~/flashbench/flashbench -a /dev/mmcblk0 --blocksize=1024
align 536870912 pre 838µs on 1.12ms post 960µs diff 222µs
align 268435456 pre 859µs on 1.09ms post 955µs diff 186µs
align 134217728 pre 860µs on 1.1ms post 957µs diff 196µs
align 67108864 pre 861µs on 1.11ms post 957µs diff 203µs
align 33554432 pre 860µs on 1.11ms post 960µs diff 197µs
align 16777216 pre 861µs on 1.11ms post 953µs diff 203µs
align 8388608 pre 859µs on 1.11ms post 961µs diff 198µs
align 4194304 pre 863µs on 1.1ms post 925µs diff 211µs
align 2097152 pre 862µs on 1.11ms post 965µs diff 199µs
align 1048576 pre 954µs on 978µs post 970µs diff 15.8µs
align 524288 pre 951µs on 977µs post 973µs diff 14.8µs
align 262144 pre 947µs on 977µs post 967µs diff 19.8µs
align 131072 pre 947µs on 983µs post 969µs diff 25.5µs
align 65536 pre 951µs on 981µs post 956µs diff 26.9µs
align 32768 pre 948µs on 977µs post 963µs diff 21.6µs
align 16384 pre 950µs on 978µs post 950µs diff 27.7µs
align 8192 pre 963µs on 960µs post 963µs diff -3106ns
align 4096 pre 962µs on 969µs post 956µs diff 10.1µs
align 2048 pre 963µs on 953µs post 951µs diff -3893ns
From which I guess a 2 MiB erase block size. I'm not sure how to interpret
the last 4 rows.
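One way to make the guess above less dependent on eyeballing is to look for the largest jump in the "diff" column between neighbouring alignments. The following is only a sketch: the function names are made up for illustration, the line format is assumed to match the output pasted above, and a single largest jump is a heuristic, not something flashbench itself reports.

```python
import re

# Matches e.g. "align 2097152 pre 862µs on 1.11ms post 965µs diff 199µs".
# Both common spellings of the micro sign are accepted.
ALIGN_LINE = re.compile(r"align\s+(\d+)\s.*?\bdiff\s+(-?[\d.]+)(ns|ms|[µμu]s|s)")

# Multipliers to convert each time unit to nanoseconds.
SCALE_NS = {"ns": 1.0, "us": 1e3, "ms": 1e6, "s": 1e9}

def parse_align(text):
    """Return (alignment_bytes, diff_ns) pairs from `flashbench -a` output."""
    rows = []
    for line in text.splitlines():
        m = ALIGN_LINE.search(line)
        if m:
            unit = m.group(3).replace("µ", "u").replace("μ", "u")
            rows.append((int(m.group(1)), float(m.group(2)) * SCALE_NS[unit]))
    return rows

def guess_boundary(rows):
    """Return the alignment where diff rises most over the next smaller one."""
    rows = sorted(rows)  # ascending by alignment
    best, best_jump = None, 0.0
    for (_, small_diff), (big_align, big_diff) in zip(rows, rows[1:]):
        jump = big_diff - small_diff
        if jump > best_jump:
            best, best_jump = big_align, jump
    return best

# A few rows copied from the SD16G output above.
SAMPLE = """\
align 4194304 pre 863µs on 1.1ms post 925µs diff 211µs
align 2097152 pre 862µs on 1.11ms post 965µs diff 199µs
align 1048576 pre 954µs on 978µs post 970µs diff 15.8µs
align 524288 pre 951µs on 977µs post 973µs diff 14.8µs
align 8192 pre 963µs on 960µs post 963µs diff -3106ns
"""

boundary = guess_boundary(parse_align(SAMPLE))
print(boundary)
```

On these rows the largest jump sits at 2097152, which agrees with the 2 MiB guess. Small or negative diffs, as in the last rows, show no boundary effect at those alignments under this heuristic.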
Only 1 open allocation unit by the look of the following :-(
root@netboot:~# ~/flashbench/flashbench --open-au --open-au-nr=1
--erasesize=$[2 * 1024 * 1024] --blocksize=$[256 * 1024] /dev/mmcblk0
2MiB 3.68M/s
1MiB 3.16M/s
512KiB 3.06M/s
256KiB 3.33M/s
root@netboot:~# ~/flashbench/flashbench --open-au --open-au-nr=2
--erasesize=$[2 * 1024 * 1024] --blocksize=$[256 * 1024] /dev/mmcblk0
2MiB 3.09M/s
1MiB 2.01M/s
512KiB 1.43M/s
256KiB 618K/s
root@netboot:~# ~/flashbench/flashbench -f /dev/mmcblk0
4MiB 2.51M/s 2.46M/s 3.73M/s 3.72M/s 3.72M/s 3.72M/s
2MiB 3.7M/s 2.46M/s 3.73M/s 3.73M/s 3.72M/s 3.72M/s
1MiB 3.7M/s 2.46M/s 3.73M/s 3.74M/s 3.72M/s 3.72M/s
512KiB 3.7M/s 2.45M/s 3.72M/s 3.7M/s 3.73M/s 3.73M/s
256KiB 3.71M/s 2.63M/s 3.72M/s 3.73M/s 3.72M/s 3.72M/s
128KiB 3.71M/s 2.82M/s 3.73M/s 3.72M/s 3.72M/s 3.72M/s
64KiB 3.71M/s 3.01M/s 3.72M/s 3.72M/s 3.72M/s 3.72M/s
32KiB 3.58M/s 2.73M/s 3.57M/s 3.57M/s 3.58M/s 3.57M/s
16KiB 2.95M/s 2.25M/s 2.94M/s 2.95M/s 2.94M/s 2.95M/s
The --scatter-order=<n> --scatter-span=<m> arguments are not really
documented, so I didn't know how I should be using them, so I didn't.
There are some strange outliers on the plot, most of which seem to end
up consistently in the same place.
This card has had an HD image file copied over it (with dd) prior to any
of these results being generated.
On Friday 18 March 2011 18:45:34 Justin Piszcz wrote:
> On Fri, 18 Mar 2011, Arnd Bergmann wrote:
> > Getting back to the original question, I'd recommend testing the
> > stick by doing raw accesses instead of a file system. A simple
>
> Ok, here are the results:
>
> root@sysresccd /root % time dd if=/dev/zero of=/dev/sda oflag=direct bs=4M
> dd: writing `/dev/sda': No space left on device
> 1961+0 records in
> 1960+0 records out
> 8220835840 bytes (8.2 GB) copied, 283.744 s, 29.0 MB/s
Ok, so no immediate problem there.
> > I'm also interested in results from flashbench
> > (git://git.linaro.org/people/arnd/flashbench.git, e.g. like
> > http://lists.linaro.org/pipermail/flashbench-results/2011-March/000039.html)
> > That might help explain how the stick failed.
>
> Certainly, testing below, following this:
> http://lists.linaro.org/pipermail/flashbench-results/2011-March/000039.html
I'm sorry, I should have been more specific. Unfortunately, running flashbench
is not very user friendly yet.
The results indicate that the device does not have a 2 MB erase block size
but rather 4 or 8, which is more common on 8 GB media.
> # ./flashbench --open-au --open-au-nr=1 /dev/sda --blocksize=8192 --erasesize=$[2* 1024 * 1024] --random
> 2MiB 29.5M/s
> 1MiB 29.1M/s
> 512KiB 28.5M/s
> 256KiB 22.8M/s
> 128KiB 23.8M/s
> 64KiB 24.4M/s
> 32KiB 18.9M/s
> 16KiB 13.1M/s
> 8KiB 8.22M/s
>
> # ./flashbench --open-au --open-au-nr=4 /dev/sda --blocksize=8192 --erasesize=$[2* 1024 * 1024] --random
> 2MiB 25.9M/s
> 1MiB 21.8M/s
> 512KiB 15M/s
> 256KiB 11.9M/s
> 128KiB 12.1M/s
> 64KiB 13.6M/s
> 32KiB 9.81M/s
> 16KiB 6.41M/s
> 8KiB 3.88M/s
The numbers are jumping around a bit with the incorrectly guessed erasesize.
These values should be more like the ones in the first test. Can you rerun
with --erasesize=$[4 * 1024 * 1024]?
Also, what is the output of 'lsusb' for this stick? I'd like to add the
data to https://wiki.linaro.org/WorkingGroups/KernelConsolidation/Projects/FlashCar…
> # ./flashbench --open-au --open-au-nr=5 /dev/sda --blocksize=8192 --erasesize=$[2* 1024 * 1024] --random
> 2MiB 29.2M/s
> 1MiB 27.8M/s
> 512KiB 18.4M/s
> 256KiB 7.82M/s
> 128KiB 4.62M/s
> 64KiB 2.47M/s
> 32KiB 1.26M/s
> 16KiB 642K/s
> 8KiB 327K/s
This is where your drive stops coping with the accesses: Writing small
blocks to four different erase blocks (2MB for the test, probably
larger) works fine, but writing to five of them is devastating for
performance, going from 30 MB/s to 300 KB/s, or lower if you were
to write smaller than 8 KB blocks.
The cutoff at --open-au-nr=4 is coincidentally the same as for the
SD card I was testing. This is what happens in the animation in
http://lwn.net/Articles/428799/. The example given there is for
a drive that can only have two open AUs (allocation units aka
erase blocks), while yours does 4.
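The cutoff described above can be located mechanically by comparing the smallest-block throughput across runs with increasing --open-au-nr. This is a rough sketch only: the helper names and the "more than 4x drop" threshold are assumptions chosen for illustration, and the K/M suffixes are treated as binary multiples, which may not match how flashbench scales them (only the ratios matter here).

```python
def parse_rate(token):
    """Convert a rate such as '8.22M/s' or '327K/s' to bytes per second."""
    if not token.endswith("/s"):
        raise ValueError("not a rate: %r" % token)
    body = token[:-2]
    scale = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}  # assumed binary units
    if body[-1] in scale:
        return float(body[:-1]) * scale[body[-1]]
    return float(body)

def last_good_open_au(rates_by_nr, max_drop=4.0):
    """Return the last open-au-nr before throughput collapses.

    rates_by_nr maps --open-au-nr to the rate measured at the smallest
    block size. A collapse is a drop by more than max_drop between two
    consecutive tested values. Returns None if no collapse is seen.
    """
    nrs = sorted(rates_by_nr)
    for prev, cur in zip(nrs, nrs[1:]):
        if rates_by_nr[prev] / rates_by_nr[cur] > max_drop:
            return prev
    return None

# 8KiB rows from the three --random runs quoted above.
runs = {
    1: parse_rate("8.22M/s"),
    4: parse_rate("3.88M/s"),
    5: parse_rate("327K/s"),
}
print(last_good_open_au(runs))
```

With the quoted numbers the drop from 1 to 4 open AUs is roughly 2x, while the drop from 4 to 5 is more than 10x, so the sketch reports 4.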
> (did not run one with 7)
Note that the test results I had with 6 and 7 are without --random,
so the cut-off there was higher for that card when writing an
multiple erase blocks from start to finish instead of writing random
sectors inside of them.
> # ./flashbench --findfat --fat-nr=10 /dev/sda --blocksize=1024 --erasesize=$[2* 1024 * 1024] --random
> 2MiB 22.7M/s 19.1M/s 15.5M/s 13.1M/s 29.5M/s 29.5M/s 29.6M/s 29.6M/s 29.5M/s 29.5M/s
> 1MiB 20.6M/s 13.3M/s 13.3M/s 20.8M/s 18.1M/s 17.8M/s 18M/s 18.3M/s 18.8M/s 18.6M/s
> 512KiB 18.4M/s 18.6M/s 18.3M/s 18.1M/s 23.5M/s 23.2M/s 23.5M/s 23.5M/s 23.4M/s 23.4M/s
> 256KiB 26.9M/s 21.3M/s 21.2M/s 21M/s 21.1M/s 21.2M/s 21.1M/s 21.1M/s 20.6M/s 21M/s
> 128KiB 22.2M/s 22.3M/s 22.6M/s 21.4M/s 21.5M/s 21.3M/s 21.6M/s 21.3M/s 21.4M/s 21.4M/s
> 64KiB 23.9M/s 22.6M/s 22.9M/s 23M/s 22.5M/s 22.4M/s 22.4M/s 22.4M/s 22.5M/s 22.4M/s
> 32KiB 18.2M/s 18.3M/s 18.3M/s 18.3M/s 18.3M/s 18.4M/s 18.3M/s 18.2M/s 18.3M/s 18.3M/s
> 16KiB 12.9M/s 12.9M/s 13M/s 13M/s 12.9M/s 13M/s 12.9M/s 12.9M/s 12.9M/s 12.9M/s
> 8KiB 8.14M/s 8.15M/s 8.15M/s 8.15M/s 8.15M/s 8.14M/s 8.14M/s 8.15M/s 8.15M/s 8.06M/s
> 4KiB 4.07M/s 4.08M/s 4.07M/s 4.06M/s 4.04M/s 4.04M/s 4.04M/s 4.04M/s 4.04M/s 4.04M/s
> 2KiB 2.02M/s 2.02M/s 2.02M/s 2.02M/s 2.02M/s 2.01M/s 2.01M/s 2.01M/s 2.01M/s 2.02M/s
> 1KiB 956K/s 954K/s 956K/s 953K/s 947K/s 947K/s 947K/s 950K/s 947K/s 948K/s
>
One thing that is very clear from this is that this stick has a page size
of 8KB, and that it requires at least 64 KB transfers for the maximum speed.
If your partition is not aligned to 8 KB or more (better: to the erase
block size, e.g. 4 MB) or if the file system writes smaller than 8 KB
naturally aligned blocks at once, the drive has to do read-modify-write
cycles that severely impact performance and the expected life-time.
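As an illustration of the alignment rule above, the following sketch checks whether a partition start is aligned to a given size and computes the next aligned start sector. The function names are hypothetical, 512-byte sectors are assumed, and the 8 KiB and 4 MiB figures are just the values discussed for this stick.

```python
def is_aligned(start_sector, alignment_bytes, sector_size=512):
    """True if the partition's byte offset is a multiple of alignment_bytes."""
    return (start_sector * sector_size) % alignment_bytes == 0

def next_aligned_sector(start_sector, alignment_bytes, sector_size=512):
    """Smallest sector >= start_sector whose byte offset is aligned.

    Assumes alignment_bytes is itself a multiple of sector_size.
    """
    offset = start_sector * sector_size
    aligned = -(-offset // alignment_bytes) * alignment_bytes  # round up
    return aligned // sector_size

PAGE = 8 * 1024            # page size inferred above for this stick
ERASE = 4 * 1024 * 1024    # erase block size suggested above (could be 8 MiB)

for sector in (63, 2048, 8192):
    print(sector, is_aligned(sector, PAGE), is_aligned(sector, ERASE))
```

A start at sector 63 fails both checks, sector 2048 (1 MiB) is page-aligned but not erase-block-aligned, and sector 8192 (4 MiB) satisfies both.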
I cannot see any block that is optimized for storing the FAT, which is
good, as this means that the manufacturer did not exclusively design
the stick for FAT32, as is normally the case with flash memory cards.
For this stick, I would strongly recommend creating the file system
in a way that writes at least 16 KB naturally aligned blocks at all
times, but I don't know if that's supported by XFS.
Also, the limitation of forcing a garbage collection when writing to
more than four 4 MB (or so) segments may be a problem, depending on
how XFS stores its metadata. The good news is that it can do random
write access inside of the erase blocks.
Arnd
Hello Arnd,
Here are the hdparm and smartctl outputs for the first card I tested
(Transcend ultra 2GB Industrial 'CF100BA8'). There seems to be no dma
mode there, only PIO.
Philippe
tmp179:~ # cat /proc/partitions
major minor #blocks name
8 0 244198584 sda
8 1 2103296 sda1
8 2 20972544 sda2
8 16 1990296 sdb
tmp179:~ # hdparm -i /dev/sdb
/dev/sdb:
Model=PIO, FwRev=20100202, SerialNo=20100716 CF100BA8
Config={ HardSect NotMFM Fixed DTR>10Mbs }
RawCHS=3949/16/63, TrkSize=0, SectSize=576, ECCbytes=4
BuffType=DualPort, BuffSize=1kB, MaxMultSect=1, MultSect=off
CurCHS=3949/16/63, CurSects=3980592, LBA=yes, LBAsects=3980592
IORDY=no, tPIO={min:120,w/IORDY:120}
PIO modes: pio0 pio1 pio2 pio3 pio4
AdvancedPM=yes: disabled (255)
Drive conforms to: Unspecified: ATA/ATAPI-4
* signifies the current active mode
tmp179:~ # hdparm -I /dev/sdb
/dev/sdb:
CompactFlash ATA device
Model Number: PIO
Serial Number: 20100716 CF100BA8
Firmware Revision: 20100202
Standards:
Supported: 4
Likely used: 6
Configuration:
Logical max current
cylinders 3949 3949
heads 16 16
sectors/track 63 63
--
CHS current addressable sectors: 3980592
LBA user addressable sectors: 3980592
Logical/Physical Sector size: 512 bytes
device size with M = 1024*1024: 1943 MBytes
device size with M = 1000*1000: 2038 MBytes (2 GB)
cache/buffer size = 1 KBytes (type=DualPort)
Capabilities:
LBA, IORDY(may be)(cannot be disabled)
Standby timer values: spec'd by Vendor
R/W multiple sector transfer: Max = 1 Current = 0
Advanced power management level: disabled
DMA: not supported
PIO: pio0 pio1 pio2 pio3 pio4
Cycle time: no flow control=120ns IORDY flow control=120ns
Commands/features:
Enabled Supported:
* SMART feature set
Security Mode feature set
Power Management feature set
WRITE_BUFFER command
READ_BUFFER command
NOP cmd
CFA feature set
Advanced Power Management feature set
* Gen1 signaling speed (1.5Gb/s)
* CFA Power Level 1 (max 500mA)
Security:
Master password revision code = 65534
supported
not enabled
not locked
not frozen
not expired: security count
not supported: enhanced erase
2min for SECURITY ERASE UNIT.
Integrity word not set (found 0x0000, expected 0x52a5)
tmp179:~ # smartctl --all -T permissive /dev/sdb
smartctl 5.40 2010-10-16 r3189 [i686-pc-linux-gnu] (SUSE RPM)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF INFORMATION SECTION ===
Device Model: PIO
Serial Number: 20100716 CF100BA8
Firmware Version: 20100202
User Capacity: 2,038,063,104 bytes
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 4
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Fri Mar 18 11:35:28 2011 CET
SMART support is: Ambiguous - ATA IDENTIFY DEVICE words 85-87 don't show if SMART is enabled.
Checking to be sure by trying SMART RETURN STATUS command.
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x26) Offline data collection activity
is in a Reserved state.
Auto Offline Data Collection: Disabled.
Total time to complete Offline
data collection: ( 3) seconds.
Offline data collection
capabilities: (0x00) Offline data collection not supported.
SMART capabilities: (0x0002) Does not save SMART data before
entering power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x00) Error logging NOT supported.
No General Purpose Logging support.
SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x0000 100 100 000 Old_age Offline - 0
2 Throughput_Performance 0x0000 100 100 000 Old_age Offline - 0
5 Reallocated_Sector_Ct 0x0000 100 100 000 Old_age Offline - 0
7 Seek_Error_Rate 0x0000 100 100 000 Old_age Offline - 0
8 Seek_Time_Performance 0x0000 100 100 000 Old_age Offline - 0
12 Power_Cycle_Count 0x0000 100 100 000 Old_age Offline - 43
195 Hardware_ECC_Recovered 0x0000 100 100 000 Old_age Offline - 0
196 Reallocated_Event_Count 0x0000 100 100 000 Old_age Offline - 0
197 Current_Pending_Sector 0x0000 100 100 000 Old_age Offline - 0
198 Offline_Uncorrectable 0x0000 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0000 100 100 000 Old_age Offline - 0
200 Multi_Zone_Error_Rate 0x0000 100 100 000 Old_age Offline - 0
Warning: device does not support Error Logging
Error SMART Error Log Read failed: Input/output error
Smartctl: SMART Error Log Read Failed
Warning: device does not support Self Test Logging
Error SMART Error Self-Test Log Read failed: Input/output error
Smartctl: SMART Self Test Log Read Failed
Device does not support Selective Self Tests/Logging
tmp179:~ #
--
Philippe De Muyter +32 2 6101532 Macq SA rue de l'Aeronef 2 B-1140 Bruxelles
On Saturday 05 March 2011 00:55:29 Philippe De Muyter wrote:
> I have read with interest your article in lwn, and decided to try your flashbench
> program to discover the characteristics of the CF card we use.
>
> Unfortunately,
>
> ./flashbench -a -b 1K /dev/sdb
>
> fails with endless :
>
> time_read: Invalid argument
>
> Looking with strace, I get e.g.:
>
> pread64(3, 0xb3728000, 1, 1023410175) = -1 EINVAL (Invalid argument)
>
> Is that a behavior you can explain (and fix) ?
Yes, there are two known problems that I need to fix:
1. flashbench uses O_DIRECT for talking to the device, which means that
all accesses must be on multiples of full sectors based on the
underlying device sector size. It should detect this.
2. command line arguments currently need to be natural numbers. There
is no parser for interpreting arguments like 1K or 4M as kilobytes and
megabytes.
It should work when you do
./flashbench -a -b 1024 /dev/sdb
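Until flashbench parses suffixes itself, a small wrapper could expand them before the command is run. The sketch below is not part of flashbench: parse_size and direct_io_ok are hypothetical helpers, a 512-byte sector size is assumed, and the second function only checks the offset and length conditions described in point 1.

```python
def parse_size(text):
    """Convert '1K', '4M', '2G' or a plain number to a byte count."""
    suffixes = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    text = text.strip()
    if text and text[-1].upper() in suffixes:
        return int(text[:-1]) * suffixes[text[-1].upper()]
    return int(text)

def direct_io_ok(offset, length, sector_size=512):
    """True if offset and length are both whole multiples of the sector size."""
    return length > 0 and offset % sector_size == 0 and length % sector_size == 0

print(parse_size("1K"))                  # value to pass as -b
print(direct_io_ok(1023410175, 1))       # the pread64 call from the strace
print(direct_io_ok(1023410176, 1024))
```

The failing call in the strace reads 1 byte at an offset that is not a multiple of 512, which would be consistent with "1K" having been taken as a block size of 1.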
Arnd
On Sunday 13 March 2011, Dirk Behme wrote:
> Hi Arnd,
>
> reading the flashbench README [1] what I really like is the
> explanation of the -a and the -O option. Explaining the options in
> this README, giving the examples and how to interpret the resulting
> figures really does help.
>
> Somehow I miss something similar for the -s and -f options. I.e. how
> to select the proper values for --scatter-order and --scatter-span and
> how to interpret the output of -s and -f.
>
> Once I understood it, I would be able to send a patch for the README ;)
>
> Additionally, it would be nice to give the flashbench options used for
> the graphs [2] [3] in the LWN article. The LWN article explains quite
> nicely how to interpret the given graphs, but it's not mentioned which
> flashbench options were used to get these graphs.
>
> Many thanks for your help and best regards
>
> Dirk
>
> [1]
> http://git.linaro.org/gitweb?p=people/arnd/flashbench.git;a=blob;f=README;h…
>
Sorry for the late reply. I promise I'll get to it and update the
README.
I should actually remove --scatter-order, it's too difficult to
understand this. It specifies the log2 of the number of blocks
in terms of --blocksize to be tested at the start of the medium.
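Read literally, that description means the tested region grows as a power of two. A minimal sketch of the arithmetic, assuming this reading is right and ignoring whatever --scatter-span contributes (the function name is made up for illustration):

```python
def scatter_region_bytes(scatter_order, blocksize):
    """Bytes covered at the start of the medium: 2**order blocks of blocksize."""
    return (1 << scatter_order) * blocksize

# Example values chosen for illustration only.
print(scatter_region_bytes(10, 16384) // (1024 * 1024), "MiB")
```

Under this reading, an order of 10 with 16 KiB blocks covers 1024 blocks, i.e. the first 16 MiB of the device.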
The output file can be interpreted by
gnuplot -p -e 'plot "output.file"'
or by importing it into a spreadsheet program like oocalc and
using the XY chart function on two columns.
For --findfat, the output shows how each of the first N erase blocks
on the drive reacts to certain access patterns within the erase
block. Most drives do something different for a few blocks in the
beginning to optimize storing the FAT on them. Each column is one
erase block here. If they are all the same, the card does not have
an optimized FAT area.
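The "are all columns the same" check can be sketched as follows. This is an illustration only: the helper names and the 20% tolerance are arbitrary assumptions, the suffixes are treated as binary multiples, and a deviating column merely marks an erase block that behaves differently, which is not proof of a FAT-optimized area.

```python
from statistics import median

def parse_rate(token):
    """Convert a rate such as '3.72M/s' or '956K/s' to bytes per second."""
    body = token[:-2] if token.endswith("/s") else token
    scale = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}  # assumed binary units
    if body[-1] in scale:
        return float(body[:-1]) * scale[body[-1]]
    return float(body)

def odd_columns(row, tolerance=0.2):
    """Return indices of erase blocks whose rate deviates from the row median.

    row is one line of --findfat output: a block size followed by one rate
    per erase block.
    """
    _size, *tokens = row.split()
    rates = [parse_rate(t) for t in tokens]
    mid = median(rates)
    return [i for i, r in enumerate(rates) if abs(r - mid) / mid > tolerance]

# One row from the SD16G card and one from the USB stick, both quoted earlier
# in this thread.
sd_row = "2MiB 3.7M/s 2.46M/s 3.73M/s 3.73M/s 3.72M/s 3.72M/s"
usb_row = ("16KiB 12.9M/s 12.9M/s 13M/s 13M/s 12.9M/s 13M/s "
           "12.9M/s 12.9M/s 12.9M/s 12.9M/s")

print(odd_columns(sd_row))
print(odd_columns(usb_row))
```

For the SD card row, the second erase block (index 1) stands out. For the USB stick row nothing does, which matches the observation that the stick shows no block treated specially.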
Arnd