[i do not shorten this msg because it was not going to the list because i did not reply to list]
On 08/07/11 22:41, Arnd Bergmann wrote:
On Friday 08 July 2011 19:36:56 Peter Warasin wrote:
Hi Arnd
thank you for your help!
On 08/07/11 16:55, Arnd Bergmann wrote:
but there is a problem using three or more. However, the performance is still basically constant, so the card is doing something smart. It's probably using the new Sandisk trick with SLC and MLC areas. On the one hand, this is
ah, so the journaling of ext3 *should* not be a problem, right?
The journaling normally makes things worse, but by how much depends on the card characteristics. Sometimes it's almost nothing, sometimes it's orders of magnitude. In this case, the answer is probably "noticeable, but not drastic".
good news, because the card doesn't behave that badly, on the other hand it's not easy to measure and the baseline of 2 MB/s is rather slow.
I see. So this means aligning to 4MB is a good thing? Or does this simply mean i can write 2 blocks at once?
4 MB alignment is always the right thing.
It's still not all that bad. If you want to try other things, first of all try the same with the class 10 card. Also, add the '--random' flag to see if it changes anything. Any measurements with '--random' are more likely to be useful than those without, but it may be that the class 10 card won't like that argument and becomes extremely slow.
our class10 is a transcend card, so making tests with this card is like comparing apples with pears i think. maybe the quality of that card is also not the best. we try to get a class6 and class10 sandisk
Transcend cards vary extremely in their quality. Even comparing two Transcend cards with identical labels can sometimes be apples and oranges (or pears). Some that I've seen are quite ok, others are absolutely horrible.
If you want to experiment more with flashbench, I suggest trying the transcend card first, it will give you much more understandable results, because in my experience the controllers are so much simpler there.
ok, will try that monday and send results..
but what --random does, isn't that more likely what a normal system does on the SD card when i have /var and swap on the SD card? so this means it is quite normal that it will be very slow, is it?
Yes. If the card is this bad at random access, it won't work well with Linux at all. You can either try to get a better card right away, or continue figuring out what the card actually does. It's getting quite interesting here, I think.
same test with --random:
./flashbench -O --open-au-nr=2 --random /dev/mmcblk0p3
4MiB    839K/s
2MiB    3.26M/s
1MiB    8.7M/s
512KiB  1.48M/s
256KiB  731K/s
128KiB  952K/s
64KiB   1.18M/s
32KiB   594K/s
16KiB   356K/s
i guess i should align to 1MB then?
No, different issue. This card is rather tricky and what it probably does is clean during long writes after smaller writes. If you do "./flashbench -O --open-au-nr=2 --random /dev/mmcblk0p3 --blocksize=$[1024*1024]" a few times repeatedly, it should get better at the second or third run. You can normally ignore any odd results in the first one or two rows of a test run for this reason.
yes, it does, however it has very varying values when starting it a couple of times. from 2x faster to all slow.
is the garbage collector slow and running all the time after 2nd try?
# ./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O
4MiB    2.16M/s
2MiB    2.86M/s
1MiB    4.2M/s
512KiB  1.47M/s
256KiB  735K/s
128KiB  904K/s
64KiB   1.12M/s
32KiB   550K/s
16KiB   360K/s

# ./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O
4MiB    3.5M/s
2MiB    6.58M/s
1MiB    9.5M/s
512KiB  1.49M/s
256KiB  734K/s
128KiB  986K/s
64KiB   1.17M/s
32KiB   589K/s
16KiB   369K/s

# ./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O
4MiB    647K/s
2MiB    603K/s
1MiB    604K/s
512KiB  631K/s
256KiB  745K/s
128KiB  987K/s
64KiB   1.17M/s
32KiB   588K/s
16KiB   309K/s

# ./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O
4MiB    652K/s
2MiB    599K/s
1MiB    604K/s
512KiB  636K/s
256KiB  736K/s
128KiB  1M/s
64KiB   1.16M/s
32KiB   540K/s
16KiB   352K/s
what i also noticed on this card is that the card is running with legacy timing instead of high-speed:
cat /sys/kernel/debug/mmc0/ios
clock:          25000000 Hz
vdd:            20 (3.2 ~ 3.3 V)
bus mode:       2 (push-pull)
chip select:    0 (don't care)
power mode:     2 (on)
bus width:      2 (4 bits)
timing spec:    0 (legacy)
i tried to figure out why, but most probably i guess the card will respond with SCR_SPEC_VER_1 for sda_vsn i *think*, since the controller (marvell/mvsdio) is MMC_CAP_SD_HIGHSPEED capable and i see no kernel warnings that the card is not able to do the mandatory switch function. will try to hardcode to SD_HIGHSPEED next week.. (for testing). could that help or is it completely irrelevant?
The result can mean one of two things:
- the erase block size is something other than 4 MB, and you first need to figure out the correct size to rerun it.
- The card can not actually do random writes at all, which is unusual for Sandisk cards, but is something I've seen before in a few cases.
./flashbench -O --open-au-nr=3 --random /dev/mmcblk0p3
4MiB    2.84M/s
2MiB    2.46M/s
1MiB    5.23M/s
512KiB  1.33M/s
256KiB  697K/s
128KiB  1.08M/s
64KiB   904K/s
32KiB   534K/s
16KiB   360K/s

./flashbench -O --open-au-nr=5 --random /dev/mmcblk0p3
4MiB    2.87M/s
2MiB    2.56M/s
1MiB    1.71M/s
512KiB  1.78M/s
256KiB  920K/s
128KiB  471K/s
64KiB   320K/s
32KiB   276K/s
16KiB   215K/s
Not much to see here, it only shows that the card isn't all that great for random access.
You can get measurements for smaller block sizes in addition to the values down to 16KB by passing --blocksize=512. This may get rather slow towards the end, but is very relevant because the block size used by ext3 is only 4KiB.
ah great. ok, did i understand right that i do these tests to figure out the actual values of erase block size and blocksize? when i know those values, what do i do with them? align partitions to erase block size and format filesystem with SD blocksize (if possible)?
Correct for erasesize, that should be the partition alignment. Possible values here are 1.5MB, 2MB, 3MB, 4MB, 6MB and 8 MB, I haven't seen anything else with SD cards yet.
The blocksize argument you pass to flashbench only affects how many tests are done. By default, it assumes blocksize=16384 and stops there,
aaah, i understand.
but you can pass any other blocksize that is a multiple of 512 and a power-of-two fraction of the erasesize. When you pass blocksize=512, flashbench will print five more lines of output, but typically get very slow. The blocksize that you should use in the file system is the smallest number where flashbench gives a non-catastrophic result. Often you see something like
...
64KiB   3.2M/s
32KiB   2.0M/s
16KiB   1.6M/s
8KiB    821K/s
4KiB    401K/s
2KiB    198K/s
1KiB    90K/s
512     43K/s
In that example, the block size would be 16KB: As you can see, writing an 8KB block takes about as long as writing a 16KB block (roughly 10ms either way), so going below 16KB is never worth it from a performance PoV.
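The arithmetic behind that reading can be sketched as follows. This is only an illustration: the rows are the ones from the example above, the K and M suffixes are assumed to mean 10^3 and 10^6 bytes per second, and the 10% threshold is an arbitrary choice rather than anything flashbench itself uses.

```python
# Rows from the example above: (block size in bytes, throughput in bytes/s).
# Assumption: K = 10^3 and M = 10^6 bytes per second.
ROWS = [
    (64 * 1024, 3.2e6),
    (32 * 1024, 2.0e6),
    (16 * 1024, 1.6e6),
    (8 * 1024, 821e3),
    (4 * 1024, 401e3),
    (2 * 1024, 198e3),
    (1024, 90e3),
    (512, 43e3),
]


def write_time_ms(size_bytes, bytes_per_s):
    """Time for a single write of size_bytes at the measured throughput."""
    return 1000.0 * size_bytes / bytes_per_s


def smallest_useful_block(rows, min_gain=0.10):
    """Walk from large to small blocks and return the last size for which
    halving the block still made a single write at least min_gain faster.
    min_gain is a hypothetical threshold chosen for this sketch."""
    for (big, big_rate), (small, small_rate) in zip(rows, rows[1:]):
        t_big = write_time_ms(big, big_rate)
        t_small = write_time_ms(small, small_rate)
        if t_small > (1.0 - min_gain) * t_big:
            return big
    return rows[-1][0]


if __name__ == "__main__":
    for size, rate in ROWS:
        print(f"{size:6d} bytes: {write_time_ms(size, rate):6.2f} ms per write")
    print("smallest useful block:", smallest_useful_block(ROWS))
```

Below 16KiB the time per write stays near 10ms, so a smaller block moves less data in the same time.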
i guess ideal for the SD card performance would be to format with 4MiB blocksize, but that's impossible with ext3 and bad for me because i waste too much space with small files, right?
The largest block size supported by any Linux file system today is 4KB, but we're working on increasing that to 64KB, which is quite wasteful.
ah good. so right now best i can do is using 4k blocksize for the filesystem, even when this is not the ideal size.
is the copy-on-write feature of btrfs maybe helping?
definitely.
great. will do that
what about swap .. does it make sense creating a swap with 4MiB pagesize?
You can't. The swap page size is a property of the CPU, you don't get to choose it.
ah ok.
last question, hope i do not stress you too much:
Don't worry. But please try to figure out any cards that you have access to and send the results to the mailing list, so I can add them to the database. We have around 100 devices in there today, but every new addition helps us make the right choices when optimizing the file systems.
ah well, will do my best when we get new cards, which i hope, since i have to solve this problem somehow :)
during bonnie tests we had a high cpu load (90%),.. and during normal operation of the system iowait is going up really high (90%) which freezes the system then
is that normal? shouldn't most calculation be done by the microcontroller on the SD chip?
I haven't tried it, but bonnie is really totally useless on SD cards, and it's absolutely possible that this is always the case.
it's always the case, that the cpu load is that high only with bonnie, or under normal circumstances?
i ask because someone on #linaro asked me if the device is probably not dma enabled or in 1-bit mode, which would make sense i think that if it is not dma enabled it uses cpu for everything..
however, i figured out that it is not in 1-bit mode, it is in 4-bit mode, good i guess .. but i was not able to figure out if it is dma enabled or not. i found nothing in the sourcecode except how to disable dma when loading the module, which does not apply, since the driver is compiled in.
peter
On Saturday 09 July 2011 00:04:24 Peter Warasin wrote:
is the garbage collector slow and running all the time after 2nd try?
# ./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O
4MiB    2.16M/s
2MiB    2.86M/s
1MiB    4.2M/s
512KiB  1.47M/s
256KiB  735K/s
128KiB  904K/s
64KiB   1.12M/s
32KiB   550K/s
16KiB   360K/s

# ./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O
4MiB    3.5M/s
2MiB    6.58M/s
1MiB    9.5M/s
512KiB  1.49M/s
256KiB  734K/s
128KiB  986K/s
64KiB   1.17M/s
32KiB   589K/s
16KiB   369K/s

# ./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O
4MiB    647K/s
2MiB    603K/s
1MiB    604K/s
512KiB  631K/s
256KiB  745K/s
128KiB  987K/s
64KiB   1.17M/s
32KiB   588K/s
16KiB   309K/s

# ./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O
4MiB    652K/s
2MiB    599K/s
1MiB    604K/s
512KiB  636K/s
256KiB  736K/s
128KiB  1M/s
64KiB   1.16M/s
32KiB   540K/s
16KiB   352K/s
The lower lines are fairly consistent, the upper ones are not. This is a strong indication that the erase block size is not really 4MB but something else, probably a multiple of 3 even. In the lower lines, it all averages out, while at the top there are only a few system calls and doing one GC impacts the timing significantly.
Best try a few other erasesize values:
./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O --erasesize=$[1024 * 1024] --blocksize=$[64 * 1024]
./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O --erasesize=$[1536 * 1024] --blocksize=$[96 * 1024]
./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O --erasesize=$[2048 * 1024] --blocksize=$[64 * 1024]
./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O --erasesize=$[3072 * 1024] --blocksize=$[96 * 1024]
./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O --erasesize=$[6144 * 1024] --blocksize=$[96 * 1024]
./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O --erasesize=$[8192 * 1024] --blocksize=$[64 * 1024]
Try each one a few times. If you get to the correct erasesize value, the results should be both faster and more stable.
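As a rough aid for reading these runs, the sketch below lists the write sizes each erasesize/blocksize pair should produce, assuming (as the outputs in this thread suggest, not as confirmed from the flashbench source) that each row halves the previous size from erasesize down to blocksize. The function name row_sizes is made up for this illustration.

```python
KIB = 1024

# Candidate (erasesize, blocksize) pairs from the commands above, in bytes.
CANDIDATES = [
    (1024 * KIB, 64 * KIB),
    (1536 * KIB, 96 * KIB),
    (2048 * KIB, 64 * KIB),
    (3072 * KIB, 96 * KIB),
    (6144 * KIB, 96 * KIB),
    (8192 * KIB, 64 * KIB),
]


def row_sizes(erasesize, blocksize):
    """Sizes tested in one run: erasesize halved repeatedly down to blocksize.

    blocksize has to be a multiple of 512 and a power-of-two fraction of
    erasesize, otherwise halving never lands exactly on it.
    """
    if blocksize <= 0 or blocksize % 512 != 0:
        raise ValueError("blocksize must be a positive multiple of 512")
    ratio, remainder = divmod(erasesize, blocksize)
    if remainder != 0 or ratio < 1 or ratio & (ratio - 1) != 0:
        raise ValueError("blocksize must be erasesize divided by a power of two")
    return [erasesize >> shift for shift in range(ratio.bit_length())]


if __name__ == "__main__":
    for erasesize, blocksize in CANDIDATES:
        sizes = ", ".join(f"{s // KIB}KiB" for s in row_sizes(erasesize, blocksize))
        print(f"erasesize={erasesize // KIB}KiB: {sizes}")
```

For the default 4MiB erasesize this gives nine rows down to 16KiB, and five more when the blocksize is lowered to 512.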
what i also noticed on this card is that the card is running with legacy timing instead of high-speed:
cat /sys/kernel/debug/mmc0/ios
clock:          25000000 Hz
vdd:            20 (3.2 ~ 3.3 V)
bus mode:       2 (push-pull)
chip select:    0 (don't care)
power mode:     2 (on)
bus width:      2 (4 bits)
timing spec:    0 (legacy)
i tried to figure out why, but most probably i guess the card will respond with SCR_SPEC_VER_1 for sda_vsn i *think*, since the controller (marvell/mvsdio) is MMC_CAP_SD_HIGHSPEED capable and i see no kernel warnings that the card is not able to do the mandatory switch function. will try to hardcode to SD_HIGHSPEED next week.. (for testing). could that help or is it completely irrelevant?
I don't really know. Best ask this question on the linux-mmc mailing list.
during bonnie tests we had a high cpu load (90%),.. and during normal operation of the system iowait is going up really high (90%) which freezes the system then
is that normal? shouldn't most calculation be done by the microcontroller on the SD chip?
I haven't tried it, but bonnie is really totally useless on SD cards, and it's absolutely possible that this is always the case.
it's always the case, that the cpu load is that high only with bonnie, or under normal circumstances?
I mean with bonnie.
i ask because someone on #linaro asked me if the device is probably not dma enabled or in 1-bit mode, which would make sense i think that if it is not dma enabled it uses cpu for everything..
however, i figured out that it is not in 1-bit mode, it is in 4-bit mode, good i guess .. but i was not able to figure out if it is dma enabled or not. i found nothing in the sourcecode except how to disable dma when loading the module, which does not apply, since the driver is compiled in.
You can pass module options through the boot loader, in the kernel command line, like 'modulename.option=value'.
Also, there is no need to run the SD card test on the target hardware. Best do the same test on a PC with a USB card reader or builtin SD card slot and see if the numbers are much different. If everything's faster but the behaviour otherwise the same, there is probably also a bug in the host driver.
Arnd
hi arnd
On 09/07/11 00:19, Arnd Bergmann wrote:
Best try a few other erasesize values:
./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O --erasesize=$[1024 * 1024] --blocksize=$[64 * 1024]
./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O --erasesize=$[1536 * 1024] --blocksize=$[96 * 1024]
./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O --erasesize=$[2048 * 1024] --blocksize=$[64 * 1024]
./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O --erasesize=$[3072 * 1024] --blocksize=$[96 * 1024]
./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O --erasesize=$[6144 * 1024] --blocksize=$[96 * 1024]
./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O --erasesize=$[8192 * 1024] --blocksize=$[64 * 1024]
Try each one a few times. If you get to the correct erasesize value, the results should be both faster and more stable.
i tried always 4 times.. if it's unstable i post all results
$ ./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O --erasesize=$[1024 * 1024] --blocksize=$[64 * 1024]
1MiB 1.49M/s 512KiB 1.92M/s 256KiB 785K/s 128KiB 752K/s 64KiB 1.14M/s
1MiB 548K/s 512KiB 580K/s 256KiB 645K/s 128KiB 954K/s 64KiB 1.27M/s
1MiB 666K/s 512KiB 705K/s 256KiB 747K/s 128KiB 1.04M/s 64KiB 1.09M/s
1MiB 725K/s 512KiB 688K/s 256KiB 745K/s 128KiB 1.11M/s 64KiB 1.37M/s
$ ./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O --erasesize=$[1536 * 1024] --blocksize=$[96 * 1024]
1.5MiB 1.14M/s 768KiB 1.61M/s 384KiB 1.21M/s 192KiB 1.11M/s 96KiB 974K/s
1.5MiB 683K/s 768KiB 897K/s 384KiB 833K/s 192KiB 1.05M/s 96KiB 790K/s
next 2 all stable at <1M/s
$ ./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O --erasesize=$[2048 * 1024] --blocksize=$[64 * 1024]
2MiB 928K/s 1MiB 1.78M/s 512KiB 2.16M/s 256KiB 878K/s 128KiB 1.02M/s 64KiB 1.16M/s
2MiB 626K/s 1MiB 932K/s 512KiB 910K/s 256KiB 1e+03K/ 128KiB 1.1M/s 64KiB 1.25M/s
2MiB 694K/s 1MiB 815K/s 512KiB 741K/s 256KiB 695K/s 128KiB 959K/s 64KiB 1.23M/s
2MiB 698K/s 1MiB 818K/s 512KiB 851K/s 256KiB 788K/s 128KiB 939K/s 64KiB 1.19M/s
$ ./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O --erasesize=$[3072 * 1024] --blocksize=$[96 * 1024]
3MiB 801K/s 1.5MiB 727K/s 768KiB 726K/s 384KiB 747K/s 192KiB 903K/s 96KiB 799K/s
$ ./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O --erasesize=$[6144 * 1024] --blocksize=$[96 * 1024]
6MiB 1.34M/s 3MiB 2.54M/s 1.5MiB 3.41M/s 768KiB 2.26M/s 384KiB 942K/s 192KiB 854K/s 96KiB 722K/s
6MiB 1.59M/s 3MiB 2.78M/s 1.5MiB 3.74M/s 768KiB 2.75M/s 384KiB 973K/s 192KiB 815K/s 96KiB 685K/s
6MiB 701K/s 3MiB 672K/s 1.5MiB 670K/s 768KiB 798K/s 384KiB 729K/s 192KiB 800K/s 96KiB 652K/s
6MiB 1.92M/s 3MiB 2.17M/s 1.5MiB 2.38M/s 768KiB 1.87M/s 384KiB 986K/s 192KiB 869K/s 96KiB 724K/s
$ ./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O --erasesize=$[8192 * 1024] --blocksize=$[64 * 1024]
8MiB 2.95M/s 4MiB 5.6M/s 2MiB 3.79M/s 1MiB 4.21M/s 512KiB 2.91M/s 256KiB 1.2M/s 128KiB 967K/s 64KiB 601K/s
quite stable
i did also -a 256KiB is quite good, is it?
./flashbench -a /dev/mmcblk0p3 --open-au-nr=2 -b 1024
align 134217728 pre 1.19ms on 1.46ms post 904µs diff 414µs
align 67108864 pre 1.27ms on 1.72ms post 1.02ms diff 573µs
align 33554432 pre 1.19ms on 1.58ms post 890µs diff 539µs
align 16777216 pre 1.13ms on 1.41ms post 866µs diff 414µs
align 8388608 pre 1.09ms on 1.37ms post 915µs diff 371µs
align 4194304 pre 1.04ms on 1.25ms post 993µs diff 239µs
align 2097152 pre 1.05ms on 1.09ms post 1.07ms diff 29.1µs
align 1048576 pre 1.03ms on 1.07ms post 1.05ms diff 30.2µs
align 524288 pre 1.04ms on 1.08ms post 1.06ms diff 29µs
align 262144 pre 1.01ms on 1.05ms post 1.05ms diff 16.2µs
align 131072 pre 1.04ms on 1.06ms post 1.05ms diff 17.9µs
align 65536 pre 1.03ms on 1.05ms post 1.05ms diff 15.6µs
align 32768 pre 1.04ms on 1.07ms post 1.06ms diff 16.1µs
align 16384 pre 1.02ms on 1.06ms post 1.04ms diff 27.2µs
align 8192 pre 1.02ms on 1.02ms post 1.02ms diff 3.94µs
align 4096 pre 1.02ms on 1.02ms post 1.01ms diff 6.89µs
align 2048 pre 1.01ms on 1.03ms post 1.02ms diff 14.3µs
what i also noticed on this card is that the card is running with legacy timing instead of high-speed:
I don't really know. Best ask this question on the linux-mmc mailing list.
just to notice, i will then ask also on linux-mmc list:
i forced setting the card in high-speed mode and noticed that:
card->sw_caps.hs_max_dtr = 0
which is the max bus speed, *i think*. when i override that i get:
mmc0: Problem switching card into high-speed mode!
mmc0: host does not support reading read-only switch. assuming write-enable.
mmc0: new SDHC card at address e624
so most probably the card is really not a high-speed card.
i ask because someone on #linaro asked me if the device is probably not dma enabled or in 1-bit mode, which would make sense i think that if it is not dma enabled it uses cpu for everything..
however, i figured out that it is not in 1-bit mode, it is in 4-bit mode, good i guess .. but i was not able to figure out if it is dma enabled or not. i found nothing in the sourcecode except how to disable dma when loading the module, which does not apply, since the driver is compiled in.
You can pass module options through the boot loader, in the kernel command line, like 'modulename.option=value'.
yes, i know that.. i just see no way to find out if the card actually uses dma or pio. i only found out how to disable dma (which i don't want of course)
Also, there is no need to run the SD card test on the target hardware. Best do the same test on a PC with a USB card reader or builtin SD card slot and see if the numbers are much different. If everything's faster but the behaviour otherwise the same, there is probably also a bug in the host driver.
we already did that, also on different hardware and different kernels with linux it always differs, but it is always slow. i was very close to say the card is crap, but it's fast with macosx, so i thought card must be ok. well i see now, it depends on many factors :)
but in order to get *bonnie-values* like on macosx, it would have to be about 9x faster. i mean.. on linux it reaches about 10% of what it does on macosx. well, i start to think now that maybe bonnie is a lot different on mac and, as you said, bonnie isn't that good to use on SD-cards, so maybe those numbers make no sense at all then. however, it is slow and freezing the system, so there is something wrong :)
trying different SD cards now. we will buy as many different cards as we can find in local stores tomorrow and i will send results, only tell me please which values are of interest.. i did not yet really figure out how to read the results
peter
On Monday 11 July 2011 21:35:17 Peter Warasin wrote:
$ ./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O --erasesize=$[6144 * 1024] --blocksize=$[96 * 1024]
6MiB 1.92M/s 3MiB 2.17M/s 1.5MiB 2.38M/s 768KiB 1.87M/s 384KiB 986K/s 192KiB 869K/s 96KiB 724K/s
$ ./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O --erasesize=$[8192 * 1024] --blocksize=$[64 * 1024]
8MiB 2.95M/s 4MiB 5.6M/s 2MiB 3.79M/s 1MiB 4.21M/s 512KiB 2.91M/s 256KiB 1.2M/s 128KiB 967K/s 64KiB 601K/s
quite stable
From these numbers, it's definitely clear that there is no factor 3 in it, the erase block size has to be either 4MB or 8MB.
Since you asked about the alignment in the other thread: What is the alignment of this partition? Do the numbers change if you align the block to an odd number of 4MB blocks (counting from the start of the card)?
That is usually the best way to find the true erase block size.
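A small sketch of the alignment check, under stated assumptions: the start sector is taken in 512-byte units (as sysfs reports it), the helper names are made up for this illustration, and the disk and partition names are just the ones used in this thread. A partition that starts at an odd multiple of 4MB is 4MB-aligned but not 8MB-aligned, which is presumably why repeating the test at such an offset helps tell the two erase block sizes apart.

```python
SECTOR = 512          # sysfs reports partition starts in 512-byte sectors
MIB = 1024 * 1024


def alignment_report(start_sector, unit=4 * MIB):
    """Return (aligned, odd_multiple) for a partition start.

    aligned:      the start is a whole multiple of unit
    odd_multiple: aligned to unit but not to 2 * unit
    """
    offset = start_sector * SECTOR
    aligned = offset % unit == 0
    odd_multiple = aligned and (offset // unit) % 2 == 1
    return aligned, odd_multiple


def read_start_sector(disk="mmcblk0", part="mmcblk0p3"):
    """Read the partition start from sysfs (Linux only; names are examples)."""
    with open(f"/sys/block/{disk}/{part}/start") as f:
        return int(f.read())


if __name__ == "__main__":
    # Example values only: 8192 sectors = 4 MiB, 16384 sectors = 8 MiB,
    # 63 sectors = the old DOS default, which is not aligned at all.
    for start in (8192, 16384, 63):
        print(start, alignment_report(start))
```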
i did also -a 256KiB is quite good, is it?
The -a test is not about good or bad, it's about finding the cutoff between short and long delays when reading across a boundary.
./flashbench -a /dev/mmcblk0p3 --open-au-nr=2 -b 1024
align 134217728 pre 1.19ms on 1.46ms post 904µs diff 414µs
align 67108864 pre 1.27ms on 1.72ms post 1.02ms diff 573µs
align 33554432 pre 1.19ms on 1.58ms post 890µs diff 539µs
align 16777216 pre 1.13ms on 1.41ms post 866µs diff 414µs
align 8388608 pre 1.09ms on 1.37ms post 915µs diff 371µs
align 4194304 pre 1.04ms on 1.25ms post 993µs diff 239µs
align 2097152 pre 1.05ms on 1.09ms post 1.07ms diff 29.1µs
align 1048576 pre 1.03ms on 1.07ms post 1.05ms diff 30.2µs
align 524288 pre 1.04ms on 1.08ms post 1.06ms diff 29µs
align 262144 pre 1.01ms on 1.05ms post 1.05ms diff 16.2µs
align 131072 pre 1.04ms on 1.06ms post 1.05ms diff 17.9µs
align 65536 pre 1.03ms on 1.05ms post 1.05ms diff 15.6µs
align 32768 pre 1.04ms on 1.07ms post 1.06ms diff 16.1µs
align 16384 pre 1.02ms on 1.06ms post 1.04ms diff 27.2µs
align 8192 pre 1.02ms on 1.02ms post 1.02ms diff 3.94µs
align 4096 pre 1.02ms on 1.02ms post 1.01ms diff 6.89µs
align 2048 pre 1.01ms on 1.03ms post 1.02ms diff 14.3µs
What you can see here is that there is a difference at 16KB, 512KB, and 4MB blocks, so these are some internal properties of the card. The jump at 4MB indicates 4MB erase blocks: look at the last column in the 4MB and 2MB rows. The numbers get more stable if you use --count=100 or so; it will take a lot longer, but it averages out the results.
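To make the "look at the last column" step concrete, here is a minimal sketch that parses the -a rows quoted above and reports where the diff value drops most sharply between one alignment and the next smaller one. The parsing is based only on the output format seen in this thread, and the function names are made up for this illustration.

```python
import re

# The -a output quoted above, one row per alignment.
SAMPLE = """\
align 134217728 pre 1.19ms on 1.46ms post 904µs diff 414µs
align 67108864 pre 1.27ms on 1.72ms post 1.02ms diff 573µs
align 33554432 pre 1.19ms on 1.58ms post 890µs diff 539µs
align 16777216 pre 1.13ms on 1.41ms post 866µs diff 414µs
align 8388608 pre 1.09ms on 1.37ms post 915µs diff 371µs
align 4194304 pre 1.04ms on 1.25ms post 993µs diff 239µs
align 2097152 pre 1.05ms on 1.09ms post 1.07ms diff 29.1µs
align 1048576 pre 1.03ms on 1.07ms post 1.05ms diff 30.2µs
align 524288 pre 1.04ms on 1.08ms post 1.06ms diff 29µs
align 262144 pre 1.01ms on 1.05ms post 1.05ms diff 16.2µs
align 131072 pre 1.04ms on 1.06ms post 1.05ms diff 17.9µs
align 65536 pre 1.03ms on 1.05ms post 1.05ms diff 15.6µs
align 32768 pre 1.04ms on 1.07ms post 1.06ms diff 16.1µs
align 16384 pre 1.02ms on 1.06ms post 1.04ms diff 27.2µs
align 8192 pre 1.02ms on 1.02ms post 1.02ms diff 3.94µs
align 4096 pre 1.02ms on 1.02ms post 1.01ms diff 6.89µs
align 2048 pre 1.01ms on 1.03ms post 1.02ms diff 14.3µs
"""

LINE = re.compile(r"align (\d+) .*? diff ([\d.]+)(\S+)")
TO_MICROSECONDS = {"µs": 1.0, "us": 1.0, "ms": 1000.0}


def parse_diffs(text):
    """Return [(alignment in bytes, diff in microseconds), ...] in file order."""
    return [(int(align), float(value) * TO_MICROSECONDS[unit])
            for align, value, unit in LINE.findall(text)]


def biggest_jump(rows):
    """Alignment whose diff is largest relative to the next smaller alignment."""
    best_align, best_ratio = None, 0.0
    for (align, diff), (_, smaller_diff) in zip(rows, rows[1:]):
        ratio = diff / smaller_diff
        if ratio > best_ratio:
            best_align, best_ratio = align, ratio
    return best_align, best_ratio


if __name__ == "__main__":
    align, ratio = biggest_jump(parse_diffs(SAMPLE))
    print(f"largest jump at {align} bytes (factor {ratio:.1f})")
```

On this data the largest jump is between the 4194304 and 2097152 rows, with a smaller one between 16384 and 8192. A single run is noisy, so this is only a hint until it is repeated with a higher --count.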
what i also noticed on this card is that the card is running with legacy timing instead of high-speed:
I don't really know. Best ask this question on the linux-mmc mailing list.
just to notice, i will then ask also on linux-mmc list:
i forced setting the card in high-speed mode and noticed that:
card->sw_caps.hs_max_dtr = 0
which is the max bus speed, *i think*. when i override that i get:
mmc0: Problem switching card into high-speed mode!
mmc0: host does not support reading read-only switch. assuming write-enable.
mmc0: new SDHC card at address e624
so most probably the card is really not a high-speed card.
right..
You can pass module options through the boot loader, in the kernel command line, like 'modulename.option=value'.
yes, i know that.. i just see no way to find out if the card actually uses dma or pio. i only found out how to disable dma (which i don't want of course)
Unlike CF cards, an SD card doesn't know anything about DMA or PIO mode, that is just a property of the controller.
Also, there is no need to run the SD card test on the target hardware. Best do the same test on a PC with a USB card reader or builtin SD card slot and see if the numbers are much different. If everything's faster but the behaviour otherwise the same, there is probably also a bug in the host driver.
we already did that, also on different hardware and different kernels with linux it always differs, but it is always slow. i was very close to say the card is crap, but it's fast with macosx, so i thought card must be ok. well i see now, it depends on many factors :)
but in order to get *bonnie-values* like on macosx, it would have to be about 9x faster. i mean.. on linux it reaches about 10% of what it does on macosx. well, i start to think now that maybe bonnie is a lot different on mac and, as you said, bonnie isn't that good to use on SD-cards, so maybe those numbers make no sense at all then. however, it is slow and freezing the system, so there is something wrong :)
Still, this could be anything. Bonnie is a high-level benchmark, so this could all just mean that MacOS is doing internal write-caching for SD cards while Linux does less of that.
What would be really interesting is to use flashbench on macos, if you can get that to build and macos supports O_DIRECT.
trying different SD cards now. we will buy as many different cards as we can find in local stores tomorrow and i will send results, only tell me please which values are of interest.. i did not yet really figure out how to read the results
Ok. The numbers I'm usually interested in are:
* erase block size (smaller is better, lower than 4MB is hard to find on 4GB+ cards)
* maximum throughput in MB/s
* page size (lower is better, 4KB is ideal but rare)
* number of open erase blocks (should have at least 5, more for large erase blocks)
* should do random writes just fine.
The very easy smoke test is
flashbench --open-au --open-au-nr=5 --blocksize=2048 --random
Look at how the numbers go down. Ideally, they should stay at multiple MB/s all the way down to the 16KB row, or lower. If they behave like the Kingston card you tested earlier, they are completely useless for Linux.
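One way to apply that rule of thumb mechanically is sketched below. The sample rows are taken from the --open-au-nr=5 run earlier in this thread. The 2 MB/s floor is an assumed reading of "multiple MB/s", the K and M suffixes are assumed to be decimal, and the function names are made up for this illustration.

```python
import re

# Rows from the --open-au-nr=5 --random run earlier in this thread.
SAMPLE = """\
4MiB 2.87M/s
2MiB 2.56M/s
1MiB 1.71M/s
512KiB 1.78M/s
256KiB 920K/s
128KiB 471K/s
64KiB 320K/s
32KiB 276K/s
16KiB 215K/s
"""

ROW = re.compile(r"([\d.]+)(KiB|MiB)\s+([\d.]+)([KM])/s")
SIZE = {"KiB": 1024, "MiB": 1024 ** 2}
RATE = {"K": 1e3, "M": 1e6}   # assumption: decimal throughput units


def parse_rows(text):
    """Return [(block size in bytes, throughput in bytes/s), ...]."""
    return [(int(float(size) * SIZE[unit]), float(rate) * RATE[scale])
            for size, unit, rate, scale in ROW.findall(text)]


def passes_smoke_test(rows, floor_size=16 * 1024, min_rate=2e6):
    """True if every row down to floor_size stays at or above min_rate.

    min_rate = 2 MB/s is a hypothetical threshold chosen for this sketch.
    """
    return all(rate >= min_rate for size, rate in rows if size >= floor_size)


if __name__ == "__main__":
    rows = parse_rows(SAMPLE)
    for size, rate in rows:
        print(f"{size:8d} bytes  {rate / 1e6:5.2f} MB/s")
    print("passes smoke test:", passes_smoke_test(rows))
```

With these rows the check fails, since throughput falls below 1 MB/s from 256KiB downwards.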
Arnd
Hi
On 11/07/11 22:07, Arnd Bergmann wrote:
From these numbers, it's definitely clear that there is no factor 3 in it, the erase block size has to be either 4MB or 8MB.
great.
Since you asked about the alignment in the other thread: What is the alignment of this partition? Do the numbers change if you align the block to an odd number of 4MB blocks (counting from the start of the card)?
This card is aligned to 4MB blocks (the other card was not)
Still, this could be anything. Bonnie is a high-level benchmark, so this could all just mean that MacOS is doing internal write-caching for SD cards while Linux does less of that.
What would be really interesting is to use flashbench on macos, if you can get that to build and macos supports O_DIRECT.
we will try that.
Ok. The numbers I'm usually interested in are:
- erase block size (smaller is better, lower than 4MB is hard to find on 4GB+ cards)
ok, using -a and so on
The very easy smoke test is
flashbench --open-au --open-au-nr=5 --blocksize=2048 --random
Look at how the numbers go down. Ideally, they should stay at multiple MB/s all the way down to the 16KB row, or lower. If they behave like the Kingston card you tested earlier, they are completely useless for Linux.
ok, prepare for a bunch of results :)
peter
flashbench-results@lists.linaro.org