[i am not shortening this msg because it did not go to the list; i had not replied to the list]
On 08/07/11 22:41, Arnd Bergmann wrote:
On Friday 08 July 2011 19:36:56 Peter Warasin wrote:
Hi Arnd
thank you for your help!
On 08/07/11 16:55, Arnd Bergmann wrote:
but there is a problem using three or more. However, the performance is still basically constant, so the card is doing something smart. It's probably using the new Sandisk trick with SLC and MLC areas. On the one hand, this is good news, because the card doesn't behave that badly; on the other hand, it's not easy to measure, and the baseline of 2 MB/s is rather slow.
ah, so the journaling of ext3 *should* not be a problem, right?
The journaling normally makes things worse, but by how much depends on the card characteristics. Sometimes it's almost nothing, sometimes it's orders of magnitude. In this case, the answer is probably "noticeable, but not drastic".
I see. So this means aligning to 4MB is a good thing? Or does this simply mean i can write 2 blocks at once?
4 MB alignment is always the right thing.
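As a concrete check, here is a minimal sketch of verifying that a partition start is aligned to the erase block size. The 4 MiB erase size and the start sector of 8192 are assumed values for illustration; the real start sector can be read from /sys/block/mmcblk0/mmcblk0p3/start (it is given in 512-byte sectors).

```shell
# Check whether a partition start is aligned to the erase block size.
# The 4 MiB erase size and start sector 8192 are assumptions; substitute
# the values for your card and your partition table.
ERASE_BYTES=$((4 * 1024 * 1024))   # assumed erase block size
START_SECTOR=8192                  # partition start in 512-byte sectors
START_BYTES=$((START_SECTOR * 512))

if [ $((START_BYTES % ERASE_BYTES)) -eq 0 ]; then
    echo "partition start is aligned"
else
    echo "partition start is NOT aligned"
fi
```

Note that the classic DOS default of starting the first partition at sector 63 fails this check, which is one common reason preformatted layouts get repartitioned for flash media.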
It's still not all that bad. If you want to try other things, first of all try the same with the class 10 card. Also, add the '--random' flag to see if it changes anything. Any measurements with '--random' are more likely to be useful than those without, but it may be that the class 10 card won't like that argument and becomes extremely slow.
our class10 is a transcend card, so making tests with that card would be comparing apples with pears, i think. maybe the quality of that card is also not the best. we will try to get a class6 and a class10 sandisk
Transcend cards vary extremely in their quality. Even comparing two Transcend cards with identical labels can sometimes be apples and oranges (or pears). Some that I've seen are quite ok, others are absolutely horrible.
If you want to experiment more with flashbench, I suggest trying the transcend card first, it will give you much more understandable results, because in my experience the controllers are so much simpler there.
ok, will try that monday and send results..
but isn't what --random does closer to what a normal system does on the SD card when i have /var and swap on it? so it would be quite normal for the card to be very slow, is that right?
Yes. If the card is this bad at random access, it won't work well with Linux at all. You can either try to get a better card right away, or continue figuring out what the card actually does. It's getting quite interesting here, I think.
same test with --random:
./flashbench -O --open-au-nr=2 --random /dev/mmcblk0p3
4MiB    839K/s
2MiB    3.26M/s
1MiB    8.7M/s
512KiB  1.48M/s
256KiB  731K/s
128KiB  952K/s
64KiB   1.18M/s
32KiB   594K/s
16KiB   356K/s
i guess i should align to 1MB then?
No, different issue. This card is rather tricky and what it probably does is clean during long writes after smaller writes. If you do "./flashbench -O --open-au-nr=2 --random /dev/mmcblk0p3 --blocksize=$[1024*1024]" a few times repeatedly, it should get better at the second or third run. You can normally ignore any odd results in the first one or two rows of a test run for this reason.
yes, it does. however, the values vary a lot when starting it a couple of times, from 2x faster to all slow.
is the garbage collector slow and running all the time after the 2nd try?
# ./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O
4MiB    2.16M/s
2MiB    2.86M/s
1MiB    4.2M/s
512KiB  1.47M/s
256KiB  735K/s
128KiB  904K/s
64KiB   1.12M/s
32KiB   550K/s
16KiB   360K/s

# ./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O
4MiB    3.5M/s
2MiB    6.58M/s
1MiB    9.5M/s
512KiB  1.49M/s
256KiB  734K/s
128KiB  986K/s
64KiB   1.17M/s
32KiB   589K/s
16KiB   369K/s

# ./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O
4MiB    647K/s
2MiB    603K/s
1MiB    604K/s
512KiB  631K/s
256KiB  745K/s
128KiB  987K/s
64KiB   1.17M/s
32KiB   588K/s
16KiB   309K/s

# ./flashbench --open-au-nr=2 --random /dev/mmcblk0p3 -O
4MiB    652K/s
2MiB    599K/s
1MiB    604K/s
512KiB  636K/s
256KiB  736K/s
128KiB  1M/s
64KiB   1.16M/s
32KiB   540K/s
16KiB   352K/s
what i also noticed on this card is that the card is running with legacy timing instead of high-speed:
cat /sys/kernel/debug/mmc0/ios
clock:          25000000 Hz
vdd:            20 (3.2 ~ 3.3 V)
bus mode:       2 (push-pull)
chip select:    0 (don't care)
power mode:     2 (on)
bus width:      2 (4 bits)
timing spec:    0 (legacy)
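A quick way to pull the timing mode out of that debugfs file is a small awk one-liner. The field layout below is taken from the output pasted above; the here-doc stands in for the real `cat /sys/kernel/debug/mmc0/ios` so the parsing can be tried anywhere, it is a sketch rather than a guaranteed interface.

```shell
# Extract the "timing spec" line from the mmc debugfs ios output and
# warn when the card runs in legacy mode. The here-doc stands in for
# `cat /sys/kernel/debug/mmc0/ios`.
ios_output() {
cat <<'EOF'
clock:          25000000 Hz
vdd:            20 (3.2 ~ 3.3 V)
bus mode:       2 (push-pull)
chip select:    0 (don't care)
power mode:     2 (on)
bus width:      2 (4 bits)
timing spec:    0 (legacy)
EOF
}

timing=$(ios_output | awk -F': *' '/^timing spec/ {print $2}')
echo "timing: $timing"
case "$timing" in
    *legacy*) echo "card is in legacy timing, high-speed not negotiated" ;;
    *)        echo "card negotiated: $timing" ;;
esac
```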
i tried to figure out why. most probably the card responds with SCR_SPEC_VER_1 for sda_vsn, i *think*, since the controller (marvell/mvsdio) is MMC_CAP_SD_HIGHSPEED capable and i see no kernel warnings that the card cannot do the mandatory switch function. i will try to hardcode SD_HIGHSPEED next week (for testing). could that help, or is it completely irrelevant?
The result can mean one of two things:
- the erase block size is something other than 4 MB, and you first need to figure out the correct size and rerun the test.
- the card cannot actually do random writes at all, which is unusual for Sandisk cards, but is something I've seen before in a few cases.
./flashbench -O --open-au-nr=3 --random /dev/mmcblk0p3
4MiB    2.84M/s
2MiB    2.46M/s
1MiB    5.23M/s
512KiB  1.33M/s
256KiB  697K/s
128KiB  1.08M/s
64KiB   904K/s
32KiB   534K/s
16KiB   360K/s

./flashbench -O --open-au-nr=5 --random /dev/mmcblk0p3
4MiB    2.87M/s
2MiB    2.56M/s
1MiB    1.71M/s
512KiB  1.78M/s
256KiB  920K/s
128KiB  471K/s
64KiB   320K/s
32KiB   276K/s
16KiB   215K/s
Not much to see here, it only shows that the card isn't all that great for random access.
You can get measurements for smaller block sizes in addition to the values down to 16KB by passing --blocksize=512. This may get rather slow towards the end, but is very relevant because the block size used by ext3 is only 4KiB.
ah great. ok, did i understand right that i do these tests to figure out the actual values of the erase block size and blocksize? when i know those values, what do i do with them? align partitions to the erase block size and format the filesystem with the SD blocksize (if possible)?
Correct for the erase size, that should be the partition alignment. Possible values here are 1.5MB, 2MB, 3MB, 4MB, 6MB and 8MB; I haven't seen anything else with SD cards yet.
aaah, i understand.
The blocksize argument you pass to flashbench only affects how many tests are done. By default, it assumes blocksize=16384 and stops there, but you can pass any other blocksize that is a multiple of 512 and a power-of-two fraction of the erasesize. When you pass blocksize=512, flashbench will print five more lines of output, but typically gets very slow. The blocksize that you should use in the file system is the smallest number where flashbench gives a non-catastrophic result. Often you see something like
...
64KiB   3.2M/s
32KiB   2.0M/s
16KiB   1.6M/s
8KiB    821K/s
4KiB    401K/s
2KiB    198K/s
1KiB    90K/s
512     43K/s
In that example, the block size would be 16KB: as you can see, writing an 8KB block takes almost as long as writing a 16KB block, so going below 16KB is never worth it from a performance PoV.
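One way to see this is to convert each throughput figure into the time a single write takes (time = size / speed). A small awk sketch over the example numbers above; treating K as 1024-based is an assumption here:

```shell
# Convert each (blocksize, throughput) pair into the time one write
# takes. Below 16KiB the time per write stays roughly constant at
# ~10 ms, so smaller blocks gain nothing.
awk '
function kbytes(v) {          # decode "3.2M/s" or "821K/s" into K units
    if (v ~ /M/) { sub(/M.*/, "", v); return v * 1024 }
    sub(/K.*/, "", v); return v + 0
}
function ksize(s) {           # decode "16KiB" or bare "512" into K units
    if (s ~ /KiB/) { sub(/KiB/, "", s); return s + 0 }
    return s / 1024           # a bare number means bytes
}
{ printf "%-6s %8.2f ms per write\n", $1, ksize($1) / kbytes($2) * 1000 }
' <<'EOF'
64KiB 3.2M/s
32KiB 2.0M/s
16KiB 1.6M/s
8KiB 821K/s
4KiB 401K/s
2KiB 198K/s
1KiB 90K/s
512 43K/s
EOF
```

The per-write times come out around 9.7-11.6 ms for everything at and below 16KiB, which is the "never worth it" signature: the card spends a fixed cost per write regardless of how little data you send.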
i guess the ideal for SD card performance would be to format with a 4MiB blocksize, but that's impossible with ext3 and bad for me because i'd waste too much space on small files, right?
The largest block size supported by any Linux file system today is 4KB, which is quite wasteful here, but we're working on increasing that to 64KB.
ah good. so right now best i can do is using 4k blocksize for the filesystem, even when this is not the ideal size.
is the copy-on-write feature of btrfs maybe helping?
definitely.
great. will do that
what about swap .. does it make sense creating a swap with 4MiB pagesize?
You can't. The swap page size is a property of the CPU, you don't get to choose it.
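The page size the CPU actually uses can be queried rather than assumed; a one-liner:

```shell
# The swap I/O unit is the CPU's MMU page size (commonly 4KiB on x86,
# 4KiB or 64KiB on some ARM configurations); query it at runtime.
getconf PAGESIZE
```

The kernel may batch several pages per swap operation, but the page size itself is fixed by the hardware and kernel build, not by mkswap.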
ah ok.
last question, hope i do not stress you to much:
Don't worry. But please try to figure out any cards that you have access to and send the results to the mailing list, so I can add them to the database. We have around 100 devices in there today, but every new addition helps us make the right choices when optimizing the file systems.
ah well, will do my best when we get new cards, which i hope, since i have to solve this problem somehow :)
during bonnie tests we had a high cpu load (90%), and during normal operation of the system iowait goes up really high (90%), which then freezes the system.
is that normal? shouldn't most of the work be done by the microcontroller on the SD card?
I haven't tried it, but bonnie is really totally useless on SD cards, and it's quite possible that a load like that is always the case with it.
do you mean the cpu load is always that high only with bonnie, or under normal circumstances as well?
i ask because someone on #linaro asked me whether the device is perhaps not dma enabled or running in 1-bit mode, which would make sense: if dma is not enabled, the cpu has to do everything itself.
however, i figured out that it is not in 1-bit mode but in 4-bit mode, which is good i guess. but i was not able to figure out whether dma is enabled. i found nothing in the source code except how to disable dma when loading the module, which does not apply since the driver is compiled in.
peter