Oops--forgot to 'reply all' to the list!
---------- Forwarded message ----------
From: Ajax Criterion <ajax.criterion@gmail.com>
Date: Thu, 28 Jul 2011 07:28:53 -0700
Subject: Re: [Flashbench] Kingston 8GB Datatraveler DT101 G2
To: Arnd Bergmann <arnd@arndb.de>
On Thu, Jul 28, 2011 at 6:30 AM, Arnd Bergmann arnd@arndb.de wrote:
On Thursday 28 July 2011, Ajax Criterion wrote:
--> aligned to 4MB blocks. FAT32 in the front, EXT4 in the back again (I'll likely destroy this again with the --findfat test, but what the hell... the FS in the back will get changed around a lot while I benchmark multiple filesystems).
The --open-au test also destroys the data; you just might be able to run a little bit longer. There is generally no reason to even create a partition table in order to run flashbench; just boot from a different drive than the one you want to test.
bash-4.1# flashbench -O --open-au-nr=10 /dev/sdc --erasesize=$[4*1024*1024] --blocksize=4096
4MiB    11.8M/s
2MiB    13.2M/s
1MiB    13.6M/s
512KiB  13.6M/s
256KiB  13.5M/s
128KiB  11.2M/s
64KiB   15.2M/s
^C
bash-4.1# flashbench -O --open-au-nr=13 /dev/sdc --erasesize=$[4*1024*1024] --blocksize=4096
4MiB    6.54M/s
2MiB    5.55M/s
1MiB    5.95M/s
512KiB  3.15M/s
256KiB  2.51M/s
128KiB  894K/s
64KiB   467K/s
^C
bash-4.1# flashbench -O --open-au-nr=12 /dev/sdc --erasesize=$[4*1024*1024] --blocksize=4096
4MiB    6.07M/s
2MiB    5.83M/s
1MiB    5.9M/s
512KiB  5.95M/s
256KiB  5.45M/s
128KiB  4.91M/s
64KiB   4.82M/s
32KiB   4.44M/s
16KiB   2.86M/s
8KiB    772K/s
^C
bash-4.1#
---> There seems to be a clear breakdown after 12 AU's ...
Yes, that is pretty clear from your numbers.
Why am I bouncing around between 5-6 and 11-13 MB/s?
I'm not sure, but what I can imagine is happening is that the drive can switch each erase block between a linear-optimized mode (13 MB/s) and a random-access mode (6 MB/s). When you do something that has a random pattern, including going beyond 12 erase blocks, it goes into the random mode in order to cope at all. After you write linearly a few times, the controller decides to go back into linear-optimized mode. This is a very smart thing for the controller to do.
Very interesting -- I'm quite pleased with the results for this drive, and I'm looking forward to benchmarking it some more, now that I know how to properly align and test it!
bash-4.1# flashbench -O --open-au-nr=12 /dev/sdc --erasesize=$[4*1024*1024] --blocksize=4096 --random
4MiB    11.2M/s
2MiB    8.51M/s
1MiB    5.8M/s
512KiB  3.6M/s
256KiB  5M/s
128KiB  3.86M/s
64KiB   3.12M/s
32KiB   3.23M/s
16KiB   1.89M/s
^C
bash-4.1# flashbench -O --open-au-nr=13 /dev/sdc --erasesize=$[4*1024*1024] --blocksize=4096 --random
4MiB    8.89M/s
2MiB    7.31M/s
1MiB    5.55M/s
512KiB  3.7M/s
256KiB  4.99M/s
128KiB  3.74M/s
64KiB   3.11M/s
32KiB   3.25M/s
^C
bash-4.1# flashbench -O --open-au-nr=15 /dev/sdc --erasesize=$[4*1024*1024] --blocksize=4096 --random
4MiB    9.85M/s
2MiB    7.47M/s
1MiB    4.7M/s
512KiB  2.54M/s
256KiB  995K/s
^C
bash-4.1# flashbench -O --open-au-nr=14 /dev/sdc --erasesize=$[4*1024*1024] --blocksize=4096 --random
4MiB    7.13M/s
2MiB    5.82M/s
1MiB    4.01M/s
512KiB  3.67M/s
256KiB  4.94M/s
128KiB  3.23M/s
64KiB   3.43M/s
32KiB   3.37M/s
16KiB   1.79M/s
^C
--looks to be 14 open AU's for random.
Yes.
bash-4.1# mkdir /mnt/sdc1
bash-4.1# mount /dev/sdc1 /mnt/sdc1
bash-4.1# umount /dev/sdc1
---> mounts fine.
Just lucky that you didn't overwrite any actual data. flashbench does not write into the first 16 MB on the --open-au test in order to avoid the FAT optimized blocks, so if all your important data is there, you won't notice the damage.
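A quick way to check this for yourself -- a sketch only, where /dev/sdc and the flashbench invocation are the examples from this thread, and everything beyond the first 16 MiB should be assumed destroyed:

```shell
# Checksum the first 16 MiB before and after an --open-au run; the two
# sums should match because flashbench skips that region on this test,
# while data further out gets overwritten.
DEV=/dev/sdc   # example device from this thread -- double-check yours!
before=$(dd if="$DEV" bs=1M count=16 2>/dev/null | md5sum | cut -d' ' -f1)
flashbench -O --open-au-nr=12 --erasesize=$[4*1024*1024] --blocksize=4096 "$DEV"
after=$(dd if="$DEV" bs=1M count=16 2>/dev/null | md5sum | cut -d' ' -f1)
[ "$before" = "$after" ] && echo "first 16 MiB untouched"
```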
OK, understood. Thanks for clarifying! I thought maybe my drive would copy the underlying data out of the way to another location, but I guess that's not the case :)
bash-4.1# flashbench --findfat --fat-nr=8 --erasesize=$[4*1024*1024] --random --blocksize=512 /dev/sdc
4MiB    4.96M/s 13.4M/s 13.3M/s 13.4M/s 13.3M/s 13.4M/s 13.3M/s 13.3M/s
2MiB    5.16M/s 9.27M/s 9.28M/s 9.22M/s 9.19M/s 9.39M/s 9.16M/s 9.43M/s
1MiB    5.24M/s 5.86M/s 5.86M/s 5.84M/s 5.85M/s 5.9M/s  5.87M/s 5.92M/s
512KiB  5.29M/s 3.77M/s 3.77M/s 3.77M/s 3.77M/s 3.77M/s 3.77M/s 3.77M/s
256KiB  2.92M/s 5.5M/s  5.49M/s 5.51M/s 5.48M/s 5.49M/s 5.5M/s  5.46M/s
128KiB  4.54M/s 4.61M/s 4.61M/s 4.61M/s 4.6M/s  4.62M/s 4.6M/s  4.61M/s
64KiB   4.16M/s 3.58M/s 3.6M/s  3.6M/s  3.6M/s  3.59M/s 3.59M/s 3.6M/s
32KiB   5.61M/s 4.17M/s 4.17M/s 4.18M/s 4.19M/s 4.19M/s 4.18M/s 4.18M/s
16KiB   4.1M/s  2.83M/s 2.83M/s 2.83M/s 2.82M/s 2.82M/s 2.83M/s 2.82M/s
8KiB    1.35M/s 1.33M/s 1.33M/s 1.33M/s 1.33M/s 1.33M/s 1.33M/s 1.33M/s
4KiB    743K/s  839K/s  837K/s  838K/s  838K/s  838K/s  838K/s  838K/s
^C
I think there is nothing to see here.
For some reason, when I run a scatter plot, I get a very long straight line...I'm trying to sort that out to get a better read, to verify the 4MB eraseblock...
It's quite possible that that's all you can get out of this drive.
Arnd
I did run one more series of tests, just to make absolutely sure the eraseblock size is correct. Now that I understand this test, and how it works, I really like it:
bash-4.1# for i in 1 2 3 4 6 8 ; do
>   echo Size $[$i * 1024 * 1024]
>   flashbench --erasesize=$[$i * 1024 * 1024] --blocksize=$[$i * 1024 * 1024] --open-au --open-au-nr=10 /dev/sdc
>   echo Size $[$i * 1024 * 1024], offset $[$i/2 * 1024 * 1024]
>   flashbench --erasesize=$[$i * 1024 * 1024] --blocksize=$[$i * 1024 * 1024] --open-au --open-au-nr=10 --offset=$[$i/2 * 1024 * 1024] /dev/sdc
> done
Size 1048576                  1MiB 10.2M/s
Size 1048576, offset 0        1MiB need .param= argument from SEQUENCE from LEN_POW2
Size 2097152                  2MiB 8.09M/s
Size 2097152, offset 1048576  2MiB 3.9M/s
Size 3145728                  3MiB 5.94M/s
Size 3145728, offset 1048576  3MiB 8.9M/s
Size 4194304                  4MiB 8.37M/s
Size 4194304, offset 2097152  4MiB 7.28M/s
Size 6291456                  6MiB 7.4M/s
Size 6291456, offset 3145728  6MiB 6.25M/s
Size 8388608                  8MiB 10.4M/s
Size 8388608, offset 4194304  8MiB 9.19M/s
bash-4.1# for i in 1 2 3 4 6 8 ; do echo Size $[$i * 1024 * 1024]; flashbench --erasesize=$[$i * 1024 * 1024] --blocksize=$[$i * 1024 * 1024] --open-au --open-au-nr=10 /dev/sdc; echo Size $[$i * 1024 * 1024], offset $[$i/2 * 1024 * 1024]; flashbench --erasesize=$[$i * 1024 * 1024] --blocksize=$[$i * 1024 * 1024] --open-au --open-au-nr=10 --offset=$[$i/2 * 1024 * 1024] /dev/sdc; done
Size 1048576                  1MiB 7.52M/s
Size 1048576, offset 0        1MiB need .param= argument from SEQUENCE from LEN_POW2
Size 2097152                  2MiB 7.47M/s
Size 2097152, offset 1048576  2MiB 3.9M/s
Size 3145728                  3MiB 5.24M/s
Size 3145728, offset 1048576  3MiB 7.34M/s
Size 4194304                  4MiB 10.1M/s
Size 4194304, offset 2097152  4MiB 6.78M/s
Size 6291456                  6MiB 7.39M/s
Size 6291456, offset 3145728  6MiB 5.86M/s
Size 8388608                  8MiB 10.8M/s
Size 8388608, offset 4194304  8MiB 9.73M/s
bash-4.1# for i in 1 2 3 4 6 8 ; do echo Size $[$i * 1024 * 1024]; flashbench --erasesize=$[$i * 1024 * 1024] --blocksize=$[$i * 1024 * 1024] --open-au --open-au-nr=10 /dev/sdc; echo Size $[$i * 1024 * 1024], offset $[$i/2 * 1024 * 1024]; flashbench --erasesize=$[$i * 1024 * 1024] --blocksize=$[$i * 1024 * 1024] --open-au --open-au-nr=10 --offset=$[$i/2 * 1024 * 1024] /dev/sdc; done
Size 1048576                  1MiB 11M/s
Size 1048576, offset 0        1MiB need .param= argument from SEQUENCE from LEN_POW2
Size 2097152                  2MiB 7.01M/s
Size 2097152, offset 1048576  2MiB 4.04M/s
Size 3145728                  3MiB 4.97M/s
Size 3145728, offset 1048576  3MiB 7.85M/s
Size 4194304                  4MiB 9.02M/s
Size 4194304, offset 2097152  4MiB 7.24M/s
Size 6291456                  6MiB 8.25M/s
Size 6291456, offset 3145728  6MiB 5.96M/s
Size 8388608                  8MiB 9.35M/s
Size 8388608, offset 4194304  8MiB 9.7M/s
bash-4.1# for i in 1 2 3 4 6 8 9 10 11 12 13 ; do echo Size $[$i * 1024 * 1024]; flashbench --erasesize=$[$i * 1024 * 1024] --blocksize=$[$i * 1024 * 1024] --open-au --open-au-nr=10 /dev/sdc; echo Size $[$i * 1024 * 1024], offset $[$i/2 * 1024 * 1024]; flashbench --erasesize=$[$i * 1024 * 1024] --blocksize=$[$i * 1024 * 1024] --open-au --open-au-nr=10 --offset=$[$i/2 * 1024 * 1024] /dev/sdc; done
Size 1048576                  1MiB 11M/s
Size 1048576, offset 0        1MiB need .param= argument from SEQUENCE from LEN_POW2
Size 2097152                  2MiB 7.37M/s
Size 2097152, offset 1048576  2MiB 4.21M/s
Size 3145728                  3MiB 4.62M/s
Size 3145728, offset 1048576  3MiB 7.98M/s
Size 4194304                  4MiB 10.2M/s
Size 4194304, offset 2097152  4MiB 6.99M/s
Size 6291456                  6MiB 7.53M/s
Size 6291456, offset 3145728  6MiB 6.23M/s
Size 8388608                  8MiB 9.71M/s
Size 8388608, offset 4194304  8MiB 8.82M/s
Size 9437184                  9MiB 7.44M/s
Size 9437184, offset 4194304  9MiB 9.41M/s
Size 10485760                 10MiB 10.1M/s
Size 10485760, offset 5242880 10MiB 10.5M/s
Size 11534336                 11MiB 9.3M/s
Size 11534336, offset 5242880 11MiB 10.9M/s
Size 12582912                 12MiB 11.5M/s
Size 12582912, offset 6291456 12MiB 10.4M/s
Size 13631488                 13MiB 9.29M/s
Size 13631488, offset 6291456 13MiB 10.5M/s
bash-4.1#

bash-4.1# for i in 1 2 3 4 6 8 9 10 11 12 13 ; do echo Size $[$i * 1024 * 1024]; flashbench --erasesize=$[$i * 1024 * 1024] --blocksize=$[$i * 1024 * 1024] --open-au --open-au-nr=10 /dev/sdc; echo Size $[$i * 1024 * 1024], offset $[$i * 512 * 1024]; flashbench --erasesize=$[$i * 1024 * 1024] --blocksize=$[$i * 1024 * 1024] --open-au --open-au-nr=10 --offset=$[$i * 512 * 1024] /dev/sdc; done
Size 1048576                  1MiB 3.52M/s
Size 1048576, offset 524288   1MiB 9.19M/s
Size 2097152                  2MiB 4.37M/s
Size 2097152, offset 1048576  2MiB 4.62M/s
Size 3145728                  3MiB 5.15M/s
Size 3145728, offset 1572864  3MiB 4.29M/s
Size 4194304                  4MiB 10.7M/s
Size 4194304, offset 2097152  4MiB 6.63M/s
Size 6291456                  6MiB 9.01M/s
Size 6291456, offset 3145728  6MiB 6.9M/s
Size 8388608                  8MiB 9.62M/s
Size 8388608, offset 4194304  8MiB 9.23M/s
Size 9437184                  9MiB 7.64M/s
Size 9437184, offset 4718592  9MiB 9.03M/s
Size 10485760                 10MiB 10.2M/s
Size 10485760, offset 5242880 10MiB 10.5M/s
Size 11534336                 11MiB 9.53M/s
Size 11534336, offset 5767168 11MiB 9.57M/s
Size 12582912                 12MiB 11.7M/s
Size 12582912, offset 6291456 12MiB 8.81M/s
Size 13631488                 13MiB 9.82M/s
Size 13631488, offset 6815744 13MiB 10.3M/s
bash-4.1# for i in 1 2 3 4 6 8 9 10 11 12 13 ; do echo Size $[$i * 1024 * 1024]; flashbench --erasesize=$[$i * 1024 * 1024] --blocksize=$[$i * 1024 * 1024] --open-au --open-au-nr=10 /dev/sdc1; echo Size $[$i * 1024 * 1024], offset $[$i * 512 * 1024]; flashbench --erasesize=$[$i * 1024 * 1024] --blocksize=$[$i * 1024 * 1024] --open-au --open-au-nr=10 --offset=$[$i * 512 * 1024] /dev/sdc1; done
Size 1048576                  1MiB 7.47M/s
Size 1048576, offset 524288   1MiB 7.86M/s
Size 2097152                  2MiB 3.79M/s
Size 2097152, offset 1048576  2MiB 4.24M/s
Size 3145728                  3MiB 6.31M/s
Size 3145728, offset 1572864  3MiB 4.05M/s
Size 4194304                  4MiB 8.7M/s
Size 4194304, offset 2097152  4MiB 7.29M/s
Size 6291456                  6MiB 7.34M/s
Size 6291456, offset 3145728  6MiB 5.7M/s
Size 8388608                  8MiB 9.17M/s
Size 8388608, offset 4194304  8MiB 9.68M/s
Size 9437184                  9MiB 8.09M/s
Size 9437184, offset 4718592  9MiB 9.03M/s
Size 10485760                 10MiB 9.31M/s
Size 10485760, offset 5242880 10MiB 10.5M/s
Size 11534336                 11MiB 9.42M/s
Size 11534336, offset 5767168 11MiB 9.36M/s
Size 12582912                 12MiB 12.3M/s
Size 12582912, offset 6291456 12MiB 8.89M/s
Size 13631488                 13MiB 9.63M/s
Size 13631488, offset 6815744 13MiB 10.1M/s
bash-4.1#
After a few runs, I modified the offset to be $[$i * 512 * 1024] instead of $[$i/2 * 1024 * 1024], since bash arithmetic can't handle fractions. I think the results are clear, though: anything less than 4MB (other than short 1MB writes) is slower; 4MB offset by 2MB is slower than 4MB; and 8MB with a 4MB offset is the same speed as 8MB. Longer writes get faster, but 12MB, another multiple of 4, is the fastest. No doubt that I'm at a 4MB eraseblock.
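For anyone following along, the bash quirk here is integer division: in $(( )) / $[ ] arithmetic, $i/2 truncates before the multiplication, so odd sizes got the wrong offset (and i=1 produced offset 0, which appears to be why those runs failed with the "need .param=" error above). A small demonstration:

```shell
# Integer division truncates first, so "half of 3 MiB" comes out as 1 MiB:
i=3
echo $(( i/2 * 1024 * 1024 ))   # 1048576  (3/2 -> 1, then * 1 MiB)
echo $(( i * 512 * 1024 ))      # 1572864  (reordered: a true 1.5 MiB)
i=1
echo $(( i/2 * 1024 * 1024 ))   # 0        (1/2 -> 0)
```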
What's my page size here -- 16KB or 32? I'm thinking 16, because there are times when 16KB was running at greater than half the speed of 32KB, but I'm not positive on this...
I'm going to transfer my OS to the Kingston drive and get some better results for the Lexar drive later today. Thanks again for your time and help!
On Thu, Jul 28, 2011 at 7:51 AM, Arnd Bergmann arnd@arndb.de wrote:
On Thursday 28 July 2011, Ajax Criterion wrote:
What's my page size here -- 16KB or 32? I'm thinking 16, because there are times when 16KB was running at greater than half the speed of 32KB, but I'm not positive on this...
Yes exactly. Given results like
64KiB   15.1M/s
32KiB   10.8M/s
16KiB   6.58M/s
8KiB    2.82M/s
4KiB    1.44M/s
2KiB    744K/s
my interpretation is that the page size is 16 KB. It may not actually be the page size, but it's what I wrote in the wiki for other drives that behave just like this one.
There is another twist that I only found late in the process of compiling the list in the wiki: usually the drives have multiple channels, so what I list as the page size is actually the page size multiplied by the number of channels. A likely scenario would be a 4KB page with 4 channels. In this case, writing 16KB at a 4KB alignment should be just as fast as writing 16KB at a 16KB alignment. In practice, this doesn't matter all that much because writing longer is always going to help; what I call the page size in the wiki is simply the smallest amount that you need to write to be reasonably efficient.
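To make that scenario concrete (the 4KB page and 4 channels are the hypothetical example above, not measured values for this drive):

```shell
# A 4KB page striped across 4 channels shows up to the benchmark as a
# single 16KB "page size" (hypothetical numbers from the scenario above).
PAGE=4096
CHANNELS=4
echo $(( PAGE * CHANNELS ))   # 16384 bytes = 16 KiB effective page size
```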
Arnd
I ran some tests on my Kingston drive to see how much having the drive in vs. out of alignment affects performance. I haven't graphed all of my results yet, and I have to run one test again (I run each test three times and check the standard deviations; one of the runs came back atypically slow).
I tested write, read, and overwrite speeds for large files (a 300MB iso file) and many small files (extracting and moving the kernel source tarball), using an EXT4 partition in three positions: first aligned to the 4MB eraseblocks; then shifted off by 1MB to replicate starting a partition on sector 2048 (this seems to be fairly common, and represents an alignment that matches page sizes but not eraseblocks); and finally shifted to 4MB + 63 sectors to replicate what happens when these drives get repartitioned in Linux and given 255/63 geometry (this represents starting out of alignment with both eraseblocks and page sizes).
What I noticed is about a 3-5% drop in write speed between the 4MB and 1MB alignments, and a 15-25% drop between the 4MB and 63-sector alignments. Read speeds were not affected as much (as one would expect), but still lost between 5% and 15% with the alignment off by 63 sectors.
I suspect these effects would be even more noticeable on a drive that could not handle as many open AUs. There is also the unseen problem of wear: while the observable speeds may only go down by 20%, if I understand the process correctly, an unaligned partition (especially one that is not aligned to the page size) could wear out the drive twice as quickly, or even faster. This may be of little consequence to those who only write to their drives occasionally, but to heavy users this information will be of great importance.
I'm going to replicate this same test with a FAT32 filesystem, as this is what most drives are formatted with out of the box, and many Porteus users run the OS off a FAT32 stick. I'll check the same (mis)alignments, and then run it in an aligned partition with a cluster size that matches the effective page size. I'm also going to put this drive back to a 255 heads/63 sectors-per-track configuration and test it again, just to confirm that this has little to no effect (just changing the starting point of the partitions would probably be easier for most users than playing with the drive's geometry).
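For reference, here is a sketch of computing a 4MB-aligned partition start rather than relying on the 255/63 default (the device name is a placeholder, and the parted call is left commented out since it would destroy the partition table):

```shell
# With 512-byte sectors, a 4 MiB eraseblock boundary falls on sector 8192.
ERASE_MIB=4
SECTOR_BYTES=512
START=$(( ERASE_MIB * 1024 * 1024 / SECTOR_BYTES ))
echo "start the first partition at sector $START"   # sector 8192
# e.g. (destructive -- double-check the device name first):
# parted -s /dev/sdX mklabel msdos mkpart primary ${ERASE_MIB}MiB 100%
```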
I do have a question out of all this: how would a FAT32 user align their partition to an eraseblock and still make use of a FAT-optimized eraseblock at the beginning of the drive? In the case of my Kingston drive, for example, the first 4MB is optimized for the FAT, but properly aligning the drive would mean placing my first partition after this eraseblock (since sector 0 is needed for the MBR). Would they typically set the start of the partition at sector 2048 and call that 'good enough', leaving 3MB at the front of the drive for the FAT? Is the optimization in that eraseblock useful enough to warrant misaligning the drive?
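On the cluster-size half of the plan, a sketch of matching FAT32 clusters to the 16KB effective page size (the mkfs.vfat line is shown commented out as an assumption about tooling; the 32 sectors per cluster follows from 512-byte sectors):

```shell
# 16 KiB clusters on 512-byte sectors = 32 sectors per cluster.
PAGE_KIB=16
SECTOR_BYTES=512
SPC=$(( PAGE_KIB * 1024 / SECTOR_BYTES ))
echo "sectors per cluster: $SPC"   # 32
# e.g.:  mkfs.vfat -F 32 -s $SPC /dev/sdX1
```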
As always, thanks for your time and your help!
flashbench-results@lists.linaro.org