Trying to set up a big data backend for collecting data from the web. Setting up Hadoop or another distributed filesystem is the first step. Here are my findings for the Cubieboard.
First, here is the physical hard drive write speed on a Cubieboard1 with a 1 TB TOSHIBA HDD. Since this time I'm testing raw I/O speed for overall performance, I bypass the page cache with oflag=direct (I normally benchmark real desktop use cases without it):
dd if=/dev/zero of=here bs=1G count=1 oflag=direct
My setup has a 1 TB drive on the Cubieboard1 and a 2 TB drive on the Cubieboard2:
TOSHIBA MQ01ABD100 1 TB with 8 MB cache
TOSHIBA MQ01ABB200 2 TB with 8 MB cache as well
Note: You may not be able to find a bare 2 TB 2.5" HDD sold directly, or it may cost more than a portable USB 3.0 HDD. What I did was find the portable drive that is easiest to disassemble, so I picked the cheapest ADATA 2 TB HDD at about 100 USD.
The following benchmark was run about 5 times each over 5 different nodes. As shown below, the 2 TB model is about 20% slower than the 1 TB model.
[2T 8MB]
$ dd if=/dev/zero of=here bs=500M count=1 oflag=direct
1+0 records in
1+0 records out
524288000 bytes (524 MB) copied, 14.4021 s, 36.4 MB/s
$ dd if=/dev/zero of=here bs=500M count=1 oflag=direct
1+0 records in
1+0 records out
524288000 bytes (524 MB) copied, 13.9933 s, 37.5 MB/s
$ dd if=/dev/zero of=here bs=500M count=1 oflag=direct
1+0 records in
1+0 records out
524288000 bytes (524 MB) copied, 13.4424 s, 39.0 MB/s
$ dd if=/dev/zero of=here bs=500M count=1 oflag=direct
1+0 records in
1+0 records out
524288000 bytes (524 MB) copied, 13.5966 s, 38.6 MB/s
$ dd if=/dev/zero of=here bs=500M count=1 oflag=direct
1+0 records in
1+0 records out
524288000 bytes (524 MB) copied, 13.4198 s, 39.1 MB/s
[1T 8MB]
$ dd if=/dev/zero of=here bs=500M count=1 oflag=direct
1+0 records in
1+0 records out
524288000 bytes (524 MB) copied, 10.9552 s, 47.9 MB/s
$ dd if=/dev/zero of=here bs=500M count=1 oflag=direct
1+0 records in
1+0 records out
524288000 bytes (524 MB) copied, 10.0984 s, 51.9 MB/s
$ dd if=/dev/zero of=here bs=500M count=1 oflag=direct
1+0 records in
1+0 records out
524288000 bytes (524 MB) copied, 10.2123 s, 51.3 MB/s
$ dd if=/dev/zero of=here bs=500M count=1 oflag=direct
1+0 records in
1+0 records out
524288000 bytes (524 MB) copied, 10.1852 s, 51.5 MB/s
$ dd if=/dev/zero of=here bs=500M count=1 oflag=direct
1+0 records in
1+0 records out
524288000 bytes (524 MB) copied, 10.1091 s, 51.9 MB/s
You may also notice the first run is always the slowest; this is probably the spin-up time of the first run compared to the rest. With no cache and identical configuration, the Cubieboard2 has no advantage over the Cubieboard1 in raw physical writes, so the hard drives basically speak for themselves.
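The five runs per disk above can be automated with a small loop; a minimal sketch (the file size is shrunk here for brevity, the runs above used bs=500M):

```shell
# Run the direct-I/O write test repeatedly and keep only the speed line.
# oflag=direct needs a filesystem that supports O_DIRECT.
for i in 1 2 3 4 5; do
  dd if=/dev/zero of=here bs=4M count=8 oflag=direct 2>&1 | tail -n 1
  rm -f here
done
```

Deleting the file between runs keeps each iteration writing a fresh allocation rather than overwriting the previous one.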
Next we test the network bandwidth of each individual node:
[2T Server]
$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[ 4] local 192.168.2.201 port 5001 connected with 192.168.2.101 port 37688
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.2 sec 111 MBytes 91.1 Mbits/sec
[ 5] local 192.168.2.201 port 5001 connected with 192.168.2.102 port 43612
[ 5] 0.0-10.0 sec 109 MBytes 90.8 Mbits/sec
[ 4] local 192.168.2.201 port 5001 connected with 192.168.2.103 port 43104
[ 4] 0.0-10.2 sec 111 MBytes 91.1 Mbits/sec
[ 5] local 192.168.2.201 port 5001 connected with 192.168.2.104 port 48391
[ 5] 0.0-10.2 sec 110 MBytes 91.1 Mbits/sec
[1T Client]
$ iperf -c 192.168.2.201 -i1 -t 10
------------------------------------------------------------
Client connecting to 192.168.2.201, TCP port 5001
TCP window size: 21.0 KByte (default)
------------------------------------------------------------
[ 3] local 192.168.2.101 port 37688 connected with 192.168.2.201 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0- 1.0 sec 11.9 MBytes 99.6 Mbits/sec
[ 3] 1.0- 2.0 sec 11.4 MBytes 95.4 Mbits/sec
[ 3] 2.0- 3.0 sec 11.8 MBytes 98.6 Mbits/sec
[ 3] 3.0- 4.0 sec 10.2 MBytes 86.0 Mbits/sec
[ 3] 4.0- 5.0 sec 11.0 MBytes 92.3 Mbits/sec
[ 3] 5.0- 6.0 sec 11.1 MBytes 93.3 Mbits/sec
[ 3] 6.0- 7.0 sec 11.1 MBytes 93.3 Mbits/sec
[ 3] 7.0- 8.0 sec 10.2 MBytes 86.0 Mbits/sec
[ 3] 8.0- 9.0 sec 11.1 MBytes 93.3 Mbits/sec
[ 3] 9.0-10.0 sec 11.1 MBytes 93.3 Mbits/sec
[ 3] 0.0-10.1 sec 111 MBytes 92.5 Mbits/sec
As you can see, the network interface of each host delivers about 90 Mbit/s. Now we can write directly over the network to a TCP socket and see how fast it can be handled.
[2T Cubie2]
$ nc -v -v -l -n -p 8888 > /dev/null
listening on [any] 8888 ...
connect to [192.168.2.201] from (UNKNOWN) [192.168.2.101] 54491
sent 0, rcvd -1582627856
[1T Cubie1]
$ time dd if=/dev/zero | nc -v -v -n 192.168.2.201 8888
(UNKNOWN) [192.168.2.201] 8888 (?) open
^C5297681+0 records in
5297680+0 records out
2712412160 bytes (2.7 GB) copied, 238.063 s, 11.4 MB/s
sent -1582628864, rcvd 0
real 3m58.079s
user 0m6.720s
sys 1m25.620s
The total socket write throughput is about 11.4 MB/s, which is basically the same as 90 Mbit/s, so the bottleneck is most likely the network interface.
Let's get started installing Ceph, following the quick start guide.
First, give your user passwordless sudo:
$>echo "{username} ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/{username}
$>sudo chmod 0440 /etc/sudoers.d/{username}
Then set up passwordless SSH from the admin node to the client nodes:
$>ssh-keygen
$>ssh-copy-id {username}@A10Cubie001
$>ssh-copy-id {username}@A10Cubie002
$>ssh-copy-id {username}@A10Cubie003
$>ssh-copy-id {username}@A10Cubie004
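The four ssh-copy-id invocations above can be collapsed into a loop (hostnames as above; echo is left in so the sketch is safe to dry-run):

```shell
# Print the key-copy command for each node; drop 'echo' to actually run it.
for host in A10Cubie001 A10Cubie002 A10Cubie003 A10Cubie004; do
  echo ssh-copy-id "ceph@${host}"
done
```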
For ease of memory I use the user "ceph", and in its home directory we create a configuration directory for the future cluster-fs admin, on the Cubieboard2 2 TB machine:
[A20Cubie001]
$>ceph-deploy new A10Cubie001 A10Cubie002 A10Cubie003 A10Cubie004
$>ceph-deploy mon create-initial
$>ceph-deploy osd prepare A20Cubie001:/var/local/osd0 A10Cubie001:/var/local/osd1 A10Cubie002:/var/local/osd2 A10Cubie003:/var/local/osd3 A10Cubie004:/var/local/osd4
$>ceph-deploy osd activate A20Cubie001:/var/local/osd0 A10Cubie001:/var/local/osd1 A10Cubie002:/var/local/osd2 A10Cubie003:/var/local/osd3 A10Cubie004:/var/local/osd4
$>ceph-deploy admin A20Cubie001 A10Cubie001 A10Cubie002 A10Cubie003 A10Cubie004
$>sudo chmod +r /etc/ceph/ceph.client.admin.keyring
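Before benchmarking, it's worth confirming the cluster actually reached a healthy state. These are the standard ceph CLI status queries (run on the admin node; shown here as a suggestion, output will vary with your cluster):
$>ceph health
$>ceph -s
$>ceph osd tree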
Then you can start testing RADOS with its internal benchmark tool:
$>rados bench 180 --no-cleanup -p data write
Maintaining 16 concurrent writes of 4194304 bytes for up to 180 seconds or 0 objects
Object prefix: benchmark_data_A20Cubie001_17853
sec Cur ops started finished avg MB/s cur MB/s last lat avg lat
0 0 0 0 0 0 - 0
1 16 16 0 0 0 - 0
2 16 16 0 0 0 - 0
3 16 17 1 1.33277 1.33333 2.78376 2.78376
4 16 17 1 0.999587 0 - 2.78376
5 16 17 1 0.799692 0 - 2.78376
6 16 18 2 1.33284 1.33333 5.52745 4.1556
7 16 18 2 1.14244 0 - 4.1556
8 16 21 5 2.49912 6 7.87302 6.17543
9 16 21 5 2.22145 0 - 6.17543
10 16 23 7 2.79904 4 9.91452 7.21812
11 16 24 8 2.9081 4 10.7933 7.66502
12 16 25 9 2.99899 4 11.8287 8.12766
13 16 26 10 3.07589 4 12.5663 8.57153
14 16 29 13 3.71305 12 3.74245 8.11178
15 16 32 16 4.26524 12 3.93393 8.17245
16 16 32 16 3.99867 0 - 8.17245
17 16 34 18 4.23389 4 16.3904 8.76522
18 16 34 18 3.99867 0 - 8.76522
19 16 34 18 3.78822 0 - 8.76522
2014-12-18 17:23:03.644173min lat: 2.78376 max lat: 16.3904 avg lat: 8.93634
.......
Total time run: 188.341275
Total writes made: 284
Write size: 4194304
Bandwidth (MB/sec): 6.032
Stddev Bandwidth: 4.47614
Max bandwidth (MB/sec): 20
Min bandwidth (MB/sec): 0
Average Latency: 10.5951
Stddev Latency: 7.44041
Max latency: 29.358
Min latency: 2.16626
$ rados bench 180 --no-cleanup -p data seq
sec Cur ops started finished avg MB/s cur MB/s last lat avg lat
0 0 0 0 0 0 - 0
1 16 18 2 7.95626 8 0.946602 0.929925
2 16 22 6 11.9603 16 0.169324 0.999572
3 16 27 11 14.6314 20 0.416821 1.18134
4 16 33 17 16.9671 24 0.238657 1.48294
5 16 38 22 17.5705 20 4.99503 1.9523
6 16 39 23 15.3104 4 0.159086 1.87433
7 16 43 27 15.4076 16 4.31879 2.1499
8 16 47 31 15.4804 16 1.67389 2.44008
9 16 49 33 14.649 8 0.158647 2.56257
10 16 54 38 15.1824 20 3.30579 2.51419
11 16 57 41 14.8928 12 5.45435 2.76928
12 16 62 46 15.3172 20 1.45525 2.90556
13 16 63 47 14.447 4 10.5026 3.0672
14 16 66 50 14.2718 12 1.48368 3.11797
15 16 70 54 14.3865 16 1.59464 3.33705
16 16 74 58 14.4868 16 5.52285 3.45213
17 16 80 64 15.0452 24 5.48771 3.41443
18 16 85 69 15.3194 20 3.90558 3.37048
19 16 88 72 15.1444 12 0.174209 3.43536
2014-12-18 17:29:19.244628min lat: 0.158647 max lat: 11.61 avg lat: 3.50872
.....
Total time run: 80.853516
Total reads made: 284
Read size: 4194304
Bandwidth (MB/sec): 14.050
Average Latency: 4.52315
Max latency: 18.0705
Min latency: 0.137559
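Because the write bench ran with --no-cleanup (so the seq bench had objects to read back), the 284 benchmark objects are still sitting in the pool. They can be removed afterwards with the rados cleanup subcommand; the exact syntax may vary across Ceph releases, but something like:
$>rados -p data cleanup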
Posted by AvengerGear Alex