chiba-research 2010-01-22 at rakuten meeting
1.
2.
Dynamic Load-Balanced Multicast for Data-Intensive Applications on Clouds
3.
pay-as-you-go
Web
HPC
4. HPC
Amazon Web Services
Science Cloud
my resource
others
VM, VM, VM, VM
VM Monitor/resource manager
cloud users: provided virtual compute resources over HTTP, SSH
black box
cloud physical infrastructure = Data Center
6.
cloud storage
cloud users
only once
cloud compute resources
high network transfer charge
7.
8.
9.
• Optimal Tree
• clusters and grids
• WAN
T= + M/B + ….
Nodes: A, B, C, D
Bandwidth(B-C) = 800Mbps
Latency(B-C) = 2ms
10. application-level multicast (ALM)
structured overlay
[Castro et al. ’02]
[Castro et al. ’03]
[Cohen et al. ’03]
11. Algorithm features

                                  cluster         P2P
multicast topology                spanning tree   overlay
communication type                push            pull
network performance               high            low
network proximity                 dense           sparse
node-to-node performance          homo.           hetero.
topology                          stable          unstable
adaptability for dynamic change   bad             good
12. total I/O throughput
I/O bandwidth bottleneck
bucket
nodes
flat tree algorithm
Flat Tree
19.
20.
21.

                                  cluster         P2P        clouds
multicast topology                spanning tree   overlay    tree + overlay
communication type                push            pull       pull
network performance               high            low        middle
network proximity                 dense           sparse     dense
node-to-node performance          homo.           hetero.    hetero.
topology                          stable          unstable   (un)stable
adaptability for dynamic change   bad             good       good
cluster multicast
proposed multicast algorithm on clouds
P2P multicast
23. phase 1
$R_i = \left( \frac{iP}{N},\ \frac{(i+1)P}{N} - 1 \right)$
32KB pieces; node 0, node 1, ..., node N-1
expression: meaning
P
N
R_i: i ID
W_i: i
i, j: $0 \le i, j < N$
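The range formula on this slide can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation: it assumes P is the total number of pieces and N the number of nodes, reads R_i as an inclusive (first, last) piece range, and uses integer division so that it also works when N does not divide P. The function name `initial_ranges` and the 1 GB / 32 KB / 8-node figures in the usage line are illustrative assumptions.

```python
def initial_ranges(num_pieces: int, num_nodes: int) -> list[tuple[int, int]]:
    """Return the inclusive piece range (first, last) for every node i.

    Assumption: R_i = (i*P/N, (i+1)*P/N - 1) with P = num_pieces and
    N = num_nodes; integer division is our choice for non-divisible cases.
    """
    return [
        (i * num_pieces // num_nodes, (i + 1) * num_pieces // num_nodes - 1)
        for i in range(num_nodes)
    ]


# Hypothetical setup: 1 GB split into 32 KB pieces, shared by 8 nodes.
pieces = (1024 * 1024) // 32  # 1 GB expressed in KB, divided by 32 KB
ranges = initial_ranges(pieces, 8)
print(ranges[0], ranges[-1])  # (0, 4095) (28672, 32767)
```

The ranges are contiguous and non-overlapping, so together the nodes fetch each piece from cloud storage exactly once.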
24. phase 2
BitTorrent-like
possession(i)
request(p)
have(p)
update( possession(i) )
pos. list: update
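To make the BitTorrent-like exchange concrete, here is a minimal in-memory sketch in Python. Only the names possession, request, have and update come from the slide; the round structure, the random peer choice, and the function names `exchange_round` and `run_phase2` are assumptions made for illustration, and no real networking is involved.

```python
import random


def exchange_round(possession: list[set[int]], rng: random.Random) -> int:
    """Run one pull round and return the number of pieces transferred.

    Assumption: each node sees the other nodes' possession lists as they
    were at the start of the round and pulls at most one missing piece.
    """
    snapshot = [set(p) for p in possession]  # advertised possession(j)
    transfers = 0
    for i, mine in enumerate(possession):
        candidates = [
            (j, p)
            for j, theirs in enumerate(snapshot)
            if j != i
            for p in theirs - mine
        ]
        if candidates:
            _peer, piece = rng.choice(candidates)  # request(p) to a peer that has p
            mine.add(piece)                        # piece arrives; update(possession(i))
            transfers += 1
    return transfers


def run_phase2(possession: list[set[int]], seed: int = 0) -> int:
    """Repeat rounds until no node can pull anything; return the round count."""
    rng = random.Random(seed)
    rounds = 0
    while exchange_round(possession, rng) > 0:
        rounds += 1
    return rounds


nodes = [{0, 1}, {2, 3}, {4, 5}]  # pieces each node holds after phase 1
run_phase2(nodes)
print(nodes)  # every node now holds pieces 0..5
```

The loop terminates because every round with a transfer strictly increases the total number of held pieces, which is bounded by the number of nodes times the number of distinct pieces.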
25. phase 1 (1/2)
Download Work Stealing
32KB pieces; node 0, node 1, ..., node N-1
26. phase 1 (2/2)
fast node i: 8 MB/sec
slow node j: 2 MB/sec
1) steal request (i, j)
2) divide current work (j)
3) send new work list (j)
$W_i = W \cdot B_i / (B_i + B_j) = 5 \cdot 8 / (8 + 2) = 4$
$W_j = W \cdot B_j / (B_i + B_j) = 5 \cdot 2 / (8 + 2) = 1$
Legend: assigned download pieces; already downloaded; not yet downloaded
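The bandwidth-proportional split can be checked with a short Python sketch. It assumes W is the number of not-yet-downloaded pieces and B_i, B_j are the measured download bandwidths of the two nodes. The function name `split_work` and the rounding rule (round node j's share, give the remainder to node i) are assumptions, since the slide only shows a case that divides evenly.

```python
def split_work(remaining: int, bw_i: float, bw_j: float) -> tuple[int, int]:
    """Divide `remaining` pieces between nodes i and j in proportion to bandwidth.

    Returns (w_i, w_j) with w_i + w_j == remaining.
    Assumption: w_j is rounded to the nearest integer and w_i takes the rest.
    """
    if bw_i <= 0 or bw_j <= 0:
        raise ValueError("bandwidths must be positive")
    w_j = round(remaining * bw_j / (bw_i + bw_j))
    w_i = remaining - w_j
    return w_i, w_j


# The slide's example: W = 5, B_i = 8 MB/sec, B_j = 2 MB/sec.
print(split_work(5, 8.0, 2.0))  # (4, 1)
```

With this split both nodes need roughly the same time to finish their share (4 pieces at 8 MB/sec versus 1 piece at 2 MB/sec), which is the point of weighting by bandwidth.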
27. non-steal algorithm, steal algorithm, flat tree
(completion time)
(stability)
(node scalability)
(performance analysis)

instance   CPU              memory   HDD      price
small      1 ECU (1 core)   1.7 GB   160 GB   $0.10/hour

http://aws.amazon.com/ec2/instance-types/
ECU: EC2 Compute Unit, 1.0 ~ 1.2 GHz
28. completion time and stability
[Figure: two panels, non-steal algorithm and steal algorithm. x-axis: runs (1 to 10); y-axis: completion time (sec), 0 to 140; series: slowest, average, fastest]
1GB 8
non-steal, steal flat tree
29. node scalability
[Figure: total throughput (MB/sec), 0 to 10, for 4nodes, 8nodes, 16nodes, 32nodes; series: flat tree, non-steal, steal]
1GB