4. What is CloudStack?
• Secure, multi-tenant cloud orchestration platform
– Turnkey platform for delivering IaaS clouds
– Hypervisor agnostic
– Scalable, secure and open
– Open source, open standards
– Deploys on premise or as a hosted solution
• Deliver cloud services faster and cheaper
Build your cloud the way the world’s most successful clouds are built
5. CloudStack Provides On-demand Access to Infrastructure Through a Self-Service Portal
[Diagram labels: Compute; Network; Storage; Org A (Admin, Users); Org B (Admin, Users); End User; Admin]
6. Open Flexible Platform
• Compute (Hypervisor): XenServer, VMware, KVM, Oracle VM, Bare metal
• Storage (Block & Object; Primary Storage, Secondary Storage): Local Disk, iSCSI, NFS, Fiber Channel, Swift
• Network (Network & Network Services): Network Type, Isolation, Load balancer, Firewall, VPN
8. General Architecture Abstraction
• Resource Agent
Endpoint through which CloudStack communicates with an underlying virtual or physical resource, for example a hypervisor host or a network element
• CloudStack RPC (message bus)
Cross-component interaction
Interaction across the management server cluster
• Asynchronous job execution
Orchestration workflows usually take time to finish, which does not fit a synchronous API paradigm
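The asynchronous job idea above can be sketched as follows. This is a minimal illustration, not CloudStack code: the class AsyncJobSketch and its methods submit, poll and await are hypothetical names. A call returns a job id immediately and the caller polls for the result later.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

/** Hypothetical sketch: an API call returns a job id at once; the caller polls for the result. */
class AsyncJobSketch {
    private final AtomicLong nextId = new AtomicLong(1);
    private final Map<Long, CompletableFuture<String>> jobs = new ConcurrentHashMap<>();

    /** Starts the long-running work in the background and returns a job id immediately. */
    long submit(Supplier<String> work) {
        long id = nextId.getAndIncrement();
        jobs.put(id, CompletableFuture.supplyAsync(work));
        return id;
    }

    /** Non-blocking status check: "PENDING" until the work has finished. */
    String poll(long jobId) {
        CompletableFuture<String> job = jobs.get(jobId);
        if (job == null) {
            return "UNKNOWN";
        }
        return job.isDone() ? job.join() : "PENDING";
    }

    /** Blocks until the job finishes (used here only for the demo). */
    String await(long jobId) {
        return jobs.get(jobId).join();
    }

    public static void main(String[] args) {
        AsyncJobSketch api = new AsyncJobSketch();
        long id = api.submit(() -> "vm-1 started");
        System.out.println("job " + id + " -> " + api.await(id));
    }
}
```

The point of the design is that the API thread is released as soon as the job is queued, so a slow orchestration step never holds an HTTP request open.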
9. General Architecture Abstraction
[Diagram labels: API entry; Job control; Business Logic Modules; ORM; Database Object Layer; Agents]
10. General Architecture Abstraction
[Diagram labels: Business Logic Module; Direct Agent; Co-located Agent; KVM; XenServer Host; XAPI]
Direct agent – agent running inside the management server
Co-located agent – agent that is co-located with the external resource (e.g. a host)
12. Core CloudStack Components
• Hosts
  • Servers onto which services will be provisioned
• Primary Storage
  • VM storage
• Cluster
  • A grouping of hosts and their associated storage
• Pod
  • Collection of clusters
• Network
  • Logical network associated with service offerings
• Secondary Storage
  • Template, snapshot and ISO storage
• Zone
  • Collection of pods, network offerings and secondary storage
• Management Server Farm
  • Responsible for all management and provisioning tasks
[Diagram labels: Zone; CloudStack Pod; Cluster; Host; Network; Primary Storage; VM; Secondary Storage]
13. Conceptual Deployment Architecture
Ø Hypervisor is the basic unit of scale.
Ø Cluster consists of one or more hosts of the same hypervisor.
Ø All hosts in a cluster have access to shared (primary) storage.
Ø Pod is one or more clusters, usually with L2 switches.
Ø Availability Zone has one or more pods and has access to secondary storage.
Ø One or more zones represent a cloud.
[Diagram labels: Internet; CloudStack Management Server; L3 core; Zone 1; Pod 1 … Pod N; Access Layer; Cluster 1 … Cluster N; Host 1; Host 2; Primary Storage; Secondary Storage]
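The containment rules on this slide can be written down as a small data model. The sketch below is only an illustration: every class name in it is hypothetical and does not come from the CloudStack code base. It enforces the one rule the slide states explicitly, that a cluster holds hosts of a single hypervisor type.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative model of the zone/pod/cluster/host hierarchy; all names are hypothetical. */
class DeploymentModelSketch {
    static class Host {
        final String name;
        final String hypervisor;
        Host(String name, String hypervisor) {
            this.name = name;
            this.hypervisor = hypervisor;
        }
    }

    static class Cluster {
        final String hypervisor;
        final List<Host> hosts = new ArrayList<>();
        Cluster(String hypervisor) {
            this.hypervisor = hypervisor;
        }

        /** A cluster only accepts hosts that run its own hypervisor type. */
        void add(Host host) {
            if (!host.hypervisor.equals(hypervisor)) {
                throw new IllegalArgumentException(host.name + " is not a " + hypervisor + " host");
            }
            hosts.add(host);
        }
    }

    static class Pod {
        final List<Cluster> clusters = new ArrayList<>();
    }

    static class Zone {
        final List<Pod> pods = new ArrayList<>();

        /** Counts the hosts in every cluster of every pod in this zone. */
        int hostCount() {
            int count = 0;
            for (Pod pod : pods) {
                for (Cluster cluster : pod.clusters) {
                    count += cluster.hosts.size();
                }
            }
            return count;
        }
    }

    public static void main(String[] args) {
        Cluster kvm = new Cluster("KVM");
        kvm.add(new Host("host-1", "KVM"));
        kvm.add(new Host("host-2", "KVM"));
        Pod pod = new Pod();
        pod.clusters.add(kvm);
        Zone zone = new Zone();
        zone.pods.add(pod);
        System.out.println("hosts in zone: " + zone.hostCount());
    }
}
```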
14. Management Server Deployment Architecture
[Diagram labels: Single-node Deployment; Multi-node Deployment; User API; Admin API; Load Balancer; Management Server; MySQL DB; Back Up DB; Replication; Infrastructure Resources]
Ø MS is stateless. MS can be deployed as a physical server or VM.
Ø A single MS node can manage up to 10K hosts. Multiple nodes can be deployed for scale or redundancy.
Ø Commercial: RHEL 5.4+; FOSS: Ubuntu 10.04, Fedora 16
16. Compute Subsystem
[Diagram labels: Orchestration flow; Server context; VirtualMachineGuru (UserVm Guru, System VM Gurus); HypervisorGuru (Hypervisor Gurus); RPC (Message Bus); Agent context; ServerResource (XenServer Resource, Vmware Resource, KVM Resource)]
17. Compute Subsystem
• VirtualMachineGuru
Defines pluggable points for various VM managers to implement specific VM orchestration logic. UserManagerManager, ConsoleProxyManager, etc. implement this interface.
• HypervisorGuru
Defines pluggable points to insert hypervisor-specific logic into the overall orchestration flow.
• ServerResource
Defines the abstraction layer through which the various hypervisor resource agents carry out the actions required by the orchestration flow.
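A pluggable point of the kind described above might look like the sketch below. The real VirtualMachineGuru, HypervisorGuru and ServerResource interfaces are not reproduced here. HypervisorPlugin, its two methods and the command strings are invented for illustration. The sketch only shows how a generic flow can stay hypervisor-neutral and delegate the specific step to a registered plugin.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a plug point; not the actual CloudStack interfaces. */
class GuruSketch {
    /** Invented plug point: turns a generic request into a hypervisor-specific command. */
    interface HypervisorPlugin {
        String hypervisorType();
        String toStartCommand(String vmName);
    }

    private final Map<String, HypervisorPlugin> plugins = new HashMap<>();

    void register(HypervisorPlugin plugin) {
        plugins.put(plugin.hypervisorType(), plugin);
    }

    /** Generic flow: pick the plugin for the target hypervisor and let it build the command. */
    String startVm(String vmName, String hypervisorType) {
        HypervisorPlugin plugin = plugins.get(hypervisorType);
        if (plugin == null) {
            throw new IllegalArgumentException("no plugin for " + hypervisorType);
        }
        return plugin.toStartCommand(vmName);
    }

    /** Convenience factory: a plugin that prefixes commands with its hypervisor type. */
    static HypervisorPlugin simplePlugin(final String type) {
        return new HypervisorPlugin() {
            public String hypervisorType() {
                return type;
            }
            public String toStartCommand(String vmName) {
                return type + ":start:" + vmName;
            }
        };
    }

    public static void main(String[] args) {
        GuruSketch flow = new GuruSketch();
        flow.register(simplePlugin("KVM"));
        flow.register(simplePlugin("XenServer"));
        System.out.println(flow.startVm("vm-1", "KVM"));
    }
}
```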
18. Storage Subsystem
• Subsystem functions
§ Provision VM volumes to storage devices
§ Manage volume snapshots
§ Manage templates and ISOs
§ Motion service to move storage content
• Storage classification
§ Primary Storage
Provides VM volume storage at all times, including runtime
Demands high-performance storage I/O for read/write access
§ Secondary Storage (Backup/Object store)
Stores read-only content (templates, ISOs)
19. Understanding roles in Storage
• Primary Storage
  • Cluster level storage for VMs
  • Connected directly to hosts
  • NFS, iSCSI, FC and Local
• Secondary Storage
  • Zone level storage for templates, ISOs and snapshots
  • NFS or OpenStack Swift via CloudStack System VM
• Templates and ISOs
  • Imported into CloudStack
  • Can be private or public
[Diagram labels: Zone; Secondary Storage; Template; Pod; Cluster; Host; Primary Storage]
20. Network subsystem
• Subsystem functions
§ Networking multi-tenancy
§ Provision user logical networks
§ Manage physical/logical network configuration
§ Provide pluggable framework for third-party vendors
21. CloudStack Management Network
[Diagram labels: Management Server; System VM; User VM; Hypervisor Host; addresses 172.16.0.10, 172.16.0.15, 172.16.0.20]
The management server manages physical or virtual resources through management network(s). The network used to manage hypervisor hosts can be different from the network used to manage system VMs.
22. Host link-local network
[Diagram labels: Management Server; System VM; User VM; Hypervisor Host; addresses 172.16.0.10, 172.16.0.15, 169.254.0.1, 169.254.0.10]
Why host link-local network?
• For hypervisor hosts that have a co-located agent, a host link-local network can reduce IP address usage
• The hypervisor's co-located agent/plugin proxies the channel between the virtual resource (system VM) and the management server
23. Multi-tenancy: L2 VLAN isolation
[Diagram labels: CloudStack Management Server; two Hypervisor Hosts, each with Host Dom0/Kernel, System VM and Guest VM; 192.168.10.0/24; 10.1.0.0/24 on VLAN 100; 10.1.0.0/24 on VLAN 200]
24. Multi-tenancy: Security group isolation
[Diagram labels: Web Security Group (Web VMs); DB Security Group (DB VMs)]
25. Network Provisioning
[Diagram labels: Network orchestration flow; Network Offering; NetworkGuru (GuestNetworkGuru, DirectNetworkGuru); NetworkElement (VirtualRouterElement, NetScalarElement); PhysicalNetwork Configuration; RPC (Message Bus)]
NetworkGuru – helps define and implement a shared or tenant guest network in a lazy manner; it also helps perform resource (IP, MAC address, SDN tenant ID) management in the various orchestration phases
NetworkElement – carries out network-related actions
PhysicalNetwork – defines a mapping configuration that helps map a logical network onto the physical infrastructure
Network Offering – defines a feature-set template for a guest network and helps instantiate the network instance in the lazy construction stage
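The "lazy" construction mentioned above can be pictured with a small sketch. The assumption, not stated on the slide, is that a network is first only designed (recorded) and that a scarce resource such as a VLAN ID is taken only when the network is actually implemented, for example when its first VM starts. LazyNetworkSketch and all of its members are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch of lazy network construction; not CloudStack code. */
class LazyNetworkSketch {
    enum State { ALLOCATED, IMPLEMENTED }

    /** Pool of free VLAN IDs, a stand-in for any scarce resource a guru manages. */
    private final Deque<Integer> freeVlans = new ArrayDeque<>();

    LazyNetworkSketch(int firstVlan, int lastVlan) {
        for (int vlan = firstVlan; vlan <= lastVlan; vlan++) {
            freeVlans.add(vlan);
        }
    }

    class GuestNetwork {
        State state = State.ALLOCATED;
        Integer vlan; // stays null until the network is implemented

        /** Consumes a VLAN on first use; later calls are no-ops. */
        void implement() {
            if (state == State.IMPLEMENTED) {
                return;
            }
            if (freeVlans.isEmpty()) {
                throw new IllegalStateException("no VLAN left");
            }
            vlan = freeVlans.pop();
            state = State.IMPLEMENTED;
        }
    }

    /** Designing a network records it but takes nothing from the pool. */
    GuestNetwork design() {
        return new GuestNetwork();
    }

    int freeVlanCount() {
        return freeVlans.size();
    }

    public static void main(String[] args) {
        LazyNetworkSketch networks = new LazyNetworkSketch(100, 199);
        GuestNetwork tenantNet = networks.design();
        System.out.println("before first VM: " + tenantNet.state + ", vlan=" + tenantNet.vlan);
        tenantNet.implement();
        System.out.println("after first VM:  " + tenantNet.state + ", vlan=" + tenantNet.vlan);
    }
}
```

Deferring the allocation means that networks which are defined but never used do not consume VLANs.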
27. Offerings
• Service Offering
CPU speed/CPU core/Memory/HA/Rate Limit/Tags
• Disk Offering
Disk size/Local Storage/Tags
• Network Offering
Rate limit/Traffic Type/Isolation characteristics/feature masks
All types of offerings enable CloudStack to instantiate related objects in a lazy, “fluid” manner
28. Tags
• Tagging resource objects (Host, Storage, etc.)
• Tagging offering objects (ServiceOffering, DiskOffering, etc.)
Tags perform a loosely-coupled association between objects, typically between offering objects and resource objects. Resource allocators can take advantage of the association to influence allocation affinity.
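One way an allocator could use tags is sketched below. The matching rule is an assumption made for illustration: a host is eligible when it carries every tag of the offering. TagAllocatorSketch, eligibleHosts and tags are hypothetical names rather than CloudStack APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of tag-based host selection; the matching rule is an assumption. */
class TagAllocatorSketch {
    /** Returns the hosts whose tags include every tag required by the offering. */
    static List<String> eligibleHosts(Map<String, Set<String>> hostTags, Set<String> offeringTags) {
        List<String> eligible = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : hostTags.entrySet()) {
            if (entry.getValue().containsAll(offeringTags)) {
                eligible.add(entry.getKey());
            }
        }
        return eligible;
    }

    /** Small helper to build a tag set. */
    static Set<String> tags(String... values) {
        return new HashSet<>(Arrays.asList(values));
    }

    public static void main(String[] args) {
        Map<String, Set<String>> hosts = new LinkedHashMap<>();
        hosts.put("host-1", tags("ssd", "gpu"));
        hosts.put("host-2", tags("ssd"));
        hosts.put("host-3", tags());
        System.out.println(eligibleHosts(hosts, tags("ssd")));
    }
}
```

Neither side references the other directly, so an offering can be retargeted to different hardware by retagging hosts alone.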
30. CloudStack System VMs
• System VMs optimize and scale the data-path on behalf of CloudStack
– Stateless, can be destroyed and recreated from database state
– Highly Available
– Communicate with the Management Server over the management network
– Usually have 3 interfaces: control, guest and public
• Console Proxy VM
– Provides AJAX-style HTTP-only console viewer
– Grabs VNC output from hypervisor
– Scales out (more spawned) as load increases
– Java-based server communicates with MS over message bus
• Secondary Storage VM
– Provides image (template) management services
– Downloads from HTTP file share or Swift
– Copies between zones
– Scales out to handle multiple NFS mounts
– Java-based server communicates with MS over message bus
31. System VM contd
• SSH keys and password are unique to each cloud installation
• Code can be patched by restarting the system VM
– Mounts a special ISO file with the latest code at boot
– If ISO contents differ, patch and reboot
• Same system VM works on XS, KVM, VMware
– Bootstrap step for the cloud is to install the template for this system VM
• Ready to be re-purposed for other specialized tasks
33. Architecture refactoring
• Good about CloudStack
Ø Simplicity
Easy to understand
Easy to set up
Easy to operate
Out-of-the-box solution for almost any cloud, from private to public
• Bad about CloudStack
Ø Tightly-coupled
This architectural complaint is being addressed in the Apache CloudStack community
34. Architecture refactoring
• Modular/Componentization refactoring
– Move the build system to Maven
– Adopt Spring Framework
• Make component wiring consistent and explicit
• Loose coupling
– RPC/Message Bus improvement
• Async programming model
– AsyncCallFuture<T>
• VM state sync
35. Architecture refactoring
• RPC/Message Bus improvement
– Interface binding
AgentManager/Listener
AgentManager defines the RPC/Messaging API; users of RPC/Messaging implement the Listener interface and explicitly bind to AgentManager
– Topic binding
Topic constants
Users subscribe to the topic itself; publisher and subscriber are associated indirectly through topic constants
– Hierarchical topic naming convention
Topics are named like DNS names; subscribers at a higher level can be notified of lower-level events
– Transparently extend message bus to cross boundaries
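Topic binding with hierarchical names can be sketched as below. This is an illustration under stated assumptions, not the CloudStack message bus. Topics are taken to be dot-separated with the most general segment first (for example "vm.state.changed"), and a subscriber to "vm" is taken to receive everything published beneath it. TopicBusSketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

/** Hypothetical in-process bus with hierarchical topics; not the CloudStack message bus. */
class TopicBusSketch {
    private static final class Subscription {
        final String topic;
        final BiConsumer<String, String> handler;
        Subscription(String topic, BiConsumer<String, String> handler) {
            this.topic = topic;
            this.handler = handler;
        }
    }

    private final List<Subscription> subscriptions = new ArrayList<>();

    /** The handler receives (topic, message) for the topic and every topic beneath it. */
    void subscribe(String topic, BiConsumer<String, String> handler) {
        subscriptions.add(new Subscription(topic, handler));
    }

    /** Delivers to subscribers of the topic or of any ancestor; returns the delivery count. */
    int publish(String topic, String message) {
        int delivered = 0;
        for (Subscription s : subscriptions) {
            // Match whole segments only, so "vm" does not match "vmware.added".
            if (topic.equals(s.topic) || topic.startsWith(s.topic + ".")) {
                s.handler.accept(topic, message);
                delivered++;
            }
        }
        return delivered;
    }

    public static void main(String[] args) {
        TopicBusSketch bus = new TopicBusSketch();
        bus.subscribe("vm", (topic, msg) -> System.out.println("[vm] " + topic + ": " + msg));
        bus.subscribe("vm.state", (topic, msg) -> System.out.println("[vm.state] " + topic + ": " + msg));
        bus.publish("vm.state.changed", "vm-1 is Running");
    }
}
```

Publisher and subscriber never hold a reference to each other; the topic string is the only contract between them.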
37. Architecture refactoring
• AsyncCallFuture<T> to connect sync and async worlds
public void MethodThatWillCallAsyncMethod() {
    String vol = new String("Hello");
    AsyncCallbackDispatcher<AsyncSampleEventDrivenStyleCaller, Object> caller = AsyncCallbackDispatcher.create(this);
    // Start the asynchronous call; it returns a future immediately.
    AsyncCallFuture<String> future = _ds.createVolume(vol);
    try {
        // Block until the asynchronous call completes.
        String result = future.get();
        Assert.assertEquals(result, vol);
    } catch (InterruptedException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    } catch (ExecutionException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
}
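The same bridge between the two worlds can be shown with the JDK alone. The sketch below does not use AsyncCallFuture; it swaps in java.util.concurrent.CompletableFuture as a stand-in, and createVolumeAsync is a made-up callback-style API. It shows a callback being wrapped in a future so that synchronous code can simply wait for the result.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

/** Stand-in sketch using CompletableFuture; AsyncCallFuture itself is not used here. */
class FutureBridgeSketch {
    /** Hypothetical callback-style API: reports its result on another thread. */
    static void createVolumeAsync(String name, Consumer<String> onDone) {
        new Thread(() -> onDone.accept("volume:" + name)).start();
    }

    /** Wraps the callback API in a future so synchronous callers can wait on it. */
    static CompletableFuture<String> createVolume(String name) {
        CompletableFuture<String> future = new CompletableFuture<>();
        createVolumeAsync(name, future::complete);
        return future;
    }

    public static void main(String[] args) {
        // join() blocks until the callback has completed the future.
        String result = createVolume("data-1").join();
        System.out.println(result);
    }
}
```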
38. Architecture Refactoring
Current VMSync implementation
• Basic sync unit at host/cluster level
• Initialize sync-scope at startup (full sync)
• Monitor and smooth state changes and perform delta reports
– Resource needs to know about CloudStack-specific VM states like migrating, starting
– Resource needs to smooth the state report in the rebooting case, if the reboot is issued from the guest OS
• Threads from multiple sources may collide when performing state changes
– Thread from HA/State recovery procedure
– Thread from API request
– Thread from Host VM state report
39. Architecture Refactoring
Basic idea of VMSync refactoring
• Resource agent is only required to report the raw VM power state
• CloudStack VM state transition is driven by state-transition jobs, avoiding multiple uncoordinated driving sources (from HA/API/sync report)
• Operation jobs on a VM are serialized for execution
• Conflict resolution
– Automatic policy (HA/force sync)
– Manual policy (Alerts, Admin intervention, user acknowledgement workflow)
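Serializing operation jobs per VM can be sketched with one single-threaded queue per VM. This is only an illustration of the idea, not how CloudStack implements it. VmJobQueueSketch, submit and drain are hypothetical names. Jobs for the same VM run one at a time in submission order, whichever source (HA, API or sync report) submitted them, while different VMs proceed independently.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: one single-threaded queue per VM serializes that VM's jobs. */
class VmJobQueueSketch {
    private final Map<String, ExecutorService> queues = new ConcurrentHashMap<>();

    /** Queues a job for the VM; jobs of the same VM never run concurrently. */
    void submit(String vmId, Runnable job) {
        queues.computeIfAbsent(vmId, id -> Executors.newSingleThreadExecutor()).submit(job);
    }

    /** Stops accepting jobs and waits for queued ones; returns false on timeout or interrupt. */
    boolean drain(long timeoutSeconds) {
        boolean finished = true;
        for (ExecutorService queue : queues.values()) {
            queue.shutdown();
            try {
                finished &= queue.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return finished;
    }

    public static void main(String[] args) {
        VmJobQueueSketch jobs = new VmJobQueueSketch();
        List<String> log = new CopyOnWriteArrayList<>();
        jobs.submit("vm-1", () -> log.add("stop (from API)"));
        jobs.submit("vm-1", () -> log.add("start (from HA)"));
        jobs.submit("vm-2", () -> log.add("migrate (from API)"));
        jobs.drain(5);
        System.out.println(log);
    }
}
```

Because each VM has exactly one worker, two sources can no longer race on the same VM's state; a conflict shows up as an ordering question that a policy can resolve.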
40. Thank you
• Q/A
• For more information, please visit http://cloudstack.apache.org/