Cloud Engineering

Software in Datacenters
Gwendal Simon
Department of Computer Science
Institut Telecom
2009

Gartner Hype Cycle 2009


Literature
A book:
“The Datacenter as a Computer”, Luiz André Barroso and Urs Hölzle

and scientific publications including:
ACM Sigops (SOSP) and Usenix OSDI
HPDC, EuroPar, OOPSLA, etc.
blogs (e.g. http://perspectives.mvdirona.com/)


Disclaimer
No discussion about the impact of cloud computing:
net neutrality
interactions between CDNs and ISPs
privacy and electronic human rights
...


Focus here on how cloud computing works


Introduction


In a nutshell
Cloud computing is a model for enabling convenient,
on-demand network access to a shared pool of
configurable computing resources (e.g., networks,
servers, storage, applications, and services) that can
be rapidly provisioned and released with minimal
management effort or service provider interaction

NIST: http://csrc.nist.gov/groups/SNS/cloud-computing/index.html


Why Cloud Computing
The *aaS paradigm:
SaaS: Software as a Service
applications for end-users (salesforce.com, Google, etc.)
- email, office suite, photo sharing, video storage


PaaS: Platform as a Service
services for web app developers (Azure, Google, etc.)
- workflow facilities and various basic services (http, database)

IaaS: Infrastructure as a Service
resources for developers (Amazon, Joyent, etc.)
- servers, network equipment, memory, CPU

An Engineer's Vision
Benefits: Outsourcing Infrastructure
reduce run time and response time
minimize infrastructure risk
ease deployment and upgrading


Challenge: Warehouse-Scale Computers
thousands of individual computing nodes
costly equipment (power, conditioning, cooling)
large buildings with engineering teams


A Few Pictures


Datacenter is Different
Datacenter vs. Desktop Software:
inherent parallelism
software control
platform homogeneity
fault-free requirement


Datacenter vs. High-Performance Computing
unpredictable input
high volume of data
not only computing

Basic Elements of a Datacenter
Computing Architecture
Energy
Dealing with Failures




Architecture Basics


Main Elements
Storage: distributed file system (e.g. GFS) or NAS?
GFS is cheaper and faster for read operations


Network: 1-Gbps switch with 48 ports in a rack
1 port per server, 8 uplink ports to the cluster-level switch
→ oversubscription factor of 5 or more
scarce cluster-level bandwidth
attractive rack-level networking
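
The oversubscription figure above can be checked with a little arithmetic. The sketch below assumes that all ports run at the same 1 Gbps and that the 40 ports not used as uplinks are all connected to servers; the function name is illustrative only.

```python
def oversubscription(ports_per_switch: int, uplink_ports: int) -> float:
    """Ratio of server-facing bandwidth to uplink bandwidth on a rack switch.

    Assumes every port has the same speed and every non-uplink port
    is connected to a server.
    """
    server_ports = ports_per_switch - uplink_ports
    return server_ports / uplink_ports


PORT_GBPS = 1.0

ratio = oversubscription(ports_per_switch=48, uplink_ports=8)  # 40 / 8

# Worst case: every server in the rack talks to other racks at once,
# so the uplinks are shared equally between all servers.
per_server_cluster_gbps = PORT_GBPS / ratio

print(f"oversubscription factor: {ratio:.1f}")
print(f"worst-case cluster bandwidth per server: {per_server_cluster_gbps:.2f} Gbps")
```

With fewer uplinks the factor grows, which is why traffic that stays inside the rack is so much cheaper than traffic that crosses the cluster.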


System Overview (in 2009)




Main Electric Components


Cooling Challenge


Evaluating Energy Efficiency

$$\text{efficiency} = \frac{\text{computation}}{\text{total energy}}$$

$$\text{Power Usage Effectiveness (PUE)} = \frac{\text{total energy}}{\text{energy in equipment}}$$
first-generation datacenter PUE was poor (often ≥ 3.0)
the trend is toward a PUE around 1.2



$$\text{Server PUE (SPUE)} = \frac{\text{total server power}}{\text{critical component power}}$$
basic SPUE is 1.7
state-of-the-art servers reach 1.2



the third factor, besides PUE and SPUE, is computing efficiency
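
As a rough illustration of how the two overhead ratios combine, the sketch below multiplies PUE by SPUE to estimate how much energy enters the facility for each unit of energy that reaches the critical components. The function name is hypothetical, and the pairing of values (3.0 with 1.7, 1.2 with 1.2) is chosen here only to contrast the figures quoted on the slide.

```python
def total_overhead(pue: float, spue: float) -> float:
    """Facility energy drawn per unit of energy reaching critical components."""
    return pue * spue


legacy = total_overhead(pue=3.0, spue=1.7)   # about 5.1
modern = total_overhead(pue=1.2, spue=1.2)   # about 1.44

for name, overhead in (("legacy", legacy), ("modern", modern)):
    useful_fraction = 1.0 / overhead
    print(f"{name}: overhead x{overhead:.2f}, "
          f"{useful_fraction:.0%} of the energy reaches the components")
```

Under these assumptions the legacy case delivers roughly one fifth of the incoming energy to the electronics, against roughly two thirds in the modern case.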

Computing Efficiency
Benchmarking cluster-level efficiency: ongoing work


Benchmarking an individual computer is easier


Toward Energy-Proportional Servers




Fault Tolerance at the Application Level
Fault-tolerant software infrastructure layer:
masks failures of lower layers
reduces hardware cost
eases operational procedures (e.g., upgrade)


But the application level still experiences failures:
service is in degraded mode
service is unreachable
service is corrupted (loss of data)


Origin of Impacting Failures

Cause           % of events
software        33%
configuration   28%
human           13%
network         12%
hardware        11%
other            3%


hardware faults are masked by fault-tolerant software


Failures and Crashes
Average machine availability is 99.9%
95% of machines restart less than once a month
80% of restart events last less than 10 minutes


Most frequent faults seen by software (in one year):
DRAM soft errors: 1% experience uncorrectable errors
disk soft errors: 3% of drives see corrupted sectors
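
To put the 99.9% availability figure in perspective, the sketch below converts it into hours of downtime per machine per year, and into the expected number of machines that are down at any instant. The cluster size of 2000 nodes is a hypothetical value picked for illustration, and machines are assumed to fail independently.

```python
HOURS_PER_YEAR = 365 * 24  # 8760


def downtime_hours_per_year(availability: float) -> float:
    """Expected yearly downtime of one machine."""
    return (1.0 - availability) * HOURS_PER_YEAR


def expected_machines_down(num_machines: int, availability: float) -> float:
    """Expected number of unavailable machines at a random instant."""
    return num_machines * (1.0 - availability)


downtime = downtime_hours_per_year(0.999)        # about 8.76 hours
down_now = expected_machines_down(2000, 0.999)   # about 2 machines

print(f"downtime per machine: {downtime:.2f} h/year")
print(f"machines down at any time (2000 nodes): {down_now:.1f}")
```

Even with highly available machines, a large cluster should expect some node to be down at almost any moment, which is why the software layer has to tolerate failures.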


Software Infrastructure
Fundamentals
Cluster-Level: MapReduce
Application-Level: Web Search




Three Layers
Infrastructure-level software:
kernel, operating systems, networking libraries


Cluster-level software (middleware):
specific software operating a pool of servers

Application-level software:
implementation of the Internet services


Main Software Components

Techniques: replication, partitioning, load-balancing, health checking, integrity check, compression, weak consistency
Systems: MapReduce, Dynamo, BigTable, Hadoop, Sawzall, Chubby, Dryad


OS at a Cluster-Level Scale
Resource Management: mapping tasks to resources
should optimize energy usage



Hardware Abstraction: handling hardware elements
should optimize performance




Deployment and Maintenance: upgrading and monitoring
should reduce manual tasks




Programming Frameworks: easing implementation
should increase programmer productivity
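
One simple way to picture resource management with an energy goal is to pack tasks onto as few servers as possible, so that idle servers can be powered down. The sketch below uses the classic first-fit-decreasing heuristic; it is an illustrative assumption, not the scheduler of any particular cluster system, and the task loads (in percent of one server's capacity) are made up.

```python
def first_fit_decreasing(task_loads: list[int], capacity: int) -> list[list[int]]:
    """Assign tasks to servers, opening a new server only when none fits.

    Loads and capacity are integers (e.g. percent of CPU) to avoid
    floating-point rounding issues.
    """
    servers: list[list[int]] = []
    for load in sorted(task_loads, reverse=True):
        if load > capacity:
            raise ValueError(f"task of load {load} exceeds server capacity")
        for server in servers:
            if sum(server) + load <= capacity:
                server.append(load)
                break
        else:
            # No existing server has room: power on a new one.
            servers.append([load])
    return servers


placement = first_fit_decreasing([50, 70, 20, 40, 10, 60], capacity=100)
print(placement)            # [[70, 20, 10], [60, 40], [50]]
print(len(placement), "servers powered on")
```

A real cluster manager would also weigh memory, network locality and failure domains; this sketch only captures the consolidation idea.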



Motivation
Map/Reduce is a software framework for easily writing applications which
process vast amounts of data (multi-terabyte data-sets) in-parallel on large
clusters (thousands of nodes) of commodity hardware in a reliable,
fault-tolerant manner.


Functional Programming
Two fundamental functions:
map: apply a function to a list of elements
fun map f [] = []
  | map f (x::xs) = (f x) :: (map f xs)

map square [1,2,5] → [1,4,25]


reduce: build a value from a function and a list
fun reduce f a [] = a
  | reduce f a (x::xs) = reduce f (f x a) xs

reduce add 0 [1,3,6] → 10
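
The two recursive definitions above can be mirrored in Python as a minimal sketch. The names `my_map` and `my_reduce` are chosen here to avoid shadowing the built-ins; the results are compared against Python's own `map` and `functools.reduce`.

```python
from functools import reduce as builtin_reduce


def my_map(f, xs):
    """Apply f to every element of the list xs."""
    if not xs:
        return []
    return [f(xs[0])] + my_map(f, xs[1:])


def my_reduce(f, a, xs):
    """Fold xs into a single value, starting from the accumulator a.

    Like the slide's definition, f receives (element, accumulator).
    """
    if not xs:
        return a
    return my_reduce(f, f(xs[0], a), xs[1:])


def square(x):
    return x * x


def add(x, a):
    return x + a


print(my_map(square, [1, 2, 5]))     # [1, 4, 25]
print(my_reduce(add, 0, [1, 3, 6]))  # 10

# The built-ins agree. Note that functools.reduce passes the accumulator
# first; the order does not matter for a commutative function such as add.
assert my_map(square, [1, 2, 5]) == list(map(square, [1, 2, 5]))
assert my_reduce(add, 0, [1, 3, 6]) == builtin_reduce(add, [1, 3, 6], 0)
```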


MapReduce
The programmer implements two functions over (key, value) data:
map: smaller sub-problems distributed to nodes
map (inKey, inVal) → list (outKey, v)
produces intermediate values with an output key


reduce: combines results of sub-problems
reduce (outKey, list v) → outVal
produces an output value from intermediate values
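
The two signatures above can be tied together by a tiny single-process driver: run map on every input record, group the intermediate values by output key, then run reduce once per key. This is only a sketch of the programming model; `map_reduce` and the example functions are hypothetical names, and a real framework distributes each phase across many machines.

```python
from collections import defaultdict


def map_reduce(records, map_fn, reduce_fn):
    """Run map_fn on each (in_key, in_val), group by out_key, then reduce."""
    groups = defaultdict(list)
    for in_key, in_val in records:
        for out_key, v in map_fn(in_key, in_val):
            groups[out_key].append(v)          # the "shuffle" step
    return {out_key: reduce_fn(out_key, vs) for out_key, vs in groups.items()}


# Illustrative job: length of the longest word for each initial letter.
def length_by_initial(filename, text):
    for word in text.split():
        yield word[0], len(word)


def keep_max(initial, lengths):
    return max(lengths)


records = [("a.txt", "map merge reduce"), ("b.txt", "run mapping")]
result = map_reduce(records, length_by_initial, keep_max)
print(result)  # {'m': 7, 'r': 6}
```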


Example: Word Count
map(filename, content):
    for each w in content:
        emitInt(w, 1)

reduce(word, partCount):
    int result = 0
    for pc in partCount:
        result += pc
    emit(result)


map(file1, “hello me, goodbye me”) →
<hello,1> <me,1> <goodbye,1> <me,1>
map(file2, “hello you, bye you”) →
<hello,1> <you,1> <bye,1> <you,1>

a given key is allocated to a given server

reduce(hello, <1,1>) → 2
...

<hello,2> <me,2> <you,2> <goodbye,1> <bye,1>
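
The word count walk-through above can be reproduced with a short runnable sketch, including the step where each key is allocated to one server. The number of reducers and the use of a CRC32 hash for the allocation are assumptions made for the example; the slides do not say how keys are assigned.

```python
import re
import zlib
from collections import defaultdict

NUM_REDUCERS = 3  # hypothetical number of reduce servers


def map_word_count(filename, content):
    """Emit (word, 1) for every word in the file content."""
    return [(w, 1) for w in re.findall(r"[a-z]+", content.lower())]


def partition(key, num_reducers):
    """Allocate a key to one reducer.

    crc32 is stable across runs, unlike Python's built-in hash() on strings.
    """
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def reduce_word_count(word, part_counts):
    return sum(part_counts)


files = {"file1": "hello me, goodbye me", "file2": "hello you, bye you"}

# Map phase, then shuffle: each reducer receives all values of its keys.
reducer_inputs = [defaultdict(list) for _ in range(NUM_REDUCERS)]
for filename, content in files.items():
    for word, one in map_word_count(filename, content):
        reducer_inputs[partition(word, NUM_REDUCERS)][word].append(one)

# Reduce phase: every reducer works only on its own keys.
counts = {}
for groups in reducer_inputs:
    for word, ones in groups.items():
        counts[word] = reduce_word_count(word, ones)

print(sorted(counts.items()))
# [('bye', 1), ('goodbye', 1), ('hello', 2), ('me', 2), ('you', 2)]
```

Because the allocation depends only on the key, both occurrences of "hello" reach the same reducer even though they come from different files.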


MapReduce Implemented




Basics
Input:
the Web: 100 billion files, 400 terabytes
the PageRank algorithm
a query “w1 AND w2 AND ... AND wn”


Output:
a list of files containing all words wi, i ∈ [1, n]
sorted by the PageRank algorithm


Implementation Overview
Offline task: index management
based on keywords:
a word is associated with a table
a table contains all occurrences in the web
distributed on thousands of machines
multiple copies and weak consistency


Online task: query management
front-end Web servers forward the query to a subset of machines:
each machine computes and ranks its local results
all best results are combined
intermediate servers contact the servers holding file replicas:
file pointers are resolved to a set of metadata

Discussion
The user-perceived latency is less than one second:
read-only operations
high parallelism
many thousands of queries per second
traffic variations


Networking:
tiny size of data exchanges
possible packet loss around the front-end servers


Conclusion


Key Challenges
Time-scale:
datacenters are expected to last 10 years
Internet apps gain popularity in weeks


Hardware components:
processors are faster and more energy efficient
memory systems and networks are not


Server evolution:
more cores mean more parallelism

Personal Thoughts
A new era in the Internet:
the industrial era of applications
computer science really does matter


About nano-datacenters:
using your always-on devices

