A Performance Comparison of Container-based Virtualization Systems for MapReduce Clusters

Miguel G. Xavier, Marcelo V. Neves, Cesar A. F. De Rose
miguel.xavier@acad.pucrs.br

Faculty of Informatics, PUCRS
Porto Alegre, Brazil

February 13, 2014

Outline
• Introduction
• Container-based Virtualization
• MapReduce
• Evaluation
• Conclusion

Introduction
• Virtualization
  • Allows resources to be shared
  • Hardware independence, availability, isolation and security
  • Better manageability
  • Widely used in datacenters/cloud computing
• MapReduce Cluster and Virtualization
  • Usage scenarios
  • Better resource sharing
  • Cloud Computing
• However, hypervisor-based technologies have traditionally been avoided in MapReduce environments

Container-based Virtualization
• A group of processes on a Linux box, put together in an isolated environment
• A lightweight virtualization layer
• Non-virtualized drivers
• Shared operating system

[Figure: layer stacks, top to bottom. Container-based Virtualization: Guest Processes, Virtualization Layer, Host OS, Hardware. Hypervisor-Based Virtualization: Guest Processes, Guest OS, Virtualization Layer, Host OS, Hardware.]

Container-based Virtualization
• Each container has:
  • Its own network interface (and IP address)
    • Bridged, routed …
  • Its own filesystem
  • Isolation (security)
    • Containers A and B can't see each other
  • Isolation (resource usage)
    • RAM, CPU, I/O
• Current systems
  • Linux-VServer, OpenVZ, LXC

Container-based Virtualization
• Implements Linux Namespaces
  • Mount – mounting/unmounting file systems
  • UTS – hostname, domainname
  • IPC – SysV message queues, semaphores, memory segments
  • Network – IPv4/IPv6 stacks, routing, firewall, /proc/net, sock
  • PID – own set of PIDs
  • Chroot is a filesystem namespace
• Current systems
  • Linux-VServer, OpenVZ, LXC
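
The namespaces listed on this slide can be observed from user space on a Linux host. The following is a minimal sketch, not something taken from the slides: it assumes a Linux kernel that exposes `/proc/<pid>/ns`, and the helper names `parse_ns_link` and `namespaces_of` are hypothetical. Two processes that print the same identifier for a given entry share that namespace; processes in different containers would normally differ.

```python
import os
import re

# Namespace symlinks under /proc/<pid>/ns have targets such as "pid:[4026531836]".
NS_LINK = re.compile(r"^(\w+):\[(\d+)\]$")


def parse_ns_link(target):
    """Split a namespace link target into (kind, identifier)."""
    match = NS_LINK.match(target)
    if match is None:
        raise ValueError(f"unexpected namespace link: {target!r}")
    return match.group(1), int(match.group(2))


def namespaces_of(pid="self"):
    """Return {entry name: namespace identifier} for a process.

    Returns an empty dict when /proc/<pid>/ns is not available
    (for example on a non-Linux system).
    """
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    if not os.path.isdir(ns_dir):
        return result
    for name in sorted(os.listdir(ns_dir)):
        try:
            _, ident = parse_ns_link(os.readlink(os.path.join(ns_dir, name)))
        except (OSError, ValueError):
            continue  # entry not readable or not in the expected format
        result[name] = ident
    return result


if __name__ == "__main__":
    for name, ident in namespaces_of().items():
        print(f"{name:20s} {ident}")
```

Running the script inside and outside a container and comparing the `mnt`, `uts`, `ipc`, `net` and `pid` entries is one way to see which namespaces the container system actually unshares.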
  	
  
	
  
	
  
Container-based Systems
• Linux-VServer
  • Implements its own features in the Linux kernel
  • Limits the scope of the file system for different processes through the traditional chroot
• OpenVZ
• Linux Containers (LXC)
  • Based on CGroups

Hypervisor- vs Container-based Systems

Hypervisor                   Container
Different Kernel OS          Single Kernel
Device Emulation             Syscall
Many FS caches               Single FS cache
Limits per machine           Limits per process
High Performance Overhead    Low Performance Overhead

MapReduce
• MapReduce
  • A parallel programming model
  • Simplicity, efficiency and high scalability
  • It has become a de facto standard for large-scale data analysis
• MapReduce has also attracted the attention of the HPC community
  • Simpler approach to address the parallelism problem
  • Highly visible cases where MapReduce has been successfully used by companies like Google, Yahoo!, Facebook and Amazon
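
To make the programming model on this slide concrete, here is a minimal single-process sketch of the map, shuffle and reduce steps applied to word counting. It is an illustration only, not Hadoop code: the function names are hypothetical, and a real cluster would run the map and reduce functions in parallel across nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce sequentially over an iterable of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())


if __name__ == "__main__":
    print(word_count(["map reduce map", "reduce cluster"]))
```

The programmer only writes `map_phase` and `reduce_phase`; partitioning, grouping and scheduling are left to the framework, which is the simplicity the slide refers to.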
  
MapReduce and Containers
• Apache Mesos
  • Shares a cluster between multiple different frameworks
  • Creates another level of resource management
  • Management is taken away from the cluster's RMS
• Apache YARN
  • Hadoop Next Generation
  • Better job scheduling/monitoring
  • Uses virtualization to share a cluster among different applications

Evaluation
• Experimental Environment
  • Hadoop cluster composed of 4 nodes
  • Two processors with 8 cores (without threads) per node
  • 16 GB of memory per node
  • 146 GB of disk per node
• Analysis of the best performance results
  • Through micro-benchmarks
    • HDFS evaluation (TestDFSIO)
    • NameNode evaluation (NNBench)
    • MapReduce evaluation (MRBench)
  • Through macro-benchmarks (WordCount, TeraBench)
• Analysis of the best isolation results
  • Through the IBS benchmark
• At least 50 executions were performed for each experiment
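
The slides do not say how the repeated executions were aggregated. One common choice, shown here purely as an assumption, is to report the mean together with an approximate 95% confidence interval. The function name and the sample times below are hypothetical and are not measurements from this work.

```python
import math
import statistics


def summarize_runs(times, z=1.96):
    """Return (mean, half-width of an approximate 95% confidence interval).

    Uses the normal approximation, which is reasonable for about 50 runs.
    """
    if len(times) < 2:
        raise ValueError("need at least two runs")
    mean = statistics.mean(times)
    stderr = statistics.stdev(times) / math.sqrt(len(times))
    return mean, z * stderr


if __name__ == "__main__":
    # Hypothetical execution times in seconds for 50 runs (not from the slides).
    runs = [100.0 + (i % 5) for i in range(50)]
    mean, half_width = summarize_runs(runs)
    print(f"{mean:.2f} s +/- {half_width:.2f} s over {len(runs)} runs")
```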
  

HDFS Evaluation
• Settings:
  • Replication of 3 blocks
  • File size from 100 MB to 3000 MB
• All container-based systems have performance similar to native
• The OpenVZ results show a loss of 3 Mbps
  • This is due to the CFQ scheduler

[Figure: Throughput (Mbps), 0 to 30, versus file size, 0 to 3000 (axis labeled "File size (Bytes)"), for lxc, native, ovz and vserver.]

HDFS Evaluation
• All of the container-based systems obtained performance results similar to native
• Linux-VServer uses a physical-based network

[Figure: Throughput (Mbps), 0 to 30, versus file size, 0 to 3000 (axis labeled "File size (Bytes)"), for lxc, native, ovz and vserver.]

NameNode Evaluation using NNBench
• Generates operations on 1000 files on HDFS

                     Native    LXC      OpenVZ    VServer
Open/Read (ms)       0.51      0.52     0.51      0.49
Create/Write (ms)    54.65     56.89    51.96     48.90

• The NNBench benchmark was chosen to evaluate the NameNode component
• Linux-VServer reaches the lowest latency, at an average of 48 ms, while LXC obtained the worst result, at an average of 56 ms
• The differences are not significant given the magnitude of the numbers
• However, the strengths are that no exception was observed during the high HDFS management stress, and that all systems were able to respond as effectively as native

MapReduce Evaluation using MRBench

                  Native    LXC      OpenVZ    VServer
Execution Time    14251     13577    14304     13614

• The results obtained from MRBench show that the MR layer suffers no substantial effect while running on different container-based virtualization systems
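
To put the MRBench numbers in perspective, the short sketch below computes each system's execution time relative to native. The values are the ones listed on the slide (whose unit is not stated there), and the function name is a hypothetical helper. Negative percentages mean the containerized run was faster than native in this measurement.

```python
# Execution times as listed on the MRBench slide (unit not stated on the slide).
MRBENCH = {"Native": 14251, "LXC": 13577, "OpenVZ": 14304, "VServer": 13614}


def relative_difference(results, baseline="Native"):
    """Return each system's difference from the baseline, in percent."""
    base = results[baseline]
    return {
        name: (value - base) / base * 100.0
        for name, value in results.items()
        if name != baseline
    }


if __name__ == "__main__":
    for name, pct in relative_difference(MRBENCH).items():
        print(f"{name}: {pct:+.2f}% vs. native")
```

All three systems land within about 5% of native, which is consistent with the slide's statement that the MapReduce layer suffers no substantial effect.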
  
Analyzing Performance with WordCount
• 30 GB of input data
• The peak of performance degradation from OpenVZ is explained by the I/O scheduler overhead

[Figure: WordCount execution time (seconds), 0 to 180, for Native, LXC, OpenVZ and VServer.]

Analyzing Performance with TeraSort
• Standard map/reduce sort
• An HDFS block size of 64 MB
• Steps:
  • Generate 30 GB of input data
  • Run the sort on that input data

[Figure: TeraSort execution time (seconds), 0 to 140, for Native, LXC, OpenVZ and VServer.]
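
A quick back-of-the-envelope calculation shows what these settings imply for the job size. This is a sketch under stated assumptions: "30 GB" and "64 MB" are read as binary units (GiB and MiB), and the 100-byte record size is the fixed row size that TeraGen writes. The constant names are illustrative.

```python
import math

INPUT_BYTES = 30 * 1024**3   # 30 GB of input, read here as GiB (assumption)
BLOCK_BYTES = 64 * 1024**2   # 64 MB HDFS block size, read here as MiB (assumption)
TERAGEN_ROW_BYTES = 100      # TeraGen produces fixed-size 100-byte records

# Number of whole records that fit in the requested input size.
rows = INPUT_BYTES // TERAGEN_ROW_BYTES

# Number of HDFS blocks needed to hold the input (before replication).
blocks = math.ceil(INPUT_BYTES / BLOCK_BYTES)

if __name__ == "__main__":
    print(f"records to generate: {rows}")
    print(f"HDFS blocks for the input: {blocks}")
```

With the default input format, Hadoop typically creates one map task per block, so the block count also gives a rough idea of how many map tasks the sort step schedules across the 4 nodes.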
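Performance Isolation
[Figure: isolation test procedure. Step 1: the baseline application runs alone in Container A and its execution time is recorded. Step 2: the baseline application runs in Container A while a stress test runs in Container B, and its execution time is recorded again. The two execution times are compared to obtain the performance degradation (%).]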
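
The comparison in this procedure can be written as a one-line formula. The slides do not spell it out, so the version below is an assumption: degradation is taken as the relative increase in execution time of the baseline application when the stress test runs in the neighbouring container. The function name and the sample times are hypothetical.

```python
def degradation_percent(baseline_time, stressed_time):
    """Relative slowdown of the baseline application, in percent.

    baseline_time: execution time with the application running alone.
    stressed_time: execution time while a stress test runs in another container.
    """
    if baseline_time <= 0:
        raise ValueError("baseline_time must be positive")
    return (stressed_time - baseline_time) / baseline_time * 100.0


if __name__ == "__main__":
    # Hypothetical times in seconds, chosen only to illustrate the formula.
    print(f"{degradation_percent(120.0, 130.0):.1f}%")
```

A result of 0% means the stress test in Container B had no measurable effect on Container A, which is the ideal outcome for an isolation mechanism.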
  	
  
Performance Isolation

         CPU    Memory    I/O     Fork Bomb
LXC      0%     8.3%      5.5%    0%

• We chose LXC as the representative container-based virtualization system to be evaluated
• The CPU usage limits per container are working well
  • No significant impact was noted
• A small performance degradation needs to be taken into account
• The fork bomb stress test reveals that LXC has a security subsystem that ensures feasibility

Conclusions
• We found that all container-based systems reach near-native performance for MapReduce workloads
• The performance isolation results revealed that LXC has improved its capability to restrict resources among containers
• Although some works are already taking advantage of container-based systems on MR clusters, this work demonstrated the benefits of using container-based systems to support MapReduce clusters

Future Work
• We plan to study performance isolation at the network level
• We plan to study scalability while increasing the number of nodes
• We plan to study aspects of green computing, such as the trade-off between performance and energy consumption

Thank you for your attention!

A Domino Admins Adventures (Engage 2024)Gabriella Davis
 

A Performance Comparison of Container-based Virtualization Systems for MapReduce Clusters

  • 1. A Performance Comparison of Container-based Virtualization Systems for MapReduce Clusters
      Miguel G. Xavier, Marcelo V. Neves, Cesar A. F. De Rose
      miguel.xavier@acad.pucrs.br
      Faculty of Informatics, PUCRS
      Porto Alegre, Brazil
      February 13, 2014
  • 2. Outline
      • Introduction
      • Container-based Virtualization
      • MapReduce
      • Evaluation
      • Conclusion
  • 3. Introduction
      • Virtualization
          • Allows resources to be shared
          • Hardware independence, availability, isolation and security
          • Better manageability
          • Widely used in datacenters/cloud computing
      • MapReduce Cluster and Virtualization
          • Usage scenarios
          • Better resource sharing
          • Cloud Computing
      • However, hypervisor-based technologies have traditionally been avoided in MapReduce environments
  • 4. Container-based Virtualization
      • A group of processes on a Linux box, put together in an isolated environment
      • A lightweight virtualization layer
      • Non-virtualized drivers
      • Shared operating system
      [Figure: layer stacks, top to bottom. Container-based Virtualization: Guest Processes, Virtualization Layer, Host OS, Hardware. Hypervisor-Based Virtualization: Guest Processes, Guest OS, Virtualization Layer, Host OS, Hardware.]
  • 5. Container-based Virtualization
      • Each container has:
          • Its own network interface (and IP address)
              • Bridged, routed …
          • Its own filesystem
          • Isolation (security)
              • Containers A and B can't see each other
          • Isolation (resource usage)
              • RAM, CPU, I/O
      • Current systems
          • Linux-VServer, OpenVZ, LXC
  • 6. Container-based Virtualization
      • Implements Linux Namespaces
          • Mount: mounting/unmounting file systems
          • UTS: hostname, domainname
          • IPC: SysV message queues, semaphores, memory segments
          • Network: IPv4/IPv6 stacks, routing, firewall, /proc/net, sockets
          • PID: own set of PIDs
      • chroot is a filesystem namespace
      • Current systems
          • Linux-VServer, OpenVZ, LXC
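The namespace types listed on this slide can be inspected from user space. The following is a minimal sketch, assuming a Linux host that exposes `/proc/self/ns`; the helper names `parse_ns_link` and `current_namespaces` are illustrative and do not come from the slides. Each entry is a symbolic link whose target has the form `type:[inode]`, and two processes are in the same namespace when the inode numbers match.

```python
import os
import re

NS_DIR = "/proc/self/ns"  # Linux-specific; absent on other systems


def parse_ns_link(target):
    """Split a link target such as 'pid:[4026531836]' into (type, inode)."""
    match = re.fullmatch(r"(\w+):\[(\d+)\]", target)
    if match is None:
        raise ValueError(f"unexpected namespace link: {target!r}")
    return match.group(1), int(match.group(2))


def current_namespaces(ns_dir=NS_DIR):
    """Return {entry name: inode} for the calling process, or {} if unavailable."""
    result = {}
    if not os.path.isdir(ns_dir):
        return result
    for name in sorted(os.listdir(ns_dir)):
        try:
            _, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, name)))
        except (OSError, ValueError):
            continue  # entry not readable or in an unexpected format
        result[name] = inode
    return result


if __name__ == "__main__":
    for name, inode in current_namespaces().items():
        print(f"{name:20s} {inode}")
```

Running the script inside and outside a container and comparing the inode numbers is one way to see which namespaces the container does not share with the host.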
  • 7. Container-based Systems
      • Linux-VServer
          • Implements its own features in the Linux kernel
          • Limits the file system scope of different processes through the traditional chroot
      • OpenVZ
      • Linux Containers (LXC)
          • Based on CGroups
  • 8. Hypervisor- vs Container-based Systems

      Hypervisor                   Container
      Different kernel OS          Single kernel
      Device emulation             Syscall
      Many FS caches               Single FS cache
      Limits per machine           Limits per process
      High performance overhead    Low performance overhead
  • 9. MapReduce
      • MapReduce
          • A parallel programming model
          • Simplicity, efficiency and high scalability
          • It has become a de facto standard for large-scale data analysis
      • MapReduce has also attracted the attention of the HPC community
          • Simpler approach to address the parallelism problem
          • Highly visible case where MapReduce has been successfully used by companies like Google, Yahoo!, Facebook and Amazon
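The programming model named on this slide can be illustrated with word counting, the same kind of job the WordCount benchmark runs. This is a single-process sketch and not the Hadoop API; the function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are assumptions made for illustration. In a real cluster the map and reduce calls would run in parallel on different nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce sequentially over an iterable of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["a b a", "B c"]))
```

Because each map call depends only on its own line and each reduce call only on its own key, both phases can be distributed without coordination beyond the shuffle step.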
  • 10. MapReduce and Containers
      • Apache Mesos
          • Shares a cluster between multiple different frameworks
          • Creates another level of resource management
          • Management is taken away from the cluster's RMS
      • Apache YARN
          • Hadoop Next Generation
          • Better job scheduling/monitoring
          • Uses virtualization to share a cluster among different applications
  • 11. Evaluation
      • Experimental Environment
          • Hadoop cluster composed of 4 nodes
          • Two processors with 8 cores (without threads) per node
          • 16 GB of memory per node
          • 146 GB of disk per node
      • Analysis of the best performance results
          • Through micro-benchmarks
              • HDFS evaluation (TestDFSIO)
              • NameNode evaluation (NNBench)
              • MapReduce evaluation (MRBench)
          • Through macro-benchmarks (WordCount, TeraSort)
      • Analysis of the best isolation results
          • Through the IBS benchmark
      • At least 50 executions were performed for each experiment
  • 12. HDFS Evaluation
      • Settings:
          • Replication of 3 blocks
          • File size from 100 MB to 3000 MB
      • All container-based systems have performance similar to native
      • The OpenVZ results show a loss of 3 Mbps
      • This is attributed to the CFQ scheduler
      [Figure: throughput (Mbps, 0 to 30) against file size (axis labeled "Bytes", 0 to 3000) for lxc, native, ovz and vserver.]
  • 13. HDFS Evaluation
      • All container-based systems obtained performance results similar to native
      • Linux-VServer uses a physical-based network
      [Figure: throughput (Mbps, 0 to 30) against file size (axis labeled "Bytes", 0 to 3000) for lxc, native, ovz and vserver.]
  • 14. NameNode Evaluation using NNBench
      • Generates operations on 1000 files on HDFS

      Results, reported for Create/Write (ms) and Open/Read (ms):

      Native    LXC      OpenVZ   VServer
      0.51      0.52     0.51     0.49
      54.65     56.89    51.96    48.90

      • The NNBench benchmark was chosen to evaluate the NameNode component
      • Linux-VServer reaches an average latency of 48 ms, while LXC obtained the worst result at an average of 56 ms
      • The differences are not significant given the magnitude of the numbers
      • However, the strengths are that no exception was observed during the high HDFS management stress, and that all systems were able to respond as effectively as native
  • 15. MapReduce Evaluation using MRBench

                       Native   LXC     OpenVZ   VServer
      Execution Time   14251    13577   14304    13614

      • The results obtained from MRBench show that the MR layer suffers no substantial effect while running on different container-based virtualization systems
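To put the MRBench execution times listed above on a common scale, each can be expressed as a percentage difference from the native run. The sketch below uses only the four values from the slide; the slide does not state the time unit, the percentages are derived here rather than reported by the authors, and the function name `relative_to_native` is an assumption.

```python
# Execution times as listed on the slide (unit not stated there).
MRBENCH_TIME = {"Native": 14251, "LXC": 13577, "OpenVZ": 14304, "VServer": 13614}


def relative_to_native(times, baseline="Native"):
    """Return the percentage difference of each system from the baseline.

    Negative values mean the system finished faster than the baseline.
    """
    base = times[baseline]
    return {
        name: 100.0 * (value - base) / base
        for name, value in times.items()
        if name != baseline
    }


if __name__ == "__main__":
    for name, diff in relative_to_native(MRBENCH_TIME).items():
        print(f"{name:8s} {diff:+.2f}%")
```

All three container systems land within about five percent of native, which is consistent with the slide's statement that the MR layer suffers no substantial effect.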
  • 16. Analyzing Performance with WordCount
      • 30 GB of input data
      • The peak performance degradation seen with OpenVZ is explained by the I/O scheduler overhead
      [Figure: bar chart of WordCount execution time (seconds, 0 to 180) for Native, LXC, OpenVZ and VServer.]
  • 17. Analyzing Performance with TeraSort
      • An HDFS block size of 64 MB
      • Standard map/reduce sort
      • Steps:
          • Generates 30 GB of input data
          • Runs the sort on that input data
      [Figure: bar chart of TeraSort execution time (seconds, 0 to 140) for Native, LXC, OpenVZ and VServer.]
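  • 18. Performance Isolation
      [Figure: two runs are compared. In the first, the baseline application runs alone in Container A and its execution time is recorded. In the second, the baseline application runs in Container A while a stress test runs in Container B, and its execution time is recorded again. The two execution times give the performance degradation (%).]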
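The comparison of the two runs described above reduces to one ratio. The slide does not spell out the formula, so the sketch below assumes the usual definition: the increase in execution time under stress, relative to the execution time of the baseline run alone. The function name and the sample timings are hypothetical and are not measurements from the study.

```python
def degradation_percent(baseline_time, stressed_time):
    """Percentage increase in execution time caused by the stress test.

    baseline_time: execution time with the application alone in Container A.
    stressed_time: execution time while a stress test runs in Container B.
    """
    if baseline_time <= 0:
        raise ValueError("baseline_time must be positive")
    return 100.0 * (stressed_time - baseline_time) / baseline_time


if __name__ == "__main__":
    # Hypothetical timings in seconds, for illustration only.
    print(f"{degradation_percent(200.0, 210.0):.1f}%")
```

Under this definition, 0% means the stress test in the neighbouring container had no measurable effect on the baseline application.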
  • 19. Performance Isolation

             CPU    Memory   I/O     Fork Bomb
      LXC    0%     8.3%     5.5%    0%

      • We chose LXC as the representative container-based virtualization system to be evaluated
      • The CPU usage limits per container work well
          • No significant impact was noted
          • A little performance degradation needs to be taken into account
      • The fork bomb stress test reveals that LXC has a security subsystem that ensures feasibility
  • 20. Conclusions
      • We found that all container-based systems reach near-native performance for MapReduce workloads
      • The performance isolation results revealed that LXC has improved its ability to restrict resources among containers
      • Although some works already take advantage of container-based systems on MR clusters, this work demonstrated the benefits of using container-based systems to support MapReduce clusters
  • 21. Future Work
      • We plan to study performance isolation at the network level
      • We plan to study scalability while increasing the number of nodes
      • We plan to study green computing aspects, such as the trade-off between performance and energy consumption
  • 22. Thank you for your attention!