A Performance Comparison of Container-based Virtualization Systems for MapReduce Clusters
Miguel G. Xavier, Marcelo V. Neves, Cesar A. F. De Rose
miguel.xavier@acad.pucrs.br
Faculty of Informatics, PUCRS
Porto Alegre, Brazil
February 13, 2014
  
Outline
•  Introduction
•  Container-based Virtualization
•  MapReduce
•  Evaluation
•  Conclusion
  	
  
Introduction
•  Virtualization
    •  Allows resources to be shared
    •  Hardware independence, availability, isolation and security
    •  Better manageability
    •  Widely used in datacenters/cloud computing
•  MapReduce Cluster and Virtualization
    •  Usage scenarios
        •  Better resource sharing
        •  Cloud Computing
    •  However, hypervisor-based technologies have traditionally been avoided in MapReduce environments
  
Container-based Virtualization
•  A group of processes on a Linux box, put together in an isolated environment
•  A lightweight virtualization layer
•  Non-virtualized drivers
•  Shared operating system
  
[Figure: layer stacks, bottom to top. Container-based Virtualization: Hardware, Host OS, Virtualization Layer, Guest Processes. Hypervisor-Based Virtualization: Hardware, Host OS, Virtualization Layer, Guest OS, Guest Processes.]

Container-based Virtualization
•  Each container has:
    •  Its own network interface (and IP address)
        •  Bridged, routed …
    •  Its own filesystem
    •  Isolation (security)
        •  Containers A and B can’t see each other
    •  Isolation (resource usage)
        •  RAM, CPU, I/O
•  Current systems
    •  Linux-VServer, OpenVZ, LXC
  	
  
	
  
	
  
Container-based Virtualization
•  Implements Linux Namespaces
    •  Mount – mounting/unmounting file systems
    •  UTS – hostname, domainname
    •  IPC – SysV message queues, semaphores, memory segments
    •  Network – IPv4/IPv6 stacks, routing, firewall, /proc/net, sockets
    •  PID – own set of PIDs
    •  Chroot is the filesystem namespace
•  Current systems
    •  Linux-VServer, OpenVZ, LXC
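
As a rough illustration of the namespace list on this slide: on a Linux host, each process exposes its namespace memberships as symbolic links under /proc/<pid>/ns, whose targets look like pid:[4026531836]. The sketch below parses such link targets and checks whether two processes share a given namespace. The helper names (parse_ns_link, read_namespaces, share_namespace) are made up for this example and are not part of LXC, OpenVZ or Linux-VServer; read_namespaces only works on Linux.

```python
import os
import re

# Link targets under /proc/<pid>/ns look like "pid:[4026531836]".
_NS_LINK = re.compile(r"^(?P<kind>[a-z_]+):\[(?P<inode>\d+)\]$")


def parse_ns_link(target):
    """Split a namespace link target into (kind, inode number)."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError(f"not a namespace link: {target!r}")
    return match.group("kind"), int(match.group("inode"))


def read_namespaces(pid="self"):
    """Return {entry name: inode} for one process (Linux only, hypothetical helper)."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    for name in os.listdir(ns_dir):
        _, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, name)))
        result[name] = inode
    return result


def share_namespace(ns_a, ns_b, kind):
    """Two processes share a namespace of `kind` when the inode numbers match."""
    return kind in ns_a and kind in ns_b and ns_a[kind] == ns_b[kind]
```

Two processes inside the same container would be expected to report the same inode for "pid" and "net", while a process on the host would report different ones.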
  	
  
	
  
	
  
Container-based Systems
•  Linux-VServer
    •  Implements its own features in the Linux kernel
    •  Limits the scope of the file system for different processes through the traditional chroot
•  OpenVZ
•  Linux Containers (LXC)
    •  Based on CGroups
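
To make "Based on CGroups" more concrete, here is a minimal sketch that renders per-container resource limits as LXC configuration lines. It assumes the cgroup v1 key form "lxc.cgroup.<controller file> = <value>"; newer LXC releases on cgroup v2 use different keys. The function name render_lxc_limits and all default values are hypothetical and do not come from the slides.

```python
def render_lxc_limits(cpuset, memory_bytes, cpu_shares=1024, blkio_weight=500):
    """Render cgroup-v1-style limit lines for an LXC container config.

    The default values are illustrative assumptions, not the settings
    used in the evaluation.
    """
    if memory_bytes <= 0:
        raise ValueError("memory_bytes must be positive")
    lines = [
        f"lxc.cgroup.cpuset.cpus = {cpuset}",                   # which CPUs the container may use
        f"lxc.cgroup.cpu.shares = {cpu_shares}",                # relative CPU weight
        f"lxc.cgroup.memory.limit_in_bytes = {memory_bytes}",   # RAM ceiling
        f"lxc.cgroup.blkio.weight = {blkio_weight}",            # relative block I/O weight
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Example: pin a container to CPUs 0-3 and cap it at 4 GiB of RAM.
    print(render_lxc_limits("0-3", 4 * 1024**3))
```

The three resource classes covered here (CPU, RAM, I/O) are the same ones the earlier slide lists under resource-usage isolation.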
  
Hypervisor- vs Container-based Systems

Hypervisor                   Container
Different Kernel OS          Single Kernel
Device Emulation             Syscall
Many FS caches               Single FS cache
Limits per machine           Limits per process
High Performance Overhead    Low Performance Overhead
  
MapReduce
•  MapReduce
    •  A parallel programming model
    •  Simplicity, efficiency and high scalability
    •  It has become a de facto standard for large-scale data analysis
•  MapReduce has also attracted the attention of the HPC community
    •  Simpler approach to address the parallelism problem
    •  Highly visible cases where MapReduce has been successfully used by companies like Google, Yahoo!, Facebook and Amazon
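
For readers unfamiliar with the programming model, the following is a minimal single-process sketch of the map, shuffle and reduce steps, using word counting as the job. It only illustrates the model; it is not Hadoop code, and the function names are chosen for this example.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values of one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run the three phases in sequence over a list of input records."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["a b a", "b c"]))
```

In a real cluster the map and reduce calls run in parallel on different nodes, and the shuffle moves data over the network, which is why the evaluation looks at both I/O and network behaviour.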
  
MapReduce and Containers
•  Apache Mesos
    •  Shares a cluster between multiple different frameworks
    •  Creates another level of resource management
    •  Management is taken away from the cluster’s RMS
•  Apache YARN
    •  Hadoop Next Generation
    •  Better job scheduling/monitoring
    •  Uses virtualization to share a cluster among different applications
  
	
  	
  
Evaluation
•  Experimental Environment
    •  Hadoop cluster composed of 4 nodes
    •  Two processors with 8 cores (without threads) per node
    •  16 GB of memory per node
    •  146 GB of disk space per node
•  Analysis of the best performance results
    •  Through micro-benchmarks
        •  HDFS evaluation (TestDFSIO)
        •  NameNode evaluation (NNBench)
        •  MapReduce evaluation (MRBench)
    •  Through macro-benchmarks (WordCount, TeraSort)
•  Analysis of the best isolation results
    •  Through IBS benchmark
•  At least 50 executions were performed for each experiment
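
The slides do not say how the 50 or more executions of each experiment were aggregated. One common approach, sketched here as an assumption rather than the authors' method, is to report the mean together with the sample standard deviation and an approximate 95% confidence interval. The function name summarize_runs is hypothetical.

```python
import math
import statistics


def summarize_runs(times, z=1.96):
    """Return (mean, sample standard deviation, CI half-width) for repeated runs.

    z=1.96 gives an approximate 95% interval under a normal approximation,
    which is an assumption made for this sketch.
    """
    if len(times) < 2:
        raise ValueError("need at least two runs")
    mean = statistics.mean(times)
    stdev = statistics.stdev(times)
    half_width = z * stdev / math.sqrt(len(times))
    return mean, stdev, half_width
```

With 50 runs the half-width shrinks by a factor of about 7 relative to the standard deviation, which helps when the systems being compared differ by only a few percent.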
  
	
  
HDFS Evaluation
•  Settings:
    •  Replication factor of 3
    •  File size from 100 MB to 3000 MB
•  All container-based systems have performance similar to native
•  The OpenVZ results show a loss of 3 Mbps
    •  It is due to the CFQ scheduler
  	
  
	
  
[Figure: Throughput (Mbps), axis 0 to 30, versus file size, axis 0 to 3000 (labeled "Bytes" in the plot), for lxc, native, ovz and vserver]

HDFS Evaluation
•  All container-based systems obtained performance results similar to native
•  Linux-VServer uses a physical-based network
  
	
  
[Figure: Throughput (Mbps), axis 0 to 30, versus file size, axis 0 to 3000 (labeled "Bytes" in the plot), for lxc, native, ovz and vserver]

NameNode Evaluation using NNBench
•  The NNBench benchmark was chosen to evaluate the NameNode component
    •  Generates operations on 1000 files on HDFS
•  Linux-VServer reaches the lowest create/write latency, at an average of 48.90 ms, while LXC obtained the worst result, at an average of 56.89 ms
•  The differences are small in absolute terms
•  However, the strengths are that no exception was observed during the high HDFS management stress, and that all systems were able to respond as effectively as native

                     Native    LXC      OpenVZ   VServer
Open/Read (ms)       0.51      0.52     0.51     0.49
Create/Write (ms)    54.65     56.89    51.96    48.90
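
To put the NNBench latencies in relative terms, the sketch below computes each system's signed percentage difference from native, using the values from the table on this slide. The function names are chosen for this example.

```python
# Average latencies in milliseconds, copied from the NNBench table.
NNBENCH_MS = {
    "Open/Read": {"Native": 0.51, "LXC": 0.52, "OpenVZ": 0.51, "VServer": 0.49},
    "Create/Write": {"Native": 54.65, "LXC": 56.89, "OpenVZ": 51.96, "VServer": 48.90},
}


def relative_difference(value, baseline):
    """Signed difference from the baseline, in percent."""
    return (value - baseline) / baseline * 100.0


def compare_to_native(results):
    """Map each operation to {system: percent difference from Native}."""
    report = {}
    for operation, row in results.items():
        native = row["Native"]
        report[operation] = {
            system: round(relative_difference(latency, native), 1)
            for system, latency in row.items()
            if system != "Native"
        }
    return report


if __name__ == "__main__":
    for operation, row in compare_to_native(NNBENCH_MS).items():
        print(operation, row)
```

With these numbers, create/write latency is about 4.1% higher than native on LXC, about 4.9% lower on OpenVZ and about 10.5% lower on Linux-VServer.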
  
MapReduce Evaluation using MRBench
•  The results obtained from MRBench show that the MR layer suffers no substantial effect while running on different container-based virtualization systems

                  Native    LXC      OpenVZ   VServer
Execution Time    14251     13577    14304    13614
  	
  
Analyzing Performance with WordCount
[Figure: bar chart of WordCount execution time (seconds), axis 0 to 180, for Native, LXC, OpenVZ and VServer]
•  30 GB of input data
•  OpenVZ shows the largest performance degradation, which is explained by the I/O scheduler overhead
  
Analyzing Performance with TeraSort
[Figure: bar chart of TeraSort execution time (seconds), axis 0 to 140, for Native, LXC, OpenVZ and VServer]
•  Standard map/reduce sort
•  Steps:
    •  Generates 30 GB of input data
    •  Runs the sort on that input data
•  An HDFS block size of 64 MB
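
As a small worked example of what these settings imply, the sketch below counts the HDFS blocks needed for the 30 GB input at a 64 MB block size. It assumes binary units (1 GB = 1024 MB). The replication factor of 3 is the value listed for the HDFS evaluation and is only assumed to apply here as well.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def hdfs_block_count(file_bytes, block_bytes=64 * MIB):
    """Number of HDFS blocks needed to hold a file (integer ceiling division)."""
    return -(-file_bytes // block_bytes)


def raw_storage_bytes(file_bytes, replication=3):
    """Bytes stored across the cluster once every block is replicated."""
    return file_bytes * replication


if __name__ == "__main__":
    size = 30 * GIB
    print(hdfs_block_count(size))           # blocks for the 30 GB input
    print(raw_storage_bytes(size) // GIB)   # GiB of raw storage with 3 replicas
```

Under these assumptions the input spans 480 blocks and occupies 90 GB of raw disk space across the four nodes.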
  
	
  
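Performance Isolation
[Figure: isolation methodology. First run: Container A runs the baseline application alone, giving an execution time. Second run: Container A runs the baseline application while Container B runs a stress test, giving a second execution time. The two execution times are compared to obtain the performance degradation (%).]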
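
The slides do not spell out how the degradation percentage is computed. A natural reading, taken here as an assumption, is the relative increase in the baseline application's execution time when the neighbouring container runs the stress test. The function name degradation_percent is hypothetical.

```python
def degradation_percent(baseline_time, stressed_time):
    """Percent slowdown of the baseline application under a stressed neighbour.

    baseline_time: execution time with Container A running alone.
    stressed_time: execution time with Container B running a stress test.
    """
    if baseline_time <= 0:
        raise ValueError("baseline_time must be positive")
    return (stressed_time - baseline_time) / baseline_time * 100.0
```

A result of 0% means the stress test in the other container had no measurable effect, which corresponds to perfect isolation for that resource.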
  	
  
Performance Isolation

         CPU    Memory    I/O     Fork Bomb
LXC      0%     8.3%      5.5%    0%

•  We chose LXC as the representative container-based virtualization system to be evaluated
•  The per-container CPU usage limits are working well
    •  No significant impact was noted
•  A small performance degradation (memory and I/O) needs to be taken into account
•  The fork bomb stress test reveals that LXC has a security subsystem that ensures feasibility
  
Conclusions
•  We found that all container-based systems reach near-native performance for MapReduce workloads
•  The results of performance isolation revealed that LXC has improved its capabilities to restrict resources among containers
•  Although some works are already taking advantage of container-based systems on MR clusters, this work demonstrated the benefits of using container-based systems to support MapReduce clusters
  
Future Work
•  We plan to study performance isolation at the network level
•  We plan to study scalability while increasing the number of nodes
•  We plan to study aspects of green computing, such as the trade-off between performance and energy consumption
  
	
  
Thank you for your attention!
  

More Related Content

What's hot

Supporting bioinformatics applications with hybrid multi-cloud services
Supporting bioinformatics applications with hybrid multi-cloud servicesSupporting bioinformatics applications with hybrid multi-cloud services
Supporting bioinformatics applications with hybrid multi-cloud servicesAhmed Abdullah
 
Scalable Persistent Storage for Erlang: Theory and Practice
Scalable Persistent Storage for Erlang: Theory and PracticeScalable Persistent Storage for Erlang: Theory and Practice
Scalable Persistent Storage for Erlang: Theory and PracticeAmir Ghaffari
 
PostgreSQL and Linux Containers
PostgreSQL and Linux ContainersPostgreSQL and Linux Containers
PostgreSQL and Linux ContainersJignesh Shah
 
Low-latency Multi-threaded Ensemble Learning for Dynamic Big Data Streams
Low-latency Multi-threaded Ensemble Learning for Dynamic Big Data StreamsLow-latency Multi-threaded Ensemble Learning for Dynamic Big Data Streams
Low-latency Multi-threaded Ensemble Learning for Dynamic Big Data StreamsDiego Marrón Vida
 
Apache CloudStack: API to UI (STLLUG)
Apache CloudStack: API to UI (STLLUG)Apache CloudStack: API to UI (STLLUG)
Apache CloudStack: API to UI (STLLUG)Joe Brockmeier
 
Storing Cassandra Metrics (Chris Lohfink, DataStax) | C* Summit 2016
Storing Cassandra Metrics (Chris Lohfink, DataStax) | C* Summit 2016Storing Cassandra Metrics (Chris Lohfink, DataStax) | C* Summit 2016
Storing Cassandra Metrics (Chris Lohfink, DataStax) | C* Summit 2016DataStax
 
State of Containers and the Convergence of HPC and BigData
State of Containers and the Convergence of HPC and BigDataState of Containers and the Convergence of HPC and BigData
State of Containers and the Convergence of HPC and BigDatainside-BigData.com
 
Red Hat Global File System (GFS)
Red Hat Global File System (GFS)Red Hat Global File System (GFS)
Red Hat Global File System (GFS)Schubert Zhang
 
DockerCon14 Cluster Management and Containerization
DockerCon14 Cluster Management and ContainerizationDockerCon14 Cluster Management and Containerization
DockerCon14 Cluster Management and ContainerizationDocker, Inc.
 
Pacemaker+DRBD
Pacemaker+DRBDPacemaker+DRBD
Pacemaker+DRBDDan Frincu
 
How To Build A Scalable Storage System with OSS at TLUG Meeting 2008/09/13
How To Build A Scalable Storage System with OSS at TLUG Meeting 2008/09/13How To Build A Scalable Storage System with OSS at TLUG Meeting 2008/09/13
How To Build A Scalable Storage System with OSS at TLUG Meeting 2008/09/13Gosuke Miyashita
 
DataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The SequelDataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The SequelDataStax Academy
 
BlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for CephBlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for CephSage Weil
 
Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...
Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...
Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...DataStax
 
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...DataWorks Summit/Hadoop Summit
 
Distributed Resource Scheduling Frameworks
Distributed Resource Scheduling FrameworksDistributed Resource Scheduling Frameworks
Distributed Resource Scheduling FrameworksVARUN SAXENA
 
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...DataStax
 
A fun cup of joe with open liberty
A fun cup of joe with open libertyA fun cup of joe with open liberty
A fun cup of joe with open libertyAndy Mauer
 
Tuning Speculative Retries to Fight Latency (Michael Figuiere, Minh Do, Netfl...
Tuning Speculative Retries to Fight Latency (Michael Figuiere, Minh Do, Netfl...Tuning Speculative Retries to Fight Latency (Michael Figuiere, Minh Do, Netfl...
Tuning Speculative Retries to Fight Latency (Michael Figuiere, Minh Do, Netfl...DataStax
 

What's hot (20)

Supporting bioinformatics applications with hybrid multi-cloud services
Supporting bioinformatics applications with hybrid multi-cloud servicesSupporting bioinformatics applications with hybrid multi-cloud services
Supporting bioinformatics applications with hybrid multi-cloud services
 
Scalable Persistent Storage for Erlang: Theory and Practice
Scalable Persistent Storage for Erlang: Theory and PracticeScalable Persistent Storage for Erlang: Theory and Practice
Scalable Persistent Storage for Erlang: Theory and Practice
 
PostgreSQL and Linux Containers
PostgreSQL and Linux ContainersPostgreSQL and Linux Containers
PostgreSQL and Linux Containers
 
Low-latency Multi-threaded Ensemble Learning for Dynamic Big Data Streams
Low-latency Multi-threaded Ensemble Learning for Dynamic Big Data StreamsLow-latency Multi-threaded Ensemble Learning for Dynamic Big Data Streams
Low-latency Multi-threaded Ensemble Learning for Dynamic Big Data Streams
 
Apache CloudStack: API to UI (STLLUG)
Apache CloudStack: API to UI (STLLUG)Apache CloudStack: API to UI (STLLUG)
Apache CloudStack: API to UI (STLLUG)
 
Storing Cassandra Metrics (Chris Lohfink, DataStax) | C* Summit 2016
Storing Cassandra Metrics (Chris Lohfink, DataStax) | C* Summit 2016Storing Cassandra Metrics (Chris Lohfink, DataStax) | C* Summit 2016
Storing Cassandra Metrics (Chris Lohfink, DataStax) | C* Summit 2016
 
LinuxTag 2013
LinuxTag 2013LinuxTag 2013
LinuxTag 2013
 
State of Containers and the Convergence of HPC and BigData
State of Containers and the Convergence of HPC and BigDataState of Containers and the Convergence of HPC and BigData
State of Containers and the Convergence of HPC and BigData
 
Red Hat Global File System (GFS)
Red Hat Global File System (GFS)Red Hat Global File System (GFS)
Red Hat Global File System (GFS)
 
DockerCon14 Cluster Management and Containerization
DockerCon14 Cluster Management and ContainerizationDockerCon14 Cluster Management and Containerization
DockerCon14 Cluster Management and Containerization
 
Pacemaker+DRBD
Pacemaker+DRBDPacemaker+DRBD
Pacemaker+DRBD
 
How To Build A Scalable Storage System with OSS at TLUG Meeting 2008/09/13
How To Build A Scalable Storage System with OSS at TLUG Meeting 2008/09/13How To Build A Scalable Storage System with OSS at TLUG Meeting 2008/09/13
How To Build A Scalable Storage System with OSS at TLUG Meeting 2008/09/13
 
DataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The SequelDataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The Sequel
 
BlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for CephBlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for Ceph
 
Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...
Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...
Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...
 
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
 
Distributed Resource Scheduling Frameworks
Distributed Resource Scheduling FrameworksDistributed Resource Scheduling Frameworks
Distributed Resource Scheduling Frameworks
 
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...
 
A fun cup of joe with open liberty
A fun cup of joe with open libertyA fun cup of joe with open liberty
A fun cup of joe with open liberty
 
Tuning Speculative Retries to Fight Latency (Michael Figuiere, Minh Do, Netfl...
Tuning Speculative Retries to Fight Latency (Michael Figuiere, Minh Do, Netfl...Tuning Speculative Retries to Fight Latency (Michael Figuiere, Minh Do, Netfl...
Tuning Speculative Retries to Fight Latency (Michael Figuiere, Minh Do, Netfl...
 

Viewers also liked

Uma Arquitetura para Provisionamento de Ambientes de Alto Desempenho Customiz...
Uma Arquitetura para Provisionamento de Ambientes de Alto Desempenho Customiz...Uma Arquitetura para Provisionamento de Ambientes de Alto Desempenho Customiz...
Uma Arquitetura para Provisionamento de Ambientes de Alto Desempenho Customiz...Miguel Xavier
 
Peformance Evaluation of Container-based Vi
Peformance Evaluation of Container-based ViPeformance Evaluation of Container-based Vi
Peformance Evaluation of Container-based ViMiguel Xavier
 
Construção e provisionamento de ambientes de desenvolvimento virtualizados
Construção e provisionamento de ambientes  de desenvolvimento virtualizadosConstrução e provisionamento de ambientes  de desenvolvimento virtualizados
Construção e provisionamento de ambientes de desenvolvimento virtualizadosThiago Rodrigues
 
Virtualization Vs. Containers
Virtualization Vs. ContainersVirtualization Vs. Containers
Virtualization Vs. Containersactualtechmedia
 
Hypervisors and Virtualization - VMware, Hyper-V, XenServer, and KVM
Hypervisors and Virtualization - VMware, Hyper-V, XenServer, and KVMHypervisors and Virtualization - VMware, Hyper-V, XenServer, and KVM
Hypervisors and Virtualization - VMware, Hyper-V, XenServer, and KVMvwchu
 
Teaching Students with Emojis, Emoticons, & Textspeak
Teaching Students with Emojis, Emoticons, & TextspeakTeaching Students with Emojis, Emoticons, & Textspeak
Teaching Students with Emojis, Emoticons, & TextspeakShelly Sanchez Terrell
 
Hype vs. Reality: The AI Explainer
Hype vs. Reality: The AI ExplainerHype vs. Reality: The AI Explainer
Hype vs. Reality: The AI ExplainerLuminary Labs
 
Study: The Future of VR, AR and Self-Driving Cars
Study: The Future of VR, AR and Self-Driving CarsStudy: The Future of VR, AR and Self-Driving Cars
Study: The Future of VR, AR and Self-Driving CarsLinkedIn
 

Viewers also liked (8)

Uma Arquitetura para Provisionamento de Ambientes de Alto Desempenho Customiz...
Uma Arquitetura para Provisionamento de Ambientes de Alto Desempenho Customiz...Uma Arquitetura para Provisionamento de Ambientes de Alto Desempenho Customiz...
Uma Arquitetura para Provisionamento de Ambientes de Alto Desempenho Customiz...
 
Peformance Evaluation of Container-based Vi
Peformance Evaluation of Container-based ViPeformance Evaluation of Container-based Vi
Peformance Evaluation of Container-based Vi
 
Construção e provisionamento de ambientes de desenvolvimento virtualizados
Construção e provisionamento de ambientes  de desenvolvimento virtualizadosConstrução e provisionamento de ambientes  de desenvolvimento virtualizados
Construção e provisionamento de ambientes de desenvolvimento virtualizados
 
Virtualization Vs. Containers
Virtualization Vs. ContainersVirtualization Vs. Containers
Virtualization Vs. Containers
 
Hypervisors and Virtualization - VMware, Hyper-V, XenServer, and KVM
Hypervisors and Virtualization - VMware, Hyper-V, XenServer, and KVMHypervisors and Virtualization - VMware, Hyper-V, XenServer, and KVM
Hypervisors and Virtualization - VMware, Hyper-V, XenServer, and KVM
 
Teaching Students with Emojis, Emoticons, & Textspeak
Teaching Students with Emojis, Emoticons, & TextspeakTeaching Students with Emojis, Emoticons, & Textspeak
Teaching Students with Emojis, Emoticons, & Textspeak
 
Hype vs. Reality: The AI Explainer
Hype vs. Reality: The AI ExplainerHype vs. Reality: The AI Explainer
Hype vs. Reality: The AI Explainer
 
Study: The Future of VR, AR and Self-Driving Cars
Study: The Future of VR, AR and Self-Driving CarsStudy: The Future of VR, AR and Self-Driving Cars
Study: The Future of VR, AR and Self-Driving Cars
 

Similar to A Performance Comparison of Container-based Virtualization Systems for MapReduce Clusters

A Performance Comparison of Container-based Virtualization Systems for MapRed...
A Performance Comparison of Container-based Virtualization Systems for MapRed...A Performance Comparison of Container-based Virtualization Systems for MapRed...
A Performance Comparison of Container-based Virtualization Systems for MapRed...Marcelo Veiga Neves
 
Open stack ha design & deployment kilo
Open stack ha design & deployment   kiloOpen stack ha design & deployment   kilo
Open stack ha design & deployment kiloSteven Li
 
Latest (storage IO) patterns for cloud-native applications
Latest (storage IO) patterns for cloud-native applications Latest (storage IO) patterns for cloud-native applications
Latest (storage IO) patterns for cloud-native applications OpenEBS
 
Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015
Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015
Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015Belmiro Moreira
 
HPC and cloud distributed computing, as a journey
HPC and cloud distributed computing, as a journeyHPC and cloud distributed computing, as a journey
HPC and cloud distributed computing, as a journeyPeter Clapham
 
OpenStack HA
OpenStack HAOpenStack HA
OpenStack HAtcp cloud
 
OpenStack High Availability
OpenStack High AvailabilityOpenStack High Availability
OpenStack High AvailabilityJakub Pavlik
 
Network-aware Data Management for High Throughput Flows Akamai, Cambridge, ...
Network-aware Data Management for High Throughput Flows   Akamai, Cambridge, ...Network-aware Data Management for High Throughput Flows   Akamai, Cambridge, ...
Network-aware Data Management for High Throughput Flows Akamai, Cambridge, ...balmanme
 
An Updated Performance Comparison of Virtual Machines and Linux Containers
An Updated Performance Comparison of Virtual Machines and Linux ContainersAn Updated Performance Comparison of Virtual Machines and Linux Containers
An Updated Performance Comparison of Virtual Machines and Linux ContainersKento Aoyama
 
Introduction to Apache Mesos and DC/OS
Introduction to Apache Mesos and DC/OSIntroduction to Apache Mesos and DC/OS
Introduction to Apache Mesos and DC/OSSteve Wong
 
How does Apache Pegasus (incubating) community develop at SensorsData
How does Apache Pegasus (incubating) community develop at SensorsDataHow does Apache Pegasus (incubating) community develop at SensorsData
How does Apache Pegasus (incubating) community develop at SensorsDataacelyc1112009
 
Revolutionary Storage for Modern Databases, Applications and Infrastrcture
Revolutionary Storage for Modern Databases, Applications and InfrastrctureRevolutionary Storage for Modern Databases, Applications and Infrastrcture
Revolutionary Storage for Modern Databases, Applications and Infrastrcturesabnees
 
Understand how docker works
Understand how docker worksUnderstand how docker works
Understand how docker worksLi Jingtian
 
Understand how docker works
Understand how docker worksUnderstand how docker works
Understand how docker worksJustin Li
 
2018 03 25 system ml ai and openpower meetup
2018 03 25 system ml ai and openpower meetup2018 03 25 system ml ai and openpower meetup
2018 03 25 system ml ai and openpower meetupGanesan Narayanasamy
 
Leveraging Amzon EC2 Container Services for Container Orchestration
Leveraging Amzon EC2 Container Services for Container OrchestrationLeveraging Amzon EC2 Container Services for Container Orchestration
Leveraging Amzon EC2 Container Services for Container OrchestrationNeeraj Shah
 
Walk Through a Software Defined Everything PoC
Walk Through a Software Defined Everything PoCWalk Through a Software Defined Everything PoC
Walk Through a Software Defined Everything PoCMidoNet
 
Scaling Up Logging and Metrics
Scaling Up Logging and MetricsScaling Up Logging and Metrics
Scaling Up Logging and MetricsRicardo Lourenço
 

Similar to A Performance Comparison of Container-based Virtualization Systems for MapReduce Clusters (20)

A Performance Comparison of Container-based Virtualization Systems for MapRed...
A Performance Comparison of Container-based Virtualization Systems for MapRed...A Performance Comparison of Container-based Virtualization Systems for MapRed...
A Performance Comparison of Container-based Virtualization Systems for MapRed...
 
Open stack ha design & deployment kilo
Open stack ha design & deployment   kiloOpen stack ha design & deployment   kilo
Open stack ha design & deployment kilo
 
Latest (storage IO) patterns for cloud-native applications
Latest (storage IO) patterns for cloud-native applications Latest (storage IO) patterns for cloud-native applications
Latest (storage IO) patterns for cloud-native applications
 
Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015
Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015
Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015
 
HPC and cloud distributed computing, as a journey
HPC and cloud distributed computing, as a journeyHPC and cloud distributed computing, as a journey
HPC and cloud distributed computing, as a journey
 
OpenStack HA
OpenStack HAOpenStack HA
OpenStack HA
 
OpenStack High Availability
OpenStack High AvailabilityOpenStack High Availability
OpenStack High Availability
 
Network-aware Data Management for High Throughput Flows Akamai, Cambridge, ...
Network-aware Data Management for High Throughput Flows   Akamai, Cambridge, ...Network-aware Data Management for High Throughput Flows   Akamai, Cambridge, ...
Network-aware Data Management for High Throughput Flows Akamai, Cambridge, ...
 
An Updated Performance Comparison of Virtual Machines and Linux Containers
An Updated Performance Comparison of Virtual Machines and Linux ContainersAn Updated Performance Comparison of Virtual Machines and Linux Containers
An Updated Performance Comparison of Virtual Machines and Linux Containers
 
Introduction to Apache Mesos and DC/OS
Introduction to Apache Mesos and DC/OSIntroduction to Apache Mesos and DC/OS
Introduction to Apache Mesos and DC/OS
 
How does Apache Pegasus (incubating) community develop at SensorsData
How does Apache Pegasus (incubating) community develop at SensorsDataHow does Apache Pegasus (incubating) community develop at SensorsData
How does Apache Pegasus (incubating) community develop at SensorsData
 
Revolutionary Storage for Modern Databases, Applications and Infrastrcture
Revolutionary Storage for Modern Databases, Applications and InfrastrctureRevolutionary Storage for Modern Databases, Applications and Infrastrcture
Revolutionary Storage for Modern Databases, Applications and Infrastrcture
 
Understand how docker works
Understand how docker worksUnderstand how docker works
Understand how docker works
 
Understand how docker works
Understand how docker worksUnderstand how docker works
Understand how docker works
 
2018 03 25 system ml ai and openpower meetup
2018 03 25 system ml ai and openpower meetup2018 03 25 system ml ai and openpower meetup
2018 03 25 system ml ai and openpower meetup
 
Leveraging Amzon EC2 Container Services for Container Orchestration
Leveraging Amzon EC2 Container Services for Container OrchestrationLeveraging Amzon EC2 Container Services for Container Orchestration
Leveraging Amzon EC2 Container Services for Container Orchestration
 
Walk Through a Software Defined Everything PoC
Walk Through a Software Defined Everything PoCWalk Through a Software Defined Everything PoC
Walk Through a Software Defined Everything PoC
 
Scaling Up Logging and Metrics
Scaling Up Logging and MetricsScaling Up Logging and Metrics
Scaling Up Logging and Metrics
 
Txlf2012
Txlf2012Txlf2012
Txlf2012
 
Kubernetes2
Kubernetes2Kubernetes2
Kubernetes2
 


A Performance Comparison of Container-based Virtualization Systems for MapReduce Clusters

  • 1. A Performance Comparison of Container-based Virtualization Systems for MapReduce Clusters
      Miguel G. Xavier, Marcelo V. Neves, Cesar A. F. De Rose
      miguel.xavier@acad.pucrs.br
      Faculty of Informatics, PUCRS, Porto Alegre, Brazil
      February 13, 2014
  • 2. Outline
      •  Introduction
      •  Container-based Virtualization
      •  MapReduce
      •  Evaluation
      •  Conclusion
  • 3. Introduction
      •  Virtualization
          •  Allows resources to be shared
          •  Hardware independence, availability, isolation and security
          •  Better manageability
          •  Widely used in datacenters/cloud computing
      •  MapReduce Cluster and Virtualization
          •  Usage scenarios
              •  Better resource sharing
              •  Cloud Computing
          •  However, hypervisor-based technologies have traditionally been avoided in MapReduce environments
  • 4. Container-based Virtualization
      •  A group of processes on a Linux box, put together in an isolated environment
      •  A lightweight virtualization layer
      •  Non-virtualized drivers
      •  Shared operating system
      [Diagram: two stacks side by side. Container-based Virtualization: Hardware, Host OS, Virtualization Layer, Guest Processes. Hypervisor-Based Virtualization: Hardware, Host OS, Virtualization Layer, Guest OS, Guest Processes.]
  • 5. Container-based Virtualization
      •  Each container has:
          •  Its own network interface (and IP address)
              •  Bridged, routed …
          •  Its own filesystem
          •  Isolation (security)
              •  Containers A and B can't see each other
          •  Isolation (resource usage)
              •  RAM, CPU, I/O
      •  Current systems
          •  Linux-VServer, OpenVZ, LXC
  • 6. Container-based Virtualization
      •  Implements Linux namespaces
          •  Mount: mounting/unmounting file systems
          •  UTS: hostname, domainname
          •  IPC: SysV message queues, semaphores, shared memory segments
          •  Network: IPv4/IPv6 stacks, routing, firewall, /proc/net, sockets
          •  PID: own set of PIDs
          •  Chroot acts as the filesystem namespace
      •  Current systems
          •  Linux-VServer, OpenVZ, LXC
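The namespaces listed on this slide can be inspected on a running Linux system. The following is a minimal sketch, not something from the talk: it assumes a Linux kernel that exposes `/proc/<pid>/ns`, and the helper name `namespace_ids` and the mapping `SLIDE_NAMESPACES` are illustrative. Two processes in the same container report the same identifiers; processes in different containers report different ones.

```python
import os

# Namespace kinds named on the slide, mapped to their /proc/<pid>/ns entry names.
SLIDE_NAMESPACES = {
    "Mount": "mnt",
    "UTS": "uts",
    "IPC": "ipc",
    "Network": "net",
    "PID": "pid",
}


def namespace_ids(pid="self"):
    """Return {kind: identifier} for a process's namespaces.

    Entries that cannot be read (non-Linux host, missing permissions)
    are skipped, so the result may be empty.
    """
    ns_dir = f"/proc/{pid}/ns"
    ids = {}
    for kind, entry in SLIDE_NAMESPACES.items():
        try:
            # Each entry is a symlink whose target names the namespace instance.
            ids[kind] = os.readlink(os.path.join(ns_dir, entry))
        except OSError:
            continue
    return ids


if __name__ == "__main__":
    for kind, ident in namespace_ids().items():
        print(f"{kind:8s} {ident}")
```

Comparing the output for a process inside a container with that of a host process is a quick way to check which of the five namespaces a given container system actually separates.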
  • 7. Container-based Systems
      •  Linux-VServer
          •  Implements its own features in the Linux kernel
          •  Limits the scope of the file system for different processes through the traditional chroot
      •  OpenVZ
      •  Linux Containers (LXC)
          •  Based on CGroups
  • 8. Hypervisor- vs Container-based Systems
      Hypervisor                    Container
      Different kernel/OS           Single kernel
      Device emulation              Syscall
      Many FS caches                Single FS cache
      Limits per machine            Limits per process
      High performance overhead     Low performance overhead
  • 9. MapReduce
      •  MapReduce
          •  A parallel programming model
          •  Simplicity, efficiency and high scalability
          •  It has become a de facto standard for large-scale data analysis
      •  MapReduce has also attracted the attention of the HPC community
          •  Simpler approach to address the parallelism problem
          •  Highly visible cases where MapReduce has been used successfully by companies like Google, Yahoo!, Facebook and Amazon
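To make the programming model concrete, here is a minimal single-process sketch of the map, shuffle and reduce phases applied to word counting. It is not Hadoop code and not taken from the talk; the function names are illustrative, and a real cluster would run the map and reduce calls in parallel across nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over an iterable of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    return dict(reduce_phase(k, vs) for k, vs in shuffle(mapped).items())


if __name__ == "__main__":
    print(word_count(["map reduce map", "reduce cluster"]))
```

Because each map call sees only one line and each reduce call sees only one key, both phases can be distributed without shared state, which is the property the slide's "simplicity" and "scalability" points rely on.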
  • 10. MapReduce and Containers
      •  Apache Mesos
          •  Shares a cluster between multiple different frameworks
          •  Creates another level of resource management
          •  Management is taken away from the cluster's RMS
      •  Apache YARN
          •  Hadoop Next Generation
          •  Better job scheduling/monitoring
          •  Uses virtualization to share a cluster among different applications
  • 11. Evaluation
      •  Experimental environment
          •  Hadoop cluster composed of 4 nodes
          •  Two processors with 8 cores (without threads) per node
          •  16 GB of memory per node
          •  146 GB of disk per node
      •  Analysis of the best performance results
          •  Through micro-benchmarks
              •  HDFS evaluation (TestDFSIO)
              •  NameNode evaluation (NNBench)
              •  MapReduce evaluation (MRBench)
          •  Through macro-benchmarks (WordCount, TeraSort)
      •  Analysis of the best isolation results
          •  Through the IBS benchmark
      •  At least 50 executions were performed for each experiment
  • 12. HDFS Evaluation
      •  Settings:
          •  Block replication factor of 3
          •  File size from 100 MB to 3000 MB
      •  All container-based systems have performance similar to native
      •  The OpenVZ results show a loss of 3 Mbps
          •  This is due to the CFQ scheduler
      [Figure: throughput (Mbps, 0 to 30) versus file size (axis labeled "Bytes", 0 to 3000) for LXC, native, OpenVZ and VServer]
  • 13. HDFS Evaluation
      •  All container-based systems obtained performance results similar to native
      •  Linux-VServer uses a physical-based network
      [Figure: throughput (Mbps, 0 to 30) versus file size (axis labeled "Bytes", 0 to 3000) for LXC, native, OpenVZ and VServer]
  • 14. NameNode Evaluation using NNBench
      •  The NNBench benchmark was chosen to evaluate the NameNode component
          •  Generates operations on 1000 files on HDFS
      •  Linux-VServer reaches an average latency of 48 ms, while LXC obtained the worst result at an average of 56 ms
      •  The differences are not very significant when the numbers are considered
      •  The strengths are that no exception was observed under high HDFS management stress, and that all systems responded as effectively as native

                            Native    LXC      OpenVZ   VServer
      Open/Read (ms)        0.51      0.52     0.51     0.49
      Create/Write (ms)     54.65     56.89    51.96    48.90
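The size of these differences is easier to judge as a percentage of the native latency. The sketch below does that arithmetic on the latencies reported in the table; the helper name `relative_to_native` is illustrative, and the percentages are derived here rather than quoted from the talk.

```python
# Latencies in milliseconds, copied from the NNBench table on the slide.
NNBENCH_MS = {
    "Open/Read": {"Native": 0.51, "LXC": 0.52, "OpenVZ": 0.51, "VServer": 0.49},
    "Create/Write": {"Native": 54.65, "LXC": 56.89, "OpenVZ": 51.96, "VServer": 48.90},
}


def relative_to_native(row):
    """Return each system's latency difference from native, in percent.

    Positive values mean slower than native, negative values mean faster.
    """
    native = row["Native"]
    return {
        system: (latency - native) / native * 100.0
        for system, latency in row.items()
        if system != "Native"
    }


if __name__ == "__main__":
    for operation, row in NNBENCH_MS.items():
        for system, pct in relative_to_native(row).items():
            print(f"{operation:13s} {system:8s} {pct:+.1f}%")
```

For Create/Write this gives roughly +4% for LXC, -5% for OpenVZ and -11% for VServer relative to native, which is consistent with the slide's remark that the differences are modest.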
  • 15. MapReduce Evaluation using MRBench
      •  The results obtained from MRBench show that the MR layer suffers no substantial effect while running on different container-based virtualization systems

                            Native    LXC      OpenVZ   VServer
      Execution Time        14251     13577    14304    13614
  • 16. Analyzing Performance with WordCount
      •  30 GB of input data
      •  The peak of performance degradation for OpenVZ is explained by the I/O scheduler overhead
      [Figure: WordCount execution time (seconds, 0 to 180) for Native, LXC, OpenVZ and VServer]
  • 17. Analyzing Performance with TeraSort
      •  Standard map/reduce sort
      •  Steps:
          •  Generate 30 GB of input data
          •  Run the sort on that input data
      •  An HDFS block size of 64 MB
      [Figure: TeraSort execution time (seconds, 0 to 140) for Native, LXC, OpenVZ and VServer]
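The two settings on this slide determine roughly how much work the sort is split into. The sketch below is back-of-the-envelope arithmetic, not a figure from the talk: it assumes binary units (GiB, MiB), relies on TeraGen's fixed 100-byte record size, and treats the input as one contiguous file, whereas TeraGen actually writes several files, so the real block count can differ slightly.

```python
ROW_BYTES = 100          # TeraGen writes fixed-size 100-byte records
GIB = 1024 ** 3          # assumption: "GB" on the slide read as GiB
MIB = 1024 ** 2          # assumption: "MB" on the slide read as MiB


def teragen_rows(input_bytes):
    """Number of whole 100-byte rows needed to produce about input_bytes."""
    return input_bytes // ROW_BYTES


def hdfs_blocks(input_bytes, block_bytes):
    """Number of HDFS blocks for a single file of input_bytes (ceiling division)."""
    return -(-input_bytes // block_bytes)


if __name__ == "__main__":
    size = 30 * GIB
    print("rows to generate:", teragen_rows(size))
    print("64 MiB blocks:   ", hdfs_blocks(size, 64 * MIB))
```

With Hadoop's default of one input split per block, the block count is also an estimate of the number of map tasks the four-node cluster has to schedule.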
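  • 18. Performance Isolation
      [Diagram: the baseline application first runs alone in container A and its execution time is recorded. It then runs again in container A while container B runs a stress test, and that execution time is recorded. Comparing the two execution times gives the performance degradation (%).]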
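The slide does not spell out how the degradation percentage is computed. A common definition, assumed here, is the relative increase in execution time when the neighbouring container is under stress. The function name and the timings in the example are hypothetical.

```python
def performance_degradation(baseline_time, stressed_time):
    """Percentage slowdown of the baseline application under a noisy neighbour.

    Assumed definition: (stressed - baseline) / baseline * 100.
    Both arguments must use the same time unit.
    """
    if baseline_time <= 0:
        raise ValueError("baseline_time must be positive")
    return (stressed_time - baseline_time) / baseline_time * 100.0


if __name__ == "__main__":
    # Hypothetical timings in seconds, for illustration only.
    alone = 100.0
    with_stress = 108.3
    print(f"degradation: {performance_degradation(alone, with_stress):.1f}%")
```

Under this definition, 0% means the stress test in container B had no measurable effect on container A, which is how a perfectly isolated resource would appear.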
  • 19. Performance Isolation

                CPU     Memory   I/O     Fork Bomb
      LXC       0%      8.3%     5.5%    0%

      •  We chose LXC as the representative container-based virtualization system to be evaluated
      •  The per-container CPU usage limits work well
          •  No significant impact was noted
      •  A small performance degradation needs to be taken into account
      •  The fork bomb stress test reveals that LXC has a security subsystem that ensures feasibility
  • 20. Conclusions
      •  We found that all container-based systems reach near-native performance for MapReduce workloads
      •  The performance isolation results revealed that LXC has improved its ability to restrict resources among containers
      •  Although some works already take advantage of container-based systems on MR clusters, this work demonstrated the benefits of using container-based systems to support MapReduce clusters
  • 21. Future Work
      •  We plan to study performance isolation at the network level
      •  We plan to study scalability while increasing the number of nodes
      •  We plan to study aspects of green computing, such as the trade-off between performance and energy consumption
  • 22. Thank you for your attention!