CMPE 283 – Project 2
DRS and DPM Implementation in Virtualized Environment
Large Scale Performance Statistics Gathering and Monitoring

Submitted to
Professor Simon Shim

Submitted By
Team-01
Akshay Wattal
Apoorva Gouni
Gopika Gogineni
Pratyusha Mandapati

Table of Contents

1. Introduction
2. Background
3. Requirements
4. Design
5. Implementation
6. Assumptions
7. Limitations
8. Future Work and its extension
9. Individual Contribution
10. Installation and Execution manual
12. Screenshots
13. Conclusion
14. References

1. INTRODUCTION

Virtualization has eased IT computing overall by providing significant cost savings, and it can also greatly enhance an organization's business agility. Companies that employ partitioning, workload management and other virtualization techniques are better positioned to respond to changing demands in their business.

1.1. Goals

Some of the key challenges that the virtualization industry faces are managing efficient utilization of resources, proper consumption of power, and collection of logs from the virtualized environment for monitoring and analysis purposes. In this project we try to understand these challenges and propose a reasonable solution.

1.2. Objectives

The objective of this project can be divided into two parts:
• Develop a simple DRS (Distributed Resource Scheduler) and DPM (Distributed Power Management) function.
• Develop a real-time statistics gathering and analysis framework that is capable of capturing metrics from the vHosts and the virtual machines and visualizing them in an efficient manner.

1.3. Needs

Distributed resource scheduling and distributed power management are essential for balancing the load in a virtualized environment. They prevent over- and under-utilization of resources. With millions of performance records generated by virtual machines, it is important to have a performance statistics collector and analyzer framework to monitor the health of the infrastructure and thus carry out smooth DevOps.

2. BACKGROUND

The Distributed Resource Scheduler (DRS) helps distribute the load across the different hosts available in the vCenter. Distributed Power Management (DPM) runs on top of DRS and performs the primary function of periodically monitoring the CPU usage of the hosts and virtual machines; it powers off the host with the least utilization, leading to migration of its virtual machines. In addition, large-scale statistics are collected and displayed graphically for analytics purposes.

3. REQUIREMENTS

3.1. Functional Requirements

1. Collect the performance statistics metrics of the hosts and virtual machines.
2. Store the data in a scalable NoSQL database.
3. Use Logstash for log filtering and parsing.
4. An agent aggregator should read and aggregate data from the NoSQL database at intervals of 5 minutes, 1 hour and 1 day, and store the results into a relational database (MySQL).
5. The data from the MySQL database should be presented in the form of charts using a visualization tool.
6. The visualization tool must be able to present the data on a real-time basis by updating itself at regular time intervals.

3.2. Non-Functional Requirements

1. The system must be designed to scale automatically for any future use and should not have a single point of failure.
2. The visualization should be clear and convey meaningful insights.

4. DESIGN

Part 1 – DRS & DPM

Initially, the vHosts and the virtual machines under each vHost are listed. Then the CPU usage metric is calculated for each vHost. When a new virtual machine is added, it is placed under the vHost with the lowest CPU usage, thereby balancing the load. That is, a virtual machine with low CPU usage is migrated to the vHost with low CPU usage.

A code sample demonstrates the Distributed Resource Scheduling performed when a new vHost is added to the vCenter.
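
The code figure from the original report did not survive extraction. The following is a minimal, hypothetical Java sketch of the selection logic described above; the host names, usage values and the DPM power-off threshold are illustrative assumptions, not the project's actual code:

```java
import java.util.HashMap;
import java.util.Map;

public class DrsDpmSketch {

    // DRS placement: choose the vHost with the least CPU usage for a new VM
    // (or for a VM being migrated off an over-loaded host).
    static String selectTargetHost(Map<String, Double> hostCpuUsage) {
        String target = null;
        double min = Double.MAX_VALUE;
        for (Map.Entry<String, Double> e : hostCpuUsage.entrySet()) {
            if (e.getValue() < min) {
                min = e.getValue();
                target = e.getKey();
            }
        }
        return target;
    }

    // DPM check: flag the least-utilized host for power-off when its CPU usage
    // falls below a threshold; its VMs would be migrated away first.
    static String selectPowerOffCandidate(Map<String, Double> hostCpuUsage,
                                          double thresholdPercent) {
        String candidate = selectTargetHost(hostCpuUsage);
        return (candidate != null && hostCpuUsage.get(candidate) < thresholdPercent)
                ? candidate : null;
    }

    public static void main(String[] args) {
        Map<String, Double> usage = new HashMap<>();
        usage.put("130.65.132.131", 62.5);  // percent CPU usage (illustrative)
        usage.put("130.65.132.132", 18.0);
        System.out.println(selectTargetHost(usage));               // least-loaded host
        System.out.println(selectPowerOffCandidate(usage, 20.0));  // DPM candidate
    }
}
```

In the real project these usage values come from the vCenter Performance Manager rather than a hard-coded map.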
  
	
  
	
  
	
  
  
	
  
Part 2

Architecture Diagram

The diagram below illustrates the solution architecture for the large-scale statistics gathering and analysis tool. It highlights the different components and tools that were used for implementing the solution. In addition, it shows the solution flow and the deployment scheme used.

Figure: Solution Architecture of the framework

Components

Listed below are the major components that form the basis of the system and solution architecture:

1. Java Agent Collector
The Java Agent Collector is essentially a Java program that uses the VMware Infrastructure APIs: it initiates a Performance Manager to fetch the vital performance statistics of the vHosts and the VMs. After getting the statistics, it writes them to a log file. This collector is deployed independently on each of the virtual machines.

2. Log File
As mentioned above, the log file acts as local storage on the virtual machine, capturing the performance statistics. Each log file stored on a VM is named <vHost Name>.log, e.g. 130.65.132.131.log.

3. Logstash
Logstash is an open source tool for managing events and logs. It provides the capability to collect logs, parse them and store them for later use. In this framework, it polls the log files for "events", where an event is a single line written to the log file. As soon as an event is detected, it pushes it to the MongoDB cloud server. In addition, using Logstash in our implementation prevents a single point of failure: if the connection to MongoDB cannot be established, Logstash retries the connection, and once the connection is back up it automatically syncs the missing log entries from the log files to the MongoDB storage.

4. MongoDB Cloud Storage (MongoLab)
MongoLab provides MongoDB-as-a-Service. Deployed on top of AWS, it serves as our primary client-side database for storing the raw log data, consisting of performance statistics of the vHosts and virtual machines. It is highly scalable and reliable. As explained above, Logstash pushes a huge amount of data to the MongoDB cloud infrastructure. MongoLab provides a web-based UI for managing the database (configuration for remote connections, deletes, etc.).

5. Java Agent Analyzer
The Agent Analyzer is a multi-threaded Java program that spawns a total of three threads. These threads connect to the MongoDB cloud infrastructure after every fixed interval (5 minutes, 1 hour and 24 hours), perform roll-ups/aggregation on the data they fetch, and store the results into the MySQL database. Having an analyzer ensures that only processed data goes into the final database, allowing for easy and effective visualization.

6. MySQL Database
MySQL is a SQL database that forms the primary server-side database, storing the entire valuable aggregated, processed data.

7. Visualization Block
The visualization block forms an important part of this large-scale framework. It gives the user the capability to view and drill down into the key performance statistics. In our solution the following pieces are integrated together to generate data insights:

• PHP
PHP is a server-side scripting language that connects to the MySQL database and caters to the requests sent from the UI.

• UI/Bootstrap
The UI is developed using a front-end template framework called Bootstrap. It sends AJAX calls (GET) to fetch data from the MySQL interface.

• Apache Httpd Server
This is an open source HTTP server from Apache that hosts the PHP and UI files. It allows for communication between the UI and the PHP scripts.

Key Workflows

To understand the solution we can divide the architecture into three key workflows:

1. Data Collection
The diagram below shows the data collection workflow. Each collector agent running on a virtual machine gets the performance data (for both the vHost and the VM) using the Performance Manager and metric IDs. After getting the data, it writes the data to the log file, which acts as local storage. The Logstash process running on the VM reads the log files and pushes the received data into the MongoDB cloud storage platform.

Figure: Data Collection Workflow

2. Data Aggregation
The next workflow is data aggregation. The Java Agent Analyzer is a multithreaded program that spawns three threads for 5-minute, 1-hour and 24-hour data aggregation. It does the main job of fetching the data from the MongoDB datastore and aggregating it into the MySQL database. Each thread sleeps for its specific time interval before performing the aggregation again.

Figure: Data Aggregation Workflow

3. Data Visualization
The final workflow is visualizing the aggregated data. PHP scripts act as an abstraction on top of the MySQL database, fetching the data requested by the UI (which is JavaScript). The communication between the UI and PHP takes place over HTTP using REST web services, handled by the Apache HTTP server that hosts the web application. HighCharts, DataTables and Google Maps are used for visualizing the data insights. The data in the dashboards refreshes every 5 minutes to show the latest data.

Figure: Data Visualization Workflow

Database Schema Design

1. Client Side MongoDB
For storing the huge amount of data in MongoDB, the following schema is used:
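
The schema figure did not survive extraction. A representative log document, assuming field names mirroring the metric columns of the MySQL schema later in this report (all values here are illustrative), might look like:

```json
{
  "timestamp": "2014-11-20 10:05:00",
  "vHost_name": "130.65.132.131",
  "vm_name": "T01-VM01-Ubuntu01",
  "cpuUsage": 12.5,
  "cpuUsagemhz": 310.0,
  "memUsage": 23.4,
  "diskUsage": 101.0,
  "netUsage": 55.0,
  "sysUptime": 86400
}
```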
  
	
  
  
2. Server Side MySQL
It is essential that the server-side schema is designed to scale and to collect fine-grained information to be presented on the UI. Thus the following schema design is used:

Figure: MySQL Schema Design

5. IMPLEMENTATION

Part 2

5.1. Environment
The project has been implemented in the following environment:
• One datacenter consisting of two hosts:
o vHost-130.65.132.131
o vHost-130.65.132.132
• The vHosts have VMware ESXi 5.0 installed.
• Each host has two virtual machines:
o T01-VM01-Ubuntu01 and T01-VM01-Ubuntu02 under vHost-130.65.132.131
o T01-VM01-Ubuntu03 and T01-VM01-Ubuntu04 under vHost-130.65.132.132
• Each VM has the following OS and tools:
o Ubuntu-10.04
o Java (version 1.8.0_25)
o Logstash-1.4.2
• JavaScript is used for the client-side UI.

  
5.2. Tools
The following tools were used for development, debugging and testing purposes:
• vSphere client and server – For connecting to and hosting the virtualized environment.
• Eclipse IDE – For developing the Java code and .jar files.
• Logstash-1.4.2 – Acts as the log management framework.
• MongoLab – Cloud storage for MongoDB (MongoDB-as-a-Service).
• MySQL 5.1.75 – Used as the server-side storage.
• PHP 5.4.31 – Used as the server-side scripting language.
• Bootstrap – UI template tool for creating the dashboard.
• HighCharts – For visualizing the key statistics.
• DataTables – For visualizing the key statistics.
• Google Charts – For plotting the machines' IP addresses.
• Apache httpd 2.2.27 – For hosting the web application.

  
5.3. Implementation Approach
The statistics of individual virtual machines and their corresponding host machines are collected by deploying a jar file, designed and compiled for this purpose, on each virtual machine. The jar file's Java project is designed to accept the virtual machine name as an argument at runtime. The design follows these main aspects: retrieving the statistics, saving them to a log file on individual virtual machines, saving them to the client-side database system, and retrieving them from the server-side database for visualization.
  
	
  
Statistics collection:

The jar file deployed on each virtual machine collects the statistics of that virtual machine and its host machine. The jar file is built from a Java project with the code structure described below:
• The Java project is designed to have three classes: statsLog.java, pThread.java and stats.java.
• statsLog.java: This class has the main() function defined. It accepts a valid VM name as an argument and gets the corresponding host name from a dynamically created HashMap of vHosts and VMs. A thread is created using the pThread class to maintain a separate thread of operation.
• pThread.java: This thread is started in statsLog.java and holds the virtual machine and host system objects. The run() function creates the stats.java class object and passes it the virtual machine and host system objects. This run method is a blocking call and is executed in an infinite loop to collect the statistics after a certain time interval.
• stats.java: This class defines the methods to collect statistics from the virtual machine (printVMdetails) and its host system (printHOSTdetails). These methods use the following prebuilt VMware classes to collect the statistics:
o PerformanceManager: returns the performance manager object for the vCenter service instance.
o PerfProviderSummary: queries and returns the summary provider for the virtual machine object passed.
o PerfQuerySpec: builds the query for the corresponding metric and retrieves the metric data.
o The following metric IDs are used to collect the statistics:
§ 2 : CPU Usage
§ 6 : CPU usage in megahertz
§ 24 : Memory Usage
§ 29 : Memory Granted
§ 33 : Memory Active
§ 98 : Memory Consumed
§ 125 : Disk Usage
§ 130 : Disk Read
§ 131 : Disk Write
§ 143 : Net Usage
§ 148 : Net Received
§ 149 : Net Transmitted
§ 155 : System Up time
§ 480 : System resources CPU usage
o All these statistics are saved to a log file on each virtual machine.
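
As a rough illustration of the collector's output step (not the project's actual code; the field names and the JSON-lines format are assumptions based on the schema used elsewhere in this report), each sampling pass could serialize the fetched metrics into one log line that Logstash can then parse with its json codec:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class StatsLogSketch {

    // Build one JSON log line from the metrics gathered in a sampling pass.
    // Metric names here mirror the MySQL columns used later in the report.
    static String formatLogLine(String vHost, String vm, Map<String, Double> metrics) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"vHost_name\":\"").append(vHost).append("\",");
        sb.append("\"vm_name\":\"").append(vm).append("\"");
        for (Map.Entry<String, Double> e : metrics.entrySet()) {
            sb.append(",\"").append(e.getKey()).append("\":").append(e.getValue());
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        Map<String, Double> metrics = new LinkedHashMap<>();
        metrics.put("cpuUsage", 12.5);   // metric id 2
        metrics.put("memUsage", 23.4);   // metric id 24
        // In the real collector a line like this is appended to
        // <vHost Name>.log inside pThread's run() loop each interval.
        System.out.println(formatLogLine("130.65.132.131", "T01-VM01-Ubuntu01", metrics));
    }
}
```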
  
Log Parsing:

The log file saved at a specified location on each virtual machine is shipped to the MongoDB in the cloud (MongoLab) by Logstash. For this, a configuration file is created in which the following are specified:
• The input file path.
• The output/connection to MongoDB.
• The database name and the collection name in which to store the data.

The format of the configuration file is as below:

input {
  file {
    codec => "json"
    stat_interval => 0
    path => "/home/administrator/Desktop/Stats/130_65_132_131.log"
  }
}
output {
  mongodb {
    collection => "mylogcollection"
    database => "vmwaredb"
    uri => "mongodb://Apoorva:VMstats283@ds061200.mongolab.com:61200/vmwaredb"
  }
}

Log Storage:

• MongoDB is used to collect the data from Logstash and store the received logs.
• Received logs are stored in the "mylogcollection" collection of the "vmwaredb" database.
• Remote connections are enabled on the MongoLab instance of MongoDB.

Log Aggregation:

This is done using the T01_Analyser.java program, a multithreaded Java program that spawns three threads for minute, hourly and daily aggregation. These are recurring threads that execute after every fixed, configurable interval. Once the aggregation is done, the program opens a connection to the MySQL database using the MySQL JDBC connector and executes two insert statements per time interval (one for the VMs and one for the vHosts), resulting in updates to 6 tables.
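
A minimal sketch of the roll-up step, assuming the analyzer simply averages the raw samples collected within each window (the class and method names are illustrative, not the project's T01_Analyser code; the real program additionally fetches from MongoDB and writes the result to MySQL over JDBC):

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class AggregatorSketch {

    // Roll up the raw samples from one window into a single averaged value.
    static double rollUp(List<Double> samples) {
        double sum = 0;
        for (double s : samples) sum += s;
        return samples.isEmpty() ? 0 : sum / samples.size();
    }

    public static void main(String[] args) {
        double avgCpu = rollUp(Arrays.asList(10.0, 14.0, 12.0));
        System.out.println("5min cpuUsage roll-up: " + avgCpu);  // prints 12.0

        // The three recurring roll-up tasks (5 min, 1 hour, 24 hours) can be
        // driven by a scheduler; each task would fetch the raw documents for
        // its window, roll them up, and INSERT the result into the matching
        // *_monitoring_* MySQL table.
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(3);
        pool.scheduleWithFixedDelay(
                () -> System.out.println("5min roll-up tick"),
                5, 5, TimeUnit.MINUTES);
        pool.shutdownNow(); // sketch only: stop the scheduler immediately
    }
}
```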
  
	
  	
  
	
  
Data Storage and Retrieval:

Data storage is done using the MySQL database; following is the schema creation script:

CREATE	
  DATABASE	
  	
  IF	
  NOT	
  EXISTS	
  `vmware_monitoring`	
  /*!40100	
  DEFAULT	
  CHARACTER	
  SET	
  latin1	
  */;	
  
USE	
  `vmware_monitoring`;	
  
	
  
	
  
DROP	
  TABLE	
  IF	
  EXISTS	
  `vHost_monitoring_5min`;	
  
/*!40101	
  SET	
  @saved_cs_client	
  	
  	
  	
  	
  =	
  @@character_set_client	
  */;	
  
/*!40101	
  SET	
  character_set_client	
  =	
  utf8	
  */;	
  
CREATE	
  TABLE	
  `vHost_monitoring_5min`	
  (	
  
	
  	
  `id`	
  int(11)	
  unsigned	
  NOT	
  NULL	
  AUTO_INCREMENT,	
  
	
  	
  `timestamp`	
  varchar(255)	
  NOT	
  NULL,	
  
	
  	
  `name`	
  varchar(255)	
  DEFAULT	
  NULL,	
  
	
  	
  `cpuUsage`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `cpuUsagemhz`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `memUsage`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `memGranted`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `memActive`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `memConsumed`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `diskUsage`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `diskRead`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `diskWrite`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `netUsage`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `netReceived`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `netTrasnmitted`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `sysUptime`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `sysResourcesCpuUsage`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  PRIMARY	
  KEY	
  (`id`),	
  
	
  	
  KEY	
  `name`	
  (`name`)	
  
)	
  ENGINE=InnoDB	
  AUTO_INCREMENT=40	
  DEFAULT	
  CHARSET=utf8;	
  
/*!40101	
  SET	
  character_set_client	
  =	
  @saved_cs_client	
  */;	
  
	
  
DROP	
  TABLE	
  IF	
  EXISTS	
  `VM_monitoring_5min`;	
  
/*!40101	
  SET	
  @saved_cs_client	
  	
  	
  	
  	
  =	
  @@character_set_client	
  */;	
  
/*!40101	
  SET	
  character_set_client	
  =	
  utf8	
  */;	
  
CREATE	
  TABLE	
  `VM_monitoring_5min`	
  (	
  
	
  	
  `id`	
  int(11)	
  unsigned	
  NOT	
  NULL	
  AUTO_INCREMENT,	
  
	
  	
  `timestamp`	
  varchar(255)	
  NOT	
  NULL,	
  
	
  	
  `vHost_name`	
  varchar(255)	
  DEFAULT	
  NULL,	
  
  16	
  
	
  	
  `vm_name`	
  varchar(255)	
  DEFAULT	
  NULL,	
  
	
  	
  `cpuUsage`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `cpuUsagemhz`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `memUsage`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `memGranted`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `memActive`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `memConsumed`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `diskUsage`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `diskRead`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `diskWrite`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `netUsage`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `netReceived`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `netTrasnmitted`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `sysUptime`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `sysResourcesCpuUsage`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  PRIMARY	
  KEY	
  (`id`),	
  
	
  	
  KEY	
  `vm_name`	
  (`vm_name`)	
  
)	
  ENGINE=InnoDB	
  AUTO_INCREMENT=40	
  DEFAULT	
  CHARSET=utf8;	
  
/*!40101	
  SET	
  character_set_client	
  =	
  @saved_cs_client	
  */;	
  
	
  
	
  
DROP	
  TABLE	
  IF	
  EXISTS	
  `vHost_monitoring_1hour`;	
  
/*!40101	
  SET	
  @saved_cs_client	
  	
  	
  	
  	
  =	
  @@character_set_client	
  */;	
  
/*!40101	
  SET	
  character_set_client	
  =	
  utf8	
  */;	
  
CREATE	
  TABLE	
  `vHost_monitoring_1hour`	
  (	
  
	
  	
  `id`	
  int(11)	
  unsigned	
  NOT	
  NULL	
  AUTO_INCREMENT,	
  
	
  	
  `timestamp`	
  varchar(255)	
  NOT	
  NULL,	
  
	
  	
  `name`	
  varchar(255)	
  DEFAULT	
  NULL,	
  
	
  	
  `cpuUsage`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `cpuUsagemhz`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `memUsage`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `memGranted`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `memActive`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `memConsumed`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `diskUsage`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `diskRead`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `diskWrite`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `netUsage`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `netReceived`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `netTrasnmitted`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `sysUptime`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `sysResourcesCpuUsage`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  PRIMARY	
  KEY	
  (`id`),	
  
	
  	
  KEY	
  `name`	
  (`name`)	
  
)	
  ENGINE=InnoDB	
  AUTO_INCREMENT=40	
  DEFAULT	
  CHARSET=utf8;	
  
/*!40101	
  SET	
  character_set_client	
  =	
  @saved_cs_client	
  */;	
  
	
  
DROP	
  TABLE	
  IF	
  EXISTS	
  `VM_monitoring_1hour`;	
  
/*!40101	
  SET	
  @saved_cs_client	
  	
  	
  	
  	
  =	
  @@character_set_client	
  */;	
  
/*!40101	
  SET	
  character_set_client	
  =	
  utf8	
  */;	
  
CREATE	
  TABLE	
  `VM_monitoring_1hour`	
  (	
  
	
  	
  `id`	
  int(11)	
  unsigned	
  NOT	
  NULL	
  AUTO_INCREMENT,	
  
	
  	
  `timestamp`	
  varchar(255)	
  NOT	
  NULL,	
  
  17	
  
	
  	
  `vHost_name`	
  varchar(255)	
  DEFAULT	
  NULL,	
  
	
  	
  `vm_name`	
  varchar(255)	
  DEFAULT	
  NULL,	
  
	
  	
  `cpuUsage`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `cpuUsagemhz`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `memUsage`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `memGranted`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `memActive`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `memConsumed`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `diskUsage`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `diskRead`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `diskWrite`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `netUsage`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `netReceived`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `netTrasnmitted`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `sysUptime`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  `sysResourcesCpuUsage`	
  	
  decimal(11.3)	
  DEFAULT	
  NULL,	
  
	
  	
  PRIMARY	
  KEY	
  (`id`),	
  
	
  	
  KEY	
  `vm_name`	
  (`vm_name`)	
  
)	
  ENGINE=InnoDB	
  AUTO_INCREMENT=40	
  DEFAULT	
  CHARSET=utf8;	
  
/*!40101	
  SET	
  character_set_client	
  =	
  @saved_cs_client	
  */;	
  
	
  
DROP TABLE IF EXISTS `vHost_monitoring_24hour`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `vHost_monitoring_24hour` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `timestamp` varchar(255) NOT NULL,
  `name` varchar(255) DEFAULT NULL,
  `cpuUsage` decimal(11,3) DEFAULT NULL,
  `cpuUsagemhz` decimal(11,3) DEFAULT NULL,
  `memUsage` decimal(11,3) DEFAULT NULL,
  `memGranted` decimal(11,3) DEFAULT NULL,
  `memActive` decimal(11,3) DEFAULT NULL,
  `memConsumed` decimal(11,3) DEFAULT NULL,
  `diskUsage` decimal(11,3) DEFAULT NULL,
  `diskRead` decimal(11,3) DEFAULT NULL,
  `diskWrite` decimal(11,3) DEFAULT NULL,
  `netUsage` decimal(11,3) DEFAULT NULL,
  `netReceived` decimal(11,3) DEFAULT NULL,
  `netTrasnmitted` decimal(11,3) DEFAULT NULL,
  `sysUptime` decimal(11,3) DEFAULT NULL,
  `sysResourcesCpuUsage` decimal(11,3) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `name` (`name`)
) ENGINE=InnoDB AUTO_INCREMENT=40 DEFAULT CHARSET=utf8;
/*!40101 SET character_set_client = @saved_cs_client */;
  
	
  
DROP TABLE IF EXISTS `VM_monitoring_24hour`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `VM_monitoring_24hour` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `timestamp` varchar(255) NOT NULL,
  `vHost_name` varchar(255) DEFAULT NULL,
  `vm_name` varchar(255) DEFAULT NULL,
  `cpuUsage` decimal(11,3) DEFAULT NULL,
  `cpuUsagemhz` decimal(11,3) DEFAULT NULL,
  `memUsage` decimal(11,3) DEFAULT NULL,
  `memGranted` decimal(11,3) DEFAULT NULL,
  `memActive` decimal(11,3) DEFAULT NULL,
  `memConsumed` decimal(11,3) DEFAULT NULL,
  `diskUsage` decimal(11,3) DEFAULT NULL,
  `diskRead` decimal(11,3) DEFAULT NULL,
  `diskWrite` decimal(11,3) DEFAULT NULL,
  `netUsage` decimal(11,3) DEFAULT NULL,
  `netReceived` decimal(11,3) DEFAULT NULL,
  `netTrasnmitted` decimal(11,3) DEFAULT NULL,
  `sysUptime` decimal(11,3) DEFAULT NULL,
  `sysResourcesCpuUsage` decimal(11,3) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `vm_name` (`vm_name`)
) ENGINE=InnoDB AUTO_INCREMENT=40 DEFAULT CHARSET=utf8;
/*!40101 SET character_set_client = @saved_cs_client */;
  
	
  
	
  
To retrieve the data, the following PHP scripts were created. These scripts define the data model for returning data to the UI when a web-service request is initiated, and they are coupled with the MySQL database to allow easy connection and access to the data.
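The retrieval logic of these scripts follows the pattern sketched below. The sketch is in Python with an in-memory SQLite database standing in for the MySQL/PDO stack; the function name and the sample rows are illustrative, and only the table and column names come from the schema above.

```python
import json
import sqlite3

# Stand-in for the PHP data-model scripts: fetch the samples for one VM
# and serialize them for the UI. sqlite3 is used here only for
# illustration; the real scripts query MySQL via PDO.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE VM_monitoring_24hour (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp TEXT NOT NULL,
    vHost_name TEXT, vm_name TEXT,
    cpuUsage REAL, memUsage REAL)""")
conn.executemany(
    "INSERT INTO VM_monitoring_24hour "
    "(timestamp, vHost_name, vm_name, cpuUsage, memUsage) VALUES (?, ?, ?, ?, ?)",
    [("2014-12-01 10:00", "132.65.132.131", "T01-VM01-Ubuntu01", 42.5, 61.2),
     ("2014-12-01 10:05", "132.65.132.131", "T01-VM01-Ubuntu01", 47.1, 60.8)])

def vm_stats_json(vm_name):
    """Return the rows for one VM as the JSON consumed by the UI charts."""
    cur = conn.execute(
        "SELECT timestamp, cpuUsage, memUsage FROM VM_monitoring_24hour "
        "WHERE vm_name = ? ORDER BY id", (vm_name,))
    rows = [{"timestamp": t, "cpuUsage": c, "memUsage": m} for t, c, m in cur]
    return json.dumps(rows)

print(vm_stats_json("T01-VM01-Ubuntu01"))
```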
  
	
  
	
  
	
  
	
  
User Interface (UI):

The UI is implemented using Bootstrap, a front-end framework. We call the dashboard VMware DevOps. It makes REST calls to retrieve the data and renders it on the screen. Highcharts and DataTables are used for data visualization, and the Google Maps API is used to determine the location of a server based on its IP address. The statistics are refreshed automatically every 5 minutes.
  
	
  
	
  
  
Part 1 - DRS

Initially, we set the monitoring interval and create the performance query specification, and then loop over the samples received from the performance manager. Here is the sample code for it.
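The shape of that loop can be sketched as follows. This is plain Python with hard-coded illustrative samples standing in for the values returned by the vSphere PerformanceManager; the helper name and the numbers are assumptions, not from the actual implementation.

```python
# Illustrative stand-in for the PerformanceManager query loop: the real
# code builds a performance query spec with a 20-second interval and
# iterates the returned samples; here the samples are hard-coded.
SAMPLING_INTERVAL_SECS = 20  # matches the project's collection interval

samples = {  # vHost -> CPU-usage samples (percent); illustrative values
    "132.65.132.131": [72.0, 80.5, 77.3],
    "132.65.132.132": [35.1, 33.8, 40.2],
}

def avg_cpu_usage(per_host_samples):
    """Average the samples per vHost, as done after the query loop."""
    return {host: sum(vals) / len(vals)
            for host, vals in per_host_samples.items()}

usage = avg_cpu_usage(samples)
most_loaded = max(usage, key=usage.get)
print(most_loaded)  # the vHost DRS will offload first
```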
  
	
  
	
  
After obtaining the CPU usage of the vHosts, DRS is started in order to balance the load. DRS then checks the load of each vHost and starts cloning virtual machines as required for load balancing. The sample code for it is here.
  
	
  
  
	
  
	
  
The implementation of the second part of the DRS algorithm is shown in the screenshots below.

When the program runs against the vCenter using the given credentials, it adds a new vHost from the admin console using the SSL thumbprint of the host to be added.
  
  
	
  
	
  
When a new host is added, DRS checks the CPU usage of each vHost, and the virtual machines under the host with the highest CPU usage are migrated to the new vHost. Metrics are collected for every vHost to get its CPU usage. Since the host 132.65.132.131 has the highest CPU usage, the virtual machine Ubuntu 02 is migrated to the new host 132.65.132.133.

While migrating the virtual machines to the new host, the following conditions are checked:
1. If powered on: live-migrate the virtual machine.
2. If powered off: cold-migrate the virtual machine.

The lower threshold value for host CPU usage is set to 30%. If the usage of a host is less than 30%, migration does not take place. Sample code showing the 30% CPU-usage threshold is as follows:
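Since the original listing is shown only as a screenshot, the decision logic it implements can be sketched as follows. This is illustrative Python; the function name is hypothetical, but the 30% threshold, the live/cold choice, and the single-VM exception follow the description above.

```python
# Sketch of the DRS migration decision (illustrative; the real code
# invokes the vSphere migrate/relocate tasks).
CPU_LOWER_THRESHOLD = 30.0  # percent; hosts below this are left alone

def migration_action(host_cpu_usage, vm_powered_on, vms_on_host):
    """Decide what DRS does for one candidate VM on a given host."""
    if host_cpu_usage < CPU_LOWER_THRESHOLD:
        return "skip"  # host is not overloaded, no migration
    if vms_on_host <= 1:
        return "skip"  # a host with only one VM is never drained
    # Powered-on VMs are live-migrated (vMotion); powered-off VMs
    # are cold-migrated.
    return "live-migrate" if vm_powered_on else "cold-migrate"

print(migration_action(75.0, True, 2))
print(migration_action(75.0, False, 2))
print(migration_action(20.0, True, 2))
```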
  
	
  
  
	
  
When a vHost has only one virtual machine, migration is not performed on that virtual machine. The screenshot below shows that the virtual machine with the highest CPU usage under the second vHost is migrated successfully.

Thus, load balancing of the vHosts can be achieved using Distributed Resource Scheduling.
  
	
  
Part 1 - DPM

Initially, it checks all the vHosts; if there exists any vHost with no VMs under it, that particular vHost is deleted. Then, it determines the CPU usage of all the vHosts.

Here is the sample code for it:
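A minimal sketch of this first pass, with an illustrative in-memory inventory standing in for the vCenter calls (the host addresses and usage numbers are illustrative):

```python
# Sketch of DPM's first pass: drop vHosts that have no VMs, then
# compute CPU usage for the remaining ones.
inventory = {
    "130.65.132.131": {"vms": ["Ubuntu01", "Ubuntu02"], "cpu": 55.0},
    "130.65.132.132": {"vms": ["Ubuntu03", "Ubuntu04"], "cpu": 25.0},
    "130.65.132.133": {"vms": [], "cpu": 2.0},  # empty host
}

# Delete every vHost that hosts no VMs (the real code removes the
# host from the vCenter inventory).
empty_hosts = [h for h, info in inventory.items() if not info["vms"]]
for h in empty_hosts:
    del inventory[h]

# Determine the CPU usage for all remaining vHosts.
usage = {h: info["cpu"] for h, info in inventory.items()}
print(sorted(inventory))
```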
  
	
  
	
  
Whenever a new VM is added, it checks the CPU usage of all vHosts at that particular point in time.

It then sorts the vHosts from lowest to highest CPU usage. All the VMs under any vHost with less than 30% CPU usage are migrated to another vHost, and that old vHost is deleted. The sample code for it is here.
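The consolidation step can be sketched as follows. This is illustrative Python: `consolidate` is a hypothetical helper, and the dictionaries stand in for the vCenter inventory and the vSphere migrate/delete calls.

```python
# Sketch of the DPM consolidation step: vHosts are sorted from lowest
# to highest CPU usage; every VM on a host below 30% is moved to the
# busiest remaining host, and the drained host is deleted.
THRESHOLD = 30.0  # percent

def consolidate(hosts):
    """hosts: {name: {"cpu": float, "vms": [...]}}; mutated in place."""
    order = sorted(hosts, key=lambda h: hosts[h]["cpu"])  # low -> high
    for name in order:
        if hosts[name]["cpu"] >= THRESHOLD or len(hosts) == 1:
            continue  # busy enough, or nowhere left to migrate to
        target = max((h for h in hosts if h != name),
                     key=lambda h: hosts[h]["cpu"])
        hosts[target]["vms"].extend(hosts[name]["vms"])  # migrate all VMs
        del hosts[name]  # power off / delete the drained host
    return hosts

hosts = {
    "130.65.132.131": {"cpu": 55.0, "vms": ["Ubuntu01", "Ubuntu02"]},
    "130.65.132.132": {"cpu": 25.0, "vms": ["Ubuntu03", "Ubuntu04"]},
}
consolidate(hosts)
print(sorted(hosts))  # only the busy host remains
```

On this illustrative two-host inventory, the sketch reproduces the end state shown in the screenshots: a single vHost left running all four VMs.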
  
	
  
	
  
	
  
	
  
	
  
	
  
	
  
  
6. ASSUMPTIONS:
Part 1 - DRS:
The assumptions made to implement DRS Part 1 are:
• Initially, vCenter has 2 hosts, each having 2 virtual machines:
o vHost 1: 132.65.132.131
§ VM1: T01-VM01-Ubuntu01
§ VM2: T01-VM01-Ubuntu02
o vHost 2: 132.65.132.132
§ VM1: T01-VM01-Ubuntu03
§ VM2: T01-VM01-Ubuntu04
• vMotion is enabled for all the vHosts, and VMware Tools are installed on each VM.
• The program Prime95 is running on each virtual machine.
Part 1 - DPM
• In this part, we have assumed the minimum CPU usage value to be 30%.
• The number of vHosts is three.
• There are no VMs under the third vHost.
  
	
  
Part 2:
The solution was built in a controlled environment consisting of 2 virtual machines and 2 hosts. The following assumptions were taken into consideration while building this solution:

1. Only virtual machines in the powered-on state are considered for log monitoring and analysis, since the Agent Collector runs on the virtual machines to capture the performance data.
2. The virtual machines are packaged with the Logstash and Agent Collector (.jar) installation files. In addition, these files are executed using shell scripts. This is to prevent a single point of failure.
3. The interval for collection of performance data is set to 20 seconds, since it is a part of the project requirement and avoids excessive I/O on the log files.
4. The Aggregator program is executed from a physical machine and is considered to be highly robust and redundant, so as to withstand failure.
5. To simulate stress on the virtual machines, programs like Prime95 and Folding@home are used.
6. For storing raw logs, the MongoLabs cloud infrastructure is used. It is considered to be highly scalable and elastic.
7. The UI should be dynamic, and the visualizations illustrating the performance statistics should be clear and crisp.
  
	
  
7. LIMITATIONS
Part 1:
• The current implementation is limited to a controlled environment, so the real execution of DRS and DPM in industry could not be studied.
• The algorithm does not take into consideration any performance metrics apart from CPU usage.

Part 2:
Following are the limitations of the current solution architecture:
1. The Agent Collector runs on the virtual machines capturing the performance metrics. This leads to high CPU utilization and slows down the other tasks running on the VM.
2. The performance metrics captured as a part of this implementation are pre-defined. There is no way to dynamically add new performance-metric IDs.
3. Using MongoDB-as-a-service (MongoLabs) acts as a black box in terms of scaling and security. Even though authentication keys and credentials are used, there is no visibility into where the data is being stored.
4. In case of a Logstash failure, no captured data (in log files) will be passed to the MongoDB database server for storage.
  
	
  
8. FUTURE WORK AND ITS EXTENSION
Part 1:
The DRS algorithm can be implemented for a larger number of vHosts and the virtual machines running under them, so as to simulate a real industry environment. In addition, the behavior of DRS and DPM execution can be monitored and triggered using a remote UI.
  
	
  
	
  
  
Part 2:
This project can be scaled into a complete log-analyzing and visualization enterprise solution. Listed below are some of the extensions for the project:
1. Providing the capability to analyze logs per VM using the User Interface (UI).
2. Dynamic or easy onboarding of new performance metrics.
3. Providing a feature to connect to vCenter through the UI and get all the performance metrics of the corresponding vHosts and VMs dynamically.
4. The project can also be extended to provide monitoring and visualization capabilities for other virtualization players, e.g., Hyper-V, Xen, etc.
  
	
  
9. INDIVIDUAL CONTRIBUTION

Akshay Wattal
• Overall design of system framework and architecture
• Performed PoC for different architectural components
• Implementation of the visualization framework
• Integration and system testing
• Agent analyzer implementation
• Documentation

Apoorva Gouni
• Overall design of system framework and architecture
• Agent Collector implementation
• Configuration of Logstash in VMs
• MongoLabs connection
• Unit testing
• Documentation

Gopika Gogineni
• Overall design of system framework and architecture
• DPM implementation
• Documentation

Pratyusha Mandapati
• Overall design of system framework and architecture
• DRS - Create VM scenario
• Documentation
  
	
  
  
10. INSTALLATION AND EXECUTION MANUAL

• Java installation: Java is installed on each virtual machine. Below are the steps followed to install Java.
o Download and save jdk-8u25-linux-i586.tar.gz
o Switch to the directory where you saved the file
o Uncompress the file
o Launch Java

• Logstash 1.4.2 installation: Logstash is installed on each virtual machine. Below are the steps followed to install Logstash:
o Download the Logstash tar file
o Extract the Logstash tar file
o Launch Logstash
o Download logstash-contrib-1.4.2
o Extract logstash-contrib-1.4.2
o Run the contrib plug-in
  	
  
	
  
Execution: The following commands should be executed at the command prompt on each virtual machine.

java -jar stats.jar VM_name : This starts collecting the statistics and saving them to the log file at the specified location. Here "stats.jar" is the name of the jar file and VM_name is the name of the virtual machine on which you are running this jar file.

bin/logstash agent -f stats.conf : This command runs Logstash and parses the log from the virtual machine to MongoDB on the cloud. Here "stats.conf" is the name of the configuration file.
  
	
  
	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  
• MongoLabs configuration: The detailed steps are provided at http://docs.mongolab.com/

• MySQL installation: MySQL is installed on the physical machine running Mac OS, using a third-party package manager for Mac called Homebrew:

brew install mysql

Execution: To start the MySQL service, run the following command - mysql.server restart

• PHP installation: PHP is installed on the physical machine running Mac OS, also using Homebrew (detailed link: https://github.com/Homebrew/homebrew-php):

brew install php56

• Apache HTTP Server installation and execution: It is built into Mac OS. To start the Apache server, first deploy all the PHP and UI files (html, css, js) into the document root folder. Next, execute the following command to start the service:

./apachectl start

To stop the server, run: ./apachectl stop
  
  
11. UNIT, SYSTEM AND INTEGRATION TESTING:
The test cases were executed in a controlled environment consisting of 2 vHosts and 2 virtual machines:

Test Case 1
Description: When a virtual machine is powered on, it should return the virtual machine name and its corresponding host name, and the performance statistics of both the virtual machine and its corresponding host should be saved in a log file at the specified location.
Expected Result: The statistics of the vHosts and the VMs are successfully fetched.
Actual Result: Pass
Details: Successfully printed the virtual machine name and its corresponding vHost name, and saved the performance metrics of both the virtual machine and its corresponding host in a log file at the specified location. (Screenshot Test-1)

Test Case 2
Description: Check the connection with MongoDB and the storage of data in MongoDB. Run the command to execute Logstash; data from the specified input path should be parsed to MongoDB.
Expected Result: A connection is successfully established with MongoDB, and data is parsed from the specified location and saved to MongoDB.
Actual Result: Pass
Details: Screenshot Test-2 shows the expected output.

Test Case 3
Description: Test remote connectivity with the MongoLabs server so that the Aggregator program can connect remotely.
Expected Result: The remote connection is successful (verified with the Robomongo tool).
Actual Result: Pass
Details: Screenshot Test-3 shows the expected output.

Test Case 4
Description: Ensure that the Aggregator is able to connect to MongoDB and insert the aggregated statistics into the MySQL database.
Expected Result: The data from MongoDB is aggregated and pushed into MySQL.
Actual Result: Pass
Details: Screenshot Test-4 shows the data aggregated into different MySQL tables corresponding to the timeframe.

Test Case 5 - Visualization

Test Case 5.1
Description: Test whether the UI REST web services are able to connect with the deployed PHP scripts and get back the data.
Expected Result: There is no error on loading the VMware DevOps dashboard, and data is fetched for the charts.
Actual Result: Pass
Details: Screenshot Test-5.1 shows the dashboard loading successfully without any errors.

Test Case 5.2
Description: Test data consistency between the MySQL data and the data populated into the visualization widgets.
Expected Result: The data points on the charts and tables are correct and as expected.
Actual Result: Pass
Details: Screenshot Test-5.2 shows correct and matching data loaded on the UI.

Test Case 5.3
Description: Check whether the graphs and data tables auto-refresh after every fixed interval of time.
Expected Result: The values in the graphs are updated after the configured time interval.
Actual Result: Pass
Details: Screenshot Test-5.3 shows that the graphs are auto-refreshed.
  
	
  
  
Testing Screenshots:

Test-1:

Test-2:

Test-3:

Test-4:

Test-5.1:

Test-5.2:

Test-5.3:
  
12. SCREENSHOTS
Part 1:
DRS1, Initial placement:
Running Prime95 for increasing CPU utilization:

Comparison of vHost CPU utilization for initial placement and selection of a vHost:

Virtual Machine added to the selected vHost:

DRS2, Load balancing:
Adding a new vHost, 132.65.132.133, in the vCenter.
  
	
  
	
  
  
	
  
	
  
Collecting the metrics for vHosts and VMs and displaying the CPU usage, including the timestamp.

Checks performed for live and cold migration, and migrating the virtual machine T01-VM01-Ubuntu02 to the new vHost.
  
	
  
	
  
VM	
  migrated	
  successfully	
  
	
  
	
  
	
  
	
  
	
  
  
Checks performed on the second vHost, 132.65.132.132, to migrate the virtual machine T01-VM01-Ubuntu03 to the new vHost.
  
	
  
	
  
DPM (Distributed Power Management):
The vHost 130.65.132.133 is deleted as it has no VMs.
  
Below is the code execution of the above deletion procedure.

Migrating VM T01-VM01-Ubuntu03 from 130.65.132.132 to 130.65.132.131

Below is the code execution of the above migration procedure.

Migrating VM T01-VM01-Ubuntu04 from 130.65.132.132 to 130.65.132.131

Below is the code execution of the above migration procedure.

Deleting the vHost 130.65.132.132, since no VMs are left.

Below is the code execution of the above deletion procedure.

At the end of DPM, only one vHost with four VMs is left.
  
	
  
	
  
	
  
	
  
  
Part 2:
Note: Screenshots of the running collector, the generated log file, the Logstash execution, MongoDB storage, and aggregation are already captured in the Testing section above. They are not repeated here, to limit the length of the document.
  
	
  
Visualization
Below are the screenshots of the VMware DevOps dashboard that we have created.

This is the VMware vHost Top Statistics dashboard; its primary purpose is to show the top 3 defaulting vHosts that are consuming the maximum resources in terms of CPU, memory, disk I/O and usage.
  
	
  
	
  
	
  
	
  
	
  
	
  
	
  
  
This is the VMware VMs Top Statistics dashboard; its primary purpose is to show the top 3 defaulting virtual machines that are consuming the maximum resources in terms of CPU, memory, disk I/O and usage.

Next, from the left-side panel we can expand VMware Inventory and drill down to each ESXi or VM level. The following screenshot shows the overview of a monitored vHost.
  
	
  
  
CPU overview of vHosts: it consists of two charts, each with three timelines: by minute, hour and day.

Memory overview of vHosts: it consists of four charts, each with three timelines: by minute, hour and day.
  
	
  
	
  
	
  
  
Disk I/O overview of vHosts: it consists of three charts, each with three timelines: by minute, hour and day.

Network overview of vHosts: it consists of three charts, each with three timelines: by minute, hour and day.
  
	
  
	
  
  
And finally the system overview of vHosts: it consists of three charts, each with three timelines: by minute, hour and day.

Next, from the left-side panel we can expand VMware Inventory -> Virtual Machines. The first screen is the VMs overview, providing quick stats.
  
	
  
  
CPU overview of a VM: it consists of two charts, each with three timelines: by minute, hour and day.

Memory overview of a VM: it consists of four charts, each with three timelines: by minute, hour and day.
  
	
  
  
Disk I/O overview of a VM: it consists of three charts, each with three timelines: by minute, hour and day. Note that if we need to compare, or to view only one VM's statistics, we can simply enable or disable it from the chart legend, as shown below.

Network overview of a VM: it consists of three charts, each with three timelines: by minute, hour and day.
  
	
  
  
And finally the system overview of a VM: it consists of three charts, each with three timelines: by minute, hour and day.

Next, to purge the MongoDB collections, we can simply use the UI to clear all records. An extension of this could be to delete records based on timestamp.

The last feature of this dashboard is that you can find the location of any server using its IP address.
  
	
  
	
  
13. CONCLUSION:
1. It was highly challenging, in real time, to migrate the VMs as a part of DPM while trying to maintain synchronization with the MySQL tables at the same time.
2. In real time, generating the graphs dynamically was slow.
3. Configuring Logstash was very challenging. The difficult part lies in determining the output formats by making use of plug-ins.
4. We had no hands-on experience with creating new resource pools and managing them within our vCenter in our initial labs, so we found that task challenging.
5. As the lab infrastructure provided every team with limited storage, we always had to monitor that our machines did not run out of space, by purging the log files that had been collected.
6. It took us a considerable amount of time just to sort out the statistics that we were required to collect as per the project requirements.
7. Devising the algorithm best suited to implementing DRS and DPM was really challenging and took a considerable amount of time. Once we had figured it out, it was easy for us to implement DRS and DPM.
  
	
  
14. REFERENCES:
1. http://logstash.net
2. CMPE 283 Project-2 Description PDF
3. https://mongolab.com/welcome/
4. http://php.net/manual/en/ref.pdo-mysql.php
5. http://startbootstrap.com/
  
	
  

Part 1: DRS and DPM Implementation in Virtualized Environment, Part 2: Large Scale Performance Statistics gathering and monitoring

  • 1.   1         CMPE  283  –  Project  2     DRS  and  DPM  Implementation  in  Virtualized  Environment   Large  Scale  Performance  Statistics  gathering  and  monitoring       Submitted  to   Professor  Simon  Shim     Submitted  By     Team  -­‐01   Akshay  Wattal   Apoorva  Gouni   Gopika  Gogineni   Pratyusha  Mandapati          
  • 2.   2   Table  of  Contents   1.  Introduction:  ....................................................................................................................  3   2.  Background:  .....................................................................................................................  4   3.  Requirements  ...................................................................................................................  4   4.Design  ...............................................................................................................................  5   5.Implementation  ..............................................................................................................  12   6.  Assumptions:  .................................................................................................................  24   7.  Limitations  .....................................................................................................................  25   8.  Future  Work  and  its  extension  .......................................................................................  25   9.  Individual  Contribution  ..................................................................................................  26   10.  Installation  and  Execution  manual  ................................................................................  27   12.  Screenshots  ..................................................................................................................  37   13.  Conclusion  ....................................................................................................................  54   14.  References:  ..................................................................................................................  55                          
  • 3.   3   1.  INTRODUCTION:   Virtualization  has  overall  eased  the  IT  computing  by  providing  high  cost  savings  and  can  also   greatly  enhance  organization’s  business  agility.  Companies  that  employ  partitioning,  workload   management   and   other   virtualization   techniques   are   better   positioned   to   respond   to   changing  demands  in  their  business.     1.1.  Goals:   Some   of   the   key   challenges   that   the   virtualization   industry   faces   are   to   manage   efficient   utilization   of   resources,   proper   consumption   of   power   and   collections   of   logs   from   the   virtualized  environment  for  monitoring  and  analysis  purpose.  In  this  project  we  are  doing  to   try  to  understand  these  challenges  and  propose  a  reasonable  solution.     1.2.  Objectives:   The  objective  of  this  project  can  be  divide  into  two  parts:   • Creating  a  simple  Develop  a  simple  DRS  (Distributed  Resource  Scheduler)  and  DPM   (Distributed  Power  Management)  function.   • Develop  a  real  time  statistics  gathering  and  analysis  framework.  Which  is  capable  of   capturing  metrics  from  vHosts  and  the  Virtual  Machines  and  has  the  ability  to  visualize   them  in  an  efficient  manner.       1.3.  Needs:   Distributed   resource   scheduler   and   distributed   power   management   are   essential   for   balancing  the  load  in  the  virtualize  environment.  It  prevents  from  over  and  under-­‐utilization   of  resources.  There  are  millions  of  performance  records  generated  from  virtual  machines,  its   important  to  have  a  performance  statistics  collector  and  analyzer  framework  to  monitor  the   health  of  the  infrastructure  and  thus  carry  out  smooth  DevOps.                  
  • 4.   4   2.  BACKGROUND:   The  Distributed  resource  scheduler  (DRS)  helps  in  distribution  of  the  load  across  the  different   hosts  that  are  available  in  the  vCenter.  Distributed  Power  management  runs  on  top  of  DRS   and  performs  the  primary  function  of  monitoring  the  hosts  and  virtual  machines  CPU  usage   periodically   and   power-­‐offs   the   host   with   least   utilization   leading   to   migration   of   virtual   machines.  In  addition,  collection  of  large-­‐scale  statistics  is  done  and  displayed  graphically  for   analytics  purposes.       3.  REQUIREMENTS   3.1.  Functional  Requirements   1. Collection  of  the  performance  statistics  metrics  of  the  hosts  and  virtual  machines.   2. Storing  of  the  data  should  be  in  scalable  NoSQL  database.   3. Use  of  logstash  for  log  filtering  and  parsing  should  be  implemented.   4. Agent   aggregator   should   read   and   aggregate   data   from   the   NoSQL   database   in   intervals  of  5minutes,  1hour  and  daily  and  store  into  relational  database  MySQL.     5. The  data  from  the  MySQL  database  should  be  presented  in  the  form  of  charts  by  using   visualization  tool.   6. The   visualization   tool   must   be   able   to   present   the   data   on   a   real-­‐time   basis   by   updating  itself  at  regular  time  intervals.     3.2.  Non-­‐Functional  Requirements   1. The  system  must  be  designed  to  scale  automatically  for  any  future  use  and  should  not   have  a  single  point  of  failure.   2. The  visualization  should  be  clear  and  convey  meaningful  insights.                
  • 5.   5   4.DESIGN   Part  1-­‐  DRS  &  DPM     Initially,  the  number  of  vHosts  and  the  virtual  machines  under  each  vHost  will  be  listed.  Then,   the  CPU  usage  metric  will  be  calculated  for  each  vHosts.  When  the  new  virtual  machine  is   added,  it  will  be  placed  under  the  vHosts  with  less  CPU  usage  there  by  balancing  the  load.   That  is,  the  virtual  machine  with  less  CPU  usage  will  be  migrated  to  the  vHosts  with  less  CPU   usage.   Sample  code  that  shows  the  Distributed  Resource  Scheduling  when  a  new  vHost  is  added  to   the  vCenter.        
  • 6.   6     Part  2     Architecture  Diagram     The  below  diagram  illustrates  the  solution  architecture  for  the  large-­‐scale  statistics  gathering   and   analysis   tool.   It   highlights   the   different   components   and   tools   that   were   used   for   implementing  the  solution.  In  addition,  it  also  points  the  solution  flow  and  the  deployment   scheme  used.         Figure:  Solution  Architecture  of  the  framework                
  • 7.   7   Components     Listed   below   are   the   major   components   that   formed   the   basis   of   the   system   and   solution   architecture:     1. Java  Agent  Collector   The   Java   Agent   Collector   is   essential   a   Java   program   that   uses   the   VMware   Infrastructure   APIs,   which   initiates   a   Performance   Manager   to   fetch   the   vital   performance   statistics   of   the   vHosts   and   the   VMs.   After   getting   the   statistics   it   performs  the  task  of  writing  it  to  the  log  file.  This  collector  is  deployed  independently   on  each  of  the  virtual  machines.     2. Log  File   As   mentioned   above,   the   log   file   acts   as   a   local   storage   for   the   virtual   machine   capturing  the  performance  statistics.  Each  log  file  that  is  stored  on  the  VM  is  stored  in   as  <vHost  Name>.log  i.e.  130.65.132.131.log     3. Logstash   Logstash   is   an   open   source   tool   for   managing   events   and   logs.   It   provides   the   capability  to  collect  logs,  parse  them  and  store  them  for  later  use.  In  this  framework,  it   polls  on  the  log  files  for  “events”,  where  event  is  a  single  line  written  to  the  log  file.  As   soon  as  an  event  is  detected  it  pushes  it  to  the  MongoDB  cloud  server.  In  addition,   using  Logstash  in  our  implementation  prevents  from  single  point  of  failure.  In  case,  if   the  connection  to  MongoDB  were  not  established,  it  would  re-­‐try  to  connect.  Once,   the  connection  is  back  up  it  will  automatically  sync  the  missing  log  entries  from  the  log   files  to  the  MongoDB  storage.     4. MongoDB  Cloud  Storage  (MongoLabs)   MongoLabs   provides   MongoDB-­‐as-­‐a-­‐Service.   
Deployed   on   top   of   AWS,   it’s   servers   as   our   primary  client  side  database  for  storing  the  raw  log  data,  consisting  of  performance  statistics   of   the   vHosts   and   Virtual   Machines.   It’s   highly   scalable   and   reliable.   As   explained   above,   logstash  pushes  huge  amount  of  data  to  the  MongoDB  cloud  infrastructure.  It  consists  of  a   web  based  UI  for  management  of  the  database  (configurations  for  remote  connection,  delete   etc.)     5. Java  Agent  Analyzer   Agent  Analyzer  is  a  multi-­‐threaded  java  program  that  spawns  a  total  of  three  threads.   These  threads  connect  to  the  MongoDB  cloud  infrastructure  after  every  fixed  internal   (5min,  1hour  and  24hours)  and  performs  roll-­‐ups/aggregation  on  the  data  it  fetches  
and stores it into the MySQL database. Having an analyzer ensures that only processed data goes into the final database, allowing easy and effective visualization.

6. MySQL Database
MySQL is a SQL database that forms the primary server-side database, storing all of the valuable aggregated, processed data.

7. Visualization Block
The visualization block forms an important part of this large-scale framework. It gives the user the capability to view and drill down into the key performance statistics. In our solution, the following pieces are integrated to generate data insights:
• PHP
PHP is a server-side scripting language that connects to the MySQL database and caters to the requests sent from the UI.
• UI/Bootstrap
The UI is developed using a JavaScript template framework called Bootstrap. It sends AJAX (GET) calls to fetch data through the MySQL interface.
• Apache httpd Server
This is the open-source HTTP server from Apache that hosts the PHP and UI files. It allows communication between the UI and the PHP scripts.

Key Workflows

To understand the solution, we can divide the architecture into three key workflows:

1. Data Collection
The diagram below shows the data collection workflow. Each collector agent running on a virtual machine gets the performance data (for both the vHost and the VM) using the PerformanceManager and the metric IDs. It then writes the data to the log file, which acts as local storage. The Logstash process running on the VM reads the log files and pushes the data into the MongoDB cloud storage platform.
Figure: Data Collection Workflow

2. Data Aggregation
The next workflow is data aggregation. The Java Agent Analyzer is a multithreaded program that spawns three threads for 5-minute, 1-hour and 24-hour data aggregation. It does the main job of fetching the data from the MongoDB datastore and aggregating it into the MySQL database. Each thread sleeps for its specific time interval before performing the next aggregation.

Figure: Data Aggregation Workflow
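As a hedged sketch of this aggregation workflow (the class and method names below are illustrative, not the project's actual T01_Analyser code): three recurring tasks, one per window, each of which would fetch the raw samples for its interval and average them into one aggregated row.

```java
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Illustrative roll-up logic: average raw samples into one aggregated value.
public class RollupSketch {
    // Core of a roll-up: average one metric's raw samples over a window.
    public static double average(List<Double> samples) {
        double sum = 0.0;
        for (double s : samples) sum += s;
        return samples.isEmpty() ? 0.0 : sum / samples.size();
    }

    public static void main(String[] args) {
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(3);
        // One recurring task per aggregation window: 5 min, 1 hour, 24 hours.
        long[] periodsSec = {5 * 60, 60 * 60, 24 * 60 * 60};
        for (long p : periodsSec) {
            pool.scheduleAtFixedRate(() -> {
                // In the real analyzer: fetch the raw documents from MongoDB
                // for the last window, average() each metric, and INSERT the
                // aggregated row into the matching MySQL table.
            }, p, p, TimeUnit.SECONDS);
        }
        pool.shutdown(); // periodic tasks are cancelled on shutdown by default
    }
}
```

`ScheduledExecutorService` keeps the three windows independent, so a slow 24-hour roll-up cannot delay the 5-minute one.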
3. Data Visualization
The final workflow is visualizing the aggregated data. PHP scripts act as an abstraction on top of the MySQL database, fetching the data requested by the UI (which is JavaScript). Communication between the UI and PHP takes place over HTTP using REST web services, served by the Apache HTTP server hosting the web application. HighCharts, Data Tables and Google Maps are used for visualizing the data insights. The data in the dashboards refreshes every 5 minutes to show the latest values.

Figure: Data Visualization Workflow

Database Schema Design

1. Client-Side MongoDB
For storing the large amount of data in MongoDB, the following schema is used:
2. Server-Side MySQL
It is essential that the server-side schema is designed to scale and to capture fine-grained information for presentation on the UI. Thus, the following schema design is used:

Figure: MySQL Schema Design
5. IMPLEMENTATION

Part 2

5.1. Environment
The project has been implemented in the following environment:
• One datacenter consisting of two hosts:
  o vHost 130.65.132.131
  o vHost 130.65.132.132
• The vHosts have VMware ESXi 5.0 installed.
• Each host has two virtual machines:
  o T01-VM01-Ubuntu01 and T01-VM01-Ubuntu02 under vHost 130.65.132.131
  o T01-VM01-Ubuntu03 and T01-VM01-Ubuntu04 under vHost 130.65.132.132
• Each VM has the following OS and tools:
  o Ubuntu 10.04
  o Java (version 1.8.0_25)
  o Logstash 1.4.2
• JavaScript is used for the client-side UI.

5.2. Tools
The following tools were used for development, debugging and testing:
• vSphere client and server – for connecting to and hosting the virtualized environment.
• Eclipse IDE – for developing the Java code and .jar files.
• Logstash 1.4.2 – acts as the log-management framework.
• MongoLab – cloud storage for MongoDB (MongoDB-as-a-Service).
• MySQL 5.1.75 – used as the server-side storage.
• PHP 5.4.31 – used as the server-side scripting language.
• Bootstrap – UI template tool for creating the dashboard.
• HighCharts – for visualizing the key statistics.
• Data Tables – for visualizing the key statistics.
• Google Maps – for plotting the machine's location from its IP address.
• Apache httpd 2.2.27 – for hosting the web application.

5.3. Implementation Approach
The statistics of the individual virtual machines and their corresponding host machines are collected by deploying a jar file, designed and compiled for this purpose, on each virtual machine. This jar file
is a Java project designed to accept the virtual machine name as a runtime argument. The project design covers these main aspects: retrieving the statistics, saving them to a log file on the individual virtual machines, saving them to the client-side database system, and retrieving them from the server-side database for visualization.

Statistics collection:

The jar file deployed on each virtual machine collects the statistics of that virtual machine and its host machine. It is a Java project with the code structured as follows:
• The project has three classes: statsLog.java, pThread.java and stats.java.
• statsLog.java: This class defines the main() function. It accepts a valid VM name as an argument and gets the corresponding host name from a dynamically created HashMap of vHosts and VMs. A thread is created using the pThread class to maintain a separate thread of operation.
• pThread.java: This thread is started in statsLog.java and holds the virtual machine and host system objects. Its run() method creates a stats.java object and passes it the virtual machine and host system objects. run() is a blocking call executed in an infinite loop, collecting the statistics at a fixed time interval.
• stats.java: This class defines the methods that collect statistics from the virtual machine (printVMdetails) and its host system (printHOSTdetails). These methods use the following prebuilt VMware classes to collect the statistics:
  o PerformanceManager: returns the performance manager object for the vCenter service instance.
o PerfProviderSummary: queries and returns the summary provider for the virtual machine object passed in.
  o PerfQuerySpec: builds the query for the corresponding metric and retrieves the metric data.
  o The following metric IDs are used to collect the statistics:
    § 2: CPU Usage
    § 6: CPU Usage in MHz
    § 24: Memory Usage
    § 29: Memory Granted
    § 33: Memory Active
    § 98: Memory Consumed
    § 125: Disk Usage
    § 130: Disk Read
    § 131: Disk Write
    § 143: Net Usage
    § 148: Net Received
    § 149: Net Transmitted
    § 155: System Uptime
    § 480: System Resources CPU Usage
o All these statistics are saved to a log file on each virtual machine.

Log Parsing:

The log file saved at a specified location on each virtual machine is shipped to MongoDB in the cloud (MongoLab) using Logstash. For this, a configuration file is created specifying:
• the input file path;
• the output path / connection to MongoDB;
• the database name and the collection name in which to store the data.

The format of the configuration file is as below:

input {
  file {
    codec => "json"
    stat_interval => 0
    path => "/home/administrator/Desktop/Stats/130_65_132_131.log"
  }
}
output {
  mongodb {
    collection => "mylogcollection"
    database => "vmwaredb"
    uri => "mongodb://Apoorva:VMstats283@ds061200.mongolab.com:61200/vmwaredb"
  }
}

Log Storage:

• MongoDB is used to collect the data from Logstash and store the received logs.
• Received logs are stored in the "mylogcollection" collection of the "vmwaredb" database.
• Remote connections are enabled on the MongoLabs instance of MongoDB.
Log Aggregation:

This is done using the T01_Analyser.java program, a multithreaded Java program that spawns three threads for 5-minute, hourly and daily aggregation. These are recurring threads that execute after a fixed, configurable interval. Once the aggregation is done, the program opens a connection to the MySQL database over JDBC and executes two insert statements per time interval (one for the VMs and one for the vHosts), updating six tables in total.

Data Storage and Retrieval:

Data storage is done using the MySQL database; the schema creation script follows:

CREATE DATABASE IF NOT EXISTS `vmware_monitoring` /*!40100 DEFAULT CHARACTER SET latin1 */;
USE `vmware_monitoring`;

DROP TABLE IF EXISTS `vHost_monitoring_5min`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `vHost_monitoring_5min` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `timestamp` varchar(255) NOT NULL,
  `name` varchar(255) DEFAULT NULL,
  `cpuUsage` decimal(11,3) DEFAULT NULL,
  `cpuUsagemhz` decimal(11,3) DEFAULT NULL,
  `memUsage` decimal(11,3) DEFAULT NULL,
  `memGranted` decimal(11,3) DEFAULT NULL,
  `memActive` decimal(11,3) DEFAULT NULL,
  `memConsumed` decimal(11,3) DEFAULT NULL,
  `diskUsage` decimal(11,3) DEFAULT NULL,
  `diskRead` decimal(11,3) DEFAULT NULL,
  `diskWrite` decimal(11,3) DEFAULT NULL,
  `netUsage` decimal(11,3) DEFAULT NULL,
  `netReceived` decimal(11,3) DEFAULT NULL,
  `netTransmitted` decimal(11,3) DEFAULT NULL,
  `sysUptime` decimal(11,3) DEFAULT NULL,
  `sysResourcesCpuUsage` decimal(11,3) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `name` (`name`)
) ENGINE=InnoDB
AUTO_INCREMENT=40 DEFAULT CHARSET=utf8;
/*!40101 SET character_set_client = @saved_cs_client */;

DROP TABLE IF EXISTS `VM_monitoring_5min`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `VM_monitoring_5min` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `timestamp` varchar(255) NOT NULL,
  `vHost_name` varchar(255) DEFAULT NULL,
  `vm_name` varchar(255) DEFAULT NULL,
  `cpuUsage` decimal(11,3) DEFAULT NULL,
  `cpuUsagemhz` decimal(11,3) DEFAULT NULL,
  `memUsage` decimal(11,3) DEFAULT NULL,
  `memGranted` decimal(11,3) DEFAULT NULL,
  `memActive` decimal(11,3) DEFAULT NULL,
  `memConsumed` decimal(11,3) DEFAULT NULL,
  `diskUsage` decimal(11,3) DEFAULT NULL,
  `diskRead` decimal(11,3) DEFAULT NULL,
  `diskWrite` decimal(11,3) DEFAULT NULL,
  `netUsage` decimal(11,3) DEFAULT NULL,
  `netReceived` decimal(11,3) DEFAULT NULL,
  `netTransmitted` decimal(11,3) DEFAULT NULL,
  `sysUptime` decimal(11,3) DEFAULT NULL,
  `sysResourcesCpuUsage` decimal(11,3) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `vm_name` (`vm_name`)
) ENGINE=InnoDB AUTO_INCREMENT=40 DEFAULT CHARSET=utf8;
/*!40101 SET character_set_client = @saved_cs_client */;

DROP TABLE IF EXISTS `vHost_monitoring_1hour`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `vHost_monitoring_1hour` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `timestamp` varchar(255) NOT NULL,
  `name` varchar(255) DEFAULT NULL,
  `cpuUsage` decimal(11,3) DEFAULT NULL,
  `cpuUsagemhz` decimal(11,3) DEFAULT NULL,
  `memUsage` decimal(11,3) DEFAULT NULL,
  `memGranted` decimal(11,3) DEFAULT NULL,
  `memActive` decimal(11,3) DEFAULT NULL,
  `memConsumed` decimal(11,3) DEFAULT NULL,
  `diskUsage` decimal(11,3) DEFAULT NULL,
  `diskRead` decimal(11,3) DEFAULT NULL,
  `diskWrite` decimal(11,3) DEFAULT NULL,
  `netUsage` decimal(11,3) DEFAULT NULL,
  `netReceived` decimal(11,3) DEFAULT NULL,
  `netTransmitted` decimal(11,3) DEFAULT NULL,
  `sysUptime` decimal(11,3) DEFAULT NULL,
  `sysResourcesCpuUsage` decimal(11,3) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `name` (`name`)
) ENGINE=InnoDB AUTO_INCREMENT=40 DEFAULT CHARSET=utf8;
/*!40101 SET character_set_client = @saved_cs_client */;

DROP TABLE IF EXISTS `VM_monitoring_1hour`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `VM_monitoring_1hour` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `timestamp` varchar(255) NOT NULL,
  `vHost_name` varchar(255) DEFAULT NULL,
  `vm_name` varchar(255) DEFAULT NULL,
  `cpuUsage` decimal(11,3) DEFAULT NULL,
  `cpuUsagemhz` decimal(11,3) DEFAULT NULL,
  `memUsage` decimal(11,3) DEFAULT NULL,
  `memGranted` decimal(11,3) DEFAULT NULL,
  `memActive` decimal(11,3) DEFAULT NULL,
  `memConsumed` decimal(11,3) DEFAULT NULL,
  `diskUsage` decimal(11,3) DEFAULT NULL,
  `diskRead` decimal(11,3) DEFAULT NULL,
  `diskWrite` decimal(11,3) DEFAULT NULL,
  `netUsage` decimal(11,3) DEFAULT NULL,
  `netReceived` decimal(11,3) DEFAULT NULL,
  `netTransmitted` decimal(11,3) DEFAULT NULL,
  `sysUptime` decimal(11,3) DEFAULT NULL,
  `sysResourcesCpuUsage` decimal(11,3) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `vm_name` (`vm_name`)
) ENGINE=InnoDB AUTO_INCREMENT=40 DEFAULT CHARSET=utf8;
/*!40101 SET character_set_client = @saved_cs_client */;

DROP TABLE IF EXISTS `vHost_monitoring_24hour`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `vHost_monitoring_24hour` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `timestamp` varchar(255) NOT NULL,
  `name` varchar(255) DEFAULT NULL,
  `cpuUsage` decimal(11,3) DEFAULT NULL,
  `cpuUsagemhz` decimal(11,3) DEFAULT NULL,
  `memUsage` decimal(11,3) DEFAULT NULL,
  `memGranted` decimal(11,3) DEFAULT NULL,
  `memActive` decimal(11,3) DEFAULT NULL,
  `memConsumed` decimal(11,3) DEFAULT NULL,
  `diskUsage` decimal(11,3) DEFAULT NULL,
  `diskRead` decimal(11,3) DEFAULT NULL,
  `diskWrite` decimal(11,3) DEFAULT NULL,
  `netUsage` decimal(11,3) DEFAULT NULL,
  `netReceived` decimal(11,3) DEFAULT NULL,
  `netTransmitted` decimal(11,3) DEFAULT NULL,
  `sysUptime` decimal(11,3) DEFAULT NULL,
  `sysResourcesCpuUsage` decimal(11,3) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `name` (`name`)
) ENGINE=InnoDB AUTO_INCREMENT=40 DEFAULT CHARSET=utf8;
/*!40101 SET character_set_client = @saved_cs_client */;

DROP TABLE IF EXISTS `VM_monitoring_24hour`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `VM_monitoring_24hour` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `timestamp` varchar(255) NOT NULL,
  `vHost_name` varchar(255) DEFAULT NULL,
  `vm_name` varchar(255) DEFAULT NULL,
  `cpuUsage` decimal(11,3) DEFAULT NULL,
  `cpuUsagemhz` decimal(11,3) DEFAULT NULL,
  `memUsage` decimal(11,3) DEFAULT NULL,
  `memGranted` decimal(11,3) DEFAULT NULL,
  `memActive` decimal(11,3) DEFAULT NULL,
  `memConsumed` decimal(11,3) DEFAULT NULL,
  `diskUsage` decimal(11,3) DEFAULT NULL,
  `diskRead` decimal(11,3) DEFAULT NULL,
  `diskWrite` decimal(11,3) DEFAULT NULL,
  `netUsage` decimal(11,3) DEFAULT NULL,
  `netReceived` decimal(11,3) DEFAULT NULL,
  `netTransmitted` decimal(11,3) DEFAULT NULL,
  `sysUptime` decimal(11,3) DEFAULT NULL,
  `sysResourcesCpuUsage` decimal(11,3) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `vm_name` (`vm_name`)
) ENGINE=InnoDB AUTO_INCREMENT=40 DEFAULT CHARSET=utf8;
/*!40101 SET character_set_client = @saved_cs_client */;

For retrieving the data, PHP scripts were created. These scripts define the data model for returning data to the UI when a web service request is initiated, and they are coupled with the MySQL database to allow easy connection and access to the data.

User Interface (UI):

The UI is implemented using Bootstrap, a JavaScript framework; we call the dashboard VMware DevOps. It makes REST calls to retrieve the data and render it on the screen. HighCharts and Data Tables are used for visualizing the data, and the Google Maps APIs are used to determine the location of a server based on its IP. The stats are updated automatically every 5 minutes.
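To make the analyzer's write path into the tables above concrete, here is a hedged sketch of how the parameterized INSERT statement could be built; the helper class and method names are illustrative, and executing the statement would need a real java.sql.Connection (URL and credentials omitted).

```java
// Builds a parameterized INSERT for one of the monitoring tables above.
public class InsertBuilder {
    public static String buildInsert(String table, String[] columns) {
        StringBuilder sql = new StringBuilder("INSERT INTO " + table + " (");
        StringBuilder marks = new StringBuilder();
        for (int i = 0; i < columns.length; i++) {
            sql.append(columns[i]);
            marks.append("?");
            if (i < columns.length - 1) { sql.append(", "); marks.append(", "); }
        }
        return sql.append(") VALUES (").append(marks).append(")").toString();
    }

    public static void main(String[] args) {
        String sql = buildInsert("vHost_monitoring_5min",
                new String[]{"timestamp", "name", "cpuUsage"});
        System.out.println(sql);
        // With a live JDBC connection, the analyzer would then do roughly:
        //   PreparedStatement ps = conn.prepareStatement(sql);
        //   ps.setString(1, ts); ps.setString(2, host); ps.setDouble(3, cpu);
        //   ps.executeUpdate();
    }
}
```

Using placeholders (`?`) rather than string concatenation keeps the inserts safe against malformed values in the raw log data.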
Part 1 – DRS

Initially, we set the monitoring interval, create the performance query specification, and then loop over the samples received from the performance manager. Here is the sample code for it.

After obtaining the CPU usage of the vHosts, DRS is started in order to balance the load. DRS then checks the load of the vHosts and starts cloning virtual machines as per the load-balancing requirements. The sample code for it is here.
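The placement decision itself can be sketched independently of the vSphere API calls (a hedged sketch; the host names and usage readings are illustrative): the new VM is placed on the vHost with the lowest CPU usage.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative DRS initial-placement logic: pick the least-loaded vHost.
public class PlacementSketch {
    // Returns the name of the vHost with the lowest CPU usage (%).
    public static String leastLoaded(Map<String, Double> cpuByHost) {
        String best = null;
        double bestUsage = Double.MAX_VALUE;
        for (Map.Entry<String, Double> e : cpuByHost.entrySet()) {
            if (e.getValue() < bestUsage) {
                bestUsage = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, Double> usage = new LinkedHashMap<>();
        usage.put("130.65.132.131", 72.5); // illustrative readings
        usage.put("130.65.132.132", 34.0);
        // The new VM would be cloned onto the least-loaded host.
        System.out.println(leastLoaded(usage)); // prints 130.65.132.132
    }
}
```

In the real program, the usage map would be filled from PerfQuerySpec results before this selection runs.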
The implementation of the second part of the DRS algorithm can be shown using the screenshots below.

When the program runs against the vCenter using the given credentials, it adds a new vHost from the admin console using the SSL thumbprint of the host to be added.
When a new host is added, DRS checks the CPU usage of each vHost, and the virtual machines under the host with the highest CPU usage are migrated to the new vHost. Metrics are collected for every vHost to get the CPU usage. As the host 130.65.132.131 has the highest CPU usage, the virtual machine Ubuntu02 is migrated to the new host 130.65.132.133.

While migrating virtual machines to the new host, a few conditions are checked:
1. If powered on: live-migrate the virtual machine.
2. If powered off: cold-migrate the virtual machine.

The lower threshold for host CPU usage is set to 30%; if a host's usage is below 30%, migration does not take place. Sample code showing the 30% threshold for CPU usage is as follows:
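Since the code screenshot did not survive extraction, here is a hedged reconstruction of the decision logic; the 30% constant and the lone-VM rule come from the text, while the class and method names are illustrative.

```java
// Illustrative DRS migration check: threshold and lone-VM rules from the report.
public class DrsThreshold {
    static final double CPU_LOWER_THRESHOLD = 30.0; // percent

    // Migration is considered only if the source host is above the threshold
    // and still has more than one VM (a lone VM is never migrated).
    public static boolean shouldMigrateFrom(double hostCpuUsage, int vmCount) {
        return hostCpuUsage > CPU_LOWER_THRESHOLD && vmCount > 1;
    }

    public static void main(String[] args) {
        System.out.println(shouldMigrateFrom(65.0, 2)); // true
        System.out.println(shouldMigrateFrom(25.0, 2)); // false: below threshold
        System.out.println(shouldMigrateFrom(80.0, 1)); // false: only one VM
    }
}
```

The actual migration call (live vs. cold, depending on power state) would follow once this check passes.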
When a vHost has only one virtual machine, that virtual machine is not migrated. The screenshot below shows that the virtual machine with the highest CPU usage under the second vHost is migrated successfully.

Thus, load balancing of the vHosts is achieved using Distributed Resource Scheduling.

Part 1 – DPM

Initially, DPM checks all the vHosts, and if any vHost has no VMs under it, that vHost is deleted. It then determines the CPU usage of all the vHosts. Here is the sample code for it:

Whenever a new VM is added, it checks the CPU usage of all vHosts at that point in time.
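A hedged sketch of the empty-host check described above (the VM counts are illustrative; the real code enumerates VMs and removes hosts through the vSphere API):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative DPM cleanup: find vHosts with no VMs, which are candidates
// for deletion/power-down.
public class DpmCleanupSketch {
    public static List<String> emptyHosts(Map<String, Integer> vmCountByHost) {
        List<String> empty = new ArrayList<>();
        for (Map.Entry<String, Integer> e : vmCountByHost.entrySet())
            if (e.getValue() == 0) empty.add(e.getKey());
        return empty;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("130.65.132.131", 2);
        counts.put("130.65.132.132", 2);
        counts.put("130.65.132.133", 0); // illustrative: the idle host
        System.out.println(emptyHosts(counts)); // prints [130.65.132.133]
    }
}
```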
It then sorts the vHosts from lowest to highest CPU usage. All the VMs under vHosts with less than 30% CPU usage are migrated to another vHost, and the old vHost is deleted. The sample code for it is here.
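The consolidation step can be sketched as follows; this is a hedged illustration (host names, usages and the choice of the busiest host as the evacuation target are assumptions), while the real code drives vMotion through the vSphere API:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative DPM consolidation: sort hosts by CPU usage ascending and plan
// to evacuate every host under the 30% threshold onto the busiest host.
public class DpmConsolidateSketch {
    static final double THRESHOLD = 30.0; // percent, from the report

    // Returns a source -> target migration plan.
    public static Map<String, String> plan(Map<String, Double> cpuByHost) {
        List<Map.Entry<String, Double>> hosts = new ArrayList<>(cpuByHost.entrySet());
        hosts.sort(Map.Entry.comparingByValue());             // ascending CPU usage
        String target = hosts.get(hosts.size() - 1).getKey(); // busiest host stays up
        Map<String, String> moves = new LinkedHashMap<>();
        for (Map.Entry<String, Double> h : hosts)
            if (h.getValue() < THRESHOLD && !h.getKey().equals(target))
                moves.put(h.getKey(), target); // evacuate, then delete the host
        return moves;
    }

    public static void main(String[] args) {
        Map<String, Double> usage = new LinkedHashMap<>();
        usage.put("130.65.132.131", 55.0);
        usage.put("130.65.132.132", 12.0); // illustrative: under-utilized host
        System.out.println(plan(usage));   // prints {130.65.132.132=130.65.132.131}
    }
}
```

Once every source host in the plan has been evacuated, it holds no VMs and falls to the empty-host deletion step described earlier.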
6. ASSUMPTIONS:

Part 1 – DRS:
The assumptions made to implement DRS Part 1 are:
• Initially, vCenter has two hosts, each with two virtual machines:
  o vHost 1: 130.65.132.131
    § VM1: T01-VM01-Ubuntu01
    § VM2: T01-VM01-Ubuntu02
  o vHost 2: 130.65.132.132
    § VM1: T01-VM01-Ubuntu03
    § VM2: T01-VM01-Ubuntu04
• vMotion is enabled for all the vHosts, and VMware Tools is installed on each VM.
• Prime95 is running on each virtual machine.

Part 1 – DPM:
• The minimum CPU-usage threshold is assumed to be 30%.
• The number of vHosts is three.
• There are no VMs under the third vHost.

Part 2:
The solution was built in a controlled environment consisting of 2 virtual machines and 2 hosts. The following assumptions were taken into consideration while building this solution:

1. Only virtual machines in the powered-on state are considered for log monitoring and analysis, since the Agent Collector runs on the virtual machines to capture the performance data.
2. The virtual machines are packaged with the Logstash and agent-collector (.jar) installation files; these are executed using shell scripts, to prevent a single point of failure.
3. The interval for collecting performance data is set to 20 seconds, since this is a project requirement and it avoids excessive I/O on the log files.
4. The Aggregator program is executed from a physical machine and is considered robust and redundant enough to withstand failure.
5. To simulate stress on the virtual machines, programs like Prime95 and Folding@home are used.
6. For storing raw logs, the MongoLabs cloud infrastructure is used; it is considered highly scalable and elastic.
7. The UI should be dynamic, and the visualizations illustrating the performance statistics should be clear and crisp.

7. LIMITATIONS

Part 1:
• The current implementation is limited to a controlled environment, so the real execution of DRS and DPM in industry cannot be studied.
• The algorithm does not take performance metrics other than CPU usage into consideration.

Part 2:
Following are the limitations of the current solution architecture:
1. The Agent Collector runs on the virtual machines capturing the performance metrics. This leads to high CPU utilization and slows down the other tasks running on the VM.
2. The performance metrics captured in this implementation are pre-defined; there is no way to dynamically add new performance-metric IDs.
3. Using MongoDB-as-a-Service (MongoLabs) acts as a black box in terms of scaling and security. Even though authentication keys and credentials are used, there is no visibility into where the data is being stored.
4. In case of Logstash failure, no captured data (in the log files) is passed to the MongoDB database server for storage.

8. FUTURE WORK AND ITS EXTENSION

Part 1:
The DRS algorithm can be implemented across many vHosts and the virtual machines running under them, so as to simulate a real industry environment. In addition, the behavior of DRS and DPM execution can be monitored and triggered using a remote UI.
Part 2:
This project can be scaled into a complete log-analyzing and visualization enterprise solution. Listed below are some extensions for the project:
1. Providing the capability to analyze logs per VM using the User Interface (UI).
2. Dynamic or easy onboarding of new performance metrics.
3. Providing a feature to connect to vCenter through the UI and get all the performance metrics of the corresponding vHosts and VMs dynamically.
4. The project can also be extended to provide monitoring and visualization capabilities for other virtualization players, e.g. Hyper-V, Xen, etc.

9. INDIVIDUAL CONTRIBUTION

Akshay Wattal:
• Overall design of the system framework and architecture
• Performed PoC for different architectural components
• Implementation of the visualization framework
• Integration and system testing
• Agent Analyzer implementation
• Documentation

Apoorva Gouni:
• Overall design of the system framework and architecture
• Agent Collector implementation
• Configuration of Logstash in the VMs
• MongoLabs connection
• Unit testing
• Documentation

Gopika Gogineni:
• Overall design of the system framework and architecture
• DPM implementation
• Documentation

Pratyusha Mandapati:
• Overall design of the system framework and architecture
• DRS – create-VM scenario
• Documentation
10. INSTALLATION AND EXECUTION MANUAL

• Java installation: Java is installed on each virtual machine. Below are the steps followed to install Java:
  o Download and save jdk-8u25-linux-i586.tar.gz
  o Switch to the directory where you saved the file
  o Uncompress the file
  o Launch Java

• Logstash 1.4.2 installation: Logstash is installed on each virtual machine. Below are the steps followed to install Logstash:
  o Download the Logstash tar file
  o Extract the Logstash tar file
  o Launch Logstash
  o Download logstash-contrib-1.4.2
  o Extract logstash-contrib-1.4.2
  o Run the contrib plug-in

Execution: The following commands should be executed at the command prompt on each virtual machine.

java -jar stats.jar VM_name : This starts collecting the statistics and saving them to the log file at the specified location. Here "stats.jar" is the name of the jar file and VM_name is the name of the virtual machine on which you are running the jar file.
bin/logstash agent -f stats.conf : This command runs Logstash and ships the log from the virtual machine to MongoDB in the cloud. Here "stats.conf" is the name of the configuration file.

• MongoLabs configuration: The detailed steps are provided at http://docs.mongolab.com/

• MySQL installation: It is installed on the physical machine running Mac OS, using a third-party package manager for Mac called Homebrew:

brew install mysql

Execution: To start the MySQL service, run the following command – mysql.server restart

• PHP installation: It is installed on the physical machine running Mac OS, also using Homebrew. Following is the command for it (detailed link: https://github.com/Homebrew/homebrew-php):

brew install php56

• Apache HTTP Server installation and execution: It is built into Mac OS. To start the Apache server, first deploy all the PHP and UI files (html, css, js) into the document root folder. Next, execute the following command to start the service:

./apachectl start

To stop the server, run ./apachectl stop
11. UNIT, SYSTEM AND INTEGRATION TESTING

The test cases were executed in a controlled environment consisting of 2 vHosts and 2 virtual machines:

Test case 1
• Description: When a virtual machine is powered on, it should return the virtual machine name and its corresponding host name, and the performance statistics of both the virtual machine and its host should be saved to a log file at the specified location.
• Expected result: The statistics of the vHosts and the VMs are successfully fetched.
• Actual result: Pass. Successfully printed the virtual machine name and its corresponding vHost name, and saved the performance metrics of both to the log file at the specified location.
• Details: Screenshot Test-1.

Test case 2
• Description: Check the connection to MongoDB and the storage of data in MongoDB by running the Logstash command; data from the specified input path should be shipped to MongoDB.
• Expected result: A connection is successfully established with MongoDB, and data is shipped from the specified location and saved to MongoDB.
• Actual result: Pass.
• Details: Screenshot Test-2 shows the expected output.

Test case 3
• Description: Test remote connectivity with the MongoLab server so that the Aggregator program can connect remotely.
• Expected result: The remote connection succeeds (verified with the Robomongo tool).
• Actual result: Pass.
• Details: Screenshot Test-3 shows the expected output.
Test case 4
• Description: Ensure that the Aggregator is able to connect to MongoDB and insert aggregated statistics into the MySQL database.
• Expected result: The data from MongoDB is aggregated and pushed into MySQL.
• Actual result: Pass.
• Details: Screenshot Test-4 shows the data aggregated into the different MySQL tables corresponding to each timeframe.

Test case 5 – Visualization

Test case 5.1
• Description: Test whether the UI REST web services are able to connect to the deployed PHP scripts and get back the data.
• Expected result: The VMware DevOps dashboard loads without errors, and data is fetched for the charts.
• Actual result: Pass.
• Details: Screenshot Test-5.1 shows the dashboard loading successfully without errors.

Test case 5.2
• Description: Test data consistency between the MySQL data and the data populated into the visualization widgets.
• Expected result: The data points on the charts and tables are correct and as expected.
• Actual result: Pass.
• Details: Screenshot Test-5.2 shows correct, matching data loaded on the UI.

Test case 5.3
• Description: Check whether the graphs and data tables auto-refresh after every fixed interval of time.
• Expected result: The values in the graphs are updated after the configured time interval.
• Actual result: Pass.
• Details: Screenshot Test-5.3 shows that the graphs are auto-refreshed.
Testing Screenshots:

Test-1:

Test-2:

Test-3:

Test-4:

Test-5.1:

Test-5.2:

Test-5.3:
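Test-4 verifies that the Aggregator rolls the raw MongoDB samples up into per-timeframe MySQL tables. The core bucketing logic can be sketched in Python as follows; this is an illustrative reconstruction, not the project's actual Aggregator code, and the minute/hour/day granularities mirror the three timelines used in the dashboards.

```python
from collections import defaultdict
from datetime import datetime

def aggregate_cpu(samples, granularity="hour"):
    """Average CPU usage per time bucket.

    samples: list of (datetime, cpu_usage_percent) tuples.
    granularity: "minute", "hour", or "day" (the three dashboard timelines).
    Returns a dict mapping bucket label -> average value.
    """
    fmt = {
        "minute": "%Y-%m-%d %H:%M",
        "hour": "%Y-%m-%d %H",
        "day": "%Y-%m-%d",
    }[granularity]
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[ts.strftime(fmt)].append(value)
    return {label: sum(vals) / len(vals) for label, vals in buckets.items()}
```

Each resulting bucket corresponds to one row in the matching per-timeframe MySQL table.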
12. SCREENSHOTS

Part 1:

DRS1, Initial placement:

Running Prime95 to increase CPU utilization:

Comparison of vHost CPU utilization for initial placement and selection of a vHost:
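The selection step shown above, comparing the CPU utilization of the candidate vHosts and placing the new VM on the least-loaded one, can be sketched as below. This is an illustrative reconstruction of the decision rule, not the project's exact code.

```python
def select_host(host_cpu):
    """Pick the vHost for initial placement.

    host_cpu: dict mapping vHost name -> current CPU utilization (percent).
    Returns the name of the least-loaded vHost.
    """
    if not host_cpu:
        raise ValueError("no candidate vHosts available")
    return min(host_cpu, key=host_cpu.get)
```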
Virtual machine added to the selected vHost:

DRS2, Load balancing:

Adding a new vHost, 130.65.132.133, in the vCenter.
Collecting the metrics for vHosts and VMs and displaying the CPU usage, including the timestamp.
Checks performed for live and cold migration, and migrating the virtual machine T01-VM01-Ubuntu02 to the new vHost.

VM migrated successfully.
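The migration checks above can be summarized in a small sketch: a host is a migration source when its CPU usage exceeds a threshold, and the migration is live (vMotion-style) only for a powered-on VM with shared storage, otherwise cold. The power-state strings mirror vSphere's "poweredOn"/"poweredOff" values; the 75% threshold is an assumption for illustration.

```python
def should_migrate(host_cpu, threshold=75.0):
    """Decide whether a vHost is overloaded and should shed a VM.

    host_cpu: current CPU utilization of the source vHost (percent).
    threshold: hypothetical overload watermark.
    """
    return host_cpu > threshold

def migration_type(vm_power_state, shared_storage=True):
    """Choose between live and cold migration.

    Live migration requires a powered-on VM and shared storage between
    the source and destination vHosts; otherwise fall back to cold.
    """
    if vm_power_state == "poweredOn" and shared_storage:
        return "live"
    return "cold"
```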
Checks performed on the second vHost, 130.65.132.132, to migrate the virtual machine T01-VM01-Ubuntu03 to the new vHost.

DPM (Distributed Power Management):

The vHost 130.65.132.133 is deleted as it has no VMs.
Below is the code execution of the above deletion procedure.

Migrating VM T01-VM01-Ubuntu03 from 130.65.132.132 to 130.65.132.131.
Below is the code execution of the above migration procedure.

Migrating VM T01-VM01-Ubuntu04 from 130.65.132.132 to 130.65.132.131.
Below is the code execution of the above migration procedure.

Deleting the vHost 130.65.132.132, since no VMs are left on it.
Below is the code execution of the above deletion procedure.

At the end of DPM, only one vHost with four VMs remains.
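The DPM sequence above, evacuating the VMs off lightly loaded vHosts and then removing those hosts, can be sketched as a single consolidation pass. The policy shown here (pick the busiest host as the consolidation target and drain every host below a low watermark) is an assumption for illustration, not the project's exact algorithm.

```python
def dpm_consolidate(hosts, low_watermark=25.0):
    """One DPM consolidation pass.

    hosts: dict mapping vHost name -> {"cpu": percent, "vms": [vm names]}.
    Returns (migrations, powered_off) where migrations is a list of
    (vm, source_host, target_host) tuples and powered_off is the list of
    vHosts that can be removed once their VMs are evacuated.
    """
    # Hypothetical policy: consolidate onto the busiest remaining host.
    target = max(hosts, key=lambda h: hosts[h]["cpu"])
    migrations, powered_off = [], []
    for host, info in hosts.items():
        if host != target and info["cpu"] < low_watermark:
            for vm in info["vms"]:
                migrations.append((vm, host, target))
            powered_off.append(host)
    return migrations, powered_off
```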
Part 2:

Note: Screenshots of running the Collector, the generated log file, the Logstash execution, MongoDB storage, and aggregation are already captured in the Testing section above; they are not repeated here to limit the length of the document.

Visualization

Below are screenshots of the VMware DevOps dashboard that we created.

This is the VMware vHost Top Statistics dashboard. Its primary purpose is to show the top 3 defaulter vHosts that are consuming the most resources in terms of CPU, memory, disk I/O, and usage.
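The Top-3 defaulters view above amounts to an n-largest selection over the latest per-host value of each metric. A minimal sketch (the host names and metric values are placeholders for illustration):

```python
import heapq

def top_defaulters(usage, n=3):
    """Return the n hosts (or VMs) with the highest resource usage.

    usage: dict mapping host/VM name -> latest metric value
           (e.g. CPU percent, memory, or disk I/O).
    """
    return heapq.nlargest(n, usage, key=usage.get)
```

The same helper serves both the vHost and the VM Top Statistics dashboards, run once per metric.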
This is the VMware VMs Top Statistics dashboard. Its primary purpose is to show the top 3 defaulter virtual machines that are consuming the most resources in terms of CPU, memory, disk I/O, and usage.

Next, from the left-side panel we can expand VMware Inventory and drill down to each ESXi host or VM. The following screenshot shows the overview of a monitored vHost.
CPU overview of vHosts: it consists of two charts, each with three timelines (by minute, hour, and day).

Memory overview of vHosts: it consists of four charts, each with three timelines (by minute, hour, and day).

Disk I/O overview of vHosts: it consists of three charts, each with three timelines (by minute, hour, and day).

Network overview of vHosts: it consists of three charts, each with three timelines (by minute, hour, and day).

And finally, the system overview of vHosts: it consists of three charts, each with three timelines (by minute, hour, and day).

Next, from the left-side panel we can expand VMware Inventory -> Virtual Machines. The first page is the VMs overview, providing quick stats.
CPU overview of a VM: it consists of two charts, each with three timelines (by minute, hour, and day).

Memory overview of a VM: it consists of four charts, each with three timelines (by minute, hour, and day).

Disk I/O overview of a VM: it consists of three charts, each with three timelines (by minute, hour, and day). Note that to compare VMs, or to view only one VM's statistics, we can simply enable/disable them from the chart legend as shown below.

Network overview of a VM: it consists of three charts, each with three timelines (by minute, hour, and day).

And finally, the system overview of a VM: it consists of three charts, each with three timelines (by minute, hour, and day).

Next, to purge the MongoDB collections, we can simply use the UI to clear all records. An extension of this could be to delete records based on timestamp.
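The proposed timestamp-based purge could be implemented by issuing a delete with a retention cutoff. The sketch below only builds the MongoDB filter document; the field name "timestamp" is an assumption about the stored schema, and the filter would be passed to a driver call such as a collection-level delete.

```python
from datetime import datetime, timedelta

def purge_filter(retention_days, now=None):
    """Build a MongoDB filter matching records older than the retention window.

    retention_days: number of days of statistics to keep.
    now: injectable current time (defaults to UTC now).
    The "timestamp" field name is a hypothetical schema assumption.
    """
    now = now or datetime.utcnow()
    cutoff = now - timedelta(days=retention_days)
    return {"timestamp": {"$lt": cutoff}}
```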
The last feature of this dashboard is that you can find the location of any server using its IP address.

13. CONCLUSION:

1. Migrating the VMs in real time as part of DPM, while maintaining synchronization with the MySQL tables at the same time, was highly challenging.
2. Generating the graphs dynamically in real time made the response slow.
3. Configuring Logstash was very challenging; the difficult part lies in determining the output formats by making use of plug-ins.
4. We had no hands-on experience with creating new resource pools and managing them within our vCenter in the initial labs, so we found that task challenging.
5. As the lab infrastructure provides each team with limited storage, we had to constantly monitor that our machines did not run out of space, purging the log files that had been collected.
6. It took us a considerable amount of time just to sort out which statistics were required to be collected per the project requirements.
7. Designing an algorithm best suited to implementing DRS and DPM was really challenging and took a considerable amount of time. Once we had figured it out, it was easy for us to implement DRS and DPM.

14. REFERENCES:

1. http://logstash.net
2. CMPE 283 Project-2 Description PDF
3. https://mongolab.com/welcome/
4. http://php.net/manual/en/ref.pdo-mysql.php
5. http://startbootstrap.com/