MEASURING AND OPTIMIZING PERFORMANCE OF CLUSTER AND PRIVATE CLOUD APPLICATIONS BY USING PPA

MULTICOREWARE INC
LIHUA.ZHANG
HUI.HUANG
ANDY.ZHENG
  
	
  
Introduction to MCW PPA™ For Cluster
A tracing tool that targets distributed systems.
• Collects instrumented data and hardware measurements in a distributed fashion within a tracing infrastructure.
• Provides visualizations with intuitive graphs/Gantt charts and generates statistical reports intended for identifying critical paths.
• Performs offline analysis that aids in understanding the target system's behavior and reasoning about performance issues.
• PPA product series
  ‒ PPA For Cluster
  ‒ PPA Workstation Edition
  ‒ PPA For Android

3 | PRESENTATION TITLE | DECEMBER 4, 2013 | CONFIDENTIAL

Main Features
• Low overhead
  ‒ Negligible performance impact on running applications, achieved by relying on the PPA runtime library. This is especially useful for highly optimized, performance-sensitive code.
• Instrumentation at the application level
  ‒ The PPA runtime library provides APIs for measuring code. Hardware measurement is transparent to developers, and the PPA instrumentation can easily be stripped out by turning on a disable option.
  ‒ Auto-instrumentation of binaries available soon.
• Scalability
  ‒ The tool can be extended to profile clusters of various scales (currently up to 4000 nodes) and services (e.g. Hadoop). This benefits from PPA's distributed data repositories, big-data processing, buffered views of visualizations, etc.
  ‒ PPA Profiler can be extended to support HW vendor-specific features.

The Highlights
• Profiler and performance analyzer
  ‒ Low overhead (almost no cost if no profiling capture is enabled)
  ‒ CPU & GPU activity traces
  ‒ Hardware utilization measurement
  ‒ HW vendor-specific support
  ‒ Time-based views and statistical analysis / reports
  ‒ Multi-core profiling at process/thread and source-code level
  ‒ Good data organization in intuitive colour schemes
• Big data support
  ‒ Storage
  ‒ Smooth visualization
• System-wide critical path identification
  ‒ Correlate hardware utilization and CPU events in the same timeline
  ‒ Cluster-wide global clock synchronization
  ‒ Multi-views for sessions from different nodes in the same timeline
  ‒ Runtime monitors
• Customizable for specific applications, e.g. Hadoop
	
  

Developer Library Overview
• C/C++ SDK
  ‒ Already used in numerous OpenCL™ applications
• Java Support
  ‒ Java bindings for OpenCL™ applications
• Thread-safe
• Low overhead if no capture
• Transparent for OpenCL instrumentation
  ‒ Timing OpenCL APIs
  ‒ Timing kernels & data transfers: start/submit/queue/complete
  ‒ Visualize construction of dependence graph between kernels & data transfers
  ‒ Exclusive sub-kernel support for AMD GFX cards

C/C++: Provides a friendly interface (ppaAPI.h) for the C/C++ developer.
	
  


JAVA: Provides a friendly interface (JPPA.jar) for the Java developer.
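
The "start/submit/queue/complete" timing of kernels and data transfers appears to correspond to the four timestamps OpenCL exposes through `clGetEventProfilingInfo` (`CL_PROFILING_COMMAND_QUEUED`, `CL_PROFILING_COMMAND_SUBMIT`, `CL_PROFILING_COMMAND_START`, `CL_PROFILING_COMMAND_END`) on a queue created with `CL_QUEUE_PROFILING_ENABLE`. That mapping is an assumption, since the slides do not show how PPA collects these values. The sketch below takes four such nanosecond timestamps and splits a command's lifetime into phases; the class and field names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommandTimestamps:
    """Nanosecond timestamps for one kernel launch or data transfer."""
    queued_ns: int   # command enqueued by the host
    submit_ns: int   # command submitted to the device
    start_ns: int    # execution started on the device
    end_ns: int      # execution finished


def phase_durations(ts):
    """Break a command's lifetime into waiting and execution phases."""
    return {
        "queue_wait_ns": ts.submit_ns - ts.queued_ns,
        "submit_wait_ns": ts.start_ns - ts.submit_ns,
        "execution_ns": ts.end_ns - ts.start_ns,
        "total_ns": ts.end_ns - ts.queued_ns,
    }


# Usage with made-up timestamps.
phases = phase_durations(CommandTimestamps(100, 150, 400, 1000))
```

Separating the phases this way helps distinguish a slow kernel (large `execution_ns`) from a command that spent most of its time waiting behind others in the queue.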
	
  
System Overview
• Distributed repositories for trace data
• Distributed post-processing to minimize overhead
• Powerful visualization engine
• Scalability to any scale of cluster system

[Figure: layered architecture diagram. Layers: Presentation layer, UI Logic layer, Network layer, Profiler Logic layer, Data layer. Component labels: Graphics Rendering; Raw Data Post Processing; Data serialize for Presentation; Communication Framework; Data Transfer; Fault-tolerant; Synchronization and heartbeat etc.; Profiler Control (Start/Stop etc.); Data collecting by PPA Profiler; Other profiler logic; Processed Data Repository; Raw Data Repository.]

Getting Started
• Install PPA Clients and PPA Server on the target platforms
  ‒ Deploy PPA Clients by scripts
  ‒ CLI support for capture
  ‒ The PPA Server generally runs on the master node
• Set up capture options
  ‒ Node IP, communication port…
  ‒ Optionally select nodes to profile
  ‒ Optionally enable CPU Event filters
  ‒ Optionally enable CPU Event merge
  ‒ Hardware measurement is enabled by default
• Collect data and analysis reports
• Operate views

Summary View
• Helps find problematic nodes or unbalanced loads.
• Shows the differences between runs.

[Figure: Summary View screenshot; callouts: Multistage Table, Bar Charts]
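
One simple way to surface problematic nodes or unbalanced loads from per-node summary data is to compare each node against the cluster median. The slides do not describe the criterion PPA's Summary View uses, so the metric (busy time per node), the 25% tolerance, and the function name below are all assumptions made for illustration.

```python
from statistics import median


def flag_outlier_nodes(busy_time_by_node, tolerance=0.25):
    """Return the nodes whose busy time deviates from the median by more than
    `tolerance` (as a fraction of the median), sorted by node name."""
    mid = median(busy_time_by_node.values())
    if mid == 0:
        return []  # nothing ran anywhere; no meaningful baseline
    return sorted(
        node
        for node, busy in busy_time_by_node.items()
        if abs(busy - mid) / mid > tolerance
    )


# Usage with made-up per-node busy times (seconds).
suspects = flag_outlier_nodes({"n1": 100, "n2": 105, "n3": 98, "n4": 180})
```

The median is used instead of the mean so that a single overloaded node does not shift the baseline it is being compared against.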

The Sharp Utility: Timeline View
• Correlate CPU Events to HW performance in analysis

[Figure: Timeline View screenshot; callouts: Monitoring application's behaviour; Session and its node list; Monitoring hardware behavior]


[Figure callout: Zoom in/out from hour to ns resolutions]

Profiling Data
• CPU Events Level
  ‒ Thread
  ‒ Name
  ‒ Core mitigation
  ‒ Timing
• OpenCL traces
• Hardware counters
  ‒ % CPU Usage
  ‒ Memory Usage
  ‒ Bytes read/written on disk
  ‒ Bytes in/out on the network
  ‒ Cache hit/miss
• Statistics
  ‒ Process/Thread involved
  ‒ # of total CPU Events
  ‒ # of the same CPU Events
  ‒ Min/Max/Average for each
Timeline View for CPU Events
• Process-thread-event data
  ‒ Identify the problematic process/thread/event
  ‒ Show dependencies
  ‒ Show parent & child relationships
  ‒ Frames analyzer for frame-based programs

[Figure: CPU Events timeline screenshot; callouts: Expand process; Expand thread]

Timeline View for HW Measurement
• Aggregate performance data
• Per-core data

[Figure: HW measurement timeline screenshot; callouts: When is the critical throughput on disk?; Abnormal load of the network?; When is the CPU usage very low or high?]

Where Multi-views Help Optimization
• Identify node's abnormal behavior
• Difference/relations between nodes
• Job scheduler matters
  

Hadoop with PPA on AWS as Demo
• Overview of the tracing infrastructure
  

Setup AWS EC2 instance
• 16 Hadoop nodes (dual core node with 7.5GB memory)
• 4GB Hadoop Terasort Workload
• > 1.2 GB PPA trace per node
  

Run Hadoop jobs
• Start the capture
• Jobs are done by map & reduce
  

Remote control by VNC viewer
• Intended for multiple users on AWS
• Experience and operate PPA from different connection points
  

CONTACT US:
CURTIS@MULTICOREWAREINC.COM
LIHUA@MULTICOREWAREINC.COM
MANU@MULTICOREWAREINC.COM
  	
  

PT-4058, Measuring and Optimizing Performance of Cluster and Private Cloud Applications Using PPA, by Hui Huang, Zhaoqiang Zheng and Lihua Zhang

Keynote (Phil Rogers) - The Programmers Guide to Reaching for the Cloud - by ...
Keynote (Phil Rogers) - The Programmers Guide to Reaching for the Cloud - by ...Keynote (Phil Rogers) - The Programmers Guide to Reaching for the Cloud - by ...
Keynote (Phil Rogers) - The Programmers Guide to Reaching for the Cloud - by ...AMD Developer Central
 
Final apu13 phil-rogers-keynote-21
Final apu13 phil-rogers-keynote-21Final apu13 phil-rogers-keynote-21
Final apu13 phil-rogers-keynote-21r Skip
 
[NetApp] Simplified HA:DR Using Storage Solutions
[NetApp] Simplified HA:DR Using Storage Solutions[NetApp] Simplified HA:DR Using Storage Solutions
[NetApp] Simplified HA:DR Using Storage SolutionsPerforce
 
Cowboy Dating with Big Data or DWH Evolution in Action, Борис Трофимов
Cowboy Dating with Big Data or DWH Evolution in Action, Борис ТрофимовCowboy Dating with Big Data or DWH Evolution in Action, Борис Трофимов
Cowboy Dating with Big Data or DWH Evolution in Action, Борис ТрофимовSigma Software
 
Sap security online training
Sap security online trainingSap security online training
Sap security online trainingsapscmit
 
A Comparative Survey Based on Processing Network Traffic Data Using Hadoop Pi...
A Comparative Survey Based on Processing Network Traffic Data Using Hadoop Pi...A Comparative Survey Based on Processing Network Traffic Data Using Hadoop Pi...
A Comparative Survey Based on Processing Network Traffic Data Using Hadoop Pi...IJCSES Journal
 
A comparative survey based on processing network traffic data using hadoop pi...
A comparative survey based on processing network traffic data using hadoop pi...A comparative survey based on processing network traffic data using hadoop pi...
A comparative survey based on processing network traffic data using hadoop pi...ijcses
 
Oracle apps scm online training
Oracle apps scm online trainingOracle apps scm online training
Oracle apps scm online trainingsaptpmit
 
Oracle nosql twjug-oktober-2014_taiwan_print_v01
Oracle nosql twjug-oktober-2014_taiwan_print_v01Oracle nosql twjug-oktober-2014_taiwan_print_v01
Oracle nosql twjug-oktober-2014_taiwan_print_v01Gunther Pippèrr
 
sudoers: Benchmarking Hadoop with ALOJA
sudoers: Benchmarking Hadoop with ALOJAsudoers: Benchmarking Hadoop with ALOJA
sudoers: Benchmarking Hadoop with ALOJANicolas Poggi
 
Machine Data 101
Machine Data 101Machine Data 101
Machine Data 101Splunk
 
PNDA - Platform for Network Data Analytics
PNDA - Platform for Network Data AnalyticsPNDA - Platform for Network Data Analytics
PNDA - Platform for Network Data AnalyticsJohn Evans
 
The Impact of Columnar File Formats on SQL-on-Hadoop Engine Performance: A St...
The Impact of Columnar File Formats on SQL-on-Hadoop Engine Performance: A St...The Impact of Columnar File Formats on SQL-on-Hadoop Engine Performance: A St...
The Impact of Columnar File Formats on SQL-on-Hadoop Engine Performance: A St...t_ivanov
 
Breaking the Monolith
Breaking the MonolithBreaking the Monolith
Breaking the MonolithVMware Tanzu
 
Getting Started with Apache Spark and Alluxio for Blazingly Fast Analytics
Getting Started with Apache Spark and Alluxio for Blazingly Fast AnalyticsGetting Started with Apache Spark and Alluxio for Blazingly Fast Analytics
Getting Started with Apache Spark and Alluxio for Blazingly Fast AnalyticsAlluxio, Inc.
 
XDF 2019 Xilinx Accelerated Database and Data Analytics Ecosystem
XDF 2019 Xilinx Accelerated Database and Data Analytics EcosystemXDF 2019 Xilinx Accelerated Database and Data Analytics Ecosystem
XDF 2019 Xilinx Accelerated Database and Data Analytics EcosystemDan Eaton
 
Headless approach for offloading heavy tasks in Magento
Headless approach for offloading heavy tasks in MagentoHeadless approach for offloading heavy tasks in Magento
Headless approach for offloading heavy tasks in MagentoSander Mangel
 

Similar a PT-4058, Measuring and Optimizing Performance of Cluster and Private Cloud Applications Using PPA , by Hui Huang, Zhaoqiang Zheng and Lihua Zhang (20)

Keynote (Phil Rogers) - The Programmers Guide to Reaching for the Cloud - by ...
Keynote (Phil Rogers) - The Programmers Guide to Reaching for the Cloud - by ...Keynote (Phil Rogers) - The Programmers Guide to Reaching for the Cloud - by ...
Keynote (Phil Rogers) - The Programmers Guide to Reaching for the Cloud - by ...
 
Final apu13 phil-rogers-keynote-21
Final apu13 phil-rogers-keynote-21Final apu13 phil-rogers-keynote-21
Final apu13 phil-rogers-keynote-21
 
[NetApp] Simplified HA:DR Using Storage Solutions
[NetApp] Simplified HA:DR Using Storage Solutions[NetApp] Simplified HA:DR Using Storage Solutions
[NetApp] Simplified HA:DR Using Storage Solutions
 
Cowboy Dating with Big Data or DWH Evolution in Action, Борис Трофимов
Cowboy Dating with Big Data or DWH Evolution in Action, Борис ТрофимовCowboy Dating with Big Data or DWH Evolution in Action, Борис Трофимов
Cowboy Dating with Big Data or DWH Evolution in Action, Борис Трофимов
 
SAP Basis Overview
SAP Basis OverviewSAP Basis Overview
SAP Basis Overview
 
Sap security online training
Sap security online trainingSap security online training
Sap security online training
 
A Comparative Survey Based on Processing Network Traffic Data Using Hadoop Pi...
A Comparative Survey Based on Processing Network Traffic Data Using Hadoop Pi...A Comparative Survey Based on Processing Network Traffic Data Using Hadoop Pi...
A Comparative Survey Based on Processing Network Traffic Data Using Hadoop Pi...
 
A comparative survey based on processing network traffic data using hadoop pi...
A comparative survey based on processing network traffic data using hadoop pi...A comparative survey based on processing network traffic data using hadoop pi...
A comparative survey based on processing network traffic data using hadoop pi...
 
Oracle apps scm online training
Oracle apps scm online trainingOracle apps scm online training
Oracle apps scm online training
 
Oracle nosql twjug-oktober-2014_taiwan_print_v01
Oracle nosql twjug-oktober-2014_taiwan_print_v01Oracle nosql twjug-oktober-2014_taiwan_print_v01
Oracle nosql twjug-oktober-2014_taiwan_print_v01
 
sudoers: Benchmarking Hadoop with ALOJA
sudoers: Benchmarking Hadoop with ALOJAsudoers: Benchmarking Hadoop with ALOJA
sudoers: Benchmarking Hadoop with ALOJA
 
Machine Data 101
Machine Data 101Machine Data 101
Machine Data 101
 
PNDA - Platform for Network Data Analytics
PNDA - Platform for Network Data AnalyticsPNDA - Platform for Network Data Analytics
PNDA - Platform for Network Data Analytics
 
The Impact of Columnar File Formats on SQL-on-Hadoop Engine Performance: A St...
The Impact of Columnar File Formats on SQL-on-Hadoop Engine Performance: A St...The Impact of Columnar File Formats on SQL-on-Hadoop Engine Performance: A St...
The Impact of Columnar File Formats on SQL-on-Hadoop Engine Performance: A St...
 
Breaking the Monolith
Breaking the MonolithBreaking the Monolith
Breaking the Monolith
 
Geode Meetup Apachecon
Geode Meetup ApacheconGeode Meetup Apachecon
Geode Meetup Apachecon
 
Getting Started with Apache Spark and Alluxio for Blazingly Fast Analytics
Getting Started with Apache Spark and Alluxio for Blazingly Fast AnalyticsGetting Started with Apache Spark and Alluxio for Blazingly Fast Analytics
Getting Started with Apache Spark and Alluxio for Blazingly Fast Analytics
 
Adopting the Cloud
Adopting the CloudAdopting the Cloud
Adopting the Cloud
 
XDF 2019 Xilinx Accelerated Database and Data Analytics Ecosystem
XDF 2019 Xilinx Accelerated Database and Data Analytics EcosystemXDF 2019 Xilinx Accelerated Database and Data Analytics Ecosystem
XDF 2019 Xilinx Accelerated Database and Data Analytics Ecosystem
 
Headless approach for offloading heavy tasks in Magento
Headless approach for offloading heavy tasks in MagentoHeadless approach for offloading heavy tasks in Magento
Headless approach for offloading heavy tasks in Magento
 

Más de AMD Developer Central

DX12 & Vulkan: Dawn of a New Generation of Graphics APIs
DX12 & Vulkan: Dawn of a New Generation of Graphics APIsDX12 & Vulkan: Dawn of a New Generation of Graphics APIs
DX12 & Vulkan: Dawn of a New Generation of Graphics APIsAMD Developer Central
 
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware WebinarAn Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware WebinarAMD Developer Central
 
Webinar: Whats New in Java 8 with Develop Intelligence
Webinar: Whats New in Java 8 with Develop IntelligenceWebinar: Whats New in Java 8 with Develop Intelligence
Webinar: Whats New in Java 8 with Develop IntelligenceAMD Developer Central
 
The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...
PT-4058, Measuring and Optimizing Performance of Cluster and Private Cloud Applications Using PPA, by Hui Huang, Zhaoqiang Zheng and Lihua Zhang

  • 1. MEASURING AND OPTIMIZING PERFORMANCE OF CLUSTER AND PRIVATE CLOUD APPLICATIONS BY USING PPA
  • 2. MULTICOREWARE INC: LIHUA.ZHANG, HUI.HUANG, ANDY.ZHENG
  • 3. Introduction to MCW PPA™ For Cluster
      A tracing tool that targets distributed systems.
      - Collects instrumented data and hardware measurements in a distributed fashion within a tracing infrastructure.
      - Provides visualizations with intuitive graphs/Gantt charts and generates statistical reports intended for identifying critical paths.
      - Performs offline analysis that aids in understanding the target system's behavior and reasoning about performance issues.
      - PPA product series: PPA For Cluster, PPA Workstation Edition, PPA For Android
      3 | PRESENTATION TITLE | DECEMBER 4, 2013 | CONFIDENTIAL
  • 4. Main Features
      - Low overhead
          ‒ Negligible performance impact on running applications, achieved by relying on the PPA runtime library. This is very useful for highly optimized, performance-sensitive cases.
      - Instrumentation at application level
          ‒ The PPA runtime library provides APIs to measure code. The hardware measurement part is transparent to developers, and the PPA instrumentation can easily be stripped out by turning on a disable option.
          ‒ Auto-instrumentation of binaries available soon.
      - Scalability
          ‒ The tool can be extended to profile clusters of various scales (now up to 4000 nodes) and services (e.g. Hadoop). This benefits from PPA's distributed data repositories, big-data processing, buffered views of visualizations, etc.
          ‒ PPA Profiler can be extended to support HW vendor specific features.
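The application-level instrumentation and its disable option can be illustrated with a minimal sketch. It is written in Python for brevity and does not use the real PPA API, which these slides do not show; the `Tracer` class, its `enabled` flag, and the `event` method are hypothetical names chosen for illustration.

```python
import time
from contextlib import contextmanager


class Tracer:
    """Hypothetical stand-in for an instrumentation runtime (not the PPA API)."""

    def __init__(self, enabled=True):
        self.enabled = enabled   # plays the role of the "disable option"
        self.events = []         # recorded spans: (name, start_ns, end_ns)

    @contextmanager
    def event(self, name):
        if not self.enabled:
            # Disabled: run the block without reading the clock or storing data.
            yield
            return
        start = time.perf_counter_ns()
        try:
            yield
        finally:
            self.events.append((name, start, time.perf_counter_ns()))


tracer = Tracer(enabled=True)

with tracer.event("sum_squares"):
    total = sum(i * i for i in range(1000))
```

When `enabled` is false the wrapped block still runs, but nothing is timed or stored, which is one simple way to keep the cost close to zero when no capture is active.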
  • 5. The Highlights
      - Profiler and performance analyzer
          ‒ Low overhead (almost no cost if no profiling capture is enabled)
          ‒ CPU & GPU activity traces
          ‒ Hardware utilization measurement
          ‒ HW vendor specific support
          ‒ Time-based views and statistical analysis / reports
          ‒ Multi-core profiling per process/thread at source-code level
          ‒ Good data organization in intuitive colour schemes
      - Big data support
          ‒ Storage
          ‒ Smooth visualization
      - System-wide critical path identification
          ‒ Correlate hardware utilization and CPU events in the same timeline
          ‒ Cluster-wide global clock synchronization
          ‒ Multi-views for sessions from different nodes in the same timeline
          ‒ Runtime monitors
      - Customizable for specific applications, e.g. Hadoop
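Placing sessions from different nodes on the same timeline requires a common clock. The slides do not say how PPA synchronizes clocks; the sketch below assumes a standard NTP-style offset estimate from one request/response exchange with a symmetric network delay. The function names and the sample numbers are illustrative only.

```python
def estimate_offset(t0, t1, t2, t3):
    """Estimate (master clock - node clock) from one round trip.

    t0: node clock when the request is sent
    t1: master clock when the request arrives
    t2: master clock when the reply is sent
    t3: node clock when the reply arrives
    Assumes the network delay is the same in both directions.
    """
    return ((t1 - t0) + (t2 - t3)) / 2


def to_master_time(events, offset):
    """Shift (name, start, end) spans from node time onto the master timeline."""
    return [(name, start + offset, end + offset) for name, start, end in events]


# Example: the node clock runs 500 units behind the master, one-way delay is 10.
offset = estimate_offset(t0=1000, t1=1510, t2=1512, t3=1022)
aligned = to_master_time([("map_task", 1100, 1200)], offset)
```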
  • 6. Developer Library Overview
      - C/C++ SDK
          ‒ Already used in numerous OpenCL™ applications
      - Java support
          ‒ Java bindings for OpenCL™ applications
      - Thread-safe
      - Low overhead if no capture
      - Transparent for OpenCL instrumentation
          ‒ Timing OpenCL APIs
          ‒ Timing kernels & data transfers: start/submit/queue/complete
          ‒ Visualize construction of dependence graph between kernels & data transfers
          ‒ Exclusive sub-kernel support for AMD GFX cards
      C/C++: provides a friendly interface (ppaAPI.h) for the C/C++ developer.
      JAVA: provides a friendly interface (JPPA.jar) for the Java developer.
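The start/submit/queue/complete timing of kernels and data transfers corresponds to the four timestamps that OpenCL event profiling reports for a command (queued, submit, start, end, in nanoseconds). A minimal sketch of turning those four timestamps into per-phase durations follows; the class, the field names, and the sample values are hypothetical and are not taken from the PPA library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommandTimestamps:
    """Four profiling timestamps of one OpenCL command, in nanoseconds."""
    queued_ns: int   # command enqueued by the host
    submit_ns: int   # command submitted to the device
    start_ns: int    # device started executing
    end_ns: int      # device finished executing


def phase_durations(ts):
    """Split the life of a command into its waiting and running phases."""
    return {
        "queue_wait_ns": ts.submit_ns - ts.queued_ns,
        "launch_latency_ns": ts.start_ns - ts.submit_ns,
        "execution_ns": ts.end_ns - ts.start_ns,
        "total_ns": ts.end_ns - ts.queued_ns,
    }


# Made-up timestamps for a single kernel launch.
phases = phase_durations(CommandTimestamps(100, 250, 400, 1400))
```

Separating the phases shows whether a slow kernel is slow on the device or is mostly waiting in the queue.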
  • 7. System Overview
      - Distributed repositories for trace data
      - Distributed post-processing to minimize overhead
      - Powerful visualization engine
      - Scalability to any scale of cluster system
      [Figure: layered architecture diagram. Layers: Presentation layer, UI Logic layer, Network layer, Profiler Logic layer, Data layer. Box labels include: Graphics Rendering, Raw Data, Post Processing, Communication Framework, Processed Data Repository, Data Transfer, Profiler Control (Start/Stop etc.), Data collecting by PPA Profiler, Data serialize for Presentation, Fault-tolerant, Synchronization and heartbeat etc., Other profiler logic, Raw Data Repository]
  • 8. Getting Started
      - Install PPA Clients and PPA Server on the target platforms
          ‒ Deploy PPA Clients by scripts
          ‒ CLI support for capture
          ‒ Generally, PPA Server runs on the master node
      - Set up capture options
          ‒ Node IP, communication port…
          ‒ Optionally select nodes to profile
          ‒ Optionally enable CPU Event filters
          ‒ Optionally enable CPU Event merge
          ‒ Hardware measurement is enabled by default
      - Collect data and analysis reports
      - Operate views
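The slides do not define what CPU Event filters and CPU Event merge do. The sketch below assumes one plausible reading: a filter keeps only events with selected names, and a merge fuses back-to-back events of the same name on the same thread when the gap between them is small. The tuple layout `(thread, name, start, end)` and both function names are assumptions, not PPA behaviour.

```python
def filter_events(events, allowed_names):
    """Keep only events whose name is in allowed_names."""
    return [e for e in events if e[1] in allowed_names]


def merge_events(events, max_gap):
    """Fuse consecutive same-name events on a thread separated by <= max_gap."""
    merged = []
    last_on_thread = {}  # thread id -> index in `merged` of its latest event
    for thread, name, start, end in sorted(events, key=lambda e: e[2]):
        i = last_on_thread.get(thread)
        if i is not None and merged[i][1] == name and start - merged[i][3] <= max_gap:
            # Extend the previous span instead of adding a new one.
            t, n, s, e = merged[i]
            merged[i] = (t, n, s, max(e, end))
        else:
            last_on_thread[thread] = len(merged)
            merged.append((thread, name, start, end))
    return merged


events = [
    (1, "read", 0, 10),
    (1, "read", 12, 20),
    (1, "write", 21, 30),
    (1, "read", 31, 40),
    (2, "read", 5, 15),
]
merged = merge_events(events, max_gap=5)
reads_only = filter_events(merged, {"read"})
```

Merging of this kind reduces the number of spans that have to be stored and drawn, at the cost of hiding very short gaps.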
  • 9. Summary View
      - Helps find problematic nodes or unbalanced loads.
      - Shows the differences between runs.
      [Figure: Multistage Table, Bar Charts]
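Finding unbalanced loads from a per-node summary can be sketched as a comparison against the cluster median. This is an illustrative heuristic, not the method PPA uses; the function name, the 1.5x ratio, and the node totals are all made up.

```python
from statistics import median


def overloaded_nodes(busy_time_by_node, ratio=1.5):
    """Return nodes whose total busy time exceeds ratio x the cluster median."""
    typical = median(busy_time_by_node.values())
    return sorted(
        node for node, busy in busy_time_by_node.items() if busy > ratio * typical
    )


# Hypothetical per-node totals (seconds of recorded CPU events).
summary = {"node1": 100, "node2": 105, "node3": 98, "node4": 240}
flagged = overloaded_nodes(summary)
```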
  • 10. The Sharp Utility: Timeline View
      - Correlate CPU Events to HW performance in analysis
      [Figure callouts: Monitoring application's behaviour; Session and its node list; Monitoring hardware behavior; Zoom in/out from hour to ns resolutions]
  • 11. Profiling Data
      - CPU Events level
          ‒ Thread
          ‒ Name
          ‒ Core mitigation
          ‒ Timing
      - OpenCL traces
      - Hardware counters
          ‒ % CPU usage
          ‒ Memory usage
          ‒ Bytes read/written on disk
          ‒ Bytes in/out on the network
          ‒ Cache hit/miss
      - Statistics
          ‒ Processes/threads involved
          ‒ # of total CPU Events
          ‒ # of the same CPU Events
          ‒ Min/Max/Average for each
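The statistics listed above (total number of CPU events, count per event name, and min/max/average for each) can be computed from a flat list of event durations. The following is a minimal sketch; the `(name, duration)` input layout and the function name are assumptions for illustration.

```python
from collections import defaultdict


def event_statistics(events):
    """Summarize (name, duration) records per event name."""
    durations = defaultdict(list)
    for name, duration in events:
        durations[name].append(duration)
    per_event = {
        name: {
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "avg": sum(values) / len(values),
        }
        for name, values in durations.items()
    }
    return {"total_events": len(events), "per_event": per_event}


# Made-up durations in microseconds.
stats = event_statistics([("map", 10), ("map", 30), ("reduce", 5)])
```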
  • 12. Timeline View for CPU Events
      - Process-thread-event data
          ‒ Identify the problematic process/thread/event
          ‒ Show dependencies
          ‒ Show parent & child relationships
          ‒ Frames analyzer for frame-based programs
      [Figure callouts: Expand process; Expand thread]
  • 13. Timeline View for HW measurement
      - Aggregate performance data
      - Per-core data
      [Figure callouts: When is the critical throughput on disk? Abnormal load of the network? When is the CPU usage very low or high?]
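The question "when is the CPU usage very low or high?" amounts to finding time intervals in which a sampled counter satisfies a condition. The sketch below is one simple way to do that; the sample layout `(timestamp, percent)`, the function name, and the thresholds are illustrative assumptions.

```python
def intervals_where(samples, predicate):
    """Return (first_ts, last_ts) for each run of consecutive matching samples."""
    intervals = []
    start = None
    prev_ts = None
    for ts, value in samples:
        if predicate(value):
            if start is None:
                start = ts          # a new run begins here
        elif start is not None:
            intervals.append((start, prev_ts))
            start = None
        prev_ts = ts
    if start is not None:           # close a run that reaches the last sample
        intervals.append((start, prev_ts))
    return intervals


# Made-up CPU usage samples: (timestamp in seconds, usage in percent).
cpu = [(0, 10), (1, 95), (2, 97), (3, 40), (4, 3), (5, 2), (6, 99)]
busy = intervals_where(cpu, lambda pct: pct >= 90)
idle = intervals_where(cpu, lambda pct: pct <= 5)
```

The same function applies to disk or network throughput samples by changing the predicate.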
  • 14. Where multi-views Help Optimization
      - Identify a node's abnormal behavior
      - Differences/relations between nodes
      - Job scheduler matters
  • 15. Hadoop with PPA on AWS as Demo
      - Overview of the tracing infrastructure
  • 16. Setup AWS EC2 instance
      - 16 Hadoop nodes (dual-core nodes with 7.5 GB memory)
      - 4 GB Hadoop Terasort workload
      - > 1.2 GB PPA trace per node
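The figures above imply a lower bound on the total trace volume for the demo, which a few lines of arithmetic make explicit. Only the three input numbers come from the slide; the variable names are illustrative.

```python
nodes = 16
trace_gb_per_node = 1.2   # slide states "> 1.2 GB", so this is a lower bound
workload_gb = 4

total_trace_gb = nodes * trace_gb_per_node        # more than 19.2 GB in total
trace_to_workload = total_trace_gb / workload_gb  # trace is over 4.8x the input
```

A trace several times larger than the workload itself is the kind of data volume that the distributed repositories and big-data support mentioned earlier are meant to handle.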
  • 17. Run Hadoop jobs
      - Start the capture
      - Jobs are done by map & reduce
  • 18. Remote control by VNC viewer
      - Intended for multiple users on AWS
      - Experience and operate PPA from different connection points
  • 19. CONTACT US: CURTIS@MULTICOREWAREINC.COM, LIHUA@MULTICOREWAREINC.COM, MANU@MULTICOREWAREINC.COM