HETEROGENEOUS ARCHITECTURES: A SURVEY AND OVERVIEW FOR DEVELOPERS
MAZHAR MEMON
CTO, BITFUSION.IO
  
The problem in computing
Delivering performance and efficiency to today's applications is becoming more difficult.
[Figure: "abstract and slow →" versus "← complex and fast", plotted against "Time →"]

The software world is increasingly abstract

Transistor scaling is ending

Moore's law slowing -> complexity
• Era of frequency
• Era of multi-core
• Era of many-core
  
The problem in computing
Help!
[Figure: "abstract and slow →" versus "← complex and fast", plotted against "Time →"]

The solution(s)
• Hardware
  • Specialized hardware required to keep up with accelerated performance curve
  • Encourage accessibility: low hourly pricing
• Software
  • Abstractions: libraries, APIs, tool chain up to compiler IR; use translations where possible
  • Ecosystem: learning materials, user groups, university engagement
• What makes this happen: Developers
The remainder of this talk is about the hardware out there and how to develop for it.
  
Current State of Developer Experience for Accelerators
- Update to the right operating system
- Install vendor tool-flows, which only work on specific operating systems
- Setting up the environment and licenses
- Installing the board
- Setting up the board
- Numerous pages of documentation
Unhappy developer experience :(
In many cases developers give up before even starting real work due to this poor developer experience.
  
Overview of available compute devices
…from easiest to hardest
  
Integrated GPUs
• Architecture: SIMD, shared resource architecture
• Targeted workloads: Medium-sized offloads, latency-sensitive, cost-sensitive, media
• Programming models: OpenCL, DirectCompute, C++ AMP, SPIR, HSAIL
• Ecosystem maturity: High
• Links:
  • https://software.intel.com/en-us/articles/intel-graphics-developers-guides
  
Discrete GPUs
• Architecture: SIMD, discrete coprocessor configuration
• Targeted workloads: Large-sized offloads, throughput-sensitive, parallel structured
• Programming models: CUDA, OpenCL, DirectCompute, C++ AMP, SYCL, SPIR, HSA
• Ecosystem maturity: High
• Links:
  • http://docs.nvidia.com/cuda/cuda-getting-started-guide-for-linux
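
One way to read the "medium-sized" versus "large-sized" offload distinction on the two GPU slides is as a break-even calculation: an offload pays off only when the time saved on the accelerator outweighs the cost of moving data to it. The sketch below is a rough model, not something from the talk; the function name and every number (bandwidth, launch latency, speedup) are hypothetical assumptions chosen for illustration.

```python
def offload_pays_off(cpu_seconds, bytes_moved, link_bytes_per_s,
                     launch_latency_s, accel_speedup):
    """Estimate whether offloading beats running on the host CPU.

    All parameters are illustrative assumptions:
    cpu_seconds      -- time the work takes on the host CPU
    bytes_moved      -- data copied to and from the accelerator
    link_bytes_per_s -- sustained transfer bandwidth of the link
    launch_latency_s -- fixed cost per offload (driver, kernel launch)
    accel_speedup    -- how much faster the accelerator computes
    """
    transfer = bytes_moved / link_bytes_per_s
    accel_total = launch_latency_s + transfer + cpu_seconds / accel_speedup
    return accel_total < cpu_seconds


# Hypothetical discrete GPU: 10 GB/s link, 50 us launch cost, 20x compute speedup.
small = offload_pays_off(0.001, 100e6, 10e9, 50e-6, 20.0)  # 1 ms of CPU work
large = offload_pays_off(1.0, 100e6, 10e9, 50e-6, 20.0)    # 1 s of CPU work
print(small, large)
```

Under these assumed numbers the 1 ms job loses to its own transfer cost, while the 1 s job benefits. An integrated GPU that shares memory with the CPU would correspond to a much smaller `bytes_moved` term, which is one plausible reason smaller, latency-sensitive offloads suit it better.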
  
MICs
• Architecture: Many GP cores, (co)processor configuration
• Targeted workloads: Large-sized offloads, throughput-sensitive, generic HPC
• Programming models: OpenCL, OMP, MPI, general x86
• Ecosystem maturity: High
• Links:
  • https://software.intel.com/en-us/articles/intel-xeon-phi-coprocessor-developers-quick-start-guide
  
FPGAs
• Architecture: LUTs+HPs+Fabric, coprocessor configuration
• Targeted workloads: extreme pipelining or fanout, systolic, fast configuration(?)
• Programming models: VHDL, Verilog, HLS, OpenCL
• Ecosystem maturity: Medium
• Links:
  • https://www.altera.com/products/design-software/embedded-software-developers/opencl/overview.highResolutionDisplay.html
  • http://www.xilinx.com/products/design-tools/software-zone/sdaccel.html
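
The appeal of "extreme pipelining" can be shown with simple cycle counting. The sketch below assumes an idealized pipeline (one cycle per stage, no stalls, one new item accepted every cycle); the function names and the example sizes are hypothetical and only meant to illustrate the arithmetic.

```python
def unpipelined_cycles(stages, items):
    """Each item passes through every stage before the next item starts."""
    return stages * items


def pipelined_cycles(stages, items):
    """The first item takes `stages` cycles; each later item adds one cycle."""
    if items == 0:
        return 0
    return stages + items - 1


# Hypothetical design: a 50-stage datapath processing one million items.
stages, items = 50, 1_000_000
slow = unpipelined_cycles(stages, items)
fast = pipelined_cycles(stages, items)
print(slow, fast, round(slow / fast, 2))
```

For long input streams the speedup approaches the number of stages, which is why deep, regular dataflow is a natural fit for this kind of hardware under these idealized assumptions.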
  
Automata
• Architecture: NFA with programmable fabric
• Targeted workloads: MISD, pattern matching, parallel unstructured
• Programming models: API, ANML, regexp
• Ecosystem maturity: Low
• Links: http://micronautomata.com/
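
As a software analogy for the NFA model, the sketch below keeps a set of active states and advances all of them on each input symbol, which is the step an automata-style device would perform in parallel in hardware. The function name, the transition-table layout, and the example pattern are hypothetical; this is not the ANML format or any vendor API.

```python
def run_nfa(transitions, start, accepting, stream):
    """Return the positions in `stream` at which a match ends.

    transitions maps (state, symbol) to the set of next states.
    """
    active = set()
    matches = []
    for pos, symbol in enumerate(stream):
        # Re-arm the start state on every symbol so a match can begin anywhere.
        active.add(start)
        next_active = set()
        for state in active:
            next_active |= transitions.get((state, symbol), set())
        active = next_active
        if active & accepting:
            matches.append(pos)
    return matches


# Hypothetical NFA for the regexp a(b|c)d:
# 0 -a-> 1, 1 -b-> 2, 1 -c-> 2, 2 -d-> 3 (accepting)
transitions = {
    (0, "a"): {1},
    (1, "b"): {2},
    (1, "c"): {2},
    (2, "d"): {3},
}
hits = run_nfa(transitions, 0, {3}, "xabdacdad")
print(hits)  # [3, 6]
```

In software the inner loop visits active states one by one; the point of the sketch is only that the work per symbol is independent across states, so it can in principle be evaluated simultaneously.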
  
Enabling developers:
Accessibility: still a problem
  
Vision
To bring supercomputing to the masses by:
◦ building software to automatically realize the benefits of heterogeneous hardware
  
Enabling scaling automatically
• Horizontal scaling with BF Boost remoting technology
• Vertical scaling with BF Boost splitting technology
• Heterogeneous scaling with BF Boost interception technology (CPU system, GPU system)

Speedup  Workload                                       Configuration
3X       Machine learning with Caffe, Torch             2 local vs. 8 remote GPUs
3.5X     Rendering with Blender                         1 local vs. 4 remote GPUs
20X      Rendering with Blender                         4 remote GPUs
8X       Image processing with ImageMagick              1 vs. 12 local GPUs
10X      Computer vision (face detect) with OpenCV      12 CPU cores vs. 4 GPUs
7X       Computational science with NAMD                2 remote GPUs
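
For the three results that state both a baseline and a scaled GPU count, a rough scaling efficiency can be computed as the speedup divided by the increase in device count. The sketch below assumes each speedup is measured against the smaller configuration named on the slide, which the slide implies but does not state outright; the function name is hypothetical.

```python
def scaling_efficiency(speedup, baseline_devices, scaled_devices):
    """Fraction of ideal linear scaling achieved (1.0 means perfectly linear)."""
    return speedup / (scaled_devices / baseline_devices)


# (workload, reported speedup, baseline GPUs, scaled GPUs), as listed on the slide.
results = [
    ("Caffe, Torch", 3.0, 2, 8),
    ("Blender", 3.5, 1, 4),
    ("ImageMagick", 8.0, 1, 12),
]
for name, speedup, base, scaled in results:
    print(f"{name}: {scaling_efficiency(speedup, base, scaled):.1%}")
```

Under that assumption the figures work out to 75.0%, 87.5%, and 66.7% of linear scaling. The other rows do not give a like-for-like device baseline, so they are left out.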
  
Bitfusion Tech: Remote Virtualization
Features
• Scale-out: connect one server to many accelerators to boost performance
• Scale-in: connect many servers to few accelerators to pool resources and lower cost
• Service discovery: local and remote machines can discover each other on demand without complex or time-consuming configuration
• Virtual pools: segment resources by class of users or hardware
Remote virtualization enables varied virtual configurations by combining or sharing the resources of local and remote servers.
System view (application, local server, remote servers)
• Binary-level API interception
• Distribute work across local and remote machines
• Advanced performance features including synchronization elision and data pipelining
Application view (application, virtual server with combined resources)
• Software sees all new hardware as if it were directly connected
• No change to software required
[Diagram labels: data and compute pipelining; advanced caching and data directories; auto service discovery, metering; function redirection for advanced coprocessors]
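
The idea of interception with "no change to software required" can be illustrated with an in-process Python analogy: a library entry point is swapped for a wrapper that forwards the call to a different backend, while the calling code stays untouched. This is only a sketch of the general concept. The slides do not describe how the binary-level mechanism is implemented, and every class and function name here (`LocalDevice`, `FakeRemotePool`, `intercept`) is hypothetical.

```python
import functools


class LocalDevice:
    """Stand-in for a vendor library the application already calls."""

    def vector_add(self, a, b):
        return [x + y for x, y in zip(a, b)]


class FakeRemotePool:
    """Hypothetical pool of remote accelerators; it computes locally here."""

    def __init__(self, workers):
        self.workers = workers
        self.calls = 0

    def vector_add(self, a, b):
        self.calls += 1
        # Split the input into one chunk per worker, as if each chunk
        # were sent to a separate remote device.
        step = max(1, -(-len(a) // self.workers))  # ceiling division
        out = []
        for i in range(0, len(a), step):
            out.extend(x + y for x, y in zip(a[i:i + step], b[i:i + step]))
        return out


def intercept(cls, name, backend):
    """Redirect cls.<name> to backend.<name> without touching call sites."""
    original = getattr(cls, name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        return getattr(backend, name)(*args, **kwargs)

    setattr(cls, name, wrapper)
    return original


def application():
    # Unmodified application code: it only knows about LocalDevice.
    return LocalDevice().vector_add([1, 2, 3, 4, 5], [10, 20, 30, 40, 50])


pool = FakeRemotePool(workers=2)
before = application()
intercept(LocalDevice, "vector_add", pool)
after = application()
print(before == after, pool.calls)
```

The application function is identical before and after the redirect and returns the same result, but the second call is served by the pool. A binary-level approach would aim for the same effect at the shared-library boundary rather than inside one Python process.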
  
Helping to solve accessibility
[Diagram labels: scale-out, pooling; inexpensive micro-client; shared heterogeneous server]
  
offer most affordable high performance developer instances
[Diagram labels: heterogeneous cloud and developer machine; application, local server, remote servers]
• Binary-level API interception
• Distribute work across local and remote machines
• Advanced performance features including synchronization elision and data pipelining
[Diagram labels: data and compute pipelining; advanced caching and data directories; auto service discovery, metering; function redirection for advanced coprocessors]
  
SUPERCOMPUTING TO THE MASSES
  
Quantum computers
• Architecture:
• Targeted workloads:
• Programming models:
• Ecosystem maturity:
  
Application specific processors
• Architecture: Varied
• Targeted workloads: App specific: molecular simulations, DNN
• Programming models: API
• Ecosystem maturity: Zero-ish
  

Bitfusion Nimbix Dev Summit Heterogeneous Architectures
