Migration to Multi-Core
Zvi Avraham, CTO [email_address]
Agenda
Multi-Core vs. Many-Core
Multi-Core: ≤8 cores/threads
Many-Core: >8 cores/threads
Multi-Core x86
Many-Core Processors (MPU): the replacement for DSP and FPGA
Intel Tera-scale
Cell Processor (Sony PlayStation 3)
Xenon CPU (Microsoft Xbox 360)
NVIDIA GeForce 8800GTX
Intel Larrabee GPU
Hardware Slides
Parallel Programming Models
Concurrent/Parallel/Distributed
Levels of HW Parallelism
Flynn’s Taxonomy

                 Single Instruction   Multiple Instruction
Single Data      SISD                 MISD
Multiple Data    SIMD                 MIMD
Flynn’s Taxonomy (2)
Parallel Programming Models
Shared State
Message Passing
Message Passing (cont.)
Active Object Design Pattern
Rational’s Capsule ROOM – Real-Time Object Oriented Modeling
Implicit vs. Explicit Parallelism
Implicit vs. Explicit Parallelism
Data Parallel vs. Task Parallel
Data Parallel example
Task Parallel example
“Plain” vs. Nested Data Parallelism
Parallel Models
Scatter / Gather
Fork / Join: Barriers (“joins”), Parallel Regions, Master Thread
Recursive Fork/Join
Map / Reduce
Map / Reduce
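The Map / Reduce pattern can be sketched in plain C. This is only an illustration, not code from the deck: the helper names (`map_double`, `reduce_double`, `pi_integrand`, `add_double`, `pi_map_reduce`) are hypothetical, and both steps run sequentially here. The point is the shape: the map step has independent iterations, and the reduce step folds the results with an associative operator, which is what lets a runtime split the work across cores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Map step: out[i] = f(in[i]); every iteration is independent of the others. */
void map_double(const double *in, double *out, size_t n, double (*f)(double))
{
    for (size_t i = 0; i < n; i++)
        out[i] = f(in[i]);
}

/* Reduce step: fold the array into one value with a binary operator. */
double reduce_double(const double *in, size_t n, double init,
                     double (*op)(double, double))
{
    double acc = init;
    for (size_t i = 0; i < n; i++)
        acc = op(acc, in[i]);
    return acc;
}

/* Integrand of the PI example: F(x) = 4 / (1 + x^2). */
double pi_integrand(double x) { return 4.0 / (1.0 + x * x); }

/* Addition as a reduce operator. */
double add_double(double a, double b) { return a + b; }

/* PI by map/reduce: map F over the interval midpoints, then sum and scale. */
double pi_map_reduce(size_t num_steps)
{
    assert(num_steps > 0);
    double step = 1.0 / (double)num_steps;
    double *x = malloc(num_steps * sizeof *x);
    double *fx = malloc(num_steps * sizeof *fx);
    assert(x != NULL && fx != NULL);

    for (size_t i = 0; i < num_steps; i++)
        x[i] = ((double)i + 0.5) * step;      /* midpoint of interval i */

    map_double(x, fx, num_steps, pi_integrand);
    double sum = reduce_double(fx, num_steps, 0.0, add_double);

    free(x);
    free(fx);
    return step * sum;
}
```

Calling `pi_map_reduce(100000)` gives an approximation of PI. A parallel version would hand slices of the arrays to different workers in the map step and combine per-worker results in the reduce step.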
Types of Parallelism
Embarrassingly Parallel Problems
Extended Flynn’s Taxonomy
CPU Affinity: Changing Process and Thread Affinity
CPU Affinity
Why mess with CPU Affinity? Legacy Code Migration
Why mess with CPU Affinity? Real-Time and Determinism
Why mess with CPU Affinity? Memory Affinity
Changing CPU Affinity
CPU Affinity – Task Manager
CPU Affinity – Task Manager
ImageCFG – Affinity Mask Tool
Process.exe – Get Affinity Mask
Process.exe – Set Affinity Mask
IntFiltr.exe – Interrupt Affinity Filter
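An affinity mask is a bit field with one bit per logical CPU: bit 0 stands for CPU 0, bit 1 for CPU 1, and so on. The following is a minimal sketch of the bit arithmetic only. The helper names `make_affinity_mask` and `mask_allows_cpu` are hypothetical, and the 64-CPU limit is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Build an affinity mask with one bit set per allowed logical CPU.
   cpus[] holds zero-based CPU indices; this sketch supports indices 0..63. */
uint64_t make_affinity_mask(const int *cpus, int count)
{
    uint64_t mask = 0;
    for (int i = 0; i < count; i++) {
        assert(cpus[i] >= 0 && cpus[i] < 64);
        mask |= (uint64_t)1 << cpus[i];
    }
    return mask;
}

/* Return 1 if the mask allows the given CPU, 0 otherwise. */
int mask_allows_cpu(uint64_t mask, int cpu)
{
    assert(cpu >= 0 && cpu < 64);
    return (int)((mask >> cpu) & 1u);
}
```

For example, allowing CPUs 0 and 2 yields the mask 0x5. On Windows, a value of this form is what the Win32 functions `SetProcessAffinityMask` and `SetThreadAffinityMask` take as their mask argument.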
Parallel Programming APIs
Parallel Programming APIs
Simplicity / Complexity
Capabilities Comparison

                                         Intel TBB   OpenMP   Threads
Task level parallelism                       +          +        -
Data decomposition support                   +          +        -
Complex parallel patterns (non-loops)        +          -        -
Generic parallel patterns                    +          -        -
Scalable nested parallelism support          +          -        -
Built-in load balancing                      +          +        -
Affinity support                             -          +        +
Static scheduling                            -          +        -
Concurrent data structures                   +          -        -
Scalable memory allocator                    +          -        -
I/O dominated tasks                          -          -        +
User-level synch. primitives                 +          +        -
Compiler support is not required             +          -        +
Cross OS support                             +          +        -
Native Threads
Win32 Threads API
It is assumed that the reader/listener is familiar with basic multithreading and the corresponding Win32 API.
What are Win32 Threads?
Win32 API Hierarchy for Concurrency
[Diagram: Windows OS at the top; Jobs contain Processes; each Process has a Primary Thread and further Threads; Threads host Fibers.]
Process
Thread
Job object
Fiber / Coroutine / Microthread
Win32 Threads API Example: CreateThread
Win32 Threads API: Critical Section
CCriticalSection C++ Class
CCriticalSection & CLock example
Thread Affinity
Thread Ideal Processor
Our running example: the PI program (numerical integration)

Mathematically, we know that:

$$\int_0^1 \frac{4.0}{1+x^2}\,dx = \pi$$

We can approximate the integral as a sum of rectangles:

$$\sum_{i=0}^{N} F(x_i)\,\Delta x \approx \pi$$

where each rectangle has width $\Delta x$ and height $F(x_i)$ at the middle of interval $i$.

[Figure: plot of F(x) = 4.0/(1+x^2) for X from 0.0 to 1.0; vertical axis marked at 2.0 and 4.0.]
PI: Matlab

N = 1000000;
Step = 1/N;
PI = Step*sum(4./(1+(((1:N)-0.5)*Step).^2));
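PI Program: an example

#include <stdio.h>

static long num_steps = 100000;
double step;

int main(void)
{
    int i;
    double x, pi, sum = 0.0;

    step = 1.0 / (double)num_steps;
    for (i = 1; i <= num_steps; i++) {
        x = (i - 0.5) * step;              /* midpoint of interval i */
        sum = sum + 4.0 / (1.0 + x * x);   /* add the rectangle height F(x) */
    }
    pi = step * sum;
    printf("pi = %.10f\n", pi);
    return 0;
}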
OpenMP PI Program: parallel for with a reduction

#include <stdio.h>
#include <omp.h>

#define NUM_THREADS 2

static long num_steps = 100000;
double step;

int main(void)
{
    int i;
    double x, pi, sum = 0.0;

    step = 1.0 / (double)num_steps;
    omp_set_num_threads(NUM_THREADS);

    /* Each thread gets a private x and a private partial sum;
       the partial sums are added together when the loop ends. */
#pragma omp parallel for reduction(+:sum) private(x)
    for (i = 1; i <= num_steps; i++) {
        x = (i - 0.5) * step;
        sum = sum + 4.0 / (1.0 + x * x);
    }
    pi = step * sum;
    printf("pi = %.10f\n", pi);
    return 0;
}

OpenMP adds 2 to 4 lines of code.
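To make the `reduction(+:sum)` clause more concrete, here is a rough sequential emulation of the decomposition behind it. It is a sketch under stated assumptions, not what an OpenMP runtime literally does: the names `NUM_WORKERS`, `pi_partial_sum` and `pi_by_partial_sums` are hypothetical, the workers run one after another instead of on separate threads, and the cyclic split of iterations is an arbitrary choice (an OpenMP implementation picks its own schedule).

```c
#include <assert.h>

#define NUM_WORKERS 2   /* plays the role of NUM_THREADS in the OpenMP version */

/* Partial sum one worker would compute: iterations id+1, id+1+NUM_WORKERS, ... */
double pi_partial_sum(int id, long num_steps)
{
    assert(id >= 0 && id < NUM_WORKERS && num_steps > 0);
    double step = 1.0 / (double)num_steps;
    double sum = 0.0;                        /* private accumulator */
    for (long i = id + 1; i <= num_steps; i += NUM_WORKERS) {
        double x = ((double)i - 0.5) * step; /* private x, midpoint of interval i */
        sum += 4.0 / (1.0 + x * x);
    }
    return sum;
}

/* Combine the private partial sums into the final result. */
double pi_by_partial_sums(long num_steps)
{
    assert(num_steps > 0);
    double step = 1.0 / (double)num_steps;
    double total = 0.0;
    for (int id = 0; id < NUM_WORKERS; id++)
        total += pi_partial_sum(id, num_steps);   /* the "+" of the reduction */
    return step * total;
}
```

Because every worker only touches its own accumulator, no lock is needed inside the loop; the only shared step is the final addition of the partial sums.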
OpenMP
OpenMP Slides
OpenMP Compiler Option
OpenMP Compiler Option
Auto-Parallelization
Intel® TBB: Threading Building Blocks C++ Library
Calculate PI using TBB
Calculate PI in MPI
MPI_Reduce
Summary
Commercial Multi-core
Any Questions?
Thank you!
