OpenMP
Presented by:
Mohammad Radpour
Amirali Sharifian
Farbod Nosrat Nezami
Introduction
• What is parallel processing?
It is the ability to process more than one job simultaneously.
• Why go parallel?
• There is a great deal of data to be processed
• Engineering calculations take a long time to compute
• Jobs need to be done faster
Isfahan University of Technology, Dep. Electronic and Computer
Engineering
Technologies
• What technologies are used for parallel processing?
• Network-based parallel processing
• Utilizes idle CPU time and power
• In fact, most CPU time and power is wasted
• Splits jobs into pieces and runs them on the available resources
• Local parallelism on multicore/multiprocessor systems
• Utilizes the concept of multithreading
• Utilizes the concept of shared memory
• Can run on either the GPU or the CPU
Tools and Techniques
• What tools are used for parallel processing?
• Network-based parallel processing
• Grid-based parallel computing
• Cloud-based parallelism and cloud computing
• Local parallelism on multicore/multiprocessor systems
• NVIDIA® CUDA™
• MPI
• POSIX Threads
• OpenMP
What is OpenMP
• OpenMP
• In simple terms, it runs a user program in parallel.
• It relies on two main concepts for parallelism
• Multithreading
• Shared memory
• It takes the user application, splits it into a group of threads, and
runs them on a shared-memory foundation
Why use OpenMP
• It is simple to use
• Most of the time there is no need to change the existing program code
• It uses compiler directives to mark parallel regions
• It is cross-platform
• It is supported in Fortran and C/C++
Programming Model
• Shared Memory
• Parallelism by threading
• Fork-Join model
• Explicit Parallelism
• Nested Parallelism
• Dynamic Threads
• Input / Output
• Memory model
Shared Memory
• What is shared memory?
• Why use shared memory?
• Shared Memory in OpenMP
Shared Memory (Cont.)
• The following systems can be used for shared memory access
• a single-core chip (older PCs, sequential execution)
• a multicore chip (such as your laptop?)
• multiple single-core chips in a NUMA system
• multiple multicore chips in a NUMA system (VT SGI system)
UMA vs. NUMA
• Uniform Memory Access (UMA)
UMA vs. NUMA (Cont.)
• Non-Uniform Memory Access (NUMA)
Multithreading
• What is multithreading?
• What is Intel Hyper-Threading?
• Why use multithreading?
• Multithreading in OpenMP
Fork-Join Model
• What is fork?
• What is join?
• How multithreading works in OpenMP
Fork – Join Model (Cont.)
[Figure: fork-join diagram. Labels: F (fork), J (join), Master Thread, Thread]
OpenMP Elements
• Compiler Directives
• Runtime library routines
• Environment variables
How to use OpenMP
• OpenMP is implemented for C/C++ and Fortran
• In C/C++ we use compiler directives
• We only need to mark the parallel region
How to use OpenMP
• With non-Microsoft compilers:
How to use OpenMP (Cont.)
• In Visual Studio:
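Real Experiment
#include <windows.h>   // LARGE_INTEGER, QueryPerformanceCounter
#include <omp.h>       // omp_set_num_threads
#include <iostream>
using namespace std;

int main()
{
    omp_set_num_threads(6);   // request a team of 6 threads
    LARGE_INTEGER frequency;  // ticks per second
    LARGE_INTEGER t1, t2;     // ticks
    double elapsedTime;
    // get ticks per second
    QueryPerformanceFrequency(&frequency);
    // start timer
    QueryPerformanceCounter(&t1);
    // the outer iterations are divided among the threads
    #pragma omp parallel for
    for (int i = 0; i < 999999; i++)
        for (int j = 0; j < 1000; j++)
            ;  // empty body: a pure busy loop (an optimizing build may remove it)
    // stop timer
    QueryPerformanceCounter(&t2);
    elapsedTime = (t2.QuadPart - t1.QuadPart) * 1000.0 / frequency.QuadPart;
    cout << elapsedTime << " ms.\n";
    return 0;
}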
Experiment Result - Sequential
• It took 3347.68 milliseconds to run
Experiment Result - Parallel
• It took 983.576 milliseconds to run
The End
