High Performance Computing. Jawwad Shamsi. Lecture #6, 27th January 2010.
Recap: Cache Coherence, NUMA.
Today's topics: Cache Coherence (continuation), Vector Processing.
Cache Coherence: In an SMP or NUMA system, several caches may hold a copy of the same data item, and each copy may hold a different value. How do we maintain coherency?
Cache Coherence: Two Approaches. Write back: main memory is updated only when the modified cache line is flushed. Write through: every write updates the cache as well as main memory.
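The difference between the two write policies can be sketched with a toy model. This is a minimal illustration, not part of the lecture: the class `ToyCache`, its `flush` method, and the use of a Python dict as "main memory" are all assumptions made for the example.

```python
class ToyCache:
    """Toy cache in front of a dict that stands in for main memory (hypothetical model)."""

    def __init__(self, memory, write_through):
        self.memory = memory              # address -> value, the "main memory"
        self.write_through = write_through
        self.lines = {}                   # cached address -> value
        self.dirty = set()                # addresses written but not yet flushed

    def read(self, addr):
        if addr not in self.lines:        # miss: fetch the value from main memory
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        if self.write_through:
            self.memory[addr] = value     # write through: memory updated on every write
        else:
            self.dirty.add(addr)          # write back: memory update deferred to flush

    def flush(self):
        for addr in self.dirty:           # write back every modified line
            self.memory[addr] = self.lines[addr]
        self.dirty.clear()


def demo():
    mem_wt = {0: 1}
    wt = ToyCache(mem_wt, write_through=True)
    wt.write(0, 42)
    after_wt_write = mem_wt[0]            # memory already holds the new value

    mem_wb = {0: 1}
    wb = ToyCache(mem_wb, write_through=False)
    wb.write(0, 42)
    before_flush = mem_wb[0]              # memory is stale until the flush
    wb.flush()
    after_flush = mem_wb[0]
    return after_wt_write, before_flush, after_flush
```

Under write back, main memory holds a stale value between the write and the flush, which is the window in which another processor can read inconsistent data.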
Implementations. Software solutions: compile-time decision; conservative; inefficient cache utilization. Hardware solutions: runtime decision; more effective.
Hardware-based solutions: Directory Protocol, Snoopy Protocol.
Directory: a centralized controller. An individual cache controller makes a request; the centralized controller checks it, issues the command, and updates its information.
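Directory. Write: the processor requests exclusive write access; the controller sends a message that invalidates the other copies. Read: the controller issues a command to the processor holding the line; the holding processor writes the line back to main memory; the read is then permitted.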
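The read and write sequences above can be sketched as a small simulation. This is an illustrative model only: the class `Directory`, its fields `sharers` and `owner`, and the method names are assumptions, and each address is treated as a whole cache line so that a write replaces the full line.

```python
class Directory:
    """Toy centralized directory controller (hypothetical model, one value per line)."""

    def __init__(self, memory):
        self.memory = memory      # address -> value, the "main memory"
        self.sharers = {}         # address -> set of cache ids holding a clean copy
        self.owner = {}           # address -> cache id holding the exclusive copy
        self.caches = {}          # cache id -> {address: value}

    def add_cache(self, cid):
        self.caches[cid] = {}

    def read(self, cid, addr):
        holder = self.owner.get(addr)
        if holder == cid:                         # requester already owns the line
            return self.caches[cid][addr]
        if holder is not None:
            # Command the holding processor to write back to main memory.
            self.memory[addr] = self.caches[holder][addr]
            del self.owner[addr]
            self.sharers[addr] = {holder}
        self.sharers.setdefault(addr, set()).add(cid)
        self.caches[cid][addr] = self.memory[addr]   # read is now permitted
        return self.caches[cid][addr]

    def write(self, cid, addr, value):
        # Collect every other cache that holds the line and invalidate its copy.
        others = set(self.sharers.get(addr, set()))
        holder = self.owner.get(addr)
        if holder is not None:
            others.add(holder)
        for other in others - {cid}:
            self.caches[other].pop(addr, None)
        self.sharers[addr] = set()
        self.owner[addr] = cid                    # requester gets exclusive access
        self.caches[cid][addr] = value            # memory is updated on a later read
```

For example, after cache "A" reads a line and cache "B" writes it, A's copy is gone; when A reads again, B is made to write back first, so A sees B's value.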
Directory. Disadvantage: the centralized controller is a bottleneck. Advantage: useful in large-scale systems.
Snoopy Protocol: an update operation is announced and all cache controllers snoop on it. Suited to a bus architecture. Careful: increased bus traffic.
Snoopy Protocol: Two approaches. Write invalidate: one writer, multiple readers; the writer gains exclusive access by invalidating the other caches' entries. Write update: multiple writers; every write is propagated to update the other caches.
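The two snoopy approaches differ only in what the snooping caches do when a write is announced. A minimal sketch follows, assuming a hypothetical `SnoopyBus` class in which each cache is a plain dict; timing, arbitration, and main memory are left out.

```python
class SnoopyBus:
    """Toy shared bus with snooping caches (hypothetical model)."""

    def __init__(self, n_caches, policy):
        if policy not in ("invalidate", "update"):
            raise ValueError("policy must be 'invalidate' or 'update'")
        self.policy = policy
        self.caches = [dict() for _ in range(n_caches)]   # cache id -> {address: value}

    def load(self, cid, addr, value):
        self.caches[cid][addr] = value    # place a copy of the line in one cache

    def write(self, cid, addr, value):
        self.caches[cid][addr] = value
        # The write is announced on the bus; every other cache snoops it.
        for other, cache in enumerate(self.caches):
            if other == cid or addr not in cache:
                continue
            if self.policy == "invalidate":
                del cache[addr]           # write invalidate: drop the stale copy
            else:
                cache[addr] = value       # write update: refresh the copy in place
```

With the invalidate policy, a reader takes a miss on its next access; with the update policy, readers keep a fresh copy at the cost of propagating every write.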
Write Invalidate: The MESI Protocol (P4 processor). The data cache keeps two status bits per line, giving 4 states: Modified, Exclusive, Shared, Invalid. See Table.
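4 Possibilities. Read miss: a copy held in another cache goes EX to SH, SH to SH, or MO to SH. Read hit: no state change. Write miss: RWITM (read with intent to modify); copies held in other caches go MO to IN or SH to IN. Write hit: other caches' copies go SH to IN; the writer's own line goes EX to MO.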
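The four cases can be sketched as a state tracker for a single memory line shared by several caches. This is a simplified illustration, not the table referenced in the slides: the class `MesiLine` and the function `trace` are hypothetical names, and write-backs and bus signals are only noted in comments.

```python
M, E, S, I = "M", "E", "S", "I"   # Modified, Exclusive, Shared, Invalid


class MesiLine:
    """State of one memory line in each of n caches on a shared bus (toy model)."""

    def __init__(self, n):
        self.state = [I] * n

    def read(self, i):
        if self.state[i] != I:
            return                          # read hit: no state change
        # Read miss: find the other caches that currently hold the line.
        others = [j for j in range(len(self.state))
                  if j != i and self.state[j] != I]
        for j in others:
            self.state[j] = S               # EX to SH, SH to SH, MO to SH (after write-back)
        self.state[i] = S if others else E  # sole holder gets the line Exclusive

    def write(self, i):
        # Write miss (RWITM) or write hit on a Shared line: other copies go to Invalid.
        # Write hit on an Exclusive or Modified line: the others are already Invalid.
        for j in range(len(self.state)):
            if j != i:
                self.state[j] = I
        self.state[i] = M


def trace():
    line = MesiLine(3)
    steps = []
    line.read(0)
    steps.append(list(line.state))   # cache 0 is the only holder
    line.read(1)
    steps.append(list(line.state))   # caches 0 and 1 share the line
    line.write(1)
    steps.append(list(line.state))   # cache 1 modifies, cache 0 is invalidated
    line.read(2)
    steps.append(list(line.state))   # cache 1 drops to Shared, cache 2 joins
    return steps
```

Calling `trace()` walks one line through a read miss with no other holder, a read miss with a holder, a write hit on a Shared line, and a read miss against a Modified copy.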
L1-L2 Cache Consistency
Parallel Programming and Amdahl's Law: Suppose a fraction 1/N of the execution time is spent in sequential code and the remaining 1 - 1/N in parallel code.
Amdahl's Law. Speedup: the speed gain from using parallel processors versus a single processor. Speedup = 1/(s + p/N), where s = fraction of sequential code, p = fraction of parallel code (s + p = 1), and N = number of processors. Equivalently, Speedup = T(1)/T(j) for j parallel processors. As the problem size increases, p may rise and s may decrease.
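A short worked computation of the formula may help. The function name `amdahl_speedup` and its argument checks are assumptions added for illustration; the formula is the one stated above with p = 1 - s.

```python
def amdahl_speedup(s, n):
    """Speedup 1/(s + p/n) for sequential fraction s and n processors, with p = 1 - s."""
    if not 0.0 <= s <= 1.0:
        raise ValueError("s must be a fraction between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    p = 1.0 - s
    return 1.0 / (s + p / n)


if __name__ == "__main__":
    # With 10% sequential code the speedup can never exceed 1/s = 10,
    # no matter how many processors are added.
    for n in (1, 2, 10, 100, 1000):
        print(n, round(amdahl_speedup(0.1, n), 2))
```

With s = 0.1 and N = 10, the speedup is 1/(0.1 + 0.09), roughly 5.26, well short of the ideal factor of 10. This is why a shrinking s on larger problems matters so much.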
