Chapter 5 Large and Fast: Exploiting Memory Hierarchy
Multilevel Cache Considerations
Interactions with Advanced CPUs
Interactions with Software
§5.4 Virtual Memory
Virtual Memory
Address Translation
Page Fault Penalty
Page Tables
Translation Using a Page Table
Mapping Pages to Storage
Replacement and Writes
Fast Translation Using a TLB
Fast Translation Using a TLB
TLB Misses
TLB Miss Handler
Page Fault Handler
TLB and Cache Interaction
Memory Protection
§5.5 A Common Framework for Memory Hierarchies
The Memory Hierarchy (The BIG Picture)
Block Placement
Finding a Block

| Associativity | Location method | Tag comparisons |
|---|---|---|
| Direct mapped | Index | 1 |
| n-way set associative | Set index, then search entries within the set | n |
| Fully associative | Search all entries | #entries |
| Fully associative | Full lookup table | 0 |

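The lookup costs in the Finding a Block table can be sketched as a small helper. This is a minimal illustration, not taken from the slides: the function name `locate` and its parameters are hypothetical, and it assumes block addresses map to sets by modulo, with the remaining high-order part forming the tag.

```python
def locate(block_address: int, num_blocks: int, ways: int):
    """Return (set_index, tag, tag_comparisons) for one cache lookup.

    ways == 1           -> direct mapped
    1 < ways < blocks   -> n-way set associative
    ways == num_blocks  -> fully associative (a single set)
    """
    if ways < 1 or num_blocks % ways != 0:
        raise ValueError("ways must be >= 1 and divide num_blocks evenly")
    num_sets = num_blocks // ways
    set_index = block_address % num_sets   # which set to look in
    tag = block_address // num_sets        # identifies the block within that set
    return set_index, tag, ways            # every way in the set is tag-checked


# Hypothetical 8-block cache, block address 12, at three associativities.
for ways in (1, 2, 8):
    print(ways, locate(12, num_blocks=8, ways=ways))
```

The "full lookup table" row is not modelled here: a structure such as a page table is indexed directly, so no tag comparison is needed.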
Replacement
Write Policy
Sources of Misses
Cache Design Trade-offs

| Design change | Effect on miss rate | Negative performance effect |
|---|---|---|
| Increase cache size | Decrease capacity misses | May increase access time |
| Increase associativity | Decrease conflict misses | May increase access time |
| Increase block size | Decrease compulsory misses | Increases miss penalty. For very large block size, may increase miss rate due to pollution. |

§5.6 Virtual Machines
Virtual Machines
Virtual Machine Monitor
Example: Timer Virtualization
Instruction Set Support
§5.7 Using a Finite State Machine to Control a Simple Cache
Cache Control

| Address field | Bit positions | Width |
|---|---|---|
| Tag | 31 to 14 | 18 bits |
| Index | 13 to 4 | 10 bits |
| Offset | 3 to 0 | 4 bits |

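The field widths above (4-bit offset, 10-bit index, 18-bit tag) can be applied to a 32-bit address with shifts and masks. The sketch below is illustrative only; the constant and function names are hypothetical and the sample address is arbitrary.

```python
OFFSET_BITS = 4    # byte offset within a block
INDEX_BITS = 10    # selects the cache entry
TAG_BITS = 18      # remaining high-order bits


def split_address(addr: int):
    """Split a 32-bit address into (tag, index, offset)."""
    if not 0 <= addr < 2 ** 32:
        raise ValueError("address must fit in 32 bits")
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


tag, index, offset = split_address(0x12345678)
print(hex(tag), hex(index), hex(offset))  # 0x48d1 0x167 0x8
```

The three widths sum to 32, so every address bit belongs to exactly one field.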
Interface Signals
CPU to cache: Read/Write, Valid, Address (32 bits), Write Data (32 bits), Read Data (32 bits), Ready
Cache to memory: Read/Write, Valid, Address (32 bits), Write Data (128 bits), Read Data (128 bits), Ready; multiple cycles per access
Finite State Machines
Cache Controller FSM: could partition into separate states to reduce clock cycle time
§5.8 Parallelism and Memory Hierarchies: Cache Coherence
Cache Coherence Problem

| Time step | Event | CPU A's cache | CPU B's cache | Memory |
|---|---|---|---|---|
| 0 | | | | 0 |
| 1 | CPU A reads X | 0 | | 0 |
| 2 | CPU B reads X | 0 | 0 | 0 |
| 3 | CPU A writes 1 to X | 1 | 0 | 1 |

Coherence Defined
Cache Coherence Protocols
Invalidating Snooping Protocols

| CPU activity | Bus activity | CPU A's cache | CPU B's cache | Memory |
|---|---|---|---|---|
| | | | | 0 |
| CPU A reads X | Cache miss for X | 0 | | 0 |
| CPU B reads X | Cache miss for X | 0 | 0 | 0 |
| CPU A writes 1 to X | Invalidate for X | 1 | | 0 |
| CPU B reads X | Cache miss for X | 1 | 1 | 1 |

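The invalidation sequence above can be replayed with a toy model. This is a simplified sketch under stated assumptions, not a real protocol implementation: the `Bus` and `Cache` classes are hypothetical, each cache is write-back, every write broadcasts an invalidate, and a dirty copy held by another cache is written back to memory when a miss for that address is snooped.

```python
class Bus:
    """Shared bus: holds main memory and records bus activity."""
    def __init__(self, memory):
        self.memory = dict(memory)
        self.caches = []
        self.log = []


class Cache:
    def __init__(self, name, bus):
        self.name = name
        self.bus = bus
        self.lines = {}          # address -> [value, dirty]
        bus.caches.append(self)

    def read(self, addr):
        if addr not in self.lines:
            self.bus.log.append(f"Cache miss for {addr}")
            # Snoop: a dirty copy elsewhere is written back before the fill.
            for other in self.bus.caches:
                line = other.lines.get(addr)
                if other is not self and line is not None and line[1]:
                    self.bus.memory[addr] = line[0]
                    line[1] = False
            self.lines[addr] = [self.bus.memory[addr], False]
        return self.lines[addr][0]

    def write(self, addr, value):
        self.bus.log.append(f"Invalidate for {addr}")
        for other in self.bus.caches:
            if other is not self:
                other.lines.pop(addr, None)   # drop stale copies
        self.lines[addr] = [value, True]      # memory is updated lazily


bus = Bus({"X": 0})
a, b = Cache("A", bus), Cache("B", bus)
a.read("X")
b.read("X")
a.write("X", 1)
memory_after_write = bus.memory["X"]   # still the old value
b_value = b.read("X")                  # B misses and sees A's write
print(bus.log, memory_after_write, b_value, bus.memory["X"])
```

After the final read, both caches and memory hold the new value, matching the last row of the table.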
Memory Consistency
§5.10 Real Stuff: The AMD Opteron X4 and Intel Nehalem
Multilevel On-Chip Caches
Intel Nehalem 4-core processor. Per core: 32KB L1 I-cache, 32KB L1 D-cache, 512KB L2 cache
2-Level TLB Organization

| | Intel Nehalem | AMD Opteron X4 |
|---|---|---|
| Virtual addr | 48 bits | 48 bits |
| Physical addr | 44 bits | 48 bits |
| Page size | 4KB, 2/4MB | 4KB, 2/4MB |
| L1 TLB (per core) | L1 I-TLB: 128 entries for small pages, 7 per thread (2×) for large pages; L1 D-TLB: 64 entries for small pages, 32 for large pages; both 4-way, LRU replacement | L1 I-TLB: 48 entries; L1 D-TLB: 48 entries; both fully associative, LRU replacement |
| L2 TLB (per core) | Single L2 TLB: 512 entries; 4-way, LRU replacement | L2 I-TLB: 512 entries; L2 D-TLB: 512 entries; both 4-way, round-robin LRU |
| TLB misses | Handled in hardware | Handled in hardware |

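One way to read the entry counts above is through TLB reach, meaning the amount of memory a TLB can map at once (entries times page size). The slides do not state this calculation; the helper name `tlb_reach` is hypothetical and only the 4KB small-page figures are used.

```python
KB = 1024


def tlb_reach(entries: int, page_size_bytes: int) -> int:
    """Bytes of memory covered when every TLB entry maps one page."""
    return entries * page_size_bytes


# Entry counts taken from the table; 4KB small pages assumed throughout.
nehalem_l1_dtlb = tlb_reach(64, 4 * KB)
nehalem_l2_tlb = tlb_reach(512, 4 * KB)
opteron_l1_dtlb = tlb_reach(48, 4 * KB)

print(nehalem_l1_dtlb // KB, "KB")   # 256 KB
print(nehalem_l2_tlb // KB, "KB")    # 2048 KB
print(opteron_l1_dtlb // KB, "KB")   # 192 KB
```

Large pages extend the reach of each entry, which is one reason both processors support them.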
3-Level Cache Organization

| | Intel Nehalem | AMD Opteron X4 |
|---|---|---|
| L1 caches (per core) | L1 I-cache: 32KB, 64-byte blocks, 4-way, approx LRU replacement, hit time n/a; L1 D-cache: 32KB, 64-byte blocks, 8-way, approx LRU replacement, write-back/allocate, hit time n/a | L1 I-cache: 32KB, 64-byte blocks, 2-way, LRU replacement, hit time 3 cycles; L1 D-cache: 32KB, 64-byte blocks, 2-way, LRU replacement, write-back/allocate, hit time 9 cycles |
| L2 unified cache (per core) | 256KB, 64-byte blocks, 8-way, approx LRU replacement, write-back/allocate, hit time n/a | 512KB, 64-byte blocks, 16-way, approx LRU replacement, write-back/allocate, hit time n/a |
| L3 unified cache (shared) | 8MB, 64-byte blocks, 16-way, replacement n/a, write-back/allocate, hit time n/a | 2MB, 64-byte blocks, 32-way, replace block shared by fewest cores, write-back/allocate, hit time 32 cycles |

n/a: data not available

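The capacity, block size, and associativity figures above determine how many sets each cache has (capacity divided by block size and by the number of ways). The following sketch works that arithmetic through; the function name `num_sets` is hypothetical and the derived set counts do not appear in the slides.

```python
KB = 1024
MB = 1024 * KB


def num_sets(capacity_bytes: int, block_bytes: int, ways: int) -> int:
    """Number of sets in a set-associative cache."""
    blocks = capacity_bytes // block_bytes
    if blocks % ways != 0:
        raise ValueError("ways must divide the number of blocks")
    return blocks // ways


# (capacity, block size, ways) taken from the table.
caches = {
    "Nehalem L1 I": (32 * KB, 64, 4),
    "Nehalem L1 D": (32 * KB, 64, 8),
    "Nehalem L2": (256 * KB, 64, 8),
    "Nehalem L3": (8 * MB, 64, 16),
    "Opteron L1 I/D": (32 * KB, 64, 2),
    "Opteron L2": (512 * KB, 64, 16),
    "Opteron L3": (2 * MB, 64, 32),
}

sets = {name: num_sets(*params) for name, params in caches.items()}
for name, count in sets.items():
    print(f"{name}: {count} sets")
```

Equal capacity does not imply equal indexing: the two 32KB Nehalem L1 caches differ in set count because their associativity differs.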
Miss Penalty Reduction
§5.11 Fallacies and Pitfalls
Pitfalls
Pitfalls
Pitfalls
§5.12 Concluding Remarks
Concluding Remarks

Editor's Notes

  1. Morgan Kaufmann Publishers 9 March 2010 Chapter 5 — Large and Fast: Exploiting Memory Hierarchy