This document summarizes a study on evaluating cache performance for multiprocessor systems. The study investigated how the number of processors and cache coherency protocols affect level 1 data cache performance. It used a Linux-based SystemC simulation with CPU, cache and memory modules to test snooping and directory-based coherency protocols. The results showed that execution time increases with more processors, snooping has a direct effect on caches, and directory-based protocols have a slight performance edge over snooping.
Cache Performance Evaluation for a Multiprocessor System
2. Cache Performance Evaluation for a Multiprocessor System
Drs. Alfred Mutanga
Management Information Systems Specialist
University of Venda
Date: 17 November 2013
Venue: Novotel Hotel, World Trade Centre, Dubai, UAE
3. Why a Cache Performance Evaluation System?
The Memory Wall
– Memory and processor speeds
• Processor speeds have been rising dramatically, at roughly 75% p.a.
• Memory clock speeds at a paltry 7% p.a.
– Result: a growing divergence in operating speeds
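The divergence can be made concrete with a small back-of-the-envelope calculation. This is an illustrative sketch using only the two growth figures quoted on the slide (75% and 7% per annum); the function name is an assumption, not part of the study.

```cpp
#include <cmath>

// Illustrative "memory wall" arithmetic: if processor speed grows ~75% per
// year and memory speed ~7% per year (figures from the slide), the relative
// processor-to-memory speed gap after n years is (1.75 / 1.07)^n.
double speed_gap(int years) {
    return std::pow(1.75 / 1.07, years);
}
```

After a decade of such growth the gap exceeds two orders of magnitude, which is why the slides turn to the memory hierarchy next.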
4. Research Questions
1. To what extent does the number of processors in multiprocessor architectures affect the performance of level-one (L1) data cache memory systems?
2. How do cache coherency protocols influence the L1 data cache performance of multiprocessor architectures?
5. Theoretical Framework
The challenges of multi-core architectures:
– Programmability
– Scalability
– Communications
– Management of heterogeneous architectures
– Cache memory systems
– Attempts to increase memory bandwidth by introducing concurrency in memory access
– These required regular memory access patterns, and irregular patterns resulted in degraded memory performance
6. Memory Hierarchy
Architectural issues of the memory hierarchy:
– It brings conflicting requirements to the memory system
• Computing systems require a large and fast memory to scale up performance
– The hierarchy attempts to make slow memory appear fast by buffering data in smaller, faster memories close to the CPUs
– Electronic systems slow down as they increase in size (a compromise between power and performance)
– The most common solution to the memory wall is to cache data
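The benefit of buffering data in a small fast memory is usually summarized by the standard average memory access time (AMAT) formula. This is a textbook sketch added for illustration; the parameter values below are examples, not measurements from the study.

```cpp
// Average memory access time: the cache's hit time plus the fraction of
// accesses that miss, each paying the penalty of going to slow memory.
// AMAT = hit_time + miss_rate * miss_penalty
double amat(double hit_time, double miss_rate, double miss_penalty) {
    return hit_time + miss_rate * miss_penalty;
}
```

With a 1-cycle hit, a 100-cycle memory, and a 5% miss rate, the average access costs about 6 cycles, which is how a small cache makes a slow memory "appear fast".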
7. Research Methodology
[Figure: four CPUs, each with a private cache, connected through an interconnection network to shared memory]
• Linux environment – Arch Linux
• SystemC
• Memory trace files
• Fast Fourier trace files
• Random trace files
• Debugging trace files
• Distributed shared memory system
• Cache coherence protocols
• Snoopy (valid-invalid)
• Directory-based (MOESI)
• Cache memory
• 32KB Level-1 data cache
• 32-byte line
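The cache geometry on this slide (32KB Level-1 data cache, 32-byte lines) fixes how an address splits into offset, index, and tag. The sketch below assumes a direct-mapped organization for illustration; the slides do not state the associativity, and the constant and function names are not taken from the study's SystemC modules.

```cpp
#include <cstdint>

// Address decomposition for a 32KB cache with 32-byte lines, assuming a
// direct-mapped organization (an illustrative assumption).
constexpr uint32_t kCacheSize  = 32 * 1024;              // 32KB
constexpr uint32_t kLineSize   = 32;                     // 32-byte line
constexpr uint32_t kNumLines   = kCacheSize / kLineSize; // 1024 lines
constexpr uint32_t kOffsetBits = 5;                      // log2(32)
constexpr uint32_t kIndexBits  = 10;                     // log2(1024)

uint32_t line_offset(uint32_t addr) { return addr & (kLineSize - 1); }
uint32_t line_index(uint32_t addr)  { return (addr >> kOffsetBits) & (kNumLines - 1); }
uint32_t line_tag(uint32_t addr)    { return addr >> (kOffsetBits + kIndexBits); }
```

Each memory-trace address the simulator replays would be decoded this way before the cache lookup.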
8. Design and Implementation in SystemC
• Memory module – simulated the shared bulk memory (RAM)
• CPU module – connected to the other modules, such as the cache and memory, through the appropriate ports
• Cache module – defined the cache properties and macros used throughout the simulation
• Simple bus module – connected to the different address ports in the cache using an appropriate bus signal
• Cache helper libraries – files that collected traces of the memory requests during each program execution
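The snoopy (valid-invalid) protocol named on the methodology slide can be sketched in a few lines: every cache observes bus writes, and a write by one CPU invalidates matching copies elsewhere. This is a minimal illustrative model, not the study's actual SystemC cache module; all names here are assumptions.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Write-invalidate ("valid-invalid") snooping, minimally sketched.
enum class LineState { Invalid, Valid };

struct CacheLine {
    LineState state = LineState::Invalid;
    uint32_t  tag   = 0;
};

struct SnoopyCache {
    std::vector<CacheLine> lines;
    explicit SnoopyCache(size_t n) : lines(n) {}

    // Local read: returns true on a hit; a miss fetches and marks Valid.
    bool read(uint32_t tag, size_t idx) {
        CacheLine& l = lines[idx];
        bool hit = (l.state == LineState::Valid && l.tag == tag);
        if (!hit) { l.tag = tag; l.state = LineState::Valid; }  // fill on miss
        return hit;
    }

    // Snooped bus write from another CPU: invalidate our copy if it matches.
    void snoop_write(uint32_t tag, size_t idx) {
        CacheLine& l = lines[idx];
        if (l.state == LineState::Valid && l.tag == tag)
            l.state = LineState::Invalid;
    }
};
```

A read that hit before a remote write misses afterwards, which is the "direct effect on cache" the conclusions attribute to snooping.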
13. Conclusions
• Write-invalidate protocols need management of dynamic requests
• Execution time increases with the number of processors
• Snooping has a direct effect on the caches
• Synchronization of caches and optimizations in the compiler can increase cache performance
• Directory-based cache coherency protocols have a slight performance edge over snooping cache coherency protocols
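One intuition for the directory protocol's slight edge: a directory tracks exactly which caches share each line, so invalidations go only to those sharers instead of being broadcast on the bus. The sketch below is an illustrative directory entry, not a structure from the study's MOESI implementation; the four-CPU size mirrors the methodology diagram, and all field names are assumptions.

```cpp
#include <bitset>
#include <cstddef>

// Illustrative directory entry: one per memory line in a directory-based
// protocol. The sharer set lets a write target only the caches that actually
// hold a copy, avoiding the broadcast traffic of snooping.
constexpr size_t kMaxCpus = 4;  // matches the four-CPU simulated system

struct DirectoryEntry {
    std::bitset<kMaxCpus> sharers;  // which caches hold a copy
    int owner = -1;                 // cache holding the line in an owned state

    // Caches that must be invalidated when `writer` writes this line.
    std::bitset<kMaxCpus> invalidate_targets(size_t writer) const {
        std::bitset<kMaxCpus> t = sharers;
        t.reset(writer);  // the writer keeps its own (now exclusive) copy
        return t;
    }
};
```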