Distributed Memory Architecture / Non-Shared MIMD Architecture
1. Distributed Memory Architecture
MS(CS) - I
Hafsa Habib
Syeda Haseeba Khanam
Amber Azhar
Zainab Khalid
Lahore College for Women University
Department of Computer Science
2. Content
● MIMD processor classification
● Distributed MIMD architecture
○ Basic difference between DM-MIMD and SM-MIMD
● Communication Techniques of DM-MIMD
● Major classification of DM-MIMD
○ NUMA
○ MPP
○ Cluster
● Pros and Cons of DM-MIMD over SM-MIMD architecture
○ Scalability
○ Issues in scalability
5. Non-Shared MIMD Architecture
● Also called Distributed Memory MIMD, Message Passing MIMD, or loosely coupled MIMD
● Each processor has its own local memory
○ A memory address on one processor does not map onto other processors
○ There is no concept of a global address space
● Each processor operates independently because of its own local memory
○ Changes in one processor’s local memory have no effect on another processor’s local memory
○ Therefore cache synchronization and cache coherency do not apply
● Inter-process communication is done by message passing
7. DM-MIMD vs SM-MIMD
DM-MIMD
● Private physical address space for each processor
● Data must be explicitly assigned to the private address space
● Communication/synchronization via the network by message passing
● Cache coherency does not apply because there is no global address space
SM-MIMD
● Global address space shared by all processors
● Data is implicitly assigned to the address space
● Processors cooperate by reading/writing the same shared variables
● Communication through the bus
● Cache coherency applies due to the shared global address space
10. Communication in DM Architecture
● Requires a communication NETWORK to connect inter-processor memory
● Communication and synchronization are done through the message passing model
● Processors share data by explicitly sending and receiving information
● Coordination is built into the message passing primitives
○ message SEND and message RECEIVE
11. Why Does DM Architecture Use the Message Passing Model?
In distributed memory architecture there is no global memory, so it is necessary to move data from one local memory to another by means of message passing.
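As a concrete illustration, the explicit SEND/RECEIVE exchange can be sketched with Python's `multiprocessing.Pipe`. This is only a single-machine stand-in: the pipe plays the role of the interconnection network, and the worker process plays the role of a remote processor with its own private memory.

```python
from multiprocessing import Process, Pipe

def worker(conn):
    # RECEIVE: block until the master processor sends a message.
    data = conn.recv()
    # SEND: reply with a result -- the only way to share data, since
    # the worker has no access to the master's address space.
    conn.send(data * 2)
    conn.close()

def exchange(value):
    master_end, worker_end = Pipe()
    p = Process(target=worker, args=(worker_end,))
    p.start()
    master_end.send(value)      # explicit SEND over the "interconnect"
    result = master_end.recv()  # explicit RECEIVE of the reply
    p.join()
    return result
```

Note that no variable is shared between the two processes; the only coordination is the send/receive pair, exactly as the slide describes.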
12. Message Passing Model
● Communication via send/receive
○ through the interconnection network
● Data is packed into larger packets
● SEND transmits a message to a destination processor
● RECEIVE indicates that a processor is ready to accept a message from a source processor
13. Message Passing Model (cont’d)
● When a process interacts with another, two requirements have to be satisfied:
○ synchronization and communication
● Synchronization in the message passing model is either asynchronous or synchronous
○ If asynchronous, no acknowledgement is required at either end (sender or receiver)
■ Sender and receiver don’t wait for each other and can carry on their own computations while the message transfer takes place
○ If synchronous, acknowledgement is required
■ Both processors have to wait for each other while transferring the message (one blocks until the second is ready)
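The asynchronous/synchronous distinction above can be sketched by holding both endpoints of a `multiprocessing.Pipe` in one process (`demo` is a made-up helper for illustration; in a real system the two endpoints live on different processors):

```python
from multiprocessing import Pipe

def demo():
    sender, receiver = Pipe()
    # Asynchronous style: send() returns immediately, so the sender can
    # carry on computing without waiting for the receiver.
    sender.send("hello")
    arrived = receiver.poll()       # check for a message without blocking
    msg = receiver.recv()           # now guaranteed not to block
    # Synchronous style would call recv() directly and block until the
    # partner is ready; poll(0.1) shows the channel is now empty.
    empty = not receiver.poll(0.1)
    return arrived, msg, empty
```

The blocking `recv()` is what makes synchronous message passing double as a synchronization point: the receiver cannot proceed until the sender has transmitted.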
15. Pros and Cons of the Message Passing Model
Pros
● The advantage for programmers is that communication is explicit, so there are fewer “performance surprises” than with the implicit communication in cache-coherent SMPs
● Synchronization is naturally associated with sending messages, reducing the possibility of errors introduced by incorrect synchronization
● Much easier for hardware designers to design
Cons
● Message sending and receiving is much slower
● It is harder to port a sequential program to a message passing multiprocessor
17. Differences: Distributed Memory vs Shared Memory Architecture
1. Explicit vs implicit communication
○ DM: explicit, via messages
○ SM: implicit, via memory operations
2. Who is responsible for carrying out the communication task?
○ DM: the programmer is responsible for sending and receiving data
○ SM: sending and receiving is automatic; the system is responsible for placing data in the cache, and the programmer just loads from and stores to memory
3. Synchronization
○ DM: automatic
○ SM: can be achieved using different mechanisms
4. Protocols
○ DM: fully under programmer control
○ SM: hidden within the system
22. NUMA (Non-Uniform Memory Access)
● NUMA is a computer memory design used in multiprocessing, where the memory access time depends on the memory location relative to the processor
● Under NUMA, a processor can access its own local memory faster than non-local memory (memory local to another processor, or memory shared between processors)
● The benefits of NUMA are limited to particular workloads, notably on servers where the data is often strongly associated with certain tasks or users
● There are two morals to this performance story
○ The first is that even a single 32-bit, but already commonplace, processor is starting to push the limits of standard memory performance
○ The second is that even differences among conventional memory types play a role in overall system performance, so it should come as no surprise that NUMA support is now in server operating systems, e.g., Microsoft’s Windows Server 2003 and the Linux 2.6 kernel
26. What is a Cluster?
● A network of independent computers
○ Each has private memory and an OS
○ Connected using the I/O system, e.g., Ethernet/switch, Internet
● Independent computers in a cluster are called nodes
○ Master and computing nodes
● Cluster middleware is required
○ e.g., a Message Passing Interface (MPI)
● Node management has to be considered
● The cluster appears as a single system to the user
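A toy version of the master/computing-node split can be sketched with a process pool. This is a hypothetical single-machine example; a real cluster distributes tasks across machines through middleware such as MPI.

```python
from multiprocessing import Pool

def compute(task):
    # Each computing "node" works independently on its own task,
    # with no state shared between workers.
    return task * task

def master(tasks, nodes=2):
    # The master farms independent tasks out to the computing nodes
    # and gathers the results, presenting a single system to the user.
    with Pool(processes=nodes) as pool:
        return pool.map(compute, tasks)
```

The caller only sees `master(...)` return results, mirroring how a cluster appears as a single system to the user.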
27. Clusters
Clusters split a problem into smaller tasks that are executed concurrently.
Why?
● Absolute physical limits of hardware components
● Economical reasons: more complex = more expensive
● Performance limits: doubling the frequency does not double performance
● Large applications demand too much memory and time
Advantages:
● Increased speed and optimized resource utilization, largely independent of hardware
Disadvantages:
● Complex programming models and difficult development
Applications:
● Suitable for applications with independent tasks, e.g., supercomputers, web servers, databases, simulations
29. Clusters vs MPPs
Similar to MPPs:
● Commodity processors and memory
○ Processor performance must be maximized
● The memory hierarchy includes remote memory
○ Non-Uniform Memory Access
● No shared memory; message passing instead
30. Clusters vs MPPs (cont’d)
Clusters
● In a cluster, each machine is largely independent of the others in terms of memory, disk, etc.
● The machines are interconnected using some variation on normal networking
● The cluster exists mostly in the mind of the programmer and in how s/he chooses to distribute the work
● Best used in servers with multiple independent tasks
MPPs
● In a Massively Parallel Processor, there really is only one machine with thousands of CPUs tightly interconnected with the I/O subsystem
● MPPs have exotic memory architectures that allow extremely high-speed exchange of intermediate results with neighboring processors
● MPPs are of use only on algorithms that are embarrassingly parallel
33. Pros of DM-MIMD over SM-MIMD
DM-MIMD
● Memory is scalable with the number of processors: increase the number of processors and the size of memory increases proportionately
● Each processor can rapidly access its own memory without interference and without the overhead of trying to maintain global cache coherency
● Cost effectiveness: can use commodity, off-the-shelf processors and networking
SM-MIMD
● Lack of scalability between memory and CPUs: adding more CPUs can geometrically increase traffic on the shared memory-CPU path, and geometrically increase traffic associated with cache/memory management
● Expense: it becomes increasingly difficult and expensive to design and produce shared memory machines with an ever increasing number of processors
34. Cons of DM-MIMD over SM-MIMD
DM-MIMD
● Non-uniform memory access times: data residing on a remote node takes longer to access than local data
● The programmer is responsible for many of the details associated with data communication between processors
SM-MIMD
● Data sharing between tasks is both fast and uniform due to the proximity of memory to CPUs
● The global address space provides a user-friendly programming perspective on memory
35. Issues of DM Architecture
Latency and bandwidth for accessing distributed memory are the main memory performance issues:
● Efficiency in parallel processing is usually related to the ratio of time for calculation vs time for communication; the higher the ratio, the higher the performance.
● The problem is even more severe when access to distributed memory is needed, since there is an extra level in the memory hierarchy, with latency and bandwidth that can be much worse than for local memory access.
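A standard first-order way to quantify this (the alpha-beta model, a textbook approximation not taken from the slides) treats transfer time as latency plus message size over bandwidth:

```python
def transfer_time(message_bytes, latency_s, bandwidth_bytes_per_s):
    # Linear (alpha-beta) cost model: T = alpha + n / beta, where alpha
    # is the per-message latency and beta is the link bandwidth.
    return latency_s + message_bytes / bandwidth_bytes_per_s

# One 1 MB message over a 1 GB/s link with 1 us latency is dominated by
# the bandwidth term, while sending the same data as 1000 messages of
# 1 KB pays the latency 1000 times over -- which is why the ratio of
# calculation to communication governs efficiency.
one_big = transfer_time(1_000_000, 1e-6, 1e9)
many_small = 1000 * transfer_time(1_000, 1e-6, 1e9)
```

Under this model, batching data into fewer, larger messages (as the message packing on slide 12 suggests) amortizes the latency term.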
36. Scalability and its Issues
A scalable architecture is an architecture that can scale up to meet increased workloads. In other words, if the workload all of a sudden exceeds the capacity of your existing software + hardware combination, you can scale up the system (software + hardware) to meet the increased workload.
Scalability to more processors is the key issue:
● Access times to “distant” processors should not be very much slower than access to “nearby” processors, since non-local and collective (all-to-all) communication is important for many programs. This can be a problem for large parallel computers (hundreds or thousands of processors). Many different approaches to network topology and switching have been tried in attempting to alleviate this problem.
DM: Protocols are complex for the programmer, causing communication to be treated as an I/O call.
SM: Communication can be close to the hardware because of the shared bus system, and if we modify the shared memory hardware, communication can be made faster.
Commodity computing involves the use of large numbers of already-available computing components for parallel computing, to get the greatest amount of useful computation at low cost.
However, if you have such a problem, then an MPP can be shockingly fast.
Latency is the amount of time a message takes to traverse a system. In a computer network, it is an expression of how much time it takes for a packet of data to get from one designated point to another. It is sometimes measured as the time required for a packet to be returned to its sender.