SlideShare una empresa de Scribd logo
1 de 53
Advanced Computer Architecture
System Interconnect Architectures
System Interconnect Architectures
Direct networks for static connections
Indirect networks for dynamic connections
Networks are used for
internal connections in a centralized system among
• processors
• memory modules
• I/O disk arrays
distributed networking of multicomputer nodes
EENG-630 - Chapter 2
System Interconnect Architecture
Static and dynamic networks for interconnecting
computer subsystems or for constructing
multiprocessors or multicomputers
Ideal: Construct a low-latency network with a high
data transfer rate
EENG-630 - Chapter 2
Network Properties and Routing
Static Networks: Point-to-Point direct connections
which will not change during program execution
Dynamic Networks: Implemented with switched
channels. They are dynamically configured to
match the communication demand in user
programs
Goals and Analysis
The goals of an interconnection network are to
provide
low-latency
high data transfer rate
wide communication bandwidth
Analysis includes
latency
bisection bandwidth
data-routing functions
scalability of parallel architecture
Network Properties and Routing
Static networks: point-to-point direct connections
that will not change during program execution
Dynamic networks:
switched channels dynamically configured to match user
program communication demands
include buses, crossbar switches, and multistage
networks
Both network types also used for inter-PE data
routing in SIMD computers
Terminology - 1
Network usually represented by a graph with a finite
number of nodes linked by directed or undirected edges.
Number of nodes in graph = network size .
Number of edges (links or channels) incident on a node =
node degree d (also note in and out degrees when edges
are directed). Node degree reflects number of I/O ports
associated with a node, and should ideally be small and
constant.
Diameter D of a network is the maximum shortest path
between any two nodes, measured by the number of links
traversed; this should be as small as possible (from a
communication point of view).
Terminology - 2
Channel bisection width b = minimum number of edges cut
to split a network into two parts each having the same
number of nodes. Since each channel has w bit wires, the
wire bisection width B = bw. Bisection width provides good
indication of maximum communication bandwidth along the
bisection of a network, and all other cross sections should
be bounded by the bisection width.
Wire (or channel) length = length (e.g. weight) of edges
between nodes.
Network is symmetric if the topology is the same looking
from any node; these are easier to implement or to
program.
Other useful characterizing properties: homogeneous
nodes? buffered channels? nodes are switches?
Data Routing Functions
Shifting
Rotating
Permutation (one to one)
Broadcast (one to all)
Multicast (many to many)
Personalized broadcast (one to many)
Shuffle
Exchange
Etc.
Permutations
Given n objects, there are n ! ways in which they
can be reordered (one of which is no reordering).
A permutation can be specified by giving the rule fo
reordering a group of objects.
Permutations can be implemented using crossbar
switches, multistage networks, shifting, and
broadcast operations. The time required to
perform permutations of the connections between
nodes often dominates the network performance
when n is large.
Perfect Shuffle and Exchange
Stone suggested the special permutation that
entries according to the mapping of the k-bit
binary number a b … k to b c … k a (that is,
shifting 1 bit to the left and wrapping it around to
the least significant bit position).
The inverse perfect shuffle reverses the effect of
the perfect shuffle.
Hypercube Routing Functions
If the vertices of a n-dimensional cube are labeled
with n-bit numbers so that only one bit differs
between each pair of adjacent vertices, then n
routing functions are defined by the bits in the node
(vertex) address.
For example, with a 3-dimensional cube, we can
easily identify routing functions that exchange data
between nodes with addresses that differ in the
least significant, most significant, or middle bit.
Factors Affecting Performance
Functionality – how the network supports data routing,
interrupt handling, synchronization, request/message
combining, and coherence
Network latency – worst-case time for a unit message to be
transferred
Bandwidth – maximum data rate
Hardware complexity – implementation costs for wire, logic,
switches, connectors, etc.
Scalability – how easily does the scheme adapt to an
increasing number of processors, memories, etc.?
Static Networks
Linear Array
Ring and Chordal Ring
Barrel Shifter
Tree and Star
Fat Tree
Mesh and Torus
Sample Static Network TopologiesSample Static Network Topologies
Linear
Ring
2D Mesh
Hybercube
Binary Tree Fat Binary Tree Fully Connected
(Static or point-to-point)
2D
3D
4D
Higher link bandwidth
Closer to root
Static Networks – Linear Array
N nodes connected by n-1 links (not a bus);
segments between different pairs of nodes can be
used in parallel.
Internal nodes have degree 2; end nodes have
degree 1.
Diameter = n-1
Bisection = 1
For small n, this is economical, but for large n, it is
obviously inappropriate.
Static Networks – Ring, Chordal Ring
Like a linear array, but the two end nodes are
connected by an n th
link; the ring can be uni- or bi-
directional. Diameter is n/2 for a bidirectional
ring, or n for a unidirectional ring.
By adding additional links (e.g. “chords” in a circle),
the node degree is increased, and we obtain a
chordal ring. This reduces the network diameter.
In the limit, we obtain a fully-connected network,
with a node degree of n -1 and a diameter of 1.
Static Networks – Tree and Star
A k-level completely balanced binary tree will have
N = 2k – 1 nodes, with maximum node degree of 3
and network diameter is 2(k – 1).
The balanced binary tree is scalable, since it has a
constant maximum node degree.
A star is a two-level tree with a node degree d = N
– 1 and a constant diameter of 2.
Static Networks – Fat Tree
A fat tree is a tree in which the number of edges
between nodes increases closer to the root (similar
to the way the thickness of limbs increases in a
real tree as we get closer to the root).
The edges represent communication channels
(“wires”), and since communication traffic
increases as the root is approached, it seems
logical to increase the number of channels there.
Static Networks – Mesh and Torus
Pure mesh – N = n k
nodes with links between each
adjacent pair of nodes in a row or column (or higher
degree). This is not a symmetric network; interior node
degree d = 2k, diameter = k (n – 1).
Illiac mesh (used in Illiac IV computer) – wraparound is
allowed, thus reducing the network diameter to about half
that of the equivalent pure mesh.
A torus has ring connections in each dimension, and is
symmetric. An n × n binary torus has node degree of 4 and
a diameter of 2 × n / 2 .
Static Networks – Systolic Array
A systolic array is an arrangement of processing
elements and communication links designed
specifically to match the computation and
communication requirements of a specific
algorithm (or class of algorithms).
This specialized character may yield better
performance than more generalized structures, but
also makes them more expensive, and more
difficult to program.
Static Networks – Hypercubes
A binary n-cube architecture with N = 2n
nodes
spanning along n dimensions, with two nodes per
dimension.
The hypercube scalability is poor, and packaging is
difficult for higher-dimensional hypercubes.
Static Networks – Cube-connected
Cycles
k-cube connected cycles (CCC) can be created
from a k-cube by replacing each vertex of the k-
dimensional hypercube by a ring of k nodes.
A k-cube can be transformed to a k-CCC with k ×
2k
nodes.
The major advantage of a CCC is that each node
has a constant degree (but longer latency) than in
the corresponding k-cube. In that respect, it is
more scalable than the hypercube architecture.
Static Networks – k-ary n-Cubes
Rings, meshes, tori, binary n-cubes, and Omega
networks (to be seen) are topologically isomorphic
to a family of k-ary n-cube networks.
n is the dimension of the cube, and k is the radix,
or number of of nodes in each dimension.
The number of nodes in the network, N, is k n
.
Folding (alternating nodes between connections)
can be used to avoid the long “end-around” delays
in the traditional implementation.
Static Networks – k-ary n-Cubes
The cost of k-ary n-cubes is dominated by the
amount of wire, not the number of switches.
With constant wire bisection, low-dimensional
networks with wider channels provide lower
latecny, less contention, and higher “hot-spot”
throughput than higher-dimensional networks with
narrower channels.
Network Throughput
Network throughput – number of messages a network can
handle in a unit time interval.
One way to estimate is to calculate the maximum number of
messages that can be present in a network at any instant
(its capacity); throughput usually is some fraction of its
capacity.
A hot spot is a pair of nodes that accounts for a
disproportionately large portion of the total network traffic
(possibly causing congestion).
Hot spot throughput is maximum rate at which messages
can be sent between two specific nodes.
Minimizing Latency
Latency is minimized when the network radix k and
dimension n are chose so as to make the
components of latency due to distance (# of hops)
and the message aspect ratio L / W (message
length L divided by the channel width W )
approximately equal.
This occurs at a very low dimension. For up to
1024 nodes, the best dimension (in this respect) is
2.
Dynamic Connection Networks
Dynamic connection networks can implement all
communication patterns based on program demands.
In increasing order of cost and performance, these include
bus systems
multistage interconnection networks
crossbar switch networks
Price can be attributed to the cost of wires, switches,
arbiters, and connectors.
Performance is indicated by network bandwidth, data
transfer rate, network latency, and communication patterns
supported.
Dynamic Networks – Bus Systems
A bus system (contention bus, time-sharing bus) has
a collection of wires and connectors
multiple modules (processors, memories, peripherals, etc.) which
connect to the wires
data transactions between pairs of modules
Bus supports only one transaction at a time.
Bus arbitration logic must deal with conflicting requests.
Lowest cost and bandwidth of all dynamic schemes.
Many bus standards are available.
A bus is the simplest type od dynamic
interconnection networks. It constitutes a common
data transfer path for many devices. Depending on
the type of implemented transmissions we
have serial busses and parallel busses. The
devices connected to a bus can be processors,
memories, I/O units, as shown in the figure below.
Dynamic bus
The throughput of the network based on a bus can be increased by the use of
a multibus network shown in the figure below. In this network, processors
connected to the busses can transmit data in parallel (one for each bus) and
many processors can read data from many busses at a time.
Dynamic Networks – Switch Modules
An a × b switch module has a inputs and b outputs.
A binary switch has a = b = 2.
It is not necessary for a = b, but usually a = b = 2k
,
for some integer k.
In general, any input can be connected to one or
more of the outputs. However, multiple inputs may
not be connected to the same output.
When only one-to-one mappings are allowed, the
switch is called a crossbar switch.
Multistage Networks
In general, any multistage network is comprised of a
collection of a × b switch modules and fixed network
modules. The a × b switch modules are used to provide
variable permutation or other reordering of the inputs, which
are then further reordered by the fixed network modules.
A generic multistage network consists of a sequence
alternating dynamic switches (with relatively small values
for a and b) with static networks (with larger numbers of
inputs and outputs). The static networks are used to
implement interstage connections (ISC).
Single-stage networks
Single stage Shuffle-
Exchange IN (left)
Perfect shuffle mapping
function (right)
Perfect shuffle operation:
cyclic shift 1 place left, eg
101 --> 011
Exchange operation: invert
least significant bit, e.g. 101
--> 100
In single stage networks there are different methods needed to change the
interconnection pattern so that a given input is sent to the desired output.
However with one shuffle or exchange operation the desired output might
not be given so multiple shuffles will need to occur. Which is why a
multistage network was build.
Multistage Interconnection Networks
Self-routing network. Since a single-stage network was not
sufficient to get the desired output, a multistage is built. An
example of a multistage network is the omega network.
The capability of single stage networks are limited but if we cascade enough of
them together, they form a completely connected MIN (Multistage
Interconnection Network).
Switches can perform their own routing or can be controlled by a central router
This type of networks can be classified into the following four categories:
Nonblocking
A network is called strictly nonblocking if it can connect any idle input to any idle
output regardless of what other connections are currently in process
Rearrangeable nonblocking
In this case a network should be able to establish all possible connections between
inputs and outputs by rearranging its existing connections.
Blocking interconnection
A network is said to be blocking if it can perform many, but not all, possible
connections between terminals.
Example: the Omega network
Omega Network
A 2 × 2 switch can be configured for
Straight-through
Crossover
Upper broadcast (upper input to both outputs)
Lower broadcast (lower input to both outputs)
(No output is a somewhat vacuous possibility as well)
With four stages of eight 2 × 2 switches, and a static perfect
shuffle for each of the four ISCs, a 16 by 16 Omega
network can be constructed (but not all permutations are
possible).
In general , an n-input Omega network requires log 2 n
stages of 2 × 2 switches and n / 2 switch modules.
Example:
• Connect input 101 to output 001
• Use the bits of the destination
address, 001, for dynamically
selecting a path
• Routing:
- 0 means use upper output
- 1 means use lower output
A multi-stage IN using 2 × 2 switch boxes and a perfect shuffle interconnect pattern between the
stages
In the Omega MIN there is one unique path from each input to each output.
No redundant paths → no fault tolerance and the possibility of blocking
the inter-stage connections are created recursively using
the method of cyclic bit shifting.
The bit representation of an input is shifted right by 1 bit to
make the output number.
Between stages the next stage is always cut into 2 separate
sections. So the second stage would have 2 sections and
the third stage would have 4 sections and so on.
The switch rules are similar to Omega networks.
Baseline Network
A baseline network can be shown to be topologically
equivalent to other networks (including Omega), and has a
simple recursive generation procedure.
Stage k (k = 0, 1, …) is an m × m switch block (where m =
N / 2k
) composed entirely of 2 × 2 switch blocks, each
having two configurations: straight through and crossover.
The network can be generated recursively
The first stage N × N, the second (N/2) × (N/2).
Networks are topologically equivalent if one network can be
easily reproduced from the other networks by simply
rearranging nodes at each stage
4 × 4 Baseline Network
Crossbar Networks
A crossbar switch is a circuit that enables many interconnections between
elements of a parallel system at a time. A crossbar switch has a number of
input and output data pins and a number of control pins. In response to
control instructions set to its control input, the crossbar switch implements
a stable connection of a determined input with a determined output.
A m × n crossbar network can be used to provide a constant latency
connection between devices; it can be thought of as a single stage switch.
Different types of devices can be connected, yielding different constraints
on which switches can be enabled.
With m processors and n memories, one processor may be able to
generate requests for multiple memories in sequence; thus several
switches might be set in the same row.
For m × m interprocessor communication, each PE is connected to
both an input and an output of the crossbar; only one switch in each
row and column can be turned on simultaneously. Additional control
processors are used to manage the crossbar itself.
The crossbar is a special type of dynamic interconnection network.
Every input has route to every output directly, eliminating blocking.
A crossbar network offers better performance over other
The trade-off here is a crossbar design requires more switching
elements than other network types, increasing cost.
The major advantage of the cross-bar switch is its potential for speed.
In one clock, a connection can be made between source and
destination.
The diameter of the cross-bar is one.
Blocking if the destination is in use
Because of its complexity, the cost of the cross-bar switch can become
the dominant factor for a large multiprocessor system.
Crossbars can be used to implement the a×b switches used in MIN’s.
In this case each crossbar is small so costs are kept down.
Each junction is a switching component – connecting
the row to the column.
Can only have one connection in each column
Summary: Notes
Bus n processors, bus width w
Multistage Network n × n network using k × k switches, line
width w
Crossbar n × n crossbar, with line width w
Summary: Minimum Latency
Bus Constant
Multistage Network O(logk n)
Crossbar Constant
Summary: Bandwidth per Processor
Bus O(w/n) to O(w)
Multistage Network O(w) to O(nw)
Crossbar O(w) to O(nw)
Summary: Wiring Complexity
Bus O(w)
Multistage Network O(nw logk n)
Crossbar O(n2
w)
Summary: Switching Complexity
Bus O(n)
Multistage Network O(n logk n)
Crossbar O(n2
)
Summary: Connectivity and Routing
Bus One to one, and only one at a time
Multistage Network Some permutations and broadcast (if
network “unblocked”)
Crossbar All permutations, one at a time

Más contenido relacionado

La actualidad más candente

Processor allocation in Distributed Systems
Processor allocation in Distributed SystemsProcessor allocation in Distributed Systems
Processor allocation in Distributed SystemsRitu Ranjan Shrivastwa
 
Distributed shred memory architecture
Distributed shred memory architectureDistributed shred memory architecture
Distributed shred memory architectureMaulik Togadiya
 
Data Parallel and Object Oriented Model
Data Parallel and Object Oriented ModelData Parallel and Object Oriented Model
Data Parallel and Object Oriented ModelNikhil Sharma
 
Connection( less & oriented)
Connection( less & oriented)Connection( less & oriented)
Connection( less & oriented)ymghorpade
 
Design issues of dos
Design issues of dosDesign issues of dos
Design issues of dosvanamali_vanu
 
Feng’s classification
Feng’s classificationFeng’s classification
Feng’s classificationNarayan Kandel
 
Hardware and Software parallelism
Hardware and Software parallelismHardware and Software parallelism
Hardware and Software parallelismprashantdahake
 
Parallel programming model, language and compiler in ACA.
Parallel programming model, language and compiler in ACA.Parallel programming model, language and compiler in ACA.
Parallel programming model, language and compiler in ACA.MITS Gwalior
 
Fault tolerance in distributed systems
Fault tolerance in distributed systemsFault tolerance in distributed systems
Fault tolerance in distributed systemssumitjain2013
 
Applications of paralleL processing
Applications of paralleL processingApplications of paralleL processing
Applications of paralleL processingPage Maker
 
Distributed operating system
Distributed operating systemDistributed operating system
Distributed operating systemudaya khanal
 
Communication costs in parallel machines
Communication costs in parallel machinesCommunication costs in parallel machines
Communication costs in parallel machinesSyed Zaid Irshad
 
distributed shared memory
 distributed shared memory distributed shared memory
distributed shared memoryAshish Kumar
 
Inter-Process Communication in distributed systems
Inter-Process Communication in distributed systemsInter-Process Communication in distributed systems
Inter-Process Communication in distributed systemsAya Mahmoud
 
Computer Networks Unit 2 UNIT II DATA-LINK LAYER & MEDIA ACCESS
Computer Networks Unit 2 UNIT II DATA-LINK LAYER & MEDIA ACCESSComputer Networks Unit 2 UNIT II DATA-LINK LAYER & MEDIA ACCESS
Computer Networks Unit 2 UNIT II DATA-LINK LAYER & MEDIA ACCESSDr. SELVAGANESAN S
 
Distributed Operating Systems
Distributed Operating SystemsDistributed Operating Systems
Distributed Operating SystemsUmmiya Mohammedi
 

La actualidad más candente (20)

Processor allocation in Distributed Systems
Processor allocation in Distributed SystemsProcessor allocation in Distributed Systems
Processor allocation in Distributed Systems
 
Distributed shred memory architecture
Distributed shred memory architectureDistributed shred memory architecture
Distributed shred memory architecture
 
Data Parallel and Object Oriented Model
Data Parallel and Object Oriented ModelData Parallel and Object Oriented Model
Data Parallel and Object Oriented Model
 
Connection( less & oriented)
Connection( less & oriented)Connection( less & oriented)
Connection( less & oriented)
 
Design issues of dos
Design issues of dosDesign issues of dos
Design issues of dos
 
Feng’s classification
Feng’s classificationFeng’s classification
Feng’s classification
 
Stream oriented communication
Stream oriented communicationStream oriented communication
Stream oriented communication
 
Hardware and Software parallelism
Hardware and Software parallelismHardware and Software parallelism
Hardware and Software parallelism
 
Parallel programming model, language and compiler in ACA.
Parallel programming model, language and compiler in ACA.Parallel programming model, language and compiler in ACA.
Parallel programming model, language and compiler in ACA.
 
Aca2 10 11
Aca2 10 11Aca2 10 11
Aca2 10 11
 
Fault tolerance in distributed systems
Fault tolerance in distributed systemsFault tolerance in distributed systems
Fault tolerance in distributed systems
 
Applications of paralleL processing
Applications of paralleL processingApplications of paralleL processing
Applications of paralleL processing
 
Distributed operating system
Distributed operating systemDistributed operating system
Distributed operating system
 
Communication costs in parallel machines
Communication costs in parallel machinesCommunication costs in parallel machines
Communication costs in parallel machines
 
distributed shared memory
 distributed shared memory distributed shared memory
distributed shared memory
 
Lecture 3 threads
Lecture 3   threadsLecture 3   threads
Lecture 3 threads
 
Inter-Process Communication in distributed systems
Inter-Process Communication in distributed systemsInter-Process Communication in distributed systems
Inter-Process Communication in distributed systems
 
Computer Networks Unit 2 UNIT II DATA-LINK LAYER & MEDIA ACCESS
Computer Networks Unit 2 UNIT II DATA-LINK LAYER & MEDIA ACCESSComputer Networks Unit 2 UNIT II DATA-LINK LAYER & MEDIA ACCESS
Computer Networks Unit 2 UNIT II DATA-LINK LAYER & MEDIA ACCESS
 
Computer networks - Channelization
Computer networks - ChannelizationComputer networks - Channelization
Computer networks - Channelization
 
Distributed Operating Systems
Distributed Operating SystemsDistributed Operating Systems
Distributed Operating Systems
 

Similar a system interconnect architectures in ACA

Parallel computing chapter 2
Parallel computing chapter 2Parallel computing chapter 2
Parallel computing chapter 2Md. Mahedi Mahfuj
 
Optimal Transmit Power and Packet Size in Wireless Sensor Networks in Shadowe...
Optimal Transmit Power and Packet Size in Wireless Sensor Networks in Shadowe...Optimal Transmit Power and Packet Size in Wireless Sensor Networks in Shadowe...
Optimal Transmit Power and Packet Size in Wireless Sensor Networks in Shadowe...IDES Editor
 
Wireless sensor network
Wireless sensor network  Wireless sensor network
Wireless sensor network Sandeep Kumar
 
De31486489
De31486489De31486489
De31486489IJMER
 
What Is A Network made by Ms. Archika Bhatia
What Is A Network made by Ms. Archika BhatiaWhat Is A Network made by Ms. Archika Bhatia
What Is A Network made by Ms. Archika Bhatiakulachihansraj
 
COMPUTER NETWORKING SYSTEM
COMPUTER NETWORKING SYSTEMCOMPUTER NETWORKING SYSTEM
COMPUTER NETWORKING SYSTEMprapti borthakur
 
Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...
Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...
Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...IJERA Editor
 
Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...
Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...
Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...IJERA Editor
 
Design of a Reliable Wireless Sensor Network with Optimized Energy Efficiency...
Design of a Reliable Wireless Sensor Network with Optimized Energy Efficiency...Design of a Reliable Wireless Sensor Network with Optimized Energy Efficiency...
Design of a Reliable Wireless Sensor Network with Optimized Energy Efficiency...paperpublications3
 
Communication and Networking
Communication and NetworkingCommunication and Networking
Communication and NetworkingJay Nagar
 
Based on Heterogeneity and Electing Probability of Nodes Improvement in LEACH
Based on Heterogeneity and Electing Probability of Nodes Improvement in LEACHBased on Heterogeneity and Electing Probability of Nodes Improvement in LEACH
Based on Heterogeneity and Electing Probability of Nodes Improvement in LEACHijsrd.com
 
An Improved Energy Efficient Wireless Sensor Networks Through Clustering In C...
An Improved Energy Efficient Wireless Sensor Networks Through Clustering In C...An Improved Energy Efficient Wireless Sensor Networks Through Clustering In C...
An Improved Energy Efficient Wireless Sensor Networks Through Clustering In C...Editor IJCATR
 
An Improved Energy Efficient Wireless Sensor Networks Through Clustering In C...
An Improved Energy Efficient Wireless Sensor Networks Through Clustering In C...An Improved Energy Efficient Wireless Sensor Networks Through Clustering In C...
An Improved Energy Efficient Wireless Sensor Networks Through Clustering In C...Editor IJCATR
 

Similar a system interconnect architectures in ACA (20)

Parallel computing chapter 2
Parallel computing chapter 2Parallel computing chapter 2
Parallel computing chapter 2
 
Static networks
Static networksStatic networks
Static networks
 
Gk2411581160
Gk2411581160Gk2411581160
Gk2411581160
 
Static Networks
Static NetworksStatic Networks
Static Networks
 
WSN presentation
WSN presentationWSN presentation
WSN presentation
 
Optimal Transmit Power and Packet Size in Wireless Sensor Networks in Shadowe...
Optimal Transmit Power and Packet Size in Wireless Sensor Networks in Shadowe...Optimal Transmit Power and Packet Size in Wireless Sensor Networks in Shadowe...
Optimal Transmit Power and Packet Size in Wireless Sensor Networks in Shadowe...
 
Wireless sensor network
Wireless sensor network  Wireless sensor network
Wireless sensor network
 
De31486489
De31486489De31486489
De31486489
 
What Is A Network made by Ms. Archika Bhatia
What Is A Network made by Ms. Archika BhatiaWhat Is A Network made by Ms. Archika Bhatia
What Is A Network made by Ms. Archika Bhatia
 
COMPUTER NETWORKING SYSTEM
COMPUTER NETWORKING SYSTEMCOMPUTER NETWORKING SYSTEM
COMPUTER NETWORKING SYSTEM
 
D031202018023
D031202018023D031202018023
D031202018023
 
Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...
Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...
Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...
 
Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...
Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...
Performance Analysis of Enhanced Opportunistic Minimum Cost Routingin Mobile ...
 
Design of a Reliable Wireless Sensor Network with Optimized Energy Efficiency...
Design of a Reliable Wireless Sensor Network with Optimized Energy Efficiency...Design of a Reliable Wireless Sensor Network with Optimized Energy Efficiency...
Design of a Reliable Wireless Sensor Network with Optimized Energy Efficiency...
 
Ppt college
Ppt collegePpt college
Ppt college
 
Communication and Networking
Communication and NetworkingCommunication and Networking
Communication and Networking
 
Based on Heterogeneity and Electing Probability of Nodes Improvement in LEACH
Based on Heterogeneity and Electing Probability of Nodes Improvement in LEACHBased on Heterogeneity and Electing Probability of Nodes Improvement in LEACH
Based on Heterogeneity and Electing Probability of Nodes Improvement in LEACH
 
H0515259
H0515259H0515259
H0515259
 
An Improved Energy Efficient Wireless Sensor Networks Through Clustering In C...
An Improved Energy Efficient Wireless Sensor Networks Through Clustering In C...An Improved Energy Efficient Wireless Sensor Networks Through Clustering In C...
An Improved Energy Efficient Wireless Sensor Networks Through Clustering In C...
 
An Improved Energy Efficient Wireless Sensor Networks Through Clustering In C...
An Improved Energy Efficient Wireless Sensor Networks Through Clustering In C...An Improved Energy Efficient Wireless Sensor Networks Through Clustering In C...
An Improved Energy Efficient Wireless Sensor Networks Through Clustering In C...
 

Último

NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...Amil baba
 
Block diagram reduction techniques in control systems.ppt
Block diagram reduction techniques in control systems.pptBlock diagram reduction techniques in control systems.ppt
Block diagram reduction techniques in control systems.pptNANDHAKUMARA10
 
Hostel management system project report..pdf
Hostel management system project report..pdfHostel management system project report..pdf
Hostel management system project report..pdfKamal Acharya
 
Work-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxWork-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxJuliansyahHarahap1
 
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKARHAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKARKOUSTAV SARKAR
 
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments""Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"mphochane1998
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTbhaskargani46
 
Moment Distribution Method For Btech Civil
Moment Distribution Method For Btech CivilMoment Distribution Method For Btech Civil
Moment Distribution Method For Btech CivilVinayVitekari
 
A Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna MunicipalityA Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna MunicipalityMorshed Ahmed Rahath
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptDineshKumar4165
 
School management system project Report.pdf
School management system project Report.pdfSchool management system project Report.pdf
School management system project Report.pdfKamal Acharya
 
Hospital management system project report.pdf
Hospital management system project report.pdfHospital management system project report.pdf
Hospital management system project report.pdfKamal Acharya
 
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdfAldoGarca30
 
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...drmkjayanthikannan
 
Computer Networks Basics of Network Devices
Computer Networks  Basics of Network DevicesComputer Networks  Basics of Network Devices
Computer Networks Basics of Network DevicesChandrakantDivate1
 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationBhangaleSonal
 
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...Call Girls Mumbai
 
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLE
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLEGEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLE
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLEselvakumar948
 
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Arindam Chakraborty, Ph.D., P.E. (CA, TX)
 

Último (20)

NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
 
Block diagram reduction techniques in control systems.ppt
Block diagram reduction techniques in control systems.pptBlock diagram reduction techniques in control systems.ppt
Block diagram reduction techniques in control systems.ppt
 
Hostel management system project report..pdf
Hostel management system project report..pdfHostel management system project report..pdf
Hostel management system project report..pdf
 
Work-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxWork-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptx
 
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKARHAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
 
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments""Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPT
 
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced LoadsFEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
 
Moment Distribution Method For Btech Civil
Moment Distribution Method For Btech CivilMoment Distribution Method For Btech Civil
Moment Distribution Method For Btech Civil
 
A Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna MunicipalityA Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna Municipality
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
 
School management system project Report.pdf
School management system project Report.pdfSchool management system project Report.pdf
School management system project Report.pdf
 
Hospital management system project report.pdf
Hospital management system project report.pdfHospital management system project report.pdf
Hospital management system project report.pdf
 
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
 
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
 
Computer Networks Basics of Network Devices
Computer Networks  Basics of Network DevicesComputer Networks  Basics of Network Devices
Computer Networks Basics of Network Devices
 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equation
 
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
 
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLE
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLEGEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLE
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLE
 
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
 

system interconnect architectures in ACA

  • 1. Advanced Computer Architecture System Interconnect Architectures
  • 2. System Interconnect Architectures Direct networks for static connections Indirect networks for dynamic connections Networks are used for internal connections in a centralized system among • processors • memory modules • I/O disk arrays distributed networking of multicomputer nodes
  • 3. EENG-630 - Chapter 2 System Interconnect Architecture Static and dynamic networks for interconnecting computer subsystems or for constructing multiprocessors or multicomputers Ideal: Construct a low-latency network with a high data transfer rate
  • 4. EENG-630 - Chapter 2 Network Properties and Routing Static Networks: Point-to-Point direct connections which will not change during program execution Dynamic Networks: Implemented with switched channels. They are dynamically configured to match the communication demand in user programs
  • 5. Goals and Analysis The goals of an interconnection network are to provide low-latency high data transfer rate wide communication bandwidth Analysis includes latency bisection bandwidth data-routing functions scalability of parallel architecture
  • 6. Network Properties and Routing Static networks: point-to-point direct connections that will not change during program execution Dynamic networks: switched channels dynamically configured to match user program communication demands include buses, crossbar switches, and multistage networks Both network types also used for inter-PE data routing in SIMD computers
  • 7. Terminology - 1 Network usually represented by a graph with a finite number of nodes linked by directed or undirected edges. Number of nodes in graph = network size . Number of edges (links or channels) incident on a node = node degree d (also note in and out degrees when edges are directed). Node degree reflects number of I/O ports associated with a node, and should ideally be small and constant. Diameter D of a network is the maximum shortest path between any two nodes, measured by the number of links traversed; this should be as small as possible (from a communication point of view).
  • 8. Terminology - 2 Channel bisection width b = minimum number of edges cut to split a network into two parts each having the same number of nodes. Since each channel has w bit wires, the wire bisection width B = bw. Bisection width provides good indication of maximum communication bandwidth along the bisection of a network, and all other cross sections should be bounded by the bisection width. Wire (or channel) length = length (e.g. weight) of edges between nodes. Network is symmetric if the topology is the same looking from any node; these are easier to implement or to program. Other useful characterizing properties: homogeneous nodes? buffered channels? nodes are switches?
  • 9. Data Routing Functions Shifting Rotating Permutation (one to one) Broadcast (one to all) Multicast (many to many) Personalized broadcast (one to many) Shuffle Exchange Etc.
  • 10. Permutations Given n objects, there are n ! ways in which they can be reordered (one of which is no reordering). A permutation can be specified by giving the rule fo reordering a group of objects. Permutations can be implemented using crossbar switches, multistage networks, shifting, and broadcast operations. The time required to perform permutations of the connections between nodes often dominates the network performance when n is large.
  • 11. Perfect Shuffle and Exchange Stone suggested the special permutation that entries according to the mapping of the k-bit binary number a b … k to b c … k a (that is, shifting 1 bit to the left and wrapping it around to the least significant bit position). The inverse perfect shuffle reverses the effect of the perfect shuffle.
  • 12. Hypercube Routing Functions If the vertices of a n-dimensional cube are labeled with n-bit numbers so that only one bit differs between each pair of adjacent vertices, then n routing functions are defined by the bits in the node (vertex) address. For example, with a 3-dimensional cube, we can easily identify routing functions that exchange data between nodes with addresses that differ in the least significant, most significant, or middle bit.
  • 13. Factors Affecting Performance Functionality – how the network supports data routing, interrupt handling, synchronization, request/message combining, and coherence Network latency – worst-case time for a unit message to be transferred Bandwidth – maximum data rate Hardware complexity – implementation costs for wire, logic, switches, connectors, etc. Scalability – how easily does the scheme adapt to an increasing number of processors, memories, etc.?
  • 14. Static Networks Linear Array Ring and Chordal Ring Barrel Shifter Tree and Star Fat Tree Mesh and Torus
  • 15. Sample Static Network TopologiesSample Static Network Topologies Linear Ring 2D Mesh Hybercube Binary Tree Fat Binary Tree Fully Connected (Static or point-to-point) 2D 3D 4D Higher link bandwidth Closer to root
  • 16. Static Networks – Linear Array N nodes connected by n-1 links (not a bus); segments between different pairs of nodes can be used in parallel. Internal nodes have degree 2; end nodes have degree 1. Diameter = n-1 Bisection = 1 For small n, this is economical, but for large n, it is obviously inappropriate.
  • 17. Static Networks – Ring, Chordal Ring Like a linear array, but the two end nodes are connected by an n th link; the ring can be uni- or bi- directional. Diameter is n/2 for a bidirectional ring, or n for a unidirectional ring. By adding additional links (e.g. “chords” in a circle), the node degree is increased, and we obtain a chordal ring. This reduces the network diameter. In the limit, we obtain a fully-connected network, with a node degree of n -1 and a diameter of 1.
  • 18. Static Networks – Tree and Star A k-level completely balanced binary tree will have N = 2k – 1 nodes, with maximum node degree of 3 and network diameter is 2(k – 1). The balanced binary tree is scalable, since it has a constant maximum node degree. A star is a two-level tree with a node degree d = N – 1 and a constant diameter of 2.
  • 19. Static Networks – Fat Tree A fat tree is a tree in which the number of edges between nodes increases closer to the root (similar to the way the thickness of limbs increases in a real tree as we get closer to the root). The edges represent communication channels (“wires”), and since communication traffic increases as the root is approached, it seems logical to increase the number of channels there.
  • 20. Static Networks – Mesh and Torus Pure mesh – N = n k nodes with links between each adjacent pair of nodes in a row or column (or higher degree). This is not a symmetric network; interior node degree d = 2k, diameter = k (n – 1). Illiac mesh (used in Illiac IV computer) – wraparound is allowed, thus reducing the network diameter to about half that of the equivalent pure mesh. A torus has ring connections in each dimension, and is symmetric. An n × n binary torus has node degree of 4 and a diameter of 2 × n / 2 .
  • 21. Static Networks – Systolic Array A systolic array is an arrangement of processing elements and communication links designed specifically to match the computation and communication requirements of a specific algorithm (or class of algorithms). This specialized character may yield better performance than more generalized structures, but also makes them more expensive, and more difficult to program.
  • 22. Static Networks – Hypercubes A binary n-cube architecture with N = 2n nodes spanning along n dimensions, with two nodes per dimension. The hypercube scalability is poor, and packaging is difficult for higher-dimensional hypercubes.
  • 23. Static Networks – Cube-connected Cycles k-cube connected cycles (CCC) can be created from a k-cube by replacing each vertex of the k- dimensional hypercube by a ring of k nodes. A k-cube can be transformed to a k-CCC with k × 2k nodes. The major advantage of a CCC is that each node has a constant degree (but longer latency) than in the corresponding k-cube. In that respect, it is more scalable than the hypercube architecture.
  • 24. Static Networks – k-ary n-Cubes Rings, meshes, tori, binary n-cubes, and Omega networks (to be seen) are topologically isomorphic to a family of k-ary n-cube networks. n is the dimension of the cube, and k is the radix, or number of of nodes in each dimension. The number of nodes in the network, N, is k n . Folding (alternating nodes between connections) can be used to avoid the long “end-around” delays in the traditional implementation.
  • 25. Static Networks – k-ary n-Cubes The cost of k-ary n-cubes is dominated by the amount of wire, not the number of switches. With constant wire bisection, low-dimensional networks with wider channels provide lower latecny, less contention, and higher “hot-spot” throughput than higher-dimensional networks with narrower channels.
  • 26. Network Throughput Network throughput – number of messages a network can handle in a unit time interval. One way to estimate is to calculate the maximum number of messages that can be present in a network at any instant (its capacity); throughput usually is some fraction of its capacity. A hot spot is a pair of nodes that accounts for a disproportionately large portion of the total network traffic (possibly causing congestion). Hot spot throughput is maximum rate at which messages can be sent between two specific nodes.
  • 27. Minimizing Latency Latency is minimized when the network radix k and dimension n are chose so as to make the components of latency due to distance (# of hops) and the message aspect ratio L / W (message length L divided by the channel width W ) approximately equal. This occurs at a very low dimension. For up to 1024 nodes, the best dimension (in this respect) is 2.
  • 28. Dynamic Connection Networks Dynamic connection networks can implement all communication patterns based on program demands. In increasing order of cost and performance, these include bus systems multistage interconnection networks crossbar switch networks Price can be attributed to the cost of wires, switches, arbiters, and connectors. Performance is indicated by network bandwidth, data transfer rate, network latency, and communication patterns supported.
  • 29. Dynamic Networks – Bus Systems A bus system (contention bus, time-sharing bus) has a collection of wires and connectors multiple modules (processors, memories, peripherals, etc.) which connect to the wires data transactions between pairs of modules Bus supports only one transaction at a time. Bus arbitration logic must deal with conflicting requests. Lowest cost and bandwidth of all dynamic schemes. Many bus standards are available.
  • 30. A bus is the simplest type od dynamic interconnection networks. It constitutes a common data transfer path for many devices. Depending on the type of implemented transmissions we have serial busses and parallel busses. The devices connected to a bus can be processors, memories, I/O units, as shown in the figure below.
  • 32. The throughput of the network based on a bus can be increased by the use of a multibus network shown in the figure below. In this network, processors connected to the busses can transmit data in parallel (one for each bus) and many processors can read data from many busses at a time.
  • 33. Dynamic Networks – Switch Modules An a × b switch module has a inputs and b outputs. A binary switch has a = b = 2. It is not necessary for a = b, but usually a = b = 2k , for some integer k. In general, any input can be connected to one or more of the outputs. However, multiple inputs may not be connected to the same output. When only one-to-one mappings are allowed, the switch is called a crossbar switch.
  • 34.
  • 35. Multistage Networks In general, any multistage network is comprised of a collection of a × b switch modules and fixed network modules. The a × b switch modules are used to provide variable permutation or other reordering of the inputs, which are then further reordered by the fixed network modules. A generic multistage network consists of a sequence alternating dynamic switches (with relatively small values for a and b) with static networks (with larger numbers of inputs and outputs). The static networks are used to implement interstage connections (ISC).
  • 36. Single-stage networks Single stage Shuffle- Exchange IN (left) Perfect shuffle mapping function (right) Perfect shuffle operation: cyclic shift 1 place left, eg 101 --> 011 Exchange operation: invert least significant bit, e.g. 101 --> 100 In single stage networks there are different methods needed to change the interconnection pattern so that a given input is sent to the desired output. However with one shuffle or exchange operation the desired output might not be given so multiple shuffles will need to occur. Which is why a multistage network was build.
  • 37. Multistage Interconnection Networks Self-routing network. Since a single-stage network was not sufficient to get the desired output, a multistage is built. An example of a multistage network is the omega network. The capability of single stage networks are limited but if we cascade enough of them together, they form a completely connected MIN (Multistage Interconnection Network). Switches can perform their own routing or can be controlled by a central router This type of networks can be classified into the following four categories: Nonblocking A network is called strictly nonblocking if it can connect any idle input to any idle output regardless of what other connections are currently in process Rearrangeable nonblocking In this case a network should be able to establish all possible connections between inputs and outputs by rearranging its existing connections. Blocking interconnection A network is said to be blocking if it can perform many, but not all, possible connections between terminals. Example: the Omega network
  • 38.
  • 39. Omega Network A 2 × 2 switch can be configured for Straight-through Crossover Upper broadcast (upper input to both outputs) Lower broadcast (lower input to both outputs) (No output is a somewhat vacuous possibility as well) With four stages of eight 2 × 2 switches, and a static perfect shuffle for each of the four ISCs, a 16 by 16 Omega network can be constructed (but not all permutations are possible). In general , an n-input Omega network requires log 2 n stages of 2 × 2 switches and n / 2 switch modules.
  • 40. Example: • Connect input 101 to output 001 • Use the bits of the destination address, 001, for dynamically selecting a path • Routing: - 0 means use upper output - 1 means use lower output A multi-stage IN using 2 × 2 switch boxes and a perfect shuffle interconnect pattern between the stages In the Omega MIN there is one unique path from each input to each output. No redundant paths → no fault tolerance and the possibility of blocking
  • 41. the inter-stage connections are created recursively using the method of cyclic bit shifting. The bit representation of an input is shifted right by 1 bit to make the output number. Between stages the next stage is always cut into 2 separate sections. So the second stage would have 2 sections and the third stage would have 4 sections and so on. The switch rules are similar to Omega networks.
  • 42. Baseline Network A baseline network can be shown to be topologically equivalent to other networks (including Omega), and has a simple recursive generation procedure. Stage k (k = 0, 1, …) is an m × m switch block (where m = N / 2k ) composed entirely of 2 × 2 switch blocks, each having two configurations: straight through and crossover. The network can be generated recursively The first stage N × N, the second (N/2) × (N/2). Networks are topologically equivalent if one network can be easily reproduced from the other networks by simply rearranging nodes at each stage
  • 43.
  • 44. 4 × 4 Baseline Network
  • 45. Crossbar Networks A crossbar switch is a circuit that enables many interconnections between elements of a parallel system at a time. A crossbar switch has a number of input and output data pins and a number of control pins. In response to control instructions set to its control input, the crossbar switch implements a stable connection of a determined input with a determined output. A m × n crossbar network can be used to provide a constant latency connection between devices; it can be thought of as a single stage switch. Different types of devices can be connected, yielding different constraints on which switches can be enabled. With m processors and n memories, one processor may be able to generate requests for multiple memories in sequence; thus several switches might be set in the same row. For m × m interprocessor communication, each PE is connected to both an input and an output of the crossbar; only one switch in each row and column can be turned on simultaneously. Additional control processors are used to manage the crossbar itself.
  • 46. The crossbar is a special type of dynamic interconnection network. Every input has route to every output directly, eliminating blocking. A crossbar network offers better performance over other The trade-off here is a crossbar design requires more switching elements than other network types, increasing cost. The major advantage of the cross-bar switch is its potential for speed. In one clock, a connection can be made between source and destination. The diameter of the cross-bar is one. Blocking if the destination is in use Because of its complexity, the cost of the cross-bar switch can become the dominant factor for a large multiprocessor system. Crossbars can be used to implement the a×b switches used in MIN’s. In this case each crossbar is small so costs are kept down.
  • 47. Each junction is a switching component – connecting the row to the column. Can only have one connection in each column
  • 48. Summary: Notes Bus n processors, bus width w Multistage Network n × n network using k × k switches, line width w Crossbar n × n crossbar, with line width w
  • 49. Summary: Minimum Latency Bus Constant Multistage Network O(logk n) Crossbar Constant
  • 50. Summary: Bandwidth per Processor Bus O(w/n) to O(w) Multistage Network O(w) to O(nw) Crossbar O(w) to O(nw)
  • 51. Summary: Wiring Complexity Bus O(w) Multistage Network O(nw logk n) Crossbar O(n2 w)
  • 52. Summary: Switching Complexity Bus O(n) Multistage Network O(n logk n) Crossbar O(n2 )
  • 53. Summary: Connectivity and Routing Bus One to one, and only one at a time Multistage Network Some permutations and broadcast (if network “unblocked”) Crossbar All permutations, one at a time