PARALLEL ARCHITECTURE AND COMPUTING
1. What is Parallel Computing?
Parallel computing is a form of computation in which many calculations are carried out
simultaneously, operating on the principle that large problems can often be divided into smaller ones,
which are then solved concurrently ("in parallel").
2. Why Use Parallel Computing?
There are two primary reasons for using parallel computing:
o Save time - wall clock time
o Solve larger problems
3. Comparison between Temporal and Data Parallelism
Temporal Parallelism
1. A job is divided into a set of independent tasks and the tasks are assigned to processors for processing.
2. Tasks should take equal time; pipeline stages should thus be synchronized.
3. Bubbles in jobs lead to idling of processors.
4. Processors are specialized to do specific tasks efficiently.
5. Task assignment may be static.
6. Not tolerant to faults.
7. Efficient with fine-grained tasks.
8. Scales well as long as the number of jobs to be processed is much larger than the number of
processors in the pipeline and communication time is minimal.
Data Parallelism
1. Full jobs are assigned for processing.
2. Jobs may take different times; there is no need to synchronize the beginning of jobs.
3. Bubbles do not cause idling of processors.
4. Processors should be general purpose and may not do every job efficiently.
5. Job assignment may be static, dynamic or quasi-dynamic.
6. Tolerates faults.
7. Efficient with coarse-grained tasks and quasi-dynamic scheduling.
8. Scales well as long as the number of jobs is much greater than the number of processors and the
processing time is much higher than the communication time.
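The data-parallel style above can be sketched in a few lines. This is an illustrative toy, not from the notes: each "job" is a whole answer script to grade, and whole jobs are handed to a small worker pool with no stage synchronization (the function and data names are invented for the example).

```python
from concurrent.futures import ThreadPoolExecutor

def grade_script(script):
    # A full job: one processor grades an entire answer script.
    return sum(script)  # stand-in for "total marks"

scripts = [[5, 8, 7], [9, 6, 4], [10, 10, 3], [2, 7, 9]]

# Data parallelism: each worker takes whole jobs independently;
# workers need not start or finish in lockstep.
with ThreadPoolExecutor(max_workers=2) as pool:
    totals = list(pool.map(grade_script, scripts))

print(totals)  # [20, 19, 23, 18]
```

Because each worker processes a complete job, a slow job delays only its own worker, which is why this style tolerates bubbles and uneven job times.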
4. Explain the types of Data Processing with Specialized Processors
1. Fine-grained data parallelism – Specialist Data Parallelism
2. Coarse-grained data parallelism – Coarse-grained Specialist Temporal Parallelism
5. Explain the Disadvantages of Specialist Data Parallelism
1. Load is not balanced in this method. If one question takes more time to be graded,
the other processors will be idle for that time.
2. The same problem occurs if one question is not answered by many students.
3. The head examiner wastes a lot of time checking the script for unanswered questions while the teachers
sit idle at that time.
6. Explain the Advantages of Coarse-grained Specialist Temporal Parallelism
1. Processing of special tasks is done by specialized processors.
2. The method uses the concept of pipelined processing in a circular pipeline.
3. There is buffering (in-tray and out-tray) between pipeline stages.
4. Each stage has a chunk of work to do.
5. Does not need strict synchronization.
6. Tolerates bubbles.
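The buffered pipeline idea above can be sketched with two specialist stages connected by a queue that plays the role of the in-tray/out-tray. The stage tasks (add one, then double) are invented for illustration; the point is that the buffer lets stage 2 run without strict synchronization with stage 1.

```python
import queue
import threading

tray = queue.Queue()   # buffer (in-tray/out-tray) between stages
done = queue.Queue()
SENTINEL = None        # marks end of the job stream

def stage1(jobs):
    for j in jobs:
        tray.put(j + 1)    # specialist task 1
    tray.put(SENTINEL)

def stage2():
    while True:
        j = tray.get()
        if j is SENTINEL:
            break
        done.put(j * 2)    # specialist task 2

jobs = [1, 2, 3]
t1 = threading.Thread(target=stage1, args=(jobs,))
t2 = threading.Thread(target=stage2)
t1.start(); t2.start(); t1.join(); t2.join()

results = sorted(done.queue)
print(results)  # [4, 6, 8]
```

A brief stall (bubble) in one stage only lets the tray fill or drain; the other stage keeps working, which is the fault the strict pipeline of temporal parallelism cannot absorb.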
1. What is parallel processing?
Parallel processing is the processing of program instructions by dividing them among
multiple processors with the objective of running a program in less time.
(Or)
The simultaneous use of more than one CPU to execute a program. Ideally, parallel
processing makes a program run faster because there are more engines (CPUs) running it.
2. What is concurrency?
Concurrency is a term used in the operating systems and databases communities which
refers to the property of a system in which multiple tasks remain logically active and make
progress at the same time by interleaving the execution order of the tasks, thereby creating an
illusion of simultaneously executing instructions. Parallelism exploits concurrency.
3. What is multiprogramming?
Multiprogramming is the allocation of a computer system and its resources to more than
one concurrent application, job or user ("program" in this nomenclature). The use of
multiprogramming was enhanced by the arrival of virtual memory and virtual machine
technology.
In a multiprogramming system, multiple programs submitted by users were each allowed
to use the processor for a short time. To users it appeared that all of the programs were executing
at the same time.
4. What is Parallel computing?
Parallel computing is a form of computation in which many calculations are carried out
simultaneously, operating on the principle that large problems can often be divided into smaller
ones, which are then solved concurrently ("in parallel").
There are several different forms of parallel computing: bit-level, instruction level, data,
and task parallelism. Parallelism has been employed for many years, mainly in high-performance
computing.
5. What is vector processor?
A vector processor, or array processor, is a central processing unit (CPU) that
implements an instruction set containing instructions that operate on one-dimensional arrays of
data called vectors. This is in contrast to a scalar processor, whose instructions operate on single
data items.
A vector computer or vector processor is a machine designed to efficiently handle
arithmetic operations on elements of arrays, called vectors. Such machines are especially useful
in high-performance scientific computing, where matrix and vector arithmetic are quite common.
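The vector-versus-scalar contrast can be mimicked in plain Python (which has no real vector instructions; the two functions below are purely illustrative): a "vector" operation is issued once over whole arrays, where a scalar processor would issue one instruction per element.

```python
def vector_add(a, b):
    # One conceptual "instruction" over one-dimensional arrays (vectors).
    return [x + y for x, y in zip(a, b)]

def scalar_add(a, b):
    # A scalar processor works element by element.
    out = []
    for i in range(len(a)):
        out.append(a[i] + b[i])
    return out

v1, v2 = [1.0, 2.0, 3.0], [10.0, 20.0, 30.0]
print(vector_add(v1, v2))  # [11.0, 22.0, 33.0]
```

On a real vector processor the gain comes from the hardware streaming the array elements through pipelined arithmetic units, not from the loop structure shown here.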
6. What is a Parallel computer?
A parallel computer is defined as an interconnected set of Processing Elements (PEs)
which cooperate by communicating with one another to solve large problems fast.
7. What are the different types of Processing Elements (PEs)?
o An Arithmetic Logic Unit (ALU) only
o A microprocessor with only a private cache memory, or a full-fledged microprocessor
with its own cache and main memory (a PE with its own cache and main memory is called
a Computing Element (CE))
o A powerful large computer such as a mainframe or vector processor
8. What are the different modes of cooperation?
o Each CE has a set of processes assigned to it. Each CE works independently and CEs
cooperate by exchanging intermediate results.
o All processes and data to be processed are stored in the memory shared by all PEs. A
free PE selects a process to execute and deposits the results in the memory for use by
other PEs.
o A host CE stores a pool of tasks to be executed and schedules them to free CEs dynamically.
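The third mode (a host holding a pool of tasks that free CEs pull dynamically) is essentially a work queue. A minimal sketch, with threads standing in for CEs and squaring as a stand-in task (all names here are illustrative):

```python
import queue
import threading

tasks = queue.Queue()      # the host's pool of tasks
results = queue.Queue()

for n in range(1, 7):
    tasks.put(n)

def ce_worker():
    while True:
        try:
            n = tasks.get_nowait()   # a free CE grabs the next task
        except queue.Empty:
            return                   # pool exhausted
        results.put(n * n)           # deposit the intermediate result

workers = [threading.Thread(target=ce_worker) for _ in range(3)]
for w in workers: w.start()
for w in workers: w.join()

print(sorted(results.queue))  # [1, 4, 9, 16, 25, 36]
```

Dynamic scheduling of this kind balances load automatically: a CE that finishes early simply pulls more work.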
9. What are the criteria used to classify parallel computers?
1. How do instructions and data flow in the system? This idea for classification was
proposed by Flynn and is known as Flynn's classification.
2. What is the coupling between CEs?
3. How do PEs access memory?
4. What is the quantum of work done by a PE before it communicates with other
PEs?
10. What is loose coupling?
Loose coupling is a method of interconnecting the components in a system or network so
that those components, also called elements, depend on each other to the least extent practicable.
Loose coupling simplifies testing, maintenance and troubleshooting procedures because
problems are easy to isolate and unlikely to spread or propagate.
11. What is tight coupling?
Tight coupling (or tightly coupled) is a type of coupling that describes a system in which
hardware and software are not only linked together, but are also dependent upon each other. In a
tightly coupled system where multiple systems share a workload, the entire system usually would
need to be powered down to fix a major hardware problem, not just the single system with the
issue.
Tight coupling is also used to describe software that will work only in one part of a specific type
of system and is dependent on other software. For example, an operating system
would be considered tightly coupled as it depends on software drivers to correctly install and
activate the system's peripheral devices.
12. How are parallel computers classified based on coupling?

                       Loosely coupled                  Tightly coupled
Physical connection    PEs with private memory          PEs share a common memory
                       communicate via a network        and communicate via the
                                                        shared memory
Logical cooperation    Compute independently and        Cooperate by sharing results
                       cooperate by exchanging          stored in a common memory
                       messages
Type of parallel       Message Passing Multicomputer    Shared Memory Multiprocessor
computer               or Distributed Shared Memory     or Symmetric Multiprocessor
                       computer
13. What is Uniform Memory Access (UMA)?
In a Shared Memory (SM) computer or Symmetric Multiprocessor, the time to access a
word in memory is constant for all processors. Such a parallel computer (SM) is said to have
Uniform Memory Access (UMA).
14. What is Non-Uniform Memory Access (NUMA)?
In a distributed shared memory computer, the time to access a word in memory local to it
is smaller than the time taken to access a word stored in the memory of another computer or a
common shared memory. Such a parallel computer (DSM) is said to have Non-Uniform
Memory Access (NUMA).
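The UMA/NUMA distinction reduces to a simple cost model. The cycle counts below are invented purely to illustrate the uniform/non-uniform idea, not taken from any real machine:

```python
LOCAL, REMOTE = 1, 10   # illustrative access costs in cycles

def uma_cost(pe, owner):
    # UMA: every PE pays the same cost regardless of where the word lives.
    return LOCAL

def numa_cost(pe, owner):
    # NUMA: a word in the PE's own memory is cheap; a remote word is not.
    return LOCAL if pe == owner else REMOTE

print(uma_cost(0, 3), numa_cost(0, 3), numa_cost(3, 3))  # 1 10 1
```

On NUMA machines this asymmetry is why placing data near the PE that uses it matters for performance.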
15. What is the cache coherence problem?
In a multiprocessor, there are many caches, one per processor. It is essential to keep the
data at a given address the same in all caches to avoid errors in computation.
The cache coherence problem arises when a PE writes data into its private cache at
address x but this is not known to the caches of other PEs.
16. What is the write-through (or write-now) protocol?
If the processor initiates a write request and the data is in the cache, it overwrites the
existing data in the cache. In the write-through (write-now) protocol, the data in main memory is
also updated immediately.
17. What is the write-back protocol?
If the processor initiates a write request and the data is in the cache, it overwrites the
existing data in the cache. In the write-back protocol, the data in main memory is
updated only when the cache block containing the data is to be replaced by
another block from main memory.
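The two write policies can be contrasted with a toy single-cache model (a sketch, not a real cache; the class and method names are invented for illustration):

```python
class Cache:
    def __init__(self, memory, write_back=False):
        self.memory = memory       # dict standing in for main memory
        self.write_back = write_back
        self.lines = {}            # addr -> cached value
        self.dirty = set()         # blocks not yet written back

    def write(self, addr, value):
        self.lines[addr] = value
        if self.write_back:
            self.dirty.add(addr)       # memory updated later, on replacement
        else:
            self.memory[addr] = value  # write-through: memory updated now

    def evict(self, addr):
        # Write-back: main memory is updated only when the block is replaced.
        if addr in self.dirty:
            self.memory[addr] = self.lines[addr]
            self.dirty.discard(addr)
        self.lines.pop(addr, None)

mem_wt, mem_wb = {0: 0}, {0: 0}
Cache(mem_wt).write(0, 42)                       # write-through
wb = Cache(mem_wb, write_back=True)
wb.write(0, 42)                                  # write-back
print(mem_wt[0], mem_wb[0])  # 42 0  (write-back memory is stale)
wb.evict(0)
print(mem_wb[0])             # 42   (updated on replacement)
```

Write-through keeps memory current at the cost of bus traffic on every write; write-back defers that traffic until eviction, which is exactly why the stale main-memory copy creates coherence work in a multiprocessor.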
18. List any two cache coherence protocols.
SNOOPY cache protocol
MESI protocol
19. What is the snoopy cache protocol?
In a multiprocessor, there are many caches, one per processor, and one has to know the
status of each of them. A bus-based system has the advantage that a transaction involving a cache
is broadcast on the bus and other caches can listen to the broadcast. Thus cache coherence
protocols are based on the cache controller of each processor listening (called snooping, which means
secretly listening) to broadcasts on the bus and taking appropriate action. These protocols are
known as snoopy cache protocols.
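A minimal sketch of the snooping idea, with invalidation as the "appropriate action" (class names and the invalidate-on-write policy are illustrative; real protocols track more states):

```python
class Bus:
    def __init__(self):
        self.caches = []

    def broadcast_write(self, writer, addr):
        # Every cache controller on the bus sees the transaction.
        for c in self.caches:
            if c is not writer:
                c.snoop_invalidate(addr)

class SnoopyCache:
    def __init__(self, bus):
        self.lines = {}
        self.bus = bus
        bus.caches.append(self)

    def write(self, addr, value):
        self.lines[addr] = value
        self.bus.broadcast_write(self, addr)  # announce the write

    def snoop_invalidate(self, addr):
        self.lines.pop(addr, None)            # drop the now-stale copy

bus = Bus()
c1, c2 = SnoopyCache(bus), SnoopyCache(bus)
c1.lines[0x10] = 5
c2.lines[0x10] = 5          # both caches hold the block
c1.write(0x10, 7)           # c2 snoops the bus and invalidates its copy
print(0x10 in c2.lines)     # False
```

The broadcast nature of the bus is what makes snooping cheap; on a point-to-point interconnection network there is no such broadcast, which is why the directory scheme (question 24) is used instead.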
20. What is MESI?
The Pentium cache coherence protocol is known as the MESI protocol. This protocol
invalidates the shared blocks in caches when new information is written into that block by any PE.
When new information is written into any cache block it is not written immediately into the main
memory.
21. What are the different states of a cache block in the MESI protocol?
Modified (M): The data in the cache block has been modified and it is the only copy. The main
memory copy is an old copy.
Exclusive (E): The data in the cache block is valid and is the same as in main memory. No
other cache has a valid copy.
Shared (S): The data in the cache block is valid and is the same as in main memory. Other
caches may also have valid copies.
Invalid (I): The data in the cache block has been invalidated because another cache block has a
newly written value.
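A much-reduced sketch of the MESI transitions implied by the four states above (only a few events are modeled, with invented event names; a full protocol also handles read hits, write-backs, and bus replies):

```python
def mesi_next(state, event):
    # (state, event) -> next state for one cache's copy of a block.
    table = {
        ("I", "read_miss_shared"): "S",  # another cache already has a copy
        ("I", "read_miss_alone"):  "E",  # no other cache has a copy
        ("E", "local_write"):      "M",  # modify without bus traffic
        ("S", "local_write"):      "M",  # other copies get invalidated
        ("M", "remote_write"):     "I",  # another PE wrote the block
        ("E", "remote_write"):     "I",
        ("S", "remote_write"):     "I",
    }
    return table.get((state, event), state)

s = "I"
for ev in ["read_miss_alone", "local_write", "remote_write"]:
    s = mesi_next(s, ev)
print(s)  # 'I'
```

Note the E state's purpose: a block brought in alone can be written without any bus transaction, since no other cache needs invalidating.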
22. What is the difference between a single-processor system and a multiprocessor system?
A single-processor system has only one actual processor, while a multiprocessor system has
more than one; both types of systems can have more than one core per processor.
Multiple processors per system have long been used in systems that need a lot of processing
power, such as high-traffic servers and heavy computational workloads.
However, these systems have been expensive and not needed by normal home or office
users. In recent years it has become typical for one processor to have 2, 3, 4 or even 8 cores. These
multicore processors behave the same way as multiple processors.
One core can only do one task at a time. Multitasking is done by sharing the time of the
processor between processes (programs): one process runs for a short time, then another, then
another, or maybe the first one again. The switching is done so fast that the user won't notice the
difference. Multiple cores can truly run multiple processes at once.
How well your computer can exploit multiple processors/cores depends on your software
dividing its work among different processes.
23. What are a Read Miss and a Write Miss?
When the required data is not found in the cache during a read or write request, the situation is
called a Read Miss or a Write Miss respectively.
24. What mechanism is used to ensure cache coherence in a Shared Memory parallel
computer using an interconnection network?
The directory scheme is used to ensure cache coherence in these systems. The main
purpose of the directory is to know which blocks are in caches and their status. Suppose a
multiprocessor has M blocks in main memory, there are N processors, and each processor has
a cache. Each memory block has an N-bit directory entry. If the kth processor's cache has the block, the
kth bit in the directory is set to one.
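The N-bit presence vector described above is easy to sketch, storing each entry as an integer bitmask (the function names are invented; a real directory also keeps the lock and modified bits of question 25):

```python
N = 4                    # number of processors, for illustration
directory = {}           # block number -> N-bit presence vector (as int)

def cache_block(block, k):
    # Processor k's cache takes a copy: set bit k in the entry.
    directory[block] = directory.get(block, 0) | (1 << k)

def holders(block):
    # Which processors' caches hold this block?
    bits = directory.get(block, 0)
    return [k for k in range(N) if bits & (1 << k)]

cache_block(7, 0)
cache_block(7, 2)
print(holders(7))  # [0, 2]
```

On a write, the directory lets the memory controller send invalidations only to the caches whose bits are set, instead of broadcasting as a snoopy bus does.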
25. What are the different states of a main memory block in the directory scheme of cache coherence?

State                    Lock bit  Modified bit  Explanation
Absent (A)               0         0             All processors' cache bits in the
                                                 directory are 0 (no cache holds a copy)
Present (P)              1         0             One or more processors' cache bits are 1
Present Exclusive (PM)   0         1             Exactly one cache bit is 1
Lock (L)                 1         -             An operation involving the block is
                                                 going on and it is locked
26. What are the advantages of parallel processing?
o Increased throughput (more work done in less time)
o Economical
o Increased reliability
o Graceful degradation (fault tolerance)
27. List out the types of Multiprocessor Architecture?
• Message-Passing Architectures
– Separate address space for each processor.
– Processors communicate via message passing.
• Shared-Memory Architectures
– Single address space shared by all processors.
– Processors communicate by memory read/write.
– SMP or NUMA.
– Cache coherence is an important issue.
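The two architectures can be contrasted in miniature with threads (an illustrative sketch only: message passing as explicit send/receive through a queue, shared memory as reads/writes to one variable under a lock):

```python
import queue
import threading

# Message passing: separate state, explicit send and receive.
mailbox = queue.Queue()

def producer():
    mailbox.put(41)                    # "send" a message

def consumer(out):
    out.append(mailbox.get() + 1)      # "receive" and use it

received = []
t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer, args=(received,))
t1.start(); t2.start(); t1.join(); t2.join()

# Shared memory: one address space, communication via read/write.
shared = {"x": 0}
lock = threading.Lock()

def bump():
    with lock:                         # coherence/consistency is the
        shared["x"] += 1               # programmer's concern here

ts = [threading.Thread(target=bump) for _ in range(10)]
for t in ts: t.start()
for t in ts: t.join()

print(received, shared["x"])  # [42] 10
```

The lock in the shared-memory half hints at why coherence and synchronization dominate that architecture's design, while the message-passing half needs no shared state at all.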
28. What are symmetric and asymmetric multiprocessing?
Asymmetric multiprocessing
• Each processor is assigned a specific task; a master processor schedules and allocates
work to slave processors
• More common in extremely large systems
Symmetric multiprocessing (SMP)
• Each processor runs an identical copy of the operating system
• Many processes can run at once without performance deterioration
• Most modern operating systems support SMP