SlideShare una empresa de Scribd logo
1 de 24
VECTOR
COMPUTING
1
PRESENTED BY
VECTOR PROCESSOR
• Vector processors are special purpose computers
that match a range of (scientific) computing tasks.
• vector processors provide vector instructions. These
instructions operate in a pipeline .
3
OBJECTIVE
• Small Programs size
• No wastage
• Feeding of functional unit(FU) and the register
buses
4
HOW IT WORKS?
5
HOW IT WORKS?
6
OPERATIONS
• Add two vectors to produce a third.
• Subtract two vectors to produce a third
• Multiply two vectors to produce a third
• Divide two vectors to produce a third
• Load a vector from memory
• Store a vector to memory.
7
ARCHITECTURE
8
PROPERTIES
• Vector processors reduce the fetch and decode
bandwidth as the number of instructions fetched are
less.
• They also exploit data parallelism in large scientific
and multimedia applications.
• Many performance optimization schemes are used
in vector processors.
• Strip mining is used to generate code so that vector
operation is possible for vector operands whose size
is less than or greater than the size of vector
registers.
9
PROPERTIES
• Vector chaining the equivalent of forwarding in
vector processors - is used in case of data
dependency among vector instructions.
• Special scatter and gather instructions are provided
to efficiently operate on sparse matrices.
• Instruction are designed with the property that all
vector arithmetic instructions only allow element N
of one vector register to take part in operations with
element N from other vector registers.
10
PROPERTIES
• Based on how the operands are fetched, vector
processors can be divided into two categories - in
memory-memory architecture operands are directly
streamed to the functional units from the memory
and results are written back to memory as the vector
operation proceeds. In vector-register architecture,
operands are read into vector registers from which
they are fed to the functional units and results of
operations are written to vector registers.
11
ADVANTAGES
• Data can be represented at its original resolution and
form without generalization.
• Accurate location of data is maintained.
• Efficient encoding of topology, and as a result more
efficient operations.
• Mature, developed compiler technology
• Compact: Describe N operations with 1 short
instruction
12
SOME VECTOR PROCESSORS
13
NEW TERMS FOR VECTOR PROCESSORS
• Initiation rate
consuming operands
producing new results.
• Chime
timing measure
vector sequence
ignores the startup overhead for a vector operation.
14
• Convoy
is the set of vector instructions
potentially begin execution together in one clock period.
must complete before new instructions can begin.
• vector start-up time
overhead to start execution
related to the pipeline depth
NEW TERMS FOR VECTOR PROCESSORS
15
PROPOSED VECTOR PROCESSOR
• CODE (Clustered Organization for Decoupled
Execution) is a proposed vector architecture which
will overcome the some limitations of conventional
vector processors.
16
REASONS
• Complexity of central vector register files(VRF) - In
a processor with N vector functional units(VFU),
the register file needs approximately 3N access
ports. VRF area, power consumption and latency
are proportional to O(N*N), O(log N) and O(N)
respectively.
• Difficult to implement precise implementation - In
order to implement in-order commit, a large ROB is
needed with at least one vector register per VFU.
17
• In order to support virtual memory, large TLB is
needed so that TLB has enough entries to translate
all virtual addresses generated by a vector
instruction.
• Vector processors need expensive on-chip memory
for low latency.
REASONS
18
SOME FEATURES OF CODE
• Vector registers are organized in the form of clusters
in CODE architecture.
• CODE can hide communication latency by forcing
the output interface to look ahead into the
instruction queue and start executing register move
instructions.
• CODE supports precise exception using a history
buffer.
• In order to reduce the size of TLB.
• CODE proposes an ISA level change.
19
The Effect of cache design into
vector computers
• Numerical programs
 data sets that are too large for the current cache sizes.
Sweep accesses of a large vector
result in complete reloading of the cache
• achieve high memory bandwidth
Register files
highly interleaved memories
• Address sequentiation
20
Proposals of cache schemes
• Proposals such as prime-mapped cache schemes
have been proposed and studied. The new cache
organization minimizes cache misses caused by
cache line interferences that have been shown to be
critical in numerical applications.
• The cache lookup time of the new mapping scheme
keeps the same as conventional caches. Generation
of cache addresses for accessing the prime-mapped
cache can be done in parallel with normal address
calculations.
21
Conclusion
• Vector supercomputers
• Vector instruction
• Commodity technology like SMT
• Superscalar microprocessor
• Embedded and multimedia applications
22
References:
• J.L. Hennessy and D.A. Patterson, Computer Architecture, A
Quantitative Approach. Morgan Kaufmann, 1990.
• http://csep1.phy.ornl.gov/ca/node24.html
• http://www.comp.nus.edu.sg/~johnm/cs3220/l21.htm
• http://penta-performance.com/sager/vector/Default_vector2.htm
• www.google.com
• www.wikipedia.com
• www.youtube.com
23
24

Más contenido relacionado

La actualidad más candente

Multivector and multiprocessor
Multivector and multiprocessorMultivector and multiprocessor
Multivector and multiprocessor
Kishan Panara
 
Parallel Programming
Parallel ProgrammingParallel Programming
Parallel Programming
Uday Sharma
 
Database , 8 Query Optimization
Database , 8 Query OptimizationDatabase , 8 Query Optimization
Database , 8 Query Optimization
Ali Usman
 
memory Interleaving and low order interleaving and high interleaving
memory Interleaving and low order interleaving and high interleavingmemory Interleaving and low order interleaving and high interleaving
memory Interleaving and low order interleaving and high interleaving
Jawwad Rafiq
 

La actualidad más candente (20)

Multivector and multiprocessor
Multivector and multiprocessorMultivector and multiprocessor
Multivector and multiprocessor
 
Parallel Programming
Parallel ProgrammingParallel Programming
Parallel Programming
 
Database , 8 Query Optimization
Database , 8 Query OptimizationDatabase , 8 Query Optimization
Database , 8 Query Optimization
 
Parallel programming model, language and compiler in ACA.
Parallel programming model, language and compiler in ACA.Parallel programming model, language and compiler in ACA.
Parallel programming model, language and compiler in ACA.
 
FUNDAMENTALS OF COMPUTER DESIGN
FUNDAMENTALS OF COMPUTER DESIGNFUNDAMENTALS OF COMPUTER DESIGN
FUNDAMENTALS OF COMPUTER DESIGN
 
Unit 5 Advanced Computer Architecture
Unit 5 Advanced Computer ArchitectureUnit 5 Advanced Computer Architecture
Unit 5 Advanced Computer Architecture
 
parallel language and compiler
parallel language and compilerparallel language and compiler
parallel language and compiler
 
memory Interleaving and low order interleaving and high interleaving
memory Interleaving and low order interleaving and high interleavingmemory Interleaving and low order interleaving and high interleaving
memory Interleaving and low order interleaving and high interleaving
 
Multiprocessor Systems
Multiprocessor SystemsMultiprocessor Systems
Multiprocessor Systems
 
Query processing and optimization (updated)
Query processing and optimization (updated)Query processing and optimization (updated)
Query processing and optimization (updated)
 
Data flow architecture
Data flow architectureData flow architecture
Data flow architecture
 
VLIW Processors
VLIW ProcessorsVLIW Processors
VLIW Processors
 
program partitioning and scheduling IN Advanced Computer Architecture
program partitioning and scheduling  IN Advanced Computer Architectureprogram partitioning and scheduling  IN Advanced Computer Architecture
program partitioning and scheduling IN Advanced Computer Architecture
 
Introduction to Compiler design
Introduction to Compiler design Introduction to Compiler design
Introduction to Compiler design
 
Superscalar processor
Superscalar processorSuperscalar processor
Superscalar processor
 
Instruction Level Parallelism and Superscalar Processors
Instruction Level Parallelism and Superscalar ProcessorsInstruction Level Parallelism and Superscalar Processors
Instruction Level Parallelism and Superscalar Processors
 
Advanced Techniques for Exploiting ILP
Advanced Techniques for Exploiting ILPAdvanced Techniques for Exploiting ILP
Advanced Techniques for Exploiting ILP
 
Data Parallel and Object Oriented Model
Data Parallel and Object Oriented ModelData Parallel and Object Oriented Model
Data Parallel and Object Oriented Model
 
multiprocessors and multicomputers
 multiprocessors and multicomputers multiprocessors and multicomputers
multiprocessors and multicomputers
 
Huffman Encoding Pr
Huffman Encoding PrHuffman Encoding Pr
Huffman Encoding Pr
 

Similar a Vector computing

FIne Grain Multithreading
FIne Grain MultithreadingFIne Grain Multithreading
FIne Grain Multithreading
Dharmesh Tank
 
Application of Parallel Processing
Application of Parallel ProcessingApplication of Parallel Processing
Application of Parallel Processing
are you
 

Similar a Vector computing (20)

FIne Grain Multithreading
FIne Grain MultithreadingFIne Grain Multithreading
FIne Grain Multithreading
 
Project Slides for Website 2020-22.pptx
Project Slides for Website 2020-22.pptxProject Slides for Website 2020-22.pptx
Project Slides for Website 2020-22.pptx
 
Parallel Computing
Parallel ComputingParallel Computing
Parallel Computing
 
Application of Parallel Processing
Application of Parallel ProcessingApplication of Parallel Processing
Application of Parallel Processing
 
Basics of micro controllers for biginners
Basics of  micro controllers for biginnersBasics of  micro controllers for biginners
Basics of micro controllers for biginners
 
Latest (storage IO) patterns for cloud-native applications
Latest (storage IO) patterns for cloud-native applications Latest (storage IO) patterns for cloud-native applications
Latest (storage IO) patterns for cloud-native applications
 
Reduced instruction set computers
Reduced instruction set computersReduced instruction set computers
Reduced instruction set computers
 
Node architecture
Node architectureNode architecture
Node architecture
 
SOC Processors Used in SOC
SOC Processors Used in SOCSOC Processors Used in SOC
SOC Processors Used in SOC
 
Parallel Algorithms Advantages and Disadvantages
Parallel Algorithms Advantages and DisadvantagesParallel Algorithms Advantages and Disadvantages
Parallel Algorithms Advantages and Disadvantages
 
RISC Vs CISC Computer architecture and design
RISC Vs CISC Computer architecture and designRISC Vs CISC Computer architecture and design
RISC Vs CISC Computer architecture and design
 
Simulation of Heterogeneous Cloud Infrastructures
Simulation of Heterogeneous Cloud InfrastructuresSimulation of Heterogeneous Cloud Infrastructures
Simulation of Heterogeneous Cloud Infrastructures
 
Introduction to Apache Mesos and DC/OS
Introduction to Apache Mesos and DC/OSIntroduction to Apache Mesos and DC/OS
Introduction to Apache Mesos and DC/OS
 
A Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural NetworksA Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural Networks
 
Challenges in Embedded Computing
Challenges in Embedded ComputingChallenges in Embedded Computing
Challenges in Embedded Computing
 
Computer organization & ARM microcontrollers module 3 PPT
Computer organization & ARM microcontrollers module 3 PPTComputer organization & ARM microcontrollers module 3 PPT
Computer organization & ARM microcontrollers module 3 PPT
 
Monolithic to Microservices Architecture
Monolithic to Microservices ArchitectureMonolithic to Microservices Architecture
Monolithic to Microservices Architecture
 
A Closer Look at Apache Kudu
A Closer Look at Apache KuduA Closer Look at Apache Kudu
A Closer Look at Apache Kudu
 
CONDOR @ NGCLE@e-Novia 15.11.2017
CONDOR @ NGCLE@e-Novia 15.11.2017CONDOR @ NGCLE@e-Novia 15.11.2017
CONDOR @ NGCLE@e-Novia 15.11.2017
 
Oracle SOA suite and Coherence dehydration
Oracle SOA suite and  Coherence dehydrationOracle SOA suite and  Coherence dehydration
Oracle SOA suite and Coherence dehydration
 

Más de Safayet Hossain

Application-Aware Big Data Deduplication in Cloud Environment
Application-Aware Big Data Deduplication in Cloud EnvironmentApplication-Aware Big Data Deduplication in Cloud Environment
Application-Aware Big Data Deduplication in Cloud Environment
Safayet Hossain
 

Más de Safayet Hossain (13)

Application-Aware Big Data Deduplication in Cloud Environment
Application-Aware Big Data Deduplication in Cloud EnvironmentApplication-Aware Big Data Deduplication in Cloud Environment
Application-Aware Big Data Deduplication in Cloud Environment
 
Epipolar geometry
Epipolar geometryEpipolar geometry
Epipolar geometry
 
Find Transitive closure of a Graph Using Warshall's Algorithm
Find Transitive closure of a Graph Using Warshall's AlgorithmFind Transitive closure of a Graph Using Warshall's Algorithm
Find Transitive closure of a Graph Using Warshall's Algorithm
 
Color Guided Thermal image Super Resolution
Color Guided Thermal image Super ResolutionColor Guided Thermal image Super Resolution
Color Guided Thermal image Super Resolution
 
Different type of attack on computer
Different type of attack on computerDifferent type of attack on computer
Different type of attack on computer
 
Region based image segmentation
Region based image segmentationRegion based image segmentation
Region based image segmentation
 
Anti- aliasing computer graphics
Anti- aliasing computer graphicsAnti- aliasing computer graphics
Anti- aliasing computer graphics
 
detect emotion from text
detect emotion from textdetect emotion from text
detect emotion from text
 
Grid computing
Grid computing Grid computing
Grid computing
 
Green computing
Green computing Green computing
Green computing
 
E waste...
E   waste...E   waste...
E waste...
 
Economic presentation
Economic presentationEconomic presentation
Economic presentation
 
Remittance Management System
Remittance Management System Remittance Management System
Remittance Management System
 

Último

Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
PECB
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
QucHHunhnh
 

Último (20)

Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docx
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibit
 
Role Of Transgenic Animal In Target Validation-1.pptx
Role Of Transgenic Animal In Target Validation-1.pptxRole Of Transgenic Animal In Target Validation-1.pptx
Role Of Transgenic Animal In Target Validation-1.pptx
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 

Vector computing

  • 3. VECTOR PROCESSOR • Vector processors are special purpose computers that match a range of (scientific) computing tasks. • vector processors provide vector instructions. These instructions operate in a pipeline . 3
  • 4. OBJECTIVE • Small Programs size • No wastage • Feeding of functional unit(FU) and the register buses 4
  • 7. OPERATIONS • Add two vectors to produce a third. • Subtract two vectors to produce a third • Multiply two vectors to produce a third • Divide two vectors to produce a third • Load a vector from memory • Store a vector to memory. 7
  • 9. PROPERTIES • Vector processors reduce the fetch and decode bandwidth as the number of instructions fetched are less. • They also exploit data parallelism in large scientific and multimedia applications. • Many performance optimization schemes are used in vector processors. • Strip mining is used to generate code so that vector operation is possible for vector operands whose size is less than or greater than the size of vector registers. 9
  • 10. PROPERTIES • Vector chaining the equivalent of forwarding in vector processors - is used in case of data dependency among vector instructions. • Special scatter and gather instructions are provided to efficiently operate on sparse matrices. • Instruction are designed with the property that all vector arithmetic instructions only allow element N of one vector register to take part in operations with element N from other vector registers. 10
  • 11. PROPERTIES • Based on how the operands are fetched, vector processors can be divided into two categories - in memory-memory architecture operands are directly streamed to the functional units from the memory and results are written back to memory as the vector operation proceeds. In vector-register architecture, operands are read into vector registers from which they are fed to the functional units and results of operations are written to vector registers. 11
  • 12. ADVANTAGES • Data can be represented at its original resolution and form without generalization. • Accurate location of data is maintained. • Efficient encoding of topology, and as a result more efficient operations. • Mature, developed compiler technology • Compact: Describe N operations with 1 short instruction 12
  • 14. NEW TERMS FOR VECTOR PROCESSORS • Initiation rate consuming operands producing new results. • Chime timing measure vector sequence ignores the startup overhead for a vector operation. 14
  • 15. • Convoy is the set of vector instructions potentially begin execution together in one clock period. must complete before new instructions can begin. • vector start-up time overhead to start execution related to the pipeline depth NEW TERMS FOR VECTOR PROCESSORS 15
  • 16. PROPOSED VECTOR PROCESSOR • CODE (Clustered Organization for Decoupled Execution) is a proposed vector architecture which will overcome the some limitations of conventional vector processors. 16
  • 17. REASONS • Complexity of central vector register files(VRF) - In a processor with N vector functional units(VFU), the register file needs approximately 3N access ports. VRF area, power consumption and latency are proportional to O(N*N), O(log N) and O(N) respectively. • Difficult to implement precise implementation - In order to implement in-order commit, a large ROB is needed with at least one vector register per VFU. 17
  • 18. • In order to support virtual memory, large TLB is needed so that TLB has enough entries to translate all virtual addresses generated by a vector instruction. • Vector processors need expensive on-chip memory for low latency. REASONS 18
  • 19. SOME FEATURES OF CODE • Vector registers are organized in the form of clusters in CODE architecture. • CODE can hide communication latency by forcing the output interface to look ahead into the instruction queue and start executing register move instructions. • CODE supports precise exception using a history buffer. • In order to reduce the size of TLB. • CODE proposes an ISA level change. 19
  • 20. The Effect of cache design into vector computers • Numerical programs  data sets that are too large for the current cache sizes. Sweep accesses of a large vector result in complete reloading of the cache • achieve high memory bandwidth Register files highly interleaved memories • Address sequentiation 20
  • 21. Proposals of cache schemes • Proposals such as prime-mapped cache schemes have been proposed and studied. The new cache organization minimizes cache misses caused by cache line interferences that have been shown to be critical in numerical applications. • The cache lookup time of the new mapping scheme keeps the same as conventional caches. Generation of cache addresses for accessing the prime-mapped cache can be done in parallel with normal address calculations. 21
  • 22. Conclusion • Vector supercomputers • Vector instruction • Commodity technology like SMT • Superscalar microprocessor • Embedded and multimedia applications 22
  • 23. References: • J.L. Hennessy and D.A. Patterson, Computer Architecture, A Quantitative Approach. Morgan Kaufmann, 1990. • http://csep1.phy.ornl.gov/ca/node24.html • http://www.comp.nus.edu.sg/~johnm/cs3220/l21.htm • http://penta-performance.com/sager/vector/Default_vector2.htm • www.google.com • www.wikipedia.com • www.youtube.com 23
  • 24. 24