INSTRUCTION PIPELINING
What is pipelining?
• Instruction pipelining is a technique for increasing
  CPU performance.
• The 8086 microprocessor has two blocks:

            BIU(BUS INTERFACE UNIT)
            EU(EXECUTION UNIT)

• The BIU performs all bus operations, such as fetching
  instructions, reading and writing operands in memory, and
  calculating the addresses of memory operands. The
  instruction bytes it fetches are transferred to the instruction queue.
• The EU executes instructions from the instruction
  byte queue.
• Both units operate asynchronously, giving the 8086 an
  overlapping instruction fetch and execution mechanism,
  which is called pipelining.
INSTRUCTION PIPELINING

• The first stage fetches an instruction and buffers it.
• When the second stage is free, the first stage
  passes it the buffered instruction.
• While the second stage is executing the
  instruction, the first stage takes advantage of
  any unused memory cycles to fetch and buffer the
  next instruction.
• This is called instruction prefetch or fetch
  overlap.
Inefficiency in two stage instruction pipelining
   A two-stage pipeline rarely doubles the execution rate, for two reasons:
• Execution time is generally longer than fetch time,
  so the fetch stage may have to wait for some time
  before it can empty its buffer.
• When a conditional branch occurs, the address
  of the next instruction to be fetched is unknown
  until the branch executes. The execute stage then
  has to wait while the next instruction is fetched.
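
The first reason can be illustrated with a small timing model. This is only a sketch under simplifying assumptions that are not in the slides: every fetch takes `fetch` time units, every execution takes `execute` time units, the buffer holds a single instruction, and there are no branches. The function names are hypothetical.

```python
def sequential_time(n, fetch, execute):
    """Time for n instructions with no overlap of fetch and execute."""
    return n * (fetch + execute)


def two_stage_time(n, fetch, execute):
    """Time for n instructions in a fetch/execute pipeline with a
    one-instruction buffer (an assumption of this sketch)."""
    fetch_done = fetch   # instruction 1 must be fetched before anything runs
    exec_done = 0
    for _ in range(n):
        # Execution starts once the instruction is buffered and the
        # execute stage has finished the previous instruction.
        exec_start = max(fetch_done, exec_done)
        exec_done = exec_start + execute
        # The buffer is emptied at exec_start, so the next fetch
        # begins then and overlaps with this execution.
        fetch_done = exec_start + fetch
    return exec_done


print(sequential_time(4, fetch=1, execute=2))  # 12
print(two_stage_time(4, fetch=1, execute=2))   # 9
```

With execution twice as long as fetch, the fetch stage sits idle half of the time, so the speedup in this model is 12/9 rather than 2.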
Two stage instruction pipelining
[Figure: two-stage instruction pipeline. Simplified view: Instruction → Fetch → Execute → Result. Expanded view: adds the wait, new address, and discard paths between the Fetch and Execute stages.]
Decomposition of instruction processing

To gain further speedup, the pipeline must have more
  stages. A six-stage decomposition:

• Fetch instruction (FI)
• Decode instruction (DI)
• Calculate operands, i.e. effective addresses (CO)
• Fetch operands (FO)
• Execute instruction (EI)
• Write operand (WO)
SIX STAGES OF INSTRUCTION PIPELINING

• Fetch Instruction (FI)
          Read the next expected instruction into a buffer.
• Decode Instruction (DI)
          Determine the opcode and the operand specifiers.
• Calculate Operands (CO)
          Calculate the effective address of each source operand.
• Fetch Operands (FO)
          Fetch each operand from memory. Operands in registers
          need not be fetched.
• Execute Instruction (EI)
          Perform the indicated operation and store the result.
• Write Operand (WO)
          Store the result in memory.
Timing diagram for instruction pipeline operation
High efficiency of instruction pipelining
The timing diagram assumes the following:
• All stages are of equal duration.
• Each instruction goes through all six stages of
  the pipeline.
• All the stages can be performed in parallel.
• There are no memory conflicts.
• All the accesses can occur simultaneously.
• Under these assumptions, the instruction pipeline in the
  previous diagram works very efficiently and gives high performance.
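
Under these ideal assumptions the timing diagram can be regenerated mechanically. The sketch below is illustrative only: the choice of nine instructions and the function name `timing_diagram` are assumptions, not taken from the slides. Instruction i enters FI in time unit i, and every stage takes one time unit.

```python
STAGES = ["FI", "DI", "CO", "FO", "EI", "WO"]


def timing_diagram(n_instructions, stages=STAGES):
    """Return (rows, total): one row of cells per instruction and the
    number of time units needed when every stage takes one time unit."""
    k = len(stages)
    total = k + n_instructions - 1
    rows = []
    for i in range(n_instructions):
        # Instruction i (0-based) enters the first stage in time unit i + 1,
        # so its row is shifted right by i columns.
        cells = ["  "] * i + list(stages) + ["  "] * (total - k - i)
        rows.append(cells)
    return rows, total


rows, total = timing_diagram(9)
for number, cells in enumerate(rows, start=1):
    print(f"I{number}: " + " ".join(cells))
print("pipelined time units:  ", total)            # 14
print("unpipelined time units:", 9 * len(STAGES))  # 54
```

In general, n instructions through k equal stages need k + (n - 1) time units instead of n × k.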
Limits to performance enhancement
The factors affecting performance are:

1.   If the six stages are not of equal duration, there will
     be some waiting time at various stages.
2.   A conditional branch instruction can invalidate
     several instruction fetches.
3.   An interrupt, which is an unpredictable event.
4.   Register and memory conflicts.
5.   The CO stage may depend on the contents of a register
     that could be altered by a previous instruction that
     is still in the pipeline.
Effect of conditional branch on instruction pipeline operation
Conditional branch instructions
• Assume that instruction 3 is a conditional
  branch to instruction 15.
• Until the instruction is executed, there is no way of
  knowing which instruction will come next.
• The pipeline simply loads the next instruction
  in sequence and proceeds.
• The branch is not determined until the end of time unit
  7.
• During time unit 8, instruction 15 enters the
  pipeline.
• No instruction completes during time units 9
  through 12.
• This is the performance penalty incurred because
  the branch could not be anticipated.
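
The time units quoted above can be checked with a short calculation. This is a sketch under the same ideal assumptions as the timing diagram (six one-unit stages, one instruction entering per time unit) and it assumes the branch is taken; the function name and the `last` parameter are hypothetical.

```python
STAGES = ["FI", "DI", "CO", "FO", "EI", "WO"]


def completion_times(branch_at=3, target=15, last=16):
    """Map each instruction that completes to the time unit of its WO stage."""
    k = len(STAGES)
    ei = STAGES.index("EI")   # the branch outcome is known at the end of EI
    done = {}
    enter, instr = 1, 1       # instruction 1 enters FI in time unit 1
    while instr <= last:
        done[instr] = enter + k - 1
        if instr == branch_at:
            # Everything fetched while the branch moved from FI to EI is
            # discarded; the target enters FI in the next time unit.
            enter += ei + 1
            instr = target
        else:
            enter += 1
            instr += 1
    return done


done = completion_times()
finished = set(done.values())
idle = [t for t in range(min(finished), max(finished) + 1) if t not in finished]
print(done)   # {1: 6, 2: 7, 3: 8, 15: 13, 16: 14}
print(idle)   # [9, 10, 11, 12]
```

Instruction 3 enters FI in time unit 3 and executes in time unit 7, so instruction 15 enters in time unit 8 and completes in time unit 13, leaving time units 9 through 12 without a completed instruction.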
Simple pattern for high performance
• It might appear that the more stages a pipeline has, the
  faster the execution rate. Two factors frustrate this
  simple pattern for high performance:
1. At each stage of the pipeline, there is some
    overhead involved in moving data from buffer to
    buffer and in performing various preparation and
    delivery functions. This overhead can lengthen
    the execution time of a single instruction. This is
    significant when sequential instructions are
    logically dependent, either through heavy use of
    branching or through memory access
    dependencies.
2. The amount of control logic required to handle
    memory and register dependencies and to
    optimize the use of the pipeline increases
    with the number of stages.
Six-stage CPU instruction pipeline
Dealing with branches

A variety of approaches have been taken for dealing
  with conditional branches:
• Multiple streams
• Prefetch branch target
• Loop buffer
• Branch prediction
• Delayed branch
Multiple streams
• A simple pipeline must choose one of the two
  instructions to fetch next and may make the wrong
  choice.
• Multiple streams allow the pipeline to fetch both
  instructions, making use of two streams.
• Problems with this approach:
  • With multiple pipelines there are contention delays
    for access to the registers and to memory.
  • Additional branch instructions may enter the
    pipeline (in either stream) before the original branch
    decision is resolved. Each such instruction needs
    an additional stream.
Examples:
• IBM 370/168 and IBM 3033.
Prefetch Branch Target

• When a conditional branch is recognized, the target
  of the branch is prefetched, in addition to the instruction
  following the branch.
• This target is then saved until the branch instruction is
  executed.
• If the branch is taken, the target has already been
  prefetched.
• The IBM 360/91 uses this approach.
Loop buffer

• A loop buffer is a small, very-high-speed memory
  maintained by the instruction fetch stage.
• It contains the n most recently fetched instructions, in
  sequence.
• If a branch is to be taken, the hardware first checks
  whether the branch target is within the buffer.
• If so, the next instruction is fetched from the buffer.
Benefits of loop buffer
• Instructions fetched in sequence will be available
  without the usual memory access time.
• If a branch occurs to a target just a few locations
  ahead of the address of the branch instruction, the
  target will already be in the buffer. This is useful for
  the rather common occurrence of IF-THEN and
  IF-THEN-ELSE sequences.
• This strategy is well suited to loops or iterations, hence
  the name loop buffer. If the loop buffer is large enough
  to contain all the instructions in a loop, then those
  instructions need to be fetched from memory only
  once, for the first iteration.
• For subsequent iterations, all the needed instructions
  are already in the buffer.
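Loop buffer (cont.)
• A loop buffer is similar in principle to a cache.
• The least significant 8 bits of the branch address are used
  to index the buffer, and the remaining most significant bits
  are checked to determine whether the branch target is in the
  buffer.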
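
The indexing scheme can be sketched as follows. This is a simplified software model, not a description of any particular processor: the class and method names are hypothetical, and storing the upper address bits alongside every byte is an assumption made to keep the example short.

```python
class LoopBuffer:
    """Toy 256-byte loop buffer indexed by the low 8 address bits."""

    SIZE = 256

    def __init__(self):
        self.data = [None] * self.SIZE   # buffered instruction bytes
        self.tags = [None] * self.SIZE   # upper address bits for each slot

    def fill(self, address, byte):
        """Record a fetched instruction byte."""
        index = address & 0xFF           # least significant 8 bits
        self.data[index] = byte
        self.tags[index] = address >> 8  # remaining most significant bits

    def lookup(self, address):
        """Return the buffered byte on a hit, or None on a miss."""
        index = address & 0xFF
        if self.tags[index] == address >> 8:
            return self.data[index]
        return None


buf = LoopBuffer()
buf.fill(0x1234, 0x90)
print(buf.lookup(0x1234))  # 144 (hit: same index, same upper bits)
print(buf.lookup(0x5634))  # None (same index, different upper bits)
```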

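[Figure: loop buffer. The low 8 bits of the branch address index a 256-byte loop buffer; the most significant address bits are compared to detect a hit, and on a hit the buffer supplies the instruction to be decoded.]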
Branch prediction

Various techniques are used to predict whether a
 branch will be taken:

1. Predict Never Taken (static)
2. Predict Always Taken (static)
3. Predict by Opcode (static)
4. Taken/Not Taken Switch (dynamic)
5. Branch History Table (dynamic)
Static branch strategies
• Static strategies (1, 2, 3) do not depend on the
  execution history.
• Predict Never Taken
          Always assume that the branch will not be
  taken and continue to fetch instructions in sequence.
• Predict Always Taken
          Always assume that the branch will be taken
  and always fetch from the branch target.
• Predict by Opcode
          The decision is based on the opcode of the
  branch instruction. The processor assumes that the
  branch will be taken for certain branch opcodes and
  not for others.
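
The three static strategies can be written as functions that ignore execution history. This is only an illustration: the opcode names, the contents of `TAKEN_OPCODES`, and the sample trace are hypothetical and not taken from the slides.

```python
def predict_never_taken(opcode):
    return False


def predict_always_taken(opcode):
    return True


# Hypothetical set of opcodes that are assumed to branch more often than not.
TAKEN_OPCODES = {"LOOP", "JNZ"}


def predict_by_opcode(opcode):
    return opcode in TAKEN_OPCODES


def accuracy(predictor, trace):
    """Fraction of (opcode, taken) pairs that the predictor gets right."""
    hits = sum(predictor(opcode) == taken for opcode, taken in trace)
    return hits / len(trace)


# Made-up trace of executed branches: (opcode, was the branch taken?)
trace = [("LOOP", True), ("LOOP", True), ("JZ", False), ("LOOP", False)]
for predictor in (predict_never_taken, predict_always_taken, predict_by_opcode):
    print(predictor.__name__, accuracy(predictor, trace))
```

None of the predictors keeps any state between calls, which is what makes them static.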
Dynamic branch strategies
• Dynamic strategies (4, 5) depend on the execution
  history.
• They attempt to improve the accuracy of prediction
  by recording the history of conditional branch
  instructions in a program.
• For example, one or more bits can be associated
  with each conditional branch instruction to reflect its
  recent history.
• These bits are referred to as a taken/not taken switch.
• The history bits are kept in temporary high-speed
  storage.
• The processor consults the bits associated with a
  conditional branch instruction to make its prediction.
• Another possibility is to maintain a small table of
  recently executed branch instructions, with one or more
  history bits in each entry.
Dynamic branch strategies (cont.)
• With only one bit of history, a prediction error will occur
  twice for each use of a loop: once on entering the loop
  and once on exiting.
• With two bits, the decision process can be represented by a
  finite-state machine with four states.
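
The difference between one and two history bits can be seen by counting mispredictions on a loop-closing branch. The sketch below is one possible encoding, stated as an assumption: the two-bit predictor is modelled as a prediction plus a "wrong once" flag, which gives four states and changes the prediction only after two consecutive mispredictions. The function names and the sample loop are hypothetical.

```python
def one_bit_mispredictions(outcomes, predict_taken=True):
    """Predict whatever the branch did last time."""
    misses = 0
    for taken in outcomes:
        if predict_taken != taken:
            misses += 1
        predict_taken = taken
    return misses


def two_bit_mispredictions(outcomes, predict_taken=True):
    """Flip the prediction only after two wrong predictions in a row."""
    wrong_once = False
    misses = 0
    for taken in outcomes:
        if predict_taken == taken:
            wrong_once = False
        else:
            misses += 1
            if wrong_once:
                predict_taken = not predict_taken
                wrong_once = False
            else:
                wrong_once = True
    return misses


# A loop-closing branch: taken four times, then not taken on exit.
loop_use = [True] * 4 + [False]
outcomes = loop_use * 3   # the loop is used three times

print(one_bit_mispredictions(outcomes))  # 5
print(two_bit_mispredictions(outcomes))  # 3
```

After the first use, the one-bit scheme mispredicts twice per use of the loop, while the two-bit scheme mispredicts only on loop exit.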
Dynamic branch strategies (cont.)
• If the last two branches of a given instruction
  have taken the same path, the prediction is to take
  that path again.
• If the prediction is wrong, it remains the same the next
  time the instruction is encountered.
• If the prediction is wrong again, the
  opposite path is predicted.
• Greater efficiency could be achieved if the
  instruction fetch could be initiated as soon as the
  branch decision is made.
• For this purpose, more information must be saved, in
  what is known as a branch target buffer or a branch
  history table.
Branch history table

• It is a small cache memory associated with the
  instruction fetch stage.
• Each entry in the table consists of three elements:
  1. The address of a branch instruction.
  2. Some number of history bits.
  3. Information about the target instruction.
• The third field may contain the address of the target
  instruction or the target instruction itself.
Dealing with branches
Branching strategies
• If a branch is taken, some logic in the processor
  detects that and instructs that the next instruction
  be fetched from the target address.
• Each prefetch triggers a lookup in the branch
  history table.
• If no match is found, the next sequential instruction
  address is used for the fetch.
• If a match is found, a prediction is made based on the
  state recorded for the instruction.
• When the branch instruction is executed, the
  execute stage signals the branch history table logic
  with the result.
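
The lookup and update steps above can be sketched in software. This is a minimal model under stated assumptions, not a hardware description: the table is an unbounded dictionary rather than a small cache, a new entry starts out predicting the outcome it has just observed, and all names are hypothetical.

```python
class BranchHistoryTable:
    """Toy branch history table keyed by branch instruction address."""

    def __init__(self):
        # branch address -> [target address, predict_taken, wrong_once]
        self.entries = {}

    def next_fetch_address(self, address, instruction_length):
        """Look up a prefetch address and choose where to fetch next."""
        sequential = address + instruction_length
        entry = self.entries.get(address)
        if entry is None:                # no match: fetch sequentially
            return sequential
        target, predict_taken, _ = entry
        return target if predict_taken else sequential

    def update(self, address, taken, target):
        """Called when the execute stage reports the branch result."""
        entry = self.entries.get(address)
        if entry is None:
            self.entries[address] = [target, taken, False]
            return
        entry[0] = target
        if entry[1] == taken:            # prediction was right
            entry[2] = False
        elif entry[2]:                   # wrong twice in a row: flip
            entry[1], entry[2] = taken, False
        else:                            # wrong once: keep the prediction
            entry[2] = True


bht = BranchHistoryTable()
print(hex(bht.next_fetch_address(0x100, 2)))  # 0x102 (no entry yet)
bht.update(0x100, taken=True, target=0x80)
print(hex(bht.next_fetch_address(0x100, 2)))  # 0x80 (predicted taken)
```

Storing the target address in the entry is what lets the fetch stage redirect immediately on a predicted-taken branch, without waiting for the branch to execute.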
Delayed branch

• It is possible to improve pipeline performance by
  automatically rearranging instructions within the
  program, so that branch instructions occur later than
  actually desired.
Intel 80486 Pipelining
• Fetch
— From cache or external memory
— Put in one of two 16-byte prefetch buffers
— Fill buffer with new data as soon as old data consumed
— Average 5 instructions fetched per load
— Independent of other stages to keep buffers full
• Decode stage 1
— Opcode & address-mode info
— At most first 3 bytes of instruction
— Can direct D2 stage to get rest of instruction
• Decode stage 2
— Expand opcode into control signals
— Computation of complex address modes
• Execute
— ALU operations, cache access, register update
• Writeback
— Update registers & flags
— Results sent to cache & bus interface write buffers
THANK YOU

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 

Instruction pipelining

  • 2. What is pipelining?
      • Instruction pipelining is used to achieve greater CPU performance.
      • The 8086 microprocessor has two blocks: the BIU (Bus Interface Unit) and the EU (Execution Unit).
      • The BIU performs all bus operations, such as fetching instructions, reading and writing operands in memory, and calculating the addresses of memory operands. The instruction bytes are transferred to the instruction queue.
      • The EU executes instructions from the instruction byte queue.
      • Both units operate asynchronously, giving the 8086 an overlapping instruction fetch and execution mechanism, which is called pipelining.
  • 3. Instruction pipelining
      • The first stage fetches an instruction and buffers it.
      • When the second stage is free, the first stage passes it the buffered instruction.
      • While the second stage is executing the instruction, the first stage takes advantage of any unused memory cycles to fetch and buffer the next instruction.
      • This is called instruction prefetch or fetch overlap.
  • 4. Inefficiency in two-stage instruction pipelining
      A two-stage pipeline falls short of the ideal speedup for two reasons:
      • The execution time will generally be longer than the fetch time. Thus the fetch stage may have to wait for some time before it can empty its buffer.
      • When a conditional branch occurs, the address of the next instruction to be fetched is unknown. The fetch stage must wait for that address from the execute stage, and the execute stage may then have to wait while the next instruction is fetched.
  • 5. Two-stage instruction pipelining
      [Figure: simplified view (Instruction → Fetch → Execute → Result) and expanded view of the same two stages (labels: wait, new address, wait, discard)]
  • 6. Decomposition of instruction processing
      To gain further speedup, the pipeline must have more stages. Instruction processing can be decomposed into six stages:
      • Fetch instruction (FI)
      • Decode instruction (DI)
      • Calculate operands, i.e. effective addresses (CO)
      • Fetch operands (FO)
      • Execute instruction (EI)
      • Write operand (WO)
  • 7. Six stages of instruction pipelining
      • Fetch Instruction (FI): Read the next expected instruction into a buffer.
      • Decode Instruction (DI): Determine the opcode and the operand specifiers.
      • Calculate Operands (CO): Calculate the effective address of each source operand.
      • Fetch Operands (FO): Fetch each operand from memory. Operands in registers need not be fetched.
      • Execute Instruction (EI): Perform the indicated operation and store the result.
      • Write Operand (WO): Store the result in memory.
  • 8. Timing diagram for instruction pipeline operation
  • 9. High efficiency of instruction pipelining
      The timing diagram assumes all of the following:
      • All stages are of equal duration.
      • Each instruction goes through all six stages of the pipeline.
      • All the stages can be performed in parallel.
      • There are no memory conflicts.
      • All the accesses can occur simultaneously.
      Under these assumptions, the pipeline in the previous diagram works very efficiently and gives high performance.
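The ideal case described above can be sketched in a few lines of Python. This is a minimal illustration, assuming equal-duration stages and no conflicts, exactly as the slide does; the function names `timing_diagram` and `total_time` are made up for this sketch and are not from the slides. Under these assumptions, n instructions on a k-stage pipeline finish in k + (n - 1) time units.

```python
# Ideal six-stage pipeline: one instruction enters per time unit.
STAGES = ["FI", "DI", "CO", "FO", "EI", "WO"]


def timing_diagram(num_instructions, stages=STAGES):
    """Return one row per instruction; each row lists the stage occupied
    in every time unit ("--" means the instruction is not in the pipeline)."""
    total = len(stages) + num_instructions - 1
    rows = []
    for i in range(num_instructions):
        row = ["--"] * total
        # Instruction i (0-based) enters FI in time unit i + 1.
        for s, name in enumerate(stages):
            row[i + s] = name
        rows.append(row)
    return rows


def total_time(num_instructions, num_stages=len(STAGES)):
    """Time units needed in the ideal case: k + (n - 1)."""
    return num_stages + num_instructions - 1


if __name__ == "__main__":
    for number, row in enumerate(timing_diagram(9), start=1):
        print(f"I{number:<2} " + " ".join(row))
    print("pipelined time units:  ", total_time(9))
    print("unpipelined time units:", 9 * len(STAGES))
```

For nine instructions this gives 14 time units instead of the 54 needed when each instruction runs all six stages before the next one starts.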
  • 10. Limits to performance enhancement
      The factors affecting performance are:
      1. If the six stages are not of equal duration, there will be some waiting time at various stages.
      2. A conditional branch instruction can invalidate several instruction fetches.
      3. Interrupts are unpredictable events.
      4. Register and memory conflicts.
      5. The CO stage may depend on the contents of a register that could be altered by a previous instruction that is still in the pipeline.
  • 11. Effect of conditional branch on instruction pipeline operation
  • 12. Conditional branch instructions
      • Assume that instruction 3 is a conditional branch to instruction 15.
      • Until the instruction is executed, there is no way of knowing which instruction will come next.
      • The pipeline simply loads the next instruction in sequence and proceeds.
      • The branch is not determined until the end of time unit 7.
      • During time unit 8, instruction 15 enters the pipeline.
      • No instruction completes during time units 9 through 12.
      • This is the performance penalty incurred because the branch could not be anticipated.
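The numbers in this example can be reproduced with a small calculation. The sketch below assumes the ideal six-stage pipeline from the earlier slides, with instruction i entering FI in time unit i and the branch resolved in its EI stage; the function name and its parameters are hypothetical, chosen only for this illustration.

```python
NUM_STAGES = 6        # FI, DI, CO, FO, EI, WO
EXECUTE_POSITION = 5  # EI is the fifth stage


def simulate_taken_branch(branch_instr, target_instr, instrs_from_target=2):
    """Model a taken conditional branch in an otherwise ideal pipeline.

    Returns (completes, discarded, idle):
      completes - {instruction number: time unit in which it finishes WO}
      discarded - instructions fetched in sequence and then thrown away
      idle      - time units in which no instruction completes
    """
    # Instruction i enters FI in time unit i and finishes WO in unit i + 5.
    completes = {i: i + NUM_STAGES - 1 for i in range(1, branch_instr + 1)}

    # The branch outcome is known at the end of the unit it spends in EI.
    resolve_time = branch_instr + EXECUTE_POSITION - 1

    # Everything fetched after the branch, up to that point, is discarded.
    discarded = list(range(branch_instr + 1, resolve_time + 1))

    # The target enters the pipeline in the next time unit.
    entry = resolve_time + 1
    for offset in range(instrs_from_target):
        completes[target_instr + offset] = entry + offset + NUM_STAGES - 1

    finished = set(completes.values())
    idle = [t for t in range(min(finished), max(finished) + 1)
            if t not in finished]
    return completes, discarded, idle


if __name__ == "__main__":
    completes, discarded, idle = simulate_taken_branch(3, 15)
    print("branch resolved at end of time unit", 3 + EXECUTE_POSITION - 1)
    print("discarded instructions:", discarded)
    print("time units with no completion:", idle)
```

With instruction 3 as the branch, instructions 4 through 7 are discarded and time units 9 through 12 see no completed instruction, matching the slide.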
  • 13. Simple pattern for high performance
      Two factors frustrate the simple pattern of adding pipeline stages to gain performance:
      1. At each stage of the pipeline, there is some overhead involved in moving data from buffer to buffer and in performing various preparation and delivery functions. This overhead lengthens the execution time of a single instruction. It is significant when sequential instructions are logically dependent, either through heavy use of branching or through memory access dependencies.
      2. The amount of control logic required to handle memory and register dependencies and to optimize the use of the pipeline increases with the number of stages.
  • 15. Dealing with branches
      A variety of approaches have been taken for dealing with conditional branches:
      • Multiple streams
      • Prefetch branch target
      • Loop buffer
      • Branch prediction
      • Delayed branch
  • 16. Multiple streams
      • A simple pipeline must choose one of two instructions to fetch next after a branch, and may make the wrong choice.
      • Multiple streams allow the pipeline to fetch both instructions, making use of two streams.
      • Problems with this approach:
          - With multiple pipelines, there are contention delays for access to the registers and to memory.
          - Additional branch instructions may enter the pipeline (either stream) before the original branch decision is resolved. Each such instruction needs an additional stream.
      • Examples: IBM 370/168 and IBM 3033.
  • 17. Prefetch branch target
      • When a conditional branch is recognized, the target of the branch is prefetched, in addition to the instruction following the branch.
      • This target is then saved until the branch instruction is executed.
      • If the branch is taken, the target has already been prefetched.
      • The IBM 360/91 uses this approach.
  • 18. Loop buffer
      • A loop buffer is a small, very high-speed memory maintained by the instruction fetch stage.
      • It contains the n most recently fetched instructions, in sequence.
      • If a branch is to be taken, the hardware first checks whether the branch target is within the buffer.
      • If so, the next instruction is fetched from the buffer.
  • 19. Benefits of loop buffer
      • Instructions fetched in sequence will be available without the usual memory access time.
      • If a branch occurs to a target just a few locations ahead of the address of the branch instruction, the target will already be in the buffer. This is useful for the rather common occurrence of IF-THEN and IF-THEN-ELSE sequences.
      • This approach is well suited to loops or iterations, hence the name loop buffer. If the loop buffer is large enough to contain all the instructions in a loop, then those instructions need to be fetched from memory only once, for the first iteration.
      • For subsequent iterations, all the needed instructions are already in the buffer.
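  • 20. Cont.
      • A loop buffer is similar to a cache.
      • The least significant 8 bits of the branch address are used to index the buffer, and the remaining most significant bits are checked to determine whether the branch target is in the buffer.
      [Figure: loop buffer (256 bytes). The low 8 bits of the branch address index the buffer; the most significant address bits are compared; on a hit, the buffer supplies the instruction to be decoded.]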
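The index-and-compare lookup described above can be sketched as follows. This is only an illustration: it assumes one stored byte per buffer slot and a stored tag (the high-order address bits) per slot, in the manner of a direct-mapped cache. The class and method names are invented for the sketch; the slides do not specify how the high-order bits are stored.

```python
INDEX_BITS = 8
BUFFER_SIZE = 1 << INDEX_BITS  # 256 one-byte slots


class LoopBuffer:
    """Toy loop buffer indexed by the low 8 bits of an address."""

    def __init__(self):
        # Assumption: each slot remembers the high-order bits (tag)
        # of the address whose byte it currently holds.
        self.tags = [None] * BUFFER_SIZE
        self.data = [None] * BUFFER_SIZE

    def fill(self, address, byte):
        """Record a byte just fetched from memory."""
        index = address & (BUFFER_SIZE - 1)
        self.tags[index] = address >> INDEX_BITS
        self.data[index] = byte

    def lookup(self, address):
        """Return the buffered byte on a hit, or None on a miss."""
        index = address & (BUFFER_SIZE - 1)
        if self.tags[index] == address >> INDEX_BITS:
            return self.data[index]
        return None


if __name__ == "__main__":
    buffer = LoopBuffer()
    buffer.fill(0x1234, 0xAA)
    print(buffer.lookup(0x1234))  # hit: same index, same high-order bits
    print(buffer.lookup(0x2234))  # miss: same index, different high-order bits
```

On a miss the fetch stage would fall back to a normal memory access; that path is left out here.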
  • 21. Branch prediction
      Various techniques are used to predict whether a branch will be taken:
      1. Predict never taken (static)
      2. Predict always taken (static)
      3. Predict by opcode (static)
      4. Taken/not taken switch (dynamic)
      5. Branch history table (dynamic)
  • 22. Static branch strategies
      • The static strategies (1, 2, 3) do not depend on the execution history.
      • Predict never taken: always assume that the branch will not be taken and continue to fetch instructions in sequence.
      • Predict always taken: always assume that the branch will be taken and always fetch from the branch target.
      • Predict by opcode: the decision is based on the opcode of the branch instruction. The processor assumes that the branch will be taken for certain branch opcodes and not for others.
  • 23. Dynamic branch strategies
      • The dynamic strategies (4, 5) depend on the execution history.
      • They attempt to improve the accuracy of prediction by recording the history of conditional branch instructions in a program.
      • For example, one or more bits can be associated with each conditional branch instruction to reflect its recent history.
      • These bits are referred to as a taken/not taken switch.
      • The history bits are stored in temporary high-speed memory.
      • One possibility is to associate the bits with a conditional branch instruction and use them to make the decision.
      • Another possibility is to maintain a small table of recent branch history, with one or more bits in each entry.
  • 24. Cont.
      • With only one bit of history, a prediction error will occur twice for each use of a loop: once on entering the loop and once on exiting.
      • With two bits, the decision process can be represented by a finite-state machine with four states.
  • 25. Cont.
      • If the last two executions of a given branch instruction took the same path, the prediction is to take that path again.
      • If the prediction is wrong, it remains the same the next time the instruction is encountered.
      • If the prediction is wrong again, the opposite path is predicted.
      • Greater efficiency could be achieved if the instruction fetch could be initiated as soon as the branch decision is made.
      • For this purpose, more information must be saved, in what is known as a branch target buffer or a branch history table.
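The difference between one and two history bits can be checked with a short simulation. The sketch below is an assumed encoding of the behaviour just described (the prediction flips only after two consecutive wrong predictions); the class names and the initial "predict taken" state are choices made for this illustration, not taken from the slides.

```python
class OneBitPredictor:
    """Predict that the branch does whatever it did last time."""

    def __init__(self, initial=True):
        self.prediction = initial

    def predict(self):
        return self.prediction

    def update(self, taken):
        self.prediction = taken


class TwoBitPredictor:
    """Four states: a predicted direction plus whether the previous
    prediction was wrong. The direction flips only after two
    consecutive wrong predictions."""

    def __init__(self, initial=True):
        self.prediction = initial
        self.misses = 0  # consecutive wrong predictions, 0 or 1

    def predict(self):
        return self.prediction

    def update(self, taken):
        if taken == self.prediction:
            self.misses = 0
            return
        self.misses += 1
        if self.misses == 2:
            self.prediction = taken
            self.misses = 0


def count_mispredictions(predictor, outcomes):
    wrong = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            wrong += 1
        predictor.update(taken)
    return wrong


if __name__ == "__main__":
    # A loop-closing branch: taken three times, then not taken on exit.
    # The loop is used ten times.
    outcomes = [True, True, True, False] * 10
    print("one bit: ", count_mispredictions(OneBitPredictor(), outcomes))
    print("two bits:", count_mispredictions(TwoBitPredictor(), outcomes))
```

In this run the one-bit scheme is wrong twice per use of the loop after the first use, while the two-bit scheme is wrong only once per use, on loop exit.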
  • 26. Branch history table
      • It is a small cache memory associated with the instruction fetch stage.
      • Each entry in the table consists of three elements:
          - The address of a branch instruction.
          - Some number of history bits.
          - Information about the target instruction.
      • The third field may contain the address of the target instruction or the target instruction itself.
  • 28. Branching strategies
      • If a branch is taken, logic in the processor detects this and directs that the next instruction be fetched from the target address.
      • Each prefetch triggers a lookup in the branch history table.
      • If no match is found, the next sequential instruction address is used for the fetch.
      • If a match occurs, a prediction is made based on the state of the instruction.
      • When the branch instruction is executed, the execute stage signals the branch history table logic with the result.
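The lookup and update steps above can be sketched as a small table keyed by branch address. Everything here is illustrative: the table is an unbounded Python dictionary rather than a small cache with a replacement policy, the target field holds an address rather than the instruction itself, and the history bits use the assumed "flip after two consecutive misses" rule. The names `BranchHistoryTable`, `next_fetch_address`, and `record_result` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    target: int              # address of the target instruction
    prediction: bool = True  # history: currently predicted direction
    misses: int = 0          # history: consecutive wrong predictions


class BranchHistoryTable:
    def __init__(self):
        # Assumption: no capacity limit or replacement policy.
        self.entries = {}

    def next_fetch_address(self, fetch_address, instr_length=1):
        """Lookup triggered by each prefetch."""
        entry = self.entries.get(fetch_address)
        if entry is not None and entry.prediction:
            return entry.target              # match, predicted taken
        return fetch_address + instr_length  # no match or predicted not taken

    def record_result(self, branch_address, taken, target):
        """Called when the execute stage signals the branch outcome."""
        entry = self.entries.get(branch_address)
        if entry is None:
            self.entries[branch_address] = Entry(target, prediction=taken)
            return
        entry.target = target
        if taken == entry.prediction:
            entry.misses = 0
            return
        entry.misses += 1
        if entry.misses == 2:
            entry.prediction = taken
            entry.misses = 0


if __name__ == "__main__":
    table = BranchHistoryTable()
    print(table.next_fetch_address(100))  # no entry yet: sequential fetch
    table.record_result(100, taken=True, target=200)
    print(table.next_fetch_address(100))  # entry predicts taken: fetch target
```

A real table would also need to handle a wrong prediction by discarding the instructions fetched along the wrong path, which this sketch does not model.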
  • 29. Delayed branch
      • It is possible to improve pipeline performance by automatically rearranging instructions within the program, so that branch instructions occur later than actually desired.
  • 30. Intel 80486 pipelining
      • Fetch
          - From cache or external memory
          - Put in one of two 16-byte prefetch buffers
          - Fill buffer with new data as soon as old data is consumed
          - Average of 5 instructions fetched per load
          - Independent of other stages, to keep buffers full
      • Decode stage 1 (D1)
          - Opcode and address-mode information
          - At most the first 3 bytes of the instruction
          - Can direct the D2 stage to get the rest of the instruction
      • Decode stage 2 (D2)
          - Expand opcode into control signals
          - Computation of complex address modes
      • Execute
          - ALU operations, cache access, register update
      • Writeback
          - Update registers and flags
          - Results sent to cache and bus interface write buffers