SlideShare una empresa de Scribd logo
1 de 82
Velammal Engineering College
Department of Computer Science
and Engineering
Welcome…
Slide Sources: Patterson & Hennessy COD book
website (copyright Morgan Kaufmann) adapted
and supplemented
Mr. A. Arockia Abins &
Ms. R. Amirthavalli,
Asst. Prof,
CSE,
Velammal Engineering College
UNIT III
PROCESSOR AND
CONTROL UNIT
Topics:
 Basic MIPS implementation – Building datapath –
Control Implementation scheme – Pipelining –
Pipelined datapath and control – Handling Data
hazards & Control hazards – Exceptions.
2
Implementing MIPS
• We're ready to look at an implementation of the MIPS instruction set
• Simplified to contain only
o arithmetic-logic instructions: add, sub, and, or, slt
o memory-reference instructions: lw, sw
o control-flow instructions: beq, j
op rs rt offset
6 bits 5 bits 5 bits 16 bits
op rs rt rd funct
shamt
6 bits 5 bits 5 bits 5 bits 5 bits 6 bits
R-Format
I-Format
op address
6 bits 26 bits
J-Format
3
Implementing MIPS: the
Fetch/Execute Cycle
• High-level abstract view of fetch/execute implementation
o use the program counter (PC) to read instruction address
o fetch the instruction from memory and increment PC
o use fields of the instruction to select registers to read
o execute depending on the instruction
o repeat…
Registers
Register #
Data
Register #
Data
memory
Address
Data
Register #
PC Instruction ALU
Instruction
memory
Address
4
The basic implementation of the MIPS subset, including the necessary
multiplexors and control lines
5
Overview: Processor
Implementation Styles
• Single Cycle
o perform each instruction in 1 clock cycle
o clock cycle must be long enough for slowest instruction; therefore,
o disadvantage: only as fast as slowest instruction
• Multi-Cycle
o break fetch/execute cycle into multiple steps
o perform 1 step in each clock cycle
o advantage: each instruction uses only as many cycles as it needs
• Pipelined
o execute each instruction in multiple steps
o perform 1 step / instruction in each clock cycle
o process multiple instructions in parallel – assembly line
6
2.Building datapath
 datapath element
 A unit used to operate on or hold data within a processor.
 MIPS – datapath elements:
1. Program Counter(PC)
2. Instruction Memory
3. Register File
4. ALU
5. Data Memory
7
Datapath: Instruction
Store/Fetch & PC Increment
PC
Instruction
memory
Instruction
address
Instruction
a. Instruction memory b. Program counter
Add Sum
c. Adder
PC
Instruction
memory
Read
address
Instruction
4
Add
Three elements used to store
and fetch instructions and
increment the PC
Datapath
8
Animating the Datapath
Instruction <- MEM[PC]
PC <- PC + 4
RD
Memory
ADDR
PC
Instruction
4
ADD
9
Datapath: R-Type
Instruction
ALU control
RegWrite
Registers
Write
register
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Write
data
ALU
result
ALU
Data
Data
Register
numbers
a. Registers b. ALU
Zero
5
5
5 3
Instruction
Registers
Write
register
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Write
data
ALU
result
ALU
Zero
RegWrite
ALU operation
3
Two elements used to implement
R-type instructions
Datapath
op rs rt rd funct
shamt
6 bits 5 bits 5 bits 5 bits 5 bits 6 bits
R-Format
4 4
10
Animating the Datapath
add rd, rs, rt
R[rd] <- R[rs] + R[rt];
5 5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Register File
op rs rt rd funct
shamt
Operation
ALU Zero
Instruction
3
11
Datapath:
Load/Store Instruction
16 32
Sign
extend
b. Sign-extension unit
MemRead
MemWrite
Data
memory
Write
data
Read
data
a. Data memory unit
Address
Instruction
16 32
Registers
Write
register
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Data
memory
Write
data
Read
data
Write
data
Sign
extend
ALU
result
Zero
ALU
Address
MemRead
MemWrite
RegWrite
ALU operation
3
Two additional elements used
To implement load/stores
Datapath
op rs rt offset
6 bits 5 bits 5 bits 16 bits
I-Format
12
Animating the Datapath
lw rt, offset(rs)
R[rt] <- MEM[R[rs] + s_extend(offset)];
13
Animating the Datapath
sw rt, offset(rs)
MEM[R[rs] + sign_extend(offset)] <- R[rt]
14
Datapath: Branch
Instruction
16 32
Sign
extend
Zero
ALU
Sum
Shift
left 2
To branch
control logic
Branch target
PC + 4 from instruction datapath
Instruction
Add
Registers
Write
register
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Write
data
RegWrite
ALU operation
3
Datapath
No shift hardware required:
simply connect wires from
input to output, each shifted
left 2 bits
15
Animating the Datapath
beq rs, rt, offset
if (R[rs] == R[rt]) then
PC <- PC+4 + s_extend(offset<<2)
16
MIPS Datapath
PC
Instruction
memory
Read
address
Instruction
16 32
Add ALU
result
M
u
x
Registers
Write
register
Write
data
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Shift
left 2
4
M
u
x
ALU operation
3
RegWrite
MemRead
MemWrite
PCSrc
ALUSrc
MemtoReg
ALU
result
Zero
ALU
Data
memory
Address
Write
data
Read
data M
u
x
Sign
extend
Add
Adding branch capability and another multiplexor
Instruction address is either
PC+4 or branch target address
Extra adder needed as both
adders operate in each cycle
New multiplexor
17
Datapath Executing add
add rd, rs, rt
5 5
16
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Register File
Operation
ALU
3
E
X
T
N
D
16 32
Zero
RD
WD
MemRead
Data
Memory
ADDR
MemWrite
5
Instruction
32
M
U
X
ALUSrc
MemtoReg
ADD
<<2
RD
Instruction
Memory
ADDR
PC
4
ADD
ADD
M
U
X
M
U
X
PCSrc
18
Datapath Executing lw
lw rt,offset(rs)
5 5
16
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Register File
Operation
ALU
3
E
X
T
N
D
16 32
Zero
RD
WD
MemRead
Data
Memory
ADDR
MemWrite
5
Instruction
32
M
U
X
ALUSrc
MemtoReg
ADD
<<2
RD
Instruction
Memory
ADDR
PC
4
ADD
ADD
M
U
X
M
U
X
PCSrc
19
Datapath Executing sw
sw rt,offset(rs)
5 5
16
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Register File
Operation
ALU
3
E
X
T
N
D
16 32
Zero
RD
WD
MemRead
Data
Memory
ADDR
MemWrite
5
Instruction
32
M
U
X
ALUSrc
MemtoReg
ADD
<<2
RD
Instruction
Memory
ADDR
PC
4
ADD
ADD
M
U
X
M
U
X
PCSrc
20
Datapath Executing beq
beq r1,r2,offset
5 5
16
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Register File
Operation
ALU
3
E
X
T
N
D
16 32
Zero
RD
WD
MemRead
Data
Memory
ADDR
MemWrite
5
Instruction
32
M
U
X
ALUSrc
MemtoReg
ADD
<<2
RD
Instruction
Memory
ADDR
PC
4
ADD
ADD
M
U
X
M
U
X
PCSrc
21
Control Implementation
Scheme
• Control unit takes input from
o the instruction opcode bits
• Control unit generates
o ALU control input
o write enable (possibly, read enable also) signals for each storage
element
o selector controls for each multiplexor
22
ALU Control
• Plan to control ALU: main control sends a 2-bit ALUOp control field to the ALU
control. Based on ALUOp and funct field of instruction the ALU control generates
the 4-bit ALU control field
o ALU control Func-
field tion
0000 and
0001 or
0010 add
0110 sub
0111 slt
• ALU must perform
o add for load/stores (ALUOp 00)
o sub for branches (ALUOp 01)
o one of and, or, add, sub, slt for R-type instructions, depending on the instruction’s 6-bit
funct field (ALUOp 10)
Main
Control
ALU
Control
2
ALUOp
6
Instruction
funct field
4
ALU
control
input
To
ALU
23
Setting ALU Control Bits
Instruction AluOp Instruction Funct Field Desired ALU control
opcode operation ALU action input
LW 00 load word xxxxxx add 0010
SW 00 store word xxxxxx add 0010
Branch eq 01 branch eq xxxxxx subtract 0110
R-type 10 add 100000 add 0010
R-type 10 subtract 100010 subtract 0110
R-type 10 AND 100100 and 0000
R-type 10 OR 100101 or 0001
R-type 10 set on less 101010 set on less 0111
Truth table for ALU control bits
ALUOp Funct field Operation
ALUOp1 ALUOp0 F5 F4 F3 F2 F1 F0
0 0 X X X X X X 0010
X 1 X X X X X X 0110
1 X X X 0 0 0 0 0010
1 X X X 0 0 1 0 0110
1 X X X 0 1 0 0 0000
1 X X X 0 1 0 1 0001
1 X X X 1 0 1 0 0111 24
Designing the Main
Control
• Observations about MIPS instruction format
o opcode is always in bits 31-26
o two registers to be read are always rs (bits 25-21) and rt (bits 20-
16)
o base register for load/stores is always rs (bits 25-21)
o 16-bit offset for branch equal and load/store is always bits 15-0
o destination register for loads is in bits 20-16 (rt) while for R-type
instructions it is in bits 15-11 (rd) (will require multiplexor to select)
31-26 25-21 20-16 15-11 10-6 5-0
31-26 25-21 20-16 15-0
opcode
opcode
rs
rs
rt
rt address
rd shamt funct
R-type
Load/store
or branch
25
The Main Control Unit
• Control signals derived from instruction
0 rs rt rd shamt funct
31:26 5:0
25:21 20:16 15:11 10:6
35 or 43 rs rt address
31:26 25:21 20:16 15:0
4 rs rt address
31:26 25:21 20:16 15:0
R-type
Load/
Store
Branch
opcode always
read
read,
except
for load
write for
R-type
and load
sign-extend
and add
26
Datapath with Control I
New multiplexor
27
Control Signals
Signal Name Effect when deasserted Effect when asserted
RegDst The register destination number for the The register destination number for the
Write register comes from the rt field (bits 20-16) Write register comes from the rd field (bits 15-11)
RegWrite None The register on the Write register input is written
with the value on the Write data input
AlLUSrc The second ALU operand comes from the The second ALU operand is the sign-extended,
second register file output (Read data 2) lower 16 bits of the instruction
PCSrc The PC is replaced by the output of the adder The PC is replaced by the output of the adder
that computes the value of PC + 4 that computes the branch target
MemRead None Data memory contents designated by the address
input are put on the first Read data output
MemWrite None Data memory contents designated by the address
input are replaced by the value of the Write data input
MemtoReg The value fed to the register Write data input The value fed to the register Write data input
comes from the ALU comes from the data memory
Effects of the seven control signals
28
Datapath with Control II
PC
Instruction
memory
Read
address
Instruction
[31– 0]
Instruction [20 16]
Instruction [25 21]
Add
Instruction [5 0]
MemtoReg
ALUOp
MemWrite
RegWrite
MemRead
Branch
RegDst
ALUSrc
Instruction [31 26]
4
16 32
Instruction [15 0]
0
0
M
u
x
0
1
Control
Add
ALU
result
M
u
x
0
1
Registers
Write
register
Write
data
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Sign
extend
M
u
x
1
ALU
result
Zero
PCSrc
Data
memory
Write
data
Read
data
M
u
x
1
Instruction [15 11]
ALU
control
Shift
left 2
ALU
Address
MIPS datapath with the control unit: input to control is the 6-bit instruction
opcode field, output is seven 1-bit signals and the 2-bit ALUOp signal
29
Datapath with Control II (cont.)
Instruction RegDst ALUSrc
Memto-
Reg
Reg
Write
Mem
Read
Mem
Write Branch ALUOp1 ALUp0
R-format 1 0 0 1 0 0 0 1 0
lw 0 1 1 1 1 0 0 0 0
sw X 1 X 0 0 1 0 0 0
beq X 0 X 0 0 0 1 0 1
PC
Instruction
memory
Read
address
Instruction
[31– 0]
Instruction [20 16]
Instruction [25 21]
Add
Instruction [5 0]
MemtoReg
ALUOp
MemWrite
RegWrite
MemRead
Branch
RegDst
ALUSrc
Instruction [31 26]
4
16 32
Instruction [15 0]
0
0
M
u
x
0
1
Control
Add
ALU
result
M
u
x
0
1
Registers
Write
register
Write
data
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Sign
extend
M
u
x
1
ALU
result
Zero
PCSrc
Data
memory
Write
data
Read
data
M
u
x
1
Instruction [15 11]
ALU
control
Shift
left 2
ALU
Address
Determining control signals for the MIPS datapath based on instruction opcode
PCSrc cannot be
set directly from the
opcode: zero test
outcome is required
30
Control Signals:
R-Type Instruction
Control signals
shown in blue
1
0
0
0
1
???
Value depends on
funct
0
0
5 5
16
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Register File
Operation
ALU
3
E
X
T
N
D
16 32
Zero
RD
WD
MemRead
Data
Memory
ADDR
MemWrite
5
Instruction
I
32
M
U
X
ALUSrc
MemtoReg
ADD
<<2
RD
Instruction
Memory
ADDR
PC
4
ADD
ADD
M
U
X
M
U
X
PCSrc
MUX
RegDst
5
rd
I[15:11]
rt
I[20:16]
rs
I[25:21]
immediate/
offset
I[15:0]
0
1
0
1
1
0
1
0
4
31
5 5
16
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Register File
Operation
ALU
3
E
X
T
N
D
16 32
Zero
RD
WD
MemRead
Data
Memory
ADDR
MemWrite
5
Instruction
I
32
M
U
X
ALUSrc
MemtoReg
ADD
<<2
RD
Instruction
Memory
ADDR
PC
4
ADD
ADD
M
U
X
M
U
X
PCSrc
MUX
RegDst
5
rd
I[15:11]
rt
I[20:16]
rs
I[25:21]
immediate/
offset
I[15:0]
0
1
0
1
1
0
1
0
Control Signals:
lw Instruction
0
Control signals
shown in blue
0
0010
1
1
1
0
1
32
5 5
16
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Register File
Operation
ALU
3
E
X
T
N
D
16 32
Zero
RD
WD
MemRead
Data
Memory
ADDR
MemWrite
5
Instruction
I
32
M
U
X
ALUSrc
MemtoReg
ADD
<<2
RD
Instruction
Memory
ADDR
PC
4
ADD
ADD
M
U
X
M
U
X
PCSrc
MUX
RegDst
5
rd
I[15:11]
rt
I[20:16]
rs
I[25:21]
immediate/
offset
I[15:0]
0
1
0
1
1
0
1
0
Control Signals:
sw Instruction
0
Control signals
shown in blue
X
0010
1
X
0
1
0
33
5 5
16
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Register File
Operation
ALU
3
E
X
T
N
D
16 32
Zero
RD
WD
MemRead
Data
Memory
ADDR
MemWrite
5
Instruction
I
32
M
U
X
ALUSrc
MemtoReg
ADD
<<2
RD
Instruction
Memory
ADDR
PC
4
ADD
ADD
M
U
X
M
U
X
PCSrc
MUX
RegDst
5
rd
I[15:11]
rt
I[20:16]
rs
I[25:21]
immediate/
offset
I[15:0]
0
1
0
1
1
0
1
0
Control Signals:
beq Instruction
Control signals
shown in blue
X
0110
0
X
0
0
0
1 if Zero=1
34
Datapath with Control III
Shift
left 2
PC
Instruction
memory
Read
address
Instruction
[31– 0]
Data
memory
Read
data
Write
data
Registers
Write
register
Write
data
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Instruction [15– 11]
Instruction [20– 16]
Instruction [25– 21]
Add
ALU
result
Zero
Instruction [5– 0]
MemtoReg
ALUOp
MemWrite
RegWrite
MemRead
Branch
Jump
RegDst
ALUSrc
Instruction [31– 26]
4
M
u
x
Instruction [25– 0] Jump address [31– 0]
PC+4 [31– 28]
Sign
extend
16 32
Instruction [15– 0]
1
M
u
x
1
0
M
u
x
0
1
M
u
x
0
1
ALU
control
Control
Add ALU
result
M
u
x
0
1 0
ALU
Shift
left 2
26 28
Address
31-26 25-0
opcode address
Jump
MIPS datapath extended to jumps: control unit generates new Jump control bit
New multiplexor with additional
control bit Jump
Composing jump
target address
35
Datapath Executing j
36
Datapath with Control III
Shift
left 2
PC
Instruction
memory
Read
address
Instruction
[31– 0]
Data
memory
Read
data
Write
data
Registers
Write
register
Write
data
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Instruction [15– 11]
Instruction [20– 16]
Instruction [25– 21]
Add
ALU
result
Zero
Instruction [5– 0]
MemtoReg
ALUOp
MemWrite
RegWrite
MemRead
Branch
Jump
RegDst
ALUSrc
Instruction [31– 26]
4
M
u
x
Instruction [25– 0] Jump address [31– 0]
PC+4 [31– 28]
Sign
extend
16 32
Instruction [15– 0]
1
M
u
x
1
0
M
u
x
0
1
M
u
x
0
1
ALU
control
Control
Add ALU
result
M
u
x
0
1 0
ALU
Shift
left 2
26 28
Address
31-26 25-0
opcode address
Jump
MIPS datapath extended to jumps: control unit generates new Jump control bit
New multiplexor with additional
control bit Jump
Composing jump
target address
37
Enhancing Performance with
Pipelining
 An implementation technique in which
multiple instructions are overlapped in execution.
38
Pipelining
• Start work ASAP!! Do not waste time!
Time
7
6 PM 8 9 10 11 12 1 2 AM
A
B
C
D
Time
7
6 PM 8 9 10 11 12 1 2 AM
A
B
C
D
Task
order
Task
order
Assume 30 min. each task – wash, dry, fold, store – and that
separate tasks use separate hardware and so can be overlapped
Pipelined
Not pipelined
39
Breaking instructions
into steps
• We break instructions into the following potential execution steps –
not all instructions require all the steps – each step takes one
clock cycle
1. Instruction fetch and PC increment (IF)
2. Instruction decode and register fetch (ID)
3. Execution, memory address computation, or branch completion (EX)
4. Memory access or R-type instruction completion (MEM)
5. Memory read completion (WB)
• Each MIPS instruction takes from 3 – 5 cycles (steps)
40
Step 1: Instruction Fetch
& PC Increment (IF)
• Use PC to get instruction and put it in the instruction register.
Increment the PC by 4 and put the result back in the PC.
• Can be described succinctly using RTL (Register-Transfer Language):
IR = Memory[PC];
PC = PC + 4;
41
Step 2: Instruction Decode
and Register Fetch (ID)
• Read registers rs and rt in case we need them.
Compute the branch address in case the instruction is a branch.
• RTL:
A = Reg[IR[25-21]];
B = Reg[IR[20-16]];
ALUOut = PC + (sign-extend(IR[15-0]) << 2);
42
Step 3: Execution, Address
Computation or Branch
Completion (EX)
• ALU performs one of four functions depending on instruction
type
o memory reference:
ALUOut = A + sign-extend(IR[15-0]);
o R-type:
ALUOut = A op B;
o branch (instruction completes):
if (A==B) PC = ALUOut;
o jump (instruction completes):
PC = PC[31-28] || (IR(25-0) << 2)
43
Step 4: Memory access or
R-type Instruction
Completion
(MEM)
• Again depending on instruction type:
• Loads and stores access memory
o load
MDR = Memory[ALUOut];
o store (instruction completes)
Memory[ALUOut] = B;
• R-type (instructions completes)
Reg[IR[15-11]] = ALUOut;
44
Step 5: Memory Read
Completion (WB)
• Again depending on instruction type:
• Load writes back (instruction completes)
Reg[IR[20-16]]= MDR;
Important: There is no reason from a datapath (or control) point of
view that Step 5 cannot be eliminated by performing
Reg[IR[20-16]]= Memory[ALUOut];
for loads in Step 4. This would eliminate the MDR as well.
The reason this is not done is that, to keep steps balanced in
length, the design restriction is to allow each step to contain at
most one ALU operation, or one register access, or one
memory access.
45
Summary of Instruction
Execution
Step name
Action for R-type
instructions
Action for memory-reference
instructions
Action for
branches
Action for
jumps
Instruction fetch IR = Memory[PC]
PC = PC + 4
Instruction A = Reg [IR[25-21]]
decode/register fetch B = Reg [IR[20-16]]
ALUOut = PC + (sign-extend (IR[15-0]) << 2)
Execution, address ALUOut = A op B ALUOut = A + sign-extend if (A ==B) then PC = PC [31-28] II
computation, branch/ (IR[15-0]) PC = ALUOut (IR[25-0]<<2)
jump completion
Memory access or R-type Reg [IR[15-11]] = Load: MDR = Memory[ALUOut]
completion ALUOut or
Store: Memory [ALUOut] = B
Memory read completion Load: Reg[IR[20-16]] = MDR
1: IF
2: ID
3: EX
4: MEM
5: WB
Step
46
Pipelined vs. Single-Cycle
Instruction Execution: the
Plan
Instruction
fetch
Reg ALU
Data
access
Reg
8 ns
Instruction
fetch
Reg ALU
Data
access
Reg
8 ns
Instruction
fetch
8 ns
Time
lw $1, 100($0)
lw $2, 200($0)
lw $3, 300($0)
2 4 6 8 10 12 14 16 18
2 4 6 8 10 12 14
...
Program
execution
order
(in instructions)
Instruction
fetch
Reg ALU
Data
access
Reg
Time
lw $1, 100($0)
lw $2, 200($0)
lw $3, 300($0)
2 ns
Instruction
fetch
Reg ALU
Data
access
Reg
2 ns
Instruction
fetch
Reg ALU
Data
access
Reg
2 ns 2 ns 2 ns 2 ns 2 ns
Program
execution
order
(in instructions)
Single-cycle
Pipelined
Assume 2 ns for memory access, ALU operation; 1 ns for register access:
therefore, single cycle clock 8 ns; pipelined clock cycle 2 ns.
47
Pipelining: Keep in
Mind
• Pipelining does not reduce latency of a single task, it
increases throughput of entire workload
• Pipeline rate limited by longest stage
o potential speedup = number pipe stages
o unbalanced lengths of pipe stages reduces speedup
• Time to fill pipeline and time to drain it – when there is
slack in the pipeline – reduces speedup
48
Pipelining MIPS
• What makes it easy with MIPS?
o all instructions are same length
• so fetch and decode stages are similar for all instructions
o just a few instruction formats
• simplifies instruction decode and makes it possible in one stage
o memory operands appear only in load/stores
• so memory access can be deferred to exactly one later stage
o operands are aligned in memory
• one data transfer instruction requires one memory access stage
49
Pipelining MIPS
• What makes it hard?
o structural hazards: different instructions, at different stages,
in the pipeline want to use the same hardware resource
o control hazards: succeeding instruction, to put into pipeline,
depends on the outcome of a previous branch instruction,
already in pipeline
o data hazards: an instruction in the pipeline requires data to
be computed by a previous instruction still in the pipeline
• Before actually building the pipelined datapath and control
we first briefly examine these potential hazards
individually…
50
Structural Hazards
• Structural hazard: inadequate hardware to simultaneously
support all instructions in the pipeline in the same clock cycle
• E.g., suppose single – not separate – instruction and data
memory in pipeline below with one read port
o then a structural hazard between first and fourth lw instructions
• MIPS was designed to be pipelined: structural hazards are easy
to avoid!
2 4 6 8 10 12 14
Instruction
fetch
Reg ALU
Data
access
Reg
Time
lw $1, 100($0)
lw $2, 200($0)
lw $3, 300($0)
2 ns
Instruction
fetch
Reg ALU
Data
access
Reg
2 ns
Instruction
fetch
Reg ALU
Data
access
Reg
2 ns 2 ns 2 ns 2 ns 2 ns
Program
execution
order
(in instructions)
Pipelined
Instruction
fetch
Reg ALU
Data
access
Reg
2 ns
lw $4, 400($0)
Hazard if single memory
51
Control Hazards
• Control hazard: need to make a decision based on the result of a
previous instruction still executing in pipeline
• Solution 1 Stall the pipeline
Instruction
fetch
Reg ALU
Data
access
Reg
Time
beq $1, $2, 40
add $4, $5, $6
lw $3, 300($0)
4 ns
Instruction
fetch
Reg ALU
Data
access
Reg
2ns
Instruction
fetch
Reg ALU
Data
access
Reg
2ns
2 4 6 8 10 12 14 16
Program
execution
order
(in instructions)
Pipeline stall
bubble
Note that branch outcome is
computed in ID stage with
added hardware (later…)
52
Control Hazards
• Solution 2 Predict branch outcome
o e.g., predict branch-not-taken :
Instruction
fetch
Reg ALU
Data
access
Reg
Time
beq $1, $2, 40
add $4, $5, $6
lw $3, 300($0)
Instruction
fetch
Reg ALU
Data
access
Reg
2 ns
Instruction
fetch
Reg ALU
Data
access
Reg
2 ns
Program
execution
order
(in instructions)
Instruction
fetch
Reg ALU
Data
access
Reg
Time
beq $1, $2, 40
add $4, $5 ,$6
or $7, $8, $9
Instruction
fetch
Reg ALU
Data
access
Reg
2 4 6 8 10 12 14
2 4 6 8 10 12 14
Instruction
fetch
Reg ALU
Data
access
Reg
2 ns
4 ns
bubble bubble bubble bubble bubble
Program
execution
order
(in instructions)
Prediction success
Prediction failure: undo (=flush) lw 53
Control Hazards
• Solution 3 Delayed branch: always execute the sequentially next
statement with the branch executing after one instruction delay
– compiler’s job to find a statement that can be put in the slot
that is independent of branch outcome
o MIPS does this – but it is an option in SPIM (Simulator -> Settings)
Instruction
fetch
Reg ALU
Data
access
Reg
Time
beq $1, $2, 40
add $4, $5, $6
lw $3, 300($0)
Instruction
fetch
Reg ALU
Data
access
Reg
2 ns
Instruction
fetch
Reg ALU
Data
access
Reg
2 ns
2 4 6 8 10 12 14
2 ns
(d elayed branch slot)
Program
execution
order
(in instructions)
Delayed branch beq is followed by add that is
independent of branch outcome
54
Data Hazards
• Data hazard: instruction needs data from the result of a previous
instruction still executing in pipeline
• Solution Forward data if possible…
Time
2 4 6 8 10
add $s0, $t0, $t1 IF ID WB
EX MEM
add $s0, $t0, $t1
sub $t2, $s0, $t3
Program
execution
order
(in instructions)
IF ID WB
EX
IF ID MEM
EX
Time
2 4 6 8 10
MEM
WB
MEM
Instruction pipeline diagram:
shade indicates use –
left=write, right=read
Without forwarding – blue line –
data has to go back in time;
with forwarding – red line –
data is available in time
55
Data Hazards
• Forwarding may not be enough
o e.g., if an R-type instruction following a load uses the result of the
load – called load-use data hazard
Time
2 4 6 8 10 12 14
lw $s0, 20($t1)
sub $t2, $s0, $t3
Program
execution
order
(in instructions)
IF ID WB
MEM
EX
IF ID WB
MEM
EX
Time
2 4 6 8 10 12 14
lw $s0, 20($t1)
sub $t2, $s0, $t3
Program
execution
order
(in instructions)
IF ID WB
MEM
EX
IF ID WB
MEM
EX
bubble bubble bubble bubble bubble
With a one-stage stall, forwarding
can get the data to the sub
instruction in time
Without a stall it is impossible
to provide input to the sub
instruction in time
56
Reordering Code to Avoid
Pipeline Stall (Software
Solution)
• Example:
lw $t0, 0($t1)
lw $t2, 4($t1)
sw $t2, 0($t1)
sw $t0, 4($t1)
• Reordered code:
lw $t0, 0($t1)
lw $t2, 4($t1)
sw $t0, 4($t1)
sw $t2, 0($t1)
Data hazard
Interchanged
57
Pipelined Datapath
• We now move to actually building a pipelined datapath
• First recall the 5 steps in instruction execution
1. Instruction Fetch & PC Increment (IF)
2. Instruction Decode and Register Read (ID)
3. Execution or calculate address (EX)
4. Memory access (MEM)
5. Write result into register (WB)
• Review: single-cycle processor
o all 5 steps done in a single clock cycle
o dedicated hardware required for each step
• What happens if we break the execution into multiple cycles, but
keep the extra hardware?
58
Review - Single-Cycle
Datapath “Steps”
5 5
16
RD1
RD2
RN1 RN2 WN
WD
Register File ALU
E
X
T
N
D
16 32
RD
WD
Data
Memory
ADDR
5
Instruction I
32
M
U
X
<<2
RD
Instruction
Memory
ADDR
PC
4
ADD
ADD
M
U
X
32
IF
Instruction Fetch
ID
Instruction Decode
EX
Execute/ Address Calc.
MEM
Memory Access
WB
Write Back
Zero
59
Pipelined Datapath – Key
Idea
• What happens if we break the execution into multiple cycles, but keep the
extra hardware?
o Answer: We may be able to start executing a new instruction at each clock
cycle - pipelining
• …but we shall need extra registers to hold data between cycles –
pipeline registers
60
Pipelined Datapath
IF/ID
Pipeline registers
5 5
16
RD1
RD2
RN1 RN2 WN
WD
Register File ALU
E
X
T
N
D
16 32
RD
WD
Data
Memory
ADDR
5
Instruction I
32
M
U
X
<<2
RD
Instruction
Memory
ADDR
PC
4
ADD
ADD
M
U
X
32
ID/EX EX/MEM MEM/WB
Zero
64 bits
97 bits 64 bits
128 bits
wide enough to hold data coming in
61
Pipelined Datapath
IF/ID
Pipeline registers
5 5
16
RD1
RD2
RN1 RN2 WN
WD
Register File ALU
E
X
T
N
D
16 32
RD
WD
Data
Memory
ADDR
5
Instruction I
32
M
U
X
<<2
RD
Instruction
Memory
ADDR
PC
4
ADD
ADD
M
U
X
32
ID/EX EX/MEM MEM/WB
Zero
64 bits
97 bits 64 bits
128 bits
wide enough to hold data coming in
Only data flowing right to left may cause hazard…, why? 62
Bug in the Datapath
5 5
16
RD1
RD2
RN1 RN2 WN
WD
Register File ALU
E
X
T
N
D
16 32
RD
WD
Data
Memory
ADDR
5
Instruction I
32
M
U
X
<<2
RD
Instruction
Memory
ADDR
PC
4
ADD
ADD
M
U
X
32
Write register number comes from another later instruction!
IF/ID ID/EX EX/MEM MEM/WB
63
Corrected Datapath
5
RD1
RD2
RN1
RN2
WN
WD
Register
File
ALU
E
X
T
N
D
16 32
RD
WD
Data
Memory
ADDR
32
M
U
X
<<2
RD
Instruction
Memory
ADDR
PC
4
ADD
ADD
M
U
X
5
5
5
EX/MEM MEM/WB
Zero
ID/EX
IF/ID
64 bits 133 bits
102 bits 69 bits
Destination register number is also passed through ID/EX, EX/MEM
and MEM/WB registers, which are now wider by 5 bits 64
Pipelined Example
• Consider the following instruction sequence:
lw $t0, 10($t1)
sw $t3, 20($t4)
add $t5, $t6, $t7
sub $t8, $t9, $t10
65
Single-Clock-Cycle Diagram:
Clock Cycle 1
LW
66
Single-Clock-Cycle Diagram:
Clock Cycle 2
LW
SW
67
Single-Clock-Cycle Diagram:
Clock Cycle 3
LW
SW
ADD
68
Single-Clock-Cycle Diagram:
Clock Cycle 4 LW
SW
ADD
SUB
69
Single-Clock-Cycle Diagram:
Clock Cycle 5 LW
SW
ADD
SUB
70
Single-Clock-Cycle Diagram:
Clock Cycle 6 SW
ADD
SUB
71
Single-Clock-Cycle Diagram:
Clock Cycle 7 ADD
SUB
72
Single-Clock-Cycle Diagram:
Clock Cycle 8 SUB
73
Multiple-Clock-Cycle
Diagram
IM REG ALU DM REG
lw $t0, 10($t1)
sw $t3, 20($t4)
add $t5, $t6, $t7
CC 1 CC 2 CC 3 CC 4 CC 5 CC 6 CC 7
IM REG ALU DM REG
IM REG ALU DM REG
sub $t8, $t9, $t10 IM REG ALU DM REG
CC 8
Time axis
74
Recall Single-Cycle
Control – the Datapath
PC
Instruction
memory
Read
address
Instruction
[31– 0]
Instruction [20 16]
Instruction [25 21]
Add
Instruction [5 0]
MemtoReg
ALUOp
MemWrite
RegWrite
MemRead
Branch
RegDst
ALUSrc
Instruction [31 26]
4
16 32
Instruction [15 0]
0
0
M
u
x
0
1
Control
Add
ALU
result
M
u
x
0
1
Registers
Write
register
Write
data
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Sign
extend
M
u
x
1
ALU
result
Zero
PCSrc
Data
memory
Write
data
Read
data
M
u
x
1
Instruction [15 11]
ALU
control
Shift
left 2
ALU
Address
75
Recall Single-Cycle – ALU Control
Instruction AluOp Instruction Funct Field Desired ALU control
opcode operation ALU action input
LW 00 load word xxxxxx add 010
SW 00 store word xxxxxx add 010
Branch eq 01 branch eq xxxxxx subtract 110
R-type 10 add 100000 add 010
R-type 10 subtract 100010 subtract 110
R-type 10 AND 100100 and 000
R-type 10 OR 100101 or 001
R-type 10 set on less 101010 set on less 111
Truth table for ALU control bits
ALUOp Funct field Operation
ALUOp1 ALUOp0 F5 F4 F3 F2 F1 F0
0 0 X X X X X X 010
0 1 X X X X X X 110
1 X X X 0 0 0 0 010
1 X X X 0 0 1 0 110
1 X X X 0 1 0 0 000
1 X X X 0 1 0 1 001
1 X X X 1 0 1 0 111 76
Signals
Signal Name Effect when deasserted Effect when asserted
RegDst The register destination number for the The register destination number for the
Write register comes from the rt field (bits 20-16) Write register comes from the rd field (bits 15-11)
RegWrite None The register on the Write register input is written
with the value on the Write data input
AlLUSrc The second ALU operand comes from the The second ALU operand is the sign-extended,
second register file output (Read data 2) lower 16 bits of the instruction
PCSrc The PC is replaced by the output of the adder The PC is replaced by the output of the adder
that computes the value of PC + 4 that computes the branch target
MemRead None Data memory contents designated by the address
input are put on the first Read data output
MemWrite None Data memory contents designated by the address
input are replaced by the value of the Write data input
MemtoReg The value fed to the register Write data input The value fed to the register Write data input
comes from the ALU comes from the data memory
Instruction RegDst ALUSrc
Memto-
Reg
Reg
Write
Mem
Read
Mem
Write Branch ALUOp1 ALUp0
R-format 1 0 0 1 0 0 0 1 0
lw 0 1 1 1 1 0 0 0 0
sw X 1 X 0 0 1 0 0 0
beq X 0 X 0 0 0 1 0 1
Effect of control bits
Deter-
mining
control
bits
77
Pipeline Control
• Initial design – motivated by single-cycle datapath control – use
the same control signals
• Observe:
o No separate write signal for the PC as it is written every cycle
o No separate write signals for the pipeline registers as they are written
every cycle
o No separate read signal for instruction memory as it is read every
clock cycle
o No separate read signal for register file as it is read every clock cycle
• Need to set control signals during each pipeline stage
• Since control signals are associated with components active
during a single pipeline stage, can group control lines into five
groups according to pipeline stage
Will be
modified
by hazard
detection
unit!!
78
Pipelined Datapath with
Control I
PC
Instruction
memory
Address
Instruction
Instruction
[20– 16]
MemtoReg
ALUOp
Branch
RegDst
ALUSrc
4
16 32
Instruction
[15– 0]
0
0
Registers
Write
register
Write
data
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Sign
extend
M
u
x
1
Write
data
Read
data M
u
x
1
ALU
control
RegWrite
MemRead
Instruction
[15– 11]
6
IF/ID ID/EX EX/MEM MEM/WB
MemWrite
Address
Data
memory
PCSrc
Zero
Add
Add
result
Shift
left 2
ALU
result
ALU
Zero
Add
0
1
M
u
x
0
1
M
u
x
Same control
signals as the
single-cycle
datapath
79
Pipeline Control Signals
• There are five stages in the pipeline
o instruction fetch / PC increment
o instruction decode / register fetch
o execution / address calculation
o memory access
o write back
Execution/Address Calculation
stage control lines
Memory access stage
control lines
Write-back
stage control
lines
Instruction
Reg
Dst
ALU
Op1
ALU
Op0
ALU
Src Branch
Mem
Read
Mem
Write
Reg
write
Mem to
Reg
R-format 1 1 0 0 0 0 0 1 0
lw 0 0 0 1 0 1 0 1 1
sw X 0 0 1 0 0 1 0 X
beq X 0 1 0 1 0 0 0 X
Nothing to control as instruction memory
read and PC write are always enabled
80
Pipeline Control
Implementation
• Pass control signals along just like the data – extend each pipeline register to hold needed control bits for succeeding
stages
• Note: The 6-bit funct field of the instruction required in the EX stage to generate ALU control can be retrieved as the 6
least significant bits of the immediate field which is sign-extended and passed from the IF/ID register to the ID/EX
register
Control
EX
M
WB
M
WB
WB
IF/ID ID/EX EX/MEM MEM/WB
Instruction
81
Pipelined Datapath with
Control II
PC
Instruction
memory
Instruction
Add
Instruction
[20– 16]
MemtoReg
ALUOp
Branch
RegDst
ALUSrc
4
16 32
Instruction
[15– 0]
0
0
M
u
x
0
1
Add
Add
result
Registers
Write
register
Write
data
Read
data 1
Read
data 2
Read
register 1
Read
register 2
Sign
extend
M
u
x
1
ALU
result
Zero
Write
data
Read
data
M
u
x
1
ALU
control
Shift
left 2
RegWrite
MemRead
Control
ALU
Instruction
[15– 11]
6
EX
M
WB
M
WB
WB
IF/ID
PCSrc
ID/EX
EX/MEM
MEM/WB
M
u
x
0
1
MemWrite
Address
Data
memory
Address
Control signals
emanate from
the control
portions of the
pipeline registers
82

Más contenido relacionado

La actualidad más candente

A reusable verification environment for NoC platforms using UVM
A reusable verification environment for NoC platforms using UVMA reusable verification environment for NoC platforms using UVM
A reusable verification environment for NoC platforms using UVMSameh El-Ashry
 
Konfigurasi Site-to-Site IPSec VPN Tunnel di Mikrotik menggunakan GNS3
Konfigurasi Site-to-Site IPSec VPN Tunnel di Mikrotik menggunakan GNS3Konfigurasi Site-to-Site IPSec VPN Tunnel di Mikrotik menggunakan GNS3
Konfigurasi Site-to-Site IPSec VPN Tunnel di Mikrotik menggunakan GNS3I Putu Hariyadi
 
RISC-V Introduction
RISC-V IntroductionRISC-V Introduction
RISC-V IntroductionYi-Hsiu Hsu
 
Instruction Set Architecture – II
Instruction Set Architecture – IIInstruction Set Architecture – II
Instruction Set Architecture – IIDilum Bandara
 
Arbitration in computer organization
 Arbitration in computer organization   Arbitration in computer organization
Arbitration in computer organization Amit kashyap
 
Types of buses of computer
Types of buses of computerTypes of buses of computer
Types of buses of computerSAGAR DODHIA
 
Chapter 9 - Computer Networking a top-down Approach 7th
Chapter 9 - Computer Networking a top-down Approach 7thChapter 9 - Computer Networking a top-down Approach 7th
Chapter 9 - Computer Networking a top-down Approach 7thAndy Juan Sarango Veliz
 
Central processing unit and stack organization r013
Central processing unit and stack organization   r013Central processing unit and stack organization   r013
Central processing unit and stack organization r013arunachalamr16
 
Arm Processors Architectures
Arm Processors ArchitecturesArm Processors Architectures
Arm Processors ArchitecturesMohammed Hilal
 
Linux Kernel - Virtual File System
Linux Kernel - Virtual File SystemLinux Kernel - Virtual File System
Linux Kernel - Virtual File SystemAdrian Huang
 
Associative memory and set associative memory mapping
Associative memory and set associative memory mappingAssociative memory and set associative memory mapping
Associative memory and set associative memory mappingSnehalataAgasti
 
RISC-V on Edge: Porting EVE and Alpine Linux to RISC-V
RISC-V on Edge: Porting EVE and Alpine Linux to RISC-VRISC-V on Edge: Porting EVE and Alpine Linux to RISC-V
RISC-V on Edge: Porting EVE and Alpine Linux to RISC-VScyllaDB
 
Intro to Buses (Computer Architecture)
Intro to Buses  (Computer Architecture)Intro to Buses  (Computer Architecture)
Intro to Buses (Computer Architecture)Matthew Levandowski
 
Physical computing and iot programming final with cp sycs sem 3
Physical computing and iot programming final with cp sycs sem 3Physical computing and iot programming final with cp sycs sem 3
Physical computing and iot programming final with cp sycs sem 3WE-IT TUTORIALS
 
Lec14 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech --- Coherence
Lec14 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech --- CoherenceLec14 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech --- Coherence
Lec14 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech --- CoherenceHsien-Hsin Sean Lee, Ph.D.
 

La actualidad más candente (20)

A reusable verification environment for NoC platforms using UVM
A reusable verification environment for NoC platforms using UVMA reusable verification environment for NoC platforms using UVM
A reusable verification environment for NoC platforms using UVM
 
Konfigurasi Site-to-Site IPSec VPN Tunnel di Mikrotik menggunakan GNS3
Konfigurasi Site-to-Site IPSec VPN Tunnel di Mikrotik menggunakan GNS3Konfigurasi Site-to-Site IPSec VPN Tunnel di Mikrotik menggunakan GNS3
Konfigurasi Site-to-Site IPSec VPN Tunnel di Mikrotik menggunakan GNS3
 
cache memory
 cache memory cache memory
cache memory
 
RISC-V Introduction
RISC-V IntroductionRISC-V Introduction
RISC-V Introduction
 
Instruction Set Architecture – II
Instruction Set Architecture – IIInstruction Set Architecture – II
Instruction Set Architecture – II
 
Arbitration in computer organization
 Arbitration in computer organization   Arbitration in computer organization
Arbitration in computer organization
 
Types of buses of computer
Types of buses of computerTypes of buses of computer
Types of buses of computer
 
Chapter 9 - Computer Networking a top-down Approach 7th
Chapter 9 - Computer Networking a top-down Approach 7thChapter 9 - Computer Networking a top-down Approach 7th
Chapter 9 - Computer Networking a top-down Approach 7th
 
Introduction to ARM Architecture
Introduction to ARM ArchitectureIntroduction to ARM Architecture
Introduction to ARM Architecture
 
Chapter 6
Chapter 6Chapter 6
Chapter 6
 
Central processing unit and stack organization r013
Central processing unit and stack organization   r013Central processing unit and stack organization   r013
Central processing unit and stack organization r013
 
Rip ospf and bgp
Rip ospf and bgpRip ospf and bgp
Rip ospf and bgp
 
Arm Processors Architectures
Arm Processors ArchitecturesArm Processors Architectures
Arm Processors Architectures
 
Linux Kernel - Virtual File System
Linux Kernel - Virtual File SystemLinux Kernel - Virtual File System
Linux Kernel - Virtual File System
 
Associative memory and set associative memory mapping
Associative memory and set associative memory mappingAssociative memory and set associative memory mapping
Associative memory and set associative memory mapping
 
RISC-V on Edge: Porting EVE and Alpine Linux to RISC-V
RISC-V on Edge: Porting EVE and Alpine Linux to RISC-VRISC-V on Edge: Porting EVE and Alpine Linux to RISC-V
RISC-V on Edge: Porting EVE and Alpine Linux to RISC-V
 
Intro to Buses (Computer Architecture)
Intro to Buses  (Computer Architecture)Intro to Buses  (Computer Architecture)
Intro to Buses (Computer Architecture)
 
Physical computing and iot programming final with cp sycs sem 3
Physical computing and iot programming final with cp sycs sem 3Physical computing and iot programming final with cp sycs sem 3
Physical computing and iot programming final with cp sycs sem 3
 
Slide3
Slide3Slide3
Slide3
 
Lec14 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech --- Coherence
Lec14 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech --- CoherenceLec14 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech --- Coherence
Lec14 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech --- Coherence
 

Similar a PROCESSOR AND CONTROL UNIT

Lec1 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Pipelining
Lec1 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- PipeliningLec1 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Pipelining
Lec1 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- PipeliningHsien-Hsin Sean Lee, Ph.D.
 
3. Single Cycle Data Path in computer architecture
3. Single Cycle Data Path in computer architecture3. Single Cycle Data Path in computer architecture
3. Single Cycle Data Path in computer architecturebsse20142018
 
Chapter 4 the processor
Chapter 4 the processorChapter 4 the processor
Chapter 4 the processors9007912
 
Single Cycle Processing
Single Cycle ProcessingSingle Cycle Processing
Single Cycle Processinginmogr
 
multi cycle in microprocessor 8086 sy B-tech
multi cycle  in microprocessor 8086 sy B-techmulti cycle  in microprocessor 8086 sy B-tech
multi cycle in microprocessor 8086 sy B-techRushikeshThorat24
 
CMPN301-Pipelining_V2.pptx
CMPN301-Pipelining_V2.pptxCMPN301-Pipelining_V2.pptx
CMPN301-Pipelining_V2.pptxNadaAAmin
 
Control unit-implementation
Control unit-implementationControl unit-implementation
Control unit-implementationWBUTTUTORIALS
 
My seminar new 28
My seminar new 28My seminar new 28
My seminar new 28rajeshkvdn
 
IntroductionCPU performance factorsInstruction countDeterm.docx
IntroductionCPU performance factorsInstruction countDeterm.docxIntroductionCPU performance factorsInstruction countDeterm.docx
IntroductionCPU performance factorsInstruction countDeterm.docxnormanibarber20063
 
LPC 2148 Instructions Set.ppt
LPC 2148 Instructions Set.pptLPC 2148 Instructions Set.ppt
LPC 2148 Instructions Set.pptProfBadariNathK
 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxvipinkmenon1
 
Basic computer organization and design
Basic computer organization and designBasic computer organization and design
Basic computer organization and designmahesh kumar prajapat
 
heuring_jordan_----------ch02(1) (1).pptx
heuring_jordan_----------ch02(1) (1).pptxheuring_jordan_----------ch02(1) (1).pptx
heuring_jordan_----------ch02(1) (1).pptxchristinamary2620
 
arm-instructionset.pdf
arm-instructionset.pdfarm-instructionset.pdf
arm-instructionset.pdfRoopaPatil24
 

Similar a PROCESSOR AND CONTROL UNIT (20)

Lec1 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Pipelining
Lec1 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- PipeliningLec1 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Pipelining
Lec1 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Pipelining
 
3. Single Cycle Data Path in computer architecture
3. Single Cycle Data Path in computer architecture3. Single Cycle Data Path in computer architecture
3. Single Cycle Data Path in computer architecture
 
Chapter 4 the processor
Chapter 4 the processorChapter 4 the processor
Chapter 4 the processor
 
Single Cycle Processing
Single Cycle ProcessingSingle Cycle Processing
Single Cycle Processing
 
multi cycle in microprocessor 8086 sy B-tech
multi cycle  in microprocessor 8086 sy B-techmulti cycle  in microprocessor 8086 sy B-tech
multi cycle in microprocessor 8086 sy B-tech
 
Chp 2 and 3.pptx
Chp 2 and 3.pptxChp 2 and 3.pptx
Chp 2 and 3.pptx
 
CMPN301-Pipelining_V2.pptx
CMPN301-Pipelining_V2.pptxCMPN301-Pipelining_V2.pptx
CMPN301-Pipelining_V2.pptx
 
Lec02
Lec02Lec02
Lec02
 
Control unit-implementation
Control unit-implementationControl unit-implementation
Control unit-implementation
 
My seminar new 28
My seminar new 28My seminar new 28
My seminar new 28
 
Pipelining1
Pipelining1Pipelining1
Pipelining1
 
Ch5_MorrisMano.pptx
Ch5_MorrisMano.pptxCh5_MorrisMano.pptx
Ch5_MorrisMano.pptx
 
IntroductionCPU performance factorsInstruction countDeterm.docx
IntroductionCPU performance factorsInstruction countDeterm.docxIntroductionCPU performance factorsInstruction countDeterm.docx
IntroductionCPU performance factorsInstruction countDeterm.docx
 
LPC 2148 Instructions Set.ppt
LPC 2148 Instructions Set.pptLPC 2148 Instructions Set.ppt
LPC 2148 Instructions Set.ppt
 
Ch 8
Ch 8Ch 8
Ch 8
 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptx
 
Basic computer organization and design
Basic computer organization and designBasic computer organization and design
Basic computer organization and design
 
Chapter 4
Chapter 4Chapter 4
Chapter 4
 
heuring_jordan_----------ch02(1) (1).pptx
heuring_jordan_----------ch02(1) (1).pptxheuring_jordan_----------ch02(1) (1).pptx
heuring_jordan_----------ch02(1) (1).pptx
 
arm-instructionset.pdf
arm-instructionset.pdfarm-instructionset.pdf
arm-instructionset.pdf
 

Último

Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Christo Ananth
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxupamatechverse
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSISrknatarajan
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Dr.Costas Sachpazis
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝soniya singh
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escortsranjana rawat
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).pptssuser5c9d4b1
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...ranjana rawat
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...ranjana rawat
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 

Último (20)

Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 

PROCESSOR AND CONTROL UNIT

  • 1. Velammal Engineering College Department of Computer Science and Engineering Welcome… Slide Sources: Patterson & Hennessy COD book website (copyright Morgan Kaufmann) adapted and supplemented Mr. A. Arockia Abins & Ms. R. Amirthavalli, Asst. Prof, CSE, Velammal Engineering College
  • 2. UNIT III PROCESSOR AND CONTROL UNIT Topics:  Basic MIPS implementation – Building datapath – Control Implementation scheme – Pipelining – Pipelined datapath and control – Handling Data hazards & Control hazards – Exceptions. 2
  • 3. Implementing MIPS • We're ready to look at an implementation of the MIPS instruction set • Simplified to contain only o arithmetic-logic instructions: add, sub, and, or, slt o memory-reference instructions: lw, sw o control-flow instructions: beq, j op rs rt offset 6 bits 5 bits 5 bits 16 bits op rs rt rd funct shamt 6 bits 5 bits 5 bits 5 bits 5 bits 6 bits R-Format I-Format op address 6 bits 26 bits J-Format 3
  • 4. Implementing MIPS: the Fetch/Execute Cycle • High-level abstract view of fetch/execute implementation o use the program counter (PC) to read instruction address o fetch the instruction from memory and increment PC o use fields of the instruction to select registers to read o execute depending on the instruction o repeat… Registers Register # Data Register # Data memory Address Data Register # PC Instruction ALU Instruction memory Address 4
  • 5. The basic implementation of the MIPS subset, including the necessary multiplexors and control lines 5
  • 6. Overview: Processor Implementation Styles • Single Cycle o perform each instruction in 1 clock cycle o clock cycle must be long enough for slowest instruction; therefore, o disadvantage: only as fast as slowest instruction • Multi-Cycle o break fetch/execute cycle into multiple steps o perform 1 step in each clock cycle o advantage: each instruction uses only as many cycles as it needs • Pipelined o execute each instruction in multiple steps o perform 1 step / instruction in each clock cycle o process multiple instructions in parallel – assembly line 6
  • 7. 2.Building datapath  datapath element  A unit used to operate on or hold data within a processor.  MIPS – datapath elements: 1. Program Counter(PC) 2. Instruction Memory 3. Register File 4. ALU 5. Data Memory 7
  • 8. Datapath: Instruction Store/Fetch & PC Increment PC Instruction memory Instruction address Instruction a. Instruction memory b. Program counter Add Sum c. Adder PC Instruction memory Read address Instruction 4 Add Three elements used to store and fetch instructions and increment the PC Datapath 8
  • 9. Animating the Datapath Instruction <- MEM[PC] PC <- PC + 4 RD Memory ADDR PC Instruction 4 ADD 9
  • 10. Datapath: R-Type Instruction ALU control RegWrite Registers Write register Read data 1 Read data 2 Read register 1 Read register 2 Write data ALU result ALU Data Data Register numbers a. Registers b. ALU Zero 5 5 5 3 Instruction Registers Write register Read data 1 Read data 2 Read register 1 Read register 2 Write data ALU result ALU Zero RegWrite ALU operation 3 Two elements used to implement R-type instructions Datapath op rs rt rd funct shamt 6 bits 5 bits 5 bits 5 bits 5 bits 6 bits R-Format 4 4 10
  • 11. Animating the Datapath add rd, rs, rt R[rd] <- R[rs] + R[rt]; 5 5 5 RD1 RD2 RN1 RN2 WN WD RegWrite Register File op rs rt rd funct shamt Operation ALU Zero Instruction 3 11
  • 12. Datapath: Load/Store Instruction 16 32 Sign extend b. Sign-extension unit MemRead MemWrite Data memory Write data Read data a. Data memory unit Address Instruction 16 32 Registers Write register Read data 1 Read data 2 Read register 1 Read register 2 Data memory Write data Read data Write data Sign extend ALU result Zero ALU Address MemRead MemWrite RegWrite ALU operation 3 Two additional elements used To implement load/stores Datapath op rs rt offset 6 bits 5 bits 5 bits 16 bits I-Format 12
  • 13. Animating the Datapath lw rt, offset(rs) R[rt] <- MEM[R[rs] + s_extend(offset)]; 13
  • 14. Animating the Datapath sw rt, offset(rs) MEM[R[rs] + sign_extend(offset)] <- R[rt] 14
  • 15. Datapath: Branch Instruction 16 32 Sign extend Zero ALU Sum Shift left 2 To branch control logic Branch target PC + 4 from instruction datapath Instruction Add Registers Write register Read data 1 Read data 2 Read register 1 Read register 2 Write data RegWrite ALU operation 3 Datapath No shift hardware required: simply connect wires from input to output, each shifted left 2 bits 15
  • 16. Animating the Datapath beq rs, rt, offset if (R[rs] == R[rt]) then PC <- PC+4 + s_extend(offset<<2) 16
  • 17. MIPS Datapath PC Instruction memory Read address Instruction 16 32 Add ALU result M u x Registers Write register Write data Read data 1 Read data 2 Read register 1 Read register 2 Shift left 2 4 M u x ALU operation 3 RegWrite MemRead MemWrite PCSrc ALUSrc MemtoReg ALU result Zero ALU Data memory Address Write data Read data M u x Sign extend Add Adding branch capability and another multiplexor Instruction address is either PC+4 or branch target address Extra adder needed as both adders operate in each cycle New multiplexor 17
  • 18. Datapath Executing add add rd, rs, rt 5 5 16 RD1 RD2 RN1 RN2 WN WD RegWrite Register File Operation ALU 3 E X T N D 16 32 Zero RD WD MemRead Data Memory ADDR MemWrite 5 Instruction 32 M U X ALUSrc MemtoReg ADD <<2 RD Instruction Memory ADDR PC 4 ADD ADD M U X M U X PCSrc 18
  • 19. Datapath Executing lw lw rt,offset(rs) 5 5 16 RD1 RD2 RN1 RN2 WN WD RegWrite Register File Operation ALU 3 E X T N D 16 32 Zero RD WD MemRead Data Memory ADDR MemWrite 5 Instruction 32 M U X ALUSrc MemtoReg ADD <<2 RD Instruction Memory ADDR PC 4 ADD ADD M U X M U X PCSrc 19
  • 20. Datapath Executing sw sw rt,offset(rs) 5 5 16 RD1 RD2 RN1 RN2 WN WD RegWrite Register File Operation ALU 3 E X T N D 16 32 Zero RD WD MemRead Data Memory ADDR MemWrite 5 Instruction 32 M U X ALUSrc MemtoReg ADD <<2 RD Instruction Memory ADDR PC 4 ADD ADD M U X M U X PCSrc 20
  • 21. Datapath Executing beq beq r1,r2,offset 5 5 16 RD1 RD2 RN1 RN2 WN WD RegWrite Register File Operation ALU 3 E X T N D 16 32 Zero RD WD MemRead Data Memory ADDR MemWrite 5 Instruction 32 M U X ALUSrc MemtoReg ADD <<2 RD Instruction Memory ADDR PC 4 ADD ADD M U X M U X PCSrc 21
  • 22. Control Implementation Scheme • Control unit takes input from o the instruction opcode bits • Control unit generates o ALU control input o write enable (possibly, read enable also) signals for each storage element o selector controls for each multiplexor 22
  • 23. ALU Control • Plan to control ALU: main control sends a 2-bit ALUOp control field to the ALU control. Based on ALUOp and funct field of instruction the ALU control generates the 4-bit ALU control field o ALU control Func- field tion 0000 and 0001 or 0010 add 0110 sub 0111 slt • ALU must perform o add for load/stores (ALUOp 00) o sub for branches (ALUOp 01) o one of and, or, add, sub, slt for R-type instructions, depending on the instruction’s 6-bit funct field (ALUOp 10) Main Control ALU Control 2 ALUOp 6 Instruction funct field 4 ALU control input To ALU 23
  • 24. Setting ALU Control Bits Instruction AluOp Instruction Funct Field Desired ALU control opcode operation ALU action input LW 00 load word xxxxxx add 0010 SW 00 store word xxxxxx add 0010 Branch eq 01 branch eq xxxxxx subtract 0110 R-type 10 add 100000 add 0010 R-type 10 subtract 100010 subtract 0110 R-type 10 AND 100100 and 0000 R-type 10 OR 100101 or 0001 R-type 10 set on less 101010 set on less 0111 Truth table for ALU control bits ALUOp Funct field Operation ALUOp1 ALUOp0 F5 F4 F3 F2 F1 F0 0 0 X X X X X X 0010 X 1 X X X X X X 0110 1 X X X 0 0 0 0 0010 1 X X X 0 0 1 0 0110 1 X X X 0 1 0 0 0000 1 X X X 0 1 0 1 0001 1 X X X 1 0 1 0 0111 24
  • 25. Designing the Main Control • Observations about MIPS instruction format o opcode is always in bits 31-26 o two registers to be read are always rs (bits 25-21) and rt (bits 20- 16) o base register for load/stores is always rs (bits 25-21) o 16-bit offset for branch equal and load/store is always bits 15-0 o destination register for loads is in bits 20-16 (rt) while for R-type instructions it is in bits 15-11 (rd) (will require multiplexor to select) 31-26 25-21 20-16 15-11 10-6 5-0 31-26 25-21 20-16 15-0 opcode opcode rs rs rt rt address rd shamt funct R-type Load/store or branch 25
  • 26. The Main Control Unit • Control signals derived from instruction 0 rs rt rd shamt funct 31:26 5:0 25:21 20:16 15:11 10:6 35 or 43 rs rt address 31:26 25:21 20:16 15:0 4 rs rt address 31:26 25:21 20:16 15:0 R-type Load/ Store Branch opcode always read read, except for load write for R-type and load sign-extend and add 26
  • 27. Datapath with Control I New multiplexor 27
  • 28. Control Signals Signal Name Effect when deasserted Effect when asserted RegDst The register destination number for the The register destination number for the Write register comes from the rt field (bits 20-16) Write register comes from the rd field (bits 15-11) RegWrite None The register on the Write register input is written with the value on the Write data input AlLUSrc The second ALU operand comes from the The second ALU operand is the sign-extended, second register file output (Read data 2) lower 16 bits of the instruction PCSrc The PC is replaced by the output of the adder The PC is replaced by the output of the adder that computes the value of PC + 4 that computes the branch target MemRead None Data memory contents designated by the address input are put on the first Read data output MemWrite None Data memory contents designated by the address input are replaced by the value of the Write data input MemtoReg The value fed to the register Write data input The value fed to the register Write data input comes from the ALU comes from the data memory Effects of the seven control signals 28
  • 29. Datapath with Control II PC Instruction memory Read address Instruction [31– 0] Instruction [20 16] Instruction [25 21] Add Instruction [5 0] MemtoReg ALUOp MemWrite RegWrite MemRead Branch RegDst ALUSrc Instruction [31 26] 4 16 32 Instruction [15 0] 0 0 M u x 0 1 Control Add ALU result M u x 0 1 Registers Write register Write data Read data 1 Read data 2 Read register 1 Read register 2 Sign extend M u x 1 ALU result Zero PCSrc Data memory Write data Read data M u x 1 Instruction [15 11] ALU control Shift left 2 ALU Address MIPS datapath with the control unit: input to control is the 6-bit instruction opcode field, output is seven 1-bit signals and the 2-bit ALUOp signal 29
  • 30. Datapath with Control II (cont.) Instruction RegDst ALUSrc Memto- Reg Reg Write Mem Read Mem Write Branch ALUOp1 ALUp0 R-format 1 0 0 1 0 0 0 1 0 lw 0 1 1 1 1 0 0 0 0 sw X 1 X 0 0 1 0 0 0 beq X 0 X 0 0 0 1 0 1 PC Instruction memory Read address Instruction [31– 0] Instruction [20 16] Instruction [25 21] Add Instruction [5 0] MemtoReg ALUOp MemWrite RegWrite MemRead Branch RegDst ALUSrc Instruction [31 26] 4 16 32 Instruction [15 0] 0 0 M u x 0 1 Control Add ALU result M u x 0 1 Registers Write register Write data Read data 1 Read data 2 Read register 1 Read register 2 Sign extend M u x 1 ALU result Zero PCSrc Data memory Write data Read data M u x 1 Instruction [15 11] ALU control Shift left 2 ALU Address Determining control signals for the MIPS datapath based on instruction opcode PCSrc cannot be set directly from the opcode: zero test outcome is required 30
  • 31. Control Signals: R-Type Instruction Control signals shown in blue 1 0 0 0 1 ??? Value depends on funct 0 0 5 5 16 RD1 RD2 RN1 RN2 WN WD RegWrite Register File Operation ALU 3 E X T N D 16 32 Zero RD WD MemRead Data Memory ADDR MemWrite 5 Instruction I 32 M U X ALUSrc MemtoReg ADD <<2 RD Instruction Memory ADDR PC 4 ADD ADD M U X M U X PCSrc MUX RegDst 5 rd I[15:11] rt I[20:16] rs I[25:21] immediate/ offset I[15:0] 0 1 0 1 1 0 1 0 4 31
  • 32. 5 5 16 RD1 RD2 RN1 RN2 WN WD RegWrite Register File Operation ALU 3 E X T N D 16 32 Zero RD WD MemRead Data Memory ADDR MemWrite 5 Instruction I 32 M U X ALUSrc MemtoReg ADD <<2 RD Instruction Memory ADDR PC 4 ADD ADD M U X M U X PCSrc MUX RegDst 5 rd I[15:11] rt I[20:16] rs I[25:21] immediate/ offset I[15:0] 0 1 0 1 1 0 1 0 Control Signals: lw Instruction 0 Control signals shown in blue 0 0010 1 1 1 0 1 32
  • 33. 5 5 16 RD1 RD2 RN1 RN2 WN WD RegWrite Register File Operation ALU 3 E X T N D 16 32 Zero RD WD MemRead Data Memory ADDR MemWrite 5 Instruction I 32 M U X ALUSrc MemtoReg ADD <<2 RD Instruction Memory ADDR PC 4 ADD ADD M U X M U X PCSrc MUX RegDst 5 rd I[15:11] rt I[20:16] rs I[25:21] immediate/ offset I[15:0] 0 1 0 1 1 0 1 0 Control Signals: sw Instruction 0 Control signals shown in blue X 0010 1 X 0 1 0 33
  • 34. 5 5 16 RD1 RD2 RN1 RN2 WN WD RegWrite Register File Operation ALU 3 E X T N D 16 32 Zero RD WD MemRead Data Memory ADDR MemWrite 5 Instruction I 32 M U X ALUSrc MemtoReg ADD <<2 RD Instruction Memory ADDR PC 4 ADD ADD M U X M U X PCSrc MUX RegDst 5 rd I[15:11] rt I[20:16] rs I[25:21] immediate/ offset I[15:0] 0 1 0 1 1 0 1 0 Control Signals: beq Instruction Control signals shown in blue X 0110 0 X 0 0 0 1 if Zero=1 34
  • 35. Datapath with Control III Shift left 2 PC Instruction memory Read address Instruction [31– 0] Data memory Read data Write data Registers Write register Write data Read data 1 Read data 2 Read register 1 Read register 2 Instruction [15– 11] Instruction [20– 16] Instruction [25– 21] Add ALU result Zero Instruction [5– 0] MemtoReg ALUOp MemWrite RegWrite MemRead Branch Jump RegDst ALUSrc Instruction [31– 26] 4 M u x Instruction [25– 0] Jump address [31– 0] PC+4 [31– 28] Sign extend 16 32 Instruction [15– 0] 1 M u x 1 0 M u x 0 1 M u x 0 1 ALU control Control Add ALU result M u x 0 1 0 ALU Shift left 2 26 28 Address 31-26 25-0 opcode address Jump MIPS datapath extended to jumps: control unit generates new Jump control bit New multiplexor with additional control bit Jump Composing jump target address 35
  • 37. Datapath with Control III Shift left 2 PC Instruction memory Read address Instruction [31– 0] Data memory Read data Write data Registers Write register Write data Read data 1 Read data 2 Read register 1 Read register 2 Instruction [15– 11] Instruction [20– 16] Instruction [25– 21] Add ALU result Zero Instruction [5– 0] MemtoReg ALUOp MemWrite RegWrite MemRead Branch Jump RegDst ALUSrc Instruction [31– 26] 4 M u x Instruction [25– 0] Jump address [31– 0] PC+4 [31– 28] Sign extend 16 32 Instruction [15– 0] 1 M u x 1 0 M u x 0 1 M u x 0 1 ALU control Control Add ALU result M u x 0 1 0 ALU Shift left 2 26 28 Address 31-26 25-0 opcode address Jump MIPS datapath extended to jumps: control unit generates new Jump control bit New multiplexor with additional control bit Jump Composing jump target address 37
  • 38. Enhancing Performance with Pipelining  An implementation technique in which multiple instructions are overlapped in execution. 38
  • 39. Pipelining • Start work ASAP!! Do not waste time! Time 7 6 PM 8 9 10 11 12 1 2 AM A B C D Time 7 6 PM 8 9 10 11 12 1 2 AM A B C D Task order Task order Assume 30 min. each task – wash, dry, fold, store – and that separate tasks use separate hardware and so can be overlapped Pipelined Not pipelined 39
  • 40. Breaking instructions into steps • We break instructions into the following potential execution steps – not all instructions require all the steps – each step takes one clock cycle 1. Instruction fetch and PC increment (IF) 2. Instruction decode and register fetch (ID) 3. Execution, memory address computation, or branch completion (EX) 4. Memory access or R-type instruction completion (MEM) 5. Memory read completion (WB) • Each MIPS instruction takes from 3 – 5 cycles (steps) 40
  • 41. Step 1: Instruction Fetch & PC Increment (IF) • Use PC to get instruction and put it in the instruction register. Increment the PC by 4 and put the result back in the PC. • Can be described succinctly using RTL (Register-Transfer Language): IR = Memory[PC]; PC = PC + 4; 41
  • 42. Step 2: Instruction Decode and Register Fetch (ID) • Read registers rs and rt in case we need them. Compute the branch address in case the instruction is a branch. • RTL: A = Reg[IR[25-21]]; B = Reg[IR[20-16]]; ALUOut = PC + (sign-extend(IR[15-0]) << 2); 42
  • 43. Step 3: Execution, Address Computation or Branch Completion (EX) • ALU performs one of four functions depending on instruction type o memory reference: ALUOut = A + sign-extend(IR[15-0]); o R-type: ALUOut = A op B; o branch (instruction completes): if (A==B) PC = ALUOut; o jump (instruction completes): PC = PC[31-28] || (IR(25-0) << 2) 43
  • 44. Step 4: Memory access or R-type Instruction Completion (MEM) • Again depending on instruction type: • Loads and stores access memory o load MDR = Memory[ALUOut]; o store (instruction completes) Memory[ALUOut] = B; • R-type (instructions completes) Reg[IR[15-11]] = ALUOut; 44
  • 45. Step 5: Memory Read Completion (WB) • Again depending on instruction type: • Load writes back (instruction completes) Reg[IR[20-16]]= MDR; Important: There is no reason from a datapath (or control) point of view that Step 5 cannot be eliminated by performing Reg[IR[20-16]]= Memory[ALUOut]; for loads in Step 4. This would eliminate the MDR as well. The reason this is not done is that, to keep steps balanced in length, the design restriction is to allow each step to contain at most one ALU operation, or one register access, or one memory access. 45
  • 46. Summary of Instruction Execution Step name Action for R-type instructions Action for memory-reference instructions Action for branches Action for jumps Instruction fetch IR = Memory[PC] PC = PC + 4 Instruction A = Reg [IR[25-21]] decode/register fetch B = Reg [IR[20-16]] ALUOut = PC + (sign-extend (IR[15-0]) << 2) Execution, address ALUOut = A op B ALUOut = A + sign-extend if (A ==B) then PC = PC [31-28] II computation, branch/ (IR[15-0]) PC = ALUOut (IR[25-0]<<2) jump completion Memory access or R-type Reg [IR[15-11]] = Load: MDR = Memory[ALUOut] completion ALUOut or Store: Memory [ALUOut] = B Memory read completion Load: Reg[IR[20-16]] = MDR 1: IF 2: ID 3: EX 4: MEM 5: WB Step 46
  • 47. Pipelined vs. Single-Cycle Instruction Execution: the Plan Instruction fetch Reg ALU Data access Reg 8 ns Instruction fetch Reg ALU Data access Reg 8 ns Instruction fetch 8 ns Time lw $1, 100($0) lw $2, 200($0) lw $3, 300($0) 2 4 6 8 10 12 14 16 18 2 4 6 8 10 12 14 ... Program execution order (in instructions) Instruction fetch Reg ALU Data access Reg Time lw $1, 100($0) lw $2, 200($0) lw $3, 300($0) 2 ns Instruction fetch Reg ALU Data access Reg 2 ns Instruction fetch Reg ALU Data access Reg 2 ns 2 ns 2 ns 2 ns 2 ns Program execution order (in instructions) Single-cycle Pipelined Assume 2 ns for memory access, ALU operation; 1 ns for register access: therefore, single cycle clock 8 ns; pipelined clock cycle 2 ns. 47
  • 48. Pipelining: Keep in Mind • Pipelining does not reduce latency of a single task, it increases throughput of entire workload • Pipeline rate limited by longest stage o potential speedup = number pipe stages o unbalanced lengths of pipe stages reduces speedup • Time to fill pipeline and time to drain it – when there is slack in the pipeline – reduces speedup 48
  • 49. Pipelining MIPS • What makes it easy with MIPS? o all instructions are same length • so fetch and decode stages are similar for all instructions o just a few instruction formats • simplifies instruction decode and makes it possible in one stage o memory operands appear only in load/stores • so memory access can be deferred to exactly one later stage o operands are aligned in memory • one data transfer instruction requires one memory access stage 49
  • 50. Pipelining MIPS • What makes it hard? o structural hazards: different instructions, at different stages, in the pipeline want to use the same hardware resource o control hazards: succeeding instruction, to put into pipeline, depends on the outcome of a previous branch instruction, already in pipeline o data hazards: an instruction in the pipeline requires data to be computed by a previous instruction still in the pipeline • Before actually building the pipelined datapath and control we first briefly examine these potential hazards individually… 50
  • 51. Structural Hazards • Structural hazard: inadequate hardware to simultaneously support all instructions in the pipeline in the same clock cycle • E.g., suppose single – not separate – instruction and data memory in pipeline below with one read port o then a structural hazard between first and fourth lw instructions • MIPS was designed to be pipelined: structural hazards are easy to avoid! 2 4 6 8 10 12 14 Instruction fetch Reg ALU Data access Reg Time lw $1, 100($0) lw $2, 200($0) lw $3, 300($0) 2 ns Instruction fetch Reg ALU Data access Reg 2 ns Instruction fetch Reg ALU Data access Reg 2 ns 2 ns 2 ns 2 ns 2 ns Program execution order (in instructions) Pipelined Instruction fetch Reg ALU Data access Reg 2 ns lw $4, 400($0) Hazard if single memory 51
  • 52. Control Hazards • Control hazard: need to make a decision based on the result of a previous instruction still executing in pipeline • Solution 1 Stall the pipeline Instruction fetch Reg ALU Data access Reg Time beq $1, $2, 40 add $4, $5, $6 lw $3, 300($0) 4 ns Instruction fetch Reg ALU Data access Reg 2ns Instruction fetch Reg ALU Data access Reg 2ns 2 4 6 8 10 12 14 16 Program execution order (in instructions) Pipeline stall bubble Note that branch outcome is computed in ID stage with added hardware (later…) 52
  • 53. Control Hazards • Solution 2 Predict branch outcome o e.g., predict branch-not-taken : Instruction fetch Reg ALU Data access Reg Time beq $1, $2, 40 add $4, $5, $6 lw $3, 300($0) Instruction fetch Reg ALU Data access Reg 2 ns Instruction fetch Reg ALU Data access Reg 2 ns Program execution order (in instructions) Instruction fetch Reg ALU Data access Reg Time beq $1, $2, 40 add $4, $5 ,$6 or $7, $8, $9 Instruction fetch Reg ALU Data access Reg 2 4 6 8 10 12 14 2 4 6 8 10 12 14 Instruction fetch Reg ALU Data access Reg 2 ns 4 ns bubble bubble bubble bubble bubble Program execution order (in instructions) Prediction success Prediction failure: undo (=flush) lw 53
  • 54. Control Hazards • Solution 3 Delayed branch: always execute the sequentially next statement with the branch executing after one instruction delay – compiler’s job to find a statement that can be put in the slot that is independent of branch outcome o MIPS does this – but it is an option in SPIM (Simulator -> Settings) Instruction fetch Reg ALU Data access Reg Time beq $1, $2, 40 add $4, $5, $6 lw $3, 300($0) Instruction fetch Reg ALU Data access Reg 2 ns Instruction fetch Reg ALU Data access Reg 2 ns 2 4 6 8 10 12 14 2 ns (d elayed branch slot) Program execution order (in instructions) Delayed branch beq is followed by add that is independent of branch outcome 54
  • 55. Data Hazards • Data hazard: instruction needs data from the result of a previous instruction still executing in pipeline • Solution Forward data if possible… Time 2 4 6 8 10 add $s0, $t0, $t1 IF ID WB EX MEM add $s0, $t0, $t1 sub $t2, $s0, $t3 Program execution order (in instructions) IF ID WB EX IF ID MEM EX Time 2 4 6 8 10 MEM WB MEM Instruction pipeline diagram: shade indicates use – left=write, right=read Without forwarding – blue line – data has to go back in time; with forwarding – red line – data is available in time 55
  • 56. Data Hazards • Forwarding may not be enough o e.g., if an R-type instruction following a load uses the result of the load – called load-use data hazard Time 2 4 6 8 10 12 14 lw $s0, 20($t1) sub $t2, $s0, $t3 Program execution order (in instructions) IF ID WB MEM EX IF ID WB MEM EX Time 2 4 6 8 10 12 14 lw $s0, 20($t1) sub $t2, $s0, $t3 Program execution order (in instructions) IF ID WB MEM EX IF ID WB MEM EX bubble bubble bubble bubble bubble With a one-stage stall, forwarding can get the data to the sub instruction in time Without a stall it is impossible to provide input to the sub instruction in time 56
  • 57. Reordering Code to Avoid Pipeline Stall (Software Solution) • Example: lw $t0, 0($t1) lw $t2, 4($t1) sw $t2, 0($t1) sw $t0, 4($t1) • Reordered code: lw $t0, 0($t1) lw $t2, 4($t1) sw $t0, 4($t1) sw $t2, 0($t1) Data hazard Interchanged 57
  • 58. Pipelined Datapath • We now move to actually building a pipelined datapath • First recall the 5 steps in instruction execution 1. Instruction Fetch & PC Increment (IF) 2. Instruction Decode and Register Read (ID) 3. Execution or calculate address (EX) 4. Memory access (MEM) 5. Write result into register (WB) • Review: single-cycle processor o all 5 steps done in a single clock cycle o dedicated hardware required for each step • What happens if we break the execution into multiple cycles, but keep the extra hardware? 58
  • 59. Review - Single-Cycle Datapath “Steps” 5 5 16 RD1 RD2 RN1 RN2 WN WD Register File ALU E X T N D 16 32 RD WD Data Memory ADDR 5 Instruction I 32 M U X <<2 RD Instruction Memory ADDR PC 4 ADD ADD M U X 32 IF Instruction Fetch ID Instruction Decode EX Execute/ Address Calc. MEM Memory Access WB Write Back Zero 59
  • 60. Pipelined Datapath – Key Idea • What happens if we break the execution into multiple cycles, but keep the extra hardware? o Answer: We may be able to start executing a new instruction at each clock cycle - pipelining • …but we shall need extra registers to hold data between cycles – pipeline registers 60
  • 61. Pipelined Datapath IF/ID Pipeline registers 5 5 16 RD1 RD2 RN1 RN2 WN WD Register File ALU E X T N D 16 32 RD WD Data Memory ADDR 5 Instruction I 32 M U X <<2 RD Instruction Memory ADDR PC 4 ADD ADD M U X 32 ID/EX EX/MEM MEM/WB Zero 64 bits 97 bits 64 bits 128 bits wide enough to hold data coming in 61
  • 62. Pipelined Datapath IF/ID Pipeline registers 5 5 16 RD1 RD2 RN1 RN2 WN WD Register File ALU E X T N D 16 32 RD WD Data Memory ADDR 5 Instruction I 32 M U X <<2 RD Instruction Memory ADDR PC 4 ADD ADD M U X 32 ID/EX EX/MEM MEM/WB Zero 64 bits 97 bits 64 bits 128 bits wide enough to hold data coming in Only data flowing right to left may cause hazard…, why? 62
  • 63. Bug in the Datapath 5 5 16 RD1 RD2 RN1 RN2 WN WD Register File ALU E X T N D 16 32 RD WD Data Memory ADDR 5 Instruction I 32 M U X <<2 RD Instruction Memory ADDR PC 4 ADD ADD M U X 32 Write register number comes from another later instruction! IF/ID ID/EX EX/MEM MEM/WB 63
  • 64. Corrected Datapath 5 RD1 RD2 RN1 RN2 WN WD Register File ALU E X T N D 16 32 RD WD Data Memory ADDR 32 M U X <<2 RD Instruction Memory ADDR PC 4 ADD ADD M U X 5 5 5 EX/MEM MEM/WB Zero ID/EX IF/ID 64 bits 133 bits 102 bits 69 bits Destination register number is also passed through ID/EX, EX/MEM and MEM/WB registers, which are now wider by 5 bits 64
  • 65. Pipelined Example • Consider the following instruction sequence: lw $t0, 10($t1) sw $t3, 20($t4) add $t5, $t6, $t7 sub $t8, $t9, $t10 65
  • 74. Multiple-Clock-Cycle Diagram IM REG ALU DM REG lw $t0, 10($t1) sw $t3, 20($t4) add $t5, $t6, $t7 CC 1 CC 2 CC 3 CC 4 CC 5 CC 6 CC 7 IM REG ALU DM REG IM REG ALU DM REG sub $t8, $t9, $t10 IM REG ALU DM REG CC 8 Time axis 74
  • 75. Recall Single-Cycle Control – the Datapath PC Instruction memory Read address Instruction [31– 0] Instruction [20 16] Instruction [25 21] Add Instruction [5 0] MemtoReg ALUOp MemWrite RegWrite MemRead Branch RegDst ALUSrc Instruction [31 26] 4 16 32 Instruction [15 0] 0 0 M u x 0 1 Control Add ALU result M u x 0 1 Registers Write register Write data Read data 1 Read data 2 Read register 1 Read register 2 Sign extend M u x 1 ALU result Zero PCSrc Data memory Write data Read data M u x 1 Instruction [15 11] ALU control Shift left 2 ALU Address 75
  • 76. Recall Single-Cycle – ALU Control Instruction AluOp Instruction Funct Field Desired ALU control opcode operation ALU action input LW 00 load word xxxxxx add 010 SW 00 store word xxxxxx add 010 Branch eq 01 branch eq xxxxxx subtract 110 R-type 10 add 100000 add 010 R-type 10 subtract 100010 subtract 110 R-type 10 AND 100100 and 000 R-type 10 OR 100101 or 001 R-type 10 set on less 101010 set on less 111 Truth table for ALU control bits ALUOp Funct field Operation ALUOp1 ALUOp0 F5 F4 F3 F2 F1 F0 0 0 X X X X X X 010 0 1 X X X X X X 110 1 X X X 0 0 0 0 010 1 X X X 0 0 1 0 110 1 X X X 0 1 0 0 000 1 X X X 0 1 0 1 001 1 X X X 1 0 1 0 111 76
  • 77. Signals Signal Name Effect when deasserted Effect when asserted RegDst The register destination number for the The register destination number for the Write register comes from the rt field (bits 20-16) Write register comes from the rd field (bits 15-11) RegWrite None The register on the Write register input is written with the value on the Write data input AlLUSrc The second ALU operand comes from the The second ALU operand is the sign-extended, second register file output (Read data 2) lower 16 bits of the instruction PCSrc The PC is replaced by the output of the adder The PC is replaced by the output of the adder that computes the value of PC + 4 that computes the branch target MemRead None Data memory contents designated by the address input are put on the first Read data output MemWrite None Data memory contents designated by the address input are replaced by the value of the Write data input MemtoReg The value fed to the register Write data input The value fed to the register Write data input comes from the ALU comes from the data memory Instruction RegDst ALUSrc Memto- Reg Reg Write Mem Read Mem Write Branch ALUOp1 ALUp0 R-format 1 0 0 1 0 0 0 1 0 lw 0 1 1 1 1 0 0 0 0 sw X 1 X 0 0 1 0 0 0 beq X 0 X 0 0 0 1 0 1 Effect of control bits Deter- mining control bits 77
  • 78. Pipeline Control • Initial design – motivated by single-cycle datapath control – use the same control signals • Observe: o No separate write signal for the PC as it is written every cycle o No separate write signals for the pipeline registers as they are written every cycle o No separate read signal for instruction memory as it is read every clock cycle o No separate read signal for register file as it is read every clock cycle • Need to set control signals during each pipeline stage • Since control signals are associated with components active during a single pipeline stage, can group control lines into five groups according to pipeline stage Will be modified by hazard detection unit!! 78
  • 79. Pipelined Datapath with Control I PC Instruction memory Address Instruction Instruction [20– 16] MemtoReg ALUOp Branch RegDst ALUSrc 4 16 32 Instruction [15– 0] 0 0 Registers Write register Write data Read data 1 Read data 2 Read register 1 Read register 2 Sign extend M u x 1 Write data Read data M u x 1 ALU control RegWrite MemRead Instruction [15– 11] 6 IF/ID ID/EX EX/MEM MEM/WB MemWrite Address Data memory PCSrc Zero Add Add result Shift left 2 ALU result ALU Zero Add 0 1 M u x 0 1 M u x Same control signals as the single-cycle datapath 79
  • 80. Pipeline Control Signals • There are five stages in the pipeline o instruction fetch / PC increment o instruction decode / register fetch o execution / address calculation o memory access o write back Execution/Address Calculation stage control lines Memory access stage control lines Write-back stage control lines Instruction Reg Dst ALU Op1 ALU Op0 ALU Src Branch Mem Read Mem Write Reg write Mem to Reg R-format 1 1 0 0 0 0 0 1 0 lw 0 0 0 1 0 1 0 1 1 sw X 0 0 1 0 0 1 0 X beq X 0 1 0 1 0 0 0 X Nothing to control as instruction memory read and PC write are always enabled 80
  • 81. Pipeline Control Implementation • Pass control signals along just like the data – extend each pipeline register to hold needed control bits for succeeding stages • Note: The 6-bit funct field of the instruction required in the EX stage to generate ALU control can be retrieved as the 6 least significant bits of the immediate field which is sign-extended and passed from the IF/ID register to the ID/EX register Control EX M WB M WB WB IF/ID ID/EX EX/MEM MEM/WB Instruction 81
  • 82. Pipelined Datapath with Control II PC Instruction memory Instruction Add Instruction [20– 16] MemtoReg ALUOp Branch RegDst ALUSrc 4 16 32 Instruction [15– 0] 0 0 M u x 0 1 Add Add result Registers Write register Write data Read data 1 Read data 2 Read register 1 Read register 2 Sign extend M u x 1 ALU result Zero Write data Read data M u x 1 ALU control Shift left 2 RegWrite MemRead Control ALU Instruction [15– 11] 6 EX M WB M WB WB IF/ID PCSrc ID/EX EX/MEM MEM/WB M u x 0 1 MemWrite Address Data memory Address Control signals emanate from the control portions of the pipeline registers 82