You are on page 1of 47

Chapter 7.

Basic
Processing Unit
Overview
Instruction Set Processor (ISP)
Central Processing Unit (CPU)

A typical computing task consists of a series


of steps specified by a sequence of machine
instructions that constitute a program.
An instruction is executed by carrying out a
sequence of more rudimentary operations.
Some Fundamental
Concepts
Fundamental Concepts
Processor fetches one instruction at a time and
perform the operation specified.
Instructions are fetched from successive memory
locations until a branch or a jump instruction is
encountered.
Processor keeps track of the address of the memory
location containing the next instruction to be fetched
using Program Counter (PC).
Instruction Register (IR)
Executing an Instruction
Fetch the contents of the memory location pointed
to by the PC. The contents of this location are
loaded into the IR (fetch phase).
IR [[PC]]
Assuming that the memory is byte addressable,
increment the contents of the PC by 4 (fetch phase).
PC [PC] + 4
Carry out the actions specified by the instruction in
the IR (execution phase).
Processor Organization Internalprocessor
bus

Controlsignals

PC

Instruction
Address
decoderand
lines
MDR HAS MAR controllogic
TWO INPUTS Memory
AND TWO bus
OUTPUTS MDR
Data
lines IR

Datapath
Y

Constant4 R0

Select MUX

Add
A B
ALU Sub R n 1
control ALU
lines
Carryin
XOR TEMP

Textbook Page 413

Figure7.1.Singlebusorganizationofthedatapathinsideaprocessor.
Executing an Instruction
Transfer a word of data from one processor
register to another or to the ALU.
Perform an arithmetic or a logic operation
and store the result in a processor register.
Fetch the contents of a given memory
location and load them into a processor
register.
Store a word of data from a processor
register into a given memory location.
Register Transfers Riin
Internalprocessor
bus

Ri

Riout

Yin

Constant4

Select MUX

A B
ALU

Zin

Zout

Figure7.2.InputandoutputgatingfortheregistersinFigure7.1.
Register Transfers
All operations and data transfers are controlled by the processor clock.
Bus

D Q
1
Q
Riout

Ri in
Clock

Figure7.3.
Figure 7.3.Inputandoutputgatingforoneregisterbit.
Input and output gating for one register bit.
Performing an Arithmetic or
Logic Operation
The ALU is a combinational circuit that has no
internal storage.
ALU gets the two operands from MUX and bus.
The result is temporarily stored in register Z.
What is the sequence of operations to add the
contents of register R1 to those of R2 and store the
result in R3?
1. R1out, Yin
2. R2out, SelectY, Add, Zin
3. Zout, R3in
Fetching a Word from Memory
Address into MAR; issue Read operation; data into MDR.
Memorybus Internalprocessor
datalines MDRoutE MDRout bus

MDR

MDRinE MDRin

Figure7.4.
Figure 7.4. ConnectionandcontrolsignalsforregisterMDR.
Connection and control signals for register MDR.
Fetching a Word from Memory
The response time of each memory access varies
(cache miss, memory-mapped I/O,).
To accommodate this, the processor waits until it
receives an indication that the requested operation
has been completed (Memory-Function-Completed,
MFC).
Move (R1), R2
MAR [R1]
Start a Read operation on the memory bus
Wait for the MFC response from the memory
Load MDR from the memory bus
R2 [MDR]
Step 1 2 3

Timing Clock

MARin MAR [R1]


Assume MAR
is always available Address
on the address lines
of the memory bus. Start a Read operation on the memory bus
Read

MR

MDRinE

Data

Wait for the MFC response from the memory


MFC

MDRout Load MDR from the memory bus


R2 [MDR]

Figure7.5. TimingofamemoryReadoperation.
Execution of a Complete
Instruction
Add (R3), R1
Fetch the instruction

Fetch the first operand (the contents of the


memory location pointed to by R3)
Perform the addition

Load the result into R1


Architecture Riin
Internalprocessor
bus

Ri

Riout

Yin

Constant4

Select MUX

A B
ALU

Zin

Zout

Figure7.2.InputandoutputgatingfortheregistersinFigure7.1.
Execution of a Complete
Instruction Internalprocessor
bus

Add (R3), R1 Controlsignals

PC

Instruction
Step A ction Address
decoderand
lines
MAR controllogic

1 PCout , MAR in , Read, Select4,Add, Zin Memory


bus

2 Zout , PCin , Y in , WMF C MDR


Data
lines IR
3 MDRout , IR in
4 R3out , MAR in , Read Y

Constant4 R0
5 R1out , Y in , WMF C
6 MDRout , SelectY,Add, Zin Select MUX

7 Zout , R1in , End Add


A B
ALU Sub R n 1
control ALU
lines
Carryin
XOR TEMP
Figure 7.6. C ontrol sequence
forexecutionof theinstructionAdd (R3),R1.
Z

Figure7.1.Singlebusorganizationofthedatapathinsideaprocessor.
Execution of Branch
Instructions
A branch instruction replaces the contents of
PC with the branch target address, which is
usually obtained by adding an offset X given
in the branch instruction.
The offset X is usually the difference between
the branch target address and the address
immediately following the branch instruction.
Conditional branch
Execution of Branch
Instructions

StepAction

1 PCout, MAR in , Read,Select4,Add, Zin


2 Zout, PC in , Y in, WMF C
3 MDR out , IR in
4 Offset-field-of-IR
out, Add, Zin

5 Zout, PCin , End

Figure 7.7. Control sequence for an unconditional branch instruction


Multiple-Bus Organization
BusA BusB BusC

Incrementer

PC

Register
file

Constant4

MUX
A

ALU R

Instruction
decoder

IR

MDR

MAR

Memorybus Address
datalines lines

Figure7.8. Threeb usorganizationofthedatapath.


Multiple-Bus Organization
Add R4, R5, R6

StepAction

1 PCout, R=B, MAR in, Read, IncPC


2 WMFC
3 MDRoutB, R=B,IR in
4 R4outA, R5outB, SelectA,Add, R6in, End

Figure 7.9. Control sequence for the instruction. Add R4,R5,R


for the three-bus organization in Figure 7.8.
Quiz Internalprocessor
bus

Controlsignals

What is the control PC

Instruction

sequence for Address


lines
MAR
decoderand
controllogic

execution of the Memory


bus

instruction Data
lines
MDR
IR

Add R1, R2 Constant4


Y
R0

including the Select MUX

instruction fetch Add


Sub
A B
R n 1

phase? (Assume
ALU
control ALU
lines
Carryin

single bus XOR TEMP

architecture)
Z

Figure7.1.Singlebusorganizationofthedatapathinsideaprocessor.
Hardwired Control
Overview
To execute instructions, the processor must
have some means of generating the control
signals needed in the proper sequence.
Two categories: hardwired control and
microprogrammed control
Hardwired system can operate at high speed;
but with little flexibility.
Control Unit Organization
CLK Controlstep
Clock counter

External
inputs
Decoder/
IR
encoder
Condition
codes

Controlsignals

Figure7.10.Controlunitorganization.
Detailed Block Description
CLK
Clock Controlstep Reset
counter

Stepdecoder

T 1 T2 Tn

INS1
External
INS2 inputs
Instruction
IR Encoder
decoder
Condition
codes
INSm

Run End

Controlsignals

Figure7.11. Separationofthedecodingandencodingfunctions.
Generating Zin
Zin = T1 + T6 ADD + T4 BR +
Branch Add

T4 T6

T1

Figure7.12.GenerationoftheZincontrolsignalfortheprocessorinFigure7.1.

Generating End
End = T7 ADD + T5 BR + (T5 N + T4 N) BRN +
Branch<0
Add Branch
N N

T7 T5 T4 T5

End

Figure7.13. GenerationoftheEndcontrolsignal.
A Complete Processor
Instruction Integer Floatingpoint
unit unit unit

Instruction Data
cache cache

Businterface
Processor

Systembus

Main Input/
memory Output

Figure7.14. Blockdiagramofacompleteprocessor.
Microprogrammed
Control
Overview
Control signals are generated by a program similar to machine language
programs.
Control Word (CW); microroutine; microinstruction

MDRout

WMFC
MARin

Select
PCout

R1out

R3out
Micro

Read
PCin

Add

End
R1 in
Z out
IRin
Yin
instruction

Zin
1 0 1 1 1 0 0 0 1 1 1 0 0 0 0 0 0
2 1 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0
3 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0
4 0 0 1 1 0 0 0 0 0 0 0 0 0 1 0 0
5 0 0 0 0 0 0 1 0 0 0 0 1 0 0 1 0
6 0 0 0 0 1 0 0 0 1 1 0 0 0 0 0 0
7 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1

Figure7.15 AnexampleofmicroinstructionsforFigure7.6.
Overview

Step A ction

1 PCout , MAR in , Read, Select4,Add, Zin


2 Zout , PCin , Y in , WMF C
3 MDRout , IR in
4 R3out , MAR in , Read
5 R1out , Y in , WMF C
6 MDRout , SelectY,Add, Zin
7 Zout , R1in , End

Figure 7.6. C ontrol sequence


forexecutionof theinstructionAdd (R3),R1.
Overview
Control store
Starting
IR address
generator One function
cannot be carried
out by this simple
organization.

Clock P C

Control
store CW

Figure7.16. Basicorganizationofamicroprogrammedcontrolunit.
Overview
The previous organization cannot handle the situation when the control
unit is required to check the status of the condition codes or external
inputs to choose between alternative courses of action.
Use conditional branch microinstruction.
Address
Microinstruction

0 PCout , MAR in , Read,Select4,Add, Zin


1 Zout , PC in , Y in , WMFC
2 MDRout , IR in
3 Branchto startingaddress of appropriate microroutine
. ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... .. ... ... .. ... ..
25 If N=0, thenbranchto microinstruction
0
26 Offset-field-of-IR
out , SelectY,Add, Zin

27 Zout , PC in , End

Figure 7.17. Microroutine for the instruction Branch<0.


Overview External
inputs

Startingand
branchaddress Condition
IR codes
generator

Clock PC

Control
store CW

Figure7.18. Organizationofthecontrolunittoallow
conditionalbranchinginthemicroprogram.
Microinstructions
A straightforward way to structure
microinstructions is to assign one bit position
to each control signal.
However, this is very inefficient.
The length can be reduced: most signals are
not needed simultaneously, and many signals
are mutually exclusive.
All mutually exclusive signals are placed in
the same group in binary coding.
Partial Format for the
Microinstructions
Microinstruction

F1 F2 F3 F4 F5

F1(4bits) F2(3bits) F3(3bits) F4(4bits) F5(2bits)

0000:Notransfer 000:Notransfer 000:Notransfer 0000:Add 00:Noaction


0001:PCout 001:PCin 001:MARin 0001:Sub 01:Read
0010:MDRout 010:IRin 010:MDRin 10:Write
0011:Zout 011:Zin 011:TEMPin
0100:R0out 100:R0in 100:Yin 1111:XOR
0101:R1out 101:R1in
0110:R2out 110:R2in 16ALU
functions
0111:R3out 111:R3in
1010:TEMPout
1011:Offsetout

F6 F7 F8
What is the price paid for
this scheme?
F6(1bit) F7(1bit) F8(1bit)

0:SelectY 0:Noaction 0:Continue


1:Select4 1:WMFC 1:End

Figure7.19. Anexampleofapartialformatforfieldencodedmicroinstructions.
Further Improvement
Enumerate the patterns of required signals in
all possible microinstructions. Each
meaningful combination of active control
signals can then be assigned a distinct code.
Vertical organization

Horizontal organization
Microprogram Sequencing
If all microprograms require only straightforward
sequential execution of microinstructions except for
branches, letting a PC governs the sequencing
would be efficient.
However, two disadvantages:
Having a separate microroutine for each machine instruction results
in a large total number of microinstructions and a large control store.
Longer execution time because it takes more time to carry out the
required branches.
Example: Add src, Rdst
Four addressing modes: register, autoincrement,
autodecrement, and indexed (with indirect forms).
- Bit-ORing
- Wide-Branch Addressing
- WMFC
Mode

ContentsofIR OPcode 0 1 0 Rsrc Rdst

11 10 8 7 4 3 0

Address Microinstruction
(octal)

000 PCout,MARin,Read,Select 4 ,Add,Zin


001 Zout,PCin,Yin,WMFC
002 MDRout,IRin
003 Branch{ PC 101(fromInstructiondecoder);
PC5,4 [IR10,9]; PC3 [IR10] [IR9] [IR8]}
121 Rsrcout ,MARin ,Read,Select4,Add,Z in
122 Zout,Rsrcin
123 Branch{PC 170; PC0 [IR8]},WMFC
170 MDRout,MARin,Read,WMFC
171 MDRout,Yin
172 Rdstout ,SelectY,Add,Z in
173 Zout,Rdstin,End

Figure7.21. MicroinstructionforAdd(Rsrc)+,Rdst.
Note:Microinstructionatlocation170isnotexecutedforthisaddressingmode.
Microinstructions with Next-
Address Field
The microprogram we discussed requires several
branch microinstructions, which perform no useful
operation in the datapath.
A powerful alternative approach is to include an
address field as a part of every microinstruction to
indicate the location of the next microinstruction to
be fetched.
Pros: separate branch microinstructions are virtually
eliminated; few limitations in assigning addresses to
microinstructions.
Cons: additional bits for the address field (around
1/6)
Microinstructions with Next-
Address Field
IR

External Condition
Inputs codes

Decodingcircuits

AR

Controlstore

Nextaddress I R

Microinstructiondecoder

Controlsignals

Figure7.22.Microinstructionsequencingorganization.
Microinstruction

F0 F1 F2 F3

F0(8bits) F1(3bits) F2(3bits) F3(3bits)

Addressofnext 000:Notransfer 000:Notransfer 000:Notransfer


microinstruction 001:PCout 001:PCin 001:MARin
010:MDRout 010:IRin 010:MDRin
011:Zout 011:Zin 011:TEMPin
100:Rsrcout 100:Rsrcin 100:Yin
101:Rdstout 101:Rdstin
110:TEMPout

F4 F5 F6 F7

F4(4bits) F5(2bits) F6(1bit) F7(1bit)

0000:Add 00:Noaction 0:SelectY 0:Noaction


0001:Sub 01:Read 1:Select4 1:WMFC
10:Write
1111:XOR

F8 F9 F10

F8(1bit) F9(1bit) F10(1bit)

0:NextAdrs 0:Noaction 0:Noaction


1:InstDec 1:ORmode 1:ORindsrc

Figure7.23. FormatformicroinstructionsintheexampleofSection7.5.3.
Implementation of the
Microroutine
Octal
address F0 F1 F2 F3 F4 F5 F6 F7 F8 F9 F10

0 0 0 0 0 0 0 0 0 0 1 0 0 1 01 1 0 0 1 0 0 0 0 01 1 0 0 0 0
0 0 1 0 0 0 0 0 0 1 0 0 1 1 00 1 1 0 0 0 0 0 0 00 0 1 0 0 0
0 0 2 0 0 0 0 0 0 1 1 0 1 0 01 0 0 0 0 0 0 0 0 00 0 0 0 0 0
0 0 3 0 0 0 0 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 00 0 0 1 1 0

121 0 1 0 1 0 0 1 0 1 0 0 01 1 0 0 1 0 0 0 0 01 1 0 0 0 0
122 0 1 1 1 1 0 0 0 0 1 1 10 0 0 0 0 0 0 0 0 00 0 1 0 0 1

1 7 0 0 1 1 1 1 0 0 1 0 1 0 00 0 0 0 1 0 0 0 0 01 0 1 0 0 0
1 7 1 0 1 1 1 1 0 1 0 0 1 0 00 0 1 0 0 0 0 0 0 00 0 0 0 0 0
1 7 2 0 1 1 1 1 0 1 1 1 0 1 01 1 0 0 0 0 0 0 0 00 0 0 0 0 0
1 7 3 0 0 0 0 0 0 0 0 0 1 1 10 1 0 0 0 0 0 0 0 00 0 0 0 0 0

Figure7.24.ImplementationofthemicroroutineofFigure7.21usinga
nextmicroinstructionaddressfield. (SeeFigure7.23forencodedsignals.)
R15in R15out R0 in R0out

Decoder

Decoder

IR Rsrc Rdst

InstDecout
External
inputs ORmode
Decoding
circuits
Condition ORindsrc
codes

AR

Controlstore

Nextaddress F1 F2 F8 F9 F10

Rdstout

Rdstin
Microinstruction
decoder
Rsrcout

Rsrcin

Othercontrolsignals

Figure7.25. Somedetailsofthecontrolsignalgeneratingcircuitry.
bit-ORing
Further Discussions
Prefetching

Emulation