
Course Code: CAP-211 (COA)

Homework – 3

Date: 21/10/10

Submitted to: Anjali Ma'am
Submitted by: Manish Kr. Singh, 10907108, D3912 A17
Part – A

Q1: Give a comparative study of RISC and CISC architectures.

Ans:

RISC -> Reduced Instruction Set Computer. A RISC system has a reduced number of instructions and, more importantly, a load-store architecture in which pipelining can be implemented easily, e.g. the Atmel AVR.

CISC -> Complex Instruction Set Computer. A CISC system has complex instructions, such as direct addition between data in two memory locations, e.g. the 8085.

Features of RISC:-

• Uniform instruction format: a single word with the opcode in the same bit positions in every instruction, demanding less decoding.
• Identical general-purpose registers, allowing any register to be used in any context and simplifying compiler design (although there are normally separate floating-point registers).
• Simple addressing modes; complex addressing is performed via sequences of arithmetic and/or load-store operations.
• Few data types in hardware: some CISCs have byte-string instructions or support complex numbers, which is so far unlikely to be found on a RISC.

Comparison of RISC and CISC:-

• RISC chips require fewer transistors and are cheaper to produce, and it is easier to write powerful optimizing compilers for them. CISC chips are relatively slow per instruction (compared to RISC chips), but a program needs fewer instructions.
• RISC puts a greater burden on the software: developers need to write more instructions for the same task. In CISC, developers do not need to write as many.
• RISC is mainly used for real-time applications; CISC is mainly used in normal PCs, workstations and servers.
• RISC processors have a large number of registers, most of which can be used as general-purpose registers; CISC processors cannot have as many registers.
• A RISC processor has hardwired instructions; a CISC processor executes microcoded instructions.
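The load-store distinction above can be sketched in a toy way. This is plain Python with made-up "instruction" names, not real ISA code: a CISC-style memory-to-memory add versus the equivalent RISC-style load/add/store sequence.

```python
# Toy illustration (not real ISA code): the same addition expressed in a
# CISC style (one memory-to-memory instruction) and a RISC load-store
# style (explicit loads and a store). Names are invented for this sketch.

memory = {"A": 5, "B": 7}

def cisc_add(dst, src):
    # One complex instruction: memory + memory -> memory
    memory[dst] = memory[dst] + memory[src]

def risc_add(dst, src):
    # Load-store style: only LOAD and STORE touch memory
    r1 = memory[dst]        # LOAD  r1, dst
    r2 = memory[src]        # LOAD  r2, src
    r3 = r1 + r2            # ADD   r3, r1, r2
    memory[dst] = r3        # STORE r3, dst

cisc_add("A", "B")          # A = 12, in one complex instruction
memory["A"] = 5             # reset
risc_add("A", "B")          # A = 12 again, via four simple steps
```

Both routines compute the same result; the RISC version simply exposes the memory traffic as separate, easily pipelined steps.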

Q2: Taking a suitable example, illustrate how compiler-based optimization is performed in RISC systems.

Ans:

Compiler optimization is the process of tuning the output of a compiler to minimize or maximize some attribute of an executable computer program. The most common requirement is to minimize the time taken to execute a program; a less common one is to minimize the amount of memory occupied. The growth of portable computers has created a market for minimizing the power consumed by a program. Compiler optimization is generally implemented using a sequence of optimizing transformations: algorithms which take a program and transform it to produce an output program that uses fewer resources.

It has been shown that some code optimization problems are NP-complete, or even undecidable. In practice, factors such as the programmer's willingness to wait for the compiler to complete its task place upper limits on the optimizations that a compiler implementer might provide. (Optimization is generally a very CPU- and memory-intensive process.) In the past, computer memory limitations were also a major factor in limiting which optimizations could be performed. Because of all these factors, optimization rarely produces "optimal" output in any sense, and in fact an "optimization" may impede performance in some cases; rather, optimizations are heuristic methods for improving resource usage in typical programs.

A compiler typically deals with only a part of a program at a time, often the code contained within a single file or module; the result is that it is unable to consider contextual information that can only be obtained by processing the other files.
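As a suitable example, constant folding is one classic optimizing transformation. The sketch below applies it to a tiny invented expression tree in Python; a real RISC compiler performs this (and much more) on its intermediate representation.

```python
# A minimal sketch of constant folding: constant sub-expressions are
# evaluated at "compile time" so fewer instructions run at execution time.
# The tuple-based expression format is invented for this illustration.

def fold(expr):
    """Recursively replace constant sub-expressions with their value."""
    if isinstance(expr, tuple):                # ("op", left, right)
        op, left, right = expr
        left, right = fold(left), fold(right)
        if isinstance(left, int) and isinstance(right, int):
            return {"+": left + right, "*": left * right}[op]
        return (op, left, right)
    return expr                                # a constant or a variable name

# x * (2 + 3)  ->  x * 5 : one multiply instead of an add and a multiply
print(fold(("*", "x", ("+", 2, 3))))           # ('*', 'x', 5)
```

On a RISC machine this matters directly: every folded operation is one fewer simple instruction the processor must fetch, decode and execute.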

Q3: Can instructions be executed in a pipeline? If yes, take an example instruction and execute it using an instruction pipeline.

Ans:

Pipelining is a technique that improves instruction execution speed by overlapping the execution steps of successive instructions.

To understand the pipeline mechanism, it is first necessary to understand the execution phases of an instruction. The execution phases of an instruction for a processor with a 5-step "classic" pipeline are as follows:

• FETCH: retrieves the instruction from the cache;


• DECODE: decodes the instruction and looks for operands
(register or immediate values);
• EXECUTE: performs the instruction (for example, if it is an ADD
instruction, addition is performed, if it is a SUB instruction,
subtraction is performed, etc.);
• MEMORY: accesses the memory, and writes data or retrieves data
from it;
• WRITE BACK (retire): records the calculated value in a register.

Instructions are organized into lines in the memory and are loaded one
after the other.
The goal of the pipeline is to perform each step in parallel with the preceding and following steps: fetching one instruction (FETCH) while the previous one is being decoded (DECODE), the one before that is being executed (EXECUTE), the one before that is accessing memory (MEMORY), and the first in the series is having its result written back to a register (WRITE BACK).

In general, 1 to 2 clock cycles (rarely more) should be planned for each pipeline step, or a maximum of 10 clock cycles per instruction. For two instructions, a maximum of 12 clock cycles is necessary (10 + 2 = 12 instead of 10 × 2 = 20) because the second instruction was already in the pipeline. Both instructions are therefore processed simultaneously, but with a delay of 1 or 2 clock cycles. For 3 instructions, 14 clock cycles are required, and so on.
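The cycle arithmetic above can be sketched directly. This assumes, as the text does, a 5-stage pipeline with a worst case of 2 cycles per stage:

```python
# Sketch of the pipeline timing described above: the first instruction
# takes a full pass through all stages, and each further instruction
# finishes only one stage-time later.

STAGES = ["FETCH", "DECODE", "EXECUTE", "MEMORY", "WRITE BACK"]
CYCLES_PER_STAGE = 2                    # worst case assumed in the text

def pipelined_cycles(n_instructions):
    """Total cycles to finish n instructions in the pipeline."""
    first = len(STAGES) * CYCLES_PER_STAGE      # 10 for the first
    return first + (n_instructions - 1) * CYCLES_PER_STAGE

print(pipelined_cycles(1))   # 10
print(pipelined_cycles(2))   # 12  (not 20)
print(pipelined_cycles(3))   # 14
```

Without pipelining the same three instructions would take 10 × 3 = 30 cycles; the overlap reduces that to 14.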

Part – B

Q4: How are RISC pipelines implemented in a RISC environment?

Ans:
CISC Architecture

CISC (Complex Instruction Set Computer) architecture means hardwiring the processor with complex instructions that would otherwise be difficult to create from basic instructions.

CISC is especially popular in 80x86-type processors. This type of architecture has an elevated cost because of the advanced functions etched onto the silicon.

Instructions are of variable length and may sometimes require more than
one clock cycle. Because CISC-based processors can only process one
instruction at a time, the processing time is a function of the size of the
instruction.

RISC Architecture

Processors with RISC (Reduced Instruction Set Computer) technology do not have hardwired advanced functions.

Programs must therefore be translated into simple instructions, which complicates development and/or requires a more powerful processor. Such an architecture has a reduced production cost compared to CISC processors. In addition, the instructions, being simple in nature, are executed in just one clock cycle, which speeds up program execution compared to CISC processors. Finally, these processors can handle multiple instructions simultaneously by processing them in parallel.
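The timing contrast described above can be sketched roughly. The per-instruction cycle counts here are invented for illustration: CISC instructions take a variable number of cycles one after another, while pipelined single-cycle RISC instructions each complete one cycle apart once the pipeline is full.

```python
# Rough sketch of the CISC vs RISC timing contrast. Cycle counts are
# invented; the point is the shape of the arithmetic, not real numbers.

def cisc_time(instr_cycles):
    # One instruction at a time: total time is the sum of each cost
    return sum(instr_cycles)

def risc_time(n_instructions, pipeline_depth=5):
    # Pipelined single-cycle instructions: fill latency, then 1 per cycle
    return pipeline_depth + (n_instructions - 1)

print(cisc_time([4, 6, 3, 5]))   # 18 cycles for 4 complex instructions
print(risc_time(12))             # 16 cycles for 12 simple instructions
```

Even though the RISC program needs three times as many instructions here, the pipeline finishes the whole sequence in fewer cycles.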

Q5: Give the superscalar architecture of the Pentium processor.

Ans:
A superscalar CPU architecture implements a form of parallelism called instruction-level parallelism within a single processor. It therefore allows faster CPU throughput than would otherwise be possible at a given clock rate. A superscalar processor executes more than one instruction during a clock cycle by simultaneously dispatching multiple instructions to redundant functional units on the processor. Each functional unit is not a separate CPU core but an execution resource within a single CPU, such as an arithmetic logic unit, a bit shifter, or a multiplier. Superscalar execution consists of placing multiple processing units in parallel in order to process multiple instructions per cycle.

While a superscalar CPU is typically also pipelined, pipelining and superscalar architecture are considered different performance enhancement techniques.

The superscalar technique is traditionally associated with several identifying characteristics (within a given CPU core):

• Instructions are issued from a sequential instruction stream.
• CPU hardware dynamically checks for data dependencies between instructions at run time (versus software checking at compile time).
• The CPU accepts multiple instructions per clock cycle.

Therefore a superscalar processor can be envisioned as having multiple parallel pipelines, each of which is processing instructions simultaneously from a single instruction thread.
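The dependency checking listed above can be sketched with a toy dual-issue model. This is not how any real Pentium dispatcher works; it only illustrates the idea that two instructions issue together when the second does not read a register the first writes.

```python
# Toy sketch of dual-issue superscalar dispatch. Each instruction is a
# tuple (dest, src1, src2); each cycle the CPU issues up to two
# instructions, pairing them only if they are independent.

def dual_issue_cycles(instructions):
    cycles, i = 0, 0
    while i < len(instructions):
        cycles += 1
        if i + 1 < len(instructions):
            dest = instructions[i][0]
            _, a, b = instructions[i + 1]
            if dest not in (a, b):        # independent -> issue both
                i += 2
                continue
        i += 1                            # dependent -> issue alone
    return cycles

prog = [("r1", "r2", "r3"),   # r1 = r2 + r3
        ("r4", "r5", "r6"),   # independent of r1 -> pairs with above
        ("r7", "r1", "r4")]   # depends on r1 and r4 -> own cycle
print(dual_issue_cycles(prog))   # 2 cycles instead of 3
```

A fully dependent chain gains nothing from the second pipe, which is exactly why compilers for superscalar targets try to interleave independent instructions.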

Q6: Is there any difference between the working of vector processors and array processors? Differentiate between SIMD and MIMD array processors.

Ans:
The processor (CPU, for Central Processing Unit) is the computer's
brain. It allows the processing of numeric data, meaning information
entered in binary form, and the execution of instructions stored in
memory.

The first microprocessor (Intel 4004) was invented in 1971. It was a 4-bit calculation device with a speed of 108 kHz. Since then, microprocessor power has grown exponentially. So what exactly are these little pieces of silicon that run our computers?

SIMD/MIMD is a term referring to a machine with a dual mode of operation: it can switch from MIMD to SIMD for a period of time to handle some complex instruction, and thus has two modes. The Thinking Machines, Inc. Connection Machine model CM-2, when placed as a front end or back end of a MIMD machine, permitted programmers to operate different modes for execution of different parts of a problem, sometimes referred to as dual modes. Such machines have existed since the Illiac and have employed a bus that interconnects the master CPU with the other processors. The master control processor has the capability of interrupting the processing of the other CPUs, which can otherwise run independent program code. During an interruption, some provision must be made for checkpointing (closing and saving the current status of the controlled processors).
SIMIMD is a processor array architecture wherein all processors in the array are commanded from a Single Instruction stream to execute Multiple Data streams, located one per processing element. Within this construct, data-dependent operations within each picket that mimic instruction execution are controlled by the SIMD instruction stream.

This is a Single Instruction Stream machine with the ability to sequence Multiple Instruction streams (one per picket) using the SIMD instruction stream, and to operate on Multiple Data Streams (one per picket). SIMIMD can be executed by a processor memory element system.
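The SIMD/MIMD distinction itself can be sketched in a toy way. In this illustrative Python model (not real parallel hardware), SIMD applies one instruction across all data streams, while MIMD gives each processing element its own instruction:

```python
# Toy contrast between SIMD and MIMD execution models. Each "processing
# element" gets one data stream; only the instruction supply differs.

data = [(1, 2), (3, 4), (5, 6)]           # one data stream per element

def simd(instruction, streams):
    # Single Instruction, Multiple Data: same op applied everywhere
    return [instruction(a, b) for a, b in streams]

def mimd(programs, streams):
    # Multiple Instructions, Multiple Data: each element has its own op
    return [op(a, b) for op, (a, b) in zip(programs, streams)]

print(simd(lambda a, b: a + b, data))     # [3, 7, 11]
print(mimd([lambda a, b: a + b,
            lambda a, b: a * b,
            lambda a, b: a - b], data))   # [3, 12, -1]
```

A SIMD/MIMD (SIMIMD) machine, as described above, would switch between these two supply modes dynamically for different parts of a problem.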

A parallel array processor for massively parallel applications is formed with low-power CMOS DRAM technology, incorporating the processing elements on a single chip. Eight processors on a single chip each have their own processing element, significant memory, and I/O, and are interconnected with a hypercube-based, but modified, topology. These nodes are then interconnected by a hypercube, modified hypercube, ring, or ring-within-ring network topology. Conventional microprocessor MPPs consume pins and time going to memory. The new architecture merges processor and memory, with multiple PMEs (eight 16-bit processors with 32K of memory and I/O) in DRAM; it has no memory access delays and uses all the pins for networking. The chip can be a single node of a fine-grained parallel processor. Each chip has eight 16-bit processors, each providing 5 MIPS of performance. I/O has three internal ports and one external port shared by the processors on the chip. Significant software flexibility is provided to enable quick implementation of existing programs written in common languages. The scalable chip PME has internal and external connections for broadcast and asynchronous SIMD, MIMD and SIMIMD (SIMD/MIMD) operation, with dynamic switching of modes. The chip can be used in systems which employ 32, 64 or 128,000 processors.
