Area Cost 51000 24500 62*N 60*N 50*N 50*N+80*floor (N/4) 60*N2
Module  Description                                Delay               Area Cost
MP1     N-bit in, 2N-bit out 1-stage multiplier    0.45*N (1 cycle)    61*N^2+125*N
MP2     N-bit in, 2N-bit out 2-stage multiplier    0.25*N (2 cycles)   62*N^2+280*N
ALU4    N-bit function ALU (and/or/xor/not)        0.8+0.05*N          120*N
ALU2    N-bit function ALU (and/or)                0.4+0.05*N          55*N
INV     N-bit inverse gate                         0.3                 2*N
CMP     N-bit comparison and equivalence           0.25*N              45*N
EQV     N-bit equivalence                          0.4                 3*N
DEC2    2-bit input decoder                        1.0                 60
DEC3    3-bit input decoder                        1.8                 140
MUX2    N-bit 2-to-1 mux                           0.2+0.05*N          30*N
MUX4    N-bit 4-to-1 mux                           0.7+0.05*N          80*N
MUX8    N-bit 8-to-1 mux                           1.2+0.05*N          170*N
MUX16   N-bit 16-to-1 mux                          1.8+0.05*N          360*N
Table 5
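As a quick check of the formulas in Table 5, the following Python sketch evaluates delay and area for a chosen bit-width N. The dictionary layout and the function name `cost` are illustrative and not part of the course material; only four of the thirteen rows are transcribed, and the delay values are in whatever unit Table 5 uses.

```python
# Delay and area formulas transcribed from Table 5 (N = bit-width).
# Only a few rows are included; the names TABLE5 and cost are illustrative.
TABLE5 = {
    "MP1":  (lambda n: 0.45 * n,       lambda n: 61 * n**2 + 125 * n),
    "MP2":  (lambda n: 0.25 * n,       lambda n: 62 * n**2 + 280 * n),
    "ALU4": (lambda n: 0.8 + 0.05 * n, lambda n: 120 * n),
    "MUX4": (lambda n: 0.7 + 0.05 * n, lambda n: 80 * n),
}


def cost(module, n):
    """Return (delay, area) of one Table 5 module at bit-width n."""
    delay_fn, area_fn = TABLE5[module]
    return delay_fn(n), area_fn(n)


if __name__ == "__main__":
    for name in TABLE5:
        delay, area = cost(name, 16)
        print(f"{name:5s} delay={delay:.2f} area={area}")
```

For N = 16 the one-stage multiplier MP1 costs 17616 in area against 20352 for the two-stage MP2, which is the kind of frequency versus area comparison the multiplier choice calls for.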
Figure 4
3. The depth of the pipeline is your choice, but it must not exceed six stages. Fewer stages may lower the operating frequency, while more stages increase the hardware area.
4. Three kinds of multipliers are available in PART I, because a multiplication unit always has a much larger area and a longer critical path than any other combinational component. Choose one of them as a trade-off between operating frequency and area cost.
Exercise: Draw a layout of your pipelined processor architecture. Label each stage at the top of the layout. Mark the bit-width of each delay element, as Figure 5 shows. Signals in different pipeline stages must not share the same name. The symbols for functional units should be the same as those in Table 4 of PART I, with comments added. Try to optimize your design according to Table 5 for better performance, minimizing the detailed area cost and the critical timing of your design.
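One way to estimate the critical timing of a layout is to add the Table 5 delays along the longest combinational path in each stage and then take the slowest stage. The Python sketch below does this for two made-up stages. The stage contents, the 16-bit width, and the function names are assumptions for illustration only, and any register overhead is ignored because it is not specified here.

```python
# Delay formulas for a few modules, taken from Table 5 (N = bit-width).
DELAY = {
    "ALU4": lambda n: 0.8 + 0.05 * n,
    "MUX2": lambda n: 0.2 + 0.05 * n,
    "MUX4": lambda n: 0.7 + 0.05 * n,
    "DEC3": lambda n: 1.8,
}


def path_delay(path, n):
    """Sum the delays of modules connected in series at bit-width n."""
    return sum(DELAY[module](n) for module in path)


def critical_stage(stages, n):
    """Return (stage name, delay) of the slowest stage."""
    delays = {name: path_delay(path, n) for name, path in stages.items()}
    worst = max(delays, key=delays.get)
    return worst, delays[worst]


# Hypothetical longest path per stage; a real layout defines its own.
STAGES = {
    "decode":  ["DEC3", "MUX2"],
    "execute": ["MUX4", "ALU4", "MUX2"],
}

if __name__ == "__main__":
    name, delay = critical_stage(STAGES, 16)
    print(name, round(delay, 2))
```

The slowest stage bounds the clock period, so shortening any other stage does not raise the operating frequency.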
Figure 5
Description
Clock signal
Reset signal
Instruction memory address given by program counter
Executable instructions from instruction memory
Data memory address decoded by processor
0 for reading data from specified address; 1 for writing data into specified address
Data from processor to memory
Data from memory to processor
I/O interface signals
Exercise: Implement your processor according to the layout. The main module must be named PROCESSOR and be contained in the file PROCESSOR.v, which should include the required behavior model library file MODEL.v and any other necessary module files. Take advantage of module hierarchy to save time. All combinational logic should be built from standard cells, such as and, or, etc., and written in independent modules. List the exact area cost of each component used in the layout, as in Table 9, and calculate the total area cost of the entire processor.

Module Name   Bit-width   Amount   Area Cost
ADD_R         24-bit      1        1200
R             24-bit      2        2880
MUX2          16-bit      3        1920
MUX2          1-bit       2        80
Table 9
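The total area is the sum of the area cost column. A minimal Python sketch using the four example rows of Table 9 follows. It assumes the area cost of each row already covers all instances in that row, which the example values suggest (2880 for two 24-bit R instances); the names `TABLE9` and `total_area` are illustrative.

```python
# Rows of the Table 9 example: (module name, bit-width, amount, area cost).
# Assumption: the area cost is the total for the row, not per instance.
TABLE9 = [
    ("ADD_R", 24, 1, 1200),
    ("R",     24, 2, 2880),
    ("MUX2",  16, 3, 1920),
    ("MUX2",   1, 2,   80),
]


def total_area(rows):
    """Add up the area cost column of a Table 9 style listing."""
    return sum(area for _name, _width, _amount, area in rows)


if __name__ == "__main__":
    print(total_area(TABLE9))
```

Keeping the same module at different bit-widths on separate rows, as the two MUX2 entries do, makes each row easy to check against its area formula.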
V. Design Verification
After completing your design with the Verilog behavior models, you are required to verify it with a testbench. Figure 6 illustrates the verification flow of the testbench. The testbench reads the instructions from a machine code file, sends data and instructions to the processor, receives the calculated results in data memory, and verifies the answers after all operations finish. If there are bugs in the processor, the results will differ and errors will be detected. To run the simulation, use Icarus Verilog, available on the textbook CD-ROM or the course website. However, if you want to observe detailed internal signals in your design, a simulation tool such as Nanosim is necessary.
Figure 6
Exercise: Verify your behavior module design with the testbench given by the T.A. Your design must pass simulation with two machine code files. One is translated from the C program in PART I; the other is translated from assembly code that is available in MIPS assembly language and must be modified according to your own instruction set to fit your assembler. Check the output file to see whether the answers are correct. Adjust the clock period in the testbench to find the maximum speed at which your processor can run, and compare the minimum clock period with your estimated critical timing. You can use both Icarus Verilog and Nanosim to verify your design and observe the signals. Try modifying the example testbench module to check smaller modules, which can save verification time. Report the minimum clock period and the total execution cycles determined in the testbench.
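Finding the minimum clock period by hand means rerunning the simulation with different periods. The search can be organized as a bisection, sketched here in Python. The callback `passes` is hypothetical: it stands for "run the testbench at this clock period and report whether all answers are correct", and how the period is set in the T.A.'s testbench is not shown in this handout. The sketch also assumes that a design which passes at one period passes at every longer period.

```python
def min_passing_period(passes, lo, hi, tol=0.01):
    """Bisect for the shortest clock period that still passes.

    passes: hypothetical callback, period -> bool (True if the run is correct).
    lo, hi: search bounds; the design is expected to pass at hi.
    tol:    the search stops when the bracket is narrower than this.
    """
    if not passes(hi):
        raise ValueError("design fails even at the longest period")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if passes(mid):
            hi = mid   # mid works, try shorter periods
        else:
            lo = mid   # mid fails, the minimum lies above it
    return hi


if __name__ == "__main__":
    # Stand-in for a real simulation: pretend the true limit is 4.1.
    fake_run = lambda period: period >= 4.1
    print(round(min_passing_period(fake_run, 1.0, 10.0), 2))
```

The returned period can then be compared with the critical timing estimated from the layout.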
different bits are viewed as different modules. Calculate the total area cost of your processor.
3. A report of one A4 page. The report records the minimum clock period, the maximum operating frequency, the total execution cycles for the two machine codes, and the value of (period x area). In addition, list the special features used in your design, such as data hazard detection logic, forwarding circuits, and techniques to reduce area and timing delay.
4. Upload the following files to the T.A.'s FTP server: Verilog files, which contain all behavior models of the processor and should be correctly verified and bug-free.
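Two of the report values follow directly from the measured ones: the maximum operating frequency is the reciprocal of the minimum clock period, and (period x area) is their product. A small Python sketch, with an illustrative function name and made-up input numbers:

```python
def report_metrics(min_period, total_area):
    """Derive the computed report values from the measured ones."""
    if min_period <= 0:
        raise ValueError("clock period must be positive")
    return {
        # Frequency comes out in 1 / (unit of the period).
        "max_frequency": 1.0 / min_period,
        "period_x_area": min_period * total_area,
    }


if __name__ == "__main__":
    # Made-up example values, not results from any real design.
    print(report_metrics(5.0, 100000))
```

If the period is measured in nanoseconds, the frequency computed this way is in GHz.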
What is the grading policy?
1. Complete functionality and a clear explanation of the pipelined architecture layout are the keys to a better grade. Comments, colored lines, hierarchical blocks, and detailed descriptions can help you earn a higher score.
2. A bug-free, correctly verified design earns the highest score. If you cannot pass the testbench verification, try to explain the errors. A design that does not work still earns some credit.
3. A smaller value of (period x area) leads to a higher score, so reducing the critical timing and minimizing the total area are important.
4. Special features in your design count as a bonus on the final score. Try to list as many features as possible.
5. Hierarchical design and comments in your Verilog files are also helpful.
6. Most important of all, hand in your project on time.