Processor Verification PDF
STUDY EXAMPLE
J. Robert Heath* and Sreenivas Durbha, Dept. of Electrical Engineering, 453 Anderson Hall, University of Kentucky, Lexington, KY 40506. Heath@engr.uky.edu, sdurbha@~cocd2.intel.com
*Corresponding Author
ABSTRACT
A goal of computer designers is to reduce the development cycle time for complex pipelined architecture core processor systems. A research effort is described which had a major objective of determining if an approach and methodology could be developed which will allow complex pipelined architecture processors with stringent system functional, timing, and performance requirements to be correctly and efficiently synthesized from a high behavioral-level-only HDL design description, thus reducing development cycle time. A second research objective was to synthesize to target FPGA technology using primarily standard available PC based CAD tools. Contributions include a developed approach and methodology which are verified by presentation of the results of a case study example which resulted in the correct synthesis of an FPGA prototype of a behavioral-level-only HDL described pipeline architecture processor. Correct synthesis was verified via experimental testing of the processor prototype.
Key Words: High Level Design and Synthesis, HDLs, Computers, FPGAs, Prototyping, Testing, Verification.
using current techniques of largely structural HDL coding. This largely depends on the current generation of Computer Aided Design (CAD) tools. Hence a useful offshoot of the research problem is to show how best to utilize current CAD tools to suit our system design, design capture, prototype synthesis, and prototype experimental testing goals. The entire work was carried out in a Windows 95 PC based environment using three different kinds of CAD tools for simulation and synthesis purposes from three different vendors.
I.
This section discusses the motivation for the reported research and the goals and objectives of the research.
II.
We now briefly overview the Instruction Set Architecture (ISA) of the MIPS R2000 pipelined architecture processor [5] for which a behavioral level HDL description will be developed, synthesized, and tested for verification of a correct behavioral level design capture and synthesis.
The MIPS addressing modes are [5]: register, displacement, immediate, PC relative, and pseudodirect. We implemented enough of the full instruction set to require all basic functionality found in a commercial version of the processor, and enough to allow writing and execution of the test programs required to verify all basic functionality of the processor and its instruction set. Our version of the processor implements the above 17 assembly language instructions. Its organization and architecture are shown in Fig. 2.1, including the Forwarding and Hazard Detection Units.
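The three addressing modes that involve an address computation can be sketched as follows. This is a minimal illustration of the conventional byte-addressed MIPS definitions given in [5], not the authors' HDL; the function names are hypothetical, and the address scaling used inside the prototype is not stated in this section and may differ. Register and immediate addressing need no address arithmetic and are omitted.

```python
def sign_extend16(value):
    """Interpret a 16-bit field as a signed two's-complement integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def displacement_address(base, offset16):
    """Displacement addressing: base register contents plus signed offset."""
    return (base + sign_extend16(offset16)) & 0xFFFFFFFF


def pc_relative_target(pc, offset16):
    """PC-relative addressing: (PC + 4) plus the signed word offset times 4."""
    return (pc + 4 + (sign_extend16(offset16) << 2)) & 0xFFFFFFFF


def pseudodirect_target(pc, index26):
    """Pseudodirect addressing: top 4 bits of (PC + 4) joined with index * 4."""
    return ((pc + 4) & 0xF0000000) | ((index26 & 0x03FFFFFF) << 2)


# Assumed example values, chosen only to exercise each mode.
load_address = displacement_address(0x1000, 0xFFFC)   # offset field encodes -4
branch_target = pc_relative_target(0x100, 2)          # two words past PC + 4
jump_target = pseudodirect_target(0x10000000, 2)      # word index 2 in the region
```

With these inputs the load address is 0x0FFC, the branch target is 0x10C, and the jump target is 0x10000008.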
simulation. Throughout the code-writing phase of the research project, the architecture of each of the units and their interfaces with the other units were worked out using high-level functional diagrams of the units. All signals must interface with their counterparts in successive modules.
behaviorally coded in the following order: 1) The IF Pipeline Stage, 2) The ID Pipeline Stage including the Controller and Hazard Detection functionality, 3) The EX Pipeline Stage, 4) The MEM Pipeline Stage, 5) The WB Pipeline Stage, 6) The Instruction and Data Cache Memory, 7) The Forwarding functionality, 8) The Maskable Hardware Vectored Priority Interrupt System (MHVPIS) functionality, and 9) Highest Level Module containing all above functional units.
[Figure 3.1 diagram not reproduced. Module labels recoverable from the figure: forward.v (Forwarding Unit), MEM Pipeline, dm.v (Data Memory), im.v (Instruction Memory), RegFile.v (Register File), scuba-im.v, scuba-R1R2.v.]
Figure 3.1: The Behavioral Level HDL Hierarchy of the Pipelined Processor.
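The Forwarding and Hazard Detection functionality named in the coding order above can be illustrated with the textbook conditions from [5]. The sketch below is an assumption-laden model in Python, not the authors' behavioral HDL: the function names, argument names, and return labels are hypothetical, and register 0 is assumed to be hardwired to zero as in standard MIPS.

```python
def forward_select(id_ex_src, ex_mem_regwrite, ex_mem_rd,
                   mem_wb_regwrite, mem_wb_rd):
    """Choose the source of one ALU operand for the instruction in EX.

    Returns "EX/MEM" or "MEM/WB" when a newer result must be forwarded from
    that pipeline register, or "REG" when the register file value is current.
    """
    # The most recent producer (one instruction ahead) takes priority.
    if ex_mem_regwrite and ex_mem_rd != 0 and ex_mem_rd == id_ex_src:
        return "EX/MEM"
    # Otherwise use the producer two instructions ahead, if any.
    if mem_wb_regwrite and mem_wb_rd != 0 and mem_wb_rd == id_ex_src:
        return "MEM/WB"
    return "REG"


def load_use_stall(id_ex_memread, id_ex_rt, if_id_rs, if_id_rt):
    """True when the instruction in ID reads the register a load in EX writes."""
    return bool(id_ex_memread) and id_ex_rt in (if_id_rs, if_id_rt)


# add $R1,$R1,$R1 followed directly by add $R1,$R1,$R5: operand $R1 comes
# from the EX/MEM pipeline register rather than the register file.
operand_source = forward_select(1, True, 1, False, 0)

# lw $R8,0($R1) followed directly by beq $R8,$R5,Exit: the textbook
# load-use condition is met because the beq reads $R8.
needs_stall = load_use_stall(True, 8, 8, 5)
```

Both example pairs are taken from the test program discussed in the next section, which is why that program exercises these two units.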
and it also tests the Hazard Detection and Forwarding functionality of the pipelined processor.
      add $R1, $R3, $R3       # $R1 = 2 * $R3
Loop: add $R1, $R1, $R1       # $R1 = 4 * $R3
      add $R1, $R1, $R5       # $R1 = $R1 + $R5
      lw  $R8, 0($R1)         # $R8 = Mem[$R1 + 0]
      beq $R8, $R5, Exit      # if ($R5 = $R8) go to Exit
      add $R3, $R3, $R4       # $R3 = $R3 + $R4
      j   Loop                # Jump to address Loop
Exit: add $R1A, $R10, $R11    # $R1A = $R10 + $R11
im.mem
 0: 00000000 00000000 00231842 00210802
    00212802 2C280000 24A80008 00632002
 8: 08000002 034A5802 00000000 00000000
    00000000 00000000 00000000 00000000
10: 00000000 00000000 00000000 00000000
    00000000 00000000 00000000 00000000
18: 00000000 00000000 00000000 00000000
    00000000 00000000 00000000 00000000

dm.mem
 0: 00000000 00020120 00000000 00020120
    00000000 00000000 00000002 00000000
 8: 00020120 00000000 00000000 00000000
    00000000 00000000 00000000 00000000
10: 00000000 00000000 00000000 00000000
    00000000 00000000 00000000 00000000
18: 00000000 00000000 00000000 00000000
    00000000 00000000 00000000 00000000

RegFile.mem
 0: 00000000 00000000 00000000 00100100
    00000000 00000002 00000000 00000000
Figure 5.1b shows the states of the Instruction Memory (im.mem), the Data Memory (dm.mem), and the Register File (RegFile.mem) before execution of the program of Fig. 5.1a. The figures show the addresses of memory and register file locations (in hexadecimal notation) on the left side, starting from 0d (0h) and going through 7d (7h) in the first two rows. The third and fourth rows start from address 8d (8h) and go through 15d (Fh), and so on until the last two rows, which start at address 24d (18h) and go through 31d (1Fh). The data stored at each address location is given by the 8-digit hexadecimal number. Fig. 5.1c shows the contents of RegFile.mem after execution of the program. Known results are shown in the form of the highlighted and underlined values.
The test program should execute all the instructions starting with the add at the Loop begin address 02 in im.mem of Fig. 5.1b through the beq instruction at address 06. On each of the first two passes the branch equal test should evaluate to false, so execution should go on to the add instruction at address 07 and then to the jump back to the Loop beginning address. The third time, however, the branch equal test should evaluate to true and execution should branch to the Exit address, which is 09 in the Instruction Memory. If the add operation at that address executes correctly, it should deposit (00000025)hex in $R1A, which is the sum of (00000012)hex and (00000013)hex held in $R10 and $R11 respectively. Hence the value in $R1A shows that the test ran successfully, as can be seen in the RegFile.mem map of Fig. 5.1c. Other test programs with different control flows were written and all were successfully executed on the prototype processor, further verifying a correct behavioral level HDL description and synthesis of the pipelined processor [6].
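The address layout described above (a hexadecimal label followed by eight consecutive words spread over two rows) can be read mechanically. The following sketch is only an illustration: the parser name is hypothetical, the text format is an assumption about how the dump might be stored, and the two sample rows are transcribed from the register file state after execution. The locations 0Ah, 0Bh, and 1Ah are read off the figure as the positions holding the two operands and the result.

```python
def parse_dump(text):
    """Map word address -> value for rows shaped like 'addr: w w w w'.

    A row without a label continues the addresses of the previous row.
    Assumes the first non-empty row carries a label.
    """
    memory = {}
    address = None
    for line in text.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        if ":" in line:
            label, line = line.split(":", 1)
            address = int(label, 16)      # labels are hexadecimal
        for word in line.split():
            memory[address] = int(word, 16)
            address += 1
    return memory


# Rows 8h and 18h of RegFile.mem after execution, as transcribed here.
REGFILE_AFTER = """
 8: 00000002 00000000 00000012 00000013
    00000000 00000000 00000000 00000000
18: 00000000 00000000 00000025 00000000
    00000000 00000000 00000000 00000000
"""

regs = parse_dump(REGFILE_AFTER)
# The pass criterion from the text: the result equals the sum of the operands.
test_passed = regs[0x1A] == (regs[0x0A] + regs[0x0B]) & 0xFFFFFFFF
```

Here `test_passed` is true because 12h + 13h = 25h, which matches the value found at location 1Ah.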
RegFile.mem
 0: 00000000 00400402 00000000 00100100
    00000000 00000002 00000000 00000000
 8: 00000002 00000000 00000012 00000013
    00000000 00000000 00000000 00000000
10: 00000000 00000000 00000000 00000000
    00000000 00000000 00000000 00000000
18: 00000000 00000000 00000025 00000000
    00000000 00000000 00000000 00000000
Figure 5.1b: Memory and Register File State before Execution
This program performs a series of add operations on a single register, $R1, including the addition of $R5. The resulting value in $R1 is used to load $R8, which is then compared with $R5 in the conditional branch operation. The outcome of the conditional branch operation decides whether to branch and exit the loop or to re-enter the loop.
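The control flow of the test program can be traced with a small functional model. This is a sketch under stated assumptions, not a model of the pipeline or of the authors' HDL: instructions are executed one at a time with no timing, instruction addresses are simply list indices, memory is word-indexed, and the initial register and memory values are hypothetical ones chosen so that the branch falls through twice and is taken on the third pass. They are not the values of Fig. 5.1b, apart from the operands 12h and 13h.

```python
def run_test_program(regs, mem, max_steps=100):
    """Execute the eight-instruction test loop; return the list of beq outcomes."""
    program = [
        ("add", "R1", "R3", "R3"),
        ("add", "R1", "R1", "R1"),      # Loop (index 1)
        ("add", "R1", "R1", "R5"),
        ("lw", "R8", "R1", 0),
        ("beq", "R8", "R5", 7),         # branch to Exit (index 7)
        ("add", "R3", "R3", "R4"),
        ("j", 1),                       # jump to Loop
        ("add", "R1A", "R10", "R11"),   # Exit
    ]
    pc = 0
    outcomes = []
    for _ in range(max_steps):
        if pc >= len(program):
            break
        op, *args = program[pc]
        pc += 1
        if op == "add":
            dest, a, b = args
            regs[dest] = (regs[a] + regs[b]) & 0xFFFFFFFF
        elif op == "lw":
            dest, base, offset = args
            regs[dest] = mem.get(regs[base] + offset, 0)
        elif op == "beq":
            a, b, target = args
            taken = regs[a] == regs[b]
            outcomes.append(taken)
            if taken:
                pc = target
        elif op == "j":
            pc = args[0]
    return outcomes


# Hypothetical initial state: $R1 reaches 6, 14, then 30 on successive passes,
# and only location 30 holds a value equal to $R5.
regs = {"R1": 0, "R3": 1, "R4": 0, "R5": 2, "R8": 0,
        "R10": 0x12, "R11": 0x13, "R1A": 0}
mem = {30: 2}
outcomes = run_test_program(regs, mem)   # [False, False, True]
```

After the run, `regs["R1A"]` holds 25h, the same pass criterion used for the prototype.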
Figure 5.1c: Register File State after Execution
The ability of the prototype processor to correctly handle interrupts was also tested. All tests were successful in that the processor's MHVPIS recognized the interrupts and, in response, transferred control to the correct interrupt service routines [6]. Even though the ORCA 2C40A FPGA technology is rated at 33 MHz, we could only run our prototype, synthesized from multiple vendor PC based noncommercialized CAD tools, at 1.979 MHz. The low frequency of operation of the prototype can be attributed to the behavioral-level-only design capture medium and the utilized PC based design synthesis tools.
VI. CONCLUSIONS
The pipelined processor architecture prototype developed from behavioral-level-only HDL code as described within the paper was experimentally verified to be fully functional. Thus it has been shown that a pipelined processor architecture with stringent and complex system functional, timing, and performance requirements can be correctly synthesized and embedded into an FPGA with the design capture being done using an HDL at only the behavioral level of abstraction. Using behavioral-level-only HDL coding rather than more time consuming detailed lower level coding significantly reduces the development cycle time for such processor systems. It was shown that a more efficient synthesis can be achieved if a very small portion of the behavioral level HDL code is reduced to the register level, primarily that part of the code used to describe the Instruction and Data Memory caches of the pipelined processor.
REFERENCES
1. C. A. Fields. Proper Use of Hierarchy in HDL-Based High Density FPGA Design. pp. 168-177. Lecture Notes in Computer Science, Proc. Field-Programmable Logic and Applications, 5th Int. Workshop, FPL '95, Aug./Sept. 1995.
2. Y. Li and W. Chu. Aizup - A Pipelined Processor Design and Implementation on XILINX FPGA Chip. pp. 98-106. IEEE Comp. Soc. Press, 1996.
3. J. S. Gray. Homebrewing RISCs in FPGAs. www3.sympatico.ca/jsgray/j32.ppt
4. M. Gschwind and V. Salapura. A VHDL Design Methodology for FPGAs. pp. 208-217. Lecture Notes in Computer Science, Proc. Field-Programmable Logic and Applications, 5th Int. Workshop, FPL '95, Aug./Sept. 1995.
5. D. A. Patterson and J. L. Hennessy. Computer Organization and Design: The Hardware/Software Interface, 2nd Edition, Morgan Kaufmann, Inc., 1998.
6. S. Durbha. Prototyping and Testing of a Pipelined Processor from Behavioral Level Code, Master's Thesis, Dept. of EE, Univ. of KY, Lexington, KY, May 2000.
7. M. D. Ciletti. Modeling, Synthesis, and Rapid Prototyping with the Verilog HDL, Prentice Hall, 1999.
8. S. Palnitkar. Verilog HDL: A Guide to Digital Design and Synthesis, Prentice Hall, Inc., 1996.
9. FPGA Express Users Manual. 1999. www.synopsys.com/products/fpga/fpga-express.html
10. Lucent Technologies. ORCA Foundry Development System. User's Guide, EPIC User's Guide (Version 9.1).
11. Lucent Technologies. Optimized Reconfigurable Cell Array (ORCA(TM)), OR2CxxA Series Field-Programmable Gate Arrays. Microelectronics Data Sheet. Mar. 1996.