Chapter 2 Hardware and Software Design Issues
Purushotam Shrestha
Use the problem description to derive the truth table listing the inputs and the corresponding outputs.
Use K-maps to find a logic expression, or use a pre-defined function.
Use Boolean algebra if further simplification is required.
Draw the circuit and simulate/implement it using the necessary gates.
Testing and re-design may be required.
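The first steps above can be sketched in a few lines of Python. This is only an illustration: the majority function, the input names A, B, C and the helper names are assumptions chosen for the example, not part of the notes. The sketch builds the truth table, reads off the unsimplified sum of minterms, and checks a K-map simplified form against the table.

```python
from itertools import product

def truth_table(func, names):
    """Return a list of (input_bits, output_bit) rows for a Boolean function."""
    rows = []
    for bits in product((0, 1), repeat=len(names)):
        rows.append((bits, int(func(*bits))))
    return rows

def sum_of_minterms(rows, names):
    """Build the unsimplified sum-of-products expression from rows whose output is 1."""
    terms = []
    for bits, out in rows:
        if out:
            # A complemented input is written with a trailing apostrophe.
            literals = [n if b else n + "'" for n, b in zip(names, bits)]
            terms.append("".join(literals))
    return " + ".join(terms) if terms else "0"

# Hypothetical problem: the output is 1 when at least two of three inputs are 1.
def majority(a, b, c):
    return (a + b + c) >= 2

def simplified(a, b, c):
    """K-map simplification of the majority function: AB + BC + AC."""
    return (a & b) | (b & c) | (a & c)

NAMES = ("A", "B", "C")
ROWS = truth_table(majority, NAMES)
EXPRESSION = sum_of_minterms(ROWS, NAMES)

if __name__ == "__main__":
    print(EXPRESSION)
    # Testing step: the simplified circuit must agree with every row of the table.
    print(all(simplified(*bits) == out for bits, out in ROWS))
```

Checking the simplified expression against every row of the truth table plays the role of the testing step before any gates are wired.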
Figure: Sequential Logic (combinational logic together with sequential logic elements: flip-flops)
At the heart of a sequential circuit are the flip-flops, which provide the memory-based functions, while some processing may be done by additional combinational logic.
Examples are JK flip-flops, counters, registers etc.
A sequential circuit is represented by a state diagram consisting of states (circles) and transitions (lines) between states triggered by input control, or by a state table listing input, present state, output, next state and required excitations.
Sequential design
1. Use the problem description to find out the state diagram/ table consisting of present states, inputs, next
states and corresponding outputs.
2. If N is the number of states, at least log2 N flip-flops, rounded up to the next integer, are required. Choose a particular flip-flop type on the basis of
availability, cost, required flexibility etc.
3. Find out the flip-flop inputs that change a current state into next state for each flip-flop. Use flip-flop
excitation tables. The guidelines for combinational logic may be applicable here.
4. Draw circuits and simulate/ implement the circuit using necessary gates.
5. Testing and re-design may be required.
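Steps 2 and 3 can be illustrated with a short Python sketch. The 2-bit up counter used here is a hypothetical example, and the function names are assumptions. The JK excitation table itself is the standard one, with X meaning "don't care".

```python
def flip_flops_needed(num_states):
    """Smallest n such that 2**n codes cover num_states states (at least 1)."""
    return max(1, (num_states - 1).bit_length())

# Standard JK excitation table: (Q, Q_next) -> (J, K), "X" is a don't care.
JK_EXCITATION = {
    (0, 0): ("0", "X"),
    (0, 1): ("1", "X"),
    (1, 0): ("X", "1"),
    (1, 1): ("X", "0"),
}

def excitation_rows(transitions, num_bits):
    """For each (present, next) state pair, list the (J, K) inputs per flip-flop."""
    rows = []
    for present, nxt in transitions:
        row = []
        for bit in reversed(range(num_bits)):   # most significant flip-flop first
            q = (present >> bit) & 1
            q_next = (nxt >> bit) & 1
            row.append(JK_EXCITATION[(q, q_next)])
        rows.append((present, nxt, row))
    return rows

# Hypothetical state table: a 2-bit up counter 0 -> 1 -> 2 -> 3 -> 0.
COUNTER = [(0, 1), (1, 2), (2, 3), (3, 0)]
N_FF = flip_flops_needed(len(COUNTER))
TABLE = excitation_rows(COUNTER, N_FF)

if __name__ == "__main__":
    for present, nxt, jk in TABLE:
        print(f"{present:02b} -> {nxt:02b}  (J1,K1)={jk[0]}  (J0,K0)={jk[1]}")
```

The J and K columns of this table are then treated as outputs of a combinational design, with the present state bits (and any inputs) as its inputs.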
Faster
Small
Low power consuming
High NRE: Non-Recurring Engineering cost may be higher, implying a higher overall cost.
Less flexible: a processor of this type may be useless when required to perform a different task.
Examples:
Timers whose sole purpose is to decrement a loaded value (time) to zero and give a signal.
LED and LCD display drivers that take in certain bit values and compute specific bit patterns suitable for the devices.
Motor control circuits which generate driving signals in response to a command.
The custom single purpose processor is required for a non-standard task, one that a designer needs but that is not commercially available in the market.
Design
A custom single purpose processor is designed to meet a non-standard, specific customer/application need.
Generally a single purpose processor consists of a controller and a datapath.
Controller: A controller consists of the circuitry that controls the actions/functions of the functional units: it determines what
operation to perform by establishing paths between the units and selecting hardware blocks for computations. For
this, it takes in external control signals, generates control signals that act upon the registers and other functional
units, and gives out status and other control signals.
Datapath:
The datapath block consists of the registers, ALUs, interconnection buses and multiplexers that are required for
handling data: moving data between registers and memory, performing computations on data, feeding data to
and taking results from functional units. The operations are carried out on the basis of the control signals
provided by the control unit.
A general procedure to design a custom single purpose processor involves the following steps:
Specification: Before starting any design, the requirements must be clear: what the processor does/ has to do.
Identify inputs and outputs.
Algorithm: The processor processes inputs and gives outputs. The algorithm is about how it does the processing.
A flowchart may also do the job. The processing may require basic arithmetic operations, logical operations or
combinations of these. Develop the algorithm and verify the processes. A control mechanism is required to
carry out those operations in a certain sequence in order to give desired results.
Finite State Machine with Datapath (FSMD): An FSMD is a complex state diagram in which states and arcs may
include arithmetic and logical expressions, which may use external control inputs and outputs as well as
variables. This is also known as the Register Transfer Level. Use the algorithm to construct an FSMD, or construct it
directly if possible. It is more like a flowchart containing the expressions. This diagram shows the number of states
required to perform the task at hand.
Datapath: Use a suitable register for each variable, which may be an input, an output or an intermediate result. For each type of
operation/computation, use a functional block, for example an adder for addition. An ALU has several
functional units, but we are designing a custom processor, not a general purpose one. Define the interconnections
between the registers and functional units.
Finite State Machine (FSM): The FSM is for the controller. Assign the binary codes for each state in the FSMD.
Identify the control signals required by the datapath to carry out the operations in the sequence and manner as
per the specification. There may be inputs, external or generated by the datapath, to the state machine. Based
on these inputs and the current state, the FSM generates these signals as output. For example, if there is a
register load operation in the FSMD, the signal line controlling the load operation is identified in the datapath,
labeled as an output variable of the controller and included in the state diagram.
Controller: The controller design is essentially a sequential circuit design based on the above FSM. Use
combinational and sequential logic elements to design the controller. An excitation table with present state,
inputs, next state and outputs may be helpful at this stage.
Implementation and Testing: After the design is completed, a simulation may help to catch the errors in the
design. Iterative review and simulation can reduce and eliminate the errors. Actual hardware implementation
can be done now. The hardware should be tested before application.
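Before committing to hardware, the FSMD can be simulated cycle by cycle. The sketch below is one possible way to do this in Python for the timer mentioned earlier, which decrements a loaded value to zero and gives a signal. The state names (IDLE, LOAD, CHECK, DEC, DONE), the signal name done and the function name are assumptions made for this illustration.

```python
def run_timer(load_value, max_cycles=1000):
    """Simulate the controller (state) and the datapath (count register) per cycle."""
    state = "IDLE"     # controller state register
    count = 0          # datapath register holding the time value
    done = 0           # output signal given when the count reaches zero
    trace = []
    for _ in range(max_cycles):
        trace.append((state, count, done))
        if state == "IDLE":
            state = "LOAD"
        elif state == "LOAD":
            count = load_value           # control signal: load the register
            state = "CHECK"
        elif state == "CHECK":
            # Status signal from the datapath to the controller: count == 0.
            state = "DONE" if count == 0 else "DEC"
        elif state == "DEC":
            count -= 1                   # control signal: decrement the register
            state = "CHECK"
        elif state == "DONE":
            done = 1                     # give the signal and stop
            break
    return trace, done

if __name__ == "__main__":
    trace, done = run_timer(2)
    for state, count, _ in trace:
        print(state, count)
    print("done =", done)
```

The register assignments inside each state correspond to the datapath operations, and the if/elif chain on state corresponds to the controller FSM. A comparator for count == 0 is the only status input the controller needs in this sketch.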
2.1.3 Optimization
Optimization is the process of maximizing the output/efficiency of a system for the available, often limited, resources.
Once the design phase is completed, the whole process should be reviewed for optimization opportunities. The
custom single purpose processor can be optimized stage-wise as follows:
Original Program/ Algorithm:
The areas of improvement may be in
Size of variable: the size of a variable directly impacts the size of registers and interconnection buses; reduce the
size if possible.
Number of computations: multiple computations may be reduced. A subtraction of the value 1 may be replaced by
a decrement, reducing the number of computations and the complexity. Approximation can also reduce the complexity.
Operations used: multiplication and division hardware is costly; replace these by other operations where possible.
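A few of these algorithm-level ideas can be shown as a short Python sketch. The particular constants (multiplying by 10, dividing by 4) and the function names are assumptions picked for illustration. In hardware, a shift by a constant amount is just wiring, so these forms avoid a multiplier or divider.

```python
def bits_needed(max_value):
    """Register width, in bits, needed to hold values from 0 to max_value."""
    return max(1, max_value.bit_length())

def times_ten(x):
    """Multiply by 10 with two shifts and one addition: 10x = 8x + 2x."""
    return (x << 3) + (x << 1)

def divide_by_four(x):
    """Divide a non-negative integer by 4 with a right shift (result rounded down)."""
    return x >> 2

if __name__ == "__main__":
    print(bits_needed(100))      # width of a register for values 0..100
    print(times_ten(7))
    print(divide_by_four(42))
```

Only constants that decompose into a few powers of two benefit this way. A general multiplication of two variables still needs a multiplier or a multi-cycle shift-and-add scheme.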
The FSMD:
Merge states: Two adjacent states with constants on transitions can be merged, eliminating one of them. If two states have
independent operations, they can be merged into one. Two register-load operations can be
performed in a single state if there are two registers available.
Split states: If a state contains a complex operation, it can be split into states with simpler operations, which
implies simpler and less hardware. For example, instead of adding 4 numbers at once, each can be added one by one to a sum
value initialized to zero.
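The trade-off in splitting a state can be made concrete with a small Python model of the 4-number addition. The adder counts are an assumption of this sketch: it supposes that adding n values in a single state takes n - 1 two-input adders, while the split version reuses one adder with an accumulator register. The function names are hypothetical.

```python
def add_all_at_once(values):
    """One state computes the whole sum; assumed to need len(values) - 1 adders."""
    adders = len(values) - 1
    states = 1
    return sum(values), adders, states

def add_one_by_one(values):
    """One addition per state into an accumulator; a single adder is reused."""
    total = 0                  # sum register initialized to zero
    states = 0
    for v in values:
        total = total + v      # the same adder is used in every state
        states += 1
    return total, 1, states

if __name__ == "__main__":
    data = [3, 5, 7, 9]
    print(add_all_at_once(data))   # (sum, adders, states)
    print(add_one_by_one(data))
```

Both versions give the same sum. The split version uses less hardware at the price of more states, that is, more clock cycles.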
The Datapath:
Reuse of hardware units: It is not necessary to use a separate hardware unit for each operation. If the operations
are the same and are not carried out simultaneously, the hardware unit can be shared. Repetitive and sequential
(non-simultaneous) operations like additions can be performed in a single hardware unit.
Use of multifunctional units: A multifunction unit like an ALU can be used for arithmetic and logical operations, the
function being selected as per requirement. A single register with right and left shift capability can be used for both
operations instead of using two registers.
The controller:
Number of states: the controller involves states and transitions between the states. Its optimization follows
directly from FSM. Similar techniques of state minimization and simplification can be applied.
Figure: Instructions (no 2, no 3 and no 4 labeled) passing through the IF, DE and EX stages against time cycles
2.2.2 Operations:
In order to carry out a task, a processor performs computational and control operations. The operations can be
broadly categorized into two groups:
Datapath operations and
Control operations
Datapath Operations:
The datapath being a data processing unit, carries out arithmetic and logical operations on data and data
movement into and out of the actual computational unit. The following are the main operations carried out by
the datapath:
Load / Read operations:
The data to be processed are loaded into ALU registers, either from memory or from other input registers connected
to sensors and other input modules. Example: MOV operations.
ALU operations:
The ALU can perform many different arithmetic and logical operations. The data are loaded into the processor,
one of the several processing functions is selected using appropriate value of the control lines, and the results
appear on the output register. Examples: ADD, SUB, AND operations. Depending upon the application of the
processor, the operations may be different, but they are basically computational in nature.
Store/ write operations:
The outputs of the computations are written into memory for further computations or loaded into output
registers that interface with the external world through some additional stages. Examples: MOV, PUSH operations.
Controller operations:
The datapath needs control signals in order to carry out its operations. The controller is responsible for
providing these signals based upon the program instruction. In general, the controller repeatedly performs
the following sequence of operations:
Fetch instruction: The controller gets the instruction from the memory address pointed to by the program counter,
which always points to the next instruction address. Once the instruction is fetched, the program counter is
increased by 1 or a jump address is loaded. The fetched opcode is loaded into the instruction register for decoding.
Decode instruction: The value loaded into the instruction register is decoded to find out what the instruction means.
Each value of an opcode is unique and is decoded using logic circuits to generate control signals that activate/
deactivate or enable/disable registers, a function of the ALU etc.
Fetch operands: Once an instruction is decoded, operands are required to operate on. The operand fetch is a
data movement process between datapath registers and memory. The registers and memory addresses are
determined by addressing modes. Some instructions may not require data for their operation, like subroutine
return RET and no operation NOP.
Execute: The execute phase involves passing data to the actual processing unit, like the ALU, and selecting its
function. The processing unit gives the results according to the function selected.
Store results: The output might be required for further computations and needs to be stored in memory or
other registers. So the processed data is moved from the registers in the datapath to the specified memory address
or registers.
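The repeated sequence above can be mimicked in software. The following Python sketch is only a rough model: the instruction format (a tuple of opcode, destination, and two source registers), the register names R0 to R4 and the HALT instruction are all assumptions introduced for the example. ADD, SUB, AND and NOP are the mnemonics mentioned in the text.

```python
def run(program, registers):
    """Repeat the fetch, decode, fetch operands, execute, store results sequence."""
    alu = {
        "ADD": lambda a, b: a + b,
        "SUB": lambda a, b: a - b,
        "AND": lambda a, b: a & b,
    }
    pc = 0                                        # program counter
    while pc < len(program):
        instruction = program[pc]                 # fetch instruction
        pc += 1                                   # PC now points to the next instruction
        opcode, *fields = instruction             # decode instruction
        if opcode == "NOP":                       # requires no operands
            continue
        if opcode == "HALT":                      # hypothetical stop instruction
            break
        dest, src1, src2 = fields
        a, b = registers[src1], registers[src2]   # fetch operands
        result = alu[opcode](a, b)                # execute: select the ALU function
        registers[dest] = result                  # store results
    return registers

if __name__ == "__main__":
    program = [
        ("ADD", "R2", "R0", "R1"),
        ("NOP",),
        ("SUB", "R3", "R2", "R0"),
        ("AND", "R4", "R2", "R1"),
        ("HALT",),
    ]
    print(run(program, {"R0": 6, "R1": 3, "R2": 0, "R3": 0, "R4": 0}))
```

Jump instructions are left out of this sketch. They would replace the pc += 1 update with a load of the jump address.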
The host contains software systems that allow program writing, compiling, assembling and debugging for a specific
type of controller. There may also be an emulator that mimics the target device so that the program can be
tested on a system that behaves like the actual hardware.
The host may also include circuit designing and analysis software packages for hardware design phase. The
output may be a PCB layout file for circuit board fabrication.
Target
The program is developed for a target processor, into which the test program is downloaded or burned. The
target processor runs the program and does some useful work.
Hardware testing and debugging tools:
Before downloading the program: instruction set simulation, emulation
After downloading into the actual device: digital multimeters, oscilloscopes, logic analysers, function generators
IDE (Integrated Development Environment): a software package which provides a source code editor (text editor), cross-compiler, compiler, linker, debugger, emulators, programmer and downloader.
E.g. MPLAB, provided by Microchip
Starts with a project
Choice of processor/controller
Programming
Compiling/assembling
Debugging
Testing on an emulator
Program burning
Reprogrammability: ASIPs can be reprogrammed. The scope of the programs executed by an ASIP may be
limited to particular class of applications, but still the feature of reprogrammability gives flexibility, though
limited, for upgrades and modifications. This may save time and cost.
NRE: ASIPs are not designed for a single specific task; they are designed for a class of tasks. Unlike a single
purpose processor, they can be reprogrammed when the application requirement changes. Thus the cost of
engineering work for producing an ASIP can be distributed, which lowers the overall cost. For the same task, ASIPs
are cheaper than single purpose processors.
Power Consumption: Compared to a GPP, an ASIP may consume less power. An ASIP is designed to execute
specific tasks; it would not contain unnecessary hardware components required by the GPPs in order to
possess generality. Less hardware implies less power.
The importance of ASIPs cannot be underestimated given the increasing use of microprocessor-controlled
systems. The availability of hardware units, including processors, in HDL (Hardware Description Language) form
allows one to implement the processor in ASIP form.
keeping the cost line low is preferred, though some performance trade-offs are required. It is not a good idea
to use a processor whose cost exceeds all the cost of the remaining hardware and software.
Other factors, such as type/version and the number of registers available, may also be used as selection criteria.
2.2.7 General Purpose Processor Design
A general purpose processor, GPP, is characterized by its nature of reprogramming for a wide variety of
applications. A GPP is designed to execute generalized, basic instructions which can be used to write programs
that perform different tasks. A lot of effort is put in the design phase in order to generalize the processor so that
it can be programmed for different situations. The high design cost, i.e. NRE, is acceptable as the GPP is produced
in large numbers, distributing the cost and reducing the price per unit.
The design of a general purpose processor requires the list of all the operations it is to perform. We call this list the instruction set. The following table shows a list of operations for a very simple general purpose processor. There are 7 operations, and they are assigned binary values which will be used to decode an instruction.
Instructions    Binary Value
Load A          001
Load B          010
A OR B          011
A AND B         100
A+B             101
A-B             110
A+1             111
The datapath hardware contains the functional units like adders, logical units, shifters that are required to
perform various operations
The controller fetches instruction from the program memory, decodes and provides appropriate signals to the
datapath. In doing so, the controller accepts various status signals from the datapath which are generated by the
datapath while carrying out the operations like overflow, carry, zero.
FSMD: The instructions that a general purpose processor is to execute and the different states that it goes through are all
summarized in an FSMD. The diagram contains the actual expressions with the unique variables. The designs for
both the datapath and the controller are derived from the FSMD.
Datapath:
The design of the datapath follows from the instruction set. Each different expression required by the
instructions, as shown in the FSMD, is carried out by a separate functional unit. The datapath must contain the
functional units required by the instructions. The operation of a functional unit is activated by control signals
generated by the controller.
Another way is to use a multiplexer to select the output given out by each functional unit after it processes the common inputs.
If A and B are two inputs, all the operations are performed on them: A+B, A-B, A AND B etc. But only
one of the outputs is selected, the selection being based upon the instruction being executed. All functional units operate on the input data
and all outputs are available; the control signals serve only to select one of them.
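The multiplexer approach can be modelled with a short Python sketch using the 3-bit opcodes of the instruction set table. The function name datapath, the data_in input used by the two load instructions and the (A, B, output) return convention are assumptions made for this illustration.

```python
def datapath(opcode, a, b, data_in=0):
    """All functional units compute on A and B; the opcode selects one output."""
    # Every functional unit produces its result regardless of the instruction.
    outputs = {
        0b011: a | b,      # A OR B
        0b100: a & b,      # A AND B
        0b101: a + b,      # A + B
        0b110: a - b,      # A - B
        0b111: a + 1,      # A + 1
    }
    if opcode == 0b001:    # Load A: register A takes the input data
        return data_in, b, None
    if opcode == 0b010:    # Load B: register B takes the input data
        return a, data_in, None
    # The opcode acts as the multiplexer select lines.
    return a, b, outputs[opcode]

if __name__ == "__main__":
    a, b, _ = datapath(0b001, 0, 0, data_in=12)   # Load A
    a, b, _ = datapath(0b010, a, b, data_in=10)   # Load B
    print(datapath(0b101, a, b)[2])               # A + B
    print(datapath(0b100, a, b)[2])               # A AND B
```

Opcode 000 is not assigned in the table, so this sketch simply raises a KeyError for it.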
The number of registers is determined by the nature of the operations defined in the instructions. Usually,
individual registers are required for unique variables contained in the expressions. Extra registers can be added
to facilitate computations.
The datapath uses the control signals to execute the instructions. It is not the concern of datapath design how
the control signals are generated, the concern is what control signals are required. Clearly define the required
control signals.
The interconnection between functional units, the registers and other units should also be defined.
Controller: The controller is a finite state machine that goes from one state to another, and its design follows the
sequential design procedure. The state diagram is different but the procedure is more or less the same. It is
responsible for generating control signals for the datapath and generally cycles through the following states:
FETCH: IR = M[PC], PC = PC + 1
DECODE
EXECUTE (selected by opcode):
001 Load A
010 Load B
011 A OR B: O/P = A OR B
100 A AND B: O/P = A AND B
101 A+B: O/P = A + B
110 A-B: O/P = A - B
111 A+1: O/P = A + 1
Fetch: reads the memory location whose address is contained in the Program Counter, PC, and loads the Instruction
Register, IR, with the contents of that location.
Decode: takes the opcode bits from the IR and decodes what is to be done according to the current instruction
Execute: provides appropriate control signals to datapath which does the computational work.
The controller uses following special purpose registers to hold various data such as memory address, instruction
code, opcode, etc
The Program Counter(PC):
The program counter holds the address of memory location from where the instruction is fetched. The address
is calculated by increasing the previous value by 1. If a branch instruction is encountered, the address value is
loaded as calculated from the branch instruction.
The output of PC is connected to memory address lines.
A control signal is required to update the PC. When this line is activated, either a new address is loaded into the PC or the PC is
incremented.
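The behaviour of the PC described above can be modelled as a small register in Python. This is a sketch under stated assumptions: the class name, the load and address parameter names, the default 8-bit width and the wrap-around at the maximum address are choices made for the example, not details given in the text.

```python
class ProgramCounter:
    """Address register that either increments or loads a branch address."""

    def __init__(self, width=8):
        self.mask = (1 << width) - 1   # assumed address width in bits
        self.value = 0

    def clock(self, load=0, address=0):
        """On each update: load the given address if load is 1, else add 1."""
        if load:
            self.value = address & self.mask          # branch: load the new address
        else:
            self.value = (self.value + 1) & self.mask  # sequential: previous value + 1
        return self.value

if __name__ == "__main__":
    pc = ProgramCounter(width=4)
    print(pc.clock())                     # sequential fetch
    print(pc.clock())
    print(pc.clock(load=1, address=12))   # branch taken
    print(pc.clock())
```

The value returned by clock is what would drive the memory address lines on the next fetch.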
Figure: Controller (control circuit: state machine, IR, PC, address calculator) and datapath (registers, ALU with multiple functional units), with memory, control signals, status values, data to be stored and output data