8 Karatsuba Document
Computation-intensive applications such as DSP, image processing, floating-point processors
and communication technologies today require efficient binary multiplication, which is usually
the most power- and time-consuming block. This paper proposes an efficient design for
unsigned binary multiplication that reduces delay. A 32×32-bit multiplier has been designed
which is based on the Vedic Karatsuba algorithm using reversible logic. The designs have been
coded in Verilog and synthesized in Xilinx Vivado/ISE.
Chapter 1
INTRODUCTION
Multipliers play an important role in today's digital signal processing and various other
applications. With advances in technology, many researchers have tried and are trying to design
multipliers that offer one or more of the following design targets: high speed, low power
consumption, and regularity of layout (and hence less area), or a combination of these in one
multiplier, thus making them suitable for high-speed, low-power and compact VLSI
implementations.
The most common multiplication method is the "add and shift" algorithm. In parallel multipliers,
the number of partial products to be added is the main parameter that determines the performance of
the multiplier. To reduce the number of partial products to be added, the Vedic multiplier is one of
the most popular methods. In this chapter we introduce multiplication algorithms and
architectures and compare them in terms of speed, area, power and combinations of these metrics.
The basic multiplication method is explained below.
Binary multiplication proceeds in the same way as decimal digit multiplication, as shown in
the example below: the partial products are generated with AND gates, and the columns are
summed using adders (half adders and full adders).
An example of 4-bit multiplication method is shown below:
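The add-and-shift method can be sketched as a small Python model (illustrative software, not the hardware):

```python
def shift_and_add_mul(a, b, width=4):
    """Multiply two unsigned `width`-bit numbers by summing
    shifted partial products (the add-and-shift method)."""
    product = 0
    for i in range(width):
        if (b >> i) & 1:          # partial product i: a AND'ed with bit i of b
            product += a << i     # shifted by the bit position (its weight)
    return product

# 4-bit example: 13 x 11 = 143
print(shift_and_add_mul(0b1101, 0b1011))  # 143
```

Each set bit of the multiplier contributes one shifted copy of the multiplicand, exactly as in the column-wise example above.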
Although the method is simple, as can be seen from this example, the addition is done partly
serially and partly in parallel. To improve delay and area, the carry ripple adders (CRAs) are
replaced with carry save adders, in which every carry and sum signal is passed to the adders of
the next stage. The final product is obtained in a final addition by a fast adder (often a carry
ripple adder). In array multiplication we need to add as many partial products as there are
multiplier bits. This arrangement is shown in the figure below.
Fig : array multiplier
In applications like multimedia signal processing and data mining, which can tolerate error, exact
computing units are not always necessary; they can be replaced with their approximate
counterparts. Research on approximate computing for error-tolerant applications is on the rise.
Adders and multipliers form the key components in these applications. In prior work, approximate
full adders have been proposed at the transistor level and utilized in digital signal processing
applications.
IMPLEMENTATION OF WALLACE MULTIPLIER
The Wallace tree has three steps:
1. Multiply (that is, AND) each bit of one of the arguments by each bit of the other, yielding
the partial-product wires. Depending on the position of the multiplied bits, the wires carry
different weights; for example, the wire carrying the product of bits a2 and b3 has weight 32.
2. Reduce the number of partial products to two by layers of full and half adders.
3. Group the wires into two numbers, and add them with a conventional adder.
The second phase works as follows: as long as there are three or more wires with the same weight, add a
further layer.
Take any three wires with the same weight and input them into a full adder. The result will be
an output wire of the same weight and an output wire of the next higher weight for each three input
wires.
If there are two wires of the same weight left, input them into a half adder.
If there is just one wire left, connect it to the next layer.
a) Steps involved in the Wallace tree multiplier algorithm:
Multiply (that is, AND) each bit of one of the arguments by each bit of the other, yielding N×N
results. Depending on the position of the multiplied bits, the wires carry different weights.
Reduce the number of partial products to two by layers of full and half adders.
Group the wires into two numbers, and add them with a conventional adder.
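The steps above can be sketched as a Python model (an illustrative software sketch, not the proposed hardware; for brevity the half-adder case is folded into the final two-row addition):

```python
def wallace_multiply(a, b, n=4):
    """Model of a Wallace tree: AND-gate partial products grouped by
    weight, reduced 3 -> 2 with full adders, then a final two-row add."""
    # Step 1: partial products; wires[w] holds the 1-bit wires of weight 2**w
    wires = {w: [] for w in range(2 * n)}
    for i in range(n):
        for j in range(n):
            wires[i + j].append((a >> i) & (b >> j) & 1)
    # Step 2: while any column has three or more wires, add a full-adder layer
    changed = True
    while changed:
        changed = False
        for w in range(2 * n - 1):
            col = wires[w]
            if len(col) >= 3:                       # full adder: 3 wires in
                x, y, z = col.pop(), col.pop(), col.pop()
                col.append(x ^ y ^ z)               # sum wire, same weight
                wires[w + 1].append((x & y) | (y & z) | (x & z))  # carry wire
                changed = True
    # Step 3: at most two wires per column remain; add them conventionally
    return sum(bit << w for w, col in wires.items() for bit in col)
```

Each full adder preserves the weighted sum (s + 2c = x + y + z), so the reduction never changes the product value.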
Fig-2 Product terms generated by a collection of AND gates
The ripple carry adder is the method used when several additions must be performed, with the
carry-ins and carry-outs chained together; thus multiple full adders are used in a ripple
carry adder. It is possible to create a logical circuit using several full adders to add multi-bit
numbers. Each full adder inputs a Cin, which is the Cout of the previous adder. This kind of
adder is called a ripple carry adder, since each carry bit "ripples" to the next full adder. The proposed
architecture of the Wallace multiplier algorithm using the RCA is shown in the figure.
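A bit-level Python model of the ripple-carry chain (illustrative, not the RTL):

```python
def ripple_carry_add(a, b, n=8, cin=0):
    """n-bit ripple-carry adder: each full adder's carry-out feeds
    the next stage's carry-in."""
    s, carry = 0, cin
    for i in range(n):
        x = (a >> i) & 1
        y = (b >> i) & 1
        s |= (x ^ y ^ carry) << i                    # sum bit of stage i
        carry = (x & y) | (x & carry) | (y & carry)  # ripples to stage i+1
    return s, carry

print(ripple_carry_add(0b1011, 0b0110, n=4))  # (1, 1): sum 0001, carry-out 1
```

The loop makes the linear carry dependency explicit: stage i cannot finish until stage i-1 has produced its carry.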
Take any three wires with the same weight and give them as inputs to a full adder. The result will
be an output wire of the same weight (and a carry wire of the next higher weight).
The partial products obtained after multiplication are taken at the first stage. The data are taken
three wires at a time and added using adders, and the carry of each stage is added with the next two
data wires in the same stage.
The partial products are reduced to two rows by layers of full adders with the same procedure.
At the final stage, the ripple carry adder method is applied and the product
terms p1 to p8 are obtained.
Fig :4x4 WALLACE Multiplier
Chapter 2
LITERATURE REVIEW
Vijay Kumar Reddy, "Modified High Speed Vedic Multiplier Design and Implementation": The
proposed research work specifies a modified version of the binary Vedic multiplier using the Vedic
sutras of ancient Vedic mathematics. It modifies a previously implemented Vedic
multiplier. The modified binary Vedic multiplier has shown improvement in
terms of time delay and also device utilization. The proposed technique was designed and
implemented in Verilog HDL. For HDL simulation, the ModelSim tool is used and for circuit
synthesis, Xilinx is used. The simulation has been done for 4-bit, 8-bit and 16-bit multiplication
operations; simulation results are shown only for the 16-bit binary Vedic multiplier. This
modified multiplication technique can be extended to larger sizes. The outcomes of this
multiplication technique are compared with existing Vedic multiplier techniques.
Table 1
It can be observed from Table I that the third-product computation of (n/2 + 1) bits requires one
multiplication of (n/2 - 1) bits and additional shifting, adding and multiplexing
operations, instead of an (n/2 + 1)-bit multiplier at every stage. This makes the Karatsuba
implementation recursive, without additional hardware. Fig. 2 shows the adaptive concept
for the (n/2 + 1)-bit computation at a stage where the inputs are n bits; the 'Shift and Add'
block applies in the appropriate cases as listed in Table I.
Fig. 2. Adaptive Concept for 3rd Product Computation
The conventional Wallace and Dadda multipliers use 3:2 compressors for carrying out the addition of
the generated partial products. The analyses show that the proposed multiplier based on the
adaptive, recursive Karatsuba approach has a smaller delay than the conventional Wallace
and Dadda multipliers.
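The recursive three-product structure of Karatsuba can be sketched in Python. This is an illustrative software model of the algorithm, not the proposed reversible hardware, and the 4-bit base case is an assumption of the sketch (the hardware would use a small Vedic multiplier block there):

```python
def karatsuba(x, y, n=32):
    """Recursive Karatsuba: one n-bit product from three roughly
    (n/2)-bit products plus shift and add operations."""
    if n <= 4:                        # small base case (assumed for the sketch)
        return x * y
    h = n // 2
    mask = (1 << h) - 1
    xl, xh = x & mask, x >> h         # split operands into halves
    yl, yh = y & mask, y >> h
    p1 = karatsuba(xh, yh, h)                  # product of high halves
    p2 = karatsuba(xl, yl, h)                  # product of low halves
    p3 = karatsuba(xl + xh, yl + yh, h + 1)    # third product, (h+1)-bit operands
    # (xl+xh)(yl+yh) - p1 - p2 = xh*yl + xl*yh, the middle term
    return (p1 << (2 * h)) + ((p3 - p1 - p2) << h) + p2
```

The third-product call on (h+1)-bit operands is exactly the case the adaptive 'Shift and Add' concept of Fig. 2 targets, so the recursion never needs a wider multiplier than n/2 bits.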
CARRY SELECT ADDERS
The performance of the Karatsuba multiplier has been further improved by the use of high-speed
parallel adders at different stages. The optimizations that have been carried out
are as follows:
Additions involving 16 bits and above are carried out using the proposed fast adders.
The final addition in the third-product computation is carried out using carry save adders, with the
final stage involving the fast adders.
Carry Select Adder
In electronics, a carry-select adder is a particular way to implement an adder, a logic element
with reduced delay that computes the sum of two numbers. The carry-select adder generally consists of ripple-
carry adders and a multiplexer. Adding two n-bit numbers with a carry-select adder is done with
two adders (therefore two ripple-carry adders), in order to perform the calculation twice: one
time with the assumption that the carry-in is zero and the other assuming it is one. After
the two results are calculated, the correct sum, as well as the correct carry-out, is then selected
with the multiplexer once the actual carry-in is known.
A ripple carry adder generates its carry-out bit by the rippling of the carry generated in
the previous stage after receiving the carry-in (Cin) bit. Hence the speed of the RCA is low, as
the Cout of a stage is dependent on the Cout of the previous stage, i.e. the Cin of the current stage. This
linear dependency of Cout on Cin is overcome by assuming the possible values of Cin
to be '0' and '1'. By using these possible values of Cin, the partial sum (S) and Cout are
generated in parallel. Then a multiplexer is employed to choose the final sum and carry based
on the actual carry input received. Though this feature helps in reducing the computation
delay, the area efficiency suffers due to the use of the redundant adder circuit. Thus a carry select
adder consists of two RCA blocks. In this project, to reduce the delay further, we propose the square
root carry select adder.
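The select mechanism can be modeled in Python (an illustrative sketch, not the RTL): both candidate sums are formed up front and the real carry-in only drives a multiplexer.

```python
def carry_select_block(a, b, n=4):
    """One carry-select block: compute the n-bit sum twice (cin = 0 and
    cin = 1) in parallel, then mux on the actual carry-in."""
    sum0 = a + b            # precomputed assuming carry-in = 0
    sum1 = a + b + 1        # precomputed assuming carry-in = 1
    def select(cin):        # the multiplexer, driven by the real carry-in
        total = sum1 if cin else sum0
        return total & ((1 << n) - 1), total >> n   # (sum bits, carry-out)
    return select

sel = carry_select_block(0b1010, 0b0101, n=4)
print(sel(0))  # (15, 0): 10 + 5 = 15, no carry-out
print(sel(1))  # (0, 1):  10 + 5 + 1 = 16, sum 0000, carry-out 1
```

The redundant second adder is the area cost the text describes; the payoff is that only the mux delay, not a full ripple, sits on the carry path.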
Square Root Carry Select Adder
In the Square Root Carry Select Adder (SRCSA) [9]-[10], the block size can be variable. The
complete analysis is omitted here for brevity, but, for example, a 16-bit adder can be created using
block sizes of 2-2-3-4-5 instead of a uniform block size of four (as done before) [8]. This
break-up is ideal when the full-adder delay is equal to the MUX delay. Fig. 3 shows the block
diagram of the proposed SRCSA adder for 16 bits, where the inputs are A and B, the carry-in is denoted
Cin, and the outputs are denoted sum (S) and carry-out (Cout).
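The variable block partition can be sketched in Python (illustrative model; the 2-2-3-4-5 widths follow the example above, and in hardware each block's two sums are computed in parallel):

```python
# Square-root carry-select sketch: 16 bits split into blocks of widths
# 2-2-3-4-5 (LSB block first); the carry propagates only through the
# block-level multiplexers, not through every full adder.
BLOCKS = [2, 2, 3, 4, 5]
assert sum(BLOCKS) == 16

def srcsa_add(a, b, cin=0):
    result, shift, carry = 0, 0, cin
    for w in BLOCKS:
        mask = (1 << w) - 1
        x, y = (a >> shift) & mask, (b >> shift) & mask
        s0, s1 = x + y, x + y + 1        # both cases, in parallel in hardware
        total = s1 if carry else s0      # block multiplexer
        result |= (total & mask) << shift
        carry = total >> w               # selects in the next block
        shift += w
    return result, carry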
REVERSIBLE GATES
Reversible logic has received great attention in recent years due to its ability to
reduce power dissipation, which is the main requirement in low-power VLSI design. It has
wide applications in low-power CMOS, optical information processing, DNA computing,
quantum computation and nanotechnology. Irreversible hardware computation results in energy
dissipation due to information loss. According to Landauer's research, the amount of energy
dissipated for every irreversible bit operation is at least KT ln 2 joules, where
K = 1.3806505 × 10^-23 J/K is the Boltzmann constant and T is the temperature at which the
operation is performed. The heat generated by the loss of one bit of information is very
small at room temperature, but when the number of bits is large, as in high-speed
computational work, the heat dissipated becomes so large that it affects the
performance and reduces the lifetime of the components. In 1973, Bennett
showed that KT ln 2 of energy need not be dissipated by a system as long as the system allows the
reproduction of the inputs from the observed outputs. Reversible logic supports the process of
running the system both forward and backward. This means that reversible computations can
regenerate inputs from outputs and can stop and go back to any point in the computation history.
A circuit is said to be reversible if the input vector can be uniquely recovered from the output
vector, i.e. there is a one-to-one correspondence between its input and output assignments:
not only can the outputs be uniquely determined from the inputs, but the inputs can also be
recovered from the outputs. Energy dissipation can be reduced or even eliminated if
computation becomes information-lossless.
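For scale, the Landauer limit at room temperature can be evaluated directly (this sketch uses the current exact SI value of the Boltzmann constant, which differs slightly from the older value quoted above):

```python
import math

k_B = 1.380649e-23     # Boltzmann constant, J/K (exact SI value since 2019)
T = 300.0              # room temperature in kelvin

# Landauer limit: minimum energy dissipated per irreversible bit erasure
E_bit = k_B * T * math.log(2)
print(f"{E_bit:.3e} J per bit")   # about 2.9e-21 J
```

A single erasure dissipates only about 3 zeptojoules, which is why the effect matters only when billions of bit operations per second accumulate, as the text describes.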
THE CONCEPT
Reversibility in computing implies that no information about the computational states is ever
lost, so we can recover any earlier state by computing backwards, or un-computing, the
results. This is termed logical reversibility. The benefits of logical reversibility can be
gained only by also employing physical reversibility: a process that
dissipates no energy to heat. Absolutely perfect physical reversibility is practically
unachievable. Computing systems give off heat when voltage levels change from positive to
negative, i.e. when bits flip from zero to one; most of the energy needed to make that change is
given off in the form of heat. Rather than changing voltages to new levels, reversible circuit
elements gradually move charge from one node to the next. This way, one can expect to lose only a
minute amount of energy on each transition. Reversible computing strongly affects digital
logic design: reversible logic elements are needed to recover the state of the inputs from the
outputs. It will impact instruction sets and high-level programming languages as well;
eventually, these will also have to be reversible to provide optimal efficiency.
The number of reversible gates (N): the number of reversible gates used in the circuit.
The number of constant inputs (CI): the number of inputs that must be
held constant at either 0 or 1 in order to synthesize the given logical function.
The number of garbage outputs (GO): the number of unused outputs present
in a reversible logic circuit. The garbage outputs cannot be avoided, as they are essential to
achieve reversibility.
Quantum cost (QC): the cost of the circuit in terms of the cost of primitive
gates. It is calculated from the number of primitive reversible logic gates (1×1 or 2×2)
required to realize the circuit.
BASIC REVERSIBLE LOGIC GATES
Feynman Gate
The Feynman gate is a 2×2 one-through reversible gate, as shown in Figure 1. The input vector is
I(A, B) and the output vector is O(P, Q). The outputs are defined by P = A, Q = A xor B.
The quantum cost of a Feynman gate is 1. The Feynman gate (FG) can be used as a copying gate:
since fan-out is not allowed in reversible logic, this gate is useful for duplicating the
required outputs.
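The Feynman gate's behavior can be modeled as a small Python truth-table sketch (illustrative only):

```python
def feynman(a, b):
    """Feynman (CNOT) gate: P = A, Q = A xor B; with B = 0 it copies A."""
    return a, a ^ b

# Copying: tying B to 0 duplicates A without fan-out
assert feynman(1, 0) == (1, 1)
assert feynman(0, 0) == (0, 0)

# Reversibility: applying the gate twice recovers the original inputs
p, q = feynman(1, 1)              # gives (1, 0)
assert feynman(p, q) == (1, 1)
```

The gate is its own inverse, which is the simplest possible demonstration of recovering inputs from outputs.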
Toffoli Gate:
Fig 3 shows a 3*3 Toffoli gate. The input vector is I (A, B, C) and the output vector is
O(P,Q,R). The outputs are defined by P=A, Q=B, R=AB xor C. Quantum cost of a Toffoli gate
is 5.
Fredkin Gate
Fig 4 shows a 3*3 Fredkin gate. The input vector is I (A, B, C) and the output vector is
O(P, Q, R). The outputs are defined by P = A, Q = A′B xor AC and R = A′C xor AB. The quantum cost of a
Fredkin gate is 5.
Peres Gate
Fig 5 shows a 3×3 Peres gate. The input vector is I(A, B, C) and the output vector is
O(P, Q, R). The outputs are defined by P = A, Q = A xor B and R = AB xor C. The quantum cost of a Peres
gate is 4. In the proposed design the Peres gate is used because of its low quantum cost. It can be
verified that the input pattern corresponding to a particular output pattern can be uniquely
determined. The TSG gate is capable of implementing all Boolean functions and can also work
singly as a reversible full adder.
Sayem gate
The Sayem gate (SG) is a one-through 4×4 reversible gate. The input and output vectors of this gate are Iv =
(A, B, C, D) and Ov = (A, A′B xor AC, A′B xor AC xor D, AB xor A′C xor D). The block
diagram of this gate is shown in Fig 9.
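The defining property of the gates above, that inputs are uniquely recoverable from outputs, can be checked exhaustively with a small Python model: a gate is reversible exactly when its truth table is a bijection.

```python
from itertools import product

def toffoli(a, b, c):   # P = A, Q = B, R = AB xor C
    return a, b, (a & b) ^ c

def fredkin(a, b, c):   # P = A, Q = A'B xor AC, R = A'C xor AB
    na = 1 - a
    return a, (na & b) ^ (a & c), (na & c) ^ (a & b)

def peres(a, b, c):     # P = A, Q = A xor B, R = AB xor C
    return a, a ^ b, (a & b) ^ c

# Reversibility check: all 8 input patterns must map to 8 distinct outputs
for gate in (toffoli, fredkin, peres):
    outputs = {gate(*bits) for bits in product((0, 1), repeat=3)}
    assert len(outputs) == 8, gate.__name__
```

No two input patterns collide on the same output, so each gate's mapping can be run backwards, which is precisely the one-to-one correspondence required of reversible logic.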
APPLICATIONS
Reversible computing may have applications in computer security and transaction
processing, but the main long-term benefit will be felt in those areas which require
high energy efficiency, speed and performance.
We have presented an approach to realize multipurpose binary reversible gates. Such
gates can be used in regular circuits realizing Boolean functions. In the same way it is possible
to construct multiple-valued reversible gates having similar properties. The proposed Vedic
multiplier with the Karatsuba algorithm has applications in digital circuits such as DSP
applications, DIP applications, and in building reversible ALUs, reversible processors, etc. This work
forms an important step toward building large and complex reversible VLSI designs.
A carry look-ahead adder is used for incrementing the address; in this adder the sub-blocks
are AND gates, OR gates and XOR gates.
SYNTAX:
module module_name (port list);
port declarations
function declarations
endmodule
module - signifies the beginning of a module definition.
endmodule - signifies the end of a module definition.
IDENTIFIERS:
Any program requires blocks of statements, signals, etc., to be identified with an
attached nametag. Such nametags are identifiers.
There are some restrictions on identifier names. Any character of the alphabet, or an
underscore, can be used as the first character. Subsequent characters can be alphanumeric,
the underscore (_), or the dollar sign ($).
For example:
name, _name, Name, name1, name_$ . . . all these are allowed as identifiers.
name aa - not allowed as an identifier because of the blank ("name" and "aa" are interpreted as
two different identifiers).
$name - not allowed as an identifier because of the presence of "$" as the first character.
1_name - not allowed as an identifier, since the numeral "1" is the first character.
@name - not allowed as an identifier because of the presence of the character "@".
a+b - not allowed as an identifier because of the presence of the character "+".
An alternative format makes it possible to use any of the printable ASCII characters
in an identifier. Such identifiers are called “escaped identifiers”; they have to start with the
backslash (\) character. The character set between the first backslash character and the first white
space encountered is treated as an identifier. The backslash itself is not treated as a character of
the identifier concerned.
Examples
\b=c
\control-signal
\&logic
\abc // Here the combination “abc” forms the identifier.
WHITE SPACE CHARACTERS
Blanks (\b), tabs (\t), newlines (\n), and form feeds form the white space characters in
Verilog. In any design description the white space characters are included to improve readability.
COMMENTS
It is a healthy practice to comment a design description liberally. A single-line
comment begins with "//" and ends with the newline; a multi-line (block) comment starts with "/*"
and ends with "*/".
PORT DECLARATION:
A Verilog module declaration begins with the keyword "module" and ends
with "endmodule". The input and output ports are the signals by which the module communicates
with other modules.
Syntax:
input identifier, ..., identifier;
output identifier, ..., identifier;
inout identifier, ..., identifier;
input [msb:lsb] identifier, ..., identifier;
output [msb:lsb] identifier, ..., identifier;
inout [msb:lsb] identifier, ..., identifier;
LOGIC SYSTEM:
Verilog uses a 4-value logic system: a 1-bit signal can take one of only four possible values.
0 - logic 0, or false
1 - logic 1, or true
x - an unknown logic value
z - high impedance
5.3 OPERATORS
Arithmetic Operators:
These perform arithmetic operations. The + and - can be used as either unary (-z) or binary (x-y)
operators.
+ (addition)
- (subtraction)
* (multiplication)
/ (division)
% (modulus)
Relational Operators:
Relational operators compare two operands and return a single bit, 1 or 0. These operators
synthesize into comparators.
< (less than)
<= (less than or equal to)
> (greater than)
>= (greater than or equal to)
== (equal to)
!= (not equal to)
Bit-wise Operators:
Bit-wise operators do a bit-by-bit comparison between two operands. However see “Reduction
Operators” .
~ (bitwise NOT)
& (bitwise AND)
| (bitwise OR)
^ (bitwise XOR)
~^ or ^~ (bitwise XNOR)
Logical Operators:
Logical operators return a single bit 1 or 0. They are the same as bit-wise operators
only for single bit operands. They can work on expressions, integers or groups of bits, and treat
all values that are nonzero as “1”. Logical operators are typically used in conditional (if ... else)
statements since they work with expressions.
! (logical NOT)
&& (logical AND)
|| (logical OR)
Reduction Operators
Reduction operators operate on all the bits of an operand vector and return a single-bit
value. These are the unary (one argument) form of the bit-wise operators above.
& (reduction AND)
| (reduction OR)
~& (reduction NAND)
~| (reduction NOR)
^ (reduction XOR)
~^ or ^~ (reduction XNOR)
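As an illustrative model (Python, not Verilog), each reduction operator collapses all the bits of a vector into a single bit:

```python
from functools import reduce

def bits(value, width):
    """The individual bits of a `width`-bit vector, LSB first."""
    return [(value >> i) & 1 for i in range(width)]

v = 0b1011  # a 4-bit operand

red_and  = reduce(lambda x, y: x & y, bits(v, 4))   # models &v  -> 0
red_or   = reduce(lambda x, y: x | y, bits(v, 4))   # models |v  -> 1
red_xor  = reduce(lambda x, y: x ^ y, bits(v, 4))   # models ^v  -> 1 (odd parity)
red_nand = 1 - red_and                              # models ~&v -> 1

print(red_and, red_or, red_xor, red_nand)  # 0 1 1 1
```

Reduction XOR computing the parity of a vector is the most common practical use of these operators.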
Shift Operators
Shift operators shift the first operand by the number of bits specified by the second operand.
Vacated positions are filled with zeros for both left and right shifts (There is no sign extension).
<< (shift left)
>> (shift right)
Concatenation Operator:
The concatenation operator combines two or more operands to form a larger vector
{} (concatenation)
Replication Operator:
The replication operator makes multiple copies of an item.
{n{item}} (n fold replication of an item)
Literals:
Literals are constant-valued operands that can be used in Verilog expressions. The two
common Verilog literals are:
(a) String: A string literal is a one-dimensional array of characters enclosed in double quotes(““).
(b) Numeric: constant numbers specified in binary, octal, decimal or hexadecimal.
Number Syntax
n’Fddd..., where
n - integer representing number of bits
F - one of four possible base formats:
b (binary), o (octal), d (decimal), h (hexadecimal). The default is d.
Literals written without a size indicator default to 32 bits or the word width used by the
simulator program. This may cause errors, so we should be careful with unsized literals.
NET:
Verilog actually has two classes of signals
1. nets.
2. variables.
Nets represent connections between hardware elements. Just as in real circuits, nets have values
continuously driven on them by the outputs of devices that they are connected to.
The default net type is wire; any signal name that appears in a module's input/output list, but not
in a net declaration, is assumed to be of type wire.
Nets are one-bit values by default unless they are declared explicitly as vectors. The terms wire
and net are often used interchangeably.
Note that net is not a keyword but represents a class of data types such as wire, wand, wor, tri,
triand, trior, trireg, etc. The wire declaration is used most frequently.
The syntax of verilog net declaration is similar to an input/output declaration.
Syntax:
wire identifier, ..., identifier;
wire [msb:lsb] identifier, ..., identifier;
tri identifier, ..., identifier;
tri [msb:lsb] identifier, ..., identifier;
The keyword tri has a function identical to that of wire. When a net is driven by more than one
tri-state gate, it is declared as tri rather than as wire. The distinction is for better clarity.
VARIABLE:
Verilog variables store values during program execution, and they need not have
physical significance in the circuit.
They are used only in procedural code (i.e., behavioral design). A variable value can be used in an
expression and can be combined and assigned to other variables, as in a conventional software
programming language.
The most commonly used variable types are reg and integer.
Syntax:
reg identifier, ..., identifier;
reg [msb:lsb] identifier, ..., identifier;
integer identifier, ..., identifier;
A register variable is a single bit or a vector of bits; the value of a 1-bit reg variable is always
0, 1, x or z. The main use of reg variables is to store values in Verilog procedural code.
An integer variable holds a 32-bit or larger integer, depending on the word
length used by the simulator. An integer variable is typically used to control a repetitive
statement, such as a loop, in Verilog procedural code.
PARAMETER:
Verilog provides a facility for defining named constants within a module, to improve the
readability and maintainability of the code. The parameter declaration is:
Syntax:
parameter identifier = value;
parameter identifier = value,
          ...
          identifier = value;
An identifier is assigned a constant value that will be used in place of the identifier
throughout the current module.
Multiple constants can be defined in a single parameter declaration using a comma-separated
list of arguments.
The value in a parameter declaration can be a simple constant, or it can be a constant expression:
an expression involving multiple operators and constants, including other parameters, that yields
a constant result at compile time. The scope of a parameter is limited to the module in which it is
defined.
ARRAYS:
Arrays are allowed in Verilog for reg, integer, time, and vector register data types.
Arrays are not allowed for real variables. Arrays are accessed by <array_name> [<subscript>].
Multidimensional arrays are not permitted in Verilog.
Syntax:
reg identifier [start:end];
reg [msb:lsb] identifier [start:end];
integer identifier [start:end];
Example: integer count [0:7]; // an array of eight count variables
5.4 DATAFLOW DESIGN ELEMENTS:
The continuous assignment statement allows a combinational circuit to be described in terms of
the flow of data and operations on the circuit. This style is called "dataflow design" or
"dataflow description". The basic syntax of a continuous-assignment statement in Verilog is:
Syntax:
assign net-name = expression;
assign net-name[bit-index] = expression;
assign net-name[msb:lsb] = expression;
assign net-concatenation = expression;
"assign" is the keyword carrying out the assignment operation. This type of assignment is called
a continuous assignment.
The keyword "assign" is followed by the name of a net, then an "=" sign and finally an expression
giving the value to be assigned.
If a module contains the two statements "assign X = Y" and "assign Y = ~X", then the simulation will
loop "forever" (until the simulation times out).
For example:
assign c = a && b;
a and b are operands, typically single-bit logic variables.
"&&" is the logical AND operator; on the single-bit
operands a and b it performs an AND operation.
"=" denotes the assignment activity carried out.
c is a net representing the signal which is the result of the assignment.
5.5 STRUCTURAL DESIGN (OR) GATE-LEVEL MODELING
A structural design is a series of concurrent statements. The most important concurrent
statements in a module are instance statements, continuous-assignment statements and
always blocks. These give rise to three distinct styles of circuit design and description.
Statements of these types, and the corresponding design styles, can be freely intermixed within a
Verilog module declaration.
Each concurrent statement in a Verilog module "executes" simultaneously with the other statements
in the same module declaration.
If the last statement in a Verilog module updates a signal that is used by the first statement, then
the simulator goes back to that first statement and updates its result.
In fact, the simulator keeps propagating changes and updating results until the simulated circuit
stabilizes.
In the structural design style, individual gates and other components
are instantiated and connected to each other using nets.
Verilog has several built-in gate types; the names of these gates are reserved words, some of
which are listed below.
Syntax of Verilog instance statements:
component_name instance_identifier (expression, ..., expression);
component_name instance_identifier (.port_name(expression),
                                    ...
                                    .port_name(expression));
Basic gate primitives in Verilog with details:
Table 5.1(a)
5.6 BEHAVIORAL MODELING
Behavioral level modeling constitutes design description at an abstract level. One can
visualize the circuit in terms of its key modular functions and their behavior. The constructs
available in behavioral modeling aim at the system level description. Here direct description of
the design is not a primary consideration in the Verilog standard. Rather, flexibility and
versatility in describing the design are in focus [IEEE].
Verilog provides designers the ability to describe design functionality in an algorithmic
manner. In other words, the designer describes the behavior of the circuit. Thus, behavioral
modeling represents the circuit at a very high level of abstraction. Design at this level resembles
C programming more than it resembles digital circuit design. Behavioral Verilog constructs are
similar to C language constructs in many ways.
Structured Procedures:
There are two structured procedure statements in Verilog: always and initial. These
statements are the two most basic statements in behavioral modeling. All other behavioral
statements can appear only inside these structured procedure statements.
Verilog is a concurrent programming language, unlike the C programming language, which is
sequential in nature. Activity flows in Verilog run in parallel rather than in sequence. Each
always and initial statement represents a separate activity flow, and each activity flow
starts at simulation time 0. The statements always and initial cannot be nested. The fundamental
difference between the two statements is explained in the following sections.
Always:
The key element of Verilog behavioral design is the always block, which contains one
or more procedural statements.
Another type of procedural statement is the begin-end block, which groups statements together.
Procedural statements in an always block execute sequentially; the always block itself executes
concurrently with the other concurrent statements in the same module.
Syntax:
1) always @(signal_name, ..., signal_name)
     procedural statement
2) always procedural statements
In the first form of the always block, the @ sign and the parenthesized list of signal names are
called the "sensitivity list".
A Verilog concurrent statement such as an always block is either executing or suspended.
A concurrent statement is initially in the suspended state; when any signal in its sensitivity
list changes value, it resumes execution, starting with its first procedural statement and
continuing until the end.
A properly written concurrent statement will suspend after one or more executions. However, it is
possible to write a statement that never suspends (e.g. "assign X = ~X"): since X changes on every
pass, the statement executes forever in zero simulation time (which is not useful).
As shown in the second form of the syntax, the sensitivity list in an always block is optional. An
always block without a sensitivity list starts running at zero simulation time and keeps looping forever.
There are different types of procedural statements that can appear within an always block. They
are blocking assignment statements, non-blocking assignment statements, begin-end blocks, if,
case, while and repeat.
IF AND IF-ELSE BLOCK:
The IF construct checks a specific condition and decides execution based on the result.
Figure shows the structure of a segment of a module with an IF statement. After execution of
assignment1, the condition specified is checked. If it is satisfied, assignment2 is executed; if not,
it is skipped. In either case the execution continues through assignment3, assignment4, etc.
Execution of assignment2 alone is dependent on the condition. The rest of the sequence remains.
The flowchart equivalent of the execution is shown in Figure.
Syntax:
...
assignment1;
if (condition)
  assignment2;
assignment3;
assignment4;
...
After the execution of assignment1, if the condition is satisfied, alternative 1 is followed:
assignment2 and assignment3 are executed, assignment4 and assignment5 are skipped, and
execution proceeds with assignment6.
If the condition is not satisfied, assignment2 and assignment3 are skipped and assignment4 and
assignment5 are executed. Then execution continues with assignment6.
For Loops
Similar to for loops in C/C++, they are used to repeatedly execute a statement or block of
statements. If the loop contains only one statement, the begin ... end statements may be omitted.
Syntax:
for (count = value1; count </<=/>/>= value2; count = count +/- step)
begin
  ... statements ...
end
While Loops:
The while loop repeatedly executes a statement or block of statements until the expression
in the while statement evaluates to false. To avoid combinational feedback during synthesis, a
while loop must be broken with an @(posedge/negedge clock) statement. For simulation, a delay
inside the loop will suffice. If the loop contains only one statement, the begin ... end statements
may be omitted.
Syntax:
while (expression)
begin
... statements ...
end
CASE:
The case statement allows a multipath branch based on comparing the expression with a
list of case choices. Statements in the default block execute when none of the case-choice
comparisons are true. With no default, if no comparisons are true, synthesizers will generate
unwanted latches. Good practice is to make a habit of putting in a default whether you need it
or not.
If the defaults are don't-cares, define them as 'x' and the logic minimizer will treat them
as don't-cares and save area. Case choices may be a simple constant, an expression, or a comma-
separated list of the same.
Syntax
case (expression)
case_choice1:
begin
... statements ...
end
case_choice2:
begin
... statements ...
end
... more case choices blocks ...
default:
begin
... statements ...
end
endcase
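A compact example of case with a don't-care default (a sketch; the ALU and its opcodes are illustrative, not part of the report's design):

```verilog
// Small ALU selected by a 2-bit opcode; the x default avoids a latch
// and lets the logic minimizer save area
module alu2 (input wire [1:0] op, input wire [3:0] a, b, output reg [3:0] y);
  always @(*) begin
    case (op)
      2'b00: y = a + b;
      2'b01: y = a - b;
      2'b10: y = a & b;
      default: y = 4'bxxxx;  // don't-care output for unused opcodes
    endcase
  end
endmodule
```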
casex:
In casex(a), the case-choice constants may contain z, x or ?, all of which are treated as
don't-cares in the comparison. With case, the corresponding simulation variable would have to
match a tri-state, unknown, or either signal exactly. In short, case uses x to compare with an
unknown signal, while casex uses x as a don't-care, which can be used to minimize logic.
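A typical use of casex is a priority encoder, where the low-order request bits are don't-cares (a sketch with illustrative names):

```verilog
// 4-input priority encoder: x in the choices marks don't-care bits
module prio_enc (input wire [3:0] req, output reg [1:0] grant);
  always @(*) begin
    casex (req)
      4'b1xxx: grant = 2'd3;  // highest-priority request wins
      4'b01xx: grant = 2'd2;
      4'b001x: grant = 2'd1;
      default: grant = 2'd0;
    endcase
  end
endmodule
```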
Casez:
Casez is the same as casex except that only ? and z (not x) are treated as don't-cares in the
case-choice constants. Casez is favored over casex because, in simulation, an inadvertent x
signal will not be matched by a 0 or 1 in the case choice.
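The same kind of priority encoder written with casez, using ? for the don't-care bits (a sketch; names are illustrative):

```verilog
// Priority encoder with casez: only ? and z are don't-cares, so an
// inadvertent x on req will not falsely match a 0/1 in a choice
module prio_encz (input wire [3:0] req, output reg [1:0] grant);
  always @(*) begin
    casez (req)
      4'b1???: grant = 2'd3;
      4'b01??: grant = 2'd2;
      4'b001?: grant = 2'd1;
      default: grant = 2'd0;
    endcase
  end
endmodule
```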
FOREVER LOOPS
The forever statement executes an infinite loop of a statement or block of statements. To
avoid combinational feedback during synthesis, a forever loop must be broken with
an @(posedge/negedge clock) statement. For simulation, a delay inside the loop will suffice. If the
loop contains only one statement, the begin ... end statements may be omitted.
Syntax
forever
begin
... statements ...
end
Example:
forever begin
@(posedge clk); // or use a= #9 a+1;
a = a + 1;
end
REPEAT:
The repeat statement executes a statement or block of statements a fixed number of times.
The repeat count can be a number or an expression that evaluates to a number; it is evaluated
once, as soon as the repeat statement is encountered, and the following block is then executed
that many times. If the count evaluates to 0, x or z, the block is not executed.
Syntax:
repeat (number_of_times)
begin
... statements ...
end
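A testbench-style example of repeat (a simulation-only sketch; names are illustrative): generating exactly eight clock pulses.

```verilog
// Apply 8 clock pulses using repeat, then stop the simulation
module repeat_demo;
  reg clk = 0;
  initial begin
    repeat (8) begin
      #5 clk = 1;   // the block executes exactly 8 times
      #5 clk = 0;
    end
    $finish;
  end
endmodule
```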
Chapter 6
SOFTWARE USED
Xilinx
Xilinx software is used by VHDL/Verilog designers to perform synthesis. Any simulated design
can be synthesized and configured on an FPGA. Synthesis is the transformation of HDL code into a
gate-level netlist; it is an integral part of current design flows.
6.1 Algorithm
Start the ISE software by clicking the Xilinx ISE icon. Create a new project and observe the
properties displayed. If the design needs a large number of LUTs, the family, device and package
can be changed.
Fig 6.1: Create a new folder for the design
Fig 6.3: Finish the new folder; set the family and device before designing a project
Fig 6.4: Ready to design a project
Create an HDL source, specifying all inputs, outputs and buffers as required; this opens a
window in which the HDL code to be synthesized is written.
Fig 6.13: Check the technology schematic view of the project
Fig 6.14: Internal structure of the technology schematic view
Viewing the technology schematic of the design (an AND gate) displays the LUTs; the LUT count
is taken as the area of the design.
Fig 6.15: The truth table, schematic, Boolean equation and K-map of the design
The Xilinx tool makes the truth table, schematic, Boolean equation and K-map of the design
available.
Fig 6.16: Simulation of the design to verify its logic
Fig 6.17: Apply inputs through Force Constant or Force Clock for the input signals
Fig 6.20: Show all values (zoom to full view) for the design
Chapter 7
RESULTS
RTL SCHEMATIC: RTL is an abbreviation of register-transfer level. The RTL schematic is the
blueprint of the architecture and is used to verify the designed architecture against the
intended one. An HDL (Verilog or VHDL) converts the description, or summary, of the architecture
into a working design. The RTL schematic also shows the internal connections between blocks for
better analysis. The figure below shows the RTL schematic of the designed architecture.
7.3 SIMULATION:
Simulation is the final verification of the design's behaviour, whereas the schematic verifies
the blocks and their connections. The simulation window is launched by switching from
Implementation to Simulation on the home screen of the tool, and it presents the outputs as
waveforms. Different radix number systems are available for displaying the values.
Fig: Simulated waveforms of the existing Vedic multiplier
7.5 ADVANTAGES
The Karatsuba Vedic multiplier is fast, easier to compute than the traditional method, and its
area and power are also lower than those of a conventional multiplier.
7.6 APPLICATIONS
1. Signal processing
2. Image processing
3. Amplifiers
4. Cryptography
Chapter 8
Conclusion And Future Scope
Vedic algorithms have been useful in designing functional circuits that achieve high speed with a
simplified architecture. The challenge, however, has been that these algorithms are based on the
decimal number system, so binarization often traded away the speed advantage and the simplified
architecture because of conversion hardware overhead. Yet renewed interest in Vedic algorithms
has been observed recently, driven by clever circuit realizations, primarily multipliers. The
present work is important in that context: a known algorithm (Karatsuba) has been modified by the
authors to include an adaptive aspect, allowing recursive operation that reduces the order of
complexity from O(n^2) to O(n^log2(3)) in the bit-length n.
A 16×16-bit multiplier using reversible logic has been proposed and designed to showcase the
technique, with the primary objective of minimizing delay so that it can find application in
DSP, image processing and computation-intensive ASIPs. It is based on the Vedic Karatsuba
algorithm, which generates fewer partial-product terms. The algorithm is further optimized using
an adaptive concept for computing the third product term, yielding higher speed. Moreover, the
compression of the partial-product terms is also accelerated by combining carry-save adders with
the proposed Square Root Carry Select Adder (SRCSA) with reversible logic, as discussed in this
work.
The implementation, synthesis and simulation are performed in the Xilinx ISE tool using Verilog
HDL. In future, implementations of this multiplier that eliminate gate delays, together with
approximation added to the architecture, can further enhance performance in DSP, image
processing, filter and cryptographic applications, and the design will find use in area- and
speed-critical applications.