0% found this document useful (0 votes)
157 views

FFT Using VHDL

1. The document discusses Fast Fourier Transform (FFT) which is an efficient algorithm to compute the Discrete Fourier Transform (DFT). FFT reduces the computation from O(N^2) to O(NlogN). 2. It provides details on the implementation of a 2-point DIT FFT algorithm using VHDL code including the butterfly model and twiddle factor. 3. The document discusses representing the FFT process using a data flow graph and scheduled sequencing graphs considering resource constraints with one ALU and multiplier. It provides VHDL code for the structural implementation of the FFT processor based on these models.
Copyright
© © All Rights Reserved
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
157 views

FFT Using VHDL

1. The document discusses Fast Fourier Transform (FFT) which is an efficient algorithm to compute the Discrete Fourier Transform (DFT). FFT reduces the computation from O(N^2) to O(NlogN). 2. It provides details on the implementation of a 2-point DIT FFT algorithm using VHDL code including the butterfly model and twiddle factor. 3. The document discusses representing the FFT process using a data flow graph and scheduled sequencing graphs considering resource constraints with one ALU and multiplier. It provides VHDL code for the structural implementation of the FFT processor based on these models.
Copyright
© © All Rights Reserved
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 22

1).

Fast Fourier Transform (FFT):


Fast Fourier Transform is an algorithm to compute Discrete Fourier Transform(DFT).
DFT is used to convert a time domain signal into its frequency spectrum domain. FFT is an
efficient algorithm to compute DFT. There are many distinct FFT algorithms involving a wide
range of mathematics, from simple complex-number arithmetic to group theory and number
theory.
FFTs are algorithms for quick calculation of discrete Fourier transform of a data vector. The
FFT is a DFT algorithm which reduces the number of computations needed for N points from O(N 2)
to O(N log N) where log is the base-2 logarithm.

2-point DIT-FFT Algorithm :


2-point FFT butterfly will have 2 inputs and 2 outputs. Butterfly model is shown below

Where twiddle factor =>


Here N=2 and k=0 so the twiddle factor value is=1
Further the simplified values of outputs are
X(0)=A(0)+WnB(0);
X(1)=A(0)-WnB(0);

A = x + Wn * y;
A_r + j*A_i = (x_r + j* x_i) + ( w_r + j*w_i) *(y_r + j*y_i);
A_r + j*A_i = (x_r + (w_r * y_r ) ( w_i *y_i)) + j*(x_i +( w_i *y_r) +( w_r
* y_i));

A_r = x_r + (w_r * y_r ) ( w_i *y_i);


A_i = x_i +( w_i *y_r) +( w_r * y_i);

B = x-+ Wn * y;
B_r + j*B_i = (x_r + j* x_i) - ( w_r + j*w_i) *(y_r + j*y_i);
B_r + j*B_i = ( x_r - (w_r *y_r) + (w_i * y_i)) + (x_i (w_r * y_i ) (w_i *
y_r ));
B_r =( x_r - (w_r *y_r) + (w_i * y_i);
B_i = x_i (w_r * y_i ) (w_i * y_r );

IMPLIMENTATION OF DIT-FFT ALGORITHM USING VHDL CODE:


-- The behavioral model outputs of 2 point fft algorithm model are
-- A_r = x_r + (y_r*w_r) - (y_i*w_i);
-- B_r = x_r - (y_r*w_r) + (y_i*w_i);
-- A_i = x_i + (y_r*w_i) + (y_i*w_r);
-- B_i = x_i - (y_r*w_i) - (y_i*w_r);
--------------------------------------------------LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;
USE IEEE.STD_LOGIC_UNSIGNED.ALL;
USE IEEE.STD_LOGIC_ARITH.ALL;

Entity fft_radix_2 is
port (
x_r

: inout std_logic_vector(7 downto 0);

y_r

: inout std_logic_vector(7 downto 0);

x_i

: inout std_logic_vector(7 downto 0);

y_i

: inout std_logic_vector(7 downto 0);

w_r

: inout std_logic_vector(7 downto 0);

w_i

: inout std_logic_vector(7 downto 0);

A_r

: out std_logic_vector(7 downto 0);

A_i

: out std_logic_vector(7 downto 0);

B_r

: out std_logic_vector(7 downto 0);

B_i

: out std_logic_vector(7 downto 0)

);
end fft_radix_2;
architecture behavioral of fft_radix_2 is
signal product_1,product_2 : std_logic_vector(15 downto 0);
begin
product_1 <= (y_r*w_r) - (y_i*w_i);
product_2 <= (y_r*w_i) + (y_i*w_r);
A_r <= x_r + product_1(7 downto 0);
A_i <= x_i + product_2(7 downto 0);
B_r <= x_r - product_1(7 downto 0);
B_i <= x_i - product_2(7 downto 0);
DUT:process
begin
x_r <="00000110";
y_r <="00000100";
x_i <="00000010";
y_i <="00000001";
w_r <="00000001";

w_i <="00000000";
wait;
end process DUT;
end architecture;

2). Represent the FFT process by means of a data flow graph.

Imaginary_part of Output-A

Imaginary_part of Output-B

3). Represent the FFT process by means of a scheduled sequencing graph considering
ASAP and ALAP schedules.

ASAP:(As Soon As Possible) Here some computations are performed in the first cycle

ALAP:(As Late As Possible) Here some computations need not be computed in the first cycle.

4. Assume that only one ALU (for addition. comparison, etc) and one multiplier is available.
Draw a scheduled sequencing graph for the process taking resource constraints into
consideration and considering minimum number of clock cycles.
When there are only one multiplier and one ALU it takes minimum 8-clock
cycles to complete the entire operation. So the latency of the sequencing graph is =
8*T

5). Obtain two designs of the FFT processor from problem 2 and problem 4 and
realize a structural model of the processor in VHDL.
---------------------------------------------------- DATA FLOW STRUCTURAL model of FFT algorithm
-- using 2 ALUs and 2 multipliers
---------------------------------------------------

LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;
USE IEEE.STD_LOGIC_UNSIGNED.ALL;
USE IEEE.STD_LOGIC_ARITH.ALL;

entity data_flow_fft_struct is
GENERIC (
N

: INTEGER := 8

);
port (
clk
reset

: in std_logic;
: in std_logic;

data_valid_in : in std_logic;
ain_r

: inout std_logic_vector(N-1 downto 0); -- real input data

bin_r

: inout std_logic_vector(N-1 downto 0); -- real input data

ain_i

: inout std_logic_vector(N-1 downto 0); -- imaginary input data

bin_i

: inout std_logic_vector(N-1 downto 0); -- imaginary input data

w_r

: inout std_logic_vector(N-1 downto 0);

w_i

: inout std_logic_vector(N-1 downto 0);

data_valid_out : out std_logic;

aout_r

: out std_logic_vector(N-1 downto 0);-- real output data

aout_i

: out std_logic_vector(N-1 downto 0);-- imaginary output data

bout_r

: out std_logic_vector(N-1 downto 0);-- real output data

bout_i

: out std_logic_vector(N-1 downto 0) -- imaginary output data

);
end data_flow_fft_struct;

architecture behav of data_flow_fft_struct is

signal mul1_in_1,mul1_in_2,mul2_in_1,mul2_in_2:std_logic_vector(N-1 DOWNTO


0);
signal ALU1_in_1,ALU1_in_2,ALU2_in_1,ALU2_in_2:std_logic_vector(N-1 downto
0);
signal mul1_out,mul2_out:std_logic_vector((2*N-1) downto 0);
signal ALU1_out,ALU2_out:std_logic_vector(N-1 downto 0);
signal mul1_out_b,mul2_out_b:std_logic_vector(N-1 downto 0);
signal ALU1_out_b,ALU2_out_b:std_logic_vector(N-1 downto 0);
signal a_r_t,a_i_t:std_logic_vector(N-1 downto 0);
signal Add_sub_1,Add_sub_2

type Times is ( time1,


time2,
time3,
time4,
time5,

: std_logic;

time6
);
signal next_state : times;
begin
process(clk,reset)
begin
if( reset ='0' ) then
a_r_t

<= (others =>'0');

a_i_t

<= (others =>'0');

aout_r

<= (others =>'0');

bout_r

<= (others =>'0');

aout_i

<= (others =>'0');

bout_i

<= (others =>'0');

Add_sub_1

<= '1';

Add_sub_2

<= '1';

data_valid_out <= '0';


next_state

<= time1;

elsif(clk = '1' and clk'event) then

case next_state is
when time1 =>
if (data_valid_in='1')then
mul1_in_1 <= bin_r;
mul1_in_2 <= w_r;

mul2_in_1 <= bin_r;


mul2_in_2 <= w_i;
next_state <= time2;
else
next_state<=time1;
end if;
when time2 =>
mul1_out_b <= mul1_out( 7 downto 0);
mul2_out_b <= mul2_out( 7 downto 0);

mul1_in_1 <= bin_i;


mul1_in_2 <= w_i;
mul2_in_1 <= bin_i;
mul2_in_2 <= w_r;
next_state <= time3;
when time3 =>
Add_sub_1 <='0'; -- subtractor-1
ALU1_in_1 <= mul1_out_b;
ALU1_in_2 <= mul1_out(7 downto 0);

Add_sub_2 <='1'; -- Adder-2


ALU2_in_1 <= mul2_out_b;
ALU2_in_2 <= mul2_out(7 downto 0);
next_state <= time4;
when time4 =>
ALU1_out_b <= ALU1_out;

ALU2_out_b <= ALU2_out;


Add_sub_1 <= '1';
ALU1_in_1 <= ain_r;
ALU1_in_2 <= ALU1_out_b;
Add_sub_2 <='1';
ALU2_in_1 <= ain_i;
ALU2_in_2 <= ALU2_out_b;
next_state <= time5;
when time5 =>
a_r_t <= ALU1_out;
a_i_t <= ALU2_out;

Add_sub_1<='0';
ALU1_in_1 <= ain_r;
ALU1_in_2 <= ALU1_out_b;
Add_sub_2<='0';
ALU2_in_1 <= ain_i;
ALU2_in_2 <= ALU2_out_b;
next_state <= time6;
when time6 =>
aout_r <= a_r_t;
aout_i <= a_i_t;
bout_r <= ALU1_out;
bout_i <= ALU2_out;
data_valid_out <='1';
next_state <= time6;

when others => NULL;


end case;
end if;
end process;
ALU1_out <= ALU1_in_1 + ALU1_in_2 when (Add_sub_1='1') else
ALU1_in_1 - ALU1_in_2;
ALU2_out <= ALU2_in_1 + ALU2_in_2 when Add_sub_2='1' else
ALU2_in_1 - ALU2_in_2;
mul1_out <= mul1_in_1 * mul1_in_2;
mul2_out <= mul2_in_1 * mul2_in_2;
DUT : process
begin
ain_r <= "00000110";
ain_i <= "00000010";
bin_r <= "00000100";
bin_i <= "00000001";
w_r <= "00000001";
w_i <= "00000000";
wait;
end process;
end behav;
Result:-

----------------------------------------------------- STRUCTURAL MODEL OF RESOURCE CONSTRAINT MODEL


--------------------------------------------------LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;
USE IEEE.STD_LOGIC_UNSIGNED.ALL;
USE IEEE.STD_LOGIC_ARITH.ALL;

entity resrc_constraint_fft is
GENERIC ( N : INTEGER := 8
);
PORT ( x_r : inout std_logic_vector(N-1 downto 0);
x_i : inout std_logic_vector(N-1 downto 0);
y_r : inout std_logic_vector(N-1 downto 0);
y_i : inout std_logic_vector(N-1 downto 0);

w_r : inout std_logic_vector(N-1 downto 0);


w_i : inout std_logic_vector(N-1 downto 0);
A_r : out std_logic_vector(N-1 downto 0);
A_i : out std_logic_vector(N-1 downto 0);
B_r : out std_logic_vector(N-1 downto 0);
B_i : out std_logic_vector(N-1 downto 0);
clk : in std_logic;
reset : in std_logic;
valid_in : in std_logic;
valid_out : out std_logic
);
end resrc_constraint_fft;

architecture behav of resrc_constraint_fft is

signal ALU_in_1,ALU_in_2:std_logic_vector(N-1 downto 0);


signal mul_in_1,mul_in_2:std_logic_vector(N-1 downto 0);
signal ALU_out,ALU_out_b_1:std_logic_vector(N-1 downto 0);
signal mul_out:std_logic_vector(2*N-1 downto 0);
signal mul_out_b:std_logic_vector(N-1 downto 0);
signal Add_sub:std_logic;
signal a_r_t,a_i_t,b_r_t:std_logic_vector(N-1 downto 0);

type times is ( time1,


time2,
time3,

time4,
time5,
time6,
time7,
time8,
time9
);
signal next_state:times;

begin
process(reset,clk)
begin
if (reset = '0') then
A_r <= (others=>'0');
A_i <= (others=>'0');
B_r <= (others=>'0');
B_i <= (others=>'0');
-- Add_sub <= (others =>'0');
a_r_t <= (others =>'0');
a_i_t <= (others =>'0');
b_r_t <= (others =>'0');
--

valid_out <= (others =>'0');


next_state <= time1;

elsif (clk = '1' and clk'event ) then


case next_state is
when time1 =>

if valid_in = '1' then


mul_in_1 <= y_r;
mul_in_2 <= w_r;
next_state <= time2;
else
next_state <= time1;
end if;
valid_out <='0';
when time2 =>
mul_out_b <= mul_out(7 downto 0);
mul_in_1 <= y_i;
mul_in_2 <= w_i;
next_state <= time3;
when time3 =>
Add_sub <= '0'; -- sub
ALU_in_1 <= mul_out_b;
ALU_in_2 <= mul_out(7 downto 0);
next_state <= time4;
when time4 =>
ALU_out_b_1 <= ALU_out;
Add_sub <='1'; --Add;
ALU_in_1 <= x_r;
ALU_in_2 <= ALU_out;

mul_in_1 <= y_i;


mul_in_2 <= w_r;

next_state <= time5;


when time5 =>
a_r_t <= ALU_out;
mul_out_b <= mul_out(7 downto 0);
Add_sub <='0'; -- sub
ALU_in_1 <= x_r;
ALU_in_2 <= ALU_out_b_1;

mul_in_1 <= y_r;


mul_in_2 <= w_i;
next_State <= time6;
when time6 =>
b_r_t <= ALU_out;
Add_sub <='1'; -- Add
ALU_in_1 <= mul_out_b;
ALU_in_2 <= mul_out(7 downto 0);
next_state <= time7;
when time7=>
ALU_out_b_1 <= ALU_out;
Add_sub <='1'; --Add
ALU_in_1 <= x_i;
ALU_in_2 <= ALU_out;
next_state <= time8;
when time8 =>
a_i_t <= ALU_out;
Add_sub <='0'; --sub

ALU_in_1 <= x_i;


ALU_in_2 <= ALU_out_b_1;
next_state <= time9;
when time9 =>
A_r <= a_r_t;
A_i <= a_i_t;
B_r <= b_r_t;
B_i <= ALU_out;
valid_out <='1';
next_state <= time9;
when others => NULL;
end case;
end if;
end process;
-- ADDER AND MULTIPLIER are assumed to be implimented
ALU_out <= ALU_in_1 + ALU_in_2 when (Add_sub='1') else
ALU_in_1 - ALU_in_2;
mul_out <= mul_in_1 * mul_in_2;

DUT:process
begin
x_r <= "00000110";
x_i <= "00000010";
y_r <= "00000100";
y_i <= "00000001";
w_r <= "00000001";

w_i <= "00000000";


wait;
end process DUT;
end behav;
Result:-

6). Synthesize the models in problem 5 on Altera Quartus II synthesis tool and
compare the resource utilization summary in the two cases.
Synthesized the problems in Quartus II by selecting EP2C20F484c7 FPGA
Problem-2 uses the resources as follows
Total=126/18,752 (<1%)
Combinational with no registers =27

Registers only =61


Combinational with registers =38

Problem-4 uses the resources as follows


Total=158/18,752 (<1%)
Combinational with no registers =37
Registers =56
Combinational with registers =65

Data flow graph uses more number of resources than the resource constraint sequencing graph.

You might also like