
Transposed Form FIR Filters: Core Generator Tool

This application note describes a high-speed, reconfigurable, full-precision filter design implemented in the Virtex and Virtex-II series and Spartan(tm)-II family of FPGAs. The VHDL reference design is easily modified to change filter parameters including coefficients and the number of taps. By illustrating a design methodology for digital filters, the advantages of using FPGAs for digital signal processing applications are emphasized.


Application Note: Virtex and Virtex-II Series

Transposed Form FIR Filters
Author: Vikram Pasham, Andy Miller, and Ken Chapman
XAPP219 (v1.1) January 10, 2001

Summary

This application note describes a high-speed, reconfigurable, full-precision Transposed Form
FIR filter design implemented in the Virtex™ and Virtex-II series and Spartan™-II family of
FPGAs. The VHDL reference design provided with this application note is easily modified to
change filter parameters, including the coefficients and the number of taps. By illustrating a design
methodology for digital filters, it emphasizes the advantages of using FPGAs for digital signal
processing (DSP) applications. The Core Generator tool provides a preoptimized
alternative to this reference design.

Introduction

Digital filters are among the most significant components in digital signal processing
applications. The function of a filter is to eliminate undesirable parts of the signal (random
noise), or to extract signals in a particular frequency range. In other words, a filter selects,
suppresses, or modifies certain frequency components of the signal, either to reduce noise or
to shape the spectrum. This application note focuses on digital filters that are used widely in
digital video broadcast, digital video effects, and digital wireless communication. Figure 1 is an
example application of filters in a communication receiver.

(Block diagram not reproduced. Labeled blocks: RF amplifier, mixer, local oscillator, first I.F. band-pass amplifier, A/D, Xilinx FPGA, digital signal processor, D/A, and audio amplifier, spanning the analog (RF), digital, and analog (audio) domains. Inside the FPGA: A/D control, sine/cosine numerically controlled oscillator (NCO), multipliers, decimation low-pass FIR filters for the I and Q data, data buffering, and a processor interface. The rates 150 Ms/s and 200 Ks/s are marked on the signal path.)

Figure 1: Filter Applications: Communication Receiver

© 2000 Xilinx, Inc. All rights reserved. All Xilinx trademarks, registered trademarks, patents, and disclaimers are as listed at http://www.xilinx.com/legal.htm.
All other trademarks and registered trademarks are the property of their respective owners. All specifications are subject to change without notice.


Most of the traditional filters in DSP applications are implemented using highly specialized DSP
processors. These DSP processors are capable of carrying out high-speed Multiply
Accumulate (MAC) operations, but have bandwidth limitations. Only a fixed number of
operations can be performed by these processors before the next sample arrives, thereby
limiting the bandwidth. DSP processors are sequential in nature, so a design built on a single
processor can perform only one operation on one set of data at a time. For example, in a
16-tap filter, the processor computes one tap at a time while the other 15 taps
wait their turn. This also limits the overall frequency of the application. Because of resource
limitations, the operations cannot be performed in parallel.
FPGA-based filters are implemented with a parallel, pipelined architecture, which enhances overall
performance. Thus, a 16-tap filter will run as fast as a 64- or 128-tap filter implemented in an
FPGA. The FPGA implementation enables total access to the precision of the signal at each
stage of the algorithm. This is a significant difference between an FPGA-based filter and an
equivalent DSP processor solution. Implementations of digital filters with sample rates of a few
MHz are generally difficult and expensive to realize using standard DSPs. The potential for
parallel processing and reprogrammability makes all Virtex series FPGAs an ideal solution.
The flexible architecture of FPGAs permits optimum use of the available gates in the form of
Constant Coefficient Multipliers (KCM). The reprogrammability of FPGAs enables tuning of the
filter at any time.

Structures for FIR Filters

Digital filter algorithms are primarily composed of multipliers, adders, and registers. The basic
structure of a Finite Impulse Response (FIR) filter is shown in Figure 2. The multipliers and
adders form the heart of a FIR filter. The input data passes to the multiplier and then to the
adder with interleaving delay elements.

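(Schematic not reproduced: the n-bit input feeds a chain of D-Q registers, the tap outputs are multiplied by k0 through k3, and a tree of registered adders sums the products into the m-bit output.)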

Figure 2: FIR Filter Structure Employing Tree of Pipelined Adders

An alternate implementation structure called the Transposed Form FIR filter is shown in
Figure 3. Utilizing the same resources, data samples are applied in parallel to all the tap
multipliers through pipeline registers. The input registers are not required, because high fan-out
input signals can be handled by the Virtex and Virtex-II architectures. The products are applied
to a cascaded chain of registered adders, combining the effect of accumulators and registers.
The order of tap coefficients must be reversed with the first tap closest to the output. This
structure allows expansion of the number of taps required in a filter, since each "tap module" is
identical. Since the structure is uniform, a single component can be designed and instantiated
as many times as required by the number of taps.
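
The relationship between the two structures can be illustrated with a small behavioral model. The following is a minimal Python sketch, not the VHDL reference design; the function names and sample values are assumptions made for illustration. It models the direct form as a delay line followed by a sum of products, and the transposed form as a chain of registered adders that all see the same input sample.

```python
def fir_direct(samples, coeffs):
    """Direct form: a delay line of past inputs, then a sum of products."""
    delay = [0] * len(coeffs)
    out = []
    for x in samples:
        delay = [x] + delay[:-1]          # shift the new sample into the line
        out.append(sum(k * d for k, d in zip(coeffs, delay)))
    return out


def fir_transposed(samples, coeffs):
    """Transposed form: every tap multiplies the same sample, and a chain
    of registered adders carries the partial sums toward the output."""
    n = len(coeffs)
    regs = [0] * n                        # regs[0] is the output register
    out = []
    for x in samples:
        products = [k * x for k in coeffs]
        # Each register loads its own product plus the previous value of the
        # register behind it; the last stage adds the constant 0.
        regs = [products[i] + (regs[i + 1] if i + 1 < n else 0)
                for i in range(n)]
        out.append(regs[0])
    return out


if __name__ == "__main__":
    coeffs = [2, 10, 20, 30]
    samples = [1, 2, 3, 4, 5, 0, 0, 0]
    print(fir_direct(samples, coeffs))
    print(fir_transposed(samples, coeffs))
```

Both functions return the same sequence for the same inputs. In the transposed model the coefficient nearest the output is `coeffs[0]`, which corresponds to the reversed tap order described above.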


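(Schematic not reproduced: the n-bit input drives all four tap multipliers in parallel through pipeline registers. The coefficients appear in the order k3, k2, k1, k0, and the products feed a chain of registered adders that starts from "0" and ends at the m-bit output.)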

Figure 3: Transposed Form FIR Filters Employing Cascaded Pipelined Adders

FIR vs. Transposed Form FIR

Both FIR and Transposed Form FIR filters have trade-offs and limitations. It is up to the
designer to choose the style most appropriate to the application. For an 8-tap, 16-bit filter, the
device utilization and performance obtained were nearly identical. In general, a smaller filter
profits from the traditional approach, while a larger filter benefits from the Transposed Form FIR
approach. This argument becomes more obvious when very large filters are implemented
across multiple devices. The cascadable nature of the tap-slice modules allows for easy
interdevice connections. The input-to-output latency is reduced with fully pipelined Transposed
Form FIR filters. The filter selection also depends on the type of coefficients (symmetric or
asymmetric). In symmetric systems, coefficients occur in pairs.
In Transposed Form FIR filters, multipliers can be completely avoided if the coefficients can be
tuned to powers of two (2^n) or values that are close to powers of two (2^3 + 1 = 9). In such
cases, the multiplication can be achieved by shifting and adding.
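
As a small illustration of the shift-and-add idea, the Python sketch below multiplies by a constant using only shifts and additions. The function name is hypothetical, and the sketch handles only non-negative constants.

```python
def shift_add_multiply(x, k):
    """Multiply x by a non-negative constant k using only shifts and adds."""
    if k < 0:
        raise ValueError("this sketch only handles non-negative constants")
    result = 0
    bit = 0
    while k >> bit:
        if (k >> bit) & 1:
            result += x << bit            # add x weighted by 2^bit
        bit += 1
    return result


if __name__ == "__main__":
    print(shift_add_multiply(7, 8))       # one shift: 7 << 3
    print(shift_add_multiply(7, 9))       # one shift and one add: (7 << 3) + 7
```

A coefficient of 8 costs a single shift (wiring only in hardware), and a coefficient of 9 costs one shift plus one adder, so no multiplier is needed in either case.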

Transposed Form Filter Design

In traditional DSPs, the FIR filters are implemented in dedicated hardware without any
parallelism, thus limiting the sample rate. The Virtex FPGAs have abundant hardware
resources to facilitate full parallelism (each tap has a dedicated multiplier and adder). For
multiplier performance improvement, the features of the filter have to be carefully studied. The
efficiency of the multiplier determines the overall performance of the filter. Hence, the multiplier
must be implemented for the best possible performance.
The reference design is an 8-tap filter based on 16-bit input samples and 14-bit signed
coefficients. The basic building blocks of the filter are KCMs, Adders, Registers, and a
delay-locked loop.

Constant Coefficient Multiplier (KCM)

In a fully parallel implementation of a filter, each tap has a dedicated multiplier. The tap data is
one input of this multiplier; the other is a constant coefficient. Since one input is a constant, these
multipliers are called KCMs. KCMs are efficiently implemented by storing pre-computed partial
products of the fixed coefficient, thereby reducing the logic required as compared to traditional
two-variable multipliers. As a result, better performance can be achieved. In Xilinx FPGAs,
these partial products can be stored in ROMs using the distributed memory.
The 16-bit input sample is separated into four 4-bit nibbles. Each nibble acts as an input to the
ROM in different cycles. These ROMs store the product of the constant coefficient k and a
factor that takes the values 0 through 15. The ROM contents are 0 x k, 1 x k,
2 x k, 3 x k, …, 15 x k. The word size in the ROM is:
(4-bit input nibble) x (14-bit coefficient) = 18 bits (ROM word size)
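
The 18-bit word size can be checked with a few lines of Python. This is a sketch under the assumption that the coefficient is treated as an unsigned 14-bit magnitude for the worst case; the names are illustrative.

```python
NIBBLE_BITS = 4
COEFF_BITS = 14


def build_kcm_rom(k):
    """Return the 16-entry 'times table' 0*k, 1*k, ..., 15*k."""
    return [n * k for n in range(1 << NIBBLE_BITS)]


if __name__ == "__main__":
    largest_k = (1 << COEFF_BITS) - 1     # worst-case 14-bit magnitude
    rom = build_kcm_rom(largest_k)
    word_bits = max(rom).bit_length()     # bits needed for the largest entry
    print(len(rom), "entries,", word_bits, "bits per word")
```

The largest entry, 15 times the largest 14-bit value, needs exactly 4 + 14 = 18 bits, which agrees with the 16 x 18 ROM used for each nibble.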


Essentially this ROM functions as a times table of the constant coefficient, k. In this reference
design, the value read from this ROM based on its 4-bit input is added to another partial product
stored in an adjacent ROM. As a result, KCMs are less than one-third the size of full multipliers.
A KCM block diagram is shown in Figure 4.

(Block diagram not reproduced: the 16-bit input X[15:0] is split into four 4-bit nibbles, [15:12], [11:8], [7:4], and [3:0]. Each nibble addresses a 16 x 18 ROM holding 0 x k through 15 x k, and the 18-bit ROM outputs are combined by an adder chain that starts from "0". The product bits are labeled [29:12], [11:8], [7:4], and [3:0].)

Figure 4: KCM Block Diagram

A KCM would be ideal for unsigned inputs and coefficients. There are a couple of options for
handling signed numbers. The first approach is to implement two ROM tables, one for the
signed MSB nibble and the other for the LSB nibbles. This approach requires two separate
ROM tables per tap, as shown in Figure 5. This is not an optimal solution.
The second approach, shown in Figure 6, is to convert the signed sample input data into an
unsigned magnitude word and a sign bit, using a 2's-complement module. When a negative
word is detected, it is complemented, and the magnitude addresses the same ROM
table that non-negative data would use. In the example of Figure 6, the multiplier output is then
−5, which has the wrong sign on its own; however, the accompanying sign bit selects a subtract
operation in the ADD/SUB module, resulting in the correct sign and magnitude.
In order to handle signed inputs and coefficients, a 2’s-complement component is used to
convert negative numbers to positive. After all the operations, the final result is made positive or
negative depending on the sign of the input and coefficient.
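
The second approach can be sketched in Python as follows. This is a behavioral illustration only: the function name is hypothetical, and the ROM lookup is replaced by a plain multiplication of the input magnitude by the coefficient.

```python
def kcm_add_sub(acc, x, k):
    """Return acc + x*k using a magnitude lookup and an add/subtract module.

    The sign bit of x selects add (0) or subtract (1); the lookup itself
    only ever sees the non-negative magnitude of x.
    """
    sign_bit = 1 if x < 0 else 0
    magnitude = -x if sign_bit else x     # 2's-complement module
    product = magnitude * k               # stand-in for the ROM lookup
    return acc - product if sign_bit else acc + product


if __name__ == "__main__":
    # Values taken from the Figure 6 example: +56 incoming, input -1, coefficient -5
    print(kcm_add_sub(56, -1, -5))
```

With the Figure 6 values, the lookup returns −5 for magnitude 1, and the subtract selected by the sign bit gives 56 − (−5) = 61, the same result as 56 + (−1 x −5).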

(Diagram not reproduced. Worked example from the figure: the 8-bit input −1 = 1111 1111 is split into a signed MSB nibble (−1) and an unsigned LSB nibble (+15), and the coefficient is −5. The signed table returns (−1 x −5) = +5 and the unsigned table returns (15 x −5) = −75. Combined with the appropriate weightings, the answer is +5, which is correct, and the figure then shows +56 + (+5) = 61.)

Figure 5: KCM Implementation with Two ROM Tables


(Diagram not reproduced. Alternative: create a magnitude with a sign bit and use an add/subtract module, where sign bit = 1 selects subtract and sign bit = 0 selects add. Worked example from the figure: the input −1 becomes a sign (−) and the magnitude 0000 0001, split into unsigned nibbles (+0) and (+1). With the coefficient −5, the tables return (0 x −5) = 0 and (1 x −5) = −5, sign-extended to 1111 1111 1011. The multiplier output is −5, but it is accompanied by the sign bit, which is used to select subtract: +56 − (−5) = 61. The add/sub signal can be used to feed CIN, as a subtracter has an "active Low" borrow.)

Figure 6: KCM Implementation with Add/Subtract Module

The third approach, as implemented in the reference design, uses three 2’s-complement
modules to handle both signed inputs and coefficients. This is used to avoid signed
multiplication and addition.
The operation of a KCM multiplier implemented using a ROM is explained with the following
example:
16-bit input: 0001 0010 0000 0100 (Decimal equivalent 4612)
14-bit coefficient: 00 0000 0000 0010 (Decimal equivalent 2)

The 16-bit input is separated into four 4-bit nibbles: "0001", "0010", "0000", and "0100". All
sixteen coefficient multiples, 0 x 2, 1 x 2, 2 x 2, …, 15 x 2, are stored with an 18-bit (14-bit x 4-bit)
word size in the ROM. Each 4-bit nibble of the 16-bit input acts as an address to the ROM. The
corresponding ROM content at this address is read.
First partial product = 00 0000 0000 0000 1000 (ROM contents at address "0100")
Second partial product = 00 0000 0000 0000 0000 (ROM contents at address "0000")
Third partial product = 00 0000 0000 0000 0100 (ROM contents at address "0010")
Fourth partial product = 00 0000 0000 0000 0010 (ROM contents at address "0001")

All the partial products are then added after shifting them appropriately (shown below):

                     00 0000 0000 0000 1000   First partial product
                00 0000 0000 0000 0000 0000   Second partial product
           00 0000 0000 0000 0100 0000 0000   Third partial product
    + 00 0000 0000 0000 0010 0000 0000 0000   Fourth partial product
      -------------------------------------
      00 0000 0000 0000 0010 0100 0000 1000   (Decimal equivalent 9224)
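
The worked example can be reproduced with a short Python sketch of the nibble-wise lookup. The function name is an assumption, and the model covers only unsigned values, as in the example.

```python
def kcm_multiply(x, k):
    """Multiply a 16-bit unsigned x by constant k with four ROM lookups."""
    rom = [n * k for n in range(16)]          # 0*k .. 15*k
    total = 0
    for i in range(4):                        # least significant nibble first
        nibble = (x >> (4 * i)) & 0xF         # ROM address
        partial = rom[nibble]                 # 18-bit partial product
        total += partial << (4 * i)           # weight by the nibble position
    return total


if __name__ == "__main__":
    x, k = 0b0001_0010_0000_0100, 2           # 4612 and 2, as in the example
    print(kcm_multiply(x, k))
```

For the input 4612 and coefficient 2, the four lookups return 8, 0, 4, and 2, and the weighted sum is 9224, matching the addition shown above.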

Pipelining and resource sharing of adders can further enhance the performance of KCM
multipliers. An enhanced multicycle KCM schematic of the reference design is shown in
Figure 7. The input sample arrives at clock frequency f1, while all the internal operations of the
KCM are performed at the higher frequency f2 (4 x f1). Four muxes are used to select
4-bit input nibbles. A 2-bit counter clocked at f2 provides the select signal for
these muxes. For every 4-bit input nibble, the corresponding value is read from the ROM, and
the partial products are added after the required shift operations.
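
One plausible reading of the multicycle scheme is a shift-accumulate loop in which a single 18-bit adder is reused on each of the four fast clock cycles. The Python sketch below models that reading; it is an assumption about the data flow rather than a transcription of the reference design, and the function names are hypothetical. Signs are handled separately by working on magnitudes and negating the result when the input and coefficient signs differ.

```python
def multicycle_kcm(x_mag, k_mag):
    """Unsigned 16 x 14 multiply in four steps, one nibble per fast clock."""
    rom = [n * k_mag for n in range(16)]
    acc = 0                                   # upper bits fed back to the adder
    low_bits = 0                              # result bits already finalized
    for count in range(4):                    # 2-bit counter: 0, 1, 2, 3
        nibble = (x_mag >> (4 * count)) & 0xF # mux selects one nibble
        s = rom[nibble] + acc                 # single shared adder
        low_bits |= (s & 0xF) << (4 * count)  # low 4 bits are final
        acc = s >> 4                          # the rest is shifted and reused
    return (acc << 16) | low_bits


def signed_kcm(x, k):
    """Signed wrapper: multiply magnitudes, then restore the sign."""
    negative = (x < 0) != (k < 0)
    magnitude = multicycle_kcm(abs(x), abs(k))
    return -magnitude if negative else magnitude


if __name__ == "__main__":
    print(multicycle_kcm(4612, 2))
    print(signed_kcm(8515, -30))
```

Because the low four bits of each sum are final, the adder never has to be wider than one ROM word, which is the resource saving that the multicycle arrangement is aiming for.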

ROM Implementation

In HDL, there are two approaches to infer ROMs using the function generators or Look-Up
Tables (LUTs) in Xilinx FPGAs. One approach is to use the case statement. With this approach,
the code requires as many case statements as there are ROMs in the filter
design, and each case statement has to specify all 2^n possibilities, n being the number
of address bits. Although this can make the code lengthy and tedious, it has the advantage
that the coefficients can be changed without an impact on the utilization or performance of the
filter design.
The reference design xapp219.zip uses array declarations. This second approach results in
concise code that is easily editable, as well as better use of resources compared to the
first approach. As a result of this optimization, any change in the coefficient values causes the
utilization, and thereby the performance, to change slightly.
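
When coefficients change often, the ROM contents for an array declaration can be generated by a script instead of being typed by hand. The Python sketch below prints a VHDL constant for one tap. The constant name, the array type name `rom_16x18`, and the layout of the generated text are assumptions for illustration and are not taken from xapp219.zip.

```python
def vhdl_rom_constant(name, k_mag, word_bits=18):
    """Return VHDL text for a 16-entry ROM holding 0*k .. 15*k as bit strings.

    Assumes a type such as
        type rom_16x18 is array (0 to 15) of std_logic_vector(17 downto 0);
    is declared elsewhere (hypothetical name).
    """
    words = [format(n * k_mag, "0{}b".format(word_bits)) for n in range(16)]
    if any(len(w) != word_bits for w in words):
        raise ValueError("coefficient magnitude does not fit the ROM word")
    body = ",\n".join('    "{}"'.format(w) for w in words)
    return "constant {} : rom_16x18 := (\n{}\n);".format(name, body)


if __name__ == "__main__":
    print(vhdl_rom_constant("ROM_TAP0", 2))
```

The script stores the coefficient magnitude only, on the assumption that signs are handled outside the ROM by the 2's-complement modules.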

(Schematic not reproduced. Labeled elements: 2's-complement modules on the 16-bit input, on the 18-bit ROM output path, and on the 30-bit output; the sign_in, sign_coeff, and sign_coeff_reg signals; four muxes selecting among input bits A0 through A15 to form Addr[3:0]; a 16 x 18 ROM; a 2-bit counter with an enable input; registers clocked by Clk and Clkx4; and the 30-bit KCM Out.)

Figure 7: Multicycle KCM implementation

DLL or DCM

All of the Virtex devices have clock phase deskew and clock manipulation circuitry. In Virtex,
Virtex-E, and Virtex-EM devices this circuitry is called the Delay-Locked Loop (DLL). In Virtex-II
devices the Digital Clock Manager (DCM) is the clock management circuitry. As discussed
earlier, the multicycle KCM uses two clocks of frequency f1 and f2, where f1 = f2 / 4 or f2 = 4 x f1.
In Virtex-II devices only one DCM is required for either the 4x clock generation or for a
divided-by-4 clock output.


In Virtex, Virtex-E, and Virtex-EM devices there are two approaches to generate f1 and f2 using
DLLs:
1. One DLL with an input frequency of f2 can be used to generate the frequency f1 = f2 / 4,
using the clock division capability of the DLL.
2. Two DLLs can be cascaded together to obtain 4 x f1 = f2. The clock with frequency f1 would
be the input to the first DLL, and its output 2 x f1 would be the input to the second DLL.
Please refer to XAPP132 for DLL details.
The reference design is based on the first option using a single DLL. In this case, the data
streams at f2/4 and the KCM operates at f2. Alternatively, the second option can be used. The
selection must be based on the external clock and the input sample rate. The clock output from
the DLL is only valid after its lock signal is enabled. Similarly, in Virtex-II devices the DCM
outputs are valid only after its lock signal is active. The lock signal is also used in this design to
enable the 2-bit counter in the multicycle KCM.

Transposed Form FIR Filter Implementation

The complete filter is built by integrating the KCM multipliers, delay elements, and adders. The
transposed form FIR filter block diagram is shown in Figure 8, and a more detailed schematic
design with eight taps is shown in Figure 9. The precision of the filter is preserved at every tap
of the filter. The MSB from the corresponding KCM multiplier is sign-extended by one bit to
accommodate any sign overflow.

(Block diagram not reproduced. Samples S0, S1, S2 go to all taps simultaneously. The taps are re-ordered as Tap 7 ... Tap 1, Tap 0, with the 14-bit coefficients K2, K1, K0 shown, and the adder chain starts from "0". The system-level z^-1 delay is implemented by the registered adders.)

Transposed Form FIR filter output by cycle:
Cycle 1: (0) + S0k0
Cycle 2: (S0k1) + S1k0
Cycle 3: (S0k2 + S1k1) + S2k0

Figure 8: Transposed Form FIR Filter Block Diagram

The reference design implements the structural design shown in Figure 9. This design can be
further optimized by sharing the common resource of all the KCM multipliers. The 4-to-1 muxes
in the KCM multipliers are extracted and the adders are merged to optimize resources, as
shown in Figure 10. As before, each tap multiplier is implemented by a 16 x 18 ROM. Each tap
produces four 18-bit partial products at 4x clock frequency, rather than one 30-bit result in one
clock frequency. Four partial products need to be stored between the adder chain taps to
guarantee that only partial products with the same weighting are added together.


(Schematic not reproduced: Tap 7 through Tap 0 are driven by the 16-bit input, and the adder chain starts from "0".)

Figure 9: Transposed Form FIR Filter with Eight Taps

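(Schematic not reproduced. Labeled elements: the 16-bit input split into 4-bit nibbles; Tap 2, Tap 1, and Tap 0, each built from a 16 x 18 ROM acting as a 4 x 14 multiplier, with 14-bit coefficients k2 through k0 and 18-bit partial products; and four delays after the MAC in each tap.)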

Figure 10: Optimized Transposed Form FIR Filter


VHDL Reference Design

The reference design provided with this application note is ideal for asymmetric coefficients.
Depending on the targeted device, the design is implemented structurally by instantiating
KCMs and either a DLL or DCM. All the KCMs in the filter are identical, with different ROM
contents for each tap. Instead of defining four KCMs, a single KCM is defined with an option of
selecting different ROMs for each tap. The constant coefficients for eight taps are declared in
the package. This makes it easier to change the constants. Figure 11 shows simulation
waveforms of the reference design. The input is registered at the slower clock edge. The KCM
output is obtained after six clock cycles of the faster clock (f2), and the final filter output is
obtained after a latency of two clock cycles of the slower clock (f1 = f2 / 4).

(Waveform not reproduced. The traces show the clocks f2 and f1 (f2 / 4) and the signals below; the values visible in the figure are listed in order of appearance.)

Signal                   Values
Input Sample Data        0    3    8515    -28351    4588
Registered Input Data    0    3    8515    -28351    4588
KCM_out (Tap 0)          0    6    17030
KCM_out (Tap 1)          0    30   85150
KCM_out (Tap 2)          0    60   170300
KCM_out (Tap 3)          0    90   255450
KCM_out (Tap 4)          0    −90  −255450
KCM_out (Tap 5)          0    −60  −170300
KCM_out (Tap 6)          0    −30  −85150
KCM_out (Tap 7)          0    −6   −17030
fir_out                  0    6    17060

coefficients [Tap0 ... Tap7] = 2, 10, 20, 30, −30, −20, −10, −2

Figure 11: Simulation Waveform
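
The values in Figure 11 can be cross-checked with a few lines of Python. This sketch ignores clocking and latency and only reproduces the arithmetic; the function names are illustrative.

```python
COEFFS = [2, 10, 20, 30, -30, -20, -10, -2]   # Tap0 ... Tap7, from Figure 11


def kcm_outputs(sample):
    """Product of one input sample with every tap coefficient."""
    return [k * sample for k in COEFFS]


def fir_outputs(samples):
    """Filter output for each sample, assuming zero history before the first."""
    out = []
    for n in range(len(samples)):
        out.append(sum(COEFFS[i] * samples[n - i]
                       for i in range(len(COEFFS)) if n - i >= 0))
    return out


if __name__ == "__main__":
    print(kcm_outputs(3))
    print(kcm_outputs(8515))
    print(fir_outputs([3, 8515]))
```

For the input 3 the tap products are 6, 30, 60, 90 and their negatives, and for 8515 they are 17030, 85150, 170300, 255450 and their negatives. The first two filter outputs are 6 and 17060 (17030 + 30), which agrees with the fir_out trace.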

Synthesis Tool Results

The reference design was synthesized using different commercial synthesis tools. The results
are presented in Table 1. The filter has 8 taps, 16-bit inputs, and 14-bit signed coefficients, and was
targeted to one of the smaller members of the Virtex family, XCV100-TQ144. The input data
is sampled at one-quarter of the clock frequency listed in Table 1.

Table 1: Performance/Utilization Using XCV100-6TQ144

Synthesis Tool      Number of   Number of      Number of         Clock Frequency
                    Slices      4-input LUTs   Slice Registers   (Timing Report)
Synplify 6.0        584         931            755               70.78 MHz
FPGA Express 3.4    654         977            807               72.15 MHz
Exemplar 2000.1a    703         792            677               56.0 MHz
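
Since the input is sampled at one-quarter of the reported clock frequency, the corresponding input sample rates follow directly from Table 1. The short Python sketch below performs that division; the variable names are illustrative.

```python
# Clock frequencies in MHz, copied from Table 1
CLOCK_MHZ = {
    "Synplify 6.0": 70.78,
    "FPGA Express 3.4": 72.15,
    "Exemplar 2000.1a": 56.0,
}

# The KCM runs at f2 and the data streams at f1 = f2 / 4
SAMPLE_RATE_MSPS = {tool: f2 / 4 for tool, f2 in CLOCK_MHZ.items()}

if __name__ == "__main__":
    for tool, rate in SAMPLE_RATE_MSPS.items():
        print("{:18s} {:.4f} Msamples/s".format(tool, rate))
```

This gives roughly 17.7, 18.0, and 14.0 Msamples/s for the three tools.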


MATLAB FIR Filter Implementation

Three different FIR filters were implemented using the MATLAB tool to prove that the impulse
responses for the traditional-form FIR and the Transposed Form FIR filter were identical. Note
that the coefficients for these filters are symmetric. The reference design provided with this
application note does not realize an optimal implementation for symmetric coefficients.

Ideal FIR Filter


The Ideal FIR filter implemented in MATLAB is a full-precision floating point implementation
using the Equiripple FIR (Remez Algorithm).
Ideal FIR filter coefficients:
H(z) = [ 0.0112 –0.1308 0.0390 0.5236 0.5236 0.0390 –0.1308 0.0112 ]
where F3dB = 4 MHz, F20dB = 6 MHz, FS = 16 MHz
The impulse response for this filter is shown in Figure 12.


Figure 12: Ideal FIR Filter Impulse Response

Traditional FIR Filter


The fixed-point, traditional-form FIR filter was implemented using the Xilinx System Generator
tool, which is a Simulink blockset. It is a signed, single-rate CoreGen filter.
The fixed-point FIR filter coefficients:
H(z) = [ 0.112 –1.308 0.390 5.236 5.236 0.390 –1.308 0.112 ]


Quantization is 14 bits with the binary point at the 10th bit. The impulse response for this filter
is shown in Figure 13. The quantization error for this filter is shown in Figure 14.


Figure 13: Traditional FIR Filter Impulse Response


Figure 14: Traditional FIR Filter Quantization Error

Transposed Form FIR Filter


The fixed-point Transposed Form FIR filter was also built with the Xilinx System Generator tool
using the math primitives in the Xilinx blockset. It is a signed, single-rate filter.
Fixed-point Transposed Form FIR filter coefficients:


H(z) = [ 0.1123 –1.308 0.3896 5.236 5.236 0.3896 –1.308 0.1123 ]


The slight discrepancy in the coefficients is due to quantizing these coefficients. As can be
seen, this has no impact on the quantization error or impulse response for this filter.
Quantization is 14 bits with the binary point at the 10th bit. The impulse response for this filter is
shown in Figure 15. The quantization error for this filter is shown in Figure 16.
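
The effect of quantization on the coefficients can be illustrated in Python. The sketch assumes that "binary point at the 10th bit" means 10 fractional bits within a 14-bit signed word, and that values are rounded to the nearest step; both are assumptions, and the function name is hypothetical.

```python
TOTAL_BITS = 14
FRAC_BITS = 10            # assumed: 10 bits to the right of the binary point


def quantize(c):
    """Return (integer code, quantized value) for coefficient c."""
    code = round(c * (1 << FRAC_BITS))
    lo = -(1 << (TOTAL_BITS - 1))
    hi = (1 << (TOTAL_BITS - 1)) - 1
    if not lo <= code <= hi:
        raise ValueError("coefficient does not fit in the signed word")
    return code, code / (1 << FRAC_BITS)


if __name__ == "__main__":
    for c in (0.112, -1.308, 0.390, 5.236):
        code, value = quantize(c)
        print("{:7.3f} -> code {:6d} -> {:.4f}".format(c, code, value))
```

Under these assumptions, 0.112 and 0.390 quantize to approximately 0.1123 and 0.3896, which is consistent with the small differences between the two coefficient lists.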


Figure 15: Transposed Form FIR Filter Impulse Response


Figure 16: Transposed Form FIR Filter Quantization Error


Conclusion

FIR filters are commonly used in DSP applications. The FIR filters implemented in Virtex,
Virtex-E, Virtex-EM, Virtex-II and Spartan-II FPGAs provide the designer tremendous flexibility
in terms of the number of filter taps and changes in existing coefficients. It may be necessary to
"tune" a filter in an existing system, or to have multiple filter settings. The reconfigurability of
FPGAs is exploited by making necessary coefficient changes in the synthesizable HDL code. In
a KCM, the coefficients are constant; therefore, they are stored as partial products in ROM
elements that are implemented in function generators or LUTs. This implementation permits
any coefficient values to be programmed into the same logic, thereby reducing the impact on
place and route or performance. The HDL reference design provided with this application note
is easily modified to achieve specific requirements.

Revision History

The following table shows the revision history for this document.

Date        Version   Revision
9/21/00     1.0       Initial Xilinx release.
01/10/01    1.1       Addition of Virtex-II series and updates.

Xilinx CORE Generator(TM)

Xilinx CORE Generator System



Overview
The Xilinx CORE Generator System generates and delivers parameterizable cores optimized for Xilinx
FPGAs. Use the Xilinx CORE Generator System to design high-density Xilinx FPGA devices and
achieve high performance results while also cutting your design time.

The CORE Generator is included with the Xilinx Foundation, Foundation ISE and Alliance Series
software and comes with an extensive library of Xilinx LogiCOREs, including DSP functions,
memories, storage elements, math functions and a variety of basic elements. Also included is
information on over 90 AllianceCORE functions.

CORE Generator Features


● Simple, intuitive operation – Select a core, Enter parameters, and Generate!
● Cores are delivered with an optimally floorplanned layout
❍ Performance is independent of FPGA device size
❍ Performance stays constant as more cores are added
● Detailed functional descriptions and timing diagrams as well as performance and utilization
information provided for each core
● Compatible with VHDL, Verilog, and Schematic top-level design flows
● Verilog and VHDL behavioral simulation support
● Ready access to intellectual property from Xilinx and Xilinx partners
● Predictable & repeatable results – core layout is specified up front

● Supported on both PC and Workstation platforms

CORE Generator Benefits


● Core performance and utilization comparable to the best expert, hand-packed design
● Faster time-to-market
❍ Spend less engineering time and effort by using pre-designed, pre-verified
cores that can be customized "on-the-fly" to your requirements
❍ Enjoy fast core generation with proprietary Xilinx software
❍ Reduce place and route time with pre-placed Cores
● Facilitates design reuse
❍ Build larger, more complex designs faster with cores!
❍ Reduce design documentation requirements by using larger parameterizable building blocks
❍ Use the Xilinx IP Capture Tool to integrate your IP into the CORE Generator
❍ Use the IP Capture Tool to package and share your IP on your company's intranet
● Optimal core layout produces lower power dissipation
Xilinx Smart-IP technology produces cores with predictable performance. Core performance is
independent of Xilinx FPGA device size and number of cores instantiated, even in large devices. Xilinx
Smart-IP technology guarantees that there is no routing interference between multiple cores or between
cores and other logic.
No Surprises! – The predictable and repeatable performance of CORE Generator cores allows large
FPGA designs to maintain target clock speeds as the design process proceeds. If it is necessary to move to
a larger device, the core performance does not change.
The core generation process fabricates the logic for the core, partitions it into configurable logic blocks
(CLBs), and then places the CLBs relative to each other. CLB level floorplanning is what makes Xilinx
LogiCORE performance so predictable. The relative placement of CLBs making up a core is maintained
as the core is integrated into the overall design and placed anywhere in the FPGA.

Xilinx IP Flow

CORE Generator Components


The CORE Generator contains a library of LogiCORE parameterizable cores and AllianceCOREs, along
with data sheets for each core. LogiCOREs are designed and supported by Xilinx, while AllianceCOREs
are designed and supported by Xilinx AllianceCORE partners.

CORE Generator Interfaces


The CORE Generator CoreLINX interface allows you to bundle and "plug in" cores which your team
members may wish to share over the WEB. You can also interface the CORE Generator to system-level
tools with the CORE Generator batch mode interface.

Generating a Core
Enter your core parameters, then simply click on the Generate button. The output is an optimized CORE
for the targeted FPGA device, which includes the following files.
● A tailored Xilinx implementation netlist with complete relative
placement information to guarantee performance
● VHDL or Verilog instantiation code

● A symbol for schematic capture tools

CORE Generator Platform Support


The CORE Generator supports Windows 98, Windows NT, Windows 2000, and Solaris 2.5 and 2.6
operating systems for PC and Workstation compatibility. No security keys are required.

New Cores and Updates


New plug-in cores that are not already bundled with the CORE Generator can be downloaded from the IP
Center web site, or added to the CORE Generator through the CORE Generator IP Capture Tool.



CORE Solutions Documents



Application Notes & Technical Papers
DSP
Modeling and Implementation of DSP FPGA Solutions 4/3/00

Multirate Filters and Wavelets: From Theory to Implementation 4/3/00

Filtering in the Wavelet Transform Domain 4/3/00

Real Time Image Rotation and Resizing Algorithms and Implementations 4/3/00

Wavelet Characteristics - What Wavelet Should I Use 10/15/99

From Fourier Transform to Wavelet Transform - Basic Concepts 10/15/99

Configurable Logic for Digital Signal Processing 4/99

FPGA Implementation of Adaptive Temporal Kalman Filter for Real Time Video Filtering 3/99

FPGA Implementation of a Nonlinear Two Dimensional Fuzzy Filter 3/99

High-Performance FPGA Filters Using Sigma-Delta Modulation Encoding 3/99

Constant Coefficient Multipliers for the XC4000E (100 KB) 3/99

A Guide to Using Field Programmable Gate Arrays (FPGAs) for Application-Specific DSP Performance (160 KB) 12/98

Building High Performance FIR Filters Using KCMs (20 KB) 12/98

Implementing Area Optimized Narrow-Band FIR Filters Using Xilinx FPGAs 11/98

Minimum Multiplicative Complexity Implementation of the 2-D DCT using Xilinx FPGAs 11/98

Computing Multidimensional DFTs Using Xilinx FPGAs 10/98

FPGA Interpolators Using Polynomial Filters 10/98

Issues on Medical Image Enhancement 9/98


Using Xilinx FPGAs to Design Custom DSPs (26 KB) 2/98
Using Programmable Logic to Accelerate DSP Functions (200 KB) 3/98


Block Adaptive Filter (100 KB) 3/98

FPGAs and DSP (50 KB) 3/98

The Fastest FFT in the West (70 KB) 3/98

The Fastest Filter in the West (30 KB) 3/98

The Role of Distributed Arithmetic in FPGA-based Signal Processing (130 KB) 3/98
Datasheets and Data Books
PCI-X/PCI
PCI-X 64/66 Virtex-E Interface Data Sheet 6/8/01

PCI64 and PCI32 Virtex/Spartan-II Interface Data Sheet 6/8/01

PCI64 Virtex/Spartan-II Interface Data Sheet 6/8/01

PCI32 Virtex/Spartan-II Interface Data Sheet 6/8/01

PCI32 Spartan XL Data Sheet 3/22/99

PCI32 Spartan Master & Slave Interfaces Data Sheet 5/18/98

LogiCORE PCI32 4000XLA Data Sheet 3/22/99

PCI32 Design Kit Data Sheet 5/18/98

Xilinx PCI May 1999 Data Book, (4,500 KB) 5/21/99

Synthesizable PCI Bridge Data Sheet 11/1/98

Ballyinx PCI64 Board Data Sheet 1/14/00

HotPCI Spartan Prototyping Board Data Sheet 5/18/98

Driver::Works Windows Device Driver Development Kit Version 2.0 5/18/98

VtoolsD Windows Device Driver Development Kit Version 3.0 5/18/98


Documentation
PCI-X/PCI
PCI-X v5.0 Design Guide (Build_010) 6/6/01

PCI-X v5.0 Implementation Guide (Build_010) 6/6/01

PCI v3.0 Design Guide (Build_067) 4/15/01

PCI v3.0 Implementation Guide (Build_067) 4/15/01


Product Brochures
Xilinx DSP Product Brochure (189 KB) 03/98

