SYSC5603 (ELG6163) Digital Signal Processing Microprocessors, Software and Applications
Miodrag Bolic
Outline
• Introduction to the course
• Computer architectures for signal processing
• Design cycle
Course Outline
Hardware
• DSP Systems, A/D and D/A converters
• Architectural Analysis of a DSP Device, TMS320C6x, TigerSharc, Blackfin
• FPGA for signal processing (Altera, Xilinx)
• Application domain specific instruction set processors
• SoC, DSP Multiprocessors
• Signal processing arithmetic units
Algorithm design and transformations
• Scheduling, Resource Allocation, Synthesis
• Finite-word length effects
• Algorithmic transformations
• FIR filter design
• FFT design
• IIR filter design
• Adaptive filter design
Course Conduct
• Course notes will be posted on the course web page
• Assignments with solutions will be provided and will not
be graded
• There is no text-book
• The exam will be prepared based on lecture slides,
references and assignments
Paper Analysis and Presentation
• Topics are related to the studied material
• Each student will present for 15 minutes
• Discussion will follow after the presentation
• Each student has to choose one topic before January 16th at 7pm.
• Each student has to send a document (8-10 pages, 12-point font, single spaced) three days before the presentation.
• The document has to be revised after my comments
• 15 presentation slides max (10 minutes, 15min max)
• The mark is 50% document, 50% presentation
• A preliminary time schedule is given on the course web page.
This time schedule will be updated on January 16th
• Your reports will be posted on the course Web page. Please see the
paper on plagiarism: How to Handle Plagiarism: New Guidelines
Presentation topics - Computer architectures
• Configurable processors for DSP applications
– The analysis of processors with configurable instruction sets.
Analysis of the tools. Include Tensilica, Altera and Coware
solutions (Lisatek). An example of existing designs using
configurable processors.
• Multiprocessors for DSP
– Analysis of papers including [Kumar05] and [Wiangtong05].
Analysis of current hardware solutions. Analysis of tools
including CMPWARE. An example of existing designs using
multi-processors.
• IP core design
– Current standards related to IP core design. Standard buses
used for IP cores. Advantages and disadvantages of hard and
soft IP cores. DSP processor cores. DSP hardware cores.
Presentation topics - Tools
• Design space exploration tools
– The analysis of the tools for design space exploration. Simulink based
tools AccelChip vs. C-based tools (Coware). Performance and
differences.
• Direct mapping from algorithms to hardware
– Analysis of different tools (Simulink, Synopsys System Studio,
CoWare's SPW 5-XP) and design processes used for automated
implementation of signal processing algorithms to FPGA. Analysis of
quality and speed of these automated implementations.
• Comparison between Handel-C, SpecC and SystemC
– What are the main differences between these languages? Which language should
be used for which application? Which of these languages have full
support from algorithm design to implementation (for example, the
Synopsys SystemC solution)?
• Tools for the analysis of the optimal-word length
– Analyze the tools for floating-point to fixed-point conversion. Compare solutions
from Mathworks, Synopsys and AccelChip.
• TI standard for writing algorithms - eXpressDSP Algorithm
Presentation topics - Applications
• Software-defined radio
– Analysis of signal processing algorithms used for software defined
radios. Computer architectures for software defined radios. List of
commercial platforms and development tools.
• Signal processing for wireless sensor networks
– Analysis of signal processing algorithms used for wireless sensor
networks: positioning, tracking, data fusion, sensor processing. Analysis
of DSP architectures used in sensor networks. Specifics of algorithm
designs for wireless sensor networks.
• Tracking applications
– Detailed analysis of different tracking and navigation applications,
including aircraft positioning, target tracking for radar and sonar
applications, car collision detection, and positioning and tracking in
homeland security applications. Define the requirements for each
application, such as sampling rate, accuracy, latency and range. Discuss
the algorithms and the hardware platforms used for each
application.
Project
• Project proposals are expected by February 6th.
• Deadline for project demonstration: March 31
• Deadline for project report: March 27
• Grade: 20% Project Proposal, 20% Project Report, 20% Project
Presentation, 40% Demonstration
Report: A type-written, hardcopy project report, as well as an electronic version (including source code and design files
developed), is to be submitted at the end of the semester. The length of the report is not restricted. However, the
report must include the following sections:
• Introduction: Motivation and background.
• Main body of report. Depending on the type of project, this part may include the method used, approaches taken, problem
description, etc.
• Conclusion and discussion: Highlight your achievements in this project and what may be done in the future.
Copied from http://homepages.cae.wisc.edu/~ece734/project/index.html
Course Objectives … To
• Understand tradeoffs in implementing DSP algorithms
• Know basic DSP architectures
• Know some reduced complexity strategies for algorithms
mainly on FPGA.
Why this course?
There is a demand to derive more information per signal.
“More” means
• Faster: Derive more information per unit time;
– Faster hardware
– Newer algorithms with fewer operations
• Cheaper: Derive information at a reduced cost in
processor size, weight, power consumption, or dollars;
• Better: Derive higher quality information (higher
precision, finer resolution, higher signal-to-noise ratio)
[Richards04]
Hardware and software elements
Progress in signal processing capability is the product of
progress in IC devices, architectures, algorithms and
mathematics.
[Richards04]
Moore’s Law
http://www.icknowledge.com/trends/uproc.html
What is Signal Processing?
• Ways to manipulate signal in its original medium or an abstract representation.
• Signal can be abstracted as functions of time or spatial coordinates.
• Types of processing:
– Transformation
– Filtering
– Detection
– Estimation
– Recognition and classification
– Coding (compression)
– Synthesis and reproduction
– Recording, archiving
– Analyzing, modeling
Copied from [Hu04-Slides] Design and Implementation of Signal Processing Systems: An Introduction
Digital Signal Processing
• Signals generated via physical phenomenon are analog in that
– Their amplitudes are defined over the range of real/complex numbers
– Their domains are continuous in time or space.
• Digital signal processing concerns processing signals using digital computers.
– A continuous time/space signal must be sampled to yield countable signal samples.
– The real-(complex) valued samples must be quantized to fit into internal word length.
[Hu04-Slides]
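The two steps named on this slide, sampling and quantization, can be sketched in a few lines of Python. This is only an illustration: the 1 kHz test tone, the 8 kHz sampling rate, the 8-bit word length, and the function name `sample_and_quantize` are assumptions chosen for the example, not values from the course.

```python
import math

def sample_and_quantize(signal, fs, duration, bits):
    """Sample signal(t) at fs Hz for `duration` seconds, then quantize
    each sample to a signed two's-complement code of `bits` bits.
    The input is assumed to lie roughly in [-1, 1)."""
    n_samples = int(round(fs * duration))
    q_max = 2 ** (bits - 1) - 1
    q_min = -(2 ** (bits - 1))
    codes = []
    for n in range(n_samples):
        t = n / fs                                  # sampling: t = n / fs
        code = int(round(signal(t) * 2 ** (bits - 1)))  # quantization
        codes.append(max(q_min, min(q_max, code)))  # clip to the word length
    return codes

def tone(t):
    """Hypothetical analog input: a 1 kHz sine wave."""
    return math.sin(2 * math.pi * 1000.0 * t)

codes = sample_and_quantize(tone, fs=8000.0, duration=0.001, bits=8)
print(codes)
```

Note that the positive peak of the sine is clipped to 127 because an 8-bit signed word cannot represent +128.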
Signal Processing Systems
[Figure: block diagram of an A/D converter, a Digital Signal Processing block, and a D/A converter connected in series]
[Hu04-Slides]
Stratix DSP Development Board
[Figure: board with labeled components: Nios Expansion Prototype Connector, MAX 7000 Device, Prototyping Area, D/A Converters, Mictor-Type Connectors for HP Logic Analyzers, A/D Converters, Analog SMA Connectors, 40-Pin Connectors for Analog Devices, Texas Instruments Connectors on Underside of Board]
[AlteraDSP]
Example DSP Applications….
Sound/Modem/Fax Cards
Cellular Phones
Speaker Phones CONSUMER
Video Conferencing
Radar Detectors
ATMs
Power Tools
Digital Audio / TV
DSP Music Synthesizers
Toys / Games
INSTRUMENTATION
Answering Machines
Spectrum Analyzers
Digital Speakers
Seismic Processors
Digital Oscilloscopes
Mass Spectrometers
MILITARY
Secure Communications
Sonar Processing
Image Processing
Radar Processing
Navigation, Guidance
MEDICAL
Patient Monitoring
Ultrasound Equipment
Diagnostic Tools
Fetal Monitors
Life Support Systems
Image Enhancement
INDUSTRIAL/CONTROL
Robotics
Numeric Control
Power Line Monitors
Motor/Servo Control
www.analog.com/dsp
Implementation of DSP Systems
• Platforms:
– Native signal processing (NSP) with general purpose processors (GPP)
   • Multimedia extension (MMX) instructions
– Programmable digital signal processors (PDSP)
– Application-Specific Integrated Circuits (ASIC)
– Field-programmable gate array (FPGA)
• Requirements:
– Real time
   • Processing must be done before a pre-specified deadline.
– Streamed numerical data
   • Sequential processing
   • Fast arithmetic processing
– High throughput
   • Fast data input/output
   • Fast manipulation of data
[Hu04-Slides]
How Fast is Enough for DSP?
• Real time requirements:
– Example: data capture speed must match sampling rate. Otherwise, data will be lost.
– Processing must be done by a specific deadline.
• Different throughput rates for processing different signals
– Throughput ∝ sampling rate.
– CD music: 44.1 kHz
– Speech: 8-22 kHz
– Video (depends on frame rate, frame size, etc.): ranges from 100s of kHz to MHz.
[Hu04-Slides]
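One way to read the real-time requirement is as a cycle budget: all processing for one sample must finish before the next sample arrives. The sketch below computes that budget. The 300 MHz clock and the helper names are assumptions made for illustration; the sampling rates are the ones listed on the slide.

```python
def cycles_per_sample(clock_hz, sample_rate_hz):
    """Whole processor cycles available between two consecutive samples."""
    return clock_hz // sample_rate_hz

def meets_deadline(clock_hz, sample_rate_hz, cycles_needed):
    """True if the per-sample workload fits in the per-sample cycle budget."""
    return cycles_needed <= cycles_per_sample(clock_hz, sample_rate_hz)

CLOCK_HZ = 300_000_000  # assumed processor clock, for illustration only

for name, fs in [("CD music", 44_100), ("Speech", 8_000)]:
    print(name, cycles_per_sample(CLOCK_HZ, fs), "cycles per sample")
```

The higher the sampling rate, the fewer cycles are available per sample, which is why throughput requirements scale with the sampling rate.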
ASIC: Application Specific ICs
• Custom or semi-custom IC chip or chip sets developed for specific functions.
• Suitable for high volume, low cost productions.
• Example: MPEG codec, 3D graphic chip, etc.
• ASIC becomes popular due to availability of IC foundry services. Fabless design houses turn innovative design into profitable chip sets using CAD tools.
• Design automation is a key enabling technology to facilitate fast design cycle and shorter time to market delay.
[Hu04-Slides]
Programmable Digital Signal Processors
(PDSPs)
• Micro-processors designed for signal processing applications.
• Special hardware support for:
– Multiply-and-Accumulate (MAC) ops
– Saturation arithmetic ops
– Zero-overhead loop ops
– Dedicated data I/O ports
– Complex address calculation and memory access
– Real time clock and other embedded processing supports.
• PDSPs were developed to fill a market segment between GPP and ASIC:
– GPP flexible, but slow
– ASIC fast, but inflexible
• As VLSI technology improves, role of PDSP changed over time.
– Cost: design, sales, maintenance/upgrade
– Performance
[Hu04-Slides]
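Two of the hardware features listed above, multiply-and-accumulate and saturation arithmetic, can be modeled in software to show what they do. The following is a behavioral sketch only, assuming a 32-bit accumulator and 16-bit data; the function names are hypothetical and do not correspond to any particular processor's instructions.

```python
def saturate(value, lo, hi):
    """Clamp value to [lo, hi] instead of letting it wrap around."""
    return max(lo, min(hi, value))

def wrap16(value):
    """16-bit two's-complement wraparound: the result without saturation."""
    return ((value + 32768) % 65536) - 32768

def mac_saturating(a, x, acc_bits=32):
    """Multiply-accumulate over two sequences with a saturating accumulator."""
    lo = -(1 << (acc_bits - 1))
    hi = (1 << (acc_bits - 1)) - 1
    acc = 0
    for ak, xk in zip(a, x):
        acc = saturate(acc + ak * xk, lo, hi)  # one MAC operation per pair
    return acc

# 30000 + 10000 does not fit in 16 bits:
print(saturate(30000 + 10000, -32768, 32767))  # clamps at the maximum
print(wrap16(30000 + 10000))                   # wraps to a negative value
```

Saturation keeps an overflowing result at the nearest representable value, which is usually far less harmful to a signal than the sign flip caused by wraparound.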
[Seshan98]
PDSP Market – By Company
[Figure: pie charts of PDSP market share by company: Texas Instruments, Motorola, Agere, Analog Devices, Other]
DSP Market – By Application
[Figure: pie chart of the DSP market by application: Wireless, Consumer, Multipurpose, Wireline, Computer, Automotive]
Computing using FPGA
• FPGA (Field programmable gate array) is a derivative of PLD (programmable logic devices).
• They are hardware configurable to behave differently for different configurations.
• Slower than ASIC, but faster than PDSP.
• Once configured, it behaves like an ASIC module.
• Use of FPGA
– Rapid prototyping: run at a fraction of ASIC speed without fab delay.
– Hardware accelerator: using the same hardware to realize different function modules to save hardware
– Low quantity system deployment
[Hu04-Slides]
Stratix EP1S10
Altera Corp., Stratix Module 2: Logic Structure & MultiTrack Interconnect, 2004.
IP Cores
• Processor cores
StarCore
– 16-bit fixed-point VLIW DSP core from Lucent/Motorola (Lucent established a company called “Agere” for its DSP business)
– First VLIW machine to target low-power applications
– Pipeline relatively simple
– Targeting 198 mW @ 300 MHz, 1.5 V
• Hardware cores
Altera DSP cores
– FIR Compiler
– IIR Compiler
– FFT/IFFT Compiler
– NCO Compiler
– Reed-Solomon Compiler
– Constellation Mapper/Demapper
– Viterbi Compiler
SoC (System-on-Chip)
• With the continuing scaling of modern IC devices, it is now possible to incorporate
– Micro-processor cores + ASIC function blocks
– Analog + digital components
– Computation + communication functions
– I/O, memory + processor
into the same chip to form a comprehensive “system”. Thus, the notion of System-on-chip (SoC)
• SoC uses intellectual properties (IPs) that are pre-designed modules.
• Designing SoC thus becomes a task of system integration.
• Challenge issues in SoC design:
– Interface among IPs from different vendors
– Verification of function
– Physical design challenges
[Hu04-Slides]
Design Issues
• Given a DSP application, which implementation option should be chosen?
• For a particular implementation option, how to achieve optimal design? Optimal in terms of what criteria?
• Software design:
– NSP, PDSP
– Algorithms are implemented as programs.
• Hardware design:
– ASIC, FPGA
– Algorithms are directly implemented in hardware modules.
• S/H Co-design: System level design methodology.
[Hu04-Slides]
Design Process Model
• Design is the process that links algorithm to implementation
• Algorithm
– Operations
– Dependency between operations determines a partial ordering of execution
– Can be specified as a dependence graph
• Implementation
– Assignment: Each operation can be realized with
   • One or more instructions (software)
   • One or more function modules (hardware)
– Scheduling: Dependence relations and resource constraints lead to a schedule.
[Hu04-Slides]
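A dependence graph and the partial ordering it induces can be made concrete with a small sketch. The code below stores the graph as a dictionary and computes an as-soon-as-possible (ASAP) schedule. It assumes unit-latency operations, unlimited resources, and an acyclic graph; the operation names are hypothetical labels for a three-term multiply-accumulate.

```python
def asap_schedule(deps):
    """deps maps each operation to the list of operations it depends on.
    Returns a dict: operation -> earliest time step at which it can run,
    assuming every operation takes one step and resources are unlimited."""
    step = {}

    def visit(op):
        if op not in step:
            # An operation runs one step after its latest predecessor.
            step[op] = 1 + max((visit(p) for p in deps[op]), default=-1)
        return step[op]

    for op in deps:
        visit(op)
    return step

# Hypothetical graph for y = a1*x1 + a2*x2 + a3*x3 (accumulated in order).
deps = {
    "mul1": [], "mul2": [], "mul3": [],
    "add1": ["mul1"],
    "add2": ["add1", "mul2"],
    "add3": ["add2", "mul3"],
}
print(asap_schedule(deps))
```

The multiplications are independent and can all run in step 0, while the additions form a chain, so the partial ordering forces them into consecutive steps.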
A Design Example …
Consider the algorithm:
$y = \sum_{k=1}^{n} a(k)\,x(k)$
Program:
y(0) = 0
For k = 1 to n Do
    y(k) = y(k-1) + a(k)*x(k)
End
y = y(n)
• Operations:
– Multiplication
– Addition
• Dependency
– y(k) depends on y(k-1)
– Dependence Graph:
[Figure: each pair a(k), x(k) feeds a multiplier (*); a chain of adders (+) accumulates the products, starting from y(0) and ending at y(n)]
[Hu04-Slides]
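The program on this slide translates almost line for line into Python. The sketch below keeps every partial sum y(k) so that the dependency of y(k) on y(k-1) stays visible; the function name `partial_sums` is an assumption made for the example.

```python
def partial_sums(a, x):
    """Return [y(0), y(1), ..., y(n)] with y(0) = 0 and
    y(k) = y(k-1) + a(k) * x(k); the final entry is the result y."""
    if len(a) != len(x):
        raise ValueError("a and x must have the same length")
    y = [0]                              # y(0) = 0
    for ak, xk in zip(a, x):
        y.append(y[-1] + ak * xk)        # one multiplication, one addition
    return y

y = partial_sums([1, 2, 3], [4, 5, 6])
print(y[-1])  # the inner product of the two sequences
```

Each loop iteration performs exactly one multiplication and one addition, matching the two operation types listed on the slide.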
Design Example cont’d …
• Software Implementation:
– Map each * op. to a MUL instruction, and each + op. to an ADD instruction.
– Allocate memory space for {a(k)}, {x(k)}, and {y(k)}
– Schedule the operations by sequentially executing y(1)=a(1)*x(1), y(2)=y(1) + a(2)*x(2), etc.
– Note that each instruction is still to be implemented in hardware.
• Hardware Implementation:
– Map each * op. to a multiplier, and each + op. to an adder.
– Interconnect them according to the dependence graph:
[Figure: each pair a(k), x(k) feeds a multiplier (*); a chain of adders (+) accumulates the products, starting from y(0) and ending at y(n)]
[Hu04-Slides]
Observations
• Eventually, an implementation is realized with hardware.
• However, by using the same hardware to realize different operations at different times (scheduling), we have a software program!
• Bottom line – Hardware/software co-design. There is a continuum between hardware and software implementation.
• A design must explore both simultaneously to achieve best performance/cost trade-off.
[Hu04-Slides]
A Theme
• Matching hardware to algorithm
– Hardware architecture must match the characteristics of the algorithm.
– Example: ASIC architecture is designed to implement a specific algorithm, and hence can achieve superior performance.
• Formulate algorithm to match hardware
– Algorithms must be formulated so that they can best exploit the potential of the architecture.
– Example: GPP, PDSP architectures are fixed. One must formulate the algorithm properly to achieve best performance, e.g., to minimize the number of operations.
[Hu04-Slides]
Algorithm Reformulation
• Algorithmic level equivalence
– Different filter structures implementing the same specification
• Exploiting parallelism
– Regular iterative algorithms and loop reformulation
• Well studied in parallel compiler technology
– Signal flow/Data flow representation
• Suitable for specification of pipelining
[Hu04-Slides]
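As an illustration of algorithmic level equivalence, the sketch below implements one FIR filter in two structures: the direct form, which delays the input samples, and the transposed form, which delays partial sums. Both are standard textbook structures; the function names and the test coefficients are assumptions made for the example.

```python
def fir_direct(h, x):
    """Direct form: y[n] = sum_k h[k] * x[n-k], with a delay line on the input."""
    delay = [0] * len(h)
    out = []
    for sample in x:
        delay = [sample] + delay[:-1]          # shift the new sample in
        out.append(sum(hk * dk for hk, dk in zip(h, delay)))
    return out

def fir_transposed(h, x):
    """Transposed form: the registers hold partial sums, not past inputs."""
    state = [0] * (len(h) - 1)
    out = []
    for sample in x:
        out.append(h[0] * sample + (state[0] if state else 0))
        # Each register takes the next register's old value plus a new product.
        for i in range(len(state)):
            nxt = state[i + 1] if i + 1 < len(state) else 0
            state[i] = h[i + 1] * sample + nxt
    return out

h = [1, 2, 3]
x = [1, 2, 3, 4, 5]
print(fir_direct(h, x))
print(fir_transposed(h, x))
```

The two functions produce identical outputs, but the transposed form places only one adder between registers, which is why it is often preferred when the filter is to be pipelined in hardware.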
Mapping Algorithm to Architecture
• Scheduling and Assignment Problem
– Resources: hardware modules, and time slots
– Demands: operations (algorithm), and throughput
• Constrained optimization problem
– Minimize resources (objective function) to meet demands
(constraints)
• For regular iterative algorithms and regular processor
arrays -> algebraic mapping.
[Hu04-Slides]
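The scheduling and assignment problem can be illustrated with a greedy list scheduler, one common heuristic for it (not an optimal solver). The sketch assumes unit-latency operations, an acyclic dependence graph, and at least one module of every resource type; the function name, operation names, and resource names are hypothetical.

```python
def list_schedule(deps, kind, capacity):
    """Greedy list scheduling.
    deps: op -> list of predecessor ops
    kind: op -> resource type needed by the op
    capacity: resource type -> modules available in each time slot
    Returns (start_time_per_op, total_number_of_time_slots)."""
    finish = {}
    remaining = list(deps)
    step = 0
    while remaining:
        free = dict(capacity)                # modules still free in this slot
        started = []
        for op in remaining:
            ready = all(p in finish and finish[p] <= step for p in deps[op])
            if ready and free[kind[op]] > 0:
                free[kind[op]] -= 1          # assign the op to a module
                started.append(op)
        for op in started:
            finish[op] = step + 1
            remaining.remove(op)
        step += 1
    return {op: t - 1 for op, t in finish.items()}, step

deps = {"mul1": [], "mul2": [], "mul3": [],
        "add1": ["mul1"], "add2": ["add1", "mul2"], "add3": ["add2", "mul3"]}
kind = {op: ("mul" if op.startswith("mul") else "add") for op in deps}

one_mul, len_one = list_schedule(deps, kind, {"mul": 1, "add": 1})
three_mul, len_three = list_schedule(deps, kind, {"mul": 3, "add": 1})
print(len_one, len_three)
```

For this graph, both resource budgets give the same schedule length, because the chain of additions is the critical path. Extra multipliers would therefore add cost without improving throughput, which is the kind of trade-off the constrained optimization is meant to expose.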
Implementation process for PDSP
[Wiangtong05]
Direct Mapping Techniques
[Wiangtong05]
FIR Filters
[DSPPrimer-Slides]
Transposed FIR Filter
[DSPPrimer-Slides]
Example: One-to-one mapping and pipelining
[Figure: operations A, B, C, D shown after allocation, assignment, and pipelining; in the pipelined version, clocked flip-flops (ff) driven by a common clock separate the stages]
• Analyse timing
– if OK then stop
– else pipelining
[Meerbergen-Slides]
Coware SPW Design Flow
www.coware.com
System-level design flow: Simulink-Altera
[AlteraDSP]
Arithmetic
• CORDIC
– Compute elementary functions
• Distributed arithmetic
– ROM based implementation
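As an example of how CORDIC computes elementary functions, the sketch below evaluates sine and cosine in rotation mode. It uses floating-point arithmetic for readability; a hardware version would typically use fixed-point shifts and adds with the angle table stored in ROM. The function name and the iteration count are assumptions, and the input angle is assumed to lie within about ±π/2.

```python
import math

def cordic_sin_cos(theta, iterations=16):
    """Rotation-mode CORDIC: returns (sin(theta), cos(theta)).
    Each iteration rotates by +/- atan(2**-i), which in hardware needs
    only shifts and additions."""
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Pre-compensate the constant gain introduced by the micro-rotations.
    gain = 1.0
    for i in range(iterations):
        gain /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = gain, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0          # rotate toward zero residual angle
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return y, x

s, c = cordic_sin_cos(0.5)
print(s, math.sin(0.5))
print(c, math.cos(0.5))
```

Each additional iteration contributes roughly one more bit of accuracy, so the iteration count is a direct precision versus latency trade-off.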
Floating to fixed point analysis
• Overflow of the number range
– Large errors in the output signal occur when the available number
range is exceeded (overflow).
• Round-off errors
– Rounding or truncation of products must be done in recursive loops
so that the word length does not increase for each iteration.
• Coefficient errors
– Coefficients can only be represented with finite precision.
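The three error sources above can each be reproduced with a few lines of integer arithmetic. The sketch assumes a 16-bit word with 15 fractional bits (often called Q15); the helper names are hypothetical and the values are chosen only to make each effect visible.

```python
def to_fixed(value, frac_bits):
    """Round a real number to the nearest multiple of 2**-frac_bits."""
    return int(round(value * (1 << frac_bits)))

def to_float(code, frac_bits):
    """Convert a fixed-point code back to a real number."""
    return code / (1 << frac_bits)

def wrap(code, word_bits):
    """Two's-complement overflow: keep only the low word_bits bits."""
    half = 1 << (word_bits - 1)
    return ((code + half) % (1 << word_bits)) - half

def fixed_mul(a, b, frac_bits):
    """Multiply two codes and round the product back to frac_bits,
    so the word length does not grow with every multiplication."""
    product = a * b                       # carries 2 * frac_bits fractional bits
    return (product + (1 << (frac_bits - 1))) >> frac_bits

FRAC = 15  # assumed Q15 format

# Coefficient error: 0.3 has no exact Q15 representation.
coeff = to_fixed(0.3, FRAC)
print(to_float(coeff, FRAC) - 0.3)

# Overflow: 0.75 + 0.5 exceeds the range [-1, 1) and wraps around.
total = wrap(to_fixed(0.75, FRAC) + to_fixed(0.5, FRAC), 16)
print(to_float(total, FRAC))

# Round-off: the product is rounded back to the original word length.
print(to_float(fixed_mul(to_fixed(0.5, FRAC), to_fixed(0.5, FRAC), FRAC), FRAC))
```

In this format, the sum 0.75 + 0.5 wraps to -0.75, which shows why overflow errors are described as large compared with round-off and coefficient errors.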
References