UNIT2 Notes
UNIT-II
Programmable Digital Signal Processors
1. Introduction:
In Unit 1, we studied the basic architecture and algorithms of DSP
processors. Now is the time to dig deeper!
Leading manufacturers of integrated circuits such as Texas Instruments (TI),
Analog Devices, and Motorola manufacture digital signal processor (DSP)
chips. These manufacturers have developed a range of DSP chips of varying
complexity.
TI DSP processors are of interest to us in this course. The others are similar:
once we have studied and understood one family, it is easy to relate it to the
others.
The TMS320 family consists of two types of single-chip DSPs: 16-bit fixed-point
and 32-bit floating-point.
These DSPs possess the operational flexibility of high-speed controllers and
the numerical capability of array processors.
Comparison:
Table: Summary of the architectural features of three fixed-point DSPs
3. TMS320C54xx Architecture:
Bus Structure:
The ’54xx CPU is common to all ’54xx devices. It contains:
A 40-bit arithmetic logic unit (ALU);
Two 40-bit accumulators (ACCA and ACCB);
A barrel shifter;
A 17 x 17-bit multiplier;
A 40-bit adder;
A compare, select and store unit (CSSU);
An exponent encoder (EXP);
A data address generation unit (DAGEN); and
A program address generation unit (PAGEN).
The ALU performs 2’s complement arithmetic operations and bit-level Boolean
operations on 16, 32, and 40-bit words.
It can also function as two separate 16-bit ALUs and perform two 16-bit operations
simultaneously as shown in the functional diagram.
Fig: Functional diagram of the central processing unit of the TMS320C54xx processors
Accumulators:
The accumulators, ACCA and ACCB, store the output from the ALU or the multiplier
/ adder block.
The accumulators can also provide a second input to the ALU or the multiplier /
adder.
The bits in each accumulator are grouped as follows:
o Guard bits (bits 32–39)
o A high-order word (bits 16–31)
o A low-order word (bits 0–15)
Instructions are provided for storing the guard bits, the high-order word, and the low-order
word of an accumulator in data memory, and for moving 32-bit accumulator words into or out
of data memory. Also, either accumulator can be used as temporary storage for the
other.
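The bit grouping above can be illustrated with a small Python model. This is only a sketch of the field layout (guard bits 32–39, high word bits 16–31, low word bits 0–15); the function names are made up for illustration and do not correspond to any TI tool or instruction.

```python
MASK40 = (1 << 40) - 1  # an accumulator holds 40 bits


def split_accumulator(acc):
    """Split a 40-bit accumulator value into (guard, high, low) fields."""
    acc &= MASK40
    guard = (acc >> 32) & 0xFF    # guard bits, bits 39-32
    high = (acc >> 16) & 0xFFFF   # high-order word, bits 31-16
    low = acc & 0xFFFF            # low-order word, bits 15-0
    return guard, high, low


def join_accumulator(guard, high, low):
    """Rebuild the 40-bit value from its three fields."""
    return ((guard & 0xFF) << 32) | ((high & 0xFFFF) << 16) | (low & 0xFFFF)


if __name__ == "__main__":
    g, h, l = split_accumulator(0x12_3456_789A)
    print(hex(g), hex(h), hex(l))
```

The guard bits give headroom: a running sum of 32-bit values can grow by up to 8 extra bits before it overflows the accumulator.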
Barrel Shifter:
The ’54x’s barrel shifter has a 40-bit input connected to the accumulator or data
memory (CB, DB) and a 40-bit output connected to the ALU or data memory (EB).
The barrel shifter produces a left shift of 0 to 31 bits and a right shift of 0 to 16 bits
on the input data.
The shift requirements are defined in the shift-count field (ASM) of ST1 or defined in
the temporary register (TREG), which is designated as a shift-count register.
This shifter and the exponent detector normalize the values in an accumulator in a
single cycle.
The least significant bits (LSBs) of the output are filled with 0s, and the most
significant bits (MSBs) can be either zero-filled or sign-extended, depending on the
state of the sign-extension mode bit (SXM) of ST1.
Additional shift capabilities enable the processor to perform numerical scaling, bit
extraction, extended arithmetic, and overflow prevention operations.
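The shifter behaviour described above can be sketched in Python. This is a simplified model, not a cycle-accurate one: it assumes a 40-bit input, takes a positive count as a left shift (0 to 31) and a negative count as a right shift (down to -16), and applies SXM only to the fill of the vacated MSBs. The function name `barrel_shift` is hypothetical.

```python
MASK40 = (1 << 40) - 1


def barrel_shift(value, shift, sxm):
    """Shift a 40-bit value left (shift > 0) or right (shift < 0).

    Vacated LSBs are zero-filled. On a right shift the vacated MSBs are
    sign-extended when sxm is 1 and zero-filled when sxm is 0.
    """
    if not -16 <= shift <= 31:
        raise ValueError("shift must be in the range -16..31")
    value &= MASK40
    if shift >= 0:
        return (value << shift) & MASK40
    n = -shift
    result = value >> n
    if sxm and (value >> 39) & 1:
        # Negative input with SXM = 1: fill the n vacated MSBs with 1s.
        result |= ((1 << n) - 1) << (40 - n)
    return result


if __name__ == "__main__":
    print(hex(barrel_shift(0x80_0000_0000, -4, sxm=1)))
    print(hex(barrel_shift(0x80_0000_0000, -4, sxm=0)))
```

Scaling by a power of two before an addition is the typical use: a right shift with sign extension divides a signed value while keeping its sign.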
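Multiplier / Adder:
The multiplier / adder performs a 17 x 17-bit 2’s complement multiplication with a
40-bit addition in a single instruction cycle.
One multiplier input is selected from TREG, a data-memory operand, or an accumulator;
the other is selected from program memory, data memory, an accumulator,
or an immediate value.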
The fast on-chip multiplier allows the ’54x to perform DSP operations such as
convolution, correlation, and filtering efficiently.
In addition, the multiplier and ALU together execute multiply/accumulate (MAC)
computations and ALU operations in parallel in a single instruction cycle.
This function is used in determining the Euclidean distance and in implementing
symmetrical and least mean square (LMS) filters, which are required for complex
DSP algorithms.
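A multiply/accumulate loop of the kind described above can be sketched in Python. The model assumes integer operands and a 40-bit accumulator that wraps on overflow (no saturation); the helper names `mac`, `fir_output`, and `squared_distance` are illustrative only and are not TI instructions or library calls.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a 2's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mac(acc, x, h):
    """One multiply/accumulate step: acc + x * h, wrapped to 40 bits."""
    return to_signed(acc + x * h, 40)


def fir_output(samples, coeffs):
    """One FIR filter output: the sum of products of samples and coefficients."""
    acc = 0
    for x, h in zip(samples, coeffs):
        acc = mac(acc, x, h)
    return acc


def squared_distance(a, b):
    """Squared Euclidean distance, accumulated with the same MAC step."""
    acc = 0
    for p, q in zip(a, b):
        d = p - q
        acc = mac(acc, d, d)
    return acc


if __name__ == "__main__":
    print(fir_output([1, 2, 3], [4, 5, 6]))
    print(squared_distance([1, 2], [4, 6]))
```

On the processor each loop iteration maps to one MAC instruction, so an N-tap filter output costs roughly N cycles.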
Registers:
BRAF: Block repeat active flag. BRAF = 0: the block repeat is deactivated. BRAF = 1:
the block repeat is activated.
CPL: Compiler mode. CPL = 0: the relative direct addressing mode using the data-page
pointer is selected. CPL = 1: the relative direct addressing mode using the stack
pointer is selected.
XF: Indicates the status of the external flag (XF) pin, which is a general-purpose output
pin. The SSBX instruction can set XF and the RSBX instruction can reset XF.
HM: Hold mode. Indicates whether the processor continues internal execution while it
acknowledges an active HOLD signal from the external interface.
INTM: Interrupt mode. It globally masks or enables all interrupts. INTM = 0: all
unmasked interrupts are enabled. INTM = 1: all maskable interrupts are disabled.
OVM: Overflow mode. OVM = 1: on overflow, the destination accumulator is set to either
the most positive or the most negative value. OVM = 0: the overflowed result is left
in the destination accumulator.
SXM: Sign-extension mode. SXM = 0: sign extension is suppressed. SXM = 1: data is
sign-extended.
C16: Dual 16-bit/double-precision arithmetic mode. C16 = 0: the ALU operates in
double-precision arithmetic mode. C16 = 1: the ALU operates in dual 16-bit arithmetic mode.
FRCT: Fractional mode. FRCT = 1: the multiplier output is left-shifted by 1 bit to
compensate for an extra sign bit. At reset this bit is 0.
CMPT: Compatibility mode. CMPT = 0: ARP is not updated in the indirect addressing
mode. CMPT = 1: ARP is updated in the indirect addressing mode.
ASM: Accumulator shift mode. The 5-bit ASM field specifies a shift value in the range
–16 through 15 and is coded as a 2’s complement value. Instructions with a
parallel store, as well as STH, STL, ADD, SUB, and LD, use this shift capability.
ASM can be loaded from data memory or by the LD instruction using a
short-immediate operand.
The register bit marked ZERO is reserved for future expansion and always reads as 0.
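To make the field descriptions above concrete, here is a minimal Python sketch that unpacks a 16-bit ST1 value. The bit positions are an assumption taken from the order of the fields in these notes (BRAF in bit 15 down to ASM in bits 4–0, with the reserved bit at position 10); check them against the device reference before relying on them. The function name `decode_st1` is hypothetical.

```python
# Assumed single-bit field positions in ST1 (bit 10 is the reserved ZERO bit).
ST1_BITS = {
    "BRAF": 15, "CPL": 14, "XF": 13, "HM": 12, "INTM": 11,
    "OVM": 9, "SXM": 8, "C16": 7, "FRCT": 6, "CMPT": 5,
}


def decode_st1(st1):
    """Return a dict of ST1 fields extracted from a 16-bit register value."""
    fields = {name: (st1 >> bit) & 1 for name, bit in ST1_BITS.items()}
    raw_asm = st1 & 0x1F
    # ASM is a 5-bit 2's complement value in the range -16..15.
    fields["ASM"] = raw_asm - 32 if raw_asm & 0x10 else raw_asm
    return fields


if __name__ == "__main__":
    for name, value in decode_st1(0x2900).items():
        print(name, value)
```

With the example value 2900h, the sketch reports XF, INTM, and SXM set and every other field clear.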
PMST Register:
The PMST register is loaded with memory-mapped register instructions such as STM.
IPTR: Interrupt vector pointer. It points to the 128-word program page where the
interrupt vectors reside. The 9-bit field takes values 0 to 1FFh, that is, 512 possible
pages; each vector is 4 words wide.
MP/MC: Microprocessor/microcomputer mode. MP/MC = 0: the on-chip ROM is
enabled. MP/MC = 1: the on-chip ROM is not enabled.
OVLY: RAM overlay. OVLY enables on-chip dual-access data RAM blocks to
be mapped into program space.
AVIS: It enables/disables the internal program address to be visible at the address
pins.
DROM: Data ROM, DROM enables on-chip ROM to be mapped into data space.
CLKOFF: CLOCKOUT off.
SMUL: Saturation on multiplication.
SST: Saturation on store.
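The role of IPTR can be illustrated with a short address calculation. The sketch below assumes that the 9-bit IPTR supplies the upper bits of the vector address and that each vector occupies four words, so that 32 vectors fill one 128-word page. The function name `vector_address` and the vector numbering used in the example are illustrative assumptions.

```python
def vector_address(iptr, vector_number):
    """Program address of an interrupt vector.

    iptr:          9-bit interrupt vector pointer from PMST (0..0x1FF)
    vector_number: index of the vector within the page (0..31)
    """
    if not 0 <= iptr <= 0x1FF:
        raise ValueError("IPTR is a 9-bit field")
    if not 0 <= vector_number <= 31:
        raise ValueError("a 128-word page holds 32 four-word vectors")
    # The page base is IPTR * 128; each vector is 4 words apart.
    return (iptr << 7) | (vector_number << 2)


if __name__ == "__main__":
    print(hex(vector_address(0x1FF, 0)))
    print(hex(vector_address(0x1FF, 16)))
```

Changing IPTR relocates the whole vector table at once, which is how the table can be moved from ROM into RAM after boot.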
General information:
In the C54x DSP, the data and program memories are organized in 16-bit words.
Data busses have a 16-bit width.
Data and instructions are generally of size N=16 bits.
Some instructions may take multiple of 16-bits.
Some data operands may be double precision and occupy 2 words.
Internal buses: 2 data read, 1 data write
External buses: data buses = 2; program bus = 1; result bus = 1
Data addressing modes provide various ways to access operands to execute
instructions and place results in the memory or the registers.
Immediate Addressing:
Instruction contains the value of the operand. Value is preceded by #.
Example: ADD #4, A ; add the value 4 to the contents of accumulator A.
Absolute Addressing: The instruction contains a specified 16-bit address as an operand.
The address can be in data, program, or I/O memory. Because the address is 16 bits
wide, instructions that encode absolute addresses are always at least two words in
length. There are four types of absolute addressing:
• Data-memory address (dmad) addressing, used by: MVDK Smem, dmad; MVDM dmad,
MMR; MVKD dmad, Smem; MVMD MMR, dmad (Smem is a single data-memory
operand).
– MVKD 1000h, *AR5 ; move data from data-memory address 1000h (source)
to the data-memory location pointed to by AR5.
• Program-memory address (pmad) addressing:
– MVPD 1000h, *AR7 ; move a word from program memory at
address 1000h to the data-memory location pointed to by
AR7.
• Port address (PA) addressing:
– PORTR 05h, *AR3 ; read a 16-bit value from the external I/O
port at address 05h into the data-memory location pointed to by
AR3.
• *(lk) addressing, which specifies a location in data space and is used with all
instructions that support a single data-memory (Smem) operand:
– LD *(1000h), A ; load accumulator A from the exact 16-bit address 1000h.
Direct Addressing: In direct addressing, the instruction contains the lower seven
bits of the data memory address (dma).
The 7-bit dma is an address offset that is combined with a base address, taken from
either the data-page pointer (DP) or the stack pointer (SP), to form a 16-bit
data-memory address.
Using this form of addressing, you can access any of 128 locations in random order
without changing the DP or the SP.
Direct addressing is not the only method of offset addressing. However, the
advantage of this mode is that it encodes each instruction and address into a
single word.
Either DP or SP can be combined with the dma offset to generate the actual address.
The compiler mode bit (CPL), located in status register ST1, selects which method is
used to generate the address:
When CPL = 0, the dma field is concatenated with the 9-bit DP field to form
the 16-bit data-memory address.
When CPL = 1, the dma field is added (positive offset) to SP to form the 16-
bit data-memory address.
The syntax for direct addressing uses a symbol or a number to specify the offset
value.
The 16-bit address of the data-memory location is formed by combining the base
address with the lower 7 bits of the data-memory address contained in the
instruction.
For example, to add the contents of the memory location SAMPLE to accumulator B,
provided that the correct base address is in DP (CPL = 0) or SP (CPL = 1), you would
write:
ADD SAMPLE, B
The lower seven bits of the address of SAMPLE are stored in the instruction
word.
LD #4, DP
ADD 0, B
When CPL = 0, the above sequence adds the contents of memory location 0 on
page 4 of data memory to accumulator B.
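The two ways of forming a direct address can be written out as a short Python sketch. It follows the rules stated above (concatenation with the 9-bit DP when CPL = 0, addition to SP when CPL = 1); the function name `direct_address` and its argument names are made up for illustration.

```python
def direct_address(dma, cpl, dp=0, sp=0):
    """Form the 16-bit data-memory address used in direct addressing.

    dma: offset taken from the instruction (only the low 7 bits are used)
    cpl: compiler mode bit from ST1
    dp:  9-bit data-page pointer, used when cpl == 0
    sp:  16-bit stack pointer, used when cpl == 1
    """
    offset = dma & 0x7F
    if cpl == 0:
        # DP supplies the upper 9 bits, the offset the lower 7 bits.
        return ((dp & 0x1FF) << 7) | offset
    # The offset is added to SP as a positive value.
    return (sp + offset) & 0xFFFF


if __name__ == "__main__":
    # Location 0 on page 4, as in the LD #4, DP example.
    print(hex(direct_address(0, cpl=0, dp=4)))
```

Each data page therefore covers 128 words, and page 4 starts at address 0200h.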
Two auxiliary register arithmetic units (ARAU0 and ARAU1) operate on the contents
of the auxiliary registers (AR’s). The ARAUs perform unsigned, 16-bit auxiliary
register arithmetic operations.
As the figure shows, the main components used for address generation in indirect
addressing are the auxiliary register arithmetic units (ARAU0 and ARAU1) and the
auxiliary registers (AR0–AR7).
You can modify the addresses you use in instructions before or after they are
accessed, or you can leave them unchanged.
You can modify them by incrementing or decrementing the address by 1, adding a 16-
bit offset (lk), or indexing with the value in AR0.
Table lists the types of single data-memory operand addressing with explanations.
In bit-reversed addressing mode, AR0 specifies one half of the size of the FFT.
The value contained in AR0 must be equal to 2^(N-1), where N is an
integer, and the FFT size is 2^N.
An auxiliary register points to the physical location of a data value.
When you add AR0 to the auxiliary register using bit-reversed
addressing, the address is generated in a bit-reversed fashion, with the
carry bit propagating from left to right, instead of the normal right to
left.
Used for FFT algorithms.
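The reverse-carry addition described above can be modelled in Python. This sketch assumes the data buffer starts at an address aligned to the FFT size, so only the low N address bits change; the function names are hypothetical, and the bit reversal is done explicitly rather than by special adder hardware.

```python
def reverse_bits(value, nbits):
    """Reverse the order of the low nbits bits of value."""
    result = 0
    for _ in range(nbits):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def reverse_carry_add(addr, index, nbits):
    """Add index to addr with the carry propagating from left to right."""
    mask = (1 << nbits) - 1
    base = addr & ~mask                      # upper address bits are unchanged
    low = reverse_bits(addr & mask, nbits)
    step = reverse_bits(index & mask, nbits)
    return base | reverse_bits((low + step) & mask, nbits)


def bit_reversed_sequence(base, fft_size):
    """Addresses visited when AR0 holds half the FFT size."""
    nbits = fft_size.bit_length() - 1        # fft_size is assumed to be 2^N
    ar0 = fft_size // 2
    addr, sequence = base, []
    for _ in range(fft_size):
        sequence.append(addr)
        addr = reverse_carry_add(addr, ar0, nbits)
    return sequence


if __name__ == "__main__":
    print(bit_reversed_sequence(0, 8))
```

For an 8-point FFT the sketch visits offsets 0, 4, 2, 6, 1, 5, 3, 7, which is the order needed to unscramble the output of a radix-2 FFT.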
Dual data-memory operand instructions are all one word long and operate in indirect
addressing mode only.
If the source operand and the destination operand point to the same location, in
instructions with a parallel store (for example, ST||LD), the source is read
before writing to the destination.
Only 2 bits are available in the instruction code for selecting each auxiliary register in
this mode.
Thus, just four of the auxiliary registers, AR2–AR5, can be used. The ARAUs,
together with these registers, provide the capability to access two operands in a single
cycle.
In each case, the content of the auxiliary register is used as the data-memory operand.
After using the address in the auxiliary register, the ARAUs perform the specified
mathematical operation.
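Memory-Mapped Register Addressing: The memory-mapped registers reside on data page 0,
so their addresses are generated by:
• Forcing the nine most significant bits (MSBs) of the data-memory address to 0,
regardless of the current value of DP or SP, when direct addressing is used
• Using the seven LSBs of the current auxiliary register value when indirect
addressing is used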
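Both rules above amount to keeping the seven LSBs and clearing the nine MSBs of a 16-bit address. A one-function Python sketch shows this; the name `mmr_address` is hypothetical, and the input is either the dma field (direct addressing) or the current auxiliary register value (indirect addressing).

```python
def mmr_address(value):
    """Address generated for a memory-mapped register access.

    The nine MSBs of the 16-bit address are forced to 0, so only the
    seven LSBs of the supplied value select the register.
    """
    return value & 0x007F


if __name__ == "__main__":
    # An auxiliary register holding FF25h still selects address 25h.
    print(hex(mmr_address(0xFF25)))
```

Because the upper bits are ignored, these accesses always land in the first 128 words of data memory without touching DP or SP.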
Stack Addressing: The stack is used to automatically store the PC during subroutine calls
and interrupts. It can also be used to store data values or context at the programmer’s
discretion.
The stack is filled from the highest to the lowest memory address. The processor uses
a 16-bit memory-mapped register, the stack pointer (SP), to address the stack. SP
always points to the last element stored on the stack.
Four instructions access the stack using the stack addressing mode:
PSHD pushes a data-memory value onto the stack.
PSHM pushes a memory-mapped register onto the stack.
POPD pops a data-memory value from the stack.
POPM pops a memory-mapped register from the stack.
Other operations also affect the stack and the stack pointer. The stack is used during
interrupts and subroutines to save and restore the PC contents.
The figure shows an example of the stack and SP before and after a push of X2 onto the
stack (PSHD X2).
When a subroutine is called or an interrupt occurs, the return address is automatically
saved in the stack using a push operation. Instructions used for subroutine calls and
interrupts are CALA[D], CALL[D], CC[D], INTR, and TRAP.
When a subroutine returns, the return address is retrieved from the stack using a pop
operation and loaded into the PC. Instructions used for returns from subroutines are
RET[D], RETE[D], RETEF[D], and RC[D].
The FRAME instruction also affects the stack. This instruction adds a short
immediate offset to the stack pointer.
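The stack behaviour described in this section can be summarised in a small Python model. It follows the stated rules only: the stack grows toward lower addresses and SP always points to the last element stored, so a push decrements SP before writing and a pop reads before incrementing. The class name `Stack54x` and the starting address 4000h are illustrative assumptions.

```python
class Stack54x:
    """Minimal model of a downward-growing stack addressed by a 16-bit SP."""

    def __init__(self, sp=0x4000):
        self.sp = sp & 0xFFFF
        self.memory = {}          # sparse model of data memory

    def push(self, value):
        """PSHD/PSHM style push: decrement SP, then store the word."""
        self.sp = (self.sp - 1) & 0xFFFF
        self.memory[self.sp] = value & 0xFFFF

    def pop(self):
        """POPD/POPM style pop: read the word, then increment SP."""
        value = self.memory.get(self.sp, 0)
        self.sp = (self.sp + 1) & 0xFFFF
        return value

    def frame(self, offset):
        """FRAME style adjustment: add a signed offset to SP."""
        self.sp = (self.sp + offset) & 0xFFFF


if __name__ == "__main__":
    stack = Stack54x()
    stack.push(0x1234)
    print(hex(stack.sp), hex(stack.pop()), hex(stack.sp))
```

A call followed by a return behaves like a push and pop of the PC, which is why unbalanced pushes inside a subroutine corrupt the return address.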
The C54x DSP memory is organized into three individually selectable spaces:
program, data, and I/O.
Within any of these spaces, RAM, ROM, EPROM, EEPROM, or memory-mapped
peripherals can reside either on-chip or off-chip.
The total addressable memory is 128K words, extendable up to 8192K words.
Data memory: Stores the data required to run programs and holds the external
memory-mapped registers.
Program memory: Stores program instructions and tables used in the execution of
programs.
Program Control:
The program control unit contains the program counter (PC), the PC-related hardware,
the hardware stack, repeat counters, and status registers.
The PC addresses memory in several ways, namely:
Branch: The PC is loaded with the immediate value following the branch
instruction.
Subroutine call: The PC is loaded with the immediate value following the call
instruction.
Interrupt: The PC is loaded with the address of the appropriate interrupt vector.
Instructions such as BACC, CALA, etc.: The PC is loaded with the contents of
the accumulator low word.
End of a block repeat loop: The PC is loaded with the contents of the block repeat
program address start register.
Return: The PC is loaded from the top of the stack.
Arithmetic Operations:
Add instructions: ADD (add to accumulator), ADDC (add to accumulator with carry)
Logical operations:
AND instructions: AND (logical AND of data or a constant with the accumulator),
ANDM (logical AND of a constant with the contents of data memory)
OR instructions: OR (logical OR of data or a constant with the accumulator), ORM
(logical OR of a constant with the contents of data memory)
XOR instructions: XOR (logical XOR of data or a constant with the accumulator), XORM
(logical XOR of a constant with the contents of data memory)
Shift instructions: ROL, SFTL
Test instructions: BIT (copy the bit under test to the TC bit in register ST0),
CMPM (compare a data-memory value with a constant and set TC if they are equal)