audioport_specification_and_guide_2024

The document is a Specification and Design Guide for the audioport, a digital IP block implementing an I2S audio output interface in SoC designs. It includes functional specifications, RTL architecture, design guidelines, and implementation details, covering aspects such as clock domains, memory-mapped register interfaces, and digital signal processing. The guide also outlines the revision history and provides a comprehensive overview of the audioport's components and their interconnections.

Uploaded by

Sabbir Hossain
Copyright
© All Rights Reserved

audioport

Specification and Design Guide 2024


Table of Contents
Table of Contents
Revision History
1. Functional Specifications
1.1. Overview
1.2. Interface Specification
1.2.1. Clocks and resets
1.2.2. Inputs and Outputs
1.3. Functional Requirements
1.3.1. Input Assumptions
1.3.2. Design Blocks
1.3.2.1. control_unit
1.3.2.2. dsp_unit
1.3.2.3. cdc_unit
1.3.2.4. i2s_unit
1.3.2.5. Block Interconnect Signals
1.3.3. Clock Domains
1.3.4. Memory-Mapped Register Interface
1.3.4.1. Command Register (CMD_REG)
1.3.4.2. Status Register (STATUS_REG)
1.3.4.3. Level Register (LEVEL_REG)
1.3.4.4. Configuration Register (CFG_REG)
1.3.4.5. DSP Data Registers (DSP_REGS_START ... DSP_REGS_END)
1.3.4.6. Audio Buffer Registers (ABUF0_START ... ABUF1_END)
1.3.5. Operating Modes
1.3.6. Audio Streaming
1.3.7. Data Round-Trip Time
1.3.8. dsp_unit Maximum Latency
1.3.9. Functional Requirements for Sub-Modules
1.4. Typical Use Case
2. RTL Architecture Specifications
3. Design Guidelines
3.1. Design Parameters Setup
3.1.1. Project Parameters
3.1.2. audioport Memory-Mapped Register Interface Definition
3.2. RTL Design and Code Creation
3.4. audioport RTL Verification
3.4.1. Static Code Check
3.4.2. Functional Verification
3.4.4. Formal Verification
3.4.5. Clock-Domain Crossing Verification
4. audioport Implementation
4.1. RTL Synthesis and Logic Optimization
4.1.1. Timing Constraints
4.1.2. Design Compiler Constraints
4.1.3. Synthesis with Design-for-Testability
4.1.4. Synthesis with Clock Gating
4.2. Automatic Test Pattern Generation (ATPG)
4.3. Gate-Level Verification
4.4. Prototype Layout Design
4.5. Post-Layout Equivalence Check
4.6. Post-Layout Static Timing Analysis (STA)
4.7. Post-Layout Power Analysis
4.7.1. Post-Layout Power Simulation
4.7.2. Dynamic Power Consumption Estimation
4.8. Design Optimization
Revision History

Version Date Author Comment

1.0 7.1.2017 JL First version for 2017.

1.0.1 13.1.2017 JL Fixed hex APB3 addresses in memory map to be 32-bit in 3.3.2.

2.0 5.1.2018 JL Updated for 2018.

3.0 1.1.2019 JL Updated for 2019.

4.0 2.10.2020 JL Updated for 2020. Combined audioport design guide into this document.

5.0 JL Changes:
● APB signal names changed to match ARM recommendations
● mrst_n reset signal is derived from rst_n in cdc_unit

6.0 20.10.2022 JL Redundant data tables removed. Minor functional changes.

7.0 17.5.2023 JL Separate document created for audioport top module.


1. Functional Specifications
1.1. Overview
The audioport is a digital intellectual property block that implements an I2S audio output
interface in a system-on-a-chip design. The audioport module is the top-level design module
that instantiates four functional modules. The figure below shows a high-level block diagram of
the audioport, and its connections to the SoC infrastructure.

The audioport is designed to function as a responder device on the SoC's peripheral bus. This
bus is based on the AMBA advanced peripheral bus (APB) specification. It is connected to the
SoC's main bus via a bus bridge that generates APB-protocol-compliant signals for the APB
responders from the SoC's main bus signals, and multiplexes read data buses originating from
the APB responders to the main bus. The audioport is also assumed to be connected to an
interrupt controller unit with signal irq_out. The audioport receives the clock (clk, mclk) and
reset signals (rst_n) from the SoC's clock and reset generator units.

The purpose of the audioport is to function as an interface unit between the software running
on the SoC's CPU (e.g. an audio player) and an off-chip audio codec chip (e.g. a digital-to-
analog converter). The audioport appears as a memory-mapped responder device for the
CPU. When the CPU writes audio data to audio buffer registers inside the audioport's
control_unit block, the audioport converts them to a serial audio bit stream that complies with
the I2S protocol that is supported by most audio codec chips. The CPU can also perform
various control tasks on the audioport by writing to memory-mapped control registers inside the
control_unit. When the audioport has consumed all audio data from its buffer it can request
new data from the CPU by raising the interrupt signal irq_out. The audioport contains a digital
signal processing unit (dsp_unit) that can be used to perform specific signal processing tasks
on the audio data before it is sent out through the I2S interface unit (i2s_unit).

The audioport consists of two mutually asynchronous clock domains. The control_unit and
dsp_unit use the clock signal clk that is common to the APB subsystem. The i2s_unit uses a
clock signal mclk that also functions as a clock signal for the external audio codec chip. The two
clock domains, indicated with green and yellow colors in the figure above, communicate through
a clock domain crossing unit (cdc_unit) that implements the required synchronization functions.

1.2. Interface Specification

1.2.1. Clocks and resets


The audioport has two clock inputs, clk and mclk. The frequency of the clock signal clk is
specified by the CLK_PERIOD design parameter. The frequency of the clock signal mclk is
18.432000 MHz (period = 54.25347222 ns). The clocks clk and mclk are asynchronous with
respect to each other.

The asynchronous, active-low reset signal for the flip-flops clocked with clk is rst_n. It can be
assumed to be synchronized to clk, and to change its state on the falling edge of clk.

The asynchronous, active-low reset signal for the flip-flops clocked with mclk is mrst_n, which
is generated internally from rst_n in the cdc_unit. It can be assumed to be synchronized to
mclk.

1.2.2. Inputs and Outputs


The symbol below describes the external interface of the audioport.

The input and output ports of the design are listed in the following table.

PORTS
Name Direction Width (bits) Type Description
clk input 1 logic Rising-edge sensitive clock signal input for main clock domain.
rst_n input 1 logic Active-low, asynchronous reset signal input for main clock domain.
mclk input 1 logic Rising-edge sensitive clock signal input for I2S clock domain.
PSEL input 1 logic AMBA APB PSEL signal.
PENABLE input 1 logic AMBA APB PENABLE signal.
PWRITE input 1 logic AMBA APB PWRITE signal.
PADDR input 32 logic AMBA APB PADDR bus.
PWDATA input 32 logic AMBA APB PWDATA bus.
PRDATA output 32 logic AMBA APB PRDATA bus.
PSLVERR output 1 logic AMBA APB PSLVERR signal
PREADY output 1 logic AMBA APB PREADY signal.
irq_out output 1 logic Active-high interrupt request output
sck_out output 1 logic I2S serial clock output.
ws_out output 1 logic I2S word select output
sdo_out output 1 logic I2S serial data output.
test_mode_in input 1 logic Active-high test mode select input.
scan_en_in input 1 logic Active-high scan path enable input.

The input and output delays of data inputs and outputs are assumed to be less than 1/8 of
the clock cycle.

1.3. Functional Requirements

1.3.1. Input Assumptions


The APB ports function according to the APB3 protocol specification.

The scan path enable input scan_en_in shall be unconnected in the RTL design and connected in
test synthesis.

The interval between CMD_START and CMD_STOP commands (defined below) should be
larger than one 48 kHz sample duration (1/48000 s).

1.3.2. Design Blocks


The audioport module is a top-level structural module that instantiates the submodules
control_unit, dsp_unit, cdc_unit and i2s_unit, but does not contain any functional logic.
The architecture of the audioport is defined by the following block diagram that shows the
connections between the ports of the audioport and its submodules. The RTL architecture of
the submodules that contain all functional logic is specified in separate documents. The
internal signal names shown in red are part of the specification and must be used in the RTL
code.

1.3.2.1. control_unit
The control_unit contains an AMBA 3 APB bus interface (protocol specification version 1.0)
and a register bank for control, configuration and audio data. The control_unit serves the
following purposes:
● storage of configuration and audio data in the register bank
● decoding of commands that the host CPU sends to the audioport by writing command
codes to a specific control register in the register bank
● logic for "streaming" (reading out in order) of audio data from the register bank
● interrupt generation logic
● status register that provides status information to the CPU

1.3.2.2. dsp_unit
The dsp_unit implements digital signal processing functions that can be used to process the
audio data before it is sent to the i2s_unit. The DSP functions are:
● filter (stereo enhancement filter)
● scaler (output level scaler)
The filter can be enabled or disabled by writing to a configuration register in the control_unit.

1.3.2.3. cdc_unit
The clock domain crossing block cdc_unit contains clock domain crossing logic that is needed
to transfer data between the "clk" clock domain and the "mclk" clock domain.

1.3.2.4. i2s_unit
The i2s_unit contains a "serializer" shift-register that converts parallel audio data into a serial
bit-stream sdo_out that conforms to the I2S protocol specification. The i2s_unit also generates
the I2S serial bit-clock (sck_out) and left/right word-select (ws_out) signals. The i2s_unit
generates the audio sample rates (48 kHz, 96 kHz or 192 kHz) and requests audio samples at the
selected sample rate from the control_unit.

1.3.2.5. Block Interconnect Signals
The following table shows the internal signals of the audioport.

NAME KIND TYPE DESCRIPTION

level_reg interconnect logic [31:0] Output of LEVEL_REG register of control_unit.
cfg_reg interconnect logic [31:0] Output of CFG_REG register of control_unit.
dsp_regs interconnect logic [DSP_REGISTERS*32-1:0] Outputs of all registers in the DSP_REGS region of the register bank in control_unit concatenated into one bit vector.
level interconnect logic Active-high CMD_LEVEL command indicator pulse signal from control_unit.
cfg interconnect logic Active-high CMD_CFG command indicator pulse signal from control_unit.
clr interconnect logic Active-high CMD_CLR command indicator pulse signal from control_unit.
req interconnect logic Active-high next audio sample request pulse signal in the clk clock domain.
audio0 interconnect logic [23:0] Audio buffer output that contains the next left channel sample to be sent to dsp_unit.
audio1 interconnect logic [23:0] Audio buffer output that contains the next right channel sample to be sent to dsp_unit.
tick interconnect logic Active-high audio data valid indicator pulse signal for audio0 and audio1.
play interconnect logic Play mode indicator from control_unit.
dsp0 interconnect logic [23:0] Left channel audio output from dsp_unit.
dsp1 interconnect logic [23:0] Right channel audio output from dsp_unit.
dsp_tick interconnect logic dsp_unit audio output valid indicator for dsp0 and dsp1.
mcfg_reg interconnect logic [31:0] cfg_reg signal in mclk clock domain.
mcfg interconnect logic cfg signal in mclk clock domain.
mplay interconnect logic play signal in mclk clock domain.
mreq interconnect logic Next audio sample request from i2s_unit.
mtick interconnect logic dsp_tick signal in mclk clock domain.
mdsp0 interconnect logic [23:0] dsp0 signal in mclk clock domain.
mdsp1 interconnect logic [23:0] dsp1 signal in mclk clock domain.
mclk_mux interconnect logic mclk domain clock signal.
mrst_n interconnect logic Active-low, asynchronous reset signal for I2S clock domain from cdc_unit.

1.3.3. Clock Domains
Submodules control_unit and dsp_unit use the clock signal clk and reset signal rst_n.

Submodule i2s_unit uses the clock signal mclk and the reset signal mrst_n.

Submodule cdc_unit receives the clock signals clk and mclk, and the reset signal rst_n. It
generates the reset signal mrst_n by synchronizing rst_n to the falling edge of clock mclk. The
cdc_unit contains synchronization logic that synchronizes all data signals that pass between
modules that reside in different clock domains into the clock signal of the receiving clock
domain.

1.3.4. Memory-Mapped Register Interface


For the purposes of this project, the audioport is assumed to be installed on the APB bus of an
AHB Example AMBA SYstem (EASY) type SoC. The memory map of that APB bus is shown in
section 4.1.2 Peripheral memory map in the AHB Example AMBA SYstem Technical reference
Manual [3] available in Moodle as PDF.

The bus controller (CPU) can control the audioport through a memory-mapped control register
interface by writing data to APB bus addresses reserved for the audioport. The control_unit
contains a register bank whose registers correspond to these bus addresses. The base address
of the region of the CPU's memory space reserved for the audioport on the APB bus is defined
by the AUDIOPORT_START_ADDRESS parameter (set to the first free APB address in the
EASY system, defined as 32'h8C000000 in file audioport_pkg.sv). Each register is logically 32
bits wide and organized in a little-endian manner.

Read and write accesses to all registers except the CMD_REG (defined below) should go
through with zero wait states. For CMD_REG writes, the bus interface in the audioport should
insert CMD_WAIT_STATES wait states, where CMD_WAIT_STATES is an integer obtained by
the formula

CMD_WAIT_STATES = 6 + roundup(6*MCLK_PERIOD/CLK_PERIOD).

The purpose of these wait states is to give the logic in the mclk clock domain time to handle
commands before a new command can be written by the CPU. The formula estimates the
synchronization delay of clock domain crossing logic.
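
As a quick check of the formula, the following Python sketch evaluates CMD_WAIT_STATES. It assumes that roundup means rounding up to the next integer (ceiling). The CLK_PERIOD values used in the loop are hypothetical examples, not project parameters.

```python
import math

MCLK_PERIOD_NS = 54.25347222  # mclk period from section 1.2.1


def cmd_wait_states(clk_period_ns, mclk_period_ns=MCLK_PERIOD_NS):
    """CMD_WAIT_STATES = 6 + roundup(6 * MCLK_PERIOD / CLK_PERIOD).

    Assumption: roundup() is the mathematical ceiling function.
    """
    return 6 + math.ceil(6 * mclk_period_ns / clk_period_ns)


# Hypothetical CLK_PERIOD values in nanoseconds, for illustration only.
for clk_period_ns in (10.0, 20.0):
    print(clk_period_ns, cmd_wait_states(clk_period_ns))
```

A shorter clk period gives more wait states, because the same mclk-domain delay spans more clk cycles.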

The memory map of the audioport is shown in the table below.

APB ADDRESS    REGISTER INDEX  REGISTER NAME    REGISTER BITS (31-24, 23-16, 15-8, 7-0)

32'h8C000000   0               CMD_REG          Command register
32'h8C000004   1               STATUS_REG       Status register
32'h8C000008   2               LEVEL_REG        31-16: Right channel level, 15-0: Left channel level
32'h8C00000C   3               CFG_REG          Configuration register
32'h8C000010   4               DSP_DATA_START   Configuration data registers for dsp_unit
?              ?               DSP_DATA_END
?              ?               ABUF0_START      31-24: not used, 23-0: Audio buffer data region 0
?              ?               ABUF0_END
                               ABUF1_START      31-24: not used, 23-0: Audio buffer data region 1
?              ?               ABUF1_END

1.3.4.1. Command Register (CMD_REG)


The host CPU writes command codes to this register. The command codes are one-hot
encoded 32-bit bit vectors meaning that only one bit corresponding to the command is set at a
time. The command names and codes recognized by the control_unit are listed in the
following table.

Command Value Description

CMD_NOP 32'h00000000 No operation, but an allowed command code.

CMD_CLR 32'h00000001 Clear all audio data from audio buffers and the dsp_unit.

CMD_CFG 32'h00000002 Apply configuration data to dsp_unit and i2s_unit.

CMD_START 32'h00000004 Put control_unit in play mode and start the sample rate counter in
i2s_unit that generates req pulses.

CMD_STOP 32'h00000008 Put control_unit in standby mode and stop the sample rate counter
in i2s_unit that generates req pulses.

CMD_LEVEL 32'h00000010 Apply level scaling data from LEVEL_REG to dsp_unit.

CMD_IRQACK 32'h00000020 Interrupt acknowledgement that sets irq_out to '0.
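
The command table can be expressed as a lookup, sketched below in Python. The function name decode_command is hypothetical; the codes are the ones listed above. A value that is not in the table (for example one with two bits set) is treated as unrecognized, which is the case that the STATUS_CMD_ERR bit reports.

```python
# Command codes from the CMD_REG table (one-hot, plus the all-zero CMD_NOP).
COMMANDS = {
    0x00000000: "CMD_NOP",
    0x00000001: "CMD_CLR",
    0x00000002: "CMD_CFG",
    0x00000004: "CMD_START",
    0x00000008: "CMD_STOP",
    0x00000010: "CMD_LEVEL",
    0x00000020: "CMD_IRQACK",
}


def decode_command(value):
    """Return the command name, or None for an unrecognized 32-bit code."""
    return COMMANDS.get(value & 0xFFFFFFFF)


print(decode_command(0x00000004))  # a listed command
print(decode_command(0x00000003))  # two bits set: not a valid one-hot code
```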

1.3.4.2. Status Register (STATUS_REG)


The control_unit module of the audioport provides status information for the CPU to read in
the status register by setting specific bits. When the CPU has read the status register, it must
reset the bits by writing an all zero bit vector into the status register.

Bit Name Bit Index Description

STATUS_PLAY 0 Indicates the mode audioport is in. Play mode = '1, standby mode = '0.

STATUS_CLR_ERR 1 This bit is set to '1 if a CMD_CLR command was executed when the
audioport was in play mode. (Fixed 15.1.2024)

STATUS_CFG_ERR 2 This bit is set to '1 if a CMD_CFG command was executed when the
audioport was in play mode. (Fixed 15.1.2024)

STATUS_IRQ_ERR 3 This bit is set to '1 if the interrupt output irq_out was raised and a
CMD_IRQACK command was not executed before the irq_out should have
been raised again.

STATUS_CMD_ERR 4 This bit is set to '1 if an unrecognized command code was written into the
command register CMD_REG.
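
For host software, the bit indices above translate into a simple decoder. The following Python sketch is illustrative only; the function name decode_status is hypothetical.

```python
# Bit indices from the STATUS_REG table.
STATUS_BITS = {
    "STATUS_PLAY": 0,
    "STATUS_CLR_ERR": 1,
    "STATUS_CFG_ERR": 2,
    "STATUS_IRQ_ERR": 3,
    "STATUS_CMD_ERR": 4,
}


def decode_status(value):
    """Map each status bit name to True/False for a 32-bit register value."""
    return {name: bool((value >> index) & 1) for name, index in STATUS_BITS.items()}


# Example: play mode active and an unrecognized command was written.
print(decode_status(0b10001))
```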

1.3.4.3. Level Register (LEVEL_REG)


The LEVEL_REG register contains two 16-bit audio-level setting (volume control) values for the
scaler function of the dsp_unit. Bits [15:0] represent the left stereo channel level value, and
bits [31:16] represent the right stereo channel level value. The level scaling values are
represented as 16-bit unsigned fixed-point values x, whose format is <1.15> (1 bit on left side
of point, 15 bits on the right). The 24-bit audio data samples d should be scaled using the
following method:

scaled_d = d * x

The maximum supported value for x is 1.0000000_00000000 (decimal 1).
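
The fixed-point arithmetic can be illustrated with a short Python sketch. The helper names are hypothetical. The sketch assumes that the product is brought back to 24 bits by dropping the 15 fractional bits (truncation toward negative infinity); the specification above does not state the rounding method, so the dsp_unit documentation should be consulted for the required behavior.

```python
FRACTION_BITS = 15  # <1.15> format: 1 integer bit, 15 fractional bits


def to_q1_15(level):
    """Convert a level in the range 0.0 .. 1.0 to an unsigned <1.15> value."""
    if not 0.0 <= level <= 1.0:
        raise ValueError("level must be between 0.0 and 1.0")
    return round(level * (1 << FRACTION_BITS))


def pack_level_reg(left, right):
    """Left level in bits [15:0], right level in bits [31:16]."""
    return (to_q1_15(right) << 16) | to_q1_15(left)


def scale_sample(d, x):
    """scaled_d = d * x for a signed 24-bit sample d and a <1.15> level x.

    Assumption: the 15 fractional bits are truncated with an arithmetic shift.
    """
    return (d * x) >> FRACTION_BITS


print(hex(pack_level_reg(left=1.0, right=0.5)))
print(scale_sample(1000, to_q1_15(0.5)))
```

With x equal to 1.0 (16'h8000) the sample passes through unchanged, so the scaled result always fits in 24 bits.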

1.3.4.4. Configuration Register (CFG_REG)


The configuration register CFG_REG holds control parameters for the audioport encoded as bit
fields, as shown in the table below. In the current version it is possible to select the sample rate
and stereo/mono playback mode, and disable or enable the filter function in dsp_unit.

BITS   FIELD                VALUES
31-4   NOT USED
3      FILTER               0 = disabled, 1 = enabled
2      STEREO               0 = normal, 1 = mono
1-0    SAMPLE RATE SELECT   00 = 48 kHz, 01 = 96 kHz, 10 = 192 kHz
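
The bit fields can be packed as shown in the Python sketch below. The function name pack_cfg_reg and its parameters are hypothetical; the bit positions follow the CFG_REG table (filter in bit 3, stereo/mono in bit 2, sample rate select in bits 1-0).

```python
# Sample rate select codes for CFG_REG bits [1:0].
SAMPLE_RATE_CODES = {48_000: 0b00, 96_000: 0b01, 192_000: 0b10}


def pack_cfg_reg(sample_rate_hz, mono=False, filter_enabled=False):
    """Build a CFG_REG value; bits [31:4] are not used and stay zero."""
    if sample_rate_hz not in SAMPLE_RATE_CODES:
        raise ValueError("unsupported sample rate")
    return (
        (int(filter_enabled) << 3)
        | (int(mono) << 2)
        | SAMPLE_RATE_CODES[sample_rate_hz]
    )


print(bin(pack_cfg_reg(192_000, filter_enabled=True)))
```
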
1.3.4.5. DSP Data Registers (DSP_REGS_START ... DSP_REGS_END)
The DSP data registers hold 32-bit digital filter coefficients. One 32-bit memory location is
reserved for each coefficient. The number of 32-bit memory locations reserved for this data area
depends on the design parameter FILTER_TAPS and is 4 * FILTER_TAPS.

The coefficient values are represented as 32-bit signed fixed-point values in the format <1.31>
(1 bit on left side of point, 31 bits on the right). Filter coefficient values should therefore be
normalized to range -1 <= value < 1 in decimal.
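
A coefficient can be converted to its 32-bit register image as sketched below in Python. The function name to_q1_31 is hypothetical, and rounding to the nearest representable value is an assumption; the specification only fixes the format and the range.

```python
def to_q1_31(value):
    """Convert a coefficient in the range -1 <= value < 1 to a <1.31> word.

    Returns the two's-complement bit pattern as an unsigned 32-bit integer.
    Assumption: round to nearest, clamped to the largest positive code.
    """
    if not -1.0 <= value < 1.0:
        raise ValueError("coefficient must satisfy -1 <= value < 1")
    fixed = round(value * (1 << 31))
    fixed = min(fixed, (1 << 31) - 1)  # guard against rounding up to +1.0
    return fixed & 0xFFFFFFFF


print(hex(to_q1_31(0.5)))
print(hex(to_q1_31(-0.5)))
```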

1.3.4.6. Audio Buffer Registers (ABUF0_START ... ABUF1_END)


Audio buffer registers hold 24-bit audio data samples. One 32-bit memory location is reserved
for every sample. The physical registers in the ABUF bank have 32 bits.

The number of memory locations reserved for this data area is 2 * AUDIO_BUFFER_SIZE * 2.
Left stereo channel samples are stored in registers whose index is even and right stereo
channel samples are stored in registers whose index is odd.
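
The register counts given for the DSP data and audio buffer regions determine the register indices and APB addresses of the region boundaries. The Python sketch below shows one way to derive them. It assumes that the regions are contiguous and start at register index 4, that the ABUF area is split equally between ABUF0 and ABUF1, and that each _END name refers to the last register of its region. The parameter values in the example call are hypothetical, not project parameters.

```python
AUDIOPORT_START_ADDRESS = 0x8C000000
DSP_REGS_START_INDEX = 4  # first register after CMD, STATUS, LEVEL and CFG


def memory_map(filter_taps, audio_buffer_size):
    """Return {name: (register index, APB address)} for the region boundaries."""
    dsp_regs = 4 * filter_taps
    abuf_regs_per_region = 2 * audio_buffer_size  # assumed equal split of the ABUF area
    index = {}
    index["DSP_REGS_START"] = DSP_REGS_START_INDEX
    index["DSP_REGS_END"] = index["DSP_REGS_START"] + dsp_regs - 1
    index["ABUF0_START"] = index["DSP_REGS_END"] + 1
    index["ABUF0_END"] = index["ABUF0_START"] + abuf_regs_per_region - 1
    index["ABUF1_START"] = index["ABUF0_END"] + 1
    index["ABUF1_END"] = index["ABUF1_START"] + abuf_regs_per_region - 1
    # Each register occupies four byte addresses.
    return {name: (i, AUDIOPORT_START_ADDRESS + 4 * i) for name, i in index.items()}


# Hypothetical parameter values, for illustration only.
for name, (i, address) in memory_map(filter_taps=2, audio_buffer_size=2).items():
    print(f"{name:15s} {i:3d} 32'h{address:08X}")
```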

Audio data samples are represented as 24-bit signed integer values whose format is <24.0> (24
bits on left side of point, 0 bits on the right). Audio data is represented as bits [23:0]. The 8
most significant bits of each 32-bit register are ignored in audio processing.

The ABUF region of the register bank is divided into two logical regions, ABUF0 and ABUF1.
When audio data from one region is used, the other can be filled by the host CPU. If the last
audio sample of the current ABUF region has been used, the control logic sets irq_out == '1 to
request the CPU to fill the buffer region that was just used. This way the CPU will never
overwrite data in the buffer region that is currently in use.

1.3.5. Operating Modes


The audioport can be in two modes, standby mode and play mode. In standby mode, only the
register interface of the control_unit is active. After reset the audioport is in standby mode.

In play mode, the audioport "streams" data from the ABUF registers in the control_unit
through the dsp_unit and cdc_unit to the i2s_unit, where it is converted into serial format
according to the I2S specification. Play mode is enabled by writing the command code
CMD_START into the CMD_REG register. Play mode is disabled and standby mode enabled by
writing the command code CMD_STOP into the CMD_REG register.

1.3.6. Audio Streaming


In the play mode, the data processing sample rate is set by the i2s_unit, which sets a data
request signal mreq to state '1 for one clock cycle when a new stereo sample must be read
from the control_unit and sent to i2s_unit through the dsp_unit and cdc_unit.

When the control_unit detects the request on the req signal in the "clk" clock domain, it notifies
the dsp_unit that the next stereo audio data sample is available in the control_unit's audio
output ports using the tick signal, and then places the next stereo audio sample from the
ABUF region of its register bank in these ports that drive the audio0 and audio1 signals. When
the last stereo sample of the current region (ABUF0 or ABUF1) has been consumed, the
control_unit requests more data from the host CPU by raising the interrupt request output
irq_out. The host CPU should respond by filling the ABUF region that was just used.
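
The buffer read-out and interrupt behavior described above can be modeled at a high level as follows. This Python sketch is a behavioral illustration, not the control_unit RTL: the class and method names are hypothetical, cycle timing and the tick pulse are omitted, and it assumes that the two ABUF regions are equally sized and that reading wraps from the end of ABUF1 back to ABUF0.

```python
class AbufStreamer:
    """Behavioral model of streaming stereo samples from the ABUF registers."""

    def __init__(self, abuf, audio_buffer_size):
        self.abuf = abuf                           # ABUF0 followed by ABUF1
        self.region_regs = 2 * audio_buffer_size   # registers per region (assumed)
        self.read_index = 0
        self.irq_out = 0

    def on_req(self):
        """Handle one req pulse and return (audio0, audio1)."""
        audio0 = self.abuf[self.read_index] & 0xFFFFFF      # even index: left
        audio1 = self.abuf[self.read_index + 1] & 0xFFFFFF  # odd index: right
        self.read_index += 2
        if self.read_index % self.region_regs == 0:
            self.irq_out = 1                       # last sample of a region consumed
        if self.read_index == len(self.abuf):
            self.read_index = 0                    # assumed wrap to ABUF0
        return audio0, audio1

    def irq_ack(self):
        """Model of the CMD_IRQACK command."""
        self.irq_out = 0


streamer = AbufStreamer(abuf=list(range(8)), audio_buffer_size=2)
for _ in range(2):
    print(streamer.on_req(), streamer.irq_out)
```

In this example each region holds two stereo samples, so irq_out rises after the second request.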

When the dsp_unit has processed the sample, it must set its data valid indicator output signal
dsp_tick to state 1 for one clock cycle to indicate that data can be read from its output ports
that drive the dsp0 and dsp1 signals. These signals are connected to the cdc_unit, which
synchronizes them to the mclk clock signal. The cdc_unit's outputs are connected to the
i2s_unit's respective inputs. The i2s_unit reads the next stereo audio data sample to its input
register when the signal mtick is high. From there the sample is moved to a shift-register for
parallel-to-serial conversion, when the previous sample has been shifted out.

Concurrently with the process described above, the i2s_unit continuously shifts out data bits
from a shift-register at the selected audio sample rate. The sample rate is determined by bits
[1:0] of the configuration register CFG_REG whose contents are available in the mclk domain
signal mcfg_reg. At the same time, the i2s_unit generates the sck_out and ws_out I2S
timing signals. When all 48 bits of a stereo audio sample have been shifted out, the i2s_unit
processes the next sample from the input register and at the same time requests a new sample
from the control_unit. The i2s_unit always shifts out a complete 48-bit sample even if the
circuit is put to standby state in the middle of a sample.
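
The relationship between the mclk frequency and the three sample rates can be checked with a few lines of Python. The per-bit figure assumes that the 48 bits of a stereo sample fill the whole sample period with no padding bits; this is an inference from the description above, and the i2s_unit specification remains the authority on the exact sck_out timing.

```python
MCLK_HZ = 18_432_000          # mclk frequency from section 1.2.1
BITS_PER_STEREO_SAMPLE = 48   # 2 channels x 24 bits


def mclk_cycles_per_sample(sample_rate_hz):
    """Number of mclk cycles in one sample period (must divide evenly)."""
    cycles, remainder = divmod(MCLK_HZ, sample_rate_hz)
    if remainder:
        raise ValueError("sample rate does not divide the mclk frequency")
    return cycles


def mclk_cycles_per_bit(sample_rate_hz):
    """Assumption: 48 bits span exactly one sample period."""
    return mclk_cycles_per_sample(sample_rate_hz) // BITS_PER_STEREO_SAMPLE


for rate in (48_000, 96_000, 192_000):
    print(rate, mclk_cycles_per_sample(rate), mclk_cycles_per_bit(rate))
```

All three sample rates divide 18.432 MHz evenly, so each can be generated with a simple counter in the mclk domain.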

1.3.7. Data Round-Trip Time


A data processing cycle is defined here as the period that begins when the i2s_unit raises its
data request output (a pulse in the audioport signal mreq) and ends when it receives a data-
ready notification (a pulse in the audioport signal mtick). The time between these events is
called the data round-trip time in this specification. The round-trip time is defined as

$$t_{roundtrip} = t_{pulsesync} + t_{control\_unit} + t_{dsp\_unit} + t_{datasync}$$

where the components are:

● Pulse synchronization delay ($t_{pulsesync}$) in the cdc_unit
● Delay of the control_unit in creation of the tick pulse ($t_{control\_unit}$)
● Delay of the dsp_unit ($t_{dsp\_unit}$)
● Multibit data synchronization delay ($t_{datasync}$) in the cdc_unit

For the design to function correctly, the round-trip time should be smaller than one 192 kHz
sample period:

$$t_{roundtrip} < \frac{1}{192\ \mathrm{kHz}}$$
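
The constraint can be checked numerically once the four delay components are known. The Python sketch below only illustrates the arithmetic; the function names are hypothetical and the delay values in the example are made-up numbers in nanoseconds, not measurements of any design.

```python
SAMPLE_PERIOD_192K_NS = 1e9 / 192_000  # about 5208.33 ns


def roundtrip_ns(t_pulsesync, t_control_unit, t_dsp_unit, t_datasync):
    """Sum of the four round-trip components, all in nanoseconds."""
    return t_pulsesync + t_control_unit + t_dsp_unit + t_datasync


def meets_roundtrip_constraint(*components_ns):
    """True if the round-trip time is below one 192 kHz sample period."""
    return roundtrip_ns(*components_ns) < SAMPLE_PERIOD_192K_NS


# Hypothetical component delays, for illustration only.
print(meets_roundtrip_constraint(200.0, 20.0, 4000.0, 300.0))
print(meets_roundtrip_constraint(200.0, 20.0, 5000.0, 300.0))
```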

1.3.8. dsp_unit Maximum Latency


The maximum processing delay (latency) of the dsp_unit should be chosen so that the data
round-trip constraint is not violated even if the clock frequency used by the dsp_unit is halved
e.g. for power savings reasons.
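
One way to turn this requirement into a cycle budget is sketched below in Python. The function name and the example numbers are hypothetical. The sketch assumes that the remaining round-trip components are supplied as a single delay value already evaluated for the halved clock frequency.

```python
import math

SAMPLE_PERIOD_192K_NS = 1e9 / 192_000


def max_dsp_latency_cycles(clk_period_ns, other_delays_ns):
    """Largest dsp_unit latency in clk cycles that keeps the round-trip time
    below one 192 kHz sample period when the clock frequency is halved.

    Assumption: other_delays_ns already reflects the halved clock.
    """
    halved_clk_period_ns = 2 * clk_period_ns
    budget_ns = SAMPLE_PERIOD_192K_NS - other_delays_ns
    cycles = math.floor(budget_ns / halved_clk_period_ns)
    # The constraint is a strict inequality, so an exact fit is not allowed.
    if cycles * halved_clk_period_ns >= budget_ns:
        cycles -= 1
    return max(cycles, 0)


# Hypothetical values: 10 ns nominal clk period, 500 ns for the other components.
print(max_dsp_latency_cycles(clk_period_ns=10.0, other_delays_ns=500.0))
```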

1.3.9. Functional Requirements for Sub-Modules


The detailed functional requirements and design guides for the control_unit, dsp_unit,
cdc_unit, and the i2s_unit are given in separate documents.

1.4. Typical Use Case


A typical operating sequence for the audioport consists of the following steps, beginning when
the audioport is in standby state.
1. The host CPU writes filter coefficient values to the registers in the DSP_REGS region.
2. The host CPU fills the audio buffers by writing to all registers in the ABUF0 and ABUF1
regions.
3. The host CPU sets the output level values by writing the level scaling values for left and
right audio channels into the LEVEL_REG, and applies the level data by writing the
command code CMD_LEVEL to the command register CMD_REG.
4. The host CPU writes configuration data to the CFG_REG to set the sample rate,
mono/stereo mode and filter-enable values, and applies the configuration data by writing
the command code CMD_CFG to the command register CMD_REG.
5. The host CPU starts playback by writing the command code CMD_START to the
command register CMD_REG.
6. When the host CPU detects an interrupt at irq_out, it fills the audio buffer ABUF0 or
ABUF1 that was just drained of data. The CPU must keep track of which buffer must be
filled next.
7. During playback, the host CPU can change the output level as described above.
8. The host CPU stops playback by writing the command code CMD_STOP to the
command register CMD_REG.
9. After stopping playback, the host CPU can clear audio data from the audioport by writing
the command code CMD_CLR to the command register CMD_REG.
10. The host CPU can at any time read the STATUS_REG to check the status bit values.

A transaction level model of the audioport can be used to simulate the behavior of the design. It
executes the steps described above.
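
The register writes in the sequence above can be sketched from the host's point of view. The following Python example is not the transaction level model mentioned above; it uses a hypothetical MockApb class that only logs writes, ignores timing (including the minimum interval between CMD_START and CMD_STOP), and takes the ABUF0 start address as a parameter because that address depends on the project parameters.

```python
AUDIOPORT_START_ADDRESS = 0x8C000000
CMD_REG, STATUS_REG, LEVEL_REG, CFG_REG, DSP_REGS_START = (
    AUDIOPORT_START_ADDRESS + 4 * i for i in range(5)
)
CMD_CLR, CMD_CFG, CMD_START, CMD_STOP, CMD_LEVEL = 0x01, 0x02, 0x04, 0x08, 0x10


class MockApb:
    """Hypothetical stand-in for the host's APB write access; it only logs."""

    def __init__(self):
        self.log = []

    def write(self, address, data):
        self.log.append((address, data & 0xFFFFFFFF))


def playback_sequence(bus, coeffs, samples, abuf0_start, level, cfg):
    for i, coeff in enumerate(coeffs):        # step 1: filter coefficients
        bus.write(DSP_REGS_START + 4 * i, coeff)
    for i, sample in enumerate(samples):      # step 2: fill ABUF0 and ABUF1
        bus.write(abuf0_start + 4 * i, sample)
    bus.write(LEVEL_REG, level)               # step 3: set and apply levels
    bus.write(CMD_REG, CMD_LEVEL)
    bus.write(CFG_REG, cfg)                   # step 4: set and apply configuration
    bus.write(CMD_REG, CMD_CFG)
    bus.write(CMD_REG, CMD_START)             # step 5: start playback
    # Steps 6 and 7 (interrupt service, level changes) would run here.
    bus.write(CMD_REG, CMD_STOP)              # step 8: stop playback
    bus.write(CMD_REG, CMD_CLR)               # step 9: clear audio data


bus = MockApb()
playback_sequence(bus, coeffs=[0] * 8, samples=[0] * 8,
                  abuf0_start=0x8C000030,     # hypothetical ABUF0 address
                  level=0x80008000, cfg=0x2)
for address, data in bus.log[-5:]:
    print(f"32'h{address:08X} <= 32'h{data:08X}")
```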
2. RTL Architecture Specifications
The audioport module does not contain any functional logic. Detailed functional requirements of
RTL submodules are given in a separate specification document.
3. Design Guidelines
3.1. Design Parameters Setup

3.1.1. Project Parameters


Your personal design parameters must be entered in project progress report section 1.1. and in
the following files:

File Description

audioport_pkg.sv SystemVerilog package definition used by all SystemVerilog files.

audioport_defs.h A C-language header file that contains the same data as the package file.

apb_pkg.sv SystemVerilog package file that defines the APB bus parameters. The
DUT_* addresses refer to your design. Make the whole APB address space
defined by the APB_* parameters about twice your address space
rounded to the next power of two.
The APB_MAX_WAIT_STATES and APB_INPUT_DELAY values can initially
be left as they are.

audioport.sdc Synopsys Timing Constraints file read by synthesis and verification tools.
Edit the clk and mclk clock period definitions, and define input and output
delays.

0_setup_audioport.tcl File read by all EDA tool command scripts. Define the clock periods and
input and output delays as TCL variables, because not all tools use the SDC file.

3.1.2. audioport Memory-Mapped Register Interface Definition


Create a table that shows the detailed memory map of the register bank in project progress
report section 1.2. Update the addresses into input/audioport_pkg.sv and
input/audioport_defs.h.

3.2. RTL Design and Code Creation


The top-level audioport design consists of the following files located in the input directory.

File Description

0_setup_audioport.tcl Settings file for the audioport design

audioport_pkg.sv SystemVerilog package file for project-wide definitions.

audioport.sv SystemVerilog template for top-level module.

audioport_tb.sv SystemVerilog template for testbench.


audioport_svamod.sv SystemVerilog assertions module template.

audioport.sdc Timing constraints file.

The modules control_unit, dsp_unit, cdc_unit and i2s_unit should all be instantiated in the
top-level module audioport. The top-level module should not contain any functional code.

When the initial design hierarchy has been created, you can begin to work with the
submodules/subprojects control_unit, i2s_unit, cdc_unit and dsp_unit. When you complete a
subproject, it will be automatically instantiated in the audioport hierarchy through its top
module. Separate design guides are provided for the control_unit, dsp_unit, cdc_unit and
i2s_unit.

3.4. audioport RTL Verification


This part of the project is executed after the modules control_unit, i2s_unit, cdc_unit and
dsp_unit have been designed.

The RTL verification process consists of the following tasks:


● Static code check
● Functional verification by simulating the audioport in a UVM testbench
● Clock-domain crossing verification of the complete design

Coverage data is collected from all functional and formal verification tasks.

3.4.1. Static Code Check


Static code verification should be carried out using Questa AutoCheck. The check has been
disabled for modules created with high-level synthesis (they would generate too many warnings)
by this line in input/0_setup_audioport.tcl.
set QAUTOCHECK_DISABLE_MODULES { "dsp_unit_rtl" }

3.4.2. Functional Verification


The audioport's functional verification environment is based on the Universal Verification
Methodology (UVM). The audioport_tb testbench module contains an initial procedure that sets up
a UVM test environment and calls the run_test UVM task. This code is not compiled by
default. You can enable compilation of the UVM-related code and disable instantiation of your
test program audioport_test with this variable setting in input/0_setup_audioport.tcl:

set UVM_TESTBENCH 1

This setting defines the UVM_TESTBENCH macro for the compiler, and enables compilation of
the UVM-specific parts of the testbench.

The main parts of the testbench module audioport_tb are shown in the next figure. The
audioport module is instantiated as usual. Three SystemVerilog interface objects of type
apb_if, irq_out_if and i2s_if are also instantiated. The I/O ports of audioport are connected to
the variables in the interface objects using assign statements (red arrows).

In the initial procedure, handles to the interface objects are first pushed into the UVM
configuration database. After that, the test is started by calling the UVM task run_test. This
creates all the UVM test components. They get handles to the interface objects by reading them
from the configuration database. This way objects from the UVM class hierarchy get access to
the audioport module's ports.

You can select the UVM test to be executed by setting the UVM_TESTNAME TCL variable in
input/0_setup_audioport.tcl

set UVM_TESTNAME "apb_test"
# set UVM_TESTNAME "control_unit_uvm_test"
# set UVM_TESTNAME "audioport_uvm_test"

This setting is passed to the simulator command vsim as a command line argument. The three
tests are:

apb_test Simple APB read-write test from the lecture textbook.

control_unit_uvm_test UVM version of some tests from the control_unit_test program (you must create this). This test only tests the APB and irq_out ports.

audioport_uvm_test Complete test that covers all I/Os and contains a reference model to which DUT outputs are compared.

Each test has its own test environment. These are described in a separate document, audioport
UVM Tests.

Functional verification should be carried out by simulation with QuestaSim. Each UVM test
should be simulated separately by first selecting the test in 0_setup_audioport.tcl, and then
executing QuestaSim normally. Coverage data is saved from all tests and merged in the
verification tracking tool to obtain the total coverage result.

During audioport_uvm_test the simulator saves the file
results/audioport_uvm_comparator_out.txt. This file contains output from the DUT and the
reference model (audioport_predictor UVM component). You can compare the outputs
visually by using gnuplot. The figure below shows an example of expected results,
with DUT outputs shown in green and reference model outputs in red.

Even though all modules have been thoroughly verified, the simulation of the complete system
may still reveal differences between the DUT and the reference model, or firings of assertions.
Some probable causes are:
● Differences in computation accuracy between the reference model and the DUT that
may cause minor differences
● Timing differences between the reference model and the DUT caused by the CDC
delays etc.
For these reasons, a visual similarity of the waveforms is an acceptable result for this test.

Tip: Select the VCD snapshot window settings for power estimation at this point as shown below.
Mark the start and end times in your project progress report in section 7.8.1 and use them in power
analysis. The start time should be just before the first interrupt of the first test (192 kHz) and the end
time at a point when the interrupt has been served. This is the most active state of the design
(dsp_unit and i2s_unit are processing data and at the same time the bus interface is active). Make
sure that at least a couple of "tick" intervals are included so that dsp_unit activity is also captured.

3.4.4. Formal Verification


Formal verification is not used to verify the complete audioport design.

3.4.5. Clock-Domain Crossing Verification


Even though the cdc_unit has already been verified, the full CDC verification flow should be
executed on the audioport at least for the following reasons:
● The cdc_unit is now connected to other modules which may cause problems, for
instance, synchronizer inputs driven by combinational logic in other modules.
● The data comes to cdc_unit from other modules instead of the testbench, and therefore
protocol simulations must be done again.
● In simulation with metastability injection, the randomly delayed synchronized data is now
handled by other modules, which may cause errors in them.
● Reconvergence of data that passes through multiple clock domain crossings may
cause problems.
Static CDC check is now executed with reconvergence detection enabled. Data reconvergence
describes a situation where data that originates from one source (register or flip-flop) in one
clock domain first diverges to several paths in that domain, crosses over to another domain
through different synchronizers, and then converges at some point, for instance, in the next-
state encoding logic of a register. The red paths in the figure below are examples of
reconvergent paths. Because of the random one-clock-cycle delay variation in the synchronizers
the data used to encode the state of the receiving register may actually consist of two
successive values of the source data, potentially causing an error in operation.

[Figure: examples of reconvergent paths between application logic blocks across a clock domain crossing.]

If reconvergence issues are detected in static analysis, the CDC verification tool creates
protocol checker assertions that detect potentially dangerous conditions in protocol simulation.
The assertions check that the Hamming distance of data on reconvergent paths is at most one,
meaning that only one data bit should change at a time.
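
The condition checked by these assertions can be illustrated with a small sketch. The Python code below is only an illustration under assumed names and sample data; it is not the checker that the CDC tool generates. It computes the Hamming distance between successive values of a bus and reports every transition in which more than one bit changes.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def multi_bit_transitions(values):
    """Return (index, previous, current) for each step where more than one bit changes."""
    return [
        (i, prev, cur)
        for i, (prev, cur) in enumerate(zip(values, values[1:]), start=1)
        if hamming_distance(prev, cur) > 1
    ]


# Hypothetical two-bit sequences: Gray-coded and plain binary counting.
gray_sequence = [0b00, 0b01, 0b11, 0b10, 0b00]
binary_sequence = [0b00, 0b01, 0b10, 0b11, 0b00]

print(multi_bit_transitions(gray_sequence))
print(multi_bit_transitions(binary_sequence))
```

The Gray-coded sequence produces no violations, while the binary sequence is flagged at the 01 to 10 and 11 to 00 transitions, where two bits change at once.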

RTL simulations are also done with metastability injection enabled. The injectors are added in
clock domain crossings in the simulation model by the CDC analysis tool. When the injector
detects that the transmit and reception clock edges are aligned, it injects a random one-clock-
cycle delay variation into the data. If the circuit is properly designed, this should not cause a
malfunction.
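
The effect of the delay variation on reconvergent data can be explored with a simplified model. The sketch below rests on assumptions made for illustration only: each bit of a value crosses through its own synchronizer, and each synchronizer delays its bit by either two or three receiving-clock cycles. It is not the injector model used by the CDC tool. The code enumerates all delay combinations and collects the values the receiving logic can observe.

```python
from itertools import product


def observed_values(old, new, width=2, base_delay=2):
    """Values the receiver can see when each bit gets 0 or 1 extra cycle of delay."""
    seen = set()
    for extra in product((0, 1), repeat=width):
        # Observe long enough for the slowest bit to arrive.
        for cycle in range(base_delay + 2):
            value = 0
            for bit in range(width):
                arrived = cycle >= base_delay + extra[bit]
                source = new if arrived else old
                value |= source & (1 << bit)
            seen.add(value)
    return seen


print(sorted(observed_values(0b00, 0b11)))  # both bits change at the source
print(sorted(observed_values(0b00, 0b01)))  # only one bit changes at the source
```

When both bits change at once, the receiver can observe the values 01 and 10, which the source never sent. When only one bit changes, only the old and the new value can be observed.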
4. audioport Implementation
The implementation flow consists of tasks in which the verified RTL SystemVerilog and VHDL
code is synthesized and optimized using logic gates and flip-flops from a target technology
library, after which an integrated circuit layout pattern database is created using "standard cells"
from the target library. This "back-end" flow also contains many verification and analysis tasks.

The figure below shows an overview of the back-end flow used in DT3. The green rectangles
represent synthesis (design data transformation) tasks, the yellow rectangles verification and
analysis tasks, and the gray rectangles design data. All tasks are executed using EDA tools with
automatic tool scripts. Therefore the whole back-end flow is automated.

The implementation flow consists of the following tasks:


1. RTL synthesis and logic optimization with design-for-testability (DFT)
2. Gate-level verification with formal equivalence check
3. Standard-cell layout generation and clock-tree synthesis
4. Post-layout verification with formal equivalence check
5. Post-layout static timing analysis
6. Post-layout dynamic power estimation by simulation
7. Automatic test pattern generation for manufacturing tests.

4.1. RTL Synthesis and Logic Optimization


The design should be synthesized and optimized using Design Compiler. The program executes
the commands given in the script file scripts/dc_rtlsyn.tcl. The script allows some
customization using the following TCL variables that can be defined in
input/0_setup_audioport.tcl. In this project, the following are used:
● SDC_FILE: Defines the path of a timing constraint file that is in Synopsys Design
Constraint format
● SYNTHESIS_CONSTRAINTS_FILE: Defines the path to a file that contains Design
Compiler constraints (other than timing constraints)
● GATE_CLOCK: When set to 1, enables insertion of clock gating logic for power saving
● INSERT_SCAN_CHAINS: When set to a non-zero value, enables insertion of a scan
path to improve testability
● DFT_SETUP_FILE: Defines a path to a file that contains commands that define scan
path settings
● DFT_AUTOFIX_SCRIPT: Defines a path to a file that contains commands for fixing
common testability-related design rule violations

4.1.1. Timing Constraints


Timing constraints are defined in file input/audioport.sdc. The designer should specify the
following:
1. Clock period for signal clk
2. Input delay for the reset input rst_n
3. Input delays for APB signals and scan_en
4. Output delays for APB outputs and irq_out
5. Output delays for I2S outputs

The assumed timing relations of the I/O signals are shown in the figure below.

The SDC file defines the clocks clk and mclk, and "virtual" clocks virtual_clk and virtual_mclk.
Virtual clocks only define the timing waveform, but they are not attached to any ports. This
allows us to modify the real clock descriptions with latency and skew settings without distorting
the external timing constraints. Input and output delays are defined with respect to the virtual
clocks.

Input drive and output load settings can be used as they are in the template file.

4.1.2. Design Compiler Constraints


The file input/audioport.syn_constraints.tcl contains commands that prevent the synthesis
program from ungrouping (merging) top-level hierarchical modules. This allows you to perform
hierarchical verification tasks, such as comparing the outputs of RTL and gate-level sub-
modules in simulation, and to generate a hierarchical power consumption report. Logic
optimization results may, however, degrade if the synthesis tool is not allowed to optimize over
module boundaries. The default settings assume that the module instances' names are of the
format modulename_instancenumber (e.g. control_unit_1). If you have used other instance
names, use them in the settings file.

4.1.3. Synthesis with Design-for-Testability


The manufactured chip will be tested using a scan chain based test method. This method
requires that the circuit be modified so that in test mode all flip-flops in the circuit form a shift-
register that can be enabled by setting the input scan_en_in = '1. The first and last flip-flop of
this shift-register are connected to the chip's pins and can be used to shift test data serially in
and out of the circuit. Using the scan chain, all flip-flops can be set to any state. When
scan_en_in is set to '0 for one clock cycle, all flip-flops will load in normal mode data. This data
is now generated based on the test data that was shifted into the flip-flops. After that, the results
data can be shifted out by setting scan_en_in = '1 again, and clocking the circuit as many times
as there are flip-flops in the shift-register. At the same time, the next test pattern can be shifted
in. By comparing the shifted-out data to expected data, manufacturing defects can be detected.
By repeating this many times with new test data, the circuit can be completely tested.

If the scan chain has N flip-flops, executing one test takes N+1 clock cycles. Testing a chip with
only one scan chain would be slow. Therefore several parallel scan chains are used. They can
share the scan_en_in port, but a separate input and output port must be reserved for each
chain. The number of scan chains must be decided by the designer. For this project, we assume
that the test clock frequency is 10 MHz. Furthermore we assume that one test should not take
more than 1 ms, allowing us to perform 1000 tests in a second. If we know the number of flip-
flops in the design, we can calculate the number of scan chains required.
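
The calculation can be sketched as follows. The flip-flop count of 25,000 is a made-up example value; use the number reported by the synthesis tool for your design. The sketch assumes, as stated above, that a chain of N flip-flops needs N + 1 clock cycles per test.

```python
import math


def required_scan_chains(num_flip_flops, test_clock_hz=10e6, max_test_time_s=1e-3):
    """Smallest number of scan chains that keeps one test within the time limit."""
    # Clock cycles available for one test: 10 MHz * 1 ms = 10,000 cycles.
    cycles_per_test = round(test_clock_hz * max_test_time_s)
    # A chain of N flip-flops needs N + 1 cycles, so N <= cycles_per_test - 1.
    max_chain_length = cycles_per_test - 1
    return math.ceil(num_flip_flops / max_chain_length)


# Hypothetical design with 25,000 flip-flops.
print(required_scan_chains(25_000))
```

With these example numbers each chain may hold at most 9,999 flip-flops, so three chains are needed.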

Scan chains are created automatically by the RTL synthesis tool. The tool needs to know the
number of scan chains and their input and output ports. Normal data ports can be used as scan
chain I/O ports. The synthesis tool will add multiplexers controlled by scan_en_in at the output
ports so that the normal mode operation is not affected by test logic.

You can enable scan chain insertion in file 0_setup_audioport.tcl by defining the
number of chains you want to insert, e.g.
set INSERT_SCAN_CHAINS 3

You can define the scan chain configuration in file input/audioport.dft_setup.tcl by defining
the number of scan chains and the input and output port used for each chain, the scan enable
input port name, and the name of the port used as test clock input. The template file shown below
tells the tool to create two scan chains, use scan_en_in as the scan enable input, and form the
first chain between ports PADDR[0] and PRDATA[0], and the second between PADDR[1] and
PRDATA[1]. Any input except clocks, resets, scan_en_in, and the test_mode_in signal
discussed below can be used as scan input, and any output can be used as scan output. The
remaining inputs and outputs can be used to apply and read test data in parallel on the clock
cycles when test data is applied. The file defines a test setup that is depicted in the figure below.

# Scan clock (rises at 45ns, falls at 55ns, test clock period is 100ns by default)
set_dft_signal -view existing_dft -type ScanClock -timing { 45 55 } -port clk

# Reset
set_dft_signal -view existing_dft -type Reset -port rst_n -active_state 0

# Test mode select input held high in test mode
set_dft_signal -view spec -type Constant -port test_mode_in -active_state 1

# Unused clock input mclk tied to low in test mode
set_dft_signal -view existing_dft -type Constant -port mclk -active_state 0

# Scan enable input
set_dft_signal -view spec -type ScanEnable -port scan_en_in -active_state 1

# Settings for two scan paths
set_scan_configuration -style multiplexed_flip_flop \
    -chain_count 2 \
    -clock_mixing mix_clocks

# Scan path inputs and outputs
set_dft_signal -view spec -type ScanDataIn -port "PADDR[0]"
set_dft_signal -view spec -type ScanDataOut -port "PRDATA[0]"

set_dft_signal -view spec -type ScanDataIn -port "PADDR[1]"
set_dft_signal -view spec -type ScanDataOut -port "PRDATA[1]"

The file input/audioport.dft_setup.tcl also defines the port test_mode_in as the test mode
select signal. This is needed to make the tool understand the function of the clock and reset
signal multiplexers in cdc_unit. In this case the testability problem caused by the internally
generated reset signal is solved in the RTL code, but it could also be solved automatically. The
feature of the Design Compiler used to solve these kinds of problems is called AutoFix. It is
enabled with the commands in the file input/audioport.dft_autofix.tcl that can be executed by
enabling the following variable definition in input/0_setup_audioport.tcl. The autofix file
contains some examples of commands that can be used to enable automatic fixing of test
design rule violations.

set DFT_AUTOFIX_SCRIPT "input/audioport.dft_autofix.tcl"

4.1.4. Synthesis with Clock Gating


Clock gating is a standard method for reducing the dynamic power consumption of digital
circuits. You can enable clock gating by setting the GATE_CLOCK variable in
input/0_setup_audioport.tcl.
set GATE_CLOCK 1

You should first run synthesis without clock gating, and then with clock gating, and find out from
the power consumption reports if the design benefits from clock gating.

You can also experiment with the following parameter, which limits the number of flip-flops that
can be controlled by one clock gating cell. The optimal value depends on the design's
properties.

set CLOCK_GATE_MAX_FANOUT 16

set_clock_gating_style -max_fanout $CLOCK_GATE_MAX_FANOUT \
    -positive_edge_logic {integrated} \
    -negative_edge_logic {or} \
    -control_point before \
    -control_signal scan_enable

The -positive_edge_logic option instructs the tool to use an integrated clock gating cell from the
target library for gating rising-edge sensitive flip-flops, and the -negative_edge_logic option to
choose any suitable OR gate for falling-edge sensitive flip-flops (the library we use does not
have an integrated cell for this purpose). The -control_point and -control_signal options tell how
the clock gates should be bypassed in test mode by the logic added by the synthesis tool to
improve testability. As a result, the scan_en_in signal will be connected to a second enable
input of the latch inside the clock gating cell.

4.2. Automatic Test Pattern Generation (ATPG)


In this task, test patterns for the circuit are created using Synopsys TMAX for the gate-level
version of the design. In a real design project, a post-layout netlist would be used. However,
since ATPG is closely related to the DFT tasks, it is done together with them in this project.
Even though we have not designed a complete chip, test pattern generation may reveal
hard-to-test parts in the design, which should be addressed by redesigning the RTL.

TMAX uses the stuck-at fault model and tries to generate test patterns that allow all possible
faults to be detected by using the scan path. The TCL script uses the following variables that
you can change if needed:

set POSTLAYOUT_ATPG 0
set TMAX_ABORT_LIMIT 1
set TMAX_CONTINUE_ATPG 0

The POSTLAYOUT_ATPG = 0 setting tells the script to use the gate-level netlist
(results/audioport_gatelevel.v). The abort limit setting limits the time TMAX spends searching
for patterns. You can increase the limit if test coverage is low after an ATPG run. If
TMAX_CONTINUE_ATPG = 1, the script will continue with the patterns already found and
saved during previous sessions, so that it does not have to start from the beginning if
not-detected (ND) faults remain.

This TMAX script reads the files results/audioport_postlayout.v (Verilog netlist) and
results/audioport_postlayout.stil (scan chain descriptions), and starts to generate patterns.

If ATPG results are bad (test coverage < 100%) and you have no idea where the hard-to-test
part of the design is, you can launch the TMAX GUI (command tmax) and run the script from
the File menu. After the script has finished, you can use the Analyze command in the GUI to
print out names of some of the gates that could not be tested (Analyze > Faults > Fill > Class =
ND and OK) and find out the reason why a test could not be generated. If the reason was
"abort", you can increase the abort limit (run time). If running ATPG longer does not help, you
have to think of ways to improve testability, e.g. by making combinational logic less deep.

The initial TCL settings are probably too "weak" for the complete design, so if the initial test
coverage is low, increase the abort limit and set TMAX_CONTINUE_ATPG to 1, and run the
script again. The reports
● 9_tmax_audioport_not_observed_faults.txt
● 9_tmax_audioport_not_detected_faults.txt
help you understand which parts of the design have testability problems. Violations are reported
in
● 9_tmax_audioport_violations.txt

After the patterns have been created, you can simulate the design with the patterns as test data
using a testbench generated by TMAX. This allows you to see how the design and the patterns
would work with real test equipment.

4.3. Gate-Level Verification


Verify the synthesized gate-level model against your RTL model with the Synopsys Formality logic
equivalence checker. Before running the script, increase the verification time-out limit to at least
two hours by using the following setting in 0_setup_audioport.tcl. The format is
hours:minutes:seconds in wall clock time.

set FORMALITY_TIMEOUT_LIMIT "02:00:00"

4.4. Prototype Layout Design


The purpose of the layout design phase is to create a layout artwork that can be used to create
the photomasks that are needed in the chip manufacturing process. The actual final layout is
created when the complete system-on-a-chip design is assembled from the tens or hundreds of
blocks that constitute the SoC design. At this phase, the aim is to create a trial layout for the
audioport block to obtain two kinds of information:
● true silicon area of the design (and therefore cost)
● wire lengths from which parasitic capacitances can be accurately calculated and used
for timing and power consumption analysis (performance)

[Figure: prototype layout with its width and height marked. Wiring in metal layers: different layers shown in different colors.]

At block level, creating a layout is a fairly simple process, once all design and technology data
have been properly set up. To create a prototype layout, the following data are required:
● Design data
○ Gate-level netlist from logic synthesis tool (results/audioport_gatelevel.v)
○ Timing constraints file from logic synthesis tool
(results/audioport_gatelevel.sdc)
○ Clock-tree specification file (not used in this exercise, the layout tool creates a
clock tree with default settings)
○ Scan chain definition file from logic synthesis tool (results/audioport.scandef)
○ I/O pin placement assignment file (not used in this exercise, the layout tool selects pin
locations automatically)
● Technology data (defined in layout tool command script)
○ Standard cell timing libraries
○ Standard cell layout images ("LEF-files")
○ Routing capacitance tables
○ Global signal names (power, ground)

The layout of the audioport block is created with the Cadence Innovus digital implementation tool.
Innovus reads the gate-level Verilog netlist file that was created by the logic synthesis tool,
places the components’ layout images in rows inside a rectangular “tile”, and finds routes for
metal layer wires to connect the components’ terminals. Innovus also builds a buffering tree for
the clock signal. The script scripts/6_innovus_layout_synthesis.tcl is used to control the
tool. The script saves the placed and routed design in output/audioport_postlayout.enc,
which you can later open with File > Restore Design after executing run/innovus.

Notice: The Innovus script runs with optimization effort level "express", which can leave
connectivity and geometrical design rule violations in the design. You can ignore these, or run
the script with a higher effort level setting as explained below in section 4.8.

Innovus GUI Usage Tips

1. Set Innovus’s view mode to “Physical” by clicking on the rightmost button.
2. Using the ruler tool from the toolbar above the layout view, you can measure the dimensions of
the layout box and write them in your report (the dimensions are in micrometers).
3. If your design has negative slack (worst negative slack WNS < 0), execute Timing > Debug
Timing to bring up a window from where you can select the most critical paths to see how they
are routed in the layout. Selecting a path from the list highlights it in the layout. This is useful if
you have a large design with many modules whose positions in the floorplan you can change.
4. Open the Clock Tree Browser window from Clock > Browse Clock Tree..., then select signal
CLK, click on Select, select Post-Route, and click OK. You can now see the components used
in the clock tree.
5. The check boxes in the Physical layer selection tab allow you to control the visibility (left)
and selectability (right) of different mask layers. You can, for instance, hide the wires to see the
standard cells. Double-clicking a cell will show its properties.
6. If your gate-level design has several modules, you can see how much area each submodule
requires by switching the layout view to amoeba mode (middle button).
7. You can save layout images from the Tools menu.

The layout script uses a standard cell density setting of 70%, which means that 70% of the silicon
area is used for components and the rest is left free for vertical and horizontal routing
channels. If you want to use a different utilization setting, set the following variable in
0_setup_audioport.tcl, and execute run/innovus_layout again.

set INNOVUS_STANDARD_CELL_DENSITY 0.7
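
The effect of the density setting on the floorplan size can be estimated with a short calculation. The sketch below assumes a square floorplan without any margins and a hypothetical total standard cell area of 70,000 µm². The real cell area comes from the synthesis area report, and the function name is only illustrative.

```python
import math


def floorplan_estimate(cell_area_um2, density):
    """Estimate the core area and the side length of a square floorplan."""
    if not 0.0 < density <= 1.0:
        raise ValueError("density must be in the range (0, 1]")
    # Density is the fraction of the core area occupied by standard cells.
    core_area = cell_area_um2 / density
    side = math.sqrt(core_area)
    return core_area, side


# Hypothetical cell area with the default density setting of 0.7.
area, side = floorplan_estimate(70_000.0, 0.7)
print(f"core area {area:.0f} um^2, side {side:.1f} um")
```

Lowering the density gives a larger floorplan with more room for routing. Raising it shrinks the floorplan.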

4.5. Post-Layout Equivalence Check


You can verify that the netlist results/audioport_gatelevel.v created by Design Compiler is
logically identical to the netlist results/audioport_postlayout.v created by Innovus by running
Synopsys Formality.

4.6. Post-Layout Static Timing Analysis (STA)


Post-layout timing analysis uses actual wiring capacitances and resistances extracted from the
layout to estimate delays. The results are therefore more accurate than in logic synthesis, which
uses simpler timing models. In this project, we use the PrimeTime static timing analysis tool to
generate post-layout timing reports, in addition to reports generated by Innovus.

The STA tool reads the following inputs:


1. The post-layout gate-level Verilog netlist results/audioport_postlayout.v generated by
Innovus
2. The synthesis timing constraints file results/audioport_gatelevel.sdc saved from
Design Compiler
3. Standard Parasitic Exchange Format (SPEF) files results/audioport_postlayout.max.spef.gz
and results/audioport_postlayout.min.spef.gz, which contain wire capacitance and
resistance values extracted from the layout by Innovus. In this exercise, the max and
min RC corners are the same. In a more realistic use case, setup analysis would be
done with max parasitic values, and hold analysis with min values.

After reading the inputs, the tool provides an overview of the setup and hold timing for all path
groups:
● Total Negative Slack (TNS)
● Worst Negative Slack (WNS)
● Number of Violating Endpoints (NVE)

When you select a line in the list, a histogram that shows an overview of the timing paths in that
group is shown. In the example shown below, all paths had positive slack, as indicated by
the NVE and WNS columns in the summary table above. If you select a bin in the histogram, the
slack of the paths in the selected bin is shown in the list below the histogram.
From the path list, you can use the right-mouse-button menu command Inspect Worst Paths to
open a path inspector window for the selected paths. There you can see a detailed description
of the paths, and generate path schematics and waveforms that help you analyze path timing.
Below is an example of a waveform that shows how the slack value has been computed.

[Figure: path timing waveform showing the startpoint, the data arrival time at the endpoint, and the endpoint.]

PrimeTime also shows the standard timing reports for setup and hold timing in the reports
directory.
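
The summary values in the overview can be related to individual path slacks with a small calculation. The sketch below uses the usual definition of setup slack as required time minus arrival time, and takes WNS as the smallest endpoint slack. The slack values are invented for illustration, and the function names are not part of PrimeTime.

```python
def setup_slack(required_time_ns, arrival_time_ns):
    """Setup slack of one endpoint: positive when data arrives early enough."""
    return required_time_ns - arrival_time_ns


def timing_summary(endpoint_slacks_ns):
    """Compute WNS, TNS and NVE from a list of endpoint slacks."""
    violating = [s for s in endpoint_slacks_ns if s < 0]
    return {
        "WNS": min(endpoint_slacks_ns),  # smallest slack in the group
        "TNS": sum(violating),           # sum of all negative slacks
        "NVE": len(violating),           # number of violating endpoints
    }


# Hypothetical endpoint slacks in nanoseconds.
slacks = [setup_slack(9.5, 9.0), -0.25, 1.0, -0.75]
print(timing_summary(slacks))
```

A path group without violations has NVE = 0 and TNS = 0, and its smallest slack is positive.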

In this project, STA is used to identify the most critical register-to-register paths in the design. If
the paths have negative slack, you don't have to try to fix them, unless there is an obvious reason
and the fix is simple. The aim is to learn to analyze the timing so as to identify problematic parts
in the design.

4.7. Post-Layout Power Analysis


In this task the aim is to learn to analyze the dynamic power consumption of the design. The
analysis principle is simple: the post-layout design is simulated as usual. During simulation,
inside a time interval in which the power consumption is assumed to be representative or
otherwise interesting, all switching activity of the circuit (state changes of wires) is saved in a
value change dump (VCD) file. This file is later read into a power analysis program along with
the circuit's wiring capacitance data. When these two data sets are combined, power
consumption can be calculated using basic circuit theory.
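
The principle can be sketched with a simplified switching power model. The sketch assumes that every toggle of a net with capacitance C at supply voltage V dissipates an energy of C·V²/2, and it ignores the internal cell power and leakage that the power analysis tool takes from the cell library. The capacitances, toggle counts, and supply voltage are invented example values.

```python
def dynamic_power_w(nets, vdd_v, window_s):
    """Average switching power over the snapshot window.

    nets: iterable of (capacitance_in_farads, toggle_count) pairs.
    """
    energy_j = sum(0.5 * cap * vdd_v ** 2 * toggles for cap, toggles in nets)
    return energy_j / window_s


# Hypothetical activity recorded during a 5000 ns snapshot.
nets = [
    (10e-15, 1000),  # 10 fF net that toggled 1000 times
    (2e-15, 50),     # 2 fF net that toggled 50 times
]
power = dynamic_power_w(nets, vdd_v=1.0, window_s=5000e-9)
print(f"{power * 1e6:.2f} uW")
```

The toggle counts correspond to the switching activity stored in the VCD file, and the capacitances correspond to the wiring capacitance data extracted from the layout.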

4.7.1. Post-Layout Power Simulation


Enter the VCD snapshot start time and length into file 0_setup_audioport.tcl. Use the start
time you wrote in your project progress report section 7.8.1. and calculate the length value.
Here is an example:
set VCD_SNAPSHOT_START_TIME 500000ns
set VCD_SNAPSHOT_LENGTH 5000ns

These variable settings tell the simulation script to create a VCD file of the activity captured
during the specified interval. The power simulation script does the following:
1. Run for $VCD_SNAPSHOT_START_TIME
2. Enable VCD capture
3. Run for $VCD_SNAPSHOT_LENGTH
4. Disable VCD capture
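
The length value mentioned above is the difference between the end time and the start time recorded in the progress report. The helper below is a hypothetical convenience sketch that computes the length and prints the two settings in the form shown in the example. The times are the example values, not recommendations.

```python
def vcd_snapshot_settings(start_ns, end_ns):
    """Return the two Tcl setting lines for a snapshot from start_ns to end_ns."""
    if end_ns <= start_ns:
        raise ValueError("end time must be later than start time")
    length_ns = end_ns - start_ns
    return [
        f"set VCD_SNAPSHOT_START_TIME {start_ns}ns",
        f"set VCD_SNAPSHOT_LENGTH {length_ns}ns",
    ]


# Example: snapshot from 500000 ns to 505000 ns.
for line in vcd_snapshot_settings(500000, 505000):
    print(line)
```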

Timing checks are disabled in this simulation script. The script saves VCD data in file
results/postlayout_postlayout.vcd for use with the PrimePower power analysis tool. Notice
that the VCD file is typically huge.

4.7.2. Dynamic Power Consumption Estimation


You can analyze power consumption with Synopsys PrimePower. The script saves the power
report in reports/8_primepower_audioport_postlayout_timebased_power.txt.

You can use Power > Show Power Analysis Driver to get a quick view of the total power
consumption of the circuit. Switch the view mode from Show by type to Show by predefined
groups to see the contribution of different structures. This information is also in the report file.

The command Power > New Power Design Map > Total Power Density shows a "heatmap"
that represents each component's and hierarchical module's power density by the size and color of
its rectangle.

Use the command Power > New Power Design Map > Total Power (or use the drop-down list) to
find the component that has the highest power consumption.

The “toggle coverage” map mode can be used to make sure that state changes were recorded
in all parts of the circuit so that you can rely on the power estimates.

PrimePower also opens a power waveform viewer that shows the power consumption as a
function of time as recorded in the simulation. The script also saves the waveform view as an
image in reports/8_wv_audioport_power.png. You can create your own screen dumps with
WaveView > Dump Screen. The image below shows an example of the default dump that
presents the complete dataset. Each waveform represents one of the top-level modules' power
consumption as a function of time.
4.8. Design Optimization
If you have negative setup slack in the design, you can try these additional settings:

1. Comment out the set_ungroup commands in audioport.syn_constraints.tcl, thus
allowing Design Compiler to change the design hierarchy during optimization.

if { $EDA_TOOL == "Design-Compiler" } {
set_ungroup control_unit_1 false
set_ungroup dsp_unit_1 false
set_ungroup i2s_unit_1 false
set_ungroup cdc_unit_1 false
}

2. Increase the cell density setting in 0_setup_audioport.tcl that determines the floorplan
size Innovus uses. Increasing density makes the silicon area smaller but makes routing
more difficult.

set INNOVUS_STANDARD_CELL_DENSITY 0.7


3. Increase the Innovus optimization effort in 0_setup_audioport.tcl. The script runs with
the effort setting "express". Other possible values are "standard" and "extreme".

set INNOVUS_OPTIMIZATION_EFFORT "standard"
