Week 6 Lecture Material_watermark
Components of Timing Closure
1. Timing-driven placement
Minimizes signal delays when assigning locations to circuit elements.
2. Timing-driven routing
Minimizes signal delays when selecting routing topologies and specific routes.
3. Physical synthesis
Sizing transistors or gates to decrease the delay or increase the drive strength of a
gate.
Inserting buffers into nets to decrease propagation delays.
Restructuring the circuit along its critical paths.
Background
• For many years, signal propagation delay in logic gates was
the main contributor to circuit delay, while wire delay was
negligible.
– Cell placement and wire routing had little effect on circuit performance.
• Technology scaling post-1990 significantly increased the
relative impact of wire-induced delays.
– High-quality placement and routing have become critical for timing
closure.
Background
• Mid-1980s scenario: 85% gate delay, 15% wire delay. Most of the input-to-output delay of the logic is due to gate delay.
• Mid-1990s scenario: 50% gate delay, 50% wire delay. Half of the input-to-output delay of the logic is due to wire delay.
Quick Recap of Setup and Hold Times
• Timing optimization tools adjust propagation delays through
circuit components, with the primary goal of satisfying timing
constraints. There are two types of constraints:
– Setup (long-path) constraints: Amount of time a data input signal
should be stable before the clock edge for each storage element.
– Hold (short-path) constraints: Amount of time a data input signal
should be stable after the clock edge at each storage element.
(a) Setup Constraints
• Ensure that no signal transition occurs too late.
• Initial phases of timing closure focus on these types of
constraints:
$$t_{cycle} \ge t_{combDelay} + t_{setup} + t_{skew}$$
• Checking whether a circuit meets setup constraints requires
estimating how long signal transitions will take to propagate
from one storage element to the next.
– Typically uses Static Timing Analysis.
• What is Static Timing Analysis?
– Propagates actual arrival times (AAT) and required arrival times (RAT)
to the terminals of every gate or cell.
– Can quickly identify timing violations, and diagnose them by tracing
out critical paths in the circuit that are responsible for these timing
failures.
– Models propagation of signal transitions with the worst possible delay.
– Typically excludes false paths from the analysis.
– For every timing point x in the circuit netlist, the timing slack is
computed as:
SLACK(x) = RAT(x) – AAT(x)
– Positive slack means timing has been met; negative means violation.
– Guided by slack values, physical synthesis restructures the netlist to
make it more suitable for high-performance layout implementation.
• Gates lying on critical paths can be upsized to propagate signals faster.
• Buffers may be inserted into long critical wires.
• The netlist tree can be restructured to decrease the overall depth.
Hold-time Constraints
• Ensure that signal transitions do not occur too early.
– Hold violations can occur when a signal path is too short, allowing a
receiving flip-flop to capture the signal in the current cycle instead of
the next cycle.
• Hold-time constraint is given by:
$$t_{combDelay} \ge t_{hold} + t_{skew}$$
– Clock skew affects hold-time constraints significantly more than setup
constraints. So, hold-time constraints are typically enforced after
synthesizing the clock network.
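As a quick numerical illustration of the setup and hold inequalities, the following Python sketch computes the margin of each constraint for a single register-to-register path. The function names and all delay values are hypothetical and chosen only for illustration; in a real flow these numbers would come from timing analysis.

```python
def setup_margin(t_cycle, t_comb_delay, t_setup, t_skew):
    """Margin of the setup constraint: t_cycle >= t_comb + t_setup + t_skew."""
    return t_cycle - (t_comb_delay + t_setup + t_skew)


def hold_margin(t_comb_delay, t_hold, t_skew):
    """Margin of the hold constraint: t_comb >= t_hold + t_skew."""
    return t_comb_delay - (t_hold + t_skew)


# Hypothetical values in nanoseconds (assumed, not taken from the lecture).
long_path = setup_margin(t_cycle=2.0, t_comb_delay=1.5, t_setup=0.25, t_skew=0.125)
short_path = hold_margin(t_comb_delay=0.125, t_hold=0.0625, t_skew=0.125)

print("setup met:", long_path >= 0)   # long path fits in the cycle
print("hold met:", short_path >= 0)   # short path is too fast here
```

A negative hold margin, as in the second call, is the situation in which delay would have to be added to the short path.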
END OF LECTURE 32
Lecture 33: TIMING CLOSURE (PART 2)
Clock
• The maximum clock frequency for a given design depends upon:
– Gate delays, which are the signal delays due to gate transitions.
– Wire delays, which are the delays associated with signal propagation
along wires.
– Clock skew.
• Need to quickly estimate sequential circuit timing:
– Perform static timing analysis (STA).
– Assume clock skew is negligible; its effect is postponed until after clock
network synthesis.
Static Timing Analysis
• We represent a combinational logic netlist as a directed acyclic graph (DAG).
• The inputs are annotated with times 0, 0 and 0.6 time units respectively, at
which signal transitions occur relative to the start of the clock cycle.
• The gate and wire delays are also shown.
DAG Representation
• The graph has one vertex for each input and output, as well as one vertex
for each logic gate.
• A source node s is introduced with a directed edge to each input.
• Vertices corresponding to logic gates are labeled with the respective gate
delays.
• Directed edges from the source to the inputs are labeled with transition
times, and directed edges between gate vertices are labeled with wire
delays.
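[Figure: DAG of the example circuit. Source edges: s→a <0>, s→b <0>, s→c <0.6>. Gate delays: x (1), y (2), z (2), w (2). Wire delays: a→y (0.15), b→x (0.1), x→y (0.1), x→z (0.3), c→z (0.1), y→w (0.2), z→w (0.25), w→f (0.2).]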
Actual Arrival Time (AAT)
• The AAT of a given node v ∈ V, denoted as AAT(v), is defined as the latest
transition time at v measured from the beginning of the clock cycle.
– By convention, AAT(v) records the arrival time at the output side of node v.
– In the previous example, AAT(x) = 0.1 + 1 = 1.1, AAT(y) = 1.1 + 0.1 + 2 = 3.2
• Formal definition:
$$\mathrm{AAT}(v) = \max_{u \in FI(v)} \bigl(\mathrm{AAT}(u) + t(u, v)\bigr)$$
where FI(v) is the set of all nodes from which there exists a directed edge
to v, and t(u,v) is the delay corresponding to the (u,v) edge.
• All AAT values in the DAG can be computed in O(|V| + |E|) time.
– Linear in the number of gates and edges.
• This linear scaling of runtime makes STA applicable to modern designs
with hundreds of millions of gates.
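The AAT recurrence can be evaluated in one pass over the vertices in topological order. The sketch below does this for the example circuit. It assumes that the edge list has been read correctly from the example figure, and it follows the stated convention that AAT(v) is recorded at the output side of v, so the gate delay of v is added after taking the maximum over its fan-in. The names gate_delay and edge_delay are illustrative.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Gate delay per vertex (source s, inputs a/b/c and output f have delay 0).
gate_delay = {"s": 0, "a": 0, "b": 0, "c": 0, "x": 1, "y": 2, "z": 2, "w": 2, "f": 0}

# Edge delays: source edges carry input transition times, the rest are wire delays.
edge_delay = {
    ("s", "a"): 0, ("s", "b"): 0, ("s", "c"): 0.6,
    ("a", "y"): 0.15, ("b", "x"): 0.1, ("x", "y"): 0.1,
    ("x", "z"): 0.3, ("c", "z"): 0.1,
    ("y", "w"): 0.2, ("z", "w"): 0.25, ("w", "f"): 0.2,
}

# Fan-in lists double as the predecessor map expected by TopologicalSorter.
fanin = {v: [] for v in gate_delay}
for u, v in edge_delay:
    fanin[v].append(u)

aat = {}
for v in TopologicalSorter(fanin).static_order():
    latest_input = max((aat[u] + edge_delay[(u, v)] for u in fanin[v]), default=0.0)
    aat[v] = latest_input + gate_delay[v]

for v in "abcxyzwf":
    print(v, round(aat[v], 2))
```

Each vertex and each edge is visited once, which is where the O(|V| + |E|) bound comes from.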
[Figure: the example DAG annotated with AAT values, e.g. AAT(a) = 0 and AAT(y) = 3.2.]
Required Arrival Time (RAT)
• The RAT of a given node v ∈ V, denoted as RAT(v), is defined as the time by
which the latest transition at a given node v must occur in order for the
circuit to operate correctly within a given clock cycle.
– Unlike AATs, which are determined from multiple paths from upstream inputs
and flip-flop outputs, RATs are determined from multiple paths to downstream
outputs and flip-flop inputs.
• Formal definition:
$$\mathrm{RAT}(v) = \min_{u \in FO(v)} \bigl(\mathrm{RAT}(u) - t(v, u)\bigr)$$
where FO(v) is the set of all vertices with a directed edge from v, and t(v,u)
is the delay corresponding to the (v,u) edge.
• It is assumed that the RAT values for the outputs are given.
• For the example, suppose that RAT(f) = 5.5 .
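RATs can be propagated backward in the same way that AATs are propagated forward. The sketch below visits the vertices of the example circuit in reverse topological order, starting from RAT(f) = 5.5. It assumes the same reading of the example figure as before and records each RAT at the output side of the vertex, so the gate delay of the downstream vertex and the connecting wire delay are both subtracted. Variable names are illustrative.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

gate_delay = {"s": 0, "a": 0, "b": 0, "c": 0, "x": 1, "y": 2, "z": 2, "w": 2, "f": 0}
edge_delay = {
    ("s", "a"): 0, ("s", "b"): 0, ("s", "c"): 0.6,
    ("a", "y"): 0.15, ("b", "x"): 0.1, ("x", "y"): 0.1,
    ("x", "z"): 0.3, ("c", "z"): 0.1,
    ("y", "w"): 0.2, ("z", "w"): 0.25, ("w", "f"): 0.2,
}

fanin = {v: [] for v in gate_delay}
fanout = {v: [] for v in gate_delay}
for u, v in edge_delay:
    fanin[v].append(u)
    fanout[u].append(v)

order = list(TopologicalSorter(fanin).static_order())

rat = {}
for v in reversed(order):
    if not fanout[v]:
        rat[v] = 5.5  # given RAT at the primary output f
    else:
        # Tightest requirement over all downstream vertices.
        rat[v] = min(rat[u] - gate_delay[u] - edge_delay[(v, u)] for u in fanout[v])

for v in "sabcxyzwf":
    print(v, round(rat[v], 2))
```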
Slack Computation
• The correct operation of the chip with respect to setup constraints (e.g.
maximum path delay) requires that the AAT at each node does not exceed its RAT.
– That is, for all vertices v ∈ V, we must have AAT(v) ≤ RAT(v).
• The slack of a node v is computed as:
slack (v ) = RAT (v ) − AAT (v )
– Critical paths or critical nets are signals that have negative slack.
– Non-critical paths or non-critical nets have positive slack.
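The slack definition is easy to check numerically. The sketch below takes the AAT and RAT values of the example circuit as given data, computes the slack of every node, and lists the nodes with negative slack and the nodes that share the worst slack. Rounding to two decimals is an assumption made here only to hide floating-point noise.

```python
# AAT and RAT values of the example circuit, entered as given data.
aat = {"s": 0.0, "a": 0.0, "b": 0.0, "c": 0.6, "x": 1.1,
       "y": 3.2, "z": 3.4, "w": 5.65, "f": 5.85}
rat = {"s": -0.35, "a": 0.95, "b": -0.35, "c": 0.95, "x": 0.75,
       "y": 3.1, "z": 3.05, "w": 5.3, "f": 5.5}

# slack(v) = RAT(v) - AAT(v)
slack = {v: round(rat[v] - aat[v], 2) for v in aat}

violations = [v for v, s in slack.items() if s < 0]
worst = min(slack.values())
worst_nodes = [v for v, s in slack.items() if s == worst]

print(slack)
print("negative slack:", violations)
print("worst slack", worst, "on", worst_nodes)
```

The nodes sharing the worst slack trace out the most critical path from the source to the output f.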
Final Result with Slacks Computed
Node   AAT    RAT     Slack
s      0      -0.35   -0.35
a      0       0.95    0.95
b      0      -0.35   -0.35
c      0.6     0.95    0.35
x      1.1     0.75   -0.35
y      3.2     3.1    -0.1
z      3.4     3.05   -0.35
w      5.65    5.3    -0.35
f      5.85    5.5    -0.35
Gate delays: s, a, b, c, f (0); x (1); y, z, w (2). Wire delays: s→a (0), s→b (0), s→c (0.6), a→y (0.15), b→x (0.1), x→y (0.1), x→z (0.3), c→z (0.1), y→w (0.2), z→w (0.25), w→f (0.2).
Current Practice
• In modern designs, separate timing analyses are performed for the cases of
rise delay (rising transitions) and fall delay (falling transitions).
• Signal integrity extensions to STA consider changes in delay due to switching
activity on neighboring wires of the path under analysis.
– For signal integrity analysis, the STA tool keeps track of windows (intervals) of
AATs and RATs.
– Typically executes multiple timing analysis iterations before these timing
windows stabilize.
• Statistical STA is a generalization of STA where gate and wire delays are
modeled by random variables and represented by probability distributions.
Drawbacks of STA
1. Assumption of a clock.
Not applicable to asynchronous subsystems.
END OF LECTURE 33
Lecture 34: TIMING CLOSURE (PART 3)
Basic Idea
• Some notations:
– Consider a netlist consisting of logic gates v1, v2, …, vn
– Consider a set of nets e1, e2, …, em, where ei is the output net of gate vi.
– Let t(v) and t(e) denote gate delay and wire delay, respectively.
• The zero-slack algorithm (ZSA) takes the netlist as input, and tries to decrease
the positive slacks of all nodes to zero by increasing t(v) and t(e) values.
• These increased delay values together constitute the timing budget TB(v) of
node v, which should not be exceeded during placement and routing.
TB(v) = t(v) + t(e)
• If TB(v) is exceeded, then the place-and-route tool typically either
(i) decreases the wirelength of e, or (ii) changes the size of gate v.
– The delay impact of a wire or gate size change can be estimated using the Elmore
delay model.
• If most arcs (branches) of a timing path are within budget, then the path
may meet its timing constraints even if some arcs exceed their budgets.
– Thus, another approach to satisfying the timing budget is rebudgeting.
• The zero slack algorithm shall be explained with the help of an illustrative
example.
Basic Steps in ZSA
1. Determine the initial slacks of all the nodes, and select a node vmin with
minimum positive slack slackmin.
2. Find a path of vertices that dominates slackmin, i.e. any change in the
delays in vertices along the path will cause slackmin to change.
3. Evenly distribute the slack by increasing TB(v) for each vertex v in the
path. Each budget increment will decrement the slack value of a vertex.
By repeating the process, the slack of each node in V will end up at zero.
The resulting timing budgets at all nodes are the final output of ZSA.
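The three steps can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the exact procedure used in the lecture example: each node carries a single lumped budget (gate delay plus output-net delay), all outputs share one required time t_req, and a path segment is extended only to neighbors whose AAT and RAT are both determined by the current segment. The function names and the four-node test graph are hypothetical. Exact fractions are used so that the equality tests are reliable.

```python
from fractions import Fraction
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def sta(budget, fanin, fanout, order, t_req):
    """Late-mode STA in which each node's delay is its current budget."""
    aat, rat = {}, {}
    for v in order:
        aat[v] = max((aat[u] for u in fanin[v]), default=Fraction(0)) + budget[v]
    for v in reversed(order):
        rat[v] = min((rat[w] - budget[w] for w in fanout[v]), default=t_req)
    return aat, rat, {v: rat[v] - aat[v] for v in order}


def zero_slack(delay, edges, t_req):
    """Distribute positive slack into per-node budgets until none is left."""
    budget = {v: Fraction(d) for v, d in delay.items()}
    fanin = {v: [] for v in budget}
    fanout = {v: [] for v in budget}
    for u, v in edges:
        fanin[v].append(u)
        fanout[u].append(v)
    order = list(TopologicalSorter(fanin).static_order())
    t_req = Fraction(t_req)
    while True:
        aat, rat, slack = sta(budget, fanin, fanout, order, t_req)
        positive = [v for v in order if slack[v] > 0]
        if not positive:
            return budget, slack
        v_min = min(positive, key=slack.get)   # step 1: minimum positive slack
        path = [v_min]
        extended = True
        while extended:                        # step 2: extend the segment forward
            extended = False
            last = path[-1]
            for w in fanout[last]:
                if rat[w] == rat[last] + budget[w] and aat[w] == aat[last] + budget[w]:
                    path.append(w)
                    extended = True
                    break
        extended = True
        while extended:                        # step 2: extend the segment backward
            extended = False
            first = path[0]
            for u in fanin[first]:
                if (rat[u] == rat[first] - budget[first]
                        and aat[u] == aat[first] - budget[first]):
                    path.insert(0, u)
                    extended = True
                    break
        step = slack[v_min] / len(path)        # step 3: distribute the slack evenly
        for v in path:
            budget[v] += step


# Hypothetical graph: a and b feed c, which feeds d; required time 10 at d.
delay = {"a": 1, "b": 3, "c": 2, "d": 1}
edges = [("a", "c"), ("b", "c"), ("c", "d")]
budget, slack = zero_slack(delay, edges, t_req=10)
print({v: str(b) for v, b in budget.items()})
print({v: str(s) for v, s in slack.items()})
```

In this small graph the path b, c, d is budgeted first, sharing a slack of 4 evenly, and node a then absorbs its remaining slack on its own.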
Example
• Use the zero-slack algorithm to distribute slack
• Format: <AAT, Slack, RAT>, [timing budget]
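[Figure: timing graph with inputs I1 to I4, gates with delays 2, 4, 6, 3 and 0, and outputs O1 and O2. Annotations in extraction order: O1: <13,4,17>; I1 <1,4,5> [0]; <3,4,7> [0]; O2: <6,8,14>; I2 <0,5,5> [0]; <7,4,11> [0]; <13,4,17> [0]; I3 <1,6,7> [0]; <6,8,14> [0]; I4 <3,5,8> [0]; <6,5,11> [0].]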
Example
• Find the path with the minimum non-zero slack (MARKED IN RED).
Example
• Find the path with the minimum non-zero slack.
• Distribute the slacks and update the timing budgets.
[Figure: the same timing graph after this step. Annotations in extraction order: O1: <17,0,17>; I1 <1,0,1> [1]; <4,0,4> [1]; O2: <6,8,14>; I2 <0,2,2> [0]; <9,0,9> [1]; <16,0,16> [1]; I3 <1,4,5> [0]; <6,8,14> [0]; I4 <3,4,7> [0]; <6,4,10> [0].]
Example
• Find the path with the minimum non-zero slack.
• Distribute the slacks and update the timing budgets.
[Figure: the same timing graph after this step. Annotations in extraction order: O1: <17,0,17>; I1 <1,0,1> [1]; <4,0,4> [1]; O2: <6,8,14>; I2 <0,0,0> [2]; <9,0,9> [1]; <16,0,16> [1]; I3 <1,4,5> [0]; <6,8,14> [0]; I4 <3,4,7> [0]; <6,4,10> [0].]
Example
• Find the path with the minimum non-zero slack.
• Distribute the slacks and update the timing budgets.
[Figure: the same timing graph after this step. Annotations in extraction order: O1: <17,0,17>; I1 <1,0,1> [1]; <4,0,4> [1]; O2: <6,8,14>; I2 <0,0,0> [2]; <9,0,9> [1]; <16,0,16> [1]; I3 <1,2,3> [2]; <6,8,14> [0]; I4 <3,2,5> [0]; <6,2,8> [2].]
Example
• Find the path with the minimum non-zero slack.
• Distribute the slacks and update the timing budgets.
[Figure: the same timing graph after this step. Annotations in extraction order: O1: <17,0,17>; I1 <1,0,1> [1]; <4,0,4> [1]; O2: <10,4,14>; I2 <0,0,0> [2]; <9,0,9> [1]; <16,0,16> [1]; I3 <1,0,1> [3]; <10,4,14> [0]; I4 <3,1,4> [0]; <7,0,7> [3].]
Example
• Find the path with the minimum non-zero slack.
• Distribute the slacks and update the timing budgets.
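[Figure: the same timing graph after the final step, with all slacks at zero. Annotations in extraction order: O1: <17,0,17>; I1 <1,0,1> [1]; <4,0,4> [1]; O2: <14,0,14>; I2 <0,0,0> [2]; <9,0,9> [1]; <16,0,16> [1]; I3 <1,0,1> [3]; <10,0,10> [4]; I4 <3,0,3> [1]; <7,0,7> [3].]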
A Modification: Early Mode Analysis
• ZSA uses late-mode analysis with respect to setup constraints, i.e. the
latest times by which signal transitions can occur for the circuit to operate
correctly.
• Correct operation also depends on satisfying hold-time constraints on the
earliest signal transition times.
• Early-mode analysis considers these constraints.
How it Works?
• To correctly analyze this timing constraint, the earliest actual arrival time
of signal transitions at each node must be determined.
• The required arrival time of a sequential element in early mode is the time
at which the earliest signal can arrive and still satisfy the library-cell hold-
time requirement.
• For each gate v, $AAT_{EM}(v) \ge RAT_{EM}(v)$ must be satisfied.
– $AAT_{EM}(v)$ is the earliest actual arrival time of a signal transition at gate v.
– $RAT_{EM}(v)$ is the required arrival time in early mode at gate v.
• The early-mode slack can be defined as:
$$slack_{EM}(v) = AAT_{EM}(v) - RAT_{EM}(v)$$
• When adapted to early-mode analysis, ZSA is also called the near zero-
slack algorithm.
– The modified algorithm seeks to decrease TB(v) by decreasing t(v) or t(e), so
that all nodes have minimum early-mode timing slacks.
– Since t(v) and t(e) cannot be negative, node slacks may not necessarily all
become zero.
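The difference between late-mode and early-mode arrival times comes down to taking the maximum or the minimum over the fan-in. The sketch below shows this on a hypothetical three-node graph in which a launching node q reaches a capturing node d both directly and through a gate g. All names, delays, and the early-mode required time are assumptions made for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical circuit: q -> g -> d (long path) and q -> d (short path).
gate_delay = {"q": 0.0, "g": 2.0, "d": 0.0}
wire_delay = {("q", "g"): 0.25, ("g", "d"): 0.25, ("q", "d"): 0.5}
fanin = {"q": [], "g": ["q"], "d": ["g", "q"]}


def arrival_times(pick):
    """Propagate arrival times; pick=max gives late mode, pick=min early mode."""
    aat = {}
    for v in TopologicalSorter(fanin).static_order():
        inputs = (aat[u] + wire_delay[(u, v)] for u in fanin[v])
        aat[v] = pick(inputs, default=0.0) + gate_delay[v]
    return aat


aat_late = arrival_times(max)
aat_early = arrival_times(min)

rat_em_d = 0.75  # assumed early-mode required time at d (hold requirement)
slack_em_d = aat_early["d"] - rat_em_d

print("latest arrival at d:", aat_late["d"])
print("earliest arrival at d:", aat_early["d"])
print("early-mode slack at d:", slack_em_d)
```

A negative early-mode slack, as produced here, indicates that the short path would need additional delay to satisfy the hold requirement.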
To Summarize
• In practice, if the delay of a node does not satisfy its early-mode timing
budget, the delay constraint can be satisfied by adding additional delay
(padding) to appropriate components.
– The additional delay can violate late-mode timing constraints.
• Thus, a circuit should be first designed with ZSA and late-mode analysis.
Early-mode analysis may then be used to confirm that early-mode
constraints are satisfied, or to guide circuit modifications to satisfy such
constraints.
END OF LECTURE 34
Lecture 35: TIMING CLOSURE (PART 4)
An example:
The path of length 400 is never exercised.
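[Figure: two cascaded 2-to-1 multiplexers. The first MUX selects between u (delay 200, input 1) and v (delay 100, input 0) and drives fi; the second MUX selects between x (delay 200, input 1) and y (delay 100, input 0).]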
Multi-cycle Paths
• Data paths that require more than one clock period for
execution.
Timing Analysis Problems
• We want to determine the true critical paths of a circuit in
order to:
– Determine the minimum cycle time for which the circuit will function.
– Identify critical paths for performance optimization, so that effort is not
spent optimizing the wrong (non-critical) paths.
• Implications:
– Do not want false paths (produced by static delay analysis).
– Delay model is worst case model.
Functional Timing Analysis
• Estimate when the output of a given circuit gets stable.
[Figure: a combinational block whose inputs are applied at time 0, with the clock period running from 0 to T.]
Why Timing Analysis?
• Timing verification
– Verifies whether a design meets a given timing constraint.
• Example: cycle-time constraint
• Timing optimization
– Needs to identify critical portion of a design for further optimization.
• Critical path identification
• In both cases, the higher the accuracy, the better.
Timing Analysis - Basics
• Naïve approach - Simulate all input vectors with SPICE
– Accurate, but too expensive.
• Gate-level timing analysis
– Less accurate than SPICE due to the level of abstraction, but much more
efficient.
– Scenario:
• Gate/wire delays are pre-characterized (accuracy loss).
• Perform timing analysis of a gate-level circuit assuming the gate/wire
delays.
Gate-level Timing Analysis
• A naive approach is topological analysis.
– Easy longest-path problem.
– Linear in the size of a network.
• Not all paths can propagate signal events.
– False paths.
– If all longest paths are false, topological analysis gives a delay overestimate.
• Functional timing analysis = false-path-aware timing analysis.
– Compute false-path-aware arrival time.
[Figure: network with inputs x1 and x2, arr(x1) = 0 and arr(x2) = 0, and output z whose false-path-aware arrival time arr(z) is sought.]
Example: 2-bit Carry-skip Adder
[Figure: 2-bit carry-skip adder with inputs c_in, a0, b0, a1, b1 and outputs s0, s1, c_out. The figure marks a path of length 5 and a path of length 1.]
False Path Analysis - Basics
• Is a path responsible for delay?
– If the answer is no, can ignore the path for delay computation.
• Check the falsity of long paths until we find the longest true path.
– How can we determine whether a path is false?
Possible Approach :: Boolean Difference
[Figure: consecutive nodes fi-1, fi, fi+1 along a path.]
• So output P is sensitive to f0 if
Example :: Static False Path
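[Figure: the two cascaded multiplexers of the earlier example, with outputs fi and fj and data inputs u (200), v (100), x (200), y (100).]
The path is not sensitizable and hence is false. Hence, $\frac{\partial f_i}{\partial u} \cdot \frac{\partial f_j}{\partial x} = 0$.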
Definitions
• Given a simple gate (i.e. AND, OR, NAND, NOR), a controlling value on an
input determines the output of the gate independent of the other inputs.
• Given a simple gate (i.e. AND, OR, NAND, NOR), a non-controlling value on
an input cannot determine the output of the gate independent of the
other inputs.
– 0 is a controlling value for AND gate; 1 is non-controlling value for AND gate.
• Controlling / non-controlling value is merely a specialization of the
Boolean difference to simple gates.
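The link between the Boolean difference and non-controlling values can be checked by brute force. The sketch below computes, for a small gate function, the side-input assignments under which the output changes when one chosen input is toggled, i.e. where the Boolean difference with respect to that input is 1. The helper name sensitizing_side_inputs is hypothetical.

```python
from itertools import product


def sensitizing_side_inputs(f, n_inputs, i):
    """Side-input assignments for which f is sensitive to input i.

    The Boolean difference of f with respect to input i is
    f(..., 0, ...) XOR f(..., 1, ...); it equals 1 exactly for the
    assignments returned here.
    """
    result = []
    for side in product((0, 1), repeat=n_inputs - 1):
        with_zero = side[:i] + (0,) + side[i:]
        with_one = side[:i] + (1,) + side[i:]
        if f(*with_zero) != f(*with_one):
            result.append(side)
    return result


def and2(a, b):
    return a & b


def or2(a, b):
    return a | b


and_sides = sensitizing_side_inputs(and2, 2, 0)
or_sides = sensitizing_side_inputs(or2, 2, 0)
print("AND is sensitive to a when b is in", and_sides)
print("OR is sensitive to a when b is in", or_sides)
```

For both gates the only sensitizing side-input value is the non-controlling one (1 for AND, 0 for OR).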
[Figure: two simple gates f and g, each with inputs a and b.]
Controlling/Non-Controlling Values
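– AND: controlling input value 0, controlled output value 0, non-controlling input value 1.
– OR: controlling input value 1, controlled output value 1, non-controlling input value 0.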
END OF LECTURE 35
Lecture 36: TIMING CLOSURE (PART 5)
Static Sensitization (contd.)
• The (dashed) path is responsible for delay!
• Delay underestimation by static sensitization (delay = 2 when true
delay = 3)
– incorrect condition
What is Wrong with Static Sensitization?
• The idea of forcing non-controlling values to side inputs is
okay, but timing was ignored.
– The same signal can have a controlling value at one time and a non-
controlling value at another time.
Timing Simulation
[Figure: timing simulation of the example circuit for one input vector.]
Implies that delay = 0 for these inputs
BUT!
[Figure: the same circuit and inputs, with one gate delay reduced from 4 to 2.]
Implies that delay = 4 with the same set of inputs.
What is Wrong with Timing Simulation?
• If gate delays are reduced, delay estimates can increase.
• Not acceptable since
– Gate delays are just upper-bounds, actual delay is in [0,d].
• Delay uncertainty due to manufacturing.
– We are implicitly analyzing a family of circuits where gate delays are
within the upper-bounds.
Monotone Speedup Property
• Definition: A delay estimate has the monotone speedup property if, for any
circuit C and any circuit C’ obtained from C by reducing some gate delays,
delay_estimate(C’) ≤ delay_estimate(C).
Timing Simulation Revisited
[Figure: timing simulation of the same circuit, revisited.]
An annotation of 4 on a rising transition means that the rising signal occurs anywhere between t = -∞ and t = 4.
What we just saw …
• Timed 3-valued (0,1,X) simulation
– called X-valued simulation.
• Monotone speedup property is satisfied.
SAT Based False Path Analysis
• Satisfiability (SAT) solvers are used for solving a wide range of
problems.
• Modern SAT solvers run very fast and can handle a large
number of variables.
• Given a Boolean function F in product-of-sums form, a SAT solver
tries to find some assignment of the variables for which F = 1.
The SAT Formulation
Decision problem:
Is there an input vector under which the output gets stable only after t = T ?
Idea:
1. Characterize the set of all input vectors S(T) that make the output stable
no later than t = T.
2. Check if S(T) contains S = all possible input vectors.
This check is solved as a SAT problem:
Is S \ S(T) empty? (set difference followed by an emptiness check)
• Let F and F(T) be the characteristic functions of S and S(T).
• Is F ∧ !F(T) satisfiable?
Example
[Figure: example circuit with inputs a, b, c, internal nodes d, e, f, and output g.]
g(1,t=2) : the set of input vectors under which
g gets stable to value = 1 no later than t =2
On-set: stabilized by t = 2?
g(1,t=2) = d(1,t=1) ∩ f(1,t=1)
= (a(0,t=0) ∩ b(0,t=0)) ∩ (c(1,t=0) ∪ e(1,t=0))
= !a!b(c ∪ ∅) = !a!bc = S1(t=2)
g(1,t=∞) = on-set = !a!bc = g(1,t=2) = S1
g(0,t=2) : the set of input vectors under which
g gets stable to value = 0 no later than t=2
g(0,t=2) = d(0,t=1) ∪ f(0,t=1)
= (a(1,t=0) ∪ b(1,t=0)) ∪ (c(0,t=0) ∩ e(0,t=0))
= (a+b) + (!c ∩ ∅) = a+b = S0(t=2)
g(0,t=∞) = off-set = a+b+!c = S0
g(0,t=2) : the set of input vectors under which
g gets stable to 0 no later than t=2
d
g
a
b e Offset:
f NOT stabilized by t=2
under abc = 000
c
g(0,t=2) = a+b
g(0,t=∞) = offset = a+b+!c
g(0,t=∞) \ g(0,t=2) = (a+b+!c) !(a+b) = !a !b !c = satisfiable
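For a circuit this small, the satisfiability question can be answered by enumerating all input vectors instead of calling a SAT solver. The sketch below does so under explicit assumptions that are not confirmed by the source: every gate has unit delay, d = NOR(a, b), f = OR(c, e), g = AND(d, f), and e is taken to be a buffer of b. These choices are consistent with the on-set !a!bc and off-set a+b+!c derived above, but the actual gate e in the figure may differ. A gate output settles one delay after its earliest controlling input, or one delay after its latest input if no input is controlling.

```python
from itertools import product

# Controlling input value for each simple gate type.
CONTROLLING = {"AND": 0, "OR": 1, "NOR": 1}


def eval_gate(kind, inputs):
    """inputs: list of (value, stable_time). Returns (value, stable_time)."""
    if kind == "BUF":
        (value, time), = inputs
        return value, time + 1
    ctrl = CONTROLLING[kind]
    ctrl_times = [t for v, t in inputs if v == ctrl]
    if ctrl_times:
        # The earliest controlling input alone decides the output.
        out, t_out = ctrl, min(ctrl_times) + 1
    else:
        # All inputs are non-controlling: wait for the latest one.
        out, t_out = 1 - ctrl, max(t for _, t in inputs) + 1
    if kind == "NOR":
        out = 1 - out
    return out, t_out


def analyze(a, b, c):
    """Value and stabilization time of g, with primary inputs stable at t = 0."""
    d = eval_gate("NOR", [(a, 0), (b, 0)])
    e = eval_gate("BUF", [(b, 0)])          # assumed function of e
    f = eval_gate("OR", [(c, 0), e])
    return eval_gate("AND", [d, f])


T = 2
late_vectors = [abc for abc in product((0, 1), repeat=3) if analyze(*abc)[1] > T]
print("input vectors (a, b, c) not stable by t = 2:", late_vectors)
```

A non-empty result corresponds to the formula F ∧ !F(T) being satisfiable. Here the only witness is abc = 000, matching the result derived above.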
Summary
• False-path-aware arrival time analysis is well-understood.
– Practical algorithms exist.
• Can handle industrial circuits easily.
• Remaining problems:
– Incremental analysis (make it so that a small change in the circuit
does not make the analysis start all over).
– Integration with logic optimization.
– DSM issues such as cross-talk-aware false path analysis.
END OF LECTURE 36
Lecture 37: TIMING DRIVEN PLACEMENT
Techniques for Timing-Driven Placement
• Algorithmic techniques for TDP can be categorized as net-based,
path-based, or integrated.
• Two types of net-based techniques:
1. Delay budgeting, which assigns upper bounds to the timing or length of
individual nets.
2. Net weighting, which assigns higher priorities to critical nets during placement.
• Path-based techniques seek to shorten or speed up all timing-critical paths
rather than individual nets.
– More accurate but does not scale to large designs because number of paths
can grow exponentially with number of gates (e.g. multiplier).
• Both path-based and net-based approaches rely on support within the
placement algorithm, and require a dedicated infrastructure for
incremental calculation of timing statistics and parameters.
• Integrated techniques typically use constraint-driven mathematical
formulation in which STA results are incorporated as constraints and
possibly in the objective function.
• In practice, some industrial flows do not incorporate timing-driven
methods during initial placement because timing information can be quite
inaccurate until locations are available.
– Instead, subsequent placement iterations, especially during detailed
placement, perform timing optimizations.
Net Based Techniques
• These approaches impose either quantitative priorities that
reflect timing criticality (net weights), or upper bounds on the
timing of nets in the form of net constraints (delay budgets).
• Net weights are more effective at the early design stages,
while delay budgets are more meaningful if timing analysis is
more accurate.
(a) Net Weighting
• A traditional placer optimizes total wirelength and routability.
• To account for timing, a placer can minimize the total weighted wirelength,
where each net is assigned a net weight.
– The higher the net weight is, the more timing-critical the net is.
• Net weights can be assigned either statically or dynamically to improve the
timing.
Static Net Weights
• They are computed before placement and do not change.
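• They are usually based on slack: the more critical the net (i.e. the smaller
its slack), the greater the weight.
• Static net weights can be either discrete:
$$w = \begin{cases} \omega_1 & \text{if } slack > 0 \\ \omega_2 & \text{if } slack \le 0 \end{cases} \qquad \text{where } \omega_1 > 0,\ \omega_2 > 0, \text{ and } \omega_2 > \omega_1$$
• Or they can be continuous:
$$w = \left(1 - \frac{slack}{t}\right)^{\alpha}$$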
where t is the longest path delay and α is a criticality exponent.
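Both static weighting schemes translate directly into code. The sketch below is a minimal illustration; the function names and the default values chosen for ω1, ω2 and α are assumptions, not values from the lecture.

```python
def discrete_weight(slack, w1=1.0, w2=2.0):
    """Two-level weight: w1 for positive slack, w2 (> w1) otherwise."""
    return w1 if slack > 0 else w2


def continuous_weight(slack, t_longest, alpha=2.0):
    """Weight (1 - slack / t)^alpha, with t the longest path delay."""
    return (1.0 - slack / t_longest) ** alpha


# Hypothetical nets with slacks in the same time unit as t_longest.
for slack in (2.0, 0.0, -2.0):
    print(slack, discrete_weight(slack), continuous_weight(slack, t_longest=4.0))
```

With the continuous form, a net with zero slack gets weight 1, nets with positive slack get less, and nets with negative slack get more, with α controlling how sharply the weight grows.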
• Alternatively, net weights can be assigned based on sensitivity, as:
$$w = w_o + \alpha \, (slack_{target} - slack) \cdot s_w^{SLACK} + \beta \cdot s_w^{TNS}$$
Dynamic Net Weights
• They are computed during placement iterations and keep an updated timing
profile.
• This can be more effective than static net weights, which are computed
before placement and can become outdated when net lengths change.
• Estimated slack of a net at iteration k can be computed as:
$$slack_k = slack_{k-1} - s_L^{DELAY} \cdot \Delta L$$
where $\Delta L$ is the change in wirelength between iterations (k-1) and k,
$slack_k$ is the slack at iteration k, and
$s_L^{DELAY}$ is the delay sensitivity to the wirelength.
• After the timing information has been updated, the net weights should be
adjusted accordingly.
– This incremental method of weight modification is based on previous iterations.
• The net criticality at iteration k is computed as:
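$$\upsilon_k = \begin{cases} \frac{1}{2}(\upsilon_{k-1} + 1) & \text{if among the top 3\% of critical nets} \\ \frac{1}{2}\,\upsilon_{k-1} & \text{otherwise} \end{cases}$$
• And then the net weight is updated as:
$$w_k = w_{k-1} \cdot (1 + \upsilon_k)$$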
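The incremental updates can be traced with a few lines of Python. The sketch below follows one net over three placement iterations; the starting values (criticality 0, weight 1), the function names, and the assumption that the net stays among the most critical nets in every iteration are all illustrative.

```python
def estimate_slack(prev_slack, delay_sensitivity, delta_length):
    """slack_k = slack_{k-1} - s_L^DELAY * delta_L."""
    return prev_slack - delay_sensitivity * delta_length


def update_criticality(prev_criticality, is_top_critical):
    """Move criticality halfway toward 1 if the net is critical, else halve it."""
    if is_top_critical:
        return 0.5 * (prev_criticality + 1.0)
    return 0.5 * prev_criticality


def update_weight(prev_weight, criticality):
    """w_k = w_{k-1} * (1 + criticality_k)."""
    return prev_weight * (1.0 + criticality)


criticality, weight = 0.0, 1.0   # assumed starting point
history = []
for _ in range(3):
    criticality = update_criticality(criticality, is_top_critical=True)
    weight = update_weight(weight, criticality)
    history.append((criticality, weight))

print(history)
print(estimate_slack(prev_slack=0.5, delay_sensitivity=0.25, delta_length=2.0))
```

A net that stays critical sees its criticality approach 1 and its weight grow by a factor approaching 2 per iteration, while a net that drops out of the critical set has its criticality halved each time.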
Integrated Technique using Linear Programs
• Unlike net-based methods, where the timing requirements are mapped to
net weights or net constraints, path-based methods directly optimize the
design’s timing.
– As the number of paths can grow quickly, this method is much slower than
net-based approaches.
• To improve scalability, timing analysis may be captured by a set of
constraints and an optimization objective.
– For example, in a linear programming framework.
• In the context of timing-driven placement, a linear program (LP) minimizes
a function of slack (e.g. TNS), subject to two main types of constraints:
1. Physical constraints, which define the locations of the cells.
2. Timing constraints, which define the slack requirements.
Physical Constraints:
• Given a set of cells V and the set of nets E, we define the notations:
– $x_v$ and $y_v$ denote the center of cell v ∈ V
– $V_e$ denotes the set of cells connected to net e ∈ E
• Then, for all v ∈ Ve:
$$left(e) \le x_v + \delta_x(v, e), \qquad right(e) \ge x_v + \delta_x(v, e)$$
$$bottom(e) \le y_v + \delta_y(v, e), \qquad top(e) \ge y_v + \delta_y(v, e)$$
Every pin of a given net e must be contained within e’s bounding box.
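In an LP these four inequalities are constraints on variables, but for fixed cell locations the tightest box that satisfies them can be computed directly. The sketch below does that for one hypothetical net; the cell centers, pin offsets, and names are assumptions for illustration, and the half-perimeter wirelength printed at the end is a common wirelength estimate rather than something defined in this lecture.

```python
def bounding_box(pins):
    """Smallest (left, right, bottom, top) box containing all pin positions."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return min(xs), max(xs), min(ys), max(ys)


# Hypothetical net e connecting three cells: centers (x_v, y_v) and
# pin offsets (delta_x(v, e), delta_y(v, e)) relative to each center.
centers = {"u": (1.0, 1.0), "v": (4.0, 3.0), "w": (2.0, 5.0)}
offsets = {"u": (0.5, 0.0), "v": (-0.5, 0.5), "w": (0.0, -0.5)}

pins = [(centers[c][0] + offsets[c][0], centers[c][1] + offsets[c][1]) for c in centers]
left, right, bottom, top = bounding_box(pins)

# Every pin satisfies the four bounding-box inequalities.
inside = all(left <= x <= right and bottom <= y <= top for x, y in pins)
half_perimeter = (right - left) + (top - bottom)

print((left, right, bottom, top), inside, half_perimeter)
```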
Timing Constraints:
• For timing constraints, let
– $t_{GATE}(v_i, v_o)$ be the gate delay from an input pin $v_i$ to the output pin $v_o$ for cell v
– $t_{NET}(e, u_o, v_i)$ be net e’s delay from cell u’s output pin $u_o$ to cell v’s input pin $v_i$
– $AAT(v_j)$ be the arrival time on pin j of cell v
• For every input pin $v_i$ of cell v, the arrival time at $v_i$ is the arrival time at
the previous output pin $u_o$ of cell u plus the net delay:
$$AAT(v_i) = AAT(u_o) + t_{NET}(u_o, v_i)$$
• For every output pin $v_o$ of cell v, the arrival time at $v_o$ should be greater
than or equal to the arrival time plus gate delay of each input $v_i$. That is,
for each input $v_i$ of cell v,
$$AAT(v_o) \ge AAT(v_i) + t_{GATE}(v_i, v_o)$$
Objective Functions:
a) Optimize the total negative slack (TNS):
$$\max: \sum_{\tau_p \in Pins(\tau),\ \tau \in \mathrm{T}} slack(\tau_p)$$
END OF LECTURE 37