500Mb/s Soft Output Viterbi Decoder
Engling Yeo
Stephanie Augsburger, Wm. Rhett Davis, Borivoje Nikolić
Department of Electrical Engineering and Computer Sciences
University of California, Berkeley, CA 94720, USA
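[Figure: System block diagram. Encoder side: u1 → (11,13) Encoder → x1 → π → u2 → EPR4 Channel Encoder → x2 → + → y. Decoder side: y → EPR4 Decoder → û2 → π⁻¹ → x̂1 → (11,13) Decoder → û1, with a π feedback path between the decoders.]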
Example high-throughput application: Magnetic Disk-Drive Read Channel
EPR4: Enhanced Partial Response Class 4
π : Interleavers
Engling Yeo University of California, Berkeley 2
Outline
Decoder Architecture
Add-Compare-Select Structures
[Figure: Decoder architecture. Viterbi decoder: Channel Input → Branch Metric Gen. → 8 × Compare-Select-Add (CSA) → 8 × CSA decisions → Survivor Memory Unit (L-step SMU) → most-likely state (3 bits). Soft output path: 8 × L-step FIFO memories (CSA decisions), Path Equivalence Detector (M-step PED), Soft Output Evaluate.]
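[Figure: Add-Compare-Select. The candidates sm0(n) + bm00(n) and sm1(n) + bm10(n) are compared by subtraction and the survivor becomes sm0(n + 1).]
[Figure: Compare-Select-Add. The selection between bm00(n) + sm0(n) and bm10(n) + sm1(n) gives sm0(n + 1), which is then added to bm00(n + 1) and bm01(n + 1) to form sm0(n + 1) + bm00(n + 1) and sm0(n + 1) + bm01(n + 1).]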
[G. Fettweis, et al., “Reduced-complexity Viterbi detector architectures for partial
response signaling,” Proc. IEEE GLOBECOM, Nov. 1995.]
[I. Lee and J. L. Sonntag, “A new architecture for the fast Viterbi algorithm,” Proc. IEEE
GLOBECOM, Nov. 2000.]
[Figure: Modified CSA. The difference ∆ between sm0(n) + bm00(n) and sm1(n) + bm10(n) is computed while bm00(n + 1) and bm01(n + 1) are added, giving sm0(n + 1) + bm00(n + 1) and sm0(n + 1) + bm01(n + 1). Timing labels: Comp/Sel, Add.]
Parallel execution of Add and Compare
34% reduction in critical-path delay
22% area penalty (overall decoder)
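The reordering from Add-Compare-Select to Compare-Select-Add can be illustrated with a minimal sketch. It assumes that smaller metrics are better and that each state has two incoming branches; the function names are hypothetical and the code models only the arithmetic, not the circuit or its timing.

```python
def acs_step(sm0, sm1, bm00, bm10):
    """Add-Compare-Select: add branch metrics, compare, keep the survivor."""
    cand0 = sm0 + bm00  # sm0(n) + bm00(n)
    cand1 = sm1 + bm10  # sm1(n) + bm10(n)
    decision = 0 if cand0 <= cand1 else 1  # assumption: minimum metric wins
    return (cand0 if decision == 0 else cand1), decision


def csa_step(cand0, cand1, bm_next):
    """Compare-Select-Add on pre-added candidates.

    The comparison and the additions depend only on the inputs, so in
    hardware they could run side by side; the decision then picks one
    set of sums.
    """
    decision = 0 if cand0 <= cand1 else 1
    sums0 = [cand0 + bm for bm in bm_next]  # adds assuming candidate 0 survives
    sums1 = [cand1 + bm for bm in bm_next]  # adds assuming candidate 1 survives
    return (sums0 if decision == 0 else sums1), decision


def acs_then_add(sm0, sm1, bm00, bm10, bm_next):
    """Reference ordering: full ACS first, next-step additions afterwards."""
    sm_next, decision = acs_step(sm0, sm1, bm00, bm10)
    return [sm_next + bm for bm in bm_next], decision
```

Both orderings give the same sums and the same decision bit; only the position of the comparison relative to the additions changes.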
[Figure: Survivor memory unit. Inputs: CSA decisions dACS0 to dACS7. Output: most-likely state.]
Traceback recovers most-likely path.
• SRAM memory: slower, low power and area.
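Traceback through stored decisions can be sketched as follows. The trellis convention is an assumption, not taken from the slides: the 8 states are the contents of a 3-bit shift register, the newest input bit sits in the least significant bit, and each stored decision bit is the bit that was shifted out of the surviving predecessor. The function name is hypothetical.

```python
def traceback(decisions, start_state):
    """Follow stored decisions backwards from start_state.

    decisions[t][s] is the decision bit stored for state s at step t,
    oldest step first. Returns (oldest state reached, decoded bits in
    time order).
    """
    state = start_state
    bits = []
    for row in reversed(decisions):
        bits.append(state & 1)  # newest input bit of this state
        # Assumed predecessor rule: shift right, decision bit becomes the MSB.
        state = (state >> 1) | (row[state] << 2)
    bits.reverse()
    return state, bits


# Usage: decisions for the noiseless input 1, 0, 1, 1, 0 starting in state 0.
# The visited states are 1, 2, 5, 3, 6; only the step into state 3 shifts out a 1.
example = [[0] * 8 for _ in range(5)]
example[3][3] = 1
final_state, decoded = traceback(example, 6)
```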
Micro Architecture: Path Equivalence Detector
[Figure: Decoder block diagram. Channel Input → Branch Metric Gen. → 8 × Compare-Select-Add → 8 × CSA decisions → Survivor Memory Unit (L-step SMU) → most-likely state, with 8 × L-step FIFOs for the path metric differences and for the CSA decisions.]
[Figure: Trellis of states against traceback depths 0 to 4.]
Path Equivalence Detection (PED)
[Figure: Trellis of states 000 to 111 against traceback depths 0 to 4.]
Path Equivalence Detector
[Figure: Path equivalence detector. Inputs: most-likely state and the 8 × L-step FIFO of CSA decisions. SEL stages pass the equivalence signals from EQ_{î,j} to EQ_{î,j+1}.]
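One way to picture path equivalence detection is to trace the survivor and its competitor back from the same state and flag, at each depth, whether their stored decisions agree. The sketch below is an assumption-laden illustration, not the circuit on the slide: it reuses a 3-bit shift-register trellis in which a decision bit becomes the most significant bit of the predecessor state, and the function name is hypothetical.

```python
def path_equivalence(decisions, state, depth_m):
    """Return eq[j] for j = 1..depth_m, True where both paths agree.

    decisions[t][s] is the decision bit of state s at step t, oldest
    first. At the newest step the survivor follows the stored decision
    and the competitor follows its complement.
    """
    rows = decisions[::-1][:depth_m + 1]  # newest step first
    d = rows[0][state]
    surv = (state >> 1) | (d << 2)        # predecessor on the survivor path
    comp = (state >> 1) | ((1 - d) << 2)  # predecessor on the competing path
    eq = []
    for row in rows[1:]:
        eq.append(row[surv] == row[comp])  # same decision at this depth?
        surv = (surv >> 1) | (row[surv] << 2)
        comp = (comp >> 1) | (row[comp] << 2)
    return eq
```

Once the two paths merge into the same state, every later flag is True, so only the depths before the merge can differ.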
[Figure: Design flow; recoverable labels: Cross Verification, Netlist, Layout.]
[W. R. Davis, et al., "An Automated Design Flow for High-Throughput Low-Power Dedicated
Signal Processing Systems," IEEE JSSC, Mar. 2002.]
Technology and Physical Parameters
Technology:
General-purpose 0.18µm CMOS with 6 metal layers
Dual threshold available (used only high-speed transistors)
1.8V power supply
Implemented decoders:
Soft Output Viterbi Algorithm
Convolutional codes
– 8-state Octal(11,13) code
– 8-state enhanced partial response class-4 (EPR4)
Speed: 500Mb/s
Power: 400mW
Core Area: 1mm x 0.5mm
Transistor count: 170,000
Measured Performance
[Figure: Measured performance plot; axes: Freq (×100 MHz) and Power (×100 mW).]
Acknowledgments:
— T. Smilkstein, P. Pakzad and V. Anantharam of UC Berkeley for technical assistance
— ST Microelectronics for fabrication of the test chip.
— Texas Instruments for support under the UC MICRO program.
END
[Figure: Unrolled iterative decoder; recoverable labels: Channel Observations, SISO decoders D1 and D2, Extrinsic, Intrinsic, π, π⁻¹.]
• Unrolled and pipelined decoder to achieve desired throughput rates (> 1Gbps)
• Complexity (linear increase)
• Latency (multi-sector)
$P_{err} = \dfrac{1}{1 + \exp(\Delta)}$, where $\Delta = M_\beta - M_\alpha$
Assumption: Values of path metrics, $M_\alpha$ and $M_\beta$, dominate over those of other paths
log-likelihood of error $= \log \dfrac{1 - P_{err}}{P_{err}} = \Delta = M_\beta - M_\alpha$
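The relation between Perr and ∆ can be checked numerically. The sketch below uses hypothetical function names and floating-point metrics; it only restates the two formulas and is not a model of the soft output hardware.

```python
import math


def error_probability(m_alpha, m_beta):
    """Perr = 1 / (1 + exp(delta)), with delta = M_beta - M_alpha."""
    delta = m_beta - m_alpha
    return 1.0 / (1.0 + math.exp(delta))


def log_likelihood(p_err):
    """log((1 - Perr) / Perr), which should recover delta."""
    return math.log((1.0 - p_err) / p_err)


# Usage: a metric difference of 2.0 maps to Perr = 1 / (1 + e^2), and back to 2.0.
p = error_probability(1.0, 3.0)
recovered_delta = log_likelihood(p)
```

Equal metrics give Perr = 0.5 and a log-likelihood of 0, so a larger metric difference means a more reliable decision.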