VLSI Decoder Architecture For High Throughput, Variable Block-Size and Multi-Rate LDPC Codes
May 2007
Abstract: A low-density parity-check (LDPC) decoder architecture that supports variable block sizes and multiple code rates is presented. The proposed architecture is based on the structured quasi-cyclic (QC-LDPC) codes whose performance compares favorably with that of randomly constructed LDPC codes for short to moderate block sizes. The main contribution of this work is to address the variable block-size and multi-rate decoder hardware complexity that stems from the irregular LDPC codes. The overall decoder, which was synthesized, placed and routed on TSMC 0.13-micron CMOS technology with a core area of 4.5 square millimeters, supports variable code lengths from 360 to 4200 bits and multiple code rates between 1/4 and 9/10. The average throughput can achieve 1 Gbps at 2.2 dB SNR.

I. INTRODUCTION

Low-density parity-check (LDPC) codes have received tremendous attention in the coding community because of [...] iterative decoding and very long block sizes (on the order of $10^6$ to $10^7$). However, for many practical applications (e.g. packet-based communication systems), shorter and variable block-size LDPC codes with good Frame Error Rate (FER) performance are desired. Communications in packet-based wireless networks usually involve a large per-frame overhead including both the physical (PHY) layer and MAC layer headers. As a result, the design for a reliable wireless link often faces a trade-off between channel utilization (frame size) [...]

[...] code rates is designed by storing 12 different parity check matrices on-chip. As we can see, the main design challenge for supporting variable block sizes and multiple code rates stems from the random or unstructured nature of the LDPC codes. Generally, support for different block sizes of LDPC codes would require different hardware architectures. To address this problem, we propose a generalized decoder architecture based on the quasi-cyclic LDPC (QC-LDPC) codes that can support a wider range of block sizes and code rates at a low hardware requirement.

II. STRUCTURED QC-LDPC CODES

[Figure: a base matrix with rows c0, c1 and columns v0, v1, ... is expanded by P into Layer 0 and Layer 1; panel (b): BP x DP generated PCM]

[Figure: check node clusters c0 to c3 and variable node clusters v0 to v7 exchange check node messages and variable node messages through a Permutation (Shift) Network]
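As a rough illustration of the quasi-cyclic structure, the sketch below expands a small B x D grid of shift values into a BP x DP parity check matrix (PCM) in which every non-zero block is a cyclically shifted P x P identity. The function name `expand_base_matrix`, the convention that a negative entry marks an all-zero block, the 2 x 8 pattern, and the shift values are all assumptions chosen for this example; they are not the decoder's actual matrices.

```python
def expand_base_matrix(shifts, P):
    """Expand a B x D grid of shift values into a (B*P) x (D*P) 0/1 matrix.

    Convention assumed for this sketch: a negative entry is an all-zero
    P x P block; an entry s >= 0 is the P x P identity with every row's
    single 1 moved s columns to the right, wrapping around (a circulant).
    """
    B, D = len(shifts), len(shifts[0])
    H = [[0] * (D * P) for _ in range(B * P)]
    for i in range(B):
        for j in range(D):
            s = shifts[i][j]
            if s < 0:
                continue  # zero block: nothing to set
            for r in range(P):
                H[i * P + r][j * P + (r + s) % P] = 1
    return H


# Illustrative inputs only: this 2 x 8 pattern of non-zero blocks and the
# shift values derived from it are hypothetical, chosen for the sketch.
PATTERN = ["10101100", "01010110"]
P = 4
SHIFTS = [
    [(3 * i + j) % P if bit == "1" else -1 for j, bit in enumerate(row)]
    for i, row in enumerate(PATTERN)
]
H = expand_base_matrix(SHIFTS, P)

# Print the expanded matrix, one row of bits per line.
for row in H:
    print("".join(str(bit) for bit in row))
```

Because each non-zero block is a permutation matrix, every row of the expanded matrix has as many ones as its base row has non-zero blocks. Changing P alone changes the block length without touching the base pattern.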
[...] and stored in one address of the APP and Check memories.

[Figure: pipeline timing of the Read/Min-sum and Write back stages across layers (Layer i+1 shown)]

$$\text{Pipelined Throughput} \approx \frac{D \times P_x \times R \times f_{clkmax}}{E \times \text{iterations}}$$
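The throughput approximation above can be evaluated numerically with a small helper. This is a minimal sketch: the surviving text does not define D, P_x, R or E, so the function treats them as plain numbers, and the function name and the sample operating point are hypothetical. Only the 350 MHz clock is taken from Table II.

```python
def pipelined_throughput_bps(D, P_x, R, f_clk_max_hz, E, iterations):
    """Evaluate D * P_x * R * f_clkmax / (E * iterations).

    The result is in bits per second when f_clk_max_hz is given in Hz.
    """
    if E <= 0 or iterations <= 0:
        raise ValueError("E and iterations must be positive")
    return D * P_x * R * f_clk_max_hz / (E * iterations)


# Hypothetical operating point: only the 350 MHz clock comes from Table II;
# D, P_x, R, E and the iteration count are placeholders for illustration.
estimate = pipelined_throughput_bps(
    D=24, P_x=96, R=0.5, f_clk_max_hz=350e6, E=88, iterations=10
)
print(f"{estimate / 1e6:.1f} Mbps")
```

The formula makes the main trade-off visible: throughput scales linearly with the clock frequency and inversely with the iteration count, so halving the iterations doubles the estimate.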
[Figure: FER plotted against Eb/No in dB]
Fig. 7. FER performance comparison with IEEE 802.11n codes

TABLE II
COMPARISON OF PROPOSED DECODER WITH EXISTING LDPC DECODERS

             Proposed Decoder    Blanksby [1]     Mansour [2]
Throughput   1.0 Gbps @ 2.2 dB   1.0 Gbps         640 Mbps @ [...]
Area         4.5 mm²             52.5 mm²         14.3 mm²
Frequency    350 MHz             64 MHz           125 MHz
Power        740 mW              690 mW           787 mW
Block size   360 to 4200 bit     1024 bit fixed   2048 bit fixed
Code rate    1/4 : 9/10          1/2 fixed        1/16 : 14/16
Technology   0.13 µm, 1.2 V      0.16 µm, 1.5 V   0.18 µm, 1.8 V

IV. PHYSICAL VLSI DESIGN

A flexible LDPC decoder which supports variable block sizes from 360 to 4200 bits in fine steps, where the step size can be 24 (at rates 1/4, 1/3, 1/2, 2/3, 3/4, 5/6 and 7/8), or 25 (at rates 2/5, 3/5 and 4/5), or 27 (at rate 8/9), or 30 (at rate 9/10), was described in Verilog HDL. Layout was generated for a TSMC 0.13 µm CMOS technology as shown in Fig. 6.
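The block-size grid described in Section IV (360 to 4200 bits, with a rate-dependent step of 24, 25, 27 or 30) can be enumerated with a few lines of code. The sketch below assumes that every supported length is an integer multiple of its step inside that range; the text does not state this explicitly, and the names `STEP_BY_RATE` and `supported_block_sizes` are introduced here for illustration only.

```python
from fractions import Fraction

# Step size per code rate, as listed in Section IV.
STEP_BY_RATE = {
    **{Fraction(n, d): 24
       for n, d in [(1, 4), (1, 3), (1, 2), (2, 3), (3, 4), (5, 6), (7, 8)]},
    **{Fraction(n, d): 25 for n, d in [(2, 5), (3, 5), (4, 5)]},
    Fraction(8, 9): 27,
    Fraction(9, 10): 30,
}

MIN_BITS, MAX_BITS = 360, 4200


def supported_block_sizes(rate):
    """List block lengths for `rate`, assuming each is a multiple of its step."""
    step = STEP_BY_RATE[rate]
    first = -(-MIN_BITS // step) * step  # smallest multiple of step >= MIN_BITS
    return list(range(first, MAX_BITS + 1, step))


# Summarize the grid for every rate, lowest rate first.
for rate in sorted(STEP_BY_RATE):
    sizes = supported_block_sizes(rate)
    print(f"rate {rate}: {len(sizes)} lengths, {sizes[0]} to {sizes[-1]} bits")
```

Under this assumption the number of information bits (block length times rate) is an integer for every listed length, since each step is divisible by the denominator of its rates.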
[Figure: layout blocks labeled Check Memory, APP Memory, Permuter, PEs, CTRL, PCM Memory and Glue Logic]
Fig. 6. Flexible LDPC decoder VLSI layout (0.13 µm)

V. PERFORMANCE ANALYSIS AND COMPARISON

Fig. 7 shows the FER performance and compares the two cases that also exist in the IEEE 802.11n (WWiSE Proposal) codes. Table II compares this decoder with the state-of-the-art LDPC decoders of [1] and [2]. As we can see, the proposed decoder shows significant performance in throughput, flexibility, area and power.

VI. CONCLUSION

A VLSI decoder architecture that supports variable block-size and multi-rate LDPC codes has been presented. By utilizing structured QC-LDPC codes, we proposed a pipelined partially parallel decoding algorithm which is well suited for VLSI implementation. The decoder has been placed and routed using TSMC 0.13 µm, 1.2 V, eight-metal-layer CMOS technology. The decoder can support high throughput decoding, for example, 1 Gbps at 2.2 dB SNR, while using less area.

VII. ACKNOWLEDGEMENT

This work was supported in part by Nokia and by NSF under grants CCF-0541363, CNS-0551692, and CNS-0619767.

REFERENCES

[1] A. J. Blanksby and C. J. Howland, “A 690-mW 1-Gb/s 1024-b, rate-1/2 low-density parity-check code decoder,” IEEE Journal of Solid-State Circuits, vol. 37, no. 3, pp. 404–412, 2002.
[2] M. M. Mansour and N. R. Shanbhag, “A 640-Mb/s 2048-Bit Programmable LDPC Decoder Chip,” IEEE Journal of Solid-State Circuits, vol. 41, pp. 684–698, March 2006.
[3] M. Karkooti, P. Radosavljevic, and J. R. Cavallaro, “Configurable, High Throughput, Irregular LDPC Decoder Architecture: Tradeoff Analysis and Implementation,” IEEE 17th International Conference on Application-specific Systems, Architectures and Processors, pp. 360–367, Sep. 2006.
[4] R. M. Tanner, D. Sridhara, A. Sridharan, T. E. Fuja, and D. J. Costello Jr., “LDPC block and convolutional codes based on circulant matrices,” IEEE Transactions on Information Theory, vol. 50, no. 12, pp. 2966–2984, 2004.
[5] M. M. Mansour and N. R. Shanbhag, “High-throughput LDPC decoders,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 11, pp. 976–996, Dec. 2003.
[6] R. Gallager, “Low-density parity-check codes,” IEEE Transactions on Information Theory, vol. 8, pp. 21–28, Jan. 1962.
[7] J. Chen, A. Dholakia, E. Eleftheriou, M. Fossorier and X. Hu, “Reduced-Complexity Decoding of LDPC Codes,” IEEE Transactions on Communications, vol. 53, pp. 1232–1232, 2005.