Multi-Rate_QC-LDPC_Encoder
Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY GUWAHATI. Downloaded on November 16,2023 at 19:04:52 UTC from IEEE Xplore. Restrictions apply.
978-1-4244-2587-7/09/$25.00 ©2009 IEEE
$$H = \begin{bmatrix} A & B & T \\ C & D & E \end{bmatrix} \tag{1}$$

where $A$ is $(m-g) \times (n-m)$, $B$ is $(m-g) \times g$, $C$ is $g \times (n-m)$, $D$ is $g \times g$, $E$ is $g \times (m-g)$, and finally $T$ is $(m-g) \times (m-g)$.

Once $p_1$ and $p_2$ have been obtained, we can get $x$, and the encoding is finished. In DMB-T the parity-check matrix of the LDPC code is very large, so the RU encoding algorithm is clearly not suitable.

B. The Encoding Algorithm for DMB-T

Assume the code length is $n$ and the information bit length is $k$. The encoding of a linear block code can then be expressed as the multiplication

$$C_{1 \times n} = m_{1 \times k} \, G_{k \times n} \tag{6}$$

where $C_{1 \times n}$ is the LDPC codeword, $m_{1 \times k}$ is the information bit vector, and $G_{k \times n}$ is the generator matrix. In DMB-T, the LDPC generator matrix has the general form

$$G_{qc} = \begin{bmatrix}
G_{0,0} & G_{0,1} & \cdots & G_{0,C-1} & I & O & \cdots & O \\
G_{1,0} & G_{1,1} & \cdots & G_{1,C-1} & O & I & \cdots & O \\
\vdots & \vdots & & \vdots & \vdots & \vdots & \ddots & \vdots \\
G_{K-1,0} & G_{K-1,1} & \cdots & G_{K-1,C-1} & O & O & \cdots & I
\end{bmatrix}$$

where $I$ is a $b \times b$ identity matrix, $O$ is a $b \times b$ zero matrix, and $G_{i,j}$ is a $b \times b$ cyclic matrix, with $0 \le i \le K-1$, $0 \le j \le C-1$ and $b = 127$. $C$ and $K$ are determined by the corresponding code rate of DMB-T. Because $G_{i,j}$ is cyclic, a hardware implementation of the LDPC code needs to store only its first row; the remaining rows are obtained by rotating the first row to the right. The memory cost is therefore reduced to 1/127 of the original cost.

If the first row of $G_{i,j}$ is defined as $g_{i,j}(0) = [g_0, g_1, \ldots, g_{126}]^T$, then $G_{i,j}$ is the cyclic matrix whose row $l$, written $g_{i,j}^T(l)$, is the first row rotated right by $l$ positions.

From (9), the multiplication of a long vector by a large matrix can be broken down into $K \times C$ multiplications of a 127-bit vector by a $127 \times 127$ matrix, plus $(K-1) \times C$ vector additions:

$$M_i G_{i,j} =
\begin{bmatrix} m_{i \times 127} \\ m_{i \times 127+1} \\ \vdots \\ m_{i \times 127+125} \\ m_{i \times 127+126} \end{bmatrix}^{T}
\begin{bmatrix}
g_0 & g_1 & \cdots & g_{125} & g_{126} \\
g_{126} & g_0 & \cdots & g_{124} & g_{125} \\
\vdots & \vdots & & \vdots & \vdots \\
g_2 & g_3 & \cdots & g_0 & g_1 \\
g_1 & g_2 & \cdots & g_{126} & g_0
\end{bmatrix}
= \left[ m_{i \times 127}\, g_{i,j}^T(0) + m_{i \times 127+1}\, g_{i,j}^T(1) + \cdots + m_{i \times 127+126}\, g_{i,j}^T(126) \right] \tag{10}$$

Now the block-matrix operations are broken down into single bits multiplied by the corresponding vectors, plus additions:

$$r_j = \sum_{i=0}^{K-1} \left[ m_{i \times 127}\, g_{i,j}^T(0) + m_{i \times 127+1}\, g_{i,j}^T(1) + \cdots + m_{i \times 127+126}\, g_{i,j}^T(126) \right] \tag{11}$$

In (11), $r_j$ is a 127-bit vector, $i$ runs from 0 to $K-1$ and $j$ runs from 0 to $C-1$, so all the check bits can be calculated. The check bits of the 0.8-rate LDPC code (7493, 6096) in DMB-T are computed using the following equation:

$$r_j = \sum_{i=0}^{47} \left[ m_{i \times 127}\, g_{i,j}^T(0) + m_{i \times 127+1}\, g_{i,j}^T(1) + \cdots + m_{i \times 127+126}\, g_{i,j}^T(126) \right] \tag{12}$$
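The decomposition in (10) and (11) can be checked with a short software sketch. The Python below is only an illustration under stated assumptions: the function names (`rotate_right`, `circulant`, `block_product`, `parity_block`) are hypothetical, and the first rows are random placeholders, not the actual DMB-T generator data. It computes one parity block $r_j$ from stored first rows only and compares the result with a plain dense product over GF(2).

```python
import random

B = 127  # circulant size b used in DMB-T


def rotate_right(row, shift):
    """Cyclically rotate a list of bits to the right by `shift` positions."""
    shift %= len(row)
    return row[-shift:] + row[:-shift]


def circulant(first_row):
    """Full b x b cyclic matrix: row l is the first row rotated right l times."""
    return [rotate_right(first_row, l) for l in range(len(first_row))]


def block_product(m_block, first_row):
    """Eq. (10): M_i * G_ij over GF(2), using only the stored first row."""
    acc = [0] * len(first_row)
    for l, bit in enumerate(m_block):
        if bit:  # one-bit multiplication (AND)
            rotated = rotate_right(first_row, l)
            acc = [a ^ g for a, g in zip(acc, rotated)]  # addition (XOR)
    return acc


def parity_block(message, first_rows_col_j, b=B):
    """Eq. (11): r_j is the XOR of M_i * G_ij over all row blocks i."""
    acc = [0] * b
    for i, first_row in enumerate(first_rows_col_j):
        part = block_product(message[i * b:(i + 1) * b], first_row)
        acc = [a ^ p for a, p in zip(acc, part)]
    return acc


def dense_product(m_block, matrix):
    """Reference: ordinary vector-matrix product over GF(2)."""
    rows, cols = len(matrix), len(matrix[0])
    return [sum(m_block[r] & matrix[r][c] for r in range(rows)) % 2
            for c in range(cols)]


# Demo for the 0.8-rate case (K = 48 row blocks), one column block j.
# The first rows are random placeholders (assumption), not DMB-T data.
rng = random.Random(0)
K = 48
first_rows = [[rng.randint(0, 1) for _ in range(B)] for _ in range(K)]
message = [rng.randint(0, 1) for _ in range(K * B)]

r_j = parity_block(message, first_rows)

expected = [0] * B
for i in range(K):
    part = dense_product(message[i * B:(i + 1) * B], circulant(first_rows[i]))
    expected = [a ^ p for a, p in zip(expected, part)]

assert r_j == expected
```

The rotated-row version touches only $K$ stored vectors of 127 bits per column block, whereas the dense reference needs every $127 \times 127$ matrix in full, which is the 1/127 memory saving described above.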
In (12), $i$ runs from 0 to 47 and $j$ runs from 0 to 10. The following discusses the FPGA hardware implementation of the encoder based on this algorithm.

III. LDPC ENCODER BASED ON FPGA

A. The Computing Module Design

The block-matrix operations can be broken down into multiplications and additions. On an FPGA, the multiplications are implemented with AND gates and the additions with XOR gates. Take the computing module for the 0.8-rate generator matrix as an example. The generator matrix of the QC-LDPC (7493, 6096) code has 48 row blocks and 11 column blocks. The first-row elements of the row-block matrices can be stored in a 127-bit-wide, 48-deep memory. There are 1397 check bits (127 × 11). To increase the encoding speed, 11 SRAA (shift-register-adder-accumulator) circuits (Fig. 3-1) are used in parallel to compute all the check bits (9).

If a single ROM stored the first rows of all the block matrices, then with 11 SRAAs computing in parallel, each load from the ROM into the eleven 127-bit shift registers would cost at least 11 clock cycles, and processing the 48 row blocks would take at least 11 × 48 clock cycles just to read the whole generator matrix. Instead, we can configure a separate ROM for each SRAA to store the first rows of its block matrices, so that all the A registers read from their ROMs in parallel, which increases the encoding speed.

Table 1 Parameters of the generator matrices

Rate   K    C    b
0.4    24   35   127
0.6    36   23   127
0.8    48   11   127

Like the 0.8-rate computing module above, the 0.6-rate LDPC (7493, 4572) encoder uses 23 SRAA circuits in parallel to compute its 2921 check bits; after 36 × 127 clock cycles, all the check bits have been computed in the SRAA circuits and are stored in the accumulator register B. The 0.4-rate LDPC (7493, 3048) encoder uses 35 SRAA circuits in parallel to compute its 4445 check bits; after 24 × 127 clock cycles, all the check bits have been computed and are stored in the accumulator register B. Simply combining the three single-rate encoders would take 11 + 23 + 35 = 69 SRAA circuits. Because only one rate is selected at a time, the SRAA circuits can be reused to avoid wasting resources, so 35 SRAA circuits are enough for three-rate encoding. The overall structure of the encoder is shown in Fig. 3-2.
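The behaviour of the parallel SRAA units can be illustrated with a small behavioural model. This is a sketch under assumptions, not the paper's Verilog design: the class `SRAA`, the functions `encode_parity` and `reference_parity`, and the exact load, AND/XOR, rotate ordering inside one clock are hypothetical, and the ROM contents are random placeholders. Each unit has its own ROM of first rows, a shift register A and an accumulator register B, and consumes one information bit per clock.

```python
import random

B = 127                                                        # circulant size b
RATES = {"0.4": (24, 35), "0.6": (36, 23), "0.8": (48, 11)}    # rate -> (K, C), Table 1


class SRAA:
    """Behavioural model of one shift-register-adder-accumulator unit."""

    def __init__(self, first_rows, b):
        self.first_rows = first_rows   # private ROM: one first row per row block
        self.b = b
        self.reg_a = [0] * b           # shift register A
        self.reg_b = [0] * b           # accumulator register B

    def clock(self, cycle, bit):
        if cycle % self.b == 0:        # new row block: load its first row from ROM
            self.reg_a = list(self.first_rows[cycle // self.b])
        if bit:                        # AND with the message bit, XOR into B
            self.reg_b = [acc ^ g for acc, g in zip(self.reg_b, self.reg_a)]
        self.reg_a = self.reg_a[-1:] + self.reg_a[:-1]   # rotate A right by one


def encode_parity(message, roms, b):
    """Run one SRAA per column block in parallel, one message bit per clock."""
    units = [SRAA(rom, b) for rom in roms]
    for cycle, bit in enumerate(message):
        for unit in units:
            unit.clock(cycle, bit)
    return [unit.reg_b for unit in units], len(message)


def reference_parity(message, roms, b):
    """Direct evaluation of eq. (11), used only to cross-check the model."""
    result = []
    for rom in roms:
        acc = [0] * b
        for i, first_row in enumerate(rom):
            for l in range(b):
                if message[i * b + l]:
                    rotated = first_row[-l:] + first_row[:-l]
                    acc = [a ^ g for a, g in zip(acc, rotated)]
        result.append(acc)
    return result


# Scaled-down demo (b = 7, K = 3, C = 2) with random placeholder ROM contents.
rng = random.Random(0)
b, K, C = 7, 3, 2
roms = [[[rng.randint(0, 1) for _ in range(b)] for _ in range(K)] for _ in range(C)]
message = [rng.randint(0, 1) for _ in range(K * b)]
parity, cycles = encode_parity(message, roms, b)
assert parity == reference_parity(message, roms, b)
assert cycles == K * b

# Unit and cycle counts for the three DMB-T rates listed in Table 1.
for rate, (k_r, c_r) in RATES.items():
    print(rate, "SRAA units:", c_r, "clock cycles:", k_r * B, "check bits:", c_r * B)
print("units when shared across rates:", max(c for _, c in RATES.values()))
print("units when duplicated per rate:", sum(c for _, c in RATES.values()))
```

Because every unit reads from its own ROM, loading a new first row costs one clock regardless of how many units run in parallel, so the cycle count depends only on $K \times b$.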
by $p_1$ and $p_2$. The parity-check matrix has to be transformed into $A$, $B$, $C$, $D$, $E$ and $T$, each stored separately, which cannot exploit the quasi-cyclic nature of $H$ to reduce the memory. A comparison of complexity and resources between the RU algorithm and the SRAA encoding algorithm is shown in Table 2.

Table 2 Comparison of RU and SRAA

Algorithm   Complexity     Memory (bit)
RU          $O(n + g^2)$   $(k + c) \times c \times b^2$
SRAA        $O(n)$         $k \times c \times b$

Table 2 shows that the SRAA algorithm has lower complexity and needs less memory than the RU algorithm, so it is more suitable for QC-LDPC in DMB-T. Table 3 compares the resources of a single-rate QC-LDPC encoder and the multi-rate encoder, both based on the SRAA algorithm.

Table 3 Comparison of single-rate and multi-rate

Rate     Encoding speed   Flip-flops     XOR/AND gates   Memory
Single   $k_i b$          $2 c_i b$      $c_i b$         $k_i c_i b$
Multi    $k_{max} b$      $2 c_{max} b$  $c_{max} b$     $b \sum_{i=1}^{3} (k_i c_i)$

In DMB-T, $k_1 = 24$, $k_2 = 36$, $k_3 = 48$; $c_1 = 35$, $c_2 = 23$, $c_3 = 11$; $b = 127$; $i$ runs from 1 to 3; $k_{max}$ is the maximum of $k_i$ and $c_{max}$ is the maximum of $c_i$. The three rates are 0.4, 0.6 and 0.8. The specific resource consumption is compared in Fig. 3-3 and Fig. 3-4.

Fig.3-3 Flip-flops cost of different-rate encoders (vertical axis: flip-flops; horizontal axis: rate 0.4, 0.6, 0.8, multi)

Fig.3-4 Memory cost of different-rate encoders

The computing module of the multi-rate QC-LDPC encoder and that of the 0.4-rate encoder consume almost the same number of flip-flops, two-input XOR gates and AND gates; the control module of the multi-rate encoder costs a few more gates. The memory cost of the multi-rate encoder is the sum of the memory costs of the three single-rate encoders.

D. Simulations and Verification

The multi-rate QC-LDPC encoder based on the SRAA algorithm was described in Verilog, implemented on the Xilinx ISE 9.1 platform, and targeted at a Xilinx Virtex-II Pro XC2VP100 FPGA. The XST synthesis report showed that it costs 9707 slices, 9841 flip-flops and 17573 4-input LUTs. To give a clear picture, a simulation of the SRAA structure in the Xilinx ISE simulator is shown in Fig. 3-5.

Fig.3-5 The simulation of SRAA encoding structure

In Fig. 3-5, data_in carries the information bits; the rate_contr signal is 2'b11, which selects rate 0.8; once computed, the check bits are sent to sraa_data_out in parallel. This is the key of the multi-rate encoder.

To verify the encoder, we used a verification program on the MATLAB platform. A set of random binary sequences is generated and fed to both the encoder and the MATLAB simulation. Comparing the two encoding results tests and verifies the correctness of the LDPC encoder proposed in this paper.

IV. CONCLUSIONS

This paper proposes a multi-rate LDPC encoder that uses SRAA circuits and takes up less memory, based on the quasi-cyclic structure of the DMB-T generator matrix. Using separate ROMs to store the block matrices is effective in reducing delays in the system. Compared with a traditional LDPC encoder, this encoder supports three rates, occupies fewer resources, and has more practical value.

REFERENCES

[1] R. G. Gallager, Low-Density Parity-Check Codes. Cambridge: MIT Press, 1963.
[2] D. J. C. MacKay, "Good error-correcting codes based on very sparse matrices," IEEE Trans. Inf. Theory, vol. 45, no. 2, pp. 399–431, Mar. 1999.
[3] Sae-Young Chung, "The Design of Low-Density Parity-Check Codes within 0.0045 dB of the Shannon Limit," IEEE Comm. Letters, vol. 5, no. 2, pp. 58–60.
[4] J. L. Fan, "Array codes as low-density parity-check codes," in Proc. 2nd Int. Symp. Turbo Codes, Brest, France, Sep. 2000, pp. 543–546.
[5] Seho Myung, Kyeongcheol Yang, "Quasi-Cyclic LDPC Codes for Fast Encoding," IEEE Trans. Inf. Theory, vol. 51, no. 8, pp. 2894–2896, Aug. 2005.
[6] WEN Hong, "The principle and application of LDPC codes," UESTC Publishing House.
[7] JIANG Huiyuan, TIAN Bin and YI Kechu, "Design of Quasi-regular LDPC Codes Encoder Base on Q-matrix," Video Engineering, vol. 31, no. 11, 2007.
[8] GB 20600-2006, Framing structure, channel coding and modulation for digital television terrestrial broadcasting system.
[9] Zongwang Li, Lei Chen and Lingqi Zeng, in IEEE GLOBECOM 2005 proceedings, pp. 1205–1208.