
A Tutorial on Low Density Parity-Check Codes

Tuan Ta
The University of Texas at Austin

Abstract − Low density parity-check (LDPC) codes are among the most actively studied topics in coding theory today. Equipped with very fast (probabilistic) encoding and decoding algorithms, LDPC codes are attractive both theoretically and practically. In this paper, I present a thorough view of LDPC. I discuss in detail LDPC's popular decoding algorithm: belief propagation. I also present some notable encoding algorithms, including Richardson and Urbanke's lower triangular modification, which gives rise to a linear encoding time. I show that irregular LDPC codes perform better than regular LDPC codes and are therefore more desirable.

Index Terms—belief propagation, density evolution, Gallager codes, irregular codes, low density parity-
check codes

I. INTRODUCTION

Low density parity-check code (LDPC) is an error correcting code used in noisy communication channels to reduce the probability of loss of information. With LDPC, this probability can be reduced to as small as desired, thus the data transmission rate can be as close to Shannon's limit as desired.

LDPC was developed by Robert Gallager in his doctoral dissertation at MIT in 1960 [1]. It was published three years later by MIT Press. Due to the limited computational resources available for implementing the encoder and decoder for such codes, and the introduction of Reed-Solomon codes, LDPC was ignored for almost 30 years. During that long period, the only notable work done on the subject was by R. Michael Tanner in 1981 [2], where he generalized LDPC codes and introduced a graphical representation of the codes, later called the Tanner graph. Since 1993, with the invention of turbo codes, researchers have switched their focus to finding low complexity codes which can approach Shannon channel capacity. LDPC was reinvented with the work of Mackay [3], [4] and Luby [5]. Nowadays, LDPC has made its way into some modern applications such as 10GBase-T Ethernet, WiFi, WiMAX, and Digital Video Broadcasting (DVB).

Before discussing LDPC, I present a brief review of error correcting codes in section II. I pay the most attention to linear block codes. The (7, 4) Hamming code is introduced as an example to illustrate the behavior of a linear block code. I also discuss the use of the Tanner graph as an equivalent to a parity-check matrix for linear block codes. Section II is concluded by the definition of LDPC. In section III, I discuss how to decode an LDPC using the belief propagation algorithm. Both the hard-decision decoder and the soft-decision decoder are presented. An example is given to illustrate the use of the algorithm. Section IV talks about how to encode an LDPC in linear time. Two approaches are presented: the accumulate approach, and Richardson and Urbanke's lower triangular modification approach. Section V discusses irregular LDPC and shows that irregular codes perform better than regular codes. Three designs are considered: Luby's, Richardson's and Chung's. Section V shows that it is possible for irregular LDPC to come as close as 0.0045 dB to Shannon capacity.

II. ERROR CORRECTING CODES

A. Introduction

In communication, errors can occur for many reasons: noisy channels, scratches on CDs or DVDs, power surges in electronic circuits, etc. It is often desirable to detect and correct those errors. If no additional information is added to the original message, errors can turn a legal message (bit pattern) into another legal bit pattern. Therefore, redundancy is used in error-correcting schemes. By
adding redundancy, many bit patterns become illegal. A good coding scheme will make an illegal pattern caused by errors closer to one of the legal patterns than to the others.

A metric used to measure the "closeness" between two bit patterns is the Hamming distance. The Hamming distance between two bit patterns is the number of bits that differ. For example, bit patterns 1100 and 0100 differ by one bit (the 1st bit), thus have a Hamming distance of one. Two identical bit patterns have a Hamming distance of zero.

A parity bit is an additional bit added to the bit pattern to make sure that the total number of 1's is even (even parity) or odd (odd parity). For example, suppose the information message is 01001100 and even parity is used. Since the number of 1's in the original message is 3, a '1' is added at the end to give the transmitted message 010011001. The decoder counts the number of 1's (normally done by exclusive-ORing the bit stream) to determine if an error has occurred. A single parity bit can detect (but not correct) any odd number of errors. In the previous example, the code rate (number of useful information bits / total number of bits) is 8/9. This is an efficient code rate, but the effectiveness is limited. A single parity bit is often used in scenarios where the likelihood of errors is small and the receiver is able to request retransmission. Sometimes it is used even when retransmission is not possible. Early IBM Personal Computers employed single-bit parity and simply crashed when an error occurred [6].

B. Linear block code

If a code uses n bits to provide error protection to k bits of information, it is called an (n, k) block code. Often, the minimum Hamming distance d between any two valid codewords is also included, giving an (n, k, d) block code. An example of block codes is the Hamming code. Consider the scenario where we wish to correct single errors using the fewest number of parity bits (highest code rate). Each parity bit gives the decoder a parity equation to validate the received code. With 3 parity bits, we have 3 parity equations, which can identify up to 2^3 = 8 error conditions. One condition identifies "no error", so seven are left to identify up to seven positions of a single error. Therefore, we can detect and correct any single error in a 7-bit word. With 3 parity bits, we have 4 bits left for information. Thus this is a (7, 4) block code. In a (7, 4) Hamming code, the parity equations are determined as follows:

• The first parity equation checks bits 4, 5, 6, 7
• The second parity equation checks bits 2, 3, 6, 7
• The third parity equation checks bits 1, 3, 5, 7

This rule is easy to remember. The parity equations use the binary representation of the location of the error bit. For example, location 5 has the binary representation 101, thus it appears in equations 1 and 3. By applying this rule, we can tell which bit is wrong by reading the value of the binary combination of the results of the parity equations, with 1 being incorrect and 0 being correct. For example, if equations 1 and 2 are incorrect and equation 3 is correct, we can tell that the bit at location 6 (110) is wrong.

At the encoder, if locations 3, 5, 6, 7 contain the original information and locations 1, 2, 4 contain the parity bits (locations which are powers of 2), then using the first parity equation and the bits at locations 5, 6, 7 we can calculate the value of the parity bit at location 4, and so on.

The (7, 4) Hamming code can be summarized in the following table [7].

TABLE 1
(7, 4) HAMMING CODE FULL TABLE

Bit number                      1    2    3    4    5    6    7
                                p1   p2   d1   p3   d2   d3   d4
Equation corresponding to p1    1    0    1    0    1    0    1
Equation corresponding to p2    0    1    1    0    0    1    1
Equation corresponding to p3    0    0    0    1    1    1    1
1 1 0 1
In table 1, p denotes parity bits, d denotes data 1 0 1 1 

bits. 1 0 0 0
Removing the parity bits, we have  
G = 0 1 1 1 (4)
0 1 0 0
TABLE 2  
(7, 4) HAMMING CODE ABBREVIATED TABLE 0 0 1 0
0 0 0 1 

d1 d2 d3 d4
H is driven straight from Table 1. G is obtained
p1 1 1 0 1
by
• For parity bit locations: use associated data
p2 1 0 1 1 bits (from Table 2)
• For data bit locations: put a 1 for the position
p3 0 1 1 1
of the data bit, the rest are 0’s
For the generator matrix of (7, 4) Hamming code
For larger dimension block codes (larger n and
above, bit location 1 (1st row) is a parity bit, thus we
k), a matrix representation of the codes is used. This
use row 1 from table 2 (1101). Bit location 5 (5th
representation includes a generator matrix, G, and a
row) is a data bit, and bit 5 is data bit number 2,
parity-check matrix, H. Given a message p, the
thus we set bit 2 to 1 (0100).
codeword will be the product of G and p with
entries modulo 2: If the message is
c = Gp (1) 1 
0 
Given the received codeword y, the syndrome p= 
vector is 1 
 
z = Hy (2) 1 

If z = 0 then the received codeword is error-free, then the codeword will be


else the value of z is the position of the flipped bit. 1 1 1
0  2  0 
For the (7, 4) Hamming code, the parity-check 1 0 1
1  3  1 
matrix is  1     
1 0 0    1  1 
0
  0    
1 0 1 0 1 0 1 c = Gp = 0 1 1    =  2  = 0 
1
1 
H = 0 1 1 0 0 1 1 (3) 0 1 0     0  0 
0
0 0 0 1 1 1 1   1     
0 0 1 0 1  1 
0 0 0 1 1  1 
and the generator matrix is     
If no bit is flipped during transmission, in other
words, y = c. Then the syndrome vector is

3
0  shown below, with check nodes on the left and
1 message nodes on the right.
 
1 0 1 0 1 0 1 1 2 0
  c1
z = Hy = 0 1 1 0 0 1 1 0 = 4 = 0
c2
0 0 0 1 1 1 1 0 2 0
  c1+c3+c5+c7= f
1 0 c3
1 c2+c3+c6+c7= f
  0
c4

c4+c5+c6+c7= f c5
If the 6th bit is flipped, 0
c6
0 
1 c7
 
1 Figure 1: Tanner graph for (7, 4) Hamming code
 
y = 0 
0  As seen above, there is an edge connects a check
 
0  node with a message node if the message node is
1 included in the check node’s equation. From a
 
Tanner graph, we can deduce a parity-check matrix
then by putting a 1 at position (i, j) if there is an edge
connecting fi and cj. The code defined by a Tanner
0  graph (or a parity-check matrix) is the set of vectors
1 c = (c1,…,cn) such that HcT = 0. In other words, the
 
1 0 1 0 1 0 1 1 2 0 code forms the null space of H.
 
z = Hy = 0 1 1 0 0 1 1 0 = 3 = 1 D. Low Density Parity-check Code (LDPC)
0 0 0 1 1 1 1 0 1  1 Any linear code has a bipartite graph and a
  parity-check matrix representation. But not all linear
0 
1 code has a sparse representation. A n × m matrix is
 
sparse if the number of 1’s in any row, the row
Reading z from the bottom up (higher position weight wr, and the number of 1’s in any column, the
first), we see the flipped bit is indeed 6 (110). column weight wc, is much less than the dimension
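The single-error correction just described is easy to verify numerically. The following sketch is my own illustration (the variable names are not from the paper): it encodes with G from (4), flips one bit, and reads the error position from the syndrome as in (2), using the bit ordering of Table 1.

```python
import numpy as np

H = np.array([[1, 0, 1, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])          # parity-check matrix, eq (3)
G = np.array([[1, 1, 0, 1],
              [1, 0, 1, 1],
              [1, 0, 0, 0],
              [0, 1, 1, 1],
              [0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1]])                   # generator matrix, eq (4)

p = np.array([1, 0, 1, 1])
c = G.dot(p) % 2                               # codeword, eq (1): [0 1 1 0 0 1 1]

y = c.copy()
y[5] ^= 1                                      # flip the 6th bit (index 5)
z = H.dot(y) % 2                               # syndrome, eq (2): [0 1 1]
error_pos = z[0] * 1 + z[1] * 2 + z[2] * 4     # read z "from the bottom up": 110 -> 6
if error_pos:
    y[error_pos - 1] ^= 1                      # correct the flipped bit
assert error_pos == 6 and (y == c).all()
```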
C. Tanner graph

A very useful way of representing linear block codes is the Tanner graph. A Tanner graph is a bipartite graph, which means the graph is separated into two partitions. These partitions are called by different names: subcode nodes and digit nodes, variable nodes and check nodes, message nodes and check nodes. I will call them message nodes and check nodes from now on. The Tanner graph maps directly to the parity-check matrix H of a linear block code, with the check nodes representing the rows of H. The Tanner graph of the (7, 4) Hamming code is shown below, with check nodes on the left and message nodes c1, …, c7 on the right; the three check nodes enforce the parity equations c1+c3+c5+c7 = 0, c2+c3+c6+c7 = 0 and c4+c5+c6+c7 = 0.

Figure 1: Tanner graph for the (7, 4) Hamming code

As seen above, there is an edge connecting a check node with a message node if the message node is included in the check node's equation. From a Tanner graph, we can deduce a parity-check matrix by putting a 1 at position (i, j) if there is an edge connecting fi and cj. The code defined by a Tanner graph (or a parity-check matrix) is the set of vectors c = (c1, …, cn) such that Hc^T = 0. In other words, the code forms the null space of H.

D. Low Density Parity-check Code (LDPC)

Any linear code has a bipartite graph and a parity-check matrix representation. But not every linear code has a sparse representation. An n × m matrix is sparse if the number of 1's in any row (the row weight wr) and the number of 1's in any column (the column weight wc) are much less than the dimensions (wr << m, wc << n). A code represented by a sparse parity-check matrix is called a low density parity-check code (LDPC). The sparse property of LDPC gives rise to its algorithmic advantages. An LDPC code is said to be regular if wc is constant for every column, wr is constant for every row, and wr = wc·(n/m). An LDPC which is not regular is called irregular.
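To connect the matrix and graph views, the short sketch below (illustrative only; the helper names are my own) builds the edge list of the Tanner graph directly from a parity-check matrix and reports the row and column weights, which is all that is needed to test the regularity condition above.

```python
import numpy as np

def tanner_edges(H):
    """Return the Tanner graph as (check f_j, message c_i) pairs: one edge per 1 in H."""
    return [(j, i) for j, i in zip(*np.nonzero(H))]

def weights(H):
    """Row weights w_r and column weights w_c of a parity-check matrix."""
    return H.sum(axis=1), H.sum(axis=0)

H = np.array([[1, 0, 1, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])
wr, wc = weights(H)
print(tanner_edges(H))   # e.g. edge (0, 0) connects f1 and c1
print(wr, wc)            # weights vary and are not small: the Hamming code is not an LDPC
# A code is regular if all wc are equal, all wr are equal, and wr = wc * (n / m).
```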
III. DECODING

Different authors came up independently with more or less the same iterative decoding algorithm. They call it by different names: the sum-product algorithm, the belief propagation algorithm, and the message passing algorithm. There are two derivations of this algorithm: the hard-decision and the soft-decision schemes.

A. Hard-decision Decoder

Figure 2: Belief propagation example code

In [8], Leiner uses an (8, 4) linear block code to illustrate the hard-decision decoder. The code is represented in Figure 2; its corresponding parity-check matrix is

H = [ 0 1 0 1 1 0 0 1 ]
    [ 1 1 1 0 0 1 0 0 ]    (5)
    [ 0 0 1 0 0 1 1 1 ]
    [ 1 0 0 1 1 0 1 0 ]

An error-free codeword of H is c = [1 0 0 1 0 1 0 1]^T. Suppose we receive y = [1 1 0 1 0 1 0 1]^T, so c2 was flipped. The algorithm is as follows:

1. In the first step, all message nodes send a message to their connected check nodes. In this case, the message is the bit they believe to be correct for them. For example, message node c2 receives a 1 (y2 = 1), so it sends a message containing 1 to check nodes f1 and f2. Table 3 illustrates this step.

2. In the second step, every check node calculates a response to its connected message nodes using the messages received in step 1. The response message in this case is the value (0 or 1) that the check node believes the message node has, based on the information from the other message nodes connected to that check node. This response is calculated using the parity-check equations, which force all message nodes connected to a particular check node to sum to 0 (mod 2).

In Table 3, check node f1 receives 1 from c4, 0 from c5 and 1 from c8, thus it believes c2 has 0 (1+0+1+0 = 0), and sends that information back to c2. Similarly, it receives 1 from c2, 1 from c4 and 1 from c8, thus it believes c5 has 1 (1+1+1+1 = 0), and sends 1 back to c5.

At this point, if all the equations at all check nodes are satisfied, meaning the values that the check nodes calculate match the values they receive, the algorithm terminates. If not, we move on to step 3.

3. In this step, the message nodes use the messages they get from the check nodes to decide whether the bit at their position is a 0 or a 1 by majority rule. The message nodes then send this hard decision to their connected check nodes. Table 4 illustrates this step. To make it clear, let us look at message node c2. It receives two 0's from check nodes f1 and f2. Together with what it already has, y2 = 1, it decides that its real value is 0. It then sends this information back to check nodes f1 and f2.

4. Repeat step 2 until we either exit at step 2 or a certain number of iterations has passed.
In this example, the algorithm terminates right after the first iteration, as all parity-check equations have been satisfied. c2 is corrected to 0.

TABLE 3
CHECK NODE ACTIVITIES FOR THE HARD-DECISION DECODER FOR THE CODE OF FIGURE 2

f1   receives   c2→1   c4→1   c5→0   c8→1
     sends      0→c2   0→c4   1→c5   0→c8
f2   receives   c1→1   c2→1   c3→0   c6→1
     sends      0→c1   0→c2   1→c3   0→c6
f3   receives   c3→0   c6→1   c7→0   c8→1
     sends      0→c3   1→c6   0→c7   1→c8
f4   receives   c1→1   c4→1   c5→0   c7→0
     sends      1→c1   1→c4   0→c5   0→c7

TABLE 4
MESSAGE NODE DECISIONS FOR THE HARD-DECISION DECODER FOR THE CODE OF FIGURE 2

message node   yi   messages from check nodes   decision
c1             1    f2→0   f4→1                 1
c2             1    f1→0   f2→0                 0
c3             0    f2→1   f3→0                 0
c4             1    f1→0   f4→1                 1
c5             0    f1→1   f4→0                 0
c6             1    f2→0   f3→1                 1
c7             0    f3→0   f4→0                 0
c8             1    f1→1   f3→1                 1

B. Soft-decision Decoder

The soft-decision decoder operates on the same principle as the hard-decision decoder, except that the messages are the conditional probabilities that the received bit is a 1 or a 0 given the received vector y.

Let Pi = Pr[ci = 1 | y] be the conditional probability that ci is a 1 given the value of y. We have Pr[ci = 0 | y] = 1 − Pi.

Let q_ij^(l) be the message sent by message node ci to check node fj at round l. Every message contains a pair q_ij^(l)(0) and q_ij^(l)(1), which stand for the "amount of belief" that yi is 0 or 1, with q_ij^(l)(0) + q_ij^(l)(1) = 1. In particular, q_ij^(0)(1) = Pi and q_ij^(0)(0) = 1 − Pi.

Similarly, let r_ji^(l) be the message sent by check node fj to message node ci at round l. Every message contains a pair r_ji^(l)(0) and r_ji^(l)(1), which stand for the "amount of belief" that yi is 0 or 1. We also have r_ji^(l)(0) + r_ji^(l)(1) = 1. r_ji^(l)(0) is also the probability that there is an even number of 1's on all message nodes other than ci.

First, consider the probability that there is an even number of 1's on 2 message nodes. Let q1 be the probability that there is a 1 at message node c1 and q2 be the probability that there is a 1 at message node c2. We have

Pr[c1 ⊕ c2 = 0] = q1·q2 + (1 − q1)(1 − q2)
               = 1 − q1 − q2 + 2·q1·q2
               = (1/2)(2 − 2q1 − 2q2 + 4·q1·q2)
               = (1/2)[1 + (1 − 2q1)(1 − 2q2)] = q    (6)

Now consider the probability that there is an even number of 1's on 3 message nodes, c1, c2 and c3. Note that 1 − q is the probability that there is an odd number of 1's on c1 and c2.

Pr[(c1 ⊕ c2) ⊕ c3 = 0] = (1/2)[1 + (1 − 2(1 − q))(1 − 2q3)]
                        = (1/2)[1 + (1 − 2q1)(1 − 2q2)(1 − 2q3)]    (7)
In general,

Pr[c1 ⊕ … ⊕ cn = 0] = 1/2 + 1/2 ∏_{i=1}^{n} (1 − 2qi)    (8)

Therefore, the message that fj sends to ci at round l is

r_ji^(l)(0) = 1/2 + 1/2 ∏_{i'∈V_j, i'≠i} (1 − 2 q_{i'j}^(l−1)(1))    (9)

r_ji^(l)(1) = 1 − r_ji^(l)(0)    (10)

where V_j is the set of all message nodes connected to check node fj.

The message that ci sends to fj at round l is

q_ij^(l)(0) = k_ij (1 − Pi) ∏_{j'∈C_i, j'≠j} r_{j'i}^(l−1)(0)    (11)

q_ij^(l)(1) = k_ij Pi ∏_{j'∈C_i, j'≠j} r_{j'i}^(l−1)(1)    (12)

where C_i is the set of all check nodes connected to message node ci. The constant k_ij is chosen so that q_ij^(l)(0) + q_ij^(l)(1) = 1.

At each message node, the following calculations are made:

Q_i^(l)(0) = k_i (1 − Pi) ∏_{j∈C_i} r_ji^(l)(0)    (13)

Q_i^(l)(1) = k_i Pi ∏_{j∈C_i} r_ji^(l)(1)    (14)

Q_i^(l) is the effective probability of 0 and 1 at message node ci at round l. If Q_i^(l)(1) > Q_i^(l)(0) then the estimate at this point is ci = 1, otherwise ci = 0. If this estimate satisfies the parity-check equations then the algorithm terminates. Else, the algorithm runs through a predetermined number of iterations.
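The update rules (9)-(14) translate almost directly into code. Below is a minimal sketch (my own illustration, assuming H is a small dense 0/1 NumPy array and P[i] = Pr[ci = 1 | y]) of one round of the probability-domain message updates; scheduling, the Q computation of (13)-(14) and the stopping test are omitted for brevity.

```python
import numpy as np

def check_to_message(H, q1):
    """Compute r_ji(0) for every edge using equation (9).
    q1[i, j] holds q_ij(1) from the previous round."""
    m, n = H.shape
    r0 = np.zeros((m, n))
    for j in range(m):                      # check node f_j
        V_j = np.nonzero(H[j])[0]           # message nodes connected to f_j
        for i in V_j:
            prod = 1.0
            for i2 in V_j:
                if i2 != i:                 # exclude the target node c_i
                    prod *= (1.0 - 2.0 * q1[i2, j])
            r0[j, i] = 0.5 + 0.5 * prod     # equation (9); r_ji(1) = 1 - r_ji(0), eq (10)
    return r0

def message_to_check(H, P, r0):
    """Compute q_ij(0), q_ij(1) for every edge using equations (11), (12)."""
    m, n = H.shape
    q0, q1 = np.zeros((n, m)), np.zeros((n, m))
    for i in range(n):                      # message node c_i
        C_i = np.nonzero(H[:, i])[0]        # check nodes connected to c_i
        for j in C_i:
            p0, p1 = 1.0 - P[i], P[i]
            for j2 in C_i:
                if j2 != j:
                    p0 *= r0[j2, i]
                    p1 *= (1.0 - r0[j2, i])
            k = 1.0 / (p0 + p1)             # normalization constant k_ij
            q0[i, j], q1[i, j] = k * p0, k * p1
    return q0, q1
```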


As seen above, this algorithm uses a lot of multiplications, which are costly to implement. Another approach is to use the logarithmic likelihood ratio. Let

Li = Pr[ci = 0 | y] / Pr[ci = 1 | y] = (1 − Pi) / Pi    (15)

li = ln Li = ln( Pr[ci = 0 | y] / Pr[ci = 1 | y] )    (16)

Li is the likelihood ratio and li is the log likelihood ratio at message node ci. Using log ratios turns multiplications into additions, which are much cheaper to implement in hardware. With the log likelihood ratio, we have

Pi = 1 / (1 + Li)    (17)

From (11) and (12), the message that ci sends to fj at round l is

m_ij^(l) = ln( q_ij^(l)(0) / q_ij^(l)(1) )
         = ln( ((1 − Pi)/Pi) ∏_{j'∈C_i, j'≠j} r_{j'i}^(l−1)(0) / r_{j'i}^(l−1)(1) )
         = li + ∑_{j'∈C_i, j'≠j} m_{j'i}^(l−1)    (18)

From (9) and (10), the message that fj sends to ci at round l is

m_ji^(l) = ln( r_ji^(l)(0) / r_ji^(l)(1) )
         = ln[ (1/2 + 1/2 ∏_{i'∈V_j, i'≠i} (1 − 2 q_{i'j}^(l−1)(1))) / (1/2 − 1/2 ∏_{i'∈V_j, i'≠i} (1 − 2 q_{i'j}^(l−1)(1))) ]
         = ln[ (1 + ∏_{i'∈V_j, i'≠i} tanh(m_{i'j}^(l−1)/2)) / (1 − ∏_{i'∈V_j, i'≠i} tanh(m_{i'j}^(l−1)/2)) ]    (19)

We have (19) because, from (18),

e^{m_{i'j}} = (1 − q_{i'j}(1)) / q_{i'j}(1)    (20)
Thus

q_{i'j}(1) = 1 / (1 + e^{m_{i'j}})    (21)

and

1 − 2q_{i'j}(1) = (e^{m_{i'j}} − 1) / (e^{m_{i'j}} + 1) = tanh(m_{i'j}/2)    (22)

Equations (13), (14) turn into

l_i^(l) = ln( Q_i^(l)(0) / Q_i^(l)(1) ) = li + ∑_{j∈C_i} m_ji^(l)    (23)

If l_i^(l) > 0 then ci = 0, else ci = 1.

In practice, belief propagation is executed for a maximum number of iterations or until the passed likelihoods are close to certainty, whichever comes first. A certain likelihood is li = ±∞, where Pi = 0 for li = ∞ and Pi = 1 for li = −∞.
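To make the log-domain rules (18), (19) and (23) concrete, here is a compact sum-product decoder sketch. It is my own illustrative code, not Leiner's or Gallager's; the names (H, llr_channel, max_iter) are assumptions, and no simplifications such as the min-sum approximation are applied. A usage example would pass the matrix of equation (5) together with channel LLRs computed from the Pi via (16).

```python
import numpy as np

def decode_llr(H, llr_channel, max_iter=50):
    """Sum-product decoding in the log domain.
    H: (m, n) 0/1 parity-check matrix; llr_channel: length-n vector of li from (16)."""
    m, n = H.shape
    msg_v2c = H * llr_channel                      # m_ij starts as li (eq (18), round 0)
    for _ in range(max_iter):
        # Check-to-message update, eq (19): the tanh rule.
        tanh_msgs = np.tanh(msg_v2c / 2.0)
        msg_c2v = np.zeros((m, n))
        for j in range(m):
            idx = np.nonzero(H[j])[0]
            for i in idx:
                prod = np.prod([tanh_msgs[j, i2] for i2 in idx if i2 != i])
                prod = np.clip(prod, -0.999999, 0.999999)    # keep the log finite
                msg_c2v[j, i] = np.log((1 + prod) / (1 - prod))
        # Total LLR and hard decision, eq (23).
        total = llr_channel + msg_c2v.sum(axis=0)
        c_hat = (total < 0).astype(int)            # li > 0 means ci = 0
        if not np.any(H.dot(c_hat) % 2):           # all parity checks satisfied
            return c_hat
        # Message-to-check update, eq (18): leave out the target check's contribution.
        msg_v2c = H * (total - msg_c2v)
    return c_hat
```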

One very important aspect of belief propagation is that its running time is linear in the code length. Since the algorithm traverses between check nodes and message nodes, and the graph is sparse, the number of traversals is small. Moreover, if the algorithm runs a fixed number of iterations then each edge is traversed a fixed number of times, thus the number of operations is fixed and depends only on the number of edges. If we let the number of check nodes and message nodes increase linearly with the code length, the number of operations performed by belief propagation also increases linearly with the code length.

C. Performance of Belief Propagation

A parameter to measure the performance of belief propagation is the expected fraction of incorrect messages passed at the lth iteration, Pe^n(l). In [9], Richardson and Urbanke show that:

1. For any δ > 0, the probability that the actual fraction of incorrect messages passed in any instance at round l lies outside (Pe^n(l) − δ, Pe^n(l) + δ) converges to zero exponentially fast with n.

2. lim_{n→∞} Pe^n(l) = Pe^∞(l), where Pe^∞(l) is the expected fraction of incorrect messages passed at round l assuming that the graph does not contain any cycle of length 2l or less. The assumption ensures that the decoding neighborhoods become "tree-like" so that the messages are independent for l rounds [10]. The value Pe^∞(l) can be calculated by a method called density evolution. For a message alphabet of size q, Pe^∞(l) can be expressed by means of q − 1 coupled recursive functions (a sketch of the simplest special case is given after this list).

3. There exists a channel parameter σ* with the following property: if σ < σ* then lim_{l→∞} Pe^∞(l) = 0, else if σ > σ* then there exists a constant γ(σ) > 0 such that Pe^∞(l) > γ(σ) for all l ≥ 1. Here σ² is the noise variance in the channel. In other words, σ* sets the limit up to which belief propagation decodes successfully.
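Density evolution in general tracks whole message densities, but for the binary erasure channel it collapses to a single scalar recursion, which makes the threshold (there the erasure probability rather than a noise variance σ*) easy to see. The sketch below is my own illustration of that special case, written in terms of the edge-perspective degree distribution polynomials λ(x) and ρ(x) that are defined formally in Section V; it is not the q − 1 coupled recursion of [9].

```python
def bec_density_evolution(lam, rho, eps, max_iter=200, tol=1e-12):
    """Erasure-channel density evolution: x_{l+1} = eps * lam(1 - rho(1 - x_l)).
    lam, rho: lists of edge-degree fractions, lam[i] = fraction of edges of degree i + 1."""
    poly = lambda coeffs, x: sum(c * x**k for k, c in enumerate(coeffs))
    x = eps
    for _ in range(max_iter):
        x_new = eps * poly(lam, 1.0 - poly(rho, 1.0 - x))
        if abs(x_new - x) < tol:
            break
        x = x_new
    return x    # near 0 means decoding succeeds at erasure probability eps

# Example: (3, 6)-regular LDPC, i.e. lam(x) = x^2 and rho(x) = x^5.
lam = [0.0, 0.0, 1.0]
rho = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]
print(bec_density_evolution(lam, rho, eps=0.40))   # below the ~0.43 threshold: tends to 0
print(bec_density_evolution(lam, rho, eps=0.45))   # above the threshold: stuck at a positive value
```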
IV. ENCODING

If the generator matrix G of a linear block code is known, then encoding can be done using equation (1). The cost (number of operations) of this method depends on the Hamming weights (number of 1's) of the basis vectors of G. If the vectors are dense, the cost of encoding using this method is proportional to n². This cost becomes linear in n if G is sparse.

However, an LDPC code is given by the null space of a sparse parity-check matrix H. It is unlikely that the generator matrix G will also be sparse. Therefore the straightforward method of encoding LDPC would require a number of operations proportional to n². This is too slow for most practical applications, so it is desirable to have encoding algorithms that run in linear time. This section looks at two approaches to achieve that goal.

A. Accumulate approach

The first approach modifies the LDPC code so that it has
an inherent fast encoding algorithm. In this case, we assign a value to each check node which is equal to the sum of all message nodes that are connected to it. (It would be more appropriate to talk about information nodes and redundant nodes, but for consistency of notation, I will use message nodes and check nodes.) The number of summations needed to calculate the value of a check node is bounded by the number of message nodes connected to a check node. This is a constant when the code is sparse. The message consists of the message nodes appended by the values of the check nodes. To illustrate the difference between this modified version of LDPC and the original version, consider Figure 2. If Figure 2 represents an original LDPC, then c1, c2, c3, c4 are information bits and c5, c6, c7, c8 are parity bits which have to be calculated from c1, c2, c3, c4 by solving the parity-check equations in f1, f2, f3, f4. The code rate is 4/8 = 1/2. Now if Figure 2 represents a modified LDPC, then all of c1, c2, c3, c4, c5, c6, c7, c8 are information bits, while f1, f2, f3, f4 are redundant bits calculated from c1, …, c8. f1 is connected to c2, c4, c5, c8, so f1 = c2 + c4 + c5 + c8, and so on. The codeword in this case is [c1 … c8 f1 … f4]^T. The code rate is 8/12 = 2/3.

Although this approach gives a linear encoder, it causes a major problem at the decoder. If the channel is an erasure channel, the values of the check nodes might be erased. By contrast, the check nodes of the original LDPC are dependencies, not values, thus they are not affected by the channel. In other words, a check node defines a relationship among its connected message nodes. This relationship comes straight from the parity-check matrix. The fact that in the modified LDPC the values of check nodes can be erased creates a lower bound for the convergence of any decoding algorithm. In [10], Shokrollahi proves the existence of such a lower bound. Suppose that the channel is an erasure channel with erasure probability p. Then an expected p-fraction of the message nodes and an expected p-fraction of the check nodes will be erased. Let Md be the fraction of message nodes of degree d (connected with d check nodes). The probability of a message node of degree d having all its connected check nodes erased is p^d. This probability is conditioned on the event that the degree of the message node is d. Since the graph is created randomly, the probability that a message node has all its connected check nodes erased is ∑_d Md·p^d, which is a constant independent of the length of the code. Therefore, no algorithm can recover the value of that message node.

B. Lower triangular modification approach

In [11], Richardson and Urbanke propose an encoding algorithm that has effectively linear running time for any code with a sparse parity-check matrix. The algorithm consists of two phases: preprocessing and encoding.

In the preprocessing phase, H is converted into the form shown in Figure 3 by row and column permutations.

Figure 3: Parity-check matrix in approximately lower triangular form. H is divided into sparse blocks A, B, C, D, E and a lower triangular block T with 1's on the diagonal; the column blocks have widths n−k, g and k−g, and the row blocks have heights k−g and g.

In matrix notation,

H = [ A   B   T ]
    [ C   D   E ]    (24)

where T has a lower triangular form with all diagonal entries equal to 1. Since the operation is done by row and column permutations and H is sparse, A, B, C, D, E, T are also sparse. The gap g measures how close H can be made, by row and column permutations, to a lower triangular matrix.
TABLE 5
COMPUTING p1 USING RICHARDSON AND URBANKE'S ENCODING ALGORITHM

Operation                         Comment                                    Complexity
As^T                              Multiplication by sparse matrix            O(n)
T^-1[As^T]                        Back-substitution, T is lower triangular   O(n)
−E[T^-1·As^T]                     Multiplication by sparse matrix            O(n)
Cs^T                              Multiplication by sparse matrix            O(n)
[−ET^-1·As^T] + [Cs^T]            Addition                                   O(n)
−Φ^-1(−ET^-1·As^T + Cs^T)         Multiplication by g×g matrix               O(n + g²)

TABLE 6
COMPUTING p2 USING RICHARDSON AND URBANKE'S ENCODING ALGORITHM

Operation                         Comment                                    Complexity
As^T                              Multiplication by sparse matrix            O(n)
Bp1^T                             Multiplication by sparse matrix            O(n)
[As^T] + [Bp1^T]                  Addition                                   O(n)
−T^-1(As^T + Bp1^T)               Back-substitution, T is lower triangular   O(n)

Multiplying H from the left by

[ I        0 ]
[ −ET^-1   I ]    (25)

we get

[ A             B             T ]
[ −ET^-1·A + C   −ET^-1·B + D   0 ]    (26)

Let the codeword be c = (s, p1, p2), where s is the information bits and p1, p2 are the parity-check bits; p1 has length g and p2 has length k − g. From Hc^T = 0, we have

[ I        0 ]            [ A             B             T ] [ s^T  ]
[ −ET^-1   I ] · Hc^T  =  [ −ET^-1·A + C   −ET^-1·B + D   0 ] [ p1^T ] = 0    (27)
                                                              [ p2^T ]

Therefore

As^T + Bp1^T + Tp2^T = 0    (28)

(−ET^-1·A + C)s^T + (−ET^-1·B + D)p1^T = 0    (29)

The procedure to find p1 and p2 is summarized in Tables 5 and 6. Define Φ = −ET^-1·B + D and assume for the moment that Φ is nonsingular. Then

p1^T = −Φ^-1(−ET^-1·A + C)s^T = −Φ^-1(−ET^-1·As^T + Cs^T)    (30)

First we compute As^T. Since A is sparse, this is done in linear time, O(n). Then we compute T^-1[As^T] = y^T. Since [As^T] = Ty^T and T is lower triangular, y^T can be computed in linear time by back-substitution. The calculations −Ey^T and Cs^T are also done in O(n) as E and C are sparse. Now we have (−ET^-1·As^T + Cs^T) = z^T computed in O(n). Since Φ is g×g, p1 is computed from (30) in O(n + g²).

From (28), p2^T = −T^-1(As^T + Bp1^T). The steps to calculate p2 are quite similar and are shown in Table 6. We see that p2 can be computed in O(n). As seen in Tables 5 and 6, c can be computed in O(n + g²).
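The two tables map directly onto code. Below is a small dense-matrix sketch of the two computations (my own illustration, not the implementation in [11]); a practical encoder would store A, B, C, D, E, T as sparse structures and would precompute Φ and its inverse once in the preprocessing phase. Note that over GF(2) the minus signs disappear.

```python
import numpy as np

def solve_lower_unit_gf2(T, b):
    """Back-substitution over GF(2): solve T x = b with T lower triangular,
    1's on the diagonal (the O(n) steps in Tables 5 and 6)."""
    x = np.zeros_like(b)
    for i in range(len(b)):
        x[i] = (b[i] + T[i, :i].dot(x[:i])) % 2
    return x

def solve_gf2(M, b):
    """Dense Gauss-Jordan over GF(2) for the small g x g system with Phi."""
    M, b, g = M.copy() % 2, b.copy() % 2, len(b)
    for col in range(g):
        pivot = next(r for r in range(col, g) if M[r, col])   # assumes Phi nonsingular
        M[[col, pivot]], b[[col, pivot]] = M[[pivot, col]], b[[pivot, col]]
        for r in range(g):
            if r != col and M[r, col]:
                M[r] = (M[r] + M[col]) % 2
                b[r] = (b[r] + b[col]) % 2
    return b

def ru_encode(A, B, C, D, E, T, s):
    """Compute the parity parts p1, p2 for information bits s (Tables 5 and 6)."""
    As = A.dot(s) % 2                              # A s^T, sparse multiplication
    y = solve_lower_unit_gf2(T, As)                # T^-1 A s^T by back-substitution
    z = (E.dot(y) + C.dot(s)) % 2                  # E T^-1 A s^T + C s^T
    TinvB = np.column_stack([solve_lower_unit_gf2(T, B[:, j]) for j in range(B.shape[1])])
    Phi = (E.dot(TinvB) + D) % 2                   # Phi = E T^-1 B + D (mod 2)
    p1 = solve_gf2(Phi, z)                         # eq (30)
    p2 = solve_lower_unit_gf2(T, (A.dot(s) + B.dot(p1)) % 2)   # eq (28), Table 6
    return p1, p2
```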
Richardson and Urbanke prove in [11] that the gap g concentrates around its expected value, αn, with high probability, where α is a small constant. For a regular LDPC with row weight wr = 6 and column weight wc = 3, α = 0.017. Therefore, even though mathematically the encoding algorithm runs in O(n²) (α²O(n²) to be precise), in practice the encoder still runs in reasonable time for n = 100,000. In the same paper, Richardson and Urbanke also show that for known "optimized" codes, the expected g is bounded by O(√n), thus the encoder runs in O(n).

V. IRREGULAR CODES

It has been shown that irregular LDPC codes perform better than regular LDPC codes [12], [13], [14]. The idea was pioneered by Luby et al. in [12]. He thinks of finding coefficients for an irregular code as a game, with the message nodes and check nodes as players. Each player tries to choose the right number of edges for itself. A constraint of the game is that the message nodes and the check nodes must agree on the total number of edges. From the point of view of the message nodes, it is best to have high degree, since the more information a node has from the check nodes, the more accurately it can judge what its correct value should be. On the other hand, from the point of view of the check nodes, it is best to have low degree, since the lower the degree of a check node, the more valuable the information it can transmit back to the message nodes. These two requirements must be appropriately balanced to have a good code.

MacKay shows in [15], [16] that for regular codes it is best to have low density. However, allowing irregular codes provides another degree of freedom. In [12], Luby shows that having a wide spread of degrees is advantageous, at least for the message nodes. The reason is that message nodes with high degree tend to correct their values faster. These nodes then provide good information to the check nodes, which subsequently provide better information to the lower degree message nodes. Therefore an irregular graph has the potential to provide a wave effect, where high degree message nodes are corrected first, followed by slightly smaller degree nodes, and so on.

Before getting into the details of how to construct irregular codes, let us introduce some notation. Let dl, dr be the maximum degrees of the message nodes and check nodes. Define the left (right) degree of an edge to be the degree of the message node (check node) that is connected to the edge. λi (ρi) is the fraction of edges with left (right) degree i. Any LDPC graph is specified by the sequences (λ1, …, λ_dl) and (ρ1, …, ρ_dr). Further, define

λ(x) = ∑_i λi x^(i−1)    (31)

and

ρ(x) = ∑_i ρi x^(i−1)    (32)

to be the degree distributions of the message nodes and check nodes. Also, define pi to be the probability that an incorrect message is passed in the ith iteration.

Now consider a pair of message node and check node (m, c), and let c' be another check node of m different from c. At the end of the ith iteration, c' will send m its correct value if there is an even number (including 0) of message nodes other than m sending c' the incorrect bit. By an analysis analogous to equation (8), the probability that c' receives an even number of errors is

(1 + (1 − 2pi)^(dr−1)) / 2    (33)

for the case of unvarying degrees of the nodes, and

(1 + ρ(1 − 2pi)) / 2    (34)

for the case of varying degrees of the nodes, where ρ(x) is defined in (32).
A. Luby’s design
nodes then provide good information to the check
nodes, which subsequently provide better In [12], Luby proves the iterative description of
information to the lower degree message nodes. pi .
Therefore irregular graph has potential to provide a
wave effect where high degree message nodes are
11
p_{i+1} = p0 − p0 ∑_{j=1}^{dl} λj ∑_{t=b_{i,j}}^{j−1} C(j−1, t) [ (1 + ρ(1 − 2p_i))/2 ]^t [ (1 − ρ(1 − 2p_i))/2 ]^(j−1−t)
        + (1 − p0) ∑_{j=1}^{dl} λj ∑_{t=b_{i,j}}^{j−1} C(j−1, t) [ (1 − ρ(1 − 2p_i))/2 ]^t [ (1 + ρ(1 − 2p_i))/2 ]^(j−1−t)    (35)

Here C(j−1, t) denotes the binomial coefficient, and p0 is the error probability of the channel. b_{i,j} is given by the smallest integer that satisfies

(1 − p0)/p0 ≤ [ (1 + ρ(1 − 2p_i)) / (1 − ρ(1 − 2p_i)) ]^(2b_{i,j} − j + 1)    (36)

The goal of this design is to find sequences λ = (λ1, …, λ_dl) and ρ = (ρ1, …, ρ_dr) that yield the biggest value of p0 such that the sequence {pi} decreases to 0. Define

f(x) = p0 − p0 ∑_{j=1}^{dl} λj ∑_{t=b_{i,j}}^{j−1} C(j−1, t) [ (1 + ρ(1 − 2x))/2 ]^t [ (1 − ρ(1 − 2x))/2 ]^(j−1−t)
     + (1 − p0) ∑_{j=1}^{dl} λj ∑_{t=b_{i,j}}^{j−1} C(j−1, t) [ (1 − ρ(1 − 2x))/2 ]^t [ (1 + ρ(1 − 2x))/2 ]^(j−1−t)    (37)

So p_{i+1} = f(p_i); therefore we want f(x) < x. Another constraint on λ and ρ is

∑_i ρi / i = (1 − R) ∑_i λi / i    (38)

Equation (38) makes sure that the total numbers of left and right edge endpoints agree. Luby's approach tries to find any sequence λ that satisfies (38) and f(x) < x for x ∈ (0, p0). It accomplishes this task by examining the conditions at x = p0/A for some integer A. By plugging those values of x into (37), it creates a system of linear inequalities. The algorithm then finds any λ that satisfies this linear system. As seen, Luby's approach cannot determine the best sequences λ and ρ. Instead, it determines a good vector λ given a vector ρ and a desired code rate R.
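To see how the recursion behaves, here is a small sketch that iterates (35) numerically for a given (λ, ρ) pair and channel error probability p0. It is an illustration only: the helper names and the regular (3, 6) example are my own choices, not taken from [12], and b_{i,j} is obtained by directly testing condition (36).

```python
from math import comb

def rho_poly(rho, x):                      # rho(x) = sum_i rho_i x^(i-1), eq (32)
    return sum(r * x**(i - 1) for i, r in rho.items())

def b_threshold(j, p0, rp):
    """Smallest integer b satisfying condition (36) for a degree-j message node."""
    for b in range(j):
        if (1 - p0) / p0 <= ((1 + rp) / (1 - rp)) ** (2 * b - j + 1):
            return b
    return j                               # no admissible t: the node never flips

def luby_step(p_i, p0, lam, rho):
    """One application of the recursion (35): p_{i+1} = f(p_i)."""
    rp = rho_poly(rho, 1 - 2 * p_i)
    plus, minus = (1 + rp) / 2, (1 - rp) / 2
    out = p0
    for j, lam_j in lam.items():           # lam maps left degree j to lambda_j
        b = b_threshold(j, p0, rp)
        flip_back = sum(comb(j - 1, t) * plus**t * minus**(j - 1 - t) for t in range(b, j))
        flip_away = sum(comb(j - 1, t) * minus**t * plus**(j - 1 - t) for t in range(b, j))
        out += lam_j * (-p0 * flip_back + (1 - p0) * flip_away)
    return out

# Regular (3, 6) example: lambda(x) = x^2, rho(x) = x^5 as degree -> fraction maps.
lam, rho = {3: 1.0}, {6: 1.0}
p = p0 = 0.03
for _ in range(50):
    p = luby_step(p, p0, lam, rho)
print(p)    # should tend toward 0 when p0 is below the code's threshold
```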
Luby shows through simulations that the best codes have constant ρ; in other words, the check nodes all have the same degree. Some results from [12] are reproduced in Table 7.

TABLE 7
LUBY'S IRREGULAR CODES

p* in Table 7 is the maximum value of p0
achieved by each code. All of the codes above have code rate R = 1/2. Previously, the best p* for Gallager's regular codes with code rate 1/2 was 0.0517 [1].

B. Richardson's design

In [13], Richardson, Shokrollahi and Urbanke propose a design of irregular LDPC that can approach Shannon channel capacity more tightly than turbo codes. Their algorithm employs two optimizations: tolerating an error floor for practical purposes, and carefully designing the quantization of density evolution to match the quantization of the messages passed. The idea of the former optimization is that in practice we always allow a finite (but small) probability of error ε; if we choose ε small enough, then it automatically implies convergence. The latter optimization makes sure that the performance loss due to quantization errors is minimized. Since belief propagation is optimal, the quantized version is suboptimal, so the simulation results can be thought of as lower bounds for the actual values.

Richardson's algorithm starts with an arbitrary degree distribution pair (λ, ρ). It sets the target error probability ε and the maximum number of iterations m. The algorithm searches for the maximum admissible channel parameter such that belief propagation returns a probability of error less than ε after m iterations. It then slightly changes the degree distribution pair, runs the algorithm again, and checks whether a larger admissible channel parameter or a lower probability of error is found. If yes, the current distribution pair is replaced by the new distribution pair; otherwise the original pair is kept. This process is repeated a large number of times. The basis of this algorithm is Richardson's observation that there exist stable regions where the probability of error does not decrease much as the number of iterations increases. This fact helps limit the search space of the degree distribution and thus shortens the running time.
not decrease much with the increase number of C. Chung’s design
iterations. This fact helps limit the search space of In his PhD dissertation, Chung introduces a
the degree distribution thus shortens the running derivation of density evolution called discretized
time. density evolution. This derivation is claimed to
Another optimization in Richardson’s algorithm model exactly the behavior of discretized belief
is the fact that he lets the degree to be a continuous propagation. In his letter [14], Chung introduces an
variable, and round it to return to real integer irregular code which is within 0.0045 dB of
Shannon capacity. This code has dv = 8000, which is
13
much greater than the maximum message node degrees studied by Luby and Richardson. Chung's code is the closest code to Shannon capacity that has been simulated. It further confirms that LDPC indeed approaches channel capacity.

VI. CONCLUSION

This paper summarizes the important concepts regarding low density parity-check codes (LDPC). It goes through the motivation of LDPC and how LDPC can be encoded and decoded. Different modifications of the codes are presented, especially irregular codes. I chose to leave out derivations of regular codes such as MacKay codes [18] and repeat-accumulate codes [19] because they have become less important with the advance of irregular codes.

This paper, however, has not covered how LDPC is implemented in real hardware. For this, I refer the reader to [20], where a decoder design based on the IEEE 802.11n standard with very high throughput (900 Mbps for FPGA, 2 Gbps for ASIC designs) is discussed.

REFERENCES

[1] R. Gallager, "Low density parity-check codes," IRE Trans. Information Theory, pp. 21-28, January 1962.

[2] R. M. Tanner, "A recursive approach to low complexity codes," IEEE Trans. Information Theory, pp. 533-547, September 1981.

[3] D. Mackay and R. Neal, "Good codes based on very sparse matrices," Cryptography and Coding, 5th IMA Conf., C. Boyd, Ed., Lecture Notes in Computer Science, pp. 100-111, Berlin, Germany, 1995.

[4] D. Mackay, "Good error correcting codes based on very sparse matrices," IEEE Trans. Information Theory, pp. 399-431, March 1999.

[5] N. Alon and M. Luby, "A linear time erasure-resilient code with nearly optimal recovery," IEEE Trans. Information Theory, vol. 47, pp. 638-656, February 2001.

[6] "Information and Entropy," MIT OpenCourseWare, Spring 2008.

[7] Wikipedia, "Hamming(7,4)," accessed May 01, 2009.

[8] B. M. J. Leiner, "LDPC Codes - a brief Tutorial," April 2005.

[9] T. Richardson and R. Urbanke, "The capacity of low-density parity check codes under message-passing decoding," IEEE Trans. Inform. Theory, vol. 47, pp. 599-618, 2001.

[10] A. Shokrollahi, "LDPC Codes: An Introduction," Digital Fountain, Inc., April 2, 2003.

[11] T. Richardson and R. Urbanke, "Efficient encoding of low-density parity-check codes," IEEE Trans. Inform. Theory, vol. 47, pp. 638-656, 2001.

[12] M. Luby, M. Mitzenmacher, A. Shokrollahi, and D. Spielman, "Improved low-density parity-check codes using irregular graphs," IEEE Trans. Inform. Theory, vol. 47, pp. 585-598, 2001.

[13] T. Richardson, A. Shokrollahi, and R. Urbanke, "Design of capacity-approaching irregular low-density parity-check codes," IEEE Trans. Inform. Theory, vol. 47, pp. 619-637, 2001.

[14] S.-Y. Chung, D. Forney, T. Richardson, and R. Urbanke, "On the design of low-density parity-check codes within 0.0045 dB of the Shannon limit," IEEE Communication Letters, vol. 5, pp. 58-60, 2001.

[15] D. J. C. MacKay, "Good error correcting codes based on very sparse matrices," IEEE Trans. Inform. Theory, vol. 45, pp. 399-431, March 1999.

[16] D. J. C. MacKay and R. M. Neal, "Near Shannon limit performance of low-density parity-check codes," Electron. Lett., vol. 32, pp. 1645-1646, 1996.
[17] K. Price and R. Storn, "Differential evolution - a simple and efficient heuristic for global optimization over continuous spaces," J. Global Optimiz., vol. 11, pp. 341-359, 1997.

[18] D. Mackay, "Information Theory, Inference, and Learning Algorithms," Cambridge University Press, 2003.

[19] D. Divsalar, H. Jin and R. McEliece, "Coding theorems for turbo-like codes," Proc. 36th Annual Allerton Conf. on Comm., Control and Computing, pp. 201-210, September 1998.

[20] M. Karkooti, P. Radosavljevic and J. R. Cavallaro, "Configurable, high throughput, irregular LDPC decoder architecture: tradeoff analysis and implementation," Rice Digital Scholarship Archive, September 01, 2006.
