Convolutional Codes and Their Decoding
5.1 History
It was in 1955 that Peter Elias introduced the notion of convolutional code [5.5]. The encoder he described as an example is illustrated in Figure 5.1. It
is a systematic encoder, that is, the coded message contains the message to
be transmitted, to which redundant information is added. The message is of
infinite length, which at first sight limits the field of application of this type of
code. It is however easy to adapt it for packet transmissions thanks to tail-biting
techniques.
Figure 5.1 – Example of a systematic convolutional encoder built around a shift register with three memory elements; the data d_i and the redundancy r_i are multiplexed at the output.
The encoder presented in Figure 5.1 is designed around a shift register with three memory elements. The redundancy bit at instant i, denoted r_i, is constructed as the modulo-2 sum of the information at instant i, d_i, and of the data present at instants i − 1 and i − 3 (d_{i−1} and d_{i−3}). A multiplexer plays the role of a parallel-to-serial converter and provides the result of the encoding at twice the input rate. The coding rate of this encoder is 1/2 since, at each instant i, it receives one data bit d_i and delivers two elements at the output: d_i (systematic part) and r_i (redundant part).
It was not until 1957 that the first algorithm capable of decoding such codes
appeared. Invented by Wozencraft [5.15], this algorithm, called sequential de-
coding, was then improved by Fano [5.6] in 1963. Four years later, Viterbi
introduced a new algorithm that was particularly interesting when the length of
the shift register of the encoder is not too large [5.14]. Indeed, the complexity
of the Viterbi algorithm increases exponentially with the size of this register
whereas the complexity of the Fano algorithm is almost independent of it.
In 1974, Bahl, Cocke, Jelinek and Raviv presented a new algorithm [5.1]
capable of associating a probability with the binary decision. This property is
very widely used in the decoding of concatenated codes and more particularly
turbocodes, which have brought this algorithm back into favour. It is now
referred to in the literature in one of these three ways: BCJR (initials of the
inventors), MAP (Maximum A Posteriori) or APP (A Posteriori Probability).
The MAP algorithm is rather complex to implement in its initial version, but simplified versions exist, the most common of which are presented in Chapter 7.
In parallel with these advances in decoding algorithms, a number of studies have addressed the construction of convolutional encoders. The aim of these studies has not been to decrease the complexity of the encoder, since its implementation is trivial. The challenge is to find codes with the highest possible error correction capability. In 1970, Forney wrote a reference paper on the algebra of convolutional codes [5.7]. It showed that a good convolutional code is not necessarily systematic and suggested a construction different from that of Figure 5.1. For a short time, that paper took systematic convolutional codes away from the field of research on channel coding.
Figure 5.2 gives an example of a non-systematic convolutional encoder. Unlike in the encoder of Figure 5.1, the data are not present at the output; they are replaced by a modulo-2 sum of the data at instant i, d_i, and of the data present at instants i − 2 and i − 3 (d_{i−2} and d_{i−3}). The rate of the encoder remains unchanged at 1/2 since the encoder still provides two elements at the output at each instant i: r_i^{(1)} and r_i^{(2)}.
When Berrou et al. presented their work on turbocodes [5.4], they rehabil-
itated systematic convolutional codes by using them in a recursive form. The
interest of recursive codes is presented in Sections 5.2 and 5.3. Figure 5.3 gives
an example of an encoder for recursive systematic convolutional codes. Since the original message (d_i) is transmitted as such, the code is truly systematic. A feedback loop appears, the structure of the encoder now being similar to that of pseudo-random sequence generators.
This brief overview has allowed us to present the three most commonly used
families of convolutional codes: systematic, non-systematic, and recursive sys-
tematic codes. The next two sections tackle the representation and the performance of these codes.
Figure 5.2 – Example of a non-systematic convolutional encoder; the two redundancy outputs r_i^{(1)} and r_i^{(2)} are multiplexed.
Figure 5.3 – Example of a recursive systematic convolutional encoder; the register states are denoted s_i^{(1)}, s_i^{(2)}, s_i^{(3)}, the feedback s_i^{(0)}, and the outputs d_i and r_i are multiplexed.
Figure 5.4 – General structure of a recursive convolutional encoder: the m components d_i^{(1)}, ..., d_i^{(m)} of the input vector are selected by the coefficients a_j^{(l)}, the flip-flops hold the states s_i^{(1)}, s_i^{(2)}, ..., and the coefficients b_1, b_2, ... select the feedback.
Using the coefficients a_j^{(l)}, each of the m components of the vector d_i is selected or not as a term of an addition with the content of the previous flip-flop (except in the case of the first flip-flop) to provide the value to be stored in the following flip-flop. The new content of a flip-flop thus depends on the current input and on the content of the previous flip-flop. The case of the first flip-flop has to be considered differently. If all the b_j coefficients are null, its input is simply the sum of the selected components of d_i. In the opposite case, the contents of the flip-flops selected by the non-null b_j coefficients are added to the sum of the selected components of d_i: the code thus generated is recursive. Thus, the succession of states of the register depends on the departure state and on the succession of data at the input. The components of the redundancy r_i are finally produced by summing the contents of the flip-flops selected by the coefficients g.
Let us consider some examples.
Figure 5.5 – Example of an encoder with two inputs d_i^{(1)} and d_i^{(2)}, whose multiplexed outputs d_i^{(1)}, d_i^{(2)} and r_i give a rate of 2/3.
Let us take the case of the encoder defined in Figure 5.2. The output r_i^{(1)} is expressed as a function of the successive data as follows:

    r_i^{(1)} = d_i + d_{i−2} + d_{i−3}    (5.2)

that is, in the transform domain, r^{(1)}(D) = G^{(1)}(D) d(D), with G^{(1)}(D) = 1 + D^2 + D^3 the first generator polynomial of the code and d(D) the transform in D of the message to be encoded. Likewise, the second generator polynomial is G^{(2)}(D) = 1 + D + D^3.
These generator polynomials can also be summarized by the series of their coefficients, (1011) and (1101) respectively, generally given in octal representation: (13)_octal and (15)_octal respectively. In the case of a non-recursive systematic code, like the example in Figure 5.1, the generator polynomials are expressed according to the same principle. In this example, the encoder has generator polynomials G^{(1)}(D) = 1 and G^{(2)}(D) = 1 + D + D^3.
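The octal shorthand follows directly from the binary coefficients, as this small Python check (ours) shows:

g1 = 0b1011                  # coefficients of 1 + D^2 + D^3
g2 = 0b1101                  # coefficients of 1 + D + D^3
print(oct(g1), oct(g2))      # 0o13 0o15, i.e. (13)octal and (15)octal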
Defining the generator polynomials of a recursive systematic code is less straightforward. Let us consider the example of Figure 5.3 and denote by s(D) the transform in D of the sequence present at the input of the shift register. The feedback and the construction of the redundancy lead to

    s(D) = d(D) / G^{(2)}(D)
                                                     (5.8)
    r(D) = ( G^{(1)}(D) / G^{(2)}(D) ) d(D)

where G^{(1)}(D) and G^{(2)}(D) are the generator polynomials of the code shown in Figure 5.2.
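A sketch of this recursive systematic encoder in Python (our illustration; the taps follow (5.8), with feedback 1 + D + D^3 and redundancy 1 + D^2 + D^3):

def encode_rsc(bits):
    # RSC encoder [1, (1+D^2+D^3)/(1+D+D^3)] of Figure 5.3
    s1, s2, s3 = 0, 0, 0
    out = []
    for d in bits:
        s0 = d ^ s1 ^ s3        # feedback: division by 1 + D + D^3
        r = s0 ^ s2 ^ s3        # redundancy: multiplication by 1 + D^2 + D^3
        out += [d, r]           # systematic part d_i, then r_i
        s1, s2, s3 = s0, s1, s2
    return out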
Figure 5.6 – Tree diagram of the code with polynomials [1, 1 + D + D^3]. The binary pairs indicate the outputs of the encoder and the values in brackets are the future states.
Each transition between two states is represented by an arc between the two associated nodes, labelled with the outputs of the encoder. In the case of a binary code, the transition on an input at 0 (resp. 1) is represented by a dotted (resp. solid) line. The succession of states s_i up to instant t is represented by the different paths between the initial state and the different possible states at instant t.
Let us illustrate this with the example of the systematic encoder of Figure 5.1, assuming that the initial state s_0 is state (000).
Figure 5.8 – Trellis sections of the codes with generator polynomials [1, 1 + D + D^3] (a), [1 + D^2 + D^3, 1 + D + D^3] (b) and [1, (1 + D^2 + D^3)/(1 + D + D^3)] (c).
Such a representation shows the basic pattern of these trellises: the butterfly, so called because of its shape. Each of the sections of Figure 5.8 is thus made up of 4 butterflies (for instance, the transitions from states 0 and 1 towards states 0 and 4 make up one). The butterfly structure of the three trellises illustrated is identical but the coded sequences differ. It should be noted in particular that all the transitions arriving at the same node of a trellis of a non-recursive code are due to the same value at the input of the encoder. Thus, in the two non-recursive examples treated (Figures 5.8(a) and 5.8(b)), a transition associated with a 0 at the input necessarily arrives at one of the states between 0 and 3, and a transition associated with a 1 arrives at one of the states between 4 and 7. It is different in the case of a recursive code (like the one presented in Figure 5.8(c)): each state has one incoming transition associated with a 0 at the input and another associated with a 1. We shall see the consequences of this in Section 5.3.
Figure 5.9 – State machine of a code with generator polynomials [1, 1 + D + D^3].
Figure 5.11 – State machine of a code with generator polynomials [1, (1 + D^2 + D^3)/(1 + D + D^3)].
However, the recursive state machine allows another cycle on a null sequence at the input: state 4 → state 6 → state 7 → state 3 → state 5 → state 2 → state 1 → state 4. Moreover, this cycle is linked to the loop on state 0 by two transitions associated with inputs at 1 (transitions 0 → 4 and 1 → 0). There therefore exists an infinite number of input sequences with Hamming weight² equal to 2 producing a cycle on state 0. This weight of 2 is the minimum weight of any sequence that makes the recursive encoder leave state 0 and return to it. Because of the linearity of the code (see Chapter 1), this value of 2 is also the smallest distance that can separate two sequences with different inputs that make the encoder leave the same state and return to the same state.
In the case of non-recursive codes, the Hamming weight of the input se-
quences allowing a cycle on state 0 can only be 1 (state 0 → state 4 → state 2
→ state 1 → state 0). This distinction is essential for understanding the interest
of recursive codes used alone (see Section 5.3) or in a turbocode structure (see
Chapter 7).
² The Hamming weight of a binary sequence is equal to the number of bits equal to 1.
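The weight-2 property is easy to verify with a short state-update sketch (ours): the input sequence 1 0 0 0 0 0 0 1 (a 1, six 0s, then a 1) follows the transition 0 → 4, the seven-state cycle, and the closing transition 1 → 0.

def run_rsc(bits):                  # returns the final state, starting from 0
    s1, s2, s3 = 0, 0, 0
    for d in bits:
        s1, s2, s3 = d ^ s1 ^ s3, s1, s2
    return s1 << 2 | s2 << 1 | s3

print(run_rsc([1, 0, 0, 0, 0, 0, 0, 1]))   # 0: back to the all-zero state
print(run_rsc([1]))                        # 4: a single 1 leaves state 0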
Figure 5.12 – RTZ sequence (in bold) defining the free distance of the code with generator polynomials [1, 1 + D + D^3].
Figure 5.13 – RTZ sequences (in bold) defining the free distance of the code with generator polynomials [1 + D^2 + D^3, 1 + D + D^3].
Figure 5.14 – RTZ sequences (in bold) defining the free distance of the code with generator polynomials [1, (1 + D^2 + D^3)/(1 + D + D^3)].
For the non-recursive systematic code, the only sequence of this type has a weight equal to 1, which means that if the RTZ sequence is decided instead of the transmitted "all zero" sequence, only one bit is erroneous. In the case of the classical code, one input sequence has a weight of 1 and another a weight of 3: one or three bits are therefore wrong if such an RTZ sequence is decoded. In the case of the recursive systematic code, the RTZ sequences with minimum weight have an input weight of 3.
Knowledge of the minimum Hamming distance and of the input weight associated with it is not sufficient to evaluate closely the error probability at the output of the decoder of a simple convolutional code. To make this evaluation, it is necessary to compute the distances beyond the minimum Hamming distance, together with their weights. The result of this computation is called the distance spectrum.
Figure 5.15 – State machine of the code [1, 1 + D + D^3], modified for the computation of the associated transfer function: the loop on state (000) is split into a departure state a_e and an arrival state a_s, and states (001) to (111) are denoted b, c, d, e, f, g and h respectively.
Each transition has a label O^i I^j, where i is the weight of the coded sequence and j that of the sequence at the input of the encoder. In our example, j can take the value 0 or 1 according to the level of the bit at the input of the encoder at each transition, and i varies between 0 and 2, since 4 coded symbols are possible (00, 01, 10, 11), with weights between 0 and 2.
The transfer function T(O, I) of the code is then defined by:

    T(O, I) = a_s / a_e    (5.9)

To establish this function, we have to solve the system of equations coming from the relations between the 9 states (a_e, b, c, ..., h and a_s):
    b = c + Od
    c = Oe + f
    d = h + Og
    e = O^2 I a_e + OI b
    f = O^2 I c + OI d            (5.10)
    g = OI e + O^2 I f
    h = O^2 I h + OI g
    a_s = Ob
Solving this system gives the transfer function of the systematic code [1, 1 + D + D^3], whose series expansion begins:

    T(O, I) = IO^4
            + (I^4 + 2I^3 + 3I^2)O^6
            + (4I^5 + 6I^4 + 6I^3)O^8
            + (I^8 + 5I^7 + 21I^6 + 24I^5 + 17I^4 + I^3)O^10    (5.11)
            + (7I^9 + 30I^8 + 77I^7 + 73I^6 + 42I^5 + 3I^4)O^12
            + ···
For the classical non-systematic code of Figure 5.2, the same method gives:

    T(O, I) = (I^3 + I)O^6
            + (2I^6 + 5I^4 + 3I^2)O^8
            + (4I^9 + 16I^7 + 21I^5 + 8I^3)O^10
            + (8I^12 + 44I^10 + 90I^8 + 77I^6 + 22I^4)O^12    (5.12)
            + (16I^15 + 112I^13 + 312I^11 + 420I^9 + 265I^7 + 60I^5)O^14
            + ···
Likewise, the recursive systematic code already studied has as its transfer function:

    T(O, I) = 2I^3 O^6
            + (I^6 + 8I^4 + I^2)O^8
            + (8I^7 + 33I^5 + 8I^3)O^10
            + (I^10 + 47I^8 + 145I^6 + 47I^4 + I^2)O^12    (5.13)
            + (14I^11 + 254I^9 + 649I^7 + 254I^5 + 14I^3)O^14
            + ···
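The resolution of system (5.10) lends itself well to a computer algebra check. This sketch (ours, assuming sympy is available) recovers the first terms of (5.11) by treating (5.10) as a linear system in the intermediate states:

import sympy as sp

O, I, ae = sp.symbols('O I a_e')
b, c, d, e, f, g, h = sp.symbols('b c d e f g h')
eqs = [sp.Eq(b, c + O*d),
       sp.Eq(c, O*e + f),
       sp.Eq(d, h + O*g),
       sp.Eq(e, O**2*I*ae + O*I*b),
       sp.Eq(f, O**2*I*c + O*I*d),
       sp.Eq(g, O*I*e + O**2*I*f),
       sp.Eq(h, O**2*I*h + O*I*g)]
sol = sp.solve(eqs, [b, c, d, e, f, g, h], dict=True)[0]
T = sp.simplify(O * sol[b] / ae)     # T(O, I) = a_s/a_e with a_s = O*b
print(sp.series(T, O, 0, 13))        # I*O**4 + (...)*O**6 + ... as in (5.11)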
Comparing the transfer functions from the point of view of the monomial with the smallest degree allows us to appreciate the error correction capability at very high signal to noise ratio (asymptotic behaviour). Thus, the non-recursive systematic code is weaker than its rivals since it has a lower minimum distance. A classical code and its equivalent recursive systematic code have the same free distance, but their monomials of minimal degree differ. The first is in (I^3 + I)O^6 and the second in 2I^3O^6. This means that with the classical code an input sequence with weight 3 and another with weight 1 produce an RTZ sequence with weight 6, whereas with the recursive systematic code two sequences with weight 3 produce an RTZ sequence with weight 6. Thus, if an RTZ sequence with minimum weight is introduced by the noise, the classical code will produce one or three errors, whereas the recursive systematic code will produce three errors in either case. In conclusion, the probability of a binary error on such a sequence is lower with a classical code than with a recursive systematic code, which explains why the former is slightly better at high signal to noise ratio. Things are generally different when the codes are punctured (see Section 5.5) in order to reach higher rates [5.13].
To compare the performance of codes at low signal to noise ratio, we must consider all the monomials. Let us take the example of the monomials in O^12 of the non-recursive systematic code, the classical code and the recursive systematic code, which are, respectively:

    (7I^9 + 30I^8 + 77I^7 + 73I^6 + 42I^5 + 3I^4)O^12
    (8I^12 + 44I^10 + 90I^8 + 77I^6 + 22I^4)O^12
    (I^10 + 47I^8 + 145I^6 + 47I^4 + I^2)O^12
If 12 errors are introduced by the noise on the channel, 232 RTZ sequences are "available" as errors for the first code, 241 for the second and 241 again for the third. It is therefore (a little) less probable that an RTZ sequence will appear if the code used is the non-recursive systematic code. Moreover, the error expectancy per RTZ sequence of the three codes is 6.47, 7.49 and 6.00, respectively: the recursive systematic code therefore introduces, on average, fewer decoding errors than the classical code on RTZ sequences with 12 errors on the coded frame. This is also true for higher degree monomials. Recursive and non-recursive systematic codes are therefore more efficient at low signal to noise ratio than the classical code. Moreover, we find the monomials I^2 O^{8+4c}, where c is an integer, in the transfer function of the recursive code. The infinite number of monomials of this type is due to the existence of the cycle on a null input sequence distinct from the loop on state 0. Moreover, such a code does not provide any monomials of the form IO^c, unlike non-recursive codes. These conclusions concur with those drawn from the study of state machines in Section 5.2.
This notion of transfer function is therefore efficient for studying the per-
formance of a convolutional code. A derived version is moreover essential for
the classification of codes according to their performance. This is the distance
spectrum ω(d) whose definition is as follows:
    ( ∂T(O, I) / ∂I )_{I=1} = Σ_{d=d_f}^{∞} ω(d) O^d    (5.14)
For example, the first terms of the spectrum of the recursive systematic code, obtained from (5.13), are presented in Table 5.1. This spectrum is essential for estimating the performance of codes in terms of calculating their error probability, as illustrated in the vast literature on this subject [5.9].

    d      6    8    10    12    14    ...
    ω(d)   6    40   245   1446  8295  ...

Table 5.1 – First terms of the spectrum of the recursive systematic code with generator polynomials [1, (1 + D^2 + D^3)/(1 + D + D^3)].
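The spectrum of Table 5.1 can be checked from the truncated transfer function (5.13), again with sympy (our sketch):

import sympy as sp

O, I = sp.symbols('O I')
T = (2*I**3*O**6 + (I**6 + 8*I**4 + I**2)*O**8
     + (8*I**7 + 33*I**5 + 8*I**3)*O**10
     + (I**10 + 47*I**8 + 145*I**6 + 47*I**4 + I**2)*O**12
     + (14*I**11 + 254*I**9 + 649*I**7 + 254*I**5 + 14*I**3)*O**14)
print(sp.expand(sp.diff(T, I).subs(I, 1)))
# 6*O**6 + 40*O**8 + 245*O**10 + 1446*O**12 + 8295*O**14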
The codes used in the above examples have a rate of 1/2. By increasing the number n of redundancy bits, the rate becomes lower. In this case, the powers of O associated with the branches of the state machines are higher than or equal to those in the figures above. This leads to transfer functions with higher powers of O, that is, to RTZ sequences with greater Hamming weights. Codes with lower rates therefore have a higher error correction capability.
5.3.4 Performance
The performance of a code is defined by the decoding error probability after
transmission on a noisy channel. The previous section allows us to intuitively
compare non-recursive non-systematic, non-recursive systematic and recursive
systematic codes with the same constraint length. However, to estimate the
absolute performance of a code, we must be able to estimate the decoding error
probability as a function of the noise, or at least to bound it. The literature, for example [5.9], defines many bounds that are not described here, and we will limit ourselves to comparing the three categories of convolutional codes. To do this, the transmission over a Gaussian channel of blocks of 53 and then 200 bytes, coded according to the different schemes (classical, i.e. non-recursive non-systematic; non-recursive systematic; recursive systematic), was simulated (Figures 5.16 and 5.17).

Figure 5.16 – Comparison of simulated performance (Binary Error Rate and Packet Error Rate) of the three categories of convolutional codes after transmission of packets of 53 bytes over a Gaussian channel (decoding using the MAP algorithm).
The blocks were constructed following the classical trellis termination technique for the non-recursive codes, whereas the recursive code was terminated with the circular tail-biting technique (see Section 5.5). The decoding algorithm used is the MAP algorithm.
The BER curves are in perfect agreement with the conclusions drawn during the analysis of the free distance of the codes and of their transfer functions: the systematic code is not as good as the others at high signal to noise ratio, and the classical code is then slightly better than the recursive code. At low signal to noise ratios, the hierarchy is different: the recursive code and the systematic code are equivalent and better than the classical code.
Comparing performance as a function of the size of the frame (53 and 200
bytes) shows that the performance hierarchy of the codes is not modified. More-
over, the bit error rates are almost identical. This was predictable as the sizes
of the frames are large enough for the transfer functions of the codes not to be
affected by edge effects. However, the packet error rate is affected by the length of the blocks since, although the bit error probability is constant, the packet error probability increases with the block size.
The comparisons above only concern codes with 8 states. It is, however, easy to see that the performance of a convolutional code is linked to its capacity to provide information on the succession of data transmitted: the more the code can integrate successive data into its output symbols, the more it improves the quality of protection of these data. In other words, the greater the number of states (and therefore the size of the register of the encoder), the more efficient a convolutional code is (within its category). Let us compare three recursive systematic codes:
Decoding hardware for space probes was designed to process frames encoded with convolutional codes with 2 to 16384 states and rates much lower than 1/2 (16384 states and R = 1/6 for the Cassini probe to Saturn and the Mars Pathfinder probe). Why not use such codes for terrestrial radio-mobile transmissions for the general public? Because the complexity of the decoding would become unacceptable for current terrestrial transmissions using a reasonably-sized terminal operating in real time and fitting into a pocket.
Such algorithms are not described further in this book since, for turbo decoding, we prefer another family of algorithms relying on the minimization of the error probability of each symbol transmitted. Thus, the Maximum A Posteriori (MAP) algorithm enables the calculation of the exact value of the a posteriori probability associated with each transmitted symbol from the received sequence [5.1]. The MAP algorithm and its variants are described in Chapter 7.
• Calculate, for each branch, a branch metric d(T(i, s_{i−1}, s_i)). For a channel with binary output, this metric is defined as the Hamming distance between the symbol carried by the branch of the trellis and the received symbol: d(T(i, s_{i−1}, s_i)) = d_H(T(i, s_{i−1}, s_i)).
For a Gaussian channel, the metric is equal to the square of the Euclidean
distance between the branch considered and the observation at the input
of the decoder (see also Section 1.3):
    d(T(i, s_{i−1}, s_i)) = ||X_i − x_i||^2 + ||Y_i − y_i||^2
                          = Σ_{j=1}^{m} (x_i^{(j)} − X_i^{(j)})^2 + Σ_{j=1}^{n} (y_i^{(j)} − Y_i^{(j)})^2
• Calculate the accumulated metric associated with each branch T(i, s_{i−1}, s_i), defined by:

    λ(T(i, s_{i−1}, s_i)) = μ(i − 1, s_{i−1}) + d(T(i, s_{i−1}, s_i))

where μ(i − 1, s_{i−1}) is the accumulated metric associated with node s_{i−1}.
• For each node s_i, select the branch of the trellis corresponding to the minimum accumulated metric and store it in memory (in practice, it is the value of d_i associated with the branch that is stored). The path in the trellis made up of the branches successively memorized at the instants between 0 and i is the survivor path arriving in s_i. If the two paths that converge in s_i have identical accumulated metrics, the survivor is chosen arbitrarily between these two paths.
From the point of view of complexity, the Viterbi algorithm requires the calculation of 2^{ν+1} accumulated metrics at each instant i, and its complexity varies linearly with the length k of the sequence or the length l of the decoding window.
Figure 5.20 – Structure of the recursive systematic convolutional code (7,5) and associated trellis.
For each of the four nodes of the trellis, the value of di corresponding to the
transition of minimum accumulated metric λ is stored in memory.
Figure 5.21 – Survivor path traceback operation (in bold) in the trellis from instant i, determining the binary decision at instant i − 15.
After selecting the node with the minimum accumulated metric, denoted s (in the example of Figure 5.21, s = 3), we trace back in the trellis along the survivor path to a depth l = 15. At instant i − 15, the binary decision d̂_{i−15} is equal to the value of d_{i−15} stored in the memory associated with the survivor path.
The aim of applying the Viterbi algorithm with weighted inputs is to search for the codeword c at the shortest Euclidean distance from the received sequence. Equivalently (see Chapter 1), this also means looking for the codeword that maximizes the scalar product

    ⟨x, X⟩ + ⟨y, Y⟩ = Σ_{i=1}^{k} ( Σ_{l=1}^{m} x_i^{(l)} X_i^{(l)} + Σ_{l=1}^{n} y_i^{(l)} Y_i^{(l)} )

In this case, applying the Viterbi algorithm uses branch metrics of the form

    d(T(i, s_{i−1}, s_i)) = Σ_{l=1}^{m} x_i^{(l)} X_i^{(l)} + Σ_{l=1}^{n} y_i^{(l)} Y_i^{(l)}

and the survivor path then corresponds to the path with the maximum accumulated metric.
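As an illustration, here is a compact soft-input Viterbi decoder for the 4-state RSC code (7,5) of Figure 5.20 (our sketch: for simplicity it keeps full path lists instead of the depth-l traceback window described above, and assumes x and y hold the noisy ±1 observations of d_i and r_i):

def viterbi_rsc75(x, y):
    def step(s, d):                       # state s = (s1, s2)
        s1, s2 = s >> 1, s & 1
        s0 = d ^ s1 ^ s2                  # feedback 7 = 1 + D + D^2
        return (s0 << 1) | s1, s0 ^ s2    # next state, redundancy 5 = 1 + D^2
    metrics = [0.0] + [float('-inf')] * 3 # encoder assumed to start in state 0
    paths = [[], [], [], []]
    for xi, yi in zip(x, y):
        new_m = [float('-inf')] * 4
        new_p = [None] * 4
        for s in range(4):
            for d in (0, 1):
                t, r = step(s, d)
                m = metrics[s] + xi*(2*d - 1) + yi*(2*r - 1)   # correlation
                if m > new_m[t]:          # survivor = maximum metric
                    new_m[t], new_p[t] = m, paths[s] + [d]
        metrics, paths = new_m, new_p
    return paths[max(range(4), key=lambda s: metrics[s])]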
Figure 5.22 provides the performance of the two variants, with hard and
weighted inputs, of a decoder using the Viterbi algorithm for the code (7,5)
RSC for a transmission on a channel with additive white Gaussian noise. In
practice, we observe a gain of around 2 dB when we substitute weighted input
decoding for hard input decoding.
Figure 5.22 – Example of correction performance of the Viterbi algorithm with hard
inputs and with weighted inputs on a Gaussian channel. Recursive systematic convo-
lutional code (RSC) with generator polynomials 7 (recursivity) and 5 (redundancy).
Coding rate R = 1/2.
Most transmission systems use independent frame transmissions. Section 5.4.2 showed the importance of knowing the initial and final states of the encoder during the decoding of a frame. The technique usually employed to guarantee this knowledge is called trellis termination. It generally involves forcing the initial and final states to values known by the decoder (in general zero).
Figure 5.23 – Recursive systematic encoder with classical trellis termination: switch I is in position 1 while the k data bits are encoded, then in position 2 to force the register back to zero; the outputs d'_i and r_i are multiplexed.
After initializing the register to zero, switch I is kept in position 1 and data d_1 to d_k are encoded. At the end of this encoding operation, at instants k to k + ν, switch I is placed in position 2 and d_i takes the value coming from the feedback of the register, that is, a value that forces the register input to zero. Indeed, s_i^{(0)} is then the result of a modulo-2 sum of two identical terms. As for the encoder, it continues to produce the associated redundancies r_i.
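A sketch of this termination for the feedback polynomial 1 + D + D^3 (our illustration): feeding the register its own feedback value for ν = 3 instants drives it back to the all-zero state.

def terminate(s1, s2, s3):
    tail = []                       # the nu tail bits d_k ... d_{k+nu-1}
    for _ in range(3):
        d = s1 ^ s3                 # switch I in position 2: d_i = feedback
        tail.append(d)              # hence s^(0) = d ^ s1 ^ s3 = 0
        s1, s2, s3 = 0, s1, s2
    assert (s1, s2, s3) == (0, 0, 0)
    return tail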
This classical termination has one main drawback: the protection of the data
is not independent of their position in the frame. In particular, this can lead to
edge effects in the construction of a turbocode (see Chapter 7).
Tail-biting
A technique was introduced in the 1970s and 1980s [5.12] to terminate the trellis of convolutional codes without edge effects: tail-biting. This involves making the decoding trellis circular, that is, ensuring that the departure and final states of the encoder are identical. This state is then called the circulation state. The technique is trivial for non-recursive codes, as the circulation state is merely given by the last ν bits of the sequence to encode. For RSC codes, tail-biting requires the operations described in the following. The trellis of such a code, called a circular recursive systematic code (CRSC), is shown in Figure 5.24.
Figure 5.24 – Circular trellis of a CRSC code: the eight states (000) to (111) at the start and at the end of the frame are identical.
The state of the encoder at instant i + 1 is given by the recurrence

    s_{i+1} = A s_i + B d_i    (5.15)

where A is the state matrix and B the input matrix. In the case of the recursive systematic code with generator polynomials [1, (1 + D^2 + D^3)/(1 + D + D^3)] mentioned above, these matrices are

        | 1 0 1 |               | 1 |
    A = | 1 0 0 |   and   B  =  | 0 |
        | 0 1 0 |               | 0 |
If the encoder is initialized to state 0 (s_0 = 0), the final state s⁰_k obtained at the end of a frame of length k is:

    s⁰_k = Σ_{j=1}^{k} A^{j−1} B d_{k−j}    (5.16)

When it is initialized in any state s_c, the final state s_k is expressed as follows:

    s_k = A^k s_c + Σ_{j=1}^{k} A^{j−1} B d_{k−j}    (5.17)

For this state s_k to be equal to the departure state s_c, and for the latter therefore to become the circulation state, it is necessary and sufficient that:

    (I − A^k) s_c = Σ_{j=1}^{k} A^{j−1} B d_{k−j}    (5.18)
    s⁰_k \ k mod 7    1   2   3   4   5   6
    0                 0   0   0   0   0   0
    1                 6   3   5   4   2   7
    2                 4   7   3   1   5   6
    3                 2   4   6   5   7   1
    4                 7   5   2   6   1   3
    5                 1   6   7   2   3   4
    6                 3   2   1   7   4   5
    7                 5   1   4   3   6   2

Table 5.2 – Table of the CRSC code with generator polynomials [1, (1 + D^2 + D^3)/(1 + D + D^3)] providing the circulation state as a function of k mod 7 (k being the length of the frame at the input) and of the terminal state s⁰_k obtained after encoding initialized to state 0.
In practice, the circular encoding of a frame thus involves the following steps:
1. Pre-compute and store, once and for all, the circulation state tables (such as Table 5.2);
2. Encode the frame from state 0 and note the terminal state s⁰_k;
3. Calculate the circulation state s_c from the tables already calculated and stored;
4. Encode the frame again, this time from state s_c.
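The circulation state can also be computed directly from (5.18); this sketch (ours, using numpy, with subtraction replaced by addition over GF(2)) reproduces Table 5.2:

import numpy as np

A = np.array([[1, 0, 1], [1, 0, 0], [0, 1, 0]])
bits = lambda s: np.array([s >> 2 & 1, s >> 1 & 1, s & 1])

def circulation_state(s0k, k):
    Ak = np.eye(3, dtype=int)
    for _ in range(k):
        Ak = (Ak @ A) % 2                  # A^k over GF(2)
    M = (np.eye(3, dtype=int) + Ak) % 2    # I - A^k = I + A^k over GF(2)
    for sc in range(8):                    # brute-force solve M sc = s0k
        if np.array_equal((M @ bits(sc)) % 2, bits(s0k)):
            return sc

print(circulation_state(s0k=2, k=8))       # k mod 7 = 1, s0k = 2 -> sc = 4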
5.5.2 Puncturing
Some applications can only allocate a small space for the redundant part of the
codewords. But, by construction, the natural rate of a systematic convolutional
code is m/(m + n), where m is the number of input bits di of the encoder and n
is the number of output bits. It is therefore maximum when n = 1 and becomes
R = m/(m + 1). High rates can therefore only be obtained with high values
of m. Unfortunately, the number of transitions leaving any one node of the trellis is 2^m. In other words, the complexity of the trellis, and therefore of the decoding, increases exponentially with the number of input bits of the encoder.
Therefore, this solution is generally not satisfactory. It is often avoided in favour
of a technique with a slightly lower error correction capability, but easier to
implement: puncturing.
The puncturing technique is commonly used to obtain high rates. It involves using an encoder with a low value of m (1 or 2, for example), to keep a reasonable decoding complexity, but transmitting only part of the coded bits. An example is proposed in Figure 5.25. In this example, a rate-1/2 encoder produces outputs d_i and r_i at each instant i. Only 3 bits out of 4 are transmitted, which leads to a global rate of 2/3. The pattern in which the bits are punctured is called the puncturing mask.
Figure 5.26 – Trellis diagram of the punctured recursive code for a rate of 2/3.
The most widely used decoding technique involves taking the decoder of the original code and inserting neutral values in place of the punctured elements. The neutral values represent information that is a priori unknown. In the usual case of a transmission using antipodal signalling (+1 for the logical '1', −1 for the logical '0'), the null value (analogue 0) is taken as the neutral value.
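A sketch of this mechanism in Python (ours; the mask assumed here keeps d_i at every instant and r_i at even instants only, which matches the 3-bits-out-of-4, rate-2/3 example):

def puncture(d, r):
    out = []
    for i in range(len(d)):
        out.append(d[i])
        if i % 2 == 0:                 # mask: r_i transmitted on even i only
            out.append(r[i])
    return out

def depuncture(received):
    d, r, i, pos = [], [], 0, 0
    while pos < len(received):
        d.append(received[pos]); pos += 1
        if i % 2 == 0:
            r.append(received[pos]); pos += 1
        else:
            r.append(0.0)              # neutral value: a priori unknown bit
        i += 1
    return d, r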
The introduction of puncturing increases the coding rate but, of course, decreases the correction capability. Thus, in the example of Figure 5.26, the free distance of the code is reduced from 6 to 4 (an associated RTZ sequence is shown in the figure). Likewise, Figure 5.27, which presents the error rate curves of the code [1, (1 + D^2 + D^3)/(1 + D + D^3)] for rates 1/2, 2/3, 3/4 and 6/7, shows a decrease in error correction capability as the coding rate increases.
The choice of puncturing mask obviously influences the performance of the code. It is thus possible to favour one part of the frame, transporting sensitive data, by puncturing it only slightly, to the detriment of another part that is more heavily punctured. A regular mask is, however, often chosen as it is simple to implement.
Bibliography
[5.1] L. R. Bahl, J. Cocke, F. Jelinek, and J. Raviv. Optimal decoding of
linear codes for minimizing symbol error rate. IEEE Transactions on
Information Theory, IT-20:284–287, March 1974.