
Information Theory and Coding

M.Sc. Marko Hennhöfer


Ilmenau University of Technology
Communications Research Laboratory

Winter Semester 2011


Contents
1 Review
1.1 Fourier transformation
1.2 Convolution, continuous, discrete, matrix-vector version
1.3 Stochastics, PDF, CDF, moments
2 Information theory
2.1 Information, entropy, differential entropy
2.2 Mutual information, channel capacity
3 Source coding
3.1 Fano coding
3.2 Huffman coding
4 Channel coding
4.1 Block codes, asymptotic coding gains
4.2 Convolutional codes, trellis diagram, hard-/soft decision decoding
4.3 Turbo Codes
4.4 LDPC codes

Literature
Thomas M. Cover, Joy A. Thomas, Elements of Information Theory.
John Wiley & Sons, 2nd edition, 2006.
J. Proakis, Digital Communications.
John Wiley & Sons, 4th edition, 2001.
Branka Vucetic, Jinhong Yuan, Turbo Codes: Principles and Applications.
Kluwer Academic Publishers, 2000.


1 Review
Some references to refresh the basics:
S. Haykin and B. V. Veen, Signals and Systems. John Wiley & Sons,
second edition, 2003.
E. W. Kamen and B. S. Heck, Fundamentals of Signals and Systems Using the Web
and MATLAB. Pearson Prentice Hall, 3rd edition, 2007.
A. D. Poularikas, Signals and Systems Primer with MATLAB. CRC Press,
2007.
S. Haykin, Communication Systems. John Wiley & Sons, 4th edition,
2001
A. Papoulis, Probability, Random Variables, and Stochastic Processes.
McGraw-Hill, 2nd edition, 1984.
G. Strang, Introduction to Linear Algebra. Wellesley-Cambridge Press,
Wellesley, MA, 1993.


2 Information Theory
Overview: communication system

[Block diagram: Source (certain entropy; emits e.g. a,b,a,c,...) → Source coder (removes redundancy, efficient mapping, output e.g. 11,01,...) → Channel coder (adds useful redundancy, e.g., for FEC, output e.g. 110,011,...) → Line coder → Modulation → Physical channel → Demodulation → Line decoder → Channel decoder → Source decoder → Sink. Everything between channel coder output and channel decoder input forms the discrete channel. Cf. Communications Engineering lecture, Dr. Mike Wolf.]

2.1 Information, entropy


Discrete source: emits symbols (e.g., a, b, a, c, ...) from a given alphabet, modelled via a random variable S with probabilities of occurrence p_k.
Discrete memoryless source: subsequent symbols are statistically independent.

2.1 Information, entropy


What is the amount of information produced by this source?
If a symbol occurs with probability 1, there is no uncertainty and no surprise, i.e., no information. For small probabilities of occurrence the surprise (information) is higher than for large ones.
Occurrence of an event: the information gain (removal of uncertainty) is the information of the event.

2.1 Information, entropy


Properties of information:
- The occurrence of an event yields a gain of information (or no information) but never a loss of information.
- The event with the lower probability of occurrence has the higher information.
- For statistically independent events the information adds: the information of the joint event is the sum of the individual informations.

2.1 Information, entropy


The basis of the logarithm can be chosen arbitrarily. Usually base 2 is used:
I(s_k) = log2(1/p_k) = -log2(p_k), measured in bit.
1 bit is the information gained if one of two equally probable events occurs.
I(s_k) is a discrete random variable with probability of occurrence p_k.

2.1 Information, entropy


Entropy: the mean information of a source (here: a discrete memoryless source with alphabet S and K symbols):
H(S) = E{I(s_k)} = sum_k p_k log2(1/p_k)  bit/symbol.
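Not from the slides: a minimal Python sketch of this definition, evaluated for the 8-symbol source that is used later in the Fano and Huffman examples (the probability values are taken from there).

```python
import math

def entropy(probs):
    """Entropy H(S) in bit/symbol of a discrete memoryless source."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Probabilities of the 8-symbol source used in the Fano/Huffman examples below.
p = [0.3, 0.15, 0.14, 0.12, 0.1, 0.08, 0.06, 0.05]
print(round(entropy(p), 3))   # ~2.78 bit/symbol
print(math.log2(len(p)))      # upper bound log2(K) = 3 bit/symbol
```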

2.1 Information, entropy


Important properties of the entropy: 0 <= H(S) <= log2(K), where K is the number of symbols in S.
H(S) = 0: no uncertainty; one symbol occurs with probability 1.
H(S) = log2(K): maximum uncertainty; all symbols occur with the same probability 1/K.

2.1 Information, entropy


Bounds for the entropy:
Lower bound: H(S) >= 0.
Upper bound: use the inequality ln(x) <= x - 1 and compare two distributions p_k and q_k for the alphabet S.
[Figure: y = ln(x) and y = x - 1; the logarithm lies below the straight line, touching it at x = 1.]

2.1 Information, entropy


Upper bound for the entropy, continued: this yields Gibbs' inequality,
sum_k p_k log2(q_k / p_k) <= 0.
Now assume q_k = 1/K for all k; this gives H(S) <= log2(K).

2.1 Information, entropy


Summary:
H(S): entropy of the current source; H_max = log2(K): entropy of the best source.
Redundancy of the source: H_max - H(S); relative redundancy: 1 - H(S)/H_max.
High redundancy of a source is a hint that compression methods will be beneficial.
E.g., fax transmission: ~90% white pixels -> low entropy (compared to the best source) -> high redundancy of the source; the redundancy is lowered by run-length encoding.

2.1 Information, entropy


Example: entropy of a memoryless binary source.
Symbol 0 occurs with probability p0, symbol 1 with probability 1 - p0.
Entropy: H(p0) = -p0 log2(p0) - (1 - p0) log2(1 - p0), the entropy function (Shannon's function).
Characteristic points: H(0) = H(1) = 0; maximum H(0.5) = 1 bit.
[Figure: H(p0) in bit versus p0, peaking at 1 bit for p0 = 0.5.]

2.1 Information, entropy


Extended (memoryless) sources: combine n primary symbols from S into a block, i.e., a secondary symbol of the extended source.
Example: for an alphabet with 3 symbols and n = 2, the extended source has 3^n = 9 symbols; for a memoryless source the entropy of the extended source is n times the entropy of the primary source.

2.2 Source Coding


Source coding theorem (Shannon): efficient representation (coding) of data from a discrete source.
- Depends on the statistics of the source: short code words for frequent symbols, long code words for rare symbols.
- Code words must be uniquely decodable.
[Block diagram: Source (K different symbols, e.g., a,b,a,c,...) → Source coder → binary code words (e.g., 11,01,...). Symbol s_k has probability of occurrence p_k and code word length l_k.]

2.2 Source Coding


Source coding theorem (Shannon):
Mean code word length (to be made as small as possible): L = sum_k p_k l_k.
Given a discrete source with entropy H(S): for uniquely decodable codes the entropy is the lower bound for the mean code word length, L >= H(S).
Efficiency of a code: eta = H(S) / L.
Redundancy of the coding: L - H(S); relative redundancy: 1 - eta.

2.2 Source Coding


Fano coding
An important group of prefix codes: each symbol gets a code word assigned whose length approximately matches its information.
Fano algorithm:
1. Sort the symbols with decreasing probabilities. Split the symbols into two groups with approximately half of the summed probability each.
2. Assign 0 to one group and 1 to the other group.
3. Continue splitting within the groups.
Fano coding, example: code the symbols S = {a, b, c, d, e, f, g, h} efficiently, with probabilities of occurrence p_k = {0.15, 0.14, 0.3, 0.1, 0.12, 0.08, 0.06, 0.05}.

2.2 Source Coding


Fano coding, example (symbols sorted by decreasing probability):

Symbol  prob.  CW    l_k / bit
c       0.3    00    2
a       0.15   01    2
b       0.14   100   3
e       0.12   101   3
d       0.1    1100  4
f       0.08   1101  4
g       0.06   1110  4
h       0.05   1111  4

Source entropy H(S) ≈ 2.78 bit/symbol; mean CW length L = 2.84 bit; redundancy ≈ 0.06 bit/symbol; efficiency ≈ 0.98.
On average 0.06 bit/symbol more needs to be transmitted than information is provided by the source, e.g., 1000 bit of source information -> about 1022 bits to be transmitted.

2.2 Source Coding


Huffman coding
An important group of prefix codes: each symbol gets a code word assigned whose length approximately matches its information.
Huffman coding algorithm (a small sketch follows below):
1. Sort the symbols with decreasing probabilities. Assign 0 and 1 to the symbols with the two lowest probabilities.
2. Combine both symbols to a new symbol with the sum of the probabilities. Re-sort the symbols with decreasing probabilities.
3. Repeat until the code tree is complete.
4. Read out the code words from the back of the tree.
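A minimal Python sketch of this construction (not part of the slides; it uses a heap-based formulation and the symbol probabilities of the example on the next slide). It reproduces the mean code word length computed there; the exact code words may differ from the slides because ties can be broken differently.

```python
import heapq

def huffman_code(probs):
    """Huffman code for a dict symbol -> probability; returns symbol -> code word."""
    heap = [(p, i, [s]) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    code = {s: "" for s in probs}
    count = len(heap)
    while len(heap) > 1:
        p0, _, group0 = heapq.heappop(heap)      # the two lowest probabilities ...
        p1, _, group1 = heapq.heappop(heap)
        for s in group0:                         # ... get a 0 / 1 prepended
            code[s] = "0" + code[s]
        for s in group1:
            code[s] = "1" + code[s]
        heapq.heappush(heap, (p0 + p1, count, group0 + group1))  # combined symbol
        count += 1
    return code

p = {"c": 0.3, "a": 0.15, "b": 0.14, "e": 0.12, "d": 0.1, "f": 0.08, "g": 0.06, "h": 0.05}
cw = huffman_code(p)
print(cw)
print(round(sum(p[s] * len(cw[s]) for s in p), 2))   # mean CW length ~2.81 bit
```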

2.2 Source Coding


Huffman coding, example (same source as before; the intermediate sums 0.11, 0.14, 0.18, 0.23, 0.29, 0.41, 0.59 arise while merging):

Symbol  prob.  CW
c       0.3    00
a       0.15   010
b       0.14   011
e       0.12   100
d       0.1    110
f       0.08   111
g       0.06   1010
h       0.05   1011

Mean CW length L ≈ 2.81 bit; redundancy ≈ 0.03 bit/symbol; efficiency ≈ 0.99.
On average 0.03 bit/symbol more needs to be transmitted than information is provided by the source, e.g., 1000 bit of source information -> about 1011 bits to be transmitted.

2.3 Differential entropy


Continuous (analog) source: modelled via a continuous random variable X with pdf f_X(x).
Differential entropy: h(X) = -∫ f_X(x) log2 f_X(x) dx.
Example: Gaussian RV with pdf f_X(x) = 1/sqrt(2πσ²) · exp(-(x-μ)²/(2σ²)); its differential entropy is h(X) = ½ log2(2πeσ²).
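A small aside (not in the original): the closed-form Gaussian result h(X) = ½ log2(2πeσ²) evaluated in Python; note that, unlike the discrete entropy, it becomes negative for sufficiently small variance.

```python
import math

def gaussian_differential_entropy(sigma2):
    """Differential entropy h(X) in bit of a Gaussian RV with variance sigma2."""
    return 0.5 * math.log2(2 * math.pi * math.e * sigma2)

for sigma2 in (0.01, 0.5, 1.0, 2.0):
    print(sigma2, round(gaussian_differential_entropy(sigma2), 3))
```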

2.4 The discrete channel


The discrete channel
[Block diagram: channel coder → line coder → modulation → physical channel → demodulation → line decoder → channel decoder. Everything between channel coder output (input alphabet) and channel decoder input (output alphabet) forms the discrete channel.]

2.4 The discrete channel


Discrete channel:
- Input alphabet X with a finite number of values/symbols. Easiest case: two values, i.e., binary codes. Commonly used: symbols are bit groups.
- Output values Y.
Hard decision: the decoder estimates the transmitted values directly, e.g., in the binary case the output alphabet equals the input alphabet.
Soft decision: Y has more values than X; extreme case: continuous-valued output. This allows measures for the reliability of the decision.

2.4 The discrete channel


Conditional probabilities / transition probabilities: P(y_j | x_i) is the conditional probability that y_j is received if x_i has been transmitted; input and output are modelled as random variables X and Y.
Discrete memoryless channel (DMC): subsequent symbols are statistically independent, so the transition probability of a sequence is the product of the per-symbol transition probabilities.
Example: probability that 00 is received if 01 has been transmitted: P(00|01) = P(0|0)·P(0|1).

2.4 The discrete channel


Symmetric hard-decision DMC: the output alphabet equals the input alphabet and the transition probabilities are symmetric; each symbol is received incorrectly with the symbol error probability p_e.
Special case of two symbols: the binary symmetric channel (BSC).

2.4 The discrete channel


Binary symmetric channel (BSC): each bit is flipped with probability p_e and received correctly with probability 1 - p_e.
Example: probability to receive 101 if 110 has been transmitted: P(101|110) = (1 - p_e)·p_e·p_e.

2.4 The discrete channel


Binary symmetric channel (BSC), important formulas:
1. Error event: probability that within a sequence of length n at least one error occurs: 1 - (1 - p_e)^n.
2. Probability that m specific bits are erroneous in a sequence of length n: p_e^m (1 - p_e)^(n-m).
3. Probability for m errors (at arbitrary positions) in a sequence of length n: C(n,m) p_e^m (1 - p_e)^(n-m).
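A direct Python transcription of these three formulas (a sketch; the numerical values are arbitrary):

```python
from math import comb

def p_at_least_one_error(pe, n):
    """1. Probability of at least one error in a sequence of length n."""
    return 1 - (1 - pe) ** n

def p_specific_errors(pe, n, m):
    """2. Probability that m specific bits are erroneous (and the other n-m correct)."""
    return pe ** m * (1 - pe) ** (n - m)

def p_m_errors(pe, n, m):
    """3. Probability for exactly m errors anywhere in a sequence of length n."""
    return comb(n, m) * p_specific_errors(pe, n, m)

pe, n = 0.01, 7
print(p_at_least_one_error(pe, n), p_specific_errors(pe, n, 2), p_m_errors(pe, n, 2))
```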

2.4 The discrete channel


Binary symmetric erasure channel (BSEC): besides 0 and 1 the output alphabet contains an erasure symbol; this is the simplest form of a soft-decision output.

2.4 The discrete channel


Entropy diagram:
[Figure: source → channel → receiver. The mean transmitted information H(X) enters the channel; the equivocation H(X|Y) is lost, the irrelevance H(Y|X) is added, and the mean received information H(Y) is observed at the receiver.]

2.4 The discrete channel


Explanation:
- H(X): source entropy, i.e., the mean information emitted by the source.
- H(Y): mean information observed at the receiver.
- H(Y|X): irrelevance, i.e., the uncertainty over the output if the input is known.
- H(X|Y): equivocation, i.e., the uncertainty over the input if the output is observed.
- I(X;Y): transinformation or mutual information, i.e., the information of the input which is contained in the output.

2.4 The discrete channel


Important formulas:
Input entropy: H(X) = -sum_i P(x_i) log2 P(x_i).
Output entropy: H(Y) = -sum_j P(y_j) log2 P(y_j).

2.4 The discrete channel


Irrelevance:
First consider only one input value x_i: H(Y|X = x_i) = -sum_j P(y_j|x_i) log2 P(y_j|x_i).
Then take the mean over all possible input values: H(Y|X) = sum_i P(x_i) H(Y|X = x_i).

2.4 The discrete channel


Equivocation, analogously: H(X|Y) = sum_j P(y_j) H(X|Y = y_j).

2.4 The discrete channel


Mutual information: I(X;Y) = H(X) - H(X|Y) = H(Y) - H(Y|X).

2.4 The discrete channel


Mutual information & channel capacity (BSC): the maximum mutual information occurs for p0 = 1/2, independent of p_e; i.e., for p0 = 1/2 we can calculate the channel capacity for given values of p_e.
[Figure: mutual information I(X;Y) versus the input probability p0 for p_e = 0.01, 0.1, 0.2, 0.3; every curve peaks at p0 = 1/2.]
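A small Python check of this statement (not from the slides): the mutual information of the BSC, I(X;Y) = H(Y) - H(Y|X), peaks at p0 = 1/2, where it equals the capacity 1 - Hb(pe).

```python
import math

def hb(p):
    """Binary entropy function in bit."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_mutual_information(p0, pe):
    """I(X;Y) of a BSC with input probability P(X=0) = p0 and error probability pe."""
    py0 = p0 * (1 - pe) + (1 - p0) * pe          # P(Y=0)
    return hb(py0) - hb(pe)                      # H(Y) - H(Y|X)

for pe in (0.01, 0.1, 0.2, 0.3):
    print(pe, round(bsc_mutual_information(0.5, pe), 3), round(1 - hb(pe), 3))
```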

2.5 The AWGN channel


AWGN (Additive White Gaussian Noise) channel:
[Block diagram: channel coder → modulation → physical channel (adds white Gaussian noise) → demodulation → channel decoder.]
White noise has a flat PSD, hence infinite bandwidth and therefore infinite power; its ACF is a Dirac impulse. The demodulator limits the bandwidth, and the noise variance at the sampling times computes to σ² = N0/2. See the Communications Engineering lecture for details.

2.5 The AWGN channel


Noise example:
[Figure: sample realizations w(t)/V over t/s, and the Gaussian PDF of the amplitudes f_W(w) with standard deviation and variance indicated.]

2.5 The AWGN channel


Simplified model: channel coder → AWGN → channel decoder, with the noise samples assumed statistically independent.
Binary example: the conditional PDF f_Y|X(y | ±sqrt(E_b)) is a Gaussian centred at ±sqrt(E_b).
[Figure: f_Y|X(y | sqrt(E_b)) versus y.]

2.5 The AWGN channel


Error probability:
[Figure: the two conditional PDFs with the decision boundary at y = 0; the error probability is the Gaussian tail area beyond the boundary.]

2.5 The AWGN channel


AWGN channel, binary input, BER performance (uncoded): P_b = Q(sqrt(2 E_b/N0)).
[Figure: bit error rate from 10^-2 down to 10^-10 versus E_b/N0 in dB.]

2.5 The AWGN channel


Bounds for the Q-function:
[Figure: the exact Q-function versus E_b/N0 in dB together with upper bounds and a lower bound.]
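A minimal Python sketch (assuming BPSK over AWGN and expressing Q via the complementary error function) that evaluates the exact Q-function, a Chernoff-type upper bound, and the resulting uncoded bit error rate:

```python
import math

def Q(x):
    """Gaussian tail probability, Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def Q_upper(x):
    """A common upper bound: Q(x) <= 0.5 * exp(-x^2 / 2)."""
    return 0.5 * math.exp(-x * x / 2)

def uncoded_ber(ebn0_db):
    """Uncoded binary transmission over AWGN: Pb = Q(sqrt(2 Eb/N0))."""
    return Q(math.sqrt(2 * 10 ** (ebn0_db / 10)))

for ebn0_db in (0, 4, 8, 10):
    x = math.sqrt(2 * 10 ** (ebn0_db / 10))
    print(ebn0_db, uncoded_ber(ebn0_db), Q_upper(x))
```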

2.5 The AWGN channel


Entropy diagram for the continuous-valued input and output:
[Figure: source → channel → receiver with transmitted differential entropy h(X), received differential entropy h(Y), differential irrelevance h(Y|X) and differential equivocation h(X|Y).]

2.5 The AWGN channel


Differential entropies: h(X), h(Y), h(Y|X), h(X|Y), defined like in the discrete case but with integrals over the pdfs.
Mutual information: I(X;Y) = h(Y) - h(Y|X) = h(X) - h(X|Y).

2.5 The AWGN channel


AWGN channel model: Y = X + N.
- X, Y, N: random variables containing the sampled values of the input, the output, and the noise process.
- N: Gaussian distributed with variance σ² = N0/2.
- X: input signal, power limited.
Channel capacity: C = max I(X;Y) over all input distributions satisfying the power limit; the maximum is attained by a Gaussian input.

2.5 The AWGN channel


Mutual information: I(X;Y) = h(Y) - h(Y|X) = h(Y) - h(N), since the noise is independent of the input.
AWGN channel capacity: maximizing h(Y) with a Gaussian input yields C = ½ log2(1 + P/σ²).

2.5 The AWGN channel


AWGN channel capacity: C = ½ log2(1 + SNR) in bits per transmission, or bits per channel use.
How does the AWGN channel capacity look as a function of the SNR and in bits per second?
Example: assume a transmission with a binary modulation scheme and a given bit rate in bit/s; the noise has a flat PSD of N0/2.

2.5 The AWGN channel


PSD of the sampled signal and band-limited noise process: sampling at the Nyquist rate of 2W, i.e., we use the channel 2W times per second; the noise power is N = N0·W.
With 2W channel uses per second the capacity in bits per second becomes
C = 2W · ½ log2(1 + P/(N0 W)) = W log2(1 + P/(N0 W)).
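A short Python sketch of these two capacity expressions (the numbers are arbitrary example values):

```python
import math

def capacity_per_use(snr):
    """AWGN capacity in bit per channel use: C = 0.5 * log2(1 + SNR)."""
    return 0.5 * math.log2(1 + snr)

def capacity_bps(P, N0, W):
    """Shannon-Hartley: C = W * log2(1 + P/(N0*W)) in bit/s (2W channel uses per second)."""
    return W * math.log2(1 + P / (N0 * W))

P, N0, W = 1e-3, 1e-9, 1e6      # 1 mW transmit power, N0 = 1e-9 W/Hz, 1 MHz bandwidth
print(capacity_per_use(P / (N0 * W)), "bit per channel use")
print(capacity_bps(P, N0, W) / 1e6, "Mbit/s")
```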

2.5 The AWGN channel

Normalized capacity / spectral efficiency C/W in bit/s/Hz as a function of E_b/N0:
[Figure: spectral bit rate (from 1/10 to 10 bit/s/Hz) versus E_b/N0 in dB (up to 30 dB). The capacity boundary separates the region where error-free transmission is possible with a certain amount of channel coding from the region where no error-free transmission is possible. Shannon limit: E_b/N0 = -1.6 dB.]

3 Channel Coding
Channel coding: the channel coder maps info words of length k (e.g., 11, 01, ...) to code words of length n (e.g., 110, 011, ...), adding useful redundancy, e.g., for FEC.
This defines an (n,k) block code with code rate R = k/n < 1. Example: (3,1) repetition code.
The code bit rate is the info bit rate divided by R; this bandwidth expansion factor results in an increased data rate.

3 Channel Coding
Code properties:
- Systematic codes: the info words occur as a part of the code words.
- Code space: the set of 2^k code words within the 2^n possible words of length n.
- Linear codes: the sum of two code words (bit-by-bit modulo-2 addition without carry) is again a code word.

3 Channel Coding
Code properties:
Minimum Hamming distance d_min: a measure of how different the most closely located code words are.
Example: compare all pairs of code words.
For linear codes the comparison simplifies to finding the nonzero code word with the lowest Hamming weight.
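For small linear codes this weight search is easy to do exhaustively. A Python sketch (the systematic (7,4) Hamming generator matrix below is one common choice, not necessarily the exact one used later in the slides):

```python
from itertools import product

G = [                       # G = [I | P] of a (7,4) Hamming code
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

def encode(u, G):
    """Code word x = u G over GF(2)."""
    return tuple(sum(u[i] * G[i][j] for i in range(len(G))) % 2 for j in range(len(G[0])))

codewords = [encode(u, G) for u in product([0, 1], repeat=len(G))]
d_min = min(sum(x) for x in codewords if any(x))     # minimum weight of nonzero code words
print(len(codewords), "code words, d_min =", d_min)  # 16 code words, d_min = 3
```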

3 Channel Coding
Maximum likelihood decoding (MLD). Goal: minimum word error probability.
[Chain: info words (11,01,...) → channel coder → code words (110,011,...) → discrete channel → received words (100,011,...) → code word estimator → encoder inverse → estimated info words.]
Code word estimator: the mapping from all 2^n possible received words to the 2^k valid code words.
Example: (7,4) Hamming code: 2^7 = 128 possible received words, 2^4 = 16 valid code words.

3 Channel Coding
Decoding rule:
Assumption: equal a-priori probabilities, i.e., all 2^k code words appear with probability 1/2^k.
Probability for a wrong detection if a certain code word was transmitted: the sum of the probabilities to receive a word that yields an estimate different from the transmitted code word.

3 Channel Coding
Example: (n=3, k=1) repetition code.
Assumption: equal a-priori probabilities, i.e., each of the 2^k = 2^1 = 2 code words (111, 000) appears with probability 1/2.
Probability for a wrong detection if a certain code word, e.g., 111, was transmitted over a BSC: consider all received words that yield a wrong estimate.

Received y:  000  001  010  011  100  101  110  111
Decoded:     000  000  000  111  000  111  111  111

P(wrong | 111) = P(000|111) + P(001|111) + P(010|111) + P(100|111)
             = p_e·p_e·p_e + p_e·p_e·(1-p_e) + p_e·(1-p_e)·p_e + (1-p_e)·p_e·p_e
             = p_e^3 + 3 p_e^2 (1-p_e).

3 Channel Coding
Probability for a wrong detection, now considering all possibly transmitted code words: take the mean over all transmitted code words. Combining the sums:
P(wrong detection) = P(any detection) - P(correct detection) = 1 - sum over the code words of P(code word) · P(correct detection | code word).

3 Channel Coding
Probability for a wrong detection: to minimize it, choose for each received word y the code word for which P(y | code word) is maximized. For a BSC (p_e < 0.5) this probability is maximized by choosing the code word with the minimum distance to the received word y.

3 Channel Coding
MLD for hard-decision DMC: find the code word with minimum Hamming distance to the received word.
MLD for soft-decision AWGN: find the code word with minimum Euclidean distance to the received word.

3 Channel Coding
Coding gain:
Suitable measures: the bit error probability P_b (considered only for the k info bits) and the code word error probability P_w.
Example: transmit 10 code words with k = 4 info bits each and let 1 bit error occur: 1 wrong bit yields 1 wrong code word, while 40 info bits have been transmitted, i.e., P_w = 1/10 and P_b = 1/40.
As in general more than one error can occur in a code word, we can only approximate P_b from P_w.

3 Channel Coding
If we consider that a decoding error occurs only if more than t bits are wrong, the approximation can be refined accordingly.
Comparison of codes considering the AWGN channel: energy per bit vs. energy per coded bit (for constant transmit power): E_c = R·E_b.
Example: (3,1) repetition code: the energy of one info bit is spread over 3 coded bits, so each coded bit carries only one third of the energy.

3 Channel Coding
Example:
[Figure: BER performance using the (7,4) Hamming code; uncoded reference, P_b hard decision (approx.) and P_b soft decision (approx.) versus E_c/N0 in dB. In the low-SNR regime we suffer from the reduced energy per coded bit; at high SNR the asymptotic coding gain and the hard- vs. soft-decision gain are visible.]

3 Channel Coding
Analytical calculation of the error probabilities, hard decision: count the combinations for r errors in a sequence of length n.
Example: (3,1) repetition code: 3 combinations for 1 error (will be corrected), 3 combinations for 2 errors and 1 combination for 3 errors (lead to a wrong code word).

3 Channel Coding
Up to t errors can be corrected; code word errors occur for t+1 or more wrong bits. In general:
P_w <= sum_{r=t+1}^{n} C(n,r) · p_e^r · (1-p_e)^(n-r),
with C(n,r) the number of combinations for r errors in a sequence of length n, p_e^r the probability for r errors, and (1-p_e)^(n-r) the probability for n-r correct bits.
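A direct Python transcription of this bound (illustrative values only):

```python
from math import comb

def word_error_prob_hard(pe, n, t):
    """Word error probability of bounded-distance hard-decision decoding:
    an error occurs if more than t of the n code bits are wrong."""
    return sum(comb(n, r) * pe**r * (1 - pe)**(n - r) for r in range(t + 1, n + 1))

for n, t in ((3, 1), (7, 1)):        # (3,1) repetition code and (7,4) Hamming code, t = 1
    print(n, t, word_error_prob_hard(0.01, n, t))
```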

3 Channel Coding
Approximation for small values of p_e: only take the lowest power of p_e into account:
P_w ≈ C(n, t+1) · p_e^(t+1).
Example: (7,4) Hamming code, t = 1: P_w ≈ C(7,2) · p_e² = 21 p_e², with p_e = Q(sqrt(2 E_c/N0)) for a binary modulation scheme and the AWGN channel.

3 Channel Coding
Example:
[Figure: BER performance using the (7,4) Hamming code, hard decision; simulated P_b and P_w versus E_b/N0 in dB together with the approximations calculated as derived before. At the highest SNR values more bits should have been simulated to get reliable results.]

3 Channel Coding
Asymptotic coding gain for hard-decision decoding:
uncoded: P_b ≈ Q(sqrt(2 E_b/N0)) (good approximation for high SNR);
coded: P_w ≈ const · p_e^(t+1) with p_e = Q(sqrt(2 R E_b/N0)).
Assuming a constant BER and comparing the required signal-to-noise ratios gives the asymptotic coding gain G_hard = 10·log10(R·(t+1)) in dB.
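A two-line Python helper for the asymptotic gains (G_hard = 10·log10(R(t+1)); the soft-decision counterpart 10·log10(R·d_min) is derived a few slides further on), evaluated for the (7,4) Hamming code:

```python
import math

def gain_hard_db(R, t):
    """Asymptotic coding gain, hard decision: 10*log10(R*(t+1)) dB."""
    return 10 * math.log10(R * (t + 1))

def gain_soft_db(R, d_min):
    """Asymptotic coding gain, soft decision: 10*log10(R*d_min) dB."""
    return 10 * math.log10(R * d_min)

print(round(gain_hard_db(4/7, 1), 2), "dB hard")   # ~0.58 dB
print(round(gain_soft_db(4/7, 3), 2), "dB soft")   # ~2.34 dB
```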

3 Channel Coding
Example:
[Figure: BER performance using the (7,4) Hamming code; uncoded, P_b hard decision (approx.) and P_b soft decision (approx.) versus E_c/N0 in dB, with the asymptotic coding gain indicated.]

3 Channel Coding
Analytical calculation of the error probabilities, soft decision: the code word is transmitted over the AWGN channel; received word = code word + noise vector (i.i.d. Gaussian components).
Example: (3,2) parity check code.

3 Channel Coding
Example continued, with the ML decoding rule derived before (minimum Euclidean distance).
Pairwise error probability: assume code word x1 has been transmitted. What is the probability that the decoder decides for a different code word x2? The decoder will decide for x2 if the received word has a smaller Euclidean distance to x2 than to x1.
Next, evaluate the norms by summing the squared components; for the whole code word only the positions in which x1 and x2 differ contribute.

3 Channel Coding

scales standard deviation


Gaussian rv with standard deviation
sum of Gaussian rvs: The variance of the sum will be the
sum of the individual variances.

std. dev.
variance
Gaussian rv with zero mean and variance
M.Sc. Marko Hennhfer, Communications Research Lab

Information Theory and Coding

Slide: 75

3 Channel Coding
Question: what is the probability that our Gaussian r.v. becomes larger than a certain value?
Answer: the integral over the remaining tail of the Gaussian PDF, expressed, e.g., via the Q-function.
Q-function: Q(z) = 1/sqrt(2π) ∫_z^∞ exp(-u²/2) du, the probability that a normalized (zero-mean, unit-variance) Gaussian r.v. becomes larger than the value z.

3 Channel Coding

Normalizing the Gaussian r.v. yields the pairwise error probability
P(x1 → x2) = Q(sqrt(2 d R E_b/N0)),
where d is the number of bits in which the two code words differ.

3 Channel Coding
Example continued: for a transmitted code word, sum the pairwise error probabilities weighted with the number of code words at each distance. The code words with the minimum Hamming distance to the transmitted code word dominate the code word error probability. Finally, take the mean over the transmitted code words.

3 Channel Coding

Best case: only one code word at distance d_min; worst case: all other code words at distance d_min.
For high SNR, or if the distance distribution is unknown, the dominant term P_w ≈ Q(sqrt(2 d_min R E_b/N0)) is used.

3 Channel Coding
Example:
[Figure: BER performance using the (7,4) Hamming code, soft decision; simulated P_b and P_w versus E_b/N0 in dB together with the approximations calculated as derived before using d_min.]

3 Channel Coding
Asymptotic coding gain for soft-decision decoding (derivation analogous to the hard-decision case):
uncoded: P_b ≈ Q(sqrt(2 E_b/N0)) (good approximation for high SNR);
coded: P_w ≈ const · Q(sqrt(2 d_min R E_b/N0)).
Assuming a constant BER and comparing the signal-to-noise ratios gives G_soft = 10·log10(R·d_min) in dB.

3 Channel Coding
Example:
[Figure: BER performance using the (7,4) Hamming code; uncoded, hard- and soft-decision approximations versus E_c/N0 in dB, with the asymptotic coding gains indicated.]

3 Channel Coding
Matrix representation of block codes. Example: (7,4) Hamming code.
Encoding equation: the parity bits are bitwise modulo-2 sums (without carry) of info bits; the code is systematic.
Introducing the generator matrix G, the encoding can be expressed as a matrix-vector product (multiply and sum modulo 2). For a systematic code G = [I | P]: the identity matrix I copies the info word into the code word; the parity matrix P produces the parity bits.

3 Channel Coding
General, for an (n,k) block code: info words u of length k, code words x of length n.
Encoding: x = u·G. For systematic codes: G = [I | P].
Set of code words: all 2^k products u·G.
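A compact numpy sketch of this encoding (the parity matrix P below is one common choice for the (7,4) Hamming code, not necessarily the one printed on the slides):

```python
import numpy as np

P = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1],
              [1, 1, 1]])
G = np.hstack([np.eye(4, dtype=int), P])      # systematic generator G = [I | P]
H = np.hstack([P.T, np.eye(3, dtype=int)])    # parity check matrix H = [P^T | I]

u = np.array([1, 0, 1, 1])                    # info word
x = u @ G % 2                                 # encoding x = u G (mod 2)
print("code word:", x)                        # first 4 bits equal the info word
print("H x^T =", H @ x % 2)                   # all-zero syndrome for a valid code word
```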

3 Channel Coding
Properties of the generator matrix G:
- the rows of G shall be linearly independent;
- the rows of G are code words;
- the row rank is the number of linearly independent rows, the column rank the number of linearly independent columns;
- row rank and column rank are equal, i.e., the rank of the matrix;
- as G has more columns than rows, the columns must be linearly dependent.
Example: (7,4) Hamming code: it is easy to see that the rows are linearly independent, and the last 3 columns can be written as linear combinations of the first 4 columns; the rank is 4.

3 Channel Coding
Properties of the generator matrix G:
- rows can be exchanged without changing the code;
- multiplying a row by a nonzero scalar doesn't change the code;
- adding a scaled row to another row doesn't change the code;
- exchanging columns changes the set of code words, but the weight distribution and the minimum Hamming distance stay the same.
Hence every generator matrix can be brought to row echelon form, i.e., a systematic encoder, which yields the same code (up to column permutations).

3 Channel Coding
Properties of the generator matrix G: as the all-zero word is a valid code word and the rows of G are also valid code words, the minimum Hamming distance must be less than or equal to the minimum weight of the rows.
Parity check matrix H: the code can also be defined via the parity check matrix.

3 Channel Coding
Parity check matrix: if G = [I | P] is a systematic generator matrix, then H = [P^T | I] is a corresponding parity check matrix, and H·x^T = 0 for every valid code word x.
H can be used to check whether a received word is a valid code word, or to determine what is wrong with the received word (syndrome).

3 Channel Coding
Decoding: ML decoding is conceptually trivial but computationally very complex, as the received word has to be compared with all possible code words; impractical for larger code sets. Therefore, simplified decoding methods are considered.
Syndrome decoding using standard arrays (or Slepian arrays): assume an (n,k) code with parity check matrix H. The syndrome for a received word y = x + e (valid code word plus error word / error pattern) is defined as s = H·y^T = H·e^T.
For a valid received code word the syndrome is 0; otherwise the syndrome only depends on the error pattern.

3 Channel Coding
As we have 2^k valid code words and 2^n possible received words, there are 2^n - 2^k error patterns. The syndrome has only n-k bits, therefore the syndromes are not unique.
E.g., (7,4) Hamming code: 16 valid code words, 128 possible received words, 112 error patterns, 2^(n-k) = 8 syndromes.
For each syndrome we get a whole set of error patterns (a coset) that yield this syndrome. The difference of two error patterns in the same coset must then be a valid code word.

3 Channel Coding
Each coset can be expressed as one of its elements, chosen as the coset leader, plus the code set. The coset leader is chosen as the element with the minimum Hamming weight.
Example: (5,2) code.

3 Channel Coding
[Standard array: the first row (syndrome 0) contains the valid code words; each further row is a coset together with its syndrome, headed by its coset leader. E.g., the row for syndrome 011 contains all error patterns that yield this syndrome; the pattern with minimum Hamming weight is chosen as coset leader.]

3 Channel Coding
Syndrome decoding: the same table as before, only keeping the coset leaders and the syndromes. The syndrome table is re-sorted for easier look-up; the syndrome already contains the address information.
As the coset leader was chosen with the minimum Hamming weight, it is the most likely error pattern for a given syndrome.

3 Channel Coding
Example: (5,2) code continued. Assume we receive a word y.
Calculate the syndrome (what is wrong with the received word?), look it up in the syndrome table, e.g., at position 3 (011 binary), and invert the corresponding bit(s) to find the most likely transmitted code word.
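A Python sketch of syndrome decoding for single-bit errors, using the same (assumed) (7,4) Hamming matrices as in the sketch above; the coset leaders are the single-bit error patterns, which is all a Hamming code can correct.

```python
import numpy as np

P = np.array([[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]])
H = np.hstack([P.T, np.eye(3, dtype=int)])
n = H.shape[1]

# Syndrome table: syndrome -> coset leader (here: the all-zero and single-bit patterns).
table = {tuple(np.zeros(3, dtype=int)): np.zeros(n, dtype=int)}
for i in range(n):
    e = np.zeros(n, dtype=int)
    e[i] = 1
    table[tuple(H @ e % 2)] = e

y = np.array([1, 0, 1, 1, 0, 1, 1])     # code word 1011010 with its last bit flipped
s = tuple(H @ y % 2)                    # the syndrome depends only on the error pattern
x_hat = (y + table[s]) % 2              # add (= subtract in GF(2)) the coset leader
print("syndrome:", s, "-> corrected:", x_hat)
```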

3 Channel Coding
Convolutional codes, features:
- No block processing; a whole sequence is convolved with a set of generator coefficients.
- No analytic construction is known; good codes have been found by computer search.
- The description is easier compared to block codes.
- Simple processing of soft-decision information; well suited for iterative decoding.
- Coding gains from simple convolutional codes are similar to those from complex block codes.
- Easy implementation via shift registers.

3 Channel Coding
General structure. Example: an (n,k), e.g., (3,2), convolutional code with memory m = 2 (constraint length K = m+1 = 3).
Each output block of n bits is a linear combination of the current info block and the m = 2 previous info blocks; the weights for the linear combination are the generators, here [011001], [101100], [010000], usually given in octal form: (31, 54, 20).

3 Channel Coding
Formal description: the generators describe the linear combinations, i.e., how to compute the n output bits from the k·(m+1) input bits: each output bit is the modulo-2 sum over the input blocks and over the bits of each input block, weighted with the corresponding generator coefficient (0 or 1).

3 Channel Coding
General structure, often used: input blocks of size 1, i.e., (n,1) codes, e.g., (3,1) convolutional codes. The output block depends on the current info bit and the m = 2 previous info bits. Generators here: [100], [101], [111], in octal form (4, 5, 7).

3 Channel Coding
Visualization as a shift register, e.g., the (3,1) convolutional code with generator (4,5,7) and constraint length 3: the current input bit enters the register, the m = 2 memory cells hold the previous bits (initialized with zeros). The memory content defines the state: s0 = 00, s1 = 01, s2 = 10, s3 = 11.
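A minimal Python sketch of this shift-register encoder for the (3,1) code with generators (4, 5, 7); it also appends the m = 2 zero tail bits used for the terminated trellis later on.

```python
def conv_encode(bits, generators=((1, 0, 0), (1, 0, 1), (1, 1, 1)), m=2):
    """Rate-1/n feed-forward convolutional encoder.
    generators: taps on (current bit, previous bit, bit before that) = (4,5,7) octal."""
    state = [0] * m                        # shift register, initialized with zeros
    out = []
    for u in list(bits) + [0] * m:         # info bits plus m zero tail bits
        window = [u] + state               # current input followed by the memory
        out.append([sum(g * b for g, b in zip(gen, window)) % 2 for gen in generators])
        state = [u] + state[:-1]           # shift
    return out

print(["".join(map(str, blk)) for blk in conv_encode([0, 1, 0, 1])])
# -> 000 111 001 100 001 011, the sequence used in the Viterbi example below
```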

3 Channel Coding
Generation of the trellis diagram (example continued): starting from the initialization state s0 = 00, each pair of current state and input bit determines the output block and the following state; e.g., in s0 the input 0 gives output 000 and stays in s0, the input 1 gives output 111 and leads to s2.

3 Channel Coding
Trellis diagram (example continued):
[Figure: trellis over several time steps for the states s0 ... s3; solid branches for input 0, dashed branches for input 1; e.g., the all-zero path stays in s0 with output 000, the s3 → s3 branch for input 1 has output 101.]

3 Channel Coding
Encoding via the trellis diagram (example continued): following the branches through the trellis, the input sequence 0 1 0 1 1 ... produces the output sequence 000 111 001 100 110 ...

3 Channel Coding
State diagram (example continued): a more compact representation. The transitions (solid: input 0, dashed: input 1) are:
s0: 0 -> 000 / s0,  1 -> 111 / s2
s1: 0 -> 011 / s0,  1 -> 100 / s2
s2: 0 -> 001 / s1,  1 -> 110 / s3
s3: 0 -> 010 / s1,  1 -> 101 / s3

3 Channel Coding
Encoding via the state diagram (example continued): start in s0 (initialization); the input sequence 0 1 0 1 1 ... again yields the output sequence 000 111 001 100 110 ...

3 Channel Coding
Viterbi algorithm for hard-decision decoding (terminated code; the last two info bits are tail bits):
Info bits:    0   1   0   1   0   0
Transmitted:  000 111 001 100 001 011
Received:     001 111 011 000 001 010   (four transmission errors)
For every branch the Viterbi metric, i.e., the Hamming distance between the possible output block and the received block, is computed; the branch metrics are accumulated along the paths, and at every state only the survivor (the incoming path with the smaller sum) is kept.

3 Channel Coding
Viterbi algorithm for hard-decision decoding, continued: at the end of the terminated trellis the path with the minimum metric is traced back:
ML estimate:  000 111 001 100 001 011
Decoded:      0   1   0   1   0   0
All transmission errors have been corrected.

3 Channel Coding
[Blank trellis diagram (states 00, 01, 10, 11; solid branches for input 0, dashed for input 1) as a worksheet.]

3 Channel Coding
Summary: Viterbi algorithm for hard-decision decoding (a compact sketch follows below):
- Generate the trellis diagram for the code (which is defined by the generator).
- For every branch compute the Viterbi metric, i.e., the Hamming distance between the possibly decoded sequence and the received sequence.
- Sum up the individual branch metrics through the trellis (path metrics).
- At each node keep the survivor, i.e., the path with the minimum metric.
- At the end the zero state is reached again (for terminated codes).
- From the end of the trellis trace back the path with the minimum metric and read off the corresponding decoder outputs.
- As the sequence with the minimum Hamming distance is found, this decoding scheme corresponds to maximum likelihood decoding.
Sometimes different Viterbi metrics are used, such as the number of equal bits; then the path with the maximum metric is needed.
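A compact Python sketch of this procedure for the (3,1), generator (4,5,7) example code, using the hard-decision Hamming metric and a terminated trellis. The received blocks are the ones from the worked example, so the decoder recovers the transmitted info and tail bits.

```python
def viterbi_hard(received, generators=((1, 0, 0), (1, 0, 1), (1, 1, 1)), m=2):
    """Hard-decision Viterbi decoding of the terminated rate-1/3 example code.
    received: one n-bit block per trellis step (info bits incl. tail bits)."""
    def branch(state, u):                       # replicate the encoder for one step
        window = (u,) + state
        out = tuple(sum(g * b for g, b in zip(gen, window)) % 2 for gen in generators)
        return out, (u,) + state[:-1]

    start = (0,) * m
    paths = {start: (0, [])}                    # state -> (path metric, decoded bits)
    for block in received:
        new_paths = {}
        for state, (metric, bits) in paths.items():
            for u in (0, 1):
                out, nxt = branch(state, u)
                d = metric + sum(a != b for a, b in zip(out, block))   # Hamming metric
                if nxt not in new_paths or d < new_paths[nxt][0]:
                    new_paths[nxt] = (d, bits + [u])                   # keep the survivor
        paths = new_paths
    return paths[start][1]                      # terminated code: trace back from state 00

rx = [(0, 0, 1), (1, 1, 1), (0, 1, 1), (0, 0, 0), (0, 0, 1), (0, 1, 0)]
print(viterbi_hard(rx))                         # -> [0, 1, 0, 1, 0, 0]
```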

3 Channel Coding
How good are different convolutional codes?
- For block codes it is possible to determine the minimum Hamming distance between the code words, which is the main parameter that influences the bit error rate.
- For convolutional codes a similar measure exists: the free distance d_free is the minimum number of bits in which two output sequences differ. The larger d_free, the better the code.
- A convolutional code is called optimal if its free distance is larger than that of all other codes with the same rate and constraint length.
- Even though the coding is a sequential process, the decoding is performed in chunks of finite length (decoding window width).
- As convolutional codes are linear codes, the free distances are the distances between each code sequence and the all-zero code sequence.
- The minimum free distance is the minimum Hamming weight of all arbitrarily long paths along the trellis that diverge from and remerge with the all-zero path (similar to the minimum Hamming distance for linear block codes).

3 Channel Coding
Free distance (example recalled): (3,1) convolutional code with generator (4,5,7).
The minimum-weight path that diverges from and remerges with the all-zero path goes s0 → s2 → s1 → s0 with outputs 111, 001, 011, i.e., branch Hamming weights 3 + 1 + 2, so d_free = 6.
Note: this code is not optimal, as there exists a better code with constraint length 3 that uses the generator (5,7,7) and reaches a free distance of 8.

3 Channel Coding
How good are different convolutional codes? Optimal codes have been found via computer search, e.g.:

Code rate  Constraint length  Generator (octal)  Free distance
1/2        3                  (5,7)              5
1/2        4                  (15,17)            6
1/2        5                  (23,35)            7
1/3        3                  (5,7,7)            8
1/3        4                  (13,15,17)         10
1/3        5                  (25,33,37)         12

Extensive tables: see John G. Proakis, Digital Communications.
As the decoding is done sequentially, e.g., with a large decoding window, the free distance only gives a hint on the number of bits that can be corrected: the higher the free distance, the more closely located errors can be corrected. Therefore, interleavers are used to split up burst errors.

3 Channel Coding
Application example: GSM voice transmission (standardization 1982-1992, deployment starting 1992).
The speech codec produces blocks of 260 bits, of which some bits are more or less important for the speech quality:
- 50 bits (class Ia): most sensitive to bit errors,
- 132 bits (class Ib): moderately sensitive to bit errors,
- 78 bits (class II): least sensitive to bit errors.
[Block diagram: the 50 class Ia bits get 3 CRC parity bits; together with the 132 class Ib bits and the termination bits they form 189 bits that are convolutionally encoded to 378 bits; multiplexing with the 78 uncoded class II bits gives 456 bits per block.]

3 Channel Coding
Application example: GSM voice transmission, continued.
- The voice samples are taken every 20 ms, i.e., the output of the voice coder has a data rate of 260 bit / 20 ms = 13 kbit/s.
- After the encoding we get 456 bits, i.e., an overall code rate of about 0.57; the data rate increases to 456 bit / 20 ms = 22.8 kbit/s.
- The convolutional encoder applies a rate-1/2 code with constraint length 5 (memory 4) and generator (23, 35). The blocks are terminated by appending 4 zero bits (tail bits).
- Specific decoding schemes or algorithms are usually not standardized; in most cases the Viterbi algorithm is used for decoding, with 2^4 = 16 states in the trellis diagram.
- If one of the 3 parity bits is wrong (error in the most sensitive data), the block is discarded and replaced by the last one received correctly.
- To avoid burst errors, an interleaver is additionally used at the encoder output.

3 Channel Coding

Application example: UMTS (standardization 1990-2000, deployment starting 2001).
Example: broadcast channel (BCH). Convolutional code: rate 1/2, constraint length K = 9 (memory m = 8), generator (561, 753); 2^8 = 256 states in the trellis diagram! Turbo codes have also been standardized.
From: Universal Mobile Telecommunications System (UMTS); Channel coding and multiplexing examples (ETSI 3GPP TR 25.944), an 82-page document.

3 Channel Coding
Recursive Systematic Codes (RSC). Example: a rate-1/2 RSC.
- Systematic: the info bit occurs directly as one output bit.
- Recursive: there is a feedback path in the shift register.
Generators: feedback generator [111] = (7) octal, feedforward generator [101] = (5) octal.

3 Channel Coding
Example continued:
[Trellis segment / state diagram of the rate-1/2 RSC with the states s0 = 00, s1 = 01, s2 = 10, s3 = 11; solid branches for input 0, dashed for input 1.]
More detailed:
[Shift-register diagram of the RSC with its two delay elements, feedback and feedforward taps.]

3 Channel Coding
Tail bits for the terminated code? For an RSC they depend on the state! The tail bits are generated automatically by the encoder, depending on the encoded sequence, such that the trellis is driven back to the all-zero state.
[Trellis: termination paths from each state back to 00.]

3 Channel Coding
How to terminate the code?
[Shift-register diagram with a switch for termination: during termination the input is generated from the state (the feedback value), so the value entering the first delay element is always zero, i.e., the state gets filled with zeros.]

3 Channel Coding
Example: termination if the last state has been 11. As the input is no longer arbitrary, only 4 cases have to be considered. From the state 11 we force the encoder back to the 00 state by generating the tail bits 0 1; the corresponding output sequence would be 01 11. See also the trellis diagram for the termination.

3 Channel Coding
Turbo codes:
- developed around 1993; get close to the Shannon limit.
- Turbo Convolutional Codes (TCC), used in UMTS and DVB:
  - parallel convolutional encoders are used; one encoder gets a (pseudo-)random permutation of the input bits;
  - the decoder then benefits from two statistically independent encoded versions of each bit;
  - slightly superior to TPC, noticeably superior to TPC for low code rates (~1 dB).
- Turbo Product Codes (TPC), used in WLAN, WiMAX:
  - serially concatenated codes based on block codes; the data are arranged in a matrix or a 3-dimensional array, with, e.g., Hamming codes along the dimensions;
  - good performance at high code rates, good coding gains with low complexity.

3 Channel Coding
System overview:
[Block diagram: Turbo encoder → symbol mapping (bits to symbols, e.g., BPSK) → channel (assume AWGN) → noisy received values → Turbo decoder (soft outputs) → bit mapping.]

3 Channel Coding
Turbo encoder (for Turbo Convolutional Codes, TCC): structure of a rate-1/3 turbo encoder.
The info bits are fed to two identical convolutional encoders, the second one via an interleaver (a pseudo-random permutation); together with the systematic bits this gives rate 1/3.
The turbo code is a block code, as a certain number of bits has to be buffered first in order to fill the interleaver.

3 Channel Coding
Example: UMTS turbo encoder. Rate 1/3; two RSC encoders (three delay elements each) with feedforward generator (15) and feedback generator (13), one fed directly and one via the interleaver: Parallel Concatenated Convolutional Codes (PCCC).

3 Channel Coding
Turbo decoder: structure of a turbo decoder.
[Block diagram: MAP decoder 1 and MAP decoder 2 exchange extrinsic information via interleaver and deinterleaver.]
The MAP decoders produce a soft output which is a measure for the reliability of their decision for each of the bits. This likelihood is used as soft input for the other decoder (which decodes the interleaved sequence). The process is repeated until there is no significant improvement of the extrinsic information anymore.

3 Channel Coding
MAP (maximum a-posteriori probability) decoding; differences compared to Viterbi decoding:
- Viterbi decoders decode a whole sequence (maximum likelihood sequence estimation). If the Euclidean distance is used as Viterbi metric instead of the Hamming distance, we easily get the Soft-Output Viterbi Algorithm (SOVA).
- The SOVA provides a reliability measure for the decision on the whole sequence.
- For iterative decoding schemes a reliability measure for each individual bit is desirable, as two decoders decode the same bit independently and exchange their reliability information to improve the estimate. The independence is artificially generated by applying an interleaver at the encoding stage.
- In the trellis diagram the MAP decoder uses some bits before and after the current bit to find the most likely current bit.
- MAP decoding is used in systems with memory, e.g., convolutional codes or channels with memory.

3 Channel Coding
Consider the transmission over an AWGN channel applying a binary modulation scheme (higher-order modulation schemes can be treated by grouping bits). Mapping: 0 → +1 and 1 → −1.
Suitable measure for the reliability: the log-likelihood ratio (LLR), L(x) = ln[ P(x = +1) / P(x = −1) ].
[Figure: LLR from −5 to 5 versus P(x = +1) from 0 to 1.]
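A tiny Python helper for the channel LLR of a single received sample, under the assumptions used here (BPSK mapping 0 → +sqrt(Ec), 1 → −sqrt(Ec), AWGN with noise variance N0/2):

```python
import math

def llr_awgn(y, Ec=1.0, N0=1.0):
    """Channel LLR: L(y) = ln f(y|+1) / f(y|-1) = 4 sqrt(Ec) y / N0."""
    return 4 * math.sqrt(Ec) * y / N0

for y in (-1.2, -0.1, 0.05, 0.9):
    print(y, round(llr_awgn(y), 2))   # sign = hard decision, magnitude = reliability
```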

3 Channel Coding
The reliability measure (LLR) for a single bit at time r, under the condition that a sequence y ranging from 1 to N has been received, is
L(x_r | y) = ln[ P(x_r = +1 | y) / P(x_r = −1 | y) ],
with Bayes' rule P(A|B) = P(A,B)/P(B) = P(B|A)·P(A)/P(B) (joint probability, a-priori probability of A, a-posteriori probability of B) linking the unknown transmitted bit to the known, observed sequence.

3 Channel Coding
Example as used before: rate-1/2 RSC with generators 5 and 7. The probability that x_r becomes +1 or −1 can be expressed in terms of the starting and ending states of the corresponding transition in the trellis diagram (branches for input 0, i.e., +1, and input 1, i.e., −1).

3 Channel Coding
The numerator of the LLR sums the joint probabilities of a pair of starting and ending states (together with the received sequence) over all state pairs that yield a +1; the denominator sums over all state pairs that yield a −1.

3 Channel Coding
The probability to observe a certain pair of states depends on the past and the future bits. Therefore, we split the sequence of received bits into the past (y_1 ... y_{r-1}), the current (y_r), and the future bits (y_{r+1} ... y_N).

3 Channel Coding
Using Bayes' rule to split the expression into past, present and future: looking at the trellis diagram we see that the future is independent of the past; it only depends on the current state. Using Bayes' rule again for the last probability and summarizing yields a factorization into three metrics.

3 Channel Coding
Identifying the metrics to compute the MAP estimate:
- Forward metric α_{r-1}(s'): probability for a certain state and a certain past.
- Transition metric γ_r(s', s): probability to observe a certain state and bit given the state and the bit before.
- Backward metric β_r(s): probability for a certain future given the current state.
The LLR is then rewritten in terms of these metrics:
L(x_r | y) = ln[ sum over (s',s) yielding +1 of α_{r-1}(s') γ_r(s',s) β_r(s) / sum over (s',s) yielding −1 of α_{r-1}(s') γ_r(s',s) β_r(s) ].

3 Channel Coding
How to calculate the metrics? Forward metric: the probability to arrive in a certain state together with the received sequence that led to it. It is computed recursively, α_r(s) = sum over s' of α_{r-1}(s') γ_r(s', s), starting from the initialization (the encoder starts in the all-zero state).

3 Channel Coding
How to calculate the metrics? Backward metric: the probability for a certain future given the current state. It is computed recursively from the end of the sequence, β_{r-1}(s') = sum over s of β_r(s) γ_r(s', s), starting from the termination (the encoder ends in the all-zero state).

3 Channel Coding
How to calculate the metrics? Transition metric: the probability to observe a certain state and bit given the state and the bit before. For a given state the transition probability does not depend on the past; it is the probability to observe the received bits for the given pair of states times the probability for this pair of states, i.e., the a-priori probability of the input bit.

3 Channel Coding
Now some math, starting with the a-priori probability: expressing it in terms of the likelihood ratio L(x_r) and combining the two cases (+1 and −1) in a smart way into one expression, the a-priori probability can be written as a factor that is common to both hypotheses times exp(x_r · L(x_r)/2).

3 Channel Coding
Now some more math, continuing with the channel term: for code rate 1/2 the pair of observed bits is the noisy (AWGN) observation of the pair of transmitted coded bits belonging to the encoded info bit (the example is easily extended to other rates). Expanding the Gaussian densities, the squared terms (±1)² = 1 are constant and can be collected into a common factor.

3 Channel Coding

The full expression for the transition metric then factors into a term carrying the a-priori information, a term from the channel observation, and a constant abbreviation collected from the steps before.

3 Channel Coding

The transmitted coded bits are unknown at the receiver, but they result from the corresponding branch s' → s in the trellis diagram; due to the assumed mapping their contribution enters with a positive sign for branches belonging to +1 and a negative sign for branches belonging to −1.

3 Channel Coding
Interpretation: the LLR splits into three parts:
- a-priori information about the transmitted bit, taken from an initial estimate before running the MAP algorithm;
- information provided by the observation, depending only on the channel, not on the coding scheme;
- a-posteriori (extrinsic) information, gained from the applied coding scheme.
In a turbo decoder the extrinsic information of one MAP decoder is used as a-priori information of the second MAP decoder. This exchange of extrinsic information is repeated until the extrinsic information does not change significantly anymore.

3 Channel Coding
Summary: the info bits are mapped to +1 (for 0) and −1 (for 1) and, since a systematic code is used, appear directly in the encoded sequence, which is transmitted over the AWGN channel. From the noisy received bits the decoder computes, for each bit, the channel information, the a-priori information (set to 0.5, i.e., LLR = 0, in the first stage) and the extrinsic information from the decoding; together they yield the LLR and therefore the bit estimate.

3 Channel Coding
Iterations (the channel term stays constant over the iterations):
Iteration #1: the first decoder starts with a-priori LLR = 0; the second decoder then uses the extrinsic information from the first one as a-priori information.
Iterations #2, #3, ...: continue in the same fashion.
Reference: see the tutorials at www.complextoreal.com or http://www.vashe.org/ (a slightly different notation is used there; the first tutorial has some minor errors, but most of them cancel out).

3 Channel Coding
Low-Density Parity Check (LDPC) codes:
- first proposed in 1962 by Gallager;
- neglected until the 1990s due to computational complexity;
- new LDPC codes outperform turbo codes and reach the Shannon limit within hundredths of a decibel for large block sizes, e.g., a parity check matrix of size 10000 x 20000;
- already used for satellite links (DVB-S2, DVB-T2) and in optical communications;
- adopted in IEEE wireless network standards, e.g., 802.11n (WLAN) or IEEE 802.16e (WiMAX);
- under consideration for the long-term evolution (LTE) of third-generation mobile telephony;
- block codes with parity check matrices containing only a small number of non-zero elements;
- complexity and minimum Hamming distance increase linearly with the block length.

3 Channel Coding
Low-Density Parity Check (LDPC) codes:
- not different from any other block code (besides the sparse parity check matrix);
- design: find a sparse parity check matrix and determine the generator matrix;
- difference to classical block codes: LDPC codes are decoded iteratively.

3 Channel Coding
Tanner graph: a graphical representation of the parity check matrix; LDPC codes are often represented by their Tanner graph.
Example: (7,4) Hamming code: n bit nodes and n-k check nodes (one per parity check equation); an edge connects a bit node with a check node if the corresponding entry of H is 1.
Decoding via the message passing (MP) algorithm: likelihoods are passed back and forth between the check nodes and the bit nodes in an iterative fashion.

3 Channel Coding
Encoding: use Gaussian elimination to bring the parity check matrix into systematic form, construct the generator matrix from it, and calculate the set of code words.

3 Channel Coding
Example: a length-12 (3,4)-regular LDPC code (every bit node has degree 3, every check node degree 4), a parity check code as introduced by Gallager.

3 Channel Coding
Message passing (MP) decoding: soft- and hard-decision algorithms are used; often log-likelihood ratios are used (sum-product decoding).
Example: (7,4) Hamming code with a binary symmetric erasure channel.
Initialization: the received word contains erased bits (x). The check node sums (e.g., 1 + x + 0 + 1, x + 0 + 1 + x, 1 + x + 1 + x) must be zero for a valid code word; from the first check, which contains only one erasure, that x must be 0.

3 Channel Coding
Message passing (MP) decoding, next iteration: after inserting the resolved bit, the updated checks each contain only a single remaining erasure; requiring a zero syndrome for a valid code word resolves these bits as well (here both resolve to 1).

3 Channel Coding
Message passing (MP) decoding, final step: all checks are now satisfied (e.g., 1 + 0 + 0 + 1 = 0, 0 + 0 + 1 + 1 = 0, 1 + 0 + 1 + 0 = 0).
Decoding result: 1001110.
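A small Python sketch of this erasure-filling message passing (the parity check matrix below is one common (7,4) Hamming choice, not necessarily the one drawn on the slides; -1 marks an erased bit):

```python
import numpy as np

H = np.array([[1, 1, 0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]])

def mp_erasure_decode(y, H, erasure=-1):
    """Binary erasure channel: a check node with exactly one erased bit resolves it."""
    y = y.copy()
    progress = True
    while progress and erasure in y:
        progress = False
        for row in H:
            unknown = [j for j in np.flatnonzero(row) if y[j] == erasure]
            if len(unknown) == 1:
                known = [y[j] for j in np.flatnonzero(row) if y[j] != erasure]
                y[unknown[0]] = sum(known) % 2    # fill so that the parity check is zero
                progress = True
    return y

y = np.array([1, -1, 1, 1, -1, 1, 0])             # two erased bits
print(mp_erasure_decode(y, H))                    # -> [1 0 1 1 0 1 0]
```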

3 Channel Coding
Message passing (MP) decoding, sum-product decoding: similar to MAP turbo decoding.
- The observations are used as a-priori information and passed to the check nodes to calculate the parity checks, i.e., a-posteriori / extrinsic information.
- The information from the check nodes is passed back as a-priori information for the next iteration.
- It has actually been shown that the MAP decoding of turbo codes is just a special case of the LDPC decoding already presented by Gallager.
Robert G. Gallager, Professor Emeritus, Massachusetts Institute of Technology: http://www.rle.mit.edu/rgallager/ — under publications you'll also find his Ph.D. thesis on LDPC codes.
