Coding Theory: The Essentials - D.G. Hoffman
θ_p(C, v), for any v in K^n and w in C, where C is the parity check code formed from K^(n-1), but is a reasonable first approximation for a measure of reliability. Certainly θ_p(C, v) is a lower bound for the probability that v is decoded correctly.
Example 1.10.1 Suppose p = .90, |M| = 2, n = 3, and C = {000, 111}, as in Example 1.9.3. If the word v = 000 is sent, we compute the probability that IMLD will correctly conclude this after one transmission. From Table 1.1, v = 000 is decoded in the first four rows, so the set L(000) (words in K^3 closer to v = 000 than to 111) is

L(000) = {000, 100, 010, 001}.

Thus,

θ_p(C, 000) = φ_p(000, 000) + φ_p(000, 100) + φ_p(000, 010) + φ_p(000, 001)
            = p^3 + p^2(1 - p) + p^2(1 - p) + p^2(1 - p)
            = p^3 + 3p^2(1 - p)
            = .972 (assuming p = .9).
If v = 111 is transmitted, we compute the probability that IMLD correctly concludes this after one transmission. First,

L(111) = {110, 101, 011, 111},

so

θ_p(C, 111) = φ_p(111, 110) + φ_p(111, 101) + φ_p(111, 011) + φ_p(111, 111)
            = p^2(1 - p) + p^2(1 - p) + p^2(1 - p) + p^3
            = 3p^2(1 - p) + p^3
            = .972 (assuming p = .9).
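The calculation above can be checked with a short program. The following Python sketch is ours, not the text's; the names d, L, and theta are hypothetical stand-ins for the Hamming distance, the set L(v), and θ_p(C, v), and it simply enumerates K^n by brute force.

```python
from itertools import product

def d(v, w):
    """Hamming distance between two binary words given as strings."""
    return sum(a != b for a, b in zip(v, w))

def L(C, v):
    """L(v): words of K^n strictly closer to v than to any other codeword."""
    n = len(v)
    words = (''.join(bits) for bits in product('01', repeat=n))
    return [w for w in words if all(d(w, v) < d(w, c) for c in C if c != v)]

def theta(C, v, p):
    """theta_p(C, v): sum of phi_p(v, w) = p^(n - d(v,w)) (1 - p)^d(v,w) over L(v)."""
    n = len(v)
    return sum(p ** (n - d(v, w)) * (1 - p) ** d(v, w) for w in L(C, v))

C = ['000', '111']
print(sorted(L(C, '000')))              # the four words decoded as 000
print(round(theta(C, '000', 0.90), 3))  # 0.972, as in the example
```

The same call with v = '111' reproduces the second half of the example.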
Exercises
1.10.2 Suppose p = .90, |M| = 2, n = 3, and C = {001, 101}, as in Exercise 1.9.5.

(a) If v = 001 is sent, find the probability that IMLD will correctly conclude this after one transmission.

(b) Repeat part (a) for v = 101.
Both answers in Exercise 1.10.2 are θ_p(C, v) = .900. Comparing this to the results in Example 1.10.1, we conclude that since .900 < .972, the code C = {000, 111} is better than the code C = {001, 101}, at least when judged by the third criterion in the last section. Our method provides a procedure (although somewhat inefficient when n is large) for determining when the probability that IMLD works is high. Fortunately, most of the codes we design later on are structured so that the calculation of this probability is much easier.
Example 1.10.3 Suppose p = .90, |M| = 3, n = 4, and C = {0000, 1010, 0111}, as in Example 1.9.4. For each v in C, we compute θ_p(C, v).

(a) v = 0000

L(0000) = {0000, 0100, 0001}
θ_p(C, 0000) = φ_p(0000, 0000) + φ_p(0000, 0100) + φ_p(0000, 0001)
             = p^4 + p^3(1 - p) + p^3(1 - p)
             = p^4 + 2p^3(1 - p) = .8019

(b) v = 1010

L(1010) = {1010, 1110, 1011}
θ_p(C, 1010) = φ_p(1010, 1010) + φ_p(1010, 1110) + φ_p(1010, 1011)
             = p^4 + p^3(1 - p) + p^3(1 - p)
             = p^4 + 2p^3(1 - p) = .8019

(c) v = 0111

L(0111) = {0110, 0101, 0011, 1101, 0111, 1111}
θ_p(C, 0111) = φ_p(0111, 0110) + φ_p(0111, 0101) + φ_p(0111, 0011)
               + φ_p(0111, 1101) + φ_p(0111, 0111) + φ_p(0111, 1111)
             = p^3(1 - p) + p^3(1 - p) + p^3(1 - p) + p^2(1 - p)^2 + p^4 + p^3(1 - p)
             = p^4 + 4p^3(1 - p) + p^2(1 - p)^2 = .9558.
Examining the three probabilities, we see that the probability that IMLD will conclude correctly that 0111 was sent is not too bad. However, the probability that IMLD will conclude correctly that either 0000 or 1010 was sent is horrible. Thus, at least by the third criterion in the last section, C = {0000, 1010, 0111} is not an especially good choice for a code.
Exercises
1.10.4 Suppose p = .90 and C = {000, 001, 110}, as in Exercise 1.9.6. If v = 110 is sent, find the probability that IMLD will correctly conclude this, and the probability that IMLD will incorrectly conclude that 000 was sent.
1.10.5 For each of the following codes C, calculate θ_p(C, v) for each v in C, using p = .90. (The IMLD tables for these codes were constructed in Exercise 1.9.7.)

(a) C = {101, 111, 011}
(b) C = {000, 001, 010, 011}
(c) C = {0000, 0001, 1110}
(d) C = {0000, 1001, 0110, 1111}
(e) C = {00000, 11111}
(f) C = {00000, 11100, 00111, 11011}
(g) C = {00000, 11110, 01111, 10001}
(h) C = {000000, 101010, 010101, 111111}
1.11 Error-Detecting Codes
We now make precise the notion of when a code C will detect errors. Recall that if v in C is sent and w in K^n is received, then u = v + w is the error pattern. Any word u in K^n can occur as an error pattern, and we wish to know which error patterns C will detect.

We say that the code C detects the error pattern u if and only if v + u is not a codeword, for every v in C. In other words, u is detected if for any transmitted codeword v, the decoder, upon receiving v + u, can recognize that it is not a codeword and hence that some error has occurred.
Example 1.11.1 Let C = {001, 101, 110}. For the error pattern u = 010, we calculate v + 010 for all v in C:

001 + 010 = 011, 101 + 010 = 111, 110 + 010 = 100.

None of the three words 011, 111, or 100 is in C, so C detects the error pattern 010. On the other hand, for the error pattern u = 100 we find

001 + 100 = 101, 101 + 100 = 001, 110 + 100 = 010.

Since at least one of these sums is in C, C does not detect the error pattern 100.
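The check in Example 1.11.1 is mechanical and can be sketched in a few lines of Python (ours, not the text's; the helper names add and detects are hypothetical).

```python
def add(v, w):
    """Componentwise sum of two binary words in K^n (1 + 1 = 0)."""
    return ''.join('1' if a != b else '0' for a, b in zip(v, w))

def detects(C, u):
    """C detects error pattern u iff v + u is not a codeword for every v in C."""
    return all(add(v, u) not in C for v in C)

C = ['001', '101', '110']
print(detects(C, '010'))  # True: 011, 111, 100 all lie outside C
print(detects(C, '100'))  # False: 001 + 100 = 101 is a codeword
```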
Exercises
1.11.2 Let C = {001, 101, 110}. Determine whether C will detect the error patterns (a) 011, (b) 001, and (c) 000.
1.11.3 For each of the following codes C, determine whether or not C detects the error pattern u:

(a) C = {00000, 10101, 00111, 11100}
    (i) u = 10101
    (ii) u = 01010
    (iii) u = 011
(b) C = {1101, 0110, 1100}
    (i) u = 0010
    (ii) u = 0011
    (iii) u = 1010
(c) C = {1000, 0100, 0010, 0001}
    (i) u = 1001
    (ii) u = 1110
    (iii) u = 0110
1.11.4 Which error patterns will the code C = K^n detect?
1.11.5 (i) Let C be a code which contains the zero word as a codeword. Prove that if the error pattern u is a codeword, then C will not detect u.

(ii) Prove that no code will detect the zero error pattern u = 0.
The table constructed for IMLD can be used to determine which error patterns a code C will detect. The first column lists every word in K^n. Hence the first column can be reinterpreted as all possible error patterns, in which case the "error pattern" columns in the IMLD table then contain the sums v + u, for all v in C. If in any particular row none of these sums are codewords in C, then C detects the error pattern in the first column of that row.
Example 1.11.6 Consider the code C = {000, 111} with IMLD Table 1.1. All possible error patterns u are in the first column. For a given u, all sums v + u as v ranges over C are in the second and third columns of the row labeled by u. If none of these entries are in C (that is, neither is 000 nor 111), then C detects u. Thus C detects the error patterns 100, 010, 001, 110, 101, and 011, as can be seen by inspecting rows 2 through 7 of the table, but not the error patterns 000 or 111.
Exercises
1.11.7 Determine the error patterns detected by each code in Exercise 1.9.7 by
using the IMLD tables constructed there.
An alternative and much faster method for finding the error patterns that a code C can detect is to first find all error patterns that C does not detect; then all remaining error patterns can be detected by C. Clearly, for any pair of codewords v and w, if u = v + w then u cannot be detected, since v + u = w, which is a codeword. So the set of all error patterns that cannot be detected by C is the set of all words that can be written as the sum of two codewords.
Example 1.11.8 Consider the code C = {000, 111}. Since

000 + 000 = 000, 000 + 111 = 111, and 111 + 111 = 000,

the set of error patterns that cannot be detected is {000, 111}. Therefore all error patterns in K^3 \ {000, 111} can be detected.
Example 1.11.9 Let C = {1000, 0100, 1111}. Since 1000 + 1000 = 0000, 1000 + 0100 = 1100, 1000 + 1111 = 0111, and 0100 + 1111 = 1011, the set of error patterns that cannot be detected by C is {0000, 1100, 0111, 1011}. Therefore all error patterns in K^4 \ {0000, 1100, 0111, 1011} can be detected.
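The "pairwise sums" observation gives a one-line computation of the undetectable patterns. A Python sketch (ours; the name undetected is hypothetical) that reproduces Example 1.11.9:

```python
def add(v, w):
    """Componentwise sum mod 2 of two binary words."""
    return ''.join('1' if a != b else '0' for a, b in zip(v, w))

def undetected(C):
    """All error patterns C cannot detect: the sums v + w over pairs of codewords."""
    return {add(v, w) for v in C for w in C}

print(sorted(undetected(['1000', '0100', '1111'])))
# ['0000', '0111', '1011', '1100'], as in Example 1.11.9
```

Every other word of K^4 is detected.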
Exercises
1.11.10 Find the error patterns detected by each of the following codes, and compare your answers with those of Exercise 1.11.7.

(a) C = {101, 111, 011}
(b) C = {000, 001, 010, 011}
(c) C = {0000, 0001, 1110}
(d) C = {0000, 1001, 0110, 1111}
(e) C = {00000, 11111}
(f) C = {00000, 11100, 00111, 11011}
(g) C = {00000, 11110, 01111, 10001}
(h) C = {000000, 101010, 010101, 111111}
There is also a way of determining some error patterns that a code C will detect without any manual checking. First we have to introduce another number associated with C.

For a code C containing at least two words, the distance of the code C is the smallest of the numbers d(v, w) as v and w range over all pairs of different codewords in C. Note that since d(v, w) = wt(v + w), the distance of the code is the smallest value of wt(v + w) as v and w, with v ≠ w, range over all pairs of codewords.

The distance of a code has many of the properties of Euclidean distance; this correspondence may be useful in understanding the concept of the distance of a code.
Example 1.11.11 Let C = {0000, 1010, 0111}. Then d(0000, 1010) = 2, d(0000, 0111) = 3, and d(1010, 0111) = 3. Thus the distance of C is 2.
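The distance of a small code can be computed directly from the definition. A Python sketch (ours; the name distance is hypothetical):

```python
from itertools import combinations

def distance(C):
    """Distance of a code: the smallest d(v, w) over pairs of distinct codewords."""
    return min(sum(a != b for a, b in zip(v, w)) for v, w in combinations(C, 2))

print(distance(['0000', '1010', '0111']))  # 2, as in Example 1.11.11
```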
Exercises
1.11.12 Find the distance of each of the following codes.

(a) C = {101, 111, 011}
(b) C = {000, 001, 010, 011}
(c) C = {0000, 0001, 1110}
(d) C = {0000, 1001, 0110, 1111}
(e) C = {00000, 11111}
(f) C = {00000, 11100, 00111, 11011}
(g) C = {00000, 11110, 01111, 10001}
(h) C = {000000, 101010, 010101, 111111}
1.11.13 Find the distance of the code formed by adding a parity check digit to K^n.
Now we can state a theorem which helps to identify many of the error patterns a code will detect.

Theorem 1.11.14 A code C of distance d will at least detect all non-zero error patterns of weight less than or equal to d - 1. Moreover, there is at least one error pattern of weight d which C will not detect.

Remark Notice that C may detect some error patterns of weight d or more, but it does not detect all error patterns of weight d.
Proof: Let u be a nonzero error pattern with wt(u) ≤ d - 1, and let v be any codeword. Then d(v, v + u) = wt(u) ≤ d - 1 < d, so v + u cannot be a codeword; hence C detects u. For the second statement, choose codewords v and w with d(v, w) = d. Then u = v + w has weight d, and v + u = w is a codeword, so C does not detect u.
In linear algebra it is shown that for any subset S of a vector space V, the linear span <S> is a subspace of V, called the subspace spanned or generated by S. For the vector space K^n, we have a very simple description of <S>, which is stated in the next theorem. Since <S> is a subspace, in K^n we call <S> the linear code generated by S.
Theorem 2.2.1 For any subset S of K^n, the code C = <S> generated by S consists precisely of the following words: the zero word, all words in S, and all sums of two or more words in S.
Example 2.2.2 Let S = {0100, 0011, 1100}. Then the code C = <S> generated by S consists of

0000, 0100, 0011, 1100, 0100 + 0011 = 0111, 0100 + 1100 = 1000, 0011 + 1100 = 1111, 0100 + 0011 + 1100 = 1011;

that is, C = <S> = {0000, 0100, 0011, 1100, 0111, 1000, 1111, 1011}.
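Theorem 2.2.1 translates directly into a brute-force computation of <S>: take the zero word together with the sum of every nonempty subset of S. A Python sketch (ours; the names add and span are hypothetical):

```python
from functools import reduce
from itertools import combinations

def add(v, w):
    """Componentwise sum mod 2 of two binary words."""
    return ''.join('1' if a != b else '0' for a, b in zip(v, w))

def span(S):
    """<S> per Theorem 2.2.1: the zero word, the words of S,
    and all sums of two or more words of S."""
    S = list(S)
    n = len(S[0])
    code = {'0' * n}
    for r in range(1, len(S) + 1):
        for subset in combinations(S, r):
            code.add(reduce(add, subset))
    return code

print(sorted(span(['0100', '0011', '1100'])))
# ['0000', '0011', '0100', '0111', '1000', '1011', '1100', '1111'], as in Example 2.2.2
```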
Exercises

2.2.3 For each of the following sets S, list the elements of the linear code <S>.

(a) S = {010, 011, 111}
(b) S = {1010, 0101, 1111}
(c) S = {0101, 1010, 1100}
(d) S = {1000, 0100, 0010, 0001}
(e) S = {11000, 01111, 11110, 01010}
(f) S = {10101, 01010, 11111, 00011, 10110}
If v = (a1, a2, ..., an) and w = (b1, b2, ..., bn) are vectors in K^n, we define the scalar product or dot product v · w of v and w as

v · w = a1b1 + a2b2 + ... + anbn.

Note that v · w is a scalar, not a vector. For instance, in K^5,

11001 · 01101 = 1·0 + 1·1 + 0·1 + 0·0 + 1·1 = 0 + 1 + 0 + 0 + 1 = 0,

since 1 + 1 = 0 in K.
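The dot product over K is ordinary integer arithmetic reduced mod 2. A Python sketch (ours; the name dot is hypothetical):

```python
def dot(v, w):
    """Scalar product over K: sum of the products a_i * b_i, reduced mod 2."""
    return sum(int(a) * int(b) for a, b in zip(v, w)) % 2

print(dot('11001', '01101'))  # 0, since 0 + 1 + 0 + 0 + 1 = 0 in K
```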
Exercises

2.2.4 Construct examples in K^4 of each of the following rules:

(a) u · (v + w) = u · v + u · w
(b) a(v · w) = (av) · w = v · (aw).

2.2.5 Prove that the two rules in Exercise 2.2.4 hold in K^n.
Vectors v and w are orthogonal if v · w = 0. The example above shows that v = 11001 and w = 01101 are orthogonal in K^5. For a given set S of vectors in K^n, we say a vector v is orthogonal to the set S if v · w = 0 for all w in S; that is, v is orthogonal to every vector in S. The set of all vectors orthogonal to S is denoted by S⊥ and is called the orthogonal complement of S.

In linear algebra it is shown that for any subset S of a vector space V, the orthogonal complement S⊥ is a subspace of V. For the vector space K^n, if C = <S>, then we write C⊥ = S⊥ and call C⊥ the dual code of C.
Example 2.2.6 For S = {0100, 0101}, we compute the dual code C⊥ = S⊥. We must find all words v = (x, y, z, w) in K^4 such that both of the equations

v · 0100 = 0 and v · 0101 = 0

hold. Computing the scalar products, we have y = 0 and y + w = 0. Thus y = w = 0, but x and z can be either 0 or 1. Writing down all such choices for v, we get

C⊥ = S⊥ = {0000, 0010, 1000, 1010}.
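For small n, the orthogonal complement can also be found by brute force: test every word of K^n against every word of S. A Python sketch (ours; the names dot and complement are hypothetical):

```python
from itertools import product

def dot(v, w):
    """Scalar product over K (mod 2)."""
    return sum(int(a) * int(b) for a, b in zip(v, w)) % 2

def complement(S, n):
    """S-perp: all words of K^n orthogonal to every word of S."""
    words = (''.join(bits) for bits in product('01', repeat=n))
    return [v for v in words if all(dot(v, w) == 0 for w in S)]

print(complement(['0100', '0101'], 4))
# ['0000', '0010', '1000', '1010'], as in Example 2.2.6
```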
Exercises
2.2.7 Find the dual code C⊥ for each of the codes C = <S> in Exercise 2.2.3.

2.2.8 Find an example of a nonzero word v such that v · v = 0. What can you say about the weight of such a word?

2.2.9 For any subset S of a vector space V, (S⊥)⊥ = <S>. Use the example above to construct an example of this fact in K^4.

2.2.10 Prove that <S> ⊆ (S⊥)⊥. (In fact (S⊥)⊥ = <S>; for a linear code C, this means (C⊥)⊥ = C.)
2.3 Independence, Basis, Dimension
We review several important concepts from linear algebra and illustrate how to apply these concepts to linear codes. The main objective is to find an efficient way to describe a linear code without having to list all the codewords.

A set S = {v1, v2, ..., vk} of vectors is linearly dependent if there are scalars a1, a2, ..., ak, not all zero, such that

a1v1 + a2v2 + ... + akvk = 0.

Otherwise the set S is linearly independent.

The test for linear independence, then, is to form the vector equation above using arbitrary scalars. If this equation forces all the scalars a1, a2, ..., ak to be 0, then the set S is linearly independent. If at least one ai can be chosen to be nonzero, then S is linearly dependent.
Example 2.3.1 We test S = {1001, 1101, 1011} for linear independence. Let a, b, and c be scalars (digits) such that

a(1001) + b(1101) + c(1011) = 0000.

Equating components on both sides yields the scalar equations

a + b + c = 0, b = 0, c = 0, a + b + c = 0.

These equations force a = b = c = 0. Therefore S is a linearly independent set of words in K^4.
Example 2.3.2 We test S = {110, 011, 101, 111} for linear independence. Consider

a(110) + b(011) + c(101) + d(111) = 000.

This yields the system of scalar equations

a + c + d = 0
a + b + d = 0
b + c + d = 0.

Adding these three equations gives d = 0. Now we have a + c = 0, a + b = 0, and b + c = 0. Thus we can choose a = b = c = 1. Therefore S is a linearly dependent set.
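Since the only scalars in K are 0 and 1, the independence test can be run exhaustively: try every nonzero choice of coefficients and see whether any combination sums to the zero word. A Python sketch (ours; the name independent is hypothetical):

```python
from itertools import product

def independent(S):
    """True iff no nonzero choice of scalars a1, ..., ak makes
    a1*v1 + ... + ak*vk equal to the zero word (scalars are 0 or 1)."""
    S = list(S)
    n = len(S[0])
    for coeffs in product([0, 1], repeat=len(S)):
        if any(coeffs):
            total = [0] * n
            for a, v in zip(coeffs, S):
                if a:
                    total = [(t + int(b)) % 2 for t, b in zip(total, v)]
            if all(t == 0 for t in total):
                return False  # a nonzero combination sums to the zero word
    return True

print(independent(['1001', '1101', '1011']))      # True  (Example 2.3.1)
print(independent(['110', '011', '101', '111']))  # False (Example 2.3.2)
```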
In linear algebra it is shown that any set of vectors S ≠ {0} contains a largest linearly independent subset. The next example shows how such a subset may be found.
Example 2.3.3 Let S = {110, 011, 101, 111}. The last example shows that S is linearly dependent. In fact, we found that

1(110) + 1(011) + 1(101) + 0(111) = 000,

so we can solve for 101 as a linear combination of the other words in S:

101 = 1(110) + 1(011) + 0(111).

In the dependent set S, if we take the words in the order given, we come to 101 as the first word which is dependent on, that is, is a linear combination of, the preceding words 110 and 011 in S. Discarding this word, we obtain a new set S' = {110, 011, 111}. Now S' can be tested for linear independence. If S' is linearly dependent, we discard the first word which is a linear combination of the preceding words, thus obtaining a new set S''. This process may be repeated until we find a new set which is linearly independent; such a set is always a largest linearly independent subset of the given set S. In the present example, this set is S'.
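The discarding procedure of Example 2.3.3 can be sketched by keeping a running span of the words retained so far: a word is discarded exactly when it already lies in that span. A Python sketch (ours; the name extract_independent is hypothetical):

```python
def extract_independent(S):
    """Largest linearly independent subset of S, discarding each word that
    is a linear combination of the words kept before it."""
    def add(v, w):
        return ''.join('1' if a != b else '0' for a, b in zip(v, w))
    S = list(S)
    n = len(S[0])
    spanned = {'0' * n}  # span of the words kept so far
    kept = []
    for v in S:
        if v not in spanned:
            spanned |= {add(v, w) for w in spanned}
            kept.append(v)
    return kept

print(extract_independent(['110', '011', '101', '111']))
# ['110', '011', '111']: 101 is discarded, as in Example 2.3.3
```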
Exercises
2.3.4 Test each of the following sets S for linear independence. If the set is linearly dependent, extract from S a largest linearly independent subset.

(a) S = {1101, 1110, 1011}
(b) S = {101, 011, 110, 010}
(c) S = {1101, 0111, 1100, 0011}
(d) S = {1000, 0100, 0010, 0001}
(e) S = {1000, 1100, 1110, 1111}
(f) S = {1100, 1010, 1001, 0101}
(g) S = {0110, 1010, 1100, 0011, 1111}
(h) S = {111000, 000111, 101010, 010101}
(i) S = {00000000, 10101010, 01010101, 11111111}.
In Exercise 2.3.4 (i), S is found to be a linearly dependent set. Note that S contains the zero word. It is always true that any set of vectors containing the zero vector is linearly dependent.
A nonempty subset B of vectors from a vector space V is a basis for V if both:

1) B spans V (that is, <B> = V), and
2) B is a linearly independent set.

Note that any linearly independent set B is automatically a basis for <B>. Also, since any linearly dependent set S of vectors that contains a non-zero word always contains a largest independent subset B, we can extract from S a basis B for <S>. If S = {0}, then we say that the basis of S is the empty set, ∅.
Example 2.3.5 Let S = {1001, 1101, 1011}. In Example 2.3.1 we found that S is linearly independent. Therefore S is a basis for the code C = <S> = {0000, 1001, 1101, 1011, 0100, 0010, 0110, 1111}, which is a subspace of K^4.
Example 2.3.6 Let S = {110, 011, 101, 111}. In Example 2.3.2 we found that S is linearly dependent. But in Example 2.3.3 we extracted a maximal linearly independent subset B = S' = {110, 011, 111} of S. Hence B is a basis for the code C = <S>.
These examples illustrate how to obtain a basis for the code C = <S> generated by a nonempty subset S of K^n. To find a basis for the dual code C⊥, extract a largest linearly independent subset from C⊥, following the procedure in Example 2.3.3.
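Putting the pieces together, a basis for C⊥ = S⊥ can be computed by brute force for small n: list S⊥, then extract an independent subset as in Example 2.3.3. A Python sketch (ours; the function names dual_code and extract_basis are hypothetical), using S from Example 2.2.6:

```python
from itertools import product

def dot(v, w):
    """Scalar product over K (mod 2)."""
    return sum(int(a) * int(b) for a, b in zip(v, w)) % 2

def add(v, w):
    """Componentwise sum mod 2."""
    return ''.join('1' if a != b else '0' for a, b in zip(v, w))

def dual_code(S):
    """C-perp for C = <S>: all words orthogonal to every word of S."""
    n = len(S[0])
    words = (''.join(bits) for bits in product('01', repeat=n))
    return [v for v in words if all(dot(v, w) == 0 for w in S)]

def extract_basis(words):
    """Largest linearly independent subset, taking the words in the given order."""
    n = len(words[0])
    spanned, basis = {'0' * n}, []
    for v in words:
        if v not in spanned:
            spanned |= {add(v, w) for w in spanned}
            basis.append(v)
    return basis

S = ['0100', '0101']
print(extract_basis(dual_code(S)))
# ['0010', '1000']: a basis for the dual code of Example 2.2.6
```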