Chapter 10. Error Detection and Correction
10.1
Notes
Data can be corrupted during transmission.
Some applications (actually, most applications) require that errors be detected and corrected.
10.2
INTRODUCTION
Let us first discuss some issues related, directly
or indirectly, to error detection and correction.
10.3
Types of Errors
Single-bit error
Only 1 bit in the data unit (packet, frame, cell) has changed, either from 0 to 1 or from 1 to 0.
Burst error
2 or more bits in the data unit have changed.
More likely to occur than a single-bit error because the duration of noise is normally longer than the duration of 1 bit.
10.4
Redundancy
To detect or correct errors, we need to send extra (redundant) bits with the data.
The receiver uses this extra information to detect or correct the error.
Detection
Asks only whether any error has occurred: a yes/no answer. If yes, the data is retransmitted (ARQ).
Correction
Determines both the number of errors and the location of the errors in a message.
Forward error correction (FEC).
10.5
Coding
Encoder vs. decoder
The encoder and decoder must agree on a detection/correction method a priori (in advance).
10.6
Modulo Arithmetic
In modulo-N arithmetic, we use only the integers in the range 0 to N − 1, inclusive.
Calculation
If a number is greater than N − 1, it is divided by N and the remainder is the result.
If it is negative, as many Ns as needed are added to make it positive.
Example in modulo-12
15 mod 12 = 3
−3 mod 12 = 9
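A minimal sketch of this reduction (Python is used here purely for illustration; the function name is ours), reproducing the two modulo-12 results above:

def mod_n(a: int, n: int) -> int:
    """Reduce an integer into the range 0 .. N-1.

    A too-large number is divided by N and the remainder kept; a negative
    number has N added as many times as needed. Python's % operator already
    behaves this way for a positive modulus.
    """
    return a % n

print(mod_n(15, 12))   # 3
print(mod_n(-3, 12))   # 9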
10.7
Modulo-2 Arithmetic
Possible numbers are {0, 1}.
Arithmetic
Addition: 0 + 0 = 0, 0 + 1 = 1, 1 + 0 = 1, 1 + 1 = 0
Subtraction: 0 − 0 = 0, 0 − 1 = 1, 1 − 0 = 1, 1 − 1 = 0
Surprisingly, addition and subtraction give the same result.
XOR (exclusive OR) can replace both addition and subtraction.
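Since addition and subtraction coincide in modulo-2 arithmetic, a single XOR implements both. A small illustrative sketch (names are ours):

def add_mod2(a: int, b: int) -> int:
    """Modulo-2 addition of single bits: 0+0=0, 0+1=1, 1+0=1, 1+1=0."""
    return a ^ b

def sub_mod2(a: int, b: int) -> int:
    """Modulo-2 subtraction gives exactly the same table as addition."""
    return a ^ b

for a in (0, 1):
    for b in (0, 1):
        assert add_mod2(a, b) == sub_mod2(a, b) == (a + b) % 2 == (a - b) % 2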
10.8
BLOCK CODING
In block coding, we divide our message into blocks, each of k bits, called datawords. We add r redundant bits to each block to make the length n = k + r. The resulting n-bit blocks are called codewords.
10.9
Datawords and codewords in
block coding
10.10
Example 10.1
The 4B/5B block coding discussed in Chapter 4 is a good example of this type of coding.
In this coding scheme, k = 4 and n = 5. As we saw, we have 2^k = 16 datawords and 2^n = 32 codewords.
We saw that 16 out of the 32 codewords are used for message transfer and the rest are either used for other purposes or unused.
10.11
Error Detection
A receiver can detect a change in the original codeword if
The receiver has a list of valid codewords, and
The original codeword has changed to an invalid one.
10.12
Example 10.2
Let us assume that k = 2 and n = 3, and assume the coding scheme of Table 10.1 (datawords 00, 01, 10, 11 encoded as codewords 000, 011, 101, 110).
Assume the sender encodes the dataword 01 as 011 and sends it to the receiver. Consider the following cases:
The receiver receives 011. It is a valid codeword. The
receiver extracts the dataword 01 from it.
The codeword is corrupted during transmission, and 111 is
received. This is not a valid codeword and is discarded.
The codeword is corrupted during transmission, and 000 is
received. This is a valid codeword. The receiver incorrectly
extracts the dataword 00. Two corrupted bits have made
the error undetectable.
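A small sketch of the receiver's detection logic in this example, using the Table 10.1 mapping; the dictionary layout and function name are illustrative, not from the text:

# Table 10.1: dataword -> codeword (k = 2, n = 3)
TABLE_10_1 = {"00": "000", "01": "011", "10": "101", "11": "110"}
VALID = {cw: dw for dw, cw in TABLE_10_1.items()}

def receive(codeword: str):
    """Return the extracted dataword, or None if the codeword is invalid."""
    return VALID.get(codeword)   # None means: error detected, codeword discarded

print(receive("011"))   # '01'  -> valid, dataword extracted
print(receive("111"))   # None  -> invalid, discarded
print(receive("000"))   # '00'  -> a 2-bit error slips through undetected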
10.13
Note
An error-detecting code can detect only the types of errors for which it is designed; other types of errors may remain undetected.
The code in the previous example
Is designed for detecting a 1-bit error,
Cannot detect 2-bit errors, and
Cannot find the location of the 1-bit error.
10.14
Error Correction
The receiver needs to find (or guess) the original codeword that was sent.
This needs more redundancy than error detection.
10.15
Example 10.3
Add 3 redundant bits to each 2-bit dataword to make 5-bit codewords, as in Table 10.2 (00 → 00000, 01 → 01011, 10 → 10101, 11 → 11110):
Example
Assume the dataword is 01.
The sender creates the codeword 01011.
The codeword is corrupted during transmission, and 01001 is
received. The receiver
Finds that the received codeword is not in the table.
Assuming that there is only 1 bit corrupted, uses the following
strategy to guess the correct dataword.
Comparing the received codeword with the first codeword in the table
(01001 versus 00000), the receiver decides that the first codeword is not
the one that was sent because there are two different bits.
By the same reasoning, the original codeword cannot be the third or
fourth one in the table.
The original codeword must be the second one in the table because this
is the only one that differs from the received codeword by 1 bit. The
receiver replaces 01001 with 01011 and consults the table to find the
dataword 01.
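A sketch of this minimum-distance guessing strategy for the Table 10.2 code; the helper names are ours:

# Table 10.2: dataword -> codeword (k = 2, n = 5, three redundant bits)
TABLE_10_2 = {"00": "00000", "01": "01011", "10": "10101", "11": "11110"}

def distance(x: str, y: str) -> int:
    """Number of bit positions in which x and y differ."""
    return sum(a != b for a, b in zip(x, y))

def correct(received: str):
    """Single-error correction: pick the unique codeword at distance <= 1."""
    candidates = [dw for dw, cw in TABLE_10_2.items() if distance(cw, received) <= 1]
    return candidates[0] if len(candidates) == 1 else None

print(correct("01001"))   # '01': 01001 is replaced by 01011 and decoded as 01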
10.16
Hamming Distance
The Hamming distance between two words is the number of differences between corresponding bits.
The Hamming distance d(000, 011) is 2 because 000 ⊕ 011 = 011 (two 1s).
The Hamming distance d(10101, 11110) is 3 because 10101 ⊕ 11110 = 01011 (three 1s).
The minimum Hamming distance is the smallest Hamming distance between all possible pairs in a set of words.
10.17
Example 10.5
Find the minimum Hamming distance of the coding scheme in Table 10.1.
Solution
We first find all the Hamming distances:
d(000, 011) = 2, d(000, 101) = 2, d(000, 110) = 2,
d(011, 101) = 2, d(011, 110) = 2, d(101, 110) = 2
The dmin in this case is 2.
10.18
Example 10.6
Find the minimum Hamming distance of the coding scheme in Table 10.2.
Solution
We first find all the Hamming distances:
d(00000, 01011) = 3, d(00000, 10101) = 3, d(00000, 11110) = 4,
d(01011, 10101) = 4, d(01011, 11110) = 3, d(10101, 11110) = 3
The dmin in this case is 3.
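A short sketch that reproduces Examples 10.5 and 10.6 by computing all pairwise Hamming distances; the names are illustrative:

from itertools import combinations

def hamming(x: str, y: str) -> int:
    """Hamming distance: count the positions where the bits differ."""
    return sum(a != b for a, b in zip(x, y))

def d_min(codewords) -> int:
    """Smallest Hamming distance over all pairs of codewords."""
    return min(hamming(a, b) for a, b in combinations(codewords, 2))

print(d_min(["000", "011", "101", "110"]))          # 2  (Example 10.5)
print(d_min(["00000", "01011", "10101", "11110"]))  # 3  (Example 10.6)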
10.19
Hamming Distance and
Detection
To guarantee the detection of up to s-bit errors in all cases, the minimum Hamming distance in a block code must be dmin = s + 1.
10.20
Example 10.7
The minimum Hamming distance for our first code
scheme (Table 10.1) is 2. This code guarantees
detection of only a single error. For example, if the
third codeword (101) is sent and one error occurs,
the received codeword does not match any valid
codeword. If two errors occur, however, the
received codeword may match a valid codeword and
the errors are not detected.
10.21
Example 10.8
Our second block code scheme (Table 10.2) has dmin
= 3. This code can detect up to two errors. Again,
we see that when any of the valid codewords is sent,
two errors create a codeword which is not in the
table of valid codewords. The receiver cannot be
fooled.
However, some combinations of three errors
change a valid codeword to another valid
codeword. The receiver accepts the received
codeword and the errors are undetected.
10.22
Minimum Distance and
Correction
To guarantee correction of up to t errors in all cases, the minimum Hamming distance in a block code must be dmin = 2t + 1.
10.23
Example 10.9
A code scheme has a Hamming distance dmin = 4.
What is the error detection and correction
capability of this scheme?
Solution
This code guarantees the detection of up to three errors (s = 3), but it can correct up to one error. In other words, if this code is used for error correction, part of its capability is wasted. Error correction codes need to have an odd minimum distance (3, 5, 7, ...).
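A tiny sketch of the two rules (dmin = s + 1 for detection, dmin = 2t + 1 for correction) applied to Example 10.9; the function name is ours:

def capability(d_min: int):
    """Guaranteed detection (s) and correction (t) capability of a block code."""
    s = d_min - 1          # detects up to s errors:  dmin = s + 1
    t = (d_min - 1) // 2   # corrects up to t errors: dmin = 2t + 1
    return s, t

print(capability(4))   # (3, 1): detect up to 3 errors, correct up to 1 (Example 10.9)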
10.24
LINEAR BLOCK CODES
Almost all block codes used today belong to a
subset called linear block codes. A linear block
code is a code in which the exclusive OR
(addition modulo-2) of two valid codewords
creates another valid codeword.
10.25
Note
In a linear block code, the exclusive OR (XOR) of any two valid codewords creates another valid codeword.
10.26
Example 10.10
Let us see if the two codes we defined in Table 10.1
and Table 10.2 belong to the class of linear block
codes.
1.The scheme in Table 10.1 is a linear block code
because the result of XORing any codeword with
any other codeword is a valid codeword. For
example, the XORing of the second and third
codewords creates the fourth one.
2.The scheme in Table 10.2 is also a linear block
code. We can create all four codewords by XORing
two other codewords.
10.27
Example 10.11
In our first code (Table 10.1), the numbers of 1s in
the nonzero codewords are 2, 2, and 2. So the
minimum Hamming distance is dmin = 2. In our
second code (Table 10.2), the numbers of 1s in the
nonzero codewords are 3, 3, and 4. So in this code
we have dmin = 3.
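A sketch that checks the linearity (XOR closure) of both tables and finds dmin as the minimum weight of a nonzero codeword, as in Example 10.11; the names are illustrative:

def xor_words(x: str, y: str) -> str:
    """Bitwise XOR of two equal-length codewords."""
    return "".join(str(int(a) ^ int(b)) for a, b in zip(x, y))

def is_linear(codewords) -> bool:
    """Linear block code: the XOR of any two valid codewords is also valid."""
    cws = set(codewords)
    return all(xor_words(a, b) in cws for a in cws for b in cws)

def d_min_linear(codewords) -> int:
    """For a linear code, dmin equals the minimum weight of a nonzero codeword."""
    return min(cw.count("1") for cw in codewords if "1" in cw)

table_10_1 = ["000", "011", "101", "110"]
table_10_2 = ["00000", "01011", "10101", "11110"]
print(is_linear(table_10_1), d_min_linear(table_10_1))   # True 2
print(is_linear(table_10_2), d_min_linear(table_10_2))   # True 3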
10.28
Simple Parity-Check Code
A simple parity-check code is a single-bit error-detecting code in which n = k + 1 with dmin = 2.
A simple parity-check code can detect an odd number of errors.
10.29
Encoder and decoder for simple
parity-check code
In modulo-2 arithmetic,
r0 = a3 + a2 + a1 + a0 (computed by the sender)
s0 = b3 + b2 + b1 + b0 + q0 (computed by the receiver)
Note that the receiver adds all 5 bits. The result s0 is called the syndrome.
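A sketch of the parity-check encoder and syndrome computation above, using the codeword layout a3 a2 a1 a0 r0 implied by Example 10.12; the function names are ours:

def parity_encode(dataword: str) -> str:
    """Append r0 = a3 + a2 + a1 + a0 (modulo 2) to a 4-bit dataword."""
    r0 = sum(int(b) for b in dataword) % 2
    return dataword + str(r0)

def parity_syndrome(codeword: str) -> int:
    """Receiver adds all 5 bits (modulo 2); the result is the syndrome s0."""
    return sum(int(b) for b in codeword) % 2

print(parity_encode("1011"))     # '10111'  (Example 10.12)
print(parity_syndrome("10111"))  # 0 -> accept, dataword 1011 is extracted
print(parity_syndrome("10011"))  # 1 -> error detected, no dataword created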
10.30
Example 10.12
Let us look at some transmission scenarios. Assume the
sender sends the dataword 1011. The codeword created
from this dataword is 10111, which is sent to the receiver.
We examine five cases:
1. No error occurs; the received codeword is 10111. The syndrome is 0. The dataword 1011 is created.
2. One single-bit error changes a1. The received codeword is 10011. The syndrome is 1. No dataword is created.
3. One single-bit error changes r0. The received codeword is 10110. The syndrome is 1. No dataword is created.
10.31
Example 10.12 (continued)
4. An error changes r0 and a second error changes a3. The received codeword is 00110. The syndrome is 0. The dataword 0011 is created at the receiver. Note that here the dataword is wrongly created because the syndrome value is 0.
5. Three bits (a3, a2, and a1) are changed by errors. The received codeword is 01011. The syndrome is 1. The dataword is not created. This shows that the simple parity check, guaranteed to detect one single error, can also find any odd number of errors.
10.32
Hamming Code
Error-correcting codes.
The relationship between m (the number of check bits) and n in these codes is n = 2^m − 1.
10.33
Figure 10.11 Two-dimensional parity-check code
10.36
Table 10.4 Hamming code C(7, 4)
10.37
Figure 10.12 The structure of the encoder and decoder for a Hamming code
10.38
Table 10.5 Logical decision made by the correction logic analyzer
10.39
Example 10.13
Let us trace the path of three datawords from the
sender to the destination:
1. The dataword 0100 becomes the codeword
0100011. The codeword 0100011 is received. The
syndrome is 000, the final dataword is 0100.
2. The dataword 0111 becomes the codeword 0111001. The corrupted codeword 0011001 is received. The syndrome is 011. After flipping b2 (changing the 0 back to 1), the final dataword is 0111.
3. The dataword 1101 becomes the codeword 1101000. Two errors corrupt it and 0001000 is received. The syndrome is 101. After flipping b0, we get 0000, the wrong dataword. This shows that our code cannot correct two errors.
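A sketch of a C(7, 4) Hamming encoder and decoder consistent with Example 10.13; the parity and syndrome equations and the codeword layout b3 b2 b1 b0 q2 q1 q0 are reconstructed from the example's values (they match all three cases), and the names are ours:

# Hamming code C(7,4): codeword layout b3 b2 b1 b0 q2 q1 q0
# (the dataword a3 a2 a1 a0 followed by three check bits).
def hamming_encode(dataword: str) -> str:
    a3, a2, a1, a0 = (int(b) for b in dataword)
    r0 = (a2 + a1 + a0) % 2
    r1 = (a3 + a2 + a1) % 2
    r2 = (a1 + a0 + a3) % 2
    return dataword + f"{r2}{r1}{r0}"

# Correction logic (Table 10.5): which bit position to flip for each syndrome.
FLIP = {"000": None, "001": 6, "010": 5, "011": 1, "100": 4, "101": 3, "110": 0, "111": 2}

def hamming_decode(codeword: str) -> str:
    b3, b2, b1, b0, q2, q1, q0 = (int(b) for b in codeword)
    s0 = (b2 + b1 + b0 + q0) % 2
    s1 = (b3 + b2 + b1 + q1) % 2
    s2 = (b1 + b0 + b3 + q2) % 2
    pos = FLIP[f"{s2}{s1}{s0}"]
    bits = list(codeword)
    if pos is not None:                 # flip the single bit the syndrome points to
        bits[pos] = "1" if bits[pos] == "0" else "0"
    return "".join(bits[:4])            # final dataword

print(hamming_encode("0100"))           # '0100011'  (Example 10.13)
print(hamming_decode("0100011"))        # '0100': syndrome 000, no error
print(hamming_decode("0011001"))        # '0111': syndrome 011, b2 corrected
print(hamming_decode("0001000"))        # '0000': two errors, wrongly "corrected"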
10.40
Example 10.14
We need a dataword of at least 7 bits. Calculate
values of k and n that satisfy this requirement.
Solution
We need to make k = n − m greater than or equal to 7, or 2^m − 1 − m ≥ 7.
1. If we set m = 3, the result is n = 2^3 − 1 = 7 and k = 7 − 3 = 4, which is not acceptable.
2. If we set m = 4, then n = 2^4 − 1 = 15 and k = 15 − 4 = 11, which satisfies the condition. So the code is C(15, 11).
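A tiny sketch of this search for the smallest m; the function name is ours:

def smallest_hamming_code(k_needed: int):
    """Find the smallest m with k = (2**m - 1) - m >= k_needed."""
    m = 1
    while (2 ** m - 1) - m < k_needed:
        m += 1
    n = 2 ** m - 1
    return m, n, n - m

print(smallest_hamming_code(7))   # (4, 15, 11) -> C(15, 11), as in Example 10.14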
10.41
Figure 10.13 Burst error correction using Hamming code
10.42
CYCLIC CODES
Cyclic codes are special linear block codes with
one extra property. In a cyclic code, if a
codeword is cyclically shifted (rotated), the
result is another codeword.
10.43
Cyclic Redundancy Check (CRC)
Widely used in data communications.
Example of a CRC code: C(7, 4).
10.44
Architecture of CRC
10.45
Figure 10.15 Division in CRC encoder
10.46
CRC Decoder
The decoder does the same division as the encoder.
The remainder of the division is the syndrome.
If there is no error during transmission, the syndrome is zero. The dataword is separated from the received codeword and accepted.
If the syndrome is non-zero, then errors occurred during transmission.
Question
What if errors occur during transmission, but the syndrome is still zero?
10.47
Figure 10.16 Division in the CRC decoder for two cases
10.48
Figure 10.17 Hardwired design of the divisor in CRC
10.49
Figure 10.18 Simulation of division in CRC encoder
10.50
Figure 10.19 The CRC encoder design using shift registers
10.51
Figure 10.20 General design of encoder and decoder of a CRC code
10.52
Polynomials
A binary vector can be represented by a polynomial.
The coefficients are either 0 or 1.
The power of each term represents the position of the bit (for example, 1011 corresponds to x^3 + x + 1).
10.53
Polynomial Notation of CRC
The sender S and receiver R agree upon a generator polynomial g(x) of degree n a priori.
They use binary and modulo-2 arithmetic:
no carry for addition, no borrow for subtraction;
addition = subtraction = exclusive OR.
Sender: divide x^n f(x) by g(x) so that x^n f(x) = g(x)·q(x) + r(x), then transmit the codeword x^n f(x) − r(x), which equals g(x)·q(x) and is therefore divisible by g(x).
Channel: may add an error polynomial e(x), so the receiver gets x^n f(x) − r(x) + e(x).
Receiver: check whether the received polynomial x^n f(x) − r(x) + e(x) is divisible by g(x).
10.54
CRC Division Using
Polynomials
10.55
Equivalence of polynomial and binary representations:
Generator g(x): x^3 + x + 1, binary 1011
Dataword f(x): x^3 + x^2, binary 1100
Shifted dataword x^3 f(x): x^6 + x^5, binary 1100000
Division x^3 f(x) ÷ g(x): quotient x^3 + x^2 + x, remainder r(x) = x
(in binary: 1100000 ÷ 1011 gives quotient 1110 and remainder 010)
Codeword x^3 f(x) − r(x) (equal to x^3 f(x) + r(x) in modulo-2): x^6 + x^5 + x, binary 1100010
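A sketch of CRC encoding and checking by modulo-2 long division on bit strings, reproducing the worked example above (generator 1011, dataword 1100, codeword 1100010); the names are illustrative:

def xor_bits(a: str, b: str) -> str:
    """Bitwise XOR of two equal-length bit strings (modulo-2 subtraction)."""
    return "".join("1" if x != y else "0" for x, y in zip(a, b))

def crc_remainder(bits: str, generator: str) -> str:
    """Modulo-2 long division: remainder of the augmented dataword by the generator."""
    n = len(generator) - 1                  # degree of g(x)
    work = list(bits + "0" * n)             # append n zero bits, i.e. x^n * f(x)
    for i in range(len(bits)):
        if work[i] == "1":                  # divide only when the leading bit is 1
            work[i:i + n + 1] = xor_bits("".join(work[i:i + n + 1]), generator)
    return "".join(work[-n:])

def crc_encode(dataword: str, generator: str) -> str:
    """Codeword = shifted dataword plus the CRC remainder."""
    return dataword + crc_remainder(dataword, generator)

def crc_check(codeword: str, generator: str) -> bool:
    """Receiver divides the received codeword; a zero syndrome means accept."""
    n = len(generator) - 1
    work = list(codeword)
    for i in range(len(codeword) - n):
        if work[i] == "1":
            work[i:i + n + 1] = xor_bits("".join(work[i:i + n + 1]), generator)
    return all(b == "0" for b in work[-n:])

print(crc_encode("1100", "1011"))      # '1100010'  (the worked example above)
print(crc_check("1100010", "1011"))    # True: syndrome 000
print(crc_check("1100110", "1011"))    # False: error detected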
10.56
Note
The divisor in a cyclic code is normally called the generator polynomial or simply the generator.
In a cyclic code, the syndrome s(x) is the remainder of dividing the received polynomial x^n f(x) − r(x) + e(x) by g(x):
If s(x) ≠ 0, one or more bits are corrupted.
If s(x) = 0, either
no bit is corrupted, or
some bits are corrupted, but the decoder failed to detect them.
In a cyclic code, those errors e(x) that are divisible by g(x) are not caught.
10.57
Capability of CRC
If the generator has more than one term and the coefficient of x^0 is 1, all single-bit errors can be caught.
If the generator cannot divide x^t + 1 (t between 0 and n − 1), then all isolated double errors can be detected.
A generator that contains a factor of x + 1 can detect all odd-numbered errors.
For a burst error of length L and a generator of degree r:
All burst errors with L ≤ r will be detected.
All burst errors with L = r + 1 will be detected with probability 1 − (1/2)^(r−1).
All burst errors with L > r + 1 will be detected with probability 1 − (1/2)^r.
10.58
Example 10.15
Which of the following g(x) values guarantees that
a single-bit error is caught? For each case, what is
the error that cannot be caught?
a. x + 1
b. x^3
c. 1
Solution
a. No x^i is divisible by x + 1. Any single-bit error can be caught.
b. If i is equal to or greater than 3, x^i is divisible by g(x). All single-bit errors in positions 1 to 3 are caught.
c. All values of i make x^i divisible by g(x). No single-bit error can be caught. This g(x) is useless.
10.59
Figure 10.23 Representation of two isolated single-bit errors using polynomials
10.60
Example 10.16
Find the status of the following generators related to two
isolated, single-bit errors.
a. x + 1
b. x^4 + 1
c. x^7 + x^6 + 1
d. x^15 + x^14 + 1
Solution
a. This is a very poor choice for a generator. Any two errors next to each other cannot be detected.
b. This generator cannot detect two errors that are four positions apart.
c. This is a good choice for this purpose.
d. This polynomial cannot divide x^t + 1 if t is less than 32,768. A codeword with two isolated errors up to 32,768 bits apart can be detected by this generator.
10.61
Example 10.17
Find the suitability of the following generators in
relation to burst errors of different lengths.
a. x^6 + 1
b. x^18 + x^7 + x + 1
c. x^32 + x^23 + x^7 + 1
Solution
a. This generator can detect all burst errors with a length less than or equal to 6 bits; 3 out of 100 burst errors with length 7 will slip by; 16 out of 1000 burst errors of length 8 or more will slip by.
10.62
Example 10.17 (continued)
b. This generator can detect all burst errors with a length less than or equal to 18 bits; 8 out of 1 million burst errors with length 19 will slip by; 4 out of 1 million burst errors of length 20 or more will slip by.
c. This generator can detect all burst errors with a length less than or equal to 32 bits; 5 out of 10 billion burst errors with length 33 will slip by; 3 out of 10 billion burst errors of length 34 or more will slip by.
10.63
Good CRC Generator
A good polynomial generator needs to have the following characteristics:
It should have at least two terms.
The coefficient of the term x^0 should be 1.
It should not divide x^t + 1, for t between 2 and n − 1.
It should have the factor x + 1.
10.64
Standard Polynomials
10.65
CHECKSUM
The last error detection method we discuss here
is called the checksum. The checksum is used in
the Internet by several protocols although not at
the data link layer. However, we briefly discuss
it here to complete our discussion on error
checking.
10.66
Example 10.18
Suppose our data is a list of five 4-bit numbers that
we want to send to a destination. In addition to
sending these numbers, we send the sum of the
numbers. For example, if the set of numbers is (7,
11, 12, 0, 6), we send (7, 11, 12, 0, 6, 36), where 36
is the sum of the original numbers. The receiver
adds the five numbers and compares the result with
the sum. If the two are the same, the receiver
assumes no error, accepts the five numbers, and
discards the sum. Otherwise, there is an error
somewhere and the data are not accepted.
10.67
Example 10.19
We can make the job of the receiver easier if we
send the negative (complement) of the sum, called
the checksum. In this case, we send (7, 11, 12, 0, 6,
−36). The receiver can add all the numbers
received (including the checksum). If the result is
0, it assumes no error; otherwise, there is an error.
10.68
Example 10.20
How can we represent the number 21 in ones
complement arithmetic using only four bits?
Solution
The number 21 in binary is 10101 (it needs five bits). We can wrap the leftmost bit and add it to the four rightmost bits. We have (0101 + 1) = 0110, or 6.
10.69
Example 10.21
How can we represent the number 6 in ones
complement arithmetic using only four bits?
Solution
In ones complement arithmetic, the negative or
complement of a number is found by inverting all
bits. Positive 6 is 0110; negative 6 is 1001. If we
consider only unsigned numbers, this is 9. In other
words, the complement of 6 is 9. Another way to
find the complement of a number in ones
complement arithmetic is to subtract the number
from 2^n − 1 (16 − 1 in this case).
10.70
Example 10.22
Let us redo Exercise 10.19 using ones complement
arithmetic. Figure 10.24 shows the process at the sender
and at the receiver. The sender initializes the checksum to
0 and adds all data items and the checksum (the checksum
is considered as one data item and is shown in color). The
result is 36. However, 36 cannot be expressed in 4 bits.
The extra two bits are wrapped and added with the sum to
create the wrapped sum value 6. In the figure, we have
shown the details in binary. The sum is then
complemented, resulting in the checksum value 9 (15 − 6 = 9). The sender now sends six data items to the receiver
including the checksum 9.
10.71
Example 10.22 (continued)
The receiver follows the same procedure as the sender. It
adds all data items (including the checksum); the result is
45. The sum is wrapped and becomes 15. The wrapped
sum is complemented and becomes 0. Since the value of
the checksum is 0, this means that the data is not
corrupted. The receiver drops the checksum and keeps the
other data items. If the checksum is not zero, the entire
packet is dropped.
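A sketch of the 4-bit ones complement checksum used in Example 10.22; the wrap-and-complement helpers are ours:

def wrap4(total: int) -> int:
    """Fold any carry bits beyond 4 bits back into the sum (wrapping)."""
    while total > 0xF:
        total = (total & 0xF) + (total >> 4)
    return total

def checksum4(items) -> int:
    """Sender: add all items plus an initial checksum of 0, wrap, then complement."""
    return 0xF - wrap4(sum(items) + 0)

data = [7, 11, 12, 0, 6]
cks = checksum4(data)
print(cks)                            # 9   (Example 10.22)
print(0xF - wrap4(sum(data) + cks))   # 0 -> data accepted at the receiver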
10.72
Figure 10.24 Example 10.22
10.73
Internet Checksum
16-bit checksum
Sender site:
1. The message is divided into 16-bit words.
2. The value of the checksum word is set to 0.
3. All words including the checksum are added using ones
complement addition.
4. The sum is complemented and becomes the checksum.
5. The checksum is sent with the data.
Receiver site:
1. The message (including the checksum) is divided into 16-bit words.
2. All words are added using ones complement addition.
3. The sum is complemented and becomes the new checksum.
4. If the value of the new checksum is 0, the message is accepted; otherwise, it is rejected.
10.74
Example 10.23
Let us calculate the checksum for a text of 8 characters
(Forouzan). The text needs to be divided into 2-byte (16-bit) words. We use ASCII (see Appendix A) to change each
byte to a 2-digit hexadecimal number. For example, F is
represented as 0x46 and o is represented as 0x6F. Figure
10.25 shows how the checksum is calculated at the sender
and receiver sites. In part a of the figure, the value of
partial sum for the first column is 0x36. We keep the
rightmost digit (6) and insert the leftmost digit (3) as the
carry in the second column. The process is repeated for
each column. Note that if there is any corruption, the
checksum recalculated by the receiver is not all 0s. We
leave this as an exercise.
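A sketch of the 16-bit Internet checksum steps listed earlier, applied to the "Forouzan" words from Example 10.23; the word values follow the ASCII encoding described in the text, and the function names are ours:

def internet_checksum(words) -> int:
    """Ones complement sum of 16-bit words, then complemented (sender side)."""
    total = 0
    for w in words:
        total += w
        total = (total & 0xFFFF) + (total >> 16)   # wrap the carry back into the sum
    return ~total & 0xFFFF

def verify(words_with_checksum) -> bool:
    """Receiver: the same sum over all words (checksum included) must complement to 0."""
    return internet_checksum(words_with_checksum) == 0

# "Forouzan" split into 16-bit words of ASCII byte pairs: 'Fo', 'ro', 'uz', 'an'
words = [0x466F, 0x726F, 0x757A, 0x616E]
cks = internet_checksum(words)
print(hex(cks))              # 0x7038 with these word values
print(verify(words + [cks])) # True: recalculated checksum is 0, data accepted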
10.75
Figure 10.25 Example 10.23
10.76