
Introduction - Modular Arithmetic - Finite Fields - N-Space Over A Finite Field - Error Correcting Codes - Exercises Introduction

This document introduces modular arithmetic and finite fields as the mathematical basis for error correcting codes. It discusses how modular arithmetic works by dividing integers into groups based on their remainders when divided by a given integer. A finite field is a modular number system where every non-zero number has an inverse. Linear algebra can be done over finite fields, with vectors and matrices having elements from the finite field. Error correcting codes use techniques from linear algebra over finite fields to allow for error detection and correction in digital data transmission and storage.
Copyright
© Attribution Non-Commercial (BY-NC)

Modular numbers and Error Correcting Codes

Introduction - Modular Arithmetic - Finite fields - n-space over a finite field - Error correcting codes - Exercises

Introduction. Data transmission is not normally perfect; errors can be introduced by misread data or poor transmission conditions. The purpose of error-correcting codes is to eliminate these errors, allowing near-perfect transmission of data under imperfect conditions. For example, a scratched LP record, which has no error correction, sounds poor. In contrast, a scratched CD still sounds perfect, because the CD player can tell that the scratched data is wrong and reconstruct the correct data, as long as the scratch isn't too long. Other applications include satellite transmission, in which the signal is very weak and is likely to include errors, and computer networks, in which error correction allows for faster data transmission while avoiding the decrease in reliability that the higher speed would normally cause.

The theory of error-correcting codes uses techniques from many different areas of mathematics, but its basis is in linear algebra. To develop this theory, we need to use linear algebra over sets other than the real numbers R. Except for the geometric transformations, the only properties of the real numbers we have used are addition, subtraction, multiplication, and division. Therefore, we can do linear algebra over any set on which we have these operations. Any such set is called a field.

Modular Arithmetic. For every positive integer m larger than 1, we can form what is called a modular number system. In the simplest case of m = 2, the modular number system is nothing more than the notion of even and odd. This number system is sometimes called the binary numbers, with the even numbers denoted as 0 and the odd numbers denoted as 1. We are all very familiar with the rules for addition and multiplication of binary numbers. They are

$$
\begin{array}{c|cc} + & 0 & 1 \\ \hline 0 & 0 & 1 \\ 1 & 1 & 0 \end{array}
\qquad
\begin{array}{c|cc} \times & 0 & 1 \\ \hline 0 & 0 & 0 \\ 1 & 0 & 1 \end{array}
$$

The idea of modular arithmetic is the following. We divide up all the integers into m groups according to their remainder when we divide by m. For instance, when m = 3, the three groups of integers are

  the integers with remainder 1: $\{\dots, -8, -5, -2, 1, 4, 7, 10, \dots\}$,
  the integers with remainder 2: $\{\dots, -7, -4, -1, 2, 5, 8, 11, \dots\}$,
  the integers with remainder 0: $\{\dots, -9, -6, -3, 0, 3, 6, 9, \dots\}$.

For convenience we label the three sets as $1_3$, $2_3$ and $0_3$ and call these three sets the integers 1, 2 and 0 modulo 3. In this notation, the even integers would be $0_2$ (the integers 0 modulo 2), while the odd integers would be $1_2$ (the integers 1 modulo 2). The modular number system for the integer 3 consists of the sets $1_3$, $2_3$ and $0_3$ with addition and multiplication defined by

$$
\begin{array}{c|ccc}
+ & 0_3 & 1_3 & 2_3 \\ \hline
0_3 & 0_3 & 1_3 & 2_3 \\
1_3 & 1_3 & 2_3 & 0_3 \\
2_3 & 2_3 & 0_3 & 1_3
\end{array}
\qquad
\begin{array}{c|ccc}
\times & 0_3 & 1_3 & 2_3 \\ \hline
0_3 & 0_3 & 0_3 & 0_3 \\
1_3 & 0_3 & 1_3 & 2_3 \\
2_3 & 0_3 & 2_3 & 1_3
\end{array}
$$
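The two tables above can be generated mechanically. The following is a minimal Python sketch, assuming we represent each class by its remainder 0, ..., m - 1; the function names are our own and do not come from the notes.

```python
def residue(a, m):
    """Return the class of the integer a modulo m, as a remainder in 0..m-1."""
    # Python's % always gives a non-negative result for positive m,
    # so negative integers land in the right class as well.
    return a % m


def addition_table(m):
    """Row i, column j holds the class of i + j modulo m."""
    return [[(i + j) % m for j in range(m)] for i in range(m)]


def multiplication_table(m):
    """Row i, column j holds the class of i * j modulo m."""
    return [[(i * j) % m for j in range(m)] for i in range(m)]


# -8, 1 and 10 lie in the class 1 modulo 3; -7 and 11 lie in the class 2.
print([residue(a, 3) for a in (-8, 1, 10, -7, 11)])  # [1, 1, 1, 2, 2]

for row in addition_table(3):
    print(row)
for row in multiplication_table(3):
    print(row)
```

Calling the same functions with m = 2 reproduces the even/odd tables given earlier.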

An example of what this means: if a is an integer with remainder 1 when divided by 3 (so $a \in 1_3$), and if b is an integer with remainder 2 when divided by 3 (so $b \in 2_3$), then the sum a + b will be in the set $0_3$, i.e. a + b is divisible by 3. Similarly, the product $a \cdot b$ will be in the set $2_3$. The modular number system for 3 has just three elements. The modular number $0_3$ is zero in the sense that when we add it to any other modular number a ($1_3$, $2_3$ or $0_3$), we get a. Likewise, the modular number $1_3$ has the property that when we multiply any modular number b by $1_3$, we get back b.

The addition and multiplication tables for the modular number system for m = 4 are

$$
\begin{array}{c|cccc}
+ & 0 & 1 & 2 & 3 \\ \hline
0 & 0 & 1 & 2 & 3 \\
1 & 1 & 2 & 3 & 0 \\
2 & 2 & 3 & 0 & 1 \\
3 & 3 & 0 & 1 & 2
\end{array}
\qquad
\begin{array}{c|cccc}
\times & 0 & 1 & 2 & 3 \\ \hline
0 & 0 & 0 & 0 & 0 \\
1 & 0 & 1 & 2 & 3 \\
2 & 0 & 2 & 0 & 2 \\
3 & 0 & 3 & 2 & 1
\end{array}
$$

Here, we have removed the subscript 4 for convenience. The entries in the addition table are very regular. There is a cyclic shift as we move from row to row. The entries in the multiplication table are less predictable. Note that 2 times 2 is 0, so the m = 4 number system is somewhat strange in that two nonzero numbers can multiply to zero.

Question. The modular numbers for m = 12 and m = 24 are very commonly used by people throughout the world. How?

Answer. Clocks run on either the modular system for m = 12 (twelve hour clocks) or m = 24 (24 hour clocks).

The multiplication table for the modular number system for the integer m = 7 is

$$
\begin{array}{c|ccccccc}
\times & 0 & 1 & 2 & 3 & 4 & 5 & 6 \\ \hline
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 1 & 2 & 3 & 4 & 5 & 6 \\
2 & 0 & 2 & 4 & 6 & 1 & 3 & 5 \\
3 & 0 & 3 & 6 & 2 & 5 & 1 & 4 \\
4 & 0 & 4 & 1 & 5 & 2 & 6 & 3 \\
5 & 0 & 5 & 3 & 1 & 6 & 4 & 2 \\
6 & 0 & 6 & 5 & 4 & 3 & 2 & 1
\end{array}
$$
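The observation that two nonzero numbers can multiply to zero for m = 4, while the m = 7 table contains no zero outside its first row and column, can be checked by a direct search. This is a small Python sketch; the helper name `zero_divisor_pairs` is our own.

```python
def zero_divisor_pairs(m):
    """List the pairs (a, b) of nonzero classes modulo m, with a <= b,
    whose product is 0 modulo m."""
    return [(a, b)
            for a in range(1, m)
            for b in range(a, m)
            if (a * b) % m == 0]


print(zero_divisor_pairs(4))   # [(2, 2)], the case noted in the text
print(zero_divisor_pairs(7))   # [], no two nonzero numbers multiply to zero
print(zero_divisor_pairs(12))  # the clock system m = 12 has several such pairs
```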

For any modular number system, the familiar properties of

  COMMUTATIVITY    $a + b = b + a$    $a \cdot b = b \cdot a$
  ASSOCIATIVITY    $a + (b + c) = (a + b) + c$    $a \cdot (b \cdot c) = (a \cdot b) \cdot c$
  ZERO             $a + 0 = a$
  ONE              $a \cdot 1 = a$
  DISTRIBUTIVITY   $a \cdot (b + c) = a \cdot b + a \cdot c$

are all valid. The negative of a modular number such as $2_3 = \{\dots, -7, -4, -1, 2, 5, 8, 11, \dots\}$ is the modular number which as a set is equal to $1_3 = \{\dots, 7, 4, 1, -2, -5, -8, -11, \dots\}$.

The biggest difference between modular number systems and the more familiar real or complex number systems was already mentioned above. In some modular systems, e.g. those for the integers 4, 6, 8 or 9, it is possible for two nonzero numbers to multiply to zero. For example, in the modular system of the integer 15, we have $6_{15} \cdot 10_{15} = 0_{15}$. In this setting, neither of the equations $6_{15} \cdot x = 1_{15}$ and $10_{15} \cdot x = 1_{15}$ has a solution. In other words, neither $6_{15}$ nor $10_{15}$ has an inverse in the modular 15 system. To see why this is so, suppose that x = a is a solution to $6_{15} \cdot x = 1_{15}$. Multiplying both sides of $6_{15} \cdot 10_{15} = 0_{15}$ by a gives

  $a \cdot (6_{15} \cdot 10_{15}) = a \cdot 0_{15} = 0_{15}$
  $(a \cdot 6_{15}) \cdot 10_{15} = 0_{15}$
  $10_{15} = 0_{15}$

where the last line uses $a \cdot 6_{15} = 1_{15}$. This is not true, so the assumption that $6_{15} \cdot x = 1_{15}$ has a solution (that $6_{15}$ has an inverse in the modular numbers) is wrong.

However, if we look at the multiplication tables for the modular 3 and modular 7 systems, we see that in these two number systems every nonzero number has an inverse. In the modular 7 system, the numbers $2_7$ and $4_7$ are inverses of each other, as are the numbers $3_7$ and $5_7$. The numbers $1_7$ and $6_7$ are their own inverses.

  $2_7 \cdot 4_7 = 1_7$    $3_7 \cdot 5_7 = 1_7$    $1_7 \cdot 1_7 = 1_7$    $6_7 \cdot 6_7 = 1_7$
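The inverse pairs above, and the absence of inverses for $6_{15}$ and $10_{15}$, can be confirmed by trying every candidate. The sketch below is a brute-force search in Python, which is only sensible for small m; the name `inverse_or_none` is our own.

```python
def inverse_or_none(a, m):
    """Search 1..m-1 for x with a * x = 1 modulo m; return None if none exists."""
    for x in range(1, m):
        if (a * x) % m == 1:
            return x
    return None


# Every nonzero number modulo 7 has an inverse.
inverses_mod_7 = {a: inverse_or_none(a, 7) for a in range(1, 7)}
print(inverses_mod_7)  # {1: 1, 2: 4, 3: 5, 4: 2, 5: 3, 6: 6}

# Modulo 15, several nonzero numbers (including 6 and 10) have none.
no_inverse_mod_15 = [a for a in range(1, 15) if inverse_or_none(a, 15) is None]
print(no_inverse_mod_15)  # [3, 5, 6, 9, 10, 12]
```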

Finite fields. A modular system in which every nonzero number has an inverse is called a finite field. A field is any number system in which one can add and multiply two numbers with the commutativity, associativity, zero, one, and distributivity properties, and in which every nonzero number has an inverse. The familiar real numbers are a field. The complex numbers are another example of a field. The real numbers are denoted R and the complex numbers are denoted C. The rational numbers (denoted Q) are yet another example of a field. The modular number system for the integer 3 is denoted $F_3$, while the modular number system for the integer 7 is denoted $F_7$. The modular number system for the integer 2 is also a field and is denoted $F_2$. The fields R, C and Q have infinitely many elements/numbers. The fields $F_2$, $F_3$ and $F_7$ have only a finite number of elements/numbers in them.

Finite Field Theorem. If p is any prime integer, the modular numbers for p are denoted $F_p$. These modular numbers form a finite field with p elements/numbers.

To see why the modular numbers $F_p$ form a field, we must find a way to show that any nonzero modular number has an inverse. One easy way, when the prime p is not too large, is to write out the complete multiplication table for $F_p$ and then check that every nonzero modular number has an inverse. When the prime p is large, writing out the multiplication table is extremely tedious. Luckily it is not necessary. There is an algorithm called the Euclidean algorithm which can be used to determine the inverse. For more about this, see the appendix.

Linear algebra over a finite field. Regular n-space (written in row form) $R^n$ consists of the row vectors $v = (x_1, \dots, x_n)$. That is, $R^n$ is the set of all possible size n row vectors, with the coordinates $x_i$ being real numbers. Complex n-space, written $C^n$, is the set of all row vectors $w = (z_1, \dots, z_n)$, where the coordinates $z_i$ are complex numbers. Similarly, if the coordinates are just rational numbers, we speak of rational n-space.
Having just described finite fields $F_p$, we can now speak of n-space $F_p^n$ over a finite field. It consists of all the row vectors $v = (x_1, \dots, x_n)$ with the condition that the coordinates belong to the finite field $F_p$. Note that n-space over a finite field has just finitely many vectors. Indeed, each of the n coordinates can be any of p possible things; therefore $F_p^n$ consists of $p^n$ vectors. For example, 7-dimensional space over the binary field $F_2$ has $2^7 = 128$ vectors. For any n-space over a field:

  We add two vectors $v$ and $v'$ by adding their corresponding coordinates.
  We multiply a vector $v$ by a scalar $\lambda$ in the field by multiplying all the coordinates of $v$ by $\lambda$.

In a similar manner, we can talk about an $m \times n$ matrix with entries in a finite field. An $m \times n$ matrix A is a linear transformation. If we think of our vectors as row vectors, then A takes row vectors in $F_p^m$ to row vectors in $F_p^n$. If we think of our vectors as column vectors, then A takes column vectors in $F_p^n$ to column vectors in $F_p^m$. Over the finite field $F_{11}$, the matrix

$$
A = \begin{pmatrix} 4 & 1 \\ 6 & 7 \\ 9 & 10 \end{pmatrix}
$$

is the linear transformation which would take the column vector $\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}$ to

$$
A \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} 4 y_1 + 1 y_2 \\ 6 y_1 + 7 y_2 \\ 9 y_1 + 10 y_2 \end{pmatrix}.
$$
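To make the action of A concrete, here is a short Python sketch of matrix-vector multiplication over $F_p$, with every entry reduced modulo p. The function name `mat_vec_mod` and the sample vector y are our own choices for illustration.

```python
def mat_vec_mod(A, y, p):
    """Multiply the matrix A (a list of rows) by the column vector y,
    reducing every entry of the result modulo p."""
    return [sum(a * b for a, b in zip(row, y)) % p for row in A]


A = [[4, 1],
     [6, 7],
     [9, 10]]

y = [3, 5]  # an arbitrary sample vector in F_11^2, chosen only for illustration
print(mat_vec_mod(A, y, 11))  # [6, 9, 0]
```

Here 4·3 + 1·5 = 17, 6·3 + 7·5 = 53 and 9·3 + 10·5 = 77, which reduce to 6, 9 and 0 modulo 11.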

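As another example of matrices over finite fields, we can check that in the finite field $F_5$

$$
\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}
\begin{pmatrix} 3 & 1 \\ 4 & 2 \end{pmatrix}
= \begin{pmatrix} 1 \cdot 3 + 2 \cdot 4 & 1 \cdot 1 + 2 \cdot 2 \\ 3 \cdot 3 + 4 \cdot 4 & 3 \cdot 1 + 4 \cdot 2 \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},
$$

that is, the two matrices on the left side are each other's inverses.

Error correcting codes (The first Hamming code)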

As an example of the use of matrices over a finite field, we discuss the first Hamming code. Data is usually transmitted or stored electronically in packets of 1s and 0s. The packet (word) size is usually fixed. For instance, a word of size 4 can take on 16 states, ranging from 0000, 0001, 0010, ... to 1111. Imagine now sending words of length 4 over a serial communications channel, and suppose there is some probability that each bit we transmit will be changed either from 0 to 1 or vice versa. If the probability is .95 that a bit is transmitted successfully over a communication channel, then the probability that a word consisting of 4 bits is transmitted successfully is

  sending 4 bits with no errors            $.95 \cdot .95 \cdot .95 \cdot .95 = .8145$

One way to improve things is to add an extra check or parity bit. Adding a check bit means sending a 5th bit which is the sum (in the binary field $F_2$) of the original 4 bits. It allows the receiver to check whether or not an error has occurred during transmission. If the received 5 bits do not add up to 0, then at least one bit, possibly more, was altered during transmission. Note that this scheme has the drawback that if 2 bits were changed the receiver would be unable to recognize this error. The probabilities of the various possibilities for the transmission of 5 bits (assuming the .95 as before) are

  sending 5 bits with no errors            $.95^5 = .7738$
  sending 5 bits with at most one error    $.95^5 + 5 \cdot .95^4 \cdot .05 = .9774$
  sending 5 bits with two or more errors   $1 - .9774 = .0226$

While adding a parity check bit allows a receiver to detect some errors, it does not allow the receiver to fix things. The first Hamming code is a way of coding a 4 bit word into a 7 bit word so that if just one error occurs, the receiver will be able not only to detect the error but also to correct it. Such a method/code is called an error correcting code.
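The probabilities above are sums of binomial terms: the chance of exactly j errors in n bits is the number of ways to choose the j wrong bits, times the chance of j failures and n - j successes. A minimal Python sketch, assuming independent bit errors as in the text (the function name `prob_at_most` is our own):

```python
from math import comb


def prob_at_most(n, k, p_ok):
    """Probability that at most k of n independent bits are altered,
    when each bit arrives intact with probability p_ok."""
    p_err = 1 - p_ok
    return sum(comb(n, j) * p_err**j * p_ok**(n - j) for j in range(k + 1))


print(round(prob_at_most(4, 0, 0.95), 4))      # 0.8145  (4 bits, no errors)
print(round(prob_at_most(5, 0, 0.95), 4))      # 0.7738  (5 bits, no errors)
print(round(prob_at_most(5, 1, 0.95), 4))      # 0.9774  (5 bits, at most one error)
print(round(1 - prob_at_most(5, 1, 0.95), 4))  # 0.0226  (5 bits, two or more errors)
```

The same function applies to any word length and any number of tolerated errors.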
Since

  .8145 = probability of transmitting 4 bits without error
  .9556 = probability of transmitting 7 bits with at most one error

it is more reliable to code a 4 bit word into a 7 bit word and then send the 7 bit word.

In order to measure the ability of a code to correct errors, we define the Hamming distance between two vectors $v_1$ and $v_2$ in $F_2^n$ to be the number of bits which are different in the two vectors. This is thus the number of errors which would need to be made for a signal to be transmitted as $v_1$ and read as $v_2$. There are $2^7 = 128$ possible 7 bit words, or equivalently, there are 128 vectors in the 7-dimensional vector space $F_2^7$. The number of 7 bit words within a Hamming distance of 1 or 0 from a word is 1 + 7 = 8.

Question. Is there some clever way/process to code a 4 bit word into a 7 bit word?

Answer. The first Hamming code. The first Hamming code is a 4-dimensional subspace W of $F_2^7$ with the property that the Hamming distance between any two distinct vectors in W is at least 3. The subspace W is both the kernel of the testing matrix

$$
T = \begin{pmatrix}
0 & 0 & 0 & 1 & 1 & 1 & 1 \\
0 & 1 & 1 & 0 & 0 & 1 & 1 \\
1 & 0 & 1 & 0 & 1 & 0 & 1
\end{pmatrix}
$$

as well as the image of the encoding matrix

$$
E = \begin{pmatrix}
1 & 1 & 0 & 1 \\
1 & 0 & 1 & 1 \\
1 & 0 & 0 & 0 \\
0 & 1 & 1 & 1 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
$$

The columns of the encoding matrix E are obtained by finding a basis for the kernel of T via reduced row-echelon form. To code a 4 bit word $v = (a, b, c, d)^T$ into a 7 bit vector in the Hamming code (subspace W), we multiply v into the encoding matrix E to get the
encoded/transformed vector $w = E(v) = E v$. We then transmit w via our communication channel. At the other end of the communication channel, a person who receives a 7 bit word w encoded in this fashion can first multiply w into the testing matrix T to check if there have been any errors in transmission. If w is received with no errors, then $T w$ will be equal to $(0, 0, 0)^T$, since the Hamming subspace is the kernel of the testing matrix T. If w has just one error in it, then the size three column vector $T w$ has the remarkable property, which we cannot fully explain here, that it tells us (in binary) in which position the error is located.

To decode an encoded vector, one uses a decoding transformation $D : F_2^7 \to F_2^4$ which changes the 7 bit encoded word back to the original 4 bit word. The decoding matrix must satisfy the property that

  $D E$ = Identity $4 \times 4$ matrix

For the first Hamming code, the decoding matrix is

$$
D = \begin{pmatrix}
0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
$$

As an exercise, check that $D E = I_{4 \times 4}$.

Example: Suppose we want to transmit the 4 bit word $v = (0, 1, 1, 0)^T$. Computation of w = E(v) gives $w = E v = (1, 1, 0, 0, 1, 1, 0)^T$. If during transmission the code word is altered to $w' = (1, 1, 0, 0, 1, 0, 0)^T$, then when we multiply it into the testing matrix T we get

$$
T w' = \begin{pmatrix}
0 & 0 & 0 & 1 & 1 & 1 & 1 \\
0 & 1 & 1 & 0 & 0 & 1 & 1 \\
1 & 0 & 1 & 0 & 1 & 0 & 1
\end{pmatrix}
\begin{pmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}
= \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}
$$

We now read the column vector $(1, 1, 0)^T$ as the base two number $110_2 = 1 \cdot 2^2 + 1 \cdot 2^1 + 0 \cdot 2^0 = 4 + 2 = 6$, which means the 6th bit has been altered. Changing the 6th bit corrects the error caused in transmission.

Error-correcting codes are of practical use because they greatly decrease the probability of an error without greatly lengthening the transmission. The design of error correcting codes depends very much on linear algebra as well as much more sophisticated mathematics beyond the scope of this course. An example of a much more sophisticated code is a code known as the Golay code. The Golay code is a 12 dimensional subspace of 23 dimensional binary space $F_2^{23}$. In analogy with the first Hamming code, there is an encoding $23 \times 12$ matrix E which encodes a 12 bit word into a 23 bit word. The image of the matrix E in $F_2^{23}$ is the 12 dimensional Golay code. There is also a testing matrix T whose kernel is the Golay code, as well as a decoding matrix (a linear transformation from $F_2^{23}$ to $F_2^{12}$). The Golay code, which is used in both satellite and spacecraft data transmission, has the ability to detect up to 6 bit changes in a code word and to correct code words which have been altered in 3 bits or fewer. Suppose that each bit has a .01 chance of being transmitted incorrectly.
The chance that four bits will be transmitted correctly is thus $.99^4 = .9606$. If we encode them with the first Hamming code, the chance is $.99^7 = .9320$ that all seven bits will be correct, and $7 \cdot .01 \cdot .99^6 = .0659$ that there is exactly one error (since any one of seven bits can be the wrong one), for a chance of .9979 that the four bits will be recovered correctly. The chance of failure drops from .0394 to about .002, so this is about 19 times more reliable.

For the Golay code, the probability of no error in 23 bits is $.99^{23} = .793614$. The probability of one error is $23 \cdot .01 \cdot .99^{22} = .184375$. If there are two errors, there are 23 possible bits which can be one wrong bit, and 22 possibilities for the other wrong bit, but each pair of wrong bits is being counted twice (bits 2 and 8 being wrong is the same as bits 8 and 2 being wrong), so there are $23 \cdot 22 / 2$ ways to have two wrong bits, and the total probability of two wrong bits is $(23 \cdot 22 / 2) \cdot .01^2 \cdot .99^{21} = .020486$. Similarly, the probability of three wrong bits is $(23 \cdot 22 \cdot 21 / 6) \cdot .01^3 \cdot .99^{20} = .001449$. Adding these together gives a probability of .999924 that at most three bits are wrong, in which case the 12 bit word is recovered correctly. For comparison, the chance is .9939 that the same 12 bits will be correct if they are transmitted as three groups of four using the first Hamming code; so the Golay code cuts the errors by an additional factor of 80 using only two more bits (23 bits for the Golay code vs 21 bits for three copies of the first Hamming code).
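Returning to the first Hamming code, the encoding, testing and decoding steps described earlier can be sketched in Python using the matrices T, E and D from the text. The function names are our own, and the sketch assumes at most one bit is altered in transmission.

```python
from itertools import product

T = [[0, 0, 0, 1, 1, 1, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [1, 0, 1, 0, 1, 0, 1]]

E = [[1, 1, 0, 1],
     [1, 0, 1, 1],
     [1, 0, 0, 0],
     [0, 1, 1, 1],
     [0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1]]

D = [[0, 0, 1, 0, 0, 0, 0],
     [0, 0, 0, 0, 1, 0, 0],
     [0, 0, 0, 0, 0, 1, 0],
     [0, 0, 0, 0, 0, 0, 1]]


def mat_vec_mod2(M, v):
    """Multiply the matrix M by the column vector v over the binary field F_2."""
    return [sum(a * b for a, b in zip(row, v)) % 2 for row in M]


def encode(v):
    """Code a 4 bit word into a 7 bit Hamming code word, w = E v."""
    return mat_vec_mod2(E, v)


def error_position(w):
    """Read T w as a base two number; 0 means no error was detected."""
    s = mat_vec_mod2(T, w)
    return 4 * s[0] + 2 * s[1] + s[2]


def correct_and_decode(w):
    """Flip the bit reported by the testing matrix, if any, then apply D."""
    fixed = list(w)
    pos = error_position(fixed)
    if pos:
        fixed[pos - 1] ^= 1  # positions are counted from 1 in the text
    return mat_vec_mod2(D, fixed)


def hamming_distance(u, v):
    """Number of coordinates in which u and v differ."""
    return sum(a != b for a, b in zip(u, v))


v = [0, 1, 1, 0]
w = encode(v)
received = list(w)
received[5] ^= 1  # alter the 6th bit, as in the example
print(w)                             # [1, 1, 0, 0, 1, 1, 0]
print(error_position(received))      # 6
print(correct_and_decode(received))  # [0, 1, 1, 0]

# Smallest distance between two distinct code words.
codewords = [encode(list(u)) for u in product((0, 1), repeat=4)]
print(min(hamming_distance(a, b)
          for i, a in enumerate(codewords) for b in codewords[i + 1:]))  # 3
```

If two or more bits are altered, the testing matrix still reports a single position, so this procedure would then "correct" the wrong bit.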


Exercises.

1. Write out the multiplication table for the modular numbers (finite field) $F_7$.

2. Solve the equation 4x = 5 in the finite field $F_7$.

3. Find the inverse matrix of

$$
A = \begin{pmatrix} 3 & 3 \\ 4 & 2 \end{pmatrix}
$$

where all entries are in the modular numbers $F_7$. Solve the linear system

  $3x_1 + 3x_2 = 4$
  $4x_1 + 2x_2 = 1$

for numbers $x_1$ and $x_2$ in $F_7$.

4. Choose a 4-bit word v in $F_2^4$ and code it via the encoding matrix E into a 7-bit Hamming code word w. Verify that w lies in the kernel of the testing matrix T. Then change one of the bits of w and verify that the testing matrix T allows one to determine which bit was changed.

5. Suppose you have received the code vector (encoded as a 7-bit Hamming code word) $w = (1, 0, 1, 1, 1, 1, 0)^T$. Find the transmitted vector v, correcting an error if necessary, and then find the original 4 bit vector u.

6. Write out the multiplication table for the modular numbers $F_5$ and determine a matrix E (entries in $F_5$) whose image is equal to the kernel of the matrix

$$
T = \begin{pmatrix}
2 & 0 & 0 & 1 & 1 \\
3 & 1 & 0 & 1 & 1 \\
4 & 2 & 1 & 1 & 1
\end{pmatrix}
$$

7. How many vectors are there in the vector space $F_7^5$? If W is a subspace of $F_7^5$ of dimension 3, how many vectors are in W?


Appendix. Euclidean Algorithm

The Euclidean algorithm is a procedure for finding the greatest common divisor gcd(a, b) of two nonzero integers a and b. When gcd(a, b) = 1, the steps of the Euclidean algorithm can be reversed to find the inverse of the number b in the modular system of the number a.

Euclidean Algorithm. If a > b are two positive integers, then their greatest common divisor gcd(a, b) can be obtained as follows. Divide b into a to get a quotient and remainder term

  $a = q_1 b + r_1$,    the remainder $r_1$ is between 0 and $b - 1$

If $r_1 = 0$, then b divides a and gcd(a, b) = b. If $r_1 > 0$ then we divide $r_1$ into b to get

  $b = q_2 r_1 + r_2$,    the remainder $r_2$ is between 0 and $r_1 - 1$

If $r_2 = 0$, then $r_1$ divides both b and a, and gcd(a, b) = $r_1$. If $r_2 > 0$ then we divide $r_2$ into $r_1$ to get

  $r_1 = q_3 r_2 + r_3$,    the remainder $r_3$ is between 0 and $r_2 - 1$

If $r_3 = 0$, then $r_2$ divides both a and b, and gcd(a, b) = $r_2$. If $r_3 > 0$ then we continue on. The process will eventually stop, since the remainders keep getting smaller ($b > r_1 > r_2 > \dots > r_k$) while never dropping below 0. The last nonzero remainder is gcd(a, b).

As an example of the process, we find the greatest common divisor of the integers 319 and 100. We have

  $319 = 3 \cdot 100 + 19$
  $100 = 5 \cdot 19 + 5$
  $19 = 3 \cdot 5 + 4$
  $5 = 1 \cdot 4 + 1$
  $4 = 4 \cdot 1$

So, the greatest common divisor of 319 and 100 is 1. In this case (when the greatest common divisor is 1), the Euclidean algorithm provides us a way of determining the inverse of the modular number $100_{319}$. We just reverse the steps of the Euclidean algorithm starting with the last (r = 1) remainder. We have

  $1 = 5 - 4$
  $\;\; = 5 - (19 - 3 \cdot 5) = -19 + 4 \cdot 5$
  $\;\; = -19 + 4 \cdot (100 - 5 \cdot 19) = 4 \cdot 100 - 21 \cdot 19$
  $\;\; = 4 \cdot 100 - 21 \cdot (319 - 3 \cdot 100) = -21 \cdot 319 + 67 \cdot 100$

In the modular system 319, this becomes

  $1_{319} = -21_{319} \cdot 319_{319} + 67_{319} \cdot 100_{319}$
  $1_{319} = 0_{319} + 67_{319} \cdot 100_{319}$

The inverse of $100_{319}$ in the m = 319 modular system is $67_{319}$. For a prime p, every nonzero number smaller than p has greatest common divisor 1 with p, so the process of the Euclidean algorithm can be used to find the inverse of any nonzero number in the modular number system. This is why the modular number system for a prime p is a field.

Many of the things we do in the real field R can be done in a finite field as well. For instance, in the finite field $F_7$, we can solve the system of linear equations

  $3x_1 + 3x_2 = 4$
  $4x_1 + 2x_2 = 1$

by Gaussian elimination. We need to eliminate the variable $x_1$ from the 2nd equation by subtracting a multiple of the first equation. To do this we need to find a modular number ? so that $? \cdot 3 = 4$. From the multiplication table for $F_7$, or via the Euclidean algorithm, we find $5 \cdot 3 = 1$, so it must be the case that $4 \cdot 5 = 20 = 6$ is the modular number we need to take for ?. Doing the elimination gives

  $3x_1 + 3x_2 = 4$
  $(4 - 6 \cdot 3)x_1 + (2 - 6 \cdot 3)x_2 = 1 - 6 \cdot 4$

which is

  $3x_1 + 3x_2 = 4$
  $5x_2 = 5$

We see that $x_2 = 1$. If we now perform the back substitution of $x_2 = 1$ into the first equation, we get $3x_1 = 1$ and so $x_1 = 5$. As a check, we substitute $x_1 = 5$ and $x_2 = 1$ into the original two equations. We have

  $3 \cdot 5 + 3 \cdot 1 = 1 + 3 = 4$
  $4 \cdot 5 + 2 \cdot 1 = 6 + 2 = 1$

Exercises.

1. Find the inverse of 105 in the finite field $F_{1997}$ using the Euclidean algorithm.
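The two procedures of this appendix can be sketched in Python as follows. The first function carries the back-substitution along with the divisions (the usual "extended" form of the Euclidean algorithm) instead of reversing the steps afterwards. The last function solves a square system over $F_p$ by a Gauss-Jordan variant of the elimination shown above, which clears each column completely rather than back-substituting. All function names are our own, and the solver assumes the system has exactly one solution.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and g = s * a + t * b."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def inverse_mod(b, m):
    """Inverse of b in the modular system of m; requires gcd(m, b) = 1."""
    g, _, t = extended_gcd(m, b)
    if g != 1:
        raise ValueError(f"{b} has no inverse modulo {m}")
    return t % m


def solve_mod_p(A, b, p):
    """Solve A x = b over F_p, assuming A is square and invertible modulo p."""
    n = len(A)
    # Augmented matrix [A | b] with all entries reduced modulo p.
    M = [[x % p for x in row] + [bi % p] for row, bi in zip(A, b)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row so that its leading entry becomes 1.
        inv = inverse_mod(M[col][col], p)
        M[col] = [(x * inv) % p for x in M[col]]
        # Subtract multiples of the pivot row from every other row.
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [(x - factor * y) % p for x, y in zip(M[r], M[col])]
    return [row[n] for row in M]


print(extended_gcd(319, 100))  # (1, -21, 67), i.e. 1 = -21 * 319 + 67 * 100
print(inverse_mod(100, 319))   # 67
print(solve_mod_p([[3, 3], [4, 2]], [4, 1], 7))  # [5, 1]
```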
